If I understand correctly, Sereal supports compressing its output, while Storable does not. Your test data is highly repetitive and therefore very compressible. Depending on deep details of your hardware, that compression may have made a significant difference by reducing the amount of data copied in some intermediate steps. (You did not output the length of $frozen, so I can only speculate here.) This may or may not be representative of our questioner's data, or a meaningful comparison for non-trivial use.
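For what it's worth, the sizes are easy to measure directly. Here is a minimal sketch, assuming a current Sereal::Encoder (where compression is selected via the compress option and the SRL_* constants) and a repetitive stand-in payload rather than your actual test data:

```perl
use strict;
use warnings;
use Storable qw(freeze);
use Sereal::Encoder qw(SRL_UNCOMPRESSED SRL_SNAPPY);

# Stand-in for the repetitive test payload: ~6 MiB of repeated
# base64 alphabet (hypothetical, not the original benchmark data).
my $alphabet = join '', 'A' .. 'Z', 'a' .. 'z', 0 .. 9, '+', '/';
my $data     = [ ($alphabet) x 100_000 ];

my $storable = freeze($data);
my $plain    = Sereal::Encoder->new({ compress => SRL_UNCOMPRESSED })->encode($data);
my $snappy   = Sereal::Encoder->new({ compress => SRL_SNAPPY })->encode($data);

printf "Storable:              %d bytes\n", length $storable;
printf "Sereal (uncompressed): %d bytes\n", length $plain;
printf "Sereal (Snappy):       %d bytes\n", length $snappy;
```

If the compressed output comes out dramatically smaller, compression is the first place to look for the speed difference.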
There is some breakeven point below which the CPU overhead of attempting to compress the data exceeds the savings from copying less of it. 1 GiB of repeated base64 alphabet is obviously well above that point, but where exactly it falls will vary with real-world data. Algorithms that perform better in the general large case usually carry more overhead, and sometimes that can make a significant difference in smaller cases. As an example, I once did an analysis on some code I had written in C and found (to my surprise) that the actual data set was small enough that linear search would be faster than binary search, and that code was in an innermost loop where small gains are worthwhile.
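To make the crossover idea concrete, here is a toy Perl sketch (not the C code from that analysis; the 16-element array and match position are arbitrary, and the breakeven size depends heavily on hardware and comparison cost):

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# A small sorted array; at this size the simpler scan may win.
my @sorted = map { $_ * 3 } 0 .. 15;
my $target = $sorted[11];

# Walk the array front to back until we hit the target.
sub linear_search {
    my ($aref, $want) = @_;
    for my $i (0 .. $#$aref) {
        return $i if $aref->[$i] == $want;
    }
    return -1;
}

# Classic halving search over the same sorted array.
sub binary_search {
    my ($aref, $want) = @_;
    my ($lo, $hi) = (0, $#$aref);
    while ($lo <= $hi) {
        my $mid = int(($lo + $hi) / 2);
        if    ($aref->[$mid] < $want) { $lo = $mid + 1 }
        elsif ($aref->[$mid] > $want) { $hi = $mid - 1 }
        else                          { return $mid }
    }
    return -1;
}

cmpthese(-1, {
    linear => sub { linear_search(\@sorted, $target) },
    binary => sub { binary_search(\@sorted, $target) },
});
```

On a tiny array the branch-light linear scan can come out ahead; grow the array and binary search's fewer comparisons eventually take over. The only way to know where the lines cross on your data is to measure.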