PerlMonks  

A 50% compression rate is possible, and so is more. It depends on how the input data is arranged and on the algorithm you're using!

Here's an idea: Take the first bit of each number and build a list from those bits. See if that list compresses at a better rate. Then take the second bit of each input number and build another list, and so on... If your numbers go even-odd-even-odd, or are all odd or all even, the list made from the lowest bits will be a simple pattern, so this method will help.
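
A minimal sketch of that bit-splitting idea in Python. The helper names (split_bit_planes, join_bit_planes, pack_bits) and the sample input are made up for illustration, and I'm assuming the numbers fit in 8 bits and that zlib stands in for whatever compressor you actually use:

```python
import zlib

def split_bit_planes(numbers, width=8):
    """Return `width` lists; list k holds bit k (0 = lowest) of every number."""
    return [[(n >> k) & 1 for n in numbers] for k in range(width)]

def join_bit_planes(planes):
    """Inverse of split_bit_planes: rebuild the numbers from their bit lists."""
    count = len(planes[0])
    return [sum(planes[k][i] << k for k in range(len(planes)))
            for i in range(count)]

def pack_bits(bits):
    """Pack a list of 0/1 values into bytes, 8 bits per byte (zero padded)."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        chunk = bits[i:i + 8]
        byte = 0
        for b in chunk:
            byte = (byte << 1) | b
        byte <<= 8 - len(chunk)   # pad a short final chunk with zero bits
        out.append(byte)
    return bytes(out)

# Hypothetical input: 1000 eight-bit numbers that are all even.
numbers = [((i * 37) % 128) * 2 for i in range(1000)]

planes = split_bit_planes(numbers)
for k, plane in enumerate(planes):
    # Compressed size of each bit list, so the lists can be compared.
    print("bit", k, "->", len(zlib.compress(pack_bits(plane))), "bytes")
```

Since every number here is even, the list for bit 0 is all zeros. Whether the split beats compressing the numbers directly depends on your data, so measure both.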

If the input numbers are totally random, I would still not give up just yet! I'd generate a list of "random" numbers from a known seed and XOR the input numbers with them to get a new list that might turn out to compress better. Keep the seed, because you need the same random list to undo the XOR later. Try that!

Many programs generate random numbers roughly this way (it's called a linear congruential generator):

FOR LOOP:
   S = (S * A + B) % C
   print "Random number: ", S
END FOR

S is the initial seed for the random number generator. Programs often set this from the clock, for example the number of seconds (or milliseconds) since 1970. A, B, and C are constants. They can be almost anything, although real generators pick them carefully so the sequence doesn't start repeating too soon. In many programming languages the built-in random() function returns a number between 0 and 1. To get that, the result is usually divided by C; if you work with fractional values instead, you can set C to 1. If you repeat this calculation over and over again, you get a list of numbers that seems quite random.
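
Here is a rough Python version of the loop above. The constants are one well-known published choice for a 32-bit generator (1664525 and 1013904223 with C = 2**32); they are my pick for the example, not something the formula requires, and the function names lcg and lcg_floats are made up:

```python
def lcg(seed, a=1664525, b=1013904223, c=2**32):
    """Yield an endless stream of values S = (S * A + B) % C."""
    s = seed
    while True:
        s = (s * a + b) % c
        yield s

def lcg_floats(seed, count, **constants):
    """Return `count` values scaled into [0, 1) by dividing each S by C."""
    c = constants.get("c", 2**32)
    gen = lcg(seed, **constants)
    return [next(gen) / c for _ in range(count)]

# Fixed seeds so the run is repeatable; a real program might use the clock.
first = lcg_floats(seed=12345, count=5)
second = lcg_floats(seed=12346, count=5)
print(first)
print(second)
```

The same seed and constants always give the same list, which is what makes the list reproducible later.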

By modifying the values of S, A, B, or C even slightly, you get a totally different series of numbers! If, let's say, A is 13.4849927107, and you just change one digit, you will get a totally different list of numbers that does not resemble the previous set at all. So, you could initialize these constants and then get a random list. Take two random lists and either ADD the values or XOR them or whatever. The resulting list MIGHT HAVE more order than your input data set! And this can help you compress the list further.
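
A small sketch of the XOR step in Python, assuming byte-sized input and zlib as the compressor. The names keystream and xor_mask, the constants, and the sample data are all just for illustration. Note that whoever decompresses needs the same S, A, B, and C to undo the XOR, so those have to be stored along with the output:

```python
import zlib

def keystream(seed, length, a=1664525, b=1013904223, c=2**32):
    """Return `length` pseudo-random bytes from S = (S * A + B) % C."""
    s = seed
    out = bytearray()
    for _ in range(length):
        s = (s * a + b) % c
        out.append(s >> 24)       # keep the top 8 bits of the 32-bit state
    return bytes(out)

def xor_mask(data, seed):
    """XOR data with the keystream; applying it twice restores the input."""
    stream = keystream(seed, len(data))
    return bytes(d ^ k for d, k in zip(data, stream))

data = bytes(range(256)) * 16     # hypothetical input, 4096 bytes

for seed in (1, 2, 3):
    masked = xor_mask(data, seed)
    assert xor_mask(masked, seed) == data   # the XOR is fully reversible
    print("seed", seed,
          "plain:", len(zlib.compress(data)),
          "masked:", len(zlib.compress(masked)))
```

Running it over a range of seeds and comparing the two sizes is the honest way to find out whether a given seed helps on your data.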

I've done this with ZIP files... You know, when you compress a ZIP file and you compress it again and again, you reach a limit after which the size starts growing instead of shrinking! But if, at some point, you encode the ZIP file using a list of random numbers, you can sometimes ZIP it again further and get an even smaller file! ;-)
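
If you want to watch that limit for yourself, here is a quick Python sketch. It uses zlib rather than an actual ZIP archive, and the function name repeated_sizes and the sample text are made up for the example:

```python
import zlib

def repeated_sizes(data, rounds=5):
    """Compress the data `rounds` times in a row and record each size."""
    sizes = [len(data)]
    for _ in range(rounds):
        data = zlib.compress(data, 9)
        sizes.append(len(data))
    return sizes

# Hypothetical input: some repetitive text.
text = b"the quick brown fox jumps over the lazy dog\n" * 2000

sizes = repeated_sizes(text)
for n, size in enumerate(sizes):
    print("after", n, "passes:", size, "bytes")
```

The printed sizes show at which pass the shrinking stops for your input. You could drop an XOR step in between passes and compare the numbers.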


In reply to Re: Data compression by 50% + : is it possible? by harangzsolt33
in thread Data compression by 50% + : is it possible? by baxy77bax
