Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid
 
PerlMonks  

Numeric limits in perl

by npiazza (Novice)
on Mar 25, 2003 at 18:57 UTC ( [id://245754]=perlquestion: print w/replies, xml ) Need Help??

npiazza has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I'm writing a piece of code in order to study chemical reactions. To complete my work I have to find "best" set of reactions. I would like to create an bidimensional array of n*m positions. In my job m is 30 and n is 25*24*23*22*21*20*19. I get a terminated process, because, I think, array is too much bigger. What are numerical limits for perl ? I'm working on linux. Thank you in advance for your help.

Replies are listed 'Best First'.
Re: Numeric limits in perl
by dragonchild (Archbishop) on Mar 25, 2003 at 19:18 UTC
    Obviously, you went beyond the memory limit of your machine. (If you have more memory on your machine, you can hold a bigger array.)

    I suggest doing one (or more) of the following three things:

    1. Add more RAM. (This solution will help, but will not fix your problem.)
    2. Move the array into a database. (This sounds like a perfect job for a database, actually.)
    3. Find a better algorithm.

    ------
    We are the carpenters and bricklayers of the Information Age.

    Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

    Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

      4.) distributed Computing

      You could write a simple server that hands out number ranges and recieves results and a client that does the number crunching. If you spread that over five to ten machines, you will get the solution much faster or even better add a database and find a better algorithm.
Re: Numeric limits in perl
by Elian (Parson) on Mar 25, 2003 at 19:25 UTC
    That's an array with about 72 billion elements (72,681,840,000 elements, to be exact) which'll take something like 280G of RAM to represent, if you're using 4 byte integers and a language with densely packed arrays. Perl isn't one of those languages. :)

    Sounds like what you need here are sparse arrays or, if you need a fully populated array of that size, a lot of RAM...

    I think you may find yourself going past normal limits with perl. You might want to look into perl's PDL extension to see if that might help you, but no matter how you look at it that's a darned enourmous array you're postulating.

      And what is more, consider the following:

      Assume that your code (I mean his) could evaluate 1000 reactions per second. We would be looking at 861... days! of computation. Even if you could farm out the m side of the matrix to 30 different machines you'd still be looking at 28 days of computation.

      Something clever is called for. I would imagine a lot of the search space will contain garbage. Something to prune the space will be a big help, like minimax or GAs. So the question is, npiazza, what are you trying to accomplish?

      _____________________________________________
      Come to YAPC::Europe 2003 in Paris, 23-25 July 2003.

        Given the subject of the original, I have no doubt that he doesn't need to fully populate the array, but is instead representing a sparse solution space based on a set of input data. Fully representing the array is not necessary in this case, you just have to be able to store data at any of the arbitrarily indexed locations and fetch it back later, along with potential default values for locations that are read from before being stored to.

        The problem's less of an issue than it might seem--more than anything else the limiting factor is the number of real elements stored, not the potential number of places data could be stored into.

Re: Numeric limits in perl
by Fletch (Bishop) on Mar 25, 2003 at 19:22 UTC

    Erm, you realize that just 30 x (n) double precision floats would be 581,454,720,000 bytes ( 541 gigabytes ). Perl's builtin limits aside (which unless you're on a 64bit platform and have a 64bit compiled perl (see the perl installation instructions for how to do this) you've long passed), unless you're on some really big iron you're not going to be able to hold even just the raw data in memory (let alone when you take into account overhead for perl's data structures).

Re: Numeric limits in perl
by Thelonius (Priest) on Mar 25, 2003 at 19:19 UTC
    You will have to find another way. 25*24*23*22*21*20*19 is approximately 2.4 billion. You will run out of memory on pretty much any language on nearly every computer.
Re: Numeric limits in perl
by dga (Hermit) on Mar 25, 2003 at 20:31 UTC

    People have suggested sparse arrays. If your data is really sparse, as in very sparse, not actually sparse, one way to do this is to use a hash with the numbers as keys. If your data isn't very sparse this will be much larger than an array of course. But say if you had 1,000,000 elements in that address space then a hash would only use say 12 megabytes or so.

    $chem{27}{242272800}=6; # one element = less than 30 bytes #or even $chem[27]{242272800}=6; # an array of hashes if all 'm's # are used but very few 'n's #as opposed to $chem[27][242272800]=6; # great heaping piles of RAM needed

    As mentioned the array of arrays solution will need the better part of a terabyte of virtual space to work while if you have sparse data, less than a Gig might suffice nicely.

Re: Numeric limits in perl
by bluto (Curate) on Mar 25, 2003 at 19:23 UTC
    Unless I calculated this wrong, that's 72,681,840,000 elements. I'm not sure how big your elements need to be, but even if they are 1 bit in size, you'll need about 9 GB of swap space just for the data. Since there is additional overhead, esp in perl, you'll probably need many times that much swap.

    If you can get away with a sparse array, do that (left as an excersize for a monk with too much time on their hands :-). Otherwise I'd suggest you rethink trying to store this in memory (whether you use perl or something else like C).

Re: Numeric limits in perl
by husker (Chaplain) on Mar 25, 2003 at 21:43 UTC
    You might want to take a look at using Judy Arrays. Although not directly implemented in Perl, the Judy algorithms are C-callable and are optimized for large data arrays. Hope that helps!
Re: Numeric limits in perl
by artist (Parson) on Mar 25, 2003 at 19:33 UTC
    You can certainly get more help in terms of algorithms and other ideas if yor post your problem with more details here. I think better algorithm is way to go. In the case like yours, for a momemnt forget the memory required, you would have to process that vast number of data with some further algorithm as well. Using the techniques like 'Genetic Algorithm' may also help you.

    artist

Re: Numeric limits in perl
by zakzebrowski (Curate) on Mar 26, 2003 at 00:13 UTC
    Sounds like a job for a database... As stated above your success depends on what you are trying to do... If you are doing simple operations and calculations, then consider storing / manipulating a database, so that way you can isolate the dependencies...

    ----
    Zak
    Pluralitas non est ponenda sine neccesitate - mysql's philosphy
Re: Numeric limits in perl
by toma (Vicar) on Mar 26, 2003 at 07:00 UTC
    If your version of perl is 32 bit, you will typically run into perl's problems around 2G bytes of total program memory usage. If this is too small you could try to build a 64 bit version of perl. I have not had success with 64 bit perl, but it has been a while since I tried it.

    Usually an operating system feature limits a perl program's memory usage, not perl's limits. Sometimes the operating system can be reconfigured to increase the maximum memory limit.

    It should work perfectly the first time! - toma

Re: Numeric limits in perl
by Elian (Parson) on Mar 26, 2003 at 14:56 UTC
    Lots of folks have commented, but you've not gotten much in the way of useful answers yet, so potentially here's one.

    If you don't need to transform or otherwise operate on your array as a matrix, you can use a hash of hashes, rather than an array of arrays, to represent your data. Use your indices as hash keys, rather than array indices, and you should be OK. That will suffice for plain data storage and retrieval.

    If you need something fancier, then I'd once again recommend looking at the PDL extension to perl. If that's insufficient you might want to consider either some fancy games with tie and custom variables (which has some speed issues) or a different language with features better suited to the problem set. (Mathematica, Matlab, or APL all spring to mind, though I have no doubt there's some less well-known languages and software packages that are better suited to this particular problem)

Re: Numeric limits in perl
by agent_foo (Novice) on Mar 26, 2003 at 17:41 UTC
    I too would suggest taking a look at the PDL...

    it's whole purpose is to provide an efficient, perl based mechanism for efficiently manipulating and operating over large data sets...

    you may particularly be interested in some of the bioinformatics resources available on the net where you will likely find existing code in your area of research...www.bioperl.com

    the native PDL data structure is commonly called a "piddle" and can represent something as simple as a two-dimensional array or n-dimensional data cube...you may have to rethink many of your algorithms though, primarily because PDL's builtin functions/operators take care of iterating over the dataset for you... hope this helps... .

Re: Numeric limits in perl
by zenn (Sexton) on Mar 26, 2003 at 13:55 UTC
    Your problem seems to be NP-Complete. If you are trying to find the best set by enumerating every possible one it would be exactly the Traveller salesman problem. To this set of problems there isn't a reasonable(polinomial) solution in this nowadays machines. The best solution is try to find good heuristic to search for suboptimal solutions. Perhaps you can think in another way to achieve the results.
      The tell-tale sign for NP, but not P problems is that they need a lot of running time, but not a lot of memory (for some value of "a lot"). NP problems, including the NP-complete problems are still solvable in polynomial space. That is, they are in P-SPACE.

      So, I'm very curious what makes you think hardly defined problem seems to be NP-Complete.

      Abigail

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://245754]
Approved by Paladin
Front-paged by diotalevi
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others goofing around in the Monastery: (4)
As of 2024-03-29 00:23 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found