PerlMonks  

Re: System call doesn't work when there is a large amount of data in a hash

by swampyankee (Parson)
on Apr 28, 2020 at 20:55 UTC


in reply to System call doesn't work when there is a large amount of data in a hash

I think you are going to need to supply more information for any of the monks here to provide a useful answer, like the version of Perl, the O/S you are using, and which system call is being made.
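(Perl can report the first two of those itself; a minimal check, nothing specific to the original code:)

```perl
#!/usr/bin/perl
use strict;
use warnings;

# $] holds the running Perl's version, $^O the operating system name.
print "Perl version: $]\n";   # e.g. 5.026002 for v5.26.2
print "OS name:      $^O\n";  # e.g. "linux" on CentOS
```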

Be forewarned that I'll not be the only Monk asking: why do you have 250 GB of data in a hash?


Information about American English usage here and here. Floating point issues? Please read this before posting. — emc


Re^2: System call doesn't work when there is a large amount of data in a hash
by Nicolasd (Acolyte) on Apr 28, 2020 at 21:26 UTC
    Thanks for the reply. The Perl version is v5.26.2 and the O/S is CentOS 7. I need these large hashes to store genetic data; it's for a genome assembly tool. I want to add a new module, and I need a system call for that, but I can't get it to work when I run it on large datasets. I am not an informatician, so I have limited knowledge. Any help would be greatly appreciated. https://github.com/ndierckx/NOVOPlasty

      Hi,

      " I need these large hashes to store genetic data in a hash"

      That's a bit like saying " I need these hashes because I need these hashes."


      Does your "genome assembly tool" accept Perl data hashes as input? Of course it does not. Therefore you must be somehow serializing your massive input to the program in your system call. Perhaps you need to write a file, or provide a data stream to a server? As noted by my learned colleague swampyankee, it's hard to conceive of why you need to store 250 GB of data in an in-memory hash. There are myriad techniques to avoid doing so, depending on your context; why don't you explain a bit more about that, and show some code?
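      One common pattern is to stream records to the external program through a pipe, one at a time, instead of materializing everything in memory first. A minimal sketch (here `sort` writing to a temp file stands in for the real tool, and the records are made-up examples):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $tmp) = tempfile();
close $fh;

# Stream records to the external command one at a time; nothing is
# held in a giant in-memory structure along the way.
open my $out, '|-', "sort -o $tmp" or die "cannot start sort: $!";
print {$out} "$_\n" for qw(banana apple cherry);
close $out or die "sort failed: $? $!";

open my $in, '<', $tmp or die "cannot read results: $!";
my @sorted = <$in>;
close $in;
chomp @sorted;
print "@sorted\n";   # apple banana cherry
unlink $tmp;
```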

      Hope this helps!


      The way forward always starts with a minimal test.

      I'm not a bioinformatician either, but that repo has some problems: filenames using the : character, and a single Perl file over 1 MB with more than 23K lines, a quick glance at which shows room for improvement. I'm not sure whether part of the relatively popular BioPerl suite of tools can address your requirements. Regardless, all of this is good advice: you don't need to store everything in memory, even if you are just planning to call some external command-line tool. Consider an alternative such as a database.
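      A tied on-disk hash keeps the familiar hash interface while moving the data out of RAM. A sketch using the in-core SDBM_File (the filename and key/value are made-up; DB_File, BerkeleyDB, or SQLite would scale better for real genome-sized data):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl;
use SDBM_File;   # ships with Perl; heavier DBM back ends scale better

# Tie a hash to an on-disk file so lookups hit disk, not RAM.
my $file = "demo_assembly_db";
tie my %reads, 'SDBM_File', $file, O_RDWR | O_CREAT, 0644
    or die "tie failed: $!";

$reads{read1} = 'ACGTACGT';          # used exactly like a normal hash
my $seq = $reads{read1};
print "read1 => $seq\n";             # read1 => ACGTACGT

untie %reads;
unlink "$file.pag", "$file.dir";     # clean up the demo files
```

      Note SDBM has small per-record size limits; it is shown here only because it needs no installation.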

        I know I could have written it better; it's a bit of a mess, but it works well, and that's the most important thing. And I really need that hash, because I need to access that data all the time; a database would be too slow. Which file is using the : character?

        Could it be that the system call duplicates everything that is in virtual memory to start the child process? If that is the case, I guess I just can't do system calls. Any idea if there is another way?
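        For what it's worth: `system` does fork the current process before exec'ing the command, and on Linux a fork from a very large process can fail with "Cannot allocate memory", depending on the kernel's overcommit settings. A `system` return of -1 means the launch itself failed, and `$!` says why. A diagnostic sketch (`true` stands in for the real command):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Distinguish "could not even fork/exec" from "command ran but
# exited non-zero". With a huge parent, a -1 is often ENOMEM.
my $rc = system('true');
if ($rc == -1) {
    die "failed to launch command: $!";
}
elsif ($rc != 0) {
    warn "command exited with status ", $rc >> 8, "\n";
}
else {
    print "command succeeded\n";
}
```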
