PerlMonks  

Re^9: Best way to store/access large dataset?

by Speed_Freak (Sexton)
on Jun 28, 2018 at 19:15 UTC ( [id://1217581]=note )


in reply to Re^8: Best way to store/access large dataset?
in thread Best way to store/access large dataset?

ETL? Also, there is a core group of files that will be analyzed repeatedly, but the overall sets change, so files can be added to or removed from the calculations as needed.

Each record has 3 "columns" of data with a million rows per column. There are also a couple of other static values, each a single value per record. I believe that is one-to-many? The samples can also be grouped by another single static value stored with the record (the shape identifier).

I'm pretty lost when it comes to the database stuff, so honestly, I'm going to point my colleagues here and see what they say!

Replies are listed 'Best First'.
Re^10: Best way to store/access large dataset?
by stonecolddevin (Parson) on Jun 29, 2018 at 16:51 UTC

    ETL is extract/transform/load. So basically you'd be taking your raw data, extracting it out of the files and transforming it into a sensible format or data structure, and loading it up into a persistent data store.
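
    To make the three steps concrete, here is a minimal sketch in Python with SQLite. Everything specific in it is an assumption for illustration, not something from this thread: the tab-separated input with three numeric columns, the table names `sample` and `measurement`, the column names `a`, `b`, `c`, and the sample names and shape values in the demo. The parent table holds the per-file static values (including the shape identifier) and the child table holds the rows, which is the one-to-many layout discussed above.

```python
import csv
import io
import sqlite3

# Hypothetical schema: one parent row per sample file, many child rows per sample.
SCHEMA = """
CREATE TABLE sample (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL UNIQUE,
    shape TEXT NOT NULL
);
CREATE TABLE measurement (
    sample_id INTEGER NOT NULL REFERENCES sample(id),
    row_num   INTEGER NOT NULL,
    a REAL, b REAL, c REAL,
    PRIMARY KEY (sample_id, row_num)
);
"""


def extract(handle):
    """Extract: yield the raw fields of each tab-separated line."""
    yield from csv.reader(handle, delimiter="\t")


def transform(raw_rows):
    """Transform: number the rows and convert the three fields to floats."""
    for row_num, (a, b, c) in enumerate(raw_rows, start=1):
        yield row_num, float(a), float(b), float(c)


def load(conn, name, shape, rows):
    """Load: insert the parent row, then its child rows, in one transaction."""
    with conn:
        cur = conn.execute(
            "INSERT INTO sample (name, shape) VALUES (?, ?)", (name, shape)
        )
        sample_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO measurement VALUES (?, ?, ?, ?, ?)",
            ((sample_id, n, a, b, c) for n, a, b, c in rows),
        )
    return sample_id


def build_demo():
    """Run the pipeline on two tiny in-memory 'files' (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    demo_files = [
        ("sample_1", "round", "1.0\t2.0\t3.0\n4.0\t5.0\t6.0\n"),
        ("sample_2", "square", "7.0\t8.0\t9.0\n10.0\t11.0\t12.0\n"),
    ]
    for name, shape, text in demo_files:
        load(conn, name, shape, transform(extract(io.StringIO(text))))
    return conn


def mean_a_by_shape(conn):
    """Group the loaded rows by the parent's shape identifier."""
    return conn.execute(
        "SELECT s.shape, AVG(m.a) FROM sample s "
        "JOIN measurement m ON m.sample_id = s.id "
        "GROUP BY s.shape ORDER BY s.shape"
    ).fetchall()


if __name__ == "__main__":
    print(mean_a_by_shape(build_demo()))
```

    With a layout like this, adding or removing a file from a calculation becomes a matter of which `sample` rows a query selects, so the raw files only need to be parsed once.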

    A million rows per column is much more reasonable than a million columns. That's still a ton of data, depending on how many parent rows the associated rows have. I have some ideas, but honestly it's probably best to get your co-workers' feedback first, since they know the data, and then ask whatever other specific questions you have.

    Three thousand years of beautiful tradition, from Moses to Sandy Koufax, you're god damn right I'm living in the fucking past
