Perl Monk, Perl Meditation | |
PerlMonks |
Re^6: Best way to store/access large dataset?by Speed_Freak (Sexton) |
on Jun 26, 2018 at 23:03 UTC ( [id://1217475]=note: print w/replies, xml ) | Need Help?? |
I am getting the following field in the dumper " 'ID' => $VAR1->[0]{'ID'}," added to the top of the output list for each attribute
I am using the following script:
I'll continue looking at that to see if I can see why it isn't carrying the attribute ID forward. But I wanted to ask some more questions, and answer yours! Within the code you have written, is it possible to move the attribute ID "outside" of the grouped data? As in:
The above may answer your question about the attribute ID's being defined. The numbers in the left column of the attribute demo dataset (1-30) are identifiers for that attribute. They could be names, or serial numbers...etc. But they are how I identify that attribute so I can look at it later. At the end of this, I actually need a list of the attributes that pass a True/False statement based on a series of percentages. This is where the grouping comes in. Your script groups the datasets by their category by shifting the category in place of the file name. Which works great, as I am not so concerned with carrying the file name forward. So the categories in this case are "Square" "Circle" Rectangle" "Triangle." What I would need to do then is look at each attribute...So for attribute 7 in the code block above. I would have a series of True/False statements that asked for each category: "Does this attribute occur in Circle more than 50% of the time, and less than 10% of the time in Triangle, Rectangle, and Square?" Then I would have another statement asking the same question about that attribute for the next category. "Does this attribute occur in Square more than 50% of the time, and less than 10% of the time in Triangle, Rectangle, and Circle?" And so on and so forth for each unique category identified in the Attributes_ID file. At the end of that, I would generate a list of attributes that scored "True" for each category. Which brings me to my two questions.
Or, 2) Would it be better to do this one line at a time with the True/False qualifiers built into the loop? As in, read in the first attribute row, with the categories, and evaluate the attribute for each category and store that True/False for that quartet in each category before moving to the next attribute? (It would be good to note that the categories change, but are defined in the attribute_ID file. SO it would be based on unique entries there.) Down the rabbit hole!
In Section
Seekers of Perl Wisdom
|
|