in reply to Useful heuristics for analyzing arrays of data to determine column header
This is a very interesting endeavour! Here are my two cents:
- If the first cell of a column is a string and every other cell is a number, the column almost certainly has a header. Scalar::Util::looks_like_number is useful for the numeric test.
- If the first cell is a number, it is unlikely to be a header.
- If the first cell is a string that also appears further down the column, it is unlikely to be a header.
- If the first cell's value is unique within the column while other values occur multiple times, it is likely a header. This should be easy to implement.
- I would assign a header likelihood to each column. If the average across columns is above a threshold, or if at least one column is certain to have a header, treat the first row as a header row.
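
To make the idea concrete, here is a minimal sketch of how these heuristics could be combined. The function names (column_header_score, has_header_row), the individual scores (0.1, 0.2, 0.5, 0.8, 1) and the default threshold of 0.6 are my own guesses for illustration, not tested values, so tune them against real data. It assumes the data is an array of array references, one per row.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);
use List::Util qw(sum);

# Score one column: likelihood (0..1) that its first cell is a header.
# All weights below are illustrative assumptions.
sub column_header_score {
    my ($first, @rest) = @_;
    return 0.5 unless defined($first) && @rest;   # nothing to compare against

    # A numeric first cell is unlikely to be a header.
    return 0.1 if looks_like_number($first);

    # String on top, only numbers below: treat as certain.
    my @filled = grep { defined($_) && length($_) } @rest;
    return 1 if @filled && !grep { !looks_like_number($_) } @filled;

    # The first value shows up again further down: probably data.
    return 0.2 if grep { $_ eq $first } @filled;

    # First value is unique while other values repeat: probably a header.
    my %seen;
    $seen{$_}++ for @filled;
    return 0.8 if grep { $_ > 1 } values %seen;

    return 0.5;   # no evidence either way
}

# Decide for the whole table: true if any column is certain, or if the
# average score over all columns exceeds the (assumed) threshold.
sub has_header_row {
    my ($rows, $threshold) = @_;
    $threshold //= 0.6;
    return 0 unless @$rows && @{ $rows->[0] };

    my $last_col = $#{ $rows->[0] };
    my @scores = map {
        my $c = $_;
        column_header_score(map { $_->[$c] } @$rows);
    } 0 .. $last_col;

    return 1 if grep { $_ == 1 } @scores;
    return (sum(@scores) / @scores) > $threshold ? 1 : 0;
}
```

Called as has_header_row(\@rows), this returns 1 for something like (['name', 'age'], ['bob', 30], ['alice', 25]), because the second column is a string followed only by numbers. The number of columns is taken from the first row, so ragged rows are tolerated but not handled in any clever way.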