
Re^3: Algorithm to reduce the weight of a collection of bags

by haukex (Archbishop)
on Jul 06, 2022 at 21:14 UTC


in reply to Re^2: Algorithm to reduce the weight of a collection of bags
in thread Algorithm to reduce the weight of a collection of bags

My terminal is 158 characters wide; my CSV files have relatively few fields (under 15); the columns with the widest data tend to be unstructured text, where the widest cells are generally much wider than the average cell. Currently I'm browsing bank transaction files, where the Description column is the widest, and is the safest to truncate without chopping off "important" information.

Thanks for the context. In that case I personally would probably take the pragmatic route and attempt to identify those text columns based on their width, and truncate those, while not truncating any columns under a certain length to make sure I don't truncate amounts or dates (perhaps even trying to identify such "important" column data types with regexes). But as you said, since this kind of thing is also fun, I understand wanting a more generic solution - at the moment I just don't have good tips for that. As to the existing modules, I didn't have the time to look through all of them to see if maybe there is one that already limits its output width to the terminal width "intelligently" - but perhaps another solution would be to implement the truncation yourself before passing the data off to a module for the output.
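A minimal sketch of that pragmatic approach, under some loud assumptions: the sub name cap_wide_columns and the $protect threshold are made up for illustration, column separators are ignored, and each "wide" column is simply capped at an equal share of whatever width is left over. It is not meant as the generic solution, and the result can still overshoot the terminal width if the threshold floor kicks in.

```perl
use strict;
use warnings;
use List::Util qw(sum max);

# Hypothetical helper: cap only the "wide" columns so the row fits $term_width.
# Columns at or below $protect (an assumed threshold) are never truncated,
# which is meant to keep dates and amounts intact.
sub cap_wide_columns {
    my ($widths, $term_width, $protect) = @_;
    my @out = @$widths;
    return \@out if sum(0, @out) <= $term_width;    # already fits

    my @wide = grep { $out[$_] > $protect } 0 .. $#out;
    return \@out unless @wide;                      # nothing safe to truncate

    # Width used by the protected columns, then an equal share of the rest
    # for each wide column, never going below the protection threshold.
    my $fixed = sum(0, map { $out[$_] } grep { $out[$_] <= $protect } 0 .. $#out);
    my $share = max($protect, int(($term_width - $fixed) / scalar @wide));
    for my $i (@wide) {
        $out[$i] = $share if $out[$i] > $share;
    }
    return \@out;
}

my $capped = cap_wide_columns([10, 12, 60, 8, 40], 100, 15);
print join(', ', @$capped), "\n";
```

With those made-up widths, the three narrow columns are left alone and the two wide ones are each capped at 35, for a total of 100.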

Re^4: Algorithm to reduce the weight of a collection of bags
by ibm1620 (Hermit) on Jul 06, 2022 at 22:59 UTC
    This is the truncation (or weight-reduction) algorithm as it now stands.
    #!/usr/bin/env perl
    use v5.36;    # implies warnings
    no warnings q/experimental::for_list/;
    no warnings q/experimental::builtin/;
    use builtin qw/indexed/;
    use List::Util qw/sum/;

    my $target_weight = shift // die 'need target_weight';
    my @weights = ( 20, 3, 25, 10, 3, 24, 25 );

    say "Before:\n" . display( \@weights, $target_weight );
    shrink( \@weights, $target_weight );
    say "After:\n" . display( \@weights, $target_weight );
    die if sum(@weights) != $target_weight;

    sub shrink ( $bags, $target_weight ) {
        my $curr_weight = sum @$bags;
        return if ( $curr_weight <= $target_weight );    # no shrink req'd

        my @refs = sort { ${$b} <=> ${$a} } map \$_, @$bags;
        BAG: for my ($i, $ref) ( indexed @refs ) {
            my $next_wt = $i < $#refs ? ${$refs[$i+1]} : 0;
            my $drop = $$ref - $next_wt;
            my $lowered_weight = $curr_weight - $drop * ( $i + 1 );
            if ( $lowered_weight >= $target_weight ) {
                for ( 0 .. $i ) {
                    ${$refs[ $_ ]} -= $drop;
                }
                $curr_weight = $lowered_weight;
            }
            else {
                use integer;
                my $target_loss = $curr_weight - $target_weight;
                my $div = $target_loss / ( 1 + $i );
                my $rem = $target_loss % ( 1 + $i );
                for ( reverse 0 .. $i ) {
                    ${$refs[ $_ ]} -= $div + ( $rem-- > 0 ? 1 : 0 );
                }
                last BAG;
            }
        }
    }

    sub display ($aref, $target) {
        my $r = '';
        for my ( $i, $wt ) ( indexed @$aref ) {
            $r .= sprintf " %2s: {%s} (%d)\n", "#$i", ( '=' x $wt ), $wt;
        }
        $r .= sprintf "Weight %d, target=%d\n", sum(@$aref), $target;
        return $r;
    }
    Note that, having brought the four highest weights down to 16 but still needing to trim one more character, it took it from the bag that was originally the lightest of the four (#0), thus never violating the original ranking.
    $ shrink 79
    Before:
     #0: {====================} (20)
     #1: {===} (3)
     #2: {=========================} (25)
     #3: {==========} (10)
     #4: {===} (3)
     #5: {========================} (24)
     #6: {=========================} (25)
    Weight 110, target=79

    After:
     #0: {===============} (15)
     #1: {===} (3)
     #2: {================} (16)
     #3: {==========} (10)
     #4: {===} (3)
     #5: {================} (16)
     #6: {================} (16)
    Weight 79, target=79
    Plugging this into my simple-minded CSV columnizer gave me exactly what I wanted. It remains to be seen if I'll ever want to apply more sophisticated, data-aware methods of narrowing. :-)
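    For anyone wondering what the "plugging in" step might look like, here is a rough sketch and not the actual columnizer. It assumes the per-column widths have already been computed (for example by the shrink step above); the sub name format_row and the sample transaction row are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: cut each cell to its column's width, then left-justify
# and pad it so the columns line up. $widths is assumed to already hold the
# (possibly shrunken) width for every column.
sub format_row {
    my ($cells, $widths) = @_;
    return join ' ', map {
        sprintf '%-*s', $widths->[$_],
            substr( $cells->[$_] // '', 0, $widths->[$_] )
    } 0 .. $#$widths;
}

# Made-up sample row: only the Description column is wider than its slot.
my @row    = ( '2022-07-06', 'GROCERY STORE #1234 ANYTOWN', '-12.34' );
my @widths = ( 10, 12, 6 );
print format_row( \@row, \@widths ), "\n";
```

    Cells shorter than their column width are padded with spaces; longer ones are simply cut off at the right edge, with no ellipsis or other marker.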
