http://qs321.pair.com?node_id=1089283


in reply to Perl text processing

Let's call the file with the interesting categories imei_tags.txt and the file with 3 million devices imeis.txt. Something like the following would then do the job:
WARNING: Untested, off-the-top-of-my-head code
#! /usr/bin/perl
use strict;
use warnings;

open (my $imeitag , '<', '/path/to/imei_tags.txt');
my %devices;
while(<$imeitag>){
    $devices{$1}=0 if /^\s*(\d{8}$/;
}
close $imeitag;

open (my $imei, '<', '/path/to/imeis.txt');
while(<$imei>){
    my $imeitag=$1 if /^\s*(\d{8}\d+\s*$/
    $devices{$imeitag}++ if defined $devices{$imeitag};
}
close $imei;

my $count=0;
for my $imeitag (sort {$devices{$a}<=>$devices{$b}} keys %devices){
    print "$imeitag\n";
    $count++;
    last if $count >=100;
}
print "Good ",qw(night morning afternoon evening)[(localtime)[2]/6]," fellow monks."

Replies are listed 'Best First'.
Re^2: Perl text processing
by AnomalousMonk (Archbishop) on Jun 09, 2014 at 17:14 UTC

    Some untested thoughts:

    open (...);

    The return status of the open statements is not checked, and  use autodie;  is not used as an alternative, so a missing or unreadable file goes unnoticed.
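
    A minimal sketch of the explicit check, assuming a small wrapper is acceptable. The name open_for_reading is hypothetical and not part of the original post; the point is only the  open ... or die  pattern with $! in the message.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original post): open a file for reading
# and die with the OS error message ($!) if the open fails.
sub open_for_reading {
    my ($path) = @_;
    open(my $fh, '<', $path)
        or die "Cannot open '$path' for reading: $!";
    return $fh;
}
```

    Putting  use autodie;  near the top of the script would have a similar effect without a wrapper, since open then throws an exception on failure.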

    $devices{$1}=0 if /^\s*(\d{8}$/;

    Capture group  (\d{8}$  is not closed; presumably  (\d{8})$  was intended.

    my $imeitag=$1 if /^\s*(\d{8}\d+\s*$/

    Capture group  (\d{8}\d+\s*$  is not closed; the statement is not terminated (missing semicolon);  my $imeitag ... if ... ;  conditionally creates a lexical (the pre-state static variable hack), and perlsyn documents the behaviour of a  my  with a statement modifier as undefined.
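
    One way to avoid the conditional  my  is to declare the variable unconditionally and let a list-context match fill it. This is a sketch only: the sub name imei_prefix is hypothetical, and the regex assumes the intent was to capture the first 8 digits of a longer all-digit line.

```perl
use strict;
use warnings;

# Hypothetical helper: return the 8-digit prefix of an IMEI line,
# or undef if the line does not look like an IMEI.
sub imei_prefix {
    my ($line) = @_;
    # In list context a failed match returns the empty list,
    # so $prefix is simply left undefined.
    my ($prefix) = $line =~ /^\s*(\d{8})\d+\s*$/;
    return $prefix;
}
```

    Inside the read loop this could be used as  my $tag = imei_prefix($_);  followed by  next unless defined $tag;  so that unmatched lines are skipped explicitly.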

    for my $imeitag (sort {$devices{$a}<=>$devices{$b}} keys %devices){
        print "$imeitag\n";
        $count++;
        last if $count >=100;
    }

    Sorts the hash keys by ascending count and then prints the first 100 of them, i.e. the 100 least frequent tags, which does not seem in accord with the requirement to "output ... the top 100 devices" (whatever "top" may exactly mean in the context of the OP).
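
    If "top" is taken to mean "highest count" (an assumption, since the OP does not say), the sort comparison needs to be reversed. A sketch, with a hypothetical sub name top_tags and made-up sample counts:

```perl
use strict;
use warnings;

# Hypothetical helper: return up to $n keys of %$counts, highest count first.
# Ties are broken by the tag itself so the output order is deterministic.
sub top_tags {
    my ($counts, $n) = @_;
    my @sorted = sort {
        $counts->{$b} <=> $counts->{$a}   # descending by count
            or $a cmp $b                  # then ascending by tag
    } keys %$counts;
    $n = @sorted if $n > @sorted;         # fewer than $n tags available
    return @sorted[0 .. $n - 1];
}

# Made-up sample data, for illustration only.
my %devices = ('35123456' => 7, '35999999' => 0, '01234567' => 12);
print "$_\t$devices{$_}\n" for top_tags(\%devices, 2);
```

    With the real %devices hash the call would be  top_tags(\%devices, 100) . Sorting every key is simple and is likely fine when the tag file is small; only the device file has 3 million lines, and that one is merely counted, not sorted.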