PerlMonks  

Re^3: Find what characters never appear

by kennethk (Abbot)
on Sep 04, 2009 at 23:21 UTC ( [id://793627]=note )


in reply to Re^2: Find what characters never appear
in thread Find what characters never appear

If you want to avoid potential issues with regex metacharacters, you can use a set of hash keys to track which characters have not been seen yet, and rebuild the regex (with \Q...\E quoting) each time a new character is found:

#!/usr/bin/perl
use strict;
use warnings;

my %char_hash = ();
$char_hash{ chr($_) } = undef foreach (33 .. 127);

my $chars = join "", keys %char_hash;
my $regex = "([\Q$chars\E])";

while (<DATA>) {
    while (/$regex/g) {
        delete $char_hash{$1};
        $chars = join "", keys %char_hash;
        $regex = "([\Q$chars\E])";
    }
}

my @good_array = keys %char_hash;
print @good_array;

__DATA__
!"#$%&'()*+,-./01234567
89:;<=>?@ABCDE
FGHIJKLMOPQRSTUVWXYZ[\]^_`abcdefghijklmnop
qrstuvwxyz{|}~

though I feel like there must be a simpler way of implementing this approach.
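One possibly simpler variant, offered only as a sketch: skip the regex entirely and delete every character of each line from the hash with a hash slice. The helper name unseen_chars, its arguments, and the in-memory filehandle are assumptions made for illustration; none of them come from the thread. The range 33..126 is used here so that DEL (127) is not counted.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not from the thread: reads lines from $fh and returns
# the characters with code points $lo..$hi that never occur, sorted.
sub unseen_chars {
    my ($fh, $lo, $hi) = @_;

    # Start with every candidate character marked as unseen.
    my %unseen;
    @unseen{ map { chr($_) } $lo .. $hi } = ();

    while (my $line = <$fh>) {
        # Deleting a key that is not in the hash is a harmless no-op,
        # so no metacharacter quoting or regex rebuilding is needed.
        delete @unseen{ split //, $line };
        last unless %unseen;    # stop early once everything has been seen
    }
    return sort keys %unseen;
}

# Same sample data as the post above ("N" is deliberately missing).
my $data = <<'END';
!"#$%&'()*+,-./01234567
89:;<=>?@ABCDE
FGHIJKLMOPQRSTUVWXYZ[\]^_`abcdefghijklmnop
qrstuvwxyz{|}~
END

open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";
my @never = unseen_chars($fh, 33, 126);
close $fh;

print "Never seen: @never\n";
```

This splits every line into single characters, so on a very large file it may well be slower than the shrinking character class; it has not been benchmarked here.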

Re^4: Find what characters never appear
by Narveson (Chaplain) on Sep 05, 2009 at 13:35 UTC

    This ran in just a few minutes against my big 2GB file.

    All I had to do was change the printable range to 33..126, change <DATA> to <>, and for my own curiosity, add print "$1 seen on line $.\n"; after delete $char_hash{$1};
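The changes described in this reply might look roughly like the sketch below. To keep it self-contained, the loop is wrapped in a hypothetical sub (report_first_sightings, an invented name) that reads from a filehandle argument rather than from <>, and the demo uses an in-memory string. The "last LINE" guard is also an addition that is not in the thread: it avoids building an empty character class if every character ends up being seen.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical wrapper (name and filehandle argument are assumptions):
# the parent script with the range narrowed to 33..126 and a report of
# the line on which each character is first seen.
# Returns the characters that were never seen, sorted.
sub report_first_sightings {
    my ($fh) = @_;

    my %char_hash;
    $char_hash{ chr($_) } = undef foreach (33 .. 126);

    my $chars = join "", keys %char_hash;
    my $regex = "([\Q$chars\E])";

    LINE: while (<$fh>) {
        while (/$regex/g) {
            delete $char_hash{$1};
            print "$1 seen on line $.\n";

            # Added guard, not in the thread: an empty class "[]"
            # would not compile as a regex.
            last LINE unless %char_hash;

            $chars = join "", keys %char_hash;
            $regex = "([\Q$chars\E])";
        }
    }
    return sort keys %char_hash;
}

# Demo on an in-memory filehandle; a real run would pass an opened file
# or \*STDIN instead.
my $text = "Hello\nworld\n";
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
my @unseen = report_first_sightings($fh);
close $fh;

print scalar(@unseen), " characters never seen\n";
```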
