(jcwren) RE: regexp's
by jcwren (Prior) on Oct 03, 2000 at 06:24 UTC
Or, in fewer lines...
#!/usr/local/bin/perl -w
use strict;
{
    my %hash = ();
    while (<DATA>)
    {
        my @items = split;
        $hash{pop @items} = shift @items;
    }
    print "key=$_, val=$hash{$_}\n" foreach sort keys (%hash);
}
__DATA__
1 time 02:11:05 djw
5 time 04:20:03 bert
2 time 00:01:39 chris
[jcw@linux fs]$ perl q.pl
key=bert, val=5
key=chris, val=2
key=djw, val=1
[jcw@linux fs]$
--Chris
e-mail jcwren
Benchmark: timing 50000 iterations of Ovid, jcwren...
      Ovid:  4 wallclock secs ( 3.32 usr +  0.00 sys =  3.32 CPU) @ 15060.24/s (n=50000)
    jcwren:  2 wallclock secs ( 2.63 usr +  0.00 sys =  2.63 CPU) @ 19011.41/s (n=50000)
          Rate   Ovid jcwren
Ovid   15060/s     --   -21%
jcwren 19011/s    26%     --
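For anyone who wants to reproduce numbers like these, here is a minimal sketch using cmpthese from the core Benchmark module. The sub names by_regex and by_split, and the @lines sample, are my own assumptions adapted from the two posts in this thread; this is not necessarily the code that produced the figures above.

```perl
#!/usr/bin/perl -w
use strict;
use Benchmark qw(cmpthese);

# Sample input copied from jcwren's __DATA__ section (assumed representative).
my @lines = (
    "1 time 02:11:05 djw\n",
    "5 time 04:20:03 bert\n",
    "2 time 00:01:39 chris\n",
);

# Hypothetical wrapper around the regex approach from Ovid's post.
sub by_regex {
    my %hash;
    for (@lines) {
        $hash{$2} = $1 if /^(\d+)\s+\S+\s+\S+\s+([a-zA-Z]+)$/;
    }
    return \%hash;
}

# Hypothetical wrapper around the split/pop/shift approach from jcwren's post.
sub by_split {
    my %hash;
    for (@lines) {
        my @items = split;
        $hash{pop @items} = shift @items;
    }
    return \%hash;
}

# Time 50000 iterations of each sub and print a comparison table.
cmpthese(50000, {
    Ovid   => \&by_regex,
    jcwren => \&by_split,
});
```

Absolute rates will differ from machine to machine, so only the relative ordering is worth comparing.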
--
$you = new YOU;
honk() if $you->love(perl)
My GOD that is elegant. Too bad I can only ++ you one time on that one. This is one of the most elegant things I have EVER seen, save for something merlyn did in Effective Perl Programming:
($_ & ~$_) eq 0
to determine if a scalar is a number (I may be slightly off on the actual expression)
(Ovid) Re: regexp's
by Ovid (Cardinal) on Oct 03, 2000 at 06:20 UTC
Assuming that the data is in a file called "data.txt", I might try something like the following (untested):
#!/usr/bin/perl -w
use strict;
my %somehash;
my $file = 'data.txt';
open FILE, "<$file" or die "Can't open $file for reading: $!";
while (<FILE>) {
    if (/^(\d+)\s+[^\s]+\s+[^\s]+\s+([a-zA-Z]+)$/) {
        $somehash{$2} = $1;
    }
}
close FILE;
The regex breaks out as follows:
/^ # Anchor to beginning of string
( # Capture to $1
\d+ # one or more digits
) #
\s+ # One or more whitespace
[^\s]+ # One or more non-whitespace
\s+ # One or more whitespace
[^\s]+ # One or more non-whitespace
\s+ # One or more whitespace
( # Capture to $2
[a-zA-Z]+ # One or more letters
) #
$/x; # Anchor to end of string
For more information about why I did not use a simpler regex like /^(\d+).*\b(\w+)$/, you may want to read Death to Dot Star!.
Simpler, however, would be to use a split (also untested):
while (<FILE>) {
    chomp;
    # Split on runs of whitespace so repeated spaces or tabs don't shift the fields
    my ($value, $key) = (split /\s+/, $_)[0,3];
    $somehash{$key} = $value;
}
Cheers,
Ovid
Update: I would just like to say that I have no frickin' idea why I wrote that regex. Yes, it works. So what? I saw regex in the title and got carried away.
Use the split;
UpdateII: Yup. I have the key value backwards. It's fixed now. Sigh.
Join the Perlmonks Setiathome Group or just go to the link and check out our stats.
RE: regexp's
by vladdrak (Monk) on Oct 03, 2000 at 06:26 UTC
use strict;
my %hash = ();
while (<>) {
    my ($num, $name) = (split /\s/, $_)[0,3];
    $hash{$name} = $num;
}
foreach (keys %hash) {
    print "Key: $_\n";
    print "Data: $hash{$_}\n";
}
Re: regexp's
by Cybercosis (Monk) on Oct 03, 2000 at 10:37 UTC
Well, if it's in a file, I'd do this:
@ARGV = ('filename.txt');
while(<>)
{
    /^(\d)\s+time\s+\d{2}\:\d{2}\:\d{2}\s+(\w+)/;
    $hash{$2} = $1;
}
-------------------update--------------
as per merlyn's advice: (godz it's hard not to try to make spaceballs jokes...)
while(<>)
{
    if(/^(\d)\s+time\s+\d{2}\:\d{2}\:\d{2}\s+(\w+)/)
    {
        $hash{$2} = $1;
    }
}
or didn't i read something about this possibly being acceptable:
while(<>)
{
    {
        /^(\d)\s+time\s+\d{2}\:\d{2}\:\d{2}\s+(\w+)/;
        $hash{$2} = $1;
    }
}
since the enclosing {} puts it in a separate block? i fully expect to be waaaay off on this.
Re: regexp's
by ChOas (Curate) on Oct 03, 2000 at 19:08 UTC
#!/usr/bin/perl -w
use strict;
my %Hash;
$Hash{substr($_,rindex($_," ")+1,-1)} = substr($_,0,index($_," ")), while(<>);
print "key=$_, val=$Hash{$_}\n" foreach sort keys (%Hash);
GrtZ! ;)))
(Dermot) Re: regexp's
by Dermot (Scribe) on Oct 03, 2000 at 17:54 UTC
Untested but should be enough to give you the idea:
foreach (@array)
{
    /^(\d+).*\b(\w+)$/;
    $hash{$2} = $1;
}
- For each element in @array
- Start of line
- Capture digits to $1
- Don't capture stuff between $1 and $2
- Capture alphanumerics to $2
- End of line
- Use values captured in $1 and $2 to populate hash
Update: Long live Dot Star. The problem is simple; I believe the solution should be too.
Update II: Ovid, just a nitpick, don't hate me for this, but he wanted the last field as key and first field as value, not the other way around. jcwren, that is a beautiful solution.
AAAAAAARRRRRRRGGGGGGGGGHHHHHHHHH! I had them backwards!!!! I hate it when I do that!!!
Yes, I'm anal about the .* thing. I'm also anal about use strict, -w, checking that my open actually opened something, etc. I do see your point and I acknowledge that your regex is much simpler to read. However, iteration combined with dot star is begging for issues. In this case, though, since the backtracking appears to be small (just 4 characters, max) -- assuming that this is not just an over-simplified subset of data -- it's probably not that much of an issue.
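To make the dot star concern concrete, here is a small sketch of what a greedy .* does on one line of the sample data, with and without a word boundary before the final group. The helper last_capture is a hypothetical name of mine, used only for illustration.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: return whatever the second capture group matched,
# or undef if the pattern did not match at all.
sub last_capture {
    my ($line, $re) = @_;
    return $line =~ $re ? $2 : undef;
}

my $line = "1 time 02:11:05 djw";

# Greedy .* swallows as much as it can, then gives back one character,
# so (\w+) is left with only the final letter.
my $greedy = last_capture($line, qr/^(\d+).*(\w+)$/);      # $greedy is "w"

# A word boundary before the last group forces .* to back off until
# (\w+) starts at the beginning of the last word.
my $bounded = last_capture($line, qr/^(\d+).*\b(\w+)$/);   # $bounded is "djw"

print "greedy:  $greedy\n";
print "bounded: $bounded\n";
```

Both patterns match, which is exactly why this sort of thing is easy to miss: the unbounded version succeeds quietly and hands back the wrong key.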
Cheers,
Ovid
RE: regexp's
by Anonymous Monk on Oct 03, 2000 at 20:25 UTC
Well, this is one way:
foreach (@array)
{
    my ($key,$val);
    /\w+$/ and $key = $&;
    /^\d+/ and $val = $&;
    $hash{$key} = $val;
}
This will slow down all your pattern matches because of using $&. The alternative, if you don't use $& or similar variables elsewhere, would be
$key = $_; $val = $_;
and then some kind of substitution.
It also assumes that numbers are always, er, numbers and the names are always one word. If not, you could use \b for boundary-matching.
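Capture groups are another way to stay clear of $&. Note that this swaps in a different technique from the substitution idea above, and the helper name build_hash is hypothetical. A rough sketch, assuming each element looks like the sample lines elsewhere in this thread:

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: build the name => number hash using capture groups
# instead of $&, skipping any element that lacks either field.
sub build_hash {
    my @array = @_;
    my %hash;
    foreach (@array) {
        # The last word on the line becomes the key.
        my ($key) = /\b(\w+)\s*$/ or next;
        # The leading run of digits becomes the value.
        my ($val) = /^(\d+)/ or next;
        $hash{$key} = $val;
    }
    return %hash;
}

my %hash = build_hash("1 time 02:11:05 djw", "5 time 04:20:03 bert");
print "key=$_, val=$hash{$_}\n" foreach sort keys %hash;
```

The "or next" guards mean a line missing either field is skipped rather than stored with an undefined key or value.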
dave
RE: regexp's
by Anonymous Monk on Oct 03, 2000 at 21:49 UTC
# you might have to refine the regexp, but that should
# match your array
foreach $element (@array)
{
    if ($element =~ /(\d+).*\s+(\w+)/)
    {
        $hash{$2} = $1;
    }
}
# probably not the best, but it should work!