Your skill will accomplishwhat the force of many cannot PerlMonks

### Number of bits, length of RegExp

by YAFZ (Pilgrim)
 on Apr 21, 2003 at 14:22 UTC Need Help??

YAFZ has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

Let's assume that I have a file called bit.txt :
```bit.txt
=====================
110101010110111010101
100000000000000000001
101010101010101010101
110000000000000000011
110100000000000000000
11
And I want two things.

1) The regular expression that matches the lines containing an even number of 1s.
2) The regular expression that matches the lines containing an odd number of 1s.

```1) ^(0*10*10*)*\$
2) ^(0*10*10*)*1(0*10*10*)*\$
My question is: Are these the shortest possible RegExps for the job? Especially the second one? Can it be shorter?

Replies are listed 'Best First'.
Re: Number of bits, length of RegExp
by pfaut (Priest) on Apr 21, 2003 at 14:36 UTC

Does it have to be a regex? tr/// provides a simpler solution.

```#!/usr/bin/perl -w
use strict;
while (my \$str = <DATA>) {
my \$odd = \$str =~ tr/1// & 1;
print \$str,"has an ",\$odd?"odd":"even"," number of 1's",\$/;
}
__DATA__
110101010110111010101
100000000000000000001
101010101010101010101
110000000000000000011
110100000000000000000
11

Output:

```110101010110111010101
has an odd number of 1's
100000000000000000001
has an even number of 1's
101010101010101010101
has an odd number of 1's
110000000000000000011
has an even number of 1's
110100000000000000000
has an odd number of 1's
11
has an even number of 1's
 90% of every Perl application is already written. ⇒ dragonchild
Re: Number of bits, length of RegExp
by Improv (Pilgrim) on Apr 21, 2003 at 14:36 UTC
By short, are you really meaning fast, or do you just mean fewest characters? If you want speed, try using non-capturing parenthesis (?:expression) -- it'll save perl some effort and make your search faster. For an even number of ones, I'd actually go with
```\$foo =~ tr/0//d;
length(\$foo);
and handle an odd number similarly. It's not a regex, but it should do the job, and it's very short. If you really want to do a regex, try
```/^(?:0*10*1)*/  # Matches evens
/^0*1(?:0*10*1)*/ # Odds
As you can see, I think your one to match evens is good.

I wouldn't have let tr delete anything - just count the number of ones and then test that number for oddness. So I'd say this as \$is_odd = \$foo =~ tr/1// & 1; \$is_even = \$foo =~ tr/1// ^ 1;. This only tests bit zero's value and in theory is speedy. I dunno if it actually is.

Out of academic curiosity, does anyone know if scalar construction is likely to be more or less expensive than scalar modification? As a matter of fact, it would be really interesting to see performance analyses of many common perl operations.. Maybe they're already out there.. Any pointers, anyone?
Re: Number of bits, length of RegExp (not a regex)
by Aristotle (Chancellor) on Apr 21, 2003 at 17:18 UTC
```#!/usr/bin/perl -w
use strict;

sub is_odd { unpack "%1b*", pack "b*", shift }

for(1..10) {
print is_odd("1" x \$_), "\n";
}
__END__
1
0
1
0
1
0
1
0
1
0

Makeshifts last the longest.

Re: Number of bits, length of RegExp
by Thelonius (Priest) on Apr 21, 2003 at 14:51 UTC
The second one could be shorter:
``` ^(0*10*10*)*10*\$
Re: Number of bits, length of RegExp
by demerphq (Chancellor) on Apr 21, 2003 at 16:22 UTC
```^0*(?:10*10*)*\$
^0*(?:10*10*)*10*\$

Hopefully this is an idle curiosity question. Regexes are not the best way to do this at all.

---
demerphq

<Elian> And I do take a kind of perverse pleasure in having an OO assembly language...
Re: Number of bits, length of RegExp
by YAFZ (Pilgrim) on Apr 21, 2003 at 18:33 UTC
I want to thank to Improv, diotalevi, pfaut, Thelonious, demerphq and Aristotle for their invaluable comments and insights.

I appreciate all of them :)

Create A New User
Node Status?
node history
Node Type: perlquestion [id://252005]
Approved by pfaut
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others contemplating the Monastery: (3)
As of 2021-04-16 18:26 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?

No recent polls found

Notices?