
Re: Apocalypse 5 and regexes

by erikharrison (Deacon)
on Jun 05, 2002 at 19:56 UTC (#171975=note)

in reply to Apocalypse 5 and regexes

Alright, I've been thinking hard and here is my praise.

Perl 5 cannot parse with just its regexes. Really what we do is tokenize. For simple formats the two are very nearly synonymous, but Perl often needs to do so much more. All this power was TOO concise and just about never self-documenting. It's easy to write buggy regexes because the syntax itself provides nothing to allow you to build regexes out of little parts. And finally, regexen are a cargo cultish magic. Those who can master them don't have an easy way of providing handy little tools to their lessers, and instead we have broken code descending from broken code - Matt Wright and CGI parsing as an example.

Potentially, Apoc 5 solves all of this. I love Apoc 5. My analysis follows.

Okay, a lot of this is a nice series of syntactic changes. Moving trailing modifiers to the front prevents action-at-a-distance problems and increases clarity without changing much. /x being on at all times is a lovely cultural change, but not earth shattering. Named captures increase clarity, and making them aliases to the numbered variables is the best way to handle them. Regexes being first class objects is really useful, but is there only if you need it. Embedded code is handled in a lovely way, and making the closures into anonymous methods means that parser writers have clean access to the power of regex objects while keeping the syntax clean. This also allows me to properly debug my regexes incrementally by good old print statement embedding. All of this is nice and well handled, but really not what I'm excited about.
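
For readers who have not seen named captures or an always-on /x, here is a rough analogue in Python rather than Perl 6 syntax. It is only a sketch: the DATE pattern and the sample string are made up for illustration, and Python's re.VERBOSE flag stands in for what Apocalypse 5 proposes as the default.

```python
import re

# Verbose mode allows whitespace and comments inside the pattern,
# roughly what Apocalypse 5 proposes to make the default behaviour.
DATE = re.compile(r"""
    (?P<year>  \d{4} ) -   # named capture, also reachable as group 1
    (?P<month> \d{2} ) -   # named capture, also reachable as group 2
    (?P<day>   \d{2} )     # named capture, also reachable as group 3
""", re.VERBOSE)

m = DATE.fullmatch("2002-06-05")

# The name and the number refer to the same captured text.
year_by_name = m.group("year")
year_by_number = m.group(1)
print(year_by_name, year_by_number)
```

The name documents intent at the point of use, while the numbered form keeps working for code that already relies on it.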

It's all about grammars, baby. The rule declarations and angle brackets solve a problem so deep that it's hard to even see that it's there. First, they give true parsing power to the regex engine. Second, they allow me to build up regexes incrementally from component rules in a way that is both easier and self-documenting. And the syntax looks like the BNF grammars that parser writers are already used to.
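
To make "building up regexes from component rules" concrete, here is a minimal sketch in Python, not Perl 6. The rule names (ident, number, value, assignment) are hypothetical, and plain string interpolation stands in for the angle-bracket rule references.

```python
import re

# Hypothetical component "rules": each is a small, named pattern fragment.
RULES = {
    "ident": r"[A-Za-z_]\w*",
    "number": r"\d+",
}

# Larger rules are assembled from smaller ones by name.
RULES["value"] = r"(?:{ident}|{number})".format(**RULES)
RULES["assignment"] = (
    r"(?P<name>{ident})\s*=\s*(?P<value>{value})".format(**RULES)
)

ASSIGNMENT = re.compile(RULES["assignment"])

m = ASSIGNMENT.fullmatch("answer = 42")
print(m.group("name"), m.group("value"))
```

This only gets part of the way there. Interpolation happens once, when the pattern string is built, so a rule cannot refer to itself, and Python's standard re module has no recursion. As I read Apocalypse 5, rules can call each other recursively, and that is what turns tokenizing into real parsing.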

My favorite aspect of all this is that we can standardize rules into modules and pass these in clear ways to the Perl community. This provides a great deal of clarity and allows regex masters to share their skill with others while preventing cargo cult practices. And grammars now allow for the best of object oriented design, not only by opaque "rules", but also by grammar inheritance. Simply adding or modifying rules can make extending already existing regex based tools much easier. People who have to maintain code are gonna love this.
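
Grammar inheritance can be approximated with ordinary classes. The sketch below is Python, not Perl 6, and both class names and every rule in them are invented for illustration: each class attribute plays the part of a rule, and a subclass overrides only the rules it wants to change.

```python
import re


class KeyValueGrammar:
    """Hypothetical grammar: each class attribute is a named rule."""

    key = r"[A-Za-z_]\w*"
    value = r"\d+"
    separator = r"="

    @classmethod
    def pair(cls):
        # The top-level rule is assembled from whatever rules cls defines,
        # so subclasses change behaviour just by overriding attributes.
        return re.compile(
            rf"(?P<key>{cls.key})\s*{cls.separator}\s*(?P<value>{cls.value})"
        )

    @classmethod
    def parse(cls, text):
        m = cls.pair().fullmatch(text)
        return (m.group("key"), m.group("value")) if m else None


class ColonQuotedGrammar(KeyValueGrammar):
    # Override two rules; the key rule and the parsing logic are inherited.
    separator = r":"
    value = r'"[^"]*"'


print(KeyValueGrammar.parse("port = 8080"))
print(ColonQuotedGrammar.parse('name: "perl"'))
```

The maintainer of ColonQuotedGrammar never touches the top-level pattern, which is the maintenance win I'm talking about: extend a tool by changing a rule or two instead of rewriting someone else's line noise.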

All in all, I am very fond of the new system. I'm sure that there are nits to be picked and bugs for the community to hunt down (much like how currying syntax changed post Exegesis because of some really smart people on the Perl 6 language list) - and I hope that they do. But the ideas are incredible and in general the execution is brilliant.


Light a man a fire, he's warm for a day. Catch a man on fire, and he's warm for the rest of his life. - Terry Pratchett
