PerlMonks  

Using regexps to parse an XML DTD file

by Leitchn (Acolyte)
on Nov 26, 2000 at 06:00 UTC ( [id://43334]=perlquestion )

Leitchn has asked for the wisdom of the Perl Monks concerning the following question:

OK - I know there are various tools out there to do this, but I need to do it myself.
So here goes......

What I'm currently trying to do is get the children from this line in a DTD file:

<!ELEMENT ORDER (NAME,DELIVERY,PAYMENT,ORDERNUMBER)>


I'm currently drowning in regexps, and my current attempt is wildly off track - can anyone point out where I'm going wrong?

while ($dtdLine=~/(\((.*?)\,|\,(.*?))/sg){
    push(@children,$1);
    print "@children\n";
}


TIA

Nick.

Replies are listed 'Best First'.
Re: Using regexps to parse an XML DTD file
by the_slycer (Chaplain) on Nov 26, 2000 at 06:54 UTC
    Not what I would do ($1 is not necessarily a good thing ;-)), but here's some code that works. Note that it's not all a regex :-)
    $string = "<!ELEMENT ORDER (NAME,DELIVERY,PAYMENT,ORDERNUMBER)>";
    $string =~ m/\((.*)\)/;
    @array = split (/,/,$1);
    foreach (@array){print "$_\n"}
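The caveat about $1 in the reply above can be handled by capturing in list context and checking that the match succeeded before splitting. A minimal sketch, assuming the content model is a flat, comma-separated sequence (no nesting, no `|`, `*`, `?` or `+`); `element_children` is a hypothetical helper name, not something from the thread:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the child element
# names found in one <!ELEMENT ...> declaration, or an empty list.
sub element_children {
    my ($decl) = @_;

    # Capture the parenthesised content model. Assigning in list context
    # avoids reading a stale $1 when the match fails.
    my ($model) = $decl =~ /\(([^()]*)\)/
        or return;

    # Trim the ends, then split on commas, tolerating whitespace.
    $model =~ s/^\s+|\s+$//g;
    return grep { length } split /\s*,\s*/, $model;
}

my @children = element_children(
    '<!ELEMENT ORDER (NAME,DELIVERY,PAYMENT,ORDERNUMBER)>'
);
print "$_\n" for @children;
```

A declaration with no parenthesised group, such as one declared EMPTY, yields an empty list rather than leftovers from an earlier match.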
      Excellent, thanks - out of interest though, can it be done with a single regexp?

      Nick.
        A single regex cannot generate a variable number of backreferences. What do you think this is, Java?[1] {grin}

        -- Randal L. Schwartz, Perl hacker

        [1] The Java regex library I saw caused /((\w+)\s+)+/ to generate a variable number of backreferences, making it impossible to know exactly what part of the match you were looking at. Evil. They just don't get it.
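The number of capture groups in a pattern is fixed, but a global match in list context returns what those groups captured on every successive match, which may be close to what the question was after. A minimal sketch, assuming a flat comma-separated child list; the variable names are illustrative only:

```perl
use strict;
use warnings;

my $dtd_line = '<!ELEMENT ORDER (NAME,DELIVERY,PAYMENT,ORDERNUMBER)>';

# One capture group, applied repeatedly by /g: each name must directly
# follow "(" or "," and runs until the next comma, paren or whitespace.
my @children = $dtd_line =~ /[(,]\s*([^\s,()]+)/g;

print join(' ', @children), "\n";   # NAME DELIVERY PAYMENT ORDERNUMBER
```

This is still one group matched several times, not several groups from one match. It also does not verify that the commas sit inside the parentheses, so it is only suitable for simple declarations like the one in the question.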
Re: Using regexps to parse an XML DTD file
by Leitchn (Acolyte) on Nov 26, 2000 at 06:40 UTC
    I'm getting closer:

    $dtdLine=~m/\((.*)\,|\,(.*)\)|\,(.*)\,/sg


    Gives me all but the last one.
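A likely explanation, assuming the pattern is run in a while loop as in the original question: the first alternative's greedy `.*` runs from the opening parenthesis to the last comma, so the whole prefix is swallowed in one match and nothing is left for the other alternatives to find. The sketch below only demonstrates that behaviour:

```perl
use strict;
use warnings;

my $dtd_line = '<!ELEMENT ORDER (NAME,DELIVERY,PAYMENT,ORDERNUMBER)>';

my @matches;
while ($dtd_line =~ m/\((.*)\,|\,(.*)\)|\,(.*)\,/sg) {
    # Only the group belonging to the alternative that matched is defined.
    my ($hit) = grep { defined } ($1, $2, $3);
    push @matches, $hit;
}

# The loop body runs once: greedy .* backtracks only as far as the last comma.
print scalar(@matches), " match(es): @matches\n";
# 1 match(es): NAME,DELIVERY,PAYMENT
```

After that single match the search resumes at ORDERNUMBER)>, where there is no "(" or "," left, so the final name is never captured. Replacing `.*` with a negated class such as `[^,()]+` keeps each match inside one name.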
Re: Using regexps to parse an XML DTD file
by Leitchn (Acolyte) on Nov 26, 2000 at 06:15 UTC
    I didn't really forget the 'm' after the =~.

    Honest.

    Nick.

      If you hadn't said anything we wouldn't have noticed; the 'm' in m// is optional and has been for a while : )

      <myExperience> $mostLanguages = 'Designed for engineers by engineers.'; $perl = 'Designed for people who speak by a linguist.'; </myExperience>
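To illustrate the point about the optional m: with `/` as the delimiter the two forms behave identically, while any other delimiter requires the m. A small sketch using the declaration from the question; the variable names are illustrative only:

```perl
use strict;
use warnings;

my $dtd_line = '<!ELEMENT ORDER (NAME,DELIVERY,PAYMENT,ORDERNUMBER)>';

# With "/" as the delimiter, the leading "m" may be omitted.
my ($with_m)    = $dtd_line =~ m/\((.*)\)/;
my ($without_m) = $dtd_line =~  /\((.*)\)/;

print "same result\n" if $with_m eq $without_m;

# Other delimiters, such as braces, do need the explicit "m".
my ($braces) = $dtd_line =~ m{\((.*)\)};
print "$braces\n";
```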
