PerlMonks  

Re: Transforming strange format to XML

by chipmunk (Parson)
on Nov 17, 2001 at 23:31 UTC


in reply to Transforming strange format to XML

I don't recognize the format either, so I'll just provide a substitution to transform the cell tags in a line, according to your example.

    s,\[_CELL_\]\s*(.*?)\s*?(?=\[_CELL_\]|$),<CELL>$1</CELL>,g;

This matches from an opening [_CELL_] up to the next opening [_CELL_] or the end of the line, and sticks in the opening and closing <CELL> tags, swallowing any leading and trailing whitespace.

This substitution may need to be adjusted based on the details of the actual format.
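As a minimal sketch of how that substitution behaves, here it is wrapped in a small sub and run on a made-up line. The sub name cells_to_xml and the sample input are assumptions for illustration only; the real data may look different.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the substitution from the post to one line.
sub cells_to_xml {
    my ($line) = @_;
    # The non-greedy (.*?) stops as soon as the lookahead sees the next
    # [_CELL_] or the end of the line; \s* and \s*? absorb the whitespace
    # on either side of the cell text.
    $line =~ s,\[_CELL_\]\s*(.*?)\s*?(?=\[_CELL_\]|$),<CELL>$1</CELL>,g;
    return $line;
}

# Invented sample line, not taken from the original question.
my $sample = "[_CELL_] foo [_CELL_]  bar baz  ";
print cells_to_xml($sample), "\n";
# prints: <CELL>foo</CELL><CELL>bar baz</CELL>
```

Whitespace inside a cell ("bar baz") is kept; only the whitespace at the edges of each cell is dropped.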

Re: Re: Transforming strange format to XML
by Tortue (Scribe) on Nov 18, 2001 at 00:04 UTC
    I obviously hadn't quite mastered combining positive lookahead and non-greediness (selflessness?) yet. Until now my solution was to use two statements:
        s{\[_(CELL|COLHEAD)_\] (.*)}{<$1>$2</$1>}g;
        s{\s*\[_(CELL|COLHEAD)_\]\s*}{</$1><$1>}g;
    Here it's more complicated because there are several of these tags. It's lamer and slower than yours, so I'll gladly change it, thanks!

    Pauses to think for a while... Hm, with that extra twist I just introduced (not fair, I know), maybe the two-step version isn't slower. Lookahead on something it doesn't know yet could get tricky, maybe.

        s,\[_(CELL|COLHEAD)_\]\s*(.*?)\s*?(?=\[_\1_\]|$),<$1>$2</$1>,g;
    But it seems to work fine, and I don't think I care about speed anyway.
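    A quick way to check that the two approaches agree is to run both on the same line. This is only a sketch: the sub names one_step and two_step and the sample line are invented for illustration, and it assumes each line uses a single tag type.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the single substitution with the \1 lookahead.
sub one_step {
    my ($line) = @_;
    $line =~ s,\[_(CELL|COLHEAD)_\]\s*(.*?)\s*?(?=\[_\1_\]|$),<$1>$2</$1>,g;
    return $line;
}

# Hypothetical wrapper around the earlier two-statement version.
sub two_step {
    my ($line) = @_;
    # First wrap everything after the first tag in one element...
    $line =~ s{\[_(CELL|COLHEAD)_\] (.*)}{<$1>$2</$1>}g;
    # ...then turn each remaining tag into a close/open pair.
    $line =~ s{\s*\[_(CELL|COLHEAD)_\]\s*}{</$1><$1>}g;
    return $line;
}

# Invented sample line with one tag type.
my $line = "[_COLHEAD_] Name [_COLHEAD_] Price";
print one_step($line), "\n";
print two_step($line), "\n";
# both print: <COLHEAD>Name</COLHEAD><COLHEAD>Price</COLHEAD>
```

    Note that the \1 in the lookahead only stops at a repeat of the same tag, so a line mixing [_CELL_] and [_COLHEAD_] would need a different lookahead.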
