substitution on

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

How do I convert every '<' to '&lt;' when it has no matching '>'? For example, in the text

Read them in one <place with Google Reader,<a href="www.google.com"> new<

How do I get the following output?

Read them in one &lt;place with Google Reader,<a href="www.google.com"> new&lt;

while(<DATA>){
s/</&lt;/g;
}

__DATA__
Read them in one <place with Google Reader,<a href="www.google.com"> new<

Code tags added by GrandFather


Re: substitution on
by ikegami (Patriarch) on Aug 24, 2009 at 17:01 UTC

Your question is next to impossible to understand because we can't tell when you meant "<" and when you meant "&lt;". Please repost your question (using Preview until it's readable).

	To get <	To get &lt;
Outside of code tags	`&lt;`	`&amp;lt;`
Inside of code tags	`<`	`&lt;`


Re: substitution on
by Taulmarill (Deacon) on Aug 24, 2009 at 17:09 UTC

s/<(?![^<]*>)/&lt;/g;


Re: substitution on
by jethro (Monsignor) on Aug 24, 2009 at 17:15 UTC

< and > are special characters in HTML. Your text is nearly unreadable; please edit your node and use &lt; or code tags to print those characters.

To answer your question, you might use something like this:

s/<(?=[^>]*(<|$))/&lt;/g;

This substitutes every < that is followed not by a > but by another < or the end of the string.

Naturally, this won't work if a stray < can occur inside a valid < ... > construct. It also won't work if a > can be on a later line than its opening <. In that case you could slurp the whole file into one string.
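
The slurping idea might look something like the sketch below. The helper name escape_unclosed_lt and the two-line sample string are made up for illustration; the substitution itself is the one shown above, unchanged. In a real script the string would come from the file, e.g. with local $/ set to undef before reading.

```perl
use strict;
use warnings;

# Hypothetical helper: escape every '<' that reaches another '<' or the
# end of the string before it reaches a '>'.
sub escape_unclosed_lt {
    my ($text) = @_;
    $text =~ s/<(?=[^>]*(<|$))/&lt;/g;
    return $text;
}

# Stand-in for a slurped file: the <a ...> tag is split across two lines.
# Reading a real file in one go could be done with
#   my $text = do { local $/; <$fh> };
my $text = qq{Read them in one <place with Google Reader,<a\nhref="www.google.com"> new<\n};

print escape_unclosed_lt($text);
```

Because [^>] also matches a newline, the tag that spans two lines is left alone once the whole text is in a single string, while the two stray < characters are escaped.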


Re: substitution on
by Sewi (Friar) on Aug 24, 2009 at 17:57 UTC

while (s/((^|\>)[^\<]*?)\>/$1\&gt\;/g) { 1; }

The only 100% reliable solution is to use an XML/HTML parser. The sample above will mangle HTML such as <a name=">here">

edit: Sorry, I just saw that my solution works on the wrong side (it escapes > instead of <).

edit2: Here is the correct solution:

while (s/\<([^\>]*?(\<|$))/\&lt\;$1/g) { 1; }

^\>

<<html><x<y</html><

The same caveat (use an XML/HTML parser for best results) also applies to this one.
Remember to use /s if you're processing a multiline string.
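
As a rough check, the corrected loop can be run against the test string above. The wrapper name escape_with_loop is hypothetical, and the backslashes before <, > and ; are dropped here because they are not needed in Perl; the pattern is otherwise the same.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the looped substitution from edit2.
sub escape_with_loop {
    my ($text) = @_;
    # Each match also consumes the '<' that follows it, so a single /g
    # pass can skip that one; repeat until a pass replaces nothing.
    1 while $text =~ s/<([^>]*?(<|$))/&lt;$1/g;
    return $text;
}

my $sample = '<<html><x<y</html><';
print escape_with_loop($sample), "\n";
# prints: &lt;<html>&lt;x&lt;y</html>&lt;
```

The first pass leaves the < before y alone because it was consumed as the end of the previous match; the second pass picks it up, which is why the substitution sits in a while loop.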
