AutoMagic HTML

Cody Pendant has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: AutoMagic HTML by BrowserUk (Patriarch) on Jun 24, 2003 at 04:09 UTC
`I _always_ favoured /markup/ done like this.`

becoming

I always favoured markup done like this.

`Paragraphs are simply blocks of text with no intervening blank lines.
Paragraphs are simply blocks of text with no intervening blank lines.
Paragraphs are simply blocks of text with no intervening blank lines.
Paragraphs are simply blocks of text with no intervening blank lines.

This is the start of a new para.`

becoming

Paragraphs are simply blocks of text with no intervening blank lines. Paragraphs are simply blocks of text with no intervening blank lines. Paragraphs are simply blocks of text with no intervening blank lines. Paragraphs are simply blocks of text with no intervening blank lines.

This is the start of a new para.

`1 This is an H1 header

Header lines are single lines starting with a numeric with blank lines above and below.

-list item 1
-list item 2
--nested list item 1
--nested list item 2
This is a para subordinate to the second item in the nested list.
and another.
--nested list item 3
The nested list ends with the -- line below.
--
-list item 3
-
=================
Any line consisting of say half a dozen or more =s becomes an HR.

This is the final paragraph in this example.`

becoming

This is an H1 header

Header lines are single lines starting with a numeric with blank lines above and below.

  list item 1
  list item 2
    nested list item 1
    nested list item 2
      This is a para subordinate to the second item in the nested list.
      And another.
    nested list item 3
      The nested list ends with the -- line below.
  list item 3

Any line consisting of say half a dozen or more =s becomes an HR.

This is the final paragraph in this example.

This seems easy and intuitive to type, is relatively easy for the human eye to parse so that the intent is visible in its raw form, and uses rules simple enough to make the conversion fairly easy to perform.

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
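A minimal sketch of the block-level part of the scheme described above, covering only headers, rules and paragraphs (lists are left out). The name `blocks_to_html` is hypothetical, the limit to levels 1 to 6 is an assumption, and the sketch assumes that every block, including the line of `=` characters, is set off by blank lines. It does no HTML escaping.

```perl
use strict;
use warnings;

# Convert blank-line-separated blocks to HTML (hypothetical helper).
sub blocks_to_html {
    my ($text) = @_;
    my @out;
    for my $block (split /\n\s*\n/, $text) {
        $block =~ s/^\s+|\s+$//g;              # trim surrounding whitespace
        next unless length $block;
        if ($block =~ /\A([1-6])\s+(.+)\z/) {
            # A single line starting with a digit: "." never matches a
            # newline, so multi-line blocks cannot match here.
            push @out, "<h$1>$2</h$1>";
        }
        elsif ($block =~ /\A={6,}\z/) {
            push @out, "<hr>";                 # six or more '=' characters
        }
        else {
            $block =~ s/\n/ /g;                # join the lines of a paragraph
            push @out, "<p>$block</p>";
        }
    }
    return join "\n", @out;
}

print blocks_to_html("1 Title\n\nline one\nline two\n\n======\n\nlast"), "\n";
```

Splitting on blank lines first keeps each rule a simple anchored match on one block, which may be one reason the scheme feels easy to convert.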
Re: AutoMagic HTML by hossman (Prior) on Jun 24, 2003 at 03:10 UTC
You might want to look into HTML::FromText ... in particular, take a look at the "See Also" section.
Re: AutoMagic HTML by jmcnamara (Monsignor) on Jun 24, 2003 at 08:22 UTC
This idea seems similar to the scheme used by various Wikis, where simple text markup is converted to HTML. See for example C2 Wiki formatting, Usemod formatting or the Kwiki formatting rules.

The above examples are written in Perl and the source code is readily available should you wish to use their formatting.

--
John.
Re: Re: AutoMagic HTML by Cody Pendant (Prior) on Jun 25, 2003 at 02:04 UTC
Thanks jmcnamara, that was very useful. The Wiki people, as I found out here, start with the text as one long string:

"We have the text of a page in one big string which we split into lines to be processed individually. This colors our TextFormattingRules, especially those dealing with bullet lists. Since we've now forced authors to be newline conscious, we give them the opportunity to escape newlines with a back-slash (\) which we substitute with a blank."

`sub PrintBodyText {
    s/\\\n/ /g;
    foreach (split(/\n/, $_)){`

(there follows a long long list of regexes)

That is kind of an interesting answer to my question, "should I work on an array or a block of text?", and one I hadn't considered. It's kind of a "both".

“Every bit of code is either naturally related to the problem at hand, or else it's an accidental side effect of the fact that you happened to solve the problem using a digital computer.” M-J D
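The "both" approach can be sketched roughly as follows: one substitution on the whole string, then a split into lines for the per-line rules. This is only an illustration, not the Wiki's actual code; `format_page` and the single bullet rule are hypothetical stand-ins for the long list of regexes.

```perl
use strict;
use warnings;

# Hypothetical sketch: block-level pass first, line-level pass second.
sub format_page {
    my ($page) = @_;
    $page =~ s/\\\n/ /g;                  # backslash-newline becomes a blank
    my @html;
    for my $line (split /\n/, $page) {
        if ($line =~ /^\*\s+(.*)/) {      # stand-in for the per-line rules
            push @html, "<li>$1</li>";
        }
        else {
            push @html, $line;
        }
    }
    return join "\n", @html;
}

print format_page("* one\\\ntwo\nplain"), "\n";
```

Because the escaped newlines are removed before the split, the per-line rules never see a half-finished line.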
Re: AutoMagic HTML by erikharrison (Deacon) on Jun 24, 2003 at 06:36 UTC
Well, you've got a simple markup. So what you're really doing is hacking up a parser for it. And we all know what happens to hacks . . .

So, in the interest of elegance, ease, and extensibility, I'd use an existing parser generator, and keep the language definition around to extend. I'd use Parse::RecDescent myself, but use whatever you like of course.

Cheers,
Erik

Light a man a fire, he's warm for a day. Catch a man on fire, and he's warm for the rest of his life. - Terry Pratchett
Re: AutoMagic HTML by bart (Canon) on Jun 24, 2003 at 13:17 UTC
> the line which starts with an i-space gets rendered as italics.

You don't ever expect normal text to start with "i"+space?

i like tea.

I'd take something more exceptional for the markup, or provide a way to escape it.
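One possible shape for such an escape, as a sketch only: a leading backslash (an assumed convention, not something the thread specifies) switches the markup off for that line. `render_line` is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical sketch: "\i like tea." stays plain, "i like tea." is italic.
sub render_line {
    my ($line) = @_;
    # Escaped marker: drop the backslash and leave the rest untouched.
    return $line if $line =~ s/^\\(?=[ib]\s)//;
    # Unescaped marker: wrap the rest of the line in <i> or <b>.
    $line =~ s{^([ib])\s+(.*)}{<$1>$2</$1>};
    return $line;
}

print render_line($_), "\n" for 'i like tea.', '\i like tea.';
```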
Re: Re: AutoMagic HTML by Cody Pendant (Prior) on Jun 25, 2003 at 01:44 UTC
If e.e. cummings ever uses my software, that could be an issue. Otherwise I won't sweat it.
Re: AutoMagic HTML by Cody Pendant (Prior) on Jun 24, 2003 at 04:33 UTC
Well thanks, but those posts aren't really what I was hoping for -- perhaps I should abstract the problem a bit more to make it more Monk-y?

I have some replacements to perform on a text file which are best done as line-by-line processing on the file.

I have some more which are best done as applying patterns to a block of text which happens to contain newlines.

I could slurp it into an array, work on the array to do the first kind, then join it and do the others, or start by slurping it into one big string and do everything with multi-line regexes. Which is best?
Re: Re: AutoMagic HTML by Skeeve (Parson) on Jun 24, 2003 at 06:08 UTC
> which is best?

Best is what YOU think is best. TMTOWTDI, and no one can tell you what the best way is if you don't define "best" ;-)

Is best the code that is

* fastest
* shortest
* least memory consuming
* clearest to understand
* most obfuscated
* ...

Having said that, I would do it on a line-by-line basis, something like this:

`while (<>) {
    # replace italics and bold; (.*) takes the rest of the line
    s{^([ib])\s+(.*)}[<$1>$2</$1>];
    # find ordered lists, delimited by lines holding only "[" and "]"
    if (my $hit = /^\[\s*$/ .. /^\]\s*$/) {
        if ($hit == 1)     { print "<ol>\n";  next; }   # opening "[" line
        if ($hit =~ /e0/i) { print "</ol>\n"; next; }   # closing "]" line
        s[^][<li>];
        s[$][</li>];
    }
    print;   # emit the (possibly modified) line
}`

This won't work with cascaded, ordered lists, but it's a starting point.
Re: Re: Re: AutoMagic HTML by Cody Pendant (Prior) on Jun 24, 2003 at 07:02 UTC
`* fastest
* shortest
* least memory consuming`

Sorry, good point, I meant "fastest", as this rendering is supposed to happen on the fly as HTML is output to the browser.
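If "fastest" is the criterion, the two approaches can simply be timed against each other. The sketch below uses the core Benchmark module; the sample text, the single italics/bold rule and the names `by_line` and `by_block` are assumptions made for illustration, and the numbers will depend on the real rules and input.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up sample input: 1000 marked lines interleaved with 1000 plain ones.
my $text = join '', map { "i line $_\nplain line $_\n" } 1 .. 1000;

# Split into lines, apply the rule to each line, join again.
sub by_line {
    my ($in) = @_;
    my @lines = split /^/, $in;           # split /^/ keeps the newlines
    s{^([ib])[ \t]+(.*)}{<$1>$2</$1>} for @lines;
    return join '', @lines;
}

# Apply the same rule to the whole string with /m and /g.
sub by_block {
    my ($in) = @_;
    $in =~ s{^([ib])[ \t]+(.*)}{<$1>$2</$1>}mg;
    return $in;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    by_line  => sub { by_line($text) },
    by_block => sub { by_block($text) },
});
```

The rule uses `[ \t]+` rather than `\s+` so that, under `/m`, a match cannot run across a newline and both variants give the same output.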
Re: Re: Re: Re: AutoMagic HTML by sgifford (Prior) on Jun 24, 2003 at 07:07 UTC

