Re: Site HTML filtering, Phase II

by Abigail-II (Bishop) on Feb 11, 2004 at 14:54 UTC

The only hard part is remembering which of the less common but harmless and useful HTML tags don't work.

<code>

<a>

That, and remembering the entity for escaping the left square bracket. (I usually just put code tags around it. Easier to remember.)

common

C<

[

>

function()

$variable

If you want to see some needlessly complicated and gratuitously different site markup, have a look at Wikipedia sometime.

[[link]]

[link]

[..]

common

[[..]]

''foo''

*bar*

Abigail


by theorbtwo (Prior) on Feb 11, 2004 at 16:05 UTC

There is only one "tag" that behaves differently on PM vs elsewhere, and that is <code>. <readmore> is an additional pseudo-tag, but has no meaning in normal HTML. (In fact, we make use of the fact that it is meaningless in normal HTML.) I have no idea what behavior you're seeing with <a>; if you give me further information, I can attempt to explain. Perhaps you're trying to put it in a place where all HTML is escaped.

As to <code>[</code> being difficult to type, you're correct, it is. However, it's rare to mention the [ character all by its lonesome. When you do, &#91; is not difficult to type, or to remember. Code tags are useful semantic information, and allow for better visual cues. Please, don't abuse them for formatting.

Allowing input-as-POD, or another semi-plaintext format is easy to do wrong and difficult to do right. So far, we've done pretty well, I think, at not doing things wrong.

Having a tag like code that says "things inside this tag are PODish" is an interesting idea, and I may get around to taking a look into it at some point, but many, many, many things are higher up on my todo list.

Warning: Unless otherwise stated, code is untested. Do not use without understanding. Code is posted in the hopes it is useful, but without warranty. All copyrights are relinquished into the public domain unless otherwise stated. I am not an angel. I am capable of error, and err on a fairly regular basis. If I made a mistake, please let me know (such as by replying to this node).

by Abigail-II (Bishop) on Feb 11, 2004 at 16:17 UTC

2Re: Site HTML filtering, Phase II

by theorbtwo (Prior) on Feb 11, 2004 at 16:42 UTC

by jeffa (Bishop) on Feb 12, 2004 at 02:09 UTC

by jonadab (Parson) on Feb 12, 2004 at 13:12 UTC

2Re: Site HTML filtering, Phase II

by theorbtwo (Prior) on Feb 12, 2004 at 13:32 UTC


by jeffa (Bishop) on Feb 12, 2004 at 01:59 UTC

I have been quiet on this matter, but I have to pipe in and say that replacing [ .. ] with [[ .. ]] is a SPLENDID idea, and you hit the nail on the head as to why it is a better fit for this site. (Typing &#91; is a royal PITA!)

There seem to be two major problems (barring having all pages be W3C compliant (X)HTML):

newcomers not knowing how to format code sections
folks using "unescaped" [ .. ] sequences, inadvertently producing potential Google hits to their array indices.

$this->[$example]
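
The second problem can be illustrated with a small filter. What follows is only a minimal Python sketch, not anything the site runs: it assumes the text contains no intentional bracket links, and the names `CODE_SECTION` and `escape_left_brackets` are made up for this example. It replaces every left square bracket outside <code> sections with the entity &#91; and leaves the code sections alone.

```python
import re

# Matches a whole <code>...</code> section so it can be passed through untouched.
CODE_SECTION = re.compile(r"(<code>.*?</code>)", re.DOTALL | re.IGNORECASE)


def escape_left_brackets(text: str) -> str:
    """Replace every '[' outside <code> sections with the entity &#91;."""
    parts = CODE_SECTION.split(text)
    # With one capturing group, re.split puts the code sections at odd indices.
    return "".join(
        part if index % 2 else part.replace("[", "&#91;")
        for index, part in enumerate(parts)
    )


if __name__ == "__main__":
    print(escape_left_brackets("an array index such as $this->[$example]"))
```

A real filter would also have to tell intended links apart from array indices, which is exactly the part a program cannot guess.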

As for using POD ... newbies have a hard enough time with programming, let alone Perl. Offer them POD and watch them run screaming ... maybe not a bad idea after all ...

My stock answer for posting with POD is pod2html | perl -pe 'custom filters here' | tidy which has served me quite well for several of my larger, premeditated posts.

jeffa

L-LL-L--L-LL-L--L-LL-L--
-R--R-RR-R--R-RR-R--R-RR
B--B--B--B--B--B--B--B--
H---H---H---H---H---H---
(the triplet paradiddle with high-hat)


Re^3: Site HTML filtering, Phase II (backcompat)

by tye (Sage) on Feb 12, 2004 at 06:41 UTC

Re^4: Site HTML filtering, Phase II (way fwd)

by tye (Sage) on Feb 12, 2004 at 08:45 UTC


Re: Re^3: Site HTML filtering, Phase II (backcompat)

by ysth (Canon) on Feb 12, 2004 at 08:27 UTC

by jonadab (Parson) on Feb 12, 2004 at 12:38 UTC

The hard part is finding out which elements are named the same in both HTML and Perlmonks, but act differently. <code> for instance means something else in HTML than in Perlmonks.

code tags are something I use often enough that they're not hard to remember.

But I still haven't figured out how the <a> element is working on Perlmonks.

Hmmm. I haven't run into that one. As near as I can tell, it works like in regular HTML. Must be I just haven't tried the right (or wrong) thing yet.

Easier to remember, but not easier to type.

Agreed, I find having to escape the left square bracket annoying (I did say it was one of my two pet annoyances on pm, didn't I?), and doing the editing in a browser textarea control instead of a real editor doesn't help this any. Sometimes I'm tempted to do a whole post in Emacs and copy-and-paste it over. Sometimes I do that. I suppose the bracket syntax for perlmonks was taken from E2 and/or Wiki, but I've always wondered why the same things couldn't be done with angle brackets...

How It Is	How It Could Have Been
`[jonadab]`	`<node jonadab>`
`[id://328276]`	`<id 328276>`
`[cpan://Net::Server::POP3]`	`<cpan Net::Server::POP3>`
`[Newest Nodes]`	`<node Newest Nodes>`
`[weird syntax >= escaping]`	`<node "weird syntax >= escaping">`

However, retrofitting those changes now would be quite painful, as all existing nodes would be impacted (and that's ignoring developing and testing the code for the changes).
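
To get a feel for what such a retrofit would involve, here is a rough Python sketch that rewrites the five bracket forms from the table into the angle-bracket forms next to them. It is not site code. The names `BRACKET_LINK`, `SCHEME`, and `convert_links`, and the rule of quoting any node title that contains an angle bracket, are assumptions read off the table. The sketch ignores <code> sections and any link form the table does not list.

```python
import re

# A bracket link: '[', then anything without brackets, then ']'.
BRACKET_LINK = re.compile(r"\[([^\[\]]+)\]")
# A 'scheme://target' link body such as id://328276 or cpan://Net::Server::POP3.
SCHEME = re.compile(r"^(\w+)://(.+)$")


def _to_angle(match: re.Match) -> str:
    """Turn one bracket link into the hypothetical angle-bracket form."""
    target = match.group(1)
    scheme = SCHEME.match(target)
    if scheme:
        return f"<{scheme.group(1)} {scheme.group(2)}>"
    # Titles containing angle brackets are quoted, as in the last table row.
    if "<" in target or ">" in target:
        target = f'"{target}"'
    return f"<node {target}>"


def convert_links(text: str) -> str:
    """Rewrite every bracket link in text; everything else is left alone."""
    return BRACKET_LINK.sub(_to_angle, text)


if __name__ == "__main__":
    for sample in ("[jonadab]", "[id://328276]", "[weird syntax >= escaping]"):
        print(sample, "->", convert_links(sample))
```

Even this toy version would happily rewrite an array index like $x[0] in running text, which hints at how much care a real conversion of existing nodes would need.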

At least in POD

Please, no POD. I do *not* want to try to deal with significant whitespace in a feature-impoverished browser textarea, and if you think getting newbies to use code tags and whatnot is hard with an HTML-like markup, just you think about trying to convince newbies who want help with PERL that they should post their question with POD markup. Gah. Gives me the heebie-jeebies just thinking about it.

$;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}}
split//,".rekcah lreP rehtona tsuJ";$\=$ ;->();print$/

Re^2: Site HTML filtering, Phase II (</hr>)

by Abigail-II (Bishop) on Feb 12, 2004 at 14:06 UTC

by tye (Sage) on Feb 11, 2004 at 20:04 UTC

The XML-style closing / gets stripped out too

What? Yes, </hr> gets stripped now and didn't used to. But for some time now, <hr> has been changed to the XMLish <hr />.

Oh, I see. There is a bug in that <hr /> can *report* (if you have error reporting set high enough) that the / was stripped when in fact it wasn't. I'll fix that soon.

Thanks.

- tye

by theorbtwo (Prior) on Feb 11, 2004 at 16:52 UTC

If you give me a list of tags, and where you think they should be allowed, I'll look at them. Can't promise more, I'm rather busy at present.

by jonadab (Parson) on Feb 12, 2004 at 17:02 UTC

If you give me a list of tags, and where you think they should be allowed, I'll look at them. Can't promise more, I'm rather busy at present.

Please don't feel like there's any urgency here. I didn't mean to be complaining. These are actually quite small annoyances. Still there are some elements that I do occasionally miss being able to use...

<abbr> and/or <acronym title="Foreign Optometrists' Organization">FOO</acronym>
<cite>Citation</cite>
<q>Short Quotation</q>. Maybe I'm being silly with this one, since we can still use traditional "quote marks".
<del>deleted text</del>. This is semantically pretty much the same as <strike>strikethrough</strike>, except that <strike> is deprecated and <del> isn't. Again, maybe I'm being silly with this one. (Sometimes it's hard for me to tell when I'm being silly or not about things like this.) If <del> were supported, it would make sense to also support <ins>inserted text</ins>, as they seem to go together.
It's tempting in some ways to add <style> to the list, but I can think of N ways in which it could be abused, so it's probably best left out.

<cite> happens to be the one I've used most often, forgetting that it wasn't permitted, though <abbr> when I do miss it is somewhat more bothersome.

As far as where they should be allowed, I'm not sure I understand the inner workings of the site well enough to say, other than that it's usually in an ordinary node body (such as either a root node or reply in SOPW, obfuscation, Meditations, ... you know, a regular node). I don't recall ever missing the ability to use any of these tags in a node title. Hmmm... in the chatterbox maybe though.

;$;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}} split//,".rekcah lreP rehtona tsuJ";$\=$;[-1]->();print

Re^2: Site HTML filtering, Phase II (change?)
by tye (Sage) on Feb 12, 2004 at 07:33 UTC

BTW, the node you are replying to isn't discussing any changes to how you mark up nodes at PerlMonks. Feel free to ignore it.

The previous related node involved a fairly minor change: Instead of just expecting contributors to get their HTML elements properly nested, we are now checking for it and trying to fix any errors we find (trying to balance DWIM with code complexity/performance). We wouldn't be doing this except such errors can and do impact the contributions of others.
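
For readers who want a concrete picture of "checking for it and trying to fix any errors", here is a minimal Python sketch of one common way to do it, a stack of open elements. It is not the PerlMonks implementation, which is not shown in this thread. The names `TAG`, `VOID`, and `balance_tags` are invented for the example, and the repair policy (close anything left open, drop stray closing tags) is an assumption.

```python
import re

# Any opening or closing tag; attributes are skipped over, not parsed.
TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>")
# Elements that never take a closing tag (a deliberately short list).
VOID = {"br", "hr"}


def balance_tags(html: str) -> str:
    """Return html with improperly nested or unclosed elements repaired."""
    out = []      # pieces of the repaired document
    stack = []    # names of the elements that are currently open
    pos = 0
    for match in TAG.finditer(html):
        out.append(html[pos:match.start()])   # text before this tag
        pos = match.end()
        closing, name = match.group(1), match.group(2).lower()
        if name in VOID:
            if not closing:                   # keep <hr>, drop a stray </hr>
                out.append(match.group(0))
        elif not closing:
            stack.append(name)
            out.append(match.group(0))
        elif name in stack:
            # Close every element opened after `name`, then `name` itself.
            while True:
                top = stack.pop()
                out.append(f"</{top}>")
                if top == name:
                    break
        # A closing tag with no matching open element is silently dropped.
    out.append(html[pos:])
    while stack:                              # close whatever is still open
        out.append(f"</{stack.pop()}>")
    return "".join(out)


if __name__ == "__main__":
    print(balance_tags("<b><i>text</b> more"))
```

The trade-off mentioned above shows up even here: each extra repair rule that guesses what the author meant adds branches to the loop.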

This node is discussing (in quite a bit of detail) how much feedback you can choose to see from this process. If you find it too complicated for you to understand (or it just taxes your patience), then you should probably stop reading after the short summary (or just ignore it completely and keep the default settings or even just try different settings when you get bored).

Implicit <code> tags would make for a rather ugly presentation (and a much less flexible one). I and others discuss POD elsewhere. With LaTeX, would we deliver the results as PDF or just big PNGs? (Sorry, I haven't used LaTeX in many years so I don't know how nice any LaTeX-to-HTML engines are -- but I suspect they'd take a lot more load than the current PerlMonks HTML production process.) Plain HTML would make posting Perl code difficult without using a program to help produce the HTML.

I didn't have anything to do with the development of the "near-HTML-subset plus square bracket" syntax. I don't find it particularly hard to understand (and this was back when the documentation was much worse). And I appreciate the short cuts it provides (and realize it isn't a perfect choice for Perl, a language that makes fairly heavy use of nearly every printable ASCII character).

If you simply want text, then the requirements are very simple:

Put <p> where you want a blank line.
Put <code> tags around any code (or other uses of &, <, >, [, and ] or text you need displayed in a fixed-width font, such as ASCII drawings). Try not to use this when you don't need it.
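
Those two rules are mechanical enough to sketch in a few lines of Python. This is only an illustration under assumptions that are not part of the site: that a blank line separates blocks, and that a block whose lines all start with four spaces is code. The function name `plain_to_markup` is hypothetical.

```python
import re


def plain_to_markup(text: str) -> str:
    """Apply the two rules: <p> between blocks, <code> around code blocks."""
    blocks = re.split(r"\n\s*\n", text.strip("\n"))
    rendered = []
    for block in blocks:
        lines = block.split("\n")
        # Assumption: a block indented by four spaces throughout is code.
        if all(line.startswith("    ") for line in lines):
            body = "\n".join(line[4:] for line in lines)
            rendered.append(f"<code>\n{body}\n</code>")
        else:
            rendered.append(block)
    return "\n<p>\n".join(rendered)


if __name__ == "__main__":
    print(plain_to_markup("Hello\n\n    print $x[0];\n\nBye"))
```

Characters such as & or [ in the ordinary paragraphs are not escaped by this sketch, so it covers only the simplest case.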

You later complain about producing links. Plain text doesn't have links, so you need to decide whether you want plain text or not. If you want links, then please stop asking why you can't have plain text. (:

- tye