PerlMonks  

Ok here goes:

I'm in the process of writing a forum-type web application. I accept messages from forms, save them in a database, and then display them as web pages and/or email them to users.

The problem is that I have two types of users. The first type their messages into the text field and expect to see them displayed exactly as typed. The second want to pretty up their messages and so use HTML. Currently the HTML tags are limited to a small subset including b, i, p, br, a, ul and li.

At the moment, the text is passed through HTML::Scrubber to limit the HTML tags and attributes (if any) and then stored in the db. When displayed on a web page, the text is run through a simple regex which adds <p> and <br> tags in place of \n. The emailed messages are sent out as plain text, with no additional filtering.
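To make the display step concrete, here is a minimal sketch of that kind of newline filter in core Perl. The function name `newlines_to_html` and the exact rules (blank line means new paragraph, single newline means line break) are assumptions for illustration, not the actual regex used in the application.

```perl
use strict;
use warnings;

# Hypothetical helper: turn stored text into display HTML by treating
# blank lines as paragraph breaks and single newlines as line breaks.
sub newlines_to_html {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;               # normalise line endings
    $text =~ s/\A\n+//;                  # trim leading newlines
    $text =~ s/\n+\z//;                  # trim trailing newlines
    my @paras = split /\n{2,}/, $text;   # blank line = new paragraph
    s/\n/<br>\n/g for @paras;            # single newline = <br>
    return join "\n", map { "<p>$_</p>" } @paras;
}

print newlines_to_html("Hello\nworld\n\nSecond paragraph"), "\n";
```

Note that a filter like this knows nothing about markup already present in the text: input that already contains <p> or <br> tags gets wrapped again.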

There are at least two problems with this approach, however:

  • Users who supply HTML tags find that the regex conflicts with their tags, adding extra <p> and <br> tags everywhere.
  • Users getting the messages via email see a bunch of HTML tags in the messages if the original poster used HTML.

So my thought was to start storing the data as HTML. I was thinking of accepting messages in both HTML and plain text format (adding a checkbox below the form, or maybe searching for HTML tags and deciding). The plain text messages would be passed through HTML::FromText, and then both would be scrubbed as before.
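The "search for HTML tags and decide" option could be sketched roughly as below. The function `looks_like_html` and the tag list are assumptions built from the subset named earlier; this is only a heuristic, not a parser.

```perl
use strict;
use warnings;

# Tags taken from the allowed subset mentioned in the post.
my @ALLOWED = qw(b i p br a ul li);
my $TAG_RE  = do {
    my $names = join '|', @ALLOWED;
    qr{</?(?:$names)\b[^>]*>}i;   # opening or closing tag, optional attributes
};

# Hypothetical heuristic: treat a message as HTML if it contains at
# least one tag from the allowed subset.
sub looks_like_html {
    my ($text) = @_;
    return $text =~ $TAG_RE ? 1 : 0;
}

print looks_like_html("Some <b>bold</b> text"), "\n";
print looks_like_html("1 < 2 and 3 > 2"), "\n";
```

A guess like this can be wrong: plain text such as "if a<b then c>d" is matched as a <b> tag. An explicit checkbox avoids that class of mistake, at the cost of asking the user one more question.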

On the output side, when displayed as webpages, the data can be taken straight from the db without any processing, while for emailing I was looking at using HTML::FormatText to convert back into plain text.

I've started to code up some examples to test this out and it _almost_ works. The issue is that I'd like the output text to match the input text as closely as possible, else I will get complaints :) There are a number of small problems, such as how HTML::FromText changes

* 1
* 2
to
<UL><LI><P>1</P>
<P>2</P>
</UL>
which HTML::FormatText renders as:
  *
 
    1
 
  *
 
    2
To fix these I've started making small modifications to both HTML::FromText and HTML::FormatText. So one of my questions is: should I submit these as patches to the authors, or should I just fork and change them to MyApp::HTML::xxxx?
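For the list case above, the round trip being aimed at might look something like the following. This is a standalone core-Perl sketch with hypothetical function names; it is not how HTML::FromText or HTML::FormatText work internally, and it assumes list items contain no markup of their own.

```perl
use strict;
use warnings;

# Hypothetical: convert "* item" lines into a tight <ul> list
# (no <p> inside <li>), ignoring any non-bullet lines.
sub bullets_to_html {
    my ($text) = @_;
    my @items = map { /^\*\s+(.*)$/ ? $1 : () } split /\n/, $text;
    return join "\n", '<ul>', (map { "<li>$_</li>" } @items), '</ul>';
}

# Hypothetical: convert that tight list back into "* item" lines.
sub html_to_bullets {
    my ($html) = @_;
    my @items = $html =~ m{<li>(.*?)</li>}gis;
    return join "\n", map { "* $_" } @items;
}

my $input = "* 1\n* 2";
my $html  = bullets_to_html($input);
print "$html\n";
print html_to_bullets($html), "\n";
```

With this shape the text that comes back out is identical to the text that went in, which is the property the email side needs.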

And finally, while typing this, I've thought of maybe adding an attribute in the db to indicate whether or not the text is in HTML form. This will get rid of the converting back and forth. Thinking about it now, this might be the best way to do it.
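A rough sketch of how such a flag could drive the two output paths is below. The record layout (`body`, `is_html`) and all function names are assumptions for illustration, and the tag stripping is a crude stand-in for whatever HTML-to-text conversion is actually used.

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML.
sub escape_html {
    my ($s) = @_;
    my %ent = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $s =~ s/([&<>"])/$ent{$1}/g;
    return $s;
}

# Hypothetical: $msg is a hashref with the stored body and an is_html flag.
sub render_for_web {
    my ($msg) = @_;
    return $msg->{body} if $msg->{is_html};   # assumed already scrubbed
    my $html = escape_html($msg->{body});
    $html =~ s/\n/<br>\n/g;                   # keep the poster's line breaks
    return "<p>$html</p>";
}

sub render_for_email {
    my ($msg) = @_;
    return $msg->{body} unless $msg->{is_html};   # plain text goes out untouched
    (my $text = $msg->{body}) =~ s/<br\s*\/?>/\n/gi;
    $text =~ s/<[^>]+>//g;                    # crude placeholder tag strip
    return $text;
}

print render_for_web({ body => "a < b\nc", is_html => 0 }), "\n";
print render_for_email({ body => "<b>hi</b><br>there", is_html => 1 }), "\n";
```

With the flag in place, plain text posts never pass through HTML on their way to email, so they come out exactly as typed; only posts written in HTML still need converting for the email path.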

Am I going about this the right way? Someone must have done something similar to this before, and I'm interested in your comments.


In reply to Converting plain text to HTML and back again by Nomis52
