emilford has asked for the wisdom of the Perl Monks concerning the following question:
I have a form with a couple of form fields that have a 4000-character limit. I put in place some JS code that blocks form submission if any of the fields holds 4001 or more characters.
The problem is that Oracle rejects some strings that the JS code finds to be under 4000 characters, and everything comes crashing down. I decided to throw in an extra check within my Perl code that does a substr($x, 0, 4000) on all values, just to double check.
What I found, however, is that a string that JS and wc think is 4000 characters, Perl thinks is 4052. I'm assuming that this has something to do with line breaks, etc. Perl truncates the string down to what it thinks is 4000 characters, but it's actually hacking off a chunk of the user's input.
So, my question is in regard to matching up what JS and Oracle think is 4000 characters to what Perl thinks is 4000 characters. How do I get Perl to recognize this difference?
Re: truncating form field input to 4000 characters
by JediWizard (Deacon) on Nov 10, 2005 at 15:49 UTC
This appears to be related to a CGI script, right? Are you using the CGI module to fetch the parameter values from the form? I have seen people try to read the parameters in themselves (without CGI.pm) and get bitten by URI escape sequences in their input. Without seeing your code... this would be my first guess.
They say that time changes things, but you actually have to change them yourself. Andy Warhol
Yes, I am using CGI.pm to read in the parameters. After a bit more testing, I've found that JS and MS Word's count show the text as < 4000, but doing a wc -m from the command line shows > 4000. I'm assuming this has something to do with line breaks, etc. Would the text be manipulated in any way between input into the form field and pulling it w/ CGI.pm?
Re: truncating form field input to 4000 characters
by kwaping (Priest) on Nov 10, 2005 at 15:42 UTC
What's different or special about that string that's giving you problems? Does it contain non-printing characters or something else out of the ordinary?
Re: truncating form field input to 4000 characters
by Skeeve (Parson) on Nov 10, 2005 at 17:56 UTC
Re: truncating form field input to 4000 characters
by ikegami (Patriarch) on Nov 10, 2005 at 16:28 UTC
Did you try to send non-ASCII characters? They may have gotten converted to HTML entities (&...;). HTML::Entities might be useful here.
I found the problem to be the difference in line feeds. The testers were cutting and pasting from a Word document into the HTML form. I believe the line feed on Windows is "\r\n". JavaScript doesn't count the \r as an extra character, whereas Perl and Oracle do. Removing the \r seemed to solve the problem.
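For anyone landing here later, the fix described above can be sketched roughly as follows. This is only an illustration, not the code from the actual application: the helper name normalize_and_truncate and the sample string are made up, and it assumes the value has already been fetched (e.g. via CGI.pm) into a plain scalar.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse CRLF pairs to a single LF so that Perl's
# length() agrees with the count the JS check saw, then truncate.
sub normalize_and_truncate {
    my ($value, $limit) = @_;
    $value =~ s/\r\n/\n/g;              # "\r\n" -> "\n"
    return substr($value, 0, $limit);   # keep at most $limit characters
}

# Sample input standing in for text pasted from a Word document.
my $pasted = "line one\r\nline two\r\n";
my $clean  = normalize_and_truncate($pasted, 4000);

printf "raw length:        %d\n", length $pasted;   # 20
printf "normalized length: %d\n", length $clean;    # 18
```

Normalizing before the substr matters: truncating first and stripping the \r afterwards would still cut off input that fits within the limit once the extra characters are gone.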
I believe the line feed on Windows is "\r\n"
That also happens to be the line feed used by the HTTP protocol (officially, anyway -- most HTTP servers and clients will accept simple \n line feeds, but it's not 100% correct). Your web client is probably where the \r\n line feeds are coming from in this case, not necessarily Windows.
Newlines
In most operating systems, lines in files are terminated by newlines. Just what is used as a newline may vary from OS to OS. Unix traditionally uses "\012", one type of DOSish I/O uses "\015\012", and Mac OS uses "\015".
Perl uses "\n" to represent the "logical" newline, where what is logical may depend on the platform in use. In MacPerl, "\n" always means "\015". In DOSish perls, "\n" usually means "\012", but when accessing a file in "text" mode, STDIO translates it to (or from) "\015\012", depending on whether you're reading or writing. Unix does the same thing on ttys in canonical mode. "\015\012" is commonly referred to as CRLF.
So, to be picky, a "\r\n" written to a text-mode filehandle on Windows should give you "\015\015\012" and not "\015\012".
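A quick way to see what "\r\n" holds in memory, before any text-mode translation happens on output, is to print the character codes. This is a minimal sketch, assuming an ASCII-based perl (not MacPerl or an EBCDIC platform); the variable names are only illustrative.

```perl
use strict;
use warnings;

# Inspect the string itself, not what a text-mode filehandle would write.
my $crlf  = "\r\n";
my @octal = map { sprintf "%03o", ord($_) } split //, $crlf;

print "in memory: @octal\n";   # in memory: 015 012
```

On such a perl the string is two characters long. The extra \015 only appears once the string is written through a text-mode handle on a DOSish perl, which is why writing "\015\012" explicitly is the safer way to say CRLF.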
s$$([},&%#}/&/]+}%&{})*;#$&&s&&$^X.($'^"%]=\&(|?*{%
+.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e
Re: truncating form field input to 4000 characters
by kulls (Hermit) on Nov 11, 2005 at 05:38 UTC
Hi,
Are you using any templates to handle the UI (HTML)?
If so, you can add escape=html and escape=js to the form fields in order to control the special characters. I guess the value gets truncated because of special characters.
-Kulls