PerlMonks  

Preventing XSS

by techcode (Hermit)
on Sep 19, 2007 at 19:30 UTC ( [id://639978]=perlquestion )

techcode has asked for the wisdom of the Perl Monks concerning the following question:

I thought I was all set with the following code:

    sub form {
        my $self   = shift;
        my %params = @_;

        # I could use delete right?
        my $skip = array_to_hash($params{'skip_fields'}); # Array/ArrayRef

        my $q    = $self->query();
        my %vars = $q->Vars();

        use HTML::Entities;
        foreach (keys %vars) {
            next if $skip->{$_}; # Don't encode if it's in skip list
            $vars{$_} = HTML::Entities::encode($vars{$_});
        }

        return \%vars;
    }
But here is the problem. I use UTF-8 so that the site supports Serbian (Latin, not Cyrillic), so I end up with funky entities instead of letters like Š, Đ, Č, Ć and Ž.

When I hit preview, I realised this site does the same :)

Is there any other way to filter the input that would not do this? I don't want an HTML entity instead of Š in my forms ...

I believe it's ok to have those chars not encoded since I set both header and meta charset to utf-8.


Have you tried freelancing? Check out Scriptlance - I work there. For more info about Scriptlance and freelancing in general check out my home node.

Replies are listed 'Best First'.
Re: Preventing XSS
by ikegami (Patriarch) on Sep 19, 2007 at 20:11 UTC

    Sounds to me like you want

    $vars{$_} = HTML::Entities::encode_entities($vars{$_}, '<>&"');

    Quote HTML::Entities,

    The default set of characters to encode are control chars, high-bit chars, and the <, &, >, ' and " characters. But this, for example, would encode just the <, &, > and " characters:

    $encoded = encode_entities($input, '<>&"');

    It converts plain text into tag-less HTML.
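To make the difference concrete, here is a minimal sketch comparing the default character set with the restricted one suggested above. The sample string is made up, and the sketch assumes the input is already a decoded character string rather than raw UTF-8 bytes.

```perl
use strict;
use warnings;
use utf8;                                 # this source contains literal UTF-8 characters
use HTML::Entities qw(encode_entities);

# Made-up sample input, assumed to be a decoded character string
my $input = 'Šećer <b>&</b> "čaj"';

# Default set: high-bit characters are encoded too, so the letters become entities
my $default = encode_entities($input);

# Restricted set: only <, >, & and " are encoded, the letters pass through
my $minimal = encode_entities($input, '<>&"');

binmode STDOUT, ':encoding(UTF-8)';
print "default: $default\n";
print "minimal: $minimal\n";
```

With the restricted set, the Serbian letters are left alone while the markup characters are still neutralised.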

      Note that I see no reason to encode quote characters here. It isn't like the result is being placed into an attribute value. (:

      And I probably wouldn't encode all ampersands since they are useful and rather low risk. If you are worried about the little-supported javascriptish ampersand stuff, then I'd only encode those ampersands. But I guess taking something useful away from users out of combined fear and laziness is not a shockingly rare result.

      Which leaves us with a couple of simple regexes and little reason to use a module:

      s/&\{/&amp;{/g; s/</&lt;/g;

      - tye        
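The two-regex approach can be sketched as a small function. The name escape_minimal is hypothetical, and the brace in the pattern is escaped because some newer perl versions warn about or reject an unescaped literal { in a regex. As noted above, this is only meant for text placed in element content, not in attribute values.

```perl
use strict;
use warnings;

# Hypothetical helper: escapes only "<" and the "&{" sequence, nothing else
sub escape_minimal {
    my ($text) = @_;
    $text =~ s/&\{/&amp;{/g;   # neutralise the "&{...}" script-entity syntax
    $text =~ s/</&lt;/g;       # prevent any tag from opening
    return $text;
}

print escape_minimal('<script>alert(1)</script> AT&T &{x};'), "\n";
```

Ordinary ampersands such as the one in "AT&T" are left untouched, which is the trade-off being argued for.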

Re: Preventing XSS
by b10m (Vicar) on Sep 19, 2007 at 19:44 UTC

    I'm afraid you don't get the concept of XSS. You're dealing with encoding/HTML Entity problems, which is bad, but completely different than XSS "protection".

    For XSS "protection", have a look at HTML::StripScripts, it works rather well :-)

    Update: after reading your post again, it does seem you want to prevent XSS attacks (by using HTML::Entities) yet you don't want your "crazy letters" to be lost ;-). I'm not sure HTML::Entities will bulletproof your script. Have a look at HTML::StripScripts, really. But experts may say HTML::Entities _is_ enough (I would love to hear opinions on this)

    --
    b10m

    All code is usually tested, but rarely trusted.
Re: Preventing XSS
by andreas1234567 (Vicar) on Sep 20, 2007 at 11:05 UTC
Re: Preventing XSS
by techcode (Hermit) on Sep 20, 2007 at 10:47 UTC
    I ended up using what ikegami suggested plus a whitelist of allowed characters in the fields. I needed to encode them since I print that back (HTML::FillInForm) together with error messages.

    So in Data::FormValidator I created additional constraints, such as that ordinary fields should contain only alphanumerics, underscore, minus, space and dot. Email obviously takes out space and adds @, while URLs add : and /
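A rough sketch of what such whitelists might look like as plain Perl regexes follows. The hash, the type names and the is_allowed helper are hypothetical, and the URL set is read literally as the ordinary set plus : and /. The actual Data::FormValidator profile is not shown in the post, so it is not reproduced here.

```perl
use strict;
use warnings;
use utf8;   # literal UTF-8 in the sample values below

# Hypothetical whitelist patterns following the description above
my %allowed = (
    text  => qr/\A[\w\-. ]+\z/,     # alphanumerics, underscore, minus, space, dot
    email => qr/\A[\w\-.@]+\z/,     # no space, plus @
    url   => qr/\A[\w\-. :\/]+\z/,  # ordinary set plus : and / (literal reading)
);

# Hypothetical helper: returns 1 if the value contains only allowed characters
sub is_allowed {
    my ($type, $value) = @_;
    return $value =~ $allowed{$type} ? 1 : 0;
}

binmode STDOUT, ':encoding(UTF-8)';
for my $case ( [ text => 'Đorđe Petrović' ], [ text => '<script>' ], [ email => 'a@b.rs' ] ) {
    my ($type, $value) = @$case;
    printf "%-6s %-16s %s\n", $type, $value, is_allowed($type, $value) ? 'ok' : 'rejected';
}
```

On decoded character strings, \w also matches letters such as Đ and ć, so Serbian names pass while markup characters are rejected.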


