http://qs321.pair.com?node_id=485363
Category: HTML Utility
Author/Contact Info Ted Fiedler <fiedlert@gmail.com>
Description: A utility to translate an email address, or any text, into HTML numeric entities. I generally use it to translate email addresses on web pages into numeric entities; the thought is that it may keep spiders from grabbing them. I'm not exactly sure whether it does or not, but ignorance is bliss ;) It doesn't tolerate undefined characters: anything outside letters, digits, ".", "-", "@", space and "_" has no entry in the lookup table.
#!/usr/bin/perl -w

use strict;

my $email    = $ARGV[0] or die "no email address given\n";

# setup our alphabet
my @alphabet = qw(a b c d e f g
                  h i j k l m n
                  o p q r s t u
                  v w x y z);

# Translate non alphanumerics first
my %translate=( "."  => "46",
                "-"  => "45",
                "\@" => "64",
                " "  => "32",
                "_"  => "95");

# translate our numbers
for ( 0 .. 9 )
{
    $translate{$_}                = $_ + 48;
}

# translate our alphabet
for ( 0 .. 25 )
{
    $translate{$alphabet[$_]}     =  $_ + 97;
    $translate{ uc $alphabet[$_]} =  $_ + 65;
}

print "&#" . sprintf("%03d", $translate{$_}) for (split //, $email);

print "\n";
Re: html2code.pl
by Roy Johnson (Monsignor) on Aug 20, 2005 at 23:00 UTC
    Wouldn't this be a one-liner based on
    s/./sprintf('&#%03d', ord($&))/ge;
    ?

    Caution: Contents may have been coded under pressure.

      Nice, but why not use $1 instead, and thus avoid the "evil variable" $& ?

      s/(.)/sprintf('&#%03d;', ord($1))/ge;

      the lowliest monk

        Frankly, because it's a one-liner working on one string. The performance difference isn't going to make up for typing extra parentheses, though it is something that people should be aware of. I could have left out more parens, but I thought it would make a hard-to-read answer.
        perl -pe 's/./sprintf"&#%03d;",ord$&/ge'

        Caution: Contents may have been coded under pressure.
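For anyone who wants the substitution above as a reusable piece rather than a one-liner, here is a minimal sketch. The sub name encode_numeric and the default address are made up for illustration, and the /s flag is an addition so that "." also matches newlines.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# encode_numeric is a hypothetical name chosen for this sketch.
sub encode_numeric {
    my ($text) = @_;

    # Copy the input, then replace every character with its numeric entity.
    # /s lets "." match newlines too; /e evaluates the replacement as code.
    ( my $encoded = $text ) =~ s/(.)/sprintf('&#%03d;', ord($1))/gse;
    return $encoded;
}

# Use the first command-line argument if given, else a placeholder address.
my $input = @ARGV ? $ARGV[0] : 'user@example.com';
print encode_numeric($input), "\n";
```

Because it relies on ord rather than a lookup table, this version has no "undefined character" case; whether that is desirable depends on what you feed it.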
Re: html2code.pl
by tcf03 (Deacon) on Aug 20, 2005 at 23:46 UTC
    Apparently I sometimes get ahead of myself.

    Thanks
    Ted
    --
    "That which we persist in doing becomes easier, not that the task itself has become easier, but that our ability to perform it has improved."
      --Ralph Waldo Emerson
Re: html2code.pl
by tlm (Prior) on Aug 21, 2005 at 06:49 UTC

    Correct me if I'm missing something, but shouldn't there be a ";" at the end of the HTML entity? Also, as a general style issue, instead of print + sprintf I would just use a single printf:

    printf '&#%03d;', $translate{ $_ } for split '', $email;

    the lowliest monk
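If you would rather keep the lookup-table approach from the original script, a sketch along these lines adds the trailing ";" and fails loudly on characters the table does not cover, instead of printing a warning and &#000;. The helper name to_entities is hypothetical, and building the table with ord is a shortcut that is assumed to give the same values as the hand-written one.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same character set as the original script, built with ord() instead of by hand.
my %translate = map { ( $_ => ord $_ ) }
    ( 'a' .. 'z', 'A' .. 'Z', 0 .. 9, '.', '-', '@', ' ', '_' );

# to_entities is a hypothetical helper name, not part of the original script.
sub to_entities {
    my ($text) = @_;
    my $out = '';
    for my $char ( split //, $text ) {
        # Refuse unmapped characters rather than emit a bogus entity.
        die "unsupported character '$char'\n"
            unless exists $translate{$char};
        $out .= sprintf '&#%03d;', $translate{$char};
    }
    return $out;
}

# Use the first command-line argument if given, else a placeholder address.
my $email = @ARGV ? $ARGV[0] : 'user@example.com';
print to_entities($email), "\n";
```

Returning the string rather than printing inside the loop is just a convenience; a single printf per character, as suggested above, works equally well.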