http://qs321.pair.com?node_id=662953


in reply to "Unrecognized character" stops perl cold

I find this very annoying also. It's because the perldoc renderer is doing silly things like replacing regular '-' dash/hyphen characters with chr(226) (instead of the normal chr(45) character) and replacing single ticks with the "prettier" version of the single quote. I have not been able to figure out how to get this to display properly without switching completely to raw or text mode.
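The chr(226) is most likely just the first byte of a multi-byte UTF-8 sequence rather than a character in its own right. Here is a small sketch that shows this, assuming the pasted text is UTF-8; `utf8_bytes` is a name made up for the example, not something from perldoc.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: return the UTF-8 bytes of one code point
# as a list of integers.
sub utf8_bytes {
    my ($codepoint) = @_;
    return unpack 'C*', encode( 'UTF-8', chr($codepoint) );
}

# HYPHEN, MINUS SIGN, RIGHT SINGLE QUOTATION MARK
for my $cp ( 0x2010, 0x2212, 0x2019 ) {
    printf "U+%04X => %s\n", $cp, join ' ', utf8_bytes($cp);
}
```

All three sequences start with byte 226 (0xE2), which would explain why that number shows up when the text is inspected one byte at a time.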

My solution was to bind a key in vim to run this silly little script. So whenever I paste some code from some perldoc I'm viewing, I run this over the code.

#!/usr/bin/perl -n
s/‐/-/g;
s/−/-/g;
s/’/'/g;
print;

I notice it renders oddly in the perlmonks node, but you get the idea. I literally just copied and pasted the offending symbols into this script.
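One way around the rendering problem is to write the characters as code point escapes instead of pasting them literally. The following is only a sketch of that idea, not the script I actually use; `asciify` is a made-up name, and it assumes the string has already been decoded into characters rather than raw bytes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: replace the typographic characters mentioned
# above with their plain ASCII equivalents. Expects a decoded
# (character) string, not raw UTF-8 bytes.
sub asciify {
    my ($text) = @_;
    $text =~ tr/\x{2010}\x{2212}/-/;    # HYPHEN, MINUS SIGN -> '-'
    $text =~ tr/\x{2019}/'/;            # RIGHT SINGLE QUOTATION MARK -> "'"
    return $text;
}

# A sample line built from escapes, so nothing here depends on how
# this page renders the fancy characters.
my $sample = "perl \x{2010}e \x{2019}print 1 \x{2212} 1\x{2019}\n";
print asciify($sample);
```

To use it as a filter, the input would need to be decoded first, for example with `binmode STDIN, ':encoding(UTF-8)'` before looping over the lines.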

--
naChoZ

Therapy is expensive. Popping bubble wrap is cheap. You choose.

Re^2: "Unrecognized character" stops perl cold
by andyford (Curate) on Jan 22, 2008 at 20:53 UTC

    Yes, thanks, you put me right onto "the" answer for my work environment.

    In your vimrc, put

    vmap ,qq :%s/’/'/g<CR>
    nmap ,qq :%s/’/'/g<CR>
    where the first quote is entered as a digraph: Ctrl-V Ctrl-K '9.

    At that point "comma q q" is mapped in vim to replace all the bad fancy quotes with good single quotes.

      That will only handle that one character. The problem is there are other characters that are modified as well. After my previous post the other day, I ended up going and making a much more verbose version of that script. Now the bad characters can be referred to by name. Plus I added a silly way of making it display the identity of characters to make it easier to find more that need to be fixed.

      #!/usr/bin/perl -n
      #use strict;
      #use warnings;
      use charnames ();
      use encoding "utf8";

      $|++;

      my $chars = {
          'HYPHEN'                      => '-',    # \x{2010}
          'MINUS SIGN'                  => '-',    # \x{2212}
          'FIGURE DASH'                 => '-',    # \x{2012}
          'RIGHT SINGLE QUOTATION MARK' => "'",    # \x{2019}
          'BOX DRAWINGS LIGHT VERTICAL' => '|',    # \x{2502}
      };

      # If the first character is an equal sign, skip it and
      # display the identity of each remaining character.
      #
      if (/^=/) {
          for my $index ( 1 .. length($_) - 1 ) {
              my $char = substr( $_, $index, 1 );
              print $char . " "
                  . sprintf( "\\x{%04X}", ord($char) )
                  . "\" = '"
                  . charnames::viacode( ord($char) )
                  . "'\n"
                  ;
          }
      }
      else {
          # Replace each named character with its ASCII stand-in.
          for my $cname ( keys %$chars ) {
              my $char = chr( charnames::vianame($cname) );
              s/$char/$chars->{$cname}/g;
          }
          print;
      }

      --
      naChoZ

      Therapy is expensive. Popping bubble wrap is cheap. You choose.