PerlMonks  


Syntax highlighting has always been one of my favorite features in every editor I have used. After becoming addicted to it, I have always regretted not being able to share colored code with other people.

If we talk about Perl code alone, there are a few available tools to create colored code in HTML and publish it on the web.

When you need to show code in more than one language, though, things become more difficult. Think about presenting the installation and customization of a complex system. You'd need to show Perl code, Apache configuration files, HTML code, SQL queries, and perhaps some XML.

When you publish this code on the web, what was clearly highlighted and easily understandable on your editor screen becomes a flat sequence of black-on-white text.

Introducing Text::VimColor

Two wonderful features of Vim are its ability to highlight many different languages (384 as of today) and to produce an HTML page that looks the same as the code on screen.

(And don't forget Vim's ability to highlight nested syntax, such as Perl embedded in HTML or SQL embedded in Perl.)

Producing highlighted HTML by hand from Vim is neither user friendly nor fast. If you need to publish code on a regular basis, it quickly becomes a hassle.

Enter Geoff Richards' Text::VimColor, a module that removes your need to remember difficult commands and to cut and paste your code snippets.

Given these code fragments:

    # CodeSamples.pm
    package CodeSamples;

    our $ctext = <<'CTEXT';
    #include <stdio.h>

    int main() {
        printf("hello world\n");
        return 0;
    }
    CTEXT

    our $perltext = <<'PTEXT';
    my $query = qq{SELECT mycol, COUNT(*)
        FROM mytable
        WHERE mycol <= 10
        GROUP BY mycol};
    print $_,$/ for @{ $dbh->selectcol_arrayref($query) };
    # Notice that $query has nested SQL syntax
    PTEXT

    1;

This script will produce nicely highlighted code (HTML + CSS).

    #!/usr/bin/perl -w
    # test_vimcolor.pl
    use strict;
    use CGI qw/:standard/;
    use Text::VimColor;
    use CodeSamples; # contains code samples in C and Perl

    my $csyntax = Text::VimColor->new(
        string   => $CodeSamples::ctext,
        filetype => 'c'
    ) or die("can't create C object ($!)\n");

    my $perlsyntax = Text::VimColor->new(
        string   => $CodeSamples::perltext,
        filetype => 'perl'
    ) or die("can't create perl object ($!)\n");

    my $fperlsyntax = Text::VimColor->new(
        file     => $0,
        filetype => 'perl'
    ) or die("can't create perl object ($!)\n");

    print start_html(-title=>"Text::VimColor test",
                     -style=>{'src'=>'light.css'} ),
        h2("C"),
        pre( $csyntax->html),
        hr,
        h2("Perl"),
        pre( $perlsyntax->html),
        hr,
        h2("Perl (file)"),
        pre( $fperlsyntax->html),
        hr,
        h2("CSS"),
        pre(Text::VimColor->new(file =>'light.css',
            filetype => 'css')->html),
        hr,
        h2("Perl (package)"),
        pre(Text::VimColor->new(file =>'CodeSamples.pm',
            filetype => 'perl')->html),
        hr,
        h2("Perl (another package)"),
        pre(Text::VimColor->new(file =>'VimColorCache.pm',
            filetype => 'perl')->html);

See the colorful result.

Highlighting systems overview

Before continuing, let me show the alternatives. I have tried all of them, and I have good and bad feelings about each one. I currently favor Text::VimColor for the reasons given above, and more.

Application               Pro                     Con
perltidy                  Fast and accurate       Only Perl
GNU source highlight      Very fast               Hard to customize.
                                                  Only a few languages
Syntax::Highlight::Perl   Fast and customizable   Only Perl
Text::VimColor            All languages.          Slower than other modules.
                          Easily customizable     Works only on Unix (as of today)

Improving performance

As I said, Text::VimColor's main deficiency is its poor performance compared to other modules. Although the latest version (0.07) is twice as fast as the previous one, it is still far too slow for any sensible web usage.

Therefore, I decided to create a caching object to improve on Text::VimColor's basic performance.

The simplest way I could think of was a tied hash with DB_File. I have also simplified the object interface, to make it easier to use.

    package VimColorCache;
    use strict;
    use warnings;
    use Text::VimColor;
    use Digest::MD5 qw/md5_hex/;
    use DB_File;

    our $VERSION = '0.1';

    sub new {
        my $class    = shift;
        my $filename = shift || 'VimColorCache.db';
        my %code_items;
        tie %code_items, 'DB_File', $filename
            or return undef;
        my $self = bless { code_items => \%code_items }, $class;
        return $self;
    }

    # slurps a whole file into a scalar
    sub _get_text {
        my $filename = shift;
        open my $in, '<', $filename or return undef;
        local $/;
        my $text = <$in>;
        close $in;
        return $text;
    }

    sub draw {
        my $self        = shift;
        my $text        = shift; # either the code or the file name
        my $input       = shift; # file or string
        my $syntax_type = shift; # syntax type (perl, c, sql, html, xml, etc)
        my $output      = shift; # output mode
        return undef unless $output =~ /^(?:html|xml)$/;
        return undef unless $input =~ /^(?:file|string)$/;
        my $code = $text;
        if ($input eq 'file') {
            $code = _get_text($text) or return undef;
        }
        $code =~ s/\t/    /g; # turns tabs into 4 spaces
        my $signature = md5_hex($code);
        if (exists $self->{code_items}->{$output.$signature}) {
            return $self->{code_items}->{$output.$signature}
        }
        else {
            my $syntax = Text::VimColor->new (
                $input   => $text,
                filetype => $syntax_type
            ) or return $code;
            my $out = $syntax->$output;
            $self->{code_items}->{$output.$signature} = $out;
            return $out;
        }
    }

    sub remove {
        my $self   = shift;
        my $text   = shift; # either the code or the file name
        my $input  = shift; # file or string
        my $output = shift;
        my $code = $text;
        if ($input eq 'file') {
            $code = _get_text($text) or return undef;
        }
        $code =~ s/\t/    /g; # same tab expansion as draw(), so the signatures match
        my $signature = md5_hex($code);
        delete $self->{code_items}->{$output.$signature};
    }

    1;
    __END__

    =head1 NAME

    VimColorCache - caches the result of Text::VimColor

    =head1 SYNOPSIS

        use VimColorCache;
        my $filename = 'syntax.db';
        my $vcc = VimColorCache->new($filename);
        print $vcc->draw('print $_,$/ unless m/^\s*$/g',
                         'string', 'perl', 'html');
        print $vcc->draw('hello.c', 'file', 'c', 'html');

    =head1 class methods

    =over 4

    =item new()

    The constructor accepts an optional filename where to store
    previously highlighted code snippets.
    The default file name is VimColorCache.db

    =item draw()

    Returns a properly highlighted text.
    It needs some parameters:

        - text          either a filename or a string containing
                        code to be formatted
        - input         'file' or 'string'
        - syntax_type   language to use (perl, c, php, xml, SQL)
                        The same as Vim's 'filetype'
        - output        either 'html' or 'xml'

    If the code passed as string or file has already been processed, then
    the corresponding formatted text is returned, otherwise a Text::VimColor
    object is created and the highlighted syntax is processed from scratch.

        print $vcc->draw('hello.c','file', 'c', 'html');

        open IN, "hello.c" or die "can't open\n";
        my $c_text = do { local $/; <IN> };
        close IN;
        print $vcc->draw($c_text,'string', 'c', 'html');

    These two instructions will print the same output. Only, the first one
    will be slow, the second one will be extremely fast.

    =item remove()

    Removes an item from the repository.
    It needs the same parameters as draw(), except "syntax_type"

        $vcc->remove('hello.c','file', 'html');

    =back

    =head1 AUTHOR

    Giuseppe Maxia, a.k.a. gmax (gmax_at_cpan.org)

    =head1 COPYRIGHT

    Same as Perl itself.

    =cut

VimColorCache is a layer between the application and the highlighting module. It works on the assumption that, in most cases, code is published once and shown many times. Sometimes it is modified, but mostly it is just published and then left on the page for public consumption. In this scenario, the first poster has to wait one or two seconds for the highlighting engine to do its job, but every further request of the same code is resolved instantly.
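
The "publish once, show many times" idea can be tried out without Vim at all. What follows is only a minimal sketch, not the module itself: `slow_highlight` is a hypothetical stand-in for the real highlighting engine, and a plain in-memory hash takes the place of the DB_File tied hash. It shows that the expensive call happens only on the first request for a given piece of code.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5 qw/md5_hex/;

my %cache;             # in-memory stand-in for the tied DB_File hash
my $engine_calls = 0;  # counts how often the "slow" engine really runs

# Hypothetical placeholder for the slow highlighting engine.
sub slow_highlight {
    my ($code) = @_;
    $engine_calls++;
    return "<pre>$code</pre>";
}

# Return the highlighted text, calling the engine only on a cache miss.
sub cached_highlight {
    my ($code, $output) = @_;
    my $key = $output . md5_hex($code);   # same key layout as draw()
    $cache{$key} = slow_highlight($code) unless exists $cache{$key};
    return $cache{$key};
}

my $snippet = 'print "hello world\n";';
cached_highlight($snippet, 'html') for 1 .. 3;
print "engine calls: $engine_calls\n";    # prints "engine calls: 1"
```

Because the key is a digest of the code itself, editing the published code automatically produces a new key, so a modified snippet is highlighted again on its next request.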

The first example shown in this node could be rewritten using VimColorCache as follows:

    #!/usr/bin/perl -w
    # test_vimcolor_cache.pl
    use strict;
    use CGI qw/:standard/;
    use VimColorCache;
    use CodeSamples;

    my $vcc = VimColorCache->new
        or die("can't create object ($!)\n");

    print start_html(-title=>"VimColorcache test",
                     -style=>{'src'=>'light.css'} ),
        h2("C"),
        pre( $vcc->draw($CodeSamples::ctext, 'string', 'c', 'html')),
        hr,
        h2("Perl"),
        pre($vcc->draw($CodeSamples::perltext, 'string', 'perl', 'html')),
        hr,
        h2("Perl (file)"),
        pre($vcc->draw($0, 'file', 'perl', 'html')),
        hr,
        h2("CSS"),
        pre($vcc->draw('light.css', 'file', 'css', 'html')),
        hr,
        h2("Perl (package)"),
        pre($vcc->draw('CodeSamples.pm', 'file', 'perl', 'html')),
        hr,
        h2("Perl (another package)"),
        pre($vcc->draw('VimColorCache.pm', 'file', 'perl', 'html'));

And the second colorful result shows exactly the same output as the previous one, except for the page title.

Update (1)
CAVEAT. If you pass a file to Text::VimColor and at the same time you are editing the same file with Vim, it will return an error. You should either ensure that your file is not currently in use by Vim before passing it to the class constructor, or slurp it into a scalar and pass it as a string.
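
The second workaround, slurping the file and passing it as a string, might look like the sketch below. The helper name `slurp_file` is my own choice for illustration; the Text::VimColor call appears only in a comment, so the snippet runs even where the module is not installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read a whole file into a scalar; returns undef if it cannot be opened.
sub slurp_file {
    my ($filename) = @_;
    open my $in, '<', $filename or return undef;
    local $/;               # no record separator: read everything at once
    my $text = <$in>;
    close $in;
    return $text;
}

# Usage with the constructor shown earlier (needs Text::VimColor installed):
#   my $code   = slurp_file('hello.c');
#   my $syntax = Text::VimColor->new( string => $code, filetype => 'c' );
#   print $syntax->html;
```

With VimColorCache the same effect is obtained by passing the slurped text with the 'string' input type instead of 'file'.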

Update (2)
In case you are wondering just how slow Text::VimColor is without a cache, here is an example.
Processing times are acceptable for small scripts, but become unbearable for large ones.

Time to highlight

Application        strict.pm   Benchmark.pm   CGI.pm     DBI.pm
                   (2.6 KB)    (22 KB)        (221 KB)   (226 KB)
perltidy           0.30        0.86           1.76       2.77
source-highlight   0.00        0.03           0.19       0.21
Text::VimColor     0.34        2.35           16.62      17.23
VimColorCache      0.01        0.01           0.01       0.02
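
If you want to take comparable measurements on your own machine, a rough sketch with Time::HiRes follows. This is an assumption about how one could time a highlighting call, not a record of how the figures above were obtained; `time_call` is a hypothetical helper, and the workload is a dummy closure standing in for the real call.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw/gettimeofday tv_interval/;

# Run a code reference once; return (elapsed seconds, its result).
sub time_call {
    my ($code_ref) = @_;
    my $start   = [gettimeofday];
    my $result  = $code_ref->();
    my $elapsed = tv_interval($start);   # seconds since $start
    return ($elapsed, $result);
}

# Dummy workload standing in for a highlighting call, e.g.
#   sub { Text::VimColor->new( file => 'CGI.pm', filetype => 'perl' )->html }
my ($seconds, $out) = time_call( sub { join '', map { "<b>$_</b>" } 1 .. 1000 } );
printf "%.2f seconds, %d bytes of output\n", $seconds, length $out;
```

Timing the same closure twice against VimColorCache should show the difference between a cache miss and a cache hit.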

Enjoy!

 _  _ _  _  
(_|| | |(_|><
 _|   

In reply to On-the-fly all-languages syntax highlighting by gmax
