I love it when a plan comes together.

I set out to figure out how to use Inline::C today, and thought I'd share the experience from the perspective of someone who was using Inline::C for the first time.

The Inline::CPP experience: Before I get into discussing Inline::C, I should mention that I really set out to explore Inline::CPP, but was disappointed to find that building it on both my Windows Vista system with Strawberry Perl v5.12 and my Ubuntu Linux 11.10 system with Perl v5.14 proved more difficult than I cared to deal with at this time. The CPAN Testers Matrix shows it pretty much failing across the board for the versions of Perl I'm using on the OSes I have available at my fingertips. The CPAN Testers Reports summary page also shows v0.25 not passing on Win32 Perl 5.12.3 and just about any version of Linux. Given that it hasn't been updated in a number of years, it's probably a dead end. But if others have been successful, I'd love to hear about it.

Back to Inline::C. Installation was straightforward. On both Windows (with Strawberry Perl) and Linux, it was just a matter of invoking cpan Inline::C. The rest went like clockwork. That's nice.

The documentation for Inline::C also refers the reader to Inline::C-Cookbook. Anyone interested in getting some use out of this module should read both documents. The cookbook really helped to illustrate what is discussed in (or left out of) the documentation for Inline::C. In particular, I was glad to find that I didn't have to jump through big hoops to pass a list back to Perl. Minor hoops, yes, but I was expecting to have to build up a linked list or something by hand. Instead, the Inline_Stack_* macros provided let you push values straight onto Perl's stack, and they arrive back in Perl as a list.

Observe the following example from the Inline::C-Cookbook:

perl -e 'use Inline C=>q{void greet(){printf("Hello, world\n");}};greet'

Now that looks promising... Next I pulled up an old benchmark test I had in a trivia folder. I had already written two of the benchmark subs I wanted to compare with Inline::C. The first was a pure Perl implementation of a pretty straightforward subroutine that searches for primes among the integers 0 .. n. The second sub was a Perl wrapper around a system call (via open to a pipe). The system call invokes a compiled C++ implementation of the same algorithm as the one used in the pure Perl subroutine.
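For readers who want to poke at the algorithm with no Perl in the loop, here is a minimal standalone C sketch of the same trial-division search. The function name find_primes and its buffer-based interface are assumptions made for illustration and do not appear in the benchmark script. It also tests `j * j <= i` instead of calling sqrt(), so it needs no math library. That is a swap from the original code, not the author's exact loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the original script): write the primes
 * in 2..search_to into out[], storing at most `capacity` values, and
 * return how many were stored. Intended for modest ranges such as the
 * 150000 used in the benchmark; j * j would overflow near INT_MAX. */
size_t find_primes(int search_to, int *out, size_t capacity)
{
    size_t count = 0;

    assert(out != NULL);

    /* 2 is the only even prime, so handle it up front. */
    if (search_to >= 2 && count < capacity) {
        out[count++] = 2;
    }

    /* Test odd candidates only, dividing by odd numbers up to sqrt(i). */
    for (int i = 3; i <= search_to && count < capacity; i += 2) {
        int is_prime = 1;
        for (int j = 3; j * j <= i; j += 2) {
            if (i % j == 0) {
                is_prime = 0;
                break;
            }
        }
        if (is_prime) {
            out[count++] = i;
        }
    }
    return count;
}
```

Called with search_to set to 3571 and a large enough buffer, this should store 500 values, which is the same sanity check the test script applies to basic_perl().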

I had hoped to employ that same C++ code in an Inline::CPP test, but since I couldn't get that to install I re-implemented the same algorithm in C using the Inline::C hooks for Perl.

Here's the code, followed by a sample run:

use strict;
use warnings;
use autodie;
use v5.12;
use Benchmark qw/cmpthese/;
use Test::More tests => 3;

use Inline 'C';

use constant TOP  => 150000;
use constant TIME => 5;

is(
    scalar @{ basic_perl( 3571 ) },
    500,
    "The first 500 primes are found from 2 to 3571."
);

is_deeply(
    external_cpp(),
    basic_perl(),
    "external_cpp() function gives same results as basic_perl()."
);

is_deeply(
    inline_c(),
    basic_perl(),
    "inline_c() function gives same results as basic_perl()."
);

note "\nComparing basic_perl(), external_cpp(), and inline_c() for\n",
     TIME, " seconds searching ", TOP, " integers.\n\n";

cmpthese(
    - TIME,
    {
        basic_perl   => \&basic_perl,
        external_cpp => \&external_cpp,
        inline_c     => \&inline_c,
    },
);

note "\nI love it when a plan comes together.\n\n";


# The pure Perl version.
sub basic_perl {
    my $top = $_[0] // TOP;
    my @primes = ( 2 );
    BASIC_OUTER: for( my $i = 3; $i <= $top; $i += 2 ) {
        my $sqrt_i = sqrt( $i );
        for( my $j = 3; $j <= $sqrt_i; $j += 2 ) {
            next BASIC_OUTER unless $i % $j;
        }
        push @primes, $i;
    }
    return \@primes;
}


# A wrapper around the external executable compiled in C++.
sub external_cpp {
    my $top = TOP;
    open my $fh, '-|', "primes.exe $top";
    chomp( my @primes = <$fh> );
    close $fh;
    return \@primes;
}


# To be consistent: a wrapper around the Inline C version.
sub inline_c {
    my $top = TOP;
    my @primes = inline_c_primes( $top );
    return \@primes;
}


__END__

// Reference only (not used by Inline::C )
// The source code, "primes.cpp" for "primes.exe",
// used by external_cpp().

#include <iostream>
#include <cmath>
#include <cstdlib>
#include <vector>
#include <algorithm>

using namespace std;

vector<int> get_primes( int search_to );
void print( int value );

// The first 500 primes are found from 2 to 3571.
const int TOP = 3571;

//
int main( int argc, char *argv[] ) {
    int search_to = ( argc > 1 ) ? atoi(argv[1]) : TOP;
    vector<int> primes = get_primes( search_to );
    for_each( primes.begin(), primes.end(), print );
    return 0;
}

vector<int> get_primes( int search_to ) {
    vector<int> primes;
    primes.push_back( 2 );
    for( int i = 3; i <= search_to; i += 2 ) {
        int sqrt_i = sqrt( i );
        for( int j = 3; j <= sqrt_i; j += 2 ) {
            if( i % j == 0 ) goto SKIP;
        }
        primes.push_back( i );
        SKIP: {};
    }
    return primes;
}

void print ( int value ) {
    cout << value << endl;
}


__C__

/* Here is the C code that is compiled by Inline::C */

#include "math.h"

void inline_c_primes( int search_to ) {
    Inline_Stack_Vars;
    Inline_Stack_Reset;
    /* Each value is pushed as a mortal SV so Perl frees it automatically. */
    Inline_Stack_Push(sv_2mortal(newSViv(2)));
    int i;
    for( i = 3; i <= search_to; i+=2 ) {
        int sqrt_i = sqrt( i );
        int qualifies = 1;
        int j;
        for( j = 3; ( j <= sqrt_i ) && ( qualifies==1 ); j += 2 ) {
            if( i % j == 0 ) {
                qualifies = 0;
            }
        }
        if( qualifies == 1 ) {
            Inline_Stack_Push(sv_2mortal(newSViv(i)));
        }
    }
    Inline_Stack_Done;
}

/* Cross your fingers and hope for the best! */

...the sample run (Windows)...

1..3
ok 1 - The first 500 primes are found from 2 to 3571.
ok 2 - external_cpp() function gives same results as basic_perl().
ok 3 - inline_c() function gives same results as basic_perl().
#
# Comparing basic_perl(), external_cpp(), and inline_c() for
# 5 seconds searching 150000 integers.
#
               Rate   basic_perl external_cpp     inline_c
basic_perl   1.27/s           --         -94%         -98%
external_cpp 20.0/s        1467%           --         -66%
inline_c     59.3/s        4555%         197%           --
#
# I love it when a plan comes together.
#

A sample run from my Linux system:

1..3
ok 1 - The first 500 primes are found from 2 to 3571.
ok 2 - external_cpp() function gives same results as basic_perl().
ok 3 - inline_c() function gives same results as basic_perl().
#
# Comparing basic_perl(), external_cpp(), and inline_c() for
# 5 seconds searching 150000 integers.
#
               Rate   basic_perl external_cpp     inline_c
basic_perl   2.37/s           --         -91%         -97%
external_cpp 25.3/s         969%           --         -67%
inline_c     76.0/s        3106%         200%           --
#
# I love it when a plan comes together.
#

It took a little time getting used to debugging under Inline::C. But the error messages are about as informative as those the C compiler would give on its own, if not a little better. For one thing, compile-time errors get written to a log file in the build directory, and the messages that dump to the screen give the path to where the full error dump resides. That's nice.

However, you do have to take note that line numbers in error messages won't correspond with those in your Perl source code. Instead, they refer to a C source file with an .xs suffix that gets placed in the build directory. Again, the error log and screen messages point to that same file. Open it up in an editor that shows line numbers and the error messages will make more sense. But don't bother editing the .xs file. Changes need to be made to the C code within the Perl source file. (I know this is common sense, but with several editors open it's easy to mistakenly start editing the .xs file just because that's where you're cross-referencing the error line numbers.)

Now for the fun: As the benchmark shows, the Inline::C version screams by comparison to the other methods, running at roughly 47 times the rate of the pure Perl sub on Windows and about 32 times on Linux. Of course dropping into C or C++ is generally a big pain in the neck, but when performance counts, it doesn't disappoint.

Another thing to notice is that the external system call method is significantly slower than the inline method, managing only about a third of its rate on both systems. Earlier I had a benchmark where I was doing an external call to essentially a "no-op", and it's pretty obvious that the work being done in an external call doesn't come for free. But even with that extra work, the external call method is an order of magnitude faster than the pure Perl subroutine.
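To get a feel for that fixed cost, the sketch below times repeated pipe calls to an external command. It is an illustration under assumptions, not the no-op benchmark mentioned above: the helper name average_pipe_call_seconds is hypothetical, it relies on POSIX popen() and clock_gettime(), and it assumes the command you pass (for example a no-op such as "true") exists on the system.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Current reading of the monotonic clock, in seconds. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Hypothetical helper: run `command` through a read pipe `runs` times,
 * draining its output each time, and return the average wall-clock
 * seconds per call. Returns -1.0 if a pipe cannot be opened or closed. */
double average_pipe_call_seconds(const char *command, int runs)
{
    char buffer[256];
    double start;

    assert(command != NULL);
    assert(runs > 0);

    start = now_seconds();
    for (int r = 0; r < runs; r++) {
        FILE *pipe = popen(command, "r");
        if (pipe == NULL) {
            return -1.0;
        }
        /* Read and discard everything the child writes. */
        while (fgets(buffer, sizeof buffer, pipe) != NULL) {
        }
        if (pclose(pipe) == -1) {
            return -1.0;
        }
    }
    return (now_seconds() - start) / runs;
}
```

Timing a no-op command this way gives a rough per-call floor for process startup plus pipe handling. That floor is paid on every call regardless of how fast the program being launched is.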

Pros and cons of each: The benchmark results speak for themselves; if speed is what matters, the Inline::C method wins. It's not surprising that the Perl sub was the easiest to implement, followed by the external system call (which could be in any language, without worrying about Perl's C macros for passing data around), followed by the Inline::C method, which was the most trouble to work out.

I've always sort of avoided working with XS because I didn't see a lot of need. And in fact, I've gotten by just fine without Inline::C in the past as well. But it turns out that using Inline::C is fairly simple. I don't think I'll be as pleasantly surprised when I get around to tackling full-blown XS. This, however, was a pretty positive experience. I hope others will be motivated to give it a try too.

Update: Added readmore tags after the node got FrontPaged, to reduce FP clutter. Tinkered with formatting.