http://qs321.pair.com?node_id=1214548

As part of my ongoing quest to port tutorials from Python/numpy to Perl/PDL, please graciously accept the following contribution to the Monastery.

This is the Perl/PDL port of A Neural Network in 11 Lines of Python. While I've added some documentation, please reference the original blog post for full details.

#!/usr/bin/env perl

use strict;
use warnings;
use 5.016;

use PDL;

######################################################################
# This example is ported from the tutorial at
# https://iamtrask.github.io/2015/07/12/basic-python-network/
######################################################################
#
# In this example, we are training a neural network of two layers
# (one set of weights).
# It has the following variables:
# $X    - input neurons
# $y    - desired output values
# $syn0 - single layer of weights
# $l1   - output neurons

# This is our 'non-linear' function. It accepts two arguments.
# The first argument is a piddle of values, and the second argument
# is a flag.
#
# If the flag is unset, the function returns the elementwise Sigmoid
# Function (https://en.wikipedia.org/wiki/Sigmoid_function).
#
# If the flag is set, the function returns the elementwise derivative
# of the Sigmoid Function.
sub nonlin {
    my ( $x, $deriv ) = @_;

    return $x * ( 1 - $x ) if defined $deriv;

    return 1 / ( 1 + exp( -$x ) );
}

# $X holds our input values. It contains four examples of three
# inputs. It is the following matrix:
#
# [
#  [0 0 1]
#  [0 1 1]
#  [1 0 1]
#  [1 1 1]
# ]
my $X = pdl(
    [ [ 0, 0, 1 ],
      [ 0, 1, 1 ],
      [ 1, 0, 1 ],
      [ 1, 1, 1 ] ]
);

# $y is the output vector. It contains the desired outputs for the
# four input examples above:
# [0 0 1 1]
my $y = pdl( [ 0, 0, 1, 1 ] )->transpose;

# $syn0 is the first layer of weights, connecting the input values
# ($X) to our first layer ($l1). It is initialised to random values
# between -1 and 1.
my $syn0 = ( ( 2 * random( 3, 1 ) ) - 1 )->transpose;

# $l1 is the second (output) layer:
my $l1;

# This is the training loop. It performs 10000 training iterations.
for ( 0 .. 10000 ) {

    # Predict the outputs for all four examples (full batch training).
    # This is performed by applying the non-linear function
    # elementwise over the dot product of our input examples matrix
    # ($X) and our weights between layers 0 (input) and 1 (output)
    # ($syn0):
    $l1 = nonlin( $X x $syn0 );

    # Calculate the error by comparing calculated values ($l1) to
    # known output values ($y)
    my $l1_error = $y - $l1;

    # Calculate the 'error weighted derivative'. This is the
    # elementwise product of the errors and the derivative of the
    # non-linear function across the outputs
    my $l1_delta = $l1_error * nonlin( $l1, 1 );

    # Update the weights between the layers
    $syn0 += ( $X->transpose x $l1_delta );
}

# Display output
say "Expected output:", $y;
say "Output After Training:", $l1;

Running it on my machine takes approximately 1.5 seconds and gives output similar to:

% perl nn_tutorial.pl
Expected output:
[
 [0]
 [0]
 [1]
 [1]
]

Output After Training:
[
 [0.0096660515]
 [0.0078649669]
 [  0.99358927]
 [  0.99211856]
]
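
Not part of the original tutorial, but once the loop has finished, the trained weights in $syn0 can be used to score an input the network has never seen. A minimal sketch, assuming it is appended to the end of the script above (the input [1, 0, 0] is just a made-up example):

# Hypothetical extra step (not in the original post): use the
# trained weights to predict the output for a new, unseen input.
my $new_input  = pdl( [ [ 1, 0, 0 ] ] );
my $prediction = nonlin( $new_input x $syn0 );
say "Prediction for [1 0 0]:", $prediction;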

Re: Basic Neural Network in PDL
by pryrt (Abbot) on May 17, 2018 at 22:15 UTC

    ++Very nice. I had been thinking about brushing up on my neural net skills (they're 20 years rusty), and I've bookmarked this; it will be a good starting point for using PDL to do so.

    My one minor nitpick: the sigmoid function you chose, the "logistic function", has a derivative that's f(x) * (1-f(x)), not x * (1-x), so you should replace your nonlin() sub with

    sub nonlin {
        my ( $x, $deriv ) = @_;

        my $f = 1 / ( 1 + exp( -$x ) );

        return $f * ( 1 - $f ) if defined $deriv;

        return $f;
    }
    ... It still trains with your slope, but with my slope, it gets there faster, so 10k training loops gives better results:
    Output After Training:
    [
     [ 0.0007225057]
     [0.00048051061]
     [     0.999593]
     [     0.999388]
    ]
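
    For completeness, the identity behind that fix (a standard result, not something stated in the thread itself): for the logistic function

        f(x) = \frac{1}{1 + e^{-x}},
        \qquad
        f'(x) = \frac{e^{-x}}{\left(1 + e^{-x}\right)^{2}} = f(x)\,\bigl(1 - f(x)\bigr),

    which is why the corrected nonlin() computes $f first and reuses it in the derivative branch.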

      Thanks for the kind words; I'm glad this post may assist.

      Interesting point about the Sigmoid derivative. Both PDL and neural networks are still new to me, so I wouldn't have caught it. I'm glad someone with more experience has had a glance over the code. I wasn't aware that it was incorrect, as it's a direct port of the code from the original blog post. I just checked, and that code has the same error.

      Thanks for the fix.