As part of my ongoing quest to port tutorials from Python/numpy to Perl/PDL please graciously accept the following contribution to the Monastery.
This is the Perl/PDL port of A Neural Network in 11 Lines of Python. While I've added some documentation, please reference the original blog post for full details.
#!/usr/bin/env perl
use strict;
use warnings;
use 5.016;
use PDL;
######################################################################
# This example is ported from the tutorial at
# https://iamtrask.github.io/2015/07/12/basic-python-network/
######################################################################
#
# In this example, we are training a neural network of two layers
# (one set of weights).
# It has the following variables:
# $X - input neurons
# $y - desired output values
# $syn0 - single layer of weights
# $l1 - output neurons
#
# This is our 'non-linear' function. It accepts two arguments.
# The first argument is a piddle of values, and the second argument
# is a flag.
#
# If the flag is unset, the function returns the elementwise Sigmoid
# Function (https://en.wikipedia.org/wiki/Sigmoid_function).
#
# If the flag is set, the function returns the elementwise derivative
# of the Sigmoid Function.
sub nonlin {
my ( $x, $deriv ) = @_;
return $x * ( 1 - $x ) if defined $deriv;
return 1 / ( 1 + exp( -$x ) );
}
# $X is are our input values. It contains four examples of three
# inputs. It is the following matrix:
#
# [
# [0 0 1]
# [0 1 1]
# [1 0 1]
# [1 1 1]
# ]
my $X = pdl( [ [ 0, 0, 1 ], [ 0, 1, 1 ], [ 1, 0, 1 ],
[ 1, 1, 1 ] ] );
# $y is the output vector. It is the following desired outputs for
# the four input vectors above:
# [0 0 1 1]
my $y = pdl( [ 0, 0, 1, 1 ] )->transpose;
# $syn0 is the first layer of weights, connecting the input values
# ($X) to our first layer ($l1). It is initialised to random values
# between -1 and 1.
my $syn0 = ( ( 2 * random( 3, 1 ) ) - 1 )->transpose;
# $l1 is the second (output) layer:
my $l1;
# This is the training loop. It performs 10000 training interations.
for ( 0 .. 10000 ) {
# Predict the outputs for all four examples (full batch training)
# This is performed by applying the non-linear function
# elementwise over the dot product of our input examples matrix
# ($X) and our weights between layers 0 (input) and 1 (output)
# ($syn0):
$l1 = nonlin( $X x $syn0 );
# Calculate the error by comparing calculated values ($l1) to
# known output values ($y)
my $l1_error = $y - $l1;
# Calculate the 'error weighted derivative'. This is the
# elementwise product of the errors and the derivative of the
# non-linear function across the outputs
my $l1_delta = $l1_error * nonlin( $l1, 1 );
# Update the weights between the layers
$syn0 += ( $X->transpose x $l1_delta );
}
# Display output
say "Expected output:", $y;
say "Output After Training:", $l1;
Running it on my machine takes approximately 1.5 seconds and gives output similar to:
% perl nn_tutorial.pl
Expected output:
[
[0]
[0]
[1]
[1]
]
Output After Training:
[
[0.0096660515]
[0.0078649669]
[ 0.99358927]
[ 0.99211856]
]