Greetings Fellow Monks,
Later this month, I will be giving a talk titled: Machine Learning Made Easy with Perl. A preliminary outline is:
Part I: Exploratory Data Analysis
Part II: Decision Support Systems
Part III: Pattern Recognition
In each part, I plan to discuss the problem, the strategy to solve it, the choice of machine learning technique, and the main configuration issues participants need to understand to deploy machine learning applications successfully. I will also show snippets of the code used. For example:
For data gathering using Finance::YahooQuote:
#!/usr/bin/perl
use strict;
use warnings;
use Finance::YahooQuote;
my @symbols = ("IBM","DELL","GOOG","YHOO","MSFT","ORCL","SAP","COGN","BOBJ");
my @columns = (
    "Last Trade (Price Only)", "Last Trade Date", "Last Trade Time",
    "Day's Range", "52-week Range", "EPS Est. Next Year",
    "P/E Ratio", "PEG Ratio", "Dividend Yield",
);
my $arrptr = getcustomquote(\@symbols, \@columns);
my $i = 0;
foreach my $symbol (@symbols) {
    my @quotes = @{ $arrptr->[$i++] };
    print "$symbol\t@quotes\n";
}
For the FCM:
use strict;
use warnings;
use PDL;
use PDL::NiceSlice;
# ================================
# fcm
# ( $performance_index, $prototypes, $current_partition_matrix) =
# fcm( $patterns, $partition_matrix, $fuzzification_factor,
# $tolerance, $max_iter )
# ================================
sub fcm {
    #
    # fuzzy c means implementation
    #
    my ( $patterns, $current_partition_matrix, $fuzzification_factor,
         $tolerance, $max_iter ) = @_;
    my ( $number_of_patterns, $number_of_clusters ) =
        $current_partition_matrix->dims();
    my ( $prototypes, $performance_index );
    my $iter = 0;
    while (1) {
        # computing each prototype
        my $temporal_partition_matrix =
            $current_partition_matrix ** $fuzzification_factor;
        my $temp_prototypes =
            ($temporal_partition_matrix x $patterns)->xchg(1,0)
            / sumover($temporal_partition_matrix);
        $prototypes = $temp_prototypes->xchg(1,0);
        # copying partition matrix
        my $previous_partition_matrix = $current_partition_matrix->copy;
        # updating the partition matrix
        my $dist = zeroes($number_of_patterns, $number_of_clusters);
        for my $j (0..$number_of_clusters - 1) {
            my $diff = $patterns
                - $prototypes(:,$j)->dummy(1, $number_of_patterns);
            $dist(:,$j) .= (sumover( $diff ** 2 )) ** 0.5;
        }
        my $temp_variable = $dist ** (-2/($fuzzification_factor - 1));
        $current_partition_matrix =
            $temp_variable / sumover($temp_variable->xchg(1,0));
        #
        # Performance Index calculation
        #
        $temporal_partition_matrix =
            $current_partition_matrix ** $fuzzification_factor;
        $performance_index = sum($temporal_partition_matrix * ( $dist ** 2 ));
        # checking stop conditions
        my $diff_partition_matrix =
            $current_partition_matrix - $previous_partition_matrix;
        $iter++;
        if ( ($diff_partition_matrix->max < $tolerance)
             || ($iter > $max_iter) ) {
            last;
        }
        print "iter = $iter\n";
    }
    return ( $performance_index, $prototypes, $current_partition_matrix );
}
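For readers who do not know PDL, the partition-matrix update in the sub above can be checked by hand for a single pattern. What follows is only a minimal pure-Perl sketch, not the code I will show in the talk: the helper name update_memberships is hypothetical, and it assumes every distance is strictly positive and the fuzzification factor is greater than 1.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the talk's code): given the distances
# from ONE pattern to every prototype, return its membership in each cluster.
# Assumes every distance is strictly positive and $fuzzification_factor > 1.
sub update_memberships {
    my ( $distances, $fuzzification_factor ) = @_;

    # Same exponent as in fcm: -2 / (m - 1)
    my $exponent = -2 / ( $fuzzification_factor - 1 );
    my @weights  = map { $_ ** $exponent } @$distances;

    # Normalise so the memberships of this pattern sum to 1
    my $total = 0;
    $total += $_ for @weights;
    return [ map { $_ / $total } @weights ];
}

# One pattern at distance 1 from the first prototype and 3 from the second.
my $memberships = update_memberships( [ 1, 3 ], 2 );
printf "%.3f\n", $_ for @$memberships;    # prints 0.900 then 0.100
```

With a fuzzification factor of 2, the weights are 1 and 1/9, so the closer prototype receives 90% of the membership. The PDL version does the same calculation for all patterns at once.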
I expect the audience to be mainly Perl-savvy people. However, the talk is open to everyone attending the conference, so some people in the audience might not be familiar with Perl.
The talk is scheduled to last 45 minutes. I plan to cover each part in about 10 minutes to leave between 5 and 10 minutes for questions and answers. I do not plan to explain the snippets in detail because I do not have enough time. However, I will make the code available to everyone interested. My questions for you, Fellow Monks, are:
- If you were attending this session, would you expect me to describe the code in detail?
- Do you think it is a good strategy to concentrate on the machine learning part rather than on the Perl part?
- What suggestions do you have about points that I should (or should not) cover?
- Any other suggestions or thoughts?
Thank you,
lin0
Update: Fixed typo in header of FCM sub