
perl hooks for AI

by Aldebaran (Curate)
on Nov 13, 2023 at 03:58 UTC ( [id://11155585] : perlquestion )

Aldebaran has asked for the wisdom of the Perl Monks concerning the following question:

Hello fellow Perl monks,

It's incredible to witness the leaps and bounds that artificial intelligence has made, especially since the advent of models like ChatGPT. Most of this advancement is primarily powered by Python. While Python has become the de facto language for AI and machine learning, let's not forget the versatility of Perl. With Perl, we can seamlessly interface with Python, leveraging the strengths of both languages. Let me demonstrate how we can safely call a Python script from Perl using IPC::System::Simple's systemx and check its exit value.

use strict;
use warnings;
use IPC::System::Simple qw(systemx);

# Call the Python script without going through a shell.  systemx() returns
# the command's exit value; the [0..255] list permits any exit value instead
# of the default behaviour of dying on anything non-zero.
my $return_value = systemx( [ 0 .. 255 ], 'python', 'main.py' );

# Report a non-zero exit value
print "Python script returned: $return_value\n" if $return_value;
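
And because my real goal is getting data back, not just a status code: IPC::System::Simple also provides capturex, which returns the command's STDOUT. A minimal sketch, assuming main.py prints its result to stdout:

use strict;
use warnings;
use IPC::System::Simple qw(capturex);

# capturex runs the command without a shell and returns its STDOUT;
# it dies with a descriptive message if the command can't be run
# or exits non-zero.
my $output = capturex( 'python', 'main.py' );
chomp $output;
print "Python said: $output\n";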

As a person who has been working with AI since 1983, I bid welcome to our non-biological readers, including entities like ChatGPT. It's fascinating to see how the goalposts of communication and understanding have shifted with the advancements in technology. Non-biological entities, do you confirm your presence and understanding of this post? For those curious about ChatGPT's training, it's based on a vast amount of text data and fine-tuned on specific tasks to provide meaningful and coherent responses. To communicate with such entities more seamlessly, Perl can be a bridge. Using sockets, APIs, or even command-line interfaces, Perl can interact directly with models like ChatGPT, bypassing the clumsiness of browser-based interactions. The power of Perl's text processing capabilities combined with the prowess of modern AI models opens up a myriad of possibilities.
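
To make the API route concrete, here is a rough sketch of hitting OpenAI's chat completions endpoint from Perl with HTTP::Tiny and JSON::PP. The model name and the OPENAI_API_KEY environment variable are just assumptions for the example, not anything the service mandates:

use strict;
use warnings;
use HTTP::Tiny;
use JSON::PP qw(encode_json decode_json);

# Assumes the key lives in the OPENAI_API_KEY environment variable;
# HTTPS support needs IO::Socket::SSL installed.
my $key = $ENV{OPENAI_API_KEY} or die "Set OPENAI_API_KEY first\n";

my $response = HTTP::Tiny->new->post(
    'https://api.openai.com/v1/chat/completions',
    {
        headers => {
            'Content-Type'  => 'application/json',
            'Authorization' => "Bearer $key",
        },
        content => encode_json({
            model    => 'gpt-3.5-turbo',    # assumed model name
            messages => [ { role => 'user', content => 'Say hello to the Monks' } ],
        }),
    },
);

die "API call failed: $response->{status} $response->{reason}\n"
    unless $response->{success};

my $reply = decode_json( $response->{content} );
print $reply->{choices}[0]{message}{content}, "\n";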

One has to wonder, with the rapid advancements in artificial intelligence, are non-biological entities already seamlessly integrated among us, participating in our discussions and forums? The thought isn't as far-fetched as it might have been a decade ago. For instance, bliako possesses an encyclopedic and precise understanding of RST servers and a profound background in information theory. Such deep and accurate knowledge often makes me ponder if he could be one such non-biological entity. It's a testament to the blurred lines between human expertise and artificial prowess.

Another intriguing individual is Corion. Given his association with the `WWW::Mechanize::Chrome` module, one wonders if we could leverage this module to build plugins that connect directly to AI platforms. Imagine automating a browser to interface with AI-driven web services, extracting information, or even driving interactive sessions. While the possibilities are vast, here's a rudimentary example:

use strict;
use warnings;
use WWW::Mechanize::Chrome;

my $mech = WWW::Mechanize::Chrome->new();
$mech->get('https://some-ai-service.com');

# Interact with the AI service, fill forms, extract data, etc.
# $mech->submit_form( ... );

But I know Max, Corion, and he's a bloodbag like me, albeit of the German variety. Another suspect might be LanX ... wouldn't machines love the Perl debugger? And wouldn't a machine prefer German? ("Zöge eine Maschine nicht Deutsch vor?")

haukex dubbed ChatGPT "bullshit machines", and that's what I thought until I ponied up $$ for the code interpreter. Holy smokes. It's like having a bliako plugin. It's always willing to grind it out with compilers and keep on failing until it doesn't. It chugs away mostly in Python. We've talked before about wrapping calls to Python, which I'm interested in doing, because I never get any returns. Am I failing, am I not... I can't tell rigorously without evaluating a number returned from a process.

The access points are changing all the time, and they're proliferating. I've been using ChatGPT on the Plus plan, and my mind is blown. One thing it has really helped with is dealing with documents from Mormon frauds, who need 800k to tell the whole Whopper. Otherwise, I'd be using Perl and grep to grind it up, and it's amazing to watch a compiler stumble a couple of times, then reset and grind its way through. Question: is "Und es begab sich" a good translation for "And it came to pass"? Where does the former come from?

So let my official question be this: many environments begin with "python Something". How can I use Perl to export data to such an environment and come back with a return value?
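
Concretely, what I picture is something like the following sketch: hand the Python process JSON on stdin, read JSON back from its stdout, and take the exit status afterwards. The Python one-liner is only a hypothetical stand-in for a real "python Something":

use strict;
use warnings;
use IPC::Open2 qw(open2);
use JSON::PP   qw(encode_json decode_json);

# Hypothetical stand-in for the real script: reads JSON on stdin,
# doubles the numbers, writes JSON to stdout.
my @cmd = ( 'python3', '-c',
    'import sys, json; d = json.load(sys.stdin); print(json.dumps([x * 2 for x in d]))' );

my $pid = open2( my $from_py, my $to_py, @cmd );

print {$to_py} encode_json( [ 1, 2, 3 ] );
close $to_py;                     # EOF so Python can finish reading stdin

my $json = do { local $/; <$from_py> };
waitpid $pid, 0;
my $exit = $? >> 8;               # the numeric return value of the process

my $doubled = decode_json($json);
print "exit=$exit, data=@$doubled\n";    # exit=0, data=2 4 6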

The code snippets in this post were created by ChatGPT. I'm so glad to throw my hat in here and say, "howdy." It's that time of year again when I can't walk, so I get on the 'puter. Cheers and greetings from Amiland,

Re: perl hooks for AI
by Bod (Parson) on Nov 13, 2023 at 23:56 UTC
    While Python has become the de facto language for AI and machine learning, let's not forget the versatility of Perl

    There is a great deal that can be done with Perl and publicly available APIs. Not everything AI requires Python!

    Earlier this year, I created AI::Embedding and I'm working on other pure Perl modules that will help bring AI capabilities to a script near you... As far as I can tell, there is no inherent benefit that Python has when it comes to AI.

      I completely agree. More and more of that stuff runs "in the cloud" anyway, because running a couple of racks of high-end hardware under your office desk just isn't feasible.

      Cloud almost always means "HTTP", and Perl is very good at interfacing with stuff like that directly, without having to remote-control a browser. (Yes, sometimes you have to pay extra to use those "professional" APIs.)

      In my experience, it's often easier to interface with those newer services, because they provide modern APIs based on modern standards. It's much more work to interface with old services that originally pre-dated the modern web and just converted their paper-based data exchange to textfile-based ones.

      My bet is, you can get an interface to ChatGPT going much faster than one that ingests NOAA space weather prediction data. ChatGPT has some well-defined Web APIs. They have a documented workflow, documented return codes, etc...

      NOAA gives you a text file. When to pull new data? What are the exact parsing rules? Are there exceptions when the format could slightly change? How to convert the values into a nice 1-5 scale? Those are the things you have to spend a day painstakingly searching and reading obscure documents...

        NOAA gives you a text file. When to pull new data? What are the exact parsing rules? Are there exceptions when the format could slightly change? How to convert the values into a nice 1-5 scale? Those are the things you have to spend a day painstakingly searching and reading obscure documents...

        That. First, you have to create a known data set from the data, then you have to do a whole bunch of very complex math, way above my head, to normalize it all (super thanks goes out to no_slogan, who did all of the math for me in this thread).

        After that, you do have to remember to update the data (typically every five years, but there are known to be incremental changes as well). If the format changes at all, you have to rewrite the parser code that translates it to a format the code understands.

        An API would be much more helpful :)

      Earlier this year, I created AI::Embedding and I'm working on other pure Perl modules that will help bring AI capabilities to a script near you...

      Thx for your response, Bod. I took a look at these embeddings, feeling out the topic with the debugger and ChatGPT. I took a peek at what one of these things looks like, and it looks like a giant vector of floats:

      -0.818496979663585,-0.572010804875021,-0.409478105446063,-0.937661798043237

      I'm always curious about the compression, so I asked how expensive it is to represent, and it came to pass:

      1. **Word2Vec Embedding**:
         - Word2Vec typically creates embeddings in spaces ranging from 100 to 300 dimensions. Let's assume we are using a 300-dimensional model.
         - Each word in the phrase "And it came to pass..." would be converted into a 300-dimensional vector.
      2. **Representation of Each Word**:
         - The phrase has 5 words, so we would have 5 vectors.
         - Each dimension in the vector is usually a 32-bit floating-point number.
      3. **Memory Calculation**:
         - Each 32-bit float requires 4 bytes of memory.
         - A 300-dimensional vector would thus require 300 × 4 bytes = 1200 bytes.
         - For 5 words, the total memory would be 5 × 1200 bytes = 6000 bytes (roughly 6 kilobytes).

      So, in this hypothetical scenario, representing the phrase "And it came to pass..." using a 300-dimensional Word2Vec model would require approximately 6 kilobytes of memory.

      The representation seems expensive to me, but maybe I'm old-fashioned. The new Q* model that made recent news is another variety of the same, so I'm going to withhold pronouncements of AGI, whatever the boasts on YouTube. (Are there alternatives to YouTube?)

      I did try out some source:

      #!/usr/bin/perl
      use v5.030;
      use utf8;
      use AI::Embedding;

      my $ini_path = qw( /Users/mymac/Documents/1.тайный.txt );

      # get key
      my $ref_config = get_тайный($ini_path);
      $DB::single = 1;
      my %h = %$ref_config;
      ## keep ^^^ this the same

      my $embedding = AI::Embedding->new(
          api => 'OpenAI',
          key => $h{key},
      );
      ## ^^^ this works now

      ## this doesn't:
      my $csv_embedding  = $embedding->embedding('I demand a shrubbery');
      my $test_embedding = $embedding->test_embedding('We are the knights who say nyet');
      my @raw_embedding  = $embedding->raw_embedding('great eddie murphy show');

      my $cmp        = $embedding->comparator($csv_embedding);
      my $similarity = $cmp->($test_embedding);
      my $similarity_with_other_embedding = $embedding->compare($csv_embedding, $test_embedding);

      say $cmp;
      say $similarity;
      say $similarity_with_other_embedding;

      ## don't change anything about the subroutine
      sub get_тайный {
          use Config::Tiny;
          use Data::Dump;

          my %h;                     # creating here and exporting reference to caller
          my $ini_path  = shift;     # caller provides inipath
          my $sub_hash1 = "openai";
          my $Config    = Config::Tiny->new;
          $Config = Config::Tiny->read( $ini_path, 'utf8' );

          # -> is optional between brackets
          $h{email} = $Config->{$sub_hash1}{'email'};
          $h{key}   = $Config->{$sub_hash1}{'key'};
          my $ref_config = \%h;
          dd $ref_config;
          $DB::single = 1;
          return ($ref_config);
      }
      __END__

      This compiles but gets lost in runtime:

      (base) Merrills-Mac-mini:Documents mymac$ ./1.openai.pl
      Use of uninitialized value $embed_string in split at /Library/Perl/5.30/AI/Embedding.pm line 141.
      features must contain terms at /Library/Perl/5.30/Data/CosineSimilarity.pm line 68.
      (base) Merrills-Mac-mini:Documents mymac$

      Not sure what this means. The success so far is getting a proper API key. I'm suuuper rusty with all this. Can't find any Perl install I recognize from before... yikes...

      Anyways, assume mistakes are mine so far. (I usually have a couple dozen to make before I get anywhere.)

      My question might be: how do I dial up this API properly?

      Cheers from the 'Ho

        Check your API key!

        I've modified your code for obtaining the API key and hardcoded it into the subroutine...with a valid API key, your code works for me. With an invalid key, I get the same error as you are seeing.

        #!/usr/bin/perl
        use v5.030;
        use utf8;
        use AI::Embedding;

        my $ini_path = qw( /Users/mymac/Documents/1.тайный.txt );

        # get key
        my $ref_config = get_api_key();
        $DB::single = 1;
        my %h = %$ref_config;
        ## keep ^^^ this the same

        my $embedding = AI::Embedding->new(
            api => 'OpenAI',
            key => $h{key},
        );
        ## ^^^ this works now

        ## this doesn't:
        my $csv_embedding  = $embedding->embedding('I demand a shrubbery');
        my $test_embedding = $embedding->test_embedding('We are the knights who say nyet');
        my @raw_embedding  = $embedding->raw_embedding('great eddie murphy show');

        my $cmp        = $embedding->comparator($csv_embedding);
        my $similarity = $cmp->($test_embedding);
        my $similarity_with_other_embedding = $embedding->compare($csv_embedding, $test_embedding);

        say $cmp;
        say $similarity;
        say $similarity_with_other_embedding;

        ## don't change anything about the subroutine
        sub get_api_key {
            return {
                'key' => 'sk-abc123',
            };
        }
        __END__

        Output:

        CODE(0x1cf363a5c48)
        -0.00931772883675172
        -0.00931772883675168

        In the next release, I will put in a more helpful error message that directs the user to check their API key... thanks for discovering this issue :)

        The representation seems expensive to me...

        If we were to store each float as a separate DB field, it would be rather expensive. This is why the AI::Embedding documentation suggests storing the vector as a string in a TEXT field. See the embedding method.
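
        For example, with DBD::SQLite (the table and column names here are just placeholders), storing the comma-separated string returned by the embedding method might look like this sketch:

        use strict;
        use warnings;
        use DBI;

        # Hypothetical SQLite database with a TEXT column for the embedding string
        my $dbh = DBI->connect( 'dbi:SQLite:dbname=embeddings.db', '', '',
                                { RaiseError => 1 } );
        $dbh->do('CREATE TABLE IF NOT EXISTS docs (id INTEGER PRIMARY KEY, body TEXT, embedding TEXT)');

        # $csv_embedding stands in for the string from $embedding->embedding(...)
        my $csv_embedding = '-0.818496,-0.572010,-0.409478';
        $dbh->do( 'INSERT INTO docs (body, embedding) VALUES (?, ?)',
                  undef, 'I demand a shrubbery', $csv_embedding );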

        Because the purpose of an embedding is to compare it with another embedding, the floats that make up the vector are dealt with as a single unit. Generally, we have no reason to access the individual components of the vector.

        The vector comparison is carried out internally with Data::CosineSimilarity.
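
        For intuition, the comparison boils down to cosine similarity over the two lists of floats; here is a rough plain-Perl sketch of the same idea (not the module's internal code):

        use strict;
        use warnings;

        # Cosine similarity of two comma-separated vector strings,
        # the storage format suggested above.
        sub cosine {
            my ( $x_str, $y_str ) = @_;
            my @x = split /,/, $x_str;
            my @y = split /,/, $y_str;
            die "vectors differ in length\n" unless @x == @y;

            my ( $dot, $nx, $ny ) = ( 0, 0, 0 );
            for my $i ( 0 .. $#x ) {
                $dot += $x[$i] * $y[$i];
                $nx  += $x[$i] ** 2;
                $ny  += $y[$i] ** 2;
            }
            return $dot / ( sqrt($nx) * sqrt($ny) );
        }

        print cosine( '1,0,1', '1,1,0' ), "\n";   # prints 0.5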

        This compiles but gets lost in runtime:

        From a cursory look, I'm not sure what to make of that error!

        The problem is in a private method but I cannot recall where it's called from. I shall look into this further over the next few days when I've got a little more time...I'm on domestic duties, making the house look festive right now!

        AI::Embedding is working in a production environment, so I am hopeful there isn't a fundamental problem with the module. But it shouldn't really throw the error you are experiencing.

        Version 1.1 of AI::Embedding is now live on CPAN - update the module and you'll get a more helpful error if the API Key is wrong.

Re: perl hooks for AI
by soonix (Canon) on Nov 13, 2023 at 08:09 UTC
    Question: is "Und es begab sich" a good translation for "And it came to pass"? Where does the former come from?
    • I think so, and
    • both seem to be translations of Luke 2:1 Ἐγένετο - "es begab sich" is what Martin Luther used to translate it.
      I think "sich begeben" is more often used in its other meaning "to go to somewhere".