Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw
 
PerlMonks  

I need a help with this one- even don't know how to title this

by baxy77bax (Deacon)
on Nov 19, 2011 at 20:42 UTC ( [id://939001]=perlquestion: print w/replies, xml ) Need Help??

baxy77bax has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

well here comes a puzzle (at least for me it is). So again I'm not seeking for an exact code just a single or more pointers. also this is (once finished) going to be rewritten in c and that is why I would like to avoiding hashing and regex. (Also feel free, If you have erdge to criticize my posting to this forum, since it is not strictly Perl associated question)

So the problem is as follows :
I have a random number generator R1 that produces from 1 to 4 integers, all spaning from 1-100. Also I have another random number generator R2 that reports one random number from 1-20.
so what i need to do is once the R1 gives me let say 4 numbers i need to store them into array where every position in the array represents one track:

R1 -> 17 1 20 12 $Array = [17,1,20,12] $Array[0] ->first track ...
Then R2 kicks in and generates number 4. After that R1 again produces let say 3 numbers (13 24 21). The problem now is how to efficiently (without hashing) figure out that 24 should go into $array[3]->third track since 20+4=24, and 21 into first track since 17+4=21 and 13 in second or fourth track - it doesn't matter. This process is repeated let say 5 times: so
R1 -> 17 1 20 12 R2 -> 4 R1 -> 21 13 24 4 17 21 1 13 20 24 12 R2-> 6 R1 -> 19 30 31 27 4 6 17 21 27 1 13 19 20 24 30 12 31 R2-> 3 R1 -> 22 34 4 6 3 17 21 27 1 13 19 22 20 24 30 12 31 34 ... ...
And so on ... Has anyone came across this type of problem before and solved it ? So i can deal with linked lists, chaining arrays but not hashing since the number of operations in hashing is too big for such a case. or at least i feel it is, maybe it is not, what do you think? any advices??

thank you

baxy

Update to the sundialsvc4's reply:

Well the rulesare pretty simple as stated. Can I actually make it more simpler, well maybe I don't know, where should i simplify ?
as far as the implementation goes , well if you were in my position you probably would not, given that the person in charge of the project is an old school , really old school type of a guy calling all the shoots and since he is paying me i have no choice then to comply.But other then that , thanks for the advice :)
cheers

UPDATE:

Ok , so what am i working on. It is a search engine that is based on "hiden" Viterbi probabilities (R1 and R2) and domino algorithm (numbers that im trying to match) so there is no extra information that i can give you, that is it. I just replaced those fancy words with simple ones. So what i did is, i replaced the real words with their fingerprints and bow bases on a HVB's am trying to put them back together (ofcourse much faster then regular search engine does since im not dealing with whole words).

PS

Based on the amount of replies, this really looks like is a tough problem without much room to move around.

Thank you

baxy

Replies are listed 'Best First'.
Re: I need a help with this one- even don't know how to title this
by Util (Priest) on Nov 20, 2011 at 01:49 UTC
    • Your diagram is easier to work with if you print it top-to-bottom instead of left-to-right, so I made that change.
    • No hashing needed, as long as it is OK to do linear scans across all your tracks.
    • I used first_index() from List::MoreUtils for clarity; it will be simple to duplicate its functionality in C.
    • I wrote this code with *no* understanding of what you are trying to accomplish. Whether or not I got it right, please enlighten us on the root problem you are trying to solve (or mathematical behavior your are trying to explore, etc).

    Working, tested code, insofar that I grasp the problem:

    #!/usr/bin/env perl use strict; use warnings; use List::MoreUtils qw(first_index); my $duplicate_original_post_data = 1; my $generation_limit = 99; my $r1_count_limit = 4; # 1..4 elements my @r1_keys = 0 .. ($r1_count_limit-1); sub pad { map {$_[$_] || ''} @r1_keys } my ( $R1, $R2 ); if ( not $duplicate_original_post_data ) { $R1 = sub { my $n = 1 + int rand $r1_count_limit; return map { 1 + int rand 100 } ( (0) x $n ); }; $R2 = sub { return 1 + int rand 20 }; } else { my @fake_r1 = ( [ 17, 1, 20, 12 ], [ 13, 24, 21 ], [ 19, 30, 31, 27 ], [ 22, 34 ], ); my @fake_r2 = ( 4, 6, 3 ); $generation_limit = @fake_r1; $R1 = sub { return @{ shift @fake_r1 } }; $R2 = sub { return shift @fake_r2 }; } my $template = "%5s %5s %5s %5s %5s %5s -> %5s %5s %5s %5s\n"; my @last_array = $R1->(); printf $template, qw(Gen R1.1 R1.2 R1.3 R1.4 R2 Trk1 Trk2 Trk3 Trk4); printf $template, 1, pad( @last_array ), 'None', pad( @last_array ); for my $generation ( 2 .. $generation_limit ) { my @r1_array = $R1->(); my $r2 = $R2->(); my @leftover; my @new_array = (0) x $r1_count_limit; for my $r1_num (@r1_array) { my $target = $r1_num - $r2; my $slot_to_use = first_index { $_ != 0 and $_ == $target } @last_array; if ( $slot_to_use == -1 ) { push @leftover, $r1_num; } else { $new_array[ $slot_to_use] = $r1_num; $last_array[$slot_to_use] = 0; } } my @empty_slots = grep { $new_array[$_] == 0 } @r1_keys; while (@leftover) { die if not @empty_slots; $new_array[ shift @empty_slots ] = shift @leftover; } printf $template, $generation, pad( @r1_array ), $r2, pad( @new_array ); @last_array = @new_array; }
    Output:
    Gen R1.1 R1.2 R1.3 R1.4 R2 -> Trk1 Trk2 Trk3 Trk4 1 17 1 20 12 None -> 17 1 20 12 2 13 24 21 4 -> 21 13 24 3 19 30 31 27 6 -> 27 19 30 31 4 22 34 3 -> 22 34

Re: I need a help with this one- even don't know how to title this
by choroba (Cardinal) on Nov 19, 2011 at 22:22 UTC
    Can you clarify some details? Imagine we have this situation:
    10 15 20 5 30 35 40
    Now, R2 gives 10 and R1 gives 50, 25. Should I put 50 into 4th track because of 40, or is 40 already forgotten?

    What is the expected output of the program? Should it just return the new status based on R2 and R1 for each step, or should it build a structure representing the whole process?

      Yes I forgot to mention that, the 40 is forgotten. you always pick from the last array in this example 15 5 35.
      What is the output, well just the updated status after , let say 10 iterations
      sorry, cheers
Re: I need a help with this one- even don't know how to title this
by sundialsvc4 (Abbot) on Nov 19, 2011 at 21:41 UTC

    Then R2 kicks in and generates number 4. After that R1 again produces let say 3 numbers (13 24 21). The problem now is how to efficiently (without hashing) figure out that 24 should go into $array3->third track since 20+4=24, and 21 into first track since 17+4=21 and 13 in second or fourth track - it doesn’t matter.

    Obviously, everything in the algorithm and its implementation depends on knowing what the rules are... and from your description I can’t guess.

    Certainly, I would implement your solution in C++, not C, if it actually comes to that, because in that language you do have ready access to data structures such as queues, balanced-trees and arbitrarily-sized “arrays” (collections ...), all without having to write them from scratch.   (You have better things to do with your time.)   But if your problem really is small, as it superficially appears to be, it seems to me that you don’t have much of a problem here.

    Basically, if you can manage to deal with the problem “in memory,” without getting into page-fault hell, you can actually do things in a very brute-force way and get away with it perfectly.   It is, for example, to say #DEFINE MAX_BUCKETS 10000 and thus allocate a fixed-size 10,000-element array and to have your program (check for, and gracefully and meaningfully terminate itself if ...) that limit is ever someday exceeded.   (As I once read in a macabre bit of source code:   “Dig me up and I’ll fix it then ...”   Ick.)   Given that modern processors execute (at least) hundreds of millions of instructions per second, if you can’t clearly demonstrate that a “sophisticated” solution is mandatory in order to solve the problem in all of the cases that you will encounter ... don’t build one.   Unless you can demonstrate that major page-fault activity might happen, the algorithm is going to execute at the full speed of the CPU ... or the maximum speed of the I/O subsystem, whichever is less.

    (Whether, by the way, it is implemented in C++ or Perl... for all practical intents and purposes.   The coders who put together perlguts are gods...)

    (It goes entirely without saying that if what you are describing is actually some god-awful CPU-intensive operation for some image-processing or cryptographic application or what-have-you, my opinions here would not apply.   But you knew that already, didn’t you?)

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://939001]
Approved by Eliya
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others rifling through the Monastery: (6)
As of 2024-04-18 10:01 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found