PerlMonks  

threads::shared vs IPC::Shareable

by Nar (Novice)
on Dec 02, 2015 at 18:23 UTC ( [id://1149196] )

Nar has asked for the wisdom of the Perl Monks concerning the following question:

Monks

Looking for feedback on writing an application. The application consists of two primary parts: a preprocessor and a processor.

Multiple preprocessors will run, each preprocessing a different log file and moving its results into a hash; a single processor will then pull entries from the hash and delete each element after processing it.

Does anyone have feedback on how to implement this? My two thoughts were:
1.) Use threads and threads::shared to share a hash between multiple preprocessors running as threads of the processor
2.) Create individual preprocessor scripts and a processor script, and share the hash via IPC::Shareable
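A minimal sketch of option 1 with threads::shared, assuming hypothetical log lines of the form `datetime message` (the sample data, field layout, and sub names are illustrative, not the poster's actual format):

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Shared hash: preprocessor threads fill it keyed by datetime,
# a single processor drains it afterwards.
my %results :shared;

sub preprocessor {
    my @lines = @_;
    for my $line (@lines) {
        my ($datetime, $msg) = split /\s+/, $line, 2;
        lock(%results);                # serialize writers on the shared hash
        $results{$datetime} = $msg;
    }
}

# Two in-memory "logs" stand in for real files here.
my @log_a = ('2015-12-02T18:23 started', '2015-12-02T18:25 warning');
my @log_b = ('2015-12-02T18:24 checkpoint');

my @workers = (
    threads->create( \&preprocessor, @log_a ),
    threads->create( \&preprocessor, @log_b ),
);
$_->join for @workers;

# Processor: pull keys in datetime order, deleting each after processing.
my @processed;
for my $dt (sort keys %results) {
    push @processed, "$dt " . delete $results{$dt};
}
print "$_\n" for @processed;
```

ISO-style timestamps sort correctly as strings, which is why a plain `sort keys` gives datetime order here.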

Replies are listed 'Best First'.
Re: threads::shared vs IPC::Shareable
by BrowserUk (Patriarch) on Dec 02, 2015 at 19:14 UTC

    There is no way to answer your question on the basis of the (lack of) information you've provided.

    The first questions I have are:

    • Why are you sharing a hash?
    • What are the keys to the hash?
    • Does the information being stored in the value associated with each key come from a single preprocessor; or is it accumulated from multiple preprocessors?

      If it is from a single preprocessor, why does it need to be shared through a hash?

      If it is accumulated from multiple preprocessors; how will the processing app know when any given key is ready for post processing?

    And many more questions arise depending upon the answers to those.

    Basically, until you describe what information is being passed from the preprocessors to the processor; how that information is keyed; and how it is derived and/or accumulated; any response to your question is an uneducated guess.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: threads::shared vs IPC::Shareable
by james28909 (Deacon) on Dec 02, 2015 at 19:40 UTC
    Is there a specific reason you can't handle it all in one script/thread/process instead?
    use strict;
    use warnings;

    my %hash;
    while (<DATA>) {
        my ($key, $value) = split /\s+/, $_;
        $hash{$key} = $value;
    }
    foreach (keys %hash) {
        if ($_ =~ /this|is|a|test/) {
            print "found string \"$_\" and its value is \"$hash{$_}\"\n";
        }
    }
    __DATA__
    this 1
    is 2
    a 3
    test 4

    This would read a logfile and add all of its contents to the hash. Then you can run functions for each key or value. Honestly, I wouldn't even use a hash or an array; I would just loop through the logfile and run functions on each line. I can't comment on threads, as I've never had a situation where I absolutely needed them myself.

    Also, posting some form of code snippet or input data would help as well.
Re: threads::shared vs IPC::Shareable
by neilwatson (Priest) on Dec 02, 2015 at 18:56 UTC
Re: threads::shared vs IPC::Shareable
by Anonymous Monk on Dec 02, 2015 at 18:59 UTC

    What is the purpose of the hash here? Do you need to build some lookup index? Sort or categorize?

    A natural fit for a preprocessor is to function as a filter: keep state, but don't slurp the data. You can pipe it and stream it. But if complicated queries are required, then a database model might be more suitable.
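    The filter idea above can be sketched as follows; the `filter_line` sub, its sequence-number normalization, and the sample lines are all hypothetical stand-ins. In practice the loop would read STDIN so the script can sit in a shell pipeline upstream of the processor:

```perl
use strict;
use warnings;

# Preprocessor-as-filter sketch: keep only small running state and
# stream line by line, so GB-sized logs never have to fit in memory.
my $seen = 0;    # running state: records emitted so far

sub filter_line {
    my ($line) = @_;
    chomp $line;
    $seen++;
    return "$seen\t$line";    # normalized record for the downstream processor
}

# Sample lines stand in for a real log read from STDIN.
my @sample = ( "2015-12-02T18:23 started\n", "2015-12-02T18:24 checkpoint\n" );
print filter_line($_), "\n" for @sample;
```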

      I'd like it in multiple processes due to the size of the logs. Since each preprocessor is processing GBs of logs, having each execute in its own process space is important to take advantage of multiple cores.

      It uses a hash so that, as multiple preprocessors write, the processor can execute in order of datetime.

      Each preprocessor would write to the same hash and the engine would process that hash.
        I'd like it in multiple processes due to the size of the logs. Since each preprocessor is processing GBs of logs, having each execute in its own process space is important to take advantage of multiple cores.

        Where/what type of storage do these log files exist on? Because running multiple concurrent file readers can slow things down rather than speed them up if you aren't very careful.


Node Type: perlquestion [id://1149196]
Approved by kevbot