PerlMonks  

Re: Multi threading

by Corion (Patriarch)
on Apr 06, 2009 at 06:46 UTC ( [id://755649] )


in reply to Multi threading

First, get your nomenclature straight. Threads and forked processes are not the same thing, and you shouldn't mix them.

If you are using threads, I recommend you use a Thread::Queue to serialize access to the log, plus one log writer thread that reads from the queue and writes to the log file. This is the easiest way.
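A minimal sketch of that queue-plus-writer arrangement, assuming a perl built with thread support. The helper name `run_logger`, the file name `example.log` and the message format are made up for illustration; only the writer thread ever opens the log.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical helper: start $n_workers threads that each log $per_worker
# lines through one shared queue, drained by a single writer thread.
sub run_logger {
    my ($logfile, $n_workers, $per_worker) = @_;
    my $queue = Thread::Queue->new;

    # The writer is the only thread that ever touches the log file.
    my $writer = threads->create(sub {
        open my $fh, '>>', $logfile or die "Can't open $logfile: $!";
        while (defined(my $line = $queue->dequeue)) {
            print {$fh} $line, "\n";
        }
        close $fh or die "Can't close $logfile: $!";
    });

    # Workers only enqueue complete lines; they never open the log.
    my @workers = map {
        my $id = $_;
        threads->create(sub {
            $queue->enqueue("worker $id: item $_") for 1 .. $per_worker;
        });
    } 1 .. $n_workers;

    $_->join for @workers;
    $queue->enqueue(undef);    # sentinel: tells the writer to stop
    $writer->join;
    return;
}

run_logger('example.log', 4, 3);
```

Because every line passes through the queue as one item, lines from different workers can interleave with each other but never inside one another.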

Another way would be to rely on your operating system making small writes atomic, and to keep each line for the log file shorter than 512 bytes or whatever the write buffer limit of your OS is. Then you shouldn't need to worry about threads mixing their write buffers, as long as you unbuffer the filehandle.
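A sketch of the unbuffered approach, under the assumption stated above that the OS keeps one small write intact. It uses `syswrite` on a handle opened with `O_APPEND`, which bypasses Perl's own buffering altogether (a step further than just turning on autoflush). The helper name `append_line` and the file name are illustrative only.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_APPEND O_CREAT);

# Hypothetical helper: append one complete log line with a single
# unbuffered write. Whether concurrent writers stay unmixed still depends
# on the OS and on the line being short enough.
sub append_line {
    my ($path, $line) = @_;
    sysopen(my $fh, $path, O_WRONLY | O_APPEND | O_CREAT, 0644)
        or die "Can't open $path: $!";
    my $bytes   = $line . "\n";
    my $written = syswrite($fh, $bytes);
    die "Write to $path failed: $!" unless defined $written;
    die "Short write to $path"      unless $written == length $bytes;
    close $fh or die "Can't close $path: $!";
    return $written;
}

append_line('example.log', "pid $$: processed one file");
```

Building the whole line first and handing it over in one call matters here: several separate prints for one line would give other writers a chance to slip in between them.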

Re^2: Multi threading
by sandy1028 (Sexton) on Apr 06, 2009 at 07:17 UTC
    I am using this code to create the threads using fork. It reads all the files in a directory and writes to a log file. The code does create the log file, but some of the files are missed because the writes overlap. How can I avoid this using locks?
    #!/usr/bin/perl
    use IO::File;
    use Parallel::ForkManager;

    # Note: @tmpFiles, $config and $log are not defined in this excerpt.
    my $processor = shift;
    $tc = 5;     # threads
    $fc = 100;   # splits.. each thread should process 100 at a time

    my $pm = Parallel::ForkManager->new($tc + 1);
    $pm->run_on_finish(
        sub {
            my ($pid, $exit_code, $ident) = @_;
            $tmpFiles[$ident] = undef;
        }
    );

    foreach my $i (0 .. $#tmpFiles) {
        # Forks and returns the pid for the child:
        my $pid = $pm->start($i) and next;
        $SIG{INT} = 'DEFAULT';
        my $filename = $tmpFiles[$i]->filename();
        my $fh = IO::File->new("<$filename") or die "Can't open $filename\n";
        # Read tab-separated "dir<TAB>file" lines until end of file
        while (defined(my $line = $fh->getline())) {
            chomp $line;
            my ($dir, $file) = split(/\t/, $line);
            $processor->($dir, $file, $config, $log);
        }
        $pm->finish; # Terminates the child process
    }
      As Corion said, you create processes with fork(), not threads. If you have multiple writers to a file then there are several solutions - none of them particularly good!
      The simplest is to lock the whole file for each writer - but that defeats the object of having multiple writers.
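A minimal sketch of that whole-file lock using Perl's built-in `flock`, with plain `fork` standing in for Parallel::ForkManager. The helper name `log_locked`, the file name and the messages are made up for illustration, and `flock` is advisory: it only helps if every writer takes the lock.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_END);

# Hypothetical helper: append one line while holding an exclusive lock.
sub log_locked {
    my ($path, $line) = @_;
    open my $fh, '>>', $path or die "Can't open $path: $!";
    flock($fh, LOCK_EX) or die "Can't lock $path: $!";
    # Another writer may have appended while we waited for the lock.
    seek($fh, 0, SEEK_END) or die "Can't seek in $path: $!";
    print {$fh} $line, "\n";
    # close flushes the buffer and releases the lock.
    close $fh or die "Can't close $path: $!";
    return;
}

# Three child processes all append to the same file.
for my $id (1 .. 3) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    next if $pid;                      # parent: keep forking
    log_locked('example.log', "child $id (pid $$): line $_") for 1 .. 5;
    exit 0;                            # child is done
}
1 while wait() != -1;                  # parent: reap all children
```

Each writer holds the lock only for the duration of one line, so the processes still do their real work in parallel and only queue up briefly at the log.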
      Another is to allocate specific regions (by byte offset) of the file for each thread/process ensuring that these do not overlap. That avoids locking, but requires some planning and management. Use seek to position each thread/process.
      If you turn off buffering for the write you will reduce the overlap, but probably not get rid of it altogether.
        Can you please tell me how to lock it? Is there any tutorial or example related to this?
      Can you please tell me how I can create a separate log file for each process, so that not all processes write to a single file?
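One possible sketch of per-process log files: put the process ID (`$$`) into the file name, so that no two processes ever share a file and no locking is needed. Plain `fork` is used here to keep the example self-contained; with Parallel::ForkManager the same open would go in the child, after `$pm->start`. The helper name `open_process_log` and the `worker` prefix are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: open a log file that belongs to this process only.
sub open_process_log {
    my ($prefix) = @_;
    my $path = "$prefix.$$.log";       # $$ is the current process ID
    open my $fh, '>>', $path or die "Can't open $path: $!";
    return ($fh, $path);
}

for my $id (1 .. 3) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    next if $pid;                      # parent: keep forking
    # Child: open the log after the fork, so $$ is the child's own pid.
    my ($fh, $path) = open_process_log('worker');
    print {$fh} "child $id: processed item $_\n" for 1 .. 5;
    close $fh or die "Can't close $path: $!";
    exit 0;
}
1 while wait() != -1;                  # parent: reap all children
```

If a single log is still wanted at the end, the parent can concatenate the `worker.*.log` files once all children have finished.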