Writing to a file

jalebie has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
(Ovid) Re: Writing to a file by Ovid (Cardinal) on Aug 20, 2001 at 23:35 UTC
Well, it's tough to know exactly how to do that, since we don't know much about the scripts or how they're writing to the file. How about having them write to separate files and then `cat` them together when you're done? If you time- and date-stamp the log entries, you could write a Perl program to sort and combine them for you. Cheers, Ovid Vote for paco!
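A minimal sketch of the sort-and-combine step suggested above. It assumes every log line begins with a timestamp of the form "YYYY-MM-DD HH:MM:SS" (the thread never specifies a format), and merge_logs() is a hypothetical helper name. It reads everything into memory, so it illustrates the idea rather than a tuned solution for very large logs.

```perl
use strict;
use warnings;

# Read every line from the given log files and return them in timestamp order.
# Assumption: each line starts with "YYYY-MM-DD HH:MM:SS", so comparing that
# 19-character prefix as a string is the same as comparing chronologically.
sub merge_logs {
    my @files = @_;
    my @lines;
    for my $file (@files) {
        open(my $fh, '<', $file) or die "Cannot open $file: $!";
        push @lines, <$fh>;
        close($fh);
    }
    return sort { substr($a, 0, 19) cmp substr($b, 0, 19) } @lines;
}

# Usage (file names are placeholders):
#   perl merge_logs.pl host1.log host2.log > combined.log
print merge_logs(@ARGV);
```

With a zero-padded, year-first timestamp no date parsing is needed; any other format would have to be converted before comparing.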
Re: (Ovid) Re: Writing to a file by jalebie (Acolyte) on Aug 20, 2001 at 23:50 UTC
The problem with all your solutions is that they want me to write to multiple files and combine them. The only problem with that is that myscript.prl is actually being called locally on different workstations by `system("rshto $wks myscript.prl >> $tmp_file");` and we have over 200,000 workstations here where the script is supposed to run. I thought about writing to different files too, but the sheer number of temp log files generated makes this impractical, and extra code is also needed to put these files back together by date/time stamp and then unlink("$tmp_file"). I was wondering if there is a way in Perl to know whether the file is currently being written to, and if it is, to wait until no other process is writing to it.
Re: Re: (Ovid) Re: Writing to a file by THRAK (Monk) on Aug 21, 2001 at 00:25 UTC
Within "myscript.prl", instead of just printing and capturing STDOUT to $tmp_file, open the file and write to it directly. You will want to check out flock, which may help prevent the overwriting problem. Still, if you have hundreds of thousands of processes/machines all trying to write to the same file, you are creating a huge bottleneck. What about running the command on each machine as you appear to want to do, but writing the output to a local temporary accumulation file? Then either retrieve each one, or send them to a common queue (on a periodic basis) where a second process can collate them into this one behemoth file you desire. Just a thought. -THRAK www.polarlava.com
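A rough sketch of the open-and-flock approach, for anyone unsure how the call is used. append_locked() and the log path are made-up names for illustration. Note that flock is advisory (it only helps if every writer takes the lock) and that it may not behave reliably on network filesystems, so treat this as a starting point, not a guaranteed fix.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_END);
use File::Spec;

# Append one line to a shared log while holding an exclusive lock.
# append_locked() is a hypothetical helper, not part of any module.
sub append_locked {
    my ($path, $line) = @_;
    open(my $fh, '>>', $path) or die "Cannot open $path: $!";
    flock($fh, LOCK_EX)       or die "Cannot lock $path: $!";  # blocks until the lock is free
    seek($fh, 0, SEEK_END)    or die "Cannot seek $path: $!";  # others may have appended while we waited
    print {$fh} $line, "\n"   or die "Cannot write $path: $!";
    close($fh)                or die "Cannot close $path: $!"; # flushes and releases the lock
    return 1;
}

# Placeholder log location; a real script would use the shared file's path.
my $log = File::Spec->catfile(File::Spec->tmpdir, "combined_$$.log");
append_locked($log, scalar(localtime) . " [$$] finished");
```

Because flock with LOCK_EX blocks until the lock is available, the "wait until no other process is writing" part comes for free, at the cost of serializing every writer on that one file.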
Re: Re: (Ovid) Re: Writing to a file by dragonchild (Archbishop) on Aug 21, 2001 at 00:24 UTC
There is flock, which would lock the file. But each process has to check what the flock status is, and I'm not very conversant with how that works. Now, what you're saying is that you're going to run this script on separate workstations. Why not just run it, store the logfile locally, and then have another script gather all the data together? ------ /me wants to be the brightest bulb in the chandelier! Vote paco for President!
Re: Re: Re: (Ovid) Re: Writing to a file by swngnmonk (Pilgrim) on Aug 21, 2001 at 01:20 UTC
Re: Re: (Ovid) Re: Writing to a file by OzzyOsbourne (Chaplain) on Aug 22, 2001 at 21:54 UTC
Couldn't you use the users' home directories as the location for the temp files, and then run one script to comb the home directories and combine them all into a master file? -OzzyOsbourne
Re: Writing to a file by Cine (Friar) on Aug 20, 2001 at 23:30 UTC
Output to $tmp_file$$ instead and then join the temp files afterwards if necessary. $$ is the current PID, if you were unsure. Update: Oops... $$ in this case is always the same :( Use `system("sh -c 'myscript.prl $id >> $tmp_file\$\$' &");` instead, so that the spawned shell expands $$ to its own PID. T I M T O W T D I
Re: Re: Writing to a file by maverick (Curate) on Aug 20, 2001 at 23:35 UTC
This wouldn't work because $$ would be the same for every instance of the system call. Since the code is using a for loop, try using the counter variable to uniquely name the files: `system("myscript.prl $id > $tmp_file$i");` /\/\averick perl -l -e "eval pack('h*','072796e6470272f2c5f2c5166756279636b672');"
Re: Re: Writing to a file by jalebie (Acolyte) on Aug 20, 2001 at 23:33 UTC
The problem with that is that I am planning to run the for loop well over 100,000 times, which would generate over 100,000 temp files.
Re: Re: Re: Writing to a file by Cine (Friar) on Aug 20, 2001 at 23:38 UTC
Can't you change your called script to use Sys::Syslog? That should solve your problem... T I M T O W T D I
Re: Re: Re: Writing to a file by Cine (Friar) on Aug 20, 2001 at 23:46 UTC
You are going to run into problems with open filehandles and other resources if you are planning on starting 100k processes at once... Perhaps you should just start 10-100 at a time, wait for them to finish, and then continue... T I M T O W T D I
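One way the "a few at a time" idea could look, as a sketch only. run_throttled() is a hypothetical helper, and the demo jobs just run the current perl with a tiny one-liner; in the original setup each job would presumably be something like ['rshto', $wks, 'myscript.prl'] instead.

```perl
use strict;
use warnings;

# Run each command in its own child process, with at most $max running at once.
# Returns how many children exited with a non-zero status.
# run_throttled() is a hypothetical name used for illustration.
sub run_throttled {
    my ($max, @commands) = @_;
    my $running = 0;
    my $failed  = 0;
    for my $cmd (@commands) {
        if ($running >= $max) {            # pool is full: wait for any one child
            my $done = wait();
            $failed++ if $done > 0 && $? != 0;
            $running--;
        }
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {                   # child: become the command
            exec(@$cmd) or exit 127;
        }
        $running++;
    }
    while (wait() > 0) {                   # drain the children still running
        $failed++ if $? != 0;
    }
    return $failed;
}

# Demo: six throwaway jobs, three at a time; odd-numbered jobs exit non-zero.
my @jobs   = map { [ $^X, '-e', "exit($_ % 2)" ] } 1 .. 6;
my $failed = run_throttled(3, @jobs);
print "$failed of ", scalar(@jobs), " jobs failed\n";
```

The pool size is a tuning knob: somewhere in the suggested 10-100 range keeps the process table and open filehandles bounded while still overlapping the slow remote calls.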
Re: Writing to a file by Dragonfly (Priest) on Aug 21, 2001 at 06:15 UTC
Instead of using a flat log file and trying to deal with the locking/overwriting problems that entails, have you considered logging these entries into a free, solid database engine that supports row-level locking, such as PostgreSQL? This approach might also have the side benefit of letting you find a way around the possibility of running out of space in your process table. And you could write simple modules that could then index the log entries and sort them by date or machine or what-have-you afterwards. Probably not exactly what you're looking for, but it's a thought. =}

