If I had to point a finger at the most significant factor reducing your performance, I would point at your use of system pipes. A pipe has a fixed maximum buffer size, and when that buffer fills, the producer process blocks and is descheduled until the consumer drains it. The same holds on the other end: the Perl program (the consumer) can read at most one pipe buffer's worth of data (often 4096 or 8192 bytes on Unix) before it must block and wait for the producer. For very large amounts of data, this constant rescheduling, combined with other processes on the machine competing for the same resources, makes system pipes inefficient for this job.
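The buffer size differs between systems, so it may be worth measuring it on the machine in question. The following is a minimal sketch, not part of the original discussion: it fills a pipe that nobody reads, using non-blocking writes, and counts how many bytes fit. The helper name pipe_capacity is made up for illustration.

```perl
use strict;
use warnings;
use IO::Handle;

# Hypothetical helper: fill an unread pipe and report how many bytes it holds.
sub pipe_capacity {
    pipe(my $reader, my $writer) or die "pipe failed: $!";
    $writer->blocking(0);    # a full pipe now fails with EAGAIN instead of blocking
    my $chunk = 'x' x 512;   # small writes, so each one is all-or-nothing
    my $total = 0;
    while (1) {
        my $n = syswrite($writer, $chunk);
        if (defined $n) {
            $total += $n;
            next;
        }
        last if $!{EAGAIN} || $!{EWOULDBLOCK};   # buffer is full
        die "syswrite failed: $!";
    }
    close $writer;
    close $reader;
    return $total;
}

printf "pipe capacity: %d bytes\n", pipe_capacity();
```

Whatever number this prints is the most data the producer can get ahead of the consumer before it is forced to wait.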

If this is the case, there are a few alternatives you might consider. The first is to have tcpdump write to a real file rather than a pipe. This would let tcpdump pump data into the system as fast as it can produce it. Ideally, the file would live on a file system that is backed by virtual memory, or that uses deferred writes, so that tcpdump's write() system calls return quickly and the data is immediately available to other processes. Your Perl script would then use a 'read behind' technique: read() from the file until EOF is encountered. At EOF, it should make a system call that gives up the processor to the tcpdump process, such as sched_yield(), or poll() or select() with a short timeout. When the Perl script is scheduled again, it reads until EOF again. This approach gives you an effectively limitless buffer, as opposed to the fixed (and small) buffer that a system pipe provides. Of course, you may need to consider the file growing too large for the file system.
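The 'read behind' loop could be sketched roughly as follows. This is only an illustration under some assumptions: the capture is already being written to a file at a known path, the function names read_available and follow_file are made up, and the rule of stopping after a number of idle polls is a placeholder for whatever shutdown logic you actually need. It uses sysread so that Perl's own buffering does not hide newly appended data, and four-argument select as a sub-second sleep.

```perl
use strict;
use warnings;

# Read everything appended to $fh since the last call.
# Returns the new bytes, or an empty string if we are at EOF for now.
sub read_available {
    my ($fh) = @_;
    my $data = '';
    while (1) {
        my $n = sysread($fh, my $buf, 65536);
        die "sysread failed: $!" unless defined $n;
        last if $n == 0;    # EOF for the moment; the writer may append more later
        $data .= $buf;
    }
    return $data;
}

# Follow a growing file, passing each batch of new bytes to $handler.
# Gives up after $max_idle consecutive empty polls (a hypothetical stop rule).
sub follow_file {
    my ($path, $handler, $max_idle) = @_;
    open(my $fh, '<:raw', $path) or die "cannot open $path: $!";
    my $idle = 0;
    while ($idle < $max_idle) {
        my $data = read_available($fh);
        if (length $data) {
            $idle = 0;
            $handler->($data);
        }
        else {
            $idle++;
            # Sleep 50 ms so the writer gets the CPU.
            select(undef, undef, undef, 0.05);
        }
    }
    close $fh;
    return;
}
```

A call might look like follow_file('/tmp/capture.out', \&process_chunk, 200), where both the path and process_chunk are placeholders. Note that a batch can end in the middle of a record, so the handler has to keep any partial record around until the next batch arrives.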

Otherwise, your only solution would be to hack tcpdump, or to find an alternative to tcpdump, so that it either invokes the Perl code inline or transfers the data more efficiently, for example through a large shared memory segment.


In reply to Re: Very fast reads from an external program by MarkM
in thread Very fast reads from an external program by slifox
