RFC - Linux::TCPServer (new module)

Replies are listed 'Best First'.
Re: RFC - Linux::TCPServer (new module) by dragonchild (Archbishop) on Oct 29, 2005 at 19:04 UTC
I would recommend Net::TCPserver::Linux as well. An important item to document will be what's so cool about yours that a pure-perl solution doesn't do. If it's speed, I'd include benchmarks.
My criteria for good software: Does it work? Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
Re^2: RFC - Linux::TCPServer (new module) by ph713 (Pilgrim) on Oct 29, 2005 at 19:30 UTC
There's some commentary in the .pod on how the code takes advantage of mmap() shared anonymous memory and lockless IPC for efficiency gains, but you're right, some good benchmarking versus, say, Net::Server::PreFork would be nice to have in there. I'll have to write up something to do the testing with.
Update: It looks like Siege will be good for doing the testing. I'm writing a test script that will send basic HTTP/1.0 responses to Siege's benchmark client and run under either Linux::TCPServer or Net::Server::PreFork; we'll see how it fares.
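To make the mmap() shared anonymous memory idea concrete, here is a minimal C sketch. It is not the module's actual code, and it assumes Linux with glibc. The names `scoreboard_create`, `scoreboard_demo`, and `NUM_CHILDREN` are hypothetical. Each forked child writes only to its own slot of a shared integer array, which is one simple way to report child state without taking a lock.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { NUM_CHILDREN = 4 };       /* hypothetical pool size */

/* Map a zero-filled int array that stays shared across fork(). */
static volatile int *scoreboard_create(size_t slots)
{
    assert(slots > 0);
    void *mem = mmap(NULL, slots * sizeof(int), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    return mem == MAP_FAILED ? NULL : (volatile int *)mem;
}

/* Fork NUM_CHILDREN workers; worker i is the only writer of slot i,
 * so no lock is needed. Returns the sum of all slots as seen by the
 * parent after reaping the children, or -1 on failure. */
static int scoreboard_demo(void)
{
    volatile int *board = scoreboard_create(NUM_CHILDREN);
    int started = 0, sum = 0;

    if (board == NULL)
        return -1;

    for (int i = 0; i < NUM_CHILDREN; i++) {
        pid_t pid = fork();
        if (pid == 0) {
            board[i] = i + 1;    /* child: publish state in its own slot */
            _exit(0);
        }
        if (pid > 0)
            started++;
    }

    while (wait(NULL) > 0) { }   /* reap every child before reading */

    for (int i = 0; i < NUM_CHILDREN; i++)
        sum += board[i];

    munmap((void *)board, NUM_CHILDREN * sizeof(int));
    return started == NUM_CHILDREN ? sum : -1;
}
```

Calling `scoreboard_demo()` from a `main()` returns the sum of the values the children published. A real server would have the parent poll the array while the children are still running, which needs more care about memory ordering than this sketch takes.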
Re^3: RFC - Linux::TCPServer (new module) by ph713 (Pilgrim) on Oct 30, 2005 at 08:05 UTC
FYI, I have some preliminary results, and it looks like I'm doing about 2-3x the connection handling speed of the pure perl competition, depending on a lot of little variables. The results in the module distribution will of course have to include more details, and I'll leave the benchmarking script in the module too.

Linux::TCPServer - 100 connections per child process:

    siege 2.64
    Preparing 3 concurrent users for battle.
    The server is now under siege.. done.
    Transactions:               15000 hits
    Availability:               100.00 %
    Elapsed time:               6.93 secs
    Data transferred:           62.96 MB
    Response time:              0.00 secs
    Transaction rate:           2164.50 trans/sec
    Throughput:                 9.08 MB/sec
    Concurrency:                2.72
    Successful transactions:    15000
    Failed transactions:        0
    Longest transaction:        0.44
    Shortest transaction:       0.00

Linux::TCPServer - 1000 connections per child process:

    siege 2.64
    Preparing 3 concurrent users for battle.
    The server is now under siege.. done.
    Transactions:               15000 hits
    Availability:               100.00 %
    Elapsed time:               7.64 secs
    Data transferred:           62.96 MB
    Response time:              0.00 secs
    Transaction rate:           1963.35 trans/sec
    Throughput:                 8.24 MB/sec
    Concurrency:                2.82
    Successful transactions:    15000
    Failed transactions:        0
    Longest transaction:        0.71
    Shortest transaction:       0.00

Net::Server::PreFork - 100 connections per child process:

    siege 2.64
    Preparing 3 concurrent users for battle.
    The server is now under siege.. done.
    Transactions:               15000 hits
    Availability:               100.00 %
    Elapsed time:               19.89 secs
    Data transferred:           62.96 MB
    Response time:              0.00 secs
    Transaction rate:           754.15 trans/sec
    Throughput:                 3.17 MB/sec
    Concurrency:                2.87
    Successful transactions:    15000
    Failed transactions:        0
    Longest transaction:        0.75
    Shortest transaction:       0.00

Net::Server::PreFork - 1000 connections per child process:

    siege 2.64
    Preparing 3 concurrent users for battle.
    The server is now under siege.. done.
    Transactions:               15000 hits
    Availability:               100.00 %
    Elapsed time:               14.92 secs
    Data transferred:           62.96 MB
    Response time:              0.00 secs
    Transaction rate:           1005.36 trans/sec
    Throughput:                 4.22 MB/sec
    Concurrency:                2.61
    Successful transactions:    15000
    Failed transactions:        0
    Longest transaction:        1.70
    Shortest transaction:       0.00
Re^3: RFC - Linux::TCPServer (new module) by ph713 (Pilgrim) on Oct 30, 2005 at 05:54 UTC
Artificial benchmarking has proved to be a wise path to go down indeed. It has uncovered some issues where I was leaking a little bit (either PerlIO objects or the perl stack in general, hard to tell which) that weren't apparent in my (rather strenuous, I thought) real-world testing. An update to 0.14 is coming sometime Sunday that moves some of the leaky XS code for converting socket FDs into perl IO objects back into perl, where at least it works correctly. A change in the handling of socket closing is pending too, as my original understanding of the whole orderly TCP shutdown issue was wrong (it turns out to be a very application-protocol-specific thing, so I'll leave that to the module users if they need it).
Re: RFC - Linux::TCPServer (new module) by tirwhan (Abbot) on Oct 30, 2005 at 09:23 UTC
I also think I like Net::TCPServer::Linux best.
Regarding the documentation, I hesitate to say it (because it's great to see someone take the time to actually provide extensive documentation), but it's a bit much ;-). A lot of the information given in the .pod is on the internals of the module, design decisions and general networking background. IMO the module documentation should be there foremost to describe the user interface, the things that are important to the user of the module. I'm not saying that any of the information you give is useless, but perhaps some of it could be broken out into a separate document (or separate sections in the docs). As it is, valuable user information is mixed together with design decisions, and it's a lot to read for someone who is not familiar with the module. YMMV.
The example in the pod also seems a bit long and would IMO be better in a separate script in an `example/` subdirectory. Examples in the pod should be general-purpose and preferably only show how the module itself works (and not include external database connections and the like). Again, YMMV, and I'll be interested to read what others think of this.
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. -- Brian W. Kernighan
Re^2: RFC - Linux::TCPServer (new module) by ph713 (Pilgrim) on Oct 30, 2005 at 09:43 UTC
Yeah, you're probably right about all of that. The module was written out of personal need, so I guess I've always been coming at it as a user, and thinking in terms of how to best write code that uses the module when I'm writing the documentation. I think I will still keep most of the extraneous detailed stuff in another pod somewhere though. Basically, if your project really needs this module, chances are high you're going to want to know all of that stuff if you don't already (the network stuff, that is).
Actually, it's not worth making a separate pod just to rehash socket-related manpages; curious users can figure that out for themselves. The implementation details probably don't matter so much, so I may just move them over to comments in the C code. I think I'll just kill the example in the docs and tell people to read the lib/Net/TCPServer/Linux.pm source for the example. It's basically the same thing minus the database stuff, and it's the default callback implementation that gets used by the test script, etc.
ETA: That exact BWK quote has been in the back of my mind for the past couple of weeks, because I know I've been too clever for my own good in numerous places in the C source of this module. :)
Re: RFC - Linux::TCPServer (new module) by ph713 (Pilgrim) on Nov 01, 2005 at 02:52 UTC
Thanks for the input/testing, all. I've decided to hold off publishing this to CPAN for now (and let this thread die out), pending the name change for sure, and potentially a couple of other big changes which affect the name change.
First, I'm thinking perhaps it's best to change the interface of the module so that it is 99% compatible with the interface of the pure perl Net::Server::* module hierarchy, and to publish it into that space as Net::Server::Linux. I don't think it would be so bad to intrude on that space without inheriting from the existing modules, as long as I provide a nearly identical module interface to the user. And since I currently only support TCP, but there's not a really "clean" way to put "TCP" in the name in that namespace, perhaps I should just go ahead and find an intelligent way to support UDP as well, so that the protocol-agnostic "Net::Server::Linux" can be a little more appropriate. (Or at the very least, document that future support for UDP is planned within this module.)
So in all likelihood, this will end up emerging as Net::Server::Linux with a heavily changed module interface and potentially UDP support sometime in the near future. Once again, thanks for all the input, it's been very helpful. And if you have a use for the existing incarnation of the module, feel free to use it as is for now; just beware that the name and the interface will change shortly.
Re^2: RFC - Linux::TCPServer (new module) by Anonymous Monk on Nov 02, 2005 at 17:31 UTC
Make a general purpose IO::Epoll based server :-)
Re^3: RFC - Linux::TCPServer (new module) by ph713 (Pilgrim) on Nov 06, 2005 at 02:44 UTC
Actually I didn't use epoll() for Linux::TCPServer. If I were making a slightly more flexible solution I would have, though. For the most part, the significant speed and efficiency gains I got (I still use this module as-is for now in some other proprietary code I'm developing) came primarily from three things:

1. Single-port-ness. I only needed to listen on a single TCP port, and quite frankly I think most people using such a module are in the same boat. Single-port can be implemented much more efficiently than multi-port, because you can (at least on Linux, probably most others) just block on accept() in all the children on a shared socket the parent established before the forks, without locking anything. It's a big win, and could be done as easily in perl as it was in C.

2. Doing this stuff in perl is just inherently an inefficient proposition, whereas doing it in C is inherently efficient if done right. One of my "lessons learned" from this experience is that apparently not many developers of perl and/or perl modules really look at the net effect of what they're doing with tools like strace. Coming from C-land, if you were to take a server like Net::Server::PreFork, run a TCP service over it and strace it, you would be shocked. And I'm not trying to pick on Net::Server::PreFork specifically; a lot of perl modules are like that. IO::Socket::INET is relevant and easier to pick on, although Linux::TCPServer doesn't handle replacing it (in my proprietary code, however, I did). You'd be amazed at the system call waste in IO::Socket::INET for a simple TCP connection. It does a real getpeername() system call twice every time it receives a packet (or was it sends, I can't remember now, it was a while back), for instance. Or the fact that it actually goes and reads /etc/protocols two to three times in a row on every socket object creation to figure out that TCP is protocol number 6 (which hasn't changed in, like, decades, on any operating system). This includes incoming socket objects in a server off of an accept() call. If you process 50 connections a second, you're going to open /etc/protocols and scan it for "tcp" 100+ times a second. You don't notice this stuff in small simple applications, but when you're processing large volumes of network connections, it adds up. Some of this only manifests as a result of sloppy, overly generic module coding combined with handling the leaky abstractions of perl, combined with attributes of the local system's C library and whatnot. But it's important to look at the big picture for major platforms, and Linux/glibc is definitely a major platform for perl.

3. Efficient direct access to shared memory to track and update child state. There are times (like this one) where you know that a very efficient solution for a problem can be made by creating a real shared memory array of integers and indexing it directly from multiple threads of execution. Perl doesn't offer any clean API for this, although a semi-portable XS module could offer it with only a slight efficiency loss. "use threads" + "my @array : shared" doesn't even come close to realizing this, as anyone more familiar than me with perl internals knows. I think Net::Server::PreFork ends up communicating over a socket to its children, for example, because that's about the best you can hope for in generic platform-abstract perl-land.

So in summary, what C had to offer over pure perl in this case was that it didn't egregiously waste system calls and disk I/O for the sake of ease of abstraction, and it allowed me direct access to real hardware shared memory arrays (which are available on many, many platforms, probably most that perl can run on). But epoll() had nothing to do with it, really.

UPDATE: Having written that and now reflected on the issues further as a result, one of the key efficiency problems for the perl socket infrastructure in general is the attempt to abstract all "sockets" to look alike. Just because they are all "sockets" at some API level does not mean that it's a good idea to abstract all sockets together into a single class hierarchy, or to treat them the same within perl itself. The world would be a better place if tcp, udp, raw ip, unix, and any other distinct flavor of socket were distinct types in the core perl code, and if modules were written separately and specifically for each protocol. Doing it "right" for all of them in one generic chunk of code is damn near impossible. On top of that, the choice between udp, tcp, unix, raw ip, etc. is a very big design decision for any socket user. You cannot arbitrarily switch socket types without rethinking and re-coding everything you do anyways, unless you're inviting bad design to begin with. Therefore there's not much gain from the abstraction. We have a case here of N things abstracted into a single interface which exhibit wildly different characteristics. Those characteristics always matter to the application at hand, and they matter in terms of the libc/kernel API code on the bottom side of perl.
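The single-port approach described above (a listening socket created by the parent before the forks, with every child blocking in accept() and no lock around it) can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not the Linux::TCPServer source. The names `listen_loopback`, `worker_loop`, `prefork_demo`, `WORKERS`, and `CLIENTS` are hypothetical, and the demo binds to loopback on a kernel-chosen port so it can act as its own client.

```c
#define _DEFAULT_SOURCE
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { WORKERS = 3, CLIENTS = 6 };   /* hypothetical sizes for the demo */

/* Listen on 127.0.0.1 with a kernel-chosen port; the bound address is
 * written back into *addr so a client can connect to it. */
static int listen_loopback(struct sockaddr_in *addr)
{
    socklen_t len = sizeof(*addr);
    int fd;

    assert(addr != NULL);
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = 0;
    if (bind(fd, (struct sockaddr *)addr, sizeof(*addr)) < 0 ||
        listen(fd, 16) < 0 ||
        getsockname(fd, (struct sockaddr *)addr, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Child: block in accept() on the inherited socket. The kernel hands
 * each connection to one waiting child, so no user-space lock is taken. */
static void worker_loop(int listen_fd)
{
    for (;;) {
        int conn = accept(listen_fd, NULL, NULL);
        if (conn < 0) {
            if (errno == EINTR)
                continue;
            _exit(1);
        }
        send(conn, "ok", 2, MSG_NOSIGNAL);
        close(conn);
    }
}

/* Pre-fork WORKERS children, connect CLIENTS times as a client, then
 * stop the children. Returns how many connections were answered. */
static int prefork_demo(void)
{
    struct sockaddr_in addr;
    pid_t kids[WORKERS];
    int nkids = 0, served = 0;
    int listen_fd = listen_loopback(&addr);

    if (listen_fd < 0)
        return -1;
    for (int i = 0; i < WORKERS; i++) {
        pid_t pid = fork();
        if (pid == 0)
            worker_loop(listen_fd);          /* never returns */
        if (pid > 0)
            kids[nkids++] = pid;
    }
    for (int i = 0; i < CLIENTS && nkids > 0; i++) {
        char buf[2];
        int c = socket(AF_INET, SOCK_STREAM, 0);
        if (c < 0)
            continue;
        if (connect(c, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
            recv(c, buf, sizeof(buf), MSG_WAITALL) == 2 &&
            memcmp(buf, "ok", 2) == 0)
            served++;
        close(c);
    }
    for (int i = 0; i < nkids; i++) {
        kill(kids[i], SIGKILL);
        waitpid(kids[i], NULL, 0);
    }
    close(listen_fd);
    return served;
}
```

Calling `prefork_demo()` from a `main()` returns the number of client connections that received a reply. A production server would add a per-child connection limit and respawning, which this sketch leaves out.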
Re: RFC - Linux::TCPServer (new module) by CountZero (Bishop) on Oct 30, 2005 at 16:30 UTC
Since it will run on Linux only, it seems to me that it should be called `Linux::Net::TCPServer`. Modules that run on Windows only have their own `Win32::xxx` namespace.
CountZero
"If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law
Re^2: RFC - Linux::TCPServer (new module) by jdporter (Paladin) on Oct 31, 2005 at 05:36 UTC
No; they're named `Win32::...` because they're for interfacing to the Windows system itself, or some subsystem of it.
We're building the house of the future together.
Re^3: RFC - Linux::TCPServer (new module) by ph713 (Pilgrim) on Oct 31, 2005 at 06:28 UTC
I guess that gets at the heart of the matter. Just to be clear then, the standard is supposed to be that if the purpose of the module is to expose an OS/platform-specific interface to perl users, then it belongs in the ^Platform:: namespace, whereas if it implements a generic concept with the internals tailored to work on a certain OS/platform, then it belongs in the generic namespaces with the platform tacked on the end?
Re^4: RFC - Linux::TCPServer (new module) by jdporter (Paladin) on Oct 31, 2005 at 14:44 UTC
Re^2: RFC - Linux::TCPServer (new module) by ph713 (Pilgrim) on Oct 30, 2005 at 17:12 UTC
I was originally of the Linux:: mind too (obviously), but everyone else so far has gone 3-0 in favor of Net::TCPserver::Linux. I was just about to give in and go start changing the name everywhere this morning when your dissenting opinion arrived. Now you've given me an excuse to put it off for at least a few more hours and reconsider it some more :)
Re^3: RFC - Linux::TCPServer (new module) by BrowserUk (Patriarch) on Oct 31, 2005 at 06:03 UTC
Count one more voice for Net::Server::Linux. That way, if someone gets around to adding a Win32 equivalent, it can be Net::Server::Win32, and so on for other platforms. When someone goes looking for a net server, they'll likely find what they are looking for in that namespace regardless of what platform they need it for.
Win32::* (and by extension Linux::*) are (or should be) reserved for stuff that it simply makes no sense to try and port to other platforms.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Re^4: RFC - Linux::TCPServer (new module) by ph713 (Pilgrim) on Oct 31, 2005 at 06:38 UTC

