Re^3: Optimizing a large project.

Smaller subs are indeed easier to read than larger ones. However, it's often the case that the calling code is easier to read if an entire thought is finished by a sub rather than just a portion of a thought.

Having more methods to call to do the same amount of work clearly adds to your call overhead. Don't make a method do several things, but don't be afraid to have it do all of one thing either. If a task takes three fairly simple steps and always takes the same three steps, those should be serialized in the same method. Don't call one method, pass its return value to a second, and then pass that result to a third. Just call one method, have it do all three steps, and work with the one return value.
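As a rough sketch of the three-step case, here is a made-up record parser written both ways. The sub names and the steps themselves (trim, lowercase, split) are assumptions for illustration only, not anything from the project under discussion.

```perl
use strict;
use warnings;

# Hypothetical task: normalizing a raw record always takes the same three
# steps (trim, lowercase, split on commas).

# Fragmented version: three subs, three calls, and two intermediate
# values handed from one call to the next.
sub trim_record  { my ($s) = @_; $s =~ s/^\s+|\s+$//g; return $s }
sub lower_record { my ($s) = @_; return lc $s }
sub split_record { my ($s) = @_; return [ split /\s*,\s*/, $s ] }

# Combined version: one call finishes the whole thought.
sub parse_record {
    my ($s) = @_;
    $s =~ s/^\s+|\s+$//g;                    # step 1: trim
    return [ split /\s*,\s*/, lc($s) ];      # steps 2 and 3: lowercase, split
}

my $raw      = "  Alpha, Beta ,GAMMA  ";
my $chained  = split_record( lower_record( trim_record($raw) ) );
my $combined = parse_record($raw);

print join( '|', @$combined ), "\n";         # prints alpha|beta|gamma
```

Both versions return the same list; the combined one makes one call instead of three and the calling code reads as a single step. Whether the saved call overhead matters is something to measure, for example with the core Benchmark module.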

If you're reusing code because several areas actually use substantially similar logic, then that's good. If you're reusing code that has lots of conditionals in it to make it reusable, then it may be better not to do that. A one-line method that's called by, say, eight different modules as a utility method makes sense if it's exactly the same line needed for each -- but if that's the case then those modules are probably redundant in some other ways, too, if they have that much in common.
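To illustrate the point about conditionals added only for the sake of reuse, here is a small hypothetical formatter. The sub names and option flags are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical "reusable" formatter: one sub for everybody, but every
# caller pays for the flag checks and has to know which flags apply.
sub format_value {
    my ( $v, %opt ) = @_;
    $v = sprintf( '%.2f', $v ) if $opt{money};
    $v = "\$$v"                if $opt{money};
    $v = uc $v                 if $opt{upper};
    $v = "[$v]"                if $opt{bracket};
    return $v;
}

# Alternative: small single-purpose subs with no flags to test.
sub format_money { return sprintf( '$%.2f', $_[0] ) }
sub format_tag   { return '[' . uc( $_[0] ) . ']' }

print format_money(3.5),    "\n";    # prints $3.50
print format_tag('warn'),   "\n";    # prints [WARN]
```

The flag-driven version is not wrong, but it only earns its keep if the callers really share the logic rather than just the sub name.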

Larger buffers often lower accumulated I/O times, but don't spend so much more CPU time managing the buffers that you lose the advantage of having them larger. If you can, consider storing any data being input and output, at least in intermediate steps, in as compact a form as you can.
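A minimal sketch of both ideas, assuming a hypothetical record made of a 32-bit id and a 16-bit count. The record layout, the sub names, and the buffer size are all assumptions; the right buffer size is something to find by measurement.

```perl
use strict;
use warnings;

# Hypothetical record: a 32-bit id and a 16-bit count, stored as 6 packed
# bytes instead of a text line such as "123456,42\n" (10 bytes).
my $FORMAT   = 'N n';                             # big-endian u32, u16
my $REC_SIZE = length( pack( $FORMAT, 0, 0 ) );   # 6 bytes per record
my $BUF_RECS = 4096;    # records per read; an assumed value, tune by measuring

sub write_records {
    my ( $fh, $records ) = @_;
    binmode $fh;
    # Build one string and print once rather than printing per record.
    print {$fh} join( '', map { pack( $FORMAT, @$_ ) } @$records );
    return;
}

sub read_records {
    my ($fh) = @_;
    binmode $fh;
    my ( @out, $buf );
    # Each read pulls in up to $BUF_RECS whole records at once.
    while ( my $got = read( $fh, $buf, $REC_SIZE * $BUF_RECS ) ) {
        # A trailing partial record, if any, is ignored.
        for ( my $off = 0 ; $off + $REC_SIZE <= $got ; $off += $REC_SIZE ) {
            push @out, [ unpack( $FORMAT, substr( $buf, $off, $REC_SIZE ) ) ];
        }
    }
    return \@out;
}

# Usage with in-memory filehandles, so the sketch needs no files on disk.
my @records = ( [ 123456, 42 ], [ 7, 65535 ] );
my $data    = '';
open( my $out, '>', \$data ) or die "open for write: $!";
write_records( $out, \@records );
close $out;

open( my $in, '<', \$data ) or die "open for read: $!";
my $back = read_records($in);
close $in;

printf "%d bytes for %d records\n", length($data), scalar @$back;
```

The unpacking loop is the CPU cost of managing the buffer. If that cost grows faster than the I/O time shrinks, the larger buffer has stopped paying for itself.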

Be sure your main bottleneck isn't thrashing your swap. I've sped some tasks up greatly simply by making sure the machine had adequate memory free at run time.

Don't discount the importance of your environment. Sometimes I/O on a system is slow because you're competing with yourself. A system, for example, that has heavy data access and heavy logging and has its data and its logs on the same disk spindle is begging for the logs to be configured somewhere else. That's generally easy to do on any Unix-type system. Things like disabling or modifying the atime semantics, raising the delay on journal writes, or using a different type of filesystem can make worlds of difference, too.

Since none of us know the specifics of your rather large project, none of us can give anything but generic advice. Test everything before you trust a suggestion to work in your specific case. Sometimes the biggest performance gains are from doing something unexpected because of the peculiarities of a project.
