Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much
 
PerlMonks  

We have no SPL.

by educated_foo (Vicar)
on May 06, 2002 at 08:32 UTC ( #164248=perlmeditation: print w/replies, xml ) Need Help??

After being party to some pain and suffering over at Heaps, I was reminded of Perl's striking lack of data structures beyond the basic arrays, hashes, and scalars. Sure, it has modules for everything from databases to MIDI, and makes it easy to glue these different domains together in interesting ways. But in terms of bread-and-butter, straight-from-CLR data structures, I think it could be better.

If I were working in C++, I could get these basic building blocks from the STL, LEDA, Boost, etc. Such standard implementations are well-designed, efficient, and well-tested; they let programmers solve problems once and for all, then move on to better things. While a few applications demand more than the standards can provide, and more than a few programmers prefer their own, uneven wheels, most people who want to get the job done quickly and correctly have the tools to do so.

Now consider our Heap. I believe POE has implemented its own priority queues, and List::Priority just hit the shelves last month. None of these is fast enough, general enough, and/or easy enough to use, so each will remain confined to its own niche. Each has its strengths and weaknesses, but I wouldn't be surprised if a good standard implementation could beat them all.

Perl needs a good, tested library of containers and algorithms, all living in a well-known place. There are some gems out there on CPAN now, and probably some more sitting in people's home directories. We need to put them in a common place, test them, tune them, and get them in the standard distribution.

/s

Replies are listed 'Best First'.
Re: We have no SPL.
by moodster (Hermit) on May 06, 2002 at 09:39 UTC
    No we don't.

    I don't need an SPL. I don't think the perl community need an SPL. If we'd wanted a single standard library of modules that've been singled out and promoted as the absolute truth and The Way We Do Things Around Here, we'd have turned to Java in the first place. Java is a great language that has an excellent class library that's getting better for each release. Also, java gives you no choice about what library/classes/modules to use. It's Sun's way or the highway.

    I don't want that to happen to perl. I don't want a standards committee to decide what algorithms I should run. You might say that I have a choice and that I could choose not to go with the standard modules, but that's not how things'd work out in the long run. Eventually, there wouldn't be any other modules, and if I insisted on rolling my own, I'd be derided as a naive newbie. It couldn't happen here, you say? This is already how we treat anyone who admits to writing his own cgi routines (see A "newbies" thoughts on cgi.pm... for the most recent thread).

    Cheers,
    --Moodster

      Perl already enforces bits of the language. E.g., with C++ I can pick to use a regexp library that comes on a UN*X platform, or a C++ regexp library, or (f)lex (for some things), or Henry Spencer's regexp library, or the Perl5 compatible regexp library, or any number of other libraries. What regexp libraries has Perl got? ONE. Can you write another library that will have the same syntax? (Probably) NO. Is the syntax consistent with the rest of the language? NOt really.

      Is this a good thing? Probably yes. Perl is still very much a data munging language; it's an important part of being glue. And regular expressions are exceedingly important there, both in terms of expressibility and in terms of efficient manipulation.

      More of the same

      What containers has Perl got? TWO: arrays and hashes. Want something else? It's not going to be as convenient: no special syntax (stealing syntax from something else is "using other syntax", not having "special syntax"), either less portable (if you use a C or C++ extension) or less efficient (if you extend in Perl), and less familiar to "A Perl Programmer".

      Is this a bad thing? Probably not. Perl culture emphasises other things (although Perl6 might improve some things). But sticking to the standards of a language is important: a Perl program, written in the Perl culture (regexps, CGI.pm, Tk, closures, blessed objects, etc.) is understandable to any good "Perl Programmer". Another Perl program, using less standard features, would be harder to understand. E.g. functional Perl is possible, but none too popular, for this reason.

      If you're worried that a standard data structures library will somehow deeply pervert Perl into Java by mandating the use of specific modules, I think you're missing a lot more than just the point of the "SPL" concept.

      The reason people get berated about writing their own CGI routines is because they do them badly. Not just slightly bad, not just in ways that are minor and forgivable, but in ways that illustrate what a gigantic hairball this whole CGI thing really is. There are bits of code in CGI.pm that are so stunningly stupid that you would be inclined to remove them, but they are there for a reason, probably to work around some limitation or peculiar "feature" of some Web server or browser somewhere. It is, as best as I can tell, the collected knowledge and wisdom on the subject of CGI in Perl, presented in module format.

      Standard data structures are not hairy in this way. Anyone can make a linked list and it will work just fine. You know it is going to work properly because you know exactly what it has to do, and you can test it. To compare, people still aren't sure what CGI is in specific terms.

      What educated_foo was probably suggesting was that although many of these are trivial to implement, why should we all have to roll our own each time we need them? When you roll your own, there is always the risk that you will do it wrong, sometimes because of something simple like an off-by-one error, or a mistakenly reversed logical operator. Nobody is perfect, and I would argue that anyone's relative perfection diminishes as the night grows longer.

      It would really be great if these sort of structures were built into the standard distribution so that they could be used. That way nobody is forcing you to use them, but at least you aren't forced to write your own.
(podmaster) Re: We have no SPL.
by PodMaster (Abbot) on May 06, 2002 at 08:42 UTC
    Ok, what is good c or c++ implementation of a "heap" that is both fast and flexible, and can be easily ported to every platform perl runs on?

    Let's take it, wrap it in XS, and put it on CPAN. The Perl community needn't re-invent any wheels that don't already exist, and are just waiting to be wrapped in XS (btw, i'm an XS newbie, so don't expect me to do it ... i'll compile it if you wish, but the real magic is up to someone, well, up to it ;)

    update: If wrapping in XS is not an option, then hey, I'm all for re-implementing one from scratch ... most of the logic is already layed out...
     

    Look ma', I'm on CPAN.


    ** The Third rule of perl club is a statement of fact: pod is sexy.

      The STL (C++) has priority_queue<T, Sequence, Compare>. But you can't just wrap it up for Perl, because the concepts behind the languages are too different.

      Look at priority_queue: it is specialized for one type T, it uses a vector<T> by default for storage, but can use any other random-accessible Sequence type, and it uses the desired Compare "functor" for comparison (or T's < method by default). In return, it gives you type safety, with the usual C++ guarantees. The compiled code will typically inline the comparisons and the container's code, so it can be very fast.

      There's no good translation of any of this into Perl. A useful Perl container will hold contents of all types. It will not be type safe, and the programmers using it will not expect it to be. Since access methods for containers are not standardized, it will be hard to make it work with another container. And containers and comparisons will probably need to be passed in a painfully slow way.

      It's not that the idea has no merit, nor even that C++ RULEZ 4ND P3RL SUX. Just that STL comes from a very different world.

      That said, wrapping one particular library-based implementation of a priority queue to work on standard Perl SVs would be very interesting, and quite possibly useful, too.

        That said, wrapping one particular library-based implementation of a priority queue to work on standard Perl SVs would be very interesting, and quite possibly useful, too.
        That would be the idea -- something along the lines of a priority_queue<SV*,vector<SV*>,comparison_func> . Making it "nice" would require it being possible to pass a block of Perl code in as the comparison function (a la sort), which completely blows the inlining. Standard comparisons (e.g. string and numeric greater- and less-than) could still be specified as constants passed to the constructor, and implemented in pure C for speed. It would be interesting to see how much faster this would be than the Perl-only versions.

        /s

      The STL runs anywhere GCC runs, and probably some places it doesn't. I suspect Perl is more widely supported than (full) C++, but there's money behind getting C++ compilers up to speed, so I would suspect things will improve. At times I've thought about trying to wrap versions of the containers for SV*'s. I did something similar for an edit distance algorithm I wrote awhile back -- took an algorithm templated on iterator type, and made it work with offsets in a Perl AV. I wrote an extra layer of non-template wrappers around it, then pulled it in with Inline::CPP, and it wasn't so bad. Depending on how the interfaces work out, e.g. how often callbacks are required for comparisons, this may or may not be efficient enough. If not, re-implementing them is not too hard, and can even be educational and enjoyable sometimes.

      A bigger issue may be Perl's funky license. I know Parrot is re-implementing the kitchen sink because of licensing issues.

      /s

Re: We have no SPL.
by perrin (Chancellor) on May 06, 2002 at 15:52 UTC

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlmeditation [id://164248]
Approved by rob_au
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others browsing the Monastery: (7)
As of 2022-12-02 10:51 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    Notices?