PerlMonks  

Re: Re: C vs perl

by abstracts (Hermit)
on Apr 28, 2002 at 08:06 UTC


in reply to Re: C vs perl
in thread C vs perl

I totally agree with samtregar that this code really convinces nobody that you know enough C to compare the 2 languages.

What I don't agree with is using realloc instead of scanning the string twice, since realloc has to copy the string to a new location whenever it cannot extend the block in place. I might be wrong, and it might even be implementation-dependent (what malloc library guarantees giving you the same location if you realloc to a larger size?).
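To make the concern concrete, here is a minimal sketch in C. The helper name `grow_buffer` is hypothetical and not from the original code under discussion. It grows a buffer with realloc and reports whether the block moved. The C standard only says the returned pointer may be the same as or different from the old one, so callers cannot rely on either outcome.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: grow *buf to new_size bytes.
 * Returns 1 if the block was moved (contents were copied),
 *         0 if it was extended in place,
 *        -1 if realloc failed (the old block is still valid). */
int grow_buffer(char **buf, size_t new_size)
{
    assert(buf != NULL && new_size > 0);

    /* Save the old address as an integer: the old pointer value must
     * not be used after a successful realloc. */
    uintptr_t before = (uintptr_t)*buf;

    char *tmp = realloc(*buf, new_size);
    if (tmp == NULL)
        return -1;

    *buf = tmp;
    return (uintptr_t)tmp != before;
}
```

Whether a given call returns 0 or 1 depends on the allocator and on what happens to sit next to the block in the heap, which is the implementation dependence raised above.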

Hope this helps...

Replies are listed 'Best First'.
Re: Re: Re: C vs perl
by samtregar (Abbot) on Apr 28, 2002 at 19:04 UTC
    Do you have a better idea? It's pretty hard to know how much memory to allocate when you don't know how big your results will grow! Perl realloc()s on SVs all the time for just this reason.

    I suppose he could build a linked-list of text blocks and then reassemble them into a single contiguous block at the end. I doubt that would perform better than realloc() though.

    -sam

      One way to do it is to allocate a string of length:
      newlen = (strlen(str) * strlen("</p><p>")) / strlen("\r\n") + strlen("</p><p>") + 1;
      which in this case is about 3.5 times the length of the original string. This is the total number of bytes required in the worst case, where the string consists of nothing but line breaks: $str =~ /^(?:\r\n)*$/. Excess memory can be reclaimed by doing a realloc *after* the substitution.
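A rough sketch of that worst-case strategy in C follows. The function name `crlf_to_p_worst_case` is made up for illustration, and it assumes the goal is to replace every "\r\n" with "</p><p>", as in the substitution being discussed. It allocates the upper bound once, copies in a single pass, and then shrinks the block to the size actually used.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: replace every "\r\n" in str with "</p><p>".
 * Returns a malloc()ed string the caller must free, or NULL on failure. */
char *crlf_to_p_worst_case(const char *str)
{
    static const char from[] = "\r\n";
    static const char to[]   = "</p><p>";
    const size_t from_len = sizeof from - 1;
    const size_t to_len   = sizeof to - 1;

    assert(str != NULL);

    /* Upper bound from the post: every pair of input bytes is a match.
     * (Overflow of the multiplication is not checked in this sketch.) */
    size_t newlen = (strlen(str) * to_len) / from_len + to_len + 1;
    char *out = malloc(newlen);
    if (out == NULL)
        return NULL;

    char *dst = out;
    const char *src = str;
    const char *hit;
    while ((hit = strstr(src, from)) != NULL) {
        size_t n = (size_t)(hit - src);
        memcpy(dst, src, n);          /* text before the match */
        dst += n;
        memcpy(dst, to, to_len);      /* the replacement */
        dst += to_len;
        src = hit + from_len;
    }
    size_t tail = strlen(src);
    memcpy(dst, src, tail + 1);       /* remaining text plus the NUL */
    dst += tail;

    /* Give back the unused part. If shrinking fails, the original
     * (larger) block is still valid, so return that instead. */
    char *fit = realloc(out, (size_t)(dst - out) + 1);
    return fit != NULL ? fit : out;
}
```

For an input with few line breaks, most of the initial allocation goes unused until the final realloc, which is the trade-off this approach accepts to avoid growing the buffer repeatedly.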

      As for perl's internal implementation, it's a different issue as the regex engine must work with any regex given. But even with that in mind, you can still build a linked list of offsets and lengths of the parts of the original string that need to be copied over, as well as another list of substitutions. The required amount of memory should be easy to compute and will require doing a single copy only.

      For this example, this is like doing:

      my $str = "line1\r\nline2\r\n";
      my $result = join '</p><p>', split(/\r\n/, $str);
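The list-of-offsets idea can be sketched in C roughly as below. The names `struct match`, `free_matches` and `crlf_to_p_exact` are hypothetical, and this is only one way to read the suggestion: a first pass records where each "\r\n" starts, the exact output size is computed from the match count, and a second pass does a single copy into a buffer that never needs resizing. Unlike the split/join one-liner, which drops trailing empty fields, this version replaces every occurrence, including a trailing one.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type: one entry per match, holding its offset. */
struct match {
    size_t offset;
    struct match *next;
};

static void free_matches(struct match *m)
{
    while (m != NULL) {
        struct match *next = m->next;
        free(m);
        m = next;
    }
}

/* Replace every "\r\n" with "</p><p>" using an exactly sized buffer.
 * Returns a malloc()ed string the caller must free, or NULL on failure. */
char *crlf_to_p_exact(const char *str)
{
    static const char from[] = "\r\n";
    static const char to[]   = "</p><p>";
    const size_t from_len = sizeof from - 1;
    const size_t to_len   = sizeof to - 1;

    assert(str != NULL);

    /* Pass 1: record the offset of each match in a linked list. */
    struct match *head = NULL;
    struct match **tail = &head;
    size_t count = 0;
    const char *hit = str;
    while ((hit = strstr(hit, from)) != NULL) {
        struct match *m = malloc(sizeof *m);
        if (m == NULL) {
            free_matches(head);
            return NULL;
        }
        m->offset = (size_t)(hit - str);
        m->next = NULL;
        *tail = m;
        tail = &m->next;
        count++;
        hit += from_len;
    }

    /* Exact size: each match grows the text by to_len - from_len bytes. */
    size_t len = strlen(str);
    char *out = malloc(len + count * (to_len - from_len) + 1);
    if (out == NULL) {
        free_matches(head);
        return NULL;
    }

    /* Pass 2: a single copy, alternating original text and replacement. */
    char *dst = out;
    size_t pos = 0;
    for (struct match *m = head; m != NULL; m = m->next) {
        size_t n = m->offset - pos;
        memcpy(dst, str + pos, n);
        dst += n;
        memcpy(dst, to, to_len);
        dst += to_len;
        pos = m->offset + from_len;
    }
    memcpy(dst, str + pos, len - pos + 1);  /* tail plus the NUL */

    free_matches(head);
    return out;
}
```

One malloc per match is not free either; a growable array of offsets, or simply counting matches in the first pass and searching again in the second, would serve the same purpose with fewer allocations.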
        That sure sounds good, but I'm not sure I'd be thrilled with the results of passing this routine 3MB of text and having it malloc() 10MB. The efficacy of malloc()ing far more than you need then hoping that realloc() is cool enough to make it not matter is questionable.

        -sam
