Re: Re: Re: File::Find memory leak

by tachyon (Chancellor)
on Mar 14, 2004 at 22:00 UTC ( [id://336547]=note )


in reply to Re: Re: File::Find memory leak
in thread File::Find memory leak

Actually you *may* need real recursion to do that. You don't have to return the list of files and can certainly process them on the fly. This will of course reduce the size of the in-memory array by orders of magnitude, depending on the file:dir ratio.

However, using this approach, which as you note works fine, you are basically stuck with an array listing *all* the dirs. There is a reason for this. Although it is safe to push while you iterate over an array, it is not safe to shift AFAIK, but I have not extensively tested that. The perl docs *do basically say* don't do *anything* to an array while iterating over it, but it copes fine with push. This makes a certain degree of sense, as all we are doing is adding to the end of an array of pointers and incrementing the last index by 1 with each push. In the loop perl is evidently not caching the end-of-list pointer but must be rechecking it each time.
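
For illustration, here is a minimal sketch of that push-while-iterating pattern. It is not the code from earlier in the thread: `walk_dirs` and its `$on_file` callback are names made up for this example, and it uses an explicit index instead of foreach, so it does not rely on how foreach behaves when the array grows. Files are handed to the callback as they are found, so only the dirs stay in memory.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): walk $root breadth first,
# calling $on_file->($path) for each plain file as it is found.
# Only directory names are kept in memory.
sub walk_dirs {
    my ($root, $on_file) = @_;
    my @dirs = ($root);

    # The size of @dirs is re-read on every pass, so dirs pushed
    # inside the loop are visited on a later pass.
    for (my $i = 0; $i < @dirs; $i++) {
        my $dir = $dirs[$i];
        opendir(my $dh, $dir) or next;    # skip unreadable dirs
        my @entries = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
        closedir $dh;

        for my $name (@entries) {
            my $path = "$dir/$name";
            next if -l $path;             # do not follow symlinks
            if (-d $path) {
                push @dirs, $path;        # grows the array we are looping over
            }
            elsif (-f $path) {
                $on_file->($path);        # process on the fly, keep nothing
            }
        }
    }
    return scalar @dirs;                  # number of dirs visited
}
```

Something like `walk_dirs('/some/root', sub { print "$_[0]\n" })` would print each file as it is found. Note that @dirs never shrinks, which is exactly the "array listing all the dirs" trade-off described above.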

If you shift, then there is an issue: if you are looping from offset N, are at index I, and you move N, then.....

Anyway, a gig of RAM will cope with ~5-10M+ dirs, so it should not be a major issue unless you have very few files per dir.

As the search is breadth first, you could easily batch it up into a series of sub-searches based on 1-2 levels deep if you have serious terabytes.
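
A rough sketch of what that batching might look like, splitting one level deep. The names `subdirs`, `search_batch` and `search_in_batches` are invented for this example, and the `$on_dir` callback is just a placeholder for whatever you do per dir. Each top-level subdir gets its own queue, which is thrown away when that batch finishes, so the peak array size is that of the largest batch and not of the whole tree.

```perl
use strict;
use warnings;

# Hypothetical helper: immediate subdirectories of $dir (symlinks skipped).
sub subdirs {
    my ($dir) = @_;
    opendir(my $dh, $dir) or return;
    my @found = grep { !(-l $_) && -d $_ }
                map  { "$dir/$_" }
                grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;
    return @found;
}

# Hypothetical helper: breadth-first search of one batch. The queue is
# a lexical, so it is freed when the sub returns.
sub search_batch {
    my ($start, $on_dir) = @_;
    my @queue = ($start);
    for (my $i = 0; $i < @queue; $i++) {
        $on_dir->($queue[$i]);
        push @queue, subdirs($queue[$i]);
    }
    return scalar @queue;
}

# Run one sub-search per top-level subdir and report the largest queue
# that any single batch needed.
sub search_in_batches {
    my ($root, $on_dir) = @_;
    $on_dir->($root);
    my $peak = 0;
    for my $top (subdirs($root)) {
        my $n = search_batch($top, $on_dir);
        $peak = $n if $n > $peak;
    }
    return $peak;
}
```

Splitting two levels deep would be the same idea with one more loop over subdirs. How much it helps depends on how evenly the dirs are spread under the top level.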

cheers

tachyon

Replies are listed 'Best First'.
Re: Re: Re: Re: File::Find memory leak
by crabbdean (Pilgrim) on Mar 14, 2004 at 22:35 UTC
