Beefy Boxes and Bandwidth Generously Provided by pair Networks
more useful options
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??

So, how about dealing with the memory issue? I thought the beauty of Haskell was that it was non-strict or lazy. That it dealt with infinite lists. Then why does getContents insist on loading the whole darn file?

I was curious about why this happened, so I tried my own variations to see if I could overcome it. Nothing I tried worked; Hugs ran out of memory with a large file every time. So I did some more searching online and through the Haskell mailing list. There's a thread that talks about the 'wc' program, and gives the reasons why it uses so much memory.

In essence, it's not that Haskell is not being lazy enough and reading in the entire file. The problem is that it's being too lazy, and not evaluating the '+1's in the recursive function call. So it keeps all these '+1's around in memory, waiting to be evaluated until it hits the base case, but you run out of memory before it gets there.

The solution they came up with on the mailing list was to create a data type to hold the stats, and put strictness markers on the data type so that the '+1's would get evaluated immediately. Below is your example converted to that style. The strictness markers are the exclamation points before the parametric types in the data constructor Stats. They tell Haskell to evaluate whatever value is about to be put into that slot in the Stats value being created, instead of keeping the "thunks" around to evaluate later. This version does run in constant memory space with a huge file.

data Stats = Stats !Int !Int !Int wcf :: Stats -> String -> Stats wcf stats [] = stats wcf (Stats cc w lc) (' ' : xs) = wcf (Stats (cc+1) (w+1) (lc ) ) xs wcf (Stats cc w lc) ('\t' : xs) = wcf (Stats (cc+1) (w+1) (lc ) ) xs wcf (Stats cc w lc) ('\n' : xs) = wcf (Stats (cc+1) (w+1) (lc+1) ) xs wcf (Stats cc w lc) ( x : xs) = wcf (Stats (cc+1) (w ) (lc ) ) xs wc :: IO() wc = do putStr ( "Filename: " ) name <- getLine contents <- readFile name let (Stats cc w lc) = wcf (Stats 0 0 0) contents putStrLn ( "The file " ++ name ++ " has " ) putStrLn ( show cc ++ " chars " ) putStrLn ( show w ++ " words " ) putStrLn ( show lc ++ " lines." ) main = wc


In reply to Re^14: World's shortest intro to function programming by kelan
in thread Thread on Joel on software forum : "I hate Perl programmers." by techcode

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others pondering the Monastery: (4)
As of 2024-04-23 22:47 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found