 
PerlMonks  

Re: Re: filesize module on win32 systems

by meetraz (Hermit)
on Mar 01, 2004 at 22:08 UTC


in reply to Re: filesize module on win32 systems
in thread filesize module on win32 systems

There are caveats, of course:

  • NTFS supports transparent compression of files. There is the "actual" file size, and the "compressed" size. Which does -s return?
  • NTFS supports sparse files. If a large section of a file contains only NUL bytes, the filesystem can save space by not allocating room for them. Here, I would assume -s would return the full size, not the "on disk" size.
  • NTFS supports "alternate streams". A file can have many alternate streams, taking up data that I assume would never be reported by -s.
  • Depending on the file size & cluster size, most files probably take up more space "on disk" than the contents of the file would suggest. Which does -s report here?

This is probably not an exhaustive list; I'm sure I'm forgetting something.
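
For the last caveat, the gap between the byte size and the allocated size is mostly just rounding up to the next cluster boundary. A minimal sketch, assuming a 4096-byte cluster (a common NTFS default, but check your own volume). The name allocated_estimate is made up for this illustration, and the estimate ignores compression, sparse ranges, and very small files that NTFS may store inside the MFT record:

```perl
use strict;
use warnings;

# Hypothetical helper: round a byte size up to a whole number of clusters.
# The cluster size is an assumption here; it is not queried from the volume.
sub allocated_estimate {
    my ($size, $cluster) = @_;
    my $clusters = int(($size + $cluster - 1) / $cluster);
    return $clusters * $cluster;
}

for my $size (0, 1, 4096, 5000) {
    printf "%5d bytes -> %5d bytes allocated (estimate)\n",
        $size, allocated_estimate($size, 4096);
}
```

So a 5000-byte file would occupy two clusters (8192 bytes) under that assumption, even though its byte size is 5000.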

Edited by Chady -- closed ul tag.

Re^3: filesize module on win32 systems
by Aristotle (Chancellor) on Mar 02, 2004 at 10:15 UTC
    • I'll make an educated guess about compressed files later.
    • Most every modern filesystem supports sparse files, and the call underlying -s reports the virtual, not on-disk, size, on most of them. I assume NTFS is no different. Programs have to specifically be "sparse-aware" to deal with such files properly.
    • The size of alternate streams is not reported by the call underlying -s. Again, programs need to be "alternate-stream-aware" to deal with this situation correctly.
    • It reports the byte size of the file, not the size of the clusters it occupies. Unless you are writing a low-level filesystem manipulation/report tool, you shouldn't concern yourself with this distinction in the first place.

    You'll notice that the defaults are such that a program naively copying the contents of one file to another, without taking advanced filesystem features into account, will still work (and that's at the OS level, not the Perl -s level). In light of that trend, my guess is that -s reports the uncompressed size of NTFS compressed files.

    And that's probably all the OP needed, too.
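
The sparse-file point can be checked with a small experiment. This is only a sketch: make_holey_file is a name invented for the illustration, whether the hole is really left unallocated depends on the filesystem, and the "blocks" field of stat is documented as empty on systems that don't provide it (Win32 being one), so the on-disk figure is printed only when it is available. The 512-byte block unit is also an assumption that holds often but not always.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use File::Temp qw(tempfile);

# Hypothetical helper: create a temp file with a hole of $hole_bytes
# followed by one real byte, and return its path.
sub make_holey_file {
    my ($hole_bytes) = @_;
    my ($fh, $path) = tempfile(UNLINK => 1);   # removed at program exit
    binmode $fh;
    seek($fh, $hole_bytes, SEEK_SET) or die "seek failed: $!";
    print {$fh} "x"                  or die "write failed: $!";
    close $fh                        or die "close failed: $!";
    return $path;
}

my $path    = make_holey_file(1024 * 1024);
my $logical = -s $path;                 # virtual size: hole plus one byte
my $blocks  = (stat $path)[12];         # allocated blocks, if the OS reports them

printf "-s reports     : %d bytes\n", $logical;
if (defined $blocks && $blocks ne '') {
    printf "stat blocks say: about %d bytes allocated\n", $blocks * 512;
}
else {
    print "stat blocks are not available on this platform\n";
}
```

Whatever the allocation turns out to be, -s should report the hole plus the one written byte.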

    Makeshifts last the longest.

Re: Re: Re: filesize module on win32 systems
by BrowserUk (Patriarch) on Mar 02, 2004 at 11:40 UTC

    Perl calls the C runtime fstat() to get the information. In the case of MSVC the actual call is _fstati64(), which in turn calls GetFileInformationByHandle(). That returns a structure called BY_HANDLE_FILE_INFORMATION, which contains two DWORD fields (nFileSizeHigh and nFileSizeLow) that together hold the file size. That is the file size as it would be if you read the whole thing into memory, i.e. decompressed, de-sparsed, etc.

    To get the actual on-disk size you would need to call GetCompressedFileSize().
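
A rough sketch of how that could look from Perl. Both Win32 calls hand the size back as two 32-bit halves (nFileSizeHigh and nFileSizeLow in the structure; a return value plus an out-parameter for GetCompressedFileSize), so the first helper just joins them. The second part assumes the Win32::API module from CPAN is installed and uses the ANSI entry point GetCompressedFileSizeA. The names combine_size and compressed_size are made up here, error handling is deliberately thin, and the Windows-only part is skipped elsewhere.

```perl
use strict;
use warnings;

# Join the high and low 32-bit halves of a Win32 file size.
# Multiplication is used instead of a shift so the result stays usable
# on perls built with 32-bit integers (it becomes a double there).
sub combine_size {
    my ($high, $low) = @_;
    return $high * 4294967296 + $low;
}

# Sketch only: ask Windows for the on-disk size of a file.
# Assumption: Win32::API is available; loaded lazily so that this file
# still compiles on other platforms.
sub compressed_size {
    my ($path) = @_;
    require Win32::API;
    my $fn = Win32::API->new('kernel32', 'GetCompressedFileSizeA', 'PP', 'N')
        or die "cannot import GetCompressedFileSizeA: $^E";
    my $high_buf = pack('L', 0);                  # receives the high DWORD
    my $low      = $fn->Call($path, $high_buf) & 0xFFFFFFFF;
    my $high     = unpack('L', $high_buf);
    # A low half of 0xFFFFFFFF can also signal failure; not checked here.
    return combine_size($high, $low);
}

if ($^O eq 'MSWin32' && @ARGV) {
    printf "%s: -s says %.0f bytes, on disk %.0f bytes\n",
        $ARGV[0], (-s $ARGV[0]), compressed_size($ARGV[0]);
}
```

Run it on Windows with a file name as the only argument; for a compressed or sparse file the second number should come out smaller than the first.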


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
