Beefy Boxes and Bandwidth Generously Provided by pair Networks
Pathologically Eclectic Rubbish Lister
 
PerlMonks  

Re^2: [OT] Database row width and cargo cult programming

by moritz (Cardinal)
on Apr 06, 2011 at 11:55 UTC ( [id://897742]=note: print w/replies, xml ) Need Help??


in reply to Re: [OT] Database row width and cargo cult programming
in thread [OT] Database row width and cargo cult programming

You got it backwards. ;-) When storing UTF-8 characters in a VARCHAR(255), you may not be able to store 255 characters

That's curious. In sql server 2000 it's indeed the byte length, in postgres it's character length. And I just tried it in postgres, I can indeed store 64 characters with codepoints above 127 in a varchar(64) column (encoding is set to UTF-8).wikipedia says "characters"

Is there any normative source that is explicit about what's "correct"?

The size is the byte length. And 255 takes a single byte to store the length, anything above takes 2 bytes.

My real question was: do current RDBMs actually exploit that use only one byte to store the length?

  • Comment on Re^2: [OT] Database row width and cargo cult programming

Replies are listed 'Best First'.
Re^3: [OT] Database row width and cargo cult programming
by JavaFan (Canon) on Apr 06, 2011 at 13:01 UTC
    Is there any normative source that is explicit about what's "correct"?
    The various SQL standards, I presume. But they may leave it as "implementation defined". Anyway, from the Sybase 15 manuals:
    Use n to specify the number of bytes of storage for char and varchar datatypes. For unichar, use n to specify the number of Unicode characters (the amount of storage allocated is 2 bytes per character). For nchar and nvarchar, n is the number of characters (the amount of storage allocated is n times the number of bytes per characer for the server’s current default character set).
    ...
    Use n to specify the maximum length in characters for the variable-length datatypes, varchar(n), univarchar(n), and nvarchar(n). Data in variable-length columns is stripped of trailing blanks; storage size is the actual length of the data entered. Data in variable-length variables and parameters retains all trailing blanks, but is not padded to the defined length. Character literals are treated as variable-length datatypes.
    ...
    Fixed-length columns tend to take more storage space than variable-length columns, but are accessed somewhat faster. Table 1-17 summarizes the storage requirements of the different character datatypes:
    Table 1-17: Character datatypes
    
    Datatype        Stores                        Bytes of storage
    
    char(n)         Character                     n
    unichar(n)      Unicode character             n*@@unicharsize (@@unicharsize equals 2)
    nchar(n)        National character            n*@@ncharsize
    varchar(n)      Character varying             Actual number of characters entered
    univarchar(n)   Unicode character varying     Actual number of characters * @@unicharsize
    nvarchar(n)     National character varying    Actual number of characters * @@ncharsize
    
    I'm a bit surprised Sybase doesn't need to store the length of a stored varchar. The chapter about storage sizes of datatypes repeats the information above.

    MySql defines their storages sizes here. For varchar, the storage size is defined as L + 1 bytes if column values require 0 – 255 bytes, L + 2 bytes if values may require more than 255 bytes, where L represents the actual length in bytes of a given string value..

      For Sybase ASE, varchar() will be the size of the number of bytes entered. If the server charset is utf8 then the number of characters will be less or equal to the number of bytes. For nvarchar() you have the same behavrior, modulo @@ncharsize.

      Michael

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://897742]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others browsing the Monastery: (2)
As of 2024-04-26 06:02 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found