Every RDBMS supports multiple data types namely numeric, monetary,
character, date(time), binary, Boolean and blobs etc. PostgreSQL supports
special datatypes like Geometric, Network-Address, UUID, XML, JSON, Text Search
types etc.
Out of these 70% of columns are typically character data types. Character
data types are strings of ASCII characters. Upper and Lower case alphabetic
characters are accepted literally. There are three kinds of character data
types as cited:
1. fixed-length,
blank padded - char
2. variable-length
with limit - varchar and
3. variable
unlimited length - text
The storage requirement for a character datatype for a short
string (up to 126 bytes) is 1 byte plus the actual string, which includes the
space padding in the case of character. Longer strings have 4 bytes of overhead
instead of 1 byte. Long strings are compressed by the system automatically, so
the physical requirement on disk might be less. Very long values are also
stored in background tables so that they do not interfere with rapid access to
shorter column values.
BTW, In any case, the longest possible character string that can
be stored is about 1 GB. (The maximum value that will be allowed for n in the
data type declaration is less than that. It wouldn't be useful to change this
because with multibyte character encodings the number of characters and bytes
can be quite different. If you desire to store long strings with no specific
upper limit, use text or character varying without a length specifier, rather
than making up an arbitrary length limit.)
Most importantly interesting behaviour in PostgreSQL is how it
handles these character data types.
“There is no
performance difference among these three data types, apart from increased
storage space when using the blank-padded type, and a few extra CPU cycles to
check the length when storing into a length-constrained column. While character(
n)
has performance advantages in some other
database systems, there is no such advantage in PostgreSQL; in
fact character(
n
)
is usually the slowest of the three because of its
additional storage costs.
In most situations text
or character varying
should be used instead.”
Source: postgresql.org
No comments:
Post a Comment