The bytea data type allows storage of binary strings.
Table 3-6. Binary String Types
Type Name | Storage | Description |
---|---|---|
bytea | 4 bytes plus the actual binary string | Variable (not specifically limited) length binary string |
A binary string is a sequence of octets that does not have either a character set or collation associated with it. Bytea specifically allows storing octets of zero value and other "non-printable" octets.
Octets of certain values must be escaped (but all octet values may be escaped) when used as part of a string literal in an SQL statement. In general, to escape an octet, it is converted into the three-digit octal number equivalent of its decimal octet value, and preceded by two backslashes. Some octet values have alternate escape sequences, as shown in Table 3-7.
Table 3-7. SQL Literal Escaped Octets
Decimal Octet Value | Description | Input Escaped Representation | Example | Printed Result |
---|---|---|---|---|
0 | zero octet | '\\000' | select '\\000'::bytea; | \000 |
39 | single quote | '\\'' or '\\047' | select '\''::bytea; | ' |
92 | backslash | '\\\\' or '\\134' | select '\\\\'::bytea; | \\ |
Note that the result in each of the examples above was exactly one octet in length, even though the output representation of the zero octet and backslash are more than one character. Bytea output octets are also escaped. In general, each "non-printable" octet decimal value is converted into its equivalent three digit octal value, and preceded by one backslash. Most "printable" octets are represented by their standard representation in the client character set. The octet with decimal value 92 (backslash) has a special alternate output representation. Details are in Table 3-8.
Table 3-8. SQL Output Escaped Octets
Decimal Octet Value | Description | Output Escaped Representation | Example | Printed Result |
---|---|---|---|---|
92 | backslash | \\ | select '\\134'::bytea; | \\ |
0 to 31 and 127 to 255 | "non-printable" octets | \### (octal value) | select '\\001'::bytea; | \001 |
32 to 126 | "printable" octets | ASCII representation | select '\\176'::bytea; | ~ |
SQL string literals (input strings) must be preceded with two backslashes due to the fact that they must pass through two parsers in the PostgreSQL backend. The first backslash is interpreted as an escape character by the string-literal parser, and therefore is consumed, leaving the octets that follow. The remaining backslash is recognized by the bytea input function as the prefix of a three digit octal value. For example, a string literal passed to the backend as '\\001' becomes '\001' after passing through the string-literal parser. The '\001' is then sent to the bytea input function, where it is converted to a single octet with a decimal value of 1.
For a similar reason, a backslash must be input as '\\\\' (or '\\134'). The first and third backslashes are interpreted as escape characters by the string-literal parser, and therefore are consumed, leaving two backslashes in the string passed to the bytea input function, which interprets them as representing a single backslash. For example, a string literal passed to the backend as '\\\\' becomes '\\' after passing through the string-literal parser. The '\\' is then sent to the bytea input function, where it is converted to a single octet with a decimal value of 92.
A single quote is a bit different in that it must be input as '\'' (or '\\134'), not as '\\''. This is because, while the literal parser interprets the single quote as a special character, and will consume the single backslash, the bytea input function does not recognize a single quote as a special octet. Therefore a string literal passed to the backend as '\'' becomes ''' after passing through the string-literal parser. The ''' is then sent to the bytea input function, where it is retains its single octet decimal value of 39.
Depending on the front end to PostgreSQL you use, you may have additional work to do in terms of escaping and unescaping bytea strings. For example, you may also have to escape line feeds and carriage returns if your interface automatically translates these. Or you may have to double up on backslashes if the parser for your language or choice also treats them as an escape character.
Bytea provides most of the functionality of the binary string type per SQL99 section 4.3. A comparison of SQL99 Binary Strings and PostgreSQL bytea is presented in Table 3-9.
Table 3-9. Comparison of SQL99 Binary String and PostgreSQL BYTEA types
SQL99 | BYTEA |
---|---|
Name of data type BINARY LARGE OBJECT or BLOB | Name of data type BYTEA |
Sequence of octets that does not have either a character set or collation associated with it. | same |
Described by a binary data type descriptor containing the name of the data type and the maximum length in octets | Described by a binary data type descriptor containing the name of the data type with no specific maximum length |
All binary strings are mutually comparable in accordance with the rules of comparison predicates. | same |
Binary string values can only be compared for equality. | Binary string values can be compared for equality, greater than, greater than or equal, less than, less than or equal |
Operators operating on and returning binary strings include concatenation, substring, overlay, and trim | Operators operating on and returning binary strings include concatenation, substring, and trim. The leading and trailing arguments for trim are not yet implemented. |
Other operators involving binary strings include length, position, and the like predicate | same |
A binary string literal is comprised of an even number of hexadecimal digits, in single quotes, preceded by "X", e.g. X'1a43fe' | A binary string literal is comprised of octets escaped according to the rules shown in Table 3-7 |