Package org.python.modules
Class _codecs
java.lang.Object
org.python.modules._codecs
This class corresponds to the Python _codecs module, which in turn lends its functions to the
codecs module (in Lib/codecs.py). It exposes the implementing functions of several codec families
called out in the Python codecs library Lib/encodings/*.py, where it is usually claimed that they
are bound "as C functions". Here "C" is best read as "compiled" rather than as implying
dependence on a particular implementation language. Actual transcoding methods often come from
the related codecs class.
-
Nested Class Summary
Nested Classes
Modifier and Type   Class
    Description
static class
    Optimized charmap encoder mapping.
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type   Method
    Description
static PyTuple ascii_decode(String str)
static PyTuple ascii_decode(String str, String errors)
static PyTuple ascii_encode(String str)
static PyTuple ascii_encode(String str, String errors)
static PyObject charmap_build(PyUnicode map)
static PyTuple charmap_decode(String bytes)
    Equivalent to charmap_decode(bytes, null, null).
static PyTuple charmap_decode(String bytes, String errors)
    Equivalent to charmap_decode(bytes, errors, null).
static PyTuple charmap_decode(String bytes, String errors, PyObject mapping)
    Decode a sequence of bytes into Unicode characters via a mapping supplied as a container to be indexed by the byte values (as unsigned integers).
static PyTuple charmap_decode(String bytes, String errors, PyObject mapping, boolean ignoreUnmapped)
    Decode a sequence of bytes into Unicode characters via a mapping supplied as a container to be indexed by the byte values (as unsigned integers).
static PyTuple charmap_encode(String str)
    Equivalent to charmap_encode(str, null, null).
static PyTuple charmap_encode(String str, String errors)
    Equivalent to charmap_encode(str, errors, null).
static PyTuple charmap_encode(String str, String errors, PyObject mapping)
    Encoder based on an optional character mapping.
static PyObject decode(bytes)
    Decode bytes using the system default encoding (see codecs.getDefaultEncoding()).
static PyObject decode(bytes, encoding)
    Decode bytes using the codec registered for the encoding.
static PyObject decode(bytes, encoding, errors)
    Decode bytes using the codec registered for the encoding.
static PyString encode(unicode)
    Encode unicode using the system default encoding (see codecs.getDefaultEncoding()).
static PyString encode(unicode, encoding)
    Encode unicode using the codec registered for the encoding.
static PyString encode(unicode, encoding, errors)
    Encode unicode using the codec registered for the encoding.
static String encode_UTF16(String str, String errors, int byteorder)
static PyTuple escape_decode(String str)
static PyTuple escape_decode(String str, String errors)
static PyTuple escape_encode(String str)
static PyTuple escape_encode(String str, String errors)
static PyTuple latin_1_decode(String str)
static PyTuple latin_1_decode(String str, String errors)
static PyTuple latin_1_encode(String str)
static PyTuple latin_1_encode(String str, String errors)
static PyTuple lookup
static PyObject lookup_error(PyString handlerName)
static PyTuple raw_unicode_escape_decode
static PyTuple raw_unicode_escape_decode(String str, String errors)
static PyTuple raw_unicode_escape_encode
static PyTuple raw_unicode_escape_encode(String str, String errors)
static void register
static void register_error(String name, PyObject errorHandler)
static PyObject translateCharmap(PyUnicode str, String errors, PyObject mapping)
static PyTuple unicode_escape_decode
static PyTuple unicode_escape_decode(String str, String errors)
static PyTuple unicode_escape_encode
static PyTuple unicode_escape_encode(String str, String errors)
static PyTuple unicode_internal_decode(String bytes)
    Deprecated.
static PyTuple unicode_internal_decode(String bytes, String errors)
    Deprecated.
static PyTuple unicode_internal_encode(String unicode)
    Deprecated.
static PyTuple unicode_internal_encode(String unicode, String errors)
    Deprecated.
static PyTuple utf_16_be_decode(String str)
static PyTuple utf_16_be_decode(String str, String errors)
static PyTuple utf_16_be_decode(String str, String errors, boolean final_)
static PyTuple utf_16_be_encode(String str)
static PyTuple utf_16_be_encode(String str, String errors)
static PyTuple utf_16_decode(String str)
static PyTuple utf_16_decode(String str, String errors)
static PyTuple utf_16_decode(String str, String errors, boolean final_)
static PyTuple utf_16_encode(String str)
static PyTuple utf_16_encode(String str, String errors)
static PyTuple utf_16_encode(String str, String errors, int byteorder)
static PyTuple utf_16_ex_decode(String str)
static PyTuple utf_16_ex_decode(String str, String errors)
static PyTuple utf_16_ex_decode(String str, String errors, int byteorder)
static PyTuple utf_16_ex_decode(String str, String errors, int byteorder, boolean final_)
static PyTuple utf_16_le_decode(String str)
static PyTuple utf_16_le_decode(String str, String errors)
static PyTuple utf_16_le_decode(String str, String errors, boolean final_)
static PyTuple utf_16_le_encode(String str)
static PyTuple utf_16_le_encode(String str, String errors)
static PyTuple utf_32_be_decode(String bytes)
    Decode a sequence of bytes representing the UTF-32 big-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed.
static PyTuple utf_32_be_decode(String bytes, String errors)
    Decode a sequence of bytes representing the UTF-32 big-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed.
static PyTuple utf_32_be_decode(String bytes, String errors, boolean isFinal)
    Decode (perhaps partially) a sequence of bytes representing the UTF-32 big-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed.
static PyTuple utf_32_be_encode(String unicode)
    Encode a Unicode Java String as UTF-32 with big-endian byte order.
static PyTuple utf_32_be_encode(String unicode, String errors)
    Encode a Unicode Java String as UTF-32 with big-endian byte order.
static PyTuple utf_32_decode(String bytes)
    Decode (perhaps partially) a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed.
static PyTuple utf_32_decode(String bytes, String errors)
    Decode a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed.
static PyTuple utf_32_decode(String bytes, String errors, boolean isFinal)
    Decode (perhaps partially) a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed.
static PyTuple utf_32_encode(String unicode)
    Encode a Unicode Java String as UTF-32 with byte order mark.
static PyTuple utf_32_encode(String unicode, String errors)
    Encode a Unicode Java String as UTF-32 with byte order mark.
static PyTuple utf_32_encode(String unicode, String errors, int byteorder)
    Encode a Unicode Java String as UTF-32 in specified byte order with byte order mark.
static PyTuple utf_32_ex_decode(String bytes, String errors, int byteorder)
    Decode a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, the amount of input consumed, and the decoding "endianness" used (in the Python -1, 0, +1 convention).
static PyTuple utf_32_ex_decode(String bytes, String errors, int byteorder, boolean isFinal)
    Decode (perhaps partially) a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, the amount of input consumed, and the decoding "endianness" used (in the Python -1, 0, +1 convention).
static PyTuple utf_32_le_decode(String bytes)
    Decode a sequence of bytes representing the UTF-32 little-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed.
static PyTuple utf_32_le_decode(String bytes, String errors)
    Decode a sequence of bytes representing the UTF-32 little-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed.
static PyTuple utf_32_le_decode(String bytes, String errors, boolean isFinal)
    Decode (perhaps partially) a sequence of bytes representing the UTF-32 little-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed.
static PyTuple utf_32_le_encode(String unicode)
    Encode a Unicode Java String as UTF-32 with little-endian byte order.
static PyTuple utf_32_le_encode(String unicode, String errors)
    Encode a Unicode Java String as UTF-32 with little-endian byte order.
static PyTuple utf_7_decode(String bytes)
static PyTuple utf_7_decode(String bytes, String errors)
static PyTuple utf_7_decode(String bytes, String errors, boolean finalFlag)
static PyTuple utf_7_encode(String str)
static PyTuple utf_7_encode(String str, String errors)
static PyTuple utf_8_decode(String str)
static PyTuple utf_8_decode(String str, String errors)
static PyTuple utf_8_decode(String str, String errors, boolean final_)
static PyTuple utf_8_decode(String str, String errors, PyObject final_)
static PyTuple utf_8_encode(String str)
static PyTuple utf_8_encode(String str, String errors)
-
Constructor Details
-
_codecs
public _codecs()
-
-
Method Details
-
register
-
lookup
-
lookup_error
-
register_error
-
decode
Decode bytes using the system default encoding (see codecs.getDefaultEncoding()). Decoding errors raise a ValueError.
Parameters:
bytes - to be decoded
Returns:
Unicode string decoded from bytes
-
decode
Decode bytes using the codec registered for the encoding. The encoding defaults to the system default encoding (see codecs.getDefaultEncoding()). Decoding errors raise a ValueError.
Parameters:
bytes - to be decoded
encoding - name of encoding (to look up in codec registry)
Returns:
Unicode string decoded from bytes
-
decode
Decode bytes using the codec registered for the encoding. The encoding defaults to the system default encoding (see codecs.getDefaultEncoding()). The string errors may name a different error handling policy (built-in or registered with register_error(String, PyObject)). The default error policy is 'strict', meaning that decoding errors raise a ValueError.
Parameters:
bytes - to be decoded
encoding - name of encoding (to look up in codec registry)
errors - error policy name (e.g. "ignore")
Returns:
Unicode string decoded from bytes
-
encode
Encode unicode using the system default encoding (see codecs.getDefaultEncoding()). Encoding errors raise a ValueError.
Parameters:
unicode - string to be encoded
Returns:
bytes object encoding unicode
-
encode
Encode unicode using the codec registered for the encoding. The encoding defaults to the system default encoding (see codecs.getDefaultEncoding()). Encoding errors raise a ValueError.
Parameters:
unicode - string to be encoded
encoding - name of encoding (to look up in codec registry)
Returns:
bytes object encoding unicode
-
encode
Encode unicode using the codec registered for the encoding. The encoding defaults to the system default encoding (see codecs.getDefaultEncoding()). The string errors may name a different error handling policy (built-in or registered with register_error(String, PyObject)). The default error policy is 'strict', meaning that encoding errors raise a ValueError.
Parameters:
unicode - string to be encoded
encoding - name of encoding (to look up in codec registry)
errors - error policy name (e.g. "ignore")
Returns:
bytes object encoding unicode
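The role of the errors argument can be illustrated at the Python level. The following is a minimal sketch written for CPython 3's standard codecs module, on the assumption that Jython's entry points behave like the functions of the same name there. The handler name "hexify" and what it does are hypothetical, chosen only for illustration.

```python
import codecs

# Hypothetical error handler: replace each unencodable character
# with its code point written in angle brackets.
def hexify(exc):
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    replacement = "".join("<%04X>" % ord(c) for c in bad)
    # Return the replacement text and the position at which to resume.
    return (replacement, exc.end)

codecs.register_error("hexify", hexify)

# The default policy is "strict": the error surfaces as a ValueError subclass.
strict_failed = False
try:
    codecs.encode("caf\u00e9", "ascii")
except ValueError:
    strict_failed = True

# A built-in policy and the registered one, selected by name.
ignored = codecs.encode("caf\u00e9", "ascii", "ignore")
custom = codecs.encode("caf\u00e9", "ascii", "hexify")

# Decoding takes the same (data, encoding, errors) arguments.
decoded = codecs.decode(b"caf\xc3\xa9", "utf-8", "strict")
```

The policy is looked up by name at the time of the call, so a handler has to be registered before the first encode or decode that names it.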
-
charmap_build
-
utf_8_decode
-
utf_8_decode
-
utf_8_decode
-
utf_8_decode
-
utf_8_encode
-
utf_8_encode
-
utf_7_decode
-
utf_7_decode
-
utf_7_decode
-
utf_7_encode
-
utf_7_encode
-
escape_decode
-
escape_decode
-
escape_encode
-
escape_encode
-
charmap_decode
Equivalent to charmap_decode(bytes, null, null). This method is here so the error and mapping arguments can be optional at the Python level.
Parameters:
bytes - sequence of bytes to decode
Returns:
decoded string and number of bytes consumed
-
charmap_decode
Equivalent to charmap_decode(bytes, errors, null). This method is here so the mapping argument can be optional at the Python level.
Parameters:
bytes - sequence of bytes to decode
errors - error policy
Returns:
decoded string and number of bytes consumed
-
charmap_decode
Decode a sequence of bytes into Unicode characters via a mapping supplied as a container to be indexed by the byte values (as unsigned integers). If the mapping is null or None, decode with latin-1 (essentially treating bytes as character codes directly).
Parameters:
bytes - sequence of bytes to decode
errors - error policy
mapping - to convert bytes to characters
Returns:
decoded string and number of bytes consumed
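As an illustration of decoding through a mapping indexed by byte value, here is a minimal sketch using CPython 3's codecs.charmap_decode, assuming the Jython method behaves correspondingly. The three-entry mapping is hypothetical.

```python
import codecs

# Hypothetical mapping: byte value (unsigned integer) -> character.
mapping = {0x41: "\u03b1", 0x42: "\u03b2", 0x43: "\u03b3"}

# Each input byte is used as an index into the mapping.
text, consumed = codecs.charmap_decode(b"ABC", "strict", mapping)

# With no mapping, bytes are treated directly as character codes (latin-1).
latin, latin_consumed = codecs.charmap_decode(b"\xe9", "strict", None)

# A byte absent from the mapping is handled by the error policy.
replaced, replaced_consumed = codecs.charmap_decode(b"AZ", "replace", mapping)
```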
-
charmap_decode
public static PyTuple charmap_decode(String bytes, String errors, PyObject mapping, boolean ignoreUnmapped)
Decode a sequence of bytes into Unicode characters via a mapping supplied as a container to be indexed by the byte values (as unsigned integers).
Parameters:
bytes - sequence of bytes to decode
errors - error policy
mapping - to convert bytes to characters
ignoreUnmapped - if true, pass unmapped byte values as character codes [0..256)
Returns:
decoded string and number of bytes consumed
-
translateCharmap
-
charmap_encode
Equivalent to charmap_encode(str, null, null). This method is here so the error and mapping arguments can be optional at the Python level.
Parameters:
str - to be encoded
Returns:
(encoded data, size(str)) as a pair
-
charmap_encode
Equivalent to charmap_encode(str, errors, null). This method is here so the mapping can be optional at the Python level.
Parameters:
str - to be encoded
errors - error policy name (e.g. "ignore")
Returns:
(encoded data, size(str)) as a pair
-
charmap_encode
Encoder based on an optional character mapping. This mapping is either an EncodingMap of 256 entries, or an arbitrary container indexable with integers using __finditem__ and yielding byte strings. If the mapping is null, latin-1 (effectively a mapping of character code to the numerically-equal byte) is used.
Parameters:
str - to be encoded
errors - error policy name (e.g. "ignore")
mapping - from character code to output byte (or string)
Returns:
(encoded data, size(str)) as a pair
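The three kinds of mapping described here (an arbitrary container, none at all, and an optimized map built by charmap_build) can be sketched with CPython 3's codecs module, assuming Jython's methods behave correspondingly. The mapping contents and the remapped table entry are hypothetical choices for this example.

```python
import codecs

# Hypothetical container mapping: character code -> output bytes.
mapping = {0x03B1: b"a", 0x03B2: b"b"}
encoded, consumed = codecs.charmap_encode("\u03b1\u03b2", "strict", mapping)

# With no mapping, each character code becomes the numerically equal byte.
latin, latin_len = codecs.charmap_encode("\u00e9", "strict", None)

# A 256-entry decoding table (latin-1 with byte 0x80 remapped to the euro
# sign, an arbitrary choice) converted into an optimized encoder mapping.
table = "".join(chr(i) for i in range(256))
table = table[:0x80] + "\u20ac" + table[0x81:]
fast_map = codecs.charmap_build(table)
euro, euro_len = codecs.charmap_encode("\u20ac", "strict", fast_map)
```

Building the map once and reusing it is how the charmap-based codecs in Lib/encodings use charmap_build.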
-
ascii_decode
-
ascii_decode
-
ascii_encode
-
ascii_encode
-
latin_1_decode
-
latin_1_decode
-
latin_1_encode
-
latin_1_encode
-
utf_16_encode
-
utf_16_encode
-
utf_16_encode
-
utf_16_le_encode
-
utf_16_le_encode
-
utf_16_be_encode
-
utf_16_be_encode
-
encode_UTF16
-
utf_16_decode
-
utf_16_decode
-
utf_16_decode
-
utf_16_le_decode
-
utf_16_le_decode
-
utf_16_le_decode
-
utf_16_be_decode
-
utf_16_be_decode
-
utf_16_be_decode
-
utf_16_ex_decode
-
utf_16_ex_decode
-
utf_16_ex_decode
-
utf_16_ex_decode
-
utf_32_encode
Encode a Unicode Java String as UTF-32 with byte order mark. (Encoding is in platform byte order, which is big-endian for Java.)
Parameters:
unicode - to be encoded
Returns:
tuple (encoded_bytes, unicode_consumed)
-
utf_32_encode
Encode a Unicode Java String as UTF-32 with byte order mark. (Encoding is in platform byte order, which is big-endian for Java.)
Parameters:
unicode - to be encoded
errors - error policy name or null meaning "strict"
Returns:
tuple (encoded_bytes, unicode_consumed)
-
utf_32_encode
Encode a Unicode Java String as UTF-32 in specified byte order with byte order mark.
Parameters:
unicode - to be encoded
errors - error policy name or null meaning "strict"
byteorder - encoding "endianness" specified (in the Python -1, 0, +1 convention)
Returns:
tuple (encoded_bytes, unicode_consumed)
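The -1, 0, +1 byte order convention can be seen in this minimal sketch, run against CPython 3's codecs.utf_32_encode. It shows CPython's behaviour only: there, 0 selects the host's native order (whereas the text above gives big-endian as the Jython default), and an explicit order produces no mark. Whether Jython matches on those two points is not verified here.

```python
import codecs

text = "A"

# -1 selects little-endian, +1 selects big-endian.
little, n_le = codecs.utf_32_encode(text, "strict", -1)
big, n_be = codecs.utf_32_encode(text, "strict", 1)

# 0 leaves the order unspecified; the output then starts with a byte order mark.
native, n_native = codecs.utf_32_encode(text, "strict", 0)
has_bom = native[:4] in (codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)
```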
-
utf_32_le_encode
Encode a Unicode Java String as UTF-32 with little-endian byte order. No byte-order mark is generated.
Parameters:
unicode - to be encoded
Returns:
tuple (encoded_bytes, unicode_consumed)
-
utf_32_le_encode
Encode a Unicode Java String as UTF-32 with little-endian byte order. No byte-order mark is generated.
Parameters:
unicode - to be encoded
errors - error policy name or null meaning "strict"
Returns:
tuple (encoded_bytes, unicode_consumed)
-
utf_32_be_encode
Encode a Unicode Java String as UTF-32 with big-endian byte order. No byte-order mark is generated.
Parameters:
unicode - to be encoded
Returns:
tuple (encoded_bytes, unicode_consumed)
-
utf_32_be_encode
Encode a Unicode Java String as UTF-32 with big-endian byte order. No byte-order mark is generated.
Parameters:
unicode - to be encoded
errors - error policy name or null meaning "strict"
Returns:
tuple (encoded_bytes, unicode_consumed)
-
utf_32_decode
Decode (perhaps partially) a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. The endianness used will have been deduced from a byte-order mark, if present, or will be big-endian (Java platform default). The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode). It is an error for the input bytes not to form a whole number of valid UTF-32 codes.
Parameters:
bytes - to be decoded (Jython PyString convention)
Returns:
tuple (unicode_result, bytes_consumed)
-
utf_32_decode
Decode a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. The endianness used will have been deduced from a byte-order mark, if present, or will be big-endian (Java platform default). The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode). It is an error for the input bytes not to form a whole number of valid UTF-32 codes.
Parameters:
bytes - to be decoded (Jython PyString convention)
errors - error policy name (e.g. "ignore", "replace")
Returns:
tuple (unicode_result, bytes_consumed)
-
utf_32_decode
Decode (perhaps partially) a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. The endianness used will have been deduced from a byte-order mark, if present, or will be big-endian (Java platform default). The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode).
Parameters:
bytes - to be decoded (Jython PyString convention)
errors - error policy name (e.g. "ignore", "replace")
isFinal - if a "final" call, meaning the input must all be consumed
Returns:
tuple (unicode_result, bytes_consumed)
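The difference a "final" call makes can be sketched as follows, using CPython 3's codecs.utf_32_decode and assuming the Jython method treats the flag the same way. The input is a big-endian mark, one complete code unit, and half of another.

```python
import codecs

# BOM, then "A", then a truncated (2-byte) fragment of a further unit.
data = codecs.BOM_UTF32_BE + b"\x00\x00\x00A" + b"\x00\x00"

# Not final: decode what is complete and report how much input was used.
text, consumed = codecs.utf_32_decode(data, "strict", False)
leftover = data[consumed:]  # the caller keeps this for the next call

# Final: all input must be consumed, so the fragment is now an error.
final_failed = False
try:
    codecs.utf_32_decode(data, "strict", True)
except UnicodeDecodeError:
    final_failed = True
```

The returned count is what lets an incremental decoder carry the unconsumed tail forward and prepend it to the next chunk.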
-
utf_32_le_decode
Decode a sequence of bytes representing the UTF-32 little-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. A (correctly-oriented) byte-order mark will pass as a zero-width non-breaking space. The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode). It is an error for the input bytes not to form a whole number of valid UTF-32 codes.
Parameters:
bytes - to be decoded (Jython PyString convention)
Returns:
tuple (unicode_result, bytes_consumed)
-
utf_32_le_decode
Decode a sequence of bytes representing the UTF-32 little-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. A (correctly-oriented) byte-order mark will pass as a zero-width non-breaking space. The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode). It is an error for the input bytes not to form a whole number of valid UTF-32 codes.
Parameters:
bytes - to be decoded (Jython PyString convention)
errors - error policy name (e.g. "ignore", "replace")
Returns:
tuple (unicode_result, bytes_consumed)
-
utf_32_le_decode
Decode (perhaps partially) a sequence of bytes representing the UTF-32 little-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. A (correctly-oriented) byte-order mark will pass as a zero-width non-breaking space. The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode).
Parameters:
bytes - to be decoded (Jython PyString convention)
errors - error policy name (e.g. "ignore", "replace")
isFinal - if a "final" call, meaning the input must all be consumed
Returns:
tuple (unicode_result, bytes_consumed)
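The remark that a byte-order mark "will pass as a zero-width non-breaking space" can be checked with a short sketch. It uses CPython 3's codecs module and assumes the Jython methods agree: the fixed-endianness decoder keeps the mark as U+FEFF, while the generic decoder consumes it.

```python
import codecs

payload = "A".encode("utf-32-le")
with_bom = codecs.BOM_UTF32_LE + payload

# Fixed little-endian decoding: the mark is ordinary data, decoded as U+FEFF.
text, consumed = codecs.utf_32_le_decode(with_bom)

# Generic decoding: the mark selects the byte order and is not returned.
auto_text, auto_consumed = codecs.utf_32_decode(with_bom, "strict", True)
```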
-
utf_32_be_decode
Decode a sequence of bytes representing the UTF-32 big-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. A (correctly-oriented) byte-order mark will pass as a zero-width non-breaking space. The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode). It is an error for the input bytes not to form a whole number of valid UTF-32 codes.
Parameters:
bytes - to be decoded (Jython PyString convention)
Returns:
tuple (unicode_result, bytes_consumed)
-
utf_32_be_decode
Decode a sequence of bytes representing the UTF-32 big-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. A (correctly-oriented) byte-order mark will pass as a zero-width non-breaking space. The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode). It is an error for the input bytes not to form a whole number of valid UTF-32 codes.
Parameters:
bytes - to be decoded (Jython PyString convention)
errors - error policy name (e.g. "ignore", "replace")
Returns:
tuple (unicode_result, bytes_consumed)
-
utf_32_be_decode
Decode (perhaps partially) a sequence of bytes representing the UTF-32 big-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. A (correctly-oriented) byte-order mark will pass as a zero-width non-breaking space. The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode).
Parameters:
bytes - to be decoded (Jython PyString convention)
errors - error policy name (e.g. "ignore", "replace")
isFinal - if a "final" call, meaning the input must all be consumed
Returns:
tuple (unicode_result, bytes_consumed)
-
utf_32_ex_decode
Decode a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, the amount of input consumed, and the decoding "endianness" used (in the Python -1, 0, +1 convention). The endianness, if unspecified (=0), will be deduced from a byte-order mark and returned. (This codec entrypoint is used in that way in the utf_32.py codec, but only until the byte order is known.) When not defined by a BOM, processing assumes big-endian coding (Java platform default), but returns "unspecified". (The utf_32.py codec treats this as an error, once more than 4 bytes have been processed.) The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode).
Parameters:
bytes - to be decoded (Jython PyString convention)
errors - error policy name (e.g. "ignore", "replace")
byteorder - decoding "endianness" specified (in the Python -1, 0, +1 convention)
Returns:
tuple (unicode_result, bytes_consumed, endianness)
-
utf_32_ex_decode
Decode (perhaps partially) a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, the amount of input consumed, and the decoding "endianness" used (in the Python -1, 0, +1 convention). The endianness will be that specified or, if unspecified, will have been deduced from a byte-order mark, if present, or will otherwise be big-endian (Java platform default). It may still be undefined if fewer than 4 bytes are presented. (This codec entrypoint is used in the utf-32 codec only until the byte order is known.) The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode).
Parameters:
bytes - to be decoded (Jython PyString convention)
errors - error policy name (e.g. "ignore", "replace")
byteorder - decoding "endianness" specified (in the Python -1, 0, +1 convention)
isFinal - if a "final" call, meaning the input must all be consumed
Returns:
tuple (unicode_result, bytes_consumed, endianness)
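How the third element of the result reports the byte order can be sketched with CPython 3's codecs.utf_32_ex_decode, assuming the Jython method follows the same -1, 0, +1 convention. Only inputs that begin with a mark, or that are too short to contain one, are shown, since the text above says Jython assumes big-endian for unmarked input, which need not hold elsewhere.

```python
import codecs

be_data = codecs.BOM_UTF32_BE + "A".encode("utf-32-be")
le_data = codecs.BOM_UTF32_LE + "A".encode("utf-32-le")

# byteorder=0 means "unspecified": the order is deduced from the mark.
be_text, be_used, be_order = codecs.utf_32_ex_decode(be_data, "strict", 0, False)
le_text, le_used, le_order = codecs.utf_32_ex_decode(le_data, "strict", 0, False)

# Fewer than 4 bytes: nothing is consumed and the order is still unknown.
short_text, short_used, short_order = codecs.utf_32_ex_decode(
    b"\x00\x00", "strict", 0, False)
```

A stream decoder can call this entry point until it gets a nonzero order back, then switch to the matching fixed-endianness decoder for the rest of the input.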
-
raw_unicode_escape_encode
-
raw_unicode_escape_encode
-
raw_unicode_escape_decode
-
raw_unicode_escape_decode
-
unicode_escape_encode
-
unicode_escape_encode
-
unicode_escape_decode
-
unicode_escape_decode
-
unicode_internal_encode
Deprecated. Legacy method to encode given unicode in CPython wide-build internal format (equivalent UTF-32BE).
-
unicode_internal_encode
Deprecated. Legacy method to encode given unicode in CPython wide-build internal format (equivalent UTF-32BE). There must be a multiple of 4 bytes.
-
unicode_internal_decode
Deprecated. Legacy method to decode given bytes as if CPython wide-build internal format (equivalent UTF-32BE). There must be a multiple of 4 bytes.
-
unicode_internal_decode
Deprecated. Legacy method to decode given bytes as if CPython wide-build internal format (equivalent UTF-32BE). There must be a multiple of 4 bytes.
-