u32string

Description

A utility string class that stores a text value in UTF-32, one code point per element. It is not the default string class, but it is useful when working with Unicode string contents. To convert from the default class, see string.to_u32string() in string.
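
As a rough illustration of why a UTF-32 string is convenient for Unicode work, the sketch below uses C++ std::u32string as a stand-in for this class (an assumption; it is not this class's own implementation). Each element holds one whole code point, so the element count equals the code point count, while the UTF-8 encoding of the same text may need up to four bytes per code point. The helper names utf8_length and utf8_byte_count are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: number of UTF-8 bytes needed for one code point.
// Assumes cp is a valid Unicode scalar value (no validation is done).
std::size_t utf8_length(char32_t cp) {
    if (cp < 0x80) return 1;      // ASCII
    if (cp < 0x800) return 2;     // U+0080..U+07FF
    if (cp < 0x10000) return 3;   // rest of the Basic Multilingual Plane
    return 4;                     // supplementary planes
}

// Hypothetical helper: total UTF-8 size of a UTF-32 string, in bytes.
std::size_t utf8_byte_count(const std::u32string& s) {
    std::size_t bytes = 0;
    for (char32_t cp : s) {
        bytes += utf8_length(cp);
    }
    return bytes;
}
```

For a three code point string such as U"a\u00E9\U0001F600", size() reports 3 while the UTF-8 form occupies 1 + 2 + 4 bytes.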

Constructors

u32string() – Creates an empty u32string object.
u32string( u32string s ) – Creates a u32string with a copy of s.

Operators

char32_t& operator[]( int index ) – Accesses the code point at location index. Throws an exception if index is out of range.
u32string& operator =( u32string s ) – Copies the values from u32string s. Returns this object.
u32string& operator =( char32_t c ) – Sets the value to the code point c. Returns this object.
u32string& operator +=( u32string s ) – Appends the value from u32string s. Returns this object.
u32string& operator +=( char32_t c ) – Appends the code point c. Returns this object.
bool operator ==( u32string s ) – Returns true if both u32strings contain identical text.
bool operator !=( u32string s ) – Returns true if the u32strings represent different text.
bool operator <=( u32string s ) – Compares two u32strings lexicographically.
bool operator >=( u32string s ) – Compares two u32strings lexicographically.
bool operator <( u32string s ) – Compares two u32strings lexicographically.
bool operator >( u32string s ) – Compares two u32strings lexicographically.
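
The indexing and comparison operators above can be approximated with C++ std::u32string, used here only as a stand-in. This is a minimal sketch under two assumptions: that an out-of-range index is reported with an exception comparable to std::out_of_range, and that "lexicographically" means element by element on code point values. The names code_point_at and less_by_code_point are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <string>

// Bounds-checked access, mirroring an operator[] that throws on a bad index.
char32_t code_point_at(const std::u32string& s, int index) {
    if (index < 0) {
        throw std::out_of_range("index is negative");
    }
    // at() throws std::out_of_range when the index is >= size().
    return s.at(static_cast<std::size_t>(index));
}

// Lexicographic "less than": compares code point values left to right;
// if one string is a prefix of the other, the shorter string sorts first.
bool less_by_code_point(const std::u32string& a, const std::u32string& b) {
    return std::lexicographical_compare(a.begin(), a.end(),
                                        b.begin(), b.end());
}
```

Under this ordering, upper case ASCII letters sort before lower case ones, so a comparison by code point value is not the same as a locale-aware collation.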

Methods

size_t find(u32string s, size_t index) – Finds the first instance of s starting at offset index, returning the 0-based index. When not found, returns string_npos.
size_t rfind(u32string s, size_t index) – Finds the last instance of s starting at offset index, returning the 0-based index. When not found, returns string_npos.
size_t find_first_of(u32string s, size_t index) – Finds the first matching code point from the char32_t values in s starting at offset index, returning the 0-based index. When not found, returns string_npos.
size_t find_first_not_of(u32string s, size_t index) – Finds the first non-matching code point from the char32_t values in s starting at offset index, returning the 0-based index. When not found, returns string_npos.
size_t find_last_of(u32string s, size_t index) – Finds the last matching code point from the char32_t values in s starting at offset index, returning the 0-based index. When not found, returns string_npos.
size_t find_last_not_of(u32string s, size_t index) – Finds the last non-matching code point from the char32_t values in s starting at offset index, returning the 0-based index. When not found, returns string_npos.
void insert_at(int index, char32_t c) – Inserts code point c at position index. Throws an exception if index is out of range.
void erase_at(int index) – Removes the code point at position index. Throws an exception if index is out of range.
void push_back(char32_t c) – Appends code point c.
void clear() – Empties the u32string.
bool empty() – Returns true if the u32string is empty.
size_t size() – Returns the length of the u32string in code points.
size_t length() – Returns the length of the u32string in code points.
u32string substr(size_t index, size_t len) – Returns a sub-u32string starting at offset index with length of len. The value string_npos can be used for the len parameter to return the remaining u32string contents. Throws an exception if index plus len is out of range.
u32string mid( size_t index ) – Returns a sub-u32string starting at offset index. Does not throw an exception.
u32string mid( size_t index, size_t len ) – Returns a sub-u32string starting at offset index with length up to len. Does not throw an exception.
u32string left( size_t len ) – Returns a sub-u32string with up to the first len code points. Does not throw an exception.
u32string right( size_t len ) – Returns a sub-u32string with up to the last len code points. Does not throw an exception.
u32string& replace( size_t start, size_t count, u32string newStr ) – Replaces the part of the u32string indicated by [start, start + count) with the text in newStr (newStr can be empty).
u32string& replaceAll( u32string oldStr, u32string newStr ) – Replaces each instance of oldStr with the text in newStr (newStr can be empty).
u32string& replaceFirst( u32string oldStr, u32string newStr ) – Replaces the first instance of oldStr with the text in newStr (newStr can be empty).
u32string& replaceFirst( u32string oldStr, u32string newStr, startPosition ) – Replaces the first instance of oldStr found when searching from index startPosition with the text in newStr (newStr can be empty).
u32string& replaceLast( u32string oldStr, u32string newStr ) – Replaces the last instance of oldStr with the text in newStr (newStr can be empty).
u32string& truncate( size_t newLength ) – Shortens the u32string text value to the new length.
u32string ltrim() – Removes whitespace from the beginning of the u32string, returning a new u32string.
u32string& ltrim_self() – Removes whitespace from the beginning of this u32string.
u32string& ltrim_self( u32string s ) – Removes any code points found in s from the beginning of this u32string.
u32string rtrim() – Removes whitespace from the end of the u32string, returning a new u32string.
u32string& rtrim_self() – Removes whitespace from the end of this u32string.
u32string& rtrim_self( u32string s ) – Removes any code points found in s from the end of this u32string.
u32string trim() – Removes whitespace from the beginning and end of the u32string, returning a new u32string.
u32string& trim_self() – Removes whitespace from the beginning and end of this u32string.
u32string& trim_self( u32string s ) – Removes any code points found in s from the beginning and end of this u32string.
u32string& reverse_self() – Reverses the order of the code points in this u32string.
u32string& toLowerASCII() – Converts all ASCII code points to their ASCII lower case equivalents.
u32string& toLower( Locale l ) – Converts all code points to their lower case equivalents using the rules of the Locale provided.
u32string& toUpperASCII() – Converts all ASCII code points to their ASCII upper case equivalents.
u32string& toUpper( Locale l ) – Converts all code points to their upper case equivalents using the rules of the Locale provided.
bool u_normalizeNFC() – Applies the NFC (composed) Unicode normalization. Returns true if the u32string was normalized.
bool u_normalizeNFD() – Applies the NFD (decomposed) Unicode normalization. Returns true if the u32string was normalized.
string to_string() – Returns the contents of the u32string encoded as UTF-8. See string.
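
The difference between the throwing substr and the non-throwing mid, left and right, and the role of the "not found" value in the find family, can be sketched with C++ std::u32string as a stand-in. This is illustrative only: the function names below are hypothetical, std::u32string::npos plays the role of string_npos, and returning an empty string when the start offset lies past the end is an assumption, since the text above only states that no exception is thrown.

```cpp
#include <cstddef>
#include <string>

// Like mid(): clamps instead of throwing. An offset past the end yields
// an empty string (assumed behavior).
std::u32string clamped_mid(const std::u32string& s, std::size_t index,
                           std::size_t len = std::u32string::npos) {
    if (index >= s.size()) return std::u32string();
    return s.substr(index, len);  // std substr clamps len to what remains
}

// Like left(): up to the first len code points.
std::u32string clamped_left(const std::u32string& s, std::size_t len) {
    return s.substr(0, len);
}

// Like right(): up to the last len code points.
std::u32string clamped_right(const std::u32string& s, std::size_t len) {
    if (len >= s.size()) return s;
    return s.substr(s.size() - len);
}

// Like replaceAll(): repeated find until the "not found" value is returned.
std::u32string& replace_all(std::u32string& s, const std::u32string& oldStr,
                            const std::u32string& newStr) {
    if (oldStr.empty()) return s;  // nothing to search for; avoids looping forever
    std::size_t pos = 0;
    while ((pos = s.find(oldStr, pos)) != std::u32string::npos) {
        s.replace(pos, oldStr.size(), newStr);
        pos += newStr.size();      // continue after the inserted text
    }
    return s;
}
```

Advancing the search position past the inserted text means a replacement that itself contains oldStr is not replaced again, which keeps the loop finite.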

Unicode Utility Methods

bool u_isUpperCase( int index ) – Returns true if the code point at position index is an upper case letter character (equivalent to java.lang.Character.isUpperCase()). Throws an exception if index is out of bounds.
bool u_isLowerCase( int index ) – Returns true if the code point at position index is a lower case letter character (equivalent to java.lang.Character.isLowerCase()). Throws an exception if index is out of bounds.
bool u_isSpace( int index ) – Returns true if the code point at position index is a white space character; similar to C/POSIX isspace(). Throws an exception if index is out of bounds.
bool u_isSpaceChar( int index ) – Returns true if the code point at position index is a space character (equivalent to java.lang.Character.isSpaceChar()). Throws an exception if index is out of bounds.
bool u_isWhitespace( int index ) – Returns true if the code point at position index is a white space character (similar to java.lang.Character.isWhitespace()). Throws an exception if index is out of bounds.
bool u_isLetter( int index ) – Returns true if the code point at position index is a letter character (equivalent to java.lang.Character.isLetter()). Throws an exception if index is out of bounds.
bool u_isDigit( int index ) – Returns true if the code point at position index is a digit character (equivalent to java.lang.Character.isDigit()). Throws an exception if index is out of bounds.
bool u_isLetterOrDigit( int index ) – Returns true if the code point at position index is an alphanumeric character, that is, a letter or a digit (equivalent to java.lang.Character.isLetterOrDigit()). Throws an exception if index is out of bounds.
bool u_isPunctuation( int index ) – Returns true if the code point at position index is a punctuation character. Throws an exception if index is out of bounds.
bool u_isBMP( int index ) – Returns true if the code point at position index is in the Unicode Basic Multilingual Plane. Throws an exception if index is out of bounds.
bool u_isGraphic( int index ) – Returns true if the code point at position index is a "graphic" character (printable, excluding spaces). Throws an exception if index is out of bounds.
bool u_isPrintable( int index ) – Returns true if the code point at position index is a printable character. Throws an exception if index is out of bounds.
bool u_isBlank( int index ) – Returns true if the code point at position index is a "blank" or "horizontal space", a character that visibly separates words on a line. Throws an exception if index is out of bounds.
bool u_isDefined( int index ) – Returns true if the code point at position index is "defined", which usually means it is assigned a character in Unicode (equivalent to java.lang.Character.isDefined()). Throws an exception if index is out of bounds.
int u_charType( int index ) – Returns the Unicode general category of the code point at position index (equivalent to java.lang.Character.getType()). The return value can be tested against UnicodeCharTypeConstants. Throws an exception if index is out of bounds. See UnicodeCharTypeConstants.
int u_digit( int index, int radix ) – Returns the decimal digit value of the code point at position index in the specified radix. Throws an exception if index is out of bounds.
void u_toLower( int index ) – Changes the code point at position index to lower case using non-Locale-based Unicode mapping. Throws an exception if index is out of bounds.
void u_toUpper( int index ) – Changes the code point at position index to upper case using non-Locale-based Unicode mapping. Throws an exception if index is out of bounds.
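
Of the per-code-point tests above, the Basic Multilingual Plane check is simple enough to sketch directly, because the BMP is the range U+0000 to U+FFFF. The example below uses C++ std::u32string as a stand-in and hypothetical function names; the exception type is an assumption. It also shows one practical use of the check: code points outside the BMP need two code units (a surrogate pair) in UTF-16.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// True when the code point lies in the Basic Multilingual Plane.
bool is_bmp(char32_t cp) {
    return cp <= 0xFFFF;
}

// Index-based variant in the style of u_isBMP( int index ):
// throws when index is out of bounds.
bool is_bmp_at(const std::u32string& s, int index) {
    if (index < 0 || static_cast<std::size_t>(index) >= s.size()) {
        throw std::out_of_range("index is out of bounds");
    }
    return is_bmp(s[static_cast<std::size_t>(index)]);
}

// Number of UTF-16 code units the text would occupy:
// one per BMP code point, two per supplementary code point.
std::size_t utf16_length(const std::u32string& s) {
    std::size_t units = 0;
    for (char32_t cp : s) {
        units += is_bmp(cp) ? 1 : 2;
    }
    return units;
}
```

The other classification methods depend on Unicode character property data and are not reproduced here.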
