sombok  2.2.0
gcstring

Grapheme cluster string.

Data Structures

struct  unistr_t
struct  gcchar_t
struct  gcstring_t

Functions

gcstring_t * gcstring_new (unistr_t *unistr, linebreak_t *lbobj)
gcstring_t * gcstring_newcopy (unistr_t *str, linebreak_t *lbobj)
gcstring_t * gcstring_new_from_utf8 (char *str, size_t len, int check, linebreak_t *lbobj)
void gcstring_destroy (gcstring_t *gcstr)
gcstring_t * gcstring_copy (gcstring_t *gcstr)
gcstring_t * gcstring_append (gcstring_t *gcstr, gcstring_t *appe)
int gcstring_cmp (gcstring_t *a, gcstring_t *b)
size_t gcstring_columns (gcstring_t *gcstr)
gcstring_t * gcstring_concat (gcstring_t *gcstr, gcstring_t *appe)
gcchar_t * gcstring_next (gcstring_t *gcstr)
void gcstring_setpos (gcstring_t *gcstr, int pos)
void gcstring_shrink (gcstring_t *gcstr, int length)
gcstring_t * gcstring_substr (gcstring_t *gcstr, int offset, int length)
gcstring_t * gcstring_replace (gcstring_t *gcstr, int offset, int length, gcstring_t *replacement)
propval_t gcstring_lbclass (gcstring_t *gcstr, int pos)
propval_t gcstring_lbclass_ext (gcstring_t *gcstr, int pos)

Detailed Description

Grapheme cluster string.


Function Documentation

gcstring_t* gcstring_append ( gcstring_t * gcstr,
gcstring_t * appe
)

Append

Modify a grapheme cluster string by appending another string.

Parameters:
[in] gcstr: target grapheme cluster string. Must not be NULL.
[in] appe: grapheme cluster string to be appended. NULL is treated as an empty string, so gcstr is not modified.
Returns:
The modified grapheme cluster string gcstr itself (not a copy). If an error occurs, errno is set and NULL is returned.
int gcstring_cmp ( gcstring_t * a,
gcstring_t * b
)

Compare

Compare grapheme cluster strings.

Parameters:
[in] a: grapheme cluster string.
[in] b: grapheme cluster string.
Returns:
A positive, zero or negative value when a is greater than, equal to or less than b, respectively.
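
The return convention above can be illustrated with a small self-contained sketch. It does not call sombok and is not its implementation: cp_string_t and cp_string_cmp() are hypothetical stand-ins, and the ordering rule used here (code point by code point, with the shorter string first when one is a prefix of the other) is an assumption made only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a Unicode string: code points plus a length. */
typedef struct {
    const uint32_t *str;
    size_t len;
} cp_string_t;

/*
 * Three-way comparison with the same sign convention as gcstring_cmp():
 * positive when a > b, zero when equal, negative when a < b.
 */
int cp_string_cmp(const cp_string_t *a, const cp_string_t *b)
{
    size_t i;
    size_t n = a->len < b->len ? a->len : b->len;

    for (i = 0; i < n; i++) {
        if (a->str[i] != b->str[i])
            return a->str[i] < b->str[i] ? -1 : 1;
    }
    /* The common prefix is equal: the shorter string sorts first. */
    if (a->len == b->len)
        return 0;
    return a->len < b->len ? -1 : 1;
}
```

Callers should test only the sign of the result (for example `cp_string_cmp(&a, &b) < 0`), since the description promises a sign and not a particular magnitude.
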
size_t gcstring_columns ( gcstring_t * gcstr)

Number of Columns

Returns the number of columns of a grapheme cluster string, as determined by the built-in character database according to UAX #11.

Parameters:
[in] gcstr: grapheme cluster string. NULL may be given to mean an empty string.
Returns:
Number of columns.
gcstring_t* gcstring_concat ( gcstring_t * gcstr,
gcstring_t * appe
)

Concatenate

Create a new grapheme cluster string which is the concatenation of two strings.

Parameters:
[in] gcstr: grapheme cluster string. Must not be NULL.
[in] appe: grapheme cluster string to be appended. NULL is treated as an empty string.
Returns:
New grapheme cluster string. If an error occurs, errno is set and NULL is returned.

gcstring_t* gcstring_copy ( gcstring_t * gcstr)

Copy Constructor

Create a deep copy of a grapheme cluster string.

Parameters:
[in] gcstr: grapheme cluster string. Must not be NULL.
Returns:
Deep copy of the grapheme cluster string. If an error occurs, errno is set and NULL is returned.
void gcstring_destroy ( gcstring_t * gcstr)

Destructor

Free the memory allocated for a grapheme cluster string.

Parameters:
[in] gcstr: grapheme cluster string.
Returns:
None. If gcstr is NULL, nothing is done.
propval_t gcstring_lbclass ( gcstring_t * gcstr,
int  pos
)

Get Line Breaking Class of grapheme base

Get UAX #14 line breaking class of grapheme base.

Parameters:
[in] gcstr: grapheme cluster string. Must not be NULL.
[in] pos: position.
Returns:
Line breaking class property value.
Note:
Introduced by sombok 2.2.
propval_t gcstring_lbclass_ext ( gcstring_t * gcstr,
int  pos
)

Get Line Breaking Class of grapheme extender

Get UAX #14 line breaking class of grapheme extender. If that class is CM, the class of the grapheme base is returned instead.

Parameters:
[in] gcstr: grapheme cluster string. Must not be NULL.
[in] pos: position.
Returns:
Line breaking class property value.
Note:
Introduced by sombok 2.2.
gcstring_t* gcstring_new ( unistr_t * unistr,
linebreak_t * lbobj
)

Constructor

Create a new grapheme cluster string from a Unicode string. Use gcstring_newcopy() if you wish to copy the buffer of the Unicode string.

Parameters:
[in] unistr: Unicode string. NULL may be given to mean a zero-length string.
[in] lbobj: linebreak object.
Returns:
New grapheme cluster string that shares its str buffer with unistr. If an error occurs, errno is set and NULL is returned.

Option bits of lbobj:

  • If the LINEBREAK_OPTION_EASTASIAN_CONTEXT bit is set, LB_AI and EA_A are resolved to LB_ID and EA_F, respectively. Otherwise they are resolved to LB_AL and EA_N, respectively.
  • If the LINEBREAK_OPTION_LEGACY_CM bit is set, a combining mark led by a SPACE is treated as an isolated combining mark (ID). Otherwise, such sequences are treated as degenerate cases.
  • If the LINEBREAK_OPTION_VIRAMA_AS_JOINER bit is set, a sequence of a virama and another letter is not broken.
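
The first option bit above can be sketched as a plain bit test. This is only an illustration of the rule as stated in the list: the constant OPT_EASTASIAN_CONTEXT, its value, the sketch_lbc_t enumeration and resolve_ai() are all hypothetical names, not sombok's definitions. Real code would use the LINEBREAK_OPTION_* and LB_* constants from sombok's own headers.

```c
/* Hypothetical bit value; the real constant is defined by sombok's headers. */
#define OPT_EASTASIAN_CONTEXT 1u

/* Hypothetical stand-ins for three line breaking classes. */
typedef enum {
    LB_SKETCH_AI,   /* ambiguous */
    LB_SKETCH_AL,   /* alphabetic */
    LB_SKETCH_ID    /* ideographic */
} sketch_lbc_t;

/*
 * Resolve the ambiguous class as the option list describes:
 * AI becomes ID when the East Asian context bit is set, AL otherwise.
 * Every other class is passed through unchanged.
 */
sketch_lbc_t resolve_ai(sketch_lbc_t lbc, unsigned int options)
{
    if (lbc != LB_SKETCH_AI)
        return lbc;
    return (options & OPT_EASTASIAN_CONTEXT) ? LB_SKETCH_ID : LB_SKETCH_AL;
}
```
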
gcstring_t* gcstring_new_from_utf8 ( char *  str,
size_t  len,
int  check,
linebreak_t * lbobj
)

Constructor from UTF-8 string

Create a new grapheme cluster string from a UTF-8 string.

Parameters:
[in] str: buffer of UTF-8 string. Must not be NULL.
[in] len: length of UTF-8 string.
[in] check: check input. See sombok_decode_utf8().
[in] lbobj: linebreak object.
Returns:
New grapheme cluster string. If an error occurs, errno is set and NULL is returned. The source string buffer is not modified.
gcstring_t* gcstring_newcopy ( unistr_t * str,
linebreak_t * lbobj
)

Constructor copying Unicode string.

Create a new grapheme cluster string from a Unicode string. Use gcstring_new() if you do not wish to copy the buffer of the Unicode string.

Parameters:
[in] str: Unicode string. NULL may be given to mean a zero-length string.
[in] lbobj: linebreak object.
Returns:
New grapheme cluster string. If an error occurs, errno is set and NULL is returned.

gcchar_t* gcstring_next ( gcstring_t * gcstr)

Iterator

Returns a pointer to the next grapheme cluster of the grapheme cluster string. The next position is then incremented.

Parameters:
[in] gcstr: grapheme cluster string.
Returns:
Pointer to the grapheme cluster. If the position is already at the end of the string, NULL is returned.
gcstring_t* gcstring_replace ( gcstring_t * gcstr,
int  offset,
int  length,
gcstring_t * replacement
)

Replace substring

Replace a substring of a grapheme cluster string. Offset and length are specified as numbers of grapheme clusters.

Parameters:
[in,out] gcstr: grapheme cluster string. Must not be NULL.
[in] offset: offset of the substring.
[in] length: length of the substring. offset and length must not be out of range.
[in] replacement: if not NULL, the grapheme cluster string is modified by replacing the substring with this string.
Returns:
The modified gcstr itself (not a copy of it). If an error occurs, errno is set to a non-zero value and NULL is returned.
Todo:
In the next major release, offset and length are expected to be ssize_t, not int.
void gcstring_setpos ( gcstring_t * gcstr,
int  pos
)

Set Next Position

Set the next position of a grapheme cluster string.

Parameters:
[in] gcstr: grapheme cluster string.
[in] pos: new position.
Returns:
None. If pos is out of range for the string, the position is not updated.
Todo:
In the next major release, pos is expected to be ssize_t, not int.
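
The cursor behaviour described for gcstring_next() and gcstring_setpos() can be modelled with a few lines of standalone C. This is a sketch of the documented contract only, not sombok code: cluster_t, cluster_cursor_t and the cursor_* functions are hypothetical stand-ins, and the real gcchar_t and gcstring_t have other fields. The sketch also assumes that a negative position counts as out of range, which the text above does not specify.

```c
#include <stddef.h>

/* Hypothetical stand-in for one grapheme cluster. */
typedef struct {
    size_t idx;   /* offset of the cluster's first code point */
    size_t len;   /* number of code points in the cluster */
} cluster_t;

/* Hypothetical stand-in for a string that carries a "next position". */
typedef struct {
    cluster_t *clusters;
    size_t count;   /* number of clusters */
    size_t pos;     /* next position, advanced by cursor_next() */
} cluster_cursor_t;

/* Return the cluster at the next position and advance, or NULL at the end. */
cluster_t *cursor_next(cluster_cursor_t *c)
{
    if (c == NULL || c->count <= c->pos)
        return NULL;
    return &c->clusters[c->pos++];
}

/* Move the next position; out-of-range values leave it unchanged. */
void cursor_setpos(cluster_cursor_t *c, int pos)
{
    if (c == NULL || pos < 0 || c->count < (size_t)pos)
        return;
    c->pos = (size_t)pos;
}

/* Count the clusters by rewinding and walking the cursor to the end. */
size_t cursor_count_all(cluster_cursor_t *c)
{
    size_t n = 0;

    cursor_setpos(c, 0);
    while (cursor_next(c) != NULL)
        n++;
    return n;
}
```

The loop in cursor_count_all() is the usual shape of an iteration: rewind to position 0, then call the iterator until it returns NULL.
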
void gcstring_shrink ( gcstring_t * gcstr,
int  length
)

Shrink

Modify a grapheme cluster string to shrink its length. Length is specified as a number of grapheme clusters.

Parameters:
[in] gcstr: grapheme cluster string.
[in] length: new length.
Returns:
None. If gcstr is NULL, nothing is done.
Todo:
In the next major release, length is expected to be ssize_t, not int.
gcstring_t* gcstring_substr ( gcstring_t * gcstr,
int  offset,
int  length
)

Substring

Returns a substring of a grapheme cluster string. Offset and length are specified as numbers of grapheme clusters.

Parameters:
[in] gcstr: grapheme cluster string. Must not be NULL.
[in] offset: offset of the substring.
[in] length: length of the substring.
Returns:
Newly allocated substring. If an error occurs, errno is set to a non-zero value and NULL is returned.
Todo:
In the next major release, offset and length are expected to be ssize_t, not int.
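
Because offset and length count grapheme clusters rather than code points, a substring operation has to translate a cluster range into a code point range. The sketch below shows one way to do that translation. It is not sombok's implementation: cluster_t and cluster_range_to_codepoints() are hypothetical, and rejecting negative or out-of-range values with EINVAL is an assumption, since the text above does not say how gcstring_substr() treats them.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one grapheme cluster. */
typedef struct {
    size_t idx;   /* offset of the cluster's first code point */
    size_t len;   /* number of code points in the cluster */
} cluster_t;

/*
 * Translate a range of clusters into a range of code points.
 * Returns 0 on success. On a bad range, sets errno to EINVAL and
 * returns -1 without touching the output parameters.
 */
int cluster_range_to_codepoints(const cluster_t *clusters, size_t count,
                                int offset, int length,
                                size_t *cp_start, size_t *cp_len)
{
    size_t i, total = 0;

    if (offset < 0 || length < 0 ||
        count < (size_t)offset || count - (size_t)offset < (size_t)length) {
        errno = EINVAL;
        return -1;
    }
    /* Sum the code point lengths of the selected clusters. */
    for (i = 0; i < (size_t)length; i++)
        total += clusters[(size_t)offset + i].len;

    if ((size_t)offset < count)
        *cp_start = clusters[offset].idx;
    else if (count > 0)
        /* Empty range at the very end: start just past the last cluster. */
        *cp_start = clusters[count - 1].idx + clusters[count - 1].len;
    else
        *cp_start = 0;
    *cp_len = total;
    return 0;
}
```

With three clusters of 1, 2 and 1 code points, the cluster range (offset 1, length 2) maps to the code point range starting at 1 with length 3.
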