The BreakIterator class implements methods for finding the location of boundaries in text. BreakIterator is an abstract base class. Instances of BreakIterator maintain a current position and scan over text returning the index of characters where boundaries occur.
Line boundary analysis determines where a text string can be broken when line-wrapping. The mechanism correctly handles punctuation and hyphenated words.
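To make the idea of line-break opportunities concrete, here is a minimal sketch that does not use ICU at all. The function name toyLineBreakOpportunities and its two rules (break after a run of spaces, and after a hyphen that sits between two letters) are assumptions made for illustration only; the real line-breaking rules are considerably more involved.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Toy rules, NOT ICU's algorithm (hypothetical helper for illustration):
//  1. a line may be broken after a run of spaces, so punctuation stays
//     attached to the word it follows;
//  2. a line may be broken after a hyphen that sits between two letters.
// The end of the text is reported as the final break position; offset 0 is
// omitted because nothing can be wrapped before the first character.
std::vector<int> toyLineBreakOpportunities(const std::string& text)
{
    std::vector<int> breaks;
    const std::size_t n = text.size();
    for (std::size_t i = 1; i < n; ++i) {
        const bool afterSpace = text[i - 1] == ' ' && text[i] != ' ';
        const bool afterHyphen =
            text[i - 1] == '-' && i >= 2
            && std::isalpha(static_cast<unsigned char>(text[i - 2])) != 0
            && std::isalpha(static_cast<unsigned char>(text[i])) != 0;
        if (afterSpace || afterHyphen) {
            breaks.push_back(static_cast<int>(i));
        }
    }
    if (n > 0) {
        breaks.push_back(static_cast<int>(n));
    }
    return breaks;
}
```

For the input "well-known fact, yes" this toy returns the offsets 5, 11, 17 and 20: one opportunity after the hyphen, one after each space, and one at the end of the text. The comma never starts a new line because it is not preceded by a space.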
Sentence boundary analysis allows selection with correct interpretation of periods within numbers and abbreviations, and trailing punctuation marks such as quotation marks and parentheses.
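The following sketch illustrates two of the points above, periods inside numbers and trailing quotation marks or parentheses, without using ICU. The name toySentenceBoundaries and its rules are assumptions for illustration; in particular, this toy does not recognize abbreviations, which the real analysis is described as handling.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy rule, NOT ICU's algorithm (hypothetical helper for illustration):
// a sentence ends at '.', '!' or '?', optionally followed by closing quotes
// or parentheses, and then by at least one space or the end of the text.
// A period followed directly by another character (as in "3.50") is ignored.
std::vector<int> toySentenceBoundaries(const std::string& text)
{
    std::vector<int> boundaries(1, 0);            // start of text
    const std::size_t n = text.size();
    std::size_t i = 0;
    while (i < n) {
        const char c = text[i];
        ++i;
        if (c != '.' && c != '!' && c != '?') {
            continue;
        }
        std::size_t j = i;
        // Absorb trailing closers so they stay with the sentence they end.
        while (j < n && (text[j] == '"' || text[j] == '\'' || text[j] == ')')) {
            ++j;
        }
        if (j < n && text[j] != ' ') {
            continue;                             // e.g. the period in "3.50"
        }
        while (j < n && text[j] == ' ') {
            ++j;                                  // spaces belong to the sentence before
        }
        if (j < n) {
            boundaries.push_back(static_cast<int>(j));
        }
        i = j;
    }
    if (n > 0) {
        boundaries.push_back(static_cast<int>(n)); // end of text
    }
    return boundaries;
}
```

For the text He paid 3.50 (really). "Wow!" Done. the toy reports boundaries at 0, 23, 30 and 35: the period in 3.50 is skipped, and the closing quotation mark after Wow! stays with its sentence.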
Word boundary analysis is used by search and replace functions, as well as within text editing applications that allow the user to select words with a double click. Word selection provides correct interpretation of punctuation marks within and following words. Characters that are not part of a word, such as symbols or punctuation marks, have word-breaks on both sides.
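The statement that non-word characters have word-breaks on both sides can be sketched with a small stand-alone function. The name toyWordBoundaries and its single ASCII-only rule are assumptions for illustration and are not how ICU decides word boundaries.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Toy rule, NOT ICU's algorithm (hypothetical helper for illustration):
// two adjacent characters belong to the same word only if both are ASCII
// letters or digits. Every other adjacent pair is separated by a boundary,
// so punctuation, symbols and spaces get a boundary on both sides.
std::vector<int> toyWordBoundaries(const std::string& text)
{
    std::vector<int> boundaries;
    boundaries.push_back(0);                      // start of text
    for (std::size_t i = 1; i < text.size(); ++i) {
        const bool prevIsWordChar =
            std::isalnum(static_cast<unsigned char>(text[i - 1])) != 0;
        const bool currIsWordChar =
            std::isalnum(static_cast<unsigned char>(text[i])) != 0;
        if (!(prevIsWordChar && currIsWordChar)) {
            boundaries.push_back(static_cast<int>(i));
        }
    }
    if (!text.empty()) {
        boundaries.push_back(static_cast<int>(text.size()));  // end of text
    }
    return boundaries;
}
```

For "Hi, there" the toy yields 0, 2, 3, 4 and 9, so the comma occupies the range 2 to 3 and the space the range 3 to 4, each bracketed by boundaries. A double-click selection at offset 5 would then select the range 4 to 9.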
Character boundary analysis allows users to interact with characters as they expect to, for example, when moving the cursor through a text string. Character boundary analysis provides correct navigation through character strings, regardless of how each character is stored. For example, an accented character might be stored as a base character and a diacritical mark. What users consider to be a character can differ between languages.
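The base-character-plus-diacritic case can be sketched as follows. This is a simplified illustration that does not use ICU: the names isToyCombiningMark and toyCharacterBoundaries are hypothetical, and treating only the range U+0300..U+036F as combining marks is an assumption that covers far less than real character boundary analysis.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Assumption for this sketch: only code points in the Combining Diacritical
// Marks block (U+0300..U+036F) attach to the preceding character.
bool isToyCombiningMark(char32_t c)
{
    return c >= 0x0300 && c <= 0x036F;
}

// Toy rule, NOT ICU's algorithm: there is a boundary before every code point
// except a combining mark, which stays with the base character before it.
std::vector<int> toyCharacterBoundaries(const std::u32string& text)
{
    std::vector<int> boundaries;
    boundaries.push_back(0);                      // start of text
    for (std::size_t i = 1; i < text.size(); ++i) {
        if (!isToyCombiningMark(text[i])) {
            boundaries.push_back(static_cast<int>(i));
        }
    }
    if (!text.empty()) {
        boundaries.push_back(static_cast<int>(text.size()));  // end of text
    }
    return boundaries;
}
```

With the input U"e\u0301a" (the letter e, a combining acute accent, then a) the boundaries are 0, 2 and 3, so a cursor moves over the accented letter in one step. With the precomposed form U"\u00E9a" the boundaries are 0, 1 and 2. In both cases the user sees two characters.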
This is the interface for all text boundaries.
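The current-position model described above can be sketched independently of ICU. ToyBoundaryCursor and rangesForward below are hypothetical names; the cursor walks a precomputed, sorted list of boundary offsets instead of analysing text, and its DONE value of -1 is an assumption of this sketch. It only mirrors the first(), last(), next(), previous() calling pattern.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a boundary iterator: it keeps a current position
// inside a sorted list of boundary offsets supplied by the caller.
class ToyBoundaryCursor {
public:
    static constexpr int DONE = -1;   // sentinel chosen for this sketch

    explicit ToyBoundaryCursor(const std::vector<int>& sortedBoundaries)
        : boundaries_(sortedBoundaries), index_(0) {}

    int first() { index_ = 0; return current(); }

    int last()
    {
        index_ = boundaries_.empty() ? 0 : boundaries_.size() - 1;
        return current();
    }

    // Advance to the following boundary; the position is unchanged on DONE.
    int next()
    {
        if (index_ + 1 >= boundaries_.size()) return DONE;
        ++index_;
        return current();
    }

    // Step back to the preceding boundary; the position is unchanged on DONE.
    int previous()
    {
        if (index_ == 0 || boundaries_.empty()) return DONE;
        --index_;
        return current();
    }

    int current() const
    {
        return boundaries_.empty() ? DONE : boundaries_[index_];
    }

private:
    std::vector<int> boundaries_;
    std::size_t index_;
};

// Collect every (start, end) range in forward order.
std::vector<std::pair<int, int> > rangesForward(ToyBoundaryCursor& cursor)
{
    std::vector<std::pair<int, int> > ranges;
    int start = cursor.first();
    for (int end = cursor.next();
         end != ToyBoundaryCursor::DONE;
         start = end, end = cursor.next()) {
        ranges.push_back(std::make_pair(start, end));
    }
    return ranges;
}
```

Given the hand-picked offsets 0, 13 and 25, rangesForward returns the two ranges (0, 13) and (13, 25). Each pair of adjacent boundaries delimits one element, which is why the loop carries the previous end forward as the next start.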
Examples:
Helper function to output text

    void printTextRange( BreakIterator& iterator, UTextOffset start, UTextOffset end )
    {
        UnicodeString textBuffer, temp;
        CharacterIterator *strIter = iterator.createText();
        strIter->getText(temp);
        cout << " " << start << " " << end << " |"
             << temp.extractBetween(start, end, textBuffer)
             << "|" << endl;
        delete strIter;
    }

Print each element in order:

    void printEachForward( BreakIterator& boundary)
    {
        UTextOffset start = boundary.first();
        for (UTextOffset end = boundary.next();
             end != BreakIterator::DONE;
             start = end, end = boundary.next())
        {
            printTextRange( boundary, start, end );
        }
    }

Print each element in reverse order:

    void printEachBackward( BreakIterator& boundary)
    {
        UTextOffset end = boundary.last();
        for (UTextOffset start = boundary.previous();
             start != BreakIterator::DONE;
             end = start, start = boundary.previous())
        {
            printTextRange( boundary, start, end );
        }
    }

Print first element

    void printFirst(BreakIterator& boundary)
    {
        UTextOffset start = boundary.first();
        UTextOffset end = boundary.next();
        printTextRange( boundary, start, end );
    }

Print last element

    void printLast(BreakIterator& boundary)
    {
        UTextOffset end = boundary.last();
        UTextOffset start = boundary.previous();
        printTextRange( boundary, start, end );
    }

Print the element at a specified position

    void printAt(BreakIterator &boundary, UTextOffset pos )
    {
        UTextOffset end = boundary.following(pos);
        UTextOffset start = boundary.previous();
        printTextRange( boundary, start, end );
    }

Creating and using text boundaries

    void BreakIterator_Example( void )
    {
        BreakIterator* boundary;
        UnicodeString stringToExamine("Aaa bbb ccc. Ddd eee fff.");
        cout << "Examining: " << stringToExamine << endl;

        //print each sentence in forward and reverse order
        boundary = BreakIterator::createSentenceInstance( Locale::US );
        boundary->setText(&stringToExamine);
        cout << "----- forward: -----------" << endl;
        printEachForward(*boundary);
        cout << "----- backward: ----------" << endl;
        printEachBackward(*boundary);
        delete boundary;

        //print each word in order
        boundary = BreakIterator::createWordInstance();
        boundary->setText(&stringToExamine);
        cout << "----- forward: -----------" << endl;
        printEachForward(*boundary);
        //print first element
        cout << "----- first: -------------" << endl;
        printFirst(*boundary);
        //print last element
        cout << "----- last: --------------" << endl;
        printLast(*boundary);
        //print word at charpos 10
        cout << "----- at pos 10: ---------" << endl;
        printAt(*boundary, 10 );

        delete boundary;
    }

Return true if this BreakIterator is at the same position in the same text, and is the same class and type (word, line, etc.) of BreakIterator, as the argument. Text is considered the same if it contains the same characters; it need not be the same object, and styles are not considered.
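The equality rule above can be sketched with a small stand-in type. ToyKind and ToyIteratorState are hypothetical and are not part of the library; the sketch assumes both text pointers are non-null and only illustrates that texts are compared by their characters rather than by object identity.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-ins, not ICU types.
enum class ToyKind { Character, Word, Line, Sentence };

struct ToyIteratorState {
    ToyKind kind;                             // type of boundary analysis
    std::shared_ptr<const std::string> text;  // assumed non-null in this sketch
    int position;                             // current boundary offset
};

// Equal when the kind and position match and the texts hold the same
// characters; the two text objects do not have to be the same object.
bool operator==(const ToyIteratorState& a, const ToyIteratorState& b)
{
    return a.kind == b.kind
        && a.position == b.position
        && *a.text == *b.text;                // compare contents, not pointers
}
```

Two states that point at separate string objects holding "Aaa bbb", with the same kind and position, compare equal under this sketch, while changing either the kind or the position makes them unequal.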