[comment {-*- tcl -*- ntextWordBreak manpage}]
[manpage_begin ntextWordBreak n 0.81]
[moddesc   {ntext Word Boundary Detection for the Text Widget}]
[titledesc {ntext Word Boundary Detection for the Text Widget}]
[require Tcl 8.5]
[require Tk 8.5]
[require ntext [opt 0.81]]
[description]

The [package ntext] package provides a binding tag named [emph Ntext] for use by text widgets in place of the default [emph Text] binding tag.
[comment {use emph instead of term, because term creates a hyperlink, and ntext, Ntext and Text occur in almost every sentence: the page would be covered with the same hyperlinks many times}]

[para]

Navigation and selection in a text widget require the detection of words and their boundaries.  The word boundary detection facilities provided by Tcl/Tk through the [emph Text] binding tag are limited because they define only one class of "word" characters and one class of "non-word" characters.  The [emph Ntext] binding tag uses more general rules for word boundary detection, which define [emph two] classes of "word" characters and one class of "non-word" characters.

[para]

[section {CONFIGURATION OPTIONS}]

The behaviour of [emph Ntext] may be configured application-wide by setting the values of a number of namespace variables.  One of these is relevant to word boundary detection:
[para]
[var ::ntext::classicWordBreak]
[list_begin bullet]
[bullet]
   [const 0] - (default value) selects [emph Ntext] behaviour, i.e. platform-independent, two classes of word characters and one class of non-word characters.
[bullet]
   [const 1] - selects classic [emph Text] behaviour, i.e. platform-dependent, one class of word characters and one class of non-word characters.
[bullet]
   After changing this value, [emph Ntext]'s regexp matching patterns should be recalculated.  See [sectref FUNCTIONS] for details and advanced configuration options.
[list_end]
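[para]
As a minimal sketch of the recalculation step, assuming a script that had earlier selected the classic behaviour, the default [emph Ntext] rules might be restored as follows:
[example {
package require ntext
# Return to the Ntext default: two classes of word characters.
set ::ntext::classicWordBreak 0
# Recalculate the regexp matching patterns so that the change takes effect.
::ntext::initializeMatchPatterns
}]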
[para]


[section {Advanced Use}]
[comment {no subsection in my dtp kit}]
[section {Variables (Advanced Use)}]
[var ::ntext::tcl_match_wordBreakAfter]
[para]
[var ::ntext::tcl_match_wordBreakBefore]
[para]
[var ::ntext::tcl_match_endOfWord]
[para]
[var ::ntext::tcl_match_startOfNextWord]
[para]
[var ::ntext::tcl_match_startOfPreviousWord]
[para]
These variables hold the regexp patterns that are used by [emph Ntext] to search for word boundaries.  If they are changed, subsequent searches are immediately altered.  In many situations, it is unnecessary to alter the values of these variables directly: instead, call one of the functions [fun ::ntext::initializeMatchPatterns] or [fun ::ntext::createMatchPatterns].
[para]
In the [emph Text] binding tag one can change the search rules by changing the values of the global variables [var tcl_wordchars] and [var tcl_nonwordchars].  The equivalent operation in the [emph Ntext] binding tag is to call [fun ::ntext::createMatchPatterns] with appropriate arguments.
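[para]
The following sketch illustrates that equivalence.  It assumes that the desired rules are the ones the Tcl 8.5 library documents as the non-Windows defaults for [var tcl_nonwordchars] and [var tcl_wordchars], namely the class-shorthand escapes [const {\W}] and [const {\w}]; other classes can be substituted in the same positions.
[example {
package require ntext
# Classic Text style (assumed non-Windows defaults):
#   set tcl_nonwordchars {\W}
#   set tcl_wordchars    {\w}
# Ntext style: non-word class first, then one class of word characters.
::ntext::createMatchPatterns {\W} {\w}
}]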

[comment {no subsection in my dtp kit}]
[section {Functions (Advanced Use)}]
If a simple regexp search should prove insufficient, the following functions (analogous to the Tcl/Tk core's [fun tcl_wordBreakAfter] etc.) may be replaced by the developer:
[para]
[fun ntext::new_wordBreakAfter]
[para]
[fun ntext::new_wordBreakBefore]
[para]
[fun ntext::new_endOfWord]
[para]
[fun ntext::new_startOfNextWord]
[para]
[fun ntext::new_startOfPreviousWord]
[para]

[section FUNCTIONS]
Each function calculates the five regexp search patterns that define the word boundary searches.  These values are stored in the namespace variables listed above.
[para]
[fun ::ntext::initializeMatchPatterns]
[list_begin bullet]
[bullet]
This function is called when [emph Ntext] is first used, and needs to be called again only if the script changes the value of either [var ::ntext::classicWordBreak] or [var ::tcl_platform(platform)].  The function is called with no arguments.  It is useful when the desired search patterns are the default patterns for either the [emph Ntext] or [emph Text] binding tag, and so are implicitly specified by the values of [var ::ntext::classicWordBreak] and [var ::tcl_platform(platform)] alone.
[list_end]
[fun ::ntext::createMatchPatterns] [arg new_nonwordchars] [arg new_word1chars] [opt [arg new_word2chars]]
[list_begin bullet]
[bullet]
This function is useful in a wider range of situations than [fun ::ntext::initializeMatchPatterns].  It calculates the regexp search patterns for any case with one class of "non-word" characters and one or two classes of "word" characters.
[nl]
Each argument should be a regular expression defining a class of characters.  An argument will usually be a bracket expression, but might alternatively be a class-shorthand escape, or a single character.  The third argument may be omitted, or supplied as the empty string, in which case it is unused.
[nl]
The first argument is interpreted as the class of non-word characters; the second argument and, if present, the third are interpreted as classes of word characters.  Together the classes should include all possible characters, and they will normally be mutually exclusive: it is often convenient to define one class as the negation of the other two.
[list_end]
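[para]
Before passing a set of classes to [fun ::ntext::createMatchPatterns], it can be useful to check that they cover every character exactly once.  The sketch below does this in plain Tcl for a few sample characters; the helper [fun classCount] is hypothetical and is not part of [package ntext], and the three classes are the ones used in the [sectref EXAMPLE] section.
[example {
# Three candidate classes; the last is the negation of the other two.
set nonword {[[:space:][:cntrl:]]}
set word1   {[[:punct:]]}
set word2   {[^[:punct:][:space:][:cntrl:]]}

# Hypothetical helper: count how many of the given classes match
# the single character ch.
proc classCount {ch args} {
    set n 0
    foreach re $args {
        if {[regexp -- "^${re}\$" $ch]} { incr n }
    }
    return $n
}

# Report any sample character that is in no class or in more than one.
foreach ch [split "foo.bar(42) baz\t" ""] {
    if {[classCount $ch $nonword $word1 $word2] != 1} {
        puts "character [list $ch] is not in exactly one class"
    }
}
}]
Because the third class is defined by negation, no message is expected for any input; the check becomes informative when all three classes are written out explicitly.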

[section {WORD BOUNDARY MATCHING}]

The problem of word boundary selection is a vexed one, because text is used to represent a universe of different types of information, and there are no simple rules that are useful for all data types or for all purposes.
[para]
[emph Ntext] attempts to improve on the facilities available in classic [emph Text] by providing facilities for more complex definitions of words (with three classes of characters instead of two).
[para]
[emph {What is a word?  Why two classes of word?}]
[para]
When using the modified cursor keys <Control-Left> and <Control-Right> to navigate through an [emph Ntext] widget, the cursor is placed at the start of a word.  A word is defined as a sequence of one or more characters from only one of the two defined "word" classes; it may be preceded by a character from the other "word" class or from the "non-word" class.
[para]
The double-click of mouse button 1 selects a word of text, where in this case a "word" may be as defined above, or alternatively may be a sequence of one or more characters from the "non-word" class of characters.
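[para]
The definitions above can be illustrated outside a widget by splitting a string into maximal runs of characters from a single class.  This is only a sketch: the helper [fun classRuns] is hypothetical, the classes are assumed to be those shown in the [sectref EXAMPLE] section, and the code does not reproduce the search patterns that [emph Ntext] itself computes.
[example {
# Assumed classes, taken from the EXAMPLE section of this page.
set nonword {[[:space:][:cntrl:]]}
set word1   {[[:punct:]]}
set word2   {[^[:punct:][:space:][:cntrl:]]}

# Hypothetical helper: return the maximal single-class runs in str.
proc classRuns {str nonword word1 word2} {
    return [regexp -all -inline -- "${nonword}+|${word1}+|${word2}+" $str]
}

# Print each run on its own line, quoted as a list element.
set runs [classRuns "print(foo.bar, 42)" $nonword $word1 $word2]
foreach run $runs { puts [list $run] }
}]
Each printed run is a candidate for selection by double-click: runs from either "word" class are words in the sense used for keyboard navigation, and the run consisting of a single space comes from the "non-word" class.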
[para]
Traditionally Tcl has defined only one word class and one non-word class: on Windows, the non-word class is whitespace, and so alphanumerics and punctuation belong to the same class.  On other platforms, punctuation is bundled with whitespace as "non-word" characters.  In either case, the navigation and selection of text are unnecessarily coarse-grained, and sometimes give unhelpful results.
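[para]
The coarseness of the two classic schemes can be seen with the same kind of run-splitting sketch.  It assumes the defaults documented for Tcl 8.5 ([const {\S}] and [const {\s}] on Windows, [const {\w}] and [const {\W}] elsewhere); the helper [fun classicRuns] is hypothetical and only illustrates the classes, not the binding code.
[example {
# Hypothetical helper: split str into maximal runs of "word" characters
# and maximal runs of "non-word" characters.
proc classicRuns {str wordchars nonwordchars} {
    return [regexp -all -inline -- "${wordchars}+|${nonwordchars}+" $str]
}

set sample "print(foo.bar, 42)"
# Assumed Windows defaults: punctuation joins the alphanumerics.
set windowsRuns [classicRuns $sample {\S} {\s}]
# Assumed defaults elsewhere: punctuation joins the whitespace.
set otherRuns [classicRuns $sample {\w} {\W}]
puts $windowsRuns
puts $otherRuns
}]
With the Windows classes the sample falls into three runs, so a double-click on "foo" would take in the surrounding punctuation; with the other classes the comma and the following space form a single "non-word" run.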
[para]
The use of three classes of characters might make selection too fine-grained; but in this case, holding down the [emph Shift] key and double-clicking another word is an excellent way to select a longer range of text (a useful binding that Tcl/Tk has long provided but which is missing in other systems).
[para]
As well as its defaults, [emph Ntext] permits the developer to define their own classes of characters, or to revert to the classic [emph Text] definitions, or to specify their own regexp matching patterns.

[section EXAMPLE]

To use [emph Ntext] with Tcl/Tk's usual word-boundary detection rules:

[example {
package require ntext
text .t
bindtags .t {.t Ntext . all}
set ::ntext::classicWordBreak 1
::ntext::initializeMatchPatterns
}]

See bindtags for more information.
[para]
To define a different set of word-boundary detection rules:

[example {
package require ntext
text .t
bindtags .t {.t Ntext . all}
::ntext::createMatchPatterns {[[:space:][:cntrl:]]} {[[:punct:]]} {[^[:punct:][:space:][:cntrl:]]}
}]

See regexp and re_syntax for more information.

[see_also ntext]
[see_also text bindtags regexp re_syntax]
[keywords text bindtags regexp re_syntax]
[manpage_end]