LPEX
4.4.0

com.ibm.lpex.core
Class LpexCharStream

java.lang.Object
  extended by com.ibm.lpex.core.LpexCharStream
All Implemented Interfaces:
UCode_CharStream

public class LpexCharStream
extends Object
implements UCode_CharStream

A stream interface for LPEX document parsers. This class feeds a lexer with the text of an LPEX document view, formatted as a stream of Java Unicode characters. It also provides the capability to set the parse results (styles and element classes) in the associated view.

The local stream buffer holds the text and styles for one current element. These are usually the actual text and style of this element. In documents with smart-logical intra- and inter-token bidi marks, the bidi marks and their corresponding style characters are stripped upon loading from the editor, and restored when set back in the editor after parsing. Note that the methods of this class (for example, getBeginColumn() and getEndColumn()) operate on the contents of this stream buffer. The bidi marks will not be fed to the associated lexer.


Field Summary
 StringBuffer bufferStyles
          The current styles for the text element in the stream buffer.
 
Fields inherited from interface com.ibm.lpex.cc.UCode_CharStream
staticFlag
 
Constructor Summary
LpexCharStream(LpexView lpexView)
          Constructor.
 
Method Summary
 void backup(int amount)
          Backs up the input stream by amount steps.
 void backupToStart()
          Goes (back) to the start of the current element.
 char BeginToken()
          Returns the next character that marks the beginning of the next token.
 void Done()
          The lexer calls this method to indicate it is done with the stream, so that we can free any resources being held.
 int endMargin()
          Returns the ZERO-based end margin as it applies to the text element being processed.
 boolean EOFSeen()
          Checks whether EOF was encountered on the last character read.
 int Expand(int endElement)
          Expands the existing stream in the same LpexView with one or more elements.
 int getBeginColumn()
          Returns the ONE-based column number of the first character for the current token (being matched after the last call to BeginToken()).
 int getBeginLine()
          Returns the element number of the first character for the current token (being matched after the last call to BeginToken()).
 String getBufferText()
          Returns the character buffer used by LpexCharStream for the current element.
 String getBufferText(int element)
          Returns the character buffer used by LpexCharStream for the given element.
 int getEndColumn()
          Returns the ONE-based column number of the last character for the current token (being matched after the last call to BeginToken()).
 int getEndElement()
          Returns the end element of the parse range.
 int getEndLine()
          Returns the element number of the last character for the current token (being matched after the last call to BeginToken()).
 String GetImage()
          Returns the string from the marked token-beginning to the current buffer position.
 LpexView getLpexView()
          Returns the document view that provides this input stream.
 char[] GetSuffix(int len)
          Returns an array of characters that make up the suffix of length len for the currently matched token.
 void Init(int beginElement, int beginPosition, int endElement, long classClear, long classSet, char styleDefault, boolean clearPending)
          Initializes the stream for a parse range, for a specific document parser.
 void Init(int beginElement, int endElement, long classClear, long classSet, char styleDefault, boolean clearPending)
          Initializes the stream for a parse range, for a specific document parser.
 char readChar()
          Returns the next character in the stream.
 void removeClasses(long classes)
          Removes the specified class(es) from the current element.
 void setClasses(int element, long classes)
          Sets additional element class(es) in the given element.
 void setClasses(long classes)
          Sets additional element class(es) in the current element.
 void setCurrentStyles()
          Sets the styles and classes of the current element in the editor.
 void setMaintainBidiMarks(boolean maintainBidiMarks)
          Sets whether inter-token bidi marks should be maintained.
 boolean setMargins(int beginMargin, int endMargin, char outMarginsStyle)
          Sets a begin and/or an end margin for the characters being returned.
 boolean setMargins(int beginMargin, int endMargin, char outMarginsStyle, boolean ignoreWhitespace)
          Sets a begin and/or an end margin for the characters being returned.
 void setParseResults(boolean setParseResults)
          Sets whether parse results should be set in the editor.
 void setStyles(int beginCol, int endCol, char style)
          Sets additional styles in the current element.
 void setStyles(int element, int beginCol, int endCol, char style)
          Sets or corrects styles in the given element.
 void skipChar()
          Skips a character in the stream.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

bufferStyles

public StringBuffer bufferStyles
The current styles for the text element in the stream buffer. As it parses the tokens, the document parser sets the appropriate styles through, e.g., setStyles(); the entire styles string is then automatically set in LPEX by LpexCharStream, through setCurrentStyles(), when a new element is read in.

In documents with smart-logical bidi marks, the style characters for the bidi marks are stripped in the stream buffer.

See Also:
setCurrentStyles()
Constructor Detail

LpexCharStream

public LpexCharStream(LpexView lpexView)
Constructor.

Parameters:
lpexView - the document view that provides the input stream
Method Detail

readChar

public final char readChar()
                    throws IOException
Returns the next character in the stream. This method can throw any java.io.IOException.

An attempt to read a character beyond the end element of the stream will throw a java.io.EOFException. The range of elements in the stream is specified by beginElement .. endElement in methods Init() and Expand().

Specified by:
readChar in interface UCode_CharStream
Throws:
IOException
See Also:
Init(int,int,long,long,char,boolean), Expand(int)
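
A lexer typically drains the stream in a loop and treats java.io.EOFException as the normal end of the parse range. The sketch below illustrates that pattern. It runs against a hypothetical stand-in class (StringSource), not LpexCharStream itself, so that it works without LPEX; only the readChar() / EOFException contract is taken from the description above.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a character stream; not part of LPEX.
class ReadLoopSketch {
    /** Minimal source that throws EOFException past its last character. */
    static final class StringSource {
        private final String text;
        private int next = 0;

        StringSource(String text) { this.text = text; }

        char readChar() throws IOException {
            if (next >= text.length()) {
                throw new EOFException("read past the end of the range");
            }
            return text.charAt(next++);
        }
    }

    /** Reads every character until EOFException signals the end of the range. */
    static String drain(StringSource source) {
        StringBuilder seen = new StringBuilder();
        try {
            while (true) {
                seen.append(source.readChar());
            }
        } catch (EOFException endOfRange) {
            // Normal termination: the range is exhausted.
        } catch (IOException other) {
            // Any other I/O failure is a real error.
            throw new UncheckedIOException(other);
        }
        return seen.toString();
    }

    public static void main(String[] args) {
        System.out.println(drain(new StringSource("int x;\n")).length());
    }
}
```

Catching EOFException separately from other IOExceptions keeps the expected end-of-range case apart from genuine failures.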

getEndColumn

public final int getEndColumn()
Returns the ONE-based column number of the last character for the current token (being matched after the last call to BeginToken()).

Specified by:
getEndColumn in interface UCode_CharStream

getEndLine

public final int getEndLine()
Returns the element number of the last character for the current token (being matched after the last call to BeginToken()).

Specified by:
getEndLine in interface UCode_CharStream

getBeginColumn

public final int getBeginColumn()
Returns the ONE-based column number of the first character for the current token (being matched after the last call to BeginToken()).

Specified by:
getBeginColumn in interface UCode_CharStream

getBeginLine

public final int getBeginLine()
Returns the element number of the first character for the current token (being matched after the last call to BeginToken()).

Specified by:
getBeginLine in interface UCode_CharStream

backup

public final void backup(int amount)
Backs up the input stream by amount steps. The lexer calls this method if it had already read some characters, but couldn't use them to match a (longer) token; so they'll be used again as the prefix of the next token.

This method may back up to a previous element (an element previously readChar()-ed in this stream). The same margins setting is assumed for previous elements as is now in effect.

Specified by:
backup in interface UCode_CharStream
Parameters:
amount - the number of characters to back up (push back) in the stream
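
The push-back behavior can be pictured with a single-buffer model. This is a minimal sketch under simplifying assumptions (one element, no margins, a hypothetical class name BackupSketch); it is not the LPEX implementation, which may also back up into previous elements.

```java
// Hypothetical single-buffer model of backup(); not the LPEX implementation.
class BackupSketch {
    private final String buffer;
    private int position = 0;   // index of the next character to return

    BackupSketch(String buffer) { this.buffer = buffer; }

    char readChar() {
        if (position >= buffer.length()) {
            throw new IllegalStateException("no more characters in this sketch");
        }
        return buffer.charAt(position++);
    }

    /** Pushes the last 'amount' characters back so that they are read again. */
    void backup(int amount) {
        if (amount < 0 || amount > position) {
            throw new IllegalArgumentException("cannot back up " + amount);
        }
        position -= amount;
    }

    public static void main(String[] args) {
        BackupSketch stream = new BackupSketch("a+b");
        stream.readChar();   // 'a'
        stream.readChar();   // '+', read while trying to match a longer token
        stream.backup(1);    // '+' did not extend the token, so push it back
        System.out.println(stream.readChar());   // prints "+"
    }
}
```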

BeginToken

public final char BeginToken()
                      throws IOException
Returns the next character that marks the beginning of the next token.

Note: LpexCharStream guarantees that all the characters remain in the buffer between two successive calls to this method (i.e., the characters from tokenBegin on), so that backup() works correctly.

Specified by:
BeginToken in interface UCode_CharStream
Throws:
IOException

GetImage

public String GetImage()
Returns the string from the marked token-beginning to the current buffer position.

The current implementation of this method always returns an empty string ("").

Specified by:
GetImage in interface UCode_CharStream

GetSuffix

public char[] GetSuffix(int len)
Returns an array of characters that make up the suffix of length len for the currently matched token. Used to build up the matched string for use in actions in the case of MORE.

The current implementation of this method always returns an array which contains just one ('\n') character.

Specified by:
GetSuffix in interface UCode_CharStream
Parameters:
len - number of characters in the suffix

Done

public void Done()
The lexer calls this method to indicate it is done with the stream, so that we can free any resources being held.

Specified by:
Done in interface UCode_CharStream

Init

public final void Init(int beginElement,
                       int endElement,
                       long classClear,
                       long classSet,
                       char styleDefault,
                       boolean clearPending)
Initializes the stream for a parse range, for a specific document parser. Parsing starts at the first character in beginElement, subject to the margin settings in effect.

Parameters:
beginElement - begin element in the parse range
endElement - end element of the parse range. EOF will be returned when trying to read characters past it
classClear - class bit(s) to be cleared for a new element read in - the classClear bit(s) usually represent all the classes assigned to, and handled by, the parser
classSet - class bit(s) to be set by default in a new element read in - the classSet bit(s), usually set by the parser to indicate blank elements, are cleared in methods setClasses()
styleDefault - the default style character to be set for the entire element when read in
clearPending - true = once done with an element, remove it from the parse-pending list. This is used in incremental parsing, when looking ahead of just the current element which triggered the parse action: it prevents LPEX from calling the parser once again for this element, if it was itself modified (e.g., in a block-copy operation)
See Also:
Init(int,int,int,long,long,char,boolean)

Init

public final void Init(int beginElement,
                       int beginPosition,
                       int endElement,
                       long classClear,
                       long classSet,
                       char styleDefault,
                       boolean clearPending)
Initializes the stream for a parse range, for a specific document parser. Parsing starts from the given effective beginPosition in beginElement, subject to the margin settings in effect.

The first position in an element is 1. The original element classes and the styles prior to beginPosition are kept.

See Also:
Init(int,int,long,long,char,boolean)

setMargins

public final boolean setMargins(int beginMargin,
                                int endMargin,
                                char outMarginsStyle)
Sets a begin and/or an end margin for the characters being returned. All the characters outside the margins (including whitespace) are styled with outMarginsStyle.

See Also:
setMargins(int,int,char,boolean)

setMargins

public final boolean setMargins(int beginMargin,
                                int endMargin,
                                char outMarginsStyle,
                                boolean ignoreWhitespace)
Sets a begin and/or an end margin for the characters being returned. The next call to readChar() will return the next character positioned inside the margins set by this method.

Setting margins for the characters to be considered as part of the input stream buffer received by the parser is useful in certain column-sensitive languages, such as PL/I.

Method readChar() only returns text characters that are positioned in columns beginMargin .. endMargin of text elements. An EOL (end-of-line, '\n' character) may be returned at endMargin+1. The first column of a text element is 1. Note that even for an empty line of text, the EOL will be returned at position beginMargin (rather than in column 1).

A call to Init() clears any margins in effect: all the characters in the text elements of the view will be returned by subsequent readChar() calls.

The style indicated will be set in the editor (by setCurrentStyles()) for the characters outside the margins in effect.

Parameters:
beginMargin - begin margin; 0 indicates no begin margin in effect
endMargin - end margin (must not be below beginMargin); 0 indicates no end margin in effect
outMarginsStyle - style to be set for characters located outside the margins
ignoreWhitespace - when true, whitespace outside the margins is styled with the default style character
Returns:
true: all OK, the new margins are in effect
See Also:
Init(int,int,long,long,char,boolean), setCurrentStyles()
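
The effect of the margins on the characters a lexer sees can be approximated as follows. This is a simplified model of the rules described above, written as a hypothetical helper (insideMargins) that works on one line of text. It does not model the styling of characters outside the margins, nor the column at which the EOL is reported.

```java
// Simplified model of the margin rules described above; not the LPEX implementation.
class MarginSketch {
    /**
     * Returns the characters a reader would see for one text element:
     * the text in columns beginMargin..endMargin (ONE-based), then an EOL.
     * A margin of 0 means that margin is not in effect.
     */
    static String insideMargins(String text, int beginMargin, int endMargin) {
        int first = (beginMargin > 0) ? beginMargin : 1;
        int last = (endMargin > 0) ? Math.min(endMargin, text.length()) : text.length();
        StringBuilder seen = new StringBuilder();
        for (int column = first; column <= last; column++) {
            seen.append(text.charAt(column - 1));
        }
        return seen.append('\n').toString();
    }

    public static void main(String[] args) {
        // Columns 2..7 hold the code; column 1 and the trailing sequence number are outside.
        System.out.print(insideMargins("*DCL X;  0001", 2, 7));   // prints "DCL X;" and an EOL
    }
}
```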

setParseResults

public final void setParseResults(boolean setParseResults)
Sets whether parse results should be set in the editor. By default, LpexCharStream sets in the view the style and element classes resulting from the parse. This method may be called, after Init(), to disable setting the parse results in the editor. When called with a false argument, the style and element classes of parsed elements will not be set in the associated document view, nor will bidi marks be maintained for these elements.

A call to Init() clears this setting to true.


setMaintainBidiMarks

public final void setMaintainBidiMarks(boolean maintainBidiMarks)
Sets whether inter-token bidi marks should be maintained. This method may be called, after Init(), by a primary document parser for incremental parsing when its property maintainBidiMarks is on. Inter-token bidi marks (smart-logical format) will only be maintained in documents with a bidirectional source encoding (Arabic/Hebrew).

A call to Init() clears this setting to false.

See Also:
LpexNls.setTextStyleBidiMarks(int, java.lang.String)

Expand

public final int Expand(int endElement)
Expands the existing stream in the same LpexView with one or more elements. For example, the end element in the parse range may be increased when it was determined that more incremental parsing needs to be done to complete a language construct.

If any margins were previously set, they remain in effect. If endElement is not above the current end element, no change takes place in the existing range.

Parameters:
endElement - new end element of the parse range. EOF will be returned when trying to read characters past it
Returns:
the end element of the parse range

endMargin

public int endMargin()
Returns the ZERO-based end margin as it applies to the text element being processed. Returns 0 if there is no end margin in effect.


getLpexView

public final LpexView getLpexView()
Returns the document view that provides this input stream.


getEndElement

public final int getEndElement()
Returns the end element of the parse range.

See Also:
Init(int,int,long,long,char,boolean), Expand(int)

getBufferText

public final String getBufferText()
Returns the character buffer used by LpexCharStream for the current element. This is usually the actual text of the current element. In documents with smart-logical bidi marks, the bidi marks are stripped in the stream buffer.


getBufferText

public final String getBufferText(int element)
Returns the character buffer used by LpexCharStream for the given element. This is usually the text of the given element. In documents with smart-logical bidi marks, the bidi marks are stripped in the stream buffer.

When the specified element is the current element, this method is equivalent to a call to getBufferText().


EOFSeen

public final boolean EOFSeen()
Checks whether EOF was encountered on the last character read. Useful in determining the cause of a TokenMgrError. The EOFSeen condition is cleared by Init(), Expand(), and backup().

See Also:
Init(int,int,long,long,char,boolean), Expand(int), backup(int)

skipChar

public final void skipChar()
Skips a character in the stream. Used to skip the last error character and continue parsing.


backupToStart

public final void backupToStart()
Goes (back) to the start of the current element.


setStyles

public final void setStyles(int beginCol,
                            int endCol,
                            char style)
Sets additional styles in the current element. The style for a token located at beginCol .. endCol is set to the specified style character. This method does not take into account any margin setting in effect.

Style characters for the text element in the stream buffer are being set by the document parser (e.g., through this method) as it parses the tokens; then, the entire styles string is automatically set in the editor by LpexCharStream (through setCurrentStyles()) when a new element is read in.

Parameters:
beginCol - begin column (first column in an element is 1)
endCol - end column
style - the style character
See Also:
setCurrentStyles()
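
As an illustration of the ONE-based column convention, the sketch below applies a style character to a range of a styles buffer held in a StringBuffer. The helper and the style characters ('_' and 'k') are hypothetical and chosen only for the example; the code models the bookkeeping and does not call LPEX.

```java
// Hypothetical helper modelling a per-element styles buffer; not the LPEX implementation.
class StyleBufferSketch {
    /** Sets 'style' in ONE-based columns beginCol..endCol of the styles buffer. */
    static void setStyles(StringBuffer styles, int beginCol, int endCol, char style) {
        int last = Math.min(endCol, styles.length());
        for (int column = Math.max(beginCol, 1); column <= last; column++) {
            styles.setCharAt(column - 1, style);
        }
    }

    public static void main(String[] args) {
        String text = "if (x) y;";
        // Start with a default style character for every column of the element.
        StringBuffer styles = new StringBuffer();
        for (int i = 0; i < text.length(); i++) {
            styles.append('_');
        }
        setStyles(styles, 1, 2, 'k');   // style the keyword "if" in columns 1..2
        System.out.println(text);
        System.out.println(styles);     // prints "kk_______"
    }
}
```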

setStyles

public final void setStyles(int element,
                            int beginCol,
                            int endCol,
                            char style)
Sets or corrects styles in the given element. The style for a token located at beginCol .. endCol is set to the specified style character. This method does not take into account any margin setting in effect.

When the specified element is the pending current element of this stream, this method is equivalent to a call to setStyles(int,int,char). Otherwise, the element has already been parsed and its styles set in LPEX, so the new styles are set directly in the editor.

Parameters:
element - element in the current parse range of this stream
beginCol - begin column (first column in an element is 1)
endCol - end column
style - the style character
See Also:
setStyles(int,int,char), setParseResults(boolean)

setClasses

public final void setClasses(long classes)
Sets additional element class(es) in the current element. The classes bit-mask specified, if non-zero, is added to the current element after the default classSet bit(s) specified in Init() are first cleared.

The current classes for the text element in the stream buffer are being set by the document parser (through this method) as it parses the tokens; then, the accumulated classes bit-mask is automatically set in the editor by LpexCharStream (through setCurrentStyles()) when a new element is read in.

Parameters:
classes - the bit-mask of the element class(es) to be added to the current element
See Also:
Init(int,int,long,long,char,boolean), setCurrentStyles()
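
The interplay of classClear, classSet, and setClasses() can be read as plain bit-mask arithmetic. The sketch below is one interpretation of the descriptions of Init() and setClasses(long), using hypothetical class bits and helper names; it is not the LPEX implementation.

```java
// Simplified model of the class bit-mask bookkeeping; not the LPEX implementation.
class ClassMaskSketch {
    /** Classes of an element as it is read in: clear the parser's bits, then set the defaults. */
    static long onElementRead(long original, long classClear, long classSet) {
        return (original & ~classClear) | classSet;
    }

    /** Adds 'classes' after clearing the default classSet bits; zero leaves the mask unchanged. */
    static long setClasses(long current, long classSet, long classes) {
        if (classes == 0) {
            return current;
        }
        return (current & ~classSet) | classes;
    }

    public static void main(String[] args) {
        // Hypothetical class bits owned by one parser.
        final long CLASS_BLANK = 1L, CLASS_CODE = 2L, CLASS_COMMENT = 4L;
        long parserClasses = CLASS_BLANK | CLASS_CODE | CLASS_COMMENT;

        // The element was a comment before; on read-in it is presumed blank.
        long classes = onElementRead(CLASS_COMMENT, parserClasses, CLASS_BLANK);
        // The parser finds code in it: the default "blank" bit is dropped.
        classes = setClasses(classes, CLASS_BLANK, CLASS_CODE);
        System.out.println(classes);   // prints 2
    }
}
```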

setClasses

public final void setClasses(int element,
                             long classes)
Sets additional element class(es) in the given element.

When the specified element is the pending current element of this stream, this method is equivalent to a call to setClasses(long). Otherwise, the element has already been parsed and its classes set in LPEX, so the new classes are set directly in the editor.

Parameters:
element - element in the current parse range of this stream
classes - the bit-mask of the element class(es) to be added to the given element
See Also:
setClasses(long), setParseResults(boolean)

removeClasses

public final void removeClasses(long classes)
Removes the specified class(es) from the current element.

Parameters:
classes - the bit-mask of the element class(es) to be removed from the current element
See Also:
setClasses(long)

setCurrentStyles

public final void setCurrentStyles()
Sets the styles and classes of the current element in the editor. The styles (in bufferStyles) and the classes that have been set in the current element by the parser through setStyles() and setClasses(), are now set in the editor.

This method is called automatically by LpexCharStream when a new element is being read in an attempt to fill the stream buffer for the next readChar(). A parser may want to explicitly call this method when, for example, a lexical error caused by EOF was raised (i.e., after LpexCharStream has already called this method for the element), in order to update the element with this error's styles and classes. A parser may also call this method when switching to another subparser (lexer) for the rest of an element.

If margin settings are in effect, the style indicated by outMarginsStyle in setMargins() will be set in the editor for the characters outside the margins (with consideration to the ignoreWhitespace setting).

See Also:
setStyles(int, int, char), setClasses(long), setMargins(int, int, char), setParseResults(boolean)

Copyright © 2016 IBM Corp. All Rights Reserved.

Note: This documentation is for part of an interim API that is still under development and expected to change significantly before reaching stability. It is being made available at this early stage to solicit feedback from pioneering adopters on the understanding that any code that uses this API will almost certainly be broken (repeatedly) as the API evolves.