Building a CDT-based editor, Part 2: Presenting text in the CDT

Highlighting source code with syntax styling

This article, the second in a "Building a CDT-based editor" series, introduces text presentation in the Eclipse C/C++ Development Tooling (CDT). Text presentation is an important advantage of the CDT. The clear, multicolored display makes it easy to read and navigate through code. Understanding how this works is crucial, whether you want to understand the CDT code or build a full-featured source editor of your own. Further, the mechanisms that make CDT text presentation possible are also needed for a more important capability: automatic parsing.


Matthew Scarpino, Java Developer, Eclipse Engineering, LLC

Matthew Scarpino is a project manager and Java developer at Eclipse Engineering LLC. He is the lead author of SWT/JFace in Action and made a minor but important contribution to the Standard Widget Toolkit (SWT). He enjoys Irish folk music, marathon running, the poetry of William Blake, and the Graphical Editing Framework (GEF).



19 September 2006


Introducing the CDT text presentation

With each character you enter, the CDT editor performs a bewildering number of tasks. It checks for changes to the document partitions and activates rules to further divide the text. If the character completes a function, the editor enables subroutine folding to minimize the text space. If the character completes a word, it determines whether that word should be added to the index. Further, it determines whether that character fits into the accepted structure of a C/C++ document. If so, it updates its internal Document Object Model (DOM). If not, it provides error reporting through annotations.

It's beyond the scope of this article to cover every aspect of CDT event handling, so we'll focus on syntax styling. We'll explain how the editor changes text color and font style based on the structure of the source code. This not only shows how the editor reacts to keystrokes; the same objects and processes also reappear in Part 3 of this series, which discusses CDT parsing.

I've updated the Bare Bones C/C++ Development Tool (BBCDT) from Part 1 to provide the same text display here. The new classes are in the org.dworks.bbcdt.internal.ui.text and org.dworks.bbcdt.core.parser packages. If you enter valid code into a BBCDT source file, you see the same syntax styling you've become accustomed to with the full CDT (see Figure 1). See Download to retrieve the code.

Figure 1. CDT syntax styling

The CDT syntax styling process

Syntax styling may seem common and decorative, but the process isn't simple. As you'll see, there's a lot going on under the hood. The good news is that once it becomes clear, you'll be able to customize each color and font style and make your editor look exactly the way you want. Since most of these classes are part of the Eclipse text editor application programming interface (API), you can use them directly in your own editors.

Put simply, the final goal of syntax styling is to create a TextPresentation object for a portion of text whenever the user enters an appropriate sequence of valid C/C++ code. The process involves four steps:

  1. The Document creates a DocumentEvent for the input keystroke.
  2. The FastPartitioner updates the Document's partitions.
  3. The viewer alerts the PresentationReconciler, which uses a DefaultDamagerRepairer to analyze the changed partition.
  4. The DefaultDamagerRepairer uses rules to create a TextPresentation that updates the text's color and style.

Step 1. The SourceViewer and the Document

One of the first objects the CEditor creates is a CSourceViewer. This object not only constructs the editor's StyledText widget but also handles the events the widget receives. In particular, it uses a VerifyListener to respond to keystrokes. You can forward VerifyEvents to other objects as needed, but by default the viewer notifies the editor's Document.

As mentioned in Part 1, the Document holds the editor's information, and the DocumentProvider initializes it with text from the editor's input file. Similarly, the SourceViewer updates the Document with the editor's text. It uses DocumentCommand to do this. Each command holds the added text and the location of the text within the Document. When the command executes, it updates the model information in the Document.

Just as the CDT editor is simply a StyledText widget with many bells and whistles, the Document is essentially a String (technically an ITextStore). In addition to text, it contains a series of Positions that represent subsections of the String. Each Position has a length and offset, and a TypedPosition also has an associated name. TypedPositions are particularly important in this discussion since they are used to represent partitions.
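To make the idea of Positions concrete, here is a minimal stand-alone sketch. SimplePosition and SimpleTypedPosition are hypothetical stand-ins written for this illustration, not the Eclipse Position and TypedPosition classes, and the "multiline_comment" name is an assumed partition name.

```java
// Hypothetical stand-ins for illustration only; these are not Eclipse classes.
class SimplePosition {
    final int offset;
    final int length;

    SimplePosition(int offset, int length) {
        this.offset = offset;
        this.length = length;
    }

    // True when the given document offset falls inside this range.
    boolean includes(int index) {
        return index >= offset && index < offset + length;
    }
}

// A position that also carries a name, such as a partition type.
class SimpleTypedPosition extends SimplePosition {
    final String type;

    SimpleTypedPosition(int offset, int length, String type) {
        super(offset, length);
        this.type = type;
    }
}

class PositionDemo {
    // Returns the slice of the document string that a position covers.
    static String textAt(String document, SimplePosition p) {
        return document.substring(p.offset, p.offset + p.length);
    }

    public static void main(String[] args) {
        String document = "int x; /* counter */";
        // The comment starts at offset 7 and is 13 characters long.
        SimpleTypedPosition comment =
                new SimpleTypedPosition(7, 13, "multiline_comment");
        System.out.println(comment.type + ": " + textAt(document, comment));
    }
}
```

The point of the sketch is that a partition is nothing more than an offset, a length, and a name laid over the document's text.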

When a DocumentCommand executes, it starts by updating the Position corresponding to the caret (the vertical bar cursor). Then the command calls Document.replace(), which alters the characters in the Document's ITextStore. When this is finished, the Document is up to date, and it sends DocumentEvents to any registered DocumentListeners.
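The replace-then-notify sequence can be sketched in a few lines. This is a simplified model under stated assumptions: MiniDocument, Change, and Listener are hypothetical names, and the real Document does considerably more (for example, it also updates Positions and notifies partitioners first).

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of a document that notifies listeners after each replace.
// MiniDocument is hypothetical and is not the Eclipse Document class.
class MiniDocument {
    // Describes one change: where it happened, how many characters were
    // replaced, and what text was inserted.
    static final class Change {
        final int offset;
        final int length;
        final String text;

        Change(int offset, int length, String text) {
            this.offset = offset;
            this.length = length;
            this.text = text;
        }
    }

    interface Listener {
        void documentChanged(Change change);
    }

    private final StringBuilder store = new StringBuilder();
    private final List<Listener> listeners = new ArrayList<>();

    void addListener(Listener listener) {
        listeners.add(listener);
    }

    // Replaces 'length' characters at 'offset' with 'text', then notifies
    // every registered listener once the text store is up to date.
    void replace(int offset, int length, String text) {
        store.replace(offset, offset + length, text);
        Change change = new Change(offset, length, text);
        for (Listener listener : listeners) {
            listener.documentChanged(change);
        }
    }

    String get() {
        return store.toString();
    }
}

class MiniDocumentDemo {
    public static void main(String[] args) {
        MiniDocument doc = new MiniDocument();
        doc.addListener(c -> System.out.println("changed at offset " + c.offset));
        doc.replace(0, 0, "int x;");  // insert into the empty document
        doc.replace(4, 1, "count");   // replace "x" with "count"
        System.out.println(doc.get());
    }
}
```

Note that the listeners are called only after the store has changed, which matches the order described above: the Document is brought up to date first, and events follow.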

Step 2. Document partitioning with the FastPartitioner

Rather than analyze the entire document every time you enter text, the CDT uses the divide-and-conquer approach provided by the Eclipse text editor API. That is, it breaks the Document into mutually exclusive divisions called partitions. This way, only the partition containing modified text is examined. For example, if you change a word in a multiline comment, the CDT analyzes the partition containing the comment and not the rest of the code.

The plugin.xml file in the user interface (UI) plug-in creates an extension of the org.eclipse.core.filebuffers.documentSetup extension point. When a CDT Document is created, this extension connects it to a FastPartitioner object to determine and manage its subdivisions.

The dark side of modularity

When I create a graphical editor with the Eclipse Graphical Editor Framework (GEF) API, there's a clear need for separation of concerns. I'm glad to have hundreds of single-functional classes that join in a Model-View-Controller (MVC) mosaic. Sure it's complex, but considering the figures, connections, and interrelationships in a GEF editor, the complexity is understandable.

However, in my opinion, text editors don't need this level of complexity. After all, it's just text. I shouldn't have to read pages of documentation to make every instance of volatile display in blue, bold lettering. You may think my discussions of partitioning and rule processing are too detailed, but I'm actually leaving out quite a bit. For those who aren't afflicted with my laziness and apathy, do something about this, will you?

In the CDT, this partitioner is initialized with an array of four Strings, each naming a different partition:

  • Multiline comment
  • Single-line comment
  • String
  • Character

The CDT also initializes the partitioner with a FastCPartitionScanner. Put simply, Eclipse scanners convert characters in a Document range into a series of Tokens that hold arbitrary data Objects. In the case of the FastCPartitionScanner, each Token contains one of the four Strings that name the current partition.

Before it alerts any other listeners, the Document sends its DocumentEvents to its partitioners, or in this case, just the FastPartitioner. The partitioner uses the event to find the first line containing changed text and the TypedPosition containing the start of the line. Then it tells the FastCPartitionScanner to convert the range of text into Tokens.

The FastCPartitionScanner reads in characters of the Document by using a BufferedDocumentScanner. Then, it uses a state machine (a switch statement, that is) to determine whether an incoming character represents the end of a partition. If so, the scanner returns a Token for that partition and reports its offset and length. The FastPartitioner uses this to update the Document's list of partitions, and the partitioning process is complete.
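The following sketch shows the general shape of a switch-based partitioning state machine. It is a simplified model, not the FastCPartitionScanner: MiniPartitioner and its Type names are hypothetical, it scans a whole String instead of a Document range, and it returns a list of partitions rather than Tokens.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone partitioner for C-like text.
class MiniPartitioner {
    enum Type { DEFAULT, MULTILINE_COMMENT, SINGLELINE_COMMENT, STRING, CHARACTER }

    static final class Partition {
        final Type type;
        final int offset;
        final int length;

        Partition(Type type, int offset, int length) {
            this.type = type;
            this.offset = offset;
            this.length = length;
        }

        @Override
        public String toString() {
            return type + "[" + offset + "," + length + "]";
        }
    }

    // Splits the text into consecutive, non-overlapping partitions.
    static List<Partition> partition(String text) {
        List<Partition> result = new ArrayList<>();
        Type state = Type.DEFAULT;
        int start = 0; // offset where the current partition began
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            char next = i + 1 < text.length() ? text.charAt(i + 1) : '\0';
            Type opened = null; // partition type that opens at i, if any
            int end = -1;       // exclusive end of the current partition, if it closes
            switch (state) {
                case DEFAULT:
                    if (c == '/' && next == '*') opened = Type.MULTILINE_COMMENT;
                    else if (c == '/' && next == '/') opened = Type.SINGLELINE_COMMENT;
                    else if (c == '"') opened = Type.STRING;
                    else if (c == '\'') opened = Type.CHARACTER;
                    break;
                case MULTILINE_COMMENT:
                    if (c == '*' && next == '/') end = i + 2;
                    break;
                case SINGLELINE_COMMENT:
                    if (c == '\n') end = i; // the newline belongs to the next partition
                    break;
                case STRING:
                    if (c == '\\') i++;     // skip the escaped character
                    else if (c == '"') end = i + 1;
                    break;
                case CHARACTER:
                    if (c == '\\') i++;     // skip the escaped character
                    else if (c == '\'') end = i + 1;
                    break;
            }
            if (opened != null) {
                // Close the default partition collected so far, then switch state.
                if (i > start) result.add(new Partition(Type.DEFAULT, start, i - start));
                state = opened;
                start = i;
                boolean comment = opened == Type.MULTILINE_COMMENT
                        || opened == Type.SINGLELINE_COMMENT;
                i += comment ? 2 : 1;
            } else if (end >= 0) {
                result.add(new Partition(state, start, end - start));
                state = Type.DEFAULT;
                start = end;
                i = end;
            } else {
                i++;
            }
        }
        // Whatever remains (possibly an unterminated comment or literal).
        if (start < text.length()) {
            result.add(new Partition(state, start, text.length() - start));
        }
        return result;
    }

    public static void main(String[] args) {
        for (Partition p : partition("int a; /* c */ char *s = \"x\\\"y\"; // end\n'q'")) {
            System.out.println(p);
        }
    }
}
```

The real scanner is restartable from the middle of a document, which is what lets the partitioner rescan only from the first changed line; this sketch always starts from offset 0 to keep the state machine easy to follow.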

Step 3. Analyzing changed text with the DefaultDamagerRepairer

Despite its importance, there isn't much code in the CSourceViewer class. The CSourceViewerConfiguration provides a number of objects that perform functions for the viewer. Of these, one of the most important is the PresentationReconciler, which is given a DefaultDamagerRepairer for each partition type. The code in Listing 1 makes this possible.

Listing 1. Adding DefaultDamagerRepairers to the PresentationReconciler
// Single-line comments: the same object serves as damager and repairer
DefaultDamagerRepairer dr= new DefaultDamagerRepairer(getSinglelineCommentScanner());
reconciler.setDamager(dr, ICPartitions.C_SINGLE_LINE_COMMENT);
reconciler.setRepairer(dr, ICPartitions.C_SINGLE_LINE_COMMENT);

// Multiline comments
dr= new DefaultDamagerRepairer(getMultilineCommentScanner());
reconciler.setDamager(dr, ICPartitions.C_MULTILINE_COMMENT);
reconciler.setRepairer(dr, ICPartitions.C_MULTILINE_COMMENT);

// String literals
dr= new DefaultDamagerRepairer(getStringScanner());
reconciler.setDamager(dr, ICPartitions.C_STRING);
reconciler.setRepairer(dr, ICPartitions.C_STRING);

// Character literals use the same scanner as strings
dr= new DefaultDamagerRepairer(getStringScanner());
reconciler.setDamager(dr, ICPartitions.C_CHARACTER);
reconciler.setRepairer(dr, ICPartitions.C_CHARACTER);

// Default partition: pick the code scanner that matches the language
String language = ((CSourceViewer)sourceViewer).getDisplayLanguage();
if(language.equals(CEditor.LANGUAGE_CPP)) {
	scanner= getCppCodeScanner();
} else {
	scanner= getCCodeScanner();
}

dr= new DefaultDamagerRepairer(scanner);
reconciler.setDamager(dr, IDocument.DEFAULT_CONTENT_TYPE);
reconciler.setRepairer(dr, IDocument.DEFAULT_CONTENT_TYPE);

Before continuing, we need to explain what damagers and repairers do. Essentially, the purpose of an IPresentationDamager is to determine what region of a document's partition has been affected by a given DocumentEvent. Therefore, despite the fierce name, a damager is really just a damage analyzer. An IPresentationRepairer uses the damager's results to create a TextPresentation, which contains the information needed to change the text's color and style. Simply put, a DefaultDamagerRepairer performs both of these functions, responding to an event by creating a TextPresentation for a partition.

Before the CSourceViewer starts operating, it accesses the objects provided by the CSourceViewerConfiguration and installs the PresentationReconciler. This installation allows the reconciler to listen for TextEvents. These are similar to DocumentEvents, except TextEvents contain the new text and the text that has been replaced. DocumentEvents only contain the new text.

When the PresentationReconciler receives a TextEvent, it determines which partition contains the changed text and alerts the appropriate DefaultDamagerRepairer. Even though the reconciler has a different one for each partition type, the DefaultDamagerRepairer does the same thing in all cases. Like the FastPartitioner, it determines the start of the first line containing the damage and the start of the partition. The maximum of these is considered the start of the damage. It finds the end of the damage by calculating the minimum of the last position in the partition and the last position of unchanged text in the partition. The DefaultDamagerRepairer returns an IRegion, which represents a section of the Document by providing an offset and length.
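The max/min arithmetic is easier to see with numbers. The sketch below is an approximation under stated assumptions, not the DefaultDamagerRepairer source: DamageSketch is a hypothetical class, the text is the document after the edit, and the damage is assumed to extend to the end of the line that contains the last inserted character, clipped to the partition.

```java
// Hypothetical stand-alone sketch of a damage-region calculation.
class DamageSketch {
    // Returns {offset, length} of the damaged region.
    // text: document contents after the change
    // partitionOffset, partitionLength: partition containing the change
    // changeOffset, insertedLength: where text was inserted and how much
    static int[] damage(String text, int partitionOffset, int partitionLength,
                        int changeOffset, int insertedLength) {
        // Start of the line that contains the change (0 if on the first line).
        int lineStart = text.lastIndexOf('\n', changeOffset - 1) + 1;
        // Damage cannot begin before the partition begins.
        int start = Math.max(partitionOffset, lineStart);

        // End of the line that contains the last inserted character.
        int changeEnd = changeOffset + insertedLength;
        int lineEnd = text.indexOf('\n', changeEnd);
        if (lineEnd < 0) {
            lineEnd = text.length();
        }
        // Damage cannot extend past the end of the partition.
        int end = Math.min(partitionOffset + partitionLength, lineEnd);
        return new int[] { start, end - start };
    }

    public static void main(String[] args) {
        String text = "int a;\nint b = 42; // note\nint c;";
        // A "2" was typed at offset 16, inside a default partition covering
        // offsets 0 through 18 (everything before the "//" comment).
        int[] region = damage(text, 0, 19, 16, 1);
        System.out.println(text.substring(region[0], region[0] + region[1]));
    }
}
```

In this example the line starts at offset 7 and the partition ends at offset 19, so only the code on the second line is restyled; the first line and the trailing comment are left alone.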

Step 4. Rule processing

After it receives the damage information, the PresentationReconciler tells the partition's DefaultDamagerRepairer to create a new TextPresentation and apply it to the damaged region. As shown in Listing 1, each DefaultDamagerRepairer is initialized with a scanner suited to the partition. The repairer starts by telling its scanner to analyze the damaged region and produce Tokens.

In the CDT, the comment partitions are scanned with a CCommentScanner, and the character and String partitions are scanned with a SingleTokenCScanner. The default partition, which contains the text not covered by another partition, is scanned by either the CppCodeScanner or the CCodeScanner, depending on the code language. Each of these is a subclass of RuleBasedScanner, and each uses a List of IRules to create Tokens corresponding to the patterns detected in the text.

These IRules are particularly important, and the Eclipse text editor API provides five implementation classes:

  • WordRule: Returns a Token when specific words are found
  • SingleLineRule: Returns a Token when an expression is found in a single line of text
  • MultiLineRule: Returns a Token when an expression is found over multiple lines of text
  • NumberRule: Returns a Token for every number in the text
  • WhitespaceRule: Returns a Token when white space is detected
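As a rough illustration of how a scanner walks a list of rules, here is a stand-alone model with a word rule and a number rule. MiniRuleScanner, its Rule and Token types, and the "keyword" and "number" token names are all hypothetical; the Eclipse IRule and RuleBasedScanner classes have different signatures and read from a character scanner rather than a String.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Stand-alone model of a rule-based scanner; none of these are Eclipse types.
class MiniRuleScanner {
    static final class Token {
        final String data;
        final int offset;
        final int length;

        Token(String data, int offset, int length) {
            this.data = data;
            this.offset = offset;
            this.length = length;
        }

        @Override
        public String toString() {
            return data + "[" + offset + "," + length + "]";
        }
    }

    // Returns how many characters match at 'pos', or 0 if the rule does not apply.
    interface Matcher {
        int match(String text, int pos);
    }

    static final class Rule {
        final String tokenData;
        final Matcher matcher;

        Rule(String tokenData, Matcher matcher) {
            this.tokenData = tokenData;
            this.matcher = matcher;
        }
    }

    // True when 'pos' is not in the middle of an identifier.
    private static boolean atWordStart(String text, int pos) {
        return pos == 0 || !Character.isJavaIdentifierPart(text.charAt(pos - 1));
    }

    // Matches a whole word only if it is in the given set.
    static Rule wordRule(Set<String> words, String tokenData) {
        return new Rule(tokenData, (text, pos) -> {
            if (!atWordStart(text, pos)
                    || !Character.isJavaIdentifierStart(text.charAt(pos))) {
                return 0;
            }
            int end = pos;
            while (end < text.length()
                    && Character.isJavaIdentifierPart(text.charAt(end))) {
                end++;
            }
            return words.contains(text.substring(pos, end)) ? end - pos : 0;
        });
    }

    // Matches a run of digits that does not sit inside an identifier.
    static Rule numberRule(String tokenData) {
        return new Rule(tokenData, (text, pos) -> {
            if (!atWordStart(text, pos)) {
                return 0;
            }
            int end = pos;
            while (end < text.length() && Character.isDigit(text.charAt(end))) {
                end++;
            }
            return end - pos;
        });
    }

    // Tries each rule in order at every position; the first match wins.
    static List<Token> scan(String text, List<Rule> rules) {
        List<Token> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            int matched = 0;
            for (Rule rule : rules) {
                matched = rule.matcher.match(text, pos);
                if (matched > 0) {
                    tokens.add(new Token(rule.tokenData, pos, matched));
                    break;
                }
            }
            pos += matched > 0 ? matched : 1; // skip unmatched characters
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> keywords = new HashSet<>(Arrays.asList("int", "volatile"));
        List<Rule> rules =
                Arrays.asList(wordRule(keywords, "keyword"), numberRule("number"));
        for (Token t : scan("volatile int x1 = 42;", rules)) {
            System.out.println(t);
        }
    }
}
```

Rule order matters in this model, because the first rule that matches at a position consumes the text. That is one reason a scanner keeps its rules in a List rather than a Set.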

To give you an idea of how scanners use these rules, Listing 2 shows a section of code in which the CCodeScanner creates rules for detecting Strings and numbers. When part of the text matches a rule pattern, the scanner returns the appropriate Token.

Listing 2. Adding rules to a RuleBasedScanner
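// Single-quoted text on one line gets the string token; '\\' is the escape character
Token token= getToken(ICColorConstants.C_STRING);
rules.add(new SingleLineRule("'", "'", token, '\\'));

// Every number in the text gets the number token
token = getToken(ICColorConstants.C_NUMBER);
NumberRule numberRule = new NumberRule(token);
rules.add(numberRule);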

These Tokens contain Strings, just like those returned by partitioning scanners. However, these Strings represent TextAttributes, instead of partition names. A TextAttribute describes how a given section of text should be presented. That is, it describes the text's background color, its foreground color, and an integer representing its style (SWT.BOLD, SWT.ITALIC, or SWT.NORMAL). For example, the CCodeScanner uses a WordRule to detect C/C++ keywords in the text. When the WordRule detects a keyword, the content of its Token tells the editor how the keyword should be displayed.

When the DefaultDamagerRepairer receives a Token from a rule, it acquires the Token's TextAttribute and determines how the text should be displayed. Then it creates a StyleRange object that controls the style for a given section of the Document's text. After the repairer adds the StyleRange to the TextPresentation, its role is finished. The PresentationReconciler sends the TextPresentation to the viewer, and the viewer updates the StyledText widget with the new color and font style.
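The last hop, from Tokens to style ranges, can be modeled as a table lookup. This is a sketch under assumptions: MiniRepairer and its nested types are hypothetical, colors are plain Strings rather than SWT Color objects, the style constants are local to the sketch, and the real repairer also handles default attributes and adjacent ranges.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-alone model of the repair step; none of these are Eclipse or SWT types.
class MiniRepairer {
    // Style flags local to this sketch.
    static final int NORMAL = 0;
    static final int BOLD = 1;
    static final int ITALIC = 2;

    // How a kind of token should look.
    static final class Attribute {
        final String foreground;
        final int style;

        Attribute(String foreground, int style) {
            this.foreground = foreground;
            this.style = style;
        }
    }

    // A scanner result: a name plus the range of text it covers.
    static final class Token {
        final String data;
        final int offset;
        final int length;

        Token(String data, int offset, int length) {
            this.data = data;
            this.offset = offset;
            this.length = length;
        }
    }

    // The styling to apply to one range of text.
    static final class StyleRange {
        final int start;
        final int length;
        final String foreground;
        final int fontStyle;

        StyleRange(int start, int length, String foreground, int fontStyle) {
            this.start = start;
            this.length = length;
            this.foreground = foreground;
            this.fontStyle = fontStyle;
        }

        @Override
        public String toString() {
            return start + "+" + length + " " + foreground + " style=" + fontStyle;
        }
    }

    // Looks up each token's attribute and emits one style range per token.
    // Tokens without a registered attribute keep the editor's default look.
    static List<StyleRange> createPresentation(List<Token> tokens,
                                               Map<String, Attribute> attributes) {
        List<StyleRange> presentation = new ArrayList<>();
        for (Token token : tokens) {
            Attribute attribute = attributes.get(token.data);
            if (attribute == null) {
                continue;
            }
            presentation.add(new StyleRange(token.offset, token.length,
                    attribute.foreground, attribute.style));
        }
        return presentation;
    }

    public static void main(String[] args) {
        Map<String, Attribute> attributes = new HashMap<>();
        attributes.put("keyword", new Attribute("blue", BOLD));
        attributes.put("number", new Attribute("black", NORMAL));

        // Tokens as a scanner might report them for "volatile int x1 = 42;".
        List<Token> tokens = Arrays.asList(
                new Token("keyword", 0, 8),
                new Token("keyword", 9, 3),
                new Token("number", 18, 2));
        for (StyleRange range : createPresentation(tokens, attributes)) {
            System.out.println(range);
        }
    }
}
```

Keeping the attribute table separate from the rules means the colors can change without touching the scanning logic, which is the same split the article describes between Token Strings and TextAttributes.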


Updating the BBCDT

Between the CDT and the BBCDT, the only major change in the syntax styling code involves the use of preferences. In the CDT, you can control the color and font style of C/C++ text by updating the preferences in the Eclipse workbench. The BBCDT, however, doesn't use preferences. Therefore, to change the styles associated with each of the Token Strings, you need to alter code in the constructor of the CColorManager class in the org.dworks.bbcdt.internal.ui.text package.

Figure 2 shows what BBCDT text looks like with default settings.

Figure 2. BBCDT syntax styling

After adding the plug-ins to the Eclipse installation, you can create a BBCDT project by clicking File > New > Project and choosing the C or C++ option. To create a file, click File > New > Other and choose the C or C++ option.

Conclusion

This article explained how a keystroke can create a new Document partition or change the editor's text styling. The event handling isn't simple, but because of the separation of concerns, you can customize any aspect without disrupting the process. Also, now that we've explained how Eclipse text editor events work, we can discuss in upcoming installments an advanced feature that depends on this capability: CDT automatic parsing.


Download

Description          Name             Size
Part 2 source code   os-ecl-cdt2.zip  552KB

