TeXnical Writing Part 3: Syntax


February 05, 2021


Welcome to the third part of developing a Liberica JDK-based application for real-time conversion of mathematical formulas from Markdown to HTML. In the previous part we developed a plain Markdown editor and preview panel; in this part we’ll walk through adding syntax highlighting to the editor.

Introduction

Syntax highlighting in text editors has a colourful history, tracing its roots to the first syntax-directed editor in 1969. Thirteen years elapsed before colour syntax highlighting was introduced, an invention meant to make BASIC programming easier for beginners, especially children. One way of adding colour to text is to parse the file contents using regular expressions. Another way is to build an abstract syntax tree (AST). Given that our mdtexfx program includes a library that builds a Markdown AST, we’ll reuse that functionality to stylize the document.
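To make the regular expression route concrete, here is a minimal sketch that locates Markdown headings and reports the offsets a highlighter would style. The RegexHeadingScanner class, its Span record, and the pattern itself are illustrative assumptions and are not part of mdtexfx:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical sketch: finds Markdown ATX headings with a regular expression. */
final class RegexHeadingScanner {
  /** A styled region: start offset (inclusive), end offset (exclusive), style class. */
  record Span( int start, int end, String style ) { }

  // One to six '#' characters at the start of a line, at least one space,
  // then the remainder of the line.
  private static final Pattern HEADING =
    Pattern.compile( "^(#{1,6}) +.*$", Pattern.MULTILINE );

  /** Returns one span per heading line, styled as h1 through h6. */
  static List<Span> scan( final String markdown ) {
    final var spans = new ArrayList<Span>();
    final var matcher = HEADING.matcher( markdown );

    while( matcher.find() ) {
      final var level = matcher.group( 1 ).length();
      spans.add( new Span( matcher.start(), matcher.end(), "h" + level ) );
    }

    return spans;
  }

  public static void main( final String[] args ) {
    scan( "# Title\n\nBody text\n\n## Section\n" ).forEach( System.out::println );
  }
}
```

A pattern like this knows nothing about context: a line that starts with # inside a fenced code block would also be flagged as a heading, which is one reason to prefer the AST.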

We’ll encounter the visitor pattern along the way, which helps perform arbitrary actions on items within a hierarchical collection. This pattern lets us decouple the abstract syntax tree from the code that performs the highlighting, in line with the open/closed principle.

To implement syntax highlighting, we’ll first need to import a text editor that can be styled. Once that editor is in place, we’ll employ the visitor pattern to change the document’s font styles and colours.

Gradle

A rich text editor is a graphical user interface widget that lets developers add style to otherwise plain text content, such as bold, italics, and colour. The StyleClassedTextArea class, which is a rich text editor developed for JavaFX, provides us with fine control over how text is presented within our editor. Keep in mind that JavaFX comes bundled with the full version of Liberica JDK. Update the build.gradle script dependencies section to include the RichTextFX library as follows:

implementation 'org.fxmisc.richtext:richtextfx:0.10.5'

The new text area is now available, so let’s update our application to use it.

App

Replacing the existing text area with one that can be styled will entail a few changes to the App class, including:

  • Preview — Revise the HtmlPreview panel to resize dynamically, giving users the ability to reconfigure the application’s dimensions to suit their preferences.
  • Scrollbars — Wrap the new text area with scrollbars so that users receive visual cues when content is off-screen. To adorn the application with visible scrollbars, a VirtualizedScrollPane must be constructed using an instance of StyleClassedTextArea.
  • SplitPane — Replace the fixed BorderPane class with a SplitPane to afford users the ability to change how much screen real estate is occupied by the editor and preview components. By default, the SplitPane will evenly divide the space allotted between the two components added.
  • Dimensions — Add default dimensions when constructing the Scene class.

Update the start method of the App class to apply the changes using the following snippet:

  @Override
  public void start( final Stage stage ) {
    final var editor = new StyleClassedTextArea();
    final var vsPane = new VirtualizedScrollPane<>( editor );

    final var preview = new HtmlPreview();
    final var context = new ProcessorContext( TEXT_MARKDOWN, preview );
    final var processor = ProcessorFactory.create( context );

    editor.textProperty().addListener( ( c, o, n ) -> processor.apply( n ) );

    final var pane = new SplitPane();
    pane.getItems().addAll( vsPane, preview );

    final var scene = new Scene( pane, 1280, 720 );
    stage.setScene( scene );
    stage.show();
  }

We’ll come back to this later to instantiate a syntax highlighter that we’ll give to the ProcessorContext.

HtmlPreview

The WebView class has default dimensions of 800x600 pixels, which can be changed by setting its preferred dimensions (i.e., width and height). We’d like to resize the WebView whenever its compositional class—HtmlPreview, in this case—is resized. A few tweaks are needed to accomplish this task:

  • HtmlPreview — Change the class such that it resizes based on its parent node’s size.
  • WebView — Bind its preferred dimensions to the dimensions of HtmlPreview.

First, change the HtmlPreview class definition to inherit from Region instead of Parent. This change ensures that the preview pane can be resized by its parent node: the SplitPane. The update looks as follows:

public class HtmlPreview extends Region implements HtmlRenderer {

Next, revise the HtmlPreview constructor to bind the WebView class member variable’s (mView) preferred dimensions to those of the HtmlPreview class itself, which resembles the following:

  public HtmlPreview() {
    mHtmlDocument.append( HTML_PREFIX );

    mView = new WebView();
    mView.prefWidthProperty().bind( widthProperty() );
    mView.prefHeightProperty().bind( heightProperty() );

    getChildren().add( mView );
  }

Run the program. Here is a screenshot that shows part of this document being edited, including a screenshot of our application within itself:

Application Split Pane

Stylish Visitor

Given the processor-based architecture for mdtexfx, we want to avoid polluting the generic processors with format-specific code. For example, a reStructuredText editor’s ProcessorContext must not require Markdown syntax highlighting. Instead, we’ll declare an interface that defines how any document can be highlighted, in general terms.

Skeletons

Before we begin coding, let’s plan our approach. Even though we don’t yet know the low-level technical details for how the code will fit together, we can jot down the high-level requirements in terms of class and interface responsibilities. The big-picture items include:

  • SyntaxHighlighter — Defines the interface that all syntax highlighters must implement.
  • PlaintextHighlighter — Default implementation for unsupported file formats.
  • MarkdownHighlighter — Provides functionality that stylizes Markdown documents.
  • HighlighterFactory — Responsible for creating a highlighter based on the document type.

Create a new package named com.mdtexfx.processors.highlighters. Inside the package, create the following interface definition:

package com.mdtexfx.processors.highlighters;

public interface SyntaxHighlighter {
}

It has no working innards—hence the term skeleton—which is perfectly fine; we’ll define how it works later. We need not worry about everything all at once. Next, in the same package, create the PlaintextHighlighter and MarkdownHighlighter classes such that they both implement the SyntaxHighlighter interface:

public class PlaintextHighlighter implements SyntaxHighlighter {
}

And:

import org.fxmisc.richtext.StyleClassedTextArea;

public class MarkdownHighlighter implements SyntaxHighlighter {
  private final StyleClassedTextArea mEditor;

  public MarkdownHighlighter( final StyleClassedTextArea editor ) {
    mEditor = editor;
  }
}

Once again, we can sort out the details of how the highlighter will do its job later.

To finish up, we’ll borrow a similar approach to the ProcessorFactory for implementing the HighlighterFactory:

import com.mdtexfx.io.MediaType;
import org.fxmisc.richtext.StyleClassedTextArea;

public class HighlighterFactory {
  public static SyntaxHighlighter create(
      final MediaType mediaType, final StyleClassedTextArea editor ) {
    return switch( mediaType ) {
      case TEXT_MARKDOWN -> new MarkdownHighlighter( editor );
      case UNDEFINED -> new PlaintextHighlighter();
    };
  }
}

Notice that the creation of syntax highlighters represents different information from the creation of processors. Even though the code snippet closely resembles that of the ProcessorFactory, we have not violated the DRY principle. (See the previous article for more details about the DRY principle.)

ProcessorContext

With the highlighter class skeletons in place, we can now give the ProcessorContext class a syntax highlighting mechanism as follows:

public class ProcessorContext {
  private final MediaType mMediaType;
  private final HtmlRenderer mHtmlRenderer;
  private final SyntaxHighlighter mHighlighter;

  public ProcessorContext(
      final MediaType mediaType, final HtmlRenderer htmlRenderer ) {
    this( mediaType, htmlRenderer, new PlaintextHighlighter() );
  }

  public ProcessorContext(
      final MediaType mediaType,
      final HtmlRenderer htmlRenderer,
      final SyntaxHighlighter highlighter ) {
    mMediaType = mediaType;
    mHtmlRenderer = htmlRenderer;
    mHighlighter = highlighter;
  }

  MediaType getMediaType() { return mMediaType; }
  HtmlRenderer getHtmlRenderer() { return mHtmlRenderer; }
  SyntaxHighlighter getSyntaxHighlighter() { return mHighlighter; }
}

Adding the new member variable allows any text processor to interact with a syntax highlighter without any modifications to the ProcessorFactory class, which was the main driving force behind creating the ProcessorContext class.

Note that record classes provide a terse syntax for declaring data holders, such as the ProcessorContext class, whose sole purpose is to transport immutable data. As it stands, this class is a suitable record class candidate. At the time of writing, record classes are a preview feature of the Java language that can be used only when preview features are enabled; records become a standard feature in Java 16.
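As a rough illustration of what the record form could look like, the sketch below redeclares ProcessorContext with stand-in types so that it compiles on its own. The RecordSketch wrapper and the empty MediaType, HtmlRenderer, SyntaxHighlighter, and PlaintextHighlighter stand-ins are assumptions made for the example; the real project types would be used instead:

```java
/** Hypothetical sketch of ProcessorContext as a record (Java 16 or newer). */
final class RecordSketch {
  // Stand-ins for the project's own types, declared here only so the
  // example is self-contained.
  enum MediaType { TEXT_MARKDOWN, UNDEFINED }
  interface HtmlRenderer { }
  interface SyntaxHighlighter { }
  static final class PlaintextHighlighter implements SyntaxHighlighter { }

  /** The compiler generates the fields, canonical constructor, and accessors. */
  record ProcessorContext(
      MediaType mediaType,
      HtmlRenderer htmlRenderer,
      SyntaxHighlighter highlighter ) {

    /** Mirrors the two-argument constructor by defaulting the highlighter. */
    ProcessorContext(
        final MediaType mediaType, final HtmlRenderer htmlRenderer ) {
      this( mediaType, htmlRenderer, new PlaintextHighlighter() );
    }
  }

  public static void main( final String[] args ) {
    final HtmlRenderer renderer = new HtmlRenderer() { };
    final var context = new ProcessorContext( MediaType.TEXT_MARKDOWN, renderer );

    // Record accessors drop the "get" prefix: mediaType(), not getMediaType().
    System.out.println( context.mediaType() );
  }
}
```

One consequence of switching would be that callers use mediaType(), htmlRenderer(), and highlighter() rather than the get-prefixed methods shown earlier.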

Let’s return to the App class, where we’ll construct a new ProcessorContext class by giving it a suitable syntax highlighter.

Wiring the Application

Let’s stitch the classes together by giving the syntax highlighter to the processor context. To do so, first find the following line in the App class:

final var context = new ProcessorContext( TEXT_MARKDOWN, preview );

Then replace that line with the following lines:

final var syntax = HighlighterFactory.create( TEXT_MARKDOWN, editor );
final var context = new ProcessorContext( TEXT_MARKDOWN, preview, syntax );

Even though the ProcessorContext could have created the highlighter internally (by passing in the editor instead of a highlighter), we create the highlighter externally and pass it along so that the code subscribes to the Dependency Injection (DI) principle. Put succinctly, DI means that a parent object is to provide all necessary dependencies for its child objects. We do this so that the concrete implementations used by the child objects can be changed without having to update the child’s code.
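One practical payoff of injecting the highlighter is that a stand-in can be swapped in without touching the consumer. The sketch below assumes a highlight( start, end, style ) method on SyntaxHighlighter, borrowing the name from the pseudo-code later in this article; HeadingProcessor and RecordingHighlighter are hypothetical classes invented for the illustration:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of constructor-based dependency injection. */
final class InjectionSketch {
  /** Assumed shape of the interface; the real one may differ. */
  interface SyntaxHighlighter {
    void highlight( int start, int end, String style );
  }

  /** Stand-in that records requests instead of styling a text area. */
  static final class RecordingHighlighter implements SyntaxHighlighter {
    final List<String> calls = new ArrayList<>();

    @Override
    public void highlight( final int start, final int end, final String style ) {
      calls.add( style + ":" + start + "-" + end );
    }
  }

  /** Consumer that receives its highlighter and never constructs one. */
  static final class HeadingProcessor {
    private final SyntaxHighlighter mSyntax;

    HeadingProcessor( final SyntaxHighlighter syntax ) {
      mSyntax = syntax;
    }

    /** Styles the whole line when it looks like a level-one heading. */
    void apply( final String line ) {
      if( line.startsWith( "# " ) ) {
        mSyntax.highlight( 0, line.length(), "h1" );
      }
    }
  }

  public static void main( final String[] args ) {
    final var recorder = new RecordingHighlighter();
    new HeadingProcessor( recorder ).apply( "# Title" );
    System.out.println( recorder.calls );
  }
}
```

Because HeadingProcessor depends only on the interface, the same class would work unchanged with a highlighter backed by a real text area.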

Complete the edits by loading a Cascading Style Sheet (CSS) file—as yet undefined—that defines the style classes for the text editor. Accomplish this by inserting the following lines after creating the scene instance:

final var stylesheet = App.class.getResource( "markdown.css" );
editor.getStyleClass().add( "editor" );
editor.getStylesheets().add( stylesheet.toExternalForm() );

Create an empty CSS file named src/main/resources/com/mdtexfx/markdown.css. The directory portion under resources directly relates to the package name of the com.mdtexfx.App class, from which the resource is obtained via the getResource method. Test that everything works by running App within the IDE.
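Because getResource returns null when a resource cannot be found, a misplaced stylesheet surfaces as a NullPointerException at the toExternalForm call. The following sketch shows one way to guard against that and to compute where a class-relative resource is expected to live. ResourceSketch, find, and expectedPath are hypothetical names, and String.class stands in for App.class so that the example runs without the project:

```java
import java.net.URL;
import java.util.Optional;

/** Hypothetical sketch of class-relative resource lookup. */
final class ResourceSketch {
  /** Looks up a resource relative to the package of the owning class. */
  static Optional<URL> find( final Class<?> owner, final String name ) {
    // getResource yields null for a missing resource; wrap it so callers
    // must handle that case explicitly.
    return Optional.ofNullable( owner.getResource( name ) );
  }

  /** Path, relative to the resources root, where the file is expected. */
  static String expectedPath( final Class<?> owner, final String name ) {
    return owner.getPackageName().replace( '.', '/' ) + '/' + name;
  }

  public static void main( final String[] args ) {
    final var owner = String.class;
    final var name = "markdown.css";

    find( owner, name ).ifPresentOrElse(
      url -> System.out.println( "Found: " + url.toExternalForm() ),
      () -> System.out.println( "Missing: " + expectedPath( owner, name ) )
    );
  }
}
```

Applied to the App class, the expected path would be com/mdtexfx/markdown.css, matching the directory created above.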

The application is wired up; we’re now ready to fill in the details for the syntax highlighting implementation.

MarkdownProcessor & NodeVisitor

In flexmark-java, there is a fundamental class called a NodeVisitor, which implements the visitor design pattern with the help of a VisitHandler. Effectively, the NodeVisitor class provides a way to map nodes found in an abstract syntax tree to methods (lambda functions) that process particular node types. For example, we’d like to map a Markdown heading node (e.g., ## Heading 2) to a distinct heading style class, such as h2, so that the text editor can apply the associated CSS effects to the marked text. Consider the following pseudo-code snippet:

visitor = new NodeVisitor(
  new VisitHandler(
    Heading.class,
    node -> syntax.highlight( node.start(), node.end(), "h" + node.level() )
  )
);

visitor.visit( root );

A few items to notice:

  • The NodeVisitor constructor takes a variable number of VisitHandler instances.
  • A VisitHandler can map Node subclasses to lambda functions that are called when the NodeVisitor visits any node of that subclass type.
  • Calling start() and end() represents obtaining the offsets of where the node starts and ends within the document; in practice, these functions don’t exist—instead, we’ll use equivalent method calls that correspond to the node type being visited.
  • Calling visit( root ) takes the root node from the abstract syntax tree that was already parsed when we ran the Markdown processor to generate an HTML document (for previewing).
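The mapping idea in the list above can be modelled without any library. The following sketch is not flexmark-java's implementation; VisitorSketch, TreeVisitor, and the tiny Node hierarchy are hypothetical stand-ins that show how node classes can be associated with lambda functions and invoked while walking a tree:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical, library-free model of mapping node classes to handlers. */
final class VisitorSketch {
  /** Minimal tree node carrying document offsets. */
  static class Node {
    final int start;
    final int end;
    final List<Node> children = new ArrayList<>();

    Node( final int start, final int end ) {
      this.start = start;
      this.end = end;
    }

    Node add( final Node child ) {
      children.add( child );
      return this;
    }
  }

  static final class Document extends Node {
    Document( final int start, final int end ) { super( start, end ); }
  }

  static final class Paragraph extends Node {
    Paragraph( final int start, final int end ) { super( start, end ); }
  }

  static final class Heading extends Node {
    final int level;

    Heading( final int start, final int end, final int level ) {
      super( start, end );
      this.level = level;
    }
  }

  /** Walks the tree, calling the handler registered for each node's exact class. */
  static final class TreeVisitor {
    private final Map<Class<?>, Consumer<Node>> handlers = new HashMap<>();

    <T extends Node> TreeVisitor on(
        final Class<T> type, final Consumer<T> handler ) {
      // The cast is safe because the handler is only looked up by this class.
      handlers.put( type, node -> handler.accept( type.cast( node ) ) );
      return this;
    }

    void visit( final Node node ) {
      final var handler = handlers.get( node.getClass() );

      if( handler != null ) {
        handler.accept( node );
      }

      // Nodes without a handler are skipped, but their children are still visited.
      node.children.forEach( this::visit );
    }
  }

  /** Collects a style request for every heading found beneath the root. */
  static List<String> highlight( final Node root ) {
    final var styles = new ArrayList<String>();

    new TreeVisitor()
      .on( Heading.class, h -> styles.add( "h" + h.level + "@" + h.start + "-" + h.end ) )
      .visit( root );

    return styles;
  }

  public static void main( final String[] args ) {
    final var root = new Document( 0, 30 )
      .add( new Heading( 0, 7, 1 ) )
      .add( new Paragraph( 9, 18 ) )
      .add( new Heading( 20, 30, 2 ) );

    highlight( root ).forEach( System.out::println );
  }
}
```

The tree classes know nothing about styling, and new behaviour is added by registering another handler rather than by editing the nodes.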

Moving from pseudo-code to actual code involves changing the MarkdownProcessor class as follows:

public class MarkdownProcessor extends ExecutorProcessor<String> {
  private final IParse mParser = Parser.builder().build();
  private final IRender mRenderer = HtmlRenderer.builder().build();
  private final SyntaxHighlighter mSyntax;
  private final NodeVisitor mVisitor;

  public MarkdownProcessor(
    final Processor<String> successor, final ProcessorContext context ) {
    super( successor );
    mSyntax = context.getSyntaxHighlighter();
    mVisitor = new NodeVisitor(
      create( Text.class, node -> container( node, "text" ) ),
      create( Heading.class, node -> heading( node, "h" + node.getLevel() ) ),
      create( Code.class, node -> delimited( node, "code" ) ),
      create( Emphasis.class, node -> delimited( node, "emphasis" ) ),
      create( StrongEmphasis.class, node -> delimited( node, "strong" ) ),
      create( FencedCodeBlock.class, node -> fenced( node, "pre" ) ),
      create( BlockQuote.class, node -> container( node, "blockquote" ) ),
      create( BulletListItem.class, node -> itemized( node, "bullet" ) ),
      create( OrderedListItem.class, node -> itemized( node, "enumerated" ) ),
      create( Link.class, node -> link( node, "link" ) ),
      create( HtmlEntity.class, node -> container( node, "entity" ) )
    );
  }

  @Override
  public String apply( final String markdown ) {
    final var root = mParser.parse( markdown );
    mVisitor.visit( root );
    return mRenderer.render( root );
  }

A few techniques are employed to make the code readable and eliminate a bit of duplication:

  • The create method instantiates a VisitHandler.
  • The container, heading, delimited and similar methods hide how the node document offsets are obtained and subsequently applied via the syntax highlighter.
  • The style class strings (e.g., code, emphasis, blockquote) are clearly listed at a glance, which will help when developing the CSS file.

Download the source code at the end of the article to read how each type of node is mapped to specific document offsets.

The code to handle styling is complete.

Markdown Cascading Style Sheet

The functionality for JavaFX Cascading Style Sheets is based on the World Wide Web Consortium’s CSS specification version 2.1. Note that most style names start with -fx-, such as -fx-fill and -fx-font-size. Here are some salient parts to the markdown.css file that we created previously:

.editor {
  -fx-background-color: #fdf6e3;
}

.editor .code,
.editor .pre {
  -fx-fill: #002b36;
  -fx-font-family: monospace;
}

The CSS file contains mappings for the style class names passed into the lambda expressions we listed in the MarkdownProcessor class. When referenced in CSS, the names must be prefixed with a period. We’ve also introduced an editor style class name as a qualifier to help avoid conflicts with other CSS styles. Keep in mind that all style attributes, such as -fx-fill, are defined specifically for JavaFX; in some cases, the rich text editor requires the -rtfx- prefix for styles to stick.

When the application is run, the syntax highlighting will resemble the following image:

Syntax highlighted text editor

Download

You may download the complete project.

Summary

We’ve seen how the visitor pattern can decouple visiting all nodes in an abstract syntax tree from the code that handles the processing of each visited node. We also looked at a few key differences between the JavaFX CSS and the W3C CSS specifications. Lastly, we used a top-down software development approach by starting with skeleton classes and interfaces, wiring them together, then filling in the low-level implementation details afterwards.

In the next article, we’ll add TeX support.

Dave Jarvis

Senior Software Developer, Special for BellSoft
