java - How to handle multi-line comments in a live syntax highlighter?

Question

Welcome To Ask or Share your Answers For Others

java - How to handle multi-line comments in a live syntax highlighter?

asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

I'm writing my own text editor with syntax highlighting in Java, and at the moment it simply parses and highlights the current line every time the user enters a single character. While presumably not the most efficient way, it's good enough and doesn't cause any noticeable performance issues. In pseudo-Java, this would be the core concept of my code:

public void textUpdated(String wholeText, int updateOffset, int updateLength) {
    int lineStart = getFirstLineStart(wholeText, updateOffset);
    int lineEnd = getLastLineEnd(wholeText, updateOffset + updateLength);

    List<Token> foundTokens = tokenizeText(wholeText, lineStart, lineEnd);

    for(Token token : foundTokens) {
        highlightText(token.offset, token.length, token.tokenType);
    }
}

The real problem lies with multi-line comments. To check if an entered character is inside a multi-line comment, the program would need to parse back to the most recent occurrence of a "/*", while also being aware of whether this occurrence is inside a literal or another comment. This would not be an issue if the amount of text is small, but if the text consists of 20,000 lines of code, it would possibly have to scan and (re)highlight 20,000 lines of code on each key press, which would be very inefficient.

So my ultimate question is: how do I handle multi-line tokens/comments in a syntax highlighter while keeping it efficient?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

187 views

1 Answer

深蓝 · Answer 1 · 2021-10-23T18:42:11+0000

I tried to do this (for fun) about 10 years ago (or more). Because the code is so old, I don't remember all the details of the code and the logic conditions in the code. All the code here is basically a brute force solution. It in no way attempts to keep the state of each line as suggest by rici.

I'll try to explain the high level concept of the code. Hope some of it makes sense to you.

at the moment it simply parses and highlights the current line every time the user enters a single character.

This is the basic premise of my code as well. However, it does handle pasting multiple lines of code as well.

how do I handle multi-line tokens/comments in a syntax highlighter while keeping it efficient?

In my solution, when you enter "/*" to start the multi-line comment, I will comment all the following lines of code until I find the end of the comment or the start of the another multi-line comment or the end of the Document. When you then enter the matching "*/" to end the multi-line comment I will re-highlight the following lines until the next multi-line comment or the end of the Document.

So the amount of highlighting done depends on how much code you have between multi-line comments.

That is a quick overview of how it works. I doubt it is 100% accurate since I've only played with it a little bit. It should be noted this code was written when I was just learning Java, so in no way would I suggest it is the best approach, just the best I knew at the time.

Here is the code for your amusement :)

Just run the code and click on the button to get started.

import java.awt.*;
import java.awt.event.*;
import java.io.*;
import java.net.*;
import java.util.*;
import javax.swing.*;
import javax.swing.event.*;
import javax.swing.text.*;

class SyntaxDocument extends DefaultStyledDocument
{
    private DefaultStyledDocument doc;
    private Element rootElement;

    private boolean multiLineComment;
    private MutableAttributeSet normal;
    private MutableAttributeSet keyword;
    private MutableAttributeSet comment;
    private MutableAttributeSet quote;

    private Set<String> keywords;

    private int lastLineProcessed = -1;

    public SyntaxDocument()
    {
        doc = this;
        rootElement = doc.getDefaultRootElement();
        putProperty( DefaultEditorKit.EndOfLineStringProperty, "
" );

        normal = new SimpleAttributeSet();
        StyleConstants.setForeground(normal, Color.black);

        comment = new SimpleAttributeSet();
        StyleConstants.setForeground(comment, Color.gray);
        StyleConstants.setItalic(comment, true);

        keyword = new SimpleAttributeSet();
        StyleConstants.setForeground(keyword, Color.blue);

        quote = new SimpleAttributeSet();
        StyleConstants.setForeground(quote, Color.red);

        keywords = new HashSet<String>();
        keywords.add( "abstract" );
        keywords.add( "boolean" );
        keywords.add( "break" );
        keywords.add( "byte" );
        keywords.add( "byvalue" );
        keywords.add( "case" );
        keywords.add( "cast" );
        keywords.add( "catch" );
        keywords.add( "char" );
        keywords.add( "class" );
        keywords.add( "const" );
        keywords.add( "continue" );
        keywords.add( "default" );
        keywords.add( "do" );
        keywords.add( "double" );
        keywords.add( "else" );
        keywords.add( "extends" );
        keywords.add( "false" );
        keywords.add( "final" );
        keywords.add( "finally" );
        keywords.add( "float" );
        keywords.add( "for" );
        keywords.add( "future" );
        keywords.add( "generic" );
        keywords.add( "goto" );
        keywords.add( "if" );
        keywords.add( "implements" );
        keywords.add( "import" );
        keywords.add( "inner" );
        keywords.add( "instanceof" );
        keywords.add( "int" );
        keywords.add( "interface" );
        keywords.add( "long" );
        keywords.add( "native" );
        keywords.add( "new" );
        keywords.add( "null" );
        keywords.add( "operator" );
        keywords.add( "outer" );
        keywords.add( "package" );
        keywords.add( "private" );
        keywords.add( "protected" );
        keywords.add( "public" );
        keywords.add( "rest" );
        keywords.add( "return" );
        keywords.add( "short" );
        keywords.add( "static" );
        keywords.add( "super" );
        keywords.add( "switch" );
        keywords.add( "synchronized" );
        keywords.add( "this" );
        keywords.add( "throw" );
        keywords.add( "throws" );
        keywords.add( "transient" );
        keywords.add( "true" );
        keywords.add( "try" );
        keywords.add( "var" );
        keywords.add( "void" );
        keywords.add( "volatile" );
        keywords.add( "while" );
    }

    /*
     *  Override to apply syntax highlighting after the document has been updated
     */
    public void insertString(int offset, String str, AttributeSet a) throws BadLocationException
    {
        if (str.equals("{"))
            str = addMatchingBrace(offset);

        super.insertString(offset, str, a);
        processChangedLines(offset, str.length());
    }

    /*
     *  Override to apply syntax highlighting after the document has been updated
     */
    public void remove(int offset, int length) throws BadLocationException
    {
        super.remove(offset, length);
        processChangedLines(offset, 0);
    }

    /*
     *  Determine how many lines have been changed,
     *  then apply highlighting to each line
     */
    public void processChangedLines(int offset, int length)
        throws BadLocationException
    {
        String content = doc.getText(0, doc.getLength());

        //  The lines affected by the latest document update

        int startLine = rootElement.getElementIndex(offset);
        int endLine = rootElement.getElementIndex(offset + length);

        if (startLine > endLine)
            startLine = endLine;

        //  Make sure all comment lines prior to the start line are commented
        //  and determine if the start line is still in a multi line comment

        if (startLine != lastLineProcessed
        &&  startLine != lastLineProcessed + 1)
        {
            setMultiLineComment( commentLinesBefore( content, startLine ) );
        }

        //  Do the actual highlighting

        for (int i = startLine; i <= endLine; i++)
        {
            applyHighlighting(content, i);
        }

        //  Resolve highlighting to the next end multi line delimiter

        if (isMultiLineComment())
            commentLinesAfter(content, endLine);
        else
            highlightLinesAfter(content, endLine);

    }

    /*
     *  Highlight lines when a multi line comment is still 'open'
     *  (ie. matching end delimiter has not yet been encountered)
     */
    private boolean commentLinesBefore(String content, int line)
    {
        int offset = rootElement.getElement( line ).getStartOffset();

        //  Start of comment not found, nothing to do

        int startDelimiter = lastIndexOf( content, getStartDelimiter(), offset - 2 );

        if (startDelimiter < 0)
            return false;

        //  Matching start/end of comment found, nothing to do

        int endDelimiter = indexOf( content, getEndDelimiter(), startDelimiter );

        if (endDelimiter < offset & endDelimiter != -1)
            return false;

        //  End of comment not found, highlight the lines

        doc.setCharacterAttributes(startDelimiter, offset - startDelimiter + 1, comment, false);
        return true;
    }

    /*
     *  Highlight comment lines to matching end delimiter
     */
    private void commentLinesAfter(String content, int line)
    {
        int offset = rootElement.getElement( line ).getStartOffset();

        //  End of comment and Start of comment not found
        //  highlight until the end of the Document

        int endDelimiter = indexOf( content, getEndDelimiter(), offset );

        if (endDelimiter < 0)
        {
            endDelimiter = indexOf( content, getStartDelimiter(), offset + 2);

            if (endDelimiter < 0)
            {
                doc.setCharacterAttributes(offset, content.length() - offset + 1, comment, false);
                return;
            }
        }

        //  Matching start/end of comment found, comment the lines

        int startDelimiter = lastIndexOf( content, getStartDelimiter(), endDelimiter );

        if (startDelimiter < 0 || startDelimiter >= offset)
        {
            doc.setCharacterAttributes(offset, endDelimiter - offset + 1, comment, false);
        }
    }

    /*
     *  Highlight lines to start or end delimiter
     */
    private void highlightLinesAfter(String content, int line)
        throws BadLocationException
    {
        int offset = rootElement.getElement( line ).getEndOffset();

        //  Start/End delimiter not found, nothing to do

        int startDelimiter = indexOf( content, getStartDelimiter(), offset );
        int endDelimiter = indexOf( content, getEndDelimiter(), offset );

        if (startDelimiter < 0)
            startDelimiter = content.length();

        if (endDelimiter < 0)
            endDelimiter = content.length();

        int delimiter = Math.min(startDelimiter, endDelimiter);

        if (delimiter < offset)
            return;

        //  Start/End delimiter found, reapply highlighting

        int endLine = rootElement.getElementIndex( delimiter );

        for (int i = line + 1; i <= endLine; i++)
        {
            Element branch = rootElement.getElement( i );
            Element leaf = doc.getCharacterElement( branch.getStartOffset() );
            AttributeSet as = leaf.getAttributes();

            if ( as.isEqual(comment) )
            {
                applyHighlighting(content, i);
            }
        }
    }

    /*
     *  Parse the line to determine the appropriate highlighting
     */
    private void applyHighlighting(String content, int line)
        throws BadLocationException
    {
        lastLineProcessed = line;

        int startOffset = rootElement.getElement( line ).getStartOffset();
        int endOffset = rootElement.getElement( line ).getEndOffset() - 1;

        int lineLength = endOffset - startOffset;
        int contentLength = content.length();

        if (endOffset >= contentLength)
            endOffset = contentLength - 1;

        //  check for multi line comments
        //  (always set the comment attribute for the entire line)

        if (endingMultiLineComment(content, startOffset, endOffset)
        ||  isMultiLineComment()
        ||  startingMultiLineComment(content, startOffset, endOffset) )
        {
            doc.setCharacterAttributes(startOffset, endOffset - startOffset + 1, comment, false);
            lastLineProcessed = -1;
            return;
        }

        //  set normal attributes for the line

        doc.setCharacterAttributes(startOffset, lineLength, normal, true);

        //  check for single line comment

        int index = content.indexOf(getSingleLineDelimiter(), startOffset);

        if ( (index > -1) && (index < endOffset) )
        {
            doc.setCharacterAttributes(index, endOffset - index + 1, comment, false);
            endOffset = index - 1;
        }

        //  check for tokens

        checkForTokens(conten

Categories

java - How to handle multi-line comments in a live syntax highlighter?

Please log in or register to add a comment.

Please log in or register to answer this question.

1 Answer

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags