From 0d7fc4516c9e4a36bd6d36b041a11ebc5c99d107 Mon Sep 17 00:00:00 2001 From: Eshel Yaron Date: Sat, 11 Oct 2025 14:18:53 +0200 Subject: [PATCH] Document 'elisp-fontify-semantically' in the Emacs manual * doc/emacs/display.texi (Semantic Font Lock): New node. * doc/emacs/emacs.texi: Update menu. * etc/NEWS: Update relevant entry. * lisp/emacs-lisp/elisp-scope.el: Expand commentary. * doc/misc/elisp-semantic-highlighting.org: Delete it. --- doc/emacs/display.texi | 83 +++++++++ doc/emacs/emacs.texi | 1 + doc/misc/elisp-semantic-highlighting.org | 210 ----------------------- etc/NEWS | 3 +- lisp/emacs-lisp/elisp-scope.el | 110 +++++++++++- 5 files changed, 193 insertions(+), 214 deletions(-) delete mode 100644 doc/misc/elisp-semantic-highlighting.org diff --git a/doc/emacs/display.texi b/doc/emacs/display.texi index 8da0495f531..aef1047c17e 100644 --- a/doc/emacs/display.texi +++ b/doc/emacs/display.texi @@ -1085,6 +1085,7 @@ program. @menu * Traditional Font Lock:: Font Lock based on regexps and syntax tables. * Parser-based Font Lock:: Font Lock based on external parser. +* Semantic Font Lock:: Font Lock based on semantic analysis. @end menu @node Traditional Font Lock @@ -1209,6 +1210,88 @@ fontification level. takes effect immediately in all the existing buffers and for files you visit in the future in the same session. +@node Semantic Font Lock +@subsection Semantic Font Lock +@cindex semantic highlighting + +@dfn{Semantic highlighting} is a semi-advanced editor feature in which +an editor uses some kind of semantic analysis to understand a program's +source code, and communicates useful information about the meaning of +different tokens to the user by highlighting these tokens according to +their specific role in the program. + +Semantic highlighting is more sophisticated than traditional ``syntax +highlighting'', which only considers the syntactic role of a token, +i.e. 
how it affects the code's @emph{parsing}, unlike semantic analysis +which takes into account the token's effect on the program's +@emph{execution}. For example, a semantic highlighting implementation +may be able to tell apart local and global variables and give distinct +highlighting to each category, even though the language's @emph{syntax} +doesn't make such a distinction. Semantic highlighting is especially +beneficial in languages in which syntactic constructs can mean +completely different things depending on the context in which they +occur, such as Lisp and Prolog. In such languages, syntactic analysis +alone misses a lot of important information that coders need to reason +about their programs. + +@vindex elisp-fontify-semantically +Emacs implements semantic highlighting for Emacs Lisp as an optional +feature of @code{emacs-lisp-mode}. To enable it, set the option +@code{elisp-fontify-semantically} to non-@code{nil}. + +When this option is enabled, @code{emacs-lisp-mode} analyzes your code +and highlights symbols according to their semantic roles, as part of the +mode's usual Font Lock highlighting. It doesn't affect the highlighting +of strings, comments and other syntactic elements such as brackets; +@code{elisp-fontify-semantically} only affects symbol highlighting. + +The semantic analysis assigns to each symbol a @dfn{symbol role}, such +as ``function'', ``local variable'', ``face name'', etc. Each symbol +role has an associated face property, which is applied to symbols with +that role during semantic highlighting. By default, most of these faces +inherit from appropriate @code{font-lock-*} faces. For example, +locally-bound variables get the @code{elisp-bound-variable} face, which +inherits from @code{font-lock-variable-use-face}. 
+ +@vindex elisp-add-help-echo +The analysis can differentiate between more than 50 such symbol roles, +but you don't need to memorize the appearance of so many faces to +leverage semantic highlighting: you can hover over a highlighted symbol +with the mouse to see a tooltip with the exact role Emacs inferred for +that symbol (@pxref{Tooltips}). If you want to disable this extra +information, set @code{elisp-add-help-echo} to @code{nil}. + +There are a few more points you should keep in mind when using +@code{elisp-fontify-semantically}: + +@itemize @bullet +@item +Syntax errors break semantic analysis, so for best results you may want +to enable @code{electric-pair-mode} to keep your code syntactically +correct while you edit it. @xref{Matching}. + +@item +The analysis uses macro-expansion in some cases, but by default it only +does so in trusted buffers (@pxref{Host Security}). In @emph{untrusted} +buffers, some macro arguments may not be highlighted. See the +documentation string of @code{elisp-scope-safe-macro-p} for more +information about which macros Emacs considers safe to expand for +analysis. + +@item +The analysis is informed by definitions in the current Emacs session, +hence code that uses unloaded libraries may miss some highlighting. + +@item +The analysis assumes that lexical binding is in effect (@pxref{Selecting +Lisp Dialect,,, elisp, the Emacs Lisp Reference Manual}). + +@item +Semantic highlighting requires additional processing over traditional +Font Lock---the current implementation might be slow when editing very +large defuns. +@end itemize + @node Highlight Interactively @section Interactive Highlighting diff --git a/doc/emacs/emacs.texi b/doc/emacs/emacs.texi index b32c704bd12..28233110570 100644 --- a/doc/emacs/emacs.texi +++ b/doc/emacs/emacs.texi @@ -391,6 +391,7 @@ Controlling the Display Font Lock * Traditional Font Lock:: Font Lock based on regexps and syntax tables. * Parser-based Font Lock:: Font Lock based on external parser. 
+* Semantic Font Lock:: Font Lock based on semantic analysis. Searching and Replacement diff --git a/doc/misc/elisp-semantic-highlighting.org b/doc/misc/elisp-semantic-highlighting.org deleted file mode 100644 index 36886ae6535..00000000000 --- a/doc/misc/elisp-semantic-highlighting.org +++ /dev/null @@ -1,210 +0,0 @@ -#+TITLE: Semantic Highlighting for Emacs Lisp - -This document describes the semantic highlighting facility that Emacs -provides for Emacs Lisp (ELisp), and the ELisp code analysis that powers -this feature. - -* Semantic Highlighting - -The term "semantic highlighting" refers to a semi-advanced feature of -code editors, in which the editor uses some kind of semantic analysis to -understand a program's source code, and communicates useful information -about the meaning of different tokens to the user by highlighting these -tokens according to their specific role in the program. - -Semantic highlighting is more sophisticated than traditional "syntax -highlighting", which only considers the syntactic role of a token, -i.e. how it affects the code's /parsing/, unlike semantic analysis which -takes into account the token's effect on the program's /execution/. For -example, a semantic highlighting implementation may be able to tell -apart local and global variables and give distinct highlighting to each -category, even though the language's /syntax/ doesn't make such a -distinction. Semantic highlighting is especially beneficial in -languages in which syntactic constructs can mean completely different -things depending on the context in which they occur, such as Lisp and -Prolog. In such languages, syntactic analysis alone misses a lot of -important information that coders need to reason about their programs. - -* Highlighting ELisp - -Emacs implements semantic highlighting for Emacs Lisp as an optional -feature of =emacs-lisp-mode=. To enable it, set the option -=elisp-fontify-semantically= to non-nil. 
- -When this option is enabled, =emacs-lisp-mode= analyzes your code and -highlights symbols according to their semantic roles, as part of the -mode's usual =font-lock= highlighting. It doesn't effect the -highlighting of strings, comments and other syntactic elements such as -brackets; =elisp-fontify-semantically= only affects symbol highlighting. -Also note that this option assumes that lexical-binding is in effect. - -** Symbol Roles - -The semantic analysis assigns to each symbol a "symbol role", such as -=function=, =bound-variable=, =binding-variable=, =face=, etc. Each -symbol role has an associated face, which is applied to symbols with -that role during semantic highlighting. By default, most of these faces -inherit from appropriate =font-lock-*= faces. For example, -=binding-variable= symbols get the =elisp-binding-variable= face, which -inherits from =font-lock-variable-name-face=. - -To define new symbol roles, see the macro =elisp-scope-define-symbol-role=. - -** Helpful Annotations - -The analysis can differentiate between more than 50 symbol roles, but -you don't need to memorize the appearance of so many faces to leverage -semantic highlighting. During semantic highlighting, Emacs annotates -each highlighted symbol with a =help-echo= text property that describes -the role of that symbol, so you can see exactly which role was inferred -for a given symbol just by hovering over it with your mouse. You can -control these =help-echo= annotations by setting =elisp-add-help-echo=. - -** Bonus Feature: Highlighting Occurrences of the Local Variable at Point - -If you enable =cursor-sensor-mode= along with -=elisp-fontify-semantically=, then when you move point to a local -variable Emacs will apply special highlighting to all occurrences of -that variable in its local scope. This lets you see at a glance where a -certain local variable is used. 
- -* ELisp Code Analysis - -The analysis that powers =elisp-fontify-semantically= is implemented in -the library ~elisp-scope.el~. The entry point of the analysis in the -function =elisp-scope-analyze-form=, it takes a caller-provided callback -function which will be called to report the information we find about -each analyzed symbol: the callback gets the position and length of the -analyzed symbol, along with its inferred role and, for locally-bound -variables, the position of the binder. =elisp-scope-analyze-form= reads -a form from the current buffer, starting from point, using -=read-positioning-symbols= to attach position information to symbols. -It then recursively analyzes the form, reporting information about each -symbol it encounters via the caller-provided callback function. For -semantic highlighting, =elisp-scope-analyze-form= is called with a -callback that highlights each reported symbol during analysis. - -Hence, semantic highlighting always processes the whole top-level form -in one go, which might become slow for very large function definitions. -Please report such slowness to bug-gnu-emacs@gnu.org if you encounter it -so we can improve this aspect. - -Also note that, since semantic highlighting reads and analyzes forms, -for best results you should keep your code syntactically correct while -editing it, for example by using =electric-pair-mode=. - -** Recursive Form Analysis - -The core of the analysis that =elisp-scope-analyze-form= performs is -implemented in the recursive function =elisp-scope-1=, which analyzes an -sexp as an evaluated form, propagating contextual information such as -local variable bindings down to analyzed sub-forms. =elisp-scope-1= -takes two arguments: =form=, which is the form to analyze, and =spec=, -which is a specification of the expected value of =form= used to analyze -quoted data. The analysis proceeds as follows: - -- If =form= is a symbol, =elisp-scope-1= reports it as a variable. 
- See [[*Analyzing Variables][Analyzing Variables]] for details about the exact symbol roles used - for variables. - -- If =form= is a cons cell =(head . args)=, then the analysis depends on - =head=. =head= can have a bespoke "analyzer function" =af=, which is - called as =(af head . args)= and is responsible for (recursively) - analyzing =form=. The analyzer function can be associated to =head= - either locally, as an alist entry in =elisp-scope-local-definitions=, - or globally, via the symbol property =elisp-scope-analyzer=. - - An analyzer may use the functions =elisp-scope-report-s=, - =elisp-scope-1= and =elisp-scope-n= to analyze its arguments, and it - can consult the variable =elisp-scope-output-spec= to obtain the - expected output spec of the analyzed form. For example, the following - is a suitable analyzer for the `identity' function: - - #+begin_src emacs-lisp - (lambda (fsym arg) - (elisp-scope-report-s fsym 'function) - (elisp-scope-1 arg elisp-scope-output-spec)) - #+end_src - - In particular, the analyzer function of =quote= analyzes its argument - according to =elisp-scope-output-spec=, which is bound to the value of - the =spec= argument passed to =elisp-scope-1=. See [[*Analyzing Data][Analyzing Data]] for - more details about this analysis. - -- If =head= is a macro, normally it is expanded, and then the expanded - form is analyzed recursively. Since macro-expansion may involve - arbitrary code execution, only "safe" macro invocations are expanded: - If =head= is one of the macros in =elisp-scope-unsafe-macros=, then it - is never considered safe. Otherwise, =head= is safe if it specified - in the variable =elisp-scope-safe-macros=; or if it has a non-nil - =safe-macro= symbol property; or if the current buffer is trusted - according to =trusted-content-p=. - - If a macro =head= is not safe to expand (and has no associated - analyzer function), then the macro arguments =args= are not analyzed. 
- Hence semantic highlighting gives best results in trusted buffers, - where all macros can be expanded when needed. - -- If =head= is a function, it is reported as such, and =args= are - recursively analyzed as evaluated forms. - -- Otherwise, if =head= has no associated analyzer function, and it is - not a known macro or function, then it is reported with the =unknown= - symbol role. If the variable =elisp-scope-assume-func= is non-nil, - then unknown =head= is assumed to be a function call, and thus =args= - are analyzed as evaluated forms; otherwise =args= are not analyzed. - -** Analyzing Variables - -When =elisp-scope-1= encounters a variable reference =var=, it checks -whether =var= has a local binding in =elisp-scope-local-bindings=, and -whether =var= is a known special variable. If =var= is a locally-bound -special variable, =elisp-scope-1= reports the role =shadowed-variable=. -If =var= is locally-bound and not a special variable, it gets the role -=bound-variable=. Lastly, if it not locally-bound, then it gets the -role =free-variable=. - -** Analyzing Data - -When analyzer functions invoke =elisp-scope-1/n= to analyze some -sub-forms, they specify the =outspec= argument to convey information but -the expected value of the evaluated sub-form(s), so =elisp-scope-1/n= -will know what to do with a sub-form that is just (quoted) data. - -For example, the analyzer function for =face-attribute= calls -=elisp-scope-1= to analyze its first argument with an =outspec= which -says that a quoted symbol in this position refers to a face name. That -way, in a form such as =(face-attribute 'default :foreground)= the -symbol =default= is reported as a face reference (symbol role =face=). -Moreover, the =outspec= is passed down as appropriate through various -predefined analyzers, so every quoted symbol in a "tail position" of the -first argument to =face-attribute= will also be recognized as a face. 
-For instance, in the following form, both =success= and =error= are -reported as face references: - -#+begin_src emacs-lisp - (face-attribute (if (something-p) - 'success - (message "oops") - 'error) - :foreground) -#+end_src - -See also the docstring of =elisp-scope-1= for details about the format -of the =outspec= argument. - -* Takeaways - -- Set =elisp-fontify-semantically= to non-nil to enable semantic - highlighting for ELisp. -- It uses various =elisp-*= faces for the various symbol roles it - recognizes (function, macro, local/global variable...); most of these - faces inherit from appropriate =font-lock-*= faces. -- The current implementation can be slow when editing very large defuns. -- Syntax errors break semantic analysis, so =electric-pair-mode= or - similar is recommended. -- In untrusted buffers (as in =trusted-content-p=), some macro arguments - may not be highlighted. -- Highlighting is informed by definitions in the current Emacs session, - hence code that uses unloaded libraries may miss some highlighting. -- You can extend it with new analyzer functions and new symbol roles. diff --git a/etc/NEWS b/etc/NEWS index dde3b783877..ebdb0f4731d 100644 --- a/etc/NEWS +++ b/etc/NEWS @@ -1153,11 +1153,12 @@ the previous silence. ** ELisp mode ++++ *** Semantic highlighting support for Emacs Lisp. 'emacs-lisp-mode' can now use code analysis to highlight more symbols more accurately. Customize the new user option 'elisp-fontify-semantically' to non-nil to enable this feature, and see -its documentation for more information. +the Info node "(emacs) Semantic Font Lock" for more information. ** Text mode diff --git a/lisp/emacs-lisp/elisp-scope.el b/lisp/emacs-lisp/elisp-scope.el index 3b7e088d5d7..340b9f55558 100644 --- a/lisp/emacs-lisp/elisp-scope.el +++ b/lisp/emacs-lisp/elisp-scope.el @@ -21,9 +21,113 @@ ;;; Commentary: ;; This library implements an analysis that determines the role of each -;; symbol in ELisp code. 
The entry point for the analysis is the -;; function `elisp-scope-analyze-form', see its docstring for usage -;; information. +;; symbol in ELisp code. + +;; The analysis assigns to each symbol a "symbol role", such as +;; `function', `bound-variable', `binding-variable', `face', etc. Each +;; symbol role has associated properties, such as the `:face' property, +;; which specifies a face that is applied to symbols with that role when +;; using semantic highlighting with `elisp-fontify-semantically'. +;; To define new symbol roles, see `elisp-scope-define-symbol-role'. +;; +;; The entry point of the analysis is the function +;; `elisp-scope-analyze-form'. It takes a caller-provided callback +;; function which will be called to report the information we find about +;; each analyzed symbol: the callback gets the position and length of +;; the analyzed symbol, along with its inferred role and, for +;; locally-bound variables, the position of the binder. +;; `elisp-scope-analyze-form' reads a form from the current buffer, +;; starting from point, using `read-positioning-symbols' to attach +;; position information to symbols. It then recursively analyzes the +;; form, reporting information about each symbol it encounters via the +;; caller-provided callback function. +;; +;; The core of the analysis that `elisp-scope-analyze-form' performs is +;; implemented in the recursive function `elisp-scope-1', which analyzes +;; an sexp as an evaluated form, propagating contextual information such +;; as local variable bindings down to analyzed sub-forms. +;; `elisp-scope-1' takes two arguments: `form', which is the form to +;; analyze, and `outspec', which is a specification of the expected +;; value of `form' used to analyze quoted data. The analysis proceeds +;; as follows: +;; +;; - If `form' is a symbol, `elisp-scope-1' reports it as a variable. +;; +;; - If `form' is a cons cell (head . args), then the analysis depends +;; on `head'. 
`head' can have a bespoke "analyzer function" `af', +;; which is called as (af head . args) and is responsible for +;; (recursively) analyzing `form'. The analyzer function can be +;; associated to `head' either locally, as an alist entry in +;; `elisp-scope-local-definitions', or globally, via the symbol +;; property `elisp-scope-analyzer'. +;; +;; An analyzer may use the functions `elisp-scope-report-s', +;; `elisp-scope-1' and `elisp-scope-n' to analyze its arguments, and +;; it can consult the variable `elisp-scope-output-spec' to obtain the +;; expected output spec of the analyzed form. For example, the +;; following is a suitable analyzer for the `identity' function: +;; +;; (lambda (fsym arg) +;; (elisp-scope-report-s fsym 'function) +;; (elisp-scope-1 arg elisp-scope-output-spec)) +;; +;; In particular, the analyzer function of `quote' analyzes its +;; argument according to `elisp-scope-output-spec', which is bound to +;; the value of the `outspec' argument passed to `elisp-scope-1'. +;; +;; - If `head' is a macro, normally it is expanded, and then the +;; expanded form is analyzed recursively. Since macro-expansion may +;; involve arbitrary code execution, only "safe" macro invocations are +;; expanded: if `head' is one of the macros in +;; `elisp-scope-unsafe-macros', then it is never considered safe. +;; Otherwise, `head' is safe if it is specified in the variable +;; `elisp-scope-safe-macros'; or if it has a non-nil `safe-macro' +;; symbol property; or if the current buffer is trusted according to +;; `trusted-content-p'. If a macro `head' is not safe to expand (and +;; has no associated analyzer function), then the macro arguments +;; `args' are not analyzed. +;; +;; - If `head' is a function, it is reported as such, and `args' are +;; recursively analyzed as evaluated forms. +;; +;; - Otherwise, if `head' has no associated analyzer function, and it is +;; not a known macro or function, then it is reported with the `unknown' +;; symbol role. 
If the variable `elisp-scope-assume-func' is non-nil, +;; then unknown `head' is assumed to be a function call, and thus `args' +;; are analyzed as evaluated forms; otherwise `args' are not analyzed. +;; +;; When `elisp-scope-1' encounters a variable reference `var', it checks +;; whether `var' has a local binding in `elisp-scope-local-bindings', and +;; whether `var' is a known special variable. If `var' is a locally-bound +;; special variable, `elisp-scope-1' reports the role `shadowed-variable'. +;; If `var' is locally-bound and not a special variable, it gets the role +;; `bound-variable'. Lastly, if it is not locally-bound, then it gets the +;; role `free-variable'. +;; +;; When analyzer functions invoke `elisp-scope-1/n' to analyze some +;; sub-forms, they specify the `outspec' argument to convey information +;; about the expected value of the evaluated sub-form(s), so +;; `elisp-scope-1/n' will know what to do with a sub-form that is just +;; (quoted) data. For example, the analyzer function for +;; `face-attribute' calls `elisp-scope-1' to analyze its first argument +;; with an `outspec' which says that a quoted symbol in this position +;; refers to a face name. +;; That way, in a form such as (face-attribute 'default :foreground), +;; the symbol `default' is reported as a face reference (`face' role). +;; Moreover, the `outspec' is passed down as appropriate through various +;; predefined analyzers, so every quoted symbol in a "tail position" of +;; the first argument to `face-attribute' will also be recognized as a +;; face. For instance, in the following form, both `success' and +;; `error' are reported as face references: +;; +;; (face-attribute (if (something-p) +;; 'success +;; (message "oops") +;; 'error) +;; :foreground) +;; +;; See also the docstring of `elisp-scope-1' for details about the +;; format of the `outspec' argument. ;;; Code: