emacs/test
Yuan Fu 1897da0b59
Add line-column tracking for tree-sitter
Add line-column tracking for tree-sitter parsers.  Copied from
comments in treesit.c:

   Technically we had to send tree-sitter the line and column
   position of each edit.  But in practice we just sent it dummy
   values, because tree-sitter doesn't use them for parsing and
   mostly just carries the line and column positions around and
   returns them when e.g. reporting node positions[1].  This
   worked fine until we encountered grammars that actually
   utilize the line and column information for
   parsing (Haskell)[2].

   [1] https://github.com/tree-sitter/tree-sitter/issues/445
   [2] https://github.com/tree-sitter/tree-sitter/issues/4001

   So now we have to keep track of line and column positions and
   pass valid values to tree-sitter.  (It adds quite some
   complexity, but only linearly; one can ignore all the linecol
   stuff when trying to understand treesit code and then come
   back to it later.)  Eli convinced me to disable tracking by
   default, and only enable it for languages that need it.  So
   the buffer starts out not tracking linecol.  And when a
   parser is created, if the language is in
   treesit-languages-require-line-column-tracking, we enable
   tracking in the buffer, and enable tracking for the parser.
   To simplify things, once a buffer starts tracking linecol, it
   never disables tracking, even if parsers that need tracking
   are all deleted.  For parsers, tracking is determined at
   creation time: whether a parser starts out tracking or
   non-tracking, it stays that way, regardless of later changes
   to treesit-languages-require-line-column-tracking.

   To make calculating line/column positions fast, we store
   linecol caches for begv, point, and zv in the
   buffer (buf->ts_linecol_cache_xxx); and in the parser object,
   we store linecol cache for visible beg/end of that parser.

   In buffer editing functions, we need the linecol for
   start/old_end/new_end.  Those can be calculated by scanning
   newlines (treesit_linecol_of_pos) from the buffer point
   cache, which should always be near point.  And we usually
   set the calculated linecol of new_end back to the buffer
   point cache.
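
As a rough illustration of the scan-from-cache idea described above, here is a minimal C sketch.  It is not the code in treesit.c: the struct layout, the function signature, and the conventions (1-based lines, 0-based byte columns, 0-based byte offsets into a flat string) are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached position; not Emacs's struct.  */
struct linecol_cache
{
  size_t line;    /* 1-based line number (assumed convention) */
  size_t col;     /* 0-based byte column (assumed convention) */
  size_t bytepos; /* 0-based byte offset that LINE/COL describe */
};

/* Return the line/column of TARGET in TEXT (LEN bytes long) by
   scanning newlines between CACHE and TARGET, in either direction.
   The cost is proportional to the distance from the cache.  */
struct linecol_cache
linecol_of_pos (const char *text, size_t len,
                struct linecol_cache cache, size_t target)
{
  assert (target <= len && cache.bytepos <= len);
  struct linecol_cache r = cache;

  if (target >= cache.bytepos)
    {
      /* Forward: each newline starts a new line at column 0.  */
      for (size_t i = cache.bytepos; i < target; i++)
        {
          if (text[i] == '\n')
            {
              r.line++;
              r.col = 0;
            }
          else
            r.col++;
        }
    }
  else
    {
      /* Backward: every newline in [target, cache) is one line up.  */
      for (size_t i = target; i < cache.bytepos; i++)
        if (text[i] == '\n')
          r.line--;
      /* The column is the distance back to the beginning of line.  */
      size_t bol = target;
      while (bol > 0 && text[bol - 1] != '\n')
        bol--;
      r.col = target - bol;
    }

  r.bytepos = target;
  return r;
}
```

The returned value can be stored back as the new cache, which mirrors how the linecol of new_end is written back to the buffer point cache after an edit.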

   We also need to calculate linecol for the visible_beg/end of
   each parser, and linecol for the buffer's begv/zv.  These
   positions are usually far from point, so we have caches for
   all of them (in either the parser object or the buffer).
   Scanning newlines from point all the way to them to get
   up-to-date linecol would be inefficient; but at the same
   time, because they're far away and outside the changed
   region, we can calculate their change in line and column
   number by simply counting how many newlines are
   added/removed in the changed
   region (compute_new_linecol_by_change).
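
The adjustment for a far-away position can be sketched as follows.  This is only an illustration of the arithmetic, not the actual compute_new_linecol_by_change: the struct, the function name shift_linecol_after_change, and its signature are hypothetical, and the sketch only handles a position at or after the old end of the change (a position before the start of the change needs no adjustment).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical line/column pair; signed so differences can be
   negative when text is deleted.  */
struct linecol
{
  long line;
  long col;
};

/* POS_LC is the cached linecol of a position at or after the old
   end of a change.  OLD_END_LC and NEW_END_LC are the linecol of
   the end of the changed region before and after the change.
   Return the updated linecol of the position, without scanning
   the text between the change and the position.  */
struct linecol
shift_linecol_after_change (struct linecol pos_lc,
                            struct linecol old_end_lc,
                            struct linecol new_end_lc)
{
  assert (pos_lc.line > old_end_lc.line
          || (pos_lc.line == old_end_lc.line
              && pos_lc.col >= old_end_lc.col));

  struct linecol r = pos_lc;
  /* The line shifts by the net number of newlines added/removed.  */
  r.line += new_end_lc.line - old_end_lc.line;
  /* The column only changes if the position shared a line with the
     old end of the change.  */
  if (pos_lc.line == old_end_lc.line)
    r.col += new_end_lc.col - old_end_lc.col;
  return r;
}
```

For example, inserting "X\nY" after the "a" in "ab\ncd\nef" gives old_end (1,1) and new_end (2,1); the end of the text moves from (3,2) to (4,2), and only the newline count inside the changed region was needed to work that out.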

* doc/lispref/parsing.texi (Using Parser): Mention line-column
tracking in manual.
* etc/NEWS: Add news.
* lisp/treesit.el:
(treesit-languages-need-line-column-tracking): New variable.
* src/buffer.c: Include treesit.h (for TREESIT_EMPTY_LINECOL).
(Fget_buffer_create):
(Fmake_indirect_buffer): Initialize new buffer fields.
(Fbuffer_swap_text): Add new buffer fields.
* src/buffer.h (ts_linecol): New struct.
(buffer): New buffer fields.
(BUF_TS_LINECOL_BEGV):
(BUF_TS_LINECOL_POINT):
(BUF_TS_LINECOL_ZV):
(SET_BUF_TS_LINECOL_BEGV):
(SET_BUF_TS_LINECOL_POINT):
(SET_BUF_TS_LINECOL_ZV): New inline functions.
* src/casefiddle.c (casify_region): Record linecol info.
* src/editfns.c (Fsubst_char_in_region):
(Ftranslate_region_internal):
(Ftranspose_regions): Record linecol info.
* src/insdel.c (insert_1_both):
(insert_from_string_1):
(insert_from_gap_1):
(insert_from_buffer):
(replace_range):
(del_range_2): Record linecol info.
* src/treesit.c (TREESIT_BOB_LINECOL):
(TREESIT_EMPTY_LINECOL):
(TREESIT_TS_POINT_1_0): New constants.
(treesit_debug_print_linecol):
(treesit_buf_tracks_linecol_p):
(restore_restriction_and_selective_display):
(treesit_count_lines):
(treesit_debug_validate_linecol):
(treesit_linecol_of_pos):
(treesit_make_ts_point):
(Ftreesit_tracking_line_column_p):
(Ftreesit_parser_tracking_line_column_p): New functions.
(treesit_tree_edit_1): Accept real TSPoint and pass to
tree-sitter.
(compute_new_linecol_by_change): New function.
(treesit_record_change_1): Rename from treesit_record_change,
handle linecol if tracking is enabled.
(treesit_linecol_maybe): New function.
(treesit_record_change): New wrapper around
treesit_record_change_1 that handles some boilerplate and sets
buffer state.
(treesit_sync_visible_region): Handle linecol if tracking is
enabled.
(make_treesit_parser): Setup parser's linecol cache if tracking
is enabled.
(Ftreesit_parser_create): Enable tracking if the parser's
language requires it.
(Ftreesit__linecol_at):
(Ftreesit__linecol_cache_set):
(Ftreesit__linecol_cache): New functions for debugging and
testing.
(syms_of_treesit): New variable
Vtreesit_languages_require_line_column_tracking.
* src/treesit.h (Lisp_TS_Parser): New fields.
(TREESIT_BOB_LINECOL):
(TREESIT_EMPTY_LINECOL): New constants.
* test/src/treesit-tests.el (treesit-linecol-basic):
(treesit-linecol-search-back-across-newline):
(treesit-linecol-col-same-line):
(treesit-linecol-enable-disable): New tests.
* src/lisp.h: Declare display_count_lines.
* src/xdisp.c (display_count_lines): Remove static keyword.
2025-05-03 22:14:03 -07:00
data New tests for nested archives (bug#70987) 2024-05-20 09:22:41 +03:00
infra Avoid adding duplicate items to 'treesit-language-source-alist'. 2025-05-01 20:55:33 +03:00
lib-src Fix a number of ERT tests for execution on Android 2025-02-25 19:13:24 +08:00
lisp Fix 'Skip' behavior in erts files (bug#76839) 2025-05-03 10:31:04 +03:00
manual ; Fix last change. 2025-04-03 09:12:07 +03:00
misc Update copyright year to 2025 2025-01-02 18:39:42 +01:00
src Add line-column tracking for tree-sitter 2025-05-03 22:14:03 -07:00
ChangeLog.1 ; Delete troff markers from ChangeLog files 2025-02-20 02:46:43 +01:00
file-organization.org Run admin/cus-tests.el tests from test suite 2021-02-21 20:20:40 +01:00
Makefile.in Exclude files under `infra' from automatic testing 2025-03-02 19:11:10 +08:00
README Update copyright year to 2025 2025-01-02 18:39:42 +01:00


Copyright (C) 2008-2025 Free Software Foundation, Inc.
See the end of the file for license conditions.

This directory contains files intended to test various aspects of
Emacs's functionality.  Please help add tests!

See the file file-organization.org for the details of the directory
structure and file-naming conventions.

For tests in the manual/ subdirectory, look there for separate README
files, or look for instructions in the test files themselves.

Emacs uses ERT, Emacs Lisp Regression Testing, for testing.  See (info
"(ert)") or https://www.gnu.org/software/emacs/manual/html_node/ert/
for more information on writing and running tests.

Tests can be tagged by the developer.  In this test directory, the
following tags are recognized:

* :expensive-test
  The test needs a serious amount of time to run.  It is not intended
  to run on a regular basis by users.  Instead, it runs on demand
  only, or during regression tests.

* :nativecomp
  The test runs only if Emacs is configured with Lisp native compiler
  support.

* :unstable
  The test is under development.  It shall run on demand only.

The Makefile sets the environment variable $EMACS_TEST_DIRECTORY,
which points to this directory.  This environment variable does not
exist when the tests are run outside make.  The Makefile supports the
following targets:

* make check
  Run all tests as defined in the directory.  Expensive and unstable
  tests are suppressed.  The result of the tests for <filename>.el is
  stored in <filename>.log.

* make check-maybe
  Like "make check", but run only the tests for files which have
  unresolved prerequisites.

* make check-expensive
  Like "make check", but run also the tests marked as expensive.

* make check-all
  Like "make check", but run all tests.

* make check-<dirname>
  Like "make check", but run only the tests in test/<dirname>/*.el.
  <dirname> is a relative directory path with "/" replaced by "-",
  as in "check-src" or "check-lisp-net".

* make <filename>  -or-  make <filename>.log
  Run all tests declared in <filename>.el.  This includes expensive
  tests.  In the former case the output is shown on the terminal, in
  the latter case the output is written to <filename>.log.

<filename> can be either a relative file name like
"lisp/files-tests", or a package name like "files-tests".

ERT offers selectors, which make it possible to filter which test
cases shall run.  The make variable $(SELECTOR) gives you a simple
means to use your own selectors.  The ERT manual describes how
selectors are constructed, see (info "(ert)Test Selectors") or
https://www.gnu.org/software/emacs/manual/html_node/ert/Test-Selectors.html

You can use predefined selectors of the Makefile.  "make <filename>
SELECTOR='$(SELECTOR_DEFAULT)'" runs all tests for <filename>.el
except the tests tagged as expensive or unstable.  Other predefined
selectors are $(SELECTOR_EXPENSIVE) (run all tests except unstable
ones) and $(SELECTOR_ALL) (run all tests).

If your test file contains the tests "test-foo", "test2-foo" and
"test-foo-remote", and you want to run only the first two tests, you
can use a selector regexp (note that the "$" needs to be doubled to
protect against "make" variable expansion):

    make <filename> SELECTOR='"foo$$"'

In case you want to use the symbol name of a test as selector, you can
use it directly:

    make <filename> SELECTOR='test-foo-remote'

Note that although the test files are always compiled (unless they set
no-byte-compile), the source files will be run when expensive or
unstable tests are involved, to give nicer backtraces.  To run the
compiled version of a test use

    make TEST_LOAD_EL=no ...

Some tests might take a long time to run.  In order to summarize the
<nn> tests with the longest duration, call

    make SUMMARIZE_TESTS=<nn> ...

The backtraces of failing tests are truncated to the default value of
'ert-batch-backtrace-right-margin'.  To see more of the backtrace, use

    make TEST_BACKTRACE_LINE_LENGTH=<nn> ...

The tests are run in batch mode by default; sometimes it's useful to
get precisely the same environment but run in interactive mode for
debugging.  To do that, use

    make TEST_INTERACTIVE=yes ...

Sometimes further settings are needed in order to run the batch
test.  These can be passed via the $EMACS_EXTRAOPT environment
variable, like

    make ... EMACS_EXTRAOPT="--eval '(setopt ert-batch-print-length nil ert-batch-print-level nil)'"

By default, ERT test failure summaries are quite brief in batch
mode--only the names of the failed tests are listed.  If the
$EMACS_TEST_VERBOSE environment variable is set and non-empty, the
failure summaries will also include the data from the failing test.

If the $EMACS_TEST_JUNIT_REPORT environment variable is set to a file
name, a JUnit test report is generated under this name.

Some of the tests require a remote temporary directory
(autorevert-tests.el, dnd-tests.el, eglot-tests.el, filenotify-tests.el,
shadowfile-tests.el and tramp-tests.el).  By default, a mock-up
connection method is used (this might not be possible when running on
MS Windows).  If you want to test a real remote connection, set
$REMOTE_TEMPORARY_FILE_DIRECTORY to a suitable value in order to
override the default value:

    env REMOTE_TEMPORARY_FILE_DIRECTORY=/ssh:host:/tmp make ...


There are also continuous integration tests on
<https://hydra.nixos.org/jobset/gnu/emacs-trunk> (see
admin/notes/hydra) and <https://emba.gnu.org/emacs/emacs> (see
admin/notes/emba).  Each environment provides an environment variable
which can be used to determine whether the tests are running in that
environment.

$EMACS_HYDRA_CI indicates the hydra environment, and $EMACS_EMBA_CI
indicates the emba environment, respectively.

If tests in these environments take too long and a core dump is
needed for further analysis, the environment variable
$EMACS_TEST_TIMEOUT can set a limit (in seconds) after which this
shall happen.


(Also, see etc/compilation.txt for compilation mode font lock tests
and etc/grep.txt for grep mode font lock tests.)


This file is part of GNU Emacs.

GNU Emacs is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

GNU Emacs is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.