;;; matlab-syntax.el --- Manage MATLAB syntax tables and buffer parsing -*- lexical-binding: t -*-
;; Copyright (C) 2024 Free Software Foundation, Inc.
;; Author: <eludlam@mathworks.com>
;;
;; This program is free software; you can redistribute it and/or
;; modify it under the terms of the GNU General Public License as
;; published by the Free Software Foundation, either version 3 of the
;; License, or (at your option) any later version.
;; This program is distributed in the hope that it will be useful, but
;; WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
;; General Public License for more details.
;; You should have received a copy of the GNU General Public License
;; along with this program. If not, see https://www.gnu.org/licenses/.
;;; Commentary:
;;
;; Manage syntax handling for `matlab-mode'.
;; Matlab's syntax for comments and strings can't be handled by a standard
;; Emacs syntax table. This code handles the syntax table, and special
;; scanning needed to augment a buffer's syntax for all our special cases.
;;
;; This file also handles all the special parsing needed to support indentation,
;; block scanning, and the line.
(require 'matlab-compat)
;;; Code:
(defvar matlab-syntax-support-command-dual t
"Non-nil means to support command dual for indenting and syntax highlight.
Does not work well in classes with properties with datatypes.")
(make-variable-buffer-local 'matlab-syntax-support-command-dual)
(put 'matlab-syntax-support-command-dual 'safe-local-variable #'booleanp)
(defvar matlab-syntax-table
(let ((st (make-syntax-table (standard-syntax-table))))
;; Comment Handling:
;; Multiline comments: %{ text %}
;; Single line comments: % text (single char start)
;; Ellipsis comments: ... text (comment char is 1st char after 3rd dot)
;; ^ handled in `matlab--syntax-propertize'
(modify-syntax-entry ?% "< 13" st)
(modify-syntax-entry ?{ "(} 2c" st)
(modify-syntax-entry ?} "){ 4c" st)
(modify-syntax-entry ?\n ">" st)
;; String Handling:
;; Character vector: 'text'
;; String: "text"
;; These next syntaxes are handled with `matlab--syntax-propertize'
;; Transpose: varname'
;; Quoted quotes: ' don''t ' or " this "" "
;; Unterminated Char V: ' text
(modify-syntax-entry ?' "\"" st)
(modify-syntax-entry ?\" "\"" st)
;; Words and Symbols:
(modify-syntax-entry ?_ "_" st)
;; Punctuation:
(modify-syntax-entry ?\\ "." st)
(modify-syntax-entry ?\t " " st)
(modify-syntax-entry ?+ "." st)
(modify-syntax-entry ?- "." st)
(modify-syntax-entry ?* "." st)
(modify-syntax-entry ?/ "." st)
(modify-syntax-entry ?= "." st)
(modify-syntax-entry ?< "." st)
(modify-syntax-entry ?> "." st)
(modify-syntax-entry ?& "." st)
(modify-syntax-entry ?| "." st)
;; Parenthetical blocks:
;; Note: these are in standard syntax table, repeated here for completeness.
(modify-syntax-entry ?\( "()" st)
(modify-syntax-entry ?\) ")(" st)
(modify-syntax-entry ?\[ "(]" st)
(modify-syntax-entry ?\] ")[" st)
;;(modify-syntax-entry ?{ "(}" st) - Handled as part of comments
;;(modify-syntax-entry ?} "){" st)
st)
"MATLAB syntax table.")
(defvar matlab-navigation-syntax-table
(let ((st (copy-syntax-table matlab-syntax-table)))
;; Make _ a part of words so we can skip them better
(modify-syntax-entry ?_ "w" st)
st)
"The syntax table used when navigating blocks.")
(defmacro matlab-navigation-syntax (&rest forms)
"Set the current environment for syntax-navigation and execute FORMS."
(declare (indent 0))
(list 'let '((oldsyntax (syntax-table))
(case-fold-search nil))
(list 'unwind-protect
(list 'progn
'(set-syntax-table matlab-navigation-syntax-table)
(cons 'progn forms))
'(set-syntax-table oldsyntax))))
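;; Illustrative sketch, not part of the original API: the helper name
;; `matlab--example-identifier-end' is hypothetical.  It assumes that, with
;; `_' given word syntax inside `matlab-navigation-syntax', `forward-word'
;; steps over an identifier such as my_var in a single move.
(defun matlab--example-identifier-end (text)
  "Return the position after the first word in TEXT under navigation syntax."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    ;; The macro swaps in `matlab-navigation-syntax-table' and restores
    ;; the previous syntax table afterwards.
    (matlab-navigation-syntax
      (forward-word 1)
      (point))))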
(add-hook 'edebug-setup-hook
(lambda ()
(def-edebug-spec matlab-navigation-syntax def-body)))
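;; Illustrative sketch, not part of the original API: the helper name
;; `matlab--example-in-comment-p' is hypothetical.  Assuming only the flags
;; set in `matlab-syntax-table', a raw parse should already see both
;; "% text" and "%{ text %}" as comments, before any of the text property
;; augmentation described in the next section is applied.
(defun matlab--example-in-comment-p (text pos)
  "Return non-nil if POS in TEXT is inside a comment per `matlab-syntax-table'.
TEXT is parsed in a temporary buffer; no syntax-propertize scan is run."
  (with-temp-buffer
    (insert text)
    (with-syntax-table matlab-syntax-table
      ;; Slot 4 of the parse state is non-nil inside a comment.
      (nth 4 (parse-partial-sexp (point-min) pos)))))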
;;; Buffer Scanning for Syntax Table Augmentation
;;
;; To support all our special syntaxes via syntax-ppss (parse partial
;; sexp), we need to scan the buffer for patterns, and then leave
;; behind the hints pps needs to do the right thing.
;;
;; Support is broken up in these functions:
;; * matlab--put-char-category - Apply a syntax category to a character
;; * matlab--syntax-symbol - Create a syntax category symbol
;; * matlab--syntax-propertize - Used as `syntax-propertize-function' for
;; doing the buffer scan to augment syntaxes.
;; * matlab--scan-line-* - Scan for specific types of syntax occurrences.
(defun matlab--put-char-category (pos category)
"At character POS, put text CATEGORY."
(when (not (eobp))
(put-text-property pos (1+ pos) 'category category)
(put-text-property pos (1+ pos) 'mcm t))
)
(defmacro matlab--syntax-symbol (symbol syntax doc)
"Create a new SYMBOL with DOC used as a text property category with SYNTAX."
(declare (indent defun))
`(progn (defvar ,symbol ,syntax ,doc)
(set ',symbol ,syntax) ;; So you can re-eval it.
(put ',symbol 'syntax-table ,symbol)
))
(matlab--syntax-symbol matlab--command-dual-syntax '(15 . nil) ;; Generic string
"Syntax placed on the start and end of command dual text to treat it as a string.")
(put 'matlab--command-dual-syntax 'command-dual t) ;; Font-lock cookie
(matlab--syntax-symbol matlab--unterminated-string-syntax '(15 . nil) ;; Generic string end
"Syntax placed on end-of-line for unterminated strings.")
(put 'matlab--unterminated-string-syntax 'unterminated t) ;; Font-lock cookie
(matlab--syntax-symbol matlab--ellipsis-syntax (string-to-syntax "< ") ;; comment char
"Syntax placed on ellipsis to treat them as comments.")
(matlab--syntax-symbol matlab--not-block-comment-syntax (string-to-syntax "(}") ;; Just a regular open brace
"Syntax placed on the { of an invalid block comment start to treat it as a plain brace.")
(defun matlab--syntax-propertize (&optional start end)
"Scan region between START and END and mark special MATLAB syntax.
Only scans whole-lines, as MATLAB is a line-based language.
If region is not specified, scan the whole buffer.
See `matlab--scan-line-for-command-dual', `matlab--scan-line-for-ellipsis',
`matlab--scan-line-bad-blockcomment', and
`matlab--scan-line-for-unterminated-string' for specific details."
(save-match-data ;; avoid 'Syntax Checking transmuted the match-data'
(save-excursion
;; Scan region, but always expand to beginning of line
(goto-char (or start (point-min)))
(beginning-of-line)
;; Clear old properties
(remove-text-properties (point) (save-excursion (goto-char (or end (point-max)))
(end-of-line) (point))
'(category nil mcm nil))
;; Apply properties
(while (and (not (>= (point) (or end (point-max)))) (not (eobp)))
(when matlab-syntax-support-command-dual
;; Command line dual comes first to prevent wasting time
;; in later checks.
(beginning-of-line)
(when (matlab--scan-line-for-command-dual)
(matlab--put-char-category (point) 'matlab--command-dual-syntax)
(end-of-line)
(matlab--put-char-category (point) 'matlab--command-dual-syntax)
))
;; Multiple ellipsis can be on a line. Find them all
(beginning-of-line)
(while (matlab--scan-line-for-ellipsis)
;; Mark ellipsis as if a comment.
(matlab--put-char-category (point) 'matlab--ellipsis-syntax)
(forward-char 3)
)
;; Multiple invalid block comment starts possible. Find them all
(beginning-of-line)
(while (matlab--scan-line-bad-blockcomment)
;; Mark 2nd char as just an open brace, not part of a block comment start.
(matlab--put-char-category (point) 'matlab--not-block-comment-syntax)
)
;; Look for an unterminated string. Only one possible per line.
(beginning-of-line)
(when (matlab--scan-line-for-unterminated-string)
;; Mark this one char plus EOL as end of string.
(matlab--put-char-category (point) 'matlab--unterminated-string-syntax)
(end-of-line)
(matlab--put-char-category (point) 'matlab--unterminated-string-syntax))
(beginning-of-line)
(forward-line 1))
)))
(defconst matlab-syntax-commanddual-functions
'("warning" "disp" "cd"
;; debug
"dbstop" "dbclear"
;; Graphics
"print" "xlim" "ylim" "zlim" "grid" "hold" "box" "colormap" "axis")
"Functions that are commonly used with command line dual.")
(defconst matlab-cds-regex (regexp-opt matlab-syntax-commanddual-functions 'symbols))
(defun matlab--scan-line-for-command-dual (&optional debug)
"Scan this line for command line duality strings.
DEBUG is ignored."
(ignore debug)
;; Note - add \s$ b/c we'll add that syntax to the first letter, and it
;; might still be there during an edit!
(let ((case-fold-search nil))
(when (and (not (nth 9 (syntax-ppss (point))))
(looking-at
(concat "^\\s-*"
matlab-cds-regex
"\\s-+\\(\\s$\\|\\w\\|\\s_\\)")))
(goto-char (match-beginning 2)))))
(matlab--syntax-symbol matlab--transpose-syntax '(1 . nil) ;; 3 = symbol, 1 = punctuation
"Treat ' as non-string when used as transpose.")
(matlab--syntax-symbol matlab--quoted-string-syntax '(9 . nil) ;; 9 = escape in a string
"Treat '' or \"\" as not string delimiters when inside a string.")
(defun matlab--scan-line-for-unterminated-string (&optional debug)
"Scan this line for an unterminated string, leave cursor on starting string char.
DEBUG is ignored."
(ignore debug)
;; First, scan over all the string chars.
(save-restriction
(narrow-to-region (line-beginning-position) (line-end-position))
(beginning-of-line)
(condition-case nil
(while (re-search-forward "\\s\"\\|\\s<" nil t)
(let ((start-str (match-string 0))
(start-char (match-beginning 0)))
(forward-char -1)
(if (looking-at "\\s<")
(progn
(matlab--scan-line-comment-disable-strings)
(forward-comment 1))
;; Else, check for valid string
(if (or (bolp)
(string= start-str "\"")
(save-excursion
(forward-char -1)
(not (looking-at "\\(\\w\\|\\s_\\|\\s)\\|\"\\|\\.\\)"))))
(progn
;; Valid string start, try to skip the string
(forward-sexp 1)
;; If we just finished and we have a double of ourselves,
;; convert those doubles into punctuation.
(when (looking-at start-str)
(matlab--put-char-category (1- (point)) 'matlab--quoted-string-syntax)
;; and try again.
(goto-char start-char)
))
(when (string= start-str "'")
;; If it isn't valid string, it's just transpose or something.
;; convert to a symbol - as a VAR'', the second ' needs to think it
;; is not after punctuation.
(matlab--put-char-category (point) 'matlab--transpose-syntax))
;; Move forward 1.
(forward-char 1)
)))
nil)
(error
t))))
(defun matlab--scan-line-comment-disable-strings ()
"Disable bad string chars syntax from point to eol.
Called when comments found in `matlab--scan-line-for-unterminated-string'."
(save-excursion
(while (re-search-forward "\\s\"" nil t)
(save-excursion
(matlab--put-char-category (1- (point)) 'matlab--transpose-syntax)
))))
(defun matlab--scan-line-bad-blockcomment ()
"Scan this line for invalid block comment start."
(when (and (re-search-forward "%{" (line-end-position) t) (not (looking-at "\\s-*$")))
(goto-char (1- (match-end 0)))
t))
(defun matlab--scan-line-for-ellipsis ()
"Scan this line for an ellipsis."
(when (re-search-forward "\\.\\.\\." (line-end-position) t)
(goto-char (match-beginning 0))
t))
;;; Font Lock Support:
;;
;; The syntax specific font-lock support handles comments and strings.
;;
;; We'd like to support multiple kinds of strings and comments. To do
;; that we overload `font-lock-syntactic-face-function' with our own.
;; This does the same job as the original, except we scan the start
;; for special cookies left behind by `matlab--syntax-propertize' and
;; use that to choose different fonts.
(defun matlab--font-lock-syntactic-face (pps)
"Return the face to use for the syntax specified in PPS."
;; From the default in font-lock.
;; (if (nth 3 state) font-lock-string-face font-lock-comment-face)
(if (nth 3 pps)
;; This is a string. Check the start char to see if it was
;; marked as an unterminated string.
(cond ((get-text-property (nth 8 pps) 'unterminated)
'matlab-unterminated-string-face)
((get-text-property (nth 8 pps) 'command-dual)
'matlab-commanddual-string-face)
(t
'font-lock-string-face))
;; Not a string, must be a comment. Check to see if it is a
;; cellbreak comment.
(cond ((and (< (nth 8 pps) (point-max))
;; Use `eq' since `char-after' returns nil at end of buffer.
(eq (char-after (1+ (nth 8 pps))) ?\%))
'matlab-sections-section-break-face)
((and (< (nth 8 pps) (point-max))
(eq (char-after (1+ (nth 8 pps))) ?\#))
'matlab-pragma-face)
((and (< (nth 8 pps) (point-max))
(looking-at "\\^\\| \\$\\$\\$"))
'matlab-ignored-comment-face)
(t
'font-lock-comment-face))
))
;;; SETUP
;;
;; Connect our special logic into a running MATLAB Mode
;; replacing existing mechanics.
;;
;; Delete this if/when it becomes a permanent part of `matlab-mode'.
(defun matlab-syntax-setup ()
"Integrate our syntax handling into a running `matlab-mode' buffer.
Safe to use in `matlab-mode-hook'."
;; Syntax Table support
(set-syntax-table matlab-syntax-table)
(make-local-variable 'syntax-propertize-function)
(setq syntax-propertize-function 'matlab--syntax-propertize)
;; Comment handlers
(make-local-variable 'comment-start)
(make-local-variable 'comment-end)
(make-local-variable 'comment-start-skip)
(make-local-variable 'page-delimiter)
(setq comment-start "%"
comment-end ""
comment-start-skip "%\\s-+"
page-delimiter "^\\(\f\\|%%\\(\\s-\\|\n\\)\\)")
;; Other special regexps handling different kinds of syntax.
(make-local-variable 'paragraph-start)
(setq paragraph-start (concat "^$\\|" page-delimiter))
(make-local-variable 'paragraph-separate)
(setq paragraph-separate paragraph-start)
(make-local-variable 'paragraph-ignore-fill-prefix)
(setq paragraph-ignore-fill-prefix t)
;; Font lock
(make-local-variable 'font-lock-syntactic-face-function)
(setq font-lock-syntactic-face-function 'matlab--font-lock-syntactic-face)
)
;;; Syntax Testing for Strings and Comments
;;
;; These functions detect syntactic context based on the syntax table.
(defsubst matlab-cursor-in-string-or-comment ()
"Return non-nil if the cursor is in a valid MATLAB comment or string."
(nth 8 (syntax-ppss (point))))
(defsubst matlab-cursor-in-comment ()
"Return non-nil if the cursor is in a valid MATLAB comment."
(nth 4 (syntax-ppss (point))))
(defsubst matlab-cursor-in-string (&optional incomplete)
"Return non-nil if cursor is in a valid MATLAB character vector or string scalar.
The optional argument INCOMPLETE is obsolete and ignored.
Incomplete strings are always treated as strings."
(ignore incomplete)
(nth 3 (syntax-ppss (point))))
(defun matlab-cursor-comment-string-context (&optional bounds-sym)
"Return the comment/string context of cursor for the current line.
Return \\='comment if in a comment.
Return \\='string if in a string.
Return \\='charvector if in a character vector.
Return \\='ellipsis if after an ellipsis (...).
Return \\='commanddual if in text interpreted as string for command dual.
Return nil if none of the above.
Scans from the beginning of line to determine the context.
If optional BOUNDS-SYM is specified, set that symbol value to the
bounds of the string or comment the cursor is in."
(let* ((pps (syntax-ppss (point)))
(start (nth 8 pps))
(syntax nil))
;; Else, inside something if 'start' is set.
(when start
(save-match-data
(save-excursion
(goto-char start) ;; Prep for extra checks.
(setq syntax
(cond ((eq (nth 3 pps) t)
(cond ((= (following-char) ?')
'charvector)
((= (following-char) ?\")
'string)
(t
'commanddual)))
((eq (nth 3 pps) ?')
'charvector)
((eq (nth 3 pps) ?\")
'string)
((nth 4 pps)
(if (= (following-char) ?\%)
'comment
'ellipsis))
(t nil)))
;; compute the bounds
(when (and syntax bounds-sym)
(if (memq syntax '(charvector string))
;;(forward-sexp 1) - overridden - need primitive version
(goto-char (scan-sexps (point) 1))
(forward-comment 1)
(if (bolp) (forward-char -1)))
(set bounds-sym (list start (point))))
)))
;; Return the syntax
syntax))
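;; Illustrative sketch, not part of the original API: the helper name
;; `matlab--example-context-at' is hypothetical.  It assumes that calling
;; `matlab-syntax-setup' in a temporary buffer is enough for `syntax-ppss'
;; to run `matlab--syntax-propertize' before the context is computed.
(defun matlab--example-context-at (text pos)
  "Return the comment/string context at POS in TEXT."
  (with-temp-buffer
    (insert text)
    ;; Install the syntax table and `syntax-propertize-function'.
    (matlab-syntax-setup)
    (goto-char pos)
    (matlab-cursor-comment-string-context)))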
(defsubst matlab-beginning-of-string-or-comment (&optional all-comments)
"If the cursor is in a string or comment, move to the beginning.
Returns non-nil if the cursor is in a string or comment.
If optional ALL-COMMENTS is non-nil, also move back to the start of
the first of all adjacent comments."
(let* ((pps (syntax-ppss (point))))
(prog1
(when (nth 8 pps)
(goto-char (nth 8 pps))
t)
(when all-comments
(prog1
(forward-comment -100000)
(skip-chars-forward " \t\n\r"))))))
(defun matlab-end-of-string-or-comment (&optional all-comments)
"If the cursor is in a string or comment, move to the end.
If optional ALL-COMMENTS is non-nil, then also move over all
adjacent comments.
Returns non-nil if the cursor moved."
(let* ((pps (syntax-ppss (point)))
(start (point)))
(if (nth 8 pps)
(progn
;; syntax-ppss doesn't have the end, so go to the front
;; and then skip forward.
(goto-char (nth 8 pps))
(if (nth 3 pps)
(goto-char (scan-sexps (point) 1))
(forward-comment (if all-comments 100000 1))
(skip-chars-backward " \t\n\r"))
;; If the buffer is malformed, we might end up before starting pt.
;; so error.
(when (< (point) start)
(goto-char start)
(error "Error navigating syntax"))
t)
;; else not in comment, but still skip 'all-comments' if requested.
(when (and all-comments (looking-at "\\s-*\\s<"))
(forward-comment 100000)
(skip-chars-backward " \t\n\r")
t)
)))
;;; Navigating Lists
;;
;; MATLAB's lists are (), {}, [].
;; We used to need to do special stuff, but now I think this
;; is just a call straight to up-list.
(defun matlab-up-list (count)
"Move forwards or backwards up a list by COUNT.
When travelling backward, use `syntax-ppss' counted paren
starts to navigate upward.
When travelling forward, use `up-list' directly, but disable
comment and string crossing."
(save-restriction
(matlab-beginning-of-string-or-comment)
(if (< count 0)
(let ((pps (syntax-ppss)))
(when (< (nth 0 pps) (abs count))
(error "Cannot navigate up %d lists" (abs count)))
;; When travelling in reverse, we can just use pps'
;; parsed paren list in slot 9.
(let ((posn (reverse (nth 9 pps)))) ;; Location of parens
(goto-char (nth (1- (abs count)) posn))))
;; Else - travel forward
(up-list count nil t)) ;; will this correctly ignore comments, etc?
))
(defsubst matlab-in-list-p ()
"If the cursor is in a list, return positions of the beginnings of the lists.
Returns nil if not in a list."
(nth 9 (syntax-ppss (point))))
(defsubst matlab-beginning-of-outer-list ()
"If the cursor is in a list, move to the beginning of outermost list.
Returns non-nil if the cursor moved."
(let ((pps (syntax-ppss (point))))
(when (nth 9 pps) (goto-char (car (nth 9 pps))) )))
(defun matlab-end-of-outer-list ()
"If the cursor is in a list, move to the end of the outermost list.
Returns non-nil if the cursor moved."
(let ((pps (syntax-ppss (point)))
(start (point)))
(when (nth 9 pps)
;; syntax-ppss doesn't have the end, so go to the front
;; and then skip forward.
(goto-char (car (nth 9 pps)))
(goto-char (scan-sexps (point) 1))
;; This checks for malformed buffer content
;; that can cause this to go backwards.
(when (> start (point))
(goto-char start)
(error "Malformed List"))
;; Return non-nil as documented, matching `matlab-end-of-string-or-comment'.
t)))
;;; Useful checks for state around point.
;;
(defsubst matlab-syntax-keyword-as-variable-p ()
"Return non-nil if the current word is treated like a variable.
This could mean it is:
* Field of a structure
* Assigned from or into with ="
(or (save-excursion (skip-syntax-backward "w")
(skip-syntax-backward " ")
(or (= (preceding-char) ?\.)
(= (preceding-char) ?=)))
(save-excursion (skip-syntax-forward "w")
(skip-syntax-forward " ")
(= (following-char) ?=))))
(defsubst matlab-valid-keyword-syntax ()
"Return non-nil if cursor is not in a string, comment, or parens."
(let ((pps (syntax-ppss (point))))
(not (or (nth 8 pps) (nth 9 pps))))) ;; 8 == string/comment, 9 == parens
;;; Syntax Compat functions
;;
;; Left over old APIs. Delete these someday.
(defsubst matlab-move-simple-sexp-backward-internal (count)
"Move backward COUNT number of MATLAB sexps."
(let ((forward-sexp-function nil))
(forward-sexp (- count))))
(defsubst matlab-move-simple-sexp-internal (count)
"Move over one MATLAB sexp COUNT times.
If COUNT is negative, travel backward."
(let ((forward-sexp-function nil))
(forward-sexp count)))
(provide 'matlab-syntax)
;;; matlab-syntax.el ends here
;; LocalWords: Ludlam eludlam compat booleanp propertize varname defmacro oldsyntax progn edebug
;; LocalWords: ppss sexp pps defun eobp mcm blockcomment EOL defconst commanddual cds bolp eol
;; LocalWords: cellbreak setq defsubst charvector memq sexps posn parens