<?php
/**
 * Efficiently scan through block structure in document without parsing
 * the entire block tree and all of its JSON attributes into memory.
 *
 * @package WordPress
 * @subpackage Blocks
 * @since 6.9.0
 */

/**
 * Class for efficiently scanning through block structure in a document
 * without parsing the entire block tree and JSON attributes into memory.
 *
 * ## Overview
 *
 * This class is designed to help analyze and modify block structure in a
 * streaming fashion and to bridge the gap between parsed block trees and
 * the text representing them.
 *
 * Use-cases for this class include but are not limited to:
 *
 *  - Counting block types in a document.
 *  - Queuing stylesheets based on the presence of various block types.
 *  - Modifying blocks of a given type, e.g. migrations, updates, and styling.
 *  - Searching for content of specific kinds, e.g. checking for blocks
 *    with certain theme support attributes, or block bindings.
 *  - Adding CSS class names to the element wrapping a block’s inner blocks.
 *
 * > *Note!* If a fully-parsed block tree of a document is necessary, including
 * >         all the parsed JSON attributes, nested blocks, and HTML, consider
 * >         using {@see \parse_blocks()} instead which will parse the document
 * >         in one swift pass.
 *
 * For typical usage, jump first to the methods {@see self::next_block()},
 * {@see self::next_delimiter()}, or {@see self::next_token()}.
 *
 * ### Values
 *
 * As a lower-level interface than {@see parse_blocks()} this class follows
 * different performance-focused values:
 *
 *  - Minimize allocations so that documents of any size may be processed
 *    on a fixed or marginal amount of memory.
 *  - Make hidden costs explicit so that calling code only has to pay the
 *    performance penalty for features it needs.
 *  - Operate with a streaming and re-entrant design to make it possible
 *    to operate on chunks of a document and to resume after pausing.
 *
 * This means that some operations might appear more cumbersome than one
 * might expect. This design tradeoff opens up opportunity to wrap this in
 * a convenience class to add higher-level functionality.
 *
 * ## Concepts
 *
 * All text documents can be considered a block document containing a combination
 * of “freeform HTML” and explicit block structure. Block structure forms through
 * special HTML comments called _delimiters_ which include a block type and,
 * optionally, block attributes encoded as a JSON object payload.
 *
 * This processor is designed to scan through a block document from delimiter to
 * delimiter, tracking how the delimiters impact the structure of the document.
 * Spans of HTML appear between delimiters. If these spans exist at the top level
 * of the document, meaning there is no containing block around them, they are
 * considered freeform HTML content. If, however, they appear _inside_ block
 * structure they are interpreted as `innerHTML` for the containing block.
 *
 * ### Tokens and scanning
 *
 * As the processor scans through a document it reports information about the token
 * on which it pauses. Tokens represent spans of text in the input comprising block
 * delimiters and spans of HTML.
 *
 *  - {@see self::next_token()} visits every contiguous subspan of text in the
 *    input document. This includes all explicit block comment delimiters and spans
 *    of HTML content (whether freeform or inner HTML).
 *  - {@see self::next_delimiter()} visits every explicit block comment delimiter
 *    unless passed a block type which covers freeform HTML content. In these cases
 *    it will stop at top-level spans of HTML and report a `null` block type.
 *  - {@see self::next_block()} visits every block delimiter which _opens_ a block.
 *    This includes opening block delimiters as well as void block delimiters. With
 *    the same exception as above for freeform HTML block types, this will visit
 *    top-level spans of HTML content.
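 *
 * As a minimal sketch of how these scanning methods might be used, the following
 * tallies the blocks which open in a document. Here `$post_content` is an assumed
 * variable holding the block document.
 *
 * Example:
 *
 *     $counts    = array();
 *     $processor = new WP_Block_Processor( $post_content );
 *     while ( $processor->next_block() ) {
 *         // Block types are reported in fully-qualified form, e.g. `core/image`.
 *         $block_type            = $processor->get_block_type();
 *         $counts[ $block_type ] = ( $counts[ $block_type ] ?? 0 ) + 1;
 *     }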
 *
 * When matched on a particular token, the following methods provide structural
 * and textual information about it:
 *
 *  - {@see self::get_delimiter_type()} reports whether the delimiter is an opener,
 *    a closer, or if it represents a whole void block.
 *  - {@see self::get_block_type()} reports the fully-qualified block type which
 *    the delimiter represents.
 *  - {@see self::get_printable_block_type()} reports the fully-qualified block type,
 *    but returns `core/freeform` instead of `null` for top-level freeform HTML content.
 *  - {@see self::is_block_type()} indicates if the delimiter represents a block of
 *    the given block type, or wildcard or pseudo-block type described below.
 *  - {@see self::opens_block()} indicates if the delimiter opens a block of one
 *    of the provided block types. Opening, void, and top-level freeform HTML content
 *    all open blocks.
 *  - {@see static::get_attributes()} is currently reserved for a future streaming
 *    JSON parser class.
 *  - {@see self::allocate_and_return_parsed_attributes()} extracts the JSON attributes
 *    for delimiters which open blocks and returns the fully-parsed attributes as an
 *    associative array. {@see static::get_last_json_error()} for when this fails.
 *  - {@see self::is_html()} indicates if the token is a span of HTML which might
 *    be top-level freeform content or a block’s inner HTML.
 *  - {@see self::get_html_content()} returns the span of HTML.
 *  - {@see self::get_span()} reports the byte offset and length of the token within
 *    the input document.
 *
 * It’s possible for the processor to fail to scan forward if the input document ends
 * in a proper prefix of an explicit block comment delimiter. For example, if the input
 * ends in `<!-- wp:` then it _might_ be the start of another delimiter. The parser
 * cannot know, however, and therefore refuses to proceed. {@see static::get_last_error()}
 * to distinguish between a failure to find the next token and an incomplete input.
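 *
 * As a rough sketch of handling this case, calling code might check for an error
 * once scanning stops. This assumes that {@see static::get_last_error()} returns
 * `null` when no error occurred, and that `$chunk` is a hypothetical variable
 * holding a possibly-truncated portion of a document.
 *
 * Example:
 *
 *     $processor = new WP_Block_Processor( $chunk );
 *     while ( $processor->next_token() ) {
 *         // Process each token here.
 *     }
 *
 *     if ( null !== $processor->get_last_error() ) {
 *         // Scanning stopped early; the input may end in a partial delimiter.
 *     }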
 *
 * ### Block types
 *
 * A block’s “type” comprises an optional _namespace_ and _name_. If the namespace
 * isn’t provided it will be interpreted as the implicit `core` namespace. For example,
 * the type `gallery` is the name of the block in the `core` namespace, but the type
 * `abc/gallery` is the _fully-qualified_ block type for the block whose name is still
 * `gallery`, but in the `abc` namespace.
 *
 * Methods on this class are aware of this block naming semantic and anywhere a block
 * type is an argument to a method it will be normalized to account for implicit namespaces.
 * Passing `paragraph` is the same as passing `core/paragraph`. Conversely, anywhere
 * this class returns a block type, it will return the fully-qualified and normalized form.
 * For example, for the `<!-- wp:group -->` delimiter it will return `core/group` as the
 * block type.
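 *
 * As an illustrative sketch of this normalization, using a hypothetical document
 * containing a single group block:
 *
 * Example:
 *
 *     $processor = new WP_Block_Processor( '<!-- wp:group --><!-- /wp:group -->' );
 *     if ( $processor->next_block( 'group' ) ) {
 *         // Passing `group` is the same as passing `core/group`.
 *         $block_type = $processor->get_block_type(); // Reported as `core/group`.
 *     }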
 *
 * There are two special block types that change the behavior of the processor:
 *
 *  - The wildcard `*` represents _any block_. In addition to matching all block types,
 *    it also represents top-level freeform HTML whose block type is reported as `null`.
 *
 *  - The `core/freeform` block type is a pseudo-block type which explicitly matches
 *    top-level freeform HTML.
 *
 * These special block types can be passed into any method which searches for blocks.
 *
 * There is one additional special block type which may be returned from
 * {@see self::get_printable_block_type()}. This is the `#innerHTML` type, which
 * indicates that the HTML span on which the processor is paused is inner HTML for
 * a containing block.
 *
 * ### Spans of HTML
 *
 * Non-block content plays a complicated role in processing block documents. This
 * processor exposes tools to help work with these spans of HTML.
 *
 *  - {@see self::is_html()} indicates if the processor is paused at a span of
 *    HTML but does not differentiate between top-level freeform content and inner HTML.
 *  - {@see self::is_non_whitespace_html()} indicates not only if the processor
 *    is paused at a span of HTML, but also whether that span incorporates more than
 *    whitespace characters. Because block serialization often inserts newlines between
 *    block comment delimiters, this is useful for distinguishing “real” freeform
 *    content from purely aesthetic syntax.
 *  - {@see self::is_block_type()} matches top-level freeform HTML content when
 *    provided one of the special block types described above.
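 *
 * As a minimal sketch combining these tools, the following checks whether a document
 * contains top-level freeform content which is more than whitespace. Here `$post_content`
 * is an assumed variable holding the block document.
 *
 * Example:
 *
 *     $has_freeform = false;
 *     $processor    = new WP_Block_Processor( $post_content );
 *     while ( $processor->next_block( 'core/freeform' ) ) {
 *         // Skip spans which only contain whitespace between delimiters.
 *         if ( $processor->is_non_whitespace_html() ) {
 *             $has_freeform = true;
 *             break;
 *         }
 *     }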
 *
 * ### Block structure
 *
 * As the processor traverses block delimiters it maintains a stack of which blocks are
 * open at the given place in the document where it’s paused. This stack represents the
 * block structure of a document and is used to determine where blocks end, which blocks
 * represent inner blocks, whether a span of HTML is top-level freeform content, and
 * more. Investigate the stack with {@see self::get_breadcrumbs()}, which returns an
 * array of block types starting at the outermost-open block and descending to the
 * currently-visited block.
 *
 * Unlike {@see \parse_blocks()}, spans of HTML appear in this structure as the special
 * reported block type `#html`. Such a span represents inner HTML for a block if the
 * depth reported by {@see self::get_depth()} is greater than one.
 *
 * It will generally not be necessary to inspect the stack of open blocks, though
 * depth may be important for finding where blocks end. When visiting a block opener,
 * the depth will have been increased before pausing; in contrast the depth is
 * decremented before visiting a closer. This makes the following an easy way to
 * determine if a block is still open.
 *
 * Example:
 *
 *     $depth = $processor->get_depth();
 *     while ( $processor->next_token() && $processor->get_depth() > $depth ) {
 *         continue;
 *     }
 *     // Processor is now paused at the token immediately following the closed block.
 *
 * #### Extracting blocks
 *
 * A unique feature of this processor is the ability to return the same output as
 * {@see \parse_blocks()} would produce, but for a subset of the input document.
 * For example, it’s possible to extract an image block, manipulate that parsed
 * block, and re-serialize it into the original document. It’s possible to do so
 * while skipping over the parse of the rest of the document.
 *
 * {@see self::extract_full_block_and_advance()} will scan forward from the current block opener
 * and build the parsed block structure until the current block is closed. It will
 * include all inner HTML and inner blocks, and parse all of the inner blocks. It
 * can be used to extract a block at any depth in the document, helpful for operating
 * on blocks within nested structure.
 *
 * Example:
 *
 *     if ( ! $processor->next_block( 'gallery' ) ) {
 *         return $post_content;
 *     }
 *
 *     $gallery_at    = $processor->get_span()->start;
 *     $gallery_block = $processor->extract_full_block_and_advance();
 *     $after_gallery = $processor->get_span()->start;
 *     return (
 *         substr( $post_content, 0, $gallery_at ) .
 *         serialize_block( modify_gallery( $gallery_block ) ) .
 *         substr( $post_content, $after_gallery )
 *     );
 *
 * #### Handling of malformed structure
 *
 * There are situations where closing block delimiters appear for which no open block
 * exists, or where a document ends before a block is closed, or where a closing block
 * delimiter appears but references a different block type than the most-recently
 * opened block does. In all of these cases, the stack of open blocks should mirror
 * the behavior in {@see \parse_blocks()}.
 *
 * Unlike {@see \parse_blocks()}, however, this processor can still operate on the
 * invalid block delimiters. It provides a few functions which can be used for building
 * custom and non-spec-compliant error handling.
 *
 *  - {@see self::has_closing_flag()} indicates if the block delimiter contains the
 *    closing flag at the end. Some invalid block delimiters might contain both the
 *    void and closing flag, in which case {@see self::get_delimiter_type()} will
 *    report that it’s a void block.
 *  - {@see static::get_last_error()} indicates if the processor reached an invalid
 *    block closing. Depending on the context, {@see \parse_blocks()} might instead
 *    ignore the token or treat it as freeform HTML content.
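 *
 * As a sketch of custom error handling, calling code might look for delimiters
 * carrying both the void and closing flags. This assumes the delimiter type
 * constants such as {@see self::VOID} are accessible to calling code, and that
 * `$post_content` is a variable holding the block document.
 *
 * Example:
 *
 *     $processor = new WP_Block_Processor( $post_content );
 *     while ( $processor->next_delimiter() ) {
 *         if (
 *             WP_Block_Processor::VOID === $processor->get_delimiter_type() &&
 *             $processor->has_closing_flag()
 *         ) {
 *             // Reported as a void block, but the closing flag is also present.
 *             // Apply custom handling for the malformed delimiter here.
 *         }
 *     }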
 *
 * ## Static helpers
 *
 * This class provides helpers for performing semantic block-related operations.
 *
 *  - {@see self::normalize_block_type()} takes a block type with or without the
 *    implicit `core` namespace and returns a fully-qualified block type.
 *  - {@see self::are_equal_block_types()} indicates if two spans across one or
 *    more input texts represent the same fully-qualified block type.
 *
 * ## Subclassing
 *
 * This processor is designed to accurately parse a block document. Therefore, many
 * of its methods are not meant for subclassing. However, overall this class supports
 * building higher-level convenience classes which may choose to subclass it. For those
 * classes, avoid re-implementing methods except for the list below. Instead, create
 * new names representing the higher-level concepts being introduced. For example, instead
 * of creating a new method named `next_block()` which only advances to blocks of a given
 * kind, consider creating a new method named something like `next_layout_block()` which
 * won’t interfere with the base class method.
 *
 *  - {@see static::get_last_error()} may be reimplemented to report new errors in the subclass
 *    which aren’t intrinsic to block parsing.
 *  - {@see static::get_attributes()} may be reimplemented to provide a streaming interface
 *    to reading and modifying a block’s JSON attributes. It should be fast and memory efficient.
 *  - {@see static::get_last_json_error()} may be reimplemented to report new errors introduced
 *    with a reimplementation of {@see static::get_attributes()}.
 *
 * @since 6.9.0
 */
class WP_Block_Processor {
	/**
	 * Indicates if the last operation failed, otherwise
	 * will be `null` for success.
	 *
	 * @since 6.9.0
	 *
	 * @var string|null
	 */
	private $last_error = null;

	/**
	 * Indicates failures from decoding JSON attributes.
	 *
	 * @since 6.9.0
	 *
	 * @see \json_last_error()
	 *
	 * @var int
	 */
	private $last_json_error = JSON_ERROR_NONE;

	/**
	 * Source text provided to processor.
	 *
	 * @since 6.9.0
	 *
	 * @var string
	 */
	protected $source_text;

	/**
	 * Byte offset into source text where a matched delimiter starts.
	 *
	 * Example:
	 *
	 *          5    10   15   20   25   30   35   40   45   50
	 *     <!-- wp:group --><!-- wp:void /--><!-- /wp:group -->
	 *                      ╰─ Starts at byte offset 17.
	 *
	 * @since 6.9.0
	 *
	 * @var int
	 */
	private $matched_delimiter_at = 0;

	/**
	 * Byte length of full span of a matched delimiter.
	 *
	 * Example:
	 *
	 *          5    10   15   20   25   30   35   40   45   50
	 *     <!-- wp:group --><!-- wp:void /--><!-- /wp:group -->
	 *                      ╰───────────────╯
	 *                        17 bytes long.
	 *
	 * @since 6.9.0
	 *
	 * @var int
	 */
	private $matched_delimiter_length = 0;

	/**
	 * First byte offset into source text following any previously-matched delimiter.
	 * Used to indicate where an HTML span starts.
	 *
	 * Example:
	 *
	 *          5    10   15   20   25   30   35   40   45   50   55
	 *     <!-- wp:paragraph --><p>Content</p><⃨!⃨-⃨-⃨ ⃨/⃨w⃨p⃨:⃨p⃨a⃨r⃨a⃨g⃨r⃨a⃨p⃨h⃨ ⃨-⃨-⃨>⃨
	 *                          │             ╰─ This delimiter was matched, and after matching,
	 *                          │                revealed the preceding HTML span.
	 *                          │
	 *                          ╰─ The first byte offset after the previous matched delimiter
	 *                             is 21. Because the matched delimiter starts at 35, which is after
	 *                             this, a span of HTML must exist between these boundaries.
	 *
	 * @since 6.9.0
	 *
	 * @var int
	 */
	private $after_previous_delimiter = 0;

	/**
	 * Byte offset where namespace span begins.
	 *
	 * When no namespace is present, this will be the same as the starting
	 * byte offset for the block name.
	 *
	 * Example:
	 *
	 *     <!-- wp:core/gallery -->
	 *             │    ╰─ Name starts here.
	 *             ╰─ Namespace starts here.
	 *
	 *     <!-- wp:gallery -->
	 *             ├─ The namespace would start here but is implied as “core.”
	 *             ╰─ The name starts here.
	 *
	 * @since 6.9.0
	 *
	 * @var int
	 */
	private $namespace_at = 0;

	/**
	 * Byte offset where block name span begins.
	 *
	 * When no namespace is present, this will be the same as the starting
	 * byte offset for the block namespace.
	 *
	 * Example:
	 *
	 *     <!-- wp:core/gallery -->
	 *             │    ╰─ Name starts here.
	 *             ╰─ Namespace starts here.
	 *
	 *     <!-- wp:gallery -->
	 *             ├─ The namespace would start here but is implied as “core.”
	 *             ╰─ The name starts here.
	 *
	 * @since 6.9.0
	 *
	 * @var int
	 */
	private $name_at = 0;

	/**
	 * Byte length of block name span.
	 *
	 * Example:
	 *
	 *          5    10   15   20   25
	 *     <!-- wp:core/gallery -->
	 *                  ╰─────╯
	 *                7 bytes long.
	 *
	 * @since 6.9.0
	 *
	 * @var int
	 */
	private $name_length = 0;

	/**
	 * Whether the delimiter contains the block-closing flag.
	 *
	 * This may be erroneous if present within a void block,
	 * therefore {@see self::has_closing_flag()} can be used by
	 * calling code to perform custom error-handling.
	 *
	 * @since 6.9.0
	 *
	 * @var bool
	 */
	private $has_closing_flag = false;

	/**
	 * Byte offset where JSON attributes span begins.
	 *
	 * Example:
	 *
	 *          5    10   15   20   25   30   35   40
	 *     <!-- wp:paragraph {"dropCaps":true} -->
	 *                       ╰─ Starts at byte offset 18.
	 *
	 * @since 6.9.0
	 *
	 * @var int
	 */
	private $json_at;

	/**
	 * Byte length of JSON attributes span, or 0 if none are present.
	 *
	 * Example:
	 *
	 *          5    10   15   20   25   30   35   40
	 *     <!-- wp:paragraph {"dropCaps":true} -->
	 *                       ╰───────────────╯
	 *                         17 bytes long.
	 *
	 * @since 6.9.0
	 *
	 * @var int
	 */
	private $json_length = 0;

	/**
	 * Internal parser state, differentiating whether the instance is currently matched,
	 * on an implicit freeform node, in error, or ready to begin parsing.
	 *
	 * @see self::READY
	 * @see self::MATCHED
	 * @see self::HTML_SPAN
	 * @see self::INCOMPLETE_INPUT
	 * @see self::COMPLETE
	 *
	 * @since 6.9.0
	 *
	 * @var string
	 */
	protected $state = self::READY;

	/**
	 * Indicates what kind of block comment delimiter was matched.
	 *
	 * One of:
	 *
	 *  - {@see self::OPENER} If the delimiter is opening a block.
	 *  - {@see self::CLOSER} If the delimiter is closing an open block.
	 *  - {@see self::VOID}   If the delimiter represents a void block with no inner content.
	 *
	 * If a parsed comment delimiter contains both the closing and the void
	 * flags then it will be interpreted as a void block to match the behavior
	 * of the official block parser. This is a syntax error, however, and the
	 * delimiter probably ought to close an open block of the same name, if one is open.
	 *
	 * @since 6.9.0
	 *
	 * @var string
	 */
	private $type;

	/**
	 * Whether the last-matched delimiter acts like a void block and should be
	 * popped from the stack of open blocks as soon as the parser advances.
	 *
	 * This applies to void block delimiters and to HTML spans.
	 *
	 * @since 6.9.0
	 *
	 * @var bool
	 */
	private $was_void = false;

	/**
	 * For every open block, in hierarchical order, this stores the byte offset
	 * into the source text where the block type starts, including for HTML spans.
	 *
	 * To avoid allocating and normalizing block names when they aren’t requested,
	 * the stack of open blocks is stored as the byte offsets and byte lengths of
	 * each open block’s block type. This allows for minimal tracking and quick
	 * reading or comparison of block types when requested.
	 *
	 * @since 6.9.0
	 *
	 * @see self::$open_blocks_length
	 *
	 * @var int[]
	 */
	private $open_blocks_at = array();

	/**
	 * For every open block, in hierarchical order, this stores the byte length
	 * of the block’s block type in the source text. For HTML spans this is 0.
	 *
	 * @since 6.9.0
	 *
	 * @see self::$open_blocks_at
	 *
	 * @var int[]
	 */
	private $open_blocks_length = array();

	/**
	 * Indicates which operation should apply to the stack of open blocks after
	 * processing any pending spans of HTML.
	 *
	 * Since HTML spans are discovered after matching block delimiters, those
	 * delimiters need to defer modifying the stack of open blocks. This value,
	 * if set, indicates what operation should be applied. The properties
	 * associated with token boundaries still point to the delimiters even
	 * when processing HTML spans, so there’s no need to track them independently.
	 *
	 * @var 'push'|'void'|'pop'|null
	 */
	private $next_stack_op = null;

	/**
	 * Creates a new block processor.
	 *
	 * Example:
	 *
	 *     $processor = new WP_Block_Processor( $post_content );
	 *     if ( $processor->next_block( 'core/image' ) ) {
	 *         echo "Found an image!\n";
	 *     }
	 *
	 * @see self::next_block() to advance to the start of the next block (skips closers).
	 * @see self::next_delimiter() to advance to the next explicit block delimiter.
	 * @see self::next_token() to advance to the next block delimiter or HTML span.
	 *
	 * @since 6.9.0
	 *
	 * @param string $source_text Input document potentially containing block content.
	 */
	public function __construct( string $source_text ) {
		$this->source_text = $source_text;
	}

	/**
	 * Advance to the next block delimiter which opens a block, indicating if one was found.
	 *
	 * Delimiters which open blocks include opening and void block delimiters. To visit
	 * freeform HTML content, pass the wildcard “*” as the block type.
	 *
	 * Use this function to walk through the blocks in a document, pausing where they open.
	 *
	 * Example blocks:
	 *
	 *     // The first delimiter opens the paragraph block.
	 *     <⃨!⃨-⃨-⃨ ⃨w⃨p⃨:⃨p⃨a⃨r⃨a⃨g⃨r⃨a⃨p⃨h⃨ ⃨-⃨-⃨>⃨<p>Content</p><!-- /wp:paragraph-->
	 *
	 *     // The void block is the first opener in this sequence of closers.
	 *     <!-- /wp:group --><⃨!⃨-⃨-⃨ ⃨w⃨p⃨:⃨s⃨p⃨a⃨c⃨e⃨r⃨ ⃨{⃨"⃨h⃨e⃨i⃨g⃨h⃨t⃨"⃨:⃨"⃨2⃨0⃨0⃨p⃨x⃨"⃨}⃨ ⃨/⃨-⃨-⃨>⃨<!-- /wp:group -->
	 *
	 *     // If, however, `*` is provided as the block type, freeform content is matched.
	 *     <⃨h⃨2⃨>⃨M⃨y⃨ ⃨s⃨y⃨n⃨o⃨p⃨s⃨i⃨s⃨<⃨/⃨h⃨2⃨>⃨\⃨n⃨<!-- wp:my/table-of-contents /-->
	 *
	 *     // Inner HTML is never freeform content, and will not be matched even with the wildcard.
	 *     <!-- /wp:list-item --></ul><!-- /wp:list --><⃨!⃨-⃨-⃨ ⃨w⃨p⃨:⃨p⃨a⃨r⃨a⃨g⃨r⃨a⃨p⃨h⃨ ⃨-⃨-⃨>⃨<p>
	 *
	 * Example:
	 *
	 *     // Find all textual ranges of image block opening delimiters.
	 *     $images = array();
	 *     $processor = new WP_Block_Processor( $html );
	 *     while ( $processor->next_block( 'core/image' ) ) {
	 *         $images[] = $processor->get_span();
	 *     }
	 *
	 *  In some cases it may be useful to conditionally visit the implicit freeform
	 *  blocks, such as when determining if a post contains freeform content that
	 *  isn’t purely whitespace.
	 *
	 *  Example:
	 *
	 *      $seen_block_types = [];
	 *      $block_type       = '*';
	 *      $processor        = new WP_Block_Processor( $html );
	 *      while ( $processor->next_block( $block_type ) ) {
	 *          // Stop wasting time visiting freeform blocks after one has been found.
	 *          if (
	 *              '*' === $block_type &&
	 *              null === $processor->get_block_type() &&
	 *              $processor->is_non_whitespace_html()
	 *          ) {
	 *              $block_type = null;
	 *              $seen_block_types['core/freeform'] = true;
	 *              continue;
	 *          }
	 *
	 *          $seen_block_types[ $processor->get_block_type() ] = true;
	 *      }
	 *
	 * @since 6.9.0
	 *
	 * @see self::next_delimiter() to advance to the next explicit block delimiter.
	 * @see self::next_token() to advance to the next block delimiter or HTML span.
	 *
	 * @param string|null $block_type Optional. If provided, advance until a block of this type is found.
	 *                                Default is to stop at any block regardless of its type.
	 * @return bool Whether an opening delimiter for a block was found.
	 */
	public function next_block( ?string $block_type = null ): bool {
		while ( $this->next_delimiter( $block_type ) ) {
			if ( self::CLOSER !== $this->get_delimiter_type() ) {
				return true;
			}
		}

		return false;
	}

	/**
	 * Advance to the next block delimiter in a document, indicating if one was found.
	 *
	 * Delimiters may include invalid JSON. This parser does not attempt to parse the
	 * JSON attributes until requested; when invalid, the attributes will be null. This
	 * matches the behavior of {@see \parse_blocks()}. To visit freeform HTML content,
	 * pass the wildcard “*” as the block type.
	 *
	 * Use this function to walk through the block delimiters in a document.
	 *
	 * Example delimiters:
	 *
	 *     <!-- wp:paragraph {"dropCap": true} -->
	 *     <!-- wp:separator /-->
	 *     <!-- /wp:paragraph -->
	 *
	 *     // If the wildcard `*` is provided as the block type, freeform content is matched.
	 *     <⃨h⃨2⃨>⃨M⃨y⃨ ⃨s⃨y⃨n⃨o⃨p⃨s⃨i⃨s⃨<⃨/⃨h⃨2⃨>⃨\⃨n⃨<!-- wp:my/table-of-contents /-->
	 *
	 *     // Inner HTML is never freeform content, and will not be matched even with the wildcard.
	 *     ...</ul><⃨!⃨-⃨-⃨ ⃨/⃨w⃨p⃨:⃨l⃨i⃨s⃨t⃨ ⃨-⃨-⃨>⃨<!-- wp:paragraph --><p>
	 *
	 * Example:
	 *
	 *     $html      = "<!-- wp:void /-->\n<!-- wp:void /-->";
	 *     $processor = new WP_Block_Processor( $html );
	 *     while ( $processor->next_delimiter() ) {
	 *         // Runs twice, seeing both void blocks of type “core/void.”
	 *     }
	 *
	 *     $processor = new WP_Block_Processor( $html );
	 *     while ( $processor->next_delimiter( '*' ) ) {
	 *         // Runs thrice, seeing the void block, the newline span, and the void block.
	 *     }
	 *
	 * @since 6.9.0
	 *
	 * @param string|null $block_name Optional. Keep searching until a block of this name is found.
	 *                                Defaults to visit every block regardless of type.
	 * @return bool Whether a block delimiter was matched.
	 */
	public function next_delimiter( ?string $block_name = null ): bool {
		if ( ! isset( $block_name ) ) {
			while ( $this->next_token() ) {
				if ( ! $this->is_html() ) {
					return true;
				}
			}

			return false;
		}

		while ( $this->next_token() ) {
			if ( $this->is_block_type( $block_name ) ) {
				return true;
			}
		}

		return false;
	}

	/**
	 * Advance to the next block delimiter or HTML span in a document, indicating if one was found.
	 *
	 * This function steps through every syntactic chunk in a document. This includes explicit
	 * block comment delimiters, freeform non-block content, and inner HTML segments.
	 *
	 * Example tokens:
	 *
	 *     <!-- wp:paragraph {"dropCap": true} -->
	 *     <!-- wp:separator /-->
	 *     <!-- /wp:paragraph -->
	 *     <p>Normal HTML content</p>
	 *     Plaintext content too!
	 *
	 * Example:
	 *
	 *     // Find span containing wrapping HTML element surrounding inner blocks.
	 *     $processor = new WP_Block_Processor( $html );
	 *     if ( ! $processor->next_block( 'gallery' ) ) {
	 *         return null;
	 *     }
	 *
	 *     $containing_span = null;
	 *     while ( $processor->next_token() && $processor->is_html() ) {
	 *         $containing_span = $processor->get_span();
	 *     }
	 *
	 * This method will visit all HTML spans including those forming freeform non-block
	 * content as well as those which are part of a block’s inner HTML.
	 *
	 * @since 6.9.0
	 *
	 * @return bool Whether a token was matched. Returns `false` when no further token is found,
	 *              including when the end of the document is reached.
	 */
	public function next_token(): bool {
		if ( $this->last_error || self::COMPLETE === $this->state || self::INCOMPLETE_INPUT === $this->state ) {
			return false;
		}

		// Void tokens automatically pop off the stack of open blocks.
		if ( $this->was_void ) {
			array_pop( $this->open_blocks_at );
			array_pop( $this->open_blocks_length );
			$this->was_void = false;
		}

		$text = $this->source_text;
		$end  = strlen( $text );

		/*
		 * Because HTML spans are inferred after finding the next delimiter, it means that
		 * the parser must transition out of that HTML state and reuse the token boundaries
		 * it found after the HTML span. If those boundaries are before the end of the
		 * document it implies that a real delimiter was found; otherwise this must be the
		 * terminating HTML span and the parsing is complete.
		 */
		if ( self::HTML_SPAN === $this->state ) {
			if ( $this->matched_delimiter_at >= $end ) {
				$this->state = self::COMPLETE;
				return false;
			}

			switch ( $this->next_stack_op ) {
				case 'void':
					$this->was_void             = true;
					$this->open_blocks_at[]     = $this->namespace_at;
					$this->open_blocks_length[] = $this->name_at + $this->name_length - $this->namespace_at;
					break;

				case 'push':
					$this->open_blocks_at[]     = $this->namespace_at;
					$this->open_blocks_length[] = $this->name_at + $this->name_length - $this->namespace_at;
					break;

				case 'pop':
					array_pop( $this->open_blocks_at );
					array_pop( $this->open_blocks_length );
					break;
			}

			$this->next_stack_op = null;
			$this->state         = self::MATCHED;
			return true;
		}

		$this->state          = self::READY;
		$after_prev_delimiter = $this->matched_delimiter_at + $this->matched_delimiter_length;
		$at                   = $after_prev_delimiter;

		while ( $at < $end ) {
			/*
			 * Find the next possible start of a delimiter.
			 *
			 * This follows the behavior in the official block parser, which segments a post
			 * by the block comment delimiters. It is possible for an HTML attribute to contain
			 * what looks like a block comment delimiter but which is actually an HTML attribute
			 * value. In such a case, the parser here will break apart the HTML and create the
			 * block boundary inside the HTML attribute. In other words, the block parser
			 * isolates sections of HTML from each other, even if that leads to malformed markup.
			 *
			 * For a more robust parse, scan through the document with the HTML API and parse
			 * comments once they are matched to see if they are also block delimiters. In
			 * practice, this nuance has not caused any known problems since developing blocks.
			 *
			 * <⃨!⃨-⃨-⃨ /wp:core/paragraph {"dropCap":true} /-->
			 */
			$comment_opening_at = strpos( $text, '<!--', $at );

			/*
			 * Even if the start of a potential block delimiter is not found, the document
			 * might end in a prefix of such, and in that case there is incomplete input.
			 */
			if ( false === $comment_opening_at ) {
				if ( str_ends_with( $text, '<!-' ) ) {
					$backup = 3;
				} elseif ( str_ends_with( $text, '<!' ) ) {
					$backup = 2;
				} elseif ( str_ends_with( $text, '<' ) ) {
					$backup = 1;
				} else {
					$backup = 0;
				}

				// Whether or not there is a potential delimiter, there might be an HTML span.
				if ( $after_prev_delimiter < ( $end - $backup ) ) {
					$this->state                    = self::HTML_SPAN;
					$this->after_previous_delimiter = $after_prev_delimiter;
					$this->matched_delimiter_at     = $end - $backup;
					$this->matched_delimiter_length = $backup;
					$this->open_blocks_at[]         = $after_prev_delimiter;
					$this->open_blocks_length[]     = 0;
					$this->was_void                 = true;
					return true;
				}

				/*
				 * In the case that there is the start of an HTML comment, it means that there
				 * might be a block delimiter, but it’s not possible to know, therefore it’s incomplete.
				 */
				if ( $backup > 0 ) {
					goto incomplete;
				}

				// Otherwise this is the end.
				$this->state = self::COMPLETE;
				return false;
			}

			// <!-- ⃨/wp:core/paragraph {"dropCap":true} /-->
			$opening_whitespace_at = $comment_opening_at + 4;
			if ( $opening_whitespace_at >= $end ) {
				goto incomplete;
			}

			$opening_whitespace_length = strspn( $text, " \t\f\r\n", $opening_whitespace_at );

			/*
			 * The `wp` prefix cannot come before this point, but it may come after it
			 * depending on the presence of the closer. This is detected next.
			 */
			$wp_prefix_at = $opening_whitespace_at + $opening_whitespace_length;
			if ( $wp_prefix_at >= $end ) {
				goto incomplete;
			}

			if ( 0 === $opening_whitespace_length ) {
				$at = $this->find_html_comment_end( $comment_opening_at, $end );
				continue;
			}

			// <!-- /⃨wp:core/paragraph {"dropCap":true} /-->
			$has_closer = false;
			if ( '/' === $text[ $wp_prefix_at ] ) {
				$has_closer = true;
				++$wp_prefix_at;
			}

			// <!-- /w⃨p⃨:⃨core/paragraph {"dropCap":true} /-->
			if ( $wp_prefix_at < $end && 0 !== substr_compare( $text, 'wp:', $wp_prefix_at, 3 ) ) {
				if (
					( $wp_prefix_at + 2 >= $end && str_ends_with( $text, 'wp' ) ) ||
					( $wp_prefix_at + 1 >= $end && str_ends_with( $text, 'w' ) )
				) {
					goto incomplete;
				}

				$at = $this->find_html_comment_end( $comment_opening_at, $end );
				continue;
			}

			/*
			 * If the block contains no namespace, this span will end up masquerading as
			 * the block name. It’s easier to first detect the span and then determine
			 * if it’s a namespace or a name.
			 *
			 * <!-- /wp:c⃨o⃨r⃨e⃨/paragraph {"dropCap":true} /-->
			 */
			$namespace_at = $wp_prefix_at + 3;
			if ( $namespace_at >= $end ) {
				goto incomplete;
			}

			$start_of_namespace = $text[ $namespace_at ];

			// The namespace must start with a-z.
			if ( 'a' > $start_of_namespace || 'z' < $start_of_namespace ) {
				$at = $this->find_html_comment_end( $comment_opening_at, $end );
				continue;
			}

			$namespace_length = 1 + strspn( $text, 'abcdefghijklmnopqrstuvwxyz0123456789-_', $namespace_at + 1 );
			$separator_at     = $namespace_at + $namespace_length;
			if ( $separator_at >= $end ) {
				goto incomplete;
			}

			// <!-- /wp:core/⃨paragraph {"dropCap":true} /-->
			$has_separator = '/' === $text[ $separator_at ];
			if ( $has_separator ) {
				$name_at = $separator_at + 1;

				if ( $name_at >= $end ) {
					goto incomplete;
				}

				// <!-- /wp:core/p⃨a⃨r⃨a⃨g⃨r⃨a⃨p⃨h⃨ {"dropCap":true} /-->
				$start_of_name = $text[ $name_at ];
				if ( 'a' > $start_of_name || 'z' < $start_of_name ) {
					$at = $this->find_html_comment_end( $comment_opening_at, $end );
					continue;
				}

				$name_length = 1 + strspn( $text, 'abcdefghijklmnopqrstuvwxyz0123456789-_', $name_at + 1 );
			} else {
				$name_at     = $namespace_at;
				$name_length = $namespace_length;
			}

			if ( $name_at + $name_length >= $end ) {
				goto incomplete;
			}

			/*
			 * For this next section of the delimiter, it could be the JSON attributes
			 * or it could be the end of the comment. Assume that the JSON is there and
			 * update if it’s not.
			 */

			// <!-- /wp:core/paragraph ⃨{"dropCap":true} /-->
			$after_name_whitespace_at     = $name_at + $name_length;
			$after_name_whitespace_length = strspn( $text, " \t\f\r\n", $after_name_whitespace_at );
			$json_at                      = $after_name_whitespace_at + $after_name_whitespace_length;

			if ( $json_at >= $end ) {
				goto incomplete;
			}

			if ( 0 === $after_name_whitespace_length ) {
				$at = $this->find_html_comment_end( $comment_opening_at, $end );
				continue;
			}

			// <!-- /wp:core/paragraph {⃨"dropCap":true} /-->
			$has_json    = '{' === $text[ $json_at ];
			$json_length = 0;

			/*
			 * For the final span of the delimiter it's most efficient to find the end of the
			 * HTML comment and work backwards. This prevents complicated parsing inside the
			 * JSON span, which is not allowed to contain the HTML comment terminator.
			 *
			 * This also matches the behavior in the official block parser,
			 * even though it allows for matching invalid JSON content.
			 *
			 * <!-- /wp:core/paragraph {"dropCap":true} /-⃨-⃨>⃨
			 */
			$comment_closing_at = strpos( $text, '-->', $json_at );
			if ( false === $comment_closing_at ) {
				goto incomplete;
			}

			// <!-- /wp:core/paragraph {"dropCap":true} /⃨-->
			if ( '/' === $text[ $comment_closing_at - 1 ] ) {
				$has_void_flag    = true;
				$void_flag_length = 1;
			} else {
				$has_void_flag    = false;
				$void_flag_length = 0;
			}

			/*
			 * If there's no JSON, then the span of text after the name
			 * until the comment closing must be completely whitespace.
			 * Otherwise it’s a normal HTML comment.
			 */
			if ( ! $has_json ) {
				if ( $after_name_whitespace_at + $after_name_whitespace_length === $comment_closing_at - $void_flag_length ) {
					// This must be a block delimiter!
					$this->state = self::MATCHED;
					break;
				}

				$at = $this->find_html_comment_end( $comment_opening_at, $end );
				continue;
			}

			/*
			 * There's JSON, so attempt to find its boundary.
			 *
			 * @todo It’s likely faster to scan forward instead of in reverse.
			 *
			 * <!-- /wp:core/paragraph {"dropCap":true}⃨ ⃨/-->
			 */
			$after_json_whitespace_length = 0;
			for ( $char_at = $comment_closing_at - $void_flag_length - 1; $char_at > $json_at; $char_at-- ) {
				$char = $text[ $char_at ];

				switch ( $char ) {
					case ' ':
					case "\t":
					case "\f":
					case "\r":
					case "\n":
						++$after_json_whitespace_length;
						continue 2;

					case '}':
						$json_length = $char_at - $json_at + 1;
						break 2;

					default:
						++$at;
						continue 3;
				}
			}

			/*
			 * This covers cases where there is no terminating “}” or where
			 * mandatory whitespace is missing.
			 */
			if ( 0 === $json_length || 0 === $after_json_whitespace_length ) {
				$at = $this->find_html_comment_end( $comment_opening_at, $end );
				continue;
			}

			// This must be a block delimiter!
			$this->state = self::MATCHED;
			break;
		}

		// The end of the document was reached without a match.
		if ( self::MATCHED !== $this->state ) {
			$this->state = self::COMPLETE;
			return false;
		}

		/*
		 * From this point forward, a delimiter has been matched. There
		 * might also be an HTML span that appears before the delimiter.
		 */

		$this->after_previous_delimiter = $after_prev_delimiter;

		$this->matched_delimiter_at     = $comment_opening_at;
		$this->matched_delimiter_length = $comment_closing_at + 3 - $comment_opening_at;

		$this->namespace_at = $namespace_at;
		$this->name_at      = $name_at;
		$this->name_length  = $name_length;

		$this->json_at     = $json_at;
		$this->json_length = $json_length;

		/*
		 * When delimiters contain both the void flag and the closing flag
		 * they shall be interpreted as void blocks, per the spec parser.
		 */
		if ( $has_void_flag ) {
			$this->type          = self::VOID;
			$this->next_stack_op = 'void';
		} elseif ( $has_closer ) {
			$this->type          = self::CLOSER;
			$this->next_stack_op = 'pop';

			/*
			 * @todo Check if the name matches and bail according to the spec parser.
			 *       The default parser doesn’t examine the names.
			 */
		} else {
			$this->type          = self::OPENER;
			$this->next_stack_op = 'push';
		}

		$this->has_closing_flag = $has_closer;

		// HTML spans are visited before the delimiter that follows them.
		if ( $comment_opening_at > $after_prev_delimiter ) {
			$this->state                = self::HTML_SPAN;
			$this->open_blocks_at[]     = $after_prev_delimiter;
			$this->open_blocks_length[] = 0;
			$this->was_void             = true;

			return true;
		}

		// If there were no HTML spans then flush the enqueued stack operations immediately.
		switch ( $this->next_stack_op ) {
			case 'void':
				$this->was_void             = true;
				$this->open_blocks_at[]     = $namespace_at;
				$this->open_blocks_length[] = $name_at + $name_length - $namespace_at;
				break;

			case 'push':
				$this->open_blocks_at[]     = $namespace_at;
				$this->open_blocks_length[] = $name_at + $name_length - $namespace_at;
				break;

			case 'pop':
				array_pop( $this->open_blocks_at );
				array_pop( $this->open_blocks_length );
				break;
		}

		$this->next_stack_op = null;

		return true;

		incomplete:
		$this->state      = self::COMPLETE;
		$this->last_error = self::INCOMPLETE_INPUT;
		return false;
	}

	/**
	 * Returns an array containing the names of the currently-open blocks, in order
	 * from outermost to innermost, with HTML spans indicated as “#html”.
	 *
	 * Example:
	 *
	 *     // Freeform HTML content is an HTML span.
	 *     $processor = new WP_Block_Processor( 'Just text' );
	 *     $processor->next_token();
	 *     array( '#html' ) === $processor->get_breadcrumbs();
	 *
	 *     $processor = new WP_Block_Processor( '<!-- wp:a --><!-- wp:b --><!-- wp:c /--><!-- /wp:b --><!-- /wp:a -->' );
	 *     $processor->next_token();
	 *     array( 'core/a' ) === $processor->get_breadcrumbs();
	 *     $processor->next_token();
	 *     array( 'core/a', 'core/b' ) === $processor->get_breadcrumbs();
	 *     $processor->next_token();
	 *     // Void blocks are only open while visiting them.
	 *     array( 'core/a', 'core/b', 'core/c' ) === $processor->get_breadcrumbs();
	 *     $processor->next_token();
	 *     // Blocks are closed before visiting their closing delimiter.
	 *     array( 'core/a' ) === $processor->get_breadcrumbs();
	 *     $processor->next_token();
	 *     array() === $processor->get_breadcrumbs();
	 *
	 *     // Inner HTML is also an HTML span.
	 *     $processor = new WP_Block_Processor( '<!-- wp:a -->Inner HTML<!-- /wp:a -->' );
	 *     $processor->next_token();
	 *     $processor->next_token();
	 *     array( 'core/a', '#html' ) === $processor->get_breadcrumbs();
	 *
	 * @since 6.9.0
	 *
	 * @return string[]
	 */
	public function get_breadcrumbs(): array {
		$breadcrumbs = array_fill( 0, count( $this->open_blocks_at ), null );

		/*
		 * Since HTML spans can only be at the very end, set the normalized block name for
		 * each open element and then work backwards after creating the array. This allows
		 * for the elimination of a conditional on each iteration of the loop.
		 */
		foreach ( $this->open_blocks_at as $i => $at ) {
			$block_type        = substr( $this->source_text, $at, $this->open_blocks_length[ $i ] );
			$breadcrumbs[ $i ] = self::normalize_block_type( $block_type );
		}

		if ( isset( $i ) && 0 === $this->open_blocks_length[ $i ] ) {
			$breadcrumbs[ $i ] = '#html';
		}

		return $breadcrumbs;
	}

	/**
	 * Returns the depth of the open blocks where the processor is currently matched.
	 *
	 * Depth increases before visiting openers and void blocks and decreases before
	 * visiting closers. HTML spans behave like void blocks.
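	 *
	 * Example, as a sketch that follows the token order shown for {@see self::get_breadcrumbs()}:
	 *
	 *     $processor = new WP_Block_Processor( '<!-- wp:a --><!-- wp:b --><!-- wp:c /--><!-- /wp:b --><!-- /wp:a -->' );
	 *     $processor->next_token();
	 *     1 === $processor->get_depth();
	 *     $processor->next_token();
	 *     2 === $processor->get_depth();
	 *     $processor->next_token();
	 *     // Void blocks count toward the depth only while visiting them.
	 *     3 === $processor->get_depth();
	 *     $processor->next_token();
	 *     // Block “b” is already closed when its closing delimiter is visited.
	 *     1 === $processor->get_depth();
	 *     $processor->next_token();
	 *     0 === $processor->get_depth();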
	 *
	 * @since 6.9.0
	 *
	 * @return int
	 */
	public function get_depth(): int {
		return count( $this->open_blocks_at );
	}

	/**
	 * Extracts a block object, and all inner content, starting at a matched opening
	 * block delimiter, or at a matched top-level HTML span as freeform HTML content.
	 *
	 * Use this function to extract some blocks within a document, but not all. For example,
	 * one might want to find image galleries, parse them, modify them, and then reserialize
	 * them in place.
	 *
	 * Once this function returns, the parser will be matched on the token following
	 * the close of the given block.
	 *
	 * The return type of this method is compatible with the return of {@see \parse_blocks()}.
	 *
	 * Example:
	 *
	 *     $processor = new WP_Block_Processor( $post_content );
	 *     if ( ! $processor->next_block( 'gallery' ) ) {
	 *         return $post_content;
	 *     }
	 *
	 *     $gallery_at  = $processor->get_span()->start;
	 *     $gallery     = $processor->extract_full_block_and_advance();
	 *     $ends_before = $processor->get_span();
	 *     $ends_before = $ends_before->start ?? strlen( $post_content );
	 *
	 *     $new_gallery = update_gallery( $gallery );
	 *     $new_gallery = serialize_block( $new_gallery );
	 *
	 *     return (
	 *         substr( $post_content, 0, $gallery_at ) .
	 *         $new_gallery .
	 *         substr( $post_content, $ends_before )
	 *     );
	 *
	 * @since 6.9.0
	 *
	 * @return array[]|null {
	 *     Array of block structures.
	 *
	 *     @type array ...$0 {
	 *         An associative array of a single parsed block object. See WP_Block_Parser_Block.
	 *
	 *         @type string|null $blockName    Name of block.
	 *         @type array       $attrs        Attributes from block comment delimiters.
	 *         @type array[]     $innerBlocks  List of inner blocks. An array of arrays that
	 *                                         have the same structure as this one.
	 *         @type string      $innerHTML    HTML from inside block comment delimiters.
	 *         @type array       $innerContent List of string fragments and null markers where
	 *                                         inner blocks were found.
	 *     }
	 * }
	 */
	public function extract_full_block_and_advance(): ?array {
		if ( $this->is_html() ) {
			$chunk = $this->get_html_content();

			return array(
				'blockName'    => null,
				'attrs'        => array(),
				'innerBlocks'  => array(),
				'innerHTML'    => $chunk,
				'innerContent' => array( $chunk ),
			);
		}

		$block = array(
			'blockName'    => $this->get_block_type(),
			'attrs'        => $this->allocate_and_return_parsed_attributes() ?? array(),
			'innerBlocks'  => array(),
			'innerHTML'    => '',
			'innerContent' => array(),
		);

		$depth = $this->get_depth();
		while ( $this->next_token() && $this->get_depth() > $depth ) {
			if ( $this->is_html() ) {
				$chunk                   = $this->get_html_content();
				$block['innerHTML']     .= $chunk;
				$block['innerContent'][] = $chunk;
				continue;
			}

			/**
			 * Inner blocks.
			 *
			 * @todo This is a decent place to call {@link \render_block()}
			 * @todo Use iteration instead of recursion, or at least refactor to tail-call form.
			 */
			if ( $this->opens_block() ) {
				$inner_block             = $this->extract_full_block_and_advance();
				$block['innerBlocks'][]  = $inner_block;
				$block['innerContent'][] = null;
			}
		}

		return $block;
	}

	/**
	 * Returns the byte-offset after the ending character of an HTML comment,
	 * assuming the proper starting byte offset.
	 *
	 * @since 6.9.0
	 *
	 * @param int $comment_starting_at Where the HTML comment started, the leading `<`.
	 * @param int $search_end          Last offset in which to search, for limiting search span.
	 * @return int Offset after the current HTML comment ends, or `$search_end` if no end was found.
	 */
	private function find_html_comment_end( int $comment_starting_at, int $search_end ): int {
		$text = $this->source_text;

		// Find span-of-dashes comments which look like `<!----->`.
		$span_of_dashes = strspn( $text, '-', $comment_starting_at + 2 );
		if (
			$comment_starting_at + 2 + $span_of_dashes < $search_end &&
			'>' === $text[ $comment_starting_at + 2 + $span_of_dashes ]
		) {
			// The `>` sits at `$comment_starting_at + 2 + $span_of_dashes`; return the offset after it.
			return $comment_starting_at + 2 + $span_of_dashes + 1;
		}

		// Otherwise, there are other characters inside the comment, find the first `-->` or `--!>`.
		$now_at = $comment_starting_at + 4;
		while ( $now_at < $search_end ) {
			$dashes_at = strpos( $text, '--', $now_at );
			if ( false === $dashes_at ) {
				return $search_end;
			}

			$closer_must_be_at = $dashes_at + 2 + strspn( $text, '-', $dashes_at + 2 );
			if ( $closer_must_be_at < $search_end && '!' === $text[ $closer_must_be_at ] ) {
				++$closer_must_be_at;
			}

			if ( $closer_must_be_at < $search_end && '>' === $text[ $closer_must_be_at ] ) {
				return $closer_must_be_at + 1;
			}

			++$now_at;
		}

		return $search_end;
	}

	/**
	 * Returns the error from the last attempt to parse a block comment
	 * delimiter, or `null` if the last attempt succeeded.
	 *
	 * @since 6.9.0
	 *
	 * @return string|null Error from last attempt at parsing next block delimiter,
	 *                     or `null` if last attempt succeeded.
	 */
	public function get_last_error(): ?string {
		return $this->last_error;
	}

	/**
	 * Indicates if the last attempt to parse a block’s JSON attributes failed.
	 *
	 * @see \json_last_error()
	 *
	 * @since 6.9.0
	 *
	 * @return int JSON_ERROR_ code from last attempt to parse block JSON attributes.
	 */
	public function get_last_json_error(): int {
		return $this->last_json_error;
	}

	/**
	 * Returns the type of the block comment delimiter.
	 *
	 * One of:
	 *
	 *  - {@see self::OPENER}
	 *  - {@see self::CLOSER}
	 *  - {@see self::VOID}
	 *  - `null`
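	 *
	 * Example, as a sketch assuming a document that contains only delimiters:
	 *
	 *     $processor = new WP_Block_Processor( '<!-- wp:a --><!-- wp:b /--><!-- /wp:a -->' );
	 *     $processor->next_token();
	 *     WP_Block_Processor::OPENER === $processor->get_delimiter_type();
	 *     $processor->next_token();
	 *     WP_Block_Processor::VOID === $processor->get_delimiter_type();
	 *     $processor->next_token();
	 *     WP_Block_Processor::CLOSER === $processor->get_delimiter_type();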
	 *
	 * @since 6.9.0
	 *
	 * @return string|null Type of the block comment delimiter, if currently matched.
	 */
	public function get_delimiter_type(): ?string {
		switch ( $this->state ) {
			case self::HTML_SPAN:
				return self::VOID;

			case self::MATCHED:
				return $this->type;

			default:
				return null;
		}
	}

	/**
	 * Returns whether the delimiter contains the closing flag.
	 *
	 * This should be avoided except in cases of custom error-handling
	 * with block closers containing the void flag. For normative use,
	 * {@see self::get_delimiter_type()}.
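	 *
	 * Example, as a sketch of a delimiter carrying both the closing flag and the void flag:
	 *
	 *     $processor = new WP_Block_Processor( '<!-- /wp:a /-->' );
	 *     $processor->next_delimiter();
	 *     // The void flag takes precedence when determining the delimiter type.
	 *     WP_Block_Processor::VOID === $processor->get_delimiter_type();
	 *     true === $processor->has_closing_flag();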
	 *
	 * @since 6.9.0
	 *
	 * @return bool Whether the currently-matched block delimiter contains the closing flag.
	 */
	public function has_closing_flag(): bool {
		return $this->has_closing_flag;
	}

	/**
	 * Indicates if the block delimiter represents a block of the given type.
	 *
	 * Since the “core” namespace may be implicit, it’s allowable to pass
	 * either the fully-qualified block type, with namespace and block name,
	 * or the shorthand version containing only the block name, if the
	 * desired block is in the “core” namespace.
	 *
	 * Since freeform HTML content is non-block content, it has no block type.
	 * Passing the wildcard “*” will, however, return true for all block types,
	 * even the implicit freeform content, though not for spans of inner HTML.
	 *
	 * Example:
	 *
	 *     $is_core_paragraph = $processor->is_block_type( 'paragraph' );
	 *     $is_core_paragraph = $processor->is_block_type( 'core/paragraph' );
	 *     $is_formula        = $processor->is_block_type( 'math-block/formula' );
	 *
	 * @param string $block_type Block type name for the desired block.
	 *                           E.g. "paragraph", "core/paragraph", "math-block/formula".
	 * @return bool Whether this delimiter represents a block of the given type.
	 */
	public function is_block_type( string $block_type ): bool {
		if ( '*' === $block_type ) {
			return true;
		}

		// This is a core/freeform text block, it’s special.
		if ( $this->is_html() && 0 === ( $this->open_blocks_length[0] ?? null ) ) {
			return (
				'core/freeform' === $block_type ||
				'freeform' === $block_type
			);
		}

		return $this->are_equal_block_types( $this->source_text, $this->namespace_at, $this->name_at - $this->namespace_at + $this->name_length, $block_type, 0, strlen( $block_type ) );
	}

	/**
	 * Given two spans of text, indicate if they represent identical block types.
	 *
	 * This function normalizes block types to account for implicit core namespacing.
	 *
	 * Note! This function only returns valid results when the complete block types are
	 *       represented in the span offsets and lengths. This means that the full optional
	 *       namespace and block name must be represented in the input arguments.
	 *
	 * Example:
	 *
	 *              0    5   10   15   20   25   30   35   40
	 *     $text = '<!-- wp:block --><!-- /wp:core/block -->';
	 *
	 *     true  === WP_Block_Processor::are_equal_block_types( $text, 8, 5, $text, 26, 10 );
	 *     false === WP_Block_Processor::are_equal_block_types( $text, 8, 5, 'my/block', 0, 8 );
	 *
	 * @since 6.9.0
	 *
	 * @param string $a_text   Text in which first block type appears.
	 * @param int    $a_at     Byte offset into text in which first block type starts.
	 * @param int    $a_length Byte length of first block type.
	 * @param string $b_text   Text in which second block type appears (may be the same as the first text).
	 * @param int    $b_at     Byte offset into text in which second block type starts.
	 * @param int    $b_length Byte length of second block type.
	 * @return bool Whether the spans of text represent identical block types, normalized for namespacing.
	 */
	public static function are_equal_block_types( string $a_text, int $a_at, int $a_length, string $b_text, int $b_at, int $b_length ): bool {
		$a_ns_length = strcspn( $a_text, '/', $a_at, $a_length );
		$b_ns_length = strcspn( $b_text, '/', $b_at, $b_length );

		$a_has_ns = $a_ns_length !== $a_length;
		$b_has_ns = $b_ns_length !== $b_length;

		// Both contain namespaces.
		if ( $a_has_ns && $b_has_ns ) {
			if ( $a_length !== $b_length ) {
				return false;
			}

			$a_block_type = substr( $a_text, $a_at, $a_length );

			return 0 === substr_compare( $b_text, $a_block_type, $b_at, $b_length );
		}

		if ( $a_has_ns ) {
			$b_block_type = 'core/' . substr( $b_text, $b_at, $b_length );

			return (
				strlen( $b_block_type ) === $a_length &&
				0 === substr_compare( $a_text, $b_block_type, $a_at, $a_length )
			);
		}

		if ( $b_has_ns ) {
			$a_block_type = 'core/' . substr( $a_text, $a_at, $a_length );

			return (
				strlen( $a_block_type ) === $b_length &&
				0 === substr_compare( $b_text, $a_block_type, $b_at, $b_length )
			);
		}

		// Neither contains a namespace.
		if ( $a_length !== $b_length ) {
			return false;
		}

		$a_name = substr( $a_text, $a_at, $a_length );

		return 0 === substr_compare( $b_text, $a_name, $b_at, $b_length );
	}

	/**
	 * Indicates if the matched delimiter is an opening or void delimiter of the given type,
	 * if a type is provided, otherwise if it opens any block or implicit freeform HTML content.
	 *
	 * This is a helper method to ease handling of code inspecting where blocks start, and for
	 * checking if the blocks are of a given type. The function is variadic to allow for
	 * checking if the delimiter opens one of many possible block types.
	 *
	 * To advance to the start of a block {@see self::next_block()}.
	 *
	 * Example:
	 *
	 *     $processor = new WP_Block_Processor( $html );
	 *     while ( $processor->next_delimiter() ) {
	 *         if ( $processor->opens_block( 'core/code', 'syntaxhighlighter/code' ) ) {
	 *             echo "Found code!";
	 *             continue;
	 *         }
	 *
	 *         if ( $processor->opens_block( 'core/image' ) ) {
	 *             echo "Found an image!";
	 *             continue;
	 *         }
	 *
	 *         if ( $processor->opens_block() ) {
	 *             echo "Found a new block!";
	 *         }
	 *     }
	 *
	 * @since 6.9.0
	 *
	 * @see self::is_block_type()
	 *
	 * @param string[] $block_type Optional. Is the matched block type one of these?
	 *                             If none are provided, will not test block type.
	 * @return bool Whether the matched block delimiter opens a block, and whether it
	 *              opens a block of one of the given block types, if provided.
	 */
	public function opens_block( string ...$block_type ): bool {
		// HTML spans only open implicit freeform content at the top level.
		if ( self::HTML_SPAN === $this->state && 1 !== count( $this->open_blocks_at ) ) {
			return false;
		}

		/*
		 * Because HTML spans are discovered after the next delimiter is found,
		 * the delimiter type when visiting HTML spans refers to the type of the
		 * following delimiter. Therefore the HTML case is handled by checking
		 * the state and depth of the stack of open blocks.
		 */
		if ( self::CLOSER === $this->type && ! $this->is_html() ) {
			return false;
		}

		if ( count( $block_type ) === 0 ) {
			return true;
		}

		foreach ( $block_type as $block ) {
			if ( $this->is_block_type( $block ) ) {
				return true;
			}
		}

		return false;
	}

	/**
	 * Indicates if the matched delimiter is an HTML span.
	 *
	 * @since 6.9.0
	 *
	 * @see self::is_non_whitespace_html()
	 *
	 * @return bool Whether the processor is matched on an HTML span.
	 */
	public function is_html(): bool {
		return self::HTML_SPAN === $this->state;
	}

	/**
	 * Indicates if the matched delimiter is an HTML span and comprises more
	 * than whitespace characters, i.e. contains real content.
	 *
	 * Many block serializers introduce newlines between block delimiters,
	 * so the presence of top-level non-block content does not imply that
	 * there are “real” freeform HTML blocks. Checking if there is content
	 * beyond whitespace is a more certain check, such as for determining
	 * whether to load CSS for the freeform or fallback block type.
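	 *
	 * Example, as a sketch assuming two void blocks separated by a single newline:
	 *
	 *     $processor = new WP_Block_Processor( "<!-- wp:a /-->\n<!-- wp:b /-->" );
	 *     $processor->next_token();
	 *     // The newline between the blocks is visited as a top-level HTML span.
	 *     $processor->next_token();
	 *     true  === $processor->is_html();
	 *     false === $processor->is_non_whitespace_html();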
	 *
	 * @since 6.9.0
	 *
	 * @see self::is_html()
	 *
	 * @return bool Whether the currently-matched delimiter is an HTML
	 *              span containing non-whitespace text.
	 */
	public function is_non_whitespace_html(): bool {
		if ( ! $this->is_html() ) {
			return false;
		}

		$length = $this->matched_delimiter_at - $this->after_previous_delimiter;

		$whitespace_length = strspn(
			$this->source_text,
			" \t\f\r\n",
			$this->after_previous_delimiter,
			$length
		);

		return $whitespace_length !== $length;
	}

	/**
	 * Returns the string content of a matched HTML span, or `null` otherwise.
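	 *
	 * Example, as a sketch reusing the inner-HTML document from {@see self::get_breadcrumbs()}:
	 *
	 *     $processor = new WP_Block_Processor( '<!-- wp:a -->Inner HTML<!-- /wp:a -->' );
	 *     $processor->next_token();
	 *     // Matched on the opening delimiter, not on HTML.
	 *     null === $processor->get_html_content();
	 *     $processor->next_token();
	 *     'Inner HTML' === $processor->get_html_content();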
	 *
	 * @since 6.9.0
	 *
	 * @return string|null Raw HTML content, or `null` if not currently matched on HTML.
	 */
	public function get_html_content(): ?string {
		if ( ! $this->is_html() ) {
			return null;
		}

		return substr(
			$this->source_text,
			$this->after_previous_delimiter,
			$this->matched_delimiter_at - $this->after_previous_delimiter
		);
	}

	/**
	 * Allocates a substring for the block type and returns the fully-qualified
	 * name, including the namespace, if matched on a delimiter, otherwise `null`.
	 *
	 * This function is like {@see self::get_printable_block_type()} but when
	 * paused on a freeform HTML block, will return `null` instead of “core/freeform”.
	 * The `null` behavior matches what {@see \parse_blocks()} returns but may not
	 * be as useful as having a string value.
	 *
	 * This function allocates a substring for the given block type. This
	 * allocation will be small and likely fine in most cases, but it's
	 * preferable to call {@see self::is_block_type()} if only needing
	 * to know whether the delimiter is for a given block type, as that
	 * function is more efficient for this purpose and avoids the allocation.
	 *
	 * Example:
	 *
	 *     // Avoid.
	 *     'core/paragraph' === $processor->get_block_type();
	 *
	 *     // Prefer.
	 *     $processor->is_block_type( 'core/paragraph' );
	 *     $processor->is_block_type( 'paragraph' );
	 *     $processor->is_block_type( 'core/freeform' );
	 *
	 *     // Freeform HTML content has no block type.
	 *     $processor = new WP_Block_Processor( 'non-block content' );
	 *     $processor->next_token();
	 *     null === $processor->get_block_type();
	 *
	 * @since 6.9.0
	 *
	 * @see self::are_equal_block_types()
	 *
	 * @return string|null Fully-qualified block namespace and type, e.g. "core/paragraph",
	 *                     if matched on an explicit delimiter, otherwise `null`.
	 */
	public function get_block_type(): ?string {
		if (
			self::READY === $this->state ||
			self::COMPLETE === $this->state ||
			self::INCOMPLETE_INPUT === $this->state
		) {
			return null;
		}

		// This is a core/freeform text block, it’s special.
		if ( $this->is_html() ) {
			return null;
		}

		$block_type = substr( $this->source_text, $this->namespace_at, $this->name_at - $this->namespace_at + $this->name_length );
		return self::normalize_block_type( $block_type );
	}

	/**
	 * Allocates a printable substring for the block type and returns the fully-qualified
	 * name, including the namespace, if matched on a delimiter or freeform block, otherwise `null`.
	 *
	 * This function is like {@see self::get_block_type()} but when paused on a freeform
	 * HTML block, will return “core/freeform” instead of `null`. The `null` behavior matches
	 * what {@see \parse_blocks()} returns but may not be as useful as having a string value.
	 *
	 * This function allocates a substring for the given block type. This
	 * allocation will be small and likely fine in most cases, but it's
	 * preferable to call {@see self::is_block_type()} if only needing
	 * to know whether the delimiter is for a given block type, as that
	 * function is more efficient for this purpose and avoids the allocation.
	 *
	 * Example:
	 *
	 *     // Avoid.
	 *     'core/paragraph' === $processor->get_printable_block_type();
	 *
	 *     // Prefer.
	 *     $processor->is_block_type( 'core/paragraph' );
	 *     $processor->is_block_type( 'paragraph' );
	 *     $processor->is_block_type( 'core/freeform' );
	 *
	 *     // Freeform HTML content is given an implicit type.
	 *     $processor = new WP_Block_Processor( 'non-block content' );
	 *     $processor->next_token();
	 *     'core/freeform' === $processor->get_printable_block_type();
	 *
	 * @since 6.9.0
	 *
	 * @see self::are_equal_block_types()
	 *
	 * @return string|null Fully-qualified block namespace and type, e.g. "core/paragraph",
	 *                     if matched on an explicit delimiter or freeform block, otherwise `null`.
	 */
	public function get_printable_block_type(): ?string {
		if (
			self::READY === $this->state ||
			self::COMPLETE === $this->state ||
			self::INCOMPLETE_INPUT === $this->state
		) {
			return null;
		}

		// This is a core/freeform text block, it’s special.
		if ( $this->is_html() ) {
			return 1 === count( $this->open_blocks_at )
				? 'core/freeform'
				: '#innerHTML';
		}

		$block_type = substr( $this->source_text, $this->namespace_at, $this->name_at - $this->namespace_at + $this->name_length );
		return self::normalize_block_type( $block_type );
	}

	/**
	 * Normalizes a block name to ensure that missing implicit “core” namespaces are present.
	 *
	 * Example:
	 *
	 *     'core/paragraph' === WP_Block_Processor::normalize_block_type( 'paragraph' );
	 *     'core/paragraph' === WP_Block_Processor::normalize_block_type( 'core/paragraph' );
	 *     'my/paragraph'   === WP_Block_Processor::normalize_block_type( 'my/paragraph' );
	 *
	 * @since 6.9.0
	 *
	 * @param string $block_type Valid block name, potentially without a namespace.
	 * @return string Fully-qualified block type including namespace.
	 */
	public static function normalize_block_type( string $block_type ): string {
		return false === strpos( $block_type, '/' )
			? "core/{$block_type}"
			: $block_type;
	}

	/**
	 * Returns a lazy wrapper around the block attributes, which can be used
	 * for efficiently interacting with the JSON attributes.
	 *
	 * This stub hints that there should be a lazy interface for parsing
	 * block attributes but doesn’t define it. It serves both as a placeholder
	 * for one to come as well as a guard against implementing an eager
	 * function in its place.
	 *
	 * @throws Exception This function is a stub for subclasses to implement
	 *                   when providing streaming attribute parsing.
	 *
	 * @since 6.9.0
	 *
	 * @see self::allocate_and_return_parsed_attributes()
	 *
	 * @return never
	 */
	public function get_attributes() {
		throw new Exception( 'Lazy attribute parsing not yet supported' );
	}

	/**
	 * Attempts to parse and return the entire JSON attributes from the delimiter,
	 * allocating memory and processing the JSON span in the process.
	 *
	 * This does not return any parsed attributes for a closing block delimiter
	 * even if there is a span of JSON content; this JSON is a parsing error.
	 *
	 * Consider calling {@see static::get_attributes()} instead, once a lazy
	 * implementation is available, if it's not necessary to read all the
	 * attributes at the same time, as that is meant to provide a more
	 * efficient mechanism for typical use cases.
	 *
	 * Since the JSON span inside the comment delimiter may not be valid JSON,
	 * this function will return `null` if it cannot parse the span and set the
	 * {@see static::get_last_json_error()} to the appropriate JSON_ERROR_ constant.
	 *
	 * If the delimiter contains no JSON span, it will also return `null`,
	 * but the last error will be set to {@see \JSON_ERROR_NONE}.
	 *
	 * Example:
	 *
	 *     $processor = new WP_Block_Processor( '<!-- wp:image {"url": "https://wordpress.org/favicon.ico"} -->' );
	 *     $processor->next_delimiter();
	 *     $memory_hungry_and_slow_attributes = $processor->allocate_and_return_parsed_attributes();
	 *     $memory_hungry_and_slow_attributes === array( 'url' => 'https://wordpress.org/favicon.ico' );
	 *
	 *     $processor = new WP_Block_Processor( '<!-- /wp:image {"url": "https://wordpress.org/favicon.ico"} -->' );
	 *     $processor->next_delimiter();
	 *     null            === $processor->allocate_and_return_parsed_attributes();
	 *     JSON_ERROR_NONE === $processor->get_last_json_error();
	 *
	 *     $processor = new WP_Block_Processor( '<!-- wp:separator {} /-->' );
	 *     $processor->next_delimiter();
	 *     array() === $processor->allocate_and_return_parsed_attributes();
	 *
	 *     $processor = new WP_Block_Processor( '<!-- wp:separator /-->' );
	 *     $processor->next_delimiter();
	 *     null === $processor->allocate_and_return_parsed_attributes();
	 *
	 *     $processor = new WP_Block_Processor( '<!-- wp:image {"url} -->' );
	 *     $processor->next_delimiter();
	 *     null                 === $processor->allocate_and_return_parsed_attributes();
	 *     JSON_ERROR_CTRL_CHAR === $processor->get_last_json_error();
	 *
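	 * Because `null` is returned both when no attributes are available and when
	 * the JSON span fails to parse, callers may want to check the last error to
	 * tell the cases apart. A minimal sketch, using only methods from this class
	 * (the comments describe hypothetical handling):
	 *
	 *     $attributes = $processor->allocate_and_return_parsed_attributes();
	 *     if ( null === $attributes ) {
	 *         if ( JSON_ERROR_NONE !== $processor->get_last_json_error() ) {
	 *             // A JSON span was present but could not be parsed.
	 *         } else {
	 *             // No parse error was reported: e.g. a closer, an HTML span, or no JSON span.
	 *         }
	 *     }
	 *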
	 * @since 6.9.0
	 *
	 * @return array|null Parsed JSON attributes, if present and valid, otherwise `null`.
	 */
	public function allocate_and_return_parsed_attributes(): ?array {
		$this->last_json_error = JSON_ERROR_NONE;

		if ( self::CLOSER === $this->type || $this->is_html() || 0 === $this->json_length ) {
			return null;
		}

		$json_span = substr( $this->source_text, $this->json_at, $this->json_length );
		$parsed    = json_decode( $json_span, null, 512, JSON_OBJECT_AS_ARRAY | JSON_INVALID_UTF8_SUBSTITUTE );

		$last_error            = json_last_error();
		$this->last_json_error = $last_error;

		return ( JSON_ERROR_NONE === $last_error && is_array( $parsed ) )
			? $parsed
			: null;
	}

	/**
	 * Returns the span of the currently-matched token: the delimiter itself when
	 * matched on a delimiter, or the HTML between delimiters when matched on an
	 * HTML span; otherwise `null`.
	 *
	 * Example:
	 *
	 *     $processor = new WP_Block_Processor( '<!-- wp:void /-->' );
	 *     null     === $processor->get_span();
	 *
	 *     $processor->next_delimiter();
	 *     WP_HTML_Span( 0, 17 ) === $processor->get_span();
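	 *
	 * The returned span can be used to read the matched text back out of the
	 * source. A minimal sketch, assuming the caller kept the source in its own
	 * variable (`$html` here is illustrative) and using the public `start` and
	 * `length` properties of {@see WP_HTML_Span}:
	 *
	 *     $html      = '<!-- wp:void /-->';
	 *     $processor = new WP_Block_Processor( $html );
	 *     $processor->next_delimiter();
	 *     $span = $processor->get_span();
	 *     '<!-- wp:void /-->' === substr( $html, $span->start, $span->length );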
	 *
	 * @since 6.9.0
	 *
	 * @return WP_HTML_Span|null Span of text in source text spanning matched delimiter.
	 */
	public function get_span(): ?WP_HTML_Span {
		switch ( $this->state ) {
			case self::HTML_SPAN:
				return new WP_HTML_Span( $this->after_previous_delimiter, $this->matched_delimiter_at - $this->after_previous_delimiter );

			case self::MATCHED:
				return new WP_HTML_Span( $this->matched_delimiter_at, $this->matched_delimiter_length );

			default:
				return null;
		}
	}

	//
	// Constant declarations that would otherwise pollute the top of the class.
	//

	/**
	 * Indicates that the block comment delimiter closes an open block.
	 *
	 * @see self::$type
	 *
	 * @since 6.9.0
	 */
	const CLOSER = 'closer';

	/**
	 * Indicates that the block comment delimiter opens a block.
	 *
	 * @see self::$type
	 *
	 * @since 6.9.0
	 */
	const OPENER = 'opener';

	/**
	 * Indicates that the block comment delimiter represents a void block
	 * with no inner content of any kind.
	 *
	 * @see self::$type
	 *
	 * @since 6.9.0
	 */
	const VOID = 'void';

	/**
	 * Indicates that the processor is ready to start parsing but hasn’t yet begun.
	 *
	 * @see self::$state
	 *
	 * @since 6.9.0
	 */
	const READY = 'processor-ready';

	/**
	 * Indicates that the processor is matched on an explicit block delimiter.
	 *
	 * @see self::$state
	 *
	 * @since 6.9.0
	 */
	const MATCHED = 'processor-matched';

	/**
	 * Indicates that the processor is matched on the opening of an implicit freeform delimiter.
	 *
	 * @see self::$state
	 *
	 * @since 6.9.0
	 */
	const HTML_SPAN = 'processor-html-span';

	/**
	 * Indicates that the parser started parsing a block comment delimiter, but
	 * the input document ended before it could finish. The document was likely truncated.
	 *
	 * @see self::$state
	 *
	 * @since 6.9.0
	 */
	const INCOMPLETE_INPUT = 'incomplete-input';

	/**
	 * Indicates that the processor has finished parsing and has nothing left to scan.
	 *
	 * @see self::$state
	 *
	 * @since 6.9.0
	 */
	const COMPLETE = 'processor-complete';
}