//===----------------------------------------------------------------------===//
//
// This source file is part of the Swift.org open source project
//
// Copyright (c) 2021-2022 Apple Inc. and the Swift project authors
// Licensed under Apache License v2.0 with Runtime Library Exception
//
// See https://swift.org/LICENSE.txt for license information
//
//===----------------------------------------------------------------------===//
// MARK: - Pitched Compiler/library interface
/// Conform to this to support regex literals
protocol ExpressibleByRegexLiteral {
  associatedtype Builder: RegexLiteralBuilderProtocol
  init(builder: Builder)
}
/// Builder conforms to this for compiler-library API
protocol RegexLiteralBuilderProtocol {
  /// Opaquely identify something built
  associatedtype ASTNodeId = UInt

  /// NOTE: This will likely not be a requirement but could be ad-hoc name lookup
  mutating func buildCharacterClass_d() -> ASTNodeId

  /// Any post-processing or partial compilation
  ///
  /// NOTE: we might want to make this `throws`, capable of communicating
  /// compilation failure if conformer is constant evaluable.
  mutating func finalize()
}
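
// Illustrative sketch only, not part of the pitched interface.
//
// The protocol above implies that the compiler lowers a literal into a series
// of `build...` calls followed by `finalize()`. A minimal conformer, assuming
// node ids are simply indices into a recorded list, might look like this.
// `RecordingRegexLiteralBuilder`, `nodes`, and `isFinalized` are hypothetical
// names introduced for illustration.
struct RecordingRegexLiteralBuilder: RegexLiteralBuilderProtocol {
  /// Textual form of each node built so far, in build order (assumption).
  var nodes: [String] = []

  /// Set once `finalize()` has been called.
  var isFinalized = false

  /// Records a `\d` character class and returns its index as the node id.
  mutating func buildCharacterClass_d() -> UInt {
    nodes.append("\\d")
    return UInt(nodes.count - 1)
  }

  /// No partial compilation here; just mark the builder as finished.
  mutating func finalize() { isFinalized = true }
}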
/*
  TODO: We will probably want defaulting mechanisms, such as:

  * Ability for a conformer to treat a meta-character as
    just an escaped character
  * Ability for a conformer to use function decls for feature
    set communication alone, and have the default implementation
    just echo the string for an engine
*/
// MARK: - Semantic levels
/// Dynamic notion of a specified semantics level for a regex
enum SemanticsLevel {
  case graphemeCluster
  case scalar
  case posix // different than ASCII?
  // ... code units ...
}
/// Conformers can be run as a regex / pattern
protocol RegexComponent {
  var level: SemanticsLevel? { get }
}
/// Provide the option to encode semantic level statically
protocol RegexLiteralProtocol: ExpressibleByRegexLiteral {
  associatedtype ScalarSemanticRegex: RegexComponent
  associatedtype GraphemeSemanticRegex: RegexComponent
  associatedtype POSIXSemanticRegex: RegexComponent
  associatedtype UnspecifiedSemanticRegex: RegexComponent = RegexLiteral

  var scalarSemantic: ScalarSemanticRegex { get }
  var graphemeSemantic: GraphemeSemanticRegex { get }
  var posixSemantic: POSIXSemanticRegex { get }
}
// MARK: - Statically encoded semantic level
/// A regex that has statically bound its semantic level
struct StaticSemanticRegexLiteral: RegexLiteralProtocol {
  /*
   If we had values in type-parameter position, this would be
   far easier and more straight-forward to model.

     RegexLiteral<SemanticsLevel? = nil>

   */

  /// A regex that has statically bound its semantic level
  struct ScalarSemanticRegex: RegexComponent {
    var level: SemanticsLevel? { .scalar }
  }
  struct GraphemeSemanticRegex: RegexComponent {
    var level: SemanticsLevel? { .graphemeCluster }
  }
  struct POSIXSemanticRegex: RegexComponent {
    var level: SemanticsLevel? { .posix }
  }
  struct UnspecifiedSemanticRegex: RegexComponent {
    var level: SemanticsLevel? { nil }
  }

  var scalarSemantic: ScalarSemanticRegex { x() }
  var graphemeSemantic: GraphemeSemanticRegex { x() }
  var posixSemantic: POSIXSemanticRegex { x() }

  init(builder: RegexLiteralBuilder) { }
  typealias Builder = RegexLiteralBuilder
}
// MARK: - stdlib conformer
/// Stdlib's conformer
struct RegexLiteralBuilder: RegexLiteralBuilderProtocol {
  /// Compiler converts literal into a series of calls to this kind of method
  mutating func buildCharacterClass_d() -> ASTNodeId { x() }

  /// We're done, so partially-compile or otherwise finalize
  mutating func finalize() { }
}
/// The produced value for a regex literal. Might end up being same type as
/// `Regex` or `Pattern`, but for now useful to model independently.
struct RegexLiteral: ExpressibleByRegexLiteral {
  typealias Builder = RegexLiteralBuilder

  /// An explicitly specified semantics level
  var level: SemanticsLevel? = nil

  init(builder: Builder) {
    // TODO: should this be throwing, constant evaluable, or
    // some other way to issue diagnostics?
  }
}
extension RegexLiteral: RegexComponent, RegexLiteralProtocol {
  /// A regex that has finally bound its semantic level (dynamically)
  struct BoundSemantic: RegexComponent {
    var _level: SemanticsLevel // Bound semantic level
    var level: SemanticsLevel? { _level }
  }

  private func sem(_ level: SemanticsLevel) -> BoundSemantic {
    x()
  }

  var scalarSemantic: BoundSemantic { sem(.scalar) }
  var graphemeSemantic: BoundSemantic { sem(.graphemeCluster) }
  var posixSemantic: BoundSemantic { sem(.posix) }
}
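
// Illustrative sketch only, not part of the pitched interface.
//
// Both the statically bound types and `BoundSemantic` expose their level
// through `RegexComponent.level`, so generic code can treat them uniformly.
// `describeSemanticsLevel` is a hypothetical helper introduced here to show
// that; it assumes `nil` means "no level specified yet".
func describeSemanticsLevel<R: RegexComponent>(_ regex: R) -> String {
  switch regex.level {
  case .graphemeCluster?: return "grapheme cluster"
  case .scalar?: return "scalar"
  case .posix?: return "POSIX"
  case nil: return "unspecified"
  }
}

// Possible usage, given the types declared in this file:
//
//   describeSemanticsLevel(StaticSemanticRegexLiteral.ScalarSemanticRegex())
//   describeSemanticsLevel(RegexLiteral.BoundSemantic(_level: .posix))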
// ---
internal func x() -> Never { fatalError() }