//===----------------------------------------------------------------------===//
//
// This source file is part of the Swift.org open source project
//
// Copyright (c) 2021-2022 Apple Inc. and the Swift project authors
// Licensed under Apache License v2.0 with Runtime Library Exception
//
// See https://swift.org/LICENSE.txt for license information
//
//===----------------------------------------------------------------------===//
@_implementationOnly import _RegexParser

/// Compiles a `DSLTree` into an `MEProgram` using `ByteCodeGen`.
class Compiler {
  let tree: DSLTree

  // TODO: Or are these stored on the tree?
  var options = MatchingOptions()
  private var compileOptions: _CompileOptions = .default

  init(ast: AST) {
    self.tree = ast.dslTree
  }

  init(tree: DSLTree) {
    self.tree = tree
  }

  init(tree: DSLTree, compileOptions: _CompileOptions) {
    self.tree = tree
    self.compileOptions = compileOptions
  }

  __consuming func emit() throws -> MEProgram {
    // TODO: Handle global options
    var codegen = ByteCodeGen(
      options: options,
      compileOptions: compileOptions,
      captureList: tree.captureList)
    return try codegen.emitRoot(tree.root)
  }
}

/// Hashable wrapper for `Any.Type`.
struct AnyHashableType: CustomStringConvertible, Hashable {
  var ty: Any.Type

  init(_ ty: Any.Type) {
    self.ty = ty
  }

  var description: String { "\(ty)" }

  static func == (lhs: Self, rhs: Self) -> Bool {
    lhs.ty == rhs.ty
  }

  func hash(into hasher: inout Hasher) {
    hasher.combine(ObjectIdentifier(ty))
  }
}

// An error produced when compiling a regular expression.
enum RegexCompilationError: Error, Hashable, CustomStringConvertible {
  // TODO: Source location?
  case uncapturedReference
  case incorrectOutputType(incorrect: AnyHashableType, correct: AnyHashableType)
  case invalidCharacterClassRangeOperand(Character)

  static func incorrectOutputType(
    incorrect: Any.Type, correct: Any.Type
  ) -> Self {
    .incorrectOutputType(incorrect: .init(incorrect), correct: .init(correct))
  }

  var description: String {
    switch self {
    case .uncapturedReference:
      return "Found a reference used before it captured any match."
    case .incorrectOutputType(let incorrect, let correct):
      return "Cast to incorrect type 'Regex<\(incorrect)>', expected 'Regex<\(correct)>'"
    case .invalidCharacterClassRangeOperand(let c):
      return "'\(c)' is an invalid bound for character class range"
    }
  }
}

// Testing support
@available(SwiftStdlib 5.7, *)
func _compileRegex(
  _ regex: String,
  _ syntax: SyntaxOptions = .traditional,
  _ semanticLevel: RegexSemanticLevel? = nil
) throws -> Executor {
  let ast = try parse(regex, syntax)
  let dsl: DSLTree

  switch semanticLevel?.base {
  case .graphemeCluster:
    let sequence = AST.MatchingOptionSequence(adding: [.init(.graphemeClusterSemantics, location: .fake)])
    dsl = DSLTree(.nonCapturingGroup(.init(ast: .changeMatchingOptions(sequence)), ast.dslTree.root))
  case .unicodeScalar:
    let sequence = AST.MatchingOptionSequence(adding: [.init(.unicodeScalarSemantics, location: .fake)])
    dsl = DSLTree(.nonCapturingGroup(.init(ast: .changeMatchingOptions(sequence)), ast.dslTree.root))
  case .none:
    dsl = ast.dslTree
  }
  let program = try Compiler(tree: dsl).emit()
  return Executor(program: program)
}

@_spi(RegexBenchmark)
public struct _CompileOptions: OptionSet {
  public let rawValue: Int

  public init(rawValue: Int) {
    self.rawValue = rawValue
  }

  public static let disableOptimizations = _CompileOptions(rawValue: 1 << 0)
  public static let enableTracing = _CompileOptions(rawValue: 1 << 1)
  public static let enableMetrics = _CompileOptions(rawValue: 1 << 2)
  public static let `default`: _CompileOptions = []
}