File: index.html

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>PyParsing Regex Inverter</title>
    <link rel="stylesheet" href="https://pyscript.net/releases/2025.11.1/core.css">
    <script type="module" src="https://pyscript.net/releases/2025.11.1/core.js"></script>
    <style>
        body { font-family: sans-serif; margin: 2em; }
        #regex-input { width: 300px; }
        #output { width: 100%; height: 400px; margin-top: 1em; font-family: monospace; white-space: pre; border: 1px solid #ccc; padding: 0.5em; overflow: auto; }
        .controls { margin-bottom: 1em; }
        table { border-collapse: collapse; width: 100%; }
        table th, table td { border: 1px solid #ddd; padding: 8px; text-align: left; }
        table th { background-color: #f2f2f2; text-align: center; }
        table.quick-ref td:first-child { text-align: center; }
    </style>
</head>
<body>
    <h1>Regex Inverter</h1>
    <h4>by Paul McGuire, January 2026</h4>
    <details>
        <summary>Description</summary>
        <div>
            <p>This page allows you to invert a regular expression, generating strings that match it.</p>
            <p><strong>Instructions:</strong> Enter a regular expression in the "Regex" field and specify the maximum number of results you want to see (up to 100,000,000). Click "Invert" or press Enter to generate the matching strings.</p>
            <p><strong>Constraints:</strong></p>
            <ul>
                <li>Unbounded repetition operators <code>+</code> and <code>*</code> are <strong>not supported</strong>.</li>
                <li>Replace <code>+</code> or <code>*</code> with an explicit bounded repetition: <code>{1,n}</code> in place of <code>+</code>, or <code>{,n}</code> in place of <code>*</code> (e.g., <code>[A-Z]{1,4}</code> instead of <code>[A-Z]+</code>, or <code>[A-Z]{,4}</code> instead of <code>[A-Z]*</code>).</li>
                <li>For brevity, all generated strings in this utility are limited to 7-bit ASCII characters. By default, Python's <code>re</code> functions match against the full Unicode character set, so a class like <code>\d</code> also matches decimal digits from other scripts, not just the ASCII digits '0' through '9'.</li>
            </ul>
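<!--
Maintainer note: the constraints above can be illustrated with a minimal,
standard-library-only sketch. It is not how inv_regex.py is implemented; the
helper name bounded_repeat is hypothetical and exists only for this example.
It shows why a bounded repetition has a finite set of matches, and how \d
differs with and without re.ASCII.

```python
import itertools
import re


def bounded_repeat(chars, lo, hi):
    """Yield every string of lo..hi characters drawn from chars (hypothetical helper)."""
    for n in range(lo, hi + 1):
        for combo in itertools.product(chars, repeat=n):
            yield "".join(combo)


# [AB]{1,2} has a finite set of matches; [AB]+ could never be fully enumerated.
matches = list(bounded_repeat("AB", 1, 2))
print(matches)  # ['A', 'B', 'AA', 'AB', 'BA', 'BB']

# Every generated string matches the bounded pattern.
assert all(re.fullmatch(r"[AB]{1,2}", s) for s in matches)

# \d matches non-ASCII decimal digits unless re.ASCII is set.
arabic_three = "\u0663"  # ARABIC-INDIC DIGIT THREE
print(bool(re.fullmatch(r"\d", arabic_three)))            # True
print(bool(re.fullmatch(r"\d", arabic_three, re.ASCII)))  # False
```

The number of strings grows as len(chars) ** n for each repetition count n,
which is why large repetition counts can take a long time to process.
-->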
            <p><strong>Note:</strong> Complex regular expressions or those with large repetition counts may take some time to process.</p>
        </div>
    </details>
    <p>
    <details>
        <summary>Regular Expressions Quick Reference</summary>
        <div>
            <table class="quick-ref">
                <thead>
                    <tr><th>Construct</th><th>Description</th></tr>
                </thead>
                <tbody>
                    <tr><td><code>.</code></td><td>Any character except newline</td></tr>
                    <tr><td><code>\d</code></td><td>Digit <code>[0-9]</code></td></tr>
                    <tr><td><code>\w</code></td><td>Word (identifier) character <code>[a-zA-Z0-9_]</code></td></tr>
                    <tr><td><code>\s</code></td><td>Whitespace character</td></tr>
                    <tr><td><code>\D</code></td><td>Non-digit</td></tr>
                    <tr><td><code>\W</code></td><td>Non-word character</td></tr>
                    <tr><td><code>\S</code></td><td>Non-whitespace character</td></tr>
                    <tr><td><code>?</code></td><td>0 or 1 repetition</td></tr>
                    <tr><td><code>{n}</code></td><td>Exactly <i>n</i> repetitions</td></tr>
                    <tr><td><code>{n,m}</code></td><td>Between <i>n</i> and <i>m</i> repetitions</td></tr>
                    <tr><td><code>{,m}</code></td><td>0 to <i>m</i> repetitions</td></tr>
                    <tr><td><code>[...]</code></td><td>Character class (any of these characters)</td></tr>
                    <tr><td><code>[^...]</code></td><td>Negated character class</td></tr>
                    <tr><td><code>|</code></td><td>Alternation (OR)</td></tr>
                    <tr><td><code>(...)</code></td><td>Grouping</td></tr>
                    <tr><td><code>(?:...)</code></td><td>Grouping (non-capturing)</td></tr>
                    <tr><td><code>(?P&lt;name&gt;...)</code></td><td>Grouping (named group)</td></tr>
                    <tr><td colspan="2">Other common regex features not covered in this utility</td></tr>
                    <tr><td><code>^</code></td><td>Start of a line</td></tr>
                    <tr><td><code>$</code></td><td>End of a line</td></tr>
                    <tr><td><code>\A</code></td><td>Start of string</td></tr>
                    <tr><td><code>\Z</code></td><td>End of string</td></tr>
                    <tr><td><code>*</code></td><td>0 or more repetitions (use <code>{,m}</code> to limit repetitions)</td></tr>
                    <tr><td><code>+</code></td><td>1 or more repetitions (use <code>{1,m}</code> to limit repetitions)</td></tr>
                    <tr><td><code>\b</code></td><td>Word boundary</td></tr>
                    <tr><td><code>(?=...)</code></td><td>Positive lookahead</td></tr>
                    <tr><td><code>(?!...)</code></td><td>Negative lookahead</td></tr>
                    <tr><td><code>(?&lt;=...)</code></td><td>Positive lookbehind</td></tr>
                    <tr><td><code>(?&lt;!...)</code></td><td>Negative lookbehind</td></tr>
                </tbody>
            </table>
        </div>
    </details>
    <p>
    <details>
        <summary>Examples</summary>
        <div>
            <p>Here are some example regular expressions to try:</p>
            <table>
                <thead>
                    <tr><th>Description</th><th>Regex</th></tr>
                </thead>
                <tr>
                    <td>Match one uppercase letter, a hyphen, and three digits</td>
                    <td><code>[A-Z]-\d{3}</code></td>
                </tr>
                <tr>
                    <td>Time of day (HH:MM:SS)</td>
                    <td><code>(2[0-3]|[01][0-9]):([0-5][0-9]):([0-5][0-9])</code></td>
                </tr>
                <tr>
                    <td>8-bit binary numbers</td>
                    <td><code>[01]{8}</code></td>
                </tr>
                <tr>
                    <td>Integer from 0 to 99</td>
                    <td><code>[1-9]?\d</code></td>
                </tr>
                <tr>
                    <td>Integer from 0 to 255</td>
                    <td><code>25[0-5]|2[0-4]\d|1\d{2}|[1-9]\d|\d</code></td>
                </tr>
                <tr>
                    <td>Roman Numerals to 50</td>
                    <td><code>(X{,3}|XL)(I{,3}|IV|VI{,3}|IX)|L</code></td>
                </tr>
                <tr>
                    <td>Chemical Symbol</td>
                    <td><code>A[cglmrstu]|B[aehikr]?|C[adeflmorsu]?|D[bsy]|E[rsu]|F[emr]?|G[ade]|H[efgos]?|I[nr]?|Kr?|L[airu]|M[dgnot]|N[abdeiop]?|Os?|P[abdmortu]?|R[abefghnu]|S[bcegimnr]?|T[abcehilm]|U(u[bhopqst])?|V|W|Xe|Yb?|Z[nr]</code></td>
                </tr>
                <tr>
                    <td>IPv4 addresses in 192.168.0.0/16</td>
                    <td><code>192\.168(\.((25[0-5]|2[0-4]\d|1\d{2}|[1-9]\d|\d))){2}</code></td>
                </tr>
                <tr>
                    <td>MAC address</td>
                    <td><code>[0-9A-Fa-f]{2}([:-][0-9A-Fa-f]{2}){5}</code></td>
                </tr>
                <tr>
                    <td>UUID</td>
                    <td><code>[0-9A-F]{8}(-[0-9A-F]{4}){3}-[0-9A-F]{12}</code></td>
                </tr>
                <tr>
                    <td>Original US Area codes (leading and trailing digit 2-9, middle digit 1 or 0)</td>
                    <td><code>[2-9][10][2-9]</code></td>
                </tr>
            </table>
        </div>
    </details>
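<!--
Maintainer note: a rough way to sanity-check entries in the Examples table,
using only Python's re module rather than the inverter itself. The constant
names below are assumptions made for this sketch; the patterns are copied
from the table.

```python
import re

# Patterns copied from the Examples table.
BYTE_VALUE = r"25[0-5]|2[0-4]\d|1\d{2}|[1-9]\d|\d"
TIME_OF_DAY = r"(2[0-3]|[01][0-9]):([0-5][0-9]):([0-5][0-9])"

# The 0 to 255 pattern should accept exactly the integers 0..255.
matched = [n for n in range(1000) if re.fullmatch(BYTE_VALUE, str(n))]
print(matched == list(range(256)))  # True

# The time-of-day pattern should accept 23:59:59 and reject 24:00:00.
print(bool(re.fullmatch(TIME_OF_DAY, "23:59:59")))  # True
print(bool(re.fullmatch(TIME_OF_DAY, "24:00:00")))  # False
```

re.fullmatch is used so that the whole string must match, which mirrors the
inverter generating complete strings rather than substrings.
-->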
    <p>Enter a regular expression to see its matching strings.</p>
    <div class="controls">
        <label for="regex-input">Regex:</label>
        <input type="text" id="regex-input" placeholder="e.g. [A-Z]-\d{3}" value="">
        <label for="max-results">Max results:</label>
        <input type="number" id="max-results" value="200" min="1" style="width: 60px;">
        <button id="invert-btn" py-click="do_invert">Invert</button>
        <button id="cancel-btn" py-click="cancel_invert" style="display: none;">Cancel</button>
    </div>
    <div id="status"></div>
    <textarea id="output" readonly></textarea>
    <p>GitHub repo for this page: <a href="https://github.com/ptmcg/regex_inverter.git" target="_blank">regex-inverter</a></p>
    <p>Powered by <a href="https://pyscript.net" target="_blank">PyScript</a> and <a href="https://pypi.org/project/pyparsing/" target="_blank">pyparsing</a></p>

    <py-config>
        packages = ["pyparsing"]
        [[fetch]]
        files = ["inv_regex.py"]
    </py-config>

    <script type="py">
        from pyscript import document
        from inv_regex import invert, count
        import pyparsing
        import itertools
        import asyncio

        is_cancelled = False

        def cancel_invert(event):
            global is_cancelled
            is_cancelled = True

        async def do_invert(event):
            global is_cancelled
            is_cancelled = False

            regex = document.querySelector("#regex-input").value.strip()
            if not regex:
                return

            try:
                max_results = int(document.querySelector("#max-results").value)
            except ValueError:
                max_results = 200
            if max_results > 100_000_000:
                max_results = 100_000_000
                document.querySelector("#max-results").value = 100_000_000

            output_area = document.querySelector("#output")
            status_div = document.querySelector("#status")
            cancel_btn = document.querySelector("#cancel-btn")
            
            output_area.value = ""
            status_div.innerText = "Processing..."
            
            # Use a small delay to allow status to update in the UI
            await asyncio.sleep(0.1)

            try:
                # Get up to max_results items using an iterator
                invert_iter = invert(regex)
                num_shown = 0
                while num_shown < max_results:
                    results = list(itertools.islice(invert_iter, min(max_results - num_shown, 100_000)))
                    num_shown += len(results)
                    output_area.value += "\n".join(results) + "\n"
                    if len(results) < 100_000:
                        break
                    if num_shown < max_results:
                        cancel_btn.style.display = "inline"
                    await asyncio.sleep(0)
                    if is_cancelled:
                        status_div.innerText += " (cancelled)"
                        break

                if not is_cancelled:
                    # Count the remaining items in the iterator
                    remaining_count = 0
                    for i, _ in enumerate(invert_iter, 1):
                        remaining_count = i
                        if i % 100_000 == 0:
                            status_div.innerText = f"Counting matches... {num_shown + i:,} found so far"
                            await asyncio.sleep(0)
                            if is_cancelled:
                                status_div.innerText += " (cancelled)"
                                break

                    total_count = num_shown + remaining_count

                if not is_cancelled:
                    status_div.innerText = f"Total matching strings: {total_count:,}"
                    if total_count > max_results:
                        status_div.innerText += f" (showing first {max_results:,})"

            except pyparsing.ParseBaseException as pe:
                status_div.innerText = "Error"
                output_area.value = f"Parse Error: {pe.msg}\n{pe.explain(depth=0)}"
            except Exception as e:
                status_div.innerText = "Error"
                output_area.value = f"Error: {str(e)}"
            finally:
                cancel_btn.style.display = "none"
                await asyncio.sleep(0)

        # Add event listener for Enter key in input box
        def on_keypress(event):
            if event.key == "Enter":
                asyncio.create_task(do_invert(None))

        document.querySelector("#regex-input").onkeypress = on_keypress
        document.querySelector("#max-results").onkeypress = on_keypress
    </script>
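<!--
Maintainer note: the chunked loop in do_invert can be tried outside the
browser with a simplified sketch like the one below. The function name
collect_in_chunks and the small default chunk_size are assumptions for
illustration; the page itself uses chunks of 100,000 and writes to the DOM
instead of returning a list.

```python
import asyncio
import itertools


async def collect_in_chunks(items, max_results, chunk_size=3):
    """Collect up to max_results items, yielding to the event loop between chunks."""
    collected = []
    while len(collected) < max_results:
        want = min(max_results - len(collected), chunk_size)
        chunk = list(itertools.islice(items, want))
        collected.extend(chunk)
        if len(chunk) < want:
            break  # the iterator ran out before the limit was reached
        # Give other tasks (such as a cancel handler) a chance to run.
        await asyncio.sleep(0)
    return collected


numbers = iter(range(10))
first = asyncio.run(collect_in_chunks(numbers, max_results=7))
print(first)          # [0, 1, 2, 3, 4, 5, 6]
print(list(numbers))  # [7, 8, 9]
```

Because itertools.islice only consumes what it is asked for, the leftover
items stay in the iterator, which is what lets the page count the remaining
matches after the displayed ones.
-->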
</body>
</html>