1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271
|
# files-to-prompt
[](https://pypi.org/project/files-to-prompt/)
[](https://github.com/simonw/files-to-prompt/releases)
[](https://github.com/simonw/files-to-prompt/actions/workflows/test.yml)
[](https://github.com/simonw/files-to-prompt/blob/master/LICENSE)
Concatenate a directory full of files into a single prompt for use with LLMs
For background on this project see [Building files-to-prompt entirely using Claude 3 Opus](https://simonwillison.net/2024/Apr/8/files-to-prompt/).
## Installation
Install this tool using `pip`:
```bash
pip install files-to-prompt
```
## Usage
To use `files-to-prompt`, provide the path to one or more files or directories you want to process:
```bash
files-to-prompt path/to/file_or_directory [path/to/another/file_or_directory ...]
```
This will output the contents of every file, with each file preceded by its relative path and separated by `---`.
### Options
- `-e/--extension <extension>`: Only include files with the specified extension. Can be used multiple times.
```bash
files-to-prompt path/to/directory -e txt -e md
```
- `--include-hidden`: Include files and folders starting with `.` (hidden files and directories).
```bash
files-to-prompt path/to/directory --include-hidden
```
- `--ignore <pattern>`: Specify one or more patterns to ignore. Can be used multiple times. Patterns may match file names and directory names, unless you also specify `--ignore-files-only`. Pattern syntax uses [fnmatch](https://docs.python.org/3/library/fnmatch.html), which supports `*`, `?`, `[anychar]`, `[!notchars]` and `[?]` for special character literals.
```bash
files-to-prompt path/to/directory --ignore "*.log" --ignore "temp*"
```
- `--ignore-files-only`: Include directory paths which would otherwise be ignored by an `--ignore` pattern.
```bash
files-to-prompt path/to/directory --ignore-files-only --ignore "*dir*"
```
- `--ignore-gitignore`: Ignore `.gitignore` files and include all files.
```bash
files-to-prompt path/to/directory --ignore-gitignore
```
- `-c/--cxml`: Output in Claude XML format.
```bash
files-to-prompt path/to/directory --cxml
```
- `-m/--markdown`: Output as Markdown with fenced code blocks.
```bash
files-to-prompt path/to/directory --markdown
```
- `-o/--output <file>`: Write the output to a file instead of printing it to the console.
```bash
files-to-prompt path/to/directory -o output.txt
```
- `-n/--line-numbers`: Include line numbers in the output.
```bash
files-to-prompt path/to/directory -n
```
Example output:
```
files_to_prompt/cli.py
---
1 import os
2 from fnmatch import fnmatch
3
4 import click
...
```
- `-0/--null`: Use NUL character as separator when reading paths from stdin. Useful when filenames may contain spaces.
```bash
find . -name "*.py" -print0 | files-to-prompt --null
```
### Example
Suppose you have a directory structure like this:
```
my_directory/
├── file1.txt
├── file2.txt
├── .hidden_file.txt
├── temp.log
└── subdirectory/
└── file3.txt
```
Running `files-to-prompt my_directory` will output:
```
my_directory/file1.txt
---
Contents of file1.txt
---
my_directory/file2.txt
---
Contents of file2.txt
---
my_directory/subdirectory/file3.txt
---
Contents of file3.txt
---
```
If you run `files-to-prompt my_directory --include-hidden`, the output will also include `.hidden_file.txt`:
```
my_directory/.hidden_file.txt
---
Contents of .hidden_file.txt
---
...
```
If you run `files-to-prompt my_directory --ignore "*.log"`, the output will exclude `temp.log`:
```
my_directory/file1.txt
---
Contents of file1.txt
---
my_directory/file2.txt
---
Contents of file2.txt
---
my_directory/subdirectory/file3.txt
---
Contents of file3.txt
---
```
If you run `files-to-prompt my_directory --ignore "sub*"`, the output will exclude all files in `subdirectory/` (unless you also specify `--ignore-files-only`):
```
my_directory/file1.txt
---
Contents of file1.txt
---
my_directory/file2.txt
---
Contents of file2.txt
---
```
### Reading from stdin
The tool can also read paths from standard input. This can be used to pipe in the output of another command:
```bash
# Find files modified in the last day
find . -mtime -1 | files-to-prompt
```
When using the `--null` (or `-0`) option, paths are expected to be NUL-separated (useful when dealing with filenames containing spaces):
```bash
find . -name "*.txt" -print0 | files-to-prompt --null
```
You can mix and match paths from command line arguments and stdin:
```bash
# Include files modified in the last day, and also include README.md
find . -mtime -1 | files-to-prompt README.md
```
### Claude XML Output
Anthropic has provided [specific guidelines](https://docs.anthropic.com/claude/docs/long-context-window-tips) for optimally structuring prompts to take advantage of Claude's extended context window.
To structure the output in this way, use the optional `--cxml` flag, which will produce output like this:
```xml
<documents>
<document index="1">
<source>my_directory/file1.txt</source>
<document_content>
Contents of file1.txt
</document_content>
</document>
<document index="2">
<source>my_directory/file2.txt</source>
<document_content>
Contents of file2.txt
</document_content>
</document>
</documents>
```
## --markdown fenced code block output
The `--markdown` option will output the files as fenced code blocks, which can be useful for pasting into Markdown documents.
```bash
files-to-prompt path/to/directory --markdown
```
The language tag will be guessed based on the filename.
If the code itself contains triple backticks the wrapper around it will use one additional backtick.
Example output:
`````
myfile.py
```python
def my_function():
return "Hello, world!"
```
other.js
```javascript
function myFunction() {
return "Hello, world!";
}
```
file_with_triple_backticks.md
````markdown
This file has its own
```
fenced code blocks
```
Inside it.
````
`````
## Development
To contribute to this tool, first checkout the code. Then create a new virtual environment:
```bash
cd files-to-prompt
python -m venv venv
source venv/bin/activate
```
Now install the dependencies and test dependencies:
```bash
pip install -e '.[test]'
```
To run the tests:
```bash
pytest
```
|