# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
Python library and `ia` CLI for interacting with archive.org. Used for uploading, downloading, searching, and managing items and their metadata. Also provides catalog task management and account administration utilities. Items are identified by a unique identifier and contain files and metadata.
## Common Commands
```bash
# Install for development
pip install -e .
# Run tests
pytest
# Run tests with linting
ruff check && pytest
# Run a single test file
pytest tests/test_api.py
# Run a specific test
pytest tests/test_api.py::test_get_item
# Multi-version testing (requires Python 3.9-3.14 installed)
tox
# Lint only
ruff check
# Build docs
pip install -r docs/requirements.txt
cd docs && make html
```
## Architecture
The library has a three-layer architecture, with the `ia` CLI built on top:
**Layer 1 - Public API (`internetarchive/api.py`)**
Convenience functions that wrap the core classes: `get_item()`, `search_items()`, `upload()`, `download()`, `modify_metadata()`, `delete()`, `configure()`, `get_session()`.
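
As a minimal sketch of how these convenience functions are typically called, the helper below summarizes an item through `get_item()`. The name `summarize_item` and its `fetch` parameter are hypothetical, introduced only so the logic can be exercised without network access. The sketch assumes the returned `Item` exposes `identifier`, `metadata` (a dict), and `files` (a list).

```python
from typing import Any, Callable, Optional


def summarize_item(identifier: str,
                   fetch: Optional[Callable[[str], Any]] = None) -> dict:
    """Return a small summary of an archive.org item.

    `fetch` is a hypothetical injection point, not part of the library.
    When omitted, the real `internetarchive.get_item` is used, which
    performs a network request.
    """
    if fetch is None:
        # Imported lazily so the helper can be tested offline with a stub.
        from internetarchive import get_item as fetch

    item = fetch(identifier)
    return {
        "identifier": item.identifier,
        "title": item.metadata.get("title"),
        "file_count": len(item.files),
    }
```

Calling `summarize_item(...)` without a `fetch` argument goes through the real API and therefore needs network access.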
**Layer 2 - Core Classes**
- `ArchiveSession` (`session.py`) - Extends `requests.Session`. Manages config, credentials, HTTP headers, connection pooling.
- `Item` (`item.py`) - Represents an Archive.org item. Contains files, metadata, and methods for download/upload/modify.
- `File` (`files.py`) - Represents a single file within an item. Handles download, delete, checksum verification.
- `Search` (`search.py`) - Query interface with pagination and field selection.
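
The core classes can also be reached through a session rather than the Layer 1 wrappers. The sketch below collects the first few identifiers matching a query, assuming that iterating the object returned by `ArchiveSession.search_items()` yields one dict per result containing the requested fields. The helper name `first_identifiers` and its `search` parameter are hypothetical and exist only to make the example testable offline.

```python
from itertools import islice
from typing import Callable, Iterable, List, Optional


def first_identifiers(query: str, limit: int = 10,
                      search: Optional[Callable[..., Iterable[dict]]] = None
                      ) -> List[str]:
    """Return up to `limit` identifiers matching `query`.

    `search` is a hypothetical injection point. When omitted, the
    `search_items` method of a real session is used (network required).
    """
    if search is None:
        from internetarchive import get_session
        search = get_session().search_items

    # Request only the field we need; islice stops iteration early so
    # no more results are consumed than necessary.
    results = search(query, fields=["identifier"])
    return [hit["identifier"] for hit in islice(results, limit)]
```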
**Layer 3 - Supporting Modules**
- `config.py` - INI-based configuration (credentials at `~/.config/internetarchive/ia.ini` or `~/.ia`)
- `iarequest.py` - HTTP request builders (`MetadataRequest`, `S3Request`)
- `auth.py` - S3 authentication handlers
- `catalog.py` - Catalog task management
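
To illustrate the INI-based configuration, here is a minimal standard-library sketch that locates a config file and reads credentials from it. It is not the code in `config.py`. The helper names are hypothetical, the lookup order simply follows the order the two paths are listed above, and the `[s3]` section with `access` and `secret` keys is an assumption about the file layout.

```python
import configparser
from pathlib import Path
from typing import Optional

# Candidate locations relative to the home directory, in the order
# listed above (the precedence is an assumption).
CANDIDATES = (Path(".config") / "internetarchive" / "ia.ini", Path(".ia"))


def find_config_file(home: Path) -> Optional[Path]:
    """Return the first existing candidate config file under `home`."""
    for relative in CANDIDATES:
        candidate = home / relative
        if candidate.is_file():
            return candidate
    return None


def read_s3_keys(path: Path) -> dict:
    """Parse an INI file and return its `[s3]` section as a plain dict.

    The `[s3]` section name is an assumption made for this sketch.
    """
    # RawConfigParser avoids '%' interpolation, which could otherwise
    # mangle or reject secret values.
    parser = configparser.RawConfigParser()
    parser.read(path)
    if not parser.has_section("s3"):
        return {}
    return dict(parser.items("s3"))
```

For example, `find_config_file(Path.home())` returns `None` when neither file exists, in which case a caller would fall back to unauthenticated access or prompt for `ia configure`.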
**CLI (`internetarchive/cli/`)**
- Entry point: `ia.py:main()` → registered as `ia` console script
- Subcommands: `ia_download.py`, `ia_upload.py`, `ia_metadata.py`, `ia_search.py`, `ia_list.py`, `ia_delete.py`, `ia_copy.py`, `ia_move.py`, `ia_tasks.py`, `ia_configure.py`, etc.
## Code Style
- Line length: 90 characters
- Linter: ruff (configured in `pyproject.toml`)
- Formatter: black
- Type checking: mypy (type stubs in `options.extras_require` under `types`)
## Key Dependencies
- `requests` - HTTP client
- `jsonpatch` - JSON patching for metadata updates
- `tqdm` - Progress bars
- `responses` - HTTP mocking for tests
## Contributing Notes
- All new features should be developed on a feature branch, not directly on master
- PRs require tests and must pass ruff linting
- Avoid introducing new dependencies
- Support Python 3.9+
## Releasing
To release a new version (you must be on master with a clean working directory):
```bash
# 1. Prepare release (updates __version__.py and HISTORY.rst date)
make prepare-release RELEASE=X.Y.Z
# 2. Review and commit version changes
git diff
git add -A && git commit -m "Bump version to X.Y.Z"
# 3. Publish to PyPI + archive.org + GitHub
make publish-all
```
Individual release targets:
- `make publish` - PyPI + GitHub release (no binary)
- `make publish-all` - PyPI + pex binary + GitHub release
- `make publish-binary` - pex binary only (after PyPI release)
The full release process (`make publish-all`) will:
- Run tests and linting
- Build the package
- Build and test the pex binary
- Create and push a git tag
- Upload to PyPI
- Upload binary to archive.org
- Create a GitHub release with changelog from HISTORY.rst