# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

Python library and `ia` CLI for interacting with archive.org: uploading, downloading, searching, and managing items and their metadata. It also provides catalog task management and account administration utilities. Each item has a unique identifier and contains files and metadata.

## Common Commands

```bash
# Install for development
pip install -e .

# Run tests
pytest

# Run tests with linting
ruff check && pytest

# Run a single test file
pytest tests/test_api.py

# Run a specific test
pytest tests/test_api.py::test_get_item

# Multi-version testing (requires Python 3.9-3.14 installed)
tox

# Lint only
ruff check

# Build docs
pip install -r docs/requirements.txt
cd docs && make html
```

## Architecture

The library has a three-layer architecture:

**Layer 1 - Public API (`internetarchive/api.py`)**
Convenience functions that wrap the core classes: `get_item()`, `search_items()`, `upload()`, `download()`, `modify_metadata()`, `delete()`, `configure()`, `get_session()`.
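
As a rough illustration of how these convenience functions are typically called, here is a minimal sketch. The helper names (`build_query`, `first_identifiers`, `describe_item`) are hypothetical and not part of the library. The `internetarchive` imports sit inside the functions only so the sketch can be loaded without the package installed, and the two network-facing helpers need access to archive.org to return anything.

```python
def build_query(collection: str, mediatype: str) -> str:
    """Compose a search query string (hypothetical helper)."""
    return f"collection:{collection} AND mediatype:{mediatype}"


def first_identifiers(query: str, limit: int = 5) -> list[str]:
    """Collect up to `limit` item identifiers matching `query`."""
    # Lazy import: a choice made for this sketch, not a library convention.
    from internetarchive import search_items

    identifiers = []
    for result in search_items(query):
        identifiers.append(result["identifier"])
        if len(identifiers) >= limit:
            break
    return identifiers


def describe_item(identifier: str) -> dict:
    """Fetch an item and return a few of its attributes."""
    from internetarchive import get_item

    item = get_item(identifier)
    return {
        "identifier": item.identifier,
        "exists": item.exists,
        "title": item.metadata.get("title"),
    }
```

Keeping query construction in a pure function makes that part testable without network access.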

**Layer 2 - Core Classes**
- `ArchiveSession` (`session.py`) - Extends `requests.Session`. Manages config, credentials, HTTP headers, connection pooling.
- `Item` (`item.py`) - Represents an archive.org item. Contains files, metadata, and methods for download/upload/modify.
- `File` (`files.py`) - Represents a single file within an item. Handles download, delete, checksum verification.
- `Search` (`search.py`) - Query interface with pagination and field selection.

**Layer 3 - Supporting Modules**
- `config.py` - INI-based configuration (credentials at `~/.config/internetarchive/ia.ini` or `~/.ia`)
- `iarequest.py` - HTTP request builders (`MetadataRequest`, `S3Request`)
- `auth.py` - S3 authentication handlers
- `catalog.py` - Catalog task management
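
To make the INI-based configuration concrete, the following sketch locates and reads a config file with the standard library's `configparser`. It is not the code in `config.py`. The lookup order (first existing path wins), the `[s3]` section with `access` and `secret` keys, and the function names are all assumptions made for illustration.

```python
import configparser
from pathlib import Path
from typing import Optional

# Candidate locations named in this document; the search order is an assumption.
DEFAULT_CANDIDATES = (
    Path.home() / ".config" / "internetarchive" / "ia.ini",
    Path.home() / ".ia",
)


def find_config(candidates=DEFAULT_CANDIDATES) -> Optional[Path]:
    """Return the first candidate path that exists as a file, or None."""
    for path in candidates:
        if Path(path).is_file():
            return Path(path)
    return None


def read_s3_keys(path: Path) -> dict:
    """Read the assumed [s3] section; missing values come back as None."""
    parser = configparser.RawConfigParser()
    parser.read(path)
    if not parser.has_section("s3"):
        return {"access": None, "secret": None}
    return {
        "access": parser.get("s3", "access", fallback=None),
        "secret": parser.get("s3", "secret", fallback=None),
    }
```

`RawConfigParser` is used here so that `%` characters in credentials are not treated as interpolation markers.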

**CLI (`internetarchive/cli/`)**
- Entry point: `ia.py:main()` → registered as `ia` console script
- Subcommands: `ia_download.py`, `ia_upload.py`, `ia_metadata.py`, `ia_search.py`, `ia_list.py`, `ia_delete.py`, `ia_copy.py`, `ia_move.py`, `ia_tasks.py`, `ia_configure.py`, etc.

## Code Style

- Line length: 90 characters
- Linter: ruff (configured in `pyproject.toml`)
- Formatter: black
- Type checking: mypy (type stubs are listed under the `types` extra in `options.extras_require`)

## Key Dependencies

- `requests` - HTTP client
- `jsonpatch` - JSON patching for metadata updates
- `tqdm` - Progress bars
- `responses` - HTTP mocking for tests
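
For readers unfamiliar with JSON Patch, the sketch below shows the general shape of a patch (RFC 6902) for a metadata change. It is a standard-library illustration limited to top-level keys, not the `jsonpatch` library's algorithm and not how this project builds its requests. `flat_metadata_patch` is a hypothetical name.

```python
def _escape(key: str) -> str:
    """Escape a key for use in a JSON Pointer (RFC 6901)."""
    # "~" must be escaped before "/" so that "~1" is not double-escaped.
    return key.replace("~", "~0").replace("/", "~1")


def flat_metadata_patch(old: dict, new: dict) -> list:
    """Build remove/add/replace operations for top-level key changes."""
    ops = []
    # Keys present only in the old metadata are removed.
    for key in sorted(old.keys() - new.keys()):
        ops.append({"op": "remove", "path": "/" + _escape(key)})
    # Keys in the new metadata are added or replaced as needed.
    for key in sorted(new):
        path = "/" + _escape(key)
        if key not in old:
            ops.append({"op": "add", "path": path, "value": new[key]})
        elif old[key] != new[key]:
            ops.append({"op": "replace", "path": path, "value": new[key]})
    return ops


old_md = {"title": "Old title", "creator": "A"}
new_md = {"title": "New title", "subject": "x"}
patch = flat_metadata_patch(old_md, new_md)
# patch == [
#     {"op": "remove", "path": "/creator"},
#     {"op": "add", "path": "/subject", "value": "x"},
#     {"op": "replace", "path": "/title", "value": "New title"},
# ]
```

Sending only the changed keys, rather than the whole metadata document, is the usual motivation for a patch format.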

## Contributing Notes

- All new features should be developed on a feature branch, not directly on master
- PRs require tests and must pass ruff linting
- Avoid introducing new dependencies
- Support Python 3.9+

## Releasing

To release a new version (you must be on master with a clean working directory):

```bash
# 1. Prepare release (updates __version__.py and HISTORY.rst date)
make prepare-release RELEASE=X.Y.Z

# 2. Review and commit version changes
git diff
git add -A && git commit -m "Bump version to X.Y.Z"

# 3. Publish to PyPI + archive.org + GitHub
make publish-all
```

Individual release targets:
- `make publish` - PyPI + GitHub release (no binary)
- `make publish-all` - PyPI + pex binary + GitHub release
- `make publish-binary` - pex binary only (after PyPI release)

The release process will:
- Run tests and linting
- Build the package
- Build and test the pex binary
- Create and push a git tag
- Upload to PyPI
- Upload binary to archive.org
- Create a GitHub release with changelog from HISTORY.rst