Testing and CI

End-to-end tests

tests/test_e2e.py is the main test suite. Tests typically call parse(url, ...) to fetch a live page, then extract(text) and assert on keys in the returned dict. Some tests pass custom headers or cookies.

Policy: one e2e test per site / scheme

Every extraction method (each named entry in schemes in schemes.py) should have at least one end-to-end test in tests/test_e2e.py that exercises a real URL (or the public JSON endpoint Maigret uses) and asserts on extracted fields.

Add the test in the same commit as a new or changed scheme when possible (see CONTRIBUTING.md).
Name the test function test_<something>_e2e or follow the existing test_<site> pattern.
In the docstring, put the exact scheme name(s) from schemes.py (one per line) so revision.py can link tests to methods in METHODS.md.
If a site blocks GitHub Actions (captchas, geo, bot walls), mark the test @pytest.mark.github_failed or @pytest.mark.rate_limited and document why — the test still counts for local runs and for coverage intent; CI uses -m 'not github_failed and not rate_limited'.

Where a live call is too flaky, add a fast offline check in a small module test (e.g. tests/test_socid_improvements.py with a saved HTML/JSON snippet) in addition to the e2e policy above, not as a full substitute.

Cookie-based scenarios may use files under tests/ (e.g. *.cookies); the default CI run excludes tests whose names match cookies (see below).

Pytest markers

Defined in pyproject.toml ([tool.pytest.ini_options]):

Marker	Meaning
`github_failed`	Request or site behavior often fails from GitHub Actions runners (blocks, geo, etc.). Excluded in CI.
`rate_limited`	Anti-bot, captcha, or rate limits. Excluded in CI.
`requires_cookies`	Needs authenticated cookies. Documented for selective runs.

Use @pytest.mark.skip for temporarily broken tests; reasons appear in revision.py output when regenerating METHODS.md.

Local commands

From the root README:

python3 -m pytest tests/test_e2e.py -n 10 -k 'not cookies' -m 'not github_failed and not rate_limited'

-n 10 — Requires pytest-xdist (listed in test-requirements.txt) for parallel workers. Omit -n 10 if you did not install xdist.
Filters match what CI runs, plus optional parallelism for speed.

Minimal dependency set for tests is in test-requirements.txt (pytest, rerun plugin, xdist). Runtime library deps remain in requirements.txt.

`tests/reformat.sh`

Helper script that turns lines of the form key: value into assert info.get("key") == "value" patterns in test_e2e.py (macOS sed syntax). Use after pasting CLI output into the test file as documented in CONTRIBUTING.md.

GitHub Actions

.github/workflows/python-package.yml runs on pushes and pull requests to master:

Python 3.10, 3.11, 3.12
flake8 — syntax/undefined-name checks; complexity/length as warnings (setup.cfg ignores E501 for line length)
mypy — type checking with mypy socid_extractor/ (stub overrides in pyproject.toml)
pytest — pytest -k 'not cookies' -m 'not github_failed and not rate_limited' --reruns 3 --reruns-delay 30 (pytest-rerunfailures for flaky network tests)

Publishing to PyPI on release is handled by .github/workflows/python-publish.yml using python -m build.

`revision.py`

Run from the repository root:

python revision.py

It:

Reads pytest marker descriptions from pyproject.toml
Loads tests from tests/test_e2e.py and schemes from socid_extractor/schemes.py
Associates tests with scheme names via docstrings (method name per line) or heuristic name matching
Overwrites METHODS.md with a table of methods, test links, and notes (markers, skip reasons)
Prints how many methods have no matching test

Keep docstrings in tests aligned with scheme names in schemes when you want accurate coverage reporting.

Flake8

setup.cfg configures flake8 with ignore = E501 (line length). CI still runs additional flake8 passes as defined in the workflow file.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Testing and CI

End-to-end tests

Policy: one e2e test per site / scheme

Pytest markers

Local commands

`tests/reformat.sh`

GitHub Actions

`revision.py`

Flake8

Uh oh!

FilesExpand file tree

testing-and-ci.md

Latest commit

History

testing-and-ci.md

File metadata and controls

Testing and CI

End-to-end tests

Policy: one e2e test per site / scheme

Pytest markers

Local commands

tests/reformat.sh

GitHub Actions

revision.py

Flake8

`tests/reformat.sh`

`revision.py`