Testing

This is the single source of truth for Vera's testing infrastructure, coverage data, and test conventions.

Overview

Metric	Value
Tests	3,640 across 29 files (~34,500 lines of test code; 3,626 passed, 14 skipped)
Compiler code coverage	96% of 15,149 statements (CI minimum: 80%)
Conformance programs	82 programs across 9 spec chapters, validating every language feature
Example programs	33, all validated through `vera check` + `vera verify`
Spec code blocks	164 parseable blocks from 13 spec chapters: 86 parse, 72 type-check, 71 verify
README code blocks	13 Vera blocks (12 validated, 1 allowlisted future syntax)
FAQ code blocks	1 Vera block in FAQ.md (0 validated, 1 allowlisted snippet)
HTML code blocks	4 Vera blocks in docs/index.html (4 validated: parse + check + verify)
Contract verification	162 of 179 contracts (90.5%) verified statically (Tier 1)
CI matrix	6 combinations (Python 3.11/3.12/3.13 x Ubuntu/macOS) + browser parity (Node.js 22)

Running Tests

All commands assume the virtual environment is active (source .venv/bin/activate).

# Test suite
pytest tests/ -v                                     # full suite, verbose
pytest tests/test_codegen.py                         # single file
pytest tests/test_codegen.py::TestArithmetic          # single class
pytest tests/test_conformance.py -v                  # conformance suite only
pytest tests/ --cov=vera --cov-report=term-missing   # with coverage

# JavaScript coverage (browser runtime)
VERA_JS_COVERAGE=1 pytest tests/test_browser.py -v  # V8 coverage via c8

# Type checking
mypy vera/                                           # strict mode

# Validation scripts
python scripts/check_conformance.py                  # conformance suite (82 programs, see manifest.json)
python scripts/check_examples.py                     # 33 example programs
python scripts/check_spec_examples.py                # spec code blocks
python scripts/check_readme_examples.py              # README code blocks
python scripts/check_skill_examples.py               # SKILL.md code blocks
python scripts/check_faq_examples.py                 # FAQ.md code blocks
python scripts/check_html_examples.py               # docs/index.html code blocks
python scripts/check_version_sync.py                 # version consistency
python scripts/fix_allowlists.py --fix               # auto-fix stale allowlists

Test Files

File	Tests	Lines	What it covers
`test_parser.py`	127	968	Grammar rules, operator precedence, parse errors
`test_ast.py`	126	1,130	AST transformation, node structure, serialisation, string escape sequences, ability declarations
`test_checker.py`	508	5,656	Type synthesis, slot resolution, effects, effect subtyping, contracts, exhaustiveness, cross-module typing, visibility, error codes, string built-ins, generic rejection, IO operation types, Markdown types, Regex types, abilities, Map collection, Set collection, Decimal type, Json type, Html type, Http effect, Inference effect, removed legacy name regression
`test_verifier.py`	144	2,072	Z3 verification, counterexamples, tier classification, call-site preconditions, branch-aware preconditions, pipe operator, cross-module contracts, match/ADT verification, decreases verification, mutual recursion, refined Bool/String/Float64 param sorts, @Nat subtraction underflow obligation (#520 — Path-A obligation discharge via requires/path-conditions/path-aware Z3 refutation, pure-literal exclusion, Int-Int and Nat-Int → Int exemptions)
`test_codegen.py`	998	14,131	WASM compilation, arithmetic, Float64, Byte, arrays (incl. compound element types), ADTs, match (incl. nested patterns), generics, State<T>, Exn<E> handlers, control flow, strings, string escape sequences, IO (read_line, read_file, write_file, args, exit, get_env, sleep, time, stderr), bounds checking, quantifiers, assert/assume, refinement type aliases, pipe operator, string built-ins, built-in shadowing, parse_nat Result, GC, Markdown host bindings, Regex host bindings, Map collection, Set collection, Decimal type, Json type, Html type, Http effect, Inference effect, Random effect, example round-trips, GC shadow stack overflow, WASM tail-call optimization (#517 — `return_call` emission for tail-position calls, 50K- and 1M-iteration stress, structural assertions on `return_call`/plain `call` boundary, allocating-function fallback, postcondition-fallback regression, analyzer unit tests covering Block-trailing / IfExpr-both-branches / MatchExpr-arm-bodies / let-value-NOT-marked / call-args-NOT-marked / ExprStmt-statement-NOT-marked / IfExpr-condition-NOT-marked / MatchExpr-scrutinee-NOT-marked)
`test_codegen_contracts.py`	32	576	Runtime pre/postconditions, contract fail messages, old/new state postconditions
`test_codegen_monomorphize.py`	71	1,326	Generic instantiation, type inference, monomorphization edge cases, ability constraint satisfaction (Eq/Ord/Hash/Show), operation rewriting (eq/compare), show/hash dispatch, ADT auto-derivation, array operations (slice/map/filter/fold)
`test_codegen_closures.py`	25	759	Closure lifting, captured variables, higher-order functions
`test_codegen_modules.py`	19	565	Cross-module guard rail, cross-module codegen, name collision detection (E608/E609/E610)
`test_codegen_coverage.py`	5	250	Defensive error paths: E600, E601, E605, E606, unknown module calls
`test_errors.py`	52	525	Error code registry, diagnostic formatting, serialisation, SourceLocation, error display sync (README/HTML/spec)
`test_formatter.py`	120	1,075	Comment extraction, interior comment positioning, expression/declaration formatting, match arm block bodies, idempotency, parenthesization, spec rules, ability declarations
`test_cli.py`	217	3,021	CLI commands (check, verify, compile, run, test, fmt, version, quiet), subprocess integration, JSON error paths, runtime traps, arg validation, multi-file resolution, IO exit codes, --explain-slots
`test_resolver.py`	15	412	Module resolution, path lookup, parse caching, circular import detection
`test_types.py`	73	390	Type operations: subtyping, effect subtyping, equality, substitution, pretty-printing, canonical names
`test_wasm.py`	22	255	WASM internals: StringPool, WasmSlotEnv, translation edge cases via full pipeline
`test_verifier_coverage.py`	79	1,260	Verifier/SMT coverage gaps: SMT encoding paths, verifier edge cases, defensive branches
`test_wasm_coverage.py`	225	3,903	WASM coverage gaps: helpers unit tests, inference branches, closure free-var walking, operator/data/context edge cases
`test_tester.py`	14	369	Contract-driven testing: tier classification, input generation, test execution, skip message content
`test_tester_coverage.py`	34	901	Tester coverage gaps: String/Float64/ADT parameter input generation, Bool/Byte parameters, unsatisfiable preconditions, type expression edge cases
`test_markdown.py`	59	394	Markdown parser: block/inline parsing, rendering, round-trips, edge cases
`test_browser.py`	99	1,821	Browser parity: Python/wasmtime vs Node.js/JS-runtime output equivalence across IO, State, contracts, Markdown, Regex, and all compilable examples
`test_conformance.py`	410	102	Parametrized conformance suite: parse, check, verify, run, format idempotency across 82 programs
`test_prelude.py`	24	422	Prelude injection: Option/Result/array operation detection, combinator shadowing, type aliases, end-to-end compilation
`test_readme.py`	2	79	README code sample parsing
`test_html.py`	4	166	HTML landing page code samples: parse, check, verify
`test_build_site.py`	17	213	`_abs_links` unit tests: relative link rewriting, fenced block immunity (backtick and tilde fences, inline backticks inside fences), http/https/fragment pass-through, Vera effect syntax not mis-parsed
`test_check_changelog_updated.py`	66	638	`check_changelog_updated.py` unit + end-to-end tests: file classification (incl. file-style exact-match vs directory-style prefix-match), CHANGELOG diff parsing with `[Unreleased]` section tracking, bare-heading rejection, and full-file context (regression test for bullets far below the heading), `Skip-changelog:` trailer detection, temp-repo integration covering substantive/exempt/label/trailer paths
`test_runtime_traps.py`	53	1,784	Runtime trap categorisation (#516 Stage 1), stdout/stderr-on-trap preservation (#522), `IO.print` live tee (#543), and trap source backtrace (#516 Stage 2): `_classify_trap` per-`kind` mapping (`divide_by_zero`/`out_of_bounds`/`stack_exhausted`/`unreachable`/`overflow`/`contract_violation`/`unknown`), `WasmTrapError` shape + `RuntimeError` substitutability, end-to-end `cmd_run` text + JSON envelopes including `trap_kind`, captured `stdout`, captured `stderr`, JSON-mode "no stderr leak" invariant, cross-stream code-order regression using merged `redirect_stdout`/`redirect_stderr`, the v0.0.123 tee suite (live streaming, write-count + order preservation, JSON-mode tee suppression, trap preservation invariant under tee, per-write flush count, default-execute silence), and the v0.0.124 source-mapping suite — `_resolve_trap_frames` unit tests covering user-fn / built-in / built-in-prefix / monomorphized base-name fallback / unknown-name / no-frames-attribute / leaf-first ordering preservation; end-to-end `cmd_run` text-mode + JSON-mode backtrace including the leaf-first ordering invariant; contract-violation backtrace in both text and JSON modes; direct `execute()` `WasmTrapError.frames` attachment; suppression marker for collapsed leading runtime-helper frames (mocked `vera.codegen.execute` with synthetic `is_builtin=True` leaf frames so the collapse logic is testable deterministically); source-map population for top-level fns + lifted closures (with span-value assertion against the closure literal's exact line range); and the no-builtin-leakage regression that pins built-in helpers (`alloc` / `gc_collect` / `contract_fail`) NOT being registered in `fn_source_map`; plus the v0.0.125 Stage 3 suite (`#547`) — text-mode `Fix:` block surfacing with position-ordering invariant (Fix appears after the source backtrace), text-mode block suppression for `contract_violation` (no empty header noise), JSON-mode `fix` field always-present (schema stability) including the empty-string case, `_TRAP_FIX_PARAGRAPHS` table-completeness assertion (every kind in the taxonomy has a Fix paragraph entry), and the column-wrap invariant (~76 chars max per line, two-space indent under the `Fix:` heading)

Conformance Suite

The conformance suite is a collection of 82 small, focused programs in tests/conformance/ that systematically validate every language feature against the spec. Each program is self-contained and imports nothing, with the single exception of ch07_cross_module_contracts.vera which depends on ch07_cross_module_contracts_lib.vera. Each program tests one feature or a small group of related features.

Simon Willison argues that conformance suites are a "huge unlock" for language projects — they transform development from trust-based to verification-based. The conformance suite serves as the definitive specification artifact that any implementation (or agent) can validate against.

Three-layer testing model

Vera has three distinct test layers, each serving a different purpose:

Layer	Location	Purpose	What it tests
Unit tests	`tests/test_*.py`	Test compiler internals	Error paths, edge cases, internal APIs
Conformance suite	`tests/conformance/`	Spec-anchored feature validation	Every language feature, one program per feature
Example programs	`examples/`	Showcase programs and demos	End-to-end usage, documentation

Unit tests verify that the compiler works correctly. Conformance programs verify that the language works correctly. Examples demonstrate how to use the language. All three run in CI and pre-commit hooks.

Test levels

Each conformance program declares the deepest pipeline stage it must pass:

Level	What it validates	Count
`parse`	Source text is syntactically valid	0
`check`	Parses and type-checks cleanly	4
`verify`	Type-checks and all contracts verified by Z3	6
`run`	Compiles to WASM and executes correctly	72

Almost all programs are at the run level — they compile and execute, producing correct results. Four programs (ch07_cross_module_contracts_lib, ch09_http, ch09_inference, ch03_typed_holes) are at the check level. Six programs (ch03_slot_let_chains, ch03_slot_noncommutative, ch07_cross_module_contracts, ch07_io_sleep, ch07_random_effect, ch09_math_builtins) are at the verify level, using Z3-provable contracts.

Skipped tests

pytest tests/ -v reports 14 skipped tests across two categories:

Level-limited skips — the conformance framework only runs tests up to the declared level; stages beyond that level are automatically skipped. These are expected and correct.

Test	Program	Declared level	Skipped stage	Reason
`test_run[ch03_slot_let_chains]`	`ch03_slot_let_chains.vera`	`verify`	`run`	`verify`-level programs don't get a `run` test
`test_run[ch03_slot_noncommutative]`	`ch03_slot_noncommutative.vera`	`verify`	`run`	`verify`-level programs don't get a `run` test
`test_verify[ch03_typed_holes]`	`ch03_typed_holes.vera`	`check`	`verify`	`check`-level program: verify stage not run
`test_run[ch03_typed_holes]`	`ch03_typed_holes.vera`	`check`	`run`	`check`-level program: no standalone `main`
`test_run[ch07_cross_module_contracts]`	`ch07_cross_module_contracts.vera`	`verify`	`run`	`verify`-level programs don't get a `run` test
`test_verify[ch07_cross_module_contracts_lib]`	`ch07_cross_module_contracts_lib.vera`	`check`	`verify`	`check`-level program: verify stage not run
`test_run[ch07_cross_module_contracts_lib]`	`ch07_cross_module_contracts_lib.vera`	`check`	`run`	`check`-level library module: no standalone `main`
`test_run[ch07_io_sleep]`	`ch07_io_sleep.vera`	`verify`	`run`	`verify`-level programs don't get a `run` test
`test_run[ch07_random_effect]`	`ch07_random_effect.vera`	`verify`	`run`	`verify`-level programs don't get a `run` test
`test_run[ch09_math_builtins]`	`ch09_math_builtins.vera`	`verify`	`run`	`verify`-level programs don't get a `run` test

Environment-gated skips — these programs require network access or a live API key that is not available in CI. They pass vera check (type-checking) but cannot be executed.

Test	Program	Declared level	Skipped stage	Reason
`test_verify[ch09_http]`	`ch09_http.vera`	`check`	`verify`	Requires outbound HTTP; unavailable in CI sandbox
`test_run[ch09_http]`	`ch09_http.vera`	`check`	`run`	Requires outbound HTTP; unavailable in CI sandbox
`test_verify[ch09_inference]`	`ch09_inference.vera`	`check`	`verify`	Requires `VERA_*_API_KEY`; not set in CI
`test_run[ch09_inference]`	`ch09_inference.vera`	`check`	`run`	Requires `VERA_*_API_KEY`; not set in CI

To run the environment-gated tests locally: set VERA_ANTHROPIC_API_KEY (or another provider key) and ensure outbound HTTP is available, then vera run tests/conformance/ch09_http.vera / vera run tests/conformance/ch09_inference.vera.

Directory structure

tests/conformance/
├── manifest.json              # Machine-readable test metadata
├── ch01_int_literals.vera     # Chapter 1: Integer literals
├── ch01_float_literals.vera   # Chapter 1: Float64 literals
├── ch01_string_escapes.vera   # Chapter 1: String escape sequences
├── ...                        # 82 programs total, organized by spec chapter
├── ch07_state_handler.vera    # Chapter 7: State<T> effect handler
├── ch07_exn_handler.vera      # Chapter 7: Exn<E> effect handler
├── ch09_numeric_builtins.vera # Chapter 9: Numeric built-in functions
├── ch09_type_conversions.vera # Chapter 9: Numeric type conversions
├── ch09_markdown.vera         # Chapter 9: Markdown standard library
├── ch09_regex.vera            # Chapter 9: Regular expression matching
├── ch09_decimal.vera          # Chapter 9: Decimal type operations
├── ch09_json.vera             # Chapter 9: JSON standard library
├── ch09_http.vera             # Chapter 9: Http effect (check level)
└── ch10_float_predicates.vera # Chapter 9: Float64 predicates and constants

Manifest

manifest.json maps each program to its spec chapter, test level, and feature tags:

{
  "id": "ch04_arithmetic",
  "file": "ch04_arithmetic.vera",
  "chapter": 4,
  "title": "Arithmetic operators",
  "level": "run",
  "spec_ref": "Section 4.1",
  "features": ["add", "sub", "mul", "div", "mod", "unary_neg"]
}

The manifest is the machine-readable feature inventory — agents can query it to find which features exist and where they are tested.

Running the conformance suite

# Via pytest (parametrized — 410 tests)
pytest tests/test_conformance.py -v

# Via standalone script (used in CI and pre-commit)
python scripts/check_conformance.py

The pytest runner (test_conformance.py) parametrizes over every manifest entry and runs five checks per program: parse, check, verify, run, and format idempotency.

Adding a conformance test

Write a .vera program in tests/conformance/ following the naming convention chNN_feature_name.vera
Include a header comment indicating the spec chapter and what the program tests
Ensure the program has a main function (for run-level tests)
Format it: vera fmt --write tests/conformance/your_file.vera
Add an entry to manifest.json with the appropriate level and feature tags
Run python scripts/check_conformance.py to validate

When implementing a new language feature, the conformance program should be written first — this is test-driven development against the spec.

Compiler Code Coverage

Coverage by module, measured by pytest --cov=vera:

Module	Stmts	Miss	Coverage
`codegen/`	1,934	99	95%
`checker/`	1,117	73	93%
`wasm/`	7,473	268	96%
`browser/`	21	0	100%
`verifier.py`	429	0	100%
`transform.py`	564	16	97%
`formatter.py`	673	54	92%
`ast.py`	460	30	93%
`smt.py`	495	0	100%
`markdown.py`	413	54	87%
`types.py`	182	7	96%
`errors.py`	126	1	99%
`environment.py`	239	8	97%
`cli.py`	474	0	100%
`parser.py`	45	0	100%
`resolver.py`	68	2	97%
`tester.py`	312	0	100%
`prelude.py`	106	0	100%
`registration.py`	18	0	100%
Total	15,149	612	96%

The lowest-coverage module is markdown.py at 87%, reflecting Markdown AST traversal edge cases. The wasm/ subsystem was improved from 79% to 96% by #156 and #324; the remaining gaps are mostly in wasm/inference.py (85%) deep type-dispatch branches for specific builtin functions.

Contract Verification Coverage

Vera's verifier classifies each contract into one of three tiers. Tier 1 contracts are proved correct statically by Z3 — no runtime overhead. Tier 3 contracts cannot be fully decided by the SMT solver and fall back to runtime assertion checks. The verifier never rejects a valid program; it simply warns when a contract drops to Tier 3.

Across all 33 example programs:

Metric	Value
Tier 1 (static)	162 contracts — proved automatically by Z3
Tier 3 (runtime)	16 contracts — verified at runtime via assertion checks
Total	177 contracts (91.0% static)

The 16 remaining Tier 3 contracts and why they cannot be promoted:

Example	Contract	Reason
async_futures.vera	2 contracts	Async/future combinators not in decidable fragment
collections.vera	8 contracts	Collection operations (Map/Set) not modeled in Z3
gc_pressure.vera	`decreases` in `repeat`	Termination metric not in decidable fragment
generics.vera	`ensures(@T.result == @T.0)`	Generic type parameters have no Z3 sort
generics.vera	`ensures(@A.result == @A.0)`	Generic type parameters have no Z3 sort
increment.vera	`ensures(new(State<Int>) == old(State<Int>) + 1)`	`old`/`new` state modeling not yet implemented
json.vera	2 contracts	Json ADT operations not modeled in Z3

The Tier 1 fragment covers: integer/boolean arithmetic, comparisons, if/else, let bindings, match expressions, ADT constructors, function calls (modular postcondition), length, and decreases clauses (self-recursive, mutual recursion via where-blocks, Nat and structural ADT measures).

Language Feature Coverage

How Vera language features (by spec chapter) map to test files and example programs:

Spec chapter	Feature	Test files	Conformance	Examples
Ch 1: Lexical	Literals (Int, Float64, Bool, Byte, String)	test_ast, test_codegen	ch01_int_literals, ch01_float_literals, ch01_bool_literals, ch01_byte_literals	most examples
Ch 1: Lexical	String escape sequences (`\n`, `\t`, `\\`, `\"`, `\r`, `\0`, `\u{XXXX}`)	test_ast, test_codegen	ch01_string_escapes	io_operations, file_io
Ch 1: Lexical	Comments	test_parser	ch01_comments	—
Ch 2: Types	Int, Nat, Bool, String, Float64, Byte, Unit	test_codegen, test_checker	ch02_builtin_types	most examples
Ch 2: Types	ADTs (algebraic data types), Option, Result	test_codegen, test_checker	ch02_adt_basic, ch02_adt_recursive, ch02_option_result	pattern_matching, list_ops
Ch 2: Types	Refinement types	test_codegen, test_verifier	ch02_refinement_types	refinement_types, safe_divide
Ch 2: Types	Generics (`forall<T>`)	test_codegen_monomorphize, test_checker	ch02_generics	generics
Ch 3: Slots	`@T.n` references, De Bruijn indexing	test_checker, test_codegen	ch03_slot_basic, ch03_slot_indexing, ch03_slot_result	all 33 examples
Ch 4: Expressions	Arithmetic, comparison, boolean, unary ops	test_codegen, test_checker	ch04_arithmetic, ch04_comparison, ch04_boolean_ops	factorial, absolute_value
Ch 4: Expressions	If/else, let, match, pipe operator	test_codegen, test_checker	ch04_if_else, ch04_let_binding, ch04_match_basic, ch04_match_nested, ch04_pipe_operator	pattern_matching
Ch 4: Expressions	String and array builtins	test_codegen	ch04_string_builtins, ch04_array_ops	string_ops
Ch 5: Functions	Declarations, recursion, mutual recursion	test_codegen, test_checker	ch05_basic_function, ch05_recursion, ch05_mutual_recursion	factorial, mutual_recursion
Ch 5: Functions	Closures, higher-order functions	test_codegen_closures	ch05_closures	closures
Ch 5: Functions	Visibility (`public`/`private`)	test_checker	ch05_visibility	modules
Ch 6: Contracts	Preconditions (`requires`)	test_codegen_contracts, test_verifier	ch06_requires	safe_divide
Ch 6: Contracts	Postconditions (`ensures`)	test_codegen_contracts, test_verifier	ch06_ensures	absolute_value
Ch 6: Contracts	Decreases clauses, assert/assume	test_verifier, test_codegen	ch06_decreases, ch06_assert_assume	factorial
Ch 6: Contracts	Quantifiers (forall, exists)	test_codegen, test_verifier	ch06_quantifiers	quantifiers
Ch 7: Effects	Pure, IO, State<T>	test_codegen, test_checker	ch07_pure, ch07_io, ch07_state_handler	hello_world, increment, io_operations, file_io
Ch 7: Effects	Effect handlers (State<T>, Exn<E>)	test_codegen, test_checker	ch07_state_handler, ch07_exn_handler	effect_handler
Ch 9: Stdlib	Numeric builtins (abs, min, max, floor, ceil, round, sqrt, pow)	test_codegen, test_checker	ch09_numeric_builtins	—
Ch 9: Stdlib	Type conversions (int_to_float, float_to_int, nat_to_int, int_to_nat, byte_to_int, int_to_byte)	test_codegen, test_checker	ch09_type_conversions	—
Ch 9: Stdlib	Float64 predicates (float_is_nan, float_is_infinite, nan, infinity)	test_codegen, test_checker	ch10_float_predicates	—
Ch 7: Effects	Effect subtyping (§7.8), call-site checking	test_types, test_checker	—	—
Ch 2: Types	Bidirectional type checking (local inference)	test_checker	—	—
Ch 4: Expressions	Nested constructor patterns in match	test_codegen	ch04_match_nested	pattern_matching
Ch 8: Modules	Imports, cross-module typing and codegen	test_codegen_modules, test_resolver	—	modules
Ch 11: Compilation	Cross-module name collision detection (E608/E609/E610)	test_codegen_modules	—	—
Ch 9: Stdlib	Markdown (md_parse, md_render, md_has_heading, md_has_code_block, md_extract_code_blocks)	test_codegen, test_markdown	ch09_markdown	markdown
Ch 9: Stdlib	Regex (regex_match, regex_find, regex_find_all, regex_replace)	test_codegen, test_checker	ch09_regex	regex
Ch 9: Stdlib	Map, Set, Decimal collections	test_codegen, test_checker	ch09_map, ch09_set, ch09_decimal, ch09_decimal_generics	collections
Ch 9: Stdlib	Json (json_parse, json_stringify, json_get, json_array_get, json_array_length, json_keys, json_has_field, json_type)	test_codegen, test_checker	ch09_json	json
Ch 9: Stdlib	Html (html_parse, html_to_string, html_query, html_text, html_attr)	test_codegen, test_checker	ch09_html	html
Ch 9: Stdlib	Http effect (Http.get, Http.post)	test_codegen, test_checker	ch09_http	http
Ch 11: Compilation	Contract-driven testing (Z3 input gen + WASM execution)	test_tester, test_cli	—	safe_divide, factorial
Ch 12: Runtime	Browser runtime parity (JS host bindings match Python)	test_browser	—	—

Test Helpers

Each test module defines module-level helper functions (no conftest.py):

# test_checker.py pattern:
_check_ok(source)              # assert no type errors
_check_err(source, "match")    # assert at least one error matching substring

# test_verifier.py pattern:
_verify_ok(source)             # assert no verification errors
_verify_err(source, "match")   # assert at least one verification error
_verify_warn(source, "match")  # assert at least one warning

# test_codegen.py pattern:
_compile_ok(source)            # assert compilation succeeds
_run(source, fn, args)         # compile + execute, return result
_run_io(source, fn, args)      # compile + execute, return captured stdout
_run_trap(source, fn, args)    # compile + execute, assert WASM trap

Round-Trip Testing

Every one of the 33 example programs in examples/ is tested through every pipeline stage via parametrised tests: parsing, AST transformation, type checking, contract verification, WASM compilation, and execution. If you add a new .vera example, it is automatically included in the round-trip suite.

The formatter has idempotency tests: format(format(x)) == format(x) for all tested programs.

Adding Tests

When extending the compiler, add tests following the existing patterns:

New grammar construct: Add parser tests to test_parser.py (positive and negative)
New AST node: Add transformation tests to test_ast.py (check node fields, spans, serialisation)
New type rule: Add checker tests to test_checker.py using _check_ok()/_check_err()
New SMT support: Add verifier tests to test_verifier.py using _verify_ok()/_verify_err()
New codegen support: Add compilation tests to test_codegen.py using _compile_ok()/_run()/_run_trap()
New example program: Add to examples/ -- it is automatically included in round-trip tests
New error pattern: Add formatting tests to test_errors.py
New tester feature: Add tests to test_tester.py using _test(source) helper
New host binding: Add parity tests to test_browser.py to ensure the JavaScript runtime stays in sync with the Python runtime

Validation Scripts

Twelve scripts in scripts/ validate cross-cutting concerns beyond unit tests:

Script	What it validates
`check_conformance.py`	All 82 conformance programs pass their declared level (parse/check/verify/run)
`check_examples.py`	All 33 `.vera` examples pass `vera check` + `vera verify`
`check_spec_examples.py`	148 parseable code blocks from spec chapters: parse, type-check, and verify
`check_readme_examples.py`	All Vera code blocks in README.md parse correctly
`check_skill_examples.py`	All Vera code blocks in SKILL.md parse correctly
`check_faq_examples.py`	All Vera code blocks in FAQ.md parse correctly
`check_html_examples.py`	All Vera code blocks in docs/index.html pass parse + check + verify
`check_site_assets.py`	Generated site assets under `docs/` are up-to-date
`check_version_sync.py`	`pyproject.toml` and `vera/__init__.py` versions match
`check_doc_counts.py`	Counts cited in TESTING.md, CONTRIBUTING.md, and CLAUDE.md match live codebase
`check_licenses.py`	All installed packages have MIT-compatible licenses
`fix_allowlists.py`	Auto-fix stale allowlist line numbers after Markdown edits

These run in both pre-commit hooks and CI, so issues are caught locally before they reach the remote.

Spec validation pipeline

check_spec_examples.py pushes spec code blocks through three compiler stages, with allowlists at each level:

Stage	Pass	Allowlisted	Categories
Parse	81	67	FUTURE (9), FRAGMENT (58)
Type-check	67	14	INCOMPLETE (13), FUTURE (1)
Verify	66	1	ILLUSTRATIVE (1)

Allowlisted entries have stale-detection: when a feature lands or a spec edit shifts line numbers, CI flags the entry for removal or the fix_allowlists.py script auto-fixes the line numbers. The INCOMPLETE check entries reference functions, types, or imports not defined in the block (e.g. abs, Tuple, array_map, parse_int). The 1 FUTURE check entry uses async/await. The 1 ILLUSTRATIVE verify entry is a spec example demonstrating multiple postconditions syntax where the contract is intentionally imprecise.

Pre-commit Hooks

Every push is checked by 25 configured hooks across two stages: 23 are configured at the commit stage (after pre-commit install), and 2 (check-changelog-updated, uv-lock-check) are configured at the push stage (after pre-commit install --hook-type pre-push). Many commit-stage hooks use per-hook files: / types: filters and only fire when matching files are staged — a docs-only commit triggers a small subset, a compiler-level commit triggers most. Full list:

Hook	What it does
`trailing-whitespace`	Strip trailing whitespace
`end-of-file-fixer`	Ensure files end with a newline
`check-yaml` / `check-toml`	Validate config file syntax
`check-merge-conflict`	Detect conflict markers
`check-added-large-files`	Reject files >500 KB
`debug-statements`	Detect `pdb`/`ipdb` imports
`mypy vera/`	Type-check compiler in strict mode
`pytest tests/ -q`	Run full test suite
`fix_allowlists.py --fix`	Auto-fix stale allowlist line numbers
`check_conformance.py`	All 82 conformance programs pass their declared level
`check_examples.py`	All 33 examples pass `vera check` + `vera verify`
`check_examples_readme.py`	`vera run` commands in `examples/README.md` reference existing files and exported functions
`check_readme_examples.py`	README code blocks parse correctly
`check_examples_doc.py`	EXAMPLES.md code blocks parse correctly
`check_skill_examples.py`	SKILL.md code blocks parse correctly
`check_faq_examples.py`	FAQ.md code blocks parse correctly
`check_html_examples.py`	HTML landing page code blocks pass parse + check + verify
`check_doc_counts.py`	Counts in docs match live codebase
`check_limitations_sync.py`	Limitation tables consistent across KNOWN_ISSUES.md, vera/README, and spec
`check_licenses.py`	All package licenses are MIT-compatible
`build_site.py`	Regenerate AI-readable site assets (llms.txt, llms-full.txt, robots.txt, sitemap.xml, index.md)
`browser parity`	Browser runtime produces identical output to Python runtime
`check-changelog-updated` (pre-push)	CHANGELOG has a new entry when substantive files changed
`uv-lock-check` (pre-push)	`uv.lock` is in sync with `pyproject.toml`

The validation hooks are smart about triggers -- they only run when relevant files change (.vera, vera/**/*.py, grammar.lark, the corresponding Markdown file, or vera/browser/* for browser parity). The two pre-push hooks only fire at push time.

CI Pipeline

GitHub Actions (.github/workflows/ci.yml) runs seven parallel jobs on every push and pull request to main:

Job	Matrix / Runner	What it checks
test	Python 3.11, 3.12, 3.13 x Ubuntu, macOS (6 combos)	`pytest -v` passes on all combinations
test (coverage)	Python 3.12 x Ubuntu only	`pytest --cov=vera --cov-fail-under=80`
typecheck	Python 3.12 x Ubuntu	`mypy vera/` clean in strict mode
lint	Python 3.12 x Ubuntu	`check_conformance.py`, `check_examples.py`, `check_examples_readme.py`, `check_version_sync.py`, `check_spec_examples.py`, `check_readme_examples.py`, `check_skill_examples.py`, `check_faq_examples.py`, `check_html_examples.py`, `check_site_assets.py`, `check_licenses.py`, `ruff check --select S vera/` (security rules)
security	Ubuntu	Gitleaks secret scanning on full history
dependency-audit	Python 3.12 x Ubuntu	`pip-audit --skip-editable --ignore-vuln CVE-2026-4539` — checks all installed packages against the OSV vulnerability database (skips the local editable `vera` package; `CVE-2026-4539` suppressed pending a pygments fix release)
sbom	Python 3.12 x Ubuntu	`cyclonedx-py environment` — generates a CycloneDX JSON SBOM of the full installed dependency tree and uploads it as a 90-day CI artifact
browser-parity	Python 3.12 + Node.js 22 x Ubuntu	`pytest tests/test_browser.py -v` — verifies JS runtime matches Python runtime; collects V8 coverage via `NODE_V8_COVERAGE` and uploads to Codecov

The coverage threshold of 80% is enforced in CI. Current coverage is 96%. JavaScript coverage for vera/browser/runtime.mjs is collected separately using V8's built-in coverage and uploaded to Codecov with the javascript flag.

Each job uses scoped permissions (contents: read; the security job additionally has security-events: write) and all checkout steps set persist-credentials: false to prevent the GITHUB_TOKEN from being baked into .git/config. Actions without SHA-pinned version refs are tracked in #390.

Open CI/Tooling Issues

Tracked improvements to the testing and CI infrastructure:

Issue	Description
#349	Improve browser runtime (`runtime.mjs`) test coverage to >80% — JS code is invisible to pytest-cov, blocking codecov/patch on PRs that touch the runtime

Opportunities

Testing infrastructure that could be added in the future:

Property-based testing -- hypothesis is installed as a dev dependency but not yet used. Could generate random programs to test parser robustness and formatter idempotency at scale.
Formatter round-trip invariant -- verify parse(format(parse(src))) == parse(src) for all valid programs, not just the examples.
WASM inference.py coverage -- wasm/inference.py at 85% has the most remaining gaps, mostly in deep type-dispatch branches for specific builtin function return types. These branches require very specific expression nesting patterns to reach.
Performance benchmarks -- no benchmark infrastructure exists. Could track compilation time and Z3 verification time across releases.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Testing

Overview

Running Tests

Test Files

Conformance Suite

Three-layer testing model

Test levels

Skipped tests

Directory structure

Manifest

Running the conformance suite

Adding a conformance test

Compiler Code Coverage

Contract Verification Coverage

Language Feature Coverage

Test Helpers

Round-Trip Testing

Adding Tests

Validation Scripts

Spec validation pipeline

Pre-commit Hooks

CI Pipeline

Open CI/Tooling Issues

Opportunities

FilesExpand file tree

TESTING.md

Latest commit

History

TESTING.md

File metadata and controls

Testing

Overview

Running Tests

Test Files

Conformance Suite

Three-layer testing model

Test levels

Skipped tests

Directory structure

Manifest

Running the conformance suite

Adding a conformance test

Compiler Code Coverage

Contract Verification Coverage

Language Feature Coverage

Test Helpers

Round-Trip Testing

Adding Tests

Validation Scripts

Spec validation pipeline

Pre-commit Hooks

CI Pipeline

Open CI/Tooling Issues

Opportunities