This is the single source of truth for Vera's testing infrastructure, coverage data, and test conventions.
| Metric | Value |
|---|---|
| Tests | 3,640 across 29 files (~34,500 lines of test code; 3,626 passed, 14 skipped) |
| Compiler code coverage | 96% of 15,149 statements (CI minimum: 80%) |
| Conformance programs | 82 programs across 9 spec chapters, validating every language feature |
| Example programs | 33, all validated through vera check + vera verify |
| Spec code blocks | 164 parseable blocks from 13 spec chapters: 86 parse, 72 type-check, 71 verify |
| README code blocks | 13 Vera blocks (12 validated, 1 allowlisted future syntax) |
| FAQ code blocks | 1 Vera block in FAQ.md (0 validated, 1 allowlisted snippet) |
| HTML code blocks | 4 Vera blocks in docs/index.html (4 validated: parse + check + verify) |
| Contract verification | 162 of 179 contracts (90.5%) verified statically (Tier 1) |
| CI matrix | 6 combinations (Python 3.11/3.12/3.13 x Ubuntu/macOS) + browser parity (Node.js 22) |
All commands assume the virtual environment is active (source .venv/bin/activate).
# Test suite
pytest tests/ -v # full suite, verbose
pytest tests/test_codegen.py # single file
pytest tests/test_codegen.py::TestArithmetic # single class
pytest tests/test_conformance.py -v # conformance suite only
pytest tests/ --cov=vera --cov-report=term-missing # with coverage
# JavaScript coverage (browser runtime)
VERA_JS_COVERAGE=1 pytest tests/test_browser.py -v # V8 coverage via c8
# Type checking
mypy vera/ # strict mode
# Validation scripts
python scripts/check_conformance.py # conformance suite (82 programs, see manifest.json)
python scripts/check_examples.py # 33 example programs
python scripts/check_spec_examples.py # spec code blocks
python scripts/check_readme_examples.py # README code blocks
python scripts/check_skill_examples.py # SKILL.md code blocks
python scripts/check_faq_examples.py # FAQ.md code blocks
python scripts/check_html_examples.py # docs/index.html code blocks
python scripts/check_version_sync.py # version consistency
python scripts/fix_allowlists.py --fix # auto-fix stale allowlists| File | Tests | Lines | What it covers |
|---|---|---|---|
test_parser.py |
127 | 968 | Grammar rules, operator precedence, parse errors |
test_ast.py |
126 | 1,130 | AST transformation, node structure, serialisation, string escape sequences, ability declarations |
test_checker.py |
508 | 5,656 | Type synthesis, slot resolution, effects, effect subtyping, contracts, exhaustiveness, cross-module typing, visibility, error codes, string built-ins, generic rejection, IO operation types, Markdown types, Regex types, abilities, Map collection, Set collection, Decimal type, Json type, Html type, Http effect, Inference effect, removed legacy name regression |
test_verifier.py |
144 | 2,072 | Z3 verification, counterexamples, tier classification, call-site preconditions, branch-aware preconditions, pipe operator, cross-module contracts, match/ADT verification, decreases verification, mutual recursion, refined Bool/String/Float64 param sorts, @Nat subtraction underflow obligation (#520 — Path-A obligation discharge via requires/path-conditions/path-aware Z3 refutation, pure-literal exclusion, Int-Int and Nat-Int → Int exemptions) |
test_codegen.py |
998 | 14,131 | WASM compilation, arithmetic, Float64, Byte, arrays (incl. compound element types), ADTs, match (incl. nested patterns), generics, State<T>, Exn<E> handlers, control flow, strings, string escape sequences, IO (read_line, read_file, write_file, args, exit, get_env, sleep, time, stderr), bounds checking, quantifiers, assert/assume, refinement type aliases, pipe operator, string built-ins, built-in shadowing, parse_nat Result, GC, Markdown host bindings, Regex host bindings, Map collection, Set collection, Decimal type, Json type, Html type, Http effect, Inference effect, Random effect, example round-trips, GC shadow stack overflow, WASM tail-call optimization (#517 — return_call emission for tail-position calls, 50K- and 1M-iteration stress, structural assertions on return_call/plain call boundary, allocating-function fallback, postcondition-fallback regression, analyzer unit tests covering Block-trailing / IfExpr-both-branches / MatchExpr-arm-bodies / let-value-NOT-marked / call-args-NOT-marked / ExprStmt-statement-NOT-marked / IfExpr-condition-NOT-marked / MatchExpr-scrutinee-NOT-marked) |
test_codegen_contracts.py |
32 | 576 | Runtime pre/postconditions, contract fail messages, old/new state postconditions |
test_codegen_monomorphize.py |
71 | 1,326 | Generic instantiation, type inference, monomorphization edge cases, ability constraint satisfaction (Eq/Ord/Hash/Show), operation rewriting (eq/compare), show/hash dispatch, ADT auto-derivation, array operations (slice/map/filter/fold) |
test_codegen_closures.py |
25 | 759 | Closure lifting, captured variables, higher-order functions |
test_codegen_modules.py |
19 | 565 | Cross-module guard rail, cross-module codegen, name collision detection (E608/E609/E610) |
test_codegen_coverage.py |
5 | 250 | Defensive error paths: E600, E601, E605, E606, unknown module calls |
test_errors.py |
52 | 525 | Error code registry, diagnostic formatting, serialisation, SourceLocation, error display sync (README/HTML/spec) |
test_formatter.py |
120 | 1,075 | Comment extraction, interior comment positioning, expression/declaration formatting, match arm block bodies, idempotency, parenthesization, spec rules, ability declarations |
test_cli.py |
217 | 3,021 | CLI commands (check, verify, compile, run, test, fmt, version, quiet), subprocess integration, JSON error paths, runtime traps, arg validation, multi-file resolution, IO exit codes, --explain-slots |
test_resolver.py |
15 | 412 | Module resolution, path lookup, parse caching, circular import detection |
test_types.py |
73 | 390 | Type operations: subtyping, effect subtyping, equality, substitution, pretty-printing, canonical names |
test_wasm.py |
22 | 255 | WASM internals: StringPool, WasmSlotEnv, translation edge cases via full pipeline |
test_verifier_coverage.py |
79 | 1,260 | Verifier/SMT coverage gaps: SMT encoding paths, verifier edge cases, defensive branches |
test_wasm_coverage.py |
225 | 3,903 | WASM coverage gaps: helpers unit tests, inference branches, closure free-var walking, operator/data/context edge cases |
test_tester.py |
14 | 369 | Contract-driven testing: tier classification, input generation, test execution, skip message content |
test_tester_coverage.py |
34 | 901 | Tester coverage gaps: String/Float64/ADT parameter input generation, Bool/Byte parameters, unsatisfiable preconditions, type expression edge cases |
test_markdown.py |
59 | 394 | Markdown parser: block/inline parsing, rendering, round-trips, edge cases |
test_browser.py |
99 | 1,821 | Browser parity: Python/wasmtime vs Node.js/JS-runtime output equivalence across IO, State, contracts, Markdown, Regex, and all compilable examples |
test_conformance.py |
410 | 102 | Parametrized conformance suite: parse, check, verify, run, format idempotency across 82 programs |
test_prelude.py |
24 | 422 | Prelude injection: Option/Result/array operation detection, combinator shadowing, type aliases, end-to-end compilation |
test_readme.py |
2 | 79 | README code sample parsing |
test_html.py |
4 | 166 | HTML landing page code samples: parse, check, verify |
test_build_site.py |
17 | 213 | _abs_links unit tests: relative link rewriting, fenced block immunity (backtick and tilde fences, inline backticks inside fences), http/https/fragment pass-through, Vera effect syntax not mis-parsed |
test_check_changelog_updated.py |
66 | 638 | check_changelog_updated.py unit + end-to-end tests: file classification (incl. file-style exact-match vs directory-style prefix-match), CHANGELOG diff parsing with [Unreleased] section tracking, bare-heading rejection, and full-file context (regression test for bullets far below the heading), Skip-changelog: trailer detection, temp-repo integration covering substantive/exempt/label/trailer paths |
test_runtime_traps.py |
53 | 1,784 | Runtime trap categorisation (#516 Stage 1), stdout/stderr-on-trap preservation (#522), IO.print live tee (#543), and trap source backtrace (#516 Stage 2): _classify_trap per-kind mapping (divide_by_zero/out_of_bounds/stack_exhausted/unreachable/overflow/contract_violation/unknown), WasmTrapError shape + RuntimeError substitutability, end-to-end cmd_run text + JSON envelopes including trap_kind, captured stdout, captured stderr, JSON-mode "no stderr leak" invariant, cross-stream code-order regression using merged redirect_stdout/redirect_stderr, the v0.0.123 tee suite (live streaming, write-count + order preservation, JSON-mode tee suppression, trap preservation invariant under tee, per-write flush count, default-execute silence), and the v0.0.124 source-mapping suite — _resolve_trap_frames unit tests covering user-fn / built-in / built-in-prefix / monomorphized base-name fallback / unknown-name / no-frames-attribute / leaf-first ordering preservation; end-to-end cmd_run text-mode + JSON-mode backtrace including the leaf-first ordering invariant; contract-violation backtrace in both text and JSON modes; direct execute() WasmTrapError.frames attachment; suppression marker for collapsed leading runtime-helper frames (mocked vera.codegen.execute with synthetic is_builtin=True leaf frames so the collapse logic is testable deterministically); source-map population for top-level fns + lifted closures (with span-value assertion against the closure literal's exact line range); and the no-builtin-leakage regression that pins built-in helpers (alloc / gc_collect / contract_fail) NOT being registered in fn_source_map; plus the v0.0.125 Stage 3 suite (#547) — text-mode Fix: block surfacing with position-ordering invariant (Fix appears after the source backtrace), text-mode block suppression for contract_violation (no empty header noise), JSON-mode fix field always-present (schema stability) including the empty-string case, _TRAP_FIX_PARAGRAPHS table-completeness assertion (every kind in the taxonomy has a Fix paragraph entry), and the column-wrap invariant (~76 chars max per line, two-space indent under the Fix: heading) |
The conformance suite is a collection of 82 small, focused programs in tests/conformance/ that systematically validate every language feature against the spec. Each program is self-contained and imports nothing, with the single exception of ch07_cross_module_contracts.vera which depends on ch07_cross_module_contracts_lib.vera. Each program tests one feature or a small group of related features.
Simon Willison argues that conformance suites are a "huge unlock" for language projects — they transform development from trust-based to verification-based. The conformance suite serves as the definitive specification artifact that any implementation (or agent) can validate against.
Vera has three distinct test layers, each serving a different purpose:
| Layer | Location | Purpose | What it tests |
|---|---|---|---|
| Unit tests | tests/test_*.py |
Test compiler internals | Error paths, edge cases, internal APIs |
| Conformance suite | tests/conformance/ |
Spec-anchored feature validation | Every language feature, one program per feature |
| Example programs | examples/ |
Showcase programs and demos | End-to-end usage, documentation |
Unit tests verify that the compiler works correctly. Conformance programs verify that the language works correctly. Examples demonstrate how to use the language. All three run in CI and pre-commit hooks.
Each conformance program declares the deepest pipeline stage it must pass:
| Level | What it validates | Count |
|---|---|---|
parse |
Source text is syntactically valid | 0 |
check |
Parses and type-checks cleanly | 4 |
verify |
Type-checks and all contracts verified by Z3 | 6 |
run |
Compiles to WASM and executes correctly | 72 |
Almost all programs are at the run level — they compile and execute, producing correct results. Four programs (ch07_cross_module_contracts_lib, ch09_http, ch09_inference, ch03_typed_holes) are at the check level. Six programs (ch03_slot_let_chains, ch03_slot_noncommutative, ch07_cross_module_contracts, ch07_io_sleep, ch07_random_effect, ch09_math_builtins) are at the verify level, using Z3-provable contracts.
pytest tests/ -v reports 14 skipped tests across two categories:
Level-limited skips — the conformance framework only runs tests up to the declared level; stages beyond that level are automatically skipped. These are expected and correct.
| Test | Program | Declared level | Skipped stage | Reason |
|---|---|---|---|---|
test_run[ch03_slot_let_chains] |
ch03_slot_let_chains.vera |
verify |
run |
verify-level programs don't get a run test |
test_run[ch03_slot_noncommutative] |
ch03_slot_noncommutative.vera |
verify |
run |
verify-level programs don't get a run test |
test_verify[ch03_typed_holes] |
ch03_typed_holes.vera |
check |
verify |
check-level program: verify stage not run |
test_run[ch03_typed_holes] |
ch03_typed_holes.vera |
check |
run |
check-level program: no standalone main |
test_run[ch07_cross_module_contracts] |
ch07_cross_module_contracts.vera |
verify |
run |
verify-level programs don't get a run test |
test_verify[ch07_cross_module_contracts_lib] |
ch07_cross_module_contracts_lib.vera |
check |
verify |
check-level program: verify stage not run |
test_run[ch07_cross_module_contracts_lib] |
ch07_cross_module_contracts_lib.vera |
check |
run |
check-level library module: no standalone main |
test_run[ch07_io_sleep] |
ch07_io_sleep.vera |
verify |
run |
verify-level programs don't get a run test |
test_run[ch07_random_effect] |
ch07_random_effect.vera |
verify |
run |
verify-level programs don't get a run test |
test_run[ch09_math_builtins] |
ch09_math_builtins.vera |
verify |
run |
verify-level programs don't get a run test |
Environment-gated skips — these programs require network access or a live API key that is not available in CI. They pass vera check (type-checking) but cannot be executed.
| Test | Program | Declared level | Skipped stage | Reason |
|---|---|---|---|---|
test_verify[ch09_http] |
ch09_http.vera |
check |
verify |
Requires outbound HTTP; unavailable in CI sandbox |
test_run[ch09_http] |
ch09_http.vera |
check |
run |
Requires outbound HTTP; unavailable in CI sandbox |
test_verify[ch09_inference] |
ch09_inference.vera |
check |
verify |
Requires VERA_*_API_KEY; not set in CI |
test_run[ch09_inference] |
ch09_inference.vera |
check |
run |
Requires VERA_*_API_KEY; not set in CI |
To run the environment-gated tests locally: set VERA_ANTHROPIC_API_KEY (or another provider key) and ensure outbound HTTP is available, then vera run tests/conformance/ch09_http.vera / vera run tests/conformance/ch09_inference.vera.
tests/conformance/
├── manifest.json # Machine-readable test metadata
├── ch01_int_literals.vera # Chapter 1: Integer literals
├── ch01_float_literals.vera # Chapter 1: Float64 literals
├── ch01_string_escapes.vera # Chapter 1: String escape sequences
├── ... # 82 programs total, organized by spec chapter
├── ch07_state_handler.vera # Chapter 7: State<T> effect handler
├── ch07_exn_handler.vera # Chapter 7: Exn<E> effect handler
├── ch09_numeric_builtins.vera # Chapter 9: Numeric built-in functions
├── ch09_type_conversions.vera # Chapter 9: Numeric type conversions
├── ch09_markdown.vera # Chapter 9: Markdown standard library
├── ch09_regex.vera # Chapter 9: Regular expression matching
├── ch09_decimal.vera # Chapter 9: Decimal type operations
├── ch09_json.vera # Chapter 9: JSON standard library
├── ch09_http.vera # Chapter 9: Http effect (check level)
└── ch10_float_predicates.vera # Chapter 9: Float64 predicates and constants
manifest.json maps each program to its spec chapter, test level, and feature tags:
{
"id": "ch04_arithmetic",
"file": "ch04_arithmetic.vera",
"chapter": 4,
"title": "Arithmetic operators",
"level": "run",
"spec_ref": "Section 4.1",
"features": ["add", "sub", "mul", "div", "mod", "unary_neg"]
}The manifest is the machine-readable feature inventory — agents can query it to find which features exist and where they are tested.
# Via pytest (parametrized — 410 tests)
pytest tests/test_conformance.py -v
# Via standalone script (used in CI and pre-commit)
python scripts/check_conformance.pyThe pytest runner (test_conformance.py) parametrizes over every manifest entry and runs five checks per program: parse, check, verify, run, and format idempotency.
- Write a
.veraprogram intests/conformance/following the naming conventionchNN_feature_name.vera - Include a header comment indicating the spec chapter and what the program tests
- Ensure the program has a
mainfunction (forrun-level tests) - Format it:
vera fmt --write tests/conformance/your_file.vera - Add an entry to
manifest.jsonwith the appropriate level and feature tags - Run
python scripts/check_conformance.pyto validate
When implementing a new language feature, the conformance program should be written first — this is test-driven development against the spec.
Coverage by module, measured by pytest --cov=vera:
| Module | Stmts | Miss | Coverage |
|---|---|---|---|
codegen/ |
1,934 | 99 | 95% |
checker/ |
1,117 | 73 | 93% |
wasm/ |
7,473 | 268 | 96% |
browser/ |
21 | 0 | 100% |
verifier.py |
429 | 0 | 100% |
transform.py |
564 | 16 | 97% |
formatter.py |
673 | 54 | 92% |
ast.py |
460 | 30 | 93% |
smt.py |
495 | 0 | 100% |
markdown.py |
413 | 54 | 87% |
types.py |
182 | 7 | 96% |
errors.py |
126 | 1 | 99% |
environment.py |
239 | 8 | 97% |
cli.py |
474 | 0 | 100% |
parser.py |
45 | 0 | 100% |
resolver.py |
68 | 2 | 97% |
tester.py |
312 | 0 | 100% |
prelude.py |
106 | 0 | 100% |
registration.py |
18 | 0 | 100% |
| Total | 15,149 | 612 | 96% |
The lowest-coverage module is markdown.py at 87%, reflecting Markdown AST traversal edge cases. The wasm/ subsystem was improved from 79% to 96% by #156 and #324; the remaining gaps are mostly in wasm/inference.py (85%) deep type-dispatch branches for specific builtin functions.
Vera's verifier classifies each contract into one of three tiers. Tier 1 contracts are proved correct statically by Z3 — no runtime overhead. Tier 3 contracts cannot be fully decided by the SMT solver and fall back to runtime assertion checks. The verifier never rejects a valid program; it simply warns when a contract drops to Tier 3.
Across all 33 example programs:
| Metric | Value |
|---|---|
| Tier 1 (static) | 162 contracts — proved automatically by Z3 |
| Tier 3 (runtime) | 16 contracts — verified at runtime via assertion checks |
| Total | 177 contracts (91.0% static) |
The 16 remaining Tier 3 contracts and why they cannot be promoted:
| Example | Contract | Reason |
|---|---|---|
| async_futures.vera | 2 contracts | Async/future combinators not in decidable fragment |
| collections.vera | 8 contracts | Collection operations (Map/Set) not modeled in Z3 |
| gc_pressure.vera | decreases in repeat |
Termination metric not in decidable fragment |
| generics.vera | ensures(@T.result == @T.0) |
Generic type parameters have no Z3 sort |
| generics.vera | ensures(@A.result == @A.0) |
Generic type parameters have no Z3 sort |
| increment.vera | ensures(new(State<Int>) == old(State<Int>) + 1) |
old/new state modeling not yet implemented |
| json.vera | 2 contracts | Json ADT operations not modeled in Z3 |
The Tier 1 fragment covers: integer/boolean arithmetic, comparisons, if/else, let bindings, match expressions, ADT constructors, function calls (modular postcondition), length, and decreases clauses (self-recursive, mutual recursion via where-blocks, Nat and structural ADT measures).
How Vera language features (by spec chapter) map to test files and example programs:
| Spec chapter | Feature | Test files | Conformance | Examples |
|---|---|---|---|---|
| Ch 1: Lexical | Literals (Int, Float64, Bool, Byte, String) | test_ast, test_codegen | ch01_int_literals, ch01_float_literals, ch01_bool_literals, ch01_byte_literals | most examples |
| Ch 1: Lexical | String escape sequences (\n, \t, \\, \", \r, \0, \u{XXXX}) |
test_ast, test_codegen | ch01_string_escapes | io_operations, file_io |
| Ch 1: Lexical | Comments | test_parser | ch01_comments | — |
| Ch 2: Types | Int, Nat, Bool, String, Float64, Byte, Unit | test_codegen, test_checker | ch02_builtin_types | most examples |
| Ch 2: Types | ADTs (algebraic data types), Option, Result | test_codegen, test_checker | ch02_adt_basic, ch02_adt_recursive, ch02_option_result | pattern_matching, list_ops |
| Ch 2: Types | Refinement types | test_codegen, test_verifier | ch02_refinement_types | refinement_types, safe_divide |
| Ch 2: Types | Generics (forall<T>) |
test_codegen_monomorphize, test_checker | ch02_generics | generics |
| Ch 3: Slots | @T.n references, De Bruijn indexing |
test_checker, test_codegen | ch03_slot_basic, ch03_slot_indexing, ch03_slot_result | all 33 examples |
| Ch 4: Expressions | Arithmetic, comparison, boolean, unary ops | test_codegen, test_checker | ch04_arithmetic, ch04_comparison, ch04_boolean_ops | factorial, absolute_value |
| Ch 4: Expressions | If/else, let, match, pipe operator | test_codegen, test_checker | ch04_if_else, ch04_let_binding, ch04_match_basic, ch04_match_nested, ch04_pipe_operator | pattern_matching |
| Ch 4: Expressions | String and array builtins | test_codegen | ch04_string_builtins, ch04_array_ops | string_ops |
| Ch 5: Functions | Declarations, recursion, mutual recursion | test_codegen, test_checker | ch05_basic_function, ch05_recursion, ch05_mutual_recursion | factorial, mutual_recursion |
| Ch 5: Functions | Closures, higher-order functions | test_codegen_closures | ch05_closures | closures |
| Ch 5: Functions | Visibility (public/private) |
test_checker | ch05_visibility | modules |
| Ch 6: Contracts | Preconditions (requires) |
test_codegen_contracts, test_verifier | ch06_requires | safe_divide |
| Ch 6: Contracts | Postconditions (ensures) |
test_codegen_contracts, test_verifier | ch06_ensures | absolute_value |
| Ch 6: Contracts | Decreases clauses, assert/assume | test_verifier, test_codegen | ch06_decreases, ch06_assert_assume | factorial |
| Ch 6: Contracts | Quantifiers (forall, exists) | test_codegen, test_verifier | ch06_quantifiers | quantifiers |
| Ch 7: Effects | Pure, IO, State<T> | test_codegen, test_checker | ch07_pure, ch07_io, ch07_state_handler | hello_world, increment, io_operations, file_io |
| Ch 7: Effects | Effect handlers (State<T>, Exn<E>) | test_codegen, test_checker | ch07_state_handler, ch07_exn_handler | effect_handler |
| Ch 9: Stdlib | Numeric builtins (abs, min, max, floor, ceil, round, sqrt, pow) | test_codegen, test_checker | ch09_numeric_builtins | — |
| Ch 9: Stdlib | Type conversions (int_to_float, float_to_int, nat_to_int, int_to_nat, byte_to_int, int_to_byte) | test_codegen, test_checker | ch09_type_conversions | — |
| Ch 9: Stdlib | Float64 predicates (float_is_nan, float_is_infinite, nan, infinity) | test_codegen, test_checker | ch10_float_predicates | — |
| Ch 7: Effects | Effect subtyping (§7.8), call-site checking | test_types, test_checker | — | — |
| Ch 2: Types | Bidirectional type checking (local inference) | test_checker | — | — |
| Ch 4: Expressions | Nested constructor patterns in match | test_codegen | ch04_match_nested | pattern_matching |
| Ch 8: Modules | Imports, cross-module typing and codegen | test_codegen_modules, test_resolver | — | modules |
| Ch 11: Compilation | Cross-module name collision detection (E608/E609/E610) | test_codegen_modules | — | — |
| Ch 9: Stdlib | Markdown (md_parse, md_render, md_has_heading, md_has_code_block, md_extract_code_blocks) | test_codegen, test_markdown | ch09_markdown | markdown |
| Ch 9: Stdlib | Regex (regex_match, regex_find, regex_find_all, regex_replace) | test_codegen, test_checker | ch09_regex | regex |
| Ch 9: Stdlib | Map, Set, Decimal collections | test_codegen, test_checker | ch09_map, ch09_set, ch09_decimal, ch09_decimal_generics | collections |
| Ch 9: Stdlib | Json (json_parse, json_stringify, json_get, json_array_get, json_array_length, json_keys, json_has_field, json_type) | test_codegen, test_checker | ch09_json | json |
| Ch 9: Stdlib | Html (html_parse, html_to_string, html_query, html_text, html_attr) | test_codegen, test_checker | ch09_html | html |
| Ch 9: Stdlib | Http effect (Http.get, Http.post) | test_codegen, test_checker | ch09_http | http |
| Ch 11: Compilation | Contract-driven testing (Z3 input gen + WASM execution) | test_tester, test_cli | — | safe_divide, factorial |
| Ch 12: Runtime | Browser runtime parity (JS host bindings match Python) | test_browser | — | — |
Each test module defines module-level helper functions (no conftest.py):
# test_checker.py pattern:
_check_ok(source) # assert no type errors
_check_err(source, "match") # assert at least one error matching substring
# test_verifier.py pattern:
_verify_ok(source) # assert no verification errors
_verify_err(source, "match") # assert at least one verification error
_verify_warn(source, "match") # assert at least one warning
# test_codegen.py pattern:
_compile_ok(source) # assert compilation succeeds
_run(source, fn, args) # compile + execute, return result
_run_io(source, fn, args) # compile + execute, return captured stdout
_run_trap(source, fn, args) # compile + execute, assert WASM trapEvery one of the 33 example programs in examples/ is tested through every pipeline stage via parametrised tests: parsing, AST transformation, type checking, contract verification, WASM compilation, and execution. If you add a new .vera example, it is automatically included in the round-trip suite.
The formatter has idempotency tests: format(format(x)) == format(x) for all tested programs.
When extending the compiler, add tests following the existing patterns:
- New grammar construct: Add parser tests to
test_parser.py(positive and negative) - New AST node: Add transformation tests to
test_ast.py(check node fields, spans, serialisation) - New type rule: Add checker tests to
test_checker.pyusing_check_ok()/_check_err() - New SMT support: Add verifier tests to
test_verifier.pyusing_verify_ok()/_verify_err() - New codegen support: Add compilation tests to
test_codegen.pyusing_compile_ok()/_run()/_run_trap() - New example program: Add to
examples/-- it is automatically included in round-trip tests - New error pattern: Add formatting tests to
test_errors.py - New tester feature: Add tests to
test_tester.pyusing_test(source)helper - New host binding: Add parity tests to
test_browser.pyto ensure the JavaScript runtime stays in sync with the Python runtime
Twelve scripts in scripts/ validate cross-cutting concerns beyond unit tests:
| Script | What it validates |
|---|---|
check_conformance.py |
All 82 conformance programs pass their declared level (parse/check/verify/run) |
check_examples.py |
All 33 .vera examples pass vera check + vera verify |
check_spec_examples.py |
148 parseable code blocks from spec chapters: parse, type-check, and verify |
check_readme_examples.py |
All Vera code blocks in README.md parse correctly |
check_skill_examples.py |
All Vera code blocks in SKILL.md parse correctly |
check_faq_examples.py |
All Vera code blocks in FAQ.md parse correctly |
check_html_examples.py |
All Vera code blocks in docs/index.html pass parse + check + verify |
check_site_assets.py |
Generated site assets under docs/ are up-to-date |
check_version_sync.py |
pyproject.toml and vera/__init__.py versions match |
check_doc_counts.py |
Counts cited in TESTING.md, CONTRIBUTING.md, and CLAUDE.md match live codebase |
check_licenses.py |
All installed packages have MIT-compatible licenses |
fix_allowlists.py |
Auto-fix stale allowlist line numbers after Markdown edits |
These run in both pre-commit hooks and CI, so issues are caught locally before they reach the remote.
check_spec_examples.py pushes spec code blocks through three compiler stages, with allowlists at each level:
| Stage | Pass | Allowlisted | Categories |
|---|---|---|---|
| Parse | 81 | 67 | FUTURE (9), FRAGMENT (58) |
| Type-check | 67 | 14 | INCOMPLETE (13), FUTURE (1) |
| Verify | 66 | 1 | ILLUSTRATIVE (1) |
Allowlisted entries have stale-detection: when a feature lands or a spec edit shifts line numbers, CI flags the entry for removal or the fix_allowlists.py script auto-fixes the line numbers. The INCOMPLETE check entries reference functions, types, or imports not defined in the block (e.g. abs, Tuple, array_map, parse_int). The 1 FUTURE check entry uses async/await. The 1 ILLUSTRATIVE verify entry is a spec example demonstrating multiple postconditions syntax where the contract is intentionally imprecise.
Every push is checked by 25 configured hooks across two stages: 23 are configured at the commit stage (after pre-commit install), and 2 (check-changelog-updated, uv-lock-check) are configured at the push stage (after pre-commit install --hook-type pre-push). Many commit-stage hooks use per-hook files: / types: filters and only fire when matching files are staged — a docs-only commit triggers a small subset, a compiler-level commit triggers most. Full list:
| Hook | What it does |
|---|---|
trailing-whitespace |
Strip trailing whitespace |
end-of-file-fixer |
Ensure files end with a newline |
check-yaml / check-toml |
Validate config file syntax |
check-merge-conflict |
Detect conflict markers |
check-added-large-files |
Reject files >500 KB |
debug-statements |
Detect pdb/ipdb imports |
mypy vera/ |
Type-check compiler in strict mode |
pytest tests/ -q |
Run full test suite |
fix_allowlists.py --fix |
Auto-fix stale allowlist line numbers |
check_conformance.py |
All 82 conformance programs pass their declared level |
check_examples.py |
All 33 examples pass vera check + vera verify |
check_examples_readme.py |
vera run commands in examples/README.md reference existing files and exported functions |
check_readme_examples.py |
README code blocks parse correctly |
check_examples_doc.py |
EXAMPLES.md code blocks parse correctly |
check_skill_examples.py |
SKILL.md code blocks parse correctly |
check_faq_examples.py |
FAQ.md code blocks parse correctly |
check_html_examples.py |
HTML landing page code blocks pass parse + check + verify |
check_doc_counts.py |
Counts in docs match live codebase |
check_limitations_sync.py |
Limitation tables consistent across KNOWN_ISSUES.md, vera/README, and spec |
check_licenses.py |
All package licenses are MIT-compatible |
build_site.py |
Regenerate AI-readable site assets (llms.txt, llms-full.txt, robots.txt, sitemap.xml, index.md) |
browser parity |
Browser runtime produces identical output to Python runtime |
check-changelog-updated (pre-push) |
CHANGELOG has a new entry when substantive files changed |
uv-lock-check (pre-push) |
uv.lock is in sync with pyproject.toml |
The validation hooks are smart about triggers -- they only run when relevant files change (.vera, vera/**/*.py, grammar.lark, the corresponding Markdown file, or vera/browser/* for browser parity). The two pre-push hooks only fire at push time.
GitHub Actions (.github/workflows/ci.yml) runs seven parallel jobs on every push and pull request to main:
| Job | Matrix / Runner | What it checks |
|---|---|---|
| test | Python 3.11, 3.12, 3.13 x Ubuntu, macOS (6 combos) | pytest -v passes on all combinations |
| test (coverage) | Python 3.12 x Ubuntu only | pytest --cov=vera --cov-fail-under=80 |
| typecheck | Python 3.12 x Ubuntu | mypy vera/ clean in strict mode |
| lint | Python 3.12 x Ubuntu | check_conformance.py, check_examples.py, check_examples_readme.py, check_version_sync.py, check_spec_examples.py, check_readme_examples.py, check_skill_examples.py, check_faq_examples.py, check_html_examples.py, check_site_assets.py, check_licenses.py, ruff check --select S vera/ (security rules) |
| security | Ubuntu | Gitleaks secret scanning on full history |
| dependency-audit | Python 3.12 x Ubuntu | pip-audit --skip-editable --ignore-vuln CVE-2026-4539 — checks all installed packages against the OSV vulnerability database (skips the local editable vera package; CVE-2026-4539 suppressed pending a pygments fix release) |
| sbom | Python 3.12 x Ubuntu | cyclonedx-py environment — generates a CycloneDX JSON SBOM of the full installed dependency tree and uploads it as a 90-day CI artifact |
| browser-parity | Python 3.12 + Node.js 22 x Ubuntu | pytest tests/test_browser.py -v — verifies JS runtime matches Python runtime; collects V8 coverage via NODE_V8_COVERAGE and uploads to Codecov |
The coverage threshold of 80% is enforced in CI. Current coverage is 96%. JavaScript coverage for vera/browser/runtime.mjs is collected separately using V8's built-in coverage and uploaded to Codecov with the javascript flag.
Each job uses scoped permissions (contents: read; the security job additionally has security-events: write) and all checkout steps set persist-credentials: false to prevent the GITHUB_TOKEN from being baked into .git/config. Actions without SHA-pinned version refs are tracked in #390.
Tracked improvements to the testing and CI infrastructure:
| Issue | Description |
|---|---|
| #349 | Improve browser runtime (runtime.mjs) test coverage to >80% — JS code is invisible to pytest-cov, blocking codecov/patch on PRs that touch the runtime |
Testing infrastructure that could be added in the future:
- Property-based testing --
hypothesisis installed as a dev dependency but not yet used. Could generate random programs to test parser robustness and formatter idempotency at scale. - Formatter round-trip invariant -- verify
parse(format(parse(src))) == parse(src)for all valid programs, not just the examples. - WASM inference.py coverage --
wasm/inference.pyat 85% has the most remaining gaps, mostly in deep type-dispatch branches for specific builtin function return types. These branches require very specific expression nesting patterns to reach. - Performance benchmarks -- no benchmark infrastructure exists. Could track compilation time and Z3 verification time across releases.