SMT2: encode mathematical and string/regex types and operations#9074
Open
tautschnig wants to merge 3 commits into
Open
SMT2: encode mathematical and string/regex types and operations#9074tautschnig wants to merge 3 commits into
tautschnig wants to merge 3 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Extends the SMT2 converter to support Strata mathematical Int/Real types plus SMT-LIB string/regex types, constants, and common Str.* / Re.* operations.
Changes:
- Encode
integer,real,string, and regex sorts intype2id/convert_type - Map selected
Str.*andRe.*function applications to SMT-LIB string/regex operators - Adjust integer
div/modencoding and add SMT-LIB string constant emission
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
kroening
reviewed
Jun 18, 2026
| fn_name = id2string( | ||
| to_symbol_expr(function_application_expr.function()).get_identifier()); | ||
|
|
||
| // Map Strata function names to SMT-LIB names for string/regex ops |
Collaborator
There was a problem hiding this comment.
We should introduce expressions for these at some point.
kroening
approved these changes
Jun 18, 2026
3 tasks
8d36980 to
4fa51d7
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #9074 +/- ##
===========================================
+ Coverage 80.69% 80.71% +0.02%
===========================================
Files 1714 1714
Lines 189597 189908 +311
Branches 73 73
===========================================
+ Hits 152987 153289 +302
- Misses 36610 36619 +9 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Map the mathematical integer/real types (and mathematical-function types) to their SMT-LIB names in type2id, and encode the mathematical integer mod and div operators with their SMT-LIB Int semantics (div/mod rather than real division). Also relax array handling for arrays of unknown/non-constant size (e.g. those indexed by a mathematical integer): drop the array-size invariant in convert_type so the SMT-LIB array sort can still be emitted, and make flatten_array reject bit-vector flattening of such arrays with a clear unsupported-operation error instead of an assertion failure. Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Encode the SMT-LIB theory-of-strings String and RegLan types and string constants, and lower the CPROVER string/regex built-in functions (cprover_string_*_func, cprover_regex_*_func) to the corresponding SMT-LIB string/regex operators in convert_expr. Rather than threading a back-end-specific representation through the GOTO pipeline, front-ends (e.g. Strata) emit the canonical CPROVER string built-ins, which the SAT string solver already understands; here we additionally lower them natively to the SMT-LIB theory of strings. Operand contract: arguments are SMT-LIB-native (String/RegLan and scalar Int) and in SMT-LIB operand order, so most built-ins lower to a flat application. Specifically: concat -> str.++ length -> str.len char_at -> str.at substring -> str.substr contains -> str.contains is_prefix/is_suffix -> str.prefixof/suffixof replace -> str.replace equal -> = index_of -> str.indexof (start offset defaults to 0) startswith/endswith -> str.prefixof/suffixof (affix reordered first) is_empty -> (= s "") to_regex/in_regex -> str.to_re/str.in_re regex range/concat/star/plus/opt/loop/union/inter/comp/diff/all/ allchar/none -> the corresponding re.* operators Built-ins without an exact SMT-LIB counterpart (compare_to, of_int, parse_int, format, case folding, trim, ...) are intentionally not lowered and fall through to the generic handling. A soundness guard rejects (with unsupported_operation_exceptiont) any such built-in carrying refined-string (array/pointer) operands: that representation is handled by the SAT string solver (--refine-strings) and must not be silently mis-lowered here. The incremental SMT back-end (src/solvers/smt2_incremental) does not support these built-ins and reports them via UNIMPLEMENTED_FEATURE; a note documents this. Unit tests in unit/solvers/smt2/smt2_conv.cpp cover the lowering of each operator, the string-literal escaping, and the soundness guard. Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
4fa51d7 to
1085c00
Compare
The refined-string solver is a SAT (bit-vector refinement) procedure and cannot run on an SMT2 back-end, which instead lowers string operations to the SMT-LIB theory of strings directly. Previously --refine-strings silently took precedence over --smt2/--incremental-smt2-solver; reject the contradictory combination with a usage error instead. Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
1085c00 to
208e0dd
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Teach the (non-incremental) SMT2 back-end (
smt2_convt) to encode SMT-LIB-native mathematical, string and regex types and their operators.Motivation
Rather than threading a back-end-specific representation through the GOTO pipeline, front-ends (e.g. Strata) emit the canonical CPROVER string/regex built-in functions that the SAT string solver already understands. This PR additionally lowers those built-ins natively to the SMT-LIB theory of strings, so string/regex problems can be solved by the SMT2 back-end without the SAT string-refinement loop, and without any Strata-specific operators leaking into the IR.
What it does
Mathematical types/operators (commit 1)
integer/real(and mathematical-function) types to SMT-LIBInt/Realintype2id.mod/divwith SMT-LIBIntsemantics.convert_typeso the SMT-LIB array sort is still emitted, and makeflatten_arrayreject bit-vector flattening of such arrays with a clear unsupported-operation error rather than an assertion failure.String/regex types and operators (commit 2)
String/RegLantypes and string constants (only"is escaped, as"", per SMT-LIB 2.6).cprover_string_*_func,cprover_regex_*_func) inconvert_exprto the corresponding SMT-LIB operators:str.++,str.len,str.at,str.substr,str.contains,str.prefixof/str.suffixof(coveringis_prefix/is_suffixandstartswith/endswith),str.indexof,str.replace,=,is_empty→(= s ""),str.to_re/str.in_re, and the regex familyre.range/re.++/re.*/re.+/re.opt/re.loop/re.union/re.inter/re.comp/re.diff/re.all/re.allchar/re.none.String/RegLan/Int) and in SMT-LIB operand order; the per-operator contract is documented inconvert_expr. Built-ins without an exact SMT-LIB counterpart (compare_to,of_int,parse_int,format, case folding,trim, …) are intentionally not lowered.unsupported_operation_exceptiontrather than silently mis-lowered — that representation is handled by the SAT string solver (--refine-strings).Back-end selection (commit 3)
--refine-strings(a SAT refinement procedure) together with an SMT2 back-end is now a usage error instead of--refine-stringssilently winning. With an SMT2 back-end and no--refine-strings, the native lowering above applies.Scope / limitations
src/solvers/smt2_incremental) uses a separate converter and does not support these built-ins; such inputs are reported viaUNIMPLEMENTED_FEATURE(a note documents this).smt2_convtunit tests (lowering of each operator, string-literal escaping, and the soundness guard). End-to-end regression tests require a front-end that emits these native built-ins (the Strata/Python side) and are expected to land there.stringtype support to SMT back-end #8838 (now closed).unit/solvers/smt2/smt2_conv.cpp; end-to-end tests to follow via the emitting front-end).