restrict xpath string-to-number conversion to the number grammar #286

Open
rootvector2 wants to merge 1 commit into apache:master from rootvector2:xpath-number-strictness

@rootvector2

Contributor
  • Read the contribution guidelines for this project.
  • Read the ASF Generative Tooling Guidance if you use Artificial Intelligence (AI).
  • I used AI to create any part of, or all of, this pull request. Which AI tool was used to create this pull request, and to what extent did it contribute?
  • Run a successful build using the default Maven goal with mvn; that's mvn on the command line by itself.
  • Write unit tests that match behavioral changes, where the tests fail if the changes to the runtime are not applied. This may not always be possible, but it is a best practice.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Each commit in the pull request should have a meaningful subject line and body. Note that a maintainer may squash commits during the merge process.

InfoSetUtil.doubleValue and InfoSetUtil.number coerce string values to numbers with Double.parseDouble/Double.valueOf, which accept Java number literals the XPath 1.0 number grammar does not: a leading +, exponents like 1e3, d/f type suffixes like 5d, hexadecimal floats, and the Infinity/NaN words. XPath requires every such string to become NaN, so today number('1e3') is 1000, '5d' >= 5 is true, and '1e3' = 1000 is true. Spotted while checking number() against the spec; the existing floor('NaN') cases only pass because Double.parseDouble happens to accept NaN.
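For reviewers who want to see the leniency in isolation, here is a minimal standalone sketch. It does not touch InfoSetUtil; the class name `ParseDoubleLeniency` and the helper `describe` are hypothetical and exist only for this illustration. It simply prints what `Double.parseDouble` returns for each kind of literal listed above.

```java
// Hypothetical demo class, not part of this pull request.
final class ParseDoubleLeniency {

    // Formats the result of the JDK parser for one input string.
    static String describe(String s) {
        return s + " -> " + Double.parseDouble(s);
    }

    public static void main(String[] args) {
        // Each input is a valid Java floating-point literal form,
        // but none matches the XPath 1.0 Number production.
        String[] inputs = {"+1", "1e3", "5d", "0x1p3", "Infinity", "NaN"};
        for (String s : inputs) {
            System.out.println(describe(s));
        }
        // Prints:
        // +1 -> 1.0
        // 1e3 -> 1000.0
        // 5d -> 5.0
        // 0x1p3 -> 8.0
        // Infinity -> Infinity
        // NaN -> NaN
    }
}
```

Under XPath 1.0 rules, every one of these inputs should convert to NaN.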

Both methods now gate the conversion on a Pattern for the Number production (optional leading whitespace, an optional minus sign, digits with an optional fraction, optional trailing whitespace) and return NaN for any string that does not match. The check lives in InfoSetUtil because number(), the relational operators and floor/ceiling/round all coerce through these two methods, so node models and callers stay untouched. floor('NaN') and friends keep returning NaN, because the word NaN is now rejected like any other non-numeric string.
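The gating approach can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this pull request: the class name `XPathNumberSketch`, the method `toNumber`, the exact regular expression, and the choice to map `null` to NaN are all hypothetical. The pattern follows the XPath 1.0 rule for number(): optional whitespace, an optional minus sign, a Number (`Digits ('.' Digits?)? | '.' Digits`), then optional whitespace.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of gating string-to-number conversion on the
// XPath 1.0 Number production; not the actual InfoSetUtil change.
final class XPathNumberSketch {

    // XML whitespace is space, tab, CR and LF. Digits are ASCII only.
    private static final Pattern XPATH_NUMBER = Pattern.compile(
            "[ \\t\\r\\n]*-?(?:[0-9]+(?:\\.[0-9]*)?|\\.[0-9]+)[ \\t\\r\\n]*");

    static double toNumber(String s) {
        // Assumption: null is treated like any other non-numeric input.
        if (s == null || !XPATH_NUMBER.matcher(s).matches()) {
            return Double.NaN;
        }
        // Everything the pattern accepts is also accepted by the JDK
        // parser, so the actual conversion can be delegated to it.
        return Double.parseDouble(s.trim());
    }

    public static void main(String[] args) {
        System.out.println(toNumber(" -12.5 ")); // -12.5
        System.out.println(toNumber("1e3"));     // NaN
        System.out.println(toNumber("5d"));      // NaN
        System.out.println(toNumber("NaN"));     // NaN
    }
}
```

Keeping the grammar check in front of `Double.parseDouble` means the parser only ever sees strings that XPath itself considers numbers, so no Java-specific literal form can leak through.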

Double.parseDouble/Double.valueOf in InfoSetUtil.doubleValue and number accept Java literals (leading +, exponents, d/f suffixes, hex, Infinity/NaN) that the XPath 1.0 number grammar excludes and which XPath requires to be NaN; gate both on the Number production so e.g. number('1e3') is NaN instead of 1000.