We convert invoice PDFs to text, and they vary a lot. They contain arbitrary text, VAT rates (indicated by %), and sums (numbers not followed by %). The goal is to match the sums with a regular expression.
I have tried several regex patterns, but none of them worked. How can I extract the first number that is not followed by a percent character (%) using C# 10 and .NET 10? There may or may not be a space between a number and the percent character.
For the following texts, the results should be 4.23 and 5.67.
Arbitrary text 3.12 % Arbitrary text 4.23 Arbitrary text
or
Arbitrary text 5.67 Arbitrary text 8% Arbitrary text
I tried the regex -?[0-9]{1,3}(?:[_,.]?[0-9]{3})*[.,]?[0-9]{0,2}[^%], but it returns 3.12.
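For reference, the attempted pattern matches 3.12 because the trailing [^%] consumes the space after the number, and a space is not a percent sign. Swapping it for a plain negative lookahead is not enough either, since the engine can backtrack and report 3.1 (followed by 2, not by %). One possible approach, sketched below, is to put the number in an atomic group so no digits can be given back, and then reject it with a lookahead if optional whitespace and % follow. The class and method names are hypothetical, and the simplified number shape (digits with optional . or , separated groups, no underscore separators) is an assumption that may need adjusting for real invoices.

```csharp
#nullable enable
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; names are not from the original post.
public static class SumExtractor
{
    // (?<![0-9.,])  do not start matching in the middle of a number
    // -?            optional minus sign
    // (?>...)       atomic group: once the whole number is matched,
    //               the engine cannot backtrack and give digits back
    // (?!\s*%)      reject the number if optional whitespace and % follow
    // Assumption: a number is digits with optional "." or "," separated groups.
    private static readonly Regex FirstSum = new Regex(
        @"(?<![0-9.,])-?(?>[0-9]+(?:[.,][0-9]+)*)(?!\s*%)",
        RegexOptions.CultureInvariant);

    // Returns the first number not followed by %, or null if there is none.
    public static string? FirstNonPercent(string text)
    {
        Match m = FirstSum.Match(text);
        return m.Success ? m.Value : null;
    }
}
```

With the two sample texts above, SumExtractor.FirstNonPercent should return 4.23 and 5.67. For a text that contains only percentages it returns null.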
