I have a lightweight ONNX model that performs part-of-speech (POS) tagging, and I use its output to generate grammatical features in my application.
These grammatical features involve looking up surrounding POS tags, regex matching, and then measuring how often certain patterns occur. The pipeline is as follows:
User inputs text -> tokenize -> ONNX model generates tags -> keep the original string in a fixed buffer (perhaps?) -> perform regex matching in parallel on the same text.
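To make the last two steps concrete, here is a minimal sketch of what I have in mind on the C++ side, assuming the text arrives as a UTF-8 byte buffer (pointer plus length) in WASM linear memory. The function name `count_matches` is hypothetical, and the export and binding glue is left out. Note that `std::regex` over `char` matches bytes, not Unicode code points.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>

// Hypothetical helper: counts non-overlapping matches of `pattern` in a
// buffer given as pointer + length, so the text itself is never copied.
// Returns -1 if the pattern is not a valid regular expression.
extern "C" int count_matches(const char* text, std::size_t len,
                             const char* pattern) {
    try {
        const std::regex re(pattern);
        // Iterate directly over the caller's buffer [text, text + len).
        auto first = std::cregex_iterator(text, text + len, re);
        auto last = std::cregex_iterator();
        return static_cast<int>(std::distance(first, last));
    } catch (const std::regex_error&) {
        return -1;
    }
}
```

The idea is that the string is written into the buffer once and every pattern is run against the same bytes. The `try`/`catch` assumes the module is built with exception support; without it, the pattern would need to be validated some other way.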
Now, I want to use WASM for the steps after the ONNX part to pick up any performance gains, if possible. I would like to learn some good practices for handling strings in C++/WASM, along with any common pitfalls I should be aware of.
