NHacker Next
  • new
  • past
  • show
  • ask
  • show
  • jobs
  • submit
Yes, all longest regex matches in linear time is possible (iev.ee)
nextaccountic 2 days ago [-]
> input size normal hardened speedup w/ hardened

> 1,000 0.7ms 28us 25x

> 5,000 18ms 146us 123x

> 10,000 73ms 303us 241x

> 50,000 1.8s 1.6ms 1,125x

Why is there a normal mode if hardened mode is faster for all input sizes?

ieviev 2 days ago [-]
Sorry, finished the post just now with more comparisons on other inputs

The reason is just that the normal mode is faster in average non pathological cases

tracnar 2 days ago [-]
Could you have a heuristics based on the input size and the pattern to decide what to use?
ieviev 2 days ago [-]
Yes, this is entirely possible. you can even explore the automaton eagerly and detect if it's possible to loop from an accepting state to a nonaccepting one.

Exciting stuff for future work

nextaccountic 2 days ago [-]
Ripgrep does something like thhis. It has a meta regex engine that switches engine when it finds what looks like pathological cases (or rather, the regex-automata crate does, which is used by the regex crate, which powers ripgrep).

https://docs.rs/regex-automata/latest/regex_automata/meta/st...

Ripgrep in turn exposes some knobs to tweak the heuristics

https://github.com/BurntSushi/ripgrep/blob/master/FAQ.md#how...

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact
Rendered at 12:49:25 GMT+0000 (Coordinated Universal Time) with Vercel.