A security scanner as fast as a linter – written in Rust

(github.com)

53 points | by peaktwilight 6 days ago

10 comments

yatac42
4 days ago
From a quick look it seems like it's "as fast as a linter" because it is a linter. The homepage says "Not just generic AST patterns", but I couldn't find any rule that did anything besides AST matching. I don't see anything in the code that would enable any kind of control or data flow analysis.
- peaktwilight
  3 days ago
  true, right now it's AST pattern matching with `pattern-inside`/`pattern-not-inside` for syntactic scoping. I changed the description. Intraprocedural dataflow is the next step (tracked in #10), while trying to keep it close to linter latency.
davewritescode
4 days ago
The speed is really cool, but the fact that your rules are written as Rust code means that new rules need a new binary. That might be fine but just wanted to point it out to anyone who's interested.
- peaktwilight
  3 days ago
  quick correction: built-in rules are compiled in, but foxguard also loads Semgrep-compatible YAML rules at runtime via `--rules <path>` (or `.foxguard.yml`). You can add or modify rules without touching the binary. The Rust-coded rules are just the default pack for zero-config speed :D
Ferret7446
3 days ago
This follows the stereotype of every single Rust project immediately advertising the fact that it's written in Rust, as Rust devs seem more enamored with the language than what they're doing with the language
- none2585
  3 days ago
  Came here to note this - it's like the old vegan joke:
  How do you know if an app is written in Rust? Don't worry, the developer will tell you.
kabir_daki
4 days ago
Running security checks at linter speed is a big deal for CI pipelines. What's the false positive rate in practice? That's usually the tradeoff with fast static analysis — speed vs accuracy. Would love to know how you benchmarked it.
- peaktwilight
  3 days ago
  didn't measure that yet, but definitely thinking of adding it into scope soon
staticassertion
4 days ago
Legitimately, I have had to stay away from certain linting tools because of how slow they are. I'll check this out.
cfn-lint is due for one of these rewrites, it's excruciating. I made some patches to experiment with it and it could be a lot faster.
- peaktwilight
  3 days ago
  Appreciate it! CloudFormation isn't in scope today but the perf approach (tree-sitter + parallel file walk + rule pre-filtering) transfers, so happy to check it out.
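
For readers unfamiliar with rule pre-filtering, here is a minimal sketch of the general idea in Python. It is not foxguard's code: the `Rule` type, its `keywords` field, and the `scan_many` helper are hypothetical names, and the `check` callable merely stands in for an expensive AST match. The point is that a cheap substring test can rule out most rules before any costly matching runs, and that files can be scanned independently of each other.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    # Hypothetical rule shape, for illustration only.
    rule_id: str
    keywords: tuple          # cheap literals that must appear for a match to be possible
    check: Callable          # stand-in for the expensive AST-based match


def applicable_rules(text, rules):
    """Keep only rules whose keywords occur somewhere in the file text."""
    return [r for r in rules if any(k in text for k in r.keywords)]


def scan_text(text, rules):
    """Run the expensive check only for rules that survived the pre-filter."""
    return [r.rule_id for r in applicable_rules(text, rules) if r.check(text)]


def scan_many(sources, rules, workers=4):
    """Scan several in-memory files concurrently; returns {name: [rule ids]}."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pairs = pool.map(lambda item: (item[0], scan_text(item[1], rules)),
                         sources.items())
        return dict(pairs)
```

Threads are used here only to show the shape of a per-file fan-out; in CPython they would not give real CPU parallelism for this workload.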
woodruffw
4 days ago
Some of the checks here seem very brittle. For example this one[1].
In the context of security scanning (versus, say, linting), I think it's reasonable to expect the tool to be resilient to attempts at obfuscation (or just badly written code that doesn't adhere to normal Python idioms around import paths).
[1]: https://github.com/PwnKit-Labs/foxguard/blob/a215faf52dcff56...
- peaktwilight
  3 days ago
  update: `NoPickle`/`NoYamlLoad` string-match the callee text, so `import pickle as p; p.loads(...)` and `from pickle import loads as d` slip past. Filed as #7 with a fix plan (intraprocedural alias table). Thanks!
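
To illustrate the kind of alias table mentioned above, here is a minimal sketch. It is an assumption-laden illustration, not the fix planned in #7: it uses Python's standard `ast` module rather than tree-sitter, builds one table per file instead of per function, and the names `DANGEROUS`, `build_alias_table`, `resolve_callee`, and `find_dangerous_calls` are hypothetical.

```python
import ast

# Fully qualified callees to flag (illustrative list, not foxguard's rule set).
DANGEROUS = {"pickle.loads", "pickle.load", "yaml.load"}


def build_alias_table(tree):
    """Map each locally bound import name to the qualified name it refers to."""
    aliases = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for a in node.names:
                if a.asname:
                    # `import pickle as p` binds p -> pickle
                    aliases[a.asname] = a.name
                else:
                    # `import os.path` binds only the top-level name `os`
                    root = a.name.split(".")[0]
                    aliases[root] = root
        elif isinstance(node, ast.ImportFrom) and node.module:
            for a in node.names:
                # `from pickle import loads as d` binds d -> pickle.loads
                aliases[a.asname or a.name] = f"{node.module}.{a.name}"
    return aliases


def resolve_callee(func, aliases):
    """Turn a call target like `p.loads` into `pickle.loads`, or None."""
    parts = []
    while isinstance(func, ast.Attribute):
        parts.append(func.attr)
        func = func.value
    if not isinstance(func, ast.Name):
        return None
    head = aliases.get(func.id, func.id)
    return ".".join([head] + parts[::-1])


def find_dangerous_calls(source):
    """Return sorted (line, qualified name) pairs for flagged calls."""
    tree = ast.parse(source)
    aliases = build_alias_table(tree)
    hits = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            name = resolve_callee(node.func, aliases)
            if name in DANGEROUS:
                hits.append((node.lineno, name))
    return sorted(hits)
```

This sketch still misses plain reassignment such as `d = pickle.loads; d(data)` and ignores scoping and import order, so it only covers the two import-alias cases from the comment.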
mplanchard
4 days ago
Looks interesting, will give it a run on the codebase at $work. One thing that would be nice to see in the README are benchmarks on larger codebases. Everything in the benchmark table is quite small. I’d also list line count over files, since the former is a much better measure of amount of code.
For context, the codebase I work on most often has 1200 JS/TS files, 685 Rust files, and a bunch more. LoC is 13k JS, 80k TS, and 155k Rust.
- mplanchard
  4 days ago
  It is still quite fast on that codebase, fwiw. 10.7 ms.
  - peaktwilight
    3 days ago
    thx for the tip, I'll measure and see if time per LoC is stable across different codebases. Mind if I cite it in the readme (anonymized)?
    - mplanchard
      3 days ago
      Sure thing, feel free. If you’d like more details, you can send me an email at msplanchard (at gmail)
    - peaktwilight
      3 days ago
      update: filed #9 to build a labeled corpus and publish per-rule numbers.
jiggawatts
4 days ago
"No X, no Y no Z. Just a ..."
15 commits on Day #1 starting from a stub/empty repo. 47K lines of code developed in under two weeks by one person.
Sigh... AI slop.
- weedhopper
  2 days ago
  But it’s written in Rust!!! It’s great!
Onavo
3 days ago
There's also https://github.com/mongodb/kingfisher
- peaktwilight
  3 days ago
  cool, will check it out thanks!