Hi Maxwell:
On Sun, Mar 3, 2024, Maxwell G wrote:
> Has anyone ever used trivy [1] to scan for licenses? It appears more robust and better maintained than askalono-cli and can detect files with multiple licenses and licenses embedded in file headers. I have been running it with "trivy fs --scanners license --license-full ."
IMHO, from trying it myself, trivy is not a robust tool for license detection.
It is mostly based on google/licenseclassifier, which has had a single commit in the last 17 months, so it is no better maintained than askalono (and frankly both are fairly lightweight tools for license detection). Trivy adds SPDX expression parsing on top of google/licenseclassifier and that's it. I would not rely on these for anything serious, and certainly not to scan code for licenses prior to its inclusion in Fedora.
If you want robust license detection, consider using ScanCode [2], and ScanCode.io [3] for more complex pipelines. I co-maintain both, and they are generally considered better tools for this. Do not hesitate to reach out for help!
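As a rough illustration of working with ScanCode results, here is a minimal Python sketch that counts files per detected license expression in a JSON scan. It assumes the scan was produced with something like "scancode --license --json-pp scan.json ." and that each file entry carries a "detected_license_expression_spdx" field, as in recent ScanCode Toolkit releases; check the output of your version before relying on that field name. The function name and the "(none detected)" label are purely illustrative.

```python
import json
import sys
from collections import Counter


def summarize_licenses(scan):
    """Count files per detected SPDX license expression in a ScanCode-style scan dict."""
    counts = Counter()
    for entry in scan.get("files", []):
        # Skip directories; only regular files carry license detections.
        if entry.get("type") != "file":
            continue
        # Assumed field name; files without a detection are grouped under a placeholder label.
        expression = entry.get("detected_license_expression_spdx") or "(none detected)"
        counts[expression] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python summarize.py scan.json
    with open(sys.argv[1], encoding="utf-8") as handle:
        scan_data = json.load(handle)
    for expression, count in summarize_licenses(scan_data).most_common():
        print(f"{count:6d}  {expression}")
```

Sorting by count puts the dominant licenses first, which makes unusual or missing detections easier to spot for manual review.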
Not directly related, but I just found out that ScanCode has been used for building large code LLMs [4].
[1] https://github.com/google/licenseclassifier
[2] https://github.com/nexB/scancode-toolkit
[3] https://github.com/nexB/scancode.io
[4] https://huggingface.co/papers/2402.19173
--
Cordially
Philippe Ombredanne

+1 650 799 0949 | pombredanne@nexB.com
AboutCode - Open source for open source - https://www.aboutcode.org
VulnerableCode - the open code and open data vulnerability database - https://github.com/nexb/vulnerablecode
ScanCode - scan your code, for origin/license/vulnerabilities, report SBOMs - https://github.com/nexB/scancode-toolkit https://github.com/nexB/scancode.io
package-url - the mostly universal SBOM identifier for packages - https://github.com/package-url
DejaCode - What's in your code?! - http://www.dejacode.com