Hello,
I would like to ask for your consent to my writing a license-checking program as a thesis assignment. The program would be deployed in production, mostly for Fedora package review, but also for use by individual developers outside Fedora. The problem is that information obtained by using the program, or disclosure of the data it collects, could be harmful to Fedora. Please read on for details.
The assignment is to invent a tool that:
- searches a software project (for the purpose of the thesis, only a Java software project, but more language-specific modules are intended in the future)
- determines the project's license (the means of doing that, beyond checking the license header, are yet to be devised)
- reports any problems or license incompatibilities found (mostly as warnings, since I expect most of the checks to be based on heuristics)
- stores data in a publicly available database to speed up the process and share information, taking into consideration different builds of packages, different versions etc.
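
To make the first three points more concrete, here is a minimal sketch of the header-checking step only. It is an illustration under assumptions, not the planned design: the pattern table, the function names and the "most files agree" rule are all hypothetical, and the database part is left out entirely.

```python
import re
from collections import Counter
from pathlib import Path

# Hypothetical, deliberately tiny pattern table; a real tool would need far more
# entries and would have to tell license versions apart.
HEADER_PATTERNS = {
    "Apache-2.0": re.compile(r"Apache\s+License,?\s+Version\s+2\.0", re.I),
    "GPL": re.compile(r"GNU\s+General\s+Public\s+License", re.I),
    "MIT": re.compile(r"Permission\s+is\s+hereby\s+granted,\s+free\s+of\s+charge", re.I),
}


def detect_header_license(text, max_chars=3000):
    """Return the first license whose pattern appears near the top of the file, else None."""
    head = text[:max_chars]
    for name, pattern in HEADER_PATTERNS.items():
        if pattern.search(head):
            return name
    return None


def scan_project(root):
    """Map each .java file under root (as a relative path) to its detected header license."""
    results = {}
    for path in sorted(Path(root).rglob("*.java")):
        text = path.read_text(encoding="utf-8", errors="replace")
        results[str(path.relative_to(root))] = detect_header_license(text)
    return results


def report(results):
    """Produce heuristic warnings: missing headers, and headers that disagree with the majority."""
    warnings = []
    for path, lic in results.items():
        if lic is None:
            warnings.append(f"{path}: no recognizable license header")
    counts = Counter(lic for lic in results.values() if lic)
    if len(counts) > 1:
        majority = counts.most_common(1)[0][0]
        for path, lic in results.items():
            if lic and lic != majority:
                warnings.append(f"{path}: header says {lic}, most files say {majority}")
    return warnings
```

Calling `report(scan_project("path/to/project"))` would return a list of warning strings. A file that differs from the majority is only flagged, not judged: whether the two licenses are actually incompatible is a separate question this sketch does not attempt to answer.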
The assignment also requires devising a way to manage the database, user contributions, and the credibility of those contributions.
Note that although checking compliance with Fedora licensing is the duty of every reviewer, in my experience it is beyond a typical package reviewer to spot problems such as copy-pasted code, or several classes taken from a different project under a different license (when a header is missing). The program would mainly help with these problems, which often go unnoticed today.
The problem is that for a successful defense of the thesis, it would be desirable (though admittedly not vital) to publish the source code of the program and probably the database properties as well. I am aware that disclosing the data accumulated while checking projects in Fedora could have potentially devastating consequences. I would therefore publish only the source code of the program and the bindings to a database system, which a user would need to run separately from Fedora, so that the internal data is not revealed to the public. My thesis advisor considers this arrangement satisfactory for academic purposes.
What I would like to ask is whether you, too, consider this project safe enough to carry out without threatening Fedora, e.g. by pointing out unnoticed licensing issues to a malefactor.
Thank you, Tomas Radej
FAS: tradej