Hey all,
As part of the discussion going on about Mesa on devel@, the situation
around OpenSSL came up, and Adam Williamson pointed out that we
might not need to hobble OpenSSL anymore[1]. A quick check seems to
indicate we no longer do it for GnuTLS either, and haven't for many
years[2].
Could we just drop all this stuff and use pristine OpenSSL sources?
All the crypto algorithm usability stuff is controlled through
crypto-policies, so I don't think it makes sense to do this anymore
for OpenSSL, since all the patents indicated in the script have been
expired for a couple of years now[3].
Dropping this will eliminate a chunk of cruft that nobody needs around
anymore and simplify OpenSSL maintenance.
[1]: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org…
[2]: https://src.fedoraproject.org/rpms/gnutls/c/46d865d8451be0f4576dcc56841175a…
[3]: https://src.fedoraproject.org/rpms/openssl//blob/rawhide/f/hobble-openssl
--
真実はいつも一つ!/ Always, there's only one truth!
These questions came up in a FESCo ticket [1] recently and the primary purpose of this thread is to have some public record of the conversation around the handling of pre-trained weights for AI/ML models as packaged for Fedora.
[1] https://pagure.io/fesco/issue/3175
Intro and Definitions
=====================
Previous conversations have involved a decent amount of confusion around terminology, and I want to be clear about what I'm asking, so I'm starting with a few definitions in the context of my questions.
Artificial Neural Network (ANN) - effectively structured data consisting of neurons (nodes containing some value) organized into layers, with connections between the neurons that control the flow of data through the entire network. How strongly each connection affects that flow is determined through the training process, and these values are generally referred to as weights.
Model - A model by itself is a description of a specific ANN - how layers are configured, how they interact with each other, how model training is done, how data needs to be structured for using a trained model and so on. A model by itself is rarely, if ever, useful. Models generally need to be trained on data before they can be used, but many models offer a mechanism through which weights can be loaded from a model which has already been trained. A model without pre-trained weights or training is pretty much just code.
Pre-Trained Weights - Pre-trained weights are essentially the data contained in a model after training the model on some input data. Training modern ANN models is a very expensive and time-consuming process; pre-trained weights allow people to use models without having to train the model locally or even have access to the data needed to train the model.
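
To make the split between "model" and "weights" concrete, here is a minimal sketch in plain Python. It is not taken from any real framework: TinyModel, load_weights and predict are hypothetical names, and the JSON format for the weights is an assumption made only for illustration. The class is the model (code describing layers and data flow), while the serialized string is the kind of non-code data that pre-trained weights amount to.

```python
import json
import random


class TinyModel:
    """Hypothetical model: code describing one hidden layer and one output."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        # Untrained state: random weights that carry no learned meaning.
        self.weights = {
            "hidden": [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                       for _ in range(n_hidden)],
            "output": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        }

    @staticmethod
    def _shape(weights):
        return (len(weights["hidden"]),
                len(weights["hidden"][0]),
                len(weights["output"]))

    def load_weights(self, serialized):
        """Replace the current weights with pre-trained ones (plain data)."""
        loaded = json.loads(serialized)
        # The data must match the layer layout the code describes.
        if self._shape(loaded) != self._shape(self.weights):
            raise ValueError("weights do not match this model's layers")
        self.weights = loaded

    def predict(self, inputs):
        # Hidden layer: weighted sum of inputs followed by ReLU.
        hidden = [max(0.0, sum(w * x for w, x in zip(row, inputs)))
                  for row in self.weights["hidden"]]
        # Output: weighted sum of the hidden values.
        return sum(w * h for w, h in zip(self.weights["output"], hidden))


# Assumed example of "pre-trained" weights shipped separately from the code.
pretrained = json.dumps({"hidden": [[1.0, 0.0], [0.0, 1.0]],
                         "output": [1.0, 1.0]})

model = TinyModel(n_inputs=2, n_hidden=2)
model.load_weights(pretrained)
print(model.predict([2.0, 3.0]))
```

In this sketch the Python source would fall under whatever license covers the code, while the contents of pretrained are the part my questions below are about.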
Questions
=========
1. Are pre-trained weights considered to be normal non-code content/data or do they require special handling?
2. If an upstream offers pre-trained weights and indicates that those weights are available under a license which is acceptable for non-code content in Fedora, can those pre-trained weights be included in Fedora packages?
3. Extending question 2, is it considered sufficient for an upstream to have a license on pre-trained weights or would a packager/reviewer need to verify that the data used to train those weights is acceptable?
4. Is it acceptable to package code which downloads pre-trained weights from a non-Fedora source upon first use post-installation by a user, when those weights are
a. For a specific model?
b. For a user-defined model which may or may not exist at the time of packaging?
I can provide examples of any of these situations if that would be helpful.
Thanks,
Tim
As announced in the message on the devel list [1], I have retired celestia and celestia-data because some files were discovered to be covered under CC-BY-NC-SA-3.0 (and some others are still waiting for a full check by upstream).
I wonder if I need to ask fedora-infra to remove all the sources from the lookaside cache?
Mattia
[1] https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org…
Hey everyone,
It looks like Redis, Inc. has announced that future versions of Redis
will no longer be OSS and will be dual-licensed under SSPL and RSAL[1]. Absent a
fork of Redis coming up, we will likely need to remove Redis from
Fedora.
All I can say is... :(
[1]: https://redis.com/blog/redis-adopts-dual-source-available-licensing/
--
真実はいつも一つ!/ Always, there's only one truth!
On Wed, Mar 20, 2024 at 6:26 PM Jonathan Wright via devel
<devel(a)lists.fedoraproject.org> wrote:
>
> We can potentially look to https://github.com/Snapchat/KeyDB which I've been loosely working on packaging anyway.
>
I'll want to test this for Pagure at least, since we're going to have
to switch our recommendations around soon because of this.
--
真実はいつも一つ!/ Always, there's only one truth!
On Tue Mar 5, 2024 at 04:06 +0000, Maxwell G wrote:
> On Mon Mar 4, 2024 at 22:35 +0100, Sandro wrote:
> > On 04-03-2024 07:59, Miroslav Suchý wrote:
> > > It would welcome if anyone can help Robert here:
> > > https://bugzilla.redhat.com/show_bug.cgi?id=2235055
> >
> > I had a look and it seems the package is currently stuck on broken
> > python-pymaven-patch, which requires python-lxml < 5~~. In rawhide and
> > f40 python-lxml was updated to 5.1.0 two months ago.
> >
> > For about as long there has been a PR open for python-pymaven-patch
> > removing that version constraint. Notably, the maintainer of
> > python-pymaven-patch is the same person, who submitted the
> > scancode-toolkit review request.
> >
> > There may be more trouble down the road. But for the moment, I don't see
> > how others can help driving this forward. A proven packager could merge
> > the PR. But I don't know how eclipseo, who's a proven packager, would
> > feel about that.
>
> Fixing FTI bugs that are unaddressed by a package's maintainer
> definitely falls under the purview of a provenpackager.
> I rebased the PR [1] and will merge it once CI passes.
>
> [1] https://src.fedoraproject.org/rpms/python-pymaven-patch/pull-request/1
It also looks like a bunch of the tests are failing and have been
skipped. That's not super ideal. It looks like [1] has been open
upstream for some time. I have not yet looked closely at the failures,
but Philippe, if you have any pointers to give, that would certainly be
helpful.
[1] https://github.com/nexB/scancode-toolkit/issues/3496
On Mon Mar 4, 2024 at 07:59 +0100, Miroslav Suchý wrote:
> Dne 03. 03. 24 v 20:22 Philippe Ombredanne napsal(a):
>
> > If you want robust license detection, consider using ScanCode [2] and
> > Scancode.io [3] for more complex pipelines. Both are tools that I
> > co-maintain and are considered as better tools for this. Do not
> > hesitate to reach out for help!
>
> *nod*
>
> It would welcome if anyone can help Robert here:
> https://bugzilla.redhat.com/show_bug.cgi?id=2235055
Robert has not been very responsive as of late. It might be a good idea
for someone else to pick it up and start a new review ticket.
Hi,
Has anyone ever used trivy [1] to scan for licenses? It appears more
robust and better maintained than askalono-cli and can detect files with
multiple licenses and licenses embedded in file headers. I have been
running it with "trivy fs --scanners license --license-full ."
[1] https://github.com/aquasecurity/trivy
--
Maxwell G (@gotmax23)
Pronouns: He/They