Important process note: we are experimenting with using Fedora Discussion as part of the Changes process. Change announcements (like the one you are reading right now) will still be sent to the devel-announce mailing list, but the conversation about each change will take place on Fedora Discussion at https://discussion.fedoraproject.org/t/f40-change-request-privacy-preserving...
This will follow the same process as before, just with discussion in a different format https://docs.fedoraproject.org/en-US/program_management/changes_policy/
You can subscribe to and interact with these conversations by email. See https://discussion.fedoraproject.org/t/guide-to-interacting-with-this-site-b... for detailed instructions. To avoid missing anything, set the Change Proposal category to “Watching” or, if you just want to be notified about new changes but not every reply in the conversation, to “Watching First Post”. (Click on the little bell icon at the top right of the category page.)
The below document represents a proposed Change. As part of the Changes process, proposals are publicly announced in order to receive community feedback. This proposal will only be implemented if approved by the Fedora Engineering Steering Committee.
== Summary ==
The Red Hat Display Systems Team (which develops the desktop) proposes to enable limited data collection of anonymous Fedora Workstation usage metrics.
Fedora is an open source community project, and nobody is interested in violating user privacy. We do not want to collect data about individual users. We want to collect only aggregate usage metrics that are actually needed to achieve specific Fedora improvement objectives, and no more. We understand that if we violate our users' trust, then we won't have many users left, so if metrics collection is approved, we will need to be very careful to roll this out in a way that respects our users at all times. (For example, we should not collect users' search queries, because that would be creepy.)
We believe an open source community can ethically collect limited aggregate data on how its software is used without involving big data companies or building creepy tracking profiles that are not in the best interests of users. Users will have the option to disable data upload before any data is sent for the first time. Our service will be operated by Fedora on Fedora infrastructure, and will not depend on Google Analytics or any other controversial third-party services. And in contrast to proprietary software operating systems, you can redirect the data collection to your own private metrics server instead of Fedora's to see precisely what data is being collected from you, because the server components are open source too.
Keep in mind this Fedora change proposal is just that: a proposal. It must undergo community review and must be approved by the community-elected Fedora Engineering Steering Committee (FESCo) before it can be implemented, just like any other Fedora change proposal. We welcome community participation and fully expect this proposal may need to be modified significantly depending on Fedora community feedback.
== Owner ==
* Name: [[User:catanzaro|Michael Catanzaro]]
* Email: mcatanzaro@redhat.com
== Detailed Description ==
We intend to deploy the Endless OS metrics system. [https://blogs.gnome.org/wjjt/2023/07/05/endless-oss-privacy-preserving-metri... This blog post] contains a description of how the system works. We do not plan to deploy the eos-phone-home component in Fedora.
=== How will data collection be approved? ===
The proposal owners feel it is essential to ensure the Fedora community has ultimate oversight over metrics collection. Community control is required to maintain user trust. If this change proposal is approved, then we'll need new policies and procedures to ensure community oversight over metrics collection and ensure Fedora users can be confident that our metrics collection does not violate their privacy.
We can say "we would never collect personally-identifiable data" and write software that really doesn't collect any such data, but this alone will never be enough to ensure user confidence. We will need a metrics collection policy that describes what sort of data may be collected by Fedora (anonymous, non-invasive), and what sort of data may not be collected. Such a policy does not exist currently. We will also want to ensure the Fedora community has ultimate control over which particular metrics are collected. One option is that each metric to be collected should be separately approved by FESCo. Collection of particular metrics in a particular data format is ultimately an engineering decision, and therefore FESCo seems like an appropriate approval point. Because FESCo members are elected regularly by the Fedora community, this also provides the community with ultimate control over metrics collection via the election process. But other oversight and approval structures would work too.
=== What data might we collect? ===
We are not proposing to collect any of these particular metrics just yet, because a process for Fedora community approval of metrics to be collected does not yet exist. That said, in the interests of maximum transparency, we wish to give you an idea of what sorts of metrics we might propose to collect in the future.
One of the main goals of metrics collection is to analyze whether Red Hat is achieving its goal to make Fedora Workstation the premier developer platform for cloud software development. Accordingly, we want to know things like which IDEs are most popular among our users, and which runtimes are used to create containers using Toolbx.
Metrics can also be used to inform user interface design decisions. For example, we want to collect the clickthrough rate of the recommended software banners in GNOME Software to assess which banners are actually useful to users. We also want to know how frequently panels in gnome-control-center are visited to determine which panels could be consolidated or removed, because there are other settings we want to add, but our usability research indicates that the current high quantity of settings panels already makes it difficult for users to find commonly-used settings.
Metrics can help us understand the hardware we should be optimizing Fedora for. For example, our boot performance on hard drives dropped drastically when systemd-readahead was removed. Ubuntu has maintained its own readahead implementation, but Fedora does not because we assume that not many users use Fedora on hard drives. It would be nice to collect a metric that indicates whether primary storage is a solid state drive or a hard disk, so we can see actual hard drive usage instead of guessing. We would also want to collect hardware information, such as laptop model ID, that would be useful for collaboration with hardware vendors such as Lenovo.
Other Fedora teams may have other metrics they wish to collect. For example, Fedora localization wishes to count users of particular locales to evaluate which locales are in poorer shape relative to their usage.
This is only a small sample of what we might want to know; no doubt other community members can think of many more interesting data points to collect. But note the purpose of all of the above metrics is to inform specific design decisions, not to build tracking profiles. We only need to collect data in aggregate, and have no need to associate the data we collect with particular users.
=== Metrics transparency ===
Transparency is required to provide confidence that Fedora metrics collection is not creepy or invasive. Since Fedora is open source, a developer can review the source code to verify exactly what it is doing and what data is being collected. But most Fedora users are not software developers, and few software developers have time or inclination to review the source code of the operating system to see what it is doing. To retain user trust, we need an easy way for users to understand exactly what data we are collecting. We propose to maintain a documentation page showing the current metrics database schema, so users can see exactly which fields are in the database and what example data looks like.
Experienced users may gain additional confidence by building and running their own metrics collection server; all of the components of the server (discussed below) are open source, and we will provide instructions for how to run a simple server yourself and view its metrics database. You can redirect metrics from Fedora's server to your own by changing a URL in a configuration file.
=== User control ===
A new metrics collection setting will be added to the privacy page in gnome-initial-setup and also to the privacy page in gnome-control-center. This setting will be a toggle that will enable or disable metrics collection for the entire system. We want to ensure that metrics are never submitted to Fedora without the user's knowledge and consent, so the underlying setting will be off by default in order to ensure metrics upload is not unexpectedly turned on when upgrading from an older version of Fedora. However, we also want to ensure that the data we collect is meaningful, so gnome-initial-setup will default to displaying the toggle as enabled, even though the underlying setting will initially be disabled. (The underlying setting will not actually be enabled until the user finishes the privacy page, to ensure users have the opportunity to disable the setting before any data is uploaded.) This is to ensure the system is opt-out, not opt-in. This is essential because we know that opt-in metrics are not very useful. Few users would opt in, and these users would not be representative of Fedora users as a whole. We are not interested in opt-in metrics.
To make this a little more confusing, metrics collection is actually separate from uploading. Collection is always initially enabled, while uploading is always initially disabled. The graphical toggle enables or disables both at the same time. That is, a newly-installed Fedora system will always collect metrics locally at first, but the collected metrics will be deleted and never submitted to Fedora if the user disables the metrics collection toggle on the privacy page. If the user leaves the toggle enabled, then the collected metrics may be submitted only after finishing the privacy page.
Metrics uploading will be opt-in for users who upgrade from previous versions of Fedora Workstation, because we don't yet have a mechanism to ask the user to consent to data collection after a system upgrade like we do for new installations, but metrics collection will be opt-out. That is, an upgraded system will collect metrics locally but will never submit them to Fedora unless the user turns the setting on. Once the user makes a selection on the privacy page in gnome-control-center, both collection and uploading will be enabled or disabled together according to that selection. Unlike gnome-initial-setup, the switch in gnome-control-center will default to off if the user has not seen the switch in gnome-initial-setup and has not previously selected a value for the setting.
This might sound complicated, but it is consistent. If the user has not yet made a decision whether to allow telemetry, we collect it locally so that it's ready to submit if the user approves telemetry in the future, but we never upload it. Once the user makes a decision, then we either upload it or delete it and stop collecting.
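The rules above can be summarized as a small decision table. The following Python sketch is only an illustration of that logic; the names `Consent`, `MetricsState`, `metrics_state`, and `default_toggle_position` are hypothetical and do not correspond to any actual setting or API in eos-event-recorder-daemon, gnome-initial-setup, or gnome-control-center.

```python
from dataclasses import dataclass
from enum import Enum


class Consent(Enum):
    """Hypothetical model of the user's decision about the metrics toggle."""
    UNDECIDED = "undecided"  # user has not yet seen or touched the toggle
    ALLOWED = "allowed"      # user left or switched the toggle on
    DENIED = "denied"        # user switched the toggle off


@dataclass(frozen=True)
class MetricsState:
    collect_locally: bool  # record events on the local system
    upload: bool           # submit recorded events to the metrics server
    delete_stored: bool    # discard events already recorded locally


def metrics_state(consent: Consent) -> MetricsState:
    """Map the user's decision to the behaviour described in the proposal."""
    if consent is Consent.UNDECIDED:
        # Collect so the data is ready if the user approves later, never upload.
        return MetricsState(collect_locally=True, upload=False, delete_stored=False)
    if consent is Consent.ALLOWED:
        return MetricsState(collect_locally=True, upload=True, delete_stored=False)
    # DENIED: stop collecting and discard anything gathered so far.
    return MetricsState(collect_locally=False, upload=False, delete_stored=True)


def default_toggle_position(surface: str) -> bool:
    """Displayed toggle position before the user has chosen anything."""
    # gnome-initial-setup shows the toggle on; gnome-control-center shows it off.
    return surface == "gnome-initial-setup"
```

A freshly upgraded system corresponds to `Consent.UNDECIDED`: it collects locally and uploads nothing until the user makes a choice.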
=== GDPR ===
It is Fedora Legal's obligation to ensure our data collection complies with legal requirements in the jurisdictions in which Red Hat operates. This is not an obligation of the Fedora community, so there is no need to discuss GDPR rules on our mailing lists. The proposal owners will not respond to mailing list posts that discuss GDPR or similar legal obligations during this change proposal discussion. In short, let's keep discussion focused on what Fedora SHOULD or SHOULD NOT do, rather than what we MUST or MUST NOT do.
That said, Fedora Legal has determined that if we collect any personally-identifiable data, the entire metrics system must be opt-in. Since we are only interested in opt-out metrics due to the low value of opt-in metrics, we must accordingly never collect any personally-identifiable data. We must also not collect any data that could become personally-identifiable if combined with other data, which notably means IP addresses must not be stored. We only want to collect anonymous data anyway, but we need to be especially mindful of the possibility that combining two "anonymous" data points could result in the data no longer being anonymous.
=== Fedora data collection policy ===
Fedora Legal requires that we publish a Fedora data collection policy separate from the existing [https://fedoraproject.org/wiki/Legal:PrivacyPolicy Fedora Privacy Policy], which is designed to address usage of Fedora websites. This is currently a work in progress that we're not quite ready to share yet. You can expect it to be very short and very generic.
=== Metrics server infrastructure ===
We propose to deploy Azafea, the open source metrics collection server used by Endless OS. An Azafea deployment consists of five components: an nginx proxy server, [https://github.com/endlessm/azafea-metrics-proxy azafea-metrics-proxy], redis, [https://github.com/endlessm/azafea azafea itself], and a Postgres database. nginx proxies HTTP requests to azafea-metrics-proxy, which is itself a simple HTTP server that adds metrics into the redis database, where they will be fetched by Azafea and stored into Postgres. We will provide instructions on how to set up your own server and see for yourself what data gets collected.
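To make the data flow concrete, here is a toy in-memory model of the pipeline in Python. It is a sketch under stated assumptions, not Azafea's implementation: a `deque` stands in for redis, a plain list stands in for Postgres, JSON stands in for the real wire format (which this sketch does not attempt to reproduce), and the function names `proxy_receive` and `worker_drain` are hypothetical.

```python
import json
from collections import deque
from typing import Any, Deque, Dict, List


def proxy_receive(queue: Deque[bytes], body: bytes, client_ip: str) -> None:
    """Proxy role: accept an uploaded batch and push it onto the queue."""
    # client_ip is intentionally unused: only the request body is queued, so
    # the address never reaches the database in this sketch.
    queue.append(body)


def worker_drain(queue: Deque[bytes], database: List[Dict[str, Any]]) -> int:
    """Worker role: pop queued batches, parse them, and store each event."""
    stored = 0
    while queue:
        batch = json.loads(queue.popleft())
        for event in batch:
            database.append(event)
            stored += 1
    return stored
```

Separating the receiving step from the storing step means the component that sees the network connection never writes to the database directly, which is one way to keep connection details such as IP addresses out of stored records.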
=== Metrics client infrastructure ===
The client side consists of [https://github.com/endlessm/eos-metrics eos-metrics], [https://github.com/endlessm/eos-event-recorder-daemon eos-event-recorder-daemon], and [https://github.com/endlessm/eos-metrics-instrumentation eos-metrics-instrumentation]. eos-metrics is a D-Bus interface that applications and services may use to record events, plus a GObject library that provides a simple API around the D-Bus interface. eos-event-recorder-daemon is the service that actually implements this interface: it collects incoming metrics, batches them together, and sends them to the metrics server at predefined intervals. eos-metrics-instrumentation is the component that actually collects specific metrics. Originally, we had planned to not use this component and instead write our own fedora-metrics-instrumentation that would collect only a few particular metrics that are approved via Fedora community process. However, currently we are planning to ship eos-metrics-instrumentation and instead ensure that it is not collecting more metrics than would be acceptable to the Fedora community. A review process to decide which metrics to collect and which metrics to disable will be required.
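The "collect, batch, send at intervals" behaviour described for eos-event-recorder-daemon can be illustrated with a short Python sketch. This is not the daemon's code or its D-Bus API; the class `BatchingRecorder`, its methods, and the event dictionary layout are all hypothetical and exist only to show the batching idea.

```python
import time
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]


class BatchingRecorder:
    """Buffer events and hand them to `send` at most once per interval."""

    def __init__(
        self,
        send: Callable[[List[Event]], None],
        interval_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._send = send
        self._interval = interval_seconds
        self._clock = clock
        self._buffer: List[Event] = []
        self._last_flush = clock()

    def record(self, event_id: str, payload: Any = None) -> None:
        """Store one event locally, then flush if the interval has elapsed."""
        self._buffer.append({"id": event_id, "payload": payload})
        self.maybe_flush()

    def maybe_flush(self) -> bool:
        """Send the buffered batch if it is due; return True if a batch was sent."""
        due = self._clock() - self._last_flush >= self._interval
        if due and self._buffer:
            batch, self._buffer = self._buffer, []
            self._send(batch)
            self._last_flush = self._clock()
            return True
        return False
```

Batching reduces the number of network requests and means individual events are not sent at the moment they occur.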
=== Data set considerations ===
Although we assume the metrics server administrator is not malicious and will not actively attempt to deanonymize users, we will still take reasonable precautions to make it difficult to correlate metrics to a particular user, starting by not storing any IP address information in the metrics database. Additionally, each metric that we collect will be considered individual, non-correlatable data by default, unless approved to be correlated with particular other metrics via future Fedora community process. That is, if a user submits two data points, we usually don't want the ability to know that these data points were both submitted by the same user.
Each metric is stored in the database with a Unix timestamp indicating when it was generated on the client. If abused, this timestamp could allow correlation of data points that are collected at the same time as each other, or at a fixed time offset to other events. For example, if the system were designed to collect two metrics exactly 300 seconds after the system was booted, then just looking at the timestamps would be enough to determine that both metrics recorded at the same time were submitted by the same user. Accordingly, we should consider modifying the metrics server to reduce timestamp granularity at least somewhat.
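One simple way to reduce timestamp granularity is to round each timestamp down to the start of a fixed-size bucket before storing it. The Python sketch below assumes a one-hour bucket purely for illustration; the function name `coarsen_timestamp` and the bucket size are hypothetical, and the proposal does not specify how or by how much granularity would actually be reduced.

```python
def coarsen_timestamp(unix_seconds: int, bucket_seconds: int = 3600) -> int:
    """Round a Unix timestamp down to the start of its bucket."""
    if bucket_seconds <= 0:
        raise ValueError("bucket_seconds must be positive")
    return unix_seconds - (unix_seconds % bucket_seconds)
```

With hourly buckets, every event recorded by any user within the same hour is stored with an identical timestamp, so a matching timestamp no longer singles out one system. The trade-off is that time-based analysis becomes correspondingly coarser.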
=== History ===
Currently Fedora's only form of metrics collection is [https://fedoraproject.org/wiki/Changes/DNF_Better_Counting DNF Better Counting], but this only counts Fedora installations. That is useful, but we want to count more than just how many users we have.
Fedora's first metrics collection attempt was [https://en.wikipedia.org/wiki/Smolt_(Linux) Smolt], a precursor to hw-probe which collected data on user hardware. The current proposal is different from Smolt because it will collect more than just hardware data, and also because Smolt collected only opt-in data. The current proposal would be opt-out, not opt-in.
This change proposal will likely be compared to the Ubuntu spyware complaints from a decade ago, when Ubuntu desktop users' search queries were sent to Amazon by default. Let's not do that.
== Feedback ==
We will endeavor to update this section of the change proposal to include a summary of Fedora community discussion of this proposal.
== Benefit to Fedora ==
The main benefit to Fedora is that we will be able to use collected metrics to inform design decisions. It is very common for developers to wish to know something about how Fedora software is used, and we will finally have a way to answer such questions.
Occasionally, Red Hat might need to collect specific metrics to justify additional time spent on contributing to Fedora or additional investment in Fedora.
== Scope ==
* Proposal owners: This change requires substantial technical and nontechnical work from the change owners. Most notably, we will need to package eos-metrics, eos-event-recorder-daemon, and eos-metrics-instrumentation properly for Fedora; they are currently packaged in a copr. We also still need to modify eos-metrics-instrumentation so that it does not send events not approved for use in Fedora, as we expect to collect less data than Endless OS.
* Other developers: This proposal will require substantial effort by Community Platform Engineering (CPE) to host the metrics server infrastructure.
* Release engineering: [https://pagure.io/releng/issues/11514 #11514]
* Policies and guidelines: New processes and guidelines are proposed above under the section "How will data collection be approved?"
* Trademark approval: N/A (not needed for this Change)
* Alignment with Objectives: This change does not align with any current [https://docs.fedoraproject.org/en-US/project/initiatives/ Fedora Initiatives], which are very limited in scope. That said, one of the main purposes of metrics collection is to determine whether we are achieving other objectives not listed on the wiki page. For example, we want Fedora Workstation to become the premier developer workstation operating system. To that end, we want to know how many of our users are using particular IDEs.
== Upgrade/compatibility impact ==
We would like to enable metrics upload for upgraded systems, but this isn't trivial because we want to obtain user consent before enabling metrics upload. This would require us to design a user interface that would run on upgraded systems and present the setting to users. We have not yet created such a user interface, so for now metrics upload will need to default to disabled for systems upgraded from older versions of Fedora. Since the underlying setting will be off by default, we don't need to do anything special to achieve this.
== How To Test ==
The ultimate goal is to see metrics appear in the Postgres database of a metrics server, but configuring and running the server is not trivial. Accordingly, we propose to publish a separate document detailing how to set up and configure a metrics server for testing purposes, how to redirect metrics to the custom server, and how to force the client to immediately submit metrics to ease testing. Although we don't actually expect many community members to seriously run their own metrics servers, we still want to document the steps involved so that interested developers can see exactly how it works.
== User Experience ==
A new metrics collection setting will be added to the privacy page in gnome-initial-setup and also to the privacy page in gnome-control-center. This setting will be a simple toggle that will enable or disable all metrics upload for the entire system. Users who do not want any metrics upload should feel confident that uploading can be disabled with a simple toggle.
Fedora users should be confident that Fedora metrics collection respects their privacy and collects only limited, anonymous usage data.
== Dependencies ==
Any package that wishes to collect a metric would need to depend on eos-metrics. For example, if we were to collect statistics on which system settings panels are used most frequently, then the gnome-control-center package would need to depend on eos-metrics in order to send a metric to eos-event-recorder-daemon.
== Contingency Plan ==
* Contingency mechanism: We would need to remove the eos-metrics, eos-event-recorder-daemon, and eos-metrics-instrumentation packages from the workstation-product comps group, and rebuild any packages that gained a dependency on eos-metrics.
* Contingency deadline: Beta freeze
* Blocks release? Yes, if the change is incomplete, it will need to be reverted before release.
== Documentation ==
This feature will depend on several different upstream projects with varying amounts of documentation.
The client side consists of eos-metrics, eos-event-recorder-daemon, and eos-metrics-instrumentation. The best documentation of eos-metrics available online is its [https://github.com/endlessm/eos-metrics/blob/master/data/com.endlessm.Metric... D-Bus interface XML]. eos-metrics also contains normal API documentation that will be built and installed in a docs subpackage, but this is not currently available online. The eos-event-recorder-daemon and eos-metrics-instrumentation components do not appear to have any online documentation.
On the server end, the metrics server consists of azafea-metrics-proxy feeding metrics into redis, where they will be pulled by azafea and then added to a Postgres database. Documentation for [https://github.com/endlessm/azafea-metrics-proxy/tree/master/docs/source azafea-metrics-proxy] and [https://github.com/endlessm/azafea/tree/master/docs/source azafea] can be reviewed online. [https://azafea.readthedocs.io/en/latest/events.html Events recognized by the server are documented here.] Note that this documentation is currently focused on use by Endless OS rather than by Fedora, and includes documentation of many events that are no longer sent by Endless OS. This change proposal does not propose to enable sending any particular events in Fedora.
== Release Notes ==
Release Notes are not required for initial proposal. We need to write the release notes before change freeze.
On 06/07/2023 18:10, Aoife Moloney wrote:
The Red Hat Display Systems Team (which develops the desktop) proposes to enable limited data collection of anonymous Fedora Workstation usage metrics.
All telemetry collection MUST be an opt-in feature (disabled by default). I'm strongly against enabling it by default.
Please add the ability to completely get rid of it by removing the telemetry collector package.
On Thu, Jul 6 2023 at 08:19:07 PM +0200, Vitaly Zaitsev via devel devel@lists.fedoraproject.org wrote:
All telemetry collection MUST be an opt-in feature (disabled by default). I'm strongly against enabling it by default.
As explained in the proposal document, we know that opt-in metrics are not very useful because few users would opt in, and these users would not be representative of Fedora users as a whole. We are not interested in opt-in metrics.
Please add the ability to completely get rid of it by removing the telemetry collector package.
It should be possible to uninstall eos-event-recorder-daemon and eos-metrics-instrumentation. I'm not sure if eos-metrics will be uninstallable, but that package basically just provides D-Bus API and doesn't do anything on its own. (eos-event-recorder-daemon is the component that actually uploads metrics.)
On 06/07/2023 21:32, Michael Catanzaro wrote:
As explained in the proposal document, we know that opt-in metrics are not very useful because few users would opt in, and these users would not be representative of Fedora users as a whole.
Because Linux users care about their privacy.
We are not interested in opt-in metrics.
Then the statement "Privacy-preserving Telemetry" is not true. We want privacy - that is, no telemetry at all.
On Thu, Jul 6, 2023 at 9:58 PM Vitaly Zaitsev via devel < devel@lists.fedoraproject.org> wrote:
On 06/07/2023 21:32, Michael Catanzaro wrote:
As explained in the proposal document, we know that opt-in metrics are not very useful because few users would opt in, and these users would not be representative of Fedora users as a whole.
Because Linux users care about their privacy.
In that case, the users can opt-out, if the aggregated telemetry doesn't fit in their privacy framework.
On Thu, Jul 06, 2023 at 10:26:17PM +0200, Frantisek Zatloukal wrote:
On Thu, Jul 6, 2023 at 9:58 PM Vitaly Zaitsev via devel < devel@lists.fedoraproject.org> wrote:
On 06/07/2023 21:32, Michael Catanzaro wrote:
As explained in the proposal document, we know that opt-in metrics are not very useful because few users would opt in, and these users would not be representative of Fedora users as a whole.
Because Linux users care about their privacy.
In that case, the users can opt-out, if the aggregated telemetry doesn't fit in their privacy framework.
This is not legal.
Rich.
On Thu Jul 6, 2023 at 14:32 CDT, Michael Catanzaro wrote:
On Thu, Jul 6 2023 at 08:19:07 PM +0200, Vitaly Zaitsev via devel devel@lists.fedoraproject.org wrote:
All telemetry collection MUST be an opt-in feature (disabled by default). I'm strongly against enabling it by default.
As explained in the proposal document, we know that opt-in metrics are not very useful because few users would opt in, and these users would not be representative of Fedora users as a whole. We are not interested in opt-in metrics.
Opt-out telemetry isn't going to be representative of the whole community either. Privacy-conscious users are going to opt out, and then their voices won't be heard.
On 7/6/23 15:32, Michael Catanzaro wrote:
On Thu, Jul 6 2023 at 08:19:07 PM +0200, Vitaly Zaitsev via devel devel@lists.fedoraproject.org wrote:
All telemetry collection MUST be an opt-in feature (disabled by default). I'm strongly against enabling it by default.
As explained in the proposal document, we know that opt-in metrics are not very useful because few users would opt in, and these users would not be representative of Fedora users as a whole. We are not interested in opt-in metrics.
Then make the metrics be neither opt-in nor opt-out. Have “Enable telemetry (y/n)?” be a mandatory question in the installer, which the user must answer.
On Thu, Jul 6 2023 at 07:42:47 PM -0400, Demi Marie Obenour demiobenour@gmail.com wrote:
Then make the metrics be neither opt-in nor opt-out. Have “Enable telemetry (y/n)?” be a mandatory question in the installer, which the user must answer.
The problem is that if users are expected to answer, they are probably going to answer No, and it's effectively the same as opt-in. But if we have a default value, users will be inclined to leave the default value.
My plan is to put this switch in gnome-initial-setup, not the installer. But it will have a default value.
Remember, for avoidance of doubt, we will NEVER enable telemetry upload without the user's consent, which is indicated by either (a) not flipping the telemetry switch in gnome-initial-setup to the off position, or (b) flipping the telemetry switch in gnome-control-center to the on position. (The telemetry might be enabled *locally only* for users who upgrade from previous versions of Fedora Workstation and who therefore have not seen the consent switch, but the data will never be uploaded to Fedora. And upgraded users will see the switch default to off rather than on, so it really will be opt-in for upgraded users.)
I'm attaching a screenshot to give an idea of what this would look like in gnome-initial-setup. I don't have a gnome-control-center screenshot handy, but it would be similar, except there it would default to off.
On Thu Jul 6, 2023 at 20:17 CDT, Michael Catanzaro wrote:
I'm attaching a screenshot to give an idea of what this would look like in gnome-initial-setup. I don't have a gnome-control-center screenshot handy, but it would be similar, except there it would default to off.
I don't see an attachment.
On Fri, Jul 7 2023 at 01:39:24 AM +0000, Maxwell G maxwell@gtmx.me wrote:
I don't see an attachment.
Trying again.
Looking at the screenshot, I wonder what percentage of users will read "Privacy", see that all the switches are on, and click "Next" in the belief that all the privacy features are on.
Björn Persson
On 7/6/23 21:17, Michael Catanzaro wrote:
On Thu, Jul 6 2023 at 07:42:47 PM -0400, Demi Marie Obenour demiobenour@gmail.com wrote:
Then make the metrics be neither opt-in nor opt-out. Have “Enable telemetry (y/n)?” be a mandatory question in the installer, which the user must answer.
The problem is that if users are expected to answer, they are probably going to answer No, and it's effectively the same as opt-in. But if we have a default value, users will be inclined to leave the default value.
My plan is to put this switch in gnome-initial-setup, not the installer. But it will have a default value.
It needs to be off by default. See KDE’s telemetry policy.
On Thu, Jul 6 2023 at 09:40:59 PM -0400, Demi Marie Obenour demiobenour@gmail.com wrote:
It needs to be off by default. See KDE’s telemetry policy
Again, if it's off by default then the data will be garbage. There is no point in doing opt-in telemetry. I would withdraw the proposal entirely if we cannot do it opt-out.
Michael
Unfortunately this might just be what happens.
I know that I would personally always opt out on principle, and would vote for opt-in or dropping the proposal. I am under the impression that most Fedora users are in the same boat as me.
On 7/12/23 16:34, Jeremy Newton wrote:
I know that I would personally always opt out on principle, and would vote for opt-in or dropping the proposal. I am under the impression that most Fedora users are in the same boat as me.
For the record, my personal opinion is that an opt-out is an acceptable option. I believe that Fedora has been hampered in the past by lack of reliable usage stats: we had difficult discussions about support for i686/python2/Qt3/etc where we just didn't know where the Fedora users were. Therefore, I personally think it is a good idea to allow collecting such stats, because I trust Fedora organization to keep such information to itself.
First of all, I just don't see that large data brokers would be interested in Python3 adoption data, and secondly I hope that Fedora organization would have the integrity (and the whistleblowers:) to protect that info even if there was a temptation to let it out.
Regarding the opt-in vs opt-out, someone made a claim that opt-out is not compliant with GDPR, which doesn't sound right. Every GDPR widget I have seen so far essentially asks if I agree with data collection, and offers me an opportunity to opt out of everything but essential cookies, which seems equivalent to the 'opt-out' mechanism that Michael proposes.
One missing piece might be for Fedora organization to commit to a policy of protecting such data collections, by publishing a legally sound declaration about its intentions and practices. Currently, we have this
https://docs.fedoraproject.org/en-US/legal/privacy/
which in my 'not-a-lawyer' view seems to be targeted to the web collection and may be US-centric, so maybe it could use some legal wordsmithing.
Again, all this is my personal opinion.
p
On Thu, Jul 13, 2023 at 12:25:48PM -0400, Przemek Klosowski via devel wrote:
One missing piece might be for Fedora organization to commit to a policy of protecting such data collections, by publishing a legally sound declaration about its intentions and practices. Currently, we have this
https://docs.fedoraproject.org/en-US/legal/privacy/
which in my 'not-a-lawyer' view seems to be targeted to the web collection and may be US-centric, so maybe it could use some legal wordsmithing.
Should this proposal be accepted, there will be a separate document. And, unrelatedly, the existing privacy statement is in the process of an update -- needs a refresh for legal changes, and there are a number of things that it suggests we might do that I think we have no interest in and should drop (like asking for geo coordinates).
On Sat, 2023-07-15 at 15:01 -0400, Matthew Miller wrote:
On Thu, Jul 13, 2023 at 12:25:48PM -0400, Przemek Klosowski via devel wrote:
One missing piece might be for Fedora organization to commit to a policy of protecting such data collections, by publishing a legally sound declaration about its intentions and practices. Currently, we have this
https://docs.fedoraproject.org/en-US/legal/privacy/
which in my 'not-a-lawyer' view seems to be targeted to the web collection and may be US-centric, so maybe it could use some legal wordsmithing.
Should this proposal be accepted, there will be a separate document. And, unrelatedly, the existing privacy statement is in the process of an update -- needs a refresh for legal changes, and there are a number of things that it suggests we might do that I think we have no interest in and should drop (like asking for geo coordinates).
The installer does broad geolocation in order to guess the timezone and locale, IIRC. grep the anaconda codebase for 'geoip' and you'll find the code. It basically hits up https://geoip.fedoraproject.org/city , so you can go there manually and see what data it gets from you.
On Sat, 2023-07-15 at 12:16 -0700, Adam Williamson wrote:
On Sat, 2023-07-15 at 15:01 -0400, Matthew Miller wrote:
On Thu, Jul 13, 2023 at 12:25:48PM -0400, Przemek Klosowski via devel wrote:
One missing piece might be for Fedora organization to commit to a policy of protecting such data collections, by publishing a legally sound declaration about its intentions and practices. Currently, we have this
https://docs.fedoraproject.org/en-US/legal/privacy/
which in my 'not-a-lawyer' view seems to be targeted to the web collection and may be US-centric, so maybe it could use some legal wordsmithing.
Should this proposal be accepted, there will be a separate document. And, unrelatedly, the existing privacy statement is in the process of an update -- needs a refresh for legal changes, and there are a number of things that it suggests we might do that I think we have no interest in and should drop (like asking for geo coordinates).
The installer does broad geolocation in order to guess the timezone and locale, IIRC. grep the anaconda codebase for 'geoip' and you'll find the code. It basically hits up https://geoip.fedoraproject.org/city , so you can go there manually and see what data it gets from you.
...of course, it doesn't *store* that information anywhere.
The current policy seems to be written in relation to the account system...in the current account system you can set your locale and your timezone, though I don't see anywhere to set any more specific location than that (I think older versions of FAS might've let you be more specific).
On Friday, 07 July 2023 at 04:16, Michael Catanzaro wrote:
On Thu, Jul 6 2023 at 09:40:59 PM -0400, Demi Marie Obenour demiobenour@gmail.com wrote:
It needs to be off by default. See KDE’s telemetry policy
Again, if it's off by default then the data will be garbage. There is no point in doing opt-in telemetry. I would withdraw the proposal entirely if we cannot do it opt-out.
I think you should withdraw the proposal, then. If you can't present clear enough benefits so that people willingly give you their data, then Fedora's reputation will be garbage when you betray Fedora users' trust by collecting any data without their explicit consent. There is no point in doing opt-out telemetry if you want Fedora to keep its user base. Conversely, if you can do successful opt-in telemetry, that would be really awesome.
Regards, Dominik
Hello,
On 7/7/23 04:16, Michael Catanzaro wrote:
On Thu, Jul 6 2023 at 09:40:59 PM -0400, Demi Marie Obenour demiobenour@gmail.com wrote:
It needs to be off by default. See KDE’s telemetry policy
Again, if it's off by default then the data will be garbage. There is no point in doing opt-in telemetry. I would withdraw the proposal entirely if we cannot do it opt-out.
Since you're repeating that argument, I'll join those who repeat "off by default" + "opt-in only" + "able to uninstall that component completely".
Argument:
1) IANAL, but GDPR; with addition "not a big believer in anonymization being 100% effective"
2) "dark pattern", e.g. you know (guess, estimate, whatever) that many will not opt-in, hence you're trying to trick them (e.g. twisting their will and choices)
Sincerely
Peter
On Thu, Jul 06, 2023 at 20:17:27 -0500, Michael Catanzaro mcatanzaro@redhat.com wrote:
Remember, for avoidance of doubt, we will NEVER enable telemetry upload without the user's consent, which is indicated by either (a) not flipping the telemetry switch in gnome-initial-setup to the off position, or (b) flipping the telemetry switch in gnome-control-center to the on position. (The telemetry might be enabled *locally only* for users who upgrade from previous versions of Fedora Workstation and who therefore have not seen the consent switch, but the data will never be uploaded to Fedora. And upgraded users will see the switch default to off rather than on, so it really will be opt-in for upgraded users.)
Note that collecting the data by default increases the harm if someone accidentally enables telemetry and then notices the issue after data is reported.
Is there going to be some time limit on the data that is stored and not uploaded yet?
On Fri, Jul 7 2023 at 12:03:14 PM -0500, Bruno Wolff III bruno@wolff.to wrote:
Note that collecting the data by default increases the harm if someone accidentally enables telemetry and then notices the issue after data is reported.
Is there going to be some time limit on the data that is stored and not uploaded yet?
We can implement a time limit. The main purpose of this is so that we have the ability to collect data between first boot and the privacy panel in gnome-initial-setup. I'll add this to the feedback section of the change proposal.
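To make the time-limit idea concrete, here is a minimal sketch of pruning locally stored events that have not been uploaded within some retention window. Everything in it is an assumption for illustration: the `Event` type, the `prune_expired` helper, and the seven-day default are hypothetical and are not taken from eos-event-recorder-daemon or from the change proposal.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Event:
    """Hypothetical locally recorded metric event (illustrative only)."""
    name: str
    recorded_at: datetime


def prune_expired(events, now, max_age=timedelta(days=7)):
    """Keep only events recorded within max_age of now.

    The seven-day default is an arbitrary placeholder, not a proposed value.
    """
    cutoff = now - max_age
    return [event for event in events if event.recorded_at >= cutoff]


# Usage: an event from 30 days ago is dropped, a 1-day-old event is kept.
now = datetime(2023, 7, 14, tzinfo=timezone.utc)
stale = Event("first-boot", now - timedelta(days=30))
fresh = Event("panel-opened", now - timedelta(days=1))
kept = prune_expired([stale, fresh], now)
```

With a limit like this, data collected before the user reaches the privacy panel would only be available for upload for a bounded period.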
On Thu, Jul 06, 2023 at 08:17:27PM -0500, Michael Catanzaro wrote:
On Thu, Jul 6 2023 at 07:42:47 PM -0400, Demi Marie Obenour demiobenour@gmail.com wrote:
Then make the metrics be neither opt-in nor opt-out. Have “Enable telemetry (y/n)?” be a mandatory question in the installer, which the user must answer.
The problem is if users are expected to answer, they are going to probably answer No and it's effectively the same as an opt-in. But if we have a default value, users will be inclined to leave the default value.
So you do not trust users to answer the way you want? So much for respecting your users.
Michael Catanzaro wrote:
The problem is if users are expected to answer, they are going to probably answer No and it's effectively the same as an opt-in. But if we have a default value, users will be inclined to leave the default value.
[...]
Remember, for avoidance of doubt, we will NEVER enable telemetry upload without the user's consent, which is indicated by either (a) not flipping the telemetry switch in gnome-initial-setup to the off position,
In other words, you expect that many users will click "Next" without thinking, and you intend to call that "consent". It's a popular tactic to make people "agree" to things without knowing it.
Björn Persson
On Friday, 07 July 2023 at 23:45, Björn Persson wrote:
Michael Catanzaro wrote:
The problem is if users are expected to answer, they are going to probably answer No and it's effectively the same as an opt-in. But if we have a default value, users will be inclined to leave the default value.
[...]
Remember, for avoidance of doubt, we will NEVER enable telemetry upload without the user's consent, which is indicated by either (a) not flipping the telemetry switch in gnome-initial-setup to the off position,
In other words, you expect that many users will click "Next" without thinking, and you intend to call that "consent". It's a popular tactic to make people "agree" to things without knowing it.
https://en.wikipedia.org/wiki/Dark_pattern#Privacy_Zuckering
So, either explain the issue to users convincingly enough that they do click to enable the (off-by-default) telemetry (emphasis on benefits) or scrap the idea altogether. Don't be like Facebook.
Regards, Dominik
Agreed 100%. Dark patterning or similar isn't the way to go.
If telemetry is included, it should be opt-in with a very clear explanation of why opting in is important and beneficial.
Opt-out and "by consent" are mutually exclusive in most circumstances.
On 7/6/23 21:17, Michael Catanzaro wrote:
On Thu, Jul 6 2023 at 07:42:47 PM -0400, Demi Marie Obenour demiobenour@gmail.com wrote:
Then make the metrics be neither opt-in nor opt-out. Have “Enable telemetry (y/n)?” be a mandatory question in the installer, which the user must answer.
The problem is if users are expected to answer, they are going to probably answer No and it's effectively the same as an opt-in. But if we have a default value, users will be inclined to leave the default value.
My plan is to put this switch in gnome-initial-setup, not the installer. But it will have a default value.
Remember, for avoidance of doubt, we will NEVER enable telemetry upload without the user's consent, which is indicated by either (a) not flipping the telemetry switch in gnome-initial-setup to the off position, or (b) flipping the telemetry switch in gnome-control-center to the on position.
That is not consent. The GDPR explicitly states that consent must be opt-IN.
The way to get more data is not to trick users, but to explain _exactly_ what that data is in a way that non-technical people can actually understand.
On 7/7/23 19:59, Demi Marie Obenour wrote:
That is not consent. The GDPR explicitly states that consent must be opt-IN.
I agree.
I think it is important to make it possible for a user to ask for the data collected from their machine to be deleted in the event they mistakenly submitted data, or changed their mind.
On Sat, 8 Jul 2023, 01:08 Randy Barlow via devel, < devel@lists.fedoraproject.org> wrote:
On 7/7/23 19:59, Demi Marie Obenour wrote:
That is not consent. The GDPR explicitly states that consent must be opt-IN.
I agree.
I think it is important to make it possible for a user to ask for the data collected from their machine to be deleted in the event they mistakenly submitted data, or changed their mind.
Wouldn't that require the data to be individually identifiable?
_______________________________________________
devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
On 7/7/23 21:14, Naheem Zaffar wrote:
On Sat, 8 Jul 2023, 01:08 Randy Barlow via devel, < devel@lists.fedoraproject.org> wrote:
On 7/7/23 19:59, Demi Marie Obenour wrote:
That is not consent. The GDPR explicitly states that consent must be opt-IN.
I agree.
I think it is important to make it possible for a user to ask for the data collected from their machine to be deleted in the event they mistakenly submitted data, or changed their mind.
Wouldn't that require the data to be individually identifiable?
Yup! The set of all Fedora users is small enough that trying to use cryptographic approaches to mask it won’t work, as a brute-force attack is feasible.
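The brute-force point can be illustrated with a small sketch. It assumes, purely for illustration, that each machine has an identifier drawn from a small, enumerable space and that the identifier is "masked" with an unsalted SHA-256 hash; the `host-NNNNNN` format and both function names are hypothetical and say nothing about how the proposed system would actually identify machines.

```python
import hashlib


def pseudonymize(machine_id: str) -> str:
    """Mask an identifier with an unsalted SHA-256 hash."""
    return hashlib.sha256(machine_id.encode("utf-8")).hexdigest()


def brute_force(target_digest, candidates):
    """Recover the identifier by hashing every candidate in turn."""
    for candidate in candidates:
        if pseudonymize(candidate) == target_digest:
            return candidate
    return None


# Hypothetical identifier space of 100,000 machines.
candidates = [f"host-{n:06d}" for n in range(100_000)]
leaked_digest = pseudonymize("host-042317")
recovered = brute_force(leaked_digest, candidates)
```

Because the whole candidate space can be hashed in well under a second, the digest offers no real protection in this scenario; the masking is only as strong as the identifier space is large.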
On Sat, Jul 8 2023 at 12:08:09 AM +0000, Randy Barlow via devel devel@lists.fedoraproject.org wrote:
I agree.
I think it is important to make it possible for a user to ask for the data collected from their machine to be deleted in the event they mistakenly submitted data, or changed their mind.
To be able to delete your data on request, we would have to maintain user profiles such that we can tell which user submitted the data. That's invasive and would drastically reduce your privacy. We don't want to be able to figure out which user submitted particular data. That doesn't make sense for Fedora.
On Thu, Jul 6 2023 at 07:42:47 PM -0400, Demi Marie Obenour <demiobenour(a)gmail.com> wrote:
The problem is if users are expected to answer, they are going to probably answer No and it's effectively the same as an opt-in. But if we have a default value, users will be inclined to leave the default value.
My plan is to put this switch in gnome-initial-setup, not the installer. But it will have a default value.
Remember, for avoidance of doubt, we will NEVER enable telemetry upload without the user's consent, which is indicated by either (a) not flipping the telemetry switch in gnome-initial-setup to the off position, or (b) flipping the telemetry switch in gnome-control-center to the on position. (The telemetry might be enabled *locally only* for users who upgrade from previous versions of Fedora Workstation and who therefore have not seen the consent switch, but the data will never be uploaded to Fedora. And upgraded users will see the switch default to off rather than on, so it really will be opt-in for upgraded users.)
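The default behaviour described above can be written down as a small decision function. This is only a sketch of the rules as stated in this message; `InstallKind`, `default_upload_switch`, and `upload_enabled` are hypothetical names, not part of gnome-initial-setup or gnome-control-center.

```python
from enum import Enum
from typing import Optional


class InstallKind(Enum):
    FRESH = "fresh"      # user goes through gnome-initial-setup
    UPGRADE = "upgrade"  # user never saw the gnome-initial-setup switch


def default_upload_switch(kind: InstallKind) -> bool:
    """Switch default as described: on for fresh installs, off for upgrades."""
    return kind is InstallKind.FRESH


def upload_enabled(kind: InstallKind, user_choice: Optional[bool] = None) -> bool:
    """An explicit user choice always wins; otherwise the default applies."""
    if user_choice is not None:
        return user_choice
    return default_upload_switch(kind)


# Usage: the four cases discussed in the thread.
fresh_untouched = upload_enabled(InstallKind.FRESH)
fresh_opted_out = upload_enabled(InstallKind.FRESH, user_choice=False)
upgrade_untouched = upload_enabled(InstallKind.UPGRADE)
upgrade_opted_in = upload_enabled(InstallKind.UPGRADE, user_choice=True)
```

Written this way, the only case where upload happens without an explicit action by the user is a fresh install where the switch is left untouched.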
I'm attaching a screenshot to give an idea of what this would look like in gnome-initial-setup. I don't have a gnome-control-center screenshot handy, but it would be similar, except there it would default to off.
On Thu, Jul 6 2023 at 07:42:47 PM -0400, Demi Marie Obenour <demiobenour(a)gmail.com> wrote:
The problem is if users are expected to answer, they are going to probably answer No and it's effectively the same as an opt-in. But if we have a default value, users will be inclined to leave the default value.
My plan is to put this switch in gnome-initial-setup, not the installer. But it will have a default value.
Remember, for avoidance of doubt, we will NEVER enable telemetry upload without the user's consent, which is indicated by either (a) not flipping the telemetry switch in gnome-initial-setup to the off position, or (b) flipping the telemetry switch in gnome-control-center to the on position. (The telemetry might be enabled *locally only* for users who upgrade from previous versions of Fedora Workstation and who therefore have not seen the consent switch, but the data will never be uploaded to Fedora. And upgraded users will see the switch default to off rather than on, so it really will be opt-in for upgraded users.)
I'm attaching a screenshot to give an idea of what this would look like in gnome-initial-setup. I don't have a gnome-control-center screenshot handy, but it would be similar, except there it would default to off.
On Thu, Jul 6 2023 at 07:42:47 PM -0400, Demi Marie Obenour <demiobenour(a)gmail.com> wrote:
The problem is if users are expected to answer, they are going to probably answer No and it's effectively the same as an opt-in. But if we have a default value, users will be inclined to leave the default value.
Opt-out is and always will be incredibly disingenuous when it comes to data collection. Now I'm to understand that you're hoping enough users don't understand/notice that there's an option to opt-out, so that you receive enough users. What exactly is the reason this change is being considered?
One of the main goals of metrics collection is to analyze whether Red Hat is achieving its goal to make Fedora Workstation the premier developer platform for cloud software development. Accordingly, we want to know things like which IDEs are most popular among our users, and which runtimes are used to create containers using Toolbx.
Then why not reach out to THESE users instead of casting a global net over all users? To my knowledge, there has never been a telemetry inclusion that has been to the benefit of its users. I understand that Red Hat sells products and services, but is it wise to do so at the expense of antagonizing its userbase of volunteers and advocates?
At the end of the day, no matter how you word it, telemetry is still data that is actively transmitted from the user to a third party. I still have to trust that this third party will not misuse my data and ONLY collect what it says it will. Can Red Hat GUARANTEE that it won't collect something else if there's a security breach or there's an update pushed to the telemetry app containing a bug that collects more than intended? Once it happens, no matter if by accident or not, it will still have happened and leaked unintended data.
Remember, for avoidance of doubt, we will NEVER enable telemetry upload without the user's consent, which is indicated by either (a) not flipping the telemetry switch in gnome-initial-setup to the off position, or (b) flipping the telemetry switch in gnome-control-center to the on position.
So it's considered consent if you don't know what you're signing up for? I would never consider something consent without it being overtly approved by the user, although I don't know how this applies to laws in different jurisdictions. This definition of consent would then have to match up with every country where there is a Fedora user, no?
In hindsight, both of my comments were hastily posted to this discussion. It wasn't very constructive and I apologize for this.
I do believe that this proposed change is being considered with the best intentions for both the user and Fedora. Could we see an example of the text/telemetry that would be sent? Would there be a notification to the user when/if this data is sent? If not, would the user be able to view this on their current install in some sort of log?
On 7/6/23 21:17, Michael Catanzaro wrote:
On Thu, Jul 6 2023 at 07:42:47 PM -0400, Demi Marie Obenour demiobenour@gmail.com wrote:
Then make the metrics be neither opt-in nor opt-out. Have “Enable telemetry (y/n)?” be a mandatory question in the installer, which the user must answer.
The problem is if users are expected to answer, they are going to probably answer No and it's effectively the same as an opt-in. But if we have a default value, users will be inclined to leave the default value.
My plan is to put this switch in gnome-initial-setup, not the installer. But it will have a default value.
Remember, for avoidance of doubt, we will NEVER enable telemetry upload without the user's consent
The GDPR is clear that failure to opt-out does not represent consent.
On Thu, Jul 06, 2023 at 14:32:04 -0500, Michael Catanzaro mcatanzaro@redhat.com wrote:
On Thu, Jul 6 2023 at 08:19:07 PM +0200, Vitaly Zaitsev via devel devel@lists.fedoraproject.org wrote:
All telemetry collection MUST be an opt-in feature (disabled by default). I'm strongly against enabling it by default.
As explained in the proposal document, we know that opt-in metrics are not very useful because few users would opt in, and these users would not be representative of Fedora users as a whole. We are not interested in opt-in metrics.
This strongly suggests that most people would prefer not to provide metrics. But what is hoped is that they won't mind it enough to turn things off. I'm not a fan of doing this, but people can reasonably argue it is for the greater good or that most people are misevaluating the trade-offs of their data being used to improve things for them.
"we know that opt-in metrics are not very useful because few users would opt in … We are not interested in opt-in metrics." Any metrics collected *must* be opt-in. If the quote above still reflects your thinking on telemetry collection then this is not a viable scheme, and should be withdrawn.
Less technically related: this proposal’s hard stance against opt-in, following so soon after Mike McGrath’s defense of locking down RHEL source (at https://www.redhat.com/en/blog/red-hats-commitment-open-source-response-gitc...), is tone deaf and a surprisingly bad look.
On Wed, 2023-07-19 at 17:49 +0000, Honore Doktorr wrote:
"we know that opt-in metrics are not very useful because few users would opt in … We are not interested in opt-in metrics." Any metrics collected *must* be opt-in. If the quote above still reflects your thinking on telemetry collection then this is not a viable scheme, and should be withdrawn.
The proposal will not go forward in its current form and will be re-proposed with substantial changes, including an "explicit choice required" design. See https://discussion.fedoraproject.org/t/f40-change-request-privacy-preserving... and https://discussion.fedoraproject.org/t/opt-in-opt-out-a-breakout-topic-for-t... .
Assuming the goal is to improve fedora, that would be pointless as telemetry rarely produces useful results as opt-in. It makes sense to have it opt-out, but I'd expect the telemetry output and inputs to be open and available for fedora developers.
Regards, Nikos
On Thu, Jul 6, 2023 at 8:19 PM Vitaly Zaitsev via devel < devel@lists.fedoraproject.org> wrote:
On 06/07/2023 18:10, Aoife Moloney wrote:
The Red Hat Display Systems Team (which develops the desktop) proposes to enable limited data collection of anonymous Fedora Workstation usage metrics.
All telemetry collection MUST be an opt-in feature (disabled by default). I'm strongly against enabling it by default.
Please add the ability to completely get rid of it by removing the telemetry collector package.
-- Sincerely, Vitaly Zaitsev (vitaly@easycoding.org)
Assuming the goal is to improve fedora, that would be pointless as telemetry rarely produces useful results as opt-in. It makes sense to have it opt-out, but I'd expect the telemetry output and inputs to be open and available for fedora developers.
Regards, Nikos
On Thu, Jul 6, 2023 at 8:19 PM Vitaly Zaitsev via devel < devel(a)lists.fedoraproject.org> wrote:
If the telemetry is presented in plain text that's easy to understand and the user is prompted if they wish to submit the data, sure that could be a possible compromise.
I don't understand where my reply is supposed to go so here it is on the mailing list *and* on the forums? Are the change proposal owners reading both?
--
Perhaps this is implicit in the use of eos-* but I seem to be missing a list of what metrics would be collected exactly and what is contained in messages to/from Fedora infrastructure related to these metrics.
Is this change request meant to discuss the general idea and acceptance level of adding opt-out metrics collection in Fedora?
One of the main goals of metrics collection is to analyze whether Red Hat is achieving its goal to make Fedora Workstation the premier developer platform for cloud software development.
Could you please explain how collecting metrics on which IDE is run on Fedora systems would help Red Hat make Fedora the premier developer platform for cloud software development?
Occasionally, Red Hat might need to collect specific metrics to justify additional time spent on contributing to Fedora or additional investment in Fedora.
Fedora is upstream; collecting these metrics on RHEL systems seems like a saner place to put them if it’s to steer Red Hat prioritizations?
On Thu, Jul 6 2023 at 08:41:03 PM +0200, Simon de Vlieger cmdr@supakeen.com wrote:
I don't understand where my reply is supposed to go so here it is on the mailing list *and* on the forums? Are the change proposal owners reading both?
In theory, we're supposed to be discussing this on Discourse to make sure that platform is indeed suitable for change proposal discussions. I will respond to comments on this list too, though.
Since you posted this on Discourse as well, I'll respond to you there.
As an opt-in option I am all for it. It should be turned off by default and easily removed. The opt-in could even be part of the installation process, and if you don't choose to opt in, the necessary packages never get installed.
Without it being deployed as opt-in it would be a huge invasion of privacy (real or imagined), and is likely to receive a huge amount of pushback from the community.
* Aoife Moloney:
== Dependencies ==
Any package that wishes to collect a metric would need to depend on eos-metrics. For example, if we were to collect statistics on which system settings panels are used most frequently, then the gnome-control-center package would need to depend on eos-metrics in order to send a metric to eos-event-recorder-daemon.
What about packages which already collect metrics and report them somewhere (not necessarily to Red Hat)? Would these packages need to change under this proposal? If not, how do we explain this to our users?
Thanks, Florian
On Thu, Jul 6 2023 at 09:27:47 PM +0200, Florian Weimer fweimer@redhat.com wrote:
What about packages which already collect metrics and report them somewhere (not necessarily to Red Hat)? Would these packages need to change under this proposal? If not, how do we explain this to our users?
No, packages that are already collecting their own metrics separately would not be affected.
On Jul 7, 2023, at 7:09 AM, Michael Catanzaro mcatanzaro@redhat.com wrote:
On Thu, Jul 6 2023 at 09:27:47 PM +0200, Florian Weimer fweimer@redhat.com wrote:
What about packages which already collect metrics and report them somewhere (not necessarily to Red Hat)? Would these packages need to change under this proposal? If not, how do we explain this to our users?
No, packages that are already collecting their own metrics separately would not be affected.
I’d almost prefer we work out a policy where anything of the sort is disabled by default, and with a distro-wide standard bcond to not even compile it in as an option. (No, I don’t quite know how that could be worded sensibly as a policy…. but it’s where I think I’d prefer to start from).
Even well intentioned things can be problematic.
Did you know that “lshw" does a DNS query?
Not only that, it’s a DNS query not to where the distro points to, but somewhere out on the internet.
By running “lshw” you’ve now told a DNS server how many machines / people you have running “lshw” within some amount of time.
You’ve also now complicated the ability to go “I allow access to the packaging repositories for security updates, the one two or three endpoints my application needs to talk to, and if any of these machines EVER tries to do any other network activity, page people immediately as that can only mean something is wrong”. This *really* isn’t an unreasonable thing for people to do, in fact I really, really, REALLY want to make it easy for people to do this (and not start paging people just because someone diagnosing a problem typed “lshw” or something)
For lshw specifically, this is fixed in c9s, Fedora, and upstream now has an option to build with this feature disabled:
- https://gitlab.com/redhat/centos-stream/rpms/lshw/-/merge_requests/3
- https://bugzilla.redhat.com/show_bug.cgi?id=2098463
- https://src.fedoraproject.org/rpms/lshw/pull-request/1
- https://github.com/lyonel/lshw/pull/86
Now, this example is obviously not that extreme or anything. It’s arguably less information than what’s in your average `curl http://foo` request.
But the burden we put on our users is to evaluate each of these for themselves, in their deployment and security context: whether they are okay with a third party having that information, and whether they understand exactly what is being done, and what *could* be done with it. It sounds like a lot of work.
An example of this is the countme feature ( https://docs.fedoraproject.org/en-US/fedora-coreos/counting/ / https://docs.fedoraproject.org/en-US/infra/sysadmin_guide/dnf-counting/ / https://lwn.net/Articles/776327/ ), which is on by default in Fedora (on my at-home personal Fedora machines too). I made a personal decision for my own machines, but when looking at it in the context of building the next (now current) version of Amazon Linux, I was faced with a choice: do we go through a process of independently working out what our customers' thoughts would be on this feature, be prepared to set up our own infrastructure around it, work out how we’d communicate about it, and ensure all of that meets the security and privacy bars we want to uphold... or do we just not enable it and spend that time on other things? We chose to spend the time on other things, as setting this up was not critical for us.
But what was fantastic about this was that Fedora was very very very clear about the change, how it worked, the efforts gone to etc, and it was so easy to flip on/off and was really just in one place, and a place we would *have* to modify when we started building our own distro.
On Sat, Jul 22 2023 at 02:44:30 AM +0000, "Smith, Stewart via devel" devel@lists.fedoraproject.org wrote:
I’d almost prefer we work out a policy where anything of the sort is disabled by default, and with a distro-wide standard bcond to not even compile it in as an option. (No, I don’t quite know how that could be worded sensibly as a policy…. but it’s where I think I’d prefer to start from).
You can just not package the eos- packages (eos-metrics, eos-event-recorder-daemon, eos-metrics-instrumentation). eos-event-recorder-daemon is the package that actually sends metrics. Without that, no metrics. And nothing should have a hard dependency on it, so no bconds should be needed. If you have some denylist somewhere that throws an error if an unwanted package exists, that should robustly ensure it's never enabled.
For everything else, the test for whether to send metrics is "is the event recorder bus name owned?" so no conditional compilation or bconds are needed.
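As a rough sketch of the denylist idea mentioned above, a downstream build could check its package set against the three eos- package names and fail loudly if any of them is present. The package names come from this message; `check_package_set` and `UnwantedPackageError` are hypothetical helpers, and how the list of installed packages is obtained is left to the caller.

```python
DENYLIST = frozenset({
    "eos-metrics",
    "eos-event-recorder-daemon",
    "eos-metrics-instrumentation",
})


class UnwantedPackageError(Exception):
    """Raised when a denylisted package is found in the package set."""


def check_package_set(installed, denylist=DENYLIST):
    """Raise UnwantedPackageError if any denylisted package is installed."""
    unwanted = sorted(set(installed) & denylist)
    if unwanted:
        raise UnwantedPackageError(
            "unwanted packages present: " + ", ".join(unwanted)
        )


# Usage: a clean package set passes silently.
check_package_set(["bash", "glibc", "gnome-shell"])
```

Since eos-event-recorder-daemon is described as the component that actually sends metrics, a check like this on that one package would already be enough to guarantee nothing is uploaded.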
As a non-user of Gnome 3 who normally never runs any Gnome 3 settings programs, I get the impression that Fedora 40 will begin accumulating unused metrics somewhere in the filesystem. To prevent a constantly growing waste of storage space, I'll have to run one of two Gnome 3 settings programs – which may or may not require starting a Gnome 3 desktop session – and find the right switch to either turn on uploading or turn off collection. I'll have to remember to do that after upgrading around a year from now, and also on any new installations in the distant future.
If my impression is wrong, then the change proposal needs to be amended.
Björn Persson
So this change is for the Workstation ISO only? The other spins won't have this unwanted change?
On Thu, Jul 6 2023 at 11:08:15 PM +0200, Björn Persson Bjorn@xn--rombobjrn-67a.se wrote:
As a non-user of Gnome 3 who normally never runs any Gnome 3 settings programs, I get the impression that Fedora 40 will begin accumulating unused metrics somewhere in the filesystem. To prevent a constantly growing waste of storage space, I'll have to run one of two Gnome 3 settings programs – which may or may not require starting a Gnome 3 desktop session – and find the right switch to either turn on uploading or turn off collection. I'll have to remember to do that after upgrading around a year from now, and also on any new installations in the distant future.
If my impression is wrong, then the change proposal needs to be amended.
Well this change proposal is for Fedora Workstation specifically. That's in the title. :) I would envision installing eos-event-recorder-daemon via a Recommends: from the gnome-control-center and gnome-initial-setup packages (and probably also by adding it to the workstation-product comps group), so if you don't have gnome-initial-setup or gnome-control-center installed, you wouldn't get it on upgrade. I'm not sure whether I want to amend this level of detail into the change proposal in case we might want to change the specifics of how it gets installed, but that's just to give you an idea of what I'm thinking currently. Certainly the metrics components should not be installed for non-GNOME users as part of this change proposal.
However, I've heard that Fedora KDE might also be interested in adding metrics once we have this working in Workstation. But that would be up to the people contributing to Fedora KDE and would need to be proposed separately.
I think eos-event-recorder-daemon uses some sort of ring buffer to eventually discard old events, so that storage space does not increase forever and should not become an issue? But please don't quote me on this; I have a lot of comments to respond to, and I'm not super familiar with the code, and I don't want to dive in to look at how it works right now. If there's really an issue with space growing without bound, then that's a bug we should fix, but I don't think it's so.
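To illustrate the ring-buffer idea mentioned above, here is a minimal Python sketch of a bounded event store that discards the oldest events once it is full. It is only an assumption about the general technique, not the actual eos-event-recorder-daemon code; the class name `EventRingBuffer` and the capacity value are hypothetical.

```python
from collections import deque


class EventRingBuffer:
    """Hypothetical bounded event store (not the daemon's real implementation)."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        # A deque with maxlen silently drops the oldest item when full.
        self._events = deque(maxlen=capacity)

    def record(self, event):
        self._events.append(event)

    def snapshot(self):
        return list(self._events)


buf = EventRingBuffer(capacity=3)
for i in range(5):
    buf.record(f"event-{i}")
print(buf.snapshot())  # ['event-2', 'event-3', 'event-4']
```

With a fixed capacity, storage use stays bounded no matter how long events keep arriving without being uploaded.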
(BTW, the GNOME 3 era concluded with the release of GNOME 40 in Fedora 34, so I wouldn't expect Fedora users to still be using GNOME 3. :)
Michael
On Thu, Jul 6, 2023 at 8:53 PM Michael Catanzaro mcatanzaro@redhat.com wrote:
On Thu, Jul 6 2023 at 11:08:15 PM +0200, Björn Persson Bjorn@xn--rombobjrn-67a.se wrote:
As a non-user of Gnome 3 who normally never runs any Gnome 3 settings programs, I get the impression that Fedora 40 will begin accumulating unused metrics somewhere in the filesystem. To prevent a constantly growing waste of storage space, I'll have to run one of two Gnome 3 settings programs – which may or may not require starting a Gnome 3 desktop session – and find the right switch to either turn on uploading or turn off collection. I'll have to remember to do that after upgrading around a year from now, and also on any new installations in the distant future.
If my impression is wrong, then the change proposal needs to be amended.
Well this change proposal is for Fedora Workstation specifically. That's in the title. :) I would envision installing eos-event-recorder-daemon via a Recommends: from the gnome-control-center and gnome-initial-setup packages (and probably also by adding it to the workstation-product comps group), so if you don't have gnome-initial-setup or gnome-control-center installed, you wouldn't get it on upgrade. I'm not sure whether I want to amend this level of detail into the change proposal in case we might want to change the specifics of how it gets installed, but that's just to give you an idea of what I'm thinking currently. Certainly the metrics components should not be installed for non-GNOME users as part of this change proposal.
However, I've heard that Fedora KDE might also be interested in adding metrics once we have this working in Workstation. But that would be up to the people contributing to Fedora KDE and would need to be proposed separately.
I'm interested from the Fedora KDE side, but I don't want to implement it until we have our own equivalent of GNOME Initial Setup working that would let us present all the knobs on first boot. Without that, it feels pretty sketchy to me.
I also don't have a good handle on what this thing records, and what metrics I would *want* to enable for it to record. I would also be generally interested in what this can do from a holistic Fedora point of view and how accessible the data will be to the project.
Since Workstation presents the configuration knob for this at first boot with GNOME Initial Setup, I feel that is a good place to ensure people get an informed (non)consent of metrics gathering so they can make a decision of whether to leave it enabled.
I think eos-event-recorder-daemon uses some sort of ring buffer to eventually discard old events, so that storage space does not increase forever and should not become an issue? But please don't quote me on this; I have a lot of comments to respond to, and I'm not super familiar with the code, and I don't want to dive in to look at how it works right now. If there's really an issue with space growing without bound, then that's a bug we should fix, but I don't think it's so.
(BTW, the GNOME 3 era concluded with the release of GNOME 40 in Fedora 34, so I wouldn't expect Fedora users to still be using GNOME 3. :)
From my perspective, it's still the GNOME 3 era, as there hasn't been a significant redesign of the UX to warrant distinguishing it.
On Thu, Jul 06, 2023 at 19:53:12 -0500, Michael Catanzaro mcatanzaro@redhat.com wrote:
Well this change proposal is for Fedora Workstation specifically. That's in the title. :) I would envision installing eos-event-recorder-daemon via a Recommends: from the gnome-control-center and gnome-initial-setup packages (and probably also by adding it to the workstation-product comps group), so if you don't have gnome-initial-setup or gnome-control-center installed, you wouldn't get it on upgrade. I'm not sure whether I want to amend this level of detail into the change proposal in case we might want to change the specifics of how it gets installed, but that's just to give you an idea of what I'm thinking currently. Certainly the metrics components should not be installed for non-GNOME users as part of this change proposal.
Is there going to be a recommended way to not accidentally install this stuff? I'm guessing the least work (for Fedora) would be to blacklist the key packages in the repo files. Making available a package that conflicts with them could be done, but it could accidentally get removed during an --allowerasing change. But this might be easier when doing installs.
On Fri, Jul 7 2023 at 12:25:12 PM -0500, Bruno Wolff III bruno@wolff.to wrote:
Is there going to be a recommended way to not accidentally install this stuff? I'm guessing the least work (for Fedora) would be to blacklist the key packages in the repo files. Making available a package that conflicts with them could be done, but it could accidentally get removed during an --allowerasing change. But this might be easier when doing installs.
Well I wouldn't necessarily expect it to be easy to install by mistake, but I do want to make sure it's not harmful if that happens somehow. So even if the packages are installed, they're still not going to upload metrics to Fedora without further user consent.
The local collection is a bit of a hole, but I like your suggestion to put a short time limit on that. Perhaps we can collect for something like one hour locally, then delete if the user has not consented to upload before then. Something like that.
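The time-limit idea above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a proposed implementation: the class `ExpiringLocalBuffer`, the one-hour default, and the injectable `clock` parameter are all hypothetical names chosen for the example.

```python
import time


class ExpiringLocalBuffer:
    """Hypothetical local event store that is wiped unless consent arrives in time."""

    def __init__(self, grace_seconds=3600.0, clock=time.monotonic):
        self._clock = clock
        self._deadline = clock() + grace_seconds
        self._events = []
        self._consented = False

    def _drop_if_expired(self):
        # Without consent, everything collected is deleted once the deadline passes.
        if not self._consented and self._clock() >= self._deadline:
            self._events.clear()

    def grant_consent(self):
        # Late consent keeps nothing from before the deadline.
        self._drop_if_expired()
        self._consented = True

    def record(self, event):
        self._drop_if_expired()
        # After the deadline, events are only kept if the user has consented.
        if self._consented or self._clock() < self._deadline:
            self._events.append(event)

    def pending(self):
        self._drop_if_expired()
        return list(self._events)


# Usage with a fake clock so the example does not need to wait an hour.
now = [0.0]
buf = ExpiringLocalBuffer(grace_seconds=3600, clock=lambda: now[0])
buf.record("app-launched")
now[0] = 3601.0
print(buf.pending())  # []
```

Passing the clock in as a parameter keeps the expiry logic easy to test; a real daemon would more likely use a timer.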
Michael
On Fri, Jul 07, 2023 at 16:15:14 -0500, Michael Catanzaro mcatanzaro@redhat.com wrote:
The local collection is a bit of a hole, but I like your suggestion to put a short time limit on that. Perhaps we can collect for something like one hour locally, then delete if the user has not consented to upload before then. Something like that.
I think that would be an improvement.
On Friday, 07 July 2023 at 23:15, Michael Catanzaro wrote: [...]
The local collection is a bit of a hole, but I like your suggestion to put a short time limit on that. Perhaps we can collect for something like one hour locally, then delete if the user has not consented to upload before then. Something like that.
This is still collecting without consent. Can I look inside your bedroom and take pictures for something like one hour and then delete them if you haven't consented? This is ridiculous. Please stop even considering doing opt-out collection of any data.
Regards, Dominik
From what I read, the metrics accumulation has an option to turn off the collection, as well as the transmission
Michael Catanzaro wrote:
I would envision installing eos-event-recorder-daemon via a Recommends: from the gnome-control-center and gnome-initial-setup packages (and probably also by adding it to the workstation-product comps group), so if you don't have gnome-initial-setup or gnome-control-center installed, you wouldn't get it on upgrade.
I don't seem to have a package named gnome-initial-setup installed. gnome-control-center is installed, but fortunately it looks like I can remove it without losing anything important. I don't know what pulled in gnome-control-center or when, but I used XFCE for many years (until it became unusable on my laptop and drove me over to LXQT), and XFCE had ties to various gnomy things.
Certainly the metrics components should not be installed for non-GNOME users as part of this change proposal.
Having some package installed is not the same thing as using a particular desktop environment. There are many possible reasons why packages get installed, and they won't always get removed when they're no longer needed. Among more than 4000 installed packages, there are surely several I'm not actually using, but examining them all to determine which ones can be removed would take a lot of work.
I think eos-event-recorder-daemon uses some sort of ring buffer to eventually discard old events, so that storage space does not increase forever and should not become an issue?
That should make it somewhat less of a problem if it is so. It should of course be verified before data gathering is turned on.
(BTW, the GNOME 3 era concluded with the release of GNOME 40 in Fedora 34, so I wouldn't expect Fedora users to still be using GNOME 3. :)
I need some way to distinguish between the Gnome that once was and the very different thing that took over the name "Gnome".
Björn Persson
On Thu, Jul 06, 2023 at 05:10:24PM +0100, Aoife Moloney wrote:
== Summary ==
The Red Hat Display Systems Team (which develops the desktop) proposes to enable limited data collection of anonymous Fedora Workstation usage metrics.
Given the detailed proposal, it's probably too late now for any fundamental changes, but there's a formal research area called Differential Privacy [1] that deals with the collection of user data in such a way that it preserves the privacy of each participating individual.
Have you guys, by any chance, considered looking into that for some inspiration?
Either way, if anyone is curious, there's a nice and easy-to-read write up on the key concepts: https://desfontain.es/privacy/differential-privacy-awesomeness.html
A specific set of algorithms (RAPPOR) for collecting arbitrary user strings that preserves Differential Privacy has been proposed (and implemented) by Google a while back, too: http://arxiv.org/abs/1407.6981 https://github.com/google/rappor
On Thu, Jul 06, 2023 at 11:33:03PM +0200, Michal Domonkos wrote:
changes, but there's a formal research area called Differential Privacy [1]
Oops, forgot the link:
[1] https://en.wikipedia.org/wiki/Differential_privacy
On Thu, Jul 6 2023 at 11:33:03 PM +0200, Michal Domonkos mdomonko@redhat.com wrote:
Given the detailed proposal, it's probably too late now for any fundamental changes, but there's a formal research area called Differential Privacy [1] that deals with the collection of user data in such a way that it preserves the privacy of each participating individual.
No, it's not too late for fundamental changes. Big changes would make this harder and take longer, but we're still very early on here. If the Fedora community wants to completely throw out the Endless system and use something else instead, that would be sad since it would mean more work for me, but we're still at the point where that's possible. (I'd *much* rather make changes to the existing system to adapt it to our needs, though. :)
But remember we do not want to keep information about individuals in the data set in the first place. It's easier to dodge privacy concerns if we just don't store such associations at all.
As for differential privacy, I'm quite unfamiliar with this topic so I don't know to what extent it could be useful, but Endless is interested in adding randomized response [1], where say 50% of the data sent is fake and the other half is accurate. This only works for boolean and possibly integer data, but it would make it even harder to deanonymize reported data. But that is not supported yet.
[1] https://blogs.gnome.org/wjjt/2023/07/05/endless-oss-privacy-preserving-metri...
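For readers unfamiliar with randomized response, the following Python sketch shows the general technique for a single boolean metric: each client reports the truth only half the time and a coin flip otherwise, and the collector corrects for the known noise in aggregate. This is a textbook illustration, not Endless's design (which, as noted, is not supported yet); the function names and the 50% parameter are assumptions for the example.

```python
import random


def randomized_response(truth, rng, p_truth=0.5):
    """Report the true value with probability p_truth, otherwise a fair coin flip."""
    if rng.random() < p_truth:
        return truth
    return rng.random() < 0.5


def estimate_true_rate(reports, p_truth=0.5):
    """Invert the noise: observed = p_truth * true_rate + (1 - p_truth) * 0.5."""
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) * 0.5) / p_truth


# Simulate 100,000 clients, 30% of whom truly answer "yes".
rng = random.Random(42)
population = [i < 30_000 for i in range(100_000)]
reports = [randomized_response(answer, rng) for answer in population]
estimate = estimate_true_rate(reports)
print(f"estimated rate: {estimate:.3f}")  # close to 0.300
```

No individual report can be trusted on its own, yet the aggregate estimate stays close to the real rate when the population is large.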
Have you guys, by any chance, considered looking into that for some inspiration?
Either way, if anyone is curious, there's a nice and easy-to-read write up on the key concepts: https://desfontain.es/privacy/differential-privacy-awesomeness.html
I will add that to my reading list. Certainly it seems a lot less intimidating than the Wikipedia article. ;)
A specific set of algorithms (RAPPOR) for collecting arbitrary user strings that preserves Differential Privacy has been proposed (and implemented) by Google a while back, too: http://arxiv.org/abs/1407.6981 https://github.com/google/rappor
Wow. I'll add this to my reading list too, although it remains to be seen whether I'll be able to understand it. :D
Michael
On Thu, Jul 06, 2023 at 08:08:05PM -0500, Michael Catanzaro wrote:
But remember we do not want to keep information about individuals in the data set in the first place. It's easier to dodge privacy concerns if we just don't store such associations at all.
Sure, but the data still needs to leave a user's system at some point and that's where you have to trust the aggregator (the Fedora project in this case, I suppose) that it's not stored verbatim.
Or, apply a DP technique locally, before it leaves the system. Randomized response, which you mentioned, is actually one such technique.
In a way, you already trust the distribution by the very nature of it, e.g. the signatures in packages you install. DP just provides a framework in which you can formally quantify the risk of de-masking an individual user from a given data set, and concrete strategies to employ to minimize that risk.
Actually this exact problem is discussed in the blog post series I shared, specifically in this part:
https://desfontain.es/privacy/local-global-differential-privacy.html
As for differential privacy, I'm quite unfamiliar with this topic so I don't know to what extent it could be useful, but Endless is interested in adding randomized response [1], where say 50% of the data sent is fake and the other half is accurate. This only works for boolean and possibly integer data, but it would make it even harder to deanonymize reported data. But that is not supported yet.
Indeed, randomized response is one of the DP-aware techniques (it's also mentioned in that blog series) :) And RAPPOR is basically just randomized response but generalized to arbitrary strings (using this fancy thing called Bloom filters [1]).
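As background for the Bloom filter mention, here is a minimal Python sketch of the data structure itself: a string is hashed to a few bit positions in a fixed-size bit field. RAPPOR then applies randomized response to those bits before reporting; that step is deliberately not shown here, so this is not an implementation of RAPPOR. The sizes and the SHA-256-based hashing scheme are assumptions chosen for the illustration.

```python
import hashlib


class BloomFilter:
    """Tiny illustrative Bloom filter; parameters are arbitrary for the example."""

    def __init__(self, num_bits=64, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit field stored in a Python int

    def _positions(self, item):
        # Derive num_hashes positions by salting the item with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False means "definitely not added"; True means "possibly added".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


bf = BloomFilter()
bf.add("org.gnome.Builder")
print(bf.might_contain("org.gnome.Builder"))  # True
```

A Bloom filter never gives false negatives but can give false positives, which is part of why the encoded string is hard to recover exactly.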
I will add that to my reading list. Certainly it seems a lot less intimidating than the Wikipedia article. ;)
Yup, the Wikipedia article isn't very helpful. There are much better resources, including a bunch of talks on YouTube from the researchers themselves (e.g. Cynthia Dwork).
Wow. I'll add this to my reading list too, although it remains to be seen whether I'll be able to understand it. :D
Yeah, the RAPPOR paper is an interesting read but pretty dense and math-heavy (although not as much as it might seem at first glance). I did *try* to read it at some point and actually managed to understand the key concepts which aren't *that* complicated. But I can't blame anybody for not wanting to go down that path after they skim through it and see those formulas and charts, really :D
I went into this DP rabbit hole myself when I was working on the DNF Countme [2] implementation a few years back, and even if it wasn't directly applicable in the end, it did inspire me to add a form of "randomized response" there, to spread the countme events from a single system randomly across a week's time window so that no usage patterns of that particular system (e.g. the typical uptime hours) could emerge if someone were to inspect the HTTP requests with the countme flag coming from the same system aggregated over a long period of time. Pretty theoretical and, in retrospect, rather unlikely and paranoid, but it was easy to add that logic so I did, just for the peace of mind :)
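The idea of spreading an event randomly across a week's window can be sketched like this in Python. It only illustrates the concept described above and is not the actual DNF Countme code; `pick_report_time` and the epoch-seconds representation are hypothetical.

```python
import random

WEEK_SECONDS = 7 * 24 * 3600


def pick_report_time(window_start, rng):
    """Pick a uniformly random second inside the week starting at window_start."""
    return window_start + rng.randrange(WEEK_SECONDS)


rng = random.Random()
window_start = 1_688_342_400  # arbitrary example timestamp (epoch seconds)
report_at = pick_report_time(window_start, rng)
print(window_start <= report_at < window_start + WEEK_SECONDS)  # True
```

Because the offset is random, the time at which the event shows up in server logs says little about when the system is usually powered on.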
I haven't kept up with the latest developments in DP since then, though, and have blissfully forgotten most of it, too. But it sparked my interest back then and I certainly thought that if Fedora ever decides that it wants some kind of "telemetry", *this* is the (only acceptable) way to do it.
Which doesn't mean there aren't other ways, or that the approach taken by Endless (which you'd like to adopt) is wrong, of course. These were just my 2 cents :)
FWIW, it seems like various tech companies and software projects make use of DP (at least that's what the Wikipedia article claims). Google Chrome and MS Windows are among those, amusingly, despite their reputation.
[1] https://en.wikipedia.org/wiki/Bloom_filter
[2] https://fedoraproject.org/wiki/Changes/DNF_Better_Counting
On Thu, Jul 06, 2023 at 08:08:05PM -0500, Michael Catanzaro wrote:
... that would be sad since it would mean more work for me, but we're still at the point where that's possible. (I'd *much* rather make changes to the existing system to adapt it to our needs, though. :)
Oh, and I didn't mean to suggest adding more work or reworking your existing plans, don't get me wrong :)
And absolutely, using an *existing* (and tried) system and adapting that to our needs sounds like a much better idea than scratching all your plans and looking for something else, especially if that *something* isn't even that obvious.
On Thu, Jul 06, 2023 at 05:10:24PM +0100, Aoife Moloney wrote:
Important process note: we are experimenting with using Fedora Discussion as part of the Changes process. Change announcements (like the one you are reading right now) will still be sent to the devel-announce mailing list, but the conversation about each change will take place on Fedora Discussion at https://discussion.fedoraproject.org/t/f40-change-request-privacy-preserving...
Why? This was discussed a while back and a number of problems with Discourse were covered, and to my knowledge none of them have been fixed.
== Summary ==
The Red Hat Display Systems Team (which develops the desktop) proposes to enable limited data collection of anonymous Fedora Workstation usage metrics.
Fedora is an open source community project, and nobody is interested in violating user privacy. We do not want to collect data about individual users. We want to collect only aggregate usage metrics that are actually needed to achieve specific Fedora improvement objectives, and no more. We understand that if we violate our users' trust, then we won't have many users left, so if metrics collection is approved, we will need to be very careful to roll this out in a way that respects our users at all times. (For example, we should not collect users' search queries, because that would be creepy.)
This also keeps coming up and the answer is again, no! There's no such thing as anonymous data collection, people don't want it, it must not be enabled by default (making it useless to you), it's probably illegal in Europe, so stop asking for it.
Rich.
On 07.07.23 at 12:19, Richard W.M. Jones wrote:
On Thu, Jul 06, 2023 at 05:10:24PM +0100, Aoife Moloney wrote:
Important process note: we are experimenting with using Fedora Discussion as part of the Changes process. Change announcements (like the one you are reading right now) will still be sent to the devel-announce mailing list, but the conversation about each change will take place on Fedora Discussion at https://discussion.fedoraproject.org/t/f40-change-request-privacy-preserving...
Why? This was discussed a while back and a number of problems with Discourse were covered, and to my knowledge none of them have been fixed.
== Summary ==
The Red Hat Display Systems Team (which develops the desktop) proposes to enable limited data collection of anonymous Fedora Workstation usage metrics.
Fedora is an open source community project, and nobody is interested in violating user privacy. We do not want to collect data about individual users. We want to collect only aggregate usage metrics that are actually needed to achieve specific Fedora improvement objectives, and no more. We understand that if we violate our users' trust, then we won't have many users left, so if metrics collection is approved, we will need to be very careful to roll this out in a way that respects our users at all times. (For example, we should not collect users' search queries, because that would be creepy.)
This also keeps coming up and the answer is again, no! There's no such thing as anonymous data collection, people don't want it, it must not be enabled by default (making it useless to you), it's probably illegal in Europe, so stop asking for it.
+1
General Data Protection Regulation in EU law.
"... consent can't be implied and must always be given through an opt-in ..."
On Fri, Jul 07, 2023 at 12:41:00PM +0200, Leon Fauster via devel wrote:
On 07.07.23 at 12:19, Richard W.M. Jones wrote:
On Thu, Jul 06, 2023 at 05:10:24PM +0100, Aoife Moloney wrote:
Important process note: we are experimenting with using Fedora Discussion as part of the Changes process. Change announcements (like the one you are reading right now) will still be sent to the devel-announce mailing list, but the conversation about each change will take place on Fedora Discussion at https://discussion.fedoraproject.org/t/f40-change-request-privacy-preserving...
Why? This was discussed a while back and a number of problems with Discourse were covered, and to my knowledge none of them have been fixed.
== Summary ==
The Red Hat Display Systems Team (which develops the desktop) proposes to enable limited data collection of anonymous Fedora Workstation usage metrics.
Fedora is an open source community project, and nobody is interested in violating user privacy. We do not want to collect data about individual users. We want to collect only aggregate usage metrics that are actually needed to achieve specific Fedora improvement objectives, and no more. We understand that if we violate our users' trust, then we won't have many users left, so if metrics collection is approved, we will need to be very careful to roll this out in a way that respects our users at all times. (For example, we should not collect users' search queries, because that would be creepy.)
This also keeps coming up and the answer is again, no! There's no such thing as anonymous data collection, people don't want it, it must not be enabled by default (making it useless to you), it's probably illegal in Europe, so stop asking for it.
+1
General Data Protection Regulation in EU law.
"... consent can't be implied and must always be given through an opt-in ..."
Note the proposal at the top of the thread directly addresses this opt-in vs opt-out Q wrt GDPR compliance:
[quote] Fedora Legal has determined that if we collect any personally-identifiable data, the entire metrics system must be opt-in. Since we are only interested in opt-out metrics due to the low value of opt-in metrics, we must accordingly never collect any personally-identifiable data. We must also not collect any data that could become personally-identifiable if combined with other data, which notably means IP addresses must not be stored. We only want to collect anonymous data anyway, but we need to be especially mindful of the possibility that combining two "anonymous" data points could result in the data no longer being anonymous. [/quote]
IOW, the intention is to avoid triggering GDPR obligations by not collecting (potentially) personally identifiable data.
The last sentence, though, hints at how tricky this can be to put into practice.
Combining anonymous data sets can be surprisingly effective at producing metrics that could uniquely identify users - it is the heart of online advertisement targeting techniques after all.
With regards, Daniel
On 7/6/23 12:10, Aoife Moloney wrote:
That said, Fedora Legal has determined that if we collect any personally-identifiable data, the entire metrics system must be opt-in. Since we are only interested in opt-out metrics due to the low value of opt-in metrics, we must accordingly never collect any personally-identifiable data.
I oppose any telemetry that is not opt-in, but I also do not think that what this proposal is suggesting is possible to implement.
For metrics to not be personally identifiable, it is necessary that the set of metrics collected have sufficiently low entropy that on average, _many_ users will send _the exact same metrics_. It is very hard for me to see any useful set of metrics having such low entropy.
If Fedora has 2 million users (possibly an overestimate) then the metrics would need to have entropy much less than 2^21, which means that the entire metrics set would need to be able to be represented as a 20-bit integer. In practice, I suspect one would need to fit the entire set in a 16-bit integer or less, and possibly _significantly_ less.
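The arithmetic behind this estimate can be checked with a short Python sketch. It assumes, as a best case, that metric values are spread uniformly across users; the 2 million figure comes from the message above, and the helper name `average_anonymity_set` is hypothetical.

```python
import math

users = 2_000_000

# Bits needed to give every user a distinct value.
bits_to_single_out = math.log2(users)
print(f"{bits_to_single_out:.2f} bits")  # 20.93 bits


def average_anonymity_set(user_count, metric_bits):
    """Average number of users sharing one value, assuming a uniform spread."""
    return user_count / 2 ** metric_bits


for bits in (20, 16, 12):
    print(bits, round(average_anonymity_set(users, bits), 1))
```

At 20 bits the average group is under two users, while at 16 bits it is roughly thirty, which is why the message argues the metric set would have to be much smaller than 20 bits. Real-world values are not uniform, so rare combinations would have even smaller groups.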
On Sat, Jul 8, 2023, at 03:21, Demi Marie Obenour wrote:
If Fedora has 2 million users (possibly an overestimate) then the metrics would need to have entropy much less than 2^21, which means that the entire metrics set would need to be able to be represented as a 20-bit integer. In practice, I suspect one would need to fit the entire set in a 16-bit integer or less, and possibly _significantly_ less.
I see these numbers as over-optimistic, since:
* The change will only apply to Fedora Workstation users, not all Fedora users
* Many pro users (probably a big percentage of the total Fedora users, due to the nature of Fedora) will disable the telemetry
Overall, my pov is: this change is acceptable only if it is 100% opt-in. Opt-out is not a valid way of saying "user consent", and can also be considered a dark pattern (illegal in many places, immoral everywhere).
Best, Fale
+1
Yes this has been mentioned many times on the thread. You can't say the user has consented but also have it opt-out. Saying that opt-in data isn't useful because most users won't opt-in is implying the desire of a dark pattern to encourage more data collection.
On Fri, Jul 7 2023 at 09:21:15 PM -0400, Demi Marie Obenour demiobenour@gmail.com wrote:
For metrics to not be personally identifiable, it is necessary that the set of metrics collected have sufficiently low entropy that on average, _many_ users will send _the exact same metrics_. It is very hard for me to see any useful set of metrics having such low entropy.
If Fedora has 2 million users (possibly an overestimate) then the metrics would need to have entropy much less than 2^21, which means that the entire metrics set would need to be able to be represented as a 20-bit integer. In practice, I suspect one would need to fit the entire set in a 16-bit integer or less, and possibly _significantly_ less.
We're not going to build creepy user profiles. Particular metrics will be stored individually, not correlated together.
Let's say we have two metrics:
Key                                | Value
-----------------------------------|------
User launched GNOME Builder today? | y/n
User has NVIDIA proprietary driver | y/n
We would know how many users launched Builder and how many users have NVIDIA graphics, but we wouldn't know how many NVIDIA users launched Builder because there's just no need to tie those two data points together.
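A rough Python sketch of what "stored individually, not correlated together" could look like on the collecting side: each report carries exactly one metric and no client identifier, so only per-metric totals exist. This is an assumption about the general approach, not the actual server design; the metric names and `aggregate_independently` are hypothetical.

```python
from collections import Counter


def aggregate_independently(reports):
    """Tally (metric_name, value) pairs; no report links two metrics together."""
    totals = {}
    for metric, value in reports:
        totals.setdefault(metric, Counter())[value] += 1
    return totals


# Each tuple is a separate, unlinked submission.
reports = [
    ("builder_launched_today", True),
    ("nvidia_proprietary_driver", True),
    ("builder_launched_today", False),
    ("nvidia_proprietary_driver", False),
    ("nvidia_proprietary_driver", True),
]
totals = aggregate_independently(reports)
print(totals["builder_launched_today"][True])     # 1
print(totals["nvidia_proprietary_driver"][True])  # 2
```

Since no record contains both keys, the question "how many NVIDIA users launched Builder?" cannot be answered from this data.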
Michael
On Thu, Jul 06, 2023 at 05:10:24PM +0100, Aoife Moloney wrote:
== Summary ==
The Red Hat Display Systems Team (which develops the desktop) proposes to enable limited data collection of anonymous Fedora Workstation usage metrics.
One thing to realize here is that, no matter what collection method will be used and how well it will be secured against potential malicious actors, the reputation of Fedora *will* be harmed or at least tainted. And it won't be easy to undo that.
Even if we end up using mathematically sound techniques as per Differential Privacy (as I suggested in my other reply), most users won't know/realize that and will only see the words "telemetry" and "Fedora" alongside each other in all those discussions and articles that will inevitably pop up as a result of this change.
I think the reputation of Fedora as a project shouldn't be taken lightly, regardless of the actual implementation, and should be weighed against the benefits that it would bring to the project. I'd say a huge portion of the user base in Fedora consists of technical people who actively despise the notion of any kind of "phone home" mechanism on their system (me included), and for good reason. It's also evidenced by this thread so far.
The problem, as noted in this thread multiple times, is that if we make this opt-in, the usefulness would decrease to the point of being almost irrelevant. If we make it opt-out, all the above applies (IMHO).
Consider that even those big software companies couldn't prevent their products from getting the bad reputation, despite some of them reportedly using Differential Privacy (!).
On 7/8/23 06:19, Michal Domonkos wrote:
On Thu, Jul 06, 2023 at 05:10:24PM +0100, Aoife Moloney wrote:
== Summary ==
The Red Hat Display Systems Team (which develops the desktop) proposes to enable limited data collection of anonymous Fedora Workstation usage metrics.
One thing to realize here is that, no matter what collection method will be used and how well it will be secured against potential malicious actors, the reputation of Fedora *will* be harmed or at least tainted. And it won't be easy to undo that.
Even if we end up using mathematically sound techniques as per Differential Privacy (as I suggested in my other reply), most users won't know/realize that and will only see the words "telemetry" and "Fedora" alongside each other in all those discussions and articles that will inevitably pop up as a result of this change.
I think the reputation of Fedora as a project shouldn't be taken lightly, regardless of the actual implementation, and should be weighed against the benefits that it would bring to the project. I'd say a huge portion of the user base in Fedora consists of technical people who actively despise the notion of any kind of "phone home" mechanism on their system (me included), and for good reason. It's also evidenced by this thread so far.
The problem, as noted in this thread multiple times, is that if we make this opt-in, its usefulness would drop to the point of being almost irrelevant. If we make it opt-out, all the above applies (IMHO).
Consider that even those big software companies couldn't prevent their products from getting a bad reputation, despite some of them reportedly using Differential Privacy (!).
I 100% agree with this. Even if it can be done in a way that preserves user privacy, the risk to Fedora’s reputation is simply not worth it.
On 06/07/2023 18:10, Aoife Moloney wrote:
but the conversation about each change will take place on Fedora Discussion at https://discussion.fedoraproject.org/t/f40-change-request-privacy-preserving...
It looks like they've started moving replies they don't like to other threads to cover up the flow of resentment that comes naturally to them.
That's why switching to Fedora Discussion from the mailing lists is a very bad idea: admins or RH staff can easily delete your comments or bury them in other threads.
On Sat, Jul 08, 2023 at 01:06:01PM +0200, Vitaly Zaitsev via devel wrote:
On 06/07/2023 18:10, Aoife Moloney wrote:
but the conversation about each change will take place on Fedora Discussion at https://discussion.fedoraproject.org/t/f40-change-request-privacy-preserving...
It looks like they've started moving replies they don't like to other threads to cover up the flow of resentment that comes naturally to them.
That's why switching to Fedora Discussion from the mailing lists is a very bad idea: admins or RH staff can easily delete your comments or bury them in other threads.
Well, no. That's not what's happening.
Moving part of the discussion to another thread actually makes it MORE visible to other people. There are links to it in the first post, and people who wouldn't want to read through all NNN posts can see a subtopic they want to discuss.
But if you are consuming via email, it... doesn't matter much. The entire discussion seems to stay in the same thread anyhow. (At least in my mail client.)
I'm not sure what you mean by "RH staff". Our Discourse instance is managed by Fedora community moderators. (Some of whom work for Red Hat, but they are part of the community too.)
kevin
On Saturday, 08 July 2023 at 19:39, Kevin Fenzi wrote:
On Sat, Jul 08, 2023 at 01:06:01PM +0200, Vitaly Zaitsev via devel wrote:
On 06/07/2023 18:10, Aoife Moloney wrote:
but the conversation about each change will take place on Fedora Discussion at https://discussion.fedoraproject.org/t/f40-change-request-privacy-preserving...
It looks like they've started moving replies they don't like to other threads to cover up the flow of resentment that comes naturally to them.
That's why switching to Fedora Discussion from the mailing lists is a very bad idea: admins or RH staff can easily delete your comments or bury them in other threads.
Well, no. That's not what's happening.
Moving part of the discussion to another thread actually makes it MORE visible to other people.
If by "MORE visible" you actually mean "inaccessible", then I agree. The opt-in/opt-out subtopic has been made private:
... (mattdm) Split this topic 1 day ago
97 posts were merged into an existing topic: Opt-in / Opt-Out? A breakout topic for the F40 Change Request on Privacy-preserving telemetry for Fedora Workstation
and the link leads to:
https://discussion.fedoraproject.org/t/opt-in-opt-out-a-breakout-topic-for-t...
Which, when I visit it, says "This page does not exist or is private".
If that's how it's supposed to work then I'll stay on the mailing list, thank you very much.
Regards, Dominik
On 7/8/23 19:48, Dominik 'Rathann' Mierzejewski wrote:
On Saturday, 08 July 2023 at 19:39, Kevin Fenzi wrote:
On Sat, Jul 08, 2023 at 01:06:01PM +0200, Vitaly Zaitsev via devel wrote:
On 06/07/2023 18:10, Aoife Moloney wrote:
but the conversation about each change will take place on Fedora Discussion at
...
97 posts were merged into an existing topic: Opt-in / Opt-Out? A breakout topic for the F40 Change Request on Privacy-preserving telemetry for Fedora Workstation
and the link leads to:
https://discussion.fedoraproject.org/t/opt-in-opt-out-a-breakout-topic-for-t...
Which, when I visit it, says "This page does not exist or is private".
If that's how it's supposed to work then I'll stay on the mailing list, thank you very much.
I don't think this is a result of the "evil Red Hat", more like a result of the particular post being moved back and forth, so the link became invalid. If you strip the post ID from the link, it'll work:
https://discussion.fedoraproject.org/t/opt-in-opt-out-a-breakout-topic-for-t...
As mentioned in the thread (and also in [0]), this is the first time we have used Discourse for such an active discussion, so some transient issues are understandable.
[0] https://discussion.fedoraproject.org/t/thoughts-about-the-earlier-proposal-t...
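The "strip the post ID" workaround mentioned above can be sketched as follows. This assumes the usual Discourse topic URL layout of `/t/<slug>/<topic_id>/<post_number>`; the example URL is hypothetical, since the real links in this thread are truncated.

```python
from urllib.parse import urlsplit, urlunsplit

def strip_post_number(url: str) -> str:
    """Drop a trailing post number from a Discourse-style topic URL.

    Assumed layout: /t/<slug>/<topic_id>/<post_number>. URLs that do not
    match that layout are returned with their path unchanged (apart from
    a trailing slash being removed).
    """
    parts = urlsplit(url)
    segments = parts.path.rstrip("/").split("/")
    # Expected segments: ["", "t", slug, topic_id, post_number]
    if (len(segments) == 5 and segments[1] == "t"
            and segments[3].isdigit() and segments[4].isdigit()):
        segments = segments[:4]
    return urlunsplit((parts.scheme, parts.netloc, "/".join(segments),
                       parts.query, parts.fragment))

# Hypothetical URL, for illustration only.
print(strip_post_number("https://discourse.example.org/t/some-topic/12345/67"))
```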
Il 08/07/23 13:06, Vitaly Zaitsev via devel ha scritto:
On 06/07/2023 18:10, Aoife Moloney wrote:
but the conversation about each change will take place on Fedora Discussion at https://discussion.fedoraproject.org/t/f40-change-request-privacy-preserving...
It looks like they've started moving replies they don't like to other threads to cover up the flow of resentment that comes naturally to them.
That's why switching to Fedora Discussion from the mailing lists is a very bad idea: admins or RH staff can easily delete your comments or bury them in other threads.
Can we please stop implying malevolence every time we don't agree with something?
BTW in the spirit of openness, I've set up a poll (UNOFFICIAL) to clearly state community sentiment about enabling OPT-OUT metrics to FESCO: https://discussion.fedoraproject.org/t/unofficial-poll-about-opt-out-metrics...
Just a simple question and a YES/NO reply.
Mattia
On 09/07/2023 08:59, Mattia Verga via devel wrote:
Can we please stop implying malevolence every time we don't agree with something?
What malevolence? All 4 of my replies are gone from the main thread. I can treat this as a censoring attempt by the RH staff. This is absolutely unacceptable for free projects like Fedora.
On 09/07/2023 08:59, Mattia Verga via devel wrote:
BTW in the spirit of openness, I've set up a poll (UNOFFICIAL) to clearly state community sentiment about enabling OPT-OUT metrics to FESCO: https://discussion.fedoraproject.org/t/unofficial-poll-about-opt-out-metrics...
Just a simple question and a YES/NO reply.
Sorry, but we can't trust an **ANONYMOUS** vote on a third-party platform. Admins or other people with access to the host can easily edit the SQL database and set 100500 votes for the YES option there.
You have already received a lot of feedback in several threads. FESCO can count these replies. Most of them overwhelmingly oppose this change.
On Sun, Jul 09, 2023 at 09:59:08AM +0200, Vitaly Zaitsev via devel wrote:
On 09/07/2023 08:59, Mattia Verga via devel wrote:
BTW in the spirit of openness, I've set up a poll (UNOFFICIAL) to clearly state community sentiment about enabling OPT-OUT metrics to FESCO: https://discussion.fedoraproject.org/t/unofficial-poll-about-opt-out-metrics...
Just a simple question and a YES/NO reply.
Sorry, but we can't trust an **ANONYMOUS** vote on a third-party platform. Admins or other people with access to the host can easily edit the SQL database and set 100500 votes for the YES option there.
Yes they could, but this is ridiculous.
On Sun, 09 Jul 2023 06:59:11 +0000 Mattia Verga via devel devel@lists.fedoraproject.org wrote:
Il 08/07/23 13:06, Vitaly Zaitsev via devel ha scritto:
On 06/07/2023 18:10, Aoife Moloney wrote:
but the conversation about each change will take place on Fedora Discussion at https://discussion.fedoraproject.org/t/f40-change-request-privacy-preserving...
It looks like they've started moving replies they don't like to other threads to cover up the flow of resentment that comes naturally to them.
That's why switching to Fedora Discussion from the mailing lists is a very bad idea: admins or RH staff can easily delete your comments or bury them in other threads.
Can we please stop implying malevolence every time we don't agree with something?
BTW in the spirit of openness, I've set up a poll (UNOFFICIAL) to clearly state community sentiment about enabling OPT-OUT metrics to FESCO: https://discussion.fedoraproject.org/t/unofficial-poll-about-opt-out-metrics...
How is that going to help anything, when some of us are using browsers from the Fedora repos that just get this answer:
"Unfortunately, your browser is unsupported. Please switch to a supported browser to view rich content, log in and reply."
That's why we still want mailing lists for this, as was clearly said in the last discussion about it.
Just a simple question and a YES/NO reply.
NO
On 7/9/23 18:53, Allan via devel wrote:
On Sun, 09 Jul 2023 06:59:11 +0000 Mattia Verga via devel devel@lists.fedoraproject.org wrote:
Il 08/07/23 13:06, Vitaly Zaitsev via devel ha scritto:
On 06/07/2023 18:10, Aoife Moloney wrote:
but the conversation about each change will take place on Fedora Discussion at https://discussion.fedoraproject.org/t/f40-change-request-privacy-preserving...
It looks like they've started moving replies they don't like to other threads to cover up the flow of resentment that comes naturally to them.
That's why switching to Fedora Discussion from the mailing lists is a very bad idea: admins or RH staff can easily delete your comments or bury them in other threads.
Can we please stop implying malevolence every time we don't agree with something?
BTW in the spirit of openness, I've set up a poll (UNOFFICIAL) to clearly state community sentiment about enabling OPT-OUT metrics to FESCO: https://discussion.fedoraproject.org/t/unofficial-poll-about-opt-out-metrics...
How is that going to help anything, when some of us are using browsers from the Fedora repos that just get this answer:
Which browser?
On Sun, 9 Jul 2023 18:54:18 -0400 Demi Marie Obenour demiobenour@gmail.com wrote:
On 7/9/23 18:53, Allan via devel wrote:
On Sun, 09 Jul 2023 06:59:11 +0000 Mattia Verga via devel devel@lists.fedoraproject.org wrote:
Il 08/07/23 13:06, Vitaly Zaitsev via devel ha scritto:
On 06/07/2023 18:10, Aoife Moloney wrote:
but the conversation about each change will take place on Fedora Discussion at https://discussion.fedoraproject.org/t/f40-change-request-privacy-preserving...
It looks like they've started moving replies they don't like to other threads to cover up the flow of resentment that comes naturally to them.
That's why switching to Fedora Discussion from the mailing lists is a very bad idea: admins or RH staff can easily delete your comments or bury them in other threads.
Can we please stop implying malevolence every time we don't agree with something?
BTW in the spirit of openness, I've set up a poll (UNOFFICIAL) to clearly state community sentiment about enabling OPT-OUT metrics to FESCO: https://discussion.fedoraproject.org/t/unofficial-poll-about-opt-out-metrics...
How is that going to help anything, when some of us are using browsers from the Fedora repos that just get this answer:
Which browser?
SeaMonkey, Falkon, maybe more...
On 7/9/23 19:08, Allan via devel wrote:
On Sun, 9 Jul 2023 18:54:18 -0400 Demi Marie Obenour demiobenour@gmail.com wrote:
On 7/9/23 18:53, Allan via devel wrote:
On Sun, 09 Jul 2023 06:59:11 +0000 Mattia Verga via devel devel@lists.fedoraproject.org wrote:
Il 08/07/23 13:06, Vitaly Zaitsev via devel ha scritto:
On 06/07/2023 18:10, Aoife Moloney wrote:
but the conversation about each change will take place on Fedora Discussion at https://discussion.fedoraproject.org/t/f40-change-request-privacy-preserving...
It looks like they've started moving replies they don't like to other threads to cover up the flow of resentment that comes naturally to them.
That's why switching to Fedora Discussion from the mailing lists is a very bad idea: admins or RH staff can easily delete your comments or bury them in other threads.
Can we please stop implying malevolence every time we don't agree with something?
BTW in the spirit of openness, I've set up a poll (UNOFFICIAL) to clearly state community sentiment about enabling OPT-OUT metrics to FESCO: https://discussion.fedoraproject.org/t/unofficial-poll-about-opt-out-metrics...
How is that going to help anything, when some of us are using browsers from the Fedora repos that just get this answer:
Which browser?
SeaMonkey, Falkon, maybe more...
SeaMonkey and Falkon are based on outdated versions of Firefox and Chromium respectively. Mozilla stopped issuing security advisories for SeaMonkey back in 2015, and QtWebEngine (used by Falkon) was a month or more behind upstream Chromium last I checked.
On Sun, Jul 9, 2023 at 8:51 PM Demi Marie Obenour demiobenour@gmail.com wrote:
On 7/9/23 19:08, Allan via devel wrote:
On Sun, 9 Jul 2023 18:54:18 -0400 Demi Marie Obenour demiobenour@gmail.com wrote:
On 7/9/23 18:53, Allan via devel wrote:
On Sun, 09 Jul 2023 06:59:11 +0000 Mattia Verga via devel devel@lists.fedoraproject.org wrote:
Il 08/07/23 13:06, Vitaly Zaitsev via devel ha scritto:
On 06/07/2023 18:10, Aoife Moloney wrote:
but the conversation about each change will take place on Fedora Discussion at https://discussion.fedoraproject.org/t/f40-change-request-privacy-preserving...
It looks like they've started moving replies they don't like to other threads to cover up the flow of resentment that comes naturally to them.
That's why switching to Fedora Discussion from the mailing lists is a very bad idea: admins or RH staff can easily delete your comments or bury them in other threads.
Can we please stop implying malevolence every time we don't agree with something?
BTW in the spirit of openness, I've set up a poll (UNOFFICIAL) to clearly state community sentiment about enabling OPT-OUT metrics to FESCO: https://discussion.fedoraproject.org/t/unofficial-poll-about-opt-out-metrics...
How is that going to help anything, when some of us are using browsers from the Fedora repos that just get this answer:
Which browser?
SeaMonkey, Falkon, maybe more...
SeaMonkey and Falkon are based on outdated versions of Firefox and Chromium respectively. Mozilla stopped issuing security advisories for SeaMonkey back in 2015, and QtWebEngine (used by Falkon) was a month or more behind upstream Chromium last I checked.
Please stop bringing this up. QtWebEngine is maintained by the Qt Company, and we all know that security advisories aren't the be-all end-all for maintenance.
SeaMonkey is maintained by its community. And community projects rarely issue security advisories.
On 10/07/2023 02:49, Demi Marie Obenour wrote:
QtWebEngine (used by Falkon) was a month or more behind upstream Chromium last I checked.
Qt5QtWebEngine is an extremely vulnerable thing. It still uses Chromium 87.0[1].
Current Chromium version: 105.0.
[1]: https://wiki.qt.io/QtWebEngine/ChromiumVersions
On 7/10/23 02:30, Vitaly Zaitsev via devel wrote:
On 10/07/2023 02:49, Demi Marie Obenour wrote:
QtWebEngine (used by Falkon) was a month or more behind upstream Chromium last I checked.
Qt5QtWebEngine is an extremely vulnerable thing. It still uses Chromium 87.0[1].
Current Chromium version: 105.0.
In that case it should be removed from the distribution. Can KDE mail clients be built without QtWebEngine? This would disable HTML email support, but plain text mail might still work.
More generally, WebKit is the only major browser engine with upstream support for being embedded, so it is the only embedded browser engine that is supportable security-wise. Unfortunately, it is also the least secure of the major browser engines on Linux last I checked, and in particular is far behind Chromium.
On 10/07/2023 20:16, Demi Marie Obenour wrote:
In that case it should be removed from the distribution. Can KDE mail clients be built without QtWebEngine? This would disable HTML email support, but plain text mail might still work.
I doubt it. But last year I disabled QtWebEngine in the Psi and Psi+ Jabber clients.
More generally, WebKit is the only major browser engine with upstream support for being embedded, so it is the only embedded browser engine that is supportable security-wise.
Telegram Desktop uses WebKitGTK instead of QtWebEngine.
On 7/10/23 13:16, Demi Marie Obenour wrote:
On 7/10/23 02:30, Vitaly Zaitsev via devel wrote:
On 10/07/2023 02:49, Demi Marie Obenour wrote:
QtWebEngine (used by Falkon) was a month or more behind upstream Chromium last I checked.
Qt5QtWebEngine is an extremely vulnerable thing. It still uses Chromium 87.0[1].
Current Chromium version: 105.0.
In that case it should be removed from the distribution. Can KDE mail clients be built without QtWebEngine? This would disable HTML email support, but plain text mail might still work.
The problem isn't QtWebEngine itself; the latest Qt 6.x uses Chromium 108 according to the link above.
The problem seems to be that not everything has moved to the 6.x branch yet.
More generally, WebKit is the only major browser engine with upstream support for being embedded, so it is the only embedded browser engine that is supportable security-wise. Unfortunately, it is also the least secure of the major browser engines on Linux last I checked, and in particular is far behind Chromium.
On 7/11/23 15:45, Jeremy Linton wrote:
On 7/10/23 13:16, Demi Marie Obenour wrote:
On 7/10/23 02:30, Vitaly Zaitsev via devel wrote:
On 10/07/2023 02:49, Demi Marie Obenour wrote:
QtWebEngine (used by Falkon) was a month or more behind upstream Chromium last I checked.
Qt5QtWebEngine is an extremely vulnerable thing. It still uses Chromium 87.0[1].
Current Chromium version: 105.0.
In that case it should be removed from the distribution. Can KDE mail clients be built without QtWebEngine? This would disable HTML email support, but plain text mail might still work.
The problem isn't QtWebEngine itself; the latest Qt 6.x uses Chromium 108 according to the link above.
The problem seems to be that not everything has moved to the 6.x branch yet.
It’s a mixture. The best possible outcome would be for QtWebEngine to be part of upstream Chromium and use Chromium’s release schedule. Not sure if that is possible/practical.
Hi,
On Thursday, 2023-07-06 17:10:24 +0100, Aoife Moloney wrote:
https://discussion.fedoraproject.org/t/f40-change-request-privacy-preserving...
So this is how somewhat harsher criticism is handled on Discourse? By flagging and hiding? https://discussion.fedoraproject.org/t/f40-change-request-privacy-preserving...
https://mastodon.ar.al/@aral/110688848596975566
Awesome.
Eike
I think what happens is: somebody (anybody) can report a post, and if it gets enough reports, it is automatically hidden before a moderator can review it. Do our moderators eventually review such posts to ensure they're truly inappropriate? It seems clear that the post in question should not have been hidden.
Michael
Hi,
On Tuesday, 2023-07-11 08:17:07 -0500, Michael Catanzaro wrote:
I think what happens is: somebody (anybody) can report a post, and if it gets enough reports, it is automatically hidden before a moderator can review it. Do our moderators eventually review such posts to ensure they're truly inappropriate? It seems clear that the post in question should not have been hidden.
According to https://mastodon.social/@decathorpe/110688949866653898 Fabio even (re-)approved the post to unhide it, and then apparently some moderator hid it again. https://mastodon.social/@decathorpe/110692221789994477
It's time to declare a thread dead when moderator wars start. And it shows that Discourse is the wrong medium to discuss controversial proposals.
Eike
Matt has started a poll with regards to the community's preferences about the topic:
https://discussion.fedoraproject.org/t/straw-poll-on-your-preferences-about-...
On 7/12/23 12:37, Eike Rathke wrote:
Hi,
On Tuesday, 2023-07-11 08:17:07 -0500, Michael Catanzaro wrote:
I think what happens is: somebody (anybody) can report a post, and if it gets enough reports, it is automatically hidden before a moderator can review it. Do our moderators eventually review such posts to ensure they're truly inappropriate? It seems clear that the post in question should not have been hidden.
According to https://mastodon.social/@decathorpe/110688949866653898 Fabio even (re-)approved the post to unhide it, and then apparently some moderator hid it again. https://mastodon.social/@decathorpe/110692221789994477
It's time to declare a thread dead when moderator wars start. And it shows that Discourse is the wrong medium to discuss controversial proposals.
Eike
devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
Hi,
On 7/6/23 11:10, Aoife Moloney wrote:
Important process note: we are experimenting with using Fedora
(trimming stuff because this proposal is huge)
We intend to deploy the Endless OS metrics system. [https://blogs.gnome.org/wjjt/2023/07/05/endless-oss-privacy-preserving-metri... This blog post] contains a description of how the system works. We do not plan to deploy the eos-phone-home component in Fedora.
So, the following is just _my_ opinion, don't read more than that into it:
Having finally had a chance to look at the list of collected metrics, I'm a bit worried about just how much information is being/can be gathered by the project, as well as how frequently it is being gathered.
Personally, I think it would benefit Fedora if questions such as "is anyone actually using this hardware/driver/package" could be answered. OTOH, the metrics presented above go far beyond that. I'm not sure why it's necessary to know how many times, or for how long, a particular application is being used.
=== How will data collection be approved? ===
The proposal owners feel it is essential to ensure the Fedora community has ultimate oversight over metrics collection. Community control is required to maintain user trust. If this change proposal is approved, then we'll need new policies and procedures to ensure community oversight over metrics collection and ensure Fedora users can be confident that our metrics collection does not violate their privacy.
So, I would suggest that the intended metrics, as well as the collection interval, be included as part of this proposal, and that they not be changed without further community approval. Doing this would go a long way toward convincing me, and likely others, that it's not worth the effort to manually rip the entire subsystem out of Fedora on my machines at the first chance.
If there is to be a "process" for changing them, then I think that needs to be documented here too, rather than hand-waved away.
We can say "we would never collect personally-identifiable data" and write software that really doesn't collect any such data, but this alone will never be enough to ensure user confidence. We will need a metrics collection policy that describes what sort of data may be collected by Fedora (anonymous, non-invasive), and what sort of data may not be collected. Such a policy does not exist currently. We will also want to ensure the Fedora community has ultimate control over which particular metrics are collected. One option is that each metric to be collected should be separately approved by FESCo. Collection of particular metrics in a particular data format is ultimately an engineering decision, and therefore FESCo seems like an appropriate approval point. Because FESCo members are elected regularly by the Fedora community, this also provides the community with ultimate control over metrics collection via the election process. But other oversight and approval structures would work too.
=== What data might we collect? ===
We are not proposing to collect any of these particular metrics just yet, because a process for Fedora community approval of metrics to be collected does not yet exist. That said, in the interests of maximum transparency, we wish to give you an idea of what sorts of metrics we might propose to collect in the future.
One of the main goals of metrics collection is to analyze whether Red Hat is achieving its goal to make Fedora Workstation the premier developer platform for cloud software development. Accordingly, we want to know things like which IDEs are most popular among our users, and which runtimes are used to create containers using Toolbx.
IMHO, the data shouldn't be collected more frequently than every 6 months or so, which allows each collection to be presented to the user, rather than having it just upload the data in the background. Nor should it be tracking _user_ actions, which I would differentiate from machine state (BIOS machine type, RAM, installed packages, application crashes, failed suspend/resume, and that kind of thing).
But given coarse-grained tracking, why isn't it part of Server/IoT/etc. as well, other than because of the current focus on GNOME? Surely knowing that only one user is running $APPLICATION on a server is useful too.
Metrics can also be used to inform user interface design decisions. For example, we want to collect the clickthrough rate of the recommended software banners in GNOME Software to assess which banners are actually useful to users. We also want to know how frequently panels in gnome-control-center are visited to determine which panels could be consolidated or removed, because there are other settings we want to add, but our usability research indicates that the current high quantity of settings panels already makes it difficult for users to find commonly-used settings.
(trimming)
=== User control ===
A new metrics collection setting will be added to the privacy page in gnome-initial-setup and also to the privacy page in gnome-control-center. This setting will be a toggle that will enable or disable metrics collection for the entire system. We want to ensure that metrics are never submitted to Fedora without the user's knowledge and consent, so the underlying setting will be off by default in order to ensure metrics upload is not unexpectedly turned on when upgrading from an older version of Fedora. However, we also want to ensure that the data we collect is meaningful, so gnome-initial-setup will default to displaying the toggle as enabled, even though the underlying setting will initially be disabled. (The underlying setting will not actually be enabled until the user finishes the privacy page, to ensure users have the opportunity to disable the setting before any data is uploaded.) This is to ensure the system is opt-out, not opt-in. This is essential because we know that opt-in metrics are not very useful. Few users would opt in, and these users would not be representative of Fedora users as a whole. We are not interested in opt-in metrics.
I also think it's useful here to describe _exactly_ how to disable/remove the component, as well as where the opt-in/out settings are stored in the filesystem, how to change them, and where the log of reported data for a given machine can be retrieved.
To make this a little more confusing, metrics collection is actually separate from uploading. Collection is always initially enabled, while uploading is always initially disabled. The graphical toggle enables or disables both at the same time. That is, a newly-installed Fedora system will always collect metrics locally at first, but the collected metrics will be deleted and never submitted to Fedora if the user disables the metrics collection toggle on the privacy page. If the user leaves the toggle enabled, then the collected metrics may be submitted only after finishing the privacy page.
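Since the interaction between the displayed toggle, collection, and upload is easy to misread, here is a minimal sketch of the behaviour as described in the two proposal paragraphs above. The class and method names are hypothetical, and the point at which collected data is deleted (modelled here as the moment the privacy page is finished) is an assumption; this is not the actual gnome-initial-setup or metrics daemon code.

```python
from dataclasses import dataclass, field

@dataclass
class MetricsState:
    collection_enabled: bool = True    # collection is always initially enabled
    upload_enabled: bool = False       # upload is always initially disabled
    toggle_shown_enabled: bool = True  # privacy page shows the toggle as on
    pending: list = field(default_factory=list)

    def record(self, event: str) -> None:
        """Store an event locally, only while collection is enabled."""
        if self.collection_enabled:
            self.pending.append(event)

    def set_toggle(self, enabled: bool) -> None:
        """User flips the toggle on the privacy page (not yet applied)."""
        self.toggle_shown_enabled = enabled

    def finish_privacy_page(self) -> None:
        """Apply the toggle to both collection and upload at once."""
        self.collection_enabled = self.toggle_shown_enabled
        self.upload_enabled = self.toggle_shown_enabled
        if not self.toggle_shown_enabled:
            self.pending.clear()  # opted out: local data is deleted, never sent

    def upload(self) -> list:
        """Return the events that would be submitted; empty unless enabled."""
        if not self.upload_enabled:
            return []
        sent, self.pending = self.pending, []
        return sent

# User leaves the toggle on: nothing is sent until the page is finished.
kept = MetricsState()
kept.record("example-event")
print(kept.upload())            # []
kept.finish_privacy_page()
print(kept.upload())            # ['example-event']
```

Modelling collection and upload as two separate flags makes the opt-out path explicit: turning the toggle off before finishing the page leaves nothing to upload.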
(trimmed rest)
Thanks for getting this far.
On Tue, Jul 11 2023 at 02:19:31 PM -0500, Jeremy Linton jeremy.linton@arm.com wrote:
Having finally had a chance to look at the list of collected metrics, I'm a bit worried about just how much information is being/can be gathered by the project, as well as how frequently it is being gathered.
Personally, I think it would benefit Fedora if questions such as "is anyone actually using this hardware/driver/package" could be answered. OTOH, the metrics presented above go far beyond that. I'm not sure why it's necessary to know how many times, or for how long, a particular application is being used.
I think Endless needs more data than we do. ;) If they don't have application usage data then they could be *really* wasting their time developing stuff that users are not using. Fedora works quite differently, but I can imagine we'd still be interested in counting use of at least some applications (e.g. was GNOME Builder started today?).
For avoidance of doubt, we won't actually collect the same metrics that Endless does. Metrics collected by Fedora will need to be individually approved via some sort of community process.
So, I would suggest that the intended metrics, as well as the collection interval, be included as part of this proposal, and that they not be changed without further community approval. Doing this would go a long way toward convincing me, and likely others, that it's not worth the effort to manually rip the entire subsystem out of Fedora on my machines at the first chance.
I agree that community approval should be required to make changes to what data we collect.
I was really hoping the initial proposal would not include particular metrics, so that each metric could be discussed separately outside the discussion of whether we should do this at all, but a lot of people are requesting this, so maybe we'll need to add a few.
If there is to be a "process" for changing them, then I think that needs to be documented here rather than hand waving it away too.
I agree. Once we agree on what process should be used, I'll edit it into the change proposal. I've started a discussion on this here:
https://discussion.fedoraproject.org/t/potential-process-and-policies-for-ap...
IMHO, the data shouldn't be collected more frequently than every 6 months or so, which allows each collection to be presented to the user, rather than having it just upload the data in the background. Nor should it be tracking _user_ actions, which I would differentiate from machine state (BIOS machine type, RAM, installed packages, application crashes, failed suspend/resume, and similar things).
But given coarse-grained tracking, why isn't it part of server/IoT/etc. as well, other than the current focus on GNOME? Surely knowing that only one user is running $APPLICATION on a server is useful too.
We do want to track user actions, though (e.g. "what control center panels are used the most?").
6 months is too infrequent. I'm open to discussing how frequently metrics are uploaded, but I think the current value is 30 minutes. Presenting each collection to the user would be too much clutter, but I'll plan to build some way to inspect this manually for users who want to do so.
I think telemetry would be useful for server, IoT, and Fedora spins as well, but this is something for each edition or spin to decide for themselves. The technology is somewhat tied to GNOME because it depends on D-Bus and GVariant, but it can be used on servers too.
I also think it's useful here to describe _exactly_ how to disable/remove the component, as well as where the opt-in/out settings are stored in the filesystem, how to change them, and where the log of reported data for a given machine can be retrieved.
You can do: sudo dnf remove eos-event-recorder-daemon
The settings are stored in /etc/metrics/eos-metrics-permissions.conf
I'm not sure about logs of reported data. I agree they would be useful, but we'll have to build such functionality if it doesn't exist already.
I'll create a note to edit this into the change proposal.
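The thread gives the path of the settings file but not its contents. As a minimal sketch, assuming the file is an INI-style key file with a `[global]` section and `enabled` / `uploading_enabled` keys (these names are an assumption for illustration, not confirmed anywhere in this discussion), the settings could be inspected like this:

```python
import configparser

# Hypothetical file contents. The section and key names are assumptions;
# the thread only states the path /etc/metrics/eos-metrics-permissions.conf.
SAMPLE = """\
[global]
enabled=true
uploading_enabled=false
"""


def read_permissions(text):
    """Parse INI-style text and return the two assumed boolean settings.

    A missing section or key is treated as False.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "enabled": parser.getboolean("global", "enabled", fallback=False),
        "uploading_enabled": parser.getboolean(
            "global", "uploading_enabled", fallback=False
        ),
    }


permissions = read_permissions(SAMPLE)
print(permissions)
```

To check a real machine, the text would come from reading the file at the path above instead of `SAMPLE`, provided the format assumption holds.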
To make this a little more confusing, metrics collection is actually separate from uploading. Collection is always initially enabled, while uploading is always initially disabled. The graphical toggle enables or disables both at the same time. That is, a newly-installed Fedora system will always collect metrics locally at first, but the collected metrics will be deleted and never submitted to Fedora if the user disables the metrics collection toggle on the privacy page. If the user leaves the toggle enabled, then the collected metrics may be submitted only after finishing the privacy page.
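The behaviour described above can be modelled as a small state machine. This is only an illustrative sketch of the described defaults, not the daemon's actual code; the class and method names (`MetricsState`, `record`, `finish_privacy_page`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MetricsState:
    """Toy model of the described defaults (all names are hypothetical)."""

    collecting: bool = True   # collection is always initially enabled
    uploading: bool = False   # uploading is always initially disabled
    stored_events: int = 0    # events collected locally, not yet submitted

    def record(self):
        """Store one event locally, but only while collection is enabled."""
        if self.collecting:
            self.stored_events += 1

    def finish_privacy_page(self, toggle_enabled):
        """The single graphical toggle sets collection and uploading together."""
        self.collecting = toggle_enabled
        self.uploading = toggle_enabled
        if not toggle_enabled:
            # Disabling the toggle deletes whatever was collected so far.
            self.stored_events = 0


state = MetricsState()
state.record()
state.record()
state.finish_privacy_page(toggle_enabled=False)
print(state)
```

In this model nothing can be uploaded before `finish_privacy_page` is called, which matches the statement that metrics may be submitted only after the privacy page is finished.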
(trimmed rest)
Thanks for getting this far.
On 7/6/23 12:10, Aoife Moloney wrote:
== Summary ==
The Red Hat Display Systems Team (which develops the desktop) proposes to enable limited data collection of anonymous Fedora Workstation usage metrics.
There are two problems here:
1. The GDPR and similar regulations are 100% clear that consent must be opt-*in*. Opt-*out*, as is proposed here, is not consent. Therefore, this change is proposing collecting telemetry *without the user's consent*.
2. Irrespective of whether or not the metrics are personally identifiable for the purposes of GDPR and other regulations, I highly doubt you will be able to convince people that they are in fact not personally identifiable. Techniques for correlating metrics can only get better, never worse, and this means that information may become personally identifiable in the future even if it was not in the past. Even Differential Privacy cannot solve this problem because it works on aggregate statistics, not on the raw data collected.
The only way I could be convinced that the raw data is in fact not personally identifiable is if there was a mathematical proof to that effect. Such a proof would probably be worthy of publication in a peer-reviewed research paper.
Since this Change proposal comes from Red Hat, I have an alternative to propose: Red Hat can ask its paying corporate customers for this information, perhaps in exchange for a discount on their RHEL subscriptions. This should be much less controversial.
On 7/12/23 19:21, Demi Marie Obenour wrote:
- The GDPR and similar regulations are 100% clear that consent must be opt-*in*. Opt-*out*, as is proposed here, is not consent. Therefore, this change is proposing collecting telemetry *without the user's consent*.
It seems to me that there are two slightly different understandings of 'opt-in':
1. data collection is happening automatically, but there is a way to 'opt-out' and turn it off.
2. the user is asked for permission, and the default answer is preselected as 'yes'
I think GDPR prohibits the first option, but the second one must be allowed because it's like pretty much all GDPR-compliant implementations I've seen.
I understand that Michael's Telemetry proposal uses the second method.
Perhaps a criticism of the opt-out approach (even in the second form) results from people believing that the consent at installation time is not fully informed: that somehow people don't understand the ramifications and the amount of data being shared. This actually makes sense.
Such a concern could be mitigated by scheduling a system notification after several weeks or months, with a rough summary of the collected data ('we shared X anonymized reports about Y, Z and W'), and offering a link to a telemetry consent dialog.
On Mon, 2023-07-17 at 16:26 -0400, przemek klosowski via devel wrote:
It seems to me that there are two slightly different understandings of 'opt-in':
1. data collection is happening automatically, but there is a way to 'opt-out' and turn it off.
2. the user is asked for permission, and the default answer is preselected as 'yes'
I think GDPR prohibits the first option, but the second one must be allowed because it's like pretty much all GDPR-compliant implementations I've seen.
I understand that Michael's Telemetry proposal uses the second method.
The original form of the proposal does. It seems fairly clear at this point that the proposal will be revised to use a "choice required" method, where there is no "default" choice and the user must deliberately pick one option or the other to proceed. This has come up in the discussion on discussion.fp.o.
Privacy is not too much of my concern.
How much data should I expect to be sent over my data plan on a monthly basis? When using Fedora Workstation as a graphics workstation (including regular office applications) during office hours, and for extensive internet research and entertainment during (late) evenings and weekends, should I expect this to generate some tens of KB of data, or should I expect it to amount to megabytes? Will this be uploaded on a scheduled daily interval or more regularly? Can I monitor the traffic on my firewall (and how)?
Good luck with the upcoming release soon! Cheers, Igor
I'm worried about seeing someone here on this discussion list lowering the importance of privacy.
On Thu, Apr 18, 2024 at 2:53 PM Igor Kerstges ikerstges@gmail.com wrote:
Privacy is not too much of my concern.
How much data should I expect to be sent over my data plan on a monthly basis? When using Fedora Workstation as a graphics workstation (including regular office applications) during office hours, and for extensive internet research and entertainment during (late) evenings and weekends, should I expect this to generate some tens of KB of data, or should I expect it to amount to megabytes? Will this be uploaded on a scheduled daily interval or more regularly? Can I monitor the traffic on my firewall (and how)?
Good luck with the upcoming release soon! Cheers, Igor
Those questions regarding privacy have been asked and answered to my satisfaction. I'd like to understand more about the implications of this change.
On Fri, Apr 19, 2024 at 09:37:38AM GMT, Igor Kerstges wrote:
Those questions regarding privacy have been asked and answered to my satisfaction. I'd like to understand more about the implications of this change.
There are none. This proposal was withdrawn.
It may be adjusted and submitted for consideration again, but that has not yet happened.
kevin
On Fri, Apr 19 2024 at 11:11:33 AM -07:00:00, Kevin Fenzi kevin@scrye.com wrote:
There are none. This proposal was withdrawn.
It may be adjusted and submitted for consideration again, but that has not yet happened.
Well, yes, but I'm planning to do this soonish.
On Thu, Apr 18 2024 at 05:53:14 PM +00:00:00, Igor Kerstges ikerstges@gmail.com wrote:
How much data should I expect to be sent over my data plan on a monthly basis? When using Fedora Workstation as a graphics workstation (including regular office applications) during office hours, and for extensive internet research and entertainment during (late) evenings and weekends, should I expect this to generate some tens of KB of data, or should I expect it to amount to megabytes?
Hi, how much data gets sent would depend on how many metrics we decide to collect. I don't have any estimate, but my guess is "very little."
The good news is NetworkManager already knows how to detect a metered connection (and there is an override switch in gnome-control-center if the automatic detection fails). So if it turns out to be a problem, then we can disable most of the data collection when on a metered connection.
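As a rough sketch of how such a policy could look: NetworkManager reports a metered state with five values (unknown, yes, no, guess-yes, guess-no), where the "guess" values come from automatic detection and plain yes/no from an explicit setting. The enum below mirrors those values; the `should_upload` function and its handling of the unknown state are assumptions made for illustration, not anything that has been decided for Fedora.

```python
from enum import IntEnum


class NMMetered(IntEnum):
    """Mirrors NetworkManager's NMMetered values."""

    UNKNOWN = 0
    YES = 1        # explicitly marked as metered
    NO = 2         # explicitly marked as not metered
    GUESS_YES = 3  # automatic detection thinks it is metered
    GUESS_NO = 4   # automatic detection thinks it is not metered


def should_upload(metered):
    """Hypothetical policy: skip uploads on connections that are, or are
    guessed to be, metered. Treating UNKNOWN as not metered is an assumption."""
    return metered not in (NMMetered.YES, NMMetered.GUESS_YES)


for value in NMMetered:
    print(value.name, should_upload(value))
```

In a real implementation the value would be read from NetworkManager (for example, its `Metered` D-Bus property) rather than iterated over as in this demonstration.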
Will this be uploaded on a scheduled daily interval or more regularly?
To be determined!
Can I monitor the traffic on my firewall (and how)?
All the data would be sent to a single host operated by Fedora. But it does not actually exist yet.
Michael