On Apr 23, 2022, at 22:36, Stephen J. Turnbull stephen@xemacs.org wrote:
As far as I know there isn't really a technical argument for systemd or any particular systemd.* on Fedora workstations. The various traditional inits and daemons work fine in that environment.[1]
There are several features in systemd that directly benefit the desktop.
1.) systemd service dependencies can ensure that the desktop environment doesn’t launch until all dependencies are met. The side benefit of this is that with parallel startup of services, the desktop launches faster, but it also launches with all the services it needs.
2.) systemd-logind helps contain desktop processes in cgroups, meaning that if you want it to, it will terminate all user processes *for that session* when it logs out. This is a huge thing for the enterprise desktop environments. For example, I managed engineering desktops and there was a particularly finicky circuit designer that loved to leave background processes that would survive logouts, and if another user logged in it would interfere.
But this process management also introduced resource management per-user session, so you could ensure a single user couldn’t abuse the system. This was also important to me, since we had multi-user systems running graphical sessions via VNC, and we wanted to make sure one user didn’t overwhelm the system.
3.) systemd now launches your GUI. You have your own private systemd --user running every time you log in. This process launches services and apps, maintains your environment, and can run other systemd units such as timers. This gives you a similar interface to system services, scoped just to your account. Since there’s only one user systemd per user, you can launch a process that can be used and managed by both the graphical login and an ssh session. (This is actually annoying to me, since it means stuff like Kerberos and AFS work differently than they used to.)
4.) The desktop session's output and errors are captured in the journal. Under previous init systems, console output from the user session was effectively lost to the user. There was some attempt to capture the X logs and the GNOME session, but with systemd each user unit can be individually examined with journalctl.
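A minimal sketch of points 3 and 4, for readers who have not used the per-user manager. The unit name example-job and the helper install_example_user_timer are hypothetical, chosen only for illustration; the default directory is the usual per-user unit path.

```shell
#!/bin/sh
# Hypothetical helper: writes an example service and timer for the
# per-user systemd instance. "example-job" is an illustrative name.
install_example_user_timer() {
    dir=${1:-${XDG_CONFIG_HOME:-$HOME/.config}/systemd/user}
    mkdir -p "$dir" || return 1

    # One-shot service; its stdout/stderr are captured by the journal.
    cat > "$dir/example-job.service" <<'EOF'
[Unit]
Description=Example per-user job (illustrative)

[Service]
Type=oneshot
ExecStart=/usr/bin/echo "hello from the user manager"
EOF

    # Timer that starts the service once a day.
    cat > "$dir/example-job.timer" <<'EOF'
[Unit]
Description=Run example-job daily (illustrative)

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
EOF
}
```

After calling the helper with no argument, "systemctl --user daemon-reload" followed by "systemctl --user enable --now example-job.timer" should activate the timer, and "journalctl --user-unit=example-job.service" should show the output of just that unit.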
This is just stuff off the top of my head. While I do agree that there has been a lot of focus on servers with systemd, a lot of cool things (like unit templating) were introduced because of systemd on workstations. Don’t forget that nearly all the common benefits of systemd also help desktops, because at its core, it’s the init system that launches the OS.
On Mon, 25 Apr 2022 16:45:41 -0400 Jonathan Billings wrote:
if you want it to, it will terminate all user processes *for that session* when it logs out
This only recently started working moderately well. If I ever ssh'ed into my desktop for a separate login session, systemd would create some sort of systemd user daemon that would hang around forever even after I logged out of the ssh session. Then when I tried to reboot the system, it would take something like 5 minutes to timeout waiting for the user daemon to terminate. I think it is finally better now, but it took years. I started using my own special reboot script that would search for and kill all systemd user daemons before trying to reboot :-).
Tom Horsley writes:
if you want it to, it will terminate all user processes *for that session* when it logs out
This only recently started working moderately well. If I ever ssh'ed into my desktop for a separate login session, systemd would create some sort of systemd user daemon that would hang around forever even after I logged out of the ssh session. Then when I tried to reboot the system, it would take something like 5 minutes to timeout waiting for the user daemon to terminate. I think it is finally better now, but it took years. I started using my own special reboot script that would search for and kill all systemd user daemons before trying to reboot :-).
Oh, so that's what that was all about.
In my case this was happening occasionally, and not every time. I ssh all over my LAN, every day and do weekly reboots. Every other month or so one of the machines gets stuck for a few minutes rebooting.
I couldn't detect any rhyme or reason for it, and wrote it off just as a random systemd bug.
On Apr 25, 2022, at 21:17, Tom Horsley horsley1953@gmail.com wrote:
On Mon, 25 Apr 2022 16:45:41 -0400 Jonathan Billings wrote:
if you want it to, it will terminate all user processes *for that session* when it logs out
This only recently started working moderately well. If I ever ssh'ed into my desktop for a separate login session, systemd would create some sort of systemd user daemon that would hang around forever even after I logged out of the ssh session. Then when I tried to reboot the system, it would take something like 5 minutes to timeout waiting for the user daemon to terminate. I think it is finally better now, but it took years. I started using my own special reboot script that would search for and kill all systemd user daemons before trying to reboot :-).
I believe KillUserProcesses is “no” by default, so it is unlikely to be related. I had to enable it on our systems.
The systemd --user daemon does hang around though, and it is likely it is waiting on terminating some pesky user process that wasn’t terminating properly. It isn’t the blocking process, it is the daemon trying to terminate it.
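For reference, a sketch of how that setting, and the per-user resource limits mentioned earlier in the thread, could be enabled with drop-in files. The helper name, the file names, and the limit values are assumptions for illustration, not what was deployed on the systems described above.

```shell
#!/bin/sh
# Hypothetical helper: writes two drop-ins below PREFIX so the result
# can be inspected before installing it for real (PREFIX=/ as root).
write_session_policy() {
    prefix=${1:?usage: write_session_policy PREFIX}
    mkdir -p "$prefix/etc/systemd/logind.conf.d" \
             "$prefix/etc/systemd/system/user-.slice.d" || return 1

    # Kill leftover processes of a session when the user logs out.
    cat > "$prefix/etc/systemd/logind.conf.d/50-kill-user-processes.conf" <<'EOF'
[Login]
KillUserProcesses=yes
EOF

    # Limits applied to every user-UID.slice; values are illustrative.
    # MemoryMax= assumes the unified (v2) cgroup hierarchy.
    cat > "$prefix/etc/systemd/system/user-.slice.d/50-limits.conf" <<'EOF'
[Slice]
MemoryMax=8G
CPUQuota=200%
TasksMax=4096
EOF
}
```

Running "write_session_policy /" as root and then rebooting is the simplest way to apply both files. Users who need long-running background jobs would then have to start them in a way that survives logout, which is the trade-off of turning this on.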
-- Jonathan Billings
On Tue, 26 Apr 2022 17:41:34 -0400 Jonathan Billings wrote:
The systemd --user daemon does hang around though, and it is likely it is waiting on terminating some pesky user process that wasn’t terminating properly. It isn’t the blocking process, it is the daemon trying to terminate it.
Nope, absolutely everything from that ssh session was always gone, it was the user daemon itself that caused the hang.
On Apr 26, 2022, at 17:58, Tom Horsley horsley1953@gmail.com wrote:
On Tue, 26 Apr 2022 17:41:34 -0400 Jonathan Billings wrote:
The systemd --user daemon does hang around though, and it is likely it is waiting on terminating some pesky user process that wasn’t terminating properly. It isn’t the blocking process, it is the daemon trying to terminate it.
Nope, absolutely everything from that ssh session was always gone, it was the user daemon itself that caused the hang.
Most likely it was trying to remove something from the session (such as a mount) that wasn’t responding. The user daemon itself will terminate once the session has been terminated.
It’s not like the systemd developers were like, “oops we forgot how to make the user daemon exit” or anything. I’m not trying to say I don’t believe your experience happened, I’m just saying that the outward appearance was deceiving. Could systemd do a better job saying what it was waiting on? Yes. Is it so horribly broken it doesn’t know how to exit? No.
On Tue, 26 Apr 2022 18:05:39 -0400 Jonathan Billings wrote:
Most likely it was trying to remove something from the session (such as a mount) that wasn’t responding. The user daemon itself will terminate once the session has been terminated.
Not even close. No mounts, no resources used at all, I could even ssh in and immediately log out, and the systemd user daemon never went away. Like I said, I think it is finally better, but it acted this way for years and years.
On Tue, Apr 26, 2022 at 6:06 PM Jonathan Billings billings@negate.org wrote:
The systemd --user daemon does hang around though, and it is likely it is waiting on terminating some pesky user process that wasn’t terminating properly. It isn’t the blocking process, it is the daemon trying to terminate it.
Nope, absolutely everything from that ssh session was always gone, it was the user daemon itself that caused the hang.
[snip] Could systemd do a better job saying what it was waiting on? Yes. Is it so horribly broken it doesn’t know how to exit? No.
This kind of blanket dismissal of user feedback and refusal to believe *even the possibility* that systemd could be broken in obvious ways contributes to the sense from the community that negative feedback about systemd has been and will be ignored.
Had the response been "What kind of system was it? What test cases did you do? Which time frames?" then at least it would come across as a constructive attempt to solve the problem. But a blanket dismissal of the possibility that systemd could fail to exit cleanly as opposed to admitting that maybe they were bit by any of the previous bugs where systemd would crash on exit [1, 2, 3, 4] reinforces the sense of "systemd advocates don't listen to user feedback".
-justin
[1] https://github.com/systemd/systemd/issues/17758
[2] https://access.redhat.com/solutions/6369201
[3] https://github.com/systemd/systemd/issues/6512
[4] https://linux.debian.bugs.dist.narkive.com/2ReHgNIk/bug-780675-systemd-segfa...
-- Jonathan Billings _______________________________________________ users mailing list -- users@lists.fedoraproject.org To unsubscribe send an email to users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/users@lists.fedoraproject.org Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
Justin Moore writes:
This kind of blanket dismissal of user feedback and refusal to believe *even the possibility* that systemd could be broken in obvious ways contributes to the sense from the community that negative feedback about systemd has been and will be ignored.
Had the response been "What kind of system was it? What test cases did you do? Which time frames?" then at least it would come across as a constructive attempt to solve the problem. But a blanket dismissal of the possibility that systemd could fail to exit cleanly as opposed to admitting that maybe they were bit by any of the previous bugs where systemd would crash on exit [1, 2, 3, 4] reinforces the sense of "systemd advocates don't listen to user feedback".
And in the instant case, we had:
1) A broken systemd-resolved scriptlet that ended up overwriting the /etc/resolv.conf symlink. This was fixed in the -2 update, but the initial reports were ignored, because we were told that the symlink gets created only on the initial install, and not an upgrade. Well, it turns out this wasn't the case.
2) Completely unaddressed was the reason all of that came to light: either the original update also broke DNS resolution on the LAN, or it was always broken and systemd-resolved never adds the DHCP-provided domain to the "search" directive in its /etc/resolv.conf, but NetworkManager always does that. I documented that.
I don't know if the -2 update fixes this or not. But that's another bug that at least was initially ignored. From all the looks it's still being ignored.
The only reason systemd-resolved exists is because glibc caches /etc/resolv.conf when a process performs its first DNS lookup. Having the means to have an existing process become aware that it's been changed, and it should reread it, will completely eliminate the reason for systemd-resolved's existence. That, I think, is the right solution, and it was always the right solution.
On 27 Apr 2022, at 13:05, Sam Varshavchik mrsam@courier-mta.com wrote:
The only reason systemd-resolved exists is because glibc caches /etc/resolv.conf when a process performs its first DNS lookup. Having the means to have an existing process become aware that it's been changed, and it should reread it, will completely eliminate the reason for systemd-resolved's existence. That, I think, is the right solution, and it was always the right solution.
It's not just the problem of glibc not being able to reload /etc/resolv.conf (that might be fixed in recent glibc - not sure).
I use and depend on the DNS split horizon features of systemd-resolved all the time.
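For anyone unfamiliar with the feature, a rough sketch of per-link DNS routing with resolvectl. The helper name is hypothetical, and the interface, server address, and domain in the usage line are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: send lookups for one domain to a specific DNS
# server on a specific link, leaving other queries on their usual route.
route_domain_via_link() {
    link=${1:?link name required}
    server=${2:?DNS server address required}
    domain=${3:?domain required}
    # Set the DNS server for this link only.
    resolvectl dns "$link" "$server" || return 1
    # A leading "~" marks a routing-only domain (not a search domain).
    resolvectl domain "$link" "~$domain"
}
```

Something like "route_domain_via_link tun0 10.0.0.53 corp.example.com", run as root, would send only corp.example.com lookups over tun0. Settings made this way are runtime state and may be replaced when the link is reconfigured, so a VPN client or NetworkManager would normally be the one setting them.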
Barry
On Wed, 2022-04-27 at 08:05 -0400, Sam Varshavchik wrote:
The only reason systemd-resolved exists is because glibc caches /etc/resolv.conf when a process performs its first DNS lookup. Having the means to have an existing process become aware that it's been changed, and it should reread it, will completely eliminate the reason for systemd-resolved's existence. That, I think, is the right solution, and it was always the right solution.
I remember that kind of thing when I first started using Linux in the early 2000s: It seemed to presume that all networks were a permanent thing. If a network went up and down again, be that ethernet or your dial-up internet, you had to manually restart NTP, and one or two other daemons.
Things that rely on network (time servers, mail servers, etc), shouldn't just silently die and stay dead when a network changed. They should be able to notice a network status change and act accordingly. It's no good for a daemon to examine the network when it first starts up, and then never check again.
And the converse isn't true. The network shouldn't be poking everything that might need to know about it. You'd be forever building an increasing number of wake-up routines.
I don't know about the things that used to concern me, back then, I haven't checked lately. But we still have applications that are ignorant of network changes, like Firefox. If you try to load a site and networking fails it, that's it (whether your ISP connection was temporarily down, or a webserver on your LAN changed IPs). It's a case of manual jiggery pokery by you to get it to have another go at it after it's cached a wrong or null DNS answer.
On Wed, Apr 27, 2022 at 08:05:56AM -0400, Sam Varshavchik wrote:
And in the instant case, we had:
- A broken systemd-resolved scriptlet that ended up overwriting the /etc/resolv.conf symlink. This was fixed in the -2 update, but the initial reports were ignored, because we were told that the symlink gets created only on the initial install, and not an upgrade. Well, it turns out this wasn't the case.
Ignored by whom? I noted that there were bugs and that they were fixed in the most recent versions. It's important to note that basically any help you get here should come with an 'according to what I know...' in front of it.
- Completely unaddressed was the reason all of that came to light: either the original update also broke DNS resolution on the LAN, or it was always broken and systemd-resolved never adds the DHCP-provided domain to the "search" directive in its /etc/resolv.conf, but NetworkManager always does that. I documented that.
I don't know if the -2 update fixes this or not. But that's another bug that at least was initially ignored. From all the looks it's still being ignored.
Well, the users list is not a bug reporting medium. I don't think systemd-resolved maintainers read this list and act on it. Do report a bug to bugzilla.redhat.com or upstream to systemd on it.
systemd-resolved does pick up the domain provided by DHCP here. What does 'resolvectl' show there? (if you still have it around)
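A small sketch of one way to collect that information in a form that can be pasted into a bug report. The helper name show_search_domains is hypothetical; it prints the per-link domains known to systemd-resolved (when resolvectl is installed) followed by the search or domain line from the resolv.conf file.

```shell
#!/bin/sh
# Hypothetical helper: show where the search domain does or does not appear.
show_search_domains() {
    conf=${1:-/etc/resolv.conf}
    # Per-link domains as systemd-resolved sees them (skipped if absent).
    if command -v resolvectl >/dev/null 2>&1; then
        resolvectl domain
    fi
    # The line that programs reading resolv.conf directly will use.
    grep -E '^(search|domain)[[:space:]]' "$conf"
}
```

If the DHCP-provided domain shows up in the resolvectl output but not in the file, that points at how /etc/resolv.conf is being generated or linked rather than at the DHCP handling itself.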
The only reason systemd-resolved exists is because glibc caches /etc/resolv.conf when a process performs its first DNS lookup. Having the
No, this is just not the case. It provides a number of advantages over a static resolv.conf file.
means to have an existing process become aware that it's been changed, and it should reread it, will completely eliminate the reason for systemd-resolved's existence. That, I think, is the right solution, and it was always the right solution.
Well, I disagree, but such is life...
kevin
On Apr 27, 2022, at 07:25, Justin Moore justin.nonwork@gmail.com wrote:
On Tue, Apr 26, 2022 at 6:06 PM Jonathan Billings billings@negate.org wrote:
[snip] Could systemd do a better job saying what it was waiting on? Yes. Is it so horribly broken it doesn’t know how to exit? No.
This kind of blanket dismissal of user feedback and refusal to believe *even the possibility* that systemd could be broken in obvious ways contributes to the sense from the community that negative feedback about systemd has been and will be ignored.
Had the response been "What kind of system was it? What test cases did you do? Which time frames?" then at least it would come across as a constructive attempt to solve the problem. But a blanket dismissal of the possibility that systemd could fail to exit cleanly as opposed to admitting that maybe they were bit by any of the previous bugs where systemd would crash on exit [1, 2, 3, 4] reinforces the sense of "systemd advocates don't listen to user feedback".
-justin
[1] https://github.com/systemd/systemd/issues/17758
[2] https://access.redhat.com/solutions/6369201
[3] https://github.com/systemd/systemd/issues/6512
[4] https://linux.debian.bugs.dist.narkive.com/2ReHgNIk/bug-780675-systemd-segfa...
That is just as frustrating as people who say “systemd is evil” and that, because it has bugs, it should be tossed out entirely. I wasn’t trying to be dismissive; I just didn’t realize we were debugging someone’s genuine problem.
In general, the way I suggest debugging these kinds of hangs at shutdown/reboot is to run:
journalctl --boot=-1 --reverse
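The full log of the previous boot can be long, so here is a sketch of a narrower view. The wrapper name last_boot_warnings is hypothetical; the journalctl options are standard ones. It assumes the stop-job timeouts of interest are logged at warning level or above, which may not hold in every case, so drop --priority to see everything.

```shell
#!/bin/sh
# Hypothetical wrapper: newest-first warnings and errors from the
# previous boot, limited to a number of lines (default 200).
last_boot_warnings() {
    lines=${1:-200}
    journalctl --boot=-1 --reverse --priority=warning \
        --lines="$lines" --no-pager
}
```

Because the output is newest first, the messages from the final moments before the reboot are at the top, which is where a unit that timed out while stopping would be expected to appear.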
On Wed, Apr 27, 2022 at 5:35 PM Jonathan Billings billings@negate.org wrote:
That is just as frustrating as people who say “systemd is evil” and that, because it has bugs, it should be tossed out entirely.
That attempted equivalence doesn't acknowledge the power imbalance in the situation. The "systemd can't possibly be broken in THAT way" crowd are part of the group which have final say over Linux distributions at large and how they do (or don't) work.
If the "systemd is evil" crowd were removing systemd from Fedora/Ubuntu/etc and repeatedly breaking working systems and dismissing user feedback -- even if they had good intentions and could point to bugs they were fixing -- I think this equivalence would carry weight.
I wasn’t trying to be dismissive, I just didn’t realize we were debugging someone’s genuine problem.
And, not to be too flip, I think that's part of the problem.
-justin
In general, the way I suggest debugging these kinds of hangs at shutdown/reboot is to run:
journalctl --boot=-1 --reverse
-- Jonathan Billings
On Apr 28, 2022, at 07:05, Justin Moore justin.nonwork@gmail.com wrote:
On Wed, Apr 27, 2022 at 5:35 PM Jonathan Billings billings@negate.org wrote:
That is just as frustrating as people who say “systemd is evil” and that, because it has bugs, it should be tossed out entirely.
That attempted equivalence doesn't acknowledge the power imbalance in the situation. The "systemd can't possibly be broken in THAT way" crowd are part of the group which have final say over Linux distributions at large and how they do (or don't) work.
If the "systemd is evil" crowd were removing systemd from Fedora/Ubuntu/etc and repeatedly breaking working systems and dismissing user feedback -- even if they had good intentions and could point to bugs they were fixing -- I think this equivalence would carry weight.
I only waded into this argument because someone said systemd isn’t any use on a workstation, which is just untrue. I’m not part of any group that incorporates it into Fedora, but I have a lot of experience in enterprise workstations where systemd has really improved my ability to manage systems.
I genuinely would prefer showing all the neat things you can do but I usually end up defending a decision I didn’t make on a platform I have no control over.
I wasn’t trying to be dismissive, I just didn’t realize we were debugging someone’s genuine problem.
And, not to be too flip, I think that's part of the problem.
I agree. I will strive to avoid discussing systemd in the future.
-- Jonathan Billings
Justin Moore writes:
And, not to be too flip, I think that's part of the problem.
Only the slow march of time will fix this problem.
The core features of systemd (the dependency-based replacement for init that uses containers, the initial feature set that was used in its advocacy) are a fair amount of code to implement, but it's not insurmountable.
I think that if there's a functional replacement for that, there's a very good chance that Debian will adopt it and use it by default. Once that's done, all the Debian-based distributions then inherit it. Fedora and RHEL will continue to be based on systemd, and all the Debian-based distros will migrate to its replacement.
There will be some temporary pain without a local port 53 proxy; but it should be possible to take a peek at how nscd monitors the changes to /etc/resolv.conf (based on what its man page says) and mimic this in a lightweight port 53 proxy. It should be possible to do that without actually having to interpret DNS queries and response packets, and just forward them to the current DNS server, back and forth. Thinking about this, I don't think it's necessary to actually grok DNS packets to do systemd-resolved-type proxying. But if it is, I do have my own DNS packet parsing code, that has been quietly grinding away for …a very long time, which I can toss over the wall pretty easily.
History is full of similar examples, of an established component getting replaced by a lighter replacement: chrony supplanting ntp; cronie supplanting vixie-cron. This certainly can happen again.
On Thu, 28 Apr 2022 18:15:16 -0400 Sam Varshavchik wrote:
History is full of similar examples, of an established component getting replaced by a lighter replacement: chrony supplanting ntp; cronie supplanting vixie-cron. This certainly can happen again.
But systemd-timedated tries to supplant them all :-). This is where my "computer fungus" description comes from. Every systemd release seems to take over something else that worked fine without systemd engulfing it.
On 28 Apr at 17:27, Tom Horsley horsley1953@gmail.com wrote:
But systemd-timedated tries to supplant them all :-). This is where my "computer fungus" description comes from. Every systemd release seems to take over something else that worked fine without systemd engulfing it.
I've been working in Unix since about 1980--I wrote custom BIOS code for BTL products, Unix drivers and kernel internals while at BTL Naperville, and rewrote cut and paste for contribution to the Gnu Project (and wasn't *that* fun, but another story.) Suffice to say I've been around a long time.
The biggest problem I have with systemd is that it violates, on so many levels, the Unix mantra "do one thing and do it well". It may have been a good idea when it was only supposed to provide an efficient replacement for the admittedly fractured set of different system initialization systems.
With it growing to subsume so many different, and usually unrelated, tasks, it's turned into the kind of thing that we were trying so hard to get away from--the monolithic OS with complex interrelated interactions that was prevalent at the time.
And everything that we wanted to avoid is resurfacing with systemd. It's become a glop of totally unrelated features and services, difficult to diagnose and maintain, and prone to unwanted and unforeseen interactions. Tug this strand, and the spiderweb shivers.
$0.02, and I know nobody's going to listen to me, but listening to the conflicts on this topic just makes me sigh.
Sincerely, -- Dave Ihnat dihnat@dminet.com
On 4/28/22 15:38, Dave Ihnat wrote:
On 28 Apr at 17:27, Tom Horsley horsley1953@gmail.com wrote:
But systemd-timedated tries to supplant them all :-). This is where my "computer fungus" description comes from. Every systemd release seems to take over something else that worked fine without systemd engulfing it.
The biggest problem I have with systemd is that it violates, on so many levels, the Unix mantra "do one thing and do it well". It may have been a good idea when it was only supposed to provide an efficient replacement for the admittedly fractured set of different system initialization systems.
With it growing to subsume so many different, and usually unrelated, tasks, it's turned into the kind of thing that we were trying so hard to get away from--the monolithic OS with complex interrelated interactions that was prevalent at the time.
And everything that we wanted to avoid is resurfacing with systemd. It's become a glop of totally unrelated features and services, difficult to diagnose and maintain, and prone to unwanted and unforeseen interactions. Tug this strand, and the spiderweb shivers.
What's the difference between 10 independent programs that all come from the same "package" vs. 10 independent programs that each come from a different source? Each "systemd" program is independent and can work on its own. They are not tied together. It's not a "glop". Each program has its own purpose and functionality and does not require the others. That is quite obvious, since they have been added to Fedora separately in different releases.
On Thu, 28 Apr 2022 at 19:39, Dave Ihnat dihnat@dminet.com wrote:
On 28 Apr at 17:27, Tom Horsley horsley1953@gmail.com wrote:
But systemd-timedated tries to supplant them all :-). This is where my "computer fungus" description comes from. Every systemd release seems to take over something else that worked fine without systemd engulfing it.
[...] And everything that we wanted to avoid is resurfacing with systemd. It's become a glop of totally unrelated features and services, difficult to diagnose and maintain, and prone to unwanted and unforeseen interactions. Tug this strand, and the spiderweb shivers.
In my experience, init.d scripts suffered from lack of the sort of filtering we get with the kernel. The world is very different today. If you had a problem with some init script it was faster to fix it yourself than to deal with the vendor's technical support by email or get a response from a Usenet group for your flavor of UNIX.
Linux now has a wide range of use cases, from data centers run by internet "giants" to schools and individuals trying to make the best of tight budgets. We have many distributions and spins because they meet the needs of groups large enough to warrant the effort. Debian still provides init, but it is up to packagers to choose whether to provide init scripts. Quoting from Init - Debian Wiki https://wiki.debian.org/Init
Since jessie, only systemd is fully supported; sysvinit is mostly supported, but Debian packages are not required to provide sysvinit start scripts. Support for init systems other than systemd is significantly improved in Bullseye. runit is also packaged, but has not received the same level of testing and support as the others, and is not currently supported as PID 1. As of Bullseye, a collection of sysvinit start scripts that have been removed from their original packages is provided in the orphan-sysvinit-scripts package.
It remains to be seen if there is significant adoption of init scripts and how often packagers provide sysvinit scripts.
On 2022-04-27 17:34:17-0400, Jonathan Billings wrote:
On Apr 27, 2022, at 07:25, Justin Moore justin.nonwork@gmail.com wrote:
In general, the way I suggest debugging these kinds of hangs at shutdown/reboot is to run:
journalctl --boot=-1 --reverse
One thing to note.
I got bitten by the following quite recently:
[lars@localhost ~]$ journalctl --boot=-1 --reverse
Specifying boot ID or boot offset has no effect, no persistent journal was found.
[lars@localhost ~]$
I thought that the system in question, the one that crashed, had its files in /var/log/journal, but apparently not. That system is now set up to use a persistent journal.
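A quick way to check this before it is needed, as a sketch. The helper name journal_persistence is hypothetical, and the check assumes journald is running with its default Storage=auto setting, under which logs are kept across boots only if /var/log/journal exists.

```shell
#!/bin/sh
# Hypothetical helper: report whether the journal directory that
# Storage=auto looks for is present (assumes default journald settings).
journal_persistence() {
    dir=${1:-/var/log/journal}
    if [ -d "$dir" ]; then
        echo persistent
    else
        echo volatile
    fi
}
```

If it reports volatile, creating /var/log/journal as root (or setting Storage=persistent in a journald.conf drop-in) and restarting systemd-journald should make later boots available to "journalctl --boot=-1".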
:)
Lars
Jonathan Billings writes:
There are several features in systemd that directly benefit the desktop.
Sure, but the ones you mention don't benefit me and people like me. I can't imagine why I would even notice them on my personal desktop.
The odd man out (and thank you for mentioning enterprise desktop!) is
2.) systemd-logind helps contain desktop processes in cgroups, meaning that if you want it to, it will terminate all user processes *for that session* when it logs out. This is a huge thing for the enterprise desktop environments.
which doesn't benefit *me* because I'm an academic and "own" my own system, but I sympathize with enterprise system management enough. Management of such systems feels a lot like server management in some ways; I wouldn't include them in "[personal] workstation" for this purpose.
My point in mentioning "workstation" was simply to give a loose description of a context where systemd simply doesn't buy much, and to point out that there are *other* contexts where what systemd offers matters.
I will note that managed desktops seem like a very important application in academia, too -- student computer literacy labs in college. High school too, in fact any compulsory education context. So the applications where the transition to systemd offers few if any benefits seem pretty small (although a lot of Fedora users!) and increasingly circumscribed going forward.
Steve