Hey folks! I did mention this in my mails explaining the 'compose check reports' to test@ and devel@, but in case anyone isn't following those, I thought I'd mention it here too.
Lately, openQA tests of Fedora 25 KDE seem to be crashing quite often during KDE startup. Here are three that crashed just in testing of today's nightly compose:
https://openqa.fedoraproject.org/tests/44206
https://openqa.fedoraproject.org/tests/44216
https://openqa.fedoraproject.org/tests/44278
As a simple openQA primer, the fact that you see a screenshot of a black screen with a red border around the screenshot in the thumbnail sequence, shortly after a screenshot of the KDE startup splash screen, means the test failed because instead of seeing a clean KDE desktop (like it was expecting), it saw a black screen. After that it switches to a console so it can upload some logs for us.
You can watch videos of each test by switching to the Logs & Assets tab and clicking Video, but that's all they'll show you - at some point during the test KDE tries to start up but sticks at a black screen. The subsequent switch to a console is done by the test system pressing ctrl-alt-f6 or ctrl-alt-f2; it's not part of the failure mode. Note the videos are sped up - in reality the system sat at the black screen for around 5 minutes in each case (that's the timeout on the 'wait for a clean KDE desktop to appear' step).
More interestingly, on the Logs & Assets tab you can find various uploaded log files. The most useful test to look at is probably 44216:
https://openqa.fedoraproject.org/tests/44216#downloads
because there you can find an ABRT problem directory for the crash, in https://openqa.fedoraproject.org/tests/44216/file/_graphical_wait_login-spoolabrt.tar.gz . You can also find the entire contents of /var/log in the var_log.tar.gz file.
I'll try looking into this a bit more manually too - see if I can reproduce it in a virt-manager VM and on bare metal - but it'd be great if anyone else can help figure out what's going on.
The openQA test VMs are direct runs of qemu (libvirt is not used) with the 'qxl' graphics driver. openQA interacts with the VM via qemu's built-in VNC server.
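For anyone trying to reproduce this outside openQA, a rough approximation of that setup is a plain qemu invocation with QXL video and the built-in VNC server. This is only a sketch: the exact command line openQA uses is not given here, and the function name, memory size, VNC display number and ISO path are all placeholders.

```python
def qemu_command(iso_path, vnc_display=1, memory_mb=2048):
    """Build a qemu argv that approximates the described setup.

    Assumed values (not taken from openQA): memory size, VNC display
    number, KVM acceleration, and booting from an ISO image.
    """
    return [
        "qemu-system-x86_64",
        "-enable-kvm",              # assumption: host supports KVM
        "-m", str(memory_mb),       # guest RAM in MiB (placeholder)
        "-vga", "qxl",              # the 'qxl' graphics device mentioned above
        "-vnc", f":{vnc_display}",  # built-in VNC server, display N = port 5900+N
        "-cdrom", iso_path,         # placeholder path to the compose image
    ]


# Print the command so it can be inspected or pasted into a shell.
print(" ".join(qemu_command("Fedora-KDE-Live.iso")))
```

A VNC client pointed at display :1 on the host should then show roughly what openQA sees.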
Adam Williamson composed on 2016-10-25 11:35 (UTC-0700): ...
Could it be related to, or the same basic trouble as, what nvidia gfx with the nouveau or modesetting driver is causing on openSUSE?
https://bugzilla.opensuse.org/show_bug.cgi?id=1005323
https://bugzilla.opensuse.org/show_bug.cgi?id=1003402
https://bugzilla.opensuse.org/show_bug.cgi?id=997171
IIUC from the debian-user mailing list lately, Debian Stretch users may be having the same or similar trouble.
Adam Williamson composed on 2016-10-25 11:35 (UTC-0700): ...
[ 230.023424] kactivitymanage[1132]: segfault at 7fc1f1ecb630 ip 00007fc1d80b9741 sp 00007ffdf1b35f28 error 4 in libQt5Sql.so.5.7.0[7fc1d80a3000+45000]
[ 271.127147] kactivitymanage[1550]: segfault at 7feb6c618630 ip 00007feb54022741 sp 00007ffcf215acc8 error 4 in libQt5Sql.so.5.7.0[7feb5400c000+45000]
Those are from logging out of Plasma on just updated F25. Same thing happens on Rawhide. Video chip is Intel G33 using intel Xorg driver.
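One way to compare the two crashes is to subtract the library's mapping base from the instruction pointer, which gives the offset inside libQt5Sql where the fault happened. A minimal Python sketch, assuming the usual kernel segfault line format as seen in the excerpt; the name `library_offset` is just illustrative:

```python
import re

# Matches kernel segfault lines of the form seen in the log excerpt.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def library_offset(line):
    """Return (library name, ip - mapping base) for a segfault line, or None."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    return m["lib"], int(m["ip"], 16) - int(m["base"], 16)


lines = [
    "[ 230.023424] kactivitymanage[1132]: segfault at 7fc1f1ecb630 ip 00007fc1d80b9741 "
    "sp 00007ffdf1b35f28 error 4 in libQt5Sql.so.5.7.0[7fc1d80a3000+45000]",
    "[ 271.127147] kactivitymanage[1550]: segfault at 7feb6c618630 ip 00007feb54022741 "
    "sp 00007ffcf215acc8 error 4 in libQt5Sql.so.5.7.0[7feb5400c000+45000]",
]

for line in lines:
    lib, offset = library_offset(line)
    print(lib, hex(offset))  # both lines print: libQt5Sql.so.5.7.0 0x16741
```

Both lines give offset 0x16741, so the two crashes hit the same instruction in the library. With the matching debuginfo installed, that offset could probably be resolved to a function with a tool such as addr2line.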
On 27.10.2016 at 10:37, Felix Miata wrote:
Those are from logging out of Plasma on just updated F25. Same thing happens on Rawhide. Video chip is Intel G33 using intel Xorg driver
This has been happening for a long time on F24 at each logout, with both the intel and modesetting drivers. I don't know what business SQL stuff has there when everything related to search and indexing is disabled, or uninstalled where possible.
On Thu, 2016-10-27 at 11:43 +0200, Reindl Harald wrote:
This has been happening for a long time on F24 at each logout, with both the intel and modesetting drivers. I don't know what business SQL stuff has there when everything related to search and indexing is disabled, or uninstalled where possible.
The openQA issue is on logging *in*, not *out*, so likely not the same issue.
On 27.10.2016 at 19:01, Adam Williamson wrote:
The openQA issue is on logging *in*, not *out*, so likely not the same issue
Well, I was referring to "Those are from logging out of Plasma on just updated F25", as quoted above.
Felix Miata wrote:
Could it be related to, or the same basic trouble as, what nvidia gfx with the nouveau or modesetting driver is causing on openSUSE?
https://bugzilla.opensuse.org/show_bug.cgi?id=1005323
https://bugzilla.opensuse.org/show_bug.cgi?id=1003402
https://bugzilla.opensuse.org/show_bug.cgi?id=997171
No. That is an issue specific to the Nouveau driver (it is not thread-safe). OpenQA is not running on NVidia hardware, but in virtual machines. So it cannot possibly be affected. And mostly, that issue affects only QtWebEngine and applications using it.
For those issues, I have builds in my QtWebEngine Copr with experimental fixes: https://copr.fedorainfracloud.org/coprs/kkofler/qtwebengine/package/mesa/ but, as is also said in the comments on the openSUSE bug reports, the upstream developers consider those patches too incomplete and unsafe (risk of deadlocks) to ship. Also note that using those patches may also affect applications that do not use QtWebEngine, by causing deadlocks. (In the worst case, the whole system can hang.)
There are now also proposed workarounds on the QtWebEngine end: https://bugreports.qt.io/browse/QTBUG-41242 which I don't really want to ship (I want Fedora to ship the Nouveau patches instead, which address the problem instead of hacking around it), but since I do not maintain the Nouveau package, I may have no other choice but to include a version of the QtWebEngine hack in our packaging.
But in any case, that issue cannot possibly be the cause of the OpenQA crashes.
Kevin Kofler