backtrace parser and processor for ABRT

31 Aug 2010


Hi,
I have created the next iteration of backtrace parsing and processing 
code, and I would like to see it included in the development version of 
ABRT.
What are the improvements compared to ABRT's current backtrace code?
- the backtrace duplication hash algorithm has been significantly
improved: it now removes many more irrelevant frames, and unifies and
merges frames. For example, glib's functions are sometimes prefixed with
IA__, and sometimes they are not (sometimes you see IA__g_logv,
sometimes just g_logv), so the IA__ prefix is stripped for the
duplication hash. Another example: many backtraces were the same except
that one crashed in strcpy_ssse3 and the other in strcpy_sse2. So now all
glibc functions which depend on the available instruction set are unified
for hashing purposes (both strcpy_sse2 and strcpy_ssse3 are renamed to
strcpy)
- the backtrace rating algorithm has been improved: it ignores
cleanup-after-crash frames even when they lack debug info, and it also
fixes bug #592523
- the "crash function name" detection has been improved: many irrelevant 
frames are skipped during the crash frame detection, so the reported 
crash function is more often the actual place where the program crashed
- new hand-written parser: it parses many more backtraces correctly
(especially those with C++ frames, such as applications using the Boost
library), and the code seems to be more readable than the current bison
grammar. That is because some parts of the backtrace format are very
difficult to express in the rules for the Bison GLR parser while keeping
the memory usage and parsing time reasonable
- the error messages returned by the hand-written parser always include
the precise location of the error (line:column) and an informative message
- the new parser does not crash on "wrong" backtraces; the current bison
parser uses the stack heavily, and when some line in a backtrace is too
long (many kilobytes) bison puts too much data on the stack and crashes
(I do not know how to fix that without rewriting most of the grammar, and
that would probably introduce limits in other places); so the new parser
fixes crashes #627698, #627680, #616988, #589962, #588129, #573333
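
To illustrate the function name unification described in the first item,
here is a minimal sketch in Python. It is not btparser's actual code (the
library is written in C); the function names and the suffix list are
assumptions made up for this illustration, covering only the two cases
mentioned above (the IA__ prefix and the _sse2/_ssse3 suffixes).

```python
import hashlib
import re

# Assumption: only the two suffixes named in this mail are handled here.
ISA_SUFFIX = re.compile(r"_(sse2|ssse3)$")
GLIB_ALIAS_PREFIX = "IA__"


def normalize_function_name(name):
    """Return the name used for duplicate detection (hypothetical helper)."""
    # Strip glib's internal alias prefix: IA__g_logv -> g_logv.
    if name.startswith(GLIB_ALIAS_PREFIX):
        name = name[len(GLIB_ALIAS_PREFIX):]
    # Drop the instruction-set suffix: strcpy_ssse3 -> strcpy.
    return ISA_SUFFIX.sub("", name)


def duplication_hash(function_names):
    """Hash the normalized function names of a backtrace, top frame first."""
    normalized = [normalize_function_name(n) for n in function_names]
    return hashlib.sha1("\n".join(normalized).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    a = ["strcpy_ssse3", "IA__g_logv", "main"]
    b = ["strcpy_sse2", "g_logv", "main"]
    # Both backtraces normalize to the same frame list, so the hashes match.
    print(duplication_hash(a) == duplication_hash(b))
```

With this normalization, two backtraces that differ only in the IA__
prefix or in the strcpy variant end up with the same hash and are
treated as duplicates.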
--
The new code has been written as a separate library (see the attached 
archive), because it became pretty large, containing a binary not used 
by the rest of ABRT, many tests, and several helper scripts. Within 
ABRT, it's used only by the CCpp plugin. I am not sure how to integrate 
it. Should we include it (as a separate project in a subdirectory) in
ABRT's git repository?
I spent many days hunting bugs in the code, so it is in good shape now 
as far as I can tell. Most C functions are covered by unit tests, and I 
often run the parser on all ABRT-reported backtraces downloaded from 
Bugzilla (~27000) and it provides good results. So putting btparser into 
ABRT should not cause much disruption. See the attached patches for the 
integration code.
How can you check how it works with ABRT?
$ tar xzvf btparser-0.5.tar.gz
$ cd btparser-0.5
$ ./configure
$ make check      # to see the tests :)
$ make rpm
$ cd i686   # depends on your arch
$ sudo yum install --nogpgcheck \
./btparser-0.5-1.*.rpm \
./btparser-devel-0.5-1.*.rpm
$ cd ../../abrt   # to your devel abrt git clone
$ git apply 000*.patch
$ ./autogen.sh
$ ./configure
$ make rpm
...
Check the btparser-0.5/README and btparser-0.5/lib/*.h. I tried to 
explain there what it does.
Thanks,
Karel
