Sunday, June 08, 2008

wanted: backtrace for krunner heisenbug

Short story: KSycoca is not thread safe, reentrant or anything else that might make it usable in a multithreaded application. This is true of many parts of kdelibs, actually. Well, guess what krunner does? Yep, run lots and lots of threads.

Between the runners themselves we share a "kdelibs lock" that prevents one runner from entering these sensitive bits of kdelibs while others are.

Unfortunately, the startup info is also in there and .. guess what .. uses KSycoca. So every once in a while, when things line up juuuuust right, a runner is (with the lock) fetching data from KSycoca when we get an appliation startup notification and *poof* there goes krunner.

I triggered this bug today, generated a backtrace and then, stupidity of stupidity, accidentally closed the backtrace window! Oh no! I need the backtrace to see exactly where in the startup feedback machinery we're hitting this problem so I can then track down the problems with accuracy (versus just guessing, even if my guesses are sometimes right).

Since this is not the easiest crash in the world trigger and I'd like to not spend an entire day sitting there trying to egg it out of krunner again, I'm asking all of you who are running KDE from svn trunk to help me out here: if you manage to crash krunner and you get a backtrace that starts out in KSycoca stuff and then trails back into krunner symbols that contain the word "startup" in them, please save that backtrace and mail it to me.

Thank you very much in advance! =)

p.s. and yes, while it would make lots of sense to make more paths in kdleibs, such as some of the KSycoca stuff, at least reentrant, that's a ton of work that I don't have time for right now. Especially not before 4.1 ;)

12 comments:

Anonymous said...

What happened to your blog at the planet? I can't find it any more...

Anonymous said...

People wanting a simple and risk-free way of running KDE4 SVN might want to check out KDE4Daily:

http://etotheipiplusone.com/kde4daily/docs/kde4daily.html

Up-to-date backtrace generation is automatic, but very slow, so have patience if you manage to trigger the elusive KRunner bug!

Andras said...

Here you have a backtrace I got after trying to launch applications from krunner. I had to do it several times until it crashed:
Application: Run Command Interface (krunner), signal SIGABRT
[?1034h[Thread debugging using libthread_db enabled]
[New Thread 0x7f3bd40c8740 (LWP 25870)]
[New Thread 0x42e08950 (LWP 13261)]
[New Thread 0x42607950 (LWP 13260)]
[New Thread 0x41e06950 (LWP 13259)]
[New Thread 0x41605950 (LWP 13258)]
[New Thread 0x40a49950 (LWP 13257)]
[KCrash handler]
#5 0x00007f3bcaeb35c5 in raise () from /lib64/libc.so.6
#6 0x00007f3bcaeb4bb3 in abort () from /lib64/libc.so.6
#7 0x00007f3bd0268a90 in qt_message_output () from /opt/qt4/lib/libQtCore.so.4
#8 0x00007f3bd0268b9f in qFatal () from /opt/qt4/lib/libQtCore.so.4
#9 0x00007f3bd0268bdb in qt_assert_x () from /opt/qt4/lib/libQtCore.so.4
#10 0x00007f3bd026d5bf in QReadWriteLock::unlock ()
from /opt/qt4/lib/libQtCore.so.4
#11 0x00007f3bd3a59bdb in Plasma::AbstractRunner::performMatch (this=0x76ac50,
globalContext=@0x85fe90)
at /data/development/sources/kde-trunk/kdebase/workspace/libs/plasma/abstractrunner.cpp:130
#12 0x00007f3bcfdd5a29 in ThreadWeaver::JobRunHelper::runTheJob (
this=0x41604fb0, th=0x824fa0, job=0xc73ac0)
at /data/development/sources/kde-trunk/kdelibs/threadweaver/Weaver/Job.cpp:106
#13 0x00007f3bcfdd5b99 in ThreadWeaver::Job::execute (this=0xc73ac0,
th=0x824fa0)
at /data/development/sources/kde-trunk/kdelibs/threadweaver/Weaver/Job.cpp:135
#14 0x00007f3bcfdd4c0f in ThreadWeaver::ThreadRunHelper::run (this=0x41605050,
parent=0x893ba0, th=0x824fa0)
at /data/development/sources/kde-trunk/kdelibs/threadweaver/Weaver/Thread.cpp:95
#15 0x00007f3bcfdd4d2e in ThreadWeaver::Thread::run (this=0x824fa0)
at /data/development/sources/kde-trunk/kdelibs/threadweaver/Weaver/Thread.cpp:142
#16 0x00007f3bd027485c in QThreadPrivate::start ()
from /opt/qt4/lib/libQtCore.so.4
#17 0x00007f3bcffea040 in start_thread () from /lib64/libpthread.so.0
#18 0x00007f3bcaf540cd in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x41e06950 (LWP 13259)):
#0 0x00007f3bcffeddd9 in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib64/libpthread.so.0
#1 0x00007f3bd0275211 in QWaitConditionPrivate::wait ()
from /opt/qt4/lib/libQtCore.so.4
#2 0x00007f3bd0274d6e in QWaitCondition::wait ()
from /opt/qt4/lib/libQtCore.so.4
#3 0x00007f3bd026db42 in QReadWriteLock::lockForWrite ()
from /opt/qt4/lib/libQtCore.so.4
#4 0x00007f3bd3ab1877 in Plasma::RunnerContext::addMatches (this=0x41e05e90,
term=@0x80, matches=@0x41e05dc0)
at /data/development/sources/kde-trunk/kdebase/workspace/libs/plasma/runnercontext.cpp:199
#5 0x00007f3bc0e20c67 in ServiceRunner::match (this=0x76cdf0,
context=@0x41e05e90)
at /data/development/sources/kde-trunk/kdebase/workspace/plasma/runners/services/servicerunner.cpp:130
#6 0x00007f3bd3a59be8 in Plasma::AbstractRunner::performMatch (this=0x76cdf0,
globalContext=@0x85fe90)
at /data/development/sources/kde-trunk/kdebase/workspace/libs/plasma/abstractrunner.cpp:131
#7 0x00007f3bcfdd5a29 in ThreadWeaver::JobRunHelper::runTheJob (
this=0x41e05fb0, th=0x755890, job=0xc739b0)
at /data/development/sources/kde-trunk/kdelibs/threadweaver/Weaver/Job.cpp:106
#8 0x00007f3bcfdd5b99 in ThreadWeaver::Job::execute (this=0xc739b0,
th=0x755890)
at /data/development/sources/kde-trunk/kdelibs/threadweaver/Weaver/Job.cpp:135
#9 0x00007f3bcfdd4c0f in ThreadWeaver::ThreadRunHelper::run (this=0x41e06050,
parent=0x893ba0, th=0x755890)
at /data/development/sources/kde-trunk/kdelibs/threadweaver/Weaver/Thread.cpp:95
#10 0x00007f3bcfdd4d2e in ThreadWeaver::Thread::run (this=0x755890)
at /data/development/sources/kde-trunk/kdelibs/threadweaver/Weaver/Thread.cpp:142
#11 0x00007f3bd027485c in QThreadPrivate::start ()
from /opt/qt4/lib/libQtCore.so.4
#12 0x00007f3bcffea040 in start_thread () from /lib64/libpthread.so.0
#13 0x00007f3bcaf540cd in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x42607950 (LWP 13260)):
#0 0x00007f3bcffeddd9 in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib64/libpthread.so.0
#1 0x00007f3bd0275211 in QWaitConditionPrivate::wait ()
from /opt/qt4/lib/libQtCore.so.4
#2 0x00007f3bd0274d6e in QWaitCondition::wait ()
from /opt/qt4/lib/libQtCore.so.4
#3 0x00007f3bcfdd343c in ThreadWeaver::WeaverImpl::blockThreadUntilJobsAreBeingAssigned (this=0x893ba0, th=0x781bd0)
at /data/development/sources/kde-trunk/kdelibs/threadweaver/Weaver/WeaverImpl.cpp:365
#4 0x00007f3bcfdd6598 in ThreadWeaver::WorkingHardState::applyForWork (
this=0x7e2df0, th=0x781bd0)
at /data/development/sources/kde-trunk/kdelibs/threadweaver/Weaver/WorkingHardState.cpp:71
#5 0x00007f3bcfdd4c54 in ThreadWeaver::ThreadRunHelper::run (this=0x42607050,
parent=0x893ba0, th=0x781bd0)
at /data/development/sources/kde-trunk/kdelibs/threadweaver/Weaver/Thread.cpp:87
#6 0x00007f3bcfdd4d2e in ThreadWeaver::Thread::run (this=0x781bd0)
at /data/development/sources/kde-trunk/kdelibs/threadweaver/Weaver/Thread.cpp:142
#7 0x00007f3bd027485c in QThreadPrivate::start ()
from /opt/qt4/lib/libQtCore.so.4
#8 0x00007f3bcffea040 in start_thread () from /lib64/libpthread.so.0
#9 0x00007f3bcaf540cd in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x42e08950 (LWP 13261)):
#0 0x00007f3bcaf4d662 in select () from /lib64/libc.so.6
#1 0x00007f3bd03666df in QProcessManager::run ()
from /opt/qt4/lib/libQtCore.so.4
#2 0x00007f3bd027485c in QThreadPrivate::start ()
from /opt/qt4/lib/libQtCore.so.4
#3 0x00007f3bcffea040 in start_thread () from /lib64/libpthread.so.0
#4 0x00007f3bcaf540cd in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7f3bd40c8740 (LWP 25870)):
#0 0x00007f3bcffeddd9 in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib64/libpthread.so.0
#1 0x00007f3bd0275211 in QWaitConditionPrivate::wait ()
from /opt/qt4/lib/libQtCore.so.4
#2 0x00007f3bd0274d6e in QWaitCondition::wait ()
from /opt/qt4/lib/libQtCore.so.4
#3 0x00007f3bd0273a64 in QThread::wait () from /opt/qt4/lib/libQtCore.so.4
#4 0x00007f3bd0365505 in QProcessManager::~QProcessManager ()
from /opt/qt4/lib/libQtCore.so.4
#5 0x00007f3bd03668ca in QGlobalStaticDeleter < QProcessManager>::~QGlobalStaticDeleter () from /opt/qt4/lib/libQtCore.so.4
#6 0x00007f3bcaeb626d in exit () from /lib64/libc.so.6
#7 0x00007f3bcce6ebe0 in qt_xio_errhandler ()
at /data/development/sources/kde-trunk/qt-copy/src/gui/kernel/qapplication_x11.cpp:590
#8 0x00007f3bd2a6c7a0 in KApplication::xioErrhandler (this=0x62d9c0,
dpy=0x622d30)
at /data/development/sources/kde-trunk/kdelibs/kdeui/kernel/kapplication.cpp:418
#9 0x00007f3bcf6acee4 in _XIOError () from /usr/lib64/libX11.so.6
#10 0x00007f3bcf6b3ca0 in _XGetXCBBuffer () from /usr/lib64/libX11.so.6
#11 0x00007f3bcf6b4b62 in _XReply () from /usr/lib64/libX11.so.6
#12 0x00007f3bcf690c87 in XGetImage () from /usr/lib64/libX11.so.6
#13 0x00007f3bccef6761 in QX11PixmapData::toImage (this=0xc73480)
at /data/development/sources/kde-trunk/qt-copy/src/gui/image/qpixmap_x11.cpp:1379
#14 0x00007f3bccef93d8 in QX11PixmapData::mask (this=0xc73480)
at /data/development/sources/kde-trunk/qt-copy/src/gui/image/qpixmap_x11.cpp:1201
#15 0x00007f3bccee2d57 in QPixmap::mask (this=0x7fffdc21af60)
at /data/development/sources/kde-trunk/qt-copy/src/gui/image/qpixmap.cpp:1763
#16 0x00007f3bd3deabeb in StartupId::update_startupid (this=0x704ee0)
at /data/development/sources/kde-trunk/kdebase/workspace/krunner/startupid.cpp:303
#17 0x00007f3bd3decfd8 in StartupId::qt_metacall (this=0x704ee0,
_c=QMetaObject::InvokeMetaMethod, _id=>value optimized out>,
_a=0x7fffdc21b0b0)
at /data/development/build/kde-trunk/kdebase/workspace/krunner/startupid.moc:73
#18 0x00007f3bd03a440f in QMetaObject::activate ()
from /opt/qt4/lib/libQtCore.so.4
#19 0x00007f3bd03a49e1 in QMetaObject::activate ()
from /opt/qt4/lib/libQtCore.so.4
#20 0x00007f3bd03ef898 in QTimer::timeout () from /opt/qt4/lib/libQtCore.so.4
#21 0x00007f3bd03af908 in QTimer::timerEvent ()
from /opt/qt4/lib/libQtCore.so.4
#22 0x00007f3bd03a1d38 in QObject::event () from /opt/qt4/lib/libQtCore.so.4
#23 0x00007f3bccdee1b7 in QApplicationPrivate::notify_helper (this=0x62ec70,
receiver=0x704f28, e=0x7fffdc21ba20)
at /data/development/sources/kde-trunk/qt-copy/src/gui/kernel/qapplication.cpp:3772
#24 0x00007f3bccdee4d3 in QApplication::notify (this=0x62d9c0,
receiver=0x704f28, e=0x7fffdc21ba20)
at /data/development/sources/kde-trunk/qt-copy/src/gui/kernel/qapplication.cpp:3366
#25 0x00007f3bd2a6cd8b in KApplication::notify (this=0x62d9c0,
receiver=0x704f28, event=0x7fffdc21ba20)
at /data/development/sources/kde-trunk/kdelibs/kdeui/kernel/kapplication.cpp:311
#26 0x00007f3bd038af64 in QCoreApplication::notifyInternal ()
from /opt/qt4/lib/libQtCore.so.4
#27 0x00007f3bd0f9b075 in QCoreApplication::sendEvent ()
from /opt/qt4/lib/libQtWebKit.so.4
#28 0x00007f3bd03c54ab in QTimerInfoList::activateTimers ()
from /opt/qt4/lib/libQtCore.so.4
#29 0x00007f3bd03c2be7 in timerSourceDispatch ()
from /opt/qt4/lib/libQtCore.so.4
#30 0x00007f3bc9cf793a in g_main_context_dispatch ()
from /usr/lib64/libglib-2.0.so.0
#31 0x00007f3bc9cfb040 in ?? () from /usr/lib64/libglib-2.0.so.0
#32 0x00007f3bc9cfb1dc in g_main_context_iteration ()
from /usr/lib64/libglib-2.0.so.0
#33 0x00007f3bd03c1c0e in QEventDispatcherGlib::processEvents ()
from /opt/qt4/lib/libQtCore.so.4
#34 0x00007f3bcceae3bb in QGuiEventDispatcherGlib::processEvents (
this=0x616810, flags={i = -601768736})
at /data/development/sources/kde-trunk/qt-copy/src/gui/kernel/qguieventdispatcher_glib.cpp:204
#35 0x00007f3bd03875f7 in QEventLoop::processEvents ()
from /opt/qt4/lib/libQtCore.so.4
#36 0x00007f3bd03877f4 in QEventLoop::exec () from /opt/qt4/lib/libQtCore.so.4
#37 0x00007f3bd038b9e5 in QCoreApplication::exec ()
from /opt/qt4/lib/libQtCore.so.4
#38 0x00007f3bccdedf14 in QApplication::exec ()
at /data/development/sources/kde-trunk/qt-copy/src/gui/kernel/qapplication.cpp:3304
#39 0x00007f3bd3de2955 in kdemain (argc=1, argv=0x7fffdc21c058)
at /data/development/sources/kde-trunk/kdebase/workspace/krunner/main.cpp:53
#40 0x00007f3bcae9f436 in __libc_start_main () from /lib64/libc.so.6
#41 0x0000000000400939 in _start ()
#0 0x00007f3bcffeddd9 in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib64/libpthread.so.0

Rafael said...

A little OT:

Will krunner be able to do things like Enso:

http://humanized.com/weblog/2007/05/18/die_desktop_die/

??

I think KDE would be the best platform for something like that and it seems krunner is moving in that direction.

notagod said...

@andras:
ugh, looks like this line is triggering

Q_ASSERT_X(d->accessCount != 0, "QReadWriteLock::unlock()", "Cannot unlock an unlocked lock");

@aseigo:
One thing that's always annoyed me just a touch, no save/email buttons on backtrace crashes. Would mean that a lot more backtraces get reported. I know when I'm in the midst of trying to get something done, the last thing I want to do is interrupt my flow to start up an email client + web browser, to work out where to send the backtrace to.

Aaron J. Seigo said...

@Rafael: "it seems krunner is moving in that direction."

yep...

@notagod: "no save/email buttons on backtrace crashes"

there is a save button (and a copy button) and there should be a clickable link in the text at the top to take you to either a bug tracker or email address.

Anonymous said...

Just a quick suggestion. How about when there's a crash, the backtrace is copied automatically to the clipboard?

That way, if you accidentally close the backtrace you can still paste it somewhere.

blueget said...

Autocopy for the Backtrace is a bad idea, especially for people not using Klipper. There should be logfiles where all generated backtraces are stored until system reboot, or until a specific limit is reached (could be number of backtraces or something like that)

chani said...

generating backtraces is not always a simple matter. on my poor old laptop, I pretty much have to walk away for 5-10 minutes and find something else to do while the backtrace is generating. and pray that I don't run out of swap.

that's why it's not just automatically dealt with... although having it save to a file when it's done generating might not be a bad idea.

yman said...

It's things like that that make me wish there was an "Undo Close Window" feature...

rafael said...

Thank you, Aaron.

Oh, btw:

http://streaming.linux-magazin.de/archiv_linuxtag08.htm

There is a video of your Linuxtag keynote. Interesting stuff!

notagod said...

@aaron
Excellent. Haven't had a good chance to check out kde4 yet, we're still opensuse10.2 at work.

A bit of a headsup, the internal dev list (@trolltech) is having a discussion on the validity of asserts in code. Some are arguing for the abolition of all asserts in code that can be reached via user input ;) This could be... interesting to see where it leads. ;) (this is re your krunner crash bug originating from an assert from an admittedly wrong, but acceptably wrong call of unlock() on an unlocked lock)