On Tue 05-02-13 20:05:48, Steven Rostedt wrote:
> [ I sent this in a reply to another thread, but wanted a bit more attention to it ]
>
> To prevent deadlocks with doing a printk inside the scheduler,
> printk_sched() was created. The issue is that printk has a console_sem
> that it can grab and release. The release does a wake up if there's a
> task pending on the sem, and this wake up grabs the rq locks that is
> held in the scheduler. This leads to a possible deadlock if the wake up
> uses the same rq as the one with the rq lock held already.
>
> What printk_sched() does is to save the printk write in a per cpu buffer
> and sets the PRINTK_PENDING_SCHED flag. On a timer tick, if this flag is
> set, the printk() is done against the buffer.
>
> There's a couple of issues with this approach.
>
> 1) If two printk_sched()s are called before the tick, the second one
> will overwrite the first one.
>
> 2) The temporary buffer is 512 bytes and is per cpu. This is a quite a
> bit of space wasted for something that is seldom used.
>
> In order to remove this, the printk_sched() can instead use the printk
> buffer instead, and delay the console_trylock()/console_unlock() to the
> tick.
>
> Because printk_sched() would then be taking the logbuf_lock, the
> logbuf_lock must not be held while doing anything that may call into the
> scheduler functions, which includes wake ups. Unfortunately, printk()
> also has a console_sem that it uses, and on release, the
> up(&console_sem) may do a wake up of any pending waiters. This must be
> avoided while holding the logbuf_lock.
>
> Luckily, there's not many places that do the unlock, or hold the
> logbuf_lock. By moving things around a little, the console_sem can be
> released without ever holding the logbuf_lock, and we can safely have
> printk_sched() use the printk buffer directly.
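
For reference, the per-CPU scheme described above can be modelled roughly in user space as below. This is only an illustrative sketch, not kernel code: the names model_printk_sched(), model_printk_tick(), struct cpu_state and log_out are made up for the example, and the real implementation uses per-CPU variables and the PRINTK_PENDING_SCHED bit rather than a plain array and an int flag. It is meant to show issue 1), where a second message stored before the tick replaces the first.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NR_CPUS        2     /* hypothetical CPU count for the model */
#define SCHED_BUF_SIZE 512   /* per-CPU buffer size mentioned in the quoted mail */

/* Hypothetical stand-in for the per-CPU buffer plus pending flag. */
struct cpu_state {
	char buf[SCHED_BUF_SIZE];
	int pending;             /* models PRINTK_PENDING_SCHED */
};

struct cpu_state cpus[NR_CPUS];
char log_out[4096];          /* models what finally reaches the log */

/* Store the message for later; nothing is printed from this context. */
void model_printk_sched(int cpu, const char *msg)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	/* Overwrites whatever was stored before: this is issue 1). */
	snprintf(cpus[cpu].buf, sizeof(cpus[cpu].buf), "%s", msg);
	cpus[cpu].pending = 1;
}

/* Called from the (modelled) timer tick: flush the buffer if flagged. */
void model_printk_tick(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (!cpus[cpu].pending)
		return;
	cpus[cpu].pending = 0;
	strncat(log_out, cpus[cpu].buf,
		sizeof(log_out) - strlen(log_out) - 1);
}
```

Calling model_printk_sched(0, ...) twice and then model_printk_tick(0) leaves only the second message in log_out, which is the loss that storing directly into the printk buffer avoids.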

  So after quite some experiments and some hair tearing I have a patch that
uses PRINTK_PENDING_OUTPUT and makes the machine survive my heavy-printk
test.

The first patch I attach is actually a small improvement of your patch,
which I think can be folded into it. I was also wondering whether we still
need printk_needs_cpu(). I left it in since I don't know of a better way
of keeping at least one CPU ticking. But maybe others do?

The second patch then makes use of PRINTK_PENDING_OUTPUT to handle the
printing when console_unlock() would take too long. If you wonder whether
the last_printing_cpu in printk_tick() is necessary - it is... Without it
we keep printing on one CPU and the machine complains, loses drives,
etc... (I guess I should add this comment somewhere in the code).

Anyway, what do you guys think about this version?

								Honza
-- 
Jan Kara
SUSE Labs, CR