* On-stack work item completion race? (was Re: XFS crash?)
  [not found] ` <CANGgnMZ8OwzfBj5m9H7c6q2yahGhU7oFZLsJfVxnWoqZExkZmQ@mail.gmail.com>
@ 2014-06-24  3:02 ` Dave Chinner
  2014-06-24  3:25   ` Tejun Heo
  0 siblings, 1 reply; 8+ messages in thread

From: Dave Chinner @ 2014-06-24 3:02 UTC (permalink / raw)
To: Austin Schuh; +Cc: xfs, linux-kernel, Tejun Heo

[Adding Tejun and lkml to the cc list]

On Mon, Jun 23, 2014 at 01:05:54PM -0700, Austin Schuh wrote:
> I found 1 bug in XFS which I fixed, and I've uncovered something else
> that I'm not completely sure how to fix.
>
> In xfs_bmapi_allocate, you create a completion, and use that to wait
> until the work has finished.  Occasionally, I'm seeing a case where I
> get a context switch after the completion has been completed, but
> before the workqueue has finished doing its internal book-keeping.
> This results in the work being deleted before the workqueue is done
> using it, corrupting the internal data structures.  I fixed it by
> waiting using flush_work() and removing the completion entirely.
>
> --- a/fs/xfs/xfs_bmap_util.c	2014-06-23 12:59:10.008678410 -0700
> +++ b/fs/xfs/xfs_bmap_util.c	2014-06-23 12:59:14.116678239 -0700
> @@ -263,7 +263,6 @@
>  	current_set_flags_nested(&pflags, PF_FSTRANS);
>
>  	args->result = __xfs_bmapi_allocate(args);
> -	complete(args->done);
>
>  	current_restore_flags_nested(&pflags, PF_FSTRANS);
>  }
> @@ -277,16 +276,13 @@
>  xfs_bmapi_allocate(
>  	struct xfs_bmalloca	*args)
>  {
> -	DECLARE_COMPLETION_ONSTACK(done);
> -
>  	if (!args->stack_switch)
>  		return __xfs_bmapi_allocate(args);
>
> -	args->done = &done;
>  	INIT_WORK_ONSTACK(&args->work, xfs_bmapi_allocate_worker);
>  	queue_work(xfs_alloc_wq, &args->work);
> -	wait_for_completion(&done);
> +	flush_work(&args->work);
>  	destroy_work_on_stack(&args->work);
>  	return args->result;
>  }
> --- a/fs/xfs/xfs_bmap_util.h	2014-06-23 12:59:10.008678410 -0700
> +++ b/fs/xfs/xfs_bmap_util.h	2014-06-23 12:59:11.708678340 -0700
> @@ -57,7 +57,6 @@
>  	char			conv;	/* overwriting unwritten extents */
>  	char			stack_switch;
>  	int			flags;
> -	struct completion	*done;
>  	struct work_struct	work;
>  	int			result;
>  };

Ok, that's a surprise.  However, I can't see how using flush_work()
solves that underlying context switch problem, because it's
implemented the same way:

bool flush_work(struct work_struct *work)
{
	struct wq_barrier barr;

	lock_map_acquire(&work->lockdep_map);
	lock_map_release(&work->lockdep_map);

	if (start_flush_work(work, &barr)) {
		wait_for_completion(&barr.done);
		destroy_work_on_stack(&barr.work);
		return true;
	} else {
		return false;
	}
}

start_flush_work() is effectively a special queue_work()
implementation, so if it's not safe to call complete() from the
workqueue as the above patch implies then this code has the same
problem.

Tejun - is this "do it yourself completion" a known issue w.r.t.
workqueues?  I can't find any documentation that says "don't do
that" so...?

A quick grep also shows up the same queue_work/wait_for_completion
pattern in arch/x86/kernel/hpet.c, drivers/md/dm-thin.c,
fs/fs-writeback.c, drivers/block/drbd/drbd_main.c....

> I enabled event tracing (and added a new event which lists the number
> of workers running in a queue whenever that is changed).
>
> To me, it looks like work is scheduled from irq/44-ahci-273 that will
> acquire an inode lock.  scp-3986 then acquires the lock, and then goes
> and schedules work.  That work is then scheduled behind the work from
> irq/44-ahci-273 in the same pool.  The first work blocks waiting on
> the lock, and scp-3986 won't finish and release that lock until the
> second work gets run.

IOWs, scp takes an inode lock and queues work to the xfs_alloc_wq,
then schedules.  Then a kworker runs an xfs-data work item, which
tries to take the inode lock and blocks.

As I understand it, what then happens is that the workqueue code
grabs another kworker thread and runs the next work item in its
queue.  IOWs, work items can block, but doing that does not prevent
execution of other work items queued on other work queues or even on
the same work queue.  Tejun, did I get that correct?

Hence the work on the xfs-data queue will block until another
kworker processes the item on the xfs-alloc-wq, which means progress
is made and the inode gets unlocked.  Then the kworker for the work
on the xfs-data queue will get the lock, complete its work, and
everything has resolved itself.

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
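The "do it yourself completion" pattern in question reduces to the following kernel-style sketch, condensed from the XFS code above (the `alloc_ctx`, `alloc_worker()` and `run_alloc()` names are illustrative, not from the actual source):

```c
/* On-stack work item waited on via a manual on-stack completion.  The
 * worker signals the completion itself; the question in this thread is
 * whether the waiter may then tear down the work item before the
 * workqueue core has finished its own book-keeping. */
struct alloc_ctx {
	struct work_struct	work;
	struct completion	*done;
	int			result;
};

static void alloc_worker(struct work_struct *work)
{
	struct alloc_ctx *ctx = container_of(work, struct alloc_ctx, work);

	ctx->result = 0;	/* ... the real allocation work goes here ... */
	complete(ctx->done);	/* waiter may now proceed and free ctx->work */
}

static int run_alloc(struct alloc_ctx *ctx)
{
	DECLARE_COMPLETION_ONSTACK(done);

	ctx->done = &done;
	INIT_WORK_ONSTACK(&ctx->work, alloc_worker);
	queue_work(system_wq, &ctx->work);
	wait_for_completion(&done);	/* the patch replaces this with flush_work() */
	destroy_work_on_stack(&ctx->work);
	return ctx->result;
}
```

Both waiting variants hinge on the same property: whether the workqueue core is completely done with the work item by the time the work function starts executing.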
* Re: On-stack work item completion race? (was Re: XFS crash?)
  2014-06-24  3:02 ` On-stack work item completion race? (was Re: XFS crash?) Dave Chinner
@ 2014-06-24  3:25 ` Tejun Heo
  2014-06-25  3:16   ` Austin Schuh
                      ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread

From: Tejun Heo @ 2014-06-24 3:25 UTC (permalink / raw)
To: Dave Chinner; +Cc: Austin Schuh, xfs, linux-kernel

Hello,

On Tue, Jun 24, 2014 at 01:02:40PM +1000, Dave Chinner wrote:
> start_flush_work() is effectively a special queue_work()
> implementation, so if it's not safe to call complete() from the
> workqueue as the above patch implies then this code has the same
> problem.
>
> Tejun - is this "do it yourself completion" a known issue w.r.t.
> workqueues?  I can't find any documentation that says "don't do
> that" so...?

It's more complex than using flush_work() but there's nothing
fundamentally wrong with it.  A work item is completely unlinked
before its execution starts.  It's safe to free the work item once its
work function has started, whether through kfree() or returning.

One difference between flush_work() and manual completion would be
that if the work item gets requeued, flush_work() would wait for the
queued one to finish, but given the work item is one-shot this doesn't
make any difference.

I can see no reason why manual completion would behave differently
from flush_work() in this case.

> As I understand it, what then happens is that the workqueue code
> grabs another kworker thread and runs the next work item in its
> queue.  IOWs, work items can block, but doing that does not prevent
> execution of other work items queued on other work queues or even on
> the same work queue.  Tejun, did I get that correct?

Yes, as long as the workqueue is under its @max_active limit and has
access to an existing kworker or can create a new one, it'll start
executing the next work item immediately; however, the guaranteed
level of concurrency is 1 even for WQ_RECLAIM workqueues.  IOW, the
work items queued on a workqueue must be able to make forward progress
with a single work item if the work items are being depended upon for
memory reclaim.

> Hence the work on the xfs-data queue will block until another
> kworker processes the item on the xfs-alloc-wq which means progress
> is made and the inode gets unlocked.  Then the kworker for the work
> on the xfs-data queue will get the lock, complete its work and
> everything has resolved itself.

As long as a WQ_RECLAIM workqueue doesn't depend upon itself,
forward-progress is guaranteed.

Thanks.

--
tejun
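The self-dependency Tejun warns against can be made concrete with a sketch (names are illustrative; assume `wq` was created with WQ_MEM_RECLAIM).  When only the rescuer is available under memory pressure, a work item that queues and then waits on another item on its own workqueue can wait forever:

```c
static struct workqueue_struct *wq;	/* a WQ_MEM_RECLAIM workqueue */

static void inner_fn(struct work_struct *work)
{
	/* If the rescuer is the only execution context and it is busy
	 * running outer_fn() below, this never gets a chance to run. */
}

static void outer_fn(struct work_struct *work)
{
	struct work_struct inner;

	INIT_WORK_ONSTACK(&inner, inner_fn);
	queue_work(wq, &inner);		/* queued on our own workqueue */
	flush_work(&inner);		/* deadlocks at concurrency level 1 */
	destroy_work_on_stack(&inner);
}
```

With spare kworkers around this happens to work, which is why such self-dependencies typically only surface under reclaim, when the guaranteed concurrency drops to one.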
* Re: On-stack work item completion race? (was Re: XFS crash?)
  2014-06-24  3:25 ` Tejun Heo
@ 2014-06-25  3:16 ` Austin Schuh
  2014-06-25  5:56 ` Dave Chinner
  [not found] ` <CANGgnMY5cBSXOayDbbOvqNXEG8e6sAYEjpWEQO2X8XPxx2R5-Q@mail.gmail.com>
  2 siblings, 0 replies; 8+ messages in thread

From: Austin Schuh @ 2014-06-25 3:16 UTC (permalink / raw)
To: Tejun Heo; +Cc: Dave Chinner, xfs, linux-kernel, Thomas Gleixner

[Adding tglx to the cc.  Sorry for any double sends]

On Mon, Jun 23, 2014 at 8:25 PM, Tejun Heo <tj@kernel.org> wrote:
> Hello,
>
> On Tue, Jun 24, 2014 at 01:02:40PM +1000, Dave Chinner wrote:
>> start_flush_work() is effectively a special queue_work()
>> implementation, so if it's not safe to call complete() from the
>> workqueue as the above patch implies then this code has the same
>> problem.
>>
>> Tejun - is this "do it yourself completion" a known issue w.r.t.
>> workqueues?  I can't find any documentation that says "don't do
>> that" so...?
>
> It's more complex than using flush_work() but there's nothing
> fundamentally wrong with it.  A work item is completely unlinked
> before its execution starts.  It's safe to free the work item once its
> work function has started, whether through kfree() or returning.
>
> One difference between flush_work() and manual completion would be
> that if the work item gets requeued, flush_work() would wait for the
> queued one to finish, but given the work item is one-shot this doesn't
> make any difference.
>
> I can see no reason why manual completion would behave differently
> from flush_work() in this case.

I went looking for a short trace in my original log to show the
problem, and instead found evidence of the second problem.  I still
like the shorter flush_work() call, but that's not my call.

I did find this comment in the process_one_work() function.  Sounds
like this could be better documented.

	/*
	 * It is permissible to free the struct work_struct from
	 * inside the function that is called from it, this we need to
	 * take into account for lockdep too.  To avoid bogus "held
	 * lock freed" warnings as well as problems when looking into
	 * work->lockdep_map, make a copy and use that here.
	 */

>> As I understand it, what then happens is that the workqueue code
>> grabs another kworker thread and runs the next work item in its
>> queue.  IOWs, work items can block, but doing that does not prevent
>> execution of other work items queued on other work queues or even on
>> the same work queue.  Tejun, did I get that correct?
>
> Yes, as long as the workqueue is under its @max_active limit and has
> access to an existing kworker or can create a new one, it'll start
> executing the next work item immediately; however, the guaranteed
> level of concurrency is 1 even for WQ_RECLAIM workqueues.  IOW, the
> work items queued on a workqueue must be able to make forward progress
> with a single work item if the work items are being depended upon for
> memory reclaim.

I mentioned this to Dave when I initially started this thread, but I'm
running a RT patched kernel.  I don't see forwards progress.  The
workqueue is only used in 1 spot (xfs_alloc_wq), and has
WQ_MEM_RECLAIM set.  I started with a 10,000,000 line trace and pulled
out the entries which referenced the workqueue and pool leading up to
the lockup.

   scp-4110        [001] ....1.3  101.184640: workqueue_queue_work: work struct=ffff8803c782d900 function=xfs_bmapi_allocate_worker workqueue=ffff88040c9f5a00 pool=ffff88042dae3fc0 req_cpu=512 cpu=1
   scp-4110        [001] ....1.3  101.184641: workqueue_activate_work: work struct ffff8803c782d900
   kworker/1:1-89  [001] ....1.1  101.184883: workqueue_nr_running: pool=ffff88042dae3fc0 nr_running=1
   kworker/1:1-89  [001] ....1..  101.184885: workqueue_execute_start: work struct ffff8803c782d900: function xfs_bmapi_allocate_worker
   irq/44-ahci-275 [001] ....1.5  101.185086: workqueue_queue_work: work struct=ffff8800ae3f01e0 function=xfs_end_io workqueue=ffff88040b459a00 pool=ffff88042dae3fc0 req_cpu=512 cpu=1
   irq/44-ahci-275 [001] ....1.5  101.185088: workqueue_activate_work: work struct ffff8800ae3f01e0
   scp-4110        [001] ....1..  101.187911: xfs_ilock: dev 8:5 ino 0xf9e flags ILOCK_EXCL caller xfs_iomap_write_allocate
   scp-4110        [001] ....1.3  101.187914: workqueue_queue_work: work struct=ffff8803c782d900 function=xfs_bmapi_allocate_worker workqueue=ffff88040c9f5a00 pool=ffff88042dae3fc0 req_cpu=512 cpu=1
   scp-4110        [001] ....1.3  101.187915: workqueue_activate_work: work struct ffff8803c782d900
   scp-4110        [001] ....1.4  101.187926: workqueue_queue_work: work struct=ffff88040a6a01c8 function=blk_delay_work workqueue=ffff88040c9f4a00 pool=ffff88042dae44c0 req_cpu=512 cpu=1
   scp-4110        [001] ....1.4  101.187926: workqueue_activate_work: work struct ffff88040a6a01c8
   kworker/1:1-89  [001] ....1..  101.187998: workqueue_execute_end: work struct ffff8803c782d900
   kworker/1:1-89  [001] ....1..  101.188000: workqueue_execute_start: work struct ffff8800ae3f01e0: function xfs_end_io
   kworker/1:1-89  [001] ....1..  101.188001: xfs_ilock: dev 8:5 ino 0xf9e flags ILOCK_EXCL caller xfs_setfilesize

The last work never runs.  Hangcheck triggers shortly after.

I spent some more time debugging, and I am seeing that
tsk_is_pi_blocked() is returning 1 in sched_submit_work()
(kernel/sched/core.c).  It looks like sched_submit_work() is not
detecting that the worker task is blocked on a mutex.

This looks very RT related right now.  I see 2 problems from my
reading (and experimentation).  The first is that the second worker
isn't getting started because tsk_is_pi_blocked() is reporting that
the task isn't blocked on a mutex.  The second is that even if another
worker needs to be scheduled because the original worker is blocked on
a mutex, we need the pool lock to schedule another worker.  The pool
lock can be acquired by any CPU, and is a spin_lock.  If we end up on
the slow path for the pool lock, we hit
BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)) in
task_blocks_on_rt_mutex() in rtmutex.c.  I'm not sure how to deal with
either problem.

Thomas, any ideas?

Hopefully I've got all my facts right...  Debugging kernel code is a
whole new world from userspace code.

Thanks!
  Austin
* Re: On-stack work item completion race? (was Re: XFS crash?)
  2014-06-24  3:25 ` Tejun Heo
  2014-06-25  3:16 ` Austin Schuh
@ 2014-06-25  5:56 ` Dave Chinner
  2014-06-25 14:18   ` Tejun Heo
  [not found] ` <CANGgnMY5cBSXOayDbbOvqNXEG8e6sAYEjpWEQO2X8XPxx2R5-Q@mail.gmail.com>
  2 siblings, 1 reply; 8+ messages in thread

From: Dave Chinner @ 2014-06-25 5:56 UTC (permalink / raw)
To: Tejun Heo; +Cc: Austin Schuh, xfs, linux-kernel

On Mon, Jun 23, 2014 at 11:25:21PM -0400, Tejun Heo wrote:
> Hello,
>
> On Tue, Jun 24, 2014 at 01:02:40PM +1000, Dave Chinner wrote:
> > As I understand it, what then happens is that the workqueue code
> > grabs another kworker thread and runs the next work item in its
> > queue.  IOWs, work items can block, but doing that does not prevent
> > execution of other work items queued on other work queues or even on
> > the same work queue.  Tejun, did I get that correct?
>
> Yes, as long as the workqueue is under its @max_active limit and has
> access to an existing kworker or can create a new one, it'll start
> executing the next work item immediately; however, the guaranteed
> level of concurrency is 1 even for WQ_RECLAIM workqueues.  IOW, the
> work items queued on a workqueue must be able to make forward progress
> with a single work item if the work items are being depended upon for
> memory reclaim.

Hmmm - that's different from my understanding of what the original
behaviour WQ_MEM_RECLAIM gave us, i.e. that WQ_MEM_RECLAIM
workqueues had a rescuer thread created to guarantee that the
*workqueue* could make forward progress executing work in a
reclaim context.

The concept that the *work being executed* needs to guarantee
forwards progress is something I've never heard stated before.
That worries me a lot, especially with all the memory reclaim
problems that have surfaced in the past couple of months....

> As long as a WQ_RECLAIM workqueue doesn't depend upon itself,
> forward-progress is guaranteed.

I can't find any documentation that actually defines what
WQ_MEM_RECLAIM means, so I can't tell when or how this requirement
came about.  If it's true, then I suspect most of the WQ_MEM_RECLAIM
workqueues in filesystems violate it.  Can you point me at
documentation/commits/code describing the constraints of
WQ_MEM_RECLAIM and the reasons for it?

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
* Re: On-stack work item completion race? (was Re: XFS crash?)
  2014-06-25  5:56 ` Dave Chinner
@ 2014-06-25 14:18 ` Tejun Heo
  2014-06-25 22:08   ` Dave Chinner
  0 siblings, 1 reply; 8+ messages in thread

From: Tejun Heo @ 2014-06-25 14:18 UTC (permalink / raw)
To: Dave Chinner; +Cc: Austin Schuh, xfs, linux-kernel

Hello, Dave.

On Wed, Jun 25, 2014 at 03:56:41PM +1000, Dave Chinner wrote:
> Hmmm - that's different from my understanding of what the original
> behaviour WQ_MEM_RECLAIM gave us, i.e. that WQ_MEM_RECLAIM
> workqueues had a rescuer thread created to guarantee that the
> *workqueue* could make forward progress executing work in a
> reclaim context.

From Documentation/workqueue.txt:

  WQ_MEM_RECLAIM

	All wq which might be used in the memory reclaim paths _MUST_
	have this flag set.  The wq is guaranteed to have at least one
	execution context regardless of memory pressure.

So, all that's guaranteed is that the workqueue has at least one
worker executing its work items.  If that one worker is serving a work
item which can't make forward progress, the workqueue is not
guaranteed to make forward progress.

> The concept that the *work being executed* needs to guarantee
> forwards progress is something I've never heard stated before.
> That worries me a lot, especially with all the memory reclaim
> problems that have surfaced in the past couple of months....

I'd love to provide that, but guaranteeing that at least one work item
is always being executed requires unlimited task allocation (the ones
which get blocked gotta store their context somewhere).

> > As long as a WQ_RECLAIM workqueue doesn't depend upon itself,
> > forward-progress is guaranteed.
>
> I can't find any documentation that actually defines what
> WQ_MEM_RECLAIM means, so I can't tell when or how this requirement
> came about.  If it's true, then I suspect most of the WQ_MEM_RECLAIM
> workqueues in filesystems violate it.  Can you point me at
> documentation/commits/code describing the constraints of
> WQ_MEM_RECLAIM and the reasons for it?

Documentation/workqueue.txt should be it, but maybe we should be more
explicit.  The behaviour maintains what the pre-concurrency-management
workqueue provided with static per-workqueue workers.  Each workqueue
reserved its workers (either one per cpu or one globally) and it only
supported a single level of concurrency on each CPU.  WQ_MEM_RECLAIM
provides an equivalent amount of forward-progress guarantee, and all
the existing users shouldn't have issues on this front.  If we have
grown incorrect usages since then, we need to fix them.

Thanks.

--
tejun
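The guarantee Tejun describes is requested at workqueue creation time.  A minimal sketch of how such a workqueue is set up (the flag usage mirrors what XFS does for xfs_alloc_wq, but treat the exact call here as illustrative):

```c
static struct workqueue_struct *xfs_alloc_wq;

static int init_alloc_wq(void)
{
	/* WQ_MEM_RECLAIM pre-allocates a rescuer thread, so at least one
	 * work item can always execute even when memory pressure prevents
	 * forking new kworkers.  It does not provide more than one
	 * guaranteed execution context, which is the point of this
	 * subthread. */
	xfs_alloc_wq = alloc_workqueue("xfsalloc", WQ_MEM_RECLAIM, 0);
	return xfs_alloc_wq ? 0 : -ENOMEM;
}
```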
* Re: On-stack work item completion race? (was Re: XFS crash?)
  2014-06-25 14:18 ` Tejun Heo
@ 2014-06-25 22:08 ` Dave Chinner
  0 siblings, 0 replies; 8+ messages in thread

From: Dave Chinner @ 2014-06-25 22:08 UTC (permalink / raw)
To: Tejun Heo; +Cc: Austin Schuh, xfs, linux-kernel

On Wed, Jun 25, 2014 at 10:18:36AM -0400, Tejun Heo wrote:
> Hello, Dave.
>
> On Wed, Jun 25, 2014 at 03:56:41PM +1000, Dave Chinner wrote:
> > Hmmm - that's different from my understanding of what the original
> > behaviour WQ_MEM_RECLAIM gave us, i.e. that WQ_MEM_RECLAIM
> > workqueues had a rescuer thread created to guarantee that the
> > *workqueue* could make forward progress executing work in a
> > reclaim context.
>
> From Documentation/workqueue.txt:
>
>   WQ_MEM_RECLAIM
>
> 	All wq which might be used in the memory reclaim paths _MUST_
> 	have this flag set.  The wq is guaranteed to have at least one
> 	execution context regardless of memory pressure.
>
> So, all that's guaranteed is that the workqueue has at least one
> worker executing its work items.  If that one worker is serving a work
> item which can't make forward progress, the workqueue is not
> guaranteed to make forward progress.

Adding that to the docco might be useful ;)

> > > As long as a WQ_RECLAIM workqueue doesn't depend upon itself,
> > > forward-progress is guaranteed.
> >
> > I can't find any documentation that actually defines what
> > WQ_MEM_RECLAIM means, so I can't tell when or how this requirement
> > came about.  If it's true, then I suspect most of the WQ_MEM_RECLAIM
> > workqueues in filesystems violate it.  Can you point me at
> > documentation/commits/code describing the constraints of
> > WQ_MEM_RECLAIM and the reasons for it?
>
> Documentation/workqueue.txt should be it, but maybe we should be more
> explicit.  The behaviour maintains what the pre-concurrency-management
> workqueue provided with static per-workqueue workers.  Each workqueue
> reserved its workers (either one per cpu or one globally) and it only
> supported a single level of concurrency on each CPU.  WQ_MEM_RECLAIM
> provides an equivalent amount of forward-progress guarantee, and all
> the existing users shouldn't have issues on this front.  If we have
> grown incorrect usages since then, we need to fix them.

Ok, so it hasn't changed.  We're still using them like we used the
original workqueues, and we never, ever provided a guarantee of
forwards progress for them, either.  So if the workqueues haven't
changed, and we haven't changed how we use workqueues, then something
else is causing all our recent problems....

Thanks for the clarification, Tejun!

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
[parent not found: <CANGgnMY5cBSXOayDbbOvqNXEG8e6sAYEjpWEQO2X8XPxx2R5-Q@mail.gmail.com>]
* Re: On-stack work item completion race? (was Re: XFS crash?)
  [not found] ` <CANGgnMY5cBSXOayDbbOvqNXEG8e6sAYEjpWEQO2X8XPxx2R5-Q@mail.gmail.com>
@ 2014-06-25 14:00 ` Tejun Heo
  2014-06-25 17:04   ` Austin Schuh
  0 siblings, 1 reply; 8+ messages in thread

From: Tejun Heo @ 2014-06-25 14:00 UTC (permalink / raw)
To: Austin Schuh; +Cc: Dave Chinner, xfs, Thomas Gleixner, linux-kernel

Hello,

On Tue, Jun 24, 2014 at 08:05:07PM -0700, Austin Schuh wrote:
> > I can see no reason why manual completion would behave differently
> > from flush_work() in this case.
>
> I went looking for a short trace in my original log to show the
> problem, and instead found evidence of the second problem.  I still
> like the shorter flush_work() call, but that's not my call.

So, are you saying that the original issue you reported isn't actually
a problem?  But didn't you imply that changing the waiting mechanism
fixed a deadlock, or was that a false positive?

> I did find this comment in the process_one_work() function.  Sounds
> like this could be better documented.

Yeah, we prolly should beef up Documentation/workqueue.txt with
information on general usage.

> I spent some more time debugging, and I am seeing that
> tsk_is_pi_blocked() is returning 1 in sched_submit_work()
> (kernel/sched/core.c).  It looks like sched_submit_work() is not
> detecting that the worker task is blocked on a mutex.

The function unplugs the block layer and doesn't have much to do with
workqueues, although it has "_work" in its name.

> This looks very RT related right now.  I see 2 problems from my
> reading (and experimentation).  The first is that the second worker
> isn't getting started because tsk_is_pi_blocked() is reporting that
> the task isn't blocked on a mutex.  The second is that even if another
> worker needs to be scheduled because the original worker is blocked on
> a mutex, we need the pool lock to schedule another worker.  The pool
> lock can be acquired by any CPU, and is a spin_lock.  If we end up on
> the slow path for the pool lock, we hit
> BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)) in
> task_blocks_on_rt_mutex() in rtmutex.c.  I'm not sure how to deal with
> either problem.
>
> Hopefully I've got all my facts right...  Debugging kernel code is a
> whole new world from userspace code.

I don't have much idea how the RT kernel works either.  Can you
reproduce the issues that you see on mainline?

Thanks.

--
tejun
* Re: On-stack work item completion race? (was Re: XFS crash?)
  2014-06-25 14:00 ` Tejun Heo
@ 2014-06-25 17:04 ` Austin Schuh
  0 siblings, 0 replies; 8+ messages in thread

From: Austin Schuh @ 2014-06-25 17:04 UTC (permalink / raw)
To: Tejun Heo; +Cc: Dave Chinner, xfs, Thomas Gleixner, linux-kernel

On Wed, Jun 25, 2014 at 7:00 AM, Tejun Heo <tj@kernel.org> wrote:
> Hello,
>
> On Tue, Jun 24, 2014 at 08:05:07PM -0700, Austin Schuh wrote:
> > > I can see no reason why manual completion would behave differently
> > > from flush_work() in this case.
> >
> > I went looking for a short trace in my original log to show the
> > problem, and instead found evidence of the second problem.  I still
> > like the shorter flush_work() call, but that's not my call.
>
> So, are you saying that the original issue you reported isn't actually
> a problem?  But didn't you imply that changing the waiting mechanism
> fixed a deadlock, or was that a false positive?

Correct, that was a false positive.  Sorry for the noise.

> > I spent some more time debugging, and I am seeing that
> > tsk_is_pi_blocked() is returning 1 in sched_submit_work()
> > (kernel/sched/core.c).  It looks like sched_submit_work() is not
> > detecting that the worker task is blocked on a mutex.
>
> The function unplugs the block layer and doesn't have much to do with
> workqueues, although it has "_work" in its name.

Thomas moved

+	if (tsk->flags & PF_WQ_WORKER)
+		wq_worker_sleeping(tsk);

into sched_submit_work() as part of the RT patchset.

> > This looks very RT related right now.  I see 2 problems from my
> > reading (and experimentation).  The first is that the second worker
> > isn't getting started because tsk_is_pi_blocked() is reporting that
> > the task isn't blocked on a mutex.  The second is that even if
> > another worker needs to be scheduled because the original worker is
> > blocked on a mutex, we need the pool lock to schedule another worker.
> > The pool lock can be acquired by any CPU, and is a spin_lock.  If we
> > end up on the slow path for the pool lock, we hit
> > BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)) in
> > task_blocks_on_rt_mutex() in rtmutex.c.  I'm not sure how to deal
> > with either problem.
> >
> > Hopefully I've got all my facts right...  Debugging kernel code is a
> > whole new world from userspace code.
>
> I don't have much idea how the RT kernel works either.  Can you
> reproduce the issues that you see on mainline?

I'll see what I can do.

Thanks!
  Austin
end of thread, other threads:[~2014-06-25 22:08 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20140305233551.GK6851@dastard>
     [not found] ` <CANGgnMb=2dYGQO4K36pQ9LEb8E4rT6S_VskLF+n=ndd0_kJr_g@mail.gmail.com>
     [not found]   ` <CANGgnMa80WwQ8zSkL52yYegmQURVQeZiBFv41=FQXMZJ_NaEDw@mail.gmail.com>
     [not found]     ` <20140513034647.GA5421@dastard>
     [not found]       ` <CANGgnMZ0q9uE3NHj2i0SBK1d0vdKLx7QBJeFNb+YwP-5EAmejQ@mail.gmail.com>
     [not found]         ` <20140513063943.GQ26353@dastard>
     [not found]           ` <CANGgnMYn++1++UyX+D2d9GxPxtytpQJv0ThFwdxM-yX7xDWqiA@mail.gmail.com>
     [not found]             ` <20140513090321.GR26353@dastard>
     [not found]               ` <CANGgnMZqQc_NeaDpO_aX+bndmHrQ9VWo9mkfxhPBkRD-J=N6sQ@mail.gmail.com>
     [not found]                 ` <CANGgnMZ8OwzfBj5m9H7c6q2yahGhU7oFZLsJfVxnWoqZExkZmQ@mail.gmail.com>
2014-06-24  3:02                   ` On-stack work item completion race? (was Re: XFS crash?) Dave Chinner
2014-06-24  3:25                     ` Tejun Heo
2014-06-25  3:16                       ` Austin Schuh
2014-06-25  5:56                       ` Dave Chinner
2014-06-25 14:18                         ` Tejun Heo
2014-06-25 22:08                           ` Dave Chinner
     [not found]                       ` <CANGgnMY5cBSXOayDbbOvqNXEG8e6sAYEjpWEQO2X8XPxx2R5-Q@mail.gmail.com>
2014-06-25 14:00                         ` Tejun Heo
2014-06-25 17:04                           ` Austin Schuh