* [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
@ 2016-08-03 16:35 Bart Van Assche
  2016-08-03 18:11 ` Peter Zijlstra
  0 siblings, 1 reply; 33+ messages in thread
From: Bart Van Assche @ 2016-08-03 16:35 UTC (permalink / raw)
  To: mingo, Peter Zijlstra
  Cc: Andrew Morton, Johannes Weiner, Neil Brown, Michael Shaver, linux-kernel

If try_to_wake_up() reads the task state before abort_exclusive_wait()
sets the task state and if autoremove_wake_function() is called after
abort_exclusive_wait() has removed a task from a wait list then the
cascading mechanism for exclusive wakeups in abort_exclusive_wait()
won't be triggered. Avoid this by serializing the task state change
in abort_exclusive_wait() and try_to_wake_up(). This patch fixes the
following hang:

INFO: task systemd-udevd:10111 blocked for more than 480 seconds.
      Not tainted 4.7.0-dbg+ #1
Call Trace:
 [<ffffffff8161f397>] schedule+0x37/0x90
 [<ffffffff816239ef>] schedule_timeout+0x27f/0x470
 [<ffffffff8161e76f>] io_schedule_timeout+0x9f/0x110
 [<ffffffff8161fb36>] bit_wait_io+0x16/0x60
 [<ffffffff8161f929>] __wait_on_bit_lock+0x49/0xa0
 [<ffffffff8114fe69>] __lock_page+0xb9/0xc0
 [<ffffffff81165d90>] truncate_inode_pages_range+0x3e0/0x760
 [<ffffffff81166120>] truncate_inode_pages+0x10/0x20
 [<ffffffff81212a20>] kill_bdev+0x30/0x40
 [<ffffffff81213d41>] __blkdev_put+0x71/0x360
 [<ffffffff81214079>] blkdev_put+0x49/0x170
 [<ffffffff812141c0>] blkdev_close+0x20/0x30
 [<ffffffff811d48e8>] __fput+0xe8/0x1f0
 [<ffffffff811d4a29>] ____fput+0x9/0x10
 [<ffffffff810842d3>] task_work_run+0x83/0xb0
 [<ffffffff8106606e>] do_exit+0x3ee/0xc40
 [<ffffffff8106694b>] do_group_exit+0x4b/0xc0
 [<ffffffff81073d9a>] get_signal+0x2ca/0x940
 [<ffffffff8101bf43>] do_signal+0x23/0x660
 [<ffffffff810022b3>] exit_to_usermode_loop+0x73/0xb0
 [<ffffffff81002cb0>] syscall_return_slowpath+0xb0/0xc0
 [<ffffffff81624e33>] entry_SYSCALL_64_fastpath+0xa6/0xa8

Fixes: 777c6c5f1f6e ("wait: prevent exclusive waiter starvation")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Michael Shaver <jmshaver@gmail.com>
Cc: <stable@vger.kernel.org> # v2.6.33+
---
 kernel/sched/wait.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index f15d6b6..93d1f50 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -277,10 +277,17 @@ void abort_exclusive_wait(wait_queue_head_t *q, wait_queue_t *wait,
 			unsigned int mode, void *key)
 {
 	unsigned long flags;
+	long wake_up;
+
+	/* Serialize against try_to_wake_up() */
+	raw_spin_lock_irqsave(&current->pi_lock, flags);
+	wake_up = current->state & (TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE);
+	if (wake_up)
+		__set_current_state(TASK_RUNNING);
+	raw_spin_unlock_irqrestore(&current->pi_lock, flags);
 
-	__set_current_state(TASK_RUNNING);
 	spin_lock_irqsave(&q->lock, flags);
-	if (!list_empty(&wait->task_list))
+	if (wake_up)
 		list_del_init(&wait->task_list);
 	else if (waitqueue_active(q))
 		__wake_up_locked_key(q, mode, key);
-- 
2.9.2


* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-03 16:35 [PATCH] sched: Avoid that __wait_on_bit_lock() hangs Bart Van Assche
@ 2016-08-03 18:11 ` Peter Zijlstra
  2016-08-03 18:56   ` Bart Van Assche
  0 siblings, 1 reply; 33+ messages in thread
From: Peter Zijlstra @ 2016-08-03 18:11 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: mingo, Andrew Morton, Johannes Weiner, Neil Brown,
	Michael Shaver, linux-kernel, Oleg Nesterov

On Wed, Aug 03, 2016 at 09:35:03AM -0700, Bart Van Assche wrote:
> If try_to_wake_up() reads the task state before abort_exclusive_wait()
> sets the task state and if autoremove_wake_function() is called after
> abort_exclusive_wait() has removed a task from a wait list then the
> cascading mechanism for exclusive wakeups in abort_exclusive_wait()
> won't be triggered. Avoid this by serializing the task state change
> in abort_exclusive_wait() and try_to_wake_up().

I'm dense.. what!?

	CPU0			CPU1			CPU2

	

				__lock_page_killable()
				  __wait_on_bit_lock()
				    bit_wait_io()
				      schedule()
	__wake_up_bit()
	  __wake_up(.nr_exclusive=1)
	    spin_lock(&q->lock)
	    __wake_up_common()
	      autoremove_wake_func()
	        try_to_wake_up(p, TASK_NORMAL)
		list_del_init(&wait->task_list)
	    spin_unlock(&q->lock)

							complete_signal(p)
							  signal_wake_up(p, 1)
							    sigaddset(&p->pending.signal, SIGKILL)
							    try_to_wake_up(p, TASK_WAKEKILL)

				      if (signal_pending_state(TASK_KILLABLE))
				        return -EINTR;
				    abort_exclusive_wait()
				      __set_current_state(RUNNING)
				      spin_lock(q->lock)
				      if (!list_empty()) /* empty */
				      else if (waitqueue_active()) /* pending ? */
				        __wake_up_locked_key(q, mode, key)
				      spin_unlock(q->lock)


That seems to do the right thing, so clearly I misunderstand. Please
clarify.


> +++ b/kernel/sched/wait.c
> @@ -277,10 +277,17 @@ void abort_exclusive_wait(wait_queue_head_t *q, wait_queue_t *wait,
>  			unsigned int mode, void *key)
>  {
>  	unsigned long flags;
> +	long wake_up;
> +
> +	/* Serialize against try_to_wake_up() */
> +	raw_spin_lock_irqsave(&current->pi_lock, flags);
> +	wake_up = current->state & (TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE);
> +	if (wake_up)
> +		__set_current_state(TASK_RUNNING);
> +	raw_spin_unlock_irqrestore(&current->pi_lock, flags);
>  
> -	__set_current_state(TASK_RUNNING);
>  	spin_lock_irqsave(&q->lock, flags);
> -	if (!list_empty(&wait->task_list))
> +	if (wake_up)
>  		list_del_init(&wait->task_list);
>  	else if (waitqueue_active(q))
>  		__wake_up_locked_key(q, mode, key);

That just feels wrong,.. very wrong.


* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-03 18:11 ` Peter Zijlstra
@ 2016-08-03 18:56   ` Bart Van Assche
  2016-08-03 21:30     ` Oleg Nesterov
  0 siblings, 1 reply; 33+ messages in thread
From: Bart Van Assche @ 2016-08-03 18:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mingo, Andrew Morton, Johannes Weiner, Neil Brown,
	Michael Shaver, linux-kernel, Oleg Nesterov

On 08/03/2016 11:11 AM, Peter Zijlstra wrote:
> That seems to do the right thing, so clearly I misunderstand. Please
> clarify.

Hello Peter,

try_to_wake_up() locks task_struct.pi_lock but abort_exclusive_wait()
does not. My assumption is that the following sequence of events leads to the
lockup that I had mentioned in the description of my patch:
* try_to_wake_up() is called for the task that will execute
   abort_exclusive_wait().
* After try_to_wake_up() has checked task_struct.state and before
   autoremove_wake_function() has tried to remove the task from the wait
   queue, abort_exclusive_wait() is executed for the same task.

Please note that the call stack I had mentioned in my e-mail had been 
reported before. See e.g.
* Michael Shaver, Kernel deadlock during mdadm reshape, July 2016 
(http://www.spinics.net/lists/raid/msg53056.html).
* Bart Van Assche, Kernel hangs in truncate_inode_pages(), August 2012 
(https://lkml.org/lkml/2012/8/24/185).

Bart.


* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-03 18:56   ` Bart Van Assche
@ 2016-08-03 21:30     ` Oleg Nesterov
  2016-08-03 21:51       ` Bart Van Assche
  2016-08-04  0:05       ` Bart Van Assche
  0 siblings, 2 replies; 33+ messages in thread
From: Oleg Nesterov @ 2016-08-03 21:30 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Peter Zijlstra, mingo, Andrew Morton, Johannes Weiner,
	Neil Brown, Michael Shaver, linux-kernel

Hi Bart,

I too can't understand the problem. Perhaps you missed the fact that
abort_exclusive_wait() does everything under wait_queue_head_t->lock ?

On 08/03, Bart Van Assche wrote:
>
> try_to_wake_up() locks task_struct.pi_lock but abort_exclusive_wait() does not.
> My assumption is that the following sequence of events leads to the lockup
> that I had mentioned in the description of my patch:
> * try_to_wake_up() is called for the task that will execute
>   abort_exclusive_wait().
> * After try_to_wake_up() has checked task_struct.state and before
>   autoremove_wake_function() has tried to remove the task from the wait
>   queue, abort_exclusive_wait() is executed for the same task.

But we do not care if we race with another try_to_wake_up(), or even with
another exclusive wake_up_nr(wq)/whatever unless wq is the same.

And if this wq is the same, then wake_up_nr() will do try_to_wake_up/autoremove
either before or after abort_exclusive_wait(), wake_up_nr() takes the same
wq->lock.

And this means that abort_exclusive_wait() can't be called "After try_to_wake_up()"
and "before autoremove_wake_function()".

Oleg.


* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-03 21:30     ` Oleg Nesterov
@ 2016-08-03 21:51       ` Bart Van Assche
  2016-08-04 14:09         ` Peter Zijlstra
  2016-08-04  0:05       ` Bart Van Assche
  1 sibling, 1 reply; 33+ messages in thread
From: Bart Van Assche @ 2016-08-03 21:51 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, mingo, Andrew Morton, Johannes Weiner,
	Neil Brown, Michael Shaver, linux-kernel

On 08/03/2016 02:30 PM, Oleg Nesterov wrote:
> On 08/03, Bart Van Assche wrote:
>> try_to_wake_up() locks task_struct.pi_lock but abort_exclusive_wait() does not.
>> My assumption is that the following sequence of events leads to the lockup
>> that I had mentioned in the description of my patch:
>> * try_to_wake_up() is called for the task that will execute
>>   abort_exclusive_wait().
>> * After try_to_wake_up() has checked task_struct.state and before
>>   autoremove_wake_function() has tried to remove the task from the wait
>>   queue, abort_exclusive_wait() is executed for the same task.
>
> But we do not care if we race with another try_to_wake_up(), or even with
> another exclusive wake_up_nr(wq)/whatever unless wq is the same.
>
> And if this wq is the same, then wake_up_nr() will do try_to_wake_up/autoremove
> either before or after abort_exclusive_wait(), wake_up_nr() takes the same
> wq->lock.
>
> And this means that abort_exclusive_wait() can't be called "After try_to_wake_up()"
> and "before autoremove_wake_function()".

Hello Oleg,

It is possible that my analysis is wrong. But what I see is that my 
patch makes the lockup disappear. However, what I had not expected is 
that I ran into the following (probably caused by the patch at the start 
of this thread):

WARNING: CPU: 1 PID: 26023 at lib/list_debug.c:33 __list_add+0x89/0xb0
list_add corruption. prev->next should be next (ffff88047ff4b0c8), but 
was ffff8803cbc037e0. (prev=ffff8803c863bd80).
Call Trace:
  [<ffffffff81320137>] dump_stack+0x68/0xa1
  [<ffffffff81061c46>] __warn+0xc6/0xe0
  [<ffffffff81061caa>] warn_slowpath_fmt+0x4a/0x50
  [<ffffffff8133d5d9>] __list_add+0x89/0xb0
  [<ffffffff810ab5c9>] prepare_to_wait_exclusive+0x79/0x80
  [<ffffffff8161fa2f>] __wait_on_bit_lock+0x2f/0xa0
  [<ffffffff8114fe89>] __lock_page+0xb9/0xc0
  [<ffffffff81165db0>] truncate_inode_pages_range+0x3e0/0x760
  [<ffffffff81166140>] truncate_inode_pages+0x10/0x20
  [ ... ]

So I started testing the patch below that should fix the same hang but 
without triggering any wait list corruption.

Bart.

diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index f15d6b6..4e3f651 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -282,7 +282,7 @@ void abort_exclusive_wait(wait_queue_head_t *q, wait_queue_t *wait,
 	spin_lock_irqsave(&q->lock, flags);
 	if (!list_empty(&wait->task_list))
 		list_del_init(&wait->task_list);
-	else if (waitqueue_active(q))
+	if (waitqueue_active(q))
 		__wake_up_locked_key(q, mode, key);
 	spin_unlock_irqrestore(&q->lock, flags);
 }


* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-03 21:30     ` Oleg Nesterov
  2016-08-03 21:51       ` Bart Van Assche
@ 2016-08-04  0:05       ` Bart Van Assche
  1 sibling, 0 replies; 33+ messages in thread
From: Bart Van Assche @ 2016-08-04  0:05 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, mingo, Andrew Morton, Johannes Weiner,
	Neil Brown, Michael Shaver, linux-kernel

On 08/03/2016 02:30 PM, Oleg Nesterov wrote:
> I too can't understand the problem. Perhaps you missed the fact that
> abort_exclusive_wait() does everything under wait_queue_head_t->lock ?
>
> [ ... ]
>
> But we do not care if we race with another try_to_wake_up(), or even with
> another exclusive wake_up_nr(wq)/whatever unless wq is the same.
>
> And if this wq is the same, then wake_up_nr() will do try_to_wake_up/autoremove
> either before or after abort_exclusive_wait(), wake_up_nr() takes the same
> wq->lock.
>
> And this means that abort_exclusive_wait() can't be called "After try_to_wake_up()"
> and "before autoremove_wake_function()".

Hello Oleg,

I had noticed that abort_exclusive_wait() locks and unlocks 
wait_queue_head_t->lock.

What I had overlooked is that wait_queue_head_t->lock is held whenever
try_to_wake_up() is called (indirectly) by __wake_up_common().
However, not all try_to_wake_up() callers hold that lock. Since I'm not
a scheduler expert I would appreciate it if someone who is more familiar
with the scheduler could explain to me how the two patches that I posted in
the context of this e-mail thread can cause a behavior difference.

Thanks,

Bart.


* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-03 21:51       ` Bart Van Assche
@ 2016-08-04 14:09         ` Peter Zijlstra
  2016-08-04 14:31           ` Bart Van Assche
  2016-08-05 17:41           ` Bart Van Assche
  0 siblings, 2 replies; 33+ messages in thread
From: Peter Zijlstra @ 2016-08-04 14:09 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Oleg Nesterov, mingo, Andrew Morton, Johannes Weiner, Neil Brown,
	Michael Shaver, linux-kernel

On Wed, Aug 03, 2016 at 02:51:23PM -0700, Bart Van Assche wrote:
> So I started testing the patch below that should fix the same hang but
> without triggering any wait list corruption.
> 
> diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
> index f15d6b6..4e3f651 100644
> --- a/kernel/sched/wait.c
> +++ b/kernel/sched/wait.c
> @@ -282,7 +282,7 @@ void abort_exclusive_wait(wait_queue_head_t *q, wait_queue_t *wait,
>  	spin_lock_irqsave(&q->lock, flags);
>  	if (!list_empty(&wait->task_list))
>  		list_del_init(&wait->task_list);
> -	else if (waitqueue_active(q))
> +	if (waitqueue_active(q))
>  		__wake_up_locked_key(q, mode, key);
>  	spin_unlock_irqrestore(&q->lock, flags);
>  }

So the problem with this patch is that it will violate the nr_exclusive
semantics in that it can result in too many wakeups -- which is a much
less severe (typically harmless) issue.

We now always wake up the next waiter, even if there wasn't an actual
wakeup we raced against. And if we then also get a wakeup, we can end up
with 2 woken tasks (instead of the nr_exclusive=1).

Now, since wait loops must all deal with spurious wakeups, this ends up
as harmless overhead.
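
A minimal userspace sketch of the trade-off described above, assuming a simplified model of the wait queue. All names (sim_waiter, sim_wake_one, sim_abort_exclusive_wait, always_pass_on) are hypothetical and this is not kernel code; it only contrasts the existing "else if" logic with the variant that drops the "else" and shows where the extra wakeup comes from.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one exclusive waiter; not a kernel structure. */
struct sim_waiter {
	bool sleeping;	/* task has not been woken yet */
	bool queued;	/* entry is still linked on the wait queue */
};

/* Wake at most one sleeping, queued waiter; return the number woken. */
static int sim_wake_one(struct sim_waiter *q, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (q[i].queued && q[i].sleeping) {
			q[i].sleeping = false;
			q[i].queued = false;	/* autoremove on successful wakeup */
			return 1;
		}
	}
	return 0;
}

/* Models waitqueue_active(): is anybody still queued? */
static bool sim_waitqueue_active(const struct sim_waiter *q, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (q[i].queued)
			return true;
	return false;
}

/*
 * Models abort_exclusive_wait() for the waiter at index self.
 * always_pass_on == false: existing logic, pass the wakeup on only if
 *                          this waiter had already been dequeued by a waker.
 * always_pass_on == true:  variant without the "else", always wake the
 *                          next waiter.
 * Returns the number of additional waiters woken.
 */
static int sim_abort_exclusive_wait(struct sim_waiter *q, size_t n,
				    size_t self, bool always_pass_on)
{
	bool was_queued;

	assert(self < n);
	was_queued = q[self].queued;
	q[self].sleeping = false;		/* __set_current_state(TASK_RUNNING) */
	if (was_queued)
		q[self].queued = false;		/* list_del_init() */
	if ((!was_queued || always_pass_on) && sim_waitqueue_active(q, n))
		return sim_wake_one(q, n);
	return 0;
}
```

In this model, an aborting waiter that is still queued (no wakeup was consumed) wakes nobody with the existing logic but wakes the next sleeper with the variant. A later real wakeup would then wake a second task, which is the nr_exclusive violation described above.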

But I'd still like to understand where we lose the wakeup. What are you
doing to reproduce this issue?


* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-04 14:09         ` Peter Zijlstra
@ 2016-08-04 14:31           ` Bart Van Assche
  2016-08-05 17:41           ` Bart Van Assche
  1 sibling, 0 replies; 33+ messages in thread
From: Bart Van Assche @ 2016-08-04 14:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Oleg Nesterov, mingo, Andrew Morton, Johannes Weiner, Neil Brown,
	Michael Shaver, linux-kernel

On 08/04/16 07:09, Peter Zijlstra wrote:
> But I'd still like to understand where we lose the wakeup. What are you
> doing to reproduce this issue?

Hello Peter,

The test I run is as follows:
* Configure the ib_srpt driver to export a RAM disk through the SRP
   protocol. The ib_srpt driver is a LIO target driver that implements
   the SRP protocol, a SCSI transport protocol.
* On the same system, let the ib_srp (SRP initiator) driver log in
   to the ib_srpt driver using the loopback capability of a local
   InfiniBand HCA.
* Run fio with data verification enabled on top of multipath (dm-mpath)
   with queue_if_no_path enabled and let multipath use the SRP paths.
* Simulate cable pulls and reinserts by periodically writing to
   /sys/class/srp_remote_ports/*/delete and by logging in again. Writing
   to the delete attribute triggers scsi_remove_host() and hence also
   removal of the block device associated with the SCSI device.

The scripts I use to run this test are available at
https://github.com/bvanassche/srp-test. Since the softRoCE driver is not
yet upstream, running this test requires at least one InfiniBand HCA.

Thanks,

Bart.


* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-04 14:09         ` Peter Zijlstra
  2016-08-04 14:31           ` Bart Van Assche
@ 2016-08-05 17:41           ` Bart Van Assche
  2016-08-08 10:22             ` Peter Zijlstra
  1 sibling, 1 reply; 33+ messages in thread
From: Bart Van Assche @ 2016-08-05 17:41 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Oleg Nesterov, mingo, Andrew Morton, Johannes Weiner, Neil Brown,
	Michael Shaver, linux-kernel

On 08/04/2016 07:09 AM, Peter Zijlstra wrote:
> On Wed, Aug 03, 2016 at 02:51:23PM -0700, Bart Van Assche wrote:
>> So I started testing the patch below that should fix the same hang but
>> without triggering any wait list corruption.
>>
>> diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
>> index f15d6b6..4e3f651 100644
>> --- a/kernel/sched/wait.c
>> +++ b/kernel/sched/wait.c
>> @@ -282,7 +282,7 @@ void abort_exclusive_wait(wait_queue_head_t *q, wait_queue_t *wait,
>>  	spin_lock_irqsave(&q->lock, flags);
>>  	if (!list_empty(&wait->task_list))
>>  		list_del_init(&wait->task_list);
>> -	else if (waitqueue_active(q))
>> +	if (waitqueue_active(q))
>>  		__wake_up_locked_key(q, mode, key);
>>  	spin_unlock_irqrestore(&q->lock, flags);
>>  }
>
> So the problem with this patch is that it will violate the nr_exclusive
> semantics in that it can result in too many wakeups -- which is a much
> less severe (typically harmless) issue.
>
> We now always wake up the next waiter, even if there wasn't an actual
> wakeup we raced against. And if we then also get a wakeup, we can end up
> with 2 woken tasks (instead of the nr_exclusive=1).
>
> Now, since wait loops must all deal with spurious wakeups, this ends up
> as harmless overhead.

How about adding a fifth argument to abort_exclusive_wait() that 
indicates whether or not the "if (waitqueue_active(q)) 
__wake_up_locked_key(q, mode, key)" code should be executed? 
__wait_event() could pass "condition" as fifth argument when calling 
abort_exclusive_wait().

> But I'd still like to understand where we lose the wakeup.

My assumption is that __wake_up_common() and signal delivery happen 
concurrently, that __wake_up_common() wakes up bit_wait_io() and that 
signal delivery happens after bit_wait_io() has been woken up but before 
it tests the signal pending state.

Bart.


* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-05 17:41           ` Bart Van Assche
@ 2016-08-08 10:22             ` Peter Zijlstra
  2016-08-08 14:38               ` Bart Van Assche
  0 siblings, 1 reply; 33+ messages in thread
From: Peter Zijlstra @ 2016-08-08 10:22 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Oleg Nesterov, mingo, Andrew Morton, Johannes Weiner, Neil Brown,
	Michael Shaver, linux-kernel

On Fri, Aug 05, 2016 at 10:41:33AM -0700, Bart Van Assche wrote:
> On 08/04/2016 07:09 AM, Peter Zijlstra wrote:

> >But I'd still like to understand where we lose the wakeup.
> 
> My assumption is that __wake_up_common() and signal delivery happen
> concurrently, that __wake_up_common() wakes up bit_wait_io() and that signal
> delivery happens after bit_wait_io() has been woken up but before it tests
> the signal pending state.

That would be the exact scenario I drew a picture of, no? I'm still
failing to see the hole there.

Please draw a picture like that and illustrate the hole.


* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-08 10:22             ` Peter Zijlstra
@ 2016-08-08 14:38               ` Bart Van Assche
  2016-08-08 16:20                 ` Oleg Nesterov
  0 siblings, 1 reply; 33+ messages in thread
From: Bart Van Assche @ 2016-08-08 14:38 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Oleg Nesterov, mingo, Andrew Morton, Johannes Weiner, Neil Brown,
	Michael Shaver, linux-kernel

On 08/08/16 03:22, Peter Zijlstra wrote:
> That would be the exact scenario I drew a picture of, no? I'm still
> failing to see the hole there.
> 
> Please draw a picture like that and illustrate the hole.
 
Hi Peter,

This is the sequence that I think leads to the missed wakeup:

Task 1                    Task 2                    Task 3                    Task 4

lock_page()
 ...
                          lock_page_killable()
                           __lock_page_killable()
                            __wait_on_bit_lock()
                             bit_wait_io()
                              io_schedule()
                               ...
                                                                              lock_page()
                                                                               __lock_page()
                                                                                __wait_on_bit_lock()
                                                                                 bit_wait_io()
                                                                                  io_schedule()
                                                                                   ...


                                                    (signal delivery to task 2)
                                                    try_to_wake_up(task2, ..., ...)
                                                    (try_to_wake_up() returns 1)

unlock_page()
 wake_up_page()
  __wake_up_bit()
   __wake_up(wq, TASK_NORMAL, 1, &key)
    __wake_up_common(wq, mode=TASK_NORMAL, nr_exclusive=1, 0, key)
     wake_bit_function()
      autoremove_wake_function()
       default_wake_function()
        try_to_wake_up() <- skips task 2 because task 3 already changed
                            the task state of task 2
       (autoremove_wake_function() does not do
        list_del_init(&wait->task_list))


                              bit_wait_io() returns -EINTR
                             abort_exclusive_wait() is called by __wait_on_bit_lock()


In the above sequence task 1 does not remove task 2 from the waitqueue
because task 3 had already woken up task 2. The result is that task 2 is
still on the waitqueue when it calls abort_exclusive_wait(). With the
current implementation of abort_exclusive_wait(), task 4 is not woken up
in the above scenario although it should be. Hence the patch that removes
the "else" keyword from abort_exclusive_wait().

Bart.


* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-08 14:38               ` Bart Van Assche
@ 2016-08-08 16:20                 ` Oleg Nesterov
  2016-08-08 18:31                   ` Bart Van Assche
  2016-08-09 23:56                   ` Bart Van Assche
  0 siblings, 2 replies; 33+ messages in thread
From: Oleg Nesterov @ 2016-08-08 16:20 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Peter Zijlstra, mingo, Andrew Morton, Johannes Weiner,
	Neil Brown, Michael Shaver, linux-kernel

On 08/08, Bart Van Assche wrote:
>
> This is the sequence that I think leads to the missed wakeup:
>
> Task 1                    Task 2                    Task 3                    Task 4
>
> lock_page()
>  ...
>                           lock_page_killable()
>                            __lock_page_killable()
>                             __wait_on_bit_lock()
>                              bit_wait_io()
>                               io_schedule()
>                                ...
>                                                                               lock_page()
>                                                                                __lock_page()
>                                                                                 __wait_on_bit_lock()
>                                                                                  bit_wait_io()
>                                                                                   io_schedule()
>                                                                                    ...
>
>
>                                                     (signal delivery to task 2)
>                                                     try_to_wake_up(task2, ..., ...)
>                                                     (try_to_wake_up() returns 1)
>
> unlock_page()
>  wake_up_page()
>   __wake_up_bit()
>    __wake_up(wq, TASK_NORMAL, 1, &key)
>     __wake_up_common(wq, mode=TASK_NORMAL, nr_exclusive=1, 0, key)
>      wake_bit_function()
>       autoremove_wake_function()
>        default_wake_function()
>         try_to_wake_up() <- skips task 2 because task 3 already changed
>                             the task state of task 2
>        (autoremove_wake_function() does not do
>         list_del_init(&wait->task_list))

Yes.

But since it skips task2, __wake_up_common() doesn't decrement nr_exclusive
and doesn't stop. It continues the list_for_each_entry_safe() loop, finds the
sleeping task4, and wakes it up.
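
The loop behaviour described here can be sketched in userspace C, assuming a simplified array-based queue. The names (model_waiter, model_autoremove_wake, model_wake_up_common) are hypothetical stand-ins that only mirror the kernel functions they are named after; the point shown is that a waiter whose wakeup fails neither consumes nr_exclusive nor gets dequeued, so the walk continues to the next sleeper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a wait queue entry; not kernel code. */
struct model_waiter {
	bool sleeping;	/* task state still matches the wake mode */
	bool exclusive;	/* models WQ_FLAG_EXCLUSIVE */
	bool queued;	/* entry is still linked on the wait queue */
};

/*
 * Models autoremove_wake_function(): the entry is dequeued only when the
 * wakeup succeeded. A task that was already woken by someone else (for
 * example by signal delivery) makes the wakeup "fail" and stays queued.
 */
static bool model_autoremove_wake(struct model_waiter *w)
{
	if (!w->sleeping)
		return false;	/* try_to_wake_up() would return 0 */
	w->sleeping = false;
	w->queued = false;	/* list_del_init() */
	return true;
}

/*
 * Models the __wake_up_common() walk: nr_exclusive is decremented only
 * for an exclusive waiter whose wake function succeeded.
 * Returns the number of tasks actually woken.
 */
static int model_wake_up_common(struct model_waiter *q, size_t n,
				int nr_exclusive)
{
	int woken = 0;

	assert(q != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		struct model_waiter *w = &q[i];

		if (!w->queued)
			continue;
		if (model_autoremove_wake(w)) {
			woken++;
			if (w->exclusive && --nr_exclusive == 0)
				break;
		}
	}
	return woken;
}
```

With a two-entry queue where the first entry (task 2) has already been woken by the signal and the second (task 4) is asleep, a call with nr_exclusive set to 1 skips the first entry, leaves it queued, and wakes the second.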

>                               bit_wait_io() returns -EINTR
>                              abort_exclusive_wait() is called by __wait_on_bit_lock()
>
>
> In the above sequence task 1 does not remove task 2 from the waitqueue
> because task 3 had already woken up task 2. The result is that task 2 is
> still on the waitqueue when it calls abort_exclusive_wait().

Yes, but this is fine,

> With the
> current implementation of abort_exclusive_wait() in the above scenario task
> 4 is not woken up although it should be woken up.

See above, it must be already woken by __wake_up_common().



So far _I think_ that the bug is somewhere else... Say, someone clears
PG_locked without wake_up(). Then SIGKILL sent to the task sleeping in
sys_read() "adds" the necessary wakeup...

Do you use external modules during the testing?

Oleg.


* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-08 16:20                 ` Oleg Nesterov
@ 2016-08-08 18:31                   ` Bart Van Assche
  2016-08-09 17:14                     ` Oleg Nesterov
  2016-08-09 23:56                   ` Bart Van Assche
  1 sibling, 1 reply; 33+ messages in thread
From: Bart Van Assche @ 2016-08-08 18:31 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, mingo, Andrew Morton, Johannes Weiner,
	Neil Brown, Michael Shaver, linux-kernel

On 08/08/2016 09:20 AM, Oleg Nesterov wrote:
> Do you use external modules during the testing?

Hello Oleg,

No external modules were loaded when I triggered the lockup I mentioned 
in the patch description. Although the SRP test software I referred to 
earlier can be run against the SCST SRP target driver, I'm using the 
in-tree LIO SRP target driver to reproduce this lockup. BTW, I think it 
is unlikely that the target driver framework is involved in the lockup 
since I have observed this lockup before when no SRP target driver was 
loaded on the test system. The same lockup has also been
observed with md-raid.

Thanks,

Bart.


* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-08 18:31                   ` Bart Van Assche
@ 2016-08-09 17:14                     ` Oleg Nesterov
  2016-08-09 18:48                       ` Bart Van Assche
  0 siblings, 1 reply; 33+ messages in thread
From: Oleg Nesterov @ 2016-08-09 17:14 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Peter Zijlstra, mingo, Andrew Morton, Johannes Weiner,
	Neil Brown, Michael Shaver, linux-kernel

On 08/08, Bart Van Assche wrote:
>
> No external modules were loaded when I triggered the lockup

Heh. Could you test the patch below?

Oleg.

--- x/kernel/sched/wait.c
+++ x/kernel/sched/wait.c
@@ -283,7 +283,7 @@ void abort_exclusive_wait(wait_queue_hea
 	if (!list_empty(&wait->task_list))
 		list_del_init(&wait->task_list);
 	else if (waitqueue_active(q))
-		__wake_up_locked_key(q, mode, key);
+		__wake_up_locked_key(q, TASK_NORMAL, key);
 	spin_unlock_irqrestore(&q->lock, flags);
 }
 EXPORT_SYMBOL(abort_exclusive_wait);


* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-09 17:14                     ` Oleg Nesterov
@ 2016-08-09 18:48                       ` Bart Van Assche
  2016-08-09 23:10                         ` Bart Van Assche
  2016-08-10 10:45                         ` Oleg Nesterov
  0 siblings, 2 replies; 33+ messages in thread
From: Bart Van Assche @ 2016-08-09 18:48 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, mingo, Andrew Morton, Johannes Weiner,
	Neil Brown, Michael Shaver, linux-kernel

On 08/09/2016 10:15 AM, Oleg Nesterov wrote:
> On 08/08, Bart Van Assche wrote:
>>
>> No external modules were loaded when I triggered the lockup
> 
> Heh. Could you test the patch below?
> 
> Oleg.
> 
> --- x/kernel/sched/wait.c
> +++ x/kernel/sched/wait.c
> @@ -283,7 +283,7 @@ void abort_exclusive_wait(wait_queue_hea
>  	if (!list_empty(&wait->task_list))
>  		list_del_init(&wait->task_list);
>  	else if (waitqueue_active(q))
> -		__wake_up_locked_key(q, mode, key);
> +		__wake_up_locked_key(q, TASK_NORMAL, key);
>  	spin_unlock_irqrestore(&q->lock, flags);
>  }
>  EXPORT_SYMBOL(abort_exclusive_wait);

Hello Oleg,

That patch looks interesting to me. Unfortunately even with that patch
applied I still see lockups. These are the two lockups I have observed
after I had applied your patch, rebuilt and reinstalled the kernel and
rebooted the test server:

[ 1548.018115] sysrq: SysRq : Show Blocked State
[ 1548.018210]   task                        PC stack   pid father
[ 1548.018677] systemd-udevd   D ffff8803a9f13be8     0 29908    483 0x00000000
[ 1548.018792]  ffff8803a9f13be8 ffffffff82584bd0 00ffffff8252b1b0 ffff88046f0569c0
[ 1548.018961]  ffff88016c98b140 ffff8800757bc9c0 ffff8803a9f14000 ffff88046f0569c0
[ 1548.019131]  7fffffffffffffff ffffffff8161fcf0 ffff8803a9f13d50 ffff8803a9f13c00
[ 1548.019316] Call Trace:
[ 1548.019415]  [<ffffffff8161f567>] schedule+0x37/0x90
[ 1548.019464]  [<ffffffff81623bbf>] schedule_timeout+0x27f/0x470
[ 1548.019758]  [<ffffffff8161e93f>] io_schedule_timeout+0x9f/0x110
[ 1548.019808]  [<ffffffff8161fd06>] bit_wait_io+0x16/0x60
[ 1548.019856]  [<ffffffff8161f996>] __wait_on_bit+0x56/0x80
[ 1548.019906]  [<ffffffff81152e1d>] wait_on_page_bit_killable+0xbd/0xc0
[ 1548.020006]  [<ffffffff81152f50>] generic_file_read_iter+0x130/0x770
[ 1548.020158]  [<ffffffff812134a0>] blkdev_read_iter+0x30/0x40
[ 1548.020209]  [<ffffffff811d266b>] __vfs_read+0xbb/0x130
[ 1548.020258]  [<ffffffff811d2a51>] vfs_read+0x91/0x130
[ 1548.020305]  [<ffffffff811d3dd4>] SyS_read+0x44/0xa0
[ 1548.020354]  [<ffffffff81624fa5>] entry_SYSCALL_64_fastpath+0x18/0xa8

[ 1050.892823] sysrq: SysRq : Show Blocked State
[ 1050.892912]   task                        PC stack   pid father
[ 1050.893333] systemd-udevd   D ffff880449b3f838     0 17240    492 0x00000006
[ 1050.893974] Call Trace:
[ 1050.894119]  [<ffffffff8161f567>] schedule+0x37/0x90
[ 1050.894168]  [<ffffffff81623bcf>] schedule_timeout+0x27f/0x470
[ 1050.894561]  [<ffffffff8161e93f>] io_schedule_timeout+0x9f/0x110
[ 1050.894609]  [<ffffffff8161fd16>] bit_wait_io+0x16/0x60
[ 1050.894657]  [<ffffffff8161fb09>] __wait_on_bit_lock+0x49/0xa0
[ 1050.894705]  [<ffffffff8114fe59>] __lock_page+0xb9/0xc0
[ 1050.894802]  [<ffffffff81165d80>] truncate_inode_pages_range+0x3e0/0x760
[ 1050.895750]  [<ffffffff81166110>] truncate_inode_pages+0x10/0x20
[ 1050.895799]  [<ffffffff81212a10>] kill_bdev+0x30/0x40
[ 1050.895849]  [<ffffffff81213d31>] __blkdev_put+0x71/0x360
[ 1050.895951]  [<ffffffff81214069>] blkdev_put+0x49/0x170
[ 1050.895998]  [<ffffffff812141b0>] blkdev_close+0x20/0x30
[ 1050.896047]  [<ffffffff811d48d8>] __fput+0xe8/0x1f0
[ 1050.896094]  [<ffffffff811d4a19>] ____fput+0x9/0x10
[ 1050.896141]  [<ffffffff810842d3>] task_work_run+0x83/0xb0
[ 1050.896189]  [<ffffffff8106606e>] do_exit+0x3ee/0xc40
[ 1050.896284]  [<ffffffff8106694b>] do_group_exit+0x4b/0xc0
[ 1050.896331]  [<ffffffff81073d9a>] get_signal+0x2ca/0x940
[ 1050.896425]  [<ffffffff8101bf43>] do_signal+0x23/0x660
[ 1050.896626]  [<ffffffff810022b3>] exit_to_usermode_loop+0x73/0xb0
[ 1050.896678]  [<ffffffff81002cb0>] syscall_return_slowpath+0xb0/0xc0
[ 1050.896727]  [<ffffffff81625033>] entry_SYSCALL_64_fastpath+0xa6/0xa8

Bart.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-09 18:48                       ` Bart Van Assche
@ 2016-08-09 23:10                         ` Bart Van Assche
  2016-08-10 10:45                         ` Oleg Nesterov
  1 sibling, 0 replies; 33+ messages in thread
From: Bart Van Assche @ 2016-08-09 23:10 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, mingo, Andrew Morton, Johannes Weiner,
	Neil Brown, Michael Shaver, linux-kernel

On 08/09/2016 11:48 AM, Bart Van Assche wrote:
> [ 1548.018115] sysrq: SysRq : Show Blocked State
> [ 1548.018210]   task                        PC stack   pid father
> [ 1548.018677] systemd-udevd   D ffff8803a9f13be8     0 29908    483 0x00000000
> [ 1548.018792]  ffff8803a9f13be8 ffffffff82584bd0 00ffffff8252b1b0 ffff88046f0569c0
> [ 1548.018961]  ffff88016c98b140 ffff8800757bc9c0 ffff8803a9f14000 ffff88046f0569c0
> [ 1548.019131]  7fffffffffffffff ffffffff8161fcf0 ffff8803a9f13d50 ffff8803a9f13c00
> [ 1548.019316] Call Trace:
> [ 1548.019415]  [<ffffffff8161f567>] schedule+0x37/0x90
> [ 1548.019464]  [<ffffffff81623bbf>] schedule_timeout+0x27f/0x470
> [ 1548.019758]  [<ffffffff8161e93f>] io_schedule_timeout+0x9f/0x110
> [ 1548.019808]  [<ffffffff8161fd06>] bit_wait_io+0x16/0x60
> [ 1548.019856]  [<ffffffff8161f996>] __wait_on_bit+0x56/0x80
> [ 1548.019906]  [<ffffffff81152e1d>] wait_on_page_bit_killable+0xbd/0xc0
> [ 1548.020006]  [<ffffffff81152f50>] generic_file_read_iter+0x130/0x770
> [ 1548.020158]  [<ffffffff812134a0>] blkdev_read_iter+0x30/0x40
> [ 1548.020209]  [<ffffffff811d266b>] __vfs_read+0xbb/0x130
> [ 1548.020258]  [<ffffffff811d2a51>] vfs_read+0x91/0x130
> [ 1548.020305]  [<ffffffff811d3dd4>] SyS_read+0x44/0xa0
> [ 1548.020354]  [<ffffffff81624fa5>] entry_SYSCALL_64_fastpath+0x18/0xa8

(replying to my own e-mail)

The above call stack is probably caused by a missing I/O completion 
somewhere in the I/O stack (not in ib_srp) and hence can be ignored in 
the context of the discussion about __wait_on_bit_lock(). BTW, I have 
made the following local change in abort_exclusive_wait() in the hope 
that, if I can trigger this statement, it will provide more 
information about why the __wait_on_bit_lock() hang happens:

diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index f0fdd8e..fad852d 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -280,6 +280,8 @@ void abort_exclusive_wait(wait_queue_head_t *q, wait_queue_t *wait,

         __set_current_state(TASK_RUNNING);
         spin_lock_irqsave(&q->lock, flags);
+       WARN_ONCE(!list_empty(&wait->task_list) && waitqueue_active(q),
+                 "mode = %#x\n", mode);
         if (!list_empty(&wait->task_list))
                 list_del_init(&wait->task_list);
         else if (waitqueue_active(q))

Bart.

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-08 16:20                 ` Oleg Nesterov
  2016-08-08 18:31                   ` Bart Van Assche
@ 2016-08-09 23:56                   ` Bart Van Assche
  2016-08-10 10:57                     ` Oleg Nesterov
  1 sibling, 1 reply; 33+ messages in thread
From: Bart Van Assche @ 2016-08-09 23:56 UTC (permalink / raw)
  To: Oleg Nesterov, Bart Van Assche
  Cc: Peter Zijlstra, mingo, Andrew Morton, Johannes Weiner,
	Neil Brown, Michael Shaver, linux-kernel

On 08/08/2016 09:20 AM, Oleg Nesterov wrote:
> So far _I think_ that the bug is somewhere else... Say, someone clears
> PG_locked without wake_up(). Then SIGKILL sent to the task sleeping in
> sys_read() "adds" the necessary wakeup...

Hello Oleg,

Something that puzzles me is that removing the "else" keyword from 
abort_exclusive_wait() is sufficient to avoid the hang. If there were 
code that clears PG_locked without calling wake_up(), this hang 
probably would also be triggered by workloads that do not wake up 
lock_page_killable() with a signal. BTW, the 
WARN_ONCE(!list_empty(&wait->task_list) && waitqueue_active(q), "mode = 
%#x\n", mode) statement that I added in abort_exclusive_wait() just 
produced the following call stack:

Aug  9 16:16:38 ion-dev-ib-ini kernel: WARNING: CPU: 0 PID: 14767 at kernel/sched/wait.c:284 abort_exclusive_wait+0xe3/0xf0
Aug  9 16:16:38 ion-dev-ib-ini kernel: mode = 0x82
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [last unloaded: brd]
Aug  9 16:16:38 ion-dev-ib-ini kernel: CPU: 0 PID: 14767 Comm: kpartx Tainted: G        W       4.7.0-dbg+ #3
Aug  9 16:16:38 ion-dev-ib-ini kernel: Call Trace:
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [<ffffffff81320157>] dump_stack+0x68/0xa1
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [<ffffffff81061c46>] __warn+0xc6/0xe0
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [<ffffffff81061caa>] warn_slowpath_fmt+0x4a/0x50
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [<ffffffff810ab7d3>] abort_exclusive_wait+0xe3/0xf0
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [<ffffffff8161fb61>] __wait_on_bit_lock+0x61/0xa0
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [<ffffffff8114ff69>] __lock_page_killable+0xb9/0xc0
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [<ffffffff8115305a>] generic_file_read_iter+0x1ea/0x770
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [<ffffffff812134f0>] blkdev_read_iter+0x30/0x40
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [<ffffffff811d26bb>] __vfs_read+0xbb/0x130
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [<ffffffff811d2aa1>] vfs_read+0x91/0x130
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [<ffffffff811d3e24>] SyS_read+0x44/0xa0
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [<ffffffff81624fe5>] entry_SYSCALL_64_fastpath+0x18/0xa8

(gdb) list *(generic_file_read_iter+0x1ea)
0xffffffff8115305a is in do_generic_file_read (mm/filemap.c:1730).
1725                    continue;
1726
1727    page_not_up_to_date:
1728                    /* Get exclusive access to the page ... */
1729                    error = lock_page_killable(page);
1730                    if (unlikely(error))
1731                            goto readpage_error;
1732
1733    page_not_up_to_date_locked:
1734                    /* Did it get truncated before we got the lock? */

Apparently the task that hangs is the same task as the one that
received the signal (PID 14767; state "D" = TASK_UNINTERRUPTIBLE):

[ 3718.134118] sysrq: SysRq : Show Blocked State
[ 3718.136234] kpartx          D ffff8803c7767838     0 14767      1 0x00000006
[ 3718.136928] Call Trace:
[ 3718.137089]  [<ffffffff8161f5b7>] schedule+0x37/0x90
[ 3718.137142]  [<ffffffff81623c0f>] schedule_timeout+0x27f/0x470
[ 3718.137603]  [<ffffffff8161e98f>] io_schedule_timeout+0x9f/0x110
[ 3718.137662]  [<ffffffff8161fd56>] bit_wait_io+0x16/0x60
[ 3718.137714]  [<ffffffff8161fb49>] __wait_on_bit_lock+0x49/0xa0
[ 3718.137764]  [<ffffffff8114fea9>] __lock_page+0xb9/0xc0
[ 3718.137865]  [<ffffffff81165dd0>] truncate_inode_pages_range+0x3e0/0x760
[ 3718.138175]  [<ffffffff81166160>] truncate_inode_pages+0x10/0x20
[ 3718.138477]  [<ffffffff81212a60>] kill_bdev+0x30/0x40
[ 3718.138529]  [<ffffffff81213d81>] __blkdev_put+0x71/0x360
[ 3718.138631]  [<ffffffff812140b9>] blkdev_put+0x49/0x170
[ 3718.138681]  [<ffffffff81214200>] blkdev_close+0x20/0x30
[ 3718.138732]  [<ffffffff811d4928>] __fput+0xe8/0x1f0
[ 3718.138782]  [<ffffffff811d4a69>] ____fput+0x9/0x10
[ 3718.138834]  [<ffffffff810842d3>] task_work_run+0x83/0xb0
[ 3718.138886]  [<ffffffff8106606e>] do_exit+0x3ee/0xc40
[ 3718.138987]  [<ffffffff8106694b>] do_group_exit+0x4b/0xc0
[ 3718.139038]  [<ffffffff81073d9a>] get_signal+0x2ca/0x940
[ 3718.139142]  [<ffffffff8101bf43>] do_signal+0x23/0x660
[ 3718.139247]  [<ffffffff810022b3>] exit_to_usermode_loop+0x73/0xb0
[ 3718.139297]  [<ffffffff81002cb0>] syscall_return_slowpath+0xb0/0xc0
[ 3718.139349]  [<ffffffff81625073>] entry_SYSCALL_64_fastpath+0xa6/0xa8

I'll try to see whether this behavior is reproducible.

Bart.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-09 18:48                       ` Bart Van Assche
  2016-08-09 23:10                         ` Bart Van Assche
@ 2016-08-10 10:45                         ` Oleg Nesterov
  2016-08-10 16:01                           ` Bart Van Assche
  2016-08-10 19:58                           ` Bart Van Assche
  1 sibling, 2 replies; 33+ messages in thread
From: Oleg Nesterov @ 2016-08-10 10:45 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Peter Zijlstra, mingo, Andrew Morton, Johannes Weiner,
	Neil Brown, Michael Shaver, linux-kernel

On 08/09, Bart Van Assche wrote:
>
> On 08/09/2016 10:15 AM, Oleg Nesterov wrote:
> >
> > --- x/kernel/sched/wait.c
> > +++ x/kernel/sched/wait.c
> > @@ -283,7 +283,7 @@ void abort_exclusive_wait(wait_queue_hea
> >  	if (!list_empty(&wait->task_list))
> >  		list_del_init(&wait->task_list);
> >  	else if (waitqueue_active(q))
> > -		__wake_up_locked_key(q, mode, key);
> > +		__wake_up_locked_key(q, TASK_NORMAL, key);
> >  	spin_unlock_irqrestore(&q->lock, flags);
> >  }
> >  EXPORT_SYMBOL(abort_exclusive_wait);
>
> Hello Oleg,
>
> That patch looks interesting to me.

And I'll redo/resend it, __wake_up_locked_key(mode) is simply wrong I think.

But it can't affect lock_page() because TASK_KILLABLE includes TASK_UNINTERRUPTIBLE
and we do not have lock_page_interruptible().
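
A minimal user-space sketch of the state-mask argument above. It assumes the
v4.7 values of the TASK_* bits (the "mode = 0x82" printed by the WARN_ONCE
earlier in this thread is consistent with them) and that a wakeup is delivered
only when the sleeper's state intersects the wake mode, as the state test in
try_to_wake_up() does. model_wakes() is a made-up helper, not a kernel function:

```c
#include <stdbool.h>

/* Assumed v4.7 values from include/linux/sched.h; check your own tree. */
#define TASK_INTERRUPTIBLE	0x01
#define TASK_UNINTERRUPTIBLE	0x02
#define TASK_WAKEKILL		0x80
#define TASK_KILLABLE		(TASK_WAKEKILL | TASK_UNINTERRUPTIBLE)
#define TASK_NORMAL		(TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE)

/*
 * Hypothetical helper: models the "sleeper state & wake mode" test that
 * decides whether a wakeup reaches a sleeping task.
 */
static bool model_wakes(unsigned int sleeper_state, unsigned int wake_mode)
{
	return (sleeper_state & wake_mode) != 0;
}
```

Under these assumptions a wakeup issued with mode TASK_KILLABLE still reaches
TASK_UNINTERRUPTIBLE and TASK_KILLABLE sleepers, which is why lock_page() is
unaffected. It would miss a TASK_INTERRUPTIBLE sleeper, which TASK_NORMAL
does reach.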


> Unfortunately even with that patch
> applied I still see lockups.

Thanks. I hoped this change could fix some other exclusive wait...

OK. Could you try another debugging patch below?

Oleg.
---

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index e5a3244..9d5f892 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -711,6 +711,15 @@ static inline int page_has_private(struct page *page)
 	return !!(page->flags & PAGE_FLAGS_PRIVATE);
 }
 
+void unlock_page(struct page *page);
+static inline void __ClearPageLocked_x(struct page *page)
+{
+	if (PageLocked(compound_head(page)))
+		unlock_page(page);
+}
+
+#define __ClearPageLocked(page)	__ClearPageLocked_x(page)
+
 #undef PF_ANY
 #undef PF_HEAD
 #undef PF_NO_TAIL

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-09 23:56                   ` Bart Van Assche
@ 2016-08-10 10:57                     ` Oleg Nesterov
  2016-08-10 11:03                       ` Peter Zijlstra
  0 siblings, 1 reply; 33+ messages in thread
From: Oleg Nesterov @ 2016-08-10 10:57 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Bart Van Assche, Peter Zijlstra, mingo, Andrew Morton,
	Johannes Weiner, Neil Brown, Michael Shaver, linux-kernel

On 08/09, Bart Van Assche wrote:
>
> Hello Oleg,
>
> Something that puzzles me is that removing the "else" keyword from
> abort_exclusive_wait() is sufficient to avoid the hang.

Yes, we need to understand this.

> If there would
> be code that clears PG_locked without calling wake_up() this hang
> probably would also be triggered by workloads that do not wake up
> lock_page_killable() with a signal.

Yes, and I already have another debugging patch to test this... it simply turns
lock_page_killable() into lock_page(). But let's check __ClearPageLocked() first
(the patch I sent a minute ago).

> BTW, the
> WARN_ONCE(!list_empty(&wait->task_list) && waitqueue_active(q), "mode =
> %#x\n", mode) statement that I added in abort_exclusive_wait() just
> produced the following call stack:

This condition is fine, and the trace is clear. This means that lock_page_killable()
was interrupted and wake_bit_function() was not called. We do not need another wakeup
in this case but somehow it helps. Again, I think because the necessary wakeup was
already lost/missed.

Oleg.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-10 10:57                     ` Oleg Nesterov
@ 2016-08-10 11:03                       ` Peter Zijlstra
  0 siblings, 0 replies; 33+ messages in thread
From: Peter Zijlstra @ 2016-08-10 11:03 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Bart Van Assche, Bart Van Assche, mingo, Andrew Morton,
	Johannes Weiner, Neil Brown, Michael Shaver, linux-kernel

On Wed, Aug 10, 2016 at 12:57:25PM +0200, Oleg Nesterov wrote:
> This condition is fine, and the trace is clear. This means that lock_page_killable()
> was interrupted and wake_bit_function() was not called. We do not need another wakeup
> in this case but somehow it helps. Again, I think because the necessary wakeup was
> already lost/missed.

I suspect the same. Removing that else generates 'spurious' wakeups,
which can unstick the situation, hiding the real source of the problem.
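
To make the 'spurious' wakeup concrete, here is a rough user-space model of the
two variants of the abort path. It is only a sketch: the model_* names and the
fixed-size array are invented for illustration and are not kernel APIs, and
locking is left out entirely. With keep_else set, a waiter that is still
queued just unlinks itself; with keep_else cleared (the "else" removed), the
abort also wakes the next queued waiter.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_WAITERS 4

/* Hypothetical stand-in for a wait queue entry. */
struct model_waiter {
	bool queued;	/* still linked into the queue */
	bool woken;	/* has received a wakeup */
};

/* Hypothetical stand-in for a wait queue head. */
struct model_queue {
	struct model_waiter *slot[MODEL_MAX_WAITERS];
	size_t nr;
};

static void model_enqueue(struct model_queue *q, struct model_waiter *w)
{
	if (q->nr < MODEL_MAX_WAITERS) {
		w->queued = true;
		w->woken = false;
		q->slot[q->nr++] = w;
	}
}

/* True if at least one waiter is still queued. */
static bool model_active(const struct model_queue *q)
{
	for (size_t i = 0; i < q->nr; i++)
		if (q->slot[i]->queued)
			return true;
	return false;
}

/* Exclusive wakeup: wake and unlink only the first queued waiter. */
static bool model_wake_one(struct model_queue *q)
{
	for (size_t i = 0; i < q->nr; i++) {
		if (q->slot[i]->queued) {
			q->slot[i]->queued = false;
			q->slot[i]->woken = true;
			return true;
		}
	}
	return false;
}

/*
 * Abort path of an interrupted exclusive waiter.
 * keep_else == true:  pass a wakeup on only if 'self' was already
 *                     unlinked, i.e. it had consumed a wakeup.
 * keep_else == false: always pass a wakeup on (the "else" removed).
 */
static void model_abort(struct model_queue *q, struct model_waiter *self,
			bool keep_else)
{
	bool was_queued = self->queued;

	if (was_queued)
		self->queued = false;
	if ((!was_queued || !keep_else) && model_active(q))
		(void)model_wake_one(q);
}
```

In this model, aborting a still-queued waiter with keep_else cleared wakes the
next waiter even though no wakeup was consumed, which is the kind of extra
wakeup that could paper over one that was lost earlier.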

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-10 10:45                         ` Oleg Nesterov
@ 2016-08-10 16:01                           ` Bart Van Assche
  2016-08-10 16:27                             ` Oleg Nesterov
  2016-08-10 19:58                           ` Bart Van Assche
  1 sibling, 1 reply; 33+ messages in thread
From: Bart Van Assche @ 2016-08-10 16:01 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, mingo, Andrew Morton, Johannes Weiner,
	Neil Brown, Michael Shaver, linux-kernel

On 08/10/2016 03:46 AM, Oleg Nesterov wrote:
> OK. Could you try another debugging patch below?
> 
> Oleg.
> ---
> 
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index e5a3244..9d5f892 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -711,6 +711,15 @@ static inline int page_has_private(struct page *page)
>  	return !!(page->flags & PAGE_FLAGS_PRIVATE);
>  }
>  
> +void unlock_page(struct page *page);
> +static inline void __ClearPageLocked_x(struct page *page)
> +{
> +	if (PageLocked(compound_head(page)))
> +		unlock_page(page);
> +}
> +
> +#define __ClearPageLocked(page)	__ClearPageLocked_x(page)
> +
>  #undef PF_ANY
>  #undef PF_HEAD
>  #undef PF_NO_TAIL
 
Hi Oleg,

Are you sure that all __ClearPageLocked() users pass the compound head
to that macro? How about testing the patch below instead?

Thanks,

Bart.
 
---
 include/linux/page-flags.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index e5a3244..10d8e63 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -711,6 +711,17 @@ static inline int page_has_private(struct page *page)
 	return !!(page->flags & PAGE_FLAGS_PRIVATE);
 }
 
+void unlock_page(struct page *page);
+static inline void __ClearPageLocked_x(struct page *page)
+{
+	if (PageLocked(compound_head(page)))
+		unlock_page(page);
+	else
+		__ClearPageLocked(page);
+}
+
+#define __ClearPageLocked(page)	__ClearPageLocked_x(page)
+
 #undef PF_ANY
 #undef PF_HEAD
 #undef PF_NO_TAIL
-- 
2.9.2

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-10 16:01                           ` Bart Van Assche
@ 2016-08-10 16:27                             ` Oleg Nesterov
  0 siblings, 0 replies; 33+ messages in thread
From: Oleg Nesterov @ 2016-08-10 16:27 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Peter Zijlstra, mingo, Andrew Morton, Johannes Weiner,
	Neil Brown, Michael Shaver, linux-kernel

On 08/10, Bart Van Assche wrote:
>
> On 08/10/2016 03:46 AM, Oleg Nesterov wrote:
> > OK. Could you try another debugging patch below?
> >
> > Oleg.
> > ---
> >
> > diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> > index e5a3244..9d5f892 100644
> > --- a/include/linux/page-flags.h
> > +++ b/include/linux/page-flags.h
> > @@ -711,6 +711,15 @@ static inline int page_has_private(struct page *page)
> >  	return !!(page->flags & PAGE_FLAGS_PRIVATE);
> >  }
> >
> > +void unlock_page(struct page *page);
> > +static inline void __ClearPageLocked_x(struct page *page)
> > +{
> > +	if (PageLocked(compound_head(page)))
> > +		unlock_page(page);
> > +}
> > +
> > +#define __ClearPageLocked(page)	__ClearPageLocked_x(page)
> > +
> >  #undef PF_ANY
> >  #undef PF_HEAD
> >  #undef PF_NO_TAIL
>
> Hi Oleg,
>
> Are you sure that all __ClearPageLocked() users pass the compound head
> to that macro?

Hmm. It obviously should... Which kernel version do you use for testing?

From include/linux/page-flags.h:

	__PAGEFLAG(Locked, locked, PF_NO_TAIL)

and

	#define PF_NO_TAIL(page, enforce) ({                                    \
		VM_BUG_ON_PGFLAGS(enforce && PageTail(page), page);     \
	compound_head(page);})
	
and this matches compound_head() in lock/unlock_page().

> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -711,6 +711,17 @@ static inline int page_has_private(struct page *page)
>  	return !!(page->flags & PAGE_FLAGS_PRIVATE);
>  }
>
> +void unlock_page(struct page *page);
> +static inline void __ClearPageLocked_x(struct page *page)
> +{
> +	if (PageLocked(compound_head(page)))
> +		unlock_page(page);
> +	else
> +		__ClearPageLocked(page);
> +}

No, no. If you use an old kernel (which doesn't call compound_head() in
lock_page()), then just remove compound_head() from __ClearPageLocked_x()
above:

	static inline void __ClearPageLocked_x(struct page *page)
	{
		if (PageLocked(page))
			unlock_page(page);
	}

even if this shouldn't make any difference afaics, note the
VM_BUG_ON_PGFLAGS() above.

Oleg.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-10 10:45                         ` Oleg Nesterov
  2016-08-10 16:01                           ` Bart Van Assche
@ 2016-08-10 19:58                           ` Bart Van Assche
  2016-08-11 17:36                             ` Oleg Nesterov
  1 sibling, 1 reply; 33+ messages in thread
From: Bart Van Assche @ 2016-08-10 19:58 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, mingo, Andrew Morton, Johannes Weiner,
	Neil Brown, Michael Shaver, linux-kernel

On 08/10/2016 03:46 AM, Oleg Nesterov wrote:
> OK. Could you try another debugging patch below?
>
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index e5a3244..9d5f892 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -711,6 +711,15 @@ static inline int page_has_private(struct page *page)
>  	return !!(page->flags & PAGE_FLAGS_PRIVATE);
>  }
>
> +void unlock_page(struct page *page);
> +static inline void __ClearPageLocked_x(struct page *page)
> +{
> +	if (PageLocked(compound_head(page)))
> +		unlock_page(page);
> +}
> +
> +#define __ClearPageLocked(page)	__ClearPageLocked_x(page)
> +
>  #undef PF_ANY
>  #undef PF_HEAD
>  #undef PF_NO_TAIL

Hello Oleg,

That's an excellent catch. With your previous patch and this patch 
applied I can't reproduce the hang in truncate_inode_pages_range() 
anymore. I still see some other wait_on_page_bit() hangs after an I/O 
error has occurred. However, the hangs that I still see are related to 
waiting on buffer head state changes and not on the PG_locked page flag.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-10 19:58                           ` Bart Van Assche
@ 2016-08-11 17:36                             ` Oleg Nesterov
  2016-08-12 16:16                               ` Oleg Nesterov
  0 siblings, 1 reply; 33+ messages in thread
From: Oleg Nesterov @ 2016-08-11 17:36 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Peter Zijlstra, mingo, Andrew Morton, Johannes Weiner,
	Neil Brown, Michael Shaver, linux-kernel

Hi Bart,

On 08/10, Bart Van Assche wrote:
>
> That's an excellent catch. With your previous patch and this patch applied I
> can't reproduce the hang in truncate_inode_pages_range() anymore.

Great, thanks.

I'll send another debugging patch tomorrow, I was a bit busy today. The next
step is obvious, we need to know the caller.

But just in case, this doesn't necessarily mean that the usage of
__ClearPageLocked() is actually buggy, we don't really know this so far...

And I can't understand another oddity. Your test-case hangs in kill_bdev()
path which sleeps with bdev->bd_openers == 0 under bdev->bd_mutex so it can't
be re-opened. However, since your change in abort_exclusive_wait() helped,
there should be readers sleeping in lock_page_killable() and thus bd_openers
can't be zero.

Never mind, I don't understand this code even remotely, we will see later
who should be asked.

> I still
> see some other wait_on_page_bit() hangs after an I/O error has occurred.
> However, the hangs that I still see are related to waiting on buffer head
> state changes and not on the PG_locked page flag.

I don't know if this is right or not... let's discuss this later.

Thanks!

Oleg.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-11 17:36                             ` Oleg Nesterov
@ 2016-08-12 16:16                               ` Oleg Nesterov
  2016-08-12 16:27                                 ` Bart Van Assche
  2016-08-12 22:47                                 ` Bart Van Assche
  0 siblings, 2 replies; 33+ messages in thread
From: Oleg Nesterov @ 2016-08-12 16:16 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Peter Zijlstra, mingo, Andrew Morton, Johannes Weiner,
	Neil Brown, Michael Shaver, linux-kernel

On 08/11, Oleg Nesterov wrote:
>
> I'll send another debugging patch tomorrow, I was a bit busy today. The next
> step is obvious, we need to know the caller.

Please drop the two patches I sent before and try the new one below.

Which kernel version do you use?

Oleg.
---

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index e5a3244..533da3ab 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -711,6 +711,15 @@ static inline int page_has_private(struct page *page)
 	return !!(page->flags & PAGE_FLAGS_PRIVATE);
 }
 
+void unlock_page_x(struct page *page);
+static inline void __ClearPageLocked_x(struct page *page)
+{
+	if (PageLocked(compound_head(page)))
+		unlock_page_x(page);
+}
+
+#define __ClearPageLocked(page)	__ClearPageLocked_x(page)
+
 #undef PF_ANY
 #undef PF_HEAD
 #undef PF_NO_TAIL
diff --git a/mm/filemap.c b/mm/filemap.c
index 20f3b1f..fb320fb 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -837,6 +837,43 @@ void unlock_page(struct page *page)
 }
 EXPORT_SYMBOL(unlock_page);
 
+void unlock_page_x(struct page *__page)
+{
+	struct page *page = compound_head(__page);
+	wait_queue_head_t *wq = page_waitqueue(page);
+	struct wait_bit_key key = __WAIT_BIT_KEY_INITIALIZER(&page->flags, PG_locked);
+	wait_queue_t *curr, *next;
+	unsigned long flags;
+	bool w = false;
+
+	#define W() do {								\
+		if (!w) { w = true; pr_crit("XXXXXXXXXXXX\n"); dump_stack(); }	\
+	} while (0)
+
+	clear_bit_unlock(PG_locked, &page->flags);
+	smp_mb__after_atomic();
+
+	if (!waitqueue_active(wq))
+		return;
+
+	spin_lock_irqsave(&wq->lock, flags);
+	list_for_each_entry_safe(curr, next, &wq->task_list, task_list) {
+		if (curr->func == wake_bit_function) {
+			struct wait_bit_queue *wb = container_of(curr, struct wait_bit_queue, wait);
+			if (wb->key.flags == key.flags && wb->key.bit_nr == PG_locked) {
+				W();
+				pr_crit("XXX flags = %x, waiter:\n", curr->flags);
+				sched_show_task(curr->private);
+			}
+		} else {
+			W();
+			pr_crit("XXX flags = %x, func = %pF\n", curr->flags, curr->func);
+		}
+		curr->func(curr, TASK_NORMAL, 0, &key);
+	}
+	spin_unlock_irqrestore(&wq->lock, flags);
+}
+
 /**
  * end_page_writeback - end writeback against a page
  * @page: the page

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-12 16:16                               ` Oleg Nesterov
@ 2016-08-12 16:27                                 ` Bart Van Assche
  2016-08-12 22:47                                 ` Bart Van Assche
  1 sibling, 0 replies; 33+ messages in thread
From: Bart Van Assche @ 2016-08-12 16:27 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, mingo, Andrew Morton, Johannes Weiner,
	Neil Brown, Michael Shaver, linux-kernel

On 08/12/2016 09:16 AM, Oleg Nesterov wrote:
> On 08/11, Oleg Nesterov wrote:
> Please drop two patches I sent before and try the new one below.

Thanks, will do.

> Which kernel version do you use?

Kernel v4.7 with a few ib_srp and dm-mpath backports from kernel 
v4.8-rc1 and also a few SCSI patches that I'm still testing (see also 
https://github.com/bvanassche/linux/commits/srp-initiator-for-next).

Bart.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-12 16:16                               ` Oleg Nesterov
  2016-08-12 16:27                                 ` Bart Van Assche
@ 2016-08-12 22:47                                 ` Bart Van Assche
  2016-08-13 16:32                                   ` Oleg Nesterov
  2016-08-13 17:07                                   ` Oleg Nesterov
  1 sibling, 2 replies; 33+ messages in thread
From: Bart Van Assche @ 2016-08-12 22:47 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, mingo, Andrew Morton, Johannes Weiner,
	Neil Brown, Michael Shaver, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4307 bytes --]

On 08/12/2016 09:16 AM, Oleg Nesterov wrote:
> Please drop two patches I sent before and try the new one below.

Hello Oleg,

Thanks for the patch. In addition to your patch I also applied the
attached two patches before I started testing. It took some time
before I could reproduce the hang in truncate_inode_pages_range().
To my surprise the following appeared in the system log instead of
a list of waiting tasks when I succeeded in reproducing this hang:

Aug 12 14:48:06 ion-dev-ib-ini systemd-udevd[500]: seq 11210 '/devices/virtual/block/dm-0' is taking a long time
Aug 12 14:48:07 ion-dev-ib-ini systemd-udevd[500]: seq 11227 '/devices/virtual/block/dm-1' is taking a long time
Aug 12 14:50:06 ion-dev-ib-ini systemd-udevd[500]: seq 11210 '/devices/virtual/block/dm-0' killed
Aug 12 14:50:06 ion-dev-ib-ini kernel: do_generic_file_read / pid 17232: killed
Aug 12 14:50:06 ion-dev-ib-ini systemd[1]: Started Cleanup of Temporary Directories.
Aug 12 14:50:36 ion-dev-ib-ini kernel: __lock_page_impl / pid 17224 / m 0x2: timeout - continuing to wait for 17224
Aug 12 14:50:36 ion-dev-ib-ini kernel: __lock_page_impl / pid 17232 / m 0x2: timeout - continuing to wait for 17232
Aug 12 14:51:06 ion-dev-ib-ini kernel: __lock_page_impl / pid 17224 / m 0x2: timeout - continuing to wait for 17224
Aug 12 14:51:06 ion-dev-ib-ini kernel: __lock_page_impl / pid 17232 / m 0x2: timeout - continuing to wait for 17232
[ ... ]

Running echo w > /proc/sysrq-trigger showed me that both pid 17224 and
17232 were hanging in truncate_inode_pages_range(). Does this mean that
some code in mm or in the filesystem I was using for this test (ext4) does
not unlock all pages it should unlock if a fatal signal is received?

Please let me know if you would like me to repost this message on an
mm-related mailing list.

Thanks,

Bart.

The echo w > /proc/sysrq-trigger output:

sysrq: SysRq : Show Blocked State
  task                        PC stack   pid father
systemd-udevd   D ffff88039870b7e8     0 17224    500 0x00000006
Call Trace:
 [<ffffffff816219f7>] schedule+0x37/0x90
 [<ffffffff81626019>] schedule_timeout+0x249/0x470
 [<ffffffff81620dcf>] io_schedule_timeout+0x9f/0x110
 [<ffffffff81622204>] bit_wait_io_timeout+0x24/0x70
 [<ffffffff81621f89>] __wait_on_bit_lock+0x49/0xa0
 [<ffffffff81152be5>] __lock_page_impl+0xe5/0x160
 [<ffffffff81152c6e>] __lock_page+0xe/0x10
 [<ffffffff811666a6>] truncate_inode_pages_range+0x416/0x7c0
 [<ffffffff81166a60>] truncate_inode_pages+0x10/0x20
 [<ffffffff81214200>] kill_bdev+0x30/0x40
 [<ffffffff81215521>] __blkdev_put+0x71/0x360
 [<ffffffff81215859>] blkdev_put+0x49/0x170
 [<ffffffff812159a0>] blkdev_close+0x20/0x30
 [<ffffffff811d6058>] __fput+0xe8/0x1f0
 [<ffffffff811d6199>] ____fput+0x9/0x10
 [<ffffffff81084453>] task_work_run+0x83/0xb0
 [<ffffffff810661ee>] do_exit+0x3ee/0xc40
 [<ffffffff81066acb>] do_group_exit+0x4b/0xc0
 [<ffffffff81073f1a>] get_signal+0x2ca/0x940
 [<ffffffff8101bf43>] do_signal+0x23/0x660
 [<ffffffff810022b3>] exit_to_usermode_loop+0x73/0xb0
 [<ffffffff81002cb0>] syscall_return_slowpath+0xb0/0xc0
 [<ffffffff816274b3>] entry_SYSCALL_64_fastpath+0xa6/0xa8
systemd-udevd   D ffff88006ce6f7e8     0 17232    500 0x00000006
Call Trace:
 [<ffffffff816219f7>] schedule+0x37/0x90
 [<ffffffff81626019>] schedule_timeout+0x249/0x470
 [<ffffffff81620dcf>] io_schedule_timeout+0x9f/0x110
 [<ffffffff81622204>] bit_wait_io_timeout+0x24/0x70
 [<ffffffff81621f89>] __wait_on_bit_lock+0x49/0xa0
 [<ffffffff81152be5>] __lock_page_impl+0xe5/0x160
 [<ffffffff81152c6e>] __lock_page+0xe/0x10
 [<ffffffff811666a6>] truncate_inode_pages_range+0x416/0x7c0
 [<ffffffff81166a60>] truncate_inode_pages+0x10/0x20
 [<ffffffff81214200>] kill_bdev+0x30/0x40
 [<ffffffff81215521>] __blkdev_put+0x71/0x360
 [<ffffffff81215859>] blkdev_put+0x49/0x170
 [<ffffffff812159a0>] blkdev_close+0x20/0x30
 [<ffffffff811d6058>] __fput+0xe8/0x1f0
 [<ffffffff811d6199>] ____fput+0x9/0x10
 [<ffffffff81084453>] task_work_run+0x83/0xb0
 [<ffffffff810661ee>] do_exit+0x3ee/0xc40
 [<ffffffff81066acb>] do_group_exit+0x4b/0xc0
 [<ffffffff81073f1a>] get_signal+0x2ca/0x940
 [<ffffffff8101bf43>] do_signal+0x23/0x660
 [<ffffffff810022b3>] exit_to_usermode_loop+0x73/0xb0
 [<ffffffff81002cb0>] syscall_return_slowpath+0xb0/0xc0
 [<ffffffff816274b3>] entry_SYSCALL_64_fastpath+0xa6/0xa8


[-- Attachment #2: 0001-mm-__lock_page-dbg.patch --]
[-- Type: text/x-patch, Size: 7101 bytes --]

From af1cda43467c7fe2a6c76b11a6c25fcbec424ce3 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bart.vanassche@sandisk.com>
Date: Thu, 11 Aug 2016 16:38:32 -0700
Subject: [PATCH] mm: __lock_page() dbg

---
 include/linux/mm_types.h |  3 +++
 include/linux/pagemap.h  | 22 ++++++++++++++++++++--
 mm/filemap.c             | 44 ++++++++++++++++++++++++++++++++------------
 mm/ksm.c                 |  1 +
 mm/migrate.c             |  1 +
 mm/shmem.c               |  1 +
 mm/swap_state.c          |  2 ++
 mm/vmscan.c              |  1 +
 8 files changed, 61 insertions(+), 14 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index ca3e517..59fdfeb 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -23,6 +23,7 @@
 
 struct address_space;
 struct mem_cgroup;
+struct task_struct;
 
 #define USE_SPLIT_PTE_PTLOCKS	(NR_CPUS >= CONFIG_SPLIT_PTLOCK_CPUS)
 #define USE_SPLIT_PMD_PTLOCKS	(USE_SPLIT_PTE_PTLOCKS && \
@@ -220,6 +221,8 @@ struct page {
 #ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
 	int _last_cpupid;
 #endif
+
+	struct task_struct *owner;
 }
 /*
  * The struct page can be forced to be double word aligned so that atomic ops
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 9735410..d332674 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -419,10 +419,25 @@ extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 				unsigned int flags);
 extern void unlock_page(struct page *page);
 
+static inline struct task_struct *get_page_lock_owner(struct page *page)
+{
+	return page->owner;
+}
+
+static inline void set_page_lock_owner(struct page *page, struct task_struct *t)
+{
+	page->owner = t;
+}
+
 static inline int trylock_page(struct page *page)
 {
+	int res;
+
 	page = compound_head(page);
-	return (likely(!test_and_set_bit_lock(PG_locked, &page->flags)));
+	res = !test_and_set_bit_lock(PG_locked, &page->flags);
+	if (likely(res))
+		set_page_lock_owner(page, current);
+	return res;
 }
 
 /*
@@ -641,9 +656,12 @@ static inline int add_to_page_cache(struct page *page,
 	int error;
 
 	__SetPageLocked(page);
+	set_page_lock_owner(page, current);
 	error = add_to_page_cache_locked(page, mapping, offset, gfp_mask);
-	if (unlikely(error))
+	if (unlikely(error)) {
+		set_page_lock_owner(page, NULL);
 		__ClearPageLocked(page);
+	}
 	return error;
 }
 
diff --git a/mm/filemap.c b/mm/filemap.c
index 530e75a..0ad8bf6 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -699,11 +699,13 @@ int add_to_page_cache_lru(struct page *page, struct address_space *mapping,
 	int ret;
 
 	__SetPageLocked(page);
+	set_page_lock_owner(page, current);
 	ret = __add_to_page_cache_locked(page, mapping, offset,
 					 gfp_mask, &shadow);
-	if (unlikely(ret))
+	if (unlikely(ret)) {
+		set_page_lock_owner(page, NULL);
 		__ClearPageLocked(page);
-	else {
+	} else {
 		/*
 		 * The page might have been evicted from cache only
 		 * recently, in which case it should be activated like
@@ -831,6 +833,7 @@ void unlock_page(struct page *page)
 {
 	page = compound_head(page);
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
+	set_page_lock_owner(page, NULL);
 	clear_bit_unlock(PG_locked, &page->flags);
 	smp_mb__after_atomic();
 	wake_up_page(page, PG_locked);
@@ -925,27 +928,44 @@ void page_endio(struct page *page, int rw, int err)
 }
 EXPORT_SYMBOL_GPL(page_endio);
 
+int __lock_page_impl(struct page *page, int mode)
+{
+	struct page *page_head = compound_head(page);
+	DEFINE_WAIT_BIT(wait, &page_head->flags, PG_locked);
+	struct task_struct *owner;
+	int res;
+
+	for (;;) {
+		wait.key.timeout = jiffies + 30 * HZ;
+		res = __wait_on_bit_lock(page_waitqueue(page_head),
+					 &wait, bit_wait_io_timeout, mode);
+		if (res == 0) {
+			set_page_lock_owner(page, current);
+			break;
+		}
+		if (res == -EINTR)
+			break;
+		owner = get_page_lock_owner(page);
+		pr_info("%s / pid %d / m %#x: %s - continuing to wait for %d\n",
+			__func__, task_pid_nr(current), mode, res == -EAGAIN ?
+			"timeout" : "interrupted",
+			owner ? task_pid_nr(owner) : 0);
+	}
+	return res;
+}
 /**
  * __lock_page - get a lock on the page, assuming we need to sleep to get it
  * @page: the page to lock
  */
 void __lock_page(struct page *page)
 {
-	struct page *page_head = compound_head(page);
-	DEFINE_WAIT_BIT(wait, &page_head->flags, PG_locked);
-
-	__wait_on_bit_lock(page_waitqueue(page_head), &wait, bit_wait_io,
-							TASK_UNINTERRUPTIBLE);
+	__lock_page_impl(page, TASK_UNINTERRUPTIBLE);
 }
 EXPORT_SYMBOL(__lock_page);
 
 int __lock_page_killable(struct page *page)
 {
-	struct page *page_head = compound_head(page);
-	DEFINE_WAIT_BIT(wait, &page_head->flags, PG_locked);
-
-	return __wait_on_bit_lock(page_waitqueue(page_head), &wait,
-					bit_wait_io, TASK_KILLABLE);
+	return __lock_page_impl(page, TASK_KILLABLE);
 }
 EXPORT_SYMBOL_GPL(__lock_page_killable);
 
diff --git a/mm/ksm.c b/mm/ksm.c
index 4786b41..20ca878 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1880,6 +1880,7 @@ struct page *ksm_might_need_to_copy(struct page *page,
 		SetPageDirty(new_page);
 		__SetPageUptodate(new_page);
 		__SetPageLocked(new_page);
+		set_page_lock_owner(new_page, current);
 	}
 
 	return new_page;
diff --git a/mm/migrate.c b/mm/migrate.c
index bd3fdc2..50e5bc1 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1794,6 +1794,7 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 
 	/* Prepare a page as a migration target */
 	__SetPageLocked(new_page);
+	set_page_lock_owner(new_page, current);
 	__SetPageSwapBacked(new_page);
 
 	/* anon mapping, we can simply copy page->mapping to the new page: */
diff --git a/mm/shmem.c b/mm/shmem.c
index 171dee7..0af6bf7 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1021,6 +1021,7 @@ static struct page *shmem_alloc_page(gfp_t gfp,
 	page = alloc_pages_vma(gfp, 0, &pvma, 0, numa_node_id(), false);
 	if (page) {
 		__SetPageLocked(page);
+		set_page_lock_owner(page, current);
 		__SetPageSwapBacked(page);
 	}
 
diff --git a/mm/swap_state.c b/mm/swap_state.c
index c99463a..8522a8c 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -361,6 +361,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 
 		/* May fail (-ENOMEM) if radix-tree node allocation failed. */
 		__SetPageLocked(new_page);
+		set_page_lock_owner(new_page, current);
 		__SetPageSwapBacked(new_page);
 		err = __add_to_swap_cache(new_page, entry);
 		if (likely(!err)) {
@@ -373,6 +374,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 			return new_page;
 		}
 		radix_tree_preload_end();
+		set_page_lock_owner(new_page, NULL);
 		__ClearPageLocked(new_page);
 		/*
 		 * add_to_swap_cache() doesn't return -EEXIST, so we can safely
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c4a2f45..67d7496 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1190,6 +1190,7 @@ lazyfree:
 		 * we obviously don't have to worry about waking up a process
 		 * waiting on the page lock, because there are no references.
 		 */
+		set_page_lock_owner(page, NULL);
 		__ClearPageLocked(page);
 free_it:
 		if (ret == SWAP_LZFREE)
-- 
2.9.2


[-- Attachment #3: 0001-do_generic_file_read-Fail-immediately-if-killed.patch --]
[-- Type: text/x-patch, Size: 882 bytes --]

From 32f250e0c8aa3d90f7fc8ac293060e2944d359a5 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bart.vanassche@sandisk.com>
Date: Thu, 11 Aug 2016 11:02:29 -0700
Subject: [PATCH] do_generic_file_read(): Fail immediately if killed

If a fatal signal has been received, fail immediately instead of
trying to read more data.
---
 mm/filemap.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 20f3b1f..6e46fb5 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1643,7 +1643,12 @@ find_page:
 			 * wait_on_page_locked is used to avoid unnecessarily
 			 * serialisations and why it's safe.
 			 */
-			wait_on_page_locked_killable(page);
+			error = wait_on_page_locked_killable(page);
+			if (error == -EINTR) {
+				put_page(page);
+				goto out;
+			}
+			error = 0;
 			if (PageUptodate(page))
 				goto page_ok;
 
-- 
2.9.2



* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-12 22:47                                 ` Bart Van Assche
@ 2016-08-13 16:32                                   ` Oleg Nesterov
  2016-08-15 23:39                                     ` Bart Van Assche
  2016-08-13 17:07                                   ` Oleg Nesterov
  1 sibling, 1 reply; 33+ messages in thread
From: Oleg Nesterov @ 2016-08-13 16:32 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Peter Zijlstra, mingo, Andrew Morton, Johannes Weiner,
	Neil Brown, Michael Shaver, linux-kernel

On 08/12, Bart Van Assche wrote:
>
> On 08/12/2016 09:16 AM, Oleg Nesterov wrote:
> > Please drop two patches I sent before and try the new one below.
>
> Hello Oleg,
>
> Thanks for the patch. In addition to your patch I also applied the
> attached two patches

And I guess you did this because you think we do not have enough
confusion so you decided to add a bit more ;)

Could you please test my patch alone without additional changes?

> before I started testing. It took some time
> before I could reproduce the hang in truncate_inode_pages_range().

all I can say this contradicts with the previous testing results with
my previous patch or with your change in abort_exclusive_wait().

> +int __lock_page_impl(struct page *page, int mode)
> +{
> +	struct page *page_head = compound_head(page);
> +	DEFINE_WAIT_BIT(wait, &page_head->flags, PG_locked);
> +	struct task_struct *owner;
> +	int res;
> +
> +	for (;;) {
> +		wait.key.timeout = jiffies + 30 * HZ;
> +		res = __wait_on_bit_lock(page_waitqueue(page_head),
> +					 &wait, bit_wait_io_timeout, mode);
> +		if (res == 0) {
> +			set_page_lock_owner(page, current);

this is not right, you should use page_head. Although I doubt this can
make a difference in this case. The same for get_page_lock_owner() below.
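
To illustrate the page_head point, here is a minimal userspace sketch. It
is only a model under stated assumptions: struct task, struct page_model
and the model_*/owner_* helpers are hypothetical stand-ins, not kernel
APIs. It assumes, as in the quoted patch, that trylock_page() records the
owner on the compound head, so a waiter that was handed a tail page and
reads the owner from that tail page sees nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: simplified stand-ins, not kernel definitions. */
struct task { int pid; };

struct page_model {
	struct page_model *head;	/* NULL for a head or order-0 page */
	unsigned long flags;		/* bit 0 stands in for PG_locked */
	struct task *owner;		/* debugging field, as in the patch */
};

static struct page_model *model_compound_head(struct page_model *page)
{
	assert(page != NULL);
	return page->head ? page->head : page;
}

/* Like trylock_page() in the patch: lock bit and owner live on the head. */
static int model_trylock(struct page_model *page, struct task *t)
{
	page = model_compound_head(page);
	if (page->flags & 1UL)
		return 0;		/* already locked */
	page->flags |= 1UL;
	page->owner = t;
	return 1;
}

/* Owner lookup as posted: reads whichever page was passed in. */
static struct task *owner_as_posted(struct page_model *page)
{
	return page->owner;
}

/* Owner lookup as suggested in review: always go through the head page. */
static struct task *owner_via_head(struct page_model *page)
{
	return model_compound_head(page)->owner;
}
```

In this model, locking through a tail page and then calling
owner_as_posted() on that tail page returns NULL, while owner_via_head()
returns the task that took the lock.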

> +			break;
> +		}
> +		if (res == -EINTR)
> +			break;
> +		owner = get_page_lock_owner(page);
> +		pr_info("%s / pid %d / m %#x: %s - continuing to wait for %d\n",
> +			__func__, task_pid_nr(current), mode, res == -EAGAIN ?
> +			"timeout" : "interrupted",
> +			owner ? task_pid_nr(owner) : 0);

I thought about the similar debugging patch too. But this is not what
we need. Note that if res == -EAGAIN then another exclusive waiter was
already woken and it can lock this page and set get_page_lock_owner().
So this can't actually help if the problem is the missed/lost wakeup.

Not that it explains the strange dmesg you reported. Perhaps your patch
has other bugs, or my patch is buggy, or both. Please do not mix them.

As for "add the timeout" idea it makes sense too and perhaps we will test
this later, but we can start with the much more simple patch.

Oleg.


* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-12 22:47                                 ` Bart Van Assche
  2016-08-13 16:32                                   ` Oleg Nesterov
@ 2016-08-13 17:07                                   ` Oleg Nesterov
  1 sibling, 0 replies; 33+ messages in thread
From: Oleg Nesterov @ 2016-08-13 17:07 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Peter Zijlstra, mingo, Andrew Morton, Johannes Weiner,
	Neil Brown, Michael Shaver, linux-kernel

Forgot to mention...

On 08/12, Bart Van Assche wrote:
>
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -1643,7 +1643,12 @@ find_page:
>  			 * wait_on_page_locked is used to avoid unnecessarily
>  			 * serialisations and why it's safe.
>  			 */
> -			wait_on_page_locked_killable(page);
> +			error = wait_on_page_locked_killable(page);
> +			if (error == -EINTR) {
> +				put_page(page);
> +				goto out;
> +			}
> +			error = 0;

This change probably makes sense regardless although I'd suggest to
simplify it:

	-		wait_on_page_locked_killable(page);
	+		error = wait_on_page_locked_killable(page);
	+		if (unlikely(error))
	+			goto readpage_error;


but it looks off-topic. And the changelog looks misleading/wrong.

I do not think this change makes sense in this debugging session,

Oleg.


* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-13 16:32                                   ` Oleg Nesterov
@ 2016-08-15 23:39                                     ` Bart Van Assche
  2016-08-16 13:06                                       ` Oleg Nesterov
  0 siblings, 1 reply; 33+ messages in thread
From: Bart Van Assche @ 2016-08-15 23:39 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, mingo, Andrew Morton, Johannes Weiner,
	Neil Brown, Michael Shaver, linux-kernel

On 08/13/2016 09:32 AM, Oleg Nesterov wrote:
> On 08/12, Bart Van Assche wrote:
>> before I started testing. It took some time
>> before I could reproduce the hang in truncate_inode_pages_range().
>
> all I can say this contradicts with the previous testing results with
> my previous patch or with your change in abort_exclusive_wait().

Hello Oleg,

My opinion is that all this means is that we do not yet have a full 
understanding of what is going on.

BTW, I have improved my page lock owner instrumentation patch such that 
it prints a call stack of the lock owner if lock_page() takes too long. 
The following call stack was reported:

__lock_page / pid 8549 / m 0x2: timeout - continuing to wait for 8549
   [<ffffffff8102b316>] save_stack_trace+0x26/0x50
   [<ffffffff81152bee>] add_to_page_cache_lru+0x7e/0x170
   [<ffffffff8121bfc5>] mpage_readpages+0xc5/0x170
   [<ffffffff81215548>] blkdev_readpages+0x18/0x20
   [<ffffffff81163a68>] __do_page_cache_readahead+0x268/0x310
   [<ffffffff811640a8>] force_page_cache_readahead+0xa8/0x100
   [<ffffffff81164139>] page_cache_sync_readahead+0x39/0x40
   [<ffffffff81153967>] generic_file_read_iter+0x707/0x920
   [<ffffffff81215920>] blkdev_read_iter+0x30/0x40
   [<ffffffff811d4b4b>] __vfs_read+0xbb/0x130
   [<ffffffff811d4f31>] vfs_read+0x91/0x130
   [<ffffffff811d62b4>] SyS_read+0x44/0xa0
   [<ffffffff816281e5>] entry_SYSCALL_64_fastpath+0x18/0xa8

My understanding of mpage_readpages() is that the page unlock happens 
after readahead I/O completed (see also page_endio()). So this probably 
means that an I/O request submitted because of readahead code did not 
get completed. I will see whether I can find anything that's wrong in 
the block layer.

Bart.


* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-15 23:39                                     ` Bart Van Assche
@ 2016-08-16 13:06                                       ` Oleg Nesterov
  2016-08-16 16:54                                         ` Bart Van Assche
  0 siblings, 1 reply; 33+ messages in thread
From: Oleg Nesterov @ 2016-08-16 13:06 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Peter Zijlstra, mingo, Andrew Morton, Johannes Weiner,
	Neil Brown, Michael Shaver, linux-kernel

On 08/15, Bart Van Assche wrote:
>
> On 08/13/2016 09:32 AM, Oleg Nesterov wrote:
>> On 08/12, Bart Van Assche wrote:
>>> before I started testing. It took some time
>>> before I could reproduce the hang in truncate_inode_pages_range().
>>
>> all I can say this contradicts with the previous testing results with
>> my previous patch or with your change in abort_exclusive_wait().
>
> Hello Oleg,
>
> My opinion is that all this means is that we do not yet have a full
> understanding of what is going on.

Sure.

> BTW, I have improved my page lock owner instrumentation patch such that
> it prints a call stack of the lock owner if lock_page() takes too long.
> The following call stack was reported:
>
> __lock_page / pid 8549 / m 0x2: timeout - continuing to wait for 8549
>   [<ffffffff8102b316>] save_stack_trace+0x26/0x50
>   [<ffffffff81152bee>] add_to_page_cache_lru+0x7e/0x170
>   [<ffffffff8121bfc5>] mpage_readpages+0xc5/0x170
>   [<ffffffff81215548>] blkdev_readpages+0x18/0x20
>   [<ffffffff81163a68>] __do_page_cache_readahead+0x268/0x310
>   [<ffffffff811640a8>] force_page_cache_readahead+0xa8/0x100
>   [<ffffffff81164139>] page_cache_sync_readahead+0x39/0x40
>   [<ffffffff81153967>] generic_file_read_iter+0x707/0x920
>   [<ffffffff81215920>] blkdev_read_iter+0x30/0x40
>   [<ffffffff811d4b4b>] __vfs_read+0xbb/0x130
>   [<ffffffff811d4f31>] vfs_read+0x91/0x130
>   [<ffffffff811d62b4>] SyS_read+0x44/0xa0
>   [<ffffffff816281e5>] entry_SYSCALL_64_fastpath+0x18/0xa8
>
> My understanding of mpage_readpages() is that the page unlock happens
> after readahead I/O completed (see also page_endio()). So this probably
> means that an I/O request submitted because of readahead code did not
> get completed. I will see whether I can find anything that's wrong in
> the block layer.

Perhaps. But this means another problem! Or you didn't wait enough. Or
your previous testing was wrong.

Because, once again, your changes in abort_exclusive_wait(), and my
debugging patch which adds wakeup into ClearPageLocked() suggest that
the problem is NOT that the page is still locked.


I'd still like to know what happens with the last patch I sent (without
any other changes)... but now I am totally confused.

If only I could reproduce. Or at least understand what are you doing to
hit this bug ;)

Oleg.


* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-16 13:06                                       ` Oleg Nesterov
@ 2016-08-16 16:54                                         ` Bart Van Assche
  2016-08-17 17:30                                           ` Oleg Nesterov
  0 siblings, 1 reply; 33+ messages in thread
From: Bart Van Assche @ 2016-08-16 16:54 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, mingo, Andrew Morton, Johannes Weiner,
	Neil Brown, Michael Shaver, linux-kernel

On 08/16/2016 06:06 AM, Oleg Nesterov wrote:
> If only I could reproduce. Or at least understand what are you doing to
> hit this bug ;)

Hello Oleg,

What I'm doing to hit this bug is to run the test script that is 
available at https://github.com/bvanassche/srp-test on a setup that is 
equipped with at least one InfiniBand adapter. I see the following 
possibilities for you to reproduce this:
* Ask a colleague for access to an IB setup.
* Add RoCE support to the srp-test script and run that script against a
   v4.8 kernel + ib_srp-backport + SCST ib_srpt drivers. These last two
   (out-of-tree) drivers namely support SRP over RoCE. The upstream
   drivers not yet. The SRP-over-RoCE functionality will be sent
   upstream as soon as standardization of this protocol by the T10
   committee has finished (this work has already been started and will
   probably be finished later this year).

Please let me know if you need more information.

Bart.


* Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-16 16:54                                         ` Bart Van Assche
@ 2016-08-17 17:30                                           ` Oleg Nesterov
  0 siblings, 0 replies; 33+ messages in thread
From: Oleg Nesterov @ 2016-08-17 17:30 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Peter Zijlstra, mingo, Andrew Morton, Johannes Weiner,
	Neil Brown, Michael Shaver, linux-kernel

On 08/16, Bart Van Assche wrote:
>
> On 08/16/2016 06:06 AM, Oleg Nesterov wrote:
>> If only I could reproduce. Or at least understand what are you doing to
>> hit this bug ;)
>
> Hello Oleg,
>
> What I'm doing to hit this bug is to run the test script that is
> available at https://github.com/bvanassche/srp-test on a setup that is
> equipped with at least one InfiniBand adapter. I see the following
> possibilities for you to reproduce this:
> * Ask a colleague for access to an IB setup.
> * Add RoCE support to the srp-test script and run that script against a
>   v4.8 kernel + ib_srp-backport + SCST ib_srpt drivers. These last two
>   (out-of-tree) drivers namely support SRP over RoCE. The upstream
>   drivers not yet. The SRP-over-RoCE functionality will be sent
>   upstream as soon as standardization of this protocol by the T10
>   committee has finished (this work has already been started and will
>   probably be finished later this year).
>
> Please let me know if you need more information.

Heh ;) I can't understand any single word above.

So I'll give up. Previously you reported that with this patch

	http://marc.info/?l=linux-kernel&m=147085570503588

the problem goes away. In this case the next one

	http://marc.info/?l=linux-kernel&m=147101858416463

could give us more info but you didn't try it so far (without other
changes).

It seems you find the root of this problem somewhere else, hopefully
you will resolve it soon.

Oleg.


Thread overview: 33+ messages
2016-08-03 16:35 [PATCH] sched: Avoid that __wait_on_bit_lock() hangs Bart Van Assche
2016-08-03 18:11 ` Peter Zijlstra
2016-08-03 18:56   ` Bart Van Assche
2016-08-03 21:30     ` Oleg Nesterov
2016-08-03 21:51       ` Bart Van Assche
2016-08-04 14:09         ` Peter Zijlstra
2016-08-04 14:31           ` Bart Van Assche
2016-08-05 17:41           ` Bart Van Assche
2016-08-08 10:22             ` Peter Zijlstra
2016-08-08 14:38               ` Bart Van Assche
2016-08-08 16:20                 ` Oleg Nesterov
2016-08-08 18:31                   ` Bart Van Assche
2016-08-09 17:14                     ` Oleg Nesterov
2016-08-09 18:48                       ` Bart Van Assche
2016-08-09 23:10                         ` Bart Van Assche
2016-08-10 10:45                         ` Oleg Nesterov
2016-08-10 16:01                           ` Bart Van Assche
2016-08-10 16:27                             ` Oleg Nesterov
2016-08-10 19:58                           ` Bart Van Assche
2016-08-11 17:36                             ` Oleg Nesterov
2016-08-12 16:16                               ` Oleg Nesterov
2016-08-12 16:27                                 ` Bart Van Assche
2016-08-12 22:47                                 ` Bart Van Assche
2016-08-13 16:32                                   ` Oleg Nesterov
2016-08-15 23:39                                     ` Bart Van Assche
2016-08-16 13:06                                       ` Oleg Nesterov
2016-08-16 16:54                                         ` Bart Van Assche
2016-08-17 17:30                                           ` Oleg Nesterov
2016-08-13 17:07                                   ` Oleg Nesterov
2016-08-09 23:56                   ` Bart Van Assche
2016-08-10 10:57                     ` Oleg Nesterov
2016-08-10 11:03                       ` Peter Zijlstra
2016-08-04  0:05       ` Bart Van Assche
