From: Jeffrey Hugo <jhugo@codeaurora.org>
To: paulmck@linux.vnet.ibm.com
Cc: linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
	pprakash@codeaurora.org, Josh Triplett <josh@joshtriplett.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	Jens Axboe <axboe@kernel.dk>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Richard Cochran <rcochran@linutronix.de>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	Richard Weinberger <richard@nod.at>
Subject: Re: [BUG] Deadlock due due to interactions of block, RCU, and cpu offline
Date: Thu, 29 Jun 2017 10:29:12 -0600
Message-ID: <d64c9d16-3b91-2081-0633-7f6a5196fd45@codeaurora.org>
In-Reply-To: <20170628001130.GB3721@linux.vnet.ibm.com>

On 6/27/2017 6:11 PM, Paul E. McKenney wrote:
> On Tue, Jun 27, 2017 at 04:32:09PM -0600, Jeffrey Hugo wrote:
>> On 6/22/2017 9:34 PM, Paul E. McKenney wrote:
>>> On Wed, Jun 21, 2017 at 09:18:53AM -0700, Paul E. McKenney wrote:
>>>> No worries, and I am very much looking forward to seeing the results of
>>>> your testing.
>>>
>>> And please see below for an updated patch based on LKML review and
>>> more intensive testing.
>>>
>>
>> I spent some time on this today.  It didn't go as I expected.  I
>> validated the issue is reproducible as before on 4.11 and 4.12 rcs 1
>> through 4.  However, the version of stress-ng that I was using ran
>> into constant errors starting with rc5, making it nearly impossible
>> to make progress toward reproduction.  Upgrading stress-ng to tip
>> fixes those errors; however, I've still been unable to repro the issue.
>>
>> It's my unfounded suspicion that something went in between rc4 and
>> rc5 which changed the timing, and didn't actually fix the issue.  I
>> will run the test overnight for 5 hours to try to repro.
>>
>> The patch you sent appears to be based on linux-next, and appears to
>> have a number of dependencies which prevent it from cleanly applying
>> on anything current that I'm able to repro on at this time.  Do you
>> want to provide a rebased version of the patch which applies to say
>> 4.11?  I could easily test that and report back.
> 
> Here is a very lightly tested backport to v4.11.
> 

Works for me.  I always reproduced the lockup within 2 minutes on stock
4.11.  With the change applied, I was able to test for 2 hours under the
same conditions, and for 4 hours with the full system, without
encountering an issue.

Feel free to add:
Tested-by: Jeffrey Hugo <jhugo@codeaurora.org>

I'm going to go back to 4.12-rc5 and see if I can either repro the
issue or identify what changed.  Hopefully I can get to linux-next and
double check the original version of the change as well.

-- 
Jeffrey Hugo
Qualcomm Datacenter Technologies as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

Thread overview: 17+ messages
2017-03-26 23:10 [BUG] Deadlock due due to interactions of block, RCU, and cpu offline Jeffrey Hugo
2017-03-26 23:28 ` Paul E. McKenney
2017-03-27 18:02   ` Jeffrey Hugo
2017-03-27 18:17     ` Paul E. McKenney
2017-06-20 23:46       ` Paul E. McKenney
2017-06-21 14:39         ` Jeffrey Hugo
2017-06-21 16:18           ` Paul E. McKenney
2017-06-23  3:34             ` Paul E. McKenney
2017-06-27 22:32               ` Jeffrey Hugo
2017-06-28  0:11                 ` Paul E. McKenney
2017-06-29 16:29                   ` Jeffrey Hugo [this message]
2017-06-30  0:18                     ` Paul E. McKenney
2017-08-20 19:31                       ` Jeffrey Hugo
2017-08-20 20:56                         ` Paul E. McKenney
2017-08-22 16:12                           ` Paolo Bonzini
2017-08-22 20:53                             ` Jeffrey Hugo
2017-08-15  8:46 ` [tip:core/rcu] rcu: Migrate callbacks earlier in the CPU-offline timeline tip-bot for Paul E. McKenney