All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pavan Kondeti <pkondeti@codeaurora.org>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: williams@redhat.com, Ingo Molnar <mingo@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	bristot@redhat.com, jkacur@redhat.com, efault@gmx.de,
	hpa@zytor.com, torvalds@linux-foundation.org, swood@redhat.com,
	linux-tip-commits@vger.kernel.org
Subject: Re: [tip:sched/core] sched/rt: Simplify the IPI based RT balancing logic
Date: Fri, 19 Jan 2018 23:16:17 +0530	[thread overview]
Message-ID: <20180119174617.GA6563@codeaurora.org> (raw)
In-Reply-To: <20180119100353.7f9f5154@gandalf.local.home>

On Fri, Jan 19, 2018 at 10:03:53AM -0500, Steven Rostedt wrote:
> On Fri, 19 Jan 2018 14:53:05 +0530
> Pavan Kondeti <pkondeti@codeaurora.org> wrote:
> 
> > I am seeing "spinlock already unlocked" BUG for rd->rto_lock on a 4.9
> > stable kernel based system. This issue is observed only after
> > inclusion of this patch. It appears to me that rq->rd can change
> > between spinlock is acquired and released in rto_push_irq_work_func()
> > IRQ work if hotplug is in progress. It was only reported couple of
> > times during long stress testing. The issue can be easily reproduced
> > if an artificial delay is introduced between lock and unlock of
> > rto_lock. The rq->rd is changed under rq->lock, so we can protect this
> > race with rq->lock. The below patch solved the problem. we are taking
> > rq->lock in pull_rt_task()->tell_cpu_to_push(), so I extended the same
> > here. Please let me know your thoughts on this.
> 
> As so rq->rd can change. Interesting.
> 
> > 
> > diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> > index d863d39..478192b 100644
> > --- a/kernel/sched/rt.c
> > +++ b/kernel/sched/rt.c
> > @@ -2284,6 +2284,7 @@ void rto_push_irq_work_func(struct irq_work *work)
> >                 raw_spin_unlock(&rq->lock);
> >         }
> > 
> > +       raw_spin_lock(&rq->lock);
> 
> 
> What about just saving the rd then?
> 
> 	struct root_domain *rd;
> 
> 	rd = READ_ONCE(rq->rd);
> 
> then use that. Then we don't need to worry about it changing.
> 

I am thinking of another problem because of the race between
rto_push_irq_work_func() and rq_attach_root() where rq->rd is modified.

Lets say, we cache the rq->rd here and queued the IRQ work on a remote
CPU. In the mean time, the rq_attach_root() might drop all the references
to this cached (old) rd and wants to free it. The rq->rd is freed in
RCU-sched callback. If that remote CPU is in RCU quiescent state, the rq->rd
can get freed before the IRQ work is executed. This results in the corruption
of the remote  CPU's IRQ work list. Right?

Taking rq->lock in rto_push_irq_work_func() also does not help here. Probably
we have to wait for the IRQ work to finish before freeing the older root domain
in RCU-sched callback.

Thanks,
Pavan
-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.

  parent reply	other threads:[~2018-01-19 17:46 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-24 15:47 [PATCH tip/sched/core v2] sched/rt: Simplify the IPI rt balancing logic Steven Rostedt
2017-05-04 14:41 ` Peter Zijlstra
2017-05-04 15:01   ` Steven Rostedt
2017-05-04 15:32 ` Peter Zijlstra
2017-05-04 17:25   ` Steven Rostedt
2017-05-04 18:42     ` Peter Zijlstra
2017-05-04 19:03       ` Steven Rostedt
2017-05-05  4:26         ` Mike Galbraith
2017-05-05  5:16           ` Mike Galbraith
2017-05-05 11:05         ` Peter Zijlstra
2017-05-05 12:02           ` Steven Rostedt
2017-05-05 17:39             ` Peter Zijlstra
2017-05-05 18:59               ` Steven Rostedt
2017-05-06  7:52                 ` Peter Zijlstra
2017-05-08  1:25 ` [lkp-robot] [sched/rt] bebe5811a9: hackbench.throughput -6% regression kernel test robot
2017-10-10 11:02 ` [tip:sched/core] sched/rt: Simplify the IPI based RT balancing logic tip-bot for Steven Rostedt (Red Hat)
2018-01-19  9:23   ` Pavan Kondeti
2018-01-19 15:03     ` Steven Rostedt
2018-01-19 15:44       ` Pavan Kondeti
2018-01-19 15:53         ` Steven Rostedt
2018-01-19 17:46       ` Pavan Kondeti [this message]
2018-01-19 18:11         ` Steven Rostedt
2018-01-19 18:12           ` Steven Rostedt
2018-01-19 18:57             ` Pavan Kondeti
2018-01-19 19:51               ` Steven Rostedt
2018-01-20  4:56                 ` Pavan Kondeti
2018-01-19 18:54           ` Pavan Kondeti
2018-02-06 11:54     ` [tip:sched/urgent] sched/rt: Use container_of() to get root domain in rto_push_irq_work_func() tip-bot for Steven Rostedt (VMware)
2018-02-07  4:14       ` Steven Rostedt
2018-02-06 11:54     ` [tip:sched/urgent] sched/rt: Up the root domain ref count when passing it around via IPIs tip-bot for Steven Rostedt (VMware)
2018-02-07  4:15       ` Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180119174617.GA6563@codeaurora.org \
    --to=pkondeti@codeaurora.org \
    --cc=bristot@redhat.com \
    --cc=efault@gmx.de \
    --cc=hpa@zytor.com \
    --cc=jkacur@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-tip-commits@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=swood@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=williams@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.