linux-pm.vger.kernel.org archive mirror
From: "Rafael J. Wysocki" <rafael@kernel.org>
To: Anson Huang <anson.huang@nxp.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Jacky Bai <ping.bai@nxp.com>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>
Subject: Re: About CPU hot-plug stress test failed in cpufreq driver
Date: Mon, 25 Nov 2019 13:44:20 +0100
Message-ID: <CAJZ5v0iiJusFSrB9LRQq39K9TeGu0kndogdd060qqiJ=QOAQRw@mail.gmail.com>
In-Reply-To: <DB3PR0402MB39165E40800E42C2E5635C7CF54A0@DB3PR0402MB3916.eurprd04.prod.outlook.com>

On Mon, Nov 25, 2019 at 7:05 AM Anson Huang <anson.huang@nxp.com> wrote:
>
> Hi, Rafael
>         It looks like adding pr_info() to irq_work_sync() makes the issue impossible to reproduce. Is it possible that there is a race in there and the pr_info() hides it? I will keep running the test with the pr_info() in place to see whether the issue can still be reproduced.

Yes, it looks like there is a race condition in there.

I need to analyze the code a bit to confirm it, which may take some time.

Cheers!


> > On Fri, Nov 22, 2019 at 6:15 AM Anson Huang <anson.huang@nxp.com>
> > wrote:
> > >
> > > Hi, Rafael
> > >         Theoretically, yes, the CPU going offline should run its irq
> > > work list so that no irq work stays pending on it, but in practice
> > > that is NOT what happens.
> >
> > So this looks like a problem with irq_work_sync() not working as expected.
> >
> > >         Both the ondemand and schedutil governors reproduce this issue
> > > when running a CPU hotplug stress test.
> > >         I tried adding an "int cpu" field to the irq work structure to
> > > record the number of the CPU that has the irq work pending. When the
> > > issue happens, I can see that the irq work is pending on CPU #3, which
> > > is already offline. That is why the issue happens, but I don't know how
> > > it gets into that state...
> > >
> > > diff --git a/include/linux/irq_work.h b/include/linux/irq_work.h
> > > index b11fcdf..f8da06f9 100644
> > > --- a/include/linux/irq_work.h
> > > +++ b/include/linux/irq_work.h
> > > @@ -25,6 +25,7 @@ struct irq_work {
> > >         unsigned long flags;
> > >         struct llist_node llnode;
> > >         void (*func)(struct irq_work *);
> > > +       int cpu;
> > >  };
> > >
> > >  static inline
> > > diff --git a/kernel/irq_work.c b/kernel/irq_work.c
> > > index d42acaf..2e893d5 100644
> > > --- a/kernel/irq_work.c
> > > +++ b/kernel/irq_work.c
> > > @@ -10,6 +10,7 @@
> > >  #include <linux/kernel.h>
> > >  #include <linux/export.h>
> > >  #include <linux/irq_work.h>
> > > +#include <linux/jiffies.h>
> > >  #include <linux/percpu.h>
> > >  #include <linux/hardirq.h>
> > >  #include <linux/irqflags.h>
> > > @@ -78,6 +79,7 @@ bool irq_work_queue(struct irq_work *work)
> > >         if (!irq_work_claim(work))
> > >                 return false;
> > >
> > > +       work->cpu = smp_processor_id();
> > >         /* Queue the entry and raise the IPI if needed. */
> > >         preempt_disable();
> > >         __irq_work_queue_local(work);
> > > @@ -105,6 +107,7 @@ bool irq_work_queue_on(struct irq_work *work, int cpu)
> > >         /* Only queue if not already pending */
> > >         if (!irq_work_claim(work))
> > >                 return false;
> > > +       work->cpu = cpu;
> > >
> > >         preempt_disable();
> > >         if (cpu != smp_processor_id()) {
> > > @@ -161,6 +164,7 @@ static void irq_work_run_list(struct llist_head *list)
> > >                  */
> > >                 flags = work->flags & ~IRQ_WORK_PENDING;
> > >                 xchg(&work->flags, flags);
> > > +               work->cpu = -1;
> > >
> > >                 work->func(work);
> > >                 /*
> > > @@ -197,9 +201,13 @@ void irq_work_tick(void)
> > >   */
> > >  void irq_work_sync(struct irq_work *work)
> > >  {
> > > +       unsigned long timeout = jiffies + msecs_to_jiffies(500);
> > >         lockdep_assert_irqs_enabled();
> >
> > Can you please add something like
> >
> > pr_info("%s: CPU %d\n", __func__, work->cpu);
> >
> > here, re-run the test and collect a log again?
> >
> > I need to know if irq_work_sync() runs during CPU offline as expected.
> >
> > >
> > > -       while (work->flags & IRQ_WORK_BUSY)
> > > +       while (work->flags & IRQ_WORK_BUSY) {
> > > +               if (time_after(jiffies, timeout))
> > > +                       pr_warn("irq_work_sync 500ms timeout, work cpu %d\n", work->cpu);
> > >                 cpu_relax();
> > > +       }
> > >  }
> > >  EXPORT_SYMBOL_GPL(irq_work_sync);
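
For readers who want to try the idea behind the debug patch outside the kernel, here is a minimal userspace sketch. It is not kernel code. The names dbg_work, dbg_work_queue_on(), dbg_work_run() and dbg_work_sync() are hypothetical, the two flag bits are simplified stand-ins for IRQ_WORK_PENDING and IRQ_WORK_BUSY, and a spin count replaces the jiffies-based 500 ms timeout. Unlike the loop in the patch, this wait reports the timeout once and then gives up, on the assumption that a single report naming the CPU is enough.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's irq_work flag bits. */
#define WORK_PENDING 0x1u
#define WORK_BUSY    0x2u

struct dbg_work {
	atomic_uint flags;
	int cpu;	/* CPU the work was queued on, -1 once it has run */
};

/* Claim the work and record the target CPU; fails if already pending. */
static bool dbg_work_queue_on(struct dbg_work *w, int cpu)
{
	unsigned int old;

	assert(cpu >= 0);
	old = atomic_fetch_or(&w->flags, WORK_PENDING | WORK_BUSY);
	if (old & WORK_PENDING)
		return false;
	w->cpu = cpu;
	return true;
}

/* What the target CPU would do when it processes its work list. */
static void dbg_work_run(struct dbg_work *w)
{
	atomic_fetch_and(&w->flags, ~WORK_PENDING);
	w->cpu = -1;
	/* The work callback would be invoked here. */
	atomic_fetch_and(&w->flags, ~WORK_BUSY);
}

/*
 * Bounded busy-wait: spin until the work is idle or max_spins is exceeded.
 * Returns true if the work became idle, false after reporting a timeout.
 */
static bool dbg_work_sync(struct dbg_work *w, unsigned long max_spins)
{
	unsigned long spins = 0;

	while (atomic_load(&w->flags) & WORK_BUSY) {
		if (++spins > max_spins) {
			fprintf(stderr, "sync timeout, work cpu %d\n", w->cpu);
			return false;
		}
	}
	return true;
}
```

If a work item is queued on CPU 3 and dbg_work_run() is never called for it, which is the situation reported for the offline CPU, dbg_work_sync() returns false and names CPU 3 instead of spinning forever.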
