From: Marc Zyngier <maz@kernel.org>
To: Anup Patel <Anup.Patel@wdc.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>,
	Paul Walmsley <paul.walmsley@sifive.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Jason Cooper <jason@lakedaemon.net>,
	Atish Patra <Atish.Patra@wdc.com>,
	Alistair Francis <Alistair.Francis@wdc.com>,
	Anup Patel <anup@brainfault.org>,
	linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/4] irqchip/sifive-plic: Setup cpuhp once after current handler is present
Date: Sat, 16 May 2020 14:30:46 +0100	[thread overview]
Message-ID: <0be23fcd363998ddd527eceb308f592c@kernel.org> (raw)
In-Reply-To: <DM6PR04MB62012DBAF3FAA7A264094C418DBA0@DM6PR04MB6201.namprd04.prod.outlook.com>

On 2020-05-16 13:52, Anup Patel wrote:
>> -----Original Message-----
>> From: Marc Zyngier <maz@kernel.org>
>> Sent: 16 May 2020 17:42
>> To: Anup Patel <Anup.Patel@wdc.com>
>> Cc: Palmer Dabbelt <palmer@dabbelt.com>; Paul Walmsley <paul.walmsley@sifive.com>;
>> Thomas Gleixner <tglx@linutronix.de>; Jason Cooper <jason@lakedaemon.net>;
>> Atish Patra <Atish.Patra@wdc.com>; Alistair Francis <Alistair.Francis@wdc.com>;
>> Anup Patel <anup@brainfault.org>; linux-riscv@lists.infradead.org;
>> linux-kernel@vger.kernel.org
>> Subject: Re: [PATCH 1/4] irqchip/sifive-plic: Setup cpuhp once after current
>> handler is present
>> 
>> Hi Anup,
>> 
>> On 2020-05-16 07:38, Anup Patel wrote:
>> > For multiple PLIC instances, plic_init() is called once for each
>> > PLIC instance. Due to this we have two issues:
>> > 1. cpuhp_setup_state() is called multiple times
>> > 2. plic_starting_cpu() can crash for the boot CPU if cpuhp_setup_state()
>> >    is called before the boot CPU's PLIC handler is available.
>> >
>> > This patch fixes both of the above issues.
>> >
>> > Signed-off-by: Anup Patel <anup.patel@wdc.com>
>> > ---
>> >  drivers/irqchip/irq-sifive-plic.c | 14 ++++++++++++--
>> >  1 file changed, 12 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
>> > index 822e074c0600..7dc23edb3267 100644
>> > --- a/drivers/irqchip/irq-sifive-plic.c
>> > +++ b/drivers/irqchip/irq-sifive-plic.c
>> > @@ -76,6 +76,7 @@ struct plic_handler {
>> >  	void __iomem		*enable_base;
>> >  	struct plic_priv	*priv;
>> >  };
>> > +static bool plic_cpuhp_setup_done;
>> >  static DEFINE_PER_CPU(struct plic_handler, plic_handlers);
>> >
>> >  static inline void plic_toggle(struct plic_handler *handler,
>> > @@ -282,6 +283,7 @@ static int __init plic_init(struct device_node *node,
>> >  	int error = 0, nr_contexts, nr_handlers = 0, i;
>> >  	u32 nr_irqs;
>> >  	struct plic_priv	*priv;
>> > +	struct plic_handler *handler;
>> >
>> >  	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
>> >  	if (!priv)
>> > @@ -310,7 +312,6 @@ static int __init plic_init(struct device_node *node,
>> >
>> >  	for (i = 0; i < nr_contexts; i++) {
>> >  		struct of_phandle_args parent;
>> > -		struct plic_handler *handler;
>> >  		irq_hw_number_t hwirq;
>> >  		int cpu, hartid;
>> >
>> > @@ -364,9 +365,18 @@ static int __init plic_init(struct device_node *node,
>> >  		nr_handlers++;
>> >  	}
>> >
>> > -	cpuhp_setup_state(CPUHP_AP_IRQ_SIFIVE_PLIC_STARTING,
>> > +	/*
>> > +	 * We can have multiple PLIC instances so setup cpuhp state only
>> > +	 * when context handler for current/boot CPU is present.
>> > +	 */
>> > +	handler = this_cpu_ptr(&plic_handlers);
>> > +	if (handler->present && !plic_cpuhp_setup_done) {
>> 
>> If there is no context handler for the boot CPU, the system is doomed,
>> right? It isn't able to get any interrupt, and you don't register the
>> hotplug notifier that could allow secondary CPUs to boot.
>> 
>> So what is the point? It feels like you should just give up here.
>> 
>> Also, the boot CPU is always CPU 0. So checking that you only register
>> the hotplug notifier from CPU 0 should be enough.
> 
> The boot CPU is not fixed in RISC-V: the logical id of the boot CPU
> will always be zero, but the physical id (or HART id) can be something
> totally different.

So on riscv, smp_processor_id() can return a non-zero value on the
boot CPU? Interesting... :-/

> 
> On a RISC-V NUMA system, we will have a separate PLIC in each NUMA node.
> 
> Let's say we have a system with 2 NUMA nodes, each NUMA node having
> 4 CPUs (or 4 HARTs). In this case, the DTB passed to Linux will have
> two PLIC DT nodes, where each PLIC device targets only 4 CPUs (or
> 4 HARTs). Each plic_init() call will set up handlers for only those
> 4 CPUs (or 4 HARTs). In other words, plic_init() for "PLIC0" will set
> up handlers for HART ids 0 to 3, and plic_init() for "PLIC1" will set
> up handlers for HART ids 4 to 7. Now, any CPU can be the boot CPU, so
> it is possible that the CPU with HART id 4 is the boot CPU: when
> plic_init() is first called for "PLIC0", the handler for HART id 4 is
> not yet set up, because it will only be set up later, when plic_init()
> is called for "PLIC1". This causes plic_starting_cpu() to crash when
> plic_init() is called for "PLIC0".
> 
> I hope the above example helps in understanding the issue.

It does, thanks. This pseudo NUMA thing really is a terrible hack...

> 
> I encounter this issue randomly when booting Linux on QEMU RISC-V
> with multiple NUMA nodes.

Then why don't you defer the probing of the PLIC you can't initialize
from this CPU? If you're on CPU4-7, only initialize the PLIC that
matters to you, and not the others. It would certainly make a lot
more sense, and be more robust.
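
(For illustration: a rough sketch of the check being suggested above,
written against the v5.7-era irq-sifive-plic.c helpers of_irq_parse_one(),
riscv_of_parent_hartid() and riscv_hartid_to_cpuid(), and assuming the
driver's existing includes. The function name and the early-bail idea are
assumptions for this sketch, not code posted in this thread.)

/*
 * Sketch only: decide whether this PLIC instance has a context that
 * targets the hart we are currently running on. plic_init() could then
 * skip or postpone the cpuhp registration (or give up on this instance
 * entirely) when this returns false. Whether the of_irq_init() path can
 * really defer such an instance is left open here.
 */
static bool __init plic_serves_boot_cpu(struct device_node *node,
					int nr_contexts)
{
	int i;

	for (i = 0; i < nr_contexts; i++) {
		struct of_phandle_args parent;
		int hartid;

		if (of_irq_parse_one(node, i, &parent))
			continue;

		/* Resolve the hart behind this context's parent interrupt controller. */
		hartid = riscv_of_parent_hartid(parent.np);
		if (hartid < 0)
			continue;

		if (riscv_hartid_to_cpuid(hartid) == smp_processor_id())
			return true;
	}

	return false;
}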

         M.
-- 
Jazz is not dead. It just smells funny...

  reply	other threads:[~2020-05-16 13:30 UTC|newest]

Thread overview:
2020-05-16  6:38 [PATCH 0/4] More improvements for multiple PLICs Anup Patel
2020-05-16  6:38 ` [PATCH 1/4] irqchip/sifive-plic: Setup cpuhp once after current handler is present Anup Patel
2020-05-16 12:11   ` Marc Zyngier
2020-05-16 12:52     ` Anup Patel
2020-05-16 13:30       ` Marc Zyngier [this message]
2020-05-16 16:28         ` Anup Patel
2020-05-17  8:02           ` Anup Patel
2020-05-16  6:38 ` [PATCH 2/4] irqchip/sifive-plic: Improve boot prints for multiple PLIC instances Anup Patel
2020-05-16 12:20   ` Marc Zyngier
2020-05-16 12:53     ` Anup Patel
2020-05-16  6:39 ` [PATCH 3/4] irqchip/sifive-plic: Separate irq_chip for multiple PLIC instances Anup Patel
2020-05-16 12:29   ` Marc Zyngier
2020-05-16 13:01     ` Anup Patel
2020-05-16 13:16       ` Marc Zyngier
2020-05-16 16:38         ` Anup Patel
2020-05-18  8:14           ` Marc Zyngier
2020-05-18  9:00             ` Anup Patel
2020-05-16  6:39 ` [PATCH 4/4] irqchip/sifive-plic: Set default irq affinity in plic_irqdomain_map() Anup Patel
2020-05-16 12:30   ` Marc Zyngier
2020-05-16 12:53     ` Anup Patel
