linux-riscv.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] asm-generic/mmiowb: Get cpu in mmiowb_set_pending
@ 2020-07-05 14:26 Emil Renner Berthing
  2020-07-05 14:43 ` Guo Ren
  0 siblings, 1 reply; 15+ messages in thread
From: Emil Renner Berthing @ 2020-07-05 14:26 UTC (permalink / raw)
  To: linux-riscv
  Cc: linux-arch, Palmer Dabbelt, Emil Renner Berthing, Arnd Bergmann,
	Paul Walmsley

Without this enabling CONFIG_PREEMPT and CONFIG_DEBUG_PREEMPT
results in many errors like this on the HiFive Unleashed
RISC-V board:

BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
caller is regmap_mmio_write32le+0x1c/0x46
CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3-hfu+ #1
Call Trace:
[<ffffffe000201f6e>] walk_stackframe+0x0/0x7a
[<ffffffe0005b290e>] dump_stack+0x6e/0x88
[<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
[<ffffffe0005c4c26>] check_preemption_disabled+0xa4/0xaa
[<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
[<ffffffe0004737c8>] regmap_mmio_write+0x26/0x44
[<ffffffe0004715c4>] regmap_write+0x28/0x48
[<ffffffe00043dccc>] sifive_gpio_probe+0xc0/0x1da
[<ffffffe00000113e>] rdinit_setup+0x22/0x26
[<ffffffe000469054>] platform_drv_probe+0x24/0x52
[<ffffffe000467e16>] really_probe+0x92/0x21a
[<ffffffe0004683a8>] device_driver_attach+0x42/0x4a
[<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
[<ffffffe0004683f0>] __driver_attach+0x40/0xac
[<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
[<ffffffe000466a3e>] bus_for_each_dev+0x3c/0x64
[<ffffffe000467118>] bus_add_driver+0x11e/0x184
[<ffffffe00046889a>] driver_register+0x32/0xc6
[<ffffffe00000e5ac>] gpiolib_sysfs_init+0xaa/0xae
[<ffffffe0000019ec>] do_one_initcall+0x50/0xfc
[<ffffffe00000113e>] rdinit_setup+0x22/0x26
[<ffffffe000001bea>] kernel_init_freeable+0x152/0x1da
[<ffffffe0005c4d28>] rest_init+0xde/0xe2
[<ffffffe0005c4d36>] kernel_init+0xa/0x11a
[<ffffffe0005c4d28>] rest_init+0xde/0xe2
[<ffffffe000200ff6>] ret_from_syscall_rejected+0x8/0xc

Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
---
This patch fixes it, but my guess is that it's not the right
fix. Do anyone have a better idea?

 include/asm-generic/mmiowb.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/asm-generic/mmiowb.h b/include/asm-generic/mmiowb.h
index 9439ff037b2d..31a21cdfbbcf 100644
--- a/include/asm-generic/mmiowb.h
+++ b/include/asm-generic/mmiowb.h
@@ -34,8 +34,12 @@ DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
 
 static inline void mmiowb_set_pending(void)
 {
-	struct mmiowb_state *ms = __mmiowb_state();
+	struct mmiowb_state *ms;
+
+	get_cpu();
+	ms = __mmiowb_state();
 	ms->mmiowb_pending = ms->nesting_count;
+	put_cpu();
 }
 
 static inline void mmiowb_spin_lock(void)
-- 
2.27.0


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] asm-generic/mmiowb: Get cpu in mmiowb_set_pending
  2020-07-05 14:26 [PATCH] asm-generic/mmiowb: Get cpu in mmiowb_set_pending Emil Renner Berthing
@ 2020-07-05 14:43 ` Guo Ren
  2020-07-05 14:51   ` Guo Ren
  2020-07-05 15:03   ` Emil Renner Berthing
  0 siblings, 2 replies; 15+ messages in thread
From: Guo Ren @ 2020-07-05 14:43 UTC (permalink / raw)
  To: Emil Renner Berthing
  Cc: linux-arch, linux-riscv, Palmer Dabbelt, Arnd Bergmann, Paul Walmsley

Hi Emil,

On Sun, Jul 5, 2020 at 10:27 PM Emil Renner Berthing <kernel@esmil.dk> wrote:
>
> Without this enabling CONFIG_PREEMPT and CONFIG_DEBUG_PREEMPT
> results in many errors like this on the HiFive Unleashed
> RISC-V board:
>
> BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
> caller is regmap_mmio_write32le+0x1c/0x46
> CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3-hfu+ #1
> Call Trace:
> [<ffffffe000201f6e>] walk_stackframe+0x0/0x7a
> [<ffffffe0005b290e>] dump_stack+0x6e/0x88
> [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> [<ffffffe0005c4c26>] check_preemption_disabled+0xa4/0xaa
> [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> [<ffffffe0004737c8>] regmap_mmio_write+0x26/0x44
> [<ffffffe0004715c4>] regmap_write+0x28/0x48
> [<ffffffe00043dccc>] sifive_gpio_probe+0xc0/0x1da
> [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> [<ffffffe000469054>] platform_drv_probe+0x24/0x52
> [<ffffffe000467e16>] really_probe+0x92/0x21a
> [<ffffffe0004683a8>] device_driver_attach+0x42/0x4a
> [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> [<ffffffe0004683f0>] __driver_attach+0x40/0xac
> [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> [<ffffffe000466a3e>] bus_for_each_dev+0x3c/0x64
> [<ffffffe000467118>] bus_add_driver+0x11e/0x184
> [<ffffffe00046889a>] driver_register+0x32/0xc6
> [<ffffffe00000e5ac>] gpiolib_sysfs_init+0xaa/0xae
> [<ffffffe0000019ec>] do_one_initcall+0x50/0xfc
> [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> [<ffffffe000001bea>] kernel_init_freeable+0x152/0x1da
> [<ffffffe0005c4d28>] rest_init+0xde/0xe2
> [<ffffffe0005c4d36>] kernel_init+0xa/0x11a
> [<ffffffe0005c4d28>] rest_init+0xde/0xe2
> [<ffffffe000200ff6>] ret_from_syscall_rejected+0x8/0xc
>
> Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
> ---
> This patch fixes it, but my guess is that it's not the right
> fix. Do anyone have a better idea?
>
>  include/asm-generic/mmiowb.h | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/include/asm-generic/mmiowb.h b/include/asm-generic/mmiowb.h
> index 9439ff037b2d..31a21cdfbbcf 100644
> --- a/include/asm-generic/mmiowb.h
> +++ b/include/asm-generic/mmiowb.h
> @@ -34,8 +34,12 @@ DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
>
>  static inline void mmiowb_set_pending(void)
>  {
> -       struct mmiowb_state *ms = __mmiowb_state();
> +       struct mmiowb_state *ms;
> +
> +       get_cpu();
> +       ms = __mmiowb_state();
>         ms->mmiowb_pending = ms->nesting_count;
> +       put_cpu();
>  }

#define __mmiowb_state()        this_cpu_ptr(&__mmiowb_state)

The ptr is a fixed address, so don't worry about the change, and just
use an atomic_read is enough.
static inline void mmiowb_set_pending(void)
{
        struct mmiowb_state *ms = __mmiowb_state();
-       ms->mmiowb_pending = ms->nesting_count;
+      ms->mmiowb_pending = atomic_read(ms->nesting_count);
}


-- 
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] asm-generic/mmiowb: Get cpu in mmiowb_set_pending
  2020-07-05 14:43 ` Guo Ren
@ 2020-07-05 14:51   ` Guo Ren
  2020-07-05 15:19     ` Guo Ren
  2020-07-05 15:03   ` Emil Renner Berthing
  1 sibling, 1 reply; 15+ messages in thread
From: Guo Ren @ 2020-07-05 14:51 UTC (permalink / raw)
  To: Emil Renner Berthing
  Cc: linux-arch, linux-riscv, Palmer Dabbelt, Arnd Bergmann, Paul Walmsley

Sorry:  Use atomic_set, not atomic read,
seams like:
 +       atomic_set(&ms->mmiowb_pending, ms->nesting_count);

But you need still deal with these for atomic
struct mmiowb_state {
        u16     nesting_count;
        u16     mmiowb_pending;
};

On Sun, Jul 5, 2020 at 10:43 PM Guo Ren <guoren@kernel.org> wrote:
>
> Hi Emil,
>
> On Sun, Jul 5, 2020 at 10:27 PM Emil Renner Berthing <kernel@esmil.dk> wrote:
> >
> > Without this enabling CONFIG_PREEMPT and CONFIG_DEBUG_PREEMPT
> > results in many errors like this on the HiFive Unleashed
> > RISC-V board:
> >
> > BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
> > caller is regmap_mmio_write32le+0x1c/0x46
> > CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3-hfu+ #1
> > Call Trace:
> > [<ffffffe000201f6e>] walk_stackframe+0x0/0x7a
> > [<ffffffe0005b290e>] dump_stack+0x6e/0x88
> > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > [<ffffffe0005c4c26>] check_preemption_disabled+0xa4/0xaa
> > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > [<ffffffe0004737c8>] regmap_mmio_write+0x26/0x44
> > [<ffffffe0004715c4>] regmap_write+0x28/0x48
> > [<ffffffe00043dccc>] sifive_gpio_probe+0xc0/0x1da
> > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > [<ffffffe000469054>] platform_drv_probe+0x24/0x52
> > [<ffffffe000467e16>] really_probe+0x92/0x21a
> > [<ffffffe0004683a8>] device_driver_attach+0x42/0x4a
> > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > [<ffffffe0004683f0>] __driver_attach+0x40/0xac
> > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > [<ffffffe000466a3e>] bus_for_each_dev+0x3c/0x64
> > [<ffffffe000467118>] bus_add_driver+0x11e/0x184
> > [<ffffffe00046889a>] driver_register+0x32/0xc6
> > [<ffffffe00000e5ac>] gpiolib_sysfs_init+0xaa/0xae
> > [<ffffffe0000019ec>] do_one_initcall+0x50/0xfc
> > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > [<ffffffe000001bea>] kernel_init_freeable+0x152/0x1da
> > [<ffffffe0005c4d28>] rest_init+0xde/0xe2
> > [<ffffffe0005c4d36>] kernel_init+0xa/0x11a
> > [<ffffffe0005c4d28>] rest_init+0xde/0xe2
> > [<ffffffe000200ff6>] ret_from_syscall_rejected+0x8/0xc
> >
> > Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
> > ---
> > This patch fixes it, but my guess is that it's not the right
> > fix. Do anyone have a better idea?
> >
> >  include/asm-generic/mmiowb.h | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/asm-generic/mmiowb.h b/include/asm-generic/mmiowb.h
> > index 9439ff037b2d..31a21cdfbbcf 100644
> > --- a/include/asm-generic/mmiowb.h
> > +++ b/include/asm-generic/mmiowb.h
> > @@ -34,8 +34,12 @@ DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
> >
> >  static inline void mmiowb_set_pending(void)
> >  {
> > -       struct mmiowb_state *ms = __mmiowb_state();
> > +       struct mmiowb_state *ms;
> > +
> > +       get_cpu();
> > +       ms = __mmiowb_state();
> >         ms->mmiowb_pending = ms->nesting_count;
> > +       put_cpu();
> >  }
>
> #define __mmiowb_state()        this_cpu_ptr(&__mmiowb_state)
>
> The ptr is a fixed address, so don't worry about the change, and just
> use an atomic_read is enough.
> static inline void mmiowb_set_pending(void)
> {
>         struct mmiowb_state *ms = __mmiowb_state();
> -       ms->mmiowb_pending = ms->nesting_count;
> +      ms->mmiowb_pending = atomic_read(ms->nesting_count);
> }
>
>
> --
> Best Regards
>  Guo Ren
>
> ML: https://lore.kernel.org/linux-csky/



-- 
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] asm-generic/mmiowb: Get cpu in mmiowb_set_pending
  2020-07-05 14:43 ` Guo Ren
  2020-07-05 14:51   ` Guo Ren
@ 2020-07-05 15:03   ` Emil Renner Berthing
  2020-07-05 15:52     ` Guo Ren
  1 sibling, 1 reply; 15+ messages in thread
From: Emil Renner Berthing @ 2020-07-05 15:03 UTC (permalink / raw)
  To: Guo Ren
  Cc: linux-arch, linux-riscv, Palmer Dabbelt, Arnd Bergmann, Paul Walmsley

Hi Ren,

On Sun, 5 Jul 2020 at 16:44, Guo Ren <guoren@kernel.org> wrote:
>
> Hi Emil,
>
> On Sun, Jul 5, 2020 at 10:27 PM Emil Renner Berthing <kernel@esmil.dk> wrote:
> >
> > Without this enabling CONFIG_PREEMPT and CONFIG_DEBUG_PREEMPT
> > results in many errors like this on the HiFive Unleashed
> > RISC-V board:
> >
> > BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
> > caller is regmap_mmio_write32le+0x1c/0x46
> > CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3-hfu+ #1
> > Call Trace:
> > [<ffffffe000201f6e>] walk_stackframe+0x0/0x7a
> > [<ffffffe0005b290e>] dump_stack+0x6e/0x88
> > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > [<ffffffe0005c4c26>] check_preemption_disabled+0xa4/0xaa
> > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > [<ffffffe0004737c8>] regmap_mmio_write+0x26/0x44
> > [<ffffffe0004715c4>] regmap_write+0x28/0x48
> > [<ffffffe00043dccc>] sifive_gpio_probe+0xc0/0x1da
> > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > [<ffffffe000469054>] platform_drv_probe+0x24/0x52
> > [<ffffffe000467e16>] really_probe+0x92/0x21a
> > [<ffffffe0004683a8>] device_driver_attach+0x42/0x4a
> > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > [<ffffffe0004683f0>] __driver_attach+0x40/0xac
> > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > [<ffffffe000466a3e>] bus_for_each_dev+0x3c/0x64
> > [<ffffffe000467118>] bus_add_driver+0x11e/0x184
> > [<ffffffe00046889a>] driver_register+0x32/0xc6
> > [<ffffffe00000e5ac>] gpiolib_sysfs_init+0xaa/0xae
> > [<ffffffe0000019ec>] do_one_initcall+0x50/0xfc
> > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > [<ffffffe000001bea>] kernel_init_freeable+0x152/0x1da
> > [<ffffffe0005c4d28>] rest_init+0xde/0xe2
> > [<ffffffe0005c4d36>] kernel_init+0xa/0x11a
> > [<ffffffe0005c4d28>] rest_init+0xde/0xe2
> > [<ffffffe000200ff6>] ret_from_syscall_rejected+0x8/0xc
> >
> > Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
> > ---
> > This patch fixes it, but my guess is that it's not the right
> > fix. Do anyone have a better idea?
> >
> >  include/asm-generic/mmiowb.h | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/asm-generic/mmiowb.h b/include/asm-generic/mmiowb.h
> > index 9439ff037b2d..31a21cdfbbcf 100644
> > --- a/include/asm-generic/mmiowb.h
> > +++ b/include/asm-generic/mmiowb.h
> > @@ -34,8 +34,12 @@ DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
> >
> >  static inline void mmiowb_set_pending(void)
> >  {
> > -       struct mmiowb_state *ms = __mmiowb_state();
> > +       struct mmiowb_state *ms;
> > +
> > +       get_cpu();
> > +       ms = __mmiowb_state();
> >         ms->mmiowb_pending = ms->nesting_count;
> > +       put_cpu();
> >  }
>
> #define __mmiowb_state()        this_cpu_ptr(&__mmiowb_state)
>
> The ptr is a fixed address, so don't worry about the change, and just
> use an atomic_read is enough.
> static inline void mmiowb_set_pending(void)
> {
>         struct mmiowb_state *ms = __mmiowb_state();
> -       ms->mmiowb_pending = ms->nesting_count;
> +      ms->mmiowb_pending = atomic_read(ms->nesting_count);
> }

You may be right, but it doesn't fix the BUG. As far as I can tell it
happens in __mmiowb_state() which expands through this_cpu_ptr and
arch_raw_cpu_ptr to SHIFT_PERCPU_PTR(ptr, __my_cpu_offset), where
__my_cpu_offset is per_cpu_offset(smp_processor_id()) and with
CONFIG_DEBUG_PREEMPT smp_processor_id is actually
debug_smp_processor_id, which eventually checks that preemption is
disabled in check_preemption_disabled.

/Emil
> --
> Best Regards
>  Guo Ren
>
> ML: https://lore.kernel.org/linux-csky/

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] asm-generic/mmiowb: Get cpu in mmiowb_set_pending
  2020-07-05 14:51   ` Guo Ren
@ 2020-07-05 15:19     ` Guo Ren
  0 siblings, 0 replies; 15+ messages in thread
From: Guo Ren @ 2020-07-05 15:19 UTC (permalink / raw)
  To: Emil Renner Berthing
  Cc: linux-arch, linux-riscv, Palmer Dabbelt, Arnd Bergmann, Paul Walmsley

On Sun, Jul 5, 2020 at 10:51 PM Guo Ren <guoren@kernel.org> wrote:
>
> Sorry:  Use atomic_set, not atomic read,
> seams like:
>  +       atomic_set(&ms->mmiowb_pending, ms->nesting_count);
It's still wrong, forgive me.

>
> But you need still deal with these for atomic
> struct mmiowb_state {
>         u16     nesting_count;
>         u16     mmiowb_pending;
> };
>
> On Sun, Jul 5, 2020 at 10:43 PM Guo Ren <guoren@kernel.org> wrote:
> >
> > Hi Emil,
> >
> > On Sun, Jul 5, 2020 at 10:27 PM Emil Renner Berthing <kernel@esmil.dk> wrote:
> > >
> > > Without this enabling CONFIG_PREEMPT and CONFIG_DEBUG_PREEMPT
> > > results in many errors like this on the HiFive Unleashed
> > > RISC-V board:
> > >
> > > BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
> > > caller is regmap_mmio_write32le+0x1c/0x46
> > > CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3-hfu+ #1
> > > Call Trace:
> > > [<ffffffe000201f6e>] walk_stackframe+0x0/0x7a
> > > [<ffffffe0005b290e>] dump_stack+0x6e/0x88
> > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > [<ffffffe0005c4c26>] check_preemption_disabled+0xa4/0xaa
> > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > [<ffffffe0004737c8>] regmap_mmio_write+0x26/0x44
> > > [<ffffffe0004715c4>] regmap_write+0x28/0x48
> > > [<ffffffe00043dccc>] sifive_gpio_probe+0xc0/0x1da
> > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > > [<ffffffe000469054>] platform_drv_probe+0x24/0x52
> > > [<ffffffe000467e16>] really_probe+0x92/0x21a
> > > [<ffffffe0004683a8>] device_driver_attach+0x42/0x4a
> > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > [<ffffffe0004683f0>] __driver_attach+0x40/0xac
> > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > [<ffffffe000466a3e>] bus_for_each_dev+0x3c/0x64
> > > [<ffffffe000467118>] bus_add_driver+0x11e/0x184
> > > [<ffffffe00046889a>] driver_register+0x32/0xc6
> > > [<ffffffe00000e5ac>] gpiolib_sysfs_init+0xaa/0xae
> > > [<ffffffe0000019ec>] do_one_initcall+0x50/0xfc
> > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > > [<ffffffe000001bea>] kernel_init_freeable+0x152/0x1da
> > > [<ffffffe0005c4d28>] rest_init+0xde/0xe2
> > > [<ffffffe0005c4d36>] kernel_init+0xa/0x11a
> > > [<ffffffe0005c4d28>] rest_init+0xde/0xe2
> > > [<ffffffe000200ff6>] ret_from_syscall_rejected+0x8/0xc
> > >
> > > Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
> > > ---
> > > This patch fixes it, but my guess is that it's not the right
> > > fix. Do anyone have a better idea?
> > >
> > >  include/asm-generic/mmiowb.h | 6 +++++-
> > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/asm-generic/mmiowb.h b/include/asm-generic/mmiowb.h
> > > index 9439ff037b2d..31a21cdfbbcf 100644
> > > --- a/include/asm-generic/mmiowb.h
> > > +++ b/include/asm-generic/mmiowb.h
> > > @@ -34,8 +34,12 @@ DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
> > >
> > >  static inline void mmiowb_set_pending(void)
> > >  {
> > > -       struct mmiowb_state *ms = __mmiowb_state();
> > > +       struct mmiowb_state *ms;
> > > +
> > > +       get_cpu();
> > > +       ms = __mmiowb_state();
> > >         ms->mmiowb_pending = ms->nesting_count;
> > > +       put_cpu();
> > >  }
> >
> > #define __mmiowb_state()        this_cpu_ptr(&__mmiowb_state)
> >
> > The ptr is a fixed address, so don't worry about the change, and just
> > use an atomic_read is enough.
> > static inline void mmiowb_set_pending(void)
> > {
> >         struct mmiowb_state *ms = __mmiowb_state();
> > -       ms->mmiowb_pending = ms->nesting_count;
> > +      ms->mmiowb_pending = atomic_read(ms->nesting_count);
> > }
> >
> >
> > --
> > Best Regards
> >  Guo Ren
> >
> > ML: https://lore.kernel.org/linux-csky/
>
>
>
> --
> Best Regards
>  Guo Ren
>
> ML: https://lore.kernel.org/linux-csky/



-- 
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] asm-generic/mmiowb: Get cpu in mmiowb_set_pending
  2020-07-05 15:03   ` Emil Renner Berthing
@ 2020-07-05 15:52     ` Guo Ren
  2020-07-05 17:09       ` Emil Renner Berthing
  0 siblings, 1 reply; 15+ messages in thread
From: Guo Ren @ 2020-07-05 15:52 UTC (permalink / raw)
  To: Emil Renner Berthing
  Cc: linux-arch, linux-riscv, Palmer Dabbelt, Arnd Bergmann, Paul Walmsley

On Sun, Jul 5, 2020 at 11:03 PM Emil Renner Berthing <kernel@esmil.dk> wrote:
>
> Hi Ren,
>
> On Sun, 5 Jul 2020 at 16:44, Guo Ren <guoren@kernel.org> wrote:
> >
> > Hi Emil,
> >
> > On Sun, Jul 5, 2020 at 10:27 PM Emil Renner Berthing <kernel@esmil.dk> wrote:
> > >
> > > Without this enabling CONFIG_PREEMPT and CONFIG_DEBUG_PREEMPT
> > > results in many errors like this on the HiFive Unleashed
> > > RISC-V board:
> > >
> > > BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
> > > caller is regmap_mmio_write32le+0x1c/0x46
> > > CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3-hfu+ #1
> > > Call Trace:
> > > [<ffffffe000201f6e>] walk_stackframe+0x0/0x7a
> > > [<ffffffe0005b290e>] dump_stack+0x6e/0x88
> > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > [<ffffffe0005c4c26>] check_preemption_disabled+0xa4/0xaa
> > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > [<ffffffe0004737c8>] regmap_mmio_write+0x26/0x44
> > > [<ffffffe0004715c4>] regmap_write+0x28/0x48
> > > [<ffffffe00043dccc>] sifive_gpio_probe+0xc0/0x1da
> > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > > [<ffffffe000469054>] platform_drv_probe+0x24/0x52
> > > [<ffffffe000467e16>] really_probe+0x92/0x21a
> > > [<ffffffe0004683a8>] device_driver_attach+0x42/0x4a
> > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > [<ffffffe0004683f0>] __driver_attach+0x40/0xac
> > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > [<ffffffe000466a3e>] bus_for_each_dev+0x3c/0x64
> > > [<ffffffe000467118>] bus_add_driver+0x11e/0x184
> > > [<ffffffe00046889a>] driver_register+0x32/0xc6
> > > [<ffffffe00000e5ac>] gpiolib_sysfs_init+0xaa/0xae
> > > [<ffffffe0000019ec>] do_one_initcall+0x50/0xfc
> > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > > [<ffffffe000001bea>] kernel_init_freeable+0x152/0x1da
> > > [<ffffffe0005c4d28>] rest_init+0xde/0xe2
> > > [<ffffffe0005c4d36>] kernel_init+0xa/0x11a
> > > [<ffffffe0005c4d28>] rest_init+0xde/0xe2
> > > [<ffffffe000200ff6>] ret_from_syscall_rejected+0x8/0xc
> > >
> > > Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
> > > ---
> > > This patch fixes it, but my guess is that it's not the right
> > > fix. Do anyone have a better idea?
> > >
> > >  include/asm-generic/mmiowb.h | 6 +++++-
> > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/asm-generic/mmiowb.h b/include/asm-generic/mmiowb.h
> > > index 9439ff037b2d..31a21cdfbbcf 100644
> > > --- a/include/asm-generic/mmiowb.h
> > > +++ b/include/asm-generic/mmiowb.h
> > > @@ -34,8 +34,12 @@ DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
> > >
> > >  static inline void mmiowb_set_pending(void)
> > >  {
> > > -       struct mmiowb_state *ms = __mmiowb_state();
> > > +       struct mmiowb_state *ms;
> > > +
> > > +       get_cpu();
> > > +       ms = __mmiowb_state();
> > >         ms->mmiowb_pending = ms->nesting_count;
> > > +       put_cpu();
> > >  }
> >
> > #define __mmiowb_state()        this_cpu_ptr(&__mmiowb_state)
> >
> > The ptr is a fixed address, so don't worry about the change, and just
> > use an atomic_read is enough.
> > static inline void mmiowb_set_pending(void)
> > {
> >         struct mmiowb_state *ms = __mmiowb_state();
> > -       ms->mmiowb_pending = ms->nesting_count;
> > +      ms->mmiowb_pending = atomic_read(ms->nesting_count);
> > }
>
> You may be right, but it doesn't fix the BUG. As far as I can tell it
> happens in __mmiowb_state() which expands through this_cpu_ptr and
> arch_raw_cpu_ptr to SHIFT_PERCPU_PTR(ptr, __my_cpu_offset), where
> __my_cpu_offset is per_cpu_offset(smp_processor_id()) and with
> CONFIG_DEBUG_PREEMPT smp_processor_id is actually
> debug_smp_processor_id, which eventually checks that preemption is
> disabled in check_preemption_disabled.
Thx for explaining.

Seems we need to find who disable preemption during:
> > > [<ffffffe000201f6e>] walk_stackframe+0x0/0x7a
> > > [<ffffffe0005b290e>] dump_stack+0x6e/0x88
> > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > [<ffffffe0005c4c26>] check_preemption_disabled+0xa4/0xaa
> > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > [<ffffffe0004737c8>] regmap_mmio_write+0x26/0x44
> > > [<ffffffe0004715c4>] regmap_write+0x28/0x48
> > > [<ffffffe00043dccc>] sifive_gpio_probe+0xc0/0x1da
> > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > > [<ffffffe000469054>] platform_drv_probe+0x24/0x52
> > > [<ffffffe000467e16>] really_probe+0x92/0x21a
> > > [<ffffffe0004683a8>] device_driver_attach+0x42/0x4a
> > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > [<ffffffe0004683f0>] __driver_attach+0x40/0xac
> > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > [<ffffffe000466a3e>] bus_for_each_dev+0x3c/0x64
> > > [<ffffffe000467118>] bus_add_driver+0x11e/0x184
> > > [<ffffffe00046889a>] driver_register+0x32/0xc6
> > > [<ffffffe00000e5ac>] gpiolib_sysfs_init+0xaa/0xae
> > > [<ffffffe0000019ec>] do_one_initcall+0x50/0xfc

do_one_initcall's preempt_count = 0
(gdb) bt
#0  do_one_initcall (fn=0xffffffe000003b0e
<trace_init_flags_sys_exit>) at init/main.c:1190
#1  0xffffffe000001f20 in do_pre_smp_initcalls () at ./include/linux/init.h:131

#2  kernel_init_freeable () at init/main.c:1494
#3  0xffffffe0009d6ea6 in kernel_init (unused=<optimized out>) a
   t init/main.c:1399
#4  0xffffffe000201c2a in handle_exception () at arch/riscv/kernel/entry.S:188
Backtrace stopped: frame did not save the PC
(gdb) p *(struct task_struct*)$tp
$2 = {thread_info = {flags = 0, preempt_count = 0,

Can you debug like this ? to see which function's preempt_count = 0 in
your backtrace.

-- 
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] asm-generic/mmiowb: Get cpu in mmiowb_set_pending
  2020-07-05 15:52     ` Guo Ren
@ 2020-07-05 17:09       ` Emil Renner Berthing
  2020-07-06  0:47         ` Guo Ren
  0 siblings, 1 reply; 15+ messages in thread
From: Emil Renner Berthing @ 2020-07-05 17:09 UTC (permalink / raw)
  To: Guo Ren
  Cc: linux-arch, linux-riscv, Palmer Dabbelt, Arnd Bergmann, Paul Walmsley

On Sun, 5 Jul 2020 at 17:52, Guo Ren <guoren@kernel.org> wrote:
> On Sun, Jul 5, 2020 at 11:03 PM Emil Renner Berthing <kernel@esmil.dk> wrote:
> > On Sun, 5 Jul 2020 at 16:44, Guo Ren <guoren@kernel.org> wrote:
> > >
> > > Hi Emil,
> > >
> > > On Sun, Jul 5, 2020 at 10:27 PM Emil Renner Berthing <kernel@esmil.dk> wrote:
> > > >
> > > > Without this enabling CONFIG_PREEMPT and CONFIG_DEBUG_PREEMPT
> > > > results in many errors like this on the HiFive Unleashed
> > > > RISC-V board:
> > > >
> > > > BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
> > > > caller is regmap_mmio_write32le+0x1c/0x46
> > > > CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3-hfu+ #1
> > > > Call Trace:
> > > > [<ffffffe000201f6e>] walk_stackframe+0x0/0x7a
> > > > [<ffffffe0005b290e>] dump_stack+0x6e/0x88
> > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > > [<ffffffe0005c4c26>] check_preemption_disabled+0xa4/0xaa
> > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > > [<ffffffe0004737c8>] regmap_mmio_write+0x26/0x44
> > > > [<ffffffe0004715c4>] regmap_write+0x28/0x48
> > > > [<ffffffe00043dccc>] sifive_gpio_probe+0xc0/0x1da
> > > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > > > [<ffffffe000469054>] platform_drv_probe+0x24/0x52
> > > > [<ffffffe000467e16>] really_probe+0x92/0x21a
> > > > [<ffffffe0004683a8>] device_driver_attach+0x42/0x4a
> > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > > [<ffffffe0004683f0>] __driver_attach+0x40/0xac
> > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > > [<ffffffe000466a3e>] bus_for_each_dev+0x3c/0x64
> > > > [<ffffffe000467118>] bus_add_driver+0x11e/0x184
> > > > [<ffffffe00046889a>] driver_register+0x32/0xc6
> > > > [<ffffffe00000e5ac>] gpiolib_sysfs_init+0xaa/0xae
> > > > [<ffffffe0000019ec>] do_one_initcall+0x50/0xfc
> > > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > > > [<ffffffe000001bea>] kernel_init_freeable+0x152/0x1da
> > > > [<ffffffe0005c4d28>] rest_init+0xde/0xe2
> > > > [<ffffffe0005c4d36>] kernel_init+0xa/0x11a
> > > > [<ffffffe0005c4d28>] rest_init+0xde/0xe2
> > > > [<ffffffe000200ff6>] ret_from_syscall_rejected+0x8/0xc
> > > >
> > > > Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
> > > > ---
> > > > This patch fixes it, but my guess is that it's not the right
> > > > fix. Do anyone have a better idea?
> > > >
> > > >  include/asm-generic/mmiowb.h | 6 +++++-
> > > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/include/asm-generic/mmiowb.h b/include/asm-generic/mmiowb.h
> > > > index 9439ff037b2d..31a21cdfbbcf 100644
> > > > --- a/include/asm-generic/mmiowb.h
> > > > +++ b/include/asm-generic/mmiowb.h
> > > > @@ -34,8 +34,12 @@ DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
> > > >
> > > >  static inline void mmiowb_set_pending(void)
> > > >  {
> > > > -       struct mmiowb_state *ms = __mmiowb_state();
> > > > +       struct mmiowb_state *ms;
> > > > +
> > > > +       get_cpu();
> > > > +       ms = __mmiowb_state();
> > > >         ms->mmiowb_pending = ms->nesting_count;
> > > > +       put_cpu();
> > > >  }
> > >
> > > #define __mmiowb_state()        this_cpu_ptr(&__mmiowb_state)
> > >
> > > The ptr is a fixed address, so don't worry about the change, and just
> > > use an atomic_read is enough.
> > > static inline void mmiowb_set_pending(void)
> > > {
> > >         struct mmiowb_state *ms = __mmiowb_state();
> > > -       ms->mmiowb_pending = ms->nesting_count;
> > > +      ms->mmiowb_pending = atomic_read(ms->nesting_count);
> > > }
> >
> > You may be right, but it doesn't fix the BUG. As far as I can tell it
> > happens in __mmiowb_state() which expands through this_cpu_ptr and
> > arch_raw_cpu_ptr to SHIFT_PERCPU_PTR(ptr, __my_cpu_offset), where
> > __my_cpu_offset is per_cpu_offset(smp_processor_id()) and with
> > CONFIG_DEBUG_PREEMPT smp_processor_id is actually
> > debug_smp_processor_id, which eventually checks that preemption is
> > disabled in check_preemption_disabled.
> Thx for explaining.
>
> Seems we need to find who disable preemption during:
> > > > [<ffffffe000201f6e>] walk_stackframe+0x0/0x7a
> > > > [<ffffffe0005b290e>] dump_stack+0x6e/0x88
> > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > > [<ffffffe0005c4c26>] check_preemption_disabled+0xa4/0xaa
> > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > > [<ffffffe0004737c8>] regmap_mmio_write+0x26/0x44
> > > > [<ffffffe0004715c4>] regmap_write+0x28/0x48
> > > > [<ffffffe00043dccc>] sifive_gpio_probe+0xc0/0x1da
> > > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > > > [<ffffffe000469054>] platform_drv_probe+0x24/0x52
> > > > [<ffffffe000467e16>] really_probe+0x92/0x21a
> > > > [<ffffffe0004683a8>] device_driver_attach+0x42/0x4a
> > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > > [<ffffffe0004683f0>] __driver_attach+0x40/0xac
> > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > > [<ffffffe000466a3e>] bus_for_each_dev+0x3c/0x64
> > > > [<ffffffe000467118>] bus_add_driver+0x11e/0x184
> > > > [<ffffffe00046889a>] driver_register+0x32/0xc6
> > > > [<ffffffe00000e5ac>] gpiolib_sysfs_init+0xaa/0xae
> > > > [<ffffffe0000019ec>] do_one_initcall+0x50/0xfc

Hmm.. the problem is that preemption is *not* disabled when
smp_processor_id is called, right?

> do_one_initcall's preempt_count = 0
> (gdb) bt
> #0  do_one_initcall (fn=0xffffffe000003b0e
> <trace_init_flags_sys_exit>) at init/main.c:1190
> #1  0xffffffe000001f20 in do_pre_smp_initcalls () at ./include/linux/init.h:131
>
> #2  kernel_init_freeable () at init/main.c:1494
> #3  0xffffffe0009d6ea6 in kernel_init (unused=<optimized out>) a
>    t init/main.c:1399
> #4  0xffffffe000201c2a in handle_exception () at arch/riscv/kernel/entry.S:188
> Backtrace stopped: frame did not save the PC
> (gdb) p *(struct task_struct*)$tp
> $2 = {thread_info = {flags = 0, preempt_count = 0,
>
> Can you debug like this ? to see which function's preempt_count = 0 in
> your backtrace.

I'm sorry, I'm not exactly sure what you mean by this, but it seems to
happen multiple places in writel-like functions. Both at probe time in
sifive_gpio_probe and sifive_spi_probe and when the spi driver is
running. Here is the full log:

http://sprunge.us/rgfC3r

/Emil

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] asm-generic/mmiowb: Get cpu in mmiowb_set_pending
  2020-07-05 17:09       ` Emil Renner Berthing
@ 2020-07-06  0:47         ` Guo Ren
  2020-07-06  8:08           ` Emil Renner Berthing
  0 siblings, 1 reply; 15+ messages in thread
From: Guo Ren @ 2020-07-06  0:47 UTC (permalink / raw)
  To: Emil Renner Berthing
  Cc: linux-arch, linux-riscv, Palmer Dabbelt, Arnd Bergmann, Paul Walmsley

On Mon, Jul 6, 2020 at 1:09 AM Emil Renner Berthing <kernel@esmil.dk> wrote:
>
> On Sun, 5 Jul 2020 at 17:52, Guo Ren <guoren@kernel.org> wrote:
> > On Sun, Jul 5, 2020 at 11:03 PM Emil Renner Berthing <kernel@esmil.dk> wrote:
> > > On Sun, 5 Jul 2020 at 16:44, Guo Ren <guoren@kernel.org> wrote:
> > > >
> > > > Hi Emil,
> > > >
> > > > On Sun, Jul 5, 2020 at 10:27 PM Emil Renner Berthing <kernel@esmil.dk> wrote:
> > > > >
> > > > > Without this enabling CONFIG_PREEMPT and CONFIG_DEBUG_PREEMPT
> > > > > results in many errors like this on the HiFive Unleashed
> > > > > RISC-V board:
> > > > >
> > > > > BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
> > > > > caller is regmap_mmio_write32le+0x1c/0x46
> > > > > CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3-hfu+ #1
> > > > > Call Trace:
> > > > > [<ffffffe000201f6e>] walk_stackframe+0x0/0x7a
> > > > > [<ffffffe0005b290e>] dump_stack+0x6e/0x88
> > > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > > > [<ffffffe0005c4c26>] check_preemption_disabled+0xa4/0xaa
> > > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > > > [<ffffffe0004737c8>] regmap_mmio_write+0x26/0x44
> > > > > [<ffffffe0004715c4>] regmap_write+0x28/0x48
> > > > > [<ffffffe00043dccc>] sifive_gpio_probe+0xc0/0x1da
> > > > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > > > > [<ffffffe000469054>] platform_drv_probe+0x24/0x52
> > > > > [<ffffffe000467e16>] really_probe+0x92/0x21a
> > > > > [<ffffffe0004683a8>] device_driver_attach+0x42/0x4a
> > > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > > > [<ffffffe0004683f0>] __driver_attach+0x40/0xac
> > > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > > > [<ffffffe000466a3e>] bus_for_each_dev+0x3c/0x64
> > > > > [<ffffffe000467118>] bus_add_driver+0x11e/0x184
> > > > > [<ffffffe00046889a>] driver_register+0x32/0xc6
> > > > > [<ffffffe00000e5ac>] gpiolib_sysfs_init+0xaa/0xae
> > > > > [<ffffffe0000019ec>] do_one_initcall+0x50/0xfc
> > > > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > > > > [<ffffffe000001bea>] kernel_init_freeable+0x152/0x1da
> > > > > [<ffffffe0005c4d28>] rest_init+0xde/0xe2
> > > > > [<ffffffe0005c4d36>] kernel_init+0xa/0x11a
> > > > > [<ffffffe0005c4d28>] rest_init+0xde/0xe2
> > > > > [<ffffffe000200ff6>] ret_from_syscall_rejected+0x8/0xc
> > > > >
> > > > > Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
> > > > > ---
> > > > > This patch fixes it, but my guess is that it's not the right
> > > > > fix. Do anyone have a better idea?
> > > > >
> > > > >  include/asm-generic/mmiowb.h | 6 +++++-
> > > > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/include/asm-generic/mmiowb.h b/include/asm-generic/mmiowb.h
> > > > > index 9439ff037b2d..31a21cdfbbcf 100644
> > > > > --- a/include/asm-generic/mmiowb.h
> > > > > +++ b/include/asm-generic/mmiowb.h
> > > > > @@ -34,8 +34,12 @@ DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
> > > > >
> > > > >  static inline void mmiowb_set_pending(void)
> > > > >  {
> > > > > -       struct mmiowb_state *ms = __mmiowb_state();
> > > > > +       struct mmiowb_state *ms;
> > > > > +
> > > > > +       get_cpu();
> > > > > +       ms = __mmiowb_state();
> > > > >         ms->mmiowb_pending = ms->nesting_count;
> > > > > +       put_cpu();
> > > > >  }
> > > >
> > > > #define __mmiowb_state()        this_cpu_ptr(&__mmiowb_state)
> > > >
> > > > The ptr is a fixed address, so don't worry about the change, and just
> > > > use an atomic_read is enough.
> > > > static inline void mmiowb_set_pending(void)
> > > > {
> > > >         struct mmiowb_state *ms = __mmiowb_state();
> > > > -       ms->mmiowb_pending = ms->nesting_count;
> > > > +      ms->mmiowb_pending = atomic_read(ms->nesting_count);
> > > > }
> > >
> > > You may be right, but it doesn't fix the BUG. As far as I can tell it
> > > happens in __mmiowb_state() which expands through this_cpu_ptr and
> > > arch_raw_cpu_ptr to SHIFT_PERCPU_PTR(ptr, __my_cpu_offset), where
> > > __my_cpu_offset is per_cpu_offset(smp_processor_id()) and with
> > > CONFIG_DEBUG_PREEMPT smp_processor_id is actually
> > > debug_smp_processor_id, which eventually checks that preemption is
> > > disabled in check_preemption_disabled.
> > Thx for explaining.
> >
> > Seems we need to find who disable preemption during:
> > > > > [<ffffffe000201f6e>] walk_stackframe+0x0/0x7a
> > > > > [<ffffffe0005b290e>] dump_stack+0x6e/0x88
> > > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > > > [<ffffffe0005c4c26>] check_preemption_disabled+0xa4/0xaa
> > > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > > > [<ffffffe0004737c8>] regmap_mmio_write+0x26/0x44
> > > > > [<ffffffe0004715c4>] regmap_write+0x28/0x48
> > > > > [<ffffffe00043dccc>] sifive_gpio_probe+0xc0/0x1da
> > > > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > > > > [<ffffffe000469054>] platform_drv_probe+0x24/0x52
> > > > > [<ffffffe000467e16>] really_probe+0x92/0x21a
> > > > > [<ffffffe0004683a8>] device_driver_attach+0x42/0x4a
> > > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > > > [<ffffffe0004683f0>] __driver_attach+0x40/0xac
> > > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > > > [<ffffffe000466a3e>] bus_for_each_dev+0x3c/0x64
> > > > > [<ffffffe000467118>] bus_add_driver+0x11e/0x184
> > > > > [<ffffffe00046889a>] driver_register+0x32/0xc6
> > > > > [<ffffffe00000e5ac>] gpiolib_sysfs_init+0xaa/0xae
> > > > > [<ffffffe0000019ec>] do_one_initcall+0x50/0xfc
>
> Hmm.. the problem is that preemption is *not* disabled when
> smp_processor_id is called, right?

Yes!

smp_processor_id is defined as:

 * This is the normal accessor to the CPU id and should be used
 * whenever possible.
 *
 * The CPU id is stable when:
 *
 *  - IRQs are disabled;
 *  - preemption is disabled;
 *  - the task is CPU affine.
 *
 * When CONFIG_DEBUG_PREEMPT; we verify these assumption and WARN
 * when smp_processor_id() is used when the CPU id is not stable.

So regmap_write->regmap_mmio_write should be PREEMPT disabled in
sifive_gpio_probe().

>
> > do_one_initcall's preempt_count = 0
> > (gdb) bt
> > #0  do_one_initcall (fn=0xffffffe000003b0e
> > <trace_init_flags_sys_exit>) at init/main.c:1190
> > #1  0xffffffe000001f20 in do_pre_smp_initcalls () at ./include/linux/init.h:131
> >
> > #2  kernel_init_freeable () at init/main.c:1494
> > #3  0xffffffe0009d6ea6 in kernel_init (unused=<optimized out>) a
> >    t init/main.c:1399
> > #4  0xffffffe000201c2a in handle_exception () at arch/riscv/kernel/entry.S:188
> > Backtrace stopped: frame did not save the PC
> > (gdb) p *(struct task_struct*)$tp
> > $2 = {thread_info = {flags = 0, preempt_count = 0,
> >
> > Can you debug like this ? to see which function's preempt_count = 0 in
> > your backtrace.
>
> I'm sorry, I'm not exactly sure what you mean by this, but it seems to
> happen multiple places in writel-like functions. Both at probe time in
> sifive_gpio_probe and sifive_spi_probe and when the spi driver is
> running. Here is the full log:
>
I mean Use jtag + gdb to detect  preempt_count = 0 in parent function.
I do this in qemu :)

If couldn't use jtag, then just add printk preempt_count trace, or kgdb.

-- 
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] asm-generic/mmiowb: Get cpu in mmiowb_set_pending
  2020-07-06  0:47         ` Guo Ren
@ 2020-07-06  8:08           ` Emil Renner Berthing
  2020-07-15  6:45             ` Palmer Dabbelt
  0 siblings, 1 reply; 15+ messages in thread
From: Emil Renner Berthing @ 2020-07-06  8:08 UTC (permalink / raw)
  To: Guo Ren
  Cc: linux-arch, linux-riscv, Palmer Dabbelt, Arnd Bergmann, Paul Walmsley

On Mon, 6 Jul 2020 at 02:48, Guo Ren <guoren@kernel.org> wrote:
> On Mon, Jul 6, 2020 at 1:09 AM Emil Renner Berthing <kernel@esmil.dk> wrote:
> >
> > On Sun, 5 Jul 2020 at 17:52, Guo Ren <guoren@kernel.org> wrote:
> > > On Sun, Jul 5, 2020 at 11:03 PM Emil Renner Berthing <kernel@esmil.dk> wrote:
> > > > On Sun, 5 Jul 2020 at 16:44, Guo Ren <guoren@kernel.org> wrote:
> > > > >
> > > > > Hi Emil,
> > > > >
> > > > > On Sun, Jul 5, 2020 at 10:27 PM Emil Renner Berthing <kernel@esmil.dk> wrote:
> > > > > >
> > > > > > Without this enabling CONFIG_PREEMPT and CONFIG_DEBUG_PREEMPT
> > > > > > results in many errors like this on the HiFive Unleashed
> > > > > > RISC-V board:
> > > > > >
> > > > > > BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
> > > > > > caller is regmap_mmio_write32le+0x1c/0x46
> > > > > > CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3-hfu+ #1
> > > > > > Call Trace:
> > > > > > [<ffffffe000201f6e>] walk_stackframe+0x0/0x7a
> > > > > > [<ffffffe0005b290e>] dump_stack+0x6e/0x88
> > > > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > > > > [<ffffffe0005c4c26>] check_preemption_disabled+0xa4/0xaa
> > > > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > > > > [<ffffffe0004737c8>] regmap_mmio_write+0x26/0x44
> > > > > > [<ffffffe0004715c4>] regmap_write+0x28/0x48
> > > > > > [<ffffffe00043dccc>] sifive_gpio_probe+0xc0/0x1da
> > > > > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > > > > > [<ffffffe000469054>] platform_drv_probe+0x24/0x52
> > > > > > [<ffffffe000467e16>] really_probe+0x92/0x21a
> > > > > > [<ffffffe0004683a8>] device_driver_attach+0x42/0x4a
> > > > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > > > > [<ffffffe0004683f0>] __driver_attach+0x40/0xac
> > > > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > > > > [<ffffffe000466a3e>] bus_for_each_dev+0x3c/0x64
> > > > > > [<ffffffe000467118>] bus_add_driver+0x11e/0x184
> > > > > > [<ffffffe00046889a>] driver_register+0x32/0xc6
> > > > > > [<ffffffe00000e5ac>] gpiolib_sysfs_init+0xaa/0xae
> > > > > > [<ffffffe0000019ec>] do_one_initcall+0x50/0xfc
> > > > > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > > > > > [<ffffffe000001bea>] kernel_init_freeable+0x152/0x1da
> > > > > > [<ffffffe0005c4d28>] rest_init+0xde/0xe2
> > > > > > [<ffffffe0005c4d36>] kernel_init+0xa/0x11a
> > > > > > [<ffffffe0005c4d28>] rest_init+0xde/0xe2
> > > > > > [<ffffffe000200ff6>] ret_from_syscall_rejected+0x8/0xc
> > > > > >
> > > > > > Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
> > > > > > ---
> > > > > > This patch fixes it, but my guess is that it's not the right
> > > > > > fix. Do anyone have a better idea?
> > > > > >
> > > > > >  include/asm-generic/mmiowb.h | 6 +++++-
> > > > > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/include/asm-generic/mmiowb.h b/include/asm-generic/mmiowb.h
> > > > > > index 9439ff037b2d..31a21cdfbbcf 100644
> > > > > > --- a/include/asm-generic/mmiowb.h
> > > > > > +++ b/include/asm-generic/mmiowb.h
> > > > > > @@ -34,8 +34,12 @@ DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
> > > > > >
> > > > > >  static inline void mmiowb_set_pending(void)
> > > > > >  {
> > > > > > -       struct mmiowb_state *ms = __mmiowb_state();
> > > > > > +       struct mmiowb_state *ms;
> > > > > > +
> > > > > > +       get_cpu();
> > > > > > +       ms = __mmiowb_state();
> > > > > >         ms->mmiowb_pending = ms->nesting_count;
> > > > > > +       put_cpu();
> > > > > >  }
> > > > >
> > > > > #define __mmiowb_state()        this_cpu_ptr(&__mmiowb_state)
> > > > >
> > > > > The ptr is a fixed address, so don't worry about the change, and just
> > > > > use an atomic_read is enough.
> > > > > static inline void mmiowb_set_pending(void)
> > > > > {
> > > > >         struct mmiowb_state *ms = __mmiowb_state();
> > > > > -       ms->mmiowb_pending = ms->nesting_count;
> > > > > +      ms->mmiowb_pending = atomic_read(ms->nesting_count);
> > > > > }
> > > >
> > > > You may be right, but it doesn't fix the BUG. As far as I can tell it
> > > > happens in __mmiowb_state() which expands through this_cpu_ptr and
> > > > arch_raw_cpu_ptr to SHIFT_PERCPU_PTR(ptr, __my_cpu_offset), where
> > > > __my_cpu_offset is per_cpu_offset(smp_processor_id()) and with
> > > > CONFIG_DEBUG_PREEMPT smp_processor_id is actually
> > > > debug_smp_processor_id, which eventually checks that preemption is
> > > > disabled in check_preemption_disabled.
> > > Thx for explaining.
> > >
> > > Seems we need to find who disable preemption during:
> > > > > > [<ffffffe000201f6e>] walk_stackframe+0x0/0x7a
> > > > > > [<ffffffe0005b290e>] dump_stack+0x6e/0x88
> > > > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > > > > [<ffffffe0005c4c26>] check_preemption_disabled+0xa4/0xaa
> > > > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > > > > [<ffffffe0004737c8>] regmap_mmio_write+0x26/0x44
> > > > > > [<ffffffe0004715c4>] regmap_write+0x28/0x48
> > > > > > [<ffffffe00043dccc>] sifive_gpio_probe+0xc0/0x1da
> > > > > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > > > > > [<ffffffe000469054>] platform_drv_probe+0x24/0x52
> > > > > > [<ffffffe000467e16>] really_probe+0x92/0x21a
> > > > > > [<ffffffe0004683a8>] device_driver_attach+0x42/0x4a
> > > > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > > > > [<ffffffe0004683f0>] __driver_attach+0x40/0xac
> > > > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > > > > [<ffffffe000466a3e>] bus_for_each_dev+0x3c/0x64
> > > > > > [<ffffffe000467118>] bus_add_driver+0x11e/0x184
> > > > > > [<ffffffe00046889a>] driver_register+0x32/0xc6
> > > > > > [<ffffffe00000e5ac>] gpiolib_sysfs_init+0xaa/0xae
> > > > > > [<ffffffe0000019ec>] do_one_initcall+0x50/0xfc
> >
> > Hmm.. the problem is that preemption is *not* disabled when
> > smp_processor_id is called, right?
>
> Yes!
>
> smp_processor_id is defined as:
>
>  * This is the normal accessor to the CPU id and should be used
>  * whenever possible.
>  *
>  * The CPU id is stable when:
>  *
>  *  - IRQs are disabled;
>  *  - preemption is disabled;
>  *  - the task is CPU affine.
>  *
>  * When CONFIG_DEBUG_PREEMPT; we verify these assumption and WARN
>  * when smp_processor_id() is used when the CPU id is not stable.
>
> So regmap_write->regmap_mmio_write should be PREEMPT disabled in
> sifive_gpio_probe().

Ah! Sorry, now I think I understand. So you're saying that the real
problem is that the driver framework should have disabled preemption
before calling any .probe functions, but for some reason that doesn't
happen on RISC-V?

> >
> > > do_one_initcall's preempt_count = 0
> > > (gdb) bt
> > > #0  do_one_initcall (fn=0xffffffe000003b0e
> > > <trace_init_flags_sys_exit>) at init/main.c:1190
> > > #1  0xffffffe000001f20 in do_pre_smp_initcalls () at ./include/linux/init.h:131
> > >
> > > #2  kernel_init_freeable () at init/main.c:1494
> > > #3  0xffffffe0009d6ea6 in kernel_init (unused=<optimized out>) a
> > >    t init/main.c:1399
> > > #4  0xffffffe000201c2a in handle_exception () at arch/riscv/kernel/entry.S:188
> > > Backtrace stopped: frame did not save the PC
> > > (gdb) p *(struct task_struct*)$tp
> > > $2 = {thread_info = {flags = 0, preempt_count = 0,
> > >
> > > Can you debug like this ? to see which function's preempt_count = 0 in
> > > your backtrace.
> >
> > I'm sorry, I'm not exactly sure what you mean by this, but it seems to
> > happen multiple places in writel-like functions. Both at probe time in
> > sifive_gpio_probe and sifive_spi_probe and when the spi driver is
> > running. Here is the full log:
> >
> I mean Use jtag + gdb to detect  preempt_count = 0 in parent function.
> I do this in qemu :)
>
> If couldn't use jtag, then just add printk preempt_count trace, or kgdb.

Right, thanks. I'll look into it.

/Emil

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] asm-generic/mmiowb: Get cpu in mmiowb_set_pending
  2020-07-06  8:08           ` Emil Renner Berthing
@ 2020-07-15  6:45             ` Palmer Dabbelt
  2020-07-15 10:42               ` Will Deacon
  0 siblings, 1 reply; 15+ messages in thread
From: Palmer Dabbelt @ 2020-07-15  6:45 UTC (permalink / raw)
  To: Will Deacon, kernel
  Cc: linux-arch, linux-riscv, guoren, Arnd Bergmann, Paul Walmsley

On Mon, 06 Jul 2020 01:08:24 PDT (-0700), kernel@esmil.dk wrote:
> On Mon, 6 Jul 2020 at 02:48, Guo Ren <guoren@kernel.org> wrote:
>> On Mon, Jul 6, 2020 at 1:09 AM Emil Renner Berthing <kernel@esmil.dk> wrote:
>> >
>> > On Sun, 5 Jul 2020 at 17:52, Guo Ren <guoren@kernel.org> wrote:
>> > > On Sun, Jul 5, 2020 at 11:03 PM Emil Renner Berthing <kernel@esmil.dk> wrote:
>> > > > On Sun, 5 Jul 2020 at 16:44, Guo Ren <guoren@kernel.org> wrote:
>> > > > >
>> > > > > Hi Emil,
>> > > > >
>> > > > > On Sun, Jul 5, 2020 at 10:27 PM Emil Renner Berthing <kernel@esmil.dk> wrote:
>> > > > > >
>> > > > > > Without this enabling CONFIG_PREEMPT and CONFIG_DEBUG_PREEMPT
>> > > > > > results in many errors like this on the HiFive Unleashed
>> > > > > > RISC-V board:
>> > > > > >
>> > > > > > BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
>> > > > > > caller is regmap_mmio_write32le+0x1c/0x46
>> > > > > > CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3-hfu+ #1
>> > > > > > Call Trace:
>> > > > > > [<ffffffe000201f6e>] walk_stackframe+0x0/0x7a
>> > > > > > [<ffffffe0005b290e>] dump_stack+0x6e/0x88
>> > > > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
>> > > > > > [<ffffffe0005c4c26>] check_preemption_disabled+0xa4/0xaa
>> > > > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
>> > > > > > [<ffffffe0004737c8>] regmap_mmio_write+0x26/0x44
>> > > > > > [<ffffffe0004715c4>] regmap_write+0x28/0x48
>> > > > > > [<ffffffe00043dccc>] sifive_gpio_probe+0xc0/0x1da
>> > > > > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
>> > > > > > [<ffffffe000469054>] platform_drv_probe+0x24/0x52
>> > > > > > [<ffffffe000467e16>] really_probe+0x92/0x21a
>> > > > > > [<ffffffe0004683a8>] device_driver_attach+0x42/0x4a
>> > > > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
>> > > > > > [<ffffffe0004683f0>] __driver_attach+0x40/0xac
>> > > > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
>> > > > > > [<ffffffe000466a3e>] bus_for_each_dev+0x3c/0x64
>> > > > > > [<ffffffe000467118>] bus_add_driver+0x11e/0x184
>> > > > > > [<ffffffe00046889a>] driver_register+0x32/0xc6
>> > > > > > [<ffffffe00000e5ac>] gpiolib_sysfs_init+0xaa/0xae
>> > > > > > [<ffffffe0000019ec>] do_one_initcall+0x50/0xfc
>> > > > > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
>> > > > > > [<ffffffe000001bea>] kernel_init_freeable+0x152/0x1da
>> > > > > > [<ffffffe0005c4d28>] rest_init+0xde/0xe2
>> > > > > > [<ffffffe0005c4d36>] kernel_init+0xa/0x11a
>> > > > > > [<ffffffe0005c4d28>] rest_init+0xde/0xe2
>> > > > > > [<ffffffe000200ff6>] ret_from_syscall_rejected+0x8/0xc
>> > > > > >
>> > > > > > Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
>> > > > > > ---
>> > > > > > This patch fixes it, but my guess is that it's not the right
>> > > > > > fix. Do anyone have a better idea?
>> > > > > >
>> > > > > >  include/asm-generic/mmiowb.h | 6 +++++-
>> > > > > >  1 file changed, 5 insertions(+), 1 deletion(-)
>> > > > > >
>> > > > > > diff --git a/include/asm-generic/mmiowb.h b/include/asm-generic/mmiowb.h
>> > > > > > index 9439ff037b2d..31a21cdfbbcf 100644
>> > > > > > --- a/include/asm-generic/mmiowb.h
>> > > > > > +++ b/include/asm-generic/mmiowb.h
>> > > > > > @@ -34,8 +34,12 @@ DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
>> > > > > >
>> > > > > >  static inline void mmiowb_set_pending(void)
>> > > > > >  {
>> > > > > > -       struct mmiowb_state *ms = __mmiowb_state();
>> > > > > > +       struct mmiowb_state *ms;
>> > > > > > +
>> > > > > > +       get_cpu();
>> > > > > > +       ms = __mmiowb_state();
>> > > > > >         ms->mmiowb_pending = ms->nesting_count;
>> > > > > > +       put_cpu();
>> > > > > >  }
>> > > > >
>> > > > > #define __mmiowb_state()        this_cpu_ptr(&__mmiowb_state)
>> > > > >
>> > > > > The ptr is a fixed address, so don't worry about the change, and just
>> > > > > use an atomic_read is enough.
>> > > > > static inline void mmiowb_set_pending(void)
>> > > > > {
>> > > > >         struct mmiowb_state *ms = __mmiowb_state();
>> > > > > -       ms->mmiowb_pending = ms->nesting_count;
>> > > > > +      ms->mmiowb_pending = atomic_read(ms->nesting_count);
>> > > > > }
>> > > >
>> > > > You may be right, but it doesn't fix the BUG. As far as I can tell it
>> > > > happens in __mmiowb_state() which expands through this_cpu_ptr and
>> > > > arch_raw_cpu_ptr to SHIFT_PERCPU_PTR(ptr, __my_cpu_offset), where
>> > > > __my_cpu_offset is per_cpu_offset(smp_processor_id()) and with
>> > > > CONFIG_DEBUG_PREEMPT smp_processor_id is actually
>> > > > debug_smp_processor_id, which eventually checks that preemption is
>> > > > disabled in check_preemption_disabled.
>> > > Thx for explaining.
>> > >
>> > > Seems we need to find who disable preemption during:
>> > > > > > [<ffffffe000201f6e>] walk_stackframe+0x0/0x7a
>> > > > > > [<ffffffe0005b290e>] dump_stack+0x6e/0x88
>> > > > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
>> > > > > > [<ffffffe0005c4c26>] check_preemption_disabled+0xa4/0xaa
>> > > > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
>> > > > > > [<ffffffe0004737c8>] regmap_mmio_write+0x26/0x44
>> > > > > > [<ffffffe0004715c4>] regmap_write+0x28/0x48
>> > > > > > [<ffffffe00043dccc>] sifive_gpio_probe+0xc0/0x1da
>> > > > > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
>> > > > > > [<ffffffe000469054>] platform_drv_probe+0x24/0x52
>> > > > > > [<ffffffe000467e16>] really_probe+0x92/0x21a
>> > > > > > [<ffffffe0004683a8>] device_driver_attach+0x42/0x4a
>> > > > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
>> > > > > > [<ffffffe0004683f0>] __driver_attach+0x40/0xac
>> > > > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
>> > > > > > [<ffffffe000466a3e>] bus_for_each_dev+0x3c/0x64
>> > > > > > [<ffffffe000467118>] bus_add_driver+0x11e/0x184
>> > > > > > [<ffffffe00046889a>] driver_register+0x32/0xc6
>> > > > > > [<ffffffe00000e5ac>] gpiolib_sysfs_init+0xaa/0xae
>> > > > > > [<ffffffe0000019ec>] do_one_initcall+0x50/0xfc
>> >
>> > Hmm.. the problem is that preemption is *not* disabled when
>> > smp_processor_id is called, right?
>>
>> Yes!
>>
>> smp_processor_id is defined as:
>>
>>  * This is the normal accessor to the CPU id and should be used
>>  * whenever possible.
>>  *
>>  * The CPU id is stable when:
>>  *
>>  *  - IRQs are disabled;
>>  *  - preemption is disabled;
>>  *  - the task is CPU affine.
>>  *
>>  * When CONFIG_DEBUG_PREEMPT; we verify these assumption and WARN
>>  * when smp_processor_id() is used when the CPU id is not stable.
>>
>> So regmap_write->regmap_mmio_write should be PREEMPT disabled in
>> sifive_gpio_probe().
>
> Ah! Sorry, now I think I understand. So you're saying that the real
> problem is that the driver framework should have disabled preemption
> before calling any .probe functions, but for some reason that doesn't
> happen on RISC-V?

I think it's actually an issue with the generic mmiowb stuff and that we should
just elide the check.  I'm adding Will, for context.  I'll send out a patch.

>> > > do_one_initcall's preempt_count = 0
>> > > (gdb) bt
>> > > #0  do_one_initcall (fn=0xffffffe000003b0e
>> > > <trace_init_flags_sys_exit>) at init/main.c:1190
>> > > #1  0xffffffe000001f20 in do_pre_smp_initcalls () at ./include/linux/init.h:131
>> > >
>> > > #2  kernel_init_freeable () at init/main.c:1494
>> > > #3  0xffffffe0009d6ea6 in kernel_init (unused=<optimized out>) a
>> > >    t init/main.c:1399
>> > > #4  0xffffffe000201c2a in handle_exception () at arch/riscv/kernel/entry.S:188
>> > > Backtrace stopped: frame did not save the PC
>> > > (gdb) p *(struct task_struct*)$tp
>> > > $2 = {thread_info = {flags = 0, preempt_count = 0,
>> > >
>> > > Can you debug like this ? to see which function's preempt_count = 0 in
>> > > your backtrace.
>> >
>> > I'm sorry, I'm not exactly sure what you mean by this, but it seems to
>> > happen multiple places in writel-like functions. Both at probe time in
>> > sifive_gpio_probe and sifive_spi_probe and when the spi driver is
>> > running. Here is the full log:
>> >
>> I mean Use jtag + gdb to detect  preempt_count = 0 in parent function.
>> I do this in qemu :)
>>
>> If couldn't use jtag, then just add printk preempt_count trace, or kgdb.
>
> Right, thanks. I'll look into it.

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] asm-generic/mmiowb: Get cpu in mmiowb_set_pending
  2020-07-15  6:45             ` Palmer Dabbelt
@ 2020-07-15 10:42               ` Will Deacon
  2020-07-15 14:03                 ` Palmer Dabbelt
  0 siblings, 1 reply; 15+ messages in thread
From: Will Deacon @ 2020-07-15 10:42 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: linux-arch, kernel, Arnd Bergmann, guoren, Paul Walmsley, linux-riscv

On Tue, Jul 14, 2020 at 11:45:11PM -0700, Palmer Dabbelt wrote:
> > > > > > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > > > > > > [<ffffffe0005c4c26>] check_preemption_disabled+0xa4/0xaa
> > > > > > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > > > > > > [<ffffffe0004737c8>] regmap_mmio_write+0x26/0x44
> > > > > > > > [<ffffffe0004715c4>] regmap_write+0x28/0x48
> > > > > > > > [<ffffffe00043dccc>] sifive_gpio_probe+0xc0/0x1da
> > > > > > > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > > > > > > > [<ffffffe000469054>] platform_drv_probe+0x24/0x52
> > > > > > > > [<ffffffe000467e16>] really_probe+0x92/0x21a
> > > > > > > > [<ffffffe0004683a8>] device_driver_attach+0x42/0x4a
> > > > > > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > > > > > > [<ffffffe0004683f0>] __driver_attach+0x40/0xac
> > > > > > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > > > > > > [<ffffffe000466a3e>] bus_for_each_dev+0x3c/0x64
> > > > > > > > [<ffffffe000467118>] bus_add_driver+0x11e/0x184
> > > > > > > > [<ffffffe00046889a>] driver_register+0x32/0xc6
> > > > > > > > [<ffffffe00000e5ac>] gpiolib_sysfs_init+0xaa/0xae
> > > > > > > > [<ffffffe0000019ec>] do_one_initcall+0x50/0xfc
> > > >
> > > > Hmm.. the problem is that preemption is *not* disabled when
> > > > smp_processor_id is called, right?
> > > 
> > > Yes!
> > > 
> > > smp_processor_id is defined as:
> > > 
> > >  * This is the normal accessor to the CPU id and should be used
> > >  * whenever possible.
> > >  *
> > >  * The CPU id is stable when:
> > >  *
> > >  *  - IRQs are disabled;
> > >  *  - preemption is disabled;
> > >  *  - the task is CPU affine.
> > >  *
> > >  * When CONFIG_DEBUG_PREEMPT; we verify these assumption and WARN
> > >  * when smp_processor_id() is used when the CPU id is not stable.
> > > 
> > > So regmap_write->regmap_mmio_write should be PREEMPT disabled in
> > > sifive_gpio_probe().
> > 
> > Ah! Sorry, now I think I understand. So you're saying that the real
> > problem is that the driver framework should have disabled preemption
> > before calling any .probe functions, but for some reason that doesn't
> > happen on RISC-V?
> 
> I think it's actually an issue with the generic mmiowb stuff and that we should
> just elide the check.  I'm adding Will, for context.  I'll send out a patch.

Hmm. Although I _think_ something like the diff below ought to work, are you
sure you want to be doing MMIO writes in preemptible context? Setting
'.disable_locking = true' in 'sifive_gpio_regmap_config' implies to me that
you should be handling the locking within the driver itself, and all the
other regmap writes are protected by '&gc->bgpio_lock'.

Given that riscv is one of the few architectures needing an implementation
of mmiowb(), doing MMIO in a preemptible section seems especially dangerous
as you have no way to ensure completion of the writes without adding an
mmiowb() to the CPU migration path (i.e. context switch).

Will

--->8

diff --git a/include/asm-generic/mmiowb.h b/include/asm-generic/mmiowb.h
index 9439ff037b2d..5698fca3bf56 100644
--- a/include/asm-generic/mmiowb.h
+++ b/include/asm-generic/mmiowb.h
@@ -27,7 +27,7 @@
 #include <asm/smp.h>
 
 DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
-#define __mmiowb_state()       this_cpu_ptr(&__mmiowb_state)
+#define __mmiowb_state()       raw_cpu_ptr(&__mmiowb_state)
 #else
 #define __mmiowb_state()       arch_mmiowb_state()
 #endif /* arch_mmiowb_state */
@@ -35,7 +35,9 @@ DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
 static inline void mmiowb_set_pending(void)
 {
        struct mmiowb_state *ms = __mmiowb_state();
-       ms->mmiowb_pending = ms->nesting_count;
+
+       if (likely(ms->nesting_count))
+               ms->mmiowb_pending = ms->nesting_count;
 }
 
 static inline void mmiowb_spin_lock(void)


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] asm-generic/mmiowb: Get cpu in mmiowb_set_pending
  2020-07-15 10:42               ` Will Deacon
@ 2020-07-15 14:03                 ` Palmer Dabbelt
  2020-07-15 14:48                   ` Will Deacon
  0 siblings, 1 reply; 15+ messages in thread
From: Palmer Dabbelt @ 2020-07-15 14:03 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-arch, kernel, Arnd Bergmann, guoren, Paul Walmsley, linux-riscv

On Wed, 15 Jul 2020 03:42:46 PDT (-0700), Will Deacon wrote:
> On Tue, Jul 14, 2020 at 11:45:11PM -0700, Palmer Dabbelt wrote:
>> > > > > > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
>> > > > > > > > [<ffffffe0005c4c26>] check_preemption_disabled+0xa4/0xaa
>> > > > > > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
>> > > > > > > > [<ffffffe0004737c8>] regmap_mmio_write+0x26/0x44
>> > > > > > > > [<ffffffe0004715c4>] regmap_write+0x28/0x48
>> > > > > > > > [<ffffffe00043dccc>] sifive_gpio_probe+0xc0/0x1da
>> > > > > > > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
>> > > > > > > > [<ffffffe000469054>] platform_drv_probe+0x24/0x52
>> > > > > > > > [<ffffffe000467e16>] really_probe+0x92/0x21a
>> > > > > > > > [<ffffffe0004683a8>] device_driver_attach+0x42/0x4a
>> > > > > > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
>> > > > > > > > [<ffffffe0004683f0>] __driver_attach+0x40/0xac
>> > > > > > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
>> > > > > > > > [<ffffffe000466a3e>] bus_for_each_dev+0x3c/0x64
>> > > > > > > > [<ffffffe000467118>] bus_add_driver+0x11e/0x184
>> > > > > > > > [<ffffffe00046889a>] driver_register+0x32/0xc6
>> > > > > > > > [<ffffffe00000e5ac>] gpiolib_sysfs_init+0xaa/0xae
>> > > > > > > > [<ffffffe0000019ec>] do_one_initcall+0x50/0xfc
>> > > >
>> > > > Hmm.. the problem is that preemption is *not* disabled when
>> > > > smp_processor_id is called, right?
>> > >
>> > > Yes!
>> > >
>> > > smp_processor_id is defined as:
>> > >
>> > >  * This is the normal accessor to the CPU id and should be used
>> > >  * whenever possible.
>> > >  *
>> > >  * The CPU id is stable when:
>> > >  *
>> > >  *  - IRQs are disabled;
>> > >  *  - preemption is disabled;
>> > >  *  - the task is CPU affine.
>> > >  *
>> > >  * When CONFIG_DEBUG_PREEMPT; we verify these assumption and WARN
>> > >  * when smp_processor_id() is used when the CPU id is not stable.
>> > >
>> > > So regmap_write->regmap_mmio_write should be PREEMPT disabled in
>> > > sifive_gpio_probe().
>> >
>> > Ah! Sorry, now I think I understand. So you're saying that the real
>> > problem is that the driver framework should have disabled preemption
>> > before calling any .probe functions, but for some reason that doesn't
>> > happen on RISC-V?
>>
>> I think it's actually an issue with the generic mmiowb stuff and that we should
>> just elide the check.  I'm adding Will, for context.  I'll send out a patch.
>
> Hmm. Although I _think_ something like the diff below ought to work, are you
> sure you want to be doing MMIO writes in preemptible context? Setting
> '.disable_locking = true' in 'sifive_gpio_regmap_config' implies to me that
> you should be handling the locking within the driver itself, and all the
> other regmap writes are protected by '&gc->bgpio_lock'.

I guess my goal here was to avoid fixing the drivers: it's one thing if it's
just broken SiFive drivers, as they're all a bit crusty, but this is blowing up
for me in the 8250 driver on QEMU as well.  At that point I figured there'd be
an endless stream of bugs around this and I'd rather just.

> Given that riscv is one of the few architectures needing an implementation
> of mmiowb(), doing MMIO in a preemptible section seems especially dangerous
> as you have no way to ensure completion of the writes without adding an
> mmiowb() to the CPU migration path (i.e. context switch).

I was going to just stick one in our context switching code unconditionally.
While we could go track cumulative writes outside the locks, the mmiowb is
essentially free for us because the one RISC-V implementation treats all fences
the same way so the subsequent store_release would hold all this up anyway.

I think the right thing to do is to add some sort of arch hook right about here

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index cfd71d61aa3c..14b4f8b7433f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3212,6 +3212,7 @@ static struct rq *finish_task_switch(struct task_struct *prev)
 	prev_state = prev->state;
 	vtime_task_switch(prev);
 	perf_event_task_sched_in(prev, current);
+	finish_arch_pre_release(prev);
 	finish_task(prev);
 	finish_lock_switch(rq);
 	finish_arch_post_lock_switch();

but I was just going to stick it in switch_to for now... :).  I guess we could
also roll the fence up into yet another one-off primitive for the scheduler,
something like

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9a2fbf98fd6f..9dcc35d37a99 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3074,7 +3074,7 @@ static inline void finish_task(struct task_struct *prev)
         *
         * Pairs with the smp_cond_load_acquire() in try_to_wake_up().
         */
-       smp_store_release(&prev->on_cpu, 0);
+       smp_store_release_and_mmiowb(&prev->on_cpu, 0);
 #endif
 }

but in the long term we'd probably want to elide the unnecessary mmiowb()s
which would result in some accounting code.

>
> Will
>
> --->8
>
> diff --git a/include/asm-generic/mmiowb.h b/include/asm-generic/mmiowb.h
> index 9439ff037b2d..5698fca3bf56 100644
> --- a/include/asm-generic/mmiowb.h
> +++ b/include/asm-generic/mmiowb.h
> @@ -27,7 +27,7 @@
>  #include <asm/smp.h>
>
>  DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
> -#define __mmiowb_state()       this_cpu_ptr(&__mmiowb_state)
> +#define __mmiowb_state()       raw_cpu_ptr(&__mmiowb_state)
>  #else
>  #define __mmiowb_state()       arch_mmiowb_state()
>  #endif /* arch_mmiowb_state */
> @@ -35,7 +35,9 @@ DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
>  static inline void mmiowb_set_pending(void)
>  {
>         struct mmiowb_state *ms = __mmiowb_state();
> -       ms->mmiowb_pending = ms->nesting_count;
> +
> +       if (likely(ms->nesting_count))
> +               ms->mmiowb_pending = ms->nesting_count;

Ya, that's one of the earlier ideas I had, but I decided it doesn't actually do
anything: if we're scheduleable then we know that pending and count are zero,
thus the check isn't necessary.  It made sense late last night and still does
this morning, but I haven't had my coffee yet.

I'm kind of tempted to just declare "mmiowb() is fast on RISC-V, so let's do it
unconditionally everywhere it's necessary".  IIRC that's essentially true on
the existing implementation, as it'll get rolled up to any upcoming fence
anyway.  It seems like building any real machine that relies on the orderings
provided by mmiowb is going to have an infinate rabbit hole of bugs anyway, so
in that case we'd just rely on the hardware to elide the now unnecessary fences
so we'd just be throwing static code size at this wacky memory model and then
forgetting about it.

I'm going to send out a patch set that does all the work I think is necessary
to avoid fixing up the various drivers, with the accounting code to avoid
mmiowbs all over our port.  I'm not sure I'm going to like it, but I guess we
can argue as to exactly how ugly it is :)


>  }
>
>  static inline void mmiowb_spin_lock(void)

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] asm-generic/mmiowb: Get cpu in mmiowb_set_pending
  2020-07-15 14:03                 ` Palmer Dabbelt
@ 2020-07-15 14:48                   ` Will Deacon
  2020-07-15 16:41                     ` Palmer Dabbelt
  0 siblings, 1 reply; 15+ messages in thread
From: Will Deacon @ 2020-07-15 14:48 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: linux-arch, kernel, Arnd Bergmann, guoren, Paul Walmsley, linux-riscv

On Wed, Jul 15, 2020 at 07:03:49AM -0700, Palmer Dabbelt wrote:
> On Wed, 15 Jul 2020 03:42:46 PDT (-0700), Will Deacon wrote:
> > Hmm. Although I _think_ something like the diff below ought to work, are you
> > sure you want to be doing MMIO writes in preemptible context? Setting
> > '.disable_locking = true' in 'sifive_gpio_regmap_config' implies to me that
> > you should be handling the locking within the driver itself, and all the
> > other regmap writes are protected by '&gc->bgpio_lock'.
> 
> I guess my goal here was to avoid fixing the drivers: it's one thing if it's
> just broken SiFive drivers, as they're all a bit crusty, but this is blowing up
> for me in the 8250 driver on QEMU as well.  At that point I figured there'd be
> an endless stream of bugs around this and I'd rather just.

Right, and my patch should solve that.

> > Given that riscv is one of the few architectures needing an implementation
> > of mmiowb(), doing MMIO in a preemptible section seems especially dangerous
> > as you have no way to ensure completion of the writes without adding an
> > mmiowb() to the CPU migration path (i.e. context switch).
> 
> I was going to just stick one in our context switching code unconditionally.
> While we could go track cumulative writes outside the locks, the mmiowb is
> essentially free for us because the one RISC-V implementation treats all fences
> the same way so the subsequent store_release would hold all this up anyway.
> 
> I think the right thing to do is to add some sort of arch hook right about here
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index cfd71d61aa3c..14b4f8b7433f 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3212,6 +3212,7 @@ static struct rq *finish_task_switch(struct task_struct *prev)
> 	prev_state = prev->state;
> 	vtime_task_switch(prev);
> 	perf_event_task_sched_in(prev, current);
> +	finish_arch_pre_release(prev);
> 	finish_task(prev);
> 	finish_lock_switch(rq);
> 	finish_arch_post_lock_switch();
> 
> but I was just going to stick it in switch_to for now... :).  I guess we could
> also roll the fence up into yet another one-off primitive for the scheduler,
> something like

What does the above get you over switch_to()?

> > diff --git a/include/asm-generic/mmiowb.h b/include/asm-generic/mmiowb.h
> > index 9439ff037b2d..5698fca3bf56 100644
> > --- a/include/asm-generic/mmiowb.h
> > +++ b/include/asm-generic/mmiowb.h
> > @@ -27,7 +27,7 @@
> >  #include <asm/smp.h>
> > 
> >  DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
> > -#define __mmiowb_state()       this_cpu_ptr(&__mmiowb_state)
> > +#define __mmiowb_state()       raw_cpu_ptr(&__mmiowb_state)
> >  #else
> >  #define __mmiowb_state()       arch_mmiowb_state()
> >  #endif /* arch_mmiowb_state */
> > @@ -35,7 +35,9 @@ DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
> >  static inline void mmiowb_set_pending(void)
> >  {
> >         struct mmiowb_state *ms = __mmiowb_state();
> > -       ms->mmiowb_pending = ms->nesting_count;
> > +
> > +       if (likely(ms->nesting_count))
> > +               ms->mmiowb_pending = ms->nesting_count;
> 
> Ya, that's one of the earlier ideas I had, but I decided it doesn't actually do
> anything: if we're scheduleable then we know that pending and count are zero,
> thus the check isn't necessary.  It made sense late last night and still does
> this morning, but I haven't had my coffee yet.

What it does is prevent preemptible writeX() from trashing the state on
another CPU, so I think it's a valid fix. I agree that it doesn't help
you if you need mmiowb(), but then that _really_ should only be needed if
you're holding a spinlock. If you're doing concurrent lockless MMIO you
deserve all the pain you get.

I don't get why you think the patch does nothing, as it will operate as
expected if writeX() is called with preemption disabled, which is the common
case.

> I'm kind of tempted to just declare "mmiowb() is fast on RISC-V, so let's do it
> unconditionally everywhere it's necessary".  IIRC that's essentially true on
> the existing implementation, as it'll get rolled up to any upcoming fence
> anyway.  It seems like building any real machine that relies on the orderings
> provided by mmiowb is going to have an infinate rabbit hole of bugs anyway, so
> in that case we'd just rely on the hardware to elide the now unnecessary fences
> so we'd just be throwing static code size at this wacky memory model and then
> forgetting about it.

If you can do that, that's obviously the best approach.

> I'm going to send out a patch set that does all the work I think is necessary
> to avoid fixing up the various drivers, with the accounting code to avoid
> mmiowbs all over our port.  I'm not sure I'm going to like it, but I guess we
> can argue as to exactly how ugly it is :)

Ok.

Will

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] asm-generic/mmiowb: Get cpu in mmiowb_set_pending
  2020-07-15 14:48                   ` Will Deacon
@ 2020-07-15 16:41                     ` Palmer Dabbelt
  2020-07-15 19:28                       ` Palmer Dabbelt
  0 siblings, 1 reply; 15+ messages in thread
From: Palmer Dabbelt @ 2020-07-15 16:41 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-arch, kernel, Arnd Bergmann, guoren, Paul Walmsley, linux-riscv

On Wed, 15 Jul 2020 07:48:06 PDT (-0700), Will Deacon wrote:
> On Wed, Jul 15, 2020 at 07:03:49AM -0700, Palmer Dabbelt wrote:
>> On Wed, 15 Jul 2020 03:42:46 PDT (-0700), Will Deacon wrote:
>> > Hmm. Although I _think_ something like the diff below ought to work, are you
>> > sure you want to be doing MMIO writes in preemptible context? Setting
>> > '.disable_locking = true' in 'sifive_gpio_regmap_config' implies to me that
>> > you should be handling the locking within the driver itself, and all the
>> > other regmap writes are protected by '&gc->bgpio_lock'.
>>
>> I guess my goal here was to avoid fixing the drivers: it's one thing if it's
>> just broken SiFive drivers, as they're all a bit crusty, but this is blowing up
>> for me in the 8250 driver on QEMU as well.  At that point I figured there'd be
>> an endless stream of bugs around this and I'd rather just.
>
> Right, and my patch should solve that.
>
>> > Given that riscv is one of the few architectures needing an implementation
>> > of mmiowb(), doing MMIO in a preemptible section seems especially dangerous
>> > as you have no way to ensure completion of the writes without adding an
>> > mmiowb() to the CPU migration path (i.e. context switch).
>>
>> I was going to just stick one in our context switching code unconditionally.
>> While we could go track cumulative writes outside the locks, the mmiowb is
>> essentially free for us because the one RISC-V implementation treats all fences
>> the same way so the subsequent store_release would hold all this up anyway.
>>
>> I think the right thing to do is to add some sort of arch hook right about here
>>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index cfd71d61aa3c..14b4f8b7433f 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -3212,6 +3212,7 @@ static struct rq *finish_task_switch(struct task_struct *prev)
>> 	prev_state = prev->state;
>> 	vtime_task_switch(prev);
>> 	perf_event_task_sched_in(prev, current);
>> +	finish_arch_pre_release(prev);
>> 	finish_task(prev);
>> 	finish_lock_switch(rq);
>> 	finish_arch_post_lock_switch();
>>
>> but I was just going to stick it in switch_to for now... :).  I guess we could
>> also roll the fence up into yet another one-off primitive for the scheduler,
>> something like
>
> What does the above get you over switch_to()?
>
>> > diff --git a/include/asm-generic/mmiowb.h b/include/asm-generic/mmiowb.h
>> > index 9439ff037b2d..5698fca3bf56 100644
>> > --- a/include/asm-generic/mmiowb.h
>> > +++ b/include/asm-generic/mmiowb.h
>> > @@ -27,7 +27,7 @@
>> >  #include <asm/smp.h>
>> >
>> >  DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
>> > -#define __mmiowb_state()       this_cpu_ptr(&__mmiowb_state)
>> > +#define __mmiowb_state()       raw_cpu_ptr(&__mmiowb_state)
>> >  #else
>> >  #define __mmiowb_state()       arch_mmiowb_state()
>> >  #endif /* arch_mmiowb_state */
>> > @@ -35,7 +35,9 @@ DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
>> >  static inline void mmiowb_set_pending(void)
>> >  {
>> >         struct mmiowb_state *ms = __mmiowb_state();
>> > -       ms->mmiowb_pending = ms->nesting_count;
>> > +
>> > +       if (likely(ms->nesting_count))
>> > +               ms->mmiowb_pending = ms->nesting_count;
>>
>> Ya, that's one of the earlier ideas I had, but I decided it doesn't actually do
>> anything: if we're scheduleable then we know that pending and count are zero,
>> thus the check isn't necessary.  It made sense late last night and still does
>> this morning, but I haven't had my coffee yet.
>
> What it does is prevent preemptible writeX() from trashing the state on
> another CPU, so I think it's a valid fix. I agree that it doesn't help
> you if you need mmiowb(), but then that _really_ should only be needed if
> you're holding a spinlock. If you're doing concurrent lockless MMIO you
> deserve all the pain you get.
>
> I don't get why you think the patch does nothing, as it will operate as
> expected if writeX() is called with preemption disabled, which is the common
> case.

Aside from PREEMPT_RT, I don't understand how you can be scheduled onto a CPU
that has a non-zero nesting_count.  Doesn't that mean that the CPU you're
scheduled on to is itself holding a spinlock, and therefor can't be scheduled
on?

Sure, some interrupt could come in the middle, but it's still going to see the
non-zero nesting_count left over from the spinlock being held and therefor will
avoid trashing the accumulated mmiowb.  As far as I can tell everything then
proceeds acceptably: when the interrupt unlocks it'll do an mmiowb (whether it
did an IO or not), which is sufficient to ensure that the IO from the
interrupted code is completed before the unlock from that code.

I must be missing something here?

>> I'm kind of tempted to just declare "mmiowb() is fast on RISC-V, so let's do it
>> unconditionally everywhere it's necessary".  IIRC that's essentially true on
>> the existing implementation, as it'll get rolled up to any upcoming fence
>> anyway.  It seems like building any real machine that relies on the orderings
>> provided by mmiowb is going to have an infinate rabbit hole of bugs anyway, so
>> in that case we'd just rely on the hardware to elide the now unnecessary fences
>> so we'd just be throwing static code size at this wacky memory model and then
>> forgetting about it.
>
> If you can do that, that's obviously the best approach.
>
>> I'm going to send out a patch set that does all the work I think is necessary
>> to avoid fixing up the various drivers, with the accounting code to avoid
>> mmiowbs all over our port.  I'm not sure I'm going to like it, but I guess we
>> can argue as to exactly how ugly it is :)
>
> Ok.
>
> Will

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] asm-generic/mmiowb: Get cpu in mmiowb_set_pending
  2020-07-15 16:41                     ` Palmer Dabbelt
@ 2020-07-15 19:28                       ` Palmer Dabbelt
  0 siblings, 0 replies; 15+ messages in thread
From: Palmer Dabbelt @ 2020-07-15 19:28 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-arch, kernel, Arnd Bergmann, Guo Ren, Paul Walmsley, linux-riscv

On Wed, Jul 15, 2020 at 9:41 AM Palmer Dabbelt <palmer@dabbelt.com> wrote:
>
> On Wed, 15 Jul 2020 07:48:06 PDT (-0700), Will Deacon wrote:
> > On Wed, Jul 15, 2020 at 07:03:49AM -0700, Palmer Dabbelt wrote:
> >> On Wed, 15 Jul 2020 03:42:46 PDT (-0700), Will Deacon wrote:
> >> > Hmm. Although I _think_ something like the diff below ought to work, are you
> >> > sure you want to be doing MMIO writes in preemptible context? Setting
> >> > '.disable_locking = true' in 'sifive_gpio_regmap_config' implies to me that
> >> > you should be handling the locking within the driver itself, and all the
> >> > other regmap writes are protected by '&gc->bgpio_lock'.
> >>
> >> I guess my goal here was to avoid fixing the drivers: it's one thing if it's
> >> just broken SiFive drivers, as they're all a bit crusty, but this is blowing up
> >> for me in the 8250 driver on QEMU as well.  At that point I figured there'd be
> >> an endless stream of bugs around this and I'd rather just.
> >
> > Right, and my patch should solve that.
> >
> >> > Given that riscv is one of the few architectures needing an implementation
> >> > of mmiowb(), doing MMIO in a preemptible section seems especially dangerous
> >> > as you have no way to ensure completion of the writes without adding an
> >> > mmiowb() to the CPU migration path (i.e. context switch).
> >>
> >> I was going to just stick one in our context switching code unconditionally.
> >> While we could go track cumulative writes outside the locks, the mmiowb is
> >> essentially free for us because the one RISC-V implementation treats all fences
> >> the same way so the subsequent store_release would hold all this up anyway.
> >>
> >> I think the right thing to do is to add some sort of arch hook right about here
> >>
> >> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> >> index cfd71d61aa3c..14b4f8b7433f 100644
> >> --- a/kernel/sched/core.c
> >> +++ b/kernel/sched/core.c
> >> @@ -3212,6 +3212,7 @@ static struct rq *finish_task_switch(struct task_struct *prev)
> >>      prev_state = prev->state;
> >>      vtime_task_switch(prev);
> >>      perf_event_task_sched_in(prev, current);
> >> +    finish_arch_pre_release(prev);
> >>      finish_task(prev);
> >>      finish_lock_switch(rq);
> >>      finish_arch_post_lock_switch();
> >>
> >> but I was just going to stick it in switch_to for now... :).  I guess we could
> >> also roll the fence up into yet another one-off primitive for the scheduler,
> >> something like
> >
> > What does the above get you over switch_to()?
> >
> >> > diff --git a/include/asm-generic/mmiowb.h b/include/asm-generic/mmiowb.h
> >> > index 9439ff037b2d..5698fca3bf56 100644
> >> > --- a/include/asm-generic/mmiowb.h
> >> > +++ b/include/asm-generic/mmiowb.h
> >> > @@ -27,7 +27,7 @@
> >> >  #include <asm/smp.h>
> >> >
> >> >  DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
> >> > -#define __mmiowb_state()       this_cpu_ptr(&__mmiowb_state)
> >> > +#define __mmiowb_state()       raw_cpu_ptr(&__mmiowb_state)
> >> >  #else
> >> >  #define __mmiowb_state()       arch_mmiowb_state()
> >> >  #endif /* arch_mmiowb_state */
> >> > @@ -35,7 +35,9 @@ DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
> >> >  static inline void mmiowb_set_pending(void)
> >> >  {
> >> >         struct mmiowb_state *ms = __mmiowb_state();
> >> > -       ms->mmiowb_pending = ms->nesting_count;
> >> > +
> >> > +       if (likely(ms->nesting_count))
> >> > +               ms->mmiowb_pending = ms->nesting_count;
> >>
> >> Ya, that's one of the earlier ideas I had, but I decided it doesn't actually do
> >> anything: if we're scheduleable then we know that pending and count are zero,
> >> thus the check isn't necessary.  It made sense late last night and still does
> >> this morning, but I haven't had my coffee yet.
> >
> > What it does is prevent preemptible writeX() from trashing the state on
> > another CPU, so I think it's a valid fix. I agree that it doesn't help
> > you if you need mmiowb(), but then that _really_ should only be needed if
> > you're holding a spinlock. If you're doing concurrent lockless MMIO you
> > deserve all the pain you get.
> >
> > I don't get why you think the patch does nothing, as it will operate as
> > expected if writeX() is called with preemption disabled, which is the common
> > case.
>
> Aside from PREEMPT_RT, I don't understand how you can be scheduled onto a CPU
> that has a non-zero nesting_count.  Doesn't that mean that the CPU you're
> scheduled on to is itself holding a spinlock, and therefor can't be scheduled
> on?
>
> Sure, some interrupt could come in the middle, but it's still going to see the
> non-zero nesting_count left over from the spinlock being held and therefor will
> avoid trashing the accumulated mmiowb.  As far as I can tell everything then
> proceeds acceptably: when the interrupt unlocks it'll do an mmiowb (whether it
> did an IO or not), which is sufficient to ensure that the IO from the
> interrupted code is completed before the unlock from that code.
>
> I must be missing something here?

Will and I talked for a bit, this patch is correct.  He's going to
send it, I'm promoting smp_mb__after_spinlock to include IO ordering
so we don't break code when scheduling.

Thanks!

>
> >> I'm kind of tempted to just declare "mmiowb() is fast on RISC-V, so let's do it
> >> unconditionally everywhere it's necessary".  IIRC that's essentially true on
> >> the existing implementation, as it'll get rolled up to any upcoming fence
> >> anyway.  It seems like building any real machine that relies on the orderings
> >> provided by mmiowb is going to have an infinate rabbit hole of bugs anyway, so
> >> in that case we'd just rely on the hardware to elide the now unnecessary fences
> >> so we'd just be throwing static code size at this wacky memory model and then
> >> forgetting about it.
> >
> > If you can do that, that's obviously the best approach.
> >
> >> I'm going to send out a patch set that does all the work I think is necessary
> >> to avoid fixing up the various drivers, with the accounting code to avoid
> >> mmiowbs all over our port.  I'm not sure I'm going to like it, but I guess we
> >> can argue as to exactly how ugly it is :)
> >
> > Ok.
> >
> > Will

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-07-15 19:28 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-05 14:26 [PATCH] asm-generic/mmiowb: Get cpu in mmiowb_set_pending Emil Renner Berthing
2020-07-05 14:43 ` Guo Ren
2020-07-05 14:51   ` Guo Ren
2020-07-05 15:19     ` Guo Ren
2020-07-05 15:03   ` Emil Renner Berthing
2020-07-05 15:52     ` Guo Ren
2020-07-05 17:09       ` Emil Renner Berthing
2020-07-06  0:47         ` Guo Ren
2020-07-06  8:08           ` Emil Renner Berthing
2020-07-15  6:45             ` Palmer Dabbelt
2020-07-15 10:42               ` Will Deacon
2020-07-15 14:03                 ` Palmer Dabbelt
2020-07-15 14:48                   ` Will Deacon
2020-07-15 16:41                     ` Palmer Dabbelt
2020-07-15 19:28                       ` Palmer Dabbelt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).