linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3
       [not found] <1531467477-9952-1-git-send-email-chenhc@lemote.com>
@ 2018-07-17 17:52 ` Paul Burton
  2018-07-17 18:03   ` Paul E. McKenney
                     ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Paul Burton @ 2018-07-17 17:52 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Ralf Baechle, James Hogan, linux-mips, Fuxin Zhang, Zhangjin Wu,
	Huacai Chen, stable, Alan Stern, Andrea Parri, Will Deacon,
	Peter Zijlstra, Boqun Feng, Nicholas Piggin, David Howells,
	Jade Alglave, Luc Maranget, Paul E. McKenney, Akira Yokosawa,
	linux-kernel

Hi Huacai,

On Fri, Jul 13, 2018 at 03:37:57PM +0800, Huacai Chen wrote:
> Linux expects that if a CPU modifies a memory location, then that
> modification will eventually become visible to other CPUs in the system.
> 
> On Loongson-3 processor with SFB (Store Fill Buffer), loads may be
> prioritised over stores so it is possible for a store operation to be
> postponed if a polling loop immediately follows it. If the variable
> being polled indirectly depends on the outstanding store [for example,
> another CPU may be polling the variable that is pending modification]
> then there is the potential for deadlock if interrupts are disabled.
> This deadlock occurs in qspinlock code.
> 
> This patch changes the definition of cpu_relax() to smp_mb() for
> Loongson-3, forcing a flushing of the SFB on SMP systems before the
> next load takes place. If the Kernel is not compiled for SMP support,
> this will expand to a barrier() as before.
> 
> References: 534be1d5a2da940 (ARM: 6194/1: change definition of cpu_relax() for ARM11MPCore)
> Cc: stable@vger.kernel.org
> Signed-off-by: Huacai Chen <chenhc@lemote.com>
> ---
>  arch/mips/include/asm/processor.h | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h
> index af34afb..a8c4a3a 100644
> --- a/arch/mips/include/asm/processor.h
> +++ b/arch/mips/include/asm/processor.h
> @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p);
>  #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29])
>  #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
>  
> +#ifdef CONFIG_CPU_LOONGSON3
> +/*
> + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a read
> + * loop. Since spin loops of any kind should have a cpu_relax() in them, force
> + * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes will
> + * become available as expected.
> + */

I think "may starve writes" or "may queue writes indefinitely" would be
clearer than "may get starved".

> +#define cpu_relax()	smp_mb()
> +#else
>  #define cpu_relax()	barrier()
> +#endif
>  
>  /*
>   * Return_address is a replacement for __builtin_return_address(count)
> -- 
> 2.7.0

Apart from the comment above though this looks better to me.

Re-copying the LKMM maintainers - are you happy(ish) with this?

Thanks,
    Paul

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3
  2018-07-17 17:52 ` [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3 Paul Burton
@ 2018-07-17 18:03   ` Paul E. McKenney
  2018-07-17 18:46   ` Peter Zijlstra
  2018-07-18  1:15   ` Huacai Chen
  2 siblings, 0 replies; 7+ messages in thread
From: Paul E. McKenney @ 2018-07-17 18:03 UTC (permalink / raw)
  To: Paul Burton
  Cc: Huacai Chen, Ralf Baechle, James Hogan, linux-mips, Fuxin Zhang,
	Zhangjin Wu, Huacai Chen, stable, Alan Stern, Andrea Parri,
	Will Deacon, Peter Zijlstra, Boqun Feng, Nicholas Piggin,
	David Howells, Jade Alglave, Luc Maranget, Akira Yokosawa,
	linux-kernel

On Tue, Jul 17, 2018 at 10:52:32AM -0700, Paul Burton wrote:
> Hi Huacai,
> 
> On Fri, Jul 13, 2018 at 03:37:57PM +0800, Huacai Chen wrote:
> > Linux expects that if a CPU modifies a memory location, then that
> > modification will eventually become visible to other CPUs in the system.
> > 
> > On Loongson-3 processor with SFB (Store Fill Buffer), loads may be
> > prioritised over stores so it is possible for a store operation to be
> > postponed if a polling loop immediately follows it. If the variable
> > being polled indirectly depends on the outstanding store [for example,
> > another CPU may be polling the variable that is pending modification]
> > then there is the potential for deadlock if interrupts are disabled.
> > This deadlock occurs in qspinlock code.
> > 
> > This patch changes the definition of cpu_relax() to smp_mb() for
> > Loongson-3, forcing a flushing of the SFB on SMP systems before the
> > next load takes place. If the Kernel is not compiled for SMP support,
> > this will expand to a barrier() as before.
> > 
> > References: 534be1d5a2da940 (ARM: 6194/1: change definition of cpu_relax() for ARM11MPCore)
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Huacai Chen <chenhc@lemote.com>
> > ---
> >  arch/mips/include/asm/processor.h | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h
> > index af34afb..a8c4a3a 100644
> > --- a/arch/mips/include/asm/processor.h
> > +++ b/arch/mips/include/asm/processor.h
> > @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p);
> >  #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29])
> >  #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
> >  
> > +#ifdef CONFIG_CPU_LOONGSON3
> > +/*
> > + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a read
> > + * loop. Since spin loops of any kind should have a cpu_relax() in them, force
> > + * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes will
> > + * become available as expected.
> > + */
> 
> I think "may starve writes" or "may queue writes indefinitely" would be
> clearer than "may get starved".
> 
> > +#define cpu_relax()	smp_mb()
> > +#else
> >  #define cpu_relax()	barrier()
> > +#endif
> >  
> >  /*
> >   * Return_address is a replacement for __builtin_return_address(count)
> > -- 
> > 2.7.0
> 
> Apart from the comment above though this looks better to me.
> 
> Re-copying the LKMM maintainers - are you happy(ish) with this?

This looks much better to me.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3
  2018-07-17 17:52 ` [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3 Paul Burton
  2018-07-17 18:03   ` Paul E. McKenney
@ 2018-07-17 18:46   ` Peter Zijlstra
  2018-07-18  1:15   ` Huacai Chen
  2 siblings, 0 replies; 7+ messages in thread
From: Peter Zijlstra @ 2018-07-17 18:46 UTC (permalink / raw)
  To: Paul Burton
  Cc: Huacai Chen, Ralf Baechle, James Hogan, linux-mips, Fuxin Zhang,
	Zhangjin Wu, Huacai Chen, stable, Alan Stern, Andrea Parri,
	Will Deacon, Boqun Feng, Nicholas Piggin, David Howells,
	Jade Alglave, Luc Maranget, Paul E. McKenney, Akira Yokosawa,
	linux-kernel

On Tue, Jul 17, 2018 at 10:52:32AM -0700, Paul Burton wrote:
> On Fri, Jul 13, 2018 at 03:37:57PM +0800, Huacai Chen wrote:
> > Linux expects that if a CPU modifies a memory location, then that
> > modification will eventually become visible to other CPUs in the system.
> > 
> > On Loongson-3 processor with SFB (Store Fill Buffer), loads may be
> > prioritised over stores so it is possible for a store operation to be
> > postponed if a polling loop immediately follows it. If the variable
> > being polled indirectly depends on the outstanding store [for example,
> > another CPU may be polling the variable that is pending modification]
> > then there is the potential for deadlock if interrupts are disabled.
> > This deadlock occurs in qspinlock code.
> > 
> > This patch changes the definition of cpu_relax() to smp_mb() for
> > Loongson-3, forcing a flushing of the SFB on SMP systems before the
> > next load takes place. If the Kernel is not compiled for SMP support,
> > this will expand to a barrier() as before.
> > 
> > References: 534be1d5a2da940 (ARM: 6194/1: change definition of cpu_relax() for ARM11MPCore)
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Huacai Chen <chenhc@lemote.com>
> > ---
> >  arch/mips/include/asm/processor.h | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h
> > index af34afb..a8c4a3a 100644
> > --- a/arch/mips/include/asm/processor.h
> > +++ b/arch/mips/include/asm/processor.h
> > @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p);
> >  #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29])
> >  #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
> >  
> > +#ifdef CONFIG_CPU_LOONGSON3
> > +/*
> > + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a read
> > + * loop. Since spin loops of any kind should have a cpu_relax() in them, force
> > + * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes will
> > + * become available as expected.
> > + */
> 
> I think "may starve writes" or "may queue writes indefinitely" would be
> clearer than "may get starved".

Agreed.

> > +#define cpu_relax()	smp_mb()
> > +#else
> >  #define cpu_relax()	barrier()
> > +#endif
> >  
> >  /*
> >   * Return_address is a replacement for __builtin_return_address(count)
> 
> Apart from the comment above though this looks better to me.
> 
> Re-copying the LKMM maintainers - are you happy(ish) with this?

Right, thanks for adding us back on :-)

Yes, this is much better, although I myself would also prefer explicit
mention that this is a work-around for a hardware bug.

But aside from the actual comment bike-shedding, this looks entirely
acceptible (also because ARM is already doing this -- and the Changelog
might want to refer to that particular patch).

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3
  2018-07-17 17:52 ` [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3 Paul Burton
  2018-07-17 18:03   ` Paul E. McKenney
  2018-07-17 18:46   ` Peter Zijlstra
@ 2018-07-18  1:15   ` Huacai Chen
  2018-07-19 21:15     ` Paul Burton
  2 siblings, 1 reply; 7+ messages in thread
From: Huacai Chen @ 2018-07-18  1:15 UTC (permalink / raw)
  To: Paul Burton
  Cc: Ralf Baechle, James Hogan, Linux MIPS Mailing List, Fuxin Zhang,
	Zhangjin Wu, stable, Alan Stern, Andrea Parri, Will Deacon,
	Peter Zijlstra, Boqun Feng, Nicholas Piggin, David Howells,
	Jade Alglave, Luc Maranget, Paul E. McKenney, Akira Yokosawa,
	LKML

On Wed, Jul 18, 2018 at 1:52 AM, Paul Burton <paul.burton@mips.com> wrote:
> Hi Huacai,
>
> On Fri, Jul 13, 2018 at 03:37:57PM +0800, Huacai Chen wrote:
>> Linux expects that if a CPU modifies a memory location, then that
>> modification will eventually become visible to other CPUs in the system.
>>
>> On Loongson-3 processor with SFB (Store Fill Buffer), loads may be
>> prioritised over stores so it is possible for a store operation to be
>> postponed if a polling loop immediately follows it. If the variable
>> being polled indirectly depends on the outstanding store [for example,
>> another CPU may be polling the variable that is pending modification]
>> then there is the potential for deadlock if interrupts are disabled.
>> This deadlock occurs in qspinlock code.
>>
>> This patch changes the definition of cpu_relax() to smp_mb() for
>> Loongson-3, forcing a flushing of the SFB on SMP systems before the
>> next load takes place. If the Kernel is not compiled for SMP support,
>> this will expand to a barrier() as before.
>>
>> References: 534be1d5a2da940 (ARM: 6194/1: change definition of cpu_relax() for ARM11MPCore)
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Huacai Chen <chenhc@lemote.com>
>> ---
>>  arch/mips/include/asm/processor.h | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h
>> index af34afb..a8c4a3a 100644
>> --- a/arch/mips/include/asm/processor.h
>> +++ b/arch/mips/include/asm/processor.h
>> @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p);
>>  #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29])
>>  #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
>>
>> +#ifdef CONFIG_CPU_LOONGSON3
>> +/*
>> + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a read
>> + * loop. Since spin loops of any kind should have a cpu_relax() in them, force
>> + * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes will
>> + * become available as expected.
>> + */
>
> I think "may starve writes" or "may queue writes indefinitely" would be
> clearer than "may get starved".
Need I change the comment and resend? Or you change the comment and get merged?

Huacai

>
>> +#define cpu_relax()  smp_mb()
>> +#else
>>  #define cpu_relax()  barrier()
>> +#endif
>>
>>  /*
>>   * Return_address is a replacement for __builtin_return_address(count)
>> --
>> 2.7.0
>
> Apart from the comment above though this looks better to me.
>
> Re-copying the LKMM maintainers - are you happy(ish) with this?
>
> Thanks,
>     Paul

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3
  2018-07-18  1:15   ` Huacai Chen
@ 2018-07-19 21:15     ` Paul Burton
  2018-07-21  1:35       ` 陈华才
  0 siblings, 1 reply; 7+ messages in thread
From: Paul Burton @ 2018-07-19 21:15 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Ralf Baechle, James Hogan, Linux MIPS Mailing List, Fuxin Zhang,
	Zhangjin Wu, stable, Alan Stern, Andrea Parri, Will Deacon,
	Peter Zijlstra, Boqun Feng, Nicholas Piggin, David Howells,
	Jade Alglave, Luc Maranget, Paul E. McKenney, Akira Yokosawa,
	LKML

Hi Huacai,

On Wed, Jul 18, 2018 at 09:15:46AM +0800, Huacai Chen wrote:
> >> diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h
> >> index af34afb..a8c4a3a 100644
> >> --- a/arch/mips/include/asm/processor.h
> >> +++ b/arch/mips/include/asm/processor.h
> >> @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p);
> >>  #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29])
> >>  #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
> >>
> >> +#ifdef CONFIG_CPU_LOONGSON3
> >> +/*
> >> + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a read
> >> + * loop. Since spin loops of any kind should have a cpu_relax() in them, force
> >> + * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes will
> >> + * become available as expected.
> >> + */
> >
> > I think "may starve writes" or "may queue writes indefinitely" would be
> > clearer than "may get starved".
>
> Need I change the comment and resend? Or you change the comment and get merged?

I'm happy to fix up the comment - but have a couple more questions.

Looking into the history, would it be fair to say that this is only a
problem after commit 1e820da3c9af ("MIPS: Loongson-3: Introduce
CONFIG_LOONGSON3_ENHANCEMENT") when CONFIG_LOONGSON3_ENHANCEMENT=y,
which adds code to enable the SFB?

If so would it make sense to use CONFIG_LOONGSON3_ENHANCEMENT to select
the use of smp_mb()?

How much does performance gain does enabling the SFB give you? Would it
be reasonable to just disable it, rather than using this workaround?

Thanks,
    Paul

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3
  2018-07-19 21:15     ` Paul Burton
@ 2018-07-21  1:35       ` 陈华才
  2018-07-23 17:37         ` Paul Burton
  0 siblings, 1 reply; 7+ messages in thread
From: 陈华才 @ 2018-07-21  1:35 UTC (permalink / raw)
  To: Paul Burton
  Cc: Ralf Baechle, James Hogan, linux-mips, Fuxin Zhang, wuzhangjin,
	stable, Alan Stern, Andrea Parri, Will Deacon, Peter Zijlstra,
	Boqun Feng, Nicholas Piggin, David Howells, Jade Alglave,
	Luc Maranget, Paul E. McKenney, Akira Yokosawa, LKML

Hi, Paul,
 
SFB can improve the memory bandwidth as much as 30%, and we are planning to enable SFB by default. So, we want to control cpu_relax() under CONFIG_CPU_LOONGSON3, not under CONFIG_LOONGSON3_ENHANCEMENT.

Huacai
 
------------------ Original ------------------
From:  "Paul Burton"<paul.burton@mips.com>;
Date:  Fri, Jul 20, 2018 05:15 AM
To:  "Huacai Chen"<chenhc@lemote.com>; 
Cc:  "Ralf Baechle"<ralf@linux-mips.org>; "James Hogan"<jhogan@kernel.org>; "linux-mips"<linux-mips@linux-mips.org>; "Fuxin Zhang"<zhangfx@lemote.com>; "wuzhangjin"<wuzhangjin@gmail.com>; "stable"<stable@vger.kernel.org>; "Alan Stern"<stern@rowland.harvard.edu>; "Andrea Parri"<andrea.parri@amarulasolutions.com>; "Will Deacon"<will.deacon@arm.com>; "Peter Zijlstra"<peterz@infradead.org>; "Boqun Feng"<boqun.feng@gmail.com>; "Nicholas Piggin"<npiggin@gmail.com>; "David Howells"<dhowells@redhat.com>; "Jade Alglave"<j.alglave@ucl.ac.uk>; "Luc Maranget"<luc.maranget@inria.fr>; "Paul E. McKenney"<paulmck@linux.vnet.ibm.com>; "Akira Yokosawa"<akiyks@gmail.com>; "LKML"<linux-kernel@vger.kernel.org>; 
Subject:  Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3

 
Hi Huacai,

On Wed, Jul 18, 2018 at 09:15:46AM +0800, Huacai Chen wrote:
> >> diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h
> >> index af34afb..a8c4a3a 100644
> >> --- a/arch/mips/include/asm/processor.h
> >> +++ b/arch/mips/include/asm/processor.h
> >> @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p);
> >>  #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29])
> >>  #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
> >>
> >> +#ifdef CONFIG_CPU_LOONGSON3
> >> +/*
> >> + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a read
> >> + * loop. Since spin loops of any kind should have a cpu_relax() in them, force
> >> + * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes will
> >> + * become available as expected.
> >> + */
> >
> > I think "may starve writes" or "may queue writes indefinitely" would be
> > clearer than "may get starved".
>
> Need I change the comment and resend? Or you change the comment and get merged?

I'm happy to fix up the comment - but have a couple more questions.

Looking into the history, would it be fair to say that this is only a
problem after commit 1e820da3c9af ("MIPS: Loongson-3: Introduce
CONFIG_LOONGSON3_ENHANCEMENT") when CONFIG_LOONGSON3_ENHANCEMENT=y,
which adds code to enable the SFB?

If so would it make sense to use CONFIG_LOONGSON3_ENHANCEMENT to select
the use of smp_mb()?

How much does performance gain does enabling the SFB give you? Would it
be reasonable to just disable it, rather than using this workaround?

Thanks,
    Paul

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3
  2018-07-21  1:35       ` 陈华才
@ 2018-07-23 17:37         ` Paul Burton
  0 siblings, 0 replies; 7+ messages in thread
From: Paul Burton @ 2018-07-23 17:37 UTC (permalink / raw)
  To: 陈华才
  Cc: Ralf Baechle, James Hogan, linux-mips, Fuxin Zhang, wuzhangjin,
	stable, Alan Stern, Andrea Parri, Will Deacon, Peter Zijlstra,
	Boqun Feng, Nicholas Piggin, David Howells, Jade Alglave,
	Luc Maranget, Paul E. McKenney, Akira Yokosawa, LKML

Hi Huacai,

On Sat, Jul 21, 2018 at 09:35:59AM +0800, 陈华才 wrote:
> SFB can improve the memory bandwidth as much as 30%, and we are
> planning to enable SFB by default. So, we want to control cpu_relax()
> under CONFIG_CPU_LOONGSON3, not under CONFIG_LOONGSON3_ENHANCEMENT.

OK, applied to mips-next for 4.19 with changes to the commit message &
comment.

Thanks,
    Paul

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2018-07-23 17:37 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1531467477-9952-1-git-send-email-chenhc@lemote.com>
2018-07-17 17:52 ` [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3 Paul Burton
2018-07-17 18:03   ` Paul E. McKenney
2018-07-17 18:46   ` Peter Zijlstra
2018-07-18  1:15   ` Huacai Chen
2018-07-19 21:15     ` Paul Burton
2018-07-21  1:35       ` 陈华才
2018-07-23 17:37         ` Paul Burton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).