[PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
@ 2011-05-04 11:53 Kashyap, Desai
  2011-05-17  7:16 ` James Bottomley
  0 siblings, 1 reply; 46+ messages in thread
From: Kashyap, Desai @ 2011-05-04 11:53 UTC (permalink / raw)
  To: linux-scsi; +Cc: James.Bottomley, Eric.Moore, Sathya.Prakash

The following code seems to be there in /usr/src/linux/arch/x86/include/asm/io.h.
This is not going to work.

static inline void writeq(__u64 val, volatile void __iomem *addr)
{
        writel(val, addr);
        writel(val >> 32, addr+4);
}

So with this code turned on in the kernel, there is going to be race condition 
where multiple cpus can be writing to the request descriptor at the same time.

Meaning this could happen:
(A) CPU A doest 32bit write
(B) CPU B does 32 bit write
(C) CPU A does 32 bit write
(D) CPU B does 32 bit write

We need the 64 bit completed in one access pci memory write, else spin lock is required.
Since it's going to be difficult to know which writeq was implemented in the kernel, 
the driver is going to have to always acquire a spin lock each time we do 64bit write.

Cc: stable@kernle.org
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
---
diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
index efa0255..5778334 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
@@ -1558,7 +1558,6 @@ mpt2sas_base_free_smid(struct MPT2SAS_ADAPTER *ioc, u16 smid)
  * care of 32 bit environment where its not quarenteed to send the entire word
  * in one transfer.
  */
-#ifndef writeq
 static inline void _base_writeq(__u64 b, volatile void __iomem *addr,
     spinlock_t *writeq_lock)
 {
@@ -1570,13 +1569,6 @@ static inline void _base_writeq(__u64 b, volatile void __iomem *addr,
 	writel((u32)(data_out >> 32), (addr + 4));
 	spin_unlock_irqrestore(writeq_lock, flags);
 }
-#else
-static inline void _base_writeq(__u64 b, volatile void __iomem *addr,
-    spinlock_t *writeq_lock)
-{
-	writeq(cpu_to_le64(b), addr);
-}
-#endif
 
 /**
  * mpt2sas_base_put_smid_scsi_io - send SCSI_IO request to firmware

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
  2011-05-04 11:53 [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic Kashyap, Desai
@ 2011-05-17  7:16 ` James Bottomley
  2011-05-18  4:07   ` Desai, Kashyap
  0 siblings, 1 reply; 46+ messages in thread
From: James Bottomley @ 2011-05-17  7:16 UTC (permalink / raw)
  To: Kashyap, Desai; +Cc: linux-scsi, Eric.Moore, Sathya.Prakash

On Wed, 2011-05-04 at 17:23 +0530, Kashyap, Desai wrote:
> The following code seems to be there in /usr/src/linux/arch/x86/include/asm/io.h.
> This is not going to work.
> 
> static inline void writeq(__u64 val, volatile void __iomem *addr)
> {
>         writel(val, addr);
>         writel(val >> 32, addr+4);
> }
> 
> So with this code turned on in the kernel, there is going to be race condition 
> where multiple cpus can be writing to the request descriptor at the same time.
> 
> Meaning this could happen:
> (A) CPU A doest 32bit write
> (B) CPU B does 32 bit write
> (C) CPU A does 32 bit write
> (D) CPU B does 32 bit write
> 
> We need the 64 bit completed in one access pci memory write, else spin lock is required.
> Since it's going to be difficult to know which writeq was implemented in the kernel, 
> the driver is going to have to always acquire a spin lock each time we do 64bit write.
> 
> Cc: stable@kernle.org
> Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
> ---
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
> index efa0255..5778334 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> @@ -1558,7 +1558,6 @@ mpt2sas_base_free_smid(struct MPT2SAS_ADAPTER *ioc, u16 smid)
>   * care of 32 bit environment where its not quarenteed to send the entire word
>   * in one transfer.
>   */
> -#ifndef writeq

Why not make this #ifndef CONFIG_64BIT?  You know that all 64 bit
systems have writeq implemented correctly; you suspect 32 bit systems
don't.

James

>  static inline void _base_writeq(__u64 b, volatile void __iomem *addr,
>      spinlock_t *writeq_lock)
>  {
> @@ -1570,13 +1569,6 @@ static inline void _base_writeq(__u64 b, volatile void __iomem *addr,
>  	writel((u32)(data_out >> 32), (addr + 4));
>  	spin_unlock_irqrestore(writeq_lock, flags);
>  }
> -#else
> -static inline void _base_writeq(__u64 b, volatile void __iomem *addr,
> -    spinlock_t *writeq_lock)
> -{
> -	writeq(cpu_to_le64(b), addr);
> -}
> -#endif
>  
>  /**
>   * mpt2sas_base_put_smid_scsi_io - send SCSI_IO request to firmware
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
  2011-05-17  7:16 ` James Bottomley
@ 2011-05-18  4:07   ` Desai, Kashyap
  2011-05-18  4:15       ` Matthew Wilcox
  0 siblings, 1 reply; 46+ messages in thread
From: Desai, Kashyap @ 2011-05-18  4:07 UTC (permalink / raw)
  To: James Bottomley, Moore, Eric; +Cc: linux-scsi, Prakash, Sathya



-----Original Message-----
From: James Bottomley [mailto:James.Bottomley@HansenPartnership.com] 
Sent: Tuesday, May 17, 2011 12:46 PM
To: Desai, Kashyap
Cc: linux-scsi@vger.kernel.org; Moore, Eric; Prakash, Sathya
Subject: Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic

On Wed, 2011-05-04 at 17:23 +0530, Kashyap, Desai wrote:
> The following code seems to be there in /usr/src/linux/arch/x86/include/asm/io.h.
> This is not going to work.
> 
> static inline void writeq(__u64 val, volatile void __iomem *addr)
> {
>         writel(val, addr);
>         writel(val >> 32, addr+4);
> }
> 
> So with this code turned on in the kernel, there is going to be race condition 
> where multiple cpus can be writing to the request descriptor at the same time.
> 
> Meaning this could happen:
> (A) CPU A doest 32bit write
> (B) CPU B does 32 bit write
> (C) CPU A does 32 bit write
> (D) CPU B does 32 bit write
> 
> We need the 64 bit completed in one access pci memory write, else spin lock is required.
> Since it's going to be difficult to know which writeq was implemented in the kernel, 
> the driver is going to have to always acquire a spin lock each time we do 64bit write.
> 
> Cc: stable@kernle.org
> Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
> ---
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
> index efa0255..5778334 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> @@ -1558,7 +1558,6 @@ mpt2sas_base_free_smid(struct MPT2SAS_ADAPTER *ioc, u16 smid)
>   * care of 32 bit environment where its not quarenteed to send the entire word
>   * in one transfer.
>   */
> -#ifndef writeq

Why not make this #ifndef CONFIG_64BIT?  You know that all 64 bit
systems have writeq implemented correctly; you suspect 32 bit systems
don't.

James

James, This issue was observed on PPC64 system. So what you have suggested will not solve this issue.
If we are sure that writeq() is atomic across all architecture, we can use it safely. As we have seen issue on ppc64, we are not confident to use
"writeq" call.

~ Kashyap

>  static inline void _base_writeq(__u64 b, volatile void __iomem *addr,
>      spinlock_t *writeq_lock)
>  {
> @@ -1570,13 +1569,6 @@ static inline void _base_writeq(__u64 b, volatile void __iomem *addr,
>  	writel((u32)(data_out >> 32), (addr + 4));
>  	spin_unlock_irqrestore(writeq_lock, flags);
>  }
> -#else
> -static inline void _base_writeq(__u64 b, volatile void __iomem *addr,
> -    spinlock_t *writeq_lock)
> -{
> -	writeq(cpu_to_le64(b), addr);
> -}
> -#endif
>  
>  /**
>   * mpt2sas_base_put_smid_scsi_io - send SCSI_IO request to firmware
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
  2011-05-18  4:07   ` Desai, Kashyap
@ 2011-05-18  4:15       ` Matthew Wilcox
  0 siblings, 0 replies; 46+ messages in thread
From: Matthew Wilcox @ 2011-05-18  4:15 UTC (permalink / raw)
  To: Desai, Kashyap
  Cc: James Bottomley, Moore, Eric, linux-scsi, Prakash, Sathya, benh,
	paulus, linuxppc-dev

On Wed, May 18, 2011 at 09:37:08AM +0530, Desai, Kashyap wrote:
> On Wed, 2011-05-04 at 17:23 +0530, Kashyap, Desai wrote:
> > The following code seems to be there in /usr/src/linux/arch/x86/include/asm/io.h.
> > This is not going to work.
> > 
> > static inline void writeq(__u64 val, volatile void __iomem *addr)
> > {
> >         writel(val, addr);
> >         writel(val >> 32, addr+4);
> > }
> > 
> > So with this code turned on in the kernel, there is going to be race condition 
> > where multiple cpus can be writing to the request descriptor at the same time.
> > 
> > Meaning this could happen:
> > (A) CPU A doest 32bit write
> > (B) CPU B does 32 bit write
> > (C) CPU A does 32 bit write
> > (D) CPU B does 32 bit write
> > 
> > We need the 64 bit completed in one access pci memory write, else spin lock is required.
> > Since it's going to be difficult to know which writeq was implemented in the kernel, 
> > the driver is going to have to always acquire a spin lock each time we do 64bit write.
> > 
> > Cc: stable@kernle.org
> > Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
> > ---
> > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > index efa0255..5778334 100644
> > --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > @@ -1558,7 +1558,6 @@ mpt2sas_base_free_smid(struct MPT2SAS_ADAPTER *ioc, u16 smid)
> >   * care of 32 bit environment where its not quarenteed to send the entire word
> >   * in one transfer.
> >   */
> > -#ifndef writeq
> 
> Why not make this #ifndef CONFIG_64BIT?  You know that all 64 bit
> systems have writeq implemented correctly; you suspect 32 bit systems
> don't.
> 
> James
> 
> James, This issue was observed on PPC64 system. So what you have suggested will not solve this issue.
> If we are sure that writeq() is atomic across all architecture, we can use it safely. As we have seen issue on ppc64, we are not confident to use
> "writeq" call.

So have you told the powerpc people that they have a broken writeq?
And why do you obfuscate your report by talking about i386 when it's
really about powerpc64?

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
@ 2011-05-18  4:15       ` Matthew Wilcox
  0 siblings, 0 replies; 46+ messages in thread
From: Matthew Wilcox @ 2011-05-18  4:15 UTC (permalink / raw)
  To: Desai, Kashyap
  Cc: Prakash, Sathya, linux-scsi, linuxppc-dev, James Bottomley,
	paulus, Moore, Eric

On Wed, May 18, 2011 at 09:37:08AM +0530, Desai, Kashyap wrote:
> On Wed, 2011-05-04 at 17:23 +0530, Kashyap, Desai wrote:
> > The following code seems to be there in /usr/src/linux/arch/x86/include/asm/io.h.
> > This is not going to work.
> > 
> > static inline void writeq(__u64 val, volatile void __iomem *addr)
> > {
> >         writel(val, addr);
> >         writel(val >> 32, addr+4);
> > }
> > 
> > So with this code turned on in the kernel, there is going to be race condition 
> > where multiple cpus can be writing to the request descriptor at the same time.
> > 
> > Meaning this could happen:
> > (A) CPU A doest 32bit write
> > (B) CPU B does 32 bit write
> > (C) CPU A does 32 bit write
> > (D) CPU B does 32 bit write
> > 
> > We need the 64 bit completed in one access pci memory write, else spin lock is required.
> > Since it's going to be difficult to know which writeq was implemented in the kernel, 
> > the driver is going to have to always acquire a spin lock each time we do 64bit write.
> > 
> > Cc: stable@kernle.org
> > Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
> > ---
> > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > index efa0255..5778334 100644
> > --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > @@ -1558,7 +1558,6 @@ mpt2sas_base_free_smid(struct MPT2SAS_ADAPTER *ioc, u16 smid)
> >   * care of 32 bit environment where its not quarenteed to send the entire word
> >   * in one transfer.
> >   */
> > -#ifndef writeq
> 
> Why not make this #ifndef CONFIG_64BIT?  You know that all 64 bit
> systems have writeq implemented correctly; you suspect 32 bit systems
> don't.
> 
> James
> 
> James, This issue was observed on PPC64 system. So what you have suggested will not solve this issue.
> If we are sure that writeq() is atomic across all architecture, we can use it safely. As we have seen issue on ppc64, we are not confident to use
> "writeq" call.

So have you told the powerpc people that they have a broken writeq?
And why do you obfuscate your report by talking about i386 when it's
really about powerpc64?

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
  2011-05-18  4:15       ` Matthew Wilcox
@ 2011-05-18  4:23         ` James Bottomley
  -1 siblings, 0 replies; 46+ messages in thread
From: James Bottomley @ 2011-05-18  4:23 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Desai, Kashyap, Moore, Eric, linux-scsi, Prakash, Sathya, benh,
	paulus, linuxppc-dev

On Tue, 2011-05-17 at 22:15 -0600, Matthew Wilcox wrote:
> On Wed, May 18, 2011 at 09:37:08AM +0530, Desai, Kashyap wrote:
> > On Wed, 2011-05-04 at 17:23 +0530, Kashyap, Desai wrote:
> > > The following code seems to be there in /usr/src/linux/arch/x86/include/asm/io.h.
> > > This is not going to work.
> > > 
> > > static inline void writeq(__u64 val, volatile void __iomem *addr)
> > > {
> > >         writel(val, addr);
> > >         writel(val >> 32, addr+4);
> > > }
> > > 
> > > So with this code turned on in the kernel, there is going to be race condition 
> > > where multiple cpus can be writing to the request descriptor at the same time.
> > > 
> > > Meaning this could happen:
> > > (A) CPU A doest 32bit write
> > > (B) CPU B does 32 bit write
> > > (C) CPU A does 32 bit write
> > > (D) CPU B does 32 bit write
> > > 
> > > We need the 64 bit completed in one access pci memory write, else spin lock is required.
> > > Since it's going to be difficult to know which writeq was implemented in the kernel, 
> > > the driver is going to have to always acquire a spin lock each time we do 64bit write.
> > > 
> > > Cc: stable@kernle.org
> > > Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
> > > ---
> > > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > index efa0255..5778334 100644
> > > --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > @@ -1558,7 +1558,6 @@ mpt2sas_base_free_smid(struct MPT2SAS_ADAPTER *ioc, u16 smid)
> > >   * care of 32 bit environment where its not quarenteed to send the entire word
> > >   * in one transfer.
> > >   */
> > > -#ifndef writeq
> > 
> > Why not make this #ifndef CONFIG_64BIT?  You know that all 64 bit
> > systems have writeq implemented correctly; you suspect 32 bit systems
> > don't.
> > 
> > James
> > 
> > James, This issue was observed on PPC64 system. So what you have suggested will not solve this issue.
> > If we are sure that writeq() is atomic across all architecture, we can use it safely. As we have seen issue on ppc64, we are not confident to use
> > "writeq" call.
> 
> So have you told the powerpc people that they have a broken writeq?

I'm just in the process of finding them now on IRC so I can demand an
explanation: this is a really serious API problem because writeq is
supposed to be atomic on 64 bit.

> And why do you obfuscate your report by talking about i386 when it's
> really about powerpc64?

James



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
@ 2011-05-18  4:23         ` James Bottomley
  0 siblings, 0 replies; 46+ messages in thread
From: James Bottomley @ 2011-05-18  4:23 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Prakash, Sathya, Desai, Kashyap, linux-scsi, linuxppc-dev,
	paulus, Moore, Eric

On Tue, 2011-05-17 at 22:15 -0600, Matthew Wilcox wrote:
> On Wed, May 18, 2011 at 09:37:08AM +0530, Desai, Kashyap wrote:
> > On Wed, 2011-05-04 at 17:23 +0530, Kashyap, Desai wrote:
> > > The following code seems to be there in /usr/src/linux/arch/x86/include/asm/io.h.
> > > This is not going to work.
> > > 
> > > static inline void writeq(__u64 val, volatile void __iomem *addr)
> > > {
> > >         writel(val, addr);
> > >         writel(val >> 32, addr+4);
> > > }
> > > 
> > > So with this code turned on in the kernel, there is going to be race condition 
> > > where multiple cpus can be writing to the request descriptor at the same time.
> > > 
> > > Meaning this could happen:
> > > (A) CPU A doest 32bit write
> > > (B) CPU B does 32 bit write
> > > (C) CPU A does 32 bit write
> > > (D) CPU B does 32 bit write
> > > 
> > > We need the 64 bit completed in one access pci memory write, else spin lock is required.
> > > Since it's going to be difficult to know which writeq was implemented in the kernel, 
> > > the driver is going to have to always acquire a spin lock each time we do 64bit write.
> > > 
> > > Cc: stable@kernle.org
> > > Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
> > > ---
> > > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > index efa0255..5778334 100644
> > > --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > @@ -1558,7 +1558,6 @@ mpt2sas_base_free_smid(struct MPT2SAS_ADAPTER *ioc, u16 smid)
> > >   * care of 32 bit environment where its not quarenteed to send the entire word
> > >   * in one transfer.
> > >   */
> > > -#ifndef writeq
> > 
> > Why not make this #ifndef CONFIG_64BIT?  You know that all 64 bit
> > systems have writeq implemented correctly; you suspect 32 bit systems
> > don't.
> > 
> > James
> > 
> > James, This issue was observed on PPC64 system. So what you have suggested will not solve this issue.
> > If we are sure that writeq() is atomic across all architecture, we can use it safely. As we have seen issue on ppc64, we are not confident to use
> > "writeq" call.
> 
> So have you told the powerpc people that they have a broken writeq?

I'm just in the process of finding them now on IRC so I can demand an
explanation: this is a really serious API problem because writeq is
supposed to be atomic on 64 bit.

> And why do you obfuscate your report by talking about i386 when it's
> really about powerpc64?

James

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
  2011-05-18  4:15       ` Matthew Wilcox
@ 2011-05-18  5:45         ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 46+ messages in thread
From: Benjamin Herrenschmidt @ 2011-05-18  5:45 UTC (permalink / raw)
  To: Desai, Kashyap
  Cc: James Bottomley, Moore, Eric, linux-scsi, Prakash, Sathya,
	paulus, linuxppc-dev, Matthew Wilcox


> > > Cc: stable@kernle.org
> > > Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
> > > ---
> > > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > index efa0255..5778334 100644
> > > --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > @@ -1558,7 +1558,6 @@ mpt2sas_base_free_smid(struct MPT2SAS_ADAPTER *ioc, u16 smid)
> > >   * care of 32 bit environment where its not quarenteed to send the entire word
> > >   * in one transfer.
> > >   */
> > > -#ifndef writeq
> > 
> > Why not make this #ifndef CONFIG_64BIT?  You know that all 64 bit
> > systems have writeq implemented correctly; you suspect 32 bit systems
> > don't.
> > 
> > James
> > 
> > James, This issue was observed on PPC64 system. So what you have
> suggested will not solve this issue.
> > If we are sure that writeq() is atomic across all architecture, we
> can use it safely. As we have seen issue on ppc64, we are not
> confident to use
> > "writeq" call.
> 
> So have you told the powerpc people that they have a broken writeq?
> And why do you obfuscate your report by talking about i386 when it's
> really about powerpc64?

Well, our writeq isn't supposed to be broken :-)

It's defined as an std instruction (and "ld" for readq) so that's
perfectly atomic ... provided your access is aligned. Is it ?

Cheers,
Be


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
@ 2011-05-18  5:45         ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 46+ messages in thread
From: Benjamin Herrenschmidt @ 2011-05-18  5:45 UTC (permalink / raw)
  To: Desai, Kashyap
  Cc: Prakash, Sathya, linux-scsi, Matthew Wilcox, linuxppc-dev,
	James Bottomley, paulus, Moore, Eric


> > > Cc: stable@kernle.org
> > > Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
> > > ---
> > > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > index efa0255..5778334 100644
> > > --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > @@ -1558,7 +1558,6 @@ mpt2sas_base_free_smid(struct MPT2SAS_ADAPTER *ioc, u16 smid)
> > >   * care of 32 bit environment where its not quarenteed to send the entire word
> > >   * in one transfer.
> > >   */
> > > -#ifndef writeq
> > 
> > Why not make this #ifndef CONFIG_64BIT?  You know that all 64 bit
> > systems have writeq implemented correctly; you suspect 32 bit systems
> > don't.
> > 
> > James
> > 
> > James, This issue was observed on PPC64 system. So what you have
> suggested will not solve this issue.
> > If we are sure that writeq() is atomic across all architecture, we
> can use it safely. As we have seen issue on ppc64, we are not
> confident to use
> > "writeq" call.
> 
> So have you told the powerpc people that they have a broken writeq?
> And why do you obfuscate your report by talking about i386 when it's
> really about powerpc64?

Well, our writeq isn't supposed to be broken :-)

It's defined as an std instruction (and "ld" for readq) so that's
perfectly atomic ... provided your access is aligned. Is it ?

Cheers,
Be

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
  2011-05-18  4:23         ` James Bottomley
@ 2011-05-18  7:00           ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 46+ messages in thread
From: Benjamin Herrenschmidt @ 2011-05-18  7:00 UTC (permalink / raw)
  To: James Bottomley
  Cc: Matthew Wilcox, Desai, Kashyap, Moore, Eric, linux-scsi, Prakash,
	Sathya, paulus, linuxppc-dev, miltonm

(Just adding Milton to the CC list, he suspects races in the driver
instead).

On Wed, 2011-05-18 at 08:23 +0400, James Bottomley wrote:
> On Tue, 2011-05-17 at 22:15 -0600, Matthew Wilcox wrote:
> > On Wed, May 18, 2011 at 09:37:08AM +0530, Desai, Kashyap wrote:
> > > On Wed, 2011-05-04 at 17:23 +0530, Kashyap, Desai wrote:
> > > > The following code seems to be there in /usr/src/linux/arch/x86/include/asm/io.h.
> > > > This is not going to work.
> > > > 
> > > > static inline void writeq(__u64 val, volatile void __iomem *addr)
> > > > {
> > > >         writel(val, addr);
> > > >         writel(val >> 32, addr+4);
> > > > }
> > > > 
> > > > So with this code turned on in the kernel, there is going to be race condition 
> > > > where multiple cpus can be writing to the request descriptor at the same time.
> > > > 
> > > > Meaning this could happen:
> > > > (A) CPU A doest 32bit write
> > > > (B) CPU B does 32 bit write
> > > > (C) CPU A does 32 bit write
> > > > (D) CPU B does 32 bit write
> > > > 
> > > > We need the 64 bit completed in one access pci memory write, else spin lock is required.
> > > > Since it's going to be difficult to know which writeq was implemented in the kernel, 
> > > > the driver is going to have to always acquire a spin lock each time we do 64bit write.
> > > > 
> > > > Cc: stable@kernle.org
> > > > Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
> > > > ---
> > > > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > index efa0255..5778334 100644
> > > > --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > @@ -1558,7 +1558,6 @@ mpt2sas_base_free_smid(struct MPT2SAS_ADAPTER *ioc, u16 smid)
> > > >   * care of 32 bit environment where its not quarenteed to send the entire word
> > > >   * in one transfer.
> > > >   */
> > > > -#ifndef writeq
> > > 
> > > Why not make this #ifndef CONFIG_64BIT?  You know that all 64 bit
> > > systems have writeq implemented correctly; you suspect 32 bit systems
> > > don't.
> > > 
> > > James
> > > 
> > > James, This issue was observed on PPC64 system. So what you have suggested will not solve this issue.
> > > If we are sure that writeq() is atomic across all architecture, we can use it safely. As we have seen issue on ppc64, we are not confident to use
> > > "writeq" call.
> > 
> > So have you told the powerpc people that they have a broken writeq?
> 
> I'm just in the process of finding them now on IRC so I can demand an
> explanation: this is a really serious API problem because writeq is
> supposed to be atomic on 64 bit.
> 
> > And why do you obfuscate your report by talking about i386 when it's
> > really about powerpc64?
> 
> James
> 



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
@ 2011-05-18  7:00           ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 46+ messages in thread
From: Benjamin Herrenschmidt @ 2011-05-18  7:00 UTC (permalink / raw)
  To: James Bottomley
  Cc: Prakash, Sathya, Desai, Kashyap, linux-scsi, Matthew Wilcox,
	linuxppc-dev, miltonm, paulus, Moore, Eric

(Just adding Milton to the CC list, he suspects races in the driver
instead).

On Wed, 2011-05-18 at 08:23 +0400, James Bottomley wrote:
> On Tue, 2011-05-17 at 22:15 -0600, Matthew Wilcox wrote:
> > On Wed, May 18, 2011 at 09:37:08AM +0530, Desai, Kashyap wrote:
> > > On Wed, 2011-05-04 at 17:23 +0530, Kashyap, Desai wrote:
> > > > The following code seems to be there in /usr/src/linux/arch/x86/include/asm/io.h.
> > > > This is not going to work.
> > > > 
> > > > static inline void writeq(__u64 val, volatile void __iomem *addr)
> > > > {
> > > >         writel(val, addr);
> > > >         writel(val >> 32, addr+4);
> > > > }
> > > > 
> > > > So with this code turned on in the kernel, there is going to be race condition 
> > > > where multiple cpus can be writing to the request descriptor at the same time.
> > > > 
> > > > Meaning this could happen:
> > > > (A) CPU A doest 32bit write
> > > > (B) CPU B does 32 bit write
> > > > (C) CPU A does 32 bit write
> > > > (D) CPU B does 32 bit write
> > > > 
> > > > We need the 64 bit completed in one access pci memory write, else spin lock is required.
> > > > Since it's going to be difficult to know which writeq was implemented in the kernel, 
> > > > the driver is going to have to always acquire a spin lock each time we do 64bit write.
> > > > 
> > > > Cc: stable@kernle.org
> > > > Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
> > > > ---
> > > > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > index efa0255..5778334 100644
> > > > --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > @@ -1558,7 +1558,6 @@ mpt2sas_base_free_smid(struct MPT2SAS_ADAPTER *ioc, u16 smid)
> > > >   * care of 32 bit environment where its not quarenteed to send the entire word
> > > >   * in one transfer.
> > > >   */
> > > > -#ifndef writeq
> > > 
> > > Why not make this #ifndef CONFIG_64BIT?  You know that all 64 bit
> > > systems have writeq implemented correctly; you suspect 32 bit systems
> > > don't.
> > > 
> > > James
> > > 
> > > James, This issue was observed on PPC64 system. So what you have suggested will not solve this issue.
> > > If we are sure that writeq() is atomic across all architecture, we can use it safely. As we have seen issue on ppc64, we are not confident to use
> > > "writeq" call.
> > 
> > So have you told the powerpc people that they have a broken writeq?
> 
> I'm just in the process of finding them now on IRC so I can demand an
> explanation: this is a really serious API problem because writeq is
> supposed to be atomic on 64 bit.
> 
> > And why do you obfuscate your report by talking about i386 when it's
> > really about powerpc64?
> 
> James
> 

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq isnot atomic
  2011-05-18  4:23         ` James Bottomley
@ 2011-05-18  8:04           ` David Laight
  -1 siblings, 0 replies; 46+ messages in thread
From: David Laight @ 2011-05-18  8:04 UTC (permalink / raw)
  To: James Bottomley, Matthew Wilcox
  Cc: Prakash, Sathya, Desai,Kashyap, linux-scsi, linuxppc-dev, paulus,
	Moore,Eric

 

> > > > static inline void writeq(__u64 val, volatile void __iomem
*addr)
> > > > {
> > > >         writel(val, addr);
> > > >         writel(val >> 32, addr+4);
> > > > }
...
> > > > We need the 64 bit completed in one access pci memory write,
else spin lock is required.
> > > > Since it's going to be difficult to know which writeq was
implemented in the kernel, 
> > > > the driver is going to have to always acquire a spin lock each
time we do 64bit write.
...
> I'm just in the process of finding them now on IRC so I can demand an
> explanation: this is a really serious API problem because writeq is
> supposed to be atomic on 64 bit.

Most 32 bit systems don't have atomic 64bit writes.
I'd also have thought there would be code which wouldn't mind the
write being done as two cycles.

I'm not sure that some of the ppc soc systems are capable of
doing a 64bit data pci/pcie cycle except by dma.
So your driver is probably doomed to require a lock.

	David



^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq isnot atomic
@ 2011-05-18  8:04           ` David Laight
  0 siblings, 0 replies; 46+ messages in thread
From: David Laight @ 2011-05-18  8:04 UTC (permalink / raw)
  To: James Bottomley, Matthew Wilcox
  Cc: Prakash, Sathya, Desai, Kashyap, linux-scsi, linuxppc-dev,
	paulus, Moore, Eric

=20

> > > > static inline void writeq(__u64 val, volatile void __iomem
*addr)
> > > > {
> > > >         writel(val, addr);
> > > >         writel(val >> 32, addr+4);
> > > > }
...
> > > > We need the 64 bit completed in one access pci memory write,
else spin lock is required.
> > > > Since it's going to be difficult to know which writeq was
implemented in the kernel,=20
> > > > the driver is going to have to always acquire a spin lock each
time we do 64bit write.
...
> I'm just in the process of finding them now on IRC so I can demand an
> explanation: this is a really serious API problem because writeq is
> supposed to be atomic on 64 bit.

Most 32 bit systems don't have atomic 64bit writes.
I'd also have thought there would be code which wouldn't mind the
write being done as two cycles.

I'm not sure that some of the ppc soc systems are capable of
doing a 64bit data pci/pcie cycle except by dma.
So your driver is probably doomed to require a lock.

	David

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
  2011-05-18  7:00           ` Benjamin Herrenschmidt
@ 2011-05-18  8:23             ` Milton Miller
  -1 siblings, 0 replies; 46+ messages in thread
From: Milton Miller @ 2011-05-18  8:23 UTC (permalink / raw)
  To: Desai, Kashyap, Moore, Eric, Prakash, Sathya
  Cc: James Bottomley, Matthew Wilcox, Benjamin Herrenschmidt,
	linux-scsi, paulus, linuxppc-dev

On Wed, 18 May 2011 around 17:00:10 +1000, Benjamin Herrenschmidt wrote:
> (Just adding Milton to the CC list, he suspects races in the
> driver instead).
>
> On Wed, 2011-05-18 at 08:23 +0400, James Bottomley wrote:
> > On Tue, 2011-05-17 at 22:15 -0600, Matthew Wilcox wrote:
> > > On Wed, May 18, 2011 at 09:37:08AM +0530, Desai, Kashyap wrote:
> > > > On Wed, 2011-05-04 at 17:23 +0530, Kashyap, Desai wrote:
> > > > > The following code seems to be there in /usr/src/linux/arch/x86/include/asm/io.h.
> > > > > This is not going to work.
> > > > > 
> > > > > static inline void writeq(__u64 val, volatile void __iomem *addr)
> > > > > {
> > > > >         writel(val, addr);
> > > > >         writel(val >> 32, addr+4);
> > > > > }
> > > > > 
> > > > > So with this code turned on in the kernel, there is going to be race condition 
> > > > > where multiple cpus can be writing to the request descriptor at the same time.
> > > > > 
> > > > > Meaning this could happen:
> > > > > (A) CPU A doest 32bit write
> > > > > (B) CPU B does 32 bit write
> > > > > (C) CPU A does 32 bit write
> > > > > (D) CPU B does 32 bit write
> > > > > 
> > > > > We need the 64 bit completed in one access pci memory write, else spin lock is required.
> > > > > Since it's going to be difficult to know which writeq was implemented in the kernel, 
> > > > > the driver is going to have to always acquire a spin lock each time we do 64bit write.
> > > > > 
> > > > > Cc: stable@kernle.org
> > > > > Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
> > > > > ---
> > > > > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > index efa0255..5778334 100644
> > > > > --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > @@ -1558,7 +1558,6 @@ mpt2sas_base_free_smid(struct MPT2SAS_ADAPTER *ioc, u16 smid)
> > > > >   * care of 32 bit environment where its not quarenteed to send the entire word
> > > > >   * in one transfer.
> > > > >   */
> > > > > -#ifndef writeq
> > > > 
> > > > Why not make this #ifndef CONFIG_64BIT?  You know that all 64 bit
> > > > systems have writeq implemented correctly; you suspect 32 bit systems
> > > > don't.
> > > > 
> > > > James
> > > > 
> > > > James, This issue was observed on PPC64 system. So what you have suggested will not solve this issue.
> > > > If we are sure that writeq() is atomic across all architecture, we can use it safely. As we have seen issue on ppc64, we are not confident to use
> > > > "writeq" call.
> > > 
> > > So have you told the powerpc people that they have a broken writeq?
> > 
> > I'm just in the process of finding them now on IRC so I can demand an
> > explanation: this is a really serious API problem because writeq is
> > supposed to be atomic on 64 bit.
> > 
> > > And why do you obfuscate your report by talking about i386 when it's
> > > really about powerpc64?
> > 
> > James

I checked the assembly for my complied output and it ends up with
a single std (store doubleword aka 64 bits) instruction with offset
192 decimal (0xc0) from the base register obtained from the structure.

An aligned doubleword store is atomic on 64 bit powerpc.

So I would really like more details if you are blaming 64 bit
powerpc of a non-atomic store.

That said, the patch will affect the code by adding barriers.
Specifically, while powerpc has a sync before doing the store as part
of writeq, wrapping in a spinlock adds a sync before releasing the lock
whenever a writeq (or writex x=b,w,d,q) was issued inside the lock.

(sync orders all reads and all writes to both memory and devices from
that cpu).

But looking further at the code, I see such things as:

drivers/scsi/mpt2sas/mpt2sas_base.c  line 2944

        mpt2sas_base_put_smid_default(ioc, smid);
        init_completion(&ioc->base_cmds.done);
        timeleft = wait_for_completion_timeout(&ioc->base_cmds.done,

where mpt2sas_base_put_smid_default is a routine that has a call to
_base_writeq.  This will initiate io to the adapter, then initialize
the completion, then hope that the timeout is long enough to let the io
complete and be marked done but short enough to not be a problem when
the timeout occurs because we initialized the compeltion after the irq
came in.

The code then looks at a status flag, but there is no indication how
the access to that field is serialized between the interrupt handler
and the submission routine.  It may mostly work due to barriers in
the primitives but I don't see any statement of rules.

Also, while I see a few wmb before writel in _base_interrupt, I don't
see any rmb, which I would expect between establishing a element is
valid and reading other fields in that element.

So I'd really like to hear more about what your symptoms were and how
you determined writeq on 64 bit powerpc was not atomic.

milton

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
@ 2011-05-18  8:23             ` Milton Miller
  0 siblings, 0 replies; 46+ messages in thread
From: Milton Miller @ 2011-05-18  8:23 UTC (permalink / raw)
  To: Desai, Kashyap, Moore, Eric, Prakash, Sathya
  Cc: linux-scsi, Matthew Wilcox, James Bottomley, paulus, linuxppc-dev

On Wed, 18 May 2011 around 17:00:10 +1000, Benjamin Herrenschmidt wrote:
> (Just adding Milton to the CC list, he suspects races in the
> driver instead).
>
> On Wed, 2011-05-18 at 08:23 +0400, James Bottomley wrote:
> > On Tue, 2011-05-17 at 22:15 -0600, Matthew Wilcox wrote:
> > > On Wed, May 18, 2011 at 09:37:08AM +0530, Desai, Kashyap wrote:
> > > > On Wed, 2011-05-04 at 17:23 +0530, Kashyap, Desai wrote:
> > > > > The following code seems to be there in /usr/src/linux/arch/x86/include/asm/io.h.
> > > > > This is not going to work.
> > > > > 
> > > > > static inline void writeq(__u64 val, volatile void __iomem *addr)
> > > > > {
> > > > >         writel(val, addr);
> > > > >         writel(val >> 32, addr+4);
> > > > > }
> > > > > 
> > > > > So with this code turned on in the kernel, there is going to be race condition 
> > > > > where multiple cpus can be writing to the request descriptor at the same time.
> > > > > 
> > > > > Meaning this could happen:
> > > > > (A) CPU A doest 32bit write
> > > > > (B) CPU B does 32 bit write
> > > > > (C) CPU A does 32 bit write
> > > > > (D) CPU B does 32 bit write
> > > > > 
> > > > > We need the 64 bit completed in one access pci memory write, else spin lock is required.
> > > > > Since it's going to be difficult to know which writeq was implemented in the kernel, 
> > > > > the driver is going to have to always acquire a spin lock each time we do 64bit write.
> > > > > 
> > > > > Cc: stable@kernle.org
> > > > > Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
> > > > > ---
> > > > > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > index efa0255..5778334 100644
> > > > > --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > @@ -1558,7 +1558,6 @@ mpt2sas_base_free_smid(struct MPT2SAS_ADAPTER *ioc, u16 smid)
> > > > >   * care of 32 bit environment where its not quarenteed to send the entire word
> > > > >   * in one transfer.
> > > > >   */
> > > > > -#ifndef writeq
> > > > 
> > > > Why not make this #ifndef CONFIG_64BIT?  You know that all 64 bit
> > > > systems have writeq implemented correctly; you suspect 32 bit systems
> > > > don't.
> > > > 
> > > > James
> > > > 
> > > > James, This issue was observed on PPC64 system. So what you have suggested will not solve this issue.
> > > > If we are sure that writeq() is atomic across all architecture, we can use it safely. As we have seen issue on ppc64, we are not confident to use
> > > > "writeq" call.
> > > 
> > > So have you told the powerpc people that they have a broken writeq?
> > 
> > I'm just in the process of finding them now on IRC so I can demand an
> > explanation: this is a really serious API problem because writeq is
> > supposed to be atomic on 64 bit.
> > 
> > > And why do you obfuscate your report by talking about i386 when it's
> > > really about powerpc64?
> > 
> > James

I checked the assembly for my complied output and it ends up with
a single std (store doubleword aka 64 bits) instruction with offset
192 decimal (0xc0) from the base register obtained from the structure.

An aligned doubleword store is atomic on 64 bit powerpc.

So I would really like more details if you are blaming 64 bit
powerpc of a non-atomic store.

That said, the patch will affect the code by adding barriers.
Specifically, while powerpc has a sync before doing the store as part
of writeq, wrapping in a spinlock adds a sync before releasing the lock
whenever a writeq (or writex x=b,w,d,q) was issued inside the lock.

(sync orders all reads and all writes to both memory and devices from
that cpu).

But looking further at the code, I see such things as:

drivers/scsi/mpt2sas/mpt2sas_base.c  line 2944

        mpt2sas_base_put_smid_default(ioc, smid);
        init_completion(&ioc->base_cmds.done);
        timeleft = wait_for_completion_timeout(&ioc->base_cmds.done,

where mpt2sas_base_put_smid_default is a routine that has a call to
_base_writeq.  This will initiate io to the adapter, then initialize
the completion, then hope that the timeout is long enough to let the io
complete and be marked done but short enough to not be a problem when
the timeout occurs because we initialized the compeltion after the irq
came in.

The code then looks at a status flag, but there is no indication how
the access to that field is serialized between the interrupt handler
and the submission routine.  It may mostly work due to barriers in
the primitives but I don't see any statement of rules.

Also, while I see a few wmb before writel in _base_interrupt, I don't
see any rmb, which I would expect between establishing a element is
valid and reading other fields in that element.

So I'd really like to hear more about what your symptoms were and how
you determined writeq on 64 bit powerpc was not atomic.

milton

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
  2011-05-18  8:23             ` Milton Miller
@ 2011-05-18 15:35               ` Moore, Eric
  -1 siblings, 0 replies; 46+ messages in thread
From: Moore, Eric @ 2011-05-18 15:35 UTC (permalink / raw)
  To: Milton Miller, Desai, Kashyap, Prakash, Sathya
  Cc: James Bottomley, Matthew Wilcox, Benjamin Herrenschmidt,
	linux-scsi, paulus, linuxppc-dev

On Wednesday, May 18, 2011 2:24 AM, Milton Miller wrote:
> On Wed, 18 May 2011 around 17:00:10 +1000, Benjamin Herrenschmidt wrote:
> > (Just adding Milton to the CC list, he suspects races in the
> > driver instead).
> >
> > On Wed, 2011-05-18 at 08:23 +0400, James Bottomley wrote:
> > > On Tue, 2011-05-17 at 22:15 -0600, Matthew Wilcox wrote:
> > > > On Wed, May 18, 2011 at 09:37:08AM +0530, Desai, Kashyap wrote:
> > > > > On Wed, 2011-05-04 at 17:23 +0530, Kashyap, Desai wrote:
> > > > > > The following code seems to be there in
> /usr/src/linux/arch/x86/include/asm/io.h.
> > > > > > This is not going to work.
> > > > > >
> > > > > > static inline void writeq(__u64 val, volatile void __iomem *addr)
> > > > > > {
> > > > > >         writel(val, addr);
> > > > > >         writel(val >> 32, addr+4);
> > > > > > }
> > > > > >
> > > > > > So with this code turned on in the kernel, there is going to be
> race condition
> > > > > > where multiple cpus can be writing to the request descriptor at
> the same time.
> > > > > >
> > > > > > Meaning this could happen:
> > > > > > (A) CPU A doest 32bit write
> > > > > > (B) CPU B does 32 bit write
> > > > > > (C) CPU A does 32 bit write
> > > > > > (D) CPU B does 32 bit write
> > > > > >
> > > > > > We need the 64 bit completed in one access pci memory write, else
> spin lock is required.
> > > > > > Since it's going to be difficult to know which writeq was
> implemented in the kernel,
> > > > > > the driver is going to have to always acquire a spin lock each
> time we do 64bit write.
> > > > > >
> > > > > > Cc: stable@kernle.org
> > > > > > Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
> > > > > > ---
> > > > > > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c
> b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > > index efa0255..5778334 100644
> > > > > > --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > > @@ -1558,7 +1558,6 @@ mpt2sas_base_free_smid(struct
> MPT2SAS_ADAPTER *ioc, u16 smid)
> > > > > >   * care of 32 bit environment where its not quarenteed to send
> the entire word
> > > > > >   * in one transfer.
> > > > > >   */
> > > > > > -#ifndef writeq
> > > > >
> > > > > Why not make this #ifndef CONFIG_64BIT?  You know that all 64 bit
> > > > > systems have writeq implemented correctly; you suspect 32 bit
> systems
> > > > > don't.
> > > > >
> > > > > James
> > > > >
> > > > > James, This issue was observed on PPC64 system. So what you have
> suggested will not solve this issue.
> > > > > If we are sure that writeq() is atomic across all architecture, we
> can use it safely. As we have seen issue on ppc64, we are not confident to
> use
> > > > > "writeq" call.
> > > >
> > > > So have you told the powerpc people that they have a broken writeq?
> > >
> > > I'm just in the process of finding them now on IRC so I can demand an
> > > explanation: this is a really serious API problem because writeq is
> > > supposed to be atomic on 64 bit.
> > >
> > > > And why do you obfuscate your report by talking about i386 when it's
> > > > really about powerpc64?
> > >
> > > James
> 
> I checked the assembly for my complied output and it ends up with
> a single std (store doubleword aka 64 bits) instruction with offset
> 192 decimal (0xc0) from the base register obtained from the structure.
> 
> An aligned doubleword store is atomic on 64 bit powerpc.
> 
> So I would really like more details if you are blaming 64 bit
> powerpc of a non-atomic store.
> 
> That said, the patch will affect the code by adding barriers.
> Specifically, while powerpc has a sync before doing the store as part
> of writeq, wrapping in a spinlock adds a sync before releasing the lock
> whenever a writeq (or writex x=b,w,d,q) was issued inside the lock.
> 
> (sync orders all reads and all writes to both memory and devices from
> that cpu).
> 
> But looking further at the code, I see such things as:
> 
> drivers/scsi/mpt2sas/mpt2sas_base.c  line 2944
> 
>         mpt2sas_base_put_smid_default(ioc, smid);
>         init_completion(&ioc->base_cmds.done);
>         timeleft = wait_for_completion_timeout(&ioc->base_cmds.done,
> 
> where mpt2sas_base_put_smid_default is a routine that has a call to
> _base_writeq.  This will initiate io to the adapter, then initialize
> the completion, then hope that the timeout is long enough to let the io
> complete and be marked done but short enough to not be a problem when
> the timeout occurs because we initialized the compeltion after the irq
> came in.
> 
> The code then looks at a status flag, but there is no indication how
> the access to that field is serialized between the interrupt handler
> and the submission routine.  It may mostly work due to barriers in
> the primitives but I don't see any statement of rules.
> 
> Also, while I see a few wmb before writel in _base_interrupt, I don't
> see any rmb, which I would expect between establishing a element is
> valid and reading other fields in that element.
> 
> So I'd really like to hear more about what your symptoms were and how
> you determined writeq on 64 bit powerpc was not atomic.
> 
> Milton


I worked the original defect a couple months ago, and Kashyap is now getting around to posting my patch's.

This original defect has nothing to do with PPC64.  The original problem was only on x86.    It only became a problem on PPC64 when I tried to fix the original x86 issue by copying the writeq code from the linux headers, then it broke PPC64.   I doubt that broken patch was ever posted. Anyways, back to the original defect.  The reason it because a problem for x86 is because the kernel headers had a implementation of writeq in the arch/x86 headers, which means our internal implementation of writeq is not being used.  The writeq implementation in the kernel is total wrong for arch/x86 because it doesn't not have spin locks, and if two processor simultaneously doing two separate 32bit pci writes, then what is received by controller firmware is out of order.   This change occurs between Red Ha
 t RHEL5 and RHEL6.  In RHEL5, this writeq was not implemented in arch/x86 headers, and our driver internal implementation of write was used.

Eric




^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
@ 2011-05-18 15:35               ` Moore, Eric
  0 siblings, 0 replies; 46+ messages in thread
From: Moore, Eric @ 2011-05-18 15:35 UTC (permalink / raw)
  To: Milton Miller, Desai, Kashyap, Prakash, Sathya
  Cc: linux-scsi, Matthew Wilcox, James Bottomley, paulus, linuxppc-dev

On Wednesday, May 18, 2011 2:24 AM, Milton Miller wrote:
> On Wed, 18 May 2011 around 17:00:10 +1000, Benjamin Herrenschmidt wrote:
> > (Just adding Milton to the CC list, he suspects races in the
> > driver instead).
> >
> > On Wed, 2011-05-18 at 08:23 +0400, James Bottomley wrote:
> > > On Tue, 2011-05-17 at 22:15 -0600, Matthew Wilcox wrote:
> > > > On Wed, May 18, 2011 at 09:37:08AM +0530, Desai, Kashyap wrote:
> > > > > On Wed, 2011-05-04 at 17:23 +0530, Kashyap, Desai wrote:
> > > > > > The following code seems to be there in
> /usr/src/linux/arch/x86/include/asm/io.h.
> > > > > > This is not going to work.
> > > > > >
> > > > > > static inline void writeq(__u64 val, volatile void __iomem *add=
r)
> > > > > > {
> > > > > >         writel(val, addr);
> > > > > >         writel(val >> 32, addr+4);
> > > > > > }
> > > > > >
> > > > > > So with this code turned on in the kernel, there is going to be
> race condition
> > > > > > where multiple cpus can be writing to the request descriptor at
> the same time.
> > > > > >
> > > > > > Meaning this could happen:
> > > > > > (A) CPU A doest 32bit write
> > > > > > (B) CPU B does 32 bit write
> > > > > > (C) CPU A does 32 bit write
> > > > > > (D) CPU B does 32 bit write
> > > > > >
> > > > > > We need the 64 bit completed in one access pci memory write, el=
se
> spin lock is required.
> > > > > > Since it's going to be difficult to know which writeq was
> implemented in the kernel,
> > > > > > the driver is going to have to always acquire a spin lock each
> time we do 64bit write.
> > > > > >
> > > > > > Cc: stable@kernle.org
> > > > > > Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
> > > > > > ---
> > > > > > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c
> b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > > index efa0255..5778334 100644
> > > > > > --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > > @@ -1558,7 +1558,6 @@ mpt2sas_base_free_smid(struct
> MPT2SAS_ADAPTER *ioc, u16 smid)
> > > > > >   * care of 32 bit environment where its not quarenteed to send
> the entire word
> > > > > >   * in one transfer.
> > > > > >   */
> > > > > > -#ifndef writeq
> > > > >
> > > > > Why not make this #ifndef CONFIG_64BIT?  You know that all 64 bit
> > > > > systems have writeq implemented correctly; you suspect 32 bit
> systems
> > > > > don't.
> > > > >
> > > > > James
> > > > >
> > > > > James, This issue was observed on PPC64 system. So what you have
> suggested will not solve this issue.
> > > > > If we are sure that writeq() is atomic across all architecture, w=
e
> can use it safely. As we have seen issue on ppc64, we are not confident t=
o
> use
> > > > > "writeq" call.
> > > >
> > > > So have you told the powerpc people that they have a broken writeq?
> > >
> > > I'm just in the process of finding them now on IRC so I can demand an
> > > explanation: this is a really serious API problem because writeq is
> > > supposed to be atomic on 64 bit.
> > >
> > > > And why do you obfuscate your report by talking about i386 when it'=
s
> > > > really about powerpc64?
> > >
> > > James
>=20
> I checked the assembly for my complied output and it ends up with
> a single std (store doubleword aka 64 bits) instruction with offset
> 192 decimal (0xc0) from the base register obtained from the structure.
>=20
> An aligned doubleword store is atomic on 64 bit powerpc.
>=20
> So I would really like more details if you are blaming 64 bit
> powerpc of a non-atomic store.
>=20
> That said, the patch will affect the code by adding barriers.
> Specifically, while powerpc has a sync before doing the store as part
> of writeq, wrapping in a spinlock adds a sync before releasing the lock
> whenever a writeq (or writex x=3Db,w,d,q) was issued inside the lock.
>=20
> (sync orders all reads and all writes to both memory and devices from
> that cpu).
>=20
> But looking further at the code, I see such things as:
>=20
> drivers/scsi/mpt2sas/mpt2sas_base.c  line 2944
>=20
>         mpt2sas_base_put_smid_default(ioc, smid);
>         init_completion(&ioc->base_cmds.done);
>         timeleft =3D wait_for_completion_timeout(&ioc->base_cmds.done,
>=20
> where mpt2sas_base_put_smid_default is a routine that has a call to
> _base_writeq.  This will initiate io to the adapter, then initialize
> the completion, then hope that the timeout is long enough to let the io
> complete and be marked done but short enough to not be a problem when
> the timeout occurs because we initialized the compeltion after the irq
> came in.
>=20
> The code then looks at a status flag, but there is no indication how
> the access to that field is serialized between the interrupt handler
> and the submission routine.  It may mostly work due to barriers in
> the primitives but I don't see any statement of rules.
>=20
> Also, while I see a few wmb before writel in _base_interrupt, I don't
> see any rmb, which I would expect between establishing a element is
> valid and reading other fields in that element.
>=20
> So I'd really like to hear more about what your symptoms were and how
> you determined writeq on 64 bit powerpc was not atomic.
>=20
> Milton


I worked the original defect a couple months ago, and Kashyap is now gettin=
g around to posting my patch's.

This original defect has nothing to do with PPC64.  The original problem wa=
s only on x86.    It only became a problem on PPC64 when I tried to fix the=
 original x86 issue by copying the writeq code from the linux headers, then=
 it broke PPC64.   I doubt that broken patch was ever posted. Anyways, back=
 to the original defect.  The reason it because a problem for x86 is becaus=
e the kernel headers had a implementation of writeq in the arch/x86 headers=
, which means our internal implementation of writeq is not being used.  The=
 writeq implementation in the kernel is total wrong for arch/x86 because it=
 doesn't not have spin locks, and if two processor simultaneously doing two=
 separate 32bit pci writes, then what is received by controller firmware is=
 out of order.   This change occurs between Red Hat RHEL5 and RHEL6.  In RH=
EL5, this writeq was not implemented in arch/x86 headers, and our driver in=
ternal implementation of write was used.

Eric

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
  2011-05-18 15:35               ` Moore, Eric
  (?)
@ 2011-05-18 18:31                 ` Milton Miller
  -1 siblings, 0 replies; 46+ messages in thread
From: Milton Miller @ 2011-05-18 18:31 UTC (permalink / raw)
  To: Hitoshi Mitake, Sam Ravnborg, Ingo Molnar, Ingo Molnar, Desai,
	Kashyap, Prakash, Sathya
  Cc: James Bottomley, Matthew Wilcox, Benjamin Herrenschmidt,
	linux scsi dev, paulus, linux powerpc dev, linux pci,
	linux kernel, linux-arch

On Wed, 18 May 2011 about 09:35:56 -0600, Eric Moore wrote:
> On Wednesday, May 18, 2011 2:24 AM, Milton Miller wrote:
> > On Wed, 18 May 2011 around 17:00:10 +1000, Benjamin Herrenschmidt wrote:
> > > (Just adding Milton to the CC list, he suspects races in the
> > > driver instead).
> > >
> > > On Wed, 2011-05-18 at 08:23 +0400, James Bottomley wrote:
> > > > On Tue, 2011-05-17 at 22:15 -0600, Matthew Wilcox wrote:
> > > > > On Wed, May 18, 2011 at 09:37:08AM +0530, Desai, Kashyap wrote:
> > > > > > On Wed, 2011-05-04 at 17:23 +0530, Kashyap, Desai wrote:
> > > > > > > The following code seems to be there in
> > /usr/src/linux/arch/x86/include/asm/io.h.
> > > > > > > This is not going to work.
> > > > > > >
> > > > > > > static inline void writeq(__u64 val, volatile void __iomem *addr)
> > > > > > > {
> > > > > > >         writel(val, addr);
> > > > > > >         writel(val >> 32, addr+4);
> > > > > > > }
> > > > > > >
> > > > > > > So with this code turned on in the kernel, there is going to be
> > race condition
> > > > > > > where multiple cpus can be writing to the request descriptor at
> > the same time.
> > > > > > >
> > > > > > > Meaning this could happen:
> > > > > > > (A) CPU A doest 32bit write
> > > > > > > (B) CPU B does 32 bit write
> > > > > > > (C) CPU A does 32 bit write
> > > > > > > (D) CPU B does 32 bit write
> > > > > > >
> > > > > > > We need the 64 bit completed in one access pci memory write, else
> > spin lock is required.
> > > > > > > Since it's going to be difficult to know which writeq was
> > implemented in the kernel,
> > > > > > > the driver is going to have to always acquire a spin lock each
> > time we do 64bit write.
> > > > > > >
> > > > > > > Cc: stable@kernle.org
> > > > > > > Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
> > > > > > > ---
> > > > > > > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c
> > b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > > > index efa0255..5778334 100644
> > > > > > > --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > > > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > > > @@ -1558,7 +1558,6 @@ mpt2sas_base_free_smid(struct
> > MPT2SAS_ADAPTER *ioc, u16 smid)
> > > > > > >   * care of 32 bit environment where its not quarenteed to send
> > the entire word
> > > > > > >   * in one transfer.
> > > > > > >   */
> > > > > > > -#ifndef writeq
> > > > > >
> > > > > > Why not make this #ifndef CONFIG_64BIT?  You know that all 64 bit
> > > > > > systems have writeq implemented correctly; you suspect 32 bit
> > systems
> > > > > > don't.
> > > > > >
> > > > > > James
> > > > > >
> > > > > > James, This issue was observed on PPC64 system. So what you have
> > suggested will not solve this issue.
> > > > > > If we are sure that writeq() is atomic across all architecture, we
> > can use it safely. As we have seen issue on ppc64, we are not confident to
> > use
> > > > > > "writeq" call.
> > > > >
> > > > > So have you told the powerpc people that they have a broken writeq?
> > > >
> > > > I'm just in the process of finding them now on IRC so I can demand an
> > > > explanation: this is a really serious API problem because writeq is
> > > > supposed to be atomic on 64 bit.
> > > >
> > > > > And why do you obfuscate your report by talking about i386 when it's
> > > > > really about powerpc64?
> > > >
> > > > James
> >
> > I checked the assembly for my complied output and it ends up with
> > a single std (store doubleword aka 64 bits) instruction with offset
> > 192 decimal (0xc0) from the base register obtained from the structure.
> >
> > An aligned doubleword store is atomic on 64 bit powerpc.
> >
> > So I would really like more details if you are blaming 64 bit
> > powerpc of a non-atomic store.
> >
> > That said, the patch will affect the code by adding barriers.
> > Specifically, while powerpc has a sync before doing the store as part
> > of writeq, wrapping in a spinlock adds a sync before releasing the lock
> > whenever a writeq (or writex x=b,w,d,q) was issued inside the lock.
> >
> > (sync orders all reads and all writes to both memory and devices from
> > that cpu).
> >
> > But looking further at the code, I see such things as:
> >
> > drivers/scsi/mpt2sas/mpt2sas_base.c  line 2944
> >
> >         mpt2sas_base_put_smid_default(ioc, smid);
> >         init_completion(&ioc->base_cmds.done);
> >         timeleft = wait_for_completion_timeout(&ioc->base_cmds.done,
> >
> > where mpt2sas_base_put_smid_default is a routine that has a call to
> > _base_writeq.  This will initiate io to the adapter, then initialize
> > the completion, then hope that the timeout is long enough to let the io
> > complete and be marked done but short enough to not be a problem when
> > the timeout occurs because we initialized the compeltion after the irq
> > came in.
> >
> > The code then looks at a status flag, but there is no indication how
> > the access to that field is serialized between the interrupt handler
> > and the submission routine.  It may mostly work due to barriers in
> > the primitives but I don't see any statement of rules.
> >
> > Also, while I see a few wmb before writel in _base_interrupt, I don't
> > see any rmb, which I would expect between establishing a element is
> > valid and reading other fields in that element.
> >
> > So I'd really like to hear more about what your symptoms were and how
> > you determined writeq on 64 bit powerpc was not atomic.
> >
> > Milton
> 
> 
> I worked the original defect a couple months ago, and Kashyap is now
> getting around to posting my patch's.
> 
> This original defect has nothing to do with PPC64.  The original
> problem was only on x86.    It only became a problem on PPC64 when I
> tried to fix the original x86 issue by copying the writeq code from
> the linux headers, then it broke PPC64.   I doubt that broken patch
> was ever posted. Anyways, back to the original defect.  The reason
> it because a problem for x86 is because the kernel headers had a
> implementation of writeq in the arch/x86 headers, which means our
> internal implementation of writeq is not being used.  The writeq
> implementation in the kernel is total wrong for arch/x86 because it
> doesn't not have spin locks, and if two processor simultaneously doing
> two separate 32bit pci writes, then what is received by controller
> firmware is out of order.   This change occurs between Red Hat RHEL5
> and RHEL6.  In RHEL5, this writeq was not implemented in arch/x86
> headers, and our driver internal implementation of write was used.
> 
> Eric

So the real question should be why is x86-32 supplying a broken writeq
instead of letting drivers work out what to do it when needed?

It was added in 2.6.29 with out any pci or other io acks in
the change log.

Also the only reference I find to HAVE_WRITEQ and HAVE_READQ is the
x86 selects, so I think those can just be removed (vs move to x86_64).

The original changelog is:
    Impact: add new API for drivers
    
    Add implementation of readq/writeq to x86_32, and add config value to
    the x86 architecture to determine existence of readq/writeq.

Ingo I would propose the following commits added in 2.6.29 be reverted.
I think the current concensus is drivers must know if the writeq is
not atomic so they can provide their own locking or other workaround.

2c5643b1c5c7fbb13f340d4c58944d9642f41796
	x86: provide readq()/writeq() on 32-bit too
a0b1131e479e5af32eefac8bc54c9742e23d638e
	x86: provide readq()/writeq() on 32-bit too, cleanup
93093d099e5dd0c258fd530c12668e828c20df41
	x86: provide readq()/writeq() on 32-bit too, complete

milton

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
@ 2011-05-18 18:31                 ` Milton Miller
  0 siblings, 0 replies; 46+ messages in thread
From: Milton Miller @ 2011-05-18 18:31 UTC (permalink / raw)
  To: Hitoshi Mitake, Sam Ravnborg, Ingo Molnar, Ingo Molnar, Desai,
	Kashyap, Prakash
  Cc: James Bottomley, Matthew Wilcox, Benjamin Herrenschmidt,
	linux scsi dev, paulus, linux powerpc dev, linux pci,
	linux kernel, linux-arch

On Wed, 18 May 2011 about 09:35:56 -0600, Eric Moore wrote:
> On Wednesday, May 18, 2011 2:24 AM, Milton Miller wrote:
> > On Wed, 18 May 2011 around 17:00:10 +1000, Benjamin Herrenschmidt wrote:
> > > (Just adding Milton to the CC list, he suspects races in the
> > > driver instead).
> > >
> > > On Wed, 2011-05-18 at 08:23 +0400, James Bottomley wrote:
> > > > On Tue, 2011-05-17 at 22:15 -0600, Matthew Wilcox wrote:
> > > > > On Wed, May 18, 2011 at 09:37:08AM +0530, Desai, Kashyap wrote:
> > > > > > On Wed, 2011-05-04 at 17:23 +0530, Kashyap, Desai wrote:
> > > > > > > The following code seems to be there in
> > /usr/src/linux/arch/x86/include/asm/io.h.
> > > > > > > This is not going to work.
> > > > > > >
> > > > > > > static inline void writeq(__u64 val, volatile void __iomem *addr)
> > > > > > > {
> > > > > > >         writel(val, addr);
> > > > > > >         writel(val >> 32, addr+4);
> > > > > > > }
> > > > > > >
> > > > > > > So with this code turned on in the kernel, there is going to be
> > race condition
> > > > > > > where multiple cpus can be writing to the request descriptor at
> > the same time.
> > > > > > >
> > > > > > > Meaning this could happen:
> > > > > > > (A) CPU A doest 32bit write
> > > > > > > (B) CPU B does 32 bit write
> > > > > > > (C) CPU A does 32 bit write
> > > > > > > (D) CPU B does 32 bit write
> > > > > > >
> > > > > > > We need the 64 bit completed in one access pci memory write, else
> > spin lock is required.
> > > > > > > Since it's going to be difficult to know which writeq was
> > implemented in the kernel,
> > > > > > > the driver is going to have to always acquire a spin lock each
> > time we do 64bit write.
> > > > > > >
> > > > > > > Cc: stable@kernle.org
> > > > > > > Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
> > > > > > > ---
> > > > > > > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c
> > b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > > > index efa0255..5778334 100644
> > > > > > > --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > > > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > > > @@ -1558,7 +1558,6 @@ mpt2sas_base_free_smid(struct
> > MPT2SAS_ADAPTER *ioc, u16 smid)
> > > > > > >   * care of 32 bit environment where its not quarenteed to send
> > the entire word
> > > > > > >   * in one transfer.
> > > > > > >   */
> > > > > > > -#ifndef writeq
> > > > > >
> > > > > > Why not make this #ifndef CONFIG_64BIT?  You know that all 64 bit
> > > > > > systems have writeq implemented correctly; you suspect 32 bit
> > systems
> > > > > > don't.
> > > > > >
> > > > > > James
> > > > > >
> > > > > > James, This issue was observed on PPC64 system. So what you have
> > suggested will not solve this issue.
> > > > > > If we are sure that writeq() is atomic across all architecture, we
> > can use it safely. As we have seen issue on ppc64, we are not confident to
> > use
> > > > > > "writeq" call.
> > > > >
> > > > > So have you told the powerpc people that they have a broken writeq?
> > > >
> > > > I'm just in the process of finding them now on IRC so I can demand an
> > > > explanation: this is a really serious API problem because writeq is
> > > > supposed to be atomic on 64 bit.
> > > >
> > > > > And why do you obfuscate your report by talking about i386 when it's
> > > > > really about powerpc64?
> > > >
> > > > James
> >
> > I checked the assembly for my complied output and it ends up with
> > a single std (store doubleword aka 64 bits) instruction with offset
> > 192 decimal (0xc0) from the base register obtained from the structure.
> >
> > An aligned doubleword store is atomic on 64 bit powerpc.
> >
> > So I would really like more details if you are blaming 64 bit
> > powerpc of a non-atomic store.
> >
> > That said, the patch will affect the code by adding barriers.
> > Specifically, while powerpc has a sync before doing the store as part
> > of writeq, wrapping in a spinlock adds a sync before releasing the lock
> > whenever a writeq (or writex x=b,w,d,q) was issued inside the lock.
> >
> > (sync orders all reads and all writes to both memory and devices from
> > that cpu).
> >
> > But looking further at the code, I see such things as:
> >
> > drivers/scsi/mpt2sas/mpt2sas_base.c  line 2944
> >
> >         mpt2sas_base_put_smid_default(ioc, smid);
> >         init_completion(&ioc->base_cmds.done);
> >         timeleft = wait_for_completion_timeout(&ioc->base_cmds.done,
> >
> > where mpt2sas_base_put_smid_default is a routine that has a call to
> > _base_writeq.  This will initiate io to the adapter, then initialize
> > the completion, then hope that the timeout is long enough to let the io
> > complete and be marked done but short enough to not be a problem when
> > the timeout occurs because we initialized the compeltion after the irq
> > came in.
> >
> > The code then looks at a status flag, but there is no indication how
> > the access to that field is serialized between the interrupt handler
> > and the submission routine.  It may mostly work due to barriers in
> > the primitives but I don't see any statement of rules.
> >
> > Also, while I see a few wmb before writel in _base_interrupt, I don't
> > see any rmb, which I would expect between establishing a element is
> > valid and reading other fields in that element.
> >
> > So I'd really like to hear more about what your symptoms were and how
> > you determined writeq on 64 bit powerpc was not atomic.
> >
> > Milton
> 
> 
> I worked the original defect a couple months ago, and Kashyap is now
> getting around to posting my patch's.
> 
> This original defect has nothing to do with PPC64.  The original
> problem was only on x86.    It only became a problem on PPC64 when I
> tried to fix the original x86 issue by copying the writeq code from
> the linux headers, then it broke PPC64.   I doubt that broken patch
> was ever posted. Anyways, back to the original defect.  The reason
> it because a problem for x86 is because the kernel headers had a
> implementation of writeq in the arch/x86 headers, which means our
> internal implementation of writeq is not being used.  The writeq
> implementation in the kernel is total wrong for arch/x86 because it
> doesn't not have spin locks, and if two processor simultaneously doing
> two separate 32bit pci writes, then what is received by controller
> firmware is out of order.   This change occurs between Red Hat RHEL5
> and RHEL6.  In RHEL5, this writeq was not implemented in arch/x86
> headers, and our driver internal implementation of write was used.
> 
> Eric

So the real question should be why is x86-32 supplying a broken writeq
instead of letting drivers work out what to do it when needed?

It was added in 2.6.29 with out any pci or other io acks in
the change log.

Also the only reference I find to HAVE_WRITEQ and HAVE_READQ is the
x86 selects, so I think those can just be removed (vs move to x86_64).

The original changelog is:
    Impact: add new API for drivers
    
    Add implementation of readq/writeq to x86_32, and add config value to
    the x86 architecture to determine existence of readq/writeq.

Ingo I would propose the following commits added in 2.6.29 be reverted.
I think the current concensus is drivers must know if the writeq is
not atomic so they can provide their own locking or other workaround.

2c5643b1c5c7fbb13f340d4c58944d9642f41796
	x86: provide readq()/writeq() on 32-bit too
a0b1131e479e5af32eefac8bc54c9742e23d638e
	x86: provide readq()/writeq() on 32-bit too, cleanup
93093d099e5dd0c258fd530c12668e828c20df41
	x86: provide readq()/writeq() on 32-bit too, complete

milton

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
@ 2011-05-18 18:31                 ` Milton Miller
  0 siblings, 0 replies; 46+ messages in thread
From: Milton Miller @ 2011-05-18 18:31 UTC (permalink / raw)
  To: Hitoshi Mitake, Sam Ravnborg, Ingo Molnar, Ingo Molnar, Desai,
	Kashyap, Prakash, Sathya
  Cc: linux-arch, linux scsi dev, Matthew Wilcox, linux kernel,
	James Bottomley, paulus, linux pci, linux powerpc dev

On Wed, 18 May 2011 about 09:35:56 -0600, Eric Moore wrote:
> On Wednesday, May 18, 2011 2:24 AM, Milton Miller wrote:
> > On Wed, 18 May 2011 around 17:00:10 +1000, Benjamin Herrenschmidt wrote:
> > > (Just adding Milton to the CC list, he suspects races in the
> > > driver instead).
> > >
> > > On Wed, 2011-05-18 at 08:23 +0400, James Bottomley wrote:
> > > > On Tue, 2011-05-17 at 22:15 -0600, Matthew Wilcox wrote:
> > > > > On Wed, May 18, 2011 at 09:37:08AM +0530, Desai, Kashyap wrote:
> > > > > > On Wed, 2011-05-04 at 17:23 +0530, Kashyap, Desai wrote:
> > > > > > > The following code seems to be there in
> > /usr/src/linux/arch/x86/include/asm/io.h.
> > > > > > > This is not going to work.
> > > > > > >
> > > > > > > static inline void writeq(__u64 val, volatile void __iomem *addr)
> > > > > > > {
> > > > > > >         writel(val, addr);
> > > > > > >         writel(val >> 32, addr+4);
> > > > > > > }
> > > > > > >
> > > > > > > So with this code turned on in the kernel, there is going to be
> > race condition
> > > > > > > where multiple cpus can be writing to the request descriptor at
> > the same time.
> > > > > > >
> > > > > > > Meaning this could happen:
> > > > > > > (A) CPU A doest 32bit write
> > > > > > > (B) CPU B does 32 bit write
> > > > > > > (C) CPU A does 32 bit write
> > > > > > > (D) CPU B does 32 bit write
> > > > > > >
> > > > > > > We need the 64 bit completed in one access pci memory write, else
> > spin lock is required.
> > > > > > > Since it's going to be difficult to know which writeq was
> > implemented in the kernel,
> > > > > > > the driver is going to have to always acquire a spin lock each
> > time we do 64bit write.
> > > > > > >
> > > > > > > Cc: stable@kernle.org
> > > > > > > Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
> > > > > > > ---
> > > > > > > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c
> > b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > > > index efa0255..5778334 100644
> > > > > > > --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > > > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > > > @@ -1558,7 +1558,6 @@ mpt2sas_base_free_smid(struct
> > MPT2SAS_ADAPTER *ioc, u16 smid)
> > > > > > >   * care of 32 bit environment where its not quarenteed to send
> > the entire word
> > > > > > >   * in one transfer.
> > > > > > >   */
> > > > > > > -#ifndef writeq
> > > > > >
> > > > > > Why not make this #ifndef CONFIG_64BIT?  You know that all 64 bit
> > > > > > systems have writeq implemented correctly; you suspect 32 bit
> > systems
> > > > > > don't.
> > > > > >
> > > > > > James
> > > > > >
> > > > > > James, This issue was observed on PPC64 system. So what you have
> > suggested will not solve this issue.
> > > > > > If we are sure that writeq() is atomic across all architecture, we
> > can use it safely. As we have seen issue on ppc64, we are not confident to
> > use
> > > > > > "writeq" call.
> > > > >
> > > > > So have you told the powerpc people that they have a broken writeq?
> > > >
> > > > I'm just in the process of finding them now on IRC so I can demand an
> > > > explanation: this is a really serious API problem because writeq is
> > > > supposed to be atomic on 64 bit.
> > > >
> > > > > And why do you obfuscate your report by talking about i386 when it's
> > > > > really about powerpc64?
> > > >
> > > > James
> >
> > I checked the assembly for my complied output and it ends up with
> > a single std (store doubleword aka 64 bits) instruction with offset
> > 192 decimal (0xc0) from the base register obtained from the structure.
> >
> > An aligned doubleword store is atomic on 64 bit powerpc.
> >
> > So I would really like more details if you are blaming 64 bit
> > powerpc of a non-atomic store.
> >
> > That said, the patch will affect the code by adding barriers.
> > Specifically, while powerpc has a sync before doing the store as part
> > of writeq, wrapping in a spinlock adds a sync before releasing the lock
> > whenever a writeq (or writex x=b,w,d,q) was issued inside the lock.
> >
> > (sync orders all reads and all writes to both memory and devices from
> > that cpu).
> >
> > But looking further at the code, I see such things as:
> >
> > drivers/scsi/mpt2sas/mpt2sas_base.c  line 2944
> >
> >         mpt2sas_base_put_smid_default(ioc, smid);
> >         init_completion(&ioc->base_cmds.done);
> >         timeleft = wait_for_completion_timeout(&ioc->base_cmds.done,
> >
> > where mpt2sas_base_put_smid_default is a routine that has a call to
> > _base_writeq.  This will initiate io to the adapter, then initialize
> > the completion, then hope that the timeout is long enough to let the io
> > complete and be marked done but short enough to not be a problem when
> > the timeout occurs because we initialized the compeltion after the irq
> > came in.
> >
> > The code then looks at a status flag, but there is no indication how
> > the access to that field is serialized between the interrupt handler
> > and the submission routine.  It may mostly work due to barriers in
> > the primitives but I don't see any statement of rules.
> >
> > Also, while I see a few wmb before writel in _base_interrupt, I don't
> > see any rmb, which I would expect between establishing a element is
> > valid and reading other fields in that element.
> >
> > So I'd really like to hear more about what your symptoms were and how
> > you determined writeq on 64 bit powerpc was not atomic.
> >
> > Milton
> 
> 
> I worked the original defect a couple months ago, and Kashyap is now
> getting around to posting my patch's.
> 
> This original defect has nothing to do with PPC64.  The original
> problem was only on x86.    It only became a problem on PPC64 when I
> tried to fix the original x86 issue by copying the writeq code from
> the linux headers, then it broke PPC64.   I doubt that broken patch
> was ever posted. Anyways, back to the original defect.  The reason
> it because a problem for x86 is because the kernel headers had a
> implementation of writeq in the arch/x86 headers, which means our
> internal implementation of writeq is not being used.  The writeq
> implementation in the kernel is total wrong for arch/x86 because it
> doesn't not have spin locks, and if two processor simultaneously doing
> two separate 32bit pci writes, then what is received by controller
> firmware is out of order.   This change occurs between Red Hat RHEL5
> and RHEL6.  In RHEL5, this writeq was not implemented in arch/x86
> headers, and our driver internal implementation of write was used.
> 
> Eric

So the real question should be why is x86-32 supplying a broken writeq
instead of letting drivers work out what to do it when needed?

It was added in 2.6.29 with out any pci or other io acks in
the change log.

Also the only reference I find to HAVE_WRITEQ and HAVE_READQ is the
x86 selects, so I think those can just be removed (vs move to x86_64).

The original changelog is:
    Impact: add new API for drivers
    
    Add implementation of readq/writeq to x86_32, and add config value to
    the x86 architecture to determine existence of readq/writeq.

Ingo I would propose the following commits added in 2.6.29 be reverted.
I think the current concensus is drivers must know if the writeq is
not atomic so they can provide their own locking or other workaround.

2c5643b1c5c7fbb13f340d4c58944d9642f41796
	x86: provide readq()/writeq() on 32-bit too
a0b1131e479e5af32eefac8bc54c9742e23d638e
	x86: provide readq()/writeq() on 32-bit too, cleanup
93093d099e5dd0c258fd530c12668e828c20df41
	x86: provide readq()/writeq() on 32-bit too, complete

milton

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
  2011-05-18 18:31                 ` Milton Miller
  (?)
@ 2011-05-18 19:11                   ` Moore, Eric
  -1 siblings, 0 replies; 46+ messages in thread
From: Moore, Eric @ 2011-05-18 19:11 UTC (permalink / raw)
  To: Milton Miller, Hitoshi Mitake, Sam Ravnborg, Ingo Molnar,
	Ingo Molnar, Desai, Kashyap, Prakash, Sathya
  Cc: James Bottomley, Matthew Wilcox, Benjamin Herrenschmidt,
	linux scsi dev, paulus, linux powerpc dev, linux pci,
	linux kernel, linux-arch

On Wednesday, May 18, 2011 12:31 PM Milton Miller wrote:
> Ingo I would propose the following commits added in 2.6.29 be reverted.
> I think the current concensus is drivers must know if the writeq is
> not atomic so they can provide their own locking or other workaround.
> 


Exactly.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
@ 2011-05-18 19:11                   ` Moore, Eric
  0 siblings, 0 replies; 46+ messages in thread
From: Moore, Eric @ 2011-05-18 19:11 UTC (permalink / raw)
  To: Milton Miller, Hitoshi Mitake, Sam Ravnborg, Ingo Molnar,
	Ingo Molnar, Desai, Kashyap
  Cc: James Bottomley, Matthew Wilcox, Benjamin Herrenschmidt,
	linux scsi dev, paulus, linux powerpc dev, linux pci,
	linux kernel, linux-arch

On Wednesday, May 18, 2011 12:31 PM Milton Miller wrote:
> Ingo I would propose the following commits added in 2.6.29 be reverted.
> I think the current concensus is drivers must know if the writeq is
> not atomic so they can provide their own locking or other workaround.
> 


Exactly.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
@ 2011-05-18 19:11                   ` Moore, Eric
  0 siblings, 0 replies; 46+ messages in thread
From: Moore, Eric @ 2011-05-18 19:11 UTC (permalink / raw)
  To: Milton Miller, Hitoshi Mitake, Sam Ravnborg, Ingo Molnar,
	Ingo Molnar, Desai, Kashyap, Prakash, Sathya
  Cc: linux-arch, linux scsi dev, Matthew Wilcox, linux kernel,
	James Bottomley, paulus, linux pci, linux powerpc dev

On Wednesday, May 18, 2011 12:31 PM Milton Miller wrote:
> Ingo I would propose the following commits added in 2.6.29 be reverted.
> I think the current concensus is drivers must know if the writeq is
> not atomic so they can provide their own locking or other workaround.
>=20


Exactly.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
  2011-05-18 15:35               ` Moore, Eric
  (?)
  (?)
@ 2011-05-18 21:30               ` Benjamin Herrenschmidt
  2011-05-18 22:05                   ` Moore, Eric
  -1 siblings, 1 reply; 46+ messages in thread
From: Benjamin Herrenschmidt @ 2011-05-18 21:30 UTC (permalink / raw)
  To: Moore, Eric
  Cc: Prakash, Sathya, Desai, Kashyap, linux-scsi, Matthew Wilcox,
	Milton Miller, James Bottomley, paulus, linuxppc-dev

On Wed, 2011-05-18 at 09:35 -0600, Moore, Eric wrote:
> I worked the original defect a couple months ago, and Kashyap is now
> getting around to posting my patch's.
> 
> This original defect has nothing to do with PPC64.  The original
> problem was only on x86.    It only became a problem on PPC64 when I
> tried to fix the original x86 issue by copying the writeq code from
> the linux headers, then it broke PPC64.   I doubt that broken patch
> was ever posted. Anyways, back to the original defect.  The reason it
> because a problem for x86 is because the kernel headers had a
> implementation of writeq in the arch/x86 headers, which means our
> internal implementation of writeq is not being used.  The writeq
> implementation in the kernel is total wrong for arch/x86 because it
> doesn't not have spin locks, and if two processor simultaneously doing
> two separate 32bit pci writes, then what is received by controller
> firmware is out of order.   This change occurs between Red Hat RHEL5
> and RHEL6.  In RHEL5, this writeq was not implemented in arch/x86
> headers, and our driver internal implementation of write was used.

You may also want to look at Milton's comments, it looks like the way
you do init_completion followed immediately by wait_completion is racy.

You should init the completion before you do the IO that will eventually
trigger complete() to be called.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
  2011-05-18 21:30               ` [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic Benjamin Herrenschmidt
@ 2011-05-18 22:05                   ` Moore, Eric
  0 siblings, 0 replies; 46+ messages in thread
From: Moore, Eric @ 2011-05-18 22:05 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Milton Miller, Desai, Kashyap, Prakash, Sathya, James Bottomley,
	Matthew Wilcox, linux-scsi, paulus, linuxppc-dev

On Wednesday, May 18, 2011 3:30 PM, Benjamin Herrenschmidt wrote:
> 
> You may also want to look at Milton's comments, it looks like the way
> you do init_completion followed immediately by wait_completion is racy.
> 
> You should init the completion before you do the IO that will eventually
> trigger complete() to be called.
> 

I agree.  The init_completion needs to be done prior to posting the smid.  I'm not sure why I did it that way.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
@ 2011-05-18 22:05                   ` Moore, Eric
  0 siblings, 0 replies; 46+ messages in thread
From: Moore, Eric @ 2011-05-18 22:05 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Prakash, Sathya, Desai, Kashyap, linux-scsi, Matthew Wilcox,
	Milton Miller, James Bottomley, paulus, linuxppc-dev

T24gV2VkbmVzZGF5LCBNYXkgMTgsIDIwMTEgMzozMCBQTSwgQmVuamFtaW4gSGVycmVuc2NobWlk
dCB3cm90ZToNCj4gDQo+IFlvdSBtYXkgYWxzbyB3YW50IHRvIGxvb2sgYXQgTWlsdG9uJ3MgY29t
bWVudHMsIGl0IGxvb2tzIGxpa2UgdGhlIHdheQ0KPiB5b3UgZG8gaW5pdF9jb21wbGV0aW9uIGZv
bGxvd2VkIGltbWVkaWF0ZWx5IGJ5IHdhaXRfY29tcGxldGlvbiBpcyByYWN5Lg0KPiANCj4gWW91
IHNob3VsZCBpbml0IHRoZSBjb21wbGV0aW9uIGJlZm9yZSB5b3UgZG8gdGhlIElPIHRoYXQgd2ls
bCBldmVudHVhbGx5DQo+IHRyaWdnZXIgY29tcGxldGUoKSB0byBiZSBjYWxsZWQuDQo+IA0KDQpJ
IGFncmVlLiAgVGhlIGluaXRfY29tcGxldGlvbiBuZWVkcyB0byBiZSBkb25lIHByaW9yIHRvIHBv
c3RpbmcgdGhlIHNtaWQuICBJJ20gbm90IHN1cmUgd2h5IEkgZGlkIGl0IHRoYXQgd2F5Lg0K

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
  2011-05-18 19:11                   ` Moore, Eric
@ 2011-05-19  4:08                     ` Hitoshi Mitake
  -1 siblings, 0 replies; 46+ messages in thread
From: Hitoshi Mitake @ 2011-05-19  4:08 UTC (permalink / raw)
  To: Moore, Eric
  Cc: Milton Miller, Sam Ravnborg, Ingo Molnar, Ingo Molnar, Desai,
	Kashyap, Prakash, Sathya, James Bottomley, Matthew Wilcox,
	Benjamin Herrenschmidt, linux scsi dev, paulus,
	linux powerpc dev, linux pci, linux kernel, linux-arch

On Thu, May 19, 2011 at 04:11, Moore, Eric <Eric.Moore@lsi.com> wrote:
> On Wednesday, May 18, 2011 12:31 PM Milton Miller wrote:
>> Ingo I would propose the following commits added in 2.6.29 be reverted.
>> I think the current concensus is drivers must know if the writeq is
>> not atomic so they can provide their own locking or other workaround.
>>
>
>
> Exactly.
>

The original motivation of preparing common readq/writeq is that
letting each driver
have their own readq/writeq is bad for maintenance of source code.

But if you really dislike them, there might be two solutions:

1. changing the name of readq/writeq to readq_nonatomic/writeq_nonatomic
2. adding new C file to somewhere and defining spinlock for them.
    With spin_lock_irqsave() and spin_unlock_irqrestore() on the spinlock,
    readq/writeq can be atomic.

How do you think about them? If you cannot agree with the above two solutions,
I'll agree with reverting them.

-- 
Hitoshi Mitake
h.mitake@gmail.com

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
@ 2011-05-19  4:08                     ` Hitoshi Mitake
  0 siblings, 0 replies; 46+ messages in thread
From: Hitoshi Mitake @ 2011-05-19  4:08 UTC (permalink / raw)
  To: Moore, Eric
  Cc: linux-arch, Prakash, Sathya, Desai, Kashyap, linux scsi dev,
	Matthew Wilcox, linux powerpc dev, Milton Miller, linux kernel,
	James Bottomley, Ingo Molnar, paulus, linux pci, Ingo Molnar,
	Sam Ravnborg

On Thu, May 19, 2011 at 04:11, Moore, Eric <Eric.Moore@lsi.com> wrote:
> On Wednesday, May 18, 2011 12:31 PM Milton Miller wrote:
>> Ingo I would propose the following commits added in 2.6.29 be reverted.
>> I think the current concensus is drivers must know if the writeq is
>> not atomic so they can provide their own locking or other workaround.
>>
>
>
> Exactly.
>

The original motivation of preparing common readq/writeq is that
letting each driver
have their own readq/writeq is bad for maintenance of source code.

But if you really dislike them, there might be two solutions:

1. changing the name of readq/writeq to readq_nonatomic/writeq_nonatomic
2. adding new C file to somewhere and defining spinlock for them.
    With spin_lock_irqsave() and spin_unlock_irqrestore() on the spinlock,
    readq/writeq can be atomic.

How do you think about them? If you cannot agree with the above two solutions,
I'll agree with reverting them.

-- 
Hitoshi Mitake
h.mitake@gmail.com

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
  2011-05-18 18:31                 ` Milton Miller
@ 2011-05-19  4:16                   ` Roland Dreier
  -1 siblings, 0 replies; 46+ messages in thread
From: Roland Dreier @ 2011-05-19  4:16 UTC (permalink / raw)
  To: Milton Miller
  Cc: Hitoshi Mitake, Sam Ravnborg, Ingo Molnar, Ingo Molnar, Desai,
	Kashyap, Prakash, Sathya, James Bottomley, Matthew Wilcox,
	Benjamin Herrenschmidt, linux scsi dev, paulus,
	linux powerpc dev, linux pci, linux kernel, linux-arch

On Wed, May 18, 2011 at 11:31 AM, Milton Miller <miltonm@bga.com> wrote:
> So the real question should be why is x86-32 supplying a broken writeq
> instead of letting drivers work out what to do it when needed?

Sounds a lot like what I was asking a couple of years ago :)
http://lkml.org/lkml/2009/4/19/164

But Ingo insisted that non-atomic writeq would be fine:
http://lkml.org/lkml/2009/4/19/167

 - R.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
@ 2011-05-19  4:16                   ` Roland Dreier
  0 siblings, 0 replies; 46+ messages in thread
From: Roland Dreier @ 2011-05-19  4:16 UTC (permalink / raw)
  To: Milton Miller
  Cc: linux-arch, Prakash, Sathya, Desai, Kashyap, linux scsi dev,
	Matthew Wilcox, Hitoshi Mitake, linux pci, linux powerpc dev,
	linux kernel, James Bottomley, Ingo Molnar, paulus, Ingo Molnar,
	Sam Ravnborg

On Wed, May 18, 2011 at 11:31 AM, Milton Miller <miltonm@bga.com> wrote:
> So the real question should be why is x86-32 supplying a broken writeq
> instead of letting drivers work out what to do it when needed?

Sounds a lot like what I was asking a couple of years ago :)
http://lkml.org/lkml/2009/4/19/164

But Ingo insisted that non-atomic writeq would be fine:
http://lkml.org/lkml/2009/4/19/167

 - R.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
  2011-05-19  4:08                     ` Hitoshi Mitake
@ 2011-05-19  4:46                       ` James Bottomley
  -1 siblings, 0 replies; 46+ messages in thread
From: James Bottomley @ 2011-05-19  4:46 UTC (permalink / raw)
  To: Hitoshi Mitake
  Cc: Moore, Eric, Milton Miller, Sam Ravnborg, Ingo Molnar,
	Ingo Molnar, Desai, Kashyap, Prakash, Sathya, Matthew Wilcox,
	Benjamin Herrenschmidt, linux scsi dev, paulus,
	linux powerpc dev, linux pci, linux kernel, linux-arch,
	Roland Dreier

On Thu, 2011-05-19 at 13:08 +0900, Hitoshi Mitake wrote:
> On Thu, May 19, 2011 at 04:11, Moore, Eric <Eric.Moore@lsi.com> wrote:
> > On Wednesday, May 18, 2011 12:31 PM Milton Miller wrote:
> >> Ingo I would propose the following commits added in 2.6.29 be reverted.
> >> I think the current concensus is drivers must know if the writeq is
> >> not atomic so they can provide their own locking or other workaround.
> >>
> >
> >
> > Exactly.
> >
> 
> The original motivation of preparing common readq/writeq is that
> letting each driver
> have their own readq/writeq is bad for maintenance of source code.
> 
> But if you really dislike them, there might be two solutions:
> 
> 1. changing the name of readq/writeq to readq_nonatomic/writeq_nonatomic

This is fine, but not really very useful

> 2. adding new C file to somewhere and defining spinlock for them.
>     With spin_lock_irqsave() and spin_unlock_irqrestore() on the spinlock,
>     readq/writeq can be atomic.

This can't really be done generically.  There are several considerations
to do with hardware requirements.  I can see some hw requiring a
specific write order (I think this applies more to read order, though).

The specific mpt2sas problem is that if we write a 64 bit register non
atomically, we can't allow any interleaving writes for any other region
on the chip, otherwise the HW will take the write as complete in the 64
bit register and latch the wrong value.  The only way to achieve that
given the semantics of writeq is a global static spinlock.

> How do you think about them? If you cannot agree with the above two
> solutions, I'll agree with reverting them.

Having x86 roll its own never made any sense, so I think they need
reverting anyway.  This is a driver/platform bus problem not an
architecture problem.  The assumption we can make is that the platform
CPU can write atomically at its chip width.  We *may* be able to make
the assumption that the bus controller can translate an atomic chip
width transaction to a single atomic bus transaction; I think that
assumption holds true for at least PCI and on the parisc legacy busses,
so if we can agree on semantics, this should be a global define
somewhere.  If there are problems with the bus assumption, we'll likely
need some type of opt-in (or just not bother).

James

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
@ 2011-05-19  4:46                       ` James Bottomley
  0 siblings, 0 replies; 46+ messages in thread
From: James Bottomley @ 2011-05-19  4:46 UTC (permalink / raw)
  To: Hitoshi Mitake
  Cc: linux-arch, Prakash, Sathya, Roland Dreier, Desai, Kashyap,
	linux scsi dev, Matthew Wilcox, Moore, Eric, linux pci,
	linux powerpc dev, Milton Miller, linux kernel, Ingo Molnar,
	paulus, Ingo Molnar, Sam Ravnborg

On Thu, 2011-05-19 at 13:08 +0900, Hitoshi Mitake wrote:
> On Thu, May 19, 2011 at 04:11, Moore, Eric <Eric.Moore@lsi.com> wrote:
> > On Wednesday, May 18, 2011 12:31 PM Milton Miller wrote:
> >> Ingo I would propose the following commits added in 2.6.29 be reverted.
> >> I think the current concensus is drivers must know if the writeq is
> >> not atomic so they can provide their own locking or other workaround.
> >>
> >
> >
> > Exactly.
> >
> 
> The original motivation of preparing common readq/writeq is that
> letting each driver
> have their own readq/writeq is bad for maintenance of source code.
> 
> But if you really dislike them, there might be two solutions:
> 
> 1. changing the name of readq/writeq to readq_nonatomic/writeq_nonatomic

This is fine, but not really very useful

> 2. adding new C file to somewhere and defining spinlock for them.
>     With spin_lock_irqsave() and spin_unlock_irqrestore() on the spinlock,
>     readq/writeq can be atomic.

This can't really be done generically.  There are several considerations
to do with hardware requirements.  I can see some hw requiring a
specific write order (I think this applies more to read order, though).

The specific mpt2sas problem is that if we write a 64 bit register non
atomically, we can't allow any interleaving writes for any other region
on the chip, otherwise the HW will take the write as complete in the 64
bit register and latch the wrong value.  The only way to achieve that
given the semantics of writeq is a global static spinlock.

> How do you think about them? If you cannot agree with the above two
> solutions, I'll agree with reverting them.

Having x86 roll its own never made any sense, so I think they need
reverting anyway.  This is a driver/platform bus problem not an
architecture problem.  The assumption we can make is that the platform
CPU can write atomically at its chip width.  We *may* be able to make
the assumption that the bus controller can translate an atomic chip
width transaction to a single atomic bus transaction; I think that
assumption holds true for at least PCI and on the parisc legacy busses,
so if we can agree on semantics, this should be a global define
somewhere.  If there are problems with the bus assumption, we'll likely
need some type of opt-in (or just not bother).

James

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
  2011-05-19  4:16                   ` Roland Dreier
@ 2011-05-19  5:34                     ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 46+ messages in thread
From: Benjamin Herrenschmidt @ 2011-05-19  5:34 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Milton Miller, Hitoshi Mitake, Sam Ravnborg, Ingo Molnar,
	Ingo Molnar, Desai, Kashyap, Prakash, Sathya, James Bottomley,
	Matthew Wilcox, linux scsi dev, paulus, linux powerpc dev,
	linux pci, linux kernel, linux-arch

On Wed, 2011-05-18 at 21:16 -0700, Roland Dreier wrote:
> On Wed, May 18, 2011 at 11:31 AM, Milton Miller <miltonm@bga.com> wrote:
> > So the real question should be why is x86-32 supplying a broken writeq
> > instead of letting drivers work out what to do it when needed?
> 
> Sounds a lot like what I was asking a couple of years ago :)
> http://lkml.org/lkml/2009/4/19/164
> 
> But Ingo insisted that non-atomic writeq would be fine:
> http://lkml.org/lkml/2009/4/19/167

Yuck... Ingo, I think that was very wrong.

Those are for MMIO, which must almost ALWAYS know precisely what the
resulting access size is going to be. It's not even about atomicity
between multiple CPUs. I have seen plenty of HW for which a 64-bit
access to a register is -not- equivalent to two 32-bit ones. In fact, in
some case, you can get the side effects twice ... or none at all.

The only case where you can be lax is when you explicitely know that
there is no side effects -and- the HW cope with different access sizes.
This is not the general case and drivers need at the very least a way to
know what the behaviour will be.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
@ 2011-05-19  5:34                     ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 46+ messages in thread
From: Benjamin Herrenschmidt @ 2011-05-19  5:34 UTC (permalink / raw)
  To: Roland Dreier
  Cc: linux-arch, Prakash, Sathya, Desai, Kashyap, linux scsi dev,
	Matthew Wilcox, Hitoshi Mitake, linux powerpc dev, Milton Miller,
	linux kernel, James Bottomley, Ingo Molnar, paulus, linux pci,
	Ingo Molnar, Sam Ravnborg

On Wed, 2011-05-18 at 21:16 -0700, Roland Dreier wrote:
> On Wed, May 18, 2011 at 11:31 AM, Milton Miller <miltonm@bga.com> wrote:
> > So the real question should be why is x86-32 supplying a broken writeq
> > instead of letting drivers work out what to do it when needed?
> 
> Sounds a lot like what I was asking a couple of years ago :)
> http://lkml.org/lkml/2009/4/19/164
> 
> But Ingo insisted that non-atomic writeq would be fine:
> http://lkml.org/lkml/2009/4/19/167

Yuck... Ingo, I think that was very wrong.

Those are for MMIO, which must almost ALWAYS know precisely what the
resulting access size is going to be. It's not even about atomicity
between multiple CPUs. I have seen plenty of HW for which a 64-bit
access to a register is -not- equivalent to two 32-bit ones. In fact, in
some case, you can get the side effects twice ... or none at all.

The only case where you can be lax is when you explicitely know that
there is no side effects -and- the HW cope with different access sizes.
This is not the general case and drivers need at the very least a way to
know what the behaviour will be.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
  2011-05-19  4:46                       ` James Bottomley
@ 2011-05-19  5:36                         ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 46+ messages in thread
From: Benjamin Herrenschmidt @ 2011-05-19  5:36 UTC (permalink / raw)
  To: James Bottomley
  Cc: Hitoshi Mitake, Moore, Eric, Milton Miller, Sam Ravnborg,
	Ingo Molnar, Ingo Molnar, Desai, Kashyap, Prakash, Sathya,
	Matthew Wilcox, linux scsi dev, paulus, linux powerpc dev,
	linux pci, linux kernel, linux-arch, Roland Dreier

On Thu, 2011-05-19 at 08:46 +0400, James Bottomley wrote:
> This can't really be done generically.  There are several considerations
> to do with hardware requirements.  I can see some hw requiring a
> specific write order (I think this applies more to read order, though).

Right. Or there can be a need for a completely different access pattern
to do 32-bit, or maybe write only one half because both might have a
side effect etc etc etc ...

Also a global lock would be suboptimal vs. a per device lock burried in
the driver.

> The specific mpt2sas problem is that if we write a 64 bit register non
> atomically, we can't allow any interleaving writes for any other region
> on the chip, otherwise the HW will take the write as complete in the 64
> bit register and latch the wrong value.  The only way to achieve that
> given the semantics of writeq is a global static spinlock.
> 
> > How do you think about them? If you cannot agree with the above two
> > solutions, I'll agree with reverting them.
> 
> Having x86 roll its own never made any sense, so I think they need
> reverting anyway. 

Agreed.

>  This is a driver/platform bus problem not an
> architecture problem.  The assumption we can make is that the platform
> CPU can write atomically at its chip width.  We *may* be able to make
> the assumption that the bus controller can translate an atomic chip
> width transaction to a single atomic bus transaction; I think that
> assumption holds true for at least PCI and on the parisc legacy busses,
> so if we can agree on semantics, this should be a global define
> somewhere.  If there are problems with the bus assumption, we'll likely
> need some type of opt-in (or just not bother).

And we want a well defined #ifdef drivers test to know whether there's a
writeq/readq (just #define writeq/readq itself is fine even if it's an
inline function, we do that elsewhere) so they can have a fallback
scenario.

This is important as these can be used in very performance critical code
path.

Cheers,
Ben.


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
@ 2011-05-19  5:36                         ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 46+ messages in thread
From: Benjamin Herrenschmidt @ 2011-05-19  5:36 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-arch, Prakash, Sathya, Roland Dreier, Desai, Kashyap,
	Hitoshi Mitake, Matthew Wilcox, Moore, Eric, linux pci,
	linux powerpc dev, Milton Miller, linux kernel, Ingo Molnar,
	paulus, linux scsi dev, Ingo Molnar, Sam Ravnborg

On Thu, 2011-05-19 at 08:46 +0400, James Bottomley wrote:
> This can't really be done generically.  There are several considerations
> to do with hardware requirements.  I can see some hw requiring a
> specific write order (I think this applies more to read order, though).

Right. Or there can be a need for a completely different access pattern
to do 32-bit, or maybe write only one half because both might have a
side effect etc etc etc ...

Also a global lock would be suboptimal vs. a per device lock burried in
the driver.

> The specific mpt2sas problem is that if we write a 64 bit register non
> atomically, we can't allow any interleaving writes for any other region
> on the chip, otherwise the HW will take the write as complete in the 64
> bit register and latch the wrong value.  The only way to achieve that
> given the semantics of writeq is a global static spinlock.
> 
> > How do you think about them? If you cannot agree with the above two
> > solutions, I'll agree with reverting them.
> 
> Having x86 roll its own never made any sense, so I think they need
> reverting anyway. 

Agreed.

>  This is a driver/platform bus problem not an
> architecture problem.  The assumption we can make is that the platform
> CPU can write atomically at its chip width.  We *may* be able to make
> the assumption that the bus controller can translate an atomic chip
> width transaction to a single atomic bus transaction; I think that
> assumption holds true for at least PCI and on the parisc legacy busses,
> so if we can agree on semantics, this should be a global define
> somewhere.  If there are problems with the bus assumption, we'll likely
> need some type of opt-in (or just not bother).

And we want a well defined #ifdef drivers test to know whether there's a
writeq/readq (just #define writeq/readq itself is fine even if it's an
inline function, we do that elsewhere) so they can have a fallback
scenario.

This is important as these can be used in very performance critical code
path.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq isnot atomic
  2011-05-19  4:46                       ` James Bottomley
  (?)
@ 2011-05-19  8:35                         ` David Laight
  -1 siblings, 0 replies; 46+ messages in thread
From: David Laight @ 2011-05-19  8:35 UTC (permalink / raw)
  To: James Bottomley, Hitoshi Mitake
  Cc: linux-arch, Prakash,Sathya, Roland Dreier, Desai, Kashyap,
	linux scsi dev, Matthew Wilcox, Moore, Eric, linux pci,
	linux powerpc dev, Milton Miller, linux kernel, Ingo Molnar,
	paulus, Ingo Molnar, Sam Ravnborg

> The specific mpt2sas problem is that if we write a 64 bit register non
> atomically, we can't allow any interleaving writes for any other
region
> on the chip, otherwise the HW will take the write as complete in the
64
> bit register and latch the wrong value.  The only way to achieve that
> given the semantics of writeq is a global static spinlock.

That sounds like very specific and slightly dodgy hardware.
You don't say what the scope of 'region on the chip' is, but
it looks like you need to disable ALL writes to the memory
area between the first and second writes of the 64bit value
and not just those coming from writeq().
I don't see how this can possibly be done by normal mutexing
around the writeq() sequence, surely you need to lock the bus
between the two transfers.
Even dma writes would be a problem.

The only way I can think to stop other cpus doing writes
is to use IPIs for force them into a busy wait loop.

All rather reminds me of a PCI slave that got things
horribly wrong when a read was done while a write was
still 'posted', or a 2nd master did a cycle did a read
while a read rerun sequence was still in progress.
(required a mutex and dummy cycles).
At least than one wqs confined to one driver.

	David

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq isnot atomic
@ 2011-05-19  8:35                         ` David Laight
  0 siblings, 0 replies; 46+ messages in thread
From: David Laight @ 2011-05-19  8:35 UTC (permalink / raw)
  To: James Bottomley, Hitoshi Mitake
  Cc: linux-arch, Prakash,Sathya, Roland Dreier, Desai, Kashyap,
	linux scsi dev, Matthew Wilcox, Moore, Eric, linux pci,
	linux powerpc dev, Milton Miller, linux kernel, Ingo Molnar,
	paulus, Ingo Molnar, Sam Ravnborg

> The specific mpt2sas problem is that if we write a 64 bit register non
> atomically, we can't allow any interleaving writes for any other
region
> on the chip, otherwise the HW will take the write as complete in the
64
> bit register and latch the wrong value.  The only way to achieve that
> given the semantics of writeq is a global static spinlock.

That sounds like very specific and slightly dodgy hardware.
You don't say what the scope of 'region on the chip' is, but
it looks like you need to disable ALL writes to the memory
area between the first and second writes of the 64bit value
and not just those coming from writeq().
I don't see how this can possibly be done by normal mutexing
around the writeq() sequence, surely you need to lock the bus
between the two transfers.
Even dma writes would be a problem.

The only way I can think to stop other cpus doing writes
is to use IPIs for force them into a busy wait loop.

All rather reminds me of a PCI slave that got things
horribly wrong when a read was done while a write was
still 'posted', or a 2nd master did a cycle did a read
while a read rerun sequence was still in progress.
(required a mutex and dummy cycles).
At least than one wqs confined to one driver.

	David

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq isnot atomic
@ 2011-05-19  8:35                         ` David Laight
  0 siblings, 0 replies; 46+ messages in thread
From: David Laight @ 2011-05-19  8:35 UTC (permalink / raw)
  To: James Bottomley, Hitoshi Mitake
  Cc: linux-arch, Prakash, Sathya, Roland Dreier, Desai, Kashyap,
	linux scsi dev, Matthew Wilcox, linux pci, linux powerpc dev,
	Milton Miller, linux kernel, Sam Ravnborg, Ingo Molnar, paulus,
	Ingo Molnar, Moore, Eric

=20
> The specific mpt2sas problem is that if we write a 64 bit register non
> atomically, we can't allow any interleaving writes for any other
region
> on the chip, otherwise the HW will take the write as complete in the
64
> bit register and latch the wrong value.  The only way to achieve that
> given the semantics of writeq is a global static spinlock.

That sounds like very specific and slightly dodgy hardware.
You don't say what the scope of 'region on the chip' is, but
it looks like you need to disable ALL writes to the memory
area between the first and second writes of the 64bit value
and not just those coming from writeq().
I don't see how this can possibly be done by normal mutexing
around the writeq() sequence, surely you need to lock the bus
between the two transfers.
Even dma writes would be a problem.

The only way I can think to stop other cpus doing writes
is to use IPIs for force them into a busy wait loop.

All rather reminds me of a PCI slave that got things
horribly wrong when a read was done while a write was
still 'posted', or a 2nd master did a cycle did a read
while a read rerun sequence was still in progress.
(required a mutex and dummy cycles).
At least than one wqs confined to one driver.

	David

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
  2011-05-19  5:34                     ` Benjamin Herrenschmidt
@ 2011-05-19 18:15                       ` Ingo Molnar
  -1 siblings, 0 replies; 46+ messages in thread
From: Ingo Molnar @ 2011-05-19 18:15 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, H. Peter Anvin
  Cc: Roland Dreier, Milton Miller, Hitoshi Mitake, Sam Ravnborg,
	Ingo Molnar, Desai, Kashyap, Prakash, Sathya, James Bottomley,
	Matthew Wilcox, linux scsi dev, paulus, linux powerpc dev,
	linux pci, linux kernel, linux-arch


* Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> On Wed, 2011-05-18 at 21:16 -0700, Roland Dreier wrote:
> > On Wed, May 18, 2011 at 11:31 AM, Milton Miller <miltonm@bga.com> wrote:
> > > So the real question should be why is x86-32 supplying a broken writeq
> > > instead of letting drivers work out what to do it when needed?
> > 
> > Sounds a lot like what I was asking a couple of years ago :)
> > http://lkml.org/lkml/2009/4/19/164
> > 
> > But Ingo insisted that non-atomic writeq would be fine:
> > http://lkml.org/lkml/2009/4/19/167
> 
> Yuck... Ingo, I think that was very wrong.
> 
> Those are for MMIO, which must almost ALWAYS know precisely what the
> resulting access size is going to be. It's not even about atomicity
> between multiple CPUs. I have seen plenty of HW for which a 64-bit
> access to a register is -not- equivalent to two 32-bit ones. In fact, in
> some case, you can get the side effects twice ... or none at all.
> 
> The only case where you can be lax is when you explicitely know that
> there is no side effects -and- the HW cope with different access sizes.
> This is not the general case and drivers need at the very least a way to
> know what the behaviour will be.

Ok, that's pretty convincing.

Unless hpa or tglx disagrees with reverting this, could any of you send a patch 
with a proper changelog etc. that applies cleanly to v2.6.39?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
@ 2011-05-19 18:15                       ` Ingo Molnar
  0 siblings, 0 replies; 46+ messages in thread
From: Ingo Molnar @ 2011-05-19 18:15 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, H. Peter Anvin
  Cc: Roland Dreier, Prakash, Sathya, linux-arch, Desai, Kashyap,
	linux scsi dev, Matthew Wilcox, Hitoshi Mitake,
	linux powerpc dev, Milton Miller, linux kernel, James Bottomley,
	Ingo Molnar, paulus, linux pci, Sam Ravnborg


* Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> On Wed, 2011-05-18 at 21:16 -0700, Roland Dreier wrote:
> > On Wed, May 18, 2011 at 11:31 AM, Milton Miller <miltonm@bga.com> wrote:
> > > So the real question should be why is x86-32 supplying a broken writeq
> > > instead of letting drivers work out what to do it when needed?
> > 
> > Sounds a lot like what I was asking a couple of years ago :)
> > http://lkml.org/lkml/2009/4/19/164
> > 
> > But Ingo insisted that non-atomic writeq would be fine:
> > http://lkml.org/lkml/2009/4/19/167
> 
> Yuck... Ingo, I think that was very wrong.
> 
> Those are for MMIO, which must almost ALWAYS know precisely what the
> resulting access size is going to be. It's not even about atomicity
> between multiple CPUs. I have seen plenty of HW for which a 64-bit
> access to a register is -not- equivalent to two 32-bit ones. In fact, in
> some case, you can get the side effects twice ... or none at all.
> 
> The only case where you can be lax is when you explicitely know that
> there is no side effects -and- the HW cope with different access sizes.
> This is not the general case and drivers need at the very least a way to
> know what the behaviour will be.

Ok, that's pretty convincing.

Unless hpa or tglx disagrees with reverting this, could any of you send a patch 
with a proper changelog etc. that applies cleanly to v2.6.39?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH] x86: Remove 32-bit versions of readq()/writeq()
  2011-05-19 18:15                       ` Ingo Molnar
  (?)
@ 2011-05-19 23:54                       ` Roland Dreier
  2011-05-20  1:15                         ` Hitoshi Mitake
  2011-05-20 11:44                         ` Ingo Molnar
  -1 siblings, 2 replies; 46+ messages in thread
From: Roland Dreier @ 2011-05-19 23:54 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-scsi, Benjamin Herrenschmidt, Milton Miller,
	James Bottomley, Hitoshi Mitake, Kashyap Desai, Len Brown,
	Ravi Anand, Vikas Chaudhary, Matthew Garrett

From: Roland Dreier <roland@purestorage.com>

The presense of a writeq() implementation on 32-bit x86 that splits
the 64-bit write into two 32-bit writes turns out to break the mpt2sas
driver (and in general is risky for drivers as was discussed in
<http://lkml.kernel.org/r/adaab6c1h7c.fsf@cisco.com>).  To fix this,
revert 2c5643b1c5c7 ("x86: provide readq()/writeq() on 32-bit too")
and follow-on cleanups.

This unfortunately leads to pushing non-atomic definitions of readq()
and write() to various x86-only drivers that in the mean time started
using the definitions in the x86 version of <asm/io.h>.  However as
discussed exhaustively, this is actually the right thing to do,
because the right way to split a 64-bit transaction is hardware
dependent and therefore belongs in the hardware driver (eg mpt2sas
needs a spinlock to make sure no other accesses occur in between the
two halves of the access).

Build tested on 32- and 64-bit x86 allmodconfig.

Link: http://lkml.kernel.org/r/x86-32-writeq-is-broken@mdm.bga.com
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Kashyap Desai <Kashyap.Desai@lsi.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Ravi Anand <ravi.anand@qlogic.com>
Cc: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Cc: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
---
 arch/x86/Kconfig                 |    2 --
 arch/x86/include/asm/io.h        |   24 ++----------------------
 drivers/acpi/apei/einj.c         |    8 ++++++++
 drivers/acpi/atomicio.c          |    4 ++++
 drivers/edac/i3200_edac.c        |   13 +++++++++++++
 drivers/platform/x86/ibm_rtl.c   |   13 +++++++++++++
 drivers/platform/x86/intel_ips.c |   13 +++++++++++++
 drivers/scsi/qla4xxx/ql4_nx.c    |   21 +++++++++++++++++++++
 8 files changed, 74 insertions(+), 24 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index cc6c53a..ceb41f3 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -16,8 +16,6 @@ config X86_64
 config X86
 	def_bool y
 	select HAVE_AOUT if X86_32
-	select HAVE_READQ
-	select HAVE_WRITEQ
 	select HAVE_UNSTABLE_SCHED_CLOCK
 	select HAVE_IDE
 	select HAVE_OPROFILE
diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index 0722730..d02804d 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -38,7 +38,6 @@
 
 #include <linux/string.h>
 #include <linux/compiler.h>
-#include <asm-generic/int-ll64.h>
 #include <asm/page.h>
 
 #include <xen/xen.h>
@@ -87,27 +86,6 @@ build_mmio_write(__writel, "l", unsigned int, "r", )
 build_mmio_read(readq, "q", unsigned long, "=r", :"memory")
 build_mmio_write(writeq, "q", unsigned long, "r", :"memory")
 
-#else
-
-static inline __u64 readq(const volatile void __iomem *addr)
-{
-	const volatile u32 __iomem *p = addr;
-	u32 low, high;
-
-	low = readl(p);
-	high = readl(p + 1);
-
-	return low + ((u64)high << 32);
-}
-
-static inline void writeq(__u64 val, volatile void __iomem *addr)
-{
-	writel(val, addr);
-	writel(val >> 32, addr+4);
-}
-
-#endif
-
 #define readq_relaxed(a)	readq(a)
 
 #define __raw_readq(a)		readq(a)
@@ -117,6 +95,8 @@ static inline void writeq(__u64 val, volatile void __iomem *addr)
 #define readq			readq
 #define writeq			writeq
 
+#endif
+
 /**
  *	virt_to_phys	-	map virtual addresses to physical
  *	@address: address to remap
diff --git a/drivers/acpi/apei/einj.c b/drivers/acpi/apei/einj.c
index 096aebf..f74b2ea 100644
--- a/drivers/acpi/apei/einj.c
+++ b/drivers/acpi/apei/einj.c
@@ -101,6 +101,14 @@ static DEFINE_MUTEX(einj_mutex);
 
 static struct einj_parameter *einj_param;
 
+#ifndef writeq
+static inline void writeq(__u64 val, volatile void __iomem *addr)
+{
+	writel(val, addr);
+	writel(val >> 32, addr+4);
+}
+#endif
+
 static void einj_exec_ctx_init(struct apei_exec_context *ctx)
 {
 	apei_exec_ctx_init(ctx, einj_ins_type, ARRAY_SIZE(einj_ins_type),
diff --git a/drivers/acpi/atomicio.c b/drivers/acpi/atomicio.c
index 542e539..7489b89 100644
--- a/drivers/acpi/atomicio.c
+++ b/drivers/acpi/atomicio.c
@@ -280,9 +280,11 @@ static int acpi_atomic_read_mem(u64 paddr, u64 *val, u32 width)
 	case 32:
 		*val = readl(addr);
 		break;
+#ifdef readq
 	case 64:
 		*val = readq(addr);
 		break;
+#endif
 	default:
 		return -EINVAL;
 	}
@@ -307,9 +309,11 @@ static int acpi_atomic_write_mem(u64 paddr, u64 val, u32 width)
 	case 32:
 		writel(val, addr);
 		break;
+#ifdef writeq
 	case 64:
 		writeq(val, addr);
 		break;
+#endif
 	default:
 		return -EINVAL;
 	}
diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c
index d41f900..aa08497 100644
--- a/drivers/edac/i3200_edac.c
+++ b/drivers/edac/i3200_edac.c
@@ -101,6 +101,19 @@ struct i3200_priv {
 
 static int nr_channels;
 
+#ifndef readq
+static inline __u64 readq(const volatile void __iomem *addr)
+{
+	const volatile u32 __iomem *p = addr;
+	u32 low, high;
+
+	low = readl(p);
+	high = readl(p + 1);
+
+	return low + ((u64)high << 32);
+}
+#endif
+
 static int how_many_channels(struct pci_dev *pdev)
 {
 	unsigned char capid0_8b; /* 8th byte of CAPID0 */
diff --git a/drivers/platform/x86/ibm_rtl.c b/drivers/platform/x86/ibm_rtl.c
index 94a114a..b1396e5 100644
--- a/drivers/platform/x86/ibm_rtl.c
+++ b/drivers/platform/x86/ibm_rtl.c
@@ -81,6 +81,19 @@ static void __iomem *rtl_cmd_addr;
 static u8 rtl_cmd_type;
 static u8 rtl_cmd_width;
 
+#ifndef readq
+static inline __u64 readq(const volatile void __iomem *addr)
+{
+	const volatile u32 __iomem *p = addr;
+	u32 low, high;
+
+	low = readl(p);
+	high = readl(p + 1);
+
+	return low + ((u64)high << 32);
+}
+#endif
+
 static void __iomem *rtl_port_map(phys_addr_t addr, unsigned long len)
 {
 	if (rtl_cmd_type == RTL_ADDR_TYPE_MMIO)
diff --git a/drivers/platform/x86/intel_ips.c b/drivers/platform/x86/intel_ips.c
index 85c8ad4..5ffe7c3 100644
--- a/drivers/platform/x86/intel_ips.c
+++ b/drivers/platform/x86/intel_ips.c
@@ -344,6 +344,19 @@ struct ips_driver {
 static bool
 ips_gpu_turbo_enabled(struct ips_driver *ips);
 
+#ifndef readq
+static inline __u64 readq(const volatile void __iomem *addr)
+{
+	const volatile u32 __iomem *p = addr;
+	u32 low, high;
+
+	low = readl(p);
+	high = readl(p + 1);
+
+	return low + ((u64)high << 32);
+}
+#endif
+
 /**
  * ips_cpu_busy - is CPU busy?
  * @ips: IPS driver struct
diff --git a/drivers/scsi/qla4xxx/ql4_nx.c b/drivers/scsi/qla4xxx/ql4_nx.c
index 35381cb..03e522b 100644
--- a/drivers/scsi/qla4xxx/ql4_nx.c
+++ b/drivers/scsi/qla4xxx/ql4_nx.c
@@ -655,6 +655,27 @@ static int qla4_8xxx_pci_is_same_window(struct scsi_qla_host *ha,
 	return 0;
 }
 
+#ifndef readq
+static inline __u64 readq(const volatile void __iomem *addr)
+{
+	const volatile u32 __iomem *p = addr;
+	u32 low, high;
+
+	low = readl(p);
+	high = readl(p + 1);
+
+	return low + ((u64)high << 32);
+}
+#endif
+
+#ifndef writeq
+static inline void writeq(__u64 val, volatile void __iomem *addr)
+{
+	writel(val, addr);
+	writel(val >> 32, addr+4);
+}
+#endif
+
 static int qla4_8xxx_pci_mem_read_direct(struct scsi_qla_host *ha,
 		u64 off, void *data, int size)
 {

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH] x86: Remove 32-bit versions of readq()/writeq()
  2011-05-19 23:54                       ` [PATCH] x86: Remove 32-bit versions of readq()/writeq() Roland Dreier
@ 2011-05-20  1:15                         ` Hitoshi Mitake
  2011-05-20  8:05                           ` Desai, Kashyap
  2011-05-20 11:44                         ` Ingo Molnar
  1 sibling, 1 reply; 46+ messages in thread
From: Hitoshi Mitake @ 2011-05-20  1:15 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Ingo Molnar, linux-kernel, linux-scsi, Benjamin Herrenschmidt,
	Milton Miller, James Bottomley, Kashyap Desai, Len Brown,
	Ravi Anand, Vikas Chaudhary, Matthew Garrett

On Fri, May 20, 2011 at 08:54, Roland Dreier <roland@kernel.org> wrote:
> From: Roland Dreier <roland@purestorage.com>
>
> The presense of a writeq() implementation on 32-bit x86 that splits
> the 64-bit write into two 32-bit writes turns out to break the mpt2sas
> driver (and in general is risky for drivers as was discussed in
> <http://lkml.kernel.org/r/adaab6c1h7c.fsf@cisco.com>).  To fix this,
> revert 2c5643b1c5c7 ("x86: provide readq()/writeq() on 32-bit too")
> and follow-on cleanups.
>
> This unfortunately leads to pushing non-atomic definitions of readq()
> and write() to various x86-only drivers that in the mean time started
> using the definitions in the x86 version of <asm/io.h>.  However as
> discussed exhaustively, this is actually the right thing to do,
> because the right way to split a 64-bit transaction is hardware
> dependent and therefore belongs in the hardware driver (eg mpt2sas
> needs a spinlock to make sure no other accesses occur in between the
> two halves of the access).
>
> Build tested on 32- and 64-bit x86 allmodconfig.
>
> Link: http://lkml.kernel.org/r/x86-32-writeq-is-broken@mdm.bga.com
> Cc: Hitoshi Mitake <h.mitake@gmail.com>
> Cc: Kashyap Desai <Kashyap.Desai@lsi.com>
> Cc: Len Brown <lenb@kernel.org>
> Cc: Ravi Anand <ravi.anand@qlogic.com>
> Cc: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
> Cc: Matthew Garrett <mjg@redhat.com>
> Signed-off-by: Roland Dreier <roland@purestorage.com>

I should ack this patch based on the detailed explanation from guys in
the previous thread.
Acked-by: Hitoshi Mitake <h.mitake@gmail.com>

Thanks for your cleaning drivers, Roland!

> ---
>  arch/x86/Kconfig                 |    2 --
>  arch/x86/include/asm/io.h        |   24 ++----------------------
>  drivers/acpi/apei/einj.c         |    8 ++++++++
>  drivers/acpi/atomicio.c          |    4 ++++
>  drivers/edac/i3200_edac.c        |   13 +++++++++++++
>  drivers/platform/x86/ibm_rtl.c   |   13 +++++++++++++
>  drivers/platform/x86/intel_ips.c |   13 +++++++++++++
>  drivers/scsi/qla4xxx/ql4_nx.c    |   21 +++++++++++++++++++++
>  8 files changed, 74 insertions(+), 24 deletions(-)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index cc6c53a..ceb41f3 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -16,8 +16,6 @@ config X86_64
>  config X86
>        def_bool y
>        select HAVE_AOUT if X86_32
> -       select HAVE_READQ
> -       select HAVE_WRITEQ
>        select HAVE_UNSTABLE_SCHED_CLOCK
>        select HAVE_IDE
>        select HAVE_OPROFILE
> diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
> index 0722730..d02804d 100644
> --- a/arch/x86/include/asm/io.h
> +++ b/arch/x86/include/asm/io.h
> @@ -38,7 +38,6 @@
>
>  #include <linux/string.h>
>  #include <linux/compiler.h>
> -#include <asm-generic/int-ll64.h>
>  #include <asm/page.h>
>
>  #include <xen/xen.h>
> @@ -87,27 +86,6 @@ build_mmio_write(__writel, "l", unsigned int, "r", )
>  build_mmio_read(readq, "q", unsigned long, "=r", :"memory")
>  build_mmio_write(writeq, "q", unsigned long, "r", :"memory")
>
> -#else
> -
> -static inline __u64 readq(const volatile void __iomem *addr)
> -{
> -       const volatile u32 __iomem *p = addr;
> -       u32 low, high;
> -
> -       low = readl(p);
> -       high = readl(p + 1);
> -
> -       return low + ((u64)high << 32);
> -}
> -
> -static inline void writeq(__u64 val, volatile void __iomem *addr)
> -{
> -       writel(val, addr);
> -       writel(val >> 32, addr+4);
> -}
> -
> -#endif
> -
>  #define readq_relaxed(a)       readq(a)
>
>  #define __raw_readq(a)         readq(a)
> @@ -117,6 +95,8 @@ static inline void writeq(__u64 val, volatile void __iomem *addr)
>  #define readq                  readq
>  #define writeq                 writeq
>
> +#endif
> +
>  /**
>  *     virt_to_phys    -       map virtual addresses to physical
>  *     @address: address to remap
> diff --git a/drivers/acpi/apei/einj.c b/drivers/acpi/apei/einj.c
> index 096aebf..f74b2ea 100644
> --- a/drivers/acpi/apei/einj.c
> +++ b/drivers/acpi/apei/einj.c
> @@ -101,6 +101,14 @@ static DEFINE_MUTEX(einj_mutex);
>
>  static struct einj_parameter *einj_param;
>
> +#ifndef writeq
> +static inline void writeq(__u64 val, volatile void __iomem *addr)
> +{
> +       writel(val, addr);
> +       writel(val >> 32, addr+4);
> +}
> +#endif
> +
>  static void einj_exec_ctx_init(struct apei_exec_context *ctx)
>  {
>        apei_exec_ctx_init(ctx, einj_ins_type, ARRAY_SIZE(einj_ins_type),
> diff --git a/drivers/acpi/atomicio.c b/drivers/acpi/atomicio.c
> index 542e539..7489b89 100644
> --- a/drivers/acpi/atomicio.c
> +++ b/drivers/acpi/atomicio.c
> @@ -280,9 +280,11 @@ static int acpi_atomic_read_mem(u64 paddr, u64 *val, u32 width)
>        case 32:
>                *val = readl(addr);
>                break;
> +#ifdef readq
>        case 64:
>                *val = readq(addr);
>                break;
> +#endif
>        default:
>                return -EINVAL;
>        }
> @@ -307,9 +309,11 @@ static int acpi_atomic_write_mem(u64 paddr, u64 val, u32 width)
>        case 32:
>                writel(val, addr);
>                break;
> +#ifdef writeq
>        case 64:
>                writeq(val, addr);
>                break;
> +#endif
>        default:
>                return -EINVAL;
>        }
> diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c
> index d41f900..aa08497 100644
> --- a/drivers/edac/i3200_edac.c
> +++ b/drivers/edac/i3200_edac.c
> @@ -101,6 +101,19 @@ struct i3200_priv {
>
>  static int nr_channels;
>
> +#ifndef readq
> +static inline __u64 readq(const volatile void __iomem *addr)
> +{
> +       const volatile u32 __iomem *p = addr;
> +       u32 low, high;
> +
> +       low = readl(p);
> +       high = readl(p + 1);
> +
> +       return low + ((u64)high << 32);
> +}
> +#endif
> +
>  static int how_many_channels(struct pci_dev *pdev)
>  {
>        unsigned char capid0_8b; /* 8th byte of CAPID0 */
> diff --git a/drivers/platform/x86/ibm_rtl.c b/drivers/platform/x86/ibm_rtl.c
> index 94a114a..b1396e5 100644
> --- a/drivers/platform/x86/ibm_rtl.c
> +++ b/drivers/platform/x86/ibm_rtl.c
> @@ -81,6 +81,19 @@ static void __iomem *rtl_cmd_addr;
>  static u8 rtl_cmd_type;
>  static u8 rtl_cmd_width;
>
> +#ifndef readq
> +static inline __u64 readq(const volatile void __iomem *addr)
> +{
> +       const volatile u32 __iomem *p = addr;
> +       u32 low, high;
> +
> +       low = readl(p);
> +       high = readl(p + 1);
> +
> +       return low + ((u64)high << 32);
> +}
> +#endif
> +
>  static void __iomem *rtl_port_map(phys_addr_t addr, unsigned long len)
>  {
>        if (rtl_cmd_type == RTL_ADDR_TYPE_MMIO)
> diff --git a/drivers/platform/x86/intel_ips.c b/drivers/platform/x86/intel_ips.c
> index 85c8ad4..5ffe7c3 100644
> --- a/drivers/platform/x86/intel_ips.c
> +++ b/drivers/platform/x86/intel_ips.c
> @@ -344,6 +344,19 @@ struct ips_driver {
>  static bool
>  ips_gpu_turbo_enabled(struct ips_driver *ips);
>
> +#ifndef readq
> +static inline __u64 readq(const volatile void __iomem *addr)
> +{
> +       const volatile u32 __iomem *p = addr;
> +       u32 low, high;
> +
> +       low = readl(p);
> +       high = readl(p + 1);
> +
> +       return low + ((u64)high << 32);
> +}
> +#endif
> +
>  /**
>  * ips_cpu_busy - is CPU busy?
>  * @ips: IPS driver struct
> diff --git a/drivers/scsi/qla4xxx/ql4_nx.c b/drivers/scsi/qla4xxx/ql4_nx.c
> index 35381cb..03e522b 100644
> --- a/drivers/scsi/qla4xxx/ql4_nx.c
> +++ b/drivers/scsi/qla4xxx/ql4_nx.c
> @@ -655,6 +655,27 @@ static int qla4_8xxx_pci_is_same_window(struct scsi_qla_host *ha,
>        return 0;
>  }
>
> +#ifndef readq
> +static inline __u64 readq(const volatile void __iomem *addr)
> +{
> +       const volatile u32 __iomem *p = addr;
> +       u32 low, high;
> +
> +       low = readl(p);
> +       high = readl(p + 1);
> +
> +       return low + ((u64)high << 32);
> +}
> +#endif
> +
> +#ifndef writeq
> +static inline void writeq(__u64 val, volatile void __iomem *addr)
> +{
> +       writel(val, addr);
> +       writel(val >> 32, addr+4);
> +}
> +#endif
> +
>  static int qla4_8xxx_pci_mem_read_direct(struct scsi_qla_host *ha,
>                u64 off, void *data, int size)
>  {
>



-- 
Hitoshi Mitake
h.mitake@gmail.com

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH] x86: Remove 32-bit versions of readq()/writeq()
  2011-05-20  1:15                         ` Hitoshi Mitake
@ 2011-05-20  8:05                           ` Desai, Kashyap
  0 siblings, 0 replies; 46+ messages in thread
From: Desai, Kashyap @ 2011-05-20  8:05 UTC (permalink / raw)
  To: Hitoshi Mitake, Roland Dreier
  Cc: Ingo Molnar, linux-kernel, linux-scsi, Benjamin Herrenschmidt,
	Milton Miller, James Bottomley, Len Brown, Ravi Anand,
	Vikas Chaudhary, Matthew Garrett



-----Original Message-----
From: Hitoshi Mitake [mailto:h.mitake@gmail.com] 
Sent: Friday, May 20, 2011 6:46 AM
To: Roland Dreier
Cc: Ingo Molnar; linux-kernel@vger.kernel.org; linux-scsi@vger.kernel.org; Benjamin Herrenschmidt; Milton Miller; James Bottomley; Desai, Kashyap; Len Brown; Ravi Anand; Vikas Chaudhary; Matthew Garrett
Subject: Re: [PATCH] x86: Remove 32-bit versions of readq()/writeq()

On Fri, May 20, 2011 at 08:54, Roland Dreier <roland@kernel.org> wrote:
> From: Roland Dreier <roland@purestorage.com>
>
> The presense of a writeq() implementation on 32-bit x86 that splits
> the 64-bit write into two 32-bit writes turns out to break the mpt2sas
> driver (and in general is risky for drivers as was discussed in
> <http://lkml.kernel.org/r/adaab6c1h7c.fsf@cisco.com>).  To fix this,
> revert 2c5643b1c5c7 ("x86: provide readq()/writeq() on 32-bit too")
> and follow-on cleanups.
>
> This unfortunately leads to pushing non-atomic definitions of readq()
> and write() to various x86-only drivers that in the mean time started
> using the definitions in the x86 version of <asm/io.h>.  However as
> discussed exhaustively, this is actually the right thing to do,
> because the right way to split a 64-bit transaction is hardware
> dependent and therefore belongs in the hardware driver (eg mpt2sas
> needs a spinlock to make sure no other accesses occur in between the
> two halves of the access).
>
> Build tested on 32- and 64-bit x86 allmodconfig.
>
> Link: http://lkml.kernel.org/r/x86-32-writeq-is-broken@mdm.bga.com
> Cc: Hitoshi Mitake <h.mitake@gmail.com>
> Cc: Kashyap Desai <Kashyap.Desai@lsi.com>
> Cc: Len Brown <lenb@kernel.org>
> Cc: Ravi Anand <ravi.anand@qlogic.com>
> Cc: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
> Cc: Matthew Garrett <mjg@redhat.com>
> Signed-off-by: Roland Dreier <roland@purestorage.com>

I should ack this patch based on the detailed explanation from guys in
the previous thread.
Acked-by: Hitoshi Mitake <h.mitake@gmail.com>

Thanks for your cleaning drivers, Roland!
[Desai, Kashyap]  I found this is the good approach to remove 32 bit version of readq/writeq.
Acked-by: Kashyap Desai <kashyap.desai@lsi.com>


> ---
>  arch/x86/Kconfig                 |    2 --
>  arch/x86/include/asm/io.h        |   24 ++----------------------
>  drivers/acpi/apei/einj.c         |    8 ++++++++
>  drivers/acpi/atomicio.c          |    4 ++++
>  drivers/edac/i3200_edac.c        |   13 +++++++++++++
>  drivers/platform/x86/ibm_rtl.c   |   13 +++++++++++++
>  drivers/platform/x86/intel_ips.c |   13 +++++++++++++
>  drivers/scsi/qla4xxx/ql4_nx.c    |   21 +++++++++++++++++++++
>  8 files changed, 74 insertions(+), 24 deletions(-)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index cc6c53a..ceb41f3 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -16,8 +16,6 @@ config X86_64
>  config X86
>        def_bool y
>        select HAVE_AOUT if X86_32
> -       select HAVE_READQ
> -       select HAVE_WRITEQ
>        select HAVE_UNSTABLE_SCHED_CLOCK
>        select HAVE_IDE
>        select HAVE_OPROFILE
> diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
> index 0722730..d02804d 100644
> --- a/arch/x86/include/asm/io.h
> +++ b/arch/x86/include/asm/io.h
> @@ -38,7 +38,6 @@
>
>  #include <linux/string.h>
>  #include <linux/compiler.h>
> -#include <asm-generic/int-ll64.h>
>  #include <asm/page.h>
>
>  #include <xen/xen.h>
> @@ -87,27 +86,6 @@ build_mmio_write(__writel, "l", unsigned int, "r", )
>  build_mmio_read(readq, "q", unsigned long, "=r", :"memory")
>  build_mmio_write(writeq, "q", unsigned long, "r", :"memory")
>
> -#else
> -
> -static inline __u64 readq(const volatile void __iomem *addr)
> -{
> -       const volatile u32 __iomem *p = addr;
> -       u32 low, high;
> -
> -       low = readl(p);
> -       high = readl(p + 1);
> -
> -       return low + ((u64)high << 32);
> -}
> -
> -static inline void writeq(__u64 val, volatile void __iomem *addr)
> -{
> -       writel(val, addr);
> -       writel(val >> 32, addr+4);
> -}
> -
> -#endif
> -
>  #define readq_relaxed(a)       readq(a)
>
>  #define __raw_readq(a)         readq(a)
> @@ -117,6 +95,8 @@ static inline void writeq(__u64 val, volatile void __iomem *addr)
>  #define readq                  readq
>  #define writeq                 writeq
>
> +#endif
> +
>  /**
>  *     virt_to_phys    -       map virtual addresses to physical
>  *     @address: address to remap
> diff --git a/drivers/acpi/apei/einj.c b/drivers/acpi/apei/einj.c
> index 096aebf..f74b2ea 100644
> --- a/drivers/acpi/apei/einj.c
> +++ b/drivers/acpi/apei/einj.c
> @@ -101,6 +101,14 @@ static DEFINE_MUTEX(einj_mutex);
>
>  static struct einj_parameter *einj_param;
>
> +#ifndef writeq
> +static inline void writeq(__u64 val, volatile void __iomem *addr)
> +{
> +       writel(val, addr);
> +       writel(val >> 32, addr+4);
> +}
> +#endif
> +
>  static void einj_exec_ctx_init(struct apei_exec_context *ctx)
>  {
>        apei_exec_ctx_init(ctx, einj_ins_type, ARRAY_SIZE(einj_ins_type),
> diff --git a/drivers/acpi/atomicio.c b/drivers/acpi/atomicio.c
> index 542e539..7489b89 100644
> --- a/drivers/acpi/atomicio.c
> +++ b/drivers/acpi/atomicio.c
> @@ -280,9 +280,11 @@ static int acpi_atomic_read_mem(u64 paddr, u64 *val, u32 width)
>        case 32:
>                *val = readl(addr);
>                break;
> +#ifdef readq
>        case 64:
>                *val = readq(addr);
>                break;
> +#endif
>        default:
>                return -EINVAL;
>        }
> @@ -307,9 +309,11 @@ static int acpi_atomic_write_mem(u64 paddr, u64 val, u32 width)
>        case 32:
>                writel(val, addr);
>                break;
> +#ifdef writeq
>        case 64:
>                writeq(val, addr);
>                break;
> +#endif
>        default:
>                return -EINVAL;
>        }
> diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c
> index d41f900..aa08497 100644
> --- a/drivers/edac/i3200_edac.c
> +++ b/drivers/edac/i3200_edac.c
> @@ -101,6 +101,19 @@ struct i3200_priv {
>
>  static int nr_channels;
>
> +#ifndef readq
> +static inline __u64 readq(const volatile void __iomem *addr)
> +{
> +       const volatile u32 __iomem *p = addr;
> +       u32 low, high;
> +
> +       low = readl(p);
> +       high = readl(p + 1);
> +
> +       return low + ((u64)high << 32);
> +}
> +#endif
> +
>  static int how_many_channels(struct pci_dev *pdev)
>  {
>        unsigned char capid0_8b; /* 8th byte of CAPID0 */
> diff --git a/drivers/platform/x86/ibm_rtl.c b/drivers/platform/x86/ibm_rtl.c
> index 94a114a..b1396e5 100644
> --- a/drivers/platform/x86/ibm_rtl.c
> +++ b/drivers/platform/x86/ibm_rtl.c
> @@ -81,6 +81,19 @@ static void __iomem *rtl_cmd_addr;
>  static u8 rtl_cmd_type;
>  static u8 rtl_cmd_width;
>
> +#ifndef readq
> +static inline __u64 readq(const volatile void __iomem *addr)
> +{
> +       const volatile u32 __iomem *p = addr;
> +       u32 low, high;
> +
> +       low = readl(p);
> +       high = readl(p + 1);
> +
> +       return low + ((u64)high << 32);
> +}
> +#endif
> +
>  static void __iomem *rtl_port_map(phys_addr_t addr, unsigned long len)
>  {
>        if (rtl_cmd_type == RTL_ADDR_TYPE_MMIO)
> diff --git a/drivers/platform/x86/intel_ips.c b/drivers/platform/x86/intel_ips.c
> index 85c8ad4..5ffe7c3 100644
> --- a/drivers/platform/x86/intel_ips.c
> +++ b/drivers/platform/x86/intel_ips.c
> @@ -344,6 +344,19 @@ struct ips_driver {
>  static bool
>  ips_gpu_turbo_enabled(struct ips_driver *ips);
>
> +#ifndef readq
> +static inline __u64 readq(const volatile void __iomem *addr)
> +{
> +       const volatile u32 __iomem *p = addr;
> +       u32 low, high;
> +
> +       low = readl(p);
> +       high = readl(p + 1);
> +
> +       return low + ((u64)high << 32);
> +}
> +#endif
> +
>  /**
>  * ips_cpu_busy - is CPU busy?
>  * @ips: IPS driver struct
> diff --git a/drivers/scsi/qla4xxx/ql4_nx.c b/drivers/scsi/qla4xxx/ql4_nx.c
> index 35381cb..03e522b 100644
> --- a/drivers/scsi/qla4xxx/ql4_nx.c
> +++ b/drivers/scsi/qla4xxx/ql4_nx.c
> @@ -655,6 +655,27 @@ static int qla4_8xxx_pci_is_same_window(struct scsi_qla_host *ha,
>        return 0;
>  }
>
> +#ifndef readq
> +static inline __u64 readq(const volatile void __iomem *addr)
> +{
> +       const volatile u32 __iomem *p = addr;
> +       u32 low, high;
> +
> +       low = readl(p);
> +       high = readl(p + 1);
> +
> +       return low + ((u64)high << 32);
> +}
> +#endif
> +
> +#ifndef writeq
> +static inline void writeq(__u64 val, volatile void __iomem *addr)
> +{
> +       writel(val, addr);
> +       writel(val >> 32, addr+4);
> +}
> +#endif
> +
>  static int qla4_8xxx_pci_mem_read_direct(struct scsi_qla_host *ha,
>                u64 off, void *data, int size)
>  {
>



-- 
Hitoshi Mitake
h.mitake@gmail.com

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] x86: Remove 32-bit versions of readq()/writeq()
  2011-05-19 23:54                       ` [PATCH] x86: Remove 32-bit versions of readq()/writeq() Roland Dreier
  2011-05-20  1:15                         ` Hitoshi Mitake
@ 2011-05-20 11:44                         ` Ingo Molnar
  2011-05-20 12:03                           ` James Bottomley
  1 sibling, 1 reply; 46+ messages in thread
From: Ingo Molnar @ 2011-05-20 11:44 UTC (permalink / raw)
  To: Roland Dreier, Andrew Morton
  Cc: linux-kernel, linux-scsi, Benjamin Herrenschmidt, Milton Miller,
	James Bottomley, Hitoshi Mitake, Kashyap Desai, Len Brown,
	Ravi Anand, Vikas Chaudhary, Matthew Garrett


* Roland Dreier <roland@kernel.org> wrote:

> From: Roland Dreier <roland@purestorage.com>
> 
> The presense of a writeq() implementation on 32-bit x86 that splits
> the 64-bit write into two 32-bit writes turns out to break the mpt2sas
> driver (and in general is risky for drivers as was discussed in
> <http://lkml.kernel.org/r/adaab6c1h7c.fsf@cisco.com>).  To fix this,
> revert 2c5643b1c5c7 ("x86: provide readq()/writeq() on 32-bit too")
> and follow-on cleanups.
> 
> This unfortunately leads to pushing non-atomic definitions of readq()
> and write() to various x86-only drivers that in the mean time started
> using the definitions in the x86 version of <asm/io.h>.  However as
> discussed exhaustively, this is actually the right thing to do,
> because the right way to split a 64-bit transaction is hardware
> dependent and therefore belongs in the hardware driver (eg mpt2sas
> needs a spinlock to make sure no other accesses occur in between the
> two halves of the access).
> 
> Build tested on 32- and 64-bit x86 allmodconfig.
> 
> Link: http://lkml.kernel.org/r/x86-32-writeq-is-broken@mdm.bga.com
> Cc: Hitoshi Mitake <h.mitake@gmail.com>
> Cc: Kashyap Desai <Kashyap.Desai@lsi.com>
> Cc: Len Brown <lenb@kernel.org>
> Cc: Ravi Anand <ravi.anand@qlogic.com>
> Cc: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
> Cc: Matthew Garrett <mjg@redhat.com>
> Signed-off-by: Roland Dreier <roland@purestorage.com>
> ---
>  arch/x86/Kconfig                 |    2 --
>  arch/x86/include/asm/io.h        |   24 ++----------------------
>  drivers/acpi/apei/einj.c         |    8 ++++++++
>  drivers/acpi/atomicio.c          |    4 ++++
>  drivers/edac/i3200_edac.c        |   13 +++++++++++++
>  drivers/platform/x86/ibm_rtl.c   |   13 +++++++++++++
>  drivers/platform/x86/intel_ips.c |   13 +++++++++++++
>  drivers/scsi/qla4xxx/ql4_nx.c    |   21 +++++++++++++++++++++
>  8 files changed, 74 insertions(+), 24 deletions(-)

Hm, this patch is wider than i thought - might be better to do this via one of 
the driver trees or -mm?

The x86 bits are:

Acked-by: Ingo Molnar <mingo@elte.hu>

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] x86: Remove 32-bit versions of readq()/writeq()
  2011-05-20 11:44                         ` Ingo Molnar
@ 2011-05-20 12:03                           ` James Bottomley
  0 siblings, 0 replies; 46+ messages in thread
From: James Bottomley @ 2011-05-20 12:03 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Roland Dreier, Andrew Morton, linux-kernel, linux-scsi,
	Benjamin Herrenschmidt, Milton Miller, Hitoshi Mitake,
	Kashyap Desai, Len Brown, Ravi Anand, Vikas Chaudhary,
	Matthew Garrett

On Fri, 2011-05-20 at 13:44 +0200, Ingo Molnar wrote:
> * Roland Dreier <roland@kernel.org> wrote:
> 
> > From: Roland Dreier <roland@purestorage.com>
> > 
> > The presense of a writeq() implementation on 32-bit x86 that splits
> > the 64-bit write into two 32-bit writes turns out to break the mpt2sas
> > driver (and in general is risky for drivers as was discussed in
> > <http://lkml.kernel.org/r/adaab6c1h7c.fsf@cisco.com>).  To fix this,
> > revert 2c5643b1c5c7 ("x86: provide readq()/writeq() on 32-bit too")
> > and follow-on cleanups.
> > 
> > This unfortunately leads to pushing non-atomic definitions of readq()
> > and write() to various x86-only drivers that in the mean time started
> > using the definitions in the x86 version of <asm/io.h>.  However as
> > discussed exhaustively, this is actually the right thing to do,
> > because the right way to split a 64-bit transaction is hardware
> > dependent and therefore belongs in the hardware driver (eg mpt2sas
> > needs a spinlock to make sure no other accesses occur in between the
> > two halves of the access).
> > 
> > Build tested on 32- and 64-bit x86 allmodconfig.
> > 
> > Link: http://lkml.kernel.org/r/x86-32-writeq-is-broken@mdm.bga.com
> > Cc: Hitoshi Mitake <h.mitake@gmail.com>
> > Cc: Kashyap Desai <Kashyap.Desai@lsi.com>
> > Cc: Len Brown <lenb@kernel.org>
> > Cc: Ravi Anand <ravi.anand@qlogic.com>
> > Cc: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
> > Cc: Matthew Garrett <mjg@redhat.com>
> > Signed-off-by: Roland Dreier <roland@purestorage.com>
> > ---
> >  arch/x86/Kconfig                 |    2 --
> >  arch/x86/include/asm/io.h        |   24 ++----------------------
> >  drivers/acpi/apei/einj.c         |    8 ++++++++
> >  drivers/acpi/atomicio.c          |    4 ++++
> >  drivers/edac/i3200_edac.c        |   13 +++++++++++++
> >  drivers/platform/x86/ibm_rtl.c   |   13 +++++++++++++
> >  drivers/platform/x86/intel_ips.c |   13 +++++++++++++
> >  drivers/scsi/qla4xxx/ql4_nx.c    |   21 +++++++++++++++++++++
> >  8 files changed, 74 insertions(+), 24 deletions(-)
> 
> Hm, this patch is wider than i thought - might be better to do this via one of 
> the driver trees or -mm?

We don't have a single driver tree covering all of those, so I think the
x86 tree is as good as any other

> The x86 bits are:
> 
> Acked-by: Ingo Molnar <mingo@elte.hu>

For the SCSI bits:

Acked-by: James Bottomley <James.Bottomley@parallels.com>

James



^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2011-05-20 12:07 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-05-04 11:53 [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic Kashyap, Desai
2011-05-17  7:16 ` James Bottomley
2011-05-18  4:07   ` Desai, Kashyap
2011-05-18  4:15     ` Matthew Wilcox
2011-05-18  4:15       ` Matthew Wilcox
2011-05-18  4:23       ` James Bottomley
2011-05-18  4:23         ` James Bottomley
2011-05-18  7:00         ` Benjamin Herrenschmidt
2011-05-18  7:00           ` Benjamin Herrenschmidt
2011-05-18  8:23           ` Milton Miller
2011-05-18  8:23             ` Milton Miller
2011-05-18 15:35             ` Moore, Eric
2011-05-18 15:35               ` Moore, Eric
2011-05-18 18:31               ` Milton Miller
2011-05-18 18:31                 ` Milton Miller
2011-05-18 18:31                 ` Milton Miller
2011-05-18 19:11                 ` Moore, Eric
2011-05-18 19:11                   ` Moore, Eric
2011-05-18 19:11                   ` Moore, Eric
2011-05-19  4:08                   ` Hitoshi Mitake
2011-05-19  4:08                     ` Hitoshi Mitake
2011-05-19  4:46                     ` James Bottomley
2011-05-19  4:46                       ` James Bottomley
2011-05-19  5:36                       ` Benjamin Herrenschmidt
2011-05-19  5:36                         ` Benjamin Herrenschmidt
2011-05-19  8:35                       ` [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq isnot atomic David Laight
2011-05-19  8:35                         ` David Laight
2011-05-19  8:35                         ` David Laight
2011-05-19  4:16                 ` [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic Roland Dreier
2011-05-19  4:16                   ` Roland Dreier
2011-05-19  5:34                   ` Benjamin Herrenschmidt
2011-05-19  5:34                     ` Benjamin Herrenschmidt
2011-05-19 18:15                     ` Ingo Molnar
2011-05-19 18:15                       ` Ingo Molnar
2011-05-19 23:54                       ` [PATCH] x86: Remove 32-bit versions of readq()/writeq() Roland Dreier
2011-05-20  1:15                         ` Hitoshi Mitake
2011-05-20  8:05                           ` Desai, Kashyap
2011-05-20 11:44                         ` Ingo Molnar
2011-05-20 12:03                           ` James Bottomley
2011-05-18 21:30               ` [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic Benjamin Herrenschmidt
2011-05-18 22:05                 ` Moore, Eric
2011-05-18 22:05                   ` Moore, Eric
2011-05-18  8:04         ` [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq isnot atomic David Laight
2011-05-18  8:04           ` David Laight
2011-05-18  5:45       ` [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic Benjamin Herrenschmidt
2011-05-18  5:45         ` Benjamin Herrenschmidt

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.