linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned
@ 2015-01-30 19:54 Tim Chen
  2015-01-30 19:58 ` Greg KH
  2015-01-30 21:03 ` Sergei Shtylyov
  0 siblings, 2 replies; 20+ messages in thread
From: Tim Chen @ 2015-01-30 19:54 UTC (permalink / raw)
  To: H. Peter Anvin, Akinobu Mita, Mathias Nyman
  Cc: Andi Kleen, Ingo Molnar, Andrew Morton, Marek Szyprowski,
	Thomas Gleixner, linux-kernel, x86, linux-usb


Commit d92ef66c4f8f ("x86: make dma_alloc_coherent() return zeroed memory
if CMA is enabled") changed the dma_alloc_coherent page clearance from
using an __GFP_ZERO in page allocation to not setting the flag but doing
an explicit memory clear at the end.

However the memory clear only covered the memory size that
was requested, but may not be up to the full extent of the
last page, if the total pages returned exceed the
memory size requested.  This behavior has caused problem with XHCI
and caused it to hang and froze usb:

kernel: xhci_hcd 0000:00:14.0: Stopped the command ring failed, maybe the host is dead
kernel: xhci_hcd 0000:00:14.0: Abort command ring failed
kernel: xhci_hcd 0000:00:14.0: HC died; cleaning up
kernel: xhci_hcd 0000:00:14.0: Error while assigning device slot ID
kernel: xhci_hcd 0000:00:14.0: Max number of devices this xHCI host supports is 64.

Other drivers may have similar issue if it assumes that the pages
allocated are completely zeroed.

This patch ensures that the pages returned are fully cleared.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 arch/x86/kernel/pci-dma.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index a25e202..534470f 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -125,6 +125,8 @@ again:
 
 		return NULL;
 	}
+	/* round up to full page size */
+	size = (((size-1) >> PAGE_SHIFT) + 1) * PAGE_SIZE;
 	memset(page_address(page), 0, size);
 	*dma_addr = addr;
 	return page_address(page);
-- 
1.9.3



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned
  2015-01-30 19:54 [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned Tim Chen
@ 2015-01-30 19:58 ` Greg KH
  2015-01-30 22:01   ` Tim Chen
  2015-01-30 21:03 ` Sergei Shtylyov
  1 sibling, 1 reply; 20+ messages in thread
From: Greg KH @ 2015-01-30 19:58 UTC (permalink / raw)
  To: Tim Chen
  Cc: H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen,
	Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner,
	linux-kernel, x86, linux-usb

On Fri, Jan 30, 2015 at 11:54:01AM -0800, Tim Chen wrote:
> 
> Commit d92ef66c4f8f ("x86: make dma_alloc_coherent() return zeroed memory
> if CMA is enabled") changed the dma_alloc_coherent page clearance from
> using an __GFP_ZERO in page allocation to not setting the flag but doing
> an explicit memory clear at the end.
> 
> However the memory clear only covered the memory size that
> was requested, but may not be up to the full extent of the
> last page, if the total pages returned exceed the
> memory size requested.  This behavior has caused problem with XHCI
> and caused it to hang and froze usb:
> 
> kernel: xhci_hcd 0000:00:14.0: Stopped the command ring failed, maybe the host is dead
> kernel: xhci_hcd 0000:00:14.0: Abort command ring failed
> kernel: xhci_hcd 0000:00:14.0: HC died; cleaning up
> kernel: xhci_hcd 0000:00:14.0: Error while assigning device slot ID
> kernel: xhci_hcd 0000:00:14.0: Max number of devices this xHCI host supports is 64.
> 
> Other drivers may have similar issue if it assumes that the pages
> allocated are completely zeroed.
> 
> This patch ensures that the pages returned are fully cleared.
> 
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> ---
>  arch/x86/kernel/pci-dma.c | 2 ++
>  1 file changed, 2 insertions(+)
> 


Shouldn't this go to stable trees too?

Also, why is the xhci driver not asking for the memory it is going to
need?  If it wants to use the full page, shouldn't it ask for it?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned
  2015-01-30 19:54 [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned Tim Chen
  2015-01-30 19:58 ` Greg KH
@ 2015-01-30 21:03 ` Sergei Shtylyov
  2015-01-30 21:54   ` Tim Chen
  1 sibling, 1 reply; 20+ messages in thread
From: Sergei Shtylyov @ 2015-01-30 21:03 UTC (permalink / raw)
  To: Tim Chen, H. Peter Anvin, Akinobu Mita, Mathias Nyman
  Cc: Andi Kleen, Ingo Molnar, Andrew Morton, Marek Szyprowski,
	Thomas Gleixner, linux-kernel, x86, linux-usb

On 01/30/2015 10:54 PM, Tim Chen wrote:

> Commit d92ef66c4f8f ("x86: make dma_alloc_coherent() return zeroed memory
> if CMA is enabled") changed the dma_alloc_coherent page clearance from
> using an __GFP_ZERO in page allocation to not setting the flag but doing
> an explicit memory clear at the end.

> However the memory clear only covered the memory size that
> was requested, but may not be up to the full extent of the
> last page, if the total pages returned exceed the
> memory size requested.  This behavior has caused problem with XHCI
> and caused it to hang and froze usb:

> kernel: xhci_hcd 0000:00:14.0: Stopped the command ring failed, maybe the host is dead
> kernel: xhci_hcd 0000:00:14.0: Abort command ring failed
> kernel: xhci_hcd 0000:00:14.0: HC died; cleaning up
> kernel: xhci_hcd 0000:00:14.0: Error while assigning device slot ID
> kernel: xhci_hcd 0000:00:14.0: Max number of devices this xHCI host supports is 64.

> Other drivers may have similar issue if it assumes that the pages
> allocated are completely zeroed.

> This patch ensures that the pages returned are fully cleared.

> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> ---
>   arch/x86/kernel/pci-dma.c | 2 ++
>   1 file changed, 2 insertions(+)

> diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
> index a25e202..534470f 100644
> --- a/arch/x86/kernel/pci-dma.c
> +++ b/arch/x86/kernel/pci-dma.c
> @@ -125,6 +125,8 @@ again:
>
>   		return NULL;
>   	}
> +	/* round up to full page size */
> +	size = (((size-1) >> PAGE_SHIFT) + 1) * PAGE_SIZE;

    This is quite suboptimal formula, we can do without shifts and 
multiplications (hopefully, still converted to shifts by gcc):

	size = (size + ~PAGE_MASK) & PAGE_MASK;

WBR, Sergei


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned
  2015-01-30 21:03 ` Sergei Shtylyov
@ 2015-01-30 21:54   ` Tim Chen
  2015-02-02 14:02     ` Jiri Slaby
  0 siblings, 1 reply; 20+ messages in thread
From: Tim Chen @ 2015-01-30 21:54 UTC (permalink / raw)
  To: Sergei Shtylyov
  Cc: H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen,
	Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner,
	linux-kernel, x86, linux-usb, stable

On Sat, 2015-01-31 at 00:03 +0300, Sergei Shtylyov wrote:
> On 01/30/2015 10:54 PM, Tim Chen wrote:

> >
> >   		return NULL;
> >   	}
> > +	/* round up to full page size */
> > +	size = (((size-1) >> PAGE_SHIFT) + 1) * PAGE_SIZE;
> 
>     This is quite suboptimal formula, we can do without shifts and 
> multiplications (hopefully, still converted to shifts by gcc):
> 
> 	size = (size + ~PAGE_MASK) & PAGE_MASK;

Agree.  I've updated patch below

Thanks.

Tim

--->8---

From: Tim Chen <tim.c.chen@linux.intel.com>
Subject: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned


Commit d92ef66c4f8f ("x86: make dma_alloc_coherent() return zeroed memory
if CMA is enabled") changed the dma_alloc_coherent page clearance from
using an __GFP_ZERO in page allocation to not setting the flag but doing
an explicit memory clear at the end.

However the memory clear only covered the memory size that
was requested, but may not be up to the full extent of the
last page, if the total pages returned exceed the
memory size requested.  This behavior has caused problem with XHCI
and caused it to hang:

kernel: xhci_hcd 0000:00:14.0: Stopped the command ring failed, maybe the host is dead
kernel: xhci_hcd 0000:00:14.0: Abort command ring failed
kernel: xhci_hcd 0000:00:14.0: HC died; cleaning up
kernel: xhci_hcd 0000:00:14.0: Error while assigning device slot ID
kernel: xhci_hcd 0000:00:14.0: Max number of devices this xHCI host supports is 64.

Other drivers may have similar issue if it assumes that the pages
allocated are completely zeroed.

This patch ensures that the pages returned are fully cleared.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: <stable@vger.kernel.org> # 3.16+

---
 arch/x86/kernel/pci-dma.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index a25e202..e9d8dee 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -125,6 +125,8 @@ again:
 
 		return NULL;
 	}
+	/* round up to full page size */
+	size = (size + ~PAGE_MASK) & PAGE_MASK;
 	memset(page_address(page), 0, size);
 	*dma_addr = addr;
 	return page_address(page);
-- 
1.9.3




^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned
  2015-01-30 19:58 ` Greg KH
@ 2015-01-30 22:01   ` Tim Chen
  2015-01-30 22:07     ` Greg KH
  0 siblings, 1 reply; 20+ messages in thread
From: Tim Chen @ 2015-01-30 22:01 UTC (permalink / raw)
  To: Greg KH
  Cc: H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen,
	Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner,
	linux-kernel, x86, linux-usb

On Fri, 2015-01-30 at 11:58 -0800, Greg KH wrote:

> 
> Shouldn't this go to stable trees too?
> 

Yes. Added the stable tag in updated patch that's resent.

> Also, why is the xhci driver not asking for the memory it is going to
> need?  If it wants to use the full page, shouldn't it ask for it?
> 

I agree that xhci should have done that, but it didn't.  Commit 
d92ef66c4f8f ("x86: make dma_alloc_coherent() return zeroed memory 
if CMA is enabled") changed the behavior of dma_alloc_coherent
by clearing only the memory being asked for.

So for backward compatibility, clearing the pages 
completely to revert to dma_alloc_coherent's original
behavior is probably the safe thing to do.

Thanks.

Tim


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned
  2015-01-30 22:01   ` Tim Chen
@ 2015-01-30 22:07     ` Greg KH
  2015-01-30 22:16       ` Tim Chen
  2015-01-30 22:39       ` Andi Kleen
  0 siblings, 2 replies; 20+ messages in thread
From: Greg KH @ 2015-01-30 22:07 UTC (permalink / raw)
  To: Tim Chen
  Cc: H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen,
	Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner,
	linux-kernel, x86, linux-usb

On Fri, Jan 30, 2015 at 02:01:58PM -0800, Tim Chen wrote:
> On Fri, 2015-01-30 at 11:58 -0800, Greg KH wrote:
> 
> > 
> > Shouldn't this go to stable trees too?
> > 
> 
> Yes. Added the stable tag in updated patch that's resent.
> 
> > Also, why is the xhci driver not asking for the memory it is going to
> > need?  If it wants to use the full page, shouldn't it ask for it?
> > 
> 
> I agree that xhci should have done that, but it didn't.  Commit 
> d92ef66c4f8f ("x86: make dma_alloc_coherent() return zeroed memory 
> if CMA is enabled") changed the behavior of dma_alloc_coherent
> by clearing only the memory being asked for.
> 
> So for backward compatibility, clearing the pages 
> completely to revert to dma_alloc_coherent's original
> behavior is probably the safe thing to do.

We don't "need" any backward compatility, why not fix the broken drivers
that are using memory outside of what they are asking for?  That's not
ok no matter what, right?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned
  2015-01-30 22:07     ` Greg KH
@ 2015-01-30 22:16       ` Tim Chen
  2015-01-30 22:39       ` Andi Kleen
  1 sibling, 0 replies; 20+ messages in thread
From: Tim Chen @ 2015-01-30 22:16 UTC (permalink / raw)
  To: Greg KH
  Cc: H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen,
	Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner,
	linux-kernel, x86, linux-usb

On Fri, 2015-01-30 at 14:07 -0800, Greg KH wrote:
> On Fri, Jan 30, 2015 at 02:01:58PM -0800, Tim Chen wrote:
> > On Fri, 2015-01-30 at 11:58 -0800, Greg KH wrote:
> > 
> > > 
> > > Shouldn't this go to stable trees too?
> > > 
> > 
> > Yes. Added the stable tag in updated patch that's resent.
> > 
> > > Also, why is the xhci driver not asking for the memory it is going to
> > > need?  If it wants to use the full page, shouldn't it ask for it?
> > > 
> > 
> > I agree that xhci should have done that, but it didn't.  Commit 
> > d92ef66c4f8f ("x86: make dma_alloc_coherent() return zeroed memory 
> > if CMA is enabled") changed the behavior of dma_alloc_coherent
> > by clearing only the memory being asked for.
> > 
> > So for backward compatibility, clearing the pages 
> > completely to revert to dma_alloc_coherent's original
> > behavior is probably the safe thing to do.
> 
> We don't "need" any backward compatility, why not fix the broken drivers
> that are using memory outside of what they are asking for?  That's not
> ok no matter what, right?

Not disagreeing with you.  I'll be equally happy if the xhci folks
can fix the driver.  Mathias?

Tim


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned
  2015-01-30 22:07     ` Greg KH
  2015-01-30 22:16       ` Tim Chen
@ 2015-01-30 22:39       ` Andi Kleen
  2015-01-31 17:14         ` Alan Stern
  1 sibling, 1 reply; 20+ messages in thread
From: Andi Kleen @ 2015-01-30 22:39 UTC (permalink / raw)
  To: Greg KH
  Cc: Tim Chen, H. Peter Anvin, Akinobu Mita, Mathias Nyman,
	Andi Kleen, Ingo Molnar, Andrew Morton, Marek Szyprowski,
	Thomas Gleixner, linux-kernel, x86, linux-usb

> We don't "need" any backward compatility, why not fix the broken drivers
> that are using memory outside of what they are asking for?  That's not
> ok no matter what, right?

How would you find them all?

We don't even know which place in XHCI is the culprit here.

Yes iff you can find them it would be good to fix.

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned
  2015-01-30 22:39       ` Andi Kleen
@ 2015-01-31 17:14         ` Alan Stern
  0 siblings, 0 replies; 20+ messages in thread
From: Alan Stern @ 2015-01-31 17:14 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Greg KH, Tim Chen, H. Peter Anvin, Akinobu Mita, Mathias Nyman,
	Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner,
	linux-kernel, x86, linux-usb

On Fri, 30 Jan 2015, Andi Kleen wrote:

> > We don't "need" any backward compatility, why not fix the broken drivers
> > that are using memory outside of what they are asking for?  That's not
> > ok no matter what, right?
> 
> How would you find them all?
> 
> We don't even know which place in XHCI is the culprit here.

It shouldn't be too hard to find, expecially if the fault can easily 
be reproduced.

There are only about half a dozen calls to dma_alloc_coherent in the
xhci-hcd driver.  Just add 64 KB to the requested memory size in each
of them.  Then one-by-one (or using bisection) remove the extra size
and see which spots trigger the fault.

Alan Stern


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned
  2015-01-30 21:54   ` Tim Chen
@ 2015-02-02 14:02     ` Jiri Slaby
  2015-02-02 16:39       ` Sergei Shtylyov
  0 siblings, 1 reply; 20+ messages in thread
From: Jiri Slaby @ 2015-02-02 14:02 UTC (permalink / raw)
  To: Tim Chen, Sergei Shtylyov
  Cc: H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen,
	Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner,
	linux-kernel, x86, linux-usb, stable

On 01/30/2015, 10:54 PM, Tim Chen wrote:
> On Sat, 2015-01-31 at 00:03 +0300, Sergei Shtylyov wrote:
>> On 01/30/2015 10:54 PM, Tim Chen wrote:
> 
>>>
>>>   		return NULL;
>>>   	}
>>> +	/* round up to full page size */
>>> +	size = (((size-1) >> PAGE_SHIFT) + 1) * PAGE_SIZE;
>>
>>     This is quite suboptimal formula, we can do without shifts and 
>> multiplications (hopefully, still converted to shifts by gcc):
>>
>> 	size = (size + ~PAGE_MASK) & PAGE_MASK;
> 
> Agree.  I've updated patch below
> 
> Thanks.
> 
> Tim
> 
> --->8---
> 
> From: Tim Chen <tim.c.chen@linux.intel.com>
> Subject: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned
> 
> 
> Commit d92ef66c4f8f ("x86: make dma_alloc_coherent() return zeroed memory
> if CMA is enabled") changed the dma_alloc_coherent page clearance from
> using an __GFP_ZERO in page allocation to not setting the flag but doing
> an explicit memory clear at the end.
> 
> However the memory clear only covered the memory size that
> was requested, but may not be up to the full extent of the
> last page, if the total pages returned exceed the
> memory size requested.  This behavior has caused problem with XHCI
> and caused it to hang:
> 
> kernel: xhci_hcd 0000:00:14.0: Stopped the command ring failed, maybe the host is dead
> kernel: xhci_hcd 0000:00:14.0: Abort command ring failed
> kernel: xhci_hcd 0000:00:14.0: HC died; cleaning up
> kernel: xhci_hcd 0000:00:14.0: Error while assigning device slot ID
> kernel: xhci_hcd 0000:00:14.0: Max number of devices this xHCI host supports is 64.
> 
> Other drivers may have similar issue if it assumes that the pages
> allocated are completely zeroed.
> 
> This patch ensures that the pages returned are fully cleared.
> 
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> Cc: <stable@vger.kernel.org> # 3.16+
> 
> ---
>  arch/x86/kernel/pci-dma.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
> index a25e202..e9d8dee 100644
> --- a/arch/x86/kernel/pci-dma.c
> +++ b/arch/x86/kernel/pci-dma.c
> @@ -125,6 +125,8 @@ again:
>  
>  		return NULL;
>  	}
> +	/* round up to full page size */
> +	size = (size + ~PAGE_MASK) & PAGE_MASK;

Hi, is this an open-coded version of PAGE_ALIGN?

>  	memset(page_address(page), 0, size);
>  	*dma_addr = addr;
>  	return page_address(page);
> 


-- 
js
suse labs

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned
  2015-02-02 14:02     ` Jiri Slaby
@ 2015-02-02 16:39       ` Sergei Shtylyov
  2015-02-04 18:30         ` Tim Chen
  0 siblings, 1 reply; 20+ messages in thread
From: Sergei Shtylyov @ 2015-02-02 16:39 UTC (permalink / raw)
  To: Jiri Slaby, Tim Chen
  Cc: H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen,
	Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner,
	linux-kernel, x86, linux-usb, stable

Hello.

On 02/02/2015 05:02 PM, Jiri Slaby wrote:

>> From: Tim Chen <tim.c.chen@linux.intel.com>
>> Subject: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned


>> Commit d92ef66c4f8f ("x86: make dma_alloc_coherent() return zeroed memory
>> if CMA is enabled") changed the dma_alloc_coherent page clearance from
>> using an __GFP_ZERO in page allocation to not setting the flag but doing
>> an explicit memory clear at the end.

>> However the memory clear only covered the memory size that
>> was requested, but may not be up to the full extent of the
>> last page, if the total pages returned exceed the
>> memory size requested.  This behavior has caused problem with XHCI
>> and caused it to hang:

>> kernel: xhci_hcd 0000:00:14.0: Stopped the command ring failed, maybe the host is dead
>> kernel: xhci_hcd 0000:00:14.0: Abort command ring failed
>> kernel: xhci_hcd 0000:00:14.0: HC died; cleaning up
>> kernel: xhci_hcd 0000:00:14.0: Error while assigning device slot ID
>> kernel: xhci_hcd 0000:00:14.0: Max number of devices this xHCI host supports is 64.

>> Other drivers may have similar issue if it assumes that the pages
>> allocated are completely zeroed.

>> This patch ensures that the pages returned are fully cleared.

>> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
>> Cc: <stable@vger.kernel.org> # 3.16+

>> ---
>>   arch/x86/kernel/pci-dma.c | 2 ++
>>   1 file changed, 2 insertions(+)

>> diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
>> index a25e202..e9d8dee 100644
>> --- a/arch/x86/kernel/pci-dma.c
>> +++ b/arch/x86/kernel/pci-dma.c
>> @@ -125,6 +125,8 @@ again:
>>
>>   		return NULL;
>>   	}
>> +	/* round up to full page size */
>> +	size = (size + ~PAGE_MASK) & PAGE_MASK;

> Hi, is this an open-coded version of PAGE_ALIGN?

    Yes, it appears so. :-)

WBR, Sergei


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned
  2015-02-02 16:39       ` Sergei Shtylyov
@ 2015-02-04 18:30         ` Tim Chen
  2015-02-18 19:40           ` Tim Chen
  0 siblings, 1 reply; 20+ messages in thread
From: Tim Chen @ 2015-02-04 18:30 UTC (permalink / raw)
  To: Sergei Shtylyov
  Cc: Jiri Slaby, H. Peter Anvin, Akinobu Mita, Mathias Nyman,
	Andi Kleen, Ingo Molnar, Andrew Morton, Marek Szyprowski,
	Thomas Gleixner, linux-kernel, x86, linux-usb, stable

On Mon, 2015-02-02 at 19:39 +0300, Sergei Shtylyov wrote:

> 
> > Hi, is this an open-coded version of PAGE_ALIGN?
> 
>     Yes, it appears so. :-)
> 
> WBR, Sergei
> 

Thanks for the suggestion by Jiri. I updated the patch to use PAGE_ALIGN
below.

Regards,
Tim

--->8---
From: Tim Chen <tim.c.chen@linux.intel.com>
Subject: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned

Commit d92ef66c4f8f ("x86: make dma_alloc_coherent() return zeroed memory
if CMA is enabled") changed the dma_alloc_coherent page clearance from
using an __GFP_ZERO in page allocation to not setting the flag but doing
an explicit memory clear at the end.

However the memory clear only covered the memory size that
was requested, but may not be up to the full extent of the
last page, if the total pages returned exceed the
memory size requested.  This behavior has caused problem with XHCI
and caused it to hang:

kernel: xhci_hcd 0000:00:14.0: Stopped the command ring failed, maybe the host is dead
kernel: xhci_hcd 0000:00:14.0: Abort command ring failed
kernel: xhci_hcd 0000:00:14.0: HC died; cleaning up
kernel: xhci_hcd 0000:00:14.0: Error while assigning device slot ID
kernel: xhci_hcd 0000:00:14.0: Max number of devices this xHCI host supports is 64.

Other drivers may have similar issue if it assumes that the pages
allocated are completely zeroed.

This patch ensures that the pages returned are fully cleared.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: stable@vger.kernel.org
---
 arch/x86/kernel/pci-dma.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index a25e202..3bdee55 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -125,6 +125,8 @@ again:
 
 		return NULL;
 	}
+	/* round up to full page size */
+	size = PAGE_ALIGN(size);
 	memset(page_address(page), 0, size);
 	*dma_addr = addr;
 	return page_address(page);
-- 
1.9.3

 


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned
  2015-02-04 18:30         ` Tim Chen
@ 2015-02-18 19:40           ` Tim Chen
  2015-02-18 19:53             ` Alan Stern
  0 siblings, 1 reply; 20+ messages in thread
From: Tim Chen @ 2015-02-18 19:40 UTC (permalink / raw)
  To: Sergei Shtylyov, Greg Kroah-Hartman
  Cc: Jiri Slaby, H. Peter Anvin, Akinobu Mita, Mathias Nyman,
	Andi Kleen, Ingo Molnar, Andrew Morton, Marek Szyprowski,
	Thomas Gleixner, linux-kernel, x86, linux-usb, stable,
	Alan Stern

On Wed, 2015-02-04 at 10:30 -0800, Tim Chen wrote:
> On Mon, 2015-02-02 at 19:39 +0300, Sergei Shtylyov wrote:
> 
> > 
> > > Hi, is this an open-coded version of PAGE_ALIGN?
> > 
> >     Yes, it appears so. :-)
> > 
> > WBR, Sergei
> > 
> 
> Thanks for the suggestion by Jiri. I updated the patch to use PAGE_ALIGN
> below.
> 
> Regards,
> Tim
> 

Is there any resolution on this patch?  I haven't seen fixes from the
XHCI folks yet.  This is breaking many of our systems.

Thanks.

Tim

> --->8---
> From: Tim Chen <tim.c.chen@linux.intel.com>
> Subject: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned
> 
> Commit d92ef66c4f8f ("x86: make dma_alloc_coherent() return zeroed memory
> if CMA is enabled") changed the dma_alloc_coherent page clearance from
> using an __GFP_ZERO in page allocation to not setting the flag but doing
> an explicit memory clear at the end.
> 
> However the memory clear only covered the memory size that
> was requested, but may not be up to the full extent of the
> last page, if the total pages returned exceed the
> memory size requested.  This behavior has caused problem with XHCI
> and caused it to hang:
> 
> kernel: xhci_hcd 0000:00:14.0: Stopped the command ring failed, maybe the host is dead
> kernel: xhci_hcd 0000:00:14.0: Abort command ring failed
> kernel: xhci_hcd 0000:00:14.0: HC died; cleaning up
> kernel: xhci_hcd 0000:00:14.0: Error while assigning device slot ID
> kernel: xhci_hcd 0000:00:14.0: Max number of devices this xHCI host supports is 64.
> 
> Other drivers may have similar issue if it assumes that the pages
> allocated are completely zeroed.
> 
> This patch ensures that the pages returned are fully cleared.
> 
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> Cc: stable@vger.kernel.org
> ---
>  arch/x86/kernel/pci-dma.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
> index a25e202..3bdee55 100644
> --- a/arch/x86/kernel/pci-dma.c
> +++ b/arch/x86/kernel/pci-dma.c
> @@ -125,6 +125,8 @@ again:
>  
>  		return NULL;
>  	}
> +	/* round up to full page size */
> +	size = PAGE_ALIGN(size);
>  	memset(page_address(page), 0, size);
>  	*dma_addr = addr;
>  	return page_address(page);



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned
  2015-02-18 19:40           ` Tim Chen
@ 2015-02-18 19:53             ` Alan Stern
  2015-02-18 20:19               ` Tim Chen
  0 siblings, 1 reply; 20+ messages in thread
From: Alan Stern @ 2015-02-18 19:53 UTC (permalink / raw)
  To: Tim Chen
  Cc: Sergei Shtylyov, Greg Kroah-Hartman, Jiri Slaby, H. Peter Anvin,
	Akinobu Mita, Mathias Nyman, Andi Kleen, Ingo Molnar,
	Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel,
	x86, linux-usb, stable

On Wed, 18 Feb 2015, Tim Chen wrote:

> On Wed, 2015-02-04 at 10:30 -0800, Tim Chen wrote:
> > On Mon, 2015-02-02 at 19:39 +0300, Sergei Shtylyov wrote:
> > 
> > > 
> > > > Hi, is this an open-coded version of PAGE_ALIGN?
> > > 
> > >     Yes, it appears so. :-)
> > > 
> > > WBR, Sergei
> > > 
> > 
> > Thanks for the suggestion by Jiri. I updated the patch to use PAGE_ALIGN
> > below.
> > 
> > Regards,
> > Tim
> > 
> 
> Is there any resolution on this patch?  I haven't seen fixes from the
> XHCI folks yet.  This is breaking many of our systems.

Have you tried doing the experiments I suggested in

	http://marc.info/?l=linux-usb&m=142272448620716&w=2

to determine where the problem occurs?

Alan Stern


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned
  2015-02-18 19:53             ` Alan Stern
@ 2015-02-18 20:19               ` Tim Chen
  2015-02-18 20:36                 ` Greg Kroah-Hartman
                                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Tim Chen @ 2015-02-18 20:19 UTC (permalink / raw)
  To: Alan Stern
  Cc: Sergei Shtylyov, Greg Kroah-Hartman, Jiri Slaby, H. Peter Anvin,
	Akinobu Mita, Mathias Nyman, Andi Kleen, Ingo Molnar,
	Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel,
	x86, linux-usb, stable

On Wed, 2015-02-18 at 14:53 -0500, Alan Stern wrote:
> On Wed, 18 Feb 2015, Tim Chen wrote:
> 
> > On Wed, 2015-02-04 at 10:30 -0800, Tim Chen wrote:
> > > On Mon, 2015-02-02 at 19:39 +0300, Sergei Shtylyov wrote:
> > > 
> > > > 
> > > > > Hi, is this an open-coded version of PAGE_ALIGN?
> > > > 
> > > >     Yes, it appears so. :-)
> > > > 
> > > > WBR, Sergei
> > > > 
> > > 
> > > Thanks for the suggestion by Jiri. I updated the patch to use PAGE_ALIGN
> > > below.
> > > 
> > > Regards,
> > > Tim
> > > 
> > 
> > Is there any resolution on this patch?  I haven't seen fixes from the
> > XHCI folks yet.  This is breaking many of our systems.
> 
> Have you tried doing the experiments I suggested in
> 
>         http://marc.info/?l=linux-usb&m=142272448620716&w=2
> 
> to determine where the problem occurs?
> 

I was bogged down with other things lately and I haven't got a chance to
test that.  But as you said, there's very few places where xhci 
call this memory allocation.  So I think the problem has been fairly
narrowed down for the XHCI folks.

Tim

> Alan Stern
> 



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned
  2015-02-18 20:19               ` Tim Chen
@ 2015-02-18 20:36                 ` Greg Kroah-Hartman
  2015-02-18 20:39                 ` Andi Kleen
  2015-02-18 20:45                 ` Alan Stern
  2 siblings, 0 replies; 20+ messages in thread
From: Greg Kroah-Hartman @ 2015-02-18 20:36 UTC (permalink / raw)
  To: Tim Chen
  Cc: Alan Stern, Sergei Shtylyov, Jiri Slaby, H. Peter Anvin,
	Akinobu Mita, Mathias Nyman, Andi Kleen, Ingo Molnar,
	Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel,
	x86, linux-usb, stable

On Wed, Feb 18, 2015 at 12:19:03PM -0800, Tim Chen wrote:
> On Wed, 2015-02-18 at 14:53 -0500, Alan Stern wrote:
> > On Wed, 18 Feb 2015, Tim Chen wrote:
> > 
> > > On Wed, 2015-02-04 at 10:30 -0800, Tim Chen wrote:
> > > > On Mon, 2015-02-02 at 19:39 +0300, Sergei Shtylyov wrote:
> > > > 
> > > > > 
> > > > > > Hi, is this an open-coded version of PAGE_ALIGN?
> > > > > 
> > > > >     Yes, it appears so. :-)
> > > > > 
> > > > > WBR, Sergei
> > > > > 
> > > > 
> > > > Thanks for the suggestion by Jiri. I updated the patch to use PAGE_ALIGN
> > > > below.
> > > > 
> > > > Regards,
> > > > Tim
> > > > 
> > > 
> > > Is there any resolution on this patch?  I haven't seen fixes from the
> > > XHCI folks yet.  This is breaking many of our systems.
> > 
> > Have you tried doing the experiments I suggested in
> > 
> >         http://marc.info/?l=linux-usb&m=142272448620716&w=2
> > 
> > to determine where the problem occurs?
> > 
> 
> I was bogged down with other things lately and I haven't got a chance to
> test that.  But as you said, there's very few places where xhci 
> call this memory allocation.  So I think the problem has been fairly
> narrowed down for the XHCI folks.

The "XHCI folks" are in a cube near you, go poke them in person if they
aren't answering their emails :)

good luck,

greg k-h

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned
  2015-02-18 20:19               ` Tim Chen
  2015-02-18 20:36                 ` Greg Kroah-Hartman
@ 2015-02-18 20:39                 ` Andi Kleen
  2015-02-18 21:15                   ` Alan Stern
  2015-02-18 20:45                 ` Alan Stern
  2 siblings, 1 reply; 20+ messages in thread
From: Andi Kleen @ 2015-02-18 20:39 UTC (permalink / raw)
  To: Tim Chen
  Cc: Alan Stern, Sergei Shtylyov, Greg Kroah-Hartman, Jiri Slaby,
	H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen,
	Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner,
	linux-kernel, x86, linux-usb, stable, torvalds

> > Have you tried doing the experiments I suggested in
> > 
> >         http://marc.info/?l=linux-usb&m=142272448620716&w=2
> > 
> > to determine where the problem occurs?
> > 
> 
> I was bogged down with other things lately and I haven't got a chance to
> test that.  But as you said, there's very few places where xhci 
> call this memory allocation.  So I think the problem has been fairly
> narrowed down for the XHCI folks.

Also I don't really understand why we're even discussing this. The patch
only makes an widely used API behave as it was before. Who knows who
else was broken with this change. There's no sane way to audit all
users. There is no real advantage of the new behavior.

The only good way is to revert to old behavior, like in Tim's
original patch. And doing it quickly for mainline and stable.

FWIW we have a large number of systems here that are broken
without this change.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned
  2015-02-18 20:19               ` Tim Chen
  2015-02-18 20:36                 ` Greg Kroah-Hartman
  2015-02-18 20:39                 ` Andi Kleen
@ 2015-02-18 20:45                 ` Alan Stern
  2015-02-18 23:59                   ` Tim Chen
  2 siblings, 1 reply; 20+ messages in thread
From: Alan Stern @ 2015-02-18 20:45 UTC (permalink / raw)
  To: Tim Chen
  Cc: Sergei Shtylyov, Greg Kroah-Hartman, Jiri Slaby, H. Peter Anvin,
	Akinobu Mita, Mathias Nyman, Andi Kleen, Ingo Molnar,
	Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel,
	x86, linux-usb, stable

On Wed, 18 Feb 2015, Tim Chen wrote:

> > Have you tried doing the experiments I suggested in
> > 
> >         http://marc.info/?l=linux-usb&m=142272448620716&w=2
> > 
> > to determine where the problem occurs?
> > 
> 
> I was bogged down with other things lately and I haven't got a chance to
> test that.  But as you said, there's very few places where xhci 
> call this memory allocation.  So I think the problem has been fairly
> narrowed down for the XHCI folks.

I disagree.  _You_ reported the error.  How can you expect other people 
to figure out where it is with no help from you?

I looked briefly at the xhci-hcd DMA allocation code.  It does not 
contain any obvious mistakes.

Alan Stern


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned
  2015-02-18 20:39                 ` Andi Kleen
@ 2015-02-18 21:15                   ` Alan Stern
  0 siblings, 0 replies; 20+ messages in thread
From: Alan Stern @ 2015-02-18 21:15 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Tim Chen, Sergei Shtylyov, Greg Kroah-Hartman, Jiri Slaby,
	H. Peter Anvin, Akinobu Mita, Mathias Nyman, Ingo Molnar,
	Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel,
	x86, linux-usb, stable, torvalds

On Wed, 18 Feb 2015, Andi Kleen wrote:

> > > Have you tried doing the experiments I suggested in
> > > 
> > >         http://marc.info/?l=linux-usb&m=142272448620716&w=2
> > > 
> > > to determine where the problem occurs?
> > > 
> > 
> > I was bogged down with other things lately and I haven't got a chance to
> > test that.  But as you said, there's very few places where xhci 
> > call this memory allocation.  So I think the problem has been fairly
> > narrowed down for the XHCI folks.
> 
> Also I don't really understand why we're even discussing this. The patch
> only makes an widely used API behave as it was before. Who knows who
> else was broken with this change. There's no sane way to audit all
> users. There is no real advantage of the new behavior.

We are discussing it because fixing problems is better than papering
around them.

> The only good way is to revert to old behavior, like in Tim's
> original patch. And doing it quickly for mainline and stable.

I will agree that applying the patch is a reasonable thing to do.  
However, I also believe that it is important to fix the bugs revealed 
by the API change.

> FWIW we have a large number of systems here that are broken
> without this change.

For all you know, they will still be broken even after the change is 
applied.  The breakage may become less obvious, but that doesn't mean 
it will disappear entirely.

Alan Stern


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned
  2015-02-18 20:45                 ` Alan Stern
@ 2015-02-18 23:59                   ` Tim Chen
  0 siblings, 0 replies; 20+ messages in thread
From: Tim Chen @ 2015-02-18 23:59 UTC (permalink / raw)
  To: Alan Stern
  Cc: Sergei Shtylyov, Greg Kroah-Hartman, Jiri Slaby, H. Peter Anvin,
	Akinobu Mita, Mathias Nyman, Andi Kleen, Ingo Molnar,
	Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel,
	x86, linux-usb, stable

On Wed, 2015-02-18 at 15:45 -0500, Alan Stern wrote:
> On Wed, 18 Feb 2015, Tim Chen wrote:
> 
> > > Have you tried doing the experiments I suggested in
> > > 
> > >         http://marc.info/?l=linux-usb&m=142272448620716&w=2
> > > 
> > > to determine where the problem occurs?
> > > 
> > 
> > I was bogged down with other things lately and I haven't got a chance to
> > test that.  But as you said, there's very few places where xhci 
> > call this memory allocation.  So I think the problem has been fairly
> > narrowed down for the XHCI folks.
> 
> I disagree.  _You_ reported the error.  How can you expect other people 
> to figure out where it is with no help from you?
> 
> I looked briefly at the xhci-hcd DMA allocation code.  It does not 
> contain any obvious mistakes.
> 
> Alan Stern
> 

The error and my quick fix is right here.  And xhci still needs to be
fixed properly.  I'll send out the patch below in a proper patch.

Tim

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 5cb3d7a..39e7196 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -1658,7 +1658,7 @@ static int scratchpad_alloc(struct xhci_hcd *xhci,
gfp_t flags)
                goto fail_sp;

        xhci->scratchpad->sp_array = dma_alloc_coherent(dev,
-                                    num_sp * sizeof(u64),
+                                    PAGE_ALIGN(num_sp * sizeof(u64)),
                                     &xhci->scratchpad->sp_dma, flags);
        if (!xhci->scratchpad->sp_array)





^ permalink raw reply related	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2015-02-18 23:59 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-30 19:54 [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned Tim Chen
2015-01-30 19:58 ` Greg KH
2015-01-30 22:01   ` Tim Chen
2015-01-30 22:07     ` Greg KH
2015-01-30 22:16       ` Tim Chen
2015-01-30 22:39       ` Andi Kleen
2015-01-31 17:14         ` Alan Stern
2015-01-30 21:03 ` Sergei Shtylyov
2015-01-30 21:54   ` Tim Chen
2015-02-02 14:02     ` Jiri Slaby
2015-02-02 16:39       ` Sergei Shtylyov
2015-02-04 18:30         ` Tim Chen
2015-02-18 19:40           ` Tim Chen
2015-02-18 19:53             ` Alan Stern
2015-02-18 20:19               ` Tim Chen
2015-02-18 20:36                 ` Greg Kroah-Hartman
2015-02-18 20:39                 ` Andi Kleen
2015-02-18 21:15                   ` Alan Stern
2015-02-18 20:45                 ` Alan Stern
2015-02-18 23:59                   ` Tim Chen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).