linux-kernel.vger.kernel.org archive mirror
* dmapool regression in next
From: Tony Lindgren @ 2018-12-06  1:30 UTC
  To: Tony Battersby, Andrew Morton, Stephen Rothwell
  Cc: Andy Shevchenko, Christoph Hellwig, John Garry, Marek Szyprowski,
	Russell King - ARM Linux, linux-kernel, linux-arm-kernel,
	linux-omap

Hi,

Looks like with commit 26abe88e830d ("mm/dmapool.c: improve scalability
of dma_pool_free()") I'm now getting spammed with lots of "(bad vaddr)"
on at least omap4 pandaboard, see below.

Any ideas what might be going wrong?

Regards,

Tony

8< ---------------------
omap-dma-engine 4a056000.dma-controller: dma_pool_free 4a056000.dma-controller, (ptrval) (bad vaddr)/0xbe800000
omap-dma-engine 4a056000.dma-controller: dma_pool_free 4a056000.dma-controller, (ptrval) (bad vaddr)/0xbe80001c
omap-dma-engine 4a056000.dma-controller: dma_pool_free 4a056000.dma-controller, (ptrval) (bad vaddr)/0xbe800038
...


* Re: dmapool regression in next
From: Krzysztof Kozlowski @ 2018-12-06  9:25 UTC
  To: tony
  Cc: tonyb, akpm, Stephen Rothwell, andy.shevchenko, hch, john.garry,
	Marek Szyprowski, linux, linux-kernel, linux-arm-kernel,
	linux-omap

On Thu, 6 Dec 2018 at 02:31, Tony Lindgren <tony@atomide.com> wrote:
>
> Hi,
>
> Looks like with commit 26abe88e830d ("mm/dmapool.c: improve scalability
> of dma_pool_free()") I'm now getting spammed with lots of "(bad vaddr)"
> on at least omap4 pandaboard, see below.
>
> Any ideas what might be going wrong?
>
> Regards,
>
> Tony
>
> 8< ---------------------
> omap-dma-engine 4a056000.dma-controller: dma_pool_free 4a056000.dma-controller, (ptrval) (bad vaddr)/0xbe800000
> omap-dma-engine 4a056000.dma-controller: dma_pool_free 4a056000.dma-controller, (ptrval) (bad vaddr)/0xbe80001c
> omap-dma-engine 4a056000.dma-controller: dma_pool_free 4a056000.dma-controller, (ptrval) (bad vaddr)/0xbe800038
> ...

I see it as well on all my Exynos boards, since yesterday's next. In
my case it is the USB EHCI driver:
exynos-ehci 12110000.usb: dma_pool_free ehci_qtd, (ptrval) (bad vaddr)/0xb8844180
Full log here:
https://krzk.eu/#/builders/1/builds/2937/steps/12/logs/serial0

Best regards,
Krzysztof


* Re: dmapool regression in next
From: Tony Battersby @ 2018-12-06 15:11 UTC
  To: Krzysztof Kozlowski, tony
  Cc: akpm, Stephen Rothwell, andy.shevchenko, hch, john.garry,
	Marek Szyprowski, linux, linux-kernel, linux-arm-kernel,
	linux-omap

On 12/6/18 4:25 AM, Krzysztof Kozlowski wrote:
> On Thu, 6 Dec 2018 at 02:31, Tony Lindgren <tony@atomide.com> wrote:
>> Hi,
>>
>> Looks like with commit 26abe88e830d ("mm/dmapool.c: improve scalability
>> of dma_pool_free()") I'm now getting spammed with lots of "(bad vaddr)"
>> on at least omap4 pandaboard, see below.
>>
>> Any ideas what might be going wrong?
>>
>> Regards,
>>
>> Tony
>>
>> 8< ---------------------
>> omap-dma-engine 4a056000.dma-controller: dma_pool_free 4a056000.dma-controller, (ptrval) (bad vaddr)/0xbe800000
>> omap-dma-engine 4a056000.dma-controller: dma_pool_free 4a056000.dma-controller, (ptrval) (bad vaddr)/0xbe80001c
>> omap-dma-engine 4a056000.dma-controller: dma_pool_free 4a056000.dma-controller, (ptrval) (bad vaddr)/0xbe800038
>> ...
> I see it as well on all my Exynos boards, since yesterday's next. In
> my case it is the USB EHCI driver:
> exynos-ehci 12110000.usb: dma_pool_free ehci_qtd, (ptrval) (bad vaddr)/0xb8844180
> Full log here:
> https://krzk.eu/#/builders/1/builds/2937/steps/12/logs/serial0
>
> Best regards,
> Krzysztof
>
Here is the prototype:

void dma_pool_free(struct dma_pool *pool, void *vaddr, dma_addr_t dma);

With the old code, the 'dma' value had to be correct for use with
pool_find_page(), or else you would get an error.  If the 'vaddr' value
was incorrect, it would corrupt the dmapool freelist, but you wouldn't
get an error unless DMAPOOL_DEBUG was enabled.

With my patch applied, 'vaddr' has to be correct for virt_to_page().  My
code also checks that 'dma' is consistent with 'vaddr' even if
DMAPOOL_DEBUG is disabled, since the check is fast and it will prevent
problems like this in the future.

So if a buggy driver passes in a good value for 'dma' but a bad value
for 'vaddr', then it may have appeared to work previously (but with
possible data corruption, depending on the circumstances), but my patch
will expose the problem.  You can confirm by reverting my dmapool
patches and enabling DMAPOOL_DEBUG, which is at the top of mm/dmapool.c:

#if defined(CONFIG_DEBUG_SLAB) || defined(CONFIG_SLUB_DEBUG_ON)
#define DMAPOOL_DEBUG 1
#endif
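
To make the difference between the two checks concrete, here is a minimal
userspace sketch. It is not the mm/dmapool.c code: every toy_* name is made
up for illustration, and the lookup only models the idea of finding the page
by its 'dma' range and then, optionally, comparing the 'vaddr' offset against
the 'dma' offset (the check_vaddr flag stands in for a debug build).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for the kernel types; all names here are hypothetical. */
typedef uint64_t toy_dma_addr_t;

struct toy_page {
	char *vaddr;         /* CPU address of the backing allocation */
	toy_dma_addr_t dma;  /* device address of the same allocation */
	size_t size;         /* allocation size in bytes */
};

struct toy_pool {
	struct toy_page *pages;
	size_t npages;
};

enum toy_free_result { TOY_OK, TOY_BAD_DMA, TOY_BAD_VADDR };

/* Models a lookup by DMA address: only 'dma' decides which page matches. */
static struct toy_page *toy_find_page(struct toy_pool *pool, toy_dma_addr_t dma)
{
	assert(pool != NULL);
	for (size_t i = 0; i < pool->npages; i++) {
		struct toy_page *p = &pool->pages[i];

		if (dma >= p->dma && dma - p->dma < p->size)
			return p;
	}
	return NULL;
}

/*
 * With check_vaddr == 0 a wrong 'vaddr' goes unnoticed as long as 'dma'
 * is good. With check_vaddr != 0 the offset of 'vaddr' inside the page
 * must equal the offset of 'dma' inside the page.
 */
static enum toy_free_result toy_pool_free(struct toy_pool *pool, void *vaddr,
					  toy_dma_addr_t dma, int check_vaddr)
{
	struct toy_page *p = toy_find_page(pool, dma);

	if (!p)
		return TOY_BAD_DMA;
	if (check_vaddr &&
	    (uintptr_t)vaddr - (uintptr_t)p->vaddr != (uintptr_t)(dma - p->dma))
		return TOY_BAD_VADDR;
	return TOY_OK; /* a real pool would now put the block on its free list */
}
```

In this model a caller that passes a good 'dma' with a mismatched 'vaddr' gets
TOY_OK when the check is off and TOY_BAD_VADDR when it is on, which is the
"appeared to work previously" case described above.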

Tony Battersby




* Re: dmapool regression in next
From: Robin Murphy @ 2018-12-06 15:51 UTC
  To: Tony Battersby, Krzysztof Kozlowski, tony
  Cc: Stephen Rothwell, john.garry, linux, linux-kernel,
	andy.shevchenko, akpm, linux-omap, hch, linux-arm-kernel,
	Marek Szyprowski

On 06/12/2018 15:11, Tony Battersby wrote:
> On 12/6/18 4:25 AM, Krzysztof Kozlowski wrote:
>> On Thu, 6 Dec 2018 at 02:31, Tony Lindgren <tony@atomide.com> wrote:
>>> Hi,
>>>
>>> Looks like with commit 26abe88e830d ("mm/dmapool.c: improve scalability
>>> of dma_pool_free()") I'm now getting spammed with lots of "(bad vaddr)"
>>> on at least omap4 pandaboard, see below.
>>>
>>> Any ideas what might be going wrong?
>>>
>>> Regards,
>>>
>>> Tony
>>>
>>> 8< ---------------------
>>> omap-dma-engine 4a056000.dma-controller: dma_pool_free 4a056000.dma-controller, (ptrval) (bad vaddr)/0xbe800000
>>> omap-dma-engine 4a056000.dma-controller: dma_pool_free 4a056000.dma-controller, (ptrval) (bad vaddr)/0xbe80001c
>>> omap-dma-engine 4a056000.dma-controller: dma_pool_free 4a056000.dma-controller, (ptrval) (bad vaddr)/0xbe800038
>>> ...
>> I see it as well on all my Exynos boards, since yesterday's next. In
>> my case it is the USB EHCI driver:
>> exynos-ehci 12110000.usb: dma_pool_free ehci_qtd, (ptrval) (bad vaddr)/0xb8844180
>> Full log here:
>> https://krzk.eu/#/builders/1/builds/2937/steps/12/logs/serial0
>>
>> Best regards,
>> Krzysztof
>>
> Here is the prototype:
> 
> void dma_pool_free(struct dma_pool *pool, void *vaddr, dma_addr_t dma);
> 
> With the old code, the 'dma' value had to be correct for use with
> pool_find_page(), or else you would get an error.  If the 'vaddr' value
> was incorrect, it would corrupt the dmapool freelist, but you wouldn't
> get an error unless DMAPOOL_DEBUG was enabled.
> 
> With my patch applied, 'vaddr' has to be correct for virt_to_page().  My
> code also checks that 'dma' is consistent with 'vaddr' even if
> DMAPOOL_DEBUG is disabled, since the check is fast and it will prevent
> problems like this in the future.

Unfortunately that logic has a fatal flaw - DMA pools are backed by 
dma_alloc_coherent(), and there is absolutely no guarantee that the 
memory dma_alloc_coherent() returns is backed by a struct page at all. 
Even if it is, there is still absolutely no guarantee that the vaddr 
value it returns is valid for virt_to_page() - on many systems it will 
be in vmalloc or some architecture-specific region of address space.

The problem is not that these drivers are buggy (they're not - the arch 
code is returning a vmalloc()ed non-cacheable remap in the first place), 
it's that 26abe88e830d is fundamentally unworkable and needs reverting. 
Apparently the original patches managed not to catch my eye as something 
I needed to review, sorry about that :(
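
The point about virt_to_page() can be illustrated with a small userspace
model. This is only a sketch under assumed numbers: the TOY_* layout constants
and toy_* helpers are hypothetical and do not correspond to any real
architecture. It shows that a linear-map conversion is plain arithmetic that
"succeeds" on any input, so an address outside the linear map (for example a
vmalloc-style remap) yields an index that refers to nothing meaningful. In the
kernel, helpers such as virt_addr_valid() and is_vmalloc_addr() exist to tell
these cases apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit-style layout, chosen only for illustration. */
#define TOY_PAGE_SHIFT   12
#define TOY_PAGE_OFFSET  UINT32_C(0x80000000) /* start of the linear map */
#define TOY_LINEAR_PAGES UINT32_C(0x30000)    /* pages covered by the linear map */

/* True only for addresses inside the modelled linear map. */
static bool toy_addr_in_linear_map(uint32_t vaddr)
{
	return vaddr >= TOY_PAGE_OFFSET &&
	       ((vaddr - TOY_PAGE_OFFSET) >> TOY_PAGE_SHIFT) < TOY_LINEAR_PAGES;
}

/* Pure arithmetic: it never fails, even when the input is not linear-mapped. */
static uint32_t toy_virt_to_index(uint32_t vaddr)
{
	return (vaddr - TOY_PAGE_OFFSET) >> TOY_PAGE_SHIFT;
}

/* A caller that wants a meaningful index has to validate the address first. */
static uint32_t toy_virt_to_index_checked(uint32_t vaddr)
{
	assert(toy_addr_in_linear_map(vaddr));
	return toy_virt_to_index(vaddr);
}
```

With these assumed constants, an address such as 0x80001000 maps to index 1,
while 0xf0000000 lies outside the linear map and the unchecked conversion
returns 0x70000, which is past the last modelled page.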

Robin.



* Re: dmapool regression in next
From: Tony Battersby @ 2018-12-06 16:13 UTC
  To: Robin Murphy, Krzysztof Kozlowski, tony, akpm
  Cc: Stephen Rothwell, john.garry, linux, linux-kernel,
	andy.shevchenko, linux-omap, hch, linux-arm-kernel,
	Marek Szyprowski, Matthew Wilcox

On 12/6/18 10:51 AM, Robin Murphy wrote:
>> Here is the prototype:
>>
>> void dma_pool_free(struct dma_pool *pool, void *vaddr, dma_addr_t dma);
>>
>> With the old code, the 'dma' value had to be correct for use with
>> pool_find_page(), or else you would get an error.  If the 'vaddr' value
>> was incorrect, it would corrupt the dmapool freelist, but you wouldn't
>> get an error unless DMAPOOL_DEBUG was enabled.
>>
>> With my patch applied, 'vaddr' has to be correct for virt_to_page().  My
>> code also checks that 'dma' is consistent with 'vaddr' even if
>> DMAPOOL_DEBUG is disabled, since the check is fast and it will prevent
>> problems like this in the future.
> Unfortunately that logic has a fatal flaw - DMA pools are backed by 
> dma_alloc_coherent(), and there is absolutely no guarantee that the 
> memory dma_alloc_coherent() returns is backed by a struct page at all. 
> Even if it is, there is still absolutely no guarantee that the vaddr 
> value it returns is valid for virt_to_page() - on many systems it will 
> be in vmalloc or some architecture-specific region of address space.
>
> The problem is not that these drivers are buggy (they're not - the arch 
> code is returning a vmalloc()ed non-cacheable remap in the first place), 
> it's that 26abe88e830d is fundamentally unworkable and needs reverting. 
> Apparently the original patches managed not to catch my eye as something 
> I needed to review, sorry about that :(
>
> Robin.
>
Thanks for the info; the inner workings of the vm system are a bit out
of my area of expertise.  My first version of the patch series used a
different method that didn't rely on virt_to_page(); I will go back to
that version, clean it up, and resubmit when I have time.

Andrew, please revert all 9 patches.  I will resubmit the set when I
have a workable solution.

Tony Battersby



* Re: dmapool regression in next
From: Tony Lindgren @ 2018-12-06 16:33 UTC
  To: Tony Battersby
  Cc: Robin Murphy, Krzysztof Kozlowski, akpm, Stephen Rothwell,
	john.garry, linux, linux-kernel, andy.shevchenko, linux-omap,
	hch, linux-arm-kernel, Marek Szyprowski, Matthew Wilcox

* Tony Battersby <tonyb@cybernetics.com> [181206 16:13]:
> On 12/6/18 10:51 AM, Robin Murphy wrote:
> >> Here is the prototype:
> >>
> >> void dma_pool_free(struct dma_pool *pool, void *vaddr, dma_addr_t dma);
> >>
> >> With the old code, the 'dma' value had to be correct for use with
> >> pool_find_page(), or else you would get an error.  If the 'vaddr' value
> >> was incorrect, it would corrupt the dmapool freelist, but you wouldn't
> >> get an error unless DMAPOOL_DEBUG was enabled.
> >>
> >> With my patch applied, 'vaddr' has to be correct for virt_to_page().  My
> >> code also checks that 'dma' is consistent with 'vaddr' even if
> >> DMAPOOL_DEBUG is disabled, since the check is fast and it will prevent
> >> problems like this in the future.
> > Unfortunately that logic has a fatal flaw - DMA pools are backed by 
> > dma_alloc_coherent(), and there is absolutely no guarantee that the 
> > memory dma_alloc_coherent() returns is backed by a struct page at all. 
> > Even if it is, there is still absolutely no guarantee that the vaddr 
> > value it returns is valid for virt_to_page() - on many systems it will 
> > be in vmalloc or some architecture-specific region of address space.
> >
> > The problem is not that these drivers are buggy (they're not - the arch 
> > code is returning a vmalloc()ed non-cacheable remap in the first place), 
> > it's that 26abe88e830d is fundamentally unworkable and needs reverting. 
> > Apparently the original patches managed not to catch my eye as something 
> > I needed to review, sorry about that :(
> >
> > Robin.
> >
> Thanks for the info; the inner workings of the vm system are a bit out
> of my area of expertise.  My first version of the patch series used a
> different method that didn't rely on virt_to_page(); I will go back to
> that version, clean it up, and resubmit when I have time.
> 
> Andrew, please revert all 9 patches.  I will resubmit the set when I
> have a workable solution.

OK sounds good to me. I can test the new set easily when available
if you Cc me on them.

Thanks,

Tony


* Re: dmapool regression in next
From: Stephen Rothwell @ 2018-12-06 22:10 UTC
  To: Tony Lindgren
  Cc: Tony Battersby, Robin Murphy, Krzysztof Kozlowski, akpm,
	john.garry, linux, linux-kernel, andy.shevchenko, linux-omap,
	hch, linux-arm-kernel, Marek Szyprowski, Matthew Wilcox

Hi all,

On Thu, 6 Dec 2018 08:33:15 -0800 Tony Lindgren <tony@atomide.com> wrote:
>
> * Tony Battersby <tonyb@cybernetics.com> [181206 16:13]:
> > On 12/6/18 10:51 AM, Robin Murphy wrote:  
> > >> Here is the prototype:
> > >>
> > >> void dma_pool_free(struct dma_pool *pool, void *vaddr, dma_addr_t dma);
> > >>
> > >> With the old code, the 'dma' value had to be correct for use with
> > >> pool_find_page(), or else you would get an error.  If the 'vaddr' value
> > >> was incorrect, it would corrupt the dmapool freelist, but you wouldn't
> > >> get an error unless DMAPOOL_DEBUG was enabled.
> > >>
> > >> With my patch applied, 'vaddr' has to be correct for virt_to_page().  My
> > >> code also checks that 'dma' is consistent with 'vaddr' even if
> > >> DMAPOOL_DEBUG is disabled, since the check is fast and it will prevent
> > >> problems like this in the future.  
> > > Unfortunately that logic has a fatal flaw - DMA pools are backed by 
> > > dma_alloc_coherent(), and there is absolutely no guarantee that the 
> > > memory dma_alloc_coherent() returns is backed by a struct page at all. 
> > > Even if it is, there is still absolutely no guarantee that the vaddr 
> > > value it returns is valid for virt_to_page() - on many systems it will 
> > > be in vmalloc or some architecture-specific region of address space.
> > >
> > > The problem is not that these drivers are buggy (they're not - the arch 
> > > code is returning a vmalloc()ed non-cacheable remap in the first place), 
> > > it's that 26abe88e830d is fundamentally unworkable and needs reverting. 
> > > Apparently the original patches managed not to catch my eye as something 
> > > I needed to review, sorry about that :(
> > >
> > > Robin.
> > >  
> > Thanks for the info; the inner workings of the vm system are a bit out
> > of my area of expertise.  My first version of the patch series used a
> > different method that didn't rely on virt_to_page(); I will go back to
> > that version, clean it up, and resubmit when I have time.
> > 
> > Andrew, please revert all 9 patches.  I will resubmit the set when I
> > have a workable solution.  
> 
> OK sounds good to me. I can test the new set easily when available
> if you Cc me on them.

I have removed those patches from linux-next for today.

-- 
Cheers,
Stephen Rothwell

