netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [REGRESSION 4.20] mvneta - DMA-API: device driver tries to sync DMA memory it has not allocated
@ 2019-02-12 13:59 Russell King - ARM Linux admin
  2019-02-15 11:10 ` Russell King - ARM Linux admin
  0 siblings, 1 reply; 2+ messages in thread
From: Russell King - ARM Linux admin @ 2019-02-12 13:59 UTC (permalink / raw)
  To: Thomas Petazzoni, netdev

Hi,

Booting 4.20 on SolidRun Clearfog reliably provokes the following
warning - this is with mvneta built in, but DSA as modules:

WARNING: CPU: 0 PID: 555 at kernel/dma/debug.c:1230 check_sync+0x514/0x5bc
mvneta f1070000.ethernet: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x000000002dd7dc00] [size=240 bytes]
Modules linked in: ahci mv88e6xxx dsa_core xhci_plat_hcd xhci_hcd devlink armada_thermal marvell_cesa des_generic ehci_orion phy_armada38x_comphy mcp3021 spi_orion evbug sfp mdio_i2c ip_tables x_tables
CPU: 0 PID: 555 Comm: bridge-network- Not tainted 4.20.0+ #291
Hardware name: Marvell Armada 380/385 (Device Tree)
[<c0019638>] (unwind_backtrace) from [<c0014888>] (show_stack+0x10/0x14)
[<c0014888>] (show_stack) from [<c07f54e0>] (dump_stack+0x9c/0xd4)
[<c07f54e0>] (dump_stack) from [<c00312bc>] (__warn+0xf8/0x124)
[<c00312bc>] (__warn) from [<c00313b0>] (warn_slowpath_fmt+0x38/0x48)
[<c00313b0>] (warn_slowpath_fmt) from [<c00b0370>] (check_sync+0x514/0x5bc)
[<c00b0370>] (check_sync) from [<c00b04f8>] (debug_dma_sync_single_range_for_cpu+0x6c/0x74)
[<c00b04f8>] (debug_dma_sync_single_range_for_cpu) from [<c051bd14>] (mvneta_poll+0x298/0xf58)
[<c051bd14>] (mvneta_poll) from [<c0656194>] (net_rx_action+0x128/0x424)
[<c0656194>] (net_rx_action) from [<c000a230>] (__do_softirq+0xf0/0x540)
[<c000a230>] (__do_softirq) from [<c00386e0>] (irq_exit+0x124/0x144)
[<c00386e0>] (irq_exit) from [<c009b5e0>] (__handle_domain_irq+0x58/0xb0)
[<c009b5e0>] (__handle_domain_irq) from [<c03a63c4>] (gic_handle_irq+0x48/0x98)
[<c03a63c4>] (gic_handle_irq) from [<c0009a10>] (__irq_svc+0x70/0x98)
...

This appears to be from:

                if (rx_bytes <= rx_copybreak) {
                        /* better copy a small frame and not unmap the DMA region */
                        skb = netdev_alloc_skb_ip_align(dev, rx_bytes);
                        if (unlikely(!skb))
                                goto err_drop_frame_ret_pool;

                        dma_sync_single_range_for_cpu(dev->dev.parent,
                                                      rx_desc->buf_phys_addr,
                                                      MVNETA_MH_SIZE + NET_SKB_PAD,
                                                      rx_bytes,
                                                      DMA_FROM_DEVICE);

which suggests that rx_desc->buf_phys_addr is not something that should
be passed to dma_sync_single_range_for_cpu().  I've not been able to
track down why that is, nor which interface is provoking that.

As I don't have the details of how the buffer management hardware works
on Armada 388, I'm unable to debug this myself.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [REGRESSION 4.20] mvneta - DMA-API: device driver tries to sync DMA memory it has not allocated
  2019-02-12 13:59 [REGRESSION 4.20] mvneta - DMA-API: device driver tries to sync DMA memory it has not allocated Russell King - ARM Linux admin
@ 2019-02-15 11:10 ` Russell King - ARM Linux admin
  0 siblings, 0 replies; 2+ messages in thread
From: Russell King - ARM Linux admin @ 2019-02-15 11:10 UTC (permalink / raw)
  To: Thomas Petazzoni, netdev

On Tue, Feb 12, 2019 at 01:59:19PM +0000, Russell King - ARM Linux admin wrote:
> Hi,
> 
> Booting 4.20 on SolidRun Clearfog reliably provokes the following
> warning - this is with mvneta built in, but DSA as modules:
> 
> WARNING: CPU: 0 PID: 555 at kernel/dma/debug.c:1230 check_sync+0x514/0x5bc
> mvneta f1070000.ethernet: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x000000002dd7dc00] [size=240 bytes]
> Modules linked in: ahci mv88e6xxx dsa_core xhci_plat_hcd xhci_hcd devlink armada_thermal marvell_cesa des_generic ehci_orion phy_armada38x_comphy mcp3021 spi_orion evbug sfp mdio_i2c ip_tables x_tables
> CPU: 0 PID: 555 Comm: bridge-network- Not tainted 4.20.0+ #291
> Hardware name: Marvell Armada 380/385 (Device Tree)
> [<c0019638>] (unwind_backtrace) from [<c0014888>] (show_stack+0x10/0x14)
> [<c0014888>] (show_stack) from [<c07f54e0>] (dump_stack+0x9c/0xd4)
> [<c07f54e0>] (dump_stack) from [<c00312bc>] (__warn+0xf8/0x124)
> [<c00312bc>] (__warn) from [<c00313b0>] (warn_slowpath_fmt+0x38/0x48)
> [<c00313b0>] (warn_slowpath_fmt) from [<c00b0370>] (check_sync+0x514/0x5bc)
> [<c00b0370>] (check_sync) from [<c00b04f8>] (debug_dma_sync_single_range_for_cpu+0x6c/0x74)
> [<c00b04f8>] (debug_dma_sync_single_range_for_cpu) from [<c051bd14>] (mvneta_poll+0x298/0xf58)
> [<c051bd14>] (mvneta_poll) from [<c0656194>] (net_rx_action+0x128/0x424)
> [<c0656194>] (net_rx_action) from [<c000a230>] (__do_softirq+0xf0/0x540)
> [<c000a230>] (__do_softirq) from [<c00386e0>] (irq_exit+0x124/0x144)
> [<c00386e0>] (irq_exit) from [<c009b5e0>] (__handle_domain_irq+0x58/0xb0)
> [<c009b5e0>] (__handle_domain_irq) from [<c03a63c4>] (gic_handle_irq+0x48/0x98)
> [<c03a63c4>] (gic_handle_irq) from [<c0009a10>] (__irq_svc+0x70/0x98)
> ...
> 
> This appears to be from:
> 
>                 if (rx_bytes <= rx_copybreak) {
>                         /* better copy a small frame and not unmap the DMA region */
>                         skb = netdev_alloc_skb_ip_align(dev, rx_bytes);
>                         if (unlikely(!skb))
>                                 goto err_drop_frame_ret_pool;
> 
>                         dma_sync_single_range_for_cpu(dev->dev.parent,
>                                                       rx_desc->buf_phys_addr,
>                                                       MVNETA_MH_SIZE + NET_SKB_PAD,
>                                                       rx_bytes,
>                                                       DMA_FROM_DEVICE);
> 
> which suggests that rx_desc->buf_phys_addr is not something that should
> be passed to dma_sync_single_range_for_cpu().  I've not been able to
> track down why that is, nor which interface is provoking that.
> 
> As I don't have the details of how the buffer management hardware works
> on Armada 388, I'm unable to debug this myself.

Doing what debugging I _can_ do, it seems that this has been a long-term
error in mvneta, but one that was merely uncovered by:

  commit 562e2f467e71f45f0400ebee5077eaa426d3e426
  Author: Yelena Krivosheev <yelena@marvell.com>
  Date:   Wed Jul 18 18:10:57 2018 +0200

The buffer that is being complained about is sync'd using a device of
dev->dev.parent 'f1070000.ethernet', but is allocated by
mvneta_bm_construct() against a different device:

mvneta_bm_construct: 0x2dd85c00 +0x140 for ee113294 (f10c8000.bm)

namely 'f10c8000.bm'.

It's long-term, because it will only trigger in older kernels if we
hit the copy-break stuff, which used to do:

	if (rx_bytes <= rx_copybreak) {
		skb = netdev_alloc_skb_ip_align(dev, rx_bytes);
		if (unlikely(!skb)) {
			...
		}
		
		dma_sync_single_range_for_cpu(dev->dev.parent,
					      phys_addr,
					      MVNETA_MH_SIZE + NET_SKB_PAD,
					      rx_bytes,
					      DMA_FROM_DEVICE);

where rx_copybreak is 256 bytes.  Quite why that hasn't been seen
already, I do not know.

Looking at the code after the commit, if mvneta is used on a
non-coherent platform, then we have problems:

	copy_size = min(skb_size, rx_bytes);
	...
	memcpy(rxq->skb->data, data + MVNETA_MH_SIZE,
	       copy_size);
	...
	if (rxq->left_size == 0) {
		int size = copy_size + MVNETA_MH_SIZE;
		dma_sync_single_range_for_cpu(dev->dev.parent,
					      phys_addr, 0,
					      size,
					      DMA_FROM_DEVICE);

Since the sync is done _after_ we've copied data from a non-coherent
buffer.

If this code has been written to assume that we're always coherent,
then is there any point at all to having the incorrect dma_sync_*()
calls at all?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2019-02-15 11:10 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-12 13:59 [REGRESSION 4.20] mvneta - DMA-API: device driver tries to sync DMA memory it has not allocated Russell King - ARM Linux admin
2019-02-15 11:10 ` Russell King - ARM Linux admin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).