From: "Marek Behún" <kabel@kernel.org>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Christoph Hellwig <hch@lst.de>, Arnd Bergmann <arnd@kernel.org>,
	Andre Przywara <andre.przywara@arm.com>,
	Marc Zyngier <maz@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Russell King <linux@armlinux.org.uk>,
	Andrew Lunn <andrew@lunn.ch>,
	Gregory Clement <gregory.clement@bootlin.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	iommu@lists.linux-foundation.org,
	linux-arm-kernel@lists.infradead.org
Subject: Re: REGRESSION in 6.0-rc7 caused by patch "ARM/dma-mapping: use dma-direct unconditionally"
Date: Fri, 30 Sep 2022 16:52:34 +0200
Message-ID: <20220930165234.729ad68c@dellmb>
In-Reply-To: <630be11f-09ef-02d4-69f7-c7880ae5674c@arm.com>

[-- Attachment #1: Type: text/plain, Size: 2110 bytes --]

On Fri, 30 Sep 2022 14:46:06 +0100
Robin Murphy <robin.murphy@arm.com> wrote:

> On 2022-09-30 14:10, Marek Behún wrote:
> > Hello Linus, Arnd, Robin and Christoph,
> > 
> > I just bisected a regression on Turris Omnia (Armada 385), wherein the
> > system hangs shortly after init is run, to commit
> > 
> >    ae626eb97376 ("ARM/dma-mapping: use dma-direct unconditionally")
> >    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ae626eb97376
> > 
> > In order to fix the regression, I had to revert this commit and the
> > 3 subsequent commits:
> >    ae626eb97376 ("ARM/dma-mapping: use dma-direct unconditionally")
> >    42998ef08aba ("ARM/dma-mapping: drop .dma_supported for IOMMU ops")
> >    d563bccfa35b ("ARM/dma-mapping: consolidate IOMMU ops callbacks")
> >    4136ce90f079 ("ARM/dma-mapping: merge IOMMU ops")
> > in reverse order, of course:
> >    git revert 4136ce90f079
> >    git revert d563bccfa35b
> >    git revert 42998ef08aba
> >    git revert ae626eb97376
> > 
> > Christoph, Robin, since you are the authors of these commits, do you
> > have any idea what could be happening? Are we able to fix this without
> > reverting those commits, before 6.0?  
> 
> "hangs shortly after init" isn't much to go on. Are any errors logged? 
> Possibly some driver is sat waiting for a DMA transfer to complete, that 
> has somehow got the wrong address or lost coherency so never gets seen, 
> but without at least being able to narrow it down to the affected driver 
> it's hard to do much more than vague guessing.

OK, I enabled CONFIG_DMA_API_DEBUG and am now getting a NULL pointer
dereference. I managed to isolate the bug to a specific line in the
mvneta driver:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/marvell/mvneta.c#n2591

I put debug prints (pr_err("  a %i\n", __LINE__)) into the
mvneta_rx_hwbm() function.
The pr_err() after the call to dma_sync_single_range_for_cpu() prints,
but the one after skb_put_data() does not, so the crash happens inside
skb_put_data().
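
For readers unfamiliar with the line-marker approach described above, here
is a minimal userspace sketch of it. It is not the mvneta code: MARK(),
last_marker and rx_copy() are hypothetical names, fprintf() stands in for
pr_err(), and an early return stands in for the point where a kernel would
oops. The last marker that printed bounds the failing statement from above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Line number of the most recent marker that was reached. */
static int last_marker;

/* Userspace stand-in for pr_err("  a %i\n", __LINE__). */
#define MARK() do { last_marker = __LINE__; \
		    fprintf(stderr, "  a %i\n", __LINE__); } while (0)

/* Hypothetical receive step: returns -1 where a kernel would fault. */
static int rx_copy(char *dst, const char *src, size_t len)
{
	assert(dst != NULL);
	MARK();			/* marker before the copy */
	if (src == NULL)
		return -1;	/* the marker after the copy is never reached */
	memcpy(dst, src, len);
	MARK();			/* marker after the copy */
	return 0;
}
```

When rx_copy() is called with a NULL source, only the first marker prints,
which mirrors the pattern reported here: the marker before the copy appears
and the one after it does not.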

Attaching console output.
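
The register dump in the attached output can be cross-checked with a little
arithmetic. The sketch below uses only values copied from the oops;
offset_in_object() is a hypothetical helper. It shows that r0 sits 0x42 (66)
bytes into its kmalloc-512 object, matching the kernel's own annotation, and
that r1 equals the faulting address 0x42. Under the ARM calling convention
the second argument of a copy routine (the source) is passed in r1, so one
possible reading is that the source was a small offset added to a NULL base
pointer. That reading is an assumption, not something the log proves.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of a pointer inside the object that starts at `start`. */
static uint32_t offset_in_object(uint32_t ptr, uint32_t start)
{
	assert(ptr >= start);
	return ptr - start;
}

/* Values copied from the oops in the attached console output. */
static const uint32_t oops_fault_addr = 0x00000042u;
static const uint32_t oops_r0         = 0xc147a642u; /* in kmalloc-512 */
static const uint32_t oops_r1         = 0x00000042u;
static const uint32_t oops_slab_start = 0xc147a600u;
```

With these values, offset_in_object(oops_r0, oops_slab_start) is 66, and
oops_r1 is equal to oops_fault_addr.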

Marek

[-- Attachment #2: regression_console_output.txt --]
[-- Type: text/plain, Size: 7313 bytes --]

[    3.427249] Run /bin/bash as init process
bash: cannot set terminal process group (-1): Not a tty
bash: no job control in this shell
bash-5.1# ifconfig eth2 up
[    6.738009] mvneta f1034000.ethernet eth2: PHY [f1072004.mdio-mii:01] driver [Marvell 88E1510] (irq=POLL)
[    6.747801] mvneta f1034000.ethernet eth2: configuring for phy/sgmii link mode
bash-5.1# [    9.857426] mvneta f1034000.ethernet eth2: Link is Up - 1Gbps/Full - flow control off
[    9.865290] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
[   10.275029]          a 2541
[   10.277226]          a 2550
[   10.279416]          a 2570
[   10.281604]          a 2583
[   10.283793]          a 2590
[   10.285984]          a 2596
[   10.288178] 8<--- cut here ---
[   10.291236] Unable to handle kernel NULL pointer dereference at virtual address 00000042
[   10.299348] [00000042] *pgd=00000000
[   10.302933] Internal error: Oops: 5 [#1] SMP ARM
[   10.307562] Modules linked in:
[   10.310622] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.0.0-rc7+ #75
[   10.316993] Hardware name: Marvell Armada 380/385 (Device Tree)
[   10.322926] PC is at mmiocpy+0xec/0x334
[   10.326776] LR is at mvneta_poll+0x5e4/0x7a8
[   10.331058] pc : [<c05befac>]    lr : [<c077ed00>]    psr: 60000113
[   10.337339] sp : c1001db0  ip : 00000002  fp : c1001db0
[   10.342575] r10: c14ac840  r9 : c2024000  r8 : 0000005c
[   10.347811] r7 : 4fa06000  r6 : c0e4cc9c  r5 : f10f3000  r4 : c1e0d480
[   10.354353] r3 : 00000000  r2 : 00000058  r1 : 00000042  r0 : c147a642
[   10.360895] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[   10.368047] Control: 10c5387d  Table: 01f5c04a  DAC: 00000051
[   10.373805] Register r0 information: slab kmalloc-512 start c147a600 pointer offset 66 size 512
[   10.382532] Register r1 information: non-paged memory
[   10.387595] Register r2 information: non-paged memory
[   10.392658] Register r3 information: NULL pointer
[   10.397372] Register r4 information: slab skbuff_head_cache start c1e0d480 pointer offset 0 size 48
[   10.406446] Register r5 information: 0-page vmalloc region starting at 0xf10f3000 allocated at dma_common_contiguous_remap+0x68/0x84
[   10.418395] Register r6 information: non-slab/vmalloc memory
[   10.424068] Register r7 information: non-paged memory
[   10.429130] Register r8 information: non-paged memory
[   10.434191] Register r9 information: slab kmalloc-cg-4k start c2024000 pointer offset 0 size 4096
[   10.443090] Register r10 information: slab kmalloc-2k start c14ac800 pointer offset 64 size 2048
[   10.451902] Register r11 information: non-slab/vmalloc memory
[   10.457661] Register r12 information: non-paged memory
[   10.462810] Process swapper/0 (pid: 0, stack limit = 0x(ptrval))
[   10.468831] Stack: (0xc1001db0 to 0xc1002000)
[   10.473198] 1da0:                                     c147a642 c1e0d480 c2144d18 c077ed00
[   10.481396] 1dc0: 63f6f15b c1410c00 00000000 00000000 00000040 ff7ebb60 c20244c0 ff7ebb68
[   10.489592] 1de0: 00000000 c2024000 00000001 00000000 00000002 00000000 0273e980 c1005c88
[   10.497790] 1e00: c0e4cca8 00000100 c1001e04 00000001 ff7ebb68 00000040 c1001e5b c1001e5c
[   10.505987] 1e20: c1002d40 ffff8ed4 eedd5b40 c0872630 00000000 c1001e60 00000101 eedd5980
[   10.514185] 1e40: 0000012c ff7ebb68 00000000 c08728b8 2de7b000 c0f5a980 0000002f c1001e5c
[   10.522381] 1e60: c1001e5c c1001e64 c1001e64 10db25e3 0000002f 00000000 00000003 c100208c
[   10.530578] 1e80: c1009b80 00000100 c1001e98 40000003 c1002080 c0101354 0000000c c05db53c
[   10.538776] 1ea0: 00001000 c1002080 c0f57300 0000000a c0f57274 c0f59bc0 c0f59bc0 ffff8ed3
[   10.546972] 1ec0: c1002d40 c0de605c 04200002 c0b02800 c0f58edc c0107298 60000013 ffffffff
[   10.555170] 1ee0: c1001f34 c0f592e8 c1009b80 00000000 00000000 c0134938 c0107298 c0100b68
[   10.563367] 1f00: 00000005 00000000 0004b5b9 c01164a0 c1009b80 c1004f90 00000000 c1004fdc
[   10.571564] 1f20: c0f592e8 c10aa290 00000000 00000000 c1001f30 c1001f50 c0107294 c0107298
[   10.579762] 1f40: 60000013 ffffffff 00000051 c1004f90 c1009b80 c0a54e64 c1009b80 c016bbc8
[   10.587958] 1f60: c1107af0 10db25e3 ffffffff 000000ec c1107af0 c1004f40 ffffffff c0f45a60
[   10.596155] 1f80: 00000000 10c5387d 00000000 c016bef4 c10103c8 c0a4ded0 c10ab040 c0f009f0
[   10.604353] 1fa0: c10ab040 c0f01054 ffffffff ffffffff 00000000 c0f0060c 00000000 00000000
[   10.612549] 1fc0: 00000000 c0f45a60 10dd25e3 00000000 00000000 c0f00340 00000051 10c0387d
[   10.620746] 1fe0: 00000000 0fff7000 414fc091 10c5387d 00000000 00000000 00000000 00000000
[   10.628945]  mmiocpy from mvneta_poll+0x5e4/0x7a8
[   10.633665]  mvneta_poll from __napi_poll.constprop.0+0x2c/0x180
[   10.639694]  __napi_poll.constprop.0 from net_rx_action+0x134/0x2e0
[   10.645980]  net_rx_action from __do_softirq+0x114/0x274
[   10.651311]  __do_softirq from irq_exit+0x80/0xa8
[   10.656029]  irq_exit from __irq_svc+0x88/0xb0
[   10.660484] Exception stack(0xc1001f00 to 0xc1001f48)
[   10.665548] 1f00: 00000005 00000000 0004b5b9 c01164a0 c1009b80 c1004f90 00000000 c1004fdc
[   10.673746] 1f20: c0f592e8 c10aa290 00000000 00000000 c1001f30 c1001f50 c0107294 c0107298
[   10.681941] 1f40: 60000013 ffffffff
[   10.685436]  __irq_svc from arch_cpu_idle+0x38/0x3c
[   10.690329]  arch_cpu_idle from default_idle_call+0x24/0x34
[   10.695921]  default_idle_call from do_idle+0x1b4/0x210
[   10.701166]  do_idle from cpu_startup_entry+0x18/0x1c
[   10.706233]  cpu_startup_entry from rest_init+0xa8/0xac
[   10.711473]  rest_init from arch_post_acpi_subsys_init+0x0/0x8
[   10.717324] Code: e8bd8811 e26cc004 e35c0002 c4d13001 (a4d14001) 
[   10.723437] ---[ end trace 0000000000000000 ]---
[   10.728069] Kernel panic - not syncing: Fatal exception in interrupt
[   10.734437] CPU1: stopping
[   10.737151] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G      D            6.0.0-rc7+ #75
[   10.745001] Hardware name: Marvell Armada 380/385 (Device Tree)
[   10.750934]  unwind_backtrace from show_stack+0x10/0x14
[   10.756177]  show_stack from dump_stack_lvl+0x40/0x4c
[   10.761245]  dump_stack_lvl from do_handle_IPI+0xec/0x124
[   10.766659]  do_handle_IPI from ipi_handler+0x18/0x20
[   10.771724]  ipi_handler from handle_percpu_devid_irq+0x78/0x134
[   10.777751]  handle_percpu_devid_irq from generic_handle_domain_irq+0x28/0x38
[   10.784906]  generic_handle_domain_irq from gic_handle_irq+0x74/0x88
[   10.791279]  gic_handle_irq from generic_handle_arch_irq+0x34/0x44
[   10.797475]  generic_handle_arch_irq from call_with_stack+0x18/0x20
[   10.803759]  call_with_stack from __irq_svc+0x98/0xb0
[   10.808823] Exception stack(0xf086df50 to 0xf086df98)
[   10.813887] df40:                                     00000005 00000000 00072f51 c01164a0
[   10.822083] df60: c14b8000 c1004f90 00000001 c1004fdc c0f592e8 c10aa290 00000000 00000000
[   10.830281] df80: f086df80 f086dfa0 c0107294 c0107298 60000013 ffffffff
[   10.836909]  __irq_svc from arch_cpu_idle+0x38/0x3c
[   10.841800]  arch_cpu_idle from default_idle_call+0x24/0x34
[   10.847388]  default_idle_call from do_idle+0x1b4/0x210
[   10.852629]  do_idle from cpu_startup_entry+0x18/0x1c
[   10.857696]  cpu_startup_entry from secondary_start_kernel+0x118/0x120
[   10.864243]  secondary_start_kernel from 0x101560
[   10.868962] Rebooting in 1 seconds..
