From: "Marek Behún" <kabel@kernel.org>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Christoph Hellwig <hch@lst.de>, Arnd Bergmann <arnd@kernel.org>,
Andre Przywara <andre.przywara@arm.com>,
Marc Zyngier <maz@kernel.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Russell King <linux@armlinux.org.uk>,
Andrew Lunn <andrew@lunn.ch>,
Gregory Clement <gregory.clement@bootlin.com>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
iommu@lists.linux-foundation.org,
linux-arm-kernel@lists.infradead.org
Subject: Re: REGRESSION in 6.0-rc7 caused by patch "ARM/dma-mapping: use dma-direct unconditionally"
Date: Fri, 30 Sep 2022 16:52:34 +0200
Message-ID: <20220930165234.729ad68c@dellmb>
In-Reply-To: <630be11f-09ef-02d4-69f7-c7880ae5674c@arm.com>
[-- Attachment #1: Type: text/plain, Size: 2110 bytes --]
On Fri, 30 Sep 2022 14:46:06 +0100
Robin Murphy <robin.murphy@arm.com> wrote:
> On 2022-09-30 14:10, Marek Behún wrote:
> > Hello Linus, Arnd, Robin and Christoph,
> >
> > I just bisected a regression on Turris Omnia (Armada 385), wherein the
> > system hangs shortly after init is run, to commit
> >
> > ae626eb97376 ("ARM/dma-mapping: use dma-direct unconditionally")
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ae626eb97376
> >
> > In order to fix the regression, I had to revert this commit and
> > subsequent 3 commits:
> > ae626eb97376 ("ARM/dma-mapping: use dma-direct unconditionally")
> > 42998ef08aba ("ARM/dma-mapping: drop .dma_supported for IOMMU ops")
> > d563bccfa35b ("ARM/dma-mapping: consolidate IOMMU ops callbacks")
> > 4136ce90f079 ("ARM/dma-mapping: merge IOMMU ops")
> > in reverse order, of course:
> > git revert 4136ce90f079
> > git revert d563bccfa35b
> > git revert 42998ef08aba
> > git revert ae626eb97376
> >
> > Christoph, Robin, since you are the authors of these commits, do you
> > have any idea what could be happening? Are we able to fix this without
> > reverting those commits, before 6.0?
>
> "hangs shortly after init" isn't much to go on. Are any errors logged?
> Possibly some driver is sat waiting for a DMA transfer to complete, that
> has somehow got the wrong address or lost coherency so never gets seen,
> but without at least being able to narrow it down to the affected driver
> it's hard to do much more than vague guessing.
OK, I enabled CONFIG_DMA_API_DEBUG and am now getting a NULL pointer
dereference. I managed to isolate the bug to a specific line in the mvneta
driver:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/marvell/mvneta.c#n2591
I put debug prints (pr_err(" a %i\n", __LINE__)) into the
mvneta_rx_hwbm() function.
The pr_err() after the call to dma_sync_single_range_for_cpu() prints,
but the one after skb_put_data() does not, so the crash happens in
skb_put_data().
Attaching console output.
Marek
[-- Attachment #2: regression_console_output.txt --]
[-- Type: text/plain, Size: 7313 bytes --]
[ 3.427249] Run /bin/bash as init process
bash: cannot set terminal process group (-1): Not a tty
bash: no job control in this shell
bash-5.1# ifconfig eth2 up
[ 6.738009] mvneta f1034000.ethernet eth2: PHY [f1072004.mdio-mii:01] driver [Marvell 88E1510] (irq=POLL)
[ 6.747801] mvneta f1034000.ethernet eth2: configuring for phy/sgmii link mode
bash-5.1# [ 9.857426] mvneta f1034000.ethernet eth2: Link is Up - 1Gbps/Full - flow control off
[ 9.865290] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
[ 10.275029] a 2541
[ 10.277226] a 2550
[ 10.279416] a 2570
[ 10.281604] a 2583
[ 10.283793] a 2590
[ 10.285984] a 2596
[ 10.288178] 8<--- cut here ---
[ 10.291236] Unable to handle kernel NULL pointer dereference at virtual address 00000042
[ 10.299348] [00000042] *pgd=00000000
[ 10.302933] Internal error: Oops: 5 [#1] SMP ARM
[ 10.307562] Modules linked in:
[ 10.310622] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.0.0-rc7+ #75
[ 10.316993] Hardware name: Marvell Armada 380/385 (Device Tree)
[ 10.322926] PC is at mmiocpy+0xec/0x334
[ 10.326776] LR is at mvneta_poll+0x5e4/0x7a8
[ 10.331058] pc : [<c05befac>] lr : [<c077ed00>] psr: 60000113
[ 10.337339] sp : c1001db0 ip : 00000002 fp : c1001db0
[ 10.342575] r10: c14ac840 r9 : c2024000 r8 : 0000005c
[ 10.347811] r7 : 4fa06000 r6 : c0e4cc9c r5 : f10f3000 r4 : c1e0d480
[ 10.354353] r3 : 00000000 r2 : 00000058 r1 : 00000042 r0 : c147a642
[ 10.360895] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 10.368047] Control: 10c5387d Table: 01f5c04a DAC: 00000051
[ 10.373805] Register r0 information: slab kmalloc-512 start c147a600 pointer offset 66 size 512
[ 10.382532] Register r1 information: non-paged memory
[ 10.387595] Register r2 information: non-paged memory
[ 10.392658] Register r3 information: NULL pointer
[ 10.397372] Register r4 information: slab skbuff_head_cache start c1e0d480 pointer offset 0 size 48
[ 10.406446] Register r5 information: 0-page vmalloc region starting at 0xf10f3000 allocated at dma_common_contiguous_remap+0x68/0x84
[ 10.418395] Register r6 information: non-slab/vmalloc memory
[ 10.424068] Register r7 information: non-paged memory
[ 10.429130] Register r8 information: non-paged memory
[ 10.434191] Register r9 information: slab kmalloc-cg-4k start c2024000 pointer offset 0 size 4096
[ 10.443090] Register r10 information: slab kmalloc-2k start c14ac800 pointer offset 64 size 2048
[ 10.451902] Register r11 information: non-slab/vmalloc memory
[ 10.457661] Register r12 information: non-paged memory
[ 10.462810] Process swapper/0 (pid: 0, stack limit = 0x(ptrval))
[ 10.468831] Stack: (0xc1001db0 to 0xc1002000)
[ 10.473198] 1da0: c147a642 c1e0d480 c2144d18 c077ed00
[ 10.481396] 1dc0: 63f6f15b c1410c00 00000000 00000000 00000040 ff7ebb60 c20244c0 ff7ebb68
[ 10.489592] 1de0: 00000000 c2024000 00000001 00000000 00000002 00000000 0273e980 c1005c88
[ 10.497790] 1e00: c0e4cca8 00000100 c1001e04 00000001 ff7ebb68 00000040 c1001e5b c1001e5c
[ 10.505987] 1e20: c1002d40 ffff8ed4 eedd5b40 c0872630 00000000 c1001e60 00000101 eedd5980
[ 10.514185] 1e40: 0000012c ff7ebb68 00000000 c08728b8 2de7b000 c0f5a980 0000002f c1001e5c
[ 10.522381] 1e60: c1001e5c c1001e64 c1001e64 10db25e3 0000002f 00000000 00000003 c100208c
[ 10.530578] 1e80: c1009b80 00000100 c1001e98 40000003 c1002080 c0101354 0000000c c05db53c
[ 10.538776] 1ea0: 00001000 c1002080 c0f57300 0000000a c0f57274 c0f59bc0 c0f59bc0 ffff8ed3
[ 10.546972] 1ec0: c1002d40 c0de605c 04200002 c0b02800 c0f58edc c0107298 60000013 ffffffff
[ 10.555170] 1ee0: c1001f34 c0f592e8 c1009b80 00000000 00000000 c0134938 c0107298 c0100b68
[ 10.563367] 1f00: 00000005 00000000 0004b5b9 c01164a0 c1009b80 c1004f90 00000000 c1004fdc
[ 10.571564] 1f20: c0f592e8 c10aa290 00000000 00000000 c1001f30 c1001f50 c0107294 c0107298
[ 10.579762] 1f40: 60000013 ffffffff 00000051 c1004f90 c1009b80 c0a54e64 c1009b80 c016bbc8
[ 10.587958] 1f60: c1107af0 10db25e3 ffffffff 000000ec c1107af0 c1004f40 ffffffff c0f45a60
[ 10.596155] 1f80: 00000000 10c5387d 00000000 c016bef4 c10103c8 c0a4ded0 c10ab040 c0f009f0
[ 10.604353] 1fa0: c10ab040 c0f01054 ffffffff ffffffff 00000000 c0f0060c 00000000 00000000
[ 10.612549] 1fc0: 00000000 c0f45a60 10dd25e3 00000000 00000000 c0f00340 00000051 10c0387d
[ 10.620746] 1fe0: 00000000 0fff7000 414fc091 10c5387d 00000000 00000000 00000000 00000000
[ 10.628945] mmiocpy from mvneta_poll+0x5e4/0x7a8
[ 10.633665] mvneta_poll from __napi_poll.constprop.0+0x2c/0x180
[ 10.639694] __napi_poll.constprop.0 from net_rx_action+0x134/0x2e0
[ 10.645980] net_rx_action from __do_softirq+0x114/0x274
[ 10.651311] __do_softirq from irq_exit+0x80/0xa8
[ 10.656029] irq_exit from __irq_svc+0x88/0xb0
[ 10.660484] Exception stack(0xc1001f00 to 0xc1001f48)
[ 10.665548] 1f00: 00000005 00000000 0004b5b9 c01164a0 c1009b80 c1004f90 00000000 c1004fdc
[ 10.673746] 1f20: c0f592e8 c10aa290 00000000 00000000 c1001f30 c1001f50 c0107294 c0107298
[ 10.681941] 1f40: 60000013 ffffffff
[ 10.685436] __irq_svc from arch_cpu_idle+0x38/0x3c
[ 10.690329] arch_cpu_idle from default_idle_call+0x24/0x34
[ 10.695921] default_idle_call from do_idle+0x1b4/0x210
[ 10.701166] do_idle from cpu_startup_entry+0x18/0x1c
[ 10.706233] cpu_startup_entry from rest_init+0xa8/0xac
[ 10.711473] rest_init from arch_post_acpi_subsys_init+0x0/0x8
[ 10.717324] Code: e8bd8811 e26cc004 e35c0002 c4d13001 (a4d14001)
[ 10.723437] ---[ end trace 0000000000000000 ]---
[ 10.728069] Kernel panic - not syncing: Fatal exception in interrupt
[ 10.734437] CPU1: stopping
[ 10.737151] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G D 6.0.0-rc7+ #75
[ 10.745001] Hardware name: Marvell Armada 380/385 (Device Tree)
[ 10.750934] unwind_backtrace from show_stack+0x10/0x14
[ 10.756177] show_stack from dump_stack_lvl+0x40/0x4c
[ 10.761245] dump_stack_lvl from do_handle_IPI+0xec/0x124
[ 10.766659] do_handle_IPI from ipi_handler+0x18/0x20
[ 10.771724] ipi_handler from handle_percpu_devid_irq+0x78/0x134
[ 10.777751] handle_percpu_devid_irq from generic_handle_domain_irq+0x28/0x38
[ 10.784906] generic_handle_domain_irq from gic_handle_irq+0x74/0x88
[ 10.791279] gic_handle_irq from generic_handle_arch_irq+0x34/0x44
[ 10.797475] generic_handle_arch_irq from call_with_stack+0x18/0x20
[ 10.803759] call_with_stack from __irq_svc+0x98/0xb0
[ 10.808823] Exception stack(0xf086df50 to 0xf086df98)
[ 10.813887] df40: 00000005 00000000 00072f51 c01164a0
[ 10.822083] df60: c14b8000 c1004f90 00000001 c1004fdc c0f592e8 c10aa290 00000000 00000000
[ 10.830281] df80: f086df80 f086dfa0 c0107294 c0107298 60000013 ffffffff
[ 10.836909] __irq_svc from arch_cpu_idle+0x38/0x3c
[ 10.841800] arch_cpu_idle from default_idle_call+0x24/0x34
[ 10.847388] default_idle_call from do_idle+0x1b4/0x210
[ 10.852629] do_idle from cpu_startup_entry+0x18/0x1c
[ 10.857696] cpu_startup_entry from secondary_start_kernel+0x118/0x120
[ 10.864243] secondary_start_kernel from 0x101560
[ 10.868962] Rebooting in 1 seconds..