linux-arm-kernel.lists.infradead.org archive mirror
From: "Marek Behún" <kabel@kernel.org>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Christoph Hellwig <hch@lst.de>, Arnd Bergmann <arnd@kernel.org>,
	Andre Przywara <andre.przywara@arm.com>,
	Marc Zyngier <maz@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Russell King <linux@armlinux.org.uk>,
	Andrew Lunn <andrew@lunn.ch>,
	Gregory Clement <gregory.clement@bootlin.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	iommu@lists.linux-foundation.org,
	linux-arm-kernel@lists.infradead.org
Subject: Re: REGRESSION in 6.0-rc7 caused by patch "ARM/dma-mapping: use dma-direct unconditionally"
Date: Fri, 30 Sep 2022 16:52:34 +0200
Message-ID: <20220930165234.729ad68c@dellmb>
In-Reply-To: <630be11f-09ef-02d4-69f7-c7880ae5674c@arm.com>

[-- Attachment #1: Type: text/plain, Size: 2110 bytes --]

On Fri, 30 Sep 2022 14:46:06 +0100
Robin Murphy <robin.murphy@arm.com> wrote:

> On 2022-09-30 14:10, Marek Behún wrote:
> > Hello Linus, Arnd, Robin and Christoph,
> > 
> > I just bisected a regression on Turris Omnia (Armada 385), wherein the
> > system hangs shortly after init is run, to commit
> > 
> >    ae626eb97376 ("ARM/dma-mapping: use dma-direct unconditionally")
> >    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ae626eb97376
> > 
> > In order to fix the regression, I had to revert this commit and
> > subsequent 3 commits:
> >    ae626eb97376 ("ARM/dma-mapping: use dma-direct unconditionally")
> >    42998ef08aba ("ARM/dma-mapping: drop .dma_supported for IOMMU ops")
> >    d563bccfa35b ("ARM/dma-mapping: consolidate IOMMU ops callbacks")
> >    4136ce90f079 ("ARM/dma-mapping: merge IOMMU ops")
> > in reverse order, of course:
> >    git revert 4136ce90f079
> >    git revert d563bccfa35b
> >    git revert 42998ef08aba
> >    git revert ae626eb97376
> > 
> > Christoph, Robin, since you are the authors of these commits, do you
> > have any idea what could be happening? Are we able to fix this without
> > reverting those commits, before 6.0?  
> 
> "hangs shortly after init" isn't much to go on. Are any errors logged? 
> Possibly some driver is sat waiting for a DMA transfer to complete, that 
> has somehow got the wrong address or lost coherency so never gets seen, 
> but without at least being able to narrow it down to the affected driver 
> it's hard to do much more than vague guessing.

OK, I enabled CONFIG_DMA_API_DEBUG and am now getting a NULL pointer
dereference. I managed to isolate the bug to a specific line in the
mvneta driver:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/marvell/mvneta.c#n2591

I put debug printfs (pr_err("  a %i\n", __LINE__)) into the
mvneta_rx_hwbm() function.
The pr_err after the call to dma_sync_single_range_for_cpu() prints,
but the pr_err after skb_put_data() does not print.
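
For readers who want to try the same line-marker technique outside the
kernel, here is a minimal userspace sketch. It is not the mvneta code:
rx_one(), marks[] and the TRACE() macro are hypothetical names made up
for illustration, fprintf() stands in for pr_err(), and an early return
stands in for the faulting copy. The last marker recorded shows which
step was reached before the failure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_MARKS 16

/* Source lines reached so far; hypothetical bookkeeping, not part of mvneta. */
static int marks[MAX_MARKS];
static size_t nmarks;

/* Userspace stand-in for pr_err("  a %i\n", __LINE__). */
#define TRACE()                                        \
	do {                                           \
		if (nmarks < MAX_MARKS)                \
			marks[nmarks++] = __LINE__;    \
		fprintf(stderr, "  a %i\n", __LINE__); \
	} while (0)

/*
 * Hypothetical receive step: "sync", then copy.
 * Returns 0 on success, or -1 when the source pointer is missing,
 * which is where a real kernel copy would oops instead of returning.
 */
static int rx_one(unsigned char *dst, const unsigned char *src, size_t len)
{
	assert(dst != NULL);

	nmarks = 0;
	TRACE();		/* entry */
	/* a real driver would sync the DMA buffer for the CPU here */
	TRACE();		/* after the sync step */
	if (src == NULL)
		return -1;	/* stand-in for the faulting copy */
	memcpy(dst, src, len);
	TRACE();		/* after the copy step */
	return 0;
}
```

Calling rx_one() with a NULL source leaves two markers recorded, while
a valid source leaves three, so the missing third marker points at the
copy step.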

Attaching console output.

Marek

[-- Attachment #2: regression_console_output.txt --]
[-- Type: text/plain, Size: 7313 bytes --]

[    3.427249] Run /bin/bash as init process
bash: cannot set terminal process group (-1): Not a tty
bash: no job control in this shell
bash-5.1# ifconfig eth2 up
[    6.738009] mvneta f1034000.ethernet eth2: PHY [f1072004.mdio-mii:01] driver [Marvell 88E1510] (irq=POLL)
[    6.747801] mvneta f1034000.ethernet eth2: configuring for phy/sgmii link mode
bash-5.1# [    9.857426] mvneta f1034000.ethernet eth2: Link is Up - 1Gbps/Full - flow control off
[    9.865290] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
[   10.275029]          a 2541
[   10.277226]          a 2550
[   10.279416]          a 2570
[   10.281604]          a 2583
[   10.283793]          a 2590
[   10.285984]          a 2596
[   10.288178] 8<--- cut here ---
[   10.291236] Unable to handle kernel NULL pointer dereference at virtual address 00000042
[   10.299348] [00000042] *pgd=00000000
[   10.302933] Internal error: Oops: 5 [#1] SMP ARM
[   10.307562] Modules linked in:
[   10.310622] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.0.0-rc7+ #75
[   10.316993] Hardware name: Marvell Armada 380/385 (Device Tree)
[   10.322926] PC is at mmiocpy+0xec/0x334
[   10.326776] LR is at mvneta_poll+0x5e4/0x7a8
[   10.331058] pc : [<c05befac>]    lr : [<c077ed00>]    psr: 60000113
[   10.337339] sp : c1001db0  ip : 00000002  fp : c1001db0
[   10.342575] r10: c14ac840  r9 : c2024000  r8 : 0000005c
[   10.347811] r7 : 4fa06000  r6 : c0e4cc9c  r5 : f10f3000  r4 : c1e0d480
[   10.354353] r3 : 00000000  r2 : 00000058  r1 : 00000042  r0 : c147a642
[   10.360895] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[   10.368047] Control: 10c5387d  Table: 01f5c04a  DAC: 00000051
[   10.373805] Register r0 information: slab kmalloc-512 start c147a600 pointer offset 66 size 512
[   10.382532] Register r1 information: non-paged memory
[   10.387595] Register r2 information: non-paged memory
[   10.392658] Register r3 information: NULL pointer
[   10.397372] Register r4 information: slab skbuff_head_cache start c1e0d480 pointer offset 0 size 48
[   10.406446] Register r5 information: 0-page vmalloc region starting at 0xf10f3000 allocated at dma_common_contiguous_remap+0x68/0x84
[   10.418395] Register r6 information: non-slab/vmalloc memory
[   10.424068] Register r7 information: non-paged memory
[   10.429130] Register r8 information: non-paged memory
[   10.434191] Register r9 information: slab kmalloc-cg-4k start c2024000 pointer offset 0 size 4096
[   10.443090] Register r10 information: slab kmalloc-2k start c14ac800 pointer offset 64 size 2048
[   10.451902] Register r11 information: non-slab/vmalloc memory
[   10.457661] Register r12 information: non-paged memory
[   10.462810] Process swapper/0 (pid: 0, stack limit = 0x(ptrval))
[   10.468831] Stack: (0xc1001db0 to 0xc1002000)
[   10.473198] 1da0:                                     c147a642 c1e0d480 c2144d18 c077ed00
[   10.481396] 1dc0: 63f6f15b c1410c00 00000000 00000000 00000040 ff7ebb60 c20244c0 ff7ebb68
[   10.489592] 1de0: 00000000 c2024000 00000001 00000000 00000002 00000000 0273e980 c1005c88
[   10.497790] 1e00: c0e4cca8 00000100 c1001e04 00000001 ff7ebb68 00000040 c1001e5b c1001e5c
[   10.505987] 1e20: c1002d40 ffff8ed4 eedd5b40 c0872630 00000000 c1001e60 00000101 eedd5980
[   10.514185] 1e40: 0000012c ff7ebb68 00000000 c08728b8 2de7b000 c0f5a980 0000002f c1001e5c
[   10.522381] 1e60: c1001e5c c1001e64 c1001e64 10db25e3 0000002f 00000000 00000003 c100208c
[   10.530578] 1e80: c1009b80 00000100 c1001e98 40000003 c1002080 c0101354 0000000c c05db53c
[   10.538776] 1ea0: 00001000 c1002080 c0f57300 0000000a c0f57274 c0f59bc0 c0f59bc0 ffff8ed3
[   10.546972] 1ec0: c1002d40 c0de605c 04200002 c0b02800 c0f58edc c0107298 60000013 ffffffff
[   10.555170] 1ee0: c1001f34 c0f592e8 c1009b80 00000000 00000000 c0134938 c0107298 c0100b68
[   10.563367] 1f00: 00000005 00000000 0004b5b9 c01164a0 c1009b80 c1004f90 00000000 c1004fdc
[   10.571564] 1f20: c0f592e8 c10aa290 00000000 00000000 c1001f30 c1001f50 c0107294 c0107298
[   10.579762] 1f40: 60000013 ffffffff 00000051 c1004f90 c1009b80 c0a54e64 c1009b80 c016bbc8
[   10.587958] 1f60: c1107af0 10db25e3 ffffffff 000000ec c1107af0 c1004f40 ffffffff c0f45a60
[   10.596155] 1f80: 00000000 10c5387d 00000000 c016bef4 c10103c8 c0a4ded0 c10ab040 c0f009f0
[   10.604353] 1fa0: c10ab040 c0f01054 ffffffff ffffffff 00000000 c0f0060c 00000000 00000000
[   10.612549] 1fc0: 00000000 c0f45a60 10dd25e3 00000000 00000000 c0f00340 00000051 10c0387d
[   10.620746] 1fe0: 00000000 0fff7000 414fc091 10c5387d 00000000 00000000 00000000 00000000
[   10.628945]  mmiocpy from mvneta_poll+0x5e4/0x7a8
[   10.633665]  mvneta_poll from __napi_poll.constprop.0+0x2c/0x180
[   10.639694]  __napi_poll.constprop.0 from net_rx_action+0x134/0x2e0
[   10.645980]  net_rx_action from __do_softirq+0x114/0x274
[   10.651311]  __do_softirq from irq_exit+0x80/0xa8
[   10.656029]  irq_exit from __irq_svc+0x88/0xb0
[   10.660484] Exception stack(0xc1001f00 to 0xc1001f48)
[   10.665548] 1f00: 00000005 00000000 0004b5b9 c01164a0 c1009b80 c1004f90 00000000 c1004fdc
[   10.673746] 1f20: c0f592e8 c10aa290 00000000 00000000 c1001f30 c1001f50 c0107294 c0107298
[   10.681941] 1f40: 60000013 ffffffff
[   10.685436]  __irq_svc from arch_cpu_idle+0x38/0x3c
[   10.690329]  arch_cpu_idle from default_idle_call+0x24/0x34
[   10.695921]  default_idle_call from do_idle+0x1b4/0x210
[   10.701166]  do_idle from cpu_startup_entry+0x18/0x1c
[   10.706233]  cpu_startup_entry from rest_init+0xa8/0xac
[   10.711473]  rest_init from arch_post_acpi_subsys_init+0x0/0x8
[   10.717324] Code: e8bd8811 e26cc004 e35c0002 c4d13001 (a4d14001) 
[   10.723437] ---[ end trace 0000000000000000 ]---
[   10.728069] Kernel panic - not syncing: Fatal exception in interrupt
[   10.734437] CPU1: stopping
[   10.737151] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G      D            6.0.0-rc7+ #75
[   10.745001] Hardware name: Marvell Armada 380/385 (Device Tree)
[   10.750934]  unwind_backtrace from show_stack+0x10/0x14
[   10.756177]  show_stack from dump_stack_lvl+0x40/0x4c
[   10.761245]  dump_stack_lvl from do_handle_IPI+0xec/0x124
[   10.766659]  do_handle_IPI from ipi_handler+0x18/0x20
[   10.771724]  ipi_handler from handle_percpu_devid_irq+0x78/0x134
[   10.777751]  handle_percpu_devid_irq from generic_handle_domain_irq+0x28/0x38
[   10.784906]  generic_handle_domain_irq from gic_handle_irq+0x74/0x88
[   10.791279]  gic_handle_irq from generic_handle_arch_irq+0x34/0x44
[   10.797475]  generic_handle_arch_irq from call_with_stack+0x18/0x20
[   10.803759]  call_with_stack from __irq_svc+0x98/0xb0
[   10.808823] Exception stack(0xf086df50 to 0xf086df98)
[   10.813887] df40:                                     00000005 00000000 00072f51 c01164a0
[   10.822083] df60: c14b8000 c1004f90 00000001 c1004fdc c0f592e8 c10aa290 00000000 00000000
[   10.830281] df80: f086df80 f086dfa0 c0107294 c0107298 60000013 ffffffff
[   10.836909]  __irq_svc from arch_cpu_idle+0x38/0x3c
[   10.841800]  arch_cpu_idle from default_idle_call+0x24/0x34
[   10.847388]  default_idle_call from do_idle+0x1b4/0x210
[   10.852629]  do_idle from cpu_startup_entry+0x18/0x1c
[   10.857696]  cpu_startup_entry from secondary_start_kernel+0x118/0x120
[   10.864243]  secondary_start_kernel from 0x101560
[   10.868962] Rebooting in 1 seconds..
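
The register dump above can be cross-checked with a little arithmetic.
The sketch below uses only values copied from the log; the function
names are hypothetical. It assumes the usual ARM calling convention for
a copy routine (r0 = destination, r1 = source). The last helper also
assumes that source and destination were meant to share the same
offset, which the log itself does not prove.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Values copied from the oops output above. */
#define R0_DST        UINT32_C(0xc147a642)	/* r0 */
#define R0_SLAB_START UINT32_C(0xc147a600)	/* "slab kmalloc-512 start" */
#define R1_SRC        UINT32_C(0x00000042)	/* r1 */
#define FAULT_ADDR    UINT32_C(0x00000042)	/* faulting virtual address */

/* Offset of the destination pointer inside its slab object. */
static uint32_t dst_offset(void)
{
	return R0_DST - R0_SLAB_START;
}

/* Nonzero if the faulting address is exactly the source register. */
static int src_is_fault_addr(void)
{
	return R1_SRC == FAULT_ADDR;
}

/* Source base, ASSUMING the source carries the same offset as r0. */
static uint32_t src_base_if_same_offset(void)
{
	return R1_SRC - dst_offset();
}

/* Print the three derived values for comparison with the log. */
static void print_summary(void)
{
	assert(R0_DST >= R0_SLAB_START);
	printf("dst offset: %" PRIu32 "\n", dst_offset());
	printf("fault at src: %d\n", src_is_fault_addr());
	printf("src base: 0x%08" PRIx32 "\n", src_base_if_same_offset());
}
```

The destination offset comes out as 66, matching the "pointer offset
66" the kernel reports for r0. Under the stated assumption the source
base is 0, which would be consistent with a NULL buffer pointer plus a
fixed offset being passed to the copy.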


Thread overview: 43+ messages
2022-09-30 13:10 REGRESSION in 6.0-rc7 caused by patch "ARM/dma-mapping: use dma-direct unconditionally" Marek Behún
2022-09-30 13:46 ` Robin Murphy
2022-09-30 14:52   ` Marek Behún [this message]
2022-09-30 15:02     ` Marek Behún
2022-09-30 16:41       ` Robin Murphy
2022-09-30 18:02         ` Marek Behún
2022-10-03  7:21           ` Christoph Hellwig
2022-10-03  7:30       ` Christoph Hellwig
2022-10-03 14:11         ` Russell King (Oracle)
2022-10-03 15:25           ` Marek Behún
2022-10-03 16:09             ` Pali Rohár
2022-10-03 19:04               ` Marek Behún
2022-10-03 19:08                 ` Pali Rohár
2022-10-03 21:30             ` Marcin Wojtas
2022-10-03 21:35               ` Pali Rohár
2022-10-03 22:03                 ` Marcin Wojtas
2022-10-04  7:10               ` Christoph Hellwig
2022-10-04  8:15                 ` Marek Behún
2022-10-04  8:17                   ` [PATCH] ARM: mvebu: select OF_DMA_DEFAULT_COHERENT if MACH_MVEBU_V7 Marek Behún
2022-10-04  8:30                     ` Christoph Hellwig
2022-10-04 12:54                       ` Marek Behún
2022-10-04  8:30                     ` Arnd Bergmann
2022-10-04  9:14                     ` Thorsten Leemhuis
2022-10-04  9:22                       ` Russell King (Oracle)
2022-10-04  9:56                 ` REGRESSION in 6.0-rc7 caused by patch "ARM/dma-mapping: use dma-direct unconditionally" Robin Murphy
2022-10-04  7:25               ` Russell King (Oracle)
2022-10-04  8:30                 ` Marcin Wojtas
2022-10-04  9:08                   ` Russell King (Oracle)
2022-10-04 12:36                     ` Marek Behún
2022-10-04 12:59                       ` Marcin Wojtas
2022-10-04 18:51                         ` Pali Rohár
2022-10-04 19:35                           ` Marcin Wojtas
2022-10-04  8:26               ` Marek Behún
2022-10-04  8:36                 ` Marcin Wojtas
2022-10-20 18:22                   ` Russell King (Oracle)
2022-10-20 19:10                     ` Marek Behún
2022-10-21 16:25                     ` Linus Torvalds
2022-10-21 16:30                     ` Christoph Hellwig
2022-10-21 18:21                       ` Russell King (Oracle)
2022-10-23 11:58                     ` Klaus Kudielka
2022-10-03 18:57         ` Marek Behún
2022-10-01  9:31 ` Thorsten Leemhuis
2022-11-04 12:08   ` REGRESSION in 6.0-rc7 caused by patch "ARM/dma-mapping: use dma-direct unconditionally" #forregzbot Thorsten Leemhuis
