From: "Marek Behún" <kabel@kernel.org>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Christoph Hellwig <hch@lst.de>, Arnd Bergmann <arnd@kernel.org>,
	Andre Przywara <andre.przywara@arm.com>,
	Marc Zyngier <maz@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Russell King <linux@armlinux.org.uk>,
	Andrew Lunn <andrew@lunn.ch>,
	Gregory Clement <gregory.clement@bootlin.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	iommu@lists.linux-foundation.org,
	linux-arm-kernel@lists.infradead.org
Subject: Re: REGRESSION in 6.0-rc7 caused by patch "ARM/dma-mapping: use dma-direct unconditionally"
Date: Fri, 30 Sep 2022 20:02:00 +0200
Message-ID: <20220930200200.7e689abf@thinkpad>
In-Reply-To: <50ec473b-9def-dc2b-7d1b-cbdc277cbac2@arm.com>

On Fri, 30 Sep 2022 17:41:44 +0100
Robin Murphy <robin.murphy@arm.com> wrote:

> On 2022-09-30 16:02, Marek Behún wrote:
> > On Fri, 30 Sep 2022 16:52:34 +0200
> > Marek Behún <kabel@kernel.org> wrote:
> >   
> >> On Fri, 30 Sep 2022 14:46:06 +0100
> >> Robin Murphy <robin.murphy@arm.com> wrote:
> >>  
> >>> On 2022-09-30 14:10, Marek Behún wrote:  
> >>>> Hello Linus, Arnd, Robin and Christoph,
> >>>>
> >>>> I just bisected a regression on Turris Omnia (Armada 385), wherein the
> >>>> system hangs shortly after init is run, to commit
> >>>>
> >>>>     ae626eb97376 ("ARM/dma-mapping: use dma-direct unconditionally")
> >>>>     https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ae626eb97376
> >>>>
> >>>> In order to fix the regression, I had to revert this commit and
> >>>> subsequent 3 commits:
> >>>>     ae626eb97376 ("ARM/dma-mapping: use dma-direct unconditionally")
> >>>>     42998ef08aba ("ARM/dma-mapping: drop .dma_supported for IOMMU ops")
> >>>>     d563bccfa35b ("ARM/dma-mapping: consolidate IOMMU ops callbacks")
> >>>>     4136ce90f079 ("ARM/dma-mapping: merge IOMMU ops")
> >>>> in reverse order, of course:
> >>>>     git revert 4136ce90f079
> >>>>     git revert d563bccfa35b
> >>>>     git revert 42998ef08aba
> >>>>     git revert ae626eb97376
> >>>>
> >>>> Christoph, Robin, since you are the authors of these commits, do you
> >>>> have any idea what could be happening? Are we able to fix this without
> >>>> reverting those commits, before 6.0?  
> >>>
> >>> "hangs shortly after init" isn't much to go on. Are any errors logged?
> >>> Possibly some driver is sat waiting for a DMA transfer to complete, that
> >>> has somehow got the wrong address or lost coherency so never gets seen,
> >>> but without at least being able to narrow it down to the affected driver
> >>> it's hard to do much more than vague guessing.  
> >>
> >> OK, I enabled CONFIG_DMA_API_DEBUG and am now getting a null pointer
> >> dereference. I managed to isolate the bug to a specific line in the
> >> mvneta driver:
> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/marvell/mvneta.c#n2591
> >>
> >> I put debug printfs (pr_err("  a %i\n", __LINE__)) into the
> >> mvneta_rx_hwbm() function.
> >> The pr_err after the call to dma_sync_single_range_for_cpu() prints,
> >> but the pr_err after skb_put_data() does not print.
> >>
> >> Attaching console output.  
> > 
> > It seems that the null pointer dereference comes from the data variable
> > having zero value. We assign
> >    data = (u8 *)(uintptr_t)rx_desc->buf_cookie;
> > rx_desc is obtained with function
> >    mvneta_rxq_next_desc_get()
> > 
> > rx queues are allocated in mvneta_rxq_sw_init() with
> > 
> >    /* Allocate memory for RX descriptors */
> >    rxq->descs = dma_alloc_coherent(pp->dev->dev.parent,
> > 				  rxq->size * MVNETA_DESC_ALIGNED_SIZE,
> > 				  &rxq->descs_phys, GFP_KERNEL);  
> 
> Hmm, making sense of that driver is beyond me at this time on a Friday 
> afternoon, and I can't tell whether this is immediately related, but:
> 
> [   10.406446] Register r5 information: 0-page vmalloc region starting at 0xf10f3000 allocated at dma_common_contiguous_remap+0x68/0x84
> 
> definitely smells suspicious in its own right. Remapping 0 pages is bad 
> enough, but I'm also slightly wondering about remapping DMA allocations 
> at all - IIUC this is one of the mvebu SoCs where everything gets made 
> coherent by a bus notifier, so I wouldn't expect remaps except for 
> highmem, but the upstream DT suggests you probably don't have masses of 
> RAM either :/

Are the patches that caused the regression intended to be pure code
refactoring (albeit major), or are they supposed to introduce
functional changes?
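
For reference, the failure mode described in the quoted analysis above
(a data pointer derived from rx_desc->buf_cookie turning out to be zero
and then being copied from) can be modelled in user space. This is only
a minimal sketch, not the mvneta code: struct rx_desc_sketch and
rx_copy_checked() are hypothetical names, and the cookie is stored as a
uintptr_t here (an assumption) so that the example also runs on 64-bit
hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an RX descriptor. */
struct rx_desc_sketch {
	uintptr_t buf_cookie;	/* assumed to hold the buffer's virtual address */
	size_t data_size;	/* number of received bytes in the buffer */
};

/*
 * Copy the received bytes out of the buffer named by the cookie.
 * Returns the number of bytes copied, or -1 if the cookie is zero
 * or the destination is too small.
 */
int rx_copy_checked(const struct rx_desc_sketch *desc,
		    uint8_t *dst, size_t dst_len)
{
	/* Same shape as the quoted assignment: cookie -> byte pointer. */
	const uint8_t *data = (const uint8_t *)desc->buf_cookie;

	if (data == NULL)
		return -1;	/* a zero cookie would otherwise be dereferenced */
	if (desc->data_size > dst_len)
		return -1;

	memcpy(dst, data, desc->data_size);
	return (int)desc->data_size;
}
```

With a zeroed descriptor the helper returns -1 instead of dereferencing
a null pointer. Such a check would only hide the symptom; it says
nothing about why the descriptor contents are zero in the first place.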

Marek
