linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: "Marek Behún" <kabel@kernel.org>
To: "Russell King (Oracle)" <linux@armlinux.org.uk>
Cc: Marcin Wojtas <mw@semihalf.com>,
	pali@kernel.org, Christoph Hellwig <hch@lst.de>,
	Robin Murphy <robin.murphy@arm.com>,
	Arnd Bergmann <arnd@kernel.org>,
	Andre Przywara <andre.przywara@arm.com>,
	Marc Zyngier <maz@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Lunn <andrew@lunn.ch>,
	Gregory Clement <gregory.clement@bootlin.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	iommu@lists.linux-foundation.org,
	linux-arm-kernel@lists.infradead.org
Subject: Re: REGRESSION in 6.0-rc7 caused by patch "ARM/dma-mapping: use dma-direct unconditionally"
Date: Thu, 20 Oct 2022 21:10:31 +0200	[thread overview]
Message-ID: <20221020211031.176c3249@thinkpad> (raw)
In-Reply-To: <Y1GRzyvddTliI2J/@shell.armlinux.org.uk>

On Thu, 20 Oct 2022 19:22:07 +0100
"Russell King (Oracle)" <linux@armlinux.org.uk> wrote:

> On Tue, Oct 04, 2022 at 10:36:05AM +0200, Marcin Wojtas wrote:
> > wt., 4 paź 2022 o 10:26 Marek Behún <kabel@kernel.org> napisał(a):  
> > >
> > > On Mon, 3 Oct 2022 23:30:31 +0200
> > > Marcin Wojtas <mw@semihalf.com> wrote:
> > >  
> > > > Hi Marek,
> > > >
> > > >
> > > > pon., 3 paź 2022 o 17:33 Marek Behún <kabel@kernel.org> napisał(a):  
> > > > >
> > > > > On Mon, 3 Oct 2022 15:11:44 +0100
> > > > > "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> > > > >  
> > > > > > On Mon, Oct 03, 2022 at 09:30:37AM +0200, Christoph Hellwig wrote:  
> > > > > > > On Fri, Sep 30, 2022 at 05:02:05PM +0200, Marek Behún wrote:  
> > > > > > > > It seems that the null pointer dereference comes from the data variable
> > > > > > > > having zero value. We assign
> > > > > > > >   data = (u8 *)(uintptr_t)rx_desc->buf_cookie;  
> > > > > > >
> > > > > > > I never see any assignment to ->buf_cookie in the driver, what am
> > > > > > > I missing?  
> > > > > >
> > > > > > I think Marek's setup (like my setups) use the hardware buffer manager,
> > > > > > and it's hardware that fills in the "buf_cookie", which is supposed to
> > > > > > be the virtual address of the buffer.
> > > > > >
> > > > > > Each buffer supplied to the hardware buffer manager is supposed to
> > > > > > contain the virtual address in the first 32-bit word in that buffer.
> > > > > >
> > > > > > This is done by mvneta_bm_construct():
> > > > > >
> > > > > >         /* In order to update buf_cookie field of RX descriptor properly,
> > > > > >          * BM hardware expects buf virtual address to be placed in the
> > > > > >          * first four bytes of mapped buffer.
> > > > > >          */
> > > > > >         *(u32 *)buf = (u32)buf;
> > > > > >
> > > > > > immediately prior to dma_map_single(..., DMA_FROM_DEVICE) is called.
> > > > > >
> > > > > > If I had to guess, I would suggest that this write is being lost via
> > > > > > cache invalidation, and given that the hardware BM both reads and
> > > > > > writes this buffer, DMA_FROM_DEVICE is not correct, it should be
> > > > > > DMA_BIDIRECTIONAL.
> > > > > >  
> > > >
> > > > I think the DMA_FROM_DEVICE is used rather properly in the RX path of
> > > > the driver - the CPU doesn't access the payload afterward. The BM only
> > > > pushes the pointers back and forth between internal SRAM ('internal
> > > > pool' - BPPI) to DRAM ('external pool' - BPPE) and the descriptors,
> > > > but afaik it should not touch the buffer contents. But maybe somehow
> > > > it affects the coherency and DMA_BIDIRECTIONAL are indeed required...
> > > > About the coherency itself - please see my comment below.
> > > >
> > > > Another thought - when writing to *buf (memory normal) shouldn't we add a dsb()?
> > > >  
> > > > > > Changing that is probably going to need DMA_FROM_DEVICE also changed
> > > > > > elsewhere in the mvneta_bm and mvneta driver.
> > > > > >
> > > > > > I'm not in a position where I could test that out. Marek?
> > > > > >  
> > > > >
> > > > > Hello Russell,
> > > > >
> > > > > thanks for your suggestion!
> > > > >
> > > > > Adding Pali, since he has some information (see at the end of this
> > > > > message).
> > > > >
> > > > > The attached patch seems to solve the null-pointer dereference.  
> > > >
> > > > Did you manage to measure performance impact?  
> > >
> > > I did not measure any performance impacts. But DMA directianlity
> > > within mvneta seems to be a different bug, as Russell replied, and maybe
> > > was just revealed by this.
> > >  
> > > > I have one overall concern here. On all kinds of A38x-based boards I
> > > > worked on, by default, the firmware set all devices (e.g. network,
> > > > AHCI, XHCI) on MBUS as fully IO cache coherent - it should be
> > > > reflected in the MVNETA_WIN_BASE(w) registers attribute field. Bits
> > > > [15:8] should be set to 0x1D (or 0x1E if there is a second DRAM CS
> > > > used). Can you please try adding 'dma-coherent;' property under the
> > > > 'internal-regs' node?  
> > >
> > > Yes, adding dma-coherent solves this issue. See other emails. The
> > > proper solution IMO is to default to dma-coherent on the platform,
> > > which can be done in a simple way (I've sent a patch). We want to be
> > > compatible with older device-trees.
> > >  
> > 
> > Thanks a lot for testing. I agree we must maintain the backward
> > compatibility with older DT's. To summarize, I think we should end up
> > with 3 patches:
> > 1. Update  arch/arm/mm/dma-mapping.c as suggested by Christoph.
> > 2. Add 'dma-coherent' in armada-38x.dtsi.
> > 3. Fix DMA attribute in mvneta_bm_construct().
> > 
> > In case any help is needed from my side, please let me know.  
> 
> Is it possible that this would also cause data corruption reading from
> a SATA card on armada-38x platforms?
> 
> I'm getting with 6.0:
> 
> [    1.410115] EXT4-fs error (device sda1): htree_dirblock_to_tree:1093: inode #256: block 8797: comm systemd: bad entry in directory: rec_len % 4 != 0 - offset=0, inode=33188, rec_len=35097, size=4096 fake=0
> 
> which appears to be due to reading bad data from the SATA device -
> what this tells me in inode and rec_len is not what is actually on
> the device in question.
> 
> Booting back to 5.19 gives a clean filesystem which has no errors,
> so it isn't filesystem corruption, it is an inability for 6.0 to
> read data correctly from the device.
> 

It's very probable. Don't use 6.0 or apply Christoph's patches.

Marek

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2022-10-20 19:11 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-30 13:10 REGRESSION in 6.0-rc7 caused by patch "ARM/dma-mapping: use dma-direct unconditionally" Marek Behún
2022-09-30 13:46 ` Robin Murphy
2022-09-30 14:52   ` Marek Behún
2022-09-30 15:02     ` Marek Behún
2022-09-30 16:41       ` Robin Murphy
2022-09-30 18:02         ` Marek Behún
2022-10-03  7:21           ` Christoph Hellwig
2022-10-03  7:30       ` Christoph Hellwig
2022-10-03 14:11         ` Russell King (Oracle)
2022-10-03 15:25           ` Marek Behún
2022-10-03 16:09             ` Pali Rohár
2022-10-03 19:04               ` Marek Behún
2022-10-03 19:08                 ` Pali Rohár
2022-10-03 21:30             ` Marcin Wojtas
2022-10-03 21:35               ` Pali Rohár
2022-10-03 22:03                 ` Marcin Wojtas
2022-10-04  7:10               ` Christoph Hellwig
2022-10-04  8:15                 ` Marek Behún
2022-10-04  8:17                   ` [PATCH] ARM: mvebu: select OF_DMA_DEFAULT_COHERENT if MACH_MVEBU_V7 Marek Behún
2022-10-04  8:30                     ` Christoph Hellwig
2022-10-04 12:54                       ` Marek Behún
2022-10-04  8:30                     ` Arnd Bergmann
2022-10-04  9:14                     ` Thorsten Leemhuis
2022-10-04  9:22                       ` Russell King (Oracle)
2022-10-04  9:56                 ` REGRESSION in 6.0-rc7 caused by patch "ARM/dma-mapping: use dma-direct unconditionally" Robin Murphy
2022-10-04  7:25               ` Russell King (Oracle)
2022-10-04  8:30                 ` Marcin Wojtas
2022-10-04  9:08                   ` Russell King (Oracle)
2022-10-04 12:36                     ` Marek Behún
2022-10-04 12:59                       ` Marcin Wojtas
2022-10-04 18:51                         ` Pali Rohár
2022-10-04 19:35                           ` Marcin Wojtas
2022-10-04  8:26               ` Marek Behún
2022-10-04  8:36                 ` Marcin Wojtas
2022-10-20 18:22                   ` Russell King (Oracle)
2022-10-20 19:10                     ` Marek Behún [this message]
2022-10-21 16:25                     ` Linus Torvalds
2022-10-21 16:30                     ` Christoph Hellwig
2022-10-21 18:21                       ` Russell King (Oracle)
2022-10-23 11:58                     ` Klaus Kudielka
2022-10-03 18:57         ` Marek Behún
2022-10-01  9:31 ` Thorsten Leemhuis
2022-11-04 12:08   ` REGRESSION in 6.0-rc7 caused by patch "ARM/dma-mapping: use dma-direct unconditionally" #forregzbot Thorsten Leemhuis

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221020211031.176c3249@thinkpad \
    --to=kabel@kernel.org \
    --cc=andre.przywara@arm.com \
    --cc=andrew@lunn.ch \
    --cc=arnd@kernel.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=gregory.clement@bootlin.com \
    --cc=hch@lst.de \
    --cc=iommu@lists.linux-foundation.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux@armlinux.org.uk \
    --cc=maz@kernel.org \
    --cc=mw@semihalf.com \
    --cc=pali@kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).