linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Marcin Wojtas <mw@semihalf.com>
To: "Pali Rohár" <pali@kernel.org>
Cc: "Marek Behún" <kabel@kernel.org>,
	"Russell King (Oracle)" <linux@armlinux.org.uk>,
	"Christoph Hellwig" <hch@lst.de>,
	"Robin Murphy" <robin.murphy@arm.com>,
	"Arnd Bergmann" <arnd@kernel.org>,
	"Andre Przywara" <andre.przywara@arm.com>,
	"Marc Zyngier" <maz@kernel.org>,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	"Andrew Lunn" <andrew@lunn.ch>,
	"Gregory Clement" <gregory.clement@bootlin.com>,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	iommu@lists.linux-foundation.org,
	linux-arm-kernel@lists.infradead.org
Subject: Re: REGRESSION in 6.0-rc7 caused by patch "ARM/dma-mapping: use dma-direct unconditionally"
Date: Tue, 4 Oct 2022 00:03:26 +0200	[thread overview]
Message-ID: <CAPv3WKdB_3_hS4p46MhUDFLkL6c8P6=jkprjANYDZFXNayBAZw@mail.gmail.com> (raw)
In-Reply-To: <20221003213527.ocsfgfp5xe4vu5kk@pali>

Hi,


pon., 3 paź 2022 o 23:35 Pali Rohár <pali@kernel.org> napisał(a):
>
> Hello!
>
> On Monday 03 October 2022 23:30:31 Marcin Wojtas wrote:
> > Hi Marek,
> >
> >
> > pon., 3 paź 2022 o 17:33 Marek Behún <kabel@kernel.org> napisał(a):
> > >
> > > On Mon, 3 Oct 2022 15:11:44 +0100
> > > "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> > >
> > > > On Mon, Oct 03, 2022 at 09:30:37AM +0200, Christoph Hellwig wrote:
> > > > > On Fri, Sep 30, 2022 at 05:02:05PM +0200, Marek Behún wrote:
> > > > > > It seems that the null pointer dereference comes from the data variable
> > > > > > having zero value. We assign
> > > > > >   data = (u8 *)(uintptr_t)rx_desc->buf_cookie;
> > > > >
> > > > > I never see any assignment to ->buf_cookie in the driver, what am
> > > > > I missing?
> > > >
> > > > I think Marek's setup (like my setups) use the hardware buffer manager,
> > > > and it's hardware that fills in the "buf_cookie", which is supposed to
> > > > be the virtual address of the buffer.
> > > >
> > > > Each buffer supplied to the hardware buffer manager is supposed to
> > > > contain the virtual address in the first 32-bit word in that buffer.
> > > >
> > > > This is done by mvneta_bm_construct():
> > > >
> > > >         /* In order to update buf_cookie field of RX descriptor properly,
> > > >          * BM hardware expects buf virtual address to be placed in the
> > > >          * first four bytes of mapped buffer.
> > > >          */
> > > >         *(u32 *)buf = (u32)buf;
> > > >
> > > > immediately prior to dma_map_single(..., DMA_FROM_DEVICE) is called.
> > > >
> > > > If I had to guess, I would suggest that this write is being lost via
> > > > cache invalidation, and given that the hardware BM both reads and
> > > > writes this buffer, DMA_FROM_DEVICE is not correct, it should be
> > > > DMA_BIDIRECTIONAL.
> > > >
> >
> > I think the DMA_FROM_DEVICE is used rather properly in the RX path of
> > the driver - the CPU doesn't access the payload afterward. The BM only
> > pushes the pointers back and forth between internal SRAM ('internal
> > pool' - BPPI) to DRAM ('external pool' - BPPE) and the descriptors,
> > but afaik it should not touch the buffer contents. But maybe somehow
> > it affects the coherency and DMA_BIDIRECTIONAL are indeed required...
> > About the coherency itself - please see my comment below.
> >
> > Another thought - when writing to *buf (memory normal) shouldn't we add a dsb()?
>
> Hm... When you are talking about IO coherency... This reminds me that
> there is a HW bug on A38x with undocumented errata related to IO
> coherency. It is not mentioned in the official A38x errata document but
> kernel has implemented some kind of workaround for it. But last time
> when I tried to understand it I had feeling that it was implemented
> incorrectly and maybe does not do what it is expected to do?
>
> Maybe it should be revisited, now?

It mentions problems with deadlocks when outer syncs are performed:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/mach-mvebu/coherency.c?h=v6.0#n183
A couple of lines above __arm_ioremap_caller is overriden, but IMO
it's not related to what we see in our issue.

If that may help - despite lack of 'dma-coherent' property in DT, the
FreeBSD works with full IO cache-coherency (e.g. cache maintenance in
NETA drivers are effectively nop's) on Clearfog and Clearfog-like
devices in production. So it's definitely possible, and I'd like to
check that in Linux (I may also have a setup this week available).

Best regards,
Marcin

>
> > > > Changing that is probably going to need DMA_FROM_DEVICE also changed
> > > > elsewhere in the mvneta_bm and mvneta driver.
> > > >
> > > > I'm not in a position where I could test that out. Marek?
> > > >
> > >
> > > Hello Russell,
> > >
> > > thanks for your suggestion!
> > >
> > > Adding Pali, since he has some information (see at the end of this
> > > message).
> > >
> > > The attached patch seems to solve the null-pointer dereference.
> >
> > Did you manage to measure performance impact?
> >
> > I have one overall concern here. On all kinds of A38x-based boards I
> > worked on, by default, the firmware set all devices (e.g. network,
> > AHCI, XHCI) on MBUS as fully IO cache coherent - it should be
> > reflected in the MVNETA_WIN_BASE(w) registers attribute field. Bits
> > [15:8] should be set to 0x1D (or 0x1E if there is a second DRAM CS
> > used). Can you please try adding 'dma-coherent;' property under the
> > 'internal-regs' node?
> >
> > > I booted into single user mode and enabled eth2. Before it caused the
> > > NULL pointer dereference after link got up, not it does not happen.
> > >
> > > But I am still encountering the freeze after booting into system.
> > >
> > > Maybe these are different bugs?
> > >
> > > I am thinking whether we don't need something similar like
> > >   7bea67a99430 ("ARM: dts integrator: Fix DMA ranges")
> > > also for mvebu.
> > > I seem to remember Pali talking about how the ranges defined in some
> > > upstream mvebu-tree, using MBUS_ID() macros, are incorrect.
> > > Pali, what do you remember about this?
> > >
> >
> > Afaik there should be no issues around MBUS configuration for non-PCIE devices.
> >
> > Best regards,
> > Marcin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2022-10-03 22:04 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-30 13:10 REGRESSION in 6.0-rc7 caused by patch "ARM/dma-mapping: use dma-direct unconditionally" Marek Behún
2022-09-30 13:46 ` Robin Murphy
2022-09-30 14:52   ` Marek Behún
2022-09-30 15:02     ` Marek Behún
2022-09-30 16:41       ` Robin Murphy
2022-09-30 18:02         ` Marek Behún
2022-10-03  7:21           ` Christoph Hellwig
2022-10-03  7:30       ` Christoph Hellwig
2022-10-03 14:11         ` Russell King (Oracle)
2022-10-03 15:25           ` Marek Behún
2022-10-03 16:09             ` Pali Rohár
2022-10-03 19:04               ` Marek Behún
2022-10-03 19:08                 ` Pali Rohár
2022-10-03 21:30             ` Marcin Wojtas
2022-10-03 21:35               ` Pali Rohár
2022-10-03 22:03                 ` Marcin Wojtas [this message]
2022-10-04  7:10               ` Christoph Hellwig
2022-10-04  8:15                 ` Marek Behún
2022-10-04  8:17                   ` [PATCH] ARM: mvebu: select OF_DMA_DEFAULT_COHERENT if MACH_MVEBU_V7 Marek Behún
2022-10-04  8:30                     ` Christoph Hellwig
2022-10-04 12:54                       ` Marek Behún
2022-10-04  8:30                     ` Arnd Bergmann
2022-10-04  9:14                     ` Thorsten Leemhuis
2022-10-04  9:22                       ` Russell King (Oracle)
2022-10-04  9:56                 ` REGRESSION in 6.0-rc7 caused by patch "ARM/dma-mapping: use dma-direct unconditionally" Robin Murphy
2022-10-04  7:25               ` Russell King (Oracle)
2022-10-04  8:30                 ` Marcin Wojtas
2022-10-04  9:08                   ` Russell King (Oracle)
2022-10-04 12:36                     ` Marek Behún
2022-10-04 12:59                       ` Marcin Wojtas
2022-10-04 18:51                         ` Pali Rohár
2022-10-04 19:35                           ` Marcin Wojtas
2022-10-04  8:26               ` Marek Behún
2022-10-04  8:36                 ` Marcin Wojtas
2022-10-20 18:22                   ` Russell King (Oracle)
2022-10-20 19:10                     ` Marek Behún
2022-10-21 16:25                     ` Linus Torvalds
2022-10-21 16:30                     ` Christoph Hellwig
2022-10-21 18:21                       ` Russell King (Oracle)
2022-10-23 11:58                     ` Klaus Kudielka
2022-10-03 18:57         ` Marek Behún
2022-10-01  9:31 ` Thorsten Leemhuis
2022-11-04 12:08   ` REGRESSION in 6.0-rc7 caused by patch "ARM/dma-mapping: use dma-direct unconditionally" #forregzbot Thorsten Leemhuis

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAPv3WKdB_3_hS4p46MhUDFLkL6c8P6=jkprjANYDZFXNayBAZw@mail.gmail.com' \
    --to=mw@semihalf.com \
    --cc=andre.przywara@arm.com \
    --cc=andrew@lunn.ch \
    --cc=arnd@kernel.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=gregory.clement@bootlin.com \
    --cc=hch@lst.de \
    --cc=iommu@lists.linux-foundation.org \
    --cc=kabel@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux@armlinux.org.uk \
    --cc=maz@kernel.org \
    --cc=pali@kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).