linux-sunxi.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Anup Patel <Anup.Patel@wdc.com>
To: Guo Ren <guoren@kernel.org>
Cc: Atish Patra <atishp@atishpatra.org>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	"anup@brainfault.org" <anup@brainfault.org>,
	"drew@beagleboard.org" <drew@beagleboard.org>,
	Christoph Hellwig <hch@lst.de>,
	"wefu@redhat.com" <wefu@redhat.com>,
	"lazyparser@gmail.com" <lazyparser@gmail.com>,
	"linux-riscv@lists.infradead.org"
	<linux-riscv@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	"linux-sunxi@lists.linux.dev" <linux-sunxi@lists.linux.dev>,
	"guoren@linux.alibaba.com" <guoren@linux.alibaba.com>,
	Paul Walmsley <paul.walmsley@sifive.com>
Subject: RE: [PATCH RFC 0/3] riscv: Add DMA_COHERENT support
Date: Mon, 7 Jun 2021 04:47:34 +0000	[thread overview]
Message-ID: <CO6PR04MB7812D075519BCFBAF744DE7D8D389@CO6PR04MB7812.namprd04.prod.outlook.com> (raw)
In-Reply-To: <CAJF2gTSgC2+ULpfSQKvan7phPf8VT+nxUiekfpHELNjsQYo=CA@mail.gmail.com>



> -----Original Message-----
> From: Guo Ren <guoren@kernel.org>
> Sent: 07 June 2021 09:52
> To: Anup Patel <Anup.Patel@wdc.com>
> Cc: Atish Patra <atishp@atishpatra.org>; Palmer Dabbelt
> <palmer@dabbelt.com>; anup@brainfault.org; drew@beagleboard.org;
> Christoph Hellwig <hch@lst.de>; wefu@redhat.com; lazyparser@gmail.com;
> linux-riscv@lists.infradead.org; linux-kernel@vger.kernel.org; linux-
> arch@vger.kernel.org; linux-sunxi@lists.linux.dev; guoren@linux.alibaba.com;
> Paul Walmsley <paul.walmsley@sifive.com>
> Subject: Re: [PATCH RFC 0/3] riscv: Add DMA_COHERENT support
> 
> Hi Anup,
> 
> On Mon, Jun 7, 2021 at 11:38 AM Anup Patel <Anup.Patel@wdc.com> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: Guo Ren <guoren@kernel.org>
> > > Sent: 06 June 2021 22:42
> > > To: Anup Patel <Anup.Patel@wdc.com>; Atish Patra
> > > <atishp@atishpatra.org>
> > > Cc: Palmer Dabbelt <palmer@dabbelt.com>; anup@brainfault.org;
> > > drew@beagleboard.org; Christoph Hellwig <hch@lst.de>;
> > > wefu@redhat.com; lazyparser@gmail.com;
> > > linux-riscv@lists.infradead.org; linux- kernel@vger.kernel.org;
> > > linux-arch@vger.kernel.org; linux- sunxi@lists.linux.dev;
> > > guoren@linux.alibaba.com; Paul Walmsley <paul.walmsley@sifive.com>
> > > Subject: Re: [PATCH RFC 0/3] riscv: Add DMA_COHERENT support
> > >
> > > Hi Anup and Atish,
> > >
> > > On Thu, Jun 3, 2021 at 2:00 PM Anup Patel <Anup.Patel@wdc.com>
> wrote:
> > > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Palmer Dabbelt <palmer@dabbelt.com>
> > > > > Sent: 03 June 2021 09:43
> > > > > To: guoren@kernel.org
> > > > > Cc: anup@brainfault.org; drew@beagleboard.org; Christoph Hellwig
> > > > > <hch@lst.de>; Anup Patel <Anup.Patel@wdc.com>;
> wefu@redhat.com;
> > > > > lazyparser@gmail.com; linux-riscv@lists.infradead.org; linux-
> > > > > kernel@vger.kernel.org; linux-arch@vger.kernel.org; linux-
> > > > > sunxi@lists.linux.dev; guoren@linux.alibaba.com; Paul Walmsley
> > > > > <paul.walmsley@sifive.com>
> > > > > Subject: Re: [PATCH RFC 0/3] riscv: Add DMA_COHERENT support
> > > > >
> > > > > On Sat, 29 May 2021 17:30:18 PDT (-0700), Palmer Dabbelt wrote:
> > > > > > On Fri, 21 May 2021 17:36:08 PDT (-0700), guoren@kernel.org
> wrote:
> > > > > >> On Wed, May 19, 2021 at 3:15 PM Anup Patel
> > > > > >> <anup@brainfault.org>
> > > > > wrote:
> > > > > >>>
> > > > > >>> On Wed, May 19, 2021 at 12:24 PM Drew Fustini
> > > > > <drew@beagleboard.org> wrote:
> > > > > >>> >
> > > > > >>> > On Wed, May 19, 2021 at 08:06:17AM +0200, Christoph
> > > > > >>> > Hellwig
> > > > > wrote:
> > > > > >>> > > On Wed, May 19, 2021 at 02:05:00PM +0800, Guo Ren
> wrote:
> > > > > >>> > > > Since the existing RISC-V ISA cannot solve this
> > > > > >>> > > > problem, it is better to provide some configuration
> > > > > >>> > > > for the SOC vendor to
> > > > > customize.
> > > > > >>> > >
> > > > > >>> > > We've been talking about this problem for close to five years.
> > > > > >>> > > So no, if you don't manage to get the feature into the
> > > > > >>> > > ISA it can't be supported.
> > > > > >>> >
> > > > > >>> > Isn't it a good goal for Linux to support the capabilities
> > > > > >>> > present in the SoC that a currently being fab'd?
> > > > > >>> >
> > > > > >>> > I believe the CMO group only started last year [1] so the
> > > > > >>> > RV64GC SoCs that are going into mass production this year
> > > > > >>> > would not have had the opporuntiy of utilizing any RISC-V
> > > > > >>> > ISA extension for handling cache management.
> > > > > >>>
> > > > > >>> The current Linux RISC-V policy is to only accept patches
> > > > > >>> for frozen or ratified ISA specs.
> > > > > >>> (Refer, Documentation/riscv/patch-acceptance.rst)
> > > > > >>>
> > > > > >>> This means even if emulate CMO instructions in OpenSBI, the
> > > > > >>> Linux patches won't be taken by Palmer because CMO
> > > > > >>> specification is still in draft stage.
> > > > > >> Before CMO specification release, could we use a sbi_ecall to
> > > > > >> solve the current problem? This is not against the
> > > > > >> specification, when CMO is ready we could let users choose to
> > > > > >> use the new CMO in
> > > Linux.
> > > > > >>
> > > > > >> From a tech view, CMO trap emulation is the same as sbi_ecall.
> > > > > >>
> > > > > >>>
> > > > > >>> Also, we all know how much time it takes for RISCV
> > > > > >>> international to freeze some spec. Judging by that we are
> > > > > >>> looking at another
> > > > > >>> 3-4 years at minimum.
> > > > > >
> > > > > > Sorry for being slow here, this thread got buried.
> > > > > >
> > > > > > I've been trying to work with a handful of folks at the RISC-V
> > > > > > foundation to try and get a subset of the various
> > > > > > in-development specifications (some simple CMOs, something
> > > > > > about non-caching in the page tables, and some way to prevent
> > > > > > speculative accesse from generating coherence traffic that will break
> non-coherent systems).
> > > > > > I'm not sure we can get this together quickly, but I'd prefer
> > > > > > to at least try before we jump to taking vendor-specificed behavior
> here.
> > > > > > It's obviously an up-hill battle to try and get specifications
> > > > > > through the process and I'm certainly not going to promise it
> > > > > > will work, but I'm hoping that the impending need to avoid
> > > > > > forking the ISA will be sufficient to get people behind
> > > > > > producing some specifications in a timely
> > > > > fashion.
> > > > > >
> > > > > > I wasn't aware than this chip had non-coherent devices until I
> > > > > > saw this thread, so we'd been mostly focused on the Beagle V chip.
> > > > > > That was in a sense an easier problem because the SiFive IP in
> > > > > > it was never designed to have non-coherent devices so we'd
> > > > > > have to make anything work via a series of slow workarounds,
> > > > > > which would make emulating the eventually standardized
> > > > > > behavior reasonable in terms of performance (ie, everything
> > > > > > would be super slow so who really
> > > cares).
> > > > > >
> > > > > > I don't think relying on some sort of SBI call for the CMOs
> > > > > > whould be such a performance hit that it would prevent these
> > > > > > systems from being viable, but assuming you have reasonable
> > > > > > performance on your non-cached accesses then that's probably
> > > > > > not going to be viable to trap and emulate.  At that point it
> > > > > > really just becomes silly to pretend that we're still making
> > > > > > things work by emulating the eventually ratified behavior, as
> > > > > > anyone who actually tries to use this thing to do IO would
> > > > > > need out of tree patches.  I'm not sure exactly what the plan
> > > > > > is for the page table bits in the specification right now, but
> > > > > > if you can give me a pointer to some documentation then I'm
> > > > > > happy to try and push for something
> > > compatible.
> > > > > >
> > > > > > If we can't make the process work at the foundation then I'd
> > > > > > be strongly in favor of just biting the bullet and starting to
> > > > > > take vendor-specific code that's been implemented in hardware
> > > > > > and is necessarry to make things work acceptably.  That's
> > > > > > obviously a sub-optimal solution as it'll lead to a bunch of
> > > > > > ISA fragmentation, but at least we'll be able to keep the
> > > > > > software stack
> > > together.
> > > > > >
> > > > > > Can you tell us when these will be in the hands of users?
> > > > > > That's pretty important here, as I don't want to be blocking
> > > > > > real users from having their hardware work.  IIRC there were
> > > > > > some plans to distribute early boards, but it looks like the
> > > > > > foundation got involved and I guess I lost the thread at that point.
> > > > > >
> > > > > > Sorry this is all such a headache, but hopefully we can get
> > > > > > things sorted out.
> > > > >
> > > > > I talked with some of the RISC-V foundation folks, we're not
> > > > > going to have an ISA specification for the non-coherent stuff
> > > > > any time soon.  I took a look at this code and I definately
> > > > > don't want to take it as is, but I'm not opposed to taking
> > > > > something that makes the
> > > hardware work as long as it's a lot cleaner.
> > > > > We've already got two of these non-coherent chips, I'm sure more
> > > > > will come, and I'd rather have the extra headaches than make
> > > > > everyone fork the software stack.
> > > >
> > > > Thanks for confirming. The CMO extension is still in early stages
> > > > so it will certainly take more time for them. After CMO extension
> > > > is finalized, it will take some more time to have actual RISC-V
> > > > platforms with
> > > CMO implemented.
> > > >
> > > > >
> > > > > After talking to Atish it looks like there's likely to be an SBI
> > > > > extension to handle the CMOs, which should let us avoid the bulk
> > > > > of the vendor-specific behavior in the kernel.  I know some
> > > > > people are worried about adding to the SBI surface.  I'm worried
> > > > > about that too, but that's way better than sticking a bunch of
> > > > > vendor-specific instructions into the kernel.  The SBI extension
> > > > > should make for a straight-forward cache flush implementation in
> > > > > Linux, so let's just plan on
> > > that getting through quickly (as has been done before).
> > > >
> > > > Yes, I agree. We can have just a single SBI call which is meant
> > > > for DMA sync purpose only which means it will flush/invalidate
> > > > pages from all cache levels irrespective of the cache hierarchy (i.e.
> > > > flush/invalidate to RAM). The CMO extension might more generic
> > > > cache operations which can target any cache level.
> > > >
> > > > I am already preparing a write-up for SBI DMA sync call in SBI
> > > > spec. I will share it with you separately as well.
> > > >
> > > > >
> > > > > Unfortunately we've yet to come up with a way to handle the
> > > > > non-cacheable mappings without introducing a degree of
> > > > > vendor-specific behavior or seriously impacting performance
> > > > > (mark them as not valid and deal with them in the trap handler).
> > > > > I'm not really sure it counts as supporting the hardware if it's
> > > > > massively slow, so that really leaves us with vendor-specific
> > > > > mappings as the only
> > > option to make these chips work.
> > > >
> > > > A RISC-V platform can have non-cacheable mappings is following
> > > > possible
> > > > ways:
> > > > 1) Fixed physical address range as non-cacheable using PMAs
> > > > 2) Custom page table attributes
> > > > 3) Svpmbt extension being defined by RVI
> > > >
> > > > Atish and me both think it is possible to have RISC-V specific DMA
> > > > ops implementation which can handle all above case. Atish is
> > > > already working on DMA ops implementation for RISC-V.
> > > Not only DMA ops, but also icache_sync & __vdso_icache_sync. Please
> > > have a look at:
> > > https://lore.kernel.org/linux-riscv/1622970249-50770-12-git-send-ema
> > > il-
> > > guoren@kernel.org/T/#u
> >
> > The icache_sync and __vdso_icache_sync will have to be addressed
> > differently. The SBI DMA sync extension cannot address this.
> Agree
> 
> >
> > It seems Allwinner D1 have more non-standard stuff:
> > 1) Custom PTE bits for IO-coherent access
> > 2) Custom data cache flush/invalidate for DMA sync
> > 3) Custom icache flush/invalidate
> Yes, but 3) is a performance optimization, not critical for running.
> 
> >
> > Other hand, BeagleV has only two problems:
> > 1) Custom physical address range for IO-coherent access
> > 2) Custom L2 cache flush/invalidate for DMA sync
> https://github.com/starfive-
> tech/linux/commit/d4c4044c08134dca8e5eaaeb6d3faf97dc453b6d
> 
> Currently, they still use DMA sync with DMA descriptor, are you sure they
> have minor memory physical address.
> 
> >
> > From above #2, can be solved by SBI DMA sync call and Linux DMA ops
> > for both BeagleV and Allwinner D1
> >
> > On BeagleV, issue #1 can be solved using "dma-ranges".
> >
> > On Allwinner D1, issues #1 and #3 need to be addressed separately.
> >
> > I think supporting BeagleV in upstream Linux is relatively easy
> > compared to Allwinner D1.
> >
> > @Guo, please check if you can reserve dedicated physical address range
> > for IO-coherent access (just like BeagleV). If yes, then we can tackle
> > issue #1 for Allwinner
> > D1 using "dma-ranges" DT property.
> There is no dedicated physical address range for IO-coherent access in D1. But
> the solution you mentioned couldn't solve all requirements.
> Only one mirror physical address range is not enough, we need at least three
> (Normal, DMA desc, frame buffer).

How many non-coherent devices you really have?

I am guess lot of critical devices on Allwinner D1 are not coherent with CPU.
The problem for Allwinner D1 is even worst than I thought. If such critical
high through-put devices are not cache coherent with CPU then I am
speechless about Allwinner D1 situation.

> And that will triple the memory physical address which can't be accepted by
> our users from the hardware design cost view.
> 
>  "dma-ranges" DT property is a big early MIPS smell. ARM SOC users can't
> accept it. (They just say replace the CPU, but don't touch anything other.)
> 
> PTE attributes are the non-coherent solution for many years. MIPS also
> follows that now:
> ref arch/mips/include/asm/pgtable-bits.h &
> arch/mips/include/asm/pgtable.h

RISC-V is in the process of standardizing Svpmbt extension.

Unfortunately, the higher order bits which your implementation uses is
not for SoC vendor use as-per the RISC-V privilege spec.

> 
> #ifndef _CACHE_CACHABLE_NO_WA
> #define _CACHE_CACHABLE_NO_WA           (0<<_CACHE_SHIFT)
> #endif
> #ifndef _CACHE_CACHABLE_WA
> #define _CACHE_CACHABLE_WA              (1<<_CACHE_SHIFT)
> #endif
> #ifndef _CACHE_UNCACHED
> #define _CACHE_UNCACHED                 (2<<_CACHE_SHIFT)
> #endif
> #ifndef _CACHE_CACHABLE_NONCOHERENT
> #define _CACHE_CACHABLE_NONCOHERENT     (3<<_CACHE_SHIFT)
> #endif
> #ifndef _CACHE_CACHABLE_CE
> #define _CACHE_CACHABLE_CE              (4<<_CACHE_SHIFT)
> #endif
> #ifndef _CACHE_CACHABLE_COW
> #define _CACHE_CACHABLE_COW             (5<<_CACHE_SHIFT)
> #endif
> #ifndef _CACHE_CACHABLE_CUW
> #define _CACHE_CACHABLE_CUW             (6<<_CACHE_SHIFT)
> #endif
> #ifndef _CACHE_UNCACHED_ACCELERATED
> #define _CACHE_UNCACHED_ACCELERATED     (7<<_CACHE_SHIFT)
> 
> We can't force our users to double/triplicate their physical memory regions.

We are trying to find a workable solution here so that we don't have
to deal with custom PTE attributes which are reserved for RISC-V priv
specification only.

Regards,
Anup

> 
> >
> > Regards,
> > Anup
> >
> > >
> > >
> > > >
> > > > >
> > > > > This implementation, which adds some Kconfig entries that
> > > > > control page table bits, definately isn't suitable for upstream.
> > > > > Allowing users to set arbitrary page table bits will eventually
> > > > > conflict with the standard, and is just going to be a mess.
> > > > > It'll also lead to kernels that are only compatible with
> > > > > specific designs, which we're trying very hard to avoid.  At a
> > > > > bare minimum we'll need some way to detect systems with these
> > > > > page table bits before setting them, and some description of
> > > > > what the bits actually do so we can reason about
> > > them.
> > > >
> > > > Yes, vendor specific Kconfig options are strict NO NO. We can't
> > > > give-up the goal of unified kernel image for all platforms.
> > > >
> > > > Regards,
> > > > Anup
> > >
> > >
> > >
> > > --
> > > Best Regards
> > >  Guo Ren
> > >
> > > ML: https://lore.kernel.org/linux-csky/
> 
> 
> 
> --
> Best Regards
>  Guo Ren
> 
> ML: https://lore.kernel.org/linux-csky/

  reply	other threads:[~2021-06-07  4:47 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-19  5:04 [PATCH RFC 0/3] riscv: Add DMA_COHERENT support guoren
2021-05-19  5:04 ` [PATCH RFC 1/3] riscv: pgtable.h: Fixup _PAGE_CHG_MASK usage guoren
2021-05-19  5:04 ` [PATCH RFC 2/3] riscv: Add DMA_COHERENT for custom PTE attributes guoren
2021-05-19  5:04 ` [PATCH RFC 3/3] riscv: Add SYNC_DMA_FOR_CPU/DEVICE for DMA_COHERENT guoren
2021-05-19  6:32   ` Guo Ren
2021-05-19  5:20 ` [PATCH RFC 0/3] riscv: Add DMA_COHERENT support Christoph Hellwig
2021-05-19  5:48   ` Guo Ren
2021-05-19  5:55     ` Christoph Hellwig
2021-05-19  6:09       ` Guo Ren
2021-05-19  6:44     ` Drew Fustini
2021-05-19  6:53       ` Christoph Hellwig
2021-05-20  1:45         ` Guo Ren
2021-05-20  5:48           ` Christoph Hellwig
2021-06-06 18:14           ` Nick Kossifidis
2021-06-07  0:04             ` Guo Ren
2021-06-07  2:16               ` Nick Kossifidis
2021-06-07  3:19                 ` Guo Ren
2021-06-07  6:27                   ` Christoph Hellwig
2021-06-07  6:41                     ` Guo Ren
2021-06-07  6:51                       ` Christoph Hellwig
2021-06-07  7:46                         ` Guo Ren
2021-06-08 15:00                     ` David Laight
2021-06-08 15:32                       ` 'Christoph Hellwig'
2021-06-08 16:11                         ` David Laight
2021-06-07  8:35                   ` Nick Kossifidis
2021-06-09  3:28             ` Guo Ren
2021-06-09  6:05               ` Jisheng Zhang
2021-06-09  9:45               ` Nick Kossifidis
2021-06-09 12:43                 ` Guo Ren
2021-05-19  6:05   ` Guo Ren
2021-05-19  6:06     ` Christoph Hellwig
2021-05-19  6:11       ` Guo Ren
2021-05-19  6:54       ` Drew Fustini
2021-05-19  6:56         ` Christoph Hellwig
2021-05-19  7:14         ` Anup Patel
2021-05-19  8:25           ` Damien Le Moal
2021-05-20  1:47           ` Guo Ren
2021-05-20  1:59             ` Guo Ren
2021-05-22  0:36           ` Guo Ren
2021-05-30  0:30             ` Palmer Dabbelt
2021-06-03  4:13               ` Palmer Dabbelt
2021-06-03  6:00                 ` Anup Patel
2021-06-03 15:39                   ` Palmer Dabbelt
2021-06-04  9:02                     ` David Laight
2021-06-04  9:53                     ` Arnd Bergmann
2021-06-04 14:47                       ` Guo Ren
2021-06-04 16:12                         ` Palmer Dabbelt
2021-06-04 21:26                           ` Arnd Bergmann
2021-06-04 22:10                             ` Palmer Dabbelt
2021-06-08 12:26                           ` Guo Ren
2021-06-06 17:11                   ` Guo Ren
2021-06-07  3:38                     ` Anup Patel
2021-06-07  4:22                       ` Guo Ren
2021-06-07  4:47                         ` Anup Patel [this message]
2021-06-07  5:08                           ` Guo Ren
2021-06-07  5:13                           ` Guo Ren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CO6PR04MB7812D075519BCFBAF744DE7D8D389@CO6PR04MB7812.namprd04.prod.outlook.com \
    --to=anup.patel@wdc.com \
    --cc=anup@brainfault.org \
    --cc=atishp@atishpatra.org \
    --cc=drew@beagleboard.org \
    --cc=guoren@kernel.org \
    --cc=guoren@linux.alibaba.com \
    --cc=hch@lst.de \
    --cc=lazyparser@gmail.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=linux-sunxi@lists.linux.dev \
    --cc=palmer@dabbelt.com \
    --cc=paul.walmsley@sifive.com \
    --cc=wefu@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).