linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: "Michel Dänzer" <michel@daenzer.net>
Cc: Christoph Hellwig <hch@infradead.org>,
	Will Deacon <will.deacon@arm.com>,
	David Zhou <David1.Zhou@amd.com>,
	Maxime Ripard <maxime.ripard@bootlin.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	David Airlie <airlied@linux.ie>,
	Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	amd-gfx@lists.freedesktop.org, Junwei Zhang <Jerry.Zhang@amd.com>,
	Huang Rui <ray.huang@amd.com>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	Daniel Vetter <daniel@ffwll.ch>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Alex Deucher <alexander.deucher@amd.com>,
	Sean Paul <sean@poorly.run>,
	Christian Koenig <christian.koenig@amd.com>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>
Subject: Re: [RFC PATCH] drm: disable WC optimization for cache coherent devices on non-x86
Date: Mon, 21 Jan 2019 20:18:51 +0100	[thread overview]
Message-ID: <CAKv+Gu-gaZpKqyLt9Vft2e_zS4Xkc9+8HAYoHvuBo5C0hqL2Qw@mail.gmail.com> (raw)
In-Reply-To: <047667fd-17be-1c37-5d2a-26768cfd6ab8@daenzer.net>

On Mon, 21 Jan 2019 at 20:04, Michel Dänzer <michel@daenzer.net> wrote:
>
> On 2019-01-21 7:28 p.m., Ard Biesheuvel wrote:
> > On Mon, 21 Jan 2019 at 19:24, Michel Dänzer <michel@daenzer.net> wrote:
> >> On 2019-01-21 7:20 p.m., Ard Biesheuvel wrote:
> >>> On Mon, 21 Jan 2019 at 19:04, Michel Dänzer <michel@daenzer.net> wrote:
> >>>> On 2019-01-21 6:59 p.m., Ard Biesheuvel wrote:
> >>>>> On Mon, 21 Jan 2019 at 18:55, Michel Dänzer <michel@daenzer.net> wrote:
> >>>>>> On 2019-01-21 5:30 p.m., Ard Biesheuvel wrote:
> >>>>>>> On Mon, 21 Jan 2019 at 17:22, Christoph Hellwig <hch@infradead.org> wrote:
> >>>>>>>
> >>>>>>>> Until that happens we should just change the driver ifdefs to default
> >>>>>>>> the hacks to off and only enable them on setups where we 100%
> >>>>>>>> positively know that they actually work.  And document that fact
> >>>>>>>> in big fat comments.
> >>>>>>>
> >>>>>>> Well, as I mentioned in my commit log as well, if we default to off
> >>>>>>> unless CONFIG_X86, we may break working setups on MIPS and Power where
> >>>>>>> the device is in fact non-cache coherent, and relies on this
> >>>>>>> 'optimization' to get things working.
> >>>>>>
> >>>>>> FWIW, the amdgpu driver doesn't rely on non-snooped transfers for
> >>>>>> correct basic operation (the scenario Christian brought up is a very
> >>>>>> specialized use-case), so that shouldn't be an issue.
> >>>>>
> >>>>> The point is that this is only true for x86.
> >>>>>
> >>>>> On other architectures, the use of non-cached mappings on the CPU side
> >>>>> means that you /do/ rely on non-snooped transfers, since if those
> >>>>> transfers turn out not to snoop inadvertently, the accesses are
> >>>>> incoherent with the CPU's view of memory.
> >>>>
> >>>> The driver generally only uses non-cached mappings if
> >>>> drm_arch/device_can_wc_memory returns true.
> >>>
> >>> Indeed. And so we should take care to only return 'true' from that
> >>> function if it is guaranteed that non-cached CPU mappings are coherent
> >>> with the mappings used by the GPU, either because that is always the
> >>> case (like on x86), or because we know that the platform in question
> >>> implements NoSnoop correctly throughout the interconnect.
> >>>
> >>> What seems to be complicating matters is that in some cases, the
> >>> device is non-cache coherent to begin with, so regardless of whether
> >>> the NoSnoop attribute is used or not, those accesses will not snoop in
> >>> the caches and be coherent with the non-cached mappings used by the
> >>> CPU. So if we restrict this optimization [on non-X86] to platforms
> >>> that are known to implement NoSnoop correctly, we may break platforms
> >>> that are implicitly NoSnoop all the time.
> >>
> >> Since the driver generally doesn't rely on non-snooped accesses for
> >> correctness, that couldn't "break" anything that hasn't always been broken.
> >
> > Again, that is only true on x86.
> >
> > On other architectures, DMA writes from the device may allocate in the
> > caches, and be invisible to the CPU when it uses non-cached mappings.
>
> Let me try one last time:
>

I could say the same :-)

> If drm_arch_can_wc_memory returns false, the driver falls back to the
> normal mode of operation, using a cacheable CPU mapping and snooped GPU
> transfers, even if userspace asks (as a performance optimization) for a
> write-combined CPU mapping and non-snooped GPU transfers via
> AMDGPU_GEM_CREATE_CPU_GTT_USWC.

I am not talking about the case where drm_arch_can_wc_memory() returns false.

I am talking about the case where it returns true, which is currently
the default for all architectures, except Power and MIPS in some
cases.

This mode of operation breaks my cache coherent arm64 system. (AMD Seattle)
With this patch applied, everything works fine.

> This normal mode of operation is also
> used for the ring buffers at the heart of the driver's operation.

But is it really the same mode of operation? Does it also vmap() the
pages? Or does it use the DMA API to allocate the ring buffers?
Because in the latter case, this will give you non-cached CPU mappings
as well if the device is non-cache coherent.

> If
> there is a platform where this normal mode of operation doesn't work,
> the driver could never have worked reliably on that platform, since
> before AMDGPU_GEM_CREATE_CPU_GTT_USWC or drm_arch_can_wc_memory even
> existed.
>

As I said, I am talking about the case where drm_arch_can_wc_memory()
returns true on a cache coherent system. This relies on NoSnoop being
implemented correctly in the platform, or a CPU architecture that
snoops the caches when doing uncached memory accesses (such as x86)

  reply	other threads:[~2019-01-21 19:19 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-21 10:06 [RFC PATCH] drm: disable WC optimization for cache coherent devices on non-x86 Ard Biesheuvel
2019-01-21 10:11 ` Koenig, Christian
2019-01-21 15:07 ` Christoph Hellwig
2019-01-21 15:33   ` Ard Biesheuvel
2019-01-21 15:59     ` Christoph Hellwig
2019-01-21 16:14       ` Ard Biesheuvel
2019-01-21 16:22         ` Christoph Hellwig
2019-01-21 16:30           ` Ard Biesheuvel
2019-01-21 16:35             ` Christoph Hellwig
2019-01-21 16:50               ` Ard Biesheuvel
2019-01-21 17:55             ` Michel Dänzer
2019-01-21 17:59               ` Ard Biesheuvel
2019-01-21 18:04                 ` Michel Dänzer
2019-01-21 18:20                   ` Ard Biesheuvel
2019-01-21 18:23                     ` Michel Dänzer
2019-01-21 18:28                       ` Ard Biesheuvel
2019-01-21 19:04                         ` Michel Dänzer
2019-01-21 19:18                           ` Ard Biesheuvel [this message]
2019-01-22  8:38                           ` Ard Biesheuvel
2019-01-22 20:56                             ` Alex Deucher
2019-01-22 21:07                               ` Ard Biesheuvel
2019-01-23  7:15                                 ` Christoph Hellwig
2019-01-23  9:08                                   ` Ard Biesheuvel
2019-01-23 16:44                                     ` Christoph Hellwig
2019-01-23 16:52                                       ` Ard Biesheuvel
2019-01-24  9:13                                         ` Christoph Hellwig
2019-01-24  9:25                                           ` Koenig, Christian
2019-01-24  9:28                                             ` Ard Biesheuvel
2019-01-24  9:45                                               ` Koenig, Christian
2019-01-24  9:59                                                 ` Ard Biesheuvel
2019-01-24 11:23                                                   ` Koenig, Christian
2019-01-24 11:26                                                     ` Ard Biesheuvel
2019-01-24 11:37                                                       ` Koenig, Christian
2019-01-24 11:45                                                         ` Ard Biesheuvel
2019-01-24 13:54                                                           ` Alex Deucher
2019-01-24 13:57                                                             ` Ard Biesheuvel
2019-01-24 14:00                                                               ` Alex Deucher
2019-01-24 16:04                                                           ` Michel Dänzer
2019-01-24  9:31                                         ` Michel Dänzer
2019-01-24  9:37                                           ` Ard Biesheuvel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAKv+Gu-gaZpKqyLt9Vft2e_zS4Xkc9+8HAYoHvuBo5C0hqL2Qw@mail.gmail.com \
    --to=ard.biesheuvel@linaro.org \
    --cc=David1.Zhou@amd.com \
    --cc=Jerry.Zhang@amd.com \
    --cc=airlied@linux.ie \
    --cc=alexander.deucher@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=benh@kernel.crashing.org \
    --cc=christian.koenig@amd.com \
    --cc=daniel@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=hch@infradead.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maarten.lankhorst@linux.intel.com \
    --cc=maxime.ripard@bootlin.com \
    --cc=michel@daenzer.net \
    --cc=mpe@ellerman.id.au \
    --cc=ray.huang@amd.com \
    --cc=sean@poorly.run \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).