From: Daniel Vetter <daniel@ffwll.ch>
To: "Christian König" <christian.koenig@amd.com>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
	"David Airlie" <airlied@linux.ie>,
	"Daniel Vetter" <daniel.vetter@ffwll.ch>,
	"Intel Graphics Development" <intel-gfx@lists.freedesktop.org>,
	"DRI Development" <dri-devel@lists.freedesktop.org>,
	"Thomas Zimmermann" <tzimmermann@suse.de>,
	"Daniel Vetter" <daniel.vetter@intel.com>
Subject: Re: [PATCH v4 3/4] drm/shmem-helpers: Allocate wc pages on x86
Date: Fri, 23 Jul 2021 10:34:14 +0200	[thread overview]
Message-ID: <YPp/BlD8zrM98+6C@phenom.ffwll.local> (raw)
In-Reply-To: <be56fbe8-5151-ef8d-13cb-0b8a71f4d1e0@amd.com>

On Fri, Jul 23, 2021 at 10:02:39AM +0200, Christian König wrote:
> Am 23.07.21 um 09:36 schrieb Daniel Vetter:
> > On Thu, Jul 22, 2021 at 08:40:56PM +0200, Thomas Zimmermann wrote:
> > > Hi
> > > 
> > > Am 13.07.21 um 22:51 schrieb Daniel Vetter:
> > > [SNIP]
> > > > +#ifdef CONFIG_X86
> > > > +	if (shmem->map_wc)
> > > > +		set_pages_array_wc(pages, obj->size >> PAGE_SHIFT);
> > > > +#endif
> > > I cannot comment much on the technical details of the caching of various
> > > architectures. If this patch goes in, there should be a longer comment that
> > > reflects the discussion in this thread. It's apparently a workaround.
> > > 
> > > I think the call itself should be hidden behind a DRM API, which depends on
> > > CONFIG_X86. Something simple like
> > > 
> > > #ifdef CONFIG_X86
> > > drm_set_pages_array_wc()
> > > {
> > > 	set_pages_array_wc();
> > > }
> > > #else
> > > drm_set_pages_array_wc()
> > > {
> > > }
> > > #endif
> > > 
> > > Maybe in drm_cache.h?
> > We do have a bunch of this in drm_cache.h already, and architecture
> > maintainers hate us for it.
> 
> Yeah, for good reasons :)
> 
> > The real fix is to get at the architecture-specific wc allocator, which is
> > currently not something that's exposed, but hidden within the dma api. I
> > think having this stick out like this is better than hiding it behind fake
> > generic code (like we do with drm_clflush, which de facto also only really
> > works on x86).
> 
> The DMA API also doesn't really touch that stuff as far as I know.
> 
> What we do instead on other architectures is set the appropriate caching
> flags on the CPU mappings, see ttm_prot_from_caching().

This alone doesn't do cache flushes. And at least on some Arm CPUs, having
inconsistent mappings can lead to interconnect hangs, so you have to at
least punch out the kernel linear map. On some Arm cores that isn't possible
(because the kernel map is a special linear map and not done with
pagetables), which means you need to carve this memory out at boot and
treat it as GFP_HIGHMEM.

Afaik the dma-api has such an allocator somewhere, which does the right
thing for dma_alloc_coherent.

Also shmem helpers already set the caching pgprot.

> > Also note that ttm has the exact same ifdef in its page allocator, but it
> > does fall back to using dma_alloc_coherent on other platforms.
> 
> This works surprisingly well on non-x86 architectures as well. We just don't
> necessarily update the kernel mappings everywhere, which limits kmap usage.
> 
> In other words radeon and nouveau still work on PowerPC AGP systems as far
> as I know for example.

The thing is, on most CPUs you get away with just pgprot set to wc, and on
many others it's only an issue while there's still some cpu dirt hanging
around because they don't prefetch badly enough. It's very few where it's a
persistent problem.

Really the only reason I've even caught this was because some of the
i915+vgem buffer sharing tests we have are very nasty and intentionally
try to provoke the worst case :-)

Anyway, since you're looking, can you please review this and the previous
patch for shmem helpers?

The first one to make VM_PFNMAP standard for all dma-buf isn't ready yet,
because I need to audit all the drivers still. And at least i915 dma-buf
mmap is still using gup-able memory too. So more work to do here.
-Daniel

> 
> Christian.
> 
> > -Daniel
> > 
> > > Best regards
> > > Thomas
> > > 
> > > > +
> > > >    	shmem->pages = pages;
> > > >    	return 0;
> > > > @@ -203,6 +212,11 @@ static void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)
> > > >    	if (--shmem->pages_use_count > 0)
> > > >    		return;
> > > > +#ifdef CONFIG_X86
> > > > +	if (shmem->map_wc)
> > > > +		set_pages_array_wb(shmem->pages, obj->size >> PAGE_SHIFT);
> > > > +#endif
> > > > +
> > > >    	drm_gem_put_pages(obj, shmem->pages,
> > > >    			  shmem->pages_mark_dirty_on_put,
> > > >    			  shmem->pages_mark_accessed_on_put);
> > > > 
> > > -- 
> > > Thomas Zimmermann
> > > Graphics Driver Developer
> > > SUSE Software Solutions Germany GmbH
> > > Maxfeldstr. 5, 90409 Nürnberg, Germany
> > > (HRB 36809, AG Nürnberg)
> > > Geschäftsführer: Felix Imendörffer
> > > 
> > 
> > 
> > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


