linux-kernel.vger.kernel.org archive mirror
* [PATCH] drm/i915: check to see if SIMD registers are available before using SIMD
@ 2020-04-30 22:10 Jason A. Donenfeld
  2020-05-01 10:42 ` Sebastian Andrzej Siewior
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Jason A. Donenfeld @ 2020-04-30 22:10 UTC (permalink / raw)
  To: linux-kernel, intel-gfx, dri-devel, bigeasy, tglx, chris
  Cc: Jason A. Donenfeld, stable

Sometimes it's not okay to use SIMD registers, the conditions for which
have changed subtly from kernel release to kernel release. Usually the
pattern is to check for may_use_simd() and then fallback to using
something slower in the unlikely case SIMD registers aren't available.
So, this patch fixes up i915's accelerated memcpy routines to fallback
to boring memcpy if may_use_simd() is false.

Cc: stable@vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
 drivers/gpu/drm/i915/i915_memcpy.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_memcpy.c b/drivers/gpu/drm/i915/i915_memcpy.c
index fdd550405fd3..7c0e022586bc 100644
--- a/drivers/gpu/drm/i915/i915_memcpy.c
+++ b/drivers/gpu/drm/i915/i915_memcpy.c
@@ -24,6 +24,7 @@
 
 #include <linux/kernel.h>
 #include <asm/fpu/api.h>
+#include <asm/simd.h>
 
 #include "i915_memcpy.h"
 
@@ -38,6 +39,12 @@ static DEFINE_STATIC_KEY_FALSE(has_movntdqa);
 #ifdef CONFIG_AS_MOVNTDQA
 static void __memcpy_ntdqa(void *dst, const void *src, unsigned long len)
 {
+	if (unlikely(!may_use_simd())) {
+		memcpy(dst, src, len);
+		return;
+	}
+
+
 	kernel_fpu_begin();
 
 	while (len >= 4) {
@@ -67,6 +74,11 @@ static void __memcpy_ntdqa(void *dst, const void *src, unsigned long len)
 
 static void __memcpy_ntdqu(void *dst, const void *src, unsigned long len)
 {
+	if (unlikely(!may_use_simd())) {
+		memcpy(dst, src, len);
+		return;
+	}
+
 	kernel_fpu_begin();
 
 	while (len >= 4) {
-- 
2.26.2
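
For readers unfamiliar with the pattern, the check-then-fall-back structure the patch adds can be sketched in plain userspace C. This is only an illustration: `may_use_simd_stub()`, `simd_copy_stub()` and `copy_with_fallback()` are hypothetical stand-ins, since the real `may_use_simd()` and `kernel_fpu_begin()`/`kernel_fpu_end()` only exist inside the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's may_use_simd(); toggled by hand. */
static bool simd_allowed = true;
static bool may_use_simd_stub(void) { return simd_allowed; }

/* Counts how often the "accelerated" path ran, for demonstration only. */
static unsigned int simd_copies;

/*
 * Hypothetical stand-in for the SIMD loop. In the kernel this body would
 * sit between kernel_fpu_begin() and kernel_fpu_end().
 */
static void simd_copy_stub(void *dst, const void *src, size_t len)
{
	simd_copies++;
	memcpy(dst, src, len);
}

/* Same shape as the patched functions: bail out to memcpy() early. */
static void copy_with_fallback(void *dst, const void *src, size_t len)
{
	assert(dst && src);

	if (!may_use_simd_stub()) {
		memcpy(dst, src, len);
		return;
	}

	simd_copy_stub(dst, src, len);
}
```

Both branches produce the same bytes in `dst`; only the path taken differs.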


* Re: [PATCH] drm/i915: check to see if SIMD registers are available before using SIMD
  2020-04-30 22:10 [PATCH] drm/i915: check to see if SIMD registers are available before using SIMD Jason A. Donenfeld
@ 2020-05-01 10:42 ` Sebastian Andrzej Siewior
  2020-05-01 11:34   ` David Laight
  2020-05-01 21:54   ` Jason A. Donenfeld
  2020-05-01 18:07 ` Christoph Hellwig
  2020-05-03 20:30 ` Chris Wilson
  2 siblings, 2 replies; 11+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-05-01 10:42 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: linux-kernel, intel-gfx, dri-devel, tglx, chris, stable

On 2020-04-30 16:10:16 [-0600], Jason A. Donenfeld wrote:
> Sometimes it's not okay to use SIMD registers, the conditions for which
> have changed subtly from kernel release to kernel release. Usually the
> pattern is to check for may_use_simd() and then fallback to using
> something slower in the unlikely case SIMD registers aren't available.
> So, this patch fixes up i915's accelerated memcpy routines to fallback
> to boring memcpy if may_use_simd() is false.

That would indicate that these functions are used from IRQ/softirq context,
which would otherwise break if the kernel is also using the registers. The crypto
code uses it for that purpose.

So
   Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

May I ask how large the memcpy can be? I'm asking in case it is large
and an explicit rescheduling point might be needed.

> Cc: stable@vger.kernel.org
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

Sebastian

* RE: [PATCH] drm/i915: check to see if SIMD registers are available before using SIMD
  2020-05-01 10:42 ` Sebastian Andrzej Siewior
@ 2020-05-01 11:34   ` David Laight
  2020-05-01 21:54   ` Jason A. Donenfeld
  1 sibling, 0 replies; 11+ messages in thread
From: David Laight @ 2020-05-01 11:34 UTC (permalink / raw)
  To: 'Sebastian Andrzej Siewior', Jason A. Donenfeld
  Cc: linux-kernel, intel-gfx, dri-devel, tglx, chris, stable

From: Sebastian Andrzej Siewior
> Sent: 01 May 2020 11:42
> On 2020-04-30 16:10:16 [-0600], Jason A. Donenfeld wrote:
> > Sometimes it's not okay to use SIMD registers, the conditions for which
> > have changed subtly from kernel release to kernel release. Usually the
> > pattern is to check for may_use_simd() and then fallback to using
> > something slower in the unlikely case SIMD registers aren't available.
> > So, this patch fixes up i915's accelerated memcpy routines to fallback
> > to boring memcpy if may_use_simd() is false.
> 
> That would indicate that these functions are used from IRQ/softirq context,
> which would otherwise break if the kernel is also using the registers. The crypto
> code uses it for that purpose.
> 
> So
>    Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> 
> May I ask how large the memcpy can be? I'm asking in case it is large
> and an explicit rescheduling point might be needed.

It is also quite likely that a 'rep movs' copy will be at least just as
fast on modern hardware.

Clearly if you are copying to/from PCIe memory you need the largest
registers possible - but I think the graphics buffers are mapped cached?
(Otherwise I wouldn't see 3ms 'spins' while it invalidates the
entire screen buffer cache.)

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

* Re: [PATCH] drm/i915: check to see if SIMD registers are available before using SIMD
  2020-04-30 22:10 [PATCH] drm/i915: check to see if SIMD registers are available before using SIMD Jason A. Donenfeld
  2020-05-01 10:42 ` Sebastian Andrzej Siewior
@ 2020-05-01 18:07 ` Christoph Hellwig
  2020-05-01 21:55   ` Jason A. Donenfeld
  2020-05-03 20:20   ` Chris Wilson
  2020-05-03 20:30 ` Chris Wilson
  2 siblings, 2 replies; 11+ messages in thread
From: Christoph Hellwig @ 2020-05-01 18:07 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: linux-kernel, intel-gfx, dri-devel, bigeasy, tglx, chris, stable

On Thu, Apr 30, 2020 at 04:10:16PM -0600, Jason A. Donenfeld wrote:
> Sometimes it's not okay to use SIMD registers, the conditions for which
> have changed subtly from kernel release to kernel release. Usually the
> pattern is to check for may_use_simd() and then fallback to using
> something slower in the unlikely case SIMD registers aren't available.
> So, this patch fixes up i915's accelerated memcpy routines to fallback
> to boring memcpy if may_use_simd() is false.

Err, why does i915 implement its own uncached memcpy instead of relying
on core functionality to start with?

* Re: [PATCH] drm/i915: check to see if SIMD registers are available before using SIMD
  2020-05-01 10:42 ` Sebastian Andrzej Siewior
  2020-05-01 11:34   ` David Laight
@ 2020-05-01 21:54   ` Jason A. Donenfeld
  1 sibling, 0 replies; 11+ messages in thread
From: Jason A. Donenfeld @ 2020-05-01 21:54 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: LKML, intel-gfx, dri-devel, Thomas Gleixner, Chris Wilson, stable

On Fri, May 1, 2020 at 4:42 AM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>    Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Thanks.

>
> May I ask how large the memcpy can be? I'm asking in case it is large
> and an explicit rescheduling point might be needed.

Yea I was worried about that too. I'm not an i915 developer, but so
far as I can tell:

- The path from intel_engine_cmd_parser is <= 256 KiB for "known
users", so that's rather large.
- The path from perf_memcpy is either 4k, 64k, or 4M, depending on the
type of object, so that seems gigantic, but I think that might be
selftest code.
- The path from compress_page appears to be PAGE_SIZE, so 4k, which
meshes with the limits we agreed on a few weeks ago for the crypto
stuff.
- The path from guc_read_update_log_buffer appears to be 8k, 32k, 2M,
or 8M, depending on the type of object, so that seems absurdly huge
and doesn't appear to be selftest code either like the other case.

I have no doubt the i915 developers will jump in here waving their
arms, but either way, it sure seems to me like you might have a point.
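
One common way to address the rescheduling concern is to bound how much is copied per FPU section and offer a rescheduling point between sections. A minimal userspace sketch, assuming a 4 KiB chunk limit (an assumption for illustration, not a value taken from i915) and hypothetical stubs `fpu_begin_stub()`, `fpu_end_stub()` and `resched_point_stub()` standing in for `kernel_fpu_begin()`, `kernel_fpu_end()` and something like `cond_resched()` where the calling context allows it:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_BYTES 4096u /* assumed per-section limit, for illustration */

static unsigned int fpu_depth;      /* 1 while inside a simulated FPU section */
static unsigned int resched_points; /* times we offered to reschedule */

/* Hypothetical stubs; the asserts check that sections never nest or leak. */
static void fpu_begin_stub(void) { assert(fpu_depth == 0); fpu_depth = 1; }
static void fpu_end_stub(void) { assert(fpu_depth == 1); fpu_depth = 0; }
static void resched_point_stub(void) { assert(fpu_depth == 0); resched_points++; }

/* Copy len bytes in CHUNK_BYTES pieces, leaving the FPU section in between. */
static void chunked_copy(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (len) {
		size_t n = len < CHUNK_BYTES ? len : CHUNK_BYTES;

		fpu_begin_stub();
		memcpy(d, s, n); /* the SIMD loop would go here */
		fpu_end_stub();

		d += n;
		s += n;
		len -= n;

		/* Only yield if there is more work left to do. */
		if (len)
			resched_point_stub();
	}
}
```

With this shape, a multi-megabyte copy never holds the FPU for more than one chunk at a time.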

* Re: [PATCH] drm/i915: check to see if SIMD registers are available before using SIMD
  2020-05-01 18:07 ` Christoph Hellwig
@ 2020-05-01 21:55   ` Jason A. Donenfeld
  2020-05-03 20:20   ` Chris Wilson
  1 sibling, 0 replies; 11+ messages in thread
From: Jason A. Donenfeld @ 2020-05-01 21:55 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: LKML, intel-gfx, dri-devel, Sebastian Siewior, Thomas Gleixner,
	Chris Wilson, stable

On Fri, May 1, 2020 at 12:07 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Thu, Apr 30, 2020 at 04:10:16PM -0600, Jason A. Donenfeld wrote:
> > Sometimes it's not okay to use SIMD registers, the conditions for which
> > have changed subtly from kernel release to kernel release. Usually the
> > pattern is to check for may_use_simd() and then fallback to using
> > something slower in the unlikely case SIMD registers aren't available.
> > So, this patch fixes up i915's accelerated memcpy routines to fallback
> > to boring memcpy if may_use_simd() is false.
>
> Err, why does i915 implement its own uncached memcpy instead of relying
> on core functionality to start with?

I was wondering the same. It sure does seem like this ought to be more
generalized functionality, with a name that represents the type of
transfer it's optimized for (wc or similar).

* Re: [PATCH] drm/i915: check to see if SIMD registers are available before using SIMD
  2020-05-01 18:07 ` Christoph Hellwig
  2020-05-01 21:55   ` Jason A. Donenfeld
@ 2020-05-03 20:20   ` Chris Wilson
  2020-05-04 16:03     ` Christoph Hellwig
  1 sibling, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2020-05-03 20:20 UTC (permalink / raw)
  To: Jason A. Donenfeld, Christoph Hellwig
  Cc: linux-kernel, intel-gfx, dri-devel, bigeasy, tglx, stable

Quoting Christoph Hellwig (2020-05-01 19:07:31)
> On Thu, Apr 30, 2020 at 04:10:16PM -0600, Jason A. Donenfeld wrote:
> > Sometimes it's not okay to use SIMD registers, the conditions for which
> > have changed subtly from kernel release to kernel release. Usually the
> > pattern is to check for may_use_simd() and then fallback to using
> > something slower in the unlikely case SIMD registers aren't available.
> > So, this patch fixes up i915's accelerated memcpy routines to fallback
> > to boring memcpy if may_use_simd() is false.
> 
> Err, why does i915 implement its own uncached memcpy instead of relying
> on core functionality to start with?

What is this core functionality that provides movntdqa?
-Chris

* Re: [PATCH] drm/i915: check to see if SIMD registers are available before using SIMD
  2020-04-30 22:10 [PATCH] drm/i915: check to see if SIMD registers are available before using SIMD Jason A. Donenfeld
  2020-05-01 10:42 ` Sebastian Andrzej Siewior
  2020-05-01 18:07 ` Christoph Hellwig
@ 2020-05-03 20:30 ` Chris Wilson
  2020-05-03 20:35   ` Jason A. Donenfeld
  2 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2020-05-03 20:30 UTC (permalink / raw)
  To: Jason A. Donenfeld, bigeasy, dri-devel, intel-gfx, linux-kernel, tglx
  Cc: Jason A. Donenfeld, stable

Quoting Jason A. Donenfeld (2020-04-30 23:10:16)
> Sometimes it's not okay to use SIMD registers, the conditions for which
> have changed subtly from kernel release to kernel release. Usually the
> pattern is to check for may_use_simd() and then fallback to using
> something slower in the unlikely case SIMD registers aren't available.
> So, this patch fixes up i915's accelerated memcpy routines to fallback
> to boring memcpy if may_use_simd() is false.
> 
> Cc: stable@vger.kernel.org

The same argument as on the previous submission is that we return to the
caller if we can't use movntdqa as their fallback path should be faster
than uncached memcpy.
-Chris
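
The alternative described here, reporting failure so that the caller takes its own fallback path rather than silently doing a slow memcpy, might look roughly like the following userspace sketch. `try_accel_copy()`, `caller_copy()` and `may_use_simd_stub()` are hypothetical names chosen for illustration and are not the i915 functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's may_use_simd(). */
static bool simd_allowed = true;
static bool may_use_simd_stub(void) { return simd_allowed; }

/* Returns false, leaving dst untouched, when the fast path is unavailable. */
static bool try_accel_copy(void *dst, const void *src, size_t len)
{
	assert(dst && src);

	if (!may_use_simd_stub())
		return false;

	memcpy(dst, src, len); /* stand-in for the movntdqa loop */
	return true;
}

enum copy_path { PATH_ACCEL, PATH_CALLER_FALLBACK };

static enum copy_path caller_copy(void *dst, const void *src, size_t len)
{
	if (try_accel_copy(dst, src, len))
		return PATH_ACCEL;

	/*
	 * The caller knows how its buffers are mapped and can pick whatever
	 * it considers faster than an uncached memcpy. A plain memcpy() is
	 * only a placeholder for that choice here.
	 */
	memcpy(dst, src, len);
	return PATH_CALLER_FALLBACK;
}
```

The design choice is that the copy helper stays honest about what it did, and the policy for the slow case lives with the caller.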

* Re: [PATCH] drm/i915: check to see if SIMD registers are available before using SIMD
  2020-05-03 20:30 ` Chris Wilson
@ 2020-05-03 20:35   ` Jason A. Donenfeld
  0 siblings, 0 replies; 11+ messages in thread
From: Jason A. Donenfeld @ 2020-05-03 20:35 UTC (permalink / raw)
  To: Chris Wilson
  Cc: Sebastian Siewior, dri-devel, intel-gfx, LKML, Thomas Gleixner, stable

On Sun, May 3, 2020 at 2:30 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Quoting Jason A. Donenfeld (2020-04-30 23:10:16)
> > Sometimes it's not okay to use SIMD registers, the conditions for which
> > have changed subtly from kernel release to kernel release. Usually the
> > pattern is to check for may_use_simd() and then fallback to using
> > something slower in the unlikely case SIMD registers aren't available.
> > So, this patch fixes up i915's accelerated memcpy routines to fallback
> > to boring memcpy if may_use_simd() is false.
> >
> > Cc: stable@vger.kernel.org
>
> The same argument as on the previous submission is that we return to the
> caller if we can't use movntdqa as their fallback path should be faster
> than uncached memcpy.

Oh, THAT's what you meant before. Okay, will follow up.

* Re: [PATCH] drm/i915: check to see if SIMD registers are available before using SIMD
  2020-05-03 20:20   ` Chris Wilson
@ 2020-05-04 16:03     ` Christoph Hellwig
  2020-05-04 16:15       ` David Laight
  0 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2020-05-04 16:03 UTC (permalink / raw)
  To: Chris Wilson
  Cc: Jason A. Donenfeld, Christoph Hellwig, linux-kernel, intel-gfx,
	dri-devel, bigeasy, tglx, stable

On Sun, May 03, 2020 at 09:20:19PM +0100, Chris Wilson wrote:
> > Err, why does i915 implement its own uncached memcpy instead of relying
> > on core functionality to start with?
> 
> What is this core functionality that provides movntdqa?

A sensible name might be memcpy_uncached or memcpy_nontemporal.
But the important point is that this should be arch code with a common
fallback rather than hacking it up in drivers.
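
As a rough illustration of "one entry point, arch-specific fast path, common fallback", here is a userspace sketch. It is an assumption-laden stand-in, not kernel code: the name `memcpy_nontemporal_sketch()` is hypothetical, and the fast path uses SSE2 non-temporal stores (`_mm_stream_si128`) rather than the non-temporal loads (movntdqa) that i915 relies on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: streaming copy where available, memcpy() otherwise. */
static void *memcpy_nontemporal_sketch(void *dst, const void *src, size_t len)
{
	assert(len == 0 || (dst && src));

#if defined(__SSE2__)
	unsigned char *d = dst;
	const unsigned char *s = src;

	/* Streaming stores need a 16-byte aligned destination. */
	size_t head = (16 - ((uintptr_t)d & 15)) & 15;

	if (head > len)
		head = len;
	memcpy(d, s, head);
	d += head;
	s += head;
	len -= head;

	while (len >= 16) {
		__m128i v = _mm_loadu_si128((const __m128i *)s);

		_mm_stream_si128((__m128i *)d, v); /* bypasses the cache */
		d += 16;
		s += 16;
		len -= 16;
	}
	_mm_sfence(); /* order the streaming stores before later accesses */

	memcpy(d, s, len); /* remaining tail, fewer than 16 bytes */
	return dst;
#else
	/* Common fallback for builds without the fast path. */
	return memcpy(dst, src, len);
#endif
}
```

Callers see a single function either way, which is the point being argued for above.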

* RE: [PATCH] drm/i915: check to see if SIMD registers are available before using SIMD
  2020-05-04 16:03     ` Christoph Hellwig
@ 2020-05-04 16:15       ` David Laight
  0 siblings, 0 replies; 11+ messages in thread
From: David Laight @ 2020-05-04 16:15 UTC (permalink / raw)
  To: 'Christoph Hellwig', Chris Wilson
  Cc: Jason A. Donenfeld, linux-kernel, intel-gfx, dri-devel, bigeasy,
	tglx, stable

From: Christoph Hellwig
> Sent: 04 May 2020 17:03
> 
> On Sun, May 03, 2020 at 09:20:19PM +0100, Chris Wilson wrote:
> > > > Err, why does i915 implement its own uncached memcpy instead of relying
> > > on core functionality to start with?
> >
> > What is this core functionality that provides movntdqa?
> 
> A sensible name might be memcpy_uncached or memcpy_nontemporal.
> But the important point is that this should be arch code with a common
> fallback rather than hacking it up in drivers.

More to the point, you are trying to do a copy where:
1) The kernel isn't expected to read the data - so can bypass the cache.
and maybe:
2) The data needs flushing from the cache to actual memory.
and maybe:
3) The cache lines need invalidating.

The fallbacks depend on the required behaviour.

	David


