dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] iosys-map: Add per-word read
@ 2022-06-17  8:52 Lucas De Marchi
  2022-06-17  8:52 ` [PATCH 2/2] iosys-map: Add per-word write Lucas De Marchi
  2022-06-17 15:56 ` [Intel-gfx] [PATCH 1/2] iosys-map: Add per-word read Lucas De Marchi
  0 siblings, 2 replies; 5+ messages in thread
From: Lucas De Marchi @ 2022-06-17  8:52 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: Daniel Vetter, Lucas De Marchi, christian.koenig, tzimmermann

Instead of always falling back to memcpy_fromio() for any size, prefer
using read{b,w,l}(). When reading struct members it's common to read
individual integer variables individually. Going through memcpy_fromio()
for each of them poses a high penalty.

Employ a similar trick as __seqprop() by using _Generic() to generate
only the specific call based on a type-compatible variable.

For a pariticular i915 workload producing GPU context switches,
__get_engine_usage_record() is particularly hot since the engine usage
is read from device local memory with dgfx, possibly multiple times
since it's racy. Test execution time for this test shows a ~12.5%
improvement with DG2:

Before:
	nrepeats = 1000; min = 7.63243e+06; max = 1.01817e+07;
	median = 9.52548e+06; var = 526149;
After:
	nrepeats = 1000; min = 7.03402e+06; max = 8.8832e+06;
	median = 8.33955e+06; var = 333113;

Other things attempted that didn't prove very useful:
1) Change the _Generic() on x86 to just dereference the memory address
2) Change __get_engine_usage_record() to do just 1 read per loop,
   comparing with the previous value read
3) Change __get_engine_usage_record() to access the fields directly as it
   was before the conversion to iosys-map

(3) did gave a small improvement (~3%), but doesn't seem to scale well
to other similar cases in the driver.

Additional test by Chris Wilson using gem_create from igt with some
changes to track object creation time. This happens to accidentally
stress this code path:

	Pre iosys_map conversion of engine busyness:
	lmem0: Creating    262144 4KiB objects took 59274.2ms

	Unpatched:
	lmem0: Creating    262144 4KiB objects took 108830.2ms

	With readl (this patch):
	lmem0: Creating    262144 4KiB objects took 61348.6ms

	s/readl/READ_ONCE/
	lmem0: Creating    262144 4KiB objects took 61333.2ms

So we do take a little bit more time than before the conversion, but
that is due to other factors: bringing the READ_ONCE back would be as
good as just doing this conversion.

v2:
- Remove default from _Generic() - callers wanting to read more
  than u64 should use iosys_map_memcpy_from()
- Add READ_ONCE() cases dereferencing the pointer when using system
  memory

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com> # v1
---
 include/linux/iosys-map.h | 45 +++++++++++++++++++++++++++++++--------
 1 file changed, 36 insertions(+), 9 deletions(-)

diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h
index 4b8406ee8bc4..f59dd00ed202 100644
--- a/include/linux/iosys-map.h
+++ b/include/linux/iosys-map.h
@@ -6,6 +6,7 @@
 #ifndef __IOSYS_MAP_H__
 #define __IOSYS_MAP_H__
 
+#include <linux/compiler_types.h>
 #include <linux/io.h>
 #include <linux/string.h>
 
@@ -333,6 +334,26 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
 		memset(dst->vaddr + offset, value, len);
 }
 
+#ifdef CONFIG_64BIT
+#define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_)				\
+	u64: val_ = readq(vaddr_iomem_)
+#else
+#define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_)				\
+	u64: memcpy_fromio(&(val_), vaddr_iomem__, sizeof(u64))
+#endif
+
+#define __iosys_map_rd_io(val__, vaddr_iomem__, type__) _Generic(val__,		\
+	u8: val__ = readb(vaddr_iomem__),					\
+	u16: val__ = readw(vaddr_iomem__),					\
+	u32: val__ = readl(vaddr_iomem__),					\
+	__iosys_map_rd_io_u64_case(val__, vaddr_iomem__))
+
+#define __iosys_map_rd_sys(val__, vaddr__, type__) ({				\
+	compiletime_assert(sizeof(type__) <= sizeof(u64),			\
+			   "Unsupported access size for __iosys_map_rd_sys()");	\
+	val__ = READ_ONCE(*((type__ *)vaddr__));				\
+})
+
 /**
  * iosys_map_rd - Read a C-type value from the iosys_map
  *
@@ -340,16 +361,21 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
  * @offset__:	The offset from which to read
  * @type__:	Type of the value being read
  *
- * Read a C type value from iosys_map, handling possible un-aligned accesses to
- * the mapping.
+ * Read a C type value (u8, u16, u32 and u64) from iosys_map. For other types or
+ * if pointer may be unaligned (and problematic for the architecture supported),
+ * use iosys_map_memcpy_from().
  *
  * Returns:
  * The value read from the mapping.
  */
-#define iosys_map_rd(map__, offset__, type__) ({			\
-	type__ val;							\
-	iosys_map_memcpy_from(&val, map__, offset__, sizeof(val));	\
-	val;								\
+#define iosys_map_rd(map__, offset__, type__) ({				\
+	type__ val;								\
+	if ((map__)->is_iomem) {						\
+		__iosys_map_rd_io(val, (map__)->vaddr_iomem + (offset__), type__);\
+	} else {								\
+		__iosys_map_rd_sys(val, (map__)->vaddr + (offset__), type__);	\
+	}									\
+	val;									\
 })
 
 /**
@@ -379,9 +405,10 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
  *
  * Read a value from iosys_map considering its layout is described by a C struct
  * starting at @struct_offset__. The field offset and size is calculated and its
- * value read handling possible un-aligned memory accesses. For example: suppose
- * there is a @struct foo defined as below and the value ``foo.field2.inner2``
- * needs to be read from the iosys_map:
+ * value read. If the field access would incur in un-aligned access, then either
+ * iosys_map_memcpy_from() needs to be used or the architecture must support it.
+ * For example: suppose there is a @struct foo defined as below and the value
+ * ``foo.field2.inner2`` needs to be read from the iosys_map:
  *
  * .. code-block:: c
  *
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/2] iosys-map: Add per-word write
  2022-06-17  8:52 [PATCH 1/2] iosys-map: Add per-word read Lucas De Marchi
@ 2022-06-17  8:52 ` Lucas De Marchi
  2022-06-26 21:01   ` [Intel-gfx] " Lucas De Marchi
  2022-06-17 15:56 ` [Intel-gfx] [PATCH 1/2] iosys-map: Add per-word read Lucas De Marchi
  1 sibling, 1 reply; 5+ messages in thread
From: Lucas De Marchi @ 2022-06-17  8:52 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: Daniel Vetter, Lucas De Marchi, christian.koenig, tzimmermann

Like was done for read, provide the equivalent for write. Even if
current users are not in the hot path, this should future-proof it.

v2:
  - Remove default from _Generic() - callers wanting to write more
    than u64 should use iosys_map_memcpy_to()
  - Add WRITE_ONCE() cases dereferencing the pointer when using system
    memory

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Reviewed-by: Christian König <christian.koenig@amd.com> # v1
---
 include/linux/iosys-map.h | 42 ++++++++++++++++++++++++++++++---------
 1 file changed, 33 insertions(+), 9 deletions(-)

diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h
index f59dd00ed202..580e14cd360c 100644
--- a/include/linux/iosys-map.h
+++ b/include/linux/iosys-map.h
@@ -337,9 +337,13 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
 #ifdef CONFIG_64BIT
 #define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_)				\
 	u64: val_ = readq(vaddr_iomem_)
+#define __iosys_map_wr_io_u64_case(val_, vaddr_iomem_)			\
+	u64: writeq(val_, vaddr_iomem_)
 #else
 #define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_)				\
 	u64: memcpy_fromio(&(val_), vaddr_iomem__, sizeof(u64))
+#define __iosys_map_wr_io_u64_case(val_, vaddr_iomem_)			\
+	u64: memcpy_toio(vaddr_iomem_, &(val_), sizeof(u64))
 #endif
 
 #define __iosys_map_rd_io(val__, vaddr_iomem__, type__) _Generic(val__,		\
@@ -354,6 +358,19 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
 	val__ = READ_ONCE(*((type__ *)vaddr__));				\
 })
 
+#define __iosys_map_wr_io(val__, vaddr_iomem__, type__) _Generic(val__,		\
+	u8: writeb(val__, vaddr_iomem__),					\
+	u16: writew(val__, vaddr_iomem__),					\
+	u32: writel(val__, vaddr_iomem__),					\
+	__iosys_map_wr_io_u64_case(val__, vaddr_iomem__))
+
+#define __iosys_map_wr_sys(val__, vaddr__, type__) ({				\
+	compiletime_assert(sizeof(type__) <= sizeof(u64),			\
+			   "Unsupported access size for __iosys_map_wr_sys()"); \
+	WRITE_ONCE(*((type__ *)vaddr__), val__);				\
+})
+
+
 /**
  * iosys_map_rd - Read a C-type value from the iosys_map
  *
@@ -386,12 +403,17 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
  * @type__:	Type of the value being written
  * @val__:	Value to write
  *
- * Write a C-type value to the iosys_map, handling possible un-aligned accesses
- * to the mapping.
+ * Write a C type value (u8, u16, u32 and u64) to the iosys_map. For other types
+ * or if pointer may be unaligned (and problematic for the architecture
+ * supported), use iosys_map_memcpy_to()
  */
-#define iosys_map_wr(map__, offset__, type__, val__) ({			\
-	type__ val = (val__);						\
-	iosys_map_memcpy_to(map__, offset__, &val, sizeof(val));	\
+#define iosys_map_wr(map__, offset__, type__, val__) ({				\
+	type__ val = (val__);							\
+	if ((map__)->is_iomem) {						\
+		__iosys_map_wr_io(val, (map__)->vaddr_iomem + (offset__), type__);\
+	} else {								\
+		__iosys_map_wr_sys(val, (map__)->vaddr + (offset__), type__);	\
+	}									\
 })
 
 /**
@@ -472,10 +494,12 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
  * @field__:		Member of the struct to read
  * @val__:		Value to write
  *
- * Write a value to the iosys_map considering its layout is described by a C struct
- * starting at @struct_offset__. The field offset and size is calculated and the
- * @val__ is written handling possible un-aligned memory accesses. Refer to
- * iosys_map_rd_field() for expected usage and memory layout.
+ * Write a value to the iosys_map considering its layout is described by a C
+ * struct starting at @struct_offset__. The field offset and size is calculated
+ * and the @val__ is written. If the field access would incur in un-aligned
+ * access, then either iosys_map_memcpy_to() needs to be used or the
+ * architecture must support it. Refer to iosys_map_rd_field() for expected
+ * usage and memory layout.
  */
 #define iosys_map_wr_field(map__, struct_offset__, struct_type__, field__, val__) ({	\
 	struct_type__ *s;								\
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [Intel-gfx] [PATCH 1/2] iosys-map: Add per-word read
  2022-06-17  8:52 [PATCH 1/2] iosys-map: Add per-word read Lucas De Marchi
  2022-06-17  8:52 ` [PATCH 2/2] iosys-map: Add per-word write Lucas De Marchi
@ 2022-06-17 15:56 ` Lucas De Marchi
  1 sibling, 0 replies; 5+ messages in thread
From: Lucas De Marchi @ 2022-06-17 15:56 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Daniel Vetter, christian.koenig, tzimmermann

On Fri, Jun 17, 2022 at 01:52:03AM -0700, Lucas De Marchi wrote:
>Instead of always falling back to memcpy_fromio() for any size, prefer
>using read{b,w,l}(). When reading struct members it's common to read
>individual integer variables individually. Going through memcpy_fromio()
>for each of them poses a high penalty.
>
>Employ a similar trick as __seqprop() by using _Generic() to generate
>only the specific call based on a type-compatible variable.
>
>For a pariticular i915 workload producing GPU context switches,
>__get_engine_usage_record() is particularly hot since the engine usage
>is read from device local memory with dgfx, possibly multiple times
>since it's racy. Test execution time for this test shows a ~12.5%
>improvement with DG2:
>
>Before:
>	nrepeats = 1000; min = 7.63243e+06; max = 1.01817e+07;
>	median = 9.52548e+06; var = 526149;
>After:
>	nrepeats = 1000; min = 7.03402e+06; max = 8.8832e+06;
>	median = 8.33955e+06; var = 333113;
>
>Other things attempted that didn't prove very useful:
>1) Change the _Generic() on x86 to just dereference the memory address
>2) Change __get_engine_usage_record() to do just 1 read per loop,
>   comparing with the previous value read
>3) Change __get_engine_usage_record() to access the fields directly as it
>   was before the conversion to iosys-map
>
>(3) did gave a small improvement (~3%), but doesn't seem to scale well
>to other similar cases in the driver.
>
>Additional test by Chris Wilson using gem_create from igt with some
>changes to track object creation time. This happens to accidentally
>stress this code path:
>
>	Pre iosys_map conversion of engine busyness:
>	lmem0: Creating    262144 4KiB objects took 59274.2ms
>
>	Unpatched:
>	lmem0: Creating    262144 4KiB objects took 108830.2ms
>
>	With readl (this patch):
>	lmem0: Creating    262144 4KiB objects took 61348.6ms
>
>	s/readl/READ_ONCE/
>	lmem0: Creating    262144 4KiB objects took 61333.2ms
>
>So we do take a little bit more time than before the conversion, but
>that is due to other factors: bringing the READ_ONCE back would be as
>good as just doing this conversion.
>
>v2:
>- Remove default from _Generic() - callers wanting to read more
>  than u64 should use iosys_map_memcpy_from()
>- Add READ_ONCE() cases dereferencing the pointer when using system
>  memory
>
>Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
>Reviewed-by: Christian König <christian.koenig@amd.com> # v1
>---
> include/linux/iosys-map.h | 45 +++++++++++++++++++++++++++++++--------
> 1 file changed, 36 insertions(+), 9 deletions(-)
>
>diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h
>index 4b8406ee8bc4..f59dd00ed202 100644
>--- a/include/linux/iosys-map.h
>+++ b/include/linux/iosys-map.h
>@@ -6,6 +6,7 @@
> #ifndef __IOSYS_MAP_H__
> #define __IOSYS_MAP_H__
>
>+#include <linux/compiler_types.h>
> #include <linux/io.h>
> #include <linux/string.h>
>
>@@ -333,6 +334,26 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
> 		memset(dst->vaddr + offset, value, len);
> }
>
>+#ifdef CONFIG_64BIT
>+#define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_)				\
>+	u64: val_ = readq(vaddr_iomem_)
>+#else
>+#define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_)				\
>+	u64: memcpy_fromio(&(val_), vaddr_iomem__, sizeof(u64))

I tested io/sys and forgot again to test it for 32-bit :(. This
should fix the build for 32-bits:

diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h
index 580e14cd360c..f8bc052f8975 100644
--- a/include/linux/iosys-map.h
+++ b/include/linux/iosys-map.h
@@ -341,7 +341,7 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
  	u64: writeq(val_, vaddr_iomem_)
  #else
  #define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_)				\
-	u64: memcpy_fromio(&(val_), vaddr_iomem__, sizeof(u64))
+	u64: memcpy_fromio(&(val_), vaddr_iomem_, sizeof(u64))
  #define __iosys_map_wr_io_u64_case(val_, vaddr_iomem_)			\
  	u64: memcpy_toio(vaddr_iomem_, &(val_), sizeof(u64))
  #endif

Lucas De Marchi

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [Intel-gfx] [PATCH 2/2] iosys-map: Add per-word write
  2022-06-17  8:52 ` [PATCH 2/2] iosys-map: Add per-word write Lucas De Marchi
@ 2022-06-26 21:01   ` Lucas De Marchi
  2022-06-27  7:03     ` Thomas Zimmermann
  0 siblings, 1 reply; 5+ messages in thread
From: Lucas De Marchi @ 2022-06-26 21:01 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Daniel Vetter, christian.koenig, tzimmermann

On Fri, Jun 17, 2022 at 01:52:04AM -0700, Lucas De Marchi wrote:
>Like was done for read, provide the equivalent for write. Even if
>current users are not in the hot path, this should future-proof it.
>
>v2:
>  - Remove default from _Generic() - callers wanting to write more
>    than u64 should use iosys_map_memcpy_to()
>  - Add WRITE_ONCE() cases dereferencing the pointer when using system
>    memory

Thomas, do you have any additional concern on this patch regarding your
previous review?

thanks
Lucas De Marchi

>
>Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
>Reviewed-by: Reviewed-by: Christian König <christian.koenig@amd.com> # v1
>---
> include/linux/iosys-map.h | 42 ++++++++++++++++++++++++++++++---------
> 1 file changed, 33 insertions(+), 9 deletions(-)
>
>diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h
>index f59dd00ed202..580e14cd360c 100644
>--- a/include/linux/iosys-map.h
>+++ b/include/linux/iosys-map.h
>@@ -337,9 +337,13 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
> #ifdef CONFIG_64BIT
> #define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_)				\
> 	u64: val_ = readq(vaddr_iomem_)
>+#define __iosys_map_wr_io_u64_case(val_, vaddr_iomem_)			\
>+	u64: writeq(val_, vaddr_iomem_)
> #else
> #define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_)				\
> 	u64: memcpy_fromio(&(val_), vaddr_iomem__, sizeof(u64))
>+#define __iosys_map_wr_io_u64_case(val_, vaddr_iomem_)			\
>+	u64: memcpy_toio(vaddr_iomem_, &(val_), sizeof(u64))
> #endif
>
> #define __iosys_map_rd_io(val__, vaddr_iomem__, type__) _Generic(val__,		\
>@@ -354,6 +358,19 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
> 	val__ = READ_ONCE(*((type__ *)vaddr__));				\
> })
>
>+#define __iosys_map_wr_io(val__, vaddr_iomem__, type__) _Generic(val__,		\
>+	u8: writeb(val__, vaddr_iomem__),					\
>+	u16: writew(val__, vaddr_iomem__),					\
>+	u32: writel(val__, vaddr_iomem__),					\
>+	__iosys_map_wr_io_u64_case(val__, vaddr_iomem__))
>+
>+#define __iosys_map_wr_sys(val__, vaddr__, type__) ({				\
>+	compiletime_assert(sizeof(type__) <= sizeof(u64),			\
>+			   "Unsupported access size for __iosys_map_wr_sys()"); \
>+	WRITE_ONCE(*((type__ *)vaddr__), val__);				\
>+})
>+
>+
> /**
>  * iosys_map_rd - Read a C-type value from the iosys_map
>  *
>@@ -386,12 +403,17 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
>  * @type__:	Type of the value being written
>  * @val__:	Value to write
>  *
>- * Write a C-type value to the iosys_map, handling possible un-aligned accesses
>- * to the mapping.
>+ * Write a C type value (u8, u16, u32 and u64) to the iosys_map. For other types
>+ * or if pointer may be unaligned (and problematic for the architecture
>+ * supported), use iosys_map_memcpy_to()
>  */
>-#define iosys_map_wr(map__, offset__, type__, val__) ({			\
>-	type__ val = (val__);						\
>-	iosys_map_memcpy_to(map__, offset__, &val, sizeof(val));	\
>+#define iosys_map_wr(map__, offset__, type__, val__) ({				\
>+	type__ val = (val__);							\
>+	if ((map__)->is_iomem) {						\
>+		__iosys_map_wr_io(val, (map__)->vaddr_iomem + (offset__), type__);\
>+	} else {								\
>+		__iosys_map_wr_sys(val, (map__)->vaddr + (offset__), type__);	\
>+	}									\
> })
>
> /**
>@@ -472,10 +494,12 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
>  * @field__:		Member of the struct to read
>  * @val__:		Value to write
>  *
>- * Write a value to the iosys_map considering its layout is described by a C struct
>- * starting at @struct_offset__. The field offset and size is calculated and the
>- * @val__ is written handling possible un-aligned memory accesses. Refer to
>- * iosys_map_rd_field() for expected usage and memory layout.
>+ * Write a value to the iosys_map considering its layout is described by a C
>+ * struct starting at @struct_offset__. The field offset and size is calculated
>+ * and the @val__ is written. If the field access would incur in un-aligned
>+ * access, then either iosys_map_memcpy_to() needs to be used or the
>+ * architecture must support it. Refer to iosys_map_rd_field() for expected
>+ * usage and memory layout.
>  */
> #define iosys_map_wr_field(map__, struct_offset__, struct_type__, field__, val__) ({	\
> 	struct_type__ *s;								\
>-- 
>2.36.1
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Intel-gfx] [PATCH 2/2] iosys-map: Add per-word write
  2022-06-26 21:01   ` [Intel-gfx] " Lucas De Marchi
@ 2022-06-27  7:03     ` Thomas Zimmermann
  0 siblings, 0 replies; 5+ messages in thread
From: Thomas Zimmermann @ 2022-06-27  7:03 UTC (permalink / raw)
  To: Lucas De Marchi, intel-gfx, dri-devel; +Cc: Daniel Vetter, christian.koenig


[-- Attachment #1.1: Type: text/plain, Size: 6059 bytes --]

Hi Lucas

Am 26.06.22 um 23:01 schrieb Lucas De Marchi:
> On Fri, Jun 17, 2022 at 01:52:04AM -0700, Lucas De Marchi wrote:
>> Like was done for read, provide the equivalent for write. Even if
>> current users are not in the hot path, this should future-proof it.
>>
>> v2:
>>  - Remove default from _Generic() - callers wanting to write more
>>    than u64 should use iosys_map_memcpy_to()
>>  - Add WRITE_ONCE() cases dereferencing the pointer when using system
>>    memory
> 
> Thomas, do you have any additional concern on this patch regarding your
> previous review?

Sorry, your patches simply fell through the cracks. For the patchset:

Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>

Thanks for the effort you put into this.

Best regards
Thomas

> 
> thanks
> Lucas De Marchi
> 
>>
>> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
>> Reviewed-by: Reviewed-by: Christian König <christian.koenig@amd.com> # v1
>> ---
>> include/linux/iosys-map.h | 42 ++++++++++++++++++++++++++++++---------
>> 1 file changed, 33 insertions(+), 9 deletions(-)
>>
>> diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h
>> index f59dd00ed202..580e14cd360c 100644
>> --- a/include/linux/iosys-map.h
>> +++ b/include/linux/iosys-map.h
>> @@ -337,9 +337,13 @@ static inline void iosys_map_memset(struct 
>> iosys_map *dst, size_t offset,
>> #ifdef CONFIG_64BIT
>> #define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_)                \
>>     u64: val_ = readq(vaddr_iomem_)
>> +#define __iosys_map_wr_io_u64_case(val_, vaddr_iomem_)            \
>> +    u64: writeq(val_, vaddr_iomem_)
>> #else
>> #define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_)                \
>>     u64: memcpy_fromio(&(val_), vaddr_iomem__, sizeof(u64))
>> +#define __iosys_map_wr_io_u64_case(val_, vaddr_iomem_)            \
>> +    u64: memcpy_toio(vaddr_iomem_, &(val_), sizeof(u64))
>> #endif
>>
>> #define __iosys_map_rd_io(val__, vaddr_iomem__, type__) 
>> _Generic(val__,        \
>> @@ -354,6 +358,19 @@ static inline void iosys_map_memset(struct 
>> iosys_map *dst, size_t offset,
>>     val__ = READ_ONCE(*((type__ *)vaddr__));                \
>> })
>>
>> +#define __iosys_map_wr_io(val__, vaddr_iomem__, type__) 
>> _Generic(val__,        \
>> +    u8: writeb(val__, vaddr_iomem__),                    \
>> +    u16: writew(val__, vaddr_iomem__),                    \
>> +    u32: writel(val__, vaddr_iomem__),                    \
>> +    __iosys_map_wr_io_u64_case(val__, vaddr_iomem__))
>> +
>> +#define __iosys_map_wr_sys(val__, vaddr__, type__) ({                \
>> +    compiletime_assert(sizeof(type__) <= sizeof(u64),            \
>> +               "Unsupported access size for __iosys_map_wr_sys()"); \
>> +    WRITE_ONCE(*((type__ *)vaddr__), val__);                \
>> +})
>> +
>> +
>> /**
>>  * iosys_map_rd - Read a C-type value from the iosys_map
>>  *
>> @@ -386,12 +403,17 @@ static inline void iosys_map_memset(struct 
>> iosys_map *dst, size_t offset,
>>  * @type__:    Type of the value being written
>>  * @val__:    Value to write
>>  *
>> - * Write a C-type value to the iosys_map, handling possible 
>> un-aligned accesses
>> - * to the mapping.
>> + * Write a C type value (u8, u16, u32 and u64) to the iosys_map. For 
>> other types
>> + * or if pointer may be unaligned (and problematic for the architecture
>> + * supported), use iosys_map_memcpy_to()
>>  */
>> -#define iosys_map_wr(map__, offset__, type__, val__) ({            \
>> -    type__ val = (val__);                        \
>> -    iosys_map_memcpy_to(map__, offset__, &val, sizeof(val));    \
>> +#define iosys_map_wr(map__, offset__, type__, val__) ({                \
>> +    type__ val = (val__);                            \
>> +    if ((map__)->is_iomem) {                        \
>> +        __iosys_map_wr_io(val, (map__)->vaddr_iomem + (offset__), 
>> type__);\
>> +    } else {                                \
>> +        __iosys_map_wr_sys(val, (map__)->vaddr + (offset__), 
>> type__);    \
>> +    }                                    \
>> })
>>
>> /**
>> @@ -472,10 +494,12 @@ static inline void iosys_map_memset(struct 
>> iosys_map *dst, size_t offset,
>>  * @field__:        Member of the struct to read
>>  * @val__:        Value to write
>>  *
>> - * Write a value to the iosys_map considering its layout is described 
>> by a C struct
>> - * starting at @struct_offset__. The field offset and size is 
>> calculated and the
>> - * @val__ is written handling possible un-aligned memory accesses. 
>> Refer to
>> - * iosys_map_rd_field() for expected usage and memory layout.
>> + * Write a value to the iosys_map considering its layout is described 
>> by a C
>> + * struct starting at @struct_offset__. The field offset and size is 
>> calculated
>> + * and the @val__ is written. If the field access would incur in 
>> un-aligned
>> + * access, then either iosys_map_memcpy_to() needs to be used or the
>> + * architecture must support it. Refer to iosys_map_rd_field() for 
>> expected
>> + * usage and memory layout.
>>  */
>> #define iosys_map_wr_field(map__, struct_offset__, struct_type__, 
>> field__, val__) ({    \
>>     struct_type__ *s;                                \
>> -- 
>> 2.36.1
>>

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-06-27  7:03 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-17  8:52 [PATCH 1/2] iosys-map: Add per-word read Lucas De Marchi
2022-06-17  8:52 ` [PATCH 2/2] iosys-map: Add per-word write Lucas De Marchi
2022-06-26 21:01   ` [Intel-gfx] " Lucas De Marchi
2022-06-27  7:03     ` Thomas Zimmermann
2022-06-17 15:56 ` [Intel-gfx] [PATCH 1/2] iosys-map: Add per-word read Lucas De Marchi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).