* [PATCH 1/2] iosys-map: Add per-word read
@ 2022-06-17 8:52 Lucas De Marchi
2022-06-17 8:52 ` [PATCH 2/2] iosys-map: Add per-word write Lucas De Marchi
2022-06-17 15:56 ` [Intel-gfx] [PATCH 1/2] iosys-map: Add per-word read Lucas De Marchi
0 siblings, 2 replies; 5+ messages in thread
From: Lucas De Marchi @ 2022-06-17 8:52 UTC (permalink / raw)
To: intel-gfx, dri-devel
Cc: Daniel Vetter, Lucas De Marchi, christian.koenig, tzimmermann
Instead of always falling back to memcpy_fromio() for any size, prefer
using read{b,w,l}(). When reading struct members it's common to read
individual integer variables individually. Going through memcpy_fromio()
for each of them poses a high penalty.
Employ a similar trick as __seqprop() by using _Generic() to generate
only the specific call based on a type-compatible variable.
For a pariticular i915 workload producing GPU context switches,
__get_engine_usage_record() is particularly hot since the engine usage
is read from device local memory with dgfx, possibly multiple times
since it's racy. Test execution time for this test shows a ~12.5%
improvement with DG2:
Before:
nrepeats = 1000; min = 7.63243e+06; max = 1.01817e+07;
median = 9.52548e+06; var = 526149;
After:
nrepeats = 1000; min = 7.03402e+06; max = 8.8832e+06;
median = 8.33955e+06; var = 333113;
Other things attempted that didn't prove very useful:
1) Change the _Generic() on x86 to just dereference the memory address
2) Change __get_engine_usage_record() to do just 1 read per loop,
comparing with the previous value read
3) Change __get_engine_usage_record() to access the fields directly as it
was before the conversion to iosys-map
(3) did gave a small improvement (~3%), but doesn't seem to scale well
to other similar cases in the driver.
Additional test by Chris Wilson using gem_create from igt with some
changes to track object creation time. This happens to accidentally
stress this code path:
Pre iosys_map conversion of engine busyness:
lmem0: Creating 262144 4KiB objects took 59274.2ms
Unpatched:
lmem0: Creating 262144 4KiB objects took 108830.2ms
With readl (this patch):
lmem0: Creating 262144 4KiB objects took 61348.6ms
s/readl/READ_ONCE/
lmem0: Creating 262144 4KiB objects took 61333.2ms
So we do take a little bit more time than before the conversion, but
that is due to other factors: bringing the READ_ONCE back would be as
good as just doing this conversion.
v2:
- Remove default from _Generic() - callers wanting to read more
than u64 should use iosys_map_memcpy_from()
- Add READ_ONCE() cases dereferencing the pointer when using system
memory
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com> # v1
---
include/linux/iosys-map.h | 45 +++++++++++++++++++++++++++++++--------
1 file changed, 36 insertions(+), 9 deletions(-)
diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h
index 4b8406ee8bc4..f59dd00ed202 100644
--- a/include/linux/iosys-map.h
+++ b/include/linux/iosys-map.h
@@ -6,6 +6,7 @@
#ifndef __IOSYS_MAP_H__
#define __IOSYS_MAP_H__
+#include <linux/compiler_types.h>
#include <linux/io.h>
#include <linux/string.h>
@@ -333,6 +334,26 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
memset(dst->vaddr + offset, value, len);
}
+#ifdef CONFIG_64BIT
+#define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_) \
+ u64: val_ = readq(vaddr_iomem_)
+#else
+#define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_) \
+ u64: memcpy_fromio(&(val_), vaddr_iomem__, sizeof(u64))
+#endif
+
+#define __iosys_map_rd_io(val__, vaddr_iomem__, type__) _Generic(val__, \
+ u8: val__ = readb(vaddr_iomem__), \
+ u16: val__ = readw(vaddr_iomem__), \
+ u32: val__ = readl(vaddr_iomem__), \
+ __iosys_map_rd_io_u64_case(val__, vaddr_iomem__))
+
+#define __iosys_map_rd_sys(val__, vaddr__, type__) ({ \
+ compiletime_assert(sizeof(type__) <= sizeof(u64), \
+ "Unsupported access size for __iosys_map_rd_sys()"); \
+ val__ = READ_ONCE(*((type__ *)vaddr__)); \
+})
+
/**
* iosys_map_rd - Read a C-type value from the iosys_map
*
@@ -340,16 +361,21 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
* @offset__: The offset from which to read
* @type__: Type of the value being read
*
- * Read a C type value from iosys_map, handling possible un-aligned accesses to
- * the mapping.
+ * Read a C type value (u8, u16, u32 and u64) from iosys_map. For other types or
+ * if pointer may be unaligned (and problematic for the architecture supported),
+ * use iosys_map_memcpy_from().
*
* Returns:
* The value read from the mapping.
*/
-#define iosys_map_rd(map__, offset__, type__) ({ \
- type__ val; \
- iosys_map_memcpy_from(&val, map__, offset__, sizeof(val)); \
- val; \
+#define iosys_map_rd(map__, offset__, type__) ({ \
+ type__ val; \
+ if ((map__)->is_iomem) { \
+ __iosys_map_rd_io(val, (map__)->vaddr_iomem + (offset__), type__);\
+ } else { \
+ __iosys_map_rd_sys(val, (map__)->vaddr + (offset__), type__); \
+ } \
+ val; \
})
/**
@@ -379,9 +405,10 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
*
* Read a value from iosys_map considering its layout is described by a C struct
* starting at @struct_offset__. The field offset and size is calculated and its
- * value read handling possible un-aligned memory accesses. For example: suppose
- * there is a @struct foo defined as below and the value ``foo.field2.inner2``
- * needs to be read from the iosys_map:
+ * value read. If the field access would incur in un-aligned access, then either
+ * iosys_map_memcpy_from() needs to be used or the architecture must support it.
+ * For example: suppose there is a @struct foo defined as below and the value
+ * ``foo.field2.inner2`` needs to be read from the iosys_map:
*
* .. code-block:: c
*
--
2.36.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH 2/2] iosys-map: Add per-word write
2022-06-17 8:52 [PATCH 1/2] iosys-map: Add per-word read Lucas De Marchi
@ 2022-06-17 8:52 ` Lucas De Marchi
2022-06-26 21:01 ` [Intel-gfx] " Lucas De Marchi
2022-06-17 15:56 ` [Intel-gfx] [PATCH 1/2] iosys-map: Add per-word read Lucas De Marchi
1 sibling, 1 reply; 5+ messages in thread
From: Lucas De Marchi @ 2022-06-17 8:52 UTC (permalink / raw)
To: intel-gfx, dri-devel
Cc: Daniel Vetter, Lucas De Marchi, christian.koenig, tzimmermann
Like was done for read, provide the equivalent for write. Even if
current users are not in the hot path, this should future-proof it.
v2:
- Remove default from _Generic() - callers wanting to write more
than u64 should use iosys_map_memcpy_to()
- Add WRITE_ONCE() cases dereferencing the pointer when using system
memory
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Reviewed-by: Christian König <christian.koenig@amd.com> # v1
---
include/linux/iosys-map.h | 42 ++++++++++++++++++++++++++++++---------
1 file changed, 33 insertions(+), 9 deletions(-)
diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h
index f59dd00ed202..580e14cd360c 100644
--- a/include/linux/iosys-map.h
+++ b/include/linux/iosys-map.h
@@ -337,9 +337,13 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
#ifdef CONFIG_64BIT
#define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_) \
u64: val_ = readq(vaddr_iomem_)
+#define __iosys_map_wr_io_u64_case(val_, vaddr_iomem_) \
+ u64: writeq(val_, vaddr_iomem_)
#else
#define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_) \
u64: memcpy_fromio(&(val_), vaddr_iomem__, sizeof(u64))
+#define __iosys_map_wr_io_u64_case(val_, vaddr_iomem_) \
+ u64: memcpy_toio(vaddr_iomem_, &(val_), sizeof(u64))
#endif
#define __iosys_map_rd_io(val__, vaddr_iomem__, type__) _Generic(val__, \
@@ -354,6 +358,19 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
val__ = READ_ONCE(*((type__ *)vaddr__)); \
})
+#define __iosys_map_wr_io(val__, vaddr_iomem__, type__) _Generic(val__, \
+ u8: writeb(val__, vaddr_iomem__), \
+ u16: writew(val__, vaddr_iomem__), \
+ u32: writel(val__, vaddr_iomem__), \
+ __iosys_map_wr_io_u64_case(val__, vaddr_iomem__))
+
+#define __iosys_map_wr_sys(val__, vaddr__, type__) ({ \
+ compiletime_assert(sizeof(type__) <= sizeof(u64), \
+ "Unsupported access size for __iosys_map_wr_sys()"); \
+ WRITE_ONCE(*((type__ *)vaddr__), val__); \
+})
+
+
/**
* iosys_map_rd - Read a C-type value from the iosys_map
*
@@ -386,12 +403,17 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
* @type__: Type of the value being written
* @val__: Value to write
*
- * Write a C-type value to the iosys_map, handling possible un-aligned accesses
- * to the mapping.
+ * Write a C type value (u8, u16, u32 and u64) to the iosys_map. For other types
+ * or if pointer may be unaligned (and problematic for the architecture
+ * supported), use iosys_map_memcpy_to()
*/
-#define iosys_map_wr(map__, offset__, type__, val__) ({ \
- type__ val = (val__); \
- iosys_map_memcpy_to(map__, offset__, &val, sizeof(val)); \
+#define iosys_map_wr(map__, offset__, type__, val__) ({ \
+ type__ val = (val__); \
+ if ((map__)->is_iomem) { \
+ __iosys_map_wr_io(val, (map__)->vaddr_iomem + (offset__), type__);\
+ } else { \
+ __iosys_map_wr_sys(val, (map__)->vaddr + (offset__), type__); \
+ } \
})
/**
@@ -472,10 +494,12 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
* @field__: Member of the struct to read
* @val__: Value to write
*
- * Write a value to the iosys_map considering its layout is described by a C struct
- * starting at @struct_offset__. The field offset and size is calculated and the
- * @val__ is written handling possible un-aligned memory accesses. Refer to
- * iosys_map_rd_field() for expected usage and memory layout.
+ * Write a value to the iosys_map considering its layout is described by a C
+ * struct starting at @struct_offset__. The field offset and size is calculated
+ * and the @val__ is written. If the field access would incur in un-aligned
+ * access, then either iosys_map_memcpy_to() needs to be used or the
+ * architecture must support it. Refer to iosys_map_rd_field() for expected
+ * usage and memory layout.
*/
#define iosys_map_wr_field(map__, struct_offset__, struct_type__, field__, val__) ({ \
struct_type__ *s; \
--
2.36.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [Intel-gfx] [PATCH 1/2] iosys-map: Add per-word read
2022-06-17 8:52 [PATCH 1/2] iosys-map: Add per-word read Lucas De Marchi
2022-06-17 8:52 ` [PATCH 2/2] iosys-map: Add per-word write Lucas De Marchi
@ 2022-06-17 15:56 ` Lucas De Marchi
1 sibling, 0 replies; 5+ messages in thread
From: Lucas De Marchi @ 2022-06-17 15:56 UTC (permalink / raw)
To: intel-gfx, dri-devel; +Cc: Daniel Vetter, christian.koenig, tzimmermann
On Fri, Jun 17, 2022 at 01:52:03AM -0700, Lucas De Marchi wrote:
>Instead of always falling back to memcpy_fromio() for any size, prefer
>using read{b,w,l}(). When reading struct members it's common to read
>individual integer variables individually. Going through memcpy_fromio()
>for each of them poses a high penalty.
>
>Employ a similar trick as __seqprop() by using _Generic() to generate
>only the specific call based on a type-compatible variable.
>
>For a pariticular i915 workload producing GPU context switches,
>__get_engine_usage_record() is particularly hot since the engine usage
>is read from device local memory with dgfx, possibly multiple times
>since it's racy. Test execution time for this test shows a ~12.5%
>improvement with DG2:
>
>Before:
> nrepeats = 1000; min = 7.63243e+06; max = 1.01817e+07;
> median = 9.52548e+06; var = 526149;
>After:
> nrepeats = 1000; min = 7.03402e+06; max = 8.8832e+06;
> median = 8.33955e+06; var = 333113;
>
>Other things attempted that didn't prove very useful:
>1) Change the _Generic() on x86 to just dereference the memory address
>2) Change __get_engine_usage_record() to do just 1 read per loop,
> comparing with the previous value read
>3) Change __get_engine_usage_record() to access the fields directly as it
> was before the conversion to iosys-map
>
>(3) did gave a small improvement (~3%), but doesn't seem to scale well
>to other similar cases in the driver.
>
>Additional test by Chris Wilson using gem_create from igt with some
>changes to track object creation time. This happens to accidentally
>stress this code path:
>
> Pre iosys_map conversion of engine busyness:
> lmem0: Creating 262144 4KiB objects took 59274.2ms
>
> Unpatched:
> lmem0: Creating 262144 4KiB objects took 108830.2ms
>
> With readl (this patch):
> lmem0: Creating 262144 4KiB objects took 61348.6ms
>
> s/readl/READ_ONCE/
> lmem0: Creating 262144 4KiB objects took 61333.2ms
>
>So we do take a little bit more time than before the conversion, but
>that is due to other factors: bringing the READ_ONCE back would be as
>good as just doing this conversion.
>
>v2:
>- Remove default from _Generic() - callers wanting to read more
> than u64 should use iosys_map_memcpy_from()
>- Add READ_ONCE() cases dereferencing the pointer when using system
> memory
>
>Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
>Reviewed-by: Christian König <christian.koenig@amd.com> # v1
>---
> include/linux/iosys-map.h | 45 +++++++++++++++++++++++++++++++--------
> 1 file changed, 36 insertions(+), 9 deletions(-)
>
>diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h
>index 4b8406ee8bc4..f59dd00ed202 100644
>--- a/include/linux/iosys-map.h
>+++ b/include/linux/iosys-map.h
>@@ -6,6 +6,7 @@
> #ifndef __IOSYS_MAP_H__
> #define __IOSYS_MAP_H__
>
>+#include <linux/compiler_types.h>
> #include <linux/io.h>
> #include <linux/string.h>
>
>@@ -333,6 +334,26 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
> memset(dst->vaddr + offset, value, len);
> }
>
>+#ifdef CONFIG_64BIT
>+#define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_) \
>+ u64: val_ = readq(vaddr_iomem_)
>+#else
>+#define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_) \
>+ u64: memcpy_fromio(&(val_), vaddr_iomem__, sizeof(u64))
I tested io/sys and forgot again to test it for 32-bit :(. This
should fix the build for 32-bits:
diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h
index 580e14cd360c..f8bc052f8975 100644
--- a/include/linux/iosys-map.h
+++ b/include/linux/iosys-map.h
@@ -341,7 +341,7 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
u64: writeq(val_, vaddr_iomem_)
#else
#define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_) \
- u64: memcpy_fromio(&(val_), vaddr_iomem__, sizeof(u64))
+ u64: memcpy_fromio(&(val_), vaddr_iomem_, sizeof(u64))
#define __iosys_map_wr_io_u64_case(val_, vaddr_iomem_) \
u64: memcpy_toio(vaddr_iomem_, &(val_), sizeof(u64))
#endif
Lucas De Marchi
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [Intel-gfx] [PATCH 2/2] iosys-map: Add per-word write
2022-06-17 8:52 ` [PATCH 2/2] iosys-map: Add per-word write Lucas De Marchi
@ 2022-06-26 21:01 ` Lucas De Marchi
2022-06-27 7:03 ` Thomas Zimmermann
0 siblings, 1 reply; 5+ messages in thread
From: Lucas De Marchi @ 2022-06-26 21:01 UTC (permalink / raw)
To: intel-gfx, dri-devel; +Cc: Daniel Vetter, christian.koenig, tzimmermann
On Fri, Jun 17, 2022 at 01:52:04AM -0700, Lucas De Marchi wrote:
>Like was done for read, provide the equivalent for write. Even if
>current users are not in the hot path, this should future-proof it.
>
>v2:
> - Remove default from _Generic() - callers wanting to write more
> than u64 should use iosys_map_memcpy_to()
> - Add WRITE_ONCE() cases dereferencing the pointer when using system
> memory
Thomas, do you have any additional concern on this patch regarding your
previous review?
thanks
Lucas De Marchi
>
>Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
>Reviewed-by: Reviewed-by: Christian König <christian.koenig@amd.com> # v1
>---
> include/linux/iosys-map.h | 42 ++++++++++++++++++++++++++++++---------
> 1 file changed, 33 insertions(+), 9 deletions(-)
>
>diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h
>index f59dd00ed202..580e14cd360c 100644
>--- a/include/linux/iosys-map.h
>+++ b/include/linux/iosys-map.h
>@@ -337,9 +337,13 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
> #ifdef CONFIG_64BIT
> #define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_) \
> u64: val_ = readq(vaddr_iomem_)
>+#define __iosys_map_wr_io_u64_case(val_, vaddr_iomem_) \
>+ u64: writeq(val_, vaddr_iomem_)
> #else
> #define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_) \
> u64: memcpy_fromio(&(val_), vaddr_iomem__, sizeof(u64))
>+#define __iosys_map_wr_io_u64_case(val_, vaddr_iomem_) \
>+ u64: memcpy_toio(vaddr_iomem_, &(val_), sizeof(u64))
> #endif
>
> #define __iosys_map_rd_io(val__, vaddr_iomem__, type__) _Generic(val__, \
>@@ -354,6 +358,19 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
> val__ = READ_ONCE(*((type__ *)vaddr__)); \
> })
>
>+#define __iosys_map_wr_io(val__, vaddr_iomem__, type__) _Generic(val__, \
>+ u8: writeb(val__, vaddr_iomem__), \
>+ u16: writew(val__, vaddr_iomem__), \
>+ u32: writel(val__, vaddr_iomem__), \
>+ __iosys_map_wr_io_u64_case(val__, vaddr_iomem__))
>+
>+#define __iosys_map_wr_sys(val__, vaddr__, type__) ({ \
>+ compiletime_assert(sizeof(type__) <= sizeof(u64), \
>+ "Unsupported access size for __iosys_map_wr_sys()"); \
>+ WRITE_ONCE(*((type__ *)vaddr__), val__); \
>+})
>+
>+
> /**
> * iosys_map_rd - Read a C-type value from the iosys_map
> *
>@@ -386,12 +403,17 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
> * @type__: Type of the value being written
> * @val__: Value to write
> *
>- * Write a C-type value to the iosys_map, handling possible un-aligned accesses
>- * to the mapping.
>+ * Write a C type value (u8, u16, u32 and u64) to the iosys_map. For other types
>+ * or if pointer may be unaligned (and problematic for the architecture
>+ * supported), use iosys_map_memcpy_to()
> */
>-#define iosys_map_wr(map__, offset__, type__, val__) ({ \
>- type__ val = (val__); \
>- iosys_map_memcpy_to(map__, offset__, &val, sizeof(val)); \
>+#define iosys_map_wr(map__, offset__, type__, val__) ({ \
>+ type__ val = (val__); \
>+ if ((map__)->is_iomem) { \
>+ __iosys_map_wr_io(val, (map__)->vaddr_iomem + (offset__), type__);\
>+ } else { \
>+ __iosys_map_wr_sys(val, (map__)->vaddr + (offset__), type__); \
>+ } \
> })
>
> /**
>@@ -472,10 +494,12 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
> * @field__: Member of the struct to read
> * @val__: Value to write
> *
>- * Write a value to the iosys_map considering its layout is described by a C struct
>- * starting at @struct_offset__. The field offset and size is calculated and the
>- * @val__ is written handling possible un-aligned memory accesses. Refer to
>- * iosys_map_rd_field() for expected usage and memory layout.
>+ * Write a value to the iosys_map considering its layout is described by a C
>+ * struct starting at @struct_offset__. The field offset and size is calculated
>+ * and the @val__ is written. If the field access would incur in un-aligned
>+ * access, then either iosys_map_memcpy_to() needs to be used or the
>+ * architecture must support it. Refer to iosys_map_rd_field() for expected
>+ * usage and memory layout.
> */
> #define iosys_map_wr_field(map__, struct_offset__, struct_type__, field__, val__) ({ \
> struct_type__ *s; \
>--
>2.36.1
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Intel-gfx] [PATCH 2/2] iosys-map: Add per-word write
2022-06-26 21:01 ` [Intel-gfx] " Lucas De Marchi
@ 2022-06-27 7:03 ` Thomas Zimmermann
0 siblings, 0 replies; 5+ messages in thread
From: Thomas Zimmermann @ 2022-06-27 7:03 UTC (permalink / raw)
To: Lucas De Marchi, intel-gfx, dri-devel; +Cc: Daniel Vetter, christian.koenig
[-- Attachment #1.1: Type: text/plain, Size: 6059 bytes --]
Hi Lucas
Am 26.06.22 um 23:01 schrieb Lucas De Marchi:
> On Fri, Jun 17, 2022 at 01:52:04AM -0700, Lucas De Marchi wrote:
>> Like was done for read, provide the equivalent for write. Even if
>> current users are not in the hot path, this should future-proof it.
>>
>> v2:
>> - Remove default from _Generic() - callers wanting to write more
>> than u64 should use iosys_map_memcpy_to()
>> - Add WRITE_ONCE() cases dereferencing the pointer when using system
>> memory
>
> Thomas, do you have any additional concern on this patch regarding your
> previous review?
Sorry, your patches simply fell through the cracks. For the patchset:
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Thanks for the effort you put into this.
Best regards
Thomas
>
> thanks
> Lucas De Marchi
>
>>
>> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
>> Reviewed-by: Reviewed-by: Christian König <christian.koenig@amd.com> # v1
>> ---
>> include/linux/iosys-map.h | 42 ++++++++++++++++++++++++++++++---------
>> 1 file changed, 33 insertions(+), 9 deletions(-)
>>
>> diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h
>> index f59dd00ed202..580e14cd360c 100644
>> --- a/include/linux/iosys-map.h
>> +++ b/include/linux/iosys-map.h
>> @@ -337,9 +337,13 @@ static inline void iosys_map_memset(struct
>> iosys_map *dst, size_t offset,
>> #ifdef CONFIG_64BIT
>> #define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_) \
>> u64: val_ = readq(vaddr_iomem_)
>> +#define __iosys_map_wr_io_u64_case(val_, vaddr_iomem_) \
>> + u64: writeq(val_, vaddr_iomem_)
>> #else
>> #define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_) \
>> u64: memcpy_fromio(&(val_), vaddr_iomem__, sizeof(u64))
>> +#define __iosys_map_wr_io_u64_case(val_, vaddr_iomem_) \
>> + u64: memcpy_toio(vaddr_iomem_, &(val_), sizeof(u64))
>> #endif
>>
>> #define __iosys_map_rd_io(val__, vaddr_iomem__, type__)
>> _Generic(val__, \
>> @@ -354,6 +358,19 @@ static inline void iosys_map_memset(struct
>> iosys_map *dst, size_t offset,
>> val__ = READ_ONCE(*((type__ *)vaddr__)); \
>> })
>>
>> +#define __iosys_map_wr_io(val__, vaddr_iomem__, type__)
>> _Generic(val__, \
>> + u8: writeb(val__, vaddr_iomem__), \
>> + u16: writew(val__, vaddr_iomem__), \
>> + u32: writel(val__, vaddr_iomem__), \
>> + __iosys_map_wr_io_u64_case(val__, vaddr_iomem__))
>> +
>> +#define __iosys_map_wr_sys(val__, vaddr__, type__) ({ \
>> + compiletime_assert(sizeof(type__) <= sizeof(u64), \
>> + "Unsupported access size for __iosys_map_wr_sys()"); \
>> + WRITE_ONCE(*((type__ *)vaddr__), val__); \
>> +})
>> +
>> +
>> /**
>> * iosys_map_rd - Read a C-type value from the iosys_map
>> *
>> @@ -386,12 +403,17 @@ static inline void iosys_map_memset(struct
>> iosys_map *dst, size_t offset,
>> * @type__: Type of the value being written
>> * @val__: Value to write
>> *
>> - * Write a C-type value to the iosys_map, handling possible
>> un-aligned accesses
>> - * to the mapping.
>> + * Write a C type value (u8, u16, u32 and u64) to the iosys_map. For
>> other types
>> + * or if pointer may be unaligned (and problematic for the architecture
>> + * supported), use iosys_map_memcpy_to()
>> */
>> -#define iosys_map_wr(map__, offset__, type__, val__) ({ \
>> - type__ val = (val__); \
>> - iosys_map_memcpy_to(map__, offset__, &val, sizeof(val)); \
>> +#define iosys_map_wr(map__, offset__, type__, val__) ({ \
>> + type__ val = (val__); \
>> + if ((map__)->is_iomem) { \
>> + __iosys_map_wr_io(val, (map__)->vaddr_iomem + (offset__),
>> type__);\
>> + } else { \
>> + __iosys_map_wr_sys(val, (map__)->vaddr + (offset__),
>> type__); \
>> + } \
>> })
>>
>> /**
>> @@ -472,10 +494,12 @@ static inline void iosys_map_memset(struct
>> iosys_map *dst, size_t offset,
>> * @field__: Member of the struct to read
>> * @val__: Value to write
>> *
>> - * Write a value to the iosys_map considering its layout is described
>> by a C struct
>> - * starting at @struct_offset__. The field offset and size is
>> calculated and the
>> - * @val__ is written handling possible un-aligned memory accesses.
>> Refer to
>> - * iosys_map_rd_field() for expected usage and memory layout.
>> + * Write a value to the iosys_map considering its layout is described
>> by a C
>> + * struct starting at @struct_offset__. The field offset and size is
>> calculated
>> + * and the @val__ is written. If the field access would incur in
>> un-aligned
>> + * access, then either iosys_map_memcpy_to() needs to be used or the
>> + * architecture must support it. Refer to iosys_map_rd_field() for
>> expected
>> + * usage and memory layout.
>> */
>> #define iosys_map_wr_field(map__, struct_offset__, struct_type__,
>> field__, val__) ({ \
>> struct_type__ *s; \
>> --
>> 2.36.1
>>
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2022-06-27 7:03 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-17 8:52 [PATCH 1/2] iosys-map: Add per-word read Lucas De Marchi
2022-06-17 8:52 ` [PATCH 2/2] iosys-map: Add per-word write Lucas De Marchi
2022-06-26 21:01 ` [Intel-gfx] " Lucas De Marchi
2022-06-27 7:03 ` Thomas Zimmermann
2022-06-17 15:56 ` [Intel-gfx] [PATCH 1/2] iosys-map: Add per-word read Lucas De Marchi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).