* [CI 1/3] drm-tip: 2022y-06m-27d-16h-18m-47s UTC integration manifest
@ 2022-06-28 18:47 ` Lucas De Marchi
  0 siblings, 0 replies; 8+ messages in thread
From: Lucas De Marchi @ 2022-06-28 18:47 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Daniel Vetter, christian.koenig, tzimmermann

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

---
 integration-manifest | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)
 create mode 100644 integration-manifest

diff --git a/integration-manifest b/integration-manifest
new file mode 100644
index 000000000000..baffa2a57cd4
--- /dev/null
+++ b/integration-manifest
@@ -0,0 +1,26 @@
+drm drm-fixes 03c765b0e3b4cb5063276b086c76f7a612856a9a
+	Linux 5.19-rc4
+drm-misc drm-misc-fixes 5f701324c0fb6f9f5aaac3f8d1575321375f6d8f
+	drm/vc4: perfmon: Fix variable dereferenced before check
+drm-intel drm-intel-fixes 79538490fd7ade244dba400923e792519a2bdfea
+	drm/i915: tweak the ordering in cpu_write_needs_clflush
+drm drm-next 805ada63ba0567b15d10d40419bcc5e6f0b461e6
+	Merge tag 'drm-intel-next-2022-06-22' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
+drm-misc drm-misc-next-fixes 5ee8c8f930ba7d20717c4fc2d9f1ce0e757d1155
+	drm/rockchip: Change register space names in vop2
+drm-intel drm-intel-next-fixes f2906aa863381afb0015a9eb7fefad885d4e5a56
+	Linux 5.19-rc1
+drm-misc drm-misc-next 7d008eecb0cfc2b1a1a742d6faa0a02f339535c2
+	drm/stm: ltdc: update hardware error management
+drm-intel drm-intel-next f7fb92cd2e39357f14846d69ae0e1d8692371f82
+	drm/i915: Move the color stuff under INTEL_INFO->display
+drm-intel drm-intel-gt-next 7d8097073caa334ed6187a964645335324231e01
+	drm/i915: Prefer "XEHP_" prefix for registers
+sound-upstream for-linus 7cf3dead1ad70c72edb03e2d98e1f3dcd332cdb2
+	Linux 5.13
+sound-upstream for-next 7cf3dead1ad70c72edb03e2d98e1f3dcd332cdb2
+	Linux 5.13
+drm-intel topic/core-for-CI f7d7dddaab81eeae4508197b5f38f0b974d97b8c
+	topic/core-for-CI: Add remaining DG2 and ATS-M device IDs
+drm-misc topic/i915-ttm 1e3944578b749449bd7fa6bf0bae4c3d3f5f1733
+	Merge tag 'amd-drm-next-5.16-2021-09-27' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
-- 
2.36.1



* [CI 2/3] iosys-map: Add per-word read
  2022-06-28 18:47 ` [Intel-gfx] " Lucas De Marchi
@ 2022-06-28 18:47   ` Lucas De Marchi
  -1 siblings, 0 replies; 8+ messages in thread
From: Lucas De Marchi @ 2022-06-28 18:47 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: Daniel Vetter, Lucas De Marchi, christian.koenig, tzimmermann

Instead of always falling back to memcpy_fromio() for any size, prefer
using read{b,w,l}(). When reading struct members it's common to read
integer variables one at a time. Going through memcpy_fromio() for each
of them incurs a high penalty.

Employ a similar trick as __seqprop() by using _Generic() to generate
only the specific call based on a type-compatible variable.
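
The _Generic() selection described above can be sketched in userspace C11.
This is only an illustration, not the kernel code: the demo_read*() helpers
and the demo_rd() macro are hypothetical stand-ins for read{b,w,l,q}() and
for the macro added by the patch, and they read ordinary memory rather than
MMIO.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical userspace stand-ins for readb()/readw()/readl()/readq().
 * A fixed-size memcpy() keeps the sketch free of alignment and aliasing
 * problems; the kernel accessors perform a single MMIO load instead.
 */
static inline uint8_t demo_read8(const void *p)
{
	uint8_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

static inline uint16_t demo_read16(const void *p)
{
	uint16_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

static inline uint32_t demo_read32(const void *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

static inline uint64_t demo_read64(const void *p)
{
	uint64_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/*
 * Pick the accessor from the type of val__. The controlling expression is
 * not evaluated, only its type is used. There is no default association,
 * so any other type is rejected at compile time.
 */
#define demo_rd(val__, addr__) _Generic((val__),	\
	uint8_t:  demo_read8,				\
	uint16_t: demo_read16,				\
	uint32_t: demo_read32,				\
	uint64_t: demo_read64)(addr__)
```

With a uint32_t variable v, "v = demo_rd(v, ptr)" compiles down to the
demo_read32() call only. The sketch selects a function and then calls it,
whereas the patch places the assignment inside each association; the type
dispatch is the same idea.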

For a particular i915 workload producing GPU context switches,
__get_engine_usage_record() is particularly hot since the engine usage
is read from device local memory with dgfx, possibly multiple times
since it's racy. Test execution time for this test shows a ~12.5%
improvement with DG2:

Before:
	nrepeats = 1000; min = 7.63243e+06; max = 1.01817e+07;
	median = 9.52548e+06; var = 526149;
After:
	nrepeats = 1000; min = 7.03402e+06; max = 8.8832e+06;
	median = 8.33955e+06; var = 333113;

Other things attempted that didn't prove very useful:
1) Change the _Generic() on x86 to just dereference the memory address
2) Change __get_engine_usage_record() to do just 1 read per loop,
   comparing with the previous value read
3) Change __get_engine_usage_record() to access the fields directly as it
   was before the conversion to iosys-map

(3) did give a small improvement (~3%), but doesn't seem to scale well
to other similar cases in the driver.

Additional test by Chris Wilson using gem_create from igt with some
changes to track object creation time. This happens to accidentally
stress this code path:

	Pre iosys_map conversion of engine busyness:
	lmem0: Creating    262144 4KiB objects took 59274.2ms

	Unpatched:
	lmem0: Creating    262144 4KiB objects took 108830.2ms

	With readl (this patch):
	lmem0: Creating    262144 4KiB objects took 61348.6ms

	s/readl/READ_ONCE/
	lmem0: Creating    262144 4KiB objects took 61333.2ms

So we do take a little bit more time than before the conversion, but
that is due to other factors: bringing the READ_ONCE back would be as
good as just doing this conversion.
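
As a cross-check of the percentages quoted in this message, the relative
changes can be recomputed from the raw numbers. The helper below is a
hypothetical name used only for this arithmetic; it is not part of the
patch.

```c
/* Fraction by which `after` is smaller than `before` (0.10 means 10%). */
static double demo_relative_reduction(double before, double after)
{
	return (before - after) / before;
}
```

Applied to the medians above, demo_relative_reduction(9.52548e+06,
8.33955e+06) is roughly 0.1245, in line with the quoted ~12.5%. For the
gem_create run, going from 108830.2ms to 61348.6ms is a reduction of about
43.6%, and 61348.6ms is about 3.5% above the pre-conversion 59274.2ms.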

v2:
  - Remove default from _Generic() - callers wanting to read more
    than u64 should use iosys_map_memcpy_from()
  - Add READ_ONCE() cases dereferencing the pointer when using system
    memory
v3:
  - Fix precedence issue when casting inside READ_ONCE(). By not using ()
    around vaddr__ the offset was not part of the cast, but rather added
    to it, producing a wrong address
  - Remove compiletime_assert() as READ_ONCE() already contains it
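
The precedence issue fixed in v3 belongs to a well-known class of macro
bugs. The v2 macro is not shown in this thread, so the sketch below is a
hypothetical reconstruction of that class of bug, not the actual earlier
code: a cast binds tighter than "+", so an unparenthesized macro argument
of the form "base + off" has only "base" cast, and the offset is then
scaled by the size of the target type.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macros: the first omits the parentheses around vaddr__. */
#define DEMO_PTR_BAD(type__, vaddr__)	((type__ *)vaddr__)
#define DEMO_PTR_GOOD(type__, vaddr__)	((type__ *)(vaddr__))

/* Byte distance from base actually addressed by each variant. */
static size_t demo_bad_byte_offset(unsigned char *base, size_t off)
{
	/* Expands to (uint32_t *)base + off: off is scaled by 4. */
	return (size_t)((unsigned char *)DEMO_PTR_BAD(uint32_t, base + off) - base);
}

static size_t demo_good_byte_offset(unsigned char *base, size_t off)
{
	/* Expands to (uint32_t *)(base + off): off stays in bytes. */
	return (size_t)((unsigned char *)DEMO_PTR_GOOD(uint32_t, base + off) - base);
}
```

For a byte offset of 4, the parenthesized variant addresses byte 4 while
the unparenthesized one addresses byte 16.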

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com> # v1
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
---
 include/linux/iosys-map.h | 42 ++++++++++++++++++++++++++++++---------
 1 file changed, 33 insertions(+), 9 deletions(-)

diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h
index 4b8406ee8bc4..48e550b290fa 100644
--- a/include/linux/iosys-map.h
+++ b/include/linux/iosys-map.h
@@ -6,6 +6,7 @@
 #ifndef __IOSYS_MAP_H__
 #define __IOSYS_MAP_H__
 
+#include <linux/compiler_types.h>
 #include <linux/io.h>
 #include <linux/string.h>
 
@@ -333,6 +334,23 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
 		memset(dst->vaddr + offset, value, len);
 }
 
+#ifdef CONFIG_64BIT
+#define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_)				\
+	u64: val_ = readq(vaddr_iomem_)
+#else
+#define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_)				\
+	u64: memcpy_fromio(&(val_), vaddr_iomem_, sizeof(u64))
+#endif
+
+#define __iosys_map_rd_io(val__, vaddr_iomem__, type__) _Generic(val__,		\
+	u8: val__ = readb(vaddr_iomem__),					\
+	u16: val__ = readw(vaddr_iomem__),					\
+	u32: val__ = readl(vaddr_iomem__),					\
+	__iosys_map_rd_io_u64_case(val__, vaddr_iomem__))
+
+#define __iosys_map_rd_sys(val__, vaddr__, type__)				\
+	val__ = READ_ONCE(*(type__ *)(vaddr__));
+
 /**
  * iosys_map_rd - Read a C-type value from the iosys_map
  *
@@ -340,16 +358,21 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
  * @offset__:	The offset from which to read
  * @type__:	Type of the value being read
  *
- * Read a C type value from iosys_map, handling possible un-aligned accesses to
- * the mapping.
+ * Read a C type value (u8, u16, u32 and u64) from iosys_map. For other types or
+ * if pointer may be unaligned (and problematic for the architecture supported),
+ * use iosys_map_memcpy_from().
  *
  * Returns:
  * The value read from the mapping.
  */
-#define iosys_map_rd(map__, offset__, type__) ({			\
-	type__ val;							\
-	iosys_map_memcpy_from(&val, map__, offset__, sizeof(val));	\
-	val;								\
+#define iosys_map_rd(map__, offset__, type__) ({				\
+	type__ val;								\
+	if ((map__)->is_iomem) {						\
+		__iosys_map_rd_io(val, (map__)->vaddr_iomem + (offset__), type__);\
+	} else {								\
+		__iosys_map_rd_sys(val, (map__)->vaddr + (offset__), type__);	\
+	}									\
+	val;									\
 })
 
 /**
@@ -379,9 +402,10 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
  *
  * Read a value from iosys_map considering its layout is described by a C struct
  * starting at @struct_offset__. The field offset and size is calculated and its
- * value read handling possible un-aligned memory accesses. For example: suppose
- * there is a @struct foo defined as below and the value ``foo.field2.inner2``
- * needs to be read from the iosys_map:
+ * value read. If the field access would incur in un-aligned access, then either
+ * iosys_map_memcpy_from() needs to be used or the architecture must support it.
+ * For example: suppose there is a @struct foo defined as below and the value
+ * ``foo.field2.inner2`` needs to be read from the iosys_map:
  *
  * .. code-block:: c
  *
-- 
2.36.1



* [CI 3/3] iosys-map: Add per-word write
  2022-06-28 18:47 ` [Intel-gfx] " Lucas De Marchi
@ 2022-06-28 18:47   ` Lucas De Marchi
  -1 siblings, 0 replies; 8+ messages in thread
From: Lucas De Marchi @ 2022-06-28 18:47 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: Daniel Vetter, Lucas De Marchi, christian.koenig, tzimmermann

As was done for read, provide the equivalent for write. Even though
current users are not in a hot path, this should future-proof it.
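
The overall shape of the write side, with one branch for I/O memory and one
for system memory, can be sketched in userspace C11. Everything named
demo_* below is hypothetical: struct demo_map stands in for struct
iosys_map, the demo_write*() helpers stand in for write{b,w,l,q}(), and a
volatile store stands in for WRITE_ONCE().

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct iosys_map. */
struct demo_map {
	bool is_iomem;
	unsigned char *vaddr;
};

/* Stand-ins for writeb()/writew()/writel()/writeq(). */
static inline void demo_write8(uint8_t v, void *p)   { memcpy(p, &v, sizeof(v)); }
static inline void demo_write16(uint16_t v, void *p) { memcpy(p, &v, sizeof(v)); }
static inline void demo_write32(uint32_t v, void *p) { memcpy(p, &v, sizeof(v)); }
static inline void demo_write64(uint64_t v, void *p) { memcpy(p, &v, sizeof(v)); }

/* "I/O" path: choose the accessor from the type of val__, no default. */
#define demo_wr_io(val__, addr__) _Generic((val__),	\
	uint8_t:  demo_write8,				\
	uint16_t: demo_write16,				\
	uint32_t: demo_write32,				\
	uint64_t: demo_write64)((val__), (addr__))

/* System-memory path: one typed store; note the () around addr__. */
#define demo_wr_sys(val__, addr__, type__)		\
	(*(volatile type__ *)(addr__) = (val__))

/* Convert val__ to type__ once, then branch on the kind of mapping. */
#define demo_map_wr(map__, offset__, type__, val__) do {			\
	type__ demo_val = (val__);						\
	if ((map__)->is_iomem)							\
		demo_wr_io(demo_val, (map__)->vaddr + (offset__));		\
	else									\
		demo_wr_sys(demo_val, (map__)->vaddr + (offset__), type__);	\
} while (0)
```

As with the helpers in the patch, the caller is expected to pass an offset
that is suitably aligned for the type being written.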

v2:
  - Remove default from _Generic() - callers wanting to write more
    than u64 should use iosys_map_memcpy_to()
  - Add WRITE_ONCE() cases dereferencing the pointer when using system
    memory
v3:
  - Fix precedence issue when casting inside WRITE_ONCE(). By not using ()
    around vaddr__ the offset was not part of the cast, but rather added
    to it, producing a wrong address
  - Remove compiletime_assert() as WRITE_ONCE() already contains it

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com> # v1
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
---
 include/linux/iosys-map.h | 38 +++++++++++++++++++++++++++++---------
 1 file changed, 29 insertions(+), 9 deletions(-)

diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h
index 48e550b290fa..08dad5b0ad17 100644
--- a/include/linux/iosys-map.h
+++ b/include/linux/iosys-map.h
@@ -337,9 +337,13 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
 #ifdef CONFIG_64BIT
 #define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_)				\
 	u64: val_ = readq(vaddr_iomem_)
+#define __iosys_map_wr_io_u64_case(val_, vaddr_iomem_)				\
+	u64: writeq(val_, vaddr_iomem_)
 #else
 #define __iosys_map_rd_io_u64_case(val_, vaddr_iomem_)				\
 	u64: memcpy_fromio(&(val_), vaddr_iomem_, sizeof(u64))
+#define __iosys_map_wr_io_u64_case(val_, vaddr_iomem_)				\
+	u64: memcpy_toio(vaddr_iomem_, &(val_), sizeof(u64))
 #endif
 
 #define __iosys_map_rd_io(val__, vaddr_iomem__, type__) _Generic(val__,		\
@@ -351,6 +355,15 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
 #define __iosys_map_rd_sys(val__, vaddr__, type__)				\
 	val__ = READ_ONCE(*(type__ *)(vaddr__));
 
+#define __iosys_map_wr_io(val__, vaddr_iomem__, type__) _Generic(val__,		\
+	u8: writeb(val__, vaddr_iomem__),					\
+	u16: writew(val__, vaddr_iomem__),					\
+	u32: writel(val__, vaddr_iomem__),					\
+	__iosys_map_wr_io_u64_case(val__, vaddr_iomem__))
+
+#define __iosys_map_wr_sys(val__, vaddr__, type__)				\
+	WRITE_ONCE(*(type__ *)(vaddr__), val__);
+
 /**
  * iosys_map_rd - Read a C-type value from the iosys_map
  *
@@ -383,12 +396,17 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
  * @type__:	Type of the value being written
  * @val__:	Value to write
  *
- * Write a C-type value to the iosys_map, handling possible un-aligned accesses
- * to the mapping.
+ * Write a C type value (u8, u16, u32 and u64) to the iosys_map. For other types
+ * or if pointer may be unaligned (and problematic for the architecture
+ * supported), use iosys_map_memcpy_to()
  */
-#define iosys_map_wr(map__, offset__, type__, val__) ({			\
-	type__ val = (val__);						\
-	iosys_map_memcpy_to(map__, offset__, &val, sizeof(val));	\
+#define iosys_map_wr(map__, offset__, type__, val__) ({				\
+	type__ val = (val__);							\
+	if ((map__)->is_iomem) {						\
+		__iosys_map_wr_io(val, (map__)->vaddr_iomem + (offset__), type__);\
+	} else {								\
+		__iosys_map_wr_sys(val, (map__)->vaddr + (offset__), type__);	\
+	}									\
 })
 
 /**
@@ -469,10 +487,12 @@ static inline void iosys_map_memset(struct iosys_map *dst, size_t offset,
  * @field__:		Member of the struct to read
  * @val__:		Value to write
  *
- * Write a value to the iosys_map considering its layout is described by a C struct
- * starting at @struct_offset__. The field offset and size is calculated and the
- * @val__ is written handling possible un-aligned memory accesses. Refer to
- * iosys_map_rd_field() for expected usage and memory layout.
+ * Write a value to the iosys_map considering its layout is described by a C
+ * struct starting at @struct_offset__. The field offset and size is calculated
+ * and the @val__ is written. If the field access would incur in un-aligned
+ * access, then either iosys_map_memcpy_to() needs to be used or the
+ * architecture must support it. Refer to iosys_map_rd_field() for expected
+ * usage and memory layout.
  */
 #define iosys_map_wr_field(map__, struct_offset__, struct_type__, field__, val__) ({	\
 	struct_type__ *s;								\
-- 
2.36.1



* Re: [CI 1/3] drm-tip: 2022y-06m-27d-16h-18m-47s UTC integration manifest
  2022-06-28 18:47 ` [Intel-gfx] " Lucas De Marchi
@ 2022-06-28 18:59   ` Lucas De Marchi
  -1 siblings, 0 replies; 8+ messages in thread
From: Lucas De Marchi @ 2022-06-28 18:59 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Daniel Vetter, christian.koenig, tzimmermann

On Tue, Jun 28, 2022 at 11:47:45AM -0700, Lucas De Marchi wrote:
>From: Ville Syrjälä <ville.syrjala@linux.intel.com>

Sorry for the noise.

This should NOT be patch 1, of course. It ended up here because my local
and remote branches were out of sync (and drm-tip/drm-tip.. then includes
it).

This is intended for CI, but it will fail to apply. I will re-submit
this.

Lucas De Marchi


Thread overview: 8+ messages
2022-06-28 18:47 [CI 1/3] drm-tip: 2022y-06m-27d-16h-18m-47s UTC integration manifest Lucas De Marchi
2022-06-28 18:47 ` [Intel-gfx] " Lucas De Marchi
2022-06-28 18:47 ` [CI 2/3] iosys-map: Add per-word read Lucas De Marchi
2022-06-28 18:47   ` [Intel-gfx] " Lucas De Marchi
2022-06-28 18:47 ` [CI 3/3] iosys-map: Add per-word write Lucas De Marchi
2022-06-28 18:47   ` [Intel-gfx] " Lucas De Marchi
2022-06-28 18:59 ` [CI 1/3] drm-tip: 2022y-06m-27d-16h-18m-47s UTC integration manifest Lucas De Marchi
2022-06-28 18:59   ` [Intel-gfx] " Lucas De Marchi
