* [igt-dev] [PATCH i-g-t] lib/xe_query: Prevent cache corruption on multiple access
@ 2023-03-24 12:06 Zbigniew Kempczyński
2023-03-24 13:08 ` [igt-dev] ✗ Fi.CI.BAT: failure for " Patchwork
2023-03-24 20:59 ` [igt-dev] [PATCH i-g-t] " Niranjana Vishwanathapura
0 siblings, 2 replies; 3+ messages in thread
From: Zbigniew Kempczyński @ 2023-03-24 12:06 UTC (permalink / raw)
To: igt-dev
If multiple threads will enter getting the xe_device and cache is
empty we may encounter situation all of them will try to add config
data to the cache. As cache is resizable map and wasn't protected
by the mutex at the moment of insertion it's possible to corrupt the
map.
Lets add protecting mutex at the moment of adding to the cache. Due
to assertions around querying the kernel with ioctls mutex doesn't
protect gathering the config data so some threads may do this in
parallel. It's not a big deal though as all threads will see same
configuration so we add first (winning) thread xe_device data.
Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
lib/xe/xe_query.c | 33 +++++++++++++++++++++------------
1 file changed, 21 insertions(+), 12 deletions(-)
diff --git a/lib/xe/xe_query.c b/lib/xe/xe_query.c
index 1835232804..82a7e36e30 100644
--- a/lib/xe/xe_query.c
+++ b/lib/xe/xe_query.c
@@ -223,6 +223,16 @@ static struct xe_device *find_in_cache(int fd)
return xe_dev;
}
+static void xe_device_free(struct xe_device *xe_dev)
+{
+ free(xe_dev->config);
+ free(xe_dev->gts);
+ free(xe_dev->hw_engines);
+ free(xe_dev->mem_usage);
+ free(xe_dev->vram_size);
+ free(xe_dev);
+}
+
/**
* xe_device_get:
* @fd: xe device fd
@@ -234,7 +244,7 @@ static struct xe_device *find_in_cache(int fd)
*/
struct xe_device *xe_device_get(int fd)
{
- struct xe_device *xe_dev;
+ struct xe_device *xe_dev, *prev;
xe_dev = find_in_cache(fd);
if (xe_dev)
@@ -260,21 +270,20 @@ struct xe_device *xe_device_get(int fd)
xe_dev->has_vram = __mem_has_vram(xe_dev->mem_usage);
xe_dev->supports_faults = xe_check_supports_faults(fd);
- igt_map_insert(cache.map, &xe_dev->fd, xe_dev);
+ /* We may get here from multiple threads, use first cached xe_dev */
+ pthread_mutex_lock(&cache.cache_mutex);
+ prev = find_in_cache_unlocked(fd);
+ if (!prev) {
+ igt_map_insert(cache.map, &xe_dev->fd, xe_dev);
+ } else {
+ xe_device_free(xe_dev);
+ xe_dev = prev;
+ }
+ pthread_mutex_unlock(&cache.cache_mutex);
return xe_dev;
}
-static void xe_device_free(struct xe_device *xe_dev)
-{
- free(xe_dev->config);
- free(xe_dev->gts);
- free(xe_dev->hw_engines);
- free(xe_dev->mem_usage);
- free(xe_dev->vram_size);
- free(xe_dev);
-}
-
static void delete_in_cache(struct igt_map_entry *entry)
{
xe_device_free((struct xe_device *)entry->data);
--
2.34.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
* [igt-dev] ✗ Fi.CI.BAT: failure for lib/xe_query: Prevent cache corruption on multiple access
2023-03-24 12:06 [igt-dev] [PATCH i-g-t] lib/xe_query: Prevent cache corruption on multiple access Zbigniew Kempczyński
@ 2023-03-24 13:08 ` Patchwork
2023-03-24 20:59 ` [igt-dev] [PATCH i-g-t] " Niranjana Vishwanathapura
1 sibling, 0 replies; 3+ messages in thread
From: Patchwork @ 2023-03-24 13:08 UTC (permalink / raw)
To: Zbigniew Kempczyński; +Cc: igt-dev
[-- Attachment #1: Type: text/plain, Size: 5281 bytes --]
== Series Details ==
Series: lib/xe_query: Prevent cache corruption on multiple access
URL : https://patchwork.freedesktop.org/series/115600/
State : failure
== Summary ==
CI Bug Log - changes from CI_DRM_12909 -> IGTPW_8679
====================================================
Summary
-------
**FAILURE**
Serious unknown changes coming with IGTPW_8679 absolutely need to be
verified manually.
If you think the reported changes have nothing to do with the changes
introduced in IGTPW_8679, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.
External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8679/index.html
Participating hosts (37 -> 37)
------------------------------
Additional (1): fi-kbl-soraka
Missing (1): fi-snb-2520m
Possible new issues
-------------------
Here are the unknown changes that may have been introduced in IGTPW_8679:
### IGT changes ###
#### Possible regressions ####
* igt@gem_huc_copy@huc-copy:
- fi-kbl-soraka: NOTRUN -> [INCOMPLETE][1]
[1]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8679/fi-kbl-soraka/igt@gem_huc_copy@huc-copy.html
Known issues
------------
Here are the changes found in IGTPW_8679 that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@dmabuf@all-tests@dma_fence:
- fi-pnv-d510: [PASS][2] -> [FAIL][3] ([i915#8064])
[2]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12909/fi-pnv-d510/igt@dmabuf@all-tests@dma_fence.html
[3]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8679/fi-pnv-d510/igt@dmabuf@all-tests@dma_fence.html
* igt@dmabuf@all-tests@sanitycheck:
- fi-pnv-d510: [PASS][4] -> [ABORT][5] ([i915#8143])
[4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12909/fi-pnv-d510/igt@dmabuf@all-tests@sanitycheck.html
[5]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8679/fi-pnv-d510/igt@dmabuf@all-tests@sanitycheck.html
* igt@gem_lmem_swapping@basic:
- fi-kbl-soraka: NOTRUN -> [SKIP][6] ([fdo#109271] / [i915#4613]) +3 similar issues
[6]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8679/fi-kbl-soraka/igt@gem_lmem_swapping@basic.html
* igt@i915_selftest@live@gt_pm:
- fi-kbl-soraka: NOTRUN -> [DMESG-FAIL][7] ([i915#1886])
[7]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8679/fi-kbl-soraka/igt@i915_selftest@live@gt_pm.html
* igt@kms_chamelium_frames@hdmi-crc-fast:
- fi-kbl-soraka: NOTRUN -> [SKIP][8] ([fdo#109271]) +16 similar issues
[8]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8679/fi-kbl-soraka/igt@kms_chamelium_frames@hdmi-crc-fast.html
* igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence:
- bat-dg2-11: NOTRUN -> [SKIP][9] ([i915#5354])
[9]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8679/bat-dg2-11/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence.html
#### Possible fixes ####
* igt@i915_selftest@live@gt_heartbeat:
- fi-cfl-guc: [DMESG-FAIL][10] ([i915#5334]) -> [PASS][11]
[10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12909/fi-cfl-guc/igt@i915_selftest@live@gt_heartbeat.html
[11]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8679/fi-cfl-guc/igt@i915_selftest@live@gt_heartbeat.html
#### Warnings ####
* igt@i915_selftest@live@slpc:
- bat-rpls-2: [DMESG-FAIL][12] ([i915#6367] / [i915#7913] / [i915#7996]) -> [DMESG-FAIL][13] ([i915#6367] / [i915#6997] / [i915#7913])
[12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12909/bat-rpls-2/igt@i915_selftest@live@slpc.html
[13]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8679/bat-rpls-2/igt@i915_selftest@live@slpc.html
- bat-rpls-1: [DMESG-FAIL][14] ([i915#6367]) -> [DMESG-FAIL][15] ([i915#6367] / [i915#7996])
[14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12909/bat-rpls-1/igt@i915_selftest@live@slpc.html
[15]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8679/bat-rpls-1/igt@i915_selftest@live@slpc.html
[fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
[i915#1886]: https://gitlab.freedesktop.org/drm/intel/issues/1886
[i915#4613]: https://gitlab.freedesktop.org/drm/intel/issues/4613
[i915#5334]: https://gitlab.freedesktop.org/drm/intel/issues/5334
[i915#5354]: https://gitlab.freedesktop.org/drm/intel/issues/5354
[i915#6367]: https://gitlab.freedesktop.org/drm/intel/issues/6367
[i915#6997]: https://gitlab.freedesktop.org/drm/intel/issues/6997
[i915#7913]: https://gitlab.freedesktop.org/drm/intel/issues/7913
[i915#7996]: https://gitlab.freedesktop.org/drm/intel/issues/7996
[i915#8064]: https://gitlab.freedesktop.org/drm/intel/issues/8064
[i915#8143]: https://gitlab.freedesktop.org/drm/intel/issues/8143
Build changes
-------------
* CI: CI-20190529 -> None
* IGT: IGT_7216 -> IGTPW_8679
CI-20190529: 20190529
CI_DRM_12909: 570255a5f5def30955477836cf87b00e587638d0 @ git://anongit.freedesktop.org/gfx-ci/linux
IGTPW_8679: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8679/index.html
IGT_7216: 0682c2b07c7eab2bf384899c3da3cc7f08083847 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8679/index.html
[-- Attachment #2: Type: text/html, Size: 6517 bytes --]
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [igt-dev] [PATCH i-g-t] lib/xe_query: Prevent cache corruption on multiple access
2023-03-24 12:06 [igt-dev] [PATCH i-g-t] lib/xe_query: Prevent cache corruption on multiple access Zbigniew Kempczyński
2023-03-24 13:08 ` [igt-dev] ✗ Fi.CI.BAT: failure for " Patchwork
@ 2023-03-24 20:59 ` Niranjana Vishwanathapura
1 sibling, 0 replies; 3+ messages in thread
From: Niranjana Vishwanathapura @ 2023-03-24 20:59 UTC (permalink / raw)
To: Zbigniew Kempczyński; +Cc: igt-dev
On Fri, Mar 24, 2023 at 01:06:44PM +0100, Zbigniew Kempczyński wrote:
>If multiple threads will enter getting the xe_device and cache is
>empty we may encounter situation all of them will try to add config
>data to the cache. As cache is resizable map and wasn't protected
>by the mutex at the moment of insertion it's possible to corrupt the
>map.
>
>Lets add protecting mutex at the moment of adding to the cache. Due
>to assertions around querying the kernel with ioctls mutex doesn't
>protect gathering the config data so some threads may do this in
>parallel. It's not a big deal though as all threads will see same
>configuration so we add first (winning) thread xe_device data.
>
>Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
LGTM.
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>---
> lib/xe/xe_query.c | 33 +++++++++++++++++++++------------
> 1 file changed, 21 insertions(+), 12 deletions(-)
>
>diff --git a/lib/xe/xe_query.c b/lib/xe/xe_query.c
>index 1835232804..82a7e36e30 100644
>--- a/lib/xe/xe_query.c
>+++ b/lib/xe/xe_query.c
>@@ -223,6 +223,16 @@ static struct xe_device *find_in_cache(int fd)
> return xe_dev;
> }
>
>+static void xe_device_free(struct xe_device *xe_dev)
>+{
>+ free(xe_dev->config);
>+ free(xe_dev->gts);
>+ free(xe_dev->hw_engines);
>+ free(xe_dev->mem_usage);
>+ free(xe_dev->vram_size);
>+ free(xe_dev);
>+}
>+
> /**
> * xe_device_get:
> * @fd: xe device fd
>@@ -234,7 +244,7 @@ static struct xe_device *find_in_cache(int fd)
> */
> struct xe_device *xe_device_get(int fd)
> {
>- struct xe_device *xe_dev;
>+ struct xe_device *xe_dev, *prev;
>
> xe_dev = find_in_cache(fd);
> if (xe_dev)
>@@ -260,21 +270,20 @@ struct xe_device *xe_device_get(int fd)
> xe_dev->has_vram = __mem_has_vram(xe_dev->mem_usage);
> xe_dev->supports_faults = xe_check_supports_faults(fd);
>
>- igt_map_insert(cache.map, &xe_dev->fd, xe_dev);
>+ /* We may get here from multiple threads, use first cached xe_dev */
>+ pthread_mutex_lock(&cache.cache_mutex);
>+ prev = find_in_cache_unlocked(fd);
>+ if (!prev) {
>+ igt_map_insert(cache.map, &xe_dev->fd, xe_dev);
>+ } else {
>+ xe_device_free(xe_dev);
>+ xe_dev = prev;
>+ }
>+ pthread_mutex_unlock(&cache.cache_mutex);
>
> return xe_dev;
> }
>
>-static void xe_device_free(struct xe_device *xe_dev)
>-{
>- free(xe_dev->config);
>- free(xe_dev->gts);
>- free(xe_dev->hw_engines);
>- free(xe_dev->mem_usage);
>- free(xe_dev->vram_size);
>- free(xe_dev);
>-}
>-
> static void delete_in_cache(struct igt_map_entry *entry)
> {
> xe_device_free((struct xe_device *)entry->data);
>--
>2.34.1
>
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2023-03-24 21:03 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-24 12:06 [igt-dev] [PATCH i-g-t] lib/xe_query: Prevent cache corruption on multiple access Zbigniew Kempczyński
2023-03-24 13:08 ` [igt-dev] ✗ Fi.CI.BAT: failure for " Patchwork
2023-03-24 20:59 ` [igt-dev] [PATCH i-g-t] " Niranjana Vishwanathapura
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.