* [PATCH 0/2] shmem, drm/i915: Mark pinned shmemfs pages as unevictable
@ 2018-10-16 17:42 ` Kuo-Hsin Yang
0 siblings, 0 replies; 19+ messages in thread
From: Kuo-Hsin Yang @ 2018-10-16 17:42 UTC (permalink / raw)
To: linux-kernel, intel-gfx, linux-mm
Cc: mhocko, akpm, chris, peterz, dave.hansen, corbet, hughd,
joonas.lahtinen, marcheu, hoegsberg, Kuo-Hsin Yang
When a graphics heavy application is running, i915 driver may pin a lot
of shmemfs pages and vmscan slows down significantly by scanning these
pinned pages. This patch is an alternative to the patch by Chris Wilson
[1]. As i915 driver pins all pages in an address space, marking an
address space as unevictable is sufficient to solve this issue.
[1]: https://patchwork.kernel.org/patch/9768741/
Kuo-Hsin Yang (2):
shmem: export shmem_unlock_mapping
drm/i915: Mark pinned shmemfs pages as unevictable
Documentation/vm/unevictable-lru.rst | 4 +++-
drivers/gpu/drm/i915/i915_gem.c | 8 ++++++++
mm/shmem.c | 2 ++
3 files changed, 13 insertions(+), 1 deletion(-)
--
2.19.1.331.ge82ca0e54c-goog
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 0/2] shmem, drm/i915: Mark pinned shmemfs pages as unevictable
@ 2018-10-16 17:42 ` Kuo-Hsin Yang
0 siblings, 0 replies; 19+ messages in thread
From: Kuo-Hsin Yang @ 2018-10-16 17:42 UTC (permalink / raw)
To: linux-kernel, intel-gfx, linux-mm
Cc: mhocko, corbet, peterz, hughd, dave.hansen, akpm, hoegsberg
When a graphics heavy application is running, i915 driver may pin a lot
of shmemfs pages and vmscan slows down significantly by scanning these
pinned pages. This patch is an alternative to the patch by Chris Wilson
[1]. As i915 driver pins all pages in an address space, marking an
address space as unevictable is sufficient to solve this issue.
[1]: https://patchwork.kernel.org/patch/9768741/
Kuo-Hsin Yang (2):
shmem: export shmem_unlock_mapping
drm/i915: Mark pinned shmemfs pages as unevictable
Documentation/vm/unevictable-lru.rst | 4 +++-
drivers/gpu/drm/i915/i915_gem.c | 8 ++++++++
mm/shmem.c | 2 ++
3 files changed, 13 insertions(+), 1 deletion(-)
--
2.19.1.331.ge82ca0e54c-goog
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 1/2] shmem: export shmem_unlock_mapping
2018-10-16 17:42 ` Kuo-Hsin Yang
@ 2018-10-16 17:42 ` Kuo-Hsin Yang
-1 siblings, 0 replies; 19+ messages in thread
From: Kuo-Hsin Yang @ 2018-10-16 17:42 UTC (permalink / raw)
To: linux-kernel, intel-gfx, linux-mm
Cc: mhocko, akpm, chris, peterz, dave.hansen, corbet, hughd,
joonas.lahtinen, marcheu, hoegsberg, Kuo-Hsin Yang
By exporting this function, drivers can mark/unmark a shmemfs address
space as unevictable in the following way: 1. mark an address space as
unevictable with mapping_set_unevictable(), pages in the address space
will be moved to unevictable list in vmscan. 2. mark an address space
evictable with mapping_clear_unevictable(), and move these pages back to
evictable list with shmem_unlock_mapping().
Signed-off-by: Kuo-Hsin Yang <vovoy@chromium.org>
---
Documentation/vm/unevictable-lru.rst | 4 +++-
mm/shmem.c | 2 ++
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/Documentation/vm/unevictable-lru.rst b/Documentation/vm/unevictable-lru.rst
index fdd84cb8d511..a812fb55136d 100644
--- a/Documentation/vm/unevictable-lru.rst
+++ b/Documentation/vm/unevictable-lru.rst
@@ -143,7 +143,7 @@ using a number of wrapper functions:
Query the address space, and return true if it is completely
unevictable.
-These are currently used in two places in the kernel:
+These are currently used in three places in the kernel:
(1) By ramfs to mark the address spaces of its inodes when they are created,
and this mark remains for the life of the inode.
@@ -154,6 +154,8 @@ These are currently used in two places in the kernel:
swapped out; the application must touch the pages manually if it wants to
ensure they're in memory.
+ (3) By the i915 driver to mark pinned address space until it's unpinned.
+
Detecting Unevictable Pages
---------------------------
diff --git a/mm/shmem.c b/mm/shmem.c
index 446942677cd4..d1ce34c09df6 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -786,6 +786,7 @@ void shmem_unlock_mapping(struct address_space *mapping)
cond_resched();
}
}
+EXPORT_SYMBOL_GPL(shmem_unlock_mapping);
/*
* Remove range of pages and swap entries from radix tree, and free them.
@@ -3874,6 +3875,7 @@ int shmem_lock(struct file *file, int lock, struct user_struct *user)
void shmem_unlock_mapping(struct address_space *mapping)
{
}
+EXPORT_SYMBOL_GPL(shmem_unlock_mapping);
#ifdef CONFIG_MMU
unsigned long shmem_get_unmapped_area(struct file *file,
--
2.19.1.331.ge82ca0e54c-goog
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 1/2] shmem: export shmem_unlock_mapping
@ 2018-10-16 17:42 ` Kuo-Hsin Yang
0 siblings, 0 replies; 19+ messages in thread
From: Kuo-Hsin Yang @ 2018-10-16 17:42 UTC (permalink / raw)
To: linux-kernel, intel-gfx, linux-mm
Cc: mhocko, corbet, peterz, hughd, dave.hansen, akpm, hoegsberg
By exporting this function, drivers can mark/unmark a shmemfs address
space as unevictable in the following way: 1. mark an address space as
unevictable with mapping_set_unevictable(), pages in the address space
will be moved to unevictable list in vmscan. 2. mark an address space
evictable with mapping_clear_unevictable(), and move these pages back to
evictable list with shmem_unlock_mapping().
Signed-off-by: Kuo-Hsin Yang <vovoy@chromium.org>
---
Documentation/vm/unevictable-lru.rst | 4 +++-
mm/shmem.c | 2 ++
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/Documentation/vm/unevictable-lru.rst b/Documentation/vm/unevictable-lru.rst
index fdd84cb8d511..a812fb55136d 100644
--- a/Documentation/vm/unevictable-lru.rst
+++ b/Documentation/vm/unevictable-lru.rst
@@ -143,7 +143,7 @@ using a number of wrapper functions:
Query the address space, and return true if it is completely
unevictable.
-These are currently used in two places in the kernel:
+These are currently used in three places in the kernel:
(1) By ramfs to mark the address spaces of its inodes when they are created,
and this mark remains for the life of the inode.
@@ -154,6 +154,8 @@ These are currently used in two places in the kernel:
swapped out; the application must touch the pages manually if it wants to
ensure they're in memory.
+ (3) By the i915 driver to mark pinned address space until it's unpinned.
+
Detecting Unevictable Pages
---------------------------
diff --git a/mm/shmem.c b/mm/shmem.c
index 446942677cd4..d1ce34c09df6 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -786,6 +786,7 @@ void shmem_unlock_mapping(struct address_space *mapping)
cond_resched();
}
}
+EXPORT_SYMBOL_GPL(shmem_unlock_mapping);
/*
* Remove range of pages and swap entries from radix tree, and free them.
@@ -3874,6 +3875,7 @@ int shmem_lock(struct file *file, int lock, struct user_struct *user)
void shmem_unlock_mapping(struct address_space *mapping)
{
}
+EXPORT_SYMBOL_GPL(shmem_unlock_mapping);
#ifdef CONFIG_MMU
unsigned long shmem_get_unmapped_area(struct file *file,
--
2.19.1.331.ge82ca0e54c-goog
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 2/2] drm/i915: Mark pinned shmemfs pages as unevictable
2018-10-16 17:42 ` Kuo-Hsin Yang
@ 2018-10-16 17:43 ` Kuo-Hsin Yang
-1 siblings, 0 replies; 19+ messages in thread
From: Kuo-Hsin Yang @ 2018-10-16 17:43 UTC (permalink / raw)
To: linux-kernel, intel-gfx, linux-mm
Cc: mhocko, akpm, chris, peterz, dave.hansen, corbet, hughd,
joonas.lahtinen, marcheu, hoegsberg, Kuo-Hsin Yang
The i915 driver use shmemfs to allocate backing storage for gem objects.
These shmemfs pages can be pinned (increased ref count) by
shmem_read_mapping_page_gfp(). When a lot of pages are pinned, vmscan
wastes a lot of time scanning these pinned pages. Mark these pinned
pages as unevictable to speed up vmscan.
Signed-off-by: Kuo-Hsin Yang <vovoy@chromium.org>
---
drivers/gpu/drm/i915/i915_gem.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fcc73a6ab503..e0ff5b736128 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2390,6 +2390,7 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj,
{
struct sgt_iter sgt_iter;
struct page *page;
+ struct address_space *mapping;
__i915_gem_object_release_shmem(obj, pages, true);
@@ -2409,6 +2410,10 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj,
}
obj->mm.dirty = false;
+ mapping = file_inode(obj->base.filp)->i_mapping;
+ mapping_clear_unevictable(mapping);
+ shmem_unlock_mapping(mapping);
+
sg_free_table(pages);
kfree(pages);
}
@@ -2551,6 +2556,7 @@ static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
* Fail silently without starting the shrinker
*/
mapping = obj->base.filp->f_mapping;
+ mapping_set_unevictable(mapping);
noreclaim = mapping_gfp_constraint(mapping, ~__GFP_RECLAIM);
noreclaim |= __GFP_NORETRY | __GFP_NOWARN;
@@ -2664,6 +2670,8 @@ static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
err_pages:
for_each_sgt_page(page, sgt_iter, st)
put_page(page);
+ mapping_clear_unevictable(mapping);
+ shmem_unlock_mapping(mapping);
sg_free_table(st);
kfree(st);
--
2.19.1.331.ge82ca0e54c-goog
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 2/2] drm/i915: Mark pinned shmemfs pages as unevictable
@ 2018-10-16 17:43 ` Kuo-Hsin Yang
0 siblings, 0 replies; 19+ messages in thread
From: Kuo-Hsin Yang @ 2018-10-16 17:43 UTC (permalink / raw)
To: linux-kernel, intel-gfx, linux-mm
Cc: mhocko, corbet, peterz, hughd, dave.hansen, akpm, hoegsberg
The i915 driver use shmemfs to allocate backing storage for gem objects.
These shmemfs pages can be pinned (increased ref count) by
shmem_read_mapping_page_gfp(). When a lot of pages are pinned, vmscan
wastes a lot of time scanning these pinned pages. Mark these pinned
pages as unevictable to speed up vmscan.
Signed-off-by: Kuo-Hsin Yang <vovoy@chromium.org>
---
drivers/gpu/drm/i915/i915_gem.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fcc73a6ab503..e0ff5b736128 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2390,6 +2390,7 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj,
{
struct sgt_iter sgt_iter;
struct page *page;
+ struct address_space *mapping;
__i915_gem_object_release_shmem(obj, pages, true);
@@ -2409,6 +2410,10 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj,
}
obj->mm.dirty = false;
+ mapping = file_inode(obj->base.filp)->i_mapping;
+ mapping_clear_unevictable(mapping);
+ shmem_unlock_mapping(mapping);
+
sg_free_table(pages);
kfree(pages);
}
@@ -2551,6 +2556,7 @@ static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
* Fail silently without starting the shrinker
*/
mapping = obj->base.filp->f_mapping;
+ mapping_set_unevictable(mapping);
noreclaim = mapping_gfp_constraint(mapping, ~__GFP_RECLAIM);
noreclaim |= __GFP_NORETRY | __GFP_NOWARN;
@@ -2664,6 +2670,8 @@ static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
err_pages:
for_each_sgt_page(page, sgt_iter, st)
put_page(page);
+ mapping_clear_unevictable(mapping);
+ shmem_unlock_mapping(mapping);
sg_free_table(st);
kfree(st);
--
2.19.1.331.ge82ca0e54c-goog
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] drm/i915: Mark pinned shmemfs pages as unevictable
2018-10-16 17:43 ` Kuo-Hsin Yang
@ 2018-10-16 18:21 ` Michal Hocko
-1 siblings, 0 replies; 19+ messages in thread
From: Michal Hocko @ 2018-10-16 18:21 UTC (permalink / raw)
To: Kuo-Hsin Yang
Cc: linux-kernel, intel-gfx, linux-mm, akpm, chris, peterz,
dave.hansen, corbet, hughd, joonas.lahtinen, marcheu, hoegsberg
On Wed 17-10-18 01:43:00, Kuo-Hsin Yang wrote:
> The i915 driver use shmemfs to allocate backing storage for gem objects.
> These shmemfs pages can be pinned (increased ref count) by
> shmem_read_mapping_page_gfp(). When a lot of pages are pinned, vmscan
> wastes a lot of time scanning these pinned pages. Mark these pinned
> pages as unevictable to speed up vmscan.
I would squash the two patches into the single one. One more thing
though. One more thing to be careful about here. Unless I miss something
such a page is not migrateable so it shouldn't be allocated from a
movable zone. Does mapping_gfp_constraint contains __GFP_MOVABLE? If
yes, we want to drop it as well. Other than that the patch makes sense
with my very limited knowlege of the i915 code of course.
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] drm/i915: Mark pinned shmemfs pages as unevictable
@ 2018-10-16 18:21 ` Michal Hocko
0 siblings, 0 replies; 19+ messages in thread
From: Michal Hocko @ 2018-10-16 18:21 UTC (permalink / raw)
To: Kuo-Hsin Yang
Cc: dave.hansen, peterz, intel-gfx, corbet, hughd, linux-kernel,
linux-mm, akpm, hoegsberg
On Wed 17-10-18 01:43:00, Kuo-Hsin Yang wrote:
> The i915 driver use shmemfs to allocate backing storage for gem objects.
> These shmemfs pages can be pinned (increased ref count) by
> shmem_read_mapping_page_gfp(). When a lot of pages are pinned, vmscan
> wastes a lot of time scanning these pinned pages. Mark these pinned
> pages as unevictable to speed up vmscan.
I would squash the two patches into the single one. One more thing
though. One more thing to be careful about here. Unless I miss something
such a page is not migrateable so it shouldn't be allocated from a
movable zone. Does mapping_gfp_constraint contains __GFP_MOVABLE? If
yes, we want to drop it as well. Other than that the patch makes sense
with my very limited knowlege of the i915 code of course.
--
Michal Hocko
SUSE Labs
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 19+ messages in thread
* ✓ Fi.CI.BAT: success for shmem, drm/i915: Mark pinned shmemfs pages as unevictable
2018-10-16 17:42 ` Kuo-Hsin Yang
` (2 preceding siblings ...)
(?)
@ 2018-10-16 18:27 ` Patchwork
-1 siblings, 0 replies; 19+ messages in thread
From: Patchwork @ 2018-10-16 18:27 UTC (permalink / raw)
To: Kuo-Hsin Yang; +Cc: intel-gfx
== Series Details ==
Series: shmem, drm/i915: Mark pinned shmemfs pages as unevictable
URL : https://patchwork.freedesktop.org/series/51078/
State : success
== Summary ==
= CI Bug Log - changes from CI_DRM_4990 -> Patchwork_10478 =
== Summary - SUCCESS ==
No regressions found.
External URL: https://patchwork.freedesktop.org/api/1.0/series/51078/revisions/1/mbox/
== Known issues ==
Here are the changes found in Patchwork_10478 that come from known issues:
=== IGT changes ===
==== Issues hit ====
igt@gem_exec_suspend@basic-s4-devices:
fi-glk-j4005: PASS -> DMESG-WARN (fdo#105719)
igt@kms_flip@basic-flip-vs-dpms:
fi-skl-6700hq: PASS -> DMESG-WARN (fdo#105998)
igt@kms_pipe_crc_basic@read-crc-pipe-a-frame-sequence:
fi-byt-clapper: PASS -> FAIL (fdo#107362, fdo#103191)
igt@kms_pipe_crc_basic@read-crc-pipe-c-frame-sequence:
fi-glk-j4005: PASS -> DMESG-WARN (fdo#106097)
igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b:
fi-cfl-8109u: PASS -> INCOMPLETE (fdo#108126, fdo#106070)
fi-icl-u: NOTRUN -> INCOMPLETE (fdo#107713)
igt@pm_rpm@module-reload:
fi-glk-j4005: PASS -> DMESG-WARN (fdo#107726)
==== Possible fixes ====
igt@drv_selftest@live_hangcheck:
fi-icl-u2: INCOMPLETE (fdo#108315) -> PASS
igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy:
fi-glk-j4005: FAIL (fdo#106765) -> PASS
igt@kms_flip@basic-flip-vs-dpms:
fi-hsw-4770r: DMESG-WARN (fdo#105602) -> PASS
igt@kms_flip@basic-plain-flip:
fi-glk-j4005: DMESG-WARN (fdo#106097) -> PASS
igt@kms_frontbuffer_tracking@basic:
fi-icl-u2: FAIL (fdo#103167) -> PASS
fdo#103167 https://bugs.freedesktop.org/show_bug.cgi?id=103167
fdo#103191 https://bugs.freedesktop.org/show_bug.cgi?id=103191
fdo#105602 https://bugs.freedesktop.org/show_bug.cgi?id=105602
fdo#105719 https://bugs.freedesktop.org/show_bug.cgi?id=105719
fdo#105998 https://bugs.freedesktop.org/show_bug.cgi?id=105998
fdo#106070 https://bugs.freedesktop.org/show_bug.cgi?id=106070
fdo#106097 https://bugs.freedesktop.org/show_bug.cgi?id=106097
fdo#106765 https://bugs.freedesktop.org/show_bug.cgi?id=106765
fdo#107362 https://bugs.freedesktop.org/show_bug.cgi?id=107362
fdo#107713 https://bugs.freedesktop.org/show_bug.cgi?id=107713
fdo#107726 https://bugs.freedesktop.org/show_bug.cgi?id=107726
fdo#108126 https://bugs.freedesktop.org/show_bug.cgi?id=108126
fdo#108315 https://bugs.freedesktop.org/show_bug.cgi?id=108315
== Participating hosts (46 -> 43) ==
Additional (1): fi-icl-u
Missing (4): fi-ilk-m540 fi-byt-squawks fi-bsw-cyan fi-hsw-4200u
== Build changes ==
* Linux: CI_DRM_4990 -> Patchwork_10478
CI_DRM_4990: 0bd34d92e9a27784cb988a300619f497ca0e99ec @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_4681: 959d6f95cb1344e0c0dace5b236e17755826fac1 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_10478: c2dbef879ec3ecd799fbbda3343f699fec620ec3 @ git://anongit.freedesktop.org/gfx-ci/linux
== Linux commits ==
c2dbef879ec3 drm/i915: Mark pinned shmemfs pages as unevictable
559de4911136 shmem: export shmem_unlock_mapping
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_10478/issues.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] drm/i915: Mark pinned shmemfs pages as unevictable
2018-10-16 18:21 ` Michal Hocko
@ 2018-10-16 18:31 ` Chris Wilson
-1 siblings, 0 replies; 19+ messages in thread
From: Chris Wilson @ 2018-10-16 18:31 UTC (permalink / raw)
To: Kuo-Hsin Yang, Michal Hocko
Cc: linux-kernel, intel-gfx, linux-mm, akpm, peterz, dave.hansen,
corbet, hughd, joonas.lahtinen, marcheu, hoegsberg
Quoting Michal Hocko (2018-10-16 19:21:55)
> On Wed 17-10-18 01:43:00, Kuo-Hsin Yang wrote:
> > The i915 driver use shmemfs to allocate backing storage for gem objects.
> > These shmemfs pages can be pinned (increased ref count) by
> > shmem_read_mapping_page_gfp(). When a lot of pages are pinned, vmscan
> > wastes a lot of time scanning these pinned pages. Mark these pinned
> > pages as unevictable to speed up vmscan.
>
> I would squash the two patches into the single one. One more thing
> though. One more thing to be careful about here. Unless I miss something
> such a page is not migrateable so it shouldn't be allocated from a
> movable zone. Does mapping_gfp_constraint contains __GFP_MOVABLE? If
> yes, we want to drop it as well. Other than that the patch makes sense
> with my very limited knowlege of the i915 code of course.
They are not migrateable today. But we have proposed hooking up
.migratepage and setting __GFP_MOVABLE which would then include unlocking
the mapping at migrate time.
Fwiw, the shmem_unlock_mapping() call feels quite expensive, almost
nullifying the advantage gained from not walking the lists in reclaim.
I'll have better numbers in a couple of days.
-Chris
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] drm/i915: Mark pinned shmemfs pages as unevictable
@ 2018-10-16 18:31 ` Chris Wilson
0 siblings, 0 replies; 19+ messages in thread
From: Chris Wilson @ 2018-10-16 18:31 UTC (permalink / raw)
To: Kuo-Hsin Yang, Michal Hocko
Cc: dave.hansen, peterz, intel-gfx, corbet, hughd, linux-kernel,
linux-mm, akpm, hoegsberg
Quoting Michal Hocko (2018-10-16 19:21:55)
> On Wed 17-10-18 01:43:00, Kuo-Hsin Yang wrote:
> > The i915 driver use shmemfs to allocate backing storage for gem objects.
> > These shmemfs pages can be pinned (increased ref count) by
> > shmem_read_mapping_page_gfp(). When a lot of pages are pinned, vmscan
> > wastes a lot of time scanning these pinned pages. Mark these pinned
> > pages as unevictable to speed up vmscan.
>
> I would squash the two patches into the single one. One more thing
> though. One more thing to be careful about here. Unless I miss something
> such a page is not migrateable so it shouldn't be allocated from a
> movable zone. Does mapping_gfp_constraint contains __GFP_MOVABLE? If
> yes, we want to drop it as well. Other than that the patch makes sense
> with my very limited knowlege of the i915 code of course.
They are not migrateable today. But we have proposed hooking up
.migratepage and setting __GFP_MOVABLE which would then include unlocking
the mapping at migrate time.
Fwiw, the shmem_unlock_mapping() call feels quite expensive, almost
nullifying the advantage gained from not walking the lists in reclaim.
I'll have better numbers in a couple of days.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] drm/i915: Mark pinned shmemfs pages as unevictable
2018-10-16 18:31 ` Chris Wilson
(?)
@ 2018-10-16 19:13 ` Michal Hocko
-1 siblings, 0 replies; 19+ messages in thread
From: Michal Hocko @ 2018-10-16 19:13 UTC (permalink / raw)
To: Chris Wilson
Cc: Kuo-Hsin Yang, linux-kernel, intel-gfx, linux-mm, akpm, peterz,
dave.hansen, corbet, hughd, joonas.lahtinen, marcheu, hoegsberg
On Tue 16-10-18 19:31:06, Chris Wilson wrote:
> Quoting Michal Hocko (2018-10-16 19:21:55)
> > On Wed 17-10-18 01:43:00, Kuo-Hsin Yang wrote:
> > > The i915 driver use shmemfs to allocate backing storage for gem objects.
> > > These shmemfs pages can be pinned (increased ref count) by
> > > shmem_read_mapping_page_gfp(). When a lot of pages are pinned, vmscan
> > > wastes a lot of time scanning these pinned pages. Mark these pinned
> > > pages as unevictable to speed up vmscan.
> >
> > I would squash the two patches into the single one. One more thing
> > though. One more thing to be careful about here. Unless I miss something
> > such a page is not migrateable so it shouldn't be allocated from a
> > movable zone. Does mapping_gfp_constraint contains __GFP_MOVABLE? If
> > yes, we want to drop it as well. Other than that the patch makes sense
> > with my very limited knowlege of the i915 code of course.
>
> They are not migrateable today. But we have proposed hooking up
> .migratepage and setting __GFP_MOVABLE which would then include unlocking
> the mapping at migrate time.
if the mapping_gfp doesn't include __GFP_MOVABLE today then there is no
issue I've had in mind.
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 19+ messages in thread
* ✓ Fi.CI.IGT: success for shmem, drm/i915: Mark pinned shmemfs pages as unevictable
2018-10-16 17:42 ` Kuo-Hsin Yang
` (3 preceding siblings ...)
(?)
@ 2018-10-16 22:07 ` Patchwork
-1 siblings, 0 replies; 19+ messages in thread
From: Patchwork @ 2018-10-16 22:07 UTC (permalink / raw)
To: Kuo-Hsin Yang; +Cc: intel-gfx
== Series Details ==
Series: shmem, drm/i915: Mark pinned shmemfs pages as unevictable
URL : https://patchwork.freedesktop.org/series/51078/
State : success
== Summary ==
= CI Bug Log - changes from CI_DRM_4990_full -> Patchwork_10478_full =
== Summary - SUCCESS ==
No regressions found.
== Known issues ==
Here are the changes found in Patchwork_10478_full that come from known issues:
=== IGT changes ===
==== Issues hit ====
igt@gem_exec_schedule@pi-ringfull-bsd:
shard-skl: NOTRUN -> FAIL (fdo#103158) +1
igt@gem_exec_schedule@pi-ringfull-render:
shard-glk: NOTRUN -> FAIL (fdo#103158)
igt@gem_ppgtt@blt-vs-render-ctxn:
shard-kbl: PASS -> INCOMPLETE (fdo#106023, fdo#103665)
shard-skl: NOTRUN -> TIMEOUT (fdo#108039)
igt@kms_atomic_transition@1x-modeset-transitions-nonblocking:
shard-skl: NOTRUN -> FAIL (fdo#108228)
igt@kms_busy@extended-modeset-hang-newfb-render-a:
shard-skl: NOTRUN -> DMESG-WARN (fdo#107956) +2
igt@kms_busy@extended-pageflip-modeset-hang-oldfb-render-a:
shard-glk: NOTRUN -> DMESG-WARN (fdo#107956)
igt@kms_busy@extended-pageflip-modeset-hang-oldfb-render-b:
shard-hsw: NOTRUN -> DMESG-WARN (fdo#107956) +2
igt@kms_cursor_crc@cursor-128x128-suspend:
shard-apl: PASS -> FAIL (fdo#103232, fdo#103191) +1
igt@kms_cursor_crc@cursor-256x256-onscreen:
shard-skl: NOTRUN -> FAIL (fdo#103232)
igt@kms_cursor_legacy@all-pipes-forked-move:
shard-apl: PASS -> INCOMPLETE (fdo#103927)
igt@kms_draw_crc@draw-method-xrgb2101010-pwrite-untiled:
shard-skl: NOTRUN -> FAIL (fdo#103184) +1
igt@kms_fbcon_fbt@psr:
shard-skl: NOTRUN -> FAIL (fdo#107882)
igt@kms_flip@flip-vs-expired-vblank:
shard-glk: PASS -> FAIL (fdo#105363, fdo#102887)
igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-mmap-gtt:
shard-skl: NOTRUN -> FAIL (fdo#105682) +2
igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-draw-mmap-wc:
shard-apl: PASS -> FAIL (fdo#103167)
igt@kms_frontbuffer_tracking@fbc-stridechange:
shard-skl: NOTRUN -> FAIL (fdo#105683)
igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-draw-pwrite:
shard-skl: NOTRUN -> FAIL (fdo#103167) +4
igt@kms_panel_fitting@legacy:
shard-skl: NOTRUN -> FAIL (fdo#105456)
igt@kms_plane@pixel-format-pipe-b-planes:
shard-skl: NOTRUN -> DMESG-FAIL (fdo#106885, fdo#103166) +1
igt@kms_plane_alpha_blend@pipe-a-alpha-basic:
shard-skl: NOTRUN -> FAIL (fdo#108145, fdo#107815)
igt@kms_plane_alpha_blend@pipe-a-constant-alpha-max:
shard-glk: NOTRUN -> FAIL (fdo#108145)
igt@kms_plane_alpha_blend@pipe-a-constant-alpha-min:
shard-skl: NOTRUN -> FAIL (fdo#108145) +1
igt@kms_plane_alpha_blend@pipe-b-alpha-7efc:
shard-skl: NOTRUN -> FAIL (fdo#108146)
igt@kms_plane_multiple@atomic-pipe-b-tiling-yf:
shard-glk: NOTRUN -> FAIL (fdo#103166)
igt@kms_setmode@basic:
shard-apl: PASS -> FAIL (fdo#99912)
shard-snb: NOTRUN -> FAIL (fdo#99912)
igt@kms_sysfs_edid_timing:
shard-skl: NOTRUN -> FAIL (fdo#100047)
igt@kms_universal_plane@universal-plane-pipe-c-functional:
shard-glk: PASS -> FAIL (fdo#103166)
==== Possible fixes ====
igt@kms_flip@flip-vs-expired-vblank-interruptible:
shard-glk: FAIL (fdo#105363) -> PASS
igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-draw-render:
shard-apl: FAIL (fdo#103167) -> PASS
igt@kms_vblank@pipe-a-ts-continuation-suspend:
shard-skl: INCOMPLETE (fdo#107773, fdo#104108) -> PASS +1
igt@pm_rpm@system-suspend-execbuf:
shard-skl: INCOMPLETE (fdo#107773, fdo#107807, fdo#104108) -> PASS
fdo#100047 https://bugs.freedesktop.org/show_bug.cgi?id=100047
fdo#102887 https://bugs.freedesktop.org/show_bug.cgi?id=102887
fdo#103158 https://bugs.freedesktop.org/show_bug.cgi?id=103158
fdo#103166 https://bugs.freedesktop.org/show_bug.cgi?id=103166
fdo#103167 https://bugs.freedesktop.org/show_bug.cgi?id=103167
fdo#103184 https://bugs.freedesktop.org/show_bug.cgi?id=103184
fdo#103191 https://bugs.freedesktop.org/show_bug.cgi?id=103191
fdo#103232 https://bugs.freedesktop.org/show_bug.cgi?id=103232
fdo#103665 https://bugs.freedesktop.org/show_bug.cgi?id=103665
fdo#103927 https://bugs.freedesktop.org/show_bug.cgi?id=103927
fdo#104108 https://bugs.freedesktop.org/show_bug.cgi?id=104108
fdo#105363 https://bugs.freedesktop.org/show_bug.cgi?id=105363
fdo#105456 https://bugs.freedesktop.org/show_bug.cgi?id=105456
fdo#105682 https://bugs.freedesktop.org/show_bug.cgi?id=105682
fdo#105683 https://bugs.freedesktop.org/show_bug.cgi?id=105683
fdo#106023 https://bugs.freedesktop.org/show_bug.cgi?id=106023
fdo#106885 https://bugs.freedesktop.org/show_bug.cgi?id=106885
fdo#107773 https://bugs.freedesktop.org/show_bug.cgi?id=107773
fdo#107807 https://bugs.freedesktop.org/show_bug.cgi?id=107807
fdo#107815 https://bugs.freedesktop.org/show_bug.cgi?id=107815
fdo#107882 https://bugs.freedesktop.org/show_bug.cgi?id=107882
fdo#107956 https://bugs.freedesktop.org/show_bug.cgi?id=107956
fdo#108039 https://bugs.freedesktop.org/show_bug.cgi?id=108039
fdo#108145 https://bugs.freedesktop.org/show_bug.cgi?id=108145
fdo#108146 https://bugs.freedesktop.org/show_bug.cgi?id=108146
fdo#108228 https://bugs.freedesktop.org/show_bug.cgi?id=108228
fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912
== Participating hosts (6 -> 6) ==
No changes in participating hosts
== Build changes ==
* Linux: CI_DRM_4990 -> Patchwork_10478
CI_DRM_4990: 0bd34d92e9a27784cb988a300619f497ca0e99ec @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_4681: 959d6f95cb1344e0c0dace5b236e17755826fac1 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_10478: c2dbef879ec3ecd799fbbda3343f699fec620ec3 @ git://anongit.freedesktop.org/gfx-ci/linux
piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_10478/shards.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2] shmem, drm/i915: mark pinned shmemfs pages as unevictable
2018-10-16 17:42 ` Kuo-Hsin Yang
@ 2018-10-17 8:58 ` Kuo-Hsin Yang
-1 siblings, 0 replies; 19+ messages in thread
From: Kuo-Hsin Yang @ 2018-10-17 8:58 UTC (permalink / raw)
To: vovoy
Cc: akpm, chris, corbet, dave.hansen, hoegsberg, hughd, intel-gfx,
joonas.lahtinen, linux-kernel, linux-mm, marcheu, mhocko, peterz
The i915 driver uses shmemfs to allocate backing storage for gem
objects. These shmemfs pages can be pinned (increased ref count) by
shmem_read_mapping_page_gfp(). When a lot of pages are pinned, vmscan
wastes a lot of time scanning these pinned pages. In some extreme case,
all pages in the inactive anon lru are pinned, and only the inactive
anon lru is scanned due to inactive_ratio, the system cannot swap and
invokes the oom-killer. Mark these pinned pages as unevictable to speed
up vmscan.
By exporting shmem_unlock_mapping, drivers can: 1. mark a shmemfs
address space as unevictable with mapping_set_unevictable(), pages in
the address space will be moved to unevictable list in vmscan. 2. mark
an address space as evictable with mapping_clear_unevictable(), and move
these pages back to evictable list with shmem_unlock_mapping().
This patch was inspired by Chris Wilson's change [1].
[1]: https://patchwork.kernel.org/patch/9768741/
Signed-off-by: Kuo-Hsin Yang <vovoy@chromium.org>
---
Changes for v2:
Squashed the two patches.
Documentation/vm/unevictable-lru.rst | 4 +++-
drivers/gpu/drm/i915/i915_gem.c | 8 ++++++++
mm/shmem.c | 2 ++
3 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/Documentation/vm/unevictable-lru.rst b/Documentation/vm/unevictable-lru.rst
index fdd84cb8d511..a812fb55136d 100644
--- a/Documentation/vm/unevictable-lru.rst
+++ b/Documentation/vm/unevictable-lru.rst
@@ -143,7 +143,7 @@ using a number of wrapper functions:
Query the address space, and return true if it is completely
unevictable.
-These are currently used in two places in the kernel:
+These are currently used in three places in the kernel:
(1) By ramfs to mark the address spaces of its inodes when they are created,
and this mark remains for the life of the inode.
@@ -154,6 +154,8 @@ These are currently used in two places in the kernel:
swapped out; the application must touch the pages manually if it wants to
ensure they're in memory.
+ (3) By the i915 driver to mark pinned address space until it's unpinned.
+
Detecting Unevictable Pages
---------------------------
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fcc73a6ab503..e0ff5b736128 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2390,6 +2390,7 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj,
{
struct sgt_iter sgt_iter;
struct page *page;
+ struct address_space *mapping;
__i915_gem_object_release_shmem(obj, pages, true);
@@ -2409,6 +2410,10 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj,
}
obj->mm.dirty = false;
+ mapping = file_inode(obj->base.filp)->i_mapping;
+ mapping_clear_unevictable(mapping);
+ shmem_unlock_mapping(mapping);
+
sg_free_table(pages);
kfree(pages);
}
@@ -2551,6 +2556,7 @@ static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
* Fail silently without starting the shrinker
*/
mapping = obj->base.filp->f_mapping;
+ mapping_set_unevictable(mapping);
noreclaim = mapping_gfp_constraint(mapping, ~__GFP_RECLAIM);
noreclaim |= __GFP_NORETRY | __GFP_NOWARN;
@@ -2664,6 +2670,8 @@ static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
err_pages:
for_each_sgt_page(page, sgt_iter, st)
put_page(page);
+ mapping_clear_unevictable(mapping);
+ shmem_unlock_mapping(mapping);
sg_free_table(st);
kfree(st);
diff --git a/mm/shmem.c b/mm/shmem.c
index 446942677cd4..d1ce34c09df6 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -786,6 +786,7 @@ void shmem_unlock_mapping(struct address_space *mapping)
cond_resched();
}
}
+EXPORT_SYMBOL_GPL(shmem_unlock_mapping);
/*
* Remove range of pages and swap entries from radix tree, and free them.
@@ -3874,6 +3875,7 @@ int shmem_lock(struct file *file, int lock, struct user_struct *user)
void shmem_unlock_mapping(struct address_space *mapping)
{
}
+EXPORT_SYMBOL_GPL(shmem_unlock_mapping);
#ifdef CONFIG_MMU
unsigned long shmem_get_unmapped_area(struct file *file,
--
2.19.1.331.ge82ca0e54c-goog
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2] shmem, drm/i915: mark pinned shmemfs pages as unevictable
@ 2018-10-17 8:58 ` Kuo-Hsin Yang
0 siblings, 0 replies; 19+ messages in thread
From: Kuo-Hsin Yang @ 2018-10-17 8:58 UTC (permalink / raw)
To: vovoy
Cc: mhocko, corbet, peterz, intel-gfx, hughd, linux-kernel, linux-mm,
dave.hansen, akpm, hoegsberg
The i915 driver uses shmemfs to allocate backing storage for gem
objects. These shmemfs pages can be pinned (increased ref count) by
shmem_read_mapping_page_gfp(). When a lot of pages are pinned, vmscan
wastes a lot of time scanning these pinned pages. In some extreme case,
all pages in the inactive anon lru are pinned, and only the inactive
anon lru is scanned due to inactive_ratio, the system cannot swap and
invokes the oom-killer. Mark these pinned pages as unevictable to speed
up vmscan.
By exporting shmem_unlock_mapping, drivers can: 1. mark a shmemfs
address space as unevictable with mapping_set_unevictable(), pages in
the address space will be moved to unevictable list in vmscan. 2. mark
an address space as evictable with mapping_clear_unevictable(), and move
these pages back to evictable list with shmem_unlock_mapping().
This patch was inspired by Chris Wilson's change [1].
[1]: https://patchwork.kernel.org/patch/9768741/
Signed-off-by: Kuo-Hsin Yang <vovoy@chromium.org>
---
Changes for v2:
Squashed the two patches.
Documentation/vm/unevictable-lru.rst | 4 +++-
drivers/gpu/drm/i915/i915_gem.c | 8 ++++++++
mm/shmem.c | 2 ++
3 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/Documentation/vm/unevictable-lru.rst b/Documentation/vm/unevictable-lru.rst
index fdd84cb8d511..a812fb55136d 100644
--- a/Documentation/vm/unevictable-lru.rst
+++ b/Documentation/vm/unevictable-lru.rst
@@ -143,7 +143,7 @@ using a number of wrapper functions:
Query the address space, and return true if it is completely
unevictable.
-These are currently used in two places in the kernel:
+These are currently used in three places in the kernel:
(1) By ramfs to mark the address spaces of its inodes when they are created,
and this mark remains for the life of the inode.
@@ -154,6 +154,8 @@ These are currently used in two places in the kernel:
swapped out; the application must touch the pages manually if it wants to
ensure they're in memory.
+ (3) By the i915 driver to mark pinned address space until it's unpinned.
+
Detecting Unevictable Pages
---------------------------
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fcc73a6ab503..e0ff5b736128 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2390,6 +2390,7 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj,
{
struct sgt_iter sgt_iter;
struct page *page;
+ struct address_space *mapping;
__i915_gem_object_release_shmem(obj, pages, true);
@@ -2409,6 +2410,10 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj,
}
obj->mm.dirty = false;
+ mapping = file_inode(obj->base.filp)->i_mapping;
+ mapping_clear_unevictable(mapping);
+ shmem_unlock_mapping(mapping);
+
sg_free_table(pages);
kfree(pages);
}
@@ -2551,6 +2556,7 @@ static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
* Fail silently without starting the shrinker
*/
mapping = obj->base.filp->f_mapping;
+ mapping_set_unevictable(mapping);
noreclaim = mapping_gfp_constraint(mapping, ~__GFP_RECLAIM);
noreclaim |= __GFP_NORETRY | __GFP_NOWARN;
@@ -2664,6 +2670,8 @@ static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
err_pages:
for_each_sgt_page(page, sgt_iter, st)
put_page(page);
+ mapping_clear_unevictable(mapping);
+ shmem_unlock_mapping(mapping);
sg_free_table(st);
kfree(st);
diff --git a/mm/shmem.c b/mm/shmem.c
index 446942677cd4..d1ce34c09df6 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -786,6 +786,7 @@ void shmem_unlock_mapping(struct address_space *mapping)
cond_resched();
}
}
+EXPORT_SYMBOL_GPL(shmem_unlock_mapping);
/*
* Remove range of pages and swap entries from radix tree, and free them.
@@ -3874,6 +3875,7 @@ int shmem_lock(struct file *file, int lock, struct user_struct *user)
void shmem_unlock_mapping(struct address_space *mapping)
{
}
+EXPORT_SYMBOL_GPL(shmem_unlock_mapping);
#ifdef CONFIG_MMU
unsigned long shmem_get_unmapped_area(struct file *file,
--
2.19.1.331.ge82ca0e54c-goog
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] drm/i915: Mark pinned shmemfs pages as unevictable
2018-10-16 18:31 ` Chris Wilson
@ 2018-10-18 6:56 ` Chris Wilson
-1 siblings, 0 replies; 19+ messages in thread
From: Chris Wilson @ 2018-10-18 6:56 UTC (permalink / raw)
To: Kuo-Hsin Yang, Michal Hocko
Cc: linux-kernel, intel-gfx, linux-mm, akpm, peterz, dave.hansen,
corbet, hughd, joonas.lahtinen, marcheu, hoegsberg
Quoting Chris Wilson (2018-10-16 19:31:06)
> Fwiw, the shmem_unlock_mapping() call feels quite expensive, almost
> nullifying the advantage gained from not walking the lists in reclaim.
> I'll have better numbers in a couple of days.
Using a test ("igt/benchmarks/gem_syslatency -t 120 -b -m" on kbl)
consisting of cycletest with a background load of trying to allocate +
populate 2MiB (to hit thp) while catting all files to /dev/null, the
result of using mapping_set_unevictable is mixed.
Each test run consists of running cycletest for 120s measuring the mean
and maximum wakeup latency and then repeating that 120 times.
x baseline-mean.txt # no i915 activity
+ tip-mean.txt # current stock i915 with a continuous load
+------------------------------------------------------------------------+
| x + |
| x + |
|xx + |
|xx + |
|xx + |
|xx ++ |
|xx +++ |
|xx +++ |
|xx +++ |
|xx +++ |
|xx +++ |
|xx ++++ |
|xx +++++ |
|xx ++++++ |
|xx ++++++ |
|xx ++++++ |
|xx ++++++ |
|xx ++++++ |
|xx +++++++ + |
|xx ++++++++ + |
|xx ++++++++++ |
|xx+++++++++++ + + |
|xx+++++++++++ + + + + + ++ +|
| A |
||______M_A_________| |
+------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 120 359.153 876.915 863.548 778.80319 186.15875
+ 120 2475.318 73172.303 7666.812 9579.4671 9552.865
Our target then is 863us, but currently i915 adds 7ms of uninterruptable
delay on hitting the shrinker.
x baseline-mean.txt
+ mapping-mean.txt # applying the mapping_set_evictable patch
* tip-mean.txt
+------------------------------------------------------------------------+
| x * + |
| x * + |
|xx * + |
|xx * + |
|xx * + |
|xx ** + |
|xx *** ++ |
|xx *** ++ |
|xx *** ++ |
|xx *** ++ |
|xx *** ++ |
|xx **** + ++ |
|xx *****+ ++ ++ |
|xx ******+ ++ ++ |
|xx ******+ ++ + ++ |
|xx ******+ ++ + ++ |
|xx ******+ ++ ++++ |
|xx ******+ ++ ++++ |
|xx ******* *+ ++++ |
|xx ******** *+ +++++ |
|xx **********+ +++++ |
|xx***********+*+++++* |
|xx***********+*+++++* * + * * ** *|
| A |
| |___AM___| |
||______M_A_________| |
+------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 120 359.153 876.915 863.548 778.80319 186.15875
+ 120 3291.633 26644.894 15829.186 14654.781 4466.6997
* 120 2475.318 73172.303 7666.812 9579.4671 9552.865
Shows that if we use the mapping_set_evictable() +
shmem_unlock_mapping() we add a further 8ms uninterruptable delay to the
system... That's the opposite of our goal! ;)
x baseline-mean.txt
+ lock_vma-mean.txt # the old approach of pinning each page
* tip-mean.txt
+------------------------------------------------------------------------+
| *+ * |
| *+ * * |
| *+ * * |
| *+ * * |
| *+ *** |
| *+ *** |
| *+ *** |
| *+ *** |
| *+ *** |
| *+ *** |
| *+ *** |
| *+ **** |
| *+ ***** |
| *+ ****** |
| *+ ****** * |
| *+ ****** * |
| *+ ******* * |
| *+******** * |
| *+******** * |
| *+******** * |
| *+******** * * * |
| *+******** * * + * * * * * * *|
| A |
||MA| |
||_______M_A________| |
+------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 120 359.153 876.915 863.548 778.80319 186.15875
+ 120 511.415 18757.367 1276.302 1416.0016 1679.3965
* 120 2475.318 73172.303 7666.812 9579.4671 9552.865
By contrast, the previous approach of using mlock_page_vma() does
dramatically reduce the uninterruptable delay -- which suggests that the
mapping_set_evictable() isn't keeping our unshrinkable pages off the
shrinker lru.
However, if instead of looking at the average uninterruptable delay
during the 120s of cycletest, but look at the worst case, things get a
little more interesting. Currently i915 is terrible.
x baseline-max.txt
+ tip-max.txt
+------------------------------------------------------------------------+
| * |
[snip 100 lines]
| * |
| * |
| * |
| * |
| * |
| * |
| * |
| * |
| * +++ ++ + + + + +|
| A |
||_____M_A_______| |
+------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 120 7391 58543 51953 51564.033 5044.6375
+ 120 2284928 6.752085e+08 3385097 20825362 80352645
Worst case with no i915 is 52ms, but as soon as we load up i915 with
some work, the worst case uninterruptable delay is on average 20s!!! As
suggested by the median, the data is severely skewed by a few outliers.
(Worst worst case is so bad khungtaskd often makes an appearance.)
x baseline-max.txt
+ mapping-max.txt
* tip-max.txt
+------------------------------------------------------------------------+
| * |
[snip 100 lines]
| * |
| * |
| * |
| * |
| * |
| * |
| * |
| *+ |
| *+*** ** * * +* * *|
| A |
| |_A__| |
||_____M_A_______| |
+------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 120 7391 58543 51953 51564.033 5044.6375
+ 120 3088140 2.9181602e+08 4022581 6528993.3 26278426
* 120 2284928 6.752085e+08 3385097 20825362 80352645
So while the mapping_set_evictable patch did reduce the maximum observed
delay within the 4 hour sample, on average (median, to exclude those worst
worst case outliers) it still fares worse than stock i915. The
mlock_page_vma() has no impact on worst case wrt stock.
My conclusion is that the mapping_set_evictable patch makes both the
average and worst case uninterruptable latency (as observed by other
users of the system) significantly worse. (Although the maximum latency
is not stable enough to draw a real conclusion other than i915 is
shockingly terrible.)
-Chris
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] drm/i915: Mark pinned shmemfs pages as unevictable
@ 2018-10-18 6:56 ` Chris Wilson
0 siblings, 0 replies; 19+ messages in thread
From: Chris Wilson @ 2018-10-18 6:56 UTC (permalink / raw)
To: Kuo-Hsin Yang, Michal Hocko
Cc: dave.hansen, peterz, intel-gfx, corbet, hughd, linux-kernel,
linux-mm, akpm, hoegsberg
Quoting Chris Wilson (2018-10-16 19:31:06)
> Fwiw, the shmem_unlock_mapping() call feels quite expensive, almost
> nullifying the advantage gained from not walking the lists in reclaim.
> I'll have better numbers in a couple of days.
Using a test ("igt/benchmarks/gem_syslatency -t 120 -b -m" on kbl)
consisting of cycletest with a background load of trying to allocate +
populate 2MiB (to hit thp) while catting all files to /dev/null, the
result of using mapping_set_unevictable is mixed.
Each test run consists of running cycletest for 120s measuring the mean
and maximum wakeup latency and then repeating that 120 times.
x baseline-mean.txt # no i915 activity
+ tip-mean.txt # current stock i915 with a continuous load
+------------------------------------------------------------------------+
| x + |
| x + |
|xx + |
|xx + |
|xx + |
|xx ++ |
|xx +++ |
|xx +++ |
|xx +++ |
|xx +++ |
|xx +++ |
|xx ++++ |
|xx +++++ |
|xx ++++++ |
|xx ++++++ |
|xx ++++++ |
|xx ++++++ |
|xx ++++++ |
|xx +++++++ + |
|xx ++++++++ + |
|xx ++++++++++ |
|xx+++++++++++ + + |
|xx+++++++++++ + + + + + ++ +|
| A |
||______M_A_________| |
+------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 120 359.153 876.915 863.548 778.80319 186.15875
+ 120 2475.318 73172.303 7666.812 9579.4671 9552.865
Our target then is 863us, but currently i915 adds 7ms of uninterruptable
delay on hitting the shrinker.
x baseline-mean.txt
+ mapping-mean.txt # applying the mapping_set_evictable patch
* tip-mean.txt
+------------------------------------------------------------------------+
| x * + |
| x * + |
|xx * + |
|xx * + |
|xx * + |
|xx ** + |
|xx *** ++ |
|xx *** ++ |
|xx *** ++ |
|xx *** ++ |
|xx *** ++ |
|xx **** + ++ |
|xx *****+ ++ ++ |
|xx ******+ ++ ++ |
|xx ******+ ++ + ++ |
|xx ******+ ++ + ++ |
|xx ******+ ++ ++++ |
|xx ******+ ++ ++++ |
|xx ******* *+ ++++ |
|xx ******** *+ +++++ |
|xx **********+ +++++ |
|xx***********+*+++++* |
|xx***********+*+++++* * + * * ** *|
| A |
| |___AM___| |
||______M_A_________| |
+------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 120 359.153 876.915 863.548 778.80319 186.15875
+ 120 3291.633 26644.894 15829.186 14654.781 4466.6997
* 120 2475.318 73172.303 7666.812 9579.4671 9552.865
Shows that if we use the mapping_set_evictable() +
shmem_unlock_mapping() we add a further 8ms uninterruptable delay to the
system... That's the opposite of our goal! ;)
x baseline-mean.txt
+ lock_vma-mean.txt # the old approach of pinning each page
* tip-mean.txt
+------------------------------------------------------------------------+
| *+ * |
| *+ * * |
| *+ * * |
| *+ * * |
| *+ *** |
| *+ *** |
| *+ *** |
| *+ *** |
| *+ *** |
| *+ *** |
| *+ *** |
| *+ **** |
| *+ ***** |
| *+ ****** |
| *+ ****** * |
| *+ ****** * |
| *+ ******* * |
| *+******** * |
| *+******** * |
| *+******** * |
| *+******** * * * |
| *+******** * * + * * * * * * *|
| A |
||MA| |
||_______M_A________| |
+------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 120 359.153 876.915 863.548 778.80319 186.15875
+ 120 511.415 18757.367 1276.302 1416.0016 1679.3965
* 120 2475.318 73172.303 7666.812 9579.4671 9552.865
By contrast, the previous approach of using mlock_page_vma() does
dramatically reduce the uninterruptable delay -- which suggests that the
mapping_set_evictable() isn't keeping our unshrinkable pages off the
shrinker lru.
However, if instead of looking at the average uninterruptable delay
during the 120s of cycletest, but look at the worst case, things get a
little more interesting. Currently i915 is terrible.
x baseline-max.txt
+ tip-max.txt
+------------------------------------------------------------------------+
| * |
[snip 100 lines]
| * |
| * |
| * |
| * |
| * |
| * |
| * |
| * |
| * +++ ++ + + + + +|
| A |
||_____M_A_______| |
+------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 120 7391 58543 51953 51564.033 5044.6375
+ 120 2284928 6.752085e+08 3385097 20825362 80352645
Worst case with no i915 is 52ms, but as soon as we load up i915 with
some work, the worst case uninterruptable delay is on average 20s!!! As
suggested by the median, the data is severely skewed by a few outliers.
(Worst worst case is so bad khungtaskd often makes an appearance.)
x baseline-max.txt
+ mapping-max.txt
* tip-max.txt
+------------------------------------------------------------------------+
| * |
[snip 100 lines]
| * |
| * |
| * |
| * |
| * |
| * |
| * |
| *+ |
| *+*** ** * * +* * *|
| A |
| |_A__| |
||_____M_A_______| |
+------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 120 7391 58543 51953 51564.033 5044.6375
+ 120 3088140 2.9181602e+08 4022581 6528993.3 26278426
* 120 2284928 6.752085e+08 3385097 20825362 80352645
So while the mapping_set_evictable patch did reduce the maximum observed
delay within the 4 hour sample, on average (median, to exclude those worst
worst case outliers) it still fares worse than stock i915. The
mlock_page_vma() has no impact on worst case wrt stock.
My conclusion is that the mapping_set_evictable patch makes both the
average and worst case uninterruptable latency (as observed by other
users of the system) significantly worse. (Although the maximum latency
is not stable enough to draw a real conclusion other than i915 is
shockingly terrible.)
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] drm/i915: Mark pinned shmemfs pages as unevictable
2018-10-18 6:56 ` Chris Wilson
@ 2018-10-18 8:15 ` Michal Hocko
-1 siblings, 0 replies; 19+ messages in thread
From: Michal Hocko @ 2018-10-18 8:15 UTC (permalink / raw)
To: Chris Wilson
Cc: Kuo-Hsin Yang, linux-kernel, intel-gfx, linux-mm, akpm, peterz,
dave.hansen, corbet, hughd, joonas.lahtinen, marcheu, hoegsberg
On Thu 18-10-18 07:56:45, Chris Wilson wrote:
> Quoting Chris Wilson (2018-10-16 19:31:06)
> > Fwiw, the shmem_unlock_mapping() call feels quite expensive, almost
> > nullifying the advantage gained from not walking the lists in reclaim.
> > I'll have better numbers in a couple of days.
>
> Using a test ("igt/benchmarks/gem_syslatency -t 120 -b -m" on kbl)
> consisting of cycletest with a background load of trying to allocate +
> populate 2MiB (to hit thp) while catting all files to /dev/null, the
> result of using mapping_set_unevictable is mixed.
I haven't really read through your report completely yet but I wanted to
point out that the above test scenario is unlikely show the real effect of
the LRU scanning overhead because shmem pages do live on the anonymous
LRU list. With a plenty of file page cache available we do not even scan
anonymous LRU lists. You would have to generate a swapout workload to
test this properly.
On the other hand if mapping_set_unevictable has really a measurably bad
performance impact then this is probably not worth much because most
workloads are swap modest.
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] drm/i915: Mark pinned shmemfs pages as unevictable
@ 2018-10-18 8:15 ` Michal Hocko
0 siblings, 0 replies; 19+ messages in thread
From: Michal Hocko @ 2018-10-18 8:15 UTC (permalink / raw)
To: Chris Wilson
Cc: dave.hansen, peterz, intel-gfx, corbet, hughd, linux-kernel,
linux-mm, akpm, hoegsberg
On Thu 18-10-18 07:56:45, Chris Wilson wrote:
> Quoting Chris Wilson (2018-10-16 19:31:06)
> > Fwiw, the shmem_unlock_mapping() call feels quite expensive, almost
> > nullifying the advantage gained from not walking the lists in reclaim.
> > I'll have better numbers in a couple of days.
>
> Using a test ("igt/benchmarks/gem_syslatency -t 120 -b -m" on kbl)
> consisting of cycletest with a background load of trying to allocate +
> populate 2MiB (to hit thp) while catting all files to /dev/null, the
> result of using mapping_set_unevictable is mixed.
I haven't really read through your report completely yet but I wanted to
point out that the above test scenario is unlikely show the real effect of
the LRU scanning overhead because shmem pages do live on the anonymous
LRU list. With a plenty of file page cache available we do not even scan
anonymous LRU lists. You would have to generate a swapout workload to
test this properly.
On the other hand if mapping_set_unevictable has really a measurably bad
performance impact then this is probably not worth much because most
workloads are swap modest.
--
Michal Hocko
SUSE Labs
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2018-10-18 8:16 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-16 17:42 [PATCH 0/2] shmem, drm/i915: Mark pinned shmemfs pages as unevictable Kuo-Hsin Yang
2018-10-16 17:42 ` Kuo-Hsin Yang
2018-10-16 17:42 ` [PATCH 1/2] shmem: export shmem_unlock_mapping Kuo-Hsin Yang
2018-10-16 17:42 ` Kuo-Hsin Yang
2018-10-16 17:43 ` [PATCH 2/2] drm/i915: Mark pinned shmemfs pages as unevictable Kuo-Hsin Yang
2018-10-16 17:43 ` Kuo-Hsin Yang
2018-10-16 18:21 ` Michal Hocko
2018-10-16 18:21 ` Michal Hocko
2018-10-16 18:31 ` Chris Wilson
2018-10-16 18:31 ` Chris Wilson
2018-10-16 19:13 ` Michal Hocko
2018-10-18 6:56 ` Chris Wilson
2018-10-18 6:56 ` Chris Wilson
2018-10-18 8:15 ` Michal Hocko
2018-10-18 8:15 ` Michal Hocko
2018-10-16 18:27 ` ✓ Fi.CI.BAT: success for shmem, " Patchwork
2018-10-16 22:07 ` ✓ Fi.CI.IGT: " Patchwork
2018-10-17 8:58 ` [PATCH v2] shmem, drm/i915: mark " Kuo-Hsin Yang
2018-10-17 8:58 ` Kuo-Hsin Yang
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.