From: Chris Wilson <chris@chris-wilson.co.uk>
To: Kuo-Hsin Yang <vovoy@chromium.org>, Michal Hocko <mhocko@kernel.org>
Cc: linux-kernel@vger.kernel.org, intel-gfx@lists.freedesktop.org,
	linux-mm@kvack.org, akpm@linux-foundation.org,
	peterz@infradead.org, dave.hansen@intel.com, corbet@lwn.net,
	hughd@google.com, joonas.lahtinen@linux.intel.com,
	marcheu@chromium.org, hoegsberg@chromium.org
Subject: Re: [PATCH 2/2] drm/i915: Mark pinned shmemfs pages as unevictable
Date: Thu, 18 Oct 2018 07:56:45 +0100
Message-ID: <153984580501.19935.11456945882099910977@skylake-alporthouse-com>
In-Reply-To: <153971466599.22931.16793398326492316920@skylake-alporthouse-com>

Quoting Chris Wilson (2018-10-16 19:31:06)
> Fwiw, the shmem_unlock_mapping() call feels quite expensive, almost
> nullifying the advantage gained from not walking the lists in reclaim.
> I'll have better numbers in a couple of days.

Using a test ("igt/benchmarks/gem_syslatency -t 120 -b -m" on kbl)
consisting of cycletest with a background load of trying to allocate +
populate 2MiB (to hit thp) while catting all files to /dev/null, the
result of using mapping_set_unevictable is mixed.

Each test run consists of running cycletest for 120s measuring the mean
and maximum wakeup latency and then repeating that 120 times.

x baseline-mean.txt # no i915 activity
+ tip-mean.txt # current stock i915 with a continuous load
+------------------------------------------------------------------------+
| x      +                                                               |
| x      +                                                               |
|xx      +                                                               |
|xx      +                                                               |
|xx      +                                                               |
|xx     ++                                                               |
|xx    +++                                                               |
|xx    +++                                                               |
|xx    +++                                                               |
|xx    +++                                                               |
|xx    +++                                                               |
|xx    ++++                                                              |
|xx   +++++                                                              |
|xx  ++++++                                                              |
|xx  ++++++                                                              |
|xx  ++++++                                                              |
|xx  ++++++                                                              |
|xx  ++++++                                                              |
|xx  +++++++ +                                                           |
|xx ++++++++ +                                                           |
|xx ++++++++++                                                           |
|xx+++++++++++ +     +                                                   |
|xx+++++++++++ +     +  +          +      +       ++                    +|
| A                                                                      |
||______M_A_________|                                                    |
+------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x 120       359.153       876.915       863.548     778.80319     186.15875
+ 120      2475.318     73172.303      7666.812     9579.4671      9552.865

Our target then is 863us, but currently i915 adds 7ms of uninterruptible
delay on hitting the shrinker.

x baseline-mean.txt
+ mapping-mean.txt # applying the mapping_set_unevictable patch
* tip-mean.txt
+------------------------------------------------------------------------+
| x      *         +                                                     |
| x      *         +                                                     |
|xx      *         +                                                     |
|xx      *         +                                                     |
|xx      *         +                                                     |
|xx     **         +                                                     |
|xx    ***         ++                                                    |
|xx    ***         ++                                                    |
|xx    ***         ++                                                    |
|xx    ***         ++                                                    |
|xx    ***         ++                                                    |
|xx    ****  +     ++                                                    |
|xx   *****+ ++    ++                                                    |
|xx  ******+ ++    ++                                                    |
|xx  ******+ ++  + ++                                                    |
|xx  ******+ ++  + ++                                                    |
|xx  ******+ ++  ++++                                                    |
|xx  ******+ ++  ++++                                                    |
|xx  ******* *+  ++++                                                    |
|xx ******** *+ +++++                                                    |
|xx **********+ +++++                                                    |
|xx***********+*+++++*                                                   |
|xx***********+*+++++*  *  +       *      *       **                    *|
| A                                                                      |
|          |___AM___|                                                    |
||______M_A_________|                                                    |
+------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x 120       359.153       876.915       863.548     778.80319     186.15875
+ 120      3291.633     26644.894     15829.186     14654.781     4466.6997
* 120      2475.318     73172.303      7666.812     9579.4671      9552.865

This shows that if we use mapping_set_unevictable() +
shmem_unlock_mapping() we add a further 8ms of uninterruptible delay to
the system... That's the opposite of our goal! ;)

x baseline-mean.txt
+ lock_vma-mean.txt # the old approach of pinning each page
* tip-mean.txt
+------------------------------------------------------------------------+
| *+     *                                                               |
| *+   * *                                                               |
| *+   * *                                                               |
| *+   * *                                                               |
| *+   ***                                                               |
| *+   ***                                                               |
| *+   ***                                                               |
| *+   ***                                                               |
| *+   ***                                                               |
| *+   ***                                                               |
| *+   ***                                                               |
| *+   ****                                                              |
| *+  *****                                                              |
| *+  ******                                                             |
| *+  ****** *                                                           |
| *+  ****** *                                                           |
| *+ ******* *                                                           |
| *+******** *                                                           |
| *+******** *                                                           |
| *+******** *                                                           |
| *+******** * *     *                                                   |
| *+******** * *   + *  *          *      *       * *                   *|
| A                                                                      |
||MA|                                                                    |
||_______M_A________|                                                    |
+------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x 120       359.153       876.915       863.548     778.80319     186.15875
+ 120       511.415     18757.367      1276.302     1416.0016     1679.3965
* 120      2475.318     73172.303      7666.812     9579.4671      9552.865

By contrast, the previous approach of using mlock_page_vma() does
dramatically reduce the uninterruptible delay -- which suggests that
mapping_set_unevictable() isn't keeping our unshrinkable pages off the
shrinker lru.
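
As a rough cross-check of the deltas quoted above, here is a minimal
sketch that recomputes them from the Median column of the three mean
tables. Only the numbers come from the ministat output; the dictionary
and the added_delay_ms() helper are illustrative names, not part of igt
or ministat.

```python
# Median wakeup latencies in microseconds, copied from the ministat
# tables above. The names used here are illustrative only.
MEDIAN_US = {
    "baseline": 863.548,    # no i915 activity
    "tip": 7666.812,        # stock i915 under continuous load
    "mapping": 15829.186,   # mapping_set_unevictable patch
    "lock_vma": 1276.302,   # old approach of pinning each page
}


def added_delay_ms(variant, reference="baseline"):
    """Median delay added by `variant` relative to `reference`, in ms."""
    return (MEDIAN_US[variant] - MEDIAN_US[reference]) / 1000.0


if __name__ == "__main__":
    for name in ("tip", "mapping", "lock_vma"):
        print(f"{name:9s} +{added_delay_ms(name):.2f} ms over baseline")
    print(f"mapping   +{added_delay_ms('mapping', 'tip'):.2f} ms over tip")
```

Comparing medians rather than averages keeps the handful of extreme
runs from dominating the comparison.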

However, if instead of looking at the average uninterruptible delay
during the 120s of cycletest we look at the worst case, things get a
little more interesting. Currently i915 is terrible.

x baseline-max.txt
+ tip-max.txt
+------------------------------------------------------------------------+
|      *                                                                 |
[snip 100 lines]
|      *                                                                 |
|      *                                                                 |
|      *                                                                 |
|      *                                                                 |
|      *                                                                 |
|      *                                                                 |
|      *                                                                 |
|      *                                                                 |
|      * +++      ++ +           +  +      +                            +|
|      A                                                                 |
||_____M_A_______|                                                       |
+------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x 120          7391         58543         51953     51564.033     5044.6375
+ 120       2284928  6.752085e+08       3385097      20825362      80352645

Worst case with no i915 is 52ms, but as soon as we load up i915 with
some work, the worst case uninterruptible delay is on average 20s!!! As
suggested by the median, the data is severely skewed by a few outliers.
(Worst worst case is so bad khungtaskd often makes an appearance.)
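
To make the skew concrete, a small sketch that converts the tip-max.txt
summary into seconds and compares mean against median. The values are
copied from the ministat table above; the constant and helper names are
made up for illustration.

```python
# Summary of per-run maximum latency, in microseconds, copied from the
# ministat table above. Names are illustrative only.
TIP_MAX_US = {"median": 3385097, "avg": 20825362, "max": 6.752085e+08}
BASELINE_MAX_MEDIAN_US = 51953


def us_to_s(us):
    """Convert microseconds to seconds."""
    return us / 1e6


def mean_to_median_ratio(summary):
    """How far the mean is pulled above the median by outliers."""
    return summary["avg"] / summary["median"]


if __name__ == "__main__":
    print(f"baseline median worst case: {BASELINE_MAX_MEDIAN_US / 1e3:.0f} ms")
    print(f"tip mean worst case:        {us_to_s(TIP_MAX_US['avg']):.1f} s")
    print(f"tip median worst case:      {us_to_s(TIP_MAX_US['median']):.1f} s")
    print(f"tip worst worst case:       {us_to_s(TIP_MAX_US['max']):.0f} s")
    print(f"mean/median ratio:          {mean_to_median_ratio(TIP_MAX_US):.1f}")
```

A mean roughly six times the median is what a few very long stalls in
an otherwise tight distribution look like.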

x baseline-max.txt
+ mapping-max.txt
* tip-max.txt
+------------------------------------------------------------------------+
|      *                                                                 |
[snip 100 lines]
|      *                                                                 |
|      *                                                                 |
|      *                                                                 |
|      *                                                                 |
|      *                                                                 |
|      *                                                                 |
|      *                                                                 |
|      *+                                                                |
|      *+***      ** *           * +*      *                            *|
|      A                                                                 |
|    |_A__|                                                              |
||_____M_A_______|                                                       |
+------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x 120          7391         58543         51953     51564.033     5044.6375
+ 120       3088140 2.9181602e+08       4022581     6528993.3      26278426
* 120       2284928  6.752085e+08       3385097      20825362      80352645

So while the mapping_set_unevictable patch did reduce the maximum
observed delay within the 4 hour sample, on average (median, to exclude
those worst worst case outliers) it still fares worse than stock i915.
The mlock_page_vma() approach has no impact on worst case wrt stock.

My conclusion is that the mapping_set_unevictable patch makes both the
average and worst case uninterruptible latency (as observed by other
users of the system) significantly worse. (Although the maximum latency
is not stable enough to draw a real conclusion other than i915 is
shockingly terrible.)
-Chris
