linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Roger Pau Monné" <roger.pau@citrix.com>
To: SeongJae Park <sjpark@amazon.com>
Cc: <jgross@suse.com>, <axboe@kernel.dk>, <konrad.wilk@oracle.com>,
	"SeongJae Park" <sjpark@amazon.de>, <pdurrant@amazon.com>,
	<sj38.park@gmail.com>, <xen-devel@lists.xenproject.org>,
	<linux-block@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v13 3/5] xen/blkback: Squeeze page pools if a memory pressure is detected
Date: Wed, 8 Jan 2020 13:40:57 +0100	[thread overview]
Message-ID: <20200108124057.GN11756@Air-de-Roger> (raw)
In-Reply-To: <20191218183718.31719-4-sjpark@amazon.com>

On Wed, Dec 18, 2019 at 07:37:16PM +0100, SeongJae Park wrote:
> From: SeongJae Park <sjpark@amazon.de>
> 
> Each `blkif` has a free pages pool for the grant mapping.  The size of
> the pool starts from zero and is increased on demand while processing
> the I/O requests.  If current I/O requests handling is finished or 100
> milliseconds has passed since last I/O requests handling, it checks and
> shrinks the pool to not exceed the size limit, `max_buffer_pages`.
> 
> Therefore, host administrators can cause memory pressure in blkback by
> attaching a large number of block devices and inducing I/O.  Such
> problematic situations can be avoided by limiting the maximum number of
> devices that can be attached, but finding the optimal limit is not so
> easy.  Improper set of the limit can results in memory pressure or a
> resource underutilization.  This commit avoids such problematic
> situations by squeezing the pools (returns every free page in the pool
> to the system) for a while (users can set this duration via a module
> parameter) if memory pressure is detected.
> 
> Discussions
> ===========
> 
> The `blkback`'s original shrinking mechanism returns only pages in the
> pool which are not currently be used by `blkback` to the system.  In
> other words, the pages that are not mapped with granted pages.  Because
> this commit is changing only the shrink limit but still uses the same
> freeing mechanism it does not touch pages which are currently mapping
> grants.
> 
> Once memory pressure is detected, this commit keeps the squeezing limit
> for a user-specified time duration.  The duration should be neither too
> long nor too short.  If it is too long, the squeezing incurring overhead
> can reduce the I/O performance.  If it is too short, `blkback` will not
> free enough pages to reduce the memory pressure.  This commit sets the
> value as `10 milliseconds` by default because it is a short time in
> terms of I/O while it is a long time in terms of memory operations.
> Also, as the original shrinking mechanism works for at least every 100
> milliseconds, this could be a somewhat reasonable choice.  I also tested
> other durations (refer to the below section for more details) and
> confirmed that 10 milliseconds is the one that works best with the test.
> That said, the proper duration depends on actual configurations and
> workloads.  That's why this commit allows users to set the duration as a
> module parameter.
> 
> Memory Pressure Test
> ====================
> 
> To show how this commit fixes the memory pressure situation well, I
> configured a test environment on a xen-running virtualization system.
> On the `blkfront` running guest instances, I attach a large number of
> network-backed volume devices and induce I/O to those.  Meanwhile, I
> measure the number of pages that swapped in (pswpin) and out (pswpout)
> on the `blkback` running guest.  The test ran twice, once for the
> `blkback` before this commit and once for that after this commit.  As
> shown below, this commit has dramatically reduced the memory pressure:
> 
>                 pswpin  pswpout
>     before      76,672  185,799
>     after          867    3,967
> 
> Optimal Aggressive Shrinking Duration
> -------------------------------------
> 
> To find a best squeezing duration, I repeated the test with three
> different durations (1ms, 10ms, and 100ms).  The results are as below:
> 
>     duration    pswpin  pswpout
>     1           707     5,095
>     10          867     3,967
>     100         362     3,348
> 
> As expected, the memory pressure decreases as the duration increases,
> but the reduction become slow from the `10ms`.  Based on this results, I
> chose the default duration as 10ms.
> 
> Performance Overhead Test
> =========================
> 
> This commit could incur I/O performance degradation under severe memory
> pressure because the squeezing will require more page allocations per
> I/O.  To show the overhead, I artificially made a worst-case squeezing
> situation and measured the I/O performance of a `blkfront` running
> guest.
> 
> For the artificial squeezing, I set the `blkback.max_buffer_pages` using
> the `/sys/module/xen_blkback/parameters/max_buffer_pages` file.  In this
> test, I set the value to `1024` and `0`.  The `1024` is the default
> value.  Setting the value as `0` is same to a situation doing the
> squeezing always (worst-case).
> 
> If the underlying block device is slow enough, the squeezing overhead
> could be hidden.  For the reason, I use a fast block device, namely the
> rbd[1]:
> 
>     # xl block-attach guest phy:/dev/ram0 xvdb w
> 
> For the I/O performance measurement, I run a simple `dd` command 5 times
> directly to the device as below and collect the 'MB/s' results.
> 
>     $ for i in {1..5}; do dd if=/dev/zero of=/dev/xvdb \
>                              bs=4k count=$((256*512)); sync; done
> 
> The results are as below.  'max_pgs' represents the value of the
> `blkback.max_buffer_pages` parameter.
> 
>     max_pgs   Min       Max       Median     Avg    Stddev
>     0         417       423       420        419.4  2.5099801
>     1024      414       425       416        417.8  4.4384682
>     No difference proven at 95.0% confidence
> 
> In short, even worst case squeezing on ramdisk based fast block device
> makes no visible performance degradation.  Please note that this is just
> a very simple and minimal test.  On systems using super-fast block
> devices and a special I/O workload, the results might be different.  If
> you have any doubt, test on your machine with your workload to find the
> optimal squeezing duration for you.
> 
> [1] https://www.kernel.org/doc/html/latest/admin-guide/blockdev/ramdisk.html
> 
> Signed-off-by: SeongJae Park <sjpark@amazon.de>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, and sorry for the delay!

Roger.

  parent reply	other threads:[~2020-01-08 12:41 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-18 18:37 [PATCH v13 0/5] xenbus/backend: Add memory pressure handler callback SeongJae Park
2019-12-18 18:37 ` [PATCH v13 1/5] " SeongJae Park
2019-12-18 18:37 ` [PATCH v13 2/5] xenbus/backend: Protect xenbus callback with lock SeongJae Park
2019-12-20 11:42   ` Jürgen Groß
2019-12-18 18:37 ` [PATCH v13 3/5] xen/blkback: Squeeze page pools if a memory pressure is detected SeongJae Park
2020-01-03  7:54   ` [Xen-devel] " SeongJae Park
2020-01-08 12:40   ` Roger Pau Monné [this message]
2019-12-18 18:37 ` [PATCH v13 4/5] xen/blkback: Remove unnecessary static variable name prefixes SeongJae Park
2019-12-18 18:38 ` SeongJae Park
2019-12-18 18:39 ` [PATCH v13 5/5] xen/blkback: Consistently insert one empty line between functions SeongJae Park
2020-01-13  9:49 ` [Xen-devel] [PATCH v13 0/5] xenbus/backend: Add memory pressure handler callback SeongJae Park
2020-01-13  9:55   ` Roger Pau Monné
2020-01-13  9:59     ` SeongJae Park

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200108124057.GN11756@Air-de-Roger \
    --to=roger.pau@citrix.com \
    --cc=axboe@kernel.dk \
    --cc=jgross@suse.com \
    --cc=konrad.wilk@oracle.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pdurrant@amazon.com \
    --cc=sj38.park@gmail.com \
    --cc=sjpark@amazon.com \
    --cc=sjpark@amazon.de \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).