linux-block.vger.kernel.org archive mirror
* [PATCH 0/2] xen/blkback: Aggressively shrink page pools if a memory pressure is detected
@ 2019-12-04 11:34 SeongJae Park
  2019-12-04 11:34 ` [PATCH 1/2] " SeongJae Park
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: SeongJae Park @ 2019-12-04 11:34 UTC (permalink / raw)
  To: konrad.wilk, roger.pau, axboe
  Cc: sj38.park, xen-devel, linux-block, linux-kernel, SeongJae Park

Each `blkif` has a free pages pool for the grant mapping.  The size of
the pool starts from zero and is increased on demand while I/O requests
are processed.  When the handling of the current I/O requests is
finished, or 100 milliseconds have passed since the last I/O request
was handled, blkback checks the pool and shrinks it so that it does not
exceed the size limit, `max_buffer_pages`.
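
For reference, the on-demand growth works roughly as below.  This is a
simplified sketch modeled on `get_free_page()` in
drivers/block/xen-blkback/blkback.c, so details may differ from the
exact code:

    /*
     * Simplified sketch: pop a page from the per-ring free pool, or
     * allocate a fresh granted page when the pool is empty.
     */
    static int get_free_page(struct xen_blkif_ring *ring, struct page **page)
    {
            unsigned long flags;

            spin_lock_irqsave(&ring->free_pages_lock, flags);
            if (list_empty(&ring->free_pages)) {
                    /* Pool is empty: grow it on demand. */
                    spin_unlock_irqrestore(&ring->free_pages_lock, flags);
                    return gnttab_alloc_pages(1, page);
            }
            /* Reuse a pooled page. */
            page[0] = list_first_entry(&ring->free_pages, struct page, lru);
            list_del(&page[0]->lru);
            ring->free_pages_num--;
            spin_unlock_irqrestore(&ring->free_pages_lock, flags);
            return 0;
    }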

Therefore, `blkfront` running guests can cause memory pressure in the
`blkback` running guest by attaching an arbitrarily large number of
block devices and inducing I/O on them.  This patchset avoids such
problematic situations by shrinking the pools aggressively (beyond the
limit) for a while when memory pressure is detected.


Discussions
===========

The shrinking mechanism returns only the pages in the pool that are not
currently in use by blkback.  In other words, the pages to be shrunk
are not mapped to foreign pages.  Because this patchset changes only
the shrink limit and uses the existing shrinking mechanism as is, it
does not introduce security issues such as improper unmappings.
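
For reference, the existing mechanism works roughly as below.  This is
a condensed sketch of `shrink_free_pagepool()`; the real function
batches the frees in groups of NUM_BATCH_FREE_PAGES, so details differ
slightly.  Only pages sitting on the per-ring free list are touched,
which is why in-flight mappings cannot be affected:

    /*
     * Condensed sketch: release pooled-but-unused pages until at most
     * @num pages remain in the pool.  Pages currently mapped to
     * foreign pages are never on this list.
     */
    static void shrink_free_pagepool(struct xen_blkif_ring *ring, int num)
    {
            struct page *page;
            unsigned long flags;

            spin_lock_irqsave(&ring->free_pages_lock, flags);
            while (ring->free_pages_num > num) {
                    page = list_first_entry(&ring->free_pages,
                                            struct page, lru);
                    list_del(&page->lru);
                    ring->free_pages_num--;
                    spin_unlock_irqrestore(&ring->free_pages_lock, flags);
                    gnttab_free_pages(1, &page);
                    spin_lock_irqsave(&ring->free_pages_lock, flags);
            }
            spin_unlock_irqrestore(&ring->free_pages_lock, flags);
    }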

The first patch detects memory pressure via the kernel's shrinker
interface: blkback registers a shrinker, and whenever its callback is
invoked, the aggressive shrinking limit is kept for one millisecond
from that point.  The duration should be neither too short nor too
long.  If it is too long, the free pages pool shrinking overhead can
reduce the I/O performance.  If it is too short, blkback will not free
enough pages to reduce the memory pressure.  I set the default value to
1 millisecond because I believe that 1 millisecond is a short duration
in terms of I/O while it is a long duration in terms of memory
operations.  Also, as the original shrinking mechanism runs every 100
milliseconds, this could be a somewhat reasonable choice.  In practice,
the default value worked well for my test (refer to the section below
for the details of the test).  Nevertheless, the proper duration would
depend on the given configuration and workload.  The second patch
therefore allows users to set it via a module parameter interface, as
shown below.
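
For example, once the second patch is applied, the duration could be
tuned at runtime as below (an illustrative session; the parameter name
and its default value come from the second patch):

    [dom0]# cat /sys/module/xen_blkback/parameters/aggressive_shrinking_duration
    1
    [dom0]# echo 10 > /sys/module/xen_blkback/parameters/aggressive_shrinking_duration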


Memory Pressure Test
====================

To show whether this patchset fixes the above-mentioned memory pressure
situation well, I configured a test environment.  On the `blkfront`
running guest instances of a virtualized environment, I attached an
arbitrarily large number of network-backed volume devices and induced
I/O on them.  Meanwhile, I measured the number of pages swapped in and
out on the `blkback` running guest.  The test ran twice, once with the
`blkback` before this patchset and once with that after this patchset.

Roughly speaking, this patchset reduced those numbers by about 130x
(pswpin) and 34x (pswpout), as below:

    		pswpin	pswpout
    before	76,672	185,799
    after	   587	  5,402
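
The `pswpin` and `pswpout` counters are exported via /proc/vmstat and
can be sampled as below (an illustrative command; the numbers above
would correspond to the change in these cumulative counters over each
run):

    [blkback-guest]$ grep -E '^pswp(in|out)' /proc/vmstat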


Performance Overhead Test
=========================

This patchset could incur I/O performance degradation under memory
pressure because the aggressive shrinking will require more page
allocations.  To measure the overhead, I artificially created an
aggressive pages pool shrinking situation and measured the I/O
performance of a `blkfront` running guest.

For the artificial shrinking, I set `max_buffer_pages` via the
`/sys/module/xen_blkback/parameters/max_buffer_pages` file.  I set the
value to `1024` (the default) and then to `0`.  Setting the value to
`0` incurs the worst-case aggressive shrinking stress.

For the I/O performance measurement, I used a simple `dd` command.

Default Performance
-------------------

    [dom0]# echo 1024 >  /sys/module/xen_blkback/parameters/max_buffer_pages
    [instance]$ for i in {1..5}; do dd if=/dev/zero of=file bs=4k count=$((256*512)); sync; done
    131072+0 records in
    131072+0 records out
    536870912 bytes (537 MB) copied, 11.7257 s, 45.8 MB/s
    131072+0 records in
    131072+0 records out
    536870912 bytes (537 MB) copied, 13.8827 s, 38.7 MB/s
    131072+0 records in
    131072+0 records out
    536870912 bytes (537 MB) copied, 13.8781 s, 38.7 MB/s
    131072+0 records in
    131072+0 records out
    536870912 bytes (537 MB) copied, 13.8737 s, 38.7 MB/s
    131072+0 records in
    131072+0 records out
    536870912 bytes (537 MB) copied, 13.8702 s, 38.7 MB/s

Worst-case Performance
----------------------

    [dom0]# echo 0 >  /sys/module/xen_blkback/parameters/max_buffer_pages
    [instance]$ for i in {1..5}; do dd if=/dev/zero of=file bs=4k count=$((256*512)); sync; done
    131072+0 records in
    131072+0 records out
    536870912 bytes (537 MB) copied, 11.7257 s, 45.8 MB/s
    131072+0 records in
    131072+0 records out
    536870912 bytes (537 MB) copied, 13.878 s, 38.7 MB/s
    131072+0 records in
    131072+0 records out
    536870912 bytes (537 MB) copied, 13.8746 s, 38.7 MB/s
    131072+0 records in
    131072+0 records out
    536870912 bytes (537 MB) copied, 13.8786 s, 38.7 MB/s
    131072+0 records in
    131072+0 records out
    536870912 bytes (537 MB) copied, 13.8749 s, 38.7 MB/s

In short, even the worst-case aggressive pool shrinking causes no
visible performance degradation.  I think this is because the I/O
itself is slow.  In other words, the additional page allocation
overhead is hidden under the much slower I/O time.

SeongJae Park (2):
  xen/blkback: Aggressively shrink page pools if a memory pressure is
    detected
  blkback: Add a module parameter for aggressive pool shrinking duration

 drivers/block/xen-blkback/blkback.c | 35 +++++++++++++++++++++++++++--
 1 file changed, 33 insertions(+), 2 deletions(-)

-- 
2.17.1



* [PATCH 1/2] xen/blkback: Aggressively shrink page pools if a memory pressure is detected
  2019-12-04 11:34 [PATCH 0/2] xen/blkback: Aggressively shrink page pools if a memory pressure is detected SeongJae Park
@ 2019-12-04 11:34 ` SeongJae Park
  2019-12-04 11:34 ` [PATCH 2/2] blkback: Add a module parameter for aggressive pool shrinking duration SeongJae Park
  2019-12-04 11:52 ` [Xen-devel] [PATCH 0/2] xen/blkback: Aggressively shrink page pools if a memory pressure is detected Durrant, Paul
  2 siblings, 0 replies; 5+ messages in thread
From: SeongJae Park @ 2019-12-04 11:34 UTC (permalink / raw)
  To: konrad.wilk, roger.pau, axboe
  Cc: sj38.park, xen-devel, linux-block, linux-kernel, SeongJae Park

From: SeongJae Park <sjpark@amazon.de>

Each `blkif` has a free pages pool for the grant mapping.  The size of
the pool starts from zero and is increased on demand while I/O requests
are processed.  When the handling of the current I/O requests is
finished, or 100 milliseconds have passed since the last I/O request
was handled, blkback checks the pool and shrinks it so that it does not
exceed the size limit, `max_buffer_pages`.

Therefore, `blkfront` running guests can cause memory pressure in the
`blkback` running guest by attaching an arbitrarily large number of
block devices and inducing I/O on them.  This commit avoids such
problematic situations by shrinking the pools aggressively (beyond the
limit) for a while (one millisecond) when memory pressure is detected.

Discussions
===========

The shrinking mechanism returns only the pages in the pool that are not
currently in use by blkback.  In other words, the pages to be shrunk
are not mapped to foreign pages.  Because this commit changes only the
shrink limit and uses the existing shrinking mechanism as is, it does
not introduce security issues such as improper unmappings.

This commit keeps the aggressive shrinking limit for one millisecond
from the time memory pressure is last detected.  The duration should be
neither too short nor too long.  If it is too long, the free pages pool
shrinking overhead can reduce the I/O performance.  If it is too short,
blkback will not free enough pages to reduce the memory pressure.  I
believe that one millisecond is a short duration in terms of I/O while
it is a long duration in terms of memory operations.  Also, as the
original shrinking mechanism runs every 100 milliseconds, this 1
millisecond could be a somewhat reasonable choice.  Finally, this
duration worked well in our test environment simulating the memory
pressure situation (described in detail below).

Memory Pressure Test
====================

To show whether this commit fixes the above-mentioned memory pressure
situation well, I configured a test environment.  On the `blkfront`
running guest instances of a virtualized environment, I attached an
arbitrarily large number of network-backed volume devices and induced
I/O on them.  Meanwhile, I measured the number of pages swapped in and
out on the `blkback` running guest.  The test ran twice, once with the
`blkback` before this commit and once with that after this commit.

Roughly speaking, this commit reduced those numbers by about 130x
(pswpin) and 34x (pswpout), as below:

    		pswpin	pswpout
    before	76,672	185,799
    after	   587	  5,402

Performance Overhead Test
=========================

This commit could incur I/O performance degradation under memory
pressure because the aggressive shrinking will require more page
allocations.  To measure the overhead, I artificially created an
aggressive pages pool shrinking situation and measured the I/O
performance of a `blkfront` running guest.

For the artificial shrinking, I set `max_buffer_pages` via the
`/sys/module/xen_blkback/parameters/max_buffer_pages` file.  I set the
value to `1024` (the default) and then to `0`.  Setting the value to
`0` incurs the worst-case aggressive shrinking stress.

For the I/O performance measurement, I used a simple `dd` command.

Default Performance
-------------------

    [dom0]# echo 1024 >  /sys/module/xen_blkback/parameters/max_buffer_pages
    [instance]$ for i in {1..5}; do dd if=/dev/zero of=file bs=4k count=$((256*512)); sync; done
    131072+0 records in
    131072+0 records out
    536870912 bytes (537 MB) copied, 11.7257 s, 45.8 MB/s
    131072+0 records in
    131072+0 records out
    536870912 bytes (537 MB) copied, 13.8827 s, 38.7 MB/s
    131072+0 records in
    131072+0 records out
    536870912 bytes (537 MB) copied, 13.8781 s, 38.7 MB/s
    131072+0 records in
    131072+0 records out
    536870912 bytes (537 MB) copied, 13.8737 s, 38.7 MB/s
    131072+0 records in
    131072+0 records out
    536870912 bytes (537 MB) copied, 13.8702 s, 38.7 MB/s

Worst-case Performance
----------------------

    [dom0]# echo 0 >  /sys/module/xen_blkback/parameters/max_buffer_pages
    [instance]$ for i in {1..5}; do dd if=/dev/zero of=file bs=4k count=$((256*512)); sync; done
    131072+0 records in
    131072+0 records out
    536870912 bytes (537 MB) copied, 11.7257 s, 45.8 MB/s
    131072+0 records in
    131072+0 records out
    536870912 bytes (537 MB) copied, 13.878 s, 38.7 MB/s
    131072+0 records in
    131072+0 records out
    536870912 bytes (537 MB) copied, 13.8746 s, 38.7 MB/s
    131072+0 records in
    131072+0 records out
    536870912 bytes (537 MB) copied, 13.8786 s, 38.7 MB/s
    131072+0 records in
    131072+0 records out
    536870912 bytes (537 MB) copied, 13.8749 s, 38.7 MB/s

In short, even the worst-case aggressive pool shrinking causes no
visible performance degradation.  I think this is because the I/O
itself is slow.  In other words, the additional page allocation
overhead is hidden under the much slower I/O time.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 drivers/block/xen-blkback/blkback.c | 31 +++++++++++++++++++++++++++--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index 3666afa639d1..aa1a127093e5 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -135,6 +135,27 @@ module_param(log_stats, int, 0644);
 /* Number of free pages to remove on each call to gnttab_free_pages */
 #define NUM_BATCH_FREE_PAGES 10
 
+/*
+ * Once a memory pressure is detected, keep aggressive shrinking of the free
+ * page pools for this time (msec)
+ */
+#define AGGRESSIVE_SHRINKING_DURATION	1
+
+static unsigned long xen_blk_mem_pressure_end;
+
+static unsigned long blkif_shrink_count(struct shrinker *shrinker,
+				struct shrink_control *sc)
+{
+	xen_blk_mem_pressure_end = jiffies +
+		msecs_to_jiffies(AGGRESSIVE_SHRINKING_DURATION);
+	return 0;
+}
+
+static struct shrinker blkif_shrinker = {
+	.count_objects = blkif_shrink_count,
+	.seeks = DEFAULT_SEEKS,
+};
+
 static inline bool persistent_gnt_timeout(struct persistent_gnt *persistent_gnt)
 {
 	return xen_blkif_pgrant_timeout &&
@@ -656,8 +677,11 @@ int xen_blkif_schedule(void *arg)
 			ring->next_lru = jiffies + msecs_to_jiffies(LRU_INTERVAL);
 		}
 
-		/* Shrink if we have more than xen_blkif_max_buffer_pages */
-		shrink_free_pagepool(ring, xen_blkif_max_buffer_pages);
+		/* Shrink the free pages pool if it is too large. */
+		if (time_before(jiffies, xen_blk_mem_pressure_end))
+			shrink_free_pagepool(ring, 0);
+		else
+			shrink_free_pagepool(ring, xen_blkif_max_buffer_pages);
 
 		if (log_stats && time_after(jiffies, ring->st_print))
 			print_stats(ring);
@@ -1500,6 +1524,9 @@ static int __init xen_blkif_init(void)
 	if (rc)
 		goto failed_init;
 
+	if (register_shrinker(&blkif_shrinker))
+		pr_warn("shrinker registration failed\n");
+
  failed_init:
 	return rc;
 }
-- 
2.17.1



* [PATCH 2/2] blkback: Add a module parameter for aggressive pool shrinking duration
  2019-12-04 11:34 [PATCH 0/2] xen/blkback: Aggressively shrink page pools if a memory pressure is detected SeongJae Park
  2019-12-04 11:34 ` [PATCH 1/2] " SeongJae Park
@ 2019-12-04 11:34 ` SeongJae Park
  2019-12-04 11:52 ` [Xen-devel] [PATCH 0/2] xen/blkback: Aggressively shrink page pools if a memory pressure is detected Durrant, Paul
  2 siblings, 0 replies; 5+ messages in thread
From: SeongJae Park @ 2019-12-04 11:34 UTC (permalink / raw)
  To: konrad.wilk, roger.pau, axboe
  Cc: sj38.park, xen-devel, linux-block, linux-kernel, SeongJae Park

From: SeongJae Park <sjpark@amazon.de>

As discussed in the previous commit ("xen/blkback: Aggressively shrink
page pools if a memory pressure is detected"), the aggressive pool
shrinking duration should be carefully selected:
``If it is too long, the free pages pool shrinking overhead can reduce
the I/O performance.  If it is too short, blkback will not free enough
pages to reduce the memory pressure.``

That said, the proper duration would depend on the given configuration
and workload.  For this reason, this commit allows users to set it via
a module parameter interface.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
Suggested-by: Amit Shah <aams@amazon.de>
---
 drivers/block/xen-blkback/blkback.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index aa1a127093e5..88c011300ee9 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -137,9 +137,13 @@ module_param(log_stats, int, 0644);
 
 /*
  * Once a memory pressure is detected, keep aggressive shrinking of the free
- * page pools for this time (msec)
+ * page pools for this time (milliseconds)
  */
-#define AGGRESSIVE_SHRINKING_DURATION	1
+static int xen_blkif_aggressive_shrinking_duration = 1;
+module_param_named(aggressive_shrinking_duration,
+		xen_blkif_aggressive_shrinking_duration, int, 0644);
+MODULE_PARM_DESC(aggressive_shrinking_duration,
+"Duration to do aggressive shrinking when a memory pressure is detected");
 
 static unsigned long xen_blk_mem_pressure_end;
 
@@ -147,7 +151,7 @@ static unsigned long blkif_shrink_count(struct shrinker *shrinker,
 				struct shrink_control *sc)
 {
 	xen_blk_mem_pressure_end = jiffies +
-		msecs_to_jiffies(AGGRESSIVE_SHRINKING_DURATION);
+		msecs_to_jiffies(xen_blkif_aggressive_shrinking_duration);
 	return 0;
 }
 
-- 
2.17.1



* RE: [Xen-devel] [PATCH 0/2] xen/blkback: Aggressively shrink page pools if a memory pressure is detected
  2019-12-04 11:34 [PATCH 0/2] xen/blkback: Aggressively shrink page pools if a memory pressure is detected SeongJae Park
  2019-12-04 11:34 ` [PATCH 1/2] " SeongJae Park
  2019-12-04 11:34 ` [PATCH 2/2] blkback: Add a module parameter for aggressive pool shrinking duration SeongJae Park
@ 2019-12-04 11:52 ` Durrant, Paul
  2019-12-04 12:09   ` sjpark
  2 siblings, 1 reply; 5+ messages in thread
From: Durrant, Paul @ 2019-12-04 11:52 UTC (permalink / raw)
  To: Park, Seongjae, konrad.wilk, roger.pau, axboe
  Cc: sj38.park, xen-devel, linux-block, linux-kernel, Park, Seongjae

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of
> SeongJae Park
> Sent: 04 December 2019 11:34
> To: konrad.wilk@oracle.com; roger.pau@citrix.com; axboe@kernel.dk
> Cc: sj38.park@gmail.com; xen-devel@lists.xenproject.org; linux-
> block@vger.kernel.org; linux-kernel@vger.kernel.org; Park, Seongjae
> <sjpark@amazon.com>
> Subject: [Xen-devel] [PATCH 0/2] xen/blkback: Aggressively shrink page
> pools if a memory pressure is detected
> 
> Each `blkif` has a free pages pool for the grant mapping.  The size of
> the pool starts from zero and is increased on demand while I/O
> requests are processed.  When the handling of the current I/O requests
> is finished, or 100 milliseconds have passed since the last I/O
> request was handled, blkback checks the pool and shrinks it so that it
> does not exceed the size limit, `max_buffer_pages`.
> 
> Therefore, `blkfront` running guests can cause memory pressure in the
> `blkback` running guest by attaching an arbitrarily large number of
> block devices and inducing I/O on them.

OOI... How do guests unilaterally cause the attachment of arbitrary numbers of PV devices?

  Paul



* Re: [Xen-devel] [PATCH 0/2] xen/blkback: Aggressively shrink page pools if a memory pressure is detected
  2019-12-04 11:52 ` [Xen-devel] [PATCH 0/2] xen/blkback: Aggressively shrink page pools if a memory pressure is detected Durrant, Paul
@ 2019-12-04 12:09   ` sjpark
  0 siblings, 0 replies; 5+ messages in thread
From: sjpark @ 2019-12-04 12:09 UTC (permalink / raw)
  To: Durrant, Paul, konrad.wilk, roger.pau, axboe
  Cc: sj38.park, xen-devel, linux-block, linux-kernel

On 04.12.19 12:52, Durrant, Paul wrote:
>> -----Original Message-----
>> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of
>> SeongJae Park
>> Sent: 04 December 2019 11:34
>> To: konrad.wilk@oracle.com; roger.pau@citrix.com; axboe@kernel.dk
>> Cc: sj38.park@gmail.com; xen-devel@lists.xenproject.org; linux-
>> block@vger.kernel.org; linux-kernel@vger.kernel.org; Park, Seongjae
>> <sjpark@amazon.com>
>> Subject: [Xen-devel] [PATCH 0/2] xen/blkback: Aggressively shrink page
>> pools if a memory pressure is detected
>>
>> Each `blkif` has a free pages pool for the grant mapping.  The size
>> of the pool starts from zero and is increased on demand while I/O
>> requests are processed.  When the handling of the current I/O
>> requests is finished, or 100 milliseconds have passed since the last
>> I/O request was handled, blkback checks the pool and shrinks it so
>> that it does not exceed the size limit, `max_buffer_pages`.
>>
>> Therefore, `blkfront` running guests can cause memory pressure in the
>> `blkback` running guest by attaching an arbitrarily large number of
>> block devices and inducing I/O on them.
> OOI... How do guests unilaterally cause the attachment of arbitrary numbers of PV devices?
Good point.  Many systems have limits on the maximum number of devices,
so an 'arbitrarily' large number of devices cannot really be attached;
there is an upper bound.  System administrators might be able to avoid
the memory pressure problem by setting the limit low enough, or by
giving more memory to the 'blkback' running guest.

However, many systems are also tempted to set the limit high enough to
satisfy guests, and to give only minimal memory to the 'blkback'
running guest for cost efficiency.

I believe this patchset can be helpful in such situations.

Anyway, using the term 'arbitrarily' was obviously my fault.  I will
update the description in the next version of the patchset.


Thanks,
SeongJae Park

>
>   Paul
>

