From: SeongJae Park <sj38.park@gmail.com>
To: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: jgross@suse.com, axboe@kernel.dk, sjpark@amazon.com,
	konrad.wilk@oracle.com, pdurrant@amazon.com,
	SeongJae Park <sjpark@amazon.de>,
	linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
	xen-devel@lists.xenproject.org
Subject: Re: [Xen-devel] [PATCH v7 2/3] xen/blkback: Squeeze page pools if a memory pressure is detected
Date: Thu, 12 Dec 2019 17:15:37 +0100
Message-ID: <20191212161537.10756-1-sj38.park@gmail.com>
In-Reply-To: <20191212152317.GE11756@Air-de-Roger>

On Thu, 12 Dec 2019 16:23:17 +0100 "Roger Pau Monné" <roger.pau@citrix.com> wrote:

> > On Thu, 12 Dec 2019 12:42:47 +0100 "Roger Pau Monné" <roger.pau@citrix.com> wrote:
> > > > On the slow block device
> > > > ------------------------
> > > > 
> > > >     max_pgs   Min       Max       Median     Avg    Stddev
> > > >     0         38.7      45.8      38.7       40.12  3.1752165
> > > >     1024      38.7      45.8      38.7       40.12  3.1752165
> > > >     No difference proven at 95.0% confidence
> > > > 
> > > > On the fast block device
> > > > ------------------------
> > > > 
> > > >     max_pgs   Min       Max       Median     Avg    Stddev
> > > >     0         417       423       420        419.4  2.5099801
> > > >     1024      414       425       416        417.8  4.4384682
> > > >     No difference proven at 95.0% confidence
> > > 
> > > This is intriguing, as it seems to prove that the usage of a cache of
> > > free pages is irrelevant performance-wise.
> > > 
> > > The pool of free pages was introduced long ago, and it's possible that
> > > recent improvements to the balloon driver have made such a pool useless,
> > > in which case it could be removed instead of worked around.
> > 
> > I guess the grant page allocation overhead in this test scenario is really
> > small.  In the absence of memory pressure, fragmentation, and NUMA
> > imbalance, the latency of the page allocation ('get_page()') is very short,
> > as it will succeed in the fast path.
> 
> The allocation of the pool of free pages involves more than get_page():
> it uses gnttab_alloc_pages(), which in the worst case will allocate a
> page and balloon it out, issuing one hypercall.
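
(Just to illustrate that point for anyone skimming: below is a very rough,
simplified sketch of the worst-case path described above.  The helper names
are hypothetical and are not the actual grant-table internals.)

    /*
     * Hypothetical, simplified sketch of the worst-case
     * gnttab_alloc_pages() path.  Helper names are illustrative only,
     * not the real kernel code.
     */
    static int alloc_grantable_page_sketch(struct page **page)
    {
            /* Fast path: reuse a page that is already ballooned out. */
            *page = take_cached_ballooned_page();   /* hypothetical helper */
            if (*page)
                    return 0;

            /* Slow path: allocate a fresh page ... */
            *page = alloc_page(GFP_KERNEL);
            if (!*page)
                    return -ENOMEM;

            /* ... and balloon it out, which issues a hypercall
             * (XENMEM_decrease_reservation) to hand the backing frame
             * over to Xen. */
            return balloon_out_page(*page);         /* hypothetical helper */
    }
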
> 
> > A few years ago, I once measured the page allocation latency on my
> > machine.  Roughly speaking, it was about 1us in the best case, 100us in
> > the worst case, and 5us on average.  Please keep in mind that the
> > measurement was not designed or performed in a serious way, so the results
> > could include profiling overhead.  With that in mind, let's simply take
> > the numbers at face value and ignore the latency of the block layer,
> > blkback itself (including the grant mapping), and everything else such as
> > context switches and cache misses; in other words, suppose the grant page
> > allocation is the only source of overhead.  Then blkback would be able to
> > achieve 1 million IOPS (4KB * 1 MIOPS = 4 GB/s) in the best case, 200
> > thousand IOPS (800 MB/s) on average, and 10 thousand IOPS (40 MB/s) in the
> > worst case.  Based on this coarse calculation, I think the test results
> > are reasonable.
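
(To spell out that arithmetic, here is a tiny stand-alone user space
illustration of the same back-of-envelope calculation; it is not part of the
patch, just the numbers above in code form.)

    /* Back-of-envelope: if grant page allocation were the only overhead,
     * IOPS = 1 / latency and bandwidth = IOPS * 4KB.  Illustration only. */
    #include <stdio.h>

    int main(void)
    {
            const double latency_us[] = { 1.0, 5.0, 100.0 }; /* best, avg, worst */
            const char *label[] = { "best", "average", "worst" };

            for (int i = 0; i < 3; i++) {
                    double iops = 1e6 / latency_us[i];  /* allocations per second */
                    double mbps = iops * 4096.0 / 1e6;  /* 4KB per I/O, in MB/s */
                    printf("%-8s %6.0fus -> %8.0f IOPS, %6.0f MB/s\n",
                           label[i], latency_us[i], iops, mbps);
            }
            return 0;
    }
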
> > 
> > This also means that the effect of blkback's free pages pool might be
> > visible when the page allocation fast path fails.  Nevertheless, it would
> > also be hard to measure that at a micro level unless the measurement is
> > well designed and controlled.
> > 
> > > 
> > > Do you think you could perform some more tests (as pointed out above
> > > against the block device to skip the fs overhead) and report back the
> > > results?
> > 
> > To be honest, I'm not sure whether additional tests are really necessary,
> > because I think the `dd` test and the explanation of the results already
> > make sense and provide a minimal proof of concept.  Also, this change is a
> > fallback for the memory pressure situation, which is an error path from
> > some point of view.  Such an erroneous situation might not happen
> > frequently, and if it is not resolved in a short time, something much
> > worse than a temporary I/O performance degradation (e.g., an OOM kill of
> > the user space Xen control processes) could happen.  Thus, I'm not sure
> > whether such a detailed performance measurement is necessary for this rare
> > error handling change.
> 
> Right, my main concern is that we seem to be adding duct tape so
> things don't fall apart, but if such a cache is really not beneficial
> from a performance PoV I would rather see it go away than add more
> stuff to it in order to work around corner cases like memory
> starvation.

Right, if the cache really gives no benefit, it would be much better to
simply remove it.  However, as mentioned before, I'm not sure whether it is
completely useless.  Maybe we could do some more detailed tests to find out,
but that would be out of the scope of this patch.

> 
> Anyway, I guess we can take such a change, but long term we need to look
> into fixing grants to not use ballooned pages, and figure out if the
> blkback free page cache is really useful or not.

Totally agreed.


Thanks,
SeongJae Park

> 
> Thanks, Roger.
> 
