From: Alexander Duyck <alexander.duyck@gmail.com>
To: Alexander Duyck <alexander.h.duyck@linux.intel.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Dan Williams <dan.j.williams@intel.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	David Hildenbrand <david@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Michal Hocko <mhocko@suse.com>,
	Liang Li <liliangleo@didiglobal.com>,
	linux-mm <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	virtualization@lists.linux-foundation.org
Subject: Re: [RFC v2 PATCH 0/4] speed up page allocation for __GFP_ZERO
Date: Tue, 22 Dec 2020 11:13:46 -0800
Message-ID: <CAKgT0UcT8YafkMzGLV1Bnoys4qFsJP-e9cxLUEr_xQZKn1r+bg@mail.gmail.com>
In-Reply-To: <20201221162519.GA22504@open-light-1.localdomain>

On Mon, Dec 21, 2020 at 8:25 AM Liang Li <liliang.opensource@gmail.com> wrote:
>
> The first version can be found at: https://lkml.org/lkml/2020/4/12/42
>
> Zeroing out page contents usually happens when allocating pages with
> the __GFP_ZERO flag. This is a time-consuming operation, and it makes
> populating a large VMA very slow. This patch introduces a new feature
> that zeroes out pages before page allocation, which helps speed up
> page allocation with __GFP_ZERO.
>
> My original intention for adding this feature was to shorten VM
> creation time when an SR-IOV device is attached. It works well: VM
> creation time is reduced by about 90%.
>
> Creating a VM [64G RAM, 32 CPUs] with GPU passthrough
> =====================================================
> QEMU uses 4K pages, THP is off
>                   round1      round2      round3
> w/o this patch:    23.5s       24.7s       24.6s
> w/ this patch:     10.2s       10.3s       11.2s
>
> QEMU uses 4K pages, THP is on
>                   round1      round2      round3
> w/o this patch:    17.9s       14.8s       14.9s
> w/ this patch:     1.9s        1.8s        1.9s
> =====================================================
>
> Obviously, it can do more than this. We can benefit from this feature
> in the following cases:

So I am not sure page reporting is the best thing to base this
page-zeroing setup on. The idea with page reporting is to act as a
leaky bucket and let the guest slowly drop memory it isn't using, so
that if it needs to reinflate it won't clash with the applications
that need memory. What you are doing here seems far more aggressive:
you are going down to low-order pages and sleeping instead of
rescheduling for the next time interval.
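
For illustration, here is a minimal sketch of the contrast being drawn
above; the zero_out_some_free_pages() helper and its stub body are
hypothetical, not taken from the patch, and the order, budget, and
interval values are arbitrary:

#include <linux/workqueue.h>
#include <linux/delay.h>
#include <linux/jiffies.h>
#include <linux/mmzone.h>

/*
 * Hypothetical helper: zero at most 'budget' free pages of the given
 * order. A real implementation would pull pages off the free lists,
 * clear them, and mark them as pre-zeroed; this stub just reports that
 * there is nothing left to do.
 */
static bool zero_out_some_free_pages(unsigned int order, unsigned int budget)
{
	return false;
}

static void zero_pages_work_fn(struct work_struct *work);
static DECLARE_DELAYED_WORK(zero_pages_work, zero_pages_work_fn);

/*
 * Page-reporting style ("leaky bucket"): do a bounded amount of work on
 * high-order pages, then reschedule for the next interval so allocators
 * are not starved.
 */
static void zero_pages_work_fn(struct work_struct *work)
{
	zero_out_some_free_pages(pageblock_order, 16);
	schedule_delayed_work(&zero_pages_work, msecs_to_jiffies(2000));
}

/*
 * The more aggressive pattern described above: walk down to low orders
 * and sleep inline between batches instead of deferring to a later
 * interval.
 */
static void zero_pages_aggressively(void)
{
	int order;

	for (order = MAX_ORDER - 1; order >= 0; order--) {
		while (zero_out_some_free_pages(order, 16))
			msleep(10);
	}
}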

Also, I am not sure your SR-IOV creation-time test is a good
justification for this extra overhead. With your patches applied, all
you are doing is using the free time before the test to do the page
zeroing instead of doing it during the test. As such, your CPU
overhead prior to running the test would be higher, and you haven't
captured that information.
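
If it helps, one rough way to capture that missing data point is to
sample /proc/stat around the idle window before the test, so the
background zeroing cost shows up as busy ticks; a small userspace
sketch (field layout per proc(5), 60-second window chosen arbitrarily):

#include <stdio.h>
#include <unistd.h>

static int read_cpu_times(unsigned long long *busy, unsigned long long *idle)
{
	unsigned long long user, nice, system, idl, iowait, irq, softirq, steal;
	FILE *f = fopen("/proc/stat", "r");

	if (!f)
		return -1;
	if (fscanf(f, "cpu %llu %llu %llu %llu %llu %llu %llu %llu",
		   &user, &nice, &system, &idl, &iowait, &irq,
		   &softirq, &steal) != 8) {
		fclose(f);
		return -1;
	}
	fclose(f);
	*busy = user + nice + system + irq + softirq + steal;
	*idle = idl + iowait;
	return 0;
}

int main(void)
{
	unsigned long long busy1, idle1, busy2, idle2;

	if (read_cpu_times(&busy1, &idle1))
		return 1;
	sleep(60);	/* e.g. the idle window before starting the VM-creation test */
	if (read_cpu_times(&busy2, &idle2))
		return 1;
	printf("busy ticks: %llu, idle ticks: %llu\n",
	       busy2 - busy1, idle2 - idle1);
	return 0;
}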

One thing I would be interested in seeing is what load this adds when
you run simple memory allocation/free tests on the system. For
example, it might be useful to see what the will-it-scale page_fault1
results look like with this patch applied versus not applied. I
suspect it adds a fair amount of overhead, since you have to spend a
lot of time scanning all of the pages.
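
For reference, something along the lines of the following loop is
roughly what page_fault1 exercises: mmap anonymous memory, touch one
byte per page, unmap, repeat. The region size and 10-second runtime
here are arbitrary choices for the sketch, not will-it-scale's actual
parameters:

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

#define REGION_SIZE	(128UL << 20)	/* 128 MiB per iteration */

int main(void)
{
	long page_size = sysconf(_SC_PAGESIZE);
	unsigned long pages = 0;
	struct timespec start, now;

	clock_gettime(CLOCK_MONOTONIC, &start);
	do {
		char *buf = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
				 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (buf == MAP_FAILED) {
			perror("mmap");
			return 1;
		}
		/* Touch one byte per page to trigger the fault path. */
		for (size_t off = 0; off < REGION_SIZE; off += page_size) {
			buf[off] = 1;
			pages++;
		}
		munmap(buf, REGION_SIZE);
		clock_gettime(CLOCK_MONOTONIC, &now);
	} while (now.tv_sec - start.tv_sec < 10);

	printf("faulted %lu pages in ~10s (%lu pages/s)\n",
	       pages, pages / 10);
	return 0;
}

Comparing that throughput with and without the patch applied should
make any cost of the background scanning and zeroing visible.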
