From: Nitesh Narayan Lal <nitesh@redhat.com>
To: Alexander Duyck <alexander.h.duyck@linux.intel.com>,
	David Hildenbrand <david@redhat.com>,
	Alexander Duyck <alexander.duyck@gmail.com>,
	virtio-dev@lists.oasis-open.org, kvm@vger.kernel.org,
	mst@redhat.com, dave.hansen@intel.com,
	linux-kernel@vger.kernel.org, willy@infradead.org,
	mhocko@kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org,
	mgorman@techsingularity.net, vbabka@suse.cz, osalvador@suse.de
Cc: yang.zhang.wz@gmail.com, pagupta@redhat.com,
	konrad.wilk@oracle.com, riel@surriel.com, lcapitulino@redhat.com,
	wei.w.wang@intel.com, aarcange@redhat.com, pbonzini@redhat.com,
	dan.j.williams@intel.com
Subject: Re: [PATCH v11 0/6] mm / virtio: Provide support for unused page reporting
Date: Tue, 1 Oct 2019 15:16:04 -0400
Message-ID: <8bd303a6-6e50-b2dc-19ab-4c3f176c4b02@redhat.com>
In-Reply-To: <1ea1a4e11617291062db81f65745b9c95fd0bb30.camel@linux.intel.com>


On 10/1/19 12:21 PM, Alexander Duyck wrote:
> On Tue, 2019-10-01 at 17:35 +0200, David Hildenbrand wrote:
>> On 01.10.19 17:29, Alexander Duyck wrote:
>>> This series provides an asynchronous means of reporting to a hypervisor
>>> that a guest page is no longer in use and can have the data associated
>>> with it dropped. To do this I have implemented functionality that allows
>>> for what I am referring to as unused page reporting. The advantage of
>>> unused page reporting is that we can support a significant amount of
>>> memory over-commit with improved performance as we can avoid having to
>>> write/read memory from swap as the VM will instead actively participate
>>> in freeing unused memory so it doesn't have to be written.
>>>
>>> The functionality for this is fairly simple. When enabled it will allocate
>>> statistics to track the number of reported pages in a given free area.
>>> When the number of free pages exceeds this value plus a high water value,
>>> currently 32, it will begin performing page reporting which consists of
>>> pulling non-reported pages off of the free lists of a given zone and
>>> placing them into a scatterlist. The scatterlist is then given to the page
>>> reporting device and it will perform the required action to make the pages
>>> "reported". In the case of virtio-balloon this results in the pages being
>>> madvised as MADV_DONTNEED. After this they are placed back on their
>>> original free list. If they are not merged in freeing, an additional bit is
>>> set indicating that they are a "reported" buddy page instead of a standard
>>> buddy page. The cycle then repeats with additional non-reported pages
>>> being pulled until the free areas all consist of reported pages.
>>>
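The trigger condition described in the paragraph above can be sketched roughly as follows. This is only an illustrative model, not code from the series: `struct area_stats`, `REPORTING_HWM`, and `should_start_reporting` are hypothetical names, and the only value taken from the cover letter is the high water value of 32.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the per-free-area accounting; this is not
 * the kernel's struct free_area.
 */
struct area_stats {
	unsigned long nr_free;     /* pages currently on the free list */
	unsigned long nr_reported; /* of those, pages already reported */
};

/* High water value quoted in the cover letter ("currently 32"). */
#define REPORTING_HWM 32UL

/*
 * Reporting starts only once the number of free pages exceeds the
 * number of reported pages plus the high water value.
 */
static bool should_start_reporting(const struct area_stats *s)
{
	assert(s != NULL);
	return s->nr_free > s->nr_reported + REPORTING_HWM;
}
```

With 68 reported pages, for example, 100 free pages would not start reporting under this model, while 101 would.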
>>> In order to try and keep the time needed to find a non-reported page to
>>> a minimum we maintain a "reported_boundary" pointer. This pointer is used
>>> by the get_unreported_pages iterator to determine at what point it should
>>> resume searching for non-reported pages. In order to guarantee pages do
>>> not get past the scan I have modified add_to_free_list_tail so that it
>>> will not insert pages behind the reported_boundary. Doing this allows us
>>> to keep the overhead to a minimum as re-walking the list without the
>>> boundary will result in as much as 18% additional overhead on a 32G VM.
>>>
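A minimal sketch of the boundary idea described above, assuming a simplified model in which reported pages collect at the tail of a free list and the boundary marks the first of them. All names here (`struct page_node`, `struct free_list`, `add_to_tail`, `add_reported`, `count_scan_candidates`) are hypothetical; the actual patch uses the kernel's list helpers and differs in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct page_node {
	struct page_node *prev, *next;
	bool reported;
};

struct free_list {
	struct page_node head;      /* sentinel; head.next is the first entry */
	struct page_node *boundary; /* first entry of the reported tail run, or NULL */
};

static void free_list_init(struct free_list *fl)
{
	fl->head.prev = fl->head.next = &fl->head;
	fl->head.reported = false;
	fl->boundary = NULL;
}

/* Link n immediately in front of pos. */
static void insert_before(struct page_node *n, struct page_node *pos)
{
	n->prev = pos->prev;
	n->next = pos;
	pos->prev->next = n;
	pos->prev = n;
}

/*
 * Tail insertion of an unreported page: if a boundary is set, the page
 * is placed in front of it rather than at the true tail, so it cannot
 * land behind the point the scanner has already passed.
 */
static void add_to_tail(struct free_list *fl, struct page_node *n)
{
	assert(fl != NULL && n != NULL);
	n->reported = false;
	insert_before(n, fl->boundary ? fl->boundary : &fl->head);
}

/* Return a reported page to the true tail; the first one sets the boundary. */
static void add_reported(struct free_list *fl, struct page_node *n)
{
	assert(fl != NULL && n != NULL);
	n->reported = true;
	insert_before(n, &fl->head);
	if (!fl->boundary)
		fl->boundary = n;
}

/* Entries the scanner still has to visit: everything in front of the boundary. */
static size_t count_scan_candidates(const struct free_list *fl)
{
	const struct page_node *stop = fl->boundary ? fl->boundary : &fl->head;
	size_t n = 0;

	for (const struct page_node *p = fl->head.next; p != stop; p = p->next)
		n++;
	return n;
}
```

Because every tail insertion stops in front of the boundary in this model, the scanner never has to re-walk the reported entries to find newly freed pages.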
>>>
> <snip>
>
>>> As far as possible regressions I have focused on cases where performing
>>> the hinting would be non-optimal, such as cases where the code isn't
>>> needed as memory is not over-committed, or the functionality is not in
>>> use. I have been using the will-it-scale/page_fault1 test running with 16
>>> vcpus and have modified it to use Transparent Huge Pages. With this I see
>>> almost no difference with the patches applied and the feature disabled.
>>> Likewise I see almost no difference with the feature enabled, but the
>>> madvise disabled in the hypervisor due to a device being assigned. With
>>> the feature fully enabled in both guest and hypervisor I see a regression
>>> between -1.86% and -8.84% versus the baseline. I found that most of the
>>> overhead was due to the page faulting/zeroing that comes as a result of
>>> the pages having been evicted from the guest.
>> I think Michal asked for a performance comparison against Nitesh's
>> approach, to evaluate if keeping the reported state + tracking inside
>> the buddy is really worth it. Do you have any such numbers already? (or
>> did my tired eyes miss them in this cover letter? :/)
>>
> I thought Michal was asking what the benefit of using the boundary
> pointer was. I added a bit up above and to the description for patch 3:
> on a 32G VM it adds up to about an 18% difference, without factoring in
> the page faulting and zeroing logic that occurs when we actually do the
> madvise.
>
> Do we have a working patch set for Nitesh's code? The last time I tried
> running his patch set I ran into issues with kernel panics. If we have a
> known working/stable patch set I can give it a try.

Did you try the v12 patch set [1]?
I remember that you reported a CPU stall issue, which I fixed in v12.

[1] https://lkml.org/lkml/2019/8/12/593

>
> - Alex
>
-- 
Thanks
Nitesh
