From: "Wang, Wei W" <wei.w.wang@intel.com>
To: "virtio-dev@lists.oasis-open.org"
<virtio-dev@lists.oasis-open.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"virtualization@lists.linux-foundation.org"
<virtualization@lists.linux-foundation.org>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"mst@redhat.com" <mst@redhat.com>,
"mhocko@kernel.org" <mhocko@kernel.org>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>
Cc: "torvalds@linux-foundation.org" <torvalds@linux-foundation.org>,
"pbonzini@redhat.com" <pbonzini@redhat.com>,
"liliang.opensource@gmail.com" <liliang.opensource@gmail.com>,
"yang.zhang.wz@gmail.com" <yang.zhang.wz@gmail.com>,
"quan.xu0@gmail.com" <quan.xu0@gmail.com>,
"nilal@redhat.com" <nilal@redhat.com>,
"riel@redhat.com" <riel@redhat.com>,
"peterx@redhat.com" <peterx@redhat.com>
Subject: RE: [PATCH v35 1/5] mm: support to get hints of free page blocks
Date: Tue, 10 Jul 2018 10:16:57 +0000
Message-ID: <286AC319A985734F985F78AFA26841F7396E91B6@SHSMSX101.ccr.corp.intel.com>
In-Reply-To: <1531215067-35472-2-git-send-email-wei.w.wang@intel.com>
On Tuesday, July 10, 2018 5:31 PM, Wang, Wei W wrote:
> Subject: [PATCH v35 1/5] mm: support to get hints of free page blocks
>
> This patch adds support for getting free page blocks from a free page list.
> The physical addresses of the blocks are stored in a list of buffers passed
> in by the caller. The obtained free page blocks are only hints about free
> pages, because there is no guarantee that they are still on the free page
> list after the function returns.
>
> One use case for this patch is accelerating live migration by skipping the
> transfer of free pages reported by the guest. A popular method used by
> the hypervisor to track which parts of memory are written during live
> migration is to write-protect all the guest memory. So pages that are
> hinted as free but are written after this function returns will be captured
> by the hypervisor and added to the next round of memory transfer.
>
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> Signed-off-by: Liang Li <liang.z.li@intel.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> ---
> include/linux/mm.h | 3 ++
> mm/page_alloc.c | 98 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 101 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index a0fbb9f..5ce654f 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2007,6 +2007,9 @@ extern void free_area_init(unsigned long * zones_size);
> extern void free_area_init_node(int nid, unsigned long * zones_size,
> 		unsigned long zone_start_pfn, unsigned long *zholes_size);
> extern void free_initmem(void);
> +unsigned long max_free_page_blocks(int order);
> +int get_from_free_page_list(int order, struct list_head *pages,
> +			    unsigned int size, unsigned long *loaded_num);
>
> /*
> * Free reserved pages within range [PAGE_ALIGN(start), end & PAGE_MASK)
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 1521100..b67839b 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5043,6 +5043,104 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
> show_swap_cache_info();
> }
>
> +/**
> + * max_free_page_blocks - estimate the max number of free page blocks
> + * @order: the order of the free page blocks to estimate
> + *
> + * This function gives a rough estimation of the possible maximum number of
> + * free page blocks a free list may have. The estimation works on an assumption
> + * that all the system pages are on that list.
> + *
> + * Context: Any context.
> + *
> + * Return: The largest number of free page blocks that the free list can have.
> + */
> +unsigned long max_free_page_blocks(int order)
> +{
> +	return totalram_pages / (1 << order);
> +}
> +EXPORT_SYMBOL_GPL(max_free_page_blocks);
> +
> +/**
> + * get_from_free_page_list - get hints of free pages from a free page list
> + * @order: the order of the free page list to check
> + * @pages: the list of page blocks used as buffers to load the addresses
> + * @size: the size of each buffer in bytes
> + * @loaded_num: the number of addresses loaded to the buffers
> + *
> + * This function offers hints about free pages. The addresses of free page
> + * blocks are stored to the list of buffers passed from the caller. There is
> + * no guarantee that the obtained free pages are still on the free page list
> + * after the function returns. pfn_to_page on the obtained free pages is
> + * strongly discouraged and if there is an absolute need for that, make sure
> + * to contact MM people to discuss potential problems.
> + *
> + * The addresses are currently stored to a buffer in little endian. This
> + * avoids the overhead of converting endianness by the caller who needs data
> + * in the little endian format. Big endian support can be added on demand in
> + * the future.
> + *
> + * Context: Process context.
> + *
> + * Return: 0 if all the free page block addresses are stored to the buffers;
> + *         -ENOSPC if the buffers are not sufficient to store all the
> + *         addresses; or -EINVAL if an unexpected argument is received (e.g.
> + *         incorrect @order, empty buffer list).
> + */
> +int get_from_free_page_list(int order, struct list_head *pages,
> +			    unsigned int size, unsigned long *loaded_num)
> +{
Hi Linus,
We took your original suggestion: pass in pre-allocated buffers to load the addresses (we now use a list of pre-allocated page blocks as buffers). I hope that approach is still acceptable; the advantage of this method was explained here: https://lkml.org/lkml/2018/6/28/184
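To make the buffer-passing idea concrete, here is a minimal user-space sketch. It is not the code from the patch: it assumes the free block addresses are already collected in an array, and it uses a plain array of buffers instead of a struct list_head of page blocks. The names struct hint_buf, put_le64 and load_hints are hypothetical, chosen only for illustration. It shows entries being written in little endian into caller-owned buffers, with -ENOSPC when the buffers run out and -EINVAL for an empty buffer list.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one pre-allocated buffer (a page block in the patch). */
struct hint_buf {
	uint8_t *data;	/* caller-owned storage */
	size_t size;	/* capacity in bytes */
};

/* Store a 64-bit value in little-endian byte order, whatever the host endianness. */
static void put_le64(uint8_t *dst, uint64_t v)
{
	for (int i = 0; i < 8; i++)
		dst[i] = (uint8_t)(v >> (8 * i));
}

/*
 * Copy addrs[0..n_addrs) into the caller's buffers, one little-endian
 * 64-bit entry each. *loaded is set to the number of entries stored.
 * Returns 0 on success, -ENOSPC if the buffers fill up first, or
 * -EINVAL for a missing argument or an empty buffer list.
 */
static int load_hints(const uint64_t *addrs, size_t n_addrs,
		      struct hint_buf *bufs, size_t n_bufs, size_t *loaded)
{
	size_t buf = 0, off = 0;

	if (!addrs || !bufs || !loaded || n_bufs == 0)
		return -EINVAL;

	*loaded = 0;
	for (size_t i = 0; i < n_addrs; i++) {
		/* Advance to the next buffer when the current one has no room left. */
		while (buf < n_bufs && off + sizeof(uint64_t) > bufs[buf].size) {
			buf++;
			off = 0;
		}
		if (buf == n_bufs)
			return -ENOSPC;

		put_le64(bufs[buf].data + off, addrs[i]);
		off += sizeof(uint64_t);
		(*loaded)++;
	}
	return 0;
}
```

In this sketch the callee never allocates memory; the caller decides up front how much space to offer and can act on a partial result by reading *loaded when -ENOSPC comes back.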
I look forward to your feedback. Thanks.
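For sizing the buffers, the estimate in the quoted max_free_page_blocks() is plain arithmetic: total pages divided by the pages per block of the given order. The following sketch works through it in user space. The helper names est_max_blocks and est_buffers_needed are hypothetical, and the 8 bytes per stored address is an assumption made for this example rather than something the quoted text states.

```c
#include <stdint.h>

/*
 * Upper bound on the number of order-sized blocks, assuming every page
 * in the system sits on that one free list (the same assumption the
 * quoted kernel-doc comment makes).
 */
static unsigned long est_max_blocks(unsigned long total_pages,
				    unsigned int order)
{
	return total_pages / (1UL << order);
}

/*
 * Number of buffers of buf_bytes each needed to hold one address per
 * block. Assumption: each address takes sizeof(uint64_t) bytes.
 * Returns 0 if a buffer is too small to hold even one address.
 */
static unsigned long est_buffers_needed(unsigned long total_pages,
					unsigned int order,
					unsigned long buf_bytes)
{
	unsigned long entries = est_max_blocks(total_pages, order);
	unsigned long per_buf = buf_bytes / sizeof(uint64_t);

	if (per_buf == 0)
		return 0;

	/* Round up so a partially filled last buffer is still counted. */
	return (entries + per_buf - 1) / per_buf;
}
```

As a worked case, 4 GiB of RAM with 4 KiB pages is 1048576 pages. At order 10 that is at most 1024 blocks, which is 8192 bytes of addresses: two 4 KiB buffers, or a single buffer that is itself an order-10 block.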
Best,
Wei