From: Mel Gorman <mgorman@techsingularity.net>
To: Alexander Duyck <alexander.duyck@gmail.com>
Cc: kvm@vger.kernel.org, david@redhat.com, mst@redhat.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, yang.zhang.wz@gmail.com,
	pagupta@redhat.com, konrad.wilk@oracle.com, nitesh@redhat.com,
	riel@surriel.com, willy@infradead.org, lcapitulino@redhat.com,
	dave.hansen@intel.com, wei.w.wang@intel.com, aarcange@redhat.com,
	pbonzini@redhat.com, dan.j.williams@intel.com, mhocko@kernel.org,
	alexander.h.duyck@linux.intel.com, vbabka@suse.cz, osalvador@suse.de
Subject: Re: [PATCH v17 4/9] mm: Introduce Reported pages
Date: Wed, 19 Feb 2020 14:55:11 +0000
Message-ID: <20200219145511.GS3466@techsingularity.net>
In-Reply-To: <20200211224635.29318.19750.stgit@localhost.localdomain>

On Tue, Feb 11, 2020 at 02:46:35PM -0800, Alexander Duyck wrote:
> diff --git a/mm/page_reporting.c b/mm/page_reporting.c
> new file mode 100644
> index 000000000000..1047c6872d4f
> --- /dev/null
> +++ b/mm/page_reporting.c
> @@ -0,0 +1,319 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <linux/mm.h>
> +#include <linux/mmzone.h>
> +#include <linux/page_reporting.h>
> +#include <linux/gfp.h>
> +#include <linux/export.h>
> +#include <linux/delay.h>
> +#include <linux/scatterlist.h>
> +
> +#include "page_reporting.h"
> +#include "internal.h"
> +
> +#define PAGE_REPORTING_DELAY	(2 * HZ)

I assume there is nothing special about 2 seconds other than "do some
progress every so often".

> +static struct page_reporting_dev_info __rcu *pr_dev_info __read_mostly;
> +
> +enum {
> +	PAGE_REPORTING_IDLE = 0,
> +	PAGE_REPORTING_REQUESTED,
> +	PAGE_REPORTING_ACTIVE
> +};
> +
> +/* request page reporting */
> +static void
> +__page_reporting_request(struct page_reporting_dev_info *prdev)
> +{
> +	unsigned int state;
> +
> +	/* Check to see if we are in desired state */
> +	state = atomic_read(&prdev->state);
> +	if (state == PAGE_REPORTING_REQUESTED)
> +		return;
> +
> +	/*
> +	 * If reporting is already active there is nothing we need to do.
> +	 * Test against 0 as that represents PAGE_REPORTING_IDLE.
> +	 */
> +	state = atomic_xchg(&prdev->state, PAGE_REPORTING_REQUESTED);
> +	if (state != PAGE_REPORTING_IDLE)
> +		return;
> +
> +	/*
> +	 * Delay the start of work to allow a sizable queue to build. For
> +	 * now we are limiting this to running no more than once every
> +	 * couple of seconds.
> +	 */
> +	schedule_delayed_work(&prdev->work, PAGE_REPORTING_DELAY);
> +}

Seems a fair use of atomics.

> +static int
> +page_reporting_cycle(struct page_reporting_dev_info *prdev, struct zone *zone,
> +		     unsigned int order, unsigned int mt,
> +		     struct scatterlist *sgl, unsigned int *offset)
> +{
> +	struct free_area *area = &zone->free_area[order];
> +	struct list_head *list = &area->free_list[mt];
> +	unsigned int page_len = PAGE_SIZE << order;
> +	struct page *page, *next;
> +	int err = 0;
> +
> +	/*
> +	 * Perform early check, if free area is empty there is
> +	 * nothing to process so we can skip this free_list.
> +	 */
> +	if (list_empty(list))
> +		return err;
> +
> +	spin_lock_irq(&zone->lock);
> +
> +	/* loop through free list adding unreported pages to sg list */
> +	list_for_each_entry_safe(page, next, list, lru) {
> +		/* We are going to skip over the reported pages. */
> +		if (PageReported(page))
> +			continue;
> +
> +		/* Attempt to pull page from list */
> +		if (!__isolate_free_page(page, order))
> +			break;
> +

Might want to note that you are breaking because the only reason to fail
the isolation is that watermarks are not met and we are likely under
memory pressure. It's not a big issue.

However, while I think this is correct, it's hard to follow. This loop can
be broken out of with pages still on the scatter/gather list. The current
flow guarantees that err will not be set at this point so the caller
cleans it up so we always drain the list either here or in the caller.

While I think it works, it's a bit fragile. I recommend putting a comment
above this noting why it's safe and put a VM_WARN_ON_ONCE(err) before the
break in case someone tries to change this in a year's time and does not
spot that the flow to reach page_reporting_drain *somewhere* is critical.

> +		/* Add page to scatter list */
> +		--(*offset);
> +		sg_set_page(&sgl[*offset], page, page_len, 0);
> +
> +		/* If scatterlist isn't full grab more pages */
> +		if (*offset)
> +			continue;
> +
> +		/* release lock before waiting on report processing */
> +		spin_unlock_irq(&zone->lock);
> +
> +		/* begin processing pages in local list */
> +		err = prdev->report(prdev, sgl, PAGE_REPORTING_CAPACITY);
> +
> +		/* reset offset since the full list was reported */
> +		*offset = PAGE_REPORTING_CAPACITY;
> +
> +		/* reacquire zone lock and resume processing */
> +		spin_lock_irq(&zone->lock);
> +
> +		/* flush reported pages from the sg list */
> +		page_reporting_drain(prdev, sgl, PAGE_REPORTING_CAPACITY, !err);
> +
> +		/*
> +		 * Reset next to first entry, the old next isn't valid
> +		 * since we dropped the lock to report the pages
> +		 */
> +		next = list_first_entry(list, struct page, lru);
> +
> +		/* exit on error */
> +		if (err)
> +			break;
> +	}
> +
> +	spin_unlock_irq(&zone->lock);
> +
> +	return err;
> +}

I complained about the use of zone lock before but in this version, I
think I'm ok with it. The lock is held for the free list manipulations
which is what it's for. The state management with atomics seems
reasonable.

Otherwise I think this is ok and I think the implementation is right. Of
great importance to me was the allocator fast paths but they seem to be
adequately protected by a static branch so

Acked-by: Mel Gorman <mgorman@techsingularity.net>

The ack applies regardless of whether you decide to document and
defensively protect page_reporting_cycle against losing pages on the
scatter/gather list but I do recommend it.

-- 
Mel Gorman
SUSE Labs
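
For readers following the thread, the defensive check recommended above for page_reporting_cycle can be illustrated outside the kernel. What follows is only a userspace sketch under stated assumptions, not the patch's code: every `sim_*` / `SIM_*` name is hypothetical, `SIM_WARN_ON_ONCE` is a rough stand-in for `VM_WARN_ON_ONCE`, isolation failure is modelled by a simple budget counter, and the caller's handling of leftover pages is assumed rather than taken from the patch. The point it shows is the invariant from the review: breaking out with pages still batched is only safe while err is 0, because that is what lets the caller drain them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define SIM_CAPACITY	4	/* stand-in for PAGE_REPORTING_CAPACITY */
#define SIM_MAX_PAGES	16

/* Hypothetical userspace stand-in for VM_WARN_ON_ONCE(): print on first hit. */
static int sim_warnings;
#define SIM_WARN_ON_ONCE(cond)						\
	do {								\
		if ((cond) && !sim_warnings++)				\
			fprintf(stderr, "WARN: %s\n", #cond);		\
	} while (0)

struct sim_zone {
	bool reported[SIM_MAX_PAGES];	/* models PageReported() */
	bool isolated[SIM_MAX_PAGES];	/* page is currently off the free list */
	int nr_pages;
	int isolate_budget;	/* isolations that succeed before failing */
	int report_err;		/* value the simulated report() returns */
};

/* Put batch[from..SIM_CAPACITY) back, optionally marking pages reported. */
static void sim_drain(struct sim_zone *z, const int *batch, int from, bool mark)
{
	assert(from >= 0 && from <= SIM_CAPACITY);
	for (int i = from; i < SIM_CAPACITY; i++) {
		z->isolated[batch[i]] = false;
		if (mark)
			z->reported[batch[i]] = true;
	}
}

static int sim_cycle(struct sim_zone *z, int *batch, int *offset)
{
	int err = 0;

	for (int p = 0; p < z->nr_pages; p++) {
		if (z->reported[p])
			continue;

		if (z->isolate_budget == 0) {
			/*
			 * Isolation failed. Pages may still sit in batch[];
			 * that is only safe because err == 0 here, so the
			 * caller goes on to drain the leftovers.
			 */
			SIM_WARN_ON_ONCE(err);
			break;
		}
		z->isolate_budget--;
		z->isolated[p] = true;
		batch[--(*offset)] = p;

		/* Batch not full yet, keep collecting. */
		if (*offset)
			continue;

		/* Batch full: report it, then drain it before moving on. */
		err = z->report_err;
		*offset = SIM_CAPACITY;
		sim_drain(z, batch, 0, !err);
		if (err)
			break;
	}
	return err;
}

/* Assumed caller behaviour: report and drain whatever a clean cycle left. */
static int sim_process(struct sim_zone *z)
{
	int batch[SIM_CAPACITY];
	int offset = SIM_CAPACITY;
	int err = sim_cycle(z, batch, &offset);

	if (!err && offset < SIM_CAPACITY) {
		err = z->report_err;
		sim_drain(z, batch, offset, !err);
	}
	return err;
}

/* Number of pages still held off the free list; should be 0 after a pass. */
static int sim_count_isolated(const struct sim_zone *z)
{
	int n = 0;

	for (int p = 0; p < z->nr_pages; p++)
		n += z->isolated[p];
	return n;
}
```

With six pages and an isolation budget of five, one full batch is reported inside the loop, the fifth page is left in the batch when isolation fails, and `sim_process` drains it, so `sim_count_isolated` returns 0 and the warning never fires. If a later change let the loop reach that break with err set, the warning would flag that the leftover page is about to be lost.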