From: Jan Kara <jack@suse.cz>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-nvdimm@lists.01.org, virtio-fs@redhat.com,
miklos@szeredi.hu, stefanha@redhat.com, dgilbert@redhat.com,
Jan Kara <jack@suse.cz>
Subject: Re: [PATCH v3 02/18] dax: Create a range version of dax_layout_busy_page()
Date: Thu, 20 Aug 2020 14:58:55 +0200
Message-ID: <20200820125855.GL1902@quack2.suse.cz>
In-Reply-To: <20200819221956.845195-3-vgoyal@redhat.com>

On Wed 19-08-20 18:19:40, Vivek Goyal wrote:
> virtiofs device has a range of memory which is mapped into file inodes
> using dax. This memory is mapped in qemu on host and maps different
> sections of real file on host. Size of this memory is limited
> (determined by administrator) and depending on filesystem size, we will
> soon reach a situation where all the memory is in use and we need to
> reclaim some.
>
> As part of reclaim process, we will need to make sure that there are
> no active references to pages (taken by get_user_pages()) on the memory
> range we are trying to reclaim. I am planning to use
> dax_layout_busy_page() for this. But in current form this is per inode
> and scans through all the pages of the inode.
>
> We want to reclaim only a portion of memory (say 2MB page). So we want
> to make sure that only that 2MB range of pages do not have any
> references (and don't want to unmap all the pages of inode).
>
> Hence, create a range version of this function named
> dax_layout_busy_page_range() which can be used to pass a range which
> needs to be unmapped.
>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: linux-nvdimm@lists.01.org
> Cc: Jan Kara <jack@suse.cz>
> Cc: Vishal L Verma <vishal.l.verma@intel.com>
> Cc: "Weiny, Ira" <ira.weiny@intel.com>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> ---
> fs/dax.c | 29 +++++++++++++++++++++++------
> include/linux/dax.h | 6 ++++++
> 2 files changed, 29 insertions(+), 6 deletions(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index 95341af1a966..ddd705251d9f 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -559,7 +559,7 @@ static void *grab_mapping_entry(struct xa_state *xas,
> }
>
> /**
> - * dax_layout_busy_page - find first pinned page in @mapping
> + * dax_layout_busy_page_range - find first pinned page in @mapping
> * @mapping: address space to scan for a page with ref count > 1

Please document the additional function arguments (@start and @end) in the
kernel-doc comment. Otherwise the patch looks good, so feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

after fixing this nit.

								Honza

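To illustrate what the requested kernel-doc could look like, here is a
minimal sketch. The wording of the @start and @end descriptions is an
assumption derived from the comment the patch already adds ("Partial pages
are included..."); it is not claimed to be the text of any later revision of
the patch. The fragment is meant to be read in the context of fs/dax.c, where
struct address_space, struct page and loff_t are already defined.

```c
/**
 * dax_layout_busy_page_range - find first pinned page in @mapping
 * @mapping: address space to scan for a page with ref count > 1
 * @start: starting byte offset; the page containing @start is included
 *         (assumed wording, see note above)
 * @end: ending byte offset; the page containing @end is included. If @end
 *       is LLONG_MAX, pages from @start to the end of the file are included
 *       (assumed wording, see note above)
 */
struct page *dax_layout_busy_page_range(struct address_space *mapping,
					loff_t start, loff_t end);
```

With the arguments described this way, the separate "Partial pages are
included" paragraph further down the comment might become redundant, but
that is a judgment call for the patch author.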
> *
> * DAX requires ZONE_DEVICE mapped pages. These pages are never
> @@ -572,13 +572,19 @@ static void *grab_mapping_entry(struct xa_state *xas,
> * establishment of new mappings in this address_space. I.e. it expects
> * to be able to run unmap_mapping_range() and subsequently not race
> * mapping_mapped() becoming true.
> + *
> + * Partial pages are included. If 'end' is LLONG_MAX, pages in the range
> + * from 'start' to end of the file are included.
> */
> -struct page *dax_layout_busy_page(struct address_space *mapping)
> +struct page *dax_layout_busy_page_range(struct address_space *mapping,
> + loff_t start, loff_t end)
> {
> - XA_STATE(xas, &mapping->i_pages, 0);
> void *entry;
> unsigned int scanned = 0;
> struct page *page = NULL;
> + pgoff_t start_idx = start >> PAGE_SHIFT;
> + pgoff_t end_idx;
> + XA_STATE(xas, &mapping->i_pages, start_idx);
>
> /*
> * In the 'limited' case get_user_pages() for dax is disabled.
> @@ -589,6 +595,11 @@ struct page *dax_layout_busy_page(struct address_space *mapping)
> if (!dax_mapping(mapping) || !mapping_mapped(mapping))
> return NULL;
>
> + /* If end == LLONG_MAX, scan all pages from start to the end of the file */
> + if (end == LLONG_MAX)
> + end_idx = ULONG_MAX;
> + else
> + end_idx = end >> PAGE_SHIFT;
> /*
> * If we race get_user_pages_fast() here either we'll see the
> * elevated page count in the iteration and wait, or
> @@ -596,15 +607,15 @@ struct page *dax_layout_busy_page(struct address_space *mapping)
> * against is no longer mapped in the page tables and bail to the
> * get_user_pages() slow path. The slow path is protected by
> * pte_lock() and pmd_lock(). New references are not taken without
> - * holding those locks, and unmap_mapping_range() will not zero the
> + * holding those locks, and unmap_mapping_pages() will not zero the
> * pte or pmd without holding the respective lock, so we are
> * guaranteed to either see new references or prevent new
> * references from being established.
> */
> - unmap_mapping_range(mapping, 0, 0, 0);
> + unmap_mapping_pages(mapping, start_idx, end_idx - start_idx + 1, 0);
>
> xas_lock_irq(&xas);
> - xas_for_each(&xas, entry, ULONG_MAX) {
> + xas_for_each(&xas, entry, end_idx) {
> if (WARN_ON_ONCE(!xa_is_value(entry)))
> continue;
> if (unlikely(dax_is_locked(entry)))
> @@ -625,6 +636,12 @@ struct page *dax_layout_busy_page(struct address_space *mapping)
> xas_unlock_irq(&xas);
> return page;
> }
> +EXPORT_SYMBOL_GPL(dax_layout_busy_page_range);
> +
> +struct page *dax_layout_busy_page(struct address_space *mapping)
> +{
> + return dax_layout_busy_page_range(mapping, 0, LLONG_MAX);
> +}
> EXPORT_SYMBOL_GPL(dax_layout_busy_page);
>
> static int __dax_invalidate_entry(struct address_space *mapping,
> diff --git a/include/linux/dax.h b/include/linux/dax.h
> index 6904d4e0b2e0..9016929db4c6 100644
> --- a/include/linux/dax.h
> +++ b/include/linux/dax.h
> @@ -141,6 +141,7 @@ int dax_writeback_mapping_range(struct address_space *mapping,
> struct dax_device *dax_dev, struct writeback_control *wbc);
>
> struct page *dax_layout_busy_page(struct address_space *mapping);
> +struct page *dax_layout_busy_page_range(struct address_space *mapping, loff_t start, loff_t end);
> dax_entry_t dax_lock_page(struct page *page);
> void dax_unlock_page(struct page *page, dax_entry_t cookie);
> #else
> @@ -171,6 +172,11 @@ static inline struct page *dax_layout_busy_page(struct address_space *mapping)
> return NULL;
> }
>
> +static inline struct page *dax_layout_busy_page_range(struct address_space *mapping, pgoff_t start, pgoff_t nr_pages)
> +{
> + return NULL;
> +}
> +
> static inline int dax_writeback_mapping_range(struct address_space *mapping,
> struct dax_device *dax_dev, struct writeback_control *wbc)
> {
> --
> 2.25.4
>
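The offset-to-index arithmetic in the hunk above can be tried out in
userspace. The sketch below is only a model of that arithmetic, not kernel
code: SKETCH_PAGE_SHIFT, struct idx_range and byte_range_to_idx() are
hypothetical names made up for this illustration, and a 4 KiB page size is
assumed (the kernel's PAGE_SHIFT is architecture-dependent).

```c
#include <limits.h>

/* Assumption: 4 KiB pages. The real PAGE_SHIFT depends on the architecture. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical userspace stand-in for the index variables in the patch. */
struct idx_range {
	unsigned long start_idx;	/* first page index to scan */
	unsigned long end_idx;		/* last page index to scan, inclusive */
	unsigned long nr;		/* page count handed to unmap_mapping_pages() */
};

/* Mirrors the patch: byte offsets in, inclusive page indices out. */
static struct idx_range byte_range_to_idx(long long start, long long end)
{
	struct idx_range r;

	r.start_idx = (unsigned long)(start >> SKETCH_PAGE_SHIFT);

	/* LLONG_MAX is the "up to the end of the file" sentinel. */
	if (end == LLONG_MAX)
		r.end_idx = ULONG_MAX;
	else
		r.end_idx = (unsigned long)(end >> SKETCH_PAGE_SHIFT);

	/* Unsigned arithmetic: this wraps to 0 for the range [0, ULONG_MAX]. */
	r.nr = r.end_idx - r.start_idx + 1;
	return r;
}
```

For a 2 MiB range starting at 4 MiB, i.e. byte_range_to_idx(4LL << 20,
(6LL << 20) - 1), this yields start_idx = 1024, end_idx = 1535 and nr = 512.
For the whole-file case used by dax_layout_busy_page(), i.e. (0, LLONG_MAX),
nr wraps around to 0. As far as I can tell from the unmap_mapping_pages()
documentation in mm/memory.c, a count of 0 means "unmap to end of file", so
the wrap-around appears to be harmless there, but that is worth
double-checking against the tree the patch is based on.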
--
Jan Kara <jack@suse.com>
SUSE Labs, CR