From: Jan Kara <jack@suse.cz>
To: john.hubbard@gmail.com
Cc: Matthew Wilcox <willy@infradead.org>,
	Michal Hocko <mhocko@kernel.org>,
	Christopher Lameter <cl@linux.com>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Dan Williams <dan.j.williams@intel.com>, Jan Kara <jack@suse.cz>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
	linux-rdma <linux-rdma@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org, John Hubbard <jhubbard@nvidia.com>
Subject: Re: [PATCH v2 6/6] mm: page_mkclean, ttu: handle pinned pages
Date: Mon, 2 Jul 2018 12:15:42 +0200
Message-ID: <20180702101542.fi7ndfkg5fpzodey@quack2.suse.cz>
In-Reply-To: <20180702005654.20369-7-jhubbard@nvidia.com>

On Sun 01-07-18 17:56:54, john.hubbard@gmail.com wrote:
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 9d142b9b86dc..c4bc8d216746 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -931,6 +931,7 @@ static bool hwpoison_user_mappings(struct page *p, unsigned long pfn,
>  	int kill = 1, forcekill;
>  	struct page *hpage = *hpagep;
>  	bool mlocked = PageMlocked(hpage);
> +	bool skip_pinned_pages = false;

I'm not sure we can afford to wait for page pins when handling page
poisoning. In an ideal world we should, but... I guess this is for
someone who understands memory poisoning better to judge.

> diff --git a/mm/rmap.c b/mm/rmap.c
> index 6db729dc4c50..c137c43eb2ad 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -879,6 +879,26 @@ int page_referenced(struct page *page,
>  	return pra.referenced;
>  }
>  
> +/* Must be called with pinned_dma_lock held. */
> +static void wait_for_dma_pinned_to_clear(struct page *page)
> +{
> +	struct zone *zone = page_zone(page);
> +
> +	while (PageDmaPinnedFlags(page)) {
> +		spin_unlock(zone_gup_lock(zone));
> +
> +		schedule();
> +
> +		spin_lock(zone_gup_lock(zone));
> +	}
> +}

Ouch, we definitely need something better here. Either reuse the
page_waitqueue() mechanism or at least create a global wait queue for this
(I don't expect too much contention on the waitqueue, and even if there
eventually is, we can switch to page_waitqueue() when we find it). But
polling in a loop around schedule() like this is a no-go...
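
A minimal userspace sketch of the "global wait queue" idea, for illustration
only. It assumes a pthread mutex and condition variable as stand-ins for the
kernel's wait queue (where wait_event() and wake_up() would play the same
roles); struct pinned_page and the model_*() names are hypothetical and do
not come from the patch.

```c
#include <assert.h>
#include <pthread.h>

/* One global "wait queue" shared by all pages (userspace stand-in). */
static pthread_mutex_t pin_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t pin_waitq = PTHREAD_COND_INITIALIZER;

/* Hypothetical model of a page carrying a DMA pin count. */
struct pinned_page {
	int pin_count;
};

/* Take one pin on the page. */
void model_pin(struct pinned_page *p)
{
	pthread_mutex_lock(&pin_lock);
	p->pin_count++;
	pthread_mutex_unlock(&pin_lock);
}

/* Drop one pin; wake all waiters when the last pin goes away. */
void model_unpin(struct pinned_page *p)
{
	pthread_mutex_lock(&pin_lock);
	assert(p->pin_count > 0);
	if (--p->pin_count == 0)
		pthread_cond_broadcast(&pin_waitq);
	pthread_mutex_unlock(&pin_lock);
}

/*
 * Sleep until the page has no pins. The waiter blocks on the condition
 * variable instead of spinning through unlock/schedule()/lock.
 */
void model_wait_for_unpin(struct pinned_page *p)
{
	pthread_mutex_lock(&pin_lock);
	while (p->pin_count > 0)
		pthread_cond_wait(&pin_waitq, &pin_lock);
	pthread_mutex_unlock(&pin_lock);
}
```

Because the queue is global, a wakeup for one page wakes waiters for every
page, and each of them rechecks its own condition in the while loop. That is
the contention trade-off mentioned above.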

> +
> +struct page_mkclean_info {
> +	int cleaned;
> +	int skipped;
> +	bool skip_pinned_pages;
> +};
> +
>  static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
>  			    unsigned long address, void *arg)
>  {
> @@ -889,7 +909,24 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
>  		.flags = PVMW_SYNC,
>  	};
>  	unsigned long start = address, end;
> -	int *cleaned = arg;
> +	struct page_mkclean_info *mki = (struct page_mkclean_info *)arg;
> +	bool is_dma_pinned;
> +	struct zone *zone = page_zone(page);
> +
> +	/* Serialize with get_user_pages: */
> +	spin_lock(zone_gup_lock(zone));
> +	is_dma_pinned = PageDmaPinned(page);

Hum, why do you do this for each page table the page is mapped in? Also,
the locking is IMHO going to hurt a lot and we need to avoid it.

What I think needs to happen is that in page_mkclean(), after you've
cleared all the page tables, you check PageDmaPinned() and wait if needed.
The page cannot be faulted in again as we hold the page lock, so races with
concurrent GUP are fairly limited. So with some careful ordering & memory
barriers you should be able to get away without any locking. Ditto for the
unmap path...
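
One possible reading of "careful ordering & memory barriers", sketched as a
userspace model with C11 atomics. Everything here is an assumption made for
illustration: struct order_model, the model_*() names, and in particular the
GUP-side recheck are hypothetical and are not taken from the patch. The
writeback side clears the mapping and then checks the pin. The GUP side sets
the pin and then rechecks the mapping. With a full fence between the store
and the load on each side, at least one side observes the other.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model: "mapped" stands in for a present PTE,
 * "pinned" stands in for PageDmaPinned(). */
struct order_model {
	atomic_bool mapped;
	atomic_bool pinned;
};

void model_init(struct order_model *m)
{
	atomic_init(&m->mapped, true);
	atomic_init(&m->pinned, false);
}

/* GUP side: publish the pin, then recheck the mapping.
 * Returns true if the pin was taken, false if the caller lost the race. */
bool model_gup_pin(struct order_model *m)
{
	atomic_store_explicit(&m->pinned, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* store, then load */
	if (!atomic_load_explicit(&m->mapped, memory_order_relaxed)) {
		/* Mapping already cleared: undo the pin and back off. */
		atomic_store_explicit(&m->pinned, false, memory_order_release);
		return false;
	}
	return true;
}

/* Writeback side: clear the mapping, then check for a pin.
 * Returns true if the caller must wait for the pin to go away. */
bool model_mkclean_must_wait(struct order_model *m)
{
	atomic_store_explicit(&m->mapped, false, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* store, then load */
	return atomic_load_explicit(&m->pinned, memory_order_relaxed);
}
```

The outcome this pattern is meant to rule out is model_gup_pin() returning
true while model_mkclean_must_wait() returns false for the same page. Whether
the real GUP fast path can afford such a recheck is a separate question.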

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
