From: John Hubbard <jhubbard@nvidia.com>
To: Jan Kara <jack@suse.cz>, john.hubbard@gmail.com
Cc: Matthew Wilcox <willy@infradead.org>,
	Michal Hocko <mhocko@kernel.org>,
	Christopher Lameter <cl@linux.com>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Dan Williams <dan.j.williams@intel.com>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
	linux-rdma <linux-rdma@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v2 5/6] mm: track gup pages with page->dma_pinned_* fields
Date: Mon, 2 Jul 2018 13:43:46 -0700
Message-ID: <bb798475-ebf3-7b02-409f-8c4347fa6674@nvidia.com>
In-Reply-To: <20180702095331.n5zfz35d3invl5al@quack2.suse.cz>

On 07/02/2018 02:53 AM, Jan Kara wrote:
> On Sun 01-07-18 17:56:53, john.hubbard@gmail.com wrote:
>> From: John Hubbard <jhubbard@nvidia.com>
>>
> ...
> 
>> @@ -904,12 +907,24 @@ static inline void get_page(struct page *page)
>>  	 */
>>  	VM_BUG_ON_PAGE(page_ref_count(page) <= 0, page);
>>  	page_ref_inc(page);
>> +
>> +	if (unlikely(PageDmaPinned(page)))
>> +		__get_page_for_pinned_dma(page);
>>  }
>>  
>>  static inline void put_page(struct page *page)
>>  {
>>  	page = compound_head(page);
>>  
>> +	/* Because the page->dma_pinned_* fields are unioned with
>> +	 * page->lru, there is no way to do classical refcount-style
>> +	 * decrement-and-test-for-zero. Instead, PageDmaPinned(page) must
>> +	 * be checked, in order to safely check if we are allowed to decrement
>> +	 * page->dma_pinned_count at all.
>> +	 */
>> +	if (unlikely(PageDmaPinned(page)))
>> +		__put_page_for_pinned_dma(page);
>> +
> 
> These two are just wrong. You cannot make any page reference for
> PageDmaPinned() account against a pin count. First, it is just conceptually
> wrong as these references need not be long term pins, second, you can
> easily race like:
> 
> Pinner				Random process
> 				get_page(page)
> pin_page_for_dma()
> 				put_page(page)
> 				 -> oops, page gets unpinned too early
> 

I'll drop this approach, without mentioning any of the locking that is hiding in
there, since that was probably breaking other rules anyway. :) Thanks for your
patience in reviewing this.
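
For anyone following along, the interleaving Jan describes can be replayed in
a toy userspace model. This is only a sketch, not kernel code: struct toy_page
and the toy_* helpers are hypothetical stand-ins, the pin count is modeled as a
plain int rather than fields unioned with page->lru, and the body of the put
path is an assumption about what "unpinned too early" means in practice.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a page. Field names only mirror the patch; nothing here
 * is real kernel API. */
struct toy_page {
	int refcount;		/* models page->_refcount */
	bool dma_pinned;	/* models PageDmaPinned(page) */
	int dma_pinned_count;	/* models page->dma_pinned_count */
};

/* Flawed scheme: any get_page() on a pinned page bumps the pin count. */
static void toy_get_page(struct toy_page *p)
{
	p->refcount++;
	if (p->dma_pinned)
		p->dma_pinned_count++;
}

/* Flawed scheme: any put_page() on a pinned page drops the pin count,
 * and the page is treated as unpinned once the count reaches zero. */
static void toy_put_page(struct toy_page *p)
{
	if (p->dma_pinned && --p->dma_pinned_count == 0)
		p->dma_pinned = false;
	p->refcount--;
}

/* Hypothetical stand-in for the pinner's side (pin_page_for_dma() in
 * Jan's diagram). */
static void toy_pin_page_for_dma(struct toy_page *p)
{
	p->refcount++;
	p->dma_pinned = true;
	p->dma_pinned_count = 1;
}

/* Replays the interleaving from the review. Returns whether the page is
 * still considered pinned afterwards. */
bool toy_replay_race(void)
{
	struct toy_page page = { .refcount = 1 };

	toy_get_page(&page);		/* random process: page not pinned yet */
	toy_pin_page_for_dma(&page);	/* pinner: pin count becomes 1 */
	toy_put_page(&page);		/* random process: pin count drops to 0 */

	return page.dma_pinned;
}
```

In this model toy_replay_race() returns false: the unrelated put_page() consumed
the pinner's count even though the DMA pin was never released.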

> So you really have to create counterpart to get_user_pages() - like
> put_user_page() or whatever... It is inconvenient to have to modify all GUP
> users but I don't see a way around that. 

OK, there will be a long-ish pause, while I go visit all the gup sites. I count about
88 callers, which is not nearly as crazy as my first casual grep showed, but still
quite a chunk, since I have to track down where each one does its put_page call(s).

It's definitely worth the effort, though. These pins just plain need some special
handling in order to get everything correct.
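
A rough userspace sketch of the direction Jan suggests, under stated
assumptions: toy_get_user_page() and toy_put_user_page() are hypothetical names
for a paired pin/unpin API, and the pin count is kept as an independent int
(the real struct page has no spare field like this, which is why the patch
resorted to the union with page->lru). The point is only that ordinary
get_page()/put_page() traffic no longer touches the pin count.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a page; not the kernel's struct page. */
struct toy_page {
	int refcount;		/* models page->_refcount */
	int dma_pinned_count;	/* assumed to be a separate counter */
};

/* Ordinary references leave the pin count alone. */
static void toy_get_page(struct toy_page *p) { p->refcount++; }
static void toy_put_page(struct toy_page *p) { p->refcount--; }

/* Models get_user_pages() for a single page: take a reference and a pin. */
static void toy_get_user_page(struct toy_page *p)
{
	p->refcount++;
	p->dma_pinned_count++;
}

/* Hypothetical counterpart: only this call may drop a pin. */
static void toy_put_user_page(struct toy_page *p)
{
	p->dma_pinned_count--;
	p->refcount--;
}

static bool toy_page_dma_pinned(const struct toy_page *p)
{
	return p->dma_pinned_count > 0;
}

/* Same interleaving as in the review, followed by the pinner's release.
 * Returns true if the page stayed pinned until toy_put_user_page(). */
bool toy_replay_with_counterpart(void)
{
	struct toy_page page = { .refcount = 1 };
	bool pinned_after_random_put;

	toy_get_page(&page);		/* random process */
	toy_get_user_page(&page);	/* pinner */
	toy_put_page(&page);		/* random process: pin is untouched */
	pinned_after_random_put = toy_page_dma_pinned(&page);

	toy_put_user_page(&page);	/* pinner releases its pin */

	return pinned_after_random_put && !toy_page_dma_pinned(&page);
}
```

This is also why every gup caller has to be visited: each one must switch its
release path from put_page() to the new counterpart for the counts to balance.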


thanks,
-- 
John Hubbard
NVIDIA

