From: Christian Borntraeger <borntraeger@de.ibm.com>
To: Ulrich Weigand <uweigand@de.ibm.com>,
	Dave Hansen <dave.hansen@intel.com>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>,
	viro@zeniv.linux.org.uk, david@redhat.com,
	akpm@linux-foundation.org, aarcange@redhat.com,
	linux-mm@kvack.org, frankja@linux.ibm.com, sfr@canb.auug.org.au,
	jhubbard@nvidia.com, linux-kernel@vger.kernel.org,
	linux-s390@vger.kernel.org, jack@suse.cz, kirill@shutemov.name,
	peterz@infradead.org, sean.j.christopherson@intel.com,
	Ulrich.Weigand@de.ibm.com
Subject: Re: [PATCH v2 1/1] fs/splice: add missing callback for inaccessible pages
Date: Tue, 5 May 2020 16:01:07 +0200
Message-ID: <fd300dca-f0b4-ce3b-4a97-244030624fbd@de.ibm.com>
In-Reply-To: <20200505135556.GA9920@oc3748833570.ibm.com>



On 05.05.20 15:55, Ulrich Weigand wrote:
> On Tue, May 05, 2020 at 05:34:45AM -0700, Dave Hansen wrote:
>> On 5/4/20 6:41 AM, Ulrich Weigand wrote:
>>> You're right that there is no mechanism to prevent new references,
>>> but that's really never been the goal either.  We're simply trying
>>> to ensure that no I/O is ever done on a page that is in the "secure"
>>> (or inaccessible) state.  To do so, we rely on the assumption that
>>> all code that starts I/O on a page cache page will *first*:
>>> - mark the page as pending I/O by either taking an extra page
>>>   count, or by setting the Writeback flag; then:
>>> - call arch_make_page_accessible(); then:
>>> - start I/O; and only after I/O has finished:
>>> - remove the "pending I/O" marker (Writeback and/or extra ref)
>>
>> Let's ignore writeback for a moment because get_page() is the more
>> general case.  The locking sequence is:
>>
>> 1. get_page() (or equivalent) "locks out" a page from converting to
>>    inaccessible,
>> 2. followed by a make_page_accessible(), which guarantees that the page
>>    *stays* accessible until
>> 3. I/O is safe in this region
>> 4. put_page(), removes the "lock out", I/O now unsafe
> 
> Yes, exactly.
> 
>> The key is, though, the get_page() must happen before
>> make_page_accessible() and *every* place that acquires a new reference
>> needs a make_page_accessible().
> 
> Well, sort of: every place that acquires a new reference *and then
> performs I/O* needs a make_page_accessible().  There seem to be a
> lot of plain get_page() calls that aren't related to I/O.
> 
>> try_get_page() is obviously one of those "new reference sites" and it
>> only has one call site outside of the gup code: generic_pipe_buf_get(),
>> which is effectively patched by the patch that started this thread.  The
>> fact that this one oddball site _and_ gup are patched now is a good sign.
>>
>> *But*, I still don't know how that could work nicely:
>>
>>> static inline __must_check bool try_get_page(struct page *page)
>>> {
>>>         page = compound_head(page);
>>>         if (WARN_ON_ONCE(page_ref_count(page) <= 0))
>>>                 return false;
>>>         page_ref_inc(page);
>>>         return true;
>>> }
>>
>> If try_get_page() collides with a freeze_page_refs(), it'll hit the
>> WARN_ON_ONCE(), which is surely there for a good reason.  I'm not sure
>> that warning is _actually_ valid since freeze_page_refs() isn't truly a
>> 0 refcount.  But, the fact that this hasn't been encountered means that
>> the testing here is potentially lacking.
> 
> This is indeed interesting.  In particular if you compare try_get_page
> with try_get_compound_head in gup.c, which does instead:
> 
>         if (WARN_ON_ONCE(page_ref_count(head) < 0))
>                 return NULL;
> 
> which seems more reasonable to me, given the presence of the
> page_ref_freeze method.  So I'm not sure why try_get_page has <= 0.


I just looked at commit 88b1a17dfc3ed7728316478fae0f5ad508f50397
("mm: add 'try_get_page()' helper function"), which says:
    Also like 'get_page()', you can't use this function unless you already
    had a reference to the page.  The intent is that you can use this
    exactly like get_page(), but in situations where you want to limit the
    maximum reference count.
    
    The code currently does an unconditional WARN_ON_ONCE() if we ever hit
    the reference count issues (either zero or negative), as a notification
    that the conditional non-increment actually happened.

If try_get_page must not be called without an existing reference, that
means that whenever we call it the page reference count is already elevated
by the caller's reference, so our freeze (which expects a specific, lower
count) will never succeed. That would imply that we cannot trigger this
warning. No?
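
To make the argument concrete, here is a minimal user-space sketch, not
kernel code. It assumes that page_ref_freeze() behaves like a
compare-and-exchange from an expected count to 0, and it copies the "<= 0"
check from the try_get_page() quoted above. All toy_* names are hypothetical
stand-ins invented for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct page: only the refcount is modeled. */
struct toy_page {
	atomic_int refcount;
};

/*
 * Assumed model of page_ref_freeze(): succeeds only if the count is
 * exactly 'expected', and in that case replaces it with 0.
 */
static bool toy_ref_freeze(struct toy_page *p, int expected)
{
	int old = expected;

	return atomic_compare_exchange_strong(&p->refcount, &old, 0);
}

/* Assumed model of page_ref_unfreeze(): restores the count. */
static void toy_ref_unfreeze(struct toy_page *p, int count)
{
	atomic_store(&p->refcount, count);
}

/*
 * Model of the quoted try_get_page(): refuses to take a reference when
 * the count is <= 0. The kernel would hit WARN_ON_ONCE() on that path.
 */
static bool toy_try_get_page(struct toy_page *p)
{
	if (atomic_load(&p->refcount) <= 0)
		return false;
	atomic_fetch_add(&p->refcount, 1);
	return true;
}

/* Walks through both cases and prints what happened. */
static void toy_demo(void)
{
	struct toy_page held, unheld;

	/* Case 1: one "owner" reference plus the caller's own reference. */
	atomic_init(&held.refcount, 2);
	bool frozen = toy_ref_freeze(&held, 1);	/* expects only the owner */
	bool got = toy_try_get_page(&held);
	assert(!frozen && got);
	printf("caller holds a ref: freeze=%d try_get=%d\n", frozen, got);

	/* Case 2: caller violates the rule and holds no reference. */
	atomic_init(&unheld.refcount, 1);
	frozen = toy_ref_freeze(&unheld, 1);
	got = toy_try_get_page(&unheld);	/* sees 0: the warning path */
	assert(frozen && !got);
	printf("caller holds no ref: freeze=%d try_get=%d\n", frozen, got);
	toy_ref_unfreeze(&unheld, 1);
}
```

Calling toy_demo() from a small main() shows the two cases: as long as the
caller's own reference keeps the count above what the freeze expects, the
freeze fails and the "<= 0" branch is unreachable. The branch is only taken
in the second case, where the caller breaks the "must already hold a
reference" rule.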
