All of lore.kernel.org
 help / color / mirror / Atom feed
* nfs_page_async_flush returning 0 for fatal errors on writeback
@ 2021-07-07 18:51 Calum Mackay
  2021-07-07 22:01 ` Trond Myklebust
  0 siblings, 1 reply; 6+ messages in thread
From: Calum Mackay @ 2021-07-07 18:51 UTC (permalink / raw)
  To: Trond Myklebust, Linux NFS Mailing List


[-- Attachment #1.1: Type: text/plain, Size: 2914 bytes --]

hi Trond,

I had a question about these two old commits of yours, from v5.0 & v5.2:

14bebe3c90b3 NFS: Don't interrupt file writeout due to fatal errors (2 
years, 2 months ago)

8fc75bed96bb NFS: Fix up return value on fatal errors in 
nfs_page_async_flush() (2 years, 5 months ago)


I am looking at a crash dump, with a kernel based on an older-still 
v4.14 stable which did not have either of the above commits.

        PANIC: "BUG: unable to handle kernel NULL pointer dereference at
0000000000000080"

     [exception RIP: _raw_spin_lock+20]

#10 [ffffb1493d78fcb8] nfs_updatepage at ffffffffc08f1791 [nfs]
#11 [ffffb1493d78fd10] nfs_write_end at ffffffffc08e094e [nfs]
#12 [ffffb1493d78fd58] generic_perform_write at ffffffffa71d458b
#13 [ffffb1493d78fde0] nfs_file_write at ffffffffc08dfdb4 [nfs]
#14 [ffffb1493d78fe18] __vfs_write at ffffffffa72848bc
#15 [ffffb1493d78fea0] vfs_write at ffffffffa7284ad2
#16 [ffffb1493d78fee0] sys_write at ffffffffa7284d35
#17 [ffffb1493d78ff28] do_syscall_64 at ffffffffa7003949

the real sequence, obscured by compiler inlining, is:

    nfs_updatepage
       nfs_writepage_setup
          nfs_setup_write_request
             nfs_inode_add_request
                spin_lock(&mapping->private_lock);

and we crash since the as mapping pointer is NULL.


I thought I was able to construct a possible sequence that would explain 
the above, if we are in (from above):

    nfs_setup_write_request
     nfs_try_to_update_request
      nfs_wb_page
       nfs_writepage_locked
        nfs_do_writepage

and nfs_page_async_flush detects a fatal server error, and calls 
nfs_write_error_remove_page, which results in the page->mapping set to NULL.

In that version of the code, without your commits above, 
nfs_page_async_flush returns 0 in this case, which I thought might 
result in nfs_setup_write_request going ahead and calling 
nfs_inode_add_request with that page, resulting in the crash seen.


I then discovered your v5.0 commit:

8fc75bed96bb NFS: Fix up return value on fatal errors in 
nfs_page_async_flush() (2 years, 5 months ago)

which appeared to correct that, having nfs_page_async_flush return the 
error in this case, so we would not end up in nfs_inode_add_request.


But I then spotted your later v5.2 commit:

14bebe3c90b3 NFS: Don't interrupt file writeout due to fatal errors (2 
years, 2 months ago)

which changes things back, so that nfs_page_async_flush now again 
returns 0, in the "launder" case, and that's how that code remains today.


If so, is there anything to stop the possible crash path that I describe 
above?


Is my theory just wrong? Is there another mechanism that prevents the 
path I suggest above? Or perhaps I'm missing another commit that stops 
it happening, even after your second commit above?


thanks very much,

cheers,
calum.



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: nfs_page_async_flush returning 0 for fatal errors on writeback
  2021-07-07 18:51 nfs_page_async_flush returning 0 for fatal errors on writeback Calum Mackay
@ 2021-07-07 22:01 ` Trond Myklebust
  2021-07-07 23:13   ` Calum Mackay
  0 siblings, 1 reply; 6+ messages in thread
From: Trond Myklebust @ 2021-07-07 22:01 UTC (permalink / raw)
  To: linux-nfs, calum.mackay

On Wed, 2021-07-07 at 19:51 +0100, Calum Mackay wrote:
> hi Trond,
> 
> I had a question about these two old commits of yours, from v5.0 &
> v5.2:
> 
> 14bebe3c90b3 NFS: Don't interrupt file writeout due to fatal errors
> (2 
> years, 2 months ago)
> 
> 8fc75bed96bb NFS: Fix up return value on fatal errors in 
> nfs_page_async_flush() (2 years, 5 months ago)
> 
> 
> I am looking at a crash dump, with a kernel based on an older-still 
> v4.14 stable which did not have either of the above commits.
> 
>         PANIC: "BUG: unable to handle kernel NULL pointer dereference
> at
> 0000000000000080"
> 
>      [exception RIP: _raw_spin_lock+20]
> 
> #10 [ffffb1493d78fcb8] nfs_updatepage at ffffffffc08f1791 [nfs]
> #11 [ffffb1493d78fd10] nfs_write_end at ffffffffc08e094e [nfs]
> #12 [ffffb1493d78fd58] generic_perform_write at ffffffffa71d458b
> #13 [ffffb1493d78fde0] nfs_file_write at ffffffffc08dfdb4 [nfs]
> #14 [ffffb1493d78fe18] __vfs_write at ffffffffa72848bc
> #15 [ffffb1493d78fea0] vfs_write at ffffffffa7284ad2
> #16 [ffffb1493d78fee0] sys_write at ffffffffa7284d35
> #17 [ffffb1493d78ff28] do_syscall_64 at ffffffffa7003949
> 
> the real sequence, obscured by compiler inlining, is:
> 
>     nfs_updatepage
>        nfs_writepage_setup
>           nfs_setup_write_request
>              nfs_inode_add_request
>                 spin_lock(&mapping->private_lock);
> 
> and we crash since the as mapping pointer is NULL.
> 
> 
> I thought I was able to construct a possible sequence that would
> explain 
> the above, if we are in (from above):
> 
>     nfs_setup_write_request
>      nfs_try_to_update_request
>       nfs_wb_page
>        nfs_writepage_locked
>         nfs_do_writepage
> 
> and nfs_page_async_flush detects a fatal server error, and calls 
> nfs_write_error_remove_page, which results in the page->mapping set
> to NULL.
> 
> In that version of the code, without your commits above, 
> nfs_page_async_flush returns 0 in this case, which I thought might 
> result in nfs_setup_write_request going ahead and calling 
> nfs_inode_add_request with that page, resulting in the crash seen.
> 
> 
> I then discovered your v5.0 commit:
> 
> 8fc75bed96bb NFS: Fix up return value on fatal errors in 
> nfs_page_async_flush() (2 years, 5 months ago)
> 
> which appeared to correct that, having nfs_page_async_flush return
> the 
> error in this case, so we would not end up in nfs_inode_add_request.
> 
> 
> But I then spotted your later v5.2 commit:
> 
> 14bebe3c90b3 NFS: Don't interrupt file writeout due to fatal errors
> (2 
> years, 2 months ago)
> 
> which changes things back, so that nfs_page_async_flush now again 
> returns 0, in the "launder" case, and that's how that code remains
> today.
> 
> 
> If so, is there anything to stop the possible crash path that I
> describe 
> above?
> 
> 
> path I suggest above? Or perhaps I'm missing another commit that
> stops 
> it happening, even after your second commit above?
> 

In order for page->mapping to get set to NULL, we'd have to be removing
the page from the page cache altogether. I'm not seeing where we'd be
doing that here. It certainly isn't possible for some third party to do
so, since our thread is holding the page lock and I'm not seeing where
the call to nfs_write_error() might be doing so.

We do call nfs_inode_remove_request(), which removes the struct
nfs_page that is tracking the page dirtiness, however it shouldn't ever
result in the removal of the pagecache page itself.

Am I misreading your email?

Cheers
  Trond
-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: nfs_page_async_flush returning 0 for fatal errors on writeback
  2021-07-07 22:01 ` Trond Myklebust
@ 2021-07-07 23:13   ` Calum Mackay
  2021-07-07 23:50     ` Trond Myklebust
  0 siblings, 1 reply; 6+ messages in thread
From: Calum Mackay @ 2021-07-07 23:13 UTC (permalink / raw)
  To: Trond Myklebust, linux-nfs; +Cc: Calum Mackay


[-- Attachment #1.1: Type: text/plain, Size: 5598 bytes --]

On 07/07/2021 11:01 pm, Trond Myklebust wrote:
> On Wed, 2021-07-07 at 19:51 +0100, Calum Mackay wrote:
>> hi Trond,
>>
>> I had a question about these two old commits of yours, from v5.0 &
>> v5.2:
>>
>> 14bebe3c90b3 NFS: Don't interrupt file writeout due to fatal errors
>> (2
>> years, 2 months ago)
>>
>> 8fc75bed96bb NFS: Fix up return value on fatal errors in
>> nfs_page_async_flush() (2 years, 5 months ago)
>>
>>
>> I am looking at a crash dump, with a kernel based on an older-still
>> v4.14 stable which did not have either of the above commits.
>>
>>          PANIC: "BUG: unable to handle kernel NULL pointer dereference
>> at
>> 0000000000000080"
>>
>>       [exception RIP: _raw_spin_lock+20]
>>
>> #10 [ffffb1493d78fcb8] nfs_updatepage at ffffffffc08f1791 [nfs]
>> #11 [ffffb1493d78fd10] nfs_write_end at ffffffffc08e094e [nfs]
>> #12 [ffffb1493d78fd58] generic_perform_write at ffffffffa71d458b
>> #13 [ffffb1493d78fde0] nfs_file_write at ffffffffc08dfdb4 [nfs]
>> #14 [ffffb1493d78fe18] __vfs_write at ffffffffa72848bc
>> #15 [ffffb1493d78fea0] vfs_write at ffffffffa7284ad2
>> #16 [ffffb1493d78fee0] sys_write at ffffffffa7284d35
>> #17 [ffffb1493d78ff28] do_syscall_64 at ffffffffa7003949
>>
>> the real sequence, obscured by compiler inlining, is:
>>
>>      nfs_updatepage
>>         nfs_writepage_setup
>>            nfs_setup_write_request
>>               nfs_inode_add_request
>>                  spin_lock(&mapping->private_lock);
>>
>> and we crash since the as mapping pointer is NULL.
>>
>>
>> I thought I was able to construct a possible sequence that would
>> explain
>> the above, if we are in (from above):
>>
>>      nfs_setup_write_request
>>       nfs_try_to_update_request
>>        nfs_wb_page
>>         nfs_writepage_locked
>>          nfs_do_writepage
>>
>> and nfs_page_async_flush detects a fatal server error, and calls
>> nfs_write_error_remove_page, which results in the page->mapping set
>> to NULL.
>>
>> In that version of the code, without your commits above,
>> nfs_page_async_flush returns 0 in this case, which I thought might
>> result in nfs_setup_write_request going ahead and calling
>> nfs_inode_add_request with that page, resulting in the crash seen.
>>
>>
>> I then discovered your v5.0 commit:
>>
>> 8fc75bed96bb NFS: Fix up return value on fatal errors in
>> nfs_page_async_flush() (2 years, 5 months ago)
>>
>> which appeared to correct that, having nfs_page_async_flush return
>> the
>> error in this case, so we would not end up in nfs_inode_add_request.
>>
>>
>> But I then spotted your later v5.2 commit:
>>
>> 14bebe3c90b3 NFS: Don't interrupt file writeout due to fatal errors
>> (2
>> years, 2 months ago)
>>
>> which changes things back, so that nfs_page_async_flush now again
>> returns 0, in the "launder" case, and that's how that code remains
>> today.
>>
>>
>> If so, is there anything to stop the possible crash path that I
>> describe
>> above?
>>
>>
>> path I suggest above? Or perhaps I'm missing another commit that
>> stops
>> it happening, even after your second commit above?
>>
> 
> In order for page->mapping to get set to NULL, we'd have to be removing
> the page from the page cache altogether. I'm not seeing where we'd be
> doing that here. It certainly isn't possible for some third party to do
> so, since our thread is holding the page lock and I'm not seeing where
> the call to nfs_write_error() might be doing so.
> 
> We do call nfs_inode_remove_request(), which removes the struct
> nfs_page that is tracking the page dirtiness, however it shouldn't ever
> result in the removal of the pagecache page itself.
> 
> Am I misreading your email?

thanks very much Trond; much more likely I am misreading the code :)


My theory was that we have nfs_page_async_flush detecting 
nfs_error_is_fatal_on_server, so calling nfs_write_error_remove_page 
(this is an older v4.14.72-ish kernel).

That would then generic_error_remove_page -> truncate_inode_page -> 
truncate_complete_page -> delete_from_page_cache

thus, as you say, removing the page from the page cache, with 
__delete_from_page_cache clearing page->mapping.


Without your v5.0 commit, nfs_page_async_flush will then return 0, via 
nfs_do_writepage, nfs_writepage_locked, nfs_wb_page to 
nfs_try_to_update_request, which then returns NULL to 
nfs_setup_write_request.

Since it did not see an error, nfs_setup_write_request will then call 
nfs_inode_add_request, which itself then dereferences the mapping:

	spin_lock(&mapping->private_lock);

which is where we crash.


Obviously, there are a number of assumptions in the above, so I thought 
it must just be a possible path the code could take?

Does that sound plausible (given that v4.14.72-ish code)?



However, I note that in a subsequent v5.2 commit:

22876f540bdf NFS: Don't call generic_error_remove_page() while holding locks

you remove the call to generic_error_remove_page from 
nfs_write_error_remove_page(), and that is itself then renamed 
nfs_write_error(), as part of a later v5.2 commit:

6fbda89b257f NFS: Replace custom error reporting mechanism with generic one


Without those commits, and also without your adjustments to 
nfs_page_async_flush I mentioned earlier, is it possible that the code 
path I present above, where the page /is/ removed from the page cache, 
could result in the crash we saw?


thanks again,
calum.

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: nfs_page_async_flush returning 0 for fatal errors on writeback
  2021-07-07 23:13   ` Calum Mackay
@ 2021-07-07 23:50     ` Trond Myklebust
  2021-07-07 23:57       ` Calum Mackay
  2021-08-06 21:25       ` Calum Mackay
  0 siblings, 2 replies; 6+ messages in thread
From: Trond Myklebust @ 2021-07-07 23:50 UTC (permalink / raw)
  To: linux-nfs, calum.mackay

On Thu, 2021-07-08 at 00:13 +0100, Calum Mackay wrote:
> On 07/07/2021 11:01 pm, Trond Myklebust wrote:
> > On Wed, 2021-07-07 at 19:51 +0100, Calum Mackay wrote:
> > > hi Trond,
> > > 
> > > I had a question about these two old commits of yours, from v5.0
> > > &
> > > v5.2:
> > > 
> > > 14bebe3c90b3 NFS: Don't interrupt file writeout due to fatal
> > > errors
> > > (2
> > > years, 2 months ago)
> > > 
> > > 8fc75bed96bb NFS: Fix up return value on fatal errors in
> > > nfs_page_async_flush() (2 years, 5 months ago)
> > > 
> > > 
> > > I am looking at a crash dump, with a kernel based on an older-
> > > still
> > > v4.14 stable which did not have either of the above commits.
> > > 
> > >          PANIC: "BUG: unable to handle kernel NULL pointer
> > > dereference
> > > at
> > > 0000000000000080"
> > > 
> > >       [exception RIP: _raw_spin_lock+20]
> > > 
> > > #10 [ffffb1493d78fcb8] nfs_updatepage at ffffffffc08f1791 [nfs]
> > > #11 [ffffb1493d78fd10] nfs_write_end at ffffffffc08e094e [nfs]
> > > #12 [ffffb1493d78fd58] generic_perform_write at ffffffffa71d458b
> > > #13 [ffffb1493d78fde0] nfs_file_write at ffffffffc08dfdb4 [nfs]
> > > #14 [ffffb1493d78fe18] __vfs_write at ffffffffa72848bc
> > > #15 [ffffb1493d78fea0] vfs_write at ffffffffa7284ad2
> > > #16 [ffffb1493d78fee0] sys_write at ffffffffa7284d35
> > > #17 [ffffb1493d78ff28] do_syscall_64 at ffffffffa7003949
> > > 
> > > the real sequence, obscured by compiler inlining, is:
> > > 
> > >      nfs_updatepage
> > >         nfs_writepage_setup
> > >            nfs_setup_write_request
> > >               nfs_inode_add_request
> > >                  spin_lock(&mapping->private_lock);
> > > 
> > > and we crash since the as mapping pointer is NULL.
> > > 
> > > 
> > > I thought I was able to construct a possible sequence that would
> > > explain
> > > the above, if we are in (from above):
> > > 
> > >      nfs_setup_write_request
> > >       nfs_try_to_update_request
> > >        nfs_wb_page
> > >         nfs_writepage_locked
> > >          nfs_do_writepage
> > > 
> > > and nfs_page_async_flush detects a fatal server error, and calls
> > > nfs_write_error_remove_page, which results in the page->mapping
> > > set
> > > to NULL.
> > > 
> > > In that version of the code, without your commits above,
> > > nfs_page_async_flush returns 0 in this case, which I thought
> > > might
> > > result in nfs_setup_write_request going ahead and calling
> > > nfs_inode_add_request with that page, resulting in the crash
> > > seen.
> > > 
> > > 
> > > I then discovered your v5.0 commit:
> > > 
> > > 8fc75bed96bb NFS: Fix up return value on fatal errors in
> > > nfs_page_async_flush() (2 years, 5 months ago)
> > > 
> > > which appeared to correct that, having nfs_page_async_flush
> > > return
> > > the
> > > error in this case, so we would not end up in
> > > nfs_inode_add_request.
> > > 
> > > 
> > > But I then spotted your later v5.2 commit:
> > > 
> > > 14bebe3c90b3 NFS: Don't interrupt file writeout due to fatal
> > > errors
> > > (2
> > > years, 2 months ago)
> > > 
> > > which changes things back, so that nfs_page_async_flush now again
> > > returns 0, in the "launder" case, and that's how that code
> > > remains
> > > today.
> > > 
> > > 
> > > If so, is there anything to stop the possible crash path that I
> > > describe
> > > above?
> > > 
> > > 
> > > path I suggest above? Or perhaps I'm missing another commit that
> > > stops
> > > it happening, even after your second commit above?
> > > 
> > 
> > In order for page->mapping to get set to NULL, we'd have to be
> > removing
> > the page from the page cache altogether. I'm not seeing where we'd
> > be
> > doing that here. It certainly isn't possible for some third party
> > to do
> > so, since our thread is holding the page lock and I'm not seeing
> > where
> > the call to nfs_write_error() might be doing so.
> > 
> > We do call nfs_inode_remove_request(), which removes the struct
> > nfs_page that is tracking the page dirtiness, however it shouldn't
> > ever
> > result in the removal of the pagecache page itself.
> > 
> > Am I misreading your email?
> 
> thanks very much Trond; much more likely I am misreading the code :)
> 
> 
> My theory was that we have nfs_page_async_flush detecting 
> nfs_error_is_fatal_on_server, so calling nfs_write_error_remove_page 
> (this is an older v4.14.72-ish kernel).
> 
> That would then generic_error_remove_page -> truncate_inode_page -> 
> truncate_complete_page -> delete_from_page_cache
> 
> thus, as you say, removing the page from the page cache, with 
> __delete_from_page_cache clearing page->mapping.
> 
> 
> Without your v5.0 commit, nfs_page_async_flush will then return 0,
> via 
> nfs_do_writepage, nfs_writepage_locked, nfs_wb_page to 
> nfs_try_to_update_request, which then returns NULL to 
> nfs_setup_write_request.
> 
> nfs_inode_add_request, which itself then dereferences the mapping:
> 
>         spin_lock(&mapping->private_lock);
> 
> which is where we crash.
> 
> 
> Obviously, there are a number of assumptions in the above, so I
> thought 
> it must just be a possible path the code could take?
> 
> Does that sound plausible (given that v4.14.72-ish code)?
> 
> 
> 
> However, I note that in a subsequent v5.2 commit:
> 
> 22876f540bdf NFS: Don't call generic_error_remove_page() while
> holding locks
> 
> you remove the call to generic_error_remove_page from 
> nfs_write_error_remove_page(), and that is itself then renamed 
> nfs_write_error(), as part of a later v5.2 commit:
> 
> 6fbda89b257f NFS: Replace custom error reporting mechanism with
> generic one
> 
> 
> Without those commits, and also without your adjustments to 
> nfs_page_async_flush I mentioned earlier, is it possible that the
> code 
> path I present above, where the page /is/ removed from the page
> cache, 
> could result in the crash we saw?
> 
> 

OK, yes that is plausible. The call to generic_error_remove_page()
would, as you said, remove the page from the page cache, and thus could
result in the crash you describe.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: nfs_page_async_flush returning 0 for fatal errors on writeback
  2021-07-07 23:50     ` Trond Myklebust
@ 2021-07-07 23:57       ` Calum Mackay
  2021-08-06 21:25       ` Calum Mackay
  1 sibling, 0 replies; 6+ messages in thread
From: Calum Mackay @ 2021-07-07 23:57 UTC (permalink / raw)
  To: Trond Myklebust, linux-nfs; +Cc: Calum Mackay


[-- Attachment #1.1: Type: text/plain, Size: 6418 bytes --]

On 08/07/2021 12:50 am, Trond Myklebust wrote:
> On Thu, 2021-07-08 at 00:13 +0100, Calum Mackay wrote:
>> On 07/07/2021 11:01 pm, Trond Myklebust wrote:
>>> On Wed, 2021-07-07 at 19:51 +0100, Calum Mackay wrote:
>>>> hi Trond,
>>>>
>>>> I had a question about these two old commits of yours, from v5.0
>>>> &
>>>> v5.2:
>>>>
>>>> 14bebe3c90b3 NFS: Don't interrupt file writeout due to fatal
>>>> errors
>>>> (2
>>>> years, 2 months ago)
>>>>
>>>> 8fc75bed96bb NFS: Fix up return value on fatal errors in
>>>> nfs_page_async_flush() (2 years, 5 months ago)
>>>>
>>>>
>>>> I am looking at a crash dump, with a kernel based on an older-
>>>> still
>>>> v4.14 stable which did not have either of the above commits.
>>>>
>>>>           PANIC: "BUG: unable to handle kernel NULL pointer
>>>> dereference
>>>> at
>>>> 0000000000000080"
>>>>
>>>>        [exception RIP: _raw_spin_lock+20]
>>>>
>>>> #10 [ffffb1493d78fcb8] nfs_updatepage at ffffffffc08f1791 [nfs]
>>>> #11 [ffffb1493d78fd10] nfs_write_end at ffffffffc08e094e [nfs]
>>>> #12 [ffffb1493d78fd58] generic_perform_write at ffffffffa71d458b
>>>> #13 [ffffb1493d78fde0] nfs_file_write at ffffffffc08dfdb4 [nfs]
>>>> #14 [ffffb1493d78fe18] __vfs_write at ffffffffa72848bc
>>>> #15 [ffffb1493d78fea0] vfs_write at ffffffffa7284ad2
>>>> #16 [ffffb1493d78fee0] sys_write at ffffffffa7284d35
>>>> #17 [ffffb1493d78ff28] do_syscall_64 at ffffffffa7003949
>>>>
>>>> the real sequence, obscured by compiler inlining, is:
>>>>
>>>>       nfs_updatepage
>>>>          nfs_writepage_setup
>>>>             nfs_setup_write_request
>>>>                nfs_inode_add_request
>>>>                   spin_lock(&mapping->private_lock);
>>>>
>>>> and we crash since the as mapping pointer is NULL.
>>>>
>>>>
>>>> I thought I was able to construct a possible sequence that would
>>>> explain
>>>> the above, if we are in (from above):
>>>>
>>>>       nfs_setup_write_request
>>>>        nfs_try_to_update_request
>>>>         nfs_wb_page
>>>>          nfs_writepage_locked
>>>>           nfs_do_writepage
>>>>
>>>> and nfs_page_async_flush detects a fatal server error, and calls
>>>> nfs_write_error_remove_page, which results in the page->mapping
>>>> set
>>>> to NULL.
>>>>
>>>> In that version of the code, without your commits above,
>>>> nfs_page_async_flush returns 0 in this case, which I thought
>>>> might
>>>> result in nfs_setup_write_request going ahead and calling
>>>> nfs_inode_add_request with that page, resulting in the crash
>>>> seen.
>>>>
>>>>
>>>> I then discovered your v5.0 commit:
>>>>
>>>> 8fc75bed96bb NFS: Fix up return value on fatal errors in
>>>> nfs_page_async_flush() (2 years, 5 months ago)
>>>>
>>>> which appeared to correct that, having nfs_page_async_flush
>>>> return
>>>> the
>>>> error in this case, so we would not end up in
>>>> nfs_inode_add_request.
>>>>
>>>>
>>>> But I then spotted your later v5.2 commit:
>>>>
>>>> 14bebe3c90b3 NFS: Don't interrupt file writeout due to fatal
>>>> errors
>>>> (2
>>>> years, 2 months ago)
>>>>
>>>> which changes things back, so that nfs_page_async_flush now again
>>>> returns 0, in the "launder" case, and that's how that code
>>>> remains
>>>> today.
>>>>
>>>>
>>>> If so, is there anything to stop the possible crash path that I
>>>> describe
>>>> above?
>>>>
>>>>
>>>> path I suggest above? Or perhaps I'm missing another commit that
>>>> stops
>>>> it happening, even after your second commit above?
>>>>
>>>
>>> In order for page->mapping to get set to NULL, we'd have to be
>>> removing
>>> the page from the page cache altogether. I'm not seeing where we'd
>>> be
>>> doing that here. It certainly isn't possible for some third party
>>> to do
>>> so, since our thread is holding the page lock and I'm not seeing
>>> where
>>> the call to nfs_write_error() might be doing so.
>>>
>>> We do call nfs_inode_remove_request(), which removes the struct
>>> nfs_page that is tracking the page dirtiness, however it shouldn't
>>> ever
>>> result in the removal of the pagecache page itself.
>>>
>>> Am I misreading your email?
>>
>> thanks very much Trond; much more likely I am misreading the code :)
>>
>>
>> My theory was that we have nfs_page_async_flush detecting
>> nfs_error_is_fatal_on_server, so calling nfs_write_error_remove_page
>> (this is an older v4.14.72-ish kernel).
>>
>> That would then generic_error_remove_page -> truncate_inode_page ->
>> truncate_complete_page -> delete_from_page_cache
>>
>> thus, as you say, removing the page from the page cache, with
>> __delete_from_page_cache clearing page->mapping.
>>
>>
>> Without your v5.0 commit, nfs_page_async_flush will then return 0,
>> via
>> nfs_do_writepage, nfs_writepage_locked, nfs_wb_page to
>> nfs_try_to_update_request, which then returns NULL to
>> nfs_setup_write_request.
>>
>> nfs_inode_add_request, which itself then dereferences the mapping:
>>
>>          spin_lock(&mapping->private_lock);
>>
>> which is where we crash.
>>
>>
>> Obviously, there are a number of assumptions in the above, so I
>> thought
>> it must just be a possible path the code could take?
>>
>> Does that sound plausible (given that v4.14.72-ish code)?
>>
>>
>>
>> However, I note that in a subsequent v5.2 commit:
>>
>> 22876f540bdf NFS: Don't call generic_error_remove_page() while
>> holding locks
>>
>> you remove the call to generic_error_remove_page from
>> nfs_write_error_remove_page(), and that is itself then renamed
>> nfs_write_error(), as part of a later v5.2 commit:
>>
>> 6fbda89b257f NFS: Replace custom error reporting mechanism with
>> generic one
>>
>>
>> Without those commits, and also without your adjustments to
>> nfs_page_async_flush I mentioned earlier, is it possible that the
>> code
>> path I present above, where the page /is/ removed from the page
>> cache,
>> could result in the crash we saw?
>>
>>
> 
> OK, yes that is plausible. The call to generic_error_remove_page()
> would, as you said, remove the page from the page cache, and thus could
> result in the crash you describe.
> 

that's great, thanks very much indeed for the confirmation, Trond.

sorry to waste your time with older code…

cheers,
calum.

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: nfs_page_async_flush returning 0 for fatal errors on writeback
  2021-07-07 23:50     ` Trond Myklebust
  2021-07-07 23:57       ` Calum Mackay
@ 2021-08-06 21:25       ` Calum Mackay
  1 sibling, 0 replies; 6+ messages in thread
From: Calum Mackay @ 2021-08-06 21:25 UTC (permalink / raw)
  To: Trond Myklebust, linux-nfs; +Cc: Calum Mackay


[-- Attachment #1.1: Type: text/plain, Size: 6761 bytes --]

hi Trond,

On 08/07/2021 12:50 am, Trond Myklebust wrote:
> On Thu, 2021-07-08 at 00:13 +0100, Calum Mackay wrote:
>> On 07/07/2021 11:01 pm, Trond Myklebust wrote:
>>> On Wed, 2021-07-07 at 19:51 +0100, Calum Mackay wrote:
>>>> hi Trond,
>>>>
>>>> I had a question about these two old commits of yours, from v5.0
>>>> &
>>>> v5.2:
>>>>
>>>> 14bebe3c90b3 NFS: Don't interrupt file writeout due to fatal
>>>> errors
>>>> (2
>>>> years, 2 months ago)
>>>>
>>>> 8fc75bed96bb NFS: Fix up return value on fatal errors in
>>>> nfs_page_async_flush() (2 years, 5 months ago)
>>>>
>>>>
>>>> I am looking at a crash dump, with a kernel based on an older-
>>>> still
>>>> v4.14 stable which did not have either of the above commits.
>>>>
>>>>           PANIC: "BUG: unable to handle kernel NULL pointer
>>>> dereference
>>>> at
>>>> 0000000000000080"
>>>>
>>>>        [exception RIP: _raw_spin_lock+20]
>>>>
>>>> #10 [ffffb1493d78fcb8] nfs_updatepage at ffffffffc08f1791 [nfs]
>>>> #11 [ffffb1493d78fd10] nfs_write_end at ffffffffc08e094e [nfs]
>>>> #12 [ffffb1493d78fd58] generic_perform_write at ffffffffa71d458b
>>>> #13 [ffffb1493d78fde0] nfs_file_write at ffffffffc08dfdb4 [nfs]
>>>> #14 [ffffb1493d78fe18] __vfs_write at ffffffffa72848bc
>>>> #15 [ffffb1493d78fea0] vfs_write at ffffffffa7284ad2
>>>> #16 [ffffb1493d78fee0] sys_write at ffffffffa7284d35
>>>> #17 [ffffb1493d78ff28] do_syscall_64 at ffffffffa7003949
>>>>
>>>> the real sequence, obscured by compiler inlining, is:
>>>>
>>>>       nfs_updatepage
>>>>          nfs_writepage_setup
>>>>             nfs_setup_write_request
>>>>                nfs_inode_add_request
>>>>                   spin_lock(&mapping->private_lock);
>>>>
>>>> and we crash since the as mapping pointer is NULL.
>>>>
>>>>
>>>> I thought I was able to construct a possible sequence that would
>>>> explain
>>>> the above, if we are in (from above):
>>>>
>>>>       nfs_setup_write_request
>>>>        nfs_try_to_update_request
>>>>         nfs_wb_page
>>>>          nfs_writepage_locked
>>>>           nfs_do_writepage
>>>>
>>>> and nfs_page_async_flush detects a fatal server error, and calls
>>>> nfs_write_error_remove_page, which results in the page->mapping
>>>> set
>>>> to NULL.
>>>>
>>>> In that version of the code, without your commits above,
>>>> nfs_page_async_flush returns 0 in this case, which I thought
>>>> might
>>>> result in nfs_setup_write_request going ahead and calling
>>>> nfs_inode_add_request with that page, resulting in the crash
>>>> seen.
>>>>
>>>>
>>>> I then discovered your v5.0 commit:
>>>>
>>>> 8fc75bed96bb NFS: Fix up return value on fatal errors in
>>>> nfs_page_async_flush() (2 years, 5 months ago)
>>>>
>>>> which appeared to correct that, having nfs_page_async_flush
>>>> return
>>>> the
>>>> error in this case, so we would not end up in
>>>> nfs_inode_add_request.
>>>>
>>>>
>>>> But I then spotted your later v5.2 commit:
>>>>
>>>> 14bebe3c90b3 NFS: Don't interrupt file writeout due to fatal
>>>> errors
>>>> (2
>>>> years, 2 months ago)
>>>>
>>>> which changes things back, so that nfs_page_async_flush now again
>>>> returns 0, in the "launder" case, and that's how that code
>>>> remains
>>>> today.
>>>>
>>>>
>>>> If so, is there anything to stop the possible crash path that I
>>>> describe
>>>> above?
>>>>
>>>>
>>>> path I suggest above? Or perhaps I'm missing another commit that
>>>> stops
>>>> it happening, even after your second commit above?
>>>>
>>>
>>> In order for page->mapping to get set to NULL, we'd have to be
>>> removing
>>> the page from the page cache altogether. I'm not seeing where we'd
>>> be
>>> doing that here. It certainly isn't possible for some third party
>>> to do
>>> so, since our thread is holding the page lock and I'm not seeing
>>> where
>>> the call to nfs_write_error() might be doing so.
>>>
>>> We do call nfs_inode_remove_request(), which removes the struct
>>> nfs_page that is tracking the page dirtiness, however it shouldn't
>>> ever
>>> result in the removal of the pagecache page itself.
>>>
>>> Am I misreading your email?
>>
>> thanks very much Trond; much more likely I am misreading the code :)
>>
>>
>> My theory was that we have nfs_page_async_flush detecting
>> nfs_error_is_fatal_on_server, so calling nfs_write_error_remove_page
>> (this is an older v4.14.72-ish kernel).
>>
>> That would then generic_error_remove_page -> truncate_inode_page ->
>> truncate_complete_page -> delete_from_page_cache
>>
>> thus, as you say, removing the page from the page cache, with
>> __delete_from_page_cache clearing page->mapping.
>>
>>
>> Without your v5.0 commit, nfs_page_async_flush will then return 0,
>> via
>> nfs_do_writepage, nfs_writepage_locked, nfs_wb_page to
>> nfs_try_to_update_request, which then returns NULL to
>> nfs_setup_write_request.
>>
>> nfs_inode_add_request, which itself then dereferences the mapping:
>>
>>          spin_lock(&mapping->private_lock);
>>
>> which is where we crash.
>>
>>
>> Obviously, there are a number of assumptions in the above, so I
>> thought
>> it must just be a possible path the code could take?
>>
>> Does that sound plausible (given that v4.14.72-ish code)?
>>
>>
>>
>> However, I note that in a subsequent v5.2 commit:
>>
>> 22876f540bdf NFS: Don't call generic_error_remove_page() while
>> holding locks
>>
>> you remove the call to generic_error_remove_page from
>> nfs_write_error_remove_page(), and that is itself then renamed
>> nfs_write_error(), as part of a later v5.2 commit:
>>
>> 6fbda89b257f NFS: Replace custom error reporting mechanism with
>> generic one
>>
>>
>> Without those commits, and also without your adjustments to
>> nfs_page_async_flush I mentioned earlier, is it possible that the
>> code
>> path I present above, where the page /is/ removed from the page
>> cache,
>> could result in the crash we saw?
>>
>>
> 
> OK, yes that is plausible. The call to generic_error_remove_page()
> would, as you said, remove the page from the page cache, and thus could
> result in the crash you describe.

Do you think that this commit ought to be considered for stable?

22876f540bdf NFS: Don't call generic_error_remove_page() while holding 
locks

otherwise, I think we are open to the crash, since the commit that 
changes nfs_page_async_flush() to again return 0:

14bebe3c90b3 NFS: Don't interrupt file writeout due to fatal errors

is itself in stable.


cheers,
calum.


-- 
Calum Mackay
Linux Kernel Engineering
Oracle Linux and Virtualisation

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-08-06 21:25 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-07 18:51 nfs_page_async_flush returning 0 for fatal errors on writeback Calum Mackay
2021-07-07 22:01 ` Trond Myklebust
2021-07-07 23:13   ` Calum Mackay
2021-07-07 23:50     ` Trond Myklebust
2021-07-07 23:57       ` Calum Mackay
2021-08-06 21:25       ` Calum Mackay

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.