All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Hubbard <jhubbard@nvidia.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: <linux-fsdevel@vger.kernel.org>,
	Dominique Martinet <asmadeus@codewreck.org>,
	<v9fs-developer@lists.sourceforge.net>,
	Steve French <sfrench@samba.org>, <linux-cifs@vger.kernel.org>,
	Miklos Szeredi <miklos@szeredi.hu>, Jeff Dike <jdike@addtoit.com>,
	Richard Weinberger <richard@nod.at>,
	<linux-um@lists.infradead.org>, Dave Kleikamp <shaggy@kernel.org>,
	<jfs-discussion@lists.sourceforge.net>,
	Trond Myklebust <trond.myklebust@hammerspace.com>,
	Anna Schumaker <anna.schumaker@netapp.com>,
	<linux-nfs@vger.kernel.org>,
	Anton Altaparmakov <anton@tuxera.com>,
	<linux-ntfs-dev@lists.sourceforge.net>,
	Mike Marshall <hubcap@omnibond.com>,
	Martin Brandenburg <martin@omnibond.com>,
	<devel@lists.orangefs.org>, Hans de Goede <hdegoede@redhat.com>
Subject: Re: set_page_dirty vs truncate
Date: Fri, 18 Dec 2020 23:04:23 -0800	[thread overview]
Message-ID: <43f05a8f-25ce-a400-5825-d8fa159ee7f6@nvidia.com> (raw)
In-Reply-To: <20201219065057.GR15600@casper.infradead.org>

On 12/18/20 10:50 PM, Matthew Wilcox wrote:
...
>>> Hmmm ... looks like __set_page_dirty_nobuffers() has a similar problem:
>>>
>>> {
>>>           lock_page_memcg(page);
>>>           if (!TestSetPageDirty(page)) {
>>>                   struct address_space *mapping = page_mapping(page);
>>>                   unsigned long flags;
>>>
>>>                   if (!mapping) {
>>>                           unlock_page_memcg(page);
>>>                           return 1;
>>>                   }
>>>
>>>                   xa_lock_irqsave(&mapping->i_pages, flags);
>>>                   BUG_ON(page_mapping(page) != mapping);
>>>
>>> sure, we check that the page wasn't truncated between set_page_dirty()
>>> and the call to TestSetPageDirty(), but we can truncate dirty pages
>>> with no problem.  So between the call to TestSetPageDirty() and
>>> the call to xa_lock_irqsave(), the page can be truncated, and the
>>> BUG_ON should fire.
>>>
>>> I haven't been able to find any examples of this, but maybe it's just a very
>>> narrow race.  Does anyone recognise this signature?  Adding the filesystems
>>> which use __set_page_dirty_nobuffers() directly without extra locking.
>>
>>
>> That sounds like the same *kind* of failure that Jan Kara and I were
>> seeing on live systems[1], that led eventually to the gup-to-pup
>> conversion exercise.
>>
>> That crash happened due to calling set_page_dirty() on pages that had no
>> buffers on them [2]. And that sounds like *exactly* the same thing as
>> calling __set_page_dirty_nobuffers() without extra locking. So I'd
>> expect that it's Just Wrong To Do, for the same reasons as Jan spells
>> out very clearly in [1].
> 
> Interesting.  It's a bit different, *but* Jan's race might be what's
> causing this symptom.  The reason is that the backtrace contains
> set_page_dirty_lock() which holds the page lock.  So there can't be
> a truncation race because truncate holds the page lock when calling
> ->invalidatepage.
> 
> That said, the syzbot reproducer doesn't have any O_DIRECT in it
> either.  So maybe this is some other race?

Jan's race can be also be reproduced *without* O_DIRECT. I first saw
it via a program that just did these steps on a normal ext4 filesystem:

a) pin ext4 file-backed pages, via get_user_pages(). Actually the way
it got here was due to using what *looked* like anonymous RAM to the
program, but was really file-backed RAM, because the admin had it
set up to mount ext4 on /tmp, instead of using tmpfs, to "save RAM",
but I digress. :)

b) wait a while, optionally do some DMA on the pages from a GPU, drink
coffee...

c) call set_pages_dirty()

d) unpin the pages

e) BUG_ON() in page_buffers().


thanks,
-- 
John Hubbard
NVIDIA

WARNING: multiple messages have this Message-ID (diff)
From: John Hubbard <jhubbard@nvidia.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Martin Brandenburg <martin@omnibond.com>,
	linux-cifs@vger.kernel.org, jfs-discussion@lists.sourceforge.net,
	Miklos Szeredi <miklos@szeredi.hu>,
	Dave Kleikamp <shaggy@kernel.org>,
	Richard Weinberger <richard@nod.at>,
	Dominique Martinet <asmadeus@codewreck.org>,
	linux-um@lists.infradead.org, linux-nfs@vger.kernel.org,
	Trond Myklebust <trond.myklebust@hammerspace.com>,
	Steve French <sfrench@samba.org>,
	linux-ntfs-dev@lists.sourceforge.net,
	Hans de Goede <hdegoede@redhat.com>,
	devel@lists.orangefs.org,
	Anna Schumaker <anna.schumaker@netapp.com>,
	linux-fsdevel@vger.kernel.org,
	v9fs-developer@lists.sourceforge.net,
	Jeff Dike <jdike@addtoit.com>,
	Anton Altaparmakov <anton@tuxera.com>,
	Mike Marshall <hubcap@omnibond.com>
Subject: Re: set_page_dirty vs truncate
Date: Fri, 18 Dec 2020 23:04:23 -0800	[thread overview]
Message-ID: <43f05a8f-25ce-a400-5825-d8fa159ee7f6@nvidia.com> (raw)
In-Reply-To: <20201219065057.GR15600@casper.infradead.org>

On 12/18/20 10:50 PM, Matthew Wilcox wrote:
...
>>> Hmmm ... looks like __set_page_dirty_nobuffers() has a similar problem:
>>>
>>> {
>>>           lock_page_memcg(page);
>>>           if (!TestSetPageDirty(page)) {
>>>                   struct address_space *mapping = page_mapping(page);
>>>                   unsigned long flags;
>>>
>>>                   if (!mapping) {
>>>                           unlock_page_memcg(page);
>>>                           return 1;
>>>                   }
>>>
>>>                   xa_lock_irqsave(&mapping->i_pages, flags);
>>>                   BUG_ON(page_mapping(page) != mapping);
>>>
>>> sure, we check that the page wasn't truncated between set_page_dirty()
>>> and the call to TestSetPageDirty(), but we can truncate dirty pages
>>> with no problem.  So between the call to TestSetPageDirty() and
>>> the call to xa_lock_irqsave(), the page can be truncated, and the
>>> BUG_ON should fire.
>>>
>>> I haven't been able to find any examples of this, but maybe it's just a very
>>> narrow race.  Does anyone recognise this signature?  Adding the filesystems
>>> which use __set_page_dirty_nobuffers() directly without extra locking.
>>
>>
>> That sounds like the same *kind* of failure that Jan Kara and I were
>> seeing on live systems[1], that led eventually to the gup-to-pup
>> conversion exercise.
>>
>> That crash happened due to calling set_page_dirty() on pages that had no
>> buffers on them [2]. And that sounds like *exactly* the same thing as
>> calling __set_page_dirty_nobuffers() without extra locking. So I'd
>> expect that it's Just Wrong To Do, for the same reasons as Jan spells
>> out very clearly in [1].
> 
> Interesting.  It's a bit different, *but* Jan's race might be what's
> causing this symptom.  The reason is that the backtrace contains
> set_page_dirty_lock() which holds the page lock.  So there can't be
> a truncation race because truncate holds the page lock when calling
> ->invalidatepage.
> 
> That said, the syzbot reproducer doesn't have any O_DIRECT in it
> either.  So maybe this is some other race?

Jan's race can be also be reproduced *without* O_DIRECT. I first saw
it via a program that just did these steps on a normal ext4 filesystem:

a) pin ext4 file-backed pages, via get_user_pages(). Actually the way
it got here was due to using what *looked* like anonymous RAM to the
program, but was really file-backed RAM, because the admin had it
set up to mount ext4 on /tmp, instead of using tmpfs, to "save RAM",
but I digress. :)

b) wait a while, optionally do some DMA on the pages from a GPU, drink
coffee...

c) call set_pages_dirty()

d) unpin the pages

e) BUG_ON() in page_buffers().


thanks,
-- 
John Hubbard
NVIDIA

_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um


  reply	other threads:[~2020-12-19  7:05 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-18 16:05 set_page_dirty vs truncate Matthew Wilcox
2020-12-18 22:03 ` Matthew Wilcox
2020-12-19  5:18   ` Matthew Wilcox
2020-12-19  5:18     ` Matthew Wilcox
2020-12-19  6:10     ` John Hubbard
2020-12-19  6:10       ` John Hubbard
2020-12-19  6:50       ` Matthew Wilcox
2020-12-19  6:50         ` Matthew Wilcox
2020-12-19  7:04         ` John Hubbard [this message]
2020-12-19  7:04           ` John Hubbard
2020-12-21 14:12   ` Jan Kara
2020-12-21 14:58     ` Matthew Wilcox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=43f05a8f-25ce-a400-5825-d8fa159ee7f6@nvidia.com \
    --to=jhubbard@nvidia.com \
    --cc=anna.schumaker@netapp.com \
    --cc=anton@tuxera.com \
    --cc=asmadeus@codewreck.org \
    --cc=devel@lists.orangefs.org \
    --cc=hdegoede@redhat.com \
    --cc=hubcap@omnibond.com \
    --cc=jdike@addtoit.com \
    --cc=jfs-discussion@lists.sourceforge.net \
    --cc=linux-cifs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=linux-ntfs-dev@lists.sourceforge.net \
    --cc=linux-um@lists.infradead.org \
    --cc=martin@omnibond.com \
    --cc=miklos@szeredi.hu \
    --cc=richard@nod.at \
    --cc=sfrench@samba.org \
    --cc=shaggy@kernel.org \
    --cc=trond.myklebust@hammerspace.com \
    --cc=v9fs-developer@lists.sourceforge.net \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.