From: John Hubbard <jhubbard@nvidia.com>
To: Theodore Ts'o <tytso@mit.edu>, Eric Biggers <ebiggers@kernel.org>
Cc: Lee Jones <lee.jones@linaro.org>,
	linux-ext4@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
	Dave Chinner <dchinner@redhat.com>,
	Goldwyn Rodrigues <rgoldwyn@suse.com>,
	"Darrick J . Wong" <darrick.wong@oracle.com>,
	Bob Peterson <rpeterso@redhat.com>,
	Damien Le Moal <damien.lemoal@wdc.com>,
	Andreas Gruenbacher <agruenba@redhat.com>,
	Ritesh Harjani <riteshh@linux.ibm.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Johannes Thumshirn <jth@kernel.org>,
	linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	cluster-devel@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH -v3] ext4: don't BUG if kernel subsystems dirty pages without asking ext4 first
Date: Fri, 25 Feb 2022 13:33:33 -0800
Message-ID: <2f9933b3-a574-23e1-e632-72fc29e582cf@nvidia.com>
In-Reply-To: <YhlIvw00Y4MkAgxX@mit.edu>

On 2/25/22 13:23, Theodore Ts'o wrote:
> [un]pin_user_pages_remote is dirtying pages without properly warning
> the file system in advance.  This was noted by Jan Kara in 2018[1] and

In 2018, [un]pin_user_pages_remote did not exist. And so what Jan reported
was actually that dio_bio_complete() was calling set_page_dirty_lock()
on pages that were not (any longer) set up for that.

> more recently has resulted in bug reports by Syzbot in various Android
> kernels[2].
> 
> This is technically a bug in mm/gup.c, but arguably ext4 is fragile in

Is it, really? unpin_user_pages_dirty_lock() moved the set_page_dirty_lock()
call into mm/gup.c, but that merely refactored things. The callers are
all over the kernel, and those callers are what need changing in order
to fix this.


thanks,
-- 
John Hubbard
NVIDIA

> that a buggy kernel subsystem which dirties pages without properly
> notifying the file system causes ext4 to BUG, while other file systems
> do not (although user data likely will be lost).  I suspect in real
> life it is rare that people are using RDMA into file-backed memory,
> which is why no one has complained to ext4 developers except fuzzing
> programs.
> 
> So instead of crashing with a BUG, issue a warning (since there may be
> potential data loss) and just mark the page as clean to avoid
> unprivileged denial of service attacks until the problem can be
> properly fixed.  More discussion and background can be found in the
> thread starting at [2].
> 
> [1] https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
> [2] https://lore.kernel.org/r/Yg0m6IjcNmfaSokM@google.com
> 
> Reported-by: syzbot+d59332e2db681cf18f0318a06e994ebbb529a8db@syzkaller.appspotmail.com
> Reported-by: Lee Jones <lee.jones@linaro.org>
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> ---
>   fs/ext4/inode.c | 27 ++++++++++++++++++++++++++-
>   1 file changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 01c9e4f743ba..008fe8750109 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -1993,6 +1993,15 @@ static int ext4_writepage(struct page *page,
>   	else
>   		len = PAGE_SIZE;
>   
> +	/* Should never happen but for bugs in other kernel subsystems */
> +	if (!page_has_buffers(page)) {
> +		ext4_warning_inode(inode,
> +		   "page %lu does not have buffers attached", page->index);
> +		ClearPageDirty(page);
> +		unlock_page(page);
> +		return 0;
> +	}
> +
>   	page_bufs = page_buffers(page);
>   	/*
>   	 * We cannot do block allocation or other extent handling in this
> @@ -2588,12 +2597,28 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
>   			     (mpd->wbc->sync_mode == WB_SYNC_NONE)) ||
>   			    unlikely(page->mapping != mapping)) {
>   				unlock_page(page);
> -				continue;
> +				goto out;
>   			}
>   
>   			wait_on_page_writeback(page);
>   			BUG_ON(PageWriteback(page));
>   
> +			/*
> +			 * Should never happen but for buggy code in
> +			 * other subsystems that call
> +			 * set_page_dirty() without properly warning
> +			 * the file system first.  See [1] for more
> +			 * information.
> +			 *
> +			 * [1] https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
> +			 */
> +			if (!page_has_buffers(page)) {
> +				ext4_warning_inode(mpd->inode, "page %lu does not have buffers attached", page->index);
> +				ClearPageDirty(page);
> +				unlock_page(page);
> +				continue;
> +			}
> +
>   			if (mpd->map.m_len == 0)
>   				mpd->first_page = page->index;
>   			mpd->next_page = page->index + 1;
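
For readers who want to see the control flow of the guard in the quoted
patch in isolation, here is a minimal userspace sketch. It is only a
model under stated assumptions, not ext4 code: struct model_page,
model_write_one() and model_writepages() are hypothetical names invented
for illustration, and the boolean fields merely stand in for
page_has_buffers(), ClearPageDirty() and unlock_page(). It shows a dirty
page without buffers being warned about, marked clean, unlocked and
skipped, rather than stopping the whole writeback pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for struct page; not the kernel type. */
struct model_page {
	unsigned long index;
	bool dirty;
	bool locked;
	bool has_buffers;	/* models what page_has_buffers() would report */
};

/*
 * Models the guard from the patch: a locked, dirty page with no buffers
 * is reported, marked clean and unlocked instead of triggering a BUG.
 * Returns true if the page went down the (modeled) normal write path.
 */
static bool model_write_one(struct model_page *page)
{
	assert(page->locked);	/* the caller must hold the page lock */

	if (!page->has_buffers) {
		fprintf(stderr,
			"warning: page %lu does not have buffers attached\n",
			page->index);
		page->dirty = false;	/* ClearPageDirty() analogue; data may be lost */
		page->locked = false;	/* unlock_page() analogue */
		return false;
	}

	/* Normal path: pretend the buffers were submitted for I/O. */
	page->dirty = false;
	page->locked = false;
	return true;
}

/*
 * Walks the array, locking and writing each dirty page. Returns the
 * number of pages written; *skipped counts the buffer-less dirty pages.
 */
static size_t model_writepages(struct model_page *pages, size_t n,
			       size_t *skipped)
{
	size_t written = 0;

	*skipped = 0;
	for (size_t i = 0; i < n; i++) {
		if (!pages[i].dirty)
			continue;
		pages[i].locked = true;	/* lock_page() analogue */
		if (model_write_one(&pages[i]))
			written++;
		else
			(*skipped)++;
	}
	return written;
}
```

Calling model_writepages() on an array that mixes pages with and without
buffers returns the count of pages written normally and reports the
skipped ones through *skipped. As in the patch, the skipped pages end up
clean and unlocked, which is why the changelog warns about potential
data loss.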
