linux-block.vger.kernel.org archive mirror
From: Hugh Dickins <hughd@google.com>
To: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Hugh Dickins <hughd@google.com>, Hannes Reinecke <hare@suse.de>,
	linux-mm@kvack.org, linux-block@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	stable@vger.kernel.org
Subject: Re: [PATCH] block: Remove special-casing of compound pages
Date: Wed, 16 Aug 2023 13:27:17 -0700 (PDT)
Message-ID: <94635da5-ce28-a8fb-84e3-7a9f5240fe6a@google.com>
In-Reply-To: <20230814144100.596749-1-willy@infradead.org>

a.k.a "Fix rare user data corruption when using THP" :)

On Mon, 14 Aug 2023, Matthew Wilcox (Oracle) wrote:

> The special casing was originally added in pre-git history; reproducing
> the commit log here:
> 
> > commit a318a92567d77
> > Author: Andrew Morton <akpm@osdl.org>
> > Date:   Sun Sep 21 01:42:22 2003 -0700
> >
> >     [PATCH] Speed up direct-io hugetlbpage handling
> >
> >     This patch short-circuits all the direct-io page dirtying logic for
> >     higher-order pages.  Without this, we pointlessly bounce BIOs up to
> >     keventd all the time.
> 
> In the last twenty years, compound pages have become used for more than
> just hugetlb.  Rewrite these functions to operate on folios instead
> of pages and remove the special case for hugetlbfs; I don't think
> it's needed any more (and if it is, we can put it back in as a call
> to folio_test_hugetlb()).
> 
> This was found by inspection; as far as I can tell, this bug can lead
> to pages used as the destination of a direct I/O read not being marked
> as dirty.  If those pages are then reclaimed by the MM without being
> dirtied for some other reason, they won't be written out.  Then when
> they're faulted back in, they will not contain the data they should.
> It'll take a pretty unusual setup to produce this problem with several
> races all going the wrong way.
> 
> This problem predates the folio work; it could for example have been
> triggered by mmaping a THP in tmpfs and using that as the target of an
> O_DIRECT read.
> 
> Fixes: 800d8c63b2e98 ("shmem: add huge pages support")

No. It's a good catch, but the bug looks specific to the folio work to me.

Almost all shmem pages are dirty from birth, even as soon as they are
brought back from swap; so it is not necessary to re-mark them dirty.

The exceptions are pages allocated to holes when faulted: so you did
get me worried as to whether khugepaged could collapse a pmd-ful of
those into a THP without marking the result as dirty.

But no, in v6.5-rc6 the collapse_file() success path has
	if (is_shmem)
		folio_mark_dirty(folio);
and in v5.10 the same appears as
		if (is_shmem)
			set_page_dirty(new_page);

(IIRC, that or marking the pmd dirty was missing from early shmem THP
support, but was corrected fairly soon and backported to stable then.
I have a faint memory of versions which assembled pmd_dirty from
collected pte_dirtys.)

And the !is_shmem case is for CONFIG_READ_ONLY_THP_FOR_FS: writing
into those pages, by direct IO or whatever, is already prohibited.
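
To make the argument above concrete, here is a toy userspace model of
the failure mode under discussion. It is only a sketch under stated
assumptions, not kernel code: struct toy_page and every toy_* function
are hypothetical names invented for illustration, and reclaim is
reduced to "write back if dirty, otherwise drop the contents". The
skip_compound flag stands in for the special case that skips dirtying
of compound pages on direct I/O read completion.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE 16

/* Hypothetical stand-in for a shmem page; not struct page or struct folio. */
struct toy_page {
	char data[TOY_PAGE_SIZE];	/* in-memory contents */
	char swap[TOY_PAGE_SIZE];	/* backing copy, valid only if has_swap */
	bool has_swap;
	bool dirty;
	bool compound;			/* models a page that is part of a THP */
	bool resident;
};

/* Direct I/O read completion: device data lands in the page. */
static void toy_dio_read_complete(struct toy_page *p, const char *src,
				  bool skip_compound)
{
	memcpy(p->data, src, TOY_PAGE_SIZE);
	if (skip_compound && p->compound)
		return;		/* the special case: page is not re-dirtied */
	p->dirty = true;
}

/* Reclaim: a dirty page is written to backing store, a clean one is dropped. */
static void toy_reclaim(struct toy_page *p)
{
	assert(p->resident);
	if (p->dirty) {
		memcpy(p->swap, p->data, TOY_PAGE_SIZE);
		p->has_swap = true;
		p->dirty = false;
	}
	memset(p->data, 0, TOY_PAGE_SIZE);
	p->resident = false;
}

/* Fault the page back in: restore from backing store, else zero-fill. */
static void toy_fault_in(struct toy_page *p)
{
	assert(!p->resident);
	if (p->has_swap)
		memcpy(p->data, p->swap, TOY_PAGE_SIZE);
	else
		memset(p->data, 0, TOY_PAGE_SIZE);
	p->resident = true;
}

/* Returns true if the direct I/O data survives reclaim and refault. */
bool toy_data_survives(bool dirty_from_birth, bool skip_compound)
{
	static const char payload[TOY_PAGE_SIZE] = "dio-read-data";
	struct toy_page p = {
		.compound = true,
		.dirty = dirty_from_birth,
		.resident = true,
	};

	toy_dio_read_complete(&p, payload, skip_compound);
	toy_reclaim(&p);
	toy_fault_in(&p);
	return memcmp(p.data, payload, TOY_PAGE_SIZE) == 0;
}
```

In this model, toy_data_survives(true, true) returns true: a page that
is dirty from birth loses nothing when the completion path skips
dirtying it. Only toy_data_survives(false, true), a clean compound page
combined with the special case, loses the data.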

It's dem dirty (or not dirty) folios dat's the trouble!

Hugh

> Cc: stable@vger.kernel.org
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  block/bio.c | 46 ++++++++++++++++++++++++----------------------
>  1 file changed, 24 insertions(+), 22 deletions(-)

Thread overview: 12+ messages
2023-08-14 14:41 [PATCH] block: Remove special-casing of compound pages Matthew Wilcox (Oracle)
2023-08-14 14:48 ` Hannes Reinecke
2023-08-16 17:03 ` Fix rare user data corruption when using THP Matthew Wilcox
2023-08-16 20:27 ` Hugh Dickins [this message]
2023-09-15 14:21   ` [PATCH] block: Remove special-casing of compound pages Matthew Wilcox
2023-09-15 22:48     ` Hugh Dickins
2023-12-07 21:04 ` Jens Axboe
2024-02-29 18:25   ` Greg Edwards
2024-02-29 19:37     ` Matthew Wilcox
2024-02-29 20:05       ` Greg Edwards
2023-12-07 22:10 ` Keith Busch
2023-12-07 23:57   ` Matthew Wilcox