linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v3 0/3] ext4/jbd2: data=journal: write-protect pages on transaction commit
@ 2020-09-10 19:31 Mauricio Faria de Oliveira
  2020-09-10 19:31 ` [RFC PATCH v3 1/3] jbd2: introduce/export functions jbd2_journal_submit|finish_inode_data_buffers() Mauricio Faria de Oliveira
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Mauricio Faria de Oliveira @ 2020-09-10 19:31 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-ext4, dann frazier, Mauricio Faria de Oliveira

Hey Jan,

This series implements your suggestions for the RFC PATCH v2 set,
which we mostly talked about in cover letter [1] and PATCH 3 [2].
(I added Suggested-by: tags, by the way, for due credit.)

It looks almost good on fstests: zero regressions on data=ordered,
and only one regression in data=journal (generic/347); I'll check.
(That's with ext4; I'll check ocfs2, but it's only a minor change.)

However, there's an issue I have to check with you about, that we
exposed from the original kernel. It's described below, but other
than this I _guess_ this should be close if you don't spot errors.

Thanks!
Mauricio

...

Testing with 'stress-ng --mmap <N> --mmap-file' runs well for days,
but it occasionally hits:

  JBD2: Spotted dirty metadata buffer (dev = vdc, blocknr = 74775).
  There's a risk of filesystem corruption in case of system crash.

I added dump_stack() in warn_dirty_buffer(), and it usually comes
from jbd2_journal_file_buffer(, BJ_Forget) in the commit function.
When filing from BJ_Shadow to BJ_Forget.. so it possibly happened
while BH_Shadow was still set!

So I instrumented [test_]set_buffer_dirty() macros to dump_stack()
if BH_Shadow is set (i.e. buffer being set dirty during write-out.)

This showed that the occasional BH_Dirty setter with BH_Shadow set
is block_page_mkwrite() in ext4_page_mkwrite(). And it seems right,
because it's called before do_journal_get_write_access() (where we
check for/wait on BH_Shadow.)

ext4_page_mkwrite():

        err = block_page_mkwrite(vma, vmf, get_block);
        if (!err && ext4_should_journal_data(inode)) {
                if (ext4_walk_page_buffers(handle, page_buffers(page), 0,
                          PAGE_SIZE, NULL, do_journal_get_write_access)) {

The patches didn't directly break this, they only allow this code
to run more often as we disabled an early-return optimization for
the case 'all buffers mapped' in data=journal (question 2 in [1]):

ext4_page_mkwrite():

         * Return if we have all the buffers mapped.
	...
-       if (page_has_buffers(page)) {
+       if (page_has_buffers(page) && !ext4_should_journal_data(inode)) {


In order to confirm it, I built the unpatched v5.9-rc4 kernel, with
just the change to disable that optimization in data=journal -- and
it hit that occasionally too ("JBD2: Spotted dirty metadata buffer".)

I was naive enough to mindlessly try to swap the order of those two
statements, in hopes that do_journal_get_write_access() should wait
for BH_Shadow to clear, and then we just call block_page_mkwrite().
BUT it trips into the BUG() check in page_buffers() in the former.

I still have to dig more about it, but if you have something quick
in mind, I'd appreciate any comments/feedback about it, if it gets
more complex than I can see.

Thanks again!

[1] https://lore.kernel.org/linux-ext4/20200819092735.GI1902@quack2.suse.cz/
[2] https://lore.kernel.org/linux-ext4/20200821102625.GB3432@quack2.suse.cz/

Mauricio Faria de Oliveira (3):
  jbd2: introduce/export functions
    jbd2_journal_submit|finish_inode_data_buffers()
  jbd2, ext4, ocfs2: introduce/use journal callbacks
    j_submit|finish_inode_data_buffers()
  ext4: data=journal: write-protect pages on
    j_submit_inode_data_buffers()

 fs/ext4/inode.c      | 29 +++++++++++-----
 fs/ext4/super.c      | 82 ++++++++++++++++++++++++++++++++++++++++++++
 fs/jbd2/commit.c     | 58 +++++++++++++++++--------------
 fs/jbd2/journal.c    |  2 ++
 fs/ocfs2/super.c     | 15 ++++++++
 include/linux/jbd2.h | 29 +++++++++++++++-
 6 files changed, 181 insertions(+), 34 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-09-16 20:57 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-10 19:31 [RFC PATCH v3 0/3] ext4/jbd2: data=journal: write-protect pages on transaction commit Mauricio Faria de Oliveira
2020-09-10 19:31 ` [RFC PATCH v3 1/3] jbd2: introduce/export functions jbd2_journal_submit|finish_inode_data_buffers() Mauricio Faria de Oliveira
2020-09-16 16:19   ` Jan Kara
2020-09-10 19:31 ` [RFC PATCH v3 2/3] jbd2, ext4, ocfs2: introduce/use journal callbacks j_submit|finish_inode_data_buffers() Mauricio Faria de Oliveira
2020-09-16 16:22   ` Jan Kara
2020-09-16 16:52     ` Jan Kara
2020-09-10 19:31 ` [RFC PATCH v3 3/3] ext4: data=journal: write-protect pages on j_submit_inode_data_buffers() Mauricio Faria de Oliveira
2020-09-16 16:59   ` Jan Kara
2020-09-16 16:15 ` [RFC PATCH v3 0/3] ext4/jbd2: data=journal: write-protect pages on transaction commit Jan Kara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).