linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff Layton <jlayton@kernel.org>
To: Jan Kara <jack@suse.cz>
Cc: viro@zeniv.linux.org.uk, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
	andres@anarazel.de, willy@infradead.org, dhowells@redhat.com,
	hch@infradead.org, akpm@linux-foundation.org,
	david@fromorbit.com
Subject: Re: [PATCH v5 2/2] buffer: record blockdev write errors in super_block that it backs
Date: Thu, 16 Apr 2020 07:31:33 -0400	[thread overview]
Message-ID: <35d03979765b4351160086badcb59a512695493f.camel@kernel.org> (raw)
In-Reply-To: <20200416093528.GC23739@quack2.suse.cz>

On Thu, 2020-04-16 at 11:35 +0200, Jan Kara wrote:
> On Wed 15-04-20 12:22:27, Jeff Layton wrote:
> > On Wed, 2020-04-15 at 16:06 +0200, Jan Kara wrote:
> > > On Wed 15-04-20 08:13:00, Jeff Layton wrote:
> > > > From: Jeff Layton <jlayton@redhat.com>
> > > > 
> > > > When syncing out a block device (a'la __sync_blockdev), any error
> > > > encountered will only be recorded in the bd_inode's mapping. When the
> > > > blockdev contains a filesystem however, we'd like to also record the
> > > > error in the super_block that's stored there.
> > > > 
> > > > Make mark_buffer_write_io_error also record the error in the
> > > > corresponding super_block when a writeback error occurs and the block
> > > > device contains a mounted superblock.
> > > > 
> > > > Since superblocks are RCU freed, hold the rcu_read_lock to ensure
> > > > that the superblock doesn't go away while we're marking it.
> > > > 
> > > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > > ---
> > > >  fs/buffer.c | 7 +++++++
> > > >  1 file changed, 7 insertions(+)
> > > > 
> > > > diff --git a/fs/buffer.c b/fs/buffer.c
> > > > index f73276d746bb..2a4a5cc20418 100644
> > > > --- a/fs/buffer.c
> > > > +++ b/fs/buffer.c
> > > > @@ -1154,12 +1154,19 @@ EXPORT_SYMBOL(mark_buffer_dirty);
> > > >  
> > > >  void mark_buffer_write_io_error(struct buffer_head *bh)
> > > >  {
> > > > +	struct super_block *sb;
> > > > +
> > > >  	set_buffer_write_io_error(bh);
> > > >  	/* FIXME: do we need to set this in both places? */
> > > >  	if (bh->b_page && bh->b_page->mapping)
> > > >  		mapping_set_error(bh->b_page->mapping, -EIO);
> > > >  	if (bh->b_assoc_map)
> > > >  		mapping_set_error(bh->b_assoc_map, -EIO);
> > > > +	rcu_read_lock();
> > > > +	sb = bh->b_bdev->bd_super;
> > > 
> > > You still need READ_ONCE() here. Otherwise the dereference below can still
> > > result in refetch and NULL ptr deref.
> > > 
> > > 								Honza
> > > 
> > 
> > Huh? That seems like a really suspicious thing for the compiler/arch to
> > do. We are checking that sb isn't NULL before we dereference it. Doesn't
> > that imply a data dependency? How could the value of "sb" change after
> > that?
> 
> Because the compiler is free to optimize the local variable away and
> actually compile the dereference below as bh->b_bdev->bd_super->s_wb_err
> (from C11 standard POV such code is equivalent since in C11 memory model
> it is assumed there are no concurrent accesses). And READ_ONCE() is a way
> to forbid compiler from doing such optimization - through 'volatile'
> keyword it tells the compiler there may be concurrent accesses happening
> and makes sure the value is really fetched into the local variable and used
> from there. There are good articles about this on LWN - I'd give you a link
> but LWN seems to be down today. But the latest article is about KCSAN and
> from there are links to older articles about compiler optimizations.
> 
> > I'm also not sure I understand how using READ_ONCE really helps there if
> > we can't count on the value of a local variable not changing.
> 
> I hope I've explained this above.
> 

Got it. Thanks for the explanation. Now I'll have nightmares about all
of the race conditions I've created in the past by making this
assumption!

I'll send a v6 set in a few mins.

> > > > +	if (sb)
> > > > +		errseq_set(&sb->s_wb_err, -EIO);
> > > > +	rcu_read_unlock();
> > > >  }
> > > >  EXPORT_SYMBOL(mark_buffer_write_io_error);
> > > >  
> > > > -- 
> > > > 2.25.2
> > > > 
> > 
> > -- 
> > Jeff Layton <jlayton@kernel.org>
> > 

-- 
Jeff Layton <jlayton@kernel.org>


      reply	other threads:[~2020-04-16 11:32 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-15 12:12 [PATCH v5 0/2] vfs: have syncfs() return error when there are writeback errors Jeff Layton
2020-04-15 12:12 ` [PATCH v5 1/2] vfs: track per-sb writeback errors and report them to syncfs Jeff Layton
2020-04-15 12:13 ` [PATCH v5 2/2] buffer: record blockdev write errors in super_block that it backs Jeff Layton
2020-04-15 14:06   ` Jan Kara
2020-04-15 16:22     ` Jeff Layton
2020-04-16  9:35       ` Jan Kara
2020-04-16 11:31         ` Jeff Layton [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=35d03979765b4351160086badcb59a512695493f.camel@kernel.org \
    --to=jlayton@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=andres@anarazel.de \
    --cc=david@fromorbit.com \
    --cc=dhowells@redhat.com \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).