All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Matthew Wilcox <willy@infradead.org>
Cc: Brian Foster <bfoster@redhat.com>,
	xfs <linux-xfs@vger.kernel.org>,
	fstests <fstests@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: generic/068 crash on 5.18-rc2?
Date: Wed, 4 May 2022 19:40:12 -0700	[thread overview]
Message-ID: <20220505024012.GA27195@magnolia> (raw)
In-Reply-To: <20220503172532.GA8265@magnolia>

On Tue, May 03, 2022 at 10:25:32AM -0700, Darrick J. Wong wrote:
> On Tue, May 03, 2022 at 05:31:01AM +0100, Matthew Wilcox wrote:
> > On Mon, May 02, 2022 at 08:25:34PM -0700, Darrick J. Wong wrote:
> > > On Mon, May 02, 2022 at 08:20:00AM -0400, Brian Foster wrote:
> > > > On Sat, Apr 30, 2022 at 10:40:31PM +0100, Matthew Wilcox wrote:
> > > > > On Sat, Apr 30, 2022 at 04:44:07AM +0100, Matthew Wilcox wrote:
> > > > > > (I do not love this, have not even compiled it; it's late.  We may be
> > > > > > better off just storing next_folio inside the folio_iter).
> > > > > 
> > > > > Does anyone have a preference for fixing this between Option A:
> > > > > 
> > > > 
> > > > After seeing the trace in my previous mail and several thousand
> > > > successful iterations of the test hack, I had reworked it into this
> > > > (which survived weekend testing until it ran into some other XFS problem
> > > > that looks unrelated):
> > > > 
> > > > diff --git a/include/linux/bio.h b/include/linux/bio.h
> > > > index 278cc81cc1e7..aa820e09978e 100644
> > > > --- a/include/linux/bio.h
> > > > +++ b/include/linux/bio.h
> > > > @@ -269,6 +269,7 @@ struct folio_iter {
> > > >  	size_t offset;
> > > >  	size_t length;
> > > >  	/* private: for use by the iterator */
> > > > +	struct folio *_next;
> > > >  	size_t _seg_count;
> > > >  	int _i;
> > > >  };
> > > > @@ -279,6 +280,7 @@ static inline void bio_first_folio(struct folio_iter *fi, struct bio *bio,
> > > >  	struct bio_vec *bvec = bio_first_bvec_all(bio) + i;
> > > >  
> > > >  	fi->folio = page_folio(bvec->bv_page);
> > > > +	fi->_next = folio_next(fi->folio);
> > > >  	fi->offset = bvec->bv_offset +
> > > >  			PAGE_SIZE * (bvec->bv_page - &fi->folio->page);
> > > >  	fi->_seg_count = bvec->bv_len;
> > > > @@ -290,13 +292,15 @@ static inline void bio_next_folio(struct folio_iter *fi, struct bio *bio)
> > > >  {
> > > >  	fi->_seg_count -= fi->length;
> > > >  	if (fi->_seg_count) {
> > > > -		fi->folio = folio_next(fi->folio);
> > > > +		fi->folio = fi->_next;
> > > > +		fi->_next = folio_next(fi->folio);
> > > >  		fi->offset = 0;
> > > >  		fi->length = min(folio_size(fi->folio), fi->_seg_count);
> > > >  	} else if (fi->_i + 1 < bio->bi_vcnt) {
> > > >  		bio_first_folio(fi, bio, fi->_i + 1);
> > > >  	} else {
> > > >  		fi->folio = NULL;
> > > > +		fi->_next = NULL;
> > > >  	}
> > > >  }
> > > > 
> > > > So FWIW, that is just to say that I find option A to be cleaner and more
> > > > readable.
> > > 
> > > Me too.  I'll queue up the usual nightly tests with that patch added and
> > > we'll see how that does.
> > 
> > I've just pushed essentially that patch to my for-next tree in case
> > anybody does any testing with that.  I'll give it a couple of days
> > before creating a folio-5.18f tag and asking Linus to pull the first two
> > commits on
> > 
> > git://git.infradead.org/users/willy/pagecache.git for-next
> > 
> > That is, commits
> > 
> > 1a4c97e2dd5b ("block: Do not call folio_next() on an unreferenced folio")
> > 095099da208b ("mm/readahead: Fix readahead with large folios")
> 
> Hmm.  Well, I added 1a4c97 to my tree last night, and it seems to have
> cleared up all but two of the problems I saw with the for-next branch.
> 
> generic/388 still fails (40 minutes in) with:
> 
> WARN_ON_ONCE(atomic_read(&iop->write_bytes_pending));
> VM_BUG_ON_FOLIO(i_blocks_per_folio(inode, folio) > 1 && !iop, folio);
> 
> Which I think is the same problem where the fs goes down, XFS throws an
> error back to iomap_do_writepages, and it tries to discard a folio that
> already had writeback running on it.
> 
> There's also the same problem I reported a few days ago in xfs/501
> on a 64k-page ARM64 VM where:
> 
> run fstests xfs/501 at 2022-05-02 21:17:31
> XFS: Assertion failed: IS_ALIGNED((unsigned long)lv->lv_buf, sizeof(uint64_t)), file: fs/xfs/xfs_log_cil.c, line: 430
> XFS: Assertion failed: IS_ALIGNED((unsigned long)buf, sizeof(uint64_t)), file: fs/xfs/xfs_log.c, line: 137
> XFS: Assertion failed: IS_ALIGNED((unsigned long)buf, sizeof(uint64_t)), file: fs/xfs/xfs_log.c, line: 137
> 
> But I think that's a new bug that came in with all the log buffer
> alignment changes in the 5.19 branch.
> 
> Oh.  My tree still had the "disable large folios" patch in it.  I guess
> the "successful" results are mostly invalid then.

Well... with large folios turned back on and those two patches added to
the branch, *most* of the problems go away.  The generic/388 problem
persists, and last night's run showed that the weird xfs_dquot leak that
I"ve occasionally seen on 5.18 with xfs/43[46] also exists in 5.17.

--D

> --D
> 
> > (more than happy to update anything about those patches)

  reply	other threads:[~2022-05-05  2:40 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-13  3:34 generic/068 crash on 5.18-rc2? Darrick J. Wong
2022-04-13 14:50 ` Matthew Wilcox
2022-04-13 16:23   ` Darrick J. Wong
2022-04-13 16:35     ` Matthew Wilcox
2022-04-18 18:44       ` Darrick J. Wong
2022-04-18 17:47   ` Darrick J. Wong
2022-04-20  0:37     ` Darrick J. Wong
2022-04-22 21:59     ` Darrick J. Wong
2022-04-28 15:53       ` Brian Foster
2022-04-30  3:10         ` Darrick J. Wong
2022-04-30  3:44         ` Matthew Wilcox
2022-04-30 21:40           ` Matthew Wilcox
2022-05-02 12:20             ` Brian Foster
2022-05-03  3:25               ` Darrick J. Wong
2022-05-03  4:31                 ` Matthew Wilcox
2022-05-03 17:25                   ` Darrick J. Wong
2022-05-05  2:40                     ` Darrick J. Wong [this message]
2022-05-05  4:18                       ` Matthew Wilcox
2022-05-05  4:24                         ` Darrick J. Wong
2022-05-06 17:03                           ` Darrick J. Wong
2022-05-02 12:18           ` Brian Foster
2022-05-02 13:00             ` Matthew Wilcox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220505024012.GA27195@magnolia \
    --to=djwong@kernel.org \
    --cc=bfoster@redhat.com \
    --cc=fstests@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.