All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: "hch@infradead.org" <hch@infradead.org>
Cc: Trond Myklebust <trondmy@hammerspace.com>,
	"bfoster@redhat.com" <bfoster@redhat.com>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"trondmy@kernel.org" <trondmy@kernel.org>,
	"axboe@kernel.dk" <axboe@kernel.dk>,
	"djwong@kernel.org" <djwong@kernel.org>,
	"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
	"willy@infradead.org" <willy@infradead.org>
Subject: Re: [PATCH] iomap: Address soft lockup in iomap_finish_ioend()
Date: Wed, 5 Jan 2022 08:16:05 +1100	[thread overview]
Message-ID: <20220104211605.GI945095@dread.disaster.area> (raw)
In-Reply-To: <YdPyhpdxykDscMtJ@infradead.org>

On Mon, Jan 03, 2022 at 11:08:54PM -0800, hch@infradead.org wrote:
> On Tue, Jan 04, 2022 at 12:22:15PM +1100, Dave Chinner wrote:
> > --- a/fs/iomap/buffered-io.c
> > +++ b/fs/iomap/buffered-io.c
> > @@ -1098,6 +1098,15 @@ iomap_ioend_can_merge(struct iomap_ioend *ioend, struct iomap_ioend *next)
> >  		return false;
> >  	if (ioend->io_offset + ioend->io_size != next->io_offset)
> >  		return false;
> > +	/*
> > +	 * Do not merge physically discontiguous ioends. The filesystem
> > +	 * completion functions will have to iterate the physical
> > +	 * discontiguities even if we merge the ioends at a logical level, so
> > +	 * we don't gain anything by merging physical discontiguities here.
> > +	 */
> > +	if (ioend->io_inline_bio.bi_iter.bi_sector + (ioend->io_size >> 9) !=
> 
> This open codes bio_end_sector()

No, it doesn't. The ioend can have chained bios or have others merged
and concatenated to the ioend->io_list, so ioend->io_size != length
of the first bio in the chain....

> > +	    next->io_inline_bio.bi_iter.bi_sector)
> 
> But more importantly I don't think just using the inline_bio makes sense
> here as the ioend can have multiple bios.  Fortunately we should always
> have the last built bio available in ->io_bio.

Except merging chains ioends and modifies the head io_size to
account for the chained ioends we add to ioend->io_list. Hence
ioend->io_bio is not the last bio in a contiguous ioend chain.

> > +		return false;
> >  	return true;
> >  }
> >  
> > @@ -1241,6 +1250,13 @@ iomap_can_add_to_ioend(struct iomap_writepage_ctx *wpc, loff_t offset,
> >  		return false;
> >  	if (sector != bio_end_sector(wpc->ioend->io_bio))
> >  		return false;
> > +	/*
> > +	 * Limit ioend bio chain lengths to minimise IO completion latency. This
> > +	 * also prevents long tight loops ending page writeback on all the pages
> > +	 * in the ioend.
> > +	 */
> > +	if (wpc->ioend->io_size >= 4096 * PAGE_SIZE)
> > +		return false;
> 
> And this stops making sense with the impending additions of large folio
> support.  I think we need to count the pages/folios instead as the
> operations are once per page/folio.

Agree, but I was looking at this initially as something easy to test
and backport.

UNfortunately, we hide the ioend switching in a function that can be
called many times per page/folio and the calling function has no
real clue when ioends get switched. Hence it's much more invasive to
correctly account for size based on variable sized folios attached
to bios in an ioend compared to hard coding a simple IO size limit.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  parent reply	other threads:[~2022-01-04 21:16 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-30 19:35 [PATCH] iomap: Address soft lockup in iomap_finish_ioend() trondmy
2021-12-30 21:24 ` Jens Axboe
2021-12-30 22:25   ` Trond Myklebust
2021-12-30 22:27     ` Jens Axboe
2021-12-30 22:55       ` Trond Myklebust
2021-12-31  1:42 ` Matthew Wilcox
2021-12-31  6:16   ` Trond Myklebust
2022-01-01  3:55     ` Dave Chinner
2022-01-01 17:39       ` Trond Myklebust
2022-01-03 22:03         ` Dave Chinner
2022-01-04  0:04           ` Trond Myklebust
2022-01-04  1:22             ` Dave Chinner
2022-01-04  3:01               ` Trond Myklebust
2022-01-04  7:08               ` hch
2022-01-04 18:08                 ` Matthew Wilcox
2022-01-04 18:14                   ` hch
2022-01-04 19:22                     ` Darrick J. Wong
2022-01-04 21:52                       ` Dave Chinner
2022-01-04 23:12                         ` Darrick J. Wong
2022-01-05  2:10                           ` Dave Chinner
2022-01-05 13:56                             ` Brian Foster
2022-01-05 22:04                               ` Dave Chinner
2022-01-06 16:44                                 ` Brian Foster
2022-01-10  8:18                                   ` Dave Chinner
2022-01-10 17:45                                     ` Brian Foster
2022-01-10 18:11                                       ` hch
2022-01-11 14:33                                       ` Trond Myklebust
2022-01-05 13:42                           ` hch
2022-01-04 21:16                 ` Dave Chinner [this message]
2022-01-05 13:43                   ` hch
2022-01-05 22:34                     ` Dave Chinner
2022-01-05  2:09               ` Trond Myklebust
2022-01-05 20:45                 ` Trond Myklebust
2022-01-05 22:48                   ` Dave Chinner
2022-01-05 23:29                     ` Trond Myklebust
2022-01-06  0:01                     ` Darrick J. Wong
2022-01-09 23:09                       ` Dave Chinner
2022-01-06 18:36                     ` Trond Myklebust
2022-01-06 18:38                       ` Trond Myklebust
2022-01-06 20:07                       ` Brian Foster
2022-01-07  3:08                         ` Trond Myklebust
2022-01-07 15:15                           ` Brian Foster
2022-01-09 23:34                       ` Dave Chinner
2022-01-10 23:37                       ` Dave Chinner
2022-01-11  0:08                         ` Dave Chinner
2022-01-13 17:01                         ` Trond Myklebust
2022-01-17 17:24                           ` Trond Myklebust
2022-01-17 17:36                             ` Darrick J. Wong
2022-01-04 13:36         ` Brian Foster
2022-01-04 19:23           ` Darrick J. Wong
2022-01-05  2:31 ` [iomap] f5934dda54: BUG:sleeping_function_called_from_invalid_context_at_fs/iomap/buffered-io.c kernel test robot
2022-01-05  2:31   ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220104211605.GI945095@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=axboe@kernel.dk \
    --cc=bfoster@redhat.com \
    --cc=djwong@kernel.org \
    --cc=hch@infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=trondmy@hammerspace.com \
    --cc=trondmy@kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.