* Re: dm-thin: wakeup worker only when deferred bios exist
From: Mike Snitzer @ 2019-11-14 15:48 UTC
  To: Jeffle Xu; +Cc: joseph.qi, ejt, dm-devel, agk

On Thu, Nov 14 2019 at  9:10am -0500,
Jeffle Xu <jefflexu@linux.alibaba.com> wrote:

> A single-thread fio test (read, bs=4k, ioengine=libaio, iodepth=128,
> numjobs=1) over a dm-thin device shows poor performance versus a bare
> nvme disk on v5.4.0-rc7.
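> 
> (Roughly, that corresponds to an fio invocation like the one below;
> the device path and the --direct flag are assumptions, not taken from
> the original test:)
> 
>     fio --name=thin-read --filename=/dev/mapper/thinvol --rw=read \
>         --bs=4k --ioengine=libaio --iodepth=128 --numjobs=1 --direct=1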
> 
> Further investigation with perf indicates that queue_work_on() consumes
> over 20% of CPU time when doing IO over the dm-thin device. The call
> stack is as follows.
> 
> - 46.64% thin_map
>    + 22.28% queue_work_on
>    + 12.98% dm_thin_find_block
>    + 5.07% cell_defer_no_holder
>    + 2.42% bio_detain.isra.33
>    + 0.95% inc_all_io_entry.isra.31.part.32
>      0.80% remap.isra.41
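> 
> (A profile like this can be collected with something along the lines
> of the commands below; the exact invocation used is an assumption:)
> 
>     perf record -g -- fio <jobfile>
>     perf report --stdio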
> 
> In cell_defer_no_holder(), wake_worker() is always called, no matter
> whether the cell->bios list is empty or not. In the single-thread IO
> model, the cell->bios list is most likely empty, so skip waking up the
> worker thread when it is.
> 
> A significant single-thread IO performance improvement can be seen with
> this patch. The original IO performance is 445 MiB/s with the fio test
> previously described, while it is 643 MiB/s after applying the patch,
> which is a 44% performance improvement.
> 
> Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>


Nice find; implementation detail questions are inlined below.

> ---
>  drivers/md/dm-bio-prison-v1.c |  8 +++++++-
>  drivers/md/dm-bio-prison-v1.h |  2 +-
>  drivers/md/dm-thin.c          | 10 ++++++----
>  3 files changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/md/dm-bio-prison-v1.c b/drivers/md/dm-bio-prison-v1.c
> index b538989..b2a9b8d 100644
> --- a/drivers/md/dm-bio-prison-v1.c
> +++ b/drivers/md/dm-bio-prison-v1.c
> @@ -219,11 +219,17 @@ static void __cell_release_no_holder(struct dm_bio_prison *prison,
>  
>  void dm_cell_release_no_holder(struct dm_bio_prison *prison,
>  			       struct dm_bio_prison_cell *cell,
> -			       struct bio_list *inmates)
> +			       struct bio_list *inmates, int *empty)
>  {
>  	unsigned long flags;
>  
>  	spin_lock_irqsave(&prison->lock, flags);
> +	/*
> +	 * The empty flag should represent the list state at the
> +	 * moment the list is merged into @inmates, so sample the
> +	 * list state while prison->lock is held.
> +	 */
> +	*empty = bio_list_empty(&cell->bios);
>  	__cell_release_no_holder(prison, cell, inmates);
>  	spin_unlock_irqrestore(&prison->lock, flags);
>  }
> diff --git a/drivers/md/dm-bio-prison-v1.h b/drivers/md/dm-bio-prison-v1.h
> index cec52ac..500edbc 100644
> --- a/drivers/md/dm-bio-prison-v1.h
> +++ b/drivers/md/dm-bio-prison-v1.h
> @@ -89,7 +89,7 @@ void dm_cell_release(struct dm_bio_prison *prison,
>  		     struct bio_list *bios);
>  void dm_cell_release_no_holder(struct dm_bio_prison *prison,
>  			       struct dm_bio_prison_cell *cell,
> -			       struct bio_list *inmates);
> +			       struct bio_list *inmates, int *empty);
>  void dm_cell_error(struct dm_bio_prison *prison,
>  		   struct dm_bio_prison_cell *cell, blk_status_t error);
>  
> diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
> index fcd8877..51fd396 100644
> --- a/drivers/md/dm-thin.c
> +++ b/drivers/md/dm-thin.c
> @@ -480,9 +480,9 @@ static void cell_visit_release(struct pool *pool,
>  
>  static void cell_release_no_holder(struct pool *pool,
>  				   struct dm_bio_prison_cell *cell,
> -				   struct bio_list *bios)
> +				   struct bio_list *bios, int *empty)
>  {
> -	dm_cell_release_no_holder(pool->prison, cell, bios);
> +	dm_cell_release_no_holder(pool->prison, cell, bios, empty);
>  	dm_bio_prison_free_cell(pool->prison, cell);
>  }
>  
> @@ -886,12 +886,14 @@ static void cell_defer_no_holder(struct thin_c *tc, struct dm_bio_prison_cell *c
>  {
>  	struct pool *pool = tc->pool;
>  	unsigned long flags;
> +	int empty;
>  
>  	spin_lock_irqsave(&tc->lock, flags);
> -	cell_release_no_holder(pool, cell, &tc->deferred_bio_list);
> +	cell_release_no_holder(pool, cell, &tc->deferred_bio_list, &empty);
>  	spin_unlock_irqrestore(&tc->lock, flags);
>  
> -	wake_worker(pool);
> +	if (!empty)
> +		wake_worker(pool);
>  }

I think your point is that tc->deferred_bio_list may already have bios
before the call to cell_release_no_holder, so simply checking whether
it is empty after the cell_release_no_holder call isn't enough?

But if tc->deferred_bio_list isn't empty then there is work that needs
to be done, regardless of whether this particular cell created work, so
I'm left wondering whether you first tried simply checking if
tc->deferred_bio_list is empty after cell_release_no_holder() in
cell_defer_no_holder()?
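
IOW, an untested sketch of what I mean, keeping the existing
dm_cell_release_no_holder() signature instead of plumbing an empty
flag through it:

static void cell_defer_no_holder(struct thin_c *tc,
				 struct dm_bio_prison_cell *cell)
{
	struct pool *pool = tc->pool;
	unsigned long flags;
	bool has_work;

	spin_lock_irqsave(&tc->lock, flags);
	cell_release_no_holder(pool, cell, &tc->deferred_bio_list);
	/* sample the list state while tc->lock is still held */
	has_work = !bio_list_empty(&tc->deferred_bio_list);
	spin_unlock_irqrestore(&tc->lock, flags);

	if (has_work)
		wake_worker(pool);
}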

Thanks,
Mike


* Re: dm-thin: wakeup worker only when deferred bios exist
From: Eric Wheeler @ 2019-12-01  1:15 UTC
  To: Mike Snitzer; +Cc: Jeffle Xu, joseph.qi, ejt, dm-devel, agk

On Thu, 14 Nov 2019, Mike Snitzer wrote:

> On Thu, Nov 14 2019 at  9:10am -0500,
> Jeffle Xu <jefflexu@linux.alibaba.com> wrote:
> 
> > A single-thread fio test (read, bs=4k, ioengine=libaio, iodepth=128,
> > numjobs=1) over a dm-thin device shows poor performance versus a bare
> > nvme disk on v5.4.0-rc7.
> > 
> > Further investigation with perf indicates that queue_work_on() consumes
> > over 20% of CPU time when doing IO over the dm-thin device. The call
> > stack is as follows.
> > 
> > - 46.64% thin_map
> >    + 22.28% queue_work_on
> >    + 12.98% dm_thin_find_block
> >    + 5.07% cell_defer_no_holder
> >    + 2.42% bio_detain.isra.33
> >    + 0.95% inc_all_io_entry.isra.31.part.32
> >      0.80% remap.isra.41
> > 
> > In cell_defer_no_holder(), wake_worker() is always called, no matter
> > whether the cell->bios list is empty or not. In the single-thread IO
> > model, the cell->bios list is most likely empty, so skip waking up the
> > worker thread when it is.
> > 
> > A significant single-thread IO performance improvement can be seen with
> > this patch. The original IO performance is 445 MiB/s with the fio test
> > previously described, while it is 643 MiB/s after applying the patch,
> > which is a 44% performance improvement.
> > 
> > Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
> 
> 
> Nice find; implementation detail questions are inlined below.

Indeed!  

This cherry-picks cleanly into v4.19.y.

Is there any reason this would be unsafe in 4.19?
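
(For reference, a backport can be tried with "git am" on a v4.19.y
checkout; the branch and mbox filenames below are arbitrary:)

    git checkout -b dm-thin-backport v4.19.y
    git am dm-thin-wakeup-worker.mbox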

--
Eric Wheeler


> 
> > ---
> >  drivers/md/dm-bio-prison-v1.c |  8 +++++++-
> >  drivers/md/dm-bio-prison-v1.h |  2 +-
> >  drivers/md/dm-thin.c          | 10 ++++++----
> >  3 files changed, 14 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/md/dm-bio-prison-v1.c b/drivers/md/dm-bio-prison-v1.c
> > index b538989..b2a9b8d 100644
> > --- a/drivers/md/dm-bio-prison-v1.c
> > +++ b/drivers/md/dm-bio-prison-v1.c
> > @@ -219,11 +219,17 @@ static void __cell_release_no_holder(struct dm_bio_prison *prison,
> >  
> >  void dm_cell_release_no_holder(struct dm_bio_prison *prison,
> >  			       struct dm_bio_prison_cell *cell,
> > -			       struct bio_list *inmates)
> > +			       struct bio_list *inmates, int *empty)
> >  {
> >  	unsigned long flags;
> >  
> >  	spin_lock_irqsave(&prison->lock, flags);
> > +	/*
> > +	 * The empty flag should represent the list state at the
> > +	 * moment the list is merged into @inmates, so sample the
> > +	 * list state while prison->lock is held.
> > +	 */
> > +	*empty = bio_list_empty(&cell->bios);
> >  	__cell_release_no_holder(prison, cell, inmates);
> >  	spin_unlock_irqrestore(&prison->lock, flags);
> >  }
> > diff --git a/drivers/md/dm-bio-prison-v1.h b/drivers/md/dm-bio-prison-v1.h
> > index cec52ac..500edbc 100644
> > --- a/drivers/md/dm-bio-prison-v1.h
> > +++ b/drivers/md/dm-bio-prison-v1.h
> > @@ -89,7 +89,7 @@ void dm_cell_release(struct dm_bio_prison *prison,
> >  		     struct bio_list *bios);
> >  void dm_cell_release_no_holder(struct dm_bio_prison *prison,
> >  			       struct dm_bio_prison_cell *cell,
> > -			       struct bio_list *inmates);
> > +			       struct bio_list *inmates, int *empty);
> >  void dm_cell_error(struct dm_bio_prison *prison,
> >  		   struct dm_bio_prison_cell *cell, blk_status_t error);
> >  
> > diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
> > index fcd8877..51fd396 100644
> > --- a/drivers/md/dm-thin.c
> > +++ b/drivers/md/dm-thin.c
> > @@ -480,9 +480,9 @@ static void cell_visit_release(struct pool *pool,
> >  
> >  static void cell_release_no_holder(struct pool *pool,
> >  				   struct dm_bio_prison_cell *cell,
> > -				   struct bio_list *bios)
> > +				   struct bio_list *bios, int *empty)
> >  {
> > -	dm_cell_release_no_holder(pool->prison, cell, bios);
> > +	dm_cell_release_no_holder(pool->prison, cell, bios, empty);
> >  	dm_bio_prison_free_cell(pool->prison, cell);
> >  }
> >  
> > @@ -886,12 +886,14 @@ static void cell_defer_no_holder(struct thin_c *tc, struct dm_bio_prison_cell *c
> >  {
> >  	struct pool *pool = tc->pool;
> >  	unsigned long flags;
> > +	int empty;
> >  
> >  	spin_lock_irqsave(&tc->lock, flags);
> > -	cell_release_no_holder(pool, cell, &tc->deferred_bio_list);
> > +	cell_release_no_holder(pool, cell, &tc->deferred_bio_list, &empty);
> >  	spin_unlock_irqrestore(&tc->lock, flags);
> >  
> > -	wake_worker(pool);
> > +	if (!empty)
> > +		wake_worker(pool);
> >  }
> 
> I think your point is that tc->deferred_bio_list may already have bios
> before the call to cell_release_no_holder, so simply checking whether
> it is empty after the cell_release_no_holder call isn't enough?
> 
> But if tc->deferred_bio_list isn't empty then there is work that needs
> to be done, regardless of whether this particular cell created work, so
> I'm left wondering whether you first tried simply checking if
> tc->deferred_bio_list is empty after cell_release_no_holder() in
> cell_defer_no_holder()?
> 
> Thanks,
> Mike
> 
