Re: [PATCH] mmc: block: handle complete_work on separate workqueue

* Re: [PATCH] mmc: block: handle complete_work on separate workqueue
       [not found] <1549385523-20521-1-git-send-email-zhays@lexmark.com>
@ 2019-02-06 12:00 ` Ulf Hansson
  2019-02-06 12:42   ` Adrian Hunter
  0 siblings, 1 reply; 5+ messages in thread
From: Ulf Hansson @ 2019-02-06 12:00 UTC (permalink / raw)
  To: Zachary Hays, Adrian Hunter
  Cc: Christoph Hellwig, linux-mmc, linux-block, Linus Walleij

+ Adrian, Linus

On Tue, 5 Feb 2019 at 17:53, Zachary Hays <zhays@lexmark.com> wrote:
>
> The kblockd workqueue is created with the WQ_MEM_RECLAIM flag set.
> This generates a rescuer thread for that queue that will trigger when
> the CPU is under heavy load and collect the uncompleted work.
>
> In the case of mmc, this creates the possibility of a deadlock as
> other blk-mq work is also run on the same queue. For example:
>
> - worker 0 claims the mmc host
> - worker 1 attempts to claim the host
> - worker 0 schedules complete_work to release the host
> - rescuer thread is triggered after time-out and collects the dangling
>   work
> - rescuer thread attempts to complete the work in order starting with
>   claim host
> - the task to release host is now blocked by a task to claim it and
>   will never be called
>

Adrian, I need your help to understand more of this. The above
description doesn't add up to me.

In principle, already when "worker 1 attempts to claim the host" as
described above, it should succeed and should *not* need to wait for
the host to be released. Right?

The hole point with the commit 6c0cedd1ef95 ("mmc: core: Introduce
host claiming by context"), was to allow the mmc host to be
re-claimable for blk I/O requests, no matter from what worker/thread
the claim/release is done from.

Is it not working as expected you think? What am I missing here?

> The above results in multiple hung tasks that lead to failures to boot.

Of course, during boot there is also a card detect work running in
parallel with blk I/O requests. This work is being punted to the
"system_freezable_wq" and it claims/releases the host as well.

Perhaps, that could have something to do with the problem....

>
> Handling complete_work on a separate workqueue avoids this by keeping
> the work completion tasks separate from the other blk-mq work. This
> allows the host to be released without getting blocked by other tasks
> attempting to claim the host.
>
> Signed-off-by: Zachary Hays <zhays@lexmark.com>
> ---
>  drivers/mmc/core/block.c | 10 +++++++++-
>  include/linux/mmc/card.h |  1 +
>  2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
> index aef1185f383d..14f3fdb8c6bb 100644
> --- a/drivers/mmc/core/block.c
> +++ b/drivers/mmc/core/block.c
> @@ -2112,7 +2112,7 @@ static void mmc_blk_mq_req_done(struct mmc_request *mrq)
>                 if (waiting)
>                         wake_up(&mq->wait);
>                 else
> -                       kblockd_schedule_work(&mq->complete_work);
> +                       queue_work(mq->card->complete_wq, &mq->complete_work);
>
>                 return;
>         }
> @@ -2924,6 +2924,13 @@ static int mmc_blk_probe(struct mmc_card *card)
>
>         mmc_fixup_device(card, mmc_blk_fixups);
>
> +       card->complete_wq = alloc_workqueue("mmc_complete",
> +                                       WQ_MEM_RECLAIM | WQ_HIGHPRI, 0);
> +       if (unlikely(!card->complete_wq)) {
> +               pr_err("Failed to create mmc completion workqueue");
> +               return -ENOMEM;
> +       }
> +
>         md = mmc_blk_alloc(card);
>         if (IS_ERR(md))
>                 return PTR_ERR(md);
> @@ -2987,6 +2994,7 @@ static void mmc_blk_remove(struct mmc_card *card)
>         pm_runtime_put_noidle(&card->dev);
>         mmc_blk_remove_req(md);
>         dev_set_drvdata(&card->dev, NULL);
> +       destroy_workqueue(card->complete_wq);
>  }
>
>  static int _mmc_blk_suspend(struct mmc_card *card)
> diff --git a/include/linux/mmc/card.h b/include/linux/mmc/card.h
> index de7377815b6b..8ef330027b13 100644
> --- a/include/linux/mmc/card.h
> +++ b/include/linux/mmc/card.h
> @@ -308,6 +308,7 @@ struct mmc_card {
>         unsigned int    nr_parts;
>
>         unsigned int            bouncesz;       /* Bounce buffer size */
> +       struct workqueue_struct *complete_wq;   /* Private workqueue */
>  };
>
>  static inline bool mmc_large_sector(struct mmc_card *card)
> --
> 2.7.4
>

So this change seems to solve the problem, which is really great.
However, I think we really need to understand what goes wrong, before
we apply this as fix.

Unfortunate, I am also in busy period, so I need a couple of more days
before I can help out running tests.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 5+ messages in thread