* Re: [PATCH] mmc: block: handle complete_work on separate workqueue
       [not found] <1549385523-20521-1-git-send-email-zhays@lexmark.com>
@ 2019-02-06 12:00 ` Ulf Hansson
  2019-02-06 12:42   ` Adrian Hunter
  0 siblings, 1 reply; 5+ messages in thread
From: Ulf Hansson @ 2019-02-06 12:00 UTC (permalink / raw)
  To: Zachary Hays, Adrian Hunter
  Cc: Christoph Hellwig, linux-mmc, linux-block, Linus Walleij

+ Adrian, Linus

On Tue, 5 Feb 2019 at 17:53, Zachary Hays <zhays@lexmark.com> wrote:
>
> The kblockd workqueue is created with the WQ_MEM_RECLAIM flag set.
> This generates a rescuer thread for that queue that will trigger when
> the CPU is under heavy load and collect the uncompleted work.
>
> In the case of mmc, this creates the possibility of a deadlock as
> other blk-mq work is also run on the same queue. For example:
>
> - worker 0 claims the mmc host
> - worker 1 attempts to claim the host
> - worker 0 schedules complete_work to release the host
> - rescuer thread is triggered after time-out and collects the dangling
>   work
> - rescuer thread attempts to complete the work in order starting with
>   claim host
> - the task to release host is now blocked by a task to claim it and
>   will never be called
>

Adrian, I need your help to understand more of this. The above
description doesn't add up to me.

In principle, already when "worker 1 attempts to claim the host" as
described above, it should succeed and should *not* need to wait for
the host to be released. Right?

The whole point of commit 6c0cedd1ef95 ("mmc: core: Introduce host
claiming by context") was to allow the mmc host to be re-claimable
for blk I/O requests, no matter what worker/thread the claim/release
is done from.

Do you think it is not working as expected? What am I missing here?

> The above results in multiple hung tasks that lead to failures to boot.

Of course, during boot there is also a card detect work running in
parallel with blk I/O requests. This work is punted to the
"system_freezable_wq" and it claims/releases the host as well.

Perhaps that could have something to do with the problem...

>
> Handling complete_work on a separate workqueue avoids this by keeping
> the work completion tasks separate from the other blk-mq work. This
> allows the host to be released without getting blocked by other tasks
> attempting to claim the host.
>
> Signed-off-by: Zachary Hays <zhays@lexmark.com>
> ---
>  drivers/mmc/core/block.c | 10 +++++++++-
>  include/linux/mmc/card.h |  1 +
>  2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
> index aef1185f383d..14f3fdb8c6bb 100644
> --- a/drivers/mmc/core/block.c
> +++ b/drivers/mmc/core/block.c
> @@ -2112,7 +2112,7 @@ static void mmc_blk_mq_req_done(struct mmc_request *mrq)
>                 if (waiting)
>                         wake_up(&mq->wait);
>                 else
> -                       kblockd_schedule_work(&mq->complete_work);
> +                       queue_work(mq->card->complete_wq, &mq->complete_work);
>
>                 return;
>         }
> @@ -2924,6 +2924,13 @@ static int mmc_blk_probe(struct mmc_card *card)
>
>         mmc_fixup_device(card, mmc_blk_fixups);
>
> +       card->complete_wq = alloc_workqueue("mmc_complete",
> +                                       WQ_MEM_RECLAIM | WQ_HIGHPRI, 0);
> +       if (unlikely(!card->complete_wq)) {
> +               pr_err("Failed to create mmc completion workqueue");
> +               return -ENOMEM;
> +       }
> +
>         md = mmc_blk_alloc(card);
>         if (IS_ERR(md))
>                 return PTR_ERR(md);
> @@ -2987,6 +2994,7 @@ static void mmc_blk_remove(struct mmc_card *card)
>         pm_runtime_put_noidle(&card->dev);
>         mmc_blk_remove_req(md);
>         dev_set_drvdata(&card->dev, NULL);
> +       destroy_workqueue(card->complete_wq);
>  }
>
>  static int _mmc_blk_suspend(struct mmc_card *card)
> diff --git a/include/linux/mmc/card.h b/include/linux/mmc/card.h
> index de7377815b6b..8ef330027b13 100644
> --- a/include/linux/mmc/card.h
> +++ b/include/linux/mmc/card.h
> @@ -308,6 +308,7 @@ struct mmc_card {
>         unsigned int    nr_parts;
>
>         unsigned int    bouncesz;       /* Bounce buffer size */
> +       struct workqueue_struct *complete_wq;   /* Private workqueue */
>  };
>
>  static inline bool mmc_large_sector(struct mmc_card *card)
> --
> 2.7.4
>

So this change seems to solve the problem, which is really great.
However, I think we really need to understand what goes wrong before
we apply this as a fix.

Unfortunately, I am also in a busy period, so I need a couple more
days before I can help out running tests.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 5+ messages in thread
* Re: [PATCH] mmc: block: handle complete_work on separate workqueue
  2019-02-06 12:00 ` [PATCH] mmc: block: handle complete_work on separate workqueue Ulf Hansson
@ 2019-02-06 12:42   ` Adrian Hunter
  2019-02-06 14:08     ` Ulf Hansson
  0 siblings, 1 reply; 5+ messages in thread
From: Adrian Hunter @ 2019-02-06 12:42 UTC (permalink / raw)
  To: Ulf Hansson, Zachary Hays
  Cc: Christoph Hellwig, linux-mmc, linux-block, Linus Walleij

On 6/02/19 2:00 PM, Ulf Hansson wrote:
> + Adrian, Linus
>
> On Tue, 5 Feb 2019 at 17:53, Zachary Hays <zhays@lexmark.com> wrote:
>>
>> The kblockd workqueue is created with the WQ_MEM_RECLAIM flag set.
>> This generates a rescuer thread for that queue that will trigger when
>> the CPU is under heavy load and collect the uncompleted work.
>>
>> In the case of mmc, this creates the possibility of a deadlock as
>> other blk-mq work is also run on the same queue. For example:
>>
>> - worker 0 claims the mmc host
>> - worker 1 attempts to claim the host
>> - worker 0 schedules complete_work to release the host
>> - rescuer thread is triggered after time-out and collects the dangling
>>   work
>> - rescuer thread attempts to complete the work in order starting with
>>   claim host
>> - the task to release host is now blocked by a task to claim it and
>>   will never be called
>>
>
> Adrian, I need your help to understand more of this. The above
> description doesn't add up to me.
>
> In principle, already when "worker 1 attempts to claim the host" as
> described above, it should succeed and should *not* need to wait for
> the host to be released. Right?

If it is the same queue, then yes. Although in that case there is only
one work item for the hw queue, so there cannot be another worker.
There could be another attempt to send a request directly, but that
will not block - if the host controller is busy, BLK_STS_RESOURCE will
be returned from ->queue_rq().

>
> The whole point of commit 6c0cedd1ef95 ("mmc: core: Introduce host
> claiming by context") was to allow the mmc host to be re-claimable
> for blk I/O requests, no matter what worker/thread the claim/release
> is done from.
>
> Do you think it is not working as expected? What am I missing here?

I assumed we were talking about a situation where there are multiple
internal eMMC partitions, each with their own disk and queue. In that
case, a queue waits if there is another queue that is using the eMMC.

We should clarify whether that is the scenario we are looking at.
Zachary?

>
>> The above results in multiple hung tasks that lead to failures to boot.
>
> Of course, during boot there is also a card detect work running in
> parallel with blk I/O requests. This work is punted to the
> "system_freezable_wq" and it claims/releases the host as well.
>
> Perhaps that could have something to do with the problem...
>
>>
>> Handling complete_work on a separate workqueue avoids this by keeping
>> the work completion tasks separate from the other blk-mq work. This
>> allows the host to be released without getting blocked by other tasks
>> attempting to claim the host.
>>
>> Signed-off-by: Zachary Hays <zhays@lexmark.com>
>> ---
>>  drivers/mmc/core/block.c | 10 +++++++++-
>>  include/linux/mmc/card.h |  1 +
>>  2 files changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
>> index aef1185f383d..14f3fdb8c6bb 100644
>> --- a/drivers/mmc/core/block.c
>> +++ b/drivers/mmc/core/block.c
>> @@ -2112,7 +2112,7 @@ static void mmc_blk_mq_req_done(struct mmc_request *mrq)
>>                 if (waiting)
>>                         wake_up(&mq->wait);
>>                 else
>> -                       kblockd_schedule_work(&mq->complete_work);
>> +                       queue_work(mq->card->complete_wq, &mq->complete_work);
>>
>>                 return;
>>         }
>> @@ -2924,6 +2924,13 @@ static int mmc_blk_probe(struct mmc_card *card)
>>
>>         mmc_fixup_device(card, mmc_blk_fixups);
>>
>> +       card->complete_wq = alloc_workqueue("mmc_complete",
>> +                                       WQ_MEM_RECLAIM | WQ_HIGHPRI, 0);
>> +       if (unlikely(!card->complete_wq)) {
>> +               pr_err("Failed to create mmc completion workqueue");
>> +               return -ENOMEM;
>> +       }
>> +
>>         md = mmc_blk_alloc(card);
>>         if (IS_ERR(md))
>>                 return PTR_ERR(md);
>> @@ -2987,6 +2994,7 @@ static void mmc_blk_remove(struct mmc_card *card)
>>         pm_runtime_put_noidle(&card->dev);
>>         mmc_blk_remove_req(md);
>>         dev_set_drvdata(&card->dev, NULL);
>> +       destroy_workqueue(card->complete_wq);
>>  }
>>
>>  static int _mmc_blk_suspend(struct mmc_card *card)
>> diff --git a/include/linux/mmc/card.h b/include/linux/mmc/card.h
>> index de7377815b6b..8ef330027b13 100644
>> --- a/include/linux/mmc/card.h
>> +++ b/include/linux/mmc/card.h
>> @@ -308,6 +308,7 @@ struct mmc_card {
>>         unsigned int    nr_parts;
>>
>>         unsigned int    bouncesz;       /* Bounce buffer size */
>> +       struct workqueue_struct *complete_wq;   /* Private workqueue */
>>  };
>>
>>  static inline bool mmc_large_sector(struct mmc_card *card)
>> --
>> 2.7.4
>>
>
> So this change seems to solve the problem, which is really great.
> However, I think we really need to understand what goes wrong before
> we apply this as a fix.
>
> Unfortunately, I am also in a busy period, so I need a couple more
> days before I can help out running tests.
>
> Kind regards
> Uffe
>

^ permalink raw reply	[flat|nested] 5+ messages in thread
* Re: [PATCH] mmc: block: handle complete_work on separate workqueue
  2019-02-06 12:42 ` Adrian Hunter
@ 2019-02-06 14:08   ` Ulf Hansson
  2019-02-06 21:08     ` Zak Hays
       [not found]     ` <1549551788-2572-1-git-send-email-zhays@lexmark.com>
  0 siblings, 2 replies; 5+ messages in thread
From: Ulf Hansson @ 2019-02-06 14:08 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Zachary Hays, Christoph Hellwig, linux-mmc, linux-block, Linus Walleij

On Wed, 6 Feb 2019 at 13:43, Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> On 6/02/19 2:00 PM, Ulf Hansson wrote:
> > + Adrian, Linus
> >
> > On Tue, 5 Feb 2019 at 17:53, Zachary Hays <zhays@lexmark.com> wrote:
> >>
> >> The kblockd workqueue is created with the WQ_MEM_RECLAIM flag set.
> >> This generates a rescuer thread for that queue that will trigger when
> >> the CPU is under heavy load and collect the uncompleted work.
> >>
> >> In the case of mmc, this creates the possibility of a deadlock as
> >> other blk-mq work is also run on the same queue. For example:
> >>
> >> - worker 0 claims the mmc host
> >> - worker 1 attempts to claim the host
> >> - worker 0 schedules complete_work to release the host
> >> - rescuer thread is triggered after time-out and collects the dangling
> >>   work
> >> - rescuer thread attempts to complete the work in order starting with
> >>   claim host
> >> - the task to release host is now blocked by a task to claim it and
> >>   will never be called
> >>
> >
> > Adrian, I need your help to understand more of this. The above
> > description doesn't add up to me.
> >
> > In principle, already when "worker 1 attempts to claim the host" as
> > described above, it should succeed and should *not* need to wait for
> > the host to be released. Right?
>
> If it is the same queue, then yes. Although in that case there is only
> one work item for the hw queue, so there cannot be another worker.
> There could be another attempt to send a request directly, but that
> will not block - if the host controller is busy, BLK_STS_RESOURCE will
> be returned from ->queue_rq().

Right.

> >
> > The whole point of commit 6c0cedd1ef95 ("mmc: core: Introduce host
> > claiming by context") was to allow the mmc host to be re-claimable
> > for blk I/O requests, no matter what worker/thread the claim/release
> > is done from.
> >
> > Do you think it is not working as expected? What am I missing here?
>
> I assumed we were talking about a situation where there are multiple
> internal eMMC partitions, each with their own disk and queue. In that
> case, a queue waits if there is another queue that is using the eMMC.

Of course! I totally forgot about this case, that is most certainly
what must be happening!

>
> We should clarify whether that is the scenario we are looking at.
> Zachary?

Yes, please.

Assuming this is the case, I would also prefer an updated changelog
that describes this scenario.

Adrian, thanks a lot for your help!

[...]

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 5+ messages in thread
* RE: [PATCH] mmc: block: handle complete_work on separate workqueue
  2019-02-06 14:08 ` Ulf Hansson
@ 2019-02-06 21:08   ` Zak Hays
  1 sibling, 0 replies; 5+ messages in thread
From: Zak Hays @ 2019-02-06 21:08 UTC (permalink / raw)
  To: Ulf Hansson, Adrian Hunter
  Cc: Christoph Hellwig, linux-mmc, linux-block, Linus Walleij

> -----Original Message-----
> From: Ulf Hansson <ulf.hansson@linaro.org>
> Sent: Wednesday, February 6, 2019 9:08 AM
> To: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Zak Hays <zak.hays@lexmark.com>; Christoph Hellwig <hch@lst.de>;
> linux-mmc@vger.kernel.org; linux-block <linux-block@vger.kernel.org>;
> Linus Walleij <linus.walleij@linaro.org>
> Subject: Re: [PATCH] mmc: block: handle complete_work on separate
> workqueue
>
> On Wed, 6 Feb 2019 at 13:43, Adrian Hunter <adrian.hunter@intel.com>
> wrote:
> >
> > On 6/02/19 2:00 PM, Ulf Hansson wrote:
> > > + Adrian, Linus
> > >
> > > On Tue, 5 Feb 2019 at 17:53, Zachary Hays <zhays@lexmark.com> wrote:
> > >>
> > >> The kblockd workqueue is created with the WQ_MEM_RECLAIM flag set.
> > >> This generates a rescuer thread for that queue that will trigger when
> > >> the CPU is under heavy load and collect the uncompleted work.
> > >>
> > >> In the case of mmc, this creates the possibility of a deadlock as
> > >> other blk-mq work is also run on the same queue. For example:
> > >>
> > >> - worker 0 claims the mmc host
> > >> - worker 1 attempts to claim the host
> > >> - worker 0 schedules complete_work to release the host
> > >> - rescuer thread is triggered after time-out and collects the dangling
> > >>   work
> > >> - rescuer thread attempts to complete the work in order starting with
> > >>   claim host
> > >> - the task to release host is now blocked by a task to claim it and
> > >>   will never be called
> > >>
> > >
> > > Adrian, I need your help to understand more of this. The above
> > > description doesn't add up to me.
> > >
> > > In principle, already when "worker 1 attempts to claim the host" as
> > > described above, it should succeed and should *not* need to wait for
> > > the host to be released. Right?
> >
> > If it is the same queue, then yes. Although in that case there is only
> > one work item for the hw queue, so there cannot be another worker.
> > There could be another attempt to send a request directly, but that
> > will not block - if the host controller is busy, BLK_STS_RESOURCE will
> > be returned from ->queue_rq().
>
> Right.
>
> > >
> > > The whole point of commit 6c0cedd1ef95 ("mmc: core: Introduce host
> > > claiming by context") was to allow the mmc host to be re-claimable
> > > for blk I/O requests, no matter what worker/thread the claim/release
> > > is done from.
> > >
> > > Do you think it is not working as expected? What am I missing here?
> >
> > I assumed we were talking about a situation where there are multiple
> > internal eMMC partitions, each with their own disk and queue. In that
> > case, a queue waits if there is another queue that is using the eMMC.
>
> Of course! I totally forgot about this case, that is most certainly
> what must be happening!
>
> >
> > We should clarify whether that is the scenario we are looking at.
> > Zachary?
>
> Yes, please.

Yes, this is the case. There are multiple partitions on the eMMC.

>
> Assuming this is the case, I would also prefer an updated changelog
> that describes this scenario.

I will update the commit message to clarify this and resubmit.

> Adrian, thanks a lot for your help!
>
> [...]
>
> Kind regards
> Uffe

^ permalink raw reply	[flat|nested] 5+ messages in thread
[parent not found: <1549551788-2572-1-git-send-email-zhays@lexmark.com>]
* Re: [PATCH v2] mmc: block: handle complete_work on separate workqueue
       [not found] ` <1549551788-2572-1-git-send-email-zhays@lexmark.com>
@ 2019-02-08 11:52   ` Ulf Hansson
  0 siblings, 0 replies; 5+ messages in thread
From: Ulf Hansson @ 2019-02-08 11:52 UTC (permalink / raw)
  To: Zachary Hays
  Cc: Adrian Hunter, Christoph Hellwig, linux-mmc, linux-block, Linus Walleij

On Thu, 7 Feb 2019 at 16:55, Zachary Hays <zhays@lexmark.com> wrote:
>
> The kblockd workqueue is created with the WQ_MEM_RECLAIM flag set.
> This generates a rescuer thread for that queue that will trigger when
> the CPU is under heavy load and collect the uncompleted work.
>
> In the case of mmc, this creates the possibility of a deadlock when
> there are multiple partitions on the device as other blk-mq work is
> also run on the same queue. For example:
>
> - worker 0 claims the mmc host to work on partition 1
> - worker 1 attempts to claim the host for partition 2 but has to wait
>   for worker 0 to finish
> - worker 0 schedules complete_work to release the host
> - rescuer thread is triggered after time-out and collects the dangling
>   work
> - rescuer thread attempts to complete the work in order starting with
>   claim host
> - the task to release host is now blocked by a task to claim it and
>   will never be called
>
> The above results in multiple hung tasks that lead to failures to
> mount partitions.
>
> Handling complete_work on a separate workqueue avoids this by keeping
> the work completion tasks separate from the other blk-mq work. This
> allows the host to be released without getting blocked by other tasks
> attempting to claim the host.
>
> Signed-off-by: Zachary Hays <zhays@lexmark.com>

Applied for fixes, adding a fixes+stable tag. Thanks Zachary for
fixing this non-trivial problem!

And thanks also to Adrian/Christoph for your valuable input to this
(feel free to reply with your reviewed-by tag).

Kind regards
Uffe

> ---
>  drivers/mmc/core/block.c | 10 +++++++++-
>  include/linux/mmc/card.h |  1 +
>  2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
> index aef1185f383d..14f3fdb8c6bb 100644
> --- a/drivers/mmc/core/block.c
> +++ b/drivers/mmc/core/block.c
> @@ -2112,7 +2112,7 @@ static void mmc_blk_mq_req_done(struct mmc_request *mrq)
>                 if (waiting)
>                         wake_up(&mq->wait);
>                 else
> -                       kblockd_schedule_work(&mq->complete_work);
> +                       queue_work(mq->card->complete_wq, &mq->complete_work);
>
>                 return;
>         }
> @@ -2924,6 +2924,13 @@ static int mmc_blk_probe(struct mmc_card *card)
>
>         mmc_fixup_device(card, mmc_blk_fixups);
>
> +       card->complete_wq = alloc_workqueue("mmc_complete",
> +                                       WQ_MEM_RECLAIM | WQ_HIGHPRI, 0);
> +       if (unlikely(!card->complete_wq)) {
> +               pr_err("Failed to create mmc completion workqueue");
> +               return -ENOMEM;
> +       }
> +
>         md = mmc_blk_alloc(card);
>         if (IS_ERR(md))
>                 return PTR_ERR(md);
> @@ -2987,6 +2994,7 @@ static void mmc_blk_remove(struct mmc_card *card)
>         pm_runtime_put_noidle(&card->dev);
>         mmc_blk_remove_req(md);
>         dev_set_drvdata(&card->dev, NULL);
> +       destroy_workqueue(card->complete_wq);
>  }
>
>  static int _mmc_blk_suspend(struct mmc_card *card)
> diff --git a/include/linux/mmc/card.h b/include/linux/mmc/card.h
> index de7377815b6b..8ef330027b13 100644
> --- a/include/linux/mmc/card.h
> +++ b/include/linux/mmc/card.h
> @@ -308,6 +308,7 @@ struct mmc_card {
>         unsigned int    nr_parts;
>
>         unsigned int    bouncesz;       /* Bounce buffer size */
> +       struct workqueue_struct *complete_wq;   /* Private workqueue */
>  };
>
>  static inline bool mmc_large_sector(struct mmc_card *card)
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 5+ messages in thread
end of thread, other threads:[~2019-02-08 11:53 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1549385523-20521-1-git-send-email-zhays@lexmark.com>
2019-02-06 12:00 ` [PATCH] mmc: block: handle complete_work on separate workqueue Ulf Hansson
2019-02-06 12:42   ` Adrian Hunter
2019-02-06 14:08     ` Ulf Hansson
2019-02-06 21:08       ` Zak Hays
     [not found]       ` <1549551788-2572-1-git-send-email-zhays@lexmark.com>
2019-02-08 11:52         ` [PATCH v2] " Ulf Hansson