* [PATCH] bus: mhi: Command completion workaround
@ 2021-03-10 11:38 Loic Poulain
  2021-03-10 16:19 ` Jeffrey Hugo
  2021-03-10 20:43 ` Hemant Kumar
  0 siblings, 2 replies; 6+ messages in thread
From: Loic Poulain @ 2021-03-10 11:38 UTC (permalink / raw)
  To: manivannan.sadhasivam, hemantk; +Cc: linux-arm-msm, Loic Poulain

Some buggy hardware (e.g. sdx24) may report the current command
ring write pointer (wp) instead of the command completion pointer.
This is wrong and causes a completion timeout. We can however deal
with that situation by completing the cmd n-1 element, which is
what the device actually completes.

Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
---
 drivers/bus/mhi/core/main.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/bus/mhi/core/main.c b/drivers/bus/mhi/core/main.c
index 16b9640..3e3c520 100644
--- a/drivers/bus/mhi/core/main.c
+++ b/drivers/bus/mhi/core/main.c
@@ -707,6 +707,7 @@ static void mhi_process_cmd_completion(struct mhi_controller *mhi_cntrl,
 {
 	dma_addr_t ptr = MHI_TRE_GET_EV_PTR(tre);
 	struct mhi_cmd *cmd_ring = &mhi_cntrl->mhi_cmd[PRIMARY_CMD_RING];
+	struct device *dev = &mhi_cntrl->mhi_dev->dev;
 	struct mhi_ring *mhi_ring = &cmd_ring->ring;
 	struct mhi_tre *cmd_pkt;
 	struct mhi_chan *mhi_chan;
@@ -714,6 +715,23 @@ static void mhi_process_cmd_completion(struct mhi_controller *mhi_cntrl,
 
 	cmd_pkt = mhi_to_virtual(mhi_ring, ptr);
 
+	if (unlikely(cmd_pkt == mhi_ring->wp)) {
+		/* Some buggy hardwares (e.g sdx24) sometimes report the current
+		 * command ring wp pointer instead of the command completion
+		 * pointer. It's obviously wrong, causing completion timeout. We
+		 * can however deal with that situation by completing the cmd
+		 * n-1 element.
+		 */
+		void *ring_ptr = (void *)cmd_pkt - mhi_ring->el_size;
+
+		if (ring_ptr < mhi_ring->base)
+			ring_ptr += mhi_ring->len;
+
+		cmd_pkt = ring_ptr;
+
+		dev_warn(dev, "Bad completion pointer (ptr == ring_wp)\n");
+	}
+
 	chan = MHI_TRE_GET_CMD_CHID(cmd_pkt);
 	mhi_chan = &mhi_cntrl->mhi_chan[chan];
 	write_lock_bh(&mhi_chan->lock);
-- 
2.7.4
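
The step-back-with-wrap logic in the hunk above can be tried outside the kernel. The following is only a minimal userspace sketch, not the driver code: `struct ring`, `ring_prev_element()` and `fixup_completion_ptr()` are hypothetical names that stand in for the `base`, `wp`, `el_size` and `len` fields the patch reads from `struct mhi_ring`. Unlike the patch, it tests for the base element before subtracting, so it never forms a pointer below the start of the buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the struct mhi_ring fields used by the patch. */
struct ring {
	void *base;     /* first element of the ring buffer */
	void *wp;       /* host write pointer */
	size_t el_size; /* size of one ring element, in bytes */
	size_t len;     /* total ring length, in bytes */
};

/* Return the element preceding ptr, wrapping from base to the last element. */
static void *ring_prev_element(const struct ring *ring, void *ptr)
{
	char *p = ptr;
	char *base = ring->base;

	/* ptr must lie inside the ring. */
	assert(p >= base && p < base + ring->len);

	if (p == base)
		return base + ring->len - ring->el_size;

	return p - ring->el_size;
}

/*
 * Mirror of the patch's check: if the device reported the write pointer,
 * complete the element just before it; otherwise trust the reported pointer.
 */
static void *fixup_completion_ptr(const struct ring *ring, void *reported)
{
	if (reported == ring->wp)
		return ring_prev_element(ring, reported);

	return reported;
}
```

With a 64-byte ring of 16-byte elements, a reported pointer equal to a `wp` at offset 32 is moved back to offset 16, and a reported pointer equal to a `wp` at `base` wraps to offset 48.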



* Re: [PATCH] bus: mhi: Command completion workaround
  2021-03-10 11:38 [PATCH] bus: mhi: Command completion workaround Loic Poulain
@ 2021-03-10 16:19 ` Jeffrey Hugo
  2021-03-11  8:05   ` Loic Poulain
  2021-03-10 20:43 ` Hemant Kumar
  1 sibling, 1 reply; 6+ messages in thread
From: Jeffrey Hugo @ 2021-03-10 16:19 UTC (permalink / raw)
  To: Loic Poulain, manivannan.sadhasivam, hemantk; +Cc: linux-arm-msm

On 3/10/2021 4:38 AM, Loic Poulain wrote:
> Some buggy hardwares (e.g sdx24) may report the current command
> ring wp pointer instead of the command completion pointer. It's
> obviously wrong, causing completion timeout. We can however deal
> with that situation by completing the cmd n-1 element, which is
> what the device actually completes.
> 
> Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
> ---
>   drivers/bus/mhi/core/main.c | 18 ++++++++++++++++++
>   1 file changed, 18 insertions(+)
> 
> diff --git a/drivers/bus/mhi/core/main.c b/drivers/bus/mhi/core/main.c
> index 16b9640..3e3c520 100644
> --- a/drivers/bus/mhi/core/main.c
> +++ b/drivers/bus/mhi/core/main.c
> @@ -707,6 +707,7 @@ static void mhi_process_cmd_completion(struct mhi_controller *mhi_cntrl,
>   {
>   	dma_addr_t ptr = MHI_TRE_GET_EV_PTR(tre);
>   	struct mhi_cmd *cmd_ring = &mhi_cntrl->mhi_cmd[PRIMARY_CMD_RING];
> +	struct device *dev = &mhi_cntrl->mhi_dev->dev;
>   	struct mhi_ring *mhi_ring = &cmd_ring->ring;
>   	struct mhi_tre *cmd_pkt;
>   	struct mhi_chan *mhi_chan;
> @@ -714,6 +715,23 @@ static void mhi_process_cmd_completion(struct mhi_controller *mhi_cntrl,
>   
>   	cmd_pkt = mhi_to_virtual(mhi_ring, ptr);
>   
> +	if (unlikely(cmd_pkt == mhi_ring->wp)) {
> +		/* Some buggy hardwares (e.g sdx24) sometimes report the current
> +		 * command ring wp pointer instead of the command completion
> +		 * pointer. It's obviously wrong, causing completion timeout. We
> +		 * can however deal with that situation by completing the cmd
> +		 * n-1 element.
> +		 */
> +		void *ring_ptr = (void *)cmd_pkt - mhi_ring->el_size;
> +
> +		if (ring_ptr < mhi_ring->base)
> +			ring_ptr += mhi_ring->len;
> +
> +		cmd_pkt = ring_ptr;
> +
> +		dev_warn(dev, "Bad completion pointer (ptr == ring_wp)\n");

Is there value in having this warning every time?  I wonder if a _once 
version would be better to not flood the kernel log.  Although this is 
only for commands, which shouldn't be frequent, so maybe that is the 
implicit rate limiter.

What do you think?

> +	}
> +
>   	chan = MHI_TRE_GET_CMD_CHID(cmd_pkt);
>   	mhi_chan = &mhi_cntrl->mhi_chan[chan];
>   	write_lock_bh(&mhi_chan->lock);
> 


-- 
Jeffrey Hugo
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.
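
The trade-off raised in this reply (warn on every occurrence versus only the first) can be illustrated in userspace. This is a rough sketch under stated assumptions: `warn_every_time()` and `warn_once()` are hypothetical helpers, and the counters exist only so the difference can be observed. In the kernel the corresponding choices would be `dev_warn()`, `dev_warn_once()` or `dev_warn_ratelimited()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical counters, present only to make the behavior observable. */
static unsigned int warn_every_count;
static unsigned int warn_once_count;

/* Emit the message on every call, like a plain dev_warn(). */
static void warn_every_time(const char *msg)
{
	assert(msg != NULL);

	fprintf(stderr, "warning: %s\n", msg);
	warn_every_count++;
}

/* Emit the message on the first call only, like a _once variant. */
static void warn_once(const char *msg)
{
	static bool warned;

	assert(msg != NULL);

	if (warned)
		return;

	warned = true;
	fprintf(stderr, "warning: %s\n", msg);
	warn_once_count++;
}
```

Calling both helpers three times logs three messages from the first and one from the second, which is the visibility difference under discussion: the `_once` form keeps the log quiet but also stops reminding anyone that the device is misbehaving.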


* Re: [PATCH] bus: mhi: Command completion workaround
  2021-03-10 11:38 [PATCH] bus: mhi: Command completion workaround Loic Poulain
  2021-03-10 16:19 ` Jeffrey Hugo
@ 2021-03-10 20:43 ` Hemant Kumar
  2021-03-11  7:56   ` Loic Poulain
  1 sibling, 1 reply; 6+ messages in thread
From: Hemant Kumar @ 2021-03-10 20:43 UTC (permalink / raw)
  To: Loic Poulain, manivannan.sadhasivam; +Cc: linux-arm-msm

Hi Loic,

On 3/10/21 3:38 AM, Loic Poulain wrote:
> Some buggy hardwares (e.g sdx24) may report the current command
> ring wp pointer instead of the command completion pointer. It's
> obviously wrong, causing completion timeout. We can however deal
> with that situation by completing the cmd n-1 element, which is
> what the device actually completes.
> 
> Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
> ---
>   drivers/bus/mhi/core/main.c | 18 ++++++++++++++++++
>   1 file changed, 18 insertions(+)
> 
> diff --git a/drivers/bus/mhi/core/main.c b/drivers/bus/mhi/core/main.c
> index 16b9640..3e3c520 100644
> --- a/drivers/bus/mhi/core/main.c
> +++ b/drivers/bus/mhi/core/main.c
> @@ -707,6 +707,7 @@ static void mhi_process_cmd_completion(struct mhi_controller *mhi_cntrl,
>   {
>   	dma_addr_t ptr = MHI_TRE_GET_EV_PTR(tre);
>   	struct mhi_cmd *cmd_ring = &mhi_cntrl->mhi_cmd[PRIMARY_CMD_RING];
> +	struct device *dev = &mhi_cntrl->mhi_dev->dev;
>   	struct mhi_ring *mhi_ring = &cmd_ring->ring;
>   	struct mhi_tre *cmd_pkt;
>   	struct mhi_chan *mhi_chan;
> @@ -714,6 +715,23 @@ static void mhi_process_cmd_completion(struct mhi_controller *mhi_cntrl,
>   
>   	cmd_pkt = mhi_to_virtual(mhi_ring, ptr);
>   
> +	if (unlikely(cmd_pkt == mhi_ring->wp)) {
As per the spec: the location of the command ring read pointer is reported
to the host on the command completion events in the primary event ring.

If the device is buggy and reports WP instead of RP, we should not
work around it by processing WP - 1. We can print a warning if cmd_pkt !=
mhi_ring->rp and let the command completion time out. This needs to be
fixed on the device side. We cannot accommodate a device-side bug on the host side.

> +		/* Some buggy hardwares (e.g sdx24) sometimes report the current
> +		 * command ring wp pointer instead of the command completion
> +		 * pointer. It's obviously wrong, causing completion timeout. We
> +		 * can however deal with that situation by completing the cmd
> +		 * n-1 element.
> +		 */
> +		void *ring_ptr = (void *)cmd_pkt - mhi_ring->el_size;
> +
> +		if (ring_ptr < mhi_ring->base)
> +			ring_ptr += mhi_ring->len;
> +
> +		cmd_pkt = ring_ptr;
> +
> +		dev_warn(dev, "Bad completion pointer (ptr == ring_wp)\n");
> +	}
> +
>   	chan = MHI_TRE_GET_CMD_CHID(cmd_pkt);
>   	mhi_chan = &mhi_cntrl->mhi_chan[chan];
>   	write_lock_bh(&mhi_chan->lock);
> 

Hi Mani,

What do you think about this workaround?

Thanks,
Hemant
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
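
The alternative proposed in this reply (warn on a mismatch and let the command time out, rather than correcting the pointer) could look roughly like the userspace sketch below. It is an illustration only: `struct cmd_ring_view`, `enum completion_verdict` and `check_completion()` are hypothetical, and the sketch assumes the host's local `rp` points at the oldest outstanding command, which is what a valid completion event should reference.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical view of the two ring pointers relevant to the check. */
struct cmd_ring_view {
	const void *rp; /* assumed: oldest outstanding command element */
	const void *wp; /* next free slot for the host to write */
};

enum completion_verdict {
	COMPLETION_OK,       /* pointer matches rp, process normally */
	COMPLETION_REJECTED, /* pointer is wrong, leave the command pending */
};

/*
 * Strict check: accept only a completion that references rp. On mismatch
 * the caller skips completion, so the waiter eventually times out.
 */
static enum completion_verdict
check_completion(const struct cmd_ring_view *ring, const void *reported)
{
	assert(ring != NULL);

	if (reported != ring->rp) {
		fprintf(stderr, "Bad completion pointer (ptr != ring_rp)\n");
		return COMPLETION_REJECTED;
	}

	return COMPLETION_OK;
}
```

Compared with the patch, this keeps the host strictly aligned with the spec, at the cost that affected devices keep hitting command timeouts until their firmware is fixed.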


* Re: [PATCH] bus: mhi: Command completion workaround
  2021-03-10 20:43 ` Hemant Kumar
@ 2021-03-11  7:56   ` Loic Poulain
  0 siblings, 0 replies; 6+ messages in thread
From: Loic Poulain @ 2021-03-11  7:56 UTC (permalink / raw)
  To: Hemant Kumar; +Cc: Manivannan Sadhasivam, linux-arm-msm

Hi Hemant,

On Wed, 10 Mar 2021 at 21:43, Hemant Kumar <hemantk@codeaurora.org> wrote:
>
> Hi Loic,
>
> On 3/10/21 3:38 AM, Loic Poulain wrote:
> > Some buggy hardwares (e.g sdx24) may report the current command
> > ring wp pointer instead of the command completion pointer. It's
> > obviously wrong, causing completion timeout. We can however deal
> > with that situation by completing the cmd n-1 element, which is
> > what the device actually completes.
> >
> > Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
> > ---
> >   drivers/bus/mhi/core/main.c | 18 ++++++++++++++++++
> >   1 file changed, 18 insertions(+)
> >
> > diff --git a/drivers/bus/mhi/core/main.c b/drivers/bus/mhi/core/main.c
> > index 16b9640..3e3c520 100644
> > --- a/drivers/bus/mhi/core/main.c
> > +++ b/drivers/bus/mhi/core/main.c
> > @@ -707,6 +707,7 @@ static void mhi_process_cmd_completion(struct mhi_controller *mhi_cntrl,
> >   {
> >       dma_addr_t ptr = MHI_TRE_GET_EV_PTR(tre);
> >       struct mhi_cmd *cmd_ring = &mhi_cntrl->mhi_cmd[PRIMARY_CMD_RING];
> > +     struct device *dev = &mhi_cntrl->mhi_dev->dev;
> >       struct mhi_ring *mhi_ring = &cmd_ring->ring;
> >       struct mhi_tre *cmd_pkt;
> >       struct mhi_chan *mhi_chan;
> > @@ -714,6 +715,23 @@ static void mhi_process_cmd_completion(struct mhi_controller *mhi_cntrl,
> >
> >       cmd_pkt = mhi_to_virtual(mhi_ring, ptr);
> >
> > +     if (unlikely(cmd_pkt == mhi_ring->wp)) {
> As per spec : The location of the command ring read pointer is reported
> to the host on the command completion events in the primary event ring.
>
> If device is buggy and updates with WP instead of Rp, we should not
> workaround it by processing Wp - 1. We can print a warning if cmd_pkt !=
> mhi_ring->rp and let the command completion timeout. This needs to be
> fixed by device. We can not accommodate device side bug in host side.

I see your point, but the goal here is not to accommodate the device but
the users of such 'buggy' devices. The kernel has a ton of 'quirks' in
various drivers. I'm not a fan of this, but my arguments are:
- It captures a behavior that was not captured until now
- It works around an issue without any impact on non-'buggy' devices
- It clearly prints a warning to highlight that it's a known issue that
  should be fixed
- Fixing devices in the wild is quite complex, and we may have to live with it.

Regards,
Loic


* Re: [PATCH] bus: mhi: Command completion workaround
  2021-03-10 16:19 ` Jeffrey Hugo
@ 2021-03-11  8:05   ` Loic Poulain
  2021-03-11 14:46     ` Jeffrey Hugo
  0 siblings, 1 reply; 6+ messages in thread
From: Loic Poulain @ 2021-03-11  8:05 UTC (permalink / raw)
  To: Jeffrey Hugo; +Cc: Manivannan Sadhasivam, Hemant Kumar, linux-arm-msm

Hi Jeffrey,

On Wed, 10 Mar 2021 at 17:19, Jeffrey Hugo <jhugo@codeaurora.org> wrote:
>
> On 3/10/2021 4:38 AM, Loic Poulain wrote:
> > Some buggy hardwares (e.g sdx24) may report the current command
> > ring wp pointer instead of the command completion pointer. It's
> > obviously wrong, causing completion timeout. We can however deal
> > with that situation by completing the cmd n-1 element, which is
> > what the device actually completes.
> >
> > Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
> > ---
> >   drivers/bus/mhi/core/main.c | 18 ++++++++++++++++++
> >   1 file changed, 18 insertions(+)
> >
> > diff --git a/drivers/bus/mhi/core/main.c b/drivers/bus/mhi/core/main.c
> > index 16b9640..3e3c520 100644
> > --- a/drivers/bus/mhi/core/main.c
> > +++ b/drivers/bus/mhi/core/main.c
> > @@ -707,6 +707,7 @@ static void mhi_process_cmd_completion(struct mhi_controller *mhi_cntrl,
> >   {
> >       dma_addr_t ptr = MHI_TRE_GET_EV_PTR(tre);
> >       struct mhi_cmd *cmd_ring = &mhi_cntrl->mhi_cmd[PRIMARY_CMD_RING];
> > +     struct device *dev = &mhi_cntrl->mhi_dev->dev;
> >       struct mhi_ring *mhi_ring = &cmd_ring->ring;
> >       struct mhi_tre *cmd_pkt;
> >       struct mhi_chan *mhi_chan;
> > @@ -714,6 +715,23 @@ static void mhi_process_cmd_completion(struct mhi_controller *mhi_cntrl,
> >
> >       cmd_pkt = mhi_to_virtual(mhi_ring, ptr);
> >
> > +     if (unlikely(cmd_pkt == mhi_ring->wp)) {
> > +             /* Some buggy hardwares (e.g sdx24) sometimes report the current
> > +              * command ring wp pointer instead of the command completion
> > +              * pointer. It's obviously wrong, causing completion timeout. We
> > +              * can however deal with that situation by completing the cmd
> > +              * n-1 element.
> > +              */
> > +             void *ring_ptr = (void *)cmd_pkt - mhi_ring->el_size;
> > +
> > +             if (ring_ptr < mhi_ring->base)
> > +                     ring_ptr += mhi_ring->len;
> > +
> > +             cmd_pkt = ring_ptr;
> > +
> > +             dev_warn(dev, "Bad completion pointer (ptr == ring_wp)\n");
>
> Is there value in having this warning every time?  I wonder if a _once
> version would be better to not flood the kernel log.  Although this is
> only for commands, which shouldn't be frequent, so maybe that is the
> implicit rate limiter.
>
> What do you think?

As you said, it's kind of self rate-limited because command operations
are infrequent, mostly for starting and stopping channels. A _once
variant would hide the issue a bit, and would probably not be annoying
enough to raise curiosity.

>
> > +     }
> > +
> >       chan = MHI_TRE_GET_CMD_CHID(cmd_pkt);
> >       mhi_chan = &mhi_cntrl->mhi_chan[chan];
> >       write_lock_bh(&mhi_chan->lock);
> >
>
>
> --
> Jeffrey Hugo
> Qualcomm Technologies, Inc. is a member of the
> Code Aurora Forum, a Linux Foundation Collaborative Project.


* Re: [PATCH] bus: mhi: Command completion workaround
  2021-03-11  8:05   ` Loic Poulain
@ 2021-03-11 14:46     ` Jeffrey Hugo
  0 siblings, 0 replies; 6+ messages in thread
From: Jeffrey Hugo @ 2021-03-11 14:46 UTC (permalink / raw)
  To: Loic Poulain; +Cc: Manivannan Sadhasivam, Hemant Kumar, linux-arm-msm

On 3/11/2021 1:05 AM, Loic Poulain wrote:
> Hi Jeffrey,
> 
> On Wed, 10 Mar 2021 at 17:19, Jeffrey Hugo <jhugo@codeaurora.org> wrote:
>>
>> On 3/10/2021 4:38 AM, Loic Poulain wrote:
>>> Some buggy hardwares (e.g sdx24) may report the current command
>>> ring wp pointer instead of the command completion pointer. It's
>>> obviously wrong, causing completion timeout. We can however deal
>>> with that situation by completing the cmd n-1 element, which is
>>> what the device actually completes.
>>>
>>> Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
>>> ---
>>>    drivers/bus/mhi/core/main.c | 18 ++++++++++++++++++
>>>    1 file changed, 18 insertions(+)
>>>
>>> diff --git a/drivers/bus/mhi/core/main.c b/drivers/bus/mhi/core/main.c
>>> index 16b9640..3e3c520 100644
>>> --- a/drivers/bus/mhi/core/main.c
>>> +++ b/drivers/bus/mhi/core/main.c
>>> @@ -707,6 +707,7 @@ static void mhi_process_cmd_completion(struct mhi_controller *mhi_cntrl,
>>>    {
>>>        dma_addr_t ptr = MHI_TRE_GET_EV_PTR(tre);
>>>        struct mhi_cmd *cmd_ring = &mhi_cntrl->mhi_cmd[PRIMARY_CMD_RING];
>>> +     struct device *dev = &mhi_cntrl->mhi_dev->dev;
>>>        struct mhi_ring *mhi_ring = &cmd_ring->ring;
>>>        struct mhi_tre *cmd_pkt;
>>>        struct mhi_chan *mhi_chan;
>>> @@ -714,6 +715,23 @@ static void mhi_process_cmd_completion(struct mhi_controller *mhi_cntrl,
>>>
>>>        cmd_pkt = mhi_to_virtual(mhi_ring, ptr);
>>>
>>> +     if (unlikely(cmd_pkt == mhi_ring->wp)) {
>>> +             /* Some buggy hardwares (e.g sdx24) sometimes report the current
>>> +              * command ring wp pointer instead of the command completion
>>> +              * pointer. It's obviously wrong, causing completion timeout. We
>>> +              * can however deal with that situation by completing the cmd
>>> +              * n-1 element.
>>> +              */
>>> +             void *ring_ptr = (void *)cmd_pkt - mhi_ring->el_size;
>>> +
>>> +             if (ring_ptr < mhi_ring->base)
>>> +                     ring_ptr += mhi_ring->len;
>>> +
>>> +             cmd_pkt = ring_ptr;
>>> +
>>> +             dev_warn(dev, "Bad completion pointer (ptr == ring_wp)\n");
>>
>> Is there value in having this warning every time?  I wonder if a _once
>> version would be better to not flood the kernel log.  Although this is
>> only for commands, which shouldn't be frequent, so maybe that is the
>> implicit rate limiter.
>>
>> What do you think?
> 
> As you said it's kind of self rate-limited because of the unfrequent
> command operations, mostly for starting and stopping channels. A _once
> variant would hide the issue a bit, and probably not annoying enough
> to raise curiosity.

That's fair.

I happened to notice just now that the block comment you have above is 
not the proper style.  That looks like the netdev style, but we are not 
in the netdev area.

I'm curious to see where you and Hemant land on his comment.

-- 
Jeffrey Hugo
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

