linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Mailbox: Complete wait event only if Tx was successful
@ 2014-12-10 20:16 Ashwin Chaugule
  2014-12-12  8:43 ` [Linaro-acpi] " Sudeep Holla
  2014-12-12 10:21 ` Jassi Brar
  0 siblings, 2 replies; 7+ messages in thread
From: Ashwin Chaugule @ 2014-12-10 20:16 UTC (permalink / raw)
  To: linux-arm-kernel

If a wait_for_completion_timeout() call returns due to a timeout,
the mbox code can still call complete() after returning from the wait.
This can cause subsequent transmissions on a channel to fail, since
the wait_for_completion_timeout() sees the completion variable
is !=0, caused by the erroneous complete() call, and immediately
returns without waiting for the time as expected by the client.

Fix this by calling complete() only if the TX was successful.

Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
---
 drivers/mailbox/mailbox.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
index 17e9e4a..4acaddb 100644
--- a/drivers/mailbox/mailbox.c
+++ b/drivers/mailbox/mailbox.c
@@ -101,7 +101,7 @@ static void tx_tick(struct mbox_chan *chan, int r)
 	if (mssg && chan->cl->tx_done)
 		chan->cl->tx_done(chan->cl, mssg, r);
 
-	if (chan->cl->tx_block)
+	if ((!r) && chan->cl->tx_block)
 		complete(&chan->tx_complete);
 }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [Linaro-acpi] [PATCH] Mailbox: Complete wait event only if Tx was successful
  2014-12-10 20:16 [PATCH] Mailbox: Complete wait event only if Tx was successful Ashwin Chaugule
@ 2014-12-12  8:43 ` Sudeep Holla
  2014-12-12 17:47   ` Ashwin Chaugule
  2014-12-12 10:21 ` Jassi Brar
  1 sibling, 1 reply; 7+ messages in thread
From: Sudeep Holla @ 2014-12-12  8:43 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Ashwin,

On Thursday 11 December 2014 01:46 AM, Ashwin Chaugule wrote:
> If a wait_for_completion_timeout() call returns due to a timeout,
> the mbox code can still call complete() after returning from the wait.
> This can cause subsequent transmissions on a channel to fail, since
> the wait_for_completion_timeout() sees the completion variable
> is !=0, caused by the erroneous complete() call, and immediately
> returns without waiting for the time as expected by the client.
>
> Fix this by calling complete() only if the TX was successful.
>
> Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
> ---
>   drivers/mailbox/mailbox.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
> index 17e9e4a..4acaddb 100644
> --- a/drivers/mailbox/mailbox.c
> +++ b/drivers/mailbox/mailbox.c
> @@ -101,7 +101,7 @@ static void tx_tick(struct mbox_chan *chan, int r)
>   	if (mssg && chan->cl->tx_done)
>   		chan->cl->tx_done(chan->cl, mssg, r);
>
> -	if (chan->cl->tx_block)
> +	if ((!r) && chan->cl->tx_block)
>   		complete(&chan->tx_complete);

Just curious to check if there's another possible race which is
a different issue.

Suppose the timer fired and indicated that the Tx is complete, then
it tries to execute complete while the wait_for_completion_timeout timed
out. Does that make sense ?

So if yes, how about adding !completion_done(..) to the check while you
are at this ?

Regards,
Sudeep

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] Mailbox: Complete wait event only if Tx was successful
  2014-12-10 20:16 [PATCH] Mailbox: Complete wait event only if Tx was successful Ashwin Chaugule
  2014-12-12  8:43 ` [Linaro-acpi] " Sudeep Holla
@ 2014-12-12 10:21 ` Jassi Brar
  2014-12-12 17:58   ` Ashwin Chaugule
  1 sibling, 1 reply; 7+ messages in thread
From: Jassi Brar @ 2014-12-12 10:21 UTC (permalink / raw)
  To: linux-arm-kernel

On 11 December 2014 at 01:46, Ashwin Chaugule
<ashwin.chaugule@linaro.org> wrote:
> If a wait_for_completion_timeout() call returns due to a timeout,
> the mbox code can still call complete() after returning from the wait.
> This can cause subsequent transmissions on a channel to fail, since
> the wait_for_completion_timeout() sees the completion variable
> is !=0, caused by the erroneous complete() call, and immediately
> returns without waiting for the time as expected by the client.
>
> Fix this by calling complete() only if the TX was successful.
>
> Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
> ---
>  drivers/mailbox/mailbox.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
> index 17e9e4a..4acaddb 100644
> --- a/drivers/mailbox/mailbox.c
> +++ b/drivers/mailbox/mailbox.c
> @@ -101,7 +101,7 @@ static void tx_tick(struct mbox_chan *chan, int r)
>         if (mssg && chan->cl->tx_done)
>                 chan->cl->tx_done(chan->cl, mssg, r);
>
> -       if (chan->cl->tx_block)
> +       if ((!r) && chan->cl->tx_block)
>                 complete(&chan->tx_complete);
>
Thanks for finding the bug.
However the fix is flawed. complete() could also be done from
mbox_chan_txdone() calling tx_tick(). And if the controller returned
error, we could never pass on that error code to the user (timeout
fires and then we will move on with -EIO).
Since we could never prevent the controller from returning -EIO as the
error, I think we have to explicitly tell tx_tick() if it needs to
complete() or not.

-Jassi

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Linaro-acpi] [PATCH] Mailbox: Complete wait event only if Tx was successful
  2014-12-12  8:43 ` [Linaro-acpi] " Sudeep Holla
@ 2014-12-12 17:47   ` Ashwin Chaugule
  2014-12-16 11:36     ` Sudeep Holla
  0 siblings, 1 reply; 7+ messages in thread
From: Ashwin Chaugule @ 2014-12-12 17:47 UTC (permalink / raw)
  To: linux-arm-kernel

On 12 December 2014 at 03:43, Sudeep Holla <sudeep.holla@arm.com> wrote:
> Hi Ashwin,

Hi,

> On Thursday 11 December 2014 01:46 AM, Ashwin Chaugule wrote:
>>
>> If a wait_for_completion_timeout() call returns due to a timeout,
>> the mbox code can still call complete() after returning from the wait.
>> This can cause subsequent transmissions on a channel to fail, since
>> the wait_for_completion_timeout() sees the completion variable
>> is !=0, caused by the erroneous complete() call, and immediately
>> returns without waiting for the time as expected by the client.
>>
>> Fix this by calling complete() only if the TX was successful.
>>
>> Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
>> ---
>>   drivers/mailbox/mailbox.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
>> index 17e9e4a..4acaddb 100644
>> --- a/drivers/mailbox/mailbox.c
>> +++ b/drivers/mailbox/mailbox.c
>> @@ -101,7 +101,7 @@ static void tx_tick(struct mbox_chan *chan, int r)
>>         if (mssg && chan->cl->tx_done)
>>                 chan->cl->tx_done(chan->cl, mssg, r);
>>
>> -       if (chan->cl->tx_block)
>> +       if ((!r) && chan->cl->tx_block)
>>                 complete(&chan->tx_complete);
>
>
> Just curious to check if there's another possible race which is
> a different issue.
>
> Suppose the timer fired and indicated that the Tx is complete, then
> it tries to execute complete while the wait_for_completion_timeout timed
> out. Does that make sense ?
>
> So if yes, how about adding !completion_done(..) to the check while you
> are at this ?

Yea. Seems like another race condition. I'll add it along with this..

Thanks,
Ashwin

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] Mailbox: Complete wait event only if Tx was successful
  2014-12-12 10:21 ` Jassi Brar
@ 2014-12-12 17:58   ` Ashwin Chaugule
  0 siblings, 0 replies; 7+ messages in thread
From: Ashwin Chaugule @ 2014-12-12 17:58 UTC (permalink / raw)
  To: linux-arm-kernel

On 12 December 2014 at 05:21, Jassi Brar <jaswinder.singh@linaro.org> wrote:
> On 11 December 2014 at 01:46, Ashwin Chaugule
> <ashwin.chaugule@linaro.org> wrote:
>> If a wait_for_completion_timeout() call returns due to a timeout,
>> the mbox code can still call complete() after returning from the wait.
>> This can cause subsequent transmissions on a channel to fail, since
>> the wait_for_completion_timeout() sees the completion variable
>> is !=0, caused by the erroneous complete() call, and immediately
>> returns without waiting for the time as expected by the client.
>>
>> Fix this by calling complete() only if the TX was successful.
>>
>> Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
>> ---
>>  drivers/mailbox/mailbox.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
>> index 17e9e4a..4acaddb 100644
>> --- a/drivers/mailbox/mailbox.c
>> +++ b/drivers/mailbox/mailbox.c
>> @@ -101,7 +101,7 @@ static void tx_tick(struct mbox_chan *chan, int r)
>>         if (mssg && chan->cl->tx_done)
>>                 chan->cl->tx_done(chan->cl, mssg, r);
>>
>> -       if (chan->cl->tx_block)
>> +       if ((!r) && chan->cl->tx_block)
>>                 complete(&chan->tx_complete);
>>
> Thanks for finding the bug.
> However the fix is flawed. complete() could also be done from
> mbox_chan_txdone() calling tx_tick(). And if the controller returned
> error, we could never pass on that error code to the user (timeout
> fires and then we will move on with -EIO).

If the controller returned error, wouldnt it be passed to the user via
chan->cl->tx_done()?
However, I agree that with this fix, it won't call complete(), when
the controller finished the Tx with an Err.

> Since we could never prevent the controller from returning -EIO as the
> error, I think we have to explicitly tell tx_tick() if it needs to
> complete() or not.

Hm. How about checking if the wait_for_completion_timeout() returned
the timeout value? If so, then return with -EIO and don't complete().

>
> -Jassi

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Linaro-acpi] [PATCH] Mailbox: Complete wait event only if Tx was successful
  2014-12-12 17:47   ` Ashwin Chaugule
@ 2014-12-16 11:36     ` Sudeep Holla
  2014-12-16 13:00       ` Ashwin Chaugule
  0 siblings, 1 reply; 7+ messages in thread
From: Sudeep Holla @ 2014-12-16 11:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Dec 12, 2014 at 05:47:26PM +0000, Ashwin Chaugule wrote:
> On 12 December 2014 at 03:43, Sudeep Holla <sudeep.holla@arm.com> wrote:
> > Hi Ashwin,
> 
> Hi,
>

Hi Ashwin,

> > On Thursday 11 December 2014 01:46 AM, Ashwin Chaugule wrote:
> >>
> >> If a wait_for_completion_timeout() call returns due to a timeout,
> >> the mbox code can still call complete() after returning from the wait.
> >> This can cause subsequent transmissions on a channel to fail, since
> >> the wait_for_completion_timeout() sees the completion variable
> >> is !=0, caused by the erroneous complete() call, and immediately
> >> returns without waiting for the time as expected by the client.
> >>
> >> Fix this by calling complete() only if the TX was successful.
> >>
> >> Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
> >> ---
> >>   drivers/mailbox/mailbox.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
> >> index 17e9e4a..4acaddb 100644
> >> --- a/drivers/mailbox/mailbox.c
> >> +++ b/drivers/mailbox/mailbox.c
> >> @@ -101,7 +101,7 @@ static void tx_tick(struct mbox_chan *chan, int r)
> >>         if (mssg && chan->cl->tx_done)
> >>                 chan->cl->tx_done(chan->cl, mssg, r);
> >>
> >> -       if (chan->cl->tx_block)
> >> +       if ((!r) && chan->cl->tx_block)
> >>                 complete(&chan->tx_complete);
> >
> >
> > Just curious to check if there's another possible race which is
> > a different issue.
> >
> > Suppose the timer fired and indicated that the Tx is complete, then
> > it tries to execute complete while the wait_for_completion_timeout timed
> > out. Does that make sense ?
> >
> > So if yes, how about adding !completion_done(..) to the check while you
> > are at this ?
> 
> Yea. Seems like another race condition. I'll add it along with this..
> 

Thanks !

Regards,
Sudeep

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Linaro-acpi] [PATCH] Mailbox: Complete wait event only if Tx was successful
  2014-12-16 11:36     ` Sudeep Holla
@ 2014-12-16 13:00       ` Ashwin Chaugule
  0 siblings, 0 replies; 7+ messages in thread
From: Ashwin Chaugule @ 2014-12-16 13:00 UTC (permalink / raw)
  To: linux-arm-kernel

On 16 December 2014 at 06:36, Sudeep Holla <sudeep.holla@arm.com> wrote:
> On Fri, Dec 12, 2014 at 05:47:26PM +0000, Ashwin Chaugule wrote:
>> On 12 December 2014 at 03:43, Sudeep Holla <sudeep.holla@arm.com> wrote:
>> > On Thursday 11 December 2014 01:46 AM, Ashwin Chaugule wrote:
>> >>
>> >> If a wait_for_completion_timeout() call returns due to a timeout,
>> >> the mbox code can still call complete() after returning from the wait.
>> >> This can cause subsequent transmissions on a channel to fail, since
>> >> the wait_for_completion_timeout() sees the completion variable
>> >> is !=0, caused by the erroneous complete() call, and immediately
>> >> returns without waiting for the time as expected by the client.
>> >>
>> >> Fix this by calling complete() only if the TX was successful.
>> >>
>> >> Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
>> >> ---
>> >>   drivers/mailbox/mailbox.c | 2 +-
>> >>   1 file changed, 1 insertion(+), 1 deletion(-)
>> >>
>> >> diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
>> >> index 17e9e4a..4acaddb 100644
>> >> --- a/drivers/mailbox/mailbox.c
>> >> +++ b/drivers/mailbox/mailbox.c
>> >> @@ -101,7 +101,7 @@ static void tx_tick(struct mbox_chan *chan, int r)
>> >>         if (mssg && chan->cl->tx_done)
>> >>                 chan->cl->tx_done(chan->cl, mssg, r);
>> >>
>> >> -       if (chan->cl->tx_block)
>> >> +       if ((!r) && chan->cl->tx_block)
>> >>                 complete(&chan->tx_complete);
>> >
>> >
>> > Just curious to check if there's another possible race which is
>> > a different issue.
>> >
>> > Suppose the timer fired and indicated that the Tx is complete, then
>> > it tries to execute complete while the wait_for_completion_timeout timed
>> > out. Does that make sense ?
>> >
>> > So if yes, how about adding !completion_done(..) to the check while you
>> > are at this ?
>>
>> Yea. Seems like another race condition. I'll add it along with this..
>>
>
> Thanks !

IIUC, it looks like adding the !completion_done() will not really fix
this race. Once the lock inside wait_for_completion.. is released,
completion_done() will return 0, and we'll call complete(), which is
not what we want, since the "wait" is already over (after a timeout).
I think the only right thing here is to increase the timeout in
wait_for_completion_timeout(). Thoughts?

Cheers,
Ashwin

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2014-12-16 13:00 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-12-10 20:16 [PATCH] Mailbox: Complete wait event only if Tx was successful Ashwin Chaugule
2014-12-12  8:43 ` [Linaro-acpi] " Sudeep Holla
2014-12-12 17:47   ` Ashwin Chaugule
2014-12-16 11:36     ` Sudeep Holla
2014-12-16 13:00       ` Ashwin Chaugule
2014-12-12 10:21 ` Jassi Brar
2014-12-12 17:58   ` Ashwin Chaugule

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).