All of lore.kernel.org
 help / color / mirror / Atom feed
* usb_wwan_write() called while device still being resumed
@ 2013-02-14 10:31 Alex Courbot
  2013-02-14 17:41   ` Bjørn Mork
  0 siblings, 1 reply; 8+ messages in thread
From: Alex Courbot @ 2013-02-14 10:31 UTC (permalink / raw)
  To: USB list, linux-pm, linux-kernel; +Cc: Alexandre Courbot

Hi everyone,

I have this pretty weird issue on Android 3.1 kernel and would really 
appreciate some insight that would allow me to figure it out. Could not 
find any reference to a similar problem so I am seeking your advice.

The board features a USB GSM modem using the usb_wwan module. Once in a 
while, when the system resumes from LP0, a NET_RX softirq will be 
triggered while the modem is still being resumed, calling 
usb_wwan_write(). This will cause usb_autopm_get_interface_async() 
inside usb_wwan_write() to fail because the runtime PM disable_depth is 
still equal to 1 since device_resume() has not completed yet. As a 
result, the modem just stops working.

The resume sequence is the device would look like this:

1) device_resume() is called
2) usb_resume() is called
3) usb_resume_both() resumes the device and its 6 serial interfaces (all 
using usb_wwan)
4) device_resume() completes, setting the device as resumed, 
disable_depth is 0

The problem occurs during 3) when, after 1 or 2 of the device's 
interfaces have been resumed (and usb_wwan_resume() called on them), an 
NET_RX softirq is fired that ends up calling usb_wwan_write(). Here is 
the trace when this happens:

[  245.833459] [<c0373040>] (usb_wwan_write+0x3cc/0x400) from 
[<c036e2e8>] (serial_write+0x6c/0xa4)
[  245.842452] [<c036e2e8>] (serial_write+0x6c/0xa4) from [<c0334838>] 
(ppp_async_push+0x150/0x1c0)
[  245.851440] [<c0334838>] (ppp_async_push+0x150/0x1c0) from 
[<c03348cc>] (ppp_async_send+0x24/0x58)
[  245.860516] [<c03348cc>] (ppp_async_send+0x24/0x58) from [<c032fd30>] 
(ppp_push+0x6c/0xb4)
[  245.868981] [<c032fd30>] (ppp_push+0x6c/0xb4) from [<c0331564>] 
(ppp_send_frame+0x384/0x470)
[  245.877601] [<c0331564>] (ppp_send_frame+0x384/0x470) from 
[<c03316ac>] (ppp_xmit_process+0x5c/0xc4)
[  245.886933] [<c03316ac>] (ppp_xmit_process+0x5c/0xc4) from 
[<c0331868>] (ppp_start_xmit+0x154/0x1c4)
[  245.896181] [<c0331868>] (ppp_start_xmit+0x154/0x1c4) from 
[<c04c9640>] (dev_hard_start_xmit+0x288/0x5c4)
[  245.905950] [<c04c9640>] (dev_hard_start_xmit+0x288/0x5c4) from 
[<c04e111c>] (sch_direct_xmit+0xc0/0x1ec)
[  245.915714] [<c04e111c>] (sch_direct_xmit+0xc0/0x1ec) from 
[<c04c9b54>] (dev_queue_xmit+0x1d8/0x4d8)
[  245.925046] [<c04c9b54>] (dev_queue_xmit+0x1d8/0x4d8) from 
[<c04d1f8c>] (neigh_direct_output+0x1c/0x20)
[  245.934644] [<c04d1f8c>] (neigh_direct_output+0x1c/0x20) from 
[<c0524424>] (ip_finish_output.part.9+0x14c/0x350)
[  245.945019] [<c0524424>] (ip_finish_output.part.9+0x14c/0x350) from 
[<c052466c>] (ip_finish_output+0x44/0x48)
[  245.955136] [<c052466c>] (ip_finish_output+0x44/0x48) from 
[<c052509c>] (ip_output+0x100/0x118)
[  245.963951] [<c052509c>] (ip_output+0x100/0x118) from [<c0524784>] 
(ip_local_out+0x38/0x3c)
[  245.972506] [<c0524784>] (ip_local_out+0x38/0x3c) from [<c05248a4>] 
(ip_queue_xmit+0x11c/0x398)
[  245.981414] [<c05248a4>] (ip_queue_xmit+0x11c/0x398) from 
[<c053b928>] (tcp_transmit_skb+0x31c/0x50c)
[  245.990836] [<c053b928>] (tcp_transmit_skb+0x31c/0x50c) from 
[<c053d27c>] (tcp_send_ack+0xd8/0x114)
[  246.000081] [<c053d27c>] (tcp_send_ack+0xd8/0x114) from [<c0532ec4>] 
(__tcp_ack_snd_check+0x78/0xa4)
[  246.009330] [<c0532ec4>] (__tcp_ack_snd_check+0x78/0xa4) from 
[<c0538ddc>] (tcp_rcv_established+0x184/0x6e8)
[  246.019355] [<c0538ddc>] (tcp_rcv_established+0x184/0x6e8) from 
[<c05400a8>] (tcp_v4_do_rcv+0xf8/0x21c)
[  246.028948] [<c05400a8>] (tcp_v4_do_rcv+0xf8/0x21c) from [<c05421ac>] 
(tcp_v4_rcv+0x70c/0x838)
[  246.037740] [<c05421ac>] (tcp_v4_rcv+0x70c/0x838) from [<c051f534>] 
(ip_local_deliver_finish+0x110/0x2fc)
[  246.047504] [<c051f534>] (ip_local_deliver_finish+0x110/0x2fc) from 
[<c051f8e8>] (ip_local_deliver+0xb8/0xc4)
[  246.057618] [<c051f8e8>] (ip_local_deliver+0xb8/0xc4) from 
[<c051f10c>] (ip_rcv_finish+0x158/0x470)
[  246.066780] [<c051f10c>] (ip_rcv_finish+0x158/0x470) from 
[<c051fbf8>] (ip_rcv+0x304/0x3f4)
[  246.075337] [<c051fbf8>] (ip_rcv+0x304/0x3f4) from [<c04c40a0>] 
(__netif_receive_skb+0x2a0/0x514)
[  246.084410] [<c04c40a0>] (__netif_receive_skb+0x2a0/0x514) from 
[<c04c43bc>] (process_backlog+0xa8/0x178)
[  246.094178] [<c04c43bc>] (process_backlog+0xa8/0x178) from 
[<c04c8308>] (net_rx_action+0xd0/0x2b8)
[  246.103336] [<c04c8308>] (net_rx_action+0xd0/0x2b8) from [<c00751d4>] 
(__do_softirq+0xdc/0x278)
[  246.112150] [<c00751d4>] (__do_softirq+0xdc/0x278) from [<c00755bc>] 
(do_softirq+0x60/0x6c)
[  246.120699] [<c00755bc>] (do_softirq+0x60/0x6c) from [<c0075690>] 
(local_bh_enable_ip+0xc8/0xd4)
[  246.129685] [<c0075690>] (local_bh_enable_ip+0xc8/0xd4) from 
[<c061a984>] (_raw_write_unlock_bh+0x48/0x4c)
[  246.139546] [<c061a984>] (_raw_write_unlock_bh+0x48/0x4c) from 
[<c04d4b2c>] (neigh_periodic_work+0x15c/0x1fc)
[  246.149670] [<c04d4b2c>] (neigh_periodic_work+0x15c/0x1fc) from 
[<c008a68c>] (process_one_work+0x174/0x52c)
[  246.159525] [<c008a68c>] (process_one_work+0x174/0x52c) from 
[<c008ab90>] (worker_thread+0x14c/0x3a8)
[  246.168945] [<c008ab90>] (worker_thread+0x14c/0x3a8) from 
[<c008f708>] (kthread+0xac/0xb4)
[  246.177396] [<c008f708>] (kthread+0xac/0xb4) from [<c000fd4c>] 
(kernel_thread_exit+0x0/0x8)

... and then usb_autopm_get_interface_async() fails because 
disable_depth is > 0.

The first interface has an interrupt URB, which is submitted when 
usb_wwan_resume() is called. I suspect it fetches data immediatly after 
and triggers the softirq. From the stack trace it seems that the issue 
happens when trying to send the ACK for received data.

That's unfortunately where my understanding of networking, PM, and USB 
stops. I would really appreciate if someone could explain why this 
interface is processing data even though the device is not completely 
resumed, and how I could avoid this. I have checked the latest upstream 
code for usb_wwan.c and the behavior does not seem to have changed much.

Thanks for any hint,
Alex.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: usb_wwan_write() called while device still being resumed
  2013-02-14 10:31 usb_wwan_write() called while device still being resumed Alex Courbot
@ 2013-02-14 17:41   ` Bjørn Mork
  0 siblings, 0 replies; 8+ messages in thread
From: Bjørn Mork @ 2013-02-14 17:41 UTC (permalink / raw)
  To: Alex Courbot; +Cc: USB list, linux-pm, linux-kernel, Alexandre Courbot

Alex Courbot <acourbot@nvidia.com> writes:

> The board features a USB GSM modem using the usb_wwan module. Once in
> a while, when the system resumes from LP0, a NET_RX softirq will be
> triggered while the modem is still being resumed, calling
> usb_wwan_write(). This will cause usb_autopm_get_interface_async()
> inside usb_wwan_write() to fail because the runtime PM disable_depth
> is still equal to 1 since device_resume() has not completed yet. As a
> result, the modem just stops working.

I believe the usb_autopm_get_interface_async() failing is OK in this
case, but that should not cause the modem to stop working.

Wonder if this patch solves the problem? :

From: =?UTF-8?q?Bj=C3=B8rn=20Mork?= <bjorn@mork.no>
Date: Thu, 14 Feb 2013 18:34:48 +0100
Subject: [PATCH] USB: usb_wwan: clear port busy state on error
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Reported-by: Alex Courbot <acourbot@nvidia.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
---
 drivers/usb/serial/usb_wwan.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/serial/usb_wwan.c b/drivers/usb/serial/usb_wwan.c
index 01c94aa..44e106d 100644
--- a/drivers/usb/serial/usb_wwan.c
+++ b/drivers/usb/serial/usb_wwan.c
@@ -230,8 +230,10 @@ int usb_wwan_write(struct tty_struct *tty, struct usb_serial_port *port,
 			usb_pipeendpoint(this_urb->pipe), i);
 
 		err = usb_autopm_get_interface_async(port->serial->interface);
-		if (err < 0)
+		if (err < 0) {
+			clear_bit(i, &portdata->out_busy);
 			break;
+		}
 
 		/* send the data */
 		memcpy(this_urb->transfer_buffer, buf, todo);
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: usb_wwan_write() called while device still being resumed
@ 2013-02-14 17:41   ` Bjørn Mork
  0 siblings, 0 replies; 8+ messages in thread
From: Bjørn Mork @ 2013-02-14 17:41 UTC (permalink / raw)
  To: Alex Courbot; +Cc: USB list, linux-pm, linux-kernel, Alexandre Courbot

Alex Courbot <acourbot@nvidia.com> writes:

> The board features a USB GSM modem using the usb_wwan module. Once in
> a while, when the system resumes from LP0, a NET_RX softirq will be
> triggered while the modem is still being resumed, calling
> usb_wwan_write(). This will cause usb_autopm_get_interface_async()
> inside usb_wwan_write() to fail because the runtime PM disable_depth
> is still equal to 1 since device_resume() has not completed yet. As a
> result, the modem just stops working.

I believe the usb_autopm_get_interface_async() failing is OK in this
case, but that should not cause the modem to stop working.

Wonder if this patch solves the problem? :

From: =?UTF-8?q?Bj=C3=B8rn=20Mork?= <bjorn@mork.no>
Date: Thu, 14 Feb 2013 18:34:48 +0100
Subject: [PATCH] USB: usb_wwan: clear port busy state on error
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Reported-by: Alex Courbot <acourbot@nvidia.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
---
 drivers/usb/serial/usb_wwan.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/serial/usb_wwan.c b/drivers/usb/serial/usb_wwan.c
index 01c94aa..44e106d 100644
--- a/drivers/usb/serial/usb_wwan.c
+++ b/drivers/usb/serial/usb_wwan.c
@@ -230,8 +230,10 @@ int usb_wwan_write(struct tty_struct *tty, struct usb_serial_port *port,
 			usb_pipeendpoint(this_urb->pipe), i);
 
 		err = usb_autopm_get_interface_async(port->serial->interface);
-		if (err < 0)
+		if (err < 0) {
+			clear_bit(i, &portdata->out_busy);
 			break;
+		}
 
 		/* send the data */
 		memcpy(this_urb->transfer_buffer, buf, todo);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: usb_wwan_write() called while device still being resumed
  2013-02-14 17:41   ` Bjørn Mork
  (?)
@ 2013-02-15  8:23   ` Alex Courbot
  2013-02-15 11:05       ` Bjørn Mork
  -1 siblings, 1 reply; 8+ messages in thread
From: Alex Courbot @ 2013-02-15  8:23 UTC (permalink / raw)
  To: Bjørn Mork; +Cc: USB list, linux-pm, linux-kernel, Alexandre Courbot

Hi Bjørn, thanks for the reply!

On 02/15/2013 02:41 AM, Bjørn Mork wrote:
> I believe the usb_autopm_get_interface_async() failing is OK in this
> case, but that should not cause the modem to stop working.
>
> Wonder if this patch solves the problem? :
>
> From: =?UTF-8?q?Bj=C3=B8rn=20Mork?= <bjorn@mork.no>
> Date: Thu, 14 Feb 2013 18:34:48 +0100
> Subject: [PATCH] USB: usb_wwan: clear port busy state on error
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> Reported-by: Alex Courbot <acourbot@nvidia.com>
> Signed-off-by: Bjørn Mork <bjorn@mork.no>
> ---
>   drivers/usb/serial/usb_wwan.c |    4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/usb/serial/usb_wwan.c b/drivers/usb/serial/usb_wwan.c
> index 01c94aa..44e106d 100644
> --- a/drivers/usb/serial/usb_wwan.c
> +++ b/drivers/usb/serial/usb_wwan.c
> @@ -230,8 +230,10 @@ int usb_wwan_write(struct tty_struct *tty, struct usb_serial_port *port,
>   			usb_pipeendpoint(this_urb->pipe), i);
>
>   		err = usb_autopm_get_interface_async(port->serial->interface);
> -		if (err < 0)
> +		if (err < 0) {
> +			clear_bit(i, &portdata->out_busy);
>   			break;
> +		}
>
>   		/* send the data */
>   		memcpy(this_urb->transfer_buffer, buf, todo);
>

Unfortunately it does not, and fails the same way. On the other hand, I 
do not see the issue when doing the following:

diff --git a/drivers/usb/serial/usb_wwan.c b/drivers/usb/serial/usb_wwan.c
index e4fad5e..1490029 100644
--- a/drivers/usb/serial/usb_wwan.c
+++ b/drivers/usb/serial/usb_wwan.c
@@ -238,8 +238,6 @@ int usb_wwan_write(struct tty_struct *tty, struct 
usb_serial_port *port,
                     usb_pipeendpoint(this_urb->pipe), i);

                 err = 
usb_autopm_get_interface_async(port->serial->interface);
-               if (err < 0)
-                       break;

                 /* send the data */
                 memcpy(this_urb->transfer_buffer, buf, todo);

After doing this I don't see this issue anymore. It looks wrong though. 
But it seems to work despite the obvious unbalance in autopm calls that 
results.

If I understand you correctly, usb_wwan_write() failing here is not a 
problem in itself, and the ack should just be sent again later?

 > that should not cause the modem to stop working.

Actually it might also be that the network stack ends up in a bad state 
and remains stuck in it. I don't think the modem by itself is affected. 
All I observe is that no network traffic takes place after this. I'm not 
familiar enough with networking to make any stronger assumption.

FWIW, when usb_autopm_get_interface_async() returns -EACCES, the power 
parameters of port->serial->interface->dev are as follows:

disable_depth = 1
is_suspended = 1
runtime_status = 2 (RPM_SUSPENDED)

Thanks,
Alex.


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: usb_wwan_write() called while device still being resumed
@ 2013-02-15 11:05       ` Bjørn Mork
  0 siblings, 0 replies; 8+ messages in thread
From: Bjørn Mork @ 2013-02-15 11:05 UTC (permalink / raw)
  To: Alex Courbot; +Cc: USB list, linux-pm, linux-kernel, Alexandre Courbot

Alex Courbot <acourbot@nvidia.com> writes:

> Unfortunately it does not, and fails the same way. On the other hand,
> I do not see the issue when doing the following:
>
> diff --git a/drivers/usb/serial/usb_wwan.c b/drivers/usb/serial/usb_wwan.c
> index e4fad5e..1490029 100644
> --- a/drivers/usb/serial/usb_wwan.c
> +++ b/drivers/usb/serial/usb_wwan.c
> @@ -238,8 +238,6 @@ int usb_wwan_write(struct tty_struct *tty, struct
> usb_serial_port *port,
>                     usb_pipeendpoint(this_urb->pipe), i);
>
>                 err =
> usb_autopm_get_interface_async(port->serial->interface);
> -               if (err < 0)
> -                       break;
>
>                 /* send the data */
>                 memcpy(this_urb->transfer_buffer, buf, todo);
>
> After doing this I don't see this issue anymore. It looks wrong
> though. But it seems to work despite the obvious unbalance in autopm
> calls that results.
>
> If I understand you correctly, usb_wwan_write() failing here is not a
> problem in itself, and the ack should just be sent again later?

That was what I thought looking (obviously too) briefly through this.

Most errors from usb_autopm_get_interface_async will be translated to
EIO before being returned by serial_write.  I believe the userspace
application should deal with that.  But maybe it just gives up?  Should
we return EAGAIN or something instead?

I don't know.  I am pretty clueless about these things...

But looking again, trying to guess why it works fine if you just ignore
the error. I believe that is because you then end up hitting this until
the interface is fully resumed:

		if (intfdata->suspended) {
			usb_anchor_urb(this_urb, &portdata->delayed);
			spin_unlock_irqrestore(&intfdata->susp_lock, flags);
                }

>> that should not cause the modem to stop working.
>
> Actually it might also be that the network stack ends up in a bad
> state and remains stuck in it. I don't think the modem by itself is
> affected. All I observe is that no network traffic takes place after
> this. I'm not familiar enough with networking to make any stronger
> assumption.

> FWIW, when usb_autopm_get_interface_async() returns -EACCES, the power
> parameters of port->serial->interface->dev are as follows:
>
> disable_depth = 1
> is_suspended = 1
> runtime_status = 2 (RPM_SUSPENDED)

Yes, that makes pm_runtime_get() return -EACCES.

I am way out of my league here, but I wonder if pm_runtime_get()
shouldn't return -EINPROGRESS instead if there is a queued resume
request or an ongoing resume, regardless of disable_depth?

Maybe something like the completely untested:

diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 3148b10..38e19ba 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -512,6 +512,9 @@ static int rpm_resume(struct device *dev, int rpmflags)
 	else if (dev->power.disable_depth == 1 && dev->power.is_suspended
 	    && dev->power.runtime_status == RPM_ACTIVE)
 		retval = 1;
+	else if (rpmflags & RPM_ASYNC && dev->power.request_pending &&
+		 dev->power.request == RPM_REQ_RESUME)
+		retval = -EINPROGRESS;
 	else if (dev->power.disable_depth > 0)
 		retval = -EACCES;
 	if (retval)
---
usb_autopm_get_interface_async() will interprete EINPROGRESS as success,
so that would prevent this problem.


Bjørn

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: usb_wwan_write() called while device still being resumed
@ 2013-02-15 11:05       ` Bjørn Mork
  0 siblings, 0 replies; 8+ messages in thread
From: Bjørn Mork @ 2013-02-15 11:05 UTC (permalink / raw)
  To: Alex Courbot
  Cc: USB list, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Alexandre Courbot

Alex Courbot <acourbot-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org> writes:

> Unfortunately it does not, and fails the same way. On the other hand,
> I do not see the issue when doing the following:
>
> diff --git a/drivers/usb/serial/usb_wwan.c b/drivers/usb/serial/usb_wwan.c
> index e4fad5e..1490029 100644
> --- a/drivers/usb/serial/usb_wwan.c
> +++ b/drivers/usb/serial/usb_wwan.c
> @@ -238,8 +238,6 @@ int usb_wwan_write(struct tty_struct *tty, struct
> usb_serial_port *port,
>                     usb_pipeendpoint(this_urb->pipe), i);
>
>                 err =
> usb_autopm_get_interface_async(port->serial->interface);
> -               if (err < 0)
> -                       break;
>
>                 /* send the data */
>                 memcpy(this_urb->transfer_buffer, buf, todo);
>
> After doing this I don't see this issue anymore. It looks wrong
> though. But it seems to work despite the obvious unbalance in autopm
> calls that results.
>
> If I understand you correctly, usb_wwan_write() failing here is not a
> problem in itself, and the ack should just be sent again later?

That was what I thought looking (obviously too) briefly through this.

Most errors from usb_autopm_get_interface_async will be translated to
EIO before being returned by serial_write.  I believe the userspace
application should deal with that.  But maybe it just gives up?  Should
we return EAGAIN or something instead?

I don't know.  I am pretty clueless about these things...

But looking again, trying to guess why it works fine if you just ignore
the error. I believe that is because you then end up hitting this until
the interface is fully resumed:

		if (intfdata->suspended) {
			usb_anchor_urb(this_urb, &portdata->delayed);
			spin_unlock_irqrestore(&intfdata->susp_lock, flags);
                }

>> that should not cause the modem to stop working.
>
> Actually it might also be that the network stack ends up in a bad
> state and remains stuck in it. I don't think the modem by itself is
> affected. All I observe is that no network traffic takes place after
> this. I'm not familiar enough with networking to make any stronger
> assumption.

> FWIW, when usb_autopm_get_interface_async() returns -EACCES, the power
> parameters of port->serial->interface->dev are as follows:
>
> disable_depth = 1
> is_suspended = 1
> runtime_status = 2 (RPM_SUSPENDED)

Yes, that makes pm_runtime_get() return -EACCES.

I am way out of my league here, but I wonder if pm_runtime_get()
shouldn't return -EINPROGRESS instead if there is a queued resume
request or an ongoing resume, regardless of disable_depth?

Maybe something like the completely untested:

diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 3148b10..38e19ba 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -512,6 +512,9 @@ static int rpm_resume(struct device *dev, int rpmflags)
 	else if (dev->power.disable_depth == 1 && dev->power.is_suspended
 	    && dev->power.runtime_status == RPM_ACTIVE)
 		retval = 1;
+	else if (rpmflags & RPM_ASYNC && dev->power.request_pending &&
+		 dev->power.request == RPM_REQ_RESUME)
+		retval = -EINPROGRESS;
 	else if (dev->power.disable_depth > 0)
 		retval = -EACCES;
 	if (retval)
---
usb_autopm_get_interface_async() will interprete EINPROGRESS as success,
so that would prevent this problem.


Bjørn
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: usb_wwan_write() called while device still being resumed
  2013-02-15 11:05       ` Bjørn Mork
  (?)
@ 2013-02-17 11:31       ` Alex Courbot
  -1 siblings, 0 replies; 8+ messages in thread
From: Alex Courbot @ 2013-02-17 11:31 UTC (permalink / raw)
  To: Bjørn Mork; +Cc: USB list, linux-pm, linux-kernel, Alexandre Courbot

On 02/15/2013 08:05 PM, Bjørn Mork wrote:
> Alex Courbot <acourbot@nvidia.com> writes:
>
>> Unfortunately it does not, and fails the same way. On the other hand,
>> I do not see the issue when doing the following:
>>
>> diff --git a/drivers/usb/serial/usb_wwan.c b/drivers/usb/serial/usb_wwan.c
>> index e4fad5e..1490029 100644
>> --- a/drivers/usb/serial/usb_wwan.c
>> +++ b/drivers/usb/serial/usb_wwan.c
>> @@ -238,8 +238,6 @@ int usb_wwan_write(struct tty_struct *tty, struct
>> usb_serial_port *port,
>>                      usb_pipeendpoint(this_urb->pipe), i);
>>
>>                  err =
>> usb_autopm_get_interface_async(port->serial->interface);
>> -               if (err < 0)
>> -                       break;
>>
>>                  /* send the data */
>>                  memcpy(this_urb->transfer_buffer, buf, todo);
>>
>> After doing this I don't see this issue anymore. It looks wrong
>> though. But it seems to work despite the obvious unbalance in autopm
>> calls that results.
>>
>> If I understand you correctly, usb_wwan_write() failing here is not a
>> problem in itself, and the ack should just be sent again later?
>
> That was what I thought looking (obviously too) briefly through this.
>
> Most errors from usb_autopm_get_interface_async will be translated to
> EIO before being returned by serial_write.  I believe the userspace
> application should deal with that.  But maybe it just gives up?  Should
> we return EAGAIN or something instead?
>
> I don't know.  I am pretty clueless about these things...

Obviously not as much as I am. :) Checking what userspace is doing could 
indeed be another trail.

> But looking again, trying to guess why it works fine if you just ignore
> the error. I believe that is because you then end up hitting this until
> the interface is fully resumed:
>
> 		if (intfdata->suspended) {
> 			usb_anchor_urb(this_urb, &portdata->delayed);
> 			spin_unlock_irqrestore(&intfdata->susp_lock, flags);
>                  }

Yes, this seems to be exactly what is happening.

> I am way out of my league here, but I wonder if pm_runtime_get()
> shouldn't return -EINPROGRESS instead if there is a queued resume
> request or an ongoing resume, regardless of disable_depth?
>
> Maybe something like the completely untested:
>
> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> index 3148b10..38e19ba 100644
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -512,6 +512,9 @@ static int rpm_resume(struct device *dev, int rpmflags)
>   	else if (dev->power.disable_depth == 1 && dev->power.is_suspended
>   	    && dev->power.runtime_status == RPM_ACTIVE)
>   		retval = 1;
> +	else if (rpmflags & RPM_ASYNC && dev->power.request_pending &&
> +		 dev->power.request == RPM_REQ_RESUME)
> +		retval = -EINPROGRESS;
>   	else if (dev->power.disable_depth > 0)
>   		retval = -EACCES;
>   	if (retval)
> ---
> usb_autopm_get_interface_async() will interprete EINPROGRESS as success,
> so that would prevent this problem.

That sounds sensefull indeed. If the interface is soon to be resumed, 
there should be no reason for usb_autopm_get_interface_async() to fail. 
Let's give this a try and bring the idea to the PM people if it works.

In any case thanks a lot for the help, it is extremely useful.
Alex.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: usb_wwan_write() called while device still being resumed
  2013-02-15 11:05       ` Bjørn Mork
  (?)
  (?)
@ 2013-02-18  3:20       ` Alex Courbot
  -1 siblings, 0 replies; 8+ messages in thread
From: Alex Courbot @ 2013-02-18  3:20 UTC (permalink / raw)
  To: Bjørn Mork; +Cc: USB list, linux-pm, linux-kernel, Alexandre Courbot

On 02/15/2013 08:05 PM, Bjørn Mork wrote:
> Maybe something like the completely untested:
>
> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> index 3148b10..38e19ba 100644
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -512,6 +512,9 @@ static int rpm_resume(struct device *dev, int rpmflags)
>   	else if (dev->power.disable_depth == 1 && dev->power.is_suspended
>   	    && dev->power.runtime_status == RPM_ACTIVE)
>   		retval = 1;
> +	else if (rpmflags & RPM_ASYNC && dev->power.request_pending &&
> +		 dev->power.request == RPM_REQ_RESUME)
> +		retval = -EINPROGRESS;
>   	else if (dev->power.disable_depth > 0)
>   		retval = -EACCES;
>   	if (retval)

Second thought: not sure this will work since power.request_pending and 
power.request are set to these values later in the same rpm_resume() 
function. However, the three lines before yours caught my attention. 
They are not in my 3.1 source tree and the conditions are very close 
from the ones I am seeing when the issue happens: disable_depth == 1, 
is_suspended == 1. Only runtime_status is not equal to RPM_ACTIVE.

Nonetheless, I have looked at the patch that introduced these 
(http://pastebin.com/PmHUjiAE ) and it details a problem that is very 
similar to mine. It also mentions a workaround to be implemented in the 
driver by saving the suspend status into a variable that is checked when 
pm_runtime_get() return -EACCES. This variable already exists in 
usb_wwan, actually it is the very variable that is checked a bit later 
in that other chunk of code you mentioned:

spin_lock_irqsave(&intfdata->susp_lock, flags);
if (intfdata->suspended) {
	usb_anchor_urb(this_urb, &portdata->delayed);
	spin_unlock_irqrestore(&intfdata->susp_lock, flags);
} else {

So it looks like this code is here for exactly that purpose. However in 
my current condition I do not see how this block could be run since 
runtime PM is disabled when intfdata->suspended is set to true.

I will try implementing the workaround suggested (checking if 
intfdata->suspended is set when -EACCES is returned and go on if it is 
the case) and see if it works (and I bet it will). In the upstream 
kernel the issue seems to have been addressed already, so this might 
just be a source-out-of-date issue.

Thanks,
Alex.


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2013-02-18  3:20 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-14 10:31 usb_wwan_write() called while device still being resumed Alex Courbot
2013-02-14 17:41 ` Bjørn Mork
2013-02-14 17:41   ` Bjørn Mork
2013-02-15  8:23   ` Alex Courbot
2013-02-15 11:05     ` Bjørn Mork
2013-02-15 11:05       ` Bjørn Mork
2013-02-17 11:31       ` Alex Courbot
2013-02-18  3:20       ` Alex Courbot

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.