linux-media.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sean Young <sean@mess.org>
To: A Sun <as1033x@comcast.net>
Cc: linux-media@vger.kernel.org, Mauro Carvalho Chehab <mchehab@kernel.org>
Subject: Re: [PATCH v1 3/3] [media] mceusb: Show USB halt/stall error recovery
Date: Sat, 8 Jun 2019 09:37:29 +0100	[thread overview]
Message-ID: <20190608083729.bw47vkplpf3r4e4b@gofer.mess.org> (raw)
In-Reply-To: <2548e827-1d11-4ce2-013f-bf36c9f5436e@comcast.net>

Hi,

On Thu, Jun 06, 2019 at 05:11:04PM -0400, A Sun wrote:
> Hi Sean,
> 
> Thanks again for reviewing my patch submission.
> 
> This patch updates mceusb RX and TX HALT error handling and recovery reporting,
> and provides placeholder for a later patch.
> 
> The possible later patch would have mceusb call usb_reset_device() in place of the
> new "... requires USB Reset ..." message. I say "possible" because I'm running into
> multiple issues invoking usb_reset_device() from within mceusb.
> 
> I'll also mention the difficulty obtaining even a single fault replication
> (~ 1/month) and test cycle.

OK, thank you.

> Additional comments below...  ..A Sun
> 
> On 6/6/2019 5:53 AM, Sean Young wrote:
> > On Sat, Jun 01, 2019 at 07:35:09PM -0400, A Sun wrote:

-snip-

 >> Additional findings are "modprobe -r mceusb" and "modprobe mceusb"
> >> are unable to clear stuck RX or TX HALT state.
> >> Attempts to call usb_reset_endpoint() and usb_reset_configuration()
> >> from within the mceusb driver also did not clear stuck HALTs.
> >>
> >> A possible future patch could have the mceusb call usb_reset_device()
> >> on itself, when necessary. Limiting the number of HALT error recovery
> >> attempts may also be necessary to prevent halt/clear-halt loops.
> >>
> >> Unresolved for now, deadlock complications occur if mceusb's worker
> >> thread, mceusb_deferred_kevent(), calls usb_reset_device(),
> >> which calls mceusb_dev_disconnect(), which calls cancel_work_sync(),
> >> which waits on the still active worker thread.
> > 
> > I think you can call usb_lock_device_for_reset() to prevent that problem.
> 
> Deadlock still occurs in test:
> mceusb_deferred_kevent()
>     ->usb_reset_device()
>         ->mceusb_dev_disconnect()
>             ->cancel_work_sync()
> where cancel_work_sync() blocks because mceusb_deferred_kevent() is active.

I understand. The usb_reset_device() does not need to happen synchronously
in mceusb_deferred_kevent(). Possibly another delayed workqueue.

Actually there is also a function usb_queue_reset_device() which might do
what you want here.

> 
> > It would be nice to see this implemented too.
> > 
> 
> Any additional tips to implement would help!
> How to validate and test a rare condition with a larger audience?

This is hard. Do you know the model of the mceusb and host hardware?

> >>
> >> Signed-off-by: A Sun <as1033x@comcast.net>
> >> ---
> >>  drivers/media/rc/mceusb.c | 35 +++++++++++++++++++++++++----------
> >>  1 file changed, 25 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/drivers/media/rc/mceusb.c b/drivers/media/rc/mceusb.c
> >> index efffb1795..5d81ccafc 100644
> >> --- a/drivers/media/rc/mceusb.c
> >> +++ b/drivers/media/rc/mceusb.c
> >> @@ -765,7 +765,7 @@ static void mceusb_defer_kevent(struct mceusb_dev *ir, int kevent)
> >>  {
> >>  	set_bit(kevent, &ir->kevent_flags);
> >>  	if (!schedule_work(&ir->kevent))
> >> -		dev_err(ir->dev, "kevent %d may have been dropped", kevent);
> >> +		dev_dbg(ir->dev, "kevent %d already scheduled", kevent);
> >>  	else
> >>  		dev_dbg(ir->dev, "kevent %d scheduled", kevent);
> >>  }
> >> @@ -1404,19 +1404,26 @@ static void mceusb_deferred_kevent(struct work_struct *work)
> >>  		container_of(work, struct mceusb_dev, kevent);
> >>  	int status;
> >>  
> >> +	dev_info(ir->dev, "kevent handler called (flags 0x%lx)",
> >> +		 ir->kevent_flags);
> >> +
> >>  	if (test_bit(EVENT_RX_HALT, &ir->kevent_flags)) {
> >>  		usb_unlink_urb(ir->urb_in);
> >>  		status = usb_clear_halt(ir->usbdev, ir->pipe_in);
> >> +		dev_err(ir->dev, "rx clear halt status = %d", status);
> >>  		if (status < 0) {
> >> -			dev_err(ir->dev, "rx clear halt error %d",
> >> -				status);
> >> -		}
> >> -		clear_bit(EVENT_RX_HALT, &ir->kevent_flags);
> >> -		if (status == 0) {
> >> +			/*
> >> +			 * Unable to clear RX stall/halt.
> >> +			 * Will need to call usb_reset_device().
> >> +			 */
> >> +			dev_err(ir->dev,
> >> +				"stuck RX HALT state requires USB Reset Device to clear");
> > 
> > Here you say if the usb_clear_halt() returns < 0 then we're in the bad
> > code path. But in this code path, we're not re-submitting the urb so
> > we wouldn't end up in an infinite loop.
> > 
> 
> The theory for the infinite loop case is usb_clear_halt() returned 0.
> The infinite loop isn't resolved with this patch.
> 
> > 
> >> +		} else {
> >> +			clear_bit(EVENT_RX_HALT, &ir->kevent_flags);
> > 
> > Why have you moved this line?
> > 
> 
> Clear bit but only if usb_clear_halt() recovery succeeds.

To what affect? Will it be attempted again? Might it cause an infinite
loop?
> 
> A future usb_device_reset() operation affects both RX and TX and its
> success code path must clear both EVENT_RX_HALT and EVENT_TX_HALT
> and do its own usb_submit_urb() with message "rx reset device submit urb status = %d"
> 
> Hence, the moved line.

That's in a future patch. Please only change error strings in this patch.

Thanks,

Sean

  reply	other threads:[~2019-06-08  8:37 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <999ae5cd-d72b-983f-2f96-5aaca72e8214@comcast.net>
2019-06-01 23:34 ` [PATCH v1 1/3] [media] mceusb: Disable "nonsensical irdata" messages A Sun
2019-06-01 23:34 ` [PATCH v1 2/3] [media] mceusb: Reword messages referring to "urb" A Sun
2019-06-01 23:35 ` [PATCH v1 3/3] [media] mceusb: Show USB halt/stall error recovery A Sun
2019-06-06  9:53   ` Sean Young
2019-06-06 21:11     ` A Sun
2019-06-08  8:37       ` Sean Young [this message]
2019-06-09  0:56         ` A Sun
2019-06-19  7:53         ` [PATCH v2 0/3] [media] mceusb: Error message text revisions A Sun
2019-06-19  7:53         ` [PATCH v2 1/3] [media] mceusb: Disable "nonsensical irdata" messages A Sun
2019-06-19  7:54         ` [PATCH v2 2/3] [media] mceusb: Reword messages referring to "urb" A Sun
2019-06-25 10:51           ` Sean Young
2019-06-25 15:01             ` A Sun
2019-06-25 16:12               ` Sean Young
2019-06-25 21:29                 ` A Sun
2019-07-15 12:28                   ` Sean Young
2019-07-21 21:31                     ` A Sun
2019-08-10 12:17                       ` Sean Young
2019-06-19  7:54         ` [PATCH v2 3/3] [media] mceusb: Show USB halt/stall error recovery A Sun
2019-07-15  2:54           ` A Sun

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190608083729.bw47vkplpf3r4e4b@gofer.mess.org \
    --to=sean@mess.org \
    --cc=as1033x@comcast.net \
    --cc=linux-media@vger.kernel.org \
    --cc=mchehab@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).