* [PATCH v1 0/2] Fix ep command fail issue in dequeue
       [not found] <CGME20220214094040epcas2p3cc844d30f54793f51f16bb2b59b432e1@epcas2p3.samsung.com>
@ 2022-02-14  9:37 ` Daehwan Jung
       [not found]   ` <CGME20220214094041epcas2p2ec37c252dd5f9508454e9449c95e6c7a@epcas2p2.samsung.com>
       [not found]   ` <CGME20220214094042epcas2p118ac06692ad14f321a3fd59e57bcf1d5@epcas2p1.samsung.com>
  0 siblings, 2 replies; 7+ messages in thread
From: Daehwan Jung @ 2022-02-14  9:37 UTC (permalink / raw)
  To: Felipe Balbi, Greg Kroah-Hartman
  Cc: linux-usb, open list, Daehwan Jung, quic_wcheng, quic_jackp,
	Thinh.Nguyen

dwc3_stop_active_transfer() always sets DWC3_EP_END_TRANSFER_PENDING,
even if dwc3_send_gadget_ep_cmd() fails. This can cause problems such as
skipping the clear stall command or skipping the giveback from dequeue.
It can also cause a hung task if the ENDTRANSFER command never
completes. This looks like a HW (controller) issue, but SW can prevent it.

Daehwan Jung (2):
  usb: dwc3: Not set DWC3_EP_END_TRANSFER_PENDING in ep cmd fails
  usb: dwc3: Prevent cleanup cancelled requests at the same time.

 drivers/usb/dwc3/gadget.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

--
2.31.1



* [PATCH v1 1/2] usb: dwc3: Not set DWC3_EP_END_TRANSFER_PENDING in ep cmd fails
       [not found]   ` <CGME20220214094041epcas2p2ec37c252dd5f9508454e9449c95e6c7a@epcas2p2.samsung.com>
@ 2022-02-14  9:37     ` Daehwan Jung
  2022-02-14 18:53       ` Wesley Cheng
  0 siblings, 1 reply; 7+ messages in thread
From: Daehwan Jung @ 2022-02-14  9:37 UTC (permalink / raw)
  To: Felipe Balbi, Greg Kroah-Hartman
  Cc: linux-usb, open list, Daehwan Jung, quic_wcheng, quic_jackp,
	Thinh.Nguyen

dwc3_stop_active_transfer() always sets DWC3_EP_END_TRANSFER_PENDING,
even if dwc3_send_gadget_ep_cmd() fails. This can cause problems such as
skipping the clear stall command or skipping the giveback from dequeue.
Fix this by setting the flag only when the ep cmd succeeds. Additionally,
clear DWC3_EP_TRANSFER_STARTED so that the next TRB issues a start
transfer instead of an update transfer.

Change-Id: I2e6b58acc99f385e467e8b639a3792a5e5f4d2bb
Signed-off-by: Daehwan Jung <dh10.jung@samsung.com>
---
 drivers/usb/dwc3/gadget.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 183b90923f51..3ad3bc5813ca 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -2044,6 +2044,12 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
 				dwc3_gadget_move_cancelled_request(r,
 						DWC3_REQUEST_STATUS_DEQUEUED);
 
+			/* If ep cmd fails, then force to giveback cancelled requests here */
+			if (!(dep->flags & DWC3_EP_END_TRANSFER_PENDING)) {
+				dep->flags &= ~DWC3_EP_TRANSFER_STARTED;
+				dwc3_gadget_ep_cleanup_cancelled_requests(dep);
+			}
+
 			dep->flags &= ~DWC3_EP_WAIT_TRANSFER_COMPLETE;
 
 			goto out;
@@ -3645,7 +3651,7 @@ static void dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force,
 
 	if (!interrupt)
 		dep->flags &= ~DWC3_EP_TRANSFER_STARTED;
-	else
+	else if (!ret)
 		dep->flags |= DWC3_EP_END_TRANSFER_PENDING;
 }
 
-- 
2.31.1
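
The second hunk of patch 1/2 can be illustrated outside the kernel with a
small user-space model. This is only a sketch under stated assumptions: the
fake_* names, the flag bit values, and the int return value are hypothetical
and are not the driver's API (the real function returns void, and the real
flag values are defined in the dwc3 driver headers). It shows the pending
flag being set only when the simulated End Transfer command succeeds.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit values, chosen for this model only. */
#define EP_TRANSFER_STARTED     (1u << 0)
#define EP_END_TRANSFER_PENDING (1u << 1)

/* Hypothetical stand-in for struct dwc3_ep; only the flags are modelled. */
struct fake_ep {
	unsigned int flags;
};

/* Stand-in for dwc3_send_gadget_ep_cmd(): 0 on success, negative errno on failure. */
static int fake_send_end_transfer(bool hw_responds)
{
	return hw_responds ? 0 : -ETIMEDOUT;
}

/* Models the flag handling at the end of dwc3_stop_active_transfer() after patch 1/2. */
static int fake_stop_active_transfer(struct fake_ep *ep, bool interrupt,
				     bool hw_responds)
{
	int ret;

	assert(ep != NULL);
	ret = fake_send_end_transfer(hw_responds);

	if (!interrupt)
		ep->flags &= ~EP_TRANSFER_STARTED;
	else if (!ret)
		/* Wait for a completion event only if the command was accepted. */
		ep->flags |= EP_END_TRANSFER_PENDING;

	return ret;
}
```

In this model, a caller that sees EP_END_TRANSFER_PENDING still clear after
the call knows that no completion event should be expected, which is the
condition the dequeue hunk tests before giving back the cancelled requests.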



* [PATCH v1 2/2] usb: dwc3: Prevent cleanup cancelled requests at the same time.
       [not found]   ` <CGME20220214094042epcas2p118ac06692ad14f321a3fd59e57bcf1d5@epcas2p1.samsung.com>
@ 2022-02-14  9:37     ` Daehwan Jung
  2022-02-14 10:42       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 7+ messages in thread
From: Daehwan Jung @ 2022-02-14  9:37 UTC (permalink / raw)
  To: Felipe Balbi, Greg Kroah-Hartman
  Cc: linux-usb, open list, Daehwan Jung, quic_wcheng, quic_jackp,
	Thinh.Nguyen

We added a cleanup of cancelled requests to ep dequeue for the case where
the ep cmd times out, because no command complete interrupt arrives then.
However, we found a new case in which the complete interrupt arrives later.
The cleanup of cancelled requests uses list_for_each_entry_safe(), which is
not safe against concurrent removal on multi-core systems:
dwc3_gadget_giveback() temporarily unlocks dwc->lock, so another core (the
ISR) can take the lock and try to clean up the same requests again. This can
cause list_del corruption. Use DWC3_EP_END_TRANSFER_PENDING to prevent it.

1. MTP server cancels -> ep dequeue -> ep cmd timeout(END_TRANSFER)
   -> cleanup cancelled requests -> dwc3_gadget_giveback -> list_del -> release lock temporarily
2. Complete with END_TRANSFER -> ISR(dwc3_gadget_endpoint_command_complete) gets lock
   -> cleanup cancelled requests -> dwc3_gadget_giveback -> list_del
3. MTP server process gets lock again -> tries to access POISON list(list_del corruption)

[  205.014697] [2:      MtpServer: 5032] dwc3 10b00000.dwc3: request cancelled with wrong reason:5
[  205.014719] [2:      MtpServer: 5032] list_del corruption, ffffff88b6963968->next is LIST_POISON1 (dead000000000100)

Change-Id: I9df055c6c04855edd09e330300914454a6657a23
Signed-off-by: Daehwan Jung <dh10.jung@samsung.com>

Change-Id: If87c88c3bb4c17ea1a5bde2bfec1382769f7ecab
Signed-off-by: Daehwan Jung <dh10.jung@samsung.com>
---
 drivers/usb/dwc3/gadget.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 3ad3bc5813ca..2e0183512d5b 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -2046,8 +2046,11 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
 
 			/* If ep cmd fails, then force to giveback cancelled requests here */
 			if (!(dep->flags & DWC3_EP_END_TRANSFER_PENDING)) {
-				dep->flags &= ~DWC3_EP_TRANSFER_STARTED;
+				dep->flags |= DWC3_EP_END_TRANSFER_PENDING;
 				dwc3_gadget_ep_cleanup_cancelled_requests(dep);
+
+				dep->flags &= ~DWC3_EP_TRANSFER_STARTED;
+				dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING;
 			}
 
 			dep->flags &= ~DWC3_EP_WAIT_TRANSFER_COMPLETE;
@@ -3426,9 +3429,12 @@ static void dwc3_gadget_endpoint_command_complete(struct dwc3_ep *dep,
 	if (dep->stream_capable)
 		dep->flags |= DWC3_EP_IGNORE_NEXT_NOSTREAM;
 
+	if (!(dep->flags & DWC3_EP_END_TRANSFER_PENDING)) {
+		dwc3_gadget_ep_cleanup_cancelled_requests(dep);
+	}
+
 	dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING;
 	dep->flags &= ~DWC3_EP_TRANSFER_STARTED;
-	dwc3_gadget_ep_cleanup_cancelled_requests(dep);
 
 	if (dep->flags & DWC3_EP_PENDING_CLEAR_STALL) {
 		struct dwc3 *dwc = dep->dwc;
-- 
2.31.1
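
The three-step race in the commit message of patch 2/2, and the flag-based
guard, can be modelled in a single-threaded user-space sketch. Everything
here is an assumption made for illustration: the fake_* names, the bit value,
and the array standing in for the cancelled list are hypothetical. The
"other core" is modelled by calling the completion handler at the point where
giveback would drop the lock, and the snapshot is a simplified stand-in for
the next pointer that a list walker caches before the lock is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define EP_END_TRANSFER_PENDING (1u << 0) /* hypothetical bit value */
#define NUM_REQS 3

struct fake_ep {
	unsigned int flags;
	bool on_list[NUM_REQS]; /* request is still on the cancelled list */
	bool late_event;        /* END_TRANSFER completion not yet delivered */
	int stale_visits;       /* walker reached an entry someone else removed */
};

static void fake_cleanup_cancelled(struct fake_ep *ep);

/* Models the command complete handler as changed by patch 2/2. */
static void fake_command_complete(struct fake_ep *ep)
{
	if (!(ep->flags & EP_END_TRANSFER_PENDING))
		fake_cleanup_cancelled(ep);
	ep->flags &= ~EP_END_TRANSFER_PENDING;
}

/* Models giveback: unlink the request, then drop the lock for the callback. */
static void fake_giveback(struct fake_ep *ep, size_t i)
{
	assert(ep->on_list[i]); /* a request must never be given back twice */
	ep->on_list[i] = false;
	/* lock dropped: another core may handle the late event now */
	if (ep->late_event) {
		ep->late_event = false;
		fake_command_complete(ep);
	}
	/* lock taken again */
}

static void fake_cleanup_cancelled(struct fake_ep *ep)
{
	bool snapshot[NUM_REQS];

	/* What this walker believes is on the list when it starts. */
	memcpy(snapshot, ep->on_list, sizeof(snapshot));
	for (size_t i = 0; i < NUM_REQS; i++) {
		if (!snapshot[i])
			continue;
		if (!ep->on_list[i])
			ep->stale_visits++; /* the real list would be poisoned here */
		else
			fake_giveback(ep, i);
	}
}

/* Dequeue path after an END_TRANSFER timeout; returns the stale visit count. */
static int fake_dequeue_after_timeout(bool use_guard)
{
	struct fake_ep ep = {
		.on_list = { true, true, true },
		.late_event = true,
	};

	if (use_guard)
		ep.flags |= EP_END_TRANSFER_PENDING;
	fake_cleanup_cancelled(&ep);
	ep.flags &= ~EP_END_TRANSFER_PENDING;
	return ep.stale_visits;
}
```

Without the guard, the nested cleanup empties the list while the outer walker
is still iterating, so the outer walker later reaches entries that are already
gone. With the guard, the completion handler skips the cleanup and the outer
walker finishes on its own. The model does not address whether the hardware is
still using the request buffers at that point.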



* Re: [PATCH v1 2/2] usb: dwc3: Prevent cleanup cancelled requests at the same time.
  2022-02-14  9:37     ` [PATCH v1 2/2] usb: dwc3: Prevent cleanup cancelled requests at the same time Daehwan Jung
@ 2022-02-14 10:42       ` Greg Kroah-Hartman
  2022-02-14 10:46         ` Jung Daehwan
  0 siblings, 1 reply; 7+ messages in thread
From: Greg Kroah-Hartman @ 2022-02-14 10:42 UTC (permalink / raw)
  To: Daehwan Jung
  Cc: Felipe Balbi, linux-usb, open list, quic_wcheng, quic_jackp,
	Thinh.Nguyen

On Mon, Feb 14, 2022 at 06:37:18PM +0900, Daehwan Jung wrote:
> We added cleanup cancelled requests when ep cmd timeout on ep dequeue
> because there's no complete interrupt then. But, we find out new case
> that complete interrupt comes up later. list_for_each_entry_safe is
> used when cleanup cancelled requests and it has vulnerability on multi-core
> environment. dwc3_gadget_giveback unlocks dwc->lock temporarily and other
> core(ISR) can get lock and try to cleanup them again. It could cause
> list_del corruption and we use DWC3_EP_END_TRANSFER_PENDING to prevent it.
> 
> 1. MTP server cancels -> ep dequeue -> ep cmd timeout(END_TRANSFER)
>    -> cleanup cancelled requests -> dwc3_gadget_giveback -> list_del -> release lock temporarily
> 2. Complete with END_TRANSFER -> ISR(dwc3_gadget_endpoint_command_complete) gets lock
>    -> cleanup cancelled requests -> dwc3_gadget_giveback -> list_del
> 3. MTP server process gets lock again -> tries to access POISON list(list_del corruption)
> 
> [  205.014697] [2:      MtpServer: 5032] dwc3 10b00000.dwc3: request cancelled with wrong reason:5
> [  205.014719] [2:      MtpServer: 5032] list_del corruption, ffffff88b6963968->next is LIST_POISON1 (dead000000000100)
> 
> Change-Id: I9df055c6c04855edd09e330300914454a6657a23
> Signed-off-by: Daehwan Jung <dh10.jung@samsung.com>
> 
> Change-Id: If87c88c3bb4c17ea1a5bde2bfec1382769f7ecab
> Signed-off-by: Daehwan Jung <dh10.jung@samsung.com>

Why did you sign off on this twice?

And did you run it through checkpatch.pl?  It would have reminded you
that Change-Id: should not be on patches :(

Same for patch 1/1.

Please fix.

thanks,

greg k-h


* Re: [PATCH v1 2/2] usb: dwc3: Prevent cleanup cancelled requests at the same time.
  2022-02-14 10:42       ` Greg Kroah-Hartman
@ 2022-02-14 10:46         ` Jung Daehwan
  0 siblings, 0 replies; 7+ messages in thread
From: Jung Daehwan @ 2022-02-14 10:46 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Felipe Balbi, linux-usb, open list, quic_wcheng, quic_jackp,
	Thinh.Nguyen

On Mon, Feb 14, 2022 at 11:42:03AM +0100, Greg Kroah-Hartman wrote:
> On Mon, Feb 14, 2022 at 06:37:18PM +0900, Daehwan Jung wrote:
> > We added cleanup cancelled requests when ep cmd timeout on ep dequeue
> > because there's no complete interrupt then. But, we find out new case
> > that complete interrupt comes up later. list_for_each_entry_safe is
> > used when cleanup cancelled requests and it has vulnerability on multi-core
> > environment. dwc3_gadget_giveback unlocks dwc->lock temporarily and other
> > core(ISR) can get lock and try to cleanup them again. It could cause
> > list_del corruption and we use DWC3_EP_END_TRANSFER_PENDING to prevent it.
> > 
> > 1. MTP server cancels -> ep dequeue -> ep cmd timeout(END_TRANSFER)
> >    -> cleanup cancelled requests -> dwc3_gadget_giveback -> list_del -> release lock temporarily
> > 2. Complete with END_TRANSFER -> ISR(dwc3_gadget_endpoint_command_complete) gets lock
> >    -> cleanup cancelled requests -> dwc3_gadget_giveback -> list_del
> > 3. MTP server process gets lock again -> tries to access POISON list(list_del corruption)
> > 
> > [  205.014697] [2:      MtpServer: 5032] dwc3 10b00000.dwc3: request cancelled with wrong reason:5
> > [  205.014719] [2:      MtpServer: 5032] list_del corruption, ffffff88b6963968->next is LIST_POISON1 (dead000000000100)
> > 
> > Change-Id: I9df055c6c04855edd09e330300914454a6657a23
> > Signed-off-by: Daehwan Jung <dh10.jung@samsung.com>
> > 
> > Change-Id: If87c88c3bb4c17ea1a5bde2bfec1382769f7ecab
> > Signed-off-by: Daehwan Jung <dh10.jung@samsung.com>
> 
> Why did you sign off on this twice?
> 
> And did you run it through checkpatch.pl?  It would have reminded you
> that Change-Id: should not be on patches :(
> 
> Same for patch 1/1.
> 
> Please fix.
> 
> thanks,
> 
> greg k-h
> 

Dear greg,

I'm sorry. This was my mistake when exporting the patches from our system.
I will fix it and resend.

Best Regards,
Jung Daehwan


* Re: [PATCH v1 1/2] usb: dwc3: Not set DWC3_EP_END_TRANSFER_PENDING in ep cmd fails
  2022-02-14  9:37     ` [PATCH v1 1/2] usb: dwc3: Not set DWC3_EP_END_TRANSFER_PENDING in ep cmd fails Daehwan Jung
@ 2022-02-14 18:53       ` Wesley Cheng
  2022-02-15  6:08         ` Jung Daehwan
  0 siblings, 1 reply; 7+ messages in thread
From: Wesley Cheng @ 2022-02-14 18:53 UTC (permalink / raw)
  To: Daehwan Jung, Felipe Balbi, Greg Kroah-Hartman
  Cc: linux-usb, open list, quic_jackp, Thinh.Nguyen

Hi Daehwan,

On 2/14/2022 1:37 AM, Daehwan Jung wrote:
> It always sets DWC3_EP_END_TRANSFER_PENDING in dwc3_stop_active_transfer
> even if dwc3_send_gadget_ep_cmd fails. It can cause some problems like
> skipping clear stall command or giveback from dequeue. We fix to set it
> only when ep cmd success. Additionally, We clear DWC3_EP_TRANSFER_STARTED
> for next trb to start transfer not update transfer.
> 
> Change-Id: I2e6b58acc99f385e467e8b639a3792a5e5f4d2bb
> Signed-off-by: Daehwan Jung <dh10.jung@samsung.com>
> ---
>  drivers/usb/dwc3/gadget.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 183b90923f51..3ad3bc5813ca 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -2044,6 +2044,12 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
>  				dwc3_gadget_move_cancelled_request(r,
>  						DWC3_REQUEST_STATUS_DEQUEUED);
>  
> +			/* If ep cmd fails, then force to giveback cancelled requests here */
> +			if (!(dep->flags & DWC3_EP_END_TRANSFER_PENDING)) {
> +				dep->flags &= ~DWC3_EP_TRANSFER_STARTED;
> +				dwc3_gadget_ep_cleanup_cancelled_requests(dep);
> +			}
> +
What I realized when looking at the endxfer command fail due to TIMEOUT,
was that it would lead to subsequent controller halt failures as well
(during pullup disable case).  It might not be safe to forcefully unmap
the request buffers if the controller may still be "working" on it.

I found some interesting quirks with regards to endxfer timeouts as
well, which I'm trying to get some more feedback on [1].  What is the
end issue being seen that requires this change? (we may have run into
the same issue as well.)

[1] -
https://lore.kernel.org/linux-usb/20220203080017.27339-1-quic_wcheng@quicinc.com/

Thanks
Wesley Cheng
>  			dep->flags &= ~DWC3_EP_WAIT_TRANSFER_COMPLETE;
>  
>  			goto out;
> @@ -3645,7 +3651,7 @@ static void dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force,
>  
>  	if (!interrupt)
>  		dep->flags &= ~DWC3_EP_TRANSFER_STARTED;
> -	else
> +	else if (!ret)
>  		dep->flags |= DWC3_EP_END_TRANSFER_PENDING;
>  }
>  


* Re: [PATCH v1 1/2] usb: dwc3: Not set DWC3_EP_END_TRANSFER_PENDING in ep cmd fails
  2022-02-14 18:53       ` Wesley Cheng
@ 2022-02-15  6:08         ` Jung Daehwan
  0 siblings, 0 replies; 7+ messages in thread
From: Jung Daehwan @ 2022-02-15  6:08 UTC (permalink / raw)
  To: Wesley Cheng
  Cc: Felipe Balbi, Greg Kroah-Hartman, linux-usb, open list,
	quic_jackp, Thinh.Nguyen

On Mon, Feb 14, 2022 at 10:53:14AM -0800, Wesley Cheng wrote:
> Hi Daehwan,
> 
> On 2/14/2022 1:37 AM, Daehwan Jung wrote:
> > It always sets DWC3_EP_END_TRANSFER_PENDING in dwc3_stop_active_transfer
> > even if dwc3_send_gadget_ep_cmd fails. It can cause some problems like
> > skipping clear stall command or giveback from dequeue. We fix to set it
> > only when ep cmd success. Additionally, We clear DWC3_EP_TRANSFER_STARTED
> > for next trb to start transfer not update transfer.
> > 
> > Change-Id: I2e6b58acc99f385e467e8b639a3792a5e5f4d2bb
> > Signed-off-by: Daehwan Jung <dh10.jung@samsung.com>
> > ---
> >  drivers/usb/dwc3/gadget.c | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > index 183b90923f51..3ad3bc5813ca 100644
> > --- a/drivers/usb/dwc3/gadget.c
> > +++ b/drivers/usb/dwc3/gadget.c
> > @@ -2044,6 +2044,12 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
> >  				dwc3_gadget_move_cancelled_request(r,
> >  						DWC3_REQUEST_STATUS_DEQUEUED);
> >  
> > +			/* If ep cmd fails, then force to giveback cancelled requests here */
> > +			if (!(dep->flags & DWC3_EP_END_TRANSFER_PENDING)) {
> > +				dep->flags &= ~DWC3_EP_TRANSFER_STARTED;
> > +				dwc3_gadget_ep_cleanup_cancelled_requests(dep);
> > +			}
> > +
> What I realized when looking at the endxfer command fail due to TIMEOUT,
> was that it would lead to subsequent controller halt failures as well
> (during pullup disable case).  It might not be safe to forcefully unmap
> the request buffers if the controller may still be "working" on it.
>
Hi Wesley,

I agree with your opinion that the controller may still be "working" on
it.

> I found some interesting quirks with regards to endxfer timeouts as
> well, which I'm trying to get some more feedback on [1].  What is the
> end issue being seen that requires this change? (we may have run into
> the same issue as well.)
> 
> [1] -
> https://lore.kernel.org/linux-usb/20220203080017.27339-1-quic_wcheng@quicinc.com/

I saw an adb hang when the ep cmd timeout occurs. I also think we may have
run into the same issue. I will take a look at your patches.
Thanks for your comment.

Best Regards,
Jung Daehwan

> 
> Thanks
> Wesley Cheng
> >  			dep->flags &= ~DWC3_EP_WAIT_TRANSFER_COMPLETE;
> >  
> >  			goto out;
> > @@ -3645,7 +3651,7 @@ static void dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force,
> >  
> >  	if (!interrupt)
> >  		dep->flags &= ~DWC3_EP_TRANSFER_STARTED;
> > -	else
> > +	else if (!ret)
> >  		dep->flags |= DWC3_EP_END_TRANSFER_PENDING;
> >  }
> >  
> 
