All of lore.kernel.org
 help / color / mirror / Atom feed
* [Regression] xhci: some hard drives cannot be seen using a JMicron JMS56x enclosure
@ 2018-02-28 13:55 Mathias Nyman
  0 siblings, 0 replies; 5+ messages in thread
From: Mathias Nyman @ 2018-02-28 13:55 UTC (permalink / raw)
  To: Cyril Roelandt; +Cc: linux-usb, Oliver Neukum

Hi

On 23.02.2018 04:55, Cyril Roelandt wrote:
> Hello,
> 
> On 02/21/18 17:04, Mathias Nyman wrote:
>>
>> could you change one flag in the xhci driver and take the same
>> set of logs? It will add more details about the URB and xHC hw position check.
>>
>> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
>> index cc368ad..f1b73e6 100644
>> --- a/drivers/usb/host/xhci-ring.c
>> +++ b/drivers/usb/host/xhci-ring.c
>> @@ -772,7 +772,7 @@ static void xhci_handle_cmd_stop_ep(struct xhci_hcd *xhci, int slot_id,
>>                  hw_deq &= ~0xf;
>>                  if (trb_in_td(xhci, cur_td->start_seg, cur_td->first_trb,
>> -                             cur_td->last_trb, hw_deq, false)) {
>> +                             cur_td->last_trb, hw_deq, true)) {
>>                          xhci_find_new_dequeue_state(xhci, slot_id, ep_index,
>>                                                      cur_td->urb->stream_id,
>>                                                      cur_td, &deq_state);
>>
>>
> 
> 
> OK, so I tried this patch on top of 4.13. I'll post the logs at the end of this
> message.
> 
>> Another thing not related to xhci is that the serial number seems to be changing
>> for this device after reset.
>> This causes the "usb 4-2: device firmware changed" messages, and usb core will
>> try to logically disconnect the device.
>>
>> Can you also plug in and out the device a few times without changing the
>> drives in the enclosure, and check if the serial number really is changing.
> 
> This behaviour can be seen on a machine running 4.11 (a version of the kernel
> with which the enclosure works well):
> 
> # dmesg -T | grep "SerialNumber:"
> [Fri Feb 23 02:43:20 2018] usb 2-2: SerialNumber: RANDOM__7758C4668BE7
> [Fri Feb 23 02:43:38 2018] usb 2-2: SerialNumber: RANDOM__D6BA993A16EF
> [Fri Feb 23 02:43:57 2018] usb 2-2: SerialNumber: RANDOM__DA02D6BA993A
> [Fri Feb 23 02:55:45 2018] usb 2-2: SerialNumber: RANDOM__BB3BB6A08401
> 
> 
> Thanks for your help, I'm leaving the (huge) logs below.
> Cyril Roelandt.
> 

Thanks for the logs, They clearly show that there is a issue in getting the
position where the second stream stopped.

When reading the stopped position for URB2 on stream 2 we get stream1 position,
in fact stream 2 never starts again after this, and when re-reading the stopped
position of stream 2 it keeps giving stream 1 position.

One possible cause is that a xhci command accidentally writes a "0" to "Max Primary Streams"
while configuring some other endpoint value, this would limit streams to 1.
Current tracing has a bug in displaying "Max Primaty streams" and always showed zero.

I have a series of even more custom debugging patches.
attached patches apply on 4.13, but seris for both 4.13 and 4.15 can be found in the
streams-debug-4.13 and streams-debug-4.15 branches at

git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git
https://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git

Can I ask you to do take logs with these?
no need for the previous diff, it's included

Thanks
Mathias

From b6dcdf0302e839555e8e1c884c919a3c9d775b92 Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman@linux.intel.com>
Date: Mon, 26 Feb 2018 11:07:15 +0200
Subject: [PATCH 3/3] xhci: TEMP DEBUG, add more custom streams debugging

For testing missing stream on uas
---
 drivers/usb/host/xhci-ring.c  | 15 +++++++++++++--
 drivers/usb/host/xhci-trace.h |  5 +++++
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index f8a09b1..8fb25a8 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -498,9 +498,20 @@ static u64 xhci_get_hw_deq(struct xhci_hcd *xhci, struct xhci_virt_device *vdev,
 
 	if (ep->ep_state & EP_HAS_STREAMS) {
 		st_ctx = &ep->stream_info->stream_ctx_array[stream_id];
-		return le64_to_cpu(st_ctx->stream_ring);
+		xhci_dbg_trace(xhci, trace_xhci_get_hw_deq,
+			       "ep_index %d stream_id %d deq %016llx\n",
+			       ep_index,
+			       stream_id,
+			       le64_to_cpu(st_ctx->stream_ring));
+
+	return le64_to_cpu(st_ctx->stream_ring);
 	}
 	ep_ctx = xhci_get_ep_ctx(xhci, vdev->out_ctx, ep_index);
+	xhci_dbg_trace(xhci, trace_xhci_get_hw_deq,
+		       "ep_index %d No stream, deq %016llx\n",
+		       ep_index,
+		       le64_to_cpu(ep_ctx->deq));
+
 	return le64_to_cpu(ep_ctx->deq);
 }
 
@@ -772,7 +783,7 @@ static void xhci_handle_cmd_stop_ep(struct xhci_hcd *xhci, int slot_id,
 		hw_deq &= ~0xf;
 
 		if (trb_in_td(xhci, cur_td->start_seg, cur_td->first_trb,
-			      cur_td->last_trb, hw_deq, false)) {
+			      cur_td->last_trb, hw_deq, true)) {
 			xhci_find_new_dequeue_state(xhci, slot_id, ep_index,
 						    cur_td->urb->stream_id,
 						    cur_td, &deq_state);
diff --git a/drivers/usb/host/xhci-trace.h b/drivers/usb/host/xhci-trace.h
index 262140e..c3c5801 100644
--- a/drivers/usb/host/xhci-trace.h
+++ b/drivers/usb/host/xhci-trace.h
@@ -39,6 +39,11 @@ DECLARE_EVENT_CLASS(xhci_log_msg,
 	TP_printk("%s", __get_str(msg))
 );
 
+DEFINE_EVENT(xhci_log_msg, xhci_get_hw_deq,
+	TP_PROTO(struct va_format *vaf),
+	TP_ARGS(vaf)
+);
+
 DEFINE_EVENT(xhci_log_msg, xhci_dbg_address,
 	TP_PROTO(struct va_format *vaf),
 	TP_ARGS(vaf)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [Regression] xhci: some hard drives cannot be seen using a JMicron JMS56x enclosure
@ 2018-03-26 12:27 Mathias Nyman
  0 siblings, 0 replies; 5+ messages in thread
From: Mathias Nyman @ 2018-03-26 12:27 UTC (permalink / raw)
  To: Cyril Roelandt; +Cc: linux-usb, Oliver Neukum

On 24.03.2018 06:37, Cyril Roelandt wrote:
> Hello,
> 
> On 03/07/18 17:16, Mathias Nyman wrote:
>> I can try to write a workaround that sets dequeue pointers for both the stream we want, and
>> the current active stream for each URB canceling.
> 
> Is there a branch from which I could pull the workaround and test it?
> 

I noticed some other issues while working on it. Streams set deq required a bit rework.
A branch with just streams deq rework can be found at:

git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git stream_set_tr_dq

If we're lucky that is enough, but your logs indicate that there might be something else.

If the rework alone doesn't work, then I have a simple fix on top of it.
It assumes streams aren't switched mid TD, i.e. that a non-active stream ring is not
left mid TD and controller doen't have cached TRBs for non-active streams.
This is pretty similar to how 4.12 kernel was handling it.
I haven't got confirmation from HW/xhci spec side if we can make this assumption.

Anyways, branch for this can be found at:

git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git stream_set_tr_dq_test1

If that still doesn't work, then a workaround that sets deq pointes for both the stream we want,
and the current active stream could be an option. I haven't written that yet.

Thanks
Mathias
---
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Regression] xhci: some hard drives cannot be seen using a JMicron JMS56x enclosure
@ 2018-03-24  4:37 Cyril Roelandt
  0 siblings, 0 replies; 5+ messages in thread
From: Cyril Roelandt @ 2018-03-24  4:37 UTC (permalink / raw)
  To: Mathias Nyman; +Cc: linux-usb, Oliver Neukum

Hello,

On 03/07/18 17:16, Mathias Nyman wrote:
> I can try to write a workaround that sets dequeue pointers for both the stream we want, and
> the current active stream for each URB canceling.

Is there a branch from which I could pull the workaround and test it?

Regards,
Cyril Roelandt.
---
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Regression] xhci: some hard drives cannot be seen using a JMicron JMS56x enclosure
@ 2018-03-07 15:16 Mathias Nyman
  0 siblings, 0 replies; 5+ messages in thread
From: Mathias Nyman @ 2018-03-07 15:16 UTC (permalink / raw)
  To: Cyril Roelandt; +Cc: linux-usb, Oliver Neukum

On 06.03.2018 04:24, Cyril Roelandt wrote:
> Hi,
> 
> On 02/28/18 15:55, Mathias Nyman wrote:
>>
>> I have a series of even more custom debugging patches.
>> attached patches apply on 4.13, but seris for both 4.13 and 4.15 can be found in the
>> streams-debug-4.13 and streams-debug-4.15 branches at
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git
>> https://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git
>>
>> Can I ask you to do take logs with these?
>> no need for the previous diff, it's included
> 
> I applied your three patches on top of v4.13 and got the followings logs:
> 
> dmesg:
> 
...
151.747272: xhci_get_hw_deq: ep_index 6 stream_id 1 deq 00000001bd377133
...
151.747288: xhci_queue_trb: CMD: Set TR Dequeue Pointer Command: deq 00000001bd377143 stream 1 slot 2 ep 7 flags C
...
151.747351: xhci_get_hw_deq: ep_index 6 stream_id 2 deq 00000001bd377143


> 
> I hope this helps,
> Cyril Roelandt.
> 

Thanks, It does
To me it looks like there is a issue with setting the dequeue pointer of a stream if
another stream was active while the endpoint stopped.

Basically if we want to cancel a URB on Stream 1, meaning stop entire endpoint with all its streams,
and then move stream 1 dequeue pointer forward, we end up setting the dequeue pointer of stream 2
as well. Probably because stream 2 was the active stream when the endpoint was stopped.

I can try to write a workaround that sets dequeue pointers for both the stream we want, and
the current active stream for each URB canceling.

Thanks
Mathias
---
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Regression] xhci: some hard drives cannot be seen using a JMicron JMS56x enclosure
@ 2018-02-21 15:04 Mathias Nyman
  0 siblings, 0 replies; 5+ messages in thread
From: Mathias Nyman @ 2018-02-21 15:04 UTC (permalink / raw)
  To: Cyril Roelandt; +Cc: linux-usb, Oliver Neukum

On 19.02.2018 01:38, Cyril Roelandt wrote:
> Hello,
> 
> I'm inlining my answers to your questions.
> 
> On 02/15/18 17:30, Mathias Nyman wrote:
>> On 15.02.2018 03:05, Cyril Roelandt wrote:
>>> Hi,
>>>
>>> I use two hard drives in an enclosure connected to my PC using UAS. The
>>> enclosure is a JMicron JMS56x (152d:0562); the drives are a Fujitsu MHZ2160BH
>>> G2 (2"5, 160GB) and a Western Digital EFRX-68N32N0 (3"5, 4TB).
>>>
>>> Using a USB2 port, I can see both drives as expected. I used to be able to use
>>> both drives using a USB3 port (and UAS) in Linux 4.11 and 4.12, but I am
>>> experiencing a bug starting with Linux 4.13.
>>>
>>
>> Did you see the:
>> "usb 4-2: device firmware changed"
>> message with 4.12 or older kernels?
> 
> No, I cannot see this message on a 4.11 kernel.
...
> 
>> could you send full logs, both dynamic debug and xhci traces to me:
>>
>> echo 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
>> echo 'module usbcore =p' >/sys/kernel/debug/dynamic_debug/control
>>
>> and
>>
>> echo 81920 > /sys/kernel/debug/tracing/buffer_size_kb
>> echo 1 > /sys/kernel/debug/tracing/events/xhci-hcd/enable
>>
>> both the /sys/kernel/debug/tracing/trace and dmesg
> 
> Here is the output of dmesg:
> 

Thanks, been staring at the logs for some time now.

I can see one issue on xhci side.

endpoint 3(IN) for xhci slot 2 uses two streams.

uas driver wants to cancel two URBS on this endpoint:
   URB1 (ffff9e50d43afa80) located in stream ring 1 at 0x1a3704130
   URB2 (ffff9e50d43af0c0) located in stream ring 2 at 0x1a3707080

xhci stops the endpoint to cancel the URBs, this causes both streams to stop.

For URB1 we notice xHC hw was currently working on the URB we want to cancel,
i.e. stream 1 stopped at the same location as URB1 (at 0x1a3704130).
xhci driver ask hardware to move past this position, and xhci driver gives back
URB1 as "cancelled". Stream 1 continue to work after this

For the second URB we don't notice that xHC hw is working on the same URB we
want to cancel, so we bluntly write over URB2  with zeroes, and driver gives
back URB2 as "cancelled".
Stream 2 does not recover, and the URB that is queued after this on stream 2 is
never handled. Stream 2 is stuck at the position of URB2 (0x1a3707080).

As the next URB on stream 2 is never given back it blocks usb_disconnect which
is waiting for all pending urbs to be given back.

could you change one flag in the xhci driver and take the same
set of logs? It will add more details about the URB and xHC hw position check.



Another thing not related to xhci is that the serial number seems to be changing
for this device after reset.
First log:
[Tue Feb  6 22:17:49 2018] usb 4-2: SerialNumber: RANDOM__8A4D7F833EEF
Second log:
[Sun Feb 18 22:46:01 2018] usb 2-2: SerialNumber: RANDOM__4157B7E21ACD

This causes the "usb 4-2: device firmware changed" messages, and usb core will
try to logically disconnect the device.

Can you also plug in and out the device a few times without changing the
drives in the enclosure, and check if the serial number really is changing.

Thanks
Mathias
---
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index cc368ad..f1b73e6 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -772,7 +772,7 @@ static void xhci_handle_cmd_stop_ep(struct xhci_hcd *xhci, int slot_id,
                 hw_deq &= ~0xf;
  
                 if (trb_in_td(xhci, cur_td->start_seg, cur_td->first_trb,
-                             cur_td->last_trb, hw_deq, false)) {
+                             cur_td->last_trb, hw_deq, true)) {
                         xhci_find_new_dequeue_state(xhci, slot_id, ep_index,
                                                     cur_td->urb->stream_id,
                                                     cur_td, &deq_state);

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-03-26 12:27 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-28 13:55 [Regression] xhci: some hard drives cannot be seen using a JMicron JMS56x enclosure Mathias Nyman
  -- strict thread matches above, loose matches on Subject: below --
2018-03-26 12:27 Mathias Nyman
2018-03-24  4:37 Cyril Roelandt
2018-03-07 15:16 Mathias Nyman
2018-02-21 15:04 Mathias Nyman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.