linux-usb.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Receiving report multiple times when changing cpu
@ 2022-10-12 14:35 Thilo Rörig
  2022-10-12 16:00 ` Greg KH
  2022-10-12 16:54 ` Alan Stern
  0 siblings, 2 replies; 9+ messages in thread
From: Thilo Rörig @ 2022-10-12 14:35 UTC (permalink / raw)
  To: linux-usb

Dear maintainer, dear community,

I'm new to looking into the linux kernel, but am hoping that this is the
right spot to address my problem.
Otherwise I would be very thankful, if you could point me in the right
direction.

I have a problem with a custom usb device on a raspberry pi running a
buildroot 64bit linux with a v5.19.14 kernel.
I ran into a problem reading data from a custom usb hid device using
interrupt transfers. I'm using
the python hidapi library (https://pypi.org/project/hidapi/) with the
libusb backend. After sending
a custom command to the device, the driver changes from usbhid to usbfs.
The device starts sending
reports of size 1023 bytes to endpoint 0x83. `usb-devices` yields the
following output:

T:  Bus=3D01 Lev=3D03 Prnt=3D13 Port=3D00 Cnt=3D01 Dev#=3D 14 Spd=3D480 MxC=
h=3D 0
D:  Ver=3D 2.00 Cls=3D00(>ifc ) Sub=3D00 Prot=3D00 MxPS=3D64 #Cfgs=3D  1
C:  #Ifs=3D 1 Cfg#=3D 1 Atr=3D80 MxPwr=3D500mA
I:  If#=3D 0 Alt=3D 0 #EPs=3D 1 Cls=3D03(HID  ) Sub=3D00 Prot=3D00 Driver=
=3Dusbfs
E:  Ad=3D83(I) Atr=3D03(Int.) MxPS=3D1024 Ivl=3D125us

The reports are numbered by an increasing report id and are available via
interrupt transfer approx. every
230 microseconds (1.85 microframes). The device does not retain any old
reports.

Initially, I get all reports in the correct order, but after some time I
start getting reports that I have already
received previously. The report ids start to include a shift after a
certain number of reports and
schematically they look like the following:

... 1,2,...,578,  238,239,...,816,  476,477,...,1054, and so on

So there is approximately the same number of consecutive "forward" reports
followed by a jump "backwards"
by approximately the same number of reports. I snooped the transferred data
using the `usbcore.usbfs_snoop` kernel
parameter but this slowed things down, so that I started missing reports
and could not reproduce the problem.
Then I traced the `xhci-hcd` events and observed that the pattern started
to occur, when my python process was
rescheduled from one cpu-core to another

...
          python-22331   [000] ..... 174497.068723: xhci_urb_enqueue:
ep3in-intr: urb 0000000070a812bc pipe 1073852544 slot 3 length 0/1024
sgs 0/0 stream 0 flags 00010200
          python-22331   [000] d.... 174497.068727: xhci_queue_trb:
INTR: Buffer 00000004669c9800 length 1024 TD size 0 intr 0 type
'Normal' flags b:i:I:c:s:I:e:c
          python-22331   [000] d.... 174497.068728: xhci_inc_enq: INTR
000000006122f915: enq 0x00000004526bf8f0(0x00000004526bf000) deq
0x00000004526bf8e0(0x00000004526bf000) segs 2 stream 0 free_trbs 500
bounce 1024 cycle 1
          python-22331   [000] d.... 174497.068729:
xhci_ring_ep_doorbell: Ring doorbell for Slot 3 ep3in
          <idle>-0       [000] d.h.. 174497.068811: xhci_handle_event:
EVENT: TRB 00000004526bf8e0 status 'Short Packet' len 1 slot 3 ep 7
type 'Transfer Event' flags e:c
          <idle>-0       [000] d.h.. 174497.068814:
xhci_handle_transfer: INTR: Buffer 00000004669c9800 length 1024 TD
size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
          <idle>-0       [000] d.h.. 174497.068820: xhci_inc_deq: INTR
000000006122f915: enq 0x00000004526bf8f0(0x00000004526bf000) deq
0x00000004526bf8f0(0x00000004526bf000) segs 2 stream 0 free_trbs 501
bounce 1024 cycle 1
          <idle>-0       [000] d.h.. 174497.068822: xhci_urb_giveback:
ep3in-intr: urb 0000000070a812bc pipe 1073852544 slot 3 length
1023/1024 sgs 0/0 stream 0 flags 00010200
          <idle>-0       [000] d.h.. 174497.068824: xhci_inc_deq:
EVENT 000000002a1d7c0b: enq 0x0000000440f9c000(0x0000000440f9c000) deq
0x0000000440f9f620(0x0000000440f9f000) segs 8 stream 0 free_trbs 2039
bounce 0 cycle 0
          python-22331   [000] ..... 174497.068894: xhci_urb_enqueue:
ep3in-intr: urb 0000000070a812bc pipe 1073852544 slot 3 length 0/1024
sgs 0/0 stream 0 flags 00010200
          python-22331   [000] d.... 174497.068899: xhci_queue_trb:
INTR: Buffer 00000004669c9800 length 1024 TD size 0 intr 0 type
'Normal' flags b:i:I:c:s:I:e:c
          python-22331   [000] d.... 174497.068900: xhci_inc_enq: INTR
000000006122f915: enq 0x00000004526bf900(0x00000004526bf000) deq
0x00000004526bf8f0(0x00000004526bf000) segs 2 stream 0 free_trbs 500
bounce 1024 cycle 1
          python-22331   [000] d.... 174497.068901:
xhci_ring_ep_doorbell: Ring doorbell for Slot 3 ep3in
          <idle>-0       [000] d.h.. 174497.069954: xhci_handle_event:
EVENT: TRB 00000004526bf8f0 status 'Short Packet' len 1 slot 3 ep 7
type 'Transfer Event' flags e:c
          <idle>-0       [000] d.h.. 174497.069957:
xhci_handle_transfer: INTR: Buffer 00000004669c9800 length 1024 TD
size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
          <idle>-0       [000] d.h.. 174497.069959: xhci_inc_deq: INTR
000000006122f915: enq 0x00000004526bf900(0x00000004526bf000) deq
0x00000004526bf900(0x00000004526bf000) segs 2 stream 0 free_trbs 501
bounce 1024 cycle 1
          <idle>-0       [000] d.h.. 174497.069961: xhci_urb_giveback:
ep3in-intr: urb 0000000070a812bc pipe 1073852544 slot 3 length
1023/1024 sgs 0/0 stream 0 flags 00010200
          <idle>-0       [000] d.h.. 174497.069962: xhci_inc_deq:
EVENT 000000002a1d7c0b: enq 0x0000000440f9c000(0x0000000440f9c000) deq
0x0000000440f9f630(0x0000000440f9f000) segs 8 stream 0 free_trbs 2039
bounce 0 cycle 0
          python-22331   [003] ..... 174497.070160: xhci_urb_enqueue:
ep3in-intr: urb 00000000a553f707 pipe 1073852544 slot 3 length 0/1024
sgs 0/0 stream 0 flags 00010200
          python-22331   [003] d.... 174497.070171: xhci_queue_trb:
INTR: Buffer 00000004799b0000 length 1024 TD size 0 intr 0 type
'Normal' flags b:i:I:c:s:I:e:c
          python-22331   [003] d.... 174497.070172: xhci_inc_enq: INTR
000000006122f915: enq 0x00000004526bf910(0x00000004526bf000) deq
0x00000004526bf900(0x00000004526bf000) segs 2 stream 0 free_trbs 500
bounce 1024 cycle 1
          python-22331   [003] d.... 174497.070174:
xhci_ring_ep_doorbell: Ring doorbell for Slot 3 ep3in
          <idle>-0       [000] d.h.. 174497.071201: xhci_handle_event:
EVENT: TRB 00000004526bf900 status 'Short Packet' len 1 slot 3 ep 7
type 'Transfer Event' flags e:c
          <idle>-0       [000] d.h.. 174497.071204:
xhci_handle_transfer: INTR: Buffer 00000004799b0000 length 1024 TD
size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
          <idle>-0       [000] d.h.. 174497.071206: xhci_inc_deq: INTR
000000006122f915: enq 0x00000004526bf910(0x00000004526bf000) deq
0x00000004526bf910(0x00000004526bf000) segs 2 stream 0 free_trbs 501
bounce 1024 cycle 1
          <idle>-0       [000] d.h.. 174497.071208: xhci_urb_giveback:
ep3in-intr: urb 00000000a553f707 pipe 1073852544 slot 3 length
1023/1024 sgs 0/0 stream 0 flags 00010200
          <idle>-0       [000] d.h.. 174497.071210: xhci_inc_deq:
EVENT 000000002a1d7c0b: enq 0x0000000440f9c000(0x0000000440f9c000) deq
0x0000000440f9f640(0x0000000440f9f000) segs 8 stream 0 free_trbs 2039
bounce 0 cycle 0
          python-22331   [003] ..... 174497.071269: xhci_urb_enqueue:
ep3in-intr: urb 00000000a553f707 pipe 1073852544 slot 3 length 0/1024
sgs 0/0 stream 0 flags 00010200
          python-22331   [003] d.... 174497.071272: xhci_queue_trb:
INTR: Buffer 00000004799b0000 length 1024 TD size 0 intr 0 type
'Normal' flags b:i:I:c:s:I:e:c
          python-22331   [003] d.... 174497.071273: xhci_inc_enq: INTR
000000006122f915: enq 0x00000004526bf920(0x00000004526bf000) deq
0x00000004526bf910(0x00000004526bf000) segs 2 stream 0 free_trbs 500
bounce 1024 cycle 1
          python-22331   [003] d.... 174497.071274:
xhci_ring_ep_doorbell: Ring doorbell for Slot 3 ep3in
          <idle>-0       [000] d.h.. 174497.072330: xhci_handle_event:
EVENT: TRB 00000004526bf910 status 'Short Packet' len 1 slot 3 ep 7
type 'Transfer Event' flags e:c
          <idle>-0       [000] d.h.. 174497.072331:
xhci_handle_transfer: INTR: Buffer 00000004799b0000 length 1024 TD
size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
          <idle>-0       [000] d.h.. 174497.072332: xhci_inc_deq: INTR
000000006122f915: enq 0x00000004526bf920(0x00000004526bf000) deq
0x00000004526bf920(0x00000004526bf000) segs 2 stream 0 free_trbs 501
bounce 1024 cycle 1
          <idle>-0       [000] d.h.. 174497.072334: xhci_urb_giveback:
ep3in-intr: urb 00000000a553f707 pipe 1073852544 slot 3 length
1023/1024 sgs 0/0 stream 0 flags 00010200
          <idle>-0       [000] d.h.. 174497.072334: xhci_inc_deq:
EVENT 000000002a1d7c0b: enq 0x0000000440f9c000(0x0000000440f9c000) deq
0x0000000440f9f650(0x0000000440f9f000) segs 8 stream 0 free_trbs 2039
bounce 0 cycle 0
...

Unfortunately, I do not understand the details of the linux usb stack in
depth and the details of
this log enough to know where to look any further. I am hoping to get some
help or pointers to
how to attack my problem. I have tried using different libraries (hidapi C,
libusb), linux systems
(raspberry pi OS, ubuntu), but the problem persisted.

Thank you very much for taking the time for reading about my problem and
thank you in advance
for any advice.

Best
Thilo Rörig

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Receiving report multiple times when changing cpu
  2022-10-12 14:35 Receiving report multiple times when changing cpu Thilo Rörig
@ 2022-10-12 16:00 ` Greg KH
  2022-10-14 11:18   ` Thilo Roerig
  2022-10-12 16:54 ` Alan Stern
  1 sibling, 1 reply; 9+ messages in thread
From: Greg KH @ 2022-10-12 16:00 UTC (permalink / raw)
  To: Thilo Rörig; +Cc: linux-usb

On Wed, Oct 12, 2022 at 04:35:52PM +0200, Thilo Rörig wrote:
> Dear maintainer, dear community,
> 
> I'm new to looking into the linux kernel, but am hoping that this is the
> right spot to address my problem.
> Otherwise I would be very thankful, if you could point me in the right
> direction.
> 
> I have a problem with a custom usb device on a raspberry pi running a
> buildroot 64bit linux with a v5.19.14 kernel.
> I ran into a problem reading data from a custom usb hid device using
> interrupt transfers. I'm using
> the python hidapi library (https://pypi.org/project/hidapi/) with the
> libusb backend. After sending
> a custom command to the device, the driver changes from usbhid to usbfs.
> The device starts sending
> reports of size 1023 bytes to endpoint 0x83. `usb-devices` yields the
> following output:

USB devices do not "send" anything, they require a request from the host
to be able to transfer any data.  So you have to ask the device "do you
have any data?" before it can respond with anything.

So just wait, don't ask the device if you are not ready for the data to
be recieved, do not allow the device to continue to flood the data to
you if you can not recieve it all.



> 
> T:  Bus=3D01 Lev=3D03 Prnt=3D13 Port=3D00 Cnt=3D01 Dev#=3D 14 Spd=3D480 MxC=
> h=3D 0
> D:  Ver=3D 2.00 Cls=3D00(>ifc ) Sub=3D00 Prot=3D00 MxPS=3D64 #Cfgs=3D  1
> C:  #Ifs=3D 1 Cfg#=3D 1 Atr=3D80 MxPwr=3D500mA
> I:  If#=3D 0 Alt=3D 0 #EPs=3D 1 Cls=3D03(HID  ) Sub=3D00 Prot=3D00 Driver=
> =3Dusbfs
> E:  Ad=3D83(I) Atr=3D03(Int.) MxPS=3D1024 Ivl=3D125us
> 
> The reports are numbered by an increasing report id and are available via
> interrupt transfer approx. every
> 230 microseconds (1.85 microframes). The device does not retain any old
> reports.

So is the buffer in the device overflowing because the host is not
asking for enough data from it, or is the data overflowing in the host
because you are not reading the buffers fast enough?

> Initially, I get all reports in the correct order, but after some time I
> start getting reports that I have already
> received previously. The report ids start to include a shift after a
> certain number of reports and
> schematically they look like the following:
> 
> ... 1,2,...,578,  238,239,...,816,  476,477,...,1054, and so on
> 
> So there is approximately the same number of consecutive "forward" reports
> followed by a jump "backwards"
> by approximately the same number of reports. I snooped the transferred data
> using the `usbcore.usbfs_snoop` kernel
> parameter but this slowed things down, so that I started missing reports
> and could not reproduce the problem.

Where was the report missed from?  The device or the host?  Does the
device overflow and not send data if you don't ask for it fast enough?

What about using usbmon to watch the data flow by?


> Then I traced the `xhci-hcd` events and observed that the pattern started
> to occur, when my python process was
> rescheduled from one cpu-core to another
> 
> ...
>           python-22331   [000] ..... 174497.068723: xhci_urb_enqueue:
> ep3in-intr: urb 0000000070a812bc pipe 1073852544 slot 3 length 0/1024
> sgs 0/0 stream 0 flags 00010200
>           python-22331   [000] d.... 174497.068727: xhci_queue_trb:
> INTR: Buffer 00000004669c9800 length 1024 TD size 0 intr 0 type
> 'Normal' flags b:i:I:c:s:I:e:c
>           python-22331   [000] d.... 174497.068728: xhci_inc_enq: INTR
> 000000006122f915: enq 0x00000004526bf8f0(0x00000004526bf000) deq
> 0x00000004526bf8e0(0x00000004526bf000) segs 2 stream 0 free_trbs 500
> bounce 1024 cycle 1
>           python-22331   [000] d.... 174497.068729:
> xhci_ring_ep_doorbell: Ring doorbell for Slot 3 ep3in
>           <idle>-0       [000] d.h.. 174497.068811: xhci_handle_event:
> EVENT: TRB 00000004526bf8e0 status 'Short Packet' len 1 slot 3 ep 7
> type 'Transfer Event' flags e:c
>           <idle>-0       [000] d.h.. 174497.068814:
> xhci_handle_transfer: INTR: Buffer 00000004669c9800 length 1024 TD
> size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
>           <idle>-0       [000] d.h.. 174497.068820: xhci_inc_deq: INTR
> 000000006122f915: enq 0x00000004526bf8f0(0x00000004526bf000) deq
> 0x00000004526bf8f0(0x00000004526bf000) segs 2 stream 0 free_trbs 501
> bounce 1024 cycle 1
>           <idle>-0       [000] d.h.. 174497.068822: xhci_urb_giveback:
> ep3in-intr: urb 0000000070a812bc pipe 1073852544 slot 3 length
> 1023/1024 sgs 0/0 stream 0 flags 00010200
>           <idle>-0       [000] d.h.. 174497.068824: xhci_inc_deq:
> EVENT 000000002a1d7c0b: enq 0x0000000440f9c000(0x0000000440f9c000) deq
> 0x0000000440f9f620(0x0000000440f9f000) segs 8 stream 0 free_trbs 2039
> bounce 0 cycle 0
>           python-22331   [000] ..... 174497.068894: xhci_urb_enqueue:
> ep3in-intr: urb 0000000070a812bc pipe 1073852544 slot 3 length 0/1024
> sgs 0/0 stream 0 flags 00010200
>           python-22331   [000] d.... 174497.068899: xhci_queue_trb:
> INTR: Buffer 00000004669c9800 length 1024 TD size 0 intr 0 type
> 'Normal' flags b:i:I:c:s:I:e:c
>           python-22331   [000] d.... 174497.068900: xhci_inc_enq: INTR
> 000000006122f915: enq 0x00000004526bf900(0x00000004526bf000) deq
> 0x00000004526bf8f0(0x00000004526bf000) segs 2 stream 0 free_trbs 500
> bounce 1024 cycle 1
>           python-22331   [000] d.... 174497.068901:
> xhci_ring_ep_doorbell: Ring doorbell for Slot 3 ep3in
>           <idle>-0       [000] d.h.. 174497.069954: xhci_handle_event:
> EVENT: TRB 00000004526bf8f0 status 'Short Packet' len 1 slot 3 ep 7
> type 'Transfer Event' flags e:c
>           <idle>-0       [000] d.h.. 174497.069957:
> xhci_handle_transfer: INTR: Buffer 00000004669c9800 length 1024 TD
> size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
>           <idle>-0       [000] d.h.. 174497.069959: xhci_inc_deq: INTR
> 000000006122f915: enq 0x00000004526bf900(0x00000004526bf000) deq
> 0x00000004526bf900(0x00000004526bf000) segs 2 stream 0 free_trbs 501
> bounce 1024 cycle 1
>           <idle>-0       [000] d.h.. 174497.069961: xhci_urb_giveback:
> ep3in-intr: urb 0000000070a812bc pipe 1073852544 slot 3 length
> 1023/1024 sgs 0/0 stream 0 flags 00010200
>           <idle>-0       [000] d.h.. 174497.069962: xhci_inc_deq:
> EVENT 000000002a1d7c0b: enq 0x0000000440f9c000(0x0000000440f9c000) deq
> 0x0000000440f9f630(0x0000000440f9f000) segs 8 stream 0 free_trbs 2039
> bounce 0 cycle 0
>           python-22331   [003] ..... 174497.070160: xhci_urb_enqueue:
> ep3in-intr: urb 00000000a553f707 pipe 1073852544 slot 3 length 0/1024
> sgs 0/0 stream 0 flags 00010200
>           python-22331   [003] d.... 174497.070171: xhci_queue_trb:
> INTR: Buffer 00000004799b0000 length 1024 TD size 0 intr 0 type
> 'Normal' flags b:i:I:c:s:I:e:c
>           python-22331   [003] d.... 174497.070172: xhci_inc_enq: INTR
> 000000006122f915: enq 0x00000004526bf910(0x00000004526bf000) deq
> 0x00000004526bf900(0x00000004526bf000) segs 2 stream 0 free_trbs 500
> bounce 1024 cycle 1
>           python-22331   [003] d.... 174497.070174:
> xhci_ring_ep_doorbell: Ring doorbell for Slot 3 ep3in
>           <idle>-0       [000] d.h.. 174497.071201: xhci_handle_event:
> EVENT: TRB 00000004526bf900 status 'Short Packet' len 1 slot 3 ep 7
> type 'Transfer Event' flags e:c
>           <idle>-0       [000] d.h.. 174497.071204:
> xhci_handle_transfer: INTR: Buffer 00000004799b0000 length 1024 TD
> size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
>           <idle>-0       [000] d.h.. 174497.071206: xhci_inc_deq: INTR
> 000000006122f915: enq 0x00000004526bf910(0x00000004526bf000) deq
> 0x00000004526bf910(0x00000004526bf000) segs 2 stream 0 free_trbs 501
> bounce 1024 cycle 1
>           <idle>-0       [000] d.h.. 174497.071208: xhci_urb_giveback:
> ep3in-intr: urb 00000000a553f707 pipe 1073852544 slot 3 length
> 1023/1024 sgs 0/0 stream 0 flags 00010200
>           <idle>-0       [000] d.h.. 174497.071210: xhci_inc_deq:
> EVENT 000000002a1d7c0b: enq 0x0000000440f9c000(0x0000000440f9c000) deq
> 0x0000000440f9f640(0x0000000440f9f000) segs 8 stream 0 free_trbs 2039
> bounce 0 cycle 0
>           python-22331   [003] ..... 174497.071269: xhci_urb_enqueue:
> ep3in-intr: urb 00000000a553f707 pipe 1073852544 slot 3 length 0/1024
> sgs 0/0 stream 0 flags 00010200
>           python-22331   [003] d.... 174497.071272: xhci_queue_trb:
> INTR: Buffer 00000004799b0000 length 1024 TD size 0 intr 0 type
> 'Normal' flags b:i:I:c:s:I:e:c
>           python-22331   [003] d.... 174497.071273: xhci_inc_enq: INTR
> 000000006122f915: enq 0x00000004526bf920(0x00000004526bf000) deq
> 0x00000004526bf910(0x00000004526bf000) segs 2 stream 0 free_trbs 500
> bounce 1024 cycle 1
>           python-22331   [003] d.... 174497.071274:
> xhci_ring_ep_doorbell: Ring doorbell for Slot 3 ep3in
>           <idle>-0       [000] d.h.. 174497.072330: xhci_handle_event:
> EVENT: TRB 00000004526bf910 status 'Short Packet' len 1 slot 3 ep 7
> type 'Transfer Event' flags e:c
>           <idle>-0       [000] d.h.. 174497.072331:
> xhci_handle_transfer: INTR: Buffer 00000004799b0000 length 1024 TD
> size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
>           <idle>-0       [000] d.h.. 174497.072332: xhci_inc_deq: INTR
> 000000006122f915: enq 0x00000004526bf920(0x00000004526bf000) deq
> 0x00000004526bf920(0x00000004526bf000) segs 2 stream 0 free_trbs 501
> bounce 1024 cycle 1
>           <idle>-0       [000] d.h.. 174497.072334: xhci_urb_giveback:
> ep3in-intr: urb 00000000a553f707 pipe 1073852544 slot 3 length
> 1023/1024 sgs 0/0 stream 0 flags 00010200
>           <idle>-0       [000] d.h.. 174497.072334: xhci_inc_deq:
> EVENT 000000002a1d7c0b: enq 0x0000000440f9c000(0x0000000440f9c000) deq
> 0x0000000440f9f650(0x0000000440f9f000) segs 8 stream 0 free_trbs 2039
> bounce 0 cycle 0
> ...

That's a bit hard to read, sorry, I don't know what that means.

But check your userspace side, only ask for data when you are ready to
accept it, and then process it and ask for more data.  That might make
things a lot simpler.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Receiving report multiple times when changing cpu
  2022-10-12 14:35 Receiving report multiple times when changing cpu Thilo Rörig
  2022-10-12 16:00 ` Greg KH
@ 2022-10-12 16:54 ` Alan Stern
  2022-10-14 11:20   ` Thilo Roerig
  1 sibling, 1 reply; 9+ messages in thread
From: Alan Stern @ 2022-10-12 16:54 UTC (permalink / raw)
  To: Thilo Rörig; +Cc: linux-usb

On Wed, Oct 12, 2022 at 04:35:52PM +0200, Thilo Rörig wrote:
> Dear maintainer, dear community,
> 
> I'm new to looking into the linux kernel, but am hoping that this is the
> right spot to address my problem.
> Otherwise I would be very thankful, if you could point me in the right
> direction.
> 
> I have a problem with a custom usb device on a raspberry pi running a
> buildroot 64bit linux with a v5.19.14 kernel.
> I ran into a problem reading data from a custom usb hid device using
> interrupt transfers. I'm using
> the python hidapi library (https://pypi.org/project/hidapi/) with the
> libusb backend. After sending
> a custom command to the device, the driver changes from usbhid to usbfs.
> The device starts sending
> reports of size 1023 bytes to endpoint 0x83. `usb-devices` yields the
> following output:
> 
> T:  Bus=3D01 Lev=3D03 Prnt=3D13 Port=3D00 Cnt=3D01 Dev#=3D 14 Spd=3D480 MxC=
> h=3D 0
> D:  Ver=3D 2.00 Cls=3D00(>ifc ) Sub=3D00 Prot=3D00 MxPS=3D64 #Cfgs=3D  1
> C:  #Ifs=3D 1 Cfg#=3D 1 Atr=3D80 MxPwr=3D500mA
> I:  If#=3D 0 Alt=3D 0 #EPs=3D 1 Cls=3D03(HID  ) Sub=3D00 Prot=3D00 Driver=
> =3Dusbfs
> E:  Ad=3D83(I) Atr=3D03(Int.) MxPS=3D1024 Ivl=3D125us
> 
> The reports are numbered by an increasing report id and are available via
> interrupt transfer approx. every
> 230 microseconds (1.85 microframes). The device does not retain any old
> reports.
> 
> Initially, I get all reports in the correct order, but after some time I
> start getting reports that I have already
> received previously. The report ids start to include a shift after a
> certain number of reports and
> schematically they look like the following:
> 
> ... 1,2,...,578,  238,239,...,816,  476,477,...,1054, and so on
> 
> So there is approximately the same number of consecutive "forward" reports
> followed by a jump "backwards"
> by approximately the same number of reports. I snooped the transferred data
> using the `usbcore.usbfs_snoop` kernel
> parameter but this slowed things down, so that I started missing reports
> and could not reproduce the problem.

As Greg mentioned, usbmon is a better way to snoop the data transfers.

> Then I traced the `xhci-hcd` events and observed that the pattern started
> to occur, when my python process was
> rescheduled from one cpu-core to another

How on earth do you know when your python process is rescheduled from 
one CPU core to another?  That information certainly is not part of the 
xhci-hcd trace output.

> ...
>           python-22331   [000] ..... 174497.068723: xhci_urb_enqueue:
> ep3in-intr: urb 0000000070a812bc pipe 1073852544 slot 3 length 0/1024
> sgs 0/0 stream 0 flags 00010200
>           python-22331   [000] d.... 174497.068727: xhci_queue_trb:
> INTR: Buffer 00000004669c9800 length 1024 TD size 0 intr 0 type
> 'Normal' flags b:i:I:c:s:I:e:c
>           python-22331   [000] d.... 174497.068728: xhci_inc_enq: INTR
> 000000006122f915: enq 0x00000004526bf8f0(0x00000004526bf000) deq
> 0x00000004526bf8e0(0x00000004526bf000) segs 2 stream 0 free_trbs 500
> bounce 1024 cycle 1
>           python-22331   [000] d.... 174497.068729:
> xhci_ring_ep_doorbell: Ring doorbell for Slot 3 ep3in
>           <idle>-0       [000] d.h.. 174497.068811: xhci_handle_event:
> EVENT: TRB 00000004526bf8e0 status 'Short Packet' len 1 slot 3 ep 7
> type 'Transfer Event' flags e:c
>           <idle>-0       [000] d.h.. 174497.068814:
> xhci_handle_transfer: INTR: Buffer 00000004669c9800 length 1024 TD
> size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
>           <idle>-0       [000] d.h.. 174497.068820: xhci_inc_deq: INTR
> 000000006122f915: enq 0x00000004526bf8f0(0x00000004526bf000) deq
> 0x00000004526bf8f0(0x00000004526bf000) segs 2 stream 0 free_trbs 501
> bounce 1024 cycle 1
>           <idle>-0       [000] d.h.. 174497.068822: xhci_urb_giveback:
> ep3in-intr: urb 0000000070a812bc pipe 1073852544 slot 3 length
> 1023/1024 sgs 0/0 stream 0 flags 00010200
>           <idle>-0       [000] d.h.. 174497.068824: xhci_inc_deq:
> EVENT 000000002a1d7c0b: enq 0x0000000440f9c000(0x0000000440f9c000) deq
> 0x0000000440f9f620(0x0000000440f9f000) segs 8 stream 0 free_trbs 2039
> bounce 0 cycle 0
>           python-22331   [000] ..... 174497.068894: xhci_urb_enqueue:
> ep3in-intr: urb 0000000070a812bc pipe 1073852544 slot 3 length 0/1024
> sgs 0/0 stream 0 flags 00010200
>           python-22331   [000] d.... 174497.068899: xhci_queue_trb:
> INTR: Buffer 00000004669c9800 length 1024 TD size 0 intr 0 type
> 'Normal' flags b:i:I:c:s:I:e:c
>           python-22331   [000] d.... 174497.068900: xhci_inc_enq: INTR
> 000000006122f915: enq 0x00000004526bf900(0x00000004526bf000) deq
> 0x00000004526bf8f0(0x00000004526bf000) segs 2 stream 0 free_trbs 500
> bounce 1024 cycle 1
>           python-22331   [000] d.... 174497.068901:
> xhci_ring_ep_doorbell: Ring doorbell for Slot 3 ep3in
>           <idle>-0       [000] d.h.. 174497.069954: xhci_handle_event:
> EVENT: TRB 00000004526bf8f0 status 'Short Packet' len 1 slot 3 ep 7
> type 'Transfer Event' flags e:c
>           <idle>-0       [000] d.h.. 174497.069957:
> xhci_handle_transfer: INTR: Buffer 00000004669c9800 length 1024 TD
> size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
>           <idle>-0       [000] d.h.. 174497.069959: xhci_inc_deq: INTR
> 000000006122f915: enq 0x00000004526bf900(0x00000004526bf000) deq
> 0x00000004526bf900(0x00000004526bf000) segs 2 stream 0 free_trbs 501
> bounce 1024 cycle 1
>           <idle>-0       [000] d.h.. 174497.069961: xhci_urb_giveback:
> ep3in-intr: urb 0000000070a812bc pipe 1073852544 slot 3 length
> 1023/1024 sgs 0/0 stream 0 flags 00010200
>           <idle>-0       [000] d.h.. 174497.069962: xhci_inc_deq:
> EVENT 000000002a1d7c0b: enq 0x0000000440f9c000(0x0000000440f9c000) deq
> 0x0000000440f9f630(0x0000000440f9f000) segs 8 stream 0 free_trbs 2039
> bounce 0 cycle 0
>           python-22331   [003] ..... 174497.070160: xhci_urb_enqueue:
> ep3in-intr: urb 00000000a553f707 pipe 1073852544 slot 3 length 0/1024
> sgs 0/0 stream 0 flags 00010200
>           python-22331   [003] d.... 174497.070171: xhci_queue_trb:
> INTR: Buffer 00000004799b0000 length 1024 TD size 0 intr 0 type
> 'Normal' flags b:i:I:c:s:I:e:c
>           python-22331   [003] d.... 174497.070172: xhci_inc_enq: INTR
> 000000006122f915: enq 0x00000004526bf910(0x00000004526bf000) deq
> 0x00000004526bf900(0x00000004526bf000) segs 2 stream 0 free_trbs 500
> bounce 1024 cycle 1
>           python-22331   [003] d.... 174497.070174:
> xhci_ring_ep_doorbell: Ring doorbell for Slot 3 ep3in
>           <idle>-0       [000] d.h.. 174497.071201: xhci_handle_event:
> EVENT: TRB 00000004526bf900 status 'Short Packet' len 1 slot 3 ep 7
> type 'Transfer Event' flags e:c
>           <idle>-0       [000] d.h.. 174497.071204:
> xhci_handle_transfer: INTR: Buffer 00000004799b0000 length 1024 TD
> size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
>           <idle>-0       [000] d.h.. 174497.071206: xhci_inc_deq: INTR
> 000000006122f915: enq 0x00000004526bf910(0x00000004526bf000) deq
> 0x00000004526bf910(0x00000004526bf000) segs 2 stream 0 free_trbs 501
> bounce 1024 cycle 1
>           <idle>-0       [000] d.h.. 174497.071208: xhci_urb_giveback:
> ep3in-intr: urb 00000000a553f707 pipe 1073852544 slot 3 length
> 1023/1024 sgs 0/0 stream 0 flags 00010200
>           <idle>-0       [000] d.h.. 174497.071210: xhci_inc_deq:
> EVENT 000000002a1d7c0b: enq 0x0000000440f9c000(0x0000000440f9c000) deq
> 0x0000000440f9f640(0x0000000440f9f000) segs 8 stream 0 free_trbs 2039
> bounce 0 cycle 0
>           python-22331   [003] ..... 174497.071269: xhci_urb_enqueue:
> ep3in-intr: urb 00000000a553f707 pipe 1073852544 slot 3 length 0/1024
> sgs 0/0 stream 0 flags 00010200
>           python-22331   [003] d.... 174497.071272: xhci_queue_trb:
> INTR: Buffer 00000004799b0000 length 1024 TD size 0 intr 0 type
> 'Normal' flags b:i:I:c:s:I:e:c
>           python-22331   [003] d.... 174497.071273: xhci_inc_enq: INTR
> 000000006122f915: enq 0x00000004526bf920(0x00000004526bf000) deq
> 0x00000004526bf910(0x00000004526bf000) segs 2 stream 0 free_trbs 500
> bounce 1024 cycle 1
>           python-22331   [003] d.... 174497.071274:
> xhci_ring_ep_doorbell: Ring doorbell for Slot 3 ep3in
>           <idle>-0       [000] d.h.. 174497.072330: xhci_handle_event:
> EVENT: TRB 00000004526bf910 status 'Short Packet' len 1 slot 3 ep 7
> type 'Transfer Event' flags e:c
>           <idle>-0       [000] d.h.. 174497.072331:
> xhci_handle_transfer: INTR: Buffer 00000004799b0000 length 1024 TD
> size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
>           <idle>-0       [000] d.h.. 174497.072332: xhci_inc_deq: INTR
> 000000006122f915: enq 0x00000004526bf920(0x00000004526bf000) deq
> 0x00000004526bf920(0x00000004526bf000) segs 2 stream 0 free_trbs 501
> bounce 1024 cycle 1
>           <idle>-0       [000] d.h.. 174497.072334: xhci_urb_giveback:
> ep3in-intr: urb 00000000a553f707 pipe 1073852544 slot 3 length
> 1023/1024 sgs 0/0 stream 0 flags 00010200
>           <idle>-0       [000] d.h.. 174497.072334: xhci_inc_deq:
> EVENT 000000002a1d7c0b: enq 0x0000000440f9c000(0x0000000440f9c000) deq
> 0x0000000440f9f650(0x0000000440f9f000) segs 8 stream 0 free_trbs 2039
> bounce 0 cycle 0
> ...
> 
> Unfortunately, I do not understand the details of the linux usb stack in
> depth and the details of
> this log enough to know where to look any further. I am hoping to get some
> help or pointers to
> how to attack my problem. I have tried using different libraries (hidapi C,
> libusb), linux systems
> (raspberry pi OS, ubuntu), but the problem persisted.
> 
> Thank you very much for taking the time for reading about my problem and
> thank you in advance
> for any advice.

The most important thing you should do, to ensure that reports do not 
get lost, is submit a large queue of interrupt transfers in advance.  
Each time a transfer completes, add another submission to the queue.

That way, even if your process loses control of the CPU for some period 
of time, the transfers will continue (unless the period is so long that 
your entire queue gets consumed before the process is able to submit 
more requests).

The fact that reports get lost if you don't retrieve them every 230 
microseconds may indicate that your custom device should not be using 
the HID protocol at all, but rather should use isochronous transfers.
Isochronous is meant for real-time applications where data is simply 
lost if it's not transferred within a fixed interval of time.

Alan Stern

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Receiving report multiple times when changing cpu
  2022-10-12 16:00 ` Greg KH
@ 2022-10-14 11:18   ` Thilo Roerig
  2022-10-14 13:52     ` Greg KH
  0 siblings, 1 reply; 9+ messages in thread
From: Thilo Roerig @ 2022-10-14 11:18 UTC (permalink / raw)
  To: gregkh; +Cc: linux-usb, thilo.roerig

> On Wed, Oct 12, 2022 at 04:35:52PM +0200, Thilo Rörig wrote:
> > Dear maintainer, dear community,
> > 
> > I'm new to looking into the linux kernel, but am hoping that this is the
> > right spot to address my problem.
> > Otherwise I would be very thankful, if you could point me in the right
> > direction.
> > 
> > I have a problem with a custom usb device on a raspberry pi running a
> > buildroot 64bit linux with a v5.19.14 kernel.
> > I ran into a problem reading data from a custom usb hid device using
> > interrupt transfers. I'm using
> > the python hidapi library (https://pypi.org/project/hidapi/) with the
> > libusb backend. After sending
> > a custom command to the device, the driver changes from usbhid to usbfs.
> > The device starts sending
> > reports of size 1023 bytes to endpoint 0x83. `usb-devices` yields the
> > following output:
> 
> USB devices do not "send" anything, they require a request from the host
> to be able to transfer any data.  So you have to ask the device "do you
> have any data?" before it can respond with anything.
> 
> So just wait, don't ask the device if you are not ready for the data to
> be recieved, do not allow the device to continue to flood the data to
> you if you can not recieve it all.

Thank you for pointing this out. I looked into the implementation of the hidapi
library. The library immediately starts submitting transfer when opening the
device. So to get a better grip on the issue, I'm now using libusb and in 
this case I only get the reports, when I'm asking for it.

> > T:  Bus=3D01 Lev=3D03 Prnt=3D13 Port=3D00 Cnt=3D01 Dev#=3D 14 Spd=3D480 MxC=
> > h=3D 0
> > D:  Ver=3D 2.00 Cls=3D00(>ifc ) Sub=3D00 Prot=3D00 MxPS=3D64 #Cfgs=3D  1
> > C:  #Ifs=3D 1 Cfg#=3D 1 Atr=3D80 MxPwr=3D500mA
> > I:  If#=3D 0 Alt=3D 0 #EPs=3D 1 Cls=3D03(HID  ) Sub=3D00 Prot=3D00 Driver=
> > =3Dusbfs
> > E:  Ad=3D83(I) Atr=3D03(Int.) MxPS=3D1024 Ivl=3D125us
> > 
> > The reports are numbered by an increasing report id and are available via
> > interrupt transfer approx. every
> > 230 microseconds (1.85 microframes). The device does not retain any old
> > reports.
> 
> So is the buffer in the device overflowing because the host is not
> asking for enough data from it, or is the data overflowing in the host
> because you are not reading the buffers fast enough?

The buffer on the device does not buffer reports, that are as old as the
reports I am receiving via libusb. Interestingly, the speed that I can receive
the reports in userspace increases, when the wrong pattern starts to occur.
Initially, I receive ~4350 reports per second (with correct report id and
increasing order), but once the wrong pattern occurs, the speed goes up to
~7650 reports per second, which is approx. 1 microframe. Without having found
any reference in the code, the pattern in the data looks as if some dequeue
pointer is "overtaking" the enqueue pointer in a ring buffer - but this is just
a wild guess. It would maybe explain the pattern in the data and maybe also the
speedup once the ring buffer is not preventing the dequeue from reading once
the enqueue pointer is reached.   
 
> > Initially, I get all reports in the correct order, but after some time I
> > start getting reports that I have already
> > received previously. The report ids start to include a shift after a
> > certain number of reports and
> > schematically they look like the following:
> > 
> > ... 1,2,...,578,  238,239,...,816,  476,477,...,1054, and so on
> > 
> > So there is approximately the same number of consecutive "forward" reports
> > followed by a jump "backwards"
> > by approximately the same number of reports. I snooped the transferred data
> > using the `usbcore.usbfs_snoop` kernel
> > parameter but this slowed things down, so that I started missing reports
> > and could not reproduce the problem.
> 
> Where was the report missed from?  The device or the host?  Does the
> device overflow and not send data if you don't ask for it fast enough?
> 
> What about using usbmon to watch the data flow by?

When I enable `usbcore.usbfs_snoop` I am missing reports on the host, since I
am not asking for the reports fast enough. I can ask the device for data
anytime and it will provide a report with a later report id. 

Thanks for the pointer to usbmon. I have recorded some logs and analyzed them,
but could not spot any irregularities when the jump pattern starts to occur in
the report numbers. Only the regularity of the number of reports of the
backward jumps.

> > Then I traced the `xhci-hcd` events and observed that the pattern started
> > to occur, when my python process was
> > rescheduled from one cpu-core to another
> > 
> > ...
> >           python-22331   [000] ..... 174497.068723: xhci_urb_enqueue:
> > ep3in-intr: urb 0000000070a812bc pipe 1073852544 slot 3 length 0/1024
> > sgs 0/0 stream 0 flags 00010200
> >           python-22331   [000] d.... 174497.068727: xhci_queue_trb:
> > INTR: Buffer 00000004669c9800 length 1024 TD size 0 intr 0 type
> > 'Normal' flags b:i:I:c:s:I:e:c
> >           python-22331   [000] d.... 174497.068728: xhci_inc_enq: INTR
> > 000000006122f915: enq 0x00000004526bf8f0(0x00000004526bf000) deq
> > 0x00000004526bf8e0(0x00000004526bf000) segs 2 stream 0 free_trbs 500
> > bounce 1024 cycle 1
> >           python-22331   [000] d.... 174497.068729:
> > xhci_ring_ep_doorbell: Ring doorbell for Slot 3 ep3in
> >           <idle>-0       [000] d.h.. 174497.068811: xhci_handle_event:
> > EVENT: TRB 00000004526bf8e0 status 'Short Packet' len 1 slot 3 ep 7
> > type 'Transfer Event' flags e:c
> >           <idle>-0       [000] d.h.. 174497.068814:
> > xhci_handle_transfer: INTR: Buffer 00000004669c9800 length 1024 TD
> > size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
> >           <idle>-0       [000] d.h.. 174497.068820: xhci_inc_deq: INTR
> > 000000006122f915: enq 0x00000004526bf8f0(0x00000004526bf000) deq
> > 0x00000004526bf8f0(0x00000004526bf000) segs 2 stream 0 free_trbs 501
> > bounce 1024 cycle 1
> >           <idle>-0       [000] d.h.. 174497.068822: xhci_urb_giveback:
> > ep3in-intr: urb 0000000070a812bc pipe 1073852544 slot 3 length
> > 1023/1024 sgs 0/0 stream 0 flags 00010200
> >           <idle>-0       [000] d.h.. 174497.068824: xhci_inc_deq:
> > EVENT 000000002a1d7c0b: enq 0x0000000440f9c000(0x0000000440f9c000) deq
> > 0x0000000440f9f620(0x0000000440f9f000) segs 8 stream 0 free_trbs 2039
> > bounce 0 cycle 0
> >           python-22331   [000] ..... 174497.068894: xhci_urb_enqueue:
> > ep3in-intr: urb 0000000070a812bc pipe 1073852544 slot 3 length 0/1024
> > sgs 0/0 stream 0 flags 00010200
> >           python-22331   [000] d.... 174497.068899: xhci_queue_trb:
> > INTR: Buffer 00000004669c9800 length 1024 TD size 0 intr 0 type
> > 'Normal' flags b:i:I:c:s:I:e:c
> >           python-22331   [000] d.... 174497.068900: xhci_inc_enq: INTR
> > 000000006122f915: enq 0x00000004526bf900(0x00000004526bf000) deq
> > 0x00000004526bf8f0(0x00000004526bf000) segs 2 stream 0 free_trbs 500
> > bounce 1024 cycle 1
> >           python-22331   [000] d.... 174497.068901:
> > xhci_ring_ep_doorbell: Ring doorbell for Slot 3 ep3in
> >           <idle>-0       [000] d.h.. 174497.069954: xhci_handle_event:
> > EVENT: TRB 00000004526bf8f0 status 'Short Packet' len 1 slot 3 ep 7
> > type 'Transfer Event' flags e:c
> >           <idle>-0       [000] d.h.. 174497.069957:
> > xhci_handle_transfer: INTR: Buffer 00000004669c9800 length 1024 TD
> > size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
> >           <idle>-0       [000] d.h.. 174497.069959: xhci_inc_deq: INTR
> > 000000006122f915: enq 0x00000004526bf900(0x00000004526bf000) deq
> > 0x00000004526bf900(0x00000004526bf000) segs 2 stream 0 free_trbs 501
> > bounce 1024 cycle 1
> >           <idle>-0       [000] d.h.. 174497.069961: xhci_urb_giveback:
> > ep3in-intr: urb 0000000070a812bc pipe 1073852544 slot 3 length
> > 1023/1024 sgs 0/0 stream 0 flags 00010200
> >           <idle>-0       [000] d.h.. 174497.069962: xhci_inc_deq:
> > EVENT 000000002a1d7c0b: enq 0x0000000440f9c000(0x0000000440f9c000) deq
> > 0x0000000440f9f630(0x0000000440f9f000) segs 8 stream 0 free_trbs 2039
> > bounce 0 cycle 0
> >           python-22331   [003] ..... 174497.070160: xhci_urb_enqueue:
> > ep3in-intr: urb 00000000a553f707 pipe 1073852544 slot 3 length 0/1024
> > sgs 0/0 stream 0 flags 00010200
> >           python-22331   [003] d.... 174497.070171: xhci_queue_trb:
> > INTR: Buffer 00000004799b0000 length 1024 TD size 0 intr 0 type
> > 'Normal' flags b:i:I:c:s:I:e:c
> >           python-22331   [003] d.... 174497.070172: xhci_inc_enq: INTR
> > 000000006122f915: enq 0x00000004526bf910(0x00000004526bf000) deq
> > 0x00000004526bf900(0x00000004526bf000) segs 2 stream 0 free_trbs 500
> > bounce 1024 cycle 1
> >           python-22331   [003] d.... 174497.070174:
> > xhci_ring_ep_doorbell: Ring doorbell for Slot 3 ep3in
> >           <idle>-0       [000] d.h.. 174497.071201: xhci_handle_event:
> > EVENT: TRB 00000004526bf900 status 'Short Packet' len 1 slot 3 ep 7
> > type 'Transfer Event' flags e:c
> >           <idle>-0       [000] d.h.. 174497.071204:
> > xhci_handle_transfer: INTR: Buffer 00000004799b0000 length 1024 TD
> > size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
> >           <idle>-0       [000] d.h.. 174497.071206: xhci_inc_deq: INTR
> > 000000006122f915: enq 0x00000004526bf910(0x00000004526bf000) deq
> > 0x00000004526bf910(0x00000004526bf000) segs 2 stream 0 free_trbs 501
> > bounce 1024 cycle 1
> >           <idle>-0       [000] d.h.. 174497.071208: xhci_urb_giveback:
> > ep3in-intr: urb 00000000a553f707 pipe 1073852544 slot 3 length
> > 1023/1024 sgs 0/0 stream 0 flags 00010200
> >           <idle>-0       [000] d.h.. 174497.071210: xhci_inc_deq:
> > EVENT 000000002a1d7c0b: enq 0x0000000440f9c000(0x0000000440f9c000) deq
> > 0x0000000440f9f640(0x0000000440f9f000) segs 8 stream 0 free_trbs 2039
> > bounce 0 cycle 0
> >           python-22331   [003] ..... 174497.071269: xhci_urb_enqueue:
> > ep3in-intr: urb 00000000a553f707 pipe 1073852544 slot 3 length 0/1024
> > sgs 0/0 stream 0 flags 00010200
> >           python-22331   [003] d.... 174497.071272: xhci_queue_trb:
> > INTR: Buffer 00000004799b0000 length 1024 TD size 0 intr 0 type
> > 'Normal' flags b:i:I:c:s:I:e:c
> >           python-22331   [003] d.... 174497.071273: xhci_inc_enq: INTR
> > 000000006122f915: enq 0x00000004526bf920(0x00000004526bf000) deq
> > 0x00000004526bf910(0x00000004526bf000) segs 2 stream 0 free_trbs 500
> > bounce 1024 cycle 1
> >           python-22331   [003] d.... 174497.071274:
> > xhci_ring_ep_doorbell: Ring doorbell for Slot 3 ep3in
> >           <idle>-0       [000] d.h.. 174497.072330: xhci_handle_event:
> > EVENT: TRB 00000004526bf910 status 'Short Packet' len 1 slot 3 ep 7
> > type 'Transfer Event' flags e:c
> >           <idle>-0       [000] d.h.. 174497.072331:
> > xhci_handle_transfer: INTR: Buffer 00000004799b0000 length 1024 TD
> > size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
> >           <idle>-0       [000] d.h.. 174497.072332: xhci_inc_deq: INTR
> > 000000006122f915: enq 0x00000004526bf920(0x00000004526bf000) deq
> > 0x00000004526bf920(0x00000004526bf000) segs 2 stream 0 free_trbs 501
> > bounce 1024 cycle 1
> >           <idle>-0       [000] d.h.. 174497.072334: xhci_urb_giveback:
> > ep3in-intr: urb 00000000a553f707 pipe 1073852544 slot 3 length
> > 1023/1024 sgs 0/0 stream 0 flags 00010200
> >           <idle>-0       [000] d.h.. 174497.072334: xhci_inc_deq:
> > EVENT 000000002a1d7c0b: enq 0x0000000440f9c000(0x0000000440f9c000) deq
> > 0x0000000440f9f650(0x0000000440f9f000) segs 8 stream 0 free_trbs 2039
> > bounce 0 cycle 0
> > ...
> 
> That's a bit hard to read, sorry, I don't know what that means.

Excuse me for the terrible formatting of the log. The interesting part 
of the log, when the problem starts to occur is the following taken
from `/sys/kernel/debug/tracing/trace`:

...

                         _-----=> irqs-off/BH-disabled
                        / _----=> need-resched
                       | / _---=> hardirq/softirq
                       || / _--=> preempt-depth
                       ||| / _-=> migrate-disable
                       |||| /     delay
    TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
       | |         |   |||||     |         |
----------------------------------------------------------------------------
  python-22331   [000] ..... 174497.068723: xhci_urb_enqueue: ep3in-intr: ...
  python-22331   [000] d.... 174497.068727: xhci_queue_trb: ...
  python-22331   [000] d.... 174497.068728: xhci_inc_enq: INTR ...
  python-22331   [000] d.... 174497.068729: xhci_ring_ep_doorbell: ...
  <idle>-0       [000] d.h.. 174497.068814: xhci_handle_transfer: INTR: ...
  <idle>-0       [000] d.h.. 174497.068820: xhci_inc_deq: INTR ...
  <idle>-0       [000] d.h.. 174497.068822: xhci_urb_giveback: ep3in-intr: ...
  python-22331   [000] ..... 174497.068894: xhci_urb_enqueue: ep3in-intr: ...
  python-22331   [000] d.... 174497.068899: xhci_queue_trb: INTR: ...
  python-22331   [000] d.... 174497.068900: xhci_inc_enq: INTR ...
  python-22331   [000] d.... 174497.068901: xhci_ring_ep_doorbell: ...
  <idle>-0       [000] d.h.. 174497.069957: xhci_handle_transfer: INTR: ...
  <idle>-0       [000] d.h.. 174497.069959: xhci_inc_deq: INTR ...
  <idle>-0       [000] d.h.. 174497.069961: xhci_urb_giveback: ep3in-intr: ...
  python-22331   [003] ..... 174497.070160: xhci_urb_enqueue: ep3in-intr: ...
  python-22331   [003] d.... 174497.070171: xhci_queue_trb: INTR: ...
  python-22331   [003] d.... 174497.070172: xhci_inc_enq: INTR ...
  python-22331   [003] d.... 174497.070174: xhci_ring_ep_doorbell: ...
  <idle>-0       [000] d.h.. 174497.071204: xhci_handle_transfer: INTR: ...
  <idle>-0       [000] d.h.. 174497.071206: xhci_inc_deq: INTR ...
  <idle>-0       [000] d.h.. 174497.071208: xhci_urb_giveback: ep3in-intr: ...
  python-22331   [003] ..... 174497.071269: xhci_urb_enqueue: ep3in-intr: ...
  python-22331   [003] d.... 174497.071272: xhci_queue_trb: INTR: ...
  python-22331   [003] d.... 174497.071273: xhci_inc_enq: INTR ...
  python-22331   [003] d.... 174497.071274: xhci_ring_ep_doorbell: ...
  <idle>-0       [000] d.h.. 174497.072331: xhci_handle_transfer: INTR: ...
  <idle>-0       [000] d.h.. 174497.072332: xhci_inc_deq: INTR ...
  <idle>-0       [000] d.h.. 174497.072334: xhci_urb_giveback: ep3in-intr: ...
...

The change I observed, is the change in the CPU# of my python process. In the 
xhci trb enqueue and dequeue pointers I could not spot any irregularities.

> But check your userspace side, only ask for data when you are ready to
> accept it, and then process it and ask for more data.  That might make
> things a lot simpler.

I have reinvestigated my applications side and followed the calls of libusb
to ioctl calls, but could not spot anything on the way. Unfortunately, I 
could not see who is taking care of buffering the reports? Is it libusb or
is this the device driver in the kernel taking care of this?

Thank you very much again for your time and help.

Best
Thilo


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Receiving report multiple times when changing cpu
  2022-10-12 16:54 ` Alan Stern
@ 2022-10-14 11:20   ` Thilo Roerig
  2022-10-14 13:32     ` Alan Stern
  0 siblings, 1 reply; 9+ messages in thread
From: Thilo Roerig @ 2022-10-14 11:20 UTC (permalink / raw)
  To: stern; +Cc: linux-usb, thilo.roerig

> On Wed, Oct 12, 2022 at 04:35:52PM +0200, Thilo Rörig wrote:
> > Dear maintainer, dear community,
> > 
> > I'm new to looking into the linux kernel, but am hoping that this is the
> > right spot to address my problem.
> > Otherwise I would be very thankful, if you could point me in the right
> > direction.
> > 
> > I have a problem with a custom usb device on a raspberry pi running a
> > buildroot 64bit linux with a v5.19.14 kernel.
> > I ran into a problem reading data from a custom usb hid device using
> > interrupt transfers. I'm using
> > the python hidapi library (https://pypi.org/project/hidapi/) with the
> > libusb backend. After sending
> > a custom command to the device, the driver changes from usbhid to usbfs.
> > The device starts sending
> > reports of size 1023 bytes to endpoint 0x83. `usb-devices` yields the
> > following output:
> > 
> > T:  Bus=3D01 Lev=3D03 Prnt=3D13 Port=3D00 Cnt=3D01 Dev#=3D 14 Spd=3D480 MxC=
> > h=3D 0
> > D:  Ver=3D 2.00 Cls=3D00(>ifc ) Sub=3D00 Prot=3D00 MxPS=3D64 #Cfgs=3D  1
> > C:  #Ifs=3D 1 Cfg#=3D 1 Atr=3D80 MxPwr=3D500mA
> > I:  If#=3D 0 Alt=3D 0 #EPs=3D 1 Cls=3D03(HID  ) Sub=3D00 Prot=3D00 Driver=
> > =3Dusbfs
> > E:  Ad=3D83(I) Atr=3D03(Int.) MxPS=3D1024 Ivl=3D125us
> > 
> > The reports are numbered by an increasing report id and are available via
> > interrupt transfer approx. every
> > 230 microseconds (1.85 microframes). The device does not retain any old
> > reports.
> > 
> > Initially, I get all reports in the correct order, but after some time I
> > start getting reports that I have already
> > received previously. The report ids start to include a shift after a
> > certain number of reports and
> > schematically they look like the following:
> > 
> > ... 1,2,...,578,  238,239,...,816,  476,477,...,1054, and so on
> > 
> > So there is approximately the same number of consecutive "forward" reports
> > followed by a jump "backwards"
> > by approximately the same number of reports. I snooped the transferred data
> > using the `usbcore.usbfs_snoop` kernel
> > parameter but this slowed things down, so that I started missing reports
> > and could not reproduce the problem.
> 
> As Greg mentioned, usbmon is a better way to snoop the data transfers.

I have continued to investigate the issue using usbmon, but unfortunately, I
could not find any other irregularity, but the jumps in the report id pattern.

> > Then I traced the `xhci-hcd` events and observed that the pattern started
> > to occur, when my python process was
> > rescheduled from one cpu-core to another
> 
> How on earth do you know when your python process is rescheduled from 
> one CPU core to another?  That information certainly is not part of the 
> xhci-hcd trace output.
> 

Sorry for the messed up output of the xhci_hcd trace. Maybe I am not
interpreting the information in the log correctly as it is the first time that
I'm looking at kernel traces. To get a better view I have reformatted the log:

...

                         _-----=> irqs-off/BH-disabled
                        / _----=> need-resched
                       | / _---=> hardirq/softirq
                       || / _--=> preempt-depth
                       ||| / _-=> migrate-disable
                       |||| /     delay
    TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
       | |         |   |||||     |         |
----------------------------------------------------------------------------
  python-22331   [000] ..... 174497.068723: xhci_urb_enqueue: ep3in-intr: ...
  python-22331   [000] d.... 174497.068727: xhci_queue_trb: ...
  python-22331   [000] d.... 174497.068728: xhci_inc_enq: INTR ...
  python-22331   [000] d.... 174497.068729: xhci_ring_ep_doorbell: ...
  <idle>-0       [000] d.h.. 174497.068814: xhci_handle_transfer: INTR: ...
  <idle>-0       [000] d.h.. 174497.068820: xhci_inc_deq: INTR ...
  <idle>-0       [000] d.h.. 174497.068822: xhci_urb_giveback: ep3in-intr: ...
  python-22331   [000] ..... 174497.068894: xhci_urb_enqueue: ep3in-intr: ...
  python-22331   [000] d.... 174497.068899: xhci_queue_trb: INTR: ...
  python-22331   [000] d.... 174497.068900: xhci_inc_enq: INTR ...
  python-22331   [000] d.... 174497.068901: xhci_ring_ep_doorbell: ...
  <idle>-0       [000] d.h.. 174497.069957: xhci_handle_transfer: INTR: ...
  <idle>-0       [000] d.h.. 174497.069959: xhci_inc_deq: INTR ...
  <idle>-0       [000] d.h.. 174497.069961: xhci_urb_giveback: ep3in-intr: ...
  python-22331   [003] ..... 174497.070160: xhci_urb_enqueue: ep3in-intr: ...
  python-22331   [003] d.... 174497.070171: xhci_queue_trb: INTR: ...
  python-22331   [003] d.... 174497.070172: xhci_inc_enq: INTR ...
  python-22331   [003] d.... 174497.070174: xhci_ring_ep_doorbell: ...
  <idle>-0       [000] d.h.. 174497.071204: xhci_handle_transfer: INTR: ...
  <idle>-0       [000] d.h.. 174497.071206: xhci_inc_deq: INTR ...
  <idle>-0       [000] d.h.. 174497.071208: xhci_urb_giveback: ep3in-intr: ...
  python-22331   [003] ..... 174497.071269: xhci_urb_enqueue: ep3in-intr: ...
  python-22331   [003] d.... 174497.071272: xhci_queue_trb: INTR: ...
  python-22331   [003] d.... 174497.071273: xhci_inc_enq: INTR ...
  python-22331   [003] d.... 174497.071274: xhci_ring_ep_doorbell: ...
  <idle>-0       [000] d.h.. 174497.072331: xhci_handle_transfer: INTR: ...
  <idle>-0       [000] d.h.. 174497.072332: xhci_inc_deq: INTR ...
  <idle>-0       [000] d.h.. 174497.072334: xhci_urb_giveback: ep3in-intr: ...
...

In the middle, the python-22331 process changes from CPU# 000 to CPU# 003. 
This is the reason for my statement in the original question. I first observed
in htop that the irregularities in the report pattern started, when the cpu
usage changed. So maybe I am also only seeing what I want to see here and 
this might no trigger my problem.

> > ...
> >           python-22331   [000] ..... 174497.068723: xhci_urb_enqueue:
> > ep3in-intr: urb 0000000070a812bc pipe 1073852544 slot 3 length 0/1024
> > sgs 0/0 stream 0 flags 00010200
> >           python-22331   [000] d.... 174497.068727: xhci_queue_trb:
> > INTR: Buffer 00000004669c9800 length 1024 TD size 0 intr 0 type
> > 'Normal' flags b:i:I:c:s:I:e:c
> >           python-22331   [000] d.... 174497.068728: xhci_inc_enq: INTR
> > 000000006122f915: enq 0x00000004526bf8f0(0x00000004526bf000) deq
> > 0x00000004526bf8e0(0x00000004526bf000) segs 2 stream 0 free_trbs 500
> > bounce 1024 cycle 1
> >           python-22331   [000] d.... 174497.068729:
> > xhci_ring_ep_doorbell: Ring doorbell for Slot 3 ep3in
> >           <idle>-0       [000] d.h.. 174497.068811: xhci_handle_event:
> > EVENT: TRB 00000004526bf8e0 status 'Short Packet' len 1 slot 3 ep 7
> > type 'Transfer Event' flags e:c
> >           <idle>-0       [000] d.h.. 174497.068814:
> > xhci_handle_transfer: INTR: Buffer 00000004669c9800 length 1024 TD
> > size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
> >           <idle>-0       [000] d.h.. 174497.068820: xhci_inc_deq: INTR
> > 000000006122f915: enq 0x00000004526bf8f0(0x00000004526bf000) deq
> > 0x00000004526bf8f0(0x00000004526bf000) segs 2 stream 0 free_trbs 501
> > bounce 1024 cycle 1
> >           <idle>-0       [000] d.h.. 174497.068822: xhci_urb_giveback:
> > ep3in-intr: urb 0000000070a812bc pipe 1073852544 slot 3 length
> > 1023/1024 sgs 0/0 stream 0 flags 00010200
> >           <idle>-0       [000] d.h.. 174497.068824: xhci_inc_deq:
> > EVENT 000000002a1d7c0b: enq 0x0000000440f9c000(0x0000000440f9c000) deq
> > 0x0000000440f9f620(0x0000000440f9f000) segs 8 stream 0 free_trbs 2039
> > bounce 0 cycle 0
> >           python-22331   [000] ..... 174497.068894: xhci_urb_enqueue:
> > ep3in-intr: urb 0000000070a812bc pipe 1073852544 slot 3 length 0/1024
> > sgs 0/0 stream 0 flags 00010200
> >           python-22331   [000] d.... 174497.068899: xhci_queue_trb:
> > INTR: Buffer 00000004669c9800 length 1024 TD size 0 intr 0 type
> > 'Normal' flags b:i:I:c:s:I:e:c
> >           python-22331   [000] d.... 174497.068900: xhci_inc_enq: INTR
> > 000000006122f915: enq 0x00000004526bf900(0x00000004526bf000) deq
> > 0x00000004526bf8f0(0x00000004526bf000) segs 2 stream 0 free_trbs 500
> > bounce 1024 cycle 1
> >           python-22331   [000] d.... 174497.068901:
> > xhci_ring_ep_doorbell: Ring doorbell for Slot 3 ep3in
> >           <idle>-0       [000] d.h.. 174497.069954: xhci_handle_event:
> > EVENT: TRB 00000004526bf8f0 status 'Short Packet' len 1 slot 3 ep 7
> > type 'Transfer Event' flags e:c
> >           <idle>-0       [000] d.h.. 174497.069957:
> > xhci_handle_transfer: INTR: Buffer 00000004669c9800 length 1024 TD
> > size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
> >           <idle>-0       [000] d.h.. 174497.069959: xhci_inc_deq: INTR
> > 000000006122f915: enq 0x00000004526bf900(0x00000004526bf000) deq
> > 0x00000004526bf900(0x00000004526bf000) segs 2 stream 0 free_trbs 501
> > bounce 1024 cycle 1
> >           <idle>-0       [000] d.h.. 174497.069961: xhci_urb_giveback:
> > ep3in-intr: urb 0000000070a812bc pipe 1073852544 slot 3 length
> > 1023/1024 sgs 0/0 stream 0 flags 00010200
> >           <idle>-0       [000] d.h.. 174497.069962: xhci_inc_deq:
> > EVENT 000000002a1d7c0b: enq 0x0000000440f9c000(0x0000000440f9c000) deq
> > 0x0000000440f9f630(0x0000000440f9f000) segs 8 stream 0 free_trbs 2039
> > bounce 0 cycle 0
> >           python-22331   [003] ..... 174497.070160: xhci_urb_enqueue:
> > ep3in-intr: urb 00000000a553f707 pipe 1073852544 slot 3 length 0/1024
> > sgs 0/0 stream 0 flags 00010200
> >           python-22331   [003] d.... 174497.070171: xhci_queue_trb:
> > INTR: Buffer 00000004799b0000 length 1024 TD size 0 intr 0 type
> > 'Normal' flags b:i:I:c:s:I:e:c
> >           python-22331   [003] d.... 174497.070172: xhci_inc_enq: INTR
> > 000000006122f915: enq 0x00000004526bf910(0x00000004526bf000) deq
> > 0x00000004526bf900(0x00000004526bf000) segs 2 stream 0 free_trbs 500
> > bounce 1024 cycle 1
> >           python-22331   [003] d.... 174497.070174:
> > xhci_ring_ep_doorbell: Ring doorbell for Slot 3 ep3in
> >           <idle>-0       [000] d.h.. 174497.071201: xhci_handle_event:
> > EVENT: TRB 00000004526bf900 status 'Short Packet' len 1 slot 3 ep 7
> > type 'Transfer Event' flags e:c
> >           <idle>-0       [000] d.h.. 174497.071204:
> > xhci_handle_transfer: INTR: Buffer 00000004799b0000 length 1024 TD
> > size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
> >           <idle>-0       [000] d.h.. 174497.071206: xhci_inc_deq: INTR
> > 000000006122f915: enq 0x00000004526bf910(0x00000004526bf000) deq
> > 0x00000004526bf910(0x00000004526bf000) segs 2 stream 0 free_trbs 501
> > bounce 1024 cycle 1
> >           <idle>-0       [000] d.h.. 174497.071208: xhci_urb_giveback:
> > ep3in-intr: urb 00000000a553f707 pipe 1073852544 slot 3 length
> > 1023/1024 sgs 0/0 stream 0 flags 00010200
> >           <idle>-0       [000] d.h.. 174497.071210: xhci_inc_deq:
> > EVENT 000000002a1d7c0b: enq 0x0000000440f9c000(0x0000000440f9c000) deq
> > 0x0000000440f9f640(0x0000000440f9f000) segs 8 stream 0 free_trbs 2039
> > bounce 0 cycle 0
> >           python-22331   [003] ..... 174497.071269: xhci_urb_enqueue:
> > ep3in-intr: urb 00000000a553f707 pipe 1073852544 slot 3 length 0/1024
> > sgs 0/0 stream 0 flags 00010200
> >           python-22331   [003] d.... 174497.071272: xhci_queue_trb:
> > INTR: Buffer 00000004799b0000 length 1024 TD size 0 intr 0 type
> > 'Normal' flags b:i:I:c:s:I:e:c
> >           python-22331   [003] d.... 174497.071273: xhci_inc_enq: INTR
> > 000000006122f915: enq 0x00000004526bf920(0x00000004526bf000) deq
> > 0x00000004526bf910(0x00000004526bf000) segs 2 stream 0 free_trbs 500
> > bounce 1024 cycle 1
> >           python-22331   [003] d.... 174497.071274:
> > xhci_ring_ep_doorbell: Ring doorbell for Slot 3 ep3in
> >           <idle>-0       [000] d.h.. 174497.072330: xhci_handle_event:
> > EVENT: TRB 00000004526bf910 status 'Short Packet' len 1 slot 3 ep 7
> > type 'Transfer Event' flags e:c
> >           <idle>-0       [000] d.h.. 174497.072331:
> > xhci_handle_transfer: INTR: Buffer 00000004799b0000 length 1024 TD
> > size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
> >           <idle>-0       [000] d.h.. 174497.072332: xhci_inc_deq: INTR
> > 000000006122f915: enq 0x00000004526bf920(0x00000004526bf000) deq
> > 0x00000004526bf920(0x00000004526bf000) segs 2 stream 0 free_trbs 501
> > bounce 1024 cycle 1
> >           <idle>-0       [000] d.h.. 174497.072334: xhci_urb_giveback:
> > ep3in-intr: urb 00000000a553f707 pipe 1073852544 slot 3 length
> > 1023/1024 sgs 0/0 stream 0 flags 00010200
> >           <idle>-0       [000] d.h.. 174497.072334: xhci_inc_deq:
> > EVENT 000000002a1d7c0b: enq 0x0000000440f9c000(0x0000000440f9c000) deq
> > 0x0000000440f9f650(0x0000000440f9f000) segs 8 stream 0 free_trbs 2039
> > bounce 0 cycle 0
> > ...
> > 
> > Unfortunately, I do not understand the details of the linux usb stack in
> > depth and the details of
> > this log enough to know where to look any further. I am hoping to get some
> > help or pointers to
> > how to attack my problem. I have tried using different libraries (hidapi C,
> > libusb), linux systems
> > (raspberry pi OS, ubuntu), but the problem persisted.
> > 
> > Thank you very much for taking the time for reading about my problem and
> > thank you in advance
> > for any advice.
> 
> The most important thing you should do, to ensure that reports do not 
> get lost, is submit a large queue of interrupt transfers in advance.  
> Each time a transfer completes, add another submission to the queue.
>
> That way, even if your process loses control of the CPU for some period 
> of time, the transfers will continue (unless the period is so long that 
> your entire queue gets consumed before the process is able to submit 
> more requests).

Thank you for your help! Initially, I was submitting 4 transfers only.
I'll try to submit more transfers once I start loosing reports. Currently
I am able to get all reports using only 4 transfers. I'm loosing reports
only if I'm debugging with `usbfs_snoop`. In this case, I tried submitting
more transfers, but I was still loosing some reports, so I decided, that 
I will try to find another debugging possibility iother then usbfs_snoop.

> The fact that reports get lost if you don't retrieve them every 230 
> microseconds may indicate that your custom device should not be using 
> the HID protocol at all, but rather should use isochronous transfers.
> Isochronous is meant for real-time applications where data is simply 
> lost if it's not transferred within a fixed interval of time.

Thank you very much for your comments and pointers. I will look into the
possibility to work with isochronous transfers. Due to the regularity
of the wrong report pattern, I'm suspecting that there is some ring
buffer involved, where dequeue and enqueue pointers are getting out of
sync. Unfortunately, I could not find a suspicious buffer looking at
the usbfs driver or libusb. 

Thank you very much again for your time and help.

Best
Thilo


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Receiving report multiple times when changing cpu
  2022-10-14 11:20   ` Thilo Roerig
@ 2022-10-14 13:32     ` Alan Stern
  2023-02-22  8:24       ` Thilo Roerig
  0 siblings, 1 reply; 9+ messages in thread
From: Alan Stern @ 2022-10-14 13:32 UTC (permalink / raw)
  To: Thilo Roerig; +Cc: linux-usb

On Fri, Oct 14, 2022 at 01:20:49PM +0200, Thilo Roerig wrote:
> > As Greg mentioned, usbmon is a better way to snoop the data transfers.
> 
> I have continued to investigate the issue using usbmon, but unfortunately, I
> could not find any other irregularity, but the jumps in the report id pattern.

Did you check the timestamps carefully?  In particular, did you
examine the times when each transfer was submitted and completed?  If
two adjacent transfers complete more than one or two microseconds
apart, that indicates a problem.

Also, do the report IDs show up in the usbmon trace so that you can
directly see which ID appears in which transfer?

> > The most important thing you should do, to ensure that reports do not 
> > get lost, is submit a large queue of interrupt transfers in advance.  
> > Each time a transfer completes, add another submission to the queue.
> >
> > That way, even if your process loses control of the CPU for some period 
> > of time, the transfers will continue (unless the period is so long that 
> > your entire queue gets consumed before the process is able to submit 
> > more requests).
> 
> Thank you for your help! Initially, I was submitting 4 transfers only.
> I'll try to submit more transfers once I start loosing reports. Currently
> I am able to get all reports using only 4 transfers. I'm loosing reports
> only if I'm debugging with `usbfs_snoop`. In this case, I tried submitting
> more transfers, but I was still loosing some reports, so I decided, that 
> I will try to find another debugging possibility iother then usbfs_snoop.

Yeah, probably usbfs_snoop slows everything down so much that it
becomes impossible for your program to keep up with device.

On the other hand, if you're able to get all the reports (without
usbfs_snoop) using only four transfers then it seems like you've
solved the problem!

Alan Stern

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Receiving report multiple times when changing cpu
  2022-10-14 11:18   ` Thilo Roerig
@ 2022-10-14 13:52     ` Greg KH
  2023-02-22  8:27       ` Thilo Roerig
  0 siblings, 1 reply; 9+ messages in thread
From: Greg KH @ 2022-10-14 13:52 UTC (permalink / raw)
  To: Thilo Roerig; +Cc: linux-usb

On Fri, Oct 14, 2022 at 01:18:55PM +0200, Thilo Roerig wrote:
> > On Wed, Oct 12, 2022 at 04:35:52PM +0200, Thilo Rörig wrote:
> > > Dear maintainer, dear community,
> > > 
> > > I'm new to looking into the linux kernel, but am hoping that this is the
> > > right spot to address my problem.
> > > Otherwise I would be very thankful, if you could point me in the right
> > > direction.
> > > 
> > > I have a problem with a custom usb device on a raspberry pi running a
> > > buildroot 64bit linux with a v5.19.14 kernel.
> > > I ran into a problem reading data from a custom usb hid device using
> > > interrupt transfers. I'm using
> > > the python hidapi library (https://pypi.org/project/hidapi/) with the
> > > libusb backend. After sending
> > > a custom command to the device, the driver changes from usbhid to usbfs.
> > > The device starts sending
> > > reports of size 1023 bytes to endpoint 0x83. `usb-devices` yields the
> > > following output:
> > 
> > USB devices do not "send" anything, they require a request from the host
> > to be able to transfer any data.  So you have to ask the device "do you
> > have any data?" before it can respond with anything.
> > 
> > So just wait, don't ask the device if you are not ready for the data to
> > be recieved, do not allow the device to continue to flood the data to
> > you if you can not recieve it all.
> 
> Thank you for pointing this out. I looked into the implementation of the hidapi
> library. The library immediately starts submitting transfer when opening the
> device. So to get a better grip on the issue, I'm now using libusb and in 
> this case I only get the reports, when I'm asking for it.
> 
> > > T:  Bus=3D01 Lev=3D03 Prnt=3D13 Port=3D00 Cnt=3D01 Dev#=3D 14 Spd=3D480 MxC=
> > > h=3D 0
> > > D:  Ver=3D 2.00 Cls=3D00(>ifc ) Sub=3D00 Prot=3D00 MxPS=3D64 #Cfgs=3D  1
> > > C:  #Ifs=3D 1 Cfg#=3D 1 Atr=3D80 MxPwr=3D500mA
> > > I:  If#=3D 0 Alt=3D 0 #EPs=3D 1 Cls=3D03(HID  ) Sub=3D00 Prot=3D00 Driver=
> > > =3Dusbfs
> > > E:  Ad=3D83(I) Atr=3D03(Int.) MxPS=3D1024 Ivl=3D125us
> > > 
> > > The reports are numbered by an increasing report id and are available via
> > > interrupt transfer approx. every
> > > 230 microseconds (1.85 microframes). The device does not retain any old
> > > reports.
> > 
> > So is the buffer in the device overflowing because the host is not
> > asking for enough data from it, or is the data overflowing in the host
> > because you are not reading the buffers fast enough?
> 
> The buffer on the device does not buffer reports, that are as old as the
> reports I am receiving via libusb. Interestingly, the speed that I can receive
> the reports in userspace increases, when the wrong pattern starts to occur.
> Initially, I receive ~4350 reports per second (with correct report id and
> increasing order), but once the wrong pattern occurs, the speed goes up to
> ~7650 reports per second, which is approx. 1 microframe. Without having found
> any reference in the code, the pattern in the data looks as if some dequeue
> pointer is "overtaking" the enqueue pointer in a ring buffer - but this is just
> a wild guess. It would maybe explain the pattern in the data and maybe also the
> speedup once the ring buffer is not preventing the dequeue from reading once
> the enqueue pointer is reached.   

Are you sure that the firmware in the device itself is not just broken
when you ramp up the speed?  It is the thing that is filling the buffers
to the host, when it has it ready, so it is the thing that is in control
of how fast it responds.

If all works well at slower speeds, try just doing that and not letting
the device itself work too hard.

What does other operating systems do with this device, how do they
control it?  With a in-kernel driver or with userspace code that takes a
long time between requests?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Receiving report multiple times when changing cpu
  2022-10-14 13:32     ` Alan Stern
@ 2023-02-22  8:24       ` Thilo Roerig
  0 siblings, 0 replies; 9+ messages in thread
From: Thilo Roerig @ 2023-02-22  8:24 UTC (permalink / raw)
  To: stern; +Cc: linux-usb, thilo.roerig

> > > As Greg mentioned, usbmon is a better way to snoop the data transfers.
> >
> > I have continued to investigate the issue using usbmon, but unfortunately, I
> > could not find any other irregularity, but the jumps in the report id pattern.
> 
> Did you check the timestamps carefully?  In particular, did you
> examine the times when each transfer was submitted and completed?  If
> two adjacent transfers complete more than one or two microseconds
> apart, that indicates a problem.

It took some time, but thanks to your pointers I was finally able to narrow
down the possible causes of the problem and it turned out that there was a
problem in the firmware of the usb device.

More precisely, I took a closer look at the timestamps of my usbmon log, as you
suggested, and it showed a pattern that changed once the duplicate reports
started. With this knowledge I could analyze the details together with the
device provider. We ended up sniffing the data on the wire and fixed some of
the issues by an update of the device firmware that does not show the problem
anymore.

On the linux end, I have moved some of my python code to C to increase the
read speed. I was loosing some reports because I was not emptying the hid
report queue used in the libusb backend of the hidapi library fast enough.

Thank you again for your great help.

Best
Thilo Roerig

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Receiving report multiple times when changing cpu
  2022-10-14 13:52     ` Greg KH
@ 2023-02-22  8:27       ` Thilo Roerig
  0 siblings, 0 replies; 9+ messages in thread
From: Thilo Roerig @ 2023-02-22  8:27 UTC (permalink / raw)
  To: gregkh; +Cc: linux-usb, thilo.roerig

> On Fri, Oct 14, 2022 at 01:18:55PM +0200, Thilo Roerig wrote:
> > > > I have a problem with a custom usb device on a raspberry pi running a
> > > > buildroot 64bit linux with a v5.19.14 kernel.
> > > > I ran into a problem reading data from a custom usb hid device using
> > > > interrupt transfers. I'm using
> > > > the python hidapi library (https://pypi.org/project/hidapi/) with the
> > > > libusb backend. After sending
> > > > a custom command to the device, the driver changes from usbhid to usbfs.
> > > > The device starts sending
> > > > reports of size 1023 bytes to endpoint 0x83. `usb-devices` yields the
> > > > following output:
> > >
> > > USB devices do not "send" anything, they require a request from the host
> > > to be able to transfer any data.  So you have to ask the device "do you
> > > have any data?" before it can respond with anything.
> > >
> > > So just wait, don't ask the device if you are not ready for the data to
> > > be recieved, do not allow the device to continue to flood the data to
> > > you if you can not recieve it all.
> >
> > Thank you for pointing this out. I looked into the implementation of the hidapi
> > library. The library immediately starts submitting transfer when opening the
> > device. So to get a better grip on the issue, I'm now using libusb and in
> > this case I only get the reports, when I'm asking for it.
> >
> > > > T:  Bus=01 Lev=03 Prnt=13 Port=00 Cnt=01 Dev#= 14 Spd=480 MxC> > > h= 0
> > > > D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
> > > > C:  #Ifs= 1 Cfg#= 1 Atr=80 MxPwr=500mA
> > > > I:  If#= 0 Alt= 0 #EPs= 1 Cls=03(HID  ) Sub=00 Prot=00 Driver> > > =usbfs
> > > > E:  Ad=83(I) Atr=03(Int.) MxPS=1024 Ivl=125us
> > > >
> > > > The reports are numbered by an increasing report id and are available via
> > > > interrupt transfer approx. every
> > > > 230 microseconds (1.85 microframes). The device does not retain any old
> > > > reports.
> > >
> > > So is the buffer in the device overflowing because the host is not
> > > asking for enough data from it, or is the data overflowing in the host
> > > because you are not reading the buffers fast enough?
> >
> > The buffer on the device does not buffer reports, that are as old as the
> > reports I am receiving via libusb. Interestingly, the speed that I can receive
> > the reports in userspace increases, when the wrong pattern starts to occur.
> > Initially, I receive ~4350 reports per second (with correct report id and
> > increasing order), but once the wrong pattern occurs, the speed goes up to
> > ~7650 reports per second, which is approx. 1 microframe. Without having found
> > any reference in the code, the pattern in the data looks as if some dequeue
> > pointer is "overtaking" the enqueue pointer in a ring buffer - but this is just
> > a wild guess. It would maybe explain the pattern in the data and maybe also the
> > speedup once the ring buffer is not preventing the dequeue from reading once
> > the enqueue pointer is reached.
> 
> Are you sure that the firmware in the device itself is not just broken
> when you ramp up the speed?  It is the thing that is filling the buffers
> to the host, when it has it ready, so it is the thing that is in control
> of how fast it responds.
> 
> If all works well at slower speeds, try just doing that and not letting
> the device itself work too hard.
> 
> What does other operating systems do with this device, how do they
> control it?  With a in-kernel driver or with userspace code that takes a
> long time between requests?

Based on your (and Alans) replies, I could narrow down the potential reasons
for the duplicate reports and it turned out to be a problem with a buffer on 
the device when reports were not read fast enough.

Together with the device provider we could adapt the firmware on the device to
avoid the buffer overflow that caused the duplicate reports. Additionally, I
moved some of the code processing the reports from python to C, so that the HID
report queue used by hidapi does not overflow, when reading to slowly from
userspace. 

I did not dig deeper into how windows handles the device, but this is something
I will investigate, if I continue to have issues with the reports.

Thank you very much again for your great help.

Best
Thilo Roerig


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-02-22  8:28 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-12 14:35 Receiving report multiple times when changing cpu Thilo Rörig
2022-10-12 16:00 ` Greg KH
2022-10-14 11:18   ` Thilo Roerig
2022-10-14 13:52     ` Greg KH
2023-02-22  8:27       ` Thilo Roerig
2022-10-12 16:54 ` Alan Stern
2022-10-14 11:20   ` Thilo Roerig
2022-10-14 13:32     ` Alan Stern
2023-02-22  8:24       ` Thilo Roerig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).