linux-usb.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* 5.10 regression, many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller
@ 2020-11-10 11:36 Hans de Goede
  2020-11-18 21:43 ` Hans de Goede
  2020-11-23 14:49 ` Hans de Goede
  0 siblings, 2 replies; 37+ messages in thread
From: Hans de Goede @ 2020-11-10 11:36 UTC (permalink / raw)
  To: Mathias Nyman, Greg Kroah-Hartman
  Cc: linux-usb, Linux Kernel Mailing List, linux-pci

Hi All,

Not sure if this is a XHCI driver problem at all, but I needed to start
somewhere with reporting this so I went with:

scripts/get_maintainer.pl -f drivers/usb/host/xhci-pci.c

And added a Cc: linux-pci@vger.kernel.org as bonus.

I'm seeing the following errors and very slow network performance with
the USB NIC in a Lenovo Thunderbolt gen 2 dock.

Note that the USB NIC is connected to the XHCI controller which is
embedded inside the dock and is connected over thunderbolt!

So the errors are:

[ 1148.744205] swiotlb_tbl_map_single: 6 callbacks suppressed
[ 1148.744210] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
[ 1148.744218] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 16ea@1411c0000 dir 1 --- failed
[ 1148.744226] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
[ 1148.744368] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
[ 1148.744375] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 16ea@10aabc000 dir 1 --- failed
[ 1148.744381] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
[ 1148.745141] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
[ 1148.745148] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 118e@1411c0000 dir 1 --- failed
[ 1148.745155] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
[ 1148.951282] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
[ 1148.951388] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 118e@140988000 dir 1 --- failed
[ 1148.951420] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
[ 1151.013342] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
[ 1151.013357] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 1d2a@1411c0000 dir 1 --- failed
[ 1151.013373] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
[ 1151.018660] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 18 (slots)
[ 1151.018696] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 11da@1411c0000 dir 1 --- failed
[ 1151.018711] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
[ 1151.223022] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
[ 1151.223102] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 11da@10aabc000 dir 1 --- failed
[ 1151.223133] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
[ 1151.228810] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
[ 1151.228870] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 11da@10aabc000 dir 1 --- failed
[ 1151.228898] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
[ 1151.234792] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
[ 1151.234852] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 11da@10aabc000 dir 1 --- failed
[ 1151.234882] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11

etc.

This happens as soon as I generate any serious amount of outgoing network traffic. E.g. rsyncing files
to another machine.

Regards,

Hans


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 5.10 regression, many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller
  2020-11-10 11:36 5.10 regression, many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller Hans de Goede
@ 2020-11-18 21:43 ` Hans de Goede
  2020-11-23 14:49 ` Hans de Goede
  1 sibling, 0 replies; 37+ messages in thread
From: Hans de Goede @ 2020-11-18 21:43 UTC (permalink / raw)
  To: Mathias Nyman, Greg Kroah-Hartman
  Cc: linux-usb, Linux Kernel Mailing List, linux-pci

Hi All,

On 11/10/20 12:36 PM, Hans de Goede wrote:
> Hi All,
> 
> Not sure if this is a XHCI driver problem at all, but I needed to start
> somewhere with reporting this so I went with:
> 
> scripts/get_maintainer.pl -f drivers/usb/host/xhci-pci.c
> 
> And added a Cc: linux-pci@vger.kernel.org as bonus.
> 
> I'm seeing the following errors and very slow network performance with
> the USB NIC in a Lenovo Thunderbolt gen 2 dock.
> 
> Note that the USB NIC is connected to the XHCI controller which is
> embedded inside the dock and is connected over thunderbolt!

Ping? This is still happening and although the errors are not fatal,
outgoing network performance is very bad.

I know a lot of Linux users use thunderbolt docks and for some
reason almost all thunderbolt docks seem to be using USB attached
nics inside, so this is going to hit a lot of users if we do not
get this fixed before 5.10 gets released!

Regards,

Hans





> So the errors are:
> 
> [ 1148.744205] swiotlb_tbl_map_single: 6 callbacks suppressed
> [ 1148.744210] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1148.744218] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 16ea@1411c0000 dir 1 --- failed
> [ 1148.744226] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1148.744368] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1148.744375] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 16ea@10aabc000 dir 1 --- failed
> [ 1148.744381] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1148.745141] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1148.745148] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 118e@1411c0000 dir 1 --- failed
> [ 1148.745155] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1148.951282] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1148.951388] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 118e@140988000 dir 1 --- failed
> [ 1148.951420] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1151.013342] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1151.013357] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 1d2a@1411c0000 dir 1 --- failed
> [ 1151.013373] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1151.018660] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 18 (slots)
> [ 1151.018696] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 11da@1411c0000 dir 1 --- failed
> [ 1151.018711] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1151.223022] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1151.223102] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 11da@10aabc000 dir 1 --- failed
> [ 1151.223133] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1151.228810] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1151.228870] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 11da@10aabc000 dir 1 --- failed
> [ 1151.228898] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1151.234792] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1151.234852] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 11da@10aabc000 dir 1 --- failed
> [ 1151.234882] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> 
> etc.
> 
> This happens as soon as I generate any serious amount of outgoing network traffic. E.g. rsyncing files
> to another machine.
> 
> Regards,
> 
> Hans
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 5.10 regression, many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller
  2020-11-10 11:36 5.10 regression, many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller Hans de Goede
  2020-11-18 21:43 ` Hans de Goede
@ 2020-11-23 14:49 ` Hans de Goede
  2020-11-24 10:27   ` Christoph Hellwig
  1 sibling, 1 reply; 37+ messages in thread
From: Hans de Goede @ 2020-11-23 14:49 UTC (permalink / raw)
  To: Mathias Nyman, Greg Kroah-Hartman, Christoph Hellwig
  Cc: linux-usb, Linux Kernel Mailing List, linux-pci

Hi,

+Cc Christoph Hellwig <hch@lst.de>

Christoph, this is still an issue, so I've been looking around a bit and think this
might have something to do with the dma-mapping-5.10 changes.

Do you have any suggestions to debug this, or is it time to do a git bisect
on this before 5.10 ships with regression?

Regards,

Hans




On 11/10/20 12:36 PM, Hans de Goede wrote:
> Hi All,
> 
> Not sure if this is a XHCI driver problem at all, but I needed to start
> somewhere with reporting this so I went with:
> 
> scripts/get_maintainer.pl -f drivers/usb/host/xhci-pci.c
> 
> And added a Cc: linux-pci@vger.kernel.org as bonus.
> 
> I'm seeing the following errors and very slow network performance with
> the USB NIC in a Lenovo Thunderbolt gen 2 dock.
> 
> Note that the USB NIC is connected to the XHCI controller which is
> embedded inside the dock and is connected over thunderbolt!
> 
> So the errors are:
> 
> [ 1148.744205] swiotlb_tbl_map_single: 6 callbacks suppressed
> [ 1148.744210] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1148.744218] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 16ea@1411c0000 dir 1 --- failed
> [ 1148.744226] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1148.744368] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1148.744375] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 16ea@10aabc000 dir 1 --- failed
> [ 1148.744381] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1148.745141] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1148.745148] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 118e@1411c0000 dir 1 --- failed
> [ 1148.745155] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1148.951282] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1148.951388] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 118e@140988000 dir 1 --- failed
> [ 1148.951420] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1151.013342] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1151.013357] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 1d2a@1411c0000 dir 1 --- failed
> [ 1151.013373] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1151.018660] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 18 (slots)
> [ 1151.018696] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 11da@1411c0000 dir 1 --- failed
> [ 1151.018711] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1151.223022] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1151.223102] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 11da@10aabc000 dir 1 --- failed
> [ 1151.223133] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1151.228810] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1151.228870] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 11da@10aabc000 dir 1 --- failed
> [ 1151.228898] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1151.234792] xhci_hcd 0000:0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1151.234852] xhci_hcd 0000:0a:00.0: DMAR: Device bounce map: 11da@10aabc000 dir 1 --- failed
> [ 1151.234882] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> 
> etc.
> 
> This happens as soon as I generate any serious amount of outgoing network traffic. E.g. rsyncing files
> to another machine.
> 
> Regards,
> 
> Hans
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 5.10 regression, many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller
  2020-11-23 14:49 ` Hans de Goede
@ 2020-11-24 10:27   ` Christoph Hellwig
  2020-11-24 10:31     ` Hans de Goede
  2020-11-27 11:41     ` Hans de Goede
  0 siblings, 2 replies; 37+ messages in thread
From: Christoph Hellwig @ 2020-11-24 10:27 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Mathias Nyman, Greg Kroah-Hartman, Christoph Hellwig, linux-usb,
	Linux Kernel Mailing List, linux-pci

On Mon, Nov 23, 2020 at 03:49:09PM +0100, Hans de Goede wrote:
> Hi,
> 
> +Cc Christoph Hellwig <hch@lst.de>
> 
> Christoph, this is still an issue, so I've been looking around a bit and think this
> might have something to do with the dma-mapping-5.10 changes.
> 
> Do you have any suggestions to debug this, or is it time to do a git bisect
> on this before 5.10 ships with regression?

Given that DMAR prefix this seems to be about using intel-iommu + bounce
buffering for external devices.  I can't really think of anything specific
in 5.10 related to that, so maybe you'll need to bisect.

I doub this means we are actually leaking swiotlb buffers, so while
I'm pretty sure we broke something in lower layers this also means
xhci doesn't handle swiotlb operation very gracefully in general.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 5.10 regression, many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller
  2020-11-24 10:27   ` Christoph Hellwig
@ 2020-11-24 10:31     ` Hans de Goede
  2020-11-24 12:17       ` Mathias Nyman
  2020-11-27 11:41     ` Hans de Goede
  1 sibling, 1 reply; 37+ messages in thread
From: Hans de Goede @ 2020-11-24 10:31 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Mathias Nyman, Greg Kroah-Hartman, linux-usb,
	Linux Kernel Mailing List, linux-pci

Hi,

On 11/24/20 11:27 AM, Christoph Hellwig wrote:
> On Mon, Nov 23, 2020 at 03:49:09PM +0100, Hans de Goede wrote:
>> Hi,
>>
>> +Cc Christoph Hellwig <hch@lst.de>
>>
>> Christoph, this is still an issue, so I've been looking around a bit and think this
>> might have something to do with the dma-mapping-5.10 changes.
>>
>> Do you have any suggestions to debug this, or is it time to do a git bisect
>> on this before 5.10 ships with regression?
> 
> Given that DMAR prefix this seems to be about using intel-iommu + bounce
> buffering for external devices.  I can't really think of anything specific
> in 5.10 related to that, so maybe you'll need to bisect.
> 
> I doub this means we are actually leaking swiotlb buffers, so while
> I'm pretty sure we broke something in lower layers this also means
> xhci doesn't handle swiotlb operation very gracefully in general.

Ok, I've re-arranged my schedule a bit so that I have time to bisect this
tomorrow, so with some luck I will be able to provide info on which commit
introduced this issue tomorrow around the end of the day.

Regards,

Hans


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 5.10 regression, many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller
  2020-11-24 10:31     ` Hans de Goede
@ 2020-11-24 12:17       ` Mathias Nyman
  0 siblings, 0 replies; 37+ messages in thread
From: Mathias Nyman @ 2020-11-24 12:17 UTC (permalink / raw)
  To: Hans de Goede, Christoph Hellwig
  Cc: Mathias Nyman, Greg Kroah-Hartman, linux-usb,
	Linux Kernel Mailing List, linux-pci

On 24.11.2020 12.31, Hans de Goede wrote:
> Hi,
> 
> On 11/24/20 11:27 AM, Christoph Hellwig wrote:
>> On Mon, Nov 23, 2020 at 03:49:09PM +0100, Hans de Goede wrote:
>>> Hi,
>>>
>>> +Cc Christoph Hellwig <hch@lst.de>
>>>
>>> Christoph, this is still an issue, so I've been looking around a bit and think this
>>> might have something to do with the dma-mapping-5.10 changes.
>>>
>>> Do you have any suggestions to debug this, or is it time to do a git bisect
>>> on this before 5.10 ships with regression?
>>
>> Given that DMAR prefix this seems to be about using intel-iommu + bounce
>> buffering for external devices.  I can't really think of anything specific
>> in 5.10 related to that, so maybe you'll need to bisect.
>>
>> I doub this means we are actually leaking swiotlb buffers, so while
>> I'm pretty sure we broke something in lower layers this also means
>> xhci doesn't handle swiotlb operation very gracefully in general.

Can't think of any xhci change since 5.9 that would cause this.
It's possible there's some underlying xhci issue the 5.10 dma-mapping
changes reveal.

> 
> Ok, I've re-arranged my schedule a bit so that I have time to bisect this
> tomorrow, so with some luck I will be able to provide info on which commit
> introduced this issue tomorrow around the end of the day.

Thanks for looking into it.

-Mathias


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 5.10 regression, many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller
  2020-11-24 10:27   ` Christoph Hellwig
  2020-11-24 10:31     ` Hans de Goede
@ 2020-11-27 11:41     ` Hans de Goede
  2020-11-27 12:32       ` 5.10 regression caused by: "uas: fix sdev->host->dma_dev": " Hans de Goede
  1 sibling, 1 reply; 37+ messages in thread
From: Hans de Goede @ 2020-11-27 11:41 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Mathias Nyman, Greg Kroah-Hartman, linux-usb,
	Linux Kernel Mailing List, linux-pci

Hi,

On 11/24/20 11:27 AM, Christoph Hellwig wrote:
> On Mon, Nov 23, 2020 at 03:49:09PM +0100, Hans de Goede wrote:
>> Hi,
>>
>> +Cc Christoph Hellwig <hch@lst.de>
>>
>> Christoph, this is still an issue, so I've been looking around a bit and think this
>> might have something to do with the dma-mapping-5.10 changes.
>>
>> Do you have any suggestions to debug this, or is it time to do a git bisect
>> on this before 5.10 ships with regression?
> 
> Given that DMAR prefix this seems to be about using intel-iommu + bounce
> buffering for external devices.  I can't really think of anything specific
> in 5.10 related to that, so maybe you'll need to bisect.
> 
> I doub this means we are actually leaking swiotlb buffers, so while
> I'm pretty sure we broke something in lower layers this also means
> xhci doesn't handle swiotlb operation very gracefully in general.

I've done a git bisect, and the result is somewhat surprising. The git-bisect
points to:

commit 558033c2828f ("uas: fix sdev->host->dma_dev")

    Use scsi_add_host_with_dma() instead of scsi_add_host().
    
    When the scsi request queue is initialized/allocated, hw_max_sectors is clamped
    to the dma max mapping size. Therefore, the correct device that should be used
    for the clamping needs to be set.
    
    The same clamping is still needed in uas as hw_max_sectors could be changed
    there. The original clamping would be invalidated in such cases.

I do have an UAS drive connected to the thunderbolt-dock, so I guess that this
change is causing the UAS driver to gobble all all available swiotlb space.

Regards,

Hans


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 5.10 regression caused by: "uas: fix sdev->host->dma_dev": many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller
  2020-11-27 11:41     ` Hans de Goede
@ 2020-11-27 12:32       ` Hans de Goede
  2020-11-27 16:19         ` Christoph Hellwig
  0 siblings, 1 reply; 37+ messages in thread
From: Hans de Goede @ 2020-11-27 12:32 UTC (permalink / raw)
  To: Christoph Hellwig, Tom Yan
  Cc: Mathias Nyman, Greg Kroah-Hartman, linux-usb,
	Linux Kernel Mailing List, linux-pci

Hi,

On 11/27/20 12:41 PM, Hans de Goede wrote:
> Hi,
> 
> On 11/24/20 11:27 AM, Christoph Hellwig wrote:
>> On Mon, Nov 23, 2020 at 03:49:09PM +0100, Hans de Goede wrote:
>>> Hi,
>>>
>>> +Cc Christoph Hellwig <hch@lst.de>
>>>
>>> Christoph, this is still an issue, so I've been looking around a bit and think this
>>> might have something to do with the dma-mapping-5.10 changes.
>>>
>>> Do you have any suggestions to debug this, or is it time to do a git bisect
>>> on this before 5.10 ships with regression?
>>
>> Given that DMAR prefix this seems to be about using intel-iommu + bounce
>> buffering for external devices.  I can't really think of anything specific
>> in 5.10 related to that, so maybe you'll need to bisect.
>>
>> I doub this means we are actually leaking swiotlb buffers, so while
>> I'm pretty sure we broke something in lower layers this also means
>> xhci doesn't handle swiotlb operation very gracefully in general.
> 
> I've done a git bisect, and the result is somewhat surprising. The git-bisect
> points to:
> 
> commit 558033c2828f ("uas: fix sdev->host->dma_dev")
> 
>     Use scsi_add_host_with_dma() instead of scsi_add_host().
>     
>     When the scsi request queue is initialized/allocated, hw_max_sectors is clamped
>     to the dma max mapping size. Therefore, the correct device that should be used
>     for the clamping needs to be set.
>     
>     The same clamping is still needed in uas as hw_max_sectors could be changed
>     there. The original clamping would be invalidated in such cases.
> 
> I do have an UAS drive connected to the thunderbolt-dock, so I guess that this
> change is causing the UAS driver to gobble all all available swiotlb space.

I ran some more tests, I can confirm that reverting:

5df7ef7d32fe "uas: bump hw_max_sectors to 2048 blocks for SS or faster drives"
558033c2828f "uas: fix sdev->host->dma_dev"

Makes the problem go away while running a 5.10 kernel. I also tried doubling
the swiotlb size by adding: swiotlb=65536 to the kernel commandline but that
does not help.

Some more observations:

1. The usb-storage driver does not cause this issue, even though it has a
very similar change.

2. The problem does not happen until I plug an UAS decvice into the dock.

3. The problem continues to happen even after I unplug the UAS device and
rmmod the uas module

3. made me take a bit closer look to the troublesome commit, it passes:
udev->bus->sysdev, which I assume is the XHCI controller itself as device
to scsi_add_host_with_dma, which in turn seems to cause permanent changes
to the dma settings for the XHCI controller. I'm not all that familiar with
the DMA APIs but I'm getting the feeling that passing the actual XHCI-controller's
device as dma-device to scsi_add_host_with_dma is simply the wrong thing to
do; and that the intended effects (honor XHCI dma limits, but do not cause
any changes the XHCI dma settings) should be achieved differently.

Note that if this is indeed wrong, the matching usb-storage change should
likely also be dropped.

Regards,

Hans


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 5.10 regression caused by: "uas: fix sdev->host->dma_dev": many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller
  2020-11-27 12:32       ` 5.10 regression caused by: "uas: fix sdev->host->dma_dev": " Hans de Goede
@ 2020-11-27 16:19         ` Christoph Hellwig
  2020-11-27 18:12           ` Hans de Goede
  0 siblings, 1 reply; 37+ messages in thread
From: Christoph Hellwig @ 2020-11-27 16:19 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Christoph Hellwig, Tom Yan, Mathias Nyman, Greg Kroah-Hartman,
	linux-usb, Linux Kernel Mailing List, linux-pci, Lu Baolu

On Fri, Nov 27, 2020 at 01:32:16PM +0100, Hans de Goede wrote:
> I ran some more tests, I can confirm that reverting:
> 
> 5df7ef7d32fe "uas: bump hw_max_sectors to 2048 blocks for SS or faster drives"
> 558033c2828f "uas: fix sdev->host->dma_dev"
> 
> Makes the problem go away while running a 5.10 kernel. I also tried doubling
> the swiotlb size by adding: swiotlb=65536 to the kernel commandline but that
> does not help.
> 
> Some more observations:
> 
> 1. The usb-storage driver does not cause this issue, even though it has a
> very similar change.
> 
> 2. The problem does not happen until I plug an UAS decvice into the dock.
> 
> 3. The problem continues to happen even after I unplug the UAS device and
> rmmod the uas module
> 
> 3. made me take a bit closer look to the troublesome commit, it passes:
> udev->bus->sysdev, which I assume is the XHCI controller itself as device
> to scsi_add_host_with_dma, which in turn seems to cause permanent changes
> to the dma settings for the XHCI controller. I'm not all that familiar with
> the DMA APIs but I'm getting the feeling that passing the actual XHCI-controller's
> device as dma-device to scsi_add_host_with_dma is simply the wrong thing to
> do; and that the intended effects (honor XHCI dma limits, but do not cause
> any changes the XHCI dma settings) should be achieved differently.
> 
> Note that if this is indeed wrong, the matching usb-storage change should
> likely also be dropped.

One problem in this area is that the clamping of the DMA size through
dma_max_mapping_size mentioned in the commit log doesn't work when
swiotlb is called from intel-iommu. I think we need to wire up those
calls there as well.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 5.10 regression caused by: "uas: fix sdev->host->dma_dev": many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller
  2020-11-27 16:19         ` Christoph Hellwig
@ 2020-11-27 18:12           ` Hans de Goede
  2020-11-28  1:25             ` Tom Yan
  2020-11-28 17:15             ` 5.10 regression caused by: "uas: fix sdev->host->dma_dev": many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller Christoph Hellwig
  0 siblings, 2 replies; 37+ messages in thread
From: Hans de Goede @ 2020-11-27 18:12 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Tom Yan, Mathias Nyman, Greg Kroah-Hartman, linux-usb,
	Linux Kernel Mailing List, linux-pci, Lu Baolu

Hi,

On 11/27/20 5:19 PM, Christoph Hellwig wrote:
> On Fri, Nov 27, 2020 at 01:32:16PM +0100, Hans de Goede wrote:
>> I ran some more tests, I can confirm that reverting:
>>
>> 5df7ef7d32fe "uas: bump hw_max_sectors to 2048 blocks for SS or faster drives"
>> 558033c2828f "uas: fix sdev->host->dma_dev"
>>
>> Makes the problem go away while running a 5.10 kernel. I also tried doubling
>> the swiotlb size by adding: swiotlb=65536 to the kernel commandline but that
>> does not help.
>>
>> Some more observations:
>>
>> 1. The usb-storage driver does not cause this issue, even though it has a
>> very similar change.
>>
>> 2. The problem does not happen until I plug an UAS decvice into the dock.
>>
>> 3. The problem continues to happen even after I unplug the UAS device and
>> rmmod the uas module
>>
>> 3. made me take a bit closer look to the troublesome commit, it passes:
>> udev->bus->sysdev, which I assume is the XHCI controller itself as device
>> to scsi_add_host_with_dma, which in turn seems to cause permanent changes
>> to the dma settings for the XHCI controller. I'm not all that familiar with
>> the DMA APIs but I'm getting the feeling that passing the actual XHCI-controller's
>> device as dma-device to scsi_add_host_with_dma is simply the wrong thing to
>> do; and that the intended effects (honor XHCI dma limits, but do not cause
>> any changes the XHCI dma settings) should be achieved differently.
>>
>> Note that if this is indeed wrong, the matching usb-storage change should
>> likely also be dropped.
> 
> One problem in this area is that the clamping of the DMA size through
> dma_max_mapping_size mentioned in the commit log doesn't work when
> swiotlb is called from intel-iommu. I think we need to wire up those
> calls there as well.

Ok, but that does not sound like a quick last minute fix for 5.10, so maybe
for 5.10 we should just revert the uas and usb-storage changes which trigger
this problem and then retry those for 5.11 ?

Regards,

Hans 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 5.10 regression caused by: "uas: fix sdev->host->dma_dev": many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller
  2020-11-27 18:12           ` Hans de Goede
@ 2020-11-28  1:25             ` Tom Yan
  2020-11-28 10:43               ` Hans de Goede
  2020-11-28 17:15             ` 5.10 regression caused by: "uas: fix sdev->host->dma_dev": many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller Christoph Hellwig
  1 sibling, 1 reply; 37+ messages in thread
From: Tom Yan @ 2020-11-28  1:25 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Christoph Hellwig, Mathias Nyman, Greg Kroah-Hartman, linux-usb,
	Linux Kernel Mailing List, linux-pci, Lu Baolu

Should we still be clamping max_sectors to dma_max_mapping_size(dev)
(for now)? with dev being us->pusb_dev->bus->sysdev and
devinfo->udev->bus->sysdev respectively (i.e. revert only
scsi_add_host_with_dma() to scsi_add_host())?

On Sat, 28 Nov 2020 at 02:12, Hans de Goede <hdegoede@redhat.com> wrote:
>
> Hi,
>
> On 11/27/20 5:19 PM, Christoph Hellwig wrote:
> > On Fri, Nov 27, 2020 at 01:32:16PM +0100, Hans de Goede wrote:
> >> I ran some more tests, I can confirm that reverting:
> >>
> >> 5df7ef7d32fe "uas: bump hw_max_sectors to 2048 blocks for SS or faster drives"
> >> 558033c2828f "uas: fix sdev->host->dma_dev"
> >>
> >> Makes the problem go away while running a 5.10 kernel. I also tried doubling
> >> the swiotlb size by adding: swiotlb=65536 to the kernel commandline but that
> >> does not help.
> >>
> >> Some more observations:
> >>
> >> 1. The usb-storage driver does not cause this issue, even though it has a
> >> very similar change.
> >>
> >> 2. The problem does not happen until I plug an UAS decvice into the dock.
> >>
> >> 3. The problem continues to happen even after I unplug the UAS device and
> >> rmmod the uas module
> >>
> >> 3. made me take a bit closer look to the troublesome commit, it passes:
> >> udev->bus->sysdev, which I assume is the XHCI controller itself as device
> >> to scsi_add_host_with_dma, which in turn seems to cause permanent changes
> >> to the dma settings for the XHCI controller. I'm not all that familiar with
> >> the DMA APIs but I'm getting the feeling that passing the actual XHCI-controller's
> >> device as dma-device to scsi_add_host_with_dma is simply the wrong thing to
> >> do; and that the intended effects (honor XHCI dma limits, but do not cause
> >> any changes the XHCI dma settings) should be achieved differently.
> >>
> >> Note that if this is indeed wrong, the matching usb-storage change should
> >> likely also be dropped.
> >
> > One problem in this area is that the clamping of the DMA size through
> > dma_max_mapping_size mentioned in the commit log doesn't work when
> > swiotlb is called from intel-iommu. I think we need to wire up those
> > calls there as well.
>
> Ok, but that does not sound like a quick last minute fix for 5.10, so maybe
> for 5.10 we should just revert the uas and usb-storage changes which trigger
> this problem and then retry those for 5.11 ?
>
> Regards,
>
> Hans
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 5.10 regression caused by: "uas: fix sdev->host->dma_dev": many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller
  2020-11-28  1:25             ` Tom Yan
@ 2020-11-28 10:43               ` Hans de Goede
  2020-11-28 15:48                 ` [PATCH 1/2] uas: revert from scsi_add_host_with_dma() to scsi_add_host() Tom Yan
  0 siblings, 1 reply; 37+ messages in thread
From: Hans de Goede @ 2020-11-28 10:43 UTC (permalink / raw)
  To: Tom Yan
  Cc: Christoph Hellwig, Mathias Nyman, Greg Kroah-Hartman, linux-usb,
	Linux Kernel Mailing List, linux-pci, Lu Baolu

Hi Tom,

On 11/28/20 2:25 AM, Tom Yan wrote:
> Should we still be clamping max_sectors to dma_max_mapping_size(dev)
> (for now)? with dev being us->pusb_dev->bus->sysdev and
> devinfo->udev->bus->sysdev respectively (i.e. revert only
> scsi_add_host_with_dma() to scsi_add_host())?

I would expect that to work / avoid the regression, so yes that is
a good option.

If you can provide me with a patch doing that, then I can test it to
make sure it does indeed fix the regression.

Regards,

Hans


> 
> On Sat, 28 Nov 2020 at 02:12, Hans de Goede <hdegoede@redhat.com> wrote:
>>
>> Hi,
>>
>> On 11/27/20 5:19 PM, Christoph Hellwig wrote:
>>> On Fri, Nov 27, 2020 at 01:32:16PM +0100, Hans de Goede wrote:
>>>> I ran some more tests, I can confirm that reverting:
>>>>
>>>> 5df7ef7d32fe "uas: bump hw_max_sectors to 2048 blocks for SS or faster drives"
>>>> 558033c2828f "uas: fix sdev->host->dma_dev"
>>>>
>>>> Makes the problem go away while running a 5.10 kernel. I also tried doubling
>>>> the swiotlb size by adding: swiotlb=65536 to the kernel commandline but that
>>>> does not help.
>>>>
>>>> Some more observations:
>>>>
>>>> 1. The usb-storage driver does not cause this issue, even though it has a
>>>> very similar change.
>>>>
>>>> 2. The problem does not happen until I plug an UAS decvice into the dock.
>>>>
>>>> 3. The problem continues to happen even after I unplug the UAS device and
>>>> rmmod the uas module
>>>>
>>>> 3. made me take a bit closer look to the troublesome commit, it passes:
>>>> udev->bus->sysdev, which I assume is the XHCI controller itself as device
>>>> to scsi_add_host_with_dma, which in turn seems to cause permanent changes
>>>> to the dma settings for the XHCI controller. I'm not all that familiar with
>>>> the DMA APIs but I'm getting the feeling that passing the actual XHCI-controller's
>>>> device as dma-device to scsi_add_host_with_dma is simply the wrong thing to
>>>> do; and that the intended effects (honor XHCI dma limits, but do not cause
>>>> any changes the XHCI dma settings) should be achieved differently.
>>>>
>>>> Note that if this is indeed wrong, the matching usb-storage change should
>>>> likely also be dropped.
>>>
>>> One problem in this area is that the clamping of the DMA size through
>>> dma_max_mapping_size mentioned in the commit log doesn't work when
>>> swiotlb is called from intel-iommu. I think we need to wire up those
>>> calls there as well.
>>
>> Ok, but that does not sound like a quick last minute fix for 5.10, so maybe
>> for 5.10 we should just revert the uas and usb-storage changes which trigger
>> this problem and then retry those for 5.11 ?
>>
>> Regards,
>>
>> Hans
>>
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 1/2] uas: revert from scsi_add_host_with_dma() to scsi_add_host()
  2020-11-28 10:43               ` Hans de Goede
@ 2020-11-28 15:48                 ` Tom Yan
  2020-11-28 15:48                   ` [PATCH 2/2] usb-storage: " Tom Yan
  2020-11-30  9:48                   ` [PATCH 1/2] uas: " Hans de Goede
  0 siblings, 2 replies; 37+ messages in thread
From: Tom Yan @ 2020-11-28 15:48 UTC (permalink / raw)
  To: hdegoede, hch, gregkh, linux-usb
  Cc: mathias.nyman, linux-kernel, linux-pci, baolu.lu, Tom Yan

Apparently the former (with the chosen dma_dev) may cause problem in certain
case (e.g. where thunderbolt dock and intel iommu are involved). The error
observed was:

XHCI swiotlb buffer is full / DMAR: Device bounce map failed

For now we retain the clamp for hw_max_sectors against the dma_max_mapping_size.
Since the device/size for the clamp that is applied when the scsi request queue
is initialized/allocated is different than the one used here, we invalidate the
early clamping by making a fallback blk_queue_max_hw_sectors() call.

Signed-off-by: Tom Yan <tom.ty89@gmail.com>
---
 drivers/usb/storage/uas.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
index c8a577309e8f..5db1325cea20 100644
--- a/drivers/usb/storage/uas.c
+++ b/drivers/usb/storage/uas.c
@@ -843,18 +843,21 @@ static int uas_slave_alloc(struct scsi_device *sdev)
 static int uas_slave_configure(struct scsi_device *sdev)
 {
 	struct uas_dev_info *devinfo = sdev->hostdata;
-	struct device *dev = sdev->host->dma_dev;
+	struct usb_device *udev = devinfo->udev;
 
 	if (devinfo->flags & US_FL_MAX_SECTORS_64)
 		blk_queue_max_hw_sectors(sdev->request_queue, 64);
 	else if (devinfo->flags & US_FL_MAX_SECTORS_240)
 		blk_queue_max_hw_sectors(sdev->request_queue, 240);
-	else if (devinfo->udev->speed >= USB_SPEED_SUPER)
+	else if (udev->speed >= USB_SPEED_SUPER)
 		blk_queue_max_hw_sectors(sdev->request_queue, 2048);
+	else
+		blk_queue_max_hw_sectors(sdev->request_queue,
+					 SCSI_DEFAULT_MAX_SECTORS);
 
 	blk_queue_max_hw_sectors(sdev->request_queue,
 		min_t(size_t, queue_max_hw_sectors(sdev->request_queue),
-		      dma_max_mapping_size(dev) >> SECTOR_SHIFT));
+		      dma_max_mapping_size(udev->bus->sysdev) >> SECTOR_SHIFT));
 
 	if (devinfo->flags & US_FL_NO_REPORT_OPCODES)
 		sdev->no_report_opcodes = 1;
@@ -1040,7 +1043,7 @@ static int uas_probe(struct usb_interface *intf, const struct usb_device_id *id)
 	shost->can_queue = devinfo->qdepth - 2;
 
 	usb_set_intfdata(intf, shost);
-	result = scsi_add_host_with_dma(shost, &intf->dev, udev->bus->sysdev);
+	result = scsi_add_host(shost, &intf->dev);
 	if (result)
 		goto free_streams;
 
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 2/2] usb-storage: revert from scsi_add_host_with_dma() to scsi_add_host()
  2020-11-28 15:48                 ` [PATCH 1/2] uas: revert from scsi_add_host_with_dma() to scsi_add_host() Tom Yan
@ 2020-11-28 15:48                   ` Tom Yan
  2020-11-30  9:50                     ` Hans de Goede
  2020-11-30  9:48                   ` [PATCH 1/2] uas: " Hans de Goede
  1 sibling, 1 reply; 37+ messages in thread
From: Tom Yan @ 2020-11-28 15:48 UTC (permalink / raw)
  To: hdegoede, hch, gregkh, linux-usb
  Cc: mathias.nyman, linux-kernel, linux-pci, baolu.lu, Tom Yan

While the change only seemed to have caused issue on uas drives, we
probably want to avoid it in the usb-storage driver as well, until
we are sure it is the right thing to do.

Signed-off-by: Tom Yan <tom.ty89@gmail.com>
---
 drivers/usb/storage/scsiglue.c | 40 +++++++++++++++++-----------------
 drivers/usb/storage/usb.c      |  3 +--
 2 files changed, 21 insertions(+), 22 deletions(-)

diff --git a/drivers/usb/storage/scsiglue.c b/drivers/usb/storage/scsiglue.c
index 560efd1479ba..6539bae1c188 100644
--- a/drivers/usb/storage/scsiglue.c
+++ b/drivers/usb/storage/scsiglue.c
@@ -92,7 +92,7 @@ static int slave_alloc (struct scsi_device *sdev)
 static int slave_configure(struct scsi_device *sdev)
 {
 	struct us_data *us = host_to_us(sdev->host);
-	struct device *dev = sdev->host->dma_dev;
+	struct device *dev = us->pusb_dev->bus->sysdev;
 
 	/*
 	 * Many devices have trouble transferring more than 32KB at a time,
@@ -120,6 +120,25 @@ static int slave_configure(struct scsi_device *sdev)
 		 * better throughput on most devices.
 		 */
 		blk_queue_max_hw_sectors(sdev->request_queue, 2048);
+	} else {
+		/*
+		 * Limit the total size of a transfer to 120 KB.
+		 *
+		 * Some devices are known to choke with anything larger. It seems like
+		 * the problem stems from the fact that original IDE controllers had
+		 * only an 8-bit register to hold the number of sectors in one transfer
+		 * and even those couldn't handle a full 256 sectors.
+		 *
+		 * Because we want to make sure we interoperate with as many devices as
+		 * possible, we will maintain a 240 sector transfer size limit for USB
+		 * Mass Storage devices.
+		 *
+		 * Tests show that other operating have similar limits with Microsoft
+		 * Windows 7 limiting transfers to 128 sectors for both USB2 and USB3
+		 * and Apple Mac OS X 10.11 limiting transfers to 256 sectors for USB2
+		 * and 2048 for USB3 devices.
+		 */
+		blk_queue_max_hw_sectors(sdev->request_queue, 240);
 	}
 
 	/*
@@ -627,25 +646,6 @@ static const struct scsi_host_template usb_stor_host_template = {
 	.sg_tablesize =			SG_MAX_SEGMENTS,
 
 
-	/*
-	 * Limit the total size of a transfer to 120 KB.
-	 *
-	 * Some devices are known to choke with anything larger. It seems like
-	 * the problem stems from the fact that original IDE controllers had
-	 * only an 8-bit register to hold the number of sectors in one transfer
-	 * and even those couldn't handle a full 256 sectors.
-	 *
-	 * Because we want to make sure we interoperate with as many devices as
-	 * possible, we will maintain a 240 sector transfer size limit for USB
-	 * Mass Storage devices.
-	 *
-	 * Tests show that other operating have similar limits with Microsoft
-	 * Windows 7 limiting transfers to 128 sectors for both USB2 and USB3
-	 * and Apple Mac OS X 10.11 limiting transfers to 256 sectors for USB2
-	 * and 2048 for USB3 devices.
-	 */
-	.max_sectors =                  240,
-
 	/* emulated HBA */
 	.emulated =			1,
 
diff --git a/drivers/usb/storage/usb.c b/drivers/usb/storage/usb.c
index c2ef367cf257..f177da4ff1bc 100644
--- a/drivers/usb/storage/usb.c
+++ b/drivers/usb/storage/usb.c
@@ -1050,8 +1050,7 @@ int usb_stor_probe2(struct us_data *us)
 	usb_autopm_get_interface_no_resume(us->pusb_intf);
 	snprintf(us->scsi_name, sizeof(us->scsi_name), "usb-storage %s",
 					dev_name(dev));
-	result = scsi_add_host_with_dma(us_to_host(us), dev,
-					us->pusb_dev->bus->sysdev);
+	result = scsi_add_host(us_to_host(us), dev);
 	if (result) {
 		dev_warn(dev,
 				"Unable to add the scsi host\n");
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: 5.10 regression caused by: "uas: fix sdev->host->dma_dev": many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller
  2020-11-27 18:12           ` Hans de Goede
  2020-11-28  1:25             ` Tom Yan
@ 2020-11-28 17:15             ` Christoph Hellwig
  2020-11-30  8:43               ` Hans de Goede
  1 sibling, 1 reply; 37+ messages in thread
From: Christoph Hellwig @ 2020-11-28 17:15 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Christoph Hellwig, Tom Yan, Mathias Nyman, Greg Kroah-Hartman,
	linux-usb, Linux Kernel Mailing List, linux-pci, Lu Baolu

Can you give this one-liner a spin?

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index c6622011d4938c..e889111b55c71d 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4007,6 +4007,7 @@ static const struct dma_map_ops bounce_dma_ops = {
 	.alloc_pages		= dma_common_alloc_pages,
 	.free_pages		= dma_common_free_pages,
 	.dma_supported		= dma_direct_supported,
+	.max_mapping_size	= swiotlb_max_mapping_size,
 };
 
 static inline int iommu_domain_cache_init(void)

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: 5.10 regression caused by: "uas: fix sdev->host->dma_dev": many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller
  2020-11-28 17:15             ` 5.10 regression caused by: "uas: fix sdev->host->dma_dev": many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller Christoph Hellwig
@ 2020-11-30  8:43               ` Hans de Goede
  0 siblings, 0 replies; 37+ messages in thread
From: Hans de Goede @ 2020-11-30  8:43 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Tom Yan, Mathias Nyman, Greg Kroah-Hartman, linux-usb,
	Linux Kernel Mailing List, linux-pci, Lu Baolu

Hi,

On 11/28/20 6:15 PM, Christoph Hellwig wrote:
> Can you give this one-liner a spin?
> 
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index c6622011d4938c..e889111b55c71d 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4007,6 +4007,7 @@ static const struct dma_map_ops bounce_dma_ops = {
>  	.alloc_pages		= dma_common_alloc_pages,
>  	.free_pages		= dma_common_free_pages,
>  	.dma_supported		= dma_direct_supported,
> +	.max_mapping_size	= swiotlb_max_mapping_size,
>  };
>  
>  static inline int iommu_domain_cache_init(void)
> 

I'm afraid that this does not help.

Also I still find it somewhat wrong that the use of scsi_add_host_with_dma()
in uas.c, which then passed the XHCI controller as dma-dev is causing changes
to the DMA settings of the XHCI controller, impacting *other* USB devices
and these changes also are permanent, they stay around even after unbinding
the uas driver.

This just feels wrong on many levels. If some changes to the XHCI controllers
DMA settings are necessary for better uas performance then these changes
really should be made inside the XHCI driver, so that they always apply and
not have this weirdness going on where binding one USB driver permanently
changes the behavior of the entire USB bus (until rebooted).

Querying the DMA settings of the XHCI controller in the uas driver is fine,
but changing them seems like a big nono to me.

Regards,

Hans


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/2] uas: revert from scsi_add_host_with_dma() to scsi_add_host()
  2020-11-28 15:48                 ` [PATCH 1/2] uas: revert from scsi_add_host_with_dma() to scsi_add_host() Tom Yan
  2020-11-28 15:48                   ` [PATCH 2/2] usb-storage: " Tom Yan
@ 2020-11-30  9:48                   ` Hans de Goede
  2020-11-30 19:30                     ` Tom Yan
  1 sibling, 1 reply; 37+ messages in thread
From: Hans de Goede @ 2020-11-30  9:48 UTC (permalink / raw)
  To: Tom Yan, hch, gregkh, linux-usb
  Cc: mathias.nyman, linux-kernel, linux-pci, baolu.lu

Hi,

On 11/28/20 4:48 PM, Tom Yan wrote:
> Apparently the former (with the chosen dma_dev) may cause problem in certain
> case (e.g. where thunderbolt dock and intel iommu are involved). The error
> observed was:
> 
> XHCI swiotlb buffer is full / DMAR: Device bounce map failed
> 
> For now we retain the clamp for hw_max_sectors against the dma_max_mapping_size.
> Since the device/size for the clamp that is applied when the scsi request queue
> is initialized/allocated is different than the one used here, we invalidate the
> early clamping by making a fallback blk_queue_max_hw_sectors() call.
> 
> Signed-off-by: Tom Yan <tom.ty89@gmail.com>

I can confirm that this fixes the network performance on a Lenovo Thunderbolt
dock generation 2, which uses an USB attach NIC.

With this patch added on top of 5.10-rc5 scp performance to another machine
on the local gbit LAN goes back from the regressed 1 MB/s to its original 100MB/s
as it should be:

Tested-by: Hans de Goede <hdegoede@redhat.com>

Regards,

Hans


> ---
>  drivers/usb/storage/uas.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
> index c8a577309e8f..5db1325cea20 100644
> --- a/drivers/usb/storage/uas.c
> +++ b/drivers/usb/storage/uas.c
> @@ -843,18 +843,21 @@ static int uas_slave_alloc(struct scsi_device *sdev)
>  static int uas_slave_configure(struct scsi_device *sdev)
>  {
>  	struct uas_dev_info *devinfo = sdev->hostdata;
> -	struct device *dev = sdev->host->dma_dev;
> +	struct usb_device *udev = devinfo->udev;
>  
>  	if (devinfo->flags & US_FL_MAX_SECTORS_64)
>  		blk_queue_max_hw_sectors(sdev->request_queue, 64);
>  	else if (devinfo->flags & US_FL_MAX_SECTORS_240)
>  		blk_queue_max_hw_sectors(sdev->request_queue, 240);
> -	else if (devinfo->udev->speed >= USB_SPEED_SUPER)
> +	else if (udev->speed >= USB_SPEED_SUPER)
>  		blk_queue_max_hw_sectors(sdev->request_queue, 2048);
> +	else
> +		blk_queue_max_hw_sectors(sdev->request_queue,
> +					 SCSI_DEFAULT_MAX_SECTORS);
>  
>  	blk_queue_max_hw_sectors(sdev->request_queue,
>  		min_t(size_t, queue_max_hw_sectors(sdev->request_queue),
> -		      dma_max_mapping_size(dev) >> SECTOR_SHIFT));
> +		      dma_max_mapping_size(udev->bus->sysdev) >> SECTOR_SHIFT));
>  
>  	if (devinfo->flags & US_FL_NO_REPORT_OPCODES)
>  		sdev->no_report_opcodes = 1;
> @@ -1040,7 +1043,7 @@ static int uas_probe(struct usb_interface *intf, const struct usb_device_id *id)
>  	shost->can_queue = devinfo->qdepth - 2;
>  
>  	usb_set_intfdata(intf, shost);
> -	result = scsi_add_host_with_dma(shost, &intf->dev, udev->bus->sysdev);
> +	result = scsi_add_host(shost, &intf->dev);
>  	if (result)
>  		goto free_streams;
>  
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/2] usb-storage: revert from scsi_add_host_with_dma() to scsi_add_host()
  2020-11-28 15:48                   ` [PATCH 2/2] usb-storage: " Tom Yan
@ 2020-11-30  9:50                     ` Hans de Goede
  2020-11-30 12:58                       ` Tom Yan
  0 siblings, 1 reply; 37+ messages in thread
From: Hans de Goede @ 2020-11-30  9:50 UTC (permalink / raw)
  To: Tom Yan, hch, gregkh, linux-usb
  Cc: mathias.nyman, linux-kernel, linux-pci, baolu.lu

Hi,

On 11/28/20 4:48 PM, Tom Yan wrote:
> While the change only seemed to have caused issue on uas drives, we
> probably want to avoid it in the usb-storage driver as well, until
> we are sure it is the right thing to do.
> 
> Signed-off-by: Tom Yan <tom.ty89@gmail.com>

This seems to do a whole lot more then just dropping back from
 scsi_add_host_with_dma() to scsi_add_host(). This has way more
lines then the orginal commit.

IMHO it would be best to just revert commit 0154012f8018bba4d9971d1007c12ffd48539ddb
and then submit these changes as a separate patch (which would be
5.11 material then).

That separate patch could then also have a proper commit message
explaining the other changes you are making, which is also not
unimportant.

Regards,

Hans




> ---
>  drivers/usb/storage/scsiglue.c | 40 +++++++++++++++++-----------------
>  drivers/usb/storage/usb.c      |  3 +--
>  2 files changed, 21 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/usb/storage/scsiglue.c b/drivers/usb/storage/scsiglue.c
> index 560efd1479ba..6539bae1c188 100644
> --- a/drivers/usb/storage/scsiglue.c
> +++ b/drivers/usb/storage/scsiglue.c
> @@ -92,7 +92,7 @@ static int slave_alloc (struct scsi_device *sdev)
>  static int slave_configure(struct scsi_device *sdev)
>  {
>  	struct us_data *us = host_to_us(sdev->host);
> -	struct device *dev = sdev->host->dma_dev;
> +	struct device *dev = us->pusb_dev->bus->sysdev;
>  
>  	/*
>  	 * Many devices have trouble transferring more than 32KB at a time,
> @@ -120,6 +120,25 @@ static int slave_configure(struct scsi_device *sdev)
>  		 * better throughput on most devices.
>  		 */
>  		blk_queue_max_hw_sectors(sdev->request_queue, 2048);
> +	} else {
> +		/*
> +		 * Limit the total size of a transfer to 120 KB.
> +		 *
> +		 * Some devices are known to choke with anything larger. It seems like
> +		 * the problem stems from the fact that original IDE controllers had
> +		 * only an 8-bit register to hold the number of sectors in one transfer
> +		 * and even those couldn't handle a full 256 sectors.
> +		 *
> +		 * Because we want to make sure we interoperate with as many devices as
> +		 * possible, we will maintain a 240 sector transfer size limit for USB
> +		 * Mass Storage devices.
> +		 *
> +		 * Tests show that other operating have similar limits with Microsoft
> +		 * Windows 7 limiting transfers to 128 sectors for both USB2 and USB3
> +		 * and Apple Mac OS X 10.11 limiting transfers to 256 sectors for USB2
> +		 * and 2048 for USB3 devices.
> +		 */
> +		blk_queue_max_hw_sectors(sdev->request_queue, 240);
>  	}
>  
>  	/*
> @@ -627,25 +646,6 @@ static const struct scsi_host_template usb_stor_host_template = {
>  	.sg_tablesize =			SG_MAX_SEGMENTS,
>  
>  
> -	/*
> -	 * Limit the total size of a transfer to 120 KB.
> -	 *
> -	 * Some devices are known to choke with anything larger. It seems like
> -	 * the problem stems from the fact that original IDE controllers had
> -	 * only an 8-bit register to hold the number of sectors in one transfer
> -	 * and even those couldn't handle a full 256 sectors.
> -	 *
> -	 * Because we want to make sure we interoperate with as many devices as
> -	 * possible, we will maintain a 240 sector transfer size limit for USB
> -	 * Mass Storage devices.
> -	 *
> -	 * Tests show that other operating have similar limits with Microsoft
> -	 * Windows 7 limiting transfers to 128 sectors for both USB2 and USB3
> -	 * and Apple Mac OS X 10.11 limiting transfers to 256 sectors for USB2
> -	 * and 2048 for USB3 devices.
> -	 */
> -	.max_sectors =                  240,
> -
>  	/* emulated HBA */
>  	.emulated =			1,
>  
> diff --git a/drivers/usb/storage/usb.c b/drivers/usb/storage/usb.c
> index c2ef367cf257..f177da4ff1bc 100644
> --- a/drivers/usb/storage/usb.c
> +++ b/drivers/usb/storage/usb.c
> @@ -1050,8 +1050,7 @@ int usb_stor_probe2(struct us_data *us)
>  	usb_autopm_get_interface_no_resume(us->pusb_intf);
>  	snprintf(us->scsi_name, sizeof(us->scsi_name), "usb-storage %s",
>  					dev_name(dev));
> -	result = scsi_add_host_with_dma(us_to_host(us), dev,
> -					us->pusb_dev->bus->sysdev);
> +	result = scsi_add_host(us_to_host(us), dev);
>  	if (result) {
>  		dev_warn(dev,
>  				"Unable to add the scsi host\n");
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/2] usb-storage: revert from scsi_add_host_with_dma() to scsi_add_host()
  2020-11-30  9:50                     ` Hans de Goede
@ 2020-11-30 12:58                       ` Tom Yan
  2020-11-30 13:23                         ` Hans de Goede
  0 siblings, 1 reply; 37+ messages in thread
From: Tom Yan @ 2020-11-30 12:58 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Christoph Hellwig, Greg KH, linux-usb, Mathias Nyman,
	Linux Kernel Mailing List, linux-pci, Lu Baolu

It's merely a moving of comment moving for/and a no-behavioral-change
adaptation for the reversion. Similar has been done in the equivalent
patch for the UAS driver (and the reason is stated there).

On Mon, 30 Nov 2020 at 17:50, Hans de Goede <hdegoede@redhat.com> wrote:
>
> Hi,
>
> On 11/28/20 4:48 PM, Tom Yan wrote:
> > While the change only seemed to have caused issue on uas drives, we
> > probably want to avoid it in the usb-storage driver as well, until
> > we are sure it is the right thing to do.
> >
> > Signed-off-by: Tom Yan <tom.ty89@gmail.com>
>
> This seems to do a whole lot more then just dropping back from
>  scsi_add_host_with_dma() to scsi_add_host(). This has way more
> lines then the orginal commit.
>
> IMHO it would be best to just revert commit 0154012f8018bba4d9971d1007c12ffd48539ddb
> and then submit these changes as a separate patch (which would be
> 5.11 material then).
>
> That separate patch could then also have a proper commit message
> explaining the other changes you are making, which is also not
> unimportant.
>
> Regards,
>
> Hans
>
>
>
>
> > ---
> >  drivers/usb/storage/scsiglue.c | 40 +++++++++++++++++-----------------
> >  drivers/usb/storage/usb.c      |  3 +--
> >  2 files changed, 21 insertions(+), 22 deletions(-)
> >
> > diff --git a/drivers/usb/storage/scsiglue.c b/drivers/usb/storage/scsiglue.c
> > index 560efd1479ba..6539bae1c188 100644
> > --- a/drivers/usb/storage/scsiglue.c
> > +++ b/drivers/usb/storage/scsiglue.c
> > @@ -92,7 +92,7 @@ static int slave_alloc (struct scsi_device *sdev)
> >  static int slave_configure(struct scsi_device *sdev)
> >  {
> >       struct us_data *us = host_to_us(sdev->host);
> > -     struct device *dev = sdev->host->dma_dev;
> > +     struct device *dev = us->pusb_dev->bus->sysdev;
> >
> >       /*
> >        * Many devices have trouble transferring more than 32KB at a time,
> > @@ -120,6 +120,25 @@ static int slave_configure(struct scsi_device *sdev)
> >                * better throughput on most devices.
> >                */
> >               blk_queue_max_hw_sectors(sdev->request_queue, 2048);
> > +     } else {
> > +             /*
> > +              * Limit the total size of a transfer to 120 KB.
> > +              *
> > +              * Some devices are known to choke with anything larger. It seems like
> > +              * the problem stems from the fact that original IDE controllers had
> > +              * only an 8-bit register to hold the number of sectors in one transfer
> > +              * and even those couldn't handle a full 256 sectors.
> > +              *
> > +              * Because we want to make sure we interoperate with as many devices as
> > +              * possible, we will maintain a 240 sector transfer size limit for USB
> > +              * Mass Storage devices.
> > +              *
> > +              * Tests show that other operating have similar limits with Microsoft
> > +              * Windows 7 limiting transfers to 128 sectors for both USB2 and USB3
> > +              * and Apple Mac OS X 10.11 limiting transfers to 256 sectors for USB2
> > +              * and 2048 for USB3 devices.
> > +              */
> > +             blk_queue_max_hw_sectors(sdev->request_queue, 240);
> >       }
> >
> >       /*
> > @@ -627,25 +646,6 @@ static const struct scsi_host_template usb_stor_host_template = {
> >       .sg_tablesize =                 SG_MAX_SEGMENTS,
> >
> >
> > -     /*
> > -      * Limit the total size of a transfer to 120 KB.
> > -      *
> > -      * Some devices are known to choke with anything larger. It seems like
> > -      * the problem stems from the fact that original IDE controllers had
> > -      * only an 8-bit register to hold the number of sectors in one transfer
> > -      * and even those couldn't handle a full 256 sectors.
> > -      *
> > -      * Because we want to make sure we interoperate with as many devices as
> > -      * possible, we will maintain a 240 sector transfer size limit for USB
> > -      * Mass Storage devices.
> > -      *
> > -      * Tests show that other operating have similar limits with Microsoft
> > -      * Windows 7 limiting transfers to 128 sectors for both USB2 and USB3
> > -      * and Apple Mac OS X 10.11 limiting transfers to 256 sectors for USB2
> > -      * and 2048 for USB3 devices.
> > -      */
> > -     .max_sectors =                  240,
> > -
> >       /* emulated HBA */
> >       .emulated =                     1,
> >
> > diff --git a/drivers/usb/storage/usb.c b/drivers/usb/storage/usb.c
> > index c2ef367cf257..f177da4ff1bc 100644
> > --- a/drivers/usb/storage/usb.c
> > +++ b/drivers/usb/storage/usb.c
> > @@ -1050,8 +1050,7 @@ int usb_stor_probe2(struct us_data *us)
> >       usb_autopm_get_interface_no_resume(us->pusb_intf);
> >       snprintf(us->scsi_name, sizeof(us->scsi_name), "usb-storage %s",
> >                                       dev_name(dev));
> > -     result = scsi_add_host_with_dma(us_to_host(us), dev,
> > -                                     us->pusb_dev->bus->sysdev);
> > +     result = scsi_add_host(us_to_host(us), dev);
> >       if (result) {
> >               dev_warn(dev,
> >                               "Unable to add the scsi host\n");
> >
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/2] usb-storage: revert from scsi_add_host_with_dma() to scsi_add_host()
  2020-11-30 12:58                       ` Tom Yan
@ 2020-11-30 13:23                         ` Hans de Goede
  2020-11-30 13:30                           ` Greg KH
  2020-11-30 14:39                           ` Tom Yan
  0 siblings, 2 replies; 37+ messages in thread
From: Hans de Goede @ 2020-11-30 13:23 UTC (permalink / raw)
  To: Tom Yan, Alan Stern
  Cc: Christoph Hellwig, Greg KH, linux-usb, Mathias Nyman,
	Linux Kernel Mailing List, linux-pci, Lu Baolu

Hi,

On 11/30/20 1:58 PM, Tom Yan wrote:
> It's merely a moving of comment moving for/and a no-behavioral-change
> adaptation for the reversion.>

IMHO the revert of the troublesome commit and the other/new changes really
should be 2 separate commits. But I will let Alan and Greg have the final
verdict on this.

p.s. Why did you not send this patch-series to Alan Stern, the maintainer of
the usb-storage driver?

> Similar has been done in the equivalent
> patch for the UAS driver (and the reason is stated there).

In the UAS driver the code setting max-hw-sectors was already moved to its
new place and another patch was added on top, so that is different.

Regards,

Hans


> 
> On Mon, 30 Nov 2020 at 17:50, Hans de Goede <hdegoede@redhat.com> wrote:
>>
>> Hi,
>>
>> On 11/28/20 4:48 PM, Tom Yan wrote:
>>> While the change only seemed to have caused issue on uas drives, we
>>> probably want to avoid it in the usb-storage driver as well, until
>>> we are sure it is the right thing to do.
>>>
>>> Signed-off-by: Tom Yan <tom.ty89@gmail.com>
>>
>> This seems to do a whole lot more then just dropping back from
>>  scsi_add_host_with_dma() to scsi_add_host(). This has way more
>> lines then the orginal commit.
>>
>> IMHO it would be best to just revert commit 0154012f8018bba4d9971d1007c12ffd48539ddb
>> and then submit these changes as a separate patch (which would be
>> 5.11 material then).
>>
>> That separate patch could then also have a proper commit message
>> explaining the other changes you are making, which is also not
>> unimportant.
>>
>> Regards,
>>
>> Hans
>>
>>
>>
>>
>>> ---
>>>  drivers/usb/storage/scsiglue.c | 40 +++++++++++++++++-----------------
>>>  drivers/usb/storage/usb.c      |  3 +--
>>>  2 files changed, 21 insertions(+), 22 deletions(-)
>>>
>>> diff --git a/drivers/usb/storage/scsiglue.c b/drivers/usb/storage/scsiglue.c
>>> index 560efd1479ba..6539bae1c188 100644
>>> --- a/drivers/usb/storage/scsiglue.c
>>> +++ b/drivers/usb/storage/scsiglue.c
>>> @@ -92,7 +92,7 @@ static int slave_alloc (struct scsi_device *sdev)
>>>  static int slave_configure(struct scsi_device *sdev)
>>>  {
>>>       struct us_data *us = host_to_us(sdev->host);
>>> -     struct device *dev = sdev->host->dma_dev;
>>> +     struct device *dev = us->pusb_dev->bus->sysdev;
>>>
>>>       /*
>>>        * Many devices have trouble transferring more than 32KB at a time,
>>> @@ -120,6 +120,25 @@ static int slave_configure(struct scsi_device *sdev)
>>>                * better throughput on most devices.
>>>                */
>>>               blk_queue_max_hw_sectors(sdev->request_queue, 2048);
>>> +     } else {
>>> +             /*
>>> +              * Limit the total size of a transfer to 120 KB.
>>> +              *
>>> +              * Some devices are known to choke with anything larger. It seems like
>>> +              * the problem stems from the fact that original IDE controllers had
>>> +              * only an 8-bit register to hold the number of sectors in one transfer
>>> +              * and even those couldn't handle a full 256 sectors.
>>> +              *
>>> +              * Because we want to make sure we interoperate with as many devices as
>>> +              * possible, we will maintain a 240 sector transfer size limit for USB
>>> +              * Mass Storage devices.
>>> +              *
>>> +              * Tests show that other operating have similar limits with Microsoft
>>> +              * Windows 7 limiting transfers to 128 sectors for both USB2 and USB3
>>> +              * and Apple Mac OS X 10.11 limiting transfers to 256 sectors for USB2
>>> +              * and 2048 for USB3 devices.
>>> +              */
>>> +             blk_queue_max_hw_sectors(sdev->request_queue, 240);
>>>       }
>>>
>>>       /*
>>> @@ -627,25 +646,6 @@ static const struct scsi_host_template usb_stor_host_template = {
>>>       .sg_tablesize =                 SG_MAX_SEGMENTS,
>>>
>>>
>>> -     /*
>>> -      * Limit the total size of a transfer to 120 KB.
>>> -      *
>>> -      * Some devices are known to choke with anything larger. It seems like
>>> -      * the problem stems from the fact that original IDE controllers had
>>> -      * only an 8-bit register to hold the number of sectors in one transfer
>>> -      * and even those couldn't handle a full 256 sectors.
>>> -      *
>>> -      * Because we want to make sure we interoperate with as many devices as
>>> -      * possible, we will maintain a 240 sector transfer size limit for USB
>>> -      * Mass Storage devices.
>>> -      *
>>> -      * Tests show that other operating have similar limits with Microsoft
>>> -      * Windows 7 limiting transfers to 128 sectors for both USB2 and USB3
>>> -      * and Apple Mac OS X 10.11 limiting transfers to 256 sectors for USB2
>>> -      * and 2048 for USB3 devices.
>>> -      */
>>> -     .max_sectors =                  240,
>>> -
>>>       /* emulated HBA */
>>>       .emulated =                     1,
>>>
>>> diff --git a/drivers/usb/storage/usb.c b/drivers/usb/storage/usb.c
>>> index c2ef367cf257..f177da4ff1bc 100644
>>> --- a/drivers/usb/storage/usb.c
>>> +++ b/drivers/usb/storage/usb.c
>>> @@ -1050,8 +1050,7 @@ int usb_stor_probe2(struct us_data *us)
>>>       usb_autopm_get_interface_no_resume(us->pusb_intf);
>>>       snprintf(us->scsi_name, sizeof(us->scsi_name), "usb-storage %s",
>>>                                       dev_name(dev));
>>> -     result = scsi_add_host_with_dma(us_to_host(us), dev,
>>> -                                     us->pusb_dev->bus->sysdev);
>>> +     result = scsi_add_host(us_to_host(us), dev);
>>>       if (result) {
>>>               dev_warn(dev,
>>>                               "Unable to add the scsi host\n");
>>>
>>
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/2] usb-storage: revert from scsi_add_host_with_dma() to scsi_add_host()
  2020-11-30 13:23                         ` Hans de Goede
@ 2020-11-30 13:30                           ` Greg KH
  2020-11-30 13:36                             ` Hans de Goede
  2020-11-30 14:39                           ` Tom Yan
  1 sibling, 1 reply; 37+ messages in thread
From: Greg KH @ 2020-11-30 13:30 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Tom Yan, Alan Stern, Christoph Hellwig, linux-usb, Mathias Nyman,
	Linux Kernel Mailing List, linux-pci, Lu Baolu

On Mon, Nov 30, 2020 at 02:23:48PM +0100, Hans de Goede wrote:
> Hi,
> 
> On 11/30/20 1:58 PM, Tom Yan wrote:
> > It's merely a moving of comment moving for/and a no-behavioral-change
> > adaptation for the reversion.>
> 
> IMHO the revert of the troublesome commit and the other/new changes really
> should be 2 separate commits. But I will let Alan and Greg have the final
> verdict on this.

I would prefer to just revert the commits and not do anything
different/special here so late in the release cycle.

So, if Alan agrees, I'll be glad to do them on my end, I just need the
commit ids for them.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/2] usb-storage: revert from scsi_add_host_with_dma() to scsi_add_host()
  2020-11-30 13:30                           ` Greg KH
@ 2020-11-30 13:36                             ` Hans de Goede
  2020-11-30 13:53                               ` Greg KH
  2020-11-30 17:20                               ` Alan Stern
  0 siblings, 2 replies; 37+ messages in thread
From: Hans de Goede @ 2020-11-30 13:36 UTC (permalink / raw)
  To: Greg KH
  Cc: Tom Yan, Alan Stern, Christoph Hellwig, linux-usb, Mathias Nyman,
	Linux Kernel Mailing List, linux-pci, Lu Baolu

Hi,

On 11/30/20 2:30 PM, Greg KH wrote:
> On Mon, Nov 30, 2020 at 02:23:48PM +0100, Hans de Goede wrote:
>> Hi,
>>
>> On 11/30/20 1:58 PM, Tom Yan wrote:
>>> It's merely a moving of comment moving for/and a no-behavioral-change
>>> adaptation for the reversion.>
>>
>> IMHO the revert of the troublesome commit and the other/new changes really
>> should be 2 separate commits. But I will let Alan and Greg have the final
>> verdict on this.
> 
> I would prefer to just revert the commits and not do anything
> different/special here so late in the release cycle.
> 
> So, if Alan agrees, I'll be glad to do them on my end, I just need the
> commit ids for them.

The troublesome commit are (in reverse, so revert, order):

5df7ef7d32fe ("uas: bump hw_max_sectors to 2048 blocks for SS or faster drives")
558033c2828f ("uas: fix sdev->host->dma_dev")
0154012f8018 ("usb-storage: fix sdev->host->dma_dev")

Alan, the reason for reverting these is that using scsi_add_host_with_dma() as the
last 2 patches do, with the dmadev argument of that call pointing to the device
for the XHCI controller is causing changes to the DMA settings of the XHCI controller
itself which is causing regressions in 5.10, see this email thread:

https://lore.kernel.org/linux-usb/fde7e11f-5dfc-8348-c134-a21cb1116285@redhat.com/T/#t

Regards,

Hans


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/2] usb-storage: revert from scsi_add_host_with_dma() to scsi_add_host()
  2020-11-30 13:36                             ` Hans de Goede
@ 2020-11-30 13:53                               ` Greg KH
  2020-11-30 13:55                                 ` Hans de Goede
  2020-11-30 17:20                               ` Alan Stern
  1 sibling, 1 reply; 37+ messages in thread
From: Greg KH @ 2020-11-30 13:53 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Tom Yan, Alan Stern, Christoph Hellwig, linux-usb, Mathias Nyman,
	Linux Kernel Mailing List, linux-pci, Lu Baolu

On Mon, Nov 30, 2020 at 02:36:38PM +0100, Hans de Goede wrote:
> Hi,
> 
> On 11/30/20 2:30 PM, Greg KH wrote:
> > On Mon, Nov 30, 2020 at 02:23:48PM +0100, Hans de Goede wrote:
> >> Hi,
> >>
> >> On 11/30/20 1:58 PM, Tom Yan wrote:
> >>> It's merely a moving of comment moving for/and a no-behavioral-change
> >>> adaptation for the reversion.>
> >>
> >> IMHO the revert of the troublesome commit and the other/new changes really
> >> should be 2 separate commits. But I will let Alan and Greg have the final
> >> verdict on this.
> > 
> > I would prefer to just revert the commits and not do anything
> > different/special here so late in the release cycle.
> > 
> > So, if Alan agrees, I'll be glad to do them on my end, I just need the
> > commit ids for them.
> 
> The troublesome commit are (in reverse, so revert, order):
> 
> 5df7ef7d32fe ("uas: bump hw_max_sectors to 2048 blocks for SS or faster drives")
> 558033c2828f ("uas: fix sdev->host->dma_dev")
> 0154012f8018 ("usb-storage: fix sdev->host->dma_dev")
> 
> Alan, the reason for reverting these is that using scsi_add_host_with_dma() as the
> last 2 patches do, with the dmadev argument of that call pointing to the device
> for the XHCI controller is causing changes to the DMA settings of the XHCI controller
> itself which is causing regressions in 5.10, see this email thread:
> 
> https://lore.kernel.org/linux-usb/fde7e11f-5dfc-8348-c134-a21cb1116285@redhat.com/T/#t

Thanks, I'll wait for Alan to respond, but I think just reverting these
is the best solution at this point in time.  You have tested those
reverts, solve this, right?  If so, can I get a "Tested-by:"? 

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/2] usb-storage: revert from scsi_add_host_with_dma() to scsi_add_host()
  2020-11-30 13:53                               ` Greg KH
@ 2020-11-30 13:55                                 ` Hans de Goede
  2020-12-04 15:02                                   ` Greg KH
  0 siblings, 1 reply; 37+ messages in thread
From: Hans de Goede @ 2020-11-30 13:55 UTC (permalink / raw)
  To: Greg KH
  Cc: Tom Yan, Alan Stern, Christoph Hellwig, linux-usb, Mathias Nyman,
	Linux Kernel Mailing List, linux-pci, Lu Baolu

Hi,

On 11/30/20 2:53 PM, Greg KH wrote:
> On Mon, Nov 30, 2020 at 02:36:38PM +0100, Hans de Goede wrote:
>> Hi,
>>
>> On 11/30/20 2:30 PM, Greg KH wrote:
>>> On Mon, Nov 30, 2020 at 02:23:48PM +0100, Hans de Goede wrote:
>>>> Hi,
>>>>
>>>> On 11/30/20 1:58 PM, Tom Yan wrote:
>>>>> It's merely a moving of comment moving for/and a no-behavioral-change
>>>>> adaptation for the reversion.>
>>>>
>>>> IMHO the revert of the troublesome commit and the other/new changes really
>>>> should be 2 separate commits. But I will let Alan and Greg have the final
>>>> verdict on this.
>>>
>>> I would prefer to just revert the commits and not do anything
>>> different/special here so late in the release cycle.
>>>
>>> So, if Alan agrees, I'll be glad to do them on my end, I just need the
>>> commit ids for them.
>>
>> The troublesome commit are (in reverse, so revert, order):
>>
>> 5df7ef7d32fe ("uas: bump hw_max_sectors to 2048 blocks for SS or faster drives")
>> 558033c2828f ("uas: fix sdev->host->dma_dev")
>> 0154012f8018 ("usb-storage: fix sdev->host->dma_dev")
>>
>> Alan, the reason for reverting these is that using scsi_add_host_with_dma() as the
>> last 2 patches do, with the dmadev argument of that call pointing to the device
>> for the XHCI controller is causing changes to the DMA settings of the XHCI controller
>> itself which is causing regressions in 5.10, see this email thread:
>>
>> https://lore.kernel.org/linux-usb/fde7e11f-5dfc-8348-c134-a21cb1116285@redhat.com/T/#t
> 
> Thanks, I'll wait for Alan to respond, but I think just reverting these
> is the best solution at this point in time.  You have tested those
> reverts, solve this, right?  If so, can I get a "Tested-by:"? 

Yes that was my first solution to this problem and I can confirm that that fixes
the regression:

Tested-by: Hans de Goede <hdegoede@redhat.com>

Regards,

Hans


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/2] usb-storage: revert from scsi_add_host_with_dma() to scsi_add_host()
  2020-11-30 13:23                         ` Hans de Goede
  2020-11-30 13:30                           ` Greg KH
@ 2020-11-30 14:39                           ` Tom Yan
  1 sibling, 0 replies; 37+ messages in thread
From: Tom Yan @ 2020-11-30 14:39 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Alan Stern, Christoph Hellwig, Greg KH, linux-usb, Mathias Nyman,
	Linux Kernel Mailing List, linux-pci, Lu Baolu

On Mon, 30 Nov 2020 at 21:23, Hans de Goede <hdegoede@redhat.com> wrote:
>
> Hi,
>
> On 11/30/20 1:58 PM, Tom Yan wrote:
>
> IMHO the revert of the troublesome commit and the other/new changes really
> should be 2 separate commits. But I will let Alan and Greg have the final
> verdict on this.

They are not "other/new" changes. The same thing was done in the
earlier version of the problematic commits, before I was given the
idea that we can/should set dma_dev to the "chosen device". With the
dma_dev setting approach the exact same clamping will be applied twice
at different points so we don't have to invalidate the earlier one.
But now since we no longer do so, the two clamping are / can be
different, so we need to invalidate the earlier one when we are not
overriding the default max_sectors in each case.

>
> p.s. Why did you not send this patch-series to Alan Stern, the maintainer of
> the usb-storage driver?

Either I accidentally missed him or the list I copied from did that
already. Sorry.

>
> > Similar has been done in the equivalent
> > patch for the UAS driver (and the reason is stated there).
>
> In the UAS driver the code setting max-hw-sectors was already moved to its
> new place and another patch was added on top, so that is different.

If you are referring to the alloc/configure move in the problematic
commit, it was a trivial / code consistency change. It has nothing to
do with what I'm talking about. What I'm talking about is the
else-clause add in the first patch of the current series. I don't know
if you simply missed it or it just seemed much trivial to you, either
way it was "simple" there merely because the uas driver doesn't set
its own "default" max_sectors (and hence has no comment for it) but
use the one set in the SCSI layer. Most importantly, it's an
*adapation* that makes these patches change *nothing* from the current
behaviour but only switches back to scsi_add_host().

>
> Regards,
>
> Hans
>
>
> >
> > On Mon, 30 Nov 2020 at 17:50, Hans de Goede <hdegoede@redhat.com> wrote:
> >>
> >> Hi,
> >>
> >> On 11/28/20 4:48 PM, Tom Yan wrote:
> >>> While the change only seemed to have caused issue on uas drives, we
> >>> probably want to avoid it in the usb-storage driver as well, until
> >>> we are sure it is the right thing to do.
> >>>
> >>> Signed-off-by: Tom Yan <tom.ty89@gmail.com>
> >>
> >> This seems to do a whole lot more then just dropping back from
> >>  scsi_add_host_with_dma() to scsi_add_host(). This has way more
> >> lines then the orginal commit.
> >>
> >> IMHO it would be best to just revert commit 0154012f8018bba4d9971d1007c12ffd48539ddb
> >> and then submit these changes as a separate patch (which would be
> >> 5.11 material then).
> >>
> >> That separate patch could then also have a proper commit message
> >> explaining the other changes you are making, which is also not
> >> unimportant.
> >>
> >> Regards,
> >>
> >> Hans
> >>
> >>
> >>
> >>
> >>> ---
> >>>  drivers/usb/storage/scsiglue.c | 40 +++++++++++++++++-----------------
> >>>  drivers/usb/storage/usb.c      |  3 +--
> >>>  2 files changed, 21 insertions(+), 22 deletions(-)
> >>>
> >>> diff --git a/drivers/usb/storage/scsiglue.c b/drivers/usb/storage/scsiglue.c
> >>> index 560efd1479ba..6539bae1c188 100644
> >>> --- a/drivers/usb/storage/scsiglue.c
> >>> +++ b/drivers/usb/storage/scsiglue.c
> >>> @@ -92,7 +92,7 @@ static int slave_alloc (struct scsi_device *sdev)
> >>>  static int slave_configure(struct scsi_device *sdev)
> >>>  {
> >>>       struct us_data *us = host_to_us(sdev->host);
> >>> -     struct device *dev = sdev->host->dma_dev;
> >>> +     struct device *dev = us->pusb_dev->bus->sysdev;
> >>>
> >>>       /*
> >>>        * Many devices have trouble transferring more than 32KB at a time,
> >>> @@ -120,6 +120,25 @@ static int slave_configure(struct scsi_device *sdev)
> >>>                * better throughput on most devices.
> >>>                */
> >>>               blk_queue_max_hw_sectors(sdev->request_queue, 2048);
> >>> +     } else {
> >>> +             /*
> >>> +              * Limit the total size of a transfer to 120 KB.
> >>> +              *
> >>> +              * Some devices are known to choke with anything larger. It seems like
> >>> +              * the problem stems from the fact that original IDE controllers had
> >>> +              * only an 8-bit register to hold the number of sectors in one transfer
> >>> +              * and even those couldn't handle a full 256 sectors.
> >>> +              *
> >>> +              * Because we want to make sure we interoperate with as many devices as
> >>> +              * possible, we will maintain a 240 sector transfer size limit for USB
> >>> +              * Mass Storage devices.
> >>> +              *
> >>> +              * Tests show that other operating have similar limits with Microsoft
> >>> +              * Windows 7 limiting transfers to 128 sectors for both USB2 and USB3
> >>> +              * and Apple Mac OS X 10.11 limiting transfers to 256 sectors for USB2
> >>> +              * and 2048 for USB3 devices.
> >>> +              */
> >>> +             blk_queue_max_hw_sectors(sdev->request_queue, 240);
> >>>       }
> >>>
> >>>       /*
> >>> @@ -627,25 +646,6 @@ static const struct scsi_host_template usb_stor_host_template = {
> >>>       .sg_tablesize =                 SG_MAX_SEGMENTS,
> >>>
> >>>
> >>> -     /*
> >>> -      * Limit the total size of a transfer to 120 KB.
> >>> -      *
> >>> -      * Some devices are known to choke with anything larger. It seems like
> >>> -      * the problem stems from the fact that original IDE controllers had
> >>> -      * only an 8-bit register to hold the number of sectors in one transfer
> >>> -      * and even those couldn't handle a full 256 sectors.
> >>> -      *
> >>> -      * Because we want to make sure we interoperate with as many devices as
> >>> -      * possible, we will maintain a 240 sector transfer size limit for USB
> >>> -      * Mass Storage devices.
> >>> -      *
> >>> -      * Tests show that other operating have similar limits with Microsoft
> >>> -      * Windows 7 limiting transfers to 128 sectors for both USB2 and USB3
> >>> -      * and Apple Mac OS X 10.11 limiting transfers to 256 sectors for USB2
> >>> -      * and 2048 for USB3 devices.
> >>> -      */
> >>> -     .max_sectors =                  240,
> >>> -
> >>>       /* emulated HBA */
> >>>       .emulated =                     1,
> >>>
> >>> diff --git a/drivers/usb/storage/usb.c b/drivers/usb/storage/usb.c
> >>> index c2ef367cf257..f177da4ff1bc 100644
> >>> --- a/drivers/usb/storage/usb.c
> >>> +++ b/drivers/usb/storage/usb.c
> >>> @@ -1050,8 +1050,7 @@ int usb_stor_probe2(struct us_data *us)
> >>>       usb_autopm_get_interface_no_resume(us->pusb_intf);
> >>>       snprintf(us->scsi_name, sizeof(us->scsi_name), "usb-storage %s",
> >>>                                       dev_name(dev));
> >>> -     result = scsi_add_host_with_dma(us_to_host(us), dev,
> >>> -                                     us->pusb_dev->bus->sysdev);
> >>> +     result = scsi_add_host(us_to_host(us), dev);
> >>>       if (result) {
> >>>               dev_warn(dev,
> >>>                               "Unable to add the scsi host\n");
> >>>
> >>
> >
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/2] usb-storage: revert from scsi_add_host_with_dma() to scsi_add_host()
  2020-11-30 13:36                             ` Hans de Goede
  2020-11-30 13:53                               ` Greg KH
@ 2020-11-30 17:20                               ` Alan Stern
  2020-11-30 17:24                                 ` Christoph Hellwig
  2020-11-30 18:18                                 ` Hans de Goede
  1 sibling, 2 replies; 37+ messages in thread
From: Alan Stern @ 2020-11-30 17:20 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Greg KH, Tom Yan, Christoph Hellwig, linux-usb, Mathias Nyman,
	Linux Kernel Mailing List, linux-pci, Lu Baolu

On Mon, Nov 30, 2020 at 02:36:38PM +0100, Hans de Goede wrote:
> Hi,
> 
> On 11/30/20 2:30 PM, Greg KH wrote:
> > On Mon, Nov 30, 2020 at 02:23:48PM +0100, Hans de Goede wrote:
> >> Hi,
> >>
> >> On 11/30/20 1:58 PM, Tom Yan wrote:
> >>> It's merely a moving of comment moving for/and a no-behavioral-change
> >>> adaptation for the reversion.>
> >>
> >> IMHO the revert of the troublesome commit and the other/new changes really
> >> should be 2 separate commits. But I will let Alan and Greg have the final
> >> verdict on this.
> > 
> > I would prefer to just revert the commits and not do anything
> > different/special here so late in the release cycle.
> > 
> > So, if Alan agrees, I'll be glad to do them on my end, I just need the
> > commit ids for them.
> 
> The troublesome commit are (in reverse, so revert, order):
> 
> 5df7ef7d32fe ("uas: bump hw_max_sectors to 2048 blocks for SS or faster drives")
> 558033c2828f ("uas: fix sdev->host->dma_dev")
> 0154012f8018 ("usb-storage: fix sdev->host->dma_dev")
> 
> Alan, the reason for reverting these is that using scsi_add_host_with_dma() as the
> last 2 patches do, with the dmadev argument of that call pointing to the device
> for the XHCI controller is causing changes to the DMA settings of the XHCI controller
> itself which is causing regressions in 5.10, see this email thread:
> 
> https://lore.kernel.org/linux-usb/fde7e11f-5dfc-8348-c134-a21cb1116285@redhat.com/T/#t

It's hard to go wrong with reverting, so it's okay with me.

Still, Hans, have you checked out the difference between the 
scsi_add_host() and scsi_add_host_with_dma() calls?  It's just a matter 
of using dev vs. sysdev.  In particular, have you checked to see what 
those two devices are on your system?

It seems likely that if one of those calls messes up some DMA settings, 
the other one does too -- just maybe not settings that matter much.

Alan Stern

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/2] usb-storage: revert from scsi_add_host_with_dma() to scsi_add_host()
  2020-11-30 17:20                               ` Alan Stern
@ 2020-11-30 17:24                                 ` Christoph Hellwig
  2020-11-30 18:18                                 ` Hans de Goede
  1 sibling, 0 replies; 37+ messages in thread
From: Christoph Hellwig @ 2020-11-30 17:24 UTC (permalink / raw)
  To: Alan Stern
  Cc: Hans de Goede, Greg KH, Tom Yan, Christoph Hellwig, linux-usb,
	Mathias Nyman, Linux Kernel Mailing List, linux-pci, Lu Baolu

On Mon, Nov 30, 2020 at 12:20:04PM -0500, Alan Stern wrote:
> > https://lore.kernel.org/linux-usb/fde7e11f-5dfc-8348-c134-a21cb1116285@redhat.com/T/#t
> 
> It's hard to go wrong with reverting, so it's okay with me.
> 
> Still, Hans, have you checked out the difference between the 
> scsi_add_host() and scsi_add_host_with_dma() calls?  It's just a matter 
> of using dev vs. sysdev.  In particular, have you checked to see what 
> those two devices are on your system?
> 
> It seems likely that if one of those calls messes up some DMA settings, 
> the other one does too -- just maybe not settings that matter much.

The effects from scsi_add_host_with_dma should be:

 (1) picking which device is used for the SCSI dma map helpers
 (2) use dma_max_mapping_size() to limite the I/O size

The helpers affected by (1) are not used by UAS (or usb-storage for that
matter), while we do have a real bug in the intel-iommu with bounce
buffering implementation used in the bug report.  So my clear bet is on
(2) not limiting the size, but the patch that would have fixed this
did not make a different for Hans, which leaves me a little confused.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/2] usb-storage: revert from scsi_add_host_with_dma() to scsi_add_host()
  2020-11-30 17:20                               ` Alan Stern
  2020-11-30 17:24                                 ` Christoph Hellwig
@ 2020-11-30 18:18                                 ` Hans de Goede
  2020-11-30 18:57                                   ` Tom Yan
  1 sibling, 1 reply; 37+ messages in thread
From: Hans de Goede @ 2020-11-30 18:18 UTC (permalink / raw)
  To: Alan Stern
  Cc: Greg KH, Tom Yan, Christoph Hellwig, linux-usb, Mathias Nyman,
	Linux Kernel Mailing List, linux-pci, Lu Baolu

Hi,

On 11/30/20 6:20 PM, Alan Stern wrote:
> On Mon, Nov 30, 2020 at 02:36:38PM +0100, Hans de Goede wrote:
>> Hi,
>>
>> On 11/30/20 2:30 PM, Greg KH wrote:
>>> On Mon, Nov 30, 2020 at 02:23:48PM +0100, Hans de Goede wrote:
>>>> Hi,
>>>>
>>>> On 11/30/20 1:58 PM, Tom Yan wrote:
>>>>> It's merely a moving of comment moving for/and a no-behavioral-change
>>>>> adaptation for the reversion.>
>>>>
>>>> IMHO the revert of the troublesome commit and the other/new changes really
>>>> should be 2 separate commits. But I will let Alan and Greg have the final
>>>> verdict on this.
>>>
>>> I would prefer to just revert the commits and not do anything
>>> different/special here so late in the release cycle.
>>>
>>> So, if Alan agrees, I'll be glad to do them on my end, I just need the
>>> commit ids for them.
>>
>> The troublesome commit are (in reverse, so revert, order):
>>
>> 5df7ef7d32fe ("uas: bump hw_max_sectors to 2048 blocks for SS or faster drives")
>> 558033c2828f ("uas: fix sdev->host->dma_dev")
>> 0154012f8018 ("usb-storage: fix sdev->host->dma_dev")
>>
>> Alan, the reason for reverting these is that using scsi_add_host_with_dma() as the
>> last 2 patches do, with the dmadev argument of that call pointing to the device
>> for the XHCI controller is causing changes to the DMA settings of the XHCI controller
>> itself which is causing regressions in 5.10, see this email thread:
>>
>> https://lore.kernel.org/linux-usb/fde7e11f-5dfc-8348-c134-a21cb1116285@redhat.com/T/#t
> 
> It's hard to go wrong with reverting, so it's okay with me.
> 
> Still, Hans, have you checked out the difference between the 
> scsi_add_host() and scsi_add_host_with_dma() calls?  It's just a matter 
> of using dev vs. sysdev.  In particular, have you checked to see what 
> those two devices are on your system?

Its not just dev vs sysdev, its iface->dev vs bus->sysdev, and I assume
that the latter is actually the XHCI controller.

my vote goes to reverting to avoid the regression for 5.10, esp. since
this is a clean revert of 3 patches with nothing depending / building
on top of the reverted commits.

Then for 5.11 we can retry to introduce similar changes. I would be happy
to try a new patch-set for 5.11.

> It seems likely that if one of those calls messes up some DMA settings, 
> the other one does too -- just maybe not settings that matter much.

I'm not very familiar with all the DMA mapping / mask code, but AFAIK making
changes to the DMA settings of a child will not influence the parent.

Where as when passing bus->sysdev, then changes are made to a device
which is shared with other devices on the bus, which is why we see
a regression in an USB NIC driver being triggered by the UAS driver
binding to a device (on the same bus).

At least that is my interpretation of this. I bisected the regression
and that pointed at the UAS DMA change and reverting it fixes things,
confirming that I did not make any mistakes during the bisect.

Regards,

Hans


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/2] usb-storage: revert from scsi_add_host_with_dma() to scsi_add_host()
  2020-11-30 18:18                                 ` Hans de Goede
@ 2020-11-30 18:57                                   ` Tom Yan
  2020-11-30 19:01                                     ` Tom Yan
  0 siblings, 1 reply; 37+ messages in thread
From: Tom Yan @ 2020-11-30 18:57 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Alan Stern, Greg KH, Christoph Hellwig, linux-usb, Mathias Nyman,
	Linux Kernel Mailing List, linux-pci, Lu Baolu

This maybe? https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/scsi_lib.c?h=v5.10-rc6#n1816

UAS:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/storage/uas.c?h=v5.10-rc6#n918
BOT (AFAICT):
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/hosts.c?h=v5.10-rc6#n466

It would explain why the issue is only triggered with UAS drives.

The questions (from me) are:
1. From the scsi layer POV (as per what __scsi_init_queue() does),
what/which should we use as dma_dev?
2. Do we really need to set dma_boundary in the UAS host template (to
PAGE_SIZE - 1)?
3. Kind of the same question as #1: when we clamp hw_max_sectors to
dma max mapping size, should the size actually be "the smaller one
among dev and sysdev"? Or is one of the two sizes *always* the smaller
one?


On Tue, 1 Dec 2020 at 02:19, Hans de Goede <hdegoede@redhat.com> wrote:
>
> Hi,
>
> On 11/30/20 6:20 PM, Alan Stern wrote:
> > On Mon, Nov 30, 2020 at 02:36:38PM +0100, Hans de Goede wrote:
> >> Hi,
> >>
> >> On 11/30/20 2:30 PM, Greg KH wrote:
> >>> On Mon, Nov 30, 2020 at 02:23:48PM +0100, Hans de Goede wrote:
> >>>> Hi,
> >>>>
> >>>> On 11/30/20 1:58 PM, Tom Yan wrote:
> >>>>> It's merely a moving of comment moving for/and a no-behavioral-change
> >>>>> adaptation for the reversion.>
> >>>>
> >>>> IMHO the revert of the troublesome commit and the other/new changes really
> >>>> should be 2 separate commits. But I will let Alan and Greg have the final
> >>>> verdict on this.
> >>>
> >>> I would prefer to just revert the commits and not do anything
> >>> different/special here so late in the release cycle.
> >>>
> >>> So, if Alan agrees, I'll be glad to do them on my end, I just need the
> >>> commit ids for them.
> >>
> >> The troublesome commit are (in reverse, so revert, order):
> >>
> >> 5df7ef7d32fe ("uas: bump hw_max_sectors to 2048 blocks for SS or faster drives")
> >> 558033c2828f ("uas: fix sdev->host->dma_dev")
> >> 0154012f8018 ("usb-storage: fix sdev->host->dma_dev")
> >>
> >> Alan, the reason for reverting these is that using scsi_add_host_with_dma() as the
> >> last 2 patches do, with the dmadev argument of that call pointing to the device
> >> for the XHCI controller is causing changes to the DMA settings of the XHCI controller
> >> itself which is causing regressions in 5.10, see this email thread:
> >>
> >> https://lore.kernel.org/linux-usb/fde7e11f-5dfc-8348-c134-a21cb1116285@redhat.com/T/#t
> >
> > It's hard to go wrong with reverting, so it's okay with me.
> >
> > Still, Hans, have you checked out the difference between the
> > scsi_add_host() and scsi_add_host_with_dma() calls?  It's just a matter
> > of using dev vs. sysdev.  In particular, have you checked to see what
> > those two devices are on your system?
>
> Its not just dev vs sysdev, its iface->dev vs bus->sysdev, and I assume
> that the latter is actually the XHCI controller.
>
> my vote goes to reverting to avoid the regression for 5.10, esp. since
> this is a clean revert of 3 patches with nothing depending / building
> on top of the reverted commits.
>
> Then for 5.11 we can retry to introduce similar changes. I would be happy
> to try a new patch-set for 5.11.
>
> > It seems likely that if one of those calls messes up some DMA settings,
> > the other one does too -- just maybe not settings that matter much.
>
> I'm not very familiar with all the DMA mapping / mask code, but AFAIK making
> changes to the DMA settings of a child will not influence the parent.
>
> Where as when passing bus->sysdev, then changes are made to a device
> which is shared with other devices on the bus, which is why we see
> a regression in an USB NIC driver being triggered by the UAS driver
> binding to a device (on the same bus).
>
> At least that is my interpretation of this. I bisected the regression
> and that pointed at the UAS DMA change and reverting it fixes things,
> confirming that I did not make any mistakes during the bisect.
>
> Regards,
>
> Hans
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/2] usb-storage: revert from scsi_add_host_with_dma() to scsi_add_host()
  2020-11-30 18:57                                   ` Tom Yan
@ 2020-11-30 19:01                                     ` Tom Yan
  2020-11-30 20:36                                       ` Alan Stern
  0 siblings, 1 reply; 37+ messages in thread
From: Tom Yan @ 2020-11-30 19:01 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Alan Stern, Greg KH, Christoph Hellwig, linux-usb, Mathias Nyman,
	Linux Kernel Mailing List, linux-pci, Lu Baolu

For the record,
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/scsi/scsi_host.h?h=v5.10-rc6#n753

On Tue, 1 Dec 2020 at 02:57, Tom Yan <tom.ty89@gmail.com> wrote:
>
> This maybe? https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/scsi_lib.c?h=v5.10-rc6#n1816
>
> UAS:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/storage/uas.c?h=v5.10-rc6#n918
> BOT (AFAICT):
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/hosts.c?h=v5.10-rc6#n466
>
> It would explain why the issue is only triggered with UAS drives.
>
> The questions (from me) are:
> 1. From the scsi layer POV (as per what __scsi_init_queue() does),
> what/which should we use as dma_dev?
> 2. Do we really need to set dma_boundary in the UAS host template (to
> PAGE_SIZE - 1)?
> 3. Kind of the same question as #1: when we clamp hw_max_sectors to
> dma max mapping size, should the size actually be "the smaller one
> among dev and sysdev"? Or is one of the two sizes *always* the smaller
> one?
>
>
> On Tue, 1 Dec 2020 at 02:19, Hans de Goede <hdegoede@redhat.com> wrote:
> >
> > Hi,
> >
> > On 11/30/20 6:20 PM, Alan Stern wrote:
> > > On Mon, Nov 30, 2020 at 02:36:38PM +0100, Hans de Goede wrote:
> > >> Hi,
> > >>
> > >> On 11/30/20 2:30 PM, Greg KH wrote:
> > >>> On Mon, Nov 30, 2020 at 02:23:48PM +0100, Hans de Goede wrote:
> > >>>> Hi,
> > >>>>
> > >>>> On 11/30/20 1:58 PM, Tom Yan wrote:
> > >>>>> It's merely a moving of comment moving for/and a no-behavioral-change
> > >>>>> adaptation for the reversion.>
> > >>>>
> > >>>> IMHO the revert of the troublesome commit and the other/new changes really
> > >>>> should be 2 separate commits. But I will let Alan and Greg have the final
> > >>>> verdict on this.
> > >>>
> > >>> I would prefer to just revert the commits and not do anything
> > >>> different/special here so late in the release cycle.
> > >>>
> > >>> So, if Alan agrees, I'll be glad to do them on my end, I just need the
> > >>> commit ids for them.
> > >>
> > >> The troublesome commit are (in reverse, so revert, order):
> > >>
> > >> 5df7ef7d32fe ("uas: bump hw_max_sectors to 2048 blocks for SS or faster drives")
> > >> 558033c2828f ("uas: fix sdev->host->dma_dev")
> > >> 0154012f8018 ("usb-storage: fix sdev->host->dma_dev")
> > >>
> > >> Alan, the reason for reverting these is that using scsi_add_host_with_dma() as the
> > >> last 2 patches do, with the dmadev argument of that call pointing to the device
> > >> for the XHCI controller is causing changes to the DMA settings of the XHCI controller
> > >> itself which is causing regressions in 5.10, see this email thread:
> > >>
> > >> https://lore.kernel.org/linux-usb/fde7e11f-5dfc-8348-c134-a21cb1116285@redhat.com/T/#t
> > >
> > > It's hard to go wrong with reverting, so it's okay with me.
> > >
> > > Still, Hans, have you checked out the difference between the
> > > scsi_add_host() and scsi_add_host_with_dma() calls?  It's just a matter
> > > of using dev vs. sysdev.  In particular, have you checked to see what
> > > those two devices are on your system?
> >
> > Its not just dev vs sysdev, its iface->dev vs bus->sysdev, and I assume
> > that the latter is actually the XHCI controller.
> >
> > my vote goes to reverting to avoid the regression for 5.10, esp. since
> > this is a clean revert of 3 patches with nothing depending / building
> > on top of the reverted commits.
> >
> > Then for 5.11 we can retry to introduce similar changes. I would be happy
> > to try a new patch-set for 5.11.
> >
> > > It seems likely that if one of those calls messes up some DMA settings,
> > > the other one does too -- just maybe not settings that matter much.
> >
> > I'm not very familiar with all the DMA mapping / mask code, but AFAIK making
> > changes to the DMA settings of a child will not influence the parent.
> >
> > Where as when passing bus->sysdev, then changes are made to a device
> > which is shared with other devices on the bus, which is why we see
> > a regression in an USB NIC driver being triggered by the UAS driver
> > binding to a device (on the same bus).
> >
> > At least that is my interpretation of this. I bisected the regression
> > and that pointed at the UAS DMA change and reverting it fixes things,
> > confirming that I did not make any mistakes during the bisect.
> >
> > Regards,
> >
> > Hans
> >

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/2] uas: revert from scsi_add_host_with_dma() to scsi_add_host()
  2020-11-30  9:48                   ` [PATCH 1/2] uas: " Hans de Goede
@ 2020-11-30 19:30                     ` Tom Yan
  2020-12-01 11:09                       ` Hans de Goede
  0 siblings, 1 reply; 37+ messages in thread
From: Tom Yan @ 2020-11-30 19:30 UTC (permalink / raw)
  To: Hans de Goede, Alan Stern
  Cc: Christoph Hellwig, Greg KH, linux-usb, Mathias Nyman,
	Linux Kernel Mailing List, linux-pci, Lu Baolu

Hmm, I wonder if I/we wrongly assumed that the dma_dev used for the
hw_max_sectors clamping in __scsi_init_queue() is wrong.

So instead of adding a fallback else-clause here or using "sysdev" as
dma_dev like in the current upstream code, maybe we should actually do
a three-way min: the "changed" hw_max_sectors, dma_max_mapping_size of
dma_dev("dev") and dma_max_mapping_size of sysdev...?

On Mon, 30 Nov 2020 at 17:48, Hans de Goede <hdegoede@redhat.com> wrote:
>
> Hi,
>
> On 11/28/20 4:48 PM, Tom Yan wrote:
> > Apparently the former (with the chosen dma_dev) may cause problem in certain
> > case (e.g. where thunderbolt dock and intel iommu are involved). The error
> > observed was:
> >
> > XHCI swiotlb buffer is full / DMAR: Device bounce map failed
> >
> > For now we retain the clamp for hw_max_sectors against the dma_max_mapping_size.
> > Since the device/size for the clamp that is applied when the scsi request queue
> > is initialized/allocated is different than the one used here, we invalidate the
> > early clamping by making a fallback blk_queue_max_hw_sectors() call.
> >
> > Signed-off-by: Tom Yan <tom.ty89@gmail.com>
>
> I can confirm that this fixes the network performance on a Lenovo Thunderbolt
> dock generation 2, which uses an USB attach NIC.
>
> With this patch added on top of 5.10-rc5 scp performance to another machine
> on the local gbit LAN goes back from the regressed 1 MB/s to its original 100MB/s
> as it should be:
>
> Tested-by: Hans de Goede <hdegoede@redhat.com>
>
> Regards,
>
> Hans
>
>
> > ---
> >  drivers/usb/storage/uas.c | 11 +++++++----
> >  1 file changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
> > index c8a577309e8f..5db1325cea20 100644
> > --- a/drivers/usb/storage/uas.c
> > +++ b/drivers/usb/storage/uas.c
> > @@ -843,18 +843,21 @@ static int uas_slave_alloc(struct scsi_device *sdev)
> >  static int uas_slave_configure(struct scsi_device *sdev)
> >  {
> >       struct uas_dev_info *devinfo = sdev->hostdata;
> > -     struct device *dev = sdev->host->dma_dev;
> > +     struct usb_device *udev = devinfo->udev;
> >
> >       if (devinfo->flags & US_FL_MAX_SECTORS_64)
> >               blk_queue_max_hw_sectors(sdev->request_queue, 64);
> >       else if (devinfo->flags & US_FL_MAX_SECTORS_240)
> >               blk_queue_max_hw_sectors(sdev->request_queue, 240);
> > -     else if (devinfo->udev->speed >= USB_SPEED_SUPER)
> > +     else if (udev->speed >= USB_SPEED_SUPER)
> >               blk_queue_max_hw_sectors(sdev->request_queue, 2048);
> > +     else
> > +             blk_queue_max_hw_sectors(sdev->request_queue,
> > +                                      SCSI_DEFAULT_MAX_SECTORS);
> >
> >       blk_queue_max_hw_sectors(sdev->request_queue,
> >               min_t(size_t, queue_max_hw_sectors(sdev->request_queue),
> > -                   dma_max_mapping_size(dev) >> SECTOR_SHIFT));
> > +                   dma_max_mapping_size(udev->bus->sysdev) >> SECTOR_SHIFT));
> >
> >       if (devinfo->flags & US_FL_NO_REPORT_OPCODES)
> >               sdev->no_report_opcodes = 1;
> > @@ -1040,7 +1043,7 @@ static int uas_probe(struct usb_interface *intf, const struct usb_device_id *id)
> >       shost->can_queue = devinfo->qdepth - 2;
> >
> >       usb_set_intfdata(intf, shost);
> > -     result = scsi_add_host_with_dma(shost, &intf->dev, udev->bus->sysdev);
> > +     result = scsi_add_host(shost, &intf->dev);
> >       if (result)
> >               goto free_streams;
> >
> >
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/2] usb-storage: revert from scsi_add_host_with_dma() to scsi_add_host()
  2020-11-30 19:01                                     ` Tom Yan
@ 2020-11-30 20:36                                       ` Alan Stern
  2021-02-25 16:35                                         ` Alan Stern
  0 siblings, 1 reply; 37+ messages in thread
From: Alan Stern @ 2020-11-30 20:36 UTC (permalink / raw)
  To: Tom Yan
  Cc: Hans de Goede, Greg KH, Christoph Hellwig, linux-usb,
	Mathias Nyman, Linux Kernel Mailing List, linux-pci, Lu Baolu,
	SCSI development list

[Added linux-scsi to CC: list.  When discussing code in a particular 
subsystem, it's a good idea to include that subsystem's mailing list in 
the CC:.]

On Tue, Dec 01, 2020 at 03:01:56AM +0800, Tom Yan wrote:
> For the record,
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/scsi/scsi_host.h?h=v5.10-rc6#n753
> 
> On Tue, 1 Dec 2020 at 02:57, Tom Yan <tom.ty89@gmail.com> wrote:
> >
> > This maybe? https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/scsi_lib.c?h=v5.10-rc6#n1816
> >
> > UAS:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/storage/uas.c?h=v5.10-rc6#n918
> > BOT (AFAICT):
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/hosts.c?h=v5.10-rc6#n466
> >
> > It would explain why the issue is only triggered with UAS drives.

In brief, a recent change -- calling scsi_add_host_with_dma rather than 
scsi_add_host -- in the USB uas driver has caused a regression in 
performance.  (Note that the shost->dma_dev value is set differently as 
a result of this change.)  Hans has determined that the problem seems 
to be related to permanent changes in the dma_dev's settings caused by 
scsi_add_host_with_dma.

Tom pointed out that __scsi_init_queue contains a couple of questionable 
assignments:

	dma_set_seg_boundary(dev, shost->dma_boundary);

and

	dma_set_max_seg_size(dev, queue_max_segment_size(q));

where dev = shost->dma_dev -- in this case, a USB host controller.

So an important question is why decisions related to a particular SCSI 
host should affect the DMA settings of a device somewhere else in the 
heirarchy?  Sure, the properties of the USB controller should constrain 
the settings available to the SCSI host, but there doesn't seem to be 
any good reason for restrictions to go in the other direction.

Doesn't the way we handle DMA permit a child device to impose additional 
restrictions (such as a smaller max segment size) beyond those of the 
parent device which actually performs the DMA transfer?

> > The questions (from me) are:
> > 1. From the scsi layer POV (as per what __scsi_init_queue() does),
> > what/which should we use as dma_dev?

We should be using the USB host controller, because it is the device 
which actually performs the DMA transfers.

> > 2. Do we really need to set dma_boundary in the UAS host template (to
> > PAGE_SIZE - 1)?

I don't know.  But in theory it should be possible to have settings 
(like this one) which affect only the transfers carried out by the SCSI 
host, not the transfers carred out by other drivers which might use the 
same USB controller.

> > 3. Kind of the same question as #1: when we clamp hw_max_sectors to
> > dma max mapping size, should the size actually be "the smaller one
> > among dev and sysdev"? Or is one of the two sizes *always* the smaller
> > one?

I assume you're referring to code in the uas driver.  There the value of 
dev is meaningless as far as DMA is concerned.  Only sysdev matters.

Alan Stern

> > On Tue, 1 Dec 2020 at 02:19, Hans de Goede <hdegoede@redhat.com> wrote:
> > >
> > > Hi,
> > >
> > > On 11/30/20 6:20 PM, Alan Stern wrote:
> > > > On Mon, Nov 30, 2020 at 02:36:38PM +0100, Hans de Goede wrote:
> > > >> Hi,
> > > >>
> > > >> On 11/30/20 2:30 PM, Greg KH wrote:
> > > >>> On Mon, Nov 30, 2020 at 02:23:48PM +0100, Hans de Goede wrote:
> > > >>>> Hi,
> > > >>>>
> > > >>>> On 11/30/20 1:58 PM, Tom Yan wrote:
> > > >>>>> It's merely a moving of comment moving for/and a no-behavioral-change
> > > >>>>> adaptation for the reversion.>
> > > >>>>
> > > >>>> IMHO the revert of the troublesome commit and the other/new changes really
> > > >>>> should be 2 separate commits. But I will let Alan and Greg have the final
> > > >>>> verdict on this.
> > > >>>
> > > >>> I would prefer to just revert the commits and not do anything
> > > >>> different/special here so late in the release cycle.
> > > >>>
> > > >>> So, if Alan agrees, I'll be glad to do them on my end, I just need the
> > > >>> commit ids for them.
> > > >>
> > > >> The troublesome commit are (in reverse, so revert, order):
> > > >>
> > > >> 5df7ef7d32fe ("uas: bump hw_max_sectors to 2048 blocks for SS or faster drives")
> > > >> 558033c2828f ("uas: fix sdev->host->dma_dev")
> > > >> 0154012f8018 ("usb-storage: fix sdev->host->dma_dev")
> > > >>
> > > >> Alan, the reason for reverting these is that using scsi_add_host_with_dma() as the
> > > >> last 2 patches do, with the dmadev argument of that call pointing to the device
> > > >> for the XHCI controller is causing changes to the DMA settings of the XHCI controller
> > > >> itself which is causing regressions in 5.10, see this email thread:
> > > >>
> > > >> https://lore.kernel.org/linux-usb/fde7e11f-5dfc-8348-c134-a21cb1116285@redhat.com/T/#t
> > > >
> > > > It's hard to go wrong with reverting, so it's okay with me.
> > > >
> > > > Still, Hans, have you checked out the difference between the
> > > > scsi_add_host() and scsi_add_host_with_dma() calls?  It's just a matter
> > > > of using dev vs. sysdev.  In particular, have you checked to see what
> > > > those two devices are on your system?
> > >
> > > Its not just dev vs sysdev, its iface->dev vs bus->sysdev, and I assume
> > > that the latter is actually the XHCI controller.
> > >
> > > my vote goes to reverting to avoid the regression for 5.10, esp. since
> > > this is a clean revert of 3 patches with nothing depending / building
> > > on top of the reverted commits.
> > >
> > > Then for 5.11 we can retry to introduce similar changes. I would be happy
> > > to try a new patch-set for 5.11.
> > >
> > > > It seems likely that if one of those calls messes up some DMA settings,
> > > > the other one does too -- just maybe not settings that matter much.
> > >
> > > I'm not very familiar with all the DMA mapping / mask code, but AFAIK making
> > > changes to the DMA settings of a child will not influence the parent.
> > >
> > > Where as when passing bus->sysdev, then changes are made to a device
> > > which is shared with other devices on the bus, which is why we see
> > > a regression in an USB NIC driver being triggered by the UAS driver
> > > binding to a device (on the same bus).
> > >
> > > At least that is my interpretation of this. I bisected the regression
> > > and that pointed at the UAS DMA change and reverting it fixes things,
> > > confirming that I did not make any mistakes during the bisect.
> > >
> > > Regards,
> > >
> > > Hans
> > >

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/2] uas: revert from scsi_add_host_with_dma() to scsi_add_host()
  2020-11-30 19:30                     ` Tom Yan
@ 2020-12-01 11:09                       ` Hans de Goede
  0 siblings, 0 replies; 37+ messages in thread
From: Hans de Goede @ 2020-12-01 11:09 UTC (permalink / raw)
  To: Tom Yan, Alan Stern
  Cc: Christoph Hellwig, Greg KH, linux-usb, Mathias Nyman,
	Linux Kernel Mailing List, linux-pci, Lu Baolu

Hi,

On 11/30/20 8:30 PM, Tom Yan wrote:
> Hmm, I wonder if I/we wrongly assumed that the dma_dev used for the
> hw_max_sectors clamping in __scsi_init_queue() is wrong.
> 
> So instead of adding a fallback else-clause here or using "sysdev" as
> dma_dev like in the current upstream code, maybe we should actually do
> a three-way min: the "changed" hw_max_sectors, dma_max_mapping_size of
> dma_dev("dev") and dma_max_mapping_size of sysdev...?

So both here and in the 2/2 patch thread there are lots of open questions,
which to me suggests that for 5.10 we really should just go with the 3 reverts
which I suggested earlier.

Regards,

Hans



> 
> On Mon, 30 Nov 2020 at 17:48, Hans de Goede <hdegoede@redhat.com> wrote:
>>
>> Hi,
>>
>> On 11/28/20 4:48 PM, Tom Yan wrote:
>>> Apparently the former (with the chosen dma_dev) may cause problem in certain
>>> case (e.g. where thunderbolt dock and intel iommu are involved). The error
>>> observed was:
>>>
>>> XHCI swiotlb buffer is full / DMAR: Device bounce map failed
>>>
>>> For now we retain the clamp for hw_max_sectors against the dma_max_mapping_size.
>>> Since the device/size for the clamp that is applied when the scsi request queue
>>> is initialized/allocated is different than the one used here, we invalidate the
>>> early clamping by making a fallback blk_queue_max_hw_sectors() call.
>>>
>>> Signed-off-by: Tom Yan <tom.ty89@gmail.com>
>>
>> I can confirm that this fixes the network performance on a Lenovo Thunderbolt
>> dock generation 2, which uses an USB attach NIC.
>>
>> With this patch added on top of 5.10-rc5 scp performance to another machine
>> on the local gbit LAN goes back from the regressed 1 MB/s to its original 100MB/s
>> as it should be:
>>
>> Tested-by: Hans de Goede <hdegoede@redhat.com>
>>
>> Regards,
>>
>> Hans
>>
>>
>>> ---
>>>  drivers/usb/storage/uas.c | 11 +++++++----
>>>  1 file changed, 7 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
>>> index c8a577309e8f..5db1325cea20 100644
>>> --- a/drivers/usb/storage/uas.c
>>> +++ b/drivers/usb/storage/uas.c
>>> @@ -843,18 +843,21 @@ static int uas_slave_alloc(struct scsi_device *sdev)
>>>  static int uas_slave_configure(struct scsi_device *sdev)
>>>  {
>>>       struct uas_dev_info *devinfo = sdev->hostdata;
>>> -     struct device *dev = sdev->host->dma_dev;
>>> +     struct usb_device *udev = devinfo->udev;
>>>
>>>       if (devinfo->flags & US_FL_MAX_SECTORS_64)
>>>               blk_queue_max_hw_sectors(sdev->request_queue, 64);
>>>       else if (devinfo->flags & US_FL_MAX_SECTORS_240)
>>>               blk_queue_max_hw_sectors(sdev->request_queue, 240);
>>> -     else if (devinfo->udev->speed >= USB_SPEED_SUPER)
>>> +     else if (udev->speed >= USB_SPEED_SUPER)
>>>               blk_queue_max_hw_sectors(sdev->request_queue, 2048);
>>> +     else
>>> +             blk_queue_max_hw_sectors(sdev->request_queue,
>>> +                                      SCSI_DEFAULT_MAX_SECTORS);
>>>
>>>       blk_queue_max_hw_sectors(sdev->request_queue,
>>>               min_t(size_t, queue_max_hw_sectors(sdev->request_queue),
>>> -                   dma_max_mapping_size(dev) >> SECTOR_SHIFT));
>>> +                   dma_max_mapping_size(udev->bus->sysdev) >> SECTOR_SHIFT));
>>>
>>>       if (devinfo->flags & US_FL_NO_REPORT_OPCODES)
>>>               sdev->no_report_opcodes = 1;
>>> @@ -1040,7 +1043,7 @@ static int uas_probe(struct usb_interface *intf, const struct usb_device_id *id)
>>>       shost->can_queue = devinfo->qdepth - 2;
>>>
>>>       usb_set_intfdata(intf, shost);
>>> -     result = scsi_add_host_with_dma(shost, &intf->dev, udev->bus->sysdev);
>>> +     result = scsi_add_host(shost, &intf->dev);
>>>       if (result)
>>>               goto free_streams;
>>>
>>>
>>
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/2] usb-storage: revert from scsi_add_host_with_dma() to scsi_add_host()
  2020-11-30 13:55                                 ` Hans de Goede
@ 2020-12-04 15:02                                   ` Greg KH
  0 siblings, 0 replies; 37+ messages in thread
From: Greg KH @ 2020-12-04 15:02 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Tom Yan, Alan Stern, Christoph Hellwig, linux-usb, Mathias Nyman,
	Linux Kernel Mailing List, linux-pci, Lu Baolu

On Mon, Nov 30, 2020 at 02:55:45PM +0100, Hans de Goede wrote:
> Hi,
> 
> On 11/30/20 2:53 PM, Greg KH wrote:
> > On Mon, Nov 30, 2020 at 02:36:38PM +0100, Hans de Goede wrote:
> >> Hi,
> >>
> >> On 11/30/20 2:30 PM, Greg KH wrote:
> >>> On Mon, Nov 30, 2020 at 02:23:48PM +0100, Hans de Goede wrote:
> >>>> Hi,
> >>>>
> >>>> On 11/30/20 1:58 PM, Tom Yan wrote:
> >>>>> It's merely a moving of comment moving for/and a no-behavioral-change
> >>>>> adaptation for the reversion.>
> >>>>
> >>>> IMHO the revert of the troublesome commit and the other/new changes really
> >>>> should be 2 separate commits. But I will let Alan and Greg have the final
> >>>> verdict on this.
> >>>
> >>> I would prefer to just revert the commits and not do anything
> >>> different/special here so late in the release cycle.
> >>>
> >>> So, if Alan agrees, I'll be glad to do them on my end, I just need the
> >>> commit ids for them.
> >>
> >> The troublesome commit are (in reverse, so revert, order):
> >>
> >> 5df7ef7d32fe ("uas: bump hw_max_sectors to 2048 blocks for SS or faster drives")
> >> 558033c2828f ("uas: fix sdev->host->dma_dev")
> >> 0154012f8018 ("usb-storage: fix sdev->host->dma_dev")
> >>
> >> Alan, the reason for reverting these is that using scsi_add_host_with_dma() as the
> >> last 2 patches do, with the dmadev argument of that call pointing to the device
> >> for the XHCI controller is causing changes to the DMA settings of the XHCI controller
> >> itself which is causing regressions in 5.10, see this email thread:
> >>
> >> https://lore.kernel.org/linux-usb/fde7e11f-5dfc-8348-c134-a21cb1116285@redhat.com/T/#t
> > 
> > Thanks, I'll wait for Alan to respond, but I think just reverting these
> > is the best solution at this point in time.  You have tested those
> > reverts, solve this, right?  If so, can I get a "Tested-by:"? 
> 
> Yes that was my first solution to this problem and I can confirm that that fixes
> the regression:
> 
> Tested-by: Hans de Goede <hdegoede@redhat.com>

All now reverted.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/2] usb-storage: revert from scsi_add_host_with_dma() to scsi_add_host()
  2020-11-30 20:36                                       ` Alan Stern
@ 2021-02-25 16:35                                         ` Alan Stern
  2021-02-26  5:53                                           ` Christoph Hellwig
  0 siblings, 1 reply; 37+ messages in thread
From: Alan Stern @ 2021-02-25 16:35 UTC (permalink / raw)
  To: Tom Yan, Hans de Goede, Christoph Hellwig
  Cc: linux-usb, Linux Kernel Mailing List, SCSI development list

This thread seems to have fallen through the cracks.  Maybe now would be 
a good time to address the problem (since we originally planned to fix 
it in 5.11!).

The questions listed below are pretty self-contained, although the rest
of the discussion isn't.  But I never received any answers.

Alan Stern

On Mon, Nov 30, 2020 at 03:36:18PM -0500, Alan Stern wrote:
> [Added linux-scsi to CC: list.  When discussing code in a particular 
> subsystem, it's a good idea to include that subsystem's mailing list in 
> the CC:.]
> 
> On Tue, Dec 01, 2020 at 03:01:56AM +0800, Tom Yan wrote:
> > For the record,
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/scsi/scsi_host.h?h=v5.10-rc6#n753
> > 
> > On Tue, 1 Dec 2020 at 02:57, Tom Yan <tom.ty89@gmail.com> wrote:
> > >
> > > This maybe? https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/scsi_lib.c?h=v5.10-rc6#n1816
> > >
> > > UAS:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/storage/uas.c?h=v5.10-rc6#n918
> > > BOT (AFAICT):
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/hosts.c?h=v5.10-rc6#n466
> > >
> > > It would explain why the issue is only triggered with UAS drives.
> 
> In brief, a recent change -- calling scsi_add_host_with_dma rather than 
> scsi_add_host -- in the USB uas driver has caused a regression in 
> performance.  (Note that the shost->dma_dev value is set differently as 
> a result of this change.)  Hans has determined that the problem seems 
> to be related to permanent changes in the dma_dev's settings caused by 
> scsi_add_host_with_dma.
> 
> Tom pointed out that __scsi_init_queue contains a couple of questionable 
> assignments:
> 
> 	dma_set_seg_boundary(dev, shost->dma_boundary);
> 
> and
> 
> 	dma_set_max_seg_size(dev, queue_max_segment_size(q));
> 
> where dev = shost->dma_dev -- in this case, a USB host controller.
> 
> So an important question is why decisions related to a particular SCSI 
> host should affect the DMA settings of a device somewhere else in the 
> heirarchy?  Sure, the properties of the USB controller should constrain 
> the settings available to the SCSI host, but there doesn't seem to be 
> any good reason for restrictions to go in the other direction.
> 
> Doesn't the way we handle DMA permit a child device to impose additional 
> restrictions (such as a smaller max segment size) beyond those of the 
> parent device which actually performs the DMA transfer?
> 
> > > The questions (from me) are:
> > > 1. From the scsi layer POV (as per what __scsi_init_queue() does),
> > > what/which should we use as dma_dev?
> 
> We should be using the USB host controller, because it is the device 
> which actually performs the DMA transfers.
> 
> > > 2. Do we really need to set dma_boundary in the UAS host template (to
> > > PAGE_SIZE - 1)?
> 
> I don't know.  But in theory it should be possible to have settings 
> (like this one) which affect only the transfers carried out by the SCSI 
> host, not the transfers carred out by other drivers which might use the 
> same USB controller.
> 
> > > 3. Kind of the same question as #1: when we clamp hw_max_sectors to
> > > dma max mapping size, should the size actually be "the smaller one
> > > among dev and sysdev"? Or is one of the two sizes *always* the smaller
> > > one?
> 
> I assume you're referring to code in the uas driver.  There the value of 
> dev is meaningless as far as DMA is concerned.  Only sysdev matters.
> 
> Alan Stern
> 
> > > On Tue, 1 Dec 2020 at 02:19, Hans de Goede <hdegoede@redhat.com> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On 11/30/20 6:20 PM, Alan Stern wrote:
> > > > > On Mon, Nov 30, 2020 at 02:36:38PM +0100, Hans de Goede wrote:
> > > > >> Hi,
> > > > >>
> > > > >> On 11/30/20 2:30 PM, Greg KH wrote:
> > > > >>> On Mon, Nov 30, 2020 at 02:23:48PM +0100, Hans de Goede wrote:
> > > > >>>> Hi,
> > > > >>>>
> > > > >>>> On 11/30/20 1:58 PM, Tom Yan wrote:
> > > > >>>>> It's merely a moving of comment moving for/and a no-behavioral-change
> > > > >>>>> adaptation for the reversion.>
> > > > >>>>
> > > > >>>> IMHO the revert of the troublesome commit and the other/new changes really
> > > > >>>> should be 2 separate commits. But I will let Alan and Greg have the final
> > > > >>>> verdict on this.
> > > > >>>
> > > > >>> I would prefer to just revert the commits and not do anything
> > > > >>> different/special here so late in the release cycle.
> > > > >>>
> > > > >>> So, if Alan agrees, I'll be glad to do them on my end, I just need the
> > > > >>> commit ids for them.
> > > > >>
> > > > >> The troublesome commit are (in reverse, so revert, order):
> > > > >>
> > > > >> 5df7ef7d32fe ("uas: bump hw_max_sectors to 2048 blocks for SS or faster drives")
> > > > >> 558033c2828f ("uas: fix sdev->host->dma_dev")
> > > > >> 0154012f8018 ("usb-storage: fix sdev->host->dma_dev")
> > > > >>
> > > > >> Alan, the reason for reverting these is that using scsi_add_host_with_dma() as the
> > > > >> last 2 patches do, with the dmadev argument of that call pointing to the device
> > > > >> for the XHCI controller is causing changes to the DMA settings of the XHCI controller
> > > > >> itself which is causing regressions in 5.10, see this email thread:
> > > > >>
> > > > >> https://lore.kernel.org/linux-usb/fde7e11f-5dfc-8348-c134-a21cb1116285@redhat.com/T/#t
> > > > >
> > > > > It's hard to go wrong with reverting, so it's okay with me.
> > > > >
> > > > > Still, Hans, have you checked out the difference between the
> > > > > scsi_add_host() and scsi_add_host_with_dma() calls?  It's just a matter
> > > > > of using dev vs. sysdev.  In particular, have you checked to see what
> > > > > those two devices are on your system?
> > > >
> > > > Its not just dev vs sysdev, its iface->dev vs bus->sysdev, and I assume
> > > > that the latter is actually the XHCI controller.
> > > >
> > > > my vote goes to reverting to avoid the regression for 5.10, esp. since
> > > > this is a clean revert of 3 patches with nothing depending / building
> > > > on top of the reverted commits.
> > > >
> > > > Then for 5.11 we can retry to introduce similar changes. I would be happy
> > > > to try a new patch-set for 5.11.
> > > >
> > > > > It seems likely that if one of those calls messes up some DMA settings,
> > > > > the other one does too -- just maybe not settings that matter much.
> > > >
> > > > I'm not very familiar with all the DMA mapping / mask code, but AFAIK making
> > > > changes to the DMA settings of a child will not influence the parent.
> > > >
> > > > Where as when passing bus->sysdev, then changes are made to a device
> > > > which is shared with other devices on the bus, which is why we see
> > > > a regression in an USB NIC driver being triggered by the UAS driver
> > > > binding to a device (on the same bus).
> > > >
> > > > At least that is my interpretation of this. I bisected the regression
> > > > and that pointed at the UAS DMA change and reverting it fixes things,
> > > > confirming that I did not make any mistakes during the bisect.
> > > >
> > > > Regards,
> > > >
> > > > Hans
> > > >

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/2] usb-storage: revert from scsi_add_host_with_dma() to scsi_add_host()
  2021-02-25 16:35                                         ` Alan Stern
@ 2021-02-26  5:53                                           ` Christoph Hellwig
  2021-03-01 15:59                                             ` Alan Stern
  0 siblings, 1 reply; 37+ messages in thread
From: Christoph Hellwig @ 2021-02-26  5:53 UTC (permalink / raw)
  To: Alan Stern
  Cc: Tom Yan, Hans de Goede, Christoph Hellwig, linux-usb,
	Linux Kernel Mailing List, SCSI development list

On Thu, Feb 25, 2021 at 11:35:57AM -0500, Alan Stern wrote:
> This thread seems to have fallen through the cracks.  Maybe now would be 
> a good time to address the problem (since we originally planned to fix 
> it in 5.11!).
> 
> The questions listed below are pretty self-contained, although the rest
> of the discussion isn't.  But I never received any answers.

usb-storage must use scsi_add_host_with_dma to use the right device
for DMA mapping and parameters.  The calls to set the DMA options
on the device are needed so that IOMMU merging doesn't change the
imposed requirements.  If these requirements slow you down you need
to relax them, as apparently the hardware is able to handle bigger
limits.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/2] usb-storage: revert from scsi_add_host_with_dma() to scsi_add_host()
  2021-02-26  5:53                                           ` Christoph Hellwig
@ 2021-03-01 15:59                                             ` Alan Stern
  0 siblings, 0 replies; 37+ messages in thread
From: Alan Stern @ 2021-03-01 15:59 UTC (permalink / raw)
  To: Christoph Hellwig, Hans de Goede
  Cc: Tom Yan, linux-usb, Linux Kernel Mailing List, SCSI development list

On Fri, Feb 26, 2021 at 06:53:52AM +0100, Christoph Hellwig wrote:
> On Thu, Feb 25, 2021 at 11:35:57AM -0500, Alan Stern wrote:
> > This thread seems to have fallen through the cracks.  Maybe now would be 
> > a good time to address the problem (since we originally planned to fix 
> > it in 5.11!).
> > 
> > The questions listed below are pretty self-contained, although the rest
> > of the discussion isn't.  But I never received any answers.
> 
> usb-storage must use scsi_add_host_with_dma to use the right device
> for DMA mapping and parameters.  The calls to set the DMA options
> on the device are needed so that IOMMU merging doesn't change the
> imposed requirements.  If these requirements slow you down you need
> to relax them, as apparently the hardware is able to handle bigger
> limits.

Hans, don't you have the right equipment to test this approach?

Alan Stern

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2021-03-01 16:00 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-10 11:36 5.10 regression, many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller Hans de Goede
2020-11-18 21:43 ` Hans de Goede
2020-11-23 14:49 ` Hans de Goede
2020-11-24 10:27   ` Christoph Hellwig
2020-11-24 10:31     ` Hans de Goede
2020-11-24 12:17       ` Mathias Nyman
2020-11-27 11:41     ` Hans de Goede
2020-11-27 12:32       ` 5.10 regression caused by: "uas: fix sdev->host->dma_dev": " Hans de Goede
2020-11-27 16:19         ` Christoph Hellwig
2020-11-27 18:12           ` Hans de Goede
2020-11-28  1:25             ` Tom Yan
2020-11-28 10:43               ` Hans de Goede
2020-11-28 15:48                 ` [PATCH 1/2] uas: revert from scsi_add_host_with_dma() to scsi_add_host() Tom Yan
2020-11-28 15:48                   ` [PATCH 2/2] usb-storage: " Tom Yan
2020-11-30  9:50                     ` Hans de Goede
2020-11-30 12:58                       ` Tom Yan
2020-11-30 13:23                         ` Hans de Goede
2020-11-30 13:30                           ` Greg KH
2020-11-30 13:36                             ` Hans de Goede
2020-11-30 13:53                               ` Greg KH
2020-11-30 13:55                                 ` Hans de Goede
2020-12-04 15:02                                   ` Greg KH
2020-11-30 17:20                               ` Alan Stern
2020-11-30 17:24                                 ` Christoph Hellwig
2020-11-30 18:18                                 ` Hans de Goede
2020-11-30 18:57                                   ` Tom Yan
2020-11-30 19:01                                     ` Tom Yan
2020-11-30 20:36                                       ` Alan Stern
2021-02-25 16:35                                         ` Alan Stern
2021-02-26  5:53                                           ` Christoph Hellwig
2021-03-01 15:59                                             ` Alan Stern
2020-11-30 14:39                           ` Tom Yan
2020-11-30  9:48                   ` [PATCH 1/2] uas: " Hans de Goede
2020-11-30 19:30                     ` Tom Yan
2020-12-01 11:09                       ` Hans de Goede
2020-11-28 17:15             ` 5.10 regression caused by: "uas: fix sdev->host->dma_dev": many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller Christoph Hellwig
2020-11-30  8:43               ` Hans de Goede

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).