All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
@ 2014-01-30 21:18 Sarah Sharp
  2014-01-30 21:39 ` Mark Lord
                   ` (3 more replies)
  0 siblings, 4 replies; 28+ messages in thread
From: Sarah Sharp @ 2014-01-30 21:18 UTC (permalink / raw)
  To: David Laight
  Cc: linux-usb, netdev, Greg Kroah-Hartman, David Miller,
	Dan Williams, Nyman, Mathias, Mark Lord, Alan Stern, Freddy Xin

On Thu, Jan 30, 2014 at 05:35:08PM +0100, Peter Stuge wrote:
> David Laight wrote:
> > > Where's the 8k coming from?
> > 
> > My memory, I meant 16k :-(
> 
> No problem. But what about that alignment? It seems that userspace
> needs to start caring about alignment with xhci, right?

We need to step back and reassess the larger picture here, instead of
trying to fire-fight all the regressions caused by the link TRB commit
(35773dac5f86 "usb: xhci: Link TRB must not occur within a USB payload
burst").

We shouldn't need to make userspace start to worry about alignment at
all.  libusb worked in the past, before the link TRB fix went in.  We
*cannot* break userspace USB drivers.  The breakage needs to be fixed in
the USB core or the xHCI driver.

Commit 35773dac5f86 was meant to be a short-term bandaid fix, but it's
already caused at least four different regressions.  Some we've fixed,
some have proposed solutions that David has sent.

The storage layer is getting borked because it submits scatter-gather
lists longer than what will fit on a segment, and now libusb has the
same issue.  One xHCI host controller stopped responding to commands,
and reverting the bandaid fix helped.  The implications of this change
just keep coming in, and I'm not comfortable wall-papering over the
issues.

On the flip side, it seems that the only devices that have been helped
by the bandaid fix patch are USB ethernet devices using the ax88179_178a
driver.  (Mark Lord still needs to confirm which device he uses.)  I
have not seen any other reports that other USB ethernet chipsets were
broken in 3.12 by the USB networking layer adding scatter-gather
support.

It should not matter what alignment or length of scatter-gather list the
upper layers pass the xHCI driver, it should just work.  I want to do
this fix right, by changing the fundamental way we queue TRBs to the
rings to fit the TD rules.  We should break each TD into fragment-sized
chunks, and add a link TRB in the middle of a segment where necessary.

Let's do this fix the right way, instead of wall papering over the
issue.  Here's what we should do:

1. Disable scatter-gather for the ax88179_178a driver when it's under an
   xHCI host.

2. Revert the following commits:
   f2d9b991c549 xhci: Set scatter-gather limit to avoid failed block writes.
   d6c9ea9069af xhci: Avoid infinite loop when sg urb requires too many trbs
   35773dac5f86 usb: xhci: Link TRB must not occur within a USB payload burst

3. Dan and Mathias can work together to come up with an overall plan to
   change the xHCI driver architecture to be fully compliant with the TD
   fragment rules.  That can be done over the next few kernel releases.

The end result is that we don't destabilize storage or break userspace
USB drivers, we don't break people's xHCI host controllers,
the ax88179_178a USB ethernet devices still work under xHCI (a bit with
worse performance), and other USB ethernet devices still get the
performance improvement introduced in 3.12.

Sarah Sharp

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
  2014-01-30 21:18 [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned Sarah Sharp
@ 2014-01-30 21:39 ` Mark Lord
  2014-01-30 21:42   ` Mark Lord
  2014-01-30 21:43 ` Alan Stern
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 28+ messages in thread
From: Mark Lord @ 2014-01-30 21:39 UTC (permalink / raw)
  To: Sarah Sharp, David Laight
  Cc: linux-usb, netdev, Greg Kroah-Hartman, David Miller,
	Dan Williams, Nyman, Mathias, Alan Stern, Freddy Xin

On 14-01-30 04:18 PM, Sarah Sharp wrote:
>
> Let's do this fix the right way, instead of wall papering over the
> issue.  Here's what we should do:
> 
> 1. Disable scatter-gather for the ax88179_178a driver when it's under an
>    xHCI host.
> 
> 2. Revert the following commits:
>    f2d9b991c549 xhci: Set scatter-gather limit to avoid failed block writes.
>    d6c9ea9069af xhci: Avoid infinite loop when sg urb requires too many trbs
>    35773dac5f86 usb: xhci: Link TRB must not occur within a USB payload burst
> 
> 3. Dan and Mathias can work together to come up with an overall plan to
>    change the xHCI driver architecture to be fully compliant with the TD
>    fragment rules.  That can be done over the next few kernel releases.
> 
> The end result is that we don't destabilize storage or break userspace
> USB drivers, we don't break people's xHCI host controllers,
> the ax88179_178a USB ethernet devices still work under xHCI (a bit with
> worse performance), and other USB ethernet devices still get the
> performance improvement introduced in 3.12.


Performance before 3.12/3.13 was not all that bad either.
My ax88179 dongle (yes, that one, using ax88179_178a.ko)
manages very close to 1gbit/sec throughput even without SG,
and without a huge cpu tax either.

SG done Right will make it better eventually.  I can wait.

Cheers
-- 
Mark Lord
Real-Time Remedies Inc.
mlord@pobox.com

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
  2014-01-30 21:39 ` Mark Lord
@ 2014-01-30 21:42   ` Mark Lord
  0 siblings, 0 replies; 28+ messages in thread
From: Mark Lord @ 2014-01-30 21:42 UTC (permalink / raw)
  To: Sarah Sharp, David Laight
  Cc: linux-usb, netdev, Greg Kroah-Hartman, David Miller,
	Dan Williams, Nyman, Mathias, Alan Stern, Freddy Xin

Sarah, on a related note:

Is there a parameter or knob of some kind to tell the XHCI driver
to treat a specific port as USB2 (480mbit/sec max) rather than USB3 ?

The Dell XPS-13 Ultrabooks all suffer from some kind of flaw, whereby the left side
USB3 port is unreliable at SuperSpeed; the right side port works flawlessly.
The MS-Windows driver has a workaround of some sort, but we don't.

Cheers
-- 
Mark Lord
Real-Time Remedies Inc.
mlord@pobox.com

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
  2014-01-30 21:18 [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned Sarah Sharp
  2014-01-30 21:39 ` Mark Lord
@ 2014-01-30 21:43 ` Alan Stern
  2014-01-30 21:48   ` Mark Lord
  2014-01-30 21:55   ` Sarah Sharp
  2014-01-30 21:50 ` Bjørn Mork
  2014-01-31 10:14 ` David Laight
  3 siblings, 2 replies; 28+ messages in thread
From: Alan Stern @ 2014-01-30 21:43 UTC (permalink / raw)
  To: Sarah Sharp
  Cc: David Laight, linux-usb, netdev, Greg Kroah-Hartman,
	David Miller, Dan Williams, Nyman, Mathias, Mark Lord,
	Freddy Xin

On Thu, 30 Jan 2014, Sarah Sharp wrote:

> It should not matter what alignment or length of scatter-gather list the
> upper layers pass the xHCI driver, it should just work.  I want to do
> this fix right, by changing the fundamental way we queue TRBs to the
> rings to fit the TD rules.  We should break each TD into fragment-sized
> chunks, and add a link TRB in the middle of a segment where necessary.

That's a good plan.  However _some_ restriction will turn out to be
necessary.

For example, what will you do if a driver submits an SG list containing
300 elements, each 3 bytes long?  That's too many to fit in a single
ring segment, but it's smaller than a TD fragment -- it's even smaller
than maxpacket -- so there's no place to split it.  (Not that I think
drivers _will_ submit requests like this; this is just to demonstrate
the point.)

It ought to be acceptable to require, for example, that an SG URB 
contain no more than (say) 100 elements that are smaller than 512 
bytes.

ehci-hcd gets along okay with the restriction that each SG element 
except the last has to be a multiple of the maxpacket size.  xhci-hcd 
can relax this quite a lot, but not all the way.

Alan Stern

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
  2014-01-30 21:43 ` Alan Stern
@ 2014-01-30 21:48   ` Mark Lord
  2014-01-30 21:55   ` Sarah Sharp
  1 sibling, 0 replies; 28+ messages in thread
From: Mark Lord @ 2014-01-30 21:48 UTC (permalink / raw)
  To: Alan Stern, Sarah Sharp
  Cc: David Laight, linux-usb, netdev, Greg Kroah-Hartman,
	David Miller, Dan Williams, Nyman, Mathias, Freddy Xin

On 14-01-30 04:43 PM, Alan Stern wrote:
> On Thu, 30 Jan 2014, Sarah Sharp wrote:
> 
>> It should not matter what alignment or length of scatter-gather list the
>> upper layers pass the xHCI driver, it should just work.  I want to do
>> this fix right, by changing the fundamental way we queue TRBs to the
>> rings to fit the TD rules.  We should break each TD into fragment-sized
>> chunks, and add a link TRB in the middle of a segment where necessary.
> 
> That's a good plan.  However _some_ restriction will turn out to be
> necessary.
> 
> For example, what will you do if a driver submits an SG list containing
> 300 elements, each 3 bytes long?

Allocate a contiguous (bounce) buffer and copy the fragments to/from it?

-- 
Mark Lord
Real-Time Remedies Inc.
mlord@pobox.com

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
  2014-01-30 21:18 [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned Sarah Sharp
  2014-01-30 21:39 ` Mark Lord
  2014-01-30 21:43 ` Alan Stern
@ 2014-01-30 21:50 ` Bjørn Mork
  2014-01-30 22:15   ` Sarah Sharp
  2014-01-31 10:14 ` David Laight
  3 siblings, 1 reply; 28+ messages in thread
From: Bjørn Mork @ 2014-01-30 21:50 UTC (permalink / raw)
  To: Sarah Sharp
  Cc: David Laight, linux-usb@vger.kernel.org, netdev@vger.kernel.org,
	Greg Kroah-Hartman, David Miller, Dan Williams, Nyman, Mathias,
	Mark Lord, Alan Stern, Freddy Xin

FWIW, the plan looks fine to me.  Just adding a couple of hints to
simplify the implementation.

Sarah Sharp <sarah.a.sharp-VuQAYsv1563Yd54FQh9/CA@public.gmane.org> writes:

> Let's do this fix the right way, instead of wall papering over the
> issue.  Here's what we should do:
>
> 1. Disable scatter-gather for the ax88179_178a driver when it's under an
>    xHCI host.

No need to make this conditional.  SG is only enabled in the
ax88179_178a driver if udev->bus->no_sg_constraint is true, so it
applies only to xHCI hosts in the first place.

> 2. Revert the following commits:
>    f2d9b991c549 xhci: Set scatter-gather limit to avoid failed block writes.
>    d6c9ea9069af xhci: Avoid infinite loop when sg urb requires too many trbs
>    35773dac5f86 usb: xhci: Link TRB must not occur within a USB payload burst
>
> 3. Dan and Mathias can work together to come up with an overall plan to
>    change the xHCI driver architecture to be fully compliant with the TD
>    fragment rules.  That can be done over the next few kernel releases.
>
> The end result is that we don't destabilize storage or break userspace
> USB drivers, we don't break people's xHCI host controllers,
> the ax88179_178a USB ethernet devices still work under xHCI (a bit with
> worse performance), and other USB ethernet devices still get the
> performance improvement introduced in 3.12.

No other usbnet drivers has enabled SG...  Which is why you have only
seen this problem with the ax88179_178a devices.  So there is no
performance improvement to keep.



Bjørn
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
  2014-01-30 21:43 ` Alan Stern
  2014-01-30 21:48   ` Mark Lord
@ 2014-01-30 21:55   ` Sarah Sharp
  2014-01-30 22:05     ` Bjørn Mork
  2014-01-30 22:07     ` Alan Stern
  1 sibling, 2 replies; 28+ messages in thread
From: Sarah Sharp @ 2014-01-30 21:55 UTC (permalink / raw)
  To: Alan Stern
  Cc: David Laight, linux-usb, netdev, Greg Kroah-Hartman,
	David Miller, Dan Williams, Nyman, Mathias, Mark Lord,
	Freddy Xin

On Thu, Jan 30, 2014 at 04:43:54PM -0500, Alan Stern wrote:
> On Thu, 30 Jan 2014, Sarah Sharp wrote:
> 
> > It should not matter what alignment or length of scatter-gather list the
> > upper layers pass the xHCI driver, it should just work.  I want to do
> > this fix right, by changing the fundamental way we queue TRBs to the
> > rings to fit the TD rules.  We should break each TD into fragment-sized
> > chunks, and add a link TRB in the middle of a segment where necessary.
> 
> That's a good plan.  However _some_ restriction will turn out to be
> necessary.
> 
> For example, what will you do if a driver submits an SG list containing
> 300 elements, each 3 bytes long?  That's too many to fit in a single
> ring segment, but it's smaller than a TD fragment -- it's even smaller
> than maxpacket -- so there's no place to split it.  (Not that I think
> drivers _will_ submit requests like this; this is just to demonstrate
> the point.)
> 
> It ought to be acceptable to require, for example, that an SG URB 
> contain no more than (say) 100 elements that are smaller than 512 
> bytes.

At that point, the xHCI driver or USB core should probably use a bounce
buffer.  It feels like we should attempt to push down scatter-gather
lists as far down in the stack as possible, so the upper layers don't
have to care what alignment, length, or random 64KB boundary splits we
need.

> ehci-hcd gets along okay with the restriction that each SG element 
> except the last has to be a multiple of the maxpacket size.  xhci-hcd 
> can relax this quite a lot, but not all the way.

What does the EHCI driver do when it receives a SG list from the USB
networking layer that violates this restriction?

Sarah Sharp

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
  2014-01-30 21:55   ` Sarah Sharp
@ 2014-01-30 22:05     ` Bjørn Mork
  2014-01-30 22:07     ` Alan Stern
  1 sibling, 0 replies; 28+ messages in thread
From: Bjørn Mork @ 2014-01-30 22:05 UTC (permalink / raw)
  To: Sarah Sharp
  Cc: Alan Stern, David Laight, linux-usb, netdev, Greg Kroah-Hartman,
	David Miller, Dan Williams, Nyman, Mathias, Mark Lord,
	Freddy Xin

Sarah Sharp <sarah.a.sharp@linux.intel.com> writes:
> On Thu, Jan 30, 2014 at 04:43:54PM -0500, Alan Stern wrote:
>
>> ehci-hcd gets along okay with the restriction that each SG element 
>> except the last has to be a multiple of the maxpacket size.  xhci-hcd 
>> can relax this quite a lot, but not all the way.
>
> What does the EHCI driver do when it receives a SG list from the USB
> networking layer that violates this restriction?

The USB networking layer won't use SG with the EHCI driver.

Commit bcc48f1a7a0d4 introduced no_sg_constraint so that usbnet could
enable SG only for host controllers with no such restrictions. I.e.
currently for xHCI only.


Bjørn

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
  2014-01-30 21:55   ` Sarah Sharp
  2014-01-30 22:05     ` Bjørn Mork
@ 2014-01-30 22:07     ` Alan Stern
  1 sibling, 0 replies; 28+ messages in thread
From: Alan Stern @ 2014-01-30 22:07 UTC (permalink / raw)
  To: Sarah Sharp
  Cc: David Laight, linux-usb-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, Greg Kroah-Hartman, David Miller,
	Dan Williams, Nyman, Mathias, Mark Lord, Freddy Xin

On Thu, 30 Jan 2014, Sarah Sharp wrote:

> > That's a good plan.  However _some_ restriction will turn out to be
> > necessary.
> > 
> > For example, what will you do if a driver submits an SG list containing
> > 300 elements, each 3 bytes long?  That's too many to fit in a single
> > ring segment, but it's smaller than a TD fragment -- it's even smaller
> > than maxpacket -- so there's no place to split it.  (Not that I think
> > drivers _will_ submit requests like this; this is just to demonstrate
> > the point.)
> > 
> > It ought to be acceptable to require, for example, that an SG URB 
> > contain no more than (say) 100 elements that are smaller than 512 
> > bytes.
> 
> At that point, the xHCI driver or USB core should probably use a bounce
> buffer.  It feels like we should attempt to push down scatter-gather
> lists as far down in the stack as possible, so the upper layers don't
> have to care what alignment, length, or random 64KB boundary splits we
> need.

Okay.  That should be doable, if awkward.

> > ehci-hcd gets along okay with the restriction that each SG element 
> > except the last has to be a multiple of the maxpacket size.  xhci-hcd 
> > can relax this quite a lot, but not all the way.
> 
> What does the EHCI driver do when it receives a SG list from the USB
> networking layer that violates this restriction?

It never receives such lists.  usb_submit_urb() returns -EINVAL before 
the request gets sent to ehci-hcd.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
  2014-01-30 21:50 ` Bjørn Mork
@ 2014-01-30 22:15   ` Sarah Sharp
  2014-01-31  0:17     ` Ming Lei
  2014-01-31 15:22     ` David Laight
  0 siblings, 2 replies; 28+ messages in thread
From: Sarah Sharp @ 2014-01-30 22:15 UTC (permalink / raw)
  To: Bjørn Mork
  Cc: David Laight, linux-usb, netdev, Greg Kroah-Hartman,
	David Miller, Dan Williams, Nyman, Mathias, Mark Lord,
	Alan Stern, Freddy Xin, Ming Lei

On Thu, Jan 30, 2014 at 10:50:21PM +0100, Bjørn Mork wrote:
> FWIW, the plan looks fine to me.  Just adding a couple of hints to
> simplify the implementation.
> 
> Sarah Sharp <sarah.a.sharp@linux.intel.com> writes:
> 
> > Let's do this fix the right way, instead of wall papering over the
> > issue.  Here's what we should do:
> >
> > 1. Disable scatter-gather for the ax88179_178a driver when it's under an
> >    xHCI host.
> 
> No need to make this conditional.  SG is only enabled in the
> ax88179_178a driver if udev->bus->no_sg_constraint is true, so it
> applies only to xHCI hosts in the first place.

Ah, so you're suggesting just reverting commit
3804fad45411b48233b48003e33a78f290d227c8 "USBNET: ax88179_178a: enable
tso if usb host supports sg dma"?

> > 2. Revert the following commits:
> >    f2d9b991c549 xhci: Set scatter-gather limit to avoid failed block writes.
> >    d6c9ea9069af xhci: Avoid infinite loop when sg urb requires too many trbs
> >    35773dac5f86 usb: xhci: Link TRB must not occur within a USB payload burst
> >
> > 3. Dan and Mathias can work together to come up with an overall plan to
> >    change the xHCI driver architecture to be fully compliant with the TD
> >    fragment rules.  That can be done over the next few kernel releases.
> >
> > The end result is that we don't destabilize storage or break userspace
> > USB drivers, we don't break people's xHCI host controllers,
> > the ax88179_178a USB ethernet devices still work under xHCI (a bit with
> > worse performance), and other USB ethernet devices still get the
> > performance improvement introduced in 3.12.
> 
> No other usbnet drivers has enabled SG...  Which is why you have only
> seen this problem with the ax88179_178a devices.  So there is no
> performance improvement to keep.

I see.

Sarah Sharp

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
  2014-01-30 22:15   ` Sarah Sharp
@ 2014-01-31  0:17     ` Ming Lei
  2014-01-31 19:00       ` Sarah Sharp
  2014-01-31 15:22     ` David Laight
  1 sibling, 1 reply; 28+ messages in thread
From: Ming Lei @ 2014-01-31  0:17 UTC (permalink / raw)
  To: Sarah Sharp
  Cc: Bjørn Mork, David Laight, linux-usb-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, Greg Kroah-Hartman, David Miller,
	Dan Williams, Nyman, Mathias, Mark Lord, Alan Stern, Freddy Xin

On Fri, Jan 31, 2014 at 6:15 AM, Sarah Sharp
<sarah.a.sharp-VuQAYsv1563Yd54FQh9/CA@public.gmane.org> wrote:
> On Thu, Jan 30, 2014 at 10:50:21PM +0100, Bjørn Mork wrote:
>> FWIW, the plan looks fine to me.  Just adding a couple of hints to
>> simplify the implementation.
>>
>> Sarah Sharp <sarah.a.sharp-VuQAYsv1563Yd54FQh9/CA@public.gmane.org> writes:
>>
>> > Let's do this fix the right way, instead of wall papering over the
>> > issue.  Here's what we should do:
>> >
>> > 1. Disable scatter-gather for the ax88179_178a driver when it's under an
>> >    xHCI host.
>>
>> No need to make this conditional.  SG is only enabled in the
>> ax88179_178a driver if udev->bus->no_sg_constraint is true, so it
>> applies only to xHCI hosts in the first place.
>
> Ah, so you're suggesting just reverting commit
> 3804fad45411b48233b48003e33a78f290d227c8 "USBNET: ax88179_178a: enable
> tso if usb host supports sg dma"?

If I understand the problem correctly, the current issue is that xhci driver
doesn't support the arbitrary dma length not well, but per XHCI spec, it
should be supported, right?

If the above is correct, reverting the commit isn't correct since there isn't
any issue about the commit, so I suggest to disable the flag in xhci
for the buggy devices, and it may be enabled again if the problem is fixed.

>
>> > 2. Revert the following commits:
>> >    f2d9b991c549 xhci: Set scatter-gather limit to avoid failed block writes.
>> >    d6c9ea9069af xhci: Avoid infinite loop when sg urb requires too many trbs
>> >    35773dac5f86 usb: xhci: Link TRB must not occur within a USB payload burst
>> >
>> > 3. Dan and Mathias can work together to come up with an overall plan to
>> >    change the xHCI driver architecture to be fully compliant with the TD
>> >    fragment rules.  That can be done over the next few kernel releases.
>> >
>> > The end result is that we don't destabilize storage or break userspace
>> > USB drivers, we don't break people's xHCI host controllers,
>> > the ax88179_178a USB ethernet devices still work under xHCI (a bit with
>> > worse performance), and other USB ethernet devices still get the
>> > performance improvement introduced in 3.12.
>>
>> No other usbnet drivers has enabled SG...  Which is why you have only
>> seen this problem with the ax88179_178a devices.  So there is no
>> performance improvement to keep.

In my test environment, the patch does improve both throughput and
cpu utilization, if you search the previous email for the patch, you can
see the data.

Thanks,
--
Ming Lei
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
  2014-01-30 21:18 [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned Sarah Sharp
                   ` (2 preceding siblings ...)
  2014-01-30 21:50 ` Bjørn Mork
@ 2014-01-31 10:14 ` David Laight
  2014-01-31 13:21   ` Peter Stuge
  3 siblings, 1 reply; 28+ messages in thread
From: David Laight @ 2014-01-31 10:14 UTC (permalink / raw)
  To: 'Sarah Sharp'
  Cc: linux-usb, netdev, Greg Kroah-Hartman, David Miller,
	Dan Williams, Nyman, Mathias, Mark Lord, Alan Stern, Freddy Xin

From: Sarah Sharp
> We need to step back and reassess the larger picture here, instead of
> trying to fire-fight all the regressions caused by the link TRB commit
> (35773dac5f86 "usb: xhci: Link TRB must not occur within a USB payload
> burst").

Some of the breakage seems to have been related to the PM and readq/writeq
changes.

The main problem with that patch is that it limited the number of
fragments.

> We shouldn't need to make userspace start to worry about alignment at
> all.  libusb worked in the past, before the link TRB fix went in.  We
> *cannot* break userspace USB drivers.  The breakage needs to be fixed in
> the USB core or the xHCI driver.

Userspace doesn't care since everything gets copied into aligned
kernel fragments - otherwise the other usb controllers wouldn't work.

> Commit 35773dac5f86 was meant to be a short-term bandaid fix, but it's
> already caused at least four different regressions.  Some we've fixed,
> some have proposed solutions that David has sent.
> 
> The storage layer is getting borked because it submits scatter-gather
> lists longer than what will fit on a segment, and now libusb has the
> same issue.  One xHCI host controller stopped responding to commands,
> and reverting the bandaid fix helped.  The implications of this change
> just keep coming in, and I'm not comfortable wall-papering over the
> issues.

The transfers from the storage layer are actually all 'suitably aligned'.
Fragments can cross 64k boundaries, but they all start on 4k boundaries.

> On the flip side, it seems that the only devices that have been helped
> by the bandaid fix patch are USB ethernet devices using the ax88179_178a
> driver.  (Mark Lord still needs to confirm which device he uses.)  I
> have not seen any other reports that other USB ethernet chipsets were
> broken in 3.12 by the USB networking layer adding scatter-gather
> support.

That is the only usbnet driver for which SG support is enabled.
I believe it was all enabled because the ax88179_178a is the only
one that can be expected to saturate Ge and supports TCP segmentation
offload (TSO) - where the buffer is almost always fragmented.
With TSO transmits are almost certainly single fragments.

> It should not matter what alignment or length of scatter-gather list the
> upper layers pass the xHCI driver, it should just work.  I want to do
> this fix right, by changing the fundamental way we queue TRBs to the
> rings to fit the TD rules.  We should break each TD into fragment-sized
> chunks, and add a link TRB in the middle of a segment where necessary.

There will always be some transfer requests that make this impossible
(without using a bounce buffer).
In practise nothing will send such bad transfers.
To avoid excessive work you need to be told whether the transfer is
aligned (everything from the block layer and libusb is) or misaligned
(potentially everything from usbnet).
The limits on the number length of SG list has to be different for the
two types of request.

> Let's do this fix the right way, instead of wall papering over the
> issue.  Here's what we should do:
> 
> 1. Disable scatter-gather for the ax88179_178a driver when it's under an
>    xHCI host.

That can be done by simple not setting the flag on the xhci driver.
However you also need to double check that this disables TSO.

> 2. Revert the following commits:
>    f2d9b991c549 xhci: Set scatter-gather limit to avoid failed block writes.
>    d6c9ea9069af xhci: Avoid infinite loop when sg urb requires too many trbs
>    35773dac5f86 usb: xhci: Link TRB must not occur within a USB payload burst
> 
> 3. Dan and Mathias can work together to come up with an overall plan to
>    change the xHCI driver architecture to be fully compliant with the TD
>    fragment rules.  That can be done over the next few kernel releases.

Don't forget that these rules can affect isoc transfers as well.
Even without SG, the buffer can cross a 64k boundary and thus
need splitting into separate TRB - which might need to be in
different ring segments.
Easily fixable by writing the LINK early (expect that the TRB writing
code looks for LINK TRBs).
 
> The end result is that we don't destabilize storage or break userspace
> USB drivers, we don't break people's xHCI host controllers,
> the ax88179_178a USB ethernet devices still work under xHCI (a bit with
> worse performance), and other USB ethernet devices still get the
> performance improvement introduced in 3.12.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
  2014-01-31 10:14 ` David Laight
@ 2014-01-31 13:21   ` Peter Stuge
  2014-01-31 13:52     ` David Laight
  0 siblings, 1 reply; 28+ messages in thread
From: Peter Stuge @ 2014-01-31 13:21 UTC (permalink / raw)
  To: David Laight
  Cc: 'Sarah Sharp',
	linux-usb, netdev, Greg Kroah-Hartman, David Miller,
	Dan Williams, Nyman, Mathias, Mark Lord, Alan Stern, Freddy Xin

David Laight wrote:
> > We shouldn't need to make userspace start to worry about alignment at
> > all.  libusb worked in the past, before the link TRB fix went in.  We
> > *cannot* break userspace USB drivers.  The breakage needs to be fixed in
> > the USB core or the xHCI driver.
> 
> Userspace doesn't care since everything gets copied into aligned
> kernel fragments - otherwise the other usb controllers wouldn't work.

OK, but not so great if someone wants to squeeze the most performance
possible out of USB also from userspace.

I'm going off on a tangent now but would it make sense to allow
userspace to do alignment if it wants to, and have a way to tell
the kernel when urb buffers are pre-aligned?


//Peter

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
  2014-01-31 13:21   ` Peter Stuge
@ 2014-01-31 13:52     ` David Laight
  0 siblings, 0 replies; 28+ messages in thread
From: David Laight @ 2014-01-31 13:52 UTC (permalink / raw)
  To: 'Peter Stuge'
  Cc: 'Sarah Sharp',
	linux-usb, netdev, Greg Kroah-Hartman, David Miller,
	Dan Williams, Nyman, Mathias, Mark Lord, Alan Stern, Freddy Xin

From: Peter Stuge [mailto:peter@stuge.se]
> > Userspace doesn't care since everything gets copied into aligned
> > kernel fragments - otherwise the other usb controllers wouldn't work.
> 
> OK, but not so great if someone wants to squeeze the most performance
> possible out of USB also from userspace.
> 
> I'm going off on a tangent now but would it make sense to allow
> userspace to do alignment if it wants to, and have a way to tell
> the kernel when urb buffers are pre-aligned?

I can only see that mattering if either:
1) The userspace buffers are (say) 4n+1 aligned and the kernel
   decides to align the copy_from_user().
2) The code is doing buffer-loaning.

Personally I'm not at all sure how often buffer-loaning helps
(given the cost of the TLB shootdowns that it often implies).
I guess it might be ok if the memory doesn't have to be given
a KVA and the user program avoids any COW.
In any case and such code could be limited to page-aligned transfers.
And/or the user code would have to know at least some of the constraints.

The other usb controllers only support 'message' aligned transfers
(512 bytes for USB2). All the code I've found achieves this by using
page aligned (maybe 4k aligned) fragments. xhci trivially supports this
(it has since the 'ring expansion' code was added).

xhci can also easily support:
1) Arbitrary fragmentation for a limited number of fragments
   (by constraining the fragments to a single ring segment).
2) Arbitrary fragmentation provided all the fragments (except the
   last) exceed some minimal length (by splitting the current or
   previous fragment at the appropriate boundary).

The code for the second is probably worth adding just in case
a 4k fragment crosses a 64k boundary.

Since arbitrarily fragmented packets can't be sent to other controllers
it does seem sensible for the code generating the urb so say that
it is (or might be) fragmented like that.

	David

 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
  2014-01-30 22:15   ` Sarah Sharp
  2014-01-31  0:17     ` Ming Lei
@ 2014-01-31 15:22     ` David Laight
  1 sibling, 0 replies; 28+ messages in thread
From: David Laight @ 2014-01-31 15:22 UTC (permalink / raw)
  To: 'Sarah Sharp', Bjørn Mork
  Cc: linux-usb, netdev, Greg Kroah-Hartman, David Miller,
	Dan Williams, Nyman, Mathias, Mark Lord, Alan Stern, Freddy Xin,
	Ming Lei

From: Sarah Sharp
> On Thu, Jan 30, 2014 at 10:50:21PM +0100, Bjørn Mork wrote:
> > FWIW, the plan looks fine to me.  Just adding a couple of hints to
> > simplify the implementation.
> >
> > Sarah Sharp <sarah.a.sharp@linux.intel.com> writes:
> >
> > > Let's do this fix the right way, instead of wall papering over the
> > > issue.  Here's what we should do:
> > >
> > > 1. Disable scatter-gather for the ax88179_178a driver when it's under an
> > >    xHCI host.
> >
> > No need to make this conditional.  SG is only enabled in the
> > ax88179_178a driver if udev->bus->no_sg_constraint is true, so it
> > applies only to xHCI hosts in the first place.

Leave the usbnet code alone and unset udev->bus->no_sg_constraint.

	David

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
  2014-01-31  0:17     ` Ming Lei
@ 2014-01-31 19:00       ` Sarah Sharp
  2014-02-01  7:54         ` Ming Lei
  0 siblings, 1 reply; 28+ messages in thread
From: Sarah Sharp @ 2014-01-31 19:00 UTC (permalink / raw)
  To: Ming Lei
  Cc: Bjørn Mork, David Laight, linux-usb, netdev,
	Greg Kroah-Hartman, David Miller, Dan Williams, Nyman, Mathias,
	Mark Lord, Alan Stern, Freddy Xin

On Fri, Jan 31, 2014 at 08:17:58AM +0800, Ming Lei wrote:
> On Fri, Jan 31, 2014 at 6:15 AM, Sarah Sharp
> <sarah.a.sharp@linux.intel.com> wrote:
> > On Thu, Jan 30, 2014 at 10:50:21PM +0100, Bjørn Mork wrote:
> >> FWIW, the plan looks fine to me.  Just adding a couple of hints to
> >> simplify the implementation.
> >>
> >> Sarah Sharp <sarah.a.sharp@linux.intel.com> writes:
> >>
> >> > Let's do this fix the right way, instead of wall papering over the
> >> > issue.  Here's what we should do:
> >> >
> >> > 1. Disable scatter-gather for the ax88179_178a driver when it's under an
> >> >    xHCI host.
> >>
> >> No need to make this conditional.  SG is only enabled in the
> >> ax88179_178a driver if udev->bus->no_sg_constraint is true, so it
> >> applies only to xHCI hosts in the first place.
> >
> > Ah, so you're suggesting just reverting commit
> > 3804fad45411b48233b48003e33a78f290d227c8 "USBNET: ax88179_178a: enable
> > tso if usb host supports sg dma"?
> 
> If I understand the problem correctly, the current issue is that xhci driver
> doesn't support the arbitrary dma length not well, but per XHCI spec, it
> should be supported, right?
> 
> If the above is correct, reverting the commit isn't correct since there isn't
> any issue about the commit, so I suggest to disable the flag in xhci
> for the buggy devices, and it may be enabled again if the problem is fixed.

Ok, I like that plan, since it means I don't have to touch any
networking code to fix this. :)

I believe that means we'll have to disable the flag for all 1.0 xHCI
hosts, since those are the ones that need TD fragments.

> >> > 2. Revert the following commits:
> >> >    f2d9b991c549 xhci: Set scatter-gather limit to avoid failed block writes.
> >> >    d6c9ea9069af xhci: Avoid infinite loop when sg urb requires too many trbs
> >> >    35773dac5f86 usb: xhci: Link TRB must not occur within a USB payload burst
> >> >
> >> > 3. Dan and Mathias can work together to come up with an overall plan to
> >> >    change the xHCI driver architecture to be fully compliant with the TD
> >> >    fragment rules.  That can be done over the next few kernel releases.
> >> >
> >> > The end result is that we don't destabilize storage or break userspace
> >> > USB drivers, we don't break people's xHCI host controllers,
> >> > the ax88179_178a USB ethernet devices still work under xHCI (a bit with
> >> > worse performance), and other USB ethernet devices still get the
> >> > performance improvement introduced in 3.12.
> >>
> >> No other usbnet drivers has enabled SG...  Which is why you have only
> >> seen this problem with the ax88179_178a devices.  So there is no
> >> performance improvement to keep.
> 
> In my test environment, the patch does improve both throughput and
> cpu utilization, if you search the previous email for the patch, you can
> see the data.

Right, I did see the performance improvement note in that commit.  Do
you know if the ARM A15 dual core board was using a 0.96 xHCI host, or a
1.0 host?  You can find out by reloading the xHCI driver with dynamic
debugging turned on:

# sudo modprobe xhci_hcd dyndbg

and then look for lines like:

[25296.765767] xhci_hcd 0000:00:14.0: HCIVERSION: 0x100

Sarah Sharp

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
  2014-01-31 19:00       ` Sarah Sharp
@ 2014-02-01  7:54         ` Ming Lei
  2014-02-01 13:30           ` Mark Lord
  0 siblings, 1 reply; 28+ messages in thread
From: Ming Lei @ 2014-02-01  7:54 UTC (permalink / raw)
  To: Sarah Sharp
  Cc: Bjørn Mork, David Laight, linux-usb, netdev,
	Greg Kroah-Hartman, David Miller, Dan Williams, Nyman, Mathias,
	Mark Lord, Alan Stern, Freddy Xin

On Sat, Feb 1, 2014 at 3:00 AM, Sarah Sharp
<sarah.a.sharp@linux.intel.com> wrote:
> On Fri, Jan 31, 2014 at 08:17:58AM +0800, Ming Lei wrote:
>> On Fri, Jan 31, 2014 at 6:15 AM, Sarah Sharp
>> <sarah.a.sharp@linux.intel.com> wrote:
>> > On Thu, Jan 30, 2014 at 10:50:21PM +0100, Bjørn Mork wrote:
>> >> FWIW, the plan looks fine to me.  Just adding a couple of hints to
>> >> simplify the implementation.
>> >>
>> >> Sarah Sharp <sarah.a.sharp@linux.intel.com> writes:
>> >>
>> >> > Let's do this fix the right way, instead of wall papering over the
>> >> > issue.  Here's what we should do:
>> >> >
>> >> > 1. Disable scatter-gather for the ax88179_178a driver when it's under an
>> >> >    xHCI host.
>> >>
>> >> No need to make this conditional.  SG is only enabled in the
>> >> ax88179_178a driver if udev->bus->no_sg_constraint is true, so it
>> >> applies only to xHCI hosts in the first place.
>> >
>> > Ah, so you're suggesting just reverting commit
>> > 3804fad45411b48233b48003e33a78f290d227c8 "USBNET: ax88179_178a: enable
>> > tso if usb host supports sg dma"?
>>
>> If I understand the problem correctly, the current issue is that xhci driver
>> doesn't support the arbitrary dma length not well, but per XHCI spec, it
>> should be supported, right?
>>
>> If the above is correct, reverting the commit isn't correct since there isn't
>> any issue about the commit, so I suggest to disable the flag in xhci
>> for the buggy devices, and it may be enabled again if the problem is fixed.
>
> Ok, I like that plan, since it means I don't have to touch any
> networking code to fix this. :)
>
> I believe that means we'll have to disable the flag for all 1.0 xHCI
> hosts, since those are the ones that need TD fragments.
>
>> >> > 2. Revert the following commits:
>> >> >    f2d9b991c549 xhci: Set scatter-gather limit to avoid failed block writes.
>> >> >    d6c9ea9069af xhci: Avoid infinite loop when sg urb requires too many trbs
>> >> >    35773dac5f86 usb: xhci: Link TRB must not occur within a USB payload burst
>> >> >
>> >> > 3. Dan and Mathias can work together to come up with an overall plan to
>> >> >    change the xHCI driver architecture to be fully compliant with the TD
>> >> >    fragment rules.  That can be done over the next few kernel releases.
>> >> >
>> >> > The end result is that we don't destabilize storage or break userspace
>> >> > USB drivers, we don't break people's xHCI host controllers,
>> >> > the ax88179_178a USB ethernet devices still work under xHCI (a bit with
>> >> > worse performance), and other USB ethernet devices still get the
>> >> > performance improvement introduced in 3.12.
>> >>
>> >> No other usbnet drivers has enabled SG...  Which is why you have only
>> >> seen this problem with the ax88179_178a devices.  So there is no
>> >> performance improvement to keep.
>>
>> In my test environment, the patch does improve both throughput and
>> cpu utilization, if you search the previous email for the patch, you can
>> see the data.

With SG enabled, for the iperf client test case, the average urb size
for transmission will be increased from ~1500 to ~20K bytes in my
test case:

   iperf -c $SRV -t 30 -P 4 -w 128K

So I am wondering you guys do not care the improvement, maybe
the CPU is powerful enough to not degrade throughout&cpu
utilization not much, but there is still the potential CPU wakeup issue,
which means extra CPU power consumption might be introduced
after disabling SG for usbnet.

>
> Right, I did see the performance improvement note in that commit.  Do
> you know if the ARM A15 dual core board was using a 0.96 xHCI host, or a
> 1.0 host?  You can find out by reloading the xHCI driver with dynamic
> debugging turned on:
>
> # sudo modprobe xhci_hcd dyndbg

Looks I can't find the parameter 'dyndbg' for xhci_hcd.

> and then look for lines like:
>
> [25296.765767] xhci_hcd 0000:00:14.0: HCIVERSION: 0x100

I change xhci_dbg.c manually with below:

diff --git a/drivers/usb/host/xhci-dbg.c b/drivers/usb/host/xhci-dbg.c
index b016d38..1ae1966 100644
--- a/drivers/usb/host/xhci-dbg.c
+++ b/drivers/usb/host/xhci-dbg.c
@@ -66,7 +66,7 @@ static void xhci_print_cap_regs(struct xhci_hcd *xhci)
                        (unsigned int) temp);
        xhci_dbg(xhci, "CAPLENGTH: 0x%x\n",
                        (unsigned int) HC_LENGTH(temp));
-       xhci_dbg(xhci, "HCIVERSION: 0x%x\n",
+       dev_info(xhci_to_hcd(xhci)->self.controller, "HCIVERSION: 0x%x\n",
                        (unsigned int) HC_VERSION(temp));

and got the below output:

[tom@ming ~]$ dmesg | grep HCIVERSION
xhci-hcd xhci-hcd.2.auto: HCIVERSION: 0x100

Thanks,
--
Ming Lei

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
  2014-02-01  7:54         ` Ming Lei
@ 2014-02-01 13:30           ` Mark Lord
  2014-02-01 14:18             ` Ming Lei
  0 siblings, 1 reply; 28+ messages in thread
From: Mark Lord @ 2014-02-01 13:30 UTC (permalink / raw)
  To: Ming Lei, Sarah Sharp
  Cc: Bjørn Mork, David Laight, linux-usb, netdev,
	Greg Kroah-Hartman, David Miller, Dan Williams, Nyman, Mathias,
	Alan Stern, Freddy Xin

On 14-02-01 02:54 AM, Ming Lei wrote:
..
> With SG enabled, for the iperf client test case, the average urb size
> for transmission will be increased from ~1500 to ~20K bytes in my
> test case:
> 
>    iperf -c $SRV -t 30 -P 4 -w 128K
> 
> So I am wondering you guys do not care the improvement ..

No, that's not it.  Simply, the recent changes killed the driver
for some users, something Linus calls a "regression", and does not permit.

Far better to have it continue to work than not to work.
The plan discussed earlier calls for reintroduction of SG here
once the problems are solved outside of the main tree.
-- 
Mark Lord
Real-Time Remedies Inc.
mlord@pobox.com

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
  2014-02-01 13:30           ` Mark Lord
@ 2014-02-01 14:18             ` Ming Lei
  2014-02-01 20:05               ` Mark Lord
  0 siblings, 1 reply; 28+ messages in thread
From: Ming Lei @ 2014-02-01 14:18 UTC (permalink / raw)
  To: Mark Lord
  Cc: Sarah Sharp, Bjørn Mork, David Laight, linux-usb, netdev,
	Greg Kroah-Hartman, David Miller, Dan Williams, Nyman, Mathias,
	Alan Stern, Freddy Xin

On Sat, Feb 1, 2014 at 9:30 PM, Mark Lord <mlord@pobox.com> wrote:
> On 14-02-01 02:54 AM, Ming Lei wrote:
> ..
>> With SG enabled, for the iperf client test case, the average urb size
>> for transmission will be increased from ~1500 to ~20K bytes in my
>> test case:
>>
>>    iperf -c $SRV -t 30 -P 4 -w 128K
>>
>> So I am wondering you guys do not care the improvement ..
>
> No, that's not it.  Simply, the recent changes killed the driver

I just want to clarify the sg approach does improve performance,
instead of no improvement mentioned by your guys.

> for some users, something Linus calls a "regression", and does not permit.

Even real regressions are easily/often introduced, and we are discussing
how to fix that. I suggest to unset the flag only for the known buggy
controllers.


Thanks,
--
Ming Lei

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
  2014-02-01 14:18             ` Ming Lei
@ 2014-02-01 20:05               ` Mark Lord
       [not found]                 ` <52ED5381.2010106-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 28+ messages in thread
From: Mark Lord @ 2014-02-01 20:05 UTC (permalink / raw)
  To: Ming Lei
  Cc: Sarah Sharp, Bjørn Mork, David Laight, linux-usb, netdev,
	Greg Kroah-Hartman, David Miller, Dan Williams, Nyman, Mathias,
	Alan Stern, Freddy Xin

On 14-02-01 09:18 AM, Ming Lei wrote:
>
> Even real regressions are easily/often introduced, and we are discussing
> how to fix that. I suggest to unset the flag only for the known buggy
> controllers.


It is not the controllers that are particularly "buggy" here.
But rather the drivers and design of parts of the kernel.

Cheers
-- 
Mark Lord
Real-Time Remedies Inc.
mlord@pobox.com

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
       [not found]                 ` <52ED5381.2010106-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org>
@ 2014-02-03  9:54                   ` David Laight
  2014-02-03 17:56                     ` Sarah Sharp
  2014-02-03 17:55                   ` Sarah Sharp
  1 sibling, 1 reply; 28+ messages in thread
From: David Laight @ 2014-02-03  9:54 UTC (permalink / raw)
  To: 'Mark Lord', Ming Lei
  Cc: Sarah Sharp, Bjørn Mork, linux-usb-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, Greg Kroah-Hartman, David Miller,
	Dan Williams, Nyman, Mathias, Alan Stern, Freddy Xin

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1359 bytes --]

From: Mark Lord
> On 14-02-01 09:18 AM, Ming Lei wrote:
> >
> > Even real regressions are easily/often introduced, and we are discussing
> > how to fix that. I suggest to unset the flag only for the known buggy
> > controllers.
> 
> It is not the controllers that are particularly "buggy" here.
> But rather the drivers and design of parts of the kernel.

I suspect that the documentation is describing the actual implementation
of a specific hardware implementation, not necessarily how the hardware was
intended to behave.

The requirement for two 32bit accesses to a 64bit register is very similar.

This also means that implementations of the hardware that claim conformance
to the 0.96 specification might have similar issues.

Given the small number of xhci controllers and the even smaller number of
VHDL (or similar) sources they will be based on, it really ought to be
possible to tabulate the controller versions and families to get a much
better idea of their behaviour.

I've got two systems with Intel USB3 controllers, linux reports one as
'panther point', the other as '7 Series/C210 Series' (seems to be a Xeon
chipset). I've no idea how the latter relates to the former.

	David

N‹§²æìr¸›yúèšØb²X¬¶Ç§vØ^–)Þº{.nÇ+‰·¥Š{±ºÆâžØ^n‡r¡ö¦zË\x1aëh™¨è­Ú&¢îý»\x05ËÛÔØï¦v¬Îf\x1dp)¹¹br	šê+€Ê+zf£¢·hšˆ§~†­†Ûiÿûàz¹\x1e®w¥¢¸?™¨è­Ú&¢)ߢ^[f

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
       [not found]                 ` <52ED5381.2010106-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org>
  2014-02-03  9:54                   ` David Laight
@ 2014-02-03 17:55                   ` Sarah Sharp
  1 sibling, 0 replies; 28+ messages in thread
From: Sarah Sharp @ 2014-02-03 17:55 UTC (permalink / raw)
  To: Mark Lord
  Cc: Ming Lei, Bjørn Mork, David Laight,
	linux-usb-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	Greg Kroah-Hartman, David Miller, Dan Williams, Nyman, Mathias,
	Alan Stern, Freddy Xin

On Sat, Feb 01, 2014 at 03:05:21PM -0500, Mark Lord wrote:
> On 14-02-01 09:18 AM, Ming Lei wrote:
> >
> > Even real regressions are easily/often introduced, and we are discussing
> > how to fix that. I suggest to unset the flag only for the known buggy
> > controllers.

Ming, the regression cannot be easily fixed in this case.  We tried the
"easy, quick fix" and it broke USB storage and usbfs.  The patches to
paper over those issues started to creep into the upper layers, and I'm
not willing to add more code to hack around the issues caused by the
"quick fix".  We need to do this right, not wall-paper over the issues.

> It is not the controllers that are particularly "buggy" here.
> But rather the drivers and design of parts of the kernel.

As Mark mentioned, the host controllers aren't buggy.  The xHCI driver
simply doesn't handle a 1.0 host controller requirement, TD fragments,
very well.  Only the USB ethernet layer triggers this bug, because the
USB storage layer hands down scatter-gather lists in multiples of the
max packet size.

You tested on a 1.0 host controller, and it apparently didn't need the
TD fragments requirement.  It seems that Intel 1.0 xHCI host controllers
do need that requirement.  Perhaps we can add an xHCI driver quirk for
an exception so that your host can allow any kind of scatter-gather?

Sarah Sharp
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
  2014-02-03  9:54                   ` David Laight
@ 2014-02-03 17:56                     ` Sarah Sharp
  0 siblings, 0 replies; 28+ messages in thread
From: Sarah Sharp @ 2014-02-03 17:56 UTC (permalink / raw)
  To: David Laight
  Cc: 'Mark Lord',
	Ming Lei, Bjørn Mork, linux-usb, netdev, Greg Kroah-Hartman,
	David Miller, Dan Williams, Nyman, Mathias, Alan Stern,
	Freddy Xin

On Mon, Feb 03, 2014 at 09:54:09AM +0000, David Laight wrote:
> From: Mark Lord
> > On 14-02-01 09:18 AM, Ming Lei wrote:
> > >
> > > Even real regressions are easily/often introduced, and we are discussing
> > > how to fix that. I suggest to unset the flag only for the known buggy
> > > controllers.
> > 
> > It is not the controllers that are particularly "buggy" here.
> > But rather the drivers and design of parts of the kernel.
> 
> I suspect that the documentation is describing the actual implementation
> of a specific hardware implementation, not necessarily how the hardware was
> intended to behave.

You are speculating.  Please stop speculating without evidence.  It does
not add to this conversation.

Sarah Sharp

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
  2014-01-30 16:35           ` Peter Stuge
@ 2014-01-31  9:30             ` David Laight
  0 siblings, 0 replies; 28+ messages in thread
From: David Laight @ 2014-01-31  9:30 UTC (permalink / raw)
  To: 'Peter Stuge'
  Cc: linux-usb, netdev, Sarah Sharp, Greg Kroah-Hartman, David Miller

From: Peter Stuge
> But what about that alignment? It seems that userspace
> needs to start caring about alignment with xhci, right?

No because there is a copy_to/from_user() in the middle.

The ehci/ohci/uhci all require that fragments be a multiple of the
usb message size (512 bytes for USB2).
So everything (until very recently) would always supply suitable
aligned buffers. Mostly they are page aligned.

For those who haven't read the xhci spec carefully:

The xhci controller removes the requirement on dma segments being
aligned to usb messages.
However there are two alignment requirements:
1) dma segments must not cross 64k address boundaries.
   This is documented clearly, even though it is a slight pain.
   You'd have thought the address counter could have more than
   16 bits these days!
   There only 17 bits for the length, but a length restriction
   would be less of a problem.
2) The v1.00 version of the specification adds that the end of
   the transfer ring can only occur at a 'TD fragment' boundary.
   These are aligned with the payload 'bursts' - which can be
   sixteen 1k packets.
I think that breaking the second of these causes a usb message
be split into two small pieces - which will terminate bulk xfers.
The asix USB3 Ge silicon gets very confused when this happens.

	David

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
       [not found]         ` <063D6719AE5E284EB5DD2968C1650D6D0F6B553D-VkEWCZq2GCInGFn1LkZF6NBPR1lH4CV8@public.gmane.org>
@ 2014-01-30 16:35           ` Peter Stuge
  2014-01-31  9:30             ` David Laight
  0 siblings, 1 reply; 28+ messages in thread
From: Peter Stuge @ 2014-01-30 16:35 UTC (permalink / raw)
  To: David Laight
  Cc: linux-usb-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	Sarah Sharp, Greg Kroah-Hartman, David Miller

David Laight wrote:
> > Where's the 8k coming from?
> 
> My memory, I meant 16k :-(

No problem. But what about that alignment? It seems that userspace
needs to start caring about alignment with xhci, right?


//Peter
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
       [not found]     ` <20140130161721.25560.qmail-Y+HMSxxDrH8@public.gmane.org>
@ 2014-01-30 16:30       ` David Laight
       [not found]         ` <063D6719AE5E284EB5DD2968C1650D6D0F6B553D-VkEWCZq2GCInGFn1LkZF6NBPR1lH4CV8@public.gmane.org>
  0 siblings, 1 reply; 28+ messages in thread
From: David Laight @ 2014-01-30 16:30 UTC (permalink / raw)
  To: 'Peter Stuge'
  Cc: linux-usb-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	Sarah Sharp, Greg Kroah-Hartman, David Miller

From: Peter Stuge 
...
> > code using libusb can generate arbitrarily long transfers that usually
> > get split into 8k fragments.
> 
> libusb splits transfers into 16k urbs, or doesn't with newer code
> when both kernel and libusb support scatter-gather.
> 
> > In fact libusb always uses 8k fragments.
> 
> Hm? Worst-case libusb-1.0 submits 16k urbs. libusb-0.1 I'm unsure
> about, but could check.
...
> Where's the 8k coming from?

My memory, I meant 16k :-(

	David



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
       [not found] ` <063D6719AE5E284EB5DD2968C1650D6D0F6B5486-VkEWCZq2GCInGFn1LkZF6NBPR1lH4CV8@public.gmane.org>
@ 2014-01-30 16:17   ` Peter Stuge
       [not found]     ` <20140130161721.25560.qmail-Y+HMSxxDrH8@public.gmane.org>
  0 siblings, 1 reply; 28+ messages in thread
From: Peter Stuge @ 2014-01-30 16:17 UTC (permalink / raw)
  To: David Laight
  Cc: linux-usb-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	Sarah Sharp, Greg Kroah-Hartman, David Miller

David Laight wrote:
> Some xhci (USB3) controllers have a constraint on the offset within a
> bulk transfer of the end of the transfer ring.

Mhm.


> code using libusb can generate arbitrarily long transfers that usually
> get split into 8k fragments.

libusb splits transfers into 16k urbs, or doesn't with newer code
when both kernel and libusb support scatter-gather.


> In fact libusb always uses 8k fragments.

Hm? Worst-case libusb-1.0 submits 16k urbs. libusb-0.1 I'm unsure
about, but could check.

When both sides support it, scatter-gather is used and a single urb
is submitted.

IIRC usbfs doesn't mess with urb buffers at all.

Where's the 8k coming from?


> This all means that the xhci driver needs to accept unlimited numbers
> of 'aligned' fragments and only restrict the number of misaligned ones.

libusb applications have so far never made efforts to align their
buffers to anything. That seems to become relevant for zero-copy?


//Peter
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
@ 2014-01-30 16:00 David Laight
       [not found] ` <063D6719AE5E284EB5DD2968C1650D6D0F6B5486-VkEWCZq2GCInGFn1LkZF6NBPR1lH4CV8@public.gmane.org>
  0 siblings, 1 reply; 28+ messages in thread
From: David Laight @ 2014-01-30 16:00 UTC (permalink / raw)
  To: linux-usb, netdev, Sarah Sharp, Greg Kroah-Hartman, David Miller

Some xhci (USB3) controllers have a constraint on the offset within a
bulk transfer of the end of the transfer ring.

The xhci documentation (v1.00, but not the earlier versions) states that
the offset (from the beginning of the transfer) at end of the transfer
ring must be a multiple of the burst size (this is actually 16k for USB3
since the controller is configured for 16 message bursts).
However the effect is probably that the transfer is split at the ring end,
so the target will see correct messages provided the data is 1k aligned.

This mostly affects scatter-gather transfer requests, but can potentially
affect other requests as they must be split on 64k address boundaries.
(It might even affect non-bulk transfers.)

The only known current source of such misaligned transfers is the
ax88179_178a ethernet driver. The hardware stops transmitting ethernet
frames when the host controller (presumably) spilts a 1k message.

Not all host controllers behave this way.
The Intel Panther Point used on recent motherboards is affected.

A fix has been applied to the xhci driver (and backported), however this
has a side effect of limiting the number of fragments that can be sent.
(It works by putting all the buffer fragments in one ring segment.)

The SCSI system generates more fragments than was originally thought, and
code using libusb can generate arbitrarily long transfers that usually
get split into 8k fragments.

We've had reports of 4MB libusb requests failing. A 16MB request would
require 256 fragments (because of the requirement to not cross a 64k
address boundary) so could not be fitted into the 255 ring slots regarless
of the number and alignment of any fragments.

In fact libusb always uses 8k fragments. Anything over 1M can't be
split with the current limit of 128 fragments and is sent unfragmented.
This leads to kmalloc() failures.

This all means that the xhci driver needs to accept unlimited numbers
of 'aligned' fragments and only restrict the number of misaligned ones.

None of the other USB controllers allow buffer fragments that cross
USB message boundaries (512 bytes for USB2), so almost all the code
uses aligned buffers. Potentially these might cross 64k boundaries
at unaligned offsets, but I suspect that really doesn't happen.

So rather than change all the code that generates urbs, this patch
modifies the only code that generates misaligned transfares to tell
the host controller that the buffer might have alignment issues.

The patch:
- Adds the flag URB_UNCONSTRAINED_XFER to urb->transfer_flags.
  This reuses the value of URB_ASYNC_UNLINK (removed in 2005).
- Sets the flag in usbnet.c for all transmit requests.
  Since the buffer offsets aren't aligned an unfragmented message might
  need splitting on a 64k boundary.
- Pass the transfer_flags down to prepare_ring() and only check for
  the end of ring segments (filling with NOPs) if the flag is set.
- Remove the advertised restriction on the number fragments xhci supports.

This doesn't actually define what a 'constrained' transfer is - but
that wasn't defined when no_sg_constraint was added to struct usb_bus.
Possibly there should also be separate limits of the number of 'constrained'
and 'unconstrained' scatter-gather lists. But and the moment the former
is (more or less) required to be infinite, and the limit of the latter
won't be reached by any code that sets the flag.

Signed-off-by: David Laight <david.laight@aculab.com>
---
 drivers/net/usb/usbnet.c     |  1 +
 drivers/usb/host/xhci-ring.c | 12 ++++++++----
 drivers/usb/host/xhci.c      |  8 ++++++--
 include/linux/usb.h          |  1 +
 4 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
index 4671da7..504be5b 100644
--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -1303,6 +1303,7 @@ netdev_tx_t usbnet_start_xmit (struct sk_buff *skb,
 		if (build_dma_sg(skb, urb) < 0)
 			goto drop;
 	}
+	urb->transfer_flags |= URB_UNCONSTRAINED_XFER;
 	length = urb->transfer_buffer_length;
 
 	/* don't assume the hardware handles USB_ZERO_PACKET
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index a0b248c..5860874 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2932,7 +2932,8 @@ static void queue_trb(struct xhci_hcd *xhci, struct xhci_ring *ring,
  * FIXME allocate segments if the ring is full.
  */
 static int prepare_ring(struct xhci_hcd *xhci, struct xhci_ring *ep_ring,
-		u32 ep_state, unsigned int num_trbs, gfp_t mem_flags)
+		u32 ep_state, unsigned int num_trbs, gfp_t mem_flags,
+		unsigned int transfer_flags)
 {
 	unsigned int num_trbs_needed;
 
@@ -2980,6 +2981,9 @@ static int prepare_ring(struct xhci_hcd *xhci, struct xhci_ring *ep_ring,
 			 * Simplest solution is to fill the trb before the
 			 * LINK with nop commands.
 			 */
+			if (!(transfer_flags & URB_UNCONSTRAINED_XFER))
+				/* Caller says buffer is aligned */
+				break;
 			if (num_trbs == 1 || num_trbs <= usable || usable == 0)
 				break;
 
@@ -3090,7 +3094,7 @@ static int prepare_transfer(struct xhci_hcd *xhci,
 
 	ret = prepare_ring(xhci, ep_ring,
 			   le32_to_cpu(ep_ctx->ep_info) & EP_STATE_MASK,
-			   num_trbs, mem_flags);
+			   num_trbs, mem_flags, urb->transfer_flags);
 	if (ret)
 		return ret;
 
@@ -3969,7 +3973,7 @@ int xhci_queue_isoc_tx_prepare(struct xhci_hcd *xhci, gfp_t mem_flags,
 	 * Do not insert any td of the urb to the ring if the check failed.
 	 */
 	ret = prepare_ring(xhci, ep_ring, le32_to_cpu(ep_ctx->ep_info) & EP_STATE_MASK,
-			   num_trbs, mem_flags);
+			   num_trbs, mem_flags, 0);
 	if (ret)
 		return ret;
 
@@ -4026,7 +4030,7 @@ static int queue_command(struct xhci_hcd *xhci, u32 field1, u32 field2,
 		reserved_trbs++;
 
 	ret = prepare_ring(xhci, xhci->cmd_ring, EP_STATE_RUNNING,
-			reserved_trbs, GFP_ATOMIC);
+			reserved_trbs, GFP_ATOMIC, 0);
 	if (ret < 0) {
 		xhci_err(xhci, "ERR: No room for command on command ring\n");
 		if (command_must_succeed)
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index ad36439..eab1c5c 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -4730,8 +4730,12 @@ int xhci_gen_setup(struct usb_hcd *hcd, xhci_get_quirks_t get_quirks)
 	struct device		*dev = hcd->self.controller;
 	int			retval;
 
-	/* Limit the block layer scatter-gather lists to half a segment. */
-	hcd->self.sg_tablesize = TRBS_PER_SEGMENT / 2;
+	/* The length of scatter-gather lists needs to be unlimited for
+	 * aligned lists (URB_UNCONSTRAINED_XFER unset).
+	 * There is currently no way of specifying the limit for
+	 * misaligned transfers.
+	 */
+	hcd->self.sg_tablesize = ~0u;
 
 	/* support to build packet from discontinuous buffers */
 	hcd->self.no_sg_constraint = 1;
diff --git a/include/linux/usb.h b/include/linux/usb.h
index c716da1..7f53034 100644
--- a/include/linux/usb.h
+++ b/include/linux/usb.h
@@ -1179,6 +1179,7 @@ extern int usb_disabled(void);
 #define URB_ISO_ASAP		0x0002	/* iso-only; use the first unexpired
 					 * slot in the schedule */
 #define URB_NO_TRANSFER_DMA_MAP	0x0004	/* urb->transfer_dma valid on submit */
+#define URB_UNCONSTRAINED_XFER	0x0010	/* data may not be aligned */
 #define URB_NO_FSBR		0x0020	/* UHCI-specific */
 #define URB_ZERO_PACKET		0x0040	/* Finish bulk OUT with short packet */
 #define URB_NO_INTERRUPT	0x0080	/* HINT: no non-error interrupt
-- 
1.8.1.2

^ permalink raw reply related	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2014-02-03 17:56 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-30 21:18 [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned Sarah Sharp
2014-01-30 21:39 ` Mark Lord
2014-01-30 21:42   ` Mark Lord
2014-01-30 21:43 ` Alan Stern
2014-01-30 21:48   ` Mark Lord
2014-01-30 21:55   ` Sarah Sharp
2014-01-30 22:05     ` Bjørn Mork
2014-01-30 22:07     ` Alan Stern
2014-01-30 21:50 ` Bjørn Mork
2014-01-30 22:15   ` Sarah Sharp
2014-01-31  0:17     ` Ming Lei
2014-01-31 19:00       ` Sarah Sharp
2014-02-01  7:54         ` Ming Lei
2014-02-01 13:30           ` Mark Lord
2014-02-01 14:18             ` Ming Lei
2014-02-01 20:05               ` Mark Lord
     [not found]                 ` <52ED5381.2010106-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org>
2014-02-03  9:54                   ` David Laight
2014-02-03 17:56                     ` Sarah Sharp
2014-02-03 17:55                   ` Sarah Sharp
2014-01-31 15:22     ` David Laight
2014-01-31 10:14 ` David Laight
2014-01-31 13:21   ` Peter Stuge
2014-01-31 13:52     ` David Laight
  -- strict thread matches above, loose matches on Subject: below --
2014-01-30 16:00 David Laight
     [not found] ` <063D6719AE5E284EB5DD2968C1650D6D0F6B5486-VkEWCZq2GCInGFn1LkZF6NBPR1lH4CV8@public.gmane.org>
2014-01-30 16:17   ` Peter Stuge
     [not found]     ` <20140130161721.25560.qmail-Y+HMSxxDrH8@public.gmane.org>
2014-01-30 16:30       ` David Laight
     [not found]         ` <063D6719AE5E284EB5DD2968C1650D6D0F6B553D-VkEWCZq2GCInGFn1LkZF6NBPR1lH4CV8@public.gmane.org>
2014-01-30 16:35           ` Peter Stuge
2014-01-31  9:30             ` David Laight

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.