linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] net: can: Increase tx queue length
@ 2019-03-09 14:07 Appana Durga Kedareswara rao
  2019-03-09 14:22 ` Andre Naujoks
  0 siblings, 1 reply; 10+ messages in thread
From: Appana Durga Kedareswara rao @ 2019-03-09 14:07 UTC (permalink / raw)
  To: wg, mkl, davem
  Cc: linux-can, netdev, linux-kernel, Appana Durga Kedareswara rao

While stress testing the CAN interface on xilinx axi can
in loopback mode getting message "write: no buffer space available"
Increasing device tx queue length resolved the above mentioned issue.

Signed-off-by: Appana Durga Kedareswara rao <appana.durga.rao@xilinx.com>
---
--> Network devices default tx_queue_len is 1000 but for socket
can device it is 10 any reason for it?? 
 
 drivers/net/can/dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c
index c05e4d5..32bd5be 100644
--- a/drivers/net/can/dev.c
+++ b/drivers/net/can/dev.c
@@ -642,7 +642,7 @@ static void can_setup(struct net_device *dev)
 	dev->mtu = CAN_MTU;
 	dev->hard_header_len = 0;
 	dev->addr_len = 0;
-	dev->tx_queue_len = 10;
+	dev->tx_queue_len = 500;
 
 	/* New-style flags. */
 	dev->flags = IFF_NOARP;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] net: can: Increase tx queue length
  2019-03-09 14:07 [PATCH] net: can: Increase tx queue length Appana Durga Kedareswara rao
@ 2019-03-09 14:22 ` Andre Naujoks
  2019-03-09 14:40   ` Appana Durga Kedareswara Rao
  0 siblings, 1 reply; 10+ messages in thread
From: Andre Naujoks @ 2019-03-09 14:22 UTC (permalink / raw)
  To: Appana Durga Kedareswara rao, wg, mkl, davem
  Cc: linux-can, netdev, linux-kernel

On 3/9/19 3:07 PM, Appana Durga Kedareswara rao wrote:
> While stress testing the CAN interface on xilinx axi can
> in loopback mode getting message "write: no buffer space available"
> Increasing device tx queue length resolved the above mentioned issue.

No need to patch the kernel:

$ ip link set <dev-name> txqueuelen 500

does the same thing.

> 
> Signed-off-by: Appana Durga Kedareswara rao <appana.durga.rao@xilinx.com>
> ---
> --> Network devices default tx_queue_len is 1000 but for socket
> can device it is 10 any reason for it?? 
>  
>  drivers/net/can/dev.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c
> index c05e4d5..32bd5be 100644
> --- a/drivers/net/can/dev.c
> +++ b/drivers/net/can/dev.c
> @@ -642,7 +642,7 @@ static void can_setup(struct net_device *dev)
>  	dev->mtu = CAN_MTU;
>  	dev->hard_header_len = 0;
>  	dev->addr_len = 0;
> -	dev->tx_queue_len = 10;
> +	dev->tx_queue_len = 500;
>  
>  	/* New-style flags. */
>  	dev->flags = IFF_NOARP;
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH] net: can: Increase tx queue length
  2019-03-09 14:22 ` Andre Naujoks
@ 2019-03-09 14:40   ` Appana Durga Kedareswara Rao
  2019-03-09 14:56     ` Andre Naujoks
                       ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Appana Durga Kedareswara Rao @ 2019-03-09 14:40 UTC (permalink / raw)
  To: Andre Naujoks, wg, mkl, davem; +Cc: linux-can, netdev, linux-kernel

Hi Andre,

<Snip> 
> 
> On 3/9/19 3:07 PM, Appana Durga Kedareswara rao wrote:
> > While stress testing the CAN interface on xilinx axi can in loopback
> > mode getting message "write: no buffer space available"
> > Increasing device tx queue length resolved the above mentioned issue.
> 
> No need to patch the kernel:
> 
> $ ip link set <dev-name> txqueuelen 500
> 
> does the same thing.

Thanks for the review... 
Agree but it is not an out of box solution right?? 
Do you have any idea for socket can devices why the tx queue length is 10 whereas
for other network devices (ex: ethernet) it is 1000 ?? 

Regards,
Kedar.
> 
> >
> > Signed-off-by: Appana Durga Kedareswara rao
> > <appana.durga.rao@xilinx.com>
> > ---
> > --> Network devices default tx_queue_len is 1000 but for socket
> > can device it is 10 any reason for it??
> >
> >  drivers/net/can/dev.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c index
> > c05e4d5..32bd5be 100644
> > --- a/drivers/net/can/dev.c
> > +++ b/drivers/net/can/dev.c
> > @@ -642,7 +642,7 @@ static void can_setup(struct net_device *dev)
> >  	dev->mtu = CAN_MTU;
> >  	dev->hard_header_len = 0;
> >  	dev->addr_len = 0;
> > -	dev->tx_queue_len = 10;
> > +	dev->tx_queue_len = 500;
> >
> >  	/* New-style flags. */
> >  	dev->flags = IFF_NOARP;
> >


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net: can: Increase tx queue length
  2019-03-09 14:40   ` Appana Durga Kedareswara Rao
@ 2019-03-09 14:56     ` Andre Naujoks
  2019-03-09 15:50     ` Toke Høiland-Jørgensen
  2019-03-10 13:36     ` Martin Jerabek
  2 siblings, 0 replies; 10+ messages in thread
From: Andre Naujoks @ 2019-03-09 14:56 UTC (permalink / raw)
  To: Appana Durga Kedareswara Rao, wg, mkl, davem, Oliver Hartkopp
  Cc: linux-can, netdev, linux-kernel

On 3/9/19 3:40 PM, Appana Durga Kedareswara Rao wrote:
> Hi Andre,
> 
> <Snip> 
>>
>> On 3/9/19 3:07 PM, Appana Durga Kedareswara rao wrote:
>>> While stress testing the CAN interface on xilinx axi can in loopback
>>> mode getting message "write: no buffer space available"
>>> Increasing device tx queue length resolved the above mentioned issue.
>>
>> No need to patch the kernel:
>>
>> $ ip link set <dev-name> txqueuelen 500
>>
>> does the same thing.
> 
> Thanks for the review... 
> Agree but it is not an out of box solution right?? 
> Do you have any idea for socket can devices why the tx queue length is 10 whereas
> for other network devices (ex: ethernet) it is 1000 ?? 

I think the 1000 queue length is the system default and CAN just
overwrites that (a vcan does curiously does not). There probably is a
reason for the short queue for CAN. But I don't know why it was set so
low. My guess is as good as yours.

I ran into your problem multiple times, too, when replaying recorded CAN
traces. We worked around it, by adding the txqueuelen to the setup
parameters for the CAN devices. If you use ifupdown (we use a different
solution), then you could probably put it in there somewhere.

I'd be reluctant to change that default without knowing why it was set
in the first place.

Maybe Oliver can help here?

Regards
  Andre

> 
> Regards,
> Kedar.
>>
>>>
>>> Signed-off-by: Appana Durga Kedareswara rao
>>> <appana.durga.rao@xilinx.com>
>>> ---
>>> --> Network devices default tx_queue_len is 1000 but for socket
>>> can device it is 10 any reason for it??
>>>
>>>  drivers/net/can/dev.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c index
>>> c05e4d5..32bd5be 100644
>>> --- a/drivers/net/can/dev.c
>>> +++ b/drivers/net/can/dev.c
>>> @@ -642,7 +642,7 @@ static void can_setup(struct net_device *dev)
>>>  	dev->mtu = CAN_MTU;
>>>  	dev->hard_header_len = 0;
>>>  	dev->addr_len = 0;
>>> -	dev->tx_queue_len = 10;
>>> +	dev->tx_queue_len = 500;
>>>
>>>  	/* New-style flags. */
>>>  	dev->flags = IFF_NOARP;
>>>
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH] net: can: Increase tx queue length
  2019-03-09 14:40   ` Appana Durga Kedareswara Rao
  2019-03-09 14:56     ` Andre Naujoks
@ 2019-03-09 15:50     ` Toke Høiland-Jørgensen
  2019-03-10  5:07       ` Dave Taht
  2019-03-10 13:36     ` Martin Jerabek
  2 siblings, 1 reply; 10+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-03-09 15:50 UTC (permalink / raw)
  To: Appana Durga Kedareswara Rao, Andre Naujoks, wg, mkl, davem
  Cc: linux-can, netdev, linux-kernel

Appana Durga Kedareswara Rao <appanad@xilinx.com> writes:

> Hi Andre,
>
> <Snip> 
>> 
>> On 3/9/19 3:07 PM, Appana Durga Kedareswara rao wrote:
>> > While stress testing the CAN interface on xilinx axi can in loopback
>> > mode getting message "write: no buffer space available"
>> > Increasing device tx queue length resolved the above mentioned issue.
>> 
>> No need to patch the kernel:
>> 
>> $ ip link set <dev-name> txqueuelen 500
>> 
>> does the same thing.
>
> Thanks for the review... 
> Agree but it is not an out of box solution right?? 
> Do you have any idea for socket can devices why the tx queue length is 10 whereas
> for other network devices (ex: ethernet) it is 1000 ??

Probably because you don't generally want a long queue adding latency on
a CAN interface? The default 1000 is already way too much even for an
Ethernet device in a lot of cases.

If you get "out of buffer" errors it means your application is sending
things faster than the receiver (or device) can handle them. If you
solve this by increasing the queue length you are just papering over the
underlying issue, and trading latency for fewer errors. This tradeoff
*may* be appropriate for your particular application, but I can imagine
it would not be appropriate as a default. Keeping the buffer size small
allows errors to propagate up to the application, which can then back
off, or do something smarter, as appropriate.

I don't know anything about the actual discussions going on when the
defaults were set, but I can imagine something along the lines of the
above was probably a part of it :)

-Toke

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net: can: Increase tx queue length
  2019-03-09 15:50     ` Toke Høiland-Jørgensen
@ 2019-03-10  5:07       ` Dave Taht
  2019-03-10 17:30         ` Oliver Hartkopp
  0 siblings, 1 reply; 10+ messages in thread
From: Dave Taht @ 2019-03-10  5:07 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Appana Durga Kedareswara Rao,
	Andre Naujoks, wg, mkl, davem
  Cc: linux-can, netdev, linux-kernel

Toke Høiland-Jørgensen <toke@redhat.com> writes:

> Appana Durga Kedareswara Rao <appanad@xilinx.com> writes:
>
>> Hi Andre,
>>
>> <Snip> 
>>> 
>>> On 3/9/19 3:07 PM, Appana Durga Kedareswara rao wrote:
>>> > While stress testing the CAN interface on xilinx axi can in loopback
>>> > mode getting message "write: no buffer space available"
>>> > Increasing device tx queue length resolved the above mentioned issue.
>>> 
>>> No need to patch the kernel:
>>> 
>>> $ ip link set <dev-name> txqueuelen 500
>>> 
>>> does the same thing.
>>
>> Thanks for the review... 
>> Agree but it is not an out of box solution right?? 
>> Do you have any idea for socket can devices why the tx queue length is 10 whereas
>> for other network devices (ex: ethernet) it is 1000 ??
>
> Probably because you don't generally want a long queue adding latency on
> a CAN interface? The default 1000 is already way too much even for an
> Ethernet device in a lot of cases.
>
> If you get "out of buffer" errors it means your application is sending
> things faster than the receiver (or device) can handle them. If you
> solve this by increasing the queue length you are just papering over the
> underlying issue, and trading latency for fewer errors. This tradeoff
> *may* be appropriate for your particular application, but I can imagine
> it would not be appropriate as a default. Keeping the buffer size small
> allows errors to propagate up to the application, which can then back
> off, or do something smarter, as appropriate.
>
> I don't know anything about the actual discussions going on when the
> defaults were set, but I can imagine something along the lines of the
> above was probably a part of it :)
>
> -Toke

In a related discussion, loud and often difficult, over here on the can bus, 

https://github.com/systemd/systemd/issues/9194#issuecomment-469403685

we found that applying fq_codel as the default via sysctl qdisc a bad
idea for systems for at least one model of can device.

If you scroll back on the bug, a good description of what the can
subsystem expects from the qdisc is therein - it mandates an in-order
fifo qdisc or no queue at all. the CAN protocol expects each packet to
be transmitted successfully or rejected, and if so, passes the error up
to userspace and is supposed to stop for further input.

As this was the first serious bug ever reported against using fq_codel
as the default in 5+ years of systemd and 7 of openwrt deployment I've
been taking it very seriously. It's worse than just systemd - openwrt
patches out pfifo_fast entirely. pfifo_fast is the wrong qdisc - the
right choices are noqueue and possibly pfifo.

However, the vcan device exposes noqueue, and so far it has been only
the one device ( a 8Devices socketcan USB2CAN ) that did not do this in
their driver that was misbehaving.

Which was just corrected with a simple:

static int usb_8dev_probe(struct usb_interface *intf,
			 const struct usb_device_id *id)
{
     ...
     netdev->netdev_ops = &usb_8dev_netdev_ops;

     netdev->flags |= IFF_ECHO; /* we support local echo */
+    netdev->priv_flags |= IFF_NO_QUEUE;
     ...
}

and successfully tested on that bug report.

So at the moment, my thought is that all can devices should default to
noqueue, if they are not already. I think a pfifo_fast and a qlen of any
size is the wrong thing, but I still don't know enough about what other
can devices do or did to be certain.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net: can: Increase tx queue length
  2019-03-09 14:40   ` Appana Durga Kedareswara Rao
  2019-03-09 14:56     ` Andre Naujoks
  2019-03-09 15:50     ` Toke Høiland-Jørgensen
@ 2019-03-10 13:36     ` Martin Jerabek
  2 siblings, 0 replies; 10+ messages in thread
From: Martin Jerabek @ 2019-03-10 13:36 UTC (permalink / raw)
  To: Appana Durga Kedareswara Rao, Andre Naujoks, wg, mkl, davem
  Cc: linux-can, netdev, linux-kernel, Michal Sojka

On 09. 03. 19 15:40, Appana Durga Kedareswara Rao wrote:
> Hi Andre,
> 
> <Snip> 
>>
>> On 3/9/19 3:07 PM, Appana Durga Kedareswara rao wrote:
>>> While stress testing the CAN interface on xilinx axi can in loopback
>>> mode getting message "write: no buffer space available"
>>> Increasing device tx queue length resolved the above mentioned issue.
>>
>> No need to patch the kernel:
>>
>> $ ip link set <dev-name> txqueuelen 500
>>
>> does the same thing.
> 
> Thanks for the review... 
> Agree but it is not an out of box solution right?? 
> Do you have any idea for socket can devices why the tx queue length is 10 whereas
> for other network devices (ex: ethernet) it is 1000 ?? 
> 
> Regards,
> Kedar.

There was already a patch for this in the past [1], together with a thorough
analysis, but for some reason the discussion died out.

Even if the defaults are not changed, it would be nice to at least see it
mentioned in Documentation/networking/can.txt to save people some time while
looking for the solution.

Regards,
Martin


[1] http://socket-can.996257.n3.nabble.com/Solving-ENOBUFS-returned-by-write-td2886.html


>>
>>>
>>> Signed-off-by: Appana Durga Kedareswara rao
>>> <appana.durga.rao@xilinx.com>
>>> ---
>>> --> Network devices default tx_queue_len is 1000 but for socket
>>> can device it is 10 any reason for it??
>>>
>>>  drivers/net/can/dev.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c index
>>> c05e4d5..32bd5be 100644
>>> --- a/drivers/net/can/dev.c
>>> +++ b/drivers/net/can/dev.c
>>> @@ -642,7 +642,7 @@ static void can_setup(struct net_device *dev)
>>>  	dev->mtu = CAN_MTU;
>>>  	dev->hard_header_len = 0;
>>>  	dev->addr_len = 0;
>>> -	dev->tx_queue_len = 10;
>>> +	dev->tx_queue_len = 500;
>>>
>>>  	/* New-style flags. */
>>>  	dev->flags = IFF_NOARP;
>>>
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net: can: Increase tx queue length
  2019-03-10  5:07       ` Dave Taht
@ 2019-03-10 17:30         ` Oliver Hartkopp
  2019-03-15 10:04           ` Appana Durga Kedareswara Rao
  0 siblings, 1 reply; 10+ messages in thread
From: Oliver Hartkopp @ 2019-03-10 17:30 UTC (permalink / raw)
  To: Dave Taht, Toke Høiland-Jørgensen,
	Appana Durga Kedareswara Rao, Andre Naujoks, wg, mkl, davem
  Cc: linux-can, netdev, linux-kernel

Hi all,

On 3/10/19 6:07 AM, Dave Taht wrote:
> Toke Høiland-Jørgensen <toke@redhat.com> writes:
> 
>> Appana Durga Kedareswara Rao <appanad@xilinx.com> writes:
>>
>>> Hi Andre,
>>>
>>> <Snip>
>>>>
>>>> On 3/9/19 3:07 PM, Appana Durga Kedareswara rao wrote:
>>>>> While stress testing the CAN interface on xilinx axi can in loopback
>>>>> mode getting message "write: no buffer space available"
>>>>> Increasing device tx queue length resolved the above mentioned issue.
>>>>
>>>> No need to patch the kernel:
>>>>
>>>> $ ip link set <dev-name> txqueuelen 500
>>>>
>>>> does the same thing.
>>>
>>> Thanks for the review...
>>> Agree but it is not an out of box solution right??
>>> Do you have any idea for socket can devices why the tx queue length is 10 whereas
>>> for other network devices (ex: ethernet) it is 1000 ??
>>
>> Probably because you don't generally want a long queue adding latency on
>> a CAN interface? The default 1000 is already way too much even for an
>> Ethernet device in a lot of cases.
>>
>> If you get "out of buffer" errors it means your application is sending
>> things faster than the receiver (or device) can handle them. If you
>> solve this by increasing the queue length you are just papering over the
>> underlying issue, and trading latency for fewer errors. This tradeoff
>> *may* be appropriate for your particular application, but I can imagine
>> it would not be appropriate as a default. Keeping the buffer size small
>> allows errors to propagate up to the application, which can then back
>> off, or do something smarter, as appropriate.
>>
>> I don't know anything about the actual discussions going on when the
>> defaults were set, but I can imagine something along the lines of the
>> above was probably a part of it :)
>>
>> -Toke
> 
> In a related discussion, loud and often difficult, over here on the can bus,
> 
> https://github.com/systemd/systemd/issues/9194#issuecomment-469403685
> 
> we found that applying fq_codel as the default via sysctl qdisc a bad
> idea for systems for at least one model of can device.
> 
> If you scroll back on the bug, a good description of what the can
> subsystem expects from the qdisc is therein - it mandates an in-order
> fifo qdisc or no queue at all. the CAN protocol expects each packet to
> be transmitted successfully or rejected, and if so, passes the error up
> to userspace and is supposed to stop for further input.
> 
> As this was the first serious bug ever reported against using fq_codel
> as the default in 5+ years of systemd and 7 of openwrt deployment I've
> been taking it very seriously. It's worse than just systemd - openwrt
> patches out pfifo_fast entirely. pfifo_fast is the wrong qdisc - the
> right choices are noqueue and possibly pfifo.
> 
> However, the vcan device exposes noqueue, and so far it has been only
> the one device ( a 8Devices socketcan USB2CAN ) that did not do this in
> their driver that was misbehaving.
> 
> Which was just corrected with a simple:
> 
> static int usb_8dev_probe(struct usb_interface *intf,
> 			 const struct usb_device_id *id)
> {
>       ...
>       netdev->netdev_ops = &usb_8dev_netdev_ops;
> 
>       netdev->flags |= IFF_ECHO; /* we support local echo */
> +    netdev->priv_flags |= IFF_NO_QUEUE;
>       ...
> }
> 
> and successfully tested on that bug report.
> 
> So at the moment, my thought is that all can devices should default to
> noqueue, if they are not already. I think a pfifo_fast and a qlen of any
> size is the wrong thing, but I still don't know enough about what other
> can devices do or did to be certain.
> 

Having about 10 elements in a CAN driver tx queue allows to work with 
queueing disciplines 
(http://rtime.felk.cvut.cz/can/socketcan-qdisc-final.pdf) and also to 
maintain a nearly real-time behaviour with outgoing traffic.

When the CAN interface is not able to cope with the (intened) outgoing 
traffic load, the applications should get an instant feedback about it.

There is a difference between running CAN applications in the real world 
and doing performance tests, where it makes sense to increase the 
tx-queue-len to e.g. 1000 and dump 1000 frames into the driver to check 
the hardware performance.

Best regards,
Oliver

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH] net: can: Increase tx queue length
  2019-03-10 17:30         ` Oliver Hartkopp
@ 2019-03-15 10:04           ` Appana Durga Kedareswara Rao
  2019-03-15 13:53             ` Oliver Hartkopp
  0 siblings, 1 reply; 10+ messages in thread
From: Appana Durga Kedareswara Rao @ 2019-03-15 10:04 UTC (permalink / raw)
  To: Oliver Hartkopp, Dave Taht, Toke Høiland-Jørgensen,
	Andre Naujoks, wg, mkl, davem
  Cc: linux-can, netdev, linux-kernel

Hi All,

<Snip> 
> Hi all,
> 
> On 3/10/19 6:07 AM, Dave Taht wrote:
> > Toke Høiland-Jørgensen <toke@redhat.com> writes:
> >
> >> Appana Durga Kedareswara Rao <appanad@xilinx.com> writes:
> >>
> >>> Hi Andre,
> >>>
> >>> <Snip>
> >>>>
> >>>> On 3/9/19 3:07 PM, Appana Durga Kedareswara rao wrote:
> >>>>> While stress testing the CAN interface on xilinx axi can in
> >>>>> loopback mode getting message "write: no buffer space available"
> >>>>> Increasing device tx queue length resolved the above mentioned issue.
> >>>>
> >>>> No need to patch the kernel:
> >>>>
> >>>> $ ip link set <dev-name> txqueuelen 500
> >>>>
> >>>> does the same thing.
> >>>
> >>> Thanks for the review...
> >>> Agree but it is not an out of box solution right??
> >>> Do you have any idea for socket can devices why the tx queue length
> >>> is 10 whereas for other network devices (ex: ethernet) it is 1000 ??
> >>
> >> Probably because you don't generally want a long queue adding latency
> >> on a CAN interface? The default 1000 is already way too much even for
> >> an Ethernet device in a lot of cases.
> >>
> >> If you get "out of buffer" errors it means your application is
> >> sending things faster than the receiver (or device) can handle them.
> >> If you solve this by increasing the queue length you are just
> >> papering over the underlying issue, and trading latency for fewer
> >> errors. This tradeoff
> >> *may* be appropriate for your particular application, but I can
> >> imagine it would not be appropriate as a default. Keeping the buffer
> >> size small allows errors to propagate up to the application, which
> >> can then back off, or do something smarter, as appropriate.
> >>
> >> I don't know anything about the actual discussions going on when the
> >> defaults were set, but I can imagine something along the lines of the
> >> above was probably a part of it :)
> >>
> >> -Toke
> >
> > In a related discussion, loud and often difficult, over here on the
> > can bus,
> >
> > https://github.com/systemd/systemd/issues/9194#issuecomment-
> 469403685
> >
> > we found that applying fq_codel as the default via sysctl qdisc a bad
> > idea for systems for at least one model of can device.
> >
> > If you scroll back on the bug, a good description of what the can
> > subsystem expects from the qdisc is therein - it mandates an in-order
> > fifo qdisc or no queue at all. the CAN protocol expects each packet to
> > be transmitted successfully or rejected, and if so, passes the error
> > up to userspace and is supposed to stop for further input.
> >
> > As this was the first serious bug ever reported against using fq_codel
> > as the default in 5+ years of systemd and 7 of openwrt deployment I've
> > been taking it very seriously. It's worse than just systemd - openwrt
> > patches out pfifo_fast entirely. pfifo_fast is the wrong qdisc - the
> > right choices are noqueue and possibly pfifo.
> >
> > However, the vcan device exposes noqueue, and so far it has been only
> > the one device ( a 8Devices socketcan USB2CAN ) that did not do this
> > in their driver that was misbehaving.
> >
> > Which was just corrected with a simple:
> >
> > static int usb_8dev_probe(struct usb_interface *intf,
> > 			 const struct usb_device_id *id)
> > {
> >       ...
> >       netdev->netdev_ops = &usb_8dev_netdev_ops;
> >
> >       netdev->flags |= IFF_ECHO; /* we support local echo */
> > +    netdev->priv_flags |= IFF_NO_QUEUE;
> >       ...
> > }
> >
> > and successfully tested on that bug report.
> >
> > So at the moment, my thought is that all can devices should default to
> > noqueue, if they are not already. I think a pfifo_fast and a qlen of
> > any size is the wrong thing, but I still don't know enough about what
> > other can devices do or did to be certain.
> >
> 
> Having about 10 elements in a CAN driver tx queue allows to work with
> queueing disciplines
> (http://rtime.felk.cvut.cz/can/socketcan-qdisc-final.pdf) and also to maintain a
> nearly real-time behaviour with outgoing traffic.
> 
> When the CAN interface is not able to cope with the (intened) outgoing traffic
> load, the applications should get an instant feedback about it.
> 
> There is a difference between running CAN applications in the real world and
> doing performance tests, where it makes sense to increase the tx-queue-len to
> e.g. 1000 and dump 1000 frames into the driver to check the hardware
> performance.

Thanks, Oliver, Martin, Andre, Toke, Dave for your inputs...
So to conclude this the default txqueuelen 10 is ideal for real-time CAN traffic,
For Stress/Performance tests user manually need to increase the txqueuelen based on his requirements.

Please correct me if my understanding is wrong. 

Regards,
Kedar.

> 
> Best regards,
> Oliver

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net: can: Increase tx queue length
  2019-03-15 10:04           ` Appana Durga Kedareswara Rao
@ 2019-03-15 13:53             ` Oliver Hartkopp
  0 siblings, 0 replies; 10+ messages in thread
From: Oliver Hartkopp @ 2019-03-15 13:53 UTC (permalink / raw)
  To: Appana Durga Kedareswara Rao, Dave Taht,
	Toke Høiland-Jørgensen, Andre Naujoks, wg, mkl, davem,
	Martin Jerabek
  Cc: linux-can, netdev, linux-kernel, Michal Sojka

Hi Kedar,

On 3/15/19 11:04 AM, Appana Durga Kedareswara Rao wrote:

>> Having about 10 elements in a CAN driver tx queue allows to work with
>> queueing disciplines
>> (http://rtime.felk.cvut.cz/can/socketcan-qdisc-final.pdf) and also to maintain a
>> nearly real-time behaviour with outgoing traffic.
>>
>> When the CAN interface is not able to cope with the (intened) outgoing traffic
>> load, the applications should get an instant feedback about it.
>>
>> There is a difference between running CAN applications in the real world and
>> doing performance tests, where it makes sense to increase the tx-queue-len to
>> e.g. 1000 and dump 1000 frames into the driver to check the hardware
>> performance.
> 
> Thanks, Oliver, Martin, Andre, Toke, Dave for your inputs...
> So to conclude this the default txqueuelen 10 is ideal for real-time CAN traffic,
> For Stress/Performance tests user manually need to increase the txqueuelen based on his requirements.
> 
> Please correct me if my understanding is wrong.

Yes, I would confirm that approach.

As Martin Jerabek pointed to a discussion with Michal Sojka here:

https://marc.info/?l=linux-can&m=155222580602047&w=2

You might also go for a more academic view based on the number of 
different CAN applications on the host.

@Martin: Would you like to propose a patch for can.txt (now can.rst)?

Regards,
Oliver


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-03-15 13:53 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-09 14:07 [PATCH] net: can: Increase tx queue length Appana Durga Kedareswara rao
2019-03-09 14:22 ` Andre Naujoks
2019-03-09 14:40   ` Appana Durga Kedareswara Rao
2019-03-09 14:56     ` Andre Naujoks
2019-03-09 15:50     ` Toke Høiland-Jørgensen
2019-03-10  5:07       ` Dave Taht
2019-03-10 17:30         ` Oliver Hartkopp
2019-03-15 10:04           ` Appana Durga Kedareswara Rao
2019-03-15 13:53             ` Oliver Hartkopp
2019-03-10 13:36     ` Martin Jerabek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).