All of lore.kernel.org
 help / color / mirror / Atom feed
* sjw in can_calc_bittiming
@ 2021-11-02 15:03 Matthias Weißer
  2021-11-02 15:36 ` Oliver Hartkopp
  2021-11-05 12:07 ` Marc Kleine-Budde
  0 siblings, 2 replies; 5+ messages in thread
From: Matthias Weißer @ 2021-11-02 15:03 UTC (permalink / raw)
  To: linux-can

Hi

we recently had a case here where one member of a CAN bus couldn't receive
frames with data content of only zeros:

$ cansend can0 123#0000000000000000

After some investigation we found the root cause to be a slight difference
(about 1%) in actual bitrates of the two members. The one with showed the
RX errors had a sjw value of 1 and a lot of time quanta (40) due to the 40MHz
CAN clock.

This leads to a build up of phase error (as sjw is not able to compensate for
enough of the bitrate difference) which at some point leads to a framing
error due to missing a stuff bit. Playing around with the sample point can
improve or worsen the behavior.

We can fix this quite easily by specifying a higher sjw value.

Question is now:
Wouldn't it make sense to increase sjw in can_calc_bittiming() to something
like 5% of the total time quanta? This may increase the reliability of the CAN
network when there are differences in the bitrates of the single members. Are
there any arguments against such an arbitrary selection of sjw?

If you agree with such a change I can come up with a proper patch. If I wrote
totally nonsense please tell me :-)

Regards

Matthias

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: sjw in can_calc_bittiming
  2021-11-02 15:03 sjw in can_calc_bittiming Matthias Weißer
@ 2021-11-02 15:36 ` Oliver Hartkopp
  2021-11-02 15:59   ` Wolfgang Grandegger
  2021-11-03  7:12   ` Matthias Weißer
  2021-11-05 12:07 ` Marc Kleine-Budde
  1 sibling, 2 replies; 5+ messages in thread
From: Oliver Hartkopp @ 2021-11-02 15:36 UTC (permalink / raw)
  To: Matthias Weißer, linux-can

Hi Matthias,

On 02.11.21 16:03, Matthias Weißer wrote:

> we recently had a case here where one member of a CAN bus couldn't receive
> frames with data content of only zeros:
> 
> $ cansend can0 123#0000000000000000
> 
> After some investigation we found the root cause to be a slight difference
> (about 1%) in actual bitrates of the two members. The one with showed the
> RX errors had a sjw value of 1 and a lot of time quanta (40) due to the 40MHz
> CAN clock.
> 
> This leads to a build up of phase error (as sjw is not able to compensate for
> enough of the bitrate difference) which at some point leads to a framing
> error due to missing a stuff bit. Playing around with the sample point can
> improve or worsen the behavior.
> 
> We can fix this quite easily by specifying a higher sjw value.

You can specify the sjw value with the ip command (for CAN FD there is 
also a dsjw). And IIRC you can set it to the max. value for your CAN 
controller if you define sjw to be 4.

Best regards,
Oliver

$ ip link help can
Usage: ip link set DEVICE type can
	[ bitrate BITRATE [ sample-point SAMPLE-POINT] ] |
	[ tq TQ prop-seg PROP_SEG phase-seg1 PHASE-SEG1
  	  phase-seg2 PHASE-SEG2 [ sjw SJW ] ]

	[ dbitrate BITRATE [ dsample-point SAMPLE-POINT] ] |
	[ dtq TQ dprop-seg PROP_SEG dphase-seg1 PHASE-SEG1
  	  dphase-seg2 PHASE-SEG2 [ dsjw SJW ] ]

	[ loopback { on | off } ]
	[ listen-only { on | off } ]
	[ triple-sampling { on | off } ]
	[ one-shot { on | off } ]
	[ berr-reporting { on | off } ]
	[ fd { on | off } ]
	[ fd-non-iso { on | off } ]
	[ presume-ack { on | off } ]

	[ restart-ms TIME-MS ]
	[ restart ]

	[ termination { 0..65535 } ]

	Where: BITRATE	:= { 1..1000000 }
		  SAMPLE-POINT	:= { 0.000..0.999 }
		  TQ		:= { NUMBER }
		  PROP-SEG	:= { 1..8 }
		  PHASE-SEG1	:= { 1..8 }
		  PHASE-SEG2	:= { 1..8 }
		  SJW		:= { 1..4 }
		  RESTART-MS	:= { 0 | NUMBER }

> 
> Question is now:
> Wouldn't it make sense to increase sjw in can_calc_bittiming() to something
> like 5% of the total time quanta? This may increase the reliability of the CAN
> network when there are differences in the bitrates of the single members. Are
> there any arguments against such an arbitrary selection of sjw?
> 
> If you agree with such a change I can come up with a proper patch. If I wrote
> totally nonsense please tell me :-)
> 
> Regards
> 
> Matthias
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: sjw in can_calc_bittiming
  2021-11-02 15:36 ` Oliver Hartkopp
@ 2021-11-02 15:59   ` Wolfgang Grandegger
  2021-11-03  7:12   ` Matthias Weißer
  1 sibling, 0 replies; 5+ messages in thread
From: Wolfgang Grandegger @ 2021-11-02 15:59 UTC (permalink / raw)
  To: Oliver Hartkopp, Matthias Weißer, linux-can

Hello,

Am 02.11.21 um 16:36 schrieb Oliver Hartkopp:
> Hi Matthias,
> 
> On 02.11.21 16:03, Matthias Weißer wrote:
> 
>> we recently had a case here where one member of a CAN bus couldn't
>> receive
>> frames with data content of only zeros:
>>
>> $ cansend can0 123#0000000000000000
>>
>> After some investigation we found the root cause to be a slight
>> difference
>> (about 1%) in actual bitrates of the two members. The one with showed the
>> RX errors had a sjw value of 1 and a lot of time quanta (40) due to
>> the 40MHz
>> CAN clock.
>>
>> This leads to a build up of phase error (as sjw is not able to
>> compensate for
>> enough of the bitrate difference) which at some point leads to a framing
>> error due to missing a stuff bit. Playing around with the sample point
>> can
>> improve or worsen the behavior.
>>
>> We can fix this quite easily by specifying a higher sjw value.
> 
> You can specify the sjw value with the ip command (for CAN FD there is
> also a dsjw). And IIRC you can set it to the max. value for your CAN
> controller if you define sjw to be 4.
> 
> Best regards,
> Oliver
> 
> $ ip link help can
> Usage: ip link set DEVICE type can
>     [ bitrate BITRATE [ sample-point SAMPLE-POINT] ] |
>     [ tq TQ prop-seg PROP_SEG phase-seg1 PHASE-SEG1
>        phase-seg2 PHASE-SEG2 [ sjw SJW ] ]

This usage description is not totally correct! It is also possible, to
increase "sjw" together with the bitrate, e.g.:

  # ip link set can0 type can bitrate 250000 sjw 2
  # ip -d link show can0 | grep sjw
      tq 250 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 2
  
Here is the code for it:

  # cat dev.c
  ...
        /* check for sjw user settings */
        if (!bt->sjw || !btc->sjw_max) {
                bt->sjw = 1;
        } else {
                /* bt->sjw is at least 1 -> sanitize upper bound to sjw_max */
                if (bt->sjw > btc->sjw_max)
                        bt->sjw = btc->sjw_max;
                /* bt->sjw must not be higher than tseg2 */
                if (tseg2 < bt->sjw)
                        bt->sjw = tseg2;
        }

Wolfgang

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: sjw in can_calc_bittiming
  2021-11-02 15:36 ` Oliver Hartkopp
  2021-11-02 15:59   ` Wolfgang Grandegger
@ 2021-11-03  7:12   ` Matthias Weißer
  1 sibling, 0 replies; 5+ messages in thread
From: Matthias Weißer @ 2021-11-03  7:12 UTC (permalink / raw)
  To: Oliver Hartkopp, linux-can

Hi Oliver

Thanks for your answer.

Am Di., 2. Nov. 2021 um 16:36 Uhr schrieb Oliver Hartkopp
<socketcan@hartkopp.net>:
>
> > We can fix this quite easily by specifying a higher sjw value.
>
> You can specify the sjw value with the ip command (for CAN FD there is
> also a dsjw). And IIRC you can set it to the max. value for your CAN
> controller if you define sjw to be 4.

I know that and this is my current solution. And I am totally happy with that.
Sorry if this wasn't clear in my original message.

My question was if it makes sense to change the default in can_calc_bittiming()
to something more adaptive than just 1 depending on the total number of tq.
This may prevent others from hitting the same problem as I did.

If you can take some time and rethink my original question that would be great

> > Wouldn't it make sense to increase sjw in can_calc_bittiming() to something
> > like 5% of the total time quanta? This may increase the reliability of the CAN
> > network when there are differences in the bitrates of the single members. Are
> > there any arguments against such an arbitrary selection of sjw?

Regards

Matthias

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: sjw in can_calc_bittiming
  2021-11-02 15:03 sjw in can_calc_bittiming Matthias Weißer
  2021-11-02 15:36 ` Oliver Hartkopp
@ 2021-11-05 12:07 ` Marc Kleine-Budde
  1 sibling, 0 replies; 5+ messages in thread
From: Marc Kleine-Budde @ 2021-11-05 12:07 UTC (permalink / raw)
  To: Matthias Weißer; +Cc: linux-can

[-- Attachment #1: Type: text/plain, Size: 2382 bytes --]

On 02.11.2021 16:03:33, Matthias Weißer wrote:
> we recently had a case here where one member of a CAN bus couldn't
> receive frames with data content of only zeros:
> 
> $ cansend can0 123#0000000000000000
> 
> After some investigation we found the root cause to be a slight
> difference (about 1%) in actual bitrates of the two members. The one
> with showed the RX errors had a sjw value of 1 and a lot of time
> quanta (40) due to the 40MHz CAN clock.
> 
> This leads to a build up of phase error (as sjw is not able to
> compensate for enough of the bitrate difference) which at some point
> leads to a framing error due to missing a stuff bit. Playing around
> with the sample point can improve or worsen the behavior.
> 
> We can fix this quite easily by specifying a higher sjw value.

We have seen this behavior, too. :(

> Question is now:
> Wouldn't it make sense to increase sjw in can_calc_bittiming() to
> something like 5% of the total time quanta?

You mean of the total bit time?

The problem is that the SJW is set to 1 time quanta by default. For a
given bit rate the time quanta is not a constant value, but depends on
the oscillator frequency and configured bit rate prescaler. As you
suggested, I agree a better solution would be to set the SJW to a
certain percentage of the bit time. But, what is a proper default value
of a percentage based SJW?

> This may increase the reliability of the CAN network when there are
> differences in the bitrates of the single members. Are there any
> arguments against such an arbitrary selection of sjw?
> 
> If you agree with such a change I can come up with a proper patch. If
> I wrote totally nonsense please tell me :-)

I'm interested in a patch. There are some requirements and thoughts:
- introduce a define for the default SJW percentage value
- use the default SJW percentage value if the user has not provided a
  SJW
- the "struct can_bittiming" cannot be modified

In a later patch we can introduce a configuration interface to set SJW
percentage value from user space.

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-11-05 12:07 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-02 15:03 sjw in can_calc_bittiming Matthias Weißer
2021-11-02 15:36 ` Oliver Hartkopp
2021-11-02 15:59   ` Wolfgang Grandegger
2021-11-03  7:12   ` Matthias Weißer
2021-11-05 12:07 ` Marc Kleine-Budde

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.