* [PATCH 0/5] can: bittiming: improve SJW handling and default
@ 2022-09-07 10:38 Marc Kleine-Budde
2022-09-07 10:38 ` [PATCH 1/5] can: dev: register_candev(): ensure that bittiming const are valid Marc Kleine-Budde
` (4 more replies)
0 siblings, 5 replies; 20+ messages in thread
From: Marc Kleine-Budde @ 2022-09-07 10:38 UTC (permalink / raw)
To: linux-can; +Cc: Mark Bath
Hello,
several people noticed that on modern CAN controllers with wide bit
timing registers the default SJW of 1 can result in unstable or no
synchronization to the CAN network. See Patch 5/5 for details.
This series fixes the issue by first cleaning up the SJW handling,
making it more standard compliant (SJW must not be longer than any
Phase Seg - not just Phase Seg2) and finally using Phase Seg2 / 2 as
default, if the user space provided no SJW configuration.
regards,
Marc
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 1/5] can: dev: register_candev(): ensure that bittiming const are valid
2022-09-07 10:38 [PATCH 0/5] can: bittiming: improve SJW handling and default Marc Kleine-Budde
@ 2022-09-07 10:38 ` Marc Kleine-Budde
2022-09-07 10:38 ` [PATCH 2/5] can: bittiming: can_calc_bittiming(): make use of min3() Marc Kleine-Budde
` (3 subsequent siblings)
4 siblings, 0 replies; 20+ messages in thread
From: Marc Kleine-Budde @ 2022-09-07 10:38 UTC (permalink / raw)
To: linux-can; +Cc: Mark Bath, Marc Kleine-Budde
Implement a function can_bittiming_const_valid() to check the validity
of the provided bit timing constant. Call this function from
register_candev() to check the bit timing constants during CAN
interface registration.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
drivers/net/can/dev/dev.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/drivers/net/can/dev/dev.c b/drivers/net/can/dev/dev.c
index c1956b1e9faf..3b51055be40e 100644
--- a/drivers/net/can/dev/dev.c
+++ b/drivers/net/can/dev/dev.c
@@ -498,6 +498,18 @@ static int can_get_termination(struct net_device *ndev)
return 0;
}
+static bool
+can_bittiming_const_valid(const struct can_bittiming_const *btc)
+{
+ if (!btc)
+ return true;
+
+ if (!btc->sjw_max)
+ return false;
+
+ return true;
+}
+
/* Register the CAN network device */
int register_candev(struct net_device *dev)
{
@@ -518,6 +530,10 @@ int register_candev(struct net_device *dev)
if (!priv->data_bitrate_const != !priv->data_bitrate_const_cnt)
return -EINVAL;
+ if (!can_bittiming_const_valid(priv->bittiming_const) ||
+ !can_bittiming_const_valid(priv->data_bittiming_const))
+ return -EINVAL;
+
if (!priv->termination_const) {
err = can_get_termination(dev);
if (err)
--
2.35.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 2/5] can: bittiming: can_calc_bittiming(): make use of min3()
2022-09-07 10:38 [PATCH 0/5] can: bittiming: improve SJW handling and default Marc Kleine-Budde
2022-09-07 10:38 ` [PATCH 1/5] can: dev: register_candev(): ensure that bittiming const are valid Marc Kleine-Budde
@ 2022-09-07 10:38 ` Marc Kleine-Budde
2022-09-11 8:24 ` Vincent Mailhol
2022-09-07 10:38 ` [PATCH 3/5] can: bittiming: can_calc_bittiming(): clean up SJW sanitizing Marc Kleine-Budde
` (2 subsequent siblings)
4 siblings, 1 reply; 20+ messages in thread
From: Marc Kleine-Budde @ 2022-09-07 10:38 UTC (permalink / raw)
To: linux-can; +Cc: Mark Bath, Marc Kleine-Budde
In can_calc_bittiming() there are several open coded checks to ensure
that SJW is within certain limits. Replace this by a single call to
min3().
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
drivers/net/can/dev/calc_bittiming.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/drivers/net/can/dev/calc_bittiming.c b/drivers/net/can/dev/calc_bittiming.c
index d3caa040614d..ce6bef2444a2 100644
--- a/drivers/net/can/dev/calc_bittiming.c
+++ b/drivers/net/can/dev/calc_bittiming.c
@@ -158,12 +158,8 @@ int can_calc_bittiming(const struct net_device *dev, struct can_bittiming *bt,
if (!bt->sjw || !btc->sjw_max) {
bt->sjw = 1;
} else {
- /* bt->sjw is at least 1 -> sanitize upper bound to sjw_max */
- if (bt->sjw > btc->sjw_max)
- bt->sjw = btc->sjw_max;
- /* bt->sjw must not be higher than tseg2 */
- if (tseg2 < bt->sjw)
- bt->sjw = tseg2;
+ /* sjw must not be higher than sjw_max and tseg2 */
+ bt->sjw = min3(bt->sjw, btc->sjw_max, tseg2);
}
bt->brp = best_brp;
--
2.35.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 3/5] can: bittiming: can_calc_bittiming(): clean up SJW sanitizing
2022-09-07 10:38 [PATCH 0/5] can: bittiming: improve SJW handling and default Marc Kleine-Budde
2022-09-07 10:38 ` [PATCH 1/5] can: dev: register_candev(): ensure that bittiming const are valid Marc Kleine-Budde
2022-09-07 10:38 ` [PATCH 2/5] can: bittiming: can_calc_bittiming(): make use of min3() Marc Kleine-Budde
@ 2022-09-07 10:38 ` Marc Kleine-Budde
2022-09-07 10:38 ` [PATCH 4/5] can: bittiming: can_calc_bittiming(): ensure that SJW is not longer than either Phase Buffer Segment Marc Kleine-Budde
2022-09-07 10:38 ` [PATCH 5/5] can: bittiming: can_calc_bittiming(): use Phase Seg2 / 2 as default for SJW Marc Kleine-Budde
4 siblings, 0 replies; 20+ messages in thread
From: Marc Kleine-Budde @ 2022-09-07 10:38 UTC (permalink / raw)
To: linux-can; +Cc: Mark Bath, Marc Kleine-Budde
Prepare the SJW handling in can_calc_bittiming() for the changes
coming in the next patches. Always ensure that SJW in within the
limits.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
drivers/net/can/dev/calc_bittiming.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/drivers/net/can/dev/calc_bittiming.c b/drivers/net/can/dev/calc_bittiming.c
index ce6bef2444a2..cb9521c8ae8e 100644
--- a/drivers/net/can/dev/calc_bittiming.c
+++ b/drivers/net/can/dev/calc_bittiming.c
@@ -154,13 +154,12 @@ int can_calc_bittiming(const struct net_device *dev, struct can_bittiming *bt,
bt->phase_seg1 = tseg1 - bt->prop_seg;
bt->phase_seg2 = tseg2;
- /* check for sjw user settings */
- if (!bt->sjw || !btc->sjw_max) {
+ /* If user space provides no sjw, use 1 as default */
+ if (!bt->sjw)
bt->sjw = 1;
- } else {
- /* sjw must not be higher than sjw_max and tseg2 */
- bt->sjw = min3(bt->sjw, btc->sjw_max, tseg2);
- }
+
+ /* sjw must not be higher than sjw_max and tseg2 */
+ bt->sjw = min3(bt->sjw, btc->sjw_max, tseg2);
bt->brp = best_brp;
--
2.35.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 4/5] can: bittiming: can_calc_bittiming(): ensure that SJW is not longer than either Phase Buffer Segment
2022-09-07 10:38 [PATCH 0/5] can: bittiming: improve SJW handling and default Marc Kleine-Budde
` (2 preceding siblings ...)
2022-09-07 10:38 ` [PATCH 3/5] can: bittiming: can_calc_bittiming(): clean up SJW sanitizing Marc Kleine-Budde
@ 2022-09-07 10:38 ` Marc Kleine-Budde
2022-09-07 10:38 ` [PATCH 5/5] can: bittiming: can_calc_bittiming(): use Phase Seg2 / 2 as default for SJW Marc Kleine-Budde
4 siblings, 0 replies; 20+ messages in thread
From: Marc Kleine-Budde @ 2022-09-07 10:38 UTC (permalink / raw)
To: linux-can; +Cc: Mark Bath, Marc Kleine-Budde
According to "The Configuration of the CAN Bit Timing" [1] the SJW
"may not be longer than either Phase Buffer Segment".
Use bt->phase_seg2 instead of tseg2 (technically these are the same
values). Include bt->phase_seg1 in the limit check, too.
[1] http://web.archive.org/http://www.oertel-halle.de/files/cia99paper.pdf
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
drivers/net/can/dev/calc_bittiming.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/can/dev/calc_bittiming.c b/drivers/net/can/dev/calc_bittiming.c
index cb9521c8ae8e..b9aa1f7d0b37 100644
--- a/drivers/net/can/dev/calc_bittiming.c
+++ b/drivers/net/can/dev/calc_bittiming.c
@@ -158,8 +158,9 @@ int can_calc_bittiming(const struct net_device *dev, struct can_bittiming *bt,
if (!bt->sjw)
bt->sjw = 1;
- /* sjw must not be higher than sjw_max and tseg2 */
- bt->sjw = min3(bt->sjw, btc->sjw_max, tseg2);
+ /* sjw must not be higher than sjw_max, phase_seg1, and phase_seg2 */
+ bt->sjw = min3(bt->sjw, btc->sjw_max,
+ min(bt->phase_seg1, bt->phase_seg2));
bt->brp = best_brp;
--
2.35.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 5/5] can: bittiming: can_calc_bittiming(): use Phase Seg2 / 2 as default for SJW
2022-09-07 10:38 [PATCH 0/5] can: bittiming: improve SJW handling and default Marc Kleine-Budde
` (3 preceding siblings ...)
2022-09-07 10:38 ` [PATCH 4/5] can: bittiming: can_calc_bittiming(): ensure that SJW is not longer than either Phase Buffer Segment Marc Kleine-Budde
@ 2022-09-07 10:38 ` Marc Kleine-Budde
2022-09-07 19:19 ` Thomas.Kopp
4 siblings, 1 reply; 20+ messages in thread
From: Marc Kleine-Budde @ 2022-09-07 10:38 UTC (permalink / raw)
To: linux-can; +Cc: Mark Bath, Marc Kleine-Budde
"The (Re-)Synchronization Jump Width (SJW) defines how far a
resynchronization may move the Sample Point inside the limits defined
by the Phase Buffer Segments to compensate for edge phase errors." [1]
In other words this means the SJW parameter controls the tolerance of
the CAN controller against the frequency error compared to other CAN
controllers.
If the user space doesn't provide a SJW parameter, the kernel chooses
a default value of 1. This proved to be a good default value for CAN
controllers, but not anymore with modern controllers.
In the past, there were CAN controllers like the sja1000 with a rather
limited range of bit timing parameters. For the standard bit rates
this results in the following bit timing parameters:
| Bit timing parameters for sja1000 with 8.000000 MHz ref clock
| _----+--------------=> tseg1: 1 … 16
| / / _---------=> tseg2: 1 … 8
| | | / _-----=> sjw: 1 … 4
| | | | / _-=> brp: 1 … 64 (inc: 1)
| | | | | /
| nominal | | | | | real Bitrt nom real SampP
| Bitrate TQ[ns] PrS PhS1 PhS2 SJW BRP Bitrate Error SampP SampP Error BTR0 BTR1
| 1000000 125 2 3 2 1 1 1000000 0.0% 75.0% 75.0% 0.0% 0x00 0x14
| 800000 125 3 4 2 1 1 800000 0.0% 80.0% 80.0% 0.0% 0x00 0x16
| 666666 125 4 4 3 1 1 666666 0.0% 80.0% 75.0% 6.2% 0x00 0x27
| 500000 125 6 7 2 1 1 500000 0.0% 87.5% 87.5% 0.0% 0x00 0x1c
| 250000 250 6 7 2 1 2 250000 0.0% 87.5% 87.5% 0.0% 0x01 0x1c
| 125000 500 6 7 2 1 4 125000 0.0% 87.5% 87.5% 0.0% 0x03 0x1c
| 100000 625 6 7 2 1 5 100000 0.0% 87.5% 87.5% 0.0% 0x04 0x1c
| 83333 750 6 7 2 1 6 83333 0.0% 87.5% 87.5% 0.0% 0x05 0x1c
| 50000 1250 6 7 2 1 10 50000 0.0% 87.5% 87.5% 0.0% 0x09 0x1c
| 33333 1875 6 7 2 1 15 33333 0.0% 87.5% 87.5% 0.0% 0x0e 0x1c
| 20000 3125 6 7 2 1 25 20000 0.0% 87.5% 87.5% 0.0% 0x18 0x1c
| 10000 6250 6 7 2 1 50 10000 0.0% 87.5% 87.5% 0.0% 0x31 0x1c
The attentive reader notices that in most cases the SJW is 1, while
the Phase Seg2 is 2. Both values are in TQ units, which itself is a
duration in nanoseconds.
For example the 500 kbit/s configuration:
| nominal real Bitrt nom real SampP
| Bitrate TQ[ns] PrS PhS1 PhS2 SJW BRP Bitrate Error SampP SampP Error BTR0 BTR1
| 500000 125 6 7 2 1 1 500000 0.0% 87.5% 87.5% 0.0% 0x00 0x1c
the TQ is 125ns, the Phase Seg2 is "2" (== 250ns), the SJW is "1" (==
125 ns).
Looking at a more modern CAN controller like a mcp2518fd, it has wider
bit timing registers.
| Bit timing parameters for mcp251xfd with 40.000000 MHz ref clock
| _----+--------------=> tseg1: 2 … 256
| / / _---------=> tseg2: 1 … 128
| | | / _-----=> sjw: 1 … 128
| | | | / _-=> brp: 1 … 256 (inc: 1)
| | | | | /
| nominal | | | | | real Bitrt nom real SampP
| Bitrate TQ[ns] PrS PhS1 PhS2 SJW BRP Bitrate Error SampP SampP Error NBTCFG
| 500000 25 34 35 10 1 1 500000 0.0% 87.5% 87.5% 0.0% 0x00440900
The TQ is 25ns, the Phase Seg 2 is "10" (== 250ns), the SJW is "1" (==
25ns).
As the kernel chooses a default SJW of 1 independent of the TQ, this
results in a much smaller SJW and thus much smaller tolerances against
frequency errors.
To get the same oscillator tolerances on controllers with wide bit
timing registers, choose a default SJW value of Phase Seg2 / 2. This
results in the following bit timing parameters:
| Bit timing parameters for mcp251xfd with 40.000000 MHz ref clock
| _----+--------------=> tseg1: 2 … 256
| / / _---------=> tseg2: 1 … 128
| | | / _-----=> sjw: 1 … 128
| | | | / _-=> brp: 1 … 256 (inc: 1)
| | | | | /
| nominal | | | | | real Bitrt nom real SampP
| Bitrate TQ[ns] PrS PhS1 PhS2 SJW BRP Bitrate Error SampP SampP Error NBTCFG
| 500000 25 34 35 10 5 1 500000 0.0% 87.5% 87.5% 0.0% 0x00440904
The TQ is 25ns, the Phase Seg 2 is "10" (== 250ns), the SJW is "5" (==
125ns). Which is the same as on the sja1000 controller.
[1] http://web.archive.org/http://www.oertel-halle.de/files/cia99paper.pdf
Cc: Mark Bath <mark@baggywrinkle.co.uk>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
drivers/net/can/dev/calc_bittiming.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/can/dev/calc_bittiming.c b/drivers/net/can/dev/calc_bittiming.c
index b9aa1f7d0b37..6b4ca7af05fa 100644
--- a/drivers/net/can/dev/calc_bittiming.c
+++ b/drivers/net/can/dev/calc_bittiming.c
@@ -154,9 +154,9 @@ int can_calc_bittiming(const struct net_device *dev, struct can_bittiming *bt,
bt->phase_seg1 = tseg1 - bt->prop_seg;
bt->phase_seg2 = tseg2;
- /* If user space provides no sjw, use 1 as default */
+ /* If user space provides no sjw, use sane default of phase_seg2 / 2 */
if (!bt->sjw)
- bt->sjw = 1;
+ bt->sjw = max(1U, bt->phase_seg2 / 2);
/* sjw must not be higher than sjw_max, phase_seg1, and phase_seg2 */
bt->sjw = min3(bt->sjw, btc->sjw_max,
--
2.35.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* RE: [PATCH 5/5] can: bittiming: can_calc_bittiming(): use Phase Seg2 / 2 as default for SJW
2022-09-07 10:38 ` [PATCH 5/5] can: bittiming: can_calc_bittiming(): use Phase Seg2 / 2 as default for SJW Marc Kleine-Budde
@ 2022-09-07 19:19 ` Thomas.Kopp
2022-09-08 19:57 ` Marc Kleine-Budde
0 siblings, 1 reply; 20+ messages in thread
From: Thomas.Kopp @ 2022-09-07 19:19 UTC (permalink / raw)
To: mkl, linux-can; +Cc: mark
Hi Marc,
> In other words this means the SJW parameter controls the tolerance of
> the CAN controller against the frequency error compared to other CAN
> controllers.
>
> If the user space doesn't provide a SJW parameter, the kernel chooses
> a default value of 1. This proved to be a good default value for CAN
> controllers, but not anymore with modern controllers.
>
>
> To get the same oscillator tolerances on controllers with wide bit
> timing registers, choose a default SJW value of Phase Seg2 / 2. This
> results in the following bit timing parameters:
Thanks for your work on this! May I ask why Phase Seg2 / 2 is chosen?
In general the recommendation is to chose SJW as big as possible. Here's an excerpt from the CiA 601-3 (Unfortunately I can't share the entire document)
"Recom2: Choose sjwA as large as possible.
- The maximum possible value is sjwA = min(ps1A, ps2A).
- During arbitration phase, a large sjwA allows a CAN node to resynchronize quickly on the leading transmitting node.
- If the CAN controller does not allow configuring the maximum possible sjwA
because the value range for sjwA is limited in the CAN controller, then consider
increasing BRPA. This allows configuring a larger sjwA, unless this can impact
phase error tolerance in an unwanted way, see Recom4."
I think we discussed this a while ago already and the biggest reasoning against maxing SJW was breaking "compatibility" to old kernel versions. The compatibility in this is not being able to sync though...
Now, if a change to a larger default values is agreed upon we might as well chose min(phaseseg1, phaseseg2) directly to maximize robustness for the user that doesn't set it explicitly. Given that most of the getting started with Socketcan results on your favorite Search engine don't even mention SJW that's probably a significant portion of the users. From a pure communication standpoint I can't see any downside to doing this, the only hypothetical case would be if a CAN controller had an erratum about higher SJWs and the driver not handling it.
So, in essence, I'd propose to use min(p1,p2).
Best Regards,
Thomas
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 5/5] can: bittiming: can_calc_bittiming(): use Phase Seg2 / 2 as default for SJW
2022-09-07 19:19 ` Thomas.Kopp
@ 2022-09-08 19:57 ` Marc Kleine-Budde
2022-09-09 12:29 ` Marc Kleine-Budde
0 siblings, 1 reply; 20+ messages in thread
From: Marc Kleine-Budde @ 2022-09-08 19:57 UTC (permalink / raw)
To: Thomas.Kopp; +Cc: linux-can, mark
[-- Attachment #1: Type: text/plain, Size: 12627 bytes --]
Hello Thomas,
On 07.09.2022 19:19:42, Thomas.Kopp@microchip.com wrote:
> > In other words this means the SJW parameter controls the tolerance of
> > the CAN controller against the frequency error compared to other CAN
> > controllers.
> >
> > If the user space doesn't provide a SJW parameter, the kernel chooses
> > a default value of 1. This proved to be a good default value for CAN
> > controllers, but not anymore with modern controllers.
> >
> > To get the same oscillator tolerances on controllers with wide bit
> > timing registers, choose a default SJW value of Phase Seg2 / 2. This
> > results in the following bit timing parameters:
>
> Thanks for your work on this! May I ask why Phase Seg2 / 2 is chosen?
According to "Combining CANopen and SAE J1939 networks" [2], CANopen and
J1939 use a SJW of 1:
"The CANopen standard CiA 301 allows various bitrates in the range from
10 kBit/s to 1 MBit/s. The J1939-11 specification stipulates 250 kBit/s
and is used in the majority of applications. The J1939-14 standard
specifies 500 kBit/s for the physical layer.
Thereby the bitrate for a shared physical layer is limited to the
bitrates 250 kBit/s and 500 kBit/s. Fortunately both standards define
the same sample point location at 87,5 % together with a SJW value of
1."
I think this is why we implemented a default SJW of 1 in the kernel.
As discussed in my patch description a SJW of 1 for controllers that
were common at the time these standard were written, with narrow bit
timing registers, equals "Phase Seg2 / 2".
In "The Configuration of the CAN Bit Timing" [3] the following criteria
for the Oscillator Tolerance Range was given:
"The tolerance range df for an oscillator’s frequency f_osc around the
nominal frequency f_nom with:
(1 - df) * f_nom <= f_osc <= (1 + df) * f_nom
depends on the proportions of Phase_Seg1, Phase_Seg2, SJW, and the bit
time. The maximum tolerance df is the defined by two conditions (both
shall be met):
min(Phase_Seg1, Phase_Seg2)
I: df <= --------------------------------
2 * (13 * bit_time - Phase_Seg2)
SJW
II: df <= -------------
20 * bit_time
"
The 2nd criteria (see "df2" column) is influenced by SJW, so of
interest. Here a sja1000 controller:
| Bit timing parameters for sja1000 with 8.000000 MHz ref clock using algo 'v5.19'
| _----+--------------=> TSeg1: 1 … 16
| / / _---------=> TSeg2: 1 … 8
| | | / _-----=> SJW: 1 … 4
| | | | / _-=> BRP: 1 … 64 (inc: 1)
| | | | | /
| nominal | | | | | real Bitrt nom real SampP
| Bitrate TQ[ns] PrS PhS1 PhS2 SJW BRP Bitrate Error SampP SampP Error df1 df2 BTR0 BTR1
| 1000000 125 2 3 2 1 1 1000000 0.0% 75.0% 75.0% 0.0% 9.8‰ 6.2‰ 0x00 0x14
| 800000 125 3 4 2 1 1 800000 0.0% 80.0% 80.0% 0.0% 7.8‰ 5.0‰ 0x00 0x16
| 666666 125 4 4 3 1 1 666666 0.0% 80.0% 75.0% 6.2% 9.8‰ 4.2‰ 0x00 0x27
| 500000 125 6 7 2 1 1 500000 0.0% 87.5% 87.5% 0.0% 4.9‰ 3.1‰ 0x00 0x1c
| 250000 250 6 7 2 1 2 250000 0.0% 87.5% 87.5% 0.0% 4.9‰ 3.1‰ 0x01 0x1c
| 125000 500 6 7 2 1 4 125000 0.0% 87.5% 87.5% 0.0% 4.9‰ 3.1‰ 0x03 0x1c
| 100000 625 6 7 2 1 5 100000 0.0% 87.5% 87.5% 0.0% 4.9‰ 3.1‰ 0x04 0x1c
| 83333 750 6 7 2 1 6 83333 0.0% 87.5% 87.5% 0.0% 4.9‰ 3.1‰ 0x05 0x1c
| 50000 1250 6 7 2 1 10 50000 0.0% 87.5% 87.5% 0.0% 4.9‰ 3.1‰ 0x09 0x1c
| 33333 1875 6 7 2 1 15 33333 0.0% 87.5% 87.5% 0.0% 4.9‰ 3.1‰ 0x0e 0x1c
| 20000 3125 6 7 2 1 25 20000 0.0% 87.5% 87.5% 0.0% 4.9‰ 3.1‰ 0x18 0x1c
| 10000 6250 6 7 2 1 50 10000 0.0% 87.5% 87.5% 0.0% 4.9‰ 3.1‰ 0x31 0x1c
"df2" is 3.1‰ for 500kbit/s or slower.
In contrast the mcp251xfd with the current mainline bit timing algorithm
(SJW=1):
| Bit timing parameters for mcp251xfd with 40.000000 MHz ref clock (CIA recommendation) using algo 'v5.19'
| _----+--------------=> TSeg1: 2 … 256
| / / _---------=> TSeg2: 1 … 128
| | | / _-----=> SJW: 1 … 128
| | | | / _-=> BRP: 1 … 256 (inc: 1)
| | | | | /
| nominal | | | | | real Bitrt nom real SampP
| Bitrate TQ[ns] PrS PhS1 PhS2 SJW BRP Bitrate Error SampP SampP Error df1 df2 NBTCFG
| 1000000 25 14 15 10 1 1 1000000 0.0% 75.0% 75.0% 0.0% 9.8‰ 1.2‰ 0x001c0900
| 800000 25 19 20 10 1 1 800000 0.0% 80.0% 80.0% 0.0% 7.8‰ 1.0‰ 0x00260900
| 666666 25 23 24 12 1 1 666666 0.0% 80.0% 80.0% 0.0% 7.8‰ 0.8‰ 0x002e0b00
| 500000 25 34 35 10 1 1 500000 0.0% 87.5% 87.5% 0.0% 4.9‰ 0.6‰ 0x00440900
| 250000 25 69 70 20 1 1 250000 0.0% 87.5% 87.5% 0.0% 4.9‰ 0.3‰ 0x008a1300
| 125000 50 69 70 20 1 2 125000 0.0% 87.5% 87.5% 0.0% 4.9‰ 0.3‰ 0x018a1300
| 100000 50 87 87 25 1 2 100000 0.0% 87.5% 87.5% 0.0% 4.9‰ 0.2‰ 0x01ad1800
| 83333 50 104 105 30 1 2 83333 0.0% 87.5% 87.5% 0.0% 4.9‰ 0.2‰ 0x01d01d00
| 50000 100 87 87 25 1 4 50000 0.0% 87.5% 87.5% 0.0% 4.9‰ 0.2‰ 0x03ad1800
| 33333 125 104 105 30 1 5 33333 0.0% 87.5% 87.5% 0.0% 4.9‰ 0.2‰ 0x04d01d00
| 20000 250 87 87 25 1 10 20000 0.0% 87.5% 87.5% 0.0% 4.9‰ 0.2‰ 0x09ad1800
| 10000 500 87 87 25 1 20 10000 0.0% 87.5% 87.5% 0.0% 4.9‰ 0.2‰ 0x13ad1800
for slow bit rates "df2" drops to 0.2‰.
Here the mcp251xfd with the updated algorithm (SJW = Phase Seg2 / 2):
| Bit timing parameters for mcp251xfd with 40.000000 MHz ref clock (CIA recommendation) using algo 'next'
| _----+--------------=> TSeg1: 2 … 256
| / / _---------=> TSeg2: 1 … 128
| | | / _-----=> SJW: 1 … 128
| | | | / _-=> BRP: 1 … 256 (inc: 1)
| | | | | /
| nominal | | | | | real Bitrt nom real SampP
| Bitrate TQ[ns] PrS PhS1 PhS2 SJW BRP Bitrate Error SampP SampP Error df1 df2 NBTCFG
| 1000000 25 14 15 10 5 1 1000000 0.0% 75.0% 75.0% 0.0% 9.8‰ 6.2‰ 0x001c0904
| 800000 25 19 20 10 5 1 800000 0.0% 80.0% 80.0% 0.0% 7.8‰ 5.0‰ 0x00260904
| 666666 25 23 24 12 6 1 666666 0.0% 80.0% 80.0% 0.0% 7.8‰ 5.0‰ 0x002e0b05
| 500000 25 34 35 10 5 1 500000 0.0% 87.5% 87.5% 0.0% 4.9‰ 3.1‰ 0x00440904
| 250000 25 69 70 20 10 1 250000 0.0% 87.5% 87.5% 0.0% 4.9‰ 3.1‰ 0x008a1309
| 125000 50 69 70 20 10 2 125000 0.0% 87.5% 87.5% 0.0% 4.9‰ 3.1‰ 0x018a1309
| 100000 50 87 87 25 12 2 100000 0.0% 87.5% 87.5% 0.0% 4.9‰ 3.0‰ 0x01ad180b
| 83333 50 104 105 30 15 2 83333 0.0% 87.5% 87.5% 0.0% 4.9‰ 3.1‰ 0x01d01d0e
| 50000 100 87 87 25 12 4 50000 0.0% 87.5% 87.5% 0.0% 4.9‰ 3.0‰ 0x03ad180b
| 33333 125 104 105 30 15 5 33333 0.0% 87.5% 87.5% 0.0% 4.9‰ 3.1‰ 0x04d01d0e
| 20000 250 87 87 25 12 10 20000 0.0% 87.5% 87.5% 0.0% 4.9‰ 3.0‰ 0x09ad180b
| 10000 500 87 87 25 12 20 10000 0.0% 87.5% 87.5% 0.0% 4.9‰ 3.0‰ 0x13ad180b
The "df2" stays at ~3.1‰
[2] http://web.archive.org/https://www.can-cia.org/fileadmin/resources/documents/proceedings/2013_mmc_koppe2.pdf
[3] http://web.archive.org/http://www.oertel-halle.de/files/cia99paper.pdf
And I tested with SJW=max from mcp251x (not the mcp251xfd) with SJA=max
to a peak USB adapter with SJW=1 - the peak adapter fails to receive CAN
frames:
| 5: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UP mode DEFAULT group default qlen 10
| link/can promiscuity 0 minmtu 0 maxmtu 0
| can state ERROR-ACTIVE restart-ms 1000
| bitrate 500000 sample-point 0.850
| tq 100 prop-seg 8 phase-seg1 8 phase-seg2 3 sjw 3 brp 1
| mcp251x: tseg1 3..16 tseg2 2..8 sjw 1..4 brp 1..64 brp_inc 1
| clock 10000000
| re-started bus-errors arbit-lost error-warn error-pass bus-off
| 0 0 0 2 2 0 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 parentbus spi parentdev spi1.1
| 97: peakfd0: <NOARP,UP,LOWER_UP,ECHO> mtu 72 qdisc pfifo_fast state UP mode DEFAULT group default qlen 10
| link/can promiscuity 0 minmtu 0 maxmtu 0
| can <FD> state ERROR-PASSIVE (berr-counter tx 0 rx 136) restart-ms 1000
| bitrate 500000 sample-point 0.875
| tq 12 prop-seg 69 phase-seg1 70 phase-seg2 20 sjw 1 brp 1
| pcan_usb_fd: tseg1 1..256 tseg2 1..128 sjw 1..128 brp 1..1024 brp_inc 1
| dbitrate 2000000 dsample-point 0.750
| dtq 12 dprop-seg 14 dphase-seg1 15 dphase-seg2 10 dsjw 1 dbrp 1
| pcan_usb_fd: dtseg1 1..32 dtseg2 1..16 dsjw 1..16 dbrp 1..1024 dbrp_inc 1
| clock 80000000
| re-started bus-errors arbit-lost error-warn error-pass bus-off
| 0 0 0 1 2 0 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 gro_max_size 65536 parentbus usb parentdev 2-2.2:1.0
| (1970-01-01 02:41:33.856148) peakfd0 RX - - 20000004 [8] 00 04 00 00 00 00 00 00 ERRORFRAME
| controller-problem{rx-error-warning}
| (1970-01-01 02:41:33.857168) peakfd0 RX - - 20000004 [8] 00 10 00 00 00 00 00 00 ERRORFRAME
| controller-problem{rx-error-passive}
With SJW = Phase Seg2 / 2, it works.
> In general the recommendation is to chose SJW as big as possible.
> Here's an excerpt from the CiA 601-3 (Unfortunately I can't share the
> entire document)
CiA 301 and J1939 seem to have a different opinion...
> "Recom2: Choose sjwA as large as possible.
> - The maximum possible value is sjwA = min(ps1A, ps2A).
> - During arbitration phase, a large sjwA allows a CAN node to resynchronize quickly on the leading transmitting node.
> - If the CAN controller does not allow configuring the maximum possible sjwA
> because the value range for sjwA is limited in the CAN controller, then consider
> increasing BRPA. This allows configuring a larger sjwA, unless this can impact
> phase error tolerance in an unwanted way, see Recom4."
>
> I think we discussed this a while ago already and the biggest
> reasoning against maxing SJW was breaking "compatibility" to old
> kernel versions.
ACK
> The compatibility in this is not being able to sync
> though... Now, if a change to a larger default values is agreed upon
> we might as well chose min(phaseseg1, phaseseg2) directly to maximize
> robustness for the user that doesn't set it explicitly. Given that
> most of the getting started with Socketcan results on your favorite
> Search engine don't even mention SJW that's probably a significant
> portion of the users. From a pure communication standpoint I can't see
> any downside to doing this, the only hypothetical case would be if a
> CAN controller had an erratum about higher SJWs and the driver not
> handling it.
>
> So, in essence, I'd propose to use min(p1,p2).
NACK - as I have a setup that doesn't work anymore when mixing an old
and a new kernel.
regards,
Marc
--
Pengutronix e.K. | Marc Kleine-Budde |
Embedded Linux | https://www.pengutronix.de |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 5/5] can: bittiming: can_calc_bittiming(): use Phase Seg2 / 2 as default for SJW
2022-09-08 19:57 ` Marc Kleine-Budde
@ 2022-09-09 12:29 ` Marc Kleine-Budde
2022-09-26 6:15 ` Thomas.Kopp
0 siblings, 1 reply; 20+ messages in thread
From: Marc Kleine-Budde @ 2022-09-09 12:29 UTC (permalink / raw)
To: Thomas.Kopp; +Cc: linux-can, mark
[-- Attachment #1: Type: text/plain, Size: 1011 bytes --]
On 08.09.2022 21:57:16, Marc Kleine-Budde wrote:
> > Thanks for your work on this! May I ask why Phase Seg2 / 2 is chosen?
[...]
> And I tested with SJW=max from mcp251x (not the mcp251xfd) with SJW=max
> to a peak USB adapter with SJW=1 - the peak adapter fails to receive CAN
> frames:
> With SJW = Phase Seg2 / 2, it works.
Correction:
My test:
TX (mcp2515, 20 MHz OSC) -> RX (PEAK USB)
Is broken [1] if the PEAK USB uses SJW==1, independent of the SJW of the
mcp2515 (the sender).
The test works if the PEAK USB uses SJW=PhaseSeg2/2.
regards,
Marc
[1] The PEAK USB fails to receive CAN frames with certain contents:
fails: aaa#aa.00.00.55.00.55.ab.55
works: aaa#aa.00.00.55.00.55.aa.55
--
Pengutronix e.K. | Marc Kleine-Budde |
Embedded Linux | https://www.pengutronix.de |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/5] can: bittiming: can_calc_bittiming(): make use of min3()
2022-09-07 10:38 ` [PATCH 2/5] can: bittiming: can_calc_bittiming(): make use of min3() Marc Kleine-Budde
@ 2022-09-11 8:24 ` Vincent Mailhol
2022-09-12 5:56 ` Thomas.Kopp
0 siblings, 1 reply; 20+ messages in thread
From: Vincent Mailhol @ 2022-09-11 8:24 UTC (permalink / raw)
To: Marc Kleine-Budde; +Cc: linux-can, Mark Bath
On Wed. 7 sept. 2022 à 19:59, Marc Kleine-Budde <mkl@pengutronix.de> wrote:
> In can_calc_bittiming() there are several open coded checks to ensure
> that SJW is within certain limits. Replace this by a single call to
> min3().
>
> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
> ---
> drivers/net/can/dev/calc_bittiming.c | 8 ++------
> 1 file changed, 2 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/can/dev/calc_bittiming.c b/drivers/net/can/dev/calc_bittiming.c
> index d3caa040614d..ce6bef2444a2 100644
> --- a/drivers/net/can/dev/calc_bittiming.c
> +++ b/drivers/net/can/dev/calc_bittiming.c
> @@ -158,12 +158,8 @@ int can_calc_bittiming(const struct net_device *dev, struct can_bittiming *bt,
> if (!bt->sjw || !btc->sjw_max) {
> bt->sjw = 1;
> } else {
> - /* bt->sjw is at least 1 -> sanitize upper bound to sjw_max */
> - if (bt->sjw > btc->sjw_max)
> - bt->sjw = btc->sjw_max;
> - /* bt->sjw must not be higher than tseg2 */
> - if (tseg2 < bt->sjw)
> - bt->sjw = tseg2;
> + /* sjw must not be higher than sjw_max and tseg2 */
> + bt->sjw = min3(bt->sjw, btc->sjw_max, tseg2);
Not directly a criticism of this patch (as things were already like
that), but if the user provides an incorrect value for SJW (or any
other bittiming argument), wouldn't it be better to inform? Returning
-EINVAL might be too violent. Maybe a dmesg would be good?
Yours sincerely,
Vincent Mailhol
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH 2/5] can: bittiming: can_calc_bittiming(): make use of min3()
2022-09-11 8:24 ` Vincent Mailhol
@ 2022-09-12 5:56 ` Thomas.Kopp
2022-09-12 8:28 ` Vincent Mailhol
0 siblings, 1 reply; 20+ messages in thread
From: Thomas.Kopp @ 2022-09-12 5:56 UTC (permalink / raw)
To: vincent.mailhol, mkl; +Cc: linux-can, mark
> > In can_calc_bittiming() there are several open coded checks to ensure
> > that SJW is within certain limits. Replace this by a single call to
> > min3().
> >
> > Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
> > ---
> > drivers/net/can/dev/calc_bittiming.c | 8 ++------
> > 1 file changed, 2 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/net/can/dev/calc_bittiming.c
> b/drivers/net/can/dev/calc_bittiming.c
> > index d3caa040614d..ce6bef2444a2 100644
> > --- a/drivers/net/can/dev/calc_bittiming.c
> > +++ b/drivers/net/can/dev/calc_bittiming.c
> > @@ -158,12 +158,8 @@ int can_calc_bittiming(const struct net_device
> *dev, struct can_bittiming *bt,
> > if (!bt->sjw || !btc->sjw_max) {
> > bt->sjw = 1;
> > } else {
> > - /* bt->sjw is at least 1 -> sanitize upper bound to sjw_max */
> > - if (bt->sjw > btc->sjw_max)
> > - bt->sjw = btc->sjw_max;
> > - /* bt->sjw must not be higher than tseg2 */
> > - if (tseg2 < bt->sjw)
> > - bt->sjw = tseg2;
> > + /* sjw must not be higher than sjw_max and tseg2 */
> > + bt->sjw = min3(bt->sjw, btc->sjw_max, tseg2);
>
> Not directly a criticism of this patch (as things were already like
> that), but if the user provides an incorrect value for SJW (or any
> other bittiming argument), wouldn't it be better to inform? Returning
> -EINVAL might be too violent. Maybe a dmesg would be good?
I'd say it would be consistent to keep returning -EINVAL (at least for the SJW>(min p1,p2)) condition. The same is done for FD Bitrate < Arbitration bitrate and both conditions have the same level of justification in the ISO 11898-1:2015
"The data bit time shall have the same length as the nominal bit time or it shall be shorter than the nominal bit time."
"In nominal bit time and in data bit time, SJW shall be less than or equal to the minimum of these two items: Phase_Seg1 and Phase_Seg2."
Shall is a binding requirement to be ISO conformant in this case.
Best Regards,
Thomas
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/5] can: bittiming: can_calc_bittiming(): make use of min3()
2022-09-12 5:56 ` Thomas.Kopp
@ 2022-09-12 8:28 ` Vincent Mailhol
2023-01-31 16:04 ` Marc Kleine-Budde
2023-02-01 9:05 ` Marc Kleine-Budde
0 siblings, 2 replies; 20+ messages in thread
From: Vincent Mailhol @ 2022-09-12 8:28 UTC (permalink / raw)
To: Thomas.Kopp; +Cc: mkl, linux-can, mark
On Mon. 12 Sept. 2022 at 14:56, <Thomas.Kopp@microchip.com> wrote:
> > > In can_calc_bittiming() there are several open coded checks to ensure
> > > that SJW is within certain limits. Replace this by a single call to
> > > min3().
> > >
> > > Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
> > > ---
> > > drivers/net/can/dev/calc_bittiming.c | 8 ++------
> > > 1 file changed, 2 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/net/can/dev/calc_bittiming.c
> > b/drivers/net/can/dev/calc_bittiming.c
> > > index d3caa040614d..ce6bef2444a2 100644
> > > --- a/drivers/net/can/dev/calc_bittiming.c
> > > +++ b/drivers/net/can/dev/calc_bittiming.c
> > > @@ -158,12 +158,8 @@ int can_calc_bittiming(const struct net_device
> > *dev, struct can_bittiming *bt,
> > > if (!bt->sjw || !btc->sjw_max) {
> > > bt->sjw = 1;
> > > } else {
> > > - /* bt->sjw is at least 1 -> sanitize upper bound to sjw_max */
> > > - if (bt->sjw > btc->sjw_max)
> > > - bt->sjw = btc->sjw_max;
> > > - /* bt->sjw must not be higher than tseg2 */
> > > - if (tseg2 < bt->sjw)
> > > - bt->sjw = tseg2;
> > > + /* sjw must not be higher than sjw_max and tseg2 */
> > > + bt->sjw = min3(bt->sjw, btc->sjw_max, tseg2);
> >
> > Not directly a criticism of this patch (as things were already like
> > that), but if the user provides an incorrect value for SJW (or any
> > other bittiming argument), wouldn't it be better to inform? Returning
> > -EINVAL might be too violent. Maybe a dmesg would be good?
>
> I'd say it would be consistent to keep returning -EINVAL (at least for the SJW>(min p1,p2)) condition.
>
> The same is done for FD Bitrate < Arbitration bitrate and both conditions have the same level of justification in the ISO 11898-1:2015
> "The data bit time shall have the same length as the nominal bit time or it shall be shorter than the nominal bit time."
>
> "In nominal bit time and in data bit time, SJW shall be less than or equal to the minimum of these two items: Phase_Seg1 and Phase_Seg2."
>
> Shall is a binding requirement to be ISO conformant in this case.
What you say makes sense.
Also, I was assuming that can_fixup_bittiming() was already doing the
out of range check:
https://elixir.bootlin.com/linux/v6.0-rc1/source/drivers/net/can/dev/bittiming.c#L27
But in reality, only one of either can_calc_bittiming(),
can_fixup_bittiming() or can_validate_bitrate() is being called. And
thus, can_validate_bitrate() might be called with out of range values
and in that case the neltink interface should return -ERANGE (for
example if sjw > sjw_max).
Sees that there is more work to do here than initially anticipated.
Yours sincerely,
Vincent Mailhol
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH 5/5] can: bittiming: can_calc_bittiming(): use Phase Seg2 / 2 as default for SJW
2022-09-09 12:29 ` Marc Kleine-Budde
@ 2022-09-26 6:15 ` Thomas.Kopp
0 siblings, 0 replies; 20+ messages in thread
From: Thomas.Kopp @ 2022-09-26 6:15 UTC (permalink / raw)
To: mkl; +Cc: linux-can, mark
> On 08.09.2022 21:57:16, Marc Kleine-Budde wrote:
> > > Thanks for your work on this! May I ask why Phase Seg2 / 2 is chosen?
>
> [...]
>
> > And I tested with SJW=max from mcp251x (not the mcp251xfd) with
> SJW=max
> > to a peak USB adapter with SJW=1 - the peak adapter fails to receive CAN
> > frames:
>
> > With SJW = Phase Seg2 / 2, it works.
>
> Correction:
>
> My test:
>
> TX (mcp2515, 20 MHz OSC) -> RX (PEAK USB)
>
> Is broken [1] if the PEAK USB uses SJW==1, independent of the SJW of the
> mcp2515 (the sender).
>
> The test works if the PEAK USB uses SJW=PhaseSeg2/2.
So everything works as expected (or doesn't work, also as expected,) that's good. So, how do we move forward here? I'd still say maximizing SJW is the better thing to do but we should get definitely get it increased and if PS/2 is the compromise I can live with that(though, there are no drawbacks to PS/2 besides "compatibility").
Best Regards,
Thomas
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/5] can: bittiming: can_calc_bittiming(): make use of min3()
2022-09-12 8:28 ` Vincent Mailhol
@ 2023-01-31 16:04 ` Marc Kleine-Budde
2023-02-01 5:49 ` Vincent Mailhol
2023-02-01 9:05 ` Marc Kleine-Budde
1 sibling, 1 reply; 20+ messages in thread
From: Marc Kleine-Budde @ 2023-01-31 16:04 UTC (permalink / raw)
To: Vincent Mailhol; +Cc: Thomas.Kopp, linux-can, mark
[-- Attachment #1: Type: text/plain, Size: 1646 bytes --]
Finally I've found some time to look at this again...
On 12.09.2022 17:28:15, Vincent Mailhol wrote:
> Also, I was assuming that can_fixup_bittiming() was already doing the
> out of range check:
> https://elixir.bootlin.com/linux/v6.0-rc1/source/drivers/net/can/dev/bittiming.c#L27
>
> But in reality, only one of either can_calc_bittiming(),
> can_fixup_bittiming() or can_validate_bitrate() is being called. And
> thus, can_validate_bitrate() might be called with out of range values
> and in that case the neltink interface should return -ERANGE (for
> example if sjw > sjw_max).
>
> Sees that there is more work to do here than initially anticipated.
I've converted the existing netdev_err() NL_SET_ERR_MSG_FMT(). This
means the error message is transported via netlink to user space and
printed by the "ip" tool.
| # ip link set flexcan0 txqueuelen 10 type can bitrate 2200000
| Warning: bitrate error 2.5%.
|
| # ip link set flexcan0 txqueuelen 10 type can bitrate 22000000
| Error: bitrate error 80.5% too high.
This is the error message for the SJW check:
| # ip link set flexcan0 txqueuelen 10 type can bitrate 500000 sjw 3
| Error: SJW 3 bigger than phase_seg1 6 and/or phase_seg2 2.
Maybe add a '=' between the phase_seg and the actual number:
| Error: SJW 3 bigger than phase_seg1=6 and/or phase_seg2=2.
regards,
Marc
--
Pengutronix e.K. | Marc Kleine-Budde |
Embedded Linux | https://www.pengutronix.de |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/5] can: bittiming: can_calc_bittiming(): make use of min3()
2023-01-31 16:04 ` Marc Kleine-Budde
@ 2023-02-01 5:49 ` Vincent Mailhol
2023-02-01 8:58 ` Marc Kleine-Budde
0 siblings, 1 reply; 20+ messages in thread
From: Vincent Mailhol @ 2023-02-01 5:49 UTC (permalink / raw)
To: Marc Kleine-Budde; +Cc: Thomas.Kopp, linux-can, mark
On Wed. 1 Feb 2023 at 01:04, Marc Kleine-Budde <mkl@pengutronix.de> wrote:
> Finally I've found some time to look at this again...
>
> On 12.09.2022 17:28:15, Vincent Mailhol wrote:
> > Also, I was assuming that can_fixup_bittiming() was already doing the
> > out of range check:
> > https://elixir.bootlin.com/linux/v6.0-rc1/source/drivers/net/can/dev/bittiming.c#L27
> >
> > But in reality, only one of either can_calc_bittiming(),
> > can_fixup_bittiming() or can_validate_bitrate() is being called. And
> > thus, can_validate_bitrate() might be called with out of range values
> > and in that case the neltink interface should return -ERANGE (for
> > example if sjw > sjw_max).
> >
> > Sees that there is more work to do here than initially anticipated.
>
> I've converted the existing netdev_err() NL_SET_ERR_MSG_FMT(). This
> means the error message is transported via netlink to user space and
> printed by the "ip" tool.
I was not aware of this NL_SET_ERR_MSG_FMT() thing, and let me say
that I really like that solution!
> | # ip link set flexcan0 txqueuelen 10 type can bitrate 2200000
> | Warning: bitrate error 2.5%.
The Linux coding style says:
Kernel messages do not have to be terminated with a period.
Ref: https://www.kernel.org/doc/html/latest/process/coding-style.html#printing-kernel-messages
However, I am not sure if this also applies to the ip tool.
> |
> | # ip link set flexcan0 txqueuelen 10 type can bitrate 22000000
> | Error: bitrate error 80.5% too high.
>
> This is the error message for the SJW check:
>
> | # ip link set flexcan0 txqueuelen 10 type can bitrate 500000 sjw 3
> | Error: SJW 3 bigger than phase_seg1 6 and/or phase_seg2 2.
At that point in the code, I assume that sjw was already validated
against sjw_max. While not impossible, I think that having sjw bigger
than both phase_seg1 and phase_seg2 at the same time is uncommon. I
suggest to split the error message in two and only print the relevant
one:
Error: SJW 3 bigger than phase_seg1 x
Error: SJW 3 bigger than phase_seg2 x
If the user still manages to violate both, only the phase_seg1 error
message is displayed.
> Maybe add a '=' between the phase_seg and the actual number:
>
> | Error: SJW 3 bigger than phase_seg1=6 and/or phase_seg2=2.
If you ask me my preferences, I will go with column:
Error: SJW: 3 bigger than phase_seg1: 6 and/or phase_seg2: 2.
But I will not complain if you pick anything else.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/5] can: bittiming: can_calc_bittiming(): make use of min3()
2023-02-01 5:49 ` Vincent Mailhol
@ 2023-02-01 8:58 ` Marc Kleine-Budde
2023-02-01 9:38 ` Vincent Mailhol
0 siblings, 1 reply; 20+ messages in thread
From: Marc Kleine-Budde @ 2023-02-01 8:58 UTC (permalink / raw)
To: Vincent Mailhol; +Cc: Thomas.Kopp, linux-can, mark
[-- Attachment #1: Type: text/plain, Size: 2313 bytes --]
On 01.02.2023 14:49:23, Vincent Mailhol wrote:
> > I've converted the existing netdev_err() NL_SET_ERR_MSG_FMT(). This
> > means the error message is transported via netlink to user space and
> > printed by the "ip" tool.
>
> I was not aware of this NL_SET_ERR_MSG_FMT() thing, and let me say
> that I really like that solution!
>
> > | # ip link set flexcan0 txqueuelen 10 type can bitrate 2200000
> > | Warning: bitrate error 2.5%.
>
> The Linux coding style says:
>
> Kernel messages do not have to be terminated with a period.
>
> Ref: https://www.kernel.org/doc/html/latest/process/coding-style.html#printing-kernel-messages
>
> However, I am not sure if this also applies to the ip tool.
I think the period is added by "ip".
> > |
> > | # ip link set flexcan0 txqueuelen 10 type can bitrate 22000000
> > | Error: bitrate error 80.5% too high.
> >
> > This is the error message for the SJW check:
> >
> > | # ip link set flexcan0 txqueuelen 10 type can bitrate 500000 sjw 3
> > | Error: SJW 3 bigger than phase_seg1 6 and/or phase_seg2 2.
>
> At that point in the code, I assume that sjw was already validated
> against sjw_max. While not impossible, I think that having sjw bigger
> than both phase_seg1 and phase_seg2 at the same time is uncommon. I
> suggest to split the error message in two and only print the relevant
> one:
>
> Error: SJW 3 bigger than phase_seg1 x
> Error: SJW 3 bigger than phase_seg2 x
>
> If the user still manages to violate both, only the phase_seg1 error
> message is displayed.
Makes sense.
> > Maybe add a '=' between the phase_seg and the actual number:
> >
> > | Error: SJW 3 bigger than phase_seg1=6 and/or phase_seg2=2.
>
> If you ask me my preferences, I will go with column:
>
> Error: SJW: 3 bigger than phase_seg1: 6 and/or phase_seg2: 2.
>
> But I will not complain if you pick anything else.
Period looks good.
What about the error value? Always return -EINVAL instead of a mix of
-EINVAL and -ERANGE?
Marc
--
Pengutronix e.K. | Marc Kleine-Budde |
Embedded Linux | https://www.pengutronix.de |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/5] can: bittiming: can_calc_bittiming(): make use of min3()
2022-09-12 8:28 ` Vincent Mailhol
2023-01-31 16:04 ` Marc Kleine-Budde
@ 2023-02-01 9:05 ` Marc Kleine-Budde
1 sibling, 0 replies; 20+ messages in thread
From: Marc Kleine-Budde @ 2023-02-01 9:05 UTC (permalink / raw)
To: Vincent Mailhol; +Cc: Thomas.Kopp, linux-can, mark
[-- Attachment #1: Type: text/plain, Size: 2292 bytes --]
On 12.09.2022 17:28:15, Vincent Mailhol wrote:
> > > Not directly a criticism of this patch (as things were already like
> > > that), but if the user provides an incorrect value for SJW (or any
> > > other bittiming argument), wouldn't it be better to inform? Returning
> > > -EINVAL might be too violent. Maybe a dmesg would be good?
> >
> > I'd say it would be consistent to keep returning -EINVAL (at least for the SJW>(min p1,p2)) condition.
> >
> > The same is done for FD Bitrate < Arbitration bitrate and both conditions have the same level of justification in the ISO 11898-1:2015
> > "The data bit time shall have the same length as the nominal bit time or it shall be shorter than the nominal bit time."
> >
> > "In nominal bit time and in data bit time, SJW shall be less than or equal to the minimum of these two items: Phase_Seg1 and Phase_Seg2."
> >
> > Shall is a binding requirement to be ISO conformant in this case.
>
> What you say makes sense.
>
> Also, I was assuming that can_fixup_bittiming() was already doing the
> out of range check:
> https://elixir.bootlin.com/linux/v6.0-rc1/source/drivers/net/can/dev/bittiming.c#L27
>
> But in reality, only one of either can_calc_bittiming(),
> can_fixup_bittiming() or can_validate_bitrate() is being called. And
> thus, can_validate_bitrate() might be called with out of range values
> and in that case the neltink interface should return -ERANGE (for
> example if sjw > sjw_max).
>
> Sees that there is more work to do here than initially anticipated.
If you configure the bitrate + sjw there will be a check for sjw within
the limits (< sjw_max, < min(phase_seg1, phase_seg2)).
If you configure the low level timing parameters there will be limit
checks on the user supplied phase_seg1, phase_seg2 and sjw and a limit
check on the calculated brp.
What should be the return value if these checks fail?
I think it's best to always return -EINVAL and set an error message with
NL_SET_ERR_MSG_FMT().
Marc
--
Pengutronix e.K. | Marc Kleine-Budde |
Embedded Linux | https://www.pengutronix.de |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/5] can: bittiming: can_calc_bittiming(): make use of min3()
2023-02-01 8:58 ` Marc Kleine-Budde
@ 2023-02-01 9:38 ` Vincent Mailhol
2023-02-02 9:55 ` Marc Kleine-Budde
0 siblings, 1 reply; 20+ messages in thread
From: Vincent Mailhol @ 2023-02-01 9:38 UTC (permalink / raw)
To: Marc Kleine-Budde; +Cc: Thomas.Kopp, linux-can, mark
On Wed. 1 Feb 2023 at 17:58, Marc Kleine-Budde <mkl@pengutronix.de> wrote:
> On 01.02.2023 14:49:23, Vincent Mailhol wrote:
> > > I've converted the existing netdev_err() NL_SET_ERR_MSG_FMT(). This
> > > means the error message is transported via netlink to user space and
> > > printed by the "ip" tool.
> >
> > I was not aware of this NL_SET_ERR_MSG_FMT() thing, and let me say
> > that I really like that solution!
> >
> > > | # ip link set flexcan0 txqueuelen 10 type can bitrate 2200000
> > > | Warning: bitrate error 2.5%.
> >
> > The Linux coding style says:
> >
> > Kernel messages do not have to be terminated with a period.
> >
> > Ref: https://www.kernel.org/doc/html/latest/process/coding-style.html#printing-kernel-messages
> >
> > However, I am not sure if this also applies to the ip tool.
>
> I think the period is added by "ip".
Ack. All good then!
> > > |
> > > | # ip link set flexcan0 txqueuelen 10 type can bitrate 22000000
> > > | Error: bitrate error 80.5% too high.
> > >
> > > This is the error message for the SJW check:
> > >
> > > | # ip link set flexcan0 txqueuelen 10 type can bitrate 500000 sjw 3
> > > | Error: SJW 3 bigger than phase_seg1 6 and/or phase_seg2 2.
> >
> > At that point in the code, I assume that sjw was already validated
> > against sjw_max. While not impossible, I think that having sjw bigger
> > than both phase_seg1 and phase_seg2 at the same time is uncommon. I
> > suggest to split the error message in two and only print the relevant
> > one:
> >
> > Error: SJW 3 bigger than phase_seg1 x
> > Error: SJW 3 bigger than phase_seg2 x
> >
> > If the user still manages to violate both, only the phase_seg1 error
> > message is displayed.
>
> Makes sense.
>
> > > Maybe add a '=' between the phase_seg and the actual number:
> > >
> > > | Error: SJW 3 bigger than phase_seg1=6 and/or phase_seg2=2.
> >
> > If you ask me my preferences, I will go with column:
> >
> > Error: SJW: 3 bigger than phase_seg1: 6 and/or phase_seg2: 2.
> >
> > But I will not complain if you pick anything else.
>
> Period looks good.
>
> What about the error value? Always return -EINVAL instead of a mix of
> -EINVAL and -ERANGE?
Looking at the comments from uapi/asm-generic/errno-base.h:
#define EINVAL 22 /* Invalid argument */
#define ERANGE 34 /* Math result not representable */
We are not dealing with some non-representable math results, so
-ERANGE is technically incorrect. I did suggest -ERANGE in the past
because it looks natural to me: we define a min and a max, i.e. its a
range. But the doc tells me I am wrong. Naming the error -ENAN (Error
Not A Number) would have been more explicit, but we are not going to
rewrite UAPI error definitions.
Go with -EINVAL This answer also applies to your next message and to
everywhere else in netlink, I guess.
Yours sincerely,
Vincent Mailhol
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/5] can: bittiming: can_calc_bittiming(): make use of min3()
2023-02-01 9:38 ` Vincent Mailhol
@ 2023-02-02 9:55 ` Marc Kleine-Budde
2023-02-02 10:29 ` Vincent Mailhol
0 siblings, 1 reply; 20+ messages in thread
From: Marc Kleine-Budde @ 2023-02-02 9:55 UTC (permalink / raw)
To: Vincent Mailhol; +Cc: Thomas.Kopp, linux-can, mark
[-- Attachment #1: Type: text/plain, Size: 2505 bytes --]
On 01.02.2023 18:38:17, Vincent Mailhol wrote:
[...]
> > > > Maybe add a '=' between the phase_seg and the actual number:
> > > >
> > > > | Error: SJW 3 bigger than phase_seg1=6 and/or phase_seg2=2.
> > >
> > > If you ask me my preferences, I will go with column:
> > >
> > > Error: SJW: 3 bigger than phase_seg1: 6 and/or phase_seg2: 2.
> > >
> > > But I will not complain if you pick anything else.
> >
> > Period looks good.
Now looks like this:
| sudo ip link set flexcan0 txqueuelen 10 type can bitrate 22000000
| Error: bitrate error: 80.5% too high.
|
| sudo ip link set flexcan0 txqueuelen 10 type can bitrate 2200000
| Warning: bitrate error: 2.5%.
|
| sudo ip link set flexcan0 txqueuelen 10 type can bitrate 1000000 sjw 10
| Error: sjw: 10 greater than max sjw: 4.
|
| sudo ip link set flexcan0 txqueuelen 10 type can prop-seg 3 phase-seg1 1 phase-seg2 8 tq 10
| Error: resulting brp: 0 less than brp-min: 1.
|
| sudo ip link set flexcan0 txqueuelen 10 type can prop-seg 3 phase-seg1 4 phase-seg2 4 tq 30 sjw 5
| Error: sjw: 5 greater than max sjw: 4.
|
| sudo ip link set flexcan0 txqueuelen 10 type can prop-seg 4 phase-seg1 3 phase-seg2 4 tq 30 sjw 4
| Error: sjw: 4 greater than phase-seg1: 3.
|
| sudo ip link set flexcan0 txqueuelen 10 type can prop-seg 4 phase-seg1 4 phase-seg2 3 tq 30 sjw 4
| Error: sjw: 4 greater than phase-seg2: 3.
> > What about the error value? Always return -EINVAL instead of a mix of
> > -EINVAL and -ERANGE?
>
> Looking at the comments from uapi/asm-generic/errno-base.h:
>
> #define EINVAL 22 /* Invalid argument */
> #define ERANGE 34 /* Math result not representable */
>
> We are not dealing with some non-representable math results, so
> -ERANGE is technically incorrect. I did suggest -ERANGE in the past
> because it looks natural to me: we define a min and a max, i.e. its a
> range. But the doc tells me I am wrong. Naming the error -ENAN (Error
> Not A Number) would have been more explicit, but we are not going to
> rewrite UAPI error definitions.
>
> Go with -EINVAL This answer also applies to your next message and to
> everywhere else in netlink, I guess.
It's -EINVAL everywhere now.
Marc
--
Pengutronix e.K. | Marc Kleine-Budde |
Embedded Linux | https://www.pengutronix.de |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/5] can: bittiming: can_calc_bittiming(): make use of min3()
2023-02-02 9:55 ` Marc Kleine-Budde
@ 2023-02-02 10:29 ` Vincent Mailhol
0 siblings, 0 replies; 20+ messages in thread
From: Vincent Mailhol @ 2023-02-02 10:29 UTC (permalink / raw)
To: Marc Kleine-Budde; +Cc: Thomas.Kopp, linux-can, mark
On Thu. 2 Feb 2023 at 18:55, Marc Kleine-Budde <mkl@pengutronix.de> wrote:
> On 01.02.2023 18:38:17, Vincent Mailhol wrote:
> [...]
> > > > > Maybe add a '=' between the phase_seg and the actual number:
> > > > >
> > > > > | Error: SJW 3 bigger than phase_seg1=6 and/or phase_seg2=2.
> > > >
> > > > If you ask me my preferences, I will go with column:
> > > >
> > > > Error: SJW: 3 bigger than phase_seg1: 6 and/or phase_seg2: 2.
> > > >
> > > > But I will not complain if you pick anything else.
> > >
> > > Period looks good.
>
> Now looks like this:
>
> | sudo ip link set flexcan0 txqueuelen 10 type can bitrate 22000000
> | Error: bitrate error: 80.5% too high.
> |
> | sudo ip link set flexcan0 txqueuelen 10 type can bitrate 2200000
> | Warning: bitrate error: 2.5%.
> |
> | sudo ip link set flexcan0 txqueuelen 10 type can bitrate 1000000 sjw 10
> | Error: sjw: 10 greater than max sjw: 4.
> |
> | sudo ip link set flexcan0 txqueuelen 10 type can prop-seg 3 phase-seg1 1 phase-seg2 8 tq 10
> | Error: resulting brp: 0 less than brp-min: 1.
> |
> | sudo ip link set flexcan0 txqueuelen 10 type can prop-seg 3 phase-seg1 4 phase-seg2 4 tq 30 sjw 5
> | Error: sjw: 5 greater than max sjw: 4.
> |
> | sudo ip link set flexcan0 txqueuelen 10 type can prop-seg 4 phase-seg1 3 phase-seg2 4 tq 30 sjw 4
> | Error: sjw: 4 greater than phase-seg1: 3.
> |
> | sudo ip link set flexcan0 txqueuelen 10 type can prop-seg 4 phase-seg1 4 phase-seg2 3 tq 30 sjw 4
> | Error: sjw: 4 greater than phase-seg2: 3.
I like it. I am looking forward to seeing the patches.
> > > What about the error value? Always return -EINVAL instead of a mix of
> > > -EINVAL and -ERANGE?
> >
> > Looking at the comments from uapi/asm-generic/errno-base.h:
> >
> > #define EINVAL 22 /* Invalid argument */
> > #define ERANGE 34 /* Math result not representable */
> >
> > We are not dealing with some non-representable math results, so
> > -ERANGE is technically incorrect. I did suggest -ERANGE in the past
> > because it looks natural to me: we define a min and a max, i.e. its a
> > range. But the doc tells me I am wrong. Naming the error -ENAN (Error
> > Not A Number) would have been more explicit, but we are not going to
> > rewrite UAPI error definitions.
> >
> > Go with -EINVAL This answer also applies to your next message and to
> > everywhere else in netlink, I guess.
>
> It's -EINVAL everywhere now.
\o/
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2023-02-02 10:29 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-07 10:38 [PATCH 0/5] can: bittiming: improve SJW handling and default Marc Kleine-Budde
2022-09-07 10:38 ` [PATCH 1/5] can: dev: register_candev(): ensure that bittiming const are valid Marc Kleine-Budde
2022-09-07 10:38 ` [PATCH 2/5] can: bittiming: can_calc_bittiming(): make use of min3() Marc Kleine-Budde
2022-09-11 8:24 ` Vincent Mailhol
2022-09-12 5:56 ` Thomas.Kopp
2022-09-12 8:28 ` Vincent Mailhol
2023-01-31 16:04 ` Marc Kleine-Budde
2023-02-01 5:49 ` Vincent Mailhol
2023-02-01 8:58 ` Marc Kleine-Budde
2023-02-01 9:38 ` Vincent Mailhol
2023-02-02 9:55 ` Marc Kleine-Budde
2023-02-02 10:29 ` Vincent Mailhol
2023-02-01 9:05 ` Marc Kleine-Budde
2022-09-07 10:38 ` [PATCH 3/5] can: bittiming: can_calc_bittiming(): clean up SJW sanitizing Marc Kleine-Budde
2022-09-07 10:38 ` [PATCH 4/5] can: bittiming: can_calc_bittiming(): ensure that SJW is not longer than either Phase Buffer Segment Marc Kleine-Budde
2022-09-07 10:38 ` [PATCH 5/5] can: bittiming: can_calc_bittiming(): use Phase Seg2 / 2 as default for SJW Marc Kleine-Budde
2022-09-07 19:19 ` Thomas.Kopp
2022-09-08 19:57 ` Marc Kleine-Budde
2022-09-09 12:29 ` Marc Kleine-Budde
2022-09-26 6:15 ` Thomas.Kopp
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).