Linux-Can Archive on
 help / color / Atom feed
From: Vincent MAILHOL <>
To: Oliver Hartkopp <>
Cc: "Marc Kleine-Budde" <>,,,
	"Stéphane Grosjean" <>,
	"Vincent Mailhol" <>
Subject: Re: [net-rfc 04/16] can: dev: can_get_len(): add a helper function to get the correct length of Classical frames
Date: Fri, 23 Oct 2020 00:46:58 +0900
Message-ID: <> (raw)
In-Reply-To: <>

On 22.10.20 22:28, Oliver Hartkopp wrote:
> On 22.10.20 14:23, Vincent MAILHOL wrote:
>> On 22.10.20 16:15, Oliver Hartkopp wrote:
>>> On 22.10.20 05:30, Vincent MAILHOL wrote:
>>>> On 22.10.20 02:52, Oliver Hartkopp wrote:
>>>>> On 21.10.20 13:55, Vincent MAILHOL wrote:
>>>>>>> On 21.10.20 18:48, Oliver Hartkopp wrote:
>>>>>>> To be more compatible to non raw dlc CAN frame sources the tx handling
>>>>>>> could also be like this:
>>>>>>> if ((can_dlc == CAN_MAX_DLEN) &&
>>>>>>>         (raw_dlc >= CAN_MAX_DLC && raw_dlc <= CAN_MAX_RAW_DLC))
>>>>>>> => use raw_dlc
>>>> In addition to that, we would have to decide what to do with malformed
>>>> frames. If can_dlc and can_raw are set to contradictory values, do we
>>>> drop the frame, do we just ignore the raw_dlc? For example, if can_dlc
>>>> is 8 but raw_dlc is 2, I think that we should drop.
>>> The pseudo code is pretty clear on this: Use can_dlc.
>>> The nice thing about testing raw_dlc to be 9..15 when can_dlc == 8 is,
>>> that we always have a correct CAN frame on the wire with 8 bytes of content.
>>> Ok, the raw_dlc might be an uninitialized value - but still everything
>>> remains fine for operation. So why dropping frames here?
>> If the can_dlc and raw_dlc are in contradiction, then the user is
>> trying to do something not supported. After introducing raw_dlc,
>> someone might want to try sending such malformed frames. If the
>> operation succeeds, he then might be fooled in thinking that he
>> actually send, for example, a frame of eight bytes with a DLC of 2.
> ??
> You obviously did not read my description / pseudo code :-(
> There is no contradiction.
> if (can_dlc == 8) && (raw_dlc > 8 && raw_dlc <= 15)
>      dlc = raw_dlc;
> else
>      dlc = can_dlc;
> can_dlc == 8 is the entry door to write dlc values from 9..15 into the
> controller register.
> Which always leads to a valid CAN frame with 8 bytes.

Your code is correct. Sorry, my comment was not about your code but
was to answer the "why dropping frames here" question. Just wanting to
discuss what we do if the raw_dlc holds an incoherent value.

>> In such a scenario, I think it is better to tell the user: "what you
>> are doing is not supported" by returning -EINVAL (similar to what is
>> currently done when an invalid skb is sent).
>>> The problem of uninitialized values could be solved with a
>>> CAN_RAW_RAW_DLC sockopt BUT you will be always able to send a CAN frame
>>> with an AF_PACKET socket which has no CAN_RAW_RAW_DLC sockopt.
>>> And then?
>> I am not competent enough on that point to comment. I was not aware of
>> the possibility of using AF_PACKET to send CAN frames. Do you have a
>> link to any example of such code? I would be glad to study it and come
>> back to you once I understand this point.

Thanks, I will try to understand.

>>> The above pseudo code together with CAN_CTRLMODE_RAW_DLC seems to be a
>>> pretty safe option to me. Even if 'legacy' applications with
>>> uninitialized raw_dlc send CAN frames or AF_PACKET users enable
>>> CAN_CTRLMODE_RAW_DLC we always end up with a proper can_dlc == 8  and a
>>> fitting valid raw_dlc.
>> But what if that "fitting valid raw_dlc" is greater than 8?
> This is the plan!
>> The kernel
> the CAN driver
>> would see that frame as valid, a frame with a DLC greater than 8 will
>> be sent on the wire and received by the other devices than might be
>> connected on the CAN bus.
> Exactly.

And this is my point. The legacy application intended to send a frame
of 8 bytes with a DLC of 8 but on the wire, there is a frame of 8
bytes with a DLC greater than 8. The other devices on the CAN bus
might ignore it (if the device has a DLC filter) or might crash if
there is a vulnerability in DLC handling.

>>>> If we go this direction, I would propose below logic:
>>>>    1. if raw_dlc equals can_dlc: nominal case.
>>> Is not tested. We would start on testing can_dlc == 8
>>>>    2. if can_dlc equals 8 and raw_dlc is greater than 8: use raw_dlc
>>>>       case, already covered in your code.
>>> ACK
>>>>    3. if raw_dlc is 0, then consider it as unset: overwrite raw_dlc with
>>>>       can_dlc (so that it becomes case 1).
>>> Is not tested like this. We would start on can_dlc == 8 - and then check
>>> for valid raw_dlc 9..15
>>>>    4. Other scenarios: the frame is malformed: drop it.
>>> No dropping.
>>>> Logic should apply for both Tx and Rx paths.
>>> So test for can_dlc == 8 - and then check for valid raw_dlc 9..15 in the
>>> user space?! Fine.
>> My view: users and drivers directly use raw_dlc without need to check
>> for can_dlc == 8 (detailed explanation below).
>>> Btw. we can make sure that we do not have uninitialized raw_dlc value in
>>> the rx path in the enabled drivers. In fact you could always use the
>>> raw_dlc then.
>> My view is that if we introduce the raw_dlc field, we should always
>> populate it with the DLC value as defined in ISO, even if
>> CAN_CTRLMODE_RAW_DLC and CAN_RAW_RAW_DLC are not active. User could
>> then benefit from this field to get the DLC value (and maximum value
>> would be either CAN_MAX_DLC or CAN_MAX_DLC RAW depending on the
>> context).
>> Rationale is that the raw_dlc field would always be visible in the
>> struct can_frame. So if it is here, use it. Having it sometimes used,
>> sometimes not would be confusing for the user.
> I wonder if it's more confusing to fill raw_dlc without
>> Basically, that would replace the use of the can_len2dlc() function.
>> I also think it should be extended to CAN-FD, but no need to elaborate
>> more on this at the moment: let's keep the discussion focused on
>> classical CAN and tackle this later.
> Why do you want this?
> You get of the bike and push your bicycle for what reason?
> No.

Pushing the bike: yes, that was actually my motivation. One more time,
I am not telling you are wrong, just that I would do it in a different
way. I explained my view and you read it. I do not have more arguments
to bring.

>>> BUT it makes sense to use this test to detect cases, where someone
>>> forgot to switch on CAN_CTRLMODE_RAW_DLC or the driver doesn't support
>>> it and the programmer did not check the ctrlmode netlink return values.
>> My view: if the user tries to send raw_dlc bigger than 8 and
>> CAN_CTRLMODE_RAW_DLC and CAN_RAW_RAW_DLC are not active, return
> raw_dlc has always to be bigger than 8. What am I explaining wrong all
> time that you don't get it?

Was not trying to prove you wrong but to debate other options.

>>> Ok, let me rephrase: Filling can_dlc with something else than a plain
>>> length information 0..8 ;-)
>> Got it :-)
>>  From my preliminary study, not so much changes or sanity check would
>> need to be added:
>>    1. The current can_dlc sanity checks in can_send() and can_rcv() in
>>       net/can/af_can.c definitely needs to be adjusted (but this would
>>       also be the case for the raw_dlc field solution).
>>    2. In the kernel Rx and Tx paths: the length should not be accessed
>>       directly anymore but through a getter function.
>>    3. In the kernel: drivers "DLC aware" need to adapt their code and
>>       use the length getter function.
>>    4. In user land: new "DLC aware" code should always use struct
>>       can_frame for classical CAN and check whether can_dlc is greater
>>       that CAN_MAX_DLEN before accessing data.
>> That's mostly all that would have to be adjusted.
> No, it is not.
> You have to go through all protocols, e.g. raw.c, bcm.c, gw.c, j1939,
> isotp.c which get a skb to be (sometimes) cloned by them. And when they
> need to modify the can_dlc value to send it to the userspace or
> whatever, you need to modify skb->data - and then you can't clone it
> anymore but need to skbcopy() them.
> HERE you will get a really big effort. Occasionally I was also thinking
> using can_dlc would be nice but then I looked into above code and your
> use-case does not justify that effort/ risk and performance impact.

I naively thought this part would not get impacted. I have yet to
understand the full impact in detail but you made your point clear.

You won :-)
Sorry for the long exchange and thank you for your patience.

>>>>> The 14 year old documentation in can.h says:
>>>>> @can_dlc: frame payload length in byte (0 .. 8)
>>>>> I will not break this established rule for a testing feature. The
>>>>> question from Joakim clearly showed: Don't touch this!
>>>> My point is this is an expert feature: if you do not understand it, do
>>>> not use it and you are safe to go using the 14 years old definition.
>>> Full ACK - but you can't imagine what people do in the real world. Do
>>> you know these 'more helps more' guys that switch everything to 'on'? :-D
>> Switching everything options on through an "ip link set canX up ..."
>> should not cause damage. If the user is doing random setsockopt()
>> without any clues of what he is doing, then he is already a lost cause
>> for me...
>>> The approach with CAN_CTRLMODE_RAW_DLC and the described testing would
>>> be even robust against unintended miss-use.
>> Exception: can_dlc == 8 and a non-initialed raw_dlc gets a garbage
>> value between 9 and 15 from the stack (c.f. above comment).
> Right, and what would be the effect of this?
> You know the answer: Nothing breaks.

A frame of 8 bytes and DLC between 9 and 15 is sent on the wire and
this has the potential to crash the other devices on the bus as
explained in my initial e-mails.

>>>>> At the end it would have made more sense to call it can_frame.len in the
>>>>> first place. But this first came into our mind when CAN FD showed up.
>>>>> The discussion about the can_dlc meaning can be closed IMO. It is a
>>>>> plain length from 0..8 which is unfortunately named can_dlc.
>>>> I agree that it should have been named len from the beginning, but as
>>>> a matter of fact it has been named "can_dlc". For me as a user, I
>>>> expect can_dlc to follow the DLC definition in ISO and so I *prefer*
>>>> to opt for the compromise of losing the plain length property rather
>>>> than losing the semantic meaning.
>>> I don't think you can claim to be a standard user. You are an expert for
>>> security testing and digged into the ISO standard and intentionally
>>> write DLC values >8 into controller registers. Users don't do things
>>> like this ;-)
>> You get a point :-)
>> Whatever the output of our discussion is, I have one strong
>> request. The linux kernel has a huge audience and has such, I think it
>> is our duty to be technically accurate even in small details.
> But we failed to do so by naming the plain data length can_dlc. We have
> to deal with that legacy.
> You can not fix that anymore.
>> I would like to add a comment in the struct can_frame to acknowledge
>> the discrepancy with the ISO. Some people who do not have access to
>> the ISO standard might refer instead to the SocketCAN documentation to
>> learn and understand the protocol and those people might use this
>> knowledge to program on other platforms (e.g. bare-metal
>> microcontrollers). Using the DLC as a plain length on those systems
>> might results in vulnerabilities in production and potentially safety
>> hazard.
>> That's also why I would like to see the Socket CAN DLC be used
>> according to ISO: so that we raise awareness on that issue and so that
>> users can transfer their knowledge to other platforms.
> Vincent, stop discussing about having in can_dlc anything else then a
> plain length from 0..8.
> I'm really getting tired on this after the 6th or whatever round.
> Read and understand my mails.

Yours sincerely,
Vincent Mailhol

  reply index

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-19 19:05 [RFC]: can 2020-10-19 Marc Kleine-Budde
2020-10-19 19:05 ` [net-rfc 01/16] can: proc: can_remove_proc(): silence remove_proc_entry warning Marc Kleine-Budde
2020-10-19 19:05 ` [net-rfc 02/16] can: rx-offload: don't call kfree_skb() from IRQ context Marc Kleine-Budde
2020-10-19 19:05 ` [net-rfc 03/16] can: dev: can_get_echo_skb(): prevent call to kfree_skb() in hard " Marc Kleine-Budde
2020-10-19 19:05 ` [net-rfc 04/16] can: dev: can_get_len(): add a helper function to get the correct length of Classical frames Marc Kleine-Budde
2020-10-19 20:35   ` Oliver Hartkopp
2020-10-20  6:35     ` Marc Kleine-Budde
2020-10-20 11:30       ` Vincent Mailhol
2020-10-20 11:48         ` Marc Kleine-Budde
2020-10-20 12:38         ` Oliver Hartkopp
2020-10-20 15:02           ` Marc Kleine-Budde
2020-10-20 16:07           ` Vincent Mailhol
2020-10-20 17:04             ` Oliver Hartkopp
2020-10-20 18:50               ` Marc Kleine-Budde
2020-10-21  0:52               ` Vincent Mailhol
2020-10-21  6:23                 ` Vincent MAILHOL
2020-10-21  7:11                   ` Joakim Zhang
2020-10-21  7:21                     ` Marc Kleine-Budde
2020-10-21  7:48                       ` Joakim Zhang
2020-10-21  9:21                   ` Oliver Hartkopp
2020-10-21  9:48                     ` Oliver Hartkopp
2020-10-21 11:55                     ` Vincent MAILHOL
2020-10-21 17:52                       ` Oliver Hartkopp
2020-10-22  3:30                         ` Vincent MAILHOL
2020-10-22  7:15                           ` Oliver Hartkopp
2020-10-22 12:23                             ` Vincent MAILHOL
2020-10-22 13:28                               ` Oliver Hartkopp
2020-10-22 15:46                                 ` Vincent MAILHOL [this message]
2020-10-22 17:06                                   ` Oliver Hartkopp
2020-10-23 10:36                                     ` Vincent MAILHOL
2020-10-23 16:47                                       ` Oliver Hartkopp
2020-10-24  5:25                                         ` Vincent MAILHOL
2020-10-24 11:31                                           ` Oliver Hartkopp
2020-10-19 19:05 ` [net-rfc 05/16] can: dev: __can_get_echo_skb(): fix the returned length of CAN frame Marc Kleine-Budde
2020-10-19 19:05 ` [net-rfc 06/16] can: can_create_echo_skb(): fix echo skb generation: always use skb_clone() Marc Kleine-Budde
2020-10-19 19:05 ` [net-rfc 07/16] can: j1939: j1939_sk_bind(): return failure if netdev is down Marc Kleine-Budde
2020-10-19 19:05 ` [net-rfc 08/16] can: isotp: Explain PDU in CAN_ISOTP help text Marc Kleine-Budde
2020-10-19 19:05 ` [net-rfc 09/16] can: isotp: enable RX timeout handling in listen-only mode Marc Kleine-Budde
2020-10-19 19:05 ` [net-rfc 10/16] can: ti_hecc: add missed clk_disable_unprepare() in error path Marc Kleine-Budde
2020-10-19 19:05 ` [net-rfc 11/16] can: xilinx_can: handle failure cases of pm_runtime_get_sync Marc Kleine-Budde
2020-10-19 19:05 ` [net-rfc 12/16] can: peak_usb: fix timestamp wrapping Marc Kleine-Budde
2020-10-19 19:05 ` [net-rfc 13/16] can: peak_canfd: fix echo management when loopback is on Marc Kleine-Budde
2020-10-19 19:05 ` [net-rfc 14/16] can: mcp251xfd: mcp251xfd_regmap_crc_read(): increase severity of CRC read error messages Marc Kleine-Budde
2020-10-19 19:05 ` [net-rfc 15/16] can: mcp251xfd: mcp251xfd_regmap_nocrc_read(): fix semicolon.cocci warnings Marc Kleine-Budde
2020-10-19 19:05 ` [net-rfc 16/16] can: mcp251xfd: remove unneeded break Marc Kleine-Budde

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='' \ \ \ \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Can Archive on

Archives are clonable:
	git clone --mirror linux-can/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-can linux-can/ \
	public-inbox-index linux-can

Example config snippet for mirrors

Newsgroup available over NNTP:

AGPL code for this site: git clone