From: Oliver Hartkopp <socketcan@hartkopp.net>
To: Marc Kleine-Budde <mkl@pengutronix.de>, dariobin@libero.it
Cc: Jacob Kroon <jacob.kroon@gmail.com>,
	linux-can@vger.kernel.org, wg@grandegger.com
Subject: Re: CM-ITC, pch_can/c_can_pci, sendto() returning ENOBUFS
Date: Wed, 21 Sep 2022 11:55:59 +0200
Message-ID: <fb1f38e6-c95c-5847-2ebf-16fd8bc2db94@hartkopp.net>
In-Reply-To: <20220921074741.admuodnlv4yexfwr@pengutronix.de>



On 21.09.22 09:47, Marc Kleine-Budde wrote:
> On 21.09.2022 09:25:41, dariobin@libero.it wrote:
>>> On 9/16/22 06:14, Jacob Kroon wrote:
>>> ...
>>>> What I do know is that if I revert commit:
>>>>
>>>> "can: c_can: cache frames to operate as a true FIFO"
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=387da6bc7a826cc6d532b1c0002b7c7513238d5f
>>>>
>>>> then everything looks good. I don't get any BUG messages, and the host
>>>> has been running overnight without problems, so it seems to have fixed
>>>> the network interface lockup as well.
>>
>> Here's what I think:
>> If one or more messages are cached, the controller has to transmit more frames
>> in the unit of time when they can be transmitted (IF_COMM_TXRQST), different from
>> when the transmission occurs directly on request from the user space. In the case
>> of cached data transmission I therefore think that the controller is more heavily
>> loaded. Can this shift the balance ?
>>
>>>
>>> I ran the kernel *with* the commit above, and also with the following patch:
>>>
>>>> diff --git a/drivers/net/can/c_can/c_can_main.c b/drivers/net/can/c_can/c_can_main.c
>>>> index 52671d1ea17d..4375dc70e21f 100644
>>>> --- a/drivers/net/can/c_can/c_can_main.c
>>>> +++ b/drivers/net/can/c_can/c_can_main.c
>>>> @@ -1,3 +1,4 @@
>>>> +#define DEBUG
>>>>   /*
>>>>    * CAN bus driver for Bosch C_CAN controller
>>>>    *
>>>> @@ -469,8 +470,15 @@ static netdev_tx_t c_can_start_xmit(struct sk_buff *skb,
>>>>   	if (c_can_get_tx_free(tx_ring) == 0)
>>>>   		netif_stop_queue(dev);
>>>>   
>>>> -	if (idx < c_can_get_tx_tail(tx_ring))
>>>> +	netdev_dbg(dev, "JAKR:%d:%d:%d:%d\n", idx,
>>>> +	                                      c_can_get_tx_head(tx_ring),
>>>> +	                                      c_can_get_tx_tail(tx_ring),
>>>> +	                                      c_can_get_tx_free(tx_ring));
>>>> +
>>>> +	if (idx < c_can_get_tx_tail(tx_ring)) {
>>>>   		cmd &= ~IF_COMM_TXRQST; /* Cache the message */
>>>> +		netdev_dbg(dev, "JAKR:Caching messages\n");
>>>> +	}
>>>>   
>>>>   	/* Store the message in the interface so we can call
>>>>   	 * can_put_echo_skb(). We must do this before we enable
>>>
>>> and I've uploaded the entire log I could capture from /dev/kmsg, right
>>> up to the hang, here:
>>>
>>> https://pastebin.com/6hvAcPc9
>>>
>>> What looks odd to me right from the start is that sometimes when idx
>>> rolls over to 0, and *only* when it rolls over to 0, the CAN frame gets
>>> cached because "idx < c_can_get_tx_tail(tx_ring)".
>>
>> If the message were not stored but transmitted, the order of transmission
>> would not be respected.
>>
>>>
>>> Is it possible there is some difference between c_can and d_can in how
>>> the HW buffers are working, which breaks the driver on my particular HW
>>> setup ?
>>>
>>
>> I tested the patch on a beaglebone board without encountering any problems.
>> There is also a version of the driver I submitted to Xenomai running on a custom
>> board without problems. But surely the setup and context is different from yours.
>>
>> What compatible are you using in your device tree?
>> I used "ti,am3352-d_can".
> 
> I think Jacob's board has a c_can core, while the beagle bone uses a
> d_can. Maybe there's a subtle difference between these cores?
> 
> Dario, do you have access to a real c_can core to test?
> 
> As reverting 387da6bc7a82 ("can: c_can: cache frames to operate as a
> true FIFO") helps to fix Jacob's problem, a temporary solution might be
> to only cache frames on d_can cores.
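
For reference, the caching condition discussed above can be modelled in
isolation. This is only a minimal sketch: struct tx_ring_model and the
helpers ring_head(), ring_tail() and should_cache() are hypothetical names
for illustration, not the driver's code. It assumes head and tail are
free-running counters masked by a power-of-two object count; the only part
taken from the quoted patch is the comparison "idx < tail".

```c
#include <stdbool.h>

/* Hypothetical, simplified model of a TX ring (not the driver's struct). */
struct tx_ring_model {
	unsigned int head;    /* free-running count of frames queued */
	unsigned int tail;    /* free-running count of frames completed */
	unsigned int obj_num; /* number of TX objects, assumed power of two */
};

/* Index of the next message object to fill. */
static unsigned int ring_head(const struct tx_ring_model *r)
{
	return r->head & (r->obj_num - 1);
}

/* Index of the oldest message object still pending. */
static unsigned int ring_tail(const struct tx_ring_model *r)
{
	return r->tail & (r->obj_num - 1);
}

/*
 * Same comparison as in the quoted patch: if the index about to be used
 * is lower than the tail index, the new frame is cached (TXRQST is not
 * set yet) rather than requested for transmission immediately.
 */
static bool should_cache(const struct tx_ring_model *r)
{
	return ring_head(r) < ring_tail(r);
}
```

With obj_num = 8, head = 8 and tail = 5 the head index has wrapped to 0
while object 5 is still pending, so the model reports "cache". With
head = 11 and tail = 9 (indices 3 and 1) it does not.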

Btw. I uploaded the 'latest' C_CAN manuals on

https://github.com/linux-can/can-doc

... as they could only be found on archive.org :-/

Unfortunately I was not able to find any (latest?) D_CAN manual anymore, 
which was originally hosted at 
http://www.semiconductors.bosch.de/media/en/pdf/ipmodules_1/can/d_can_users_manual_111.pdf

Archive.org did not crawl this PDF ;-(

If someone still has this D_CAN PDF please send a URL or the PDF itself 
to me, so that I can put it there too.

Thanks,
Oliver