From: Florian Fainelli <f.fainelli@gmail.com>
To: Florian Fainelli <f.fainelli@gmail.com>,
	Maxime Ripard <maxime@cerno.tech>
Cc: Doug Berger <opendmb@gmail.com>,
	bcm-kernel-feedback-list@broadcom.com,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	Nicolas Saenz Julienne <nsaenz@kernel.org>
Subject: Re: Kernel Panic in skb_release_data using genet
Date: Mon, 31 May 2021 19:36:29 -0700
Message-ID: <9e99ade5-ebfc-133e-ac61-1aba07ca80a2@gmail.com>
In-Reply-To: <77d412b4-cdd6-ea86-d7fd-adb3af8970d9@gmail.com>



On 5/28/2021 9:48 AM, Florian Fainelli wrote:
> On 5/28/21 9:32 AM, Maxime Ripard wrote:
>> Hi Florian,
>>
>> On Fri, May 28, 2021 at 09:21:27AM -0700, Florian Fainelli wrote:
>>> On 5/24/21 8:37 AM, Florian Fainelli wrote:
>>>>
>>>>
>>>> On 5/24/2021 8:13 AM, Maxime Ripard wrote:
>>>>> Hi Florian,
>>>>>
>>>>> On Mon, May 24, 2021 at 07:49:25AM -0700, Florian Fainelli wrote:
>>>>>> Hi Maxime,
>>>>>>
>>>>>> On 5/24/2021 6:01 AM, Maxime Ripard wrote:
>>>>>>> Hi Doug, Florian,
>>>>>>>
>>>>>>> I've been running a RaspberryPi4 with a mainline kernel for a while,
>>>>>>> booting from NFS. Every once in a while (I'd say ~20-30% of all boots),
>>>>>>> I'm getting a kernel panic around the time init is started.
>>>>>>>
>>>>>>> I was debugging a kernel based on drm-misc-next-2021-05-17 today with
>>>>>>> KASAN enabled and got this, which looks related:
>>>>>>
>>>>>> Is there a known good version that could be used for bisection, or did
>>>>>> you just start running this test and have no reference point?
>>>>>
>>>>> I've had this issue for over a year and never (I think?) got a good
>>>>> version, so while it might be a regression, it's not a recent one.
>>>>
>>>> OK, this helps and does not really help.
>>>>
>>>>>
>>>>>> How stable in terms of clocking is the configuration that you are using?
>>>>>> I could try to fire up a similar test on a Pi4 at home, or use one of
>>>>>> our 72112 systems which is the closest we have to a Pi4 and see if that
>>>>>> happens there as well.
>>>>>
>>>>> I'm not really sure about the clocking. Is there any clock you want to
>>>>> look at in particular?
>>>>
>>>> ARM, DDR, AXI, anything that could cause some memory corruption to occur
>>>> essentially. GENET clocks are fairly fixed, you have a 250MHz clock and
>>>> a 125MHz clock feeding the data path.
>>>>
>>>>>
>>>>> My setup is fairly simple: the firmware and kernel are loaded over TFTP
>>>>> and the rootfs is mounted over NFS, and the crash always occurs around
>>>>> init start, so I guess when it actually starts to transmit a decent
>>>>> amount of data?
>>>>
>>>> Do you reproduce this problem with KASAN disabled? Do you eventually
>>>> get a crash pointing back to the same location?
>>>>
>>>> I have a suspicion that this is all Pi4 specific because we regularly
>>>> run the GENET driver through various kernel versions (4.9, 5.4 and 5.10
>>>> and mainline) and did not run into that.
>>>
>>> I have not had time to get a set-up to reproduce what you are seeing,
>>> could you share your .config meanwhile? Thanks
>>
>> Sorry, I didn't have the time to check how the clocks were behaving.
>>
>> You'll find attached my config.txt file and .config
>>
>> I'm booting the board entirely from TFTP (which might introduce some
>> issues in the "handoff" from the bootloader to the kernel). You'll find
>> a guide here:
>>
>> https://www.raspberrypi.org/documentation/hardware/raspberrypi/bootmodes/net_tutorial.md
> 
> That is also how I boot my Pi4 at home, and I suspect you are right. If
> the VPU does not shut down GENET's DMA and leaves buffer addresses in
> the on-chip descriptors that point to an address space that Linux
> manages totally differently, then we can have a serious problem:
> memory corruption when the ring is being reclaimed. I will run a few
> experiments to test that theory. There may be a solution using the
> SW_INIT reset controller to do a full reset of the controller before
> handing it over to the Linux driver.

Adding a WARN_ON(reg & DMA_EN) in bcmgenet_dma_disable() has not shown
the TX or RX DMA being left running during the hand-over from the VPU
to the kernel. I checked out drm-misc-next-2021-05-17 to reduce the
differences between your set-up and mine as much as possible, but so
far I have not been able to reproduce the crash by repeatedly booting
from NFS. I will try again.
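
For reference, the idea behind that check can be sketched roughly as
follows. This is a user-space mock, not the driver code: struct
mock_genet, dma_was_enabled() and mock_dma_disable() are made-up names
for illustration, plain variables stand in for the TDMA/RDMA DMA_CTRL
registers, and DMA_EN being bit 0 of DMA_CTRL is an assumption about
bcmgenet.h.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumption: DMA_EN is bit 0 of the DMA_CTRL register. */
#define DMA_EN (1u << 0)

/* Hypothetical stand-in for the TX and RX DMA_CTRL registers. */
struct mock_genet {
	uint32_t tdma_ctrl;
	uint32_t rdma_ctrl;
};

/* Non-zero if the engine was still enabled when we took over. */
static int dma_was_enabled(uint32_t reg)
{
	return (reg & DMA_EN) != 0;
}

/*
 * Mock of the disable path: look at each engine's DMA_EN bit before
 * clearing it, the way a WARN_ON(reg & DMA_EN) would in the driver.
 * Returns how many engines were found enabled.
 */
static int mock_dma_disable(struct mock_genet *g)
{
	int warnings = 0;

	if (dma_was_enabled(g->tdma_ctrl)) {
		fprintf(stderr, "TX DMA left enabled at hand-over\n");
		warnings++;
	}
	g->tdma_ctrl &= ~DMA_EN;

	if (dma_was_enabled(g->rdma_ctrl)) {
		fprintf(stderr, "RX DMA left enabled at hand-over\n");
		warnings++;
	}
	g->rdma_ctrl &= ~DMA_EN;

	return warnings;
}
```

A return value of 0 corresponds to what I am seeing here: no warning,
so no sign that either engine was still running.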
-- 
Florian


