From: Thorsten Leemhuis <regressions@leemhuis.info>
To: Sanjay R Mehta <sanju.mehta@amd.com>, Vinod Koul <vkoul@kernel.org>
Cc: Eric Pilmore <epilmore@gigaio.com>,
	dmaengine@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	"regressions@lists.linux.dev" <regressions@lists.linux.dev>
Subject: [regression] Bug 216856 - [ptdma] NULL pointer dereference in pt_cmd_callback during server shutdown
Date: Fri, 30 Dec 2022 09:27:05 +0100
Message-ID: <0e77436c-9f0a-15b6-697a-7b879e4abc4a@leemhuis.info>

Hi, this is your Linux kernel regression tracker speaking.

I noticed a bug report in bugzilla.kernel.org that looks a lot like a
regression to my untrained eyes (it's not entirely clear). As many
(most?) kernel developers don't keep an eye on it, I decided to forward
it by mail. Quoting from
https://bugzilla.kernel.org/show_bug.cgi?id=216856 :

>  Eric Pilmore 2022-12-27 22:23:50 UTC
> 
> Observed kernel panic during host shutdown on an AMD (Milan CPU) based
> server. The issue ended up being a NULL pointer dereference in
> pt_cmd_callback() when called from pt_issue_pending(). If you follow
> the flow in pt_issue_pending() you will note that if pt_next_dma_desc()
> returns NULL, then engine_is_idle will remain as TRUE, including if
> pt_next_dma_desc() is still returning NULL in the 2nd call just prior to
> doing the call to pt_cmd_callback().
> 
> The stack flow leading up to the panic was:
> dma_sync_wait() -> dma_async_issue_pending() -> pt_issue_pending() ->
> pt_cmd_callback()
> 
> Temporarily I worked around the issue by simply changing the IF
> condition for the call to pt_cmd_callback() to also check for a non-NULL
> desc, i.e.
> 
>    if (engine_is_idle && desc)
>       pt_cmd_callback(desc, 0);
> 
> This resolved the issue for me, however I don't know enough about the
> driver or the context here to know if this is really the desirable fix,
> and so I'm submitting this bug rather than attempting to patch myself. I
> wasn't sure if the secondary pt_next_dma_desc() call was mistakenly
> leftover from the change that introduced the engine_is_idle variable or
> not. Note that vchan_issue_pending() will return a boolean as to whether
> there are any descriptors on the Issue list, i.e. active descriptors.
> So, maybe that could be used to qualify the need to take some action?
> Also, if pt_cmd_callback() is really going to start processing on the
> next descriptor, I wonder if it should be called under the chan->vc.lock
> lock. I'm not sure of the safety of this, but if you are peeking at
> descriptors on the Issue list, you might want to ensure they're
> protected from being accessed/removed by some other thread.

See the ticket for more details.
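
For readers who want to follow the reported flow without the driver source
at hand, here is a minimal user-space sketch of it. It is not the ptdma
code: every `model_*` name, the two-slot channel and the `check_desc` flag
are hypothetical stand-ins chosen for illustration, locking is left out,
and the callback reports a NULL descriptor instead of dereferencing it.
Only the order of steps (first lookup, issue pending, second lookup,
callback when idle) and the `engine_is_idle && desc` workaround are taken
from the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's descriptor and channel types. */
struct model_desc { int id; };

struct model_chan {
    struct model_desc *submitted; /* queued, not yet issued */
    struct model_desc *issued;    /* the (one-slot) "Issue list" */
};

enum cb_result {
    CB_NULL_DESC = -1, /* callback got NULL: where the kernel would oops */
    CB_NOT_CALLED = 0, /* callback skipped */
    CB_OK = 1          /* callback ran with a valid descriptor */
};

/* Models pt_next_dma_desc(): NULL when nothing is on the issue list. */
static struct model_desc *model_next_desc(const struct model_chan *chan)
{
    assert(chan != NULL);
    return chan->issued;
}

/*
 * Models vchan_issue_pending(): moves submitted work to the issue list
 * and reports whether the issue list is non-empty afterwards.
 */
static bool model_vchan_issue_pending(struct model_chan *chan)
{
    assert(chan != NULL);
    if (chan->submitted && !chan->issued) {
        chan->issued = chan->submitted;
        chan->submitted = NULL;
    }
    return chan->issued != NULL;
}

/* Models pt_cmd_callback(): reports a NULL argument instead of crashing. */
static enum cb_result model_cmd_callback(const struct model_desc *desc)
{
    return desc ? CB_OK : CB_NULL_DESC;
}

/*
 * Models the flow in pt_issue_pending() as described in the report.
 * check_desc == false: callback condition is "engine_is_idle" only.
 * check_desc == true:  the reporter's workaround, "engine_is_idle && desc".
 */
static enum cb_result model_issue_pending(struct model_chan *chan,
                                          bool check_desc)
{
    bool engine_is_idle = true;
    struct model_desc *desc;

    desc = model_next_desc(chan);      /* first lookup */
    if (desc)
        engine_is_idle = false;

    (void)model_vchan_issue_pending(chan);

    desc = model_next_desc(chan);      /* second lookup, may still be NULL */

    if (engine_is_idle && (!check_desc || desc))
        return model_cmd_callback(desc);
    return CB_NOT_CALLED;
}
```

With an empty channel, as during shutdown in the report, the unguarded
variant reaches the callback with NULL while the guarded variant skips it;
when a descriptor was submitted beforehand, both variants behave the same.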

BTW, let me use this mail to also add the report to the list of tracked
regressions to ensure it doesn't fall through the cracks:

#regzbot introduced: v6.1..v6.2-rc1
https://bugzilla.kernel.org/show_bug.cgi?id=216856
#regzbot title: ptdma: kernel panic during host shutdown
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.


Thread overview: 2+ messages
2022-12-30  8:27 Thorsten Leemhuis [this message]
2023-02-11 14:30 ` [regression] Bug 216856 - [ptdma] NULL pointer dereference in pt_cmd_callback during server shutdown Linux regression tracking #update (Thorsten Leemhuis)
