nouveau.lists.freedesktop.org archive mirror
From: Computer Enthusiastic <computer.enthusiastic@gmail.com>
To: nouveau@lists.freedesktop.org
Subject: Re: [Nouveau] Advice about debugging nouveau driver suspend issue (init_on_alloc=1 and init_on_free=1)
Date: Thu, 14 Oct 2021 18:51:33 +0200
Message-ID: <YWhgFZPqx3qKqr0C@debian.fritz.box>
In-Reply-To: <CAKb7UvhBekZhHhq=aD+hLCfOWe33whi_bScbLiDhXDwvEvbzfA@mail.gmail.com>

Hello,

Thanks for the answer.

On Wed, Oct 13, 2021 at 01:43:56PM -0400, Ilia Mirkin wrote:
> The most straightforward thing, if you can reproduce at will, would be
> to do a bisect to figure out which change this happened with. Once we
> know which change caused the problem, it will hopefully provide us with
> more ability to sort out where we're going wrong. You can also play
> with things like KASAN, although it's not clear to me that they would
> necessarily catch this problem.

The patch is "mm: security: introduce init_on_alloc=1 and init_on_free=1 
boot options" (commit 6471384af2a6530696fc0203bafe4de41a23c9ef) [1]. 

It landed after the v5.2 tag, so the first release containing it is 5.3:
$ git describe --tags 6471384af2a6530696fc0203bafe4de41a23c9ef
v5.2-5754-g6471384af2a6
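
For reading the output above: git describe prints the nearest tag reachable
from the commit, then the number of commits on top of that tag, then "g" plus
the abbreviated hash. The sketch below only splits such a string into its
three parts; the function and type names are made up for illustration. Running
"git describe --contains <commit>" in a kernel tree is the usual way to ask
which tag first includes a commit.

```python
import re
from typing import NamedTuple, Optional


class Described(NamedTuple):
    tag: str                # nearest tag reachable from the commit
    commits_since_tag: int  # commits on top of that tag
    abbrev: str             # abbreviated commit hash (without the "g")


def parse_describe(text: str) -> Optional[Described]:
    """Split 'TAG-N-gHASH' as printed by git describe; None if it does not fit."""
    m = re.fullmatch(r"(?P<tag>.+)-(?P<n>\d+)-g(?P<abbrev>[0-9a-f]+)", text.strip())
    if m is None:
        return None
    return Described(m.group("tag"), int(m.group("n")), m.group("abbrev"))


# The string reported in this message: 5754 commits on top of v5.2.
result = parse_describe("v5.2-5754-g6471384af2a6")
```

A non-zero commit count means the commit is not part of the named tag itself,
only of whatever is released after it.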

This patch does not touch the nouveau source code. I suppose it interacts
with the memory management of the nouveau kernel module during suspend to
RAM and resume from RAM (logs recorded during the malfunction are available
in [2]).
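
A first check that may help narrow this down is whether the two options are
actually set on the affected boot. The sketch below parses a kernel command
line string for init_on_alloc and init_on_free. It is only an illustration:
the helper name is made up, it accepts a small set of boolean spellings, and
it assumes the last occurrence of an option wins. When an option is absent,
the effective default comes from the kernel configuration
(CONFIG_INIT_ON_ALLOC_DEFAULT_ON and CONFIG_INIT_ON_FREE_DEFAULT_ON), which
this sketch does not read.

```python
from typing import Dict, Optional

# Assumption: only these spellings are recognised by this sketch.
_TRUE = {"1", "y", "Y", "on"}
_FALSE = {"0", "n", "N", "off"}


def init_flags(cmdline: str) -> Dict[str, Optional[bool]]:
    """Return the init_on_alloc / init_on_free settings found in `cmdline`.

    None means the option is not on the command line (or has a value this
    sketch does not recognise), so the Kconfig default applies.
    """
    flags: Dict[str, Optional[bool]] = {"init_on_alloc": None, "init_on_free": None}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in flags:
            if value in _TRUE:
                flags[key] = True
            elif value in _FALSE:
                flags[key] = False
    return flags
```

On a running system the input would be the contents of /proc/cmdline, for
example init_flags(open("/proc/cmdline").read()). Comparing suspend/resume
behaviour between a boot with both options set to 0 and a boot with them set
to 1 would show whether the options alone toggle the problem.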

I enabled KASAN support by recompiling the kernel (for example, version
5.14.10), but I was not able to collect a KASAN report/dump triggered by
the issue.

A hypothesis could be that the issue is caused by extra latency in buffer
eviction during suspend to RAM (due to memory initialization). That would
keep the nouveau kernel module from responding to the suspend request in
time, leading to a "forced" suspend that leaves the nouveau module in an
inconsistent state when the system is resumed. I'm not a kernel expert,
therefore I don't know how to test this hypothesis.
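
One possible way to look for such latency is to compare per-device suspend
and resume callback times between a boot with the options on and a boot with
them off. The sketch below assumes the kernel log contains timing lines of
the form "... returned <code> after <N> usecs", for instance after enabling
/sys/power/pm_print_times before suspending. The exact line format is an
assumption and may differ between kernel versions, and the function name is
made up for illustration.

```python
import re
from typing import List, Tuple

# Assumed line shape (timestamp prefix optional), e.g.:
#   nouveau 0000:01:00.0: pci_pm_suspend+0x0/0x150 returned 0 after 812345 usecs
_TIMING = re.compile(
    r"(?P<driver>\S+) (?P<dev>\S+): .*returned (?P<ret>-?\d+) after (?P<usecs>\d+) usecs"
)


def slowest_callbacks(log_text: str, driver: str = "nouveau") -> List[Tuple[int, str]]:
    """Return (usecs, line) pairs for `driver`, slowest callback first."""
    hits = []
    for line in log_text.splitlines():
        m = _TIMING.search(line)
        if m and m.group("driver") == driver:
            hits.append((int(m.group("usecs")), line.strip()))
    return sorted(hits, key=lambda hit: hit[0], reverse=True)
```

Markedly longer nouveau callback times with init_on_alloc=1 and
init_on_free=1 would be consistent with the latency hypothesis, although they
would not prove it.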

Many other patches have been written over time for the init_on_alloc and
init_on_free kernel features, but the issue with the nouveau driver is
still there.

For these reasons, I don't know how to investigate it further.

Thanks for any suggestions.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.14.12&id=6471384af2a6530696fc0203bafe4de41a23c9ef
[2] https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/uploads/4294703a356e4c00814f581a42988999/suspend-errors-5.14.9.log.zip

> On Wed, Oct 13, 2021 at 1:31 PM Computer Enthusiastic
> <computer.enthusiastic@gmail.com> wrote:
> >
> > Hello,
> >
> > Greeting to all list members.
> >
> > I hope this mailing list is the right place to ask about it, otherwise
> > I apologize in advance and, please, direct me to the right mailing
> > list.
> >
> > I would like to receive advice on how to debug (or help to debug) the
> > nouveau kernel module suspend issue reported in [1][2][3]. The issue is
> > consistently reproducible and consists of screen corruption with a
> > system lockup on resume after suspend to RAM or suspend to disk. It
> > seems to affect NV50 NVIDIA graphics cards at least from kernel 5.10.0
> > (probably from 5.2 or 5.3) up to 5.14.10 (I have not tested 5.15, which
> > is still in rc state).
> >
> > Thanks in advance.
> >
> > [1] https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/-/issues/547
> > [2] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=989705
> > [3] https://bugzilla.kernel.org/show_bug.cgi?id=213617

