From: "Paul E. McKenney" <paulmck@kernel.org>
To: "Christian König" <christian.koenig@amd.com>
Cc: "Uladzislau Rezki" <urezki@gmail.com>,
	"Alex Xu (Hello71)" <alex_y_xu@yahoo.ca>,
	wireguard@lists.zx2c4.com, "Jason A. Donenfeld" <Jason@zx2c4.com>,
	"Joel Fernandes" <joel@joelfernandes.org>,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	Xinhui.Pan@amd.com, linux-kernel@vger.kernel.org,
	amd-gfx@lists.freedesktop.org,
	"Suren Baghdasaryan" <surenb@google.com>,
	rcu@vger.kernel.org, "Hridya Valsaraju" <hridya@google.com>,
	"Arve Hjønnevåg" <arve@android.com>,
	"Theodore Ts'o" <tytso@mit.edu>,
	alexander.deucher@amd.com, "Todd Kjos" <tkjos@android.com>,
	uladzislau.rezki@sony.com, "Martijn Coenen" <maco@android.com>,
	"Christian Brauner" <christian@brauner.io>
Subject: Re: CONFIG_ANDROID (was: rcu_sched detected expedited stalls in amdgpu after suspend)
Date: Thu, 7 Jul 2022 06:29:21 -0700
Message-ID: <20220707132921.GK1790663@paulmck-ThinkPad-P17-Gen-1>
In-Reply-To: <fbf83d60-67d3-698d-e2d2-02dc8d7e49c4@amd.com>

On Thu, Jul 07, 2022 at 09:30:39AM +0200, Christian König wrote:
> Am 06.07.22 um 22:42 schrieb Paul E. McKenney:
> > On Wed, Jul 06, 2022 at 08:09:49PM +0200, Uladzislau Rezki wrote:
> > > On Wed, Jul 06, 2022 at 10:58:36AM -0700, Paul E. McKenney wrote:
> > > > On Wed, Jul 06, 2022 at 07:48:20PM +0200, Uladzislau Rezki wrote:
> > > > > Hello.
> > > > > 
> > > > > On Mon, Jul 04, 2022 at 01:30:50PM +0200, Christian König wrote:
> > > > > > Hi guys,
> > > > > > 
> > > > > > Am 28.06.22 um 22:11 schrieb Uladzislau Rezki:
> > > > > > > > Excerpts from Paul E. McKenney's message of June 28, 2022 2:54 pm:
> > > > > > > > > All you need to do to get the previous behavior is to add something like
> > > > > > > > > this to your defconfig file:
> > > > > > > > > 
> > > > > > > > > CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=21000
> > > > > > > > > 
> > > > > > > > > Any reason why this will not work for you?
> > > > > > > Sorry for jumping in so late, I was on vacation for a week.
> > > > > > 
> > > > > > > Well, when any RCU period is longer than 20ms and amdgpu is in the backtrace, my
> > > > > > > educated guess is that we messed up some timeout while waiting for the hw.
> > > > > > 
> > > > > > We usually do wait a few us, but it can be that somebody is waiting for ms
> > > > > > instead.
> > > > > > 
> > > > > > > So there are some todos here as far as I can see, and it would be helpful to
> > > > > > > get a cleaner backtrace if possible.
> > > > > > 
> > > > > > Actually, it looks like CONFIG_ANDROID is going to be removed, so CONFIG_RCU_EXP_CPU_STALL_TIMEOUT
> > > > > > will no longer have any dependency on CONFIG_ANDROID:
> > > > > 
> > > > > > https://lkml.org/lkml/2022/6/29/756
> > > > But you can set the RCU_EXP_CPU_STALL_TIMEOUT Kconfig option, if you
> > > > wish.  Setting this option to 20 will get you the behavior previously
> > > > obtained by setting the now-defunct ANDROID Kconfig option.
> > > > 
> > > > Right. Or via a boot parameter. So for us it is not a big issue :)
> > Specifically rcupdate.rcu_exp_cpu_stall_timeout, for those just now
> > tuning in.  ;-)
> 
> I was just about to write a response asking for that :)
> 
> Thanks, I will suggest to our QA to add this parameter while doing some
> tests.

Very good!  Please let me know how it goes.

							Thanx, Paul
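
For readers following along, the two ways of setting the expedited stall
timeout named in this thread can be sketched as below. The value 20 is the
one quoted above as matching the old ANDROID behavior, and 21000 is the value
quoted earlier for the previous non-Android behavior. Both are in
milliseconds. Which value suits a given QA run is an assumption left to the
tester, and only one of the two mechanisms is needed.

```
# Build-time: defconfig / .config fragment (milliseconds)
CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=20

# Boot-time: append to the kernel command line instead (milliseconds)
rcupdate.rcu_exp_cpu_stall_timeout=20
```

The boot parameter is convenient for testing because it changes the timeout
without rebuilding the kernel.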
