From: Andrea Arcangeli <aarcange@redhat.com>
To: Chris Wilson <chris@chris-wilson.co.uk>,
	Martin Kepplinger <martink@posteo.de>,
	Thorsten Leemhuis <regressions@leemhuis.info>,
	daniel.vetter@intel.com, Dave Airlie <airlied@gmail.com>,
	intel-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	dri-devel@lists.freedesktop.org
Subject: Re: [PATCH 2/5] i915: flush gem obj freeing workqueues to add accuracy to the i915 shrinker
Date: Fri, 7 Apr 2017 18:48:58 +0200	[thread overview]
Message-ID: <20170407164858.GF5035@redhat.com> (raw)
In-Reply-To: <20170407153011.GR10496@nuc-i3427.alporthouse.com>

On Fri, Apr 07, 2017 at 04:30:11PM +0100, Chris Wilson wrote:
> Not getting hangs is a good sign, but lockdep doesn't like it:
> 
> [  460.684901] WARNING: CPU: 1 PID: 172 at kernel/workqueue.c:2418 check_flush_dependency+0x92/0x130
> [  460.684924] workqueue: PF_MEMALLOC task 172(kworker/1:1H) is flushing !WQ_MEM_RECLAIM events:__i915_gem_free_work [i915]
> 
> If I allocate the workqueue with WQ_MEM_RECLAIM, it complains bitterly
> as well.

So in PF_MEMALLOC context we can't flush a workqueue with
!WQ_MEM_RECLAIM.

	system_wq = alloc_workqueue("events", 0, 0);
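
For contrast, a workqueue that's safe to flush from reclaim context
would have to be allocated with the WQ_MEM_RECLAIM flag, along these
lines (hypothetical name, just to illustrate the flag):

	wq = alloc_workqueue("i915-gem-free", WQ_MEM_RECLAIM, 0);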

My point is that synchronize_rcu_expedited() will still push its work
onto the same system_wq workqueue...

		/* Marshall arguments & schedule the expedited grace period. */
		rew.rew_func = func;
		rew.rew_rsp = rsp;
		rew.rew_s = s;
		INIT_WORK_ONSTACK(&rew.rew_work, wait_rcu_exp_gp);
		schedule_work(&rew.rew_work);

It's also using schedule_work, so either the above is a false
positive, or we still have a problem with synchronize_rcu_expedited;
it's just no longer a reproducible issue after we stopped running it
under the struct_mutex.

Even synchronize_sched will wait on the system_wq if
synchronize_rcu_expedited has been issued in parallel by some other
code; it's just that there's no check for it, because it doesn't
invoke flush_work to wait.

The deadlock happens if we flush_work() while holding any lock that
may be taken by any of the work items that could be queued there. i915
makes sure to flush_work only after the struct_mutex has been released
(unlike my initial version), but we're not checking whether any of the
other work items on system_wq could take a lock that may have been
held during the allocation that invoked reclaim. I suppose that is the
remaining problem, but I don't see how flush_work is special here;
synchronize_rcu_expedited would still have the same issue, no?
(despite no lockdep warning)
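
To make the shape of that deadlock concrete, here is a minimal sketch
(my_mutex, free_work and other_work_fn are made-up names standing in
for struct_mutex, the i915 freeing work and any unrelated work item
sharing the system_wq):

	static DEFINE_MUTEX(my_mutex);
	static struct work_struct free_work;	/* queued on system_wq */

	/* unrelated work item sitting ahead of free_work in the pool */
	static void other_work_fn(struct work_struct *w)
	{
		mutex_lock(&my_mutex);	/* blocks: the reclaimer holds it */
		mutex_unlock(&my_mutex);
	}

	static void shrink_path(void)	/* reached with my_mutex held */
	{
		/*
		 * Under memory pressure the pool may fail to spawn a
		 * fresh worker, so free_work stays stuck behind
		 * other_work_fn(), which in turn waits on my_mutex:
		 * AB-BA deadlock.
		 */
		flush_work(&free_work);
	}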

I suspect this means synchronize_rcu_expedited() is not usable in
reclaim context and lockdep should warn if PF_MEMALLOC is set when
synchronize_rcu_expedited/synchronize_sched/synchronize_rcu are
called.
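
Something along these lines at the top of those functions would catch
it (only a sketch of the kind of check I mean, not a tested patch):

	/* hypothetical debug check, e.g. in synchronize_rcu_expedited() */
	WARN_ONCE(current->flags & PF_MEMALLOC,
		  "%s called from reclaim context\n", __func__);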

Probably to fix this we should create a private workqueue for both RCU
and i915 and stop sharing the system_wq with the rest of the system
(and of course set WQ_MEM_RECLAIM in both workqueues). This makes sure
that when we call synchronize_rcu_expedited or flush_work from the
shrinker, we won't risk waiting on other random work that may be
taking locks held by the code that invoked reclaim during an
allocation.
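
On the i915 side that would look roughly like this (a sketch; the
queue name and fields are illustrative):

	/* at driver init: a private, rescuer-backed workqueue */
	i915->free_wq = alloc_workqueue("i915-free", WQ_MEM_RECLAIM, 0);

	/* queue the object freeing there instead of on system_wq */
	queue_work(i915->free_wq, &i915->mm.free_work);

	/* now the shrinker can flush it without tripping lockdep */
	flush_work(&i915->mm.free_work);

and similarly RCU would queue wait_rcu_exp_gp on its own
WQ_MEM_RECLAIM workqueue instead of going through schedule_work().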

The macroscopic bug of waiting on the system_wq 100% of the time while
always holding the struct_mutex is gone, but we need to go further and
stop using the system_wq for the RCU and i915 shrinker work.

> We do need it for shrinker_count as well. If we just report 0 objects,

Yes, the shrinker_count too.

> the shrinker_scan callback will be skipped, iirc. Also, we do need it
> for direct calls to i915_gem_shrink() which themselves may or may not
> be underneath the struct_mutex at the time.

Yes.

> I don't follow,
> 
> static inline struct task_struct *__mutex_owner(struct mutex *lock)
> {
>         return (struct task_struct *)(atomic_long_read(&lock->owner) & ~0x07);
> }
> 
> The atomic read is equivalent to READ_ONCE(). What's broken here? (I
> guess strict aliasing and pointer cast?)

That was an oversight: atomic64_read (which atomic_long_read maps to
on 64-bit) does READ_ONCE internally, as it should, being an atomic
read. I didn't recall that __mutex_owner uses atomic_long_read.
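
For reference, on 64-bit that chain bottoms out in a plain READ_ONCE,
roughly (a sketch of the generic definition):

	static inline long atomic64_read(const atomic64_t *v)
	{
		return READ_ONCE((v)->counter);
	}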

Thanks,
Andrea
