From: Michal Hocko <mhocko@suse.com>
To: Daniel Vetter <daniel@ffwll.ch>
Cc: "Christian König" <ckoenig.leichtzumerken@gmail.com>,
	"Matthew Wilcox" <willy@infradead.org>,
	"amd-gfx list" <amd-gfx@lists.freedesktop.org>,
	"Linux MM" <linux-mm@kvack.org>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	"Dave Chinner" <dchinner@redhat.com>, "Leo Liu" <Leo.Liu@amd.com>
Subject: Re: [PATCH] drm/ttm: stop warning on TT shrinker failure
Date: Tue, 23 Mar 2021 14:48:00 +0100	[thread overview]
Message-ID: <YFnxkLhBGYRj7Hck@dhcp22.suse.cz> (raw)
In-Reply-To: <YFnp2e2jGrtM7iGx@phenom.ffwll.local>

On Tue 23-03-21 14:15:05, Daniel Vetter wrote:
> On Tue, Mar 23, 2021 at 01:04:03PM +0100, Michal Hocko wrote:
> > On Tue 23-03-21 12:48:58, Christian König wrote:
> > > Am 23.03.21 um 12:28 schrieb Daniel Vetter:
> > > > On Tue, Mar 23, 2021 at 08:38:33AM +0100, Michal Hocko wrote:
> > > > > I think this is where I don't yet get what Christian is trying to do: we
> > > > > really shouldn't use different tricks and calling contexts for direct
> > > > > reclaim and kswapd reclaim. Otherwise very-hard-to-track-down bugs are
> > > > > pretty much guaranteed. So whether we use explicit gfp flags or the
> > > > > context APIs, the result is exactly the same.
> > > 
> > > OK, let us recap what TTM's TT shrinker does here:
> > > 
> > > 1. We have memory which is not swappable because it might be accessed by the
> > > GPU at any time.
> > > 2. Make sure the memory is not accessed by the GPU; drivers need to grab a
> > > lock before they can make it accessible again.
> > > 3. Allocate a shmem file and copy over the non-swappable pages.
> > 
> > This is quite tricky because the shrinker operates in the PF_MEMALLOC
> > context, so such an allocation would be allowed to completely deplete
> > memory unless you explicitly mark the allocation with __GFP_NOMEMALLOC. Also
> > note that if the allocation cannot succeed it will not trigger reclaim
> > again, because you are already being called from the reclaim context.
> 
> [Limiting to that discussion]
> 
> Yes, it's not emulating real (direct) reclaim correctly, but in my
> experience the biggest issue with direct reclaim is when you do mutex_lock
> instead of mutex_trylock, or in general block on stuff that you can't. And
> lockdep + fs_reclaim annotations get us that, so they are pretty good for
> making sure our shrinker is correct.
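
To make the __GFP_NOMEMALLOC point above concrete: any allocation done
from a shrinker's scan callback runs with PF_MEMALLOC set, so if it
happens at all it should be marked so that it can neither dip into the
memory reserves nor retry forever. A minimal sketch (the callback
scaffolding and names are made up, only the gfp flags are the point):

#include <linux/gfp.h>
#include <linux/shrinker.h>

static unsigned long driver_shrink_scan(struct shrinker *shrinker,
					struct shrink_control *sc)
{
	struct page *page;

	/*
	 * __GFP_NOMEMALLOC keeps us away from the emergency reserves that
	 * PF_MEMALLOC would otherwise grant, __GFP_NORETRY | __GFP_NOWARN
	 * make failure cheap and quiet; direct reclaim will not recurse
	 * from this context anyway.
	 */
	page = alloc_page(GFP_KERNEL | __GFP_NOMEMALLOC |
			  __GFP_NORETRY | __GFP_NOWARN);
	if (!page)
		return SHRINK_STOP;

	/* ... use the page to move one object out, then free it ... */
	__free_page(page);

	return 1;
}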

I have to confess that I have managed to (happily) forget all the nasty
details about the fs_reclaim lockdep internals, so I am not sure the use in
the proposed patch is actually reasonable. Talk to the lockdep guys about
that and make sure to put a big fat comment explaining what is going on.
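
For reference, the kind of annotation Daniel refers to is usually a
one-time lockdep priming at init, along these lines (a rough sketch;
driver_shrinker_taint_lockdep() and the lock are made-up names,
fs_reclaim_acquire()/fs_reclaim_release() and might_lock() are the real
helpers):

#include <linux/gfp.h>
#include <linux/lockdep.h>
#include <linux/mutex.h>
#include <linux/sched/mm.h>

/*
 * Tell lockdep up front that this mutex may be taken from the reclaim
 * path, so that taking it under an allocation which can enter reclaim
 * is reported immediately rather than only when reclaim actually hits
 * the shrinker.
 */
static void driver_shrinker_taint_lockdep(struct mutex *lock)
{
	fs_reclaim_acquire(GFP_KERNEL);
	might_lock(lock);
	fs_reclaim_release(GFP_KERNEL);
}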

In general, allocating from the reclaim context is a bad idea and you
should avoid it. As already said, a simple allocation request from the
reclaim context is not constrained and it will not recurse back into
reclaim. Calling into shmem from the shrinker context might be
really tricky as well. I am not even sure this is possible for anything
other than a full (GFP_KERNEL) reclaim context.
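
To spell out what "calling into shmem" means here, the copy step
(point 3 in the recap above) boils down to something like the sketch
below; every shmem_read_mapping_page() call is a page cache allocation
made from inside the shrinker, which is exactly the problematic part.
Function and variable names are made up, only the shmem/highmem helpers
are the stock ones:

#include <linux/err.h>
#include <linux/file.h>
#include <linux/highmem.h>
#include <linux/pagemap.h>
#include <linux/shmem_fs.h>
#include <linux/swap.h>

/* Copy the unswappable pages into a freshly created shmem file. */
static struct file *driver_copy_to_shmem(struct page **pages,
					 unsigned long num_pages)
{
	struct file *filp;
	unsigned long i;

	filp = shmem_file_setup("driver swap",
				(loff_t)num_pages << PAGE_SHIFT, 0);
	if (IS_ERR(filp))
		return filp;

	for (i = 0; i < num_pages; i++) {
		/* allocates (or looks up) the shmem page backing index i */
		struct page *dst = shmem_read_mapping_page(filp->f_mapping, i);

		if (IS_ERR(dst)) {
			fput(filp);
			return ERR_CAST(dst);
		}

		copy_highpage(dst, pages[i]);
		set_page_dirty(dst);
		mark_page_accessed(dst);
		put_page(dst);
	}

	return filp;
}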
-- 
Michal Hocko
SUSE Labs

