linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm/memcg: fix device private memcg accounting
@ 2020-10-09 21:59 Ralph Campbell
  2020-10-09 22:50 ` Andrew Morton
  0 siblings, 1 reply; 5+ messages in thread
From: Ralph Campbell @ 2020-10-09 21:59 UTC (permalink / raw)
  To: linux-mm, cgroups, linux-kernel
  Cc: Johannes Weiner, Michal Hocko, Vladimir Davydov,
	Jérôme Glisse, Balbir Singh, Ira Weiny, Andrew Morton,
	Ralph Campbell, stable

The code in mc_handle_swap_pte() checks for non_swap_entry() and returns
NULL before checking is_device_private_entry() so device private pages
are never handled.
Fix this by checking for non_swap_entry() after handling device private
swap PTEs.

Cc: stable@vger.kernel.org
Fixes: c733a82874a7 ("mm/memcontrol: support MEMORY_DEVICE_PRIVATE")
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
---

I'm not sure exactly how to test this. I ran the HMM self tests but
that is a minimal sanity check. I think moving the self test from one
memory cgroup to another while it is running would exercise this patch.
I'm looking at how the test could move itself to another group after
migrating some anonymous memory to the test driver.

This applies cleanly to linux-5.9.0-rc8-mm1 and is for Andrew Morton's
tree.

 mm/memcontrol.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2636f8bad908..3a24e3b619f5 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5549,7 +5549,7 @@ static struct page *mc_handle_swap_pte(struct vm_area_struct *vma,
 	struct page *page = NULL;
 	swp_entry_t ent = pte_to_swp_entry(ptent);
 
-	if (!(mc.flags & MOVE_ANON) || non_swap_entry(ent))
+	if (!(mc.flags & MOVE_ANON))
 		return NULL;
 
 	/*
@@ -5568,6 +5568,9 @@ static struct page *mc_handle_swap_pte(struct vm_area_struct *vma,
 		return page;
 	}
 
+	if (non_swap_entry(ent))
+		return NULL;
+
 	/*
 	 * Because lookup_swap_cache() updates some statistics counter,
 	 * we call find_get_page() with swapper_space directly.
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] mm/memcg: fix device private memcg accounting
  2020-10-09 21:59 [PATCH] mm/memcg: fix device private memcg accounting Ralph Campbell
@ 2020-10-09 22:50 ` Andrew Morton
  2020-10-10  0:00   ` Ralph Campbell
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2020-10-09 22:50 UTC (permalink / raw)
  To: Ralph Campbell
  Cc: linux-mm, cgroups, linux-kernel, Johannes Weiner, Michal Hocko,
	Vladimir Davydov, Jérôme Glisse, Balbir Singh,
	Ira Weiny, stable

On Fri, 9 Oct 2020 14:59:52 -0700 Ralph Campbell <rcampbell@nvidia.com> wrote:

> The code in mc_handle_swap_pte() checks for non_swap_entry() and returns
> NULL before checking is_device_private_entry() so device private pages
> are never handled.
> Fix this by checking for non_swap_entry() after handling device private
> swap PTEs.
> 
> Cc: stable@vger.kernel.org

I was going to ask "what are the end-user visible effects of the bug". 
This is important information with a cc:stable.

> 
> I'm not sure exactly how to test this. I ran the HMM self tests but
> that is a minimal sanity check. I think moving the self test from one
> memory cgroup to another while it is running would exercise this patch.
> I'm looking at how the test could move itself to another group after
> migrating some anonymous memory to the test driver.
> 

But this makes me suspect the answer is "there aren't any that we know
of".  Are you sure a cc:stable is warranted?


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] mm/memcg: fix device private memcg accounting
  2020-10-09 22:50 ` Andrew Morton
@ 2020-10-10  0:00   ` Ralph Campbell
  2020-10-12 13:28     ` Johannes Weiner
  0 siblings, 1 reply; 5+ messages in thread
From: Ralph Campbell @ 2020-10-10  0:00 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, cgroups, linux-kernel, Johannes Weiner, Michal Hocko,
	Vladimir Davydov, Jérôme Glisse, Balbir Singh,
	Ira Weiny, stable


On 10/9/20 3:50 PM, Andrew Morton wrote:
> On Fri, 9 Oct 2020 14:59:52 -0700 Ralph Campbell <rcampbell@nvidia.com> wrote:
> 
>> The code in mc_handle_swap_pte() checks for non_swap_entry() and returns
>> NULL before checking is_device_private_entry() so device private pages
>> are never handled.
>> Fix this by checking for non_swap_entry() after handling device private
>> swap PTEs.
>>
>> Cc: stable@vger.kernel.org
> 
> I was going to ask "what are the end-user visible effects of the bug".
> This is important information with a cc:stable.
> 
>>
>> I'm not sure exactly how to test this. I ran the HMM self tests but
>> that is a minimal sanity check. I think moving the self test from one
>> memory cgroup to another while it is running would exercise this patch.
>> I'm looking at how the test could move itself to another group after
>> migrating some anonymous memory to the test driver.
>>
> 
> But this makes me suspect the answer is "there aren't any that we know
> of".  Are you sure a cc:stable is warranted?
> 

I assume the memory cgroup accounting would be off somehow when moving
a process to another memory cgroup.
Currently, the device private page is charged like a normal anonymous page
when allocated and is uncharged when the page is freed so I think that path is OK.
Maybe someone who knows more about memory cgroup accounting can comment?

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] mm/memcg: fix device private memcg accounting
  2020-10-10  0:00   ` Ralph Campbell
@ 2020-10-12 13:28     ` Johannes Weiner
  2020-10-12 17:11       ` Ralph Campbell
  0 siblings, 1 reply; 5+ messages in thread
From: Johannes Weiner @ 2020-10-12 13:28 UTC (permalink / raw)
  To: Ralph Campbell
  Cc: Andrew Morton, linux-mm, cgroups, linux-kernel, Michal Hocko,
	Vladimir Davydov, Jérôme Glisse, Balbir Singh,
	Ira Weiny, stable

On Fri, Oct 09, 2020 at 05:00:37PM -0700, Ralph Campbell wrote:
> 
> On 10/9/20 3:50 PM, Andrew Morton wrote:
> > On Fri, 9 Oct 2020 14:59:52 -0700 Ralph Campbell <rcampbell@nvidia.com> wrote:
> > 
> > > The code in mc_handle_swap_pte() checks for non_swap_entry() and returns
> > > NULL before checking is_device_private_entry() so device private pages
> > > are never handled.
> > > Fix this by checking for non_swap_entry() after handling device private
> > > swap PTEs.

The fix looks good to me.

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

> > But this makes me suspect the answer is "there aren't any that we know
> > of".  Are you sure a cc:stable is warranted?
> > 
> 
> I assume the memory cgroup accounting would be off somehow when moving
> a process to another memory cgroup.
> Currently, the device private page is charged like a normal anonymous page
> when allocated and is uncharged when the page is freed so I think that path is OK.
> Maybe someone who knows more about memory cgroup accounting can comment?

As for whether to CC stable, I'm leaning toward no:

- When moving tasks, we'd leave their device pages behind in the old
  cgroup. This isn't great, but it doesn't cause counter imbalances or
  corruption or anything - we also skip locked pages, we used to skip
  pages mapped by more than one pte, the user can select whether to
  move pages along tasks at all, and if so, whether only anon or file.

- Charge moving itself is a bit of a questionable feature, and users
  have been moving away from it. Leaving tasks in a cgroup and
  changing the configuration is a heck of a lot cheaper than moving
  potentially gigabytes of pages to another configuration domain.

- According to the Fixes tag, this isn't a regression, either. Since
  their inception, we have never migrated device pages.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] mm/memcg: fix device private memcg accounting
  2020-10-12 13:28     ` Johannes Weiner
@ 2020-10-12 17:11       ` Ralph Campbell
  0 siblings, 0 replies; 5+ messages in thread
From: Ralph Campbell @ 2020-10-12 17:11 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, linux-mm, cgroups, linux-kernel, Michal Hocko,
	Vladimir Davydov, Jérôme Glisse, Balbir Singh,
	Ira Weiny, stable


On 10/12/20 6:28 AM, Johannes Weiner wrote:
> On Fri, Oct 09, 2020 at 05:00:37PM -0700, Ralph Campbell wrote:
>>
>> On 10/9/20 3:50 PM, Andrew Morton wrote:
>>> On Fri, 9 Oct 2020 14:59:52 -0700 Ralph Campbell <rcampbell@nvidia.com> wrote:
>>>
>>>> The code in mc_handle_swap_pte() checks for non_swap_entry() and returns
>>>> NULL before checking is_device_private_entry() so device private pages
>>>> are never handled.
>>>> Fix this by checking for non_swap_entry() after handling device private
>>>> swap PTEs.
> 
> The fix looks good to me.
> 
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> 
>>> But this makes me suspect the answer is "there aren't any that we know
>>> of".  Are you sure a cc:stable is warranted?
>>>
>>
>> I assume the memory cgroup accounting would be off somehow when moving
>> a process to another memory cgroup.
>> Currently, the device private page is charged like a normal anonymous page
>> when allocated and is uncharged when the page is freed so I think that path is OK.
>> Maybe someone who knows more about memory cgroup accounting can comment?
> 
> As for whether to CC stable, I'm leaning toward no:
> 
> - When moving tasks, we'd leave their device pages behind in the old
>    cgroup. This isn't great, but it doesn't cause counter imbalances or
>    corruption or anything - we also skip locked pages, we used to skip
>    pages mapped by more than one pte, the user can select whether to
>    move pages along tasks at all, and if so, whether only anon or file.
> 
> - Charge moving itself is a bit of a questionable feature, and users
>    have been moving away from it. Leaving tasks in a cgroup and
>    changing the configuration is a heck of a lot cheaper than moving
>    potentially gigabytes of pages to another configuration domain.
> 
> - According to the Fixes tag, this isn't a regression, either. Since
>    their inception, we have never migrated device pages.

Thanks for the Acked-by and the comments.
I assume Andrew will update the tags when queuing this up unless he wants me to resend.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-10-12 17:11 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-09 21:59 [PATCH] mm/memcg: fix device private memcg accounting Ralph Campbell
2020-10-09 22:50 ` Andrew Morton
2020-10-10  0:00   ` Ralph Campbell
2020-10-12 13:28     ` Johannes Weiner
2020-10-12 17:11       ` Ralph Campbell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).