* [PATCH] mm/swap_state: update zswap LRU's protection range with the folio locked
@ 2024-02-05 23:24 Nhat Pham
  2024-02-06  2:07 ` Chengming Zhou
  2024-02-06 15:15 ` Johannes Weiner
  0 siblings, 2 replies; 4+ messages in thread
From: Nhat Pham @ 2024-02-05 23:24 UTC
  To: akpm
  Cc: hannes, chengming.zhou, yosryahmed, linux-mm, kernel-team, linux-kernel

Move the zswap LRU protection range update above the swap_read_folio()
call, and only when a new page is allocated. This is the case where
(z)swapin could happen, which is a signal that the zswap shrinker should
be more conservative with its reclaiming action.

It also prevents a race, in which folio migration can clear the
memcg_data of the now unlocked folio, resulting in a warning in the
inlined folio_lruvec() call.

Reported-by: syzbot+17a611d10af7d18a7092@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/000000000000ae47f90610803260@google.com/
Fixes: b5ba474f3f51 ("zswap: shrink zswap pool based on memory pressure")
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
---
 mm/swap_state.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/mm/swap_state.c b/mm/swap_state.c
index e671266ad772..7255c01a1e4e 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -680,9 +680,10 @@ struct folio *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
 	/* The page was likely read above, so no need for plugging here */
 	folio = __read_swap_cache_async(entry, gfp_mask, mpol, ilx,
 					&page_allocated, false);
-	if (unlikely(page_allocated))
+	if (unlikely(page_allocated)) {
+		zswap_folio_swapin(folio);
 		swap_read_folio(folio, false, NULL);
-	zswap_folio_swapin(folio);
+	}
 	return folio;
 }
 
@@ -855,9 +856,10 @@ static struct folio *swap_vma_readahead(swp_entry_t targ_entry, gfp_t gfp_mask,
 	/* The folio was likely read above, so no need for plugging here */
 	folio = __read_swap_cache_async(targ_entry, gfp_mask, mpol, targ_ilx,
 					&page_allocated, false);
-	if (unlikely(page_allocated))
+	if (unlikely(page_allocated)) {
+		zswap_folio_swapin(folio);
 		swap_read_folio(folio, false, NULL);
-	zswap_folio_swapin(folio);
+	}
 	return folio;
 }
 

base-commit: 91f3daa1765ee4e0c89987dc25f72c40f07af34d
-- 
2.39.3



* Re: [PATCH] mm/swap_state: update zswap LRU's protection range with the folio locked
  2024-02-05 23:24 [PATCH] mm/swap_state: update zswap LRU's protection range with the folio locked Nhat Pham
@ 2024-02-06  2:07 ` Chengming Zhou
  2024-02-06 15:15 ` Johannes Weiner
  1 sibling, 0 replies; 4+ messages in thread
From: Chengming Zhou @ 2024-02-06  2:07 UTC
  To: Nhat Pham, akpm; +Cc: hannes, yosryahmed, linux-mm, kernel-team, linux-kernel

On 2024/2/6 07:24, Nhat Pham wrote:
> Move the zswap LRU protection range update above the swap_read_folio()
> call, and only when a new page is allocated. This is the case where
> (z)swapin could happen, which is a signal that the zswap shrinker should
> be more conservative with its reclaiming action.
> 
> It also prevents a race, in which folio migration can clear the
> memcg_data of the now unlocked folio, resulting in a warning in the
> inlined folio_lruvec() call.
> 
> Reported-by: syzbot+17a611d10af7d18a7092@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/000000000000ae47f90610803260@google.com/
> Fixes: b5ba474f3f51 ("zswap: shrink zswap pool based on memory pressure")
> Signed-off-by: Nhat Pham <nphamcs@gmail.com>

LGTM, thanks!

Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com>

> ---
>  mm/swap_state.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/swap_state.c b/mm/swap_state.c
> index e671266ad772..7255c01a1e4e 100644
> --- a/mm/swap_state.c
> +++ b/mm/swap_state.c
> @@ -680,9 +680,10 @@ struct folio *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
>  	/* The page was likely read above, so no need for plugging here */
>  	folio = __read_swap_cache_async(entry, gfp_mask, mpol, ilx,
>  					&page_allocated, false);
> -	if (unlikely(page_allocated))
> +	if (unlikely(page_allocated)) {
> +		zswap_folio_swapin(folio);
>  		swap_read_folio(folio, false, NULL);
> -	zswap_folio_swapin(folio);
> +	}
>  	return folio;
>  }
>  
> @@ -855,9 +856,10 @@ static struct folio *swap_vma_readahead(swp_entry_t targ_entry, gfp_t gfp_mask,
>  	/* The folio was likely read above, so no need for plugging here */
>  	folio = __read_swap_cache_async(targ_entry, gfp_mask, mpol, targ_ilx,
>  					&page_allocated, false);
> -	if (unlikely(page_allocated))
> +	if (unlikely(page_allocated)) {
> +		zswap_folio_swapin(folio);
>  		swap_read_folio(folio, false, NULL);
> -	zswap_folio_swapin(folio);
> +	}
>  	return folio;
>  }
>  
> 
> base-commit: 91f3daa1765ee4e0c89987dc25f72c40f07af34d


* Re: [PATCH] mm/swap_state: update zswap LRU's protection range with the folio locked
  2024-02-05 23:24 [PATCH] mm/swap_state: update zswap LRU's protection range with the folio locked Nhat Pham
  2024-02-06  2:07 ` Chengming Zhou
@ 2024-02-06 15:15 ` Johannes Weiner
  2024-02-06 17:31   ` Nhat Pham
  1 sibling, 1 reply; 4+ messages in thread
From: Johannes Weiner @ 2024-02-06 15:15 UTC
  To: Nhat Pham
  Cc: akpm, chengming.zhou, yosryahmed, linux-mm, kernel-team, linux-kernel

On Mon, Feb 05, 2024 at 03:24:42PM -0800, Nhat Pham wrote:
> Move the zswap LRU protection range update above the swap_read_folio()
> call, and only when a new page is allocated. This is the case where
> (z)swapin could happen, which is a signal that the zswap shrinker should
> be more conservative with its reclaiming action.
> 
> It also prevents a race, in which folio migration can clear the
> memcg_data of the now unlocked folio, resulting in a warning in the
> inlined folio_lruvec() call.

The warning is the most probable outcome, and it will cause the update
to go against the root cgroup, which is at least safe.

But AFAICS there is no ordering guarantee to rule out a UAF if the
lookup succeeds but the memcg and lruvec get freed before the update.
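
To make the window concrete, the pre-patch ordering is roughly this
(paraphrased, arguments elided):

	folio = __read_swap_cache_async(...);		/* new folio comes back locked */
	if (unlikely(page_allocated))
		swap_read_folio(folio, false, NULL);	/* folio unlocked when the read completes */
	zswap_folio_swapin(folio);			/* folio_lruvec() on a possibly
							   unlocked, migratable folio */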

I think that part should be more prominent in the changelog. It's more
important than the first paragraph. Consider somebody scrolling
through the git log and trying to decide whether to backport or not;
it's helpful to describe the bug and its impact first thing, then put
the explanation of the fix after.

> Reported-by: syzbot+17a611d10af7d18a7092@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/000000000000ae47f90610803260@google.com/
> Fixes: b5ba474f3f51 ("zswap: shrink zswap pool based on memory pressure")
> Signed-off-by: Nhat Pham <nphamcs@gmail.com>

Would it make sense to add

	VM_WARN_ON_ONCE(!folio_test_locked(folio));

to zswap_folio_swapin() as well?
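
Something like this, sketched from memory of the function as added in
b5ba474f3f51 (so treat the exact body as approximate):

	void zswap_folio_swapin(struct folio *folio)
	{
		struct lruvec *lruvec;

		if (folio) {
			VM_WARN_ON_ONCE(!folio_test_locked(folio));
			lruvec = folio_lruvec(folio);
			atomic_long_inc(&lruvec->zswap_lruvec_state.nr_zswap_protected);
		}
	}

That way we'd get a one-time splat under CONFIG_DEBUG_VM if a future
caller ever invokes it on an unlocked folio again.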


* Re: [PATCH] mm/swap_state: update zswap LRU's protection range with the folio locked
  2024-02-06 15:15 ` Johannes Weiner
@ 2024-02-06 17:31   ` Nhat Pham
  0 siblings, 0 replies; 4+ messages in thread
From: Nhat Pham @ 2024-02-06 17:31 UTC
  To: Johannes Weiner
  Cc: akpm, chengming.zhou, yosryahmed, linux-mm, kernel-team, linux-kernel

On Tue, Feb 6, 2024 at 7:15 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> On Mon, Feb 05, 2024 at 03:24:42PM -0800, Nhat Pham wrote:
> > Move the zswap LRU protection range update above the swap_read_folio()
> > call, and only when a new page is allocated. This is the case where
> > (z)swapin could happen, which is a signal that the zswap shrinker should
> > be more conservative with its reclaiming action.
> >
> > It also prevents a race, in which folio migration can clear the
> > memcg_data of the now unlocked folio, resulting in a warning in the
> > inlined folio_lruvec() call.
>
> The warning is the most probable outcome, and it will cause the update
> to go against the root cgroup, which is at least safe.
>
> But AFAICS there is no ordering guarantee to rule out a UAF if the
> lookup succeeds but the memcg and lruvec get freed before the update.

Ah nice. I didn't consider that. IIUC, having the folio locked should
prevent this too. Based on the documentation:

 * For a non-kmem folio any of the following ensures folio and memcg binding
 * stability:
 *
 * - the folio lock

I'll rework the commit log to include this, and make this more prominent :)
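
With the patch, the update sits inside the page_allocated branch, so
folio_lruvec() runs while we still hold the lock taken on the newly
allocated folio in __read_swap_cache_async() (paraphrased):

	if (unlikely(page_allocated)) {
		zswap_folio_swapin(folio);		/* folio still locked here */
		swap_read_folio(folio, false, NULL);	/* unlock at read completion */
	}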

>
> I think that part should be more prominent in the changelog. It's more
> important than the first paragraph. Consider somebody scrolling
> through the git log and trying to decide whether to backport or not;
> it's helpful to describe the bug and its impact first thing, then put
> the explanation of the fix after.
>
> > Reported-by: syzbot+17a611d10af7d18a7092@syzkaller.appspotmail.com
> > Closes: https://lore.kernel.org/all/000000000000ae47f90610803260@google.com/
> > Fixes: b5ba474f3f51 ("zswap: shrink zswap pool based on memory pressure")
> > Signed-off-by: Nhat Pham <nphamcs@gmail.com>
>
> Would it make sense to add
>
>         VM_WARN_ON_ONCE(!folio_test_locked(folio));
>
> to zswap_folio_swapin() as well?

