linux-mm.kvack.org archive mirror
From: Yosry Ahmed <yosryahmed@google.com>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	 Nhat Pham <nphamcs@gmail.com>, Chris Li <chrisl@kernel.org>,
	 Chengming Zhou <zhouchengming@bytedance.com>,
	linux-mm@kvack.org,  linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] mm: swap: update inuse_pages after all cleanups are done
Date: Tue, 23 Jan 2024 01:40:31 -0800
Message-ID: <CAJD7tkb=-0mP1CXEmAd4QjMXKgep7myHShiwUSNnY1cjfRqfJA@mail.gmail.com>
In-Reply-To: <87wms0toh4.fsf@yhuang6-desk2.ccr.corp.intel.com>

On Tue, Jan 23, 2024 at 1:01 AM Huang, Ying <ying.huang@intel.com> wrote:
>
> Yosry Ahmed <yosryahmed@google.com> writes:
>
> > In swap_range_free(), we update inuse_pages then do some cleanups (arch
> > invalidation, zswap invalidation, swap cache cleanups, etc). During
> > swapoff, try_to_unuse() uses inuse_pages to make sure all swap entries
> > are freed. Make sure we only update inuse_pages after we are done with
> > the cleanups.
> >
> > In practice, this shouldn't matter, because swap_range_free() is called
> > with the swap info lock held, and the swapoff code will spin for that
> > lock after try_to_unuse() anyway.
> >
> > The goal is to make it obvious and more future proof that once
> > try_to_unuse() returns, all cleanups are done.
>
> Defines "all cleanups".  Apparently, some other operations are still
> to be done after try_to_unuse() in swap_off().

I am referring to the cleanups in swap_range_free() that I mentioned above.

How about s/all cleanups/all cleanups in swap_range_free()/?

>
> > This also facilitates a
> > following zswap cleanup patch which uses this fact to simplify
> > zswap_swapoff().
> >
> > Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
> > ---
> >  mm/swapfile.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > index 556ff7347d5f0..2fedb148b9404 100644
> > --- a/mm/swapfile.c
> > +++ b/mm/swapfile.c
> > @@ -737,8 +737,6 @@ static void swap_range_free(struct swap_info_struct *si, unsigned long offset,
> >               if (was_full && (si->flags & SWP_WRITEOK))
> >                       add_to_avail_list(si);
> >       }
> > -     atomic_long_add(nr_entries, &nr_swap_pages);
> > -     WRITE_ONCE(si->inuse_pages, si->inuse_pages - nr_entries);
> >       if (si->flags & SWP_BLKDEV)
> >               swap_slot_free_notify =
> >                       si->bdev->bd_disk->fops->swap_slot_free_notify;
> > @@ -752,6 +750,8 @@ static void swap_range_free(struct swap_info_struct *si, unsigned long offset,
> >               offset++;
> >       }
> >       clear_shadow_from_swap_cache(si->type, begin, end);
> > +     atomic_long_add(nr_entries, &nr_swap_pages);
> > +     WRITE_ONCE(si->inuse_pages, si->inuse_pages - nr_entries);
>
> This isn't enough.  You need to use smp_wmb() here and smp_rmb() in
> somewhere reading si->inuse_pages.

Hmm, good point. Although as I mentioned in the commit message, this
shouldn't matter today as swap_range_free() executes with the lock
held, and we spin on the lock after try_to_unuse() returns. It may
still be more future-proof to add the memory barriers.

In swap_range_free(), we want to make sure that the write to
si->inuse_pages happens *after* the cleanups (specifically
zswap_invalidate() in this case).
In swapoff(), we want to make sure that the cleanups following
try_to_unuse() (e.g. zswap_swapoff()) happen *after* reading
si->inuse_pages == 0 in try_to_unuse().

So I think we want smp_wmb() in swap_range_free() and smp_mb() in
try_to_unuse(). Does the below look correct to you?

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 2fedb148b9404..a2fa2f65a8ddd 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -750,6 +750,12 @@ static void swap_range_free(struct swap_info_struct *si, unsigned long offset,
                offset++;
        }
        clear_shadow_from_swap_cache(si->type, begin, end);
+
+       /*
+        * Make sure that try_to_unuse() observes si->inuse_pages reaching 0
+        * only after the above cleanups are done.
+        */
+       smp_wmb();
        atomic_long_add(nr_entries, &nr_swap_pages);
        WRITE_ONCE(si->inuse_pages, si->inuse_pages - nr_entries);
 }
@@ -2130,6 +2136,11 @@ static int try_to_unuse(unsigned int type)
                return -EINTR;
        }

+       /*
+        * Make sure that further cleanups after try_to_unuse() returns happen
+        * after swap_range_free() reduces si->inuse_pages to 0.
+        */
+       smp_mb();
        return 0;
 }
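
For readers less familiar with barrier pairing, here is a minimal userspace
sketch of the ordering the diff above is after. It is not kernel code: it uses
C11 fences and pthreads as stand-ins, and the names cleanup_done, freer(),
wait_for_unuse() and run_demo() are hypothetical. The release fence is at least
as strong as smp_wmb(), and the seq_cst fence roughly plays the role of
smp_mb(); treat it as an analogy under those assumptions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for the kernel state discussed above. */
static int cleanup_done;             /* plain data written by the "cleanups" */
static atomic_long inuse_pages = 1;  /* models si->inuse_pages */

/* Models swap_range_free(): do the cleanups, then publish the counter. */
static void *freer(void *arg)
{
	(void)arg;
	cleanup_done = 1;                           /* the cleanups */
	atomic_thread_fence(memory_order_release);  /* stand-in for smp_wmb() */
	atomic_fetch_sub_explicit(&inuse_pages, 1, memory_order_relaxed);
	return NULL;
}

/* Models the tail of try_to_unuse(): wait for zero, then order later work. */
static int wait_for_unuse(void)
{
	while (atomic_load_explicit(&inuse_pages, memory_order_relaxed) != 0)
		;                                       /* spin until the counter drops */
	atomic_thread_fence(memory_order_seq_cst);  /* stand-in for smp_mb() */
	assert(cleanup_done == 1);                  /* cleanups are visible here */
	return cleanup_done;
}

/* Runs one freer thread against one waiter and returns what the waiter saw. */
int run_demo(void)
{
	pthread_t t;
	int seen;

	cleanup_done = 0;
	atomic_store(&inuse_pages, 1);
	if (pthread_create(&t, NULL, freer, NULL) != 0)
		return -1;
	seen = wait_for_unuse();
	pthread_join(t, NULL);
	return seen;
}
```

Without the two fences, the waiter could in principle see the counter hit zero
while the write to cleanup_done is not yet visible to it.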

Alternatively, we may just hold the spinlock in try_to_unuse() when we
check si->inuse_pages at the end. This will also ensure that any calls
to swap_range_free() have completed. Let me know what you prefer.
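
A minimal userspace sketch of that lock-based alternative, assuming a pthread
mutex as a stand-in for the swap info lock. struct swap_info_model and the
model_*() helpers are hypothetical names for illustration, not kernel APIs.
Because the counter is both updated and checked under the same lock, observing
zero implies the critical section that made it zero has fully completed.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical model of the fields of interest in swap_info_struct. */
struct swap_info_model {
	pthread_mutex_t lock;   /* stand-in for the swap info lock */
	long inuse_pages;
	long cleanups_done;     /* counts cleanups performed so far */
};

/* Models swap_range_free(): cleanups and counter update under the lock. */
static void model_range_free(struct swap_info_model *si, long nr)
{
	pthread_mutex_lock(&si->lock);
	si->cleanups_done += nr;
	si->inuse_pages -= nr;
	pthread_mutex_unlock(&si->lock);
}

/* Models the final check in try_to_unuse(), taken under the same lock. */
static long model_read_inuse(struct swap_info_model *si)
{
	long n;

	pthread_mutex_lock(&si->lock);
	n = si->inuse_pages;
	pthread_mutex_unlock(&si->lock);
	return n;
}

/* Frees four entries, one per call. */
static void *model_freer(void *arg)
{
	struct swap_info_model *si = arg;

	for (int i = 0; i < 4; i++)
		model_range_free(si, 1);
	return NULL;
}

/* Waits for the counter to reach zero and returns the cleanups observed. */
long lock_demo(void)
{
	struct swap_info_model si = { .inuse_pages = 4, .cleanups_done = 0 };
	pthread_t t;
	long done;

	pthread_mutex_init(&si.lock, NULL);
	if (pthread_create(&t, NULL, model_freer, &si) != 0)
		return -1;
	while (model_read_inuse(&si) != 0)
		;                       /* re-check under the lock until zero */
	done = si.cleanups_done;    /* ordered by the lock hand-off above */
	pthread_join(t, NULL);
	pthread_mutex_destroy(&si.lock);
	assert(done == 4);
	return done;
}
```

The trade-off is that the check takes the lock on every iteration, whereas the
barrier version keeps the read lockless.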

