All of lore.kernel.org
 help / color / mirror / Atom feed
From: Brian Geffon <bgeffon@google.com>
To: Matthias Kaehlcke <mka@chromium.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>,
	Pavel Machek <pavel@ucw.cz>, Len Brown <len.brown@intel.com>,
	linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] PM: hibernate: don't store zero pages in the image file.
Date: Mon, 27 Mar 2023 10:20:14 -0400	[thread overview]
Message-ID: <CADyq12xaOaOu_5OXRm=YRhYVPmw72d4RQie3b3hHp=-YXoDhAg@mail.gmail.com> (raw)
In-Reply-To: <ZBx8KEuVtIbqkeq1@google.com>

On Thu, Mar 23, 2023 at 12:19 PM Matthias Kaehlcke <mka@chromium.org> wrote:
>

Hi Matthias,
Thanks for taking a look.

> On Thu, Mar 02, 2023 at 12:13:48PM -0500, Brian Geffon wrote:
> > On ChromeOS we've observed a considerable number of in-use pages filled with
> > zeros. Today with hibernate it's entirely possible that saveable pages are just
> > zero filled. Since we're already copying pages word-by-word in do_copy_page it
> > becomes almost free to determine if a page was completely filled with zeros.
> >
> > This change introduces a new bitmap which will track these zero pages. If a page
> > is zero it will not be included in the saved image, instead to track these zero
> > pages in the image file we will introduce a new flag which we will set on the
> > packed PFN list. When reading back in the image file we will detect these zero
> > page PFNs and rebuild the zero page bitmap.
> >
> > When the image is being loaded through calls to write_next_page if we encounter
> > a zero page we will silently memset it to 0 and then continue on to the next
> > page. Given the implementation in snapshot_read_next/snapshot_write_next this
> > change  will be transparent to non-compressed/compressed and swsusp modes of
> > operation.
> >
> > To provide some concrete numbers from simple ad-hoc testing, on a device which
> > was lightly in use we saw that:
> >
> >   PM: hibernation: Image created (964408 pages copied, 548304 zero pages)
> >
> > Of the approximately 6.2GB of saveable pages 2.2GB (36%) were just zero filled
> > and could be tracked entirely within the packed PFN list. The savings would
> > obviously be much lower for lzo compressed images, but even in the case of
> > compression not copying pages across to the compression threads will still
> > speed things up. It's also possible that we would see better overall compression
> > ratios as larger regions of "real data" would improve the compressibility.
> >
> > Finally, such an approach could dramatically improve swsusp performance
> > as each one of those zero pages requires a write syscall to reload, by
> > handling it as part of the packed PFN list we're able to fully avoid
> > that.
> >
> >  Patch v2 -> v3:
> >    - Use nr_zero_pages rather than walking each pfn to count.
> >    - Make sure zero_bm is allocated in safe pages on resume.
> >      When reading in the pfn list and building the zero page bm
> >      we don't know which pages are unsafe yet so we will need to
> >      copy this bm to safe pages after the metadata has been read.
> >
> >  Patch v1 -> v2:
> >    - minor code mistake from rebasing corrected.
> >
> > Signed-off-by: Brian Geffon <bgeffon@google.com>
> > ---
> >  kernel/power/snapshot.c | 169 +++++++++++++++++++++++++++++++---------
> >  1 file changed, 133 insertions(+), 36 deletions(-)
> >
> > diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> > index cd8b7b35f1e8b..a2c4fe17f9067 100644
> > --- a/kernel/power/snapshot.c
> > +++ b/kernel/power/snapshot.c
>
> ...
>
> > @@ -1371,14 +1381,18 @@ static unsigned int count_data_pages(void)
> >
> >  /*
> >   * This is needed, because copy_page and memcpy are not usable for copying
> > - * task structs.
> > + * task structs. Returns 1 if a page was filled with only zeros, otherwise 0.
>
> nit: s/a page/the page/

Ack. will fix in follow up version.

>
> >   */
> > -static inline void do_copy_page(long *dst, long *src)
> > +static inline int do_copy_page(long *dst, long *src)
> >  {
> >       int n;
> > +     long z = 0;
> >
> > -     for (n = PAGE_SIZE / sizeof(long); n; n--)
> > +     for (n = PAGE_SIZE / sizeof(long); n; n--) {
> > +             z |= *src;
> >               *dst++ = *src++;
> > +     }
> > +     return !z;
> >  }
>
> ...
>
> > -static inline void copy_data_page(unsigned long dst_pfn, unsigned long src_pfn)
> > +static inline int copy_data_page(unsigned long dst_pfn, unsigned long src_pfn)
> >  {
> > -     safe_copy_page(page_address(pfn_to_page(dst_pfn)),
> > +     return safe_copy_page(page_address(pfn_to_page(dst_pfn)),
> >                               pfn_to_page(src_pfn));
> >  }
> >  #endif /* CONFIG_HIGHMEM */
> >
> >  static void copy_data_pages(struct memory_bitmap *copy_bm,
> > -                         struct memory_bitmap *orig_bm)
> > +                         struct memory_bitmap *orig_bm,
> > +                         struct memory_bitmap *zero_bm,
> > +                         unsigned int *zero_count)
> >  {
> >       struct zone *zone;
> > -     unsigned long pfn;
> > +     unsigned long pfn, copy_pfn;
> >
> >       for_each_populated_zone(zone) {
> >               unsigned long max_zone_pfn;
> > @@ -1462,11 +1482,20 @@ static void copy_data_pages(struct memory_bitmap *copy_bm,
> >       }
> >       memory_bm_position_reset(orig_bm);
> >       memory_bm_position_reset(copy_bm);
> > +     copy_pfn = memory_bm_next_pfn(copy_bm);
> >       for(;;) {
> >               pfn = memory_bm_next_pfn(orig_bm);
> >               if (unlikely(pfn == BM_END_OF_MAP))
> >                       break;
> > -             copy_data_page(memory_bm_next_pfn(copy_bm), pfn);
> > +             if (copy_data_page(copy_pfn, pfn)) {
> > +                     memory_bm_set_bit(zero_bm, pfn);
> > +                     if (zero_count)
>
> This check is not needed. The function is called only once, with a pointer. The kernel
> trusts itself if the pointer is supposed to be always != NULL.
>
> Or better: use a local counter and have copy_data_pages() return the number of pages
> that were actually copied, which is what the caller is interested in.

That makes a lot of sense, I'll switch to this approach in next iteration.

>
> > +                             (*zero_count)++;
> > +
> > +                     /* We will reuse this copy_pfn for a real 'nonzero' page. */
> > +                     continue;
> > +             }
> > +             copy_pfn = memory_bm_next_pfn(copy_bm);
> >       }
> >  }
>
> ...
>
> > @@ -2247,24 +2299,34 @@ static int load_header(struct swsusp_info *info)
> >   * unpack_orig_pfns - Set bits corresponding to given PFNs in a memory bitmap.
> >   * @bm: Memory bitmap.
> >   * @buf: Area of memory containing the PFNs.
> > + * @zero_bm: Memory bitmap which will be populated with the PFNs of zero pages.
> >   *
> >   * For each element of the array pointed to by @buf (1 page at a time), set the
> > - * corresponding bit in @bm.
> > + * corresponding bit in @bm. If the the page was originally populated with only
> > + * zeros then a corresponding bit will also be set in @zero_bm.
>
> s/the the/the/

Ack.

>
> ...
>
> > @@ -2486,6 +2548,7 @@ static inline void free_highmem_data(void) {}
> >   * prepare_image - Make room for loading hibernation image.
> >   * @new_bm: Uninitialized memory bitmap structure.
> >   * @bm: Memory bitmap with unsafe pages marked.
> > + * @zero_bm: Memory bitmap containing the zero pages.
>
> That sounds as if the memory bitmap actually contained zero pages. I suggest to
> change it to something like the comment for 'bm' above, i.e. "... with zero
> pages marked"

Sure that makes sense, will reword it.

>
> >   *
> >   * Use @bm to mark the pages that will be overwritten in the process of
> >   * restoring the system memory state from the suspend image ("unsafe" pages)
> > @@ -2496,8 +2559,12 @@ static inline void free_highmem_data(void) {}
> >   * pages will be used for just yet.  Instead, we mark them all as allocated and
> >   * create a lists of "safe" pages to be used later.  On systems with high
> >   * memory a list of "safe" highmem pages is created too.
> > + *
> > + * Because we didn't know which pages were unsafe when we created the zero bm we
> > + * will make a copy of it and recreate it within safe pages.
>
> nit: s/we will make/we make/

Ack.

>
> >   */
> > -static int prepare_image(struct memory_bitmap *new_bm, struct memory_bitmap *bm)
> > +static int prepare_image(struct memory_bitmap *new_bm, struct memory_bitmap *bm,
> > +             struct memory_bitmap *zero_bm)
> >  {
> >       unsigned int nr_pages, nr_highmem;
> >       struct linked_page *lp;
> > @@ -2516,6 +2583,20 @@ static int prepare_image(struct memory_bitmap *new_bm, struct memory_bitmap *bm)
> >
> >       duplicate_memory_bitmap(new_bm, bm);
> >       memory_bm_free(bm, PG_UNSAFE_KEEP);
> > +     error = memory_bm_create(bm, GFP_ATOMIC, PG_ANY);
> > +     if (error)
> > +             goto Free;
> > +
> > +     /* use bm as storage while we rebuild zero_bm using safe pages */
>
> Re-using the 'bm' parameter is confusing, it should be avoided IMO unless there
> is a real benefit. struct memory_bitmap isn't that big, why not create a local
> variable 'zero_mb_tmp' or similar as a temporary store for the zero page bitmap?

Sure, I'll send an updated patch which does that.

>
> > +     duplicate_memory_bitmap(bm, zero_bm);
> > +     memory_bm_free(zero_bm, PG_UNSAFE_KEEP);
> > +     error = memory_bm_create(zero_bm, GFP_ATOMIC, PG_SAFE);
> > +     if (error)
> > +             goto Free;
> > +     duplicate_memory_bitmap(zero_bm, bm);
> > +     memory_bm_free(bm, PG_UNSAFE_KEEP);
> > +     /* at this point zero_bm is in safe pages and we can use it while restoring */
> > +
> >       if (nr_highmem > 0) {
> >               error = prepare_highmem_image(bm, &nr_highmem);
> >               if (error)

  reply	other threads:[~2023-03-27 14:23 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-13 19:30 [PATCH] PM: hibernate: don't store zero pages in the image file Brian Geffon
2023-01-13 23:06 ` kernel test robot
2023-01-14  0:15 ` [PATCH v2] " Brian Geffon
2023-02-09 19:44   ` Rafael J. Wysocki
2023-02-13 14:45     ` Brian Geffon
2023-02-18 11:48       ` Pavel Machek
2023-02-19 18:56         ` Brian Geffon
2023-03-02 17:25     ` Brian Geffon
2023-03-02 17:13 ` [PATCH v3] " Brian Geffon
2023-03-23 16:19   ` Matthias Kaehlcke
2023-03-27 14:20     ` Brian Geffon [this message]
2023-04-06 18:20 ` [PATCH v4] " Brian Geffon
2023-06-22 16:22   ` Rafael J. Wysocki
2023-06-26 15:44 ` [PATCH] " Brian Geffon
2023-06-26 15:47 ` [PATCH v6] " Brian Geffon
2023-07-14 17:55 ` [PATCH v7] " Brian Geffon
2023-07-20 18:18   ` Rafael J. Wysocki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CADyq12xaOaOu_5OXRm=YRhYVPmw72d4RQie3b3hHp=-YXoDhAg@mail.gmail.com' \
    --to=bgeffon@google.com \
    --cc=len.brown@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mka@chromium.org \
    --cc=pavel@ucw.cz \
    --cc=rafael@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.