From: Minchan Kim <minchan@kernel.org>
To: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	kernel-team <kernel-team@lge.com>
Subject: Re: [PATCH 2/2] zram: do not count duplicated pages as compressed
Date: Tue, 16 May 2017 14:26:05 +0900
Message-ID: <20170516052605.GA5429@bbox>
In-Reply-To: <20170516023615.GC10262@jagdpanzerIV.localdomain>

On Tue, May 16, 2017 at 11:36:15AM +0900, Sergey Senozhatsky wrote:
> > > > @@ -794,7 +801,15 @@ static int __zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index)
> > > >  	entry = zram_dedup_find(zram, page, &checksum);
> > > >  	if (entry) {
> > > >  		comp_len = entry->len;
> > > > -		goto found_dup;
> > > > +		zram_slot_lock(zram, index);
> > > > +		zram_free_page(zram, index);
> > > > +		zram_set_flag(zram, index, ZRAM_DUP);
> > > > +		zram_set_entry(zram, index, entry);
> > > > +		zram_set_obj_size(zram, index, comp_len);
> > > > +		zram_slot_unlock(zram, index);
> > > > +		atomic64_add(comp_len, &zram->stats.dup_data_size);
> > > > +		atomic64_inc(&zram->stats.pages_stored);
> > > > +		return 0;
> > > 
> > > hm. that's quite a bit of code duplication, isn't it?
> > 
> > Yub. 3 parts: the above part, zram_same_page_write(), and the tail of __zram_bvec_write().
> 
> hmm... good question... I can hardly think of anything significantly
> better; zram object handling is now a mix of flags, entries,
> ref_counters, etc. etc. maybe we could merge some of those ops if we
> kept the slot locked through the entire __zram_bvec_write(), but
> that does not look attractive.
> 
> something ABSOLUTELY untested and incomplete. not even compile tested (!).
> 99% broken and stupid (!). but there is one thing that it has revealed,
> which is why I'm posting it even though it's incomplete. see below:
> 
> ---
> 
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 372602c7da49..b31543c40d54 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -509,11 +509,8 @@ static bool zram_same_page_write(struct zram *zram, u32 index,
>         if (page_same_filled(mem, &element)) {
>                 kunmap_atomic(mem);
>                 /* Free memory associated with this sector now. */
> -               zram_slot_lock(zram, index);
> -               zram_free_page(zram, index);
>                 zram_set_flag(zram, index, ZRAM_SAME);
>                 zram_set_element(zram, index, element);
> -               zram_slot_unlock(zram, index);
>  
>                 atomic64_inc(&zram->stats.same_pages);
>                 return true;
> @@ -778,7 +775,7 @@ static int zram_compress(struct zram *zram, struct zcomp_strm **zstrm,
>  
>  static int __zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index)
>  {
> -       int ret;
> +       int ret = 0;
>         struct zram_entry *entry;
>         unsigned int comp_len;
>         void *src, *dst;
> @@ -786,12 +783,20 @@ static int __zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index)
>         struct page *page = bvec->bv_page;
>         u32 checksum;
>  
> +       /*
> +        * Free memory associated with this sector
> +        * before overwriting unused sectors.
> +        */
> +       zram_slot_lock(zram, index);
> +       zram_free_page(zram, index);

Hmm, zram_free_page() should happen only if the write is done successfully.
Otherwise, we lose the valid data even though the write IO failed.
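
Roughly, I mean keeping the free on the success path only, something like
this (an untested sketch, reusing the names from the diff above):

	ret = zram_compress(zram, &zstrm, page, &entry, &comp_len);
	if (ret) {
		zcomp_stream_put(zram->comp);
		return ret;		/* old object stays valid on failure */
	}

	/* ... copy into the new object, zram_dedup_insert(), etc. ... */

	/* drop the old object only once the new one is ready */
	zram_slot_lock(zram, index);
	zram_free_page(zram, index);
	zram_set_entry(zram, index, entry);
	zram_set_obj_size(zram, index, comp_len);
	zram_slot_unlock(zram, index);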

> +
>         if (zram_same_page_write(zram, index, page))
> -               return 0;
> +               goto out_unlock;
>  
>         entry = zram_dedup_find(zram, page, &checksum);
>         if (entry) {
>                 comp_len = entry->len;
> +               zram_set_flag(zram, index, ZRAM_DUP);

In case of a dedup hit, we shouldn't increase compr_data_size.
If we fix the above two problems, do you think it's still cleaner?
(I don't mean to push back on your suggestion; it's a genuine
 question to hear your thoughts. :)
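
For the stats, the common tail would then need something like this
(untested, assuming ZRAM_DUP gets set only on the dedup path as in
your diff):

	/* Update stats: a dedup hit is not new compressed data */
	if (zram_test_flag(zram, index, ZRAM_DUP))
		atomic64_add(comp_len, &zram->stats.dup_data_size);
	else
		atomic64_add(comp_len, &zram->stats.compr_data_size);
	atomic64_inc(&zram->stats.pages_stored);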


>                 goto found_dup;
>         }
>  
> @@ -799,7 +804,7 @@ static int __zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index)
>         ret = zram_compress(zram, &zstrm, page, &entry, &comp_len);
>         if (ret) {
>                 zcomp_stream_put(zram->comp);
> -               return ret;
> +               goto out_unlock;
>         }
>  
>         dst = zs_map_object(zram->mem_pool,
> @@ -817,20 +822,16 @@ static int __zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index)
>         zram_dedup_insert(zram, entry, checksum);
>  
>  found_dup:
> -       /*
> -        * Free memory associated with this sector
> -        * before overwriting unused sectors.
> -        */
> -       zram_slot_lock(zram, index);
> -       zram_free_page(zram, index);
>         zram_set_entry(zram, index, entry);
>         zram_set_obj_size(zram, index, comp_len);
> -       zram_slot_unlock(zram, index);
>  
>         /* Update stats */
>         atomic64_add(comp_len, &zram->stats.compr_data_size);
>         atomic64_inc(&zram->stats.pages_stored);
> -       return 0;
> +
> +out_unlock:
> +       zram_slot_unlock(zram, index);
> +       return ret;
>  }
> 
> ---
> 
> 
> namely,
> that zram_compress() error return path from __zram_bvec_write().
> 
> currently, we touch the existing compressed object and overwrite it only
> when we have successfully compressed a new object. when zram_compress() fails
> we propagate the error, but never touch the old object. so all reads that
> could hit that index later will read stale data. and probably it would
> make more sense to fail those reads as well; IOW, to free the old page
> regardless of zram_compress() progress.
> 
> what do you think?
> 
> 	-ss

Thread overview:
2017-05-15  7:41 [PATCH 1/2] zram: count same page write as page_stored Minchan Kim
2017-05-15  7:41 ` [PATCH 2/2] zram: do not count duplicated pages as compressed Minchan Kim
2017-05-16  1:30   ` Sergey Senozhatsky
2017-05-16  1:59     ` Minchan Kim
2017-05-16  2:36       ` Sergey Senozhatsky
2017-05-16  5:26         ` Minchan Kim [this message]
2017-05-16  5:45           ` Sergey Senozhatsky
2017-05-16  7:16             ` Minchan Kim
2017-05-16  7:36               ` Sergey Senozhatsky
2017-05-17  8:32                 ` Minchan Kim
2017-05-17  9:14                   ` Sergey Senozhatsky
2017-05-18  4:53                     ` Minchan Kim
2017-05-21  7:04                   ` Christoph Hellwig
2017-05-21  7:15                     ` Minchan Kim
2017-05-16  1:11 ` [PATCH 1/2] zram: count same page write as page_stored Sergey Senozhatsky
