linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Minchan Kim <minchan@kernel.org>
To: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>, Nitin Gupta <ngupta@vflare.org>,
	Jerome Marchand <jmarchan@redhat.com>,
	Ganesh Mahendran <opensource.ganesh@gmail.com>
Subject: Re: [PATCH v1 2/2] zram: remove init_lock in zram_make_request
Date: Mon, 2 Feb 2015 14:10:19 +0900	[thread overview]
Message-ID: <20150202051017.GH6402@blaptop> (raw)
In-Reply-To: <20150202042847.GG6402@blaptop>

On Mon, Feb 02, 2015 at 01:28:47PM +0900, Minchan Kim wrote:
> On Mon, Feb 02, 2015 at 01:01:24PM +0900, Sergey Senozhatsky wrote:
> > On (02/02/15 11:44), Minchan Kim wrote:
> > > > sure, I did think about this. and I actually didn't find any reason not
> > > > to use ->refcount there. if user wants to reset the device, he first
> > > > should umount it to make bdev->bd_holders check happy. and that's where
> > > > IOs will be failed. so it makes sense to switch to ->refcount there, IMHO.
> > > 
> > > If we use zram as block device itself(not a fs or swap) and open the
> > > block device as !FMODE_EXCL, bd_holders will be void.
> > > 
> > 
> > hm.
> > I don't mind to use ->disksize there, but personally I'd maybe prefer
> > to use ->refcount, which just looks less hacky. zram's most common use
> > cases are coming from ram swap device or ram device with fs. so it looks
> > a bit like we care about some corner case here.
> 
> Maybe, but I always test zram with dd so it's not a corner case for me. :)
> 
> > 
> > just my opinion, no objections against ->disksize != 0.
> 
> Thanks. It's a draft for v2. Please review.
> 
> BTW, you pointed out race between bdev_open/close and reset and
> it's cleary bug although it's rare in real practice.
> So, I want to fix it earlier than this patch and mark it as -stable
> if we can fix it easily like Ganesh's work.
> If it gets landing, we could make this patch rebased on it.
> 
> From 699502b4e0c84b3d7b33f8754cf1c0109b16c012 Mon Sep 17 00:00:00 2001
> From: Minchan Kim <minchan@kernel.org>
> Date: Mon, 2 Feb 2015 10:36:28 +0900
> Subject: [PATCH v2] zram: remove init_lock in zram_make_request
> 
> Admin could reset zram during I/O operation going on so we have
> used zram->init_lock as read-side lock in I/O path to prevent
> sudden zram meta freeing.
> 
> However, the init_lock is really troublesome.
> We can't do call zram_meta_alloc under init_lock due to lockdep splat
> because zram_rw_page is one of the function under reclaim path and
> hold it as read_lock while other places in process context hold it
> as write_lock. So, we have used allocation out of the lock to avoid
> lockdep warn but it's not good for readability and fainally, I met
> another lockdep splat between init_lock and cpu_hotplug from
> kmem_cache_destroy during working zsmalloc compaction. :(
> 
> Yes, the ideal is to remove horrible init_lock of zram in rw path.
> This patch removes it in rw path and instead, add atomic refcount
> for meta lifetime management and completion to free meta in process
> context. It's important to free meta in process context because
> some of resource destruction needs mutex lock, which could be held
> if we releases the resource in reclaim context so it's deadlock,
> again.
> 
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  drivers/block/zram/zram_drv.c | 85 ++++++++++++++++++++++++++++++-------------
>  drivers/block/zram/zram_drv.h | 20 +++++-----
>  2 files changed, 71 insertions(+), 34 deletions(-)
> 
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index aa5a4c5..c6d505c 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -55,7 +55,7 @@ static DEVICE_ATTR_RO(name);
>  
>  static inline int init_done(struct zram *zram)
>  {
> -	return zram->meta != NULL;
> +	return zram->disksize != 0;
>  }
>  
>  static inline struct zram *dev_to_zram(struct device *dev)
> @@ -358,6 +358,18 @@ out_error:
>  	return NULL;
>  }
>  
> +static inline bool zram_meta_get(struct zram *zram)
> +{
> +	if (atomic_inc_not_zero(&zram->refcount))
> +		return true;
> +	return false;
> +}
> +
> +static inline void zram_meta_put(struct zram *zram)
> +{
> +	atomic_dec(&zram->refcount);
> +}
> +
>  static void update_position(u32 *index, int *offset, struct bio_vec *bvec)
>  {
>  	if (*offset + bvec->bv_len >= PAGE_SIZE)
> @@ -719,6 +731,10 @@ static void zram_bio_discard(struct zram *zram, u32 index,
>  
>  static void zram_reset_device(struct zram *zram, bool reset_capacity)
>  {
> +	struct zram_meta *meta;
> +	struct zcomp *comp;
> +	u64 disksize;
> +
>  	down_write(&zram->init_lock);
>  
>  	zram->limit_pages = 0;
> @@ -728,19 +744,32 @@ static void zram_reset_device(struct zram *zram, bool reset_capacity)
>  		return;
>  	}
>  
> -	zcomp_destroy(zram->comp);
> -	zram->max_comp_streams = 1;
> -	zram_meta_free(zram->meta, zram->disksize);
> -	zram->meta = NULL;
> +	meta = zram->meta;
> +	comp = zram->comp;
> +	disksize = zram->disksize;
> +	zram->disksize = 0;
> +	/*
> +	 * ->refcount will go down to 0 eventually and rw handler cannot
> +	 * handle further I/O by init_done checking.
> +	 */
> +	zram_meta_put(zram);
> +	/*
> +	 * We want to free zram_meta in process context to avoid
> +	 * deadlock between reclaim path and any other locks
> +	 */
> +	wait_event(zram->io_done, atomic_read(&zram->refcount) == 0);
> +
>  	/* Reset stats */
>  	memset(&zram->stats, 0, sizeof(zram->stats));
> +	zram->max_comp_streams = 1;
>  
> -	zram->disksize = 0;
>  	if (reset_capacity)
>  		set_capacity(zram->disk, 0);
>  
>  	up_write(&zram->init_lock);
> -
> +	/* I/O operation under all of CPU are done so let's free */
> +	zram_meta_free(meta, disksize);
> +	zcomp_destroy(comp);
>  	/*
>  	 * Revalidate disk out of the init_lock to avoid lockdep splat.
>  	 * It's okay because disk's capacity is protected by init_lock
> @@ -783,6 +812,8 @@ static ssize_t disksize_store(struct device *dev,
>  		goto out_destroy_comp;
>  	}
>  
> +	init_waitqueue_head(&zram->io_done);
> +	zram_meta_get(zram);
        
Argh, It should be

        atomic_set(&zram->refcount, 1);

-- 
Kind regards,
Minchan Kim

  parent reply	other threads:[~2015-02-02  5:10 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <1422432945-6764-1-git-send-email-minchan@kernel.org>
2015-01-28 14:19 ` [PATCH 1/2] zram: free meta table in zram_meta_free Sergey Senozhatsky
2015-01-28 23:17   ` Minchan Kim
2015-01-29  1:49     ` Ganesh Mahendran
     [not found] ` <1422432945-6764-2-git-send-email-minchan@kernel.org>
2015-01-28 14:56   ` [PATCH v1 2/2] zram: remove init_lock in zram_make_request Sergey Senozhatsky
2015-01-28 15:04     ` Sergey Senozhatsky
2015-01-28 23:33     ` Minchan Kim
     [not found]       ` <CAHqPoqKZFDSjO1pL+ixYe_m_L0nGNcu04qSNp-jd1fUixKtHnw@mail.gmail.com>
2015-01-29  2:01         ` Minchan Kim
2015-01-29  2:22           ` Sergey Senozhatsky
2015-01-29  5:28             ` Minchan Kim
2015-01-29  6:06               ` Sergey Senozhatsky
2015-01-29  6:35                 ` Minchan Kim
2015-01-29  7:08                   ` Sergey Senozhatsky
2015-01-30 14:41                     ` Minchan Kim
2015-01-31 11:31                       ` Sergey Senozhatsky
2015-02-01 14:50                       ` Sergey Senozhatsky
2015-02-01 15:04                         ` Sergey Senozhatsky
2015-02-02  1:43                           ` Minchan Kim
2015-02-02  1:59                             ` Sergey Senozhatsky
2015-02-02  2:45                               ` Minchan Kim
2015-02-02  3:47                                 ` Sergey Senozhatsky
2015-02-02  1:30                         ` Minchan Kim
2015-02-02  1:48                           ` Sergey Senozhatsky
2015-02-02  2:44                             ` Minchan Kim
2015-02-02  4:01                               ` Sergey Senozhatsky
2015-02-02  4:28                                 ` Minchan Kim
2015-02-02  5:09                                   ` Sergey Senozhatsky
2015-02-02  5:18                                     ` Minchan Kim
2015-02-02  5:28                                       ` Sergey Senozhatsky
2015-02-02  5:10                                   ` Minchan Kim [this message]
2015-01-30  0:20                   ` Sergey Senozhatsky
2015-01-29 13:48   ` Ganesh Mahendran
2015-01-29 15:12     ` Sergey Senozhatsky
2015-01-30  7:52       ` Ganesh Mahendran
2015-01-30  8:08         ` Sergey Senozhatsky
2015-01-31  8:50           ` Ganesh Mahendran
2015-01-31 11:07             ` Sergey Senozhatsky
2015-01-31 12:59               ` Ganesh Mahendran
2015-02-02  3:41 Minchan Kim
2015-02-02  5:59 ` Sergey Senozhatsky
2015-02-02  6:18   ` Minchan Kim
2015-02-02  7:06     ` Sergey Senozhatsky
2015-02-03  1:54       ` Sergey Senozhatsky
2015-02-03  3:02         ` Minchan Kim
2015-02-03  3:56           ` Sergey Senozhatsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150202051017.GH6402@blaptop \
    --to=minchan@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=jmarchan@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=ngupta@vflare.org \
    --cc=opensource.ganesh@gmail.com \
    --cc=sergey.senozhatsky.work@gmail.com \
    --cc=sergey.senozhatsky@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).