All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: Minchan Kim <minchan@kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
	stable@vger.kernel.org
Subject: Re: [PATCH v2 1/7] zram: fix lockdep warning of free block handling
Date: Mon, 26 Nov 2018 12:49:28 -0800	[thread overview]
Message-ID: <20181126124928.8fbbf01966b741ac79a3d003@linux-foundation.org> (raw)
In-Reply-To: <20181126082813.81977-2-minchan@kernel.org>

On Mon, 26 Nov 2018 17:28:07 +0900 Minchan Kim <minchan@kernel.org> wrote:

> 
> ...
>
> With writeback feature, zram_slot_free_notify could be called
> in softirq context by end_swap_bio_read. However, bitmap_lock
> is not aware of that so lockdep yell out. Thanks.
> 
> The problem is not only bitmap_lock but it is also zram_slot_lock
> so straightforward solution would disable irq on zram_slot_lock
> which covers every bitmap_lock, too.
> Although duration disabling the irq is short in many places
> zram_slot_lock is used, a place(ie, decompress) is not fast
> enough to hold irqlock on relying on compression algorithm
> so it's not a option.
> 
> The approach in this patch is just "best effort", not guarantee
> "freeing orphan zpage". If the zram_slot_lock contention may happen,
> kernel couldn't free the zpage until it recycles the block. However,
> such contention between zram_slot_free_notify and other places to
> hold zram_slot_lock should be very rare in real practice.
> To see how often it happens, this patch adds new debug stat
> "miss_free".
> 
> It also adds irq lock in get/put_block_bdev to prevent deadlock
> lockdep reported. The reason I used irq disable rather than bottom
> half is swap_slot_free_notify could be called with irq disabled
> so it breaks local_bh_enable's rule. The irqlock works on only
> writebacked zram slot entry so it should be not frequent lock.
> 
> Cc: stable@vger.kernel.org # 4.14+
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  drivers/block/zram/zram_drv.c | 56 +++++++++++++++++++++++++----------
>  drivers/block/zram/zram_drv.h |  1 +
>  2 files changed, 42 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 4879595200e1..472027eaed60 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -53,6 +53,11 @@ static size_t huge_class_size;
>  
>  static void zram_free_page(struct zram *zram, size_t index);
>  
> +static int zram_slot_trylock(struct zram *zram, u32 index)
> +{
> +	return bit_spin_trylock(ZRAM_LOCK, &zram->table[index].value);
> +}
> +
>  static void zram_slot_lock(struct zram *zram, u32 index)
>  {
>  	bit_spin_lock(ZRAM_LOCK, &zram->table[index].value);
> @@ -443,29 +448,45 @@ static ssize_t backing_dev_store(struct device *dev,
>  
>  static unsigned long get_entry_bdev(struct zram *zram)
>  {
> -	unsigned long entry;
> +	unsigned long blk_idx;
> +	unsigned long ret = 0;
>  
> -	spin_lock(&zram->bitmap_lock);
>  	/* skip 0 bit to confuse zram.handle = 0 */
> -	entry = find_next_zero_bit(zram->bitmap, zram->nr_pages, 1);
> -	if (entry == zram->nr_pages) {
> -		spin_unlock(&zram->bitmap_lock);
> -		return 0;
> +	blk_idx = find_next_zero_bit(zram->bitmap, zram->nr_pages, 1);
> +	if (blk_idx == zram->nr_pages)
> +		goto retry;
> +
> +	spin_lock_irq(&zram->bitmap_lock);
> +	if (test_bit(blk_idx, zram->bitmap)) {
> +		spin_unlock_irq(&zram->bitmap_lock);
> +		goto retry;
>  	}
>  
> -	set_bit(entry, zram->bitmap);
> -	spin_unlock(&zram->bitmap_lock);
> +	set_bit(blk_idx, zram->bitmap);

Here we could do

	if (test_and_set_bit(...)) {
		spin_unlock(...);
		goto retry;

But it's weird to take the spinlock on behalf of bitops which are
already atomic!

It seems rather suspicious to me.  Why are we doing this?

> +	ret = blk_idx;
> +	goto out;
> +retry:
> +	spin_lock_irq(&zram->bitmap_lock);
> +	blk_idx = find_next_zero_bit(zram->bitmap, zram->nr_pages, 1);
> +	if (blk_idx == zram->nr_pages)
> +		goto out;
> +
> +	set_bit(blk_idx, zram->bitmap);
> +	ret = blk_idx;
> +out:
> +	spin_unlock_irq(&zram->bitmap_lock);
>  
> -	return entry;
> +	return ret;
>  }
>  
>  static void put_entry_bdev(struct zram *zram, unsigned long entry)
>  {
>  	int was_set;
> +	unsigned long flags;
>  
> -	spin_lock(&zram->bitmap_lock);
> +	spin_lock_irqsave(&zram->bitmap_lock, flags);
>  	was_set = test_and_clear_bit(entry, zram->bitmap);
> -	spin_unlock(&zram->bitmap_lock);
> +	spin_unlock_irqrestore(&zram->bitmap_lock, flags);

Here's another one.  Surely that locking is unnecessary.

>  	WARN_ON_ONCE(!was_set);
>  }
>  


  reply	other threads:[~2018-11-26 20:49 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-26  8:28 [PATCH v2 0/7] zram idle page writeback Minchan Kim
2018-11-26  8:28 ` [PATCH v2 1/7] zram: fix lockdep warning of free block handling Minchan Kim
2018-11-26 20:49   ` Andrew Morton [this message]
2018-11-27  2:05     ` Minchan Kim
2018-11-26  8:28 ` [PATCH v2 2/7] zram: fix double free backing device Minchan Kim
2018-11-26  8:28 ` [PATCH v2 3/7] zram: refactoring flags and writeback stuff Minchan Kim
2018-11-26  8:28 ` [PATCH v2 4/7] zram: introduce ZRAM_IDLE flag Minchan Kim
2018-11-26  8:28 ` [PATCH v2 5/7] zram: support idle/huge page writeback Minchan Kim
2018-11-26  9:47   ` Joey Pabalinas
2018-11-26 13:44     ` Joey Pabalinas
2018-11-27  2:13     ` Minchan Kim
2018-11-27  2:53       ` Joey Pabalinas
2018-11-26  8:28 ` [PATCH v2 6/7] zram: add bd_stat statistics Minchan Kim
2018-11-26 20:58   ` Andrew Morton
2018-11-27  2:07     ` Minchan Kim
2018-11-28 23:30       ` Andrew Morton
2018-11-29  1:45         ` Minchan Kim
2018-11-26  8:28 ` [PATCH v2 7/7] zram: writeback throttle Minchan Kim
2018-11-26 20:54   ` Andrew Morton
2018-11-27  2:08     ` Minchan Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181126124928.8fbbf01966b741ac79a3d003@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=minchan@kernel.org \
    --cc=sergey.senozhatsky.work@gmail.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.