dm-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
From: Mike Snitzer <snitzer@redhat.com>
To: Mikulas Patocka <mpatocka@redhat.com>
Cc: Ondrej Kozina <okozina@redhat.com>,
	dm-devel@redhat.com, Milan Broz <mbroz@redhat.com>
Subject: Re: [dm-devel] dm: fix deadlock when swapping to encrypted device
Date: Wed, 10 Feb 2021 14:23:23 -0500	[thread overview]
Message-ID: <20210210192323.GC7904@redhat.com> (raw)
In-Reply-To: <alpine.LRH.2.02.2102101140420.30253@file01.intranet.prod.int.rdu2.redhat.com>

On Wed, Feb 10 2021 at 11:50am -0500,
Mikulas Patocka <mpatocka@redhat.com> wrote:

> Hi
> 
> Here I'm sending the patch that fixes swapping to dm-crypt.
> 
> The logic that limits the number of in-progress I/Os was moved to generic 
> device mapper. A dm target can activate it by setting ti->limit_swap. The 
> actual limit can be set in /sys/module/dm_mod/parameters/swap_ios.
> 
> This patch only limits swap bios (those with REQ_SWAP set). I don't limit 
> other bios, because limiting them causes performance degradation due to 
> cache line bouncing when taking the semaphore - and there are no reports 
> that non-swap I/O on dm crypt causes deadlocks.
> 
> Mikulas
> 
> 
> 
> From: Mikulas Patocka <mpatocka@redhat.com>
> 
> The system would deadlock when swapping to a dm-crypt device. The reason
> is that for each incoming write bio, dm-crypt allocates memory that holds
> encrypted data. These excessive allocations exhaust all the memory and the
> result is either deadlock or OOM trigger.
> 
> This patch limits the number of in-flight swap bios, so that the memory
> consumed by dm-crypt is limited. The limit is enforced if the target set
> the "limit_swap" variable and if the bio has REQ_SWAP set.
> 
> Non-swap bios are not affected becuase taking the semaphore would cause
> performance degradation.
> 
> This is similar to request-based drivers - they will also block when the
> number of requests is over the limit.
> 
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Cc: stable@vger.kernel.org
> 
> ---
>  drivers/md/dm-core.h          |    4 ++
>  drivers/md/dm-crypt.c         |    1 
>  drivers/md/dm.c               |   61 ++++++++++++++++++++++++++++++++++++++++++
>  include/linux/device-mapper.h |    5 +++
>  4 files changed, 71 insertions(+)
> 
> Index: linux-2.6/drivers/md/dm.c
> ===================================================================
> --- linux-2.6.orig/drivers/md/dm.c	2021-02-10 15:04:53.000000000 +0100
> +++ linux-2.6/drivers/md/dm.c	2021-02-10 16:29:04.000000000 +0100

> @@ -1271,6 +1307,15 @@ static blk_qc_t __map_bio(struct dm_targ
>  	atomic_inc(&io->io_count);
>  	sector = clone->bi_iter.bi_sector;
>  
> +	if (unlikely(swap_io_limit(ti, clone))) {
> +		struct mapped_device *md = io->md;
> +		int latch = get_swap_ios();
> +		if (unlikely(latch != md->swap_ios)) {
> +			__set_swap_io_limit(md, latch);
> +		}

Don't need these curly braces...

> +		down(&md->swap_ios_semaphore);
> +	}
> +
>  	r = ti->type->map(ti, clone);
>  	switch (r) {
>  	case DM_MAPIO_SUBMITTED:

> @@ -1814,6 +1868,10 @@ static struct mapped_device *alloc_dev(i
>  	init_waitqueue_head(&md->eventq);
>  	init_completion(&md->kobj_holder.completion);
>  
> +	md->swap_ios = get_swap_ios();
> +	sema_init(&md->swap_ios_semaphore, md->swap_ios);
> +	mutex_init(&md->swap_ios_lock);
> +
>  	md->disk->major = _major;
>  	md->disk->first_minor = minor;
>  	md->disk->fops = &dm_blk_dops;

This is only applicable for bio-based DM.  But probably not worth
avoiding the setup for request-based...

> @@ -3097,6 +3155,9 @@ MODULE_PARM_DESC(reserved_bio_based_ios,
>  module_param(dm_numa_node, int, S_IRUGO | S_IWUSR);
>  MODULE_PARM_DESC(dm_numa_node, "NUMA node for DM device memory allocations");
>  
> +module_param(swap_ios, int, S_IRUGO | S_IWUSR);
> +MODULE_PARM_DESC(swap_ios, "The number of swap I/Os in flight");
> +

Can you please rename this to modparam to "swap_bios"?  And rename other
variables/members, etc (e.g. "swap_bios_semaphore", "swap_bios_lock",
etc)?

> Index: linux-2.6/include/linux/device-mapper.h
> ===================================================================
> --- linux-2.6.orig/include/linux/device-mapper.h	2020-11-25 13:40:44.000000000 +0100
> +++ linux-2.6/include/linux/device-mapper.h	2021-02-10 15:52:54.000000000 +0100
> @@ -325,6 +325,11 @@ struct dm_target {
>  	 * whether or not its underlying devices have support.
>  	 */
>  	bool discards_supported:1;
> +
> +	/*
> +	 * Set if we need to limit the number of in-flight bios when swapping.
> +	 */
> +	bool limit_swap:1;
>  };
>  
>  void *dm_per_bio_data(struct bio *bio, size_t data_size);

Please rename to "limit_swap_bios".

Other than these nits this looks good to me.
Once you send v2 I can get it staged for 5.12.

Thanks,
Mike

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


  reply	other threads:[~2021-02-10 19:23 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-10 16:50 [dm-devel] [PATCH] dm: fix deadlock when swapping to encrypted device Mikulas Patocka
2021-02-10 19:23 ` Mike Snitzer [this message]
2021-02-10 20:26   ` [dm-devel] [PATCH v2] " Mikulas Patocka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210210192323.GC7904@redhat.com \
    --to=snitzer@redhat.com \
    --cc=dm-devel@redhat.com \
    --cc=mbroz@redhat.com \
    --cc=mpatocka@redhat.com \
    --cc=okozina@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).