linux-block.vger.kernel.org archive mirror
From: Damien Le Moal <Damien.LeMoal@wdc.com>
To: "xiubli@redhat.com" <xiubli@redhat.com>,
	"josef@toxicpanda.com" <josef@toxicpanda.com>,
	"axboe@kernel.dk" <axboe@kernel.dk>
Cc: "mchristi@redhat.com" <mchristi@redhat.com>,
	"ming.lei@redhat.com" <ming.lei@redhat.com>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Subject: Re: [PATCH v4 1/2] blk-mq: Avoid memory reclaim when allocating request map
Date: Mon, 30 Sep 2019 05:28:56 +0000	[thread overview]
Message-ID: <BYAPR04MB58160630271A8E552F7FD5D8E7820@BYAPR04MB5816.namprd04.prod.outlook.com> (raw)
In-Reply-To: 20190930015213.8865-2-xiubli@redhat.com

On 2019/09/29 18:52, xiubli@redhat.com wrote:
> From: Xiubo Li <xiubli@redhat.com>
> 
> For some storage drivers, such as nbd, when new socket connections
> are added, the driver updates the number of hardware queues by calling
> blk_mq_update_nr_hw_queues(), which first freezes all the queues and
> then updates the hardware queues.
> 
> But in blk_mq_alloc_rq_map()-->blk_mq_init_tags(), allocating memory
> for the tags may trigger direct memory reclaim. Since the queues are
> already frozen, if reclaim needs to flush the page cache to disk, it
> will get stuck in generic_make_request()-->blk_queue_enter() waiting
> for the queues to be unfrozen, which causes a deadlock.
> 
> Since the memory size requested here is small, this is much less
> likely to happen than with a large allocation, but in theory it could
> happen when the OS is under memory pressure and running out of memory.
> 
> Gabriel Krisman Bertazi hit a similar issue and fixed it in
> commit 36e1f3d10786 ("blk-mq: Avoid memory reclaim when remapping
> queues"), but this part may have been missed.
> 
> Signed-off-by: Xiubo Li <xiubli@redhat.com>
> CC: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
> Reviewed-by: Ming Lei <ming.lei@redhat.com>
> ---
>  block/blk-mq-tag.c | 5 +++--
>  block/blk-mq-tag.h | 5 ++++-
>  block/blk-mq.c     | 3 ++-
>  3 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
> index 008388e82b5c..04ee0e4c3fa1 100644
> --- a/block/blk-mq-tag.c
> +++ b/block/blk-mq-tag.c
> @@ -462,7 +462,8 @@ static struct blk_mq_tags *blk_mq_init_bitmap_tags(struct blk_mq_tags *tags,
>  
>  struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
>  				     unsigned int reserved_tags,
> -				     int node, int alloc_policy)
> +				     int node, int alloc_policy,
> +				     gfp_t flags)
>  {
>  	struct blk_mq_tags *tags;
>  
> @@ -471,7 +472,7 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
>  		return NULL;
>  	}
>  
> -	tags = kzalloc_node(sizeof(*tags), GFP_KERNEL, node);
> +	tags = kzalloc_node(sizeof(*tags), flags, node);
>  	if (!tags)
>  		return NULL;
>  
> diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
> index 61deab0b5a5a..296e0bc97126 100644
> --- a/block/blk-mq-tag.h
> +++ b/block/blk-mq-tag.h
> @@ -22,7 +22,10 @@ struct blk_mq_tags {
>  };
>  
>  
> -extern struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags, unsigned int reserved_tags, int node, int alloc_policy);
> +extern struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags,
> +					    unsigned int reserved_tags,
> +					    int node, int alloc_policy,
> +					    gfp_t flags);
>  extern void blk_mq_free_tags(struct blk_mq_tags *tags);
>  
>  extern unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data);
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 240416057f28..9c52e4dfe132 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2090,7 +2090,8 @@ struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
>  		node = set->numa_node;
>  
>  	tags = blk_mq_init_tags(nr_tags, reserved_tags, node,
> -				BLK_MQ_FLAG_TO_ALLOC_POLICY(set->flags));
> +				BLK_MQ_FLAG_TO_ALLOC_POLICY(set->flags),
> +				GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY);

You added the gfp_t argument to blk_mq_init_tags(), but you only ever pass it a
hardcoded value here. So why not simply call kzalloc_node() in that function
with the flags GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY directly? That would
avoid adding the "gfp_t flags" argument and would still fit with the
BLK_MQ_GFP_FLAGS definition in your patch 2.
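
To make the two options concrete, here is a minimal userspace sketch, not
kernel code. Every MOCK_* name, mock_kzalloc_node(), struct mock_tags and both
init functions are hypothetical stand-ins invented for illustration, and the
flag bit values are arbitrary. The only property carried over from the kernel
is that GFP_KERNEL allows I/O and filesystem activity during reclaim while
GFP_NOIO does not.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for kernel GFP bits; values are arbitrary. */
typedef unsigned int gfp_t;
#define MOCK_GFP_RECLAIM  0x01u
#define MOCK_GFP_IO       0x02u
#define MOCK_GFP_FS       0x04u
#define MOCK_GFP_NOWARN   0x08u
#define MOCK_GFP_NORETRY  0x10u

/* Like GFP_KERNEL: reclaim may start I/O, which is unsafe on frozen queues. */
#define MOCK_GFP_KERNEL   (MOCK_GFP_RECLAIM | MOCK_GFP_IO | MOCK_GFP_FS)
/* Like GFP_NOIO: reclaim is allowed, but it must not start any I/O. */
#define MOCK_GFP_NOIO     (MOCK_GFP_RECLAIM)
/* The combination hardcoded in the patch under review. */
#define MOCK_TAGS_GFP     (MOCK_GFP_NOIO | MOCK_GFP_NOWARN | MOCK_GFP_NORETRY)

struct mock_tags {
	unsigned int nr_tags;
	unsigned int nr_reserved_tags;
};

/* Records the flags of the most recent allocation so they can be inspected. */
gfp_t last_gfp;

/* Stand-in for kzalloc_node(): zeroed allocation; the node hint is ignored. */
void *mock_kzalloc_node(size_t size, gfp_t flags, int node)
{
	(void)node;
	last_gfp = flags;
	return calloc(1, size);
}

/* Option A (patch as posted): the caller supplies the allocation flags. */
struct mock_tags *init_tags_with_arg(unsigned int total, unsigned int reserved,
				     int node, gfp_t flags)
{
	struct mock_tags *tags = mock_kzalloc_node(sizeof(*tags), flags, node);

	if (!tags)
		return NULL;
	tags->nr_tags = total;
	tags->nr_reserved_tags = reserved;
	return tags;
}

/* Option B (suggested): fixed flags inside the function, signature unchanged. */
struct mock_tags *init_tags_hardcoded(unsigned int total, unsigned int reserved,
				      int node)
{
	struct mock_tags *tags = mock_kzalloc_node(sizeof(*tags), MOCK_TAGS_GFP,
						   node);

	if (!tags)
		return NULL;
	tags->nr_tags = total;
	tags->nr_reserved_tags = reserved;
	return tags;
}
```

With a single caller that always passes the same value, both options end up
issuing the same allocation. Option B just keeps the existing prototype and
touches fewer files.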

>  	if (!tags)
>  		return NULL;
>  
> 


-- 
Damien Le Moal
Western Digital Research
