From: Damien Le Moal <Damien.LeMoal@wdc.com>
To: "xiubli@redhat.com" <xiubli@redhat.com>,
"josef@toxicpanda.com" <josef@toxicpanda.com>,
"axboe@kernel.dk" <axboe@kernel.dk>
Cc: "mchristi@redhat.com" <mchristi@redhat.com>,
"ming.lei@redhat.com" <ming.lei@redhat.com>,
"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Subject: Re: [PATCH v4 1/2] blk-mq: Avoid memory reclaim when allocating request map
Date: Mon, 30 Sep 2019 05:28:56 +0000
Message-ID: <BYAPR04MB58160630271A8E552F7FD5D8E7820@BYAPR04MB5816.namprd04.prod.outlook.com>
In-Reply-To: 20190930015213.8865-2-xiubli@redhat.com
On 2019/09/29 18:52, xiubli@redhat.com wrote:
> From: Xiubo Li <xiubli@redhat.com>
>
> For some storage drivers, such as nbd, adding new socket connections
> updates the number of hardware queues by calling
> blk_mq_update_nr_hw_queues(), which first freezes all the queues and
> then updates the hardware queues.
>
> But in blk_mq_alloc_rq_map()-->blk_mq_init_tags(), allocating memory
> for the tags may trigger direct memory reclaim. Since the queues are
> already frozen, if reclaim needs to flush the page cache to disk, it
> will get stuck in generic_make_request()-->blk_queue_enter() waiting
> for the queues to be unfrozen, which causes a deadlock.
>
> Since the memory size requested here is small, this is unlikely to
> happen, but in theory it could when the OS is under memory pressure
> or out of memory.
>
> Gabriel Krisman Bertazi hit a similar issue and fixed it in
> commit 36e1f3d10786 ("blk-mq: Avoid memory reclaim when remapping
> queues"), but that fix might have missed this part.
>
> Signed-off-by: Xiubo Li <xiubli@redhat.com>
> CC: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
> Reviewed-by: Ming Lei <ming.lei@redhat.com>
> ---
> block/blk-mq-tag.c | 5 +++--
> block/blk-mq-tag.h | 5 ++++-
> block/blk-mq.c | 3 ++-
> 3 files changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
> index 008388e82b5c..04ee0e4c3fa1 100644
> --- a/block/blk-mq-tag.c
> +++ b/block/blk-mq-tag.c
> @@ -462,7 +462,8 @@ static struct blk_mq_tags *blk_mq_init_bitmap_tags(struct blk_mq_tags *tags,
>
> struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
> unsigned int reserved_tags,
> - int node, int alloc_policy)
> + int node, int alloc_policy,
> + gfp_t flags)
> {
> struct blk_mq_tags *tags;
>
> @@ -471,7 +472,7 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
> return NULL;
> }
>
> - tags = kzalloc_node(sizeof(*tags), GFP_KERNEL, node);
> + tags = kzalloc_node(sizeof(*tags), flags, node);
> if (!tags)
> return NULL;
>
> diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
> index 61deab0b5a5a..296e0bc97126 100644
> --- a/block/blk-mq-tag.h
> +++ b/block/blk-mq-tag.h
> @@ -22,7 +22,10 @@ struct blk_mq_tags {
> };
>
>
> -extern struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags, unsigned int reserved_tags, int node, int alloc_policy);
> +extern struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags,
> + unsigned int reserved_tags,
> + int node, int alloc_policy,
> + gfp_t flags);
> extern void blk_mq_free_tags(struct blk_mq_tags *tags);
>
> extern unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data);
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 240416057f28..9c52e4dfe132 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2090,7 +2090,8 @@ struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
> node = set->numa_node;
>
> tags = blk_mq_init_tags(nr_tags, reserved_tags, node,
> - BLK_MQ_FLAG_TO_ALLOC_POLICY(set->flags));
> + BLK_MQ_FLAG_TO_ALLOC_POLICY(set->flags),
> + GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY);

You added the gfp_t argument to blk_mq_init_tags(), but you only ever pass it
this one hardcoded value. So why not simply call kzalloc_node() in that
function with the flags GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY? That would
avoid adding the "gfp_t flags" argument and would still fit with the
BLK_MQ_GFP_FLAGS definition in your patch 2.

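
To illustrate the suggestion, here is a rough userspace sketch, not kernel
code and not a tested patch. The gfp_t typedef, the flag values, the
kzalloc_node() stub, the last_gfp variable and the reduced struct blk_mq_tags
are all stand-ins made up for illustration, and the total_tags sanity check
is left out. The only point is that blk_mq_init_tags() keeps its original
four-argument signature and hardcodes the flags at the allocation site:

```c
#include <stdlib.h>

/*
 * Userspace stand-ins, NOT the kernel definitions. The real gfp_t, GFP_*
 * flags and kzalloc_node() come from the kernel headers; the flag values
 * below are invented for this sketch.
 */
typedef unsigned int gfp_t;
#define GFP_NOIO	0x1u
#define __GFP_NOWARN	0x2u
#define __GFP_NORETRY	0x4u

/* Hypothetical helper state: remembers the flags of the last allocation. */
static gfp_t last_gfp;

/* Stub with the same argument order as the kernel's kzalloc_node(). */
static void *kzalloc_node(size_t size, gfp_t flags, int node)
{
	(void)node;		/* NUMA node is ignored in this stub */
	last_gfp = flags;
	return calloc(1, size);	/* zeroed allocation, like kzalloc */
}

/* Reduced stand-in for the real structure. */
struct blk_mq_tags {
	unsigned int nr_tags;
	unsigned int nr_reserved_tags;
};

/* Signature unchanged: no "gfp_t flags" parameter is added. */
struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
				     unsigned int reserved_tags,
				     int node, int alloc_policy)
{
	struct blk_mq_tags *tags;

	(void)alloc_policy;	/* unused in this reduced sketch */

	/* Flags are hardcoded here instead of being passed by the caller. */
	tags = kzalloc_node(sizeof(*tags),
			    GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY, node);
	if (!tags)
		return NULL;

	tags->nr_tags = total_tags;
	tags->nr_reserved_tags = reserved_tags;
	return tags;
}
```

With this shape, the caller in blk_mq_alloc_rq_map() and the prototype in
blk-mq-tag.h would not need to change at all.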
> if (!tags)
> return NULL;
>
>
--
Damien Le Moal
Western Digital Research
Thread overview (7+ messages):
2019-09-30 1:52 [PATCH v4 0/2] blk-mq: Avoid memory reclaim when allocating xiubli
2019-09-30 1:52 ` [PATCH v4 1/2] blk-mq: Avoid memory reclaim when allocating request map xiubli
2019-09-30 5:28 ` Damien Le Moal [this message]
2019-09-30 5:50 ` Xiubo Li
2019-09-30 6:20 ` Damien Le Moal
2019-09-30 6:59 ` Xiubo Li
2019-09-30 1:52 ` [PATCH v4 2/2] blk-mq: use BLK_MQ_GFP_FLAGS macro instead xiubli