linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHv2 0/2] blk-mq: Avoid memory reclaim when allocating request map
@ 2019-09-16  2:16 xiubli
  2019-09-16  2:16 ` [PATCHv2 1/2] " xiubli
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: xiubli @ 2019-09-16  2:16 UTC (permalink / raw)
  To: josef, axboe; +Cc: mchristi, linux-block, Xiubo Li

From: Xiubo Li <xiubli@redhat.com>

To make the patch more readable and cleaner I just split them into 2
small ones to address the issue from @Ming Lei, thanks very much.

Changed in V2:
- Addressed the comment from Ming Lei, thanks.

Xiubo Li (2):
  blk-mq: Avoid memory reclaim when allocating request map
  blk-mq: use BLK_MQ_GFP_FLAGS macro instead

 block/blk-mq-tag.c |  5 +++--
 block/blk-mq-tag.h |  5 ++++-
 block/blk-mq.c     | 17 +++++++++--------
 3 files changed, 16 insertions(+), 11 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCHv2 1/2] blk-mq: Avoid memory reclaim when allocating request map
  2019-09-16  2:16 [PATCHv2 0/2] blk-mq: Avoid memory reclaim when allocating request map xiubli
@ 2019-09-16  2:16 ` xiubli
  2019-09-16  2:16 ` [PATCHv2 2/2] blk-mq: use BLK_MQ_GFP_FLAGS macro instead xiubli
  2019-09-16  9:06 ` [PATCHv2 0/2] blk-mq: Avoid memory reclaim when allocating request map Christoph Hellwig
  2 siblings, 0 replies; 9+ messages in thread
From: xiubli @ 2019-09-16  2:16 UTC (permalink / raw)
  To: josef, axboe
  Cc: mchristi, linux-block, Xiubo Li, Gabriel Krisman Bertazi, Ming Lei

From: Xiubo Li <xiubli@redhat.com>

For some storage drivers, such as the nbd, when there has new socket
connections added, it will update the hardware queue number by calling
blk_mq_update_nr_hw_queues(), in which it will freeze all the queues
first. And then tries to do the hardware queue updating stuff.

But int blk_mq_alloc_rq_map()-->blk_mq_init_tags(), when allocating
memory for tags, it may cause the mm do the memories direct reclaiming,
since the queues has been freezed, so if needs to flush the page cache
to disk, it will stuck in generic_make_request()-->blk_queue_enter() by
waiting the queues to be unfreezed and then cause deadlock here.

Since the memory size requested here is a small one, which will make
it not that easy to happen with a large size, but in theory this could
happen when the OS is running in pressure and out of memory.

Gabriel Krisman Bertazi has hit the similar issue by fixing it in
commit 36e1f3d10786 ("blk-mq: Avoid memory reclaim when remapping
queues"), but might forget this part.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
CC: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq-tag.c | 5 +++--
 block/blk-mq-tag.h | 5 ++++-
 block/blk-mq.c     | 3 ++-
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 008388e82b5c..04ee0e4c3fa1 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -462,7 +462,8 @@ static struct blk_mq_tags *blk_mq_init_bitmap_tags(struct blk_mq_tags *tags,
 
 struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
 				     unsigned int reserved_tags,
-				     int node, int alloc_policy)
+				     int node, int alloc_policy,
+				     gfp_t flags)
 {
 	struct blk_mq_tags *tags;
 
@@ -471,7 +472,7 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
 		return NULL;
 	}
 
-	tags = kzalloc_node(sizeof(*tags), GFP_KERNEL, node);
+	tags = kzalloc_node(sizeof(*tags), flags, node);
 	if (!tags)
 		return NULL;
 
diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
index 61deab0b5a5a..296e0bc97126 100644
--- a/block/blk-mq-tag.h
+++ b/block/blk-mq-tag.h
@@ -22,7 +22,10 @@ struct blk_mq_tags {
 };
 
 
-extern struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags, unsigned int reserved_tags, int node, int alloc_policy);
+extern struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags,
+					    unsigned int reserved_tags,
+					    int node, int alloc_policy,
+					    gfp_t flags);
 extern void blk_mq_free_tags(struct blk_mq_tags *tags);
 
 extern unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 240416057f28..9c52e4dfe132 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2090,7 +2090,8 @@ struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
 		node = set->numa_node;
 
 	tags = blk_mq_init_tags(nr_tags, reserved_tags, node,
-				BLK_MQ_FLAG_TO_ALLOC_POLICY(set->flags));
+				BLK_MQ_FLAG_TO_ALLOC_POLICY(set->flags),
+				GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY);
 	if (!tags)
 		return NULL;
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCHv2 2/2] blk-mq: use BLK_MQ_GFP_FLAGS macro instead
  2019-09-16  2:16 [PATCHv2 0/2] blk-mq: Avoid memory reclaim when allocating request map xiubli
  2019-09-16  2:16 ` [PATCHv2 1/2] " xiubli
@ 2019-09-16  2:16 ` xiubli
  2019-09-16  9:06 ` [PATCHv2 0/2] blk-mq: Avoid memory reclaim when allocating request map Christoph Hellwig
  2 siblings, 0 replies; 9+ messages in thread
From: xiubli @ 2019-09-16  2:16 UTC (permalink / raw)
  To: josef, axboe; +Cc: mchristi, linux-block, Xiubo Li, Ming Lei

From: Xiubo Li <xiubli@redhat.com>

There at least 6 places are using the same combined GFP flags,
switch them to one macro instead to make the code get cleaner.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9c52e4dfe132..a5faad4690cf 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -39,6 +39,8 @@
 #include "blk-mq-sched.h"
 #include "blk-rq-qos.h"
 
+#define BLK_MQ_GFP_FLAGS (GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY)
+
 static void blk_mq_poll_stats_start(struct request_queue *q);
 static void blk_mq_poll_stats_fn(struct blk_stat_callback *cb);
 
@@ -2091,21 +2093,19 @@ struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
 
 	tags = blk_mq_init_tags(nr_tags, reserved_tags, node,
 				BLK_MQ_FLAG_TO_ALLOC_POLICY(set->flags),
-				GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY);
+				BLK_MQ_GFP_FLAGS);
 	if (!tags)
 		return NULL;
 
 	tags->rqs = kcalloc_node(nr_tags, sizeof(struct request *),
-				 GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY,
-				 node);
+				 BLK_MQ_GFP_FLAGS, node);
 	if (!tags->rqs) {
 		blk_mq_free_tags(tags);
 		return NULL;
 	}
 
 	tags->static_rqs = kcalloc_node(nr_tags, sizeof(struct request *),
-					GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY,
-					node);
+					BLK_MQ_GFP_FLAGS, node);
 	if (!tags->static_rqs) {
 		kfree(tags->rqs);
 		blk_mq_free_tags(tags);
@@ -2167,7 +2167,7 @@ int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
 
 		do {
 			page = alloc_pages_node(node,
-				GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY | __GFP_ZERO,
+				BLK_MQ_GFP_FLAGS | __GFP_ZERO,
 				this_order);
 			if (page)
 				break;
@@ -2333,7 +2333,7 @@ blk_mq_alloc_hctx(struct request_queue *q, struct blk_mq_tag_set *set,
 		int node)
 {
 	struct blk_mq_hw_ctx *hctx;
-	gfp_t gfp = GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY;
+	gfp_t gfp = BLK_MQ_GFP_FLAGS;
 
 	hctx = kzalloc_node(blk_mq_hw_ctx_size(set), gfp, node);
 	if (!hctx)
@@ -3194,7 +3194,7 @@ static bool blk_mq_elv_switch_none(struct list_head *head,
 	if (!q->elevator)
 		return true;
 
-	qe = kmalloc(sizeof(*qe), GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY);
+	qe = kmalloc(sizeof(*qe), BLK_MQ_GFP_FLAGS);
 	if (!qe)
 		return false;
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCHv2 0/2] blk-mq: Avoid memory reclaim when allocating request map
  2019-09-16  2:16 [PATCHv2 0/2] blk-mq: Avoid memory reclaim when allocating request map xiubli
  2019-09-16  2:16 ` [PATCHv2 1/2] " xiubli
  2019-09-16  2:16 ` [PATCHv2 2/2] blk-mq: use BLK_MQ_GFP_FLAGS macro instead xiubli
@ 2019-09-16  9:06 ` Christoph Hellwig
  2019-09-16 10:40   ` Xiubo Li
                     ` (2 more replies)
  2 siblings, 3 replies; 9+ messages in thread
From: Christoph Hellwig @ 2019-09-16  9:06 UTC (permalink / raw)
  To: xiubli; +Cc: josef, axboe, mchristi, linux-block

On Mon, Sep 16, 2019 at 07:46:29AM +0530, xiubli@redhat.com wrote:
> From: Xiubo Li <xiubli@redhat.com>
> 
> To make the patch more readable and cleaner I just split them into 2
> small ones to address the issue from @Ming Lei, thanks very much.

I'd be much happier to just see memalloc_noio_save +
memalloc_noio_restore calls in the right places over sprinkling even
more magic GFP_NOIO arguments.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCHv2 0/2] blk-mq: Avoid memory reclaim when allocating request map
  2019-09-16  9:06 ` [PATCHv2 0/2] blk-mq: Avoid memory reclaim when allocating request map Christoph Hellwig
@ 2019-09-16 10:40   ` Xiubo Li
  2019-09-16 16:52   ` Jens Axboe
  2019-09-17 15:59   ` Bart Van Assche
  2 siblings, 0 replies; 9+ messages in thread
From: Xiubo Li @ 2019-09-16 10:40 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: josef, axboe, mchristi, linux-block

On 2019/9/16 17:06, Christoph Hellwig wrote:
> On Mon, Sep 16, 2019 at 07:46:29AM +0530, xiubli@redhat.com wrote:
>> From: Xiubo Li <xiubli@redhat.com>
>>
>> To make the patch more readable and cleaner I just split them into 2
>> small ones to address the issue from @Ming Lei, thanks very much.
> I'd be much happier to just see memalloc_noio_save +
> memalloc_noio_restore calls in the right places over sprinkling even
> more magic GFP_NOIO arguments.

Hi Christoph,

BTW, then to make the code to be more readable, should I just keep the 
BLK_MQ_GFP_GLAGS as:

#define BLK_MQ_GFP_FLAGS (GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY)

Or just switch to:

#define BLK_MQ_GFP_FLAGS (GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY)

?

Any advice about this ?

Thanks very much,
BRs
Xiubo

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCHv2 0/2] blk-mq: Avoid memory reclaim when allocating request map
  2019-09-16  9:06 ` [PATCHv2 0/2] blk-mq: Avoid memory reclaim when allocating request map Christoph Hellwig
  2019-09-16 10:40   ` Xiubo Li
@ 2019-09-16 16:52   ` Jens Axboe
  2019-09-18 16:38     ` Christoph Hellwig
  2019-09-17 15:59   ` Bart Van Assche
  2 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2019-09-16 16:52 UTC (permalink / raw)
  To: Christoph Hellwig, xiubli; +Cc: josef, mchristi, linux-block

On 9/16/19 3:06 AM, Christoph Hellwig wrote:
> On Mon, Sep 16, 2019 at 07:46:29AM +0530, xiubli@redhat.com wrote:
>> From: Xiubo Li <xiubli@redhat.com>
>>
>> To make the patch more readable and cleaner I just split them into 2
>> small ones to address the issue from @Ming Lei, thanks very much.
> 
> I'd be much happier to just see memalloc_noio_save +
> memalloc_noio_restore calls in the right places over sprinkling even
> more magic GFP_NOIO arguments.

Ugh, I always thought those were kind of lame and band aiding around
places where people are too lazy to fix the path to the gfp args.
Or maybe areas where it's just feasible.

This is not one of those.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCHv2 0/2] blk-mq: Avoid memory reclaim when allocating request map
  2019-09-16  9:06 ` [PATCHv2 0/2] blk-mq: Avoid memory reclaim when allocating request map Christoph Hellwig
  2019-09-16 10:40   ` Xiubo Li
  2019-09-16 16:52   ` Jens Axboe
@ 2019-09-17 15:59   ` Bart Van Assche
  2 siblings, 0 replies; 9+ messages in thread
From: Bart Van Assche @ 2019-09-17 15:59 UTC (permalink / raw)
  To: Christoph Hellwig, xiubli; +Cc: josef, axboe, mchristi, linux-block

On 9/16/19 2:06 AM, Christoph Hellwig wrote:
> On Mon, Sep 16, 2019 at 07:46:29AM +0530, xiubli@redhat.com wrote:
>> From: Xiubo Li <xiubli@redhat.com>
>>
>> To make the patch more readable and cleaner I just split them into 2
>> small ones to address the issue from @Ming Lei, thanks very much.
> 
> I'd be much happier to just see memalloc_noio_save +
> memalloc_noio_restore calls in the right places over sprinkling even
> more magic GFP_NOIO arguments.

Hi Christoph,

My understanding is that memalloc_noio_save() and 
memalloc_noio_restore() have been introduced to avoid having to pass 
flags that specify the noio context along deep call chains. Are you sure 
it helps to use these functions if no other function than kmalloc() is 
called between the memalloc_noio_save() and memalloc_noio_restore() calls?

Thanks,

Bart.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCHv2 0/2] blk-mq: Avoid memory reclaim when allocating request map
  2019-09-16 16:52   ` Jens Axboe
@ 2019-09-18 16:38     ` Christoph Hellwig
  2019-09-25  7:05       ` Xiubo Li
  0 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2019-09-18 16:38 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Christoph Hellwig, xiubli, josef, mchristi, linux-block

On Mon, Sep 16, 2019 at 10:52:33AM -0600, Jens Axboe wrote:
> On 9/16/19 3:06 AM, Christoph Hellwig wrote:
> > On Mon, Sep 16, 2019 at 07:46:29AM +0530, xiubli@redhat.com wrote:
> >> From: Xiubo Li <xiubli@redhat.com>
> >>
> >> To make the patch more readable and cleaner I just split them into 2
> >> small ones to address the issue from @Ming Lei, thanks very much.
> > 
> > I'd be much happier to just see memalloc_noio_save +
> > memalloc_noio_restore calls in the right places over sprinkling even
> > more magic GFP_NOIO arguments.
> 
> Ugh, I always thought those were kind of lame and band aiding around
> places where people are too lazy to fix the path to the gfp args.
> Or maybe areas where it's just feasible.

The way I understood the discussion around the introduction of these
flags is that we want to phase out GFP_NOIO and GFP_NOFS in the long
run, given that ammending all calls is basically impossibly, while
marking contexts is pretty easy.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCHv2 0/2] blk-mq: Avoid memory reclaim when allocating request map
  2019-09-18 16:38     ` Christoph Hellwig
@ 2019-09-25  7:05       ` Xiubo Li
  0 siblings, 0 replies; 9+ messages in thread
From: Xiubo Li @ 2019-09-25  7:05 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe; +Cc: josef, mchristi, linux-block

On 2019/9/19 0:38, Christoph Hellwig wrote:
> On Mon, Sep 16, 2019 at 10:52:33AM -0600, Jens Axboe wrote:
>> On 9/16/19 3:06 AM, Christoph Hellwig wrote:
>>> On Mon, Sep 16, 2019 at 07:46:29AM +0530, xiubli@redhat.com wrote:
>>>> From: Xiubo Li <xiubli@redhat.com>
>>>>
>>>> To make the patch more readable and cleaner I just split them into 2
>>>> small ones to address the issue from @Ming Lei, thanks very much.
>>> I'd be much happier to just see memalloc_noio_save +
>>> memalloc_noio_restore calls in the right places over sprinkling even
>>> more magic GFP_NOIO arguments.
>> Ugh, I always thought those were kind of lame and band aiding around
>> places where people are too lazy to fix the path to the gfp args.
>> Or maybe areas where it's just feasible.
> The way I understood the discussion around the introduction of these
> flags is that we want to phase out GFP_NOIO and GFP_NOFS in the long
> run, given that ammending all calls is basically impossibly, while
> marking contexts is pretty easy.

This was also my understanding about the memalloc_xx stuff. By going 
through the code and the usage history of it, the best practice of using 
it seems as Bart mentioned for the possibly deep call chains.

Thanks



^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-09-25  7:05 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-16  2:16 [PATCHv2 0/2] blk-mq: Avoid memory reclaim when allocating request map xiubli
2019-09-16  2:16 ` [PATCHv2 1/2] " xiubli
2019-09-16  2:16 ` [PATCHv2 2/2] blk-mq: use BLK_MQ_GFP_FLAGS macro instead xiubli
2019-09-16  9:06 ` [PATCHv2 0/2] blk-mq: Avoid memory reclaim when allocating request map Christoph Hellwig
2019-09-16 10:40   ` Xiubo Li
2019-09-16 16:52   ` Jens Axboe
2019-09-18 16:38     ` Christoph Hellwig
2019-09-25  7:05       ` Xiubo Li
2019-09-17 15:59   ` Bart Van Assche

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).