From: Michal Hocko <mhocko@kernel.org>
To: Mikulas Patocka <mpatocka@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>, NeilBrown <neilb@suse.com>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org,
	"dm-devel@redhat.com David Rientjes" <rientjes@google.com>,
	Ondrej Kozina <okozina@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Douglas Anderson <dianders@chromium.org>,
	shli@kernel.org, Dmitry Torokhov <dmitry.torokhov@gmail.com>
Subject: Re: [dm-devel] [RFC PATCH 2/2] mm, mempool: do not throttle PF_LESS_THROTTLE tasks
Date: Thu, 24 Nov 2016 14:29:17 +0100	[thread overview]
Message-ID: <20161124132916.GF20668@dhcp22.suse.cz> (raw)
In-Reply-To: <alpine.LRH.2.02.1611231558420.31481@file01.intranet.prod.int.rdu2.redhat.com>

On Wed 23-11-16 16:11:59, Mikulas Patocka wrote:
[...]
> Hi Michal
> 
> So, here Google developers hit a stack trace where a block device driver
> is being throttled in memory management:
> 
> https://www.redhat.com/archives/dm-devel/2016-November/msg00158.html
> 
> The dm-bufio layer is something like a buffer cache, used by block device 
> drivers. Unlike the real buffer cache, dm-bufio guarantees forward 
> progress even if there is no free memory.
> 
> dm-bufio does something similar to a mempool allocation: it tries an 
> allocation with GFP_NOIO | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN 
> (just like a mempool), and if that fails, it reuses some existing buffer.
> 
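
Just so we are talking about the same pattern, here is a minimal sketch of
the allocate-or-reuse logic described above (the helper names are made up
for illustration; this is not the actual drivers/md/dm-bufio.c code):

static void *bufio_get_buffer_data(struct dm_bufio_client *c,
				   unsigned int order)
{
	void *data;

	/* Opportunistic attempt: fail fast and stay out of the reserves. */
	data = (void *)__get_free_pages(GFP_NOIO | __GFP_NORETRY |
					__GFP_NOMEMALLOC | __GFP_NOWARN,
					order);
	if (data)
		return data;

	/* Fall back to recycling a buffer the client already owns. */
	return reuse_some_existing_buffer(c);
}
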
> Here, they caught it being throttled in memory management:
> 
>    Workqueue: kverityd verity_prefetch_io
>    __switch_to+0x9c/0xa8
>    __schedule+0x440/0x6d8
>    schedule+0x94/0xb4
>    schedule_timeout+0x204/0x27c
>    schedule_timeout_uninterruptible+0x44/0x50
>    wait_iff_congested+0x9c/0x1f0
>    shrink_inactive_list+0x3a0/0x4cc
>    shrink_lruvec+0x418/0x5cc
>    shrink_zone+0x88/0x198
>    try_to_free_pages+0x51c/0x588
>    __alloc_pages_nodemask+0x648/0xa88
>    __get_free_pages+0x34/0x7c
>    alloc_buffer+0xa4/0x144
>    __bufio_new+0x84/0x278
>    dm_bufio_prefetch+0x9c/0x154
>    verity_prefetch_io+0xe8/0x10c
>    process_one_work+0x240/0x424
>    worker_thread+0x2fc/0x424
>    kthread+0x10c/0x114
> 
> Will you consider removing vm throttling for __GFP_NORETRY allocations?

As I've already said before, I do not think that tweaking __GFP_NORETRY
is the right approach. The whole point of the flag is to not loop in the
_allocator_; it has nothing to do with reclaim or with the way reclaim
does its throttling.

On the other hand, I perfectly understand your point, and the lack of
anything between GFP_NOWAIT and __GFP_DIRECT_RECLAIM can be a bit
frustrating. It would be nice to have some middle ground - only light
reclaim involved, with a quick back-off if the memory is harder to
reclaim. That is a hard thing to do, though, because all the reclaimers
(including slab shrinkers) would have to be aware of this concept to
work properly.

I have read the report from the link above and I am really wondering why
s@GFP_NOIO@GFP_NOWAIT@ is not the right way to go there. You have argued
that a clean page cache would force buffer reuse. That might be true
to some extent, but is it a real problem? Please note that even
GFP_NOWAIT allocations will wake up kswapd, which should clean up that
clean page cache in the background. I would even expect kswapd to be
active by the time NOWAIT requests hit the min watermark. If that is
not the case then we should probably think about why kswapd is not
proactive enough rather than tweaking the __GFP_NORETRY semantics.
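
To be concrete, against the sketch earlier in this mail (again using
hypothetical helper names, not the real dm-bufio code), what I have in
mind would look roughly like:

	/*
	 * GFP_NOWAIT: no direct reclaim, so no wait_iff_congested()
	 * throttling on the allocating thread; kswapd is still woken
	 * up to push clean page cache out in the background.
	 */
	data = (void *)__get_free_pages(GFP_NOWAIT | __GFP_NOMEMALLOC |
					__GFP_NOWARN, order);
	if (!data)
		return reuse_some_existing_buffer(c);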

Thanks!
-- 
Michal Hocko
SUSE Labs
