* [PATCH 0/3] indirectly reclaimable memory
From: Roman Gushchin @ 2018-03-05 13:37 UTC
To: linux-mm
Cc: Roman Gushchin, Andrew Morton, Alexander Viro, Michal Hocko,
    Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team

This patch set introduces the concept of indirectly reclaimable memory
and applies it to fix the issue where a large number of dentries with
external names can significantly affect the MemAvailable value.

v2:
  1) removed comments specific to unreclaimable slabs
  2) split into 3 patches

v1:
  https://lkml.org/lkml/2018/3/1/961

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: kernel-team@fb.com

Roman Gushchin (3):
  mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES
  mm: treat indirectly reclaimable memory as available in MemAvailable
  dcache: account external names as indirectly reclaimable memory

 fs/dcache.c            | 29 ++++++++++++++++++++++++-----
 include/linux/mmzone.h |  1 +
 mm/page_alloc.c        |  7 +++++++
 mm/vmstat.c            |  1 +
 4 files changed, 33 insertions(+), 5 deletions(-)

-- 
2.14.3
* [PATCH 1/3] mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES
From: Roman Gushchin @ 2018-03-05 13:37 UTC
To: linux-mm
Cc: Roman Gushchin, Andrew Morton, Alexander Viro, Michal Hocko,
    Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team

This patch introduces a concept of indirectly reclaimable memory
and adds the corresponding memory counter and /proc/vmstat item.

Indirectly reclaimable memory is any sort of memory, used by
the kernel (except for reclaimable slabs), which is actually
reclaimable, i.e. will be released under memory pressure.

The counter is in bytes, as it's not always possible to
count such objects in pages. The name contains BYTES
by analogy to NR_KERNEL_STACK_KB.

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: kernel-team@fb.com
---
 include/linux/mmzone.h | 1 +
 mm/vmstat.c            | 1 +
 2 files changed, 2 insertions(+)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index e09fe563d5dc..15e783f29e21 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -180,6 +180,7 @@ enum node_stat_item {
 	NR_VMSCAN_IMMEDIATE,	/* Prioritise for reclaim when writeback ends */
 	NR_DIRTIED,		/* page dirtyings since bootup */
 	NR_WRITTEN,		/* page writings since bootup */
+	NR_INDIRECTLY_RECLAIMABLE_BYTES, /* measured in bytes */
 	NR_VM_NODE_STAT_ITEMS
 };
 
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 40b2db6db6b1..b6b5684f31fe 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1161,6 +1161,7 @@ const char * const vmstat_text[] = {
 	"nr_vmscan_immediate_reclaim",
 	"nr_dirtied",
 	"nr_written",
+	"nr_indirectly_reclaimable",
 
 	/* enum writeback_stat_item counters */
 	"nr_dirty_threshold",
-- 
2.14.3
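Because the new counter is kept in bytes while the neighbouring counters are in pages, any consumer has to convert before mixing the two. The following is only a userspace sketch of that arithmetic, not the code from this series: the names `SKETCH_PAGE_SHIFT` and `sketch_available_pages()` are hypothetical, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel takes PAGE_SHIFT from the architecture. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical helper: fold a byte-granular "indirectly reclaimable"
 * counter into a page-granular estimate of available memory.
 * Partial pages are rounded down, so the estimate stays conservative.
 */
static uint64_t sketch_available_pages(uint64_t available_pages,
				       uint64_t indirect_bytes)
{
	uint64_t extra = indirect_bytes >> SKETCH_PAGE_SHIFT;

	/* Guard against wrap-around of the page total. */
	assert(available_pages <= UINT64_MAX - extra);
	return available_pages + extra;
}
```

Rounding down means that a few kilobytes of external names never inflate the estimate by a whole page.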
* Re: [PATCH 1/3] mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES
From: Vlastimil Babka @ 2018-04-11 13:16 UTC
To: Roman Gushchin, linux-mm
Cc: Andrew Morton, Alexander Viro, Michal Hocko, Johannes Weiner,
    linux-fsdevel, linux-kernel, kernel-team, Linux API

[+CC linux-api]

On 03/05/2018 02:37 PM, Roman Gushchin wrote:
> This patch introduces a concept of indirectly reclaimable memory
> and adds the corresponding memory counter and /proc/vmstat item.
>
> Indirectly reclaimable memory is any sort of memory, used by
> the kernel (except for reclaimable slabs), which is actually
> reclaimable, i.e. will be released under memory pressure.
>
> The counter is in bytes, as it's not always possible to
> count such objects in pages. The name contains BYTES
> by analogy to NR_KERNEL_STACK_KB.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: linux-fsdevel@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-mm@kvack.org
> Cc: kernel-team@fb.com

Hmm, looks like I'm late and this user-visible API change was just
merged. But it's for rc1, so we can still change it, hopefully?

One problem I see with the counter is that it's in bytes, but it sits
among counters that use pages, and the name doesn't indicate it. Then,
I don't see why users should care about the "indirectly" part, as that's
just an implementation detail. It is reclaimable and that's what
matters, right? (I also wanted to complain about lack of
Documentation/... update, but looks like there's no general file about
vmstat, ugh)

I also kind of liked the idea from v1 rfc posting that there would be a
separate set of reclaimable kmalloc-X caches for these kinds of
allocations. Besides accounting, it should also help reduce memory
fragmentation. The right variant of cache would be detected via
__GFP_RECLAIMABLE.

With that in mind, can we at least for now put the (manually maintained)
byte counter in a variable that's not directly exposed via /proc/vmstat,
and then when printing nr_slab_reclaimable, simply add the value
(divided by PAGE_SIZE), and when printing nr_slab_unreclaimable,
subtract the same value? This way we would be simply making the existing
counters more precise, in line with their semantics.

Thoughts?
Vlastimil

> ---
>  include/linux/mmzone.h | 1 +
>  mm/vmstat.c            | 1 +
>  2 files changed, 2 insertions(+)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index e09fe563d5dc..15e783f29e21 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -180,6 +180,7 @@ enum node_stat_item {
>  	NR_VMSCAN_IMMEDIATE,	/* Prioritise for reclaim when writeback ends */
>  	NR_DIRTIED,		/* page dirtyings since bootup */
>  	NR_WRITTEN,		/* page writings since bootup */
> +	NR_INDIRECTLY_RECLAIMABLE_BYTES, /* measured in bytes */
>  	NR_VM_NODE_STAT_ITEMS
>  };
>
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index 40b2db6db6b1..b6b5684f31fe 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -1161,6 +1161,7 @@ const char * const vmstat_text[] = {
>  	"nr_vmscan_immediate_reclaim",
>  	"nr_dirtied",
>  	"nr_written",
> +	"nr_indirectly_reclaimable",
>
>  	/* enum writeback_stat_item counters */
>  	"nr_dirty_threshold",
>
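The adjustment proposed in the message above (keep a hidden byte counter, then add it to the reclaimable slab figure and subtract it from the unreclaimable one at print time) can be sketched in plain C. This is an illustration of the idea only, under assumptions: `struct sketch_slab_stats`, `sketch_adjust_slab_stats()` and the 4 KiB `SKETCH_PAGE_SIZE` are hypothetical names and values, and the clamp against underflow is an addition the thread does not discuss.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; the kernel uses the architecture's PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Hypothetical snapshot of the two slab counters, both in pages. */
struct sketch_slab_stats {
	uint64_t reclaimable_pages;
	uint64_t unreclaimable_pages;
};

/*
 * Move the pages covered by a hidden byte counter from the
 * "unreclaimable" figure to the "reclaimable" one at print time.
 */
static struct sketch_slab_stats
sketch_adjust_slab_stats(struct sketch_slab_stats raw, uint64_t hidden_bytes)
{
	struct sketch_slab_stats out = raw;
	uint64_t pages = hidden_bytes / SKETCH_PAGE_SIZE;

	/* Never subtract more than is actually reported as unreclaimable. */
	if (pages > raw.unreclaimable_pages)
		pages = raw.unreclaimable_pages;

	out.reclaimable_pages += pages;
	out.unreclaimable_pages -= pages;

	/* The adjustment only reclassifies pages; the total is unchanged. */
	assert(out.reclaimable_pages + out.unreclaimable_pages ==
	       raw.reclaimable_pages + raw.unreclaimable_pages);
	return out;
}
```

The sum of the two fields is preserved, so a reader of the printed values would see the same slab total as before and only a different split.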
* Re: [PATCH 1/3] mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES
From: Roman Gushchin @ 2018-04-11 13:56 UTC
To: Vlastimil Babka
Cc: linux-mm, Andrew Morton, Alexander Viro, Michal Hocko,
    Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team, Linux API

On Wed, Apr 11, 2018 at 03:16:08PM +0200, Vlastimil Babka wrote:
> [+CC linux-api]
>
> On 03/05/2018 02:37 PM, Roman Gushchin wrote:
> > This patch introduces a concept of indirectly reclaimable memory
> > and adds the corresponding memory counter and /proc/vmstat item.
> >
> > Indirectly reclaimable memory is any sort of memory, used by
> > the kernel (except for reclaimable slabs), which is actually
> > reclaimable, i.e. will be released under memory pressure.
> >
> > The counter is in bytes, as it's not always possible to
> > count such objects in pages. The name contains BYTES
> > by analogy to NR_KERNEL_STACK_KB.
> >
> > Signed-off-by: Roman Gushchin <guro@fb.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> > Cc: Michal Hocko <mhocko@suse.com>
> > Cc: Johannes Weiner <hannes@cmpxchg.org>
> > Cc: linux-fsdevel@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > Cc: linux-mm@kvack.org
> > Cc: kernel-team@fb.com
>
> Hmm, looks like I'm late and this user-visible API change was just
> merged. But it's for rc1, so we can still change it, hopefully?
>
> One problem I see with the counter is that it's in bytes, but it sits
> among counters that use pages, and the name doesn't indicate it.

Here I just followed the "nr_kernel_stack" path, which is measured in kB,
but this is not mentioned in the field name.

> Then, I don't
> see why users should care about the "indirectly" part, as that's just an
> implementation detail. It is reclaimable and that's what matters, right?
> (I also wanted to complain about lack of Documentation/... update, but
> looks like there's no general file about vmstat, ugh)

I agree that it's a bit weird, and it's probably better to not expose
it at all; but this is how all vm counters work. We do expose them all
in /proc/vmstat. A good number of them are useless unless you are an
mm developer, so it's arguably more "debug info" than "api".
It's definitely not a reason to make them messy.
Does "nr_indirectly_reclaimable_bytes" look better to you?

>
> I also kind of liked the idea from v1 rfc posting that there would be a
> separate set of reclaimable kmalloc-X caches for these kinds of
> allocations. Besides accounting, it should also help reduce memory
> fragmentation. The right variant of cache would be detected via
> __GFP_RECLAIMABLE.

Well, the downside is that we have to introduce X new caches
just for this particular problem. I'm not strictly against the idea,
but not convinced that it's much better.

>
> With that in mind, can we at least for now put the (manually maintained)
> byte counter in a variable that's not directly exposed via /proc/vmstat,
> and then when printing nr_slab_reclaimable, simply add the value
> (divided by PAGE_SIZE), and when printing nr_slab_unreclaimable,
> subtract the same value? This way we would be simply making the existing
> counters more precise, in line with their semantics.

Idk, I don't like the idea of adding a counter outside of the vm counters
infrastructure, and I definitely wouldn't touch the exposed
nr_slab_reclaimable and nr_slab_unreclaimable fields.

We do have some stats in /proc/slabinfo, /proc/meminfo and /sys/kernel/slab
and I think that we should keep them consistent.

Thanks!

>
> Thoughts?
> Vlastimil
>
> > ---
> >  include/linux/mmzone.h | 1 +
> >  mm/vmstat.c            | 1 +
> >  2 files changed, 2 insertions(+)
> >
> > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > index e09fe563d5dc..15e783f29e21 100644
> > --- a/include/linux/mmzone.h
> > +++ b/include/linux/mmzone.h
> > @@ -180,6 +180,7 @@ enum node_stat_item {
> >  	NR_VMSCAN_IMMEDIATE,	/* Prioritise for reclaim when writeback ends */
> >  	NR_DIRTIED,		/* page dirtyings since bootup */
> >  	NR_WRITTEN,		/* page writings since bootup */
> > +	NR_INDIRECTLY_RECLAIMABLE_BYTES, /* measured in bytes */
> >  	NR_VM_NODE_STAT_ITEMS
> >  };
> >
> > diff --git a/mm/vmstat.c b/mm/vmstat.c
> > index 40b2db6db6b1..b6b5684f31fe 100644
> > --- a/mm/vmstat.c
> > +++ b/mm/vmstat.c
> > @@ -1161,6 +1161,7 @@ const char * const vmstat_text[] = {
> >  	"nr_vmscan_immediate_reclaim",
> >  	"nr_dirtied",
> >  	"nr_written",
> > +	"nr_indirectly_reclaimable",
> >
> >  	/* enum writeback_stat_item counters */
> >  	"nr_dirty_threshold",
> >
>
* Re: [PATCH 1/3] mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES
From: Vlastimil Babka @ 2018-04-12 6:52 UTC
To: Roman Gushchin
Cc: linux-mm, Andrew Morton, Alexander Viro, Michal Hocko,
    Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team, Linux API

On 04/11/2018 03:56 PM, Roman Gushchin wrote:
> On Wed, Apr 11, 2018 at 03:16:08PM +0200, Vlastimil Babka wrote:
>> [+CC linux-api]
>>
>> On 03/05/2018 02:37 PM, Roman Gushchin wrote:
>>> This patch introduces a concept of indirectly reclaimable memory
>>> and adds the corresponding memory counter and /proc/vmstat item.
>>>
>>> Indirectly reclaimable memory is any sort of memory, used by
>>> the kernel (except for reclaimable slabs), which is actually
>>> reclaimable, i.e. will be released under memory pressure.
>>>
>>> The counter is in bytes, as it's not always possible to
>>> count such objects in pages. The name contains BYTES
>>> by analogy to NR_KERNEL_STACK_KB.
>>>
>>> Signed-off-by: Roman Gushchin <guro@fb.com>
>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
>>> Cc: Michal Hocko <mhocko@suse.com>
>>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>>> Cc: linux-fsdevel@vger.kernel.org
>>> Cc: linux-kernel@vger.kernel.org
>>> Cc: linux-mm@kvack.org
>>> Cc: kernel-team@fb.com
>>
>> Hmm, looks like I'm late and this user-visible API change was just
>> merged. But it's for rc1, so we can still change it, hopefully?
>>
>> One problem I see with the counter is that it's in bytes, but it sits
>> among counters that use pages, and the name doesn't indicate it.
>
> Here I just followed the "nr_kernel_stack" path, which is measured in kB,
> but this is not mentioned in the field name.

Oh, didn't know. Bad example to follow :P

>> Then, I don't
>> see why users should care about the "indirectly" part, as that's just an
>> implementation detail. It is reclaimable and that's what matters, right?
>> (I also wanted to complain about lack of Documentation/... update, but
>> looks like there's no general file about vmstat, ugh)
>
> I agree that it's a bit weird, and it's probably better to not expose
> it at all; but this is how all vm counters work. We do expose them all
> in /proc/vmstat. A good number of them are useless unless you are an
> mm developer, so it's arguably more "debug info" than "api".

Yeah, the problem is that once tools start relying on them, they fall
under the "do not break userspace" rule, whatever we call them. So being
cautious and conservative can't hurt.

> It's definitely not a reason to make them messy.
> Does "nr_indirectly_reclaimable_bytes" look better to you?

It still has the "indirectly" part and feels arbitrary :/

>>
>> I also kind of liked the idea from v1 rfc posting that there would be a
>> separate set of reclaimable kmalloc-X caches for these kinds of
>> allocations. Besides accounting, it should also help reduce memory
>> fragmentation. The right variant of cache would be detected via
>> __GFP_RECLAIMABLE.
>
> Well, the downside is that we have to introduce X new caches
> just for this particular problem. I'm not strictly against the idea,
> but not convinced that it's much better.

Maybe we can find more cases that would benefit from it. Heck, even slab
itself allocates some management structures from the generic kmalloc
caches, and if they are used for reclaimable caches, they could be
tracked as reclaimable as well.

>>
>> With that in mind, can we at least for now put the (manually maintained)
>> byte counter in a variable that's not directly exposed via /proc/vmstat,
>> and then when printing nr_slab_reclaimable, simply add the value
>> (divided by PAGE_SIZE), and when printing nr_slab_unreclaimable,
>> subtract the same value? This way we would be simply making the existing
>> counters more precise, in line with their semantics.
>
> Idk, I don't like the idea of adding a counter outside of the vm counters
> infrastructure, and I definitely wouldn't touch the exposed
> nr_slab_reclaimable and nr_slab_unreclaimable fields.

We would be just making the reported values more precise wrt reality.

> We do have some stats in /proc/slabinfo, /proc/meminfo and /sys/kernel/slab
> and I think that we should keep them consistent.

Right, meminfo would be adjusted the same way. slabinfo doesn't indicate
which caches are reclaimable, so there will be no change.
/sys/kernel/slab/cache/reclaim_account does, but I doubt anything will
break.

> Thanks!
>
>>
>> Thoughts?
>> Vlastimil
>>
>>> ---
>>>  include/linux/mmzone.h | 1 +
>>>  mm/vmstat.c            | 1 +
>>>  2 files changed, 2 insertions(+)
>>>
>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>>> index e09fe563d5dc..15e783f29e21 100644
>>> --- a/include/linux/mmzone.h
>>> +++ b/include/linux/mmzone.h
>>> @@ -180,6 +180,7 @@ enum node_stat_item {
>>>  	NR_VMSCAN_IMMEDIATE,	/* Prioritise for reclaim when writeback ends */
>>>  	NR_DIRTIED,		/* page dirtyings since bootup */
>>>  	NR_WRITTEN,		/* page writings since bootup */
>>> +	NR_INDIRECTLY_RECLAIMABLE_BYTES, /* measured in bytes */
>>>  	NR_VM_NODE_STAT_ITEMS
>>>  };
>>>
>>> diff --git a/mm/vmstat.c b/mm/vmstat.c
>>> index 40b2db6db6b1..b6b5684f31fe 100644
>>> --- a/mm/vmstat.c
>>> +++ b/mm/vmstat.c
>>> @@ -1161,6 +1161,7 @@ const char * const vmstat_text[] = {
>>>  	"nr_vmscan_immediate_reclaim",
>>>  	"nr_dirtied",
>>>  	"nr_written",
>>> +	"nr_indirectly_reclaimable",
>>>
>>>  	/* enum writeback_stat_item counters */
>>>  	"nr_dirty_threshold",
>>>
>>
>
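The concern that tools come to depend on counter names can be made concrete with a small userspace sketch. It assumes only that /proc/vmstat-style text consists of `name value` lines; the function name `sketch_vmstat_lookup()` is hypothetical. The point is that a careful tool treats a missing name as a normal case, since a counter may be renamed or dropped.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: find "name value" in vmstat-style text.
 * Returns 1 and stores the value if the counter is present,
 * 0 if it is absent (for example because it was renamed or removed).
 */
static int sketch_vmstat_lookup(const char *text, const char *name,
				unsigned long long *value)
{
	const char *line = text;
	size_t name_len;

	assert(text && name && value);
	name_len = strlen(name);

	while (*line) {
		const char *eol = strchr(line, '\n');

		/* Require the full name followed by a space, not just a prefix. */
		if (strncmp(line, name, name_len) == 0 &&
		    line[name_len] == ' ' &&
		    sscanf(line + name_len, "%llu", value) == 1)
			return 1;

		if (!eol)
			break;
		line = eol + 1;
	}
	return 0;
}
```

A caller would read the file into a buffer, look up the names it knows about, and fall back gracefully when the lookup returns 0 instead of failing outright.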
* Re: [PATCH 1/3] mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES
From: Michal Hocko @ 2018-04-12 11:52 UTC
To: Vlastimil Babka
Cc: Roman Gushchin, linux-mm, Andrew Morton, Alexander Viro,
    Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team, Linux API

On Thu 12-04-18 08:52:52, Vlastimil Babka wrote:
> On 04/11/2018 03:56 PM, Roman Gushchin wrote:
> > On Wed, Apr 11, 2018 at 03:16:08PM +0200, Vlastimil Babka wrote:
[...]
> >> With that in mind, can we at least for now put the (manually maintained)
> >> byte counter in a variable that's not directly exposed via /proc/vmstat,
> >> and then when printing nr_slab_reclaimable, simply add the value
> >> (divided by PAGE_SIZE), and when printing nr_slab_unreclaimable,
> >> subtract the same value? This way we would be simply making the existing
> >> counters more precise, in line with their semantics.
> >
> > Idk, I don't like the idea of adding a counter outside of the vm counters
> > infrastructure, and I definitely wouldn't touch the exposed
> > nr_slab_reclaimable and nr_slab_unreclaimable fields.

Why?

> We would be just making the reported values more precise wrt reality.

I was suggesting something similar in an earlier discussion. I am not
really happy about the new exposed counter either. It is arbitrary by
name yet very specific to this particular use case.

What is a poor user supposed to do with the new counter? Can this be
used for any calculations?
-- 
Michal Hocko
SUSE Labs
* Re: [PATCH 1/3] mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES
From: Roman Gushchin @ 2018-04-12 14:38 UTC
To: Michal Hocko
Cc: Vlastimil Babka, linux-mm, Andrew Morton, Alexander Viro,
    Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team, Linux API

On Thu, Apr 12, 2018 at 01:52:17PM +0200, Michal Hocko wrote:
> On Thu 12-04-18 08:52:52, Vlastimil Babka wrote:
> > On 04/11/2018 03:56 PM, Roman Gushchin wrote:
> > > On Wed, Apr 11, 2018 at 03:16:08PM +0200, Vlastimil Babka wrote:
> [...]
> > >> With that in mind, can we at least for now put the (manually maintained)
> > >> byte counter in a variable that's not directly exposed via /proc/vmstat,
> > >> and then when printing nr_slab_reclaimable, simply add the value
> > >> (divided by PAGE_SIZE), and when printing nr_slab_unreclaimable,
> > >> subtract the same value? This way we would be simply making the existing
> > >> counters more precise, in line with their semantics.
> > >
> > > Idk, I don't like the idea of adding a counter outside of the vm counters
> > > infrastructure, and I definitely wouldn't touch the exposed
> > > nr_slab_reclaimable and nr_slab_unreclaimable fields.
>
> Why?

Both nr_slab_reclaimable and nr_slab_unreclaimable have a very simple
meaning: they are the numbers of pages used by the corresponding slab
caches.

In reply to the very first version of this patchset, Andrew suggested
generalizing the idea to allow further accounting of non-kmalloc()
allocations. I like the idea, even if I don't have a good example right
now.

The problem with external names existed for many years before we
accidentally hit it, so the fact that we don't have other examples right
now doesn't mean that we won't have them in the future.

>
> > We would be just making the reported values more precise wrt reality.
>
> I was suggesting something similar in an earlier discussion. I am not
> really happy about the new exposed counter either. It is arbitrary by
> name yet very specific to this particular use case.
>
> What is a poor user supposed to do with the new counter? Can this be
> used for any calculations?

For me the most important part is to fix the overcommit logic, because
it's a real security and production issue. Adjusting MemAvailable is
important too.

I'm really open here to any concrete suggestions on how to do it without
exporting a new value, and without adding too much complexity to the
code (e.g. skipping this particular mm counter on printing would be
quite messy).

Thanks!
* Re: [PATCH 1/3] mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES
  2018-04-12 14:38 ` Roman Gushchin
  (?)
@ 2018-04-12 14:46 ` Michal Hocko
  -1 siblings, 0 replies; 61+ messages in thread
From: Michal Hocko @ 2018-04-12 14:46 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Vlastimil Babka, linux-mm, Andrew Morton, Alexander Viro, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team, Linux API

On Thu 12-04-18 15:38:33, Roman Gushchin wrote:
> On Thu, Apr 12, 2018 at 01:52:17PM +0200, Michal Hocko wrote:
> > On Thu 12-04-18 08:52:52, Vlastimil Babka wrote:
> > > On 04/11/2018 03:56 PM, Roman Gushchin wrote:
> > > > On Wed, Apr 11, 2018 at 03:16:08PM +0200, Vlastimil Babka wrote:
> > [...]
> > > >> With that in mind, can we at least for now put the (manually maintained)
> > > >> byte counter in a variable that's not directly exposed via /proc/vmstat,
> > > >> and then when printing nr_slab_reclaimable, simply add the value
> > > >> (divided by PAGE_SIZE), and when printing nr_slab_unreclaimable,
> > > >> subtract the same value. This way we would be simply making the existing
> > > >> counters more precise, in line with their semantics.
> > > >
> > > > Idk, I don't like the idea of adding a counter outside of the vm counters
> > > > infrastructure, and I definitely wouldn't touch the exposed
> > > > nr_slab_reclaimable and nr_slab_unreclaimable fields.
> >
> > Why?
>
> Both nr_slab_reclaimable and nr_slab_unreclaimable have a very simple
> meaning: they are numbers of pages used by corresponding slab caches.

Right, but if names are reclaimable then they should end up in the
reclaimable slabs and be accounted as such. Objects themselves are not
sufficient to reclaim the accounted memory.

> In the answer to the very first version of this patchset
> Andrew suggested to generalize the idea to allow further
> accounting of non-kmalloc() allocations.
> I like the idea, even if don't have a good example right now.

Well, I have to disagree here. It sounds completely ad-hoc without a
reasonable semantic. Or how does it help users when they do not know
what the indirect dependency is and how to trigger it?

> The problem with external names existed for many years before
> we've accidentally hit it, so if we don't have other examples
> right now, it doesn't mean that we wouldn't have them in the future.
>
> >
> > > We would be just making the reported values more precise wrt reality.
> >
> > I was suggesting something similar in an earlier discussion. I am not
> > really happy about the new exposed counter either. It is just arbitrary
> > by name yet very specific for this particular usecase.
> >
> > What is a poor user supposed to do with the new counter? Can this be
> > used for any calculations?
>
> For me the most important part is to fix the overcommit logic, because it's
> a real security and production issue.

Sure, the problem is ugly. Not the first one when the unaccounted kernel
allocation can eat a lot of memory. We have many other such. The usual
answer was to use kmemcg accounting.
--
Michal Hocko
SUSE Labs

^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 1/3] mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES
  2018-04-12  6:52 ` Vlastimil Babka
@ 2018-04-12 14:57 ` Roman Gushchin
  2018-04-12 14:57 ` Roman Gushchin
  1 sibling, 0 replies; 61+ messages in thread
From: Roman Gushchin @ 2018-04-12 14:57 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: linux-mm, Andrew Morton, Alexander Viro, Michal Hocko, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team, Linux API

On Thu, Apr 12, 2018 at 08:52:52AM +0200, Vlastimil Babka wrote:
> On 04/11/2018 03:56 PM, Roman Gushchin wrote:
> > On Wed, Apr 11, 2018 at 03:16:08PM +0200, Vlastimil Babka wrote:
> >> [+CC linux-api]
> >>
> >> On 03/05/2018 02:37 PM, Roman Gushchin wrote:
> >>> This patch introduces a concept of indirectly reclaimable memory
> >>> and adds the corresponding memory counter and /proc/vmstat item.
> >>>
> >>> Indirectly reclaimable memory is any sort of memory, used by
> >>> the kernel (except of reclaimable slabs), which is actually
> >>> reclaimable, i.e. will be released under memory pressure.
> >>>
> >>> The counter is in bytes, as it's not always possible to
> >>> count such objects in pages. The name contains BYTES
> >>> by analogy to NR_KERNEL_STACK_KB.
> >>>
> >>> Signed-off-by: Roman Gushchin <guro@fb.com>
> >>> Cc: Andrew Morton <akpm@linux-foundation.org>
> >>> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> >>> Cc: Michal Hocko <mhocko@suse.com>
> >>> Cc: Johannes Weiner <hannes@cmpxchg.org>
> >>> Cc: linux-fsdevel@vger.kernel.org
> >>> Cc: linux-kernel@vger.kernel.org
> >>> Cc: linux-mm@kvack.org
> >>> Cc: kernel-team@fb.com
> >>
> >> Hmm, looks like I'm late and this user-visible API change was just
> >> merged. But it's for rc1, so we can still change it, hopefully?
> >>
> >> One problem I see with the counter is that it's in bytes, but among
> >> counters that use pages, and the name doesn't indicate it.
> >
> > Here I just followed "nr_kernel_stack" path, which is measured in kB,
> > but this is not mentioned in the field name.
>
> Oh, didn't know. Bad example to follow :P
>
> >> Then, I don't
> >> see why users should care about the "indirectly" part, as that's just an
> >> implementation detail. It is reclaimable and that's what matters, right?
> >> (I also wanted to complain about lack of Documentation/... update, but
> >> looks like there's no general file about vmstat, ugh)
> >
> > I agree, that it's a bit weird, and it's probably better to not expose
> > it at all; but this is how all vm counters work. We do expose them all
> > in /proc/vmstat. A good number of them is useless until you are not a
> > mm developer, so it's arguable more "debug info" rather than "api".
>
> Yeah the problem is that once tools start rely on them, they fall under
> the "do not break userspace" rule, however we call them. So being
> cautious and conservative can't hurt.
>
> > It's definitely not a reason to make them messy.
> > Does "nr_indirectly_reclaimable_bytes" look better to you?
>
> It still has has the "indirecly" part and feels arbitrary :/
>
> >>
> >> I also kind of liked the idea from v1 rfc posting that there would be a
> >> separate set of reclaimable kmalloc-X caches for these kind of
> >> allocations. Besides accounting, it should also help reduce memory
> >> fragmentation. The right variant of cache would be detected via
> >> __GFP_RECLAIMABLE.
> >
> > Well, the downside is that we have to introduce X new caches
> > just for this particular problem. I'm not strictly against the idea,
> > but not convinced that it's much better.
>
> Maybe we can find more cases that would benefit from it. Heck, even slab
> itself allocates some management structures from the generic kmalloc
> caches, and if they are used for reclaimable caches, they could be
> tracked as reclaimable as well.

This is a good catch!

>
> >>
> >> With that in mind, can we at least for now put the (manually maintained)
> >> byte counter in a variable that's not directly exposed via /proc/vmstat,
> >> and then when printing nr_slab_reclaimable, simply add the value
> >> (divided by PAGE_SIZE), and when printing nr_slab_unreclaimable,
> >> subtract the same value. This way we would be simply making the existing
> >> counters more precise, in line with their semantics.
> >
> > Idk, I don't like the idea of adding a counter outside of the vm counters
> > infrastructure, and I definitely wouldn't touch the exposed
> > nr_slab_reclaimable and nr_slab_unreclaimable fields.
>
> We would be just making the reported values more precise wrt reality.

It depends on if we believe that only slab memory can be reclaimable
or not. If yes, this is true, otherwise not.

My guess is that some drivers (e.g. networking) might have buffers,
which are reclaimable under mempressure, and are allocated using
the page allocator. But I have to look closer...

> > We do have some stats in /proc/slabinfo, /proc/meminfo and /sys/kernel/slab
> > and I think that we should keep it consistent.
>
> Right, meminfo would be adjusted the same. slabinfo doesn't indicate
> which caches are reclaimable, so there will be no change.
> /sys/kernel/slab/cache/reclaim_account does, but I doubt anything will
> break.

It can also be found from the corresponding directory name in sysfs:

$ ls -la /sys/kernel/slab/dentr*
lrwxrwxrwx. 1 root root 0 Apr 11 14:45 /sys/kernel/slab/dentry -> :aA-0000192
                                                                   ^ this is the "reclaimable" flag

Not saying that something will break.

Thanks!

^ permalink raw reply [flat|nested] 61+ messages in thread
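The sysfs hint above can be checked mechanically. The following is a small userspace sketch, assuming (as the caret in the listing suggests) that the flag letters sit between the leading ':' and the '-' of the link target and that 'a' marks a reclaim-accounted cache. The function name alias_is_reclaimable() is hypothetical, and the other letters are simply ignored here.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: given the target of a /sys/kernel/slab/<name> symlink
 * such as ":aA-0000192", report whether the flag section contains 'a'.
 * Anything not starting with ':' is treated as "no flag information".
 */
static bool alias_is_reclaimable(const char *alias)
{
	if (alias == NULL || alias[0] != ':')
		return false;

	/* Scan the flag letters; stop at '-' (start of the size) or at the end. */
	for (const char *p = alias + 1; *p != '\0' && *p != '-'; p++) {
		if (*p == 'a')
			return true;
	}
	return false;
}
```

A caller would obtain the string with readlink(2) on the cache's sysfs entry. Reading the cache's reclaim_account attribute, mentioned in the quoted text, is the more direct route where it is available.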
* Re: [PATCH 1/3] mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES
  2018-04-12 14:57 ` Roman Gushchin
  (?)
@ 2018-04-13  6:59 ` Michal Hocko
  -1 siblings, 0 replies; 61+ messages in thread
From: Michal Hocko @ 2018-04-13 6:59 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Vlastimil Babka, linux-mm, Andrew Morton, Alexander Viro, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team, Linux API

On Thu 12-04-18 15:57:03, Roman Gushchin wrote:
> On Thu, Apr 12, 2018 at 08:52:52AM +0200, Vlastimil Babka wrote:
[...]
> > We would be just making the reported values more precise wrt reality.
>
> It depends on if we believe that only slab memory can be reclaimable
> or not. If yes, this is true, otherwise not.
>
> My guess is that some drivers (e.g. networking) might have buffers,
> which are reclaimable under mempressure, and are allocated using
> the page allocator. But I have to look closer...

Well, we have many direct page allocator users which are not accounted
in vmstat. Some of those use their specific accounting (e.g. network
buffers, some fs metadata and many others). In the ideal world MM layer
would know about those but...

Anyway, this particular case is quite clear, no? We _use_ kmalloc so
this is slab allocator. We just misaccount it.
--
Michal Hocko
SUSE Labs

^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 1/3] mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES
  2018-04-12 14:57 ` Roman Gushchin
  (?)
  (?)
@ 2018-04-13 12:13 ` vinayak menon
  2018-04-25  3:49 ` Vijayanand Jitta
  2018-04-25 15:55 ` Matthew Wilcox
  -1 siblings, 2 replies; 61+ messages in thread
From: vinayak menon @ 2018-04-13 12:13 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Vlastimil Babka, linux-mm, Andrew Morton, Alexander Viro, Michal Hocko, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team, Linux API

On Thu, Apr 12, 2018 at 8:27 PM, Roman Gushchin <guro@fb.com> wrote:
> On Thu, Apr 12, 2018 at 08:52:52AM +0200, Vlastimil Babka wrote:
>> On 04/11/2018 03:56 PM, Roman Gushchin wrote:
>> > On Wed, Apr 11, 2018 at 03:16:08PM +0200, Vlastimil Babka wrote:
>> >> [+CC linux-api]
>> >>
>> >> On 03/05/2018 02:37 PM, Roman Gushchin wrote:
>> >>> This patch introduces a concept of indirectly reclaimable memory
>> >>> and adds the corresponding memory counter and /proc/vmstat item.
>> >>>
>> >>> Indirectly reclaimable memory is any sort of memory, used by
>> >>> the kernel (except of reclaimable slabs), which is actually
>> >>> reclaimable, i.e. will be released under memory pressure.
>> >>>
>> >>> The counter is in bytes, as it's not always possible to
>> >>> count such objects in pages. The name contains BYTES
>> >>> by analogy to NR_KERNEL_STACK_KB.
>> >>>
>> >>> Signed-off-by: Roman Gushchin <guro@fb.com>
>> >>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> >>> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
>> >>> Cc: Michal Hocko <mhocko@suse.com>
>> >>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>> >>> Cc: linux-fsdevel@vger.kernel.org
>> >>> Cc: linux-kernel@vger.kernel.org
>> >>> Cc: linux-mm@kvack.org
>> >>> Cc: kernel-team@fb.com
>> >>
>> >> Hmm, looks like I'm late and this user-visible API change was just
>> >> merged. But it's for rc1, so we can still change it, hopefully?
>> >>
>> >> One problem I see with the counter is that it's in bytes, but among
>> >> counters that use pages, and the name doesn't indicate it.
>> >
>> > Here I just followed "nr_kernel_stack" path, which is measured in kB,
>> > but this is not mentioned in the field name.
>>
>> Oh, didn't know. Bad example to follow :P
>>
>> >> Then, I don't
>> >> see why users should care about the "indirectly" part, as that's just an
>> >> implementation detail. It is reclaimable and that's what matters, right?
>> >> (I also wanted to complain about lack of Documentation/... update, but
>> >> looks like there's no general file about vmstat, ugh)
>> >
>> > I agree, that it's a bit weird, and it's probably better to not expose
>> > it at all; but this is how all vm counters work. We do expose them all
>> > in /proc/vmstat. A good number of them is useless until you are not a
>> > mm developer, so it's arguable more "debug info" rather than "api".
>>
>> Yeah the problem is that once tools start rely on them, they fall under
>> the "do not break userspace" rule, however we call them. So being
>> cautious and conservative can't hurt.
>>
>> > It's definitely not a reason to make them messy.
>> > Does "nr_indirectly_reclaimable_bytes" look better to you?
>>
>> It still has has the "indirecly" part and feels arbitrary :/
>>
>> >>
>> >> I also kind of liked the idea from v1 rfc posting that there would be a
>> >> separate set of reclaimable kmalloc-X caches for these kind of
>> >> allocations. Besides accounting, it should also help reduce memory
>> >> fragmentation. The right variant of cache would be detected via
>> >> __GFP_RECLAIMABLE.
>> >
>> > Well, the downside is that we have to introduce X new caches
>> > just for this particular problem. I'm not strictly against the idea,
>> > but not convinced that it's much better.
>>
>> Maybe we can find more cases that would benefit from it. Heck, even slab
>> itself allocates some management structures from the generic kmalloc
>> caches, and if they are used for reclaimable caches, they could be
>> tracked as reclaimable as well.
>
> This is a good catch!
>
>>
>> >>
>> >> With that in mind, can we at least for now put the (manually maintained)
>> >> byte counter in a variable that's not directly exposed via /proc/vmstat,
>> >> and then when printing nr_slab_reclaimable, simply add the value
>> >> (divided by PAGE_SIZE), and when printing nr_slab_unreclaimable,
>> >> subtract the same value. This way we would be simply making the existing
>> >> counters more precise, in line with their semantics.
>> >
>> > Idk, I don't like the idea of adding a counter outside of the vm counters
>> > infrastructure, and I definitely wouldn't touch the exposed
>> > nr_slab_reclaimable and nr_slab_unreclaimable fields.
>>
>> We would be just making the reported values more precise wrt reality.
>
> It depends on if we believe that only slab memory can be reclaimable
> or not. If yes, this is true, otherwise not.
>
> My guess is that some drivers (e.g. networking) might have buffers,
> which are reclaimable under mempressure, and are allocated using
> the page allocator. But I have to look closer...
>

One such case I have encountered is that of the ION page pool. The page pool
registers a shrinker. When not under any memory pressure, the page pool can
grow large and thus cause an mmap to fail when OVERCOMMIT_GUESS is set. I can
send a patch to account ION page pool pages in NR_INDIRECTLY_RECLAIMABLE_BYTES.

Thanks,
Vinayak

^ permalink raw reply [flat|nested] 61+ messages in thread
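Why an unaccounted but shrinkable pool can make a heuristic overcommit check refuse an mmap can be illustrated with a toy model. This is not the kernel's overcommit code: the struct, the field names and both functions are hypothetical, the estimate is deliberately reduced to a few terms, and a 4 KiB page size is assumed.

```c
#include <stdbool.h>

/* Assumed 4 KiB pages for the sketch. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical snapshot of the inputs to a "guess" style estimate. */
struct mem_snapshot {
	unsigned long free_pages;
	unsigned long file_pages;
	unsigned long slab_reclaimable_pages;
	unsigned long indirectly_reclaimable_bytes; /* e.g. a driver page pool */
};

/* Estimate how many pages could be made available, in pages. */
static unsigned long guess_available_pages(const struct mem_snapshot *m,
					   bool count_indirect)
{
	unsigned long avail = m->free_pages + m->file_pages +
			      m->slab_reclaimable_pages;

	/* The byte counter is converted to pages before it is added. */
	if (count_indirect)
		avail += m->indirectly_reclaimable_bytes >> SKETCH_PAGE_SHIFT;
	return avail;
}

/* A request is admitted only if it fits into the estimate. */
static bool request_fits(const struct mem_snapshot *m,
			 unsigned long request_pages, bool count_indirect)
{
	return request_pages <= guess_available_pages(m, count_indirect);
}
```

With a large pool and little free memory, the same request is refused when the pool is ignored and admitted when it is counted, which is the failure mode described above.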
* Re: [PATCH 1/3] mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES
  2018-04-13 12:13 ` vinayak menon
@ 2018-04-25  3:49 ` Vijayanand Jitta
  2018-04-25 12:52 ` Roman Gushchin
  2018-04-25 15:55 ` Matthew Wilcox
  1 sibling, 1 reply; 61+ messages in thread
From: Vijayanand Jitta @ 2018-04-25 3:49 UTC (permalink / raw)
  To: vinayak menon, Roman Gushchin
  Cc: Vlastimil Babka, linux-mm, Andrew Morton, Alexander Viro, Michal Hocko, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team, Linux API

On 4/13/2018 5:43 PM, vinayak menon wrote:
> On Thu, Apr 12, 2018 at 8:27 PM, Roman Gushchin <guro@fb.com> wrote:
>> On Thu, Apr 12, 2018 at 08:52:52AM +0200, Vlastimil Babka wrote:
>>> On 04/11/2018 03:56 PM, Roman Gushchin wrote:
>>>> On Wed, Apr 11, 2018 at 03:16:08PM +0200, Vlastimil Babka wrote:
>>>>> [+CC linux-api]
>>>>>
>>>>> On 03/05/2018 02:37 PM, Roman Gushchin wrote:
>>>>>> This patch introduces a concept of indirectly reclaimable memory
>>>>>> and adds the corresponding memory counter and /proc/vmstat item.
>>>>>>
>>>>>> Indirectly reclaimable memory is any sort of memory, used by
>>>>>> the kernel (except of reclaimable slabs), which is actually
>>>>>> reclaimable, i.e. will be released under memory pressure.
>>>>>>
>>>>>> The counter is in bytes, as it's not always possible to
>>>>>> count such objects in pages. The name contains BYTES
>>>>>> by analogy to NR_KERNEL_STACK_KB.
>>>>>>
>>>>>> Signed-off-by: Roman Gushchin <guro@fb.com>
>>>>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>>>>> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
>>>>>> Cc: Michal Hocko <mhocko@suse.com>
>>>>>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>>>>>> Cc: linux-fsdevel@vger.kernel.org
>>>>>> Cc: linux-kernel@vger.kernel.org
>>>>>> Cc: linux-mm@kvack.org
>>>>>> Cc: kernel-team@fb.com
>>>>>
>>>>> Hmm, looks like I'm late and this user-visible API change was just
>>>>> merged. But it's for rc1, so we can still change it, hopefully?
>>>>>
>>>>> One problem I see with the counter is that it's in bytes, but among
>>>>> counters that use pages, and the name doesn't indicate it.
>>>>
>>>> Here I just followed "nr_kernel_stack" path, which is measured in kB,
>>>> but this is not mentioned in the field name.
>>>
>>> Oh, didn't know. Bad example to follow :P
>>>
>>>>> Then, I don't
>>>>> see why users should care about the "indirectly" part, as that's just an
>>>>> implementation detail. It is reclaimable and that's what matters, right?
>>>>> (I also wanted to complain about lack of Documentation/... update, but
>>>>> looks like there's no general file about vmstat, ugh)
>>>>
>>>> I agree, that it's a bit weird, and it's probably better to not expose
>>>> it at all; but this is how all vm counters work. We do expose them all
>>>> in /proc/vmstat. A good number of them is useless until you are not a
>>>> mm developer, so it's arguable more "debug info" rather than "api".
>>>
>>> Yeah the problem is that once tools start rely on them, they fall under
>>> the "do not break userspace" rule, however we call them. So being
>>> cautious and conservative can't hurt.
>>>
>>>> It's definitely not a reason to make them messy.
>>>> Does "nr_indirectly_reclaimable_bytes" look better to you?
>>>
>>> It still has has the "indirecly" part and feels arbitrary :/
>>>
>>>>>
>>>>> I also kind of liked the idea from v1 rfc posting that there would be a
>>>>> separate set of reclaimable kmalloc-X caches for these kind of
>>>>> allocations. Besides accounting, it should also help reduce memory
>>>>> fragmentation. The right variant of cache would be detected via
>>>>> __GFP_RECLAIMABLE.
>>>>
>>>> Well, the downside is that we have to introduce X new caches
>>>> just for this particular problem. I'm not strictly against the idea,
>>>> but not convinced that it's much better.
>>>
>>> Maybe we can find more cases that would benefit from it. Heck, even slab
>>> itself allocates some management structures from the generic kmalloc
>>> caches, and if they are used for reclaimable caches, they could be
>>> tracked as reclaimable as well.
>>
>> This is a good catch!
>>
>>>
>>>>>
>>>>> With that in mind, can we at least for now put the (manually maintained)
>>>>> byte counter in a variable that's not directly exposed via /proc/vmstat,
>>>>> and then when printing nr_slab_reclaimable, simply add the value
>>>>> (divided by PAGE_SIZE), and when printing nr_slab_unreclaimable,
>>>>> subtract the same value. This way we would be simply making the existing
>>>>> counters more precise, in line with their semantics.
>>>>
>>>> Idk, I don't like the idea of adding a counter outside of the vm counters
>>>> infrastructure, and I definitely wouldn't touch the exposed
>>>> nr_slab_reclaimable and nr_slab_unreclaimable fields.
>>>
>>> We would be just making the reported values more precise wrt reality.
>>
>> It depends on if we believe that only slab memory can be reclaimable
>> or not. If yes, this is true, otherwise not.
>>
>> My guess is that some drivers (e.g. networking) might have buffers,
>> which are reclaimable under mempressure, and are allocated using
>> the page allocator. But I have to look closer...
>>
>
> One such case I have encountered is that of the ION page pool. The page pool
> registers a shrinker. When not in any memory pressure page pool can go high
> and thus cause an mmap to fail when OVERCOMMIT_GUESS is set. I can send
> a patch to account ION page pool pages in NR_INDIRECTLY_RECLAIMABLE_BYTES.
>
> Thanks,
> Vinayak
>

As Vinayak mentioned, NR_INDIRECTLY_RECLAIMABLE_BYTES can be used to solve the
issue with the ION page pool when OVERCOMMIT_GUESS is set. The patch for this
can be found here: https://lkml.org/lkml/2018/4/24/1288

^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 1/3] mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES
  2018-04-25  3:49 ` Vijayanand Jitta
@ 2018-04-25 12:52 ` Roman Gushchin
  0 siblings, 0 replies; 61+ messages in thread
From: Roman Gushchin @ 2018-04-25 12:52 UTC (permalink / raw)
  To: Vijayanand Jitta
  Cc: vinayak menon, Vlastimil Babka, linux-mm, Andrew Morton, Alexander Viro, Michal Hocko, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team, Linux API

On Wed, Apr 25, 2018 at 09:19:29AM +0530, Vijayanand Jitta wrote:
> >>>> Idk, I don't like the idea of adding a counter outside of the vm counters
> >>>> infrastructure, and I definitely wouldn't touch the exposed
> >>>> nr_slab_reclaimable and nr_slab_unreclaimable fields.
> >>>
> >>> We would be just making the reported values more precise wrt reality.
> >>
> >> It depends on if we believe that only slab memory can be reclaimable
> >> or not. If yes, this is true, otherwise not.
> >>
> >> My guess is that some drivers (e.g. networking) might have buffers,
> >> which are reclaimable under mempressure, and are allocated using
> >> the page allocator. But I have to look closer...
> >>
> >
> > One such case I have encountered is that of the ION page pool. The page pool
> > registers a shrinker. When not in any memory pressure page pool can go high
> > and thus cause an mmap to fail when OVERCOMMIT_GUESS is set. I can send
> > a patch to account ION page pool pages in NR_INDIRECTLY_RECLAIMABLE_BYTES.

Perfect!
This is exactly what I've expected.

> >
> > Thanks,
> > Vinayak
> >
>
> As Vinayak mentioned NR_INDIRECTLY_RECLAIMABLE_BYTES can be used to solve the issue
> with ION page pool when OVERCOMMIT_GUESS is set, the patch for the same can be
> found here https://lkml.org/lkml/2018/4/24/1288

This makes perfect sense to me.

Please feel free to add:
Acked-by: Roman Gushchin <guro@fb.com>

Thank you!

^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 1/3] mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES
From: Vlastimil Babka @ 2018-04-25 15:47 UTC
To: Roman Gushchin, Vijayanand Jitta
Cc: vinayak menon, linux-mm, Andrew Morton, Alexander Viro, Michal Hocko, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team, Linux API

On 04/25/2018 02:52 PM, Roman Gushchin wrote:
> On Wed, Apr 25, 2018 at 09:19:29AM +0530, Vijayanand Jitta wrote:
>>>>>> Idk, I don't like the idea of adding a counter outside of the vm counters
>>>>>> infrastructure, and I definitely wouldn't touch the exposed
>>>>>> nr_slab_reclaimable and nr_slab_unreclaimable fields.
>>>>>
>>>>> We would be just making the reported values more precise wrt reality.
>>>>
>>>> It depends on if we believe that only slab memory can be reclaimable
>>>> or not. If yes, this is true, otherwise not.
>>>>
>>>> My guess is that some drivers (e.g. networking) might have buffers,
>>>> which are reclaimable under mempressure, and are allocated using
>>>> the page allocator. But I have to look closer...
>>>>
>>>
>>> One such case I have encountered is that of the ION page pool. The page pool
>>> registers a shrinker. When not in any memory pressure page pool can go high
>>> and thus cause an mmap to fail when OVERCOMMIT_GUESS is set. I can send
>>> a patch to account ION page pool pages in NR_INDIRECTLY_RECLAIMABLE_BYTES.

FYI, we have discussed this at LSF/MM and agreed to try the kmalloc
reclaimable caches idea. The existing counter could then remain for page
allocator users such as ION. It's a bit weird to have it in bytes and
not pages then, IMHO. What if we hid it from /proc/vmstat now so it
doesn't become ABI, and later convert it to page granularity and expose
it under a name such as "nr_other_reclaimable" ?

Vlastimil

> Perfect!
> This is exactly what I've expected.
>
>>>
>>> Thanks,
>>> Vinayak
>>>
>>
>> As Vinayak mentioned NR_INDIRECTLY_RECLAIMABLE_BYTES can be used to solve the issue
>> with ION page pool when OVERCOMMIT_GUESS is set, the patch for the same can be
>> found here https://lkml.org/lkml/2018/4/24/1288
>
> This makes perfect sense to me.
>
> Please, feel free to add:
> Acked-by: Roman Gushchin <guro@fb.com>
>
> Thank you!
>

* Re: [PATCH 1/3] mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES
From: Roman Gushchin @ 2018-04-25 16:48 UTC
To: Vlastimil Babka
Cc: Vijayanand Jitta, vinayak menon, linux-mm, Andrew Morton, Alexander Viro, Michal Hocko, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team, Linux API

On Wed, Apr 25, 2018 at 05:47:26PM +0200, Vlastimil Babka wrote:
> On 04/25/2018 02:52 PM, Roman Gushchin wrote:
> > On Wed, Apr 25, 2018 at 09:19:29AM +0530, Vijayanand Jitta wrote:
> >>>>>> Idk, I don't like the idea of adding a counter outside of the vm counters
> >>>>>> infrastructure, and I definitely wouldn't touch the exposed
> >>>>>> nr_slab_reclaimable and nr_slab_unreclaimable fields.
> >>>>>
> >>>>> We would be just making the reported values more precise wrt reality.
> >>>>
> >>>> It depends on if we believe that only slab memory can be reclaimable
> >>>> or not. If yes, this is true, otherwise not.
> >>>>
> >>>> My guess is that some drivers (e.g. networking) might have buffers,
> >>>> which are reclaimable under mempressure, and are allocated using
> >>>> the page allocator. But I have to look closer...
> >>>>
> >>>
> >>> One such case I have encountered is that of the ION page pool. The page pool
> >>> registers a shrinker. When not in any memory pressure page pool can go high
> >>> and thus cause an mmap to fail when OVERCOMMIT_GUESS is set. I can send
> >>> a patch to account ION page pool pages in NR_INDIRECTLY_RECLAIMABLE_BYTES.
>
> FYI, we have discussed this at LSF/MM and agreed to try the kmalloc
> reclaimable caches idea. The existing counter could then remain for page
> allocator users such as ION. It's a bit weird to have it in bytes and
> not pages then, IMHO. What if we hid it from /proc/vmstat now so it
> doesn't become ABI, and later convert it to page granularity and expose
> it under a name such as "nr_other_reclaimable" ?

I've nothing against hiding it from /proc/vmstat, as long as we keep
the counter in place and the main issue resolved.

Maybe it's better to add nr_reclaimable = nr_slab_reclaimable + nr_other_reclaimable,
which will have a simpler meaning than nr_other_reclaimable (what is other?).

Thanks!

* Re: [PATCH 1/3] mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES
From: Vlastimil Babka @ 2018-04-25 17:02 UTC
To: Roman Gushchin
Cc: Vijayanand Jitta, vinayak menon, linux-mm, Andrew Morton, Alexander Viro, Michal Hocko, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team, Linux API

On 04/25/2018 06:48 PM, Roman Gushchin wrote:
> On Wed, Apr 25, 2018 at 05:47:26PM +0200, Vlastimil Babka wrote:
>> On 04/25/2018 02:52 PM, Roman Gushchin wrote:
>>> On Wed, Apr 25, 2018 at 09:19:29AM +0530, Vijayanand Jitta wrote:
>>>>>>>> Idk, I don't like the idea of adding a counter outside of the vm counters
>>>>>>>> infrastructure, and I definitely wouldn't touch the exposed
>>>>>>>> nr_slab_reclaimable and nr_slab_unreclaimable fields.
>>>>>>>
>>>>>>> We would be just making the reported values more precise wrt reality.
>>>>>>
>>>>>> It depends on if we believe that only slab memory can be reclaimable
>>>>>> or not. If yes, this is true, otherwise not.
>>>>>>
>>>>>> My guess is that some drivers (e.g. networking) might have buffers,
>>>>>> which are reclaimable under mempressure, and are allocated using
>>>>>> the page allocator. But I have to look closer...
>>>>>>
>>>>>
>>>>> One such case I have encountered is that of the ION page pool. The page pool
>>>>> registers a shrinker. When not in any memory pressure page pool can go high
>>>>> and thus cause an mmap to fail when OVERCOMMIT_GUESS is set. I can send
>>>>> a patch to account ION page pool pages in NR_INDIRECTLY_RECLAIMABLE_BYTES.
>>
>> FYI, we have discussed this at LSF/MM and agreed to try the kmalloc
>> reclaimable caches idea. The existing counter could then remain for page
>> allocator users such as ION. It's a bit weird to have it in bytes and
>> not pages then, IMHO. What if we hid it from /proc/vmstat now so it
>> doesn't become ABI, and later convert it to page granularity and expose
>> it under a name such as "nr_other_reclaimable" ?
>
> I've nothing against hiding it from /proc/vmstat, as long as we keep
> the counter in place and the main issue resolved.

Sure.

> Maybe it's better to add nr_reclaimable = nr_slab_reclaimable + nr_other_reclaimable,
> which will have a simpler meaning than nr_other_reclaimable (what is other?).

"other" can be changed, sure. nr_reclaimable is possible if we change
slab to adjust that counter as well - vmstat code doesn't support
arbitrary calculations when printing.

> Thanks!
>

* Re: [PATCH 1/3] mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES
From: Roman Gushchin @ 2018-04-25 17:23 UTC
To: Vlastimil Babka
Cc: Vijayanand Jitta, vinayak menon, linux-mm, Andrew Morton, Alexander Viro, Michal Hocko, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team, Linux API

On Wed, Apr 25, 2018 at 07:02:42PM +0200, Vlastimil Babka wrote:
> On 04/25/2018 06:48 PM, Roman Gushchin wrote:
> > On Wed, Apr 25, 2018 at 05:47:26PM +0200, Vlastimil Babka wrote:
> >> On 04/25/2018 02:52 PM, Roman Gushchin wrote:
> >>> On Wed, Apr 25, 2018 at 09:19:29AM +0530, Vijayanand Jitta wrote:
> >>>>>>>> Idk, I don't like the idea of adding a counter outside of the vm counters
> >>>>>>>> infrastructure, and I definitely wouldn't touch the exposed
> >>>>>>>> nr_slab_reclaimable and nr_slab_unreclaimable fields.
> >>>>>>>
> >>>>>>> We would be just making the reported values more precise wrt reality.
> >>>>>>
> >>>>>> It depends on if we believe that only slab memory can be reclaimable
> >>>>>> or not. If yes, this is true, otherwise not.
> >>>>>>
> >>>>>> My guess is that some drivers (e.g. networking) might have buffers,
> >>>>>> which are reclaimable under mempressure, and are allocated using
> >>>>>> the page allocator. But I have to look closer...
> >>>>>>
> >>>>>
> >>>>> One such case I have encountered is that of the ION page pool. The page pool
> >>>>> registers a shrinker. When not in any memory pressure page pool can go high
> >>>>> and thus cause an mmap to fail when OVERCOMMIT_GUESS is set. I can send
> >>>>> a patch to account ION page pool pages in NR_INDIRECTLY_RECLAIMABLE_BYTES.
> >>
> >> FYI, we have discussed this at LSF/MM and agreed to try the kmalloc
> >> reclaimable caches idea. The existing counter could then remain for page
> >> allocator users such as ION. It's a bit weird to have it in bytes and
> >> not pages then, IMHO. What if we hid it from /proc/vmstat now so it
> >> doesn't become ABI, and later convert it to page granularity and expose
> >> it under a name such as "nr_other_reclaimable" ?
> >
> > I've nothing against hiding it from /proc/vmstat, as long as we keep
> > the counter in place and the main issue resolved.
>
> Sure.
>
> > Maybe it's better to add nr_reclaimable = nr_slab_reclaimable + nr_other_reclaimable,
> > which will have a simpler meaning than nr_other_reclaimable (what is other?).
>
> "other" can be changed, sure. nr_reclaimable is possible if we change
> slab to adjust that counter as well - vmstat code doesn't support
> arbitrary calculations when printing.

Sure, but even just hiding a value isn't that easy now.
So we have to touch the vmstat printing code anyway.

Thanks!

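As an aside on the nr_reclaimable idea discussed above: /proc/vmstat is a list of "name value" lines, so a combined figure can also be formed on the reading side without any change to the vmstat printing code. The sketch below is hypothetical throughout. nr_other_reclaimable is only the name floated in this thread and not an existing field, it is assumed to be counted in pages, and the helper names parse_vmstat() and nr_reclaimable() are made up for illustration.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip anything that is not exactly a name followed by a value.
        if len(fields) == 2:
            counters[fields[0]] = int(fields[1])
    return counters


def nr_reclaimable(counters):
    """Sum the slab counter and the hypothetical 'other' counter (in pages).

    A missing counter is treated as zero, so this also works when only
    nr_slab_reclaimable is exposed.
    """
    return (counters.get("nr_slab_reclaimable", 0)
            + counters.get("nr_other_reclaimable", 0))
```

On a live system the input would be the contents of /proc/vmstat; here any string in the same format works, which keeps the sketch independent of what a given kernel actually exposes.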
* Re: [PATCH 1/3] mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES
From: Matthew Wilcox @ 2018-04-25 15:55 UTC
To: vinayak menon
Cc: Roman Gushchin, Vlastimil Babka, linux-mm, Andrew Morton, Alexander Viro, Michal Hocko, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team, Linux API

On Fri, Apr 13, 2018 at 05:43:39PM +0530, vinayak menon wrote:
> One such case I have encountered is that of the ION page pool. The page pool
> registers a shrinker. When not in any memory pressure page pool can go high
> and thus cause an mmap to fail when OVERCOMMIT_GUESS is set. I can send
> a patch to account ION page pool pages in NR_INDIRECTLY_RECLAIMABLE_BYTES.

Why not just account them as NR_SLAB_RECLAIMABLE? I know it's not slab, but
other than that mis-naming, it seems like it'll do the right thing.

* Re: [PATCH 1/3] mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES
From: Vlastimil Babka @ 2018-04-25 16:59 UTC
To: Matthew Wilcox, vinayak menon
Cc: Roman Gushchin, linux-mm, Andrew Morton, Alexander Viro, Michal Hocko, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team, Linux API

On 04/25/2018 05:55 PM, Matthew Wilcox wrote:
> On Fri, Apr 13, 2018 at 05:43:39PM +0530, vinayak menon wrote:
>> One such case I have encountered is that of the ION page pool. The page pool
>> registers a shrinker. When not in any memory pressure page pool can go high
>> and thus cause an mmap to fail when OVERCOMMIT_GUESS is set. I can send
>> a patch to account ION page pool pages in NR_INDIRECTLY_RECLAIMABLE_BYTES.
>
> Why not just account them as NR_SLAB_RECLAIMABLE? I know it's not slab, but
> other than that mis-naming, it seems like it'll do the right thing.

Hm I think it would be confusing for anyone trying to correlate the
number with /proc/slabinfo - the numbers there wouldn't add up.

* [PATCH 2/3] mm: add indirectly reclaimable memory to MemAvailable
From: Roman Gushchin @ 2018-03-05 13:37 UTC
To: linux-mm
Cc: Roman Gushchin, Andrew Morton, Alexander Viro, Michal Hocko, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team

This patch adjusts /proc/meminfo MemAvailable calculation
by adding the amount of indirectly reclaimable memory
(rounded to the PAGE_SIZE).

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: kernel-team@fb.com
---
 mm/page_alloc.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2836bc9e0999..2247cda9e94e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4695,6 +4695,13 @@ long si_mem_available(void)
 		 min(global_node_page_state(NR_SLAB_RECLAIMABLE) / 2,
 		     wmark_low);
 
+	/*
+	 * Part of the kernel memory, which can be released under memory
+	 * pressure.
+	 */
+	available += global_node_page_state(NR_INDIRECTLY_RECLAIMABLE_BYTES) >>
+		PAGE_SHIFT;
+
 	if (available < 0)
 		available = 0;
 	return available;
-- 
2.14.3

* Re: [PATCH 2/3] mm: add indirectly reclaimable memory to MemAvailable
From: Roman Gushchin @ 2018-03-05 13:47 UTC
To: linux-mm
Cc: Andrew Morton, Alexander Viro, Michal Hocko, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team

Please, ignore this particular patch, it was sent by mistake.

On Mon, Mar 05, 2018 at 01:37:41PM +0000, Roman Gushchin wrote:
> This patch adjusts /proc/meminfo MemAvailable calculation
> by adding the amount of indirectly reclaimable memory
> (rounded to the PAGE_SIZE).
>
> Signed-off-by: Roman Gushchin <guro@fb.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: linux-fsdevel@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-mm@kvack.org
> Cc: kernel-team@fb.com
> ---
>  mm/page_alloc.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 2836bc9e0999..2247cda9e94e 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4695,6 +4695,13 @@ long si_mem_available(void)
>  		 min(global_node_page_state(NR_SLAB_RECLAIMABLE) / 2,
>  		     wmark_low);
>
> +	/*
> +	 * Part of the kernel memory, which can be released under memory
> +	 * pressure.
> +	 */
> +	available += global_node_page_state(NR_INDIRECTLY_RECLAIMABLE_BYTES) >>
> +		PAGE_SHIFT;
> +
>  	if (available < 0)
>  		available = 0;
>  	return available;
> --
> 2.14.3
>

* [PATCH 2/3] mm: treat indirectly reclaimable memory as available in MemAvailable
From: Roman Gushchin @ 2018-03-05 13:37 UTC
To: linux-mm
Cc: Roman Gushchin, Andrew Morton, Alexander Viro, Michal Hocko, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team

This patch adjusts /proc/meminfo MemAvailable calculation
by adding the amount of indirectly reclaimable memory
(rounded to the PAGE_SIZE).

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: kernel-team@fb.com
---
 mm/page_alloc.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2836bc9e0999..2247cda9e94e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4695,6 +4695,13 @@ long si_mem_available(void)
 		 min(global_node_page_state(NR_SLAB_RECLAIMABLE) / 2,
 		     wmark_low);
 
+	/*
+	 * Part of the kernel memory, which can be released under memory
+	 * pressure.
+	 */
+	available += global_node_page_state(NR_INDIRECTLY_RECLAIMABLE_BYTES) >>
+		PAGE_SHIFT;
+
 	if (available < 0)
 		available = 0;
 	return available;
-- 
2.14.3

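For readers following the arithmetic in the hunk above, the adjustment can be modelled in a few lines of userspace Python. This is only an illustration under stated assumptions, not the kernel code: PAGE_SHIFT is assumed to be 12 (4 KiB pages), and the function name and both parameter names are hypothetical stand-ins for the running `available` value and the NR_INDIRECTLY_RECLAIMABLE_BYTES counter. One thing the sketch makes visible is that the right shift truncates, so the byte counter is rounded down to whole pages.

```python
# Illustrative model of the adjustment made in si_mem_available().
# Assumption: 4 KiB pages (PAGE_SHIFT = 12); the real value depends on
# the architecture and kernel configuration.
PAGE_SHIFT = 12


def add_indirectly_reclaimable(available_pages, indirectly_reclaimable_bytes):
    """Return available pages after adding the byte counter, in pages.

    Both parameter names are hypothetical; they model the `available`
    variable and the NR_INDIRECTLY_RECLAIMABLE_BYTES counter.
    """
    # The shift drops any partial page, i.e. it rounds down.
    available_pages += indirectly_reclaimable_bytes >> PAGE_SHIFT
    # Mirror the existing clamp at the end of si_mem_available().
    return max(available_pages, 0)
```

With these assumptions, 4095 bytes of indirectly reclaimable memory contribute nothing to the result, while 4096 bytes contribute exactly one page.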
* [PATCH 3/3] dcache: account external names as indirectly reclaimable memory
From: Roman Gushchin @ 2018-03-05 13:37 UTC
To: linux-mm
Cc: Roman Gushchin, Andrew Morton, Alexander Viro, Michal Hocko, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team

I received a report about suspicious growth of unreclaimable slabs
on some machines. I've found that it happens on machines
with low memory pressure, and these unreclaimable slabs
are external names attached to dentries.

External names are allocated using the generic kmalloc() function,
so they are accounted as unreclaimable. But they are held
by dentries, which are reclaimable, and they will be reclaimed
under memory pressure.

In particular, this breaks the MemAvailable calculation, as it
doesn't take unreclaimable slabs into account.
This leads to a silly situation where a machine is almost idle,
has no memory pressure and therefore has a big dentry cache,
yet the resulting MemAvailable is too low to start a new workload.

To address the issue, the NR_INDIRECTLY_RECLAIMABLE_BYTES counter
is used to track the amount of memory consumed by external names.
The counter is increased in the dentry allocation path if an external
name structure is allocated, and it's decreased in the dentry freeing
path.

To reproduce the problem I've used the following Python script:
  import os

  # Each lookup of a long, non-existent name leaves a negative dentry
  # whose name is too long to be stored inline.
  for i in range(10000000):
      try:
          name = ("/some_long_name_%d" % i) + "_" * 220
          os.stat(name)
      except OSError:
          pass

Without this patch:
$ cat /proc/meminfo | grep MemAvailable
MemAvailable:    7811688 kB
$ python indirect.py
$ cat /proc/meminfo | grep MemAvailable
MemAvailable:    2753052 kB

With the patch:
$ cat /proc/meminfo | grep MemAvailable
MemAvailable:    7809516 kB
$ python indirect.py
$ cat /proc/meminfo | grep MemAvailable
MemAvailable:    7749144 kB

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: kernel-team@fb.com
---
 fs/dcache.c | 29 ++++++++++++++++++++++++-----
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 5c7df1df81ff..a0312d73f575 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -273,8 +273,16 @@ static void __d_free(struct rcu_head *head)
 static void __d_free_external(struct rcu_head *head)
 {
 	struct dentry *dentry = container_of(head, struct dentry, d_u.d_rcu);
-	kfree(external_name(dentry));
-	kmem_cache_free(dentry_cache, dentry); 
+	struct external_name *name = external_name(dentry);
+	unsigned long bytes;
+
+	bytes = dentry->d_name.len + offsetof(struct external_name, name[1]);
+	mod_node_page_state(page_pgdat(virt_to_page(name)),
+			    NR_INDIRECTLY_RECLAIMABLE_BYTES,
+			    -kmalloc_size(kmalloc_index(bytes)));
+
+	kfree(name);
+	kmem_cache_free(dentry_cache, dentry);
 }
 
 static inline int dname_external(const struct dentry *dentry)
@@ -1598,6 +1606,7 @@ struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name)
 	struct dentry *dentry;
 	char *dname;
 	int err;
+	size_t reclaimable = 0;
 
 	dentry = kmem_cache_alloc(dentry_cache, GFP_KERNEL);
 	if (!dentry)
@@ -1614,9 +1623,11 @@ struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name)
 		name = &slash_name;
 		dname = dentry->d_iname;
 	} else if (name->len > DNAME_INLINE_LEN-1) {
-		size_t size = offsetof(struct external_name, name[1]);
-		struct external_name *p = kmalloc(size + name->len,
-						  GFP_KERNEL_ACCOUNT);
+		struct external_name *p;
+
+		reclaimable = offsetof(struct external_name, name[1]) +
+			name->len;
+		p = kmalloc(reclaimable, GFP_KERNEL_ACCOUNT);
 		if (!p) {
 			kmem_cache_free(dentry_cache, dentry); 
 			return NULL;
@@ -1665,6 +1676,14 @@ struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name)
 		}
 	}
 
+	if (unlikely(reclaimable)) {
+		pg_data_t *pgdat;
+
+		pgdat = page_pgdat(virt_to_page(external_name(dentry)));
+		mod_node_page_state(pgdat, NR_INDIRECTLY_RECLAIMABLE_BYTES,
+				    kmalloc_size(kmalloc_index(reclaimable)));
+	}
+
 	this_cpu_inc(nr_dentry);
 
 	return dentry;
-- 
2.14.3
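
Note that the patch above charges the counter with the kmalloc size class,
kmalloc_size(kmalloc_index(bytes)), not with the requested length. A rough
user-space model of that rounding could look like the sketch below. It is
not kernel code and rests on labelled assumptions: a minimum kmalloc size
of 8 bytes, the extra 96- and 192-byte size classes, and a 17-byte header
for offsetof(struct external_name, name[1]) as on a typical 64-bit build.
All Python names are hypothetical.

```python
# Simplified user-space model of the accounting in the patch; not kernel code.
# Assumption: offsetof(struct external_name, name[1]) is 17 bytes
# (a 16-byte union followed by one byte of the name array).
HEADER_BYTES = 17


def kmalloc_size_class(size, min_size=8):
    """Approximate kmalloc_size(kmalloc_index(size)) for small requests.

    Assumption: power-of-two size classes plus the 96- and 192-byte
    classes, with min_size standing in for KMALLOC_MIN_SIZE.
    """
    if size <= 0:
        return 0
    if size <= min_size:
        return min_size
    if min_size <= 32 and 64 < size <= 96:
        return 96
    if min_size <= 64 and 128 < size <= 192:
        return 192
    # Otherwise round up to the next power of two.
    return 1 << (size - 1).bit_length()


class IndirectlyReclaimable:
    """Byte counter charged on name allocation and uncharged on free."""

    def __init__(self):
        self.nr_bytes = 0

    def alloc_name(self, name_len):
        charged = kmalloc_size_class(HEADER_BYTES + name_len)
        self.nr_bytes += charged
        return charged

    def free_name(self, name_len):
        self.nr_bytes -= kmalloc_size_class(HEADER_BYTES + name_len)
```

In this model the counter stays balanced only if the free side recomputes
the size from the same length that was used at allocation time, and only
if every path that frees a name performs the decrement.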
* Re: [PATCH 3/3] dcache: account external names as indirectly reclaimable memory
From: Al Viro @ 2018-03-12 21:17 UTC
To: Roman Gushchin
Cc: linux-mm, Andrew Morton, Michal Hocko, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team

On Mon, Mar 05, 2018 at 01:37:43PM +0000, Roman Gushchin wrote:
> diff --git a/fs/dcache.c b/fs/dcache.c
> index 5c7df1df81ff..a0312d73f575 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -273,8 +273,16 @@ static void __d_free(struct rcu_head *head)
>  static void __d_free_external(struct rcu_head *head)
>  {
>  	struct dentry *dentry = container_of(head, struct dentry, d_u.d_rcu);
> -	kfree(external_name(dentry));
> -	kmem_cache_free(dentry_cache, dentry); 
> +	struct external_name *name = external_name(dentry);
> +	unsigned long bytes;
> +
> +	bytes = dentry->d_name.len + offsetof(struct external_name, name[1]);
> +	mod_node_page_state(page_pgdat(virt_to_page(name)),
> +			    NR_INDIRECTLY_RECLAIMABLE_BYTES,
> +			    -kmalloc_size(kmalloc_index(bytes)));
> +
> +	kfree(name);
> +	kmem_cache_free(dentry_cache, dentry);
>  }

That can't be right - external names can be freed in release_dentry_name_snapshot()
and copy_name() as well.  When do you want kfree_rcu() paths accounted for, BTW?
At the point where we are freeing them, or where we are scheduling their freeing?
* Re: [PATCH 3/3] dcache: account external names as indirectly reclaimable memory
From: Roman Gushchin @ 2018-03-12 22:36 UTC
To: Al Viro
Cc: linux-mm, Andrew Morton, Michal Hocko, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team

On Mon, Mar 12, 2018 at 09:17:42PM +0000, Al Viro wrote:
> On Mon, Mar 05, 2018 at 01:37:43PM +0000, Roman Gushchin wrote:
> > diff --git a/fs/dcache.c b/fs/dcache.c
> > index 5c7df1df81ff..a0312d73f575 100644
> > --- a/fs/dcache.c
> > +++ b/fs/dcache.c
> > @@ -273,8 +273,16 @@ static void __d_free(struct rcu_head *head)
> >  static void __d_free_external(struct rcu_head *head)
> >  {
> >  	struct dentry *dentry = container_of(head, struct dentry, d_u.d_rcu);
> > -	kfree(external_name(dentry));
> > -	kmem_cache_free(dentry_cache, dentry); 
> > +	struct external_name *name = external_name(dentry);
> > +	unsigned long bytes;
> > +
> > +	bytes = dentry->d_name.len + offsetof(struct external_name, name[1]);
> > +	mod_node_page_state(page_pgdat(virt_to_page(name)),
> > +			    NR_INDIRECTLY_RECLAIMABLE_BYTES,
> > +			    -kmalloc_size(kmalloc_index(bytes)));
> > +
> > +	kfree(name);
> > +	kmem_cache_free(dentry_cache, dentry);
> >  }
> 
> That can't be right - external names can be freed in release_dentry_name_snapshot()
> and copy_name() as well.  When do you want kfree_rcu() paths accounted for, BTW?
> At the point where we are freeing them, or where we are scheduling their freeing?

Ah, I see...

I think it's better to account them when we're actually freeing,
otherwise we will have a strange path:
(indirectly) reclaimable -> unreclaimable -> free

Do you agree?

Although it shouldn't be that important in practice.

Thank you!

--

From ad9d6c627c2b9315de1967c40a1f4fa68705cf9e Mon Sep 17 00:00:00 2001
From: Roman Gushchin <guro@fb.com>
Date: Mon, 12 Mar 2018 22:24:28 +0000
Subject: [PATCH] dcache: fix indirectly reclaimable memory accounting

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 fs/dcache.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 98826efe22a0..19bc7495a6c4 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -266,6 +266,19 @@ static void __d_free(struct rcu_head *head)
 	kmem_cache_free(dentry_cache, dentry); 
 }
 
+static void __d_free_external_name(struct rcu_head *head)
+{
+	struct external_name *name;
+
+	name = container_of(head, struct external_name, u.head);
+
+	mod_node_page_state(page_pgdat(virt_to_page(name)),
+			    NR_INDIRECTLY_RECLAIMABLE_BYTES,
+			    -ksize(name));
+
+	kfree(name);
+}
+
 static void __d_free_external(struct rcu_head *head)
 {
 	struct dentry *dentry = container_of(head, struct dentry, d_u.d_rcu);
@@ -307,7 +320,7 @@ void release_dentry_name_snapshot(struct name_snapshot *name)
 		struct external_name *p;
 		p = container_of(name->name, struct external_name, name[0]);
 		if (unlikely(atomic_dec_and_test(&p->u.count)))
-			kfree_rcu(p, u.head);
+			call_rcu(&p->u.head, __d_free_external_name);
 	}
 }
 EXPORT_SYMBOL(release_dentry_name_snapshot);
@@ -2769,7 +2782,7 @@ static void copy_name(struct dentry *dentry, struct dentry *target)
 		dentry->d_name.hash_len = target->d_name.hash_len;
 	}
 	if (old_name && likely(atomic_dec_and_test(&old_name->u.count)))
-		kfree_rcu(old_name, u.head);
+		call_rcu(&old_name->u.head, __d_free_external_name);
 }
 
 static void dentry_lock_for_move(struct dentry *dentry, struct dentry *target)
-- 
2.14.3
* Re: [PATCH 3/3] dcache: account external names as indirectly reclaimable memory
From: Al Viro @ 2018-03-13 0:45 UTC
To: Roman Gushchin
Cc: linux-mm, Andrew Morton, Michal Hocko, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team

On Mon, Mar 12, 2018 at 10:36:38PM +0000, Roman Gushchin wrote:

> Ah, I see...
> 
> I think it's better to account them when we're actually freeing,
> otherwise we will have a strange path:
> (indirectly) reclaimable -> unreclaimable -> free
> 
> Do you agree?

> +static void __d_free_external_name(struct rcu_head *head)
> +{
> +	struct external_name *name;
> +
> +	name = container_of(head, struct external_name, u.head);
> +
> +	mod_node_page_state(page_pgdat(virt_to_page(name)),
> +			    NR_INDIRECTLY_RECLAIMABLE_BYTES,
> +			    -ksize(name));
> +
> +	kfree(name);
> +}

Maybe, but then you want to call that from __d_free_external() and from
the failure path in __d_alloc() as well.  Duplicating something that convoluted
and easy to get out of sync is just asking for trouble.
* Re: [PATCH 3/3] dcache: account external names as indirectly reclaimable memory
From: Andrew Morton @ 2018-04-05 22:11 UTC
To: Al Viro
Cc: Roman Gushchin, linux-mm, Michal Hocko, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team

On Tue, 13 Mar 2018 00:45:32 +0000 Al Viro <viro@ZenIV.linux.org.uk> wrote:

> On Mon, Mar 12, 2018 at 10:36:38PM +0000, Roman Gushchin wrote:
> 
> > Ah, I see...
> > 
> > I think it's better to account them when we're actually freeing,
> > otherwise we will have a strange path:
> > (indirectly) reclaimable -> unreclaimable -> free
> > 
> > Do you agree?
> 
> > +static void __d_free_external_name(struct rcu_head *head)
> > +{
> > +	struct external_name *name;
> > +
> > +	name = container_of(head, struct external_name, u.head);
> > +
> > +	mod_node_page_state(page_pgdat(virt_to_page(name)),
> > +			    NR_INDIRECTLY_RECLAIMABLE_BYTES,
> > +			    -ksize(name));
> > +
> > +	kfree(name);
> > +}
> 
> Maybe, but then you want to call that from __d_free_external() and from
> the failure path in __d_alloc() as well.  Duplicating something that convoluted
> and easy to get out of sync is just asking for trouble.

So.. where are we at with this issue?
* Re: [PATCH 3/3] dcache: account external names as indirectly reclaimable memory
From: Roman Gushchin @ 2018-04-06 10:32 UTC
To: Andrew Morton
Cc: Al Viro, linux-mm, Michal Hocko, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team

On Thu, Apr 05, 2018 at 03:11:23PM -0700, Andrew Morton wrote:
> On Tue, 13 Mar 2018 00:45:32 +0000 Al Viro <viro@ZenIV.linux.org.uk> wrote:
> 
> > On Mon, Mar 12, 2018 at 10:36:38PM +0000, Roman Gushchin wrote:
> > 
> > > Ah, I see...
> > > 
> > > I think it's better to account them when we're actually freeing,
> > > otherwise we will have a strange path:
> > > (indirectly) reclaimable -> unreclaimable -> free
> > > 
> > > Do you agree?
> > 
> > > +static void __d_free_external_name(struct rcu_head *head)
> > > +{
> > > +	struct external_name *name;
> > > +
> > > +	name = container_of(head, struct external_name, u.head);
> > > +
> > > +	mod_node_page_state(page_pgdat(virt_to_page(name)),
> > > +			    NR_INDIRECTLY_RECLAIMABLE_BYTES,
> > > +			    -ksize(name));
> > > +
> > > +	kfree(name);
> > > +}
> > 
> > Maybe, but then you want to call that from __d_free_external() and from
> > the failure path in __d_alloc() as well.  Duplicating something that convoluted
> > and easy to get out of sync is just asking for trouble.
> 
> So.. where are we at with this issue?

I assume that commit 0babe6fe1da3 ("dcache: fix indirectly reclaimable memory
accounting") addresses the issue.

__d_free_external_name() is now called from all release paths (including
__d_free_external()) and is the only place where NR_INDIRECTLY_RECLAIMABLE_BYTES
is decremented.

__d_alloc()'s error path is slightly different, because I bump
NR_INDIRECTLY_RECLAIMABLE_BYTES at the very last moment, when it's already
clear that no errors have occurred. So we don't need to increase and decrease
the counter back and forth.

Thank you!
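
The invariant described in the reply above (one decrement site shared by
all release paths, and an increment that happens only once allocation can
no longer fail) can be modelled in a few lines. This is a user-space sketch
under stated assumptions, not the kernel implementation: the class and
method names are hypothetical, a dict stands in for the kmalloc'ed name,
and the fail_late flag simulates an error after the name was allocated.

```python
# User-space model of the accounting invariant; not kernel code.
class NameAccounting:
    """Tracks bytes held by external names (hypothetical model)."""

    def __init__(self):
        self.indirectly_reclaimable = 0

    def d_alloc(self, allocated_size, fail_late=False):
        """Allocate a name and charge the counter as the final step."""
        name = {"size": allocated_size}  # stands in for the kmalloc'ed name
        if fail_late:
            # Error path: the name is dropped before it was ever charged,
            # so there is nothing to undo on the counter.
            return None
        self.indirectly_reclaimable += allocated_size
        return name

    def free_external_name(self, name):
        """The single place where the counter is decremented."""
        self.indirectly_reclaimable -= name["size"]

    # Every release path funnels into free_external_name().
    def release_snapshot(self, name):
        self.free_external_name(name)

    def copy_name_drop_old(self, old_name):
        self.free_external_name(old_name)
```

Keeping the decrement in one function is what avoids the duplication
concern raised earlier in the thread: a new release path only has to call
the shared helper.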
* Re: [PATCH 3/3] dcache: account external names as indirectly reclaimable memory
From: Minchan Kim @ 2018-04-13 13:35 UTC
To: Roman Gushchin
Cc: linux-mm, Andrew Morton, Alexander Viro, Michal Hocko, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team

On Mon, Mar 05, 2018 at 01:37:43PM +0000, Roman Gushchin wrote:
> I received a report about suspicious growth of unreclaimable slabs
> on some machines. I've found that it happens on machines
> with low memory pressure, and these unreclaimable slabs
> are external names attached to dentries.
> 
> External names are allocated using the generic kmalloc() function,
> so they are accounted as unreclaimable. But they are held
> by dentries, which are reclaimable, and they will be reclaimed
> under memory pressure.
> 
> In particular, this breaks the MemAvailable calculation, as it
> doesn't take unreclaimable slabs into account.
> This leads to a silly situation where a machine is almost idle,
> has no memory pressure and therefore has a big dentry cache,
> yet the resulting MemAvailable is too low to start a new workload.
> 
> To address the issue, the NR_INDIRECTLY_RECLAIMABLE_BYTES counter
> is used to track the amount of memory consumed by external names.
> The counter is increased in the dentry allocation path if an external
> name structure is allocated, and it's decreased in the dentry freeing
> path.
> 
> To reproduce the problem I've used the following Python script:
>   import os
> 
>   # Each lookup of a long, non-existent name leaves a negative dentry
>   # whose name is too long to be stored inline.
>   for i in range(10000000):
>       try:
>           name = ("/some_long_name_%d" % i) + "_" * 220
>           os.stat(name)
>       except OSError:
>           pass
> 
> Without this patch:
> $ cat /proc/meminfo | grep MemAvailable
> MemAvailable:    7811688 kB
> $ python indirect.py
> $ cat /proc/meminfo | grep MemAvailable
> MemAvailable:    2753052 kB
> 
> With the patch:
> $ cat /proc/meminfo | grep MemAvailable
> MemAvailable:    7809516 kB
> $ python indirect.py
> $ cat /proc/meminfo | grep MemAvailable
> MemAvailable:    7749144 kB
> 
> Signed-off-by: Roman Gushchin <guro@fb.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: linux-fsdevel@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-mm@kvack.org
> Cc: kernel-team@fb.com
> ---
>  fs/dcache.c | 29 ++++++++++++++++++++++++-----
>  1 file changed, 24 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/dcache.c b/fs/dcache.c
> index 5c7df1df81ff..a0312d73f575 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -273,8 +273,16 @@ static void __d_free(struct rcu_head *head)
>  static void __d_free_external(struct rcu_head *head)
>  {
>  	struct dentry *dentry = container_of(head, struct dentry, d_u.d_rcu);
> -	kfree(external_name(dentry));
> -	kmem_cache_free(dentry_cache, dentry); 
> +	struct external_name *name = external_name(dentry);
> +	unsigned long bytes;
> +
> +	bytes = dentry->d_name.len + offsetof(struct external_name, name[1]);
> +	mod_node_page_state(page_pgdat(virt_to_page(name)),
> +			    NR_INDIRECTLY_RECLAIMABLE_BYTES,
> +			    -kmalloc_size(kmalloc_index(bytes)));
> +
> +	kfree(name);
> +	kmem_cache_free(dentry_cache, dentry);
>  }
>  
>  static inline int dname_external(const struct dentry *dentry)
> @@ -1598,6 +1606,7 @@ struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name)
>  	struct dentry *dentry;
>  	char *dname;
>  	int err;
> +	size_t reclaimable = 0;
>  
>  	dentry = kmem_cache_alloc(dentry_cache, GFP_KERNEL);
>  	if (!dentry)
> @@ -1614,9 +1623,11 @@ struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name)
>  		name = &slash_name;
>  		dname = dentry->d_iname;
>  	} else if (name->len > DNAME_INLINE_LEN-1) {
> -		size_t size = offsetof(struct external_name, name[1]);
> -		struct external_name *p = kmalloc(size + name->len,
> -						  GFP_KERNEL_ACCOUNT);
> +		struct external_name *p;
> +
> +		reclaimable = offsetof(struct external_name, name[1]) +
> +			name->len;
> +		p = kmalloc(reclaimable, GFP_KERNEL_ACCOUNT);

Can't we use kmem_cache_alloc with its own cache created with SLAB_RECLAIM_ACCOUNT
if they are reclaimable?

With that, it would help the fragmentation problem with __GFP_RECLAIMABLE for
page allocation as well as the counting problem, IMHO.

>  		if (!p) {
>  			kmem_cache_free(dentry_cache, dentry); 
>  			return NULL;
> @@ -1665,6 +1676,14 @@ struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name)
>  		}
>  	}
>  
> +	if (unlikely(reclaimable)) {
> +		pg_data_t *pgdat;
> +
> +		pgdat = page_pgdat(virt_to_page(external_name(dentry)));
> +		mod_node_page_state(pgdat, NR_INDIRECTLY_RECLAIMABLE_BYTES,
> +				    kmalloc_size(kmalloc_index(reclaimable)));
> +	}
> +
>  	this_cpu_inc(nr_dentry);
>  
>  	return dentry;
> -- 
> 2.14.3
* Re: [PATCH 3/3] dcache: account external names as indirectly reclaimable memory 2018-04-13 13:35 ` Minchan Kim @ 2018-04-13 13:59 ` Michal Hocko 2018-04-13 14:20 ` Vlastimil Babka 0 siblings, 1 reply; 61+ messages in thread From: Michal Hocko @ 2018-04-13 13:59 UTC (permalink / raw) To: Minchan Kim Cc: Roman Gushchin, linux-mm, Andrew Morton, Alexander Viro, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team On Fri 13-04-18 22:35:19, Minchan Kim wrote: > On Mon, Mar 05, 2018 at 01:37:43PM +0000, Roman Gushchin wrote: [...] > > @@ -1614,9 +1623,11 @@ struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name) > > name = &slash_name; > > dname = dentry->d_iname; > > } else if (name->len > DNAME_INLINE_LEN-1) { > > - size_t size = offsetof(struct external_name, name[1]); > > - struct external_name *p = kmalloc(size + name->len, > > - GFP_KERNEL_ACCOUNT); > > + struct external_name *p; > > + > > + reclaimable = offsetof(struct external_name, name[1]) + > > + name->len; > > + p = kmalloc(reclaimable, GFP_KERNEL_ACCOUNT); > > Can't we use kmem_cache_alloc with own cache created with SLAB_RECLAIM_ACCOUNT > if they are reclaimable? No, because names have different sizes and so we would basically have to duplicate many caches. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 3/3] dcache: account external names as indirectly reclaimable memory 2018-04-13 13:59 ` Michal Hocko @ 2018-04-13 14:20 ` Vlastimil Babka 2018-04-13 14:28 ` Michal Hocko 0 siblings, 1 reply; 61+ messages in thread From: Vlastimil Babka @ 2018-04-13 14:20 UTC (permalink / raw) To: Michal Hocko, Minchan Kim Cc: Roman Gushchin, linux-mm, Andrew Morton, Alexander Viro, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team On 04/13/2018 03:59 PM, Michal Hocko wrote: > On Fri 13-04-18 22:35:19, Minchan Kim wrote: >> On Mon, Mar 05, 2018 at 01:37:43PM +0000, Roman Gushchin wrote: > [...] >>> @@ -1614,9 +1623,11 @@ struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name) >>> name = &slash_name; >>> dname = dentry->d_iname; >>> } else if (name->len > DNAME_INLINE_LEN-1) { >>> - size_t size = offsetof(struct external_name, name[1]); >>> - struct external_name *p = kmalloc(size + name->len, >>> - GFP_KERNEL_ACCOUNT); >>> + struct external_name *p; >>> + >>> + reclaimable = offsetof(struct external_name, name[1]) + >>> + name->len; >>> + p = kmalloc(reclaimable, GFP_KERNEL_ACCOUNT); >> >> Can't we use kmem_cache_alloc with own cache created with SLAB_RECLAIM_ACCOUNT >> if they are reclaimable? > > No, because names have different sizes and so we would basically have to > duplicate many caches. We would need kmalloc-reclaimable-X variants. It could be worth it, especially if we find more similar usages. I suspect they would be more useful than the existing dma-kmalloc-X :) Maybe create both (dma and reclaimable) on demand? ^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 3/3] dcache: account external names as indirectly reclaimable memory 2018-04-13 14:20 ` Vlastimil Babka @ 2018-04-13 14:28 ` Michal Hocko 2018-04-13 14:37 ` Johannes Weiner 0 siblings, 1 reply; 61+ messages in thread From: Michal Hocko @ 2018-04-13 14:28 UTC (permalink / raw) To: Vlastimil Babka Cc: Minchan Kim, Roman Gushchin, linux-mm, Andrew Morton, Alexander Viro, Johannes Weiner, linux-fsdevel, linux-kernel, kernel-team On Fri 13-04-18 16:20:00, Vlastimil Babka wrote: > On 04/13/2018 03:59 PM, Michal Hocko wrote: > > On Fri 13-04-18 22:35:19, Minchan Kim wrote: > >> On Mon, Mar 05, 2018 at 01:37:43PM +0000, Roman Gushchin wrote: > > [...] > >>> @@ -1614,9 +1623,11 @@ struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name) > >>> name = &slash_name; > >>> dname = dentry->d_iname; > >>> } else if (name->len > DNAME_INLINE_LEN-1) { > >>> - size_t size = offsetof(struct external_name, name[1]); > >>> - struct external_name *p = kmalloc(size + name->len, > >>> - GFP_KERNEL_ACCOUNT); > >>> + struct external_name *p; > >>> + > >>> + reclaimable = offsetof(struct external_name, name[1]) + > >>> + name->len; > >>> + p = kmalloc(reclaimable, GFP_KERNEL_ACCOUNT); > >> > >> Can't we use kmem_cache_alloc with own cache created with SLAB_RECLAIM_ACCOUNT > >> if they are reclaimable? > > > > No, because names have different sizes and so we would basically have to > > duplicate many caches. > > We would need kmalloc-reclaimable-X variants. It could be worth it, > especially if we find more similar usages. I suspect they would be more > useful than the existing dma-kmalloc-X :) I am still not sure why __GFP_RECLAIMABLE cannot be made work as expected and account slab pages as SLAB_RECLAIMABLE -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 3/3] dcache: account external names as indirectly reclaimable memory 2018-04-13 14:28 ` Michal Hocko @ 2018-04-13 14:37 ` Johannes Weiner 2018-04-16 11:41 ` Michal Hocko 0 siblings, 1 reply; 61+ messages in thread From: Johannes Weiner @ 2018-04-13 14:37 UTC (permalink / raw) To: Michal Hocko Cc: Vlastimil Babka, Minchan Kim, Roman Gushchin, linux-mm, Andrew Morton, Alexander Viro, linux-fsdevel, linux-kernel, kernel-team On Fri, Apr 13, 2018 at 04:28:21PM +0200, Michal Hocko wrote: > On Fri 13-04-18 16:20:00, Vlastimil Babka wrote: > > We would need kmalloc-reclaimable-X variants. It could be worth it, > > especially if we find more similar usages. I suspect they would be more > > useful than the existing dma-kmalloc-X :) > > I am still not sure why __GFP_RECLAIMABLE cannot be made work as > expected and account slab pages as SLAB_RECLAIMABLE Can you outline how this would work without separate caches? ^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 3/3] dcache: account external names as indirectly reclaimable memory 2018-04-13 14:37 ` Johannes Weiner @ 2018-04-16 11:41 ` Michal Hocko 2018-04-16 12:06 ` Vlastimil Babka 2018-04-17 11:24 ` Roman Gushchin 0 siblings, 2 replies; 61+ messages in thread From: Michal Hocko @ 2018-04-16 11:41 UTC (permalink / raw) To: Johannes Weiner Cc: Vlastimil Babka, Minchan Kim, Roman Gushchin, linux-mm, Andrew Morton, Alexander Viro, linux-fsdevel, linux-kernel, kernel-team On Fri 13-04-18 10:37:16, Johannes Weiner wrote: > On Fri, Apr 13, 2018 at 04:28:21PM +0200, Michal Hocko wrote: > > On Fri 13-04-18 16:20:00, Vlastimil Babka wrote: > > > We would need kmalloc-reclaimable-X variants. It could be worth it, > > > especially if we find more similar usages. I suspect they would be more > > > useful than the existing dma-kmalloc-X :) > > > > I am still not sure why __GFP_RECLAIMABLE cannot be made work as > > expected and account slab pages as SLAB_RECLAIMABLE > > Can you outline how this would work without separate caches? I thought that the cache would only maintain two sets of slab pages depending on the allocation requests. I am pretty sure there will be other details to iron out and maybe it will turn out that such a large portion of the cache would need to duplicate the state that a completely new cache would be more reasonable. Is this worth exploring at least? I mean something like this should help with the fragmentation already AFAIU. Accounting would be just free on top. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 3/3] dcache: account external names as indirectly reclaimable memory 2018-04-16 11:41 ` Michal Hocko @ 2018-04-16 12:06 ` Vlastimil Babka 2018-04-16 12:27 ` Michal Hocko 2018-04-16 13:09 ` Matthew Wilcox 2018-04-17 11:24 ` Roman Gushchin 1 sibling, 2 replies; 61+ messages in thread From: Vlastimil Babka @ 2018-04-16 12:06 UTC (permalink / raw) To: Michal Hocko, Johannes Weiner Cc: Minchan Kim, Roman Gushchin, linux-mm, Andrew Morton, Alexander Viro, linux-fsdevel, linux-kernel, kernel-team On 04/16/2018 01:41 PM, Michal Hocko wrote: > On Fri 13-04-18 10:37:16, Johannes Weiner wrote: >> On Fri, Apr 13, 2018 at 04:28:21PM +0200, Michal Hocko wrote: >>> On Fri 13-04-18 16:20:00, Vlastimil Babka wrote: >>>> We would need kmalloc-reclaimable-X variants. It could be worth it, >>>> especially if we find more similar usages. I suspect they would be more >>>> useful than the existing dma-kmalloc-X :) >>> >>> I am still not sure why __GFP_RECLAIMABLE cannot be made work as >>> expected and account slab pages as SLAB_RECLAIMABLE >> >> Can you outline how this would work without separate caches? > > I thought that the cache would only maintain two sets of slab pages > depending on the allocation requests. I am pretty sure there will be > other details to iron out and For example the percpu (and other) array caches... > maybe it will turn out that such a large > portion of the cache would need to duplicate the state that a > completely new cache would be more reasonable. I'm afraid that's the case, yes. > Is this worth exploring > at least? I mean something like this should help with the fragmentation > already AFAIU. Accounting would be just free on top. Yep. It could be also CONFIG_urable so smaller systems don't need to deal with the memory overhead of this. So do we put it on LSF/MM agenda? ^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 3/3] dcache: account external names as indirectly reclaimable memory 2018-04-16 12:06 ` Vlastimil Babka @ 2018-04-16 12:27 ` Michal Hocko 2018-04-16 19:57 ` Vlastimil Babka 2018-04-16 13:09 ` Matthew Wilcox 1 sibling, 1 reply; 61+ messages in thread From: Michal Hocko @ 2018-04-16 12:27 UTC (permalink / raw) To: Vlastimil Babka Cc: Johannes Weiner, Minchan Kim, Roman Gushchin, linux-mm, Andrew Morton, Alexander Viro, linux-fsdevel, linux-kernel, kernel-team On Mon 16-04-18 14:06:21, Vlastimil Babka wrote: > On 04/16/2018 01:41 PM, Michal Hocko wrote: > > On Fri 13-04-18 10:37:16, Johannes Weiner wrote: > >> On Fri, Apr 13, 2018 at 04:28:21PM +0200, Michal Hocko wrote: > >>> On Fri 13-04-18 16:20:00, Vlastimil Babka wrote: > >>>> We would need kmalloc-reclaimable-X variants. It could be worth it, > >>>> especially if we find more similar usages. I suspect they would be more > >>>> useful than the existing dma-kmalloc-X :) > >>> > >>> I am still not sure why __GFP_RECLAIMABLE cannot be made work as > >>> expected and account slab pages as SLAB_RECLAIMABLE > >> > >> Can you outline how this would work without separate caches? > > > > I thought that the cache would only maintain two sets of slab pages > > depending on the allocation requests. I am pretty sure there will be > > other details to iron out and > > For example the percpu (and other) array caches... > > > maybe it will turn out that such a large > > portion of the cache would need to duplicate the state that a > > completely new cache would be more reasonable. > > I'm afraid that's the case, yes. > > > Is this worth exploring > > at least? I mean something like this should help with the fragmentation > > already AFAIU. Accounting would be just free on top. > > Yep. It could be also CONFIG_urable so smaller systems don't need to > deal with the memory overhead of this. > > So do we put it on LSF/MM agenda? If you volunteer to lead the discussion, then I do not have any objections. 
-- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 3/3] dcache: account external names as indirectly reclaimable memory 2018-04-16 12:27 ` Michal Hocko @ 2018-04-16 19:57 ` Vlastimil Babka 2018-04-17 6:44 ` Michal Hocko 0 siblings, 1 reply; 61+ messages in thread From: Vlastimil Babka @ 2018-04-16 19:57 UTC (permalink / raw) To: Michal Hocko Cc: Johannes Weiner, Minchan Kim, Roman Gushchin, linux-mm, Andrew Morton, Alexander Viro, linux-fsdevel, linux-kernel, kernel-team, lsf-pc On 04/16/2018 02:27 PM, Michal Hocko wrote: > On Mon 16-04-18 14:06:21, Vlastimil Babka wrote: >> >> For example the percpu (and other) array caches... >> >>> maybe it will turn out that such a large >>> portion of the cache would need to duplicate the state that a >>> completely new cache would be more reasonable. >> >> I'm afraid that's the case, yes. >> >>> Is this worth exploring >>> at least? I mean something like this should help with the fragmentation >>> already AFAIU. Accounting would be just free on top. >> >> Yep. It could be also CONFIG_urable so smaller systems don't need to >> deal with the memory overhead of this. >> >> So do we put it on LSF/MM agenda? > > If you volunteer to lead the discussion, then I do not have any > objections. Sure, let's add the topic of SLAB_MINIMIZE_WASTE [1] as well. Something like "Supporting reclaimable kmalloc caches and large non-buddy-sized objects in slab allocators" ? [1] https://marc.info/?l=linux-mm&m=152156671614796&w=2 ^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 3/3] dcache: account external names as indirectly reclaimable memory 2018-04-16 19:57 ` Vlastimil Babka @ 2018-04-17 6:44 ` Michal Hocko 0 siblings, 0 replies; 61+ messages in thread From: Michal Hocko @ 2018-04-17 6:44 UTC (permalink / raw) To: Vlastimil Babka Cc: Johannes Weiner, Minchan Kim, Roman Gushchin, linux-mm, Andrew Morton, Alexander Viro, linux-fsdevel, linux-kernel, kernel-team, lsf-pc [the head of the thread is http://lkml.kernel.org/r/08524819-14ef-81d0-fa90-d7af13c6b9d5@suse.cz] On Mon 16-04-18 21:57:50, Vlastimil Babka wrote: > On 04/16/2018 02:27 PM, Michal Hocko wrote: > > On Mon 16-04-18 14:06:21, Vlastimil Babka wrote: > >> > >> For example the percpu (and other) array caches... > >> > >>> maybe it will turn out that such a large > >>> portion of the cache would need to duplicate the state that a > >>> completely new cache would be more reasonable. > >> > >> I'm afraid that's the case, yes. > >> > >>> Is this worth exploring > >>> at least? I mean something like this should help with the fragmentation > >>> already AFAIU. Accounting would be just free on top. > >> > >> Yep. It could be also CONFIG_urable so smaller systems don't need to > >> deal with the memory overhead of this. > >> > >> So do we put it on LSF/MM agenda? > > > > If you volunteer to lead the discussion, then I do not have any > > objections. > > Sure, let's add the topic of SLAB_MINIMIZE_WASTE [1] as well. > > Something like "Supporting reclaimable kmalloc caches and large > non-buddy-sized objects in slab allocators" ? > > [1] https://marc.info/?l=linux-mm&m=152156671614796&w=2 OK, noted. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 3/3] dcache: account external names as indirectly reclaimable memory 2018-04-16 12:06 ` Vlastimil Babka 2018-04-16 12:27 ` Michal Hocko @ 2018-04-16 13:09 ` Matthew Wilcox 1 sibling, 0 replies; 61+ messages in thread From: Matthew Wilcox @ 2018-04-16 13:09 UTC (permalink / raw) To: Vlastimil Babka Cc: Michal Hocko, Johannes Weiner, Minchan Kim, Roman Gushchin, linux-mm, Andrew Morton, Alexander Viro, linux-fsdevel, linux-kernel, kernel-team

On Mon, Apr 16, 2018 at 02:06:21PM +0200, Vlastimil Babka wrote:
> On 04/16/2018 01:41 PM, Michal Hocko wrote:
> > On Fri 13-04-18 10:37:16, Johannes Weiner wrote:
> >> On Fri, Apr 13, 2018 at 04:28:21PM +0200, Michal Hocko wrote:
> >>> On Fri 13-04-18 16:20:00, Vlastimil Babka wrote:
> >>>> We would need kmalloc-reclaimable-X variants. It could be worth it,
> >>>> especially if we find more similar usages. I suspect they would be more
> >>>> useful than the existing dma-kmalloc-X :)
> >>>
> >>> I am still not sure why __GFP_RECLAIMABLE cannot be made work as
> >>> expected and account slab pages as SLAB_RECLAIMABLE
> >>
> >> Can you outline how this would work without separate caches?
> >
> > I thought that the cache would only maintain two sets of slab pages
> > depending on the allocation requests. I am pretty sure there will be
> > other details to iron out and
>
> For example the percpu (and other) array caches...
>
> > maybe it will turn out that such a large
> > portion of the cache would need to duplicate the state that a
> > completely new cache would be more reasonable.
>
> I'm afraid that's the case, yes.

I'm not sure it'll be so bad, at least for SLUB ... I think everything
we need to duplicate is already percpu, and if we combine GFP_DMA and
GFP_RECLAIMABLE into this, we might even get more savings. Also, we only
need to do this for the kmalloc slabs; currently 13 of them. So we
eliminate 13 caches and in return allocate 13 * 2 * NR_CPU pointers.
That'll be a win on some machines and a loss on others, but the machines
where it's consuming more memory should have more memory to begin with,
so I'd count it as a win.

The node partial list probably wants to be trebled in size to have one
list per memory type. But I think the allocation path only changes
like this:

@@ -2663,10 +2663,13 @@ static __always_inline void *slab_alloc_node(struct kmem_cache *s,
 	struct kmem_cache_cpu *c;
 	struct page *page;
 	unsigned long tid;
+	unsigned int offset = 0;

 	s = slab_pre_alloc_hook(s, gfpflags);
 	if (!s)
 		return NULL;
 	if (s->flags & SLAB_KMALLOC)
 		offset = flags_to_slab_id(gfpflags);
 redo:
 	/*
 	 * Must read kmem_cache cpu data via this cpu ptr. Preemption is
@@ -2679,8 +2682,8 @@ static __always_inline void *slab_alloc_node(struct kmem_cache *s,
 	 * to check if it is matched or not.
 	 */
 	do {
-		tid = this_cpu_read(s->cpu_slab->tid);
-		c = raw_cpu_ptr(s->cpu_slab);
+		tid = this_cpu_read((&s->cpu_slab[offset])->tid);
+		c = raw_cpu_ptr(&s->cpu_slab[offset]);
 	} while (IS_ENABLED(CONFIG_PREEMPT) &&
 		 unlikely(tid != READ_ONCE(c->tid)));

> > Is this worth exploring
> > at least? I mean something like this should help with the fragmentation
> > already AFAIU. Accounting would be just free on top.
>
> Yep. It could be also CONFIG_urable so smaller systems don't need to
> deal with the memory overhead of this.
>
> So do we put it on LSF/MM agenda?

We have an agenda? :-)

^ permalink raw reply [flat|nested] 61+ messages in thread
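[Editor's note] Matthew Wilcox's diff calls flags_to_slab_id() but the thread never defines it. The sketch below is one possible reading of the idea, modelled in userspace Python rather than kernel C: a single kmalloc cache keeps one per-CPU slot per memory type, and the allocation flags select the slot. The flag values, the slab id constants, the priority of DMA over reclaimable, and the KmallocCache class are all assumptions made for illustration and do not come from the thread.

```python
# Illustrative flag bits and slab ids. All values and names here are
# assumptions for this sketch; the thread only names flags_to_slab_id().
GFP_DMA = 0x01
GFP_RECLAIMABLE = 0x10

SLAB_ID_NORMAL = 0
SLAB_ID_RECLAIMABLE = 1
SLAB_ID_DMA = 2
NR_SLAB_IDS = 3


def flags_to_slab_id(gfpflags):
    """Map allocation flags to a memory-type index (DMA wins, by assumption)."""
    if gfpflags & GFP_DMA:
        return SLAB_ID_DMA
    if gfpflags & GFP_RECLAIMABLE:
        return SLAB_ID_RECLAIMABLE
    return SLAB_ID_NORMAL


class KmallocCache:
    """One cache holding a separate per-CPU slot for each memory type."""

    def __init__(self, nr_cpus):
        # cpu_slab[cpu][slab_id] stands in for &s->cpu_slab[offset] per CPU.
        self.cpu_slab = [[[] for _ in range(NR_SLAB_IDS)]
                         for _ in range(nr_cpus)]

    def alloc(self, cpu, gfpflags, obj):
        """Record obj in the slot chosen by the flags and return its index."""
        offset = flags_to_slab_id(gfpflags)
        self.cpu_slab[cpu][offset].append(obj)
        return offset


def extra_pointers(nr_caches, nr_cpus):
    """Extra per-CPU pointers: two additional types per cache per CPU."""
    return nr_caches * (NR_SLAB_IDS - 1) * nr_cpus
```

With three memory types, extra_pointers(13, nr_cpus) corresponds to the "13 * 2 * NR_CPU pointers" estimate in the message, which is why the cost grows with the CPU count while the number of caches stays fixed.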
* Re: [PATCH 3/3] dcache: account external names as indirectly reclaimable memory 2018-04-16 11:41 ` Michal Hocko @ 2018-04-17 11:24 ` Roman Gushchin 2018-04-17 11:24 ` Roman Gushchin 1 sibling, 0 replies; 61+ messages in thread From: Roman Gushchin @ 2018-04-17 11:24 UTC (permalink / raw) To: Michal Hocko Cc: Johannes Weiner, Vlastimil Babka, Minchan Kim, linux-mm, Andrew Morton, Alexander Viro, linux-fsdevel, linux-kernel, kernel-team On Mon, Apr 16, 2018 at 01:41:44PM +0200, Michal Hocko wrote: > On Fri 13-04-18 10:37:16, Johannes Weiner wrote: > > On Fri, Apr 13, 2018 at 04:28:21PM +0200, Michal Hocko wrote: > > > On Fri 13-04-18 16:20:00, Vlastimil Babka wrote: > > > > We would need kmalloc-reclaimable-X variants. It could be worth it, > > > > especially if we find more similar usages. I suspect they would be more > > > > useful than the existing dma-kmalloc-X :) > > > > > > I am still not sure why __GFP_RECLAIMABLE cannot be made work as > > > expected and account slab pages as SLAB_RECLAIMABLE > > > > Can you outline how this would work without separate caches? > > I thought that the cache would only maintain two sets of slab pages > depending on the allocation requests. I am pretty sure there will be > other details to iron out and maybe it will turn out that such a large > portion of the cache would need to duplicate the state that a > completely new cache would be more reasonable. Is this worth exploring > at least? I mean something like this should help with the fragmentation > already AFAIU. Accounting would be just free on top. IMO, this approach is much better than duplicating all kmalloc caches. It definitely has to be explored and discussed. Thank you! ^ permalink raw reply [flat|nested] 61+ messages in thread