linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] mm/memcontrol: make the slab calculation consistent
@ 2020-12-03  3:11 Muchun Song
  2020-12-04 15:46 ` Johannes Weiner
  0 siblings, 1 reply; 4+ messages in thread
From: Muchun Song @ 2020-12-03  3:11 UTC (permalink / raw)
  To: hannes, mhocko, vdavydov.dev, akpm
  Cc: cgroups, linux-mm, linux-kernel, Muchun Song, Roman Gushchin

Although the ratio of the slab is one, we also should read the ratio
from the related memory_stats instead of hard-coding. And the local
variable of size is already the value of slab_unreclaimable. So we
do not need to read again. Simplify the code here.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Roman Gushchin <guro@fb.com>
---
Changes in v2:
 - Add a comment in the memory_stat_format() suggested by Roman.

 mm/memcontrol.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 9922f1510956..75df129b7a52 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1545,12 +1545,22 @@ static int __init memory_stats_init(void)
 	int i;
 
 	for (i = 0; i < ARRAY_SIZE(memory_stats); i++) {
+		switch (memory_stats[i].idx) {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-		if (memory_stats[i].idx == NR_ANON_THPS ||
-		    memory_stats[i].idx == NR_FILE_THPS ||
-		    memory_stats[i].idx == NR_SHMEM_THPS)
+		case NR_ANON_THPS:
+		case NR_FILE_THPS:
+		case NR_SHMEM_THPS:
 			memory_stats[i].ratio = HPAGE_PMD_SIZE;
+			break;
 #endif
+		case NR_SLAB_UNRECLAIMABLE_B:
+			VM_BUG_ON(i < 1);
+			VM_BUG_ON(memory_stats[i - 1].idx != NR_SLAB_RECLAIMABLE_B);
+			break;
+		default:
+			break;
+		}
+
 		VM_BUG_ON(!memory_stats[i].ratio);
 		VM_BUG_ON(memory_stats[i].idx >= MEMCG_NR_STAT);
 	}
@@ -1586,9 +1596,15 @@ static char *memory_stat_format(struct mem_cgroup *memcg)
 		size *= memory_stats[i].ratio;
 		seq_buf_printf(&s, "%s %llu\n", memory_stats[i].name, size);
 
+		/*
+		 * We are printing reclaimable, unreclaimable of the slab
+		 * and the sum of both.
+		 */
 		if (unlikely(memory_stats[i].idx == NR_SLAB_UNRECLAIMABLE_B)) {
-			size = memcg_page_state(memcg, NR_SLAB_RECLAIMABLE_B) +
-			       memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE_B);
+			int idx = i - 1;
+
+			size += memcg_page_state(memcg, memory_stats[idx].idx) *
+				memory_stats[idx].ratio;
 			seq_buf_printf(&s, "slab %llu\n", size);
 		}
 	}
-- 
2.11.0


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] mm/memcontrol: make the slab calculation consistent
  2020-12-03  3:11 [PATCH v2] mm/memcontrol: make the slab calculation consistent Muchun Song
@ 2020-12-04 15:46 ` Johannes Weiner
  2020-12-04 16:19   ` [External] " Muchun Song
  2020-12-04 22:08   ` Roman Gushchin
  0 siblings, 2 replies; 4+ messages in thread
From: Johannes Weiner @ 2020-12-04 15:46 UTC (permalink / raw)
  To: Muchun Song
  Cc: mhocko, vdavydov.dev, akpm, cgroups, linux-mm, linux-kernel,
	Roman Gushchin

On Thu, Dec 03, 2020 at 11:11:11AM +0800, Muchun Song wrote:
> Although the ratio of the slab is one, we also should read the ratio
> from the related memory_stats instead of hard-coding. And the local
> variable of size is already the value of slab_unreclaimable. So we
> do not need to read again. Simplify the code here.
> 
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> Acked-by: Roman Gushchin <guro@fb.com>

I agree that ignoring the ratio right now is not very pretty, but

		size = memcg_page_state(memcg, NR_SLAB_RECLAIMABLE_B) +
		       memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE_B);
		seq_buf_printf(&s, "slab %llu\n", size);

is way easier to understand and more robust than using idx and idx + 1
and then requiring a series of BUG_ONs to ensure these two items are
actually adjacent and in the right order.

There is a redundant call to memcg_page_state(), granted, but that
function is extremely cheap compared with e.g. seq_buf_printf().

>  mm/memcontrol.c | 26 +++++++++++++++++++++-----
>  1 file changed, 21 insertions(+), 5 deletions(-)

IMO this really just complicates the code with little discernible
upside. It's going to be a NAK from me, unfortunately.


In retrospect, I think that memory_stats[] table was a mistake. It
would probably be easier to implement this using a wrapper for
memcg_page_state() that has a big switch() for unit
conversion. Something like this:

/* Translate stat items to the correct unit for memory.stat output */
static unsigned long memcg_page_state_output(memcg, item)
{
	unsigned long value = memcg_page_state(memcg, item);
	int unit = PAGE_SIZE;

	switch (item) {
	case NR_SLAB_RECLAIMABLE_B:
	case NR_SLAB_UNRECLAIMABLE_B:
	case WORKINGSET_REFAULT_ANON:
	case WORKINGSET_REFAULT_FILE:
	case WORKINGSET_ACTIVATE_ANON:
	case WORKINGSET_ACTIVATE_FILE:
	case WORKINGSET_RESTORE_ANON:
	case WORKINGSET_RESTORE_FILE:
	case MEMCG_PERCPU_B:
		unit = 1;
		break;
	case NR_SHMEM_THPS:
	case NR_FILE_THPS:
	case NR_ANON_THPS:
		unit = HPAGE_PMD_SIZE;
		break;
	case NR_KERNEL_STACK_KB:
		unit = 1024;
		break;
	}
	
	return value * unit;
}

This would fix the ratio inconsistency, get rid of the awkward mix of
static and runtime initialization of the table, is probably about the
same amount of code, but simpler and more obvious overall.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [External] Re: [PATCH v2] mm/memcontrol: make the slab calculation consistent
  2020-12-04 15:46 ` Johannes Weiner
@ 2020-12-04 16:19   ` Muchun Song
  2020-12-04 22:08   ` Roman Gushchin
  1 sibling, 0 replies; 4+ messages in thread
From: Muchun Song @ 2020-12-04 16:19 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Michal Hocko, Vladimir Davydov, Andrew Morton, Cgroups,
	Linux Memory Management List, LKML, Roman Gushchin

On Fri, Dec 4, 2020 at 11:48 PM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> On Thu, Dec 03, 2020 at 11:11:11AM +0800, Muchun Song wrote:
> > Although the ratio of the slab is one, we also should read the ratio
> > from the related memory_stats instead of hard-coding. And the local
> > variable of size is already the value of slab_unreclaimable. So we
> > do not need to read again. Simplify the code here.
> >
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > Acked-by: Roman Gushchin <guro@fb.com>
>
> I agree that ignoring the ratio right now is not very pretty, but
>
>                 size = memcg_page_state(memcg, NR_SLAB_RECLAIMABLE_B) +
>                        memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE_B);
>                 seq_buf_printf(&s, "slab %llu\n", size);
>
> is way easier to understand and more robust than using idx and idx + 1
> and then requiring a series of BUG_ONs to ensure these two items are
> actually adjacent and in the right order.
>
> There is a redundant call to memcg_page_state(), granted, but that
> function is extremely cheap compared with e.g. seq_buf_printf().
>
> >  mm/memcontrol.c | 26 +++++++++++++++++++++-----
> >  1 file changed, 21 insertions(+), 5 deletions(-)
>
> IMO this really just complicates the code with little discernible
> upside. It's going to be a NAK from me, unfortunately.
>
>
> In retrospect, I think that memory_stats[] table was a mistake. It
> would probably be easier to implement this using a wrapper for
> memcg_page_state() that has a big switch() for unit
> conversion. Something like this:
>
> /* Translate stat items to the correct unit for memory.stat output */
> static unsigned long memcg_page_state_output(memcg, item)
> {
>         unsigned long value = memcg_page_state(memcg, item);
>         int unit = PAGE_SIZE;
>
>         switch (item) {
>         case NR_SLAB_RECLAIMABLE_B:
>         case NR_SLAB_UNRECLAIMABLE_B:
>         case WORKINGSET_REFAULT_ANON:
>         case WORKINGSET_REFAULT_FILE:
>         case WORKINGSET_ACTIVATE_ANON:
>         case WORKINGSET_ACTIVATE_FILE:
>         case WORKINGSET_RESTORE_ANON:
>         case WORKINGSET_RESTORE_FILE:
>         case MEMCG_PERCPU_B:
>                 unit = 1;
>                 break;
>         case NR_SHMEM_THPS:
>         case NR_FILE_THPS:
>         case NR_ANON_THPS:
>                 unit = HPAGE_PMD_SIZE;
>                 break;
>         case NR_KERNEL_STACK_KB:
>                 unit = 1024;
>                 break;
>         }
>
>         return value * unit;
> }
>
> This would fix the ratio inconsistency, get rid of the awkward mix of
> static and runtime initialization of the table, is probably about the
> same amount of code, but simpler and more obvious overall.

Good idea. I can do that :)

Thanks.

-- 
Yours,
Muchun

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] mm/memcontrol: make the slab calculation consistent
  2020-12-04 15:46 ` Johannes Weiner
  2020-12-04 16:19   ` [External] " Muchun Song
@ 2020-12-04 22:08   ` Roman Gushchin
  1 sibling, 0 replies; 4+ messages in thread
From: Roman Gushchin @ 2020-12-04 22:08 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Muchun Song, mhocko, vdavydov.dev, akpm, cgroups, linux-mm, linux-kernel

On Fri, Dec 04, 2020 at 10:46:13AM -0500, Johannes Weiner wrote:
> On Thu, Dec 03, 2020 at 11:11:11AM +0800, Muchun Song wrote:
> > Although the ratio of the slab is one, we also should read the ratio
> > from the related memory_stats instead of hard-coding. And the local
> > variable of size is already the value of slab_unreclaimable. So we
> > do not need to read again. Simplify the code here.
> > 
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > Acked-by: Roman Gushchin <guro@fb.com>
> 
> I agree that ignoring the ratio right now is not very pretty, but
> 
> 		size = memcg_page_state(memcg, NR_SLAB_RECLAIMABLE_B) +
> 		       memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE_B);
> 		seq_buf_printf(&s, "slab %llu\n", size);
> 
> is way easier to understand and more robust than using idx and idx + 1
> and then requiring a series of BUG_ONs to ensure these two items are
> actually adjacent and in the right order.
> 
> There is a redundant call to memcg_page_state(), granted, but that
> function is extremely cheap compared with e.g. seq_buf_printf().
> 
> >  mm/memcontrol.c | 26 +++++++++++++++++++++-----
> >  1 file changed, 21 insertions(+), 5 deletions(-)
> 
> IMO this really just complicates the code with little discernible
> upside. It's going to be a NAK from me, unfortunately.
> 
> 
> In retrospect, I think that memory_stats[] table was a mistake. It
> would probably be easier to implement this using a wrapper for
> memcg_page_state() that has a big switch() for unit
> conversion. Something like this:

+1

> 
> /* Translate stat items to the correct unit for memory.stat output */
> static unsigned long memcg_page_state_output(memcg, item)
> {
> 	unsigned long value = memcg_page_state(memcg, item);
> 	int unit = PAGE_SIZE;
> 
> 	switch (item) {
> 	case NR_SLAB_RECLAIMABLE_B:
> 	case NR_SLAB_UNRECLAIMABLE_B:
> 	case WORKINGSET_REFAULT_ANON:
> 	case WORKINGSET_REFAULT_FILE:
> 	case WORKINGSET_ACTIVATE_ANON:
> 	case WORKINGSET_ACTIVATE_FILE:
> 	case WORKINGSET_RESTORE_ANON:
> 	case WORKINGSET_RESTORE_FILE:
> 	case MEMCG_PERCPU_B:
> 		unit = 1;
> 		break;
> 	case NR_SHMEM_THPS:
> 	case NR_FILE_THPS:
> 	case NR_ANON_THPS:
> 		unit = HPAGE_PMD_SIZE;
> 		break;
             ^^^^^^^^^^^^
These can be easily converted to ordinary pages,
so we can completely avoid this exception.

> 	case NR_KERNEL_STACK_KB:
> 		unit = 1024;
> 		break;
> 	}

And NR_KERNEL_STACK_KB can be converted to bytes.

Then we'll have everything kernel-related in bytes and
everything userspace-related in PAGE_SIZE's.

> 	
> 	return value * unit;
> }
> 
> This would fix the ratio inconsistency, get rid of the awkward mix of
> static and runtime initialization of the table, is probably about the
> same amount of code, but simpler and more obvious overall.

+1

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-12-04 22:09 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-03  3:11 [PATCH v2] mm/memcontrol: make the slab calculation consistent Muchun Song
2020-12-04 15:46 ` Johannes Weiner
2020-12-04 16:19   ` [External] " Muchun Song
2020-12-04 22:08   ` Roman Gushchin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).