linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* tools/cgroup/memcg_slabinfo.py is broken
@ 2022-02-15 13:22 Vasily Averin
  2022-02-15 23:29 ` Roman Gushchin
  0 siblings, 1 reply; 10+ messages in thread
From: Vasily Averin @ 2022-02-15 13:22 UTC (permalink / raw)
  To: Linux MM; +Cc: Roman Gushchin, Matthew Wilcox, Vlastimil Babka

Dear all,

commit 07f910f9b729 ("mm: Remove slab from struct page")
broke tools/cgroup/memcg_slabinfo.py script.
It was used as replacement of /psys/fs/cgroup/memory/memory.kmem.slabinfo
and was used to monitor slab objects related to specified cgroup

# ./memcg_slabinfo.py /sys/fs/cgroup/memory/lxc.payload.c9s/
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
Traceback (most recent call last):
   File "/usr/bin/drgn", line 33, in <module>
     sys.exit(load_entry_point('drgn==0.0.16', 'console_scripts', 'drgn')())
   File "/usr/lib64/python3.10/site-packages/drgn/internal/cli.py", line 133, in main
     runpy.run_path(args.script[0], init_globals=init_globals, run_name="__main__")
   File "/usr/lib64/python3.10/runpy.py", line 269, in run_path
     return _run_module_code(code, init_globals, run_name,
   File "/usr/lib64/python3.10/runpy.py", line 96, in _run_module_code
     _run_code(code, mod_globals, init_globals,
   File "/usr/lib64/python3.10/runpy.py", line 86, in _run_code
     exec(code, run_globals)
   File "./memcg_slabinfo.py", line 226, in <module>
     main()
   File "./memcg_slabinfo.py", line 199, in main
     cache = page.slab_cache
AttributeError: 'struct page' has no member 'slab_cache'

Any ideas how to repair it ?

Thank you,
	Vasily Averin


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: tools/cgroup/memcg_slabinfo.py is broken
  2022-02-15 13:22 tools/cgroup/memcg_slabinfo.py is broken Vasily Averin
@ 2022-02-15 23:29 ` Roman Gushchin
       [not found]   ` <35486a00-791c-3b1b-14e0-476e6242709b@virtuozzo.com>
  0 siblings, 1 reply; 10+ messages in thread
From: Roman Gushchin @ 2022-02-15 23:29 UTC (permalink / raw)
  To: Vasily Averin; +Cc: Linux MM, Matthew Wilcox, Vlastimil Babka

On Tue, Feb 15, 2022 at 04:22:14PM +0300, Vasily Averin wrote:
> Dear all,
> 
> commit 07f910f9b729 ("mm: Remove slab from struct page")
> broke tools/cgroup/memcg_slabinfo.py script.
> It was used as replacement of /psys/fs/cgroup/memory/memory.kmem.slabinfo
> and was used to monitor slab objects related to specified cgroup
> 
> # ./memcg_slabinfo.py /sys/fs/cgroup/memory/lxc.payload.c9s/
> # name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
> Traceback (most recent call last):
>   File "/usr/bin/drgn", line 33, in <module>
>     sys.exit(load_entry_point('drgn==0.0.16', 'console_scripts', 'drgn')())
>   File "/usr/lib64/python3.10/site-packages/drgn/internal/cli.py", line 133, in main
>     runpy.run_path(args.script[0], init_globals=init_globals, run_name="__main__")
>   File "/usr/lib64/python3.10/runpy.py", line 269, in run_path
>     return _run_module_code(code, init_globals, run_name,
>   File "/usr/lib64/python3.10/runpy.py", line 96, in _run_module_code
>     _run_code(code, mod_globals, init_globals,
>   File "/usr/lib64/python3.10/runpy.py", line 86, in _run_code
>     exec(code, run_globals)
>   File "./memcg_slabinfo.py", line 226, in <module>
>     main()
>   File "./memcg_slabinfo.py", line 199, in main
>     cache = page.slab_cache
> AttributeError: 'struct page' has no member 'slab_cache'
> 
> Any ideas how to repair it ?
> 
> Thank you,
> 	Vasily Averin

Thanks for the report, Vasily!

I'll take a look, we need to update the script according to the latest
slab/page/folio changes.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: tools/cgroup/memcg_slabinfo.py is broken
       [not found]     ` <Yg1kFH3umdrvOhu1@carbon.dhcp.thefacebook.com>
@ 2022-02-16 21:12       ` Matthew Wilcox
  2022-02-16 21:44         ` Roman Gushchin
  0 siblings, 1 reply; 10+ messages in thread
From: Matthew Wilcox @ 2022-02-16 21:12 UTC (permalink / raw)
  To: Roman Gushchin; +Cc: Vasily Averin, Andrew Morton, Linux MM, Vlastimil Babka

On Wed, Feb 16, 2022 at 12:52:36PM -0800, Roman Gushchin wrote:
> The problem can be fixed by explicitly casting struct page * to struct
> slab * for slab pages. The tools works as expected with this fix, e.g.:

This feels like a quick fix, and not really correct.

> @@ -77,7 +77,8 @@ def count_partial(n, fn):
>  
>  
>  def count_free(page):
> -    return page.objects - page.inuse
> +    slab = cast('struct slab *', page)
> +    return slab.objects - slab.inuse

count_free() should take a slab, not a page.

>  def slub_get_slabinfo(s, cfg):
> @@ -193,10 +194,11 @@ def main():
>          # look over all slab pages, belonging to non-root memcgs
>          # and look for objects belonging to the given memory cgroup
>          for page in for_each_slab_page(prog):

for_each_slab_page() should be renamed for_each_slab().
It should return a slab, not a page.  And it should definitely skip
over tail pages (it works today by coincidence because tail pages do
not have PG_slab set).

count_partial() should use struct slab, and slab_list, not lru.

... I think that's it.  But I'm no pythonist, much less dragoneer.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: tools/cgroup/memcg_slabinfo.py is broken
  2022-02-16 21:12       ` Matthew Wilcox
@ 2022-02-16 21:44         ` Roman Gushchin
       [not found]           ` <Yg1y0ax5bnjGLAqz@casper.infradead.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Roman Gushchin @ 2022-02-16 21:44 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Vasily Averin, Andrew Morton, Linux MM, Vlastimil Babka

On Wed, Feb 16, 2022 at 09:12:01PM +0000, Matthew Wilcox wrote:
> On Wed, Feb 16, 2022 at 12:52:36PM -0800, Roman Gushchin wrote:
> > The problem can be fixed by explicitly casting struct page * to struct
> > slab * for slab pages. The tools works as expected with this fix, e.g.:
> 
> This feels like a quick fix, and not really correct.

Do you mind to provide a correct version?

Thanks!


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: tools/cgroup/memcg_slabinfo.py is broken
       [not found]             ` <Yg16ClVQj8SLSlqV@carbon.DHCP.thefacebook.com>
@ 2022-02-16 23:12               ` Matthew Wilcox
  2022-02-16 23:57                 ` Shakeel Butt
  0 siblings, 1 reply; 10+ messages in thread
From: Matthew Wilcox @ 2022-02-16 23:12 UTC (permalink / raw)
  To: Roman Gushchin; +Cc: Vasily Averin, Andrew Morton, Linux MM, Vlastimil Babka

On Wed, Feb 16, 2022 at 02:26:18PM -0800, Roman Gushchin wrote:
> On Wed, Feb 16, 2022 at 09:55:29PM +0000, Matthew Wilcox wrote:
> > On Wed, Feb 16, 2022 at 01:44:10PM -0800, Roman Gushchin wrote:
> > > On Wed, Feb 16, 2022 at 09:12:01PM +0000, Matthew Wilcox wrote:
> > > > On Wed, Feb 16, 2022 at 12:52:36PM -0800, Roman Gushchin wrote:
> > > > > The problem can be fixed by explicitly casting struct page * to struct
> > > > > slab * for slab pages. The tools works as expected with this fix, e.g.:
> > > > 
> > > > This feels like a quick fix, and not really correct.
> > > 
> > > Do you mind to provide a correct version?
> > 
> > I know nothing about python, nor how to even run this script.  But this
> > is the kind of thing I was thinking about.
> > 
> > I didn't do the 'skip over tail pages' because I have no idea how
> > you'd tell for_each_page() to do that.  I don't know where to get
> > drgn.helpers.linux from, so I can't look at the implementation.
> > Maybe it already does that.
> 
> Is there an actual plan to set the slab flag for tail pages?
> If not, why to check both? It's fairly expensive already, so I'd
> not add any additional checks if there is no strict necessity.

The plan to set the slab flag on tail pages is probably about five
years away.  There are a lot of moving parts before we get there.
I think you could reduce the cost if we had a for_each_folio()
loop.  Most code doesn't really want to look at every page, so
having a shorter list (and iterating over each page within a folio
for the few places that really do want to see every page) would
be a good tradeoff.

> > @@ -145,14 +145,14 @@ def detect_kernel_config():
> >      return cfg
> >  
> >  
> > -def for_each_slab_page(prog):
> > +def for_each_slab(prog):
> 
> Here I'd keep _page (or _folio, if you want), because
> it makes it clear that the function goes over all pages
> (and is expensive therefore).

But there's no intrinsic reason that it should.  If that's a
performance problem, you could keep slabs on a separate list
from, eg, file or anon memory.



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: tools/cgroup/memcg_slabinfo.py is broken
  2022-02-16 23:12               ` Matthew Wilcox
@ 2022-02-16 23:57                 ` Shakeel Butt
  2022-02-17  0:13                   ` Matthew Wilcox
  0 siblings, 1 reply; 10+ messages in thread
From: Shakeel Butt @ 2022-02-16 23:57 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Roman Gushchin, Vasily Averin, Andrew Morton, Linux MM, Vlastimil Babka

On Wed, Feb 16, 2022 at 3:12 PM Matthew Wilcox <willy@infradead.org> wrote:
>
[...]
> > Here I'd keep _page (or _folio, if you want), because
> > it makes it clear that the function goes over all pages
> > (and is expensive therefore).
>
> But there's no intrinsic reason that it should.  If that's a
> performance problem, you could keep slabs on a separate list
> from, eg, file or anon memory.
>

Is there enough space in the struct slab for that list or do you mean
something different?


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: tools/cgroup/memcg_slabinfo.py is broken
  2022-02-16 23:57                 ` Shakeel Butt
@ 2022-02-17  0:13                   ` Matthew Wilcox
  2022-02-17  0:25                     ` Roman Gushchin
       [not found]                     ` <Yg2cKKnIboNu7j+p@carbon.DHCP.thefacebook.com>
  0 siblings, 2 replies; 10+ messages in thread
From: Matthew Wilcox @ 2022-02-17  0:13 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Roman Gushchin, Vasily Averin, Andrew Morton, Linux MM, Vlastimil Babka

On Wed, Feb 16, 2022 at 03:57:39PM -0800, Shakeel Butt wrote:
> On Wed, Feb 16, 2022 at 3:12 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> [...]
> > > Here I'd keep _page (or _folio, if you want), because
> > > it makes it clear that the function goes over all pages
> > > (and is expensive therefore).
> >
> > But there's no intrinsic reason that it should.  If that's a
> > performance problem, you could keep slabs on a separate list
> > from, eg, file or anon memory.
> >
> 
> Is there enough space in the struct slab for that list or do you mean
> something different?

Well, I don't know what data structure your for_each_page() uses, so
I don't know what to suggest exactly.

If it's an iteration over the entirety of memmap, then maybe you'd
rather iterate over each kmem_cache like /proc/slabinfo does?


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: tools/cgroup/memcg_slabinfo.py is broken
  2022-02-17  0:13                   ` Matthew Wilcox
@ 2022-02-17  0:25                     ` Roman Gushchin
       [not found]                     ` <Yg2cKKnIboNu7j+p@carbon.DHCP.thefacebook.com>
  1 sibling, 0 replies; 10+ messages in thread
From: Roman Gushchin @ 2022-02-17  0:25 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Shakeel Butt, Vasily Averin, Andrew Morton, Linux MM, Vlastimil Babka

On Thu, Feb 17, 2022 at 12:13:53AM +0000, Matthew Wilcox wrote:
> On Wed, Feb 16, 2022 at 03:57:39PM -0800, Shakeel Butt wrote:
> > On Wed, Feb 16, 2022 at 3:12 PM Matthew Wilcox <willy@infradead.org> wrote:
> > >
> > [...]
> > > > Here I'd keep _page (or _folio, if you want), because
> > > > it makes it clear that the function goes over all pages
> > > > (and is expensive therefore).
> > >
> > > But there's no intrinsic reason that it should.  If that's a
> > > performance problem, you could keep slabs on a separate list
> > > from, eg, file or anon memory.
> > >
> > 
> > Is there enough space in the struct slab for that list or do you mean
> > something different?
> 
> Well, I don't know what data structure your for_each_page() uses, so
> I don't know what to suggest exactly.
> 
> If it's an iteration over the entirety of memmap, then maybe you'd
> rather iterate over each kmem_cache like /proc/slabinfo does?

For SLUB /proc/slabinfo ignores pages on percpu partial lists,
which can significantly alter results.

For slabinfo.py the accuracy is more important than performance,
at least in my understanding. It's not a tool which a generic linux
user would run often.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: tools/cgroup/memcg_slabinfo.py is broken
       [not found]                       ` <11060a75-98c2-f547-68eb-fcab404a2539@virtuozzo.com>
@ 2022-02-21  9:19                         ` Vlastimil Babka
  0 siblings, 0 replies; 10+ messages in thread
From: Vlastimil Babka @ 2022-02-21  9:19 UTC (permalink / raw)
  To: Vasily Averin, Roman Gushchin, Andrew Morton
  Cc: Shakeel Butt, Matthew Wilcox, Linux MM

On 2/21/22 06:17, Vasily Averin wrote:
> On 17.02.2022 03:51, Roman Gushchin wrote:
>> On Thu, Feb 17, 2022 at 12:13:53AM +0000, Matthew Wilcox wrote:
>>> On Wed, Feb 16, 2022 at 03:57:39PM -0800, Shakeel Butt wrote:
>>>> On Wed, Feb 16, 2022 at 3:12 PM Matthew Wilcox <willy@infradead.org> wrote:
>>>>>
>>>> [...]
>>>>>> Here I'd keep _page (or _folio, if you want), because
>>>>>> it makes it clear that the function goes over all pages
>>>>>> (and is expensive therefore).
>>>>>
>>>>> But there's no intrinsic reason that it should.  If that's a
>>>>> performance problem, you could keep slabs on a separate list
>>>>> from, eg, file or anon memory.
>>>>>
>>>>
>>>> Is there enough space in the struct slab for that list or do you mean
>>>> something different?
>>>
>>> Well, I don't know what data structure your for_each_page() uses, so
>>> I don't know what to suggest exactly.
>>>
>>> If it's an iteration over the entirety of memmap, then maybe you'd
>>> rather iterate over each kmem_cache like /proc/slabinfo does?
>>
>> Anyway, here is v2. I hope you're fine with it.
> 
> It works well on my test node,
> Tested-by: Vasily Averin <vvs@virtuozzo.com>

Thanks all for fixing and testing!

> Andrew,
> this script was created in August 2020, broken in few months, repaired in
> April 2021,
> then broken again in October 2021 and I hope will be repaired soon.

Since it was me who broke it now in 5.17-rc1 and it's slab related, I'd take
the fix to slab tree and send to Linus so it's in 5.17 final.

> I think it would be useful to add smoke test with this script into checks
> of some build robots.
> 
> Could you please advise how to do it?

I guess ask the lkp maintainers?

BTW I'm a bit skeptical in general about maintaining drgn (or gdb) based
python scriplets within the kernel tree, that are supposed to work only with
that specific version. The model of the 'crash' tool which supports a range
of kernels seems more practical and sustainable to me.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: tools/cgroup/memcg_slabinfo.py is broken
       [not found]                     ` <Yg2cKKnIboNu7j+p@carbon.DHCP.thefacebook.com>
       [not found]                       ` <11060a75-98c2-f547-68eb-fcab404a2539@virtuozzo.com>
@ 2022-02-21 13:27                       ` Matthew Wilcox
  1 sibling, 0 replies; 10+ messages in thread
From: Matthew Wilcox @ 2022-02-21 13:27 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Shakeel Butt, Vasily Averin, Andrew Morton, Linux MM, Vlastimil Babka

On Wed, Feb 16, 2022 at 04:51:52PM -0800, Roman Gushchin wrote:
> >From 04efbbdcfde4535c4623f2fd6910a505a9c638bb Mon Sep 17 00:00:00 2001
> From: Roman Gushchin <guro@fb.com>
> Date: Wed, 16 Feb 2022 12:43:30 -0800
> Subject: [PATCH v2] tools/cgroup/slabinfo: update to work with struct slab
> 
> After the introduction of the dedicated struct slab to describe slab
> pages by commit d122019bf061 ("mm: Split slab into its own type") and
> the following removal of the corresponding struct page's fields by
> commit 07f910f9b729 ("mm: Remove slab from struct page") the
[...]
> 
> v2: change naming and count_partial()/count_free()/for_each_slab()
>     signatures to work with slabs, suggested by Matthew Wilcox
> 
> Signed-off-by: Roman Gushchin <guro@fb.com>
> Reported-by: Vasily Averin <vvs@virtuozzo.com>

Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-02-21 13:27 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-15 13:22 tools/cgroup/memcg_slabinfo.py is broken Vasily Averin
2022-02-15 23:29 ` Roman Gushchin
     [not found]   ` <35486a00-791c-3b1b-14e0-476e6242709b@virtuozzo.com>
     [not found]     ` <Yg1kFH3umdrvOhu1@carbon.dhcp.thefacebook.com>
2022-02-16 21:12       ` Matthew Wilcox
2022-02-16 21:44         ` Roman Gushchin
     [not found]           ` <Yg1y0ax5bnjGLAqz@casper.infradead.org>
     [not found]             ` <Yg16ClVQj8SLSlqV@carbon.DHCP.thefacebook.com>
2022-02-16 23:12               ` Matthew Wilcox
2022-02-16 23:57                 ` Shakeel Butt
2022-02-17  0:13                   ` Matthew Wilcox
2022-02-17  0:25                     ` Roman Gushchin
     [not found]                     ` <Yg2cKKnIboNu7j+p@carbon.DHCP.thefacebook.com>
     [not found]                       ` <11060a75-98c2-f547-68eb-fcab404a2539@virtuozzo.com>
2022-02-21  9:19                         ` Vlastimil Babka
2022-02-21 13:27                       ` Matthew Wilcox

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).