linux-mm.kvack.org archive mirror
* Re: [PATCH memcg v6] net: set proper memcg for net_init hooks allocations
       [not found] ` <f9394752-e272-9bf9-645f-a18c56d1c4ec@openvz.org>
@ 2022-06-06 13:49   ` Qian Cai
       [not found]     ` <0e714a5a-d2ed-9b44-fdbe-04b5595165da@openvz.org>
       [not found]     ` <360a2672-65a7-4ad4-c8b8-cc4c1f0c02cd@openvz.org>
       [not found]   ` <20220918092849.GA10314@u164.east.ru>
  1 sibling, 2 replies; 16+ messages in thread
From: Qian Cai @ 2022-06-06 13:49 UTC (permalink / raw)
  To: Vasily Averin
  Cc: Andrew Morton, kernel, linux-kernel, linux-mm, Shakeel Butt,
	Roman Gushchin, Michal Koutný,
	Vlastimil Babka, Michal Hocko, Florian Westphal, David S. Miller,
	Jakub Kicinski, Paolo Abeni, Eric Dumazet, cgroups

On Fri, Jun 03, 2022 at 07:19:43AM +0300, Vasily Averin wrote:
> __register_pernet_operations() executes init hook of registered
> pernet_operation structure in all existing net namespaces.
> 
> Typically, these hooks are called by a process associated with
> the specified net namespace, and all __GFP_ACCOUNT marked
> allocation are accounted for corresponding container/memcg.
> 
> However __register_pernet_operations() calls the hooks in the same
> context, and as a result all marked allocations are accounted
> to one memcg for all processed net namespaces.
> 
> This patch adjusts active memcg for each net namespace and helps
> to account memory allocated inside ops_init() into the proper memcg.
> 
> Signed-off-by: Vasily Averin <vvs@openvz.org>
> Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
> Acked-by: Shakeel Butt <shakeelb@google.com>
> ---
...
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 9ecead1042b9..dad16b484cd5 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -1755,6 +1755,42 @@ static inline void count_objcg_event(struct obj_cgroup *objcg,
>  	rcu_read_unlock();
>  }
>  
> +/**
> + * get_mem_cgroup_from_obj - get a memcg associated with passed kernel object.
> + * @p: pointer to object from which memcg should be extracted. It can be NULL.
> + *
> + * Retrieves the memory group into which the memory of the pointed kernel
> + * object is accounted. If memcg is found, its reference is taken.
> + * If a passed kernel object is uncharged, or if proper memcg cannot be found,
> + * as well as if mem_cgroup is disabled, NULL is returned.
> + *
> + * Return: valid memcg pointer with taken reference or NULL.
> + */
> +static inline struct mem_cgroup *get_mem_cgroup_from_obj(void *p)
> +{
> +	struct mem_cgroup *memcg;
> +
> +	rcu_read_lock();
> +	do {
> +		memcg = mem_cgroup_from_obj(p);
> +	} while (memcg && !css_tryget(&memcg->css));
> +	rcu_read_unlock();
> +	return memcg;
> +}
...
> diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
> index 0ec2f5906a27..6b9f19122ec1 100644
> --- a/net/core/net_namespace.c
> +++ b/net/core/net_namespace.c
> @@ -18,6 +18,7 @@
>  #include <linux/user_namespace.h>
>  #include <linux/net_namespace.h>
>  #include <linux/sched/task.h>
> +#include <linux/sched/mm.h>
>  #include <linux/uidgid.h>
>  #include <linux/cookie.h>
>  
> @@ -1143,7 +1144,13 @@ static int __register_pernet_operations(struct list_head *list,
>  		 * setup_net() and cleanup_net() are not possible.
>  		 */
>  		for_each_net(net) {
> +			struct mem_cgroup *old, *memcg;
> +
> +			memcg = mem_cgroup_or_root(get_mem_cgroup_from_obj(net));
> +			old = set_active_memcg(memcg);
>  			error = ops_init(ops, net);
> +			set_active_memcg(old);
> +			mem_cgroup_put(memcg);
>  			if (error)
>  				goto out_undo;
>  			list_add_tail(&net->exit_list, &net_exit_list);
> -- 
> 2.36.1

This triggers a few boot warnings like these:

 virt_to_phys used for non-linear address: ffffd8efe2d2fe00 (init_net)
 WARNING: CPU: 87 PID: 3170 at arch/arm64/mm/physaddr.c:12 __virt_to_phys
 CPU: 87 PID: 3170 Comm: modprobe Tainted: G    B   W         5.19.0-rc1-next-20220606 #138
 pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
 pc : __virt_to_phys
 lr : __virt_to_phys
 sp : ffff800051cc76b0
 x29: ffff800051cc76b0 x28: ffffd8efb5ba6ab8 x27: ffffd8efb5ba6b2c
 x26: ffffd8efb1bccb20 x25: ffffd8efbaaf8200 x24: ffff800051cc77f0
 x23: ffffd8efb744a000 x22: ffffd8efbb1bc000 x21: 0000600000000000
 x20: 0000d8efe2d2fe00 x19: ffffd8efe2d2fe00 x18: 0000000000000443
 x17: 0000000000000000 x16: 0000000000000002 x15: ffffd8efb9db2000
 x14: 0000000000000001 x13: 0000000000000000 x12: ffff6806c88f8986
 x11: 1fffe806c88f8985 x10: ffff6806c88f8985 x9 : dfff800000000000
 x8 : ffff4036447c4c2b x7 : 0000000000000001 x6 : ffff6806c88f8985
 x5 : ffff4036447c4c28 x4 : ffff6806c88f8986 x3 : ffffd8efb34b3850
 x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff400335f99a80
 Call trace:
  __virt_to_phys
  mem_cgroup_from_obj
  __register_pernet_operations
  register_pernet_operations
  register_pernet_subsys
  nfnetlink_init [nfnetlink]
  load_module
  __do_sys_finit_module
  __arm64_sys_finit_module
  invoke_syscall
  el0_svc_common.constprop.0
  do_el0_svc
  el0_svc
  el0t_64_sync_handler
  el0t_64_sync
 irq event stamp: 0
 hardirqs last  enabled at (0):  0x0
 hardirqs last disabled at (0):  copy_process
 softirqs last  enabled at (0):  copy_process
 softirqs last disabled at (0):  0x0

 virt_to_phys used for non-linear address: ffffd8efe2d2fe00 (init_net)
 WARNING: CPU: 156 PID: 3176 at arch/arm64/mm/physaddr.c:12 __virt_to_phys
 CPU: 156 PID: 3176 Comm: modprobe Tainted: G    B   W         5.19.0-rc1-next-20220606 #138
 pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
 pc : __virt_to_phys
 lr : __virt_to_phys
 sp : ffff800051b376e0
 x29: ffff800051b376e0 x28: ffffd8efb5ba6ab8 x27: ffffd8efb5ba6b2c
 x26: ffffd8efb286e910 x25: ffffd8efbaaf8200 x24: ffff800051b37820
 x23: ffffd8efb744a000 x22: ffffd8efbb1bc000 x21: 0000600000000000
 x20: 0000d8efe2d2fe00 x19: ffffd8efe2d2fe00 x18: 00000000000001cb
 x17: 0000000000000000 x16: 0000000000000002 x15: ffffd8efb9db2000
 x14: 0000000000000001 x13: 0000000000000000 x12: ffff6806c8a03f86
 x8 : ffff40364501fc2b x7 : 0000000000000001 x6 : ffff6806c8a03f85
 x5 : ffff40364501fc28 x4 : ffff6806c8a03f86 x3 : ffffd8efb34b3850
 x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff40033376b4c0
 Call trace:
  __virt_to_phys
  mem_cgroup_from_obj
  __register_pernet_operations
  register_pernet_operations
  register_pernet_subsys
  nf_tables_module_init [nf_tables]
  do_one_initcall
  do_init_module
  load_module
  __do_sys_finit_module
  __arm64_sys_finit_module
  invoke_syscall
  el0_svc_common.constprop.0
  do_el0_svc
  el0_svc
  el0t_64_sync_handler
  el0t_64_sync
 irq event stamp: 0
 hardirqs last  enabled at (0):  0x0
 hardirqs last disabled at (0):  copy_process
 softirqs last  enabled at (0):  copy_process
 softirqs last disabled at (0):  0x0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH memcg v6] net: set proper memcg for net_init hooks allocations
       [not found]     ` <0e714a5a-d2ed-9b44-fdbe-04b5595165da@openvz.org>
@ 2022-06-06 18:43       ` Qian Cai
  0 siblings, 0 replies; 16+ messages in thread
From: Qian Cai @ 2022-06-06 18:43 UTC (permalink / raw)
  To: Vasily Averin
  Cc: Andrew Morton, kernel, linux-kernel, linux-mm, Shakeel Butt,
	Roman Gushchin, Michal Koutný,
	Vlastimil Babka, Michal Hocko, Florian Westphal, David S. Miller,
	Jakub Kicinski, Paolo Abeni, Eric Dumazet, cgroups

On Mon, Jun 06, 2022 at 08:37:26PM +0300, Vasily Averin wrote:
> On 6/6/22 16:49, Qian Cai wrote:
> > This triggers a few boot warnings like these:
> > 
> >  virt_to_phys used for non-linear address: ffffd8efe2d2fe00 (init_net)
> >  WARNING: CPU: 87 PID: 3170 at arch/arm64/mm/physaddr.c:12 __virt_to_phys
> 
> Thank you for reporting the problem,
> Could you please provide me your config file via private email?

$ make ARCH=arm64 defconfig debug.config



* Re: [PATCH memcg v6] net: set proper memcg for net_init hooks allocations
       [not found]     ` <360a2672-65a7-4ad4-c8b8-cc4c1f0c02cd@openvz.org>
@ 2022-06-07  5:58       ` Shakeel Butt
  2022-06-07 12:37         ` Vasily Averin
  0 siblings, 1 reply; 16+ messages in thread
From: Shakeel Butt @ 2022-06-07  5:58 UTC (permalink / raw)
  To: Vasily Averin
  Cc: Qian Cai, Roman Gushchin, Andrew Morton, kernel, LKML, Linux MM,
	Michal Koutný,
	Vlastimil Babka, Michal Hocko, Florian Westphal, David S. Miller,
	Jakub Kicinski, Paolo Abeni, Eric Dumazet, Cgroups

On Mon, Jun 6, 2022 at 11:45 AM Vasily Averin <vvs@openvz.org> wrote:
>
[...]
>
> As far as I understand this report means that 'init_net' have incorrect
> virtual address on arm64.

So, the two call stacks suggest that the addresses belong to the kernel
modules (nfnetlink and nf_tables), whose underlying memory is allocated
through vmalloc, and virt_to_page() does not work on vmalloc()
addresses.

>
> Roman, Shakeel, I need your help
>
> Should we perhaps verify kaddr via virt_addr_valid() before using virt_to_page()
> If so, where it should be checked?

I think a virt_addr_valid() check in mem_cgroup_from_obj() should work,
but I believe it is expensive on the arm64 platform. A cheaper, if
slightly hacky, way to avoid such addresses is to use
is_vmalloc_addr() directly.
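
To make the address question concrete, the following is a small userspace
model of the linear-map range test that a virt_addr_valid()-style guard on
arm64 would start from. It is only a sketch under stated assumptions: the
model_* names are hypothetical and not kernel APIs, the 48-bit VA size is
an assumption about the reporter's configuration, and the bounds follow the
arm64 layout as it is commonly described (linear map in the lower half of
the kernel VA space).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace model only. The model_* helpers are hypothetical; they mimic
 * the arm64 layout arithmetic and are not the kernel's implementation.
 */

/* Start of the linear map for a given VA size (assumed: -(1 << va_bits)). */
static uint64_t model_page_offset(unsigned int va_bits)
{
	assert(va_bits >= 2 && va_bits <= 63);
	return -(UINT64_C(1) << va_bits);
}

/* End of the linear map (assumed: -(1 << (va_bits - 1))). */
static uint64_t model_page_end(unsigned int va_bits)
{
	assert(va_bits >= 2 && va_bits <= 63);
	return -(UINT64_C(1) << (va_bits - 1));
}

/* True if addr lies inside [page_offset, page_end), i.e. the linear map. */
static bool model_is_linear_map_addr(uint64_t addr, unsigned int va_bits)
{
	uint64_t start = model_page_offset(va_bits);
	uint64_t end = model_page_end(va_bits);

	/* Unsigned wrap-around makes this a single range comparison. */
	return (addr - start) < (end - start);
}
```

Under the 48-bit assumption, calling
model_is_linear_map_addr(0xffffd8efe2d2fe00, 48) with the address from the
warning returns false: the address is a kernel address, but it lies outside
the modelled linear-map range, which is what the "non-linear address"
message complains about.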



* Re: [PATCH memcg v6] net: set proper memcg for net_init hooks allocations
  2022-06-07  5:58       ` Shakeel Butt
@ 2022-06-07 12:37         ` Vasily Averin
  2022-06-07 14:10           ` Shakeel Butt
  0 siblings, 1 reply; 16+ messages in thread
From: Vasily Averin @ 2022-06-07 12:37 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Qian Cai, Roman Gushchin, Andrew Morton, kernel, LKML, Linux MM,
	Michal Koutný,
	Vlastimil Babka, Michal Hocko, Florian Westphal, David S. Miller,
	Jakub Kicinski, Paolo Abeni, Eric Dumazet, Cgroups

On 6/7/22 08:58, Shakeel Butt wrote:
> On Mon, Jun 6, 2022 at 11:45 AM Vasily Averin <vvs@openvz.org> wrote:
>>
> [...]
>>
>> As far as I understand this report means that 'init_net' have incorrect
>> virtual address on arm64.
> 
> So, the two call stacks suggest that the addresses belong to the kernel
> modules (nfnetlink and nf_tables), whose underlying memory is allocated
> through vmalloc, and virt_to_page() does not work on vmalloc()
> addresses.

However, in both of these cases get_mem_cgroup_from_obj() -> mem_cgroup_from_obj() ->
virt_to_folio() -> virt_to_page() -> virt_to_pfn() -> __virt_to_phys()
handles the address of a struct net taken from for_each_net().
The only net namespace that exists at this stage is init_net,
and the dmesg output confirms this:
"virt_to_phys used for non-linear address: ffffd8efe2d2fe00 (init_net)"

>> Roman, Shakeel, I need your help
>>
>> Should we perhaps verify kaddr via virt_addr_valid() before using virt_to_page()
>> If so, where it should be checked?
> 
> I think a virt_addr_valid() check in mem_cgroup_from_obj() should work,
> but I believe it is expensive on the arm64 platform. A cheaper, if
> slightly hacky, way to avoid such addresses is to use
> is_vmalloc_addr() directly.

I do not understand why you consider the processed address to be vmalloc-specific.
As far as I understand, it is a valid address of a static variable, and for some reason
arm64 does not consider such addresses valid virtual addresses.

Thank you,
	Vasily Averin



* Re: [PATCH memcg v6] net: set proper memcg for net_init hooks allocations
  2022-06-07 12:37         ` Vasily Averin
@ 2022-06-07 14:10           ` Shakeel Butt
  0 siblings, 0 replies; 16+ messages in thread
From: Shakeel Butt @ 2022-06-07 14:10 UTC (permalink / raw)
  To: Vasily Averin
  Cc: Qian Cai, Roman Gushchin, Andrew Morton, kernel, LKML, Linux MM,
	Michal Koutný,
	Vlastimil Babka, Michal Hocko, Florian Westphal, David S. Miller,
	Jakub Kicinski, Paolo Abeni, Eric Dumazet, Cgroups

On Tue, Jun 7, 2022 at 5:37 AM Vasily Averin <vvs@openvz.org> wrote:
>
> On 6/7/22 08:58, Shakeel Butt wrote:
> > On Mon, Jun 6, 2022 at 11:45 AM Vasily Averin <vvs@openvz.org> wrote:
> >>
> > [...]
> >>
> >> As far as I understand this report means that 'init_net' have incorrect
> >> virtual address on arm64.
> >
> > So, the two call stacks suggest that the addresses belong to the kernel
> > modules (nfnetlink and nf_tables), whose underlying memory is allocated
> > through vmalloc, and virt_to_page() does not work on vmalloc()
> > addresses.
>
> However, in both of these cases get_mem_cgroup_from_obj() -> mem_cgroup_from_obj() ->
> virt_to_folio() -> virt_to_page() -> virt_to_pfn() -> __virt_to_phys()
> handles the address of a struct net taken from for_each_net().
> The only net namespace that exists at this stage is init_net,
> and the dmesg output confirms this:
> "virt_to_phys used for non-linear address: ffffd8efe2d2fe00 (init_net)"
>
> >> Roman, Shakeel, I need your help
> >>
> >> Should we perhaps verify kaddr via virt_addr_valid() before using virt_to_page()
> >> If so, where it should be checked?
> >
> > I think a virt_addr_valid() check in mem_cgroup_from_obj() should work,
> > but I believe it is expensive on the arm64 platform. A cheaper, if
> > slightly hacky, way to avoid such addresses is to use
> > is_vmalloc_addr() directly.
>
> I do not understand why you consider the processed address to be vmalloc-specific.
> As far as I understand, it is a valid address of a static variable, and for some reason
> arm64 does not consider such addresses valid virtual addresses.
>

Indeed, you are right: we are using the addresses of net namespaces,
and the report already identifies the address ffffd8efe2d2fe00 as
init_net.

I don't know the right way to handle such addresses on arm64.
BTW, there is a separate report on this issue, and the arm maintainers
are also CCed. Why not ask this question on that report?



* Re: [sparc64] fails to boot, (was: Re: [PATCH memcg v6] net: set proper memcg for net_init hooks allocations)
       [not found]   ` <20220918092849.GA10314@u164.east.ru>
@ 2022-09-21 14:41     ` Anatoly Pugachev
  2022-09-21 14:44     ` Anatoly Pugachev
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 16+ messages in thread
From: Anatoly Pugachev @ 2022-09-21 14:41 UTC (permalink / raw)
  To: Vasily Averin
  Cc: Andrew Morton, kernel, Linux Kernel list, linux-mm, Shakeel Butt,
	Roman Gushchin, Michal Koutný,
	Vlastimil Babka, Michal Hocko, Florian Westphal, David S. Miller,
	Jakub Kicinski, Paolo Abeni, Eric Dumazet, cgroups,
	Sparc kernel list

On Sun, Sep 18, 2022 at 12:39 PM Anatoly Pugachev <matorola@gmail.com>
wrote:

>
> I'm unable to boot my sparc64 VM anymore (5.19 still boots, 6.0-rc1 does
> not),
> bisected up to this patch,
>
> mator@ttip:~/linux-2.6$ git bisect bad
> 1d0403d20f6c281cb3d14c5f1db5317caeec48e9 is the first bad commit
>

reverting this patch makes my sparc64 box boot successfully.


* Re: [sparc64] fails to boot, (was: Re: [PATCH memcg v6] net: set proper memcg for net_init hooks allocations)
       [not found]   ` <20220918092849.GA10314@u164.east.ru>
  2022-09-21 14:41     ` [sparc64] fails to boot, (was: Re: [PATCH memcg v6] net: set proper memcg for net_init hooks allocations) Anatoly Pugachev
@ 2022-09-21 14:44     ` Anatoly Pugachev
  2022-09-21 17:02       ` Michal Koutný
  2022-09-27  9:54     ` Vlastimil Babka
  2022-09-28  7:21     ` [sparc64] fails to boot, (was: Re: [PATCH memcg v6] net: set proper memcg for net_init hooks allocations) #forregzbot Thorsten Leemhuis
  3 siblings, 1 reply; 16+ messages in thread
From: Anatoly Pugachev @ 2022-09-21 14:44 UTC (permalink / raw)
  To: Vasily Averin
  Cc: Andrew Morton, kernel, Linux Kernel list, linux-mm, Shakeel Butt,
	Roman Gushchin, Michal Koutný,
	Vlastimil Babka, Michal Hocko, Florian Westphal, David S. Miller,
	Jakub Kicinski, Paolo Abeni, Eric Dumazet, cgroups,
	Sparc kernel list

On Sun, Sep 18, 2022 at 12:39 PM Anatoly Pugachev <matorola@gmail.com> wrote:
>
>
> I'm unable to boot my sparc64 VM anymore (5.19 still boots, 6.0-rc1 does not),
> bisected up to this patch,
>
> mator@ttip:~/linux-2.6$ git bisect bad
> 1d0403d20f6c281cb3d14c5f1db5317caeec48e9 is the first bad commit
> commit 1d0403d20f6c281cb3d14c5f1db5317caeec48e9

reverting this patch makes my sparc64 box boot successfully.

mator@ttip:~$ uname -a
Linux ttip 6.0.0-rc6-00010-gb7f0f527dc3c #377 SMP Wed Sep 21 17:34:50
MSK 2022 sparc64 GNU/Linux



* Re: [sparc64] fails to boot, (was: Re: [PATCH memcg v6] net: set proper memcg for net_init hooks allocations)
  2022-09-21 14:44     ` Anatoly Pugachev
@ 2022-09-21 17:02       ` Michal Koutný
  2022-09-26 13:06         ` Anatoly Pugachev
  0 siblings, 1 reply; 16+ messages in thread
From: Michal Koutný @ 2022-09-21 17:02 UTC (permalink / raw)
  To: Anatoly Pugachev
  Cc: Vasily Averin, Andrew Morton, kernel, Linux Kernel list,
	linux-mm, Shakeel Butt, Roman Gushchin, Vlastimil Babka,
	Michal Hocko, Florian Westphal, David S. Miller, Jakub Kicinski,
	Paolo Abeni, Eric Dumazet, cgroups, Sparc kernel list

Hello.

Thanks for the report.

On Wed, Sep 21, 2022 at 05:44:56PM +0300, Anatoly Pugachev <matorola@gmail.com> wrote:
> On Sun, Sep 18, 2022 at 12:39 PM Anatoly Pugachev <matorola@gmail.com> wrote:
> >
> >
> > I'm unable to boot my sparc64 VM anymore (5.19 still boots, 6.0-rc1 does not),
> > bisected up to this patch,
> >
> > mator@ttip:~/linux-2.6$ git bisect bad
> > 1d0403d20f6c281cb3d14c5f1db5317caeec48e9 is the first bad commit
> > commit 1d0403d20f6c281cb3d14c5f1db5317caeec48e9
> 
> reverting this patch makes my sparc64 box boot successfully.

The failed address falls into the vmemmap region (per your boot log
output). It looks like the respective page/folio (of the init_net struct) is
unbacked there (and folio_test_slab likely fails when dereferencing ->flags).

Would you mind sharing your kernel's config?
(I'm most curious about CONFIG_SPARSEMEM_VMEMMAP; I'm not familiar with
your arch at all, though.)
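
For reference, a minimal userspace sketch of how an address can be checked
against the regions printed in the "MM:" lines of the reported boot log
(PAGE_OFFSET 0xfff8000000000000, VMALLOC 0x0000000100000000 to
0x0006000000000000, VMEMMAP 0x0006000000000000 to 0x000c000000000000). The
model_* names are hypothetical, the ranges are assumed to be half-open, and
treating every address at or above PAGE_OFFSET as linear map is a
simplification; none of this is kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Region bounds copied from the "MM:" lines of the reported boot log. */
#define MODEL_VMALLOC_START UINT64_C(0x0000000100000000)
#define MODEL_VMEMMAP_START UINT64_C(0x0006000000000000) /* also VMALLOC end */
#define MODEL_VMEMMAP_END   UINT64_C(0x000c000000000000)
#define MODEL_PAGE_OFFSET   UINT64_C(0xfff8000000000000)

enum model_region {
	MODEL_OTHER,
	MODEL_VMALLOC,
	MODEL_VMEMMAP,
	MODEL_LINEAR,
};

/* Classify addr against the logged ranges (assumed half-open). */
static enum model_region model_classify(uint64_t addr)
{
	if (addr >= MODEL_PAGE_OFFSET)
		return MODEL_LINEAR; /* simplification: no upper bound check */
	if (addr >= MODEL_VMEMMAP_START && addr < MODEL_VMEMMAP_END)
		return MODEL_VMEMMAP;
	if (addr >= MODEL_VMALLOC_START && addr < MODEL_VMEMMAP_START)
		return MODEL_VMALLOC;
	return MODEL_OTHER;
}

/* Human-readable name for a region, for printing next to an address. */
static const char *model_region_name(enum model_region r)
{
	switch (r) {
	case MODEL_LINEAR:
		return "linear map";
	case MODEL_VMEMMAP:
		return "vmemmap";
	case MODEL_VMALLOC:
		return "vmalloc";
	case MODEL_OTHER:
		return "other";
	}
	assert(!"unknown region");
	return "?";
}
```

Passing a faulting address from an oops to model_classify() shows which of
the logged regions it belongs to; a result of MODEL_VMEMMAP would match the
observation above.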

Thanks,
Michal



* Re: [sparc64] fails to boot, (was: Re: [PATCH memcg v6] net: set proper memcg for net_init hooks allocations)
  2022-09-21 17:02       ` Michal Koutný
@ 2022-09-26 13:06         ` Anatoly Pugachev
  2022-09-26 17:28           ` Jakub Kicinski
  0 siblings, 1 reply; 16+ messages in thread
From: Anatoly Pugachev @ 2022-09-26 13:06 UTC (permalink / raw)
  To: Michal Koutný
  Cc: Vasily Averin, Andrew Morton, kernel, Linux Kernel list,
	linux-mm, Shakeel Butt, Roman Gushchin, Vlastimil Babka,
	Michal Hocko, Florian Westphal, David S. Miller, Jakub Kicinski,
	Paolo Abeni, Eric Dumazet, cgroups, Sparc kernel list

On Wed, Sep 21, 2022 at 8:03 PM Michal Koutný <mkoutny@suse.com> wrote:
> On Wed, Sep 21, 2022 at 05:44:56PM +0300, Anatoly Pugachev <matorola@gmail.com> wrote:
> > On Sun, Sep 18, 2022 at 12:39 PM Anatoly Pugachev <matorola@gmail.com> wrote:
> > >
> > >
> > > I'm unable to boot my sparc64 VM anymore (5.19 still boots, 6.0-rc1 does not),
> > > bisected up to this patch,
> > >
> > > mator@ttip:~/linux-2.6$ git bisect bad
> > > 1d0403d20f6c281cb3d14c5f1db5317caeec48e9 is the first bad commit
> > > commit 1d0403d20f6c281cb3d14c5f1db5317caeec48e9
> >
> > reverting this patch makes my sparc64 box boot successfully.
>
> The failed address falls into the vmemmap region (per your boot log
> output). It looks like the respective page/folio (of the init_net struct) is
> unbacked there (and folio_test_slab likely fails when dereferencing ->flags).
>
> Would you mind sharing your kernel's config?
> (I'm most curious about CONFIG_SPARSEMEM_VMEMMAP; I'm not familiar with
> your arch at all, though.)

mator@ttip:~/dmesg$ zcat config-6.0.0-rc6-00010-gb7f0f527dc3c.gz | grep VMEMMAP
CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
CONFIG_SPARSEMEM_VMEMMAP=y

I upload configs and boot logs to
https://github.com/mator/sparc64-dmesg

I build new kernel versions/releases with 'make olddefconfig && make -j';
the config of the currently booted 6.0.0-rc6 is available at
https://github.com/mator/sparc64-dmesg/blob/master/config-6.0.0-rc6-00010-gb7f0f527dc3c.gz



* Re: [sparc64] fails to boot, (was: Re: [PATCH memcg v6] net: set proper memcg for net_init hooks allocations)
  2022-09-26 13:06         ` Anatoly Pugachev
@ 2022-09-26 17:28           ` Jakub Kicinski
  2022-09-26 17:32             ` Shakeel Butt
  0 siblings, 1 reply; 16+ messages in thread
From: Jakub Kicinski @ 2022-09-26 17:28 UTC (permalink / raw)
  To: Vasily Averin, Shakeel Butt
  Cc: Anatoly Pugachev, Michal Koutný,
	Andrew Morton, kernel, Linux Kernel list, linux-mm,
	Roman Gushchin, Vlastimil Babka, Michal Hocko, Florian Westphal,
	David S. Miller, Paolo Abeni, Eric Dumazet, cgroups,
	Sparc kernel list, Thorsten Leemhuis, Linus Torvalds

On Mon, 26 Sep 2022 16:06:08 +0300 Anatoly Pugachev wrote:
> On Wed, Sep 21, 2022 at 8:03 PM Michal Koutný <mkoutny@suse.com> wrote:
> > On Wed, Sep 21, 2022 at 05:44:56PM +0300, Anatoly Pugachev <matorola@gmail.com> wrote:  
> > > reverting this patch makes my sparc64 box boot successfully.  
> >
> > The failed address falls into the vmemmap region (per your boot log
> > output). It looks like the respective page/folio (of the init_net struct) is
> > unbacked there (and folio_test_slab likely fails when dereferencing ->flags).
> >
> > Would you mind sharing your kernel's config?
> > (I'm most curious about CONFIG_SPARSEMEM_VMEMMAP; I'm not familiar with
> > your arch at all, though.)
> 
> mator@ttip:~/dmesg$ zcat config-6.0.0-rc6-00010-gb7f0f527dc3c.gz | grep VMEMMAP
> CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
> CONFIG_SPARSEMEM_VMEMMAP=y
> 
> I upload configs and boot logs to
> https://github.com/mator/sparc64-dmesg
>
> I build new kernel versions/releases with 'make olddefconfig && make -j';
> the config of the currently booted 6.0.0-rc6 is available at
> https://github.com/mator/sparc64-dmesg/blob/master/config-6.0.0-rc6-00010-gb7f0f527dc3c.gz

Forgive my uninformed chime-in but Linus seemed happy with the size of
-rc7 and now I'm worried there won't be an -rc8. AFAICT this is a 6.0
regression. Vasily, Shakeel, do we have a plan to fix this?



* Re: [sparc64] fails to boot, (was: Re: [PATCH memcg v6] net: set proper memcg for net_init hooks allocations)
  2022-09-26 17:28           ` Jakub Kicinski
@ 2022-09-26 17:32             ` Shakeel Butt
  2022-09-26 17:36               ` Andrew Morton
  0 siblings, 1 reply; 16+ messages in thread
From: Shakeel Butt @ 2022-09-26 17:32 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Vasily Averin, Anatoly Pugachev, Michal Koutný,
	Andrew Morton, kernel, Linux Kernel list, linux-mm,
	Roman Gushchin, Vlastimil Babka, Michal Hocko, Florian Westphal,
	David S. Miller, Paolo Abeni, Eric Dumazet, Cgroups,
	Sparc kernel list, Thorsten Leemhuis, Linus Torvalds

On Mon, Sep 26, 2022 at 10:28 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Mon, 26 Sep 2022 16:06:08 +0300 Anatoly Pugachev wrote:
> > On Wed, Sep 21, 2022 at 8:03 PM Michal Koutný <mkoutny@suse.com> wrote:
> > > On Wed, Sep 21, 2022 at 05:44:56PM +0300, Anatoly Pugachev <matorola@gmail.com> wrote:
> > > > reverting this patch makes my sparc64 box boot successfully.
> > >
> > > The failed address falls into the vmemmap region (per your boot log
> > > output). It looks like the respective page/folio (of the init_net struct) is
> > > unbacked there (and folio_test_slab likely fails when dereferencing ->flags).
> > >
> > > Would you mind sharing your kernel's config?
> > > (I'm most curious about CONFIG_SPARSEMEM_VMEMMAP; I'm not familiar with
> > > your arch at all, though.)
> >
> > mator@ttip:~/dmesg$ zcat config-6.0.0-rc6-00010-gb7f0f527dc3c.gz | grep VMEMMAP
> > CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
> > CONFIG_SPARSEMEM_VMEMMAP=y
> >
> > I upload configs and boot logs to
> > https://github.com/mator/sparc64-dmesg
> >
> > I build new kernel versions/releases with 'make olddefconfig && make -j';
> > the config of the currently booted 6.0.0-rc6 is available at
> > https://github.com/mator/sparc64-dmesg/blob/master/config-6.0.0-rc6-00010-gb7f0f527dc3c.gz
>
> Forgive my uninformed chime-in but Linus seemed happy with the size of
> -rc7 and now I'm worried there won't be an -rc8. AFAICT this is a 6.0
> regression. Vasily, Shakeel, do we have a plan to fix this?

I was actually waiting for Vasily to respond. Anyways, I think the
easiest way to proceed is to revert the commit 1d0403d20f6c ("net: set
proper memcg for net_init hooks allocations"). We can debug the issue
in the next cycle.



* Re: [sparc64] fails to boot, (was: Re: [PATCH memcg v6] net: set proper memcg for net_init hooks allocations)
  2022-09-26 17:32             ` Shakeel Butt
@ 2022-09-26 17:36               ` Andrew Morton
  2022-09-26 19:00                 ` Shakeel Butt
  0 siblings, 1 reply; 16+ messages in thread
From: Andrew Morton @ 2022-09-26 17:36 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Jakub Kicinski, Vasily Averin, Anatoly Pugachev,
	Michal Koutný,
	kernel, Linux Kernel list, linux-mm, Roman Gushchin,
	Vlastimil Babka, Michal Hocko, Florian Westphal, David S. Miller,
	Paolo Abeni, Eric Dumazet, Cgroups, Sparc kernel list,
	Thorsten Leemhuis, Linus Torvalds

On Mon, 26 Sep 2022 10:32:49 -0700 Shakeel Butt <shakeelb@google.com> wrote:

> > Forgive my uninformed chime-in but Linus seemed happy with the size of
> > -rc7 and now I'm worried there won't be an -rc8. AFAICT this is a 6.0
> > regression. Vasily, Shakeel, do we have a plan to fix this?
> 
> I was actually waiting for Vasily to respond. Anyways, I think the
> easiest way to proceed is to revert the commit 1d0403d20f6c ("net: set
> proper memcg for net_init hooks allocations"). We can debug the issue
> in the next cycle.

If agreeable, could someone please send along a tested and changelogged
patch to do this?




* Re: [sparc64] fails to boot, (was: Re: [PATCH memcg v6] net: set proper memcg for net_init hooks allocations)
  2022-09-26 17:36               ` Andrew Morton
@ 2022-09-26 19:00                 ` Shakeel Butt
  0 siblings, 0 replies; 16+ messages in thread
From: Shakeel Butt @ 2022-09-26 19:00 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Jakub Kicinski, Vasily Averin, Anatoly Pugachev,
	Michal Koutný,
	kernel, Linux Kernel list, linux-mm, Roman Gushchin,
	Vlastimil Babka, Michal Hocko, Florian Westphal, David S. Miller,
	Paolo Abeni, Eric Dumazet, Cgroups, Sparc kernel list,
	Thorsten Leemhuis, Linus Torvalds

On Mon, Sep 26, 2022 at 10:36 AM Andrew Morton
<akpm@linux-foundation.org> wrote:
>
> On Mon, 26 Sep 2022 10:32:49 -0700 Shakeel Butt <shakeelb@google.com> wrote:
>
> > > Forgive my uninformed chime-in but Linus seemed happy with the size of
> > > -rc7 and now I'm worried there won't be an -rc8. AFAICT this is a 6.0
> > > regression. Vasily, Shakeel, do we have a plan to fix this?
> >
> > I was actually waiting for Vasily to respond. Anyways, I think the
> > easiest way to proceed is to revert the commit 1d0403d20f6c ("net: set
> > proper memcg for net_init hooks allocations"). We can debug the issue
> > in the next cycle.
>
> If agreeable, could someone please send along a tested and changelogged
> patch to do this?
>

I will send this revert soon and I think Anatoly has already tested
the revert but I will let him add his tested-by tag.



* Re: [sparc64] fails to boot, (was: Re: [PATCH memcg v6] net: set proper memcg for net_init hooks allocations)
       [not found]   ` <20220918092849.GA10314@u164.east.ru>
  2022-09-21 14:41     ` [sparc64] fails to boot, (was: Re: [PATCH memcg v6] net: set proper memcg for net_init hooks allocations) Anatoly Pugachev
  2022-09-21 14:44     ` Anatoly Pugachev
@ 2022-09-27  9:54     ` Vlastimil Babka
  2022-09-28  7:54       ` Thorsten Leemhuis
  2022-09-28  7:21     ` [sparc64] fails to boot, (was: Re: [PATCH memcg v6] net: set proper memcg for net_init hooks allocations) #forregzbot Thorsten Leemhuis
  3 siblings, 1 reply; 16+ messages in thread
From: Vlastimil Babka @ 2022-09-27  9:54 UTC (permalink / raw)
  To: Anatoly Pugachev, Vasily Averin
  Cc: Andrew Morton, kernel, linux-kernel, linux-mm, Shakeel Butt,
	Roman Gushchin, Michal Koutný,
	Michal Hocko, Florian Westphal, David S. Miller, Jakub Kicinski,
	Paolo Abeni, Eric Dumazet, cgroups, sparclinux, regressions

On 9/18/22 11:28, Anatoly Pugachev wrote:
> On Fri, Jun 03, 2022 at 07:19:43AM +0300, Vasily Averin wrote:
>> __register_pernet_operations() executes init hook of registered
>> pernet_operation structure in all existing net namespaces.
>> 
>> Typically, these hooks are called by a process associated with
>> the specified net namespace, and all __GFP_ACCOUNT marked
>> allocation are accounted for corresponding container/memcg.
>> 
>> However __register_pernet_operations() calls the hooks in the same
>> context, and as a result all marked allocations are accounted
>> to one memcg for all processed net namespaces.
>> 
>> This patch adjusts active memcg for each net namespace and helps
>> to account memory allocated inside ops_init() into the proper memcg.
>> 
>> Signed-off-by: Vasily Averin <vvs@openvz.org>
>> Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
>> Acked-by: Shakeel Butt <shakeelb@google.com>
>> ---
>> v6: re-based to current upstream (v5.18-11267-gb00ed48bb0a7)
> 
> 
> Hello!
> 
> I'm unable to boot my sparc64 VM anymore (5.19 still boots, 6.0-rc1 does not),
> bisected up to this patch,
> 
> mator@ttip:~/linux-2.6$ git bisect bad
> 1d0403d20f6c281cb3d14c5f1db5317caeec48e9 is the first bad commit
> commit 1d0403d20f6c281cb3d14c5f1db5317caeec48e9
> Author: Vasily Averin <vvs@openvz.org>
> Date:   Fri Jun 3 07:19:43 2022 +0300
> 
>     net: set proper memcg for net_init hooks allocations
> 
>     __register_pernet_operations() executes init hook of registered
>     pernet_operation structure in all existing net namespaces.
> 
>     Typically, these hooks are called by a process associated with the
>     specified net namespace, and all __GFP_ACCOUNT marked allocation are
>     accounted for corresponding container/memcg.
> 
>     However __register_pernet_operations() calls the hooks in the same
>     context, and as a result all marked allocations are accounted to one memcg
>     for all processed net namespaces.
> 
>     This patch adjusts active memcg for each net namespace and helps to
>     account memory allocated inside ops_init() into the proper memcg.
> 
>     Link: https://lkml.kernel.org/r/f9394752-e272-9bf9-645f-a18c56d1c4ec@openvz.org
>     Signed-off-by: Vasily Averin <vvs@openvz.org>
>     Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
>     Acked-by: Shakeel Butt <shakeelb@google.com>
>     Cc: Michal Koutný <mkoutny@suse.com>
>     Cc: Vlastimil Babka <vbabka@suse.cz>
>     Cc: Michal Hocko <mhocko@suse.com>
>     Cc: Florian Westphal <fw@strlen.de>
>     Cc: David S. Miller <davem@davemloft.net>
>     Cc: Jakub Kicinski <kuba@kernel.org>
>     Cc: Paolo Abeni <pabeni@redhat.com>
>     Cc: Eric Dumazet <edumazet@google.com>
>     Cc: Johannes Weiner <hannes@cmpxchg.org>
>     Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
>     Cc: Linux Kernel Functional Testing <lkft@linaro.org>
>     Cc: Muchun Song <songmuchun@bytedance.com>
>     Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
>     Cc: Qian Cai <quic_qiancai@quicinc.com>
>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> 
>  include/linux/memcontrol.h | 47 +++++++++++++++++++++++++++++++++++++++++++++-
>  net/core/net_namespace.c   |  7 +++++++
>  2 files changed, 53 insertions(+), 1 deletion(-)
> 
> getting the following kernel OOPS:
> 
> 
> [    0.000010] PROMLIB: Sun IEEE Boot Prom 'OBP 4.38.17 2019/01/25 08:22'
> [    0.000028] PROMLIB: Root node compatible: sun4v
> [    0.000070] Linux version 5.19.0-rc2-00025-g1d0403d20f6c (mator@ttip) (gcc (Debian 12.2.0-2) 12.2.0, GNU ld (GNU Binutils for Debian) 2.38.90.20220713) #376 SMP Sun Sep 18 02:22:43 MSK 2022
> [    0.000098] printk: debug: skip boot console de-registration.
> [    0.000438] printk: bootconsole [earlyprom0] enabled
> [    0.000491] ARCH: SUN4V
> [    0.000534] Ethernet address: 00:14:4f:fa:06:f2
> [    0.000583] MM: PAGE_OFFSET is 0xfff8000000000000 (max_phys_bits == 47)
> [    0.000644] MM: VMALLOC [0x0000000100000000 --> 0x0006000000000000]
> [    0.000704] MM: VMEMMAP [0x0006000000000000 --> 0x000c000000000000]
> [    0.014651] Kernel: Using 5 locked TLB entries for main kernel image.
> [    0.014719] Remapping the kernel...
> [    0.014750] done.
> [    0.033774] OF stdout device is: /virtual-devices@100/console@1
> [    0.033838] PROM: Built device tree with 67601 bytes of memory.
> [    0.033896] MDESC: Size is 24208 bytes.
> [    0.033989] PLATFORM: banner-name [SPARC T5-2]
> [    0.034034] PLATFORM: name [ORCL,SPARC-T5-2]
> [    0.034076] PLATFORM: hostid [84fa06f2]
> [    0.034113] PLATFORM: serial# [0035260e]
> [    0.034154] PLATFORM: stick-frequency [3b9aca00]
> [    0.034196] PLATFORM: mac-address [144ffa06f2]
> [    0.034238] PLATFORM: watchdog-resolution [1000 ms]
> [    0.034284] PLATFORM: watchdog-max-timeout [31536000000 ms]
> [    0.034335] PLATFORM: max-cpus [1024]
> [    0.034419] Top of RAM: 0x42f948000, Total RAM: 0x3ff3a0000
> [    0.034474] Memory hole size: 773MB
> [    0.036430] Allocated 24576 bytes for kernel page tables.
> [    0.036506] Zone ranges:
> [    0.036541]   Normal   [mem 0x0000000030400000-0x000000042f947fff]
> [    0.036602] Movable zone start for each node
> [    0.036645] Early memory node ranges
> [    0.036679]   node   0: [mem 0x0000000030400000-0x000000006febffff]
> [    0.036738]   node   0: [mem 0x000000006ff40000-0x000000006ff65fff]
> [    0.036796]   node   0: [mem 0x0000000070000000-0x000000042f8b1fff]
> [    0.036854]   node   0: [mem 0x000000042f940000-0x000000042f947fff]
> [    0.036912] Initmem setup node 0 [mem 0x0000000030400000-0x000000042f947fff]
> [    0.046980] On node 0, zone Normal: 98816 pages in unavailable ranges
> [    0.047007] On node 0, zone Normal: 64 pages in unavailable ranges
> [    0.048447] On node 0, zone Normal: 77 pages in unavailable ranges
> [    0.048516] On node 0, zone Normal: 71 pages in unavailable ranges
> [    0.050336] On node 0, zone Normal: 33628 pages in unavailable ranges
> [    0.050400] Booting Linux...
> [    0.050500] CPU CAPS: [flush,stbar,swap,muldiv,v9,blkinit,n2,mul32]
> [    0.050581] CPU CAPS: [div32,v8plus,popc,vis,vis2,ASIBlkInit,fmaf,vis3]
> [    0.050663] CPU CAPS: [hpc,ima,pause,cbcond,aes,des,kasumi,camellia]
> [    0.050744] CPU CAPS: [md5,sha1,sha256,sha512,mpmul,montmul,montsqr,crc32c]
> [    0.093786] percpu: Embedded 18 pages/cpu s105824 r8192 d33440 u262144
> [    0.095225] SUN4V: Mondo queue sizes [cpu(131072) dev(16384) r(8192) nr(256)]
> [    0.095510] Built 1 zonelists, mobility grouping on.  Total pages: 2077148
> [    0.095587] Kernel command line: BOOT_IMAGE=/vmlinux-5.19.0-rc2-00025-g1d0403d20f6c root=/dev/vdiska2 ro keep_bootcon
> [    0.095745] Unknown kernel command line parameters "BOOT_IMAGE=/vmlinux-5.19.0-rc2-00025-g1d0403d20f6c", will be passed to user space.
> [    0.095851] printk: log_buf_len individual max cpu contribution: 4096 bytes
> [    0.095914] printk: log_buf_len total cpu_extra contributions: 1044480 bytes
> [    0.095973] printk: log_buf_len min size: 131072 bytes
> [    0.097772] printk: log_buf_len: 2097152 bytes
> [    0.097818] printk: early log buf free: 126264(96%)
> [    0.099466] Dentry cache hash table entries: 2097152 (order: 11, 16777216 bytes, linear)
> [    0.100365] Inode-cache hash table entries: 1048576 (order: 10, 8388608 bytes, linear)
> [    0.100439] Sorting __ex_table...
> [    0.100692] mem auto-init: stack:off, heap alloc:off, heap free:off
> [    0.105101] Memory: 1259512K/16764544K available (8962K kernel code, 1702K rwdata, 3048K rodata, 632K init, 3160K bss, 289008K reserved, 0K cma-reserved)
> [    0.108565] SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=256, Nodes=1
> [    0.109364] ftrace: allocating 27588 entries in 54 pages
> [    0.120238] ftrace: allocated 54 pages with 4 groups
> [    0.120513] trace event string verifier disabled
> [    0.124589] rcu: Hierarchical RCU implementation.
> [    0.124642] rcu:     RCU debug extended QS entry/exit.
> [    0.124689]  Rude variant of Tasks RCU enabled.
> [    0.124733]  Tracing variant of Tasks RCU enabled.
> [    0.124778] rcu: RCU calculated value of scheduler-enlistment delay is 26 jiffies.
> [    0.131351] NR_IRQS: 2048, nr_irqs: 2048, preallocated irqs: 1
> [    0.131438] SUN4V: Using IRQ API major 3, cookie only virqs enabled
> [    0.135353] rcu: srcu_init: Setting srcu_struct sizes to big.
> [    0.135477] clocksource: stick: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
> [    0.135579] clocksource: mult[800000] shift[23]
> [    0.135626] clockevent: mult[80000000] shift[31]
> [    0.136279] Console: colour dummy device 80x25
> [    0.136333] printk: console [tty0] enabled
> [    0.136393] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
> [    0.136482] ... MAX_LOCKDEP_SUBCLASSES:  8
> [    0.136536] ... MAX_LOCK_DEPTH:          48
> [    0.136589] ... MAX_LOCKDEP_KEYS:        8192
> [    0.136645] ... CLASSHASH_SIZE:          4096
> [    0.136699] ... MAX_LOCKDEP_ENTRIES:     16384
> [    0.136756] ... MAX_LOCKDEP_CHAINS:      32768
> [    0.136811] ... CHAINHASH_SIZE:          16384
> [    0.136868]  memory used by lock dependency info: 2603 kB
> [    0.136933]  per task-struct memory footprint: 1920 bytes
> [    0.215908] Calibrating delay using timer specific routine.. 2007.88 BogoMIPS (lpj=4015778)
> [    0.216049] pid_max: default: 262144 minimum: 2048
> [    0.216772] LSM: Security Framework initializing
> [    0.217017] Unable to handle kernel paging request at virtual address 000612000002e000
> [    0.217116] tsk->{mm,active_mm}->context = 0000000000000000
> [    0.217184] tsk->{mm,active_mm}->pgd = fff8000070002000
> [    0.217247]               \|/ ____ \|/
> [    0.217247]               "@'/ .. \`@"
> [    0.217247]               /_| \__/ |_\
> [    0.217247]                  \__U_/
> [    0.217406] swapper/0(0): Oops [#1]
> [    0.217458] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc2-00025-g1d0403d20f6c #376
> [    0.217559] TSTATE: 0000009180001607 TPC: 00000000006c9118 TNPC: 00000000006c911c Y: df1f6831    Not tainted
> [    0.217673] TPC: <mem_cgroup_from_obj+0x78/0x120>
> [    0.217742] g0: 0000000000000000 g1: 0000004000000a89 g2: 0006000000000000 g3: 54256f3ea00db3c0
> [    0.217843] g4: 0000000000fdf680 g5: fff800042960e000 g6: 0000000000fc0000 g7: 0000000000000002
> [    0.217943] o0: 000612000002f688 o1: 0000000000fdffa0 o2: 22645555e843a019 o3: 24f02a9c57a00000
> [    0.218043] o4: 000000000000000d o5: 9b8bf183d547acad sp: 0000000000fc3191 ret_pc: 00000000006c90c8
> [    0.218145] RPC: <mem_cgroup_from_obj+0x28/0x120>
> [    0.218207] l0: 00000000011f31c0 l1: 0000000000000000 l2: 0000000000000000 l3: ffffffffffffffff
> [    0.218309] l4: ffffffff0000003c l5: 00000000014e3800 l6: 0000000000000000 l7: 0000000000fdac00
> [    0.218409] i0: 0000000001512d80 i1: 0000000000000000 i2: 0000000000000000 i3: 0000000000000002
> [    0.218509] i4: 00000000011f31c0 i5: 0000000000000000 i6: 0000000000fc3241 i7: 0000000000ae012c
> [    0.218609] I7: <__register_pernet_operations+0xcc/0x420>
> [    0.218681] Call Trace:
> [    0.218718] [<0000000000ae012c>] __register_pernet_operations+0xcc/0x420
> [    0.218800] [<0000000000ae04e4>] register_pernet_operations+0x64/0xa0
> [    0.218878] [<0000000000ae053c>] register_pernet_subsys+0x1c/0x40
> [    0.218955] [<0000000001199010>] net_ns_init+0xe8/0x148
> [    0.219028] [<0000000001170ed4>] start_kernel+0x5e0/0x660
> [    0.219096] [<0000000001173e28>] start_early_boot+0x2a0/0x2b0
> [    0.219169] [<0000000000cb6fe0>] tlb_fixup_done+0x4c/0x6c
> [    0.219240] [<0000000000027414>] 0x27414
> [    0.219293] Disabling lock debugging due to kernel taint
> [    0.219345] Caller[0000000000ae012c]: __register_pernet_operations+0xcc/0x420
> [    0.220423] Caller[0000000000ae04e4]: register_pernet_operations+0x64/0xa0
> [    0.220490] Caller[0000000000ae053c]: register_pernet_subsys+0x1c/0x40
> [    0.220551] Caller[0000000001199010]: net_ns_init+0xe8/0x148
> [    0.220608] Caller[0000000001170ed4]: start_kernel+0x5e0/0x660
> [    0.220664] Caller[0000000001173e28]: start_early_boot+0x2a0/0x2b0
> [    0.220723] Caller[0000000000cb6fe0]: tlb_fixup_done+0x4c/0x6c
> [    0.220780] Caller[0000000000027414]: 0x27414
> [    0.220823] Instruction DUMP:
> [    0.220825]  90020001
> [    0.220858]  912a3003
> [    0.220886]  90020002
> [    0.220912] <c25a2008>
> [    0.220939]  84086001
> [    0.220967]  82007fff
> [    0.220993]  83788408
> [    0.221020]  90100001
> [    0.221047]  c25a0000
> [    0.221074]
> [    0.221120] Kernel panic - not syncing: Attempted to kill the idle task!
> [    0.221183] Unable to handle kernel NULL pointer dereference
> [    0.221237] tsk->{mm,active_mm}->context = 0000000000000000
> [    0.221287] tsk->{mm,active_mm}->pgd = fff8000070002000
> [    0.221335]               \|/ ____ \|/
> [    0.221335]               "@'/ .. \`@"
> [    0.221335]               /_| \__/ |_\
> [    0.221335]                  \__U_/
> [    0.221457] swapper/0(0): Oops [#2]
> [    0.221494] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G      D           5.19.0-rc2-00025-g1d0403d20f6c #376
> [    0.221580] TSTATE: 0000004480e01607 TPC: 0000000000a64030 TNPC: 0000000000a64034 Y: 000008a3    Tainted: G      D
> [    0.221678] TPC: <sunhv_migrate_hvcons_irq+0x30/0x60>
> [    0.221731] g0: 00000000014e3800 g1: 0000000000000020 g2: 0000000000000000 g3: 000000000000009d
> [    0.221808] g4: 0000000000fdf680 g5: fff800042960e000 g6: 0000000000fc0000 g7: 0000000000000001
> [    0.222888] o0: 000000000000003c o1: 0000000000cc9400 o2: 0000000000000000 o3: 0000000000ece2a0
> [    0.222966] o4: 6c65207461736b21 o5: 0000000000000000 sp: 0000000000fc2b21 ret_pc: 00000000004dbfdc
> [    0.223046] RPC: <vprintk+0x5c/0x80>
> [    0.223087] l0: 0000000001228e40 l1: 0000000000000020 l2: 0000000000eceb78 l3: 0000000f477791df
> [    0.223167] l4: f477792d02f140eb l5: 00000000014e3800 l6: 0000000000000000 l7: 0000000000000001
> [    0.223243] i0: 0000000000000000 i1: 0000000000fc3508 i2: 0000000000eceb78 i3: 0000000000fc35c8
> [    0.223320] i4: 0000000000a1c888 i5: 0000000001229220 i6: 0000000000fc2bd1 i7: 0000000000440a1c
> [    0.223397] I7: <smp_send_stop+0x3c/0x100>
> [    0.223443] Call Trace:
> [    0.223470] [<0000000000440a1c>] smp_send_stop+0x3c/0x100
> [    0.223522] [<0000000000cac4a0>] panic+0x104/0x374
> [    0.223572] [<000000000046a4fc>] make_task_dead+0x5c/0xe0
> [    0.223629] [<0000000000cab660>] die_if_kernel+0x258/0x264
> [    0.223681] [<0000000000cc3624>] unhandled_fault+0x98/0xb4
> [    0.223737] [<0000000000cc3e54>] do_sparc64_fault+0x814/0xa00
> [    0.223792] [<0000000000407714>] sparc64_realfault_common+0x10/0x20
> [    0.223858] [<00000000006c9118>] mem_cgroup_from_obj+0x78/0x120
> [    0.223914] [<0000000000ae012c>] __register_pernet_operations+0xcc/0x420
> [    0.223976] [<0000000000ae04e4>] register_pernet_operations+0x64/0xa0
> [    0.224038] [<0000000000ae053c>] register_pernet_subsys+0x1c/0x40
> [    0.224094] [<0000000001199010>] net_ns_init+0xe8/0x148
> [    0.224147] [<0000000001170ed4>] start_kernel+0x5e0/0x660
> [    0.224198] [<0000000001173e28>] start_early_boot+0x2a0/0x2b0
> [    0.224254] [<0000000000cb6fe0>] tlb_fixup_done+0x4c/0x6c
> [    0.225308] [<0000000000027414>] 0x27414
> [    0.225349] Caller[0000000000440a1c]: smp_send_stop+0x3c/0x100
> [    0.225406] Caller[0000000000cac4a0]: panic+0x104/0x374
> [    0.225456] Caller[000000000046a4fc]: make_task_dead+0x5c/0xe0
> [    0.225512] Caller[0000000000cab660]: die_if_kernel+0x258/0x264
> [    0.225567] Caller[0000000000cc3624]: unhandled_fault+0x98/0xb4
> [    0.225624] Caller[0000000000cc3e54]: do_sparc64_fault+0x814/0xa00
> [    0.225685] Caller[0000000000407714]: sparc64_realfault_common+0x10/0x20
> [    0.225747] Caller[00000000006c90c8]: mem_cgroup_from_obj+0x28/0x120
> [    0.225806] Caller[0000000000ae012c]: __register_pernet_operations+0xcc/0x420
> [    0.225875] Caller[0000000000ae04e4]: register_pernet_operations+0x64/0xa0
> [    0.225940] Caller[0000000000ae053c]: register_pernet_subsys+0x1c/0x40
> [    0.226001] Caller[0000000001199010]: net_ns_init+0xe8/0x148
> [    0.226058] Caller[0000000001170ed4]: start_kernel+0x5e0/0x660
> [    0.226113] Caller[0000000001173e28]: start_early_boot+0x2a0/0x2b0
> [    0.226172] Caller[0000000000cb6fe0]: tlb_fixup_done+0x4c/0x6c
> [    0.226228] Caller[0000000000027414]: 0x27414
> [    0.226271] Instruction DUMP:
> [    0.226273]  83287005
> [    0.226305]  13003325
> [    0.226333]  82204018
> [    0.226359] <d000a0d8>
> [    0.226385]  92126358
> [    0.226412]  7fe9f0e2
> [    0.226439]  92024001
> [    0.226465]  81cfe008
> [    0.226492]  01000000
> [    0.226519]
> [    0.226562] Kernel panic - not syncing: Attempted to kill the idle task!
> [    0.226626] Unable to handle kernel NULL pointer dereference
> [    0.226678] tsk->{mm,active_mm}->context = 0000000000000000
> [    0.226729] tsk->{mm,active_mm}->pgd = fff8000070002000
> 
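
For orientation, here is a minimal sketch that uses only numbers printed
in the log above: the VMALLOC and VMEMMAP ranges from the "MM:" lines and
the faulting address from the first oops. Treating the ranges as
half-open intervals is an assumption, and the names RANGES and region_of
are made up for illustration. It only shows which region the address
falls in; it says nothing about the root cause.

```python
# Values copied from the boot log; the half-open interpretation of
# "[start --> end]" is an assumption made for this sketch.
RANGES = {
    "VMALLOC": (0x0000000100000000, 0x0006000000000000),
    "VMEMMAP": (0x0006000000000000, 0x000C000000000000),
}

# "Unable to handle kernel paging request at virtual address ..."
FAULT_ADDR = 0x000612000002E000


def region_of(addr):
    """Return the name of the range containing addr, or None."""
    for name, (start, end) in RANGES.items():
        if start <= addr < end:
            return name
    return None


if __name__ == "__main__":
    region = region_of(FAULT_ADDR)
    print(region)
    if region is not None:
        # Offset of the faulting address from the start of its region.
        print(hex(FAULT_ADDR - RANGES[region][0]))
```

With these values the faulting address lands inside the VMEMMAP range,
and register g2 in the first register dump holds the VMEMMAP start value.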

#regzbot introduced: 1d0403d20f6c



* Re: [sparc64] fails to boot, (was: Re: [PATCH memcg v6] net: set proper memcg for net_init hooks allocations) #forregzbot
       [not found]   ` <20220918092849.GA10314@u164.east.ru>
                       ` (2 preceding siblings ...)
  2022-09-27  9:54     ` Vlastimil Babka
@ 2022-09-28  7:21     ` Thorsten Leemhuis
  3 siblings, 0 replies; 16+ messages in thread
From: Thorsten Leemhuis @ 2022-09-28  7:21 UTC (permalink / raw)
  To: regressions; +Cc: kernel, linux-kernel, linux-mm, cgroups, sparclinux

TWIMC: this mail is primarily sent for documentation purposes and for
regzbot, my Linux kernel regression tracking bot. These mails usually
contain '#forregzbot' in the subject, to make them easy to spot and filter.

[TLDR: I'm adding this regression report to the list of tracked
regressions; all text from me you find below is based on a few template
paragraphs you might have encountered already in similar form.]

Hi, this is your Linux kernel regression tracker.

On 18.09.22 11:28, Anatoly Pugachev wrote:
> On Fri, Jun 03, 2022 at 07:19:43AM +0300, Vasily Averin wrote:
>> __register_pernet_operations() executes init hook of registered
>> pernet_operation structure in all existing net namespaces.
>>
>> Typically, these hooks are called by a process associated with
>> the specified net namespace, and all __GFP_ACCOUNT marked
>> allocation are accounted for corresponding container/memcg.
>>
>> However __register_pernet_operations() calls the hooks in the same
>> context, and as a result all marked allocations are accounted
>> to one memcg for all processed net namespaces.
>>
>> This patch adjusts active memcg for each net namespace and helps
>> to account memory allocated inside ops_init() into the proper memcg.
>>
>> Signed-off-by: Vasily Averin <vvs@openvz.org>
>> Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
>> Acked-by: Shakeel Butt <shakeelb@google.com>
>> ---
>> v6: re-based to current upstream (v5.18-11267-gb00ed48bb0a7)
> 
> 
> Hello!
> 
> I'm unable to boot my sparc64 VM anymore (5.19 still boots, 6.0-rc1 does not),
> bisected up to this patch,
> 
> mator@ttip:~/linux-2.6$ git bisect bad
> 1d0403d20f6c281cb3d14c5f1db5317caeec48e9 is the first bad commit
> commit 1d0403d20f6c281cb3d14c5f1db5317caeec48e9
> Author: Vasily Averin <vvs@openvz.org>
> Date:   Fri Jun 3 07:19:43 2022 +0300

Thanks for the report. To be sure the issue below doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression
tracking bot:

#regzbot ^introduced 1d0403d20f6c281cb3d14c5f1db5317caeec48e9
#regzbot title cgroups/sparc64: sparc64 fails to boot
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply -- ideally with also
telling regzbot about it, as explained here:
https://linux-regtracking.leemhuis.info/tracked-regression/

Reminder for developers: When fixing the issue, add 'Link:' tags
pointing to the report (the mail this one replies to), as explained in
the Linux kernel's documentation; the webpage above explains why this is
important for tracked regressions.


Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.



* Re: [sparc64] fails to boot, (was: Re: [PATCH memcg v6] net: set proper memcg for net_init hooks allocations)
  2022-09-27  9:54     ` Vlastimil Babka
@ 2022-09-28  7:54       ` Thorsten Leemhuis
  0 siblings, 0 replies; 16+ messages in thread
From: Thorsten Leemhuis @ 2022-09-28  7:54 UTC (permalink / raw)
  To: Vlastimil Babka, Anatoly Pugachev, Vasily Averin
  Cc: Andrew Morton, kernel, linux-kernel, linux-mm, Shakeel Butt,
	Roman Gushchin, Michal Koutný,
	Michal Hocko, Florian Westphal, David S. Miller, Jakub Kicinski,
	Paolo Abeni, Eric Dumazet, cgroups, sparclinux, regressions

On 27.09.22 11:54, Vlastimil Babka wrote:
> On 9/18/22 11:28, Anatoly Pugachev wrote:
>> On Fri, Jun 03, 2022 at 07:19:43AM +0300, Vasily Averin wrote:
>>> __register_pernet_operations() executes init hook of registered
>>> pernet_operation structure in all existing net namespaces.
>> [...]
>> I'm unable to boot my sparc64 VM anymore (5.19 still boots, 6.0-rc1 does not),
>> bisected up to this patch,
>>
>> mator@ttip:~/linux-2.6$ git bisect bad
>> 1d0403d20f6c281cb3d14c5f1db5317caeec48e9 is the first bad commit
>> commit 1d0403d20f6c281cb3d14c5f1db5317caeec48e9
>> [...]
> 
> #regzbot introduced: 1d0403d20f6c

Thx for getting this regression tracked using regzbot. FWIW, that went
sideways (as you already noticed and mentioned on IRC), as that made
regzbot treat *your* mail as the report of the regression. In cases
like this you need "#regzbot ^introduced 1d0403d20f6c" (as of recently,
"#regzbot introduced 1d0403d20f6c ^" works, too), as then regzbot will
consider the *parent* mail the report (and then regzbot will look out
for patches that link to it using a Link: tag).
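
To illustrate the distinction, here is a rough sketch of the rule just
described. It is not regzbot's actual parser; the function name
report_target and its return values are hypothetical, and only the three
command spellings mentioned in this thread are handled.

```python
def report_target(line):
    """Classify one mail line.

    Returns "parent" if the command marks the parent mail as the report,
    "this" if it marks the mail containing the command, and None if the
    line is not an "introduced" command at all.
    """
    words = line.strip().split()
    if len(words) < 2 or words[0] != "#regzbot":
        return None

    # Tolerate the "introduced:" spelling used earlier in this thread.
    command = words[1].rstrip(":")

    if command == "^introduced":
        return "parent"
    if command != "introduced":
        return None

    # Trailing caret form: "#regzbot introduced <commit> ^"
    return "parent" if words[-1] == "^" else "this"


if __name__ == "__main__":
    for mail_line in (
        "#regzbot introduced: 1d0403d20f6c",
        "#regzbot ^introduced 1d0403d20f6c",
        "#regzbot introduced 1d0403d20f6c ^",
    ):
        print(mail_line, "->", report_target(mail_line))
```

The first form is the one that went wrong above: without a caret, the
mail carrying the command is itself taken as the report.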

No worries, I made the same mistake a few times already :-D I have now sent
a mail with that command, so let's resolve this subthread by marking it
invalid:

#regzbot invalid: mis-used regzbot command, now properly tracked

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.




end of thread, other threads:[~2022-09-28  7:54 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <6b362c6e-9c80-4344-9430-b831f9871a3c@openvz.org>
     [not found] ` <f9394752-e272-9bf9-645f-a18c56d1c4ec@openvz.org>
2022-06-06 13:49   ` [PATCH memcg v6] net: set proper memcg for net_init hooks allocations Qian Cai
     [not found]     ` <0e714a5a-d2ed-9b44-fdbe-04b5595165da@openvz.org>
2022-06-06 18:43       ` Qian Cai
     [not found]     ` <360a2672-65a7-4ad4-c8b8-cc4c1f0c02cd@openvz.org>
2022-06-07  5:58       ` Shakeel Butt
2022-06-07 12:37         ` Vasily Averin
2022-06-07 14:10           ` Shakeel Butt
     [not found]   ` <20220918092849.GA10314@u164.east.ru>
2022-09-21 14:41     ` [sparc64] fails to boot, (was: Re: [PATCH memcg v6] net: set proper memcg for net_init hooks allocations) Anatoly Pugachev
2022-09-21 14:44     ` Anatoly Pugachev
2022-09-21 17:02       ` Michal Koutný
2022-09-26 13:06         ` Anatoly Pugachev
2022-09-26 17:28           ` Jakub Kicinski
2022-09-26 17:32             ` Shakeel Butt
2022-09-26 17:36               ` Andrew Morton
2022-09-26 19:00                 ` Shakeel Butt
2022-09-27  9:54     ` Vlastimil Babka
2022-09-28  7:54       ` Thorsten Leemhuis
2022-09-28  7:21     ` [sparc64] fails to boot, (was: Re: [PATCH memcg v6] net: set proper memcg for net_init hooks allocations) #forregzbot Thorsten Leemhuis
