linux-mm.kvack.org archive mirror
* Re: exec error: BUG: Bad rss-counter
       [not found] ` <m1wnuqhaew.fsf@fess.ebiederm.org>
@ 2021-03-02  7:59   ` Ilya Lipnitskiy
       [not found]     ` <m1blc1gxdx.fsf@fess.ebiederm.org>
       [not found]     ` <CAHk-=wjVWMnH2LfFNnXcf6=WuU1RyLa_cgTEOqnViHiqDrqQjg@mail.gmail.com>
  0 siblings, 2 replies; 13+ messages in thread
From: Ilya Lipnitskiy @ 2021-03-02  7:59 UTC (permalink / raw)
  To: Eric W. Biederman, linux-mm
  Cc: Linux Kernel Mailing List, linux-fsdevel, Kees Cook,
	Christoph Hellwig, Linus Torvalds

On Mon, Mar 1, 2021 at 12:43 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:
>
> > Eric, All,
> >
> > The following error appears when running Linux 5.10.18 on an embedded
> > MIPS mt7621 target:
> > [    0.301219] BUG: Bad rss-counter state mm:(ptrval) type:MM_ANONPAGES val:1
> >
> > Being a very generic error, I started digging and added a stack dump
> > before the BUG:
> > Call Trace:
> > [<80008094>] show_stack+0x30/0x100
> > [<8033b238>] dump_stack+0xac/0xe8
> > [<800285e8>] __mmdrop+0x98/0x1d0
> > [<801a6de8>] free_bprm+0x44/0x118
> > [<801a86a8>] kernel_execve+0x160/0x1d8
> > [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
> > [<80003198>] ret_from_kernel_thread+0x14/0x1c
> >
> > So that's how I got to looking at fs/exec.c and noticed quite a few
> > changes last year. Turns out this message only occurs once very early
> > at boot during the very first call to kernel_execve. current->mm is
> > NULL at this stage, so acct_arg_size() is effectively a no-op.
>
> If you believe this is a new error you could bisect the kernel
> to see which change introduced the behavior you are seeing.
>
> > More digging, and I traced the RSS counter increment to:
> > [<8015adb4>] add_mm_counter_fast+0xb4/0xc0
> > [<80160d58>] handle_mm_fault+0x6e4/0xea0
> > [<80158aa4>] __get_user_pages.part.78+0x190/0x37c
> > [<8015992c>] __get_user_pages_remote+0x128/0x360
> > [<801a6d9c>] get_arg_page+0x34/0xa0
> > [<801a7394>] copy_string_kernel+0x194/0x2a4
> > [<801a880c>] kernel_execve+0x11c/0x298
> > [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
> > [<80003198>] ret_from_kernel_thread+0x14/0x1c
> >
> > In fact, I also checked vma_pages(bprm->vma) and lo and behold it is set to 1.
> >
> > How is fs/exec.c supposed to handle implied RSS increments that happen
> > due to page faults when discarding the bprm structure? In this case,
> > the bug-generating kernel_execve call never succeeded, it returned -2,
> > but I didn't trace exactly what failed.
>
> Unless I am mistaken any left over pages should be purged by exit_mmap
> which is called by mmput before mmput calls mmdrop.
Good to know. Some more digging and I can say that we hit this error
when trying to unmap PFN 0 (is_zero_pfn(pfn) returns TRUE,
vm_normal_page returns NULL, zap_pte_range does not decrement
MM_ANONPAGES RSS counter). Is my understanding correct that PFN 0 is
usable, but special? Or am I totally off the mark here?

Here is the (optimized) stack trace when the counter does not get decremented:
[<8015b078>] vm_normal_page+0x114/0x1a8
[<8015dc98>] unmap_page_range+0x388/0xacc
[<8015e5a0>] unmap_vmas+0x6c/0x98
[<80166194>] exit_mmap+0xd8/0x1ac
[<800290c0>] mmput+0x58/0xf8
[<801a6f8c>] free_bprm+0x2c/0xc4
[<801a8890>] kernel_execve+0x160/0x1d8
[<800420e0>] call_usermodehelper_exec_async+0x114/0x194
[<80003198>] ret_from_kernel_thread+0x14/0x1c
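
For anyone following along, the teardown order visible in the trace above (free_bprm calling mmput, which runs exit_mmap, with the counter check only happening afterwards in __mmdrop) can be sketched as a small userspace model. This is only an illustration under stated assumptions: struct mm_model, the *_model functions and the bad_rss_reported flag are hypothetical stand-ins, not kernel code, and the real mm_struct tracks far more state.

```c
#include <stdbool.h>

/* Toy userspace model of the mmput()/mmdrop() ordering; not kernel code. */
struct mm_model {
	int mm_users;           /* users of the address space */
	int mm_count;           /* references to the structure itself */
	long anon_rss;          /* stand-in for the MM_ANONPAGES counter */
	long zapped_on_exit;    /* pages the teardown will un-account */
	bool bad_rss_reported;  /* set where the kernel would print the BUG line */
};

/* exit_mmap(): unmap everything and un-account the pages it recognises. */
void exit_mmap_model(struct mm_model *mm)
{
	mm->anon_rss -= mm->zapped_on_exit;
	mm->zapped_on_exit = 0;
}

/* mmdrop(): on the last reference, any non-zero counter is reported. */
void mmdrop_model(struct mm_model *mm)
{
	if (--mm->mm_count == 0 && mm->anon_rss != 0)
		mm->bad_rss_reported = true;
}

/* mmput(): the last user tears down the mappings first, then drops. */
void mmput_model(struct mm_model *mm)
{
	if (--mm->mm_users == 0) {
		exit_mmap_model(mm);
		mmdrop_model(mm);
	}
}
```

In this model, one page accounted and one page zapped leaves the counter at 0 and nothing is reported. One page accounted and none zapped leaves the counter at 1 and sets the flag, which corresponds to the val:1 in the log.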

>
> AKA it looks very very fishy this happens and this does not look like
> an execve error.
I think you are right, I'm probably wrong to bother you. However,
since the thread is already started, let me add linux-mm here :)
>
> On the other hand it would be good to know why kernel_execve is failing.
> Then the error handling paths could be scrutinized, and we can check to
> see if everything that should happen on an error path does.
I can check on this, but likely it's the init system not doing things
quite in the right order on my platform, or something similar. The
error is ENOENT from do_open_execat().
>
> > Interestingly, this "BUG:" message is timing-dependent. If I wait a
> > bit before calling free_bprm after bprm_execve the message seems to go
> > away (there are 3 other cores running and calling into kernel_execve
> > at the same time, so there is that). The error also only ever happens
> > once (probably because no more page faults happen?).
> >
> > I don't know enough to propose a proper fix here. Is it decrementing
> > the bprm->mm RSS counter to account for that page fault? Or is
> > current->mm being NULL a bigger problem?
>
> Here call_usermodehelper calls kernel_execve from a kernel thread
> forked by kthreadd.  Which means current->mm == NULL is expected, and
> current->active_mm == &init_mm.
>
> Similarly bprm->mm having an incremented RSS counter appears correct.
>
> The question is why doesn't that count get consistently cleaned up.
>
> > Apologies in advance, but I have looked hard and do not see a clear
> > resolution for this even in the latest kernel code.
>
> I may be blind but I see two possibilities.
>
> 1) There is a memory stomp that happens early on and bad timing causes
>    the memory stomp to result in an elevated rss count.
>
> 2) There is a buggy error handling path, and whatever failure you are
>     running into that early in boot walks through that buggy failure
>     path.
>
> I don't think this is a widespread issue or yours would not be the first
> report like this I have seen.
>
> The two productive paths I can see for tracing down your problem are:
> 1) git bisect (assuming you have a known good version)
> 2) Figuring out what exec failed.
>
> I really think exec_mmap should have cleaned up anything in the mm.  So
> the fact that it doesn't worries me.
>
> Eric

Ilya



* Re: exec error: BUG: Bad rss-counter
       [not found]     ` <m1blc1gxdx.fsf@fess.ebiederm.org>
@ 2021-03-03  7:01       ` Ilya Lipnitskiy
  2021-03-03 15:50         ` Eric W. Biederman
  0 siblings, 1 reply; 13+ messages in thread
From: Ilya Lipnitskiy @ 2021-03-03  7:01 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-mm, Linux Kernel Mailing List, linux-fsdevel, Kees Cook,
	Christoph Hellwig, Linus Torvalds

On Tue, Mar 2, 2021 at 11:37 AM Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:
>
> > On Mon, Mar 1, 2021 at 12:43 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> >>
> >> Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:
> >>
> >> > Eric, All,
> >> >
> >> > The following error appears when running Linux 5.10.18 on an embedded
> >> > MIPS mt7621 target:
> >> > [    0.301219] BUG: Bad rss-counter state mm:(ptrval) type:MM_ANONPAGES val:1
> >> >
> >> > Being a very generic error, I started digging and added a stack dump
> >> > before the BUG:
> >> > Call Trace:
> >> > [<80008094>] show_stack+0x30/0x100
> >> > [<8033b238>] dump_stack+0xac/0xe8
> >> > [<800285e8>] __mmdrop+0x98/0x1d0
> >> > [<801a6de8>] free_bprm+0x44/0x118
> >> > [<801a86a8>] kernel_execve+0x160/0x1d8
> >> > [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
> >> > [<80003198>] ret_from_kernel_thread+0x14/0x1c
> >> >
> >> > So that's how I got to looking at fs/exec.c and noticed quite a few
> >> > changes last year. Turns out this message only occurs once very early
> >> > at boot during the very first call to kernel_execve. current->mm is
> >> > NULL at this stage, so acct_arg_size() is effectively a no-op.
> >>
> >> If you believe this is a new error you could bisect the kernel
> >> to see which change introduced the behavior you are seeing.
> >>
> >> > More digging, and I traced the RSS counter increment to:
> >> > [<8015adb4>] add_mm_counter_fast+0xb4/0xc0
> >> > [<80160d58>] handle_mm_fault+0x6e4/0xea0
> >> > [<80158aa4>] __get_user_pages.part.78+0x190/0x37c
> >> > [<8015992c>] __get_user_pages_remote+0x128/0x360
> >> > [<801a6d9c>] get_arg_page+0x34/0xa0
> >> > [<801a7394>] copy_string_kernel+0x194/0x2a4
> >> > [<801a880c>] kernel_execve+0x11c/0x298
> >> > [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
> >> > [<80003198>] ret_from_kernel_thread+0x14/0x1c
> >> >
> >> > In fact, I also checked vma_pages(bprm->vma) and lo and behold it is set to 1.
> >> >
> >> > How is fs/exec.c supposed to handle implied RSS increments that happen
> >> > due to page faults when discarding the bprm structure? In this case,
> >> > the bug-generating kernel_execve call never succeeded, it returned -2,
> >> > but I didn't trace exactly what failed.
> >>
> >> Unless I am mistaken any left over pages should be purged by exit_mmap
> >> which is called by mmput before mmput calls mmdrop.
> > Good to know. Some more digging and I can say that we hit this error
> > when trying to unmap PFN 0 (is_zero_pfn(pfn) returns TRUE,
> > vm_normal_page returns NULL, zap_pte_range does not decrement
> > MM_ANONPAGES RSS counter). Is my understanding correct that PFN 0 is
> > usable, but special? Or am I totally off the mark here?
>
> It would be good to know if that is the page that get_user_pages_remote
> returned to copy_string_kernel.  The zero page that is always zero,
> should never be returned when a writable mapping is desired.

Indeed, pfn 0 is returned from get_arg_page: (page is 0x809cf000,
page_to_pfn(page) is 0) and it is the same page that is being freed and not
refcounted in mmput/zap_pte_range. Confirmed with good old printk. Also,
ZERO_PAGE(0)==0x809fc000 -> PFN 5120.

I think I have found the problem though, after much digging and thanks to all
the information provided. init_zero_pfn() gets called too late (after the call
to is_zero_pfn(0) from mmput returns true), until then zero_pfn == 0, and
after, zero_pfn == 5120. Boom.

So PFN 0 is special, but only for a little bit, enough for something on my
system to call kernel_execve :)

Question: is my system not supposed to be calling kernel_execve this early or
does init_zero_pfn() need to happen earlier? init_zero_pfn is currently a
core_initcall.
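
To make the mismatch concrete, here is a minimal userspace sketch of what I think is happening. Everything in it is a hypothetical stand-in (zero_pfn_model, struct rss_model and the *_model functions are not kernel code), and the zero-page test is simplified to a plain equality, whereas the real MIPS check also has to cover the coloured zero pages.

```c
#include <stdbool.h>

/* Toy userspace model of the window before init_zero_pfn(); not kernel code. */
unsigned long zero_pfn_model;   /* zero-initialised until the initcall has run */

struct rss_model {
	long anon_pages;        /* stand-in for the MM_ANONPAGES counter */
};

/* Simplified check: treats exactly one pfn as "the" zero page. */
bool is_zero_pfn_model(unsigned long pfn)
{
	return pfn == zero_pfn_model;
}

/* Fault path: an ordinary anonymous page is mapped and accounted. */
void fault_in_model(struct rss_model *rss)
{
	rss->anon_pages++;
}

/* Teardown path: anything that looks like the zero page is skipped. */
void zap_model(struct rss_model *rss, unsigned long pfn)
{
	if (is_zero_pfn_model(pfn))
		return;
	rss->anon_pages--;
}

/* One failed exec: fault in a page at `pfn`, then tear it down again. */
long leftover_after_failed_exec(unsigned long pfn)
{
	struct rss_model rss = { 0 };

	fault_in_model(&rss);
	zap_model(&rss, pfn);
	return rss.anon_pages;
}
```

While zero_pfn_model is still 0, a page at PFN 0 is accounted on the way in but skipped on the way out, so one page is left on the counter. Once it is set to 5120 the same sequence balances.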

>
> > Here is the (optimized) stack trace when the counter does not get decremented:
> > [<8015b078>] vm_normal_page+0x114/0x1a8
> > [<8015dc98>] unmap_page_range+0x388/0xacc
> > [<8015e5a0>] unmap_vmas+0x6c/0x98
> > [<80166194>] exit_mmap+0xd8/0x1ac
> > [<800290c0>] mmput+0x58/0xf8
> > [<801a6f8c>] free_bprm+0x2c/0xc4
> > [<801a8890>] kernel_execve+0x160/0x1d8
> > [<800420e0>] call_usermodehelper_exec_async+0x114/0x194
> > [<80003198>] ret_from_kernel_thread+0x14/0x1c
> >
> >>
> >> AKA it looks very very fishy this happens and this does not look like
> >> an execve error.
> > I think you are right, I'm probably wrong to bother you. However,
> > since the thread is already started, let me add linux-mm here :)
>
> It happens during exec.  I don't mind looking and pointing you a useful
> direction.
>
> >>
> >> On the other hand it would be good to know why kernel_execve is failing.
> >> Then the error handling paths could be scrutinized, and we can check to
> >> see if everything that should happen on an error path does.
> > I can check on this, but likely it's the init system not doing things
> > quite in the right order on my platform, or something similar. The
> > error is ENOENT from do_open_execat().
>
> That does narrow things down considerably.
> After the error all we do is:
> Clear in_execve and fs->in_exec.
> Return from bprm_execve
> Call free_bprm
> Which does:
>         if (bprm->mm) {
>                 acct_arg_size(bprm, 0);
>                 mmput(bprm->mm);
>         }
>
> So it really needs to be the mmput that cleans things up.
>
> I would really verify the correspondence between what get_arg_page
> returns and what gets freed in mmput if it is not too difficult.
> I think it should just be a page or two.
>
> Eric

Ilya



* Re: exec error: BUG: Bad rss-counter
       [not found]     ` <CAHk-=wjVWMnH2LfFNnXcf6=WuU1RyLa_cgTEOqnViHiqDrqQjg@mail.gmail.com>
@ 2021-03-03  7:07       ` Ilya Lipnitskiy
  0 siblings, 0 replies; 13+ messages in thread
From: Ilya Lipnitskiy @ 2021-03-03  7:07 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Eric W. Biederman, Linux-MM, Linux Kernel Mailing List,
	linux-fsdevel, Kees Cook, Christoph Hellwig

On Tue, Mar 2, 2021 at 10:56 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Mon, Mar 1, 2021 at 11:59 PM Ilya Lipnitskiy
> <ilya.lipnitskiy@gmail.com> wrote:
> >
> > Good to know. Some more digging and I can say that we hit this error
> > when trying to unmap PFN 0 (is_zero_pfn(pfn) returns TRUE,
> > vm_normal_page returns NULL, zap_pte_range does not decrement
> > MM_ANONPAGES RSS counter). Is my understanding correct that PFN 0 is
> > usable, but special? Or am I totally off the mark here?
>
> PFN 0 should be usable - depending on architecture, of course - and
> shouldn't even be special in any way.
>
> is_zero_pfn(pfn) is *not* meant to test for pfn being 0 - it's meant
> to test for the pfn pointing to the special zero-filled page. The two
> _could_ be the same thing, of course, but generally are not (h8300
> seems to say "we use pfn 0 as the zero page" if I read things right).
>
> In fact, there can be many zero-filled pages - architectures with
> virtually mapped caches that want cache coloring have multiple
> contiguous zero-filled pages and then map in the right one based on
> virtual address. I'm not sure why it would matter (the zero-page is
> always mapped read-only, so any physical aliases should be a
> non-issue), but whatever..
>
> > Here is the (optimized) stack trace when the counter does not get decremented:
> > [<8015b078>] vm_normal_page+0x114/0x1a8
>
> Yes, if "is_zero_pfn()" returns true, then it won't be considered a
> normal page, and is not refcounted.
>
> But that should only trigger for pfn == zero_pfn, and zero_pfn should
> be initialized to
>
>     zero_pfn = page_to_pfn(ZERO_PAGE(0));
>
> so it _sounds_ like you possibly have something odd going on with ZERO_PAGE.
Thanks for explaining this - I have figured out that zero_pfn gets set
a little late (see other response for details). Until init_zero_pfn()
is called, zero_pfn==0, after - zero_pfn==5120. Seems somewhat bad,
unless my system is breaking rules by checking zero_pfn before it was
initialized ;)
>
> Yes, one architecture does actually make pfn 0 _be_ the zero page, but
> you said MIPS, and that does do the page coloring games, and has
>
>    #define ZERO_PAGE(vaddr) \
>         (virt_to_page((void *)(empty_zero_page + \
>                 (((unsigned long)(vaddr)) & zero_page_mask))))
>
> where zero_page_mask is the page coloring mask, and empty_zero_page is
> allocated in setup_zero_pages() fairly early in mem_init() (again, it
> allocates multiple pages depending on the page ordering - see that
> horrible virtual cache thing with cpu_has_vce).
>
> So PFN 0 shouldn't be an issue at all.
>
> Of course, since you said this was an embedded MIPS platform, maybe
> it's one of the broken ones with virtual caches and cpu_has_vce is
> set. I'm not sure how much testing that has gotten lately. MOST of the
> later MIPS architectures walked away from the pure virtual cache
> setups.
FWIW, here is the CPU info from my platform, and cpu_has_vce is not set:

system type             : MediaTek MT7621 ver:1 eco:3
machine                 : Ubiquiti EdgeRouter X
processor               : 0
cpu model               : MIPS 1004Kc V2.15
BogoMIPS                : 581.63
wait instruction        : yes
microsecond timers      : yes
tlb_entries             : 32
extra interrupt vector  : yes
hardware watchpoint     : yes, count: 4, address/irw mask: [0x0ffc, 0x0ffc, 0x0ffb, 0x0ffb]
isa                     : mips1 mips2 mips32r1 mips32r2
ASEs implemented        : mips16 dsp mt
Options implemented     : tlb 4kex 4k_cache prefetch mcheck ejtag llsc pindexed_dcache userlocal vint perf_cntr_intr_bit cdmm perf
shadow register sets    : 1
kscratch registers      : 0
package                 : 0
core                    : 0
VPE                     : 0
VCED exceptions         : not available
VCEI exceptions         : not available
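
As an aside on the page-coloring ZERO_PAGE() macro quoted above, the address selection it performs can be sketched in userspace as follows. This is a rough illustration only: the 4 KiB page size, the base address used in the example and the way the mask is derived from an allocation order are assumptions on my part, and the *_model names are hypothetical.

```c
/* Toy model of a colour-indexed zero page lookup; not kernel code. */
#define PAGE_SIZE_MODEL 4096UL
#define PAGE_MASK_MODEL (~(PAGE_SIZE_MODEL - 1))

/* Mask that selects one of (1 << order) contiguous zero pages. */
unsigned long zero_page_mask_model(unsigned int order)
{
	return ((PAGE_SIZE_MODEL << order) - 1) & PAGE_MASK_MODEL;
}

/* Address of the zero page whose colour matches the faulting vaddr. */
unsigned long zero_page_addr_model(unsigned long base, unsigned long mask,
				   unsigned long vaddr)
{
	return base + (vaddr & mask);
}
```

With order 0 the mask is 0, so every vaddr maps to the single page at the base. With order 3 the mask is 0x7000, so a vaddr of 0x3abc picks the page at base + 0x3000.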

>
>               Linus
Ilya



* Re: exec error: BUG: Bad rss-counter
  2021-03-03  7:01       ` Ilya Lipnitskiy
@ 2021-03-03 15:50         ` Eric W. Biederman
  2021-03-03 15:55           ` Ilya Lipnitskiy
  2021-03-29  2:46           ` Ilya Lipnitskiy
  0 siblings, 2 replies; 13+ messages in thread
From: Eric W. Biederman @ 2021-03-03 15:50 UTC (permalink / raw)
  To: Ilya Lipnitskiy
  Cc: linux-mm, Linux Kernel Mailing List, linux-fsdevel, Kees Cook,
	Christoph Hellwig, Linus Torvalds

Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:

> On Tue, Mar 2, 2021 at 11:37 AM Eric W. Biederman <ebiederm@xmission.com> wrote:
>>
>> Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:
>>
>> > On Mon, Mar 1, 2021 at 12:43 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>> >>
>> >> Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:
>> >>
>> >> > Eric, All,
>> >> >
>> >> > The following error appears when running Linux 5.10.18 on an embedded
>> >> > MIPS mt7621 target:
>> >> > [    0.301219] BUG: Bad rss-counter state mm:(ptrval) type:MM_ANONPAGES val:1
>> >> >
>> >> > Being a very generic error, I started digging and added a stack dump
>> >> > before the BUG:
>> >> > Call Trace:
>> >> > [<80008094>] show_stack+0x30/0x100
>> >> > [<8033b238>] dump_stack+0xac/0xe8
>> >> > [<800285e8>] __mmdrop+0x98/0x1d0
>> >> > [<801a6de8>] free_bprm+0x44/0x118
>> >> > [<801a86a8>] kernel_execve+0x160/0x1d8
>> >> > [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
>> >> > [<80003198>] ret_from_kernel_thread+0x14/0x1c
>> >> >
>> >> > So that's how I got to looking at fs/exec.c and noticed quite a few
>> >> > changes last year. Turns out this message only occurs once very early
>> >> > at boot during the very first call to kernel_execve. current->mm is
>> >> > NULL at this stage, so acct_arg_size() is effectively a no-op.
>> >>
>> >> If you believe this is a new error you could bisect the kernel
>> >> to see which change introduced the behavior you are seeing.
>> >>
>> >> > More digging, and I traced the RSS counter increment to:
>> >> > [<8015adb4>] add_mm_counter_fast+0xb4/0xc0
>> >> > [<80160d58>] handle_mm_fault+0x6e4/0xea0
>> >> > [<80158aa4>] __get_user_pages.part.78+0x190/0x37c
>> >> > [<8015992c>] __get_user_pages_remote+0x128/0x360
>> >> > [<801a6d9c>] get_arg_page+0x34/0xa0
>> >> > [<801a7394>] copy_string_kernel+0x194/0x2a4
>> >> > [<801a880c>] kernel_execve+0x11c/0x298
>> >> > [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
>> >> > [<80003198>] ret_from_kernel_thread+0x14/0x1c
>> >> >
>> >> > In fact, I also checked vma_pages(bprm->vma) and lo and behold it is set to 1.
>> >> >
>> >> > How is fs/exec.c supposed to handle implied RSS increments that happen
>> >> > due to page faults when discarding the bprm structure? In this case,
>> >> > the bug-generating kernel_execve call never succeeded, it returned -2,
>> >> > but I didn't trace exactly what failed.
>> >>
>> >> Unless I am mistaken any left over pages should be purged by exit_mmap
>> >> which is called by mmput before mmput calls mmdrop.
>> > Good to know. Some more digging and I can say that we hit this error
>> > when trying to unmap PFN 0 (is_zero_pfn(pfn) returns TRUE,
>> > vm_normal_page returns NULL, zap_pte_range does not decrement
>> > MM_ANONPAGES RSS counter). Is my understanding correct that PFN 0 is
>> > usable, but special? Or am I totally off the mark here?
>>
>> It would be good to know if that is the page that get_user_pages_remote
>> returned to copy_string_kernel.  The zero page that is always zero,
>> should never be returned when a writable mapping is desired.
>
> Indeed, pfn 0 is returned from get_arg_page: (page is 0x809cf000,
> page_to_pfn(page) is 0) and it is the same page that is being freed and not
> refcounted in mmput/zap_pte_range. Confirmed with good old printk. Also,
> ZERO_PAGE(0)==0x809fc000 -> PFN 5120.
>
> I think I have found the problem though, after much digging and thanks to all
> the information provided. init_zero_pfn() gets called too late (after the call
> to is_zero_pfn(0) from mmput returns true), until then zero_pfn == 0, and
> after, zero_pfn == 5120. Boom.
>
> So PFN 0 is special, but only for a little bit, enough for something on my
> system to call kernel_execve :)
>
> Question: is my system not supposed to be calling kernel_execve this early or
> does init_zero_pfn() need to happen earlier? init_zero_pfn is currently a
> core_initcall.

Looking quickly it seems that init_zero_pfn() is in mm/memory.c and is
common for both mips and x86.  Further it appears init_zero_pfn() has
been that way since 2009 (commit a13ea5b75964, "mm: reinstate ZERO_PAGE").

Given the testing that x86 gets and that nothing like this has been
reported it looks like whatever driver is triggering the kernel_execve
is doing something wrong. 

Because honestly.  If the zero page isn't working there is not a chance
that anything in userspace is working so it is clearly much too early.

I suspect there is some driver that is initialized very early that is
doing something that looks innocuous (like triggering a hotplug event)
and that happens to cause a call_usermodehelper which then calls
kernel_execve.
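
To illustrate the ordering concern, here is a rough userspace sketch of initcall levels. It is a deliberate simplification: the enum and zero_pfn_seen_at() are hypothetical names, the model is strictly sequential (a real helper runs asynchronously on whichever CPU schedules it), and it assumes that within one level the initialiser runs before the helper.

```c
/* Toy model of initcall ordering; not kernel code. */
enum initcall_level_model {
	LEVEL_EARLY, LEVEL_PURE, LEVEL_CORE, LEVEL_POSTCORE, LEVEL_ARCH,
	LEVEL_SUBSYS, LEVEL_FS, LEVEL_DEVICE, LEVEL_LATE
};

/*
 * Value of zero_pfn that code running at `helper_level` would observe
 * if the initialiser is registered at `init_level`.
 */
unsigned long zero_pfn_seen_at(enum initcall_level_model helper_level,
			       enum initcall_level_model init_level,
			       unsigned long real_zero_pfn)
{
	unsigned long zero_pfn = 0;   /* uninitialised state */
	int level;

	for (level = LEVEL_EARLY; level <= LEVEL_LATE; level++) {
		/* Assumption: within one level the initialiser runs first. */
		if (level == (int)init_level)
			zero_pfn = real_zero_pfn;
		if (level == (int)helper_level)
			return zero_pfn;
	}
	return zero_pfn;
}
```

In this model, anything that runs before the core level sees zero_pfn == 0 when the initialiser is a core_initcall, while code at the device level sees the real value. Registering the initialiser at an earlier level would close that window in the model, though that says nothing about whether the early caller is legitimate.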

Eric



* Re: exec error: BUG: Bad rss-counter
  2021-03-03 15:50         ` Eric W. Biederman
@ 2021-03-03 15:55           ` Ilya Lipnitskiy
  2021-03-03 16:07             ` Eric W. Biederman
  2021-03-20 15:59             ` Zhou Yanjie
  2021-03-29  2:46           ` Ilya Lipnitskiy
  1 sibling, 2 replies; 13+ messages in thread
From: Ilya Lipnitskiy @ 2021-03-03 15:55 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Linux-MM, Linux Kernel Mailing List, linux-fsdevel, Kees Cook,
	Christoph Hellwig, Linus Torvalds

On Wed, Mar 3, 2021 at 7:50 AM Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:
>
> > On Tue, Mar 2, 2021 at 11:37 AM Eric W. Biederman <ebiederm@xmission.com> wrote:
> >>
> >> Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:
> >>
> >> > On Mon, Mar 1, 2021 at 12:43 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> >> >>
> >> >> Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:
> >> >>
> >> >> > Eric, All,
> >> >> >
> >> >> > The following error appears when running Linux 5.10.18 on an embedded
> >> >> > MIPS mt7621 target:
> >> >> > [    0.301219] BUG: Bad rss-counter state mm:(ptrval) type:MM_ANONPAGES val:1
> >> >> >
> >> >> > Being a very generic error, I started digging and added a stack dump
> >> >> > before the BUG:
> >> >> > Call Trace:
> >> >> > [<80008094>] show_stack+0x30/0x100
> >> >> > [<8033b238>] dump_stack+0xac/0xe8
> >> >> > [<800285e8>] __mmdrop+0x98/0x1d0
> >> >> > [<801a6de8>] free_bprm+0x44/0x118
> >> >> > [<801a86a8>] kernel_execve+0x160/0x1d8
> >> >> > [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
> >> >> > [<80003198>] ret_from_kernel_thread+0x14/0x1c
> >> >> >
> >> >> > So that's how I got to looking at fs/exec.c and noticed quite a few
> >> >> > changes last year. Turns out this message only occurs once very early
> >> >> > at boot during the very first call to kernel_execve. current->mm is
> >> >> > NULL at this stage, so acct_arg_size() is effectively a no-op.
> >> >>
> >> >> If you believe this is a new error you could bisect the kernel
> >> >> to see which change introduced the behavior you are seeing.
> >> >>
> >> >> > More digging, and I traced the RSS counter increment to:
> >> >> > [<8015adb4>] add_mm_counter_fast+0xb4/0xc0
> >> >> > [<80160d58>] handle_mm_fault+0x6e4/0xea0
> >> >> > [<80158aa4>] __get_user_pages.part.78+0x190/0x37c
> >> >> > [<8015992c>] __get_user_pages_remote+0x128/0x360
> >> >> > [<801a6d9c>] get_arg_page+0x34/0xa0
> >> >> > [<801a7394>] copy_string_kernel+0x194/0x2a4
> >> >> > [<801a880c>] kernel_execve+0x11c/0x298
> >> >> > [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
> >> >> > [<80003198>] ret_from_kernel_thread+0x14/0x1c
> >> >> >
> >> >> > In fact, I also checked vma_pages(bprm->vma) and lo and behold it is set to 1.
> >> >> >
> >> >> > How is fs/exec.c supposed to handle implied RSS increments that happen
> >> >> > due to page faults when discarding the bprm structure? In this case,
> >> >> > the bug-generating kernel_execve call never succeeded, it returned -2,
> >> >> > but I didn't trace exactly what failed.
> >> >>
> >> >> Unless I am mistaken any left over pages should be purged by exit_mmap
> >> >> which is called by mmput before mmput calls mmdrop.
> >> > Good to know. Some more digging and I can say that we hit this error
> >> > when trying to unmap PFN 0 (is_zero_pfn(pfn) returns TRUE,
> >> > vm_normal_page returns NULL, zap_pte_range does not decrement
> >> > MM_ANONPAGES RSS counter). Is my understanding correct that PFN 0 is
> >> > usable, but special? Or am I totally off the mark here?
> >>
> >> It would be good to know if that is the page that get_user_pages_remote
> >> returned to copy_string_kernel.  The zero page that is always zero,
> >> should never be returned when a writable mapping is desired.
> >
> > Indeed, pfn 0 is returned from get_arg_page: (page is 0x809cf000,
> > page_to_pfn(page) is 0) and it is the same page that is being freed and not
> > refcounted in mmput/zap_pte_range. Confirmed with good old printk. Also,
> > ZERO_PAGE(0)==0x809fc000 -> PFN 5120.
> >
> > I think I have found the problem though, after much digging and thanks to all
> > the information provided. init_zero_pfn() gets called too late (after the call
> > to is_zero_pfn(0) from mmput returns true), until then zero_pfn == 0, and
> > after, zero_pfn == 5120. Boom.
> >
> > So PFN 0 is special, but only for a little bit, enough for something on my
> > system to call kernel_execve :)
> >
> > Question: is my system not supposed to be calling kernel_execve this early or
> > does init_zero_pfn() need to happen earlier? init_zero_pfn is currently a
> > core_initcall.
>
> Looking quickly it seems that init_zero_pfn() is in mm/memory.c and is
> common for both mips and x86.  Further it appears init_zero_pfn() has
> been that way since 2009 (commit a13ea5b75964, "mm: reinstate ZERO_PAGE").
>
> Given the testing that x86 gets and that nothing like this has been
> reported it looks like whatever driver is triggering the kernel_execve
> is doing something wrong.

>
> Because honestly.  If the zero page isn't working there is not a chance
> that anything in userspace is working so it is clearly much too early.
>
> I suspect there is some driver that is initialized very early that is
> doing something that looks innocuous (like triggering a hotplug event)
> and that happens to cause a call_usermodehelper which then calls
> kernel_execve.
I will investigate the offenders more closely. However, I do not
notice this behavior on the same system based on the 5.4 kernel. Is it
possible that last year's exec changes have exposed this issue? Not
blaming exec at all, just making sure I understand the problem better.

Ilya



* Re: exec error: BUG: Bad rss-counter
  2021-03-03 15:55           ` Ilya Lipnitskiy
@ 2021-03-03 16:07             ` Eric W. Biederman
  2021-03-20 15:59             ` Zhou Yanjie
  1 sibling, 0 replies; 13+ messages in thread
From: Eric W. Biederman @ 2021-03-03 16:07 UTC (permalink / raw)
  To: Ilya Lipnitskiy
  Cc: Linux-MM, Linux Kernel Mailing List, linux-fsdevel, Kees Cook,
	Christoph Hellwig, Linus Torvalds

Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:

> On Wed, Mar 3, 2021 at 7:50 AM Eric W. Biederman <ebiederm@xmission.com> wrote:
>>
>> Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:
>>
>> > On Tue, Mar 2, 2021 at 11:37 AM Eric W. Biederman <ebiederm@xmission.com> wrote:
>> >>
>> >> Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:
>> >>
>> >> > On Mon, Mar 1, 2021 at 12:43 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>> >> >>
>> >> >> Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:
>> >> >>
>> >> >> > Eric, All,
>> >> >> >
>> >> >> > The following error appears when running Linux 5.10.18 on an embedded
>> >> >> > MIPS mt7621 target:
>> >> >> > [    0.301219] BUG: Bad rss-counter state mm:(ptrval) type:MM_ANONPAGES val:1
>> >> >> >
>> >> >> > Being a very generic error, I started digging and added a stack dump
>> >> >> > before the BUG:
>> >> >> > Call Trace:
>> >> >> > [<80008094>] show_stack+0x30/0x100
>> >> >> > [<8033b238>] dump_stack+0xac/0xe8
>> >> >> > [<800285e8>] __mmdrop+0x98/0x1d0
>> >> >> > [<801a6de8>] free_bprm+0x44/0x118
>> >> >> > [<801a86a8>] kernel_execve+0x160/0x1d8
>> >> >> > [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
>> >> >> > [<80003198>] ret_from_kernel_thread+0x14/0x1c
>> >> >> >
>> >> >> > So that's how I got to looking at fs/exec.c and noticed quite a few
>> >> >> > changes last year. Turns out this message only occurs once very early
>> >> >> > at boot during the very first call to kernel_execve. current->mm is
>> >> >> > NULL at this stage, so acct_arg_size() is effectively a no-op.
>> >> >>
>> >> >> If you believe this is a new error you could bisect the kernel
>> >> >> to see which change introduced the behavior you are seeing.
>> >> >>
>> >> >> > More digging, and I traced the RSS counter increment to:
>> >> >> > [<8015adb4>] add_mm_counter_fast+0xb4/0xc0
>> >> >> > [<80160d58>] handle_mm_fault+0x6e4/0xea0
>> >> >> > [<80158aa4>] __get_user_pages.part.78+0x190/0x37c
>> >> >> > [<8015992c>] __get_user_pages_remote+0x128/0x360
>> >> >> > [<801a6d9c>] get_arg_page+0x34/0xa0
>> >> >> > [<801a7394>] copy_string_kernel+0x194/0x2a4
>> >> >> > [<801a880c>] kernel_execve+0x11c/0x298
>> >> >> > [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
>> >> >> > [<80003198>] ret_from_kernel_thread+0x14/0x1c
>> >> >> >
>> >> >> > In fact, I also checked vma_pages(bprm->vma) and lo and behold it is set to 1.
>> >> >> >
>> >> >> > How is fs/exec.c supposed to handle implied RSS increments that happen
>> >> >> > due to page faults when discarding the bprm structure? In this case,
>> >> >> > the bug-generating kernel_execve call never succeeded, it returned -2,
>> >> >> > but I didn't trace exactly what failed.
>> >> >>
>> >> >> Unless I am mistaken any left over pages should be purged by exit_mmap
>> >> >> which is called by mmput before mmput calls mmdrop.
>> >> > Good to know. Some more digging and I can say that we hit this error
>> >> > when trying to unmap PFN 0 (is_zero_pfn(pfn) returns TRUE,
>> >> > vm_normal_page returns NULL, zap_pte_range does not decrement
>> >> > MM_ANONPAGES RSS counter). Is my understanding correct that PFN 0 is
>> >> > usable, but special? Or am I totally off the mark here?
>> >>
>> >> It would be good to know if that is the page that get_user_pages_remote
>> >> returned to copy_string_kernel.  The zero page that is always zero,
>> >> should never be returned when a writable mapping is desired.
>> >
>> > Indeed, pfn 0 is returned from get_arg_page: (page is 0x809cf000,
>> > page_to_pfn(page) is 0) and it is the same page that is being freed and not
>> > refcounted in mmput/zap_pte_range. Confirmed with good old printk. Also,
>> > ZERO_PAGE(0)==0x809fc000 -> PFN 5120.
>> >
>> > I think I have found the problem though, after much digging and thanks to all
>> > the information provided. init_zero_pfn() gets called too late (after the call
>> > to is_zero_pfn(0) from mmput returns true), until then zero_pfn == 0, and
>> > after, zero_pfn == 5120. Boom.
>> >
>> > So PFN 0 is special, but only for a little bit, enough for something on my
>> > system to call kernel_execve :)
>> >
>> > Question: is my system not supposed to be calling kernel_execve this early or
>> > does init_zero_pfn() need to happen earlier? init_zero_pfn is currently a
>> > core_initcall.
>>
>> Looking quickly it seems that init_zero_pfn() is in mm/memory.c and is
>> common for both mips and x86.  Further it appears init_zero_pfn() has
>> been that way since 2009 a13ea5b75964 ("mm: reinstate ZERO_PAGE").
>>
>> Given the testing that x86 gets and that nothing like this has been
>> reported it looks like whatever driver is triggering the kernel_execve
>> is doing something wrong.
>
>>
>> Because honestly.  If the zero page isn't working there is not a chance
>> that anything in userspace is working so it is clearly much too early.
>>
>> I suspect there is some driver that is initialized very early that is
>> doing something that looks innocuous (like triggering a hotplug event)
>> and that happens to cause a call_usermode_helper which then calls
>> kernel_execve.
> I will investigate the offenders more closely. However, I do not
> notice this behavior on the same system based on the 5.4 kernel. Is it
> possible that last year's exec changes have exposed this issue? Not
> blaming exec at all, just making sure I understand the problem better.

Only in the sense that copy_strings_kernel does less work than
"set_fs(KERNEL_DS); copy_strings; set_fs(USER_DS);"

Nothing huge was changed in exec but lots was moved around so that
it was clearer what is happening, and so that hacks like set_fs could
be removed.

Eric




* Re: exec error: BUG: Bad rss-counter
  2021-03-03 15:55           ` Ilya Lipnitskiy
  2021-03-03 16:07             ` Eric W. Biederman
@ 2021-03-20 15:59             ` Zhou Yanjie
  2021-03-29  2:48               ` Ilya Lipnitskiy
  1 sibling, 1 reply; 13+ messages in thread
From: Zhou Yanjie @ 2021-03-20 15:59 UTC (permalink / raw)
  To: Ilya Lipnitskiy, Eric W. Biederman
  Cc: Linux-MM, Linux Kernel Mailing List, linux-fsdevel, Kees Cook,
	Christoph Hellwig, Linus Torvalds

Hi Ilya,

On 2021/3/3 11:55 PM, Ilya Lipnitskiy wrote:
> On Wed, Mar 3, 2021 at 7:50 AM Eric W. Biederman <ebiederm@xmission.com> wrote:
>> Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:
>>
>>> On Tue, Mar 2, 2021 at 11:37 AM Eric W. Biederman <ebiederm@xmission.com> wrote:
>>>> Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:
>>>>
>>>>> On Mon, Mar 1, 2021 at 12:43 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>>>>>> Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:
>>>>>>
>>>>>>> Eric, All,
>>>>>>>
>>>>>>> The following error appears when running Linux 5.10.18 on an embedded
>>>>>>> MIPS mt7621 target:
>>>>>>> [    0.301219] BUG: Bad rss-counter state mm:(ptrval) type:MM_ANONPAGES val:1
>>>>>>>
>>>>>>> Being a very generic error, I started digging and added a stack dump
>>>>>>> before the BUG:
>>>>>>> Call Trace:
>>>>>>> [<80008094>] show_stack+0x30/0x100
>>>>>>> [<8033b238>] dump_stack+0xac/0xe8
>>>>>>> [<800285e8>] __mmdrop+0x98/0x1d0
>>>>>>> [<801a6de8>] free_bprm+0x44/0x118
>>>>>>> [<801a86a8>] kernel_execve+0x160/0x1d8
>>>>>>> [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
>>>>>>> [<80003198>] ret_from_kernel_thread+0x14/0x1c
>>>>>>>
>>>>>>> So that's how I got to looking at fs/exec.c and noticed quite a few
>>>>>>> changes last year. Turns out this message only occurs once very early
>>>>>>> at boot during the very first call to kernel_execve. current->mm is
>>>>>>> NULL at this stage, so acct_arg_size() is effectively a no-op.
>>>>>> If you believe this is a new error you could bisect the kernel
>>>>>> to see which change introduced the behavior you are seeing.
>>>>>>
>>>>>>> More digging, and I traced the RSS counter increment to:
>>>>>>> [<8015adb4>] add_mm_counter_fast+0xb4/0xc0
>>>>>>> [<80160d58>] handle_mm_fault+0x6e4/0xea0
>>>>>>> [<80158aa4>] __get_user_pages.part.78+0x190/0x37c
>>>>>>> [<8015992c>] __get_user_pages_remote+0x128/0x360
>>>>>>> [<801a6d9c>] get_arg_page+0x34/0xa0
>>>>>>> [<801a7394>] copy_string_kernel+0x194/0x2a4
>>>>>>> [<801a880c>] kernel_execve+0x11c/0x298
>>>>>>> [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
>>>>>>> [<80003198>] ret_from_kernel_thread+0x14/0x1c
>>>>>>>
>>>>>>> In fact, I also checked vma_pages(bprm->vma) and lo and behold it is set to 1.
>>>>>>>
>>>>>>> How is fs/exec.c supposed to handle implied RSS increments that happen
>>>>>>> due to page faults when discarding the bprm structure? In this case,
>>>>>>> the bug-generating kernel_execve call never succeeded, it returned -2,
>>>>>>> but I didn't trace exactly what failed.
>>>>>> Unless I am mistaken any left over pages should be purged by exit_mmap
>>>>>> which is called by mmput before mmput calls mmdrop.
>>>>> Good to know. Some more digging and I can say that we hit this error
>>>>> when trying to unmap PFN 0 (is_zero_pfn(pfn) returns TRUE,
>>>>> vm_normal_page returns NULL, zap_pte_range does not decrement
>>>>> MM_ANONPAGES RSS counter). Is my understanding correct that PFN 0 is
>>>>> usable, but special? Or am I totally off the mark here?
>>>> It would be good to know if that is the page that get_user_pages_remote
>>>> returned to copy_string_kernel.  The zero page that is always zero,
>>>> should never be returned when a writable mapping is desired.
>>> Indeed, pfn 0 is returned from get_arg_page: (page is 0x809cf000,
>>> page_to_pfn(page) is 0) and it is the same page that is being freed and not
>>> refcounted in mmput/zap_pte_range. Confirmed with good old printk. Also,
>>> ZERO_PAGE(0)==0x809fc000 -> PFN 5120.
>>>
>>> I think I have found the problem though, after much digging and thanks to all
>>> the information provided. init_zero_pfn() gets called too late (after
>>> the call to
>>> is_zero_pfn(0) from mmput returns true), until then zero_pfn == 0, and after,
>>> zero_pfn == 5120. Boom.
>>>
>>> So PFN 0 is special, but only for a little bit, enough for something
>>> on my system
>>> to call kernel_execve :)
>>>
>>> Question: is my system not supposed to be calling kernel_execve this
>>> early or does
>>> init_zero_pfn() need to happen earlier? init_zero_pfn is currently a
>>> core_initcall.
>> Looking quickly it seems that init_zero_pfn() is in mm/memory.c and is
>> common for both mips and x86.  Further it appears init_zero_pfn() has
>> been that way since 2009 a13ea5b75964 ("mm: reinstate ZERO_PAGE").
>>
>> Given the testing that x86 gets and that nothing like this has been
>> reported it looks like whatever driver is triggering the kernel_execve
>> is doing something wrong.
>> Because honestly.  If the zero page isn't working there is not a chance
>> that anything in userspace is working so it is clearly much too early.
>>
>> I suspect there is some driver that is initialized very early that is
>> doing something that looks innocuous (like triggering a hotplug event)
>> and that happens to cause a call_usermode_helper which then calls
>> kernel_execve.
> I will investigate the offenders more closely. However, I do not
> notice this behavior on the same system based on the 5.4 kernel. Is it


I also encountered this problem on Ingenic X1000 and X1830. This is the
printed information:

[    0.120715] BUG: Bad rss-counter state mm:(ptrval) type:MM_ANONPAGES val:1

I tested kernels 5.9, 5.10, 5.11, and 5.12; only 5.9 did not show this
problem, so it appears to have been introduced in 5.10. Have you found
an effective solution?


Thanks and best regards!


> possible that last year's exec changes have exposed this issue? Not
> blaming exec at all, just making sure I understand the problem better.
>
> Ilya



* Re: exec error: BUG: Bad rss-counter
  2021-03-03 15:50         ` Eric W. Biederman
  2021-03-03 15:55           ` Ilya Lipnitskiy
@ 2021-03-29  2:46           ` Ilya Lipnitskiy
  1 sibling, 0 replies; 13+ messages in thread
From: Ilya Lipnitskiy @ 2021-03-29  2:46 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Linux-MM, Linux Kernel Mailing List, linux-fsdevel, Kees Cook,
	Christoph Hellwig, Linus Torvalds

On Wed, Mar 3, 2021 at 7:50 AM Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:
>
> > On Tue, Mar 2, 2021 at 11:37 AM Eric W. Biederman <ebiederm@xmission.com> wrote:
> >>
> >> Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:
> >>
> >> > On Mon, Mar 1, 2021 at 12:43 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> >> >>
> >> >> Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:
> >> >>
> >> >> > Eric, All,
> >> >> >
> >> >> > The following error appears when running Linux 5.10.18 on an embedded
> >> >> > MIPS mt7621 target:
> >> >> > [    0.301219] BUG: Bad rss-counter state mm:(ptrval) type:MM_ANONPAGES val:1
> >> >> >
> >> >> > Being a very generic error, I started digging and added a stack dump
> >> >> > before the BUG:
> >> >> > Call Trace:
> >> >> > [<80008094>] show_stack+0x30/0x100
> >> >> > [<8033b238>] dump_stack+0xac/0xe8
> >> >> > [<800285e8>] __mmdrop+0x98/0x1d0
> >> >> > [<801a6de8>] free_bprm+0x44/0x118
> >> >> > [<801a86a8>] kernel_execve+0x160/0x1d8
> >> >> > [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
> >> >> > [<80003198>] ret_from_kernel_thread+0x14/0x1c
> >> >> >
> >> >> > So that's how I got to looking at fs/exec.c and noticed quite a few
> >> >> > changes last year. Turns out this message only occurs once very early
> >> >> > at boot during the very first call to kernel_execve. current->mm is
> >> >> > NULL at this stage, so acct_arg_size() is effectively a no-op.
> >> >>
> >> >> If you believe this is a new error you could bisect the kernel
> >> >> to see which change introduced the behavior you are seeing.
> >> >>
> >> >> > More digging, and I traced the RSS counter increment to:
> >> >> > [<8015adb4>] add_mm_counter_fast+0xb4/0xc0
> >> >> > [<80160d58>] handle_mm_fault+0x6e4/0xea0
> >> >> > [<80158aa4>] __get_user_pages.part.78+0x190/0x37c
> >> >> > [<8015992c>] __get_user_pages_remote+0x128/0x360
> >> >> > [<801a6d9c>] get_arg_page+0x34/0xa0
> >> >> > [<801a7394>] copy_string_kernel+0x194/0x2a4
> >> >> > [<801a880c>] kernel_execve+0x11c/0x298
> >> >> > [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
> >> >> > [<80003198>] ret_from_kernel_thread+0x14/0x1c
> >> >> >
> >> >> > In fact, I also checked vma_pages(bprm->vma) and lo and behold it is set to 1.
> >> >> >
> >> >> > How is fs/exec.c supposed to handle implied RSS increments that happen
> >> >> > due to page faults when discarding the bprm structure? In this case,
> >> >> > the bug-generating kernel_execve call never succeeded, it returned -2,
> >> >> > but I didn't trace exactly what failed.
> >> >>
> >> >> Unless I am mistaken any left over pages should be purged by exit_mmap
> >> >> which is called by mmput before mmput calls mmdrop.
> >> > Good to know. Some more digging and I can say that we hit this error
> >> > when trying to unmap PFN 0 (is_zero_pfn(pfn) returns TRUE,
> >> > vm_normal_page returns NULL, zap_pte_range does not decrement
> >> > MM_ANONPAGES RSS counter). Is my understanding correct that PFN 0 is
> >> > usable, but special? Or am I totally off the mark here?
> >>
> >> It would be good to know if that is the page that get_user_pages_remote
> >> returned to copy_string_kernel.  The zero page that is always zero,
> >> should never be returned when a writable mapping is desired.
> >
> > Indeed, pfn 0 is returned from get_arg_page: (page is 0x809cf000,
> > page_to_pfn(page) is 0) and it is the same page that is being freed and not
> > refcounted in mmput/zap_pte_range. Confirmed with good old printk. Also,
> > ZERO_PAGE(0)==0x809fc000 -> PFN 5120.
> >
> > I think I have found the problem though, after much digging and thanks to all
> > the information provided. init_zero_pfn() gets called too late (after
> > the call to
> > is_zero_pfn(0) from mmput returns true), until then zero_pfn == 0, and after,
> > zero_pfn == 5120. Boom.
> >
> > So PFN 0 is special, but only for a little bit, enough for something
> > on my system
> > to call kernel_execve :)
> >
> > Question: is my system not supposed to be calling kernel_execve this
> > early or does
> > init_zero_pfn() need to happen earlier? init_zero_pfn is currently a
> > core_initcall.
>
> Looking quickly it seems that init_zero_pfn() is in mm/memory.c and is
> common for both mips and x86.  Further it appears init_zero_pfn() has
> been that way since 2009 a13ea5b75964 ("mm: reinstate ZERO_PAGE").
>
> Given the testing that x86 gets and that nothing like this has been
> reported it looks like whatever driver is triggering the kernel_execve
> is doing something wrong.
>
> Because honestly.  If the zero page isn't working there is not a chance
> that anything in userspace is working so it is clearly much too early.
>
> I suspect there is some driver that is initialized very early that is
> doing something that looks innocuous (like triggering a hotplug event)
> and that happens to cause a call_usermode_helper which then calls
> kernel_execve.

Here is the data that's passed into the very first kernel_execve call:
kernel_filename: /sbin/hotplug
argv: [/sbin/hotplug, bus]
envp: [ACTION=add, DEVPATH=/bus/workqueue, SUBSYSTEM=bus, SEQNUM=4,
HOME=/, PATH=/sbin:/bin:/usr/sbin:/usr/bin]

It comes from kobject_uevent_env() calling call_usermodehelper_exec()
with UMH_NO_WAIT.

Trace:
[<80340dc8>] kobject_uevent_env+0x7e4/0x7ec
[<8033f8b8>] kset_register+0x68/0x88
[<803cf824>] bus_register+0xdc/0x34c
[<803cfac8>] subsys_virtual_register+0x34/0x78
[<8086afb0>] wq_sysfs_init+0x1c/0x4c
[<80001648>] do_one_initcall+0x50/0x1a8
[<8086503c>] kernel_init_freeable+0x230/0x2c8
[<8066bca0>] kernel_init+0x10/0x100
[<80003038>] ret_from_kernel_thread+0x14/0x1c

A bunch of other bus devices are initialized at the same time, but
SEQNUM=4 gets to go first for some reason:
[    0.420497] smp: Brought up 1 node, 4 CPUs
[    0.431204] ACTION:add DEVPATH:/bus/platform SUBSYSTEM:bus SEQNUM: 1
[    0.431249] ACTION:add DEVPATH:/bus/cpu SUBSYSTEM:bus SEQNUM: 2
[    0.440594] ACTION:add DEVPATH:/bus/container SUBSYSTEM:bus SEQNUM: 3
[    0.449994] ACTION:add DEVPATH:/bus/workqueue SUBSYSTEM:bus SEQNUM: 4

Since both wq_sysfs_init() and init_zero_pfn() are annotated with
core_initcall(), is there a race?

Maybe there is still an argument for moving init_zero_pfn() to
early_initcall()? According to the comment above init_zero_pfn(),
"CONFIG_MMU architectures set up ZERO_PAGE in their paging_init()".
paging_init() gets called in setup_arch(), which is way before
do_pre_smp_initcalls(), so it should work, right? Obviously something
that needs to be tested, but are my assumptions correct?

FWIW I tested it on my MIPS device and it boots fine and the BUG
message is gone. I still don't know why it started appearing on 5.10+,
maybe some core_initcalls got added that made the race worse?

Ilya



* Re: exec error: BUG: Bad rss-counter
  2021-03-20 15:59             ` Zhou Yanjie
@ 2021-03-29  2:48               ` Ilya Lipnitskiy
  2021-03-30  4:56                 ` Zhou Yanjie
  0 siblings, 1 reply; 13+ messages in thread
From: Ilya Lipnitskiy @ 2021-03-29  2:48 UTC (permalink / raw)
  To: Zhou Yanjie
  Cc: Eric W. Biederman, Linux-MM, Linux Kernel Mailing List,
	linux-fsdevel, Kees Cook, Christoph Hellwig, Linus Torvalds

On Sat, Mar 20, 2021 at 8:59 AM Zhou Yanjie <zhouyanjie@wanyeetech.com> wrote:
>
> Hi Ilya,
>
> On 2021/3/3 11:55 PM, Ilya Lipnitskiy wrote:
> > On Wed, Mar 3, 2021 at 7:50 AM Eric W. Biederman <ebiederm@xmission.com> wrote:
> >> Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:
> >>
> >>> On Tue, Mar 2, 2021 at 11:37 AM Eric W. Biederman <ebiederm@xmission.com> wrote:
> >>>> Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:
> >>>>
> >>>>> On Mon, Mar 1, 2021 at 12:43 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> >>>>>> Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:
> >>>>>>
> >>>>>>> Eric, All,
> >>>>>>>
> >>>>>>> The following error appears when running Linux 5.10.18 on an embedded
> >>>>>>> MIPS mt7621 target:
> >>>>>>> [    0.301219] BUG: Bad rss-counter state mm:(ptrval) type:MM_ANONPAGES val:1
> >>>>>>>
> >>>>>>> Being a very generic error, I started digging and added a stack dump
> >>>>>>> before the BUG:
> >>>>>>> Call Trace:
> >>>>>>> [<80008094>] show_stack+0x30/0x100
> >>>>>>> [<8033b238>] dump_stack+0xac/0xe8
> >>>>>>> [<800285e8>] __mmdrop+0x98/0x1d0
> >>>>>>> [<801a6de8>] free_bprm+0x44/0x118
> >>>>>>> [<801a86a8>] kernel_execve+0x160/0x1d8
> >>>>>>> [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
> >>>>>>> [<80003198>] ret_from_kernel_thread+0x14/0x1c
> >>>>>>>
> >>>>>>> So that's how I got to looking at fs/exec.c and noticed quite a few
> >>>>>>> changes last year. Turns out this message only occurs once very early
> >>>>>>> at boot during the very first call to kernel_execve. current->mm is
> >>>>>>> NULL at this stage, so acct_arg_size() is effectively a no-op.
> >>>>>> If you believe this is a new error you could bisect the kernel
> >>>>>> to see which change introduced the behavior you are seeing.
> >>>>>>
> >>>>>>> More digging, and I traced the RSS counter increment to:
> >>>>>>> [<8015adb4>] add_mm_counter_fast+0xb4/0xc0
> >>>>>>> [<80160d58>] handle_mm_fault+0x6e4/0xea0
> >>>>>>> [<80158aa4>] __get_user_pages.part.78+0x190/0x37c
> >>>>>>> [<8015992c>] __get_user_pages_remote+0x128/0x360
> >>>>>>> [<801a6d9c>] get_arg_page+0x34/0xa0
> >>>>>>> [<801a7394>] copy_string_kernel+0x194/0x2a4
> >>>>>>> [<801a880c>] kernel_execve+0x11c/0x298
> >>>>>>> [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
> >>>>>>> [<80003198>] ret_from_kernel_thread+0x14/0x1c
> >>>>>>>
> >>>>>>> In fact, I also checked vma_pages(bprm->vma) and lo and behold it is set to 1.
> >>>>>>>
> >>>>>>> How is fs/exec.c supposed to handle implied RSS increments that happen
> >>>>>>> due to page faults when discarding the bprm structure? In this case,
> >>>>>>> the bug-generating kernel_execve call never succeeded, it returned -2,
> >>>>>>> but I didn't trace exactly what failed.
> >>>>>> Unless I am mistaken any left over pages should be purged by exit_mmap
> >>>>>> which is called by mmput before mmput calls mmdrop.
> >>>>> Good to know. Some more digging and I can say that we hit this error
> >>>>> when trying to unmap PFN 0 (is_zero_pfn(pfn) returns TRUE,
> >>>>> vm_normal_page returns NULL, zap_pte_range does not decrement
> >>>>> MM_ANONPAGES RSS counter). Is my understanding correct that PFN 0 is
> >>>>> usable, but special? Or am I totally off the mark here?
> >>>> It would be good to know if that is the page that get_user_pages_remote
> >>>> returned to copy_string_kernel.  The zero page that is always zero,
> >>>> should never be returned when a writable mapping is desired.
> >>> Indeed, pfn 0 is returned from get_arg_page: (page is 0x809cf000,
> >>> page_to_pfn(page) is 0) and it is the same page that is being freed and not
> >>> refcounted in mmput/zap_pte_range. Confirmed with good old printk. Also,
> >>> ZERO_PAGE(0)==0x809fc000 -> PFN 5120.
> >>>
> >>> I think I have found the problem though, after much digging and thanks to all
> >>> the information provided. init_zero_pfn() gets called too late (after
> >>> the call to
> >>> is_zero_pfn(0) from mmput returns true), until then zero_pfn == 0, and after,
> >>> zero_pfn == 5120. Boom.
> >>>
> >>> So PFN 0 is special, but only for a little bit, enough for something
> >>> on my system
> >>> to call kernel_execve :)
> >>>
> >>> Question: is my system not supposed to be calling kernel_execve this
> >>> early or does
> >>> init_zero_pfn() need to happen earlier? init_zero_pfn is currently a
> >>> core_initcall.
> >> Looking quickly it seems that init_zero_pfn() is in mm/memory.c and is
> >> common for both mips and x86.  Further it appears init_zero_pfn() has
> >> been that way since 2009 a13ea5b75964 ("mm: reinstate ZERO_PAGE").
> >>
> >> Given the testing that x86 gets and that nothing like this has been
> >> reported it looks like whatever driver is triggering the kernel_execve
> >> is doing something wrong.
> >> Because honestly.  If the zero page isn't working there is not a chance
> >> that anything in userspace is working so it is clearly much too early.
> >>
> >> I suspect there is some driver that is initialized very early that is
> >> doing something that looks innocuous (like triggering a hotplug event)
> >> and that happens to cause a call_usermode_helper which then calls
> >> kernel_execve.
> > I will investigate the offenders more closely. However, I do not
> > notice this behavior on the same system based on the 5.4 kernel. Is it
>
>
> I also encountered this problem on Ingenic X1000 and X1830. This is the
> printed information:
>
> [    0.120715] BUG: Bad rss-counter state mm:(ptrval)
>   type:MM_ANONPAGES val:1
>
> I tested kernel 5.9, kernel 5.10, kernel 5.11, and kernel 5.12, only
> kernel 5.9 did not have this problem, so we can know that this problem
> was introduced in kernel 5.10, have you found any effective solution?
Try:
diff --git a/mm/memory.c b/mm/memory.c
index c8e357627318..1fd753245369 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -166,7 +166,7 @@ static int __init init_zero_pfn(void)
        zero_pfn = page_to_pfn(ZERO_PAGE(0));
        return 0;
 }
-core_initcall(init_zero_pfn);
+early_initcall(init_zero_pfn);

 void mm_trace_rss_stat(struct mm_struct *mm, int member, long count)
 {



* Re: exec error: BUG: Bad rss-counter
  2021-03-29  2:48               ` Ilya Lipnitskiy
@ 2021-03-30  4:56                 ` Zhou Yanjie
  2021-03-30 16:11                   ` Linus Torvalds
  0 siblings, 1 reply; 13+ messages in thread
From: Zhou Yanjie @ 2021-03-30  4:56 UTC (permalink / raw)
  To: Ilya Lipnitskiy
  Cc: Eric W. Biederman, Linux-MM, Linux Kernel Mailing List,
	linux-fsdevel, Kees Cook, Christoph Hellwig, Linus Torvalds

Hi Ilya,

On 2021/3/29 10:48 AM, Ilya Lipnitskiy wrote:
> On Sat, Mar 20, 2021 at 8:59 AM Zhou Yanjie <zhouyanjie@wanyeetech.com> wrote:
>> Hi Ilya,
>>
>> On 2021/3/3 11:55 PM, Ilya Lipnitskiy wrote:
>>> On Wed, Mar 3, 2021 at 7:50 AM Eric W. Biederman <ebiederm@xmission.com> wrote:
>>>> Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:
>>>>
>>>>> On Tue, Mar 2, 2021 at 11:37 AM Eric W. Biederman <ebiederm@xmission.com> wrote:
>>>>>> Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:
>>>>>>
>>>>>>> On Mon, Mar 1, 2021 at 12:43 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>>>>>>>> Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> writes:
>>>>>>>>
>>>>>>>>> Eric, All,
>>>>>>>>>
>>>>>>>>> The following error appears when running Linux 5.10.18 on an embedded
>>>>>>>>> MIPS mt7621 target:
>>>>>>>>> [    0.301219] BUG: Bad rss-counter state mm:(ptrval) type:MM_ANONPAGES val:1
>>>>>>>>>
>>>>>>>>> Being a very generic error, I started digging and added a stack dump
>>>>>>>>> before the BUG:
>>>>>>>>> Call Trace:
>>>>>>>>> [<80008094>] show_stack+0x30/0x100
>>>>>>>>> [<8033b238>] dump_stack+0xac/0xe8
>>>>>>>>> [<800285e8>] __mmdrop+0x98/0x1d0
>>>>>>>>> [<801a6de8>] free_bprm+0x44/0x118
>>>>>>>>> [<801a86a8>] kernel_execve+0x160/0x1d8
>>>>>>>>> [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
>>>>>>>>> [<80003198>] ret_from_kernel_thread+0x14/0x1c
>>>>>>>>>
>>>>>>>>> So that's how I got to looking at fs/exec.c and noticed quite a few
>>>>>>>>> changes last year. Turns out this message only occurs once very early
>>>>>>>>> at boot during the very first call to kernel_execve. current->mm is
>>>>>>>>> NULL at this stage, so acct_arg_size() is effectively a no-op.
>>>>>>>> If you believe this is a new error you could bisect the kernel
>>>>>>>> to see which change introduced the behavior you are seeing.
>>>>>>>>
>>>>>>>>> More digging, and I traced the RSS counter increment to:
>>>>>>>>> [<8015adb4>] add_mm_counter_fast+0xb4/0xc0
>>>>>>>>> [<80160d58>] handle_mm_fault+0x6e4/0xea0
>>>>>>>>> [<80158aa4>] __get_user_pages.part.78+0x190/0x37c
>>>>>>>>> [<8015992c>] __get_user_pages_remote+0x128/0x360
>>>>>>>>> [<801a6d9c>] get_arg_page+0x34/0xa0
>>>>>>>>> [<801a7394>] copy_string_kernel+0x194/0x2a4
>>>>>>>>> [<801a880c>] kernel_execve+0x11c/0x298
>>>>>>>>> [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
>>>>>>>>> [<80003198>] ret_from_kernel_thread+0x14/0x1c
>>>>>>>>>
>>>>>>>>> In fact, I also checked vma_pages(bprm->vma) and lo and behold it is set to 1.
>>>>>>>>>
>>>>>>>>> How is fs/exec.c supposed to handle implied RSS increments that happen
>>>>>>>>> due to page faults when discarding the bprm structure? In this case,
>>>>>>>>> the bug-generating kernel_execve call never succeeded, it returned -2,
>>>>>>>>> but I didn't trace exactly what failed.
>>>>>>>> Unless I am mistaken any left over pages should be purged by exit_mmap
>>>>>>>> which is called by mmput before mmput calls mmdrop.
>>>>>>> Good to know. Some more digging and I can say that we hit this error
>>>>>>> when trying to unmap PFN 0 (is_zero_pfn(pfn) returns TRUE,
>>>>>>> vm_normal_page returns NULL, zap_pte_range does not decrement
>>>>>>> MM_ANONPAGES RSS counter). Is my understanding correct that PFN 0 is
>>>>>>> usable, but special? Or am I totally off the mark here?
>>>>>> It would be good to know if that is the page that get_user_pages_remote
>>>>>> returned to copy_string_kernel.  The zero page that is always zero,
>>>>>> should never be returned when a writable mapping is desired.
>>>>> Indeed, pfn 0 is returned from get_arg_page: (page is 0x809cf000,
>>>>> page_to_pfn(page) is 0) and it is the same page that is being freed and not
>>>>> refcounted in mmput/zap_pte_range. Confirmed with good old printk. Also,
>>>>> ZERO_PAGE(0)==0x809fc000 -> PFN 5120.
>>>>>
>>>>> I think I have found the problem though, after much digging and thanks to all
>>>>> the information provided. init_zero_pfn() gets called too late (after
>>>>> the call to
>>>>> is_zero_pfn(0) from mmput returns true), until then zero_pfn == 0, and after,
>>>>> zero_pfn == 5120. Boom.
>>>>>
>>>>> So PFN 0 is special, but only for a little bit, enough for something
>>>>> on my system
>>>>> to call kernel_execve :)
>>>>>
>>>>> Question: is my system not supposed to be calling kernel_execve this
>>>>> early or does
>>>>> init_zero_pfn() need to happen earlier? init_zero_pfn is currently a
>>>>> core_initcall.
>>>> Looking quickly it seems that init_zero_pfn() is in mm/memory.c and is
>>>> common for both mips and x86.  Further it appears init_zero_pfn() has
>>>> been that way since 2009 a13ea5b75964 ("mm: reinstate ZERO_PAGE").
>>>>
>>>> Given the testing that x86 gets and that nothing like this has been
>>>> reported it looks like whatever driver is triggering the kernel_execve
>>>> is doing something wrong.
>>>> Because honestly.  If the zero page isn't working there is not a chance
>>>> that anything in userspace is working so it is clearly much too early.
>>>>
>>>> I suspect there is some driver that is initialized very early that is
>>>> doing something that looks innocuous (like triggering a hotplug event)
>>>> and that happens to cause a call_usermode_helper which then calls
>>>> kernel_execve.
>>> I will investigate the offenders more closely. However, I do not
>>> notice this behavior on the same system based on the 5.4 kernel. Is it
>>
>> I also encountered this problem on Ingenic X1000 and X1830. This is the
>> printed information:
>>
>> [    0.120715] BUG: Bad rss-counter state mm:(ptrval)
>>    type:MM_ANONPAGES val:1
>>
>> I tested kernel 5.9, kernel 5.10, kernel 5.11, and kernel 5.12, only
>> kernel 5.9 did not have this problem, so we can know that this problem
>> was introduced in kernel 5.10, have you found any effective solution?
> Try:
> diff --git a/mm/memory.c b/mm/memory.c
> index c8e357627318..1fd753245369 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -166,7 +166,7 @@ static int __init init_zero_pfn(void)
>          zero_pfn = page_to_pfn(ZERO_PAGE(0));
>          return 0;
>   }
> -core_initcall(init_zero_pfn);
> +early_initcall(init_zero_pfn);


It works, thanks!


Best regards!


>   void mm_trace_rss_stat(struct mm_struct *mm, int member, long count)
>   {



* Re: exec error: BUG: Bad rss-counter
  2021-03-30  4:56                 ` Zhou Yanjie
@ 2021-03-30 16:11                   ` Linus Torvalds
  2021-03-30 16:36                     ` Ilya Lipnitskiy
  0 siblings, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2021-03-30 16:11 UTC (permalink / raw)
  To: Zhou Yanjie
  Cc: Ilya Lipnitskiy, Eric W. Biederman, Linux-MM,
	Linux Kernel Mailing List, linux-fsdevel, Kees Cook,
	Christoph Hellwig

On Mon, Mar 29, 2021 at 9:56 PM Zhou Yanjie <zhouyanjie@wanyeetech.com> wrote:
>
> On 2021/3/29 10:48 AM, Ilya Lipnitskiy wrote:
> >
> > Try:
> > diff --git a/mm/memory.c b/mm/memory.c
> > index c8e357627318..1fd753245369 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -166,7 +166,7 @@ static int __init init_zero_pfn(void)
> >          zero_pfn = page_to_pfn(ZERO_PAGE(0));
> >          return 0;
> >   }
> > -core_initcall(init_zero_pfn);
> > +early_initcall(init_zero_pfn);
>
> It works, thanks!

Looks good to me - init_zero_pfn() can be called early, because it
depends only on paging_init(), which should have happened in
setup_arch(), long before any initcalls.

Ilya, mind sending a signed-off version with a nice commit message,
and I'll apply it.

             Linus



* Re: exec error: BUG: Bad rss-counter
  2021-03-30 16:11                   ` Linus Torvalds
@ 2021-03-30 16:36                     ` Ilya Lipnitskiy
  2021-03-30 16:47                       ` Linus Torvalds
  0 siblings, 1 reply; 13+ messages in thread
From: Ilya Lipnitskiy @ 2021-03-30 16:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Zhou Yanjie, Eric W. Biederman, Linux-MM,
	Linux Kernel Mailing List, linux-fsdevel, Kees Cook,
	Christoph Hellwig

On Tue, Mar 30, 2021 at 9:11 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Mon, Mar 29, 2021 at 9:56 PM Zhou Yanjie <zhouyanjie@wanyeetech.com> wrote:
> >
> > On 2021/3/29 10:48 AM, Ilya Lipnitskiy wrote:
> > >
> > > Try:
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index c8e357627318..1fd753245369 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -166,7 +166,7 @@ static int __init init_zero_pfn(void)
> > >          zero_pfn = page_to_pfn(ZERO_PAGE(0));
> > >          return 0;
> > >   }
> > > -core_initcall(init_zero_pfn);
> > > +early_initcall(init_zero_pfn);
> >
> > It works, thanks!
>
> Looks good to me - init_zero_pfn() can be called early, because it
> depends only on paging_init(), which should have happened in
> setup_arch(), long before any initcalls.
>
> Ilya, mind sending a signed-off version with a nice commit message,
> and I'll apply it.
Sorry, I could have done better linking it to this thread - I actually
did submit it recently - please see
https://lkml.kernel.org/r/20210330044208.8305-1-ilya.lipnitskiy@gmail.com

Ilya



* Re: exec error: BUG: Bad rss-counter
  2021-03-30 16:36                     ` Ilya Lipnitskiy
@ 2021-03-30 16:47                       ` Linus Torvalds
  0 siblings, 0 replies; 13+ messages in thread
From: Linus Torvalds @ 2021-03-30 16:47 UTC (permalink / raw)
  To: Ilya Lipnitskiy
  Cc: Zhou Yanjie, Eric W. Biederman, Linux-MM,
	Linux Kernel Mailing List, linux-fsdevel, Kees Cook,
	Christoph Hellwig

On Tue, Mar 30, 2021 at 9:36 AM Ilya Lipnitskiy
<ilya.lipnitskiy@gmail.com> wrote:
>
> Sorry, I could have done better linking it to this thread - I actually
> did submit it recently - please see
> https://lkml.kernel.org/r/20210330044208.8305-1-ilya.lipnitskiy@gmail.com

Oh, ok, that looks fine.

Thanks, applied,

             Linus

