linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [GIT PULL] tracing: Fixes to bootconfig memory management
@ 2021-09-14 14:56 Steven Rostedt
  2021-09-14 18:01 ` Linus Torvalds
  0 siblings, 1 reply; 14+ messages in thread
From: Steven Rostedt @ 2021-09-14 14:56 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: LKML, Ingo Molnar, Andrew Morton, Masami Hiramatsu


Linus,

A couple of memory management fixes to the bootconfig code


Please pull the latest trace-v5.15-rc1 tree, which can be found at:


  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
trace-v5.15-rc1

Tag SHA1: 04c8861530c8c66f00918702374668e8e8b230af
Head SHA1: 8e9f0934a07e699044d422ca9cfb553f25c72b41


Masami Hiramatsu (2):
      bootconfig: Fix to check the xbc_node is used before free it
      bootconfig: Free copied bootconfig data after boot

----
 init/main.c      | 8 ++++++++
 lib/bootconfig.c | 3 ++-
 2 files changed, 10 insertions(+), 1 deletion(-)
---------------------------
diff --git a/init/main.c b/init/main.c
index d08caed17c7f..ddbcb372225a 100644
--- a/init/main.c
+++ b/init/main.c
@@ -319,6 +319,8 @@ static void * __init get_boot_config_from_initrd(u32 *_size, u32 *_csum)
 #ifdef CONFIG_BOOT_CONFIG
 
 static char xbc_namebuf[XBC_KEYLEN_MAX] __initdata;
+static void *init_xbc_data_copy __initdata;
+static phys_addr_t init_xbc_data_size __initdata;
 
 #define rest(dst, end) ((end) > (dst) ? (end) - (dst) : 0)
 
@@ -458,18 +460,24 @@ static void __init setup_boot_config(void)
 		else
 			pr_err("Failed to parse bootconfig: %s at %d.\n",
 				msg, pos);
+		memblock_free(__pa(copy), size + 1);
 	} else {
 		pr_info("Load bootconfig: %d bytes %d nodes\n", size, ret);
 		/* keys starting with "kernel." are passed via cmdline */
 		extra_command_line = xbc_make_cmdline("kernel");
 		/* Also, "init." keys are init arguments */
 		extra_init_args = xbc_make_cmdline("init");
+		init_xbc_data_copy = copy;
+		init_xbc_data_size = size + 1;
 	}
 	return;
 }
 
 static void __init exit_boot_config(void)
 {
+	if (!init_xbc_data_copy)
+		return;
+	memblock_free(__pa(init_xbc_data_copy), init_xbc_data_size);
 	xbc_destroy_all();
 }
 
diff --git a/lib/bootconfig.c b/lib/bootconfig.c
index f8419cff1147..4f8849706ef6 100644
--- a/lib/bootconfig.c
+++ b/lib/bootconfig.c
@@ -792,7 +792,8 @@ void __init xbc_destroy_all(void)
 	xbc_data = NULL;
 	xbc_data_size = 0;
 	xbc_node_num = 0;
-	memblock_free(__pa(xbc_nodes), sizeof(struct xbc_node) * XBC_NODE_MAX);
+	if (xbc_nodes)
+		memblock_free(__pa(xbc_nodes), sizeof(struct xbc_node) * XBC_NODE_MAX);
 	xbc_nodes = NULL;
 	brace_index = 0;
 }

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [GIT PULL] tracing: Fixes to bootconfig memory management
  2021-09-14 14:56 [GIT PULL] tracing: Fixes to bootconfig memory management Steven Rostedt
@ 2021-09-14 18:01 ` Linus Torvalds
  2021-09-14 18:59   ` Steven Rostedt
  2021-09-17 20:10   ` Mike Rapoport
  0 siblings, 2 replies; 14+ messages in thread
From: Linus Torvalds @ 2021-09-14 18:01 UTC (permalink / raw)
  To: Steven Rostedt, Mike Rapoport, Andrew Morton
  Cc: LKML, Ingo Molnar, Masami Hiramatsu, Linux-MM

On Tue, Sep 14, 2021 at 7:56 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> A couple of memory management fixes to the bootconfig code

These may be fixes, but they are too ugly to merit the tiny
theoretical leak fix.

All of these are just plain wrong:

> +static void *init_xbc_data_copy __initdata;
> +static phys_addr_t init_xbc_data_size __initdata;
> +               init_xbc_data_copy = copy;
> +               init_xbc_data_size = size + 1;
> +       memblock_free(__pa(init_xbc_data_copy), init_xbc_data_size);

because the xbc code already saves these as xbc_data/xbc_data_size and
that final free should just be done in xbc_destroy_all().

So this fix is pointlessly ugly to begin with.

But what I _really_ ended up reacting to was that

> +               memblock_free(__pa(copy), size + 1);

where that "copy" was allocated with

        copy = memblock_alloc(size + 1, SMP_CACHE_BYTES);

so it should damn well be free'd without any crazy "__pa()" games.

This is a memblock interface bug, plain and simple.

Mike - this craziness needs to just be fixed. If memblock_alloc()
returns a virtual address, then memblock_free() should take one.

And if somebody has physical addresses because they aren't freeing
previously allocated resources, but because they are initializing the
memblock data from physical resources, then it shouldn't be called
"memblock_free()".

Alternatively, it should just _all_ be done in physaddr_t - that would
at least be consistent. But it would be *bad*.

Let's just get these interfaces fixed. It might be as simple as having
a "memblock_free_phys()" interface, and doing a search-and-replace
with coccinelle of

     memblock_free(__pa(xyz), .. -> memblock_free(xyz, ...
     memblock_free(other, .. -> memblock_free_phys(other, ..

and adding the (trivial) internal helper functions to memblock,
instead of making the atcual _users_ of memblock do insanely stupid
and confusing things.

Doing that automatic replacement might need an intermediate to avoid
the ambiguous case - first translate

     memblock_free(__pa(xyz), .. -> memblock_free_sane(xyz, ..

and then do any remaining

     memblock_free(xyz, .. -> memblock_free_phys(xyz, ..

and then when there are no remaining cases of 'memblock_free()' left,
do a final rename

     memblock_free_sane(.. -> memblock_free(..

but the actual commit can and should be just a single commit that just
fixes 'memblock_free()' to have sane interfaces.

Happily at least the type ends up making sure that we don't have
subtle mistakes (ie physaddr_t is an integer type, and a virtual
pointer is a pointer, so any missed conversions would cause nice
compile-time errors).

I hadn't noticed this insanity until now, but now that I do, I really
don't want to add to the ugliness for some unimportant theoretical
leak fix.

The memblock code has had enough subtleties that having inconsistent
and illogical basic interfaces is certainly not a good idea.

               Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [GIT PULL] tracing: Fixes to bootconfig memory management
  2021-09-14 18:01 ` Linus Torvalds
@ 2021-09-14 18:59   ` Steven Rostedt
  2021-09-14 19:05     ` Linus Torvalds
  2021-09-17 20:10   ` Mike Rapoport
  1 sibling, 1 reply; 14+ messages in thread
From: Steven Rostedt @ 2021-09-14 18:59 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Mike Rapoport, Andrew Morton, LKML, Ingo Molnar,
	Masami Hiramatsu, Linux-MM, Vlastimil Babka

On Tue, 14 Sep 2021 11:01:31 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> All of these are just plain wrong:

I'll revert these, and expect Masami to address the issues you present.

Note, they do fix a real bug as shown by:

  https://lore.kernel.org/all/e5c7ce5b-93f0-b305-de32-c99d0390eb28@suse.cz/

> 
> > +static void *init_xbc_data_copy __initdata;
> > +static phys_addr_t init_xbc_data_size __initdata;
> > +               init_xbc_data_copy = copy;
> > +               init_xbc_data_size = size + 1;
> > +       memblock_free(__pa(init_xbc_data_copy), init_xbc_data_size);  
> 
> because the xbc code already saves these as xbc_data/xbc_data_size and
> that final free should just be done in xbc_destroy_all().
> 
> So this fix is pointlessly ugly to begin with.
> 
> But what I _really_ ended up reacting to was that
> 
> > +               memblock_free(__pa(copy), size + 1);  

Should we hold off until mm has fixed this issue?

-- Steve


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [GIT PULL] tracing: Fixes to bootconfig memory management
  2021-09-14 18:59   ` Steven Rostedt
@ 2021-09-14 19:05     ` Linus Torvalds
  2021-09-14 19:14       ` Steven Rostedt
  2021-09-14 19:23       ` Linus Torvalds
  0 siblings, 2 replies; 14+ messages in thread
From: Linus Torvalds @ 2021-09-14 19:05 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Mike Rapoport, Andrew Morton, LKML, Ingo Molnar,
	Masami Hiramatsu, Linux-MM, Vlastimil Babka

On Tue, Sep 14, 2021 at 12:00 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> Note, they do fix a real bug as shown by:
>
>   https://lore.kernel.org/all/e5c7ce5b-93f0-b305-de32-c99d0390eb28@suse.cz/

Ok, can you send me that NULL ptr check as one separate thing?

> Should we hold off until mm has fixed this issue?

Yeah.

That would actually fix the NULL pointer issue too, since a _proper_
memblock_free() would just be a no-op for NULL.

That whole "do __pa() on a random pointer to pass it to
memblock_free()" is really just complete garbage in _so_ many ways.

                Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [GIT PULL] tracing: Fixes to bootconfig memory management
  2021-09-14 19:05     ` Linus Torvalds
@ 2021-09-14 19:14       ` Steven Rostedt
  2021-09-14 19:23       ` Linus Torvalds
  1 sibling, 0 replies; 14+ messages in thread
From: Steven Rostedt @ 2021-09-14 19:14 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Mike Rapoport, Andrew Morton, LKML, Ingo Molnar,
	Masami Hiramatsu, Linux-MM, Vlastimil Babka

On Tue, 14 Sep 2021 12:05:09 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> > Note, they do fix a real bug as shown by:
> >
> >   https://lore.kernel.org/all/e5c7ce5b-93f0-b305-de32-c99d0390eb28@suse.cz/  
> 
> Ok, can you send me that NULL ptr check as one separate thing?

Perfect, because that one was added first, and I only need to rebase to
it without changing the history. Although, I will add the tested-by
from Vlastimil and the link he provided. But the actual code has
already ran through my test suite, and I don't need to re-run it.

-- Steve

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [GIT PULL] tracing: Fixes to bootconfig memory management
  2021-09-14 19:05     ` Linus Torvalds
  2021-09-14 19:14       ` Steven Rostedt
@ 2021-09-14 19:23       ` Linus Torvalds
  2021-09-14 19:38         ` Linus Torvalds
  1 sibling, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2021-09-14 19:23 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Mike Rapoport, Andrew Morton, LKML, Ingo Molnar,
	Masami Hiramatsu, Linux-MM, Vlastimil Babka

On Tue, Sep 14, 2021 at 12:05 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Ok, can you send me that NULL ptr check as one separate thing?

Actually, never mind. I'll just do the automated replacement, it will
fix that NULL pointer thing too.

            Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [GIT PULL] tracing: Fixes to bootconfig memory management
  2021-09-14 19:23       ` Linus Torvalds
@ 2021-09-14 19:38         ` Linus Torvalds
  2021-09-14 20:48           ` Linus Torvalds
  0 siblings, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2021-09-14 19:38 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Mike Rapoport, Andrew Morton, LKML, Ingo Molnar,
	Masami Hiramatsu, Linux-MM, Vlastimil Babka

On Tue, Sep 14, 2021 at 12:23 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Actually, never mind. I'll just do the automated replacement, it will
> fix that NULL pointer thing too.

Sadly, the automated replacement isn't quite as simple as the
sed-script I planned on using.

Because this:

  sed -i \
    's:memblock_free(__pa(\([^)]*\)),:memblock_free_sane(\1,:' \
    $(git grep -wl memblock_free)

does actually mostly work, but because "__pa()" takes either a pointer
or a integer value, the end result is that sometimes we call
memblock_free_sane() with things that aren't pointers.

Very annoying. That "__pa()" macro has the two underscores for a
reason. It shouldn't be "use this thing willy nilly".

So I'll do a minimal conversion that adds "memblock_free_ptr()" and
hope that people start using that. And then we can later try to move
"memblock_free()" to a name that isn't so misleading.

                  Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [GIT PULL] tracing: Fixes to bootconfig memory management
  2021-09-14 19:38         ` Linus Torvalds
@ 2021-09-14 20:48           ` Linus Torvalds
  2021-09-14 21:05             ` Steven Rostedt
  0 siblings, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2021-09-14 20:48 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Mike Rapoport, Andrew Morton, LKML, Ingo Molnar,
	Masami Hiramatsu, Linux-MM, Vlastimil Babka

On Tue, Sep 14, 2021 at 12:38 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> So I'll do a minimal conversion that adds "memblock_free_ptr()" and
> hope that people start using that. And then we can later try to move
> "memblock_free()" to a name that isn't so misleading.

Commit 77e02cf57b6c ("memblock: introduce saner 'memblock_free_ptr()'
interface") should hopefully fix that panic that Vlastimil saw, and
the kernel test robot report as well.

And it should make it easy to cleanly fix that 'copy' leak too.

            Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [GIT PULL] tracing: Fixes to bootconfig memory management
  2021-09-14 20:48           ` Linus Torvalds
@ 2021-09-14 21:05             ` Steven Rostedt
  2021-09-14 22:47               ` Vlastimil Babka
  2021-09-14 23:44               ` Masami Hiramatsu
  0 siblings, 2 replies; 14+ messages in thread
From: Steven Rostedt @ 2021-09-14 21:05 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Mike Rapoport, Andrew Morton, LKML, Ingo Molnar,
	Masami Hiramatsu, Linux-MM, Vlastimil Babka

On Tue, 14 Sep 2021 13:48:15 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, Sep 14, 2021 at 12:38 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > So I'll do a minimal conversion that adds "memblock_free_ptr()" and
> > hope that people start using that. And then we can later try to move
> > "memblock_free()" to a name that isn't so misleading.  
> 
> Commit 77e02cf57b6c ("memblock: introduce saner 'memblock_free_ptr()'
> interface") should hopefully fix that panic that Vlastimil saw, and
> the kernel test robot report as well.
> 
> And it should make it easy to cleanly fix that 'copy' leak too.
> 

Vlastimil,

Can you confirm that Linus's changes addresses your issue?

Masami,

Care to rebase on top of Linus's change?

Thanks!

-- Steve

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [GIT PULL] tracing: Fixes to bootconfig memory management
  2021-09-14 21:05             ` Steven Rostedt
@ 2021-09-14 22:47               ` Vlastimil Babka
  2021-09-14 23:29                 ` Linus Torvalds
  2021-09-14 23:44               ` Masami Hiramatsu
  1 sibling, 1 reply; 14+ messages in thread
From: Vlastimil Babka @ 2021-09-14 22:47 UTC (permalink / raw)
  To: Steven Rostedt, Linus Torvalds
  Cc: Mike Rapoport, Andrew Morton, LKML, Ingo Molnar,
	Masami Hiramatsu, Linux-MM

On 9/14/21 23:05, Steven Rostedt wrote:
> On Tue, 14 Sep 2021 13:48:15 -0700
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
>> On Tue, Sep 14, 2021 at 12:38 PM Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>> >
>> > So I'll do a minimal conversion that adds "memblock_free_ptr()" and
>> > hope that people start using that. And then we can later try to move
>> > "memblock_free()" to a name that isn't so misleading.  
>> 
>> Commit 77e02cf57b6c ("memblock: introduce saner 'memblock_free_ptr()'
>> interface") should hopefully fix that panic that Vlastimil saw, and
>> the kernel test robot report as well.
>> 
>> And it should make it easy to cleanly fix that 'copy' leak too.
>> 
> 
> Vlastimil,
> 
> Can you confirm that Linus's changes addresses your issue?

Well, looks like I can't. Commit 77e02cf57b6cf does boot fine for me,
multiple times. But so now does the parent commit 6a4746ba06191. Looks like
the magic is gone. I'm now surprised how deterministic it was during the
bisect (most bad cases manifested on first boot, only few at second).

> Masami,
> 
> Care to rebase on top of Linus's change?
> 
> Thanks!
> 
> -- Steve
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [GIT PULL] tracing: Fixes to bootconfig memory management
  2021-09-14 22:47               ` Vlastimil Babka
@ 2021-09-14 23:29                 ` Linus Torvalds
  2021-09-15  9:28                   ` Vlastimil Babka
  0 siblings, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2021-09-14 23:29 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Steven Rostedt, Mike Rapoport, Andrew Morton, LKML, Ingo Molnar,
	Masami Hiramatsu, Linux-MM

On Tue, Sep 14, 2021 at 3:48 PM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> Well, looks like I can't. Commit 77e02cf57b6cf does boot fine for me,
> multiple times. But so now does the parent commit 6a4746ba06191. Looks like
> the magic is gone. I'm now surprised how deterministic it was during the
> bisect (most bad cases manifested on first boot, only few at second).

Well, your report was clearly memory corruption by the invalid
memblock_free() just ending up causing random problems later on.

So it could easily be 100% deterministic with a certain memory layout
at a particular commit. And then enough other changes later, and it's
all gone, because the memory corruption now hits something else that
didn't even care.

The code for your oops was

   0: 48 8b 17              mov    (%rdi),%rdx
   3: 48 39 d7              cmp    %rdx,%rdi
   6: 74 43                je     0x4b
   8: 48 8b 47 08          mov    0x8(%rdi),%rax
   c: 48 85 c0              test   %rax,%rax
   f: 74 23                je     0x34
  11: 49 89 c0              mov    %rax,%r8
  14:* 48 8b 40 10          mov    0x10(%rax),%rax <-- trapping instruction

and that's the start of rb_next(), so what's going on is that
"rb->rb_right" (the second word of 'struct rb_node') ends up having
that value in %rax:

  RAX: 343479726f6d656d

which is ASCII "44yromem" rather than a valid pointer if I looked that up right.

And just _slightly_ different allocation patterns, and your 'struct
rb_node' gets allocated somewhere else, and you don't see the oops at
all, or you get it later in some different place.

Most memory corruption doesn't cause oopses, because most memory isn't
used as pointers etc.

What you _could_ try if you care enough is

 - go back to the thing you bisectted to where you can still hopefully
recreate the problem

 - apply that patch at that point with no other changes

and then the test would hopefully be closer to the state you could
re-create the problem.

And hopefully it would still not reproduce, just because the bug is
fixed, of course ;)

The very unlikely alternative is that your bisect was just pure random
bad luck and hit the wrong commit entirely, and the oops was due to
some other problem.

But it does seem unlikely to be something else. Usually when bisects
go off into the weeds due to not being reproducible, they go very
obviously off into the weeds rather than point to something that ends
up having a very similar bug.

           Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [GIT PULL] tracing: Fixes to bootconfig memory management
  2021-09-14 21:05             ` Steven Rostedt
  2021-09-14 22:47               ` Vlastimil Babka
@ 2021-09-14 23:44               ` Masami Hiramatsu
  1 sibling, 0 replies; 14+ messages in thread
From: Masami Hiramatsu @ 2021-09-14 23:44 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Linus Torvalds, Mike Rapoport, Andrew Morton, LKML, Ingo Molnar,
	Masami Hiramatsu, Linux-MM, Vlastimil Babka

On Tue, 14 Sep 2021 17:05:53 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Tue, 14 Sep 2021 13:48:15 -0700
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> > On Tue, Sep 14, 2021 at 12:38 PM Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> > >
> > > So I'll do a minimal conversion that adds "memblock_free_ptr()" and
> > > hope that people start using that. And then we can later try to move
> > > "memblock_free()" to a name that isn't so misleading.  
> > 
> > Commit 77e02cf57b6c ("memblock: introduce saner 'memblock_free_ptr()'
> > interface") should hopefully fix that panic that Vlastimil saw, and
> > the kernel test robot report as well.
> > 
> > And it should make it easy to cleanly fix that 'copy' leak too.
> > 
> 
> Vlastimil,
> 
> Can you confirm that Linus's changes addresses your issue?
> 
> Masami,
> 
> Care to rebase on top of Linus's change?

Sure! and I found another memblock leak, so I'll add it too.

Thank you!


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [GIT PULL] tracing: Fixes to bootconfig memory management
  2021-09-14 23:29                 ` Linus Torvalds
@ 2021-09-15  9:28                   ` Vlastimil Babka
  0 siblings, 0 replies; 14+ messages in thread
From: Vlastimil Babka @ 2021-09-15  9:28 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Steven Rostedt, Mike Rapoport, Andrew Morton, LKML, Ingo Molnar,
	Masami Hiramatsu, Linux-MM

On 9/15/21 01:29, Linus Torvalds wrote:
> On Tue, Sep 14, 2021 at 3:48 PM Vlastimil Babka <vbabka@suse.cz> wrote:
>>
>> Well, looks like I can't. Commit 77e02cf57b6cf does boot fine for me,
>> multiple times. But so now does the parent commit 6a4746ba06191. Looks like
>> the magic is gone. I'm now surprised how deterministic it was during the
>> bisect (most bad cases manifested on first boot, only few at second).
> 
> Well, your report was clearly memory corruption by the invalid
> memblock_free() just ending up causing random problems later on.

> So it could easily be 100% deterministic with a certain memory layout
> at a particular commit. And then enough other changes later, and it's
> all gone, because the memory corruption now hits something else that
> didn't even care.
> 
> The code for your oops was
> 
>    0: 48 8b 17              mov    (%rdi),%rdx
>    3: 48 39 d7              cmp    %rdx,%rdi
>    6: 74 43                je     0x4b
>    8: 48 8b 47 08          mov    0x8(%rdi),%rax
>    c: 48 85 c0              test   %rax,%rax
>    f: 74 23                je     0x34
>   11: 49 89 c0              mov    %rax,%r8
>   14:* 48 8b 40 10          mov    0x10(%rax),%rax <-- trapping instruction
> 
> and that's the start of rb_next(), so what's going on is that
> "rb->rb_right" (the second word of 'struct rb_node') ends up having
> that value in %rax:
> 
>   RAX: 343479726f6d656d
> 
> which is ASCII "44yromem" rather than a valid pointer if I looked that up right.

Yep, I was pretty sure it was related to the
"/sys/bus/memory/devices/memory44" sysfs object and bisection would lead to
kobject/sysfs or some memory hotplug related changes. So the result was a
surprise.

> And just _slightly_ different allocation patterns, and your 'struct
> rb_node' gets allocated somewhere else, and you don't see the oops at
> all, or you get it later in some different place.
> 
> Most memory corruption doesn't cause oopses, because most memory isn't
> used as pointers etc.
> 
> What you _could_ try if you care enough is
> 
>  - go back to the thing you bisectted to where you can still hopefully
> recreate the problem
> 
>  - apply that patch at that point with no other changes
> 
> and then the test would hopefully be closer to the state you could
> re-create the problem.
> 
> And hopefully it would still not reproduce, just because the bug is
> fixed, of course ;)

Yeah, that worked! Commit 40caa127f3c7 was still broken, and cherry-pick of
77e02cf57b6cf on top fixed it. Thanks!

> The very unlikely alternative is that your bisect was just pure random
> bad luck and hit the wrong commit entirely, and the oops was due to
> some other problem.
> 
> But it does seem unlikely to be something else. Usually when bisects
> go off into the weeds due to not being reproducible, they go very
> obviously off into the weeds rather than point to something that ends
> up having a very similar bug.
> 
>            Linus
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [GIT PULL] tracing: Fixes to bootconfig memory management
  2021-09-14 18:01 ` Linus Torvalds
  2021-09-14 18:59   ` Steven Rostedt
@ 2021-09-17 20:10   ` Mike Rapoport
  1 sibling, 0 replies; 14+ messages in thread
From: Mike Rapoport @ 2021-09-17 20:10 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Steven Rostedt, Andrew Morton, LKML, Ingo Molnar,
	Masami Hiramatsu, Linux-MM

On Tue, Sep 14, 2021 at 11:01:31AM -0700, Linus Torvalds wrote:
> On Tue, Sep 14, 2021 at 7:56 AM Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > A couple of memory management fixes to the bootconfig code
> 
> These may be fixes, but they are too ugly to merit the tiny
> theoretical leak fix.
> 
> All of these are just plain wrong:
> 
> > +static void *init_xbc_data_copy __initdata;
> > +static phys_addr_t init_xbc_data_size __initdata;
> > +               init_xbc_data_copy = copy;
> > +               init_xbc_data_size = size + 1;
> > +       memblock_free(__pa(init_xbc_data_copy), init_xbc_data_size);
> 
> because the xbc code already saves these as xbc_data/xbc_data_size and
> that final free should just be done in xbc_destroy_all().
> 
> So this fix is pointlessly ugly to begin with.
> 
> But what I _really_ ended up reacting to was that
> 
> > +               memblock_free(__pa(copy), size + 1);
> 
> where that "copy" was allocated with
> 
>         copy = memblock_alloc(size + 1, SMP_CACHE_BYTES);
> 
> so it should damn well be free'd without any crazy "__pa()" games.
> 
> This is a memblock interface bug, plain and simple.
> 
> Mike - this craziness needs to just be fixed. If memblock_alloc()
> returns a virtual address, then memblock_free() should take one.

Yep, it was on my todo list. But since it was like this for years with both
memblock and bootmem I didn't prioritise this.
 
> Let's just get these interfaces fixed. It might be as simple as having
> a "memblock_free_phys()" interface, and doing a search-and-replace
> with coccinelle of
> 
>      memblock_free(__pa(xyz), .. -> memblock_free(xyz, ...
>      memblock_free(other, .. -> memblock_free_phys(other, ..
> 
> and adding the (trivial) internal helper functions to memblock,
> instead of making the atcual _users_ of memblock do insanely stupid
> and confusing things.

I've done the automated search and replace, with several fixups here and
there, so there is now memblock_phys_free(phys_addr_t addr) to match
memblock_phys_alloc() and memblock_free(void *ptr) to match
memblock_alloc().

The initial version is in memblock tree

https://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock.git/log/?h=memblock_free-cleanup/v0

I'm waiting for robots to run the builds before posting.

While doing the replacement I've found one mismatch in Xen code which used
memblock_free() to free a virtual pointer, but except that users seem to do
the correct thing, even if it is ugly __pa() conversions.

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2021-09-17 20:10 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-14 14:56 [GIT PULL] tracing: Fixes to bootconfig memory management Steven Rostedt
2021-09-14 18:01 ` Linus Torvalds
2021-09-14 18:59   ` Steven Rostedt
2021-09-14 19:05     ` Linus Torvalds
2021-09-14 19:14       ` Steven Rostedt
2021-09-14 19:23       ` Linus Torvalds
2021-09-14 19:38         ` Linus Torvalds
2021-09-14 20:48           ` Linus Torvalds
2021-09-14 21:05             ` Steven Rostedt
2021-09-14 22:47               ` Vlastimil Babka
2021-09-14 23:29                 ` Linus Torvalds
2021-09-15  9:28                   ` Vlastimil Babka
2021-09-14 23:44               ` Masami Hiramatsu
2021-09-17 20:10   ` Mike Rapoport

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).