All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Jason A. Donenfeld" <Jason@zx2c4.com>
To: Eric Biggers <ebiggers@kernel.org>, x86@kernel.org, linux-mm@kvack.org
Cc: pbonzini@redhat.com, qemu-devel@nongnu.org,
	"Laurent Vivier" <laurent@vivier.eu>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	"Peter Maydell" <peter.maydell@linaro.org>,
	"Philippe Mathieu-Daudé" <f4bug@amsat.org>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Ard Biesheuvel" <ardb@kernel.org>,
	"Gerd Hoffmann" <kraxel@redhat.com>,
	kasan-dev@googlegroups.com, "Dmitry Vyukov" <dvyukov@google.com>
Subject: Re: [PATCH v5 4/4] x86: re-enable rng seeding via SetupData
Date: Wed, 28 Dec 2022 04:07:13 +0100	[thread overview]
Message-ID: <Y6uy4b71GX0epQsu@zx2c4.com> (raw)
In-Reply-To: <Y6r09pm68oI7GMe1@zx2c4.com>

On Tue, Dec 27, 2022 at 02:36:54PM +0100, Jason A. Donenfeld wrote:
> On Mon, Dec 26, 2022 at 05:57:30PM +0100, Jason A. Donenfeld wrote:
> > On Mon, Dec 26, 2022 at 03:43:04PM +0100, Jason A. Donenfeld wrote:
> > > On Mon, Dec 26, 2022 at 03:24:07PM +0100, Jason A. Donenfeld wrote:
> > > > Hi,
> > > > 
> > > > I'm currently stumped at the moment, so adding linux-mm@ and x86@. Still
> > > > working on it though. Details of where I'm at are below the quote below.
> > > > 
> > > > On Sat, Dec 24, 2022 at 05:21:46AM +0100, Jason A. Donenfeld wrote:
> > > > > On Sat, Dec 24, 2022 at 04:09:08AM +0100, Jason A. Donenfeld wrote:
> > > > > > Hi Eric,
> > > > > > 
> > > > > > Replying to you from my telephone, and I'm traveling the next two days,
> > > > > > but I thought I should mention some preliminary results right away from
> > > > > > doing some termux compiles:
> > > > > > 
> > > > > > On Fri, Dec 23, 2022 at 04:14:00PM -0800, Eric Biggers wrote:
> > > > > > > Hi Jason,
> > > > > > > 
> > > > > > > On Wed, Sep 21, 2022 at 11:31:34AM +0200, Jason A. Donenfeld wrote:
> > > > > > > > This reverts 3824e25db1 ("x86: disable rng seeding via setup_data"), but
> > > > > > > > for 7.2 rather than 7.1, now that modifying setup_data is safe to do.
> > > > > > > > 
> > > > > > > > Cc: Laurent Vivier <laurent@vivier.eu>
> > > > > > > > Cc: Michael S. Tsirkin <mst@redhat.com>
> > > > > > > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > > > > > > Cc: Peter Maydell <peter.maydell@linaro.org>
> > > > > > > > Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
> > > > > > > > Cc: Richard Henderson <richard.henderson@linaro.org>
> > > > > > > > Cc: Ard Biesheuvel <ardb@kernel.org>
> > > > > > > > Acked-by: Gerd Hoffmann <kraxel@redhat.com>
> > > > > > > > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> > > > > > > > ---
> > > > > > > >  hw/i386/microvm.c | 2 +-
> > > > > > > >  hw/i386/pc_piix.c | 3 ++-
> > > > > > > >  hw/i386/pc_q35.c  | 3 ++-
> > > > > > > >  3 files changed, 5 insertions(+), 3 deletions(-)
> > > > > > > > 
> > > > > > > 
> > > > > > > After upgrading to QEMU 7.2, Linux 6.1 no longer boots with some configs.  There
> > > > > > > is no output at all.  I bisected it to this commit, and I verified that the
> > > > > > > following change to QEMU's master branch makes the problem go away:
> > > > > > > 
> > > > > > > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> > > > > > > index b48047f50c..42f5b07d2f 100644
> > > > > > > --- a/hw/i386/pc_piix.c
> > > > > > > +++ b/hw/i386/pc_piix.c
> > > > > > > @@ -441,6 +441,7 @@ static void pc_i440fx_8_0_machine_options(MachineClass *m)
> > > > > > >      pc_i440fx_machine_options(m);
> > > > > > >      m->alias = "pc";
> > > > > > >      m->is_default = true;
> > > > > > > +    PC_MACHINE_CLASS(m)->legacy_no_rng_seed = true;
> > > > > > >  }
> > > > > > > 
> > > > > > > I've attached the kernel config I am seeing the problem on.
> > > > > > > 
> > > > > > > For some reason, the problem also goes away if I disable CONFIG_KASAN.
> > > > > > > 
> > > > > > > Any idea what is causing this?
> > > > > > 
> > > > > > - Commenting out the call to parse_setup_data() doesn't fix the issue.
> > > > > >   So there's no KASAN issue with the actual parser.
> > > > > > 
> > > > > > - Using KASAN_OUTLINE rather than INLINE does fix the issue!
> > > > > > 
> > > > > > That makes me suspect that it's file size related, and QEMU or the BIOS
> > > > > > is placing setup data at an overlapping offset by accident, or something
> > > > > > similar.
> > > > > 
> > > > > I removed the file systems from your config to bring the kernel size
> > > > > back down, and voila, it works, even with KASAN_INLINE. So perhaps I'm
> > > > > on the right track here...
> > > > 
> > > > QEMU sticks setup_data after the kernel image, the same as kexec-tools
> > > > and everything else. Apparently, when the kernel image is large, the
> > > > call to early_memremap(boot_params.hdr.setup_data, ...) returns a value
> > > > that points some place bogus, and the system crashes or does something
> > > > weird. I haven't yet determined what this limit is, but in my current
> > > > test kernel, a value of 0x0000000001327650 is enough to make it point to
> > > > rubbish.
> > > > 
> > > > Is this expected? What's going on here?
> > > 
> > > Attaching gdb to QEMU and switching it to physical memory mode
> > > (`maintenance packet Qqemu.PhyMemMode:1 `) indicates that it
> > > early_memremap is actually working fine and something *else* is at this
> > > address? That's kinda weird... Is KASAN populating physical addresses
> > > immediately after the kernel image extremely early in boot? I'm seeing
> > > the crash happen from early_reserve_memory()->
> > > memblock_x86_reserve_range_setup_data(), which should be before
> > > kasan_init() even runs. Is QEMU calculating kernel_size wrong, when it
> > > goes to determine where to put the setup_data data? But that's the same
> > > calculation as used everywhere else, so hmm...
> > > 
> > > Jason
> > 
> > If bzImage is 15770544 bytes, it does not boot. If bzImage is 15641776
> > bytes, it does boot. So something is happening somewhat close to the
> > 16MB mark?
> > 
> 
> Okay, the issue is that it's being decompressed to an area that overlaps
> the source. So for example in my test kernel:
> 
> input_addr: 0x3f112bf
> output_addr: 0x1000000
> output_len: 0x3a5d7d8
> 
> Since 0x3a5d7d8 + 0x1000000 > 0x3f112bf, eventually this corrupts the
> setup_data at the end there.
> 
> Now digging into what can be done about it.

Not quite. input_addr doesn't matter, since setup_data still points to
the old mapping.

So the actual issue is:

compressed_size: 	0xf028d4
decompressed_size:      0x3a5d7d8
setup_data:      	0x100000 + compressed_size
output_addr:    	0x1000000 (this is LOAD_PHYSICAL_ADDR)

Since `output_addr + decompressed_size > setup_data && output_addr <
setup_data`, then it means the decompressor will write over setup_data.

Note that this is also a problem for SETUP_DTB, so it's a longstanding
bug.

I'm experimenting now with appending lots of zeros between the kernel
image and setup_data, so that the decompressor doesn't overwrite
setup_data, but so far it's not working.

Another option would be to have the build system warn when this is going
to happen, and suggest that the user increase the value of
CONFIG_PHYSICAL_START. This might be the best option...

Jason


  reply	other threads:[~2022-12-28  3:07 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-21  9:31 [PATCH v5 1/4] x86: return modified setup_data only if read as memory, not as file Jason A. Donenfeld
2022-09-21  9:31 ` [PATCH v5 2/4] x86: use typedef for SetupData struct Jason A. Donenfeld
2022-09-21  9:31 ` [PATCH v5 3/4] x86: reinitialize RNG seed on system reboot Jason A. Donenfeld
2022-09-21  9:31 ` [PATCH v5 4/4] x86: re-enable rng seeding via SetupData Jason A. Donenfeld
2022-12-24  0:14   ` Eric Biggers
2022-12-24  3:09     ` Jason A. Donenfeld
2022-12-24  4:21       ` Jason A. Donenfeld
2022-12-26 14:24         ` Jason A. Donenfeld
2022-12-26 14:43           ` Jason A. Donenfeld
2022-12-26 16:57             ` Jason A. Donenfeld
2022-12-27 13:36               ` Jason A. Donenfeld
2022-12-28  3:07                 ` Jason A. Donenfeld [this message]
2022-12-28 14:39                   ` Jason A. Donenfeld
2022-09-22  7:26 ` [PATCH v5 1/4] x86: return modified setup_data only if read as memory, not as file Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y6uy4b71GX0epQsu@zx2c4.com \
    --to=jason@zx2c4.com \
    --cc=ardb@kernel.org \
    --cc=dvyukov@google.com \
    --cc=ebiggers@kernel.org \
    --cc=f4bug@amsat.org \
    --cc=kasan-dev@googlegroups.com \
    --cc=kraxel@redhat.com \
    --cc=laurent@vivier.eu \
    --cc=linux-mm@kvack.org \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.