All of lore.kernel.org
 help / color / mirror / Atom feed
From: "H. Peter Anvin" <hpa@zytor.com>
To: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: pbonzini@redhat.com, ebiggers@kernel.org, x86@kernel.org,
	linux-kernel@vger.kernel.org, qemu-devel@nongnu.org,
	ardb@kernel.org, kraxel@redhat.com, bp@alien8.de,
	philmd@linaro.org
Subject: Re: [PATCH qemu] x86: don't let decompressed kernel image clobber setup_data
Date: Wed, 28 Dec 2022 18:13:34 -0800	[thread overview]
Message-ID: <0baf674b-c7e7-a010-375d-ea1132495c44@zytor.com> (raw)
In-Reply-To: <9188EEE9-2759-4389-B39E-0FEBBA3FA57D@zytor.com>

[-- Attachment #1: Type: text/plain, Size: 2251 bytes --]

On 12/28/22 15:58, H. Peter Anvin wrote:
> On December 28, 2022 8:57:54 AM PST, "Jason A. Donenfeld" <Jason@zx2c4.com> wrote:
>> HELLO H. PETER ANVIN,
>> E
>> L
>> L
>> O
>>
>> On Wed, Dec 28, 2022 at 05:30:30PM +0100, Jason A. Donenfeld wrote:
>>>> Fix looks good, glad you figured out the problem.
>>>
>>> I mean, kind of. The solution here sucks, especially given that in the
>>> worst case, setup_data just gets dropped. I'm half inclined to consider
>>> this a kernel bug instead, and add some code to relocate setup_data
>>> prior to decompression, and then fix up all the links. It seems like
>>> this would be a lot more robust.
>>>
>>> I just wish the people who wrote this stuff would chime in. I've had
>>> x86@kernel.org CC'd but so far, no input from them.
>>
>> Apparently you are the x86 boot guru. What do you want to happen here?
>> Your input would be very instrumental.
>>
>> Jason
> 
> Hi!
> 
> Glad you asked.
> 
> So the kernel load addresses are parameterized in the kernel image
> setup header. One of the things that are so parameterized are the
> size and possible realignment of the kernel image in memory.
> 
> I'm very confused where you are getting the 64 MB number from. There
> should not be any such limitation.
> 
> In general, setup_data should be able to go anywhere the initrd can
> go, and so is subject to the same address cap (896 MB for old
> kernels, 4 GB on newer ones; this address too is enumerated in the
> header.)
> 
> If you want to put setup_data above 4 GB, it *should* be ok if and
> only if the kernel supports loading the initrd high, too (again,
> enumerated in the header.
> 
> TL;DR: put setup_data where you put the initrd (before or after
> doesn't matter.)
> 
> To be maximally conservative, link the setup_data list in order from
> lowest to highest address; currently there is no such item of
> relevance, but in the future there may be setup_data items needed by
> the BIOS part of the bootstrap in which case they would have to be <
> 1 MB and precede any items > 1 MB for obvious reasons. That being
> said, with BIOS dying it is not all that likely that such entries
> will ever be needed.
> 

So let me try for an algorithm. Attached as a text file to avoid line 
break damage.

	-hpa

[-- Attachment #2: kernel-data-addresses.txt --]
[-- Type: text/plain, Size: 5844 bytes --]

Here is an attempted description with pseudo-C code:

First of all, take a 4K page of memory and *initialize it to zero*.
{
    #include <asm/bootparam.h>	/* From the uapi kernel sources */

    /* Allocated somewhere in your code... */
    extern unsigned char *kernel_image;		/* Kernel file */
    extern struct boot_params *boot_params;	/* 4K buffer */
    extern uint32_t kernel_image_size;		/* Size of kernel file */

    /* Callbacks into your code */
    extern bool is_bios_boot(void);
    extern uint32_t end_of_low_memory(void); /* For BIOS boot */
    /*
     * This MUST return an alignment address between start_address
     * and max_address...
     */
    extern uint64_t maybe_relocate_kernel(uint64_t start_address,
	  uint64_t max_address, uint32_t alignment);

    /*
     * Convenience pointer into the kernel image; modifications
     * done here should be reflected in the loaded kernel image
     */
    struct setup_header * const kernel_setup_header =
	(struct setup_header *)(kernel_image + 0x1f1);

    /* Initialize boot_params to zero!!! */
    memset(boot_params, 0, sizeof *boot_params);
}

Copy the setup header starting at file offset 0x1f1 to offset 0x1f1
into that page:
{
    int setup_length =
	kernel_setup_header->header == 0x53726448
	? (kernel_setup_header->jump >> 8) + 17 : 15;

    memcpy(&boot_params->hdr, kernel_setup_header, setup_length);
}

Now you can compute values including ones are omitted by older kernels:
{
    /*
     * Split between the part of the kernel to be loaded into
     * low memory (for 16-bit boot, otherwise it can be safely
     * omitted) and the part to be loaded into high memory.
     */
    if (!boot_params->hdr.setup_sects)
	boot_param->hdr.setup_sects = 4;

    int high_kernel_start = (boot_param->hdr.setup_sects+1) << 9;

    /*
     * Highest permitted address for the high part of the kernel image,
     * initrd, command line (*except for 16-bit boot*), and setup_data
     *
     * max_initrd_addr here is exclusive
     */
    uint64_t max_initrd_addr = (uint64_t)boot_params->hdr.initrd_addr_max + 1;
    if (boot_params->hdr.version < 0x0200)
	max_initrd_addr = 0;	/* No initrd supported */
    else if (boot_params->hdr.version < 0x0203)
	max_initrd_addr = 0x38000000;
    else if (boot_params->hdr.version >= 0x020c &&
	       (boot_params->hdr.xloadflags & XLF_CAN_BE_LOADED_ABOVE_4G))
	max_initrd_addr = (uint64_t)1 << 52; /* Architecture-imposed limit */

    /*
     * Maximum command line size *including terminating null*
     */
    unsigned int cmdline_size;
    if (boot_params->hdr.version < 0x0200)
	cmdline_size = 0;	/* No command line supported */
    else if (boot_params->hdr.version < 0x0206)
	boot_params->hdr.cmdline_size = 256;
    else
	boot_params->hdr.cmdline_size + 1;

    /* Command line size including terminating null */

    /*
     * Load addresses for the low and high kernels, respectively
     */
    uint32_t low_kernel_address;
    uint64_t cmdline_addr;	/* Address to load the command line */

    if (is_bios_boot()) {
	if (!(boot_params->hdr.loadflags & LOADED_HIGH)) {
	    low_kernel_address = 0x90000;
	} else {
	    /*
	     * Recommended to be the lowest available address between
	     * 0x10000 and 0x90000
	     */
	    low_kernel_address = preferred_low_kernel_address();
	}

	uint32_t lowkernel_max;

	lowkernel_max = low_kernel_address + 0x10000;
	if (boot_params.hdr.version >= 0x0202)
	    lowkernel_max += (cmdline_size + 15) & ~15;

	/*
	 * end_of_low_memory() is usually given by *(uint8_t *)0x413 << 10
	 */
	if (lowkernel_max > end_of_low_memory())
	    lowkernel_max = end_of_low_memory();

	cmdline_addr = (lowkernel_max - cmdline_size) & ~15;
	if (boot_params->hdr.version >= 0x0202)
	    kernel_setup_header->cmd_line_ptr = cmdline_addr;
	else if (boot_params->hdr.version >= 0x0200)
	    kernel_setup_header->setup_move_size =
		lowkernel_max - low_kernel_address;

	if (boot_params.hdr.version >= 0x0201) {
	    kernel_setup_header->heap_end_ptr
		= cmdline_addr - low_kernel_address - 0x0200;
	    kernel_setup_header->loadflags |= CAN_USE_HEAP;
	}
    } else {
	low_kernel_address = 0;	/* Not used for non-BIOS boot */
	cmdline_addr = 0;	/* Not assigned yet */
    }

    /*
     * Default load address for the high kernel, and if it can be relocated
     */
    uint64_t high_kernel_address;
    uint32_t high_kernel_size;	/* The amount of memory the high kernel needs */
    bool relocatable_kernel = false;
    uint32_t high_kernel_alignment = 0x400000; /* Kernel runtime alignment */

    if (!(boot_params->hdr.loadflags & LOADED_HIGH)) {
	high_kernel_address = 0x10000;
    } else {
	if (boot_params->hdr.version >= 0x020a)
	    high_kernel_address = boot_params->hdr.pref_address;
	else
	    high_kernel_address = 0x100000;

	if (boot_params->hdr.version >= 0x0205 &&
	    boot_params->hdr.relocatable_kernel) {
	    relocatable_kernel = true;
	    high_kernel_alignment = boot_params->hdr.kernel_alignment;
	}
    }

    /*
     * Linear memory area needed by the kernel
     */
    uint32_t kernel_mem_size;
    if (boot_params->hdr.version >= 0x020a)
	kernel_mem_size = boot_params->hdr.init_size;
    else
	kernel_mem_size = kernel_image_size << 2; /* Pure guesswork... */

    /* Relocate the kernel load address if desired */
    if (relocatable_kernel) {
	high_kernel_address =
	    maybe_relocate_kernel(high_kernel_address,
				  max_initrd_addr - kernel_mem_size,
				  high_kernel_aligment);
    }

    /* Adjust for possible internal kernel realigment */
    kernel_mem_size += (-high_kernel_address) & (high_kernel_alignment - 1);

    /*
     * Determine the minimum safe address for loading initrd, setup_data,
     * and, if cmdline_addr == 0 (i.e. !is_bios_boot()), the command line.
     */
    uint64_t min_initrd_addr = high_kernel_address + kernel_mem_size;
}

  reply	other threads:[~2022-12-29  2:14 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-28 14:38 [PATCH qemu] x86: don't let decompressed kernel image clobber setup_data Jason A. Donenfeld
2022-12-28 16:02 ` Philippe Mathieu-Daudé
2022-12-28 16:30   ` Jason A. Donenfeld
2022-12-28 16:57     ` Jason A. Donenfeld
2022-12-28 23:58       ` H. Peter Anvin
2022-12-29  2:13         ` H. Peter Anvin [this message]
2022-12-29  2:31         ` Jason A. Donenfeld
2022-12-29  7:28           ` Philippe Mathieu-Daudé
2022-12-29  7:30           ` H. Peter Anvin
2022-12-29  7:31           ` H. Peter Anvin
2022-12-29 12:47             ` Borislav Petkov
2022-12-30 15:54               ` Jason A. Donenfeld
2022-12-30 17:01                 ` Borislav Petkov
2022-12-30 17:07                   ` Jason A. Donenfeld
2022-12-30 19:54                     ` Borislav Petkov
2022-12-30 21:58                       ` H. Peter Anvin
2022-12-30 22:10                         ` Jason A. Donenfeld
2022-12-31  1:06                           ` H. Peter Anvin
2022-12-31  1:14                             ` H. Peter Anvin
2022-12-31 12:55                             ` Jason A. Donenfeld
2022-12-31 13:40                             ` Borislav Petkov
2022-12-31 13:44                               ` Jason A. Donenfeld
2022-12-31 13:48                                 ` Borislav Petkov
2022-12-31 13:51                                   ` Jason A. Donenfeld
2022-12-31 14:24                                     ` Borislav Petkov
2022-12-31 18:22                                       ` Jason A. Donenfeld
2022-12-31 19:00                                         ` Borislav Petkov
2023-01-01  3:21                                           ` H. Peter Anvin
2023-01-01  3:31                                             ` H. Peter Anvin
2023-01-02  6:01                                               ` Borislav Petkov
2023-01-02  6:17                                                 ` Borislav Petkov
2023-01-02  9:32                                                   ` Ard Biesheuvel
2023-01-02 13:36                                                     ` Borislav Petkov
2023-01-02 15:03                                                       ` Ard Biesheuvel
2023-01-02  5:50                                             ` Borislav Petkov
2023-01-01  4:33                                         ` H. Peter Anvin
2023-01-01  4:55                                           ` Mika Penttilä
2023-01-01  5:13                                             ` H. Peter Anvin
2022-12-30 15:59             ` Jason A. Donenfeld
2022-12-30 16:21               ` Jason A. Donenfeld
2022-12-30 19:13               ` H. Peter Anvin
2022-12-31  9:48               ` Borislav Petkov
2022-12-31 12:54                 ` Jason A. Donenfeld
2022-12-31 13:35                   ` Borislav Petkov
2022-12-31 13:42                     ` Jason A. Donenfeld
2022-12-30 18:30 ` Jason A. Donenfeld

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0baf674b-c7e7-a010-375d-ea1132495c44@zytor.com \
    --to=hpa@zytor.com \
    --cc=Jason@zx2c4.com \
    --cc=ardb@kernel.org \
    --cc=bp@alien8.de \
    --cc=ebiggers@kernel.org \
    --cc=kraxel@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=philmd@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.