All of lore.kernel.org
 help / color / mirror / Atom feed
From: Daniel Kiper <daniel.kiper@oracle.com>
To: "H. Peter Anvin" <hpa@zytor.com>
Cc: linux-kernel@vger.kernel.org, x86@kernel.org,
	dpsmith@apertussolutions.com, eric.snowberg@oracle.com,
	kanth.ghatraju@oracle.com, konrad.wilk@oracle.com,
	ross.philipson@oracle.com
Subject: Re: [PATCH RFC 1/2] x86/boot: Introduce the setup_header2
Date: Fri, 14 Jun 2019 13:06:50 +0200	[thread overview]
Message-ID: <20190614110650.ytg7dvyimstu2lis@tomti.i.net-space.pl> (raw)
In-Reply-To: <95fd235b-b4e5-c547-3625-b23ef66c5d4f@zytor.com>

I am working on new version of patches but I have some concerns. Please
look below for more details...

On Thu, Jun 06, 2019 at 03:06:30PM -0700, H. Peter Anvin wrote:
> On 5/24/19 2:55 AM, Daniel Kiper wrote:
> > Due to limited space left in the setup header it was decided to
> > introduce the setup_header2. Its role is to communicate Linux kernel
> > supported features to the boot loader. Starting from now this is the
> > primary way to communicate things to the boot loader.
> >
> > Suggested-by: H. Peter Anvin <hpa@zytor.com>
> > Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
> > Reviewed-by: Ross Philipson <ross.philipson@oracle.com>
> > Reviewed-by: Eric Snowberg <eric.snowberg@oracle.com>
> > ---
> > I know that setup_header2 is not the best name. There were some
> > alternatives proposed like setup_header_extra, setup_header_addendum,
> > setup_header_more, ext_setup_header, extended_setup_header, extended_header
> > and extended_setup. Sadly, I am not happy with any of them. So,
> > leaving setup_header2 as is but still looking for better name.
> > Probably shorter == better...
>
> I would say kernel_info. The relationships between the headers are analogous
> to the various data sections:
>
> 	setup_header		= .data
> 	boot_params/setup_data	= .bss
>
> What is missing from the above list? That's right:
>
> 	kernel_info		= .rodata
>
> We have been (ab)using .data for things that could go into .rodata or .bss for
> a long time, for lack of alternatives and -- especially early on -- intertia.
> Also, the BIOS stub is responsible for creating boot_params, so it isn't
> available to a BIOS-based loader (setup_data is, though.)
>
> setup_header is permanently limited to 144 bytes due to the reach of the
> 2-byte jump field, which doubles as a length field for the structure, combined
> with the size of the "hole" in struct boot_params that a protected-mode loader
> or the BIOS stub has to copy it into. It is currently 119 bytes long, which
> leaves us with 25 very precious bytes. This isn't something that can be fixed
> without revising the boot protocol entirely, breaking backwards compatibility.
>
> boot_params proper is limited to 4096 bytes, but can be arbitrarily extended
> by adding setup_data entries. It cannot be used to communicate properties of
> the kernel image, because it is .bss and has no image-provided content.
>
> kernel_info solves this by providing an extensible place for information about
> the kernel image. It is readonly, because the kernel cannot rely on a
> bootloader copying its contents anywhere, but that is OK; if it becomes
> necessary it can still contain data items that an enabled bootloader would be
> expected to copy into a setup_data chunk.
>
> ^ The above or some variant thereof may be a good thing to put both in your
> patch comments as well as in the boot protocol documentation.

Will do...

> While we are making a change that bumps the version number anyway, there is
> another change I would like to make to the boot protocol which we might as
> well do at the same time. setup_data is a bit awkward to use for extremely
> large data objects, both because the setup_data header has to be adjacent to
> the data object, and because it has a 32-bit length field. However, it is
> important that intermediate stages of the boot process have a way to identify
> which chunks of memory are occupied by kernel data.

Is not possible to "identify which chunks of memory are occupied by
kernel data" today? I think that it is. So, this does not look like
a valid point for change. Am I missing something?

> Thus I think we should introduce a uniform way to specify such indirect data.
> We define a new setup_data type we can maybe call SETUP_INDIRECT; a
> SETUP_INDIRECT data item would be an array of structures of the form:

OK.

> struct setup_indirect {
> 	__u32 type;
> 	__u32 reserved;	/* Reserved, must be set to zero */
> 	__u64 len;
> 	__u64 addr;
> };
>
> ... where type is itself simply a SETUP_* type -- although we probably don't
> want to let it be SETUP_INDIRECT itself since making it a tree structure could
> require a lot of stack space in something that needs to parse it, and stack
> space can be limited in boot contexts.

Yeah...

> This would be particularly useful for having SETUP_INITRAMFS, if it becomes
> desirable to allow the kernel to parse a non-contiguous set of memory regions
> for the initramfs.

OK.

> It might be a good idea to immediately start out struct kernel_info with
> either a high mark or a bitmask of SETUP_* types that the kernel supports. A

High mark seems better for me here.

> bitmask would be more flexible, but would need provisions to be grown in the
> future.

Yep.

Anyway, I agree with the idea but I am not sure it makes sense to
introduce something which does not have users right now. Does it?

> Which leads me to yet another thought.
>
> We probably want to make the contents of kernel_info a bit more structured to
> allow for content that may need to be extended in the future, or is inherently
> variable length (like strings.)
>
> This would lend itself to a structure such as:
>
> 	- Magic number
> 	- Length of total structure

My first proposal have both fields...

> ... followed by a list of data chunks, each prefixed by a length field. The
> first data chunk would be the main (root) structure; other data structures are

I am not sure that I understand the idea of the main (root) structure.
Should it point to itself?

> pointed to from the root structure using offsets from the beginning of the
> structure (the magic number field.)

Sounds nice but I think that it is an overkill in many simple cases,
e.g. look at MLE entry point. In case of MLE entry point we do not need
nothing fancy. So, I think that this filed should be a member of the
kernel_info which points directly to new MLE entry point. I can agree
that we can use the mechanism mentioned by you above in more complicated
cases and it can be described in Documentation/x86/boot.rst. But I would
not enforce it for everything. Just for the sake of simplicity.

> As an implementation detail, strings can of course be "pooled" into a single
> data chunk as long as they are zero-terminated.

OK.

> I have intentionally avoided specifying a type field for each data chunk;
> history shows that it is generally a bad idea to have multiple ways to derive
> the same information, as different implementations will do it differently,
> resulting in bugs when things change.

Make sense for me.

So, it seems to me that we have to have at least two patches:
  - one introducing the kernel_info structure,
  - another introducing the setup_indirect and bumping the
    boot protocol version number.

This thing has at least one cons: first patch introduces a new boot
protocol feature but it is not reflected in the protocol version change.
Not perfect but I do not think that we should bump the version number in
first patch. do you have better idea?

Daniel

  reply	other threads:[~2019-06-14 11:07 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-24  9:55 [PATCH RFC 0/2] x86/boot: Introduce the setup_header2 Daniel Kiper
2019-05-24  9:55 ` [PATCH RFC 1/2] " Daniel Kiper
2019-06-06 22:06   ` H. Peter Anvin
2019-06-14 11:06     ` Daniel Kiper [this message]
2019-05-24  9:55 ` [PATCH RFC 2/2] x86/boot: Introduce dummy MLE header Daniel Kiper
2019-06-05 13:50 ` [PATCH RFC 0/2] x86/boot: Introduce the setup_header2 Daniel Kiper
2019-06-05 14:01   ` Konrad Rzeszutek Wilk
2019-06-06 11:51     ` Daniel Kiper
2019-06-06 17:30       ` Konrad Rzeszutek Wilk
2019-06-06 17:46         ` Daniel Kiper

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190614110650.ytg7dvyimstu2lis@tomti.i.net-space.pl \
    --to=daniel.kiper@oracle.com \
    --cc=dpsmith@apertussolutions.com \
    --cc=eric.snowberg@oracle.com \
    --cc=hpa@zytor.com \
    --cc=kanth.ghatraju@oracle.com \
    --cc=konrad.wilk@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ross.philipson@oracle.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.