linux-kernel.vger.kernel.org archive mirror
From: Kees Cook <keescook@chromium.org>
To: Ingo Molnar <mingo@kernel.org>
Cc: Yinghai Lu <yinghai@kernel.org>, Ingo Molnar <mingo@redhat.com>,
	Baoquan He <bhe@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Borislav Petkov <bp@alien8.de>, Vivek Goyal <vgoyal@redhat.com>,
	Andy Lutomirski <luto@kernel.org>,
	Lasse Collin <lasse.collin@tukaani.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Dave Young <dyoung@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/6] x86/boot: Move compressed kernel to end of decompression buffer
Date: Fri, 29 Apr 2016 00:48:54 -0700
Message-ID: <CAGXu5jLyeaUV_Pe7d-GoTdb+PxsTynkGAw+MzK3xd+_gCqbnvg@mail.gmail.com>
In-Reply-To: <20160429071805.GC28320@gmail.com>

On Fri, Apr 29, 2016 at 12:18 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Kees Cook <keescook@chromium.org> wrote:
>
>> From: Yinghai Lu <yinghai@kernel.org>
>>
>> This change makes later calculations about where the kernel is located
>> easier to reason about. To better understand this change, we must first
>> clarify what VO and ZO are. They were introduced in commits by hpa:
>>
>> 77d1a49 x86, boot: make symbols from the main vmlinux available
>> 37ba7ab x86, boot: make kernel_alignment adjustable; new bzImage fields
>>
>> Specifically:
>>
>> VO:
>> - uncompressed kernel image
>> - size: VO__end - VO__text ("VO_INIT_SIZE" define)
>>
>> ZO:
>> - bootable compressed kernel image (boot/compressed/vmlinux)
>> - head text + compressed kernel (VO and relocs table) + decompressor code
>> - size: ZO__end - ZO_startup_32 ("ZO_INIT_SIZE" define, though see below)
>>
>> The INIT_SIZE definition is used to find the larger of the two image sizes:
>>
>>  #define ZO_INIT_SIZE    (ZO__end - ZO_startup_32 + ZO_z_extract_offset)
>>  #define VO_INIT_SIZE    (VO__end - VO__text)
>>  #if ZO_INIT_SIZE > VO_INIT_SIZE
>>  #define INIT_SIZE ZO_INIT_SIZE
>>  #else
>>  #define INIT_SIZE VO_INIT_SIZE
>>  #endif
>>
>> The current code uses extract_offset to decide where to position the
>> copied ZO (i.e. ZO starts at extract_offset). (This is why ZO_INIT_SIZE
>> currently includes the extract_offset.)
>
> Yeah, so I rewrote the above to:
>
> =================>
> This change makes later calculations about where the kernel is located
> easier to reason about. To better understand this change, we must first
> clarify what 'VO' and 'ZO' are. These values were introduced in commits
> by hpa:
>
>   77d1a4999502 ("x86, boot: make symbols from the main vmlinux available")
>   37ba7ab5e33c ("x86, boot: make kernel_alignment adjustable; new bzImage fields")
>
> Specifically:
>
> All names prefixed with 'VO_':
>
>  - relate to the uncompressed kernel image
>
>  - the size of the VO image is: VO__end-VO__text ("VO_INIT_SIZE" define)
>
> All names prefixed with 'ZO_':
>
>  - relate to the bootable compressed kernel image (boot/compressed/vmlinux),
>    which is composed of the following memory areas:
>      - head text
>      - compressed kernel (VO image and relocs table)
>      - decompressor code
>
>  - the size of the ZO image is: ZO__end - ZO_startup_32 ("ZO_INIT_SIZE" define, though see below)
>
> The 'INIT_SIZE' value is used to find the larger of the two image sizes:
>
>  #define ZO_INIT_SIZE    (ZO__end - ZO_startup_32 + ZO_z_extract_offset)
>  #define VO_INIT_SIZE    (VO__end - VO__text)
>
>  #if ZO_INIT_SIZE > VO_INIT_SIZE
>  # define INIT_SIZE ZO_INIT_SIZE
>  #else
>  # define INIT_SIZE VO_INIT_SIZE
>  #endif
>
> The current code uses extract_offset to decide where to position the
> copied ZO (i.e. ZO starts at extract_offset). (This is why ZO_INIT_SIZE
> currently includes the extract_offset.)
> <=================
>
> Assuming the edits I made are correct, this is the point where the changelog lost
> me. It does not explain why ZO_z_extract_offset exists. Why isn't the ZO copied to
> offset 0?
>
> I had to go into arch/x86/boot/compressed/mkpiggy.c, where ZO_z_extract_offset is
> generated, to find the answer: it's needed because we are trying to minimize the
> amount of RAM used for the whole act of creating an uncompressed, executable,
> properly relocation-linked kernel image in system memory. We do this so that
> kernels can be booted on even very small systems.
>
> To achieve the goal of minimal memory consumption we have implemented an in-place
> decompression strategy: instead of cleanly separating the VO and ZO images and
> also allocating some memory for the decompression code's runtime needs, we instead
> create this elaborate layout of memory buffers where the output (decompressed)
> stream, as it progresses, overlaps with and destroys the input (compressed)
> stream. This can only be done safely if the ZO image is placed at the end of the
> VO range, plus a certain amount of safety distance to make sure that when the last
> bytes of the VO range are decompressed, the compressed stream pointer is safely
> beyond the end of the VO range. Correct?
>
> This is a very essential central concept to the whole code, but nowhere is it
> described clearly!

That would certainly be worth calling out in the description, true.
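
As a rough illustration of the overlap described above, here is a toy
sketch. It is not the kernel's decompressor: the run-length format of
(count, byte) pairs and the name decompress_in_place() are made up for
this example. The compressed input sits at the end of the buffer, the
output grows from the start, and the assert checks that the write
position never passes the read position.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Toy in-place decompressor (hypothetical, for illustration only).
 *
 * buf holds buf_size bytes. The compressed input, a sequence of
 * (count, byte) pairs, occupies the LAST in_size bytes of buf.
 * The output is written from the START of buf, so it gradually
 * overwrites input that has already been consumed.
 *
 * Returns the number of output bytes written.
 */
static size_t decompress_in_place(unsigned char *buf, size_t buf_size,
                                  size_t in_size)
{
    size_t rd = buf_size - in_size; /* next compressed byte to read */
    size_t wr = 0;                  /* next output byte to write */

    while (rd + 1 < buf_size) {
        /* Copy the pair into locals before any output is written. */
        unsigned char count = buf[rd];
        unsigned char value = buf[rd + 1];

        rd += 2;

        /* The output must never overtake the unread input. */
        assert(wr + count <= rd);

        memset(buf + wr, value, count);
        wr += count;
    }
    return wr;
}
```

For example, the six input bytes {4,'A', 1,'B', 3,'C'} expand to the
eight bytes "AAAABCCC". With a 10-byte buffer (8 bytes of output plus
a safety margin of 2, chosen arbitrarily here), the input is copied to
offset 4 before calling decompress_in_place(buf, 10, 6).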

> But more importantly, especially in view of address space randomization, we should
> realize that the days of 8 MB i386-DX systems are gone, and we should get rid of
> all this crazy obfuscation that is hindering development in this area. I also
> suspect that the actual temporary allocation size reduction savings from this
> trick are relatively small, compared to the resulting total memory size.
>
> So my suggestion: let's just cleanly separate all the data areas and not try to do
> any clever overlapping: the benefit will be minimal, and any system that has main
> RAM less than twice the VO+ZO image sizes is fundamentally unbootable and
> unusable anyway.
>
> I.e. have a really clean size calculation of:
>
>         ZO + VO + decompressor-stacks-size + decompressor-data-size
>
> and decompress accordingly without tricks, without overlaps, without any chance
> for corruption - and, most importantly, without this metric ton of obfuscation
> that very few people have managed to fight their way through in the last couple of
> years, and which hinders essential features ...
>
> Agreed?

I don't agree. We do still have embedded systems running x86 kernels,
and we have cases where we're running multiple kernels in memory (like
kdump). I think the memory savings are worth the complexity, especially
since the complexity is being reduced by this patch. But that's not
all:

If we moved the compressed kernel after the buffer, all we'd
accomplish is taking up more memory. We'd still have the head_*.S
complexity of handling the relocation and handling the copy, we'd
still have the extraction, etc, etc. The only change would literally
be setting extract_offset to INIT_SIZE. Everything else would be the
same.
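
A back-of-the-envelope sketch of the two footprints, for comparison.
The helper names are hypothetical and the sizes in the usage note are
invented; overlapped_size() follows the INIT_SIZE macro logic quoted
earlier, and separated_size() follows the sum proposed in the quoted
text.

```c
#include <stddef.h>

/*
 * Footprint with the overlapping layout, following the quoted macros:
 * INIT_SIZE = max(ZO size + extract_offset, VO size).
 */
static size_t overlapped_size(size_t zo_size, size_t vo_size,
                              size_t extract_offset)
{
    size_t zo_init_size = zo_size + extract_offset;

    return zo_init_size > vo_size ? zo_init_size : vo_size;
}

/*
 * Footprint with fully separated areas:
 * ZO + VO + decompressor stack + decompressor data.
 */
static size_t separated_size(size_t zo_size, size_t vo_size,
                             size_t stack_size, size_t data_size)
{
    return zo_size + vo_size + stack_size + data_size;
}
```

With invented sizes in MiB (ZO = 8, VO = 20, extract_offset = 14,
stack and data 1 each), overlapped_size(8, 20, 14) gives 22 while
separated_size(8, 20, 1, 1) gives 30.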

If we moved the decompressed kernel after the compressed kernel
(ignoring KASLR for a moment), then we'd end up in a confusing
situation where the kernel would be running somewhere other than where
the boot loader asked it to be loaded. I don't even want to think about
the weird bug reports we might get from a change like that from old or
weird loaders.

This patch gets us a more reasonable layout with less complexity, no
change to the memory footprint, and no change to the expectations of
the boot loader. I really think this should stand.

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security
