Live-Patching Archive on lore.kernel.org
 help / color / Atom feed
* Re: [PATCH v4 00/10] Function Granular KASLR
       [not found] <20200717170008.5949-1-kristen@linux.intel.com>
@ 2020-07-22  9:27 ` Miroslav Benes
  2020-07-22 14:39   ` Kees Cook
  2020-08-03 11:39   ` Evgenii Shatokhin
  0 siblings, 2 replies; 17+ messages in thread
From: Miroslav Benes @ 2020-07-22  9:27 UTC (permalink / raw)
  To: Kristen Carlson Accardi
  Cc: keescook, tglx, mingo, bp, arjan, x86, linux-kernel,
	kernel-hardening, rick.p.edgecombe, live-patching

Let me CC live-patching ML, because from a quick glance this is something 
which could impact live patching code. At least it invalidates assumptions 
which "sympos" is based on.

Miroslav

On Fri, 17 Jul 2020, Kristen Carlson Accardi wrote:

> Function Granular Kernel Address Space Layout Randomization (fgkaslr)
> ---------------------------------------------------------------------
> 
> This patch set is an implementation of finer grained kernel address space
> randomization. It rearranges your kernel code at load time 
> on a per-function level granularity, with only around a second added to
> boot time.

[...]

> Background
> ----------
> KASLR was merged into the kernel with the objective of increasing the
> difficulty of code reuse attacks. Code reuse attacks reused existing code
> snippets to get around existing memory protections. They exploit software bugs
> which expose addresses of useful code snippets to control the flow of
> execution for their own nefarious purposes. KASLR moves the entire kernel
> code text as a unit at boot time in order to make addresses less predictable.
> The order of the code within the segment is unchanged - only the base address
> is shifted. There are a few shortcomings to this algorithm.
> 
> 1. Low Entropy - there are only so many locations the kernel can fit in. This
>    means an attacker could guess without too much trouble.
> 2. Knowledge of a single address can reveal the offset of the base address,
>    exposing all other locations for a published/known kernel image.
> 3. Info leaks abound.
> 
> Finer grained ASLR has been proposed as a way to make ASLR more resistant
> to info leaks. It is not a new concept at all, and there are many variations
> possible. Function reordering is an implementation of finer grained ASLR
> which randomizes the layout of an address space on a function level
> granularity. We use the term "fgkaslr" in this document to refer to the
> technique of function reordering when used with KASLR, as well as finer grained
> KASLR in general.
> 
> Proposed Improvement
> --------------------
> This patch set proposes adding function reordering on top of the existing
> KASLR base address randomization. The over-arching objective is incremental
> improvement over what we already have. It is designed to work in combination
> with the existing solution. The implementation is really pretty simple, and
> there are 2 main area where changes occur:
> 
> * Build time
> 
> GCC has had an option to place functions into individual .text sections for
> many years now. This option can be used to implement function reordering at
> load time. The final compiled vmlinux retains all the section headers, which
> can be used to help find the address ranges of each function. Using this
> information and an expanded table of relocation addresses, individual text
> sections can be suffled immediately after decompression. Some data tables
> inside the kernel that have assumptions about order require re-sorting
> after being updated when applying relocations. In order to modify these tables,
> a few key symbols are excluded from the objcopy symbol stripping process for
> use after shuffling the text segments.
> 
> Some highlights from the build time changes to look for:
> 
> The top level kernel Makefile was modified to add the gcc flag if it
> is supported. Currently, I am applying this flag to everything it is
> possible to randomize. Anything that is written in C and not present in a
> special input section is randomized. The final binary segment 0 retains a
> consolidated .text section, as well as all the individual .text.* sections.
> Future work could turn off this flags for selected files or even entire
> subsystems, although obviously at the cost of security.
> 
> The relocs tool is updated to add relative relocations. This information
> previously wasn't included because it wasn't necessary when moving the
> entire .text segment as a unit. 
> 
> A new file was created to contain a list of symbols that objcopy should
> keep. We use those symbols at load time as described below.
> 
> * Load time
> 
> The boot kernel was modified to parse the vmlinux elf file after
> decompression to check for our interesting symbols that we kept, and to
> look for any .text.* sections to randomize. The consolidated .text section
> is skipped and not moved. The sections are shuffled randomly, and copied
> into memory following the .text section in a new random order. The existing
> code which updated relocation addresses was modified to account for
> not just a fixed delta from the load address, but the offset that the function
> section was moved to. This requires inspection of each address to see if
> it was impacted by a randomization. We use a bsearch to make this less
> horrible on performance. Any tables that need to be modified with new
> addresses or resorted are updated using the symbol addresses parsed from the
> elf symbol table.
> 
> In order to hide our new layout, symbols reported through /proc/kallsyms
> will be displayed in a random order.
> 
> Security Considerations
> -----------------------
> The objective of this patch set is to improve a technology that is already
> merged into the kernel (KASLR). This code will not prevent all attacks,
> but should instead be considered as one of several tools that can be used.
> In particular, this code is meant to make KASLR more effective in the presence
> of info leaks.
> 
> How much entropy we are adding to the existing entropy of standard KASLR will
> depend on a few variables. Firstly and most obviously, the number of functions
> that are randomized matters. This implementation keeps the existing .text
> section for code that cannot be randomized - for example, because it was
> assembly code. The less sections to randomize, the less entropy. In addition,
> due to alignment (16 bytes for x86_64), the number of bits in a address that
> the attacker needs to guess is reduced, as the lower bits are identical.

[...]

> Modules
> -------
> Modules are randomized similarly to the rest of the kernel by shuffling
> the sections at load time prior to moving them into memory. The module must
> also have been build with the -ffunction-sections compiler option.
> 
> Although fgkaslr for the kernel is only supported for the X86_64 architecture,
> it is possible to use fgkaslr with modules on other architectures. To enable
> this feature, select
> 
> CONFIG_MODULE_FG_KASLR=y
> 
> This option is selected automatically for X86_64 when CONFIG_FG_KASLR is set.
> 
> Disabling
> ---------
> Disabling normal KASLR using the nokaslr command line option also disables
> fgkaslr. It is also possible to disable fgkaslr separately by booting with
> fgkaslr=off on the commandline.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 00/10] Function Granular KASLR
  2020-07-22  9:27 ` [PATCH v4 00/10] Function Granular KASLR Miroslav Benes
@ 2020-07-22 14:39   ` Kees Cook
  2020-07-22 14:51     ` Joe Lawrence
  2020-07-22 16:07     ` Josh Poimboeuf
  2020-08-03 11:39   ` Evgenii Shatokhin
  1 sibling, 2 replies; 17+ messages in thread
From: Kees Cook @ 2020-07-22 14:39 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Kristen Carlson Accardi, tglx, mingo, bp, arjan, x86,
	linux-kernel, kernel-hardening, rick.p.edgecombe, live-patching

On Wed, Jul 22, 2020 at 11:27:30AM +0200, Miroslav Benes wrote:
> Let me CC live-patching ML, because from a quick glance this is something 
> which could impact live patching code. At least it invalidates assumptions 
> which "sympos" is based on.

In a quick skim, it looks like the symbol resolution is using
kallsyms_on_each_symbol(), so I think this is safe? What's a good
selftest for live-patching?

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 00/10] Function Granular KASLR
  2020-07-22 14:39   ` Kees Cook
@ 2020-07-22 14:51     ` Joe Lawrence
  2020-07-22 14:56       ` Joe Lawrence
  2020-07-22 16:07     ` Josh Poimboeuf
  1 sibling, 1 reply; 17+ messages in thread
From: Joe Lawrence @ 2020-07-22 14:51 UTC (permalink / raw)
  To: Kees Cook, Miroslav Benes
  Cc: Kristen Carlson Accardi, tglx, mingo, bp, arjan, x86,
	linux-kernel, kernel-hardening, rick.p.edgecombe, live-patching

On 7/22/20 10:39 AM, Kees Cook wrote:
> On Wed, Jul 22, 2020 at 11:27:30AM +0200, Miroslav Benes wrote:
>> Let me CC live-patching ML, because from a quick glance this is something
>> which could impact live patching code. At least it invalidates assumptions
>> which "sympos" is based on.
> 
> In a quick skim, it looks like the symbol resolution is using
> kallsyms_on_each_symbol(), so I think this is safe? What's a good
> selftest for live-patching?
> 

Hi Kees,

I don't think any of the in-tree tests currently exercise the 
kallsyms/sympos end of livepatching.

I do have a local branch that does facilitate creating klp-relocations 
that do rely upon this feature -- I'll try to see if I can get those 
working with this patchset and report back later this week.

-- Joe


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 00/10] Function Granular KASLR
  2020-07-22 14:51     ` Joe Lawrence
@ 2020-07-22 14:56       ` Joe Lawrence
  2020-07-22 18:24         ` Kristen Carlson Accardi
  0 siblings, 1 reply; 17+ messages in thread
From: Joe Lawrence @ 2020-07-22 14:56 UTC (permalink / raw)
  To: Kees Cook, Miroslav Benes
  Cc: Kristen Carlson Accardi, tglx, mingo, bp, arjan, x86,
	linux-kernel, kernel-hardening, rick.p.edgecombe, live-patching

On 7/22/20 10:51 AM, Joe Lawrence wrote:
> On 7/22/20 10:39 AM, Kees Cook wrote:
>> On Wed, Jul 22, 2020 at 11:27:30AM +0200, Miroslav Benes wrote:
>>> Let me CC live-patching ML, because from a quick glance this is something
>>> which could impact live patching code. At least it invalidates assumptions
>>> which "sympos" is based on.
>>
>> In a quick skim, it looks like the symbol resolution is using
>> kallsyms_on_each_symbol(), so I think this is safe? What's a good
>> selftest for live-patching?
>>
> 
> Hi Kees,
> 
> I don't think any of the in-tree tests currently exercise the
> kallsyms/sympos end of livepatching.
> 

On second thought, I mispoke.. The general livepatch code does use it:

klp_init_object
   klp_init_object_loaded
     klp_find_object_symbol

in which case any of the current kselftests should exercise that.

   % make -C tools/testing/selftests/livepatch run_tests

-- Joe


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 00/10] Function Granular KASLR
  2020-07-22 14:39   ` Kees Cook
  2020-07-22 14:51     ` Joe Lawrence
@ 2020-07-22 16:07     ` Josh Poimboeuf
  2020-07-22 19:42       ` Kees Cook
  1 sibling, 1 reply; 17+ messages in thread
From: Josh Poimboeuf @ 2020-07-22 16:07 UTC (permalink / raw)
  To: Kees Cook
  Cc: Miroslav Benes, Kristen Carlson Accardi, tglx, mingo, bp, arjan,
	x86, linux-kernel, kernel-hardening, rick.p.edgecombe,
	live-patching

On Wed, Jul 22, 2020 at 07:39:55AM -0700, Kees Cook wrote:
> On Wed, Jul 22, 2020 at 11:27:30AM +0200, Miroslav Benes wrote:
> > Let me CC live-patching ML, because from a quick glance this is something 
> > which could impact live patching code. At least it invalidates assumptions 
> > which "sympos" is based on.
> 
> In a quick skim, it looks like the symbol resolution is using
> kallsyms_on_each_symbol(), so I think this is safe? What's a good
> selftest for live-patching?

The problem is duplicate symbols.  If there are two static functions
named 'foo' then livepatch needs a way to distinguish them.

Our current approach to that problem is "sympos".  We rely on the fact
that the second foo() always comes after the first one in the symbol
list and kallsyms.  So they're referred to as foo,1 and foo,2.

-- 
Josh


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 00/10] Function Granular KASLR
  2020-07-22 14:56       ` Joe Lawrence
@ 2020-07-22 18:24         ` Kristen Carlson Accardi
  0 siblings, 0 replies; 17+ messages in thread
From: Kristen Carlson Accardi @ 2020-07-22 18:24 UTC (permalink / raw)
  To: Joe Lawrence, Kees Cook, Miroslav Benes
  Cc: tglx, mingo, bp, arjan, x86, linux-kernel, kernel-hardening,
	rick.p.edgecombe, live-patching

On Wed, 2020-07-22 at 10:56 -0400, Joe Lawrence wrote:
> On 7/22/20 10:51 AM, Joe Lawrence wrote:
> > On 7/22/20 10:39 AM, Kees Cook wrote:
> > > On Wed, Jul 22, 2020 at 11:27:30AM +0200, Miroslav Benes wrote:
> > > > Let me CC live-patching ML, because from a quick glance this is
> > > > something
> > > > which could impact live patching code. At least it invalidates
> > > > assumptions
> > > > which "sympos" is based on.
> > > 
> > > In a quick skim, it looks like the symbol resolution is using
> > > kallsyms_on_each_symbol(), so I think this is safe? What's a good
> > > selftest for live-patching?
> > > 
> > 
> > Hi Kees,
> > 
> > I don't think any of the in-tree tests currently exercise the
> > kallsyms/sympos end of livepatching.
> > 
> 
> On second thought, I mispoke.. The general livepatch code does use
> it:
> 
> klp_init_object
>    klp_init_object_loaded
>      klp_find_object_symbol
> 
> in which case any of the current kselftests should exercise that.
> 
>    % make -C tools/testing/selftests/livepatch run_tests
> 
> -- Joe
> 

Thanks, it looks like this should work for helping me exercise the live
patch code paths. I will take a look and get back to you all.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 00/10] Function Granular KASLR
  2020-07-22 16:07     ` Josh Poimboeuf
@ 2020-07-22 19:42       ` Kees Cook
  2020-07-22 19:56         ` Kristen Carlson Accardi
  0 siblings, 1 reply; 17+ messages in thread
From: Kees Cook @ 2020-07-22 19:42 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Miroslav Benes, Kristen Carlson Accardi, tglx, mingo, bp, arjan,
	x86, linux-kernel, kernel-hardening, rick.p.edgecombe,
	live-patching

On Wed, Jul 22, 2020 at 11:07:30AM -0500, Josh Poimboeuf wrote:
> On Wed, Jul 22, 2020 at 07:39:55AM -0700, Kees Cook wrote:
> > On Wed, Jul 22, 2020 at 11:27:30AM +0200, Miroslav Benes wrote:
> > > Let me CC live-patching ML, because from a quick glance this is something 
> > > which could impact live patching code. At least it invalidates assumptions 
> > > which "sympos" is based on.
> > 
> > In a quick skim, it looks like the symbol resolution is using
> > kallsyms_on_each_symbol(), so I think this is safe? What's a good
> > selftest for live-patching?
> 
> The problem is duplicate symbols.  If there are two static functions
> named 'foo' then livepatch needs a way to distinguish them.
> 
> Our current approach to that problem is "sympos".  We rely on the fact
> that the second foo() always comes after the first one in the symbol
> list and kallsyms.  So they're referred to as foo,1 and foo,2.

Ah. Fun. In that case, perhaps the LTO series has some solutions. I
think builds with LTO end up renaming duplicate symbols like that, so
it'll be back to being unique.

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 00/10] Function Granular KASLR
  2020-07-22 19:42       ` Kees Cook
@ 2020-07-22 19:56         ` Kristen Carlson Accardi
  2020-07-22 21:33           ` Josh Poimboeuf
  0 siblings, 1 reply; 17+ messages in thread
From: Kristen Carlson Accardi @ 2020-07-22 19:56 UTC (permalink / raw)
  To: Kees Cook, Josh Poimboeuf
  Cc: Miroslav Benes, tglx, mingo, bp, arjan, x86, linux-kernel,
	kernel-hardening, rick.p.edgecombe, live-patching

On Wed, 2020-07-22 at 12:42 -0700, Kees Cook wrote:
> On Wed, Jul 22, 2020 at 11:07:30AM -0500, Josh Poimboeuf wrote:
> > On Wed, Jul 22, 2020 at 07:39:55AM -0700, Kees Cook wrote:
> > > On Wed, Jul 22, 2020 at 11:27:30AM +0200, Miroslav Benes wrote:
> > > > Let me CC live-patching ML, because from a quick glance this is
> > > > something 
> > > > which could impact live patching code. At least it invalidates
> > > > assumptions 
> > > > which "sympos" is based on.
> > > 
> > > In a quick skim, it looks like the symbol resolution is using
> > > kallsyms_on_each_symbol(), so I think this is safe? What's a good
> > > selftest for live-patching?
> > 
> > The problem is duplicate symbols.  If there are two static
> > functions
> > named 'foo' then livepatch needs a way to distinguish them.
> > 
> > Our current approach to that problem is "sympos".  We rely on the
> > fact
> > that the second foo() always comes after the first one in the
> > symbol
> > list and kallsyms.  So they're referred to as foo,1 and foo,2.
> 
> Ah. Fun. In that case, perhaps the LTO series has some solutions. I
> think builds with LTO end up renaming duplicate symbols like that, so
> it'll be back to being unique.
> 

Well, glad to hear there might be some precendence for how to solve
this, as I wasn't able to think of something reasonable off the top of
my head. Are you speaking of the Clang LTO series? 
https://lore.kernel.org/lkml/20200624203200.78870-1-samitolvanen@google.com/


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 00/10] Function Granular KASLR
  2020-07-22 19:56         ` Kristen Carlson Accardi
@ 2020-07-22 21:33           ` Josh Poimboeuf
  0 siblings, 0 replies; 17+ messages in thread
From: Josh Poimboeuf @ 2020-07-22 21:33 UTC (permalink / raw)
  To: Kristen Carlson Accardi
  Cc: Kees Cook, Miroslav Benes, tglx, mingo, bp, arjan, x86,
	linux-kernel, kernel-hardening, rick.p.edgecombe, live-patching

On Wed, Jul 22, 2020 at 12:56:10PM -0700, Kristen Carlson Accardi wrote:
> On Wed, 2020-07-22 at 12:42 -0700, Kees Cook wrote:
> > On Wed, Jul 22, 2020 at 11:07:30AM -0500, Josh Poimboeuf wrote:
> > > On Wed, Jul 22, 2020 at 07:39:55AM -0700, Kees Cook wrote:
> > > > On Wed, Jul 22, 2020 at 11:27:30AM +0200, Miroslav Benes wrote:
> > > > > Let me CC live-patching ML, because from a quick glance this is
> > > > > something 
> > > > > which could impact live patching code. At least it invalidates
> > > > > assumptions 
> > > > > which "sympos" is based on.
> > > > 
> > > > In a quick skim, it looks like the symbol resolution is using
> > > > kallsyms_on_each_symbol(), so I think this is safe? What's a good
> > > > selftest for live-patching?
> > > 
> > > The problem is duplicate symbols.  If there are two static
> > > functions
> > > named 'foo' then livepatch needs a way to distinguish them.
> > > 
> > > Our current approach to that problem is "sympos".  We rely on the
> > > fact
> > > that the second foo() always comes after the first one in the
> > > symbol
> > > list and kallsyms.  So they're referred to as foo,1 and foo,2.
> > 
> > Ah. Fun. In that case, perhaps the LTO series has some solutions. I
> > think builds with LTO end up renaming duplicate symbols like that, so
> > it'll be back to being unique.
> > 
> 
> Well, glad to hear there might be some precendence for how to solve
> this, as I wasn't able to think of something reasonable off the top of
> my head. Are you speaking of the Clang LTO series? 
> https://lore.kernel.org/lkml/20200624203200.78870-1-samitolvanen@google.com/

I'm not sure how LTO does it, but a few more (half-brained) ideas that
could work:

1) Add a field in kallsyms to keep track of a symbol's original offset
   before randomization/re-sorting.  Livepatch could use that field to
   determine the original sympos.

2) In fgkaslr code, go through all the sections and mark the ones which
   have duplicates (i.e. same name).  Then when shuffling the sections,
   skip a shuffle if it involves a duplicate section.  That way all the
   duplicates would retain their original sympos.

3) Livepatch could uniquely identify symbols by some feature other than
   sympos.  For example:

   Symbol/function size - obviously this would only work if duplicately
   named symbols have different sizes.

   Checksum - as part of a separate feature we're also looking at giving
   each function its own checksum, calculated based on its instruction
   opcodes.  Though calculating checksums at runtime could be
   complicated by IP-relative addressing.

I'm thinking #1 or #2 wouldn't be too bad.  #3 might be harder.

-- 
Josh


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 00/10] Function Granular KASLR
  2020-07-22  9:27 ` [PATCH v4 00/10] Function Granular KASLR Miroslav Benes
  2020-07-22 14:39   ` Kees Cook
@ 2020-08-03 11:39   ` Evgenii Shatokhin
  2020-08-03 17:45     ` Kees Cook
  1 sibling, 1 reply; 17+ messages in thread
From: Evgenii Shatokhin @ 2020-08-03 11:39 UTC (permalink / raw)
  To: Kristen Carlson Accardi
  Cc: Miroslav Benes, keescook, tglx, mingo, bp, arjan, x86,
	linux-kernel, kernel-hardening, rick.p.edgecombe, live-patching,
	Joe Lawrence, Josh Poimboeuf

Hi,

> On Fri, 17 Jul 2020, Kristen Carlson Accardi wrote:
> 
>> Function Granular Kernel Address Space Layout Randomization (fgkaslr)
>> ---------------------------------------------------------------------
>>
>> This patch set is an implementation of finer grained kernel address space
>> randomization. It rearranges your kernel code at load time
>> on a per-function level granularity, with only around a second added to
>> boot time.
> 
> [...]

>> Modules
>> -------
>> Modules are randomized similarly to the rest of the kernel by shuffling
>> the sections at load time prior to moving them into memory. The module must
>> also have been build with the -ffunction-sections compiler option.

It seems, a couple more adjustments are needed in the module loader code.

With function granular KASLR, modules will have lots of ELF sections due 
to -ffunction-sections.
On my x86_64 system with kernel 5.8-rc7 with FG KASLR patches, for 
example, xfs.ko has 4849 ELF sections total, 2428 of these are loaded 
and shown in /sys/module/xfs/sections/.

There are at least 2 places where high-order memory allocations might 
happen during module loading. Such allocations may fail if memory is 
fragmented, while physically contiguous memory areas are not really 
needed there. I suggest to switch to kvmalloc/kvfree there.

1. kernel/module.c, randomize_text():
	Elf_Shdr **text_list;
	...
	int max_sections = info->hdr->e_shnum;
	...
	text_list = kmalloc_array(max_sections, sizeof(*text_list), GFP_KERNEL);

The size of the allocated memory area is (8 * total_number_of_sections), 
if I understand it right, which is 38792 for xfs.ko, a 4th order allocation.

2. kernel/module.c, mod_sysfs_setup() => add_sect_attrs().

This allocation can be larger than the first one.

We found this issue with livepatch modules some time ago (these modules 
are already built with -ffunction-sections) [1], but, with FG KASLR, it 
affects all kernel modules. Large ones like xfs.ko, btrfs.ko, etc., 
could suffer the most from it.

When a module is loaded sysfs attributes are created for its ELF 
sections (visible as /sys/module/<module_name>/sections/*). and contain 
the start addresses of these ELF sections. A single memory chunk is 
allocated
for all these:

         size[0] = ALIGN(sizeof(*sect_attrs)
                         + nloaded * sizeof(sect_attrs->attrs[0]),
                         sizeof(sect_attrs->grp.attrs[0]));
         size[1] = (nloaded + 1) * sizeof(sect_attrs->grp.attrs[0]);
         sect_attrs = kzalloc(size[0] + size[1], GFP_KERNEL);

'nloaded' is the number of loaded ELF section in the module.

For the kernel 5.8-rc7 on my system, the total size is 56 + 72 * 
nloaded, which is 174872 for xfs.ko, 43 pages, 6th order allocation.

I enabled 'mm_page_alloc' tracepoint with filter 'order > 3' to confirm 
the issue and, indeed, got these two allocations when modprobe'ing xfs:
----------------------------
/sys/kernel/debug/tracing/trace:
         modprobe-1509  <...>: mm_page_alloc: <...> order=4 
migratetype=0 gfp_flags=GFP_KERNEL|__GFP_COMP
         modprobe-1509  <stack trace>
		=> __alloc_pages_nodemask
		=> alloc_pages_current
		=> kmalloc_order
		=> kmalloc_order_trace
		=> __kmalloc
		=> load_module

         modprobe-1509  <...>: mm_page_alloc: <...> order=6 
migratetype=0 gfp_flags=GFP_KERNEL|__GFP_COMP|__GFP_ZERO
         modprobe-1509  <stack trace>
		=> __alloc_pages_nodemask
		=> alloc_pages_current
		=> kmalloc_order
		=> kmalloc_order_trace
		=> __kmalloc
		=> mod_sysfs_setup
		=> load_module
----------------------------

I suppose, something like this can be used as workaround:

* for randomize_text():
-----------
diff --git a/kernel/module.c b/kernel/module.c
index 0f4f4e567a42..a2473db1d0a3 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2433,7 +2433,7 @@ static void randomize_text(struct module *mod, 
struct load_info *info)
  	if (sec == 0)
  		return;

-	text_list = kmalloc_array(max_sections, sizeof(*text_list), GFP_KERNEL);
+	text_list = kvmalloc_array(max_sections, sizeof(*text_list), GFP_KERNEL);
  	if (!text_list)
  		return;

@@ -2466,7 +2466,7 @@ static void randomize_text(struct module *mod, 
struct load_info *info)
  		shdr->sh_entsize = get_offset(mod, &size, shdr, 0);
  	}

-	kfree(text_list);
+	kvfree(text_list);
  }

  /* Lay out the SHF_ALLOC sections in a way not dissimilar to how ld

-----------

* for add_sect_attrs():
-----------
diff --git a/kernel/module.c b/kernel/module.c
index 0f4f4e567a42..a2473db1d0a3 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -1541,7 +1541,7 @@ static void free_sect_attrs(struct 
module_sect_attrs *sect_attrs)

  	for (section = 0; section < sect_attrs->nsections; section++)
  		kfree(sect_attrs->attrs[section].battr.attr.name);
-	kfree(sect_attrs);
+	kvfree(sect_attrs);
  }

  static void add_sect_attrs(struct module *mod, const struct load_info 
*info)
@@ -1558,7 +1558,7 @@ static void add_sect_attrs(struct module *mod, 
const struct load_info *info)
  	size[0] = ALIGN(struct_size(sect_attrs, attrs, nloaded),
  			sizeof(sect_attrs->grp.bin_attrs[0]));
  	size[1] = (nloaded + 1) * sizeof(sect_attrs->grp.bin_attrs[0]);
-	sect_attrs = kzalloc(size[0] + size[1], GFP_KERNEL);
+	sect_attrs = kvzalloc(size[0] + size[1], GFP_KERNEL);
  	if (sect_attrs == NULL)
  		return;

-----------

[1] https://github.com/dynup/kpatch/pull/1131

Regards,
Evgenii

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 00/10] Function Granular KASLR
  2020-08-03 11:39   ` Evgenii Shatokhin
@ 2020-08-03 17:45     ` Kees Cook
  2020-08-03 18:17       ` Joe Lawrence
  0 siblings, 1 reply; 17+ messages in thread
From: Kees Cook @ 2020-08-03 17:45 UTC (permalink / raw)
  To: Evgenii Shatokhin
  Cc: Kristen Carlson Accardi, Miroslav Benes, tglx, mingo, bp, arjan,
	x86, linux-kernel, kernel-hardening, rick.p.edgecombe,
	live-patching, Joe Lawrence, Josh Poimboeuf, Jessica Yu

On Mon, Aug 03, 2020 at 02:39:32PM +0300, Evgenii Shatokhin wrote:
> There are at least 2 places where high-order memory allocations might happen
> during module loading. Such allocations may fail if memory is fragmented,
> while physically contiguous memory areas are not really needed there. I
> suggest to switch to kvmalloc/kvfree there.

While this does seem to be the right solution for the extant problem, I
do want to take a moment and ask if the function sections need to be
exposed at all? What tools use this information, and do they just want
to see the bounds of the code region? (i.e. the start/end of all the
.text* sections) Perhaps .text.* could be excluded from the sysfs
section list?

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 00/10] Function Granular KASLR
  2020-08-03 17:45     ` Kees Cook
@ 2020-08-03 18:17       ` Joe Lawrence
  2020-08-03 19:38         ` Frank Ch. Eigler
  0 siblings, 1 reply; 17+ messages in thread
From: Joe Lawrence @ 2020-08-03 18:17 UTC (permalink / raw)
  To: Kees Cook, Evgenii Shatokhin
  Cc: Kristen Carlson Accardi, Miroslav Benes, tglx, mingo, bp, arjan,
	x86, linux-kernel, kernel-hardening, rick.p.edgecombe,
	live-patching, Josh Poimboeuf, Jessica Yu, Frank Ch. Eigler

On 8/3/20 1:45 PM, Kees Cook wrote:
> On Mon, Aug 03, 2020 at 02:39:32PM +0300, Evgenii Shatokhin wrote:
>> There are at least 2 places where high-order memory allocations might happen
>> during module loading. Such allocations may fail if memory is fragmented,
>> while physically contiguous memory areas are not really needed there. I
>> suggest to switch to kvmalloc/kvfree there.
> 
> While this does seem to be the right solution for the extant problem, I
> do want to take a moment and ask if the function sections need to be
> exposed at all? What tools use this information, and do they just want
> to see the bounds of the code region? (i.e. the start/end of all the
> .text* sections) Perhaps .text.* could be excluded from the sysfs
> section list?
> 

[[cc += FChE, see [0] for Evgenii's full mail ]]

It looks like debugging tools like systemtap [1], gdb [2] and its 
add-symbol-file cmd, etc. peek at the /sys/module/<MOD>/section/ info.

But yeah, it would be preferable if we didn't export a long sysfs 
representation if nobody actually needs it.


[0] 
https://lore.kernel.org/lkml/e9c4d88b-86db-47e9-4299-3fac45a7e3fd@virtuozzo.com/
[1] https://fossies.org/linux/systemtap/staprun/staprun.c
[2] 
https://www.oreilly.com/library/view/linux-device-drivers/0596005903/ch04.html#linuxdrive3-CHP-4-SECT-6.1

-- Joe


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 00/10] Function Granular KASLR
  2020-08-03 18:17       ` Joe Lawrence
@ 2020-08-03 19:38         ` Frank Ch. Eigler
  2020-08-03 20:11           ` Kees Cook
  0 siblings, 1 reply; 17+ messages in thread
From: Frank Ch. Eigler @ 2020-08-03 19:38 UTC (permalink / raw)
  To: Joe Lawrence
  Cc: Kees Cook, Evgenii Shatokhin, Kristen Carlson Accardi,
	Miroslav Benes, tglx, mingo, bp, arjan, x86, linux-kernel,
	kernel-hardening, rick.p.edgecombe, live-patching,
	Josh Poimboeuf, Jessica Yu

Hi -

> > While this does seem to be the right solution for the extant problem, I
> > do want to take a moment and ask if the function sections need to be
> > exposed at all? What tools use this information, and do they just want
> > to see the bounds of the code region? (i.e. the start/end of all the
> > .text* sections) Perhaps .text.* could be excluded from the sysfs
> > section list?

> [[cc += FChE, see [0] for Evgenii's full mail ]]

Thanks!

> It looks like debugging tools like systemtap [1], gdb [2] and its
> add-symbol-file cmd, etc. peek at the /sys/module/<MOD>/section/ info.
> But yeah, it would be preferable if we didn't export a long sysfs
> representation if nobody actually needs it.

Systemtap needs to know base addresses of loaded text & data sections,
in order to perform relocation of probe point PCs and context data
addresses.  It uses /sys/module/...., kind of under protest, because
there seems to exist no MODULE_EXPORT'd API to get at that information
some other way.

- FChE


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 00/10] Function Granular KASLR
  2020-08-03 19:38         ` Frank Ch. Eigler
@ 2020-08-03 20:11           ` Kees Cook
  2020-08-03 21:12             ` Frank Ch. Eigler
  0 siblings, 1 reply; 17+ messages in thread
From: Kees Cook @ 2020-08-03 20:11 UTC (permalink / raw)
  To: Frank Ch. Eigler
  Cc: Joe Lawrence, Evgenii Shatokhin, Kristen Carlson Accardi,
	Miroslav Benes, tglx, mingo, bp, arjan, x86, linux-kernel,
	kernel-hardening, rick.p.edgecombe, live-patching,
	Josh Poimboeuf, Jessica Yu

On Mon, Aug 03, 2020 at 03:38:37PM -0400, Frank Ch. Eigler wrote:
> Hi -
> 
> > > While this does seem to be the right solution for the extant problem, I
> > > do want to take a moment and ask if the function sections need to be
> > > exposed at all? What tools use this information, and do they just want
> > > to see the bounds of the code region? (i.e. the start/end of all the
> > > .text* sections) Perhaps .text.* could be excluded from the sysfs
> > > section list?
> 
> > [[cc += FChE, see [0] for Evgenii's full mail ]]
> 
> Thanks!
> 
> > It looks like debugging tools like systemtap [1], gdb [2] and its
> > add-symbol-file cmd, etc. peek at the /sys/module/<MOD>/section/ info.
> > But yeah, it would be preferable if we didn't export a long sysfs
> > representation if nobody actually needs it.
> 
> Systemtap needs to know base addresses of loaded text & data sections,
> in order to perform relocation of probe point PCs and context data
> addresses.  It uses /sys/module/...., kind of under protest, because
> there seems to exist no MODULE_EXPORT'd API to get at that information
> some other way.

Wouldn't /proc/kallsysms entries cover this? I must be missing
something...

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 00/10] Function Granular KASLR
  2020-08-03 20:11           ` Kees Cook
@ 2020-08-03 21:12             ` Frank Ch. Eigler
  2020-08-03 21:41               ` Kees Cook
  0 siblings, 1 reply; 17+ messages in thread
From: Frank Ch. Eigler @ 2020-08-03 21:12 UTC (permalink / raw)
  To: Kees Cook
  Cc: Joe Lawrence, Evgenii Shatokhin, Kristen Carlson Accardi,
	Miroslav Benes, tglx, mingo, bp, arjan, x86, linux-kernel,
	kernel-hardening, rick.p.edgecombe, live-patching,
	Josh Poimboeuf, Jessica Yu

Hi -

On Mon, Aug 03, 2020 at 01:11:27PM -0700, Kees Cook wrote:
> [...]
> > Systemtap needs to know base addresses of loaded text & data sections,
> > in order to perform relocation of probe point PCs and context data
> > addresses.  It uses /sys/module/...., kind of under protest, because
> > there seems to exist no MODULE_EXPORT'd API to get at that information
> > some other way.
> 
> Wouldn't /proc/kallsysms entries cover this? I must be missing
> something...

We have relocated based on sections, not some subset of function
symbols accessible that way, partly because DWARF line- and DIE- based
probes can map to addresses some way away from function symbols, into
function interiors, or cloned/moved bits of optimized code.  It would
take some work to prove that function-symbol based heuristic
arithmetic would have just as much reach.

- FChE


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 00/10] Function Granular KASLR
  2020-08-03 21:12             ` Frank Ch. Eigler
@ 2020-08-03 21:41               ` Kees Cook
  2020-08-04  0:48                 ` Frank Ch. Eigler
  0 siblings, 1 reply; 17+ messages in thread
From: Kees Cook @ 2020-08-03 21:41 UTC (permalink / raw)
  To: Frank Ch. Eigler
  Cc: Joe Lawrence, Evgenii Shatokhin, Kristen Carlson Accardi,
	Miroslav Benes, tglx, mingo, bp, arjan, x86, linux-kernel,
	kernel-hardening, rick.p.edgecombe, live-patching,
	Josh Poimboeuf, Jessica Yu

On Mon, Aug 03, 2020 at 05:12:28PM -0400, Frank Ch. Eigler wrote:
> Hi -
> 
> On Mon, Aug 03, 2020 at 01:11:27PM -0700, Kees Cook wrote:
> > [...]
> > > Systemtap needs to know base addresses of loaded text & data sections,
> > > in order to perform relocation of probe point PCs and context data
> > > addresses.  It uses /sys/module/...., kind of under protest, because
> > > there seems to exist no MODULE_EXPORT'd API to get at that information
> > > some other way.
> > 
> > Wouldn't /proc/kallsysms entries cover this? I must be missing
> > something...
> 
> We have relocated based on sections, not some subset of function
> symbols accessible that way, partly because DWARF line- and DIE- based
> probes can map to addresses some way away from function symbols, into
> function interiors, or cloned/moved bits of optimized code.  It would
> take some work to prove that function-symbol based heuristic
> arithmetic would have just as much reach.

Interesting. Do you have an example handy? It seems like something like
that would reference the enclosing section, which means we can't just
leave them out of the sysfs list... (but if such things never happen in
the function-sections, then we *can* remove them...)

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 00/10] Function Granular KASLR
  2020-08-03 21:41               ` Kees Cook
@ 2020-08-04  0:48                 ` Frank Ch. Eigler
  0 siblings, 0 replies; 17+ messages in thread
From: Frank Ch. Eigler @ 2020-08-04  0:48 UTC (permalink / raw)
  To: Kees Cook
  Cc: Joe Lawrence, Evgenii Shatokhin, Kristen Carlson Accardi,
	Miroslav Benes, tglx, mingo, bp, arjan, x86, linux-kernel,
	kernel-hardening, rick.p.edgecombe, live-patching,
	Josh Poimboeuf, Jessica Yu

Hi -

> > We have relocated based on sections, not some subset of function
> > symbols accessible that way, partly because DWARF line- and DIE- based
> > probes can map to addresses some way away from function symbols, into
> > function interiors, or cloned/moved bits of optimized code.  It would
> > take some work to prove that function-symbol based heuristic
> > arithmetic would have just as much reach.
> 
> Interesting. Do you have an example handy? 

No, I'm afraid I don't have one that I know cannot possibly be
expressed by reference to a function symbol only.  I'd look at
systemtap (4.3) probe point lists like:

% stap -vL 'kernel.statement("*@kernel/*verif*.c:*")'
% stap -vL 'module("amdgpu").statement("*@*execution*.c:*")'

which give an impression of computed PC addresses.

> It seems like something like that would reference the enclosing
> section, which means we can't just leave them out of the sysfs
> list... (but if such things never happen in the function-sections,
> then we *can* remove them...)

I'm not sure we can easily prove they can never happen there.

- FChE


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, back to index

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20200717170008.5949-1-kristen@linux.intel.com>
2020-07-22  9:27 ` [PATCH v4 00/10] Function Granular KASLR Miroslav Benes
2020-07-22 14:39   ` Kees Cook
2020-07-22 14:51     ` Joe Lawrence
2020-07-22 14:56       ` Joe Lawrence
2020-07-22 18:24         ` Kristen Carlson Accardi
2020-07-22 16:07     ` Josh Poimboeuf
2020-07-22 19:42       ` Kees Cook
2020-07-22 19:56         ` Kristen Carlson Accardi
2020-07-22 21:33           ` Josh Poimboeuf
2020-08-03 11:39   ` Evgenii Shatokhin
2020-08-03 17:45     ` Kees Cook
2020-08-03 18:17       ` Joe Lawrence
2020-08-03 19:38         ` Frank Ch. Eigler
2020-08-03 20:11           ` Kees Cook
2020-08-03 21:12             ` Frank Ch. Eigler
2020-08-03 21:41               ` Kees Cook
2020-08-04  0:48                 ` Frank Ch. Eigler

Live-Patching Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/live-patching/0 live-patching/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 live-patching live-patching/ https://lore.kernel.org/live-patching \
		live-patching@vger.kernel.org
	public-inbox-index live-patching

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.live-patching


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git