linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Nadav Amit <nadav.amit@gmail.com>
To: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Andy Lutomirski <luto@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Jiri Kosina <jkosina@suse.cz>, Will Deacon <will.deacon@arm.com>,
	Benjamin Herrenschmidt <benh@au1.ibm.com>,
	Nick Piggin <npiggin@gmail.com>,
	the arch/x86 maintainers <x86@kernel.org>,
	Borislav Petkov <bp@alien8.de>, Rik van Riel <riel@surriel.com>,
	Jann Horn <jannh@google.com>,
	Adin Scannell <ascannell@google.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>, David Miller <davem@davemloft.net>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	Michael Ellerman <mpe@ellerman.id.au>
Subject: Re: TLB flushes on fixmap changes
Date: Mon, 27 Aug 2018 10:34:36 -0700	[thread overview]
Message-ID: <01DA0BDD-7504-4209-8A8F-20B27CF6A1C7@gmail.com> (raw)
In-Reply-To: <20180827170511.6bafa15cbc102ae135366e86@kernel.org>

at 1:05 AM, Masami Hiramatsu <mhiramat@kernel.org> wrote:

> On Sun, 26 Aug 2018 20:26:09 -0700
> Nadav Amit <nadav.amit@gmail.com> wrote:
> 
>> at 8:03 PM, Masami Hiramatsu <mhiramat@kernel.org> wrote:
>> 
>>> On Sun, 26 Aug 2018 11:09:58 +0200
>>> Peter Zijlstra <peterz@infradead.org> wrote:
>>> 
>>>> On Sat, Aug 25, 2018 at 09:21:22PM -0700, Andy Lutomirski wrote:
>>>>> I just re-read text_poke().  It's, um, horrible.  Not only is the
>>>>> implementation overcomplicated and probably buggy, but it's SLOOOOOW.
>>>>> It's totally the wrong API -- poking one instruction at a time
>>>>> basically can't be efficient on x86.  The API should either poke lots
>>>>> of instructions at once or should be text_poke_begin(); ...;
>>>>> text_poke_end();.
>>>> 
>>>> I don't think anybody ever cared about performance here. Only
>>>> correctness. That whole text_poke_bp() thing is entirely tricky.
>>> 
>>> Agreed. Self modification is a special event.
>>> 
>>>> FWIW, before text_poke_bp(), text_poke() would only be used from
>>>> stop_machine, so all the other CPUs would be stuck busy-waiting with
>>>> IRQs disabled. These days, yeah, that's lots more dodgy, but yes
>>>> text_mutex should be serializing all that.
>>> 
>>> I'm still not sure that speculative page-table walk can be done
>>> over the mutex. Also, if the fixmap area is for aliasing
>>> pages (which always mapped to memory), what kind of
>>> security issue can happen?
>> 
>> The PTE is accessible from other cores, so just as we assume for L1TF that
>> the every addressable memory might be cached in L1, we should assume and
>> PTE might be cached in the TLB when it is present.
> 
> Ok, so other cores can accidentally cache the PTE in TLB, (and no way
> to shoot down explicitly?)

There is way (although current it does not). But it seems that the consensus
is that it is better to avoid it being mapped at all in remote cores.

>> Although the mapping is for an alias, there are a couple of issues here.
>> First, this alias mapping is writable, so it might an attacker to change the
>> kernel code (following another initial attack).
> 
> Combined with some buffer overflow, correct? If the attacker already can
> write a kernel data directly, he is in the kernel mode.

Right.

> 
>> Second, the alias mapping is
>> never explicitly flushed. We may assume that once the original mapping is
>> removed/changed, a full TLB flush would take place, but there is no
>> guarantee it actually takes place.
> 
> Hmm, would this means a full TLB flush will not flush alias mapping?
> (or, the full TLB flush just doesn't work?)

It will flush the alias mapping, but currently there is no such explicit
flush.

>>> Anyway, from the viewpoint of kprobes, either per-cpu fixmap or
>>> changing CR3 sounds good to me. I think we don't even need per-cpu,
>>> it can call a thread/function on a dedicated core (like the first
>>> boot processor) and wait :) This may prevent leakage of pte change
>>> to other cores.
>> 
>> I implemented per-cpu fixmap, but I think that it makes more sense to take
>> peterz approach and set an entry in the PGD level. Per-CPU fixmap either
>> requires to pre-populate various levels in the page-table hierarchy, or
>> conditionally synchronize whenever module memory is allocated, since they
>> can share the same PGD, PUD & PMD. While usually the synchronization is not
>> needed, the possibility that synchronization is needed complicates locking.
> 
> Could you point which PeterZ approach you said? I guess it will be
> make a clone of PGD and use it for local page mapping (as new mm).
> If so, yes it sounds perfectly fine to me.

The thread is too long. What I think is best is having a mapping in the PGD
level. I’ll try to give it a shot, and see what I get.

>> Anyhow, having fixed addresses for the fixmap can be used to circumvent
>> KASLR.
> 
> I think text_poke doesn't mind using random address :)
> 
>> I don’t think a dedicated core is needed. Anyhow there is a lock
>> (text_mutex), so use_mm() can be used after acquiring the mutex.
> 
> Hmm, use_mm() said;
> 
> /*
> * use_mm
> *      Makes the calling kernel thread take on the specified
> *      mm context.
> *      (Note: this routine is intended to be called only
> *      from a kernel thread context)
> */
> 
> So maybe we need a dedicated kernel thread for safeness?

Yes, it says so. But I am not sure it cannot be changed, at least for this
specific use-case. Switching kernel threads just for patching seems to me as
an overkill.

Let me see if I can get something half-reasonable doing so...


  reply	other threads:[~2018-08-27 17:34 UTC|newest]

Thread overview: 96+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-22 15:30 [PATCH 0/4] x86: TLB invalidate fixes Peter Zijlstra
2018-08-22 15:30 ` [PATCH 1/4] x86/mm/tlb: Revert the recent lazy TLB patches Peter Zijlstra
2018-08-22 21:37   ` Linus Torvalds
2018-08-22 22:11     ` Rik van Riel
2018-08-22 15:30 ` [PATCH 2/4] mm/tlb: Remove tlb_remove_table() non-concurrent condition Peter Zijlstra
2018-08-23  3:31   ` Nicholas Piggin
2018-08-23  3:35     ` Linus Torvalds
2018-08-23  3:44       ` Linus Torvalds
2018-08-23  4:16       ` Nicholas Piggin
2018-08-23  4:54         ` Linus Torvalds
2018-08-23  5:15           ` Nicholas Piggin
2018-08-24  8:42           ` Peter Zijlstra
2018-08-23 13:40   ` Will Deacon
2018-08-22 15:30 ` [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE Peter Zijlstra
2018-08-22 15:55   ` Peter Zijlstra
2018-08-23  3:45     ` Nicholas Piggin
2018-08-23  3:59       ` Linus Torvalds
2018-08-23  4:33         ` Nicholas Piggin
2018-08-23  5:03           ` Linus Torvalds
2018-08-23  5:58             ` Nicholas Piggin
2018-08-23  4:54         ` Benjamin Herrenschmidt
2018-08-23  5:11           ` Linus Torvalds
2018-08-23  5:20             ` Linus Torvalds
2018-08-23  6:48               ` Martin Schwidefsky
2018-08-23  5:21             ` Benjamin Herrenschmidt
2018-08-23  6:15               ` Nicholas Piggin
2018-08-23 13:39             ` Will Deacon
2018-08-24  8:47               ` Peter Zijlstra
2018-08-24 11:32                 ` Peter Zijlstra
2018-08-24 11:39                   ` Peter Zijlstra
2018-08-27  5:00                     ` Nicholas Piggin
2018-08-27  7:47                       ` Peter Zijlstra
2018-08-27  8:04                         ` Nicholas Piggin
2018-08-27  8:09                           ` Benjamin Herrenschmidt
2018-08-27  8:20                             ` Peter Zijlstra
2018-08-27  8:54                               ` Nicholas Piggin
2018-08-27  9:02                             ` Nicholas Piggin
2018-08-27 22:13                               ` Benjamin Herrenschmidt
2018-08-27 13:36                           ` Rik van Riel
2018-08-27 14:29                             ` Nicholas Piggin
2018-08-27  8:57                         ` removig ia64, was: " Christoph Hellwig
2018-08-27 11:28                           ` Peter Zijlstra
2018-08-27 11:45                           ` Jason Duerstock
2018-08-27 11:00                         ` Peter Zijlstra
2018-08-30  0:13                           ` Vineet Gupta
2018-08-30 10:23                             ` Peter Zijlstra
2018-08-24 17:26                 ` Nadav Amit
2018-08-24 18:04                   ` Peter Zijlstra
2018-08-24 18:35                     ` TLB flushes on fixmap changes Nadav Amit
2018-08-24 19:31                       ` Linus Torvalds
2018-08-24 20:24                         ` Nadav Amit
2018-08-25  0:58                           ` Linus Torvalds
2018-08-25  2:29                             ` nadav.amit
2018-08-25  4:23                               ` Andy Lutomirski
2018-08-26  2:23                                 ` Masami Hiramatsu
2018-08-26  4:21                                   ` Andy Lutomirski
2018-08-26  4:43                                     ` Kees Cook
2018-08-26  5:53                                       ` Nadav Amit
2018-08-26 14:20                                       ` Andy Lutomirski
2018-08-26 16:47                                         ` Kees Cook
2018-08-26 17:25                                           ` Andy Lutomirski
2018-08-26 20:15                                             ` Thomas Gleixner
2018-08-26 22:03                                               ` Kees Cook
2018-08-26 22:15                                                 ` Matthew Wilcox
2018-08-26 22:29                                                 ` Jann Horn
2018-08-26  9:09                                     ` Peter Zijlstra
2018-08-27  3:03                                       ` Masami Hiramatsu
2018-08-27  3:26                                         ` Nadav Amit
2018-08-27  8:05                                           ` Masami Hiramatsu
2018-08-27 17:34                                             ` Nadav Amit [this message]
2018-08-27 18:45                                               ` Andy Lutomirski
2018-08-27 18:54                                                 ` Nadav Amit
2018-08-27 18:58                                                   ` Andy Lutomirski
2018-08-27 19:10                                                     ` Nadav Amit
2018-08-27 19:43                                                       ` Nadav Amit
2018-08-27 19:58                                                         ` Andy Lutomirski
2018-08-27 20:16                                                           ` Nadav Amit
2018-08-27 21:55                                                             ` Nadav Amit
2018-08-27 22:32                                                               ` Andy Lutomirski
2018-08-27 22:54                                                                 ` Nadav Amit
2018-08-27 23:01                                                                   ` Andy Lutomirski
2018-08-28  8:49                                                                     ` Masami Hiramatsu
2018-08-28 17:33                                                                       ` Nadav Amit
2018-08-27  8:13                                         ` Peter Zijlstra
2018-08-27  9:39                                           ` Masami Hiramatsu
2018-08-27  9:55                                           ` Jann Horn
2018-08-26 22:48                                     ` Jann Horn
2018-08-24  8:35           ` [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE Peter Zijlstra
2018-08-24 13:13             ` Peter Zijlstra
2018-08-24 13:14               ` Peter Zijlstra
2018-08-24 15:49               ` Will Deacon
2018-08-23 23:31     ` Will Deacon
2018-08-22 21:34   ` Linus Torvalds
2018-08-23  8:46   ` Nicholas Piggin
2018-08-22 15:30 ` [PATCH 4/4] x86/mm: Only use tlb_remove_table() for paravirt Peter Zijlstra
2018-08-22 22:12   ` Eduardo Valentin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=01DA0BDD-7504-4209-8A8F-20B27CF6A1C7@gmail.com \
    --to=nadav.amit@gmail.com \
    --cc=ascannell@google.com \
    --cc=benh@au1.ibm.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@intel.com \
    --cc=davem@davemloft.net \
    --cc=jannh@google.com \
    --cc=jkosina@suse.cz \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=mpe@ellerman.id.au \
    --cc=npiggin@gmail.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=riel@surriel.com \
    --cc=schwidefsky@de.ibm.com \
    --cc=torvalds@linux-foundation.org \
    --cc=will.deacon@arm.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).