From: Daniel Borkmann <daniel@iogearbox.net>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Kees Cook <keescook@chromium.org>,
Laura Abbott <labbott@redhat.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Ingo Molnar <mingo@kernel.org>, Peter Anvin <hpa@zytor.com>,
Fengguang Wu <fengguang.wu@intel.com>,
Network Development <netdev@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>, LKP <lkp@01.org>,
ast@fb.com, the arch/x86 maintainers <x86@kernel.org>,
"David S. Miller" <davem@davemloft.net>
Subject: Re: [net/bpf] 3051bf36c2 BUG: unable to handle kernel paging request at 0000a7cf
Date: Thu, 09 Mar 2017 18:51:03 +0100
Message-ID: <58C19607.6000605@iogearbox.net>
In-Reply-To: <alpine.DEB.2.20.1703091547460.3521@nanos>
On 03/09/2017 03:49 PM, Thomas Gleixner wrote:
> On Thu, 9 Mar 2017, Daniel Borkmann wrote:
>> On 03/09/2017 02:10 PM, Thomas Gleixner wrote:
>>> On Thu, 9 Mar 2017, Daniel Borkmann wrote:
>>>> With regard to CPA_FLUSHTLB that Linus mentioned, when I investigated
>>>> code paths in change_page_attr_set_clr(), I did see that CPA_FLUSHTLB
>>>> was set each time we switched attrs and a cpa_flush_range() was
>>>> performed (with the correct number of pages and cache set to 0). That
>>>> would be a __flush_tlb_all() eventually.
>>>>
>>>> Hmm, it indeed might seem likely that this could be an emulation bug.
>>>
>>> Which variant of __flush_tlb_all() is used when the test fails?
>>>
>>> Check for the following flags in /proc/cpuinfo: pge invpcid
>>
>> I added the following and booted with both variants:
>>
>> printk("X86_FEATURE_PGE:%u\n", static_cpu_has(X86_FEATURE_PGE));
>> printk("X86_FEATURE_INVPCID:%u\n", static_cpu_has(X86_FEATURE_INVPCID));
>>
>> "-cpu host" gives:
>>
>> [ 8.326117] X86_FEATURE_PGE:1
>> [ 8.326381] X86_FEATURE_INVPCID:1
>>
>> "-cpu kvm64" gives:
>>
>> [ 8.517069] X86_FEATURE_PGE:1
>> [ 8.517393] X86_FEATURE_INVPCID:0
>
> That's the one which fails. So it's using the CR4 based flushing. Just ran
> a test on a physical system with PGE=1 and INVPCID=0. Works fine.
>
> Emulation problem?
So in the git qemu code base (target/i386/helper.c), cr3 vs cr4 looks
like the following, both sharing the tlb_flush() itself:

void cpu_x86_update_cr3(CPUX86State *env, target_ulong new_cr3)
{
    X86CPU *cpu = x86_env_get_cpu(env);
    env->cr[3] = new_cr3;
    if (env->cr[0] & CR0_PG_MASK) {
        qemu_log_mask(CPU_LOG_MMU,
                      "CR3 update: CR3=" TARGET_FMT_lx "\n", new_cr3);
        tlb_flush(CPU(cpu));
    }
}

void cpu_x86_update_cr4(CPUX86State *env, uint32_t new_cr4)
{
    X86CPU *cpu = x86_env_get_cpu(env);
    uint32_t hflags;
#if defined(DEBUG_MMU)
    printf("CR4 update: %08x -> %08x\n", (uint32_t)env->cr[4], new_cr4);
#endif
    if ((new_cr4 ^ env->cr[4]) &
        (CR4_PGE_MASK | CR4_PAE_MASK | CR4_PSE_MASK |
         CR4_SMEP_MASK | CR4_SMAP_MASK | CR4_LA57_MASK)) {
        tlb_flush(CPU(cpu));
    }
    [...]
}

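As a rough illustration of the CR4 check above, here is a minimal
user-space sketch (not qemu code) of when a CR4 write would reach
tlb_flush(). The helper names cr4_write_flushes() and self_check() are
made up for this example; the bit positions are the architectural CR4
bit numbers, and the mask is assumed to mirror the one in the snippet
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural CR4 bit positions (x86). */
#define CR4_PSE_MASK  (1u << 4)
#define CR4_PAE_MASK  (1u << 5)
#define CR4_PGE_MASK  (1u << 7)
#define CR4_LA57_MASK (1u << 12)
#define CR4_SMEP_MASK (1u << 20)
#define CR4_SMAP_MASK (1u << 21)

/*
 * Hypothetical helper: models only the condition guarding tlb_flush()
 * in cpu_x86_update_cr4(). A flush happens when at least one of the
 * listed bits differs between the old and the new CR4 value.
 */
static bool cr4_write_flushes(uint32_t old_cr4, uint32_t new_cr4)
{
    const uint32_t flush_bits = CR4_PGE_MASK | CR4_PAE_MASK | CR4_PSE_MASK |
                                CR4_SMEP_MASK | CR4_SMAP_MASK | CR4_LA57_MASK;

    return ((new_cr4 ^ old_cr4) & flush_bits) != 0;
}

/* Hypothetical self-check; 0x610 is just an example value with PGE clear. */
static void self_check(void)
{
    /* Writing back an unchanged value never satisfies the condition. */
    assert(!cr4_write_flushes(0x610u, 0x610u));
    /* Flipping PGE in either direction does. */
    assert(cr4_write_flushes(0x610u, 0x610u | CR4_PGE_MASK));
    assert(cr4_write_flushes(0x610u | CR4_PGE_MASK, 0x610u));
}
```

Calling self_check() from a main() should pass silently. The point is
only that, under this model, a write that leaves all of the listed bits
unchanged does not flush.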
I added some debugging around __native_flush_tlb_global_irq_disabled()
and if I understand it correctly, the idea of cr4 is that we need to
toggle X86_CR4_PGE in order to trigger a TLB flush.
What I see is that the original cr4 is 0x610. The cpu_tlbstate.cr4 is
consistent with native_read_cr4(), and since cr4 is != 0, it tells me,
based on the comment in native_read_cr4(), that cr4 seems to be
supported. X86_CR4_PGE (0x80) is not set in 0x610, so masking it out
changes nothing, which means we end up writing ...

	native_write_cr4(0x610);
	native_write_cr4(0x610);

... the same value twice, and this just doesn't trigger the desired TLB
flush. I changed the code into the following ...

 	cr4 = this_cpu_read(cpu_tlbstate.cr4);
 	/* clear PGE */
-	native_write_cr4(cr4 & ~X86_CR4_PGE);
+	native_write_cr4(cr4 ^ X86_CR4_PGE);
 	/* write old PGE again and flush TLBs */
 	native_write_cr4(cr4);

... and the test cases seem to be working for me now with "-cpu kvm64",
so that seems to trigger the TLB flush we were missing.
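
To make the arithmetic concrete, here is a small user-space sketch of
what the first native_write_cr4() argument evaluates to in each variant.
The helper names first_write_clear(), first_write_toggle() and
self_check() are made up for this example, and X86_CR4_PGE is assumed to
be bit 7 (0x80), as in the kernel headers.

```c
#include <assert.h>

/* Assumed to match the kernel's definition: PGE is CR4 bit 7. */
#define X86_CR4_PGE (1ul << 7)

/* Old variant: always clears PGE, a no-op if PGE was already clear. */
static unsigned long first_write_clear(unsigned long cr4)
{
    return cr4 & ~X86_CR4_PGE;
}

/* New variant: flips PGE, so the value always differs from cr4. */
static unsigned long first_write_toggle(unsigned long cr4)
{
    return cr4 ^ X86_CR4_PGE;
}

/* Hypothetical self-check for the two cases of interest. */
static void self_check(void)
{
    /* PGE clear (the 0x610 seen here): only the toggle changes cr4. */
    assert(first_write_clear(0x610ul) == 0x610ul);
    assert(first_write_toggle(0x610ul) == 0x690ul);

    /* PGE set: both variants produce the same intermediate value. */
    assert(first_write_clear(0x690ul) == 0x610ul);
    assert(first_write_toggle(0x690ul) == 0x610ul);
}
```

So when PGE is already set the two variants behave identically, and
they only differ in the case where PGE is clear to begin with.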
I don't know enough about x86 internals to tell whether the change is
sane, but it seems to work at least for qemu, fwiw. ;) Thoughts?
Thanks,
Daniel