linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Rian Quinn <rianquinn@gmail.com>
To: ebiederm@xmission.com
Cc: linux-kernel@vger.kernel.org, x86@kernel.org
Subject: Re: x86_64 INIT/SIPI Bug
Date: Fri, 9 Nov 2018 09:13:36 -0700	[thread overview]
Message-ID: <CANmpu8Wj+Kq7nnWA48QPcn9ZyJT+=-LcRnbzxkHJL1wq4U_d2Q@mail.gmail.com> (raw)
In-Reply-To: <8736sbkq2w.fsf@xmission.com>

>> I apologize upfront if this is the wrong place to post this, pretty new to this.
>>
>> We are working on the Bareflank Hypervisor (www.bareflank.org), and we
>> are passing through the INIT/SIPI process (similar to how a VMX
>> rootkit from EFI might boot the OS) and we noticed that on Arch Linux,
>> the INIT/SIPI process stalls, something we are not seeing on Ubuntu.
>>
>> Turns out, to fix the issue, we had to turn on cpu_init_udelay=10000.
>> The problem is, once a hypervisor is turned on, even one that is doing
>> nothing but passing through the instructions, the "quick" that is
>> detailed below fails as the kernel is not giving the CPU enough time
>> to perform a VMExit/VMEntry (even through the VMExit is not doing
>> anything).
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/smpboot.c?h=v4.20-rc1#n650
>>
>> You can see our INIT/SIPI code here if you are interested:
>> https://github.com/rianquinn/extended_apis/blob/hyperkernel_1/bfvmm/src/hve/arch/intel_x64/vmexit/init_signal.cpp
>>
>> The reason I suggest this is a bug is the manual clearly states that a
>> wait is required and the "quirk" that turns off this delay prevents
>> code like this from working. Would it be possible to either:
>> - Turn this off by default, but still allow someone to turn it on if
>> they are confident the delay is not needed?
>> - Provide a generic way to turn this off (maybe if a hypervisor is
>> detected, it defaults to off)?
>>
>> I'd be more than happy to provide a patch and test, but I'm not sure
>> if there is any interest in changing this code.
>
> I would suggest testing either for your hypervisor or simply for being
> inside some hypervisor.  As I read the code it is only turning off the
> 10ms delay on processors where it is know that it is safe to do so.
> This is a case where it is not safe to disable the 10ms delay,
> so it makes sense not not turn off the delay.

I think the best solution would be to simply check CPUID leaf 0x40000000
as every hypervisor (at least the legit ones) set this leaf to a unique
ID that can be used to see if a hypervisor is running. Specifically, I
would patch the smp_quirk_init_udelay() function to check to see if
CPUID 0x40000000 returns 0. If it does, no hypervisor is running in which case
the quirk can set the delay to 0. If the leaf is not 0, the quirk is ignored
and the delay is set to UDELAY_10MS_DEFAULT as it does already.

Thoughts? I think the patch would look something like:

    static void __init smp_quirk_init_udelay(void)
    {
+       unsigned int eax, ebx, ecx, edx;
+
+    eax = 0x40000000;
+    ecx = 0;
+    native_cpuid(&eax, &ebx, &ecx, &edx);
+
        /* if cmdline changed it from default, leave it alone */
        if (init_udelay != UINT_MAX)
            return;

+       /* ensure a hypervisor is not running */
+       if (ebx == 0) {
+           /* if modern processor, use no delay */
+           if (((boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) &&
(boot_cpu_data.x86 == 6)) ||
+               ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD) &&
(boot_cpu_data.x86 >= 0xF))) {
+               init_udelay = 0;
+               return;
+           }
+       }
+
-       /* if modern processor, use no delay */
-       if (((boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) &&
(boot_cpu_data.x86 == 6)) ||
-           ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD) &&
(boot_cpu_data.x86 >= 0xF))) {
-           init_udelay = 0;
-           return;
-       }
        /* else, use legacy delay */
        init_udelay = UDELAY_10MS_DEFAULT;
    }

On Thu, Nov 8, 2018 at 6:16 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> Rian Quinn <rianquinn@gmail.com> writes:
>
> > I apologize upfront if this is the wrong place to post this, pretty new to this.
> >
> > We are working on the Bareflank Hypervisor (www.bareflank.org), and we
> > are passing through the INIT/SIPI process (similar to how a VMX
> > rootkit from EFI might boot the OS) and we noticed that on Arch Linux,
> > the INIT/SIPI process stalls, something we are not seeing on Ubuntu.
> >
> > Turns out, to fix the issue, we had to turn on cpu_init_udelay=10000.
> > The problem is, once a hypervisor is turned on, even one that is doing
> > nothing but passing through the instructions, the "quick" that is
> > detailed below fails as the kernel is not giving the CPU enough time
> > to perform a VMExit/VMEntry (even through the VMExit is not doing
> > anything).
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/smpboot.c?h=v4.20-rc1#n650
> >
> > You can see our INIT/SIPI code here if you are interested:
> > https://github.com/rianquinn/extended_apis/blob/hyperkernel_1/bfvmm/src/hve/arch/intel_x64/vmexit/init_signal.cpp
> >
> > The reason I suggest this is a bug is the manual clearly states that a
> > wait is required and the "quirk" that turns off this delay prevents
> > code like this from working. Would it be possible to either:
> > - Turn this off by default, but still allow someone to turn it on if
> > they are confident the delay is not needed?
> > - Provide a generic way to turn this off (maybe if a hypervisor is
> > detected, it defaults to off)?
> >
> > I'd be more than happy to provide a patch and test, but I'm not sure
> > if there is any interest in changing this code.
>
> I would suggest testing either for your hypervisor or simply for being
> inside some hypervisor.  As I read the code it is only turning off the
> 10ms delay on processors where it is know that it is safe to do so.
> This is a case where it is not safe to disable the 10ms delay,
> so it makes sense not not turn off the delay.
>
> Eric
>

  reply	other threads:[~2018-11-09 16:13 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-08 22:23 x86_64 INIT/SIPI Bug Rian Quinn
2018-11-09  1:16 ` Eric W. Biederman
2018-11-09 16:13   ` Rian Quinn [this message]
2018-11-09 17:49 ` Sean Christopherson
2018-11-09 18:04   ` Rian Quinn
2018-11-09 18:42     ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CANmpu8Wj+Kq7nnWA48QPcn9ZyJT+=-LcRnbzxkHJL1wq4U_d2Q@mail.gmail.com' \
    --to=rianquinn@gmail.com \
    --cc=ebiederm@xmission.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).