All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Ben Gardon <bgardon@google.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	Paolo Bonzini <pbonzini@redhat.com>, Peter Xu <peterx@redhat.com>,
	David Matlack <dmatlack@google.com>,
	Jim Mattson <jmattson@google.com>,
	David Dunn <daviddunn@google.com>,
	Jing Zhang <jingzhangos@google.com>,
	Junaid Shahid <junaids@google.com>
Subject: Re: [PATCH v5 08/10] KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis
Date: Wed, 13 Apr 2022 23:03:57 +0000	[thread overview]
Message-ID: <YldW3QEDM5Z0Y5Mn@google.com> (raw)
In-Reply-To: <20220413175944.71705-9-bgardon@google.com>

On Wed, Apr 13, 2022, Ben Gardon wrote:
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 72183ae628f7..021452a9fa91 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -7855,6 +7855,19 @@ At this time, KVM_PMU_CAP_DISABLE is the only capability.  Setting
>  this capability will disable PMU virtualization for that VM.  Usermode
>  should adjust CPUID leaf 0xA to reflect that the PMU is disabled.
>  
> +8.36 KVM_CAP_VM_DISABLE_NX_HUGE_PAGES
> +---------------------------
> +
> +:Capability KVM_CAP_PMU_CAPABILITY
> +:Architectures: x86
> +:Type: vm
> +:Returns 0 on success, -EPERM if the userspace process does not
> +	 have CAP_SYS_BOOT

Needs to document the -EINVAL cases, especially the requirement that this be
called before VMs are created.  The 

> +This capability disables the NX huge pages mitigation for iTLB MULTIHIT.
> +
> +The capability has no effect if the nx_huge_pages module parameter is not set.
> +
>  9. Known KVM API problems
>  =========================
>  
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 2c20f715f009..b8ab4fa7d4b2 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1240,6 +1240,8 @@ struct kvm_arch {
>  	hpa_t	hv_root_tdp;
>  	spinlock_t hv_root_tdp_lock;
>  #endif
> +
> +	bool disable_nx_huge_pages;
>  };
>  
>  struct kvm_vm_stat {
> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> index 671cfeccf04e..148f630af78a 100644
> --- a/arch/x86/kvm/mmu.h
> +++ b/arch/x86/kvm/mmu.h
> @@ -173,9 +173,10 @@ struct kvm_page_fault {
>  int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
>  
>  extern int nx_huge_pages;
> -static inline bool is_nx_huge_page_enabled(void)
> +static inline bool is_nx_huge_page_enabled(struct kvm *kvm)
>  {
> -	return READ_ONCE(nx_huge_pages);
> +	return READ_ONCE(nx_huge_pages) &&
> +	       !kvm->arch.disable_nx_huge_pages;

No need for a newline, that fits on a single line.

> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 566548a3efa7..03aa1e0f60e2 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -1469,7 +1469,8 @@ static int tdp_mmu_split_huge_page(struct kvm *kvm, struct tdp_iter *iter,
>  	 * not been linked in yet and thus is not reachable from any other CPU.
>  	 */
>  	for (i = 0; i < PT64_ENT_PER_PAGE; i++)
> -		sp->spt[i] = make_huge_page_split_spte(huge_spte, level, i);
> +		sp->spt[i] = make_huge_page_split_spte(kvm, huge_spte,
> +						       level, i);

Just let this poke past 80 chars.

> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 665c1fa8bb57..27631c3b53c2 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4286,6 +4286,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_SYS_ATTRIBUTES:
>  	case KVM_CAP_VAPIC:
>  	case KVM_CAP_ENABLE_CAP:
> +	case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
>  		r = 1;
>  		break;
>  	case KVM_CAP_EXIT_HYPERCALL:
> @@ -6079,6 +6080,28 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>  		}
>  		mutex_unlock(&kvm->lock);
>  		break;
> +	case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
> +		r = -EINVAL;
> +		if (cap->args[0])
> +			break;
> +
> +		/*
> +		 * Since the risk of disabling NX hugepages is a guest crashing
> +		 * the system, ensure the userspace process has permission to
> +		 * reboot the system.

Since I'm nitpicking already and there's also a comment...

Can you call out that, unlike the actual reboot() syscall, the process needs the
capability in the init? namespace (I don't actual know the terminology) because
exposing /dev/kvm into a container doesn't magically limit the iTLB multihit bug
to that container.  I.e. that this _must_ use capable(), not ns_capable().	

Amusingly, someone could subvert the selftest's SYS_reboot heuristic by running
the test in a container :-)

  reply	other threads:[~2022-04-13 23:04 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-13 17:59 [PATCH v5 00/10] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
2022-04-13 17:59 ` [PATCH v5 01/10] KVM: selftests: Remove dynamic memory allocation for stats header Ben Gardon
2022-04-13 17:59 ` [PATCH v5 02/10] KVM: selftests: Read binary stats header in lib Ben Gardon
2022-04-13 17:59 ` [PATCH v5 03/10] KVM: selftests: Read binary stats desc " Ben Gardon
2022-04-13 17:59 ` [PATCH v5 04/10] KVM: selftests: Clean up coding style in binary stats test Ben Gardon
2022-04-13 17:59 ` [PATCH v5 05/10] KVM: selftests: Read binary stat data in lib Ben Gardon
2022-04-13 17:59 ` [PATCH v5 06/10] KVM: selftests: Add NX huge pages test Ben Gardon
2022-04-13 22:35   ` Sean Christopherson
2022-04-14 20:59     ` Ben Gardon
2022-04-14 21:36       ` Sean Christopherson
2022-04-13 17:59 ` [PATCH v5 07/10] KVM: x86: Fix errant brace in KVM capability handling Ben Gardon
2022-04-13 17:59 ` [PATCH v5 08/10] KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis Ben Gardon
2022-04-13 23:03   ` Sean Christopherson [this message]
2022-04-13 17:59 ` [PATCH v5 09/10] KVM: selftests: Factor out calculation of pages needed for a VM Ben Gardon
2022-04-13 17:59 ` [PATCH v5 10/10] KVM: selftests: Test disabling NX hugepages on " Ben Gardon
2022-04-13 22:48   ` Sean Christopherson
2022-04-14 21:14     ` Ben Gardon
2022-04-14 22:29       ` Sean Christopherson
2022-04-13 21:21 ` [PATCH v5 00/10] KVM: x86: Add a cap to disable " David Matlack

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YldW3QEDM5Z0Y5Mn@google.com \
    --to=seanjc@google.com \
    --cc=bgardon@google.com \
    --cc=daviddunn@google.com \
    --cc=dmatlack@google.com \
    --cc=jingzhangos@google.com \
    --cc=jmattson@google.com \
    --cc=junaids@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.