kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH v4 03/12] x86/mtrr: support setting MTRR state for software defined MTRRs
       [not found] ` <20230306163425.8324-4-jgross@suse.com>
@ 2023-03-20 12:59   ` Huang, Kai
  2023-03-20 13:47     ` Juergen Gross
  0 siblings, 1 reply; 5+ messages in thread
From: Huang, Kai @ 2023-03-20 12:59 UTC (permalink / raw)
  To: Gross, Jurgen, linux-kernel, x86
  Cc: Christopherson,,
	Sean, bp, dave.hansen, hpa, mingo, tglx, pbonzini, kvm, Yamahata,
	Isaku

On Mon, 2023-03-06 at 17:34 +0100, Juergen Gross wrote:
> When running virtualized, MTRR access can be reduced (e.g. in Xen PV
> guests or when running as a SEV-SNP guest under Hyper-V). Typically
> the hypervisor will reset the MTRR feature in CPUID data, resulting
> in no MTRR memory type information being available for the kernel.
> 
> This has turned out to result in problems:
> 
> - Hyper-V SEV-SNP guests using uncached mappings where they shouldn't
> - Xen PV dom0 mapping memory as WB which should be UC- instead
> 
> Solve those problems by supporting to set a static MTRR state,
> overwriting the empty state used today. In case such a state has been
> set, don't call get_mtrr_state() in mtrr_bp_init(). The set state
> will only be used by mtrr_type_lookup(), as in all other cases
> mtrr_enabled() is being checked, which will return false. Accept the
> overwrite call only for selected cases when running as a guest.
> Disable X86_FEATURE_MTRR in order to avoid any MTRR modifications by
> just refusing them.
> 
> 
[...]

> diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
> index ee09d359e08f..49b4cc923312 100644
> --- a/arch/x86/kernel/cpu/mtrr/generic.c
> +++ b/arch/x86/kernel/cpu/mtrr/generic.c
> @@ -8,10 +8,12 @@
>  #include <linux/init.h>
>  #include <linux/io.h>
>  #include <linux/mm.h>
> -
> +#include <linux/cc_platform.h>
>  #include <asm/processor-flags.h>
>  #include <asm/cacheinfo.h>
>  #include <asm/cpufeature.h>
> +#include <asm/hypervisor.h>
> +#include <asm/mshyperv.h>

Is <asm/mshyperv.h> needed here?

>  #include <asm/tlbflush.h>
>  #include <asm/mtrr.h>
>  #include <asm/msr.h>
> @@ -240,6 +242,48 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
>  	return mtrr_state.def_type;
>  }
>  
> +/**
> + * mtrr_overwrite_state - set static MTRR state
> + *
> + * Used to set MTRR state via different means (e.g. with data obtained from
> + * a hypervisor).

+KVM list and KVM maintainers,

IIUC in the next patch, SEV-SNP guest only sets a synthetic MTRR w/o telling the
hypervisor (hyperv).  I think this works for SEV-SNP running on top of hyperv
because they have mutual understanding?

What about the SNP guest running on other hypervisors such as KVM?

Since this code covers TDX guest too, I think eventually it makes sense for TDX
guest to use this function too (to avoid #VE IIUC).  If want to do that, then I
think TDX guest should have the same mutual understanding with *ALL* hypervisor,
as I am not sure what's the point of making the TDX guest's MTRR behaviour
depending on specific hypervisor.

For now I don't see there's any use case for TDX guest to use non-WB memory type
(in fact, KVM always maps guest memory as WB if there's no non-coherent DMA to
the guest memory), so to me it seems it's OK to make a universal mutual
understanding that TDX guest will always have WB memory type for all memory.

But, I am not sure whether it's better to have a standard hypercall between
guest & hypervisor for this purpose so things can be more flexible?

> + * Is allowed only for special cases when running virtualized. Must be called
> + * from the x86_init.hyper.init_platform() hook. X86_FEATURE_MTRR must be off.
> + */
> +void mtrr_overwrite_state(struct mtrr_var_range *var, unsigned int num_var,
> +			  mtrr_type def_type)
> +{
> +	unsigned int i;
> +
> +	if (WARN_ON(mtrr_state_set ||
> +		    hypervisor_is_type(X86_HYPER_NATIVE) ||
> +		    !cpu_feature_enabled(X86_FEATURE_HYPERVISOR) ||
> +		    (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP) &&
> +		     !hv_is_isolation_supported() &&
> +		     !cpu_feature_enabled(X86_FEATURE_XENPV) &&
> +		     !cpu_feature_enabled(X86_FEATURE_TDX_GUEST))))
> +		return;
> +
> +	/* Disable MTRR in order to disable MTRR modifications. */
> +	setup_clear_cpu_cap(X86_FEATURE_MTRR);
> +
> +	if (var) {
> +		if (num_var > MTRR_MAX_VAR_RANGES) {
> +			pr_warn("Trying to overwrite MTRR state with %u variable entries\n",
> +				num_var);
> +			num_var = MTRR_MAX_VAR_RANGES;
> +		}
> +		for (i = 0; i < num_var; i++)
> +			mtrr_state.var_ranges[i] = var[i];
> +		num_var_ranges = num_var;
> +	}
> +
> +	mtrr_state.def_type = def_type;
> +	mtrr_state.enabled |= MTRR_STATE_MTRR_ENABLED;
> +
> +	mtrr_state_set = 1;
> +}
> +
>  /**
>   * mtrr_type_lookup - look up memory type in MTRR
>   *
> diff --git a/arch/x86/kernel/cpu/mtrr/mtrr.c b/arch/x86/kernel/cpu/mtrr/mtrr.c
> index 7596ebeab929..5fe62ee0361b 100644
> --- a/arch/x86/kernel/cpu/mtrr/mtrr.c
> +++ b/arch/x86/kernel/cpu/mtrr/mtrr.c
> @@ -666,6 +666,15 @@ void __init mtrr_bp_init(void)
>  	const char *why = "(not available)";
>  	unsigned int phys_addr;
>  
> +	if (mtrr_state.enabled) {
> +		/* Software overwrite of MTRR state, only for generic case. */
> +		mtrr_calc_physbits(true);
> +		init_table();
> +		pr_info("MTRRs set to read-only\n");
> +
> +		return;
> +	}
> +
>  	phys_addr = mtrr_calc_physbits(boot_cpu_has(X86_FEATURE_MTRR));
>  
>  	if (boot_cpu_has(X86_FEATURE_MTRR)) {
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 16babff771bd..0cccfeb67c3a 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -1037,6 +1037,8 @@ void __init setup_arch(char **cmdline_p)
>  	/*
>  	 * VMware detection requires dmi to be available, so this
>  	 * needs to be done after dmi_setup(), for the boot CPU.
> +	 * For some guest types (Xen PV, SEV-SNP, TDX) it is required to be
> +	 * called before cache_bp_init() for setting up MTRR state.
>  	 */
>  	init_hypervisor_platform();
>  


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v4 03/12] x86/mtrr: support setting MTRR state for software defined MTRRs
  2023-03-20 12:59   ` [PATCH v4 03/12] x86/mtrr: support setting MTRR state for software defined MTRRs Huang, Kai
@ 2023-03-20 13:47     ` Juergen Gross
  2023-03-20 21:34       ` Huang, Kai
  2023-03-20 22:42       ` Borislav Petkov
  0 siblings, 2 replies; 5+ messages in thread
From: Juergen Gross @ 2023-03-20 13:47 UTC (permalink / raw)
  To: Huang, Kai, linux-kernel, x86
  Cc: Christopherson,,
	Sean, bp, dave.hansen, hpa, mingo, tglx, pbonzini, kvm, Yamahata,
	Isaku


[-- Attachment #1.1.1: Type: text/plain, Size: 3640 bytes --]

On 20.03.23 13:59, Huang, Kai wrote:
> On Mon, 2023-03-06 at 17:34 +0100, Juergen Gross wrote:
>> When running virtualized, MTRR access can be reduced (e.g. in Xen PV
>> guests or when running as a SEV-SNP guest under Hyper-V). Typically
>> the hypervisor will reset the MTRR feature in CPUID data, resulting
>> in no MTRR memory type information being available for the kernel.
>>
>> This has turned out to result in problems:
>>
>> - Hyper-V SEV-SNP guests using uncached mappings where they shouldn't
>> - Xen PV dom0 mapping memory as WB which should be UC- instead
>>
>> Solve those problems by supporting to set a static MTRR state,
>> overwriting the empty state used today. In case such a state has been
>> set, don't call get_mtrr_state() in mtrr_bp_init(). The set state
>> will only be used by mtrr_type_lookup(), as in all other cases
>> mtrr_enabled() is being checked, which will return false. Accept the
>> overwrite call only for selected cases when running as a guest.
>> Disable X86_FEATURE_MTRR in order to avoid any MTRR modifications by
>> just refusing them.
>>
>>
> [...]
> 
>> diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
>> index ee09d359e08f..49b4cc923312 100644
>> --- a/arch/x86/kernel/cpu/mtrr/generic.c
>> +++ b/arch/x86/kernel/cpu/mtrr/generic.c
>> @@ -8,10 +8,12 @@
>>   #include <linux/init.h>
>>   #include <linux/io.h>
>>   #include <linux/mm.h>
>> -
>> +#include <linux/cc_platform.h>
>>   #include <asm/processor-flags.h>
>>   #include <asm/cacheinfo.h>
>>   #include <asm/cpufeature.h>
>> +#include <asm/hypervisor.h>
>> +#include <asm/mshyperv.h>
> 
> Is <asm/mshyperv.h> needed here?

Yes, for hv_is_isolation_supported().

> 
>>   #include <asm/tlbflush.h>
>>   #include <asm/mtrr.h>
>>   #include <asm/msr.h>
>> @@ -240,6 +242,48 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
>>   	return mtrr_state.def_type;
>>   }
>>   
>> +/**
>> + * mtrr_overwrite_state - set static MTRR state
>> + *
>> + * Used to set MTRR state via different means (e.g. with data obtained from
>> + * a hypervisor).
> 
> +KVM list and KVM maintainers,
> 
> IIUC in the next patch, SEV-SNP guest only sets a synthetic MTRR w/o telling the
> hypervisor (hyperv).  I think this works for SEV-SNP running on top of hyperv
> because they have mutual understanding?
> 
> What about the SNP guest running on other hypervisors such as KVM?
> 
> Since this code covers TDX guest too, I think eventually it makes sense for TDX
> guest to use this function too (to avoid #VE IIUC).  If want to do that, then I
> think TDX guest should have the same mutual understanding with *ALL* hypervisor,
> as I am not sure what's the point of making the TDX guest's MTRR behaviour
> depending on specific hypervisor.

This series tries to fix the current fallout.

Boris Petkov asked for the hypervisor specific tests to be added, so I've
added them after discussing the topic with him (he is the maintainer of
this code after all).

> For now I don't see there's any use case for TDX guest to use non-WB memory type
> (in fact, KVM always maps guest memory as WB if there's no non-coherent DMA to
> the guest memory), so to me it seems it's OK to make a universal mutual
> understanding that TDX guest will always have WB memory type for all memory.

I agree.

> But, I am not sure whether it's better to have a standard hypercall between
> guest & hypervisor for this purpose so things can be more flexible?

Maybe. But for now we need to handle the current situation.


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3149 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v4 03/12] x86/mtrr: support setting MTRR state for software defined MTRRs
  2023-03-20 13:47     ` Juergen Gross
@ 2023-03-20 21:34       ` Huang, Kai
  2023-03-20 22:42       ` Borislav Petkov
  1 sibling, 0 replies; 5+ messages in thread
From: Huang, Kai @ 2023-03-20 21:34 UTC (permalink / raw)
  To: Gross, Jurgen, linux-kernel, x86
  Cc: Christopherson,,
	Sean, bp, dave.hansen, hpa, mingo, tglx, pbonzini, kvm, Yamahata,
	Isaku

On Mon, 2023-03-20 at 14:47 +0100, Juergen Gross wrote:
> On 20.03.23 13:59, Huang, Kai wrote:
> > On Mon, 2023-03-06 at 17:34 +0100, Juergen Gross wrote:
> > > When running virtualized, MTRR access can be reduced (e.g. in Xen PV
> > > guests or when running as a SEV-SNP guest under Hyper-V). Typically
> > > the hypervisor will reset the MTRR feature in CPUID data, resulting
> > > in no MTRR memory type information being available for the kernel.
> > > 
> > > This has turned out to result in problems:
> > > 
> > > - Hyper-V SEV-SNP guests using uncached mappings where they shouldn't
> > > - Xen PV dom0 mapping memory as WB which should be UC- instead
> > > 
> > > Solve those problems by supporting to set a static MTRR state,
> > > overwriting the empty state used today. In case such a state has been
> > > set, don't call get_mtrr_state() in mtrr_bp_init(). The set state
> > > will only be used by mtrr_type_lookup(), as in all other cases
> > > mtrr_enabled() is being checked, which will return false. Accept the
> > > overwrite call only for selected cases when running as a guest.
> > > Disable X86_FEATURE_MTRR in order to avoid any MTRR modifications by
> > > just refusing them.
> > > 
> > > 
> > [...]
> > 
> > > diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
> > > index ee09d359e08f..49b4cc923312 100644
> > > --- a/arch/x86/kernel/cpu/mtrr/generic.c
> > > +++ b/arch/x86/kernel/cpu/mtrr/generic.c
> > > @@ -8,10 +8,12 @@
> > >   #include <linux/init.h>
> > >   #include <linux/io.h>
> > >   #include <linux/mm.h>
> > > -
> > > +#include <linux/cc_platform.h>
> > >   #include <asm/processor-flags.h>
> > >   #include <asm/cacheinfo.h>
> > >   #include <asm/cpufeature.h>
> > > +#include <asm/hypervisor.h>
> > > +#include <asm/mshyperv.h>
> > 
> > Is <asm/mshyperv.h> needed here?
> 
> Yes, for hv_is_isolation_supported().
> 
> > 
> > >   #include <asm/tlbflush.h>
> > >   #include <asm/mtrr.h>
> > >   #include <asm/msr.h>
> > > @@ -240,6 +242,48 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
> > >   	return mtrr_state.def_type;
> > >   }
> > >   
> > > +/**
> > > + * mtrr_overwrite_state - set static MTRR state
> > > + *
> > > + * Used to set MTRR state via different means (e.g. with data obtained from
> > > + * a hypervisor).
> > 
> > +KVM list and KVM maintainers,
> > 
> > IIUC in the next patch, SEV-SNP guest only sets a synthetic MTRR w/o telling the
> > hypervisor (hyperv).  I think this works for SEV-SNP running on top of hyperv
> > because they have mutual understanding?
> > 
> > What about the SNP guest running on other hypervisors such as KVM?
> > 
> > Since this code covers TDX guest too, I think eventually it makes sense for TDX
> > guest to use this function too (to avoid #VE IIUC).  If want to do that, then I
> > think TDX guest should have the same mutual understanding with *ALL* hypervisor,
> > as I am not sure what's the point of making the TDX guest's MTRR behaviour
> > depending on specific hypervisor.
> 
> This series tries to fix the current fallout.
> 
> Boris Petkov asked for the hypervisor specific tests to be added, so I've
> added them after discussing the topic with him (he is the maintainer of
> this code after all).
> 
> > For now I don't see there's any use case for TDX guest to use non-WB memory type
> > (in fact, KVM always maps guest memory as WB if there's no non-coherent DMA to
> > the guest memory), so to me it seems it's OK to make a universal mutual
> > understanding that TDX guest will always have WB memory type for all memory.
> 
> I agree.
> 
> > But, I am not sure whether it's better to have a standard hypercall between
> > guest & hypervisor for this purpose so things can be more flexible?
> 
> Maybe. But for now we need to handle the current situation.
> 
> 

Agreed.  Thanks for explaining.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v4 03/12] x86/mtrr: support setting MTRR state for software defined MTRRs
  2023-03-20 13:47     ` Juergen Gross
  2023-03-20 21:34       ` Huang, Kai
@ 2023-03-20 22:42       ` Borislav Petkov
  2023-03-21  6:01         ` Juergen Gross
  1 sibling, 1 reply; 5+ messages in thread
From: Borislav Petkov @ 2023-03-20 22:42 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Huang, Kai, linux-kernel, x86, Christopherson,,
	Sean, dave.hansen, hpa, mingo, tglx, pbonzini, kvm, Yamahata,
	Isaku

On Mon, Mar 20, 2023 at 02:47:30PM +0100, Juergen Gross wrote:
> > Since this code covers TDX guest too, I think eventually it makes sense for TDX
> > guest to use this function too (to avoid #VE IIUC).  If want to do that, then I
> > think TDX guest should have the same mutual understanding with *ALL* hypervisor,
> > as I am not sure what's the point of making the TDX guest's MTRR behaviour
> > depending on specific hypervisor.
> 
> This series tries to fix the current fallout.

We can relax the check so that it runs on TDX too. Along with a comment
above it why it needs to run on TDX.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v4 03/12] x86/mtrr: support setting MTRR state for software defined MTRRs
  2023-03-20 22:42       ` Borislav Petkov
@ 2023-03-21  6:01         ` Juergen Gross
  0 siblings, 0 replies; 5+ messages in thread
From: Juergen Gross @ 2023-03-21  6:01 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Huang, Kai, linux-kernel, x86, Christopherson,,
	Sean, dave.hansen, hpa, mingo, tglx, pbonzini, kvm, Yamahata,
	Isaku


[-- Attachment #1.1.1: Type: text/plain, Size: 697 bytes --]

On 20.03.23 23:42, Borislav Petkov wrote:
> On Mon, Mar 20, 2023 at 02:47:30PM +0100, Juergen Gross wrote:
>>> Since this code covers TDX guest too, I think eventually it makes sense for TDX
>>> guest to use this function too (to avoid #VE IIUC).  If want to do that, then I
>>> think TDX guest should have the same mutual understanding with *ALL* hypervisor,
>>> as I am not sure what's the point of making the TDX guest's MTRR behaviour
>>> depending on specific hypervisor.
>>
>> This series tries to fix the current fallout.
> 
> We can relax the check so that it runs on TDX too. Along with a comment
> above it why it needs to run on TDX.
> 

Okay, fine with me.


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3149 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-03-21  6:01 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20230306163425.8324-1-jgross@suse.com>
     [not found] ` <20230306163425.8324-4-jgross@suse.com>
2023-03-20 12:59   ` [PATCH v4 03/12] x86/mtrr: support setting MTRR state for software defined MTRRs Huang, Kai
2023-03-20 13:47     ` Juergen Gross
2023-03-20 21:34       ` Huang, Kai
2023-03-20 22:42       ` Borislav Petkov
2023-03-21  6:01         ` Juergen Gross

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).