* [XEN PATCH v1 1/1] Invalidate cache for cpus affinitized to the domain
       [not found] <cover.1607686878.git.havanur@amazon.com>
@ 2020-12-11 11:44 ` Harsha Shamsundara Havanur
  2020-12-12 13:35   ` Shamsundara Havanur, Harsha
  2020-12-14  8:52   ` Jan Beulich
  0 siblings, 2 replies; 9+ messages in thread
From: Harsha Shamsundara Havanur @ 2020-12-11 11:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Harsha Shamsundara Havanur

An HVM domain flushes the cache on all CPUs using the
`flush_all` macro, which uses cpu_online_map, during
i) creation of a new domain,
ii) when a device-model op is performed,
iii) when the domain is destroyed.

This triggers an IPI on all CPUs, thus affecting other
domains that are pinned to different pCPUs. This patch
restricts the cache flush to the set of CPUs affinitized to
the current domain, using `domain->dirty_cpumask`.
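
For reference, the substitution being made is roughly the following
(a sketch assuming the usual layering in xen/include/asm-x86/flushtlb.h;
not standalone code, and the exact definitions may differ by version):

/* Assumed shape of the existing flush primitives (sketch only). */
void flush_area_mask(const cpumask_t *mask, const void *va,
                     unsigned int flags);

/* Flush only the pCPUs named in @mask. */
#define flush_mask(mask, flags) flush_area_mask(mask, NULL, flags)
/* Flush every online pCPU (what the affected call sites do today). */
#define flush_all(flags)        flush_mask(&cpu_online_map, flags)

/*
 * The patch replaces flush_all(FLUSH_CACHE) with
 * flush_mask(d->dirty_cpumask, FLUSH_CACHE), so only the pCPUs the
 * domain has recently run on receive the cache-flush IPI.
 */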

Signed-off-by: Harsha Shamsundara Havanur <havanur@amazon.com>
---
 xen/arch/x86/hvm/hvm.c     | 2 +-
 xen/arch/x86/hvm/mtrr.c    | 6 +++---
 xen/arch/x86/hvm/svm/svm.c | 2 +-
 xen/arch/x86/hvm/vmx/vmx.c | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 54e32e4fe8..ec247c7010 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2219,7 +2219,7 @@ void hvm_shadow_handle_cd(struct vcpu *v, unsigned long value)
             domain_pause_nosync(v->domain);
 
             /* Flush physical caches. */
-            flush_all(FLUSH_CACHE);
+            flush_mask(v->domain->dirty_cpumask, FLUSH_CACHE);
             hvm_set_uc_mode(v, 1);
 
             domain_unpause(v->domain);
diff --git a/xen/arch/x86/hvm/mtrr.c b/xen/arch/x86/hvm/mtrr.c
index fb051d59c3..0d804c1fa0 100644
--- a/xen/arch/x86/hvm/mtrr.c
+++ b/xen/arch/x86/hvm/mtrr.c
@@ -631,7 +631,7 @@ int hvm_set_mem_pinned_cacheattr(struct domain *d, uint64_t gfn_start,
                         break;
                     /* fall through */
                 default:
-                    flush_all(FLUSH_CACHE);
+                    flush_mask(d->dirty_cpumask, FLUSH_CACHE);
                     break;
                 }
                 return 0;
@@ -683,7 +683,7 @@ int hvm_set_mem_pinned_cacheattr(struct domain *d, uint64_t gfn_start,
     list_add_rcu(&range->list, &d->arch.hvm.pinned_cacheattr_ranges);
     p2m_memory_type_changed(d);
     if ( type != PAT_TYPE_WRBACK )
-        flush_all(FLUSH_CACHE);
+        flush_mask(d->dirty_cpumask, FLUSH_CACHE);
 
     return 0;
 }
@@ -785,7 +785,7 @@ void memory_type_changed(struct domain *d)
          d->vcpu && d->vcpu[0] )
     {
         p2m_memory_type_changed(d);
-        flush_all(FLUSH_CACHE);
+        flush_mask(d->dirty_cpumask, FLUSH_CACHE);
     }
 }
 
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index cfea5b5523..383e763d7d 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -2395,7 +2395,7 @@ static void svm_vmexit_mce_intercept(
 static void svm_wbinvd_intercept(void)
 {
     if ( cache_flush_permitted(current->domain) )
-        flush_all(FLUSH_CACHE);
+        flush_mask(current->domain->dirty_cpumask, FLUSH_CACHE);
 }
 
 static void svm_vmexit_do_invalidate_cache(struct cpu_user_regs *regs,
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 86b8916a5d..a05c7036c4 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3349,7 +3349,7 @@ static void vmx_wbinvd_intercept(void)
         return;
 
     if ( cpu_has_wbinvd_exiting )
-        flush_all(FLUSH_CACHE);
+        flush_mask(current->domain->dirty_cpumask, FLUSH_CACHE);
     else
         wbinvd();
 }
-- 
2.16.6




* Re: [XEN PATCH v1 1/1] Invalidate cache for cpus affinitized to the domain
  2020-12-11 11:44 ` [XEN PATCH v1 1/1] Invalidate cache for cpus affinitized to the domain Harsha Shamsundara Havanur
@ 2020-12-12 13:35   ` Shamsundara Havanur, Harsha
  2020-12-14  8:52   ` Jan Beulich
  1 sibling, 0 replies; 9+ messages in thread
From: Shamsundara Havanur, Harsha @ 2020-12-12 13:35 UTC (permalink / raw)
  To: xen-devel; +Cc: jbeulich, andrew.cooper3

CCing Andy and Jan.
Is restricting the cache flush to the set of CPUs bound to the domain
the right thing to do?

On Fri, 2020-12-11 at 11:44 +0000, Harsha Shamsundara Havanur wrote:
> A HVM domain flushes cache on all the cpus using
> `flush_all` macro which uses cpu_online_map, during
> i) creation of a new domain
> ii) when device-model op is performed
> iii) when domain is destructed.
> 
> This triggers IPI on all the cpus, thus affecting other
> domains that are pinned to different pcpus. This patch
> restricts cache flush to the set of cpus affinitized to
> the current domain using `domain->dirty_cpumask`.
> 
> Signed-off-by: Harsha Shamsundara Havanur <havanur@amazon.com>
> ---
>  xen/arch/x86/hvm/hvm.c     | 2 +-
>  xen/arch/x86/hvm/mtrr.c    | 6 +++---
>  xen/arch/x86/hvm/svm/svm.c | 2 +-
>  xen/arch/x86/hvm/vmx/vmx.c | 2 +-
>  4 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 54e32e4fe8..ec247c7010 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -2219,7 +2219,7 @@ void hvm_shadow_handle_cd(struct vcpu *v,
> unsigned long value)
>              domain_pause_nosync(v->domain);
>  
>              /* Flush physical caches. */
> -            flush_all(FLUSH_CACHE);
> +            flush_mask(v->domain->dirty_cpumask, FLUSH_CACHE);
>              hvm_set_uc_mode(v, 1);
>  
>              domain_unpause(v->domain);
> diff --git a/xen/arch/x86/hvm/mtrr.c b/xen/arch/x86/hvm/mtrr.c
> index fb051d59c3..0d804c1fa0 100644
> --- a/xen/arch/x86/hvm/mtrr.c
> +++ b/xen/arch/x86/hvm/mtrr.c
> @@ -631,7 +631,7 @@ int hvm_set_mem_pinned_cacheattr(struct domain
> *d, uint64_t gfn_start,
>                          break;
>                      /* fall through */
>                  default:
> -                    flush_all(FLUSH_CACHE);
> +                    flush_mask(d->dirty_cpumask, FLUSH_CACHE);
>                      break;
>                  }
>                  return 0;
> @@ -683,7 +683,7 @@ int hvm_set_mem_pinned_cacheattr(struct domain
> *d, uint64_t gfn_start,
>      list_add_rcu(&range->list, &d-
> >arch.hvm.pinned_cacheattr_ranges);
>      p2m_memory_type_changed(d);
>      if ( type != PAT_TYPE_WRBACK )
> -        flush_all(FLUSH_CACHE);
> +        flush_mask(d->dirty_cpumask, FLUSH_CACHE);
>  
>      return 0;
>  }
> @@ -785,7 +785,7 @@ void memory_type_changed(struct domain *d)
>           d->vcpu && d->vcpu[0] )
>      {
>          p2m_memory_type_changed(d);
> -        flush_all(FLUSH_CACHE);
> +        flush_mask(d->dirty_cpumask, FLUSH_CACHE);
>      }
>  }
>  
> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
> index cfea5b5523..383e763d7d 100644
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -2395,7 +2395,7 @@ static void svm_vmexit_mce_intercept(
>  static void svm_wbinvd_intercept(void)
>  {
>      if ( cache_flush_permitted(current->domain) )
> -        flush_all(FLUSH_CACHE);
> +        flush_mask(current->domain->dirty_cpumask, FLUSH_CACHE);
>  }
>  
>  static void svm_vmexit_do_invalidate_cache(struct cpu_user_regs
> *regs,
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 86b8916a5d..a05c7036c4 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -3349,7 +3349,7 @@ static void vmx_wbinvd_intercept(void)
>          return;
>  
>      if ( cpu_has_wbinvd_exiting )
> -        flush_all(FLUSH_CACHE);
> +        flush_mask(current->domain->dirty_cpumask, FLUSH_CACHE);
>      else
>          wbinvd();
>  }


* Re: [XEN PATCH v1 1/1] Invalidate cache for cpus affinitized to the domain
  2020-12-11 11:44 ` [XEN PATCH v1 1/1] Invalidate cache for cpus affinitized to the domain Harsha Shamsundara Havanur
  2020-12-12 13:35   ` Shamsundara Havanur, Harsha
@ 2020-12-14  8:52   ` Jan Beulich
  2020-12-14  9:26     ` Shamsundara Havanur, Harsha
  1 sibling, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2020-12-14  8:52 UTC (permalink / raw)
  To: Harsha Shamsundara Havanur; +Cc: xen-devel

On 11.12.2020 12:44, Harsha Shamsundara Havanur wrote:
> A HVM domain flushes cache on all the cpus using
> `flush_all` macro which uses cpu_online_map, during
> i) creation of a new domain
> ii) when device-model op is performed
> iii) when domain is destructed.
> 
> This triggers IPI on all the cpus, thus affecting other
> domains that are pinned to different pcpus. This patch
> restricts cache flush to the set of cpus affinitized to
> the current domain using `domain->dirty_cpumask`.

But then you need to effect cache flushing when a CPU gets
taken out of domain->dirty_cpumask. I don't think you/we want
to do that.

Jan



* Re: [XEN PATCH v1 1/1] Invalidate cache for cpus affinitized to the domain
  2020-12-14  8:52   ` Jan Beulich
@ 2020-12-14  9:26     ` Shamsundara Havanur, Harsha
  2020-12-14  9:55       ` Jan Beulich
  2020-12-14 10:56       ` Julien Grall
  0 siblings, 2 replies; 9+ messages in thread
From: Shamsundara Havanur, Harsha @ 2020-12-14  9:26 UTC (permalink / raw)
  To: jbeulich; +Cc: xen-devel

On Mon, 2020-12-14 at 09:52 +0100, Jan Beulich wrote:
> On 11.12.2020 12:44, Harsha Shamsundara Havanur wrote:
> > A HVM domain flushes cache on all the cpus using
> > `flush_all` macro which uses cpu_online_map, during
> > i) creation of a new domain
> > ii) when device-model op is performed
> > iii) when domain is destructed.
> > 
> > This triggers IPI on all the cpus, thus affecting other
> > domains that are pinned to different pcpus. This patch
> > restricts cache flush to the set of cpus affinitized to
> > the current domain using `domain->dirty_cpumask`.
> 
> But then you need to effect cache flushing when a CPU gets
> taken out of domain->dirty_cpumask. I don't think you/we want
> to do that.
> 
If we do not restrict the flush, it could lead to a DoS attack, where a
malicious guest could keep writing to MTRR registers, or keep flushing
the cache through a DM op, thereby repeatedly sending IPIs to other
neighboring guests.

-Harsha
> Jan
> 


* Re: [XEN PATCH v1 1/1] Invalidate cache for cpus affinitized to the domain
  2020-12-14  9:26     ` Shamsundara Havanur, Harsha
@ 2020-12-14  9:55       ` Jan Beulich
  2020-12-14 10:56       ` Julien Grall
  1 sibling, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2020-12-14  9:55 UTC (permalink / raw)
  To: Shamsundara Havanur, Harsha; +Cc: xen-devel

On 14.12.2020 10:26, Shamsundara Havanur, Harsha wrote:
> On Mon, 2020-12-14 at 09:52 +0100, Jan Beulich wrote:
>> On 11.12.2020 12:44, Harsha Shamsundara Havanur wrote:
>>> A HVM domain flushes cache on all the cpus using
>>> `flush_all` macro which uses cpu_online_map, during
>>> i) creation of a new domain
>>> ii) when device-model op is performed
>>> iii) when domain is destructed.
>>>
>>> This triggers IPI on all the cpus, thus affecting other
>>> domains that are pinned to different pcpus. This patch
>>> restricts cache flush to the set of cpus affinitized to
>>> the current domain using `domain->dirty_cpumask`.
>>
>> But then you need to effect cache flushing when a CPU gets
>> taken out of domain->dirty_cpumask. I don't think you/we want
>> to do that.
>>
> If we do not restrict, it could lead to DoS attack, where a malicious
> guest could keep writing to MTRR registers or do a cache flush through
> DM Op and keep sending IPIs to other neighboring guests.

Could you outline how this can become a DoS? Throughput may be
(heavily) impacted, yes, but I don't see how this could suppress
forward progress altogether. Improved accounting may be desirable
here, such that the time spent in the flushes gets subtracted
from the initiator's credits rather than the vCPU's which happens
to run on the subject pCPU at the time. This is a more general
topic though, which was previously brought up: Time spent
servicing interrupts should in general not be accounted to the
vCPU that happened to be running (and hence interrupted). It's just that
for the majority of interrupts the amount of time needed to
handle them is pretty well bounded, albeit very high interrupt
rates could have the same effect as a single interrupt taking
very long to service.

An intermediate option (albeit presumably somewhat intrusive)
might be to use e.g. a tasklet on each CPU to effect the
flushing. This wouldn't reduce the overall hit on the system,
but would at least avoid penalizing other vCPU-s as to their
scheduling time slices. The issuing vCPU would then need
pausing until all of the flushes got carried out.
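
A very rough sketch of that idea, with hypothetical glue code (it
assumes a tasklet_init()/tasklet_schedule_on_cpu() interface along the
lines of xen/tasklet.h, whose exact prototypes may differ, and it
glosses over the pausing/continuation the issuing vCPU would need
instead of the polling shown at the end):

/* Hypothetical sketch; races with a still-pending tasklet, error
 * handling and proper vCPU pausing are all ignored. */
static DEFINE_PER_CPU(struct tasklet, cache_flush_tasklet);
static atomic_t cache_flush_pending;

static void cache_flush_tasklet_fn(void *unused)
{
    wbinvd();                         /* flush this pCPU's caches */
    atomic_dec(&cache_flush_pending);
}

static void flush_cache_deferred(const cpumask_t *mask)
{
    unsigned int cpu, self = smp_processor_id();

    for_each_cpu ( cpu, mask )
    {
        if ( cpu == self )
        {
            wbinvd();                 /* the local CPU can flush directly */
            continue;
        }
        atomic_inc(&cache_flush_pending);
        tasklet_init(&per_cpu(cache_flush_tasklet, cpu),
                     cache_flush_tasklet_fn, NULL);
        tasklet_schedule_on_cpu(&per_cpu(cache_flush_tasklet, cpu), cpu);
    }

    /* Placeholder: real code would pause the issuing vCPU and resume it
     * from the last tasklet rather than poll here. */
    while ( atomic_read(&cache_flush_pending) )
        process_pending_softirqs();
}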

Jan



* Re: [XEN PATCH v1 1/1] Invalidate cache for cpus affinitized to the domain
  2020-12-14  9:26     ` Shamsundara Havanur, Harsha
  2020-12-14  9:55       ` Jan Beulich
@ 2020-12-14 10:56       ` Julien Grall
  2020-12-14 16:01         ` Andrew Cooper
  1 sibling, 1 reply; 9+ messages in thread
From: Julien Grall @ 2020-12-14 10:56 UTC (permalink / raw)
  To: Shamsundara Havanur, Harsha, jbeulich
  Cc: xen-devel, Paul Durrant, Wieczorkiewicz, Pawel

Hi Harsha,

On 14/12/2020 09:26, Shamsundara Havanur, Harsha wrote:
> On Mon, 2020-12-14 at 09:52 +0100, Jan Beulich wrote:
>> On 11.12.2020 12:44, Harsha Shamsundara Havanur wrote:
>>> A HVM domain flushes cache on all the cpus using
>>> `flush_all` macro which uses cpu_online_map, during
>>> i) creation of a new domain
>>> ii) when device-model op is performed
>>> iii) when domain is destructed.
>>>
>>> This triggers IPI on all the cpus, thus affecting other
>>> domains that are pinned to different pcpus. This patch
>>> restricts cache flush to the set of cpus affinitized to
>>> the current domain using `domain->dirty_cpumask`.
>>
>> But then you need to effect cache flushing when a CPU gets
>> taken out of domain->dirty_cpumask. I don't think you/we want
>> to do that.
>>
> If we do not restrict, it could lead to DoS attack, where a malicious
> guest could keep writing to MTRR registers or do a cache flush through
> DM Op and keep sending IPIs to other neighboring guests.

I saw Jan already answered about the alleged DoS, so I will just focus 
on the resolution.

I agree that in the ideal situation we want to limit the impact on the 
other vCPUs. However, we also need to make sure the cure is not worse 
than the symptoms.

The cache flush cannot be restricted in all pinning situations,
because pinning doesn't imply the pCPU will be dedicated to a given
vCPU, or even that the vCPU will stick to one pCPU (we may allow
floating on a NUMA socket), although your setup may offer this
guarantee.

My knowledge in this area is quite limited, but below are a few
questions that will hopefully help to make a decision.

The first question to answer is: can the flush be restricted in a
setup where each vCPU runs on a dedicated pCPU (i.e. a partitioned
system)?

If the answer is yes, then we should figure out whether using
domain->dirty_cpumask would always be correct. For instance, a vCPU
may not have run yet, so can we consider the associated pCPU's cache
to be consistent?

Another line of questioning is: what can we do on systems supporting
self-snooping? IOW, would it be possible to restrict the flush for
all setups?

Cheers,

-- 
Julien Grall



* Re: [XEN PATCH v1 1/1] Invalidate cache for cpus affinitized to the domain
  2020-12-14 10:56       ` Julien Grall
@ 2020-12-14 16:01         ` Andrew Cooper
  2020-12-14 19:05           ` Shamsundara Havanur, Harsha
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Cooper @ 2020-12-14 16:01 UTC (permalink / raw)
  To: Julien Grall, Shamsundara Havanur, Harsha, jbeulich
  Cc: xen-devel, Paul Durrant, Wieczorkiewicz, Pawel

On 14/12/2020 10:56, Julien Grall wrote:
> Hi Harsha,
>
> On 14/12/2020 09:26, Shamsundara Havanur, Harsha wrote:
>> On Mon, 2020-12-14 at 09:52 +0100, Jan Beulich wrote:
>>> On 11.12.2020 12:44, Harsha Shamsundara Havanur wrote:
>>>> A HVM domain flushes cache on all the cpus using
>>>> `flush_all` macro which uses cpu_online_map, during
>>>> i) creation of a new domain
>>>> ii) when device-model op is performed
>>>> iii) when domain is destructed.
>>>>
>>>> This triggers IPI on all the cpus, thus affecting other
>>>> domains that are pinned to different pcpus. This patch
>>>> restricts cache flush to the set of cpus affinitized to
>>>> the current domain using `domain->dirty_cpumask`.
>>>
>>> But then you need to effect cache flushing when a CPU gets
>>> taken out of domain->dirty_cpumask. I don't think you/we want
>>> to do that.
>>>
>> If we do not restrict, it could lead to DoS attack, where a malicious
>> guest could keep writing to MTRR registers or do a cache flush through
>> DM Op and keep sending IPIs to other neighboring guests.
>
> I saw Jan already answered about the alleged DoS, so I will just focus
> on the resolution.
>
> I agree that in the ideal situation we want to limit the impact on the
> other vCPUs. However, we also need to make sure the cure is not worse
> than the symptoms.

And specifically, only a change which is correct.  This patch very
definitely isn't.

Lines can get cached on other cpus from, e.g. qemu mappings and PV backends.

>
> The cache flush cannot be restricted in all the pinning situation
> because pinning doesn't imply the pCPU will be dedicated to a given
> vCPU or even the vCPU will stick on pCPU (we may allow floating on a
> NUMA socket). Although your setup may offer this guarantee.
>
> My knowledge in this area is quite limited. But below a few question
> that hopefully will help to make a decision.
>
> The first question to answer is: can the flush can be restricted in a
> setup where each vCPUs are running on a decicated pCPU (i.e partionned
> system)?

Not really.  Lines can become cached even from speculation in the directmap.

If you need to flush the caches (and don't have a virtual mapping to
issue clflush/clflushopt/clwb over), it must be on all CPUs.
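
To illustrate the distinction (not Xen code, just what flushing by
linear address looks like): CLFLUSH evicts a line from every cache in
the coherence domain, so a range with a virtual mapping can be flushed
from a single CPU without IPIs; only the address-less "flush
everything" case needs WBINVD on all CPUs.

/* Illustration only.  Assumes a 64-byte line size; real code should
 * derive it from CPUID leaf 1, EBX[15:8] (given in 8-byte units). */
static void flush_range_by_va(const void *va, unsigned long size)
{
    unsigned long line = 64;
    const char *p = (const char *)((unsigned long)va & ~(line - 1));
    const char *end = (const char *)va + size;

    asm volatile ( "mfence" ::: "memory" );
    for ( ; p < end; p += line )
        asm volatile ( "clflush %0" :: "m" (*p) );
    asm volatile ( "mfence" ::: "memory" );
}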

> If the answer is yes, then we should figure out whether using
> domain->dirty_cpumask would always be correct? For instance, a vCPU
> may not have yet run, so can we consider the associated pCPU cache
> would be consistent?
>
> Another line of question is what can we do on system supporting
> self-snooping? IOW, would it be possible to restrict the flush for all
> the setup?

Right - this is the avenue which ought to be investigated.  (Working)
self-snoop ought to remove the need for some of these cache flushes.
Some of them look like they're not correct to begin with.  Others, such
as the wbinvd intercepts, absolutely must stay as they are.

~Andrew



* Re: [XEN PATCH v1 1/1] Invalidate cache for cpus affinitized to the domain
  2020-12-14 16:01         ` Andrew Cooper
@ 2020-12-14 19:05           ` Shamsundara Havanur, Harsha
  2020-12-14 19:36             ` Andrew Cooper
  0 siblings, 1 reply; 9+ messages in thread
From: Shamsundara Havanur, Harsha @ 2020-12-14 19:05 UTC (permalink / raw)
  To: jbeulich, julien, andrew.cooper3; +Cc: Wieczorkiewicz, Pawel, xen-devel, paul

On Mon, 2020-12-14 at 16:01 +0000, Andrew Cooper wrote:
> On 14/12/2020 10:56, Julien Grall wrote:
> > Hi Harsha,
> > 
> > On 14/12/2020 09:26, Shamsundara Havanur, Harsha wrote:
> > > On Mon, 2020-12-14 at 09:52 +0100, Jan Beulich wrote:
> > > > On 11.12.2020 12:44, Harsha Shamsundara Havanur wrote:
> > > > > A HVM domain flushes cache on all the cpus using
> > > > > `flush_all` macro which uses cpu_online_map, during
> > > > > i) creation of a new domain
> > > > > ii) when device-model op is performed
> > > > > iii) when domain is destructed.
> > > > > 
> > > > > This triggers IPI on all the cpus, thus affecting other
> > > > > domains that are pinned to different pcpus. This patch
> > > > > restricts cache flush to the set of cpus affinitized to
> > > > > the current domain using `domain->dirty_cpumask`.
> > > > 
> > > > But then you need to effect cache flushing when a CPU gets
> > > > taken out of domain->dirty_cpumask. I don't think you/we want
> > > > to do that.
> > > > 
> > > 
> > > If we do not restrict, it could lead to DoS attack, where a
> > > malicious
> > > guest could keep writing to MTRR registers or do a cache flush
> > > through
> > > DM Op and keep sending IPIs to other neighboring guests.
> > 
> > I saw Jan already answered about the alleged DoS, so I will just
> > focus
> > on the resolution.
> > 
> > I agree that in the ideal situation we want to limit the impact on
> > the
> > other vCPUs. However, we also need to make sure the cure is not
> > worse
> > than the symptoms.
> 
> And specifically, only a change which is correct.  This patch very
> definitely isn't.
> 
> Lines can get cached on other cpus from, e.g. qemu mappings and PV
> backends.
> 
> > 
> > The cache flush cannot be restricted in all the pinning situation
> > because pinning doesn't imply the pCPU will be dedicated to a given
> > vCPU or even the vCPU will stick on pCPU (we may allow floating on
> > a
> > NUMA socket). Although your setup may offer this guarantee.
> > 
> > My knowledge in this area is quite limited. But below a few
> > question
> > that hopefully will help to make a decision.
> > 
> > The first question to answer is: can the flush can be restricted in
> > a
> > setup where each vCPUs are running on a decicated pCPU (i.e
> > partionned
> > system)?
> 
> Not really.  Lines can become cached even from speculation in the
> directmap.
> 
> If you need to flush the caches (and don't have a virtual mapping to
> issue clflush/clflushopt/clwb over), it must be on all CPUs.

If lines are cached due to aggressive speculation from a different
guest, wouldn't they be invalidated at the speculation boundary, since
it's a wrong speculation? Would they still need to be flushed
explicitly?

-Harsha

> 
> > If the answer is yes, then we should figure out whether using
> > domain->dirty_cpumask would always be correct? For instance, a vCPU
> > may not have yet run, so can we consider the associated pCPU cache
> > would be consistent?
> > 
> > Another line of question is what can we do on system supporting
> > self-snooping? IOW, would it be possible to restrict the flush for
> > all
> > the setup?
> 
> Right - this is the avenue which ought to be investigated.  (Working)
> self-noop ought to remove the need for some of these cache flushes.
> Others look like they're not correct to begin with.  Others, such as
> the
> wbinvd intercepts absolutely must stay as they are.
> 
> ~Andrew


* Re: [XEN PATCH v1 1/1] Invalidate cache for cpus affinitized to the domain
  2020-12-14 19:05           ` Shamsundara Havanur, Harsha
@ 2020-12-14 19:36             ` Andrew Cooper
  0 siblings, 0 replies; 9+ messages in thread
From: Andrew Cooper @ 2020-12-14 19:36 UTC (permalink / raw)
  To: Shamsundara Havanur, Harsha, jbeulich, julien
  Cc: Wieczorkiewicz, Pawel, xen-devel, paul

On 14/12/2020 19:05, Shamsundara Havanur, Harsha wrote:
> On Mon, 2020-12-14 at 16:01 +0000, Andrew Cooper wrote:
>> On 14/12/2020 10:56, Julien Grall wrote:
>>> Hi Harsha,
>>>
>>> On 14/12/2020 09:26, Shamsundara Havanur, Harsha wrote:
>>>> On Mon, 2020-12-14 at 09:52 +0100, Jan Beulich wrote:
>>>>> On 11.12.2020 12:44, Harsha Shamsundara Havanur wrote:
>>>>>> A HVM domain flushes cache on all the cpus using
>>>>>> `flush_all` macro which uses cpu_online_map, during
>>>>>> i) creation of a new domain
>>>>>> ii) when device-model op is performed
>>>>>> iii) when domain is destructed.
>>>>>>
>>>>>> This triggers IPI on all the cpus, thus affecting other
>>>>>> domains that are pinned to different pcpus. This patch
>>>>>> restricts cache flush to the set of cpus affinitized to
>>>>>> the current domain using `domain->dirty_cpumask`.
>>>>> But then you need to effect cache flushing when a CPU gets
>>>>> taken out of domain->dirty_cpumask. I don't think you/we want
>>>>> to do that.
>>>>>
>>>> If we do not restrict, it could lead to DoS attack, where a
>>>> malicious
>>>> guest could keep writing to MTRR registers or do a cache flush
>>>> through
>>>> DM Op and keep sending IPIs to other neighboring guests.
>>> I saw Jan already answered about the alleged DoS, so I will just
>>> focus
>>> on the resolution.
>>>
>>> I agree that in the ideal situation we want to limit the impact on
>>> the
>>> other vCPUs. However, we also need to make sure the cure is not
>>> worse
>>> than the symptoms.
>> And specifically, only a change which is correct.  This patch very
>> definitely isn't.
>>
>> Lines can get cached on other cpus from, e.g. qemu mappings and PV
>> backends.
>>
>>> The cache flush cannot be restricted in all the pinning situation
>>> because pinning doesn't imply the pCPU will be dedicated to a given
>>> vCPU or even the vCPU will stick on pCPU (we may allow floating on
>>> a
>>> NUMA socket). Although your setup may offer this guarantee.
>>>
>>> My knowledge in this area is quite limited. But below a few
>>> question
>>> that hopefully will help to make a decision.
>>>
>>> The first question to answer is: can the flush can be restricted in
>>> a
>>> setup where each vCPUs are running on a decicated pCPU (i.e
>>> partionned
>>> system)?
>> Not really.  Lines can become cached even from speculation in the
>> directmap.
>>
>> If you need to flush the caches (and don't have a virtual mapping to
>> issue clflush/clflushopt/clwb over), it must be on all CPUs.
> If lines are cached due to aggressive speculation from a different
> guest, wouldn't they be invalidated at the speculation boundary, since
> it's a wrong speculation? Would it still require to be flushed
> explicitly?

No.  Caches are microarchitectural state (just like TLBs, line-fill
buffers, etc.).

The entire mess surrounding speculative security issues is that the
perturbations from bad speculation survive, and can be recovered at a
later point.

~Andrew

