* [PATCH v2 0/5] KVM: s390: vsie: fixes and cleanups
@ 2020-04-03 15:30 David Hildenbrand
  2020-04-03 15:30 ` [PATCH v2 1/5] KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks David Hildenbrand
                   ` (5 more replies)
  0 siblings, 6 replies; 22+ messages in thread
From: David Hildenbrand @ 2020-04-03 15:30 UTC (permalink / raw)
  To: kvm
  Cc: linux-s390, linux-kernel, Vasily Gorbik, Heiko Carstens,
	Cornelia Huck, Janosch Frank, Christian Borntraeger,
	David Hildenbrand

Some vsie/gmap fixes and two cleanups/improvements.

Patch #1 fixes an issue reported by Janosch. It was never observed so far,
because KVM usually doesn't use a region 1 table for its guest (unless
memory would exceed something like 16 EB, which isn't even supported
by the HW). Older QEMU+KVM or other hypervisors can trigger this.

Patch #2 fixes a code path that probably was never taken and will most
probably not be taken very often in the future - unless somebody really
messes up the page tables for a guest (or writes a test for it). At some
point, a test case for this would be nice.

Patch #3 fixes a rare possible race. Don't think this is stable material.

Gave it some testing with my limited access to somewhat-fast s390x
machines. Booted a Linux kernel, supplying all possible numbers of
page table hierarchies.

v1 -> v2:
- "KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks"
-- Fix WARN_ON_ONCE
- "gmap_table_walk() simplifications"
-- Also init "table" directly

David Hildenbrand (5):
  KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks
  KVM: s390: vsie: Fix delivery of addressing exceptions
  KVM: s390: vsie: Fix possible race when shadowing region 3 tables
  KVM: s390: vsie: Move conditional reschedule
  KVM: s390: vsie: gmap_table_walk() simplifications

 arch/s390/kvm/vsie.c |  4 ++--
 arch/s390/mm/gmap.c  | 17 +++++++++++------
 2 files changed, 13 insertions(+), 8 deletions(-)

-- 
2.25.1



* [PATCH v2 1/5] KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks
  2020-04-03 15:30 [PATCH v2 0/5] KVM: s390: vsie: fixes and cleanups David Hildenbrand
@ 2020-04-03 15:30 ` David Hildenbrand
  2020-04-03 17:56   ` Christian Borntraeger
  2020-04-07  7:33   ` Christian Borntraeger
  2020-04-03 15:30 ` [PATCH v2 2/5] KVM: s390: vsie: Fix delivery of addressing exceptions David Hildenbrand
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 22+ messages in thread
From: David Hildenbrand @ 2020-04-03 15:30 UTC (permalink / raw)
  To: kvm
  Cc: linux-s390, linux-kernel, Vasily Gorbik, Heiko Carstens,
	Cornelia Huck, Janosch Frank, Christian Borntraeger,
	David Hildenbrand, stable

In case we have a region 1 ASCE, our shadow/g3 address can have any value.
Unfortunately, (-1UL << 64) is undefined and triggers sometimes,
rejecting valid shadow addresses when trying to walk our shadow table
hierarchy.

The result is that the prefix cannot get mapped and we will loop basically
forever trying to map it (-EAGAIN loop).

After all, the broken check is only a sanity check, our table shadowing
code in kvm_s390_shadow_tables() already checks these conditions, injecting
proper translation exceptions. Turn it into a WARN_ON_ONCE().
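
As an illustration, a minimal userspace sketch of the shift computation
(assuming the usual _ASCE_TYPE_* encoding of 0x00/0x04/0x08/0x0c for
segment/region3/region2/region1; a demo, not kernel code):

#include <stdio.h>

int main(void)
{
	int type;

	for (type = 0x00; type <= 0x0c; type += 0x04) {
		int shift = 31 + (type >> 2) * 11;

		/* region 1 (0x0c) yields shift == 64 -> "-1UL << 64" is UB */
		printf("asce type %#04x -> shift %d%s\n", type, shift,
		       shift >= 64 ? "  <-- undefined" : "");
	}
	return 0;
}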

Fixes: 4be130a08420 ("s390/mm: add shadow gmap support")
Tested-by: Janosch Frank <frankja@linux.ibm.com>
Reported-by: Janosch Frank <frankja@linux.ibm.com>
Cc: <stable@vger.kernel.org> # v4.8+
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/s390/mm/gmap.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index 2fbece47ef6f..b93dd54b234a 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -787,14 +787,18 @@ static void gmap_call_notifier(struct gmap *gmap, unsigned long start,
 static inline unsigned long *gmap_table_walk(struct gmap *gmap,
 					     unsigned long gaddr, int level)
 {
+	const int asce_type = gmap->asce & _ASCE_TYPE_MASK;
 	unsigned long *table;
 
 	if ((gmap->asce & _ASCE_TYPE_MASK) + 4 < (level * 4))
 		return NULL;
 	if (gmap_is_shadow(gmap) && gmap->removed)
 		return NULL;
-	if (gaddr & (-1UL << (31 + ((gmap->asce & _ASCE_TYPE_MASK) >> 2)*11)))
+
+	if (WARN_ON_ONCE(asce_type != _ASCE_TYPE_REGION1 &&
+			 gaddr & (-1UL << (31 + (asce_type >> 2) * 11))))
 		return NULL;
+
 	table = gmap->table;
 	switch (gmap->asce & _ASCE_TYPE_MASK) {
 	case _ASCE_TYPE_REGION1:
-- 
2.25.1



* [PATCH v2 2/5] KVM: s390: vsie: Fix delivery of addressing exceptions
  2020-04-03 15:30 [PATCH v2 0/5] KVM: s390: vsie: fixes and cleanups David Hildenbrand
  2020-04-03 15:30 ` [PATCH v2 1/5] KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks David Hildenbrand
@ 2020-04-03 15:30 ` David Hildenbrand
  2020-04-07 11:00   ` Claudio Imbrenda
  2020-04-03 15:30 ` [PATCH v2 3/5] KVM: s390: vsie: Fix possible race when shadowing region 3 tables David Hildenbrand
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2020-04-03 15:30 UTC (permalink / raw)
  To: kvm
  Cc: linux-s390, linux-kernel, Vasily Gorbik, Heiko Carstens,
	Cornelia Huck, Janosch Frank, Christian Borntraeger,
	David Hildenbrand, stable

Whenever we get an -EFAULT, we failed to read in guest 2 physical
address space. Such addressing exceptions are reported via a program
intercept to the nested hypervisor.

Since we faked the intercept, we have to return to guest 2. Instead, right
now we would be returning -EFAULT from the intercept handler, eventually
crashing the VM.

Addressing exceptions can only happen if the g2->g3 page tables
reference invalid g2 addresses (say, either a table or the final page is
not accessible) - so something that basically never happens in sane
environments.

Identified by manual code inspection.
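
To make the one-liner's intent explicit, this is roughly the fixed error
path in vsie_run() (a simplified sketch; the comments are mine):

	if (rc == -EFAULT) {
		/*
		 * Fake a program intercept for guest 2: addressing
		 * exceptions are suppressing, so report PGM_ADDRESSING
		 * with ilc 4 and rewind the PSW accordingly.
		 */
		scb_s->iprcc = PGM_ADDRESSING;
		scb_s->pgmilc = 4;
		scb_s->gpsw.addr = __rewind_psw(scb_s->gpsw, 4);
		/*
		 * rc > 0 now tells the caller "intercept faked, return
		 * to guest 2" instead of leaking -EFAULT up the call
		 * chain and crashing the VM.
		 */
		rc = 1;
	}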

Fixes: a3508fbe9dc6 ("KVM: s390: vsie: initial support for nested virtualization")
Cc: <stable@vger.kernel.org> # v4.8+
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/s390/kvm/vsie.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
index 076090f9e666..4f6c22d72072 100644
--- a/arch/s390/kvm/vsie.c
+++ b/arch/s390/kvm/vsie.c
@@ -1202,6 +1202,7 @@ static int vsie_run(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
 		scb_s->iprcc = PGM_ADDRESSING;
 		scb_s->pgmilc = 4;
 		scb_s->gpsw.addr = __rewind_psw(scb_s->gpsw, 4);
+		rc = 1;
 	}
 	return rc;
 }
-- 
2.25.1



* [PATCH v2 3/5] KVM: s390: vsie: Fix possible race when shadowing region 3 tables
  2020-04-03 15:30 [PATCH v2 0/5] KVM: s390: vsie: fixes and cleanups David Hildenbrand
  2020-04-03 15:30 ` [PATCH v2 1/5] KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks David Hildenbrand
  2020-04-03 15:30 ` [PATCH v2 2/5] KVM: s390: vsie: Fix delivery of addressing exceptions David Hildenbrand
@ 2020-04-03 15:30 ` David Hildenbrand
  2020-04-07 11:05   ` Claudio Imbrenda
  2020-04-03 15:30 ` [PATCH v2 4/5] KVM: s390: vsie: Move conditional reschedule David Hildenbrand
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2020-04-03 15:30 UTC (permalink / raw)
  To: kvm
  Cc: linux-s390, linux-kernel, Vasily Gorbik, Heiko Carstens,
	Cornelia Huck, Janosch Frank, Christian Borntraeger,
	David Hildenbrand

We have to properly retry again by returning -EAGAIN immediately in case
somebody else instantiated the table concurrently. We missed adding the
goto in this function only. The code now matches the other, similar
shadowing functions.

We are overwriting an existing region 2 table entry. All allocated pages
are added to the crst_list to be freed later, so they are not lost
forever. However, when unshadowing the region 2 table, we wouldn't trigger
unshadowing of the original shadowed region 3 table that we replaced. It
would get unshadowed when the original region 3 table is modified. As it's
not connected to the page table hierarchy anymore, it's not going to get
used anymore. However, for a limited time, this page table will stick
around, so it's in some sense a temporary memory leak.

Identified by manual code inspection. I don't think this classifies as
stable material.
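
For reference, the locked section of gmap_shadow_r3t() now follows the
same shape as its sibling functions (a simplified sketch; allocation and
the table walk elided):

	spin_lock(&sg->guest_table_lock);
	table = ...;	/* locate the parent region 2 table entry */
	if (!(*table & _REGION_ENTRY_INVALID)) {
		rc = 0;			/* Already established */
		goto out_free;
	} else if (*table & _REGION_ENTRY_ORIGIN) {
		rc = -EAGAIN;		/* Race with shadow */
		goto out_free;		/* <-- the goto added by this patch */
	}
	/* only reached when we won the race: install the new r3t */
	crst_table_init(s_r3t, _REGION3_ENTRY_EMPTY);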

Fixes: 998f637cc4b9 ("s390/mm: avoid races on region/segment/page table shadowing")
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/s390/mm/gmap.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index b93dd54b234a..24ef30fb0833 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -1844,6 +1844,7 @@ int gmap_shadow_r3t(struct gmap *sg, unsigned long saddr, unsigned long r3t,
 		goto out_free;
 	} else if (*table & _REGION_ENTRY_ORIGIN) {
 		rc = -EAGAIN;		/* Race with shadow */
+		goto out_free;
 	}
 	crst_table_init(s_r3t, _REGION3_ENTRY_EMPTY);
 	/* mark as invalid as long as the parent table is not protected */
-- 
2.25.1



* [PATCH v2 4/5] KVM: s390: vsie: Move conditional reschedule
  2020-04-03 15:30 [PATCH v2 0/5] KVM: s390: vsie: fixes and cleanups David Hildenbrand
                   ` (2 preceding siblings ...)
  2020-04-03 15:30 ` [PATCH v2 3/5] KVM: s390: vsie: Fix possible race when shadowing region 3 tables David Hildenbrand
@ 2020-04-03 15:30 ` David Hildenbrand
  2020-04-06 15:06   ` Christian Borntraeger
  2020-04-07 10:52   ` Claudio Imbrenda
  2020-04-03 15:30 ` [PATCH v2 5/5] KVM: s390: vsie: gmap_table_walk() simplifications David Hildenbrand
  2020-04-06 16:06 ` [PATCH v2 0/5] KVM: s390: vsie: fixes and cleanups Christian Borntraeger
  5 siblings, 2 replies; 22+ messages in thread
From: David Hildenbrand @ 2020-04-03 15:30 UTC (permalink / raw)
  To: kvm
  Cc: linux-s390, linux-kernel, Vasily Gorbik, Heiko Carstens,
	Cornelia Huck, Janosch Frank, Christian Borntraeger,
	David Hildenbrand

Let's move it to the outer loop, in case we ever run into long loops
again, trying to map the prefix. While at it, convert it to cond_resched().
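
A condensed sketch of the resulting vsie_run() loop (shortened from the
diff below and my reading of vsie.c):

	while (1) {
		rc = acquire_gmap_shadow(vcpu, vsie_page);
		if (!rc)
			rc = map_prefix(vcpu, vsie_page); /* may return -EAGAIN */
		if (!rc)
			rc = do_vsie_run(vcpu, vsie_page);

		if (rc == -EAGAIN)
			rc = 0;		/* retry, e.g. prefix not mapped yet */
		if (rc || scb_s->icptcode || signal_pending(current) ||
		    kvm_s390_vcpu_has_irq(vcpu, 0) ||
		    kvm_s390_vcpu_sie_inhibited(vcpu))
			break;
		cond_resched();		/* now also reached while retrying */
	}

cond_resched() is the idiomatic replacement for the open-coded
"if (need_resched()) schedule();" pair and is a no-op when no
reschedule is due.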

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/s390/kvm/vsie.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
index 4f6c22d72072..ef05b4e167fb 100644
--- a/arch/s390/kvm/vsie.c
+++ b/arch/s390/kvm/vsie.c
@@ -1000,8 +1000,6 @@ static int do_vsie_run(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
 
 	handle_last_fault(vcpu, vsie_page);
 
-	if (need_resched())
-		schedule();
 	if (test_cpu_flag(CIF_MCCK_PENDING))
 		s390_handle_mcck();
 
@@ -1185,6 +1183,7 @@ static int vsie_run(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
 		    kvm_s390_vcpu_has_irq(vcpu, 0) ||
 		    kvm_s390_vcpu_sie_inhibited(vcpu))
 			break;
+		cond_resched();
 	}
 
 	if (rc == -EFAULT) {
-- 
2.25.1



* [PATCH v2 5/5] KVM: s390: vsie: gmap_table_walk() simplifications
  2020-04-03 15:30 [PATCH v2 0/5] KVM: s390: vsie: fixes and cleanups David Hildenbrand
                   ` (3 preceding siblings ...)
  2020-04-03 15:30 ` [PATCH v2 4/5] KVM: s390: vsie: Move conditional reschedule David Hildenbrand
@ 2020-04-03 15:30 ` David Hildenbrand
  2020-04-06 16:06   ` Christian Borntraeger
  2020-04-07 11:10   ` Claudio Imbrenda
  2020-04-06 16:06 ` [PATCH v2 0/5] KVM: s390: vsie: fixes and cleanups Christian Borntraeger
  5 siblings, 2 replies; 22+ messages in thread
From: David Hildenbrand @ 2020-04-03 15:30 UTC (permalink / raw)
  To: kvm
  Cc: linux-s390, linux-kernel, Vasily Gorbik, Heiko Carstens,
	Cornelia Huck, Janosch Frank, Christian Borntraeger,
	David Hildenbrand

Let's use asce_type where applicable. Also, simplify our sanity check for
valid table levels and convert it into a WARN_ON_ONCE(). Check if we even
have a valid gmap shadow as the very first step.
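
The old and the new level check accept exactly the same (type, level)
combinations; a quick userspace check of the equivalence (again assuming
the usual _ASCE_TYPE_* values 0x00/0x04/0x08/0x0c):

#include <assert.h>
#include <stdio.h>

int main(void)
{
	int type, level;

	for (type = 0x00; type <= 0x0c; type += 0x04) {
		for (level = 1; level <= 6; level++) {
			int old_bad = (type + 4) < (level * 4);
			int new_bad = level > ((type >> 2) + 1);

			assert(old_bad == new_bad);
		}
		/* segment: 1, region3: 2, region2: 3, region1: 4 */
		printf("type %#04x: max level %d\n", type, (type >> 2) + 1);
	}
	return 0;
}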

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/s390/mm/gmap.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index 24ef30fb0833..a2bd8d7792e9 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -788,19 +788,19 @@ static inline unsigned long *gmap_table_walk(struct gmap *gmap,
 					     unsigned long gaddr, int level)
 {
 	const int asce_type = gmap->asce & _ASCE_TYPE_MASK;
-	unsigned long *table;
+	unsigned long *table = gmap->table;
 
-	if ((gmap->asce & _ASCE_TYPE_MASK) + 4 < (level * 4))
-		return NULL;
 	if (gmap_is_shadow(gmap) && gmap->removed)
 		return NULL;
 
+	if (WARN_ON_ONCE(level > (asce_type >> 2) + 1))
+		return NULL;
+
 	if (WARN_ON_ONCE(asce_type != _ASCE_TYPE_REGION1 &&
 			 gaddr & (-1UL << (31 + (asce_type >> 2) * 11))))
 		return NULL;
 
-	table = gmap->table;
-	switch (gmap->asce & _ASCE_TYPE_MASK) {
+	switch (asce_type) {
 	case _ASCE_TYPE_REGION1:
 		table += (gaddr & _REGION1_INDEX) >> _REGION1_SHIFT;
 		if (level == 4)
-- 
2.25.1



* Re: [PATCH v2 1/5] KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks
  2020-04-03 15:30 ` [PATCH v2 1/5] KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks David Hildenbrand
@ 2020-04-03 17:56   ` Christian Borntraeger
  2020-04-03 19:55     ` David Hildenbrand
  2020-04-07  7:33   ` Christian Borntraeger
  1 sibling, 1 reply; 22+ messages in thread
From: Christian Borntraeger @ 2020-04-03 17:56 UTC (permalink / raw)
  To: David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, Vasily Gorbik, Heiko Carstens,
	Cornelia Huck, Janosch Frank, stable



On 03.04.20 17:30, David Hildenbrand wrote:
> In case we have a region 1 ASCE, our shadow/g3 address can have any value.
> Unfortunately, (-1UL << 64) is undefined and triggers sometimes,
> rejecting valid shadow addresses when trying to walk our shadow table
> hierarchy.

I think the range of the addresses does not matter.
Took me a while to understand; maybe rephrase that:

In case we have a region 1 ASCE, the following calculation
(31 + ((gmap->asce & _ASCE_TYPE_MASK) >> 2)*11)
results in 64. As shifts beyond the type size are undefined, the compiler is free to use
instructions like sllg. sllg will only use 6 bits of the shift value (here 64),
resulting in no shift at all. That means that ALL addresses will be rejected.
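
A small demo of what happens in practice (this is undefined behavior in
C, so the output is only what the s390x sllg happens to produce; the
volatile forces a runtime shift):

#include <stdio.h>

int main(void)
{
	volatile int shift = 64;
	unsigned long mask = -1UL << shift;

	/* sllg uses only the low 6 bits of the count, so "<< 64"
	 * acts like "<< 0": the mask ends up all ones and every
	 * gaddr gets rejected */
	printf("mask = %#lx\n", mask);
	return 0;
}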

With that this makes sense. 

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>


> 
> The result is that the prefix cannot get mapped and we will loop basically
> forever trying to map it (-EAGAIN loop).
> 
> After all, the broken check is only a sanity check, our table shadowing
> code in kvm_s390_shadow_tables() already checks these conditions, injecting
> proper translation exceptions. Turn it into a WARN_ON_ONCE().
> 
> Fixes: 4be130a08420 ("s390/mm: add shadow gmap support")
> Tested-by: Janosch Frank <frankja@linux.ibm.com>
> Reported-by: Janosch Frank <frankja@linux.ibm.com>
> Cc: <stable@vger.kernel.org> # v4.8+
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  arch/s390/mm/gmap.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
> index 2fbece47ef6f..b93dd54b234a 100644
> --- a/arch/s390/mm/gmap.c
> +++ b/arch/s390/mm/gmap.c
> @@ -787,14 +787,18 @@ static void gmap_call_notifier(struct gmap *gmap, unsigned long start,
>  static inline unsigned long *gmap_table_walk(struct gmap *gmap,
>  					     unsigned long gaddr, int level)
>  {
> +	const int asce_type = gmap->asce & _ASCE_TYPE_MASK;
>  	unsigned long *table;
>  
>  	if ((gmap->asce & _ASCE_TYPE_MASK) + 4 < (level * 4))
>  		return NULL;
>  	if (gmap_is_shadow(gmap) && gmap->removed)
>  		return NULL;
> -	if (gaddr & (-1UL << (31 + ((gmap->asce & _ASCE_TYPE_MASK) >> 2)*11)))
> +
> +	if (WARN_ON_ONCE(asce_type != _ASCE_TYPE_REGION1 &&
> +			 gaddr & (-1UL << (31 + (asce_type >> 2) * 11))))
>  		return NULL;
> +
>  	table = gmap->table;
>  	switch (gmap->asce & _ASCE_TYPE_MASK) {
>  	case _ASCE_TYPE_REGION1:
> 



* Re: [PATCH v2 1/5] KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks
  2020-04-03 17:56   ` Christian Borntraeger
@ 2020-04-03 19:55     ` David Hildenbrand
  2020-04-06  8:32       ` Christian Borntraeger
  0 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2020-04-03 19:55 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: David Hildenbrand, kvm, linux-s390, linux-kernel, Vasily Gorbik,
	Heiko Carstens, Cornelia Huck, Janosch Frank, stable



> Am 03.04.2020 um 19:56 schrieb Christian Borntraeger <borntraeger@de.ibm.com>:
> 
> 
> 
>> On 03.04.20 17:30, David Hildenbrand wrote:
>> In case we have a region 1 ASCE, our shadow/g3 address can have any value.
>> Unfortunately, (-1UL << 64) is undefined and triggers sometimes,
>> rejecting valid shadow addresses when trying to walk our shadow table
>> hierarchy.
> 
> I think the range of the addresses does not matter.
> Took me a while to understand; maybe rephrase that:
> 
> In case we have a region 1 ASCE, the following calculation
> (31 + ((gmap->asce & _ASCE_TYPE_MASK) >> 2)*11)
> results in 64. As shifts beyond the type size are undefined, the compiler is free to use
> instructions like sllg. sllg will only use 6 bits of the shift value (here 64),
> resulting in no shift at all. That means that ALL addresses will be rejected.

Interestingly, it would not fail when shadowing the r2t, but only when trying to shadow the r3t.

> 
> With that this makes sense. 
> 
> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
> 

In case there are no other comments, can you fixup when applying, or do you want me to resend?

Cheers


* Re: [PATCH v2 1/5] KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks
  2020-04-03 19:55     ` David Hildenbrand
@ 2020-04-06  8:32       ` Christian Borntraeger
  0 siblings, 0 replies; 22+ messages in thread
From: Christian Borntraeger @ 2020-04-06  8:32 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: kvm, linux-s390, linux-kernel, Vasily Gorbik, Heiko Carstens,
	Cornelia Huck, Janosch Frank, stable



On 03.04.20 21:55, David Hildenbrand wrote:
> 
> 
>> Am 03.04.2020 um 19:56 schrieb Christian Borntraeger <borntraeger@de.ibm.com>:
>>
>> 
>>
>>> On 03.04.20 17:30, David Hildenbrand wrote:
>>> In case we have a region 1 ASCE, our shadow/g3 address can have any value.
>>> Unfortunately, (-1UL << 64) is undefined and triggers sometimes,
>>> rejecting valid shadow addresses when trying to walk our shadow table
>>> hierarchy.
>>
>> I think the range of the addresses does not matter.
>> Took me a while to understand; maybe rephrase that:
>>
>> In case we have a region 1 ASCE, the following calculation
>> (31 + ((gmap->asce & _ASCE_TYPE_MASK) >> 2)*11)
>> results in 64. As shifts beyond the type size are undefined, the compiler is free to use
>> instructions like sllg. sllg will only use 6 bits of the shift value (here 64),
>> resulting in no shift at all. That means that ALL addresses will be rejected.
> 
> Interestingly, it would not fail when shadowing the r2t, but only when trying to shadow the r3t.
> 
>>
>> With that this makes sense. 
>>
>> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
>>
> 
> In case there are no other comments, can you fixup when applying, or do you want me to resend?

I can fixup.



* Re: [PATCH v2 4/5] KVM: s390: vsie: Move conditional reschedule
  2020-04-03 15:30 ` [PATCH v2 4/5] KVM: s390: vsie: Move conditional reschedule David Hildenbrand
@ 2020-04-06 15:06   ` Christian Borntraeger
  2020-04-07 10:52   ` Claudio Imbrenda
  1 sibling, 0 replies; 22+ messages in thread
From: Christian Borntraeger @ 2020-04-06 15:06 UTC (permalink / raw)
  To: David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, Vasily Gorbik, Heiko Carstens,
	Cornelia Huck, Janosch Frank

On 03.04.20 17:30, David Hildenbrand wrote:
> Let's move it to the outer loop, in case we ever run into long loops
> again, trying to map the prefix. While at it, convert it to cond_resched().
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>


> ---
>  arch/s390/kvm/vsie.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
> index 4f6c22d72072..ef05b4e167fb 100644
> --- a/arch/s390/kvm/vsie.c
> +++ b/arch/s390/kvm/vsie.c
> @@ -1000,8 +1000,6 @@ static int do_vsie_run(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
>  
>  	handle_last_fault(vcpu, vsie_page);
>  
> -	if (need_resched())
> -		schedule();
>  	if (test_cpu_flag(CIF_MCCK_PENDING))
>  		s390_handle_mcck();
>  
> @@ -1185,6 +1183,7 @@ static int vsie_run(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
>  		    kvm_s390_vcpu_has_irq(vcpu, 0) ||
>  		    kvm_s390_vcpu_sie_inhibited(vcpu))
>  			break;
> +		cond_resched();
>  	}
>  
>  	if (rc == -EFAULT) {
> 



* Re: [PATCH v2 5/5] KVM: s390: vsie: gmap_table_walk() simplifications
  2020-04-03 15:30 ` [PATCH v2 5/5] KVM: s390: vsie: gmap_table_walk() simplifications David Hildenbrand
@ 2020-04-06 16:06   ` Christian Borntraeger
  2020-04-07 11:10   ` Claudio Imbrenda
  1 sibling, 0 replies; 22+ messages in thread
From: Christian Borntraeger @ 2020-04-06 16:06 UTC (permalink / raw)
  To: David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, Vasily Gorbik, Heiko Carstens,
	Cornelia Huck, Janosch Frank



On 03.04.20 17:30, David Hildenbrand wrote:
> Let's use asce_type where applicable. Also, simplify our sanity check for
> valid table levels and convert it into a WARN_ON_ONCE(). Check if we even
> have a valid gmap shadow as the very first step.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  arch/s390/mm/gmap.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
> index 24ef30fb0833..a2bd8d7792e9 100644
> --- a/arch/s390/mm/gmap.c
> +++ b/arch/s390/mm/gmap.c
> @@ -788,19 +788,19 @@ static inline unsigned long *gmap_table_walk(struct gmap *gmap,
>  					     unsigned long gaddr, int level)
>  {
>  	const int asce_type = gmap->asce & _ASCE_TYPE_MASK;
> -	unsigned long *table;
> +	unsigned long *table = gmap->table;
>  
> -	if ((gmap->asce & _ASCE_TYPE_MASK) + 4 < (level * 4))
> -		return NULL;
>  	if (gmap_is_shadow(gmap) && gmap->removed)
>  		return NULL;
>  
> +	if (WARN_ON_ONCE(level > (asce_type >> 2) + 1))
> +		return NULL;
> +
>  	if (WARN_ON_ONCE(asce_type != _ASCE_TYPE_REGION1 &&
>  			 gaddr & (-1UL << (31 + (asce_type >> 2) * 11))))
>  		return NULL;
>  
> -	table = gmap->table;
> -	switch (gmap->asce & _ASCE_TYPE_MASK) {
> +	switch (asce_type) {
>  	case _ASCE_TYPE_REGION1:
>  		table += (gaddr & _REGION1_INDEX) >> _REGION1_SHIFT;
>  		if (level == 4)
> 



* Re: [PATCH v2 0/5] KVM: s390: vsie: fixes and cleanups
  2020-04-03 15:30 [PATCH v2 0/5] KVM: s390: vsie: fixes and cleanups David Hildenbrand
                   ` (4 preceding siblings ...)
  2020-04-03 15:30 ` [PATCH v2 5/5] KVM: s390: vsie: gmap_table_walk() simplifications David Hildenbrand
@ 2020-04-06 16:06 ` Christian Borntraeger
  5 siblings, 0 replies; 22+ messages in thread
From: Christian Borntraeger @ 2020-04-06 16:06 UTC (permalink / raw)
  To: David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, Vasily Gorbik, Heiko Carstens,
	Cornelia Huck, Janosch Frank

Series applied, thanks.
I will schedule the first 3 for master, 4 and 5 for next.


On 03.04.20 17:30, David Hildenbrand wrote:
> Some vsie/gmap fixes and two cleanups/improvements.
> 
> Patch #1 fixes an issue reported by Janosch. It was never observed so far,
> because KVM usually doesn't use a region 1 table for its guest (unless
> memory would exceed something like 16 EB, which isn't even supported
> by the HW). Older QEMU+KVM or other hypervisors can trigger this.
> 
> Patch #2 fixes a code path that probably was never taken and will most
> probably not be taken very often in the future - unless somebody really
> messes up the page tables for a guest (or writes a test for it). At some
> point, a test case for this would be nice.
> 
> Patch #3 fixes a rare possible race. Don't think this is stable material.
> 
> Gave it some testing with my limited access to somewhat-fast s390x
> machines. Booted a Linux kernel, supplying all possible numbers of
> page table hierarchies.
> 
> v1 -> v2:
> - "KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks"
> -- Fix WARN_ON_ONCE
> - "gmap_table_walk() simplifications"
> -- Also init "table" directly
> 
> David Hildenbrand (5):
>   KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks
>   KVM: s390: vsie: Fix delivery of addressing exceptions
>   KVM: s390: vsie: Fix possible race when shadowing region 3 tables
>   KVM: s390: vsie: Move conditional reschedule
>   KVM: s390: vsie: gmap_table_walk() simplifications
> 
>  arch/s390/kvm/vsie.c |  4 ++--
>  arch/s390/mm/gmap.c  | 17 +++++++++++------
>  2 files changed, 13 insertions(+), 8 deletions(-)
> 



* Re: [PATCH v2 1/5] KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks
  2020-04-03 15:30 ` [PATCH v2 1/5] KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks David Hildenbrand
  2020-04-03 17:56   ` Christian Borntraeger
@ 2020-04-07  7:33   ` Christian Borntraeger
  2020-04-07  7:49     ` David Hildenbrand
  1 sibling, 1 reply; 22+ messages in thread
From: Christian Borntraeger @ 2020-04-07  7:33 UTC (permalink / raw)
  To: David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, Vasily Gorbik, Heiko Carstens,
	Cornelia Huck, Janosch Frank, stable


On 03.04.20 17:30, David Hildenbrand wrote:
> In case we have a region 1 ASCE, our shadow/g3 address can have any value.
> Unfortunately, (-1UL << 64) is undefined and triggers sometimes,
> rejecting valid shadow addresses when trying to walk our shadow table
> hierarchy.
> 
> The result is that the prefix cannot get mapped and we will loop basically
> forever trying to map it (-EAGAIN loop).
> 
> After all, the broken check is only a sanity check, our table shadowing
> code in kvm_s390_shadow_tables() already checks these conditions, injecting
> proper translation exceptions. Turn it into a WARN_ON_ONCE().

After some testing I now triggered this warning:

[  541.633114] ------------[ cut here ]------------
[  541.633128] WARNING: CPU: 38 PID: 2812 at arch/s390/mm/gmap.c:799 gmap_shadow_pgt_lookup+0x98/0x1a0
[  541.633129] Modules linked in: vhost_net vhost macvtap macvlan tap kvm xt_CHECKSUM xt_MASQUERADE nf_nat_tftp nf_conntrack_tftp xt_CT tun bridge stp llc xt_tcpudp ip6t_REJECT nf_reject_ipv6 ip6t_rpfilter ipt_REJECT nf_reject_ipv4 xt_conntrack ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat iptable_mangle iptable_raw iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nfnetlink ip6table_filter ip6_tables iptable_filter rpcrdma sunrpc rdma_ucm rdma_cm iw_cm ib_cm configfs mlx5_ib s390_trng ghash_s390 prng aes_s390 ib_uverbs des_s390 ib_core libdes sha3_512_s390 genwqe_card sha3_256_s390 vfio_ccw crc_itu_t vfio_mdev sha512_s390 mdev vfio_iommu_type1 sha1_s390 vfio eadm_sch zcrypt_cex4 sch_fq_codel ip_tables x_tables mlx5_core sha256_s390 sha_common pkey zcrypt rng_core autofs4
[  541.633164] CPU: 38 PID: 2812 Comm: CPU 0/KVM Not tainted 5.6.0+ #354
[  541.633166] Hardware name: IBM 3906 M04 704 (LPAR)
[  541.633167] Krnl PSW : 0704d00180000000 00000014e05dc454 (gmap_shadow_pgt_lookup+0x9c/0x1a0)
[  541.633169]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
[  541.633171] Krnl GPRS: 0000000000000000 0000001f00000000 0000001f487b8000 ffffffff80000000
[  541.633172]            ffffffffffffffff 000003e003defa18 000003e003defa1c 000003e003defa18
[  541.633173]            fffffffffffff000 000003e003defa18 000003e003defa28 0000001f70e06300
[  541.633174]            0000001f43484000 00000000043ed200 000003e003def978 000003e003def920
[  541.633203] Krnl Code: 00000014e05dc448: b9800038		ngr	%r3,%r8
                          00000014e05dc44c: a7840014		brc	8,00000014e05dc474
                         #00000014e05dc450: af000000		mc	0,0
                         >00000014e05dc454: a728fff5		lhi	%r2,-11
                          00000014e05dc458: a7180000		lhi	%r1,0
                          00000014e05dc45c: b2fa0070		niai	7,0
                          00000014e05dc460: 4010b04a		sth	%r1,74(%r11)
                          00000014e05dc464: b9140022		lgfr	%r2,%r2
[  541.633215] Call Trace:
[  541.633218]  [<00000014e05dc454>] gmap_shadow_pgt_lookup+0x9c/0x1a0 
[  541.633257]  [<000003ff804c57d6>] kvm_s390_shadow_fault+0x66/0x1e8 [kvm] 
[  541.633265]  [<000003ff804c72dc>] vsie_run+0x43c/0x710 [kvm] 
[  541.633273]  [<000003ff804c85ca>] kvm_s390_handle_vsie+0x632/0x750 [kvm] 
[  541.633281]  [<000003ff804c123c>] kvm_s390_handle_b2+0x84/0x4e0 [kvm] 
[  541.633289]  [<000003ff804b46b2>] kvm_handle_sie_intercept+0x172/0xcb8 [kvm] 
[  541.633297]  [<000003ff804b18a8>] __vcpu_run+0x658/0xc90 [kvm] 
[  541.633305]  [<000003ff804b2920>] kvm_arch_vcpu_ioctl_run+0x248/0x858 [kvm] 
[  541.633313]  [<000003ff8049d454>] kvm_vcpu_ioctl+0x284/0x7b0 [kvm] 
[  541.633316]  [<00000014e087d5ae>] ksys_ioctl+0xae/0xe8 
[  541.633318]  [<00000014e087d652>] __s390x_sys_ioctl+0x2a/0x38 
[  541.633323]  [<00000014e0ff02a2>] system_call+0x2a6/0x2c8 
[  541.633323] Last Breaking-Event-Address:
[  541.633334]  [<000003ff804983e0>] kvm_running_vcpu+0x3ea9ee997d8/0x3ea9ee99950 [kvm]
[  541.633335] ---[ end trace f69b6021855ea189 ]---


Unfortunately no dump at that point in time.
I have other tests which are clearly fixed by this patch, so we should probably go forward anyway.
Question is: is this just another bug we need to fix, or is the assumption false that somebody else
checked all conditions so we can warn?



* Re: [PATCH v2 1/5] KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks
  2020-04-07  7:33   ` Christian Borntraeger
@ 2020-04-07  7:49     ` David Hildenbrand
  2020-04-07  7:52       ` Christian Borntraeger
  0 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2020-04-07  7:49 UTC (permalink / raw)
  To: Christian Borntraeger, kvm
  Cc: linux-s390, linux-kernel, Vasily Gorbik, Heiko Carstens,
	Cornelia Huck, Janosch Frank, stable

On 07.04.20 09:33, Christian Borntraeger wrote:
> 
> On 03.04.20 17:30, David Hildenbrand wrote:
>> In case we have a region 1 ASCE, our shadow/g3 address can have any value.
>> Unfortunately, (-1UL << 64) is undefined and triggers sometimes,
>> rejecting valid shadow addresses when trying to walk our shadow table
>> hierarchy.
>>
>> The result is that the prefix cannot get mapped and we will loop basically
>> forever trying to map it (-EAGAIN loop).
>>
>> After all, the broken check is only a sanity check, our table shadowing
>> code in kvm_s390_shadow_tables() already checks these conditions, injecting
>> proper translation exceptions. Turn it into a WARN_ON_ONCE().
> 
> After some testing I now triggered this warning:
> 
> [  541.633114] ------------[ cut here ]------------
> [  541.633128] WARNING: CPU: 38 PID: 2812 at arch/s390/mm/gmap.c:799 gmap_shadow_pgt_lookup+0x98/0x1a0
> [... modules, registers and call trace snipped, quoted in full above ...]
> 
> 
> Unfortunately no dump at that point in time.
> I have other tests which are clearly fixed by this patch, so we should probably go forward anyway.
> Question is: is this just another bug we need to fix, or is the assumption false that somebody else
> checked all conditions so we can warn?

Yeah, I think it is via

kvm_s390_shadow_fault()->gmap_shadow_pgt_lookup()->gmap_table_walk()

where we just peek if there is already something shadowed. If not, we go
via the full kvm_s390_shadow_tables() path.

So we could either do sanity checks in gmap_shadow_pgt_lookup(), or
rather drop the WARN_ON_ONCE. I think the latter makes sense, now that
we understand the problem.
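
For context, the peek looks roughly like this (simplified from my
reading of gmap_shadow_pgt_lookup(); details may differ):

	spin_lock(&sg->guest_table_lock);
	table = gmap_table_walk(sg, saddr, 1);	/* get segment pointer */
	if (table && !(*table & _SEGMENT_ENTRY_INVALID)) {
		/* already shadowed: hand the pgt back to the caller */
	} else {
		rc = -EAGAIN;	/* not shadowed yet: do the full shadowing */
	}
	spin_unlock(&sg->guest_table_lock);

An out-of-range saddr is a perfectly normal "not shadowed" answer on
that path, not a caller bug, so the walk must not warn there.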

-- 
Thanks,

David / dhildenb



* Re: [PATCH v2 1/5] KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks
  2020-04-07  7:49     ` David Hildenbrand
@ 2020-04-07  7:52       ` Christian Borntraeger
  2020-04-07  7:53         ` David Hildenbrand
  2020-04-07 10:48         ` Claudio Imbrenda
  0 siblings, 2 replies; 22+ messages in thread
From: Christian Borntraeger @ 2020-04-07  7:52 UTC (permalink / raw)
  To: David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, Vasily Gorbik, Heiko Carstens,
	Cornelia Huck, Janosch Frank, stable



On 07.04.20 09:49, David Hildenbrand wrote:
> On 07.04.20 09:33, Christian Borntraeger wrote:
>>
>> On 03.04.20 17:30, David Hildenbrand wrote:
>>> In case we have a region 1 ASCE, our shadow/g3 address can have any value.
>>> Unfortunately, (-1UL << 64) is undefined and triggers sometimes,
>>> rejecting valid shadow addresses when trying to walk our shadow table
>>> hierarchy.
>>>
>>> The result is that the prefix cannot get mapped and we will loop basically
>>> forever trying to map it (-EAGAIN loop).
>>>
>>> After all, the broken check is only a sanity check, our table shadowing
>>> code in kvm_s390_shadow_tables() already checks these conditions, injecting
>>> proper translation exceptions. Turn it into a WARN_ON_ONCE().
>>
>> After some testing I now triggered this warning:
>>
>> [  541.633114] ------------[ cut here ]------------
>> [  541.633128] WARNING: CPU: 38 PID: 2812 at arch/s390/mm/gmap.c:799 gmap_shadow_pgt_lookup+0x98/0x1a0
>> [... modules, registers and call trace snipped, quoted in full above ...]
>>
>>
>> Unfortunately no dump at that point in time.
>> I have other tests which are clearly fixed by this patch, so we should probably go forward anyway.
>> Question is: is this just another bug we need to fix, or is the assumption false that somebody else
>> checked all conditions so we can warn?
> 
> Yeah, I think it is via
> 
> kvm_s390_shadow_fault()->gmap_shadow_pgt_lookup()->gmap_table_walk()
> 
> where we just peek if there is already something shadowed. If not, we go
> via the full kvm_s390_shadow_tables() path.
> 
> So we could either do sanity checks in gmap_shadow_pgt_lookup(), or
>> rather drop the WARN_ON_ONCE. I think the latter makes sense, now that
>> we understand the problem.

Ok, so I will drop the WARN_ON_ONCE and fixup the commit message.



* Re: [PATCH v2 1/5] KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks
  2020-04-07  7:52       ` Christian Borntraeger
@ 2020-04-07  7:53         ` David Hildenbrand
  2020-04-07 10:48         ` Claudio Imbrenda
  1 sibling, 0 replies; 22+ messages in thread
From: David Hildenbrand @ 2020-04-07  7:53 UTC (permalink / raw)
  To: Christian Borntraeger, kvm
  Cc: linux-s390, linux-kernel, Vasily Gorbik, Heiko Carstens,
	Cornelia Huck, Janosch Frank, stable

On 07.04.20 09:52, Christian Borntraeger wrote:
> 
> 
> On 07.04.20 09:49, David Hildenbrand wrote:
>> On 07.04.20 09:33, Christian Borntraeger wrote:
>>>
>>> On 03.04.20 17:30, David Hildenbrand wrote:
>>>> In case we have a region 1 ASCE, our shadow/g3 address can have any value.
>>>> Unfortunately, (-1UL << 64) is undefined and triggers sometimes,
>>>> rejecting valid shadow addresses when trying to walk our shadow table
>>>> hierarchy.
>>>>
>>>> The result is that the prefix cannot get mapped and we will loop basically
>>>> forever trying to map it (-EAGAIN loop).
>>>>
>>>> After all, the broken check is only a sanity check, our table shadowing
>>>> code in kvm_s390_shadow_tables() already checks these conditions, injecting
>>>> proper translation exceptions. Turn it into a WARN_ON_ONCE().
>>>
>>> After some testing I now triggered this warning:
>>>
>>> [  541.633114] ------------[ cut here ]------------
>>> [  541.633128] WARNING: CPU: 38 PID: 2812 at arch/s390/mm/gmap.c:799 gmap_shadow_pgt_lookup+0x98/0x1a0
>>> [... modules, registers and call trace snipped, quoted in full above ...]
>>>
>>>
>>> Unfortunately no dump at that point in time.
>>> I have other tests which are clearly fixed by this patch, so we should probably go forward anyway.
>>> Question is: is this just another bug we need to fix, or is the assumption false that somebody else
>>> checked all conditions so we can warn?
>>
>> Yeah, I think it is via
>>
>> kvm_s390_shadow_fault()->gmap_shadow_pgt_lookup()->gmap_table_walk()
>>
>> where we just peek if there is already something shadowed. If not, we go
>> via the full kvm_s390_shadow_tables() path.
>>
>> So we could either do sanity checks in gmap_shadow_pgt_lookup(), or
>> rather drop the WARN_ON_ONCE. I think the latter makes sense, now that
>> we understand the problem.
> 
> Ok, so I will drop the WARN_ON_ONCE and fixup the commit message.
> 

Perfect, thanks!

-- 
Thanks,

David / dhildenb



* Re: [PATCH v2 1/5] KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks
  2020-04-07  7:52       ` Christian Borntraeger
  2020-04-07  7:53         ` David Hildenbrand
@ 2020-04-07 10:48         ` Claudio Imbrenda
  1 sibling, 0 replies; 22+ messages in thread
From: Claudio Imbrenda @ 2020-04-07 10:48 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: David Hildenbrand, kvm, linux-s390, linux-kernel, Vasily Gorbik,
	Heiko Carstens, Cornelia Huck, Janosch Frank, stable

On Tue, 7 Apr 2020 09:52:53 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> On 07.04.20 09:49, David Hildenbrand wrote:
> > On 07.04.20 09:33, Christian Borntraeger wrote:  
> >>
> >> On 03.04.20 17:30, David Hildenbrand wrote:  
> >>> In case we have a region 1 ASCE, our shadow/g3 address can have
> >>> any value. Unfortunately, (-1UL << 64) is undefined and triggers
> >>> sometimes, rejecting valid shadow addresses when trying to walk
> >>> our shadow table hierarchy.
> >>>
> >>> The result is that the prefix cannot get mapped and we will loop
> >>> basically forever trying to map it (-EAGAIN loop).
> >>>
> >>> After all, the broken check is only a sanity check, our table
> >>> shadowing code in kvm_s390_shadow_tables() already checks these
> >>> conditions, injecting proper translation exceptions. Turn it into
> >>> a WARN_ON_ONCE().  
> >>
> >> After some testing I now triggered this warning:
> >>
> >> [  541.633114] ------------[ cut here ]------------
> >> [  541.633128] WARNING: CPU: 38 PID: 2812 at arch/s390/mm/gmap.c:799 gmap_shadow_pgt_lookup+0x98/0x1a0
> >> [... modules, registers and call trace snipped, quoted in full above ...]
> >>
> >> Unfortunately no dump at that point in time.
> >> I have other tests which are clearly fixed by this patch, so we should
> >> probably go forward anyway. Question is: is this just another bug we
> >> need to fix, or is the assumption false that somebody else checked all
> >> conditions so we can warn?
> > 
> > Yeah, I think it is via
> > 
> > kvm_s390_shadow_fault()->gmap_shadow_pgt_lookup()->gmap_table_walk()
> > 
> > where we just peek if there is already something shadowed. If not,
> > we go via the full kvm_s390_shadow_tables() path.
> > 
> > So we could either do sanity checks in gmap_shadow_pgt_lookup(), or
> > rather drop the WARN_ON_ONCE. I think the latter makes sense, now
> > that we understand the problem.
> 
> Ok, so I will drop the WARN_ON_ONCE and fixup the commit message.
> 

with those fixes, you can also add:

Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>



* Re: [PATCH v2 4/5] KVM: s390: vsie: Move conditional reschedule
  2020-04-03 15:30 ` [PATCH v2 4/5] KVM: s390: vsie: Move conditional reschedule David Hildenbrand
  2020-04-06 15:06   ` Christian Borntraeger
@ 2020-04-07 10:52   ` Claudio Imbrenda
  1 sibling, 0 replies; 22+ messages in thread
From: Claudio Imbrenda @ 2020-04-07 10:52 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: kvm, linux-s390, linux-kernel, Vasily Gorbik, Heiko Carstens,
	Cornelia Huck, Janosch Frank, Christian Borntraeger

On Fri,  3 Apr 2020 17:30:49 +0200
David Hildenbrand <david@redhat.com> wrote:

> Let's move it to the outer loop, in case we ever run into long loops
> again, trying to map the prefix. While at it, convert it to
> cond_resched().
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  arch/s390/kvm/vsie.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
> index 4f6c22d72072..ef05b4e167fb 100644
> --- a/arch/s390/kvm/vsie.c
> +++ b/arch/s390/kvm/vsie.c
> @@ -1000,8 +1000,6 @@ static int do_vsie_run(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
>  
>  	handle_last_fault(vcpu, vsie_page);
>  
> -	if (need_resched())
> -		schedule();
>  	if (test_cpu_flag(CIF_MCCK_PENDING))
>  		s390_handle_mcck();
>  
> @@ -1185,6 +1183,7 @@ static int vsie_run(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
>  		    kvm_s390_vcpu_has_irq(vcpu, 0) ||
>  		    kvm_s390_vcpu_sie_inhibited(vcpu))
>  			break;
> +		cond_resched();
>  	}
>  
>  	if (rc == -EFAULT) {

Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>



* Re: [PATCH v2 2/5] KVM: s390: vsie: Fix delivery of addressing exceptions
  2020-04-03 15:30 ` [PATCH v2 2/5] KVM: s390: vsie: Fix delivery of addressing exceptions David Hildenbrand
@ 2020-04-07 11:00   ` Claudio Imbrenda
  2020-04-07 11:35     ` David Hildenbrand
  0 siblings, 1 reply; 22+ messages in thread
From: Claudio Imbrenda @ 2020-04-07 11:00 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: kvm, linux-s390, linux-kernel, Vasily Gorbik, Heiko Carstens,
	Cornelia Huck, Janosch Frank, Christian Borntraeger, stable

On Fri,  3 Apr 2020 17:30:47 +0200
David Hildenbrand <david@redhat.com> wrote:

> Whenever we get an -EFAULT, we failed to read in guest 2 physical
> address space. Such addressing exceptions are reported via a program
> intercept to the nested hypervisor.
> 
> Since we faked the intercept, we have to return to guest 2. Instead, right
> now we would be returning -EFAULT from the intercept handler,
> eventually crashing the VM.
> 
> Addressing exceptions can only happen if the g2->g3 page tables
> reference invalid g2 addresses (say, either a table or the final page
> is not accessible) - so something that basically never happens in sane
> environments.
> 
> Identified by manual code inspection.
> 
> Fixes: a3508fbe9dc6 ("KVM: s390: vsie: initial support for nested
> virtualization") Cc: <stable@vger.kernel.org> # v4.8+
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  arch/s390/kvm/vsie.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
> index 076090f9e666..4f6c22d72072 100644
> --- a/arch/s390/kvm/vsie.c
> +++ b/arch/s390/kvm/vsie.c
> @@ -1202,6 +1202,7 @@ static int vsie_run(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
>  		scb_s->iprcc = PGM_ADDRESSING;
>  		scb_s->pgmilc = 4;
>  		scb_s->gpsw.addr = __rewind_psw(scb_s->gpsw, 4);
> +		rc = 1;
>  	}
>  	return rc;
>  }

so, the reason why we never noticed this issue before is simply that
nobody tried running a misbehaving nested guest?

Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>



* Re: [PATCH v2 3/5] KVM: s390: vsie: Fix possible race when shadowing region 3 tables
  2020-04-03 15:30 ` [PATCH v2 3/5] KVM: s390: vsie: Fix possible race when shadowing region 3 tables David Hildenbrand
@ 2020-04-07 11:05   ` Claudio Imbrenda
  0 siblings, 0 replies; 22+ messages in thread
From: Claudio Imbrenda @ 2020-04-07 11:05 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: kvm, linux-s390, linux-kernel, Vasily Gorbik, Heiko Carstens,
	Cornelia Huck, Janosch Frank, Christian Borntraeger

On Fri,  3 Apr 2020 17:30:48 +0200
David Hildenbrand <david@redhat.com> wrote:

> We have to properly retry by returning -EAGAIN immediately in case
> somebody else instantiated the table concurrently; we missed adding
> the goto in this function only. The code now matches the other,
> similar shadowing functions.
> 
> We are overwriting an existing region 2 table entry. All allocated
> pages are added to the crst_list to be freed later, so they are not
> lost forever. However, when unshadowing the region 2 table, we
> wouldn't trigger unshadowing of the original shadowed region 3 table
> that we replaced. It would get unshadowed when the original region 3
> table is modified. As it's no longer connected to the page table
> hierarchy, it's not going to get used again. However, for a limited
> time, this page table will stick around, so it's in some sense a
> temporary memory leak.
> 
> Identified by manual code inspection. I don't think this qualifies as
> stable material.
> 
> Fixes: 998f637cc4b9 ("s390/mm: avoid races on region/segment/page
> table shadowing") Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  arch/s390/mm/gmap.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
> index b93dd54b234a..24ef30fb0833 100644
> --- a/arch/s390/mm/gmap.c
> +++ b/arch/s390/mm/gmap.c
> @@ -1844,6 +1844,7 @@ int gmap_shadow_r3t(struct gmap *sg, unsigned long saddr, unsigned long r3t,
>  		goto out_free;
>  	} else if (*table & _REGION_ENTRY_ORIGIN) {
>  		rc = -EAGAIN;		/* Race with shadow */
> +		goto out_free;
>  	}
>  	crst_table_init(s_r3t, _REGION3_ENTRY_EMPTY);
>  	/* mark as invalid as long as the parent table is not protected */

Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
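
To spell out the pattern the fix restores, here is a generic,
self-contained sketch (hypothetical names; the real code installs s390
CRST tables under sg->guest_table_lock):

#include <errno.h>

#define ORIGIN 0x1UL	/* marker: entry already points to a table */

static unsigned long alloc_table(void);
static void free_table(unsigned long t);

static int shadow_one_level(unsigned long *entry)
{
	unsigned long new = alloc_table();
	int rc;

	/* ...take the lock... */
	if (*entry & ORIGIN) {
		rc = -EAGAIN;	/* race: someone else shadowed first */
		goto out_free;	/* the goto this patch adds */
	}
	*entry = new | ORIGIN;
	/* ...drop the lock... */
	return 0;
out_free:
	/* ...drop the lock... */
	free_table(new);
	return rc;
}

Allocate up front, then, under the lock, either install the new table
or back off completely if another CPU won the race - never fall through
and overwrite the winner's entry, which is exactly what the missing
goto allowed.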


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 5/5] KVM: s390: vsie: gmap_table_walk() simplifications
  2020-04-03 15:30 ` [PATCH v2 5/5] KVM: s390: vsie: gmap_table_walk() simplifications David Hildenbrand
  2020-04-06 16:06   ` Christian Borntraeger
@ 2020-04-07 11:10   ` Claudio Imbrenda
  1 sibling, 0 replies; 22+ messages in thread
From: Claudio Imbrenda @ 2020-04-07 11:10 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: kvm, linux-s390, linux-kernel, Vasily Gorbik, Heiko Carstens,
	Cornelia Huck, Janosch Frank, Christian Borntraeger

On Fri,  3 Apr 2020 17:30:50 +0200
David Hildenbrand <david@redhat.com> wrote:

> Let's use asce_type where applicable. Also, simplify our sanity check
> for valid table levels and convert it into a WARN_ON_ONCE(). Check if
> we even have a valid gmap shadow as the very first step.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  arch/s390/mm/gmap.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
> index 24ef30fb0833..a2bd8d7792e9 100644
> --- a/arch/s390/mm/gmap.c
> +++ b/arch/s390/mm/gmap.c
> @@ -788,19 +788,19 @@ static inline unsigned long *gmap_table_walk(struct gmap *gmap,
>  					     unsigned long gaddr, int level)
>  {
>  	const int asce_type = gmap->asce & _ASCE_TYPE_MASK;
> -	unsigned long *table;
> +	unsigned long *table = gmap->table;
>  
> -	if ((gmap->asce & _ASCE_TYPE_MASK) + 4 < (level * 4))
> -		return NULL;
>  	if (gmap_is_shadow(gmap) && gmap->removed)
>  		return NULL;
>  
> +	if (WARN_ON_ONCE(level > (asce_type >> 2) + 1))
> +		return NULL;
> +
>  	if (WARN_ON_ONCE(asce_type != _ASCE_TYPE_REGION1 &&
>  			 gaddr & (-1UL << (31 + (asce_type >> 2) * 11))))
>  		return NULL;
>  
> -	table = gmap->table;
> -	switch (gmap->asce & _ASCE_TYPE_MASK) {
> +	switch (asce_type) {
>  	case _ASCE_TYPE_REGION1:
>  		table += (gaddr & _REGION1_INDEX) >> _REGION1_SHIFT;
>  		if (level == 4)

Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
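
To make the reworked checks concrete, a small standalone program (the
_ASCE_TYPE_* values mirror arch/s390/include/asm/pgtable.h; the harness
itself is a hypothetical illustration) prints what they compute,
assuming the gmap convention that level 1 is a segment table and level
4 a region 1 table:

#include <stdio.h>

#define _ASCE_TYPE_SEGMENT	0x00
#define _ASCE_TYPE_REGION3	0x04
#define _ASCE_TYPE_REGION2	0x08
#define _ASCE_TYPE_REGION1	0x0c

int main(void)
{
	const int types[] = { _ASCE_TYPE_SEGMENT, _ASCE_TYPE_REGION3,
			      _ASCE_TYPE_REGION2, _ASCE_TYPE_REGION1 };
	int i;

	for (i = 0; i < 4; i++) {
		int asce_type = types[i];
		int top_level = (asce_type >> 2) + 1;	/* WARN_ON_ONCE bound */
		int addr_bits = 31 + (asce_type >> 2) * 11;	/* range check */

		printf("asce type 0x%02x: top table level %d, %d-bit addresses\n",
		       asce_type, top_level, addr_bits);
	}
	/* For region 1 ASCEs addr_bits is 64, and -1UL << 64 is undefined
	 * behavior - which is why the quoted check explicitly skips the
	 * gaddr test when asce_type == _ASCE_TYPE_REGION1. */
	return 0;
}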


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 2/5] KVM: s390: vsie: Fix delivery of addressing exceptions
  2020-04-07 11:00   ` Claudio Imbrenda
@ 2020-04-07 11:35     ` David Hildenbrand
  0 siblings, 0 replies; 22+ messages in thread
From: David Hildenbrand @ 2020-04-07 11:35 UTC (permalink / raw)
  To: Claudio Imbrenda
  Cc: kvm, linux-s390, linux-kernel, Vasily Gorbik, Heiko Carstens,
	Cornelia Huck, Janosch Frank, Christian Borntraeger, stable

On 07.04.20 13:00, Claudio Imbrenda wrote:
> On Fri,  3 Apr 2020 17:30:47 +0200
> David Hildenbrand <david@redhat.com> wrote:
> 
>> Whenever we get an -EFAULT, we failed to read in guest 2 physical
>> address space. Such addressing exceptions are reported via a program
>> intercept to the nested hypervisor.
>>
>> As we faked the intercept, we have to return to guest 2. Instead, right
>> now we would be returning -EFAULT from the intercept handler,
>> eventually crashing the VM.
>>
>> Addressing exceptions can only happen if the g2->g3 page tables
>> reference invalid g2 addresses (say, either a table or the final page
>> is not accessible) - so something that basically never happens in sane
>> environments.
>>
>> Identified by manual code inspection.
>>
>> Fixes: a3508fbe9dc6 ("KVM: s390: vsie: initial support for nested virtualization")
>> Cc: <stable@vger.kernel.org> # v4.8+
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>  arch/s390/kvm/vsie.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
>> index 076090f9e666..4f6c22d72072 100644
>> --- a/arch/s390/kvm/vsie.c
>> +++ b/arch/s390/kvm/vsie.c
>> @@ -1202,6 +1202,7 @@ static int vsie_run(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
>>  		scb_s->iprcc = PGM_ADDRESSING;
>>  		scb_s->pgmilc = 4;
>>  		scb_s->gpsw.addr = __rewind_psw(scb_s->gpsw, 4);
>> +		rc = 1;
>>  	}
>>  	return rc;
>>  }
> 
> so, the reason why we never noticed this issue before is simply that
> nobody tried running a misbehaving nested guest?

Yes, actually, a misbehaving nested hypervisor.

> 
> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> 

Thanks!

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2020-04-07 11:35 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-03 15:30 [PATCH v2 0/5] KVM: s390: vsie: fixes and cleanups David Hildenbrand
2020-04-03 15:30 ` [PATCH v2 1/5] KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks David Hildenbrand
2020-04-03 17:56   ` Christian Borntraeger
2020-04-03 19:55     ` David Hildenbrand
2020-04-06  8:32       ` Christian Borntraeger
2020-04-07  7:33   ` Christian Borntraeger
2020-04-07  7:49     ` David Hildenbrand
2020-04-07  7:52       ` Christian Borntraeger
2020-04-07  7:53         ` David Hildenbrand
2020-04-07 10:48         ` Claudio Imbrenda
2020-04-03 15:30 ` [PATCH v2 2/5] KVM: s390: vsie: Fix delivery of addressing exceptions David Hildenbrand
2020-04-07 11:00   ` Claudio Imbrenda
2020-04-07 11:35     ` David Hildenbrand
2020-04-03 15:30 ` [PATCH v2 3/5] KVM: s390: vsie: Fix possible race when shadowing region 3 tables David Hildenbrand
2020-04-07 11:05   ` Claudio Imbrenda
2020-04-03 15:30 ` [PATCH v2 4/5] KVM: s390: vsie: Move conditional reschedule David Hildenbrand
2020-04-06 15:06   ` Christian Borntraeger
2020-04-07 10:52   ` Claudio Imbrenda
2020-04-03 15:30 ` [PATCH v2 5/5] KVM: s390: vsie: gmap_table_walk() simplifications David Hildenbrand
2020-04-06 16:06   ` Christian Borntraeger
2020-04-07 11:10   ` Claudio Imbrenda
2020-04-06 16:06 ` [PATCH v2 0/5] KVM: s390: vsie: fixes and cleanups Christian Borntraeger
