From: Peter Xu <peterx@redhat.com>
To: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: "James Hogan" <jhogan@kernel.org>,
	"Paul Mackerras" <paulus@ozlabs.org>,
	"Christian Borntraeger" <borntraeger@de.ibm.com>,
	"Janosch Frank" <frankja@linux.ibm.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Marc Zyngier" <maz@kernel.org>,
	linux-arm-kernel@lists.infradead.org,
	"Wanpeng Li" <wanpengli@tencent.com>,
	"Philippe Mathieu-Daudé" <f4bug@amsat.org>,
	kvm@vger.kernel.org, "David Hildenbrand" <david@redhat.com>,
	"Joerg Roedel" <joro@8bytes.org>,
	"Cornelia Huck" <cohuck@redhat.com>,
	linux-mips@vger.kernel.org, linux-kernel@vger.kernel.org,
	kvm-ppc@vger.kernel.org, "Vitaly Kuznetsov" <vkuznets@redhat.com>,
	kvmarm@lists.cs.columbia.edu, "Jim Mattson" <jmattson@google.com>,
	"David Gibson" <david@gibson.dropbear.id.au>
Subject: Re: [PATCH v4 01/19] KVM: x86: Allocate new rmap and large page tracking when moving memslot
Date: Tue, 17 Dec 2019 16:56:40 -0500	[thread overview]
Message-ID: <20191217215640.GI7258@xz-x1> (raw)
In-Reply-To: <20191217204041.10815-2-sean.j.christopherson@intel.com>

On Tue, Dec 17, 2019 at 12:40:23PM -0800, Sean Christopherson wrote:
> Reallocate a rmap array and recalculate large page compatibility when
> moving an existing memslot to correctly handle the alignment properties
> of the new memslot.  The number of rmap entries required at each level
> is dependent on the alignment of the memslot's base gfn with respect to
> that level, e.g. moving a large-page aligned memslot so that it becomes
> unaligned will increase the number of rmap entries needed at the now
> unaligned level.
> 
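
(Just to illustrate the alignment dependence described above -- a
simplified sketch, not the exact kernel code; LEVEL_SHIFT and
rmap_entries are made-up names:)

#include <stdio.h>

typedef unsigned long long gfn_t;

/* 9 bits per page-table level on x86-64: 4K -> 2M -> 1G. */
#define LEVEL_SHIFT(level)	(((level) - 1) * 9)

static unsigned long rmap_entries(gfn_t base_gfn, gfn_t npages, int level)
{
	gfn_t last_gfn = base_gfn + npages - 1;

	return (last_gfn >> LEVEL_SHIFT(level)) -
	       (base_gfn >> LEVEL_SHIFT(level)) + 1;
}

int main(void)
{
	/* 512 pages at a 2M-aligned base: one entry at the 2M level... */
	printf("%lu\n", rmap_entries(0x200, 512, 2));	/* 1 */
	/* ...the same slot moved to an unaligned base now needs two. */
	printf("%lu\n", rmap_entries(0x201, 512, 2));	/* 2 */
	return 0;
}
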
> Not updating the rmap array is the most obvious bug, as KVM accesses
> garbage data beyond the end of the rmap.  KVM interprets the bad data as
> pointers, leading to non-canonical #GPs, unexpected #PFs, etc...
> 
>   general protection fault: 0000 [#1] SMP
>   CPU: 0 PID: 1909 Comm: move_memory_reg Not tainted 5.4.0-rc7+ #139
>   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
>   RIP: 0010:rmap_get_first+0x37/0x50 [kvm]
>   Code: <48> 8b 3b 48 85 ff 74 ec e8 6c f4 ff ff 85 c0 74 e3 48 89 d8 5b c3
>   RSP: 0018:ffffc9000021bbc8 EFLAGS: 00010246
>   RAX: ffff00617461642e RBX: ffff00617461642e RCX: 0000000000000012
>   RDX: ffff88827400f568 RSI: ffffc9000021bbe0 RDI: ffff88827400f570
>   RBP: 0010000000000000 R08: ffffc9000021bd00 R09: ffffc9000021bda8
>   R10: ffffc9000021bc48 R11: 0000000000000000 R12: 0030000000000000
>   R13: 0000000000000000 R14: ffff88827427d700 R15: ffffc9000021bce8
>   FS:  00007f7eda014700(0000) GS:ffff888277a00000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 00007f7ed9216ff8 CR3: 0000000274391003 CR4: 0000000000162eb0
>   Call Trace:
>    kvm_mmu_slot_set_dirty+0xa1/0x150 [kvm]
>    __kvm_set_memory_region.part.64+0x559/0x960 [kvm]
>    kvm_set_memory_region+0x45/0x60 [kvm]
>    kvm_vm_ioctl+0x30f/0x920 [kvm]
>    do_vfs_ioctl+0xa1/0x620
>    ksys_ioctl+0x66/0x70
>    __x64_sys_ioctl+0x16/0x20
>    do_syscall_64+0x4c/0x170
>    entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   RIP: 0033:0x7f7ed9911f47
>   Code: <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 21 6f 2c 00 f7 d8 64 89 01 48
>   RSP: 002b:00007ffc00937498 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>   RAX: ffffffffffffffda RBX: 0000000001ab0010 RCX: 00007f7ed9911f47
>   RDX: 0000000001ab1350 RSI: 000000004020ae46 RDI: 0000000000000004
>   RBP: 000000000000000a R08: 0000000000000000 R09: 00007f7ed9214700
>   R10: 00007f7ed92149d0 R11: 0000000000000246 R12: 00000000bffff000
>   R13: 0000000000000003 R14: 00007f7ed9215000 R15: 0000000000000000
>   Modules linked in: kvm_intel kvm irqbypass
>   ---[ end trace 0c5f570b3358ca89 ]---
> 
> The disallow_lpage tracking is more subtle.  Failure to update results
> in KVM creating large pages when it shouldn't, either due to stale data
> or again due to indexing beyond the end of the metadata arrays, which
> can lead to memory corruption and/or leaking data to guest/userspace.
> 
> Note, the arrays for the old memslot are freed by the unconditional call
> to kvm_free_memslot() in __kvm_set_memory_region().
> 
> Fixes: 05da45583de9b ("KVM: MMU: large page support")
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>

Reviewed-by: Peter Xu <peterx@redhat.com>

I think the error-prone part is:

	new = old = *slot;

IMHO it would be better to copy the pointers explicitly, and only where
we have them under control, rather than blindly copying every pointer
in the structure, which even contains sub-structures.
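
(A minimal sketch of the hazard, with made-up structure names rather
than the real KVM ones: the struct assignment duplicates every embedded
pointer, so the copy silently aliases the original's arrays.)

#include <stdlib.h>

struct arch_slot { unsigned long *rmap; };
struct memslot { unsigned long base_gfn; struct arch_slot arch; };

int main(void)
{
	struct memslot slot = { .base_gfn = 0x200 };
	struct memslot old, new;

	slot.arch.rmap = calloc(512, sizeof(*slot.arch.rmap));

	new = old = slot;	/* new.arch.rmap == old.arch.rmap == slot.arch.rmap */

	/*
	 * Freeing through one alias leaves the others dangling; only an
	 * explicit reallocation for the new slot (what this patch adds on
	 * x86) breaks the sharing.
	 */
	free(old.arch.rmap);
	/* new.arch.rmap now points at freed memory. */
	return 0;
}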

For example, I see PPC has this:

struct kvm_arch_memory_slot {
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
	unsigned long *rmap;
#endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
};

I started to look into its HV code a bit, and I see...

 - kvm_arch_create_memslot(kvmppc_core_create_memslot_hv) initializes
   slot->arch.rmap,
 - kvm_arch_flush_shadow_memslot(kvmppc_core_flush_memslot_hv) doesn't
   free it,
 - kvm_arch_prepare_memory_region(kvmppc_core_prepare_memory_region_hv)
   is a nop.

So does it have a similar issue?

-- 
Peter Xu
