* [PATCH RESEND 1/2] KVM: Prevent module exit until all VMs are freed
From: David Matlack @ 2022-03-03 18:33 UTC
To: pbonzini
Cc: David Matlack, kvm, Marcelo Tosatti, Gleb Natapov, Rik van Riel,
seanjc, bgardon, stable
Tie the lifetime of the KVM module to the lifetime of each VM via
kvm.users_count. This way anything that grabs a reference to the VM via
kvm_get_kvm() cannot accidentally outlive the KVM module.
Prior to this commit, the lifetime of the KVM module was tied to the
lifetime of /dev/kvm file descriptors, VM file descriptors, and vCPU
file descriptors by their respective file_operations "owner" field.
This approach is insufficient because references grabbed via
kvm_get_kvm() do not prevent closing any of the aforementioned file
descriptors.
This fixes a long standing theoretical bug in KVM that at least affects
async page faults. kvm_setup_async_pf() grabs a reference via
kvm_get_kvm(), and drops it in an asynchronous work callback. Nothing
prevents the VM file descriptor from being closed and the KVM module
from being unloaded before this callback runs.
Fixes: af585b921e5d ("KVM: Halt vcpu if page it tries to access is swapped out")
Cc: stable@vger.kernel.org
Suggested-by: Ben Gardon <bgardon@google.com>
[ Based on a patch from Ben implemented for Google's kernel. ]
Signed-off-by: David Matlack <dmatlack@google.com>
---
virt/kvm/kvm_main.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 35ae6d32dae5..b59f0a29dbd5 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -117,6 +117,8 @@ EXPORT_SYMBOL_GPL(kvm_debugfs_dir);
 
 static const struct file_operations stat_fops_per_vm;
 
+static struct file_operations kvm_chardev_ops;
+
 static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
 			   unsigned long arg);
 #ifdef CONFIG_KVM_COMPAT
@@ -1131,6 +1133,11 @@ static struct kvm *kvm_create_vm(unsigned long type)
 	preempt_notifier_inc();
 	kvm_init_pm_notifier(kvm);
 
+	if (!try_module_get(kvm_chardev_ops.owner)) {
+		r = -ENODEV;
+		goto out_err;
+	}
+
 	return kvm;
 
 out_err:
@@ -1220,6 +1227,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
 	preempt_notifier_dec();
 	hardware_disable_all();
 	mmdrop(mm);
+	module_put(kvm_chardev_ops.owner);
 }
 
 void kvm_get_kvm(struct kvm *kvm)
base-commit: b13a3befc815eae574d87e6249f973dfbb6ad6cd
prerequisite-patch-id: 38f66d60319bf0bc9bf49f91f0f9119e5441629b
prerequisite-patch-id: 51aa921d68ea649d436ea68e1b8f4aabc3805156
--
2.35.1.616.g0bdcbb4464-goog
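The lifetime problem described in the commit message can be sketched outside the kernel. What follows is a minimal userspace model, not KVM code: model_module, model_vm, and every model_* helper are hypothetical stand-ins, and the scenario function simply replays the sequence from the commit message (async work grabs a VM reference, then the file descriptor is closed) with and without the module pin added by this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a loadable module; not the kernel's struct module. */
struct model_module {
	int refcnt;	/* references that block unloading */
};

/* Hypothetical stand-in for struct kvm. */
struct model_vm {
	int users_count;		/* plays the role of kvm.users_count */
	struct model_module *owner;	/* module the VM code lives in */
	bool pins_module;		/* true models the behaviour added by the patch */
};

/* File descriptors pin the module through file_operations.owner. */
static void model_open_fd(struct model_module *mod)  { mod->refcnt++; }
static void model_close_fd(struct model_module *mod) { mod->refcnt--; }

/* Models kvm_create_vm(): when patched, the VM itself takes a module reference. */
static void model_create_vm(struct model_vm *vm, struct model_module *mod,
			    bool patched)
{
	vm->users_count = 1;
	vm->owner = mod;
	vm->pins_module = patched;
	if (patched)
		mod->refcnt++;
}

/* Models kvm_get_kvm(). */
static void model_get_vm(struct model_vm *vm) { vm->users_count++; }

/* Models kvm_put_kvm(): the final put destroys the VM and drops its module pin. */
static void model_put_vm(struct model_vm *vm)
{
	assert(vm->users_count > 0);
	if (--vm->users_count == 0 && vm->pins_module)
		vm->owner->refcnt--;
}

/*
 * Replays the scenario and reports whether the module could be unloaded
 * while the async work still holds its VM reference.
 */
bool model_unloadable_with_pending_work(bool patched)
{
	struct model_module mod = { .refcnt = 0 };
	struct model_vm vm;
	bool unloadable;

	model_open_fd(&mod);			/* the VM file descriptor */
	model_create_vm(&vm, &mod, patched);	/* initial VM reference */
	model_get_vm(&vm);			/* async work grabs a reference */

	model_put_vm(&vm);			/* closing the fd drops its VM reference... */
	model_close_fd(&mod);			/* ...and its module reference */

	unloadable = (mod.refcnt == 0);		/* would module exit be allowed now? */

	model_put_vm(&vm);			/* async callback finally runs */
	return unloadable;
}
```

Calling model_unloadable_with_pending_work(false) models the pre-patch behaviour, where the module refcount reaches zero while the callback is still pending. Passing true models the patched behaviour, where the pin is only dropped by the final put.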
* Re: [PATCH RESEND 1/2] KVM: Prevent module exit until all VMs are freed
From: Sean Christopherson @ 2022-03-08 21:40 UTC
To: David Matlack
Cc: pbonzini, kvm, Marcelo Tosatti, Gleb Natapov, Rik van Riel,
bgardon, stable
On Thu, Mar 03, 2022, David Matlack wrote:
> Tie the lifetime of the KVM module to the lifetime of each VM via
> kvm.users_count. This way anything that grabs a reference to the VM via
> kvm_get_kvm() cannot accidentally outlive the KVM module.
>
> Prior to this commit, the lifetime of the KVM module was tied to the
> lifetime of /dev/kvm file descriptors, VM file descriptors, and vCPU
> file descriptors by their respective file_operations "owner" field.
> This approach is insufficient because references grabbed via
> kvm_get_kvm() do not prevent closing any of the aforementioned file
> descriptors.
>
> This fixes a long standing theoretical bug in KVM that at least affects
> async page faults. kvm_setup_async_pf() grabs a reference via
> kvm_get_kvm(), and drops it in an asynchronous work callback. Nothing
> prevents the VM file descriptor from being closed and the KVM module
> from being unloaded before this callback runs.
>
> Fixes: af585b921e5d ("KVM: Halt vcpu if page it tries to access is swapped out")
And (or)
Fixes: 3d3aab1b973b ("KVM: set owner of cpu and vm file operations")
because the above is x86-centric; at a glance, PPC and maybe s390 have issues
beyond async #PF.
> Cc: stable@vger.kernel.org
> Suggested-by: Ben Gardon <bgardon@google.com>
> [ Based on a patch from Ben implemented for Google's kernel. ]
> Signed-off-by: David Matlack <dmatlack@google.com>
> ---
> virt/kvm/kvm_main.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 35ae6d32dae5..b59f0a29dbd5 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -117,6 +117,8 @@ EXPORT_SYMBOL_GPL(kvm_debugfs_dir);
>
> static const struct file_operations stat_fops_per_vm;
>
> +static struct file_operations kvm_chardev_ops;
> +
> static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
> unsigned long arg);
> #ifdef CONFIG_KVM_COMPAT
> @@ -1131,6 +1133,11 @@ static struct kvm *kvm_create_vm(unsigned long type)
> preempt_notifier_inc();
> kvm_init_pm_notifier(kvm);
>
> + if (!try_module_get(kvm_chardev_ops.owner)) {
The "try" aspect is unnecessary. Stealing from Paolo's version,
/* KVM is pinned via open("/dev/kvm"), the fd passed to this ioctl(). */
__module_get(kvm_chardev_ops.owner);
> + r = -ENODEV;
> + goto out_err;
> + }
> +
> return kvm;
>
> out_err:
> @@ -1220,6 +1227,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
> preempt_notifier_dec();
> hardware_disable_all();
> mmdrop(mm);
> + module_put(kvm_chardev_ops.owner);
> }
>
> void kvm_get_kvm(struct kvm *kvm)
>
> base-commit: b13a3befc815eae574d87e6249f973dfbb6ad6cd
> prerequisite-patch-id: 38f66d60319bf0bc9bf49f91f0f9119e5441629b
> prerequisite-patch-id: 51aa921d68ea649d436ea68e1b8f4aabc3805156
> --
> 2.35.1.616.g0bdcbb4464-goog
>
* Re: [PATCH RESEND 1/2] KVM: Prevent module exit until all VMs are freed
From: David Matlack @ 2022-03-08 22:28 UTC
To: Sean Christopherson
Cc: Paolo Bonzini, kvm list, Marcelo Tosatti, Gleb Natapov,
Rik van Riel, Ben Gardon, stable
On Tue, Mar 8, 2022 at 1:40 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Thu, Mar 03, 2022, David Matlack wrote:
> > Tie the lifetime of the KVM module to the lifetime of each VM via
> > kvm.users_count. This way anything that grabs a reference to the VM via
> > kvm_get_kvm() cannot accidentally outlive the KVM module.
> >
> > Prior to this commit, the lifetime of the KVM module was tied to the
> > lifetime of /dev/kvm file descriptors, VM file descriptors, and vCPU
> > file descriptors by their respective file_operations "owner" field.
> > This approach is insufficient because references grabbed via
> > kvm_get_kvm() do not prevent closing any of the aforementioned file
> > descriptors.
> >
> > This fixes a long standing theoretical bug in KVM that at least affects
> > async page faults. kvm_setup_async_pf() grabs a reference via
> > kvm_get_kvm(), and drops it in an asynchronous work callback. Nothing
> > prevents the VM file descriptor from being closed and the KVM module
> > from being unloaded before this callback runs.
> >
> > Fixes: af585b921e5d ("KVM: Halt vcpu if page it tries to access is swapped out")
>
> And (or)
>
> Fixes: 3d3aab1b973b ("KVM: set owner of cpu and vm file operations")
>
> because the above is x86-centric, at a glance PPC and maybe s390 have issues
> beyond async #PF.
>
> > Cc: stable@vger.kernel.org
> > Suggested-by: Ben Gardon <bgardon@google.com>
> > [ Based on a patch from Ben implemented for Google's kernel. ]
> > Signed-off-by: David Matlack <dmatlack@google.com>
> > ---
> > virt/kvm/kvm_main.c | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> >
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index 35ae6d32dae5..b59f0a29dbd5 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -117,6 +117,8 @@ EXPORT_SYMBOL_GPL(kvm_debugfs_dir);
> >
> > static const struct file_operations stat_fops_per_vm;
> >
> > +static struct file_operations kvm_chardev_ops;
> > +
> > static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
> > unsigned long arg);
> > #ifdef CONFIG_KVM_COMPAT
> > @@ -1131,6 +1133,11 @@ static struct kvm *kvm_create_vm(unsigned long type)
> > preempt_notifier_inc();
> > kvm_init_pm_notifier(kvm);
> >
> > + if (!try_module_get(kvm_chardev_ops.owner)) {
>
> The "try" aspect is unnecessary. Stealing from Paolo's version,
>
> /* KVM is pinned via open("/dev/kvm"), the fd passed to this ioctl(). */
> __module_get(kvm_chardev_ops.owner);
Right, I did see that and agree we're guaranteed the KVM module has a
reference at this point. But the KVM module might be in state
MODULE_STATE_GOING (e.g. if someone ran "rmmod --wait"), which
try_module_get() checks.
>
> > + r = -ENODEV;
> > + goto out_err;
> > + }
> > +
> > return kvm;
> >
> > out_err:
> > @@ -1220,6 +1227,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
> > preempt_notifier_dec();
> > hardware_disable_all();
> > mmdrop(mm);
> > + module_put(kvm_chardev_ops.owner);
> > }
> >
> > void kvm_get_kvm(struct kvm *kvm)
> >
> > base-commit: b13a3befc815eae574d87e6249f973dfbb6ad6cd
> > prerequisite-patch-id: 38f66d60319bf0bc9bf49f91f0f9119e5441629b
> > prerequisite-patch-id: 51aa921d68ea649d436ea68e1b8f4aabc3805156
> > --
> > 2.35.1.616.g0bdcbb4464-goog
> >
* Re: [PATCH RESEND 1/2] KVM: Prevent module exit until all VMs are freed
From: Sean Christopherson @ 2022-03-08 23:08 UTC
To: David Matlack
Cc: Paolo Bonzini, kvm list, Marcelo Tosatti, Gleb Natapov,
Rik van Riel, Ben Gardon, stable
On Tue, Mar 08, 2022, David Matlack wrote:
> On Tue, Mar 8, 2022 at 1:40 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Thu, Mar 03, 2022, David Matlack wrote:
> > > Tie the lifetime of the KVM module to the lifetime of each VM via
> > > kvm.users_count. This way anything that grabs a reference to the VM via
> > > kvm_get_kvm() cannot accidentally outlive the KVM module.
> > >
> > > Prior to this commit, the lifetime of the KVM module was tied to the
> > > lifetime of /dev/kvm file descriptors, VM file descriptors, and vCPU
> > > file descriptors by their respective file_operations "owner" field.
> > > This approach is insufficient because references grabbed via
> > > kvm_get_kvm() do not prevent closing any of the aforementioned file
> > > descriptors.
> > >
> > > This fixes a long standing theoretical bug in KVM that at least affects
> > > async page faults. kvm_setup_async_pf() grabs a reference via
> > > kvm_get_kvm(), and drops it in an asynchronous work callback. Nothing
> > > prevents the VM file descriptor from being closed and the KVM module
> > > from being unloaded before this callback runs.
> > >
> > > Fixes: af585b921e5d ("KVM: Halt vcpu if page it tries to access is swapped out")
> >
> > And (or)
> >
> > Fixes: 3d3aab1b973b ("KVM: set owner of cpu and vm file operations")
> >
> > because the above is x86-centric, at a glance PPC and maybe s390 have issues
> > beyond async #PF.
> >
> > > Cc: stable@vger.kernel.org
> > > Suggested-by: Ben Gardon <bgardon@google.com>
> > > [ Based on a patch from Ben implemented for Google's kernel. ]
> > > Signed-off-by: David Matlack <dmatlack@google.com>
> > > ---
> > > virt/kvm/kvm_main.c | 8 ++++++++
> > > 1 file changed, 8 insertions(+)
> > >
> > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > > index 35ae6d32dae5..b59f0a29dbd5 100644
> > > --- a/virt/kvm/kvm_main.c
> > > +++ b/virt/kvm/kvm_main.c
> > > @@ -117,6 +117,8 @@ EXPORT_SYMBOL_GPL(kvm_debugfs_dir);
> > >
> > > static const struct file_operations stat_fops_per_vm;
> > >
> > > +static struct file_operations kvm_chardev_ops;
> > > +
> > > static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
> > > unsigned long arg);
> > > #ifdef CONFIG_KVM_COMPAT
> > > @@ -1131,6 +1133,11 @@ static struct kvm *kvm_create_vm(unsigned long type)
> > > preempt_notifier_inc();
> > > kvm_init_pm_notifier(kvm);
> > >
> > > + if (!try_module_get(kvm_chardev_ops.owner)) {
> >
> > The "try" aspect is unnecessary. Stealing from Paolo's version,
> >
> > /* KVM is pinned via open("/dev/kvm"), the fd passed to this ioctl(). */
> > __module_get(kvm_chardev_ops.owner);
>
> Right, I did see that and agree we're guaranteed the KVM module has a
> reference at this point. But the KVM module might be in state
> MODULE_STATE_GOING (e.g. if someone ran "rmmod --wait"), which
> try_module_get() checks.
Ah, can you throw that in as a comment? Doesn't have to be much, just enough of
a breadcrumb to connect the dots and to prevent us from "optimizing" this to
__module_get() in the future.
/* Use the "try" variant to play nice with e.g. "rmmod --wait". */
With a comment,
Reviewed-by: Sean Christopherson <seanjc@google.com>
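The distinction discussed above can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel's implementation: model_module, model_try_module_get(), model_module_get(), and model_create_vm() are hypothetical names, and the only behaviour modelled is that the "try" variant refuses to take a reference once the module is marked as going, while the unconditional variant always increments.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical two-state model of a module's lifecycle. */
enum model_state { MODEL_STATE_LIVE, MODEL_STATE_GOING };

struct model_module {
	enum model_state state;
	int refcnt;
};

/* Models __module_get(): unconditional, the caller must already hold a reference. */
void model_module_get(struct model_module *mod)
{
	assert(mod->refcnt > 0);
	mod->refcnt++;
}

/* Models try_module_get(): fails once removal of the module has started. */
static bool model_try_module_get(struct model_module *mod)
{
	if (mod->state == MODEL_STATE_GOING)
		return false;
	mod->refcnt++;
	return true;
}

/* Models the hunk in kvm_create_vm(), including the requested comment. */
int model_create_vm(struct model_module *mod)
{
	/* Use the "try" variant to play nice with e.g. "rmmod --wait". */
	if (!model_try_module_get(mod))
		return -ENODEV;
	return 0;
}
```

For a module in the going state that still has one reference (the open /dev/kvm in the discussion), model_create_vm() returns -ENODEV and leaves the count untouched, whereas model_module_get() would bump the count regardless.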
* Re: [PATCH RESEND 1/2] KVM: Prevent module exit until all VMs are freed
From: David Matlack @ 2022-03-08 23:43 UTC
To: Sean Christopherson
Cc: Paolo Bonzini, kvm list, Marcelo Tosatti, Gleb Natapov,
Rik van Riel, Ben Gardon, stable
On Tue, Mar 8, 2022 at 1:40 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Thu, Mar 03, 2022, David Matlack wrote:
> > Tie the lifetime of the KVM module to the lifetime of each VM via
> > kvm.users_count. This way anything that grabs a reference to the VM via
> > kvm_get_kvm() cannot accidentally outlive the KVM module.
> >
> > Prior to this commit, the lifetime of the KVM module was tied to the
> > lifetime of /dev/kvm file descriptors, VM file descriptors, and vCPU
> > file descriptors by their respective file_operations "owner" field.
> > This approach is insufficient because references grabbed via
> > kvm_get_kvm() do not prevent closing any of the aforementioned file
> > descriptors.
> >
> > This fixes a long standing theoretical bug in KVM that at least affects
> > async page faults. kvm_setup_async_pf() grabs a reference via
> > kvm_get_kvm(), and drops it in an asynchronous work callback. Nothing
> > prevents the VM file descriptor from being closed and the KVM module
> > from being unloaded before this callback runs.
> >
> > Fixes: af585b921e5d ("KVM: Halt vcpu if page it tries to access is swapped out")
>
> And (or)
>
> Fixes: 3d3aab1b973b ("KVM: set owner of cpu and vm file operations")
>
> because the above is x86-centric, at a glance PPC and maybe s390 have issues
> beyond async #PF.
SGTM. It's a moot point in terms of stable inclusion since
af585b921e5d was first added in v2.6.38. But for anyone doing their
own backporting, 3d3aab1b973b makes it a bit more obvious this is a
generic problem even though it's not the commit that introduces the
bug.
>
> > Cc: stable@vger.kernel.org
> > Suggested-by: Ben Gardon <bgardon@google.com>
> > [ Based on a patch from Ben implemented for Google's kernel. ]
> > Signed-off-by: David Matlack <dmatlack@google.com>
> > ---
> > virt/kvm/kvm_main.c | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> >
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index 35ae6d32dae5..b59f0a29dbd5 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -117,6 +117,8 @@ EXPORT_SYMBOL_GPL(kvm_debugfs_dir);
> >
> > static const struct file_operations stat_fops_per_vm;
> >
> > +static struct file_operations kvm_chardev_ops;
> > +
> > static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
> > unsigned long arg);
> > #ifdef CONFIG_KVM_COMPAT
> > @@ -1131,6 +1133,11 @@ static struct kvm *kvm_create_vm(unsigned long type)
> > preempt_notifier_inc();
> > kvm_init_pm_notifier(kvm);
> >
> > + if (!try_module_get(kvm_chardev_ops.owner)) {
>
> The "try" aspect is unnecessary. Stealing from Paolo's version,
>
> /* KVM is pinned via open("/dev/kvm"), the fd passed to this ioctl(). */
> __module_get(kvm_chardev_ops.owner);
>
> > + r = -ENODEV;
> > + goto out_err;
> > + }
> > +
> > return kvm;
> >
> > out_err:
> > @@ -1220,6 +1227,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
> > preempt_notifier_dec();
> > hardware_disable_all();
> > mmdrop(mm);
> > + module_put(kvm_chardev_ops.owner);
> > }
> >
> > void kvm_get_kvm(struct kvm *kvm)
> >
> > base-commit: b13a3befc815eae574d87e6249f973dfbb6ad6cd
> > prerequisite-patch-id: 38f66d60319bf0bc9bf49f91f0f9119e5441629b
> > prerequisite-patch-id: 51aa921d68ea649d436ea68e1b8f4aabc3805156
> > --
> > 2.35.1.616.g0bdcbb4464-goog
> >
* Re: [PATCH RESEND 1/2] KVM: Prevent module exit until all VMs are freed
From: David Matlack @ 2022-03-08 23:44 UTC
To: Sean Christopherson
Cc: Paolo Bonzini, kvm list, Marcelo Tosatti, Gleb Natapov,
Rik van Riel, Ben Gardon, stable
On Tue, Mar 8, 2022 at 3:09 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, Mar 08, 2022, David Matlack wrote:
> > On Tue, Mar 8, 2022 at 1:40 PM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Thu, Mar 03, 2022, David Matlack wrote:
> > > > Tie the lifetime of the KVM module to the lifetime of each VM via
> > > > kvm.users_count. This way anything that grabs a reference to the VM via
> > > > kvm_get_kvm() cannot accidentally outlive the KVM module.
> > > >
> > > > Prior to this commit, the lifetime of the KVM module was tied to the
> > > > lifetime of /dev/kvm file descriptors, VM file descriptors, and vCPU
> > > > file descriptors by their respective file_operations "owner" field.
> > > > This approach is insufficient because references grabbed via
> > > > kvm_get_kvm() do not prevent closing any of the aforementioned file
> > > > descriptors.
> > > >
> > > > This fixes a long standing theoretical bug in KVM that at least affects
> > > > async page faults. kvm_setup_async_pf() grabs a reference via
> > > > kvm_get_kvm(), and drops it in an asynchronous work callback. Nothing
> > > > prevents the VM file descriptor from being closed and the KVM module
> > > > from being unloaded before this callback runs.
> > > >
> > > > Fixes: af585b921e5d ("KVM: Halt vcpu if page it tries to access is swapped out")
> > >
> > > And (or)
> > >
> > > Fixes: 3d3aab1b973b ("KVM: set owner of cpu and vm file operations")
> > >
> > > because the above is x86-centric, at a glance PPC and maybe s390 have issues
> > > beyond async #PF.
> > >
> > > > Cc: stable@vger.kernel.org
> > > > Suggested-by: Ben Gardon <bgardon@google.com>
> > > > [ Based on a patch from Ben implemented for Google's kernel. ]
> > > > Signed-off-by: David Matlack <dmatlack@google.com>
> > > > ---
> > > > virt/kvm/kvm_main.c | 8 ++++++++
> > > > 1 file changed, 8 insertions(+)
> > > >
> > > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > > > index 35ae6d32dae5..b59f0a29dbd5 100644
> > > > --- a/virt/kvm/kvm_main.c
> > > > +++ b/virt/kvm/kvm_main.c
> > > > @@ -117,6 +117,8 @@ EXPORT_SYMBOL_GPL(kvm_debugfs_dir);
> > > >
> > > > static const struct file_operations stat_fops_per_vm;
> > > >
> > > > +static struct file_operations kvm_chardev_ops;
> > > > +
> > > > static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
> > > > unsigned long arg);
> > > > #ifdef CONFIG_KVM_COMPAT
> > > > @@ -1131,6 +1133,11 @@ static struct kvm *kvm_create_vm(unsigned long type)
> > > > preempt_notifier_inc();
> > > > kvm_init_pm_notifier(kvm);
> > > >
> > > > + if (!try_module_get(kvm_chardev_ops.owner)) {
> > >
> > > The "try" aspect is unnecessary. Stealing from Paolo's version,
> > >
> > > /* KVM is pinned via open("/dev/kvm"), the fd passed to this ioctl(). */
> > > __module_get(kvm_chardev_ops.owner);
> >
> > Right, I did see that and agree we're guaranteed the KVM module has a
> > reference at this point. But the KVM module might be in state
> > MODULE_STATE_GOING (e.g. if someone ran "rmmod --wait"), which
> > try_module_get() checks.
>
> Ah, can you throw that in as a comment? Doesn't have to be much, just enough of
> a breadcrumb to connect the dots and to prevent us from "optimizing" this to
> __module_get() in the future.
>
> /* Use the "try" variant to play nice with e.g. "rmmod --wait". */
Yeah. I should have included this in the first place (or at least a
blurb in the commit message).
>
> With a comment,
>
> Reviewed-by: Sean Christopherson <seanjc@google.com>
* Re: [PATCH RESEND 1/2] KVM: Prevent module exit until all VMs are freed
From: Murilo Opsfelder Araújo @ 2022-03-15 15:43 UTC
To: David Matlack, pbonzini
Cc: kvm, Marcelo Tosatti, Gleb Natapov, Rik van Riel, seanjc,
bgardon, stable, farosas
Hi, David.
Some comments below.
On 3/3/22 15:33, David Matlack wrote:
> Tie the lifetime of the KVM module to the lifetime of each VM via
> kvm.users_count. This way anything that grabs a reference to the VM via
> kvm_get_kvm() cannot accidentally outlive the KVM module.
>
> Prior to this commit, the lifetime of the KVM module was tied to the
> lifetime of /dev/kvm file descriptors, VM file descriptors, and vCPU
> file descriptors by their respective file_operations "owner" field.
> This approach is insufficient because references grabbed via
> kvm_get_kvm() do not prevent closing any of the aforementioned file
> descriptors.
>
> This fixes a long standing theoretical bug in KVM that at least affects
> async page faults. kvm_setup_async_pf() grabs a reference via
> kvm_get_kvm(), and drops it in an asynchronous work callback. Nothing
> prevents the VM file descriptor from being closed and the KVM module
> from being unloaded before this callback runs.
>
> Fixes: af585b921e5d ("KVM: Halt vcpu if page it tries to access is swapped out")
> Cc: stable@vger.kernel.org
> Suggested-by: Ben Gardon <bgardon@google.com>
> [ Based on a patch from Ben implemented for Google's kernel. ]
> Signed-off-by: David Matlack <dmatlack@google.com>
> ---
> virt/kvm/kvm_main.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 35ae6d32dae5..b59f0a29dbd5 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -117,6 +117,8 @@ EXPORT_SYMBOL_GPL(kvm_debugfs_dir);
>
> static const struct file_operations stat_fops_per_vm;
>
> +static struct file_operations kvm_chardev_ops;
> +
> static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
> unsigned long arg);
> #ifdef CONFIG_KVM_COMPAT
> @@ -1131,6 +1133,11 @@ static struct kvm *kvm_create_vm(unsigned long type)
> preempt_notifier_inc();
> kvm_init_pm_notifier(kvm);
>
> + if (!try_module_get(kvm_chardev_ops.owner)) {
> + r = -ENODEV;
> + goto out_err;
> + }
> +
Doesn't this problem also affect the other functions called from
kvm_dev_ioctl()?
Is it possible for the module to be removed while other ioctls are still running,
e.g. KVM_GET_API_VERSION and KVM_CHECK_EXTENSION, even though they don't use
struct kvm?
I wonder whether this try_module_get() (along with a module_put() in the out path
of the function) should instead be placed in the caller, kvm_dev_ioctl(), so that
it covers all the other ioctls.
> return kvm;
>
> out_err:
> @@ -1220,6 +1227,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
> preempt_notifier_dec();
> hardware_disable_all();
> mmdrop(mm);
> + module_put(kvm_chardev_ops.owner);
> }
>
> void kvm_get_kvm(struct kvm *kvm)
>
> base-commit: b13a3befc815eae574d87e6249f973dfbb6ad6cd
> prerequisite-patch-id: 38f66d60319bf0bc9bf49f91f0f9119e5441629b
> prerequisite-patch-id: 51aa921d68ea649d436ea68e1b8f4aabc3805156
--
Murilo
* Re: [PATCH RESEND 1/2] KVM: Prevent module exit until all VMs are freed
From: Paolo Bonzini @ 2022-03-15 20:45 UTC
To: muriloo, David Matlack
Cc: kvm, Marcelo Tosatti, Gleb Natapov, Rik van Riel, seanjc,
bgardon, stable, farosas
On 3/15/22 16:43, Murilo Opsfelder Araújo wrote:
>>
>> + if (!try_module_get(kvm_chardev_ops.owner)) {
>> + r = -ENODEV;
>> + goto out_err;
>> + }
>> +
>
> Doesn't this problem also affect the other functions called from
> kvm_dev_ioctl()?
>
> Is it possible that the module is removed while other ioctl's are
> still running, e.g. KVM_GET_API_VERSION and KVM_CHECK_EXTENSION, even
> though they don't use struct kvm?
No, because opening /dev/kvm also adds a reference to the module. The
problem is that create_vm creates another source of references to the
module that can survive after /dev/kvm is closed.
Paolo