stable.vger.kernel.org archive mirror
* [PATCH RESEND 1/2] KVM: Prevent module exit until all VMs are freed
       [not found] <20220303183328.1499189-1-dmatlack@google.com>
@ 2022-03-03 18:33 ` David Matlack
  2022-03-08 21:40   ` Sean Christopherson
  2022-03-15 15:43   ` Murilo Opsfelder Araújo
  0 siblings, 2 replies; 8+ messages in thread
From: David Matlack @ 2022-03-03 18:33 UTC (permalink / raw)
  To: pbonzini
  Cc: David Matlack, kvm, Marcelo Tosatti, Gleb Natapov, Rik van Riel,
	seanjc, bgardon, stable

Tie the lifetime of the KVM module to the lifetime of each VM via
kvm.users_count. This way anything that grabs a reference to the VM via
kvm_get_kvm() cannot accidentally outlive the KVM module.

Prior to this commit, the lifetime of the KVM module was tied to the
lifetime of /dev/kvm file descriptors, VM file descriptors, and vCPU
file descriptors by their respective file_operations "owner" field.
This approach is insufficient because references grabbed via
kvm_get_kvm() do not prevent closing any of the aforementioned file
descriptors.

This fixes a long standing theoretical bug in KVM that at least affects
async page faults. kvm_setup_async_pf() grabs a reference via
kvm_get_kvm(), and drops it in an asynchronous work callback. Nothing
prevents the VM file descriptor from being closed and the KVM module
from being unloaded before this callback runs.

Fixes: af585b921e5d ("KVM: Halt vcpu if page it tries to access is swapped out")
Cc: stable@vger.kernel.org
Suggested-by: Ben Gardon <bgardon@google.com>
[ Based on a patch from Ben implemented for Google's kernel. ]
Signed-off-by: David Matlack <dmatlack@google.com>
---
 virt/kvm/kvm_main.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 35ae6d32dae5..b59f0a29dbd5 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -117,6 +117,8 @@ EXPORT_SYMBOL_GPL(kvm_debugfs_dir);
 
 static const struct file_operations stat_fops_per_vm;
 
+static struct file_operations kvm_chardev_ops;
+
 static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
 			   unsigned long arg);
 #ifdef CONFIG_KVM_COMPAT
@@ -1131,6 +1133,11 @@ static struct kvm *kvm_create_vm(unsigned long type)
 	preempt_notifier_inc();
 	kvm_init_pm_notifier(kvm);
 
+	if (!try_module_get(kvm_chardev_ops.owner)) {
+		r = -ENODEV;
+		goto out_err;
+	}
+
 	return kvm;
 
 out_err:
@@ -1220,6 +1227,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
 	preempt_notifier_dec();
 	hardware_disable_all();
 	mmdrop(mm);
+	module_put(kvm_chardev_ops.owner);
 }
 
 void kvm_get_kvm(struct kvm *kvm)

base-commit: b13a3befc815eae574d87e6249f973dfbb6ad6cd
prerequisite-patch-id: 38f66d60319bf0bc9bf49f91f0f9119e5441629b
prerequisite-patch-id: 51aa921d68ea649d436ea68e1b8f4aabc3805156
-- 
2.35.1.616.g0bdcbb4464-goog



* Re: [PATCH RESEND 1/2] KVM: Prevent module exit until all VMs are freed
  2022-03-03 18:33 ` [PATCH RESEND 1/2] KVM: Prevent module exit until all VMs are freed David Matlack
@ 2022-03-08 21:40   ` Sean Christopherson
  2022-03-08 22:28     ` David Matlack
  2022-03-08 23:43     ` David Matlack
  2022-03-15 15:43   ` Murilo Opsfelder Araújo
  1 sibling, 2 replies; 8+ messages in thread
From: Sean Christopherson @ 2022-03-08 21:40 UTC (permalink / raw)
  To: David Matlack
  Cc: pbonzini, kvm, Marcelo Tosatti, Gleb Natapov, Rik van Riel,
	bgardon, stable

On Thu, Mar 03, 2022, David Matlack wrote:
> Tie the lifetime of the KVM module to the lifetime of each VM via
> kvm.users_count. This way anything that grabs a reference to the VM via
> kvm_get_kvm() cannot accidentally outlive the KVM module.
> 
> Prior to this commit, the lifetime of the KVM module was tied to the
> lifetime of /dev/kvm file descriptors, VM file descriptors, and vCPU
> file descriptors by their respective file_operations "owner" field.
> This approach is insufficient because references grabbed via
> kvm_get_kvm() do not prevent closing any of the aforementioned file
> descriptors.
> 
> This fixes a long standing theoretical bug in KVM that at least affects
> async page faults. kvm_setup_async_pf() grabs a reference via
> kvm_get_kvm(), and drops it in an asynchronous work callback. Nothing
> prevents the VM file descriptor from being closed and the KVM module
> from being unloaded before this callback runs.
> 
> Fixes: af585b921e5d ("KVM: Halt vcpu if page it tries to access is swapped out")

And (or)

  Fixes: 3d3aab1b973b ("KVM: set owner of cpu and vm file operations")

because the above is x86-centric. At a glance, PPC and maybe s390 have issues
beyond async #PF.

> Cc: stable@vger.kernel.org
> Suggested-by: Ben Gardon <bgardon@google.com>
> [ Based on a patch from Ben implemented for Google's kernel. ]
> Signed-off-by: David Matlack <dmatlack@google.com>
> ---
>  virt/kvm/kvm_main.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 35ae6d32dae5..b59f0a29dbd5 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -117,6 +117,8 @@ EXPORT_SYMBOL_GPL(kvm_debugfs_dir);
>  
>  static const struct file_operations stat_fops_per_vm;
>  
> +static struct file_operations kvm_chardev_ops;
> +
>  static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
>  			   unsigned long arg);
>  #ifdef CONFIG_KVM_COMPAT
> @@ -1131,6 +1133,11 @@ static struct kvm *kvm_create_vm(unsigned long type)
>  	preempt_notifier_inc();
>  	kvm_init_pm_notifier(kvm);
>  
> +	if (!try_module_get(kvm_chardev_ops.owner)) {

The "try" aspect is unnecessary.  Stealing from Paolo's version, 

	/* KVM is pinned via open("/dev/kvm"), the fd passed to this ioctl(). */
	__module_get(kvm_chardev_ops.owner);

> +		r = -ENODEV;
> +		goto out_err;
> +	}
> +
>  	return kvm;
>  
>  out_err:
> @@ -1220,6 +1227,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
>  	preempt_notifier_dec();
>  	hardware_disable_all();
>  	mmdrop(mm);
> +	module_put(kvm_chardev_ops.owner);
>  }
>  
>  void kvm_get_kvm(struct kvm *kvm)
> 
> base-commit: b13a3befc815eae574d87e6249f973dfbb6ad6cd
> prerequisite-patch-id: 38f66d60319bf0bc9bf49f91f0f9119e5441629b
> prerequisite-patch-id: 51aa921d68ea649d436ea68e1b8f4aabc3805156
> -- 
> 2.35.1.616.g0bdcbb4464-goog
> 


* Re: [PATCH RESEND 1/2] KVM: Prevent module exit until all VMs are freed
  2022-03-08 21:40   ` Sean Christopherson
@ 2022-03-08 22:28     ` David Matlack
  2022-03-08 23:08       ` Sean Christopherson
  2022-03-08 23:43     ` David Matlack
  1 sibling, 1 reply; 8+ messages in thread
From: David Matlack @ 2022-03-08 22:28 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm list, Marcelo Tosatti, Gleb Natapov,
	Rik van Riel, Ben Gardon, stable

On Tue, Mar 8, 2022 at 1:40 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Thu, Mar 03, 2022, David Matlack wrote:
> > Tie the lifetime of the KVM module to the lifetime of each VM via
> > kvm.users_count. This way anything that grabs a reference to the VM via
> > kvm_get_kvm() cannot accidentally outlive the KVM module.
> >
> > Prior to this commit, the lifetime of the KVM module was tied to the
> > lifetime of /dev/kvm file descriptors, VM file descriptors, and vCPU
> > file descriptors by their respective file_operations "owner" field.
> > This approach is insufficient because references grabbed via
> > kvm_get_kvm() do not prevent closing any of the aforementioned file
> > descriptors.
> >
> > This fixes a long standing theoretical bug in KVM that at least affects
> > async page faults. kvm_setup_async_pf() grabs a reference via
> > kvm_get_kvm(), and drops it in an asynchronous work callback. Nothing
> > prevents the VM file descriptor from being closed and the KVM module
> > from being unloaded before this callback runs.
> >
> > Fixes: af585b921e5d ("KVM: Halt vcpu if page it tries to access is swapped out")
>
> And (or)
>
>   Fixes: 3d3aab1b973b ("KVM: set owner of cpu and vm file operations")
>
> because the above is x86-centric, at a glance PPC and maybe s390 have issues
> beyond async #PF.
>
> > Cc: stable@vger.kernel.org
> > Suggested-by: Ben Gardon <bgardon@google.com>
> > [ Based on a patch from Ben implemented for Google's kernel. ]
> > Signed-off-by: David Matlack <dmatlack@google.com>
> > ---
> >  virt/kvm/kvm_main.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index 35ae6d32dae5..b59f0a29dbd5 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -117,6 +117,8 @@ EXPORT_SYMBOL_GPL(kvm_debugfs_dir);
> >
> >  static const struct file_operations stat_fops_per_vm;
> >
> > +static struct file_operations kvm_chardev_ops;
> > +
> >  static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
> >                          unsigned long arg);
> >  #ifdef CONFIG_KVM_COMPAT
> > @@ -1131,6 +1133,11 @@ static struct kvm *kvm_create_vm(unsigned long type)
> >       preempt_notifier_inc();
> >       kvm_init_pm_notifier(kvm);
> >
> > +     if (!try_module_get(kvm_chardev_ops.owner)) {
>
> The "try" aspect is unnecessary.  Stealing from Paolo's version,
>
>         /* KVM is pinned via open("/dev/kvm"), the fd passed to this ioctl(). */
>         __module_get(kvm_chardev_ops.owner);

Right, I did see that and agree we're guaranteed the KVM module has a
reference at this point. But the KVM module might be in state
MODULE_STATE_GOING (e.g. if someone ran "rmmod --wait"), which
try_module_get() checks.
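
To make the distinction concrete, here is a minimal user-space sketch in C.
It is only a model, not kernel code: struct model_module, model_try_get(),
model_get_unchecked() and model_put() are hypothetical names standing in for
the module state, try_module_get(), __module_get() and module_put(). The
assumption being illustrated is that the "try" variant refuses a new
reference once unload has started, while the unchecked variant only bumps
the count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; this is not the kernel's struct module. */
enum model_state { MODEL_LIVE, MODEL_GOING };

struct model_module {
	enum model_state state;
	int refcnt;
};

/* Models try_module_get(): no new references once unload has begun. */
static bool model_try_get(struct model_module *mod)
{
	if (mod->state != MODEL_LIVE)
		return false;
	mod->refcnt++;
	return true;
}

/* Models __module_get(): the caller already holds a reference, so the
 * state is not consulted and the count is simply incremented. */
static void model_get_unchecked(struct model_module *mod)
{
	assert(mod->refcnt > 0);
	mod->refcnt++;
}

/* Models module_put(): drop one reference. */
static void model_put(struct model_module *mod)
{
	assert(mod->refcnt > 0);
	mod->refcnt--;
}
```

In this model, a module with refcnt == 1 (the open /dev/kvm fd) that has been
moved to MODEL_GOING still accepts model_get_unchecked() but rejects
model_try_get(), which is the case the -ENODEV path in the patch is meant to
handle.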

>
> > +             r = -ENODEV;
> > +             goto out_err;
> > +     }
> > +
> >       return kvm;
> >
> >  out_err:
> > @@ -1220,6 +1227,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
> >       preempt_notifier_dec();
> >       hardware_disable_all();
> >       mmdrop(mm);
> > +     module_put(kvm_chardev_ops.owner);
> >  }
> >
> >  void kvm_get_kvm(struct kvm *kvm)
> >
> > base-commit: b13a3befc815eae574d87e6249f973dfbb6ad6cd
> > prerequisite-patch-id: 38f66d60319bf0bc9bf49f91f0f9119e5441629b
> > prerequisite-patch-id: 51aa921d68ea649d436ea68e1b8f4aabc3805156
> > --
> > 2.35.1.616.g0bdcbb4464-goog
> >


* Re: [PATCH RESEND 1/2] KVM: Prevent module exit until all VMs are freed
  2022-03-08 22:28     ` David Matlack
@ 2022-03-08 23:08       ` Sean Christopherson
  2022-03-08 23:44         ` David Matlack
  0 siblings, 1 reply; 8+ messages in thread
From: Sean Christopherson @ 2022-03-08 23:08 UTC (permalink / raw)
  To: David Matlack
  Cc: Paolo Bonzini, kvm list, Marcelo Tosatti, Gleb Natapov,
	Rik van Riel, Ben Gardon, stable

On Tue, Mar 08, 2022, David Matlack wrote:
> On Tue, Mar 8, 2022 at 1:40 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Thu, Mar 03, 2022, David Matlack wrote:
> > > Tie the lifetime of the KVM module to the lifetime of each VM via
> > > kvm.users_count. This way anything that grabs a reference to the VM via
> > > kvm_get_kvm() cannot accidentally outlive the KVM module.
> > >
> > > Prior to this commit, the lifetime of the KVM module was tied to the
> > > lifetime of /dev/kvm file descriptors, VM file descriptors, and vCPU
> > > file descriptors by their respective file_operations "owner" field.
> > > This approach is insufficient because references grabbed via
> > > kvm_get_kvm() do not prevent closing any of the aforementioned file
> > > descriptors.
> > >
> > > This fixes a long standing theoretical bug in KVM that at least affects
> > > async page faults. kvm_setup_async_pf() grabs a reference via
> > > kvm_get_kvm(), and drops it in an asynchronous work callback. Nothing
> > > prevents the VM file descriptor from being closed and the KVM module
> > > from being unloaded before this callback runs.
> > >
> > > Fixes: af585b921e5d ("KVM: Halt vcpu if page it tries to access is swapped out")
> >
> > And (or)
> >
> >   Fixes: 3d3aab1b973b ("KVM: set owner of cpu and vm file operations")
> >
> > because the above is x86-centric, at a glance PPC and maybe s390 have issues
> > beyond async #PF.
> >
> > > Cc: stable@vger.kernel.org
> > > Suggested-by: Ben Gardon <bgardon@google.com>
> > > [ Based on a patch from Ben implemented for Google's kernel. ]
> > > Signed-off-by: David Matlack <dmatlack@google.com>
> > > ---
> > >  virt/kvm/kvm_main.c | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > >
> > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > > index 35ae6d32dae5..b59f0a29dbd5 100644
> > > --- a/virt/kvm/kvm_main.c
> > > +++ b/virt/kvm/kvm_main.c
> > > @@ -117,6 +117,8 @@ EXPORT_SYMBOL_GPL(kvm_debugfs_dir);
> > >
> > >  static const struct file_operations stat_fops_per_vm;
> > >
> > > +static struct file_operations kvm_chardev_ops;
> > > +
> > >  static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
> > >                          unsigned long arg);
> > >  #ifdef CONFIG_KVM_COMPAT
> > > @@ -1131,6 +1133,11 @@ static struct kvm *kvm_create_vm(unsigned long type)
> > >       preempt_notifier_inc();
> > >       kvm_init_pm_notifier(kvm);
> > >
> > > +     if (!try_module_get(kvm_chardev_ops.owner)) {
> >
> > The "try" aspect is unnecessary.  Stealing from Paolo's version,
> >
> >         /* KVM is pinned via open("/dev/kvm"), the fd passed to this ioctl(). */
> >         __module_get(kvm_chardev_ops.owner);
> 
> Right, I did see that and agree we're guaranteed the KVM module has a
> reference at this point. But the KVM module might be in state
> MODULE_STATE_GOING (e.g. if someone ran "rmmod --wait"), which
> try_module_get() checks.

Ah, can you throw that in as a comment?  Doesn't have to be much, just enough of
a breadcrumb to connect the dots and to prevent us from "optimizing" this to
__module_get() in the future.

	/* Use the "try" variant to play nice with e.g. "rmmod --wait". */

With a comment,

Reviewed-by: Sean Christopherson <seanjc@google.com>


* Re: [PATCH RESEND 1/2] KVM: Prevent module exit until all VMs are freed
  2022-03-08 21:40   ` Sean Christopherson
  2022-03-08 22:28     ` David Matlack
@ 2022-03-08 23:43     ` David Matlack
  1 sibling, 0 replies; 8+ messages in thread
From: David Matlack @ 2022-03-08 23:43 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm list, Marcelo Tosatti, Gleb Natapov,
	Rik van Riel, Ben Gardon, stable

On Tue, Mar 8, 2022 at 1:40 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Thu, Mar 03, 2022, David Matlack wrote:
> > Tie the lifetime of the KVM module to the lifetime of each VM via
> > kvm.users_count. This way anything that grabs a reference to the VM via
> > kvm_get_kvm() cannot accidentally outlive the KVM module.
> >
> > Prior to this commit, the lifetime of the KVM module was tied to the
> > lifetime of /dev/kvm file descriptors, VM file descriptors, and vCPU
> > file descriptors by their respective file_operations "owner" field.
> > This approach is insufficient because references grabbed via
> > kvm_get_kvm() do not prevent closing any of the aforementioned file
> > descriptors.
> >
> > This fixes a long standing theoretical bug in KVM that at least affects
> > async page faults. kvm_setup_async_pf() grabs a reference via
> > kvm_get_kvm(), and drops it in an asynchronous work callback. Nothing
> > prevents the VM file descriptor from being closed and the KVM module
> > from being unloaded before this callback runs.
> >
> > Fixes: af585b921e5d ("KVM: Halt vcpu if page it tries to access is swapped out")
>
> And (or)
>
>   Fixes: 3d3aab1b973b ("KVM: set owner of cpu and vm file operations")
>
> because the above is x86-centric, at a glance PPC and maybe s390 have issues
> beyond async #PF.

SGTM. It's a moot point in terms of stable inclusion since
af585b921e5d was first added in v2.6.38. But for anyone doing their
own backporting, 3d3aab1b973b makes it a bit more obvious this is a
generic problem even though it's not the commit that introduces the
bug.

>
> > Cc: stable@vger.kernel.org
> > Suggested-by: Ben Gardon <bgardon@google.com>
> > [ Based on a patch from Ben implemented for Google's kernel. ]
> > Signed-off-by: David Matlack <dmatlack@google.com>
> > ---
> >  virt/kvm/kvm_main.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index 35ae6d32dae5..b59f0a29dbd5 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -117,6 +117,8 @@ EXPORT_SYMBOL_GPL(kvm_debugfs_dir);
> >
> >  static const struct file_operations stat_fops_per_vm;
> >
> > +static struct file_operations kvm_chardev_ops;
> > +
> >  static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
> >                          unsigned long arg);
> >  #ifdef CONFIG_KVM_COMPAT
> > @@ -1131,6 +1133,11 @@ static struct kvm *kvm_create_vm(unsigned long type)
> >       preempt_notifier_inc();
> >       kvm_init_pm_notifier(kvm);
> >
> > +     if (!try_module_get(kvm_chardev_ops.owner)) {
>
> The "try" aspect is unnecessary.  Stealing from Paolo's version,
>
>         /* KVM is pinned via open("/dev/kvm"), the fd passed to this ioctl(). */
>         __module_get(kvm_chardev_ops.owner);
>
> > +             r = -ENODEV;
> > +             goto out_err;
> > +     }
> > +
> >       return kvm;
> >
> >  out_err:
> > @@ -1220,6 +1227,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
> >       preempt_notifier_dec();
> >       hardware_disable_all();
> >       mmdrop(mm);
> > +     module_put(kvm_chardev_ops.owner);
> >  }
> >
> >  void kvm_get_kvm(struct kvm *kvm)
> >
> > base-commit: b13a3befc815eae574d87e6249f973dfbb6ad6cd
> > prerequisite-patch-id: 38f66d60319bf0bc9bf49f91f0f9119e5441629b
> > prerequisite-patch-id: 51aa921d68ea649d436ea68e1b8f4aabc3805156
> > --
> > 2.35.1.616.g0bdcbb4464-goog
> >


* Re: [PATCH RESEND 1/2] KVM: Prevent module exit until all VMs are freed
  2022-03-08 23:08       ` Sean Christopherson
@ 2022-03-08 23:44         ` David Matlack
  0 siblings, 0 replies; 8+ messages in thread
From: David Matlack @ 2022-03-08 23:44 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm list, Marcelo Tosatti, Gleb Natapov,
	Rik van Riel, Ben Gardon, stable

On Tue, Mar 8, 2022 at 3:09 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, Mar 08, 2022, David Matlack wrote:
> > On Tue, Mar 8, 2022 at 1:40 PM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Thu, Mar 03, 2022, David Matlack wrote:
> > > > Tie the lifetime of the KVM module to the lifetime of each VM via
> > > > kvm.users_count. This way anything that grabs a reference to the VM via
> > > > kvm_get_kvm() cannot accidentally outlive the KVM module.
> > > >
> > > > Prior to this commit, the lifetime of the KVM module was tied to the
> > > > lifetime of /dev/kvm file descriptors, VM file descriptors, and vCPU
> > > > file descriptors by their respective file_operations "owner" field.
> > > > This approach is insufficient because references grabbed via
> > > > kvm_get_kvm() do not prevent closing any of the aforementioned file
> > > > descriptors.
> > > >
> > > > This fixes a long standing theoretical bug in KVM that at least affects
> > > > async page faults. kvm_setup_async_pf() grabs a reference via
> > > > kvm_get_kvm(), and drops it in an asynchronous work callback. Nothing
> > > > prevents the VM file descriptor from being closed and the KVM module
> > > > from being unloaded before this callback runs.
> > > >
> > > > Fixes: af585b921e5d ("KVM: Halt vcpu if page it tries to access is swapped out")
> > >
> > > And (or)
> > >
> > >   Fixes: 3d3aab1b973b ("KVM: set owner of cpu and vm file operations")
> > >
> > > because the above is x86-centric, at a glance PPC and maybe s390 have issues
> > > beyond async #PF.
> > >
> > > > Cc: stable@vger.kernel.org
> > > > Suggested-by: Ben Gardon <bgardon@google.com>
> > > > [ Based on a patch from Ben implemented for Google's kernel. ]
> > > > Signed-off-by: David Matlack <dmatlack@google.com>
> > > > ---
> > > >  virt/kvm/kvm_main.c | 8 ++++++++
> > > >  1 file changed, 8 insertions(+)
> > > >
> > > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > > > index 35ae6d32dae5..b59f0a29dbd5 100644
> > > > --- a/virt/kvm/kvm_main.c
> > > > +++ b/virt/kvm/kvm_main.c
> > > > @@ -117,6 +117,8 @@ EXPORT_SYMBOL_GPL(kvm_debugfs_dir);
> > > >
> > > >  static const struct file_operations stat_fops_per_vm;
> > > >
> > > > +static struct file_operations kvm_chardev_ops;
> > > > +
> > > >  static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
> > > >                          unsigned long arg);
> > > >  #ifdef CONFIG_KVM_COMPAT
> > > > @@ -1131,6 +1133,11 @@ static struct kvm *kvm_create_vm(unsigned long type)
> > > >       preempt_notifier_inc();
> > > >       kvm_init_pm_notifier(kvm);
> > > >
> > > > +     if (!try_module_get(kvm_chardev_ops.owner)) {
> > >
> > > The "try" aspect is unnecessary.  Stealing from Paolo's version,
> > >
> > >         /* KVM is pinned via open("/dev/kvm"), the fd passed to this ioctl(). */
> > >         __module_get(kvm_chardev_ops.owner);
> >
> > Right, I did see that and agree we're guaranteed the KVM module has a
> > reference at this point. But the KVM module might be in state
> > MODULE_STATE_GOING (e.g. if someone ran "rmmod --wait"), which
> > try_module_get() checks.
>
> Ah, can you throw that in as a comment?  Doesn't have to be much, just enough of
> a breadcrumb to connect the dots and to prevent us from "optimizing" this to
> __module_get() in the future.
>
>         /* Use the "try" variant to play nice with e.g. "rmmod --wait". */

Yeah. I should have included this in the first place (or at least a
blurb in the commit message).

>
> With a comment,
>
> Reviewed-by: Sean Christopherson <seanjc@google.com>


* Re: [PATCH RESEND 1/2] KVM: Prevent module exit until all VMs are freed
  2022-03-03 18:33 ` [PATCH RESEND 1/2] KVM: Prevent module exit until all VMs are freed David Matlack
  2022-03-08 21:40   ` Sean Christopherson
@ 2022-03-15 15:43   ` Murilo Opsfelder Araújo
  2022-03-15 20:45     ` Paolo Bonzini
  1 sibling, 1 reply; 8+ messages in thread
From: Murilo Opsfelder Araújo @ 2022-03-15 15:43 UTC (permalink / raw)
  To: David Matlack, pbonzini
  Cc: kvm, Marcelo Tosatti, Gleb Natapov, Rik van Riel, seanjc,
	bgardon, stable, farosas

Hi, David.

Some comments below.

On 3/3/22 15:33, David Matlack wrote:
> Tie the lifetime of the KVM module to the lifetime of each VM via
> kvm.users_count. This way anything that grabs a reference to the VM via
> kvm_get_kvm() cannot accidentally outlive the KVM module.
> 
> Prior to this commit, the lifetime of the KVM module was tied to the
> lifetime of /dev/kvm file descriptors, VM file descriptors, and vCPU
> file descriptors by their respective file_operations "owner" field.
> This approach is insufficient because references grabbed via
> kvm_get_kvm() do not prevent closing any of the aforementioned file
> descriptors.
> 
> This fixes a long standing theoretical bug in KVM that at least affects
> async page faults. kvm_setup_async_pf() grabs a reference via
> kvm_get_kvm(), and drops it in an asynchronous work callback. Nothing
> prevents the VM file descriptor from being closed and the KVM module
> from being unloaded before this callback runs.
> 
> Fixes: af585b921e5d ("KVM: Halt vcpu if page it tries to access is swapped out")
> Cc: stable@vger.kernel.org
> Suggested-by: Ben Gardon <bgardon@google.com>
> [ Based on a patch from Ben implemented for Google's kernel. ]
> Signed-off-by: David Matlack <dmatlack@google.com>
> ---
>   virt/kvm/kvm_main.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 35ae6d32dae5..b59f0a29dbd5 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -117,6 +117,8 @@ EXPORT_SYMBOL_GPL(kvm_debugfs_dir);
>   
>   static const struct file_operations stat_fops_per_vm;
>   
> +static struct file_operations kvm_chardev_ops;
> +
>   static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
>   			   unsigned long arg);
>   #ifdef CONFIG_KVM_COMPAT
> @@ -1131,6 +1133,11 @@ static struct kvm *kvm_create_vm(unsigned long type)
>   	preempt_notifier_inc();
>   	kvm_init_pm_notifier(kvm);
>   
> +	if (!try_module_get(kvm_chardev_ops.owner)) {
> +		r = -ENODEV;
> +		goto out_err;
> +	}
> +

Doesn't this problem also affect the other functions called from
kvm_dev_ioctl()?

Is it possible that the module is removed while other ioctl's are still running,
e.g. KVM_GET_API_VERSION and KVM_CHECK_EXTENSION, even though they don't use
struct kvm?

I wonder if this try_module_get() (along with module_put() in the out path of
the function) shouldn't be placed in the upper function kvm_dev_ioctl() so it
would cover all the other ioctl's.

>   	return kvm;
>   
>   out_err:
> @@ -1220,6 +1227,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
>   	preempt_notifier_dec();
>   	hardware_disable_all();
>   	mmdrop(mm);
> +	module_put(kvm_chardev_ops.owner);
>   }
>   
>   void kvm_get_kvm(struct kvm *kvm)
> 
> base-commit: b13a3befc815eae574d87e6249f973dfbb6ad6cd
> prerequisite-patch-id: 38f66d60319bf0bc9bf49f91f0f9119e5441629b
> prerequisite-patch-id: 51aa921d68ea649d436ea68e1b8f4aabc3805156

-- 
Murilo


* Re: [PATCH RESEND 1/2] KVM: Prevent module exit until all VMs are freed
  2022-03-15 15:43   ` Murilo Opsfelder Araújo
@ 2022-03-15 20:45     ` Paolo Bonzini
  0 siblings, 0 replies; 8+ messages in thread
From: Paolo Bonzini @ 2022-03-15 20:45 UTC (permalink / raw)
  To: muriloo, David Matlack
  Cc: kvm, Marcelo Tosatti, Gleb Natapov, Rik van Riel, seanjc,
	bgardon, stable, farosas


On 3/15/22 16:43, Murilo Opsfelder Araújo wrote:
>>
>> +    if (!try_module_get(kvm_chardev_ops.owner)) {
>> +        r = -ENODEV;
>> +        goto out_err;
>> +    }
>> +
> 
> Doesn't this problem also affect the other functions called from
> kvm_dev_ioctl()?
> 
> Is it possible that the module is removed while other ioctl's are
> still running, e.g. KVM_GET_API_VERSION and KVM_CHECK_EXTENSION, even
> though they don't use struct kvm?

No, because opening /dev/kvm also adds a reference to the module.  The 
problem is that create_vm creates another source of references to the 
module that can survive after /dev/kvm is closed.

Paolo
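
The reference flow described above can be sketched as a small user-space
model in C. Nothing here is kernel code: struct model_module, struct
model_vm and the model_*() helpers are hypothetical names, and the
pins_module flag is an assumption used to contrast the behaviour with and
without the patch. File descriptors pin the module through their fops
owner; a VM reference taken via something like kvm_get_kvm() pins only the
VM unless VM creation also takes a module reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the two kinds of references. */
struct model_module {
	int refcnt;
};

struct model_vm {
	struct model_module *owner;
	int users_count;   /* stands in for kvm.users_count */
	bool pins_module;  /* true models the patched kvm_create_vm() */
};

/* Opening or closing any KVM fd takes or drops a module reference. */
static void model_fd_open(struct model_module *mod)  { mod->refcnt++; }
static void model_fd_close(struct model_module *mod) { mod->refcnt--; }

static void model_create_vm(struct model_vm *vm, struct model_module *mod,
			    bool pins_module)
{
	vm->owner = mod;
	vm->users_count = 1;
	vm->pins_module = pins_module;
	if (pins_module)
		mod->refcnt++;
}

/* Stands in for kvm_get_kvm(): pins the VM, not the module. */
static void model_get_vm(struct model_vm *vm)
{
	vm->users_count++;
}

/* Stands in for kvm_put_kvm(): the last put destroys the VM and, in the
 * patched model, releases the module reference taken at creation. */
static void model_put_vm(struct model_vm *vm)
{
	assert(vm->users_count > 0);
	if (--vm->users_count == 0 && vm->pins_module)
		vm->owner->refcnt--;
}

static bool model_can_unload(const struct model_module *mod)
{
	return mod->refcnt == 0;
}
```

With pins_module set, closing the VM fd and /dev/kvm while an asynchronous
user still holds a VM reference leaves the module count at one, so the model
refuses to unload until the final model_put_vm(). With it cleared, the count
reaches zero while users_count is still one, which is the window the patch
closes.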


