All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] KVM updates for Linux 3.1.8
@ 2012-01-12 10:39 Avi Kivity
  2012-01-12 10:39 ` [PATCH 1/4] KVM guest: prevent tracing recursion with kvmclock Avi Kivity
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Avi Kivity @ 2012-01-12 10:39 UTC (permalink / raw)
  To: stable; +Cc: kvm, Marcelo Tosatti

The following patches want to be applied to the 3.1 branch.  "All users must
update".

Alex Williamson (2):
  KVM: Remove ability to assign a device without iommu support
  KVM: Device assignment permission checks

Avi Kivity (1):
  KVM guest: prevent tracing recursion with kvmclock

Jan Kiszka (1):
  KVM: x86: Prevent starting PIT timers in the absence of irqchip
    support

 Documentation/virtual/kvm/api.txt |    7 +++
 arch/x86/kernel/kvmclock.c        |    5 +-
 arch/x86/kvm/i8254.c              |   10 +++-
 virt/kvm/assigned-dev.c           |   93 +++++++++++++++++++++++++++++++++----
 4 files changed, 101 insertions(+), 14 deletions(-)

-- 
1.7.7.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/4] KVM guest: prevent tracing recursion with kvmclock
  2012-01-12 10:39 [PATCH 0/4] KVM updates for Linux 3.1.8 Avi Kivity
@ 2012-01-12 10:39 ` Avi Kivity
  2012-01-12 10:39 ` [PATCH 2/4] KVM: x86: Prevent starting PIT timers in the absence of irqchip support Avi Kivity
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Avi Kivity @ 2012-01-12 10:39 UTC (permalink / raw)
  To: stable; +Cc: kvm, Marcelo Tosatti

Prevent tracing of preempt_disable() in get_cpu_var() in
kvm_clock_read(). When CONFIG_DEBUG_PREEMPT is enabled,
preempt_disable/enable() are traced and this causes the function_graph
tracer to go into an infinite recursion. By open coding the
preempt_disable() around the get_cpu_var(), we can use the notrace
version which prevents preempt_disable/enable() from being traced and
prevents the recursion.

Based on a similar patch for Xen from Jeremy Fitzhardinge.

Tested-by: Gleb Natapov <gleb@redhat.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
(cherry picked from commit 95ef1e52922cf75b1ea2eae54ef886f2cc47eecb)
---
 arch/x86/kernel/kvmclock.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index c1a0188..44842d7 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -74,9 +74,10 @@ static cycle_t kvm_clock_read(void)
 	struct pvclock_vcpu_time_info *src;
 	cycle_t ret;
 
-	src = &get_cpu_var(hv_clock);
+	preempt_disable_notrace();
+	src = &__get_cpu_var(hv_clock);
 	ret = pvclock_clocksource_read(src);
-	put_cpu_var(hv_clock);
+	preempt_enable_notrace();
 	return ret;
 }
 
-- 
1.7.7.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/4] KVM: x86: Prevent starting PIT timers in the absence of irqchip support
  2012-01-12 10:39 [PATCH 0/4] KVM updates for Linux 3.1.8 Avi Kivity
  2012-01-12 10:39 ` [PATCH 1/4] KVM guest: prevent tracing recursion with kvmclock Avi Kivity
@ 2012-01-12 10:39 ` Avi Kivity
  2012-01-12 10:39 ` [PATCH 3/4] KVM: Remove ability to assign a device without iommu support Avi Kivity
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Avi Kivity @ 2012-01-12 10:39 UTC (permalink / raw)
  To: stable; +Cc: kvm, Marcelo Tosatti

From: Jan Kiszka <jan.kiszka@siemens.com>

User space may create the PIT and forgets about setting up the irqchips.
In that case, firing PIT IRQs will crash the host:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000128
IP: [<ffffffffa10f6280>] kvm_set_irq+0x30/0x170 [kvm]
...
Call Trace:
 [<ffffffffa11228c1>] pit_do_work+0x51/0xd0 [kvm]
 [<ffffffff81071431>] process_one_work+0x111/0x4d0
 [<ffffffff81071bb2>] worker_thread+0x152/0x340
 [<ffffffff81075c8e>] kthread+0x7e/0x90
 [<ffffffff815a4474>] kernel_thread_helper+0x4/0x10

Prevent this by checking the irqchip mode before starting a timer. We
can't deny creating the PIT if the irqchips aren't set up yet as
current user land expects this order to work.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
(cherry picked from commit 0924ab2cfa98b1ece26c033d696651fd62896c69)
---
 arch/x86/kvm/i8254.c |   10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index efad723..43e04d1 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -338,11 +338,15 @@ static enum hrtimer_restart pit_timer_fn(struct hrtimer *data)
 		return HRTIMER_NORESTART;
 }
 
-static void create_pit_timer(struct kvm_kpit_state *ps, u32 val, int is_period)
+static void create_pit_timer(struct kvm *kvm, u32 val, int is_period)
 {
+	struct kvm_kpit_state *ps = &kvm->arch.vpit->pit_state;
 	struct kvm_timer *pt = &ps->pit_timer;
 	s64 interval;
 
+	if (!irqchip_in_kernel(kvm))
+		return;
+
 	interval = muldiv64(val, NSEC_PER_SEC, KVM_PIT_FREQ);
 
 	pr_debug("create pit timer, interval is %llu nsec\n", interval);
@@ -394,13 +398,13 @@ static void pit_load_count(struct kvm *kvm, int channel, u32 val)
         /* FIXME: enhance mode 4 precision */
 	case 4:
 		if (!(ps->flags & KVM_PIT_FLAGS_HPET_LEGACY)) {
-			create_pit_timer(ps, val, 0);
+			create_pit_timer(kvm, val, 0);
 		}
 		break;
 	case 2:
 	case 3:
 		if (!(ps->flags & KVM_PIT_FLAGS_HPET_LEGACY)){
-			create_pit_timer(ps, val, 1);
+			create_pit_timer(kvm, val, 1);
 		}
 		break;
 	default:
-- 
1.7.7.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/4] KVM: Remove ability to assign a device without iommu support
  2012-01-12 10:39 [PATCH 0/4] KVM updates for Linux 3.1.8 Avi Kivity
  2012-01-12 10:39 ` [PATCH 1/4] KVM guest: prevent tracing recursion with kvmclock Avi Kivity
  2012-01-12 10:39 ` [PATCH 2/4] KVM: x86: Prevent starting PIT timers in the absence of irqchip support Avi Kivity
@ 2012-01-12 10:39 ` Avi Kivity
  2012-01-12 10:39 ` [PATCH 4/4] KVM: Device assignment permission checks Avi Kivity
  2012-01-12 16:15 ` [PATCH 0/4] KVM updates for Linux 3.1.8 Greg KH
  4 siblings, 0 replies; 8+ messages in thread
From: Avi Kivity @ 2012-01-12 10:39 UTC (permalink / raw)
  To: stable; +Cc: kvm, Marcelo Tosatti

From: Alex Williamson <alex.williamson@redhat.com>

This option has no users and it exposes a security hole that we
can allow devices to be assigned without iommu protection.  Make
KVM_DEV_ASSIGN_ENABLE_IOMMU a mandatory option.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
(cherry picked from commit 423873736b78f549fbfa2f715f2e4de7e6c5e1e9)
---
 Documentation/virtual/kvm/api.txt |    3 +++
 virt/kvm/assigned-dev.c           |   18 +++++++++---------
 2 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index b0e4b9c..57c9336 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1131,6 +1131,9 @@ following flags are specified:
 /* Depends on KVM_CAP_IOMMU */
 #define KVM_DEV_ASSIGN_ENABLE_IOMMU	(1 << 0)
 
+The KVM_DEV_ASSIGN_ENABLE_IOMMU flag is a mandatory option to ensure
+isolation of the device.  Usages not specifying this flag are deprecated.
+
 4.49 KVM_DEASSIGN_PCI_DEVICE
 
 Capability: KVM_CAP_DEVICE_DEASSIGNMENT
diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
index 4e9eaeb..2269d71 100644
--- a/virt/kvm/assigned-dev.c
+++ b/virt/kvm/assigned-dev.c
@@ -481,6 +481,9 @@ static int kvm_vm_ioctl_assign_device(struct kvm *kvm,
 	struct kvm_assigned_dev_kernel *match;
 	struct pci_dev *dev;
 
+	if (!(assigned_dev->flags & KVM_DEV_ASSIGN_ENABLE_IOMMU))
+		return -EINVAL;
+
 	mutex_lock(&kvm->lock);
 	idx = srcu_read_lock(&kvm->srcu);
 
@@ -538,16 +541,14 @@ static int kvm_vm_ioctl_assign_device(struct kvm *kvm,
 
 	list_add(&match->list, &kvm->arch.assigned_dev_head);
 
-	if (assigned_dev->flags & KVM_DEV_ASSIGN_ENABLE_IOMMU) {
-		if (!kvm->arch.iommu_domain) {
-			r = kvm_iommu_map_guest(kvm);
-			if (r)
-				goto out_list_del;
-		}
-		r = kvm_assign_device(kvm, match);
+	if (!kvm->arch.iommu_domain) {
+		r = kvm_iommu_map_guest(kvm);
 		if (r)
 			goto out_list_del;
 	}
+	r = kvm_assign_device(kvm, match);
+	if (r)
+		goto out_list_del;
 
 out:
 	srcu_read_unlock(&kvm->srcu, idx);
@@ -587,8 +588,7 @@ static int kvm_vm_ioctl_deassign_device(struct kvm *kvm,
 		goto out;
 	}
 
-	if (match->flags & KVM_DEV_ASSIGN_ENABLE_IOMMU)
-		kvm_deassign_device(kvm, match);
+	kvm_deassign_device(kvm, match);
 
 	kvm_free_assigned_device(kvm, match);
 
-- 
1.7.7.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 4/4] KVM: Device assignment permission checks
  2012-01-12 10:39 [PATCH 0/4] KVM updates for Linux 3.1.8 Avi Kivity
                   ` (2 preceding siblings ...)
  2012-01-12 10:39 ` [PATCH 3/4] KVM: Remove ability to assign a device without iommu support Avi Kivity
@ 2012-01-12 10:39 ` Avi Kivity
  2012-01-12 16:15 ` [PATCH 0/4] KVM updates for Linux 3.1.8 Greg KH
  4 siblings, 0 replies; 8+ messages in thread
From: Avi Kivity @ 2012-01-12 10:39 UTC (permalink / raw)
  To: stable; +Cc: kvm, Marcelo Tosatti

From: Alex Williamson <alex.williamson@redhat.com>

Only allow KVM device assignment to attach to devices which:

 - Are not bridges
 - Have BAR resources (assume others are special devices)
 - The user has permissions to use

Assigning a bridge is a configuration error, it's not supported, and
typically doesn't result in the behavior the user is expecting anyway.
Devices without BAR resources are typically chipset components that
also don't have host drivers.  We don't want users to hold such devices
captive or cause system problems by fencing them off into an iommu
domain.  We determine "permission to use" by testing whether the user
has access to the PCI sysfs resource files.  By default a normal user
will not have access to these files, so it provides a good indication
that an administration agent has granted the user access to the device.

[Yang Bai: add missing #include]
[avi: fix comment style]

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Yang Bai <hamo.by@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
(cherry picked from commit 3d27e23b17010c668db311140b17bbbb70c78fb9)
---
 Documentation/virtual/kvm/api.txt |    4 ++
 virt/kvm/assigned-dev.c           |   75 +++++++++++++++++++++++++++++++++++++
 2 files changed, 79 insertions(+), 0 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 57c9336..13ab837 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1134,6 +1134,10 @@ following flags are specified:
 The KVM_DEV_ASSIGN_ENABLE_IOMMU flag is a mandatory option to ensure
 isolation of the device.  Usages not specifying this flag are deprecated.
 
+Only PCI header type 0 devices with PCI BAR resources are supported by
+device assignment.  The user requesting this ioctl must have read/write
+access to the PCI sysfs resource files associated with the device.
+
 4.49 KVM_DEASSIGN_PCI_DEVICE
 
 Capability: KVM_CAP_DEVICE_DEASSIGNMENT
diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
index 2269d71..af79102 100644
--- a/virt/kvm/assigned-dev.c
+++ b/virt/kvm/assigned-dev.c
@@ -17,6 +17,8 @@
 #include <linux/pci.h>
 #include <linux/interrupt.h>
 #include <linux/slab.h>
+#include <linux/namei.h>
+#include <linux/fs.h>
 #include "irq.h"
 
 static struct kvm_assigned_dev_kernel *kvm_find_assigned_dev(struct list_head *head,
@@ -474,12 +476,73 @@ static int kvm_vm_ioctl_deassign_dev_irq(struct kvm *kvm,
 	return r;
 }
 
+/*
+ * We want to test whether the caller has been granted permissions to
+ * use this device.  To be able to configure and control the device,
+ * the user needs access to PCI configuration space and BAR resources.
+ * These are accessed through PCI sysfs.  PCI config space is often
+ * passed to the process calling this ioctl via file descriptor, so we
+ * can't rely on access to that file.  We can check for permissions
+ * on each of the BAR resource files, which is a pretty clear
+ * indicator that the user has been granted access to the device.
+ */
+static int probe_sysfs_permissions(struct pci_dev *dev)
+{
+#ifdef CONFIG_SYSFS
+	int i;
+	bool bar_found = false;
+
+	for (i = PCI_STD_RESOURCES; i <= PCI_STD_RESOURCE_END; i++) {
+		char *kpath, *syspath;
+		struct path path;
+		struct inode *inode;
+		int r;
+
+		if (!pci_resource_len(dev, i))
+			continue;
+
+		kpath = kobject_get_path(&dev->dev.kobj, GFP_KERNEL);
+		if (!kpath)
+			return -ENOMEM;
+
+		/* Per sysfs-rules, sysfs is always at /sys */
+		syspath = kasprintf(GFP_KERNEL, "/sys%s/resource%d", kpath, i);
+		kfree(kpath);
+		if (!syspath)
+			return -ENOMEM;
+
+		r = kern_path(syspath, LOOKUP_FOLLOW, &path);
+		kfree(syspath);
+		if (r)
+			return r;
+
+		inode = path.dentry->d_inode;
+
+		r = inode_permission(inode, MAY_READ | MAY_WRITE | MAY_ACCESS);
+		path_put(&path);
+		if (r)
+			return r;
+
+		bar_found = true;
+	}
+
+	/* If no resources, probably something special */
+	if (!bar_found)
+		return -EPERM;
+
+	return 0;
+#else
+	return -EINVAL; /* No way to control the device without sysfs */
+#endif
+}
+
 static int kvm_vm_ioctl_assign_device(struct kvm *kvm,
 				      struct kvm_assigned_pci_dev *assigned_dev)
 {
 	int r = 0, idx;
 	struct kvm_assigned_dev_kernel *match;
 	struct pci_dev *dev;
+	u8 header_type;
 
 	if (!(assigned_dev->flags & KVM_DEV_ASSIGN_ENABLE_IOMMU))
 		return -EINVAL;
@@ -510,6 +573,18 @@ static int kvm_vm_ioctl_assign_device(struct kvm *kvm,
 		r = -EINVAL;
 		goto out_free;
 	}
+
+	/* Don't allow bridges to be assigned */
+	pci_read_config_byte(dev, PCI_HEADER_TYPE, &header_type);
+	if ((header_type & PCI_HEADER_TYPE) != PCI_HEADER_TYPE_NORMAL) {
+		r = -EPERM;
+		goto out_put;
+	}
+
+	r = probe_sysfs_permissions(dev);
+	if (r)
+		goto out_put;
+
 	if (pci_enable_device(dev)) {
 		printk(KERN_INFO "%s: Could not enable PCI device\n", __func__);
 		r = -EBUSY;
-- 
1.7.7.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/4] KVM updates for Linux 3.1.8
  2012-01-12 10:39 [PATCH 0/4] KVM updates for Linux 3.1.8 Avi Kivity
                   ` (3 preceding siblings ...)
  2012-01-12 10:39 ` [PATCH 4/4] KVM: Device assignment permission checks Avi Kivity
@ 2012-01-12 16:15 ` Greg KH
  2012-01-12 16:25   ` Avi Kivity
  4 siblings, 1 reply; 8+ messages in thread
From: Greg KH @ 2012-01-12 16:15 UTC (permalink / raw)
  To: Avi Kivity; +Cc: stable, kvm, Marcelo Tosatti

On Thu, Jan 12, 2012 at 12:39:50PM +0200, Avi Kivity wrote:
> The following patches want to be applied to the 3.1 branch.  "All users must
> update".

Do any of these also need to go to the 3.0 branch?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/4] KVM updates for Linux 3.1.8
  2012-01-12 16:15 ` [PATCH 0/4] KVM updates for Linux 3.1.8 Greg KH
@ 2012-01-12 16:25   ` Avi Kivity
  2012-01-12 22:58     ` Greg KH
  0 siblings, 1 reply; 8+ messages in thread
From: Avi Kivity @ 2012-01-12 16:25 UTC (permalink / raw)
  To: Greg KH; +Cc: stable, kvm, Marcelo Tosatti

On 01/12/2012 06:15 PM, Greg KH wrote:
> On Thu, Jan 12, 2012 at 12:39:50PM +0200, Avi Kivity wrote:
> > The following patches want to be applied to the 3.1 branch.  "All users must
> > update".
>
> Do any of these also need to go to the 3.0 branch?

Yes.  But for kvm we regression test stable patches, and I haven't done
this yet.  I find the normal stable process, while very convenient,
frightening.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/4] KVM updates for Linux 3.1.8
  2012-01-12 16:25   ` Avi Kivity
@ 2012-01-12 22:58     ` Greg KH
  0 siblings, 0 replies; 8+ messages in thread
From: Greg KH @ 2012-01-12 22:58 UTC (permalink / raw)
  To: Avi Kivity; +Cc: stable, kvm, Marcelo Tosatti

On Thu, Jan 12, 2012 at 06:25:43PM +0200, Avi Kivity wrote:
> On 01/12/2012 06:15 PM, Greg KH wrote:
> > On Thu, Jan 12, 2012 at 12:39:50PM +0200, Avi Kivity wrote:
> > > The following patches want to be applied to the 3.1 branch.  "All users must
> > > update".
> >
> > Do any of these also need to go to the 3.0 branch?
> 
> Yes.  But for kvm we regression test stable patches, and I haven't done
> this yet.  I find the normal stable process, while very convenient,
> frightening.

What do you mean by this?

And thanks for the patches, they are all now queued up.

greg k-h

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2012-01-12 22:58 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-01-12 10:39 [PATCH 0/4] KVM updates for Linux 3.1.8 Avi Kivity
2012-01-12 10:39 ` [PATCH 1/4] KVM guest: prevent tracing recursion with kvmclock Avi Kivity
2012-01-12 10:39 ` [PATCH 2/4] KVM: x86: Prevent starting PIT timers in the absence of irqchip support Avi Kivity
2012-01-12 10:39 ` [PATCH 3/4] KVM: Remove ability to assign a device without iommu support Avi Kivity
2012-01-12 10:39 ` [PATCH 4/4] KVM: Device assignment permission checks Avi Kivity
2012-01-12 16:15 ` [PATCH 0/4] KVM updates for Linux 3.1.8 Greg KH
2012-01-12 16:25   ` Avi Kivity
2012-01-12 22:58     ` Greg KH

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.