All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3] Userspace grant communication
@ 2010-12-17  0:17 Daniel De Graaf
  2010-12-17  0:17 ` [PATCH 1/7] xen-gntdev: Fix circular locking dependency Daniel De Graaf
                   ` (8 more replies)
  0 siblings, 9 replies; 38+ messages in thread
From: Daniel De Graaf @ 2010-12-17  0:17 UTC (permalink / raw)
  To: xen-devel; +Cc: jeremy, Ian.Campbell

Rebased on top of Stefano's 2.6.37 gntdev patches; included suggestions
from the v2 patch queue. HVM mapping is now much cleaner, since pages
with real GFNs are available. Multicall API didn't make it in, because
the actual hypercall is within a function that I'm not sure should be
changed to multicall in all instances, and I haven't had a chance to
figure out how to use it despite this.

[PATCH 1/7] xen-gntdev: Fix circular locking dependency
[PATCH 2/7] xen-gntdev: Change page limit to be global instead of per-open
[PATCH 3/7] xen-gntdev: Remove unneeded structures from grant_map tracking data
[PATCH 4/7] xen-gntdev: Use find_vma rather than iterating our vma list manually
[PATCH 5/7] xen-gntdev: Add reference counting to maps
[PATCH 6/7] xen-gntdev: Support mapping in HVM domains
[PATCH 7/7] xen-gntalloc: Userspace grant allocation driver

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 1/7] xen-gntdev: Fix circular locking dependency
  2010-12-17  0:17 [PATCH v3] Userspace grant communication Daniel De Graaf
@ 2010-12-17  0:17 ` Daniel De Graaf
  2010-12-17  0:17 ` [PATCH 2/7] xen-gntdev: Change page limit to be global instead of per-open Daniel De Graaf
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 38+ messages in thread
From: Daniel De Graaf @ 2010-12-17  0:17 UTC (permalink / raw)
  To: xen-devel; +Cc: Daniel De Graaf, jeremy, Ian.Campbell

apply_to_page_range will acquire PTE lock while priv->lock is held,
and mn_invl_range_start tries to acquire priv->lock with PTE already
held.  Fix by not holding priv->lock during the entire map operation.
This is safe because map->vma is set nonzero while the lock is held,
which will cause subsequent maps to fail and will cause the unmap
ioctl (and other users of gntdev_del_map) to return -EBUSY until the
area is unmapped. It is similarly impossible for gntdev_vma_close to
be called while the vma is still being created.

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
 drivers/xen/gntdev.c |   19 +++++++++----------
 1 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 35de6bb..387c5f1 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -608,23 +608,22 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
 	if (!(vma->vm_flags & VM_WRITE))
 		map->flags |= GNTMAP_readonly;
 
+	spin_unlock(&priv->lock);
+
 	err = apply_to_page_range(vma->vm_mm, vma->vm_start,
 				  vma->vm_end - vma->vm_start,
 				  find_grant_ptes, map);
-	if (err) {
-		goto unlock_out;
-		if (debug)
-			printk("%s: find_grant_ptes() failure.\n", __FUNCTION__);
-	}
+	if (err)
+		return err;
 
 	err = map_grant_pages(map);
-	if (err) {
-		goto unlock_out;
-		if (debug)
-			printk("%s: map_grant_pages() failure.\n", __FUNCTION__);
-	}
+	if (err)
+		return err;
+
 	map->is_mapped = 1;
 
+	return 0;
+
 unlock_out:
 	spin_unlock(&priv->lock);
 	return err;
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 2/7] xen-gntdev: Change page limit to be global instead of per-open
  2010-12-17  0:17 [PATCH v3] Userspace grant communication Daniel De Graaf
  2010-12-17  0:17 ` [PATCH 1/7] xen-gntdev: Fix circular locking dependency Daniel De Graaf
@ 2010-12-17  0:17 ` Daniel De Graaf
  2011-01-10 21:52   ` Konrad Rzeszutek Wilk
  2010-12-17  0:17 ` [PATCH 3/7] xen-gntdev: Remove unneeded structures from grant_map tracking data Daniel De Graaf
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 38+ messages in thread
From: Daniel De Graaf @ 2010-12-17  0:17 UTC (permalink / raw)
  To: xen-devel; +Cc: Daniel De Graaf, jeremy, Ian.Campbell

Because there is no limitation on how many times a user can open a
given device file, an per-file-description limit on the number of
pages granted offers little to no benefit. Change to a global limit
and remove the ioctl() as the parameter can now be changed via sysfs.

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
 drivers/xen/gntdev.c |   48 ++++++++++++++----------------------------------
 1 files changed, 14 insertions(+), 34 deletions(-)

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 387c5f1..7e21592 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -45,13 +45,12 @@ MODULE_DESCRIPTION("User-space granted page access driver");
 
 static int debug = 0;
 module_param(debug, int, 0644);
-static int limit = 1024;
+static int limit = 1024*1024;
 module_param(limit, int, 0644);
+static atomic_t pages_mapped = ATOMIC_INIT(0);
 
 struct gntdev_priv {
 	struct list_head maps;
-	uint32_t used;
-	uint32_t limit;
 	spinlock_t lock;
 	struct mm_struct *mm;
 	struct mmu_notifier mn;
@@ -78,8 +77,8 @@ static void gntdev_print_maps(struct gntdev_priv *priv,
 {
 	struct grant_map *map;
 
-	printk("%s: maps list (priv %p, usage %d/%d)\n",
-	       __FUNCTION__, priv, priv->used, priv->limit);
+	printk("%s: maps list (priv %p)\n",
+	       __FUNCTION__, priv);
 	list_for_each_entry(map, &priv->maps, next)
 		printk("  index %2d, count %2d %s\n",
 		       map->index, map->count,
@@ -115,9 +114,6 @@ static struct grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count)
 	add->count = count;
 	add->priv  = priv;
 
-	if (add->count + priv->used > priv->limit)
-		goto err;
-
 	return add;
 
 err:
@@ -148,7 +144,6 @@ static void gntdev_add_map(struct gntdev_priv *priv, struct grant_map *add)
 	list_add_tail(&add->next, &priv->maps);
 
 done:
-	priv->used += add->count;
 	if (debug)
 		gntdev_print_maps(priv, "[new]", add->index);
 }
@@ -195,7 +190,7 @@ static int gntdev_del_map(struct grant_map *map)
 		if (map->unmap_ops[i].handle)
 			return -EBUSY;
 
-	map->priv->used -= map->count;
+	atomic_sub(map->count, &pages_mapped);
 	list_del(&map->next);
 	return 0;
 }
@@ -386,7 +381,6 @@ static int gntdev_open(struct inode *inode, struct file *flip)
 
 	INIT_LIST_HEAD(&priv->maps);
 	spin_lock_init(&priv->lock);
-	priv->limit = limit;
 
 	priv->mm = get_task_mm(current);
 	if (!priv->mm) {
@@ -442,19 +436,26 @@ static long gntdev_ioctl_map_grant_ref(struct gntdev_priv *priv,
 		       op.count);
 	if (unlikely(op.count <= 0))
 		return -EINVAL;
-	if (unlikely(op.count > priv->limit))
-		return -EINVAL;
 
 	err = -ENOMEM;
 	map = gntdev_alloc_map(priv, op.count);
 	if (!map)
 		return err;
+
 	if (copy_from_user(map->grants, &u->refs,
 			   sizeof(map->grants[0]) * op.count) != 0) {
 		gntdev_free_map(map);
 		return err;
 	}
 
+	if (unlikely(atomic_add_return(op.count, &pages_mapped) > limit))
+	{
+		if (debug)
+			printk("%s: can't map: over limit\n", __FUNCTION__);
+		gntdev_free_map(map);
+		return err;
+	}
+
 	spin_lock(&priv->lock);
 	gntdev_add_map(priv, map);
 	op.index = map->index << PAGE_SHIFT;
@@ -521,24 +522,6 @@ static long gntdev_ioctl_get_offset_for_vaddr(struct gntdev_priv *priv,
 	return 0;
 }
 
-static long gntdev_ioctl_set_max_grants(struct gntdev_priv *priv,
-					struct ioctl_gntdev_set_max_grants __user *u)
-{
-	struct ioctl_gntdev_set_max_grants op;
-
-	if (copy_from_user(&op, u, sizeof(op)) != 0)
-		return -EFAULT;
-	if (debug)
-		printk("%s: priv %p, limit %d\n", __FUNCTION__, priv, op.count);
-	if (op.count > limit)
-		return -EINVAL;
-
-	spin_lock(&priv->lock);
-	priv->limit = op.count;
-	spin_unlock(&priv->lock);
-	return 0;
-}
-
 static long gntdev_ioctl(struct file *flip,
 			 unsigned int cmd, unsigned long arg)
 {
@@ -555,9 +538,6 @@ static long gntdev_ioctl(struct file *flip,
 	case IOCTL_GNTDEV_GET_OFFSET_FOR_VADDR:
 		return gntdev_ioctl_get_offset_for_vaddr(priv, ptr);
 
-	case IOCTL_GNTDEV_SET_MAX_GRANTS:
-		return gntdev_ioctl_set_max_grants(priv, ptr);
-
 	default:
 		if (debug)
 			printk("%s: priv %p, unknown cmd %x\n",
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 3/7] xen-gntdev: Remove unneeded structures from grant_map tracking data
  2010-12-17  0:17 [PATCH v3] Userspace grant communication Daniel De Graaf
  2010-12-17  0:17 ` [PATCH 1/7] xen-gntdev: Fix circular locking dependency Daniel De Graaf
  2010-12-17  0:17 ` [PATCH 2/7] xen-gntdev: Change page limit to be global instead of per-open Daniel De Graaf
@ 2010-12-17  0:17 ` Daniel De Graaf
  2011-01-10 22:14   ` Konrad Rzeszutek Wilk
  2010-12-17  0:17 ` [PATCH 4/7] xen-gntdev: Use find_vma rather than iterating our vma list manually Daniel De Graaf
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 38+ messages in thread
From: Daniel De Graaf @ 2010-12-17  0:17 UTC (permalink / raw)
  To: xen-devel; +Cc: Daniel De Graaf, jeremy, Ian.Campbell

The entire hypercall argument list isn't required; only selected
fields from the hypercall need to be tracked between the ioctl, map,
and unmap operations.

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
 drivers/xen/gntdev.c |  222 ++++++++++++++++++++++++++++++--------------------
 1 files changed, 134 insertions(+), 88 deletions(-)

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 7e21592..3778b85 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -56,18 +56,24 @@ struct gntdev_priv {
 	struct mmu_notifier mn;
 };
 
+struct granted_page {
+	u64 pte_maddr;
+	union {
+		struct ioctl_gntdev_grant_ref target;
+		grant_handle_t handle;
+	};
+};
+
 struct grant_map {
 	struct list_head next;
 	struct gntdev_priv *priv;
 	struct vm_area_struct *vma;
 	int index;
 	int count;
-	int flags;
-	int is_mapped;
-	struct ioctl_gntdev_grant_ref *grants;
-	struct gnttab_map_grant_ref   *map_ops;
-	struct gnttab_unmap_grant_ref *unmap_ops;
+	int is_mapped:1;
+	int is_ro:1;
 	struct page **pages;
+	struct granted_page pginfo[0];
 };
 
 /* ------------------------------------------------------------------ */
@@ -85,24 +91,19 @@ static void gntdev_print_maps(struct gntdev_priv *priv,
 		       map->index == text_index && text ? text : "");
 }
 
-static struct grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count)
+static struct grant_map *gntdev_alloc_map(int count,
+		struct ioctl_gntdev_grant_ref* grants)
 {
 	struct grant_map *add;
 	int i;
 
-	add = kzalloc(sizeof(struct grant_map), GFP_KERNEL);
-	if (NULL == add)
+	add = kzalloc(sizeof(*add) + sizeof(add->pginfo[0])*count, GFP_KERNEL);
+	if (!add)
 		return NULL;
 
-	add->grants    = kzalloc(sizeof(add->grants[0])    * count, GFP_KERNEL);
-	add->map_ops   = kzalloc(sizeof(add->map_ops[0])   * count, GFP_KERNEL);
-	add->unmap_ops = kzalloc(sizeof(add->unmap_ops[0]) * count, GFP_KERNEL);
-	add->pages     = kzalloc(sizeof(add->pages[0])     * count, GFP_KERNEL);
-	if (NULL == add->grants    ||
-	    NULL == add->map_ops   ||
-	    NULL == add->unmap_ops ||
-	    NULL == add->pages)
-		goto err;
+	add->pages = kzalloc(sizeof(add->pages[0])*count, GFP_KERNEL);
+	if (!add->pages)
+		goto err_nopages;
 
 	for (i = 0; i < count; i++) {
 		add->pages[i] = alloc_page(GFP_KERNEL | __GFP_HIGHMEM);
@@ -112,20 +113,18 @@ static struct grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count)
 
 	add->index = 0;
 	add->count = count;
-	add->priv  = priv;
+	for(i = 0; i < count; i++)
+		add->pginfo[i].target = grants[i];
 
 	return add;
 
 err:
-	if (add->pages)
-		for (i = 0; i < count; i++) {
-			if (add->pages[i])
-				__free_page(add->pages[i]);
-		}
+	for (i = 0; i < count; i++) {
+		if (add->pages[i])
+			__free_page(add->pages[i]);
+	}
 	kfree(add->pages);
-	kfree(add->grants);
-	kfree(add->map_ops);
-	kfree(add->unmap_ops);
+err_nopages:
 	kfree(add);
 	return NULL;
 }
@@ -134,6 +133,7 @@ static void gntdev_add_map(struct gntdev_priv *priv, struct grant_map *add)
 {
 	struct grant_map *map;
 
+	spin_lock(&priv->lock);
 	list_for_each_entry(map, &priv->maps, next) {
 		if (add->index + add->count < map->index) {
 			list_add_tail(&add->next, &map->next);
@@ -144,8 +144,10 @@ static void gntdev_add_map(struct gntdev_priv *priv, struct grant_map *add)
 	list_add_tail(&add->next, &priv->maps);
 
 done:
+	add->priv = priv;
 	if (debug)
 		gntdev_print_maps(priv, "[new]", add->index);
+	spin_unlock(&priv->lock);
 }
 
 static struct grant_map *gntdev_find_map_index(struct gntdev_priv *priv, int index,
@@ -186,9 +188,10 @@ static int gntdev_del_map(struct grant_map *map)
 
 	if (map->vma)
 		return -EBUSY;
-	for (i = 0; i < map->count; i++)
-		if (map->unmap_ops[i].handle)
-			return -EBUSY;
+	if (map->is_mapped)
+		for (i = 0; i < map->count; i++)
+			if (map->pginfo[i].handle)
+				return -EBUSY;
 
 	atomic_sub(map->count, &pages_mapped);
 	list_del(&map->next);
@@ -202,15 +205,10 @@ static void gntdev_free_map(struct grant_map *map)
 	if (!map)
 		return;
 
-	if (map->pages)
-		for (i = 0; i < map->count; i++) {
-			if (map->pages[i])
-				__free_page(map->pages[i]);
-		}
-	kfree(map->pages);
-	kfree(map->grants);
-	kfree(map->map_ops);
-	kfree(map->unmap_ops);
+	for (i = 0; i < map->count; i++) {
+		if (map->pages[i])
+			__free_page(map->pages[i]);
+	}
 	kfree(map);
 }
 
@@ -223,53 +221,99 @@ static int find_grant_ptes(pte_t *pte, pgtable_t token, unsigned long addr, void
 	u64 pte_maddr;
 
 	BUG_ON(pgnr >= map->count);
+
 	pte_maddr = arbitrary_virt_to_machine(pte).maddr;
+	map->pginfo[pgnr].pte_maddr = pte_maddr;
 
-	gnttab_set_map_op(&map->map_ops[pgnr], pte_maddr,
-			  GNTMAP_contains_pte | map->flags,
-			  map->grants[pgnr].ref,
-			  map->grants[pgnr].domid);
-	gnttab_set_unmap_op(&map->unmap_ops[pgnr], pte_maddr,
-			    GNTMAP_contains_pte | map->flags,
-			    0 /* handle */);
 	return 0;
 }
 
 static int map_grant_pages(struct grant_map *map)
 {
-	int i, err = 0;
+	int i, flags, err = 0;
+	struct gnttab_map_grant_ref* map_ops = NULL;
 
+	flags = GNTMAP_host_map | GNTMAP_application_map | GNTMAP_contains_pte;
+	if (map->is_ro)
+		flags |= GNTMAP_readonly;
+
+	err = -ENOMEM;
+	map_ops = kzalloc(sizeof(map_ops[0]) * map->count, GFP_TEMPORARY);
+	if (!map_ops)
+		goto out;
+
+	for(i=0; i < map->count; i++) {
+		gnttab_set_map_op(&map_ops[i], map->pginfo[i].pte_maddr, flags,
+				  map->pginfo[i].target.ref,
+				  map->pginfo[i].target.domid);
+	}
 	if (debug)
 		printk("%s: map %d+%d\n", __FUNCTION__, map->index, map->count);
-	err = gnttab_map_refs(map->map_ops, map->pages, map->count);
+
+	err = gnttab_map_refs(map_ops, map->pages, map->count);
+
 	if (WARN_ON(err))
-		return err;
+		goto out;
 
 	for (i = 0; i < map->count; i++) {
-		if (map->map_ops[i].status)
+		if (map_ops[i].status) {
+			__free_page(map->pages[i]);
+			map->pages[i] = NULL;
 			err = -EINVAL;
-		map->unmap_ops[i].handle = map->map_ops[i].handle;
+		} else {
+			map->pginfo[i].handle = map_ops[i].handle;
+		}
 	}
+
+out:
+	kfree(map_ops);
 	return err;
 }
 
-static int unmap_grant_pages(struct grant_map *map, int offset, int pages)
+static void unmap_grant_pages(struct grant_map *map, int offset, int pages)
 {
-	int i, err = 0;
+	int i, flags, err = 0;
+	struct gnttab_unmap_grant_ref *unmap_ops;
+	struct gnttab_unmap_grant_ref unmap_single;
+
+	if (pages > 1) {
+		unmap_ops = kzalloc(sizeof(unmap_ops[0]) * pages,
+		                    GFP_TEMPORARY);
+		if (unlikely(!unmap_ops)) {
+			for(i=0; i < pages; i++)
+				unmap_grant_pages(map, offset + i, 1);
+			return;
+		}
+	} else {
+		unmap_ops = &unmap_single;
+	}
+
+	flags = GNTMAP_host_map | GNTMAP_application_map | GNTMAP_contains_pte;
+	if (map->is_ro)
+		flags |= GNTMAP_readonly;
 
+	for(i=0; i < pages; i++)
+		gnttab_set_unmap_op(&unmap_ops[i],
+		                    map->pginfo[offset+i].pte_maddr, flags,
+		                    map->pginfo[offset+i].handle);
 	if (debug)
 		printk("%s: map %d+%d [%d+%d]\n", __FUNCTION__,
 		       map->index, map->count, offset, pages);
-	err = gnttab_unmap_refs(map->unmap_ops + offset, map->pages, pages);
+
+	err = gnttab_unmap_refs(unmap_ops, map->pages + offset, pages);
+
 	if (WARN_ON(err))
-		return err;
+		goto out;
 
 	for (i = 0; i < pages; i++) {
-		if (map->unmap_ops[offset+i].status)
-			err = -EINVAL;
-		map->unmap_ops[offset+i].handle = 0;
+		WARN_ON(unmap_ops[i].status);
+		__free_page(map->pages[offset+i]);
+		map->pages[offset+i] = NULL;
+		map->pginfo[offset+i].handle = 0;
 	}
-	return err;
+out:
+	if (unmap_ops != &unmap_single)
+		kfree(unmap_ops);
 }
 
 /* ------------------------------------------------------------------ */
@@ -308,7 +352,6 @@ static void mn_invl_range_start(struct mmu_notifier *mn,
 	struct gntdev_priv *priv = container_of(mn, struct gntdev_priv, mn);
 	struct grant_map *map;
 	unsigned long mstart, mend;
-	int err;
 
 	spin_lock(&priv->lock);
 	list_for_each_entry(map, &priv->maps, next) {
@@ -327,10 +370,9 @@ static void mn_invl_range_start(struct mmu_notifier *mn,
 			       __FUNCTION__, map->index, map->count,
 			       map->vma->vm_start, map->vma->vm_end,
 			       start, end, mstart, mend);
-		err = unmap_grant_pages(map,
-					(mstart - map->vma->vm_start) >> PAGE_SHIFT,
-					(mend - mstart) >> PAGE_SHIFT);
-		WARN_ON(err);
+		unmap_grant_pages(map,
+				  (mstart - map->vma->vm_start) >> PAGE_SHIFT,
+				  (mend - mstart) >> PAGE_SHIFT);
 	}
 	spin_unlock(&priv->lock);
 }
@@ -347,7 +389,6 @@ static void mn_release(struct mmu_notifier *mn,
 {
 	struct gntdev_priv *priv = container_of(mn, struct gntdev_priv, mn);
 	struct grant_map *map;
-	int err;
 
 	spin_lock(&priv->lock);
 	list_for_each_entry(map, &priv->maps, next) {
@@ -357,8 +398,7 @@ static void mn_release(struct mmu_notifier *mn,
 			printk("%s: map %d+%d (%lx %lx)\n",
 			       __FUNCTION__, map->index, map->count,
 			       map->vma->vm_start, map->vma->vm_end);
-		err = unmap_grant_pages(map, 0, map->count);
-		WARN_ON(err);
+		unmap_grant_pages(map, 0, map->count);
 	}
 	spin_unlock(&priv->lock);
 }
@@ -427,6 +467,7 @@ static long gntdev_ioctl_map_grant_ref(struct gntdev_priv *priv,
 {
 	struct ioctl_gntdev_map_grant_ref op;
 	struct grant_map *map;
+	struct ioctl_gntdev_grant_ref* grants;
 	int err;
 
 	if (copy_from_user(&op, u, sizeof(op)) != 0)
@@ -437,38 +478,45 @@ static long gntdev_ioctl_map_grant_ref(struct gntdev_priv *priv,
 	if (unlikely(op.count <= 0))
 		return -EINVAL;
 
-	err = -ENOMEM;
-	map = gntdev_alloc_map(priv, op.count);
-	if (!map)
-		return err;
+	grants = kmalloc(sizeof(grants[0]) * op.count, GFP_TEMPORARY);
+	if (!grants)
+		return -ENOMEM;
 
-	if (copy_from_user(map->grants, &u->refs,
-			   sizeof(map->grants[0]) * op.count) != 0) {
-		gntdev_free_map(map);
-		return err;
+	if (copy_from_user(grants, u->refs, sizeof(grants[0]) * op.count)) {
+		err = -EFAULT;
+		goto out_free;
 	}
 
+	err = -ENOMEM;
+	map = gntdev_alloc_map(op.count, grants);
+	if (!map)
+		goto out_free;
+
 	if (unlikely(atomic_add_return(op.count, &pages_mapped) > limit))
 	{
 		if (debug)
 			printk("%s: can't map: over limit\n", __FUNCTION__);
-		gntdev_free_map(map);
-		return err;
+		goto out_free_map;
 	}
 
-	spin_lock(&priv->lock);
 	gntdev_add_map(priv, map);
 	op.index = map->index << PAGE_SHIFT;
-	spin_unlock(&priv->lock);
 
-	if (copy_to_user(u, &op, sizeof(op)) != 0) {
-		spin_lock(&priv->lock);
-		gntdev_del_map(map);
-		spin_unlock(&priv->lock);
-		gntdev_free_map(map);
-		return err;
+	if (copy_to_user(u, &op, sizeof(op))) {
+		err = -EFAULT;
+		goto out_remove;
 	}
-	return 0;
+	err = 0;
+out_free:
+	kfree(grants);
+	return err;
+out_remove:
+	spin_lock(&priv->lock);
+	gntdev_del_map(map);
+	spin_unlock(&priv->lock);
+out_free_map:
+	gntdev_free_map(map);
+	goto out_free;
 }
 
 static long gntdev_ioctl_unmap_grant_ref(struct gntdev_priv *priv,
@@ -584,9 +632,7 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
 	vma->vm_private_data = map;
 	map->vma = vma;
 
-	map->flags = GNTMAP_host_map | GNTMAP_application_map;
-	if (!(vma->vm_flags & VM_WRITE))
-		map->flags |= GNTMAP_readonly;
+	map->is_ro = !(vma->vm_flags & VM_WRITE);
 
 	spin_unlock(&priv->lock);
 
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 4/7] xen-gntdev: Use find_vma rather than iterating our vma list manually
  2010-12-17  0:17 [PATCH v3] Userspace grant communication Daniel De Graaf
                   ` (2 preceding siblings ...)
  2010-12-17  0:17 ` [PATCH 3/7] xen-gntdev: Remove unneeded structures from grant_map tracking data Daniel De Graaf
@ 2010-12-17  0:17 ` Daniel De Graaf
  2010-12-17  0:17 ` [PATCH 5/7] xen-gntdev: Add reference counting to maps Daniel De Graaf
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 38+ messages in thread
From: Daniel De Graaf @ 2010-12-17  0:17 UTC (permalink / raw)
  To: xen-devel; +Cc: Daniel De Graaf, jeremy, Ian.Campbell

This should be faster if many mappings exist, and also removes
the only user of map->vma not related to PTE modification.

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
 drivers/xen/gntdev.c |   32 ++++++++------------------------
 1 files changed, 8 insertions(+), 24 deletions(-)

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 3778b85..6a3c9e4 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -165,23 +165,6 @@ static struct grant_map *gntdev_find_map_index(struct gntdev_priv *priv, int ind
 	return NULL;
 }
 
-static struct grant_map *gntdev_find_map_vaddr(struct gntdev_priv *priv,
-					       unsigned long vaddr)
-{
-	struct grant_map *map;
-
-	list_for_each_entry(map, &priv->maps, next) {
-		if (!map->vma)
-			continue;
-		if (vaddr < map->vma->vm_start)
-			continue;
-		if (vaddr >= map->vma->vm_end)
-			continue;
-		return map;
-	}
-	return NULL;
-}
-
 static int gntdev_del_map(struct grant_map *map)
 {
 	int i;
@@ -546,6 +529,7 @@ static long gntdev_ioctl_get_offset_for_vaddr(struct gntdev_priv *priv,
 					      struct ioctl_gntdev_get_offset_for_vaddr __user *u)
 {
 	struct ioctl_gntdev_get_offset_for_vaddr op;
+	struct vm_area_struct *vma;
 	struct grant_map *map;
 
 	if (copy_from_user(&op, u, sizeof(op)) != 0)
@@ -554,16 +538,16 @@ static long gntdev_ioctl_get_offset_for_vaddr(struct gntdev_priv *priv,
 		printk("%s: priv %p, offset for vaddr %lx\n", __FUNCTION__, priv,
 		       (unsigned long)op.vaddr);
 
-	spin_lock(&priv->lock);
-	map = gntdev_find_map_vaddr(priv, op.vaddr);
-	if (map == NULL ||
-	    map->vma->vm_start != op.vaddr) {
-		spin_unlock(&priv->lock);
+	vma = find_vma(current->mm, op.vaddr);
+	if (!vma || vma->vm_ops != &gntdev_vmops)
 		return -EINVAL;
-	}
+
+	map = vma->vm_private_data;
+	if (!map)
+		return -EINVAL;
+
 	op.offset = map->index << PAGE_SHIFT;
 	op.count = map->count;
-	spin_unlock(&priv->lock);
 
 	if (copy_to_user(u, &op, sizeof(op)) != 0)
 		return -EFAULT;
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 5/7] xen-gntdev: Add reference counting to maps
  2010-12-17  0:17 [PATCH v3] Userspace grant communication Daniel De Graaf
                   ` (3 preceding siblings ...)
  2010-12-17  0:17 ` [PATCH 4/7] xen-gntdev: Use find_vma rather than iterating our vma list manually Daniel De Graaf
@ 2010-12-17  0:17 ` Daniel De Graaf
  2010-12-17  0:49   ` Jeremy Fitzhardinge
                     ` (3 more replies)
  2010-12-17  0:17 ` [PATCH 6/7] xen-gntdev: Support mapping in HVM domains Daniel De Graaf
                   ` (3 subsequent siblings)
  8 siblings, 4 replies; 38+ messages in thread
From: Daniel De Graaf @ 2010-12-17  0:17 UTC (permalink / raw)
  To: xen-devel; +Cc: Daniel De Graaf, jeremy, Ian.Campbell

This allows userspace to perform mmap() on the gntdev device and then
immediately close the filehandle or remove the mapping using the
remove ioctl, with the mapped area remaining valid until unmapped.
This also fixes an infinite loop when a gntdev device is closed
without first unmapping all areas.

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
 drivers/xen/gntdev.c |   67 +++++++++++++++++++++++--------------------------
 1 files changed, 31 insertions(+), 36 deletions(-)

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 6a3c9e4..f1fc8fa 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -66,16 +66,18 @@ struct granted_page {
 
 struct grant_map {
 	struct list_head next;
-	struct gntdev_priv *priv;
 	struct vm_area_struct *vma;
 	int index;
 	int count;
+	atomic_t users;
 	int is_mapped:1;
 	int is_ro:1;
 	struct page **pages;
 	struct granted_page pginfo[0];
 };
 
+static void unmap_grant_pages(struct grant_map *map, int offset, int pages);
+
 /* ------------------------------------------------------------------ */
 
 static void gntdev_print_maps(struct gntdev_priv *priv,
@@ -112,6 +114,7 @@ static struct grant_map *gntdev_alloc_map(int count,
 	}
 
 	add->index = 0;
+	atomic_set(&add->users, 1);
 	add->count = count;
 	for(i = 0; i < count; i++)
 		add->pginfo[i].target = grants[i];
@@ -144,7 +147,6 @@ static void gntdev_add_map(struct gntdev_priv *priv, struct grant_map *add)
 	list_add_tail(&add->next, &priv->maps);
 
 done:
-	add->priv = priv;
 	if (debug)
 		gntdev_print_maps(priv, "[new]", add->index);
 	spin_unlock(&priv->lock);
@@ -165,33 +167,26 @@ static struct grant_map *gntdev_find_map_index(struct gntdev_priv *priv, int ind
 	return NULL;
 }
 
-static int gntdev_del_map(struct grant_map *map)
+static void gntdev_put_map(struct grant_map *map)
 {
 	int i;
 
-	if (map->vma)
-		return -EBUSY;
-	if (map->is_mapped)
-		for (i = 0; i < map->count; i++)
-			if (map->pginfo[i].handle)
-				return -EBUSY;
+	if (!map)
+		return;
 
-	atomic_sub(map->count, &pages_mapped);
-	list_del(&map->next);
-	return 0;
-}
+	if (!atomic_dec_and_test(&map->users))
+		return;
 
-static void gntdev_free_map(struct grant_map *map)
-{
-	unsigned i;
+	atomic_sub(map->count, &pages_mapped);
 
-	if (!map)
-		return;
+	if (!use_ptemod)
+		unmap_grant_pages(map, 0, map->count);
 
 	for (i = 0; i < map->count; i++) {
 		if (map->pages[i])
 			__free_page(map->pages[i]);
 	}
+	kfree(map->pages);
 	kfree(map);
 }
 
@@ -310,6 +305,7 @@ static void gntdev_vma_close(struct vm_area_struct *vma)
 	map->is_mapped = 0;
 	map->vma = NULL;
 	vma->vm_private_data = NULL;
+	gntdev_put_map(map);
 }
 
 static int gntdev_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
@@ -425,7 +421,6 @@ static int gntdev_release(struct inode *inode, struct file *flip)
 {
 	struct gntdev_priv *priv = flip->private_data;
 	struct grant_map *map;
-	int err;
 
 	if (debug)
 		printk("%s: priv %p\n", __FUNCTION__, priv);
@@ -433,10 +428,8 @@ static int gntdev_release(struct inode *inode, struct file *flip)
 	spin_lock(&priv->lock);
 	while (!list_empty(&priv->maps)) {
 		map = list_entry(priv->maps.next, struct grant_map, next);
-		err = gntdev_del_map(map);
-		if (WARN_ON(err))
-			gntdev_free_map(map);
-
+		list_del(&map->next);
+		gntdev_put_map(map);
 	}
 	spin_unlock(&priv->lock);
 
@@ -487,18 +480,14 @@ static long gntdev_ioctl_map_grant_ref(struct gntdev_priv *priv,
 
 	if (copy_to_user(u, &op, sizeof(op))) {
 		err = -EFAULT;
-		goto out_remove;
+		goto out_free;
 	}
 	err = 0;
 out_free:
 	kfree(grants);
 	return err;
-out_remove:
-	spin_lock(&priv->lock);
-	gntdev_del_map(map);
-	spin_unlock(&priv->lock);
 out_free_map:
-	gntdev_free_map(map);
+	gntdev_put_map(map);
 	goto out_free;
 }
 
@@ -507,7 +496,7 @@ static long gntdev_ioctl_unmap_grant_ref(struct gntdev_priv *priv,
 {
 	struct ioctl_gntdev_unmap_grant_ref op;
 	struct grant_map *map;
-	int err = -EINVAL;
+	int err;
 
 	if (copy_from_user(&op, u, sizeof(op)) != 0)
 		return -EFAULT;
@@ -517,11 +506,13 @@ static long gntdev_ioctl_unmap_grant_ref(struct gntdev_priv *priv,
 
 	spin_lock(&priv->lock);
 	map = gntdev_find_map_index(priv, op.index >> PAGE_SHIFT, op.count);
-	if (map)
-		err = gntdev_del_map(map);
+	if (map) {
+		list_del(&map->next);
+		gntdev_put_map(map);
+		err = 0;
+	} else
+		err = -EINVAL;
 	spin_unlock(&priv->lock);
-	if (!err)
-		gntdev_free_map(map);
 	return err;
 }
 
@@ -599,13 +590,15 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
 	map = gntdev_find_map_index(priv, index, count);
 	if (!map)
 		goto unlock_out;
-	if (map->vma)
+	if (use_ptemod && map->vma)
 		goto unlock_out;
 	if (priv->mm != vma->vm_mm) {
 		printk("%s: Huh? Other mm?\n", __FUNCTION__);
 		goto unlock_out;
 	}
 
+	atomic_inc(&map->users);
+
 	vma->vm_ops = &gntdev_vmops;
 
 	vma->vm_flags |= VM_RESERVED;
@@ -614,7 +607,9 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
 	vma->vm_flags |= VM_PFNMAP;
 
 	vma->vm_private_data = map;
-	map->vma = vma;
+
+	if (use_ptemod)
+		map->vma = vma;
 
 	map->is_ro = !(vma->vm_flags & VM_WRITE);
 
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 6/7] xen-gntdev: Support mapping in HVM domains
  2010-12-17  0:17 [PATCH v3] Userspace grant communication Daniel De Graaf
                   ` (4 preceding siblings ...)
  2010-12-17  0:17 ` [PATCH 5/7] xen-gntdev: Add reference counting to maps Daniel De Graaf
@ 2010-12-17  0:17 ` Daniel De Graaf
  2010-12-17 15:22   ` [PATCH 6/7 v2] " Daniel De Graaf
  2011-01-10 22:41   ` [PATCH 6/7] " Konrad Rzeszutek Wilk
  2010-12-17  0:17 ` [PATCH 7/7] xen-gntalloc: Userspace grant allocation driver Daniel De Graaf
                   ` (2 subsequent siblings)
  8 siblings, 2 replies; 38+ messages in thread
From: Daniel De Graaf @ 2010-12-17  0:17 UTC (permalink / raw)
  To: xen-devel; +Cc: Daniel De Graaf, jeremy, Ian.Campbell

HVM does not allow direct PTE modification, so instead we request
that Xen change its internal p2m mappings on the allocated pages and
map the memory into userspace normally.

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
 drivers/xen/gntdev.c      |  109 +++++++++++++++++++++++++++++++--------------
 drivers/xen/grant-table.c |    7 +++
 2 files changed, 83 insertions(+), 33 deletions(-)

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index f1fc8fa..0985577 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -30,6 +30,7 @@
 #include <linux/sched.h>
 #include <linux/spinlock.h>
 #include <linux/slab.h>
+#include <linux/highmem.h>
 
 #include <xen/xen.h>
 #include <xen/grant_table.h>
@@ -49,6 +50,8 @@ static int limit = 1024*1024;
 module_param(limit, int, 0644);
 static atomic_t pages_mapped = ATOMIC_INIT(0);
 
+static int use_ptemod = 0;
+
 struct gntdev_priv {
 	struct list_head maps;
 	spinlock_t lock;
@@ -209,9 +212,12 @@ static int find_grant_ptes(pte_t *pte, pgtable_t token, unsigned long addr, void
 static int map_grant_pages(struct grant_map *map)
 {
 	int i, flags, err = 0;
+	phys_addr_t addr;
 	struct gnttab_map_grant_ref* map_ops = NULL;
 
-	flags = GNTMAP_host_map | GNTMAP_application_map | GNTMAP_contains_pte;
+	flags = GNTMAP_host_map;
+	if (use_ptemod)
+		flags |= GNTMAP_application_map | GNTMAP_contains_pte;
 	if (map->is_ro)
 		flags |= GNTMAP_readonly;
 
@@ -221,7 +227,11 @@ static int map_grant_pages(struct grant_map *map)
 		goto out;
 
 	for(i=0; i < map->count; i++) {
-		gnttab_set_map_op(&map_ops[i], map->pginfo[i].pte_maddr, flags,
+		if (use_ptemod)
+			addr = map->pginfo[i].pte_maddr;
+		else
+			addr = (phys_addr_t)pfn_to_kaddr(page_to_pfn(map->pages[i]));
+		gnttab_set_map_op(&map_ops[i], addr, flags,
 				  map->pginfo[i].target.ref,
 				  map->pginfo[i].target.domid);
 	}
@@ -253,6 +263,7 @@ static void unmap_grant_pages(struct grant_map *map, int offset, int pages)
 	int i, flags, err = 0;
 	struct gnttab_unmap_grant_ref *unmap_ops;
 	struct gnttab_unmap_grant_ref unmap_single;
+	phys_addr_t addr;
 
 	if (pages > 1) {
 		unmap_ops = kzalloc(sizeof(unmap_ops[0]) * pages,
@@ -266,14 +277,23 @@ static void unmap_grant_pages(struct grant_map *map, int offset, int pages)
 		unmap_ops = &unmap_single;
 	}
 
-	flags = GNTMAP_host_map | GNTMAP_application_map | GNTMAP_contains_pte;
+	flags = GNTMAP_host_map;
+	if (use_ptemod)
+		flags |= GNTMAP_application_map | GNTMAP_contains_pte;
 	if (map->is_ro)
 		flags |= GNTMAP_readonly;
 
-	for(i=0; i < pages; i++)
-		gnttab_set_unmap_op(&unmap_ops[i],
-		                    map->pginfo[offset+i].pte_maddr, flags,
+	for(i=0; i < pages; i++) {
+		if (WARN_ON(!map->pages[i]))
+			continue;
+		if (use_ptemod)
+			addr = map->pginfo[i].pte_maddr;
+		else
+			addr = (phys_addr_t)pfn_to_kaddr(page_to_pfn(map->pages[i]));
+		gnttab_set_unmap_op(&unmap_ops[i], addr, flags,
 		                    map->pginfo[offset+i].handle);
+	}
+
 	if (debug)
 		printk("%s: map %d+%d [%d+%d]\n", __FUNCTION__,
 		       map->index, map->count, offset, pages);
@@ -284,8 +304,25 @@ static void unmap_grant_pages(struct grant_map *map, int offset, int pages)
 		goto out;
 
 	for (i = 0; i < pages; i++) {
+		uint32_t check, *tmp;
 		WARN_ON(unmap_ops[i].status);
-		__free_page(map->pages[offset+i]);
+		if (!map->pages[i])
+			continue;
+		/* XXX When unmapping, Xen will sometimes end up mapping the GFN
+		 * to an invalid MFN. In this case, writes will be discarded and
+		 * reads will return all 0xFF bytes. Leak these unusable GFNs
+		 * until a way to restore them is found.
+		 */
+		tmp = kmap(map->pages[i]);
+		tmp[0] = 0xdeaddead;
+		mb();
+		check = tmp[0];
+		kunmap(map->pages[i]);
+		if (check == 0xdeaddead)
+			__free_page(map->pages[i]);
+		else if (debug)
+			printk("%s: Discard page %d=%ld\n", __func__,
+				i, page_to_pfn(map->pages[i]));
 		map->pages[offset+i] = NULL;
 		map->pginfo[offset+i].handle = 0;
 	}
@@ -308,18 +345,8 @@ static void gntdev_vma_close(struct vm_area_struct *vma)
 	gntdev_put_map(map);
 }
 
-static int gntdev_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
-{
-	if (debug)
-		printk("%s: vaddr %p, pgoff %ld (shouldn't happen)\n",
-		       __FUNCTION__, vmf->virtual_address, vmf->pgoff);
-	vmf->flags = VM_FAULT_ERROR;
-	return 0;
-}
-
 static struct vm_operations_struct gntdev_vmops = {
 	.close = gntdev_vma_close,
-	.fault = gntdev_vma_fault,
 };
 
 /* ------------------------------------------------------------------ */
@@ -401,14 +428,16 @@ static int gntdev_open(struct inode *inode, struct file *flip)
 	INIT_LIST_HEAD(&priv->maps);
 	spin_lock_init(&priv->lock);
 
-	priv->mm = get_task_mm(current);
-	if (!priv->mm) {
-		kfree(priv);
-		return -ENOMEM;
+	if (use_ptemod) {
+		priv->mm = get_task_mm(current);
+		if (!priv->mm) {
+			kfree(priv);
+			return -ENOMEM;
+		}
+		priv->mn.ops = &gntdev_mmu_ops;
+		mmu_notifier_register(&priv->mn, priv->mm);
+		mmput(priv->mm);
 	}
-	priv->mn.ops = &gntdev_mmu_ops;
-	mmu_notifier_register(&priv->mn, priv->mm);
-	mmput(priv->mm);
 
 	flip->private_data = priv;
 	if (debug)
@@ -433,7 +462,8 @@ static int gntdev_release(struct inode *inode, struct file *flip)
 	}
 	spin_unlock(&priv->lock);
 
-	mmu_notifier_unregister(&priv->mn, priv->mm);
+	if (use_ptemod)
+		mmu_notifier_unregister(&priv->mn, priv->mm);
 	kfree(priv);
 	return 0;
 }
@@ -577,7 +607,7 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
 	int index = vma->vm_pgoff;
 	int count = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
 	struct grant_map *map;
-	int err = -EINVAL;
+	int i, err = -EINVAL;
 
 	if ((vma->vm_flags & VM_WRITE) && !(vma->vm_flags & VM_SHARED))
 		return -EINVAL;
@@ -592,7 +622,7 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
 		goto unlock_out;
 	if (use_ptemod && map->vma)
 		goto unlock_out;
-	if (priv->mm != vma->vm_mm) {
+	if (use_ptemod && priv->mm != vma->vm_mm) {
 		printk("%s: Huh? Other mm?\n", __FUNCTION__);
 		goto unlock_out;
 	}
@@ -615,18 +645,29 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
 
 	spin_unlock(&priv->lock);
 
-	err = apply_to_page_range(vma->vm_mm, vma->vm_start,
-				  vma->vm_end - vma->vm_start,
-				  find_grant_ptes, map);
-	if (err)
-		return err;
+	if (use_ptemod) {
+		err = apply_to_page_range(vma->vm_mm, vma->vm_start,
+					  vma->vm_end - vma->vm_start,
+					  find_grant_ptes, map);
+		if (err)
+			return err;
+	}
 
 	err = map_grant_pages(map);
 	if (err)
 		return err;
-
+	
 	map->is_mapped = 1;
 
+	if (!use_ptemod) {
+		for(i = 0; i < count; i++) {
+			err = vm_insert_page(vma, vma->vm_start + i*PAGE_SIZE,
+			                     map->pages[i]);
+			if (err)
+				return err;
+		}
+	}
+
 	return 0;
 
 unlock_out:
@@ -657,6 +698,8 @@ static int __init gntdev_init(void)
 	if (!xen_domain())
 		return -ENODEV;
 
+	use_ptemod = xen_pv_domain();
+
 	err = misc_register(&gntdev_miscdev);
 	if (err != 0) {
 		printk(KERN_ERR "Could not register gntdev device\n");
diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index a5cf820..c8ab76e 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -457,6 +457,9 @@ int gnttab_map_refs(struct gnttab_map_grant_ref *map_ops,
 
 	ret = HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, map_ops, count);
 
+	if (xen_feature(XENFEAT_auto_translated_physmap))
+		return ret;
+
 	for (i = 0; i < count; i++) {
 		pfn = mfn_to_pfn(map_ops[i].host_addr >> PAGE_SHIFT);
 		pte = (pte_t *) __va((pfn << PAGE_SHIFT) +
@@ -476,6 +479,10 @@ int gnttab_unmap_refs(struct gnttab_unmap_grant_ref *unmap_ops,
 	int i, ret;
 
 	ret = HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, unmap_ops, count);
+
+	if (xen_feature(XENFEAT_auto_translated_physmap))
+		return ret;
+
 	for (i = 0; i < count; i++)
 		m2p_remove_override(pages[i]);
 
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 7/7] xen-gntalloc: Userspace grant allocation driver
  2010-12-17  0:17 [PATCH v3] Userspace grant communication Daniel De Graaf
                   ` (5 preceding siblings ...)
  2010-12-17  0:17 ` [PATCH 6/7] xen-gntdev: Support mapping in HVM domains Daniel De Graaf
@ 2010-12-17  0:17 ` Daniel De Graaf
  2011-01-07 11:56 ` [PATCH v3] Userspace grant communication Stefano Stabellini
  2011-01-14 15:18 ` Konrad Rzeszutek Wilk
  8 siblings, 0 replies; 38+ messages in thread
From: Daniel De Graaf @ 2010-12-17  0:17 UTC (permalink / raw)
  To: xen-devel; +Cc: Daniel De Graaf, jeremy, Ian.Campbell

This allows a userspace application to allocate a shared page for
implementing inter-domain communication or device drivers. These
shared pages can be mapped using the gntdev device or by the kernel
in another domain.

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
 drivers/xen/Kconfig    |    7 +
 drivers/xen/Makefile   |    2 +
 drivers/xen/gntalloc.c |  488 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/xen/gntalloc.h |   70 +++++++
 4 files changed, 567 insertions(+), 0 deletions(-)
 create mode 100644 drivers/xen/gntalloc.c
 create mode 100644 include/xen/gntalloc.h

diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index 0c6d2a1..677ccad 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -69,6 +69,13 @@ config XEN_GNTDEV
 	help
 	  Allows userspace processes use grants.
 	  
+config XEN_GRANT_DEV_ALLOC
+	tristate "User-space grant reference allocator driver"
+	depends on XEN
+	help
+	  Allows userspace processes to create pages with access granted
+	  to other domains.
+
 config XEN_PLATFORM_PCI
 	tristate "xen platform pci device driver"
 	depends on XEN_PVHVM
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 674fdb5..ec38a72 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -10,6 +10,7 @@ obj-$(CONFIG_XEN_XENCOMM)	+= xencomm.o
 obj-$(CONFIG_XEN_BALLOON)	+= balloon.o
 obj-$(CONFIG_XEN_DEV_EVTCHN)	+= xen-evtchn.o
 obj-$(CONFIG_XEN_GNTDEV)	+= xen-gntdev.o
+obj-$(CONFIG_XEN_GRANT_DEV_ALLOC)	+= xen-gntalloc.o
 obj-$(CONFIG_XENFS)		+= xenfs/
 obj-$(CONFIG_XEN_SYS_HYPERVISOR)	+= sys-hypervisor.o
 obj-$(CONFIG_XEN_PLATFORM_PCI)	+= platform-pci.o
@@ -18,4 +19,5 @@ obj-$(CONFIG_XEN_DOM0)		+= pci.o
 
 xen-evtchn-y			:= evtchn.o
 xen-gntdev-y				:= gntdev.o
+xen-gntalloc-y				:= gntalloc.o
 
diff --git a/drivers/xen/gntalloc.c b/drivers/xen/gntalloc.c
new file mode 100644
index 0000000..4bee149
--- /dev/null
+++ b/drivers/xen/gntalloc.c
@@ -0,0 +1,488 @@
+/******************************************************************************
+ * gntalloc.c
+ *
+ * Device for creating grant references (in user-space) that may be shared
+ * with other domains.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+/*
+ * This driver exists to allow userspace programs in Linux to allocate kernel
+ * memory that will later be shared with another domain.  Without this device,
+ * Linux userspace programs cannot create grant references.
+ *
+ * How this stuff works:
+ *   X -> granting a page to Y
+ *   Y -> mapping the grant from X
+ *
+ *   1. X uses the gntalloc device to allocate a page of kernel memory, P.
+ *   2. X creates an entry in the grant table that says domid(Y) can access P.
+ *      This is done without a hypercall unless the grant table needs expansion.
+ *   3. X gives the grant reference identifier, GREF, to Y.
+ *   4. Y maps the page, either directly into kernel memory for use in a backend
+ *      driver, or via a the gntdev device to map into the address space of an
+ *      application running in Y. This is the first point at which Xen does any
+ *      tracking of the page.
+ *   5. A program in X mmap()s a segment of the gntalloc device that corresponds
+ *      to the shared page, and can now communicate with Y over the shared page.
+ *
+ *
+ * NOTE TO USERSPACE LIBRARIES:
+ *   The grant allocation and mmap()ing are, naturally, two separate operations.
+ *   You set up the sharing by calling the create ioctl() and then the mmap().
+ *   Teardown requires munmap() and either close() or ioctl().
+ *
+ * WARNING: Since Xen does not allow a guest to forcibly end the use of a grant
+ * reference, this device can be used to consume kernel memory by leaving grant
+ * references mapped by another domain when an application exits. Therefore,
+ * there is a global limit on the number of pages that can be allocated. When
+ * all references to the page are unmapped, it will be freed during the next
+ * grant operation.
+ */
+
+#include <asm/atomic.h>
+#include <linux/module.h>
+#include <linux/miscdevice.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/fs.h>
+#include <linux/device.h>
+#include <linux/mm.h>
+#include <asm/uaccess.h>
+#include <linux/types.h>
+#include <linux/list.h>
+
+#include <xen/xen.h>
+#include <xen/page.h>
+#include <xen/grant_table.h>
+#include <xen/gntalloc.h>
+
+static int debug = 0;
+module_param(debug, int, 0644);
+
+static int limit = 1024;
+module_param(limit, int, 0644);
+
+static LIST_HEAD(gref_list);
+static DEFINE_SPINLOCK(gref_lock);
+static int gref_size = 0;
+
+/* Metadata on a grant reference. */
+struct gntalloc_gref {
+	struct list_head next_all;   /* list entry gref_list */
+	struct list_head next_file;  /* list entry file->list, if open */
+	struct page* page;	     /* The shared page */
+	uint64_t file_index;         /* File offset for mmap() */
+	unsigned int users;          /* Use count - when zero, waiting on Xen */
+	grant_ref_t gref_id;         /* The grant reference number */
+};
+
+struct gntalloc_file_private_data {
+	struct list_head list;
+	uint64_t index;
+};
+
+static void __del_gref(struct gntalloc_gref *gref);
+
+static void do_cleanup(void)
+{
+	struct gntalloc_gref *gref, *n;
+	list_for_each_entry_safe(gref, n, &gref_list, next_all) {
+		if (!gref->users)
+			__del_gref(gref);
+	}
+}
+
+static int add_grefs(struct ioctl_gntalloc_alloc_gref* op,
+	uint32_t* gref_ids, struct gntalloc_file_private_data *priv)
+{
+	int i, rc, readonly;
+	LIST_HEAD(queue_all);
+	LIST_HEAD(queue_file);
+	struct gntalloc_gref *gref;
+
+	readonly = !(op->flags & GNTALLOC_FLAG_WRITABLE);
+	rc = -ENOMEM;
+	for(i=0; i < op->count; i++) {
+		gref = kzalloc(sizeof(*gref), GFP_KERNEL);
+		if (!gref)
+			goto undo;
+		list_add_tail(&gref->next_all, &queue_all);
+		list_add_tail(&gref->next_file, &queue_file);
+		gref->users = 1;
+		gref->file_index = op->index + i * PAGE_SIZE;
+		gref->page = alloc_page(GFP_KERNEL|__GFP_ZERO);
+		if (!gref->page)
+			goto undo;
+
+		/* Grant foreign access to the page. */
+		gref->gref_id = gnttab_grant_foreign_access(op->domid,
+			pfn_to_mfn(page_to_pfn(gref->page)), readonly);
+		if (gref->gref_id < 0) {
+			rc = gref->gref_id;
+			goto undo;
+		}
+		gref_ids[i] = gref->gref_id;
+	}
+
+	/* Add to gref lists. */
+	spin_lock(&gref_lock);
+	list_splice_tail(&queue_all, &gref_list);
+	list_splice_tail(&queue_file, &priv->list);
+	spin_unlock(&gref_lock);
+
+	return 0;
+
+undo:
+	list_for_each_entry(gref, &queue_file, next_file) {
+		if (gref->page && gref->gref_id >= 0) {
+			__del_gref(gref);
+		} else {
+			if (gref->page)
+				__free_page(gref->page);
+			list_del(&gref->next_all);
+			kfree(gref);
+		}
+	}
+
+	/* It's possible for the target domain to map the just-allocated grant
+	 * references by blindly guessing their IDs; if this is done, then
+	 * __del_gref will leave them in the queue_all list. They need to be
+	 * added to the global list so that we can free them when they are no
+	 * longer referenced.
+	 */
+	if (unlikely(!list_empty(&queue_all))) {
+		spin_lock(&gref_lock);
+		list_splice_tail(&queue_all, &gref_list);
+		spin_unlock(&gref_lock);
+	}
+	return rc;
+}
+
+static void __del_gref(struct gntalloc_gref *gref)
+{
+	if (gnttab_query_foreign_access(gref->gref_id))
+		return;
+
+	if (!gnttab_end_foreign_access_ref(gref->gref_id, 0))
+		return;
+
+	gref_size--;
+	list_del(&gref->next_all);
+
+	__free_page(gref->page);
+	kfree(gref);
+}
+
+// finds contiguous grant references in a file, returns the first
+static struct gntalloc_gref* find_grefs(struct gntalloc_file_private_data *priv,
+                                       uint64_t index, uint32_t count)
+{
+	struct gntalloc_gref *rv = NULL, *gref;
+	list_for_each_entry(gref, &priv->list, next_file) {
+		if (gref->file_index == index && !rv)
+			rv = gref;
+		if (rv) {
+			if (gref->file_index != index)
+				return NULL;
+			index += PAGE_SIZE;
+			count--;
+			if (count == 0)
+				return rv;
+		}
+	}
+	return NULL;
+}
+
+/*
+ * -------------------------------------
+ *  File operations.
+ * -------------------------------------
+ */
+static int gntalloc_open(struct inode *inode, struct file *filp)
+{
+	struct gntalloc_file_private_data *priv;
+
+	try_module_get(THIS_MODULE);
+
+	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		goto out_nomem;
+	INIT_LIST_HEAD(&priv->list);
+
+	filp->private_data = priv;
+
+	if (debug)
+		printk("%s: priv %p\n", __FUNCTION__, priv);
+
+	return 0;
+
+out_nomem:
+	return -ENOMEM;
+}
+
+static int gntalloc_release(struct inode *inode, struct file *filp)
+{
+	struct gntalloc_file_private_data *priv = filp->private_data;
+	struct gntalloc_gref *gref;
+
+	if (debug)
+		printk("%s: priv %p\n", __FUNCTION__, priv);
+
+	spin_lock(&gref_lock);
+	while (!list_empty(&priv->list)) {
+		gref = list_entry(priv->list.next,
+			struct gntalloc_gref, next_file);
+		list_del(&gref->next_file);
+		gref->users--;
+		if (gref->users == 0)
+			__del_gref(gref);
+	}
+	kfree(priv);
+	spin_unlock(&gref_lock);
+
+	module_put(THIS_MODULE);
+
+	return 0;
+}
+
+static long gntalloc_ioctl_alloc(struct gntalloc_file_private_data *priv,
+                                 struct ioctl_gntalloc_alloc_gref __user *arg)
+{
+	int rc = 0;
+	struct ioctl_gntalloc_alloc_gref op;
+	uint32_t* gref_ids;
+
+	if (debug)
+		printk("%s: priv %p\n", __FUNCTION__, priv);
+
+	if (copy_from_user(&op, arg, sizeof(op))) {
+		rc = -EFAULT;
+		goto out;
+	}
+
+	gref_ids = kzalloc(sizeof(gref_ids[0]) * op.count, GFP_TEMPORARY);
+	if (!gref_ids) {
+		rc = -ENOMEM;
+		goto out;
+	}
+
+	spin_lock(&gref_lock);
+	do_cleanup();
+	if (gref_size + op.count > limit) {
+		spin_unlock(&gref_lock);
+		rc = -ENOSPC;
+		goto out_free;
+	}
+	gref_size += op.count;
+	op.index = priv->index;
+	priv->index += op.count * PAGE_SIZE;
+	spin_unlock(&gref_lock);
+
+	rc = add_grefs(&op, gref_ids, priv);
+	if (rc < 0)
+		goto out_free;
+
+	if (copy_to_user(arg, &op, sizeof(op))) {
+		rc = -EFAULT;
+		goto out_free;
+	}
+	if (copy_to_user(arg->gref_ids, gref_ids,
+	                   sizeof(gref_ids[0]) * op.count)) {
+		rc = -EFAULT;
+		goto out_free;
+	}
+
+out_free:
+	kfree(gref_ids);
+out:
+	return rc;
+}
+
+static long gntalloc_ioctl_dealloc(struct gntalloc_file_private_data *priv,
+                                   void __user *arg)
+{
+	int i, rc = 0;
+	struct ioctl_gntalloc_dealloc_gref op;
+	struct gntalloc_gref *gref, *n;
+
+	if (debug)
+		printk("%s: priv %p\n", __FUNCTION__, priv);
+
+	if (copy_from_user(&op, arg, sizeof(op))) {
+		rc = -EFAULT;
+		goto dealloc_grant_out;
+	}
+
+	spin_lock(&gref_lock);
+	gref = find_grefs(priv, op.index, op.count);
+	if (gref) {
+		for(i=0; i < op.count; i++) {
+			n = list_entry(gref->next_file.next,
+			               struct gntalloc_gref, next_file);
+			list_del(&gref->next_file);
+			gref->users--;
+			gref = n;
+		}
+	} else {
+		rc = -EINVAL;
+	}
+
+	do_cleanup();
+
+	spin_unlock(&gref_lock);
+dealloc_grant_out:
+	return rc;
+}
+
+static long gntalloc_ioctl(struct file *filp, unsigned int cmd,
+		unsigned long arg)
+{
+	struct gntalloc_file_private_data *priv = filp->private_data;
+
+	switch (cmd) {
+	case IOCTL_GNTALLOC_ALLOC_GREF:
+		return gntalloc_ioctl_alloc(priv, (void __user*)arg);
+
+	case IOCTL_GNTALLOC_DEALLOC_GREF:
+		return gntalloc_ioctl_dealloc(priv, (void __user*)arg);
+
+	default:
+		return -ENOIOCTLCMD;
+	}
+
+	return 0;
+}
+
+static void gntalloc_vma_close(struct vm_area_struct *vma)
+{
+	struct gntalloc_gref *gref = vma->vm_private_data;
+	if (!gref)
+		return;
+
+	spin_lock(&gref_lock);
+	gref->users--;
+	if (gref->users == 0)
+		__del_gref(gref);
+	spin_unlock(&gref_lock);
+}
+
+static struct vm_operations_struct gntalloc_vmops = {
+	.close = gntalloc_vma_close,
+};
+
+static int gntalloc_mmap(struct file *filp, struct vm_area_struct *vma)
+{
+	struct gntalloc_file_private_data *priv = filp->private_data;
+	struct gntalloc_gref *gref;
+	int count = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
+	int rv, i;
+
+	if (debug)
+		printk("%s: priv %p, page %lu+%d\n", __func__,
+		       priv, vma->vm_pgoff, count);
+
+	if (!(vma->vm_flags & VM_SHARED)) {
+		printk(KERN_ERR "%s: Mapping must be shared.\n", __func__);
+		return -EINVAL;
+	}
+
+	spin_lock(&gref_lock);
+	gref = find_grefs(priv, vma->vm_pgoff << PAGE_SHIFT, count);
+	if (gref == NULL) {
+		rv = -ENOENT;
+		if (debug)
+			printk("%s: Could not find grant reference",
+				__func__);
+		goto out_unlock;
+	}
+
+	vma->vm_private_data = gref;
+
+	vma->vm_flags |= VM_RESERVED;
+	vma->vm_flags |= VM_DONTCOPY;
+	vma->vm_flags |= VM_IO;
+	vma->vm_flags |= VM_PFNMAP | VM_PFN_AT_MMAP;
+
+	vma->vm_ops = &gntalloc_vmops;
+
+	for(i=0; i < count; i++) {
+		gref->users++;
+		rv = vm_insert_page(vma, vma->vm_start + i * PAGE_SIZE,
+		                    gref->page);
+		if (rv)
+			goto out_unlock;
+
+		gref = list_entry(gref->next_file.next,
+		                  struct gntalloc_gref, next_file);
+	}
+	rv = 0;
+
+out_unlock:
+	spin_unlock(&gref_lock);
+	return rv;
+}
+
+static const struct file_operations gntalloc_fops = {
+	.owner = THIS_MODULE,
+	.open = gntalloc_open,
+	.release = gntalloc_release,
+	.unlocked_ioctl = gntalloc_ioctl,
+	.mmap = gntalloc_mmap
+};
+
+/*
+ * -------------------------------------
+ * Module creation/destruction.
+ * -------------------------------------
+ */
+static struct miscdevice gntalloc_miscdev = {
+	.minor	= MISC_DYNAMIC_MINOR,
+	.name	= "xen/gntalloc",
+	.fops	= &gntalloc_fops,
+};
+
+static int __init gntalloc_init(void)
+{
+	int err;
+
+	if (!xen_domain()) {
+		if (debug)
+			printk(KERN_ERR "gntalloc: You must be running Xen\n");
+		return -ENODEV;
+	}
+
+	err = misc_register(&gntalloc_miscdev);
+	if (err != 0) {
+		printk(KERN_ERR "Could not register misc gntalloc device\n");
+		return err;
+	}
+
+	if (debug)
+		printk(KERN_INFO "Created grant allocation device at %d,%d\n",
+			MISC_MAJOR, gntalloc_miscdev.minor);
+
+	return 0;
+}
+
+static void __exit gntalloc_exit(void)
+{
+	misc_deregister(&gntalloc_miscdev);
+}
+
+module_init(gntalloc_init);
+module_exit(gntalloc_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Carter Weatherly <carter.weatherly@jhuapl.edu>, "
+              "Daniel De Graaf <dgdegra@tycho.nsa.gov>");
+MODULE_DESCRIPTION("User-space grant reference allocator driver");
diff --git a/include/xen/gntalloc.h b/include/xen/gntalloc.h
new file mode 100644
index 0000000..e1d6d0f
--- /dev/null
+++ b/include/xen/gntalloc.h
@@ -0,0 +1,70 @@
+/******************************************************************************
+ * gntalloc.h
+ *
+ * Interface to /dev/xen/gntalloc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation; or, when distributed
+ * separately from the Linux kernel or incorporated into other
+ * software packages, subject to the following license:
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this source file (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use, copy, modify,
+ * merge, publish, distribute, sublicense, and/or sell copies of the Software,
+ * and to permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#ifndef __LINUX_PUBLIC_GNTALLOC_H__
+#define __LINUX_PUBLIC_GNTALLOC_H__
+
+/*
+ * Allocates a new page and creates a new grant reference.
+ */
+#define IOCTL_GNTALLOC_ALLOC_GREF \
+_IOC(_IOC_NONE, 'G', 5, sizeof(struct ioctl_gntalloc_alloc_gref))
+struct ioctl_gntalloc_alloc_gref {
+	/* IN parameters */
+	/* The ID of the domain to be given access to the grants. */
+	uint16_t domid;
+	/* Flags for this mapping */
+	uint16_t flags;
+	/* Number of pages to map */
+	uint32_t count;
+	/* OUT parameters */
+	/* The offset to be used on a subsequent call to mmap(). */
+	uint64_t index;
+	/* The grant references of the newly created grant, one per page */
+	/* Variable size, depending on count */
+	uint32_t gref_ids[1];
+};
+
+#define GNTALLOC_FLAG_WRITABLE 1
+
+/*
+ * Deallocates the grant reference, allowing the associated page to be freed if
+ * no other domains are using it.
+ */
+#define IOCTL_GNTALLOC_DEALLOC_GREF \
+_IOC(_IOC_NONE, 'G', 6, sizeof(struct ioctl_gntalloc_dealloc_gref))
+struct ioctl_gntalloc_dealloc_gref {
+	/* IN parameters */
+	/* The offset returned in the map operation */
+	uint64_t index;
+	/* Number of references to unmap */
+	uint32_t count;
+};
+#endif /* __LINUX_PUBLIC_GNTALLOC_H__ */
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH 5/7] xen-gntdev: Add reference counting to maps
  2010-12-17  0:17 ` [PATCH 5/7] xen-gntdev: Add reference counting to maps Daniel De Graaf
@ 2010-12-17  0:49   ` Jeremy Fitzhardinge
  2010-12-17 15:11     ` Daniel De Graaf
  2010-12-17  0:51   ` Jeremy Fitzhardinge
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 38+ messages in thread
From: Jeremy Fitzhardinge @ 2010-12-17  0:49 UTC (permalink / raw)
  To: Daniel De Graaf; +Cc: xen-devel, Ian.Campbell

On 12/16/2010 04:17 PM, Daniel De Graaf wrote:
> This allows userspace to perform mmap() on the gntdev device and then
> immediately close the filehandle or remove the mapping using the
> remove ioctl, with the mapped area remaining valid until unmapped.
> This also fixes an infinite loop when a gntdev device is closed
> without first unmapping all areas.
>
> Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
> ---
>  drivers/xen/gntdev.c |   67 +++++++++++++++++++++++--------------------------
>  1 files changed, 31 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
> index 6a3c9e4..f1fc8fa 100644
> --- a/drivers/xen/gntdev.c
> +++ b/drivers/xen/gntdev.c
> @@ -66,16 +66,18 @@ struct granted_page {
>  
>  struct grant_map {
>  	struct list_head next;
> -	struct gntdev_priv *priv;
>  	struct vm_area_struct *vma;
>  	int index;
>  	int count;
> +	atomic_t users;

Does this need to be atomic?  Won't it be happening under spinlock anyway?

>  	int is_mapped:1;
>  	int is_ro:1;
>  	struct page **pages;
>  	struct granted_page pginfo[0];
>  };
>  
> +static void unmap_grant_pages(struct grant_map *map, int offset, int pages);
> +
>  /* ------------------------------------------------------------------ */
>  
>  static void gntdev_print_maps(struct gntdev_priv *priv,
> @@ -112,6 +114,7 @@ static struct grant_map *gntdev_alloc_map(int count,
>  	}
>  
>  	add->index = 0;
> +	atomic_set(&add->users, 1);
>  	add->count = count;
>  	for(i = 0; i < count; i++)
>  		add->pginfo[i].target = grants[i];
> @@ -144,7 +147,6 @@ static void gntdev_add_map(struct gntdev_priv *priv, struct grant_map *add)
>  	list_add_tail(&add->next, &priv->maps);
>  
>  done:
> -	add->priv = priv;
>  	if (debug)
>  		gntdev_print_maps(priv, "[new]", add->index);
>  	spin_unlock(&priv->lock);
> @@ -165,33 +167,26 @@ static struct grant_map *gntdev_find_map_index(struct gntdev_priv *priv, int ind
>  	return NULL;
>  }
>  
> -static int gntdev_del_map(struct grant_map *map)
> +static void gntdev_put_map(struct grant_map *map)
>  {
>  	int i;
>  
> -	if (map->vma)
> -		return -EBUSY;
> -	if (map->is_mapped)
> -		for (i = 0; i < map->count; i++)
> -			if (map->pginfo[i].handle)
> -				return -EBUSY;
> +	if (!map)
> +		return;
>  
> -	atomic_sub(map->count, &pages_mapped);
> -	list_del(&map->next);
> -	return 0;
> -}
> +	if (!atomic_dec_and_test(&map->users))
> +		return;
>  
> -static void gntdev_free_map(struct grant_map *map)
> -{
> -	unsigned i;
> +	atomic_sub(map->count, &pages_mapped);
>  
> -	if (!map)
> -		return;
> +	if (!use_ptemod)
> +		unmap_grant_pages(map, 0, map->count);
>  
>  	for (i = 0; i < map->count; i++) {
>  		if (map->pages[i])
>  			__free_page(map->pages[i]);
>  	}
> +	kfree(map->pages);
>  	kfree(map);
>  }
>  
> @@ -310,6 +305,7 @@ static void gntdev_vma_close(struct vm_area_struct *vma)
>  	map->is_mapped = 0;
>  	map->vma = NULL;
>  	vma->vm_private_data = NULL;
> +	gntdev_put_map(map);
>  }
>  
>  static int gntdev_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> @@ -425,7 +421,6 @@ static int gntdev_release(struct inode *inode, struct file *flip)
>  {
>  	struct gntdev_priv *priv = flip->private_data;
>  	struct grant_map *map;
> -	int err;
>  
>  	if (debug)
>  		printk("%s: priv %p\n", __FUNCTION__, priv);
> @@ -433,10 +428,8 @@ static int gntdev_release(struct inode *inode, struct file *flip)
>  	spin_lock(&priv->lock);
>  	while (!list_empty(&priv->maps)) {
>  		map = list_entry(priv->maps.next, struct grant_map, next);
> -		err = gntdev_del_map(map);
> -		if (WARN_ON(err))
> -			gntdev_free_map(map);
> -
> +		list_del(&map->next);
> +		gntdev_put_map(map);
>  	}
>  	spin_unlock(&priv->lock);
>  
> @@ -487,18 +480,14 @@ static long gntdev_ioctl_map_grant_ref(struct gntdev_priv *priv,
>  
>  	if (copy_to_user(u, &op, sizeof(op))) {
>  		err = -EFAULT;
> -		goto out_remove;
> +		goto out_free;
>  	}
>  	err = 0;
>  out_free:
>  	kfree(grants);
>  	return err;
> -out_remove:
> -	spin_lock(&priv->lock);
> -	gntdev_del_map(map);
> -	spin_unlock(&priv->lock);
>  out_free_map:
> -	gntdev_free_map(map);
> +	gntdev_put_map(map);
>  	goto out_free;
>  }
>  
> @@ -507,7 +496,7 @@ static long gntdev_ioctl_unmap_grant_ref(struct gntdev_priv *priv,
>  {
>  	struct ioctl_gntdev_unmap_grant_ref op;
>  	struct grant_map *map;
> -	int err = -EINVAL;
> +	int err;
>  
>  	if (copy_from_user(&op, u, sizeof(op)) != 0)
>  		return -EFAULT;
> @@ -517,11 +506,13 @@ static long gntdev_ioctl_unmap_grant_ref(struct gntdev_priv *priv,
>  
>  	spin_lock(&priv->lock);
>  	map = gntdev_find_map_index(priv, op.index >> PAGE_SHIFT, op.count);
> -	if (map)
> -		err = gntdev_del_map(map);
> +	if (map) {
> +		list_del(&map->next);
> +		gntdev_put_map(map);
> +		err = 0;
> +	} else
> +		err = -EINVAL;

What prevents unmap_grant_ref being called multiple times?

>  	spin_unlock(&priv->lock);
> -	if (!err)
> -		gntdev_free_map(map);
>  	return err;
>  }
>  
> @@ -599,13 +590,15 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
>  	map = gntdev_find_map_index(priv, index, count);
>  	if (!map)
>  		goto unlock_out;
> -	if (map->vma)
> +	if (use_ptemod && map->vma)
>  		goto unlock_out;
>  	if (priv->mm != vma->vm_mm) {
>  		printk("%s: Huh? Other mm?\n", __FUNCTION__);
>  		goto unlock_out;
>  	}
>  
> +	atomic_inc(&map->users);
> +
>  	vma->vm_ops = &gntdev_vmops;
>  
>  	vma->vm_flags |= VM_RESERVED;
> @@ -614,7 +607,9 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
>  	vma->vm_flags |= VM_PFNMAP;
>  
>  	vma->vm_private_data = map;
> -	map->vma = vma;
> +
> +	if (use_ptemod)
> +		map->vma = vma;
>  
>  	map->is_ro = !(vma->vm_flags & VM_WRITE);
>  

    J

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 5/7] xen-gntdev: Add reference counting to maps
  2010-12-17  0:17 ` [PATCH 5/7] xen-gntdev: Add reference counting to maps Daniel De Graaf
  2010-12-17  0:49   ` Jeremy Fitzhardinge
@ 2010-12-17  0:51   ` Jeremy Fitzhardinge
  2010-12-17 15:22   ` [PATCH 5/7 v2] " Daniel De Graaf
  2011-01-10 22:24   ` [PATCH 5/7] " Konrad Rzeszutek Wilk
  3 siblings, 0 replies; 38+ messages in thread
From: Jeremy Fitzhardinge @ 2010-12-17  0:51 UTC (permalink / raw)
  Cc: Daniel De Graaf, xen-devel, Ian.Campbell

On 12/16/2010 04:17 PM, Daniel De Graaf wrote:
> -static void gntdev_free_map(struct grant_map *map)
> -{
> -	unsigned i;
> +	atomic_sub(map->count, &pages_mapped);
>  
> -	if (!map)
> -		return;
> +	if (!use_ptemod)

Does this depend on the later hvm patch?

    J

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 5/7] xen-gntdev: Add reference counting to maps
  2010-12-17  0:49   ` Jeremy Fitzhardinge
@ 2010-12-17 15:11     ` Daniel De Graaf
  0 siblings, 0 replies; 38+ messages in thread
From: Daniel De Graaf @ 2010-12-17 15:11 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen-devel, Ian.Campbell

On 12/16/2010 07:49 PM, Jeremy Fitzhardinge wrote:
> On 12/16/2010 04:17 PM, Daniel De Graaf wrote:
>> This allows userspace to perform mmap() on the gntdev device and then
>> immediately close the filehandle or remove the mapping using the
>> remove ioctl, with the mapped area remaining valid until unmapped.
>> This also fixes an infinite loop when a gntdev device is closed
>> without first unmapping all areas.
>>
>> Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
>> ---
>>  drivers/xen/gntdev.c |   67 +++++++++++++++++++++++--------------------------
>>  1 files changed, 31 insertions(+), 36 deletions(-)
>>
>> diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
>> index 6a3c9e4..f1fc8fa 100644
>> --- a/drivers/xen/gntdev.c
>> +++ b/drivers/xen/gntdev.c
>> @@ -66,16 +66,18 @@ struct granted_page {
>>  
>>  struct grant_map {
>>  	struct list_head next;
>> -	struct gntdev_priv *priv;
>>  	struct vm_area_struct *vma;
>>  	int index;
>>  	int count;
>> +	atomic_t users;
> 
> Does this need to be atomic?  Won't it be happening under spinlock anyway?

gntdev_put_map will not be called under spinlock if it is called from an
munmap(), especially one that happens after the file is closed.

>> @@ -517,11 +506,13 @@ static long gntdev_ioctl_unmap_grant_ref(struct gntdev_priv *priv,
>>  
>>  	spin_lock(&priv->lock);
>>  	map = gntdev_find_map_index(priv, op.index >> PAGE_SHIFT, op.count);
>> -	if (map)
>> -		err = gntdev_del_map(map);
>> +	if (map) {
>> +		list_del(&map->next);
>> +		gntdev_put_map(map);
>> +		err = 0;
>> +	} else
>> +		err = -EINVAL;
> 
> What prevents unmap_grant_ref being called multiple times?

gntdev_find_map_index searches in priv->list for the mapping; if
found, it removes it from that list. A second search will just
return -EINVAL, even if the pages are still mapped.

>>  	spin_unlock(&priv->lock);
>> -	if (!err)
>> -		gntdev_free_map(map);
>>  	return err;
>>  }
>>  
>> @@ -599,13 +590,15 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
>>  	map = gntdev_find_map_index(priv, index, count);
>>  	if (!map)
>>  		goto unlock_out;
>> -	if (map->vma)
>> +	if (use_ptemod && map->vma)
>>  		goto unlock_out;
> 
> Does this depend on the later hvm patch?

Whoops, that ended up in the wrong patch. I'll correct the pair.

>>  	if (priv->mm != vma->vm_mm) {
>>  		printk("%s: Huh? Other mm?\n", __FUNCTION__);
>>  		goto unlock_out;
>>  	}
>>  
>> +	atomic_inc(&map->users);
>> +
>>  	vma->vm_ops = &gntdev_vmops;
>>  
>>  	vma->vm_flags |= VM_RESERVED;
>> @@ -614,7 +607,9 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
>>  	vma->vm_flags |= VM_PFNMAP;
>>  
>>  	vma->vm_private_data = map;
>> -	map->vma = vma;
>> +
>> +	if (use_ptemod)
>> +		map->vma = vma;
>>  
>>  	map->is_ro = !(vma->vm_flags & VM_WRITE);
>>  
> 
>     J
> 


-- 
Daniel De Graaf
National Security Agency

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 6/7 v2] xen-gntdev: Support mapping in HVM domains
  2010-12-17  0:17 ` [PATCH 6/7] xen-gntdev: Support mapping in HVM domains Daniel De Graaf
@ 2010-12-17 15:22   ` Daniel De Graaf
  2011-01-10 22:41   ` [PATCH 6/7] " Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 38+ messages in thread
From: Daniel De Graaf @ 2010-12-17 15:22 UTC (permalink / raw)
  To: jeremy; +Cc: xen-devel, Ian.Campbell

HVM does not allow direct PTE modification, so instead we request
that Xen change its internal p2m mappings on the allocated pages and
map the memory into userspace normally.
    
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
 drivers/xen/gntdev.c      |  117 +++++++++++++++++++++++++++++++-------------
 drivers/xen/grant-table.c |    7 +++
 2 files changed, 89 insertions(+), 35 deletions(-)

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 0c8e8a2..0985577 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -30,6 +30,7 @@
 #include <linux/sched.h>
 #include <linux/spinlock.h>
 #include <linux/slab.h>
+#include <linux/highmem.h>
 
 #include <xen/xen.h>
 #include <xen/grant_table.h>
@@ -49,6 +50,8 @@ static int limit = 1024*1024;
 module_param(limit, int, 0644);
 static atomic_t pages_mapped = ATOMIC_INIT(0);
 
+static int use_ptemod = 0;
+
 struct gntdev_priv {
 	struct list_head maps;
 	spinlock_t lock;
@@ -179,6 +182,9 @@ static void gntdev_put_map(struct grant_map *map)
 
 	atomic_sub(map->count, &pages_mapped);
 
+	if (!use_ptemod)
+		unmap_grant_pages(map, 0, map->count);
+
 	for (i = 0; i < map->count; i++) {
 		if (map->pages[i])
 			__free_page(map->pages[i]);
@@ -206,9 +212,12 @@ static int find_grant_ptes(pte_t *pte, pgtable_t token, unsigned long addr, void
 static int map_grant_pages(struct grant_map *map)
 {
 	int i, flags, err = 0;
+	phys_addr_t addr;
 	struct gnttab_map_grant_ref* map_ops = NULL;
 
-	flags = GNTMAP_host_map | GNTMAP_application_map | GNTMAP_contains_pte;
+	flags = GNTMAP_host_map;
+	if (use_ptemod)
+		flags |= GNTMAP_application_map | GNTMAP_contains_pte;
 	if (map->is_ro)
 		flags |= GNTMAP_readonly;
 
@@ -218,7 +227,11 @@ static int map_grant_pages(struct grant_map *map)
 		goto out;
 
 	for(i=0; i < map->count; i++) {
-		gnttab_set_map_op(&map_ops[i], map->pginfo[i].pte_maddr, flags,
+		if (use_ptemod)
+			addr = map->pginfo[i].pte_maddr;
+		else
+			addr = (phys_addr_t)pfn_to_kaddr(page_to_pfn(map->pages[i]));
+		gnttab_set_map_op(&map_ops[i], addr, flags,
 				  map->pginfo[i].target.ref,
 				  map->pginfo[i].target.domid);
 	}
@@ -250,6 +263,7 @@ static void unmap_grant_pages(struct grant_map *map, int offset, int pages)
 	int i, flags, err = 0;
 	struct gnttab_unmap_grant_ref *unmap_ops;
 	struct gnttab_unmap_grant_ref unmap_single;
+	phys_addr_t addr;
 
 	if (pages > 1) {
 		unmap_ops = kzalloc(sizeof(unmap_ops[0]) * pages,
@@ -263,14 +277,23 @@ static void unmap_grant_pages(struct grant_map *map, int offset, int pages)
 		unmap_ops = &unmap_single;
 	}
 
-	flags = GNTMAP_host_map | GNTMAP_application_map | GNTMAP_contains_pte;
+	flags = GNTMAP_host_map;
+	if (use_ptemod)
+		flags |= GNTMAP_application_map | GNTMAP_contains_pte;
 	if (map->is_ro)
 		flags |= GNTMAP_readonly;
 
-	for(i=0; i < pages; i++)
-		gnttab_set_unmap_op(&unmap_ops[i],
-		                    map->pginfo[offset+i].pte_maddr, flags,
+	for(i=0; i < pages; i++) {
+		if (WARN_ON(!map->pages[i]))
+			continue;
+		if (use_ptemod)
+			addr = map->pginfo[i].pte_maddr;
+		else
+			addr = (phys_addr_t)pfn_to_kaddr(page_to_pfn(map->pages[i]));
+		gnttab_set_unmap_op(&unmap_ops[i], addr, flags,
 		                    map->pginfo[offset+i].handle);
+	}
+
 	if (debug)
 		printk("%s: map %d+%d [%d+%d]\n", __FUNCTION__,
 		       map->index, map->count, offset, pages);
@@ -281,8 +304,25 @@ static void unmap_grant_pages(struct grant_map *map, int offset, int pages)
 		goto out;
 
 	for (i = 0; i < pages; i++) {
+		uint32_t check, *tmp;
 		WARN_ON(unmap_ops[i].status);
-		__free_page(map->pages[offset+i]);
+		if (!map->pages[i])
+			continue;
+		/* XXX When unmapping, Xen will sometimes end up mapping the GFN
+		 * to an invalid MFN. In this case, writes will be discarded and
+		 * reads will return all 0xFF bytes. Leak these unusable GFNs
+		 * until a way to restore them is found.
+		 */
+		tmp = kmap(map->pages[i]);
+		tmp[0] = 0xdeaddead;
+		mb();
+		check = tmp[0];
+		kunmap(map->pages[i]);
+		if (check == 0xdeaddead)
+			__free_page(map->pages[i]);
+		else if (debug)
+			printk("%s: Discard page %d=%ld\n", __func__,
+				i, page_to_pfn(map->pages[i]));
 		map->pages[offset+i] = NULL;
 		map->pginfo[offset+i].handle = 0;
 	}
@@ -305,18 +345,8 @@ static void gntdev_vma_close(struct vm_area_struct *vma)
 	gntdev_put_map(map);
 }
 
-static int gntdev_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
-{
-	if (debug)
-		printk("%s: vaddr %p, pgoff %ld (shouldn't happen)\n",
-		       __FUNCTION__, vmf->virtual_address, vmf->pgoff);
-	vmf->flags = VM_FAULT_ERROR;
-	return 0;
-}
-
 static struct vm_operations_struct gntdev_vmops = {
 	.close = gntdev_vma_close,
-	.fault = gntdev_vma_fault,
 };
 
 /* ------------------------------------------------------------------ */
@@ -398,14 +428,16 @@ static int gntdev_open(struct inode *inode, struct file *flip)
 	INIT_LIST_HEAD(&priv->maps);
 	spin_lock_init(&priv->lock);
 
-	priv->mm = get_task_mm(current);
-	if (!priv->mm) {
-		kfree(priv);
-		return -ENOMEM;
+	if (use_ptemod) {
+		priv->mm = get_task_mm(current);
+		if (!priv->mm) {
+			kfree(priv);
+			return -ENOMEM;
+		}
+		priv->mn.ops = &gntdev_mmu_ops;
+		mmu_notifier_register(&priv->mn, priv->mm);
+		mmput(priv->mm);
 	}
-	priv->mn.ops = &gntdev_mmu_ops;
-	mmu_notifier_register(&priv->mn, priv->mm);
-	mmput(priv->mm);
 
 	flip->private_data = priv;
 	if (debug)
@@ -430,7 +462,8 @@ static int gntdev_release(struct inode *inode, struct file *flip)
 	}
 	spin_unlock(&priv->lock);
 
-	mmu_notifier_unregister(&priv->mn, priv->mm);
+	if (use_ptemod)
+		mmu_notifier_unregister(&priv->mn, priv->mm);
 	kfree(priv);
 	return 0;
 }
@@ -574,7 +607,7 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
 	int index = vma->vm_pgoff;
 	int count = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
 	struct grant_map *map;
-	int err = -EINVAL;
+	int i, err = -EINVAL;
 
 	if ((vma->vm_flags & VM_WRITE) && !(vma->vm_flags & VM_SHARED))
 		return -EINVAL;
@@ -587,9 +620,9 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
 	map = gntdev_find_map_index(priv, index, count);
 	if (!map)
 		goto unlock_out;
-	if (map->vma)
+	if (use_ptemod && map->vma)
 		goto unlock_out;
-	if (priv->mm != vma->vm_mm) {
+	if (use_ptemod && priv->mm != vma->vm_mm) {
 		printk("%s: Huh? Other mm?\n", __FUNCTION__);
 		goto unlock_out;
 	}
@@ -605,24 +638,36 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
 
 	vma->vm_private_data = map;
 
-	map->vma = vma;
+	if (use_ptemod)
+		map->vma = vma;
 
 	map->is_ro = !(vma->vm_flags & VM_WRITE);
 
 	spin_unlock(&priv->lock);
 
-	err = apply_to_page_range(vma->vm_mm, vma->vm_start,
-				  vma->vm_end - vma->vm_start,
-				  find_grant_ptes, map);
-	if (err)
-		return err;
+	if (use_ptemod) {
+		err = apply_to_page_range(vma->vm_mm, vma->vm_start,
+					  vma->vm_end - vma->vm_start,
+					  find_grant_ptes, map);
+		if (err)
+			return err;
+	}
 
 	err = map_grant_pages(map);
 	if (err)
 		return err;
-
+	
 	map->is_mapped = 1;
 
+	if (!use_ptemod) {
+		for(i = 0; i < count; i++) {
+			err = vm_insert_page(vma, vma->vm_start + i*PAGE_SIZE,
+			                     map->pages[i]);
+			if (err)
+				return err;
+		}
+	}
+
 	return 0;
 
 unlock_out:
@@ -653,6 +698,8 @@ static int __init gntdev_init(void)
 	if (!xen_domain())
 		return -ENODEV;
 
+	use_ptemod = xen_pv_domain();
+
 	err = misc_register(&gntdev_miscdev);
 	if (err != 0) {
 		printk(KERN_ERR "Could not register gntdev device\n");
diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index a5cf820..c8ab76e 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -457,6 +457,9 @@ int gnttab_map_refs(struct gnttab_map_grant_ref *map_ops,
 
 	ret = HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, map_ops, count);
 
+	if (xen_feature(XENFEAT_auto_translated_physmap))
+		return ret;
+
 	for (i = 0; i < count; i++) {
 		pfn = mfn_to_pfn(map_ops[i].host_addr >> PAGE_SHIFT);
 		pte = (pte_t *) __va((pfn << PAGE_SHIFT) +
@@ -476,6 +479,10 @@ int gnttab_unmap_refs(struct gnttab_unmap_grant_ref *unmap_ops,
 	int i, ret;
 
 	ret = HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, unmap_ops, count);
+
+	if (xen_feature(XENFEAT_auto_translated_physmap))
+		return ret;
+
 	for (i = 0; i < count; i++)
 		m2p_remove_override(pages[i]);

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH 5/7 v2] xen-gntdev: Add reference counting to maps
  2010-12-17  0:17 ` [PATCH 5/7] xen-gntdev: Add reference counting to maps Daniel De Graaf
  2010-12-17  0:49   ` Jeremy Fitzhardinge
  2010-12-17  0:51   ` Jeremy Fitzhardinge
@ 2010-12-17 15:22   ` Daniel De Graaf
  2011-01-10 22:28     ` Konrad Rzeszutek Wilk
  2011-01-10 22:24   ` [PATCH 5/7] " Konrad Rzeszutek Wilk
  3 siblings, 1 reply; 38+ messages in thread
From: Daniel De Graaf @ 2010-12-17 15:22 UTC (permalink / raw)
  To: jeremy; +Cc: xen-devel, Ian.Campbell

This allows userspace to perform mmap() on the gntdev device and then
immediately close the filehandle or remove the mapping using the
remove ioctl, with the mapped area remaining valid until unmapped.
This also fixes an infinite loop when a gntdev device is closed
without first unmapping all areas.
    
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
 drivers/xen/gntdev.c |   61 +++++++++++++++++++++----------------------------
 1 files changed, 26 insertions(+), 35 deletions(-)

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 6a3c9e4..0c8e8a2 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -66,16 +66,18 @@ struct granted_page {
 
 struct grant_map {
 	struct list_head next;
-	struct gntdev_priv *priv;
 	struct vm_area_struct *vma;
 	int index;
 	int count;
+	atomic_t users;
 	int is_mapped:1;
 	int is_ro:1;
 	struct page **pages;
 	struct granted_page pginfo[0];
 };
 
+static void unmap_grant_pages(struct grant_map *map, int offset, int pages);
+
 /* ------------------------------------------------------------------ */
 
 static void gntdev_print_maps(struct gntdev_priv *priv,
@@ -112,6 +114,7 @@ static struct grant_map *gntdev_alloc_map(int count,
 	}
 
 	add->index = 0;
+	atomic_set(&add->users, 1);
 	add->count = count;
 	for(i = 0; i < count; i++)
 		add->pginfo[i].target = grants[i];
@@ -144,7 +147,6 @@ static void gntdev_add_map(struct gntdev_priv *priv, struct grant_map *add)
 	list_add_tail(&add->next, &priv->maps);
 
 done:
-	add->priv = priv;
 	if (debug)
 		gntdev_print_maps(priv, "[new]", add->index);
 	spin_unlock(&priv->lock);
@@ -165,33 +167,23 @@ static struct grant_map *gntdev_find_map_index(struct gntdev_priv *priv, int ind
 	return NULL;
 }
 
-static int gntdev_del_map(struct grant_map *map)
+static void gntdev_put_map(struct grant_map *map)
 {
 	int i;
 
-	if (map->vma)
-		return -EBUSY;
-	if (map->is_mapped)
-		for (i = 0; i < map->count; i++)
-			if (map->pginfo[i].handle)
-				return -EBUSY;
-
-	atomic_sub(map->count, &pages_mapped);
-	list_del(&map->next);
-	return 0;
-}
-
-static void gntdev_free_map(struct grant_map *map)
-{
-	unsigned i;
-
 	if (!map)
 		return;
 
+	if (!atomic_dec_and_test(&map->users))
+		return;
+
+	atomic_sub(map->count, &pages_mapped);
+
 	for (i = 0; i < map->count; i++) {
 		if (map->pages[i])
 			__free_page(map->pages[i]);
 	}
+	kfree(map->pages);
 	kfree(map);
 }
 
@@ -310,6 +302,7 @@ static void gntdev_vma_close(struct vm_area_struct *vma)
 	map->is_mapped = 0;
 	map->vma = NULL;
 	vma->vm_private_data = NULL;
+	gntdev_put_map(map);
 }
 
 static int gntdev_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
@@ -425,7 +418,6 @@ static int gntdev_release(struct inode *inode, struct file *flip)
 {
 	struct gntdev_priv *priv = flip->private_data;
 	struct grant_map *map;
-	int err;
 
 	if (debug)
 		printk("%s: priv %p\n", __FUNCTION__, priv);
@@ -433,10 +425,8 @@ static int gntdev_release(struct inode *inode, struct file *flip)
 	spin_lock(&priv->lock);
 	while (!list_empty(&priv->maps)) {
 		map = list_entry(priv->maps.next, struct grant_map, next);
-		err = gntdev_del_map(map);
-		if (WARN_ON(err))
-			gntdev_free_map(map);
-
+		list_del(&map->next);
+		gntdev_put_map(map);
 	}
 	spin_unlock(&priv->lock);
 
@@ -487,18 +477,14 @@ static long gntdev_ioctl_map_grant_ref(struct gntdev_priv *priv,
 
 	if (copy_to_user(u, &op, sizeof(op))) {
 		err = -EFAULT;
-		goto out_remove;
+		goto out_free;
 	}
 	err = 0;
 out_free:
 	kfree(grants);
 	return err;
-out_remove:
-	spin_lock(&priv->lock);
-	gntdev_del_map(map);
-	spin_unlock(&priv->lock);
 out_free_map:
-	gntdev_free_map(map);
+	gntdev_put_map(map);
 	goto out_free;
 }
 
@@ -507,7 +493,7 @@ static long gntdev_ioctl_unmap_grant_ref(struct gntdev_priv *priv,
 {
 	struct ioctl_gntdev_unmap_grant_ref op;
 	struct grant_map *map;
-	int err = -EINVAL;
+	int err;
 
 	if (copy_from_user(&op, u, sizeof(op)) != 0)
 		return -EFAULT;
@@ -517,11 +503,13 @@ static long gntdev_ioctl_unmap_grant_ref(struct gntdev_priv *priv,
 
 	spin_lock(&priv->lock);
 	map = gntdev_find_map_index(priv, op.index >> PAGE_SHIFT, op.count);
-	if (map)
-		err = gntdev_del_map(map);
+	if (map) {
+		list_del(&map->next);
+		gntdev_put_map(map);
+		err = 0;
+	} else
+		err = -EINVAL;
 	spin_unlock(&priv->lock);
-	if (!err)
-		gntdev_free_map(map);
 	return err;
 }
 
@@ -606,6 +594,8 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
 		goto unlock_out;
 	}
 
+	atomic_inc(&map->users);
+
 	vma->vm_ops = &gntdev_vmops;
 
 	vma->vm_flags |= VM_RESERVED;
@@ -614,6 +604,7 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
 	vma->vm_flags |= VM_PFNMAP;
 
 	vma->vm_private_data = map;
+
 	map->vma = vma;
 
 	map->is_ro = !(vma->vm_flags & VM_WRITE);

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH v3] Userspace grant communication
  2010-12-17  0:17 [PATCH v3] Userspace grant communication Daniel De Graaf
                   ` (6 preceding siblings ...)
  2010-12-17  0:17 ` [PATCH 7/7] xen-gntalloc: Userspace grant allocation driver Daniel De Graaf
@ 2011-01-07 11:56 ` Stefano Stabellini
  2011-01-14 15:18 ` Konrad Rzeszutek Wilk
  8 siblings, 0 replies; 38+ messages in thread
From: Stefano Stabellini @ 2011-01-07 11:56 UTC (permalink / raw)
  To: Daniel De Graaf; +Cc: Ian Campbell, jeremy, xen-devel

On Fri, 17 Dec 2010, Daniel De Graaf wrote:
> Rebased on top of Stefano's 2.6.37 gntdev patches; included suggestions
> from the v2 patch queue. HVM mapping is now much cleaner, since pages
> with real GFNs are available. Multicall API didn't make it in, because
> the actual hypercall is within a function that I'm not sure should be
> changed to multicall in all instances, and I haven't had a chance to
> figure out how to use it despite this.
> 
> [PATCH 1/7] xen-gntdev: Fix circular locking dependency

Thanks for the fix and for doing the rebase; I have imported this patch
in my series.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/7] xen-gntdev: Change page limit to be global instead of per-open
  2010-12-17  0:17 ` [PATCH 2/7] xen-gntdev: Change page limit to be global instead of per-open Daniel De Graaf
@ 2011-01-10 21:52   ` Konrad Rzeszutek Wilk
  2011-01-11 12:45     ` Daniel De Graaf
  0 siblings, 1 reply; 38+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-01-10 21:52 UTC (permalink / raw)
  To: Daniel De Graaf; +Cc: jeremy, xen-devel, Ian.Campbell

>  static long gntdev_ioctl(struct file *flip,
>  			 unsigned int cmd, unsigned long arg)
>  {
> @@ -555,9 +538,6 @@ static long gntdev_ioctl(struct file *flip,
>  	case IOCTL_GNTDEV_GET_OFFSET_FOR_VADDR:
>  		return gntdev_ioctl_get_offset_for_vaddr(priv, ptr);
>  
> -	case IOCTL_GNTDEV_SET_MAX_GRANTS:
> -		return gntdev_ioctl_set_max_grants(priv, ptr);

Would it make sense to return -EPNOTSUPPORTED? Or does it not really
matter as nobody has been using this ioctl call?

> -
>  	default:
>  		if (debug)
>  			printk("%s: priv %p, unknown cmd %x\n",
> -- 
> 1.7.2.3
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 3/7] xen-gntdev: Remove unneeded structures from grant_map tracking data
  2010-12-17  0:17 ` [PATCH 3/7] xen-gntdev: Remove unneeded structures from grant_map tracking data Daniel De Graaf
@ 2011-01-10 22:14   ` Konrad Rzeszutek Wilk
  2011-01-11 13:02     ` Daniel De Graaf
  0 siblings, 1 reply; 38+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-01-10 22:14 UTC (permalink / raw)
  To: Daniel De Graaf; +Cc: jeremy, xen-devel, Ian.Campbell

> @@ -134,6 +133,7 @@ static void gntdev_add_map(struct gntdev_priv *priv, struct grant_map *add)
>  {
>  	struct grant_map *map;
>  
> +	spin_lock(&priv->lock);
>  	list_for_each_entry(map, &priv->maps, next) {
>  		if (add->index + add->count < map->index) {
>  			list_add_tail(&add->next, &map->next);
> @@ -144,8 +144,10 @@ static void gntdev_add_map(struct gntdev_priv *priv, struct grant_map *add)
>  	list_add_tail(&add->next, &priv->maps);
>  
>  done:
> +	add->priv = priv;
>  	if (debug)
>  		gntdev_print_maps(priv, "[new]", add->index);
> +	spin_unlock(&priv->lock);

That looks like you are also fixing a bug?
>  }
>  
>  static struct grant_map *gntdev_find_map_index(struct gntdev_priv *priv, int index,
> @@ -186,9 +188,10 @@ static int gntdev_del_map(struct grant_map *map)
>  
>  	if (map->vma)
>  		return -EBUSY;
> -	for (i = 0; i < map->count; i++)
> -		if (map->unmap_ops[i].handle)
> -			return -EBUSY;
> +	if (map->is_mapped)
> +		for (i = 0; i < map->count; i++)
> +			if (map->pginfo[i].handle)
> +				return -EBUSY;
>  
>  	atomic_sub(map->count, &pages_mapped);
>  	list_del(&map->next);
> @@ -202,15 +205,10 @@ static void gntdev_free_map(struct grant_map *map)
>  	if (!map)
>  		return;
>  
> -	if (map->pages)
> -		for (i = 0; i < map->count; i++) {
> -			if (map->pages[i])
> -				__free_page(map->pages[i]);
> -		}
> -	kfree(map->pages);
> -	kfree(map->grants);
> -	kfree(map->map_ops);
> -	kfree(map->unmap_ops);
> +	for (i = 0; i < map->count; i++) {
> +		if (map->pages[i])
> +			__free_page(map->pages[i]);
> +	}
>  	kfree(map);
>  }
>  
> @@ -223,53 +221,99 @@ static int find_grant_ptes(pte_t *pte, pgtable_t token, unsigned long addr, void
>  	u64 pte_maddr;
>  
>  	BUG_ON(pgnr >= map->count);
> +
>  	pte_maddr = arbitrary_virt_to_machine(pte).maddr;
> +	map->pginfo[pgnr].pte_maddr = pte_maddr;
>  
> -	gnttab_set_map_op(&map->map_ops[pgnr], pte_maddr,
> -			  GNTMAP_contains_pte | map->flags,
> -			  map->grants[pgnr].ref,
> -			  map->grants[pgnr].domid);
> -	gnttab_set_unmap_op(&map->unmap_ops[pgnr], pte_maddr,
> -			    GNTMAP_contains_pte | map->flags,
> -			    0 /* handle */);
>  	return 0;
>  }
>  
>  static int map_grant_pages(struct grant_map *map)
>  {
> -	int i, err = 0;
> +	int i, flags, err = 0;
> +	struct gnttab_map_grant_ref* map_ops = NULL;
>  
> +	flags = GNTMAP_host_map | GNTMAP_application_map | GNTMAP_contains_pte;

I am not sure if the GNTMAP_contains_pte is correct here. Stefano mentioned
that is used to determine how many arguments to put in the hypercall. Looking
at the previous usage - it was only done on the unmap_op, while you enforce
it on map_op too?

> +	if (map->is_ro)
> +		flags |= GNTMAP_readonly;
> +
> +	err = -ENOMEM;
> +	map_ops = kzalloc(sizeof(map_ops[0]) * map->count, GFP_TEMPORARY);
> +	if (!map_ops)
> +		goto out;
> +
> +	for(i=0; i < map->count; i++) {
> +		gnttab_set_map_op(&map_ops[i], map->pginfo[i].pte_maddr, flags,
> +				  map->pginfo[i].target.ref,
> +				  map->pginfo[i].target.domid);
> +	}
>  	if (debug)
>  		printk("%s: map %d+%d\n", __FUNCTION__, map->index, map->count);
> -	err = gnttab_map_refs(map->map_ops, map->pages, map->count);
> +
> +	err = gnttab_map_refs(map_ops, map->pages, map->count);
> +
>  	if (WARN_ON(err))
> -		return err;
> +		goto out;
>  
>  	for (i = 0; i < map->count; i++) {
> -		if (map->map_ops[i].status)
> +		if (map_ops[i].status) {
> +			__free_page(map->pages[i]);
> +			map->pages[i] = NULL;
>  			err = -EINVAL;
> -		map->unmap_ops[i].handle = map->map_ops[i].handle;
> +		} else {
> +			map->pginfo[i].handle = map_ops[i].handle;
> +		}
>  	}
> +
> +out:
> +	kfree(map_ops);
>  	return err;
>  }
>  
> -static int unmap_grant_pages(struct grant_map *map, int offset, int pages)
> +static void unmap_grant_pages(struct grant_map *map, int offset, int pages)
>  {
> -	int i, err = 0;
> +	int i, flags, err = 0;
> +	struct gnttab_unmap_grant_ref *unmap_ops;
> +	struct gnttab_unmap_grant_ref unmap_single;
> +
> +	if (pages > 1) {
> +		unmap_ops = kzalloc(sizeof(unmap_ops[0]) * pages,
> +		                    GFP_TEMPORARY);
> +		if (unlikely(!unmap_ops)) {
> +			for(i=0; i < pages; i++)
> +				unmap_grant_pages(map, offset + i, 1);
> +			return;
> +		}
> +	} else {
> +		unmap_ops = &unmap_single;
> +	}
> +
> +	flags = GNTMAP_host_map | GNTMAP_application_map | GNTMAP_contains_pte;
> +	if (map->is_ro)
> +		flags |= GNTMAP_readonly;
>  
> +	for(i=0; i < pages; i++)
> +		gnttab_set_unmap_op(&unmap_ops[i],
> +		                    map->pginfo[offset+i].pte_maddr, flags,
> +		                    map->pginfo[offset+i].handle);
>  	if (debug)
>  		printk("%s: map %d+%d [%d+%d]\n", __FUNCTION__,
>  		       map->index, map->count, offset, pages);
> -	err = gnttab_unmap_refs(map->unmap_ops + offset, map->pages, pages);
> +
> +	err = gnttab_unmap_refs(unmap_ops, map->pages + offset, pages);
> +
>  	if (WARN_ON(err))
> -		return err;
> +		goto out;
>  
>  	for (i = 0; i < pages; i++) {
> -		if (map->unmap_ops[offset+i].status)
> -			err = -EINVAL;
> -		map->unmap_ops[offset+i].handle = 0;
> +		WARN_ON(unmap_ops[i].status);

Why change it from err to WARN_ON? I think the caller of this function
checks for this too so you would end up with two WARN_ON?

Also, you don't add the offset value to i here..


> +		__free_page(map->pages[offset+i]);
> +		map->pages[offset+i] = NULL;
> +		map->pginfo[offset+i].handle = 0;
>  	}
> -	return err;
> +out:
> +	if (unmap_ops != &unmap_single)
> +		kfree(unmap_ops);
>  }
>  
>  /* ------------------------------------------------------------------ */
> @@ -308,7 +352,6 @@ static void mn_invl_range_start(struct mmu_notifier *mn,
>  	struct gntdev_priv *priv = container_of(mn, struct gntdev_priv, mn);
>  	struct grant_map *map;
>  	unsigned long mstart, mend;
> -	int err;
>  
>  	spin_lock(&priv->lock);
>  	list_for_each_entry(map, &priv->maps, next) {
> @@ -327,10 +370,9 @@ static void mn_invl_range_start(struct mmu_notifier *mn,
>  			       __FUNCTION__, map->index, map->count,
>  			       map->vma->vm_start, map->vma->vm_end,
>  			       start, end, mstart, mend);
> -		err = unmap_grant_pages(map,
> -					(mstart - map->vma->vm_start) >> PAGE_SHIFT,
> -					(mend - mstart) >> PAGE_SHIFT);
> -		WARN_ON(err);
> +		unmap_grant_pages(map,
> +				  (mstart - map->vma->vm_start) >> PAGE_SHIFT,
> +				  (mend - mstart) >> PAGE_SHIFT);

Ah, so you rememoved the WARN_ON here. What is the reason for doing so?

>  	}
>  	spin_unlock(&priv->lock);
>  }
> @@ -347,7 +389,6 @@ static void mn_release(struct mmu_notifier *mn,
>  {
>  	struct gntdev_priv *priv = container_of(mn, struct gntdev_priv, mn);
>  	struct grant_map *map;
> -	int err;
>  
>  	spin_lock(&priv->lock);
>  	list_for_each_entry(map, &priv->maps, next) {
> @@ -357,8 +398,7 @@ static void mn_release(struct mmu_notifier *mn,
>  			printk("%s: map %d+%d (%lx %lx)\n",
>  			       __FUNCTION__, map->index, map->count,
>  			       map->vma->vm_start, map->vma->vm_end);
> -		err = unmap_grant_pages(map, 0, map->count);
> -		WARN_ON(err);
> +		unmap_grant_pages(map, 0, map->count);
>  	}
>  	spin_unlock(&priv->lock);
>  }
> @@ -427,6 +467,7 @@ static long gntdev_ioctl_map_grant_ref(struct gntdev_priv *priv,
>  {
>  	struct ioctl_gntdev_map_grant_ref op;
>  	struct grant_map *map;
> +	struct ioctl_gntdev_grant_ref* grants;
>  	int err;
>  
>  	if (copy_from_user(&op, u, sizeof(op)) != 0)
> @@ -437,38 +478,45 @@ static long gntdev_ioctl_map_grant_ref(struct gntdev_priv *priv,
>  	if (unlikely(op.count <= 0))
>  		return -EINVAL;
>  
> -	err = -ENOMEM;
> -	map = gntdev_alloc_map(priv, op.count);
> -	if (!map)
> -		return err;
> +	grants = kmalloc(sizeof(grants[0]) * op.count, GFP_TEMPORARY);
> +	if (!grants)
> +		return -ENOMEM;
>  
> -	if (copy_from_user(map->grants, &u->refs,
> -			   sizeof(map->grants[0]) * op.count) != 0) {
> -		gntdev_free_map(map);
> -		return err;
> +	if (copy_from_user(grants, u->refs, sizeof(grants[0]) * op.count)) {
> +		err = -EFAULT;
> +		goto out_free;
>  	}
>  
> +	err = -ENOMEM;
> +	map = gntdev_alloc_map(op.count, grants);
> +	if (!map)
> +		goto out_free;
> +
>  	if (unlikely(atomic_add_return(op.count, &pages_mapped) > limit))
>  	{
>  		if (debug)
>  			printk("%s: can't map: over limit\n", __FUNCTION__);
> -		gntdev_free_map(map);
> -		return err;
> +		goto out_free_map;

I just noticed it now, but shouldn't we also free grants? That looks like
it needs a seperate bug patch thought.
>  	}
>  
> -	spin_lock(&priv->lock);
>  	gntdev_add_map(priv, map);
>  	op.index = map->index << PAGE_SHIFT;
> -	spin_unlock(&priv->lock);

Ah, so you moved the spinlock down. I presume it is OK to have op.index be
unprotected.
>  
> -	if (copy_to_user(u, &op, sizeof(op)) != 0) {
> -		spin_lock(&priv->lock);
> -		gntdev_del_map(map);
> -		spin_unlock(&priv->lock);
> -		gntdev_free_map(map);
> -		return err;
> +	if (copy_to_user(u, &op, sizeof(op))) {
> +		err = -EFAULT;
> +		goto out_remove;

Hmm, should we free the grants on this exit path?

>  	}
> -	return 0;
> +	err = 0;
> +out_free:
> +	kfree(grants);
> +	return err;
> +out_remove:
> +	spin_lock(&priv->lock);
> +	gntdev_del_map(map);
> +	spin_unlock(&priv->lock);
> +out_free_map:
> +	gntdev_free_map(map);
> +	goto out_free;
>  }

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 5/7] xen-gntdev: Add reference counting to maps
  2010-12-17  0:17 ` [PATCH 5/7] xen-gntdev: Add reference counting to maps Daniel De Graaf
                     ` (2 preceding siblings ...)
  2010-12-17 15:22   ` [PATCH 5/7 v2] " Daniel De Graaf
@ 2011-01-10 22:24   ` Konrad Rzeszutek Wilk
  2011-01-11 11:10     ` Stefano Stabellini
  3 siblings, 1 reply; 38+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-01-10 22:24 UTC (permalink / raw)
  To: Daniel De Graaf; +Cc: jeremy, xen-devel, Ian.Campbell

> -static void gntdev_free_map(struct grant_map *map)
> -{
> -	unsigned i;
> +	atomic_sub(map->count, &pages_mapped);
>  
> -	if (!map)
> -		return;
> +	if (!use_ptemod)
> +		unmap_grant_pages(map, 0, map->count);
>  
>  	for (i = 0; i < map->count; i++) {
>  		if (map->pages[i])
>  			__free_page(map->pages[i]);
>  	}
> +	kfree(map->pages);

Can you roll that in the previous patch that introduced the map->pages code?

>  	kfree(map);
>  }
>  
> @@ -310,6 +305,7 @@ static void gntdev_vma_close(struct vm_area_struct *vma)
>  	map->is_mapped = 0;
>  	map->vma = NULL;
>  	vma->vm_private_data = NULL;
> +	gntdev_put_map(map);

I am somehow not seeing this function, nor the use_ptemod defined. Ah, you answered
that later on..

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Re: [PATCH 5/7 v2] xen-gntdev: Add reference counting to maps
  2010-12-17 15:22   ` [PATCH 5/7 v2] " Daniel De Graaf
@ 2011-01-10 22:28     ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 38+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-01-10 22:28 UTC (permalink / raw)
  To: Daniel De Graaf; +Cc: jeremy, xen-devel, Ian.Campbell

> -static int gntdev_del_map(struct grant_map *map)
> +static void gntdev_put_map(struct grant_map *map)

Aha.. here it is.

.. snip ..
> @@ -606,6 +594,8 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
>  		goto unlock_out;
>  	}
>  
> +	atomic_inc(&map->users);
> +
>  	vma->vm_ops = &gntdev_vmops;
>  
>  	vma->vm_flags |= VM_RESERVED;
> @@ -614,6 +604,7 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
>  	vma->vm_flags |= VM_PFNMAP;
>  
>  	vma->vm_private_data = map;
> +

No need for that.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 6/7] xen-gntdev: Support mapping in HVM domains
  2010-12-17  0:17 ` [PATCH 6/7] xen-gntdev: Support mapping in HVM domains Daniel De Graaf
  2010-12-17 15:22   ` [PATCH 6/7 v2] " Daniel De Graaf
@ 2011-01-10 22:41   ` Konrad Rzeszutek Wilk
  2011-01-11 13:15     ` Daniel De Graaf
  1 sibling, 1 reply; 38+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-01-10 22:41 UTC (permalink / raw)
  To: Daniel De Graaf; +Cc: jeremy, xen-devel, Ian.Campbell

> @@ -284,8 +304,25 @@ static void unmap_grant_pages(struct grant_map *map, int offset, int pages)
>  		goto out;
>  
>  	for (i = 0; i < pages; i++) {
> +		uint32_t check, *tmp;
>  		WARN_ON(unmap_ops[i].status);
> -		__free_page(map->pages[offset+i]);
> +		if (!map->pages[i])
> +			continue;
> +		/* XXX When unmapping, Xen will sometimes end up mapping the GFN
> +		 * to an invalid MFN. In this case, writes will be discarded and
> +		 * reads will return all 0xFF bytes. Leak these unusable GFNs
> +		 * until a way to restore them is found.
> +		 */
> +		tmp = kmap(map->pages[i]);
> +		tmp[0] = 0xdeaddead;
> +		mb();
> +		check = tmp[0];
> +		kunmap(map->pages[i]);
> +		if (check == 0xdeaddead)
> +			__free_page(map->pages[i]);
> +		else if (debug)
> +			printk("%s: Discard page %d=%ld\n", __func__,
> +				i, page_to_pfn(map->pages[i]));

Whoa. Any leads to when the "sometimes" happens? Does the status report an
error or is it silent?

>  		map->pages[offset+i] = NULL;
>  		map->pginfo[offset+i].handle = 0;
>  	}

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 5/7] xen-gntdev: Add reference counting to maps
  2011-01-10 22:24   ` [PATCH 5/7] " Konrad Rzeszutek Wilk
@ 2011-01-11 11:10     ` Stefano Stabellini
  2011-01-11 17:46       ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 38+ messages in thread
From: Stefano Stabellini @ 2011-01-11 11:10 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Ian Campbell, Daniel De Graaf, xen-devel, jeremy

On Mon, 10 Jan 2011, Konrad Rzeszutek Wilk wrote:
> > -static void gntdev_free_map(struct grant_map *map)
> > -{
> > -	unsigned i;
> > +	atomic_sub(map->count, &pages_mapped);
> >  
> > -	if (!map)
> > -		return;
> > +	if (!use_ptemod)
> > +		unmap_grant_pages(map, 0, map->count);
> >  
> >  	for (i = 0; i < map->count; i++) {
> >  		if (map->pages[i])
> >  			__free_page(map->pages[i]);
> >  	}
> > +	kfree(map->pages);
> 
> Can you roll that in the previous patch that introduced the map->pages code?
> 

map->pages is actually introduced by "xen gntdev: use gnttab_map_refs
and gnttab_unmap_refs" in my patch series and it already has a
kfree(map->pages) in gntdev_free_map.
In fact I think reading this chuck of the patch on its own is misleading
because as you can see the whole gntdev_free_map function has been
removed...

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/7] xen-gntdev: Change page limit to be global instead of per-open
  2011-01-10 21:52   ` Konrad Rzeszutek Wilk
@ 2011-01-11 12:45     ` Daniel De Graaf
  2011-01-11 17:51       ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 38+ messages in thread
From: Daniel De Graaf @ 2011-01-11 12:45 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: jeremy, xen-devel, Ian.Campbell

On 01/10/2011 04:52 PM, Konrad Rzeszutek Wilk wrote:
>>  static long gntdev_ioctl(struct file *flip,
>>  			 unsigned int cmd, unsigned long arg)
>>  {
>> @@ -555,9 +538,6 @@ static long gntdev_ioctl(struct file *flip,
>>  	case IOCTL_GNTDEV_GET_OFFSET_FOR_VADDR:
>>  		return gntdev_ioctl_get_offset_for_vaddr(priv, ptr);
>>  
>> -	case IOCTL_GNTDEV_SET_MAX_GRANTS:
>> -		return gntdev_ioctl_set_max_grants(priv, ptr);
> 
> Would it make sense to return -EPNOTSUPPORTED? Or does it not really
> matter as nobody has been using this ioctl call?

Does this produce a clearer error message than the default -ENOIOCTLCMD?
It's possible that some people use it, since it was exposed as an API.

Doesn't really matter to me; we could also have it return 0 and be a
noop in case someone interprets its failure as inability to support a
sufficient number of grants.

>> -
>>  	default:
>>  		if (debug)
>>  			printk("%s: priv %p, unknown cmd %x\n",
>> -- 
>> 1.7.2.3
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xensource.com
>> http://lists.xensource.com/xen-devel
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 3/7] xen-gntdev: Remove unneeded structures from grant_map tracking data
  2011-01-10 22:14   ` Konrad Rzeszutek Wilk
@ 2011-01-11 13:02     ` Daniel De Graaf
  0 siblings, 0 replies; 38+ messages in thread
From: Daniel De Graaf @ 2011-01-11 13:02 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: jeremy, xen-devel, Ian.Campbell

On 01/10/2011 05:14 PM, Konrad Rzeszutek Wilk wrote:
>> @@ -134,6 +133,7 @@ static void gntdev_add_map(struct gntdev_priv *priv, struct grant_map *add)
>>  {
>>  	struct grant_map *map;
>>  
>> +	spin_lock(&priv->lock);
>>  	list_for_each_entry(map, &priv->maps, next) {
>>  		if (add->index + add->count < map->index) {
>>  			list_add_tail(&add->next, &map->next);
>> @@ -144,8 +144,10 @@ static void gntdev_add_map(struct gntdev_priv *priv, struct grant_map *add)
>>  	list_add_tail(&add->next, &priv->maps);
>>  
>>  done:
>> +	add->priv = priv;
>>  	if (debug)
>>  		gntdev_print_maps(priv, "[new]", add->index);
>> +	spin_unlock(&priv->lock);
> 
> That looks like you are also fixing a bug?

The lock is just being pushed inside this function from its caller, to
make it easier to see where the lock was being held.

>>  }
>>  
>>  static struct grant_map *gntdev_find_map_index(struct gntdev_priv *priv, int index,
>> @@ -186,9 +188,10 @@ static int gntdev_del_map(struct grant_map *map)
>>  
>>  	if (map->vma)
>>  		return -EBUSY;
>> -	for (i = 0; i < map->count; i++)
>> -		if (map->unmap_ops[i].handle)
>> -			return -EBUSY;
>> +	if (map->is_mapped)
>> +		for (i = 0; i < map->count; i++)
>> +			if (map->pginfo[i].handle)
>> +				return -EBUSY;
>>  
>>  	atomic_sub(map->count, &pages_mapped);
>>  	list_del(&map->next);
>> @@ -202,15 +205,10 @@ static void gntdev_free_map(struct grant_map *map)
>>  	if (!map)
>>  		return;
>>  
>> -	if (map->pages)
>> -		for (i = 0; i < map->count; i++) {
>> -			if (map->pages[i])
>> -				__free_page(map->pages[i]);
>> -		}
>> -	kfree(map->pages);
>> -	kfree(map->grants);
>> -	kfree(map->map_ops);
>> -	kfree(map->unmap_ops);
>> +	for (i = 0; i < map->count; i++) {
>> +		if (map->pages[i])
>> +			__free_page(map->pages[i]);
>> +	}
>>  	kfree(map);
>>  }
>>  
>> @@ -223,53 +221,99 @@ static int find_grant_ptes(pte_t *pte, pgtable_t token, unsigned long addr, void
>>  	u64 pte_maddr;
>>  
>>  	BUG_ON(pgnr >= map->count);
>> +
>>  	pte_maddr = arbitrary_virt_to_machine(pte).maddr;
>> +	map->pginfo[pgnr].pte_maddr = pte_maddr;
>>  
>> -	gnttab_set_map_op(&map->map_ops[pgnr], pte_maddr,
>> -			  GNTMAP_contains_pte | map->flags,
>> -			  map->grants[pgnr].ref,
>> -			  map->grants[pgnr].domid);
>> -	gnttab_set_unmap_op(&map->unmap_ops[pgnr], pte_maddr,
>> -			    GNTMAP_contains_pte | map->flags,
>> -			    0 /* handle */);
>>  	return 0;
>>  }
>>  
>>  static int map_grant_pages(struct grant_map *map)
>>  {
>> -	int i, err = 0;
>> +	int i, flags, err = 0;
>> +	struct gnttab_map_grant_ref* map_ops = NULL;
>>  
>> +	flags = GNTMAP_host_map | GNTMAP_application_map | GNTMAP_contains_pte;
> 
> I am not sure if the GNTMAP_contains_pte is correct here. Stefano mentioned
> that is used to determine how many arguments to put in the hypercall. Looking
> at the previous usage - it was only done on the unmap_op, while you enforce
> it on map_op too?

GNTMAP_contains_pte is used on PV guests (only) to indicate that the
argument contains PTEs that need to be modified, rather than PFNs to
remap. See patch 6 for the HVM case where this flag is not used.

Previous use had it on both map and unmap. This was clearer prior to
a reshuffling patch that moved it to the gnttab_set_map_op call.

>> +	if (map->is_ro)
>> +		flags |= GNTMAP_readonly;
>> +
>> +	err = -ENOMEM;
>> +	map_ops = kzalloc(sizeof(map_ops[0]) * map->count, GFP_TEMPORARY);
>> +	if (!map_ops)
>> +		goto out;
>> +
>> +	for(i=0; i < map->count; i++) {
>> +		gnttab_set_map_op(&map_ops[i], map->pginfo[i].pte_maddr, flags,
>> +				  map->pginfo[i].target.ref,
>> +				  map->pginfo[i].target.domid);
>> +	}
>>  	if (debug)
>>  		printk("%s: map %d+%d\n", __FUNCTION__, map->index, map->count);
>> -	err = gnttab_map_refs(map->map_ops, map->pages, map->count);
>> +
>> +	err = gnttab_map_refs(map_ops, map->pages, map->count);
>> +
>>  	if (WARN_ON(err))
>> -		return err;
>> +		goto out;
>>  
>>  	for (i = 0; i < map->count; i++) {
>> -		if (map->map_ops[i].status)
>> +		if (map_ops[i].status) {
>> +			__free_page(map->pages[i]);
>> +			map->pages[i] = NULL;
>>  			err = -EINVAL;
>> -		map->unmap_ops[i].handle = map->map_ops[i].handle;
>> +		} else {
>> +			map->pginfo[i].handle = map_ops[i].handle;
>> +		}
>>  	}
>> +
>> +out:
>> +	kfree(map_ops);
>>  	return err;
>>  }
>>  
>> -static int unmap_grant_pages(struct grant_map *map, int offset, int pages)
>> +static void unmap_grant_pages(struct grant_map *map, int offset, int pages)
>>  {
>> -	int i, err = 0;
>> +	int i, flags, err = 0;
>> +	struct gnttab_unmap_grant_ref *unmap_ops;
>> +	struct gnttab_unmap_grant_ref unmap_single;
>> +
>> +	if (pages > 1) {
>> +		unmap_ops = kzalloc(sizeof(unmap_ops[0]) * pages,
>> +		                    GFP_TEMPORARY);
>> +		if (unlikely(!unmap_ops)) {
>> +			for(i=0; i < pages; i++)
>> +				unmap_grant_pages(map, offset + i, 1);
>> +			return;
>> +		}
>> +	} else {
>> +		unmap_ops = &unmap_single;
>> +	}
>> +
>> +	flags = GNTMAP_host_map | GNTMAP_application_map | GNTMAP_contains_pte;
>> +	if (map->is_ro)
>> +		flags |= GNTMAP_readonly;
>>  
>> +	for(i=0; i < pages; i++)
>> +		gnttab_set_unmap_op(&unmap_ops[i],
>> +		                    map->pginfo[offset+i].pte_maddr, flags,
>> +		                    map->pginfo[offset+i].handle);
>>  	if (debug)
>>  		printk("%s: map %d+%d [%d+%d]\n", __FUNCTION__,
>>  		       map->index, map->count, offset, pages);
>> -	err = gnttab_unmap_refs(map->unmap_ops + offset, map->pages, pages);
>> +
>> +	err = gnttab_unmap_refs(unmap_ops, map->pages + offset, pages);
>> +
>>  	if (WARN_ON(err))
>> -		return err;
>> +		goto out;
>>  
>>  	for (i = 0; i < pages; i++) {
>> -		if (map->unmap_ops[offset+i].status)
>> -			err = -EINVAL;
>> -		map->unmap_ops[offset+i].handle = 0;
>> +		WARN_ON(unmap_ops[i].status);
> 
> Why change it from err to WARN_ON? I think the caller of this function
> checks for this too so you would end up with two WARN_ON?

No, unmap_ops[i].status is not ever seen by the caller. This function
returns void, so the caller doesn't have anything to warn about.

> Also, you don't add the offset value to i here..

unmap_ops (the local variable) is indexed without using offset.

>> +		__free_page(map->pages[offset+i]);
>> +		map->pages[offset+i] = NULL;
>> +		map->pginfo[offset+i].handle = 0;
>>  	}
>> -	return err;
>> +out:
>> +	if (unmap_ops != &unmap_single)
>> +		kfree(unmap_ops);
>>  }
>>  
>>  /* ------------------------------------------------------------------ */
>> @@ -308,7 +352,6 @@ static void mn_invl_range_start(struct mmu_notifier *mn,
>>  	struct gntdev_priv *priv = container_of(mn, struct gntdev_priv, mn);
>>  	struct grant_map *map;
>>  	unsigned long mstart, mend;
>> -	int err;
>>  
>>  	spin_lock(&priv->lock);
>>  	list_for_each_entry(map, &priv->maps, next) {
>> @@ -327,10 +370,9 @@ static void mn_invl_range_start(struct mmu_notifier *mn,
>>  			       __FUNCTION__, map->index, map->count,
>>  			       map->vma->vm_start, map->vma->vm_end,
>>  			       start, end, mstart, mend);
>> -		err = unmap_grant_pages(map,
>> -					(mstart - map->vma->vm_start) >> PAGE_SHIFT,
>> -					(mend - mstart) >> PAGE_SHIFT);
>> -		WARN_ON(err);
>> +		unmap_grant_pages(map,
>> +				  (mstart - map->vma->vm_start) >> PAGE_SHIFT,
>> +				  (mend - mstart) >> PAGE_SHIFT);
> 
> Ah, so you rememoved the WARN_ON here. What is the reason for doing so?

This makes it clearer where the error was located (in the hypercall return,
or in the status of some page). Since the return value of unmap_grant_pages
was never used except to pass to a WARN_ON, it seemed redundant.

>>  	}
>>  	spin_unlock(&priv->lock);
>>  }
>> @@ -347,7 +389,6 @@ static void mn_release(struct mmu_notifier *mn,
>>  {
>>  	struct gntdev_priv *priv = container_of(mn, struct gntdev_priv, mn);
>>  	struct grant_map *map;
>> -	int err;
>>  
>>  	spin_lock(&priv->lock);
>>  	list_for_each_entry(map, &priv->maps, next) {
>> @@ -357,8 +398,7 @@ static void mn_release(struct mmu_notifier *mn,
>>  			printk("%s: map %d+%d (%lx %lx)\n",
>>  			       __FUNCTION__, map->index, map->count,
>>  			       map->vma->vm_start, map->vma->vm_end);
>> -		err = unmap_grant_pages(map, 0, map->count);
>> -		WARN_ON(err);
>> +		unmap_grant_pages(map, 0, map->count);
>>  	}
>>  	spin_unlock(&priv->lock);
>>  }
>> @@ -427,6 +467,7 @@ static long gntdev_ioctl_map_grant_ref(struct gntdev_priv *priv,
>>  {
>>  	struct ioctl_gntdev_map_grant_ref op;
>>  	struct grant_map *map;
>> +	struct ioctl_gntdev_grant_ref* grants;
>>  	int err;
>>  
>>  	if (copy_from_user(&op, u, sizeof(op)) != 0)
>> @@ -437,38 +478,45 @@ static long gntdev_ioctl_map_grant_ref(struct gntdev_priv *priv,
>>  	if (unlikely(op.count <= 0))
>>  		return -EINVAL;
>>  
>> -	err = -ENOMEM;
>> -	map = gntdev_alloc_map(priv, op.count);
>> -	if (!map)
>> -		return err;
>> +	grants = kmalloc(sizeof(grants[0]) * op.count, GFP_TEMPORARY);
>> +	if (!grants)
>> +		return -ENOMEM;
>>  
>> -	if (copy_from_user(map->grants, &u->refs,
>> -			   sizeof(map->grants[0]) * op.count) != 0) {
>> -		gntdev_free_map(map);
>> -		return err;
>> +	if (copy_from_user(grants, u->refs, sizeof(grants[0]) * op.count)) {
>> +		err = -EFAULT;
>> +		goto out_free;
>>  	}
>>  
>> +	err = -ENOMEM;
>> +	map = gntdev_alloc_map(op.count, grants);
>> +	if (!map)
>> +		goto out_free;
>> +
>>  	if (unlikely(atomic_add_return(op.count, &pages_mapped) > limit))
>>  	{
>>  		if (debug)
>>  			printk("%s: can't map: over limit\n", __FUNCTION__);
>> -		gntdev_free_map(map);
>> -		return err;
>> +		goto out_free_map;
> 
> I just noticed it now, but shouldn't we also free grants? That looks like
> it needs a seperate bug patch thought.

It is getting freed: out_free_map goes to out_free after freeing the map.

>>  	}
>>  
>> -	spin_lock(&priv->lock);
>>  	gntdev_add_map(priv, map);
>>  	op.index = map->index << PAGE_SHIFT;
>> -	spin_unlock(&priv->lock);
> 
> Ah, so you moved the spinlock down. I presume it is OK to have op.index be
> unprotected.

Yep, op is on the stack.

>>  
>> -	if (copy_to_user(u, &op, sizeof(op)) != 0) {
>> -		spin_lock(&priv->lock);
>> -		gntdev_del_map(map);
>> -		spin_unlock(&priv->lock);
>> -		gntdev_free_map(map);
>> -		return err;
>> +	if (copy_to_user(u, &op, sizeof(op))) {
>> +		err = -EFAULT;
>> +		goto out_remove;
> 
> Hmm, should we free the grants on this exit path?

We are; out_remove falls to out_free_map falls to out_free.

>>  	}
>> -	return 0;
>> +	err = 0;
>> +out_free:
>> +	kfree(grants);
>> +	return err;
>> +out_remove:
>> +	spin_lock(&priv->lock);
>> +	gntdev_del_map(map);
>> +	spin_unlock(&priv->lock);
>> +out_free_map:
>> +	gntdev_free_map(map);
>> +	goto out_free;
>>  }
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 6/7] xen-gntdev: Support mapping in HVM domains
  2011-01-10 22:41   ` [PATCH 6/7] " Konrad Rzeszutek Wilk
@ 2011-01-11 13:15     ` Daniel De Graaf
  2011-01-11 14:52       ` Daniel De Graaf
  0 siblings, 1 reply; 38+ messages in thread
From: Daniel De Graaf @ 2011-01-11 13:15 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: jeremy, xen-devel, Ian.Campbell

On 01/10/2011 05:41 PM, Konrad Rzeszutek Wilk wrote:
>> @@ -284,8 +304,25 @@ static void unmap_grant_pages(struct grant_map *map, int offset, int pages)
>>  		goto out;
>>  
>>  	for (i = 0; i < pages; i++) {
>> +		uint32_t check, *tmp;
>>  		WARN_ON(unmap_ops[i].status);
>> -		__free_page(map->pages[offset+i]);
>> +		if (!map->pages[i])
>> +			continue;
>> +		/* XXX When unmapping, Xen will sometimes end up mapping the GFN
>> +		 * to an invalid MFN. In this case, writes will be discarded and
>> +		 * reads will return all 0xFF bytes. Leak these unusable GFNs
>> +		 * until a way to restore them is found.
>> +		 */
>> +		tmp = kmap(map->pages[i]);
>> +		tmp[0] = 0xdeaddead;
>> +		mb();
>> +		check = tmp[0];
>> +		kunmap(map->pages[i]);
>> +		if (check == 0xdeaddead)
>> +			__free_page(map->pages[i]);
>> +		else if (debug)
>> +			printk("%s: Discard page %d=%ld\n", __func__,
>> +				i, page_to_pfn(map->pages[i]));
> 
> Whoa. Any leads to when the "sometimes" happens? Does the status report an
> error or is it silent?

Status is silent in this case. I can produce it quite reliably on my
test system where I am mapping a framebuffer (1280 pages) between two
HVM guests - in this case, about 2/3 of the released pages will end up
being invalid. It doesn't seem to be size-related as I have also seen
it on the small 3-page page index mapping. There is a message on xm
dmesg that may be related:

(XEN) sh error: sh_remove_all_mappings(): can't find all mappings of mfn 7cbc6: c=8000000000000004 t=7400000000000002

This appears about once per page, with different MFNs but the same c/t.
One of the two HVM guests (the one doing the mapping) has the PCI
graphics card forwarded to it.

>>  		map->pages[offset+i] = NULL;
>>  		map->pginfo[offset+i].handle = 0;
>>  	}
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 6/7] xen-gntdev: Support mapping in HVM domains
  2011-01-11 13:15     ` Daniel De Graaf
@ 2011-01-11 14:52       ` Daniel De Graaf
  2011-01-11 18:00         ` c/s 22402 ("86 hvm: Refuse to perform __hvm_copy() work in atomic context.") breaks HVM, race possible in other code - any ideas? Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 38+ messages in thread
From: Daniel De Graaf @ 2011-01-11 14:52 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: jeremy, xen-devel, Ian.Campbell

On 01/11/2011 08:15 AM, Daniel De Graaf wrote:
> On 01/10/2011 05:41 PM, Konrad Rzeszutek Wilk wrote:
>>> @@ -284,8 +304,25 @@ static void unmap_grant_pages(struct grant_map *map, int offset, int pages)
>>>  		goto out;
>>>  
>>>  	for (i = 0; i < pages; i++) {
>>> +		uint32_t check, *tmp;
>>>  		WARN_ON(unmap_ops[i].status);
>>> -		__free_page(map->pages[offset+i]);
>>> +		if (!map->pages[i])
>>> +			continue;
>>> +		/* XXX When unmapping, Xen will sometimes end up mapping the GFN
>>> +		 * to an invalid MFN. In this case, writes will be discarded and
>>> +		 * reads will return all 0xFF bytes. Leak these unusable GFNs
>>> +		 * until a way to restore them is found.
>>> +		 */
>>> +		tmp = kmap(map->pages[i]);
>>> +		tmp[0] = 0xdeaddead;
>>> +		mb();
>>> +		check = tmp[0];
>>> +		kunmap(map->pages[i]);
>>> +		if (check == 0xdeaddead)
>>> +			__free_page(map->pages[i]);
>>> +		else if (debug)
>>> +			printk("%s: Discard page %d=%ld\n", __func__,
>>> +				i, page_to_pfn(map->pages[i]));
>>
>> Whoa. Any leads to when the "sometimes" happens? Does the status report an
>> error or is it silent?
> 
> Status is silent in this case. I can produce it quite reliably on my
> test system where I am mapping a framebuffer (1280 pages) between two
> HVM guests - in this case, about 2/3 of the released pages will end up
> being invalid. It doesn't seem to be size-related as I have also seen
> it on the small 3-page page index mapping. There is a message on xm
> dmesg that may be related:
> 
> (XEN) sh error: sh_remove_all_mappings(): can't find all mappings of mfn 7cbc6: c=8000000000000004 t=7400000000000002
> 
> This appears about once per page, with different MFNs but the same c/t.
> One of the two HVM guests (the one doing the mapping) has the PCI
> graphics card forwarded to it.
> 

Just tested on the latest xen 4.1 (with 22402:7d2fdc083c9c reverted as
that breaks HVM grants), which produces different output:

...
(XEN) mm.c:889:d1 Error getting mfn b803e (pfn 25a3e) from L1 entry 00000000b803e021 for l1e_owner=1, pg_owner=1
(XEN) mm.c:889:d1 Error getting mfn b8038 (pfn 25a38) from L1 entry 00000000b8038021 for l1e_owner=1, pg_owner=1
(XEN) mm.c:889:d1 Error getting mfn b803d (pfn 25a3d) from L1 entry 00000000b803d021 for l1e_owner=1, pg_owner=1
(XEN) mm.c:889:d1 Error getting mfn 10829 (pfn 25a29) from L1 entry 0000000010829021 for l1e_owner=1, pg_owner=1
(XEN) mm.c:889:d1 Error getting mfn 1081c (pfn 25a1c) from L1 entry 000000001081c021 for l1e_owner=1, pg_owner=1
(XEN) mm.c:889:d1 Error getting mfn 10816 (pfn 25a16) from L1 entry 0000000010816021 for l1e_owner=1, pg_owner=1
(XEN) mm.c:889:d1 Error getting mfn 1081a (pfn 25a1a) from L1 entry 000000001081a021 for l1e_owner=1, pg_owner=1
...

This appears on the map; nothing is printed on the unmap. If the
unmap happens while the domain is up, it seems to be invalid more often;
most (perhaps all) of the destination-valid unmaps happen when the domain
is being destroyed. Exactly which pages are valid or invalid seems to be
mostly random, although nearby GFNs tend to have the same validity.

If you have any thoughts as to the cause, I can test patches or provide
output as needed; it would be better if this workaround weren't needed.

-- 
Daniel De Graaf
National Security Agency

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 5/7] xen-gntdev: Add reference counting to maps
  2011-01-11 11:10     ` Stefano Stabellini
@ 2011-01-11 17:46       ` Konrad Rzeszutek Wilk
  2011-01-12 11:58         ` Stefano Stabellini
  0 siblings, 1 reply; 38+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-01-11 17:46 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: Ian Campbell, Daniel De Graaf, xen-devel, jeremy

On Tue, Jan 11, 2011 at 11:10:23AM +0000, Stefano Stabellini wrote:
> On Mon, 10 Jan 2011, Konrad Rzeszutek Wilk wrote:
> > > -static void gntdev_free_map(struct grant_map *map)
> > > -{
> > > -	unsigned i;
> > > +	atomic_sub(map->count, &pages_mapped);
> > >  
> > > -	if (!map)
> > > -		return;
> > > +	if (!use_ptemod)
> > > +		unmap_grant_pages(map, 0, map->count);
> > >  
> > >  	for (i = 0; i < map->count; i++) {
> > >  		if (map->pages[i])
> > >  			__free_page(map->pages[i]);
> > >  	}
> > > +	kfree(map->pages);
> > 
> > Can you roll that in the previous patch that introduced the map->pages code?
> > 
> 
> map->pages is actually introduced by "xen gntdev: use gnttab_map_refs
> and gnttab_unmap_refs" in my patch series and it already has a

Right. But I meant his patch that collapsed all of the different kzalloc's in just one.

> kfree(map->pages) in gntdev_free_map.
> In fact I think reading this chuck of the patch on its own is misleading
> because as you can see the whole gntdev_free_map function has been
> removed...

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/7] xen-gntdev: Change page limit to be global instead of per-open
  2011-01-11 12:45     ` Daniel De Graaf
@ 2011-01-11 17:51       ` Konrad Rzeszutek Wilk
  2011-01-11 18:18         ` Daniel De Graaf
  0 siblings, 1 reply; 38+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-01-11 17:51 UTC (permalink / raw)
  To: Daniel De Graaf; +Cc: jeremy, xen-devel, Ian.Campbell

On Tue, Jan 11, 2011 at 07:45:34AM -0500, Daniel De Graaf wrote:
> On 01/10/2011 04:52 PM, Konrad Rzeszutek Wilk wrote:
> >>  static long gntdev_ioctl(struct file *flip,
> >>  			 unsigned int cmd, unsigned long arg)
> >>  {
> >> @@ -555,9 +538,6 @@ static long gntdev_ioctl(struct file *flip,
> >>  	case IOCTL_GNTDEV_GET_OFFSET_FOR_VADDR:
> >>  		return gntdev_ioctl_get_offset_for_vaddr(priv, ptr);
> >>  
> >> -	case IOCTL_GNTDEV_SET_MAX_GRANTS:
> >> -		return gntdev_ioctl_set_max_grants(priv, ptr);
> > 
> > Would it make sense to return -EPNOTSUPPORTED? Or does it not really
> > matter as nobody has been using this ioctl call?
> 
> Does this produce a clearer error message than the default -ENOIOCTLCMD?
> It's possible that some people use it, since it was exposed as an API.

Looking at the Xen tools the user of this is:
xc_gnttab_set_max_grants which would end up returning whatever the
error is. I don't see any users of this in the Xen tools, thought there might
be some in the XCP code. Lets stay with your ENOIOCTLCMD.

However, I was wondering if you are going to submit a patch to the Xen
tool stack so that it can utlize the SysFS interface to set the limits
for that API call?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* c/s 22402 ("86 hvm: Refuse to perform __hvm_copy() work in atomic context.") breaks HVM, race possible in other code - any ideas?
  2011-01-11 14:52       ` Daniel De Graaf
@ 2011-01-11 18:00         ` Konrad Rzeszutek Wilk
  2011-01-11 18:24           ` Daniel De Graaf
  0 siblings, 1 reply; 38+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-01-11 18:00 UTC (permalink / raw)
  To: Daniel De Graaf, keir; +Cc: jeremy, xen-devel, Ian.Campbell

On Tue, Jan 11, 2011 at 09:52:19AM -0500, Daniel De Graaf wrote:
> On 01/11/2011 08:15 AM, Daniel De Graaf wrote:
> > On 01/10/2011 05:41 PM, Konrad Rzeszutek Wilk wrote:
> >>> @@ -284,8 +304,25 @@ static void unmap_grant_pages(struct grant_map *map, int offset, int pages)
> >>>  		goto out;
> >>>  
> >>>  	for (i = 0; i < pages; i++) {
> >>> +		uint32_t check, *tmp;
> >>>  		WARN_ON(unmap_ops[i].status);
> >>> -		__free_page(map->pages[offset+i]);
> >>> +		if (!map->pages[i])
> >>> +			continue;
> >>> +		/* XXX When unmapping, Xen will sometimes end up mapping the GFN
> >>> +		 * to an invalid MFN. In this case, writes will be discarded and
> >>> +		 * reads will return all 0xFF bytes. Leak these unusable GFNs
> >>> +		 * until a way to restore them is found.
> >>> +		 */
> >>> +		tmp = kmap(map->pages[i]);
> >>> +		tmp[0] = 0xdeaddead;
> >>> +		mb();
> >>> +		check = tmp[0];
> >>> +		kunmap(map->pages[i]);
> >>> +		if (check == 0xdeaddead)
> >>> +			__free_page(map->pages[i]);
> >>> +		else if (debug)
> >>> +			printk("%s: Discard page %d=%ld\n", __func__,
> >>> +				i, page_to_pfn(map->pages[i]));
> >>
> >> Whoa. Any leads to when the "sometimes" happens? Does the status report an
> >> error or is it silent?
> > 
> > Status is silent in this case. I can produce it quite reliably on my
> > test system where I am mapping a framebuffer (1280 pages) between two
> > HVM guests - in this case, about 2/3 of the released pages will end up
> > being invalid. It doesn't seem to be size-related as I have also seen
> > it on the small 3-page page index mapping. There is a message on xm
> > dmesg that may be related:
> > 
> > (XEN) sh error: sh_remove_all_mappings(): can't find all mappings of mfn 7cbc6: c=8000000000000004 t=7400000000000002
> > 
> > This appears about once per page, with different MFNs but the same c/t.
> > One of the two HVM guests (the one doing the mapping) has the PCI
> > graphics card forwarded to it.
> > 
> 
> Just tested on the latest xen 4.1 (with 22402:7d2fdc083c9c reverted as
> that breaks HVM grants), which produces different output:

Keir, the c/s 22402 has your name on it.

Any ideas on the problem that Daniel is hitting with unmapping grants?
> 
> ...
> (XEN) mm.c:889:d1 Error getting mfn b803e (pfn 25a3e) from L1 entry 00000000b803e021 for l1e_owner=1, pg_owner=1
> (XEN) mm.c:889:d1 Error getting mfn b8038 (pfn 25a38) from L1 entry 00000000b8038021 for l1e_owner=1, pg_owner=1
> (XEN) mm.c:889:d1 Error getting mfn b803d (pfn 25a3d) from L1 entry 00000000b803d021 for l1e_owner=1, pg_owner=1
> (XEN) mm.c:889:d1 Error getting mfn 10829 (pfn 25a29) from L1 entry 0000000010829021 for l1e_owner=1, pg_owner=1
> (XEN) mm.c:889:d1 Error getting mfn 1081c (pfn 25a1c) from L1 entry 000000001081c021 for l1e_owner=1, pg_owner=1
> (XEN) mm.c:889:d1 Error getting mfn 10816 (pfn 25a16) from L1 entry 0000000010816021 for l1e_owner=1, pg_owner=1
> (XEN) mm.c:889:d1 Error getting mfn 1081a (pfn 25a1a) from L1 entry 000000001081a021 for l1e_owner=1, pg_owner=1
> ...
> 
> This appears on the map; nothing is printed on the unmap. If the
> unmap happens while the domain is up, it seems to be invalid more often;
> most (perhaps all) of the destination-valid unmaps happen when the domain
> is being destroyed. Exactly which pages are valid or invalid seems to be
> mostly random, although nearby GFNs tend to have the same validity.
> 
> If you have any thoughts as to the cause, I can test patches or provide
> output as needed; it would be better if this workaround weren't needed.
> 
> -- 
> Daniel De Graaf
> National Security Agency

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/7] xen-gntdev: Change page limit to be global instead of per-open
  2011-01-11 17:51       ` Konrad Rzeszutek Wilk
@ 2011-01-11 18:18         ` Daniel De Graaf
  2011-01-11 18:21           ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 38+ messages in thread
From: Daniel De Graaf @ 2011-01-11 18:18 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: jeremy, xen-devel, Ian.Campbell

On 01/11/2011 12:51 PM, Konrad Rzeszutek Wilk wrote:
> On Tue, Jan 11, 2011 at 07:45:34AM -0500, Daniel De Graaf wrote:
>> On 01/10/2011 04:52 PM, Konrad Rzeszutek Wilk wrote:
>>>>  static long gntdev_ioctl(struct file *flip,
>>>>  			 unsigned int cmd, unsigned long arg)
>>>>  {
>>>> @@ -555,9 +538,6 @@ static long gntdev_ioctl(struct file *flip,
>>>>  	case IOCTL_GNTDEV_GET_OFFSET_FOR_VADDR:
>>>>  		return gntdev_ioctl_get_offset_for_vaddr(priv, ptr);
>>>>  
>>>> -	case IOCTL_GNTDEV_SET_MAX_GRANTS:
>>>> -		return gntdev_ioctl_set_max_grants(priv, ptr);
>>>
>>> Would it make sense to return -EPNOTSUPPORTED? Or does it not really
>>> matter as nobody has been using this ioctl call?
>>
>> Does this produce a clearer error message than the default -ENOIOCTLCMD?
>> It's possible that some people use it, since it was exposed as an API.
> 
> Looking at the Xen tools the user of this is:
> xc_gnttab_set_max_grants which would end up returning whatever the
> error is. I don't see any users of this in the Xen tools, thought there might
> be some in the XCP code. Lets stay with your ENOIOCTLCMD.
> 
> However, I was wondering if you are going to submit a patch to the Xen
> tool stack so that it can utlize the SysFS interface to set the limits
> for that API call?
> 

No, because the semantics of what the limit is covering have changed. The
new limit is per-domain, and if there was any existing code that set the
limit, it would have been a per-open value and probably too low. I think
it was suggested that the call be removed from the Xen API; I can submit
a patch to do that, if you want.

The new value is probably best set in modprobe.conf if gntdev is a module;
the sysfs interface is useful for runtime adjustment or if it is builtin.

-- 
Daniel De Graaf
National Security Agency

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/7] xen-gntdev: Change page limit to be global instead of per-open
  2011-01-11 18:18         ` Daniel De Graaf
@ 2011-01-11 18:21           ` Konrad Rzeszutek Wilk
  2011-01-11 18:49             ` [PATCH libxc] Remove set_max_grants in linux Daniel De Graaf
  0 siblings, 1 reply; 38+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-01-11 18:21 UTC (permalink / raw)
  To: Daniel De Graaf; +Cc: jeremy, xen-devel, Ian.Campbell

On Tue, Jan 11, 2011 at 01:18:09PM -0500, Daniel De Graaf wrote:
> On 01/11/2011 12:51 PM, Konrad Rzeszutek Wilk wrote:
> > On Tue, Jan 11, 2011 at 07:45:34AM -0500, Daniel De Graaf wrote:
> >> On 01/10/2011 04:52 PM, Konrad Rzeszutek Wilk wrote:
> >>>>  static long gntdev_ioctl(struct file *flip,
> >>>>  			 unsigned int cmd, unsigned long arg)
> >>>>  {
> >>>> @@ -555,9 +538,6 @@ static long gntdev_ioctl(struct file *flip,
> >>>>  	case IOCTL_GNTDEV_GET_OFFSET_FOR_VADDR:
> >>>>  		return gntdev_ioctl_get_offset_for_vaddr(priv, ptr);
> >>>>  
> >>>> -	case IOCTL_GNTDEV_SET_MAX_GRANTS:
> >>>> -		return gntdev_ioctl_set_max_grants(priv, ptr);
> >>>
> >>> Would it make sense to return -EPNOTSUPPORTED? Or does it not really
> >>> matter as nobody has been using this ioctl call?
> >>
> >> Does this produce a clearer error message than the default -ENOIOCTLCMD?
> >> It's possible that some people use it, since it was exposed as an API.
> > 
> > Looking at the Xen tools the user of this is:
> > xc_gnttab_set_max_grants which would end up returning whatever the
> > error is. I don't see any users of this in the Xen tools, thought there might
> > be some in the XCP code. Lets stay with your ENOIOCTLCMD.
> > 
> > However, I was wondering if you are going to submit a patch to the Xen
> > tool stack so that it can utlize the SysFS interface to set the limits
> > for that API call?
> > 
> 
> No, because the semantics of what the limit is covering have changed. The
> new limit is per-domain, and if there was any existing code that set the
> limit, it would have been a per-open value and probably too low. I think
> it was suggested that the call be removed from the Xen API; I can submit
> a patch to do that, if you want.

Please do. Thank you.
> 
> The new value is probably best set in modprobe.conf if gntdev is a module;
> the sysfs interface is useful for runtime adjustment or if it is builtin.
> 
> -- 
> Daniel De Graaf
> National Security Agency

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: c/s 22402 ("86 hvm: Refuse to perform __hvm_copy() work in atomic context.") breaks HVM, race possible in other code - any ideas?
  2011-01-11 18:00         ` c/s 22402 ("86 hvm: Refuse to perform __hvm_copy() work in atomic context.") breaks HVM, race possible in other code - any ideas? Konrad Rzeszutek Wilk
@ 2011-01-11 18:24           ` Daniel De Graaf
  0 siblings, 0 replies; 38+ messages in thread
From: Daniel De Graaf @ 2011-01-11 18:24 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: jeremy, xen-devel, keir, Ian.Campbell

On 01/11/2011 01:00 PM, Konrad Rzeszutek Wilk wrote:
> On Tue, Jan 11, 2011 at 09:52:19AM -0500, Daniel De Graaf wrote:
>> On 01/11/2011 08:15 AM, Daniel De Graaf wrote:
>>> On 01/10/2011 05:41 PM, Konrad Rzeszutek Wilk wrote:
>>>>> @@ -284,8 +304,25 @@ static void unmap_grant_pages(struct grant_map *map, int offset, int pages)
>>>>>  		goto out;
>>>>>  
>>>>>  	for (i = 0; i < pages; i++) {
>>>>> +		uint32_t check, *tmp;
>>>>>  		WARN_ON(unmap_ops[i].status);
>>>>> -		__free_page(map->pages[offset+i]);
>>>>> +		if (!map->pages[i])
>>>>> +			continue;
>>>>> +		/* XXX When unmapping, Xen will sometimes end up mapping the GFN
>>>>> +		 * to an invalid MFN. In this case, writes will be discarded and
>>>>> +		 * reads will return all 0xFF bytes. Leak these unusable GFNs
>>>>> +		 * until a way to restore them is found.
>>>>> +		 */
>>>>> +		tmp = kmap(map->pages[i]);
>>>>> +		tmp[0] = 0xdeaddead;
>>>>> +		mb();
>>>>> +		check = tmp[0];
>>>>> +		kunmap(map->pages[i]);
>>>>> +		if (check == 0xdeaddead)
>>>>> +			__free_page(map->pages[i]);
>>>>> +		else if (debug)
>>>>> +			printk("%s: Discard page %d=%ld\n", __func__,
>>>>> +				i, page_to_pfn(map->pages[i]));
>>>>
>>>> Whoa. Any leads to when the "sometimes" happens? Does the status report an
>>>> error or is it silent?
>>>
>>> Status is silent in this case. I can produce it quite reliably on my
>>> test system where I am mapping a framebuffer (1280 pages) between two
>>> HVM guests - in this case, about 2/3 of the released pages will end up
>>> being invalid. It doesn't seem to be size-related as I have also seen
>>> it on the small 3-page page index mapping. There is a message on xm
>>> dmesg that may be related:
>>>
>>> (XEN) sh error: sh_remove_all_mappings(): can't find all mappings of mfn 7cbc6: c=8000000000000004 t=7400000000000002
>>>
>>> This appears about once per page, with different MFNs but the same c/t.
>>> One of the two HVM guests (the one doing the mapping) has the PCI
>>> graphics card forwarded to it.
>>>
>>
>> Just tested on the latest xen 4.1 (with 22402:7d2fdc083c9c reverted as
>> that breaks HVM grants), which produces different output:
> 
> Keir, the c/s 22402 has your name on it.
> 
> Any ideas on the problem that Daniel is hitting with unmapping grants?

c/s 22402 was discussed at the beginning of December:
http://lists.xensource.com/archives/html/xen-devel/2010-12/msg00063.html

Since I don't use xenpaging (which is the reason for the change), this
revert shouldn't be relevant to the problems I am seeing.

>> ...
>> (XEN) mm.c:889:d1 Error getting mfn b803e (pfn 25a3e) from L1 entry 00000000b803e021 for l1e_owner=1, pg_owner=1
>> (XEN) mm.c:889:d1 Error getting mfn b8038 (pfn 25a38) from L1 entry 00000000b8038021 for l1e_owner=1, pg_owner=1
>> (XEN) mm.c:889:d1 Error getting mfn b803d (pfn 25a3d) from L1 entry 00000000b803d021 for l1e_owner=1, pg_owner=1
>> (XEN) mm.c:889:d1 Error getting mfn 10829 (pfn 25a29) from L1 entry 0000000010829021 for l1e_owner=1, pg_owner=1
>> (XEN) mm.c:889:d1 Error getting mfn 1081c (pfn 25a1c) from L1 entry 000000001081c021 for l1e_owner=1, pg_owner=1
>> (XEN) mm.c:889:d1 Error getting mfn 10816 (pfn 25a16) from L1 entry 0000000010816021 for l1e_owner=1, pg_owner=1
>> (XEN) mm.c:889:d1 Error getting mfn 1081a (pfn 25a1a) from L1 entry 000000001081a021 for l1e_owner=1, pg_owner=1
>> ...
>>
>> This appears on the map; nothing is printed on the unmap. If the
>> unmap happens while the domain is up, it seems to be invalid more often;
>> most (perhaps all) of the destination-valid unmaps happen when the domain
>> is being destroyed. Exactly which pages are valid or invalid seems to be
>> mostly random, although nearby GFNs tend to have the same validity.
>>
>> If you have any thoughts as to the cause, I can test patches or provide
>> output as needed; it would be better if this workaround weren't needed.
>>
>> -- 
>> Daniel De Graaf
>> National Security Agency

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH libxc] Remove set_max_grants in linux
  2011-01-11 18:21           ` Konrad Rzeszutek Wilk
@ 2011-01-11 18:49             ` Daniel De Graaf
  2011-01-12 17:17               ` Ian Jackson
                                 ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Daniel De Graaf @ 2011-01-11 18:49 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: jeremy, xen-devel, Ian.Campbell

On 01/11/2011 01:21 PM, Konrad Rzeszutek Wilk wrote:
> On Tue, Jan 11, 2011 at 01:18:09PM -0500, Daniel De Graaf wrote:
>> On 01/11/2011 12:51 PM, Konrad Rzeszutek Wilk wrote:
>>> On Tue, Jan 11, 2011 at 07:45:34AM -0500, Daniel De Graaf wrote:
>>>> On 01/10/2011 04:52 PM, Konrad Rzeszutek Wilk wrote:
>>>>>>  static long gntdev_ioctl(struct file *flip,
>>>>>>  			 unsigned int cmd, unsigned long arg)
>>>>>>  {
>>>>>> @@ -555,9 +538,6 @@ static long gntdev_ioctl(struct file *flip,
>>>>>>  	case IOCTL_GNTDEV_GET_OFFSET_FOR_VADDR:
>>>>>>  		return gntdev_ioctl_get_offset_for_vaddr(priv, ptr);
>>>>>>  
>>>>>> -	case IOCTL_GNTDEV_SET_MAX_GRANTS:
>>>>>> -		return gntdev_ioctl_set_max_grants(priv, ptr);
>>>>>
>>>>> Would it make sense to return -EPNOTSUPPORTED? Or does it not really
>>>>> matter as nobody has been using this ioctl call?
>>>>
>>>> Does this produce a clearer error message than the default -ENOIOCTLCMD?
>>>> It's possible that some people use it, since it was exposed as an API.
>>>
>>> Looking at the Xen tools the user of this is:
>>> xc_gnttab_set_max_grants which would end up returning whatever the
>>> error is. I don't see any users of this in the Xen tools, thought there might
>>> be some in the XCP code. Lets stay with your ENOIOCTLCMD.
>>>
>>> However, I was wondering if you are going to submit a patch to the Xen
>>> tool stack so that it can utlize the SysFS interface to set the limits
>>> for that API call?
>>>
>>
>> No, because the semantics of what the limit is covering have changed. The
>> new limit is per-domain, and if there was any existing code that set the
>> limit, it would have been a per-open value and probably too low. I think
>> it was suggested that the call be removed from the Xen API; I can submit
>> a patch to do that, if you want.
> 
> Please do. Thank you.
>

I just removed the linux version, because I didn't want to prevent its use
in minios where it is needed to resize the array of grants.

------------------------------------------------------------------->8

The maximum number of grants is now constrained domain-wide in linux,
so set_max_grants should be a noop there. Previously, this constraint
was per-file-description.

diff -r 7b4c82f07281 tools/libxc/xc_gnttab.c
--- a/tools/libxc/xc_gnttab.c	Wed Jan 05 23:54:15 2011 +0000
+++ b/tools/libxc/xc_gnttab.c	Tue Jan 11 13:38:24 2011 -0500
@@ -184,6 +184,8 @@ int xc_gnttab_munmap(xc_gnttab *xcg,
 
 int xc_gnttab_set_max_grants(xc_gnttab *xcg, uint32_t count)
 {
+	if (!xcg->ops->u.gnttab.set_max_grants)
+		return 0;
 	return xcg->ops->u.gnttab.set_max_grants(xcg, xcg->ops_handle, count);
 }
 
diff -r 7b4c82f07281 tools/libxc/xc_linux_osdep.c
--- a/tools/libxc/xc_linux_osdep.c	Wed Jan 05 23:54:15 2011 +0000
+++ b/tools/libxc/xc_linux_osdep.c	Tue Jan 11 13:38:24 2011 -0500
@@ -627,19 +627,6 @@ static int linux_gnttab_munmap(xc_gnttab
     return 0;
 }
 
-static int linux_gnttab_set_max_grants(xc_gnttab *xcg, xc_osdep_handle h, uint32_t count)
-{
-    int fd = (int)h;
-    struct ioctl_gntdev_set_max_grants set_max;
-    int rc;
-
-    set_max.count = count;
-    if ( (rc = ioctl(fd, IOCTL_GNTDEV_SET_MAX_GRANTS, &set_max)) )
-        return rc;
-
-    return 0;
-}
-
 static struct xc_osdep_ops linux_gnttab_ops = {
     .open = &linux_gnttab_open,
     .close = &linux_gnttab_close,
@@ -649,7 +636,6 @@ static struct xc_osdep_ops linux_gnttab_
         .map_grant_refs = &linux_gnttab_map_grant_refs,
         .map_domain_grant_refs = &linux_gnttab_map_domain_grant_refs,
         .munmap = &linux_gnttab_munmap,
-        .set_max_grants = &linux_gnttab_set_max_grants,
     },
 };

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 5/7] xen-gntdev: Add reference counting to maps
  2011-01-11 17:46       ` Konrad Rzeszutek Wilk
@ 2011-01-12 11:58         ` Stefano Stabellini
  0 siblings, 0 replies; 38+ messages in thread
From: Stefano Stabellini @ 2011-01-12 11:58 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Ian Campbell, Daniel De Graaf, xen-devel, jeremy, Stefano Stabellini

On Tue, 11 Jan 2011, Konrad Rzeszutek Wilk wrote:
> On Tue, Jan 11, 2011 at 11:10:23AM +0000, Stefano Stabellini wrote:
> > On Mon, 10 Jan 2011, Konrad Rzeszutek Wilk wrote:
> > > > -static void gntdev_free_map(struct grant_map *map)
> > > > -{
> > > > -	unsigned i;
> > > > +	atomic_sub(map->count, &pages_mapped);
> > > >  
> > > > -	if (!map)
> > > > -		return;
> > > > +	if (!use_ptemod)
> > > > +		unmap_grant_pages(map, 0, map->count);
> > > >  
> > > >  	for (i = 0; i < map->count; i++) {
> > > >  		if (map->pages[i])
> > > >  			__free_page(map->pages[i]);
> > > >  	}
> > > > +	kfree(map->pages);
> > > 
> > > Can you roll that in the previous patch that introduced the map->pages code?
> > > 
> > 
> > map->pages is actually introduced by "xen gntdev: use gnttab_map_refs
> > and gnttab_unmap_refs" in my patch series and it already has a
> 
> Right. But I meant his patch that collapsed all of the different kzalloc's in just one.
> 

"xen-gntdev: Remove unneeded structures from grant_map tracking data" is
a nice cleanup but I'd rather keep it separate from "xen gntdev: use
gnttab_map_refs and gnttab_unmap_refs" that introduces a new
functionality (the usage of gnttab_map_refs and gnttab_unmap_refs).

But I could import just the bit that collapses all the kzalloc's in one.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH libxc] Remove set_max_grants in linux
  2011-01-11 18:49             ` [PATCH libxc] Remove set_max_grants in linux Daniel De Graaf
@ 2011-01-12 17:17               ` Ian Jackson
  2011-01-12 17:57                 ` Daniel De Graaf
  2011-01-13 12:09               ` Ian Jackson
  2011-01-17 17:29               ` Ian Jackson
  2 siblings, 1 reply; 38+ messages in thread
From: Ian Jackson @ 2011-01-12 17:17 UTC (permalink / raw)
  To: Daniel De Graaf; +Cc: jeremy, xen-devel, Ian.Campbell, Konrad Rzeszutek Wilk

Daniel De Graaf writes ("[Xen-devel] [PATCH libxc] Remove set_max_grants in linux"):
> I just removed the linux version, because I didn't want to prevent its use
> in minios where it is needed to resize the array of grants.
> 
> ------------------------------------------------------------------->8
> 
> The maximum number of grants is now constrained domain-wide in linux,
> so set_max_grants should be a noop there. Previously, this constraint
> was per-file-description.

Is this unconditional change suitable for putting into xen-unstable.hg
for the 4.1 release ?  In particular, will it break compatibility with
"traditional Xen" kernels ?

Ian.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH libxc] Remove set_max_grants in linux
  2011-01-12 17:17               ` Ian Jackson
@ 2011-01-12 17:57                 ` Daniel De Graaf
  0 siblings, 0 replies; 38+ messages in thread
From: Daniel De Graaf @ 2011-01-12 17:57 UTC (permalink / raw)
  To: Ian Jackson; +Cc: jeremy, xen-devel, Ian.Campbell, Konrad Rzeszutek Wilk

On 01/12/2011 12:17 PM, Ian Jackson wrote:
> Daniel De Graaf writes ("[Xen-devel] [PATCH libxc] Remove set_max_grants in linux"):
>> I just removed the linux version, because I didn't want to prevent its use
>> in minios where it is needed to resize the array of grants.
>>
>> ------------------------------------------------------------------->8
>>
>> The maximum number of grants is now constrained domain-wide in linux,
>> so set_max_grants should be a noop there. Previously, this constraint
>> was per-file-description.
> 
> Is this unconditional change suitable for putting into xen-unstable.hg
> for the 4.1 release ?  In particular, will it break compatibility with
> "traditional Xen" kernels ?
> 
> Ian.
> 

It won't break compatibility; Linux has never required that you set the
limit in order to use grants. The default limit is set to the maximum
possible limit, so existing users cannot have been increasing it;
changing a possible decrease to a noop shouldn't break anything sane.

-- 
Daniel De Graaf
National Security Agency

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH libxc] Remove set_max_grants in linux
  2011-01-11 18:49             ` [PATCH libxc] Remove set_max_grants in linux Daniel De Graaf
  2011-01-12 17:17               ` Ian Jackson
@ 2011-01-13 12:09               ` Ian Jackson
  2011-01-13 12:48                 ` Daniel De Graaf
  2011-01-17 17:29               ` Ian Jackson
  2 siblings, 1 reply; 38+ messages in thread
From: Ian Jackson @ 2011-01-13 12:09 UTC (permalink / raw)
  To: Daniel De Graaf; +Cc: jeremy, xen-devel, Ian.Campbell, Konrad Rzeszutek Wilk

Daniel De Graaf writes ("[Xen-devel] [PATCH libxc] Remove set_max_grants in linux"):
> I just removed the linux version, because I didn't want to prevent its use
> in minios where it is needed to resize the array of grants.
...
> The maximum number of grants is now constrained domain-wide in linux,
> so set_max_grants should be a noop there. Previously, this constraint
> was per-file-description.

Can you provide a Signed-off-By please, as your confirmation of the
statements in the Developer's Certificate of Origin ?

Then I'll apply your patch after rc1.

Thanks,
Ian.

>From Documentation/SubmittingPatches in the Linux kernel tree:

       Developer's Certificate of Origin 1.1

       By making a contribution to this project, I certify that:

       (a) The contribution was created in whole or in part by me and I
           have the right to submit it under the open source license
           indicated in the file; or

       (b) The contribution is based upon previous work that, to the best
           of my knowledge, is covered under an appropriate open source
           license and I have the right under that license to submit that
           work with modifications, whether created in whole or in part
           by me, under the same open source license (unless I am
           permitted to submit under a different license), as indicated
           in the file; or

       (c) The contribution was provided directly to me by some other
           person who certified (a), (b) or (c) and I have not modified
           it.

       (d) I understand and agree that this project and the contribution
           are public and that a record of the contribution (including all
           personal information I submit with it, including my sign-off) is
           maintained indefinitely and may be redistributed consistent with
           this project or the open source license(s) involved.

-- 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH libxc] Remove set_max_grants in linux
  2011-01-13 12:09               ` Ian Jackson
@ 2011-01-13 12:48                 ` Daniel De Graaf
  0 siblings, 0 replies; 38+ messages in thread
From: Daniel De Graaf @ 2011-01-13 12:48 UTC (permalink / raw)
  To: Ian Jackson; +Cc: jeremy, xen-devel, Ian.Campbell, Konrad Rzeszutek Wilk

On 01/13/2011 07:09 AM, Ian Jackson wrote:
> Daniel De Graaf writes ("[Xen-devel] [PATCH libxc] Remove set_max_grants in linux"):
>> I just removed the linux version, because I didn't want to prevent its use
>> in minios where it is needed to resize the array of grants.
> ...
>> The maximum number of grants is now constrained domain-wide in linux,
>> so set_max_grants should be a noop there. Previously, this constraint
>> was per-file-description.
> 
> Can you provide a Signed-off-By please, as your confirmation of the
> statements in the Developer's Certificate of Origin ?
> 

Whoops, I forgot that mercurial likes to hide those lines.

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>

> Then I'll apply your patch after rc1.
> 
> Thanks,
> Ian.
> 
>>From Documentation/SubmittingPatches in the Linux kernel tree:
> 
>        Developer's Certificate of Origin 1.1
> 
>        By making a contribution to this project, I certify that:
> 
>        (a) The contribution was created in whole or in part by me and I
>            have the right to submit it under the open source license
>            indicated in the file; or
> 
>        (b) The contribution is based upon previous work that, to the best
>            of my knowledge, is covered under an appropriate open source
>            license and I have the right under that license to submit that
>            work with modifications, whether created in whole or in part
>            by me, under the same open source license (unless I am
>            permitted to submit under a different license), as indicated
>            in the file; or
> 
>        (c) The contribution was provided directly to me by some other
>            person who certified (a), (b) or (c) and I have not modified
>            it.
> 
>        (d) I understand and agree that this project and the contribution
>            are public and that a record of the contribution (including all
>            personal information I submit with it, including my sign-off) is
>            maintained indefinitely and may be redistributed consistent with
>            this project or the open source license(s) involved.
> 


-- 
Daniel De Graaf
National Security Agency

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3] Userspace grant communication
  2010-12-17  0:17 [PATCH v3] Userspace grant communication Daniel De Graaf
                   ` (7 preceding siblings ...)
  2011-01-07 11:56 ` [PATCH v3] Userspace grant communication Stefano Stabellini
@ 2011-01-14 15:18 ` Konrad Rzeszutek Wilk
  8 siblings, 0 replies; 38+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-01-14 15:18 UTC (permalink / raw)
  To: Daniel De Graaf; +Cc: jeremy, xen-devel, Ian.Campbell

On Thu, Dec 16, 2010 at 07:17:36PM -0500, Daniel De Graaf wrote:
> Rebased on top of Stefano's 2.6.37 gntdev patches; included suggestions

I was wondering whether you would be OK rebasing this on top
of Linus latest tree? (or if you prefer you could use mine:
devel/next-2.6.38) and re-posting them?

2.6.38 has now the gntdev driver along with one of your patches.


> from the v2 patch queue. HVM mapping is now much cleaner, since pages
> with real GFNs are available. Multicall API didn't make it in, because
> the actual hypercall is within a function that I'm not sure should be
> changed to multicall in all instances, and I haven't had a chance to
> figure out how to use it despite this.
> 
> [PATCH 1/7] xen-gntdev: Fix circular locking dependency
> [PATCH 2/7] xen-gntdev: Change page limit to be global instead of per-open
> [PATCH 3/7] xen-gntdev: Remove unneeded structures from grant_map tracking data
> [PATCH 4/7] xen-gntdev: Use find_vma rather than iterating our vma list manually
> [PATCH 5/7] xen-gntdev: Add reference counting to maps
> [PATCH 6/7] xen-gntdev: Support mapping in HVM domains
> [PATCH 7/7] xen-gntalloc: Userspace grant allocation driver
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH libxc] Remove set_max_grants in linux
  2011-01-11 18:49             ` [PATCH libxc] Remove set_max_grants in linux Daniel De Graaf
  2011-01-12 17:17               ` Ian Jackson
  2011-01-13 12:09               ` Ian Jackson
@ 2011-01-17 17:29               ` Ian Jackson
  2 siblings, 0 replies; 38+ messages in thread
From: Ian Jackson @ 2011-01-17 17:29 UTC (permalink / raw)
  To: Daniel De Graaf; +Cc: jeremy, xen-devel, Ian.Campbell, Konrad Rzeszutek Wilk

Daniel De Graaf writes ("[Xen-devel] [PATCH libxc] Remove set_max_grants in linux"):
> The maximum number of grants is now constrained domain-wide in linux,
> so set_max_grants should be a noop there. Previously, this constraint
> was per-file-description.

I have applied this patch, thanks.

Ian.

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2011-01-17 17:29 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-12-17  0:17 [PATCH v3] Userspace grant communication Daniel De Graaf
2010-12-17  0:17 ` [PATCH 1/7] xen-gntdev: Fix circular locking dependency Daniel De Graaf
2010-12-17  0:17 ` [PATCH 2/7] xen-gntdev: Change page limit to be global instead of per-open Daniel De Graaf
2011-01-10 21:52   ` Konrad Rzeszutek Wilk
2011-01-11 12:45     ` Daniel De Graaf
2011-01-11 17:51       ` Konrad Rzeszutek Wilk
2011-01-11 18:18         ` Daniel De Graaf
2011-01-11 18:21           ` Konrad Rzeszutek Wilk
2011-01-11 18:49             ` [PATCH libxc] Remove set_max_grants in linux Daniel De Graaf
2011-01-12 17:17               ` Ian Jackson
2011-01-12 17:57                 ` Daniel De Graaf
2011-01-13 12:09               ` Ian Jackson
2011-01-13 12:48                 ` Daniel De Graaf
2011-01-17 17:29               ` Ian Jackson
2010-12-17  0:17 ` [PATCH 3/7] xen-gntdev: Remove unneeded structures from grant_map tracking data Daniel De Graaf
2011-01-10 22:14   ` Konrad Rzeszutek Wilk
2011-01-11 13:02     ` Daniel De Graaf
2010-12-17  0:17 ` [PATCH 4/7] xen-gntdev: Use find_vma rather than iterating our vma list manually Daniel De Graaf
2010-12-17  0:17 ` [PATCH 5/7] xen-gntdev: Add reference counting to maps Daniel De Graaf
2010-12-17  0:49   ` Jeremy Fitzhardinge
2010-12-17 15:11     ` Daniel De Graaf
2010-12-17  0:51   ` Jeremy Fitzhardinge
2010-12-17 15:22   ` [PATCH 5/7 v2] " Daniel De Graaf
2011-01-10 22:28     ` Konrad Rzeszutek Wilk
2011-01-10 22:24   ` [PATCH 5/7] " Konrad Rzeszutek Wilk
2011-01-11 11:10     ` Stefano Stabellini
2011-01-11 17:46       ` Konrad Rzeszutek Wilk
2011-01-12 11:58         ` Stefano Stabellini
2010-12-17  0:17 ` [PATCH 6/7] xen-gntdev: Support mapping in HVM domains Daniel De Graaf
2010-12-17 15:22   ` [PATCH 6/7 v2] " Daniel De Graaf
2011-01-10 22:41   ` [PATCH 6/7] " Konrad Rzeszutek Wilk
2011-01-11 13:15     ` Daniel De Graaf
2011-01-11 14:52       ` Daniel De Graaf
2011-01-11 18:00         ` c/s 22402 ("86 hvm: Refuse to perform __hvm_copy() work in atomic context.") breaks HVM, race possible in other code - any ideas? Konrad Rzeszutek Wilk
2011-01-11 18:24           ` Daniel De Graaf
2010-12-17  0:17 ` [PATCH 7/7] xen-gntalloc: Userspace grant allocation driver Daniel De Graaf
2011-01-07 11:56 ` [PATCH v3] Userspace grant communication Stefano Stabellini
2011-01-14 15:18 ` Konrad Rzeszutek Wilk

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.