All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] vhost: MQ live-migration fixes
@ 2017-11-24 17:48 Maxime Coquelin
  2017-11-24 17:48 ` [PATCH 1/3] vhost: fix fd leak in VHOST_USER_SET_LOG_BASE Maxime Coquelin
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Maxime Coquelin @ 2017-11-24 17:48 UTC (permalink / raw)
  To: dev, yliu, tiwei.bie, jianfeng.tan, vkaplans
  Cc: stable, jfreiman, Maxime Coquelin

This 3 patches series fixes issues met when doing live-migration
with multiple queue pairs.

Patch 1 is theorical and unlikely to be reproduced in real use-cases,
so it may be safe not to pick it in stable trees.

Patch 2 reproduces quite often when lots of packets are being processed.
Easiest way to reproduce it is to run DPDK in guest and perform IO
loopback with testpmd. This patch targets both v16.11 & v17.11 stable
trees, and will require a rework for v16.11 as some dirty logging
functions moved from virtio-net.c to vhost.h. I'm not sure of the
process here, but I can provide the v16.11 backport if needed.

Patch 3 is a regression introduced in v17.11. For a reason I have
yet to understand, QEMU sends VHOST_USER_SET_VRING_ADDR requests
when live-migration is initiated. The problem is that the vhost-user
protocol thread has no way to be sure the PMD threads are accessing
the rings or not. As the new addresses sent by QEMU are the same
it sent intially, this patch just ignores them.

Regards,
Maxime

Maxime Coquelin (3):
  vhost: fix fd leak in VHOST_USER_SET_LOG_BASE
  vhost: protect dirty logging against logging base change
  vhost: don't invalidate vrings if new addresses are identical

 lib/librte_vhost/vhost.c      |  2 ++
 lib/librte_vhost/vhost.h      | 10 ++++++++--
 lib/librte_vhost/vhost_user.c | 32 ++++++++++++++++++++++++++++----
 3 files changed, 38 insertions(+), 6 deletions(-)

-- 
2.14.3

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 1/3] vhost: fix fd leak in VHOST_USER_SET_LOG_BASE
  2017-11-24 17:48 [PATCH 0/3] vhost: MQ live-migration fixes Maxime Coquelin
@ 2017-11-24 17:48 ` Maxime Coquelin
  2017-11-24 17:48 ` [PATCH 2/3] vhost: protect dirty logging against logging base change Maxime Coquelin
  2017-11-24 17:48 ` [PATCH 3/3] vhost: don't invalidate vrings if new addresses are identical Maxime Coquelin
  2 siblings, 0 replies; 4+ messages in thread
From: Maxime Coquelin @ 2017-11-24 17:48 UTC (permalink / raw)
  To: dev, yliu, tiwei.bie, jianfeng.tan, vkaplans
  Cc: stable, jfreiman, Maxime Coquelin

If VHOST_USER_SET_LOG_BASE request's message size is invalid,
the fd is leaked.

Fix this by closing the fd systematically as long as it is valid.

Fixes: 53af5b1e0ace ("vhost: fix leak of file descriptor")
Cc: stable@dpdk.org

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/librte_vhost/vhost_user.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index f4c7ce462..f06d9bb65 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -895,6 +895,7 @@ static int
 vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg)
 {
 	int fd = msg->fds[0];
+	int ret = 0;
 	uint64_t size, off;
 	void *addr;
 
@@ -907,7 +908,8 @@ vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg)
 		RTE_LOG(ERR, VHOST_CONFIG,
 			"invalid log base msg size: %"PRId32" != %d\n",
 			msg->size, (int)sizeof(VhostUserLog));
-		return -1;
+		ret = -1;
+		goto out;
 	}
 
 	size = msg->payload.log.mmap_size;
@@ -921,10 +923,10 @@ vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg)
 	 * fail when offset is not page size aligned.
 	 */
 	addr = mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
-	close(fd);
 	if (addr == MAP_FAILED) {
 		RTE_LOG(ERR, VHOST_CONFIG, "mmap log base failed!\n");
-		return -1;
+		ret = -1;
+		goto out;
 	}
 
 	/*
@@ -938,7 +940,10 @@ vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg)
 	dev->log_base = dev->log_addr + off;
 	dev->log_size = size;
 
-	return 0;
+out:
+	close(fd);
+
+	return ret;
 }
 
 /*
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 2/3] vhost: protect dirty logging against logging base change
  2017-11-24 17:48 [PATCH 0/3] vhost: MQ live-migration fixes Maxime Coquelin
  2017-11-24 17:48 ` [PATCH 1/3] vhost: fix fd leak in VHOST_USER_SET_LOG_BASE Maxime Coquelin
@ 2017-11-24 17:48 ` Maxime Coquelin
  2017-11-24 17:48 ` [PATCH 3/3] vhost: don't invalidate vrings if new addresses are identical Maxime Coquelin
  2 siblings, 0 replies; 4+ messages in thread
From: Maxime Coquelin @ 2017-11-24 17:48 UTC (permalink / raw)
  To: dev, yliu, tiwei.bie, jianfeng.tan, vkaplans
  Cc: stable, jfreiman, Maxime Coquelin

When performing live-migration with multiple queue pairs,
VHOST_USER_SET_LOG_BASE request is sent multiple times.

If packets are being processed by the PMD threads, it is
possible that they are setting bits in the dirty log map while
its region is being unmapped by the vhost-user protocol thread.
It results in the following crash:
Thread 3 "lcore-slave-2" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f71ca495700 (LWP 32451)]
0x00000000004bfc8a in vhost_set_bit (addr=0x7f71cbe18432 <error: Cannot access memory at address 0x7f71cbe18432>, nr=1) at /home/max/projects/src/mainline/dpdk/lib/librte_vhost/vhost.h:267
267        __sync_fetch_and_or_8(addr, (1U << nr));

We can see the vhost-user protocol thread just did the unmap of the
dirty log region when it happens.

This patch prevents this by introducing a RW lock to protect
the log base.

Fixes: 54f9e32305d4 ("vhost: handle dirty pages logging request")
Cc: stable@dpdk.org

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/librte_vhost/vhost.c      |  2 ++
 lib/librte_vhost/vhost.h      | 10 ++++++++--
 lib/librte_vhost/vhost_user.c |  4 ++++
 3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index 4f8b73a09..5a7699da0 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -311,6 +311,8 @@ vhost_new_device(void)
 		return -1;
 	}
 
+	rte_rwlock_init(&dev->log_lock);
+
 	vhost_devices[i] = dev;
 	dev->vid = i;
 	dev->slave_req_fd = -1;
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 1cc81c17c..0f76d6495 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -243,6 +243,7 @@ struct virtio_net {
 	uint64_t		log_size;
 	uint64_t		log_base;
 	uint64_t		log_addr;
+	rte_rwlock_t	log_lock;
 	struct ether_addr	mac;
 	uint16_t		mtu;
 
@@ -278,12 +279,14 @@ vhost_log_write(struct virtio_net *dev, uint64_t addr, uint64_t len)
 {
 	uint64_t page;
 
+	rte_rwlock_read_lock(&dev->log_lock);
+
 	if (likely(((dev->features & (1ULL << VHOST_F_LOG_ALL)) == 0) ||
 		   !dev->log_base || !len))
-		return;
+		goto unlock;
 
 	if (unlikely(dev->log_size <= ((addr + len - 1) / VHOST_LOG_PAGE / 8)))
-		return;
+		goto unlock;
 
 	/* To make sure guest memory updates are committed before logging */
 	rte_smp_wmb();
@@ -293,6 +296,9 @@ vhost_log_write(struct virtio_net *dev, uint64_t addr, uint64_t len)
 		vhost_log_page((uint8_t *)(uintptr_t)dev->log_base, page);
 		page += 1;
 	}
+
+unlock:
+	rte_rwlock_read_unlock(&dev->log_lock);
 }
 
 static __rte_always_inline void
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index f06d9bb65..4b03dbbca 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -929,6 +929,8 @@ vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg)
 		goto out;
 	}
 
+	rte_rwlock_write_lock(&dev->log_lock);
+
 	/*
 	 * Free previously mapped log memory on occasionally
 	 * multiple VHOST_USER_SET_LOG_BASE.
@@ -940,6 +942,8 @@ vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg)
 	dev->log_base = dev->log_addr + off;
 	dev->log_size = size;
 
+	rte_rwlock_write_unlock(&dev->log_lock);
+
 out:
 	close(fd);
 
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 3/3] vhost: don't invalidate vrings if new addresses are identical
  2017-11-24 17:48 [PATCH 0/3] vhost: MQ live-migration fixes Maxime Coquelin
  2017-11-24 17:48 ` [PATCH 1/3] vhost: fix fd leak in VHOST_USER_SET_LOG_BASE Maxime Coquelin
  2017-11-24 17:48 ` [PATCH 2/3] vhost: protect dirty logging against logging base change Maxime Coquelin
@ 2017-11-24 17:48 ` Maxime Coquelin
  2 siblings, 0 replies; 4+ messages in thread
From: Maxime Coquelin @ 2017-11-24 17:48 UTC (permalink / raw)
  To: dev, yliu, tiwei.bie, jianfeng.tan, vkaplans
  Cc: stable, jfreiman, Maxime Coquelin

In VHOST_USER_SET_VRING_ADDR handling, don't invalidate the vring
if it has already been mapped and new addresses are identical.

When initiating live-migration, VHOST_USER_SET_VRING_ADDR is sent
again by QEMU, but the queues are enabled, so invalidating them
can result in NULL pointer de-referencing. In this case, it is
not needed to perform the invalidation, as the new addresses provided
by QEMU are indentical to the initial ones.

Fixes: eefac9536a90 ("vhost: postpone device creation until rings are mapped")
Cc: stable@dpdk.org

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/librte_vhost/vhost_user.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 4b03dbbca..29a431687 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -436,6 +436,14 @@ translate_ring_addresses(struct virtio_net *dev, int vq_index)
 	return dev;
 }
 
+static int
+is_same_vring_addrs(struct vhost_vring_addr *a1, struct vhost_vring_addr *a2)
+{
+	return ((a1->desc_user_addr == a2->desc_user_addr) &&
+			(a1->used_user_addr == a2->used_user_addr) &&
+			(a1->avail_user_addr == a2->avail_user_addr));
+}
+
 /*
  * The virtio device sends us the desc, used and avail ring addresses.
  * This function then converts these to our address space.
@@ -453,6 +461,13 @@ vhost_user_set_vring_addr(struct virtio_net **pdev, VhostUserMsg *msg)
 	/* addr->index refers to the queue index. The txq 1, rxq is 0. */
 	vq = dev->virtqueue[msg->payload.addr.index];
 
+	/*
+	 * If it is trying to set the same rings addresses, don't invalidate as
+	 * PMD threads might be using them.
+	 */
+	if (is_same_vring_addrs(addr, &vq->ring_addrs))
+		return 0;
+
 	/*
 	 * Rings addresses should not be interpreted as long as the ring is not
 	 * started and enabled
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-11-24 17:48 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-24 17:48 [PATCH 0/3] vhost: MQ live-migration fixes Maxime Coquelin
2017-11-24 17:48 ` [PATCH 1/3] vhost: fix fd leak in VHOST_USER_SET_LOG_BASE Maxime Coquelin
2017-11-24 17:48 ` [PATCH 2/3] vhost: protect dirty logging against logging base change Maxime Coquelin
2017-11-24 17:48 ` [PATCH 3/3] vhost: don't invalidate vrings if new addresses are identical Maxime Coquelin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.