linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/3] vsock/virtio: several fixes in the .probe() and .remove()
@ 2019-06-28 12:36 Stefano Garzarella
  2019-06-28 12:36 ` [PATCH v2 1/3] vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock Stefano Garzarella
                   ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Stefano Garzarella @ 2019-06-28 12:36 UTC (permalink / raw)
  To: netdev
  Cc: kvm, virtualization, Stefan Hajnoczi, Michael S. Tsirkin,
	David S. Miller, Jason Wang, linux-kernel

During the review of "[PATCH] vsock/virtio: Initialize core virtio vsock
before registering the driver", Stefan pointed out some possible issues
in the .probe() and .remove() callbacks of the virtio-vsock driver.

This series tries to solve these issues:
- Patch 1 adds RCU critical sections to avoid use-after-free of the
  'the_virtio_vsock' pointer.
- Patch 2 stops the workers before calling vdev->config->reset(vdev), to
  be sure that no one is accessing the device.
- Patch 3 moves the flush of the works to the end of .remove() to avoid
  use-after-free of the 'vsock' object.

v2:
- Patch 1: use RCU to protect 'the_virtio_vsock' pointer
- Patch 2: no changes
- Patch 3: flush works only at the end of .remove()
- Removed patch 4 because virtqueue_detach_unused_buf() returns all the
  allocated buffers.

v1: https://patchwork.kernel.org/cover/10964733/

Stefano Garzarella (3):
  vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock
  vsock/virtio: stop workers during the .remove()
  vsock/virtio: fix flush of works during the .remove()

 net/vmw_vsock/virtio_transport.c | 131 ++++++++++++++++++++++++-------
 1 file changed, 102 insertions(+), 29 deletions(-)

-- 
2.20.1



* [PATCH v2 1/3] vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock
  2019-06-28 12:36 [PATCH v2 0/3] vsock/virtio: several fixes in the .probe() and .remove() Stefano Garzarella
@ 2019-06-28 12:36 ` Stefano Garzarella
  2019-07-01 14:54   ` Stefan Hajnoczi
                     ` (2 more replies)
  2019-06-28 12:36 ` [PATCH v2 2/3] vsock/virtio: stop workers during the .remove() Stefano Garzarella
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 19+ messages in thread
From: Stefano Garzarella @ 2019-06-28 12:36 UTC (permalink / raw)
  To: netdev
  Cc: kvm, virtualization, Stefan Hajnoczi, Michael S. Tsirkin,
	David S. Miller, Jason Wang, linux-kernel

Some callbacks used by the upper layers can run while we are in
.remove(). A use-after-free can happen because we free
the_virtio_vsock without knowing whether the callbacks have finished.

To solve this issue, we move the assignment of the_virtio_vsock to the
end of .probe(), when all the initialization is complete, and its
clearing to the beginning of .remove(), before releasing resources.
For the same reason, we do the same for vdev->priv.

We use RCU to be sure that all callbacks that use the_virtio_vsock
have ended before freeing it. This is not required for callbacks that
use vdev->priv, because after vdev->config->del_vqs() we are sure
that they have ended and will no longer be invoked.

We also take the mutex during .remove() to prevent .probe() from
running while we are resetting the device.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
 net/vmw_vsock/virtio_transport.c | 67 +++++++++++++++++++++-----------
 1 file changed, 44 insertions(+), 23 deletions(-)

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index 9c287e3e393c..7ad510ec12e0 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -65,19 +65,22 @@ struct virtio_vsock {
 	u32 guest_cid;
 };
 
-static struct virtio_vsock *virtio_vsock_get(void)
-{
-	return the_virtio_vsock;
-}
-
 static u32 virtio_transport_get_local_cid(void)
 {
-	struct virtio_vsock *vsock = virtio_vsock_get();
+	struct virtio_vsock *vsock;
+	u32 ret;
 
-	if (!vsock)
-		return VMADDR_CID_ANY;
+	rcu_read_lock();
+	vsock = rcu_dereference(the_virtio_vsock);
+	if (!vsock) {
+		ret = VMADDR_CID_ANY;
+		goto out_rcu;
+	}
 
-	return vsock->guest_cid;
+	ret = vsock->guest_cid;
+out_rcu:
+	rcu_read_unlock();
+	return ret;
 }
 
 static void virtio_transport_loopback_work(struct work_struct *work)
@@ -197,14 +200,18 @@ virtio_transport_send_pkt(struct virtio_vsock_pkt *pkt)
 	struct virtio_vsock *vsock;
 	int len = pkt->len;
 
-	vsock = virtio_vsock_get();
+	rcu_read_lock();
+	vsock = rcu_dereference(the_virtio_vsock);
 	if (!vsock) {
 		virtio_transport_free_pkt(pkt);
-		return -ENODEV;
+		len = -ENODEV;
+		goto out_rcu;
 	}
 
-	if (le64_to_cpu(pkt->hdr.dst_cid) == vsock->guest_cid)
-		return virtio_transport_send_pkt_loopback(vsock, pkt);
+	if (le64_to_cpu(pkt->hdr.dst_cid) == vsock->guest_cid) {
+		len = virtio_transport_send_pkt_loopback(vsock, pkt);
+		goto out_rcu;
+	}
 
 	if (pkt->reply)
 		atomic_inc(&vsock->queued_replies);
@@ -214,6 +221,9 @@ virtio_transport_send_pkt(struct virtio_vsock_pkt *pkt)
 	spin_unlock_bh(&vsock->send_pkt_list_lock);
 
 	queue_work(virtio_vsock_workqueue, &vsock->send_pkt_work);
+
+out_rcu:
+	rcu_read_unlock();
 	return len;
 }
 
@@ -222,12 +232,14 @@ virtio_transport_cancel_pkt(struct vsock_sock *vsk)
 {
 	struct virtio_vsock *vsock;
 	struct virtio_vsock_pkt *pkt, *n;
-	int cnt = 0;
+	int cnt = 0, ret;
 	LIST_HEAD(freeme);
 
-	vsock = virtio_vsock_get();
+	rcu_read_lock();
+	vsock = rcu_dereference(the_virtio_vsock);
 	if (!vsock) {
-		return -ENODEV;
+		ret = -ENODEV;
+		goto out_rcu;
 	}
 
 	spin_lock_bh(&vsock->send_pkt_list_lock);
@@ -255,7 +267,11 @@ virtio_transport_cancel_pkt(struct vsock_sock *vsk)
 			queue_work(virtio_vsock_workqueue, &vsock->rx_work);
 	}
 
-	return 0;
+	ret = 0;
+
+out_rcu:
+	rcu_read_unlock();
+	return ret;
 }
 
 static void virtio_vsock_rx_fill(struct virtio_vsock *vsock)
@@ -590,8 +606,6 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
 	vsock->rx_buf_max_nr = 0;
 	atomic_set(&vsock->queued_replies, 0);
 
-	vdev->priv = vsock;
-	the_virtio_vsock = vsock;
 	mutex_init(&vsock->tx_lock);
 	mutex_init(&vsock->rx_lock);
 	mutex_init(&vsock->event_lock);
@@ -613,6 +627,9 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
 	virtio_vsock_event_fill(vsock);
 	mutex_unlock(&vsock->event_lock);
 
+	vdev->priv = vsock;
+	rcu_assign_pointer(the_virtio_vsock, vsock);
+
 	mutex_unlock(&the_virtio_vsock_mutex);
 	return 0;
 
@@ -627,6 +644,12 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
 	struct virtio_vsock *vsock = vdev->priv;
 	struct virtio_vsock_pkt *pkt;
 
+	mutex_lock(&the_virtio_vsock_mutex);
+
+	vdev->priv = NULL;
+	rcu_assign_pointer(the_virtio_vsock, NULL);
+	synchronize_rcu();
+
 	flush_work(&vsock->loopback_work);
 	flush_work(&vsock->rx_work);
 	flush_work(&vsock->tx_work);
@@ -666,12 +689,10 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
 	}
 	spin_unlock_bh(&vsock->loopback_list_lock);
 
-	mutex_lock(&the_virtio_vsock_mutex);
-	the_virtio_vsock = NULL;
-	mutex_unlock(&the_virtio_vsock_mutex);
-
 	vdev->config->del_vqs(vdev);
 
+	mutex_unlock(&the_virtio_vsock_mutex);
+
 	kfree(vsock);
 }
 
-- 
2.20.1



* [PATCH v2 2/3] vsock/virtio: stop workers during the .remove()
  2019-06-28 12:36 [PATCH v2 0/3] vsock/virtio: several fixes in the .probe() and .remove() Stefano Garzarella
  2019-06-28 12:36 ` [PATCH v2 1/3] vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock Stefano Garzarella
@ 2019-06-28 12:36 ` Stefano Garzarella
  2019-07-04  4:00   ` Jason Wang
  2019-06-28 12:36 ` [PATCH v2 3/3] vsock/virtio: fix flush of works " Stefano Garzarella
  2019-07-01 15:11 ` [PATCH v2 0/3] vsock/virtio: several fixes in the .probe() and .remove() Stefan Hajnoczi
  3 siblings, 1 reply; 19+ messages in thread
From: Stefano Garzarella @ 2019-06-28 12:36 UTC (permalink / raw)
  To: netdev
  Cc: kvm, virtualization, Stefan Hajnoczi, Michael S. Tsirkin,
	David S. Miller, Jason Wang, linux-kernel

Before calling vdev->config->reset(vdev) we need to be sure that
no one is accessing the device. For this reason, we add new variables
to struct virtio_vsock to stop the workers during .remove().

This patch also adds a few comments before vdev->config->reset(vdev)
and vdev->config->del_vqs(vdev).

Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
 net/vmw_vsock/virtio_transport.c | 51 +++++++++++++++++++++++++++++++-
 1 file changed, 50 insertions(+), 1 deletion(-)

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index 7ad510ec12e0..1b44ec6f3f6c 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -38,6 +38,7 @@ struct virtio_vsock {
 	 * must be accessed with tx_lock held.
 	 */
 	struct mutex tx_lock;
+	bool tx_run;
 
 	struct work_struct send_pkt_work;
 	spinlock_t send_pkt_list_lock;
@@ -53,6 +54,7 @@ struct virtio_vsock {
 	 * must be accessed with rx_lock held.
 	 */
 	struct mutex rx_lock;
+	bool rx_run;
 	int rx_buf_nr;
 	int rx_buf_max_nr;
 
@@ -60,6 +62,7 @@ struct virtio_vsock {
 	 * vqs[VSOCK_VQ_EVENT] must be accessed with event_lock held.
 	 */
 	struct mutex event_lock;
+	bool event_run;
 	struct virtio_vsock_event event_list[8];
 
 	u32 guest_cid;
@@ -94,6 +97,10 @@ static void virtio_transport_loopback_work(struct work_struct *work)
 	spin_unlock_bh(&vsock->loopback_list_lock);
 
 	mutex_lock(&vsock->rx_lock);
+
+	if (!vsock->rx_run)
+		goto out;
+
 	while (!list_empty(&pkts)) {
 		struct virtio_vsock_pkt *pkt;
 
@@ -102,6 +109,7 @@ static void virtio_transport_loopback_work(struct work_struct *work)
 
 		virtio_transport_recv_pkt(pkt);
 	}
+out:
 	mutex_unlock(&vsock->rx_lock);
 }
 
@@ -130,6 +138,9 @@ virtio_transport_send_pkt_work(struct work_struct *work)
 
 	mutex_lock(&vsock->tx_lock);
 
+	if (!vsock->tx_run)
+		goto out;
+
 	vq = vsock->vqs[VSOCK_VQ_TX];
 
 	for (;;) {
@@ -188,6 +199,7 @@ virtio_transport_send_pkt_work(struct work_struct *work)
 	if (added)
 		virtqueue_kick(vq);
 
+out:
 	mutex_unlock(&vsock->tx_lock);
 
 	if (restart_rx)
@@ -323,6 +335,10 @@ static void virtio_transport_tx_work(struct work_struct *work)
 
 	vq = vsock->vqs[VSOCK_VQ_TX];
 	mutex_lock(&vsock->tx_lock);
+
+	if (!vsock->tx_run)
+		goto out;
+
 	do {
 		struct virtio_vsock_pkt *pkt;
 		unsigned int len;
@@ -333,6 +349,8 @@ static void virtio_transport_tx_work(struct work_struct *work)
 			added = true;
 		}
 	} while (!virtqueue_enable_cb(vq));
+
+out:
 	mutex_unlock(&vsock->tx_lock);
 
 	if (added)
@@ -361,6 +379,9 @@ static void virtio_transport_rx_work(struct work_struct *work)
 
 	mutex_lock(&vsock->rx_lock);
 
+	if (!vsock->rx_run)
+		goto out;
+
 	do {
 		virtqueue_disable_cb(vq);
 		for (;;) {
@@ -470,6 +491,9 @@ static void virtio_transport_event_work(struct work_struct *work)
 
 	mutex_lock(&vsock->event_lock);
 
+	if (!vsock->event_run)
+		goto out;
+
 	do {
 		struct virtio_vsock_event *event;
 		unsigned int len;
@@ -484,7 +508,7 @@ static void virtio_transport_event_work(struct work_struct *work)
 	} while (!virtqueue_enable_cb(vq));
 
 	virtqueue_kick(vsock->vqs[VSOCK_VQ_EVENT]);
-
+out:
 	mutex_unlock(&vsock->event_lock);
 }
 
@@ -619,12 +643,18 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
 	INIT_WORK(&vsock->send_pkt_work, virtio_transport_send_pkt_work);
 	INIT_WORK(&vsock->loopback_work, virtio_transport_loopback_work);
 
+	mutex_lock(&vsock->tx_lock);
+	vsock->tx_run = true;
+	mutex_unlock(&vsock->tx_lock);
+
 	mutex_lock(&vsock->rx_lock);
 	virtio_vsock_rx_fill(vsock);
+	vsock->rx_run = true;
 	mutex_unlock(&vsock->rx_lock);
 
 	mutex_lock(&vsock->event_lock);
 	virtio_vsock_event_fill(vsock);
+	vsock->event_run = true;
 	mutex_unlock(&vsock->event_lock);
 
 	vdev->priv = vsock;
@@ -659,6 +689,24 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
 	/* Reset all connected sockets when the device disappear */
 	vsock_for_each_connected_socket(virtio_vsock_reset_sock);
 
+	/* Stop all work handlers to make sure no one is accessing the device,
+	 * so we can safely call vdev->config->reset().
+	 */
+	mutex_lock(&vsock->rx_lock);
+	vsock->rx_run = false;
+	mutex_unlock(&vsock->rx_lock);
+
+	mutex_lock(&vsock->tx_lock);
+	vsock->tx_run = false;
+	mutex_unlock(&vsock->tx_lock);
+
+	mutex_lock(&vsock->event_lock);
+	vsock->event_run = false;
+	mutex_unlock(&vsock->event_lock);
+
+	/* Flush all device writes and interrupts, device will not use any
+	 * more buffers.
+	 */
 	vdev->config->reset(vdev);
 
 	mutex_lock(&vsock->rx_lock);
@@ -689,6 +737,7 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
 	}
 	spin_unlock_bh(&vsock->loopback_list_lock);
 
+	/* Delete virtqueues and flush outstanding callbacks if any */
 	vdev->config->del_vqs(vdev);
 
 	mutex_unlock(&the_virtio_vsock_mutex);
-- 
2.20.1
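
The run-flag pattern used by this patch can be sketched in userspace roughly as follows. It is an illustration under assumptions, not the driver's code: a pthread mutex stands in for tx_lock/rx_lock/event_lock, and struct queue_state with its queue_*() helpers are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one queue of struct virtio_vsock. */
struct queue_state {
	pthread_mutex_t lock;	/* plays the role of tx_lock */
	bool run;		/* plays the role of tx_run, guarded by lock */
	int processed;		/* work done while running, guarded by lock */
};

static void queue_init(struct queue_state *q)
{
	assert(q != NULL);
	pthread_mutex_init(&q->lock, NULL);
	q->run = false;
	q->processed = 0;
}

/* .probe() side: allow the workers to touch the queue. */
static void queue_start(struct queue_state *q)
{
	pthread_mutex_lock(&q->lock);
	q->run = true;
	pthread_mutex_unlock(&q->lock);
}

/* .remove() side: once this returns, no worker is inside the critical
 * section, and any worker that runs later bails out immediately.
 */
static void queue_stop(struct queue_state *q)
{
	pthread_mutex_lock(&q->lock);
	q->run = false;
	pthread_mutex_unlock(&q->lock);
}

/* Worker body: returns true only if it was allowed to do its work. */
static bool queue_work_fn(struct queue_state *q)
{
	bool did_work = false;

	pthread_mutex_lock(&q->lock);
	if (!q->run)
		goto out;

	q->processed++;
	did_work = true;
out:
	pthread_mutex_unlock(&q->lock);
	return did_work;
}
```

Because the flag is only read and written with the lock held, clearing it also waits for a worker that is already in the middle of its loop, which is what makes the later device reset safe in this model.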



* [PATCH v2 3/3] vsock/virtio: fix flush of works during the .remove()
  2019-06-28 12:36 [PATCH v2 0/3] vsock/virtio: several fixes in the .probe() and .remove() Stefano Garzarella
  2019-06-28 12:36 ` [PATCH v2 1/3] vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock Stefano Garzarella
  2019-06-28 12:36 ` [PATCH v2 2/3] vsock/virtio: stop workers during the .remove() Stefano Garzarella
@ 2019-06-28 12:36 ` Stefano Garzarella
  2019-07-01 15:08   ` Stefan Hajnoczi
  2019-07-01 15:09   ` Stefan Hajnoczi
  2019-07-01 15:11 ` [PATCH v2 0/3] vsock/virtio: several fixes in the .probe() and .remove() Stefan Hajnoczi
  3 siblings, 2 replies; 19+ messages in thread
From: Stefano Garzarella @ 2019-06-28 12:36 UTC (permalink / raw)
  To: netdev
  Cc: kvm, virtualization, Stefan Hajnoczi, Michael S. Tsirkin,
	David S. Miller, Jason Wang, linux-kernel

This patch moves the flush of the works after vdev->config->del_vqs(vdev),
because we need to be sure that no workers are running before we free the
'vsock' object.

Since we stopped the workers using the [tx|rx|event]_run flags,
we are sure no one is accessing the device while we are calling
vdev->config->reset(vdev), so we can safely move the workers' flush.

Before vdev->config->del_vqs(vdev), workers can be scheduled
by VQ callbacks, so we must flush them after del_vqs() to avoid
use-after-free of the 'vsock' object.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
 net/vmw_vsock/virtio_transport.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index 1b44ec6f3f6c..96dafa978268 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -680,12 +680,6 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
 	rcu_assign_pointer(the_virtio_vsock, NULL);
 	synchronize_rcu();
 
-	flush_work(&vsock->loopback_work);
-	flush_work(&vsock->rx_work);
-	flush_work(&vsock->tx_work);
-	flush_work(&vsock->event_work);
-	flush_work(&vsock->send_pkt_work);
-
 	/* Reset all connected sockets when the device disappear */
 	vsock_for_each_connected_socket(virtio_vsock_reset_sock);
 
@@ -740,6 +734,15 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
 	/* Delete virtqueues and flush outstanding callbacks if any */
 	vdev->config->del_vqs(vdev);
 
+	/* Other works can be queued before 'config->del_vqs()', so we flush
+	 * all works before to free the vsock object to avoid use after free.
+	 */
+	flush_work(&vsock->loopback_work);
+	flush_work(&vsock->rx_work);
+	flush_work(&vsock->tx_work);
+	flush_work(&vsock->event_work);
+	flush_work(&vsock->send_pkt_work);
+
 	mutex_unlock(&the_virtio_vsock_mutex);
 
 	kfree(vsock);
-- 
2.20.1
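
Why the flush has to come after del_vqs() can be shown with a deliberately simplified, single-threaded model. Everything here is hypothetical (struct fake_dev and the fake_*() helpers are not kernel APIs); the model only captures that a callback may queue work until the callbacks are removed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device with one work item. */
struct fake_dev {
	bool callbacks_enabled;	/* true until the del_vqs() stand-in runs */
	bool work_pending;	/* a work item is queued and not yet run */
	int work_runs;		/* how many times the work item ran */
};

/* Stand-in for a VQ callback: queues work while callbacks exist. */
static bool fake_vq_callback(struct fake_dev *d)
{
	assert(d != NULL);
	if (!d->callbacks_enabled)
		return false;
	d->work_pending = true;
	return true;
}

/* Stand-in for flush_work(): runs the pending work item, if any. */
static void fake_flush_work(struct fake_dev *d)
{
	if (d->work_pending) {
		d->work_pending = false;
		d->work_runs++;
	}
}

/* Stand-in for del_vqs(): no callback can queue work afterwards. */
static void fake_del_vqs(struct fake_dev *d)
{
	d->callbacks_enabled = false;
}

/* Freeing is only safe when no work item still references the object. */
static bool fake_safe_to_free(const struct fake_dev *d)
{
	return !d->work_pending;
}
```

In this model, flushing first and deleting the queues afterwards leaves a window in which a callback queues work that nothing flushes; deleting the queues first and flushing afterwards closes that window.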



* Re: [PATCH v2 1/3] vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock
  2019-06-28 12:36 ` [PATCH v2 1/3] vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock Stefano Garzarella
@ 2019-07-01 14:54   ` Stefan Hajnoczi
  2019-07-01 15:10   ` Stefan Hajnoczi
  2019-07-03  9:53   ` Jason Wang
  2 siblings, 0 replies; 19+ messages in thread
From: Stefan Hajnoczi @ 2019-07-01 14:54 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: netdev, kvm, Michael S. Tsirkin, linux-kernel, virtualization,
	Stefan Hajnoczi, David S. Miller

On Fri, Jun 28, 2019 at 02:36:57PM +0200, Stefano Garzarella wrote:
> Some callbacks used by the upper layers can run while we are in the
> .remove(). A potential use-after-free can happen, because we free
> the_virtio_vsock without knowing if the callbacks are over or not.
> 
> To solve this issue we move the assignment of the_virtio_vsock at the
> end of .probe(), when we finished all the initialization, and at the
> beginning of .remove(), before to release resources.
> For the same reason, we do the same also for the vdev->priv.
> 
> We use RCU to be sure that all callbacks that use the_virtio_vsock
> ended before freeing it. This is not required for callbacks that
> use vdev->priv, because after the vdev->config->del_vqs() we are sure
> that they are ended and will no longer be invoked.

->del_vqs() is only called at the very end; did you forget to move it
earlier?

In particular, the virtqueue handler callbacks schedule a workqueue.
The work functions use container_of() to get vsock.  We need to be sure
that the work item isn't freed along with vsock while the work item is
still pending.

How do we know that the virtqueue handler is never called in such a way
that it sees vsock != NULL (there is no explicit memory barrier on the
read side) and then schedules a work item after flush_work() has run?

Stefan


* Re: [PATCH v2 3/3] vsock/virtio: fix flush of works during the .remove()
  2019-06-28 12:36 ` [PATCH v2 3/3] vsock/virtio: fix flush of works " Stefano Garzarella
@ 2019-07-01 15:08   ` Stefan Hajnoczi
  2019-07-01 15:09   ` Stefan Hajnoczi
  1 sibling, 0 replies; 19+ messages in thread
From: Stefan Hajnoczi @ 2019-07-01 15:08 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: netdev, kvm, Michael S. Tsirkin, linux-kernel, virtualization,
	Stefan Hajnoczi, David S. Miller

On Fri, Jun 28, 2019 at 02:36:59PM +0200, Stefano Garzarella wrote:
> This patch moves the flush of works after vdev->config->del_vqs(vdev),
> because we need to be sure that no workers run before to free the
> 'vsock' object.
> 
> Since we stopped the workers using the [tx|rx|event]_run flags,
> we are sure no one is accessing the device while we are calling
> vdev->config->reset(vdev), so we can safely move the workers' flush.

What about send_pkt and loopback work?  How were they stopped safely?

For example, if send_pkt work executes then we're in trouble since it
accesses the tx virtqueue which is deleted by ->del_vqs().


* Re: [PATCH v2 3/3] vsock/virtio: fix flush of works during the .remove()
  2019-06-28 12:36 ` [PATCH v2 3/3] vsock/virtio: fix flush of works " Stefano Garzarella
  2019-07-01 15:08   ` Stefan Hajnoczi
@ 2019-07-01 15:09   ` Stefan Hajnoczi
  1 sibling, 0 replies; 19+ messages in thread
From: Stefan Hajnoczi @ 2019-07-01 15:09 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: netdev, kvm, Michael S. Tsirkin, linux-kernel, virtualization,
	Stefan Hajnoczi, David S. Miller

On Fri, Jun 28, 2019 at 02:36:59PM +0200, Stefano Garzarella wrote:
> This patch moves the flush of works after vdev->config->del_vqs(vdev),
> because we need to be sure that no workers run before to free the
> 'vsock' object.
> 
> Since we stopped the workers using the [tx|rx|event]_run flags,
> we are sure no one is accessing the device while we are calling
> vdev->config->reset(vdev), so we can safely move the workers' flush.
> 
> Before the vdev->config->del_vqs(vdev), workers can be scheduled
> by VQ callbacks, so we must flush them after del_vqs(), to avoid
> use-after-free of 'vsock' object.

Never mind, I looked back at Patch 2 and saw that the send_pkt and
loopback work functions were also updated.  Thanks!

Stefan


* Re: [PATCH v2 1/3] vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock
  2019-06-28 12:36 ` [PATCH v2 1/3] vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock Stefano Garzarella
  2019-07-01 14:54   ` Stefan Hajnoczi
@ 2019-07-01 15:10   ` Stefan Hajnoczi
  2019-07-03  9:53   ` Jason Wang
  2 siblings, 0 replies; 19+ messages in thread
From: Stefan Hajnoczi @ 2019-07-01 15:10 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: netdev, kvm, Michael S. Tsirkin, linux-kernel, virtualization,
	Stefan Hajnoczi, David S. Miller

On Fri, Jun 28, 2019 at 02:36:57PM +0200, Stefano Garzarella wrote:
> Some callbacks used by the upper layers can run while we are in the
> .remove(). A potential use-after-free can happen, because we free
> the_virtio_vsock without knowing if the callbacks are over or not.
> 
> To solve this issue we move the assignment of the_virtio_vsock at the
> end of .probe(), when we finished all the initialization, and at the
> beginning of .remove(), before to release resources.
> For the same reason, we do the same also for the vdev->priv.
> 
> We use RCU to be sure that all callbacks that use the_virtio_vsock
> ended before freeing it. This is not required for callbacks that
> use vdev->priv, because after the vdev->config->del_vqs() we are sure
> that they are ended and will no longer be invoked.

My question is answered in Patch 3.


* Re: [PATCH v2 0/3] vsock/virtio: several fixes in the .probe() and .remove()
  2019-06-28 12:36 [PATCH v2 0/3] vsock/virtio: several fixes in the .probe() and .remove() Stefano Garzarella
                   ` (2 preceding siblings ...)
  2019-06-28 12:36 ` [PATCH v2 3/3] vsock/virtio: fix flush of works " Stefano Garzarella
@ 2019-07-01 15:11 ` Stefan Hajnoczi
  2019-07-01 17:03   ` Stefano Garzarella
  3 siblings, 1 reply; 19+ messages in thread
From: Stefan Hajnoczi @ 2019-07-01 15:11 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: netdev, kvm, Michael S. Tsirkin, linux-kernel, virtualization,
	Stefan Hajnoczi, David S. Miller

On Fri, Jun 28, 2019 at 02:36:56PM +0200, Stefano Garzarella wrote:
> During the review of "[PATCH] vsock/virtio: Initialize core virtio vsock
> before registering the driver", Stefan pointed out some possible issues
> in the .probe() and .remove() callbacks of the virtio-vsock driver.
> 
> This series tries to solve these issues:
> - Patch 1 adds RCU critical sections to avoid use-after-free of
>   'the_virtio_vsock' pointer.
> - Patch 2 stops workers before to call vdev->config->reset(vdev) to
>   be sure that no one is accessing the device.
> - Patch 3 moves the works flush at the end of the .remove() to avoid
>   use-after-free of 'vsock' object.
> 
> v2:
> - Patch 1: use RCU to protect 'the_virtio_vsock' pointer
> - Patch 2: no changes
> - Patch 3: flush works only at the end of .remove()
> - Removed patch 4 because virtqueue_detach_unused_buf() returns all the buffers
>   allocated.
> 
> v1: https://patchwork.kernel.org/cover/10964733/

This looks good to me.

Did you run any stress tests?  For example, an SMP guest constantly
connecting and sending packets together with a script that
hotplugs/unplugs vhost-vsock-pci from the host side.

Stefan


* Re: [PATCH v2 0/3] vsock/virtio: several fixes in the .probe() and .remove()
  2019-07-01 15:11 ` [PATCH v2 0/3] vsock/virtio: several fixes in the .probe() and .remove() Stefan Hajnoczi
@ 2019-07-01 17:03   ` Stefano Garzarella
  2019-07-03  9:14     ` Stefan Hajnoczi
  0 siblings, 1 reply; 19+ messages in thread
From: Stefano Garzarella @ 2019-07-01 17:03 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: netdev, kvm, Michael S. Tsirkin, linux-kernel, virtualization,
	Stefan Hajnoczi, David S. Miller

On Mon, Jul 01, 2019 at 04:11:13PM +0100, Stefan Hajnoczi wrote:
> On Fri, Jun 28, 2019 at 02:36:56PM +0200, Stefano Garzarella wrote:
> > During the review of "[PATCH] vsock/virtio: Initialize core virtio vsock
> > before registering the driver", Stefan pointed out some possible issues
> > in the .probe() and .remove() callbacks of the virtio-vsock driver.
> > 
> > This series tries to solve these issues:
> > - Patch 1 adds RCU critical sections to avoid use-after-free of
> >   'the_virtio_vsock' pointer.
> > - Patch 2 stops workers before to call vdev->config->reset(vdev) to
> >   be sure that no one is accessing the device.
> > - Patch 3 moves the works flush at the end of the .remove() to avoid
> >   use-after-free of 'vsock' object.
> > 
> > v2:
> > - Patch 1: use RCU to protect 'the_virtio_vsock' pointer
> > - Patch 2: no changes
> > - Patch 3: flush works only at the end of .remove()
> > - Removed patch 4 because virtqueue_detach_unused_buf() returns all the buffers
> >   allocated.
> > 
> > v1: https://patchwork.kernel.org/cover/10964733/
> 
> This looks good to me.

Thanks for the review!

> 
> Did you run any stress tests?  For example an SMP guest constantly
> connecting and sending packets together with a script that
> hotplug/unplugs vhost-vsock-pci from the host side.

Yes, I started an SMP guest (-smp 4 -monitor tcp:127.0.0.1:1234,server,nowait)
and ran these scripts to stress the .probe()/.remove() path:

- guest
  while true; do
      cat /dev/urandom | nc-vsock -l 4321 > /dev/null &
      cat /dev/urandom | nc-vsock -l 5321 > /dev/null &
      cat /dev/urandom | nc-vsock -l 6321 > /dev/null &
      cat /dev/urandom | nc-vsock -l 7321 > /dev/null &
      wait
  done

- host
  while true; do
      cat /dev/urandom | nc-vsock 3 4321 > /dev/null &
      cat /dev/urandom | nc-vsock 3 5321 > /dev/null &
      cat /dev/urandom | nc-vsock 3 6321 > /dev/null &
      cat /dev/urandom | nc-vsock 3 7321 > /dev/null &
      sleep 2
      echo "device_del v1" | nc 127.0.0.1 1234
      sleep 1
      echo "device_add vhost-vsock-pci,id=v1,guest-cid=3" | nc 127.0.0.1 1234
      sleep 1
  done

Do you think this is enough, or would a more accurate test be better?

Thanks,
Stefano


* Re: [PATCH v2 0/3] vsock/virtio: several fixes in the .probe() and .remove()
  2019-07-01 17:03   ` Stefano Garzarella
@ 2019-07-03  9:14     ` Stefan Hajnoczi
  2019-07-03 10:07       ` Stefano Garzarella
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Hajnoczi @ 2019-07-03  9:14 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: netdev, kvm, Michael S. Tsirkin, linux-kernel, virtualization,
	Stefan Hajnoczi, David S. Miller

On Mon, Jul 01, 2019 at 07:03:57PM +0200, Stefano Garzarella wrote:
> On Mon, Jul 01, 2019 at 04:11:13PM +0100, Stefan Hajnoczi wrote:
> > On Fri, Jun 28, 2019 at 02:36:56PM +0200, Stefano Garzarella wrote:
> > > During the review of "[PATCH] vsock/virtio: Initialize core virtio vsock
> > > before registering the driver", Stefan pointed out some possible issues
> > > in the .probe() and .remove() callbacks of the virtio-vsock driver.
> > > 
> > > This series tries to solve these issues:
> > > - Patch 1 adds RCU critical sections to avoid use-after-free of
> > >   'the_virtio_vsock' pointer.
> > > - Patch 2 stops workers before to call vdev->config->reset(vdev) to
> > >   be sure that no one is accessing the device.
> > > - Patch 3 moves the works flush at the end of the .remove() to avoid
> > >   use-after-free of 'vsock' object.
> > > 
> > > v2:
> > > - Patch 1: use RCU to protect 'the_virtio_vsock' pointer
> > > - Patch 2: no changes
> > > - Patch 3: flush works only at the end of .remove()
> > > - Removed patch 4 because virtqueue_detach_unused_buf() returns all the buffers
> > >   allocated.
> > > 
> > > v1: https://patchwork.kernel.org/cover/10964733/
> > 
> > This looks good to me.
> 
> Thanks for the review!
> 
> > 
> > Did you run any stress tests?  For example an SMP guest constantly
> > connecting and sending packets together with a script that
> > hotplug/unplugs vhost-vsock-pci from the host side.
> 
> Yes, I started an SMP guest (-smp 4 -monitor tcp:127.0.0.1:1234,server,nowait)
> and I run these scripts to stress the .probe()/.remove() path:
> 
> - guest
>   while true; do
>       cat /dev/urandom | nc-vsock -l 4321 > /dev/null &
>       cat /dev/urandom | nc-vsock -l 5321 > /dev/null &
>       cat /dev/urandom | nc-vsock -l 6321 > /dev/null &
>       cat /dev/urandom | nc-vsock -l 7321 > /dev/null &
>       wait
>   done
> 
> - host
>   while true; do
>       cat /dev/urandom | nc-vsock 3 4321 > /dev/null &
>       cat /dev/urandom | nc-vsock 3 5321 > /dev/null &
>       cat /dev/urandom | nc-vsock 3 6321 > /dev/null &
>       cat /dev/urandom | nc-vsock 3 7321 > /dev/null &
>       sleep 2
>       echo "device_del v1" | nc 127.0.0.1 1234
>       sleep 1
>       echo "device_add vhost-vsock-pci,id=v1,guest-cid=3" | nc 127.0.0.1 1234
>       sleep 1
>   done
> 
> Do you think is enough or is better to have a test more accurate?

That's good when left running overnight so that thousands of hotplug
events are tested.

Stefan


* Re: [PATCH v2 1/3] vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock
  2019-06-28 12:36 ` [PATCH v2 1/3] vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock Stefano Garzarella
  2019-07-01 14:54   ` Stefan Hajnoczi
  2019-07-01 15:10   ` Stefan Hajnoczi
@ 2019-07-03  9:53   ` Jason Wang
  2019-07-03 10:41     ` Stefano Garzarella
  2 siblings, 1 reply; 19+ messages in thread
From: Jason Wang @ 2019-07-03  9:53 UTC (permalink / raw)
  To: Stefano Garzarella, netdev
  Cc: kvm, virtualization, Stefan Hajnoczi, Michael S. Tsirkin,
	David S. Miller, linux-kernel


On 2019/6/28 8:36 PM, Stefano Garzarella wrote:
> Some callbacks used by the upper layers can run while we are in the
> .remove(). A potential use-after-free can happen, because we free
> the_virtio_vsock without knowing if the callbacks are over or not.
>
> To solve this issue we move the assignment of the_virtio_vsock at the
> end of .probe(), when we finished all the initialization, and at the
> beginning of .remove(), before to release resources.
> For the same reason, we do the same also for the vdev->priv.
>
> We use RCU to be sure that all callbacks that use the_virtio_vsock
> ended before freeing it. This is not required for callbacks that
> use vdev->priv, because after the vdev->config->del_vqs() we are sure
> that they are ended and will no longer be invoked.
>
> We also take the mutex during the .remove() to avoid that .probe() can
> run while we are resetting the device.
>
> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> ---
>   net/vmw_vsock/virtio_transport.c | 67 +++++++++++++++++++++-----------
>   1 file changed, 44 insertions(+), 23 deletions(-)
>
> diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> index 9c287e3e393c..7ad510ec12e0 100644
> --- a/net/vmw_vsock/virtio_transport.c
> +++ b/net/vmw_vsock/virtio_transport.c
> @@ -65,19 +65,22 @@ struct virtio_vsock {
>   	u32 guest_cid;
>   };
>   
> -static struct virtio_vsock *virtio_vsock_get(void)
> -{
> -	return the_virtio_vsock;
> -}
> -
>   static u32 virtio_transport_get_local_cid(void)
>   {
> -	struct virtio_vsock *vsock = virtio_vsock_get();
> +	struct virtio_vsock *vsock;
> +	u32 ret;
>   
> -	if (!vsock)
> -		return VMADDR_CID_ANY;
> +	rcu_read_lock();
> +	vsock = rcu_dereference(the_virtio_vsock);
> +	if (!vsock) {
> +		ret = VMADDR_CID_ANY;
> +		goto out_rcu;
> +	}
>   
> -	return vsock->guest_cid;
> +	ret = vsock->guest_cid;
> +out_rcu:
> +	rcu_read_unlock();
> +	return ret;
>   }
>   
>   static void virtio_transport_loopback_work(struct work_struct *work)
> @@ -197,14 +200,18 @@ virtio_transport_send_pkt(struct virtio_vsock_pkt *pkt)
>   	struct virtio_vsock *vsock;
>   	int len = pkt->len;
>   
> -	vsock = virtio_vsock_get();
> +	rcu_read_lock();
> +	vsock = rcu_dereference(the_virtio_vsock);
>   	if (!vsock) {
>   		virtio_transport_free_pkt(pkt);
> -		return -ENODEV;
> +		len = -ENODEV;
> +		goto out_rcu;
>   	}
>   
> -	if (le64_to_cpu(pkt->hdr.dst_cid) == vsock->guest_cid)
> -		return virtio_transport_send_pkt_loopback(vsock, pkt);
> +	if (le64_to_cpu(pkt->hdr.dst_cid) == vsock->guest_cid) {
> +		len = virtio_transport_send_pkt_loopback(vsock, pkt);
> +		goto out_rcu;
> +	}
>   
>   	if (pkt->reply)
>   		atomic_inc(&vsock->queued_replies);
> @@ -214,6 +221,9 @@ virtio_transport_send_pkt(struct virtio_vsock_pkt *pkt)
>   	spin_unlock_bh(&vsock->send_pkt_list_lock);
>   
>   	queue_work(virtio_vsock_workqueue, &vsock->send_pkt_work);
> +
> +out_rcu:
> +	rcu_read_unlock();
>   	return len;
>   }
>   
> @@ -222,12 +232,14 @@ virtio_transport_cancel_pkt(struct vsock_sock *vsk)
>   {
>   	struct virtio_vsock *vsock;
>   	struct virtio_vsock_pkt *pkt, *n;
> -	int cnt = 0;
> +	int cnt = 0, ret;
>   	LIST_HEAD(freeme);
>   
> -	vsock = virtio_vsock_get();
> +	rcu_read_lock();
> +	vsock = rcu_dereference(the_virtio_vsock);
>   	if (!vsock) {
> -		return -ENODEV;
> +		ret = -ENODEV;
> +		goto out_rcu;
>   	}
>   
>   	spin_lock_bh(&vsock->send_pkt_list_lock);
> @@ -255,7 +267,11 @@ virtio_transport_cancel_pkt(struct vsock_sock *vsk)
>   			queue_work(virtio_vsock_workqueue, &vsock->rx_work);
>   	}
>   
> -	return 0;
> +	ret = 0;
> +
> +out_rcu:
> +	rcu_read_unlock();
> +	return ret;
>   }
>   
>   static void virtio_vsock_rx_fill(struct virtio_vsock *vsock)
> @@ -590,8 +606,6 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
>   	vsock->rx_buf_max_nr = 0;
>   	atomic_set(&vsock->queued_replies, 0);
>   
> -	vdev->priv = vsock;
> -	the_virtio_vsock = vsock;
>   	mutex_init(&vsock->tx_lock);
>   	mutex_init(&vsock->rx_lock);
>   	mutex_init(&vsock->event_lock);
> @@ -613,6 +627,9 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
>   	virtio_vsock_event_fill(vsock);
>   	mutex_unlock(&vsock->event_lock);
>   
> +	vdev->priv = vsock;
> +	rcu_assign_pointer(the_virtio_vsock, vsock);


You probably need to use rcu_dereference_protected() to access
the_virtio_vsock in this function, to keep sparse from complaining.


> +
>   	mutex_unlock(&the_virtio_vsock_mutex);
>   	return 0;
>   
> @@ -627,6 +644,12 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
>   	struct virtio_vsock *vsock = vdev->priv;
>   	struct virtio_vsock_pkt *pkt;
>   
> +	mutex_lock(&the_virtio_vsock_mutex);
> +
> +	vdev->priv = NULL;
> +	rcu_assign_pointer(the_virtio_vsock, NULL);


This still looks suspicious: can the_virtio_vsock be reached through
vdev->priv? If so, we may still get a use-after-free, since that access
is not protected by RCU.

Another, more interesting question: I believe we will keep the
virtio_vsock structure as a singleton. Then what's the point of using
vdev->priv to access the_virtio_vsock? It looks to me like it only makes
synchronization harder.

Thanks


> +	synchronize_rcu();
> +
>   	flush_work(&vsock->loopback_work);
>   	flush_work(&vsock->rx_work);
>   	flush_work(&vsock->tx_work);
> @@ -666,12 +689,10 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
>   	}
>   	spin_unlock_bh(&vsock->loopback_list_lock);
>   
> -	mutex_lock(&the_virtio_vsock_mutex);
> -	the_virtio_vsock = NULL;
> -	mutex_unlock(&the_virtio_vsock_mutex);
> -
>   	vdev->config->del_vqs(vdev);
>   
> +	mutex_unlock(&the_virtio_vsock_mutex);
> +
>   	kfree(vsock);
>   }
>   

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/3] vsock/virtio: several fixes in the .probe() and .remove()
  2019-07-03  9:14     ` Stefan Hajnoczi
@ 2019-07-03 10:07       ` Stefano Garzarella
  0 siblings, 0 replies; 19+ messages in thread
From: Stefano Garzarella @ 2019-07-03 10:07 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: netdev, kvm, Michael S. Tsirkin, linux-kernel, virtualization,
	Stefan Hajnoczi, David S. Miller

On Wed, Jul 03, 2019 at 10:14:53AM +0100, Stefan Hajnoczi wrote:
> On Mon, Jul 01, 2019 at 07:03:57PM +0200, Stefano Garzarella wrote:
> > On Mon, Jul 01, 2019 at 04:11:13PM +0100, Stefan Hajnoczi wrote:
> > > On Fri, Jun 28, 2019 at 02:36:56PM +0200, Stefano Garzarella wrote:
> > > > During the review of "[PATCH] vsock/virtio: Initialize core virtio vsock
> > > > before registering the driver", Stefan pointed out some possible issues
> > > > in the .probe() and .remove() callbacks of the virtio-vsock driver.
> > > > 
> > > > This series tries to solve these issues:
> > > > - Patch 1 adds RCU critical sections to avoid use-after-free of
> > > >   'the_virtio_vsock' pointer.
> > > > - Patch 2 stops workers before to call vdev->config->reset(vdev) to
> > > >   be sure that no one is accessing the device.
> > > > - Patch 3 moves the works flush at the end of the .remove() to avoid
> > > >   use-after-free of 'vsock' object.
> > > > 
> > > > v2:
> > > > - Patch 1: use RCU to protect 'the_virtio_vsock' pointer
> > > > - Patch 2: no changes
> > > > - Patch 3: flush works only at the end of .remove()
> > > > - Removed patch 4 because virtqueue_detach_unused_buf() returns all the buffers
> > > >   allocated.
> > > > 
> > > > v1: https://patchwork.kernel.org/cover/10964733/
> > > 
> > > This looks good to me.
> > 
> > Thanks for the review!
> > 
> > > 
> > > Did you run any stress tests?  For example an SMP guest constantly
> > > connecting and sending packets together with a script that
> > > hotplug/unplugs vhost-vsock-pci from the host side.
> > 
> > Yes, I started an SMP guest (-smp 4 -monitor tcp:127.0.0.1:1234,server,nowait)
> > and I run these scripts to stress the .probe()/.remove() path:
> > 
> > - guest
> >   while true; do
> >       cat /dev/urandom | nc-vsock -l 4321 > /dev/null &
> >       cat /dev/urandom | nc-vsock -l 5321 > /dev/null &
> >       cat /dev/urandom | nc-vsock -l 6321 > /dev/null &
> >       cat /dev/urandom | nc-vsock -l 7321 > /dev/null &
> >       wait
> >   done
> > 
> > - host
> >   while true; do
> >       cat /dev/urandom | nc-vsock 3 4321 > /dev/null &
> >       cat /dev/urandom | nc-vsock 3 5321 > /dev/null &
> >       cat /dev/urandom | nc-vsock 3 6321 > /dev/null &
> >       cat /dev/urandom | nc-vsock 3 7321 > /dev/null &
> >       sleep 2
> >       echo "device_del v1" | nc 127.0.0.1 1234
> >       sleep 1
> >       echo "device_add vhost-vsock-pci,id=v1,guest-cid=3" | nc 127.0.0.1 1234
> >       sleep 1
> >   done
> > 
> > Do you think is enough or is better to have a test more accurate?
> 
> That's good when left running overnight so that thousands of hotplug
> events are tested.

Honestly, I ran the test for ~30 mins (because without the patch the
crash happens in a few seconds), but of course, I'll run it overnight tonight :)

Thanks,
Stefano

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/3] vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock
  2019-07-03  9:53   ` Jason Wang
@ 2019-07-03 10:41     ` Stefano Garzarella
  2019-07-04  3:58       ` Jason Wang
  2019-07-04 10:17       ` Stefan Hajnoczi
  0 siblings, 2 replies; 19+ messages in thread
From: Stefano Garzarella @ 2019-07-03 10:41 UTC (permalink / raw)
  To: Jason Wang, Stefan Hajnoczi
  Cc: netdev, kvm, virtualization, Michael S. Tsirkin, David S. Miller,
	linux-kernel

On Wed, Jul 03, 2019 at 05:53:58PM +0800, Jason Wang wrote:
> 
> On 2019/6/28 8:36 PM, Stefano Garzarella wrote:
> > Some callbacks used by the upper layers can run while we are in the
> > .remove(). A potential use-after-free can happen, because we free
> > the_virtio_vsock without knowing if the callbacks are over or not.
> > 
> > To solve this issue we move the assignment of the_virtio_vsock at the
> > end of .probe(), when we finished all the initialization, and at the
> > beginning of .remove(), before to release resources.
> > For the same reason, we do the same also for the vdev->priv.
> > 
> > We use RCU to be sure that all callbacks that use the_virtio_vsock
> > ended before freeing it. This is not required for callbacks that
> > use vdev->priv, because after the vdev->config->del_vqs() we are sure
> > that they are ended and will no longer be invoked.
> > 
> > We also take the mutex during the .remove() to avoid that .probe() can
> > run while we are resetting the device.
> > 
> > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> > ---
> >   net/vmw_vsock/virtio_transport.c | 67 +++++++++++++++++++++-----------
> >   1 file changed, 44 insertions(+), 23 deletions(-)
> > 
> > diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> > index 9c287e3e393c..7ad510ec12e0 100644
> > --- a/net/vmw_vsock/virtio_transport.c
> > +++ b/net/vmw_vsock/virtio_transport.c
> > @@ -65,19 +65,22 @@ struct virtio_vsock {
> >   	u32 guest_cid;
> >   };
> > -static struct virtio_vsock *virtio_vsock_get(void)
> > -{
> > -	return the_virtio_vsock;
> > -}
> > -
> >   static u32 virtio_transport_get_local_cid(void)
> >   {
> > -	struct virtio_vsock *vsock = virtio_vsock_get();
> > +	struct virtio_vsock *vsock;
> > +	u32 ret;
> > -	if (!vsock)
> > -		return VMADDR_CID_ANY;
> > +	rcu_read_lock();
> > +	vsock = rcu_dereference(the_virtio_vsock);
> > +	if (!vsock) {
> > +		ret = VMADDR_CID_ANY;
> > +		goto out_rcu;
> > +	}
> > -	return vsock->guest_cid;
> > +	ret = vsock->guest_cid;
> > +out_rcu:
> > +	rcu_read_unlock();
> > +	return ret;
> >   }
> >   static void virtio_transport_loopback_work(struct work_struct *work)
> > @@ -197,14 +200,18 @@ virtio_transport_send_pkt(struct virtio_vsock_pkt *pkt)
> >   	struct virtio_vsock *vsock;
> >   	int len = pkt->len;
> > -	vsock = virtio_vsock_get();
> > +	rcu_read_lock();
> > +	vsock = rcu_dereference(the_virtio_vsock);
> >   	if (!vsock) {
> >   		virtio_transport_free_pkt(pkt);
> > -		return -ENODEV;
> > +		len = -ENODEV;
> > +		goto out_rcu;
> >   	}
> > -	if (le64_to_cpu(pkt->hdr.dst_cid) == vsock->guest_cid)
> > -		return virtio_transport_send_pkt_loopback(vsock, pkt);
> > +	if (le64_to_cpu(pkt->hdr.dst_cid) == vsock->guest_cid) {
> > +		len = virtio_transport_send_pkt_loopback(vsock, pkt);
> > +		goto out_rcu;
> > +	}
> >   	if (pkt->reply)
> >   		atomic_inc(&vsock->queued_replies);
> > @@ -214,6 +221,9 @@ virtio_transport_send_pkt(struct virtio_vsock_pkt *pkt)
> >   	spin_unlock_bh(&vsock->send_pkt_list_lock);
> >   	queue_work(virtio_vsock_workqueue, &vsock->send_pkt_work);
> > +
> > +out_rcu:
> > +	rcu_read_unlock();
> >   	return len;
> >   }
> > @@ -222,12 +232,14 @@ virtio_transport_cancel_pkt(struct vsock_sock *vsk)
> >   {
> >   	struct virtio_vsock *vsock;
> >   	struct virtio_vsock_pkt *pkt, *n;
> > -	int cnt = 0;
> > +	int cnt = 0, ret;
> >   	LIST_HEAD(freeme);
> > -	vsock = virtio_vsock_get();
> > +	rcu_read_lock();
> > +	vsock = rcu_dereference(the_virtio_vsock);
> >   	if (!vsock) {
> > -		return -ENODEV;
> > +		ret = -ENODEV;
> > +		goto out_rcu;
> >   	}
> >   	spin_lock_bh(&vsock->send_pkt_list_lock);
> > @@ -255,7 +267,11 @@ virtio_transport_cancel_pkt(struct vsock_sock *vsk)
> >   			queue_work(virtio_vsock_workqueue, &vsock->rx_work);
> >   	}
> > -	return 0;
> > +	ret = 0;
> > +
> > +out_rcu:
> > +	rcu_read_unlock();
> > +	return ret;
> >   }
> >   static void virtio_vsock_rx_fill(struct virtio_vsock *vsock)
> > @@ -590,8 +606,6 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
> >   	vsock->rx_buf_max_nr = 0;
> >   	atomic_set(&vsock->queued_replies, 0);
> > -	vdev->priv = vsock;
> > -	the_virtio_vsock = vsock;
> >   	mutex_init(&vsock->tx_lock);
> >   	mutex_init(&vsock->rx_lock);
> >   	mutex_init(&vsock->event_lock);
> > @@ -613,6 +627,9 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
> >   	virtio_vsock_event_fill(vsock);
> >   	mutex_unlock(&vsock->event_lock);
> > +	vdev->priv = vsock;
> > +	rcu_assign_pointer(the_virtio_vsock, vsock);
> 
> 
> You probably need to use rcu_dereference_protected() to access
> the_virtio_vsock in this function, to keep sparse from complaining.
> 

Ooo, thanks!

Do you mean when we check if the_virtio_vsock is not null at the beginning of
virtio_vsock_probe()?

> 
> > +
> >   	mutex_unlock(&the_virtio_vsock_mutex);
> >   	return 0;
> > @@ -627,6 +644,12 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
> >   	struct virtio_vsock *vsock = vdev->priv;
> >   	struct virtio_vsock_pkt *pkt;
> > +	mutex_lock(&the_virtio_vsock_mutex);
> > +
> > +	vdev->priv = NULL;
> > +	rcu_assign_pointer(the_virtio_vsock, NULL);
> 
> 
> This still looks suspicious: can the_virtio_vsock be reached through vdev->priv?
> If so, we may still get a use-after-free, since that access is not protected by RCU.

We free the object only after calling del_vqs(), so we are sure
that the vq callbacks have ended and will no longer be invoked.
So, IIUC, it shouldn't happen.

> 
> Another, more interesting question: I believe we will keep the
> virtio_vsock structure as a singleton. Then what's the point of using
> vdev->priv to access the_virtio_vsock? It looks to me like it only makes
> synchronization harder.

I thought about it when I tried to use RCU to stop the worker, and I
think it makes sense. Maybe it can be another series after this one is merged.

@Stefan, what do you think about that?

Thanks,
Stefano

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/3] vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock
  2019-07-03 10:41     ` Stefano Garzarella
@ 2019-07-04  3:58       ` Jason Wang
  2019-07-04  9:20         ` Stefano Garzarella
  2019-07-04 10:17       ` Stefan Hajnoczi
  1 sibling, 1 reply; 19+ messages in thread
From: Jason Wang @ 2019-07-04  3:58 UTC (permalink / raw)
  To: Stefano Garzarella, Stefan Hajnoczi
  Cc: netdev, kvm, virtualization, Michael S. Tsirkin, David S. Miller,
	linux-kernel


On 2019/7/3 6:41 PM, Stefano Garzarella wrote:
> On Wed, Jul 03, 2019 at 05:53:58PM +0800, Jason Wang wrote:
>> On 2019/6/28 8:36 PM, Stefano Garzarella wrote:
>>> Some callbacks used by the upper layers can run while we are in the
>>> .remove(). A potential use-after-free can happen, because we free
>>> the_virtio_vsock without knowing if the callbacks are over or not.
>>>
>>> To solve this issue we move the assignment of the_virtio_vsock at the
>>> end of .probe(), when we finished all the initialization, and at the
>>> beginning of .remove(), before to release resources.
>>> For the same reason, we do the same also for the vdev->priv.
>>>
>>> We use RCU to be sure that all callbacks that use the_virtio_vsock
>>> ended before freeing it. This is not required for callbacks that
>>> use vdev->priv, because after the vdev->config->del_vqs() we are sure
>>> that they are ended and will no longer be invoked.
>>>
>>> We also take the mutex during the .remove() to avoid that .probe() can
>>> run while we are resetting the device.
>>>
>>> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
>>> ---
>>>    net/vmw_vsock/virtio_transport.c | 67 +++++++++++++++++++++-----------
>>>    1 file changed, 44 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>>> index 9c287e3e393c..7ad510ec12e0 100644
>>> --- a/net/vmw_vsock/virtio_transport.c
>>> +++ b/net/vmw_vsock/virtio_transport.c
>>> @@ -65,19 +65,22 @@ struct virtio_vsock {
>>>    	u32 guest_cid;
>>>    };
>>> -static struct virtio_vsock *virtio_vsock_get(void)
>>> -{
>>> -	return the_virtio_vsock;
>>> -}
>>> -
>>>    static u32 virtio_transport_get_local_cid(void)
>>>    {
>>> -	struct virtio_vsock *vsock = virtio_vsock_get();
>>> +	struct virtio_vsock *vsock;
>>> +	u32 ret;
>>> -	if (!vsock)
>>> -		return VMADDR_CID_ANY;
>>> +	rcu_read_lock();
>>> +	vsock = rcu_dereference(the_virtio_vsock);
>>> +	if (!vsock) {
>>> +		ret = VMADDR_CID_ANY;
>>> +		goto out_rcu;
>>> +	}
>>> -	return vsock->guest_cid;
>>> +	ret = vsock->guest_cid;
>>> +out_rcu:
>>> +	rcu_read_unlock();
>>> +	return ret;
>>>    }
>>>    static void virtio_transport_loopback_work(struct work_struct *work)
>>> @@ -197,14 +200,18 @@ virtio_transport_send_pkt(struct virtio_vsock_pkt *pkt)
>>>    	struct virtio_vsock *vsock;
>>>    	int len = pkt->len;
>>> -	vsock = virtio_vsock_get();
>>> +	rcu_read_lock();
>>> +	vsock = rcu_dereference(the_virtio_vsock);
>>>    	if (!vsock) {
>>>    		virtio_transport_free_pkt(pkt);
>>> -		return -ENODEV;
>>> +		len = -ENODEV;
>>> +		goto out_rcu;
>>>    	}
>>> -	if (le64_to_cpu(pkt->hdr.dst_cid) == vsock->guest_cid)
>>> -		return virtio_transport_send_pkt_loopback(vsock, pkt);
>>> +	if (le64_to_cpu(pkt->hdr.dst_cid) == vsock->guest_cid) {
>>> +		len = virtio_transport_send_pkt_loopback(vsock, pkt);
>>> +		goto out_rcu;
>>> +	}
>>>    	if (pkt->reply)
>>>    		atomic_inc(&vsock->queued_replies);
>>> @@ -214,6 +221,9 @@ virtio_transport_send_pkt(struct virtio_vsock_pkt *pkt)
>>>    	spin_unlock_bh(&vsock->send_pkt_list_lock);
>>>    	queue_work(virtio_vsock_workqueue, &vsock->send_pkt_work);
>>> +
>>> +out_rcu:
>>> +	rcu_read_unlock();
>>>    	return len;
>>>    }
>>> @@ -222,12 +232,14 @@ virtio_transport_cancel_pkt(struct vsock_sock *vsk)
>>>    {
>>>    	struct virtio_vsock *vsock;
>>>    	struct virtio_vsock_pkt *pkt, *n;
>>> -	int cnt = 0;
>>> +	int cnt = 0, ret;
>>>    	LIST_HEAD(freeme);
>>> -	vsock = virtio_vsock_get();
>>> +	rcu_read_lock();
>>> +	vsock = rcu_dereference(the_virtio_vsock);
>>>    	if (!vsock) {
>>> -		return -ENODEV;
>>> +		ret = -ENODEV;
>>> +		goto out_rcu;
>>>    	}
>>>    	spin_lock_bh(&vsock->send_pkt_list_lock);
>>> @@ -255,7 +267,11 @@ virtio_transport_cancel_pkt(struct vsock_sock *vsk)
>>>    			queue_work(virtio_vsock_workqueue, &vsock->rx_work);
>>>    	}
>>> -	return 0;
>>> +	ret = 0;
>>> +
>>> +out_rcu:
>>> +	rcu_read_unlock();
>>> +	return ret;
>>>    }
>>>    static void virtio_vsock_rx_fill(struct virtio_vsock *vsock)
>>> @@ -590,8 +606,6 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
>>>    	vsock->rx_buf_max_nr = 0;
>>>    	atomic_set(&vsock->queued_replies, 0);
>>> -	vdev->priv = vsock;
>>> -	the_virtio_vsock = vsock;
>>>    	mutex_init(&vsock->tx_lock);
>>>    	mutex_init(&vsock->rx_lock);
>>>    	mutex_init(&vsock->event_lock);
>>> @@ -613,6 +627,9 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
>>>    	virtio_vsock_event_fill(vsock);
>>>    	mutex_unlock(&vsock->event_lock);
>>> +	vdev->priv = vsock;
>>> +	rcu_assign_pointer(the_virtio_vsock, vsock);
>>
>> You probably need to use rcu_dereference_protected() to access
>> the_virtio_vsock in this function, to keep sparse from complaining.
>>
> Ooo, thanks!
>
> Do you mean when we check if the_virtio_vsock is not null at the beginning of
> virtio_vsock_probe()?


I mean instead of:

     /* Only one virtio-vsock device per guest is supported */
     if (the_virtio_vsock) {
         ret = -EBUSY;
         goto out;
     }

you should use:

     if (rcu_dereference_protected(the_virtio_vsock,
                 lockdep_is_held(&the_virtio_vsock_mutex)))

...
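
For readers unfamiliar with the idiom: rcu_dereference_protected() is the
update-side accessor, and its second argument is a lockdep condition
naming the lock that makes the plain read safe. What follows is only a
minimal userspace sketch of that contract, not kernel code. All names in
it (struct vsock_model, the_mutex_held, deref_protected(), model_probe())
are hypothetical, and a simple bool stands in for lockdep_is_held().

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct virtio_vsock. */
struct vsock_model {
	unsigned int guest_cid;
};

static pthread_mutex_t the_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool the_mutex_held;	/* assumption: models lockdep_is_held() */
static _Atomic(struct vsock_model *) the_vsock;	/* models the_virtio_vsock */

/* Update-side read: only legal while the mutex is held, so check that. */
static struct vsock_model *deref_protected(void)
{
	assert(the_mutex_held);
	return atomic_load_explicit(&the_vsock, memory_order_relaxed);
}

/* Models the singleton check at the beginning of .probe(). */
static int model_probe(struct vsock_model *vsock)
{
	int ret = 0;

	pthread_mutex_lock(&the_mutex);
	the_mutex_held = true;

	/* Only one device is supported: refuse a second registration. */
	if (deref_protected()) {
		ret = -EBUSY;
		goto out;
	}

	/* Publish the fully initialised object (models rcu_assign_pointer()). */
	atomic_store_explicit(&the_vsock, vsock, memory_order_release);
out:
	the_mutex_held = false;
	pthread_mutex_unlock(&the_mutex);
	return ret;
}
```

In this model the first model_probe() call publishes the object and a
second call returns -EBUSY, which mirrors the check discussed above.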


>
>>> +
>>>    	mutex_unlock(&the_virtio_vsock_mutex);
>>>    	return 0;
>>> @@ -627,6 +644,12 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
>>>    	struct virtio_vsock *vsock = vdev->priv;
>>>    	struct virtio_vsock_pkt *pkt;
>>> +	mutex_lock(&the_virtio_vsock_mutex);
>>> +
>>> +	vdev->priv = NULL;
>>> +	rcu_assign_pointer(the_virtio_vsock, NULL);
>>
>> This still looks suspicious: can the_virtio_vsock be reached through vdev->priv?
>> If so, we may still get a use-after-free, since that access is not protected by RCU.
> We free the object only after calling del_vqs(), so we are sure
> that the vq callbacks have ended and will no longer be invoked.
> So, IIUC, it shouldn't happen.


Yes, but any dereference that is not done in vq_callbacks will be very 
dangerous in the future.

Thanks


>
>> Another, more interesting question: I believe we will keep the
>> virtio_vsock structure as a singleton. Then what's the point of using
>> vdev->priv to access the_virtio_vsock? It looks to me like it only makes
>> synchronization harder.
> I thought about it when I tried to use RCU to stop the worker, and I
> think it makes sense. Maybe it can be another series after this one is merged.
>
> @Stefan, what do you think about that?
>
> Thanks,
> Stefano

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/3] vsock/virtio: stop workers during the .remove()
  2019-06-28 12:36 ` [PATCH v2 2/3] vsock/virtio: stop workers during the .remove() Stefano Garzarella
@ 2019-07-04  4:00   ` Jason Wang
  0 siblings, 0 replies; 19+ messages in thread
From: Jason Wang @ 2019-07-04  4:00 UTC (permalink / raw)
  To: Stefano Garzarella, netdev
  Cc: kvm, virtualization, Stefan Hajnoczi, Michael S. Tsirkin,
	David S. Miller, linux-kernel


On 2019/6/28 8:36 PM, Stefano Garzarella wrote:
> Before to call vdev->config->reset(vdev) we need to be sure that
> no one is accessing the device, for this reason, we add new variables
> in the struct virtio_vsock to stop the workers during the .remove().
>
> This patch also add few comments before vdev->config->reset(vdev)
> and vdev->config->del_vqs(vdev).
>
> Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> ---
>   net/vmw_vsock/virtio_transport.c | 51 +++++++++++++++++++++++++++++++-
>   1 file changed, 50 insertions(+), 1 deletion(-)


This should work. But in the future we may consider converting
the_virtio_vsock to a socket object and using the socket refcnt and
destructor, instead of inventing something new ourselves.
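
To make the refcnt-plus-destructor idea concrete, here is a minimal
userspace sketch under stated assumptions. It is not the socket API;
struct obj_model, obj_model_get() and obj_model_put() are hypothetical
names that only illustrate how the last put, rather than .remove()
itself, would end up freeing the object.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical refcounted object with a destructor callback. */
struct obj_model {
	atomic_int refcnt;
	void (*destruct)(struct obj_model *obj);
};

static int destroyed_count;	/* lets a caller observe destructor runs */

static void obj_model_destruct(struct obj_model *obj)
{
	destroyed_count++;
	free(obj);
}

/* Allocate an object that starts with one reference, owned by the caller. */
static struct obj_model *obj_model_new(void)
{
	struct obj_model *obj = malloc(sizeof(*obj));

	if (!obj)
		return NULL;
	atomic_init(&obj->refcnt, 1);
	obj->destruct = obj_model_destruct;
	return obj;
}

/* Take an extra reference, e.g. before a work item uses the object. */
static void obj_model_get(struct obj_model *obj)
{
	atomic_fetch_add(&obj->refcnt, 1);
}

/* Drop a reference; whoever drops the last one runs the destructor. */
static void obj_model_put(struct obj_model *obj)
{
	if (atomic_fetch_sub(&obj->refcnt, 1) == 1)
		obj->destruct(obj);
}
```

With this shape, a remove path would only drop its own reference, and a
worker still holding one keeps the object alive until it is done.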

Thanks


>
> diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> index 7ad510ec12e0..1b44ec6f3f6c 100644
> --- a/net/vmw_vsock/virtio_transport.c
> +++ b/net/vmw_vsock/virtio_transport.c
> @@ -38,6 +38,7 @@ struct virtio_vsock {
>   	 * must be accessed with tx_lock held.
>   	 */
>   	struct mutex tx_lock;
> +	bool tx_run;
>   
>   	struct work_struct send_pkt_work;
>   	spinlock_t send_pkt_list_lock;
> @@ -53,6 +54,7 @@ struct virtio_vsock {
>   	 * must be accessed with rx_lock held.
>   	 */
>   	struct mutex rx_lock;
> +	bool rx_run;
>   	int rx_buf_nr;
>   	int rx_buf_max_nr;
>   
> @@ -60,6 +62,7 @@ struct virtio_vsock {
>   	 * vqs[VSOCK_VQ_EVENT] must be accessed with event_lock held.
>   	 */
>   	struct mutex event_lock;
> +	bool event_run;
>   	struct virtio_vsock_event event_list[8];
>   
>   	u32 guest_cid;
> @@ -94,6 +97,10 @@ static void virtio_transport_loopback_work(struct work_struct *work)
>   	spin_unlock_bh(&vsock->loopback_list_lock);
>   
>   	mutex_lock(&vsock->rx_lock);
> +
> +	if (!vsock->rx_run)
> +		goto out;
> +
>   	while (!list_empty(&pkts)) {
>   		struct virtio_vsock_pkt *pkt;
>   
> @@ -102,6 +109,7 @@ static void virtio_transport_loopback_work(struct work_struct *work)
>   
>   		virtio_transport_recv_pkt(pkt);
>   	}
> +out:
>   	mutex_unlock(&vsock->rx_lock);
>   }
>   
> @@ -130,6 +138,9 @@ virtio_transport_send_pkt_work(struct work_struct *work)
>   
>   	mutex_lock(&vsock->tx_lock);
>   
> +	if (!vsock->tx_run)
> +		goto out;
> +
>   	vq = vsock->vqs[VSOCK_VQ_TX];
>   
>   	for (;;) {
> @@ -188,6 +199,7 @@ virtio_transport_send_pkt_work(struct work_struct *work)
>   	if (added)
>   		virtqueue_kick(vq);
>   
> +out:
>   	mutex_unlock(&vsock->tx_lock);
>   
>   	if (restart_rx)
> @@ -323,6 +335,10 @@ static void virtio_transport_tx_work(struct work_struct *work)
>   
>   	vq = vsock->vqs[VSOCK_VQ_TX];
>   	mutex_lock(&vsock->tx_lock);
> +
> +	if (!vsock->tx_run)
> +		goto out;
> +
>   	do {
>   		struct virtio_vsock_pkt *pkt;
>   		unsigned int len;
> @@ -333,6 +349,8 @@ static void virtio_transport_tx_work(struct work_struct *work)
>   			added = true;
>   		}
>   	} while (!virtqueue_enable_cb(vq));
> +
> +out:
>   	mutex_unlock(&vsock->tx_lock);
>   
>   	if (added)
> @@ -361,6 +379,9 @@ static void virtio_transport_rx_work(struct work_struct *work)
>   
>   	mutex_lock(&vsock->rx_lock);
>   
> +	if (!vsock->rx_run)
> +		goto out;
> +
>   	do {
>   		virtqueue_disable_cb(vq);
>   		for (;;) {
> @@ -470,6 +491,9 @@ static void virtio_transport_event_work(struct work_struct *work)
>   
>   	mutex_lock(&vsock->event_lock);
>   
> +	if (!vsock->event_run)
> +		goto out;
> +
>   	do {
>   		struct virtio_vsock_event *event;
>   		unsigned int len;
> @@ -484,7 +508,7 @@ static void virtio_transport_event_work(struct work_struct *work)
>   	} while (!virtqueue_enable_cb(vq));
>   
>   	virtqueue_kick(vsock->vqs[VSOCK_VQ_EVENT]);
> -
> +out:
>   	mutex_unlock(&vsock->event_lock);
>   }
>   
> @@ -619,12 +643,18 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
>   	INIT_WORK(&vsock->send_pkt_work, virtio_transport_send_pkt_work);
>   	INIT_WORK(&vsock->loopback_work, virtio_transport_loopback_work);
>   
> +	mutex_lock(&vsock->tx_lock);
> +	vsock->tx_run = true;
> +	mutex_unlock(&vsock->tx_lock);
> +
>   	mutex_lock(&vsock->rx_lock);
>   	virtio_vsock_rx_fill(vsock);
> +	vsock->rx_run = true;
>   	mutex_unlock(&vsock->rx_lock);
>   
>   	mutex_lock(&vsock->event_lock);
>   	virtio_vsock_event_fill(vsock);
> +	vsock->event_run = true;
>   	mutex_unlock(&vsock->event_lock);
>   
>   	vdev->priv = vsock;
> @@ -659,6 +689,24 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
>   	/* Reset all connected sockets when the device disappear */
>   	vsock_for_each_connected_socket(virtio_vsock_reset_sock);
>   
> +	/* Stop all work handlers to make sure no one is accessing the device,
> +	 * so we can safely call vdev->config->reset().
> +	 */
> +	mutex_lock(&vsock->rx_lock);
> +	vsock->rx_run = false;
> +	mutex_unlock(&vsock->rx_lock);
> +
> +	mutex_lock(&vsock->tx_lock);
> +	vsock->tx_run = false;
> +	mutex_unlock(&vsock->tx_lock);
> +
> +	mutex_lock(&vsock->event_lock);
> +	vsock->event_run = false;
> +	mutex_unlock(&vsock->event_lock);
> +
> +	/* Flush all device writes and interrupts, device will not use any
> +	 * more buffers.
> +	 */
>   	vdev->config->reset(vdev);
>   
>   	mutex_lock(&vsock->rx_lock);
> @@ -689,6 +737,7 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
>   	}
>   	spin_unlock_bh(&vsock->loopback_list_lock);
>   
> +	/* Delete virtqueues and flush outstanding callbacks if any */
>   	vdev->config->del_vqs(vdev);
>   
>   	mutex_unlock(&the_virtio_vsock_mutex);
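
The stop-flag pattern in the quoted patch (a *_run boolean that is set
and tested only under the queue's mutex) can be modelled in userspace
roughly as follows. This is a sketch under stated assumptions, not the
driver code: struct rx_model and the rx_model_*() helpers are
hypothetical, and a counter stands in for processing virtqueue buffers.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model of one queue: a lock, its run flag, and a counter. */
struct rx_model {
	pthread_mutex_t rx_lock;
	bool rx_run;		/* must be accessed with rx_lock held */
	int processed;		/* stands in for buffers handled by the worker */
};

static void rx_model_init(struct rx_model *m)
{
	pthread_mutex_init(&m->rx_lock, NULL);
	m->rx_run = false;
	m->processed = 0;
}

/* Models the end of .probe(): allow the worker to touch the queue. */
static void rx_model_start(struct rx_model *m)
{
	pthread_mutex_lock(&m->rx_lock);
	m->rx_run = true;
	pthread_mutex_unlock(&m->rx_lock);
}

/* Models .remove(): once this returns, no worker will touch the queue. */
static void rx_model_stop(struct rx_model *m)
{
	pthread_mutex_lock(&m->rx_lock);
	m->rx_run = false;
	pthread_mutex_unlock(&m->rx_lock);
}

/* Models a work handler: bail out early if the queue has been stopped. */
static void rx_model_work(struct rx_model *m, int pending)
{
	pthread_mutex_lock(&m->rx_lock);

	if (!m->rx_run)
		goto out;

	m->processed += pending;
out:
	pthread_mutex_unlock(&m->rx_lock);
}
```

Because the flag is cleared while holding the same mutex the handler
takes, a handler either finishes before the stop or sees the cleared
flag; it can never run concurrently with whatever follows the stop.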

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/3] vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock
  2019-07-04  3:58       ` Jason Wang
@ 2019-07-04  9:20         ` Stefano Garzarella
  2019-07-05  0:18           ` Jason Wang
  0 siblings, 1 reply; 19+ messages in thread
From: Stefano Garzarella @ 2019-07-04  9:20 UTC (permalink / raw)
  To: Jason Wang
  Cc: Stefan Hajnoczi, netdev, kvm, virtualization, Michael S. Tsirkin,
	David S. Miller, linux-kernel

On Thu, Jul 04, 2019 at 11:58:00AM +0800, Jason Wang wrote:
> 
> On 2019/7/3 6:41 PM, Stefano Garzarella wrote:
> > On Wed, Jul 03, 2019 at 05:53:58PM +0800, Jason Wang wrote:
> > > On 2019/6/28 8:36 PM, Stefano Garzarella wrote:
> > > > Some callbacks used by the upper layers can run while we are in the
> > > > .remove(). A potential use-after-free can happen, because we free
> > > > the_virtio_vsock without knowing if the callbacks are over or not.
> > > > 
> > > > To solve this issue we move the assignment of the_virtio_vsock at the
> > > > end of .probe(), when we finished all the initialization, and at the
> > > > beginning of .remove(), before to release resources.
> > > > For the same reason, we do the same also for the vdev->priv.
> > > > 
> > > > We use RCU to be sure that all callbacks that use the_virtio_vsock
> > > > ended before freeing it. This is not required for callbacks that
> > > > use vdev->priv, because after the vdev->config->del_vqs() we are sure
> > > > that they are ended and will no longer be invoked.
> > > > 
> > > > We also take the mutex during the .remove() to avoid that .probe() can
> > > > run while we are resetting the device.
> > > > 
> > > > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> > > > ---
> > > >    net/vmw_vsock/virtio_transport.c | 67 +++++++++++++++++++++-----------
> > > >    1 file changed, 44 insertions(+), 23 deletions(-)
> > > > 
> > > > diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> > > > index 9c287e3e393c..7ad510ec12e0 100644
> > > > --- a/net/vmw_vsock/virtio_transport.c
> > > > +++ b/net/vmw_vsock/virtio_transport.c
> > > > @@ -65,19 +65,22 @@ struct virtio_vsock {
> > > >    	u32 guest_cid;
> > > >    };
> > > > -static struct virtio_vsock *virtio_vsock_get(void)
> > > > -{
> > > > -	return the_virtio_vsock;
> > > > -}
> > > > -
> > > >    static u32 virtio_transport_get_local_cid(void)
> > > >    {
> > > > -	struct virtio_vsock *vsock = virtio_vsock_get();
> > > > +	struct virtio_vsock *vsock;
> > > > +	u32 ret;
> > > > -	if (!vsock)
> > > > -		return VMADDR_CID_ANY;
> > > > +	rcu_read_lock();
> > > > +	vsock = rcu_dereference(the_virtio_vsock);
> > > > +	if (!vsock) {
> > > > +		ret = VMADDR_CID_ANY;
> > > > +		goto out_rcu;
> > > > +	}
> > > > -	return vsock->guest_cid;
> > > > +	ret = vsock->guest_cid;
> > > > +out_rcu:
> > > > +	rcu_read_unlock();
> > > > +	return ret;
> > > >    }
> > > >    static void virtio_transport_loopback_work(struct work_struct *work)
> > > > @@ -197,14 +200,18 @@ virtio_transport_send_pkt(struct virtio_vsock_pkt *pkt)
> > > >    	struct virtio_vsock *vsock;
> > > >    	int len = pkt->len;
> > > > -	vsock = virtio_vsock_get();
> > > > +	rcu_read_lock();
> > > > +	vsock = rcu_dereference(the_virtio_vsock);
> > > >    	if (!vsock) {
> > > >    		virtio_transport_free_pkt(pkt);
> > > > -		return -ENODEV;
> > > > +		len = -ENODEV;
> > > > +		goto out_rcu;
> > > >    	}
> > > > -	if (le64_to_cpu(pkt->hdr.dst_cid) == vsock->guest_cid)
> > > > -		return virtio_transport_send_pkt_loopback(vsock, pkt);
> > > > +	if (le64_to_cpu(pkt->hdr.dst_cid) == vsock->guest_cid) {
> > > > +		len = virtio_transport_send_pkt_loopback(vsock, pkt);
> > > > +		goto out_rcu;
> > > > +	}
> > > >    	if (pkt->reply)
> > > >    		atomic_inc(&vsock->queued_replies);
> > > > @@ -214,6 +221,9 @@ virtio_transport_send_pkt(struct virtio_vsock_pkt *pkt)
> > > >    	spin_unlock_bh(&vsock->send_pkt_list_lock);
> > > >    	queue_work(virtio_vsock_workqueue, &vsock->send_pkt_work);
> > > > +
> > > > +out_rcu:
> > > > +	rcu_read_unlock();
> > > >    	return len;
> > > >    }
> > > > @@ -222,12 +232,14 @@ virtio_transport_cancel_pkt(struct vsock_sock *vsk)
> > > >    {
> > > >    	struct virtio_vsock *vsock;
> > > >    	struct virtio_vsock_pkt *pkt, *n;
> > > > -	int cnt = 0;
> > > > +	int cnt = 0, ret;
> > > >    	LIST_HEAD(freeme);
> > > > -	vsock = virtio_vsock_get();
> > > > +	rcu_read_lock();
> > > > +	vsock = rcu_dereference(the_virtio_vsock);
> > > >    	if (!vsock) {
> > > > -		return -ENODEV;
> > > > +		ret = -ENODEV;
> > > > +		goto out_rcu;
> > > >    	}
> > > >    	spin_lock_bh(&vsock->send_pkt_list_lock);
> > > > @@ -255,7 +267,11 @@ virtio_transport_cancel_pkt(struct vsock_sock *vsk)
> > > >    			queue_work(virtio_vsock_workqueue, &vsock->rx_work);
> > > >    	}
> > > > -	return 0;
> > > > +	ret = 0;
> > > > +
> > > > +out_rcu:
> > > > +	rcu_read_unlock();
> > > > +	return ret;
> > > >    }
> > > >    static void virtio_vsock_rx_fill(struct virtio_vsock *vsock)
> > > > @@ -590,8 +606,6 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
> > > >    	vsock->rx_buf_max_nr = 0;
> > > >    	atomic_set(&vsock->queued_replies, 0);
> > > > -	vdev->priv = vsock;
> > > > -	the_virtio_vsock = vsock;
> > > >    	mutex_init(&vsock->tx_lock);
> > > >    	mutex_init(&vsock->rx_lock);
> > > >    	mutex_init(&vsock->event_lock);
> > > > @@ -613,6 +627,9 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
> > > >    	virtio_vsock_event_fill(vsock);
> > > >    	mutex_unlock(&vsock->event_lock);
> > > > +	vdev->priv = vsock;
> > > > +	rcu_assign_pointer(the_virtio_vsock, vsock);
> > > 
> > > > You probably need to use rcu_dereference_protected() to access
> > > > the_virtio_vsock in this function to avoid sparse warnings.
> > > 
> > Ooo, thanks!
> > 
> > Do you mean when we check if the_virtio_vsock is not null at the beginning of
> > virtio_vsock_probe()?
> 
> 
> I mean instead of:
> 
>     /* Only one virtio-vsock device per guest is supported */
>     if (the_virtio_vsock) {
>         ret = -EBUSY;
>         goto out;
>     }
> 
> you should use:
> 
>     if (rcu_dereference_protected(the_virtio_vsock,
>                     lockdep_is_held(&the_virtio_vsock_mutex)))
> 
> ...

Okay, thanks for confirming! I'll send a v3 to fix this!
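
A minimal userspace sketch of the check suggested above, for readers who want
to see its shape. The mock_ helpers are stand-ins invented for this
illustration (a flag-based mutex and an assert-based macro); they are not the
kernel's rcu_dereference_protected() or lockdep_is_held(), and mock_probe()
omits all real device setup.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins (assumptions, not the kernel implementations). */
struct mock_mutex { bool held; };

static void mock_mutex_lock(struct mock_mutex *m)
{
	assert(!m->held);
	m->held = true;
}

static void mock_mutex_unlock(struct mock_mutex *m)
{
	assert(m->held);
	m->held = false;
}

static bool mock_lockdep_is_held(const struct mock_mutex *m)
{
	return m->held;
}

/* Update-side read: only legal while the protecting lock is held. */
#define mock_rcu_dereference_protected(p, cond) (assert(cond), (p))

struct virtio_vsock { unsigned int guest_cid; };

static struct virtio_vsock *the_virtio_vsock;
static struct mock_mutex the_virtio_vsock_mutex;

/* Hypothetical, stripped-down probe: only the singleton check remains. */
static int mock_probe(struct virtio_vsock *vsock)
{
	int ret = 0;

	mock_mutex_lock(&the_virtio_vsock_mutex);

	/* Only one virtio-vsock device per guest is supported */
	if (mock_rcu_dereference_protected(the_virtio_vsock,
			mock_lockdep_is_held(&the_virtio_vsock_mutex))) {
		ret = -EBUSY;
		goto out;
	}

	/* ... device initialisation would go here ... */
	the_virtio_vsock = vsock;	/* rcu_assign_pointer() in the kernel */
out:
	mock_mutex_unlock(&the_virtio_vsock_mutex);
	return ret;
}
```

In this sketch a second mock_probe() call returns -EBUSY and leaves the first
object published. The lockdep-style condition documents which lock makes the
plain read safe.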

> 
> 
> > 
> > > > +
> > > >    	mutex_unlock(&the_virtio_vsock_mutex);
> > > >    	return 0;
> > > > @@ -627,6 +644,12 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
> > > >    	struct virtio_vsock *vsock = vdev->priv;
> > > >    	struct virtio_vsock_pkt *pkt;
> > > > +	mutex_lock(&the_virtio_vsock_mutex);
> > > > +
> > > > +	vdev->priv = NULL;
> > > > +	rcu_assign_pointer(the_virtio_vsock, NULL);
> > > 
> > > This is still suspicious, can we access the_virtio_vsock through vdev->priv?
> > > If yes, we may still get use-after-free since it was not protected by RCU.
> > We will free the object only after calling the del_vqs(), so we are sure
> > that the vq_callbacks ended and will no longer be invoked.
> > So, IIUC it shouldn't happen.
> 
> 
> Yes, but any dereference that is not done in vq_callbacks will be very
> dangerous in the future.

Right.

Do you think it makes sense to continue with this series in order to fix the
hot-unplug issue? Afterwards I'll work on refactoring the driver code to use
the refcnt (as you suggested in patch 2) and a singleton for the_virtio_vsock.

Thanks,
Stefano
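
A minimal single-threaded sketch of the ordering discussed above: readers go
through the one published pointer, and removal unpublishes it before freeing.
The mock_ functions are hypothetical stand-ins written for this illustration,
not the kernel RCU API, and the grace-period wait in mock_remove() is an
assumption about what follows the rcu_assign_pointer(..., NULL) in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define VMADDR_CID_ANY ((uint32_t)-1)	/* the kernel defines this as -1U */

struct virtio_vsock { uint32_t guest_cid; };

/* The single published pointer; every reader goes through it. */
static struct virtio_vsock *the_virtio_vsock;

/* Single-threaded stand-ins (assumptions, not the kernel RCU API). */
static int mock_readers;
static void mock_rcu_read_lock(void)   { mock_readers++; }
static void mock_rcu_read_unlock(void) { assert(mock_readers > 0); mock_readers--; }
/* With one thread, a grace period has elapsed once no reader is active. */
static void mock_synchronize_rcu(void) { assert(mock_readers == 0); }

/* Same shape as virtio_transport_get_local_cid() in the patch. */
static uint32_t mock_get_local_cid(void)
{
	struct virtio_vsock *vsock;
	uint32_t ret;

	mock_rcu_read_lock();
	vsock = the_virtio_vsock;	/* rcu_dereference() in the kernel */
	if (!vsock) {
		ret = VMADDR_CID_ANY;
		goto out_rcu;
	}
	ret = vsock->guest_cid;
out_rcu:
	mock_rcu_read_unlock();
	return ret;
}

/* Publish a fully initialised object; returns 0 on success, -1 on failure. */
static int mock_publish(uint32_t cid)
{
	struct virtio_vsock *vsock = malloc(sizeof(*vsock));

	if (!vsock)
		return -1;
	vsock->guest_cid = cid;
	the_virtio_vsock = vsock;	/* rcu_assign_pointer() in the kernel */
	return 0;
}

/* Unpublish first, wait for readers, and only then free. */
static void mock_remove(void)
{
	struct virtio_vsock *vsock = the_virtio_vsock;

	the_virtio_vsock = NULL;	/* rcu_assign_pointer(..., NULL) */
	mock_synchronize_rcu();		/* assumed grace-period wait */
	free(vsock);
}
```

A raw copy of the pointer kept elsewhere, such as vdev->priv, gets none of
this protection, which is the concern raised in the review.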

* Re: [PATCH v2 1/3] vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock
  2019-07-03 10:41     ` Stefano Garzarella
  2019-07-04  3:58       ` Jason Wang
@ 2019-07-04 10:17       ` Stefan Hajnoczi
  1 sibling, 0 replies; 19+ messages in thread
From: Stefan Hajnoczi @ 2019-07-04 10:17 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: Jason Wang, netdev, kvm, virtualization, Michael S. Tsirkin,
	David S. Miller, linux-kernel

On Wed, Jul 03, 2019 at 12:41:35PM +0200, Stefano Garzarella wrote:
> On Wed, Jul 03, 2019 at 05:53:58PM +0800, Jason Wang wrote:
> > On 2019/6/28 下午8:36, Stefano Garzarella wrote:
> > > Another, more interesting question: I believe we will make the
> > > virtio_vsock structure a singleton. Then what's the point of using
> > > vdev->priv to access the_virtio_vsock? It looks to me like it brings
> > > extra trouble for synchronization.
> 
> I thought about it when I tried to use RCU to stop the worker and I
> think it makes sense. Maybe it can be another series after this one is merged.
> 
> @Stefan, what do you think about that?

Yes, let's make it a singleton and keep no other references to it.

Stefan

* Re: [PATCH v2 1/3] vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock
  2019-07-04  9:20         ` Stefano Garzarella
@ 2019-07-05  0:18           ` Jason Wang
  0 siblings, 0 replies; 19+ messages in thread
From: Jason Wang @ 2019-07-05  0:18 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: Stefan Hajnoczi, netdev, kvm, virtualization, Michael S. Tsirkin,
	David S. Miller, linux-kernel


On 2019/7/4 下午5:20, Stefano Garzarella wrote:
>>>> This is still suspicious, can we access the_virtio_vsock through vdev->priv?
>>>> If yes, we may still get use-after-free since it was not protected by RCU.
>>> We will free the object only after calling the del_vqs(), so we are sure
>>> that the vq_callbacks ended and will no longer be invoked.
>>> So, IIUC it shouldn't happen.
>> Yes, but any dereference that is not done in vq_callbacks will be very
>> dangerous in the future.
> Right.
>
> Do you think it makes sense to continue with this series in order to fix the
> hot-unplug issue? Afterwards I'll work on refactoring the driver code to use
> the refcnt (as you suggested in patch 2) and a singleton for the_virtio_vsock.
>
> Thanks,
> Stefano


Yes.

Thanks


Thread overview: 19+ messages
2019-06-28 12:36 [PATCH v2 0/3] vsock/virtio: several fixes in the .probe() and .remove() Stefano Garzarella
2019-06-28 12:36 ` [PATCH v2 1/3] vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock Stefano Garzarella
2019-07-01 14:54   ` Stefan Hajnoczi
2019-07-01 15:10   ` Stefan Hajnoczi
2019-07-03  9:53   ` Jason Wang
2019-07-03 10:41     ` Stefano Garzarella
2019-07-04  3:58       ` Jason Wang
2019-07-04  9:20         ` Stefano Garzarella
2019-07-05  0:18           ` Jason Wang
2019-07-04 10:17       ` Stefan Hajnoczi
2019-06-28 12:36 ` [PATCH v2 2/3] vsock/virtio: stop workers during the .remove() Stefano Garzarella
2019-07-04  4:00   ` Jason Wang
2019-06-28 12:36 ` [PATCH v2 3/3] vsock/virtio: fix flush of works " Stefano Garzarella
2019-07-01 15:08   ` Stefan Hajnoczi
2019-07-01 15:09   ` Stefan Hajnoczi
2019-07-01 15:11 ` [PATCH v2 0/3] vsock/virtio: several fixes in the .probe() and .remove() Stefan Hajnoczi
2019-07-01 17:03   ` Stefano Garzarella
2019-07-03  9:14     ` Stefan Hajnoczi
2019-07-03 10:07       ` Stefano Garzarella
