* [PATCH V2 0/2] tcm_vhost endpoint
@ 2013-03-28  2:17 Asias He
  2013-03-28  2:17 ` [PATCH V2 1/2] tcm_vhost: Initialize vq->last_used_idx when set endpoint Asias He
                   ` (3 more replies)
  0 siblings, 4 replies; 33+ messages in thread
From: Asias He @ 2013-03-28  2:17 UTC (permalink / raw)
  To: Nicholas Bellinger
  Cc: kvm, Michael S. Tsirkin, virtualization, target-devel,
	Stefan Hajnoczi, Paolo Bonzini

Reordered the patches so that patch 1 can go into v3.9.

Asias He (2):
  tcm_vhost: Initialize vq->last_used_idx when set endpoint
  tcm_vhost: Use vq->private_data to indicate if the endpoint is setup

 drivers/vhost/tcm_vhost.c | 47 +++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 41 insertions(+), 6 deletions(-)

-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH V2 1/2] tcm_vhost: Initialize vq->last_used_idx when set endpoint
  2013-03-28  2:17 [PATCH V2 0/2] tcm_vhost endpoint Asias He
@ 2013-03-28  2:17 ` Asias He
  2013-03-28  2:54   ` Nicholas A. Bellinger
  2013-03-28  2:54   ` Nicholas A. Bellinger
  2013-03-28  2:17 ` Asias He
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 33+ messages in thread
From: Asias He @ 2013-03-28  2:17 UTC (permalink / raw)
  To: Nicholas Bellinger
  Cc: Paolo Bonzini, Stefan Hajnoczi, Michael S. Tsirkin,
	Rusty Russell, kvm, virtualization, target-devel, Asias He

This patch fixes a guest hang when booting seabios and then the guest.

  [    0.576238] scsi0 : Virtio SCSI HBA
  [    0.616754] virtio_scsi virtio1: request:id 0 is not a head!

vq->last_used_idx is initialized only when /dev/vhost-scsi is
opened or closed.

   vhost_scsi_open -> vhost_dev_init() -> vhost_vq_reset()
   vhost_scsi_release() -> vhost_dev_cleanup -> vhost_vq_reset()

So, when the guest talks to tcm_vhost after seabios does, vq->last_used_idx
still contains the old value from seabios. This confuses the guest.

Fix this by calling vhost_init_used() to init vq->last_used_idx when
we set the endpoint.
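
As a minimal illustrative sketch (an editorial aside, not part of the patch:
the names, types, and layout below are simplified assumptions, not the real
vhost/virtio code), used-ring bookkeeping looks roughly like this, which shows
why a stale last_used_idx breaks the guest:

    #include <stdint.h>

    #define QUEUE_SIZE 256

    struct used_elem { uint32_t id; uint32_t len; };
    struct used_ring { uint16_t idx; struct used_elem ring[QUEUE_SIZE]; };

    static void publish_used(struct used_ring *used, uint16_t *last_used_idx,
                             uint32_t head, uint32_t len)
    {
            /* The host records the completion at its private last_used_idx... */
            used->ring[*last_used_idx % QUEUE_SIZE].id  = head;
            used->ring[*last_used_idx % QUEUE_SIZE].len = len;
            /* ...and publishes it by advancing the shared index. */
            used->idx = ++(*last_used_idx);
            /*
             * If last_used_idx still holds seabios' old value while the guest
             * driver starts counting from 0, used->idx jumps ahead of what the
             * guest expects and it reads ids it never submitted, hence
             * "id 0 is not a head!".
             */
    }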

Signed-off-by: Asias He <asias@redhat.com>
---
 drivers/vhost/tcm_vhost.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
index 43fb11e..5e3d4487 100644
--- a/drivers/vhost/tcm_vhost.c
+++ b/drivers/vhost/tcm_vhost.c
@@ -781,8 +781,9 @@ static int vhost_scsi_set_endpoint(
 {
 	struct tcm_vhost_tport *tv_tport;
 	struct tcm_vhost_tpg *tv_tpg;
+	struct vhost_virtqueue *vq;
 	bool match = false;
-	int index, ret;
+	int index, ret, i;
 
 	mutex_lock(&vs->dev.mutex);
 	/* Verify that ring has been setup correctly. */
@@ -826,6 +827,12 @@ static int vhost_scsi_set_endpoint(
 	if (match) {
 		memcpy(vs->vs_vhost_wwpn, t->vhost_wwpn,
 		       sizeof(vs->vs_vhost_wwpn));
+		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
+			vq = &vs->vqs[i];
+			mutex_lock(&vq->mutex);
+			vhost_init_used(vq);
+			mutex_unlock(&vq->mutex);
+		}
 		vs->vs_endpoint = true;
 		ret = 0;
 	} else {
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
  2013-03-28  2:17 [PATCH V2 0/2] tcm_vhost endpoint Asias He
  2013-03-28  2:17 ` [PATCH V2 1/2] tcm_vhost: Initialize vq->last_used_idx when set endpoint Asias He
  2013-03-28  2:17 ` Asias He
@ 2013-03-28  2:17 ` Asias He
  2013-03-28  6:16   ` Michael S. Tsirkin
  2013-03-28  2:17 ` Asias He
  3 siblings, 1 reply; 33+ messages in thread
From: Asias He @ 2013-03-28  2:17 UTC (permalink / raw)
  To: Nicholas Bellinger
  Cc: Paolo Bonzini, Stefan Hajnoczi, Michael S. Tsirkin,
	Rusty Russell, kvm, virtualization, target-devel, Asias He

Currently, vs->vs_endpoint is used to indicate whether the endpoint is
set up. It is set or cleared in vhost_scsi_set_endpoint() or
vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
we check it in vhost_scsi_handle_vq(), we ignore the lock.

Instead of using vs->vs_endpoint and the vs->dev.mutex lock to indicate
the status of the endpoint, we use the per-virtqueue vq->private_data to
indicate it. This way we only need to take the per-queue vq->mutex lock,
so concurrent multiqueue processing sees less lock contention. Further,
on the read side of vq->private_data we do not even need to take the
lock if it is accessed in the vhost worker thread, because it is
protected by the "vhost rcu".
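
In short (an editorial sketch condensed from the diff below, not additional
code in the patch), the protocol is:

    /* Writer side (set/clear endpoint ioctls), under vq->mutex; the
     * vhost_work flush that vhost already does acts as synchronize_rcu: */
    mutex_lock(&vq->mutex);
    rcu_assign_pointer(vq->private_data, vs);	/* or NULL when clearing */
    mutex_unlock(&vq->mutex);

    /* Reader side (vhost worker thread), lockless, relying on "vhost rcu": */
    if (!rcu_dereference_check(vq->private_data, 1))
            return;		/* endpoint not set up yet */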

Signed-off-by: Asias He <asias@redhat.com>
---
 drivers/vhost/tcm_vhost.c | 38 +++++++++++++++++++++++++++++++++-----
 1 file changed, 33 insertions(+), 5 deletions(-)

diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
index 5e3d4487..0524267 100644
--- a/drivers/vhost/tcm_vhost.c
+++ b/drivers/vhost/tcm_vhost.c
@@ -67,7 +67,6 @@ struct vhost_scsi {
 	/* Protected by vhost_scsi->dev.mutex */
 	struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
 	char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
-	bool vs_endpoint;
 
 	struct vhost_dev dev;
 	struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
@@ -91,6 +90,24 @@ static int iov_num_pages(struct iovec *iov)
 	       ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
 }
 
+static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
+{
+	bool ret = false;
+
+	/*
+	 * We can handle the vq only after the endpoint is setup by calling the
+	 * VHOST_SCSI_SET_ENDPOINT ioctl.
+	 *
+	 * TODO: Check that we are running from vhost_worker which acts
+	 * as read-side critical section for vhost kind of RCU.
+	 * See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
+	 */
+	if (rcu_dereference_check(vq->private_data, 1))
+		ret = true;
+
+	return ret;
+}
+
 static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
 {
 	return 1;
@@ -581,8 +598,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
 	int head, ret;
 	u8 target;
 
-	/* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
-	if (unlikely(!vs->vs_endpoint))
+	if (!tcm_vhost_check_endpoint(vq))
 		return;
 
 	mutex_lock(&vq->mutex);
@@ -829,11 +845,12 @@ static int vhost_scsi_set_endpoint(
 		       sizeof(vs->vs_vhost_wwpn));
 		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
 			vq = &vs->vqs[i];
+			/* Flushing the vhost_work acts as synchronize_rcu */
 			mutex_lock(&vq->mutex);
+			rcu_assign_pointer(vq->private_data, vs);
 			vhost_init_used(vq);
 			mutex_unlock(&vq->mutex);
 		}
-		vs->vs_endpoint = true;
 		ret = 0;
 	} else {
 		ret = -EEXIST;
@@ -849,6 +866,8 @@ static int vhost_scsi_clear_endpoint(
 {
 	struct tcm_vhost_tport *tv_tport;
 	struct tcm_vhost_tpg *tv_tpg;
+	struct vhost_virtqueue *vq;
+	bool match = false;
 	int index, ret, i;
 	u8 target;
 
@@ -884,9 +903,18 @@ static int vhost_scsi_clear_endpoint(
 		}
 		tv_tpg->tv_tpg_vhost_count--;
 		vs->vs_tpg[target] = NULL;
-		vs->vs_endpoint = false;
+		match = true;
 		mutex_unlock(&tv_tpg->tv_tpg_mutex);
 	}
+	if (match) {
+		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
+			vq = &vs->vqs[i];
+			/* Flushing the vhost_work acts as synchronize_rcu */
+			mutex_lock(&vq->mutex);
+			rcu_assign_pointer(vq->private_data, NULL);
+			mutex_unlock(&vq->mutex);
+		}
+	}
 	mutex_unlock(&vs->dev.mutex);
 	return 0;
 
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH V2 1/2] tcm_vhost: Initialize vq->last_used_idx when set endpoint
  2013-03-28  2:17 ` [PATCH V2 1/2] tcm_vhost: Initialize vq->last_used_idx when set endpoint Asias He
  2013-03-28  2:54   ` Nicholas A. Bellinger
@ 2013-03-28  2:54   ` Nicholas A. Bellinger
  2013-03-28  3:21     ` Asias He
  2013-03-28  3:21     ` Asias He
  1 sibling, 2 replies; 33+ messages in thread
From: Nicholas A. Bellinger @ 2013-03-28  2:54 UTC (permalink / raw)
  To: Asias He
  Cc: Paolo Bonzini, Stefan Hajnoczi, Michael S. Tsirkin,
	Rusty Russell, kvm, virtualization, target-devel

Hi Asias,

On Thu, 2013-03-28 at 10:17 +0800, Asias He wrote:
> This patch fixes a guest hang when booting seabios and then the guest.
> 
>   [    0.576238] scsi0 : Virtio SCSI HBA
>   [    0.616754] virtio_scsi virtio1: request:id 0 is not a head!
> 
> vq->last_used_idx is initialized only when /dev/vhost-scsi is
> opened or closed.
> 
>    vhost_scsi_open -> vhost_dev_init() -> vhost_vq_reset()
>    vhost_scsi_release() -> vhost_dev_cleanup -> vhost_vq_reset()
> 
> So, when the guest talks to tcm_vhost after seabios does, vq->last_used_idx
> still contains the old value from seabios. This confuses the guest.
> 
> Fix this by calling vhost_init_used() to init vq->last_used_idx when
> we set the endpoint.
> 
> Signed-off-by: Asias He <asias@redhat.com>
> ---
>  drivers/vhost/tcm_vhost.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> index 43fb11e..5e3d4487 100644
> --- a/drivers/vhost/tcm_vhost.c
> +++ b/drivers/vhost/tcm_vhost.c
> @@ -781,8 +781,9 @@ static int vhost_scsi_set_endpoint(
>  {
>  	struct tcm_vhost_tport *tv_tport;
>  	struct tcm_vhost_tpg *tv_tpg;
> +	struct vhost_virtqueue *vq;
>  	bool match = false;
> -	int index, ret;
> +	int index, ret, i;
>  
>  	mutex_lock(&vs->dev.mutex);
>  	/* Verify that ring has been setup correctly. */
> @@ -826,6 +827,12 @@ static int vhost_scsi_set_endpoint(
>  	if (match) {
>  		memcpy(vs->vs_vhost_wwpn, t->vhost_wwpn,
>  		       sizeof(vs->vs_vhost_wwpn));
> +		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> +			vq = &vs->vqs[i];
> +			mutex_lock(&vq->mutex);
> +			vhost_init_used(vq);
> +			mutex_unlock(&vq->mutex);
> +		}

Already tried a similar patch earlier today, but as vhost_init_used()
depends upon a vq->private_data being set it does not actually
re-initialize ->last_used_idx..

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V2 1/2] tcm_vhost: Initialize vq->last_used_idx when set endpoint
  2013-03-28  2:54   ` Nicholas A. Bellinger
  2013-03-28  3:21     ` Asias He
@ 2013-03-28  3:21     ` Asias He
  1 sibling, 0 replies; 33+ messages in thread
From: Asias He @ 2013-03-28  3:21 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Paolo Bonzini, Stefan Hajnoczi, Michael S. Tsirkin,
	Rusty Russell, kvm, virtualization, target-devel

On Wed, Mar 27, 2013 at 07:54:07PM -0700, Nicholas A. Bellinger wrote:
> Hi Asias,
> 
> On Thu, 2013-03-28 at 10:17 +0800, Asias He wrote:
> > This patch fixes a guest hang when booting seabios and then the guest.
> > 
> >   [    0.576238] scsi0 : Virtio SCSI HBA
> >   [    0.616754] virtio_scsi virtio1: request:id 0 is not a head!
> > 
> > vq->last_used_idx is initialized only when /dev/vhost-scsi is
> > opened or closed.
> > 
> >    vhost_scsi_open -> vhost_dev_init() -> vhost_vq_reset()
> >    vhost_scsi_release() -> vhost_dev_cleanup -> vhost_vq_reset()
> > 
> > So, when the guest talks to tcm_vhost after seabios does, vq->last_used_idx
> > still contains the old value from seabios. This confuses the guest.
> > 
> > Fix this by calling vhost_init_used() to init vq->last_used_idx when
> > we set the endpoint.
> > 
> > Signed-off-by: Asias He <asias@redhat.com>
> > ---
> >  drivers/vhost/tcm_vhost.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> > index 43fb11e..5e3d4487 100644
> > --- a/drivers/vhost/tcm_vhost.c
> > +++ b/drivers/vhost/tcm_vhost.c
> > @@ -781,8 +781,9 @@ static int vhost_scsi_set_endpoint(
> >  {
> >  	struct tcm_vhost_tport *tv_tport;
> >  	struct tcm_vhost_tpg *tv_tpg;
> > +	struct vhost_virtqueue *vq;
> >  	bool match = false;
> > -	int index, ret;
> > +	int index, ret, i;
> >  
> >  	mutex_lock(&vs->dev.mutex);
> >  	/* Verify that ring has been setup correctly. */
> > @@ -826,6 +827,12 @@ static int vhost_scsi_set_endpoint(
> >  	if (match) {
> >  		memcpy(vs->vs_vhost_wwpn, t->vhost_wwpn,
> >  		       sizeof(vs->vs_vhost_wwpn));
> > +		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > +			vq = &vs->vqs[i];
> > +			mutex_lock(&vq->mutex);
> > +			vhost_init_used(vq);
> > +			mutex_unlock(&vq->mutex);
> > +		}
> 
> Already tried a similar patch earlier today, but as vhost_init_used()
> depends upon a vq->private_data being set it does not actually
> re-initialize ->last_used_idx..
> 

Sigh... Ah, we have this in vhost_init_used:

        if (!vq->private_data)
                return 0;
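
(Editorial note: the patch 2/2 hunk earlier in this thread ends up satisfying
that dependency by assigning private_data before the call, roughly:)

    mutex_lock(&vq->mutex);
    rcu_assign_pointer(vq->private_data, vs);
    vhost_init_used(vq);	/* private_data is now set, so last_used_idx is reloaded */
    mutex_unlock(&vq->mutex);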


Michael, how bad would it be if we let the original patches 1/2 and 2/2 go into 3.9?

-- 
Asias

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
  2013-03-28  2:17 ` [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup Asias He
@ 2013-03-28  6:16   ` Michael S. Tsirkin
  2013-03-28  8:10     ` Asias He
  0 siblings, 1 reply; 33+ messages in thread
From: Michael S. Tsirkin @ 2013-03-28  6:16 UTC (permalink / raw)
  To: Asias He
  Cc: kvm, virtualization, target-devel, Stefan Hajnoczi, Paolo Bonzini

On Thu, Mar 28, 2013 at 10:17:28AM +0800, Asias He wrote:
> Currently, vs->vs_endpoint is used to indicate whether the endpoint is
> set up. It is set or cleared in vhost_scsi_set_endpoint() or
> vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> we check it in vhost_scsi_handle_vq(), we ignore the lock.
> 
> Instead of using vs->vs_endpoint and the vs->dev.mutex lock to indicate
> the status of the endpoint, we use the per-virtqueue vq->private_data to
> indicate it. This way we only need to take the per-queue vq->mutex lock,
> so concurrent multiqueue processing sees less lock contention. Further,
> on the read side of vq->private_data we do not even need to take the
> lock if it is accessed in the vhost worker thread, because it is
> protected by the "vhost rcu".
> 
> Signed-off-by: Asias He <asias@redhat.com>
> ---
>  drivers/vhost/tcm_vhost.c | 38 +++++++++++++++++++++++++++++++++-----
>  1 file changed, 33 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> index 5e3d4487..0524267 100644
> --- a/drivers/vhost/tcm_vhost.c
> +++ b/drivers/vhost/tcm_vhost.c
> @@ -67,7 +67,6 @@ struct vhost_scsi {
>  	/* Protected by vhost_scsi->dev.mutex */
>  	struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
>  	char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> -	bool vs_endpoint;
>  
>  	struct vhost_dev dev;
>  	struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> @@ -91,6 +90,24 @@ static int iov_num_pages(struct iovec *iov)
>  	       ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
>  }
>  
> +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> +{
> +	bool ret = false;
> +
> +	/*
> +	 * We can handle the vq only after the endpoint is setup by calling the
> +	 * VHOST_SCSI_SET_ENDPOINT ioctl.
> +	 *
> +	 * TODO: Check that we are running from vhost_worker which acts
> +	 * as read-side critical section for vhost kind of RCU.
> +	 * See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
> +	 */
> +	if (rcu_dereference_check(vq->private_data, 1))
> +		ret = true;
> +
> +	return ret;
> +}
> +
>  static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
>  {
>  	return 1;
> @@ -581,8 +598,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
>  	int head, ret;
>  	u8 target;
>  
> -	/* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> -	if (unlikely(!vs->vs_endpoint))
> +	if (!tcm_vhost_check_endpoint(vq))
>  		return;
>

I would just move the check to under vq mutex,
and avoid rcu completely. In vhost-net we are using
private data outside lock so we can't do this,
no such issue here.
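
(Sketched shape of that suggestion, as an editorial illustration rather than a
posted patch; with the RCU-less scheme private_data would just be read under
the mutex:)

	static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
					 struct vhost_virtqueue *vq)
	{
		mutex_lock(&vq->mutex);
		if (!vq->private_data) {
			/* endpoint not set up yet */
			mutex_unlock(&vq->mutex);
			return;
		}
		/* ... process the ring entirely under vq->mutex ... */
		mutex_unlock(&vq->mutex);
	}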
  
>  	mutex_lock(&vq->mutex);
> @@ -829,11 +845,12 @@ static int vhost_scsi_set_endpoint(
>  		       sizeof(vs->vs_vhost_wwpn));
>  		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
>  			vq = &vs->vqs[i];
> +			/* Flushing the vhost_work acts as synchronize_rcu */
>  			mutex_lock(&vq->mutex);
> +			rcu_assign_pointer(vq->private_data, vs);
>  			vhost_init_used(vq);
>  			mutex_unlock(&vq->mutex);
>  		}
> -		vs->vs_endpoint = true;
>  		ret = 0;
>  	} else {
>  		ret = -EEXIST;


There's also some weird smp_mb__after_atomic_inc() with no
atomic in sight just above ... Nicholas what was the point there?


> @@ -849,6 +866,8 @@ static int vhost_scsi_clear_endpoint(
>  {
>  	struct tcm_vhost_tport *tv_tport;
>  	struct tcm_vhost_tpg *tv_tpg;
> +	struct vhost_virtqueue *vq;
> +	bool match = false;
>  	int index, ret, i;
>  	u8 target;
>  
> @@ -884,9 +903,18 @@ static int vhost_scsi_clear_endpoint(
>  		}
>  		tv_tpg->tv_tpg_vhost_count--;
>  		vs->vs_tpg[target] = NULL;
> -		vs->vs_endpoint = false;
> +		match = true;
>  		mutex_unlock(&tv_tpg->tv_tpg_mutex);
>  	}
> +	if (match) {
> +		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> +			vq = &vs->vqs[i];
> +			/* Flushing the vhost_work acts as synchronize_rcu */
> +			mutex_lock(&vq->mutex);
> +			rcu_assign_pointer(vq->private_data, NULL);
> +			mutex_unlock(&vq->mutex);
> +		}
> +	}

I'm trying to understand what's going on here.
Does vhost_scsi only have a single target?
Because the moment you clear one target you
also set private_data to NULL ...


>  	mutex_unlock(&vs->dev.mutex);
>  	return 0;
>  
> -- 
> 1.8.1.4

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
  2013-03-28  6:16   ` Michael S. Tsirkin
@ 2013-03-28  8:10     ` Asias He
  2013-03-28  8:33       ` Michael S. Tsirkin
                         ` (3 more replies)
  0 siblings, 4 replies; 33+ messages in thread
From: Asias He @ 2013-03-28  8:10 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, virtualization, target-devel, Stefan Hajnoczi, Paolo Bonzini

On Thu, Mar 28, 2013 at 08:16:59AM +0200, Michael S. Tsirkin wrote:
> On Thu, Mar 28, 2013 at 10:17:28AM +0800, Asias He wrote:
> > Currently, vs->vs_endpoint is used to indicate whether the endpoint is
> > set up. It is set or cleared in vhost_scsi_set_endpoint() or
> > vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> > we check it in vhost_scsi_handle_vq(), we ignore the lock.
> > 
> > Instead of using vs->vs_endpoint and the vs->dev.mutex lock to indicate
> > the status of the endpoint, we use the per-virtqueue vq->private_data to
> > indicate it. This way we only need to take the per-queue vq->mutex lock,
> > so concurrent multiqueue processing sees less lock contention. Further,
> > on the read side of vq->private_data we do not even need to take the
> > lock if it is accessed in the vhost worker thread, because it is
> > protected by the "vhost rcu".
> > 
> > Signed-off-by: Asias He <asias@redhat.com>
> > ---
> >  drivers/vhost/tcm_vhost.c | 38 +++++++++++++++++++++++++++++++++-----
> >  1 file changed, 33 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> > index 5e3d4487..0524267 100644
> > --- a/drivers/vhost/tcm_vhost.c
> > +++ b/drivers/vhost/tcm_vhost.c
> > @@ -67,7 +67,6 @@ struct vhost_scsi {
> >  	/* Protected by vhost_scsi->dev.mutex */
> >  	struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> >  	char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> > -	bool vs_endpoint;
> >  
> >  	struct vhost_dev dev;
> >  	struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> > @@ -91,6 +90,24 @@ static int iov_num_pages(struct iovec *iov)
> >  	       ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
> >  }
> >  
> > +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> > +{
> > +	bool ret = false;
> > +
> > +	/*
> > +	 * We can handle the vq only after the endpoint is setup by calling the
> > +	 * VHOST_SCSI_SET_ENDPOINT ioctl.
> > +	 *
> > +	 * TODO: Check that we are running from vhost_worker which acts
> > +	 * as read-side critical section for vhost kind of RCU.
> > +	 * See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
> > +	 */
> > +	if (rcu_dereference_check(vq->private_data, 1))
> > +		ret = true;
> > +
> > +	return ret;
> > +}
> > +
> >  static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
> >  {
> >  	return 1;
> > @@ -581,8 +598,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
> >  	int head, ret;
> >  	u8 target;
> >  
> > -	/* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> > -	if (unlikely(!vs->vs_endpoint))
> > +	if (!tcm_vhost_check_endpoint(vq))
> >  		return;
> >
> 
> I would just move the check to under vq mutex,
> and avoid rcu completely. In vhost-net we are using
> private data outside lock so we can't do this,
> no such issue here.

Are you talking about:

   handle_tx:
           /* TODO: check that we are running from vhost_worker? */
           sock = rcu_dereference_check(vq->private_data, 1);
           if (!sock)
                   return;
   
           wmem = atomic_read(&sock->sk->sk_wmem_alloc);
           if (wmem >= sock->sk->sk_sndbuf) {
                   mutex_lock(&vq->mutex);
                   tx_poll_start(net, sock);
                   mutex_unlock(&vq->mutex);
                   return;
           }
           mutex_lock(&vq->mutex);

Why not do the atomic_read and tx_poll_start under the vq->mutex, and thus do
the check under the lock as well.
   
   handle_rx:
           mutex_lock(&vq->mutex);
   
           /* TODO: check that we are running from vhost_worker? */
           struct socket *sock = rcu_dereference_check(vq->private_data, 1);
   
           if (!sock)
                   return;
   
           mutex_lock(&vq->mutex);

Can't we do the check under the vq->mutex here?

The rcu is still there but it makes the code easier to read. IMO, if we want to
use rcu, use it explicitly and avoid the vhost rcu completely.
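
(Sketch of the reshuffled handle_tx being asked about here, an editorial
illustration based on the snippet quoted above, not the actual vhost-net code:)

	mutex_lock(&vq->mutex);
	/* the read is now protected by vq->mutex instead of "vhost rcu" */
	sock = rcu_dereference_protected(vq->private_data,
					 lockdep_is_held(&vq->mutex));
	if (!sock) {
		mutex_unlock(&vq->mutex);
		return;
	}
	wmem = atomic_read(&sock->sk->sk_wmem_alloc);
	if (wmem >= sock->sk->sk_sndbuf) {
		tx_poll_start(net, sock);
		mutex_unlock(&vq->mutex);
		return;
	}
	/* ... transmit path ... */
	mutex_unlock(&vq->mutex);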

> >  	mutex_lock(&vq->mutex);
> > @@ -829,11 +845,12 @@ static int vhost_scsi_set_endpoint(
> >  		       sizeof(vs->vs_vhost_wwpn));
> >  		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> >  			vq = &vs->vqs[i];
> > +			/* Flushing the vhost_work acts as synchronize_rcu */
> >  			mutex_lock(&vq->mutex);
> > +			rcu_assign_pointer(vq->private_data, vs);
> >  			vhost_init_used(vq);
> >  			mutex_unlock(&vq->mutex);
> >  		}
> > -		vs->vs_endpoint = true;
> >  		ret = 0;
> >  	} else {
> >  		ret = -EEXIST;
> 
> 
> There's also some weird smp_mb__after_atomic_inc() with no
> atomic in sight just above ... Nicholas what was the point there?
> 
> 
> > @@ -849,6 +866,8 @@ static int vhost_scsi_clear_endpoint(
> >  {
> >  	struct tcm_vhost_tport *tv_tport;
> >  	struct tcm_vhost_tpg *tv_tpg;
> > +	struct vhost_virtqueue *vq;
> > +	bool match = false;
> >  	int index, ret, i;
> >  	u8 target;
> >  
> > @@ -884,9 +903,18 @@ static int vhost_scsi_clear_endpoint(
> >  		}
> >  		tv_tpg->tv_tpg_vhost_count--;
> >  		vs->vs_tpg[target] = NULL;
> > -		vs->vs_endpoint = false;
> > +		match = true;
> >  		mutex_unlock(&tv_tpg->tv_tpg_mutex);
> >  	}
> > +	if (match) {
> > +		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > +			vq = &vs->vqs[i];
> > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > +			mutex_lock(&vq->mutex);
> > +			rcu_assign_pointer(vq->private_data, NULL);
> > +			mutex_unlock(&vq->mutex);
> > +		}
> > +	}
> 
> I'm trying to understand what's going on here.
> Does vhost_scsi only have a single target?
> Because the moment you clear one target you
> also set private_data to NULL ...

vhost_scsi supports multiple targets. Currently, we can not disable a specific
target under the wwpn. When we clear or set the endpoint, we disable or enable
all the targets under the wwpn.
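
(Editorial sketch, condensed and simplified from the clear_endpoint hunk above,
of that all-or-nothing behaviour; error handling and refcounting are omitted:)

	/* Every target under the wwpn is dropped... */
	for (target = 0; target < VHOST_SCSI_MAX_TARGET; target++)
		vs->vs_tpg[target] = NULL;
	/* ...and only then is private_data cleared on every virtqueue. */
	for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
		vq = &vs->vqs[i];
		mutex_lock(&vq->mutex);
		rcu_assign_pointer(vq->private_data, NULL);
		mutex_unlock(&vq->mutex);
	}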

> 
> >  	mutex_unlock(&vs->dev.mutex);
> >  	return 0;
> >  
> > -- 
> > 1.8.1.4

-- 
Asias

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
  2013-03-28  8:10     ` Asias He
  2013-03-28  8:33       ` Michael S. Tsirkin
@ 2013-03-28  8:33       ` Michael S. Tsirkin
  2013-03-28  8:47         ` Asias He
  2013-03-28  9:18       ` Michael S. Tsirkin
  2013-03-28  9:18       ` Michael S. Tsirkin
  3 siblings, 1 reply; 33+ messages in thread
From: Michael S. Tsirkin @ 2013-03-28  8:33 UTC (permalink / raw)
  To: Asias He
  Cc: Nicholas Bellinger, Paolo Bonzini, Stefan Hajnoczi,
	Rusty Russell, kvm, virtualization, target-devel

On Thu, Mar 28, 2013 at 04:10:02PM +0800, Asias He wrote:
> On Thu, Mar 28, 2013 at 08:16:59AM +0200, Michael S. Tsirkin wrote:
> > On Thu, Mar 28, 2013 at 10:17:28AM +0800, Asias He wrote:
> > > Currently, vs->vs_endpoint is used to indicate whether the endpoint is
> > > set up. It is set or cleared in vhost_scsi_set_endpoint() or
> > > vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> > > we check it in vhost_scsi_handle_vq(), we ignore the lock.
> > > 
> > > Instead of using vs->vs_endpoint and the vs->dev.mutex lock to indicate
> > > the status of the endpoint, we use the per-virtqueue vq->private_data to
> > > indicate it. This way we only need to take the per-queue vq->mutex lock,
> > > so concurrent multiqueue processing sees less lock contention. Further,
> > > on the read side of vq->private_data we do not even need to take the
> > > lock if it is accessed in the vhost worker thread, because it is
> > > protected by the "vhost rcu".
> > > 
> > > Signed-off-by: Asias He <asias@redhat.com>
> > > ---
> > >  drivers/vhost/tcm_vhost.c | 38 +++++++++++++++++++++++++++++++++-----
> > >  1 file changed, 33 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> > > index 5e3d4487..0524267 100644
> > > --- a/drivers/vhost/tcm_vhost.c
> > > +++ b/drivers/vhost/tcm_vhost.c
> > > @@ -67,7 +67,6 @@ struct vhost_scsi {
> > >  	/* Protected by vhost_scsi->dev.mutex */
> > >  	struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> > >  	char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> > > -	bool vs_endpoint;
> > >  
> > >  	struct vhost_dev dev;
> > >  	struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> > > @@ -91,6 +90,24 @@ static int iov_num_pages(struct iovec *iov)
> > >  	       ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
> > >  }
> > >  
> > > +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> > > +{
> > > +	bool ret = false;
> > > +
> > > +	/*
> > > +	 * We can handle the vq only after the endpoint is setup by calling the
> > > +	 * VHOST_SCSI_SET_ENDPOINT ioctl.
> > > +	 *
> > > +	 * TODO: Check that we are running from vhost_worker which acts
> > > +	 * as read-side critical section for vhost kind of RCU.
> > > +	 * See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
> > > +	 */
> > > +	if (rcu_dereference_check(vq->private_data, 1))
> > > +		ret = true;
> > > +
> > > +	return ret;
> > > +}
> > > +
> > >  static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
> > >  {
> > >  	return 1;
> > > @@ -581,8 +598,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
> > >  	int head, ret;
> > >  	u8 target;
> > >  
> > > -	/* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> > > -	if (unlikely(!vs->vs_endpoint))
> > > +	if (!tcm_vhost_check_endpoint(vq))
> > >  		return;
> > >
> > 
> > I would just move the check to under vq mutex,
> > and avoid rcu completely. In vhost-net we are using
> > private data outside lock so we can't do this,
> > no such issue here.
> 
> Are you talking about:
> 
>    handle_tx:
>            /* TODO: check that we are running from vhost_worker? */
>            sock = rcu_dereference_check(vq->private_data, 1);
>            if (!sock)
>                    return;
>    
>            wmem = atomic_read(&sock->sk->sk_wmem_alloc);
>            if (wmem >= sock->sk->sk_sndbuf) {
>                    mutex_lock(&vq->mutex);
>                    tx_poll_start(net, sock);
>                    mutex_unlock(&vq->mutex);
>                    return;
>            }
>            mutex_lock(&vq->mutex);
> 
> Why not do the atomic_read and tx_poll_start under the vq->mutex, and thus do
> the check under the lock as well.
>    
>    handle_rx:
>            mutex_lock(&vq->mutex);
>    
>            /* TODO: check that we are running from vhost_worker? */
>            struct socket *sock = rcu_dereference_check(vq->private_data, 1);
>    
>            if (!sock)
>                    return;
>    
>            mutex_lock(&vq->mutex);
> 
> Can't we do the check under the vq->mutex here?
> 
> The rcu is still there but it makes the code easier to read. IMO, if we want to
> use rcu, use it explicitly and avoid the vhost rcu completely.

The point is to make spurious wakeups as lightweight as possible.
They seemed to happen a lot with -net.
Should not happen with -scsi at all.


> > >  	mutex_lock(&vq->mutex);
> > > @@ -829,11 +845,12 @@ static int vhost_scsi_set_endpoint(
> > >  		       sizeof(vs->vs_vhost_wwpn));
> > >  		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > >  			vq = &vs->vqs[i];
> > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > >  			mutex_lock(&vq->mutex);
> > > +			rcu_assign_pointer(vq->private_data, vs);
> > >  			vhost_init_used(vq);
> > >  			mutex_unlock(&vq->mutex);
> > >  		}
> > > -		vs->vs_endpoint = true;
> > >  		ret = 0;
> > >  	} else {
> > >  		ret = -EEXIST;
> > 
> > 
> > There's also some weird smp_mb__after_atomic_inc() with no
> > atomic in sight just above ... Nicholas what was the point there?
> > 
> > 
> > > @@ -849,6 +866,8 @@ static int vhost_scsi_clear_endpoint(
> > >  {
> > >  	struct tcm_vhost_tport *tv_tport;
> > >  	struct tcm_vhost_tpg *tv_tpg;
> > > +	struct vhost_virtqueue *vq;
> > > +	bool match = false;
> > >  	int index, ret, i;
> > >  	u8 target;
> > >  
> > > @@ -884,9 +903,18 @@ static int vhost_scsi_clear_endpoint(
> > >  		}
> > >  		tv_tpg->tv_tpg_vhost_count--;
> > >  		vs->vs_tpg[target] = NULL;
> > > -		vs->vs_endpoint = false;
> > > +		match = true;
> > >  		mutex_unlock(&tv_tpg->tv_tpg_mutex);
> > >  	}
> > > +	if (match) {
> > > +		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > +			vq = &vs->vqs[i];
> > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > +			mutex_lock(&vq->mutex);
> > > +			rcu_assign_pointer(vq->private_data, NULL);
> > > +			mutex_unlock(&vq->mutex);
> > > +		}
> > > +	}
> > 
> > I'm trying to understand what's going on here.
> > Does vhost_scsi only have a single target?
> > Because the moment you clear one target you
> > also set private_data to NULL ...
> 
> vhost_scsi supports multiple targets. Currently, we can not disable a specific
> target under the wwpn. When we clear or set the endpoint, we disable or enable
> all the targets under the wwpn.
> 
> > 
> > >  	mutex_unlock(&vs->dev.mutex);
> > >  	return 0;
> > >  
> > > -- 
> > > 1.8.1.4
> 
> -- 
> Asias

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
  2013-03-28  8:33       ` Michael S. Tsirkin
@ 2013-03-28  8:47         ` Asias He
  2013-03-28  9:06           ` Michael S. Tsirkin
  0 siblings, 1 reply; 33+ messages in thread
From: Asias He @ 2013-03-28  8:47 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, virtualization, target-devel, Stefan Hajnoczi, Paolo Bonzini

On Thu, Mar 28, 2013 at 10:33:30AM +0200, Michael S. Tsirkin wrote:
> On Thu, Mar 28, 2013 at 04:10:02PM +0800, Asias He wrote:
> > On Thu, Mar 28, 2013 at 08:16:59AM +0200, Michael S. Tsirkin wrote:
> > > On Thu, Mar 28, 2013 at 10:17:28AM +0800, Asias He wrote:
> > > > Currently, vs->vs_endpoint is used to indicate whether the endpoint is
> > > > set up. It is set or cleared in vhost_scsi_set_endpoint() or
> > > > vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> > > > we check it in vhost_scsi_handle_vq(), we ignore the lock.
> > > > 
> > > > Instead of using vs->vs_endpoint and the vs->dev.mutex lock to indicate
> > > > the status of the endpoint, we use the per-virtqueue vq->private_data to
> > > > indicate it. This way we only need to take the per-queue vq->mutex lock,
> > > > so concurrent multiqueue processing sees less lock contention. Further,
> > > > on the read side of vq->private_data we do not even need to take the
> > > > lock if it is accessed in the vhost worker thread, because it is
> > > > protected by the "vhost rcu".
> > > > 
> > > > Signed-off-by: Asias He <asias@redhat.com>
> > > > ---
> > > >  drivers/vhost/tcm_vhost.c | 38 +++++++++++++++++++++++++++++++++-----
> > > >  1 file changed, 33 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> > > > index 5e3d4487..0524267 100644
> > > > --- a/drivers/vhost/tcm_vhost.c
> > > > +++ b/drivers/vhost/tcm_vhost.c
> > > > @@ -67,7 +67,6 @@ struct vhost_scsi {
> > > >  	/* Protected by vhost_scsi->dev.mutex */
> > > >  	struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> > > >  	char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> > > > -	bool vs_endpoint;
> > > >  
> > > >  	struct vhost_dev dev;
> > > >  	struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> > > > @@ -91,6 +90,24 @@ static int iov_num_pages(struct iovec *iov)
> > > >  	       ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
> > > >  }
> > > >  
> > > > +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> > > > +{
> > > > +	bool ret = false;
> > > > +
> > > > +	/*
> > > > +	 * We can handle the vq only after the endpoint is setup by calling the
> > > > +	 * VHOST_SCSI_SET_ENDPOINT ioctl.
> > > > +	 *
> > > > +	 * TODO: Check that we are running from vhost_worker which acts
> > > > +	 * as read-side critical section for vhost kind of RCU.
> > > > +	 * See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
> > > > +	 */
> > > > +	if (rcu_dereference_check(vq->private_data, 1))
> > > > +		ret = true;
> > > > +
> > > > +	return ret;
> > > > +}
> > > > +
> > > >  static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
> > > >  {
> > > >  	return 1;
> > > > @@ -581,8 +598,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
> > > >  	int head, ret;
> > > >  	u8 target;
> > > >  
> > > > -	/* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> > > > -	if (unlikely(!vs->vs_endpoint))
> > > > +	if (!tcm_vhost_check_endpoint(vq))
> > > >  		return;
> > > >
> > > 
> > > I would just move the check to under vq mutex,
> > > and avoid rcu completely. In vhost-net we are using
> > > private data outside lock so we can't do this,
> > > no such issue here.
> > 
> > Are you talking about:
> > 
> >    handle_tx:
> >            /* TODO: check that we are running from vhost_worker? */
> >            sock = rcu_dereference_check(vq->private_data, 1);
> >            if (!sock)
> >                    return;
> >    
> >            wmem = atomic_read(&sock->sk->sk_wmem_alloc);
> >            if (wmem >= sock->sk->sk_sndbuf) {
> >                    mutex_lock(&vq->mutex);
> >                    tx_poll_start(net, sock);
> >                    mutex_unlock(&vq->mutex);
> >                    return;
> >            }
> >            mutex_lock(&vq->mutex);
> > 
> > Why not do the atomic_read and tx_poll_start under the vq->mutex, and thus do
> > the check under the lock as well.
> >    
> >    handle_rx:
> >            mutex_lock(&vq->mutex);
> >    
> >            /* TODO: check that we are running from vhost_worker? */
> >            struct socket *sock = rcu_dereference_check(vq->private_data, 1);
> >    
> >            if (!sock)
> >                    return;
> >    
> >            mutex_lock(&vq->mutex);
> > 
> > Can't we do the check under the vq->mutex here?
> > 
> > The rcu is still there but it makes the code easier to read. IMO, if we want to
> > use rcu, use it explicitly and avoid the vhost rcu completely.
> 
> The point is to make spurious wakeups as lightweight as possible.
> They seemed to happen a lot with -net.
> Should not happen with -scsi at all.

I am wondering:

1. Why are there a lot of spurious wakeups?

2. What performance impact would it have if we took the lock to check
   vq->private_data? Since, at any time, either handle_tx or handle_rx
   can be running, we can almost always take the vq->mutex.
   Did you manage to measure a real perf difference?

> 
> > > >  	mutex_lock(&vq->mutex);
> > > > @@ -829,11 +845,12 @@ static int vhost_scsi_set_endpoint(
> > > >  		       sizeof(vs->vs_vhost_wwpn));
> > > >  		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > >  			vq = &vs->vqs[i];
> > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > >  			mutex_lock(&vq->mutex);
> > > > +			rcu_assign_pointer(vq->private_data, vs);
> > > >  			vhost_init_used(vq);
> > > >  			mutex_unlock(&vq->mutex);
> > > >  		}
> > > > -		vs->vs_endpoint = true;
> > > >  		ret = 0;
> > > >  	} else {
> > > >  		ret = -EEXIST;
> > > 
> > > 
> > > There's also some weird smp_mb__after_atomic_inc() with no
> > > atomic in sight just above ... Nicholas what was the point there?
> > > 
> > > 
> > > > @@ -849,6 +866,8 @@ static int vhost_scsi_clear_endpoint(
> > > >  {
> > > >  	struct tcm_vhost_tport *tv_tport;
> > > >  	struct tcm_vhost_tpg *tv_tpg;
> > > > +	struct vhost_virtqueue *vq;
> > > > +	bool match = false;
> > > >  	int index, ret, i;
> > > >  	u8 target;
> > > >  
> > > > @@ -884,9 +903,18 @@ static int vhost_scsi_clear_endpoint(
> > > >  		}
> > > >  		tv_tpg->tv_tpg_vhost_count--;
> > > >  		vs->vs_tpg[target] = NULL;
> > > > -		vs->vs_endpoint = false;
> > > > +		match = true;
> > > >  		mutex_unlock(&tv_tpg->tv_tpg_mutex);
> > > >  	}
> > > > +	if (match) {
> > > > +		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > +			vq = &vs->vqs[i];
> > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > +			mutex_lock(&vq->mutex);
> > > > +			rcu_assign_pointer(vq->private_data, NULL);
> > > > +			mutex_unlock(&vq->mutex);
> > > > +		}
> > > > +	}
> > > 
> > > I'm trying to understand what's going on here.
> > > Does vhost_scsi only have a single target?
> > > Because the moment you clear one target you
> > > also set private_data to NULL ...
> > 
> > vhost_scsi supports multiple targets. Currently, we can not disable a specific
> > target under the wwpn. When we clear or set the endpoint, we disable or enable
> > all the targets under the wwpn.
> > 
> > > 
> > > >  	mutex_unlock(&vs->dev.mutex);
> > > >  	return 0;
> > > >  
> > > > -- 
> > > > 1.8.1.4
> > 
> > -- 
> > Asias

-- 
Asias

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
  2013-03-28  8:47         ` Asias He
@ 2013-03-28  9:06           ` Michael S. Tsirkin
  2013-03-29  6:27             ` Asias He
  2013-03-29  6:27             ` Asias He
  0 siblings, 2 replies; 33+ messages in thread
From: Michael S. Tsirkin @ 2013-03-28  9:06 UTC (permalink / raw)
  To: Asias He
  Cc: kvm, virtualization, target-devel, Stefan Hajnoczi, Paolo Bonzini

On Thu, Mar 28, 2013 at 04:47:15PM +0800, Asias He wrote:
> On Thu, Mar 28, 2013 at 10:33:30AM +0200, Michael S. Tsirkin wrote:
> > On Thu, Mar 28, 2013 at 04:10:02PM +0800, Asias He wrote:
> > > On Thu, Mar 28, 2013 at 08:16:59AM +0200, Michael S. Tsirkin wrote:
> > > > On Thu, Mar 28, 2013 at 10:17:28AM +0800, Asias He wrote:
> > > > > Currently, vs->vs_endpoint is used indicate if the endpoint is setup or
> > > > > not. It is set or cleared in vhost_scsi_set_endpoint() or
> > > > > vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> > > > > we check it in vhost_scsi_handle_vq(), we ignored the lock.
> > > > > 
> > > > > Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
> > > > > indicate the status of the endpoint, we use per virtqueue
> > > > > vq->private_data to indicate it. In this way, we can only take the
> > > > > vq->mutex lock which is per queue and make the concurrent multiqueue
> > > > > process having less lock contention. Further, in the read side of
> > > > > vq->private_data, we can even do not take only lock if it is accessed in
> > > > > the vhost worker thread, because it is protected by "vhost rcu".
> > > > > 
> > > > > Signed-off-by: Asias He <asias@redhat.com>
> > > > > ---
> > > > >  drivers/vhost/tcm_vhost.c | 38 +++++++++++++++++++++++++++++++++-----
> > > > >  1 file changed, 33 insertions(+), 5 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> > > > > index 5e3d4487..0524267 100644
> > > > > --- a/drivers/vhost/tcm_vhost.c
> > > > > +++ b/drivers/vhost/tcm_vhost.c
> > > > > @@ -67,7 +67,6 @@ struct vhost_scsi {
> > > > >  	/* Protected by vhost_scsi->dev.mutex */
> > > > >  	struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> > > > >  	char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> > > > > -	bool vs_endpoint;
> > > > >  
> > > > >  	struct vhost_dev dev;
> > > > >  	struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> > > > > @@ -91,6 +90,24 @@ static int iov_num_pages(struct iovec *iov)
> > > > >  	       ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
> > > > >  }
> > > > >  
> > > > > +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> > > > > +{
> > > > > +	bool ret = false;
> > > > > +
> > > > > +	/*
> > > > > +	 * We can handle the vq only after the endpoint is setup by calling the
> > > > > +	 * VHOST_SCSI_SET_ENDPOINT ioctl.
> > > > > +	 *
> > > > > +	 * TODO: Check that we are running from vhost_worker which acts
> > > > > +	 * as read-side critical section for vhost kind of RCU.
> > > > > +	 * See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
> > > > > +	 */
> > > > > +	if (rcu_dereference_check(vq->private_data, 1))
> > > > > +		ret = true;
> > > > > +
> > > > > +	return ret;
> > > > > +}
> > > > > +
> > > > >  static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
> > > > >  {
> > > > >  	return 1;
> > > > > @@ -581,8 +598,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
> > > > >  	int head, ret;
> > > > >  	u8 target;
> > > > >  
> > > > > -	/* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> > > > > -	if (unlikely(!vs->vs_endpoint))
> > > > > +	if (!tcm_vhost_check_endpoint(vq))
> > > > >  		return;
> > > > >
> > > > 
> > > > I would just move the check to under vq mutex,
> > > > and avoid rcu completely. In vhost-net we are using
> > > > private data outside lock so we can't do this,
> > > > no such issue here.
> > > 
> > > Are you talking about:
> > > 
> > >    handle_tx:
> > >            /* TODO: check that we are running from vhost_worker? */
> > >            sock = rcu_dereference_check(vq->private_data, 1);
> > >            if (!sock)
> > >                    return;
> > >    
> > >            wmem = atomic_read(&sock->sk->sk_wmem_alloc);
> > >            if (wmem >= sock->sk->sk_sndbuf) {
> > >                    mutex_lock(&vq->mutex);
> > >                    tx_poll_start(net, sock);
> > >                    mutex_unlock(&vq->mutex);
> > >                    return;
> > >            }
> > >            mutex_lock(&vq->mutex);
> > > 
> > > Why not do the atomic_read and tx_poll_start under the vq->mutex, and thus do
> > > the check under the lock as well.
> > >    
> > >    handle_rx:
> > >            mutex_lock(&vq->mutex);
> > >    
> > >            /* TODO: check that we are running from vhost_worker? */
> > >            struct socket *sock = rcu_dereference_check(vq->private_data, 1);
> > >    
> > >            if (!sock)
> > >                    return;
> > >    
> > >            mutex_lock(&vq->mutex);
> > > 
> > > Can't we can do the check under the vq->mutex here?
> > > 
> > > The rcu is still there but it makes the code easier to read. IMO, If we want to
> > > use rcu, use it explicitly and avoid the vhost rcu completely. 
> > 
> > The point is to make spurios wakeups as lightweight as possible.
> > The seemed to happen a lot with -net.
> > Should not happen with -scsi at all.
> 
> I am wondering:
> 
> 1. Why there is a lot of spurios wakeups
> 
> 2. What performance impact it would give if we take the lock to check
>    vq->private_data. Sinc, at any time, either handle_tx or handle_rx
>    can be running. So we can almost always take the vq->mutex mutex.
>    Did you managed to measure real perf difference?

At some point when this was written, yes.  We can revisit this, but
let's focus on fixing vhost-scsi.

> > 
> > > > >  	mutex_lock(&vq->mutex);
> > > > > @@ -829,11 +845,12 @@ static int vhost_scsi_set_endpoint(
> > > > >  		       sizeof(vs->vs_vhost_wwpn));
> > > > >  		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > >  			vq = &vs->vqs[i];
> > > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > >  			mutex_lock(&vq->mutex);
> > > > > +			rcu_assign_pointer(vq->private_data, vs);
> > > > >  			vhost_init_used(vq);
> > > > >  			mutex_unlock(&vq->mutex);
> > > > >  		}
> > > > > -		vs->vs_endpoint = true;
> > > > >  		ret = 0;
> > > > >  	} else {
> > > > >  		ret = -EEXIST;
> > > > 
> > > > 
> > > > There's also some weird smp_mb__after_atomic_inc() with no
> > > > atomic in sight just above ... Nicholas what was the point there?
> > > > 
> > > > 
> > > > > @@ -849,6 +866,8 @@ static int vhost_scsi_clear_endpoint(
> > > > >  {
> > > > >  	struct tcm_vhost_tport *tv_tport;
> > > > >  	struct tcm_vhost_tpg *tv_tpg;
> > > > > +	struct vhost_virtqueue *vq;
> > > > > +	bool match = false;
> > > > >  	int index, ret, i;
> > > > >  	u8 target;
> > > > >  
> > > > > @@ -884,9 +903,18 @@ static int vhost_scsi_clear_endpoint(
> > > > >  		}
> > > > >  		tv_tpg->tv_tpg_vhost_count--;
> > > > >  		vs->vs_tpg[target] = NULL;
> > > > > -		vs->vs_endpoint = false;
> > > > > +		match = true;
> > > > >  		mutex_unlock(&tv_tpg->tv_tpg_mutex);
> > > > >  	}
> > > > > +	if (match) {
> > > > > +		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > > +			vq = &vs->vqs[i];
> > > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > > +			mutex_lock(&vq->mutex);
> > > > > +			rcu_assign_pointer(vq->private_data, NULL);
> > > > > +			mutex_unlock(&vq->mutex);
> > > > > +		}
> > > > > +	}
> > > > 
> > > > I'm trying to understand what's going on here.
> > > > Does vhost_scsi only have a single target?
> > > > Because the moment you clear one target you
> > > > also set private_data to NULL ...
> > > 
> > > vhost_scsi supports multi target. Currently, We can not disable specific target
> > > under the wwpn. When we clear or set the endpoint, we disable or enable all the
> > > targets under the wwpn.




> > > > 
> > > > >  	mutex_unlock(&vs->dev.mutex);
> > > > >  	return 0;
> > > > >  
> > > > > -- 
> > > > > 1.8.1.4
> > > 
> > > -- 
> > > Asias
> 
> -- 
> Asias

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
  2013-03-28  8:10     ` Asias He
  2013-03-28  8:33       ` Michael S. Tsirkin
  2013-03-28  8:33       ` Michael S. Tsirkin
@ 2013-03-28  9:18       ` Michael S. Tsirkin
  2013-03-29  6:22         ` Asias He
  2013-03-28  9:18       ` Michael S. Tsirkin
  3 siblings, 1 reply; 33+ messages in thread
From: Michael S. Tsirkin @ 2013-03-28  9:18 UTC (permalink / raw)
  To: Asias He
  Cc: Nicholas Bellinger, Paolo Bonzini, Stefan Hajnoczi,
	Rusty Russell, kvm, virtualization, target-devel

On Thu, Mar 28, 2013 at 04:10:02PM +0800, Asias He wrote:
> On Thu, Mar 28, 2013 at 08:16:59AM +0200, Michael S. Tsirkin wrote:
> > On Thu, Mar 28, 2013 at 10:17:28AM +0800, Asias He wrote:
> > > Currently, vs->vs_endpoint is used indicate if the endpoint is setup or
> > > not. It is set or cleared in vhost_scsi_set_endpoint() or
> > > vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> > > we check it in vhost_scsi_handle_vq(), we ignored the lock.
> > > 
> > > Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
> > > indicate the status of the endpoint, we use per virtqueue
> > > vq->private_data to indicate it. In this way, we can only take the
> > > vq->mutex lock which is per queue and make the concurrent multiqueue
> > > process having less lock contention. Further, in the read side of
> > > vq->private_data, we can even do not take only lock if it is accessed in
> > > the vhost worker thread, because it is protected by "vhost rcu".
> > > 
> > > Signed-off-by: Asias He <asias@redhat.com>
> > > ---
> > >  drivers/vhost/tcm_vhost.c | 38 +++++++++++++++++++++++++++++++++-----
> > >  1 file changed, 33 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> > > index 5e3d4487..0524267 100644
> > > --- a/drivers/vhost/tcm_vhost.c
> > > +++ b/drivers/vhost/tcm_vhost.c
> > > @@ -67,7 +67,6 @@ struct vhost_scsi {
> > >  	/* Protected by vhost_scsi->dev.mutex */
> > >  	struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> > >  	char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> > > -	bool vs_endpoint;
> > >  
> > >  	struct vhost_dev dev;
> > >  	struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> > > @@ -91,6 +90,24 @@ static int iov_num_pages(struct iovec *iov)
> > >  	       ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
> > >  }
> > >  
> > > +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> > > +{
> > > +	bool ret = false;
> > > +
> > > +	/*
> > > +	 * We can handle the vq only after the endpoint is setup by calling the
> > > +	 * VHOST_SCSI_SET_ENDPOINT ioctl.
> > > +	 *
> > > +	 * TODO: Check that we are running from vhost_worker which acts
> > > +	 * as read-side critical section for vhost kind of RCU.
> > > +	 * See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
> > > +	 */
> > > +	if (rcu_dereference_check(vq->private_data, 1))
> > > +		ret = true;
> > > +
> > > +	return ret;
> > > +}
> > > +
> > >  static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
> > >  {
> > >  	return 1;
> > > @@ -581,8 +598,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
> > >  	int head, ret;
> > >  	u8 target;
> > >  
> > > -	/* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> > > -	if (unlikely(!vs->vs_endpoint))
> > > +	if (!tcm_vhost_check_endpoint(vq))
> > >  		return;
> > >
> > 
> > I would just move the check to under vq mutex,
> > and avoid rcu completely. In vhost-net we are using
> > private data outside lock so we can't do this,
> > no such issue here.
> 
> Are you talking about:
> 
>    handle_tx:
>            /* TODO: check that we are running from vhost_worker? */
>            sock = rcu_dereference_check(vq->private_data, 1);
>            if (!sock)
>                    return;
>    
>            wmem = atomic_read(&sock->sk->sk_wmem_alloc);
>            if (wmem >= sock->sk->sk_sndbuf) {
>                    mutex_lock(&vq->mutex);
>                    tx_poll_start(net, sock);
>                    mutex_unlock(&vq->mutex);
>                    return;
>            }
>            mutex_lock(&vq->mutex);
> 
> Why not do the atomic_read and tx_poll_start under the vq->mutex, and thus do
> the check under the lock as well.
>    
>    handle_rx:
>            mutex_lock(&vq->mutex);
>    
>            /* TODO: check that we are running from vhost_worker? */
>            struct socket *sock = rcu_dereference_check(vq->private_data, 1);
>    
>            if (!sock)
>                    return;
>    
>            mutex_lock(&vq->mutex);
> 
> Can't we can do the check under the vq->mutex here?
> 
> The rcu is still there but it makes the code easier to read. IMO, If we want to
> use rcu, use it explicitly and avoid the vhost rcu completely. 
> 
> > >  	mutex_lock(&vq->mutex);
> > > @@ -829,11 +845,12 @@ static int vhost_scsi_set_endpoint(
> > >  		       sizeof(vs->vs_vhost_wwpn));
> > >  		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > >  			vq = &vs->vqs[i];
> > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > >  			mutex_lock(&vq->mutex);
> > > +			rcu_assign_pointer(vq->private_data, vs);
> > >  			vhost_init_used(vq);
> > >  			mutex_unlock(&vq->mutex);
> > >  		}
> > > -		vs->vs_endpoint = true;
> > >  		ret = 0;
> > >  	} else {
> > >  		ret = -EEXIST;
> > 
> > 
> > There's also some weird smp_mb__after_atomic_inc() with no
> > atomic in sight just above ... Nicholas what was the point there?
> > 
> > 
> > > @@ -849,6 +866,8 @@ static int vhost_scsi_clear_endpoint(
> > >  {
> > >  	struct tcm_vhost_tport *tv_tport;
> > >  	struct tcm_vhost_tpg *tv_tpg;
> > > +	struct vhost_virtqueue *vq;
> > > +	bool match = false;
> > >  	int index, ret, i;
> > >  	u8 target;
> > >  
> > > @@ -884,9 +903,18 @@ static int vhost_scsi_clear_endpoint(
> > >  		}
> > >  		tv_tpg->tv_tpg_vhost_count--;
> > >  		vs->vs_tpg[target] = NULL;
> > > -		vs->vs_endpoint = false;
> > > +		match = true;
> > >  		mutex_unlock(&tv_tpg->tv_tpg_mutex);
> > >  	}
> > > +	if (match) {
> > > +		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > +			vq = &vs->vqs[i];
> > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > +			mutex_lock(&vq->mutex);
> > > +			rcu_assign_pointer(vq->private_data, NULL);
> > > +			mutex_unlock(&vq->mutex);
> > > +		}
> > > +	}
> > 
> > I'm trying to understand what's going on here.
> > Does vhost_scsi only have a single target?
> > Because the moment you clear one target you
> > also set private_data to NULL ...
> 
> vhost_scsi supports multi target. Currently, We can not disable specific target
> under the wwpn. When we clear or set the endpoint, we disable or enable all the
> targets under the wwpn.

okay, but changing vs->vs_tpg[target] under dev mutex, then using
it under vq mutex looks wrong.

Since we want to use private_data anyway, how about
making private_data point at struct tcm_vhost_tpg * ?

Allocate it dynamically in SET_ENDPOINT (and free old value if any).


> > 
> > >  	mutex_unlock(&vs->dev.mutex);
> > >  	return 0;
> > >  
> > > -- 
> > > 1.8.1.4
> 
> -- 
> Asias

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
  2013-03-28  8:10     ` Asias He
                         ` (2 preceding siblings ...)
  2013-03-28  9:18       ` Michael S. Tsirkin
@ 2013-03-28  9:18       ` Michael S. Tsirkin
  3 siblings, 0 replies; 33+ messages in thread
From: Michael S. Tsirkin @ 2013-03-28  9:18 UTC (permalink / raw)
  To: Asias He
  Cc: kvm, virtualization, target-devel, Stefan Hajnoczi, Paolo Bonzini

On Thu, Mar 28, 2013 at 04:10:02PM +0800, Asias He wrote:
> On Thu, Mar 28, 2013 at 08:16:59AM +0200, Michael S. Tsirkin wrote:
> > On Thu, Mar 28, 2013 at 10:17:28AM +0800, Asias He wrote:
> > > Currently, vs->vs_endpoint is used indicate if the endpoint is setup or
> > > not. It is set or cleared in vhost_scsi_set_endpoint() or
> > > vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> > > we check it in vhost_scsi_handle_vq(), we ignored the lock.
> > > 
> > > Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
> > > indicate the status of the endpoint, we use per virtqueue
> > > vq->private_data to indicate it. In this way, we can only take the
> > > vq->mutex lock which is per queue and make the concurrent multiqueue
> > > process having less lock contention. Further, in the read side of
> > > vq->private_data, we can even do not take only lock if it is accessed in
> > > the vhost worker thread, because it is protected by "vhost rcu".
> > > 
> > > Signed-off-by: Asias He <asias@redhat.com>
> > > ---
> > >  drivers/vhost/tcm_vhost.c | 38 +++++++++++++++++++++++++++++++++-----
> > >  1 file changed, 33 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> > > index 5e3d4487..0524267 100644
> > > --- a/drivers/vhost/tcm_vhost.c
> > > +++ b/drivers/vhost/tcm_vhost.c
> > > @@ -67,7 +67,6 @@ struct vhost_scsi {
> > >  	/* Protected by vhost_scsi->dev.mutex */
> > >  	struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> > >  	char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> > > -	bool vs_endpoint;
> > >  
> > >  	struct vhost_dev dev;
> > >  	struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> > > @@ -91,6 +90,24 @@ static int iov_num_pages(struct iovec *iov)
> > >  	       ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
> > >  }
> > >  
> > > +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> > > +{
> > > +	bool ret = false;
> > > +
> > > +	/*
> > > +	 * We can handle the vq only after the endpoint is setup by calling the
> > > +	 * VHOST_SCSI_SET_ENDPOINT ioctl.
> > > +	 *
> > > +	 * TODO: Check that we are running from vhost_worker which acts
> > > +	 * as read-side critical section for vhost kind of RCU.
> > > +	 * See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
> > > +	 */
> > > +	if (rcu_dereference_check(vq->private_data, 1))
> > > +		ret = true;
> > > +
> > > +	return ret;
> > > +}
> > > +
> > >  static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
> > >  {
> > >  	return 1;
> > > @@ -581,8 +598,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
> > >  	int head, ret;
> > >  	u8 target;
> > >  
> > > -	/* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> > > -	if (unlikely(!vs->vs_endpoint))
> > > +	if (!tcm_vhost_check_endpoint(vq))
> > >  		return;
> > >
> > 
> > I would just move the check to under vq mutex,
> > and avoid rcu completely. In vhost-net we are using
> > private data outside lock so we can't do this,
> > no such issue here.
> 
> Are you talking about:
> 
>    handle_tx:
>            /* TODO: check that we are running from vhost_worker? */
>            sock = rcu_dereference_check(vq->private_data, 1);
>            if (!sock)
>                    return;
>    
>            wmem = atomic_read(&sock->sk->sk_wmem_alloc);
>            if (wmem >= sock->sk->sk_sndbuf) {
>                    mutex_lock(&vq->mutex);
>                    tx_poll_start(net, sock);
>                    mutex_unlock(&vq->mutex);
>                    return;
>            }
>            mutex_lock(&vq->mutex);
> 
> Why not do the atomic_read and tx_poll_start under the vq->mutex, and thus do
> the check under the lock as well.
>    
>    handle_rx:
>            mutex_lock(&vq->mutex);
>    
>            /* TODO: check that we are running from vhost_worker? */
>            struct socket *sock = rcu_dereference_check(vq->private_data, 1);
>    
>            if (!sock)
>                    return;
>    
>            mutex_lock(&vq->mutex);
> 
> Can't we can do the check under the vq->mutex here?
> 
> The rcu is still there but it makes the code easier to read. IMO, If we want to
> use rcu, use it explicitly and avoid the vhost rcu completely. 
> 
> > >  	mutex_lock(&vq->mutex);
> > > @@ -829,11 +845,12 @@ static int vhost_scsi_set_endpoint(
> > >  		       sizeof(vs->vs_vhost_wwpn));
> > >  		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > >  			vq = &vs->vqs[i];
> > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > >  			mutex_lock(&vq->mutex);
> > > +			rcu_assign_pointer(vq->private_data, vs);
> > >  			vhost_init_used(vq);
> > >  			mutex_unlock(&vq->mutex);
> > >  		}
> > > -		vs->vs_endpoint = true;
> > >  		ret = 0;
> > >  	} else {
> > >  		ret = -EEXIST;
> > 
> > 
> > There's also some weird smp_mb__after_atomic_inc() with no
> > atomic in sight just above ... Nicholas what was the point there?
> > 
> > 
> > > @@ -849,6 +866,8 @@ static int vhost_scsi_clear_endpoint(
> > >  {
> > >  	struct tcm_vhost_tport *tv_tport;
> > >  	struct tcm_vhost_tpg *tv_tpg;
> > > +	struct vhost_virtqueue *vq;
> > > +	bool match = false;
> > >  	int index, ret, i;
> > >  	u8 target;
> > >  
> > > @@ -884,9 +903,18 @@ static int vhost_scsi_clear_endpoint(
> > >  		}
> > >  		tv_tpg->tv_tpg_vhost_count--;
> > >  		vs->vs_tpg[target] = NULL;
> > > -		vs->vs_endpoint = false;
> > > +		match = true;
> > >  		mutex_unlock(&tv_tpg->tv_tpg_mutex);
> > >  	}
> > > +	if (match) {
> > > +		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > +			vq = &vs->vqs[i];
> > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > +			mutex_lock(&vq->mutex);
> > > +			rcu_assign_pointer(vq->private_data, NULL);
> > > +			mutex_unlock(&vq->mutex);
> > > +		}
> > > +	}
> > 
> > I'm trying to understand what's going on here.
> > Does vhost_scsi only have a single target?
> > Because the moment you clear one target you
> > also set private_data to NULL ...
> 
> vhost_scsi supports multi target. Currently, We can not disable specific target
> under the wwpn. When we clear or set the endpoint, we disable or enable all the
> targets under the wwpn.

okay, but changing vs->vs_tpg[target] under dev mutex, then using
it under vq mutex looks wrong.

Since we want to use private_data anyway, how about
making private_data point at struct tcm_vhost_tpg * ?

Allocate it dynamically in SET_ENDPOINT (and free old value if any).


> > 
> > >  	mutex_unlock(&vs->dev.mutex);
> > >  	return 0;
> > >  
> > > -- 
> > > 1.8.1.4
> 
> -- 
> Asias

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
  2013-03-28  9:18       ` Michael S. Tsirkin
@ 2013-03-29  6:22         ` Asias He
  2013-03-31  8:20           ` Michael S. Tsirkin
  0 siblings, 1 reply; 33+ messages in thread
From: Asias He @ 2013-03-29  6:22 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, virtualization, target-devel, Stefan Hajnoczi, Paolo Bonzini

On Thu, Mar 28, 2013 at 11:18:22AM +0200, Michael S. Tsirkin wrote:
> On Thu, Mar 28, 2013 at 04:10:02PM +0800, Asias He wrote:
> > On Thu, Mar 28, 2013 at 08:16:59AM +0200, Michael S. Tsirkin wrote:
> > > On Thu, Mar 28, 2013 at 10:17:28AM +0800, Asias He wrote:
> > > > Currently, vs->vs_endpoint is used indicate if the endpoint is setup or
> > > > not. It is set or cleared in vhost_scsi_set_endpoint() or
> > > > vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> > > > we check it in vhost_scsi_handle_vq(), we ignored the lock.
> > > > 
> > > > Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
> > > > indicate the status of the endpoint, we use per virtqueue
> > > > vq->private_data to indicate it. In this way, we can only take the
> > > > vq->mutex lock which is per queue and make the concurrent multiqueue
> > > > process having less lock contention. Further, in the read side of
> > > > vq->private_data, we can even do not take only lock if it is accessed in
> > > > the vhost worker thread, because it is protected by "vhost rcu".
> > > > 
> > > > Signed-off-by: Asias He <asias@redhat.com>
> > > > ---
> > > >  drivers/vhost/tcm_vhost.c | 38 +++++++++++++++++++++++++++++++++-----
> > > >  1 file changed, 33 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> > > > index 5e3d4487..0524267 100644
> > > > --- a/drivers/vhost/tcm_vhost.c
> > > > +++ b/drivers/vhost/tcm_vhost.c
> > > > @@ -67,7 +67,6 @@ struct vhost_scsi {
> > > >  	/* Protected by vhost_scsi->dev.mutex */
> > > >  	struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> > > >  	char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> > > > -	bool vs_endpoint;
> > > >  
> > > >  	struct vhost_dev dev;
> > > >  	struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> > > > @@ -91,6 +90,24 @@ static int iov_num_pages(struct iovec *iov)
> > > >  	       ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
> > > >  }
> > > >  
> > > > +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> > > > +{
> > > > +	bool ret = false;
> > > > +
> > > > +	/*
> > > > +	 * We can handle the vq only after the endpoint is setup by calling the
> > > > +	 * VHOST_SCSI_SET_ENDPOINT ioctl.
> > > > +	 *
> > > > +	 * TODO: Check that we are running from vhost_worker which acts
> > > > +	 * as read-side critical section for vhost kind of RCU.
> > > > +	 * See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
> > > > +	 */
> > > > +	if (rcu_dereference_check(vq->private_data, 1))
> > > > +		ret = true;
> > > > +
> > > > +	return ret;
> > > > +}
> > > > +
> > > >  static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
> > > >  {
> > > >  	return 1;
> > > > @@ -581,8 +598,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
> > > >  	int head, ret;
> > > >  	u8 target;
> > > >  
> > > > -	/* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> > > > -	if (unlikely(!vs->vs_endpoint))
> > > > +	if (!tcm_vhost_check_endpoint(vq))
> > > >  		return;
> > > >
> > > 
> > > I would just move the check to under vq mutex,
> > > and avoid rcu completely. In vhost-net we are using
> > > private data outside lock so we can't do this,
> > > no such issue here.
> > 
> > Are you talking about:
> > 
> >    handle_tx:
> >            /* TODO: check that we are running from vhost_worker? */
> >            sock = rcu_dereference_check(vq->private_data, 1);
> >            if (!sock)
> >                    return;
> >    
> >            wmem = atomic_read(&sock->sk->sk_wmem_alloc);
> >            if (wmem >= sock->sk->sk_sndbuf) {
> >                    mutex_lock(&vq->mutex);
> >                    tx_poll_start(net, sock);
> >                    mutex_unlock(&vq->mutex);
> >                    return;
> >            }
> >            mutex_lock(&vq->mutex);
> > 
> > Why not do the atomic_read and tx_poll_start under the vq->mutex, and thus do
> > the check under the lock as well.
> >    
> >    handle_rx:
> >            mutex_lock(&vq->mutex);
> >    
> >            /* TODO: check that we are running from vhost_worker? */
> >            struct socket *sock = rcu_dereference_check(vq->private_data, 1);
> >    
> >            if (!sock)
> >                    return;
> >    
> >            mutex_lock(&vq->mutex);
> > 
> > Can't we can do the check under the vq->mutex here?
> > 
> > The rcu is still there but it makes the code easier to read. IMO, If we want to
> > use rcu, use it explicitly and avoid the vhost rcu completely. 
> > 
> > > >  	mutex_lock(&vq->mutex);
> > > > @@ -829,11 +845,12 @@ static int vhost_scsi_set_endpoint(
> > > >  		       sizeof(vs->vs_vhost_wwpn));
> > > >  		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > >  			vq = &vs->vqs[i];
> > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > >  			mutex_lock(&vq->mutex);
> > > > +			rcu_assign_pointer(vq->private_data, vs);
> > > >  			vhost_init_used(vq);
> > > >  			mutex_unlock(&vq->mutex);
> > > >  		}
> > > > -		vs->vs_endpoint = true;
> > > >  		ret = 0;
> > > >  	} else {
> > > >  		ret = -EEXIST;
> > > 
> > > 
> > > There's also some weird smp_mb__after_atomic_inc() with no
> > > atomic in sight just above ... Nicholas what was the point there?
> > > 
> > > 
> > > > @@ -849,6 +866,8 @@ static int vhost_scsi_clear_endpoint(
> > > >  {
> > > >  	struct tcm_vhost_tport *tv_tport;
> > > >  	struct tcm_vhost_tpg *tv_tpg;
> > > > +	struct vhost_virtqueue *vq;
> > > > +	bool match = false;
> > > >  	int index, ret, i;
> > > >  	u8 target;
> > > >  
> > > > @@ -884,9 +903,18 @@ static int vhost_scsi_clear_endpoint(
> > > >  		}
> > > >  		tv_tpg->tv_tpg_vhost_count--;
> > > >  		vs->vs_tpg[target] = NULL;
> > > > -		vs->vs_endpoint = false;
> > > > +		match = true;
> > > >  		mutex_unlock(&tv_tpg->tv_tpg_mutex);
> > > >  	}
> > > > +	if (match) {
> > > > +		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > +			vq = &vs->vqs[i];
> > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > +			mutex_lock(&vq->mutex);
> > > > +			rcu_assign_pointer(vq->private_data, NULL);
> > > > +			mutex_unlock(&vq->mutex);
> > > > +		}
> > > > +	}
> > > 
> > > I'm trying to understand what's going on here.
> > > Does vhost_scsi only have a single target?
> > > Because the moment you clear one target you
> > > also set private_data to NULL ...
> > 
> > vhost_scsi supports multi target. Currently, We can not disable specific target
> > under the wwpn. When we clear or set the endpoint, we disable or enable all the
> > targets under the wwpn.
> 
> okay, but changing vs->vs_tpg[target] under dev mutex, then using
> it under vq mutex looks wrong.

I do not see a problem here.

Access of vs->vs_tpg[target] in vhost_scsi_handle_vq() happens only after
SET_ENDPOINT is done. At that point, vs->vs_tpg[] is already set up.
Even if vs->vs_tpg[target] is changed to NULL by CLEAR_ENDPOINT, it is
safe, since we fail the request if vs->vs_tpg[target] is NULL.
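
That is, the handler already does something along these lines (paraphrased,
not an exact quote of the current code):

    tv_tpg = vs->vs_tpg[target];
    if (unlikely(!tv_tpg)) {
            /* ... reply to the guest with VIRTIO_SCSI_S_BAD_TARGET ... */
            continue;
    }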

> Since we want to use private_data anyway, how about
> making private_data point at struct tcm_vhost_tpg * ?
> 
> Allocate it dynamically in SET_ENDPOINT (and free old value if any).

The struct tcm_vhost_tpg is per target. I assume you want to point
private_data at the 'struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET]'
array.

> 
> > > 
> > > >  	mutex_unlock(&vs->dev.mutex);
> > > >  	return 0;
> > > >  
> > > > -- 
> > > > 1.8.1.4
> > 
> > -- 
> > Asias

-- 
Asias

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
  2013-03-28  9:06           ` Michael S. Tsirkin
@ 2013-03-29  6:27             ` Asias He
  2013-03-31  8:23               ` Michael S. Tsirkin
  2013-03-31  8:23               ` Michael S. Tsirkin
  2013-03-29  6:27             ` Asias He
  1 sibling, 2 replies; 33+ messages in thread
From: Asias He @ 2013-03-29  6:27 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Nicholas Bellinger, Paolo Bonzini, Stefan Hajnoczi,
	Rusty Russell, kvm, virtualization, target-devel

On Thu, Mar 28, 2013 at 11:06:15AM +0200, Michael S. Tsirkin wrote:
> On Thu, Mar 28, 2013 at 04:47:15PM +0800, Asias He wrote:
> > On Thu, Mar 28, 2013 at 10:33:30AM +0200, Michael S. Tsirkin wrote:
> > > On Thu, Mar 28, 2013 at 04:10:02PM +0800, Asias He wrote:
> > > > On Thu, Mar 28, 2013 at 08:16:59AM +0200, Michael S. Tsirkin wrote:
> > > > > On Thu, Mar 28, 2013 at 10:17:28AM +0800, Asias He wrote:
> > > > > > Currently, vs->vs_endpoint is used indicate if the endpoint is setup or
> > > > > > not. It is set or cleared in vhost_scsi_set_endpoint() or
> > > > > > vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> > > > > > we check it in vhost_scsi_handle_vq(), we ignored the lock.
> > > > > > 
> > > > > > Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
> > > > > > indicate the status of the endpoint, we use per virtqueue
> > > > > > vq->private_data to indicate it. In this way, we can only take the
> > > > > > vq->mutex lock which is per queue and make the concurrent multiqueue
> > > > > > process having less lock contention. Further, in the read side of
> > > > > > vq->private_data, we can even do not take only lock if it is accessed in
> > > > > > the vhost worker thread, because it is protected by "vhost rcu".
> > > > > > 
> > > > > > Signed-off-by: Asias He <asias@redhat.com>
> > > > > > ---
> > > > > >  drivers/vhost/tcm_vhost.c | 38 +++++++++++++++++++++++++++++++++-----
> > > > > >  1 file changed, 33 insertions(+), 5 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> > > > > > index 5e3d4487..0524267 100644
> > > > > > --- a/drivers/vhost/tcm_vhost.c
> > > > > > +++ b/drivers/vhost/tcm_vhost.c
> > > > > > @@ -67,7 +67,6 @@ struct vhost_scsi {
> > > > > >  	/* Protected by vhost_scsi->dev.mutex */
> > > > > >  	struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> > > > > >  	char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> > > > > > -	bool vs_endpoint;
> > > > > >  
> > > > > >  	struct vhost_dev dev;
> > > > > >  	struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> > > > > > @@ -91,6 +90,24 @@ static int iov_num_pages(struct iovec *iov)
> > > > > >  	       ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
> > > > > >  }
> > > > > >  
> > > > > > +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> > > > > > +{
> > > > > > +	bool ret = false;
> > > > > > +
> > > > > > +	/*
> > > > > > +	 * We can handle the vq only after the endpoint is setup by calling the
> > > > > > +	 * VHOST_SCSI_SET_ENDPOINT ioctl.
> > > > > > +	 *
> > > > > > +	 * TODO: Check that we are running from vhost_worker which acts
> > > > > > +	 * as read-side critical section for vhost kind of RCU.
> > > > > > +	 * See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
> > > > > > +	 */
> > > > > > +	if (rcu_dereference_check(vq->private_data, 1))
> > > > > > +		ret = true;
> > > > > > +
> > > > > > +	return ret;
> > > > > > +}
> > > > > > +
> > > > > >  static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
> > > > > >  {
> > > > > >  	return 1;
> > > > > > @@ -581,8 +598,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
> > > > > >  	int head, ret;
> > > > > >  	u8 target;
> > > > > >  
> > > > > > -	/* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> > > > > > -	if (unlikely(!vs->vs_endpoint))
> > > > > > +	if (!tcm_vhost_check_endpoint(vq))
> > > > > >  		return;
> > > > > >
> > > > > 
> > > > > I would just move the check to under vq mutex,
> > > > > and avoid rcu completely. In vhost-net we are using
> > > > > private data outside lock so we can't do this,
> > > > > no such issue here.
> > > > 
> > > > Are you talking about:
> > > > 
> > > >    handle_tx:
> > > >            /* TODO: check that we are running from vhost_worker? */
> > > >            sock = rcu_dereference_check(vq->private_data, 1);
> > > >            if (!sock)
> > > >                    return;
> > > >    
> > > >            wmem = atomic_read(&sock->sk->sk_wmem_alloc);
> > > >            if (wmem >= sock->sk->sk_sndbuf) {
> > > >                    mutex_lock(&vq->mutex);
> > > >                    tx_poll_start(net, sock);
> > > >                    mutex_unlock(&vq->mutex);
> > > >                    return;
> > > >            }
> > > >            mutex_lock(&vq->mutex);
> > > > 
> > > > Why not do the atomic_read and tx_poll_start under the vq->mutex, and thus do
> > > > the check under the lock as well.
> > > >    
> > > >    handle_rx:
> > > >            mutex_lock(&vq->mutex);
> > > >    
> > > >            /* TODO: check that we are running from vhost_worker? */
> > > >            struct socket *sock = rcu_dereference_check(vq->private_data, 1);
> > > >    
> > > >            if (!sock)
> > > >                    return;
> > > >    
> > > >            mutex_lock(&vq->mutex);
> > > > 
> > > > Can't we can do the check under the vq->mutex here?
> > > > 
> > > > The rcu is still there but it makes the code easier to read. IMO, If we want to
> > > > use rcu, use it explicitly and avoid the vhost rcu completely. 
> > > 
> > > The point is to make spurios wakeups as lightweight as possible.
> > > The seemed to happen a lot with -net.
> > > Should not happen with -scsi at all.
> > 
> > I am wondering:
> > 
> > 1. Why there is a lot of spurios wakeups
> > 
> > 2. What performance impact it would give if we take the lock to check
> >    vq->private_data. Sinc, at any time, either handle_tx or handle_rx
> >    can be running. So we can almost always take the vq->mutex mutex.
> >    Did you managed to measure real perf difference?
> 
> At some point when this was written, yes.  We can revisit this, but
> let's focus on fixing vhost-scsi.

If no perf difference is measurable, we can simplify -net as well. It would
be one small step towards removing the vhost rcu scheme.

> > > 
> > > > > >  	mutex_lock(&vq->mutex);
> > > > > > @@ -829,11 +845,12 @@ static int vhost_scsi_set_endpoint(
> > > > > >  		       sizeof(vs->vs_vhost_wwpn));
> > > > > >  		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > > >  			vq = &vs->vqs[i];
> > > > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > > >  			mutex_lock(&vq->mutex);
> > > > > > +			rcu_assign_pointer(vq->private_data, vs);
> > > > > >  			vhost_init_used(vq);
> > > > > >  			mutex_unlock(&vq->mutex);
> > > > > >  		}
> > > > > > -		vs->vs_endpoint = true;
> > > > > >  		ret = 0;
> > > > > >  	} else {
> > > > > >  		ret = -EEXIST;
> > > > > 
> > > > > 
> > > > > There's also some weird smp_mb__after_atomic_inc() with no
> > > > > atomic in sight just above ... Nicholas what was the point there?
> > > > > 
> > > > > 
> > > > > > @@ -849,6 +866,8 @@ static int vhost_scsi_clear_endpoint(
> > > > > >  {
> > > > > >  	struct tcm_vhost_tport *tv_tport;
> > > > > >  	struct tcm_vhost_tpg *tv_tpg;
> > > > > > +	struct vhost_virtqueue *vq;
> > > > > > +	bool match = false;
> > > > > >  	int index, ret, i;
> > > > > >  	u8 target;
> > > > > >  
> > > > > > @@ -884,9 +903,18 @@ static int vhost_scsi_clear_endpoint(
> > > > > >  		}
> > > > > >  		tv_tpg->tv_tpg_vhost_count--;
> > > > > >  		vs->vs_tpg[target] = NULL;
> > > > > > -		vs->vs_endpoint = false;
> > > > > > +		match = true;
> > > > > >  		mutex_unlock(&tv_tpg->tv_tpg_mutex);
> > > > > >  	}
> > > > > > +	if (match) {
> > > > > > +		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > > > +			vq = &vs->vqs[i];
> > > > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > > > +			mutex_lock(&vq->mutex);
> > > > > > +			rcu_assign_pointer(vq->private_data, NULL);
> > > > > > +			mutex_unlock(&vq->mutex);
> > > > > > +		}
> > > > > > +	}
> > > > > 
> > > > > I'm trying to understand what's going on here.
> > > > > Does vhost_scsi only have a single target?
> > > > > Because the moment you clear one target you
> > > > > also set private_data to NULL ...
> > > > 
> > > > vhost_scsi supports multi target. Currently, We can not disable specific target
> > > > under the wwpn. When we clear or set the endpoint, we disable or enable all the
> > > > targets under the wwpn.
> 
> 
> 
> 
> > > > > 
> > > > > >  	mutex_unlock(&vs->dev.mutex);
> > > > > >  	return 0;
> > > > > >  
> > > > > > -- 
> > > > > > 1.8.1.4
> > > > 
> > > > -- 
> > > > Asias
> > 
> > -- 
> > Asias

-- 
Asias

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
  2013-03-28  9:06           ` Michael S. Tsirkin
  2013-03-29  6:27             ` Asias He
@ 2013-03-29  6:27             ` Asias He
  1 sibling, 0 replies; 33+ messages in thread
From: Asias He @ 2013-03-29  6:27 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, virtualization, target-devel, Stefan Hajnoczi, Paolo Bonzini

On Thu, Mar 28, 2013 at 11:06:15AM +0200, Michael S. Tsirkin wrote:
> On Thu, Mar 28, 2013 at 04:47:15PM +0800, Asias He wrote:
> > On Thu, Mar 28, 2013 at 10:33:30AM +0200, Michael S. Tsirkin wrote:
> > > On Thu, Mar 28, 2013 at 04:10:02PM +0800, Asias He wrote:
> > > > On Thu, Mar 28, 2013 at 08:16:59AM +0200, Michael S. Tsirkin wrote:
> > > > > On Thu, Mar 28, 2013 at 10:17:28AM +0800, Asias He wrote:
> > > > > > Currently, vs->vs_endpoint is used indicate if the endpoint is setup or
> > > > > > not. It is set or cleared in vhost_scsi_set_endpoint() or
> > > > > > vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> > > > > > we check it in vhost_scsi_handle_vq(), we ignored the lock.
> > > > > > 
> > > > > > Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
> > > > > > indicate the status of the endpoint, we use per virtqueue
> > > > > > vq->private_data to indicate it. In this way, we can only take the
> > > > > > vq->mutex lock which is per queue and make the concurrent multiqueue
> > > > > > process having less lock contention. Further, in the read side of
> > > > > > vq->private_data, we can even do not take only lock if it is accessed in
> > > > > > the vhost worker thread, because it is protected by "vhost rcu".
> > > > > > 
> > > > > > Signed-off-by: Asias He <asias@redhat.com>
> > > > > > ---
> > > > > >  drivers/vhost/tcm_vhost.c | 38 +++++++++++++++++++++++++++++++++-----
> > > > > >  1 file changed, 33 insertions(+), 5 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> > > > > > index 5e3d4487..0524267 100644
> > > > > > --- a/drivers/vhost/tcm_vhost.c
> > > > > > +++ b/drivers/vhost/tcm_vhost.c
> > > > > > @@ -67,7 +67,6 @@ struct vhost_scsi {
> > > > > >  	/* Protected by vhost_scsi->dev.mutex */
> > > > > >  	struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> > > > > >  	char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> > > > > > -	bool vs_endpoint;
> > > > > >  
> > > > > >  	struct vhost_dev dev;
> > > > > >  	struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> > > > > > @@ -91,6 +90,24 @@ static int iov_num_pages(struct iovec *iov)
> > > > > >  	       ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
> > > > > >  }
> > > > > >  
> > > > > > +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> > > > > > +{
> > > > > > +	bool ret = false;
> > > > > > +
> > > > > > +	/*
> > > > > > +	 * We can handle the vq only after the endpoint is setup by calling the
> > > > > > +	 * VHOST_SCSI_SET_ENDPOINT ioctl.
> > > > > > +	 *
> > > > > > +	 * TODO: Check that we are running from vhost_worker which acts
> > > > > > +	 * as read-side critical section for vhost kind of RCU.
> > > > > > +	 * See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
> > > > > > +	 */
> > > > > > +	if (rcu_dereference_check(vq->private_data, 1))
> > > > > > +		ret = true;
> > > > > > +
> > > > > > +	return ret;
> > > > > > +}
> > > > > > +
> > > > > >  static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
> > > > > >  {
> > > > > >  	return 1;
> > > > > > @@ -581,8 +598,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
> > > > > >  	int head, ret;
> > > > > >  	u8 target;
> > > > > >  
> > > > > > -	/* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> > > > > > -	if (unlikely(!vs->vs_endpoint))
> > > > > > +	if (!tcm_vhost_check_endpoint(vq))
> > > > > >  		return;
> > > > > >
> > > > > 
> > > > > I would just move the check to under vq mutex,
> > > > > and avoid rcu completely. In vhost-net we are using
> > > > > private data outside lock so we can't do this,
> > > > > no such issue here.
> > > > 
> > > > Are you talking about:
> > > > 
> > > >    handle_tx:
> > > >            /* TODO: check that we are running from vhost_worker? */
> > > >            sock = rcu_dereference_check(vq->private_data, 1);
> > > >            if (!sock)
> > > >                    return;
> > > >    
> > > >            wmem = atomic_read(&sock->sk->sk_wmem_alloc);
> > > >            if (wmem >= sock->sk->sk_sndbuf) {
> > > >                    mutex_lock(&vq->mutex);
> > > >                    tx_poll_start(net, sock);
> > > >                    mutex_unlock(&vq->mutex);
> > > >                    return;
> > > >            }
> > > >            mutex_lock(&vq->mutex);
> > > > 
> > > > Why not do the atomic_read and tx_poll_start under the vq->mutex, and thus do
> > > > the check under the lock as well.
> > > >    
> > > >    handle_rx:
> > > >            mutex_lock(&vq->mutex);
> > > >    
> > > >            /* TODO: check that we are running from vhost_worker? */
> > > >            struct socket *sock = rcu_dereference_check(vq->private_data, 1);
> > > >    
> > > >            if (!sock)
> > > >                    return;
> > > >    
> > > >            mutex_lock(&vq->mutex);
> > > > 
> > > > Can't we can do the check under the vq->mutex here?
> > > > 
> > > > The rcu is still there but it makes the code easier to read. IMO, If we want to
> > > > use rcu, use it explicitly and avoid the vhost rcu completely. 
> > > 
> > > The point is to make spurios wakeups as lightweight as possible.
> > > The seemed to happen a lot with -net.
> > > Should not happen with -scsi at all.
> > 
> > I am wondering:
> > 
> > 1. Why there is a lot of spurios wakeups
> > 
> > 2. What performance impact it would give if we take the lock to check
> >    vq->private_data. Sinc, at any time, either handle_tx or handle_rx
> >    can be running. So we can almost always take the vq->mutex mutex.
> >    Did you managed to measure real perf difference?
> 
> At some point when this was written, yes.  We can revisit this, but
> let's focus on fixing vhost-scsi.

If no perf difference is measurable, we can simplify -net as well. It would
be one small step towards removing the vhost rcu scheme.

> > > 
> > > > > >  	mutex_lock(&vq->mutex);
> > > > > > @@ -829,11 +845,12 @@ static int vhost_scsi_set_endpoint(
> > > > > >  		       sizeof(vs->vs_vhost_wwpn));
> > > > > >  		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > > >  			vq = &vs->vqs[i];
> > > > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > > >  			mutex_lock(&vq->mutex);
> > > > > > +			rcu_assign_pointer(vq->private_data, vs);
> > > > > >  			vhost_init_used(vq);
> > > > > >  			mutex_unlock(&vq->mutex);
> > > > > >  		}
> > > > > > -		vs->vs_endpoint = true;
> > > > > >  		ret = 0;
> > > > > >  	} else {
> > > > > >  		ret = -EEXIST;
> > > > > 
> > > > > 
> > > > > There's also some weird smp_mb__after_atomic_inc() with no
> > > > > atomic in sight just above ... Nicholas what was the point there?
> > > > > 
> > > > > 
> > > > > > @@ -849,6 +866,8 @@ static int vhost_scsi_clear_endpoint(
> > > > > >  {
> > > > > >  	struct tcm_vhost_tport *tv_tport;
> > > > > >  	struct tcm_vhost_tpg *tv_tpg;
> > > > > > +	struct vhost_virtqueue *vq;
> > > > > > +	bool match = false;
> > > > > >  	int index, ret, i;
> > > > > >  	u8 target;
> > > > > >  
> > > > > > @@ -884,9 +903,18 @@ static int vhost_scsi_clear_endpoint(
> > > > > >  		}
> > > > > >  		tv_tpg->tv_tpg_vhost_count--;
> > > > > >  		vs->vs_tpg[target] = NULL;
> > > > > > -		vs->vs_endpoint = false;
> > > > > > +		match = true;
> > > > > >  		mutex_unlock(&tv_tpg->tv_tpg_mutex);
> > > > > >  	}
> > > > > > +	if (match) {
> > > > > > +		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > > > +			vq = &vs->vqs[i];
> > > > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > > > +			mutex_lock(&vq->mutex);
> > > > > > +			rcu_assign_pointer(vq->private_data, NULL);
> > > > > > +			mutex_unlock(&vq->mutex);
> > > > > > +		}
> > > > > > +	}
> > > > > 
> > > > > I'm trying to understand what's going on here.
> > > > > Does vhost_scsi only have a single target?
> > > > > Because the moment you clear one target you
> > > > > also set private_data to NULL ...
> > > > 
> > > > vhost_scsi supports multi target. Currently, We can not disable specific target
> > > > under the wwpn. When we clear or set the endpoint, we disable or enable all the
> > > > targets under the wwpn.
> 
> 
> 
> 
> > > > > 
> > > > > >  	mutex_unlock(&vs->dev.mutex);
> > > > > >  	return 0;
> > > > > >  
> > > > > > -- 
> > > > > > 1.8.1.4
> > > > 
> > > > -- 
> > > > Asias
> > 
> > -- 
> > Asias

-- 
Asias

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
  2013-03-29  6:22         ` Asias He
@ 2013-03-31  8:20           ` Michael S. Tsirkin
  2013-04-01  2:13             ` Asias He
  0 siblings, 1 reply; 33+ messages in thread
From: Michael S. Tsirkin @ 2013-03-31  8:20 UTC (permalink / raw)
  To: Asias He
  Cc: kvm, virtualization, target-devel, Stefan Hajnoczi, Paolo Bonzini

On Fri, Mar 29, 2013 at 02:22:52PM +0800, Asias He wrote:
> On Thu, Mar 28, 2013 at 11:18:22AM +0200, Michael S. Tsirkin wrote:
> > On Thu, Mar 28, 2013 at 04:10:02PM +0800, Asias He wrote:
> > > On Thu, Mar 28, 2013 at 08:16:59AM +0200, Michael S. Tsirkin wrote:
> > > > On Thu, Mar 28, 2013 at 10:17:28AM +0800, Asias He wrote:
> > > > > Currently, vs->vs_endpoint is used indicate if the endpoint is setup or
> > > > > not. It is set or cleared in vhost_scsi_set_endpoint() or
> > > > > vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> > > > > we check it in vhost_scsi_handle_vq(), we ignored the lock.
> > > > > 
> > > > > Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
> > > > > indicate the status of the endpoint, we use per virtqueue
> > > > > vq->private_data to indicate it. In this way, we can only take the
> > > > > vq->mutex lock which is per queue and make the concurrent multiqueue
> > > > > process having less lock contention. Further, in the read side of
> > > > > vq->private_data, we can even do not take only lock if it is accessed in
> > > > > the vhost worker thread, because it is protected by "vhost rcu".
> > > > > 
> > > > > Signed-off-by: Asias He <asias@redhat.com>
> > > > > ---
> > > > >  drivers/vhost/tcm_vhost.c | 38 +++++++++++++++++++++++++++++++++-----
> > > > >  1 file changed, 33 insertions(+), 5 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> > > > > index 5e3d4487..0524267 100644
> > > > > --- a/drivers/vhost/tcm_vhost.c
> > > > > +++ b/drivers/vhost/tcm_vhost.c
> > > > > @@ -67,7 +67,6 @@ struct vhost_scsi {
> > > > >  	/* Protected by vhost_scsi->dev.mutex */
> > > > >  	struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> > > > >  	char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> > > > > -	bool vs_endpoint;
> > > > >  
> > > > >  	struct vhost_dev dev;
> > > > >  	struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> > > > > @@ -91,6 +90,24 @@ static int iov_num_pages(struct iovec *iov)
> > > > >  	       ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
> > > > >  }
> > > > >  
> > > > > +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> > > > > +{
> > > > > +	bool ret = false;
> > > > > +
> > > > > +	/*
> > > > > +	 * We can handle the vq only after the endpoint is setup by calling the
> > > > > +	 * VHOST_SCSI_SET_ENDPOINT ioctl.
> > > > > +	 *
> > > > > +	 * TODO: Check that we are running from vhost_worker which acts
> > > > > +	 * as read-side critical section for vhost kind of RCU.
> > > > > +	 * See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
> > > > > +	 */
> > > > > +	if (rcu_dereference_check(vq->private_data, 1))
> > > > > +		ret = true;
> > > > > +
> > > > > +	return ret;
> > > > > +}
> > > > > +
> > > > >  static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
> > > > >  {
> > > > >  	return 1;
> > > > > @@ -581,8 +598,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
> > > > >  	int head, ret;
> > > > >  	u8 target;
> > > > >  
> > > > > -	/* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> > > > > -	if (unlikely(!vs->vs_endpoint))
> > > > > +	if (!tcm_vhost_check_endpoint(vq))
> > > > >  		return;
> > > > >
> > > > 
> > > > I would just move the check to under vq mutex,
> > > > and avoid rcu completely. In vhost-net we are using
> > > > private data outside lock so we can't do this,
> > > > no such issue here.
> > > 
> > > Are you talking about:
> > > 
> > >    handle_tx:
> > >            /* TODO: check that we are running from vhost_worker? */
> > >            sock = rcu_dereference_check(vq->private_data, 1);
> > >            if (!sock)
> > >                    return;
> > >    
> > >            wmem = atomic_read(&sock->sk->sk_wmem_alloc);
> > >            if (wmem >= sock->sk->sk_sndbuf) {
> > >                    mutex_lock(&vq->mutex);
> > >                    tx_poll_start(net, sock);
> > >                    mutex_unlock(&vq->mutex);
> > >                    return;
> > >            }
> > >            mutex_lock(&vq->mutex);
> > > 
> > > Why not do the atomic_read and tx_poll_start under the vq->mutex, and thus do
> > > the check under the lock as well.
> > >    
> > >    handle_rx:
> > >            mutex_lock(&vq->mutex);
> > >    
> > >            /* TODO: check that we are running from vhost_worker? */
> > >            struct socket *sock = rcu_dereference_check(vq->private_data, 1);
> > >    
> > >            if (!sock)
> > >                    return;
> > >    
> > >            mutex_lock(&vq->mutex);
> > > 
> > > Can't we can do the check under the vq->mutex here?
> > > 
> > > The rcu is still there but it makes the code easier to read. IMO, If we want to
> > > use rcu, use it explicitly and avoid the vhost rcu completely. 
> > > 
> > > > >  	mutex_lock(&vq->mutex);
> > > > > @@ -829,11 +845,12 @@ static int vhost_scsi_set_endpoint(
> > > > >  		       sizeof(vs->vs_vhost_wwpn));
> > > > >  		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > >  			vq = &vs->vqs[i];
> > > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > >  			mutex_lock(&vq->mutex);
> > > > > +			rcu_assign_pointer(vq->private_data, vs);
> > > > >  			vhost_init_used(vq);
> > > > >  			mutex_unlock(&vq->mutex);
> > > > >  		}
> > > > > -		vs->vs_endpoint = true;
> > > > >  		ret = 0;
> > > > >  	} else {
> > > > >  		ret = -EEXIST;
> > > > 
> > > > 
> > > > There's also some weird smp_mb__after_atomic_inc() with no
> > > > atomic in sight just above ... Nicholas what was the point there?
> > > > 
> > > > 
> > > > > @@ -849,6 +866,8 @@ static int vhost_scsi_clear_endpoint(
> > > > >  {
> > > > >  	struct tcm_vhost_tport *tv_tport;
> > > > >  	struct tcm_vhost_tpg *tv_tpg;
> > > > > +	struct vhost_virtqueue *vq;
> > > > > +	bool match = false;
> > > > >  	int index, ret, i;
> > > > >  	u8 target;
> > > > >  
> > > > > @@ -884,9 +903,18 @@ static int vhost_scsi_clear_endpoint(
> > > > >  		}
> > > > >  		tv_tpg->tv_tpg_vhost_count--;
> > > > >  		vs->vs_tpg[target] = NULL;
> > > > > -		vs->vs_endpoint = false;
> > > > > +		match = true;
> > > > >  		mutex_unlock(&tv_tpg->tv_tpg_mutex);
> > > > >  	}
> > > > > +	if (match) {
> > > > > +		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > > +			vq = &vs->vqs[i];
> > > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > > +			mutex_lock(&vq->mutex);
> > > > > +			rcu_assign_pointer(vq->private_data, NULL);
> > > > > +			mutex_unlock(&vq->mutex);
> > > > > +		}
> > > > > +	}
> > > > 
> > > > I'm trying to understand what's going on here.
> > > > Does vhost_scsi only have a single target?
> > > > Because the moment you clear one target you
> > > > also set private_data to NULL ...
> > > 
> > > vhost_scsi supports multi target. Currently, We can not disable specific target
> > > under the wwpn. When we clear or set the endpoint, we disable or enable all the
> > > targets under the wwpn.
> > 
> > okay, but changing vs->vs_tpg[target] under dev mutex, then using
> > it under vq mutex looks wrong.
> 
> I do not see a problem here.
> 
> Access of vs->vs_tpg[target] in vhost_scsi_handle_vq() happens only when
> the SET_ENDPOINT is done.

But nothing prevents multiple SET_ENDPOINT calls while
the previous one is in progress.

> At that time, the vs->vs_tpg[] is already
> ready. Even if the vs->vs_tpg[target] is changed to NULL in
> CLEAR_ENDPOINT, it is safe since we fail the request if
> vs->vs_tpg[target] is NULL.

We check it without a common lock so it can become NULL
after we test it.

> > Since we want to use private_data anyway, how about
> > making private_data point at struct tcm_vhost_tpg * ?
> > 
> > Allocate it dynamically in SET_ENDPOINT (and free old value if any).
> 
> The struct tcm_vhost_tpg is per target. I assume you want to point
> private_data to the 'struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET]'

No, I want to point it at the array of targets.
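
The handler would then take one snapshot of the published array and look
the target up through it, e.g. (sketch only, reusing the names from this
patch):

    struct tcm_vhost_tpg **vs_tpg, *tv_tpg;

    /* One read of the published array; targets are then looked up
     * through vs_tpg[] instead of through vs->vs_tpg[], which is
     * modified under vs->dev.mutex. */
    vs_tpg = rcu_dereference_check(vq->private_data, 1);
    if (!vs_tpg)
            return;

    /* ... parse the request and extract 'target' ... */

    tv_tpg = ACCESS_ONCE(vs_tpg[target]);
    if (!tv_tpg)
            return;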

> > 
> > > > 
> > > > >  	mutex_unlock(&vs->dev.mutex);
> > > > >  	return 0;
> > > > >  
> > > > > -- 
> > > > > 1.8.1.4
> > > 
> > > -- 
> > > Asias
> 
> -- 
> Asias

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
  2013-03-29  6:27             ` Asias He
  2013-03-31  8:23               ` Michael S. Tsirkin
@ 2013-03-31  8:23               ` Michael S. Tsirkin
  2013-04-01  2:20                 ` Asias He
                                   ` (2 more replies)
  1 sibling, 3 replies; 33+ messages in thread
From: Michael S. Tsirkin @ 2013-03-31  8:23 UTC (permalink / raw)
  To: Asias He
  Cc: Nicholas Bellinger, Paolo Bonzini, Stefan Hajnoczi,
	Rusty Russell, kvm, virtualization, target-devel

On Fri, Mar 29, 2013 at 02:27:50PM +0800, Asias He wrote:
> On Thu, Mar 28, 2013 at 11:06:15AM +0200, Michael S. Tsirkin wrote:
> > On Thu, Mar 28, 2013 at 04:47:15PM +0800, Asias He wrote:
> > > On Thu, Mar 28, 2013 at 10:33:30AM +0200, Michael S. Tsirkin wrote:
> > > > On Thu, Mar 28, 2013 at 04:10:02PM +0800, Asias He wrote:
> > > > > On Thu, Mar 28, 2013 at 08:16:59AM +0200, Michael S. Tsirkin wrote:
> > > > > > On Thu, Mar 28, 2013 at 10:17:28AM +0800, Asias He wrote:
> > > > > > > Currently, vs->vs_endpoint is used indicate if the endpoint is setup or
> > > > > > > not. It is set or cleared in vhost_scsi_set_endpoint() or
> > > > > > > vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> > > > > > > we check it in vhost_scsi_handle_vq(), we ignored the lock.
> > > > > > > 
> > > > > > > Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
> > > > > > > indicate the status of the endpoint, we use per virtqueue
> > > > > > > vq->private_data to indicate it. In this way, we can only take the
> > > > > > > vq->mutex lock which is per queue and make the concurrent multiqueue
> > > > > > > process having less lock contention. Further, in the read side of
> > > > > > > vq->private_data, we can even do not take only lock if it is accessed in
> > > > > > > the vhost worker thread, because it is protected by "vhost rcu".
> > > > > > > 
> > > > > > > Signed-off-by: Asias He <asias@redhat.com>
> > > > > > > ---
> > > > > > >  drivers/vhost/tcm_vhost.c | 38 +++++++++++++++++++++++++++++++++-----
> > > > > > >  1 file changed, 33 insertions(+), 5 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> > > > > > > index 5e3d4487..0524267 100644
> > > > > > > --- a/drivers/vhost/tcm_vhost.c
> > > > > > > +++ b/drivers/vhost/tcm_vhost.c
> > > > > > > @@ -67,7 +67,6 @@ struct vhost_scsi {
> > > > > > >  	/* Protected by vhost_scsi->dev.mutex */
> > > > > > >  	struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> > > > > > >  	char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> > > > > > > -	bool vs_endpoint;
> > > > > > >  
> > > > > > >  	struct vhost_dev dev;
> > > > > > >  	struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> > > > > > > @@ -91,6 +90,24 @@ static int iov_num_pages(struct iovec *iov)
> > > > > > >  	       ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
> > > > > > >  }
> > > > > > >  
> > > > > > > +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> > > > > > > +{
> > > > > > > +	bool ret = false;
> > > > > > > +
> > > > > > > +	/*
> > > > > > > +	 * We can handle the vq only after the endpoint is setup by calling the
> > > > > > > +	 * VHOST_SCSI_SET_ENDPOINT ioctl.
> > > > > > > +	 *
> > > > > > > +	 * TODO: Check that we are running from vhost_worker which acts
> > > > > > > +	 * as read-side critical section for vhost kind of RCU.
> > > > > > > +	 * See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
> > > > > > > +	 */
> > > > > > > +	if (rcu_dereference_check(vq->private_data, 1))
> > > > > > > +		ret = true;
> > > > > > > +
> > > > > > > +	return ret;
> > > > > > > +}
> > > > > > > +
> > > > > > >  static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
> > > > > > >  {
> > > > > > >  	return 1;
> > > > > > > @@ -581,8 +598,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
> > > > > > >  	int head, ret;
> > > > > > >  	u8 target;
> > > > > > >  
> > > > > > > -	/* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> > > > > > > -	if (unlikely(!vs->vs_endpoint))
> > > > > > > +	if (!tcm_vhost_check_endpoint(vq))
> > > > > > >  		return;
> > > > > > >
> > > > > > 
> > > > > > I would just move the check to under vq mutex,
> > > > > > and avoid rcu completely. In vhost-net we are using
> > > > > > private data outside lock so we can't do this,
> > > > > > no such issue here.
> > > > > 
> > > > > Are you talking about:
> > > > > 
> > > > >    handle_tx:
> > > > >            /* TODO: check that we are running from vhost_worker? */
> > > > >            sock = rcu_dereference_check(vq->private_data, 1);
> > > > >            if (!sock)
> > > > >                    return;
> > > > >    
> > > > >            wmem = atomic_read(&sock->sk->sk_wmem_alloc);
> > > > >            if (wmem >= sock->sk->sk_sndbuf) {
> > > > >                    mutex_lock(&vq->mutex);
> > > > >                    tx_poll_start(net, sock);
> > > > >                    mutex_unlock(&vq->mutex);
> > > > >                    return;
> > > > >            }
> > > > >            mutex_lock(&vq->mutex);
> > > > > 
> > > > > Why not do the atomic_read and tx_poll_start under the vq->mutex, and thus do
> > > > > the check under the lock as well.
> > > > >    
> > > > >    handle_rx:
> > > > >            mutex_lock(&vq->mutex);
> > > > >    
> > > > >            /* TODO: check that we are running from vhost_worker? */
> > > > >            struct socket *sock = rcu_dereference_check(vq->private_data, 1);
> > > > >    
> > > > >            if (!sock)
> > > > >                    return;
> > > > >    
> > > > >            mutex_lock(&vq->mutex);
> > > > > 
> > > > > Can't we can do the check under the vq->mutex here?
> > > > > 
> > > > > The rcu is still there but it makes the code easier to read. IMO, If we want to
> > > > > use rcu, use it explicitly and avoid the vhost rcu completely. 
> > > > 
> > > > The point is to make spurios wakeups as lightweight as possible.
> > > > The seemed to happen a lot with -net.
> > > > Should not happen with -scsi at all.
> > > 
> > > I am wondering:
> > > 
> > > 1. Why there is a lot of spurios wakeups
> > > 
> > > 2. What performance impact it would give if we take the lock to check
> > >    vq->private_data. Sinc, at any time, either handle_tx or handle_rx
> > >    can be running. So we can almost always take the vq->mutex mutex.
> > >    Did you managed to measure real perf difference?
> > 
> > At some point when this was written, yes.  We can revisit this, but
> > let's focus on fixing vhost-scsi.
> 
> If no perf difference is measurable, we can simplify the -net. It would
> be one small step towards removing the vhost rcu thing.

Rusty's currently doing some reorgs of -net; let's delay
cleanups there to avoid stepping on each other's toes.
Let's focus on scsi here.
E.g. any chance the framing assumptions can be fixed in 3.10?

> > > > 
> > > > > > >  	mutex_lock(&vq->mutex);
> > > > > > > @@ -829,11 +845,12 @@ static int vhost_scsi_set_endpoint(
> > > > > > >  		       sizeof(vs->vs_vhost_wwpn));
> > > > > > >  		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > > > >  			vq = &vs->vqs[i];
> > > > > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > > > >  			mutex_lock(&vq->mutex);
> > > > > > > +			rcu_assign_pointer(vq->private_data, vs);
> > > > > > >  			vhost_init_used(vq);
> > > > > > >  			mutex_unlock(&vq->mutex);
> > > > > > >  		}
> > > > > > > -		vs->vs_endpoint = true;
> > > > > > >  		ret = 0;
> > > > > > >  	} else {
> > > > > > >  		ret = -EEXIST;
> > > > > > 
> > > > > > 
> > > > > > There's also some weird smp_mb__after_atomic_inc() with no
> > > > > > atomic in sight just above ... Nicholas what was the point there?
> > > > > > 
> > > > > > 
> > > > > > > @@ -849,6 +866,8 @@ static int vhost_scsi_clear_endpoint(
> > > > > > >  {
> > > > > > >  	struct tcm_vhost_tport *tv_tport;
> > > > > > >  	struct tcm_vhost_tpg *tv_tpg;
> > > > > > > +	struct vhost_virtqueue *vq;
> > > > > > > +	bool match = false;
> > > > > > >  	int index, ret, i;
> > > > > > >  	u8 target;
> > > > > > >  
> > > > > > > @@ -884,9 +903,18 @@ static int vhost_scsi_clear_endpoint(
> > > > > > >  		}
> > > > > > >  		tv_tpg->tv_tpg_vhost_count--;
> > > > > > >  		vs->vs_tpg[target] = NULL;
> > > > > > > -		vs->vs_endpoint = false;
> > > > > > > +		match = true;
> > > > > > >  		mutex_unlock(&tv_tpg->tv_tpg_mutex);
> > > > > > >  	}
> > > > > > > +	if (match) {
> > > > > > > +		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > > > > +			vq = &vs->vqs[i];
> > > > > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > > > > +			mutex_lock(&vq->mutex);
> > > > > > > +			rcu_assign_pointer(vq->private_data, NULL);
> > > > > > > +			mutex_unlock(&vq->mutex);
> > > > > > > +		}
> > > > > > > +	}
> > > > > > 
> > > > > > I'm trying to understand what's going on here.
> > > > > > Does vhost_scsi only have a single target?
> > > > > > Because the moment you clear one target you
> > > > > > also set private_data to NULL ...
> > > > > 
> > > > > vhost_scsi supports multi target. Currently, We can not disable specific target
> > > > > under the wwpn. When we clear or set the endpoint, we disable or enable all the
> > > > > targets under the wwpn.
> > 
> > 
> > 
> > 
> > > > > > 
> > > > > > >  	mutex_unlock(&vs->dev.mutex);
> > > > > > >  	return 0;
> > > > > > >  
> > > > > > > -- 
> > > > > > > 1.8.1.4
> > > > > 
> > > > > -- 
> > > > > Asias
> > > 
> > > -- 
> > > Asias
> 
> -- 
> Asias

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
  2013-03-31  8:20           ` Michael S. Tsirkin
@ 2013-04-01  2:13             ` Asias He
  2013-04-02 12:15               ` Michael S. Tsirkin
  0 siblings, 1 reply; 33+ messages in thread
From: Asias He @ 2013-04-01  2:13 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, virtualization, target-devel, Stefan Hajnoczi, Paolo Bonzini

On Sun, Mar 31, 2013 at 11:20:24AM +0300, Michael S. Tsirkin wrote:
> On Fri, Mar 29, 2013 at 02:22:52PM +0800, Asias He wrote:
> > On Thu, Mar 28, 2013 at 11:18:22AM +0200, Michael S. Tsirkin wrote:
> > > On Thu, Mar 28, 2013 at 04:10:02PM +0800, Asias He wrote:
> > > > On Thu, Mar 28, 2013 at 08:16:59AM +0200, Michael S. Tsirkin wrote:
> > > > > On Thu, Mar 28, 2013 at 10:17:28AM +0800, Asias He wrote:
> > > > > > Currently, vs->vs_endpoint is used indicate if the endpoint is setup or
> > > > > > not. It is set or cleared in vhost_scsi_set_endpoint() or
> > > > > > vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> > > > > > we check it in vhost_scsi_handle_vq(), we ignored the lock.
> > > > > > 
> > > > > > Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
> > > > > > indicate the status of the endpoint, we use per virtqueue
> > > > > > vq->private_data to indicate it. In this way, we can only take the
> > > > > > vq->mutex lock which is per queue and make the concurrent multiqueue
> > > > > > process having less lock contention. Further, in the read side of
> > > > > > vq->private_data, we can even do not take only lock if it is accessed in
> > > > > > the vhost worker thread, because it is protected by "vhost rcu".
> > > > > > 
> > > > > > Signed-off-by: Asias He <asias@redhat.com>
> > > > > > ---
> > > > > >  drivers/vhost/tcm_vhost.c | 38 +++++++++++++++++++++++++++++++++-----
> > > > > >  1 file changed, 33 insertions(+), 5 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> > > > > > index 5e3d4487..0524267 100644
> > > > > > --- a/drivers/vhost/tcm_vhost.c
> > > > > > +++ b/drivers/vhost/tcm_vhost.c
> > > > > > @@ -67,7 +67,6 @@ struct vhost_scsi {
> > > > > >  	/* Protected by vhost_scsi->dev.mutex */
> > > > > >  	struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> > > > > >  	char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> > > > > > -	bool vs_endpoint;
> > > > > >  
> > > > > >  	struct vhost_dev dev;
> > > > > >  	struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> > > > > > @@ -91,6 +90,24 @@ static int iov_num_pages(struct iovec *iov)
> > > > > >  	       ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
> > > > > >  }
> > > > > >  
> > > > > > +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> > > > > > +{
> > > > > > +	bool ret = false;
> > > > > > +
> > > > > > +	/*
> > > > > > +	 * We can handle the vq only after the endpoint is setup by calling the
> > > > > > +	 * VHOST_SCSI_SET_ENDPOINT ioctl.
> > > > > > +	 *
> > > > > > +	 * TODO: Check that we are running from vhost_worker which acts
> > > > > > +	 * as read-side critical section for vhost kind of RCU.
> > > > > > +	 * See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
> > > > > > +	 */
> > > > > > +	if (rcu_dereference_check(vq->private_data, 1))
> > > > > > +		ret = true;
> > > > > > +
> > > > > > +	return ret;
> > > > > > +}
> > > > > > +
> > > > > >  static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
> > > > > >  {
> > > > > >  	return 1;
> > > > > > @@ -581,8 +598,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
> > > > > >  	int head, ret;
> > > > > >  	u8 target;
> > > > > >  
> > > > > > -	/* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> > > > > > -	if (unlikely(!vs->vs_endpoint))
> > > > > > +	if (!tcm_vhost_check_endpoint(vq))
> > > > > >  		return;
> > > > > >
> > > > > 
> > > > > I would just move the check to under vq mutex,
> > > > > and avoid rcu completely. In vhost-net we are using
> > > > > private data outside lock so we can't do this,
> > > > > no such issue here.
> > > > 
> > > > Are you talking about:
> > > > 
> > > >    handle_tx:
> > > >            /* TODO: check that we are running from vhost_worker? */
> > > >            sock = rcu_dereference_check(vq->private_data, 1);
> > > >            if (!sock)
> > > >                    return;
> > > >    
> > > >            wmem = atomic_read(&sock->sk->sk_wmem_alloc);
> > > >            if (wmem >= sock->sk->sk_sndbuf) {
> > > >                    mutex_lock(&vq->mutex);
> > > >                    tx_poll_start(net, sock);
> > > >                    mutex_unlock(&vq->mutex);
> > > >                    return;
> > > >            }
> > > >            mutex_lock(&vq->mutex);
> > > > 
> > > > Why not do the atomic_read and tx_poll_start under the vq->mutex, and thus do
> > > > the check under the lock as well.
> > > >    
> > > >    handle_rx:
> > > >            mutex_lock(&vq->mutex);
> > > >    
> > > >            /* TODO: check that we are running from vhost_worker? */
> > > >            struct socket *sock = rcu_dereference_check(vq->private_data, 1);
> > > >    
> > > >            if (!sock)
> > > >                    return;
> > > >    
> > > >            mutex_lock(&vq->mutex);
> > > > 
> > > > Can't we can do the check under the vq->mutex here?
> > > > 
> > > > The rcu is still there but it makes the code easier to read. IMO, If we want to
> > > > use rcu, use it explicitly and avoid the vhost rcu completely. 
> > > > 
> > > > > >  	mutex_lock(&vq->mutex);
> > > > > > @@ -829,11 +845,12 @@ static int vhost_scsi_set_endpoint(
> > > > > >  		       sizeof(vs->vs_vhost_wwpn));
> > > > > >  		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > > >  			vq = &vs->vqs[i];
> > > > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > > >  			mutex_lock(&vq->mutex);
> > > > > > +			rcu_assign_pointer(vq->private_data, vs);
> > > > > >  			vhost_init_used(vq);
> > > > > >  			mutex_unlock(&vq->mutex);
> > > > > >  		}
> > > > > > -		vs->vs_endpoint = true;
> > > > > >  		ret = 0;
> > > > > >  	} else {
> > > > > >  		ret = -EEXIST;
> > > > > 
> > > > > 
> > > > > There's also some weird smp_mb__after_atomic_inc() with no
> > > > > atomic in sight just above ... Nicholas what was the point there?
> > > > > 
> > > > > 
> > > > > > @@ -849,6 +866,8 @@ static int vhost_scsi_clear_endpoint(
> > > > > >  {
> > > > > >  	struct tcm_vhost_tport *tv_tport;
> > > > > >  	struct tcm_vhost_tpg *tv_tpg;
> > > > > > +	struct vhost_virtqueue *vq;
> > > > > > +	bool match = false;
> > > > > >  	int index, ret, i;
> > > > > >  	u8 target;
> > > > > >  
> > > > > > @@ -884,9 +903,18 @@ static int vhost_scsi_clear_endpoint(
> > > > > >  		}
> > > > > >  		tv_tpg->tv_tpg_vhost_count--;
> > > > > >  		vs->vs_tpg[target] = NULL;
> > > > > > -		vs->vs_endpoint = false;
> > > > > > +		match = true;
> > > > > >  		mutex_unlock(&tv_tpg->tv_tpg_mutex);
> > > > > >  	}
> > > > > > +	if (match) {
> > > > > > +		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > > > +			vq = &vs->vqs[i];
> > > > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > > > +			mutex_lock(&vq->mutex);
> > > > > > +			rcu_assign_pointer(vq->private_data, NULL);
> > > > > > +			mutex_unlock(&vq->mutex);
> > > > > > +		}
> > > > > > +	}
> > > > > 
> > > > > I'm trying to understand what's going on here.
> > > > > Does vhost_scsi only have a single target?
> > > > > Because the moment you clear one target you
> > > > > also set private_data to NULL ...
> > > > 
> > > > vhost_scsi supports multi target. Currently, We can not disable specific target
> > > > under the wwpn. When we clear or set the endpoint, we disable or enable all the
> > > > targets under the wwpn.
> > > 
> > > okay, but changing vs->vs_tpg[target] under dev mutex, then using
> > > it under vq mutex looks wrong.
> > 
> > I do not see a problem here.
> > 
> > Access of vs->vs_tpg[target] in vhost_scsi_handle_vq() happens only when
> > the SET_ENDPOINT is done.
> 
> But nothing prevents multiple SET_ENDPOINT calls while
> the previous one is in progress.

vhost_scsi_set_endpoint() and vhost_scsi_clear_endpoint() are protected
by vs->dev.mutex, no?

And in vhost_scsi_set_endpoint():

	if (tv_tpg->tv_tpg_vhost_count != 0) {
		mutex_unlock(&tv_tpg->tv_tpg_mutex);
		continue;
	}

This prevents vhost_scsi_set_endpoint() from being called again before we call
vhost_scsi_clear_endpoint() to decrease tv_tpg->tv_tpg_vhost_count.
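
To make that guard concrete, here is a condensed sketch of the relevant
lines (trimmed down, not a literal quote of the file):

    /* vhost_scsi_set_endpoint(), called with vs->dev.mutex held */
    mutex_lock(&tv_tpg->tv_tpg_mutex);
    if (tv_tpg->tv_tpg_vhost_count != 0) {
            /* tpg already claimed by a vhost: skip it */
            mutex_unlock(&tv_tpg->tv_tpg_mutex);
            continue;
    }
    tv_tpg->tv_tpg_vhost_count++;           /* claim the tpg */
    mutex_unlock(&tv_tpg->tv_tpg_mutex);

    /* vhost_scsi_clear_endpoint(), also with vs->dev.mutex held */
    mutex_lock(&tv_tpg->tv_tpg_mutex);
    tv_tpg->tv_tpg_vhost_count--;           /* release the tpg */
    vs->vs_tpg[target] = NULL;
    mutex_unlock(&tv_tpg->tv_tpg_mutex);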

> > At that time, the vs->vs_tpg[] is already
> > ready. Even if the vs->vs_tpg[target] is changed to NULL in
> > CLEAR_ENDPOINT, it is safe since we fail the request if
> > vs->vs_tpg[target] is NULL.
> 
> We check it without a common lock so it can become NULL
> after we test it.


vhost_scsi_handle_vq:

     tv_tpg = vs->vs_tpg[target];	
     if (!tv_tpg)
         we fail the cmd
     ...

     INIT_WORK(&tv_cmd->work, tcm_vhost_submission_work);
     queue_work(tcm_vhost_workqueue, &tv_cmd->work);

So, after we test tv_tpg, even if vs->vs_tpg[target] becomes NULL, it
does not matter as long as the tpg is not deleted by calling tcm_vhost_drop_tpg().
tcm_vhost_drop_tpg() will not succeed unless we call vhost_scsi_clear_endpoint() first,
because tcm_vhost_drop_tpg -> tcm_vhost_drop_nexus -> checks if (tpg->tv_tpg_vhost_count != 0).

Further, the tcm core should fail the cmd if the tpg is gone when we submit the cmd in
tcm_vhost_submission_work. (nab, is this true?)
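
Roughly, the check referred to above sits on the tcm_vhost_drop_tpg() ->
tcm_vhost_drop_nexus() path and looks like this (a sketch, not a quote;
the exact errno is an assumption, the point is that the drop is refused):

    /* tcm_vhost_drop_nexus(), reached via tcm_vhost_drop_tpg() */
    mutex_lock(&tpg->tv_tpg_mutex);
    ...
    if (tpg->tv_tpg_vhost_count != 0) {
            mutex_unlock(&tpg->tv_tpg_mutex);
            return -EBUSY;  /* tpg still in use by a vhost, refuse to drop */
    }
    ...
    mutex_unlock(&tpg->tv_tpg_mutex);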

> > > Since we want to use private_data anyway, how about
> > > making private_data point at struct tcm_vhost_tpg * ?
> > > 
> > > Allocate it dynamically in SET_ENDPOINT (and free old value if any).
> > 
> > The struct tcm_vhost_tpg is per target. I assume you want to point
> > private_data to the 'struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET]'
> 
> No, I want to put it at the array of targets.

tcm_vhost_tpg is allocated in tcm_vhost_make_tpg(). There is no array of
targets. The targets come into existence when the user creates them on the host
side using the targetcli tools or the /sys/kernel/config interface.

> > > 
> > > > > 
> > > > > >  	mutex_unlock(&vs->dev.mutex);
> > > > > >  	return 0;
> > > > > >  
> > > > > > -- 
> > > > > > 1.8.1.4
> > > > 
> > > > -- 
> > > > Asias
> > 
> > -- 
> > Asias

-- 
Asias

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
  2013-03-31  8:23               ` Michael S. Tsirkin
@ 2013-04-01  2:20                 ` Asias He
  2013-04-01 22:57                 ` Rusty Russell
  2013-04-01 22:57                 ` Rusty Russell
  2 siblings, 0 replies; 33+ messages in thread
From: Asias He @ 2013-04-01  2:20 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, virtualization, target-devel, Stefan Hajnoczi, Paolo Bonzini

On Sun, Mar 31, 2013 at 11:23:12AM +0300, Michael S. Tsirkin wrote:
> On Fri, Mar 29, 2013 at 02:27:50PM +0800, Asias He wrote:
> > On Thu, Mar 28, 2013 at 11:06:15AM +0200, Michael S. Tsirkin wrote:
> > > On Thu, Mar 28, 2013 at 04:47:15PM +0800, Asias He wrote:
> > > > On Thu, Mar 28, 2013 at 10:33:30AM +0200, Michael S. Tsirkin wrote:
> > > > > On Thu, Mar 28, 2013 at 04:10:02PM +0800, Asias He wrote:
> > > > > > On Thu, Mar 28, 2013 at 08:16:59AM +0200, Michael S. Tsirkin wrote:
> > > > > > > On Thu, Mar 28, 2013 at 10:17:28AM +0800, Asias He wrote:
> > > > > > > > Currently, vs->vs_endpoint is used indicate if the endpoint is setup or
> > > > > > > > not. It is set or cleared in vhost_scsi_set_endpoint() or
> > > > > > > > vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> > > > > > > > we check it in vhost_scsi_handle_vq(), we ignored the lock.
> > > > > > > > 
> > > > > > > > Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
> > > > > > > > indicate the status of the endpoint, we use per virtqueue
> > > > > > > > vq->private_data to indicate it. In this way, we can only take the
> > > > > > > > vq->mutex lock which is per queue and make the concurrent multiqueue
> > > > > > > > process having less lock contention. Further, in the read side of
> > > > > > > > vq->private_data, we can even do not take only lock if it is accessed in
> > > > > > > > the vhost worker thread, because it is protected by "vhost rcu".
> > > > > > > > 
> > > > > > > > Signed-off-by: Asias He <asias@redhat.com>
> > > > > > > > ---
> > > > > > > >  drivers/vhost/tcm_vhost.c | 38 +++++++++++++++++++++++++++++++++-----
> > > > > > > >  1 file changed, 33 insertions(+), 5 deletions(-)
> > > > > > > > 
> > > > > > > > diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> > > > > > > > index 5e3d4487..0524267 100644
> > > > > > > > --- a/drivers/vhost/tcm_vhost.c
> > > > > > > > +++ b/drivers/vhost/tcm_vhost.c
> > > > > > > > @@ -67,7 +67,6 @@ struct vhost_scsi {
> > > > > > > >  	/* Protected by vhost_scsi->dev.mutex */
> > > > > > > >  	struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> > > > > > > >  	char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> > > > > > > > -	bool vs_endpoint;
> > > > > > > >  
> > > > > > > >  	struct vhost_dev dev;
> > > > > > > >  	struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> > > > > > > > @@ -91,6 +90,24 @@ static int iov_num_pages(struct iovec *iov)
> > > > > > > >  	       ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
> > > > > > > >  }
> > > > > > > >  
> > > > > > > > +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> > > > > > > > +{
> > > > > > > > +	bool ret = false;
> > > > > > > > +
> > > > > > > > +	/*
> > > > > > > > +	 * We can handle the vq only after the endpoint is setup by calling the
> > > > > > > > +	 * VHOST_SCSI_SET_ENDPOINT ioctl.
> > > > > > > > +	 *
> > > > > > > > +	 * TODO: Check that we are running from vhost_worker which acts
> > > > > > > > +	 * as read-side critical section for vhost kind of RCU.
> > > > > > > > +	 * See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
> > > > > > > > +	 */
> > > > > > > > +	if (rcu_dereference_check(vq->private_data, 1))
> > > > > > > > +		ret = true;
> > > > > > > > +
> > > > > > > > +	return ret;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > >  static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
> > > > > > > >  {
> > > > > > > >  	return 1;
> > > > > > > > @@ -581,8 +598,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
> > > > > > > >  	int head, ret;
> > > > > > > >  	u8 target;
> > > > > > > >  
> > > > > > > > -	/* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> > > > > > > > -	if (unlikely(!vs->vs_endpoint))
> > > > > > > > +	if (!tcm_vhost_check_endpoint(vq))
> > > > > > > >  		return;
> > > > > > > >
> > > > > > > 
> > > > > > > I would just move the check to under vq mutex,
> > > > > > > and avoid rcu completely. In vhost-net we are using
> > > > > > > private data outside lock so we can't do this,
> > > > > > > no such issue here.
> > > > > > 
> > > > > > Are you talking about:
> > > > > > 
> > > > > >    handle_tx:
> > > > > >            /* TODO: check that we are running from vhost_worker? */
> > > > > >            sock = rcu_dereference_check(vq->private_data, 1);
> > > > > >            if (!sock)
> > > > > >                    return;
> > > > > >    
> > > > > >            wmem = atomic_read(&sock->sk->sk_wmem_alloc);
> > > > > >            if (wmem >= sock->sk->sk_sndbuf) {
> > > > > >                    mutex_lock(&vq->mutex);
> > > > > >                    tx_poll_start(net, sock);
> > > > > >                    mutex_unlock(&vq->mutex);
> > > > > >                    return;
> > > > > >            }
> > > > > >            mutex_lock(&vq->mutex);
> > > > > > 
> > > > > > Why not do the atomic_read and tx_poll_start under the vq->mutex, and thus do
> > > > > > the check under the lock as well.
> > > > > >    
> > > > > >    handle_rx:
> > > > > >            mutex_lock(&vq->mutex);
> > > > > >    
> > > > > >            /* TODO: check that we are running from vhost_worker? */
> > > > > >            struct socket *sock = rcu_dereference_check(vq->private_data, 1);
> > > > > >    
> > > > > >            if (!sock)
> > > > > >                    return;
> > > > > >    
> > > > > >            mutex_lock(&vq->mutex);
> > > > > > 
> > > > > > Can't we can do the check under the vq->mutex here?
> > > > > > 
> > > > > > The rcu is still there but it makes the code easier to read. IMO, If we want to
> > > > > > use rcu, use it explicitly and avoid the vhost rcu completely. 
> > > > > 
> > > > > The point is to make spurios wakeups as lightweight as possible.
> > > > > The seemed to happen a lot with -net.
> > > > > Should not happen with -scsi at all.
> > > > 
> > > > I am wondering:
> > > > 
> > > > 1. Why there is a lot of spurios wakeups
> > > > 
> > > > 2. What performance impact it would give if we take the lock to check
> > > >    vq->private_data. Sinc, at any time, either handle_tx or handle_rx
> > > >    can be running. So we can almost always take the vq->mutex mutex.
> > > >    Did you managed to measure real perf difference?
> > > 
> > > At some point when this was written, yes.  We can revisit this, but
> > > let's focus on fixing vhost-scsi.
> > 
> > If no perf difference is measurable, we can simplify the -net. It would
> > be one small step towards removing the vhost rcu thing.
> 
> Rusty's currently doing some reorgs of -net let's delay
> cleanups there to avoid stepping on each other's toys.
> Let's focus on scsi here.
> E.g. any chance framing assumptions can be fixed in 3.10?

There are set endpoint, hotplug and flush series in flight already, so I will
not start the framing assumptions fix until we merge the in-flight ones.
However, IMO, we can probably fix it in 3.10.

> > > > > 
> > > > > > > >  	mutex_lock(&vq->mutex);
> > > > > > > > @@ -829,11 +845,12 @@ static int vhost_scsi_set_endpoint(
> > > > > > > >  		       sizeof(vs->vs_vhost_wwpn));
> > > > > > > >  		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > > > > >  			vq = &vs->vqs[i];
> > > > > > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > > > > >  			mutex_lock(&vq->mutex);
> > > > > > > > +			rcu_assign_pointer(vq->private_data, vs);
> > > > > > > >  			vhost_init_used(vq);
> > > > > > > >  			mutex_unlock(&vq->mutex);
> > > > > > > >  		}
> > > > > > > > -		vs->vs_endpoint = true;
> > > > > > > >  		ret = 0;
> > > > > > > >  	} else {
> > > > > > > >  		ret = -EEXIST;
> > > > > > > 
> > > > > > > 
> > > > > > > There's also some weird smp_mb__after_atomic_inc() with no
> > > > > > > atomic in sight just above ... Nicholas what was the point there?
> > > > > > > 
> > > > > > > 
> > > > > > > > @@ -849,6 +866,8 @@ static int vhost_scsi_clear_endpoint(
> > > > > > > >  {
> > > > > > > >  	struct tcm_vhost_tport *tv_tport;
> > > > > > > >  	struct tcm_vhost_tpg *tv_tpg;
> > > > > > > > +	struct vhost_virtqueue *vq;
> > > > > > > > +	bool match = false;
> > > > > > > >  	int index, ret, i;
> > > > > > > >  	u8 target;
> > > > > > > >  
> > > > > > > > @@ -884,9 +903,18 @@ static int vhost_scsi_clear_endpoint(
> > > > > > > >  		}
> > > > > > > >  		tv_tpg->tv_tpg_vhost_count--;
> > > > > > > >  		vs->vs_tpg[target] = NULL;
> > > > > > > > -		vs->vs_endpoint = false;
> > > > > > > > +		match = true;
> > > > > > > >  		mutex_unlock(&tv_tpg->tv_tpg_mutex);
> > > > > > > >  	}
> > > > > > > > +	if (match) {
> > > > > > > > +		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > > > > > +			vq = &vs->vqs[i];
> > > > > > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > > > > > +			mutex_lock(&vq->mutex);
> > > > > > > > +			rcu_assign_pointer(vq->private_data, NULL);
> > > > > > > > +			mutex_unlock(&vq->mutex);
> > > > > > > > +		}
> > > > > > > > +	}
> > > > > > > 
> > > > > > > I'm trying to understand what's going on here.
> > > > > > > Does vhost_scsi only have a single target?
> > > > > > > Because the moment you clear one target you
> > > > > > > also set private_data to NULL ...
> > > > > > 
> > > > > > vhost_scsi supports multi target. Currently, We can not disable specific target
> > > > > > under the wwpn. When we clear or set the endpoint, we disable or enable all the
> > > > > > targets under the wwpn.
> > > 
> > > 
> > > 
> > > 
> > > > > > > 
> > > > > > > >  	mutex_unlock(&vs->dev.mutex);
> > > > > > > >  	return 0;
> > > > > > > >  
> > > > > > > > -- 
> > > > > > > > 1.8.1.4
> > > > > > 
> > > > > > -- 
> > > > > > Asias
> > > > 
> > > > -- 
> > > > Asias
> > 
> > -- 
> > Asias

-- 
Asias

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
  2013-03-31  8:23               ` Michael S. Tsirkin
  2013-04-01  2:20                 ` Asias He
@ 2013-04-01 22:57                 ` Rusty Russell
  2013-04-02 13:10                   ` Michael S. Tsirkin
  2013-04-12 11:37                   ` Michael S. Tsirkin
  2013-04-01 22:57                 ` Rusty Russell
  2 siblings, 2 replies; 33+ messages in thread
From: Rusty Russell @ 2013-04-01 22:57 UTC (permalink / raw)
  To: Michael S. Tsirkin, Asias He
  Cc: Nicholas Bellinger, Paolo Bonzini, Stefan Hajnoczi, kvm,
	virtualization, target-devel

"Michael S. Tsirkin" <mst@redhat.com> writes:
> Rusty's currently doing some reorgs of -net let's delay
> cleanups there to avoid stepping on each other's toys.
> Let's focus on scsi here.
> E.g. any chance framing assumptions can be fixed in 3.10?

I am waiting for your removal of the dma-complete ordering stuff in
vhost-net.

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
  2013-04-01  2:13             ` Asias He
@ 2013-04-02 12:15               ` Michael S. Tsirkin
  2013-04-02 15:10                 ` Asias He
  0 siblings, 1 reply; 33+ messages in thread
From: Michael S. Tsirkin @ 2013-04-02 12:15 UTC (permalink / raw)
  To: Asias He
  Cc: kvm, virtualization, target-devel, Stefan Hajnoczi, Paolo Bonzini

On Mon, Apr 01, 2013 at 10:13:47AM +0800, Asias He wrote:
> On Sun, Mar 31, 2013 at 11:20:24AM +0300, Michael S. Tsirkin wrote:
> > On Fri, Mar 29, 2013 at 02:22:52PM +0800, Asias He wrote:
> > > On Thu, Mar 28, 2013 at 11:18:22AM +0200, Michael S. Tsirkin wrote:
> > > > On Thu, Mar 28, 2013 at 04:10:02PM +0800, Asias He wrote:
> > > > > On Thu, Mar 28, 2013 at 08:16:59AM +0200, Michael S. Tsirkin wrote:
> > > > > > On Thu, Mar 28, 2013 at 10:17:28AM +0800, Asias He wrote:
> > > > > > > Currently, vs->vs_endpoint is used indicate if the endpoint is setup or
> > > > > > > not. It is set or cleared in vhost_scsi_set_endpoint() or
> > > > > > > vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> > > > > > > we check it in vhost_scsi_handle_vq(), we ignored the lock.
> > > > > > > 
> > > > > > > Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
> > > > > > > indicate the status of the endpoint, we use per virtqueue
> > > > > > > vq->private_data to indicate it. In this way, we can only take the
> > > > > > > vq->mutex lock which is per queue and make the concurrent multiqueue
> > > > > > > process having less lock contention. Further, in the read side of
> > > > > > > vq->private_data, we can even do not take only lock if it is accessed in
> > > > > > > the vhost worker thread, because it is protected by "vhost rcu".
> > > > > > > 
> > > > > > > Signed-off-by: Asias He <asias@redhat.com>
> > > > > > > ---
> > > > > > >  drivers/vhost/tcm_vhost.c | 38 +++++++++++++++++++++++++++++++++-----
> > > > > > >  1 file changed, 33 insertions(+), 5 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> > > > > > > index 5e3d4487..0524267 100644
> > > > > > > --- a/drivers/vhost/tcm_vhost.c
> > > > > > > +++ b/drivers/vhost/tcm_vhost.c
> > > > > > > @@ -67,7 +67,6 @@ struct vhost_scsi {
> > > > > > >  	/* Protected by vhost_scsi->dev.mutex */
> > > > > > >  	struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> > > > > > >  	char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> > > > > > > -	bool vs_endpoint;
> > > > > > >  
> > > > > > >  	struct vhost_dev dev;
> > > > > > >  	struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> > > > > > > @@ -91,6 +90,24 @@ static int iov_num_pages(struct iovec *iov)
> > > > > > >  	       ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
> > > > > > >  }
> > > > > > >  
> > > > > > > +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> > > > > > > +{
> > > > > > > +	bool ret = false;
> > > > > > > +
> > > > > > > +	/*
> > > > > > > +	 * We can handle the vq only after the endpoint is setup by calling the
> > > > > > > +	 * VHOST_SCSI_SET_ENDPOINT ioctl.
> > > > > > > +	 *
> > > > > > > +	 * TODO: Check that we are running from vhost_worker which acts
> > > > > > > +	 * as read-side critical section for vhost kind of RCU.
> > > > > > > +	 * See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
> > > > > > > +	 */
> > > > > > > +	if (rcu_dereference_check(vq->private_data, 1))
> > > > > > > +		ret = true;
> > > > > > > +
> > > > > > > +	return ret;
> > > > > > > +}
> > > > > > > +
> > > > > > >  static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
> > > > > > >  {
> > > > > > >  	return 1;
> > > > > > > @@ -581,8 +598,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
> > > > > > >  	int head, ret;
> > > > > > >  	u8 target;
> > > > > > >  
> > > > > > > -	/* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> > > > > > > -	if (unlikely(!vs->vs_endpoint))
> > > > > > > +	if (!tcm_vhost_check_endpoint(vq))
> > > > > > >  		return;
> > > > > > >
> > > > > > 
> > > > > > I would just move the check to under vq mutex,
> > > > > > and avoid rcu completely. In vhost-net we are using
> > > > > > private data outside lock so we can't do this,
> > > > > > no such issue here.
> > > > > 
> > > > > Are you talking about:
> > > > > 
> > > > >    handle_tx:
> > > > >            /* TODO: check that we are running from vhost_worker? */
> > > > >            sock = rcu_dereference_check(vq->private_data, 1);
> > > > >            if (!sock)
> > > > >                    return;
> > > > >    
> > > > >            wmem = atomic_read(&sock->sk->sk_wmem_alloc);
> > > > >            if (wmem >= sock->sk->sk_sndbuf) {
> > > > >                    mutex_lock(&vq->mutex);
> > > > >                    tx_poll_start(net, sock);
> > > > >                    mutex_unlock(&vq->mutex);
> > > > >                    return;
> > > > >            }
> > > > >            mutex_lock(&vq->mutex);
> > > > > 
> > > > > Why not do the atomic_read and tx_poll_start under the vq->mutex, and thus do
> > > > > the check under the lock as well.
> > > > >    
> > > > >    handle_rx:
> > > > >            mutex_lock(&vq->mutex);
> > > > >    
> > > > >            /* TODO: check that we are running from vhost_worker? */
> > > > >            struct socket *sock = rcu_dereference_check(vq->private_data, 1);
> > > > >    
> > > > >            if (!sock)
> > > > >                    return;
> > > > >    
> > > > >            mutex_lock(&vq->mutex);
> > > > > 
> > > > > Can't we can do the check under the vq->mutex here?
> > > > > 
> > > > > The rcu is still there but it makes the code easier to read. IMO, If we want to
> > > > > use rcu, use it explicitly and avoid the vhost rcu completely. 
> > > > > 
> > > > > > >  	mutex_lock(&vq->mutex);
> > > > > > > @@ -829,11 +845,12 @@ static int vhost_scsi_set_endpoint(
> > > > > > >  		       sizeof(vs->vs_vhost_wwpn));
> > > > > > >  		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > > > >  			vq = &vs->vqs[i];
> > > > > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > > > >  			mutex_lock(&vq->mutex);
> > > > > > > +			rcu_assign_pointer(vq->private_data, vs);
> > > > > > >  			vhost_init_used(vq);
> > > > > > >  			mutex_unlock(&vq->mutex);
> > > > > > >  		}
> > > > > > > -		vs->vs_endpoint = true;
> > > > > > >  		ret = 0;
> > > > > > >  	} else {
> > > > > > >  		ret = -EEXIST;
> > > > > > 
> > > > > > 
> > > > > > There's also some weird smp_mb__after_atomic_inc() with no
> > > > > > atomic in sight just above ... Nicholas what was the point there?
> > > > > > 
> > > > > > 
> > > > > > > @@ -849,6 +866,8 @@ static int vhost_scsi_clear_endpoint(
> > > > > > >  {
> > > > > > >  	struct tcm_vhost_tport *tv_tport;
> > > > > > >  	struct tcm_vhost_tpg *tv_tpg;
> > > > > > > +	struct vhost_virtqueue *vq;
> > > > > > > +	bool match = false;
> > > > > > >  	int index, ret, i;
> > > > > > >  	u8 target;
> > > > > > >  
> > > > > > > @@ -884,9 +903,18 @@ static int vhost_scsi_clear_endpoint(
> > > > > > >  		}
> > > > > > >  		tv_tpg->tv_tpg_vhost_count--;
> > > > > > >  		vs->vs_tpg[target] = NULL;
> > > > > > > -		vs->vs_endpoint = false;
> > > > > > > +		match = true;
> > > > > > >  		mutex_unlock(&tv_tpg->tv_tpg_mutex);
> > > > > > >  	}
> > > > > > > +	if (match) {
> > > > > > > +		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > > > > +			vq = &vs->vqs[i];
> > > > > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > > > > +			mutex_lock(&vq->mutex);
> > > > > > > +			rcu_assign_pointer(vq->private_data, NULL);
> > > > > > > +			mutex_unlock(&vq->mutex);
> > > > > > > +		}
> > > > > > > +	}
> > > > > > 
> > > > > > I'm trying to understand what's going on here.
> > > > > > Does vhost_scsi only have a single target?
> > > > > > Because the moment you clear one target you
> > > > > > also set private_data to NULL ...
> > > > > 
> > > > > vhost_scsi supports multi target. Currently, We can not disable specific target
> > > > > under the wwpn. When we clear or set the endpoint, we disable or enable all the
> > > > > targets under the wwpn.
> > > > 
> > > > okay, but changing vs->vs_tpg[target] under dev mutex, then using
> > > > it under vq mutex looks wrong.
> > > 
> > > I do not see a problem here.
> > > 
> > > Access of vs->vs_tpg[target] in vhost_scsi_handle_vq() happens only when
> > > the SET_ENDPOINT is done.
> > 
> > But nothing prevents multiple SET_ENDPOINT calls while
> > the previous one is in progress.
> 
> vhost_scsi_set_endpoint() and vhost_scsi_clear_endpoint() are protected
> by vs->dev.mutex, no?
> 
> And in vhost_scsi_set_endpoint():
> 
> 	if (tv_tpg->tv_tpg_vhost_count != 0) {
> 		mutex_unlock(&tv_tpg->tv_tpg_mutex);
> 		continue;
> 	}
> 
> This prevents calling of vhost_scsi_set_endpoint before we call
> vhost_scsi_clear_endpoint to decrease tv_tpg->tv_tpg_vhost_count.

All this seems to do is prevent reusing the same target
in multiple vhosts.

> > > At that time, the vs->vs_tpg[] is already
> > > ready. Even if the vs->vs_tpg[target] is changed to NULL in
> > > CLEAR_ENDPOINT, it is safe since we fail the request if
> > > vs->vs_tpg[target] is NULL.
> > 
> > We check it without a common lock so it can become NULL
> > after we test it.
> 
> 
> vhost_scsi_handle_vq:
> 
>      tv_tpg = vs->vs_tpg[target];	
>      if (!tv_tpg)
>          we fail the cmd
>      ...
> 
>      INIT_WORK(&tv_cmd->work, tcm_vhost_submission_work);
>      queue_work(tcm_vhost_workqueue, &tv_cmd->work);
> 
> So, after we test tv_tpg, event if vs->vs_tpg[target] become NULL, it
> does not matter if the tpg is not deleted by calling tcm_vhost_drop_tpg().
> tcm_vhost_drop_tpg() will not succeed if we do not call vhost_scsi_clear_endpoint() 
> Becasue, tcm_vhost_drop_tpg -> tcm_vhost_drop_nexus ->  check if (tpg->tv_tpg_vhost_count != 0) 

My point is this:
		tv_tpg = vs->vs_tpg[target];
		if (!tv_tpg) {
			....
			return
		}

                tv_cmd = vhost_scsi_allocate_cmd(tv_tpg, &v_req,

The line above can legally reread vs->vs_tpg[target] from the array.
You need ACCESS_ONCE() if you don't want that.
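
A minimal sketch of that, using the names from the snippet above (the
surrounding error handling is elided; later kernels would use READ_ONCE()
instead of ACCESS_ONCE()):

    struct tcm_vhost_tpg *tv_tpg;

    /* One volatile load: later uses of tv_tpg cannot be satisfied by the
     * compiler re-reading vs->vs_tpg[target], which may have been set to
     * NULL by vhost_scsi_clear_endpoint() in the meantime. */
    tv_tpg = ACCESS_ONCE(vs->vs_tpg[target]);
    if (!tv_tpg) {
            /* fail the command as before */
            return;
    }
    /* only the local tv_tpg is used from here on */
    tv_cmd = vhost_scsi_allocate_cmd(tv_tpg, &v_req, ...);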

> Further, the tcm core should fail the cmd if the tpg is gonna when we submit the cmd in
> tcm_vhost_submission_work. (nab, is this true?)
> 
> > > > Since we want to use private_data anyway, how about
> > > > making private_data point at struct tcm_vhost_tpg * ?
> > > > 
> > > > Allocate it dynamically in SET_ENDPOINT (and free old value if any).
> > > 
> > > The struct tcm_vhost_tpg is per target. I assume you want to point
> > > private_data to the 'struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET]'
> > 
> > No, I want to put it at the array of targets.
> 
> tcm_vhost_tpg is allocated in tcm_vhost_make_tpg. There is no array of
> the targets. The targets exist when user create them in host side using
> targetcli tools or /sys/kernel/config interface.

I really simply mean this field:
	        struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];

allocate it dynamically when the endpoint is set, and
point the private data of each vq at it.
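
A rough sketch of that suggestion (hypothetical code, not a posted patch;
allocation failure handling and the existing tpg lookup loop are elided):

    /* vhost_scsi_set_endpoint(), under vs->dev.mutex */
    struct tcm_vhost_tpg **vs_tpg;

    vs_tpg = kzalloc(sizeof(*vs_tpg) * VHOST_SCSI_MAX_TARGET, GFP_KERNEL);
    /* ... fill vs_tpg[] from the matching tpgs as before ... */

    for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
            vq = &vs->vqs[i];
            mutex_lock(&vq->mutex);
            /* each vq now holds a private snapshot of the target array */
            rcu_assign_pointer(vq->private_data, vs_tpg);
            vhost_init_used(vq);
            mutex_unlock(&vq->mutex);
    }

    /* vhost_scsi_handle_vq() then checks the per-vq pointer instead of
     * vs->vs_endpoint / vs->vs_tpg[]: */
    vs_tpg = rcu_dereference_check(vq->private_data, 1);
    if (!vs_tpg)
            return;
    tv_tpg = ACCESS_ONCE(vs_tpg[target]);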

> > > > 
> > > > > > 
> > > > > > >  	mutex_unlock(&vs->dev.mutex);
> > > > > > >  	return 0;
> > > > > > >  
> > > > > > > -- 
> > > > > > > 1.8.1.4
> > > > > 
> > > > > -- 
> > > > > Asias
> > > 
> > > -- 
> > > Asias
> 
> -- 
> Asias

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
  2013-04-01 22:57                 ` Rusty Russell
@ 2013-04-02 13:10                   ` Michael S. Tsirkin
  2013-04-12 11:37                   ` Michael S. Tsirkin
  1 sibling, 0 replies; 33+ messages in thread
From: Michael S. Tsirkin @ 2013-04-02 13:10 UTC (permalink / raw)
  To: Rusty Russell
  Cc: kvm, virtualization, target-devel, Stefan Hajnoczi, Paolo Bonzini

On Tue, Apr 02, 2013 at 09:27:57AM +1030, Rusty Russell wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
> > Rusty's currently doing some reorgs of -net let's delay
> > cleanups there to avoid stepping on each other's toys.
> > Let's focus on scsi here.
> > E.g. any chance framing assumptions can be fixed in 3.10?
> 
> I am waiting for your removal of the dma-compelete ordering stuff in
> vhost-net.
> 
> Cheers,
> Rusty.

Sure.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
  2013-04-02 12:15               ` Michael S. Tsirkin
@ 2013-04-02 15:10                 ` Asias He
  2013-04-02 15:18                   ` Michael S. Tsirkin
  0 siblings, 1 reply; 33+ messages in thread
From: Asias He @ 2013-04-02 15:10 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, virtualization, target-devel, Stefan Hajnoczi, Paolo Bonzini

On Tue, Apr 02, 2013 at 03:15:31PM +0300, Michael S. Tsirkin wrote:
> On Mon, Apr 01, 2013 at 10:13:47AM +0800, Asias He wrote:
> > On Sun, Mar 31, 2013 at 11:20:24AM +0300, Michael S. Tsirkin wrote:
> > > On Fri, Mar 29, 2013 at 02:22:52PM +0800, Asias He wrote:
> > > > On Thu, Mar 28, 2013 at 11:18:22AM +0200, Michael S. Tsirkin wrote:
> > > > > On Thu, Mar 28, 2013 at 04:10:02PM +0800, Asias He wrote:
> > > > > > On Thu, Mar 28, 2013 at 08:16:59AM +0200, Michael S. Tsirkin wrote:
> > > > > > > On Thu, Mar 28, 2013 at 10:17:28AM +0800, Asias He wrote:
> > > > > > > > Currently, vs->vs_endpoint is used indicate if the endpoint is setup or
> > > > > > > > not. It is set or cleared in vhost_scsi_set_endpoint() or
> > > > > > > > vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> > > > > > > > we check it in vhost_scsi_handle_vq(), we ignored the lock.
> > > > > > > > 
> > > > > > > > Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
> > > > > > > > indicate the status of the endpoint, we use per virtqueue
> > > > > > > > vq->private_data to indicate it. In this way, we can only take the
> > > > > > > > vq->mutex lock which is per queue and make the concurrent multiqueue
> > > > > > > > process having less lock contention. Further, in the read side of
> > > > > > > > vq->private_data, we can even do not take only lock if it is accessed in
> > > > > > > > the vhost worker thread, because it is protected by "vhost rcu".
> > > > > > > > 
> > > > > > > > Signed-off-by: Asias He <asias@redhat.com>
> > > > > > > > ---
> > > > > > > >  drivers/vhost/tcm_vhost.c | 38 +++++++++++++++++++++++++++++++++-----
> > > > > > > >  1 file changed, 33 insertions(+), 5 deletions(-)
> > > > > > > > 
> > > > > > > > diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> > > > > > > > index 5e3d4487..0524267 100644
> > > > > > > > --- a/drivers/vhost/tcm_vhost.c
> > > > > > > > +++ b/drivers/vhost/tcm_vhost.c
> > > > > > > > @@ -67,7 +67,6 @@ struct vhost_scsi {
> > > > > > > >  	/* Protected by vhost_scsi->dev.mutex */
> > > > > > > >  	struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> > > > > > > >  	char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> > > > > > > > -	bool vs_endpoint;
> > > > > > > >  
> > > > > > > >  	struct vhost_dev dev;
> > > > > > > >  	struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> > > > > > > > @@ -91,6 +90,24 @@ static int iov_num_pages(struct iovec *iov)
> > > > > > > >  	       ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
> > > > > > > >  }
> > > > > > > >  
> > > > > > > > +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> > > > > > > > +{
> > > > > > > > +	bool ret = false;
> > > > > > > > +
> > > > > > > > +	/*
> > > > > > > > +	 * We can handle the vq only after the endpoint is setup by calling the
> > > > > > > > +	 * VHOST_SCSI_SET_ENDPOINT ioctl.
> > > > > > > > +	 *
> > > > > > > > +	 * TODO: Check that we are running from vhost_worker which acts
> > > > > > > > +	 * as read-side critical section for vhost kind of RCU.
> > > > > > > > +	 * See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
> > > > > > > > +	 */
> > > > > > > > +	if (rcu_dereference_check(vq->private_data, 1))
> > > > > > > > +		ret = true;
> > > > > > > > +
> > > > > > > > +	return ret;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > >  static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
> > > > > > > >  {
> > > > > > > >  	return 1;
> > > > > > > > @@ -581,8 +598,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
> > > > > > > >  	int head, ret;
> > > > > > > >  	u8 target;
> > > > > > > >  
> > > > > > > > -	/* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> > > > > > > > -	if (unlikely(!vs->vs_endpoint))
> > > > > > > > +	if (!tcm_vhost_check_endpoint(vq))
> > > > > > > >  		return;
> > > > > > > >
> > > > > > > 
> > > > > > > I would just move the check to under vq mutex,
> > > > > > > and avoid rcu completely. In vhost-net we are using
> > > > > > > private data outside lock so we can't do this,
> > > > > > > no such issue here.
> > > > > > 
> > > > > > Are you talking about:
> > > > > > 
> > > > > >    handle_tx:
> > > > > >            /* TODO: check that we are running from vhost_worker? */
> > > > > >            sock = rcu_dereference_check(vq->private_data, 1);
> > > > > >            if (!sock)
> > > > > >                    return;
> > > > > >    
> > > > > >            wmem = atomic_read(&sock->sk->sk_wmem_alloc);
> > > > > >            if (wmem >= sock->sk->sk_sndbuf) {
> > > > > >                    mutex_lock(&vq->mutex);
> > > > > >                    tx_poll_start(net, sock);
> > > > > >                    mutex_unlock(&vq->mutex);
> > > > > >                    return;
> > > > > >            }
> > > > > >            mutex_lock(&vq->mutex);
> > > > > > 
> > > > > > Why not do the atomic_read and tx_poll_start under the vq->mutex, and thus do
> > > > > > the check under the lock as well.
> > > > > >    
> > > > > >    handle_rx:
> > > > > >            mutex_lock(&vq->mutex);
> > > > > >    
> > > > > >            /* TODO: check that we are running from vhost_worker? */
> > > > > >            struct socket *sock = rcu_dereference_check(vq->private_data, 1);
> > > > > >    
> > > > > >            if (!sock)
> > > > > >                    return;
> > > > > >    
> > > > > >            mutex_lock(&vq->mutex);
> > > > > > 
> > > > > > Can't we do the check under the vq->mutex here?
> > > > > > 
> > > > > > The rcu is still there but it makes the code easier to read. IMO, if we want to
> > > > > > use rcu, use it explicitly and avoid the vhost rcu completely. 
> > > > > > 
> > > > > > > >  	mutex_lock(&vq->mutex);
> > > > > > > > @@ -829,11 +845,12 @@ static int vhost_scsi_set_endpoint(
> > > > > > > >  		       sizeof(vs->vs_vhost_wwpn));
> > > > > > > >  		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > > > > >  			vq = &vs->vqs[i];
> > > > > > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > > > > >  			mutex_lock(&vq->mutex);
> > > > > > > > +			rcu_assign_pointer(vq->private_data, vs);
> > > > > > > >  			vhost_init_used(vq);
> > > > > > > >  			mutex_unlock(&vq->mutex);
> > > > > > > >  		}
> > > > > > > > -		vs->vs_endpoint = true;
> > > > > > > >  		ret = 0;
> > > > > > > >  	} else {
> > > > > > > >  		ret = -EEXIST;
> > > > > > > 
> > > > > > > 
> > > > > > > There's also some weird smp_mb__after_atomic_inc() with no
> > > > > > > atomic in sight just above ... Nicholas what was the point there?
> > > > > > > 
> > > > > > > 
> > > > > > > > @@ -849,6 +866,8 @@ static int vhost_scsi_clear_endpoint(
> > > > > > > >  {
> > > > > > > >  	struct tcm_vhost_tport *tv_tport;
> > > > > > > >  	struct tcm_vhost_tpg *tv_tpg;
> > > > > > > > +	struct vhost_virtqueue *vq;
> > > > > > > > +	bool match = false;
> > > > > > > >  	int index, ret, i;
> > > > > > > >  	u8 target;
> > > > > > > >  
> > > > > > > > @@ -884,9 +903,18 @@ static int vhost_scsi_clear_endpoint(
> > > > > > > >  		}
> > > > > > > >  		tv_tpg->tv_tpg_vhost_count--;
> > > > > > > >  		vs->vs_tpg[target] = NULL;
> > > > > > > > -		vs->vs_endpoint = false;
> > > > > > > > +		match = true;
> > > > > > > >  		mutex_unlock(&tv_tpg->tv_tpg_mutex);
> > > > > > > >  	}
> > > > > > > > +	if (match) {
> > > > > > > > +		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > > > > > +			vq = &vs->vqs[i];
> > > > > > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > > > > > +			mutex_lock(&vq->mutex);
> > > > > > > > +			rcu_assign_pointer(vq->private_data, NULL);
> > > > > > > > +			mutex_unlock(&vq->mutex);
> > > > > > > > +		}
> > > > > > > > +	}
> > > > > > > 
> > > > > > > I'm trying to understand what's going on here.
> > > > > > > Does vhost_scsi only have a single target?
> > > > > > > Because the moment you clear one target you
> > > > > > > also set private_data to NULL ...
> > > > > > 
> > > > > > vhost_scsi supports multiple targets. Currently, we can not disable a specific target
> > > > > > under the wwpn. When we clear or set the endpoint, we disable or enable all the
> > > > > > targets under the wwpn.
> > > > > 
> > > > > okay, but changing vs->vs_tpg[target] under dev mutex, then using
> > > > > it under vq mutex looks wrong.
> > > > 
> > > > I do not see a problem here.
> > > > 
> > > > Access of vs->vs_tpg[target] in vhost_scsi_handle_vq() happens only when
> > > > the SET_ENDPOINT is done.
> > > 
> > > But nothing prevents multiple SET_ENDPOINT calls while
> > > the previous one is in progress.
> > 
> > vhost_scsi_set_endpoint() and vhost_scsi_clear_endpoint() are protected
> > by vs->dev.mutex, no?
> > 
> > And in vhost_scsi_set_endpoint():
> > 
> > 	if (tv_tpg->tv_tpg_vhost_count != 0) {
> > 		mutex_unlock(&tv_tpg->tv_tpg_mutex);
> > 		continue;
> > 	}
> > 
> > This prevents calling of vhost_scsi_set_endpoint before we call
> > vhost_scsi_clear_endpoint to decrease tv_tpg->tv_tpg_vhost_count.
> 
> All this seems to do is prevent reusing the same target
> in multiple vhosts.
> 
> > > > At that time, the vs->vs_tpg[] is already
> > > > ready. Even if the vs->vs_tpg[target] is changed to NULL in
> > > > CLEAR_ENDPOINT, it is safe since we fail the request if
> > > > vs->vs_tpg[target] is NULL.
> > > 
> > > We check it without a common lock so it can become NULL
> > > after we test it.
> > 
> > 
> > vhost_scsi_handle_vq:
> > 
> >      tv_tpg = vs->vs_tpg[target];	
> >      if (!tv_tpg)
> >          we fail the cmd
> >      ...
> > 
> >      INIT_WORK(&tv_cmd->work, tcm_vhost_submission_work);
> >      queue_work(tcm_vhost_workqueue, &tv_cmd->work);
> > 
> > So, after we test tv_tpg, even if vs->vs_tpg[target] becomes NULL, it
> > does not matter as long as the tpg is not deleted by calling tcm_vhost_drop_tpg().
> > tcm_vhost_drop_tpg() will not succeed if we do not call vhost_scsi_clear_endpoint(),
> > because tcm_vhost_drop_tpg -> tcm_vhost_drop_nexus -> check if (tpg->tv_tpg_vhost_count != 0)
> 
> My point is this:
> 		tv_tpg = vs->vs_tpg[target];
> 		if (!tv_tpg) {
> 			....
> 			return
> 		}
> 
>                 tv_cmd = vhost_scsi_allocate_cmd(tv_tpg, &v_req,
> 
> above line can legally reread vs->vs_tpg[target] from array.
> You need ACCESS_ONCE if you don't want that.

Well, this is another problem we have. Will include it in next version. 
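
For concreteness, a minimal sketch of the ACCESS_ONCE-style fix being
discussed here (illustrative only, reusing the names from the snippet
quoted above; the actual follow-up patch may differ):

	/* Read the tpg pointer exactly once, so the compiler cannot
	 * legally reload vs->vs_tpg[target] after the NULL check.
	 */
	tv_tpg = ACCESS_ONCE(vs->vs_tpg[target]);
	if (!tv_tpg) {
		/* fail the cmd as before */
		return;
	}

	/* remaining arguments elided, as in the snippet above */
	tv_cmd = vhost_scsi_allocate_cmd(tv_tpg, &v_req, ...);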

> 
> > Further, the tcm core should fail the cmd if the tpg is gone when we submit the cmd in
> > tcm_vhost_submission_work. (nab, is this true?)
> > 
> > > > > Since we want to use private_data anyway, how about
> > > > > making private_data point at struct tcm_vhost_tpg * ?
> > > > > 
> > > > > Allocate it dynamically in SET_ENDPOINT (and free old value if any).
> > > > 
> > > > The struct tcm_vhost_tpg is per target. I assume you want to point
> > > > private_data to the 'struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET]'
> > > 
> > > No, I want to put it at the array of targets.
> > 
> > tcm_vhost_tpg is allocated in tcm_vhost_make_tpg. There is no array of
> > the targets. The targets exist when the user creates them on the host side using
> > targetcli tools or the /sys/kernel/config interface.
> 
> I really simply mean this field:
> 	        struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> 
> allocate it dynamically when endpoint is set, and
> set private data for each vq.

What's the benefit of allocating it dynamically? Why bother if the
current, simpler static one works ok?

So do you have further concerns other than the ACCESS_ONCE one?

> > > > > 
> > > > > > > 
> > > > > > > >  	mutex_unlock(&vs->dev.mutex);
> > > > > > > >  	return 0;
> > > > > > > >  
> > > > > > > > -- 
> > > > > > > > 1.8.1.4
> > > > > > 
> > > > > > -- 
> > > > > > Asias
> > > > 
> > > > -- 
> > > > Asias
> > 
> > -- 
> > Asias

-- 
Asias

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
  2013-04-02 15:10                 ` Asias He
@ 2013-04-02 15:18                   ` Michael S. Tsirkin
  2013-04-03  6:08                     ` Asias He
  0 siblings, 1 reply; 33+ messages in thread
From: Michael S. Tsirkin @ 2013-04-02 15:18 UTC (permalink / raw)
  To: Asias He
  Cc: kvm, virtualization, target-devel, Stefan Hajnoczi, Paolo Bonzini

On Tue, Apr 02, 2013 at 11:10:02PM +0800, Asias He wrote:
> On Tue, Apr 02, 2013 at 03:15:31PM +0300, Michael S. Tsirkin wrote:
> > On Mon, Apr 01, 2013 at 10:13:47AM +0800, Asias He wrote:
> > > On Sun, Mar 31, 2013 at 11:20:24AM +0300, Michael S. Tsirkin wrote:
> > > > On Fri, Mar 29, 2013 at 02:22:52PM +0800, Asias He wrote:
> > > > > On Thu, Mar 28, 2013 at 11:18:22AM +0200, Michael S. Tsirkin wrote:
> > > > > > On Thu, Mar 28, 2013 at 04:10:02PM +0800, Asias He wrote:
> > > > > > > On Thu, Mar 28, 2013 at 08:16:59AM +0200, Michael S. Tsirkin wrote:
> > > > > > > > On Thu, Mar 28, 2013 at 10:17:28AM +0800, Asias He wrote:
> > > > > > > > > Currently, vs->vs_endpoint is used to indicate if the endpoint is setup or
> > > > > > > > > not. It is set or cleared in vhost_scsi_set_endpoint() or
> > > > > > > > > vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> > > > > > > > > we check it in vhost_scsi_handle_vq(), we ignored the lock.
> > > > > > > > > 
> > > > > > > > > Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
> > > > > > > > > indicate the status of the endpoint, we use per virtqueue
> > > > > > > > > vq->private_data to indicate it. In this way, we only need to take the
> > > > > > > > > vq->mutex lock, which is per queue, so concurrent multiqueue
> > > > > > > > > processing sees less lock contention. Further, on the read side of
> > > > > > > > > vq->private_data, we do not even need to take the lock if it is accessed in
> > > > > > > > > the vhost worker thread, because it is protected by "vhost rcu".
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Asias He <asias@redhat.com>
> > > > > > > > > ---
> > > > > > > > >  drivers/vhost/tcm_vhost.c | 38 +++++++++++++++++++++++++++++++++-----
> > > > > > > > >  1 file changed, 33 insertions(+), 5 deletions(-)
> > > > > > > > > 
> > > > > > > > > diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> > > > > > > > > index 5e3d4487..0524267 100644
> > > > > > > > > --- a/drivers/vhost/tcm_vhost.c
> > > > > > > > > +++ b/drivers/vhost/tcm_vhost.c
> > > > > > > > > @@ -67,7 +67,6 @@ struct vhost_scsi {
> > > > > > > > >  	/* Protected by vhost_scsi->dev.mutex */
> > > > > > > > >  	struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> > > > > > > > >  	char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> > > > > > > > > -	bool vs_endpoint;
> > > > > > > > >  
> > > > > > > > >  	struct vhost_dev dev;
> > > > > > > > >  	struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> > > > > > > > > @@ -91,6 +90,24 @@ static int iov_num_pages(struct iovec *iov)
> > > > > > > > >  	       ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
> > > > > > > > >  }
> > > > > > > > >  
> > > > > > > > > +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> > > > > > > > > +{
> > > > > > > > > +	bool ret = false;
> > > > > > > > > +
> > > > > > > > > +	/*
> > > > > > > > > +	 * We can handle the vq only after the endpoint is setup by calling the
> > > > > > > > > +	 * VHOST_SCSI_SET_ENDPOINT ioctl.
> > > > > > > > > +	 *
> > > > > > > > > +	 * TODO: Check that we are running from vhost_worker which acts
> > > > > > > > > +	 * as read-side critical section for vhost kind of RCU.
> > > > > > > > > +	 * See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
> > > > > > > > > +	 */
> > > > > > > > > +	if (rcu_dereference_check(vq->private_data, 1))
> > > > > > > > > +		ret = true;
> > > > > > > > > +
> > > > > > > > > +	return ret;
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > >  static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
> > > > > > > > >  {
> > > > > > > > >  	return 1;
> > > > > > > > > @@ -581,8 +598,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
> > > > > > > > >  	int head, ret;
> > > > > > > > >  	u8 target;
> > > > > > > > >  
> > > > > > > > > -	/* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> > > > > > > > > -	if (unlikely(!vs->vs_endpoint))
> > > > > > > > > +	if (!tcm_vhost_check_endpoint(vq))
> > > > > > > > >  		return;
> > > > > > > > >
> > > > > > > > 
> > > > > > > > I would just move the check to under vq mutex,
> > > > > > > > and avoid rcu completely. In vhost-net we are using
> > > > > > > > private data outside lock so we can't do this,
> > > > > > > > no such issue here.
> > > > > > > 
> > > > > > > Are you talking about:
> > > > > > > 
> > > > > > >    handle_tx:
> > > > > > >            /* TODO: check that we are running from vhost_worker? */
> > > > > > >            sock = rcu_dereference_check(vq->private_data, 1);
> > > > > > >            if (!sock)
> > > > > > >                    return;
> > > > > > >    
> > > > > > >            wmem = atomic_read(&sock->sk->sk_wmem_alloc);
> > > > > > >            if (wmem >= sock->sk->sk_sndbuf) {
> > > > > > >                    mutex_lock(&vq->mutex);
> > > > > > >                    tx_poll_start(net, sock);
> > > > > > >                    mutex_unlock(&vq->mutex);
> > > > > > >                    return;
> > > > > > >            }
> > > > > > >            mutex_lock(&vq->mutex);
> > > > > > > 
> > > > > > > Why not do the atomic_read and tx_poll_start under the vq->mutex, and thus do
> > > > > > > the check under the lock as well.
> > > > > > >    
> > > > > > >    handle_rx:
> > > > > > >            mutex_lock(&vq->mutex);
> > > > > > >    
> > > > > > >            /* TODO: check that we are running from vhost_worker? */
> > > > > > >            struct socket *sock = rcu_dereference_check(vq->private_data, 1);
> > > > > > >    
> > > > > > >            if (!sock)
> > > > > > >                    return;
> > > > > > >    
> > > > > > >            mutex_lock(&vq->mutex);
> > > > > > > 
> > > > > > > Can't we do the check under the vq->mutex here?
> > > > > > > 
> > > > > > > The rcu is still there but it makes the code easier to read. IMO, if we want to
> > > > > > > use rcu, use it explicitly and avoid the vhost rcu completely. 
> > > > > > > 
> > > > > > > > >  	mutex_lock(&vq->mutex);
> > > > > > > > > @@ -829,11 +845,12 @@ static int vhost_scsi_set_endpoint(
> > > > > > > > >  		       sizeof(vs->vs_vhost_wwpn));
> > > > > > > > >  		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > > > > > >  			vq = &vs->vqs[i];
> > > > > > > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > > > > > >  			mutex_lock(&vq->mutex);
> > > > > > > > > +			rcu_assign_pointer(vq->private_data, vs);
> > > > > > > > >  			vhost_init_used(vq);
> > > > > > > > >  			mutex_unlock(&vq->mutex);
> > > > > > > > >  		}
> > > > > > > > > -		vs->vs_endpoint = true;
> > > > > > > > >  		ret = 0;
> > > > > > > > >  	} else {
> > > > > > > > >  		ret = -EEXIST;
> > > > > > > > 
> > > > > > > > 
> > > > > > > > There's also some weird smp_mb__after_atomic_inc() with no
> > > > > > > > atomic in sight just above ... Nicholas what was the point there?
> > > > > > > > 
> > > > > > > > 
> > > > > > > > > @@ -849,6 +866,8 @@ static int vhost_scsi_clear_endpoint(
> > > > > > > > >  {
> > > > > > > > >  	struct tcm_vhost_tport *tv_tport;
> > > > > > > > >  	struct tcm_vhost_tpg *tv_tpg;
> > > > > > > > > +	struct vhost_virtqueue *vq;
> > > > > > > > > +	bool match = false;
> > > > > > > > >  	int index, ret, i;
> > > > > > > > >  	u8 target;
> > > > > > > > >  
> > > > > > > > > @@ -884,9 +903,18 @@ static int vhost_scsi_clear_endpoint(
> > > > > > > > >  		}
> > > > > > > > >  		tv_tpg->tv_tpg_vhost_count--;
> > > > > > > > >  		vs->vs_tpg[target] = NULL;
> > > > > > > > > -		vs->vs_endpoint = false;
> > > > > > > > > +		match = true;
> > > > > > > > >  		mutex_unlock(&tv_tpg->tv_tpg_mutex);
> > > > > > > > >  	}
> > > > > > > > > +	if (match) {
> > > > > > > > > +		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > > > > > > +			vq = &vs->vqs[i];
> > > > > > > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > > > > > > +			mutex_lock(&vq->mutex);
> > > > > > > > > +			rcu_assign_pointer(vq->private_data, NULL);
> > > > > > > > > +			mutex_unlock(&vq->mutex);
> > > > > > > > > +		}
> > > > > > > > > +	}
> > > > > > > > 
> > > > > > > > I'm trying to understand what's going on here.
> > > > > > > > Does vhost_scsi only have a single target?
> > > > > > > > Because the moment you clear one target you
> > > > > > > > also set private_data to NULL ...
> > > > > > > 
> > > > > > > vhost_scsi supports multiple targets. Currently, we can not disable a specific target
> > > > > > > under the wwpn. When we clear or set the endpoint, we disable or enable all the
> > > > > > > targets under the wwpn.
> > > > > > 
> > > > > > okay, but changing vs->vs_tpg[target] under dev mutex, then using
> > > > > > it under vq mutex looks wrong.
> > > > > 
> > > > > I do not see a problem here.
> > > > > 
> > > > > Access of vs->vs_tpg[target] in vhost_scsi_handle_vq() happens only when
> > > > > the SET_ENDPOINT is done.
> > > > 
> > > > But nothing prevents multiple SET_ENDPOINT calls while
> > > > the previous one is in progress.
> > > 
> > > vhost_scsi_set_endpoint() and vhost_scsi_clear_endpoint() are protected
> > > by vs->dev.mutex, no?
> > > 
> > > And in vhost_scsi_set_endpoint():
> > > 
> > > 	if (tv_tpg->tv_tpg_vhost_count != 0) {
> > > 		mutex_unlock(&tv_tpg->tv_tpg_mutex);
> > > 		continue;
> > > 	}
> > > 
> > > This prevents calling of vhost_scsi_set_endpoint before we call
> > > vhost_scsi_clear_endpoint to decrease tv_tpg->tv_tpg_vhost_count.
> > 
> > All this seems to do is prevent reusing the same target
> > in multiple vhosts.
> > 
> > > > > At that time, the vs->vs_tpg[] is already
> > > > > ready. Even if the vs->vs_tpg[target] is changed to NULL in
> > > > > CLEAR_ENDPOINT, it is safe since we fail the request if
> > > > > vs->vs_tpg[target] is NULL.
> > > > 
> > > > We check it without a common lock so it can become NULL
> > > > after we test it.
> > > 
> > > 
> > > vhost_scsi_handle_vq:
> > > 
> > >      tv_tpg = vs->vs_tpg[target];	
> > >      if (!tv_tpg)
> > >          we fail the cmd
> > >      ...
> > > 
> > >      INIT_WORK(&tv_cmd->work, tcm_vhost_submission_work);
> > >      queue_work(tcm_vhost_workqueue, &tv_cmd->work);
> > > 
> > > So, after we test tv_tpg, even if vs->vs_tpg[target] becomes NULL, it
> > > does not matter as long as the tpg is not deleted by calling tcm_vhost_drop_tpg().
> > > tcm_vhost_drop_tpg() will not succeed if we do not call vhost_scsi_clear_endpoint(),
> > > because tcm_vhost_drop_tpg -> tcm_vhost_drop_nexus -> check if (tpg->tv_tpg_vhost_count != 0)
> > 
> > My point is this:
> > 		tv_tpg = vs->vs_tpg[target];
> > 		if (!tv_tpg) {
> > 			....
> > 			return
> > 		}
> > 
> >                 tv_cmd = vhost_scsi_allocate_cmd(tv_tpg, &v_req,
> > 
> > above line can legally reread vs->vs_tpg[target] from array.
> > You need ACCESS_ONCE if you don't want that.
> 
> Well, this is another problem we have. Will include it in next version. 
> 
> > 
> > > Further, the tcm core should fail the cmd if the tpg is gone when we submit the cmd in
> > > tcm_vhost_submission_work. (nab, is this true?)
> > > 
> > > > > > Since we want to use private_data anyway, how about
> > > > > > making private_data point at struct tcm_vhost_tpg * ?
> > > > > > 
> > > > > > Allocate it dynamically in SET_ENDPOINT (and free old value if any).
> > > > > 
> > > > > The struct tcm_vhost_tpg is per target. I assume you want to point
> > > > > private_data to the 'struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET]'
> > > > 
> > > > No, I want to put it at the array of targets.
> > > 
> > > tcm_vhost_tpg is allocated in tcm_vhost_make_tpg. There is no array of
> > > the targets. The targets exist when the user creates them on the host side using
> > > targetcli tools or the /sys/kernel/config interface.
> > 
> > I really simply mean this field:
> > 	        struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> > 
> > allocate it dynamically when endpoint is set, and
> > set private data for each vq.
> 
> What's the benefit of allocating it dynamically? Why bother if the
> current, simpler static one works ok?

Because this makes the lifetime rules clear:
instead of changing values in the array, you
replace the pointer to the array.

> 
> So do you have further concerns other than the ACCESS_ONCE one?

It's just ugly to use a pointer as a flag. vhost uses private_data to
point to the constant backend structure, and NULL if there's no backend,
so vhost-scsi should just do this too; then it won't have problems.
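
Roughly, the shape being suggested (a sketch only, assuming the target
array is allocated at SET_ENDPOINT time; the helper names follow the
patch quoted above):

	/* In vhost_scsi_set_endpoint(), under vs->dev.mutex: */
	struct tcm_vhost_tpg **vs_tpg;

	vs_tpg = kzalloc(sizeof(vs_tpg[0]) * VHOST_SCSI_MAX_TARGET, GFP_KERNEL);
	if (!vs_tpg)
		return -ENOMEM;
	/* ... fill vs_tpg[] from the matching tpgs, as the existing loop does ... */

	for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
		vq = &vs->vqs[i];
		mutex_lock(&vq->mutex);
		/* private_data is the backend: the target array, or NULL */
		rcu_assign_pointer(vq->private_data, vs_tpg);
		vhost_init_used(vq);
		mutex_unlock(&vq->mutex);
	}
	/* The old array, if any, is freed only after a flush, so no vq
	 * can still be dereferencing it. */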

> > > > > > 
> > > > > > > > 
> > > > > > > > >  	mutex_unlock(&vs->dev.mutex);
> > > > > > > > >  	return 0;
> > > > > > > > >  
> > > > > > > > > -- 
> > > > > > > > > 1.8.1.4
> > > > > > > 
> > > > > > > -- 
> > > > > > > Asias
> > > > > 
> > > > > -- 
> > > > > Asias
> > > 
> > > -- 
> > > Asias
> 
> -- 
> Asias

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
  2013-04-02 15:18                   ` Michael S. Tsirkin
@ 2013-04-03  6:08                     ` Asias He
  0 siblings, 0 replies; 33+ messages in thread
From: Asias He @ 2013-04-03  6:08 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, virtualization, target-devel, Stefan Hajnoczi, Paolo Bonzini

On Tue, Apr 02, 2013 at 06:18:33PM +0300, Michael S. Tsirkin wrote:
> On Tue, Apr 02, 2013 at 11:10:02PM +0800, Asias He wrote:
> > On Tue, Apr 02, 2013 at 03:15:31PM +0300, Michael S. Tsirkin wrote:
> > > On Mon, Apr 01, 2013 at 10:13:47AM +0800, Asias He wrote:
> > > > On Sun, Mar 31, 2013 at 11:20:24AM +0300, Michael S. Tsirkin wrote:
> > > > > On Fri, Mar 29, 2013 at 02:22:52PM +0800, Asias He wrote:
> > > > > > On Thu, Mar 28, 2013 at 11:18:22AM +0200, Michael S. Tsirkin wrote:
> > > > > > > On Thu, Mar 28, 2013 at 04:10:02PM +0800, Asias He wrote:
> > > > > > > > On Thu, Mar 28, 2013 at 08:16:59AM +0200, Michael S. Tsirkin wrote:
> > > > > > > > > On Thu, Mar 28, 2013 at 10:17:28AM +0800, Asias He wrote:
> > > > > > > > > > Currently, vs->vs_endpoint is used to indicate if the endpoint is setup or
> > > > > > > > > > not. It is set or cleared in vhost_scsi_set_endpoint() or
> > > > > > > > > > vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> > > > > > > > > > we check it in vhost_scsi_handle_vq(), we ignored the lock.
> > > > > > > > > > 
> > > > > > > > > > Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
> > > > > > > > > > indicate the status of the endpoint, we use per virtqueue
> > > > > > > > > > vq->private_data to indicate it. In this way, we only need to take the
> > > > > > > > > > vq->mutex lock, which is per queue, so concurrent multiqueue
> > > > > > > > > > processing sees less lock contention. Further, on the read side of
> > > > > > > > > > vq->private_data, we do not even need to take the lock if it is accessed in
> > > > > > > > > > the vhost worker thread, because it is protected by "vhost rcu".
> > > > > > > > > > 
> > > > > > > > > > Signed-off-by: Asias He <asias@redhat.com>
> > > > > > > > > > ---
> > > > > > > > > >  drivers/vhost/tcm_vhost.c | 38 +++++++++++++++++++++++++++++++++-----
> > > > > > > > > >  1 file changed, 33 insertions(+), 5 deletions(-)
> > > > > > > > > > 
> > > > > > > > > > diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> > > > > > > > > > index 5e3d4487..0524267 100644
> > > > > > > > > > --- a/drivers/vhost/tcm_vhost.c
> > > > > > > > > > +++ b/drivers/vhost/tcm_vhost.c
> > > > > > > > > > @@ -67,7 +67,6 @@ struct vhost_scsi {
> > > > > > > > > >  	/* Protected by vhost_scsi->dev.mutex */
> > > > > > > > > >  	struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> > > > > > > > > >  	char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> > > > > > > > > > -	bool vs_endpoint;
> > > > > > > > > >  
> > > > > > > > > >  	struct vhost_dev dev;
> > > > > > > > > >  	struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> > > > > > > > > > @@ -91,6 +90,24 @@ static int iov_num_pages(struct iovec *iov)
> > > > > > > > > >  	       ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
> > > > > > > > > >  }
> > > > > > > > > >  
> > > > > > > > > > +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> > > > > > > > > > +{
> > > > > > > > > > +	bool ret = false;
> > > > > > > > > > +
> > > > > > > > > > +	/*
> > > > > > > > > > +	 * We can handle the vq only after the endpoint is setup by calling the
> > > > > > > > > > +	 * VHOST_SCSI_SET_ENDPOINT ioctl.
> > > > > > > > > > +	 *
> > > > > > > > > > +	 * TODO: Check that we are running from vhost_worker which acts
> > > > > > > > > > +	 * as read-side critical section for vhost kind of RCU.
> > > > > > > > > > +	 * See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
> > > > > > > > > > +	 */
> > > > > > > > > > +	if (rcu_dereference_check(vq->private_data, 1))
> > > > > > > > > > +		ret = true;
> > > > > > > > > > +
> > > > > > > > > > +	return ret;
> > > > > > > > > > +}
> > > > > > > > > > +
> > > > > > > > > >  static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
> > > > > > > > > >  {
> > > > > > > > > >  	return 1;
> > > > > > > > > > @@ -581,8 +598,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
> > > > > > > > > >  	int head, ret;
> > > > > > > > > >  	u8 target;
> > > > > > > > > >  
> > > > > > > > > > -	/* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> > > > > > > > > > -	if (unlikely(!vs->vs_endpoint))
> > > > > > > > > > +	if (!tcm_vhost_check_endpoint(vq))
> > > > > > > > > >  		return;
> > > > > > > > > >
> > > > > > > > > 
> > > > > > > > > I would just move the check to under vq mutex,
> > > > > > > > > and avoid rcu completely. In vhost-net we are using
> > > > > > > > > private data outside lock so we can't do this,
> > > > > > > > > no such issue here.
> > > > > > > > 
> > > > > > > > Are you talking about:
> > > > > > > > 
> > > > > > > >    handle_tx:
> > > > > > > >            /* TODO: check that we are running from vhost_worker? */
> > > > > > > >            sock = rcu_dereference_check(vq->private_data, 1);
> > > > > > > >            if (!sock)
> > > > > > > >                    return;
> > > > > > > >    
> > > > > > > >            wmem = atomic_read(&sock->sk->sk_wmem_alloc);
> > > > > > > >            if (wmem >= sock->sk->sk_sndbuf) {
> > > > > > > >                    mutex_lock(&vq->mutex);
> > > > > > > >                    tx_poll_start(net, sock);
> > > > > > > >                    mutex_unlock(&vq->mutex);
> > > > > > > >                    return;
> > > > > > > >            }
> > > > > > > >            mutex_lock(&vq->mutex);
> > > > > > > > 
> > > > > > > > Why not do the atomic_read and tx_poll_start under the vq->mutex, and thus do
> > > > > > > > the check under the lock as well.
> > > > > > > >    
> > > > > > > >    handle_rx:
> > > > > > > >            mutex_lock(&vq->mutex);
> > > > > > > >    
> > > > > > > >            /* TODO: check that we are running from vhost_worker? */
> > > > > > > >            struct socket *sock = rcu_dereference_check(vq->private_data, 1);
> > > > > > > >    
> > > > > > > >            if (!sock)
> > > > > > > >                    return;
> > > > > > > >    
> > > > > > > >            mutex_lock(&vq->mutex);
> > > > > > > > 
> > > > > > > > Can't we do the check under the vq->mutex here?
> > > > > > > > 
> > > > > > > > The rcu is still there but it makes the code easier to read. IMO, if we want to
> > > > > > > > use rcu, use it explicitly and avoid the vhost rcu completely. 
> > > > > > > > 
> > > > > > > > > >  	mutex_lock(&vq->mutex);
> > > > > > > > > > @@ -829,11 +845,12 @@ static int vhost_scsi_set_endpoint(
> > > > > > > > > >  		       sizeof(vs->vs_vhost_wwpn));
> > > > > > > > > >  		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > > > > > > >  			vq = &vs->vqs[i];
> > > > > > > > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > > > > > > >  			mutex_lock(&vq->mutex);
> > > > > > > > > > +			rcu_assign_pointer(vq->private_data, vs);
> > > > > > > > > >  			vhost_init_used(vq);
> > > > > > > > > >  			mutex_unlock(&vq->mutex);
> > > > > > > > > >  		}
> > > > > > > > > > -		vs->vs_endpoint = true;
> > > > > > > > > >  		ret = 0;
> > > > > > > > > >  	} else {
> > > > > > > > > >  		ret = -EEXIST;
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > There's also some weird smp_mb__after_atomic_inc() with no
> > > > > > > > > atomic in sight just above ... Nicholas what was the point there?
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > > @@ -849,6 +866,8 @@ static int vhost_scsi_clear_endpoint(
> > > > > > > > > >  {
> > > > > > > > > >  	struct tcm_vhost_tport *tv_tport;
> > > > > > > > > >  	struct tcm_vhost_tpg *tv_tpg;
> > > > > > > > > > +	struct vhost_virtqueue *vq;
> > > > > > > > > > +	bool match = false;
> > > > > > > > > >  	int index, ret, i;
> > > > > > > > > >  	u8 target;
> > > > > > > > > >  
> > > > > > > > > > @@ -884,9 +903,18 @@ static int vhost_scsi_clear_endpoint(
> > > > > > > > > >  		}
> > > > > > > > > >  		tv_tpg->tv_tpg_vhost_count--;
> > > > > > > > > >  		vs->vs_tpg[target] = NULL;
> > > > > > > > > > -		vs->vs_endpoint = false;
> > > > > > > > > > +		match = true;
> > > > > > > > > >  		mutex_unlock(&tv_tpg->tv_tpg_mutex);
> > > > > > > > > >  	}
> > > > > > > > > > +	if (match) {
> > > > > > > > > > +		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > > > > > > > > > +			vq = &vs->vqs[i];
> > > > > > > > > > +			/* Flushing the vhost_work acts as synchronize_rcu */
> > > > > > > > > > +			mutex_lock(&vq->mutex);
> > > > > > > > > > +			rcu_assign_pointer(vq->private_data, NULL);
> > > > > > > > > > +			mutex_unlock(&vq->mutex);
> > > > > > > > > > +		}
> > > > > > > > > > +	}
> > > > > > > > > 
> > > > > > > > > I'm trying to understand what's going on here.
> > > > > > > > > Does vhost_scsi only have a single target?
> > > > > > > > > Because the moment you clear one target you
> > > > > > > > > also set private_data to NULL ...
> > > > > > > > 
> > > > > > > > vhost_scsi supports multiple targets. Currently, we can not disable a specific target
> > > > > > > > under the wwpn. When we clear or set the endpoint, we disable or enable all the
> > > > > > > > targets under the wwpn.
> > > > > > > 
> > > > > > > okay, but changing vs->vs_tpg[target] under dev mutex, then using
> > > > > > > it under vq mutex looks wrong.
> > > > > > 
> > > > > > I do not see a problem here.
> > > > > > 
> > > > > > Access of vs->vs_tpg[target] in vhost_scsi_handle_vq() happens only when
> > > > > > the SET_ENDPOINT is done.
> > > > > 
> > > > > But nothing prevents multiple SET_ENDPOINT calls while
> > > > > the previous one is in progress.
> > > > 
> > > > vhost_scsi_set_endpoint() and vhost_scsi_clear_endpoint() are protected
> > > > by vs->dev.mutex, no?
> > > > 
> > > > And in vhost_scsi_set_endpoint():
> > > > 
> > > > 	if (tv_tpg->tv_tpg_vhost_count != 0) {
> > > > 		mutex_unlock(&tv_tpg->tv_tpg_mutex);
> > > > 		continue;
> > > > 	}
> > > > 
> > > > This prevents calling of vhost_scsi_set_endpoint before we call
> > > > vhost_scsi_clear_endpoint to decrease tv_tpg->tv_tpg_vhost_count.
> > > 
> > > All this seems to do is prevent reusing the same target
> > > in multiple vhosts.
> > > 
> > > > > > At that time, the vs->vs_tpg[] is already
> > > > > > ready. Even if the vs->vs_tpg[target] is changed to NULL in
> > > > > > CLEAR_ENDPOINT, it is safe since we fail the request if
> > > > > > vs->vs_tpg[target] is NULL.
> > > > > 
> > > > > We check it without a common lock so it can become NULL
> > > > > after we test it.
> > > > 
> > > > 
> > > > vhost_scsi_handle_vq:
> > > > 
> > > >      tv_tpg = vs->vs_tpg[target];	
> > > >      if (!tv_tpg)
> > > >          we fail the cmd
> > > >      ...
> > > > 
> > > >      INIT_WORK(&tv_cmd->work, tcm_vhost_submission_work);
> > > >      queue_work(tcm_vhost_workqueue, &tv_cmd->work);
> > > > 
> > > > So, after we test tv_tpg, even if vs->vs_tpg[target] becomes NULL, it
> > > > does not matter as long as the tpg is not deleted by calling tcm_vhost_drop_tpg().
> > > > tcm_vhost_drop_tpg() will not succeed if we do not call vhost_scsi_clear_endpoint(),
> > > > because tcm_vhost_drop_tpg -> tcm_vhost_drop_nexus -> check if (tpg->tv_tpg_vhost_count != 0)
> > > 
> > > My point is this:
> > > 		tv_tpg = vs->vs_tpg[target];
> > > 		if (!tv_tpg) {
> > > 			....
> > > 			return
> > > 		}
> > > 
> > >                 tv_cmd = vhost_scsi_allocate_cmd(tv_tpg, &v_req,
> > > 
> > > above line can legally reread vs->vs_tpg[target] from array.
> > > You need ACCESS_ONCE if you don't want that.
> > 
> > Well, this is another problem we have. Will include it in next version. 
> > 
> > > 
> > > > Further, the tcm core should fail the cmd if the tpg is gone when we submit the cmd in
> > > > tcm_vhost_submission_work. (nab, is this true?)
> > > > 
> > > > > > > Since we want to use private_data anyway, how about
> > > > > > > making private_data point at struct tcm_vhost_tpg * ?
> > > > > > > 
> > > > > > > Allocate it dynamically in SET_ENDPOINT (and free old value if any).
> > > > > > 
> > > > > > The struct tcm_vhost_tpg is per target. I assume you want to point
> > > > > > private_data to the 'struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET]'
> > > > > 
> > > > > No, I want to put it at the array of targets.
> > > > 
> > > > tcm_vhost_tpg is allocated in tcm_vhost_make_tpg. There is no array of
> > > > the targets. The targets exist when the user creates them on the host side using
> > > > targetcli tools or the /sys/kernel/config interface.
> > > 
> > > I really simply mean this field:
> > > 	        struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> > > 
> > > allocate it dynamically when endpoint is set, and
> > > set private data for each vq.
> > 
> > What's the benefit of allocating it dynamically? Why bother if the
> > current, simpler static one works ok?
> 
> Because this makes the lifetime rules clear:
> instead of changing values in the array, you
> replace the pointer to the array.

> > 
> > So do you have further concerns other than the ACCESS_ONCE one?
> 
> It's just ugly to use a pointer as a flag. vhost uses private_data to
> point to the constant backend structure, and NULL if there's no backend,
> so vhost-scsi should just do this too; then it won't have problems.

I'm fine with pointing vq->private_data to vs_tpg, but I don't 100% like
the idea of allocating it dynamically.

Anyway, I will send out v3 trying to make you happy ;-)
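
For reference, the read side of that scheme would look roughly like the
following (a sketch only; the v3 patch is the authoritative version):

	static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
					 struct vhost_virtqueue *vq)
	{
		struct tcm_vhost_tpg **vs_tpg;
		struct tcm_vhost_tpg *tv_tpg;

		/* The backend pointer doubles as the "endpoint is set up"
		 * flag: the target array installed by SET_ENDPOINT, or NULL.
		 */
		vs_tpg = rcu_dereference_check(vq->private_data, 1);
		if (!vs_tpg)
			return;
		/* ... */

		/* read the per-target pointer once */
		tv_tpg = ACCESS_ONCE(vs_tpg[target]);
		if (!tv_tpg) {
			/* fail the request */
		}
		/* ... */
	}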
 
> > > > > > > 
> > > > > > > > > 
> > > > > > > > > >  	mutex_unlock(&vs->dev.mutex);
> > > > > > > > > >  	return 0;
> > > > > > > > > >  
> > > > > > > > > > -- 
> > > > > > > > > > 1.8.1.4
> > > > > > > > 
> > > > > > > > -- 
> > > > > > > > Asias
> > > > > > 
> > > > > > -- 
> > > > > > Asias
> > > > 
> > > > -- 
> > > > Asias
> > 
> > -- 
> > Asias

-- 
Asias

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
  2013-04-01 22:57                 ` Rusty Russell
  2013-04-02 13:10                   ` Michael S. Tsirkin
@ 2013-04-12 11:37                   ` Michael S. Tsirkin
  1 sibling, 0 replies; 33+ messages in thread
From: Michael S. Tsirkin @ 2013-04-12 11:37 UTC (permalink / raw)
  To: Rusty Russell
  Cc: kvm, virtualization, target-devel, Stefan Hajnoczi, Paolo Bonzini

On Tue, Apr 02, 2013 at 09:27:57AM +1030, Rusty Russell wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
> > Rusty's currently doing some reorgs of -net, so let's delay
> > cleanups there to avoid stepping on each other's toys.
> > Let's focus on scsi here.
> > E.g. any chance framing assumptions can be fixed in 3.10?
> 
> I am waiting for your removal of the dma-complete ordering stuff in
> vhost-net.
> 
> Cheers,
> Rusty.

Now, it looks like it's actually a smart data structure.
It allows signalling consumption from multiple producers
without any locks, with multiple consumers, and just
a single kref counter.
Nothing simpler than a producer/consumer scheme does this.
Yes, it can in theory delay some tx completions a bit, but
normally no one is waiting for them.

We can refactor it to save some memory and clean up
the code; I'm playing with this now.
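
To make the pattern concrete for readers who have not looked at that
code: the rough idea (a toy illustration with made-up names, not the
actual vhost-net implementation) is a shared reference count that any
completion path can drop without taking a lock, with the final put
kicking the consumer:

	/* uses <linux/kref.h> */
	struct pending_completions {
		struct kref kref;		/* one ref per in-flight buffer */
		void (*kick)(void *ctx);	/* wake the consumer, e.g. a worker */
		void *ctx;
	};

	/* submit path: take a reference for each buffer handed out */
	static void pending_get(struct pending_completions *p)
	{
		kref_get(&p->kref);
	}

	static void pending_last(struct kref *kref)
	{
		struct pending_completions *p =
			container_of(kref, struct pending_completions, kref);

		p->kick(p->ctx);
	}

	/* completion path: may run from any context, no lock needed */
	static void pending_put(struct pending_completions *p)
	{
		kref_put(&p->kref, pending_last);
	}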

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2013-04-12 11:37 UTC | newest]

Thread overview: 33+ messages
2013-03-28  2:17 [PATCH V2 0/2] tcm_vhost endpoint Asias He
2013-03-28  2:17 ` [PATCH V2 1/2] tcm_vhost: Initialize vq->last_used_idx when set endpoint Asias He
2013-03-28  2:54   ` Nicholas A. Bellinger
2013-03-28  2:54   ` Nicholas A. Bellinger
2013-03-28  3:21     ` Asias He
2013-03-28  3:21     ` Asias He
2013-03-28  2:17 ` Asias He
2013-03-28  2:17 ` [PATCH V2 2/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup Asias He
2013-03-28  6:16   ` Michael S. Tsirkin
2013-03-28  8:10     ` Asias He
2013-03-28  8:33       ` Michael S. Tsirkin
2013-03-28  8:33       ` Michael S. Tsirkin
2013-03-28  8:47         ` Asias He
2013-03-28  9:06           ` Michael S. Tsirkin
2013-03-29  6:27             ` Asias He
2013-03-31  8:23               ` Michael S. Tsirkin
2013-03-31  8:23               ` Michael S. Tsirkin
2013-04-01  2:20                 ` Asias He
2013-04-01 22:57                 ` Rusty Russell
2013-04-02 13:10                   ` Michael S. Tsirkin
2013-04-12 11:37                   ` Michael S. Tsirkin
2013-04-01 22:57                 ` Rusty Russell
2013-03-29  6:27             ` Asias He
2013-03-28  9:18       ` Michael S. Tsirkin
2013-03-29  6:22         ` Asias He
2013-03-31  8:20           ` Michael S. Tsirkin
2013-04-01  2:13             ` Asias He
2013-04-02 12:15               ` Michael S. Tsirkin
2013-04-02 15:10                 ` Asias He
2013-04-02 15:18                   ` Michael S. Tsirkin
2013-04-03  6:08                     ` Asias He
2013-03-28  9:18       ` Michael S. Tsirkin
2013-03-28  2:17 ` Asias He
