All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
@ 2012-07-11 21:15 Nicholas A. Bellinger
  2012-07-11 21:15 ` [RFC-v2 1/4] vhost: Separate vhost-net features from vhost features Nicholas A. Bellinger
                   ` (8 more replies)
  0 siblings, 9 replies; 48+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-11 21:15 UTC (permalink / raw)
  To: target-devel
  Cc: Jens Axboe, Stefan Hajnoczi, kvm-devel, Michael S. Tsirkin,
	Zhi Yong Wu, Anthony Liguori, linux-scsi, Paolo Bonzini, lf-virt,
	Christoph Hellwig

From: Nicholas Bellinger <nab@linux-iscsi.org>

Hi folks,

The following is a RFC-v2 series of tcm_vhost target fabric driver code
currently in-flight for-3.6 mainline code.

After last week's developments along with the help of some new folks, the
changelog v1 -> v2 so far looks like:

*) Fix drivers/vhost/test.c to use VHOST_NET_FEATURES in patch #1 (Asias He)
*) Fix tv_cmd completion -> release SGL memory leak (nab)
*) Fix sparse warnings for static variable usage (Fengguang Wu)
*) Fix sparse warnings for min() typing + printk format specs (Fengguang Wu)
*) Convert to cmwq submission for I/O dispatch (nab + hch)

Also following Paolo's request, a patch for hw/virtio-scsi.c that sets
scsi_host->max_target=0 that removes the need for virtio-scsi LLD to hardcode
VirtIOSCSIConfig->max_id=1 in order to function with tcm_vhost.

Note this series has been pushed into target-pending.git/for-next-merge, and
should be getting picked up for tomorrow's linux-next build.

Please let us know if you have any concerns and/or additional review feedback.

Thank you!

Nicholas Bellinger (2):
  vhost: Add vhost_scsi specific defines
  tcm_vhost: Initial merge for vhost level target fabric driver

Stefan Hajnoczi (2):
  vhost: Separate vhost-net features from vhost features
  vhost: make vhost work queue visible

 drivers/vhost/Kconfig     |    6 +
 drivers/vhost/Makefile    |    1 +
 drivers/vhost/net.c       |    4 +-
 drivers/vhost/tcm_vhost.c | 1609 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/vhost/tcm_vhost.h |   74 ++
 drivers/vhost/test.c      |    4 +-
 drivers/vhost/vhost.c     |    5 +-
 drivers/vhost/vhost.h     |    6 +-
 include/linux/vhost.h     |    9 +
 9 files changed, 1710 insertions(+), 8 deletions(-)
 create mode 100644 drivers/vhost/tcm_vhost.c
 create mode 100644 drivers/vhost/tcm_vhost.h

-- 
1.7.2.5

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [RFC-v2 1/4] vhost: Separate vhost-net features from vhost features
  2012-07-11 21:15 [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6 Nicholas A. Bellinger
@ 2012-07-11 21:15 ` Nicholas A. Bellinger
  2012-07-11 21:15 ` Nicholas A. Bellinger
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 48+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-11 21:15 UTC (permalink / raw)
  To: target-devel
  Cc: linux-scsi, lf-virt, kvm-devel, Stefan Hajnoczi, Zhi Yong Wu,
	Anthony Liguori, Paolo Bonzini, Michael S. Tsirkin,
	Christoph Hellwig, Jens Axboe, Hannes Reinecke, Asias He

From: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>

In order for other vhost devices to use the VHOST_FEATURES bits the
vhost-net specific bits need to be moved to their own VHOST_NET_FEATURES
constant.

(Asias: Update drivers/vhost/test.c to use VHOST_NET_FEATURES)

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Cc: Zhi Yong Wu <wuzhy@cn.ibm.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Asias He <asias@redhat.com>
Signed-off-by: Nicholas A. Bellinger <nab@risingtidesystems.com>
---
 drivers/vhost/net.c   |    4 ++--
 drivers/vhost/test.c  |    4 ++--
 drivers/vhost/vhost.h |    3 ++-
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index f82a739..072cbba 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -823,14 +823,14 @@ static long vhost_net_ioctl(struct file *f, unsigned int ioctl,
 			return -EFAULT;
 		return vhost_net_set_backend(n, backend.index, backend.fd);
 	case VHOST_GET_FEATURES:
-		features = VHOST_FEATURES;
+		features = VHOST_NET_FEATURES;
 		if (copy_to_user(featurep, &features, sizeof features))
 			return -EFAULT;
 		return 0;
 	case VHOST_SET_FEATURES:
 		if (copy_from_user(&features, featurep, sizeof features))
 			return -EFAULT;
-		if (features & ~VHOST_FEATURES)
+		if (features & ~VHOST_NET_FEATURES)
 			return -EOPNOTSUPP;
 		return vhost_net_set_features(n, features);
 	case VHOST_RESET_OWNER:
diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
index 3de00d9..91d6f06 100644
--- a/drivers/vhost/test.c
+++ b/drivers/vhost/test.c
@@ -261,14 +261,14 @@ static long vhost_test_ioctl(struct file *f, unsigned int ioctl,
 			return -EFAULT;
 		return vhost_test_run(n, test);
 	case VHOST_GET_FEATURES:
-		features = VHOST_FEATURES;
+		features = VHOST_NET_FEATURES;
 		if (copy_to_user(featurep, &features, sizeof features))
 			return -EFAULT;
 		return 0;
 	case VHOST_SET_FEATURES:
 		if (copy_from_user(&features, featurep, sizeof features))
 			return -EFAULT;
-		if (features & ~VHOST_FEATURES)
+		if (features & ~VHOST_NET_FEATURES)
 			return -EOPNOTSUPP;
 		return vhost_test_set_features(n, features);
 	case VHOST_RESET_OWNER:
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 8de1fd5..07b9763 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -201,7 +201,8 @@ enum {
 	VHOST_FEATURES = (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) |
 			 (1ULL << VIRTIO_RING_F_INDIRECT_DESC) |
 			 (1ULL << VIRTIO_RING_F_EVENT_IDX) |
-			 (1ULL << VHOST_F_LOG_ALL) |
+			 (1ULL << VHOST_F_LOG_ALL),
+	VHOST_NET_FEATURES = VHOST_FEATURES |
 			 (1ULL << VHOST_NET_F_VIRTIO_NET_HDR) |
 			 (1ULL << VIRTIO_NET_F_MRG_RXBUF),
 };
-- 
1.7.2.5


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC-v2 1/4] vhost: Separate vhost-net features from vhost features
  2012-07-11 21:15 [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6 Nicholas A. Bellinger
  2012-07-11 21:15 ` [RFC-v2 1/4] vhost: Separate vhost-net features from vhost features Nicholas A. Bellinger
@ 2012-07-11 21:15 ` Nicholas A. Bellinger
  2012-07-11 21:15 ` [RFC-v2 2/4] vhost: make vhost work queue visible Nicholas A. Bellinger
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 48+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-11 21:15 UTC (permalink / raw)
  To: target-devel
  Cc: Jens Axboe, Stefan Hajnoczi, kvm-devel, Michael S. Tsirkin,
	Zhi Yong Wu, Anthony Liguori, linux-scsi, Paolo Bonzini, lf-virt,
	Christoph Hellwig

From: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>

In order for other vhost devices to use the VHOST_FEATURES bits the
vhost-net specific bits need to be moved to their own VHOST_NET_FEATURES
constant.

(Asias: Update drivers/vhost/test.c to use VHOST_NET_FEATURES)

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Cc: Zhi Yong Wu <wuzhy@cn.ibm.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Asias He <asias@redhat.com>
Signed-off-by: Nicholas A. Bellinger <nab@risingtidesystems.com>
---
 drivers/vhost/net.c   |    4 ++--
 drivers/vhost/test.c  |    4 ++--
 drivers/vhost/vhost.h |    3 ++-
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index f82a739..072cbba 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -823,14 +823,14 @@ static long vhost_net_ioctl(struct file *f, unsigned int ioctl,
 			return -EFAULT;
 		return vhost_net_set_backend(n, backend.index, backend.fd);
 	case VHOST_GET_FEATURES:
-		features = VHOST_FEATURES;
+		features = VHOST_NET_FEATURES;
 		if (copy_to_user(featurep, &features, sizeof features))
 			return -EFAULT;
 		return 0;
 	case VHOST_SET_FEATURES:
 		if (copy_from_user(&features, featurep, sizeof features))
 			return -EFAULT;
-		if (features & ~VHOST_FEATURES)
+		if (features & ~VHOST_NET_FEATURES)
 			return -EOPNOTSUPP;
 		return vhost_net_set_features(n, features);
 	case VHOST_RESET_OWNER:
diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
index 3de00d9..91d6f06 100644
--- a/drivers/vhost/test.c
+++ b/drivers/vhost/test.c
@@ -261,14 +261,14 @@ static long vhost_test_ioctl(struct file *f, unsigned int ioctl,
 			return -EFAULT;
 		return vhost_test_run(n, test);
 	case VHOST_GET_FEATURES:
-		features = VHOST_FEATURES;
+		features = VHOST_NET_FEATURES;
 		if (copy_to_user(featurep, &features, sizeof features))
 			return -EFAULT;
 		return 0;
 	case VHOST_SET_FEATURES:
 		if (copy_from_user(&features, featurep, sizeof features))
 			return -EFAULT;
-		if (features & ~VHOST_FEATURES)
+		if (features & ~VHOST_NET_FEATURES)
 			return -EOPNOTSUPP;
 		return vhost_test_set_features(n, features);
 	case VHOST_RESET_OWNER:
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 8de1fd5..07b9763 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -201,7 +201,8 @@ enum {
 	VHOST_FEATURES = (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) |
 			 (1ULL << VIRTIO_RING_F_INDIRECT_DESC) |
 			 (1ULL << VIRTIO_RING_F_EVENT_IDX) |
-			 (1ULL << VHOST_F_LOG_ALL) |
+			 (1ULL << VHOST_F_LOG_ALL),
+	VHOST_NET_FEATURES = VHOST_FEATURES |
 			 (1ULL << VHOST_NET_F_VIRTIO_NET_HDR) |
 			 (1ULL << VIRTIO_NET_F_MRG_RXBUF),
 };
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC-v2 2/4] vhost: make vhost work queue visible
  2012-07-11 21:15 [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6 Nicholas A. Bellinger
  2012-07-11 21:15 ` [RFC-v2 1/4] vhost: Separate vhost-net features from vhost features Nicholas A. Bellinger
  2012-07-11 21:15 ` Nicholas A. Bellinger
@ 2012-07-11 21:15 ` Nicholas A. Bellinger
  2012-07-11 21:15 ` Nicholas A. Bellinger
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 48+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-11 21:15 UTC (permalink / raw)
  To: target-devel
  Cc: linux-scsi, lf-virt, kvm-devel, Stefan Hajnoczi, Zhi Yong Wu,
	Anthony Liguori, Paolo Bonzini, Michael S. Tsirkin,
	Christoph Hellwig, Jens Axboe, Hannes Reinecke, Stefan Hajnoczi

From: Stefan Hajnoczi <stefanha@gmail.com>

The vhost work queue allows processing to be done in vhost worker thread
context, which uses the owner process mm.  Access to the vring and guest
memory is typically only possible from vhost worker context so it is
useful to allow work to be queued directly by users.

Currently vhost_net only uses the poll wrappers which do not expose the
work queue functions.  However, for tcm_vhost (vhost_scsi) it will be
necessary to queue custom work.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Cc: Zhi Yong Wu <wuzhy@cn.ibm.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
---
 drivers/vhost/vhost.c |    5 ++---
 drivers/vhost/vhost.h |    3 +++
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 94dbd25..1aab08b 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -64,7 +64,7 @@ static int vhost_poll_wakeup(wait_queue_t *wait, unsigned mode, int sync,
 	return 0;
 }
 
-static void vhost_work_init(struct vhost_work *work, vhost_work_fn_t fn)
+void vhost_work_init(struct vhost_work *work, vhost_work_fn_t fn)
 {
 	INIT_LIST_HEAD(&work->node);
 	work->fn = fn;
@@ -137,8 +137,7 @@ void vhost_poll_flush(struct vhost_poll *poll)
 	vhost_work_flush(poll->dev, &poll->work);
 }
 
-static inline void vhost_work_queue(struct vhost_dev *dev,
-				    struct vhost_work *work)
+void vhost_work_queue(struct vhost_dev *dev, struct vhost_work *work)
 {
 	unsigned long flags;
 
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 07b9763..1125af3 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -43,6 +43,9 @@ struct vhost_poll {
 	struct vhost_dev	 *dev;
 };
 
+void vhost_work_init(struct vhost_work *work, vhost_work_fn_t fn);
+void vhost_work_queue(struct vhost_dev *dev, struct vhost_work *work);
+
 void vhost_poll_init(struct vhost_poll *poll, vhost_work_fn_t fn,
 		     unsigned long mask, struct vhost_dev *dev);
 void vhost_poll_start(struct vhost_poll *poll, struct file *file);
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC-v2 2/4] vhost: make vhost work queue visible
  2012-07-11 21:15 [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6 Nicholas A. Bellinger
                   ` (2 preceding siblings ...)
  2012-07-11 21:15 ` [RFC-v2 2/4] vhost: make vhost work queue visible Nicholas A. Bellinger
@ 2012-07-11 21:15 ` Nicholas A. Bellinger
  2012-07-11 21:15 ` [RFC-v2 3/4] vhost: Add vhost_scsi specific defines Nicholas A. Bellinger
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 48+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-11 21:15 UTC (permalink / raw)
  To: target-devel
  Cc: Jens Axboe, Stefan Hajnoczi, kvm-devel, Michael S. Tsirkin,
	Zhi Yong Wu, Anthony Liguori, linux-scsi, Paolo Bonzini, lf-virt,
	Christoph Hellwig

From: Stefan Hajnoczi <stefanha@gmail.com>

The vhost work queue allows processing to be done in vhost worker thread
context, which uses the owner process mm.  Access to the vring and guest
memory is typically only possible from vhost worker context so it is
useful to allow work to be queued directly by users.

Currently vhost_net only uses the poll wrappers which do not expose the
work queue functions.  However, for tcm_vhost (vhost_scsi) it will be
necessary to queue custom work.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Cc: Zhi Yong Wu <wuzhy@cn.ibm.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
---
 drivers/vhost/vhost.c |    5 ++---
 drivers/vhost/vhost.h |    3 +++
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 94dbd25..1aab08b 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -64,7 +64,7 @@ static int vhost_poll_wakeup(wait_queue_t *wait, unsigned mode, int sync,
 	return 0;
 }
 
-static void vhost_work_init(struct vhost_work *work, vhost_work_fn_t fn)
+void vhost_work_init(struct vhost_work *work, vhost_work_fn_t fn)
 {
 	INIT_LIST_HEAD(&work->node);
 	work->fn = fn;
@@ -137,8 +137,7 @@ void vhost_poll_flush(struct vhost_poll *poll)
 	vhost_work_flush(poll->dev, &poll->work);
 }
 
-static inline void vhost_work_queue(struct vhost_dev *dev,
-				    struct vhost_work *work)
+void vhost_work_queue(struct vhost_dev *dev, struct vhost_work *work)
 {
 	unsigned long flags;
 
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 07b9763..1125af3 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -43,6 +43,9 @@ struct vhost_poll {
 	struct vhost_dev	 *dev;
 };
 
+void vhost_work_init(struct vhost_work *work, vhost_work_fn_t fn);
+void vhost_work_queue(struct vhost_dev *dev, struct vhost_work *work);
+
 void vhost_poll_init(struct vhost_poll *poll, vhost_work_fn_t fn,
 		     unsigned long mask, struct vhost_dev *dev);
 void vhost_poll_start(struct vhost_poll *poll, struct file *file);
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC-v2 3/4] vhost: Add vhost_scsi specific defines
  2012-07-11 21:15 [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6 Nicholas A. Bellinger
                   ` (3 preceding siblings ...)
  2012-07-11 21:15 ` Nicholas A. Bellinger
@ 2012-07-11 21:15 ` Nicholas A. Bellinger
  2012-07-11 21:15 ` Nicholas A. Bellinger
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 48+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-11 21:15 UTC (permalink / raw)
  To: target-devel
  Cc: linux-scsi, lf-virt, kvm-devel, Stefan Hajnoczi, Zhi Yong Wu,
	Anthony Liguori, Paolo Bonzini, Michael S. Tsirkin,
	Christoph Hellwig, Jens Axboe, Hannes Reinecke,
	Nicholas Bellinger

From: Nicholas Bellinger <nab@risingtidesystems.com>

This patch adds the initial vhost_scsi_ioctl() callers for VHOST_SCSI_SET_ENDPOINT
and VHOST_SCSI_CLEAR_ENDPOINT respectively, and also adds struct vhost_vring_target
that is used by tcm_vhost code when locating target ports during qemu setup.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Cc: Zhi Yong Wu <wuzhy@cn.ibm.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Signed-off-by: Nicholas A. Bellinger <nab@risingtidesystems.com>
---
 include/linux/vhost.h |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/include/linux/vhost.h b/include/linux/vhost.h
index e847f1e..33b313b 100644
--- a/include/linux/vhost.h
+++ b/include/linux/vhost.h
@@ -24,7 +24,11 @@ struct vhost_vring_state {
 struct vhost_vring_file {
 	unsigned int index;
 	int fd; /* Pass -1 to unbind from file. */
+};
 
+struct vhost_vring_target {
+	unsigned char vhost_wwpn[224];
+	unsigned short vhost_tpgt;
 };
 
 struct vhost_vring_addr {
@@ -121,6 +125,11 @@ struct vhost_memory {
  * device.  This can be used to stop the ring (e.g. for migration). */
 #define VHOST_NET_SET_BACKEND _IOW(VHOST_VIRTIO, 0x30, struct vhost_vring_file)
 
+/* VHOST_SCSI specific defines */
+
+#define VHOST_SCSI_SET_ENDPOINT _IOW(VHOST_VIRTIO, 0x40, struct vhost_vring_target)
+#define VHOST_SCSI_CLEAR_ENDPOINT _IOW(VHOST_VIRTIO, 0x41, struct vhost_vring_target)
+
 /* Feature bits */
 /* Log all write descriptors. Can be changed while device is active. */
 #define VHOST_F_LOG_ALL 26
-- 
1.7.2.5


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC-v2 3/4] vhost: Add vhost_scsi specific defines
  2012-07-11 21:15 [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6 Nicholas A. Bellinger
                   ` (4 preceding siblings ...)
  2012-07-11 21:15 ` [RFC-v2 3/4] vhost: Add vhost_scsi specific defines Nicholas A. Bellinger
@ 2012-07-11 21:15 ` Nicholas A. Bellinger
  2012-07-11 21:15 ` [RFC-v2 4/4] tcm_vhost: Initial merge for vhost level target fabric driver Nicholas A. Bellinger
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 48+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-11 21:15 UTC (permalink / raw)
  To: target-devel
  Cc: Jens Axboe, Stefan Hajnoczi, kvm-devel, Michael S. Tsirkin,
	Zhi Yong Wu, Anthony Liguori, linux-scsi, Paolo Bonzini, lf-virt,
	Nicholas Bellinger, Christoph Hellwig

From: Nicholas Bellinger <nab@risingtidesystems.com>

This patch adds the initial vhost_scsi_ioctl() callers for VHOST_SCSI_SET_ENDPOINT
and VHOST_SCSI_CLEAR_ENDPOINT respectively, and also adds struct vhost_vring_target
that is used by tcm_vhost code when locating target ports during qemu setup.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Cc: Zhi Yong Wu <wuzhy@cn.ibm.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Signed-off-by: Nicholas A. Bellinger <nab@risingtidesystems.com>
---
 include/linux/vhost.h |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/include/linux/vhost.h b/include/linux/vhost.h
index e847f1e..33b313b 100644
--- a/include/linux/vhost.h
+++ b/include/linux/vhost.h
@@ -24,7 +24,11 @@ struct vhost_vring_state {
 struct vhost_vring_file {
 	unsigned int index;
 	int fd; /* Pass -1 to unbind from file. */
+};
 
+struct vhost_vring_target {
+	unsigned char vhost_wwpn[224];
+	unsigned short vhost_tpgt;
 };
 
 struct vhost_vring_addr {
@@ -121,6 +125,11 @@ struct vhost_memory {
  * device.  This can be used to stop the ring (e.g. for migration). */
 #define VHOST_NET_SET_BACKEND _IOW(VHOST_VIRTIO, 0x30, struct vhost_vring_file)
 
+/* VHOST_SCSI specific defines */
+
+#define VHOST_SCSI_SET_ENDPOINT _IOW(VHOST_VIRTIO, 0x40, struct vhost_vring_target)
+#define VHOST_SCSI_CLEAR_ENDPOINT _IOW(VHOST_VIRTIO, 0x41, struct vhost_vring_target)
+
 /* Feature bits */
 /* Log all write descriptors. Can be changed while device is active. */
 #define VHOST_F_LOG_ALL 26
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC-v2 4/4] tcm_vhost: Initial merge for vhost level target fabric driver
  2012-07-11 21:15 [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6 Nicholas A. Bellinger
                   ` (6 preceding siblings ...)
  2012-07-11 21:15 ` [RFC-v2 4/4] tcm_vhost: Initial merge for vhost level target fabric driver Nicholas A. Bellinger
@ 2012-07-11 21:15 ` Nicholas A. Bellinger
  2012-07-17 15:05 ` [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6 Michael S. Tsirkin
  8 siblings, 0 replies; 48+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-11 21:15 UTC (permalink / raw)
  To: target-devel
  Cc: linux-scsi, lf-virt, kvm-devel, Stefan Hajnoczi, Zhi Yong Wu,
	Anthony Liguori, Paolo Bonzini, Michael S. Tsirkin,
	Christoph Hellwig, Jens Axboe, Hannes Reinecke,
	Nicholas Bellinger

From: Nicholas Bellinger <nab@linux-iscsi.org>

This patch adds the initial code for tcm_vhost, a Vhost level TCM
fabric driver for virtio SCSI initiators into KVM guest.

This code is currently up and running on v3.5-rc2 host+guest along
with the virtio-scsi vdev->scan() patch to allow a proper
scsi_scan_host() to occur once the tcm_vhost nexus has been established
by the paravirtualized virtio-scsi client here:

virtio-scsi: Add vdrv->scan for post VIRTIO_CONFIG_S_DRIVER_OK LUN scanning
http://marc.info/?l=linux-scsi&m=134160609212542&w=2

Using tcm_vhost requires Zhi's -> Stefan's qemu vhost-scsi tree here:

https://github.com/wuzhy/qemu/tree/vhost-scsi

along with the recent QEMU patch to hw/virtio-scsi.c to set max_target=0
during vhost-scsi operation.

Changelog v1 -> v2:

  Fix tv_cmd completion -> release SGL memory leak
  Fix sparse warnings for static variable usage
  Fix sparse warnings for min() typing + printk format specs
  Convert to cmwq submission for I/O dispatch

Changelog v0 -> v1:

  Merge into single source + header file, and move to drivers/vhost/

Cc: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Cc: Zhi Yong Wu <wuzhy@cn.ibm.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
---
 drivers/vhost/Kconfig     |    6 +
 drivers/vhost/Makefile    |    1 +
 drivers/vhost/tcm_vhost.c | 1609 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/vhost/tcm_vhost.h |   74 ++
 4 files changed, 1690 insertions(+), 0 deletions(-)
 create mode 100644 drivers/vhost/tcm_vhost.c
 create mode 100644 drivers/vhost/tcm_vhost.h

diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
index e4e2fd1..ccbeb6f 100644
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -9,3 +9,9 @@ config VHOST_NET
 	  To compile this driver as a module, choose M here: the module will
 	  be called vhost_net.
 
+config TCM_VHOST
+	tristate "TCM_VHOST fabric module (EXPERIMENTAL)"
+	depends on TARGET_CORE && EVENTFD && EXPERIMENTAL && m
+	default n
+	---help---
+	Say M here to enable the TCM_VHOST fabric module for use with virtio-scsi guests
diff --git a/drivers/vhost/Makefile b/drivers/vhost/Makefile
index 72dd020..b10c7b1 100644
--- a/drivers/vhost/Makefile
+++ b/drivers/vhost/Makefile
@@ -1,2 +1,3 @@
 obj-$(CONFIG_VHOST_NET) += vhost_net.o
+obj-$(CONFIG_TCM_VHOST) += tcm_vhost.o
 vhost_net-y := vhost.o net.o
diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
new file mode 100644
index 0000000..da0b8ac
--- /dev/null
+++ b/drivers/vhost/tcm_vhost.c
@@ -0,0 +1,1609 @@
+/*******************************************************************************
+ * Vhost kernel TCM fabric driver for virtio SCSI initiators
+ *
+ * (C) Copyright 2010-2012 RisingTide Systems LLC.
+ * (C) Copyright 2010-2012 IBM Corp.
+ *
+ * Licensed to the Linux Foundation under the General Public License (GPL) version 2.
+ *
+ * Authors: Nicholas A. Bellinger <nab@risingtidesystems.com>
+ *          Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ ****************************************************************************/
+
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <generated/utsrelease.h>
+#include <linux/utsname.h>
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/kthread.h>
+#include <linux/types.h>
+#include <linux/string.h>
+#include <linux/configfs.h>
+#include <linux/ctype.h>
+#include <linux/compat.h>
+#include <linux/eventfd.h>
+#include <linux/vhost.h>
+#include <linux/fs.h>
+#include <linux/miscdevice.h>
+#include <asm/unaligned.h>
+#include <scsi/scsi.h>
+#include <scsi/scsi_tcq.h>
+#include <target/target_core_base.h>
+#include <target/target_core_fabric.h>
+#include <target/target_core_fabric_configfs.h>
+#include <target/target_core_configfs.h>
+#include <target/configfs_macros.h>
+#include <linux/vhost.h>
+#include <linux/virtio_net.h> /* TODO vhost.h currently depends on this */
+#include <linux/virtio_scsi.h>
+
+#include "vhost.c"
+#include "vhost.h"
+#include "tcm_vhost.h"
+
+struct vhost_scsi {
+	atomic_t vhost_ref_cnt;
+	struct tcm_vhost_tpg *vs_tpg;
+	struct vhost_dev dev;
+	struct vhost_virtqueue vqs[3];
+
+	struct vhost_work vs_completion_work; /* cmd completion work item */
+	struct list_head vs_completion_list;  /* cmd completion queue */
+	spinlock_t vs_completion_lock;        /* protects s_completion_list */
+};
+
+/* Local pointer to allocated TCM configfs fabric module */
+static struct target_fabric_configfs *tcm_vhost_fabric_configfs;
+
+static struct workqueue_struct *tcm_vhost_workqueue;
+
+/* Global spinlock to protect tcm_vhost TPG list for vhost IOCTL access */
+static DEFINE_MUTEX(tcm_vhost_mutex);
+static LIST_HEAD(tcm_vhost_list);
+
+static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
+{
+	return 1;
+}
+
+static int tcm_vhost_check_false(struct se_portal_group *se_tpg)
+{
+	return 0;
+}
+
+static char *tcm_vhost_get_fabric_name(void)
+{
+	return "vhost";
+}
+
+static u8 tcm_vhost_get_fabric_proto_ident(struct se_portal_group *se_tpg)
+{
+	struct tcm_vhost_tpg *tpg = container_of(se_tpg,
+				struct tcm_vhost_tpg, se_tpg);
+	struct tcm_vhost_tport *tport = tpg->tport;
+
+	switch (tport->tport_proto_id) {
+	case SCSI_PROTOCOL_SAS:
+		return sas_get_fabric_proto_ident(se_tpg);
+	case SCSI_PROTOCOL_FCP:
+		return fc_get_fabric_proto_ident(se_tpg);
+	case SCSI_PROTOCOL_ISCSI:
+		return iscsi_get_fabric_proto_ident(se_tpg);
+	default:
+		pr_err("Unknown tport_proto_id: 0x%02x, using"
+			" SAS emulation\n", tport->tport_proto_id);
+		break;
+	}
+
+	return sas_get_fabric_proto_ident(se_tpg);
+}
+
+static char *tcm_vhost_get_fabric_wwn(struct se_portal_group *se_tpg)
+{
+	struct tcm_vhost_tpg *tpg = container_of(se_tpg,
+				struct tcm_vhost_tpg, se_tpg);
+	struct tcm_vhost_tport *tport = tpg->tport;
+
+	return &tport->tport_name[0];
+}
+
+static u16 tcm_vhost_get_tag(struct se_portal_group *se_tpg)
+{
+	struct tcm_vhost_tpg *tpg = container_of(se_tpg,
+				struct tcm_vhost_tpg, se_tpg);
+	return tpg->tport_tpgt;
+}
+
+static u32 tcm_vhost_get_default_depth(struct se_portal_group *se_tpg)
+{
+	return 1;
+}
+
+static u32 tcm_vhost_get_pr_transport_id(
+	struct se_portal_group *se_tpg,
+	struct se_node_acl *se_nacl,
+	struct t10_pr_registration *pr_reg,
+	int *format_code,
+	unsigned char *buf)
+{
+	struct tcm_vhost_tpg *tpg = container_of(se_tpg,
+				struct tcm_vhost_tpg, se_tpg);
+	struct tcm_vhost_tport *tport = tpg->tport;
+
+	switch (tport->tport_proto_id) {
+	case SCSI_PROTOCOL_SAS:
+		return sas_get_pr_transport_id(se_tpg, se_nacl, pr_reg,
+					format_code, buf);
+	case SCSI_PROTOCOL_FCP:
+		return fc_get_pr_transport_id(se_tpg, se_nacl, pr_reg,
+					format_code, buf);
+	case SCSI_PROTOCOL_ISCSI:
+		return iscsi_get_pr_transport_id(se_tpg, se_nacl, pr_reg,
+					format_code, buf);
+	default:
+		pr_err("Unknown tport_proto_id: 0x%02x, using"
+			" SAS emulation\n", tport->tport_proto_id);
+		break;
+	}
+
+	return sas_get_pr_transport_id(se_tpg, se_nacl, pr_reg,
+			format_code, buf);
+}
+
+static u32 tcm_vhost_get_pr_transport_id_len(
+	struct se_portal_group *se_tpg,
+	struct se_node_acl *se_nacl,
+	struct t10_pr_registration *pr_reg,
+	int *format_code)
+{
+	struct tcm_vhost_tpg *tpg = container_of(se_tpg,
+				struct tcm_vhost_tpg, se_tpg);
+	struct tcm_vhost_tport *tport = tpg->tport;
+
+	switch (tport->tport_proto_id) {
+	case SCSI_PROTOCOL_SAS:
+		return sas_get_pr_transport_id_len(se_tpg, se_nacl, pr_reg,
+					format_code);
+	case SCSI_PROTOCOL_FCP:
+		return fc_get_pr_transport_id_len(se_tpg, se_nacl, pr_reg,
+					format_code);
+	case SCSI_PROTOCOL_ISCSI:
+		return iscsi_get_pr_transport_id_len(se_tpg, se_nacl, pr_reg,
+					format_code);
+	default:
+		pr_err("Unknown tport_proto_id: 0x%02x, using"
+			" SAS emulation\n", tport->tport_proto_id);
+		break;
+	}
+
+	return sas_get_pr_transport_id_len(se_tpg, se_nacl, pr_reg,
+			format_code);
+}
+
+static char *tcm_vhost_parse_pr_out_transport_id(
+	struct se_portal_group *se_tpg,
+	const char *buf,
+	u32 *out_tid_len,
+	char **port_nexus_ptr)
+{
+	struct tcm_vhost_tpg *tpg = container_of(se_tpg,
+				struct tcm_vhost_tpg, se_tpg);
+	struct tcm_vhost_tport *tport = tpg->tport;
+
+	switch (tport->tport_proto_id) {
+	case SCSI_PROTOCOL_SAS:
+		return sas_parse_pr_out_transport_id(se_tpg, buf, out_tid_len,
+					port_nexus_ptr);
+	case SCSI_PROTOCOL_FCP:
+		return fc_parse_pr_out_transport_id(se_tpg, buf, out_tid_len,
+					port_nexus_ptr);
+	case SCSI_PROTOCOL_ISCSI:
+		return iscsi_parse_pr_out_transport_id(se_tpg, buf, out_tid_len,
+					port_nexus_ptr);
+	default:
+		pr_err("Unknown tport_proto_id: 0x%02x, using"
+			" SAS emulation\n", tport->tport_proto_id);
+		break;
+	}
+
+	return sas_parse_pr_out_transport_id(se_tpg, buf, out_tid_len,
+			port_nexus_ptr);
+}
+
+static struct se_node_acl *tcm_vhost_alloc_fabric_acl(struct se_portal_group *se_tpg)
+{
+	struct tcm_vhost_nacl *nacl;
+
+	nacl = kzalloc(sizeof(struct tcm_vhost_nacl), GFP_KERNEL);
+	if (!nacl) {
+		pr_err("Unable to alocate struct tcm_vhost_nacl\n");
+		return NULL;
+	}
+
+	return &nacl->se_node_acl;
+}
+
+static void tcm_vhost_release_fabric_acl(
+	struct se_portal_group *se_tpg,
+	struct se_node_acl *se_nacl)
+{
+	struct tcm_vhost_nacl *nacl = container_of(se_nacl,
+			struct tcm_vhost_nacl, se_node_acl);
+	kfree(nacl);
+}
+
+static u32 tcm_vhost_tpg_get_inst_index(struct se_portal_group *se_tpg)
+{
+	return 1;
+}
+
+static void tcm_vhost_release_cmd(struct se_cmd *se_cmd)
+{
+	return;
+}
+
+static int tcm_vhost_shutdown_session(struct se_session *se_sess)
+{
+	return 0;
+}
+
+static void tcm_vhost_close_session(struct se_session *se_sess)
+{
+	return;
+}
+
+static u32 tcm_vhost_sess_get_index(struct se_session *se_sess)
+{
+	return 0;
+}
+
+static int tcm_vhost_write_pending(struct se_cmd *se_cmd)
+{
+	/* Go ahead and process the write immediately */
+	transport_generic_process_write(se_cmd);
+	return 0;
+}
+
+static int tcm_vhost_write_pending_status(struct se_cmd *se_cmd)
+{
+	return 0;
+}
+
+static void tcm_vhost_set_default_node_attrs(struct se_node_acl *nacl)
+{
+	return;
+}
+
+static u32 tcm_vhost_get_task_tag(struct se_cmd *se_cmd)
+{
+	return 0;
+}
+
+static int tcm_vhost_get_cmd_state(struct se_cmd *se_cmd)
+{
+	return 0;
+}
+
+static void vhost_scsi_complete_cmd(struct tcm_vhost_cmd *);
+
+static int tcm_vhost_queue_data_in(struct se_cmd *se_cmd)
+{
+	struct tcm_vhost_cmd *tv_cmd = container_of(se_cmd,
+				struct tcm_vhost_cmd, tvc_se_cmd);
+	vhost_scsi_complete_cmd(tv_cmd);
+	return 0;
+}
+
+static int tcm_vhost_queue_status(struct se_cmd *se_cmd)
+{
+	struct tcm_vhost_cmd *tv_cmd = container_of(se_cmd,
+				struct tcm_vhost_cmd, tvc_se_cmd);
+	vhost_scsi_complete_cmd(tv_cmd);
+	return 0;
+}
+
+static int tcm_vhost_queue_tm_rsp(struct se_cmd *se_cmd)
+{
+	return 0;
+}
+
+static u16 tcm_vhost_set_fabric_sense_len(struct se_cmd *se_cmd, u32 sense_length)
+{
+	return 0;
+}
+
+static u16 tcm_vhost_get_fabric_sense_len(void)
+{
+	return 0;
+}
+
+static void vhost_scsi_free_cmd(struct tcm_vhost_cmd *tv_cmd)
+{
+	struct se_cmd *se_cmd = &tv_cmd->tvc_se_cmd;
+
+	/* TODO locking against target/backend threads? */
+	transport_generic_free_cmd(se_cmd, 1);
+
+	if (tv_cmd->tvc_sgl_count) {
+		u32 i;
+		for (i = 0; i < tv_cmd->tvc_sgl_count; i++)
+			put_page(sg_page(&tv_cmd->tvc_sgl[i]));
+
+		kfree(tv_cmd->tvc_sgl);
+	}
+
+	kfree(tv_cmd);
+}
+
+/* Dequeue a command from the completion list */
+static struct tcm_vhost_cmd *vhost_scsi_get_cmd_from_completion(struct vhost_scsi *vs)
+{
+	struct tcm_vhost_cmd *tv_cmd = NULL;
+
+	spin_lock_bh(&vs->vs_completion_lock);
+	if (list_empty(&vs->vs_completion_list)) {
+		spin_unlock_bh(&vs->vs_completion_lock);
+		return NULL;
+	}
+
+	list_for_each_entry(tv_cmd, &vs->vs_completion_list,
+			    tvc_completion_list) {
+		list_del(&tv_cmd->tvc_completion_list);
+		break;
+	}
+	spin_unlock_bh(&vs->vs_completion_lock);
+	return tv_cmd;
+}
+
+/* Fill in status and signal that we are done processing this command
+ *
+ * This is scheduled in the vhost work queue so we are called with the owner
+ * process mm and can access the vring.
+ */
+static void vhost_scsi_complete_cmd_work(struct vhost_work *work)
+{
+	struct vhost_scsi *vs = container_of(work, struct vhost_scsi,
+	                                     vs_completion_work);
+	struct tcm_vhost_cmd *tv_cmd;
+
+	while ((tv_cmd = vhost_scsi_get_cmd_from_completion(vs)) != NULL) {
+		struct virtio_scsi_cmd_resp v_rsp;
+		struct se_cmd *se_cmd = &tv_cmd->tvc_se_cmd;
+		int ret;
+
+		pr_debug("%s tv_cmd %p resid %u status %#02x\n", __func__,
+			tv_cmd, se_cmd->residual_count, se_cmd->scsi_status);
+
+		memset(&v_rsp, 0, sizeof(v_rsp));
+		v_rsp.resid = se_cmd->residual_count;
+		/* TODO is status_qualifier field needed? */
+		v_rsp.status = se_cmd->scsi_status;
+		v_rsp.sense_len = se_cmd->scsi_sense_length;
+		memcpy(v_rsp.sense, tv_cmd->tvc_sense_buf,
+		       v_rsp.sense_len);
+		ret = copy_to_user(tv_cmd->tvc_resp, &v_rsp, sizeof(v_rsp));
+		if (likely(ret == 0))
+			vhost_add_used(&vs->vqs[2], tv_cmd->tvc_vq_desc, 0);
+		else
+			pr_err("Faulted on virtio_scsi_cmd_resp\n");
+
+		vhost_scsi_free_cmd(tv_cmd);
+	}
+
+	vhost_signal(&vs->dev, &vs->vqs[2]);
+}
+
+static void vhost_scsi_complete_cmd(struct tcm_vhost_cmd *tv_cmd)
+{
+	struct vhost_scsi *vs = tv_cmd->tvc_vhost;
+
+	pr_debug("%s tv_cmd %p\n", __func__, tv_cmd);
+
+	spin_lock_bh(&vs->vs_completion_lock);
+	list_add_tail(&tv_cmd->tvc_completion_list, &vs->vs_completion_list);
+	spin_unlock_bh(&vs->vs_completion_lock);
+
+	vhost_work_queue(&vs->dev, &vs->vs_completion_work);
+}
+
+static struct tcm_vhost_cmd *vhost_scsi_allocate_cmd(
+	struct tcm_vhost_tpg *tv_tpg,
+	struct virtio_scsi_cmd_req *v_req,
+	u32 exp_data_len,
+	int data_direction)
+{
+	struct tcm_vhost_cmd *tv_cmd;
+	struct tcm_vhost_nexus *tv_nexus;
+	struct se_portal_group *se_tpg = &tv_tpg->se_tpg;
+	struct se_session *se_sess;
+	struct se_cmd *se_cmd;
+	int sam_task_attr;
+
+	tv_nexus = tv_tpg->tpg_nexus;
+	if (!tv_nexus) {
+		pr_err("Unable to locate active struct tcm_vhost_nexus\n");
+		return ERR_PTR(-EIO);
+	}
+	se_sess = tv_nexus->tvn_se_sess;
+
+	tv_cmd = kzalloc(sizeof(struct tcm_vhost_cmd), GFP_ATOMIC);
+	if (!tv_cmd) {
+		pr_err("Unable to allocate struct tcm_vhost_cmd\n");
+		return ERR_PTR(-ENOMEM);
+	}
+	INIT_LIST_HEAD(&tv_cmd->tvc_completion_list);
+	tv_cmd->tvc_tag = v_req->tag;
+
+	se_cmd = &tv_cmd->tvc_se_cmd;
+	/*
+	 * Locate the SAM Task Attr from virtio_scsi_cmd_req
+	 */
+	sam_task_attr = v_req->task_attr;
+	/*
+	 * Initialize struct se_cmd descriptor from target_core_mod infrastructure
+	 */
+	transport_init_se_cmd(se_cmd, se_tpg->se_tpg_tfo, se_sess, exp_data_len,
+				data_direction, sam_task_attr,
+				&tv_cmd->tvc_sense_buf[0]);
+
+#if 0	/* FIXME: vhost_scsi_allocate_cmd() BIDI operation */
+	if (bidi)
+		se_cmd->se_cmd_flags |= SCF_BIDI;
+#endif
+	return tv_cmd;
+}
+
+/*
+ * Map a user memory range into a scatterlist
+ *
+ * Returns the number of scatterlist entries used or -errno on error.
+ */
+static int vhost_scsi_map_to_sgl(struct scatterlist *sgl,
+		                 unsigned int sgl_count,
+		                 void __user *ptr, size_t len, int write)
+{
+	struct scatterlist *sg = sgl;
+	unsigned int npages = 0;
+	int ret;
+
+	while (len > 0) {
+		struct page *page;
+		unsigned int offset = (uintptr_t)ptr & ~PAGE_MASK;
+		unsigned int nbytes = min_t(unsigned int,
+				PAGE_SIZE - offset, len);
+
+		if (npages == sgl_count) {
+			ret = -ENOBUFS;
+			goto err;
+		}
+
+		ret = get_user_pages_fast((unsigned long)ptr, 1, write, &page);
+		BUG_ON(ret == 0); /* we should either get our page or fail */
+		if (ret < 0)
+			goto err;
+
+		sg_set_page(sg, page, nbytes, offset);
+		ptr += nbytes;
+		len -= nbytes;
+		sg++;
+		npages++;
+	}
+	return npages;
+
+err:
+	/* Put pages that we hold */
+	for (sg = sgl; sg != &sgl[npages]; sg++)
+		put_page(sg_page(sg));
+	return ret;
+}
+
+static int vhost_scsi_map_iov_to_sgl(struct tcm_vhost_cmd *tv_cmd,
+                                     struct iovec *iov, unsigned int niov,
+				     int write)
+{
+	int ret;
+	unsigned int i;
+	u32 sgl_count;
+	struct scatterlist *sg;
+
+	/*
+	 * Find out how long sglist needs to be
+	 */
+	sgl_count = 0;
+	for (i = 0; i < niov; i++) {
+		sgl_count += (((uintptr_t)iov[i].iov_base + iov[i].iov_len +
+		             PAGE_SIZE - 1) >> PAGE_SHIFT) -
+		             ((uintptr_t)iov[i].iov_base >> PAGE_SHIFT);
+	}
+	/* TODO overflow checking */
+
+	sg = kmalloc(sizeof(tv_cmd->tvc_sgl[0]) * sgl_count, GFP_ATOMIC);
+	if (!sg)
+		return -ENOMEM;
+	pr_debug("%s sg %p sgl_count %u is_err %ld\n", __func__,
+	       sg, sgl_count, IS_ERR(sg));
+	sg_init_table(sg, sgl_count);
+
+	tv_cmd->tvc_sgl = sg;
+	tv_cmd->tvc_sgl_count = sgl_count;
+
+	pr_debug("Mapping %u iovecs for %u pages\n", niov, sgl_count);
+	for (i = 0; i < niov; i++) {
+		ret = vhost_scsi_map_to_sgl(sg, sgl_count, iov[i].iov_base,
+		                            iov[i].iov_len, write);
+		if (ret < 0) {
+			for (i = 0; i < tv_cmd->tvc_sgl_count; i++)
+				put_page(sg_page(&tv_cmd->tvc_sgl[i]));
+			kfree(tv_cmd->tvc_sgl);
+			tv_cmd->tvc_sgl = NULL;
+			tv_cmd->tvc_sgl_count = 0;
+			return ret;
+		}
+
+		sg += ret;
+		sgl_count -= ret;
+	}
+	return 0;
+}
+
+static void tcm_vhost_submission_work(struct work_struct *work)
+{
+	struct tcm_vhost_cmd *tv_cmd =
+		container_of(work, struct tcm_vhost_cmd, work);
+	struct se_cmd *se_cmd = &tv_cmd->tvc_se_cmd;
+	struct scatterlist *sg_ptr, *sg_bidi_ptr = NULL;
+	int rc, sg_no_bidi = 0;
+	/*
+	 * Locate the struct se_lun pointer based on v_req->lun, and
+	 * attach it to struct se_cmd
+	 */
+	rc = transport_lookup_cmd_lun(&tv_cmd->tvc_se_cmd, tv_cmd->tvc_lun);
+	if (rc < 0) {
+		pr_err("Failed to look up lun: %d\n", tv_cmd->tvc_lun);
+		transport_send_check_condition_and_sense(&tv_cmd->tvc_se_cmd,
+			tv_cmd->tvc_se_cmd.scsi_sense_reason, 0);
+		transport_generic_free_cmd(se_cmd, 0);
+		return;
+	}
+
+	rc = target_setup_cmd_from_cdb(se_cmd, tv_cmd->tvc_cdb);
+	if (rc == -ENOMEM) {
+		transport_send_check_condition_and_sense(se_cmd,
+				TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE, 0);
+		transport_generic_free_cmd(se_cmd, 0);
+		return;
+	} else if (rc < 0) {
+		if (se_cmd->se_cmd_flags & SCF_SCSI_RESERVATION_CONFLICT)
+			tcm_vhost_queue_status(se_cmd);
+		else
+			transport_send_check_condition_and_sense(se_cmd,
+					se_cmd->scsi_sense_reason, 0);
+		transport_generic_free_cmd(se_cmd, 0);
+		return;
+	}
+
+	if (tv_cmd->tvc_sgl_count) {
+		sg_ptr = tv_cmd->tvc_sgl;
+		/*
+		 * For BIDI commands, pass in the extra READ buffer
+		 * to transport_generic_map_mem_to_cmd() below..
+		 */
+/* FIXME: Fix BIDI operation in tcm_vhost_submission_work() */
+#if 0
+		if (se_cmd->se_cmd_flags & SCF_BIDI) {
+			sg_bidi_ptr = NULL;
+			sg_no_bidi = 0;
+		}
+#endif
+	} else {
+		sg_ptr = NULL;
+	}
+
+	rc = transport_generic_map_mem_to_cmd(se_cmd, sg_ptr,
+				tv_cmd->tvc_sgl_count, sg_bidi_ptr,
+				sg_no_bidi);
+	if (rc < 0) {
+		transport_send_check_condition_and_sense(se_cmd,
+				se_cmd->scsi_sense_reason, 0);
+		transport_generic_free_cmd(se_cmd, 0);
+		return;
+	}
+	transport_handle_cdb_direct(se_cmd);
+}
+
+static void vhost_scsi_handle_vq(struct vhost_scsi *vs)
+{
+	struct vhost_virtqueue *vq = &vs->vqs[2];
+	struct virtio_scsi_cmd_req v_req;
+	struct tcm_vhost_tpg *tv_tpg;
+	struct tcm_vhost_cmd *tv_cmd;
+	u32 exp_data_len, data_first, data_num, data_direction;
+	unsigned out, in, i;
+	int head, ret;
+
+	/* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
+	tv_tpg = vs->vs_tpg;
+	if (unlikely(!tv_tpg)) {
+		pr_err("%s endpoint not set\n", __func__);
+		return;
+	}
+
+	mutex_lock(&vq->mutex);
+	vhost_disable_notify(&vs->dev, vq);
+
+	for (;;) {
+		head = vhost_get_vq_desc(&vs->dev, vq, vq->iov,
+					ARRAY_SIZE(vq->iov), &out, &in,
+					NULL, NULL);
+		pr_debug("vhost_get_vq_desc: head: %d, out: %u in: %u\n", head, out, in);
+		/* On error, stop handling until the next kick. */
+		if (unlikely(head < 0))
+			break;
+		/* Nothing new?  Wait for eventfd to tell us they refilled. */
+		if (head == vq->num) {
+			if (unlikely(vhost_enable_notify(&vs->dev, vq))) {
+				vhost_disable_notify(&vs->dev, vq);
+				continue;
+			}
+			break;
+		}
+
+/* FIXME: BIDI operation */
+		if (out == 1 && in == 1) {
+			data_direction = DMA_NONE;
+			data_first = 0;
+			data_num = 0;
+		} else if (out == 1 && in > 1) {
+			data_direction = DMA_FROM_DEVICE;
+			data_first = out + 1;
+			data_num = in - 1;
+		} else if (out > 1 && in == 1) {
+			data_direction = DMA_TO_DEVICE;
+			data_first = 1;
+			data_num = out - 1;
+		} else {
+			pr_err("Invalid buffer layout out: %u in: %u\n", out, in);
+			break;
+		}
+
+		/*
+		 * Check for a sane resp buffer so we can report errors to
+		 * the guest.
+		 */
+		if (unlikely(vq->iov[out].iov_len !=
+					sizeof(struct virtio_scsi_cmd_resp))) {
+			pr_err("Expecting virtio_scsi_cmd_resp, got %zu bytes\n",
+					vq->iov[out].iov_len);
+			break;
+		}
+
+		if (unlikely(vq->iov[0].iov_len != sizeof(v_req))) {
+			pr_err("Expecting virtio_scsi_cmd_req, got %zu bytes\n",
+					vq->iov[0].iov_len);
+			break;
+		}
+		pr_debug("Calling __copy_from_user: vq->iov[0].iov_base: %p, len: %zu\n",
+				vq->iov[0].iov_base, sizeof(v_req));
+		ret = __copy_from_user(&v_req, vq->iov[0].iov_base, sizeof(v_req));
+		if (unlikely(ret)) {
+			pr_err("Faulted on virtio_scsi_cmd_req\n");
+			break;
+		}
+
+		exp_data_len = 0;
+		for (i = 0; i < data_num; i++) {
+			exp_data_len += vq->iov[data_first + i].iov_len;
+		}
+
+		tv_cmd = vhost_scsi_allocate_cmd(tv_tpg, &v_req,
+					exp_data_len, data_direction);
+		if (IS_ERR(tv_cmd)) {
+			pr_err("vhost_scsi_allocate_cmd failed %ld\n", PTR_ERR(tv_cmd));
+			break;
+		}
+		pr_debug("Allocated tv_cmd: %p exp_data_len: %d, data_direction: %d\n",
+				tv_cmd, exp_data_len, data_direction);
+
+		tv_cmd->tvc_vhost = vs;
+
+		if (unlikely(vq->iov[out].iov_len !=
+		             sizeof(struct virtio_scsi_cmd_resp))) {
+			pr_err("Expecting virtio_scsi_cmd_resp, "
+			       " got %zu bytes, out: %d, in: %d\n", vq->iov[out].iov_len, out, in);
+			break;
+		}
+
+		tv_cmd->tvc_resp = vq->iov[out].iov_base;
+
+		/*
+		 * Copy in the recieved CDB descriptor into tv_cmd->tvc_cdb
+		 * that will be used by tcm_vhost_new_cmd_map() and down into
+		 * target_setup_cmd_from_cdb()
+		 */
+		memcpy(tv_cmd->tvc_cdb, v_req.cdb, TCM_VHOST_MAX_CDB_SIZE);
+		/*
+		 * Check that the recieved CDB size does not exceeded our
+		 * hardcoded max for tcm_vhost
+		 */
+		/* TODO what if cdb was too small for varlen cdb header? */
+		if (unlikely(scsi_command_size(tv_cmd->tvc_cdb) > TCM_VHOST_MAX_CDB_SIZE)) {
+			pr_err("Received SCSI CDB with command_size: %d that exceeds"
+				" SCSI_MAX_VARLEN_CDB_SIZE: %d\n",
+				scsi_command_size(tv_cmd->tvc_cdb), TCM_VHOST_MAX_CDB_SIZE);
+			break; /* TODO */
+		}
+		tv_cmd->tvc_lun = ((v_req.lun[2] << 8) | v_req.lun[3]) & 0x3FFF;
+
+		pr_debug("vhost_scsi got command opcode: %#02x, lun: %d\n",
+			tv_cmd->tvc_cdb[0], tv_cmd->tvc_lun);
+
+		if (data_direction != DMA_NONE) {
+			ret = vhost_scsi_map_iov_to_sgl(tv_cmd, &vq->iov[data_first],
+					data_num, data_direction == DMA_TO_DEVICE);
+			if (unlikely(ret)) {
+				pr_err("Failed to map iov to sgl\n");
+				break; /* TODO */
+			}
+		}
+
+		/*
+		 * Save the descriptor from vhost_get_vq_desc() to be used to
+		 * complete the virtio-scsi request in TCM callback context via
+		 * tcm_vhost_queue_data_in() and tcm_vhost_queue_status()
+		 */
+		tv_cmd->tvc_vq_desc = head;
+		/*
+		 * Dispatch tv_cmd descriptor for cmwq execution in process
+		 * context provided by tcm_vhost_workqueue.  This also ensures
+		 * tv_cmd is executed on the same kworker CPU as this vhost
+		 * thread to gain positive L2 cache locality effects..
+		 */
+		INIT_WORK(&tv_cmd->work, tcm_vhost_submission_work);
+		queue_work(tcm_vhost_workqueue, &tv_cmd->work);
+	}
+
+	mutex_unlock(&vq->mutex);
+}
+
+static void vhost_scsi_ctl_handle_kick(struct vhost_work *work)
+{
+     pr_err("%s: The handling func for control queue.\n", __func__);
+}
+
+static void vhost_scsi_evt_handle_kick(struct vhost_work *work)
+{
+     pr_err("%s: The handling func for event queue.\n", __func__);
+}
+
+static void vhost_scsi_handle_kick(struct vhost_work *work)
+{
+	struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
+						poll.work);
+	struct vhost_scsi *vs = container_of(vq->dev, struct vhost_scsi, dev);
+
+	vhost_scsi_handle_vq(vs);
+}
+
+/*
+ * Called from vhost_scsi_ioctl() context to walk the list of available tcm_vhost_tpg
+ * with an active struct tcm_vhost_nexus
+ */
+static int vhost_scsi_set_endpoint(
+	struct vhost_scsi *vs,
+	struct vhost_vring_target *t)
+{
+	struct tcm_vhost_tport *tv_tport;
+	struct tcm_vhost_tpg *tv_tpg;
+        int index;
+
+	mutex_lock(&vs->dev.mutex);
+	/* Verify that ring has been setup correctly. */
+	for (index = 0; index < vs->dev.nvqs; ++index) {
+		/* Verify that ring has been setup correctly. */
+		if (!vhost_vq_access_ok(&vs->vqs[index])) {
+		        mutex_unlock(&vs->dev.mutex);
+			return -EFAULT;
+		}
+	}
+
+	if (vs->vs_tpg) {
+		mutex_unlock(&vs->dev.mutex);
+		return -EEXIST;
+	}
+	mutex_unlock(&vs->dev.mutex);
+
+	mutex_lock(&tcm_vhost_mutex);
+	list_for_each_entry(tv_tpg, &tcm_vhost_list, tv_tpg_list) {
+		mutex_lock(&tv_tpg->tv_tpg_mutex);
+		if (!tv_tpg->tpg_nexus) {
+			mutex_unlock(&tv_tpg->tv_tpg_mutex);
+			continue;
+		}
+		if (atomic_read(&tv_tpg->tv_tpg_vhost_count)) {
+			mutex_unlock(&tv_tpg->tv_tpg_mutex);
+			continue;
+		}
+		tv_tport = tv_tpg->tport;
+
+		if (!strcmp(tv_tport->tport_name, t->vhost_wwpn) &&
+		    (tv_tpg->tport_tpgt == t->vhost_tpgt)) {
+			atomic_inc(&tv_tpg->tv_tpg_vhost_count);
+			smp_mb__after_atomic_inc();
+			mutex_unlock(&tv_tpg->tv_tpg_mutex);
+			mutex_unlock(&tcm_vhost_mutex);
+
+			mutex_lock(&vs->dev.mutex);
+			vs->vs_tpg = tv_tpg;
+			atomic_inc(&vs->vhost_ref_cnt);
+			smp_mb__after_atomic_inc();
+			mutex_unlock(&vs->dev.mutex);
+			return 0;
+		}
+		mutex_unlock(&tv_tpg->tv_tpg_mutex);
+	}
+	mutex_unlock(&tcm_vhost_mutex);
+	return -EINVAL;
+}
+
+static int vhost_scsi_clear_endpoint(
+	struct vhost_scsi *vs,
+	struct vhost_vring_target *t)
+{
+	struct tcm_vhost_tport *tv_tport;
+	struct tcm_vhost_tpg *tv_tpg;
+        int index;
+
+	mutex_lock(&vs->dev.mutex);
+	/* Verify that ring has been setup correctly. */
+	for (index = 0; index < vs->dev.nvqs; ++index) {
+		if (!vhost_vq_access_ok(&vs->vqs[index])) {
+		        mutex_unlock(&vs->dev.mutex);
+			return -EFAULT;
+		}
+	}
+
+	if (!vs->vs_tpg) {
+		mutex_unlock(&vs->dev.mutex);
+		return -ENODEV;
+	}
+	tv_tpg = vs->vs_tpg;
+	tv_tport = tv_tpg->tport;
+
+	if (strcmp(tv_tport->tport_name, t->vhost_wwpn) ||
+	    (tv_tpg->tport_tpgt != t->vhost_tpgt)) {
+		mutex_unlock(&vs->dev.mutex);
+		pr_warn("tv_tport->tport_name: %s, tv_tpg->tport_tpgt: %hu"
+			" does not match t->vhost_wwpn: %s, t->vhost_tpgt: %hu\n",
+			tv_tport->tport_name, tv_tpg->tport_tpgt,
+			t->vhost_wwpn, t->vhost_tpgt);
+		return -EINVAL;
+	}
+        atomic_dec(&tv_tpg->tv_tpg_vhost_count);
+	vs->vs_tpg = NULL;
+	mutex_unlock(&vs->dev.mutex);
+
+	return 0;
+}
+
+static int vhost_scsi_open(struct inode *inode, struct file *f)
+{
+	struct vhost_scsi *s;
+	int r;
+
+	s = kzalloc(sizeof(*s), GFP_KERNEL);
+	if (!s)
+		return -ENOMEM;
+
+	vhost_work_init(&s->vs_completion_work, vhost_scsi_complete_cmd_work);
+	INIT_LIST_HEAD(&s->vs_completion_list);
+	spin_lock_init(&s->vs_completion_lock);
+
+	s->vqs[0].handle_kick = vhost_scsi_ctl_handle_kick;
+	s->vqs[1].handle_kick = vhost_scsi_evt_handle_kick;
+	s->vqs[2].handle_kick = vhost_scsi_handle_kick;
+	r = vhost_dev_init(&s->dev, s->vqs, 3);
+	if (r < 0) {
+		kfree(s);
+		return r;
+	}
+
+	f->private_data = s;
+	return 0;
+}
+
+static int vhost_scsi_release(struct inode *inode, struct file *f)
+{
+	struct vhost_scsi *s = f->private_data;
+
+        if (s->vs_tpg && s->vs_tpg->tport) {
+            struct vhost_vring_target backend;
+            memcpy(backend.vhost_wwpn, s->vs_tpg->tport->tport_name, sizeof(backend.vhost_wwpn));
+            backend.vhost_tpgt = s->vs_tpg->tport_tpgt;
+            vhost_scsi_clear_endpoint(s, &backend);
+        }
+
+	vhost_dev_cleanup(&s->dev, false);
+	kfree(s);
+	return 0;
+}
+
+static int vhost_scsi_set_features(struct vhost_scsi *vs, u64 features)
+{
+	if (features & ~VHOST_FEATURES)
+		return -EOPNOTSUPP;
+
+	mutex_lock(&vs->dev.mutex);
+	if ((features & (1 << VHOST_F_LOG_ALL)) &&
+	    !vhost_log_access_ok(&vs->dev)) {
+		mutex_unlock(&vs->dev.mutex);
+		return -EFAULT;
+	}
+	vs->dev.acked_features = features;
+	/* TODO possibly smp_wmb() and flush vqs */
+	mutex_unlock(&vs->dev.mutex);
+	return 0;
+}
+
+static long vhost_scsi_ioctl(struct file *f, unsigned int ioctl,
+				unsigned long arg)
+{
+	struct vhost_scsi *vs = f->private_data;
+	struct vhost_vring_target backend;
+	void __user *argp = (void __user *)arg;
+	u64 __user *featurep = argp;
+	u64 features;
+	int r;
+
+	switch (ioctl) {
+	case VHOST_SCSI_SET_ENDPOINT:
+		if (copy_from_user(&backend, argp, sizeof backend))
+			return -EFAULT;
+
+		return vhost_scsi_set_endpoint(vs, &backend);
+	case VHOST_SCSI_CLEAR_ENDPOINT:
+		if (copy_from_user(&backend, argp, sizeof backend))
+			return -EFAULT;
+
+		return vhost_scsi_clear_endpoint(vs, &backend);
+	case VHOST_GET_FEATURES:
+		features = VHOST_FEATURES;
+		if (copy_to_user(featurep, &features, sizeof features))
+			return -EFAULT;
+		return 0;
+	case VHOST_SET_FEATURES:
+		if (copy_from_user(&features, featurep, sizeof features))
+			return -EFAULT;
+		return vhost_scsi_set_features(vs, features);
+	default:
+		mutex_lock(&vs->dev.mutex);
+		r = vhost_dev_ioctl(&vs->dev, ioctl, arg);
+		mutex_unlock(&vs->dev.mutex);
+		return r;
+	}
+}
+
+static const struct file_operations vhost_scsi_fops = {
+	.owner          = THIS_MODULE,
+	.release        = vhost_scsi_release,
+	.unlocked_ioctl = vhost_scsi_ioctl,
+	/* TODO compat ioctl? */
+	.open           = vhost_scsi_open,
+	.llseek		= noop_llseek,
+};
+
+static struct miscdevice vhost_scsi_misc = {
+	MISC_DYNAMIC_MINOR,
+	"vhost-scsi",
+	&vhost_scsi_fops,
+};
+
+static int __init vhost_scsi_register(void)
+{
+	return misc_register(&vhost_scsi_misc);
+}
+
+static int vhost_scsi_deregister(void)
+{
+	return misc_deregister(&vhost_scsi_misc);
+}
+
+static char *tcm_vhost_dump_proto_id(struct tcm_vhost_tport *tport)
+{
+	switch (tport->tport_proto_id) {
+	case SCSI_PROTOCOL_SAS:
+		return "SAS";
+	case SCSI_PROTOCOL_FCP:
+		return "FCP";
+	case SCSI_PROTOCOL_ISCSI:
+		return "iSCSI";
+	default:
+		break;
+	}
+
+	return "Unknown";
+}
+
+static int tcm_vhost_port_link(
+	struct se_portal_group *se_tpg,
+	struct se_lun *lun)
+{
+	struct tcm_vhost_tpg *tv_tpg = container_of(se_tpg,
+				struct tcm_vhost_tpg, se_tpg);
+
+	atomic_inc(&tv_tpg->tv_tpg_port_count);
+	smp_mb__after_atomic_inc();
+
+	return 0;
+}
+
+static void tcm_vhost_port_unlink(
+	struct se_portal_group *se_tpg,
+	struct se_lun *se_lun)
+{
+	struct tcm_vhost_tpg *tv_tpg = container_of(se_tpg,
+				struct tcm_vhost_tpg, se_tpg);
+
+	atomic_dec(&tv_tpg->tv_tpg_port_count);
+	smp_mb__after_atomic_dec();
+}
+
+static struct se_node_acl *tcm_vhost_make_nodeacl(
+	struct se_portal_group *se_tpg,
+	struct config_group *group,
+	const char *name)
+{
+	struct se_node_acl *se_nacl, *se_nacl_new;
+	struct tcm_vhost_nacl *nacl;
+	u64 wwpn = 0;
+	u32 nexus_depth;
+
+	/* tcm_vhost_parse_wwn(name, &wwpn, 1) < 0)
+		return ERR_PTR(-EINVAL); */
+	se_nacl_new = tcm_vhost_alloc_fabric_acl(se_tpg);
+	if (!se_nacl_new)
+		return ERR_PTR(-ENOMEM);
+//#warning FIXME: Hardcoded nexus depth in tcm_vhost_make_nodeacl()
+	nexus_depth = 1;
+	/*
+	 * se_nacl_new may be released by core_tpg_add_initiator_node_acl()
+	 * when converting a NodeACL from demo mode -> explict
+	 */
+	se_nacl = core_tpg_add_initiator_node_acl(se_tpg, se_nacl_new,
+				name, nexus_depth);
+	if (IS_ERR(se_nacl)) {
+		tcm_vhost_release_fabric_acl(se_tpg, se_nacl_new);
+		return se_nacl;
+	}
+	/*
+	 * Locate our struct tcm_vhost_nacl and set the FC Nport WWPN
+	 */
+	nacl = container_of(se_nacl, struct tcm_vhost_nacl, se_node_acl);
+	nacl->iport_wwpn = wwpn;
+	/* tcm_vhost_format_wwn(&nacl->iport_name[0], TCM_VHOST_NAMELEN, wwpn); */
+
+	return se_nacl;
+}
+
+static void tcm_vhost_drop_nodeacl(struct se_node_acl *se_acl)
+{
+	struct tcm_vhost_nacl *nacl = container_of(se_acl,
+				struct tcm_vhost_nacl, se_node_acl);
+	core_tpg_del_initiator_node_acl(se_acl->se_tpg, se_acl, 1);
+	kfree(nacl);
+}
+
+static int tcm_vhost_make_nexus(
+	struct tcm_vhost_tpg *tv_tpg,
+	const char *name)
+{
+	struct se_portal_group *se_tpg;
+	struct tcm_vhost_nexus *tv_nexus;
+
+	mutex_lock(&tv_tpg->tv_tpg_mutex);
+	if (tv_tpg->tpg_nexus) {
+		mutex_unlock(&tv_tpg->tv_tpg_mutex);
+		pr_debug("tv_tpg->tpg_nexus already exists\n");
+		return -EEXIST;
+	}
+	se_tpg = &tv_tpg->se_tpg;
+
+	tv_nexus = kzalloc(sizeof(struct tcm_vhost_nexus), GFP_KERNEL);
+	if (!tv_nexus) {
+		mutex_unlock(&tv_tpg->tv_tpg_mutex);
+		pr_err("Unable to allocate struct tcm_vhost_nexus\n");
+		return -ENOMEM;
+	}
+	/*
+	 *  Initialize the struct se_session pointer
+	 */
+	tv_nexus->tvn_se_sess = transport_init_session();
+	if (IS_ERR(tv_nexus->tvn_se_sess)) {
+		mutex_unlock(&tv_tpg->tv_tpg_mutex);
+		kfree(tv_nexus);
+		return -ENOMEM;
+	}
+	/*
+	 * Since we are running in 'demo mode' this call with generate a
+	 * struct se_node_acl for the tcm_vhost struct se_portal_group with
+	 * the SCSI Initiator port name of the passed configfs group 'name'.
+	 */
+	tv_nexus->tvn_se_sess->se_node_acl = core_tpg_check_initiator_node_acl(
+				se_tpg, (unsigned char *)name);
+	if (!tv_nexus->tvn_se_sess->se_node_acl) {
+		mutex_unlock(&tv_tpg->tv_tpg_mutex);
+		pr_debug("core_tpg_check_initiator_node_acl() failed"
+				" for %s\n", name);
+		transport_free_session(tv_nexus->tvn_se_sess);
+		kfree(tv_nexus);
+		return -ENOMEM;
+	}
+	/*
+	 * Now register the TCM vHost virtual I_T Nexus as active with the
+	 * call to __transport_register_session()
+	 */
+	__transport_register_session(se_tpg, tv_nexus->tvn_se_sess->se_node_acl,
+			tv_nexus->tvn_se_sess, tv_nexus);
+	tv_tpg->tpg_nexus = tv_nexus;
+
+	mutex_unlock(&tv_tpg->tv_tpg_mutex);
+	return 0;
+}
+
+static int tcm_vhost_drop_nexus(
+	struct tcm_vhost_tpg *tpg)
+{
+	struct se_session *se_sess;
+	struct tcm_vhost_nexus *tv_nexus;
+
+	mutex_lock(&tpg->tv_tpg_mutex);
+	tv_nexus = tpg->tpg_nexus;
+	if (!tv_nexus) {
+		mutex_unlock(&tpg->tv_tpg_mutex);
+		return -ENODEV;
+	}
+
+	se_sess = tv_nexus->tvn_se_sess;
+	if (!se_sess) {
+		mutex_unlock(&tpg->tv_tpg_mutex);
+		return -ENODEV;
+	}
+
+	if (atomic_read(&tpg->tv_tpg_port_count)) {
+		mutex_unlock(&tpg->tv_tpg_mutex);
+		pr_err("Unable to remove TCM_vHost I_T Nexus with"
+			" active TPG port count: %d\n",
+			atomic_read(&tpg->tv_tpg_port_count));
+		return -EPERM;
+	}
+
+	if (atomic_read(&tpg->tv_tpg_vhost_count)) {
+		pr_err("Unable to remove TCM_vHost I_T Nexus with"
+			" active TPG vhost count: %d\n",
+			atomic_read(&tpg->tv_tpg_vhost_count));
+		return -EPERM;
+	}
+
+	pr_debug("TCM_vHost_ConfigFS: Removing I_T Nexus to emulated"
+		" %s Initiator Port: %s\n", tcm_vhost_dump_proto_id(tpg->tport),
+		tv_nexus->tvn_se_sess->se_node_acl->initiatorname);
+	/*
+	 * Release the SCSI I_T Nexus to the emulated vHost Target Port
+	 */
+	transport_deregister_session(tv_nexus->tvn_se_sess);
+	tpg->tpg_nexus = NULL;
+	mutex_unlock(&tpg->tv_tpg_mutex);
+
+	kfree(tv_nexus);
+	return 0;
+}
+
+static ssize_t tcm_vhost_tpg_show_nexus(
+	struct se_portal_group *se_tpg,
+	char *page)
+{
+	struct tcm_vhost_tpg *tv_tpg = container_of(se_tpg,
+				struct tcm_vhost_tpg, se_tpg);
+	struct tcm_vhost_nexus *tv_nexus;
+	ssize_t ret;
+
+	mutex_lock(&tv_tpg->tv_tpg_mutex);
+	tv_nexus = tv_tpg->tpg_nexus;
+	if (!tv_nexus) {
+		mutex_unlock(&tv_tpg->tv_tpg_mutex);
+		return -ENODEV;
+	}
+	ret = snprintf(page, PAGE_SIZE, "%s\n",
+			tv_nexus->tvn_se_sess->se_node_acl->initiatorname);
+	mutex_unlock(&tv_tpg->tv_tpg_mutex);
+
+	return ret;
+}
+
+static ssize_t tcm_vhost_tpg_store_nexus(
+	struct se_portal_group *se_tpg,
+	const char *page,
+	size_t count)
+{
+	struct tcm_vhost_tpg *tv_tpg = container_of(se_tpg,
+				struct tcm_vhost_tpg, se_tpg);
+	struct tcm_vhost_tport *tport_wwn = tv_tpg->tport;
+	unsigned char i_port[TCM_VHOST_NAMELEN], *ptr, *port_ptr;
+	int ret;
+	/*
+	 * Shutdown the active I_T nexus if 'NULL' is passed..
+	 */
+	if (!strncmp(page, "NULL", 4)) {
+		ret = tcm_vhost_drop_nexus(tv_tpg);
+		return (!ret) ? count : ret;
+	}
+	/*
+	 * Otherwise make sure the passed virtual Initiator port WWN matches
+	 * the fabric protocol_id set in tcm_vhost_make_tport(), and call
+	 * tcm_vhost_make_nexus().
+	 */
+	if (strlen(page) > TCM_VHOST_NAMELEN) {
+		pr_err("Emulated NAA Sas Address: %s, exceeds"
+				" max: %d\n", page, TCM_VHOST_NAMELEN);
+		return -EINVAL;
+	}
+	snprintf(&i_port[0], TCM_VHOST_NAMELEN, "%s", page);
+
+	ptr = strstr(i_port, "naa.");
+	if (ptr) {
+		if (tport_wwn->tport_proto_id != SCSI_PROTOCOL_SAS) {
+			pr_err("Passed SAS Initiator Port %s does not"
+				" match target port protoid: %s\n", i_port,
+				tcm_vhost_dump_proto_id(tport_wwn));
+			return -EINVAL;
+		}
+		port_ptr = &i_port[0];
+		goto check_newline;
+	}
+	ptr = strstr(i_port, "fc.");
+	if (ptr) {
+		if (tport_wwn->tport_proto_id != SCSI_PROTOCOL_FCP) {
+			pr_err("Passed FCP Initiator Port %s does not"
+				" match target port protoid: %s\n", i_port,
+				tcm_vhost_dump_proto_id(tport_wwn));
+			return -EINVAL;
+		}
+		port_ptr = &i_port[3]; /* Skip over "fc." */
+		goto check_newline;
+	}
+	ptr = strstr(i_port, "iqn.");
+	if (ptr) {
+		if (tport_wwn->tport_proto_id != SCSI_PROTOCOL_ISCSI) {
+			pr_err("Passed iSCSI Initiator Port %s does not"
+				" match target port protoid: %s\n", i_port,
+				tcm_vhost_dump_proto_id(tport_wwn));
+			return -EINVAL;
+		}
+		port_ptr = &i_port[0];
+		goto check_newline;
+	}
+	pr_err("Unable to locate prefix for emulated Initiator Port:"
+			" %s\n", i_port);
+	return -EINVAL;
+	/*
+	 * Clear any trailing newline for the NAA WWN
+	 */
+check_newline:
+	if (i_port[strlen(i_port)-1] == '\n')
+		i_port[strlen(i_port)-1] = '\0';
+
+	ret = tcm_vhost_make_nexus(tv_tpg, port_ptr);
+	if (ret < 0)
+		return ret;
+
+	return count;
+}
+
+TF_TPG_BASE_ATTR(tcm_vhost, nexus, S_IRUGO | S_IWUSR);
+
+static struct configfs_attribute *tcm_vhost_tpg_attrs[] = {
+	&tcm_vhost_tpg_nexus.attr,
+	NULL,
+};
+
+static struct se_portal_group *tcm_vhost_make_tpg(
+	struct se_wwn *wwn,
+	struct config_group *group,
+	const char *name)
+{
+	struct tcm_vhost_tport*tport = container_of(wwn,
+			struct tcm_vhost_tport, tport_wwn);
+
+	struct tcm_vhost_tpg *tpg;
+	unsigned long tpgt;
+	int ret;
+
+	if (strstr(name, "tpgt_") != name)
+		return ERR_PTR(-EINVAL);
+	if (strict_strtoul(name + 5, 10, &tpgt) || tpgt > UINT_MAX)
+		return ERR_PTR(-EINVAL);
+
+	tpg = kzalloc(sizeof(struct tcm_vhost_tpg), GFP_KERNEL);
+	if (!tpg) {
+		pr_err("Unable to allocate struct tcm_vhost_tpg");
+		return ERR_PTR(-ENOMEM);
+	}
+	mutex_init(&tpg->tv_tpg_mutex);
+	INIT_LIST_HEAD(&tpg->tv_tpg_list);
+	tpg->tport = tport;
+	tpg->tport_tpgt = tpgt;
+
+	ret = core_tpg_register(&tcm_vhost_fabric_configfs->tf_ops, wwn,
+				&tpg->se_tpg, tpg, TRANSPORT_TPG_TYPE_NORMAL);
+	if (ret < 0) {
+		kfree(tpg);
+		return NULL;
+	}
+	mutex_lock(&tcm_vhost_mutex);
+	list_add_tail(&tpg->tv_tpg_list, &tcm_vhost_list);
+	mutex_unlock(&tcm_vhost_mutex);
+
+	return &tpg->se_tpg;
+}
+
+static void tcm_vhost_drop_tpg(struct se_portal_group *se_tpg)
+{
+	struct tcm_vhost_tpg *tpg = container_of(se_tpg,
+				struct tcm_vhost_tpg, se_tpg);
+
+	mutex_lock(&tcm_vhost_mutex);
+	list_del(&tpg->tv_tpg_list);
+	mutex_unlock(&tcm_vhost_mutex);
+	/*
+	 * Release the virtual I_T Nexus for this vHost TPG
+	 */
+	tcm_vhost_drop_nexus(tpg);
+	/*
+	 * Deregister the se_tpg from TCM..
+	 */
+	core_tpg_deregister(se_tpg);
+	kfree(tpg);
+}
+
+static struct se_wwn *tcm_vhost_make_tport(
+	struct target_fabric_configfs *tf,
+	struct config_group *group,
+	const char *name)
+{
+	struct tcm_vhost_tport *tport;
+	char *ptr;
+	u64 wwpn = 0;
+	int off = 0;
+
+	/* if (tcm_vhost_parse_wwn(name, &wwpn, 1) < 0)
+		return ERR_PTR(-EINVAL); */
+
+	tport = kzalloc(sizeof(struct tcm_vhost_tport), GFP_KERNEL);
+	if (!tport) {
+		pr_err("Unable to allocate struct tcm_vhost_tport");
+		return ERR_PTR(-ENOMEM);
+	}
+	tport->tport_wwpn = wwpn;
+	/* tcm_vhost_format_wwn(&tport->tport_name[0], TCM_VHOST__NAMELEN, wwpn); */
+	/*
+	 * Determine the emulated Protocol Identifier and Target Port Name
+	 * based on the incoming configfs directory name.
+	 */
+	ptr = strstr(name, "naa.");
+	if (ptr) {
+		tport->tport_proto_id = SCSI_PROTOCOL_SAS;
+		goto check_len;
+	}
+	ptr = strstr(name, "fc.");
+	if (ptr) {
+		tport->tport_proto_id = SCSI_PROTOCOL_FCP;
+		off = 3; /* Skip over "fc." */
+		goto check_len;
+	}
+	ptr = strstr(name, "iqn.");
+	if (ptr) {
+		tport->tport_proto_id = SCSI_PROTOCOL_ISCSI;
+		goto check_len;
+	}
+
+	pr_err("Unable to locate prefix for emulated Target Port:"
+			" %s\n", name);
+	return ERR_PTR(-EINVAL);
+
+check_len:
+	if (strlen(name) > TCM_VHOST_NAMELEN) {
+		pr_err("Emulated %s Address: %s, exceeds"
+			" max: %d\n", name, tcm_vhost_dump_proto_id(tport),
+			TCM_VHOST_NAMELEN);
+		kfree(tport);
+		return ERR_PTR(-EINVAL);
+	}
+	snprintf(&tport->tport_name[0], TCM_VHOST_NAMELEN, "%s", &name[off]);
+
+	pr_debug("TCM_VHost_ConfigFS: Allocated emulated Target"
+		" %s Address: %s\n", tcm_vhost_dump_proto_id(tport), name);
+
+	return &tport->tport_wwn;
+}
+
+static void tcm_vhost_drop_tport(struct se_wwn *wwn)
+{
+	struct tcm_vhost_tport *tport = container_of(wwn,
+				struct tcm_vhost_tport, tport_wwn);
+
+	pr_debug("TCM_VHost_ConfigFS: Deallocating emulated Target"
+		" %s Address: %s\n", tcm_vhost_dump_proto_id(tport),
+		tport->tport_name);;
+
+	kfree(tport);
+}
+
+static ssize_t tcm_vhost_wwn_show_attr_version(
+	struct target_fabric_configfs *tf,
+	char *page)
+{
+	return sprintf(page, "TCM_VHOST fabric module %s on %s/%s"
+		"on "UTS_RELEASE"\n", TCM_VHOST_VERSION, utsname()->sysname,
+		utsname()->machine);
+}
+
+TF_WWN_ATTR_RO(tcm_vhost, version);
+
+static struct configfs_attribute *tcm_vhost_wwn_attrs[] = {
+	&tcm_vhost_wwn_version.attr,
+	NULL,
+};
+
+static struct target_core_fabric_ops tcm_vhost_ops = {
+	.get_fabric_name		= tcm_vhost_get_fabric_name,
+	.get_fabric_proto_ident		= tcm_vhost_get_fabric_proto_ident,
+	.tpg_get_wwn			= tcm_vhost_get_fabric_wwn,
+	.tpg_get_tag			= tcm_vhost_get_tag,
+	.tpg_get_default_depth		= tcm_vhost_get_default_depth,
+	.tpg_get_pr_transport_id	= tcm_vhost_get_pr_transport_id,
+	.tpg_get_pr_transport_id_len	= tcm_vhost_get_pr_transport_id_len,
+	.tpg_parse_pr_out_transport_id	= tcm_vhost_parse_pr_out_transport_id,
+	.tpg_check_demo_mode		= tcm_vhost_check_true,
+	.tpg_check_demo_mode_cache	= tcm_vhost_check_true,
+	.tpg_check_demo_mode_write_protect = tcm_vhost_check_false,
+	.tpg_check_prod_mode_write_protect = tcm_vhost_check_false,
+	.tpg_alloc_fabric_acl		= tcm_vhost_alloc_fabric_acl,
+	.tpg_release_fabric_acl		= tcm_vhost_release_fabric_acl,
+	.tpg_get_inst_index		= tcm_vhost_tpg_get_inst_index,
+	.release_cmd			= tcm_vhost_release_cmd,
+	.shutdown_session		= tcm_vhost_shutdown_session,
+	.close_session			= tcm_vhost_close_session,
+	.sess_get_index			= tcm_vhost_sess_get_index,
+	.sess_get_initiator_sid		= NULL,
+	.write_pending			= tcm_vhost_write_pending,
+	.write_pending_status		= tcm_vhost_write_pending_status,
+	.set_default_node_attributes	= tcm_vhost_set_default_node_attrs,
+	.get_task_tag			= tcm_vhost_get_task_tag,
+	.get_cmd_state			= tcm_vhost_get_cmd_state,
+	.queue_data_in			= tcm_vhost_queue_data_in,
+	.queue_status			= tcm_vhost_queue_status,
+	.queue_tm_rsp			= tcm_vhost_queue_tm_rsp,
+	.get_fabric_sense_len		= tcm_vhost_get_fabric_sense_len,
+	.set_fabric_sense_len		= tcm_vhost_set_fabric_sense_len,
+	/*
+	 * Setup function pointers for generic logic in target_core_fabric_configfs.c
+	 */
+	.fabric_make_wwn		= tcm_vhost_make_tport,
+	.fabric_drop_wwn		= tcm_vhost_drop_tport,
+	.fabric_make_tpg		= tcm_vhost_make_tpg,
+	.fabric_drop_tpg		= tcm_vhost_drop_tpg,
+	.fabric_post_link		= tcm_vhost_port_link,
+	.fabric_pre_unlink		= tcm_vhost_port_unlink,
+	.fabric_make_np			= NULL,
+	.fabric_drop_np			= NULL,
+	.fabric_make_nodeacl		= tcm_vhost_make_nodeacl,
+	.fabric_drop_nodeacl		= tcm_vhost_drop_nodeacl,
+};
+
+static int tcm_vhost_register_configfs(void)
+{
+	struct target_fabric_configfs *fabric;
+	int ret;
+
+	pr_debug("TCM_VHOST fabric module %s on %s/%s"
+		" on "UTS_RELEASE"\n",TCM_VHOST_VERSION, utsname()->sysname,
+		utsname()->machine);
+	/*
+	 * Register the top level struct config_item_type with TCM core
+	 */
+	fabric = target_fabric_configfs_init(THIS_MODULE, "vhost");
+	if (IS_ERR(fabric)) {
+		pr_err("target_fabric_configfs_init() failed\n");
+		return PTR_ERR(fabric);
+	}
+	/*
+	 * Setup fabric->tf_ops from our local tcm_vhost_ops
+	 */
+	fabric->tf_ops = tcm_vhost_ops;
+	/*
+	 * Setup default attribute lists for various fabric->tf_cit_tmpl
+	 */
+	TF_CIT_TMPL(fabric)->tfc_wwn_cit.ct_attrs = tcm_vhost_wwn_attrs;
+	TF_CIT_TMPL(fabric)->tfc_tpg_base_cit.ct_attrs = tcm_vhost_tpg_attrs;
+	TF_CIT_TMPL(fabric)->tfc_tpg_attrib_cit.ct_attrs = NULL;
+	TF_CIT_TMPL(fabric)->tfc_tpg_param_cit.ct_attrs = NULL;
+	TF_CIT_TMPL(fabric)->tfc_tpg_np_base_cit.ct_attrs = NULL;
+	TF_CIT_TMPL(fabric)->tfc_tpg_nacl_base_cit.ct_attrs = NULL;
+	TF_CIT_TMPL(fabric)->tfc_tpg_nacl_attrib_cit.ct_attrs = NULL;
+	TF_CIT_TMPL(fabric)->tfc_tpg_nacl_auth_cit.ct_attrs = NULL;
+	TF_CIT_TMPL(fabric)->tfc_tpg_nacl_param_cit.ct_attrs = NULL;
+	/*
+	 * Register the fabric for use within TCM
+	 */
+	ret = target_fabric_configfs_register(fabric);
+	if (ret < 0) {
+		pr_err("target_fabric_configfs_register() failed"
+				" for TCM_VHOST\n");
+		return ret;
+	}
+	/*
+	 * Setup our local pointer to *fabric
+	 */
+	tcm_vhost_fabric_configfs = fabric;
+	pr_debug("TCM_VHOST[0] - Set fabric -> tcm_vhost_fabric_configfs\n");
+	return 0;
+};
+
+static void tcm_vhost_deregister_configfs(void)
+{
+	if (!tcm_vhost_fabric_configfs)
+		return;
+
+	target_fabric_configfs_deregister(tcm_vhost_fabric_configfs);
+	tcm_vhost_fabric_configfs = NULL;
+	pr_debug("TCM_VHOST[0] - Cleared tcm_vhost_fabric_configfs\n");
+};
+
+static int __init tcm_vhost_init(void)
+{
+	int ret = -ENOMEM;
+
+	tcm_vhost_workqueue = alloc_workqueue("tcm_vhost", 0, 0);
+	if (!tcm_vhost_workqueue)
+		goto out;
+
+	ret = vhost_scsi_register();
+	if (ret < 0)
+		goto out_destroy_workqueue;
+
+	ret = tcm_vhost_register_configfs();
+	if (ret < 0)
+		goto out_vhost_scsi_deregister;
+
+	return 0;
+
+out_vhost_scsi_deregister:
+	vhost_scsi_deregister();
+out_destroy_workqueue:
+	destroy_workqueue(tcm_vhost_workqueue);
+out:
+	return ret;
+};
+
+static void tcm_vhost_exit(void)
+{
+	tcm_vhost_deregister_configfs();
+	vhost_scsi_deregister();
+	destroy_workqueue(tcm_vhost_workqueue);
+};
+
+MODULE_DESCRIPTION("TCM_VHOST series fabric driver");
+MODULE_LICENSE("GPL");
+module_init(tcm_vhost_init);
+module_exit(tcm_vhost_exit);
diff --git a/drivers/vhost/tcm_vhost.h b/drivers/vhost/tcm_vhost.h
new file mode 100644
index 0000000..9d6cace
--- /dev/null
+++ b/drivers/vhost/tcm_vhost.h
@@ -0,0 +1,74 @@
+#define TCM_VHOST_VERSION  "v0.1"
+#define TCM_VHOST_NAMELEN 256
+#define TCM_VHOST_MAX_CDB_SIZE 32
+
+struct tcm_vhost_cmd {
+	/* Descriptor from vhost_get_vq_desc() for virt_queue segment */
+	int tvc_vq_desc;
+	/* The Tag from include/linux/virtio_scsi.h:struct virtio_scsi_cmd_req */
+	u64 tvc_tag;
+	/* The number of scatterlists associated with this cmd */
+	u32 tvc_sgl_count;
+	/* Saved unpacked SCSI LUN for tcm_vhost_submission_work() */
+	u32 tvc_lun;
+	/* Pointer to the SGL formatted memory from virtio-scsi */
+	struct scatterlist *tvc_sgl;
+	/* Pointer to response */
+	struct virtio_scsi_cmd_resp __user *tvc_resp;
+	/* Pointer to vhost_scsi for our device */
+	struct vhost_scsi *tvc_vhost;
+	/* The TCM I/O descriptor that is accessed via container_of() */
+	struct se_cmd tvc_se_cmd;
+	/* work item used for cmwq dispatch to tcm_vhost_submission_work() */
+	struct work_struct work;
+	/* Copy of the incoming SCSI command descriptor block (CDB) */
+	unsigned char tvc_cdb[TCM_VHOST_MAX_CDB_SIZE];
+	/* Sense buffer that will be mapped into outgoing status */
+	unsigned char tvc_sense_buf[TRANSPORT_SENSE_BUFFER];
+	/* Completed commands list, serviced from vhost worker thread */
+	struct list_head tvc_completion_list;
+};
+
+struct tcm_vhost_nexus {
+	/* Pointer to TCM session for I_T Nexus */
+	struct se_session *tvn_se_sess;
+};
+
+struct tcm_vhost_nacl {
+	/* Binary World Wide unique Port Name for Vhost Initiator port */
+	u64 iport_wwpn;
+	/* ASCII formatted WWPN for Sas Initiator port */
+	char iport_name[TCM_VHOST_NAMELEN];
+	/* Returned by tcm_vhost_make_nodeacl() */
+	struct se_node_acl se_node_acl;
+};
+
+struct tcm_vhost_tpg {
+	/* Vhost port target portal group tag for TCM */
+	u16 tport_tpgt;
+	/* Used to track number of TPG Port/Lun Links wrt to explict I_T Nexus shutdown */
+	atomic_t tv_tpg_port_count;
+	/* Used for vhost_scsi device reference to tpg_nexus */
+	atomic_t tv_tpg_vhost_count;
+	/* list for tcm_vhost_list */
+	struct list_head tv_tpg_list;
+	/* Used to protect access for tpg_nexus */
+	struct mutex tv_tpg_mutex;
+	/* Pointer to the TCM VHost I_T Nexus for this TPG endpoint */
+	struct tcm_vhost_nexus *tpg_nexus;
+	/* Pointer back to tcm_vhost_tport */
+	struct tcm_vhost_tport *tport;
+	/* Returned by tcm_vhost_make_tpg() */
+	struct se_portal_group se_tpg;
+};
+
+struct tcm_vhost_tport {
+	/* SCSI protocol the tport is providing */
+	u8 tport_proto_id;
+	/* Binary World Wide unique Port Name for Vhost Target port */
+	u64 tport_wwpn;
+	/* ASCII formatted WWPN for Vhost Target port */
+	char tport_name[TCM_VHOST_NAMELEN];
+	/* Returned by tcm_vhost_make_tport() */
+	struct se_wwn tport_wwn;
+};
-- 
1.7.2.5


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC-v2 4/4] tcm_vhost: Initial merge for vhost level target fabric driver
  2012-07-11 21:15 [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6 Nicholas A. Bellinger
                   ` (5 preceding siblings ...)
  2012-07-11 21:15 ` Nicholas A. Bellinger
@ 2012-07-11 21:15 ` Nicholas A. Bellinger
  2012-07-11 21:15 ` Nicholas A. Bellinger
  2012-07-17 15:05 ` [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6 Michael S. Tsirkin
  8 siblings, 0 replies; 48+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-11 21:15 UTC (permalink / raw)
  To: target-devel
  Cc: Jens Axboe, Stefan Hajnoczi, kvm-devel, Michael S. Tsirkin,
	Zhi Yong Wu, Anthony Liguori, linux-scsi, Paolo Bonzini, lf-virt,
	Christoph Hellwig

From: Nicholas Bellinger <nab@linux-iscsi.org>

This patch adds the initial code for tcm_vhost, a Vhost level TCM
fabric driver for virtio SCSI initiators into KVM guest.

This code is currently up and running on v3.5-rc2 host+guest along
with the virtio-scsi vdev->scan() patch to allow a proper
scsi_scan_host() to occur once the tcm_vhost nexus has been established
by the paravirtualized virtio-scsi client here:

virtio-scsi: Add vdrv->scan for post VIRTIO_CONFIG_S_DRIVER_OK LUN scanning
http://marc.info/?l=linux-scsi&m=134160609212542&w=2

Using tcm_vhost requires Zhi's -> Stefan's qemu vhost-scsi tree here:

https://github.com/wuzhy/qemu/tree/vhost-scsi

along with the recent QEMU patch to hw/virtio-scsi.c to set max_target=0
during vhost-scsi operation.

Changelog v1 -> v2:

  Fix tv_cmd completion -> release SGL memory leak
  Fix sparse warnings for static variable usage
  Fix sparse warnings for min() typing + printk format specs
  Convert to cmwq submission for I/O dispatch

Changelog v0 -> v1:

  Merge into single source + header file, and move to drivers/vhost/

Cc: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Cc: Zhi Yong Wu <wuzhy@cn.ibm.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
---
 drivers/vhost/Kconfig     |    6 +
 drivers/vhost/Makefile    |    1 +
 drivers/vhost/tcm_vhost.c | 1609 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/vhost/tcm_vhost.h |   74 ++
 4 files changed, 1690 insertions(+), 0 deletions(-)
 create mode 100644 drivers/vhost/tcm_vhost.c
 create mode 100644 drivers/vhost/tcm_vhost.h

diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
index e4e2fd1..ccbeb6f 100644
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -9,3 +9,9 @@ config VHOST_NET
 	  To compile this driver as a module, choose M here: the module will
 	  be called vhost_net.
 
+config TCM_VHOST
+	tristate "TCM_VHOST fabric module (EXPERIMENTAL)"
+	depends on TARGET_CORE && EVENTFD && EXPERIMENTAL && m
+	default n
+	---help---
+	Say M here to enable the TCM_VHOST fabric module for use with virtio-scsi guests
diff --git a/drivers/vhost/Makefile b/drivers/vhost/Makefile
index 72dd020..b10c7b1 100644
--- a/drivers/vhost/Makefile
+++ b/drivers/vhost/Makefile
@@ -1,2 +1,3 @@
 obj-$(CONFIG_VHOST_NET) += vhost_net.o
+obj-$(CONFIG_TCM_VHOST) += tcm_vhost.o
 vhost_net-y := vhost.o net.o
diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
new file mode 100644
index 0000000..da0b8ac
--- /dev/null
+++ b/drivers/vhost/tcm_vhost.c
@@ -0,0 +1,1609 @@
+/*******************************************************************************
+ * Vhost kernel TCM fabric driver for virtio SCSI initiators
+ *
+ * (C) Copyright 2010-2012 RisingTide Systems LLC.
+ * (C) Copyright 2010-2012 IBM Corp.
+ *
+ * Licensed to the Linux Foundation under the General Public License (GPL) version 2.
+ *
+ * Authors: Nicholas A. Bellinger <nab@risingtidesystems.com>
+ *          Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ ****************************************************************************/
+
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <generated/utsrelease.h>
+#include <linux/utsname.h>
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/kthread.h>
+#include <linux/types.h>
+#include <linux/string.h>
+#include <linux/configfs.h>
+#include <linux/ctype.h>
+#include <linux/compat.h>
+#include <linux/eventfd.h>
+#include <linux/vhost.h>
+#include <linux/fs.h>
+#include <linux/miscdevice.h>
+#include <asm/unaligned.h>
+#include <scsi/scsi.h>
+#include <scsi/scsi_tcq.h>
+#include <target/target_core_base.h>
+#include <target/target_core_fabric.h>
+#include <target/target_core_fabric_configfs.h>
+#include <target/target_core_configfs.h>
+#include <target/configfs_macros.h>
+#include <linux/vhost.h>
+#include <linux/virtio_net.h> /* TODO vhost.h currently depends on this */
+#include <linux/virtio_scsi.h>
+
+#include "vhost.c"
+#include "vhost.h"
+#include "tcm_vhost.h"
+
+struct vhost_scsi {
+	atomic_t vhost_ref_cnt;
+	struct tcm_vhost_tpg *vs_tpg;
+	struct vhost_dev dev;
+	struct vhost_virtqueue vqs[3];
+
+	struct vhost_work vs_completion_work; /* cmd completion work item */
+	struct list_head vs_completion_list;  /* cmd completion queue */
+	spinlock_t vs_completion_lock;        /* protects s_completion_list */
+};
+
+/* Local pointer to allocated TCM configfs fabric module */
+static struct target_fabric_configfs *tcm_vhost_fabric_configfs;
+
+static struct workqueue_struct *tcm_vhost_workqueue;
+
+/* Global spinlock to protect tcm_vhost TPG list for vhost IOCTL access */
+static DEFINE_MUTEX(tcm_vhost_mutex);
+static LIST_HEAD(tcm_vhost_list);
+
+static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
+{
+	return 1;
+}
+
+static int tcm_vhost_check_false(struct se_portal_group *se_tpg)
+{
+	return 0;
+}
+
+static char *tcm_vhost_get_fabric_name(void)
+{
+	return "vhost";
+}
+
+static u8 tcm_vhost_get_fabric_proto_ident(struct se_portal_group *se_tpg)
+{
+	struct tcm_vhost_tpg *tpg = container_of(se_tpg,
+				struct tcm_vhost_tpg, se_tpg);
+	struct tcm_vhost_tport *tport = tpg->tport;
+
+	switch (tport->tport_proto_id) {
+	case SCSI_PROTOCOL_SAS:
+		return sas_get_fabric_proto_ident(se_tpg);
+	case SCSI_PROTOCOL_FCP:
+		return fc_get_fabric_proto_ident(se_tpg);
+	case SCSI_PROTOCOL_ISCSI:
+		return iscsi_get_fabric_proto_ident(se_tpg);
+	default:
+		pr_err("Unknown tport_proto_id: 0x%02x, using"
+			" SAS emulation\n", tport->tport_proto_id);
+		break;
+	}
+
+	return sas_get_fabric_proto_ident(se_tpg);
+}
+
+static char *tcm_vhost_get_fabric_wwn(struct se_portal_group *se_tpg)
+{
+	struct tcm_vhost_tpg *tpg = container_of(se_tpg,
+				struct tcm_vhost_tpg, se_tpg);
+	struct tcm_vhost_tport *tport = tpg->tport;
+
+	return &tport->tport_name[0];
+}
+
+static u16 tcm_vhost_get_tag(struct se_portal_group *se_tpg)
+{
+	struct tcm_vhost_tpg *tpg = container_of(se_tpg,
+				struct tcm_vhost_tpg, se_tpg);
+	return tpg->tport_tpgt;
+}
+
+static u32 tcm_vhost_get_default_depth(struct se_portal_group *se_tpg)
+{
+	return 1;
+}
+
+static u32 tcm_vhost_get_pr_transport_id(
+	struct se_portal_group *se_tpg,
+	struct se_node_acl *se_nacl,
+	struct t10_pr_registration *pr_reg,
+	int *format_code,
+	unsigned char *buf)
+{
+	struct tcm_vhost_tpg *tpg = container_of(se_tpg,
+				struct tcm_vhost_tpg, se_tpg);
+	struct tcm_vhost_tport *tport = tpg->tport;
+
+	switch (tport->tport_proto_id) {
+	case SCSI_PROTOCOL_SAS:
+		return sas_get_pr_transport_id(se_tpg, se_nacl, pr_reg,
+					format_code, buf);
+	case SCSI_PROTOCOL_FCP:
+		return fc_get_pr_transport_id(se_tpg, se_nacl, pr_reg,
+					format_code, buf);
+	case SCSI_PROTOCOL_ISCSI:
+		return iscsi_get_pr_transport_id(se_tpg, se_nacl, pr_reg,
+					format_code, buf);
+	default:
+		pr_err("Unknown tport_proto_id: 0x%02x, using"
+			" SAS emulation\n", tport->tport_proto_id);
+		break;
+	}
+
+	return sas_get_pr_transport_id(se_tpg, se_nacl, pr_reg,
+			format_code, buf);
+}
+
+static u32 tcm_vhost_get_pr_transport_id_len(
+	struct se_portal_group *se_tpg,
+	struct se_node_acl *se_nacl,
+	struct t10_pr_registration *pr_reg,
+	int *format_code)
+{
+	struct tcm_vhost_tpg *tpg = container_of(se_tpg,
+				struct tcm_vhost_tpg, se_tpg);
+	struct tcm_vhost_tport *tport = tpg->tport;
+
+	switch (tport->tport_proto_id) {
+	case SCSI_PROTOCOL_SAS:
+		return sas_get_pr_transport_id_len(se_tpg, se_nacl, pr_reg,
+					format_code);
+	case SCSI_PROTOCOL_FCP:
+		return fc_get_pr_transport_id_len(se_tpg, se_nacl, pr_reg,
+					format_code);
+	case SCSI_PROTOCOL_ISCSI:
+		return iscsi_get_pr_transport_id_len(se_tpg, se_nacl, pr_reg,
+					format_code);
+	default:
+		pr_err("Unknown tport_proto_id: 0x%02x, using"
+			" SAS emulation\n", tport->tport_proto_id);
+		break;
+	}
+
+	return sas_get_pr_transport_id_len(se_tpg, se_nacl, pr_reg,
+			format_code);
+}
+
+static char *tcm_vhost_parse_pr_out_transport_id(
+	struct se_portal_group *se_tpg,
+	const char *buf,
+	u32 *out_tid_len,
+	char **port_nexus_ptr)
+{
+	struct tcm_vhost_tpg *tpg = container_of(se_tpg,
+				struct tcm_vhost_tpg, se_tpg);
+	struct tcm_vhost_tport *tport = tpg->tport;
+
+	switch (tport->tport_proto_id) {
+	case SCSI_PROTOCOL_SAS:
+		return sas_parse_pr_out_transport_id(se_tpg, buf, out_tid_len,
+					port_nexus_ptr);
+	case SCSI_PROTOCOL_FCP:
+		return fc_parse_pr_out_transport_id(se_tpg, buf, out_tid_len,
+					port_nexus_ptr);
+	case SCSI_PROTOCOL_ISCSI:
+		return iscsi_parse_pr_out_transport_id(se_tpg, buf, out_tid_len,
+					port_nexus_ptr);
+	default:
+		pr_err("Unknown tport_proto_id: 0x%02x, using"
+			" SAS emulation\n", tport->tport_proto_id);
+		break;
+	}
+
+	return sas_parse_pr_out_transport_id(se_tpg, buf, out_tid_len,
+			port_nexus_ptr);
+}
+
+static struct se_node_acl *tcm_vhost_alloc_fabric_acl(struct se_portal_group *se_tpg)
+{
+	struct tcm_vhost_nacl *nacl;
+
+	nacl = kzalloc(sizeof(struct tcm_vhost_nacl), GFP_KERNEL);
+	if (!nacl) {
+		pr_err("Unable to alocate struct tcm_vhost_nacl\n");
+		return NULL;
+	}
+
+	return &nacl->se_node_acl;
+}
+
+static void tcm_vhost_release_fabric_acl(
+	struct se_portal_group *se_tpg,
+	struct se_node_acl *se_nacl)
+{
+	struct tcm_vhost_nacl *nacl = container_of(se_nacl,
+			struct tcm_vhost_nacl, se_node_acl);
+	kfree(nacl);
+}
+
+static u32 tcm_vhost_tpg_get_inst_index(struct se_portal_group *se_tpg)
+{
+	return 1;
+}
+
+static void tcm_vhost_release_cmd(struct se_cmd *se_cmd)
+{
+	return;
+}
+
+static int tcm_vhost_shutdown_session(struct se_session *se_sess)
+{
+	return 0;
+}
+
+static void tcm_vhost_close_session(struct se_session *se_sess)
+{
+	return;
+}
+
+static u32 tcm_vhost_sess_get_index(struct se_session *se_sess)
+{
+	return 0;
+}
+
+static int tcm_vhost_write_pending(struct se_cmd *se_cmd)
+{
+	/* Go ahead and process the write immediately */
+	transport_generic_process_write(se_cmd);
+	return 0;
+}
+
+static int tcm_vhost_write_pending_status(struct se_cmd *se_cmd)
+{
+	return 0;
+}
+
+static void tcm_vhost_set_default_node_attrs(struct se_node_acl *nacl)
+{
+	return;
+}
+
+static u32 tcm_vhost_get_task_tag(struct se_cmd *se_cmd)
+{
+	return 0;
+}
+
+static int tcm_vhost_get_cmd_state(struct se_cmd *se_cmd)
+{
+	return 0;
+}
+
+static void vhost_scsi_complete_cmd(struct tcm_vhost_cmd *);
+
+static int tcm_vhost_queue_data_in(struct se_cmd *se_cmd)
+{
+	struct tcm_vhost_cmd *tv_cmd = container_of(se_cmd,
+				struct tcm_vhost_cmd, tvc_se_cmd);
+	vhost_scsi_complete_cmd(tv_cmd);
+	return 0;
+}
+
+static int tcm_vhost_queue_status(struct se_cmd *se_cmd)
+{
+	struct tcm_vhost_cmd *tv_cmd = container_of(se_cmd,
+				struct tcm_vhost_cmd, tvc_se_cmd);
+	vhost_scsi_complete_cmd(tv_cmd);
+	return 0;
+}
+
+static int tcm_vhost_queue_tm_rsp(struct se_cmd *se_cmd)
+{
+	return 0;
+}
+
+static u16 tcm_vhost_set_fabric_sense_len(struct se_cmd *se_cmd, u32 sense_length)
+{
+	return 0;
+}
+
+static u16 tcm_vhost_get_fabric_sense_len(void)
+{
+	return 0;
+}
+
+static void vhost_scsi_free_cmd(struct tcm_vhost_cmd *tv_cmd)
+{
+	struct se_cmd *se_cmd = &tv_cmd->tvc_se_cmd;
+
+	/* TODO locking against target/backend threads? */
+	transport_generic_free_cmd(se_cmd, 1);
+
+	if (tv_cmd->tvc_sgl_count) {
+		u32 i;
+		for (i = 0; i < tv_cmd->tvc_sgl_count; i++)
+			put_page(sg_page(&tv_cmd->tvc_sgl[i]));
+
+		kfree(tv_cmd->tvc_sgl);
+	}
+
+	kfree(tv_cmd);
+}
+
+/* Dequeue a command from the completion list */
+static struct tcm_vhost_cmd *vhost_scsi_get_cmd_from_completion(struct vhost_scsi *vs)
+{
+	struct tcm_vhost_cmd *tv_cmd = NULL;
+
+	spin_lock_bh(&vs->vs_completion_lock);
+	if (list_empty(&vs->vs_completion_list)) {
+		spin_unlock_bh(&vs->vs_completion_lock);
+		return NULL;
+	}
+
+	list_for_each_entry(tv_cmd, &vs->vs_completion_list,
+			    tvc_completion_list) {
+		list_del(&tv_cmd->tvc_completion_list);
+		break;
+	}
+	spin_unlock_bh(&vs->vs_completion_lock);
+	return tv_cmd;
+}
+
+/* Fill in status and signal that we are done processing this command
+ *
+ * This is scheduled in the vhost work queue so we are called with the owner
+ * process mm and can access the vring.
+ */
+static void vhost_scsi_complete_cmd_work(struct vhost_work *work)
+{
+	struct vhost_scsi *vs = container_of(work, struct vhost_scsi,
+	                                     vs_completion_work);
+	struct tcm_vhost_cmd *tv_cmd;
+
+	while ((tv_cmd = vhost_scsi_get_cmd_from_completion(vs)) != NULL) {
+		struct virtio_scsi_cmd_resp v_rsp;
+		struct se_cmd *se_cmd = &tv_cmd->tvc_se_cmd;
+		int ret;
+
+		pr_debug("%s tv_cmd %p resid %u status %#02x\n", __func__,
+			tv_cmd, se_cmd->residual_count, se_cmd->scsi_status);
+
+		memset(&v_rsp, 0, sizeof(v_rsp));
+		v_rsp.resid = se_cmd->residual_count;
+		/* TODO is status_qualifier field needed? */
+		v_rsp.status = se_cmd->scsi_status;
+		v_rsp.sense_len = se_cmd->scsi_sense_length;
+		memcpy(v_rsp.sense, tv_cmd->tvc_sense_buf,
+		       v_rsp.sense_len);
+		ret = copy_to_user(tv_cmd->tvc_resp, &v_rsp, sizeof(v_rsp));
+		if (likely(ret == 0))
+			vhost_add_used(&vs->vqs[2], tv_cmd->tvc_vq_desc, 0);
+		else
+			pr_err("Faulted on virtio_scsi_cmd_resp\n");
+
+		vhost_scsi_free_cmd(tv_cmd);
+	}
+
+	vhost_signal(&vs->dev, &vs->vqs[2]);
+}
+
+static void vhost_scsi_complete_cmd(struct tcm_vhost_cmd *tv_cmd)
+{
+	struct vhost_scsi *vs = tv_cmd->tvc_vhost;
+
+	pr_debug("%s tv_cmd %p\n", __func__, tv_cmd);
+
+	spin_lock_bh(&vs->vs_completion_lock);
+	list_add_tail(&tv_cmd->tvc_completion_list, &vs->vs_completion_list);
+	spin_unlock_bh(&vs->vs_completion_lock);
+
+	vhost_work_queue(&vs->dev, &vs->vs_completion_work);
+}
+
+static struct tcm_vhost_cmd *vhost_scsi_allocate_cmd(
+	struct tcm_vhost_tpg *tv_tpg,
+	struct virtio_scsi_cmd_req *v_req,
+	u32 exp_data_len,
+	int data_direction)
+{
+	struct tcm_vhost_cmd *tv_cmd;
+	struct tcm_vhost_nexus *tv_nexus;
+	struct se_portal_group *se_tpg = &tv_tpg->se_tpg;
+	struct se_session *se_sess;
+	struct se_cmd *se_cmd;
+	int sam_task_attr;
+
+	tv_nexus = tv_tpg->tpg_nexus;
+	if (!tv_nexus) {
+		pr_err("Unable to locate active struct tcm_vhost_nexus\n");
+		return ERR_PTR(-EIO);
+	}
+	se_sess = tv_nexus->tvn_se_sess;
+
+	tv_cmd = kzalloc(sizeof(struct tcm_vhost_cmd), GFP_ATOMIC);
+	if (!tv_cmd) {
+		pr_err("Unable to allocate struct tcm_vhost_cmd\n");
+		return ERR_PTR(-ENOMEM);
+	}
+	INIT_LIST_HEAD(&tv_cmd->tvc_completion_list);
+	tv_cmd->tvc_tag = v_req->tag;
+
+	se_cmd = &tv_cmd->tvc_se_cmd;
+	/*
+	 * Locate the SAM Task Attr from virtio_scsi_cmd_req
+	 */
+	sam_task_attr = v_req->task_attr;
+	/*
+	 * Initialize struct se_cmd descriptor from target_core_mod infrastructure
+	 */
+	transport_init_se_cmd(se_cmd, se_tpg->se_tpg_tfo, se_sess, exp_data_len,
+				data_direction, sam_task_attr,
+				&tv_cmd->tvc_sense_buf[0]);
+
+#if 0	/* FIXME: vhost_scsi_allocate_cmd() BIDI operation */
+	if (bidi)
+		se_cmd->se_cmd_flags |= SCF_BIDI;
+#endif
+	return tv_cmd;
+}
+
+/*
+ * Map a user memory range into a scatterlist
+ *
+ * Returns the number of scatterlist entries used or -errno on error.
+ */
+static int vhost_scsi_map_to_sgl(struct scatterlist *sgl,
+		                 unsigned int sgl_count,
+		                 void __user *ptr, size_t len, int write)
+{
+	struct scatterlist *sg = sgl;
+	unsigned int npages = 0;
+	int ret;
+
+	while (len > 0) {
+		struct page *page;
+		unsigned int offset = (uintptr_t)ptr & ~PAGE_MASK;
+		unsigned int nbytes = min_t(unsigned int,
+				PAGE_SIZE - offset, len);
+
+		if (npages == sgl_count) {
+			ret = -ENOBUFS;
+			goto err;
+		}
+
+		ret = get_user_pages_fast((unsigned long)ptr, 1, write, &page);
+		BUG_ON(ret == 0); /* we should either get our page or fail */
+		if (ret < 0)
+			goto err;
+
+		sg_set_page(sg, page, nbytes, offset);
+		ptr += nbytes;
+		len -= nbytes;
+		sg++;
+		npages++;
+	}
+	return npages;
+
+err:
+	/* Put pages that we hold */
+	for (sg = sgl; sg != &sgl[npages]; sg++)
+		put_page(sg_page(sg));
+	return ret;
+}
+
+static int vhost_scsi_map_iov_to_sgl(struct tcm_vhost_cmd *tv_cmd,
+                                     struct iovec *iov, unsigned int niov,
+				     int write)
+{
+	int ret;
+	unsigned int i;
+	u32 sgl_count;
+	struct scatterlist *sg;
+
+	/*
+	 * Find out how long sglist needs to be
+	 */
+	sgl_count = 0;
+	for (i = 0; i < niov; i++) {
+		sgl_count += (((uintptr_t)iov[i].iov_base + iov[i].iov_len +
+		             PAGE_SIZE - 1) >> PAGE_SHIFT) -
+		             ((uintptr_t)iov[i].iov_base >> PAGE_SHIFT);
+	}
+	/* TODO overflow checking */
+
+	sg = kmalloc(sizeof(tv_cmd->tvc_sgl[0]) * sgl_count, GFP_ATOMIC);
+	if (!sg)
+		return -ENOMEM;
+	pr_debug("%s sg %p sgl_count %u is_err %ld\n", __func__,
+	       sg, sgl_count, IS_ERR(sg));
+	sg_init_table(sg, sgl_count);
+
+	tv_cmd->tvc_sgl = sg;
+	tv_cmd->tvc_sgl_count = sgl_count;
+
+	pr_debug("Mapping %u iovecs for %u pages\n", niov, sgl_count);
+	for (i = 0; i < niov; i++) {
+		ret = vhost_scsi_map_to_sgl(sg, sgl_count, iov[i].iov_base,
+		                            iov[i].iov_len, write);
+		if (ret < 0) {
+			for (i = 0; i < tv_cmd->tvc_sgl_count; i++)
+				put_page(sg_page(&tv_cmd->tvc_sgl[i]));
+			kfree(tv_cmd->tvc_sgl);
+			tv_cmd->tvc_sgl = NULL;
+			tv_cmd->tvc_sgl_count = 0;
+			return ret;
+		}
+
+		sg += ret;
+		sgl_count -= ret;
+	}
+	return 0;
+}
+
+static void tcm_vhost_submission_work(struct work_struct *work)
+{
+	struct tcm_vhost_cmd *tv_cmd =
+		container_of(work, struct tcm_vhost_cmd, work);
+	struct se_cmd *se_cmd = &tv_cmd->tvc_se_cmd;
+	struct scatterlist *sg_ptr, *sg_bidi_ptr = NULL;
+	int rc, sg_no_bidi = 0;
+	/*
+	 * Locate the struct se_lun pointer based on v_req->lun, and
+	 * attach it to struct se_cmd
+	 */
+	rc = transport_lookup_cmd_lun(&tv_cmd->tvc_se_cmd, tv_cmd->tvc_lun);
+	if (rc < 0) {
+		pr_err("Failed to look up lun: %d\n", tv_cmd->tvc_lun);
+		transport_send_check_condition_and_sense(&tv_cmd->tvc_se_cmd,
+			tv_cmd->tvc_se_cmd.scsi_sense_reason, 0);
+		transport_generic_free_cmd(se_cmd, 0);
+		return;
+	}
+
+	rc = target_setup_cmd_from_cdb(se_cmd, tv_cmd->tvc_cdb);
+	if (rc == -ENOMEM) {
+		transport_send_check_condition_and_sense(se_cmd,
+				TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE, 0);
+		transport_generic_free_cmd(se_cmd, 0);
+		return;
+	} else if (rc < 0) {
+		if (se_cmd->se_cmd_flags & SCF_SCSI_RESERVATION_CONFLICT)
+			tcm_vhost_queue_status(se_cmd);
+		else
+			transport_send_check_condition_and_sense(se_cmd,
+					se_cmd->scsi_sense_reason, 0);
+		transport_generic_free_cmd(se_cmd, 0);
+		return;
+	}
+
+	if (tv_cmd->tvc_sgl_count) {
+		sg_ptr = tv_cmd->tvc_sgl;
+		/*
+		 * For BIDI commands, pass in the extra READ buffer
+		 * to transport_generic_map_mem_to_cmd() below..
+		 */
+/* FIXME: Fix BIDI operation in tcm_vhost_submission_work() */
+#if 0
+		if (se_cmd->se_cmd_flags & SCF_BIDI) {
+			sg_bidi_ptr = NULL;
+			sg_no_bidi = 0;
+		}
+#endif
+	} else {
+		sg_ptr = NULL;
+	}
+
+	rc = transport_generic_map_mem_to_cmd(se_cmd, sg_ptr,
+				tv_cmd->tvc_sgl_count, sg_bidi_ptr,
+				sg_no_bidi);
+	if (rc < 0) {
+		transport_send_check_condition_and_sense(se_cmd,
+				se_cmd->scsi_sense_reason, 0);
+		transport_generic_free_cmd(se_cmd, 0);
+		return;
+	}
+	transport_handle_cdb_direct(se_cmd);
+}
+
+static void vhost_scsi_handle_vq(struct vhost_scsi *vs)
+{
+	struct vhost_virtqueue *vq = &vs->vqs[2];
+	struct virtio_scsi_cmd_req v_req;
+	struct tcm_vhost_tpg *tv_tpg;
+	struct tcm_vhost_cmd *tv_cmd;
+	u32 exp_data_len, data_first, data_num, data_direction;
+	unsigned out, in, i;
+	int head, ret;
+
+	/* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
+	tv_tpg = vs->vs_tpg;
+	if (unlikely(!tv_tpg)) {
+		pr_err("%s endpoint not set\n", __func__);
+		return;
+	}
+
+	mutex_lock(&vq->mutex);
+	vhost_disable_notify(&vs->dev, vq);
+
+	for (;;) {
+		head = vhost_get_vq_desc(&vs->dev, vq, vq->iov,
+					ARRAY_SIZE(vq->iov), &out, &in,
+					NULL, NULL);
+		pr_debug("vhost_get_vq_desc: head: %d, out: %u in: %u\n", head, out, in);
+		/* On error, stop handling until the next kick. */
+		if (unlikely(head < 0))
+			break;
+		/* Nothing new?  Wait for eventfd to tell us they refilled. */
+		if (head == vq->num) {
+			if (unlikely(vhost_enable_notify(&vs->dev, vq))) {
+				vhost_disable_notify(&vs->dev, vq);
+				continue;
+			}
+			break;
+		}
+
+/* FIXME: BIDI operation */
+		if (out == 1 && in == 1) {
+			data_direction = DMA_NONE;
+			data_first = 0;
+			data_num = 0;
+		} else if (out == 1 && in > 1) {
+			data_direction = DMA_FROM_DEVICE;
+			data_first = out + 1;
+			data_num = in - 1;
+		} else if (out > 1 && in == 1) {
+			data_direction = DMA_TO_DEVICE;
+			data_first = 1;
+			data_num = out - 1;
+		} else {
+			pr_err("Invalid buffer layout out: %u in: %u\n", out, in);
+			break;
+		}
+
+		/*
+		 * Check for a sane resp buffer so we can report errors to
+		 * the guest.
+		 */
+		if (unlikely(vq->iov[out].iov_len !=
+					sizeof(struct virtio_scsi_cmd_resp))) {
+			pr_err("Expecting virtio_scsi_cmd_resp, got %zu bytes\n",
+					vq->iov[out].iov_len);
+			break;
+		}
+
+		if (unlikely(vq->iov[0].iov_len != sizeof(v_req))) {
+			pr_err("Expecting virtio_scsi_cmd_req, got %zu bytes\n",
+					vq->iov[0].iov_len);
+			break;
+		}
+		pr_debug("Calling __copy_from_user: vq->iov[0].iov_base: %p, len: %zu\n",
+				vq->iov[0].iov_base, sizeof(v_req));
+		ret = __copy_from_user(&v_req, vq->iov[0].iov_base, sizeof(v_req));
+		if (unlikely(ret)) {
+			pr_err("Faulted on virtio_scsi_cmd_req\n");
+			break;
+		}
+
+		exp_data_len = 0;
+		for (i = 0; i < data_num; i++) {
+			exp_data_len += vq->iov[data_first + i].iov_len;
+		}
+
+		tv_cmd = vhost_scsi_allocate_cmd(tv_tpg, &v_req,
+					exp_data_len, data_direction);
+		if (IS_ERR(tv_cmd)) {
+			pr_err("vhost_scsi_allocate_cmd failed %ld\n", PTR_ERR(tv_cmd));
+			break;
+		}
+		pr_debug("Allocated tv_cmd: %p exp_data_len: %d, data_direction: %d\n",
+				tv_cmd, exp_data_len, data_direction);
+
+		tv_cmd->tvc_vhost = vs;
+
+		if (unlikely(vq->iov[out].iov_len !=
+		             sizeof(struct virtio_scsi_cmd_resp))) {
+			pr_err("Expecting virtio_scsi_cmd_resp, "
+			       " got %zu bytes, out: %d, in: %d\n", vq->iov[out].iov_len, out, in);
+			break;
+		}
+
+		tv_cmd->tvc_resp = vq->iov[out].iov_base;
+
+		/*
+		 * Copy in the recieved CDB descriptor into tv_cmd->tvc_cdb
+		 * that will be used by tcm_vhost_new_cmd_map() and down into
+		 * target_setup_cmd_from_cdb()
+		 */
+		memcpy(tv_cmd->tvc_cdb, v_req.cdb, TCM_VHOST_MAX_CDB_SIZE);
+		/*
+		 * Check that the recieved CDB size does not exceeded our
+		 * hardcoded max for tcm_vhost
+		 */
+		/* TODO what if cdb was too small for varlen cdb header? */
+		if (unlikely(scsi_command_size(tv_cmd->tvc_cdb) > TCM_VHOST_MAX_CDB_SIZE)) {
+			pr_err("Received SCSI CDB with command_size: %d that exceeds"
+				" SCSI_MAX_VARLEN_CDB_SIZE: %d\n",
+				scsi_command_size(tv_cmd->tvc_cdb), TCM_VHOST_MAX_CDB_SIZE);
+			break; /* TODO */
+		}
+		tv_cmd->tvc_lun = ((v_req.lun[2] << 8) | v_req.lun[3]) & 0x3FFF;
+
+		pr_debug("vhost_scsi got command opcode: %#02x, lun: %d\n",
+			tv_cmd->tvc_cdb[0], tv_cmd->tvc_lun);
+
+		if (data_direction != DMA_NONE) {
+			ret = vhost_scsi_map_iov_to_sgl(tv_cmd, &vq->iov[data_first],
+					data_num, data_direction == DMA_TO_DEVICE);
+			if (unlikely(ret)) {
+				pr_err("Failed to map iov to sgl\n");
+				break; /* TODO */
+			}
+		}
+
+		/*
+		 * Save the descriptor from vhost_get_vq_desc() to be used to
+		 * complete the virtio-scsi request in TCM callback context via
+		 * tcm_vhost_queue_data_in() and tcm_vhost_queue_status()
+		 */
+		tv_cmd->tvc_vq_desc = head;
+		/*
+		 * Dispatch tv_cmd descriptor for cmwq execution in process
+		 * context provided by tcm_vhost_workqueue.  This also ensures
+		 * tv_cmd is executed on the same kworker CPU as this vhost
+		 * thread to gain positive L2 cache locality effects..
+		 */
+		INIT_WORK(&tv_cmd->work, tcm_vhost_submission_work);
+		queue_work(tcm_vhost_workqueue, &tv_cmd->work);
+	}
+
+	mutex_unlock(&vq->mutex);
+}
+
+static void vhost_scsi_ctl_handle_kick(struct vhost_work *work)
+{
+     pr_err("%s: The handling func for control queue.\n", __func__);
+}
+
+static void vhost_scsi_evt_handle_kick(struct vhost_work *work)
+{
+     pr_err("%s: The handling func for event queue.\n", __func__);
+}
+
+static void vhost_scsi_handle_kick(struct vhost_work *work)
+{
+	struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
+						poll.work);
+	struct vhost_scsi *vs = container_of(vq->dev, struct vhost_scsi, dev);
+
+	vhost_scsi_handle_vq(vs);
+}
+
+/*
+ * Called from vhost_scsi_ioctl() context to walk the list of available tcm_vhost_tpg
+ * with an active struct tcm_vhost_nexus
+ */
+static int vhost_scsi_set_endpoint(
+	struct vhost_scsi *vs,
+	struct vhost_vring_target *t)
+{
+	struct tcm_vhost_tport *tv_tport;
+	struct tcm_vhost_tpg *tv_tpg;
+        int index;
+
+	mutex_lock(&vs->dev.mutex);
+	/* Verify that ring has been setup correctly. */
+	for (index = 0; index < vs->dev.nvqs; ++index) {
+		/* Verify that ring has been setup correctly. */
+		if (!vhost_vq_access_ok(&vs->vqs[index])) {
+		        mutex_unlock(&vs->dev.mutex);
+			return -EFAULT;
+		}
+	}
+
+	if (vs->vs_tpg) {
+		mutex_unlock(&vs->dev.mutex);
+		return -EEXIST;
+	}
+	mutex_unlock(&vs->dev.mutex);
+
+	mutex_lock(&tcm_vhost_mutex);
+	list_for_each_entry(tv_tpg, &tcm_vhost_list, tv_tpg_list) {
+		mutex_lock(&tv_tpg->tv_tpg_mutex);
+		if (!tv_tpg->tpg_nexus) {
+			mutex_unlock(&tv_tpg->tv_tpg_mutex);
+			continue;
+		}
+		if (atomic_read(&tv_tpg->tv_tpg_vhost_count)) {
+			mutex_unlock(&tv_tpg->tv_tpg_mutex);
+			continue;
+		}
+		tv_tport = tv_tpg->tport;
+
+		if (!strcmp(tv_tport->tport_name, t->vhost_wwpn) &&
+		    (tv_tpg->tport_tpgt == t->vhost_tpgt)) {
+			atomic_inc(&tv_tpg->tv_tpg_vhost_count);
+			smp_mb__after_atomic_inc();
+			mutex_unlock(&tv_tpg->tv_tpg_mutex);
+			mutex_unlock(&tcm_vhost_mutex);
+
+			mutex_lock(&vs->dev.mutex);
+			vs->vs_tpg = tv_tpg;
+			atomic_inc(&vs->vhost_ref_cnt);
+			smp_mb__after_atomic_inc();
+			mutex_unlock(&vs->dev.mutex);
+			return 0;
+		}
+		mutex_unlock(&tv_tpg->tv_tpg_mutex);
+	}
+	mutex_unlock(&tcm_vhost_mutex);
+	return -EINVAL;
+}
+
+static int vhost_scsi_clear_endpoint(
+	struct vhost_scsi *vs,
+	struct vhost_vring_target *t)
+{
+	struct tcm_vhost_tport *tv_tport;
+	struct tcm_vhost_tpg *tv_tpg;
+        int index;
+
+	mutex_lock(&vs->dev.mutex);
+	/* Verify that ring has been setup correctly. */
+	for (index = 0; index < vs->dev.nvqs; ++index) {
+		if (!vhost_vq_access_ok(&vs->vqs[index])) {
+		        mutex_unlock(&vs->dev.mutex);
+			return -EFAULT;
+		}
+	}
+
+	if (!vs->vs_tpg) {
+		mutex_unlock(&vs->dev.mutex);
+		return -ENODEV;
+	}
+	tv_tpg = vs->vs_tpg;
+	tv_tport = tv_tpg->tport;
+
+	if (strcmp(tv_tport->tport_name, t->vhost_wwpn) ||
+	    (tv_tpg->tport_tpgt != t->vhost_tpgt)) {
+		mutex_unlock(&vs->dev.mutex);
+		pr_warn("tv_tport->tport_name: %s, tv_tpg->tport_tpgt: %hu"
+			" does not match t->vhost_wwpn: %s, t->vhost_tpgt: %hu\n",
+			tv_tport->tport_name, tv_tpg->tport_tpgt,
+			t->vhost_wwpn, t->vhost_tpgt);
+		return -EINVAL;
+	}
+        atomic_dec(&tv_tpg->tv_tpg_vhost_count);
+	vs->vs_tpg = NULL;
+	mutex_unlock(&vs->dev.mutex);
+
+	return 0;
+}
+
+static int vhost_scsi_open(struct inode *inode, struct file *f)
+{
+	struct vhost_scsi *s;
+	int r;
+
+	s = kzalloc(sizeof(*s), GFP_KERNEL);
+	if (!s)
+		return -ENOMEM;
+
+	vhost_work_init(&s->vs_completion_work, vhost_scsi_complete_cmd_work);
+	INIT_LIST_HEAD(&s->vs_completion_list);
+	spin_lock_init(&s->vs_completion_lock);
+
+	s->vqs[0].handle_kick = vhost_scsi_ctl_handle_kick;
+	s->vqs[1].handle_kick = vhost_scsi_evt_handle_kick;
+	s->vqs[2].handle_kick = vhost_scsi_handle_kick;
+	r = vhost_dev_init(&s->dev, s->vqs, 3);
+	if (r < 0) {
+		kfree(s);
+		return r;
+	}
+
+	f->private_data = s;
+	return 0;
+}
+
+static int vhost_scsi_release(struct inode *inode, struct file *f)
+{
+	struct vhost_scsi *s = f->private_data;
+
+        if (s->vs_tpg && s->vs_tpg->tport) {
+            struct vhost_vring_target backend;
+            memcpy(backend.vhost_wwpn, s->vs_tpg->tport->tport_name, sizeof(backend.vhost_wwpn));
+            backend.vhost_tpgt = s->vs_tpg->tport_tpgt;
+            vhost_scsi_clear_endpoint(s, &backend);
+        }
+
+	vhost_dev_cleanup(&s->dev, false);
+	kfree(s);
+	return 0;
+}
+
+static int vhost_scsi_set_features(struct vhost_scsi *vs, u64 features)
+{
+	if (features & ~VHOST_FEATURES)
+		return -EOPNOTSUPP;
+
+	mutex_lock(&vs->dev.mutex);
+	if ((features & (1 << VHOST_F_LOG_ALL)) &&
+	    !vhost_log_access_ok(&vs->dev)) {
+		mutex_unlock(&vs->dev.mutex);
+		return -EFAULT;
+	}
+	vs->dev.acked_features = features;
+	/* TODO possibly smp_wmb() and flush vqs */
+	mutex_unlock(&vs->dev.mutex);
+	return 0;
+}
+
+static long vhost_scsi_ioctl(struct file *f, unsigned int ioctl,
+				unsigned long arg)
+{
+	struct vhost_scsi *vs = f->private_data;
+	struct vhost_vring_target backend;
+	void __user *argp = (void __user *)arg;
+	u64 __user *featurep = argp;
+	u64 features;
+	int r;
+
+	switch (ioctl) {
+	case VHOST_SCSI_SET_ENDPOINT:
+		if (copy_from_user(&backend, argp, sizeof backend))
+			return -EFAULT;
+
+		return vhost_scsi_set_endpoint(vs, &backend);
+	case VHOST_SCSI_CLEAR_ENDPOINT:
+		if (copy_from_user(&backend, argp, sizeof backend))
+			return -EFAULT;
+
+		return vhost_scsi_clear_endpoint(vs, &backend);
+	case VHOST_GET_FEATURES:
+		features = VHOST_FEATURES;
+		if (copy_to_user(featurep, &features, sizeof features))
+			return -EFAULT;
+		return 0;
+	case VHOST_SET_FEATURES:
+		if (copy_from_user(&features, featurep, sizeof features))
+			return -EFAULT;
+		return vhost_scsi_set_features(vs, features);
+	default:
+		mutex_lock(&vs->dev.mutex);
+		r = vhost_dev_ioctl(&vs->dev, ioctl, arg);
+		mutex_unlock(&vs->dev.mutex);
+		return r;
+	}
+}
+
+static const struct file_operations vhost_scsi_fops = {
+	.owner          = THIS_MODULE,
+	.release        = vhost_scsi_release,
+	.unlocked_ioctl = vhost_scsi_ioctl,
+	/* TODO compat ioctl? */
+	.open           = vhost_scsi_open,
+	.llseek		= noop_llseek,
+};
+
+static struct miscdevice vhost_scsi_misc = {
+	MISC_DYNAMIC_MINOR,
+	"vhost-scsi",
+	&vhost_scsi_fops,
+};
+
+static int __init vhost_scsi_register(void)
+{
+	return misc_register(&vhost_scsi_misc);
+}
+
+static int vhost_scsi_deregister(void)
+{
+	return misc_deregister(&vhost_scsi_misc);
+}
+
+static char *tcm_vhost_dump_proto_id(struct tcm_vhost_tport *tport)
+{
+	switch (tport->tport_proto_id) {
+	case SCSI_PROTOCOL_SAS:
+		return "SAS";
+	case SCSI_PROTOCOL_FCP:
+		return "FCP";
+	case SCSI_PROTOCOL_ISCSI:
+		return "iSCSI";
+	default:
+		break;
+	}
+
+	return "Unknown";
+}
+
+static int tcm_vhost_port_link(
+	struct se_portal_group *se_tpg,
+	struct se_lun *lun)
+{
+	struct tcm_vhost_tpg *tv_tpg = container_of(se_tpg,
+				struct tcm_vhost_tpg, se_tpg);
+
+	atomic_inc(&tv_tpg->tv_tpg_port_count);
+	smp_mb__after_atomic_inc();
+
+	return 0;
+}
+
+static void tcm_vhost_port_unlink(
+	struct se_portal_group *se_tpg,
+	struct se_lun *se_lun)
+{
+	struct tcm_vhost_tpg *tv_tpg = container_of(se_tpg,
+				struct tcm_vhost_tpg, se_tpg);
+
+	atomic_dec(&tv_tpg->tv_tpg_port_count);
+	smp_mb__after_atomic_dec();
+}
+
+static struct se_node_acl *tcm_vhost_make_nodeacl(
+	struct se_portal_group *se_tpg,
+	struct config_group *group,
+	const char *name)
+{
+	struct se_node_acl *se_nacl, *se_nacl_new;
+	struct tcm_vhost_nacl *nacl;
+	u64 wwpn = 0;
+	u32 nexus_depth;
+
+	/* tcm_vhost_parse_wwn(name, &wwpn, 1) < 0)
+		return ERR_PTR(-EINVAL); */
+	se_nacl_new = tcm_vhost_alloc_fabric_acl(se_tpg);
+	if (!se_nacl_new)
+		return ERR_PTR(-ENOMEM);
+//#warning FIXME: Hardcoded nexus depth in tcm_vhost_make_nodeacl()
+	nexus_depth = 1;
+	/*
+	 * se_nacl_new may be released by core_tpg_add_initiator_node_acl()
+	 * when converting a NodeACL from demo mode -> explict
+	 */
+	se_nacl = core_tpg_add_initiator_node_acl(se_tpg, se_nacl_new,
+				name, nexus_depth);
+	if (IS_ERR(se_nacl)) {
+		tcm_vhost_release_fabric_acl(se_tpg, se_nacl_new);
+		return se_nacl;
+	}
+	/*
+	 * Locate our struct tcm_vhost_nacl and set the FC Nport WWPN
+	 */
+	nacl = container_of(se_nacl, struct tcm_vhost_nacl, se_node_acl);
+	nacl->iport_wwpn = wwpn;
+	/* tcm_vhost_format_wwn(&nacl->iport_name[0], TCM_VHOST_NAMELEN, wwpn); */
+
+	return se_nacl;
+}
+
+static void tcm_vhost_drop_nodeacl(struct se_node_acl *se_acl)
+{
+	struct tcm_vhost_nacl *nacl = container_of(se_acl,
+				struct tcm_vhost_nacl, se_node_acl);
+	core_tpg_del_initiator_node_acl(se_acl->se_tpg, se_acl, 1);
+	kfree(nacl);
+}
+
+static int tcm_vhost_make_nexus(
+	struct tcm_vhost_tpg *tv_tpg,
+	const char *name)
+{
+	struct se_portal_group *se_tpg;
+	struct tcm_vhost_nexus *tv_nexus;
+
+	mutex_lock(&tv_tpg->tv_tpg_mutex);
+	if (tv_tpg->tpg_nexus) {
+		mutex_unlock(&tv_tpg->tv_tpg_mutex);
+		pr_debug("tv_tpg->tpg_nexus already exists\n");
+		return -EEXIST;
+	}
+	se_tpg = &tv_tpg->se_tpg;
+
+	tv_nexus = kzalloc(sizeof(struct tcm_vhost_nexus), GFP_KERNEL);
+	if (!tv_nexus) {
+		mutex_unlock(&tv_tpg->tv_tpg_mutex);
+		pr_err("Unable to allocate struct tcm_vhost_nexus\n");
+		return -ENOMEM;
+	}
+	/*
+	 *  Initialize the struct se_session pointer
+	 */
+	tv_nexus->tvn_se_sess = transport_init_session();
+	if (IS_ERR(tv_nexus->tvn_se_sess)) {
+		mutex_unlock(&tv_tpg->tv_tpg_mutex);
+		kfree(tv_nexus);
+		return -ENOMEM;
+	}
+	/*
+	 * Since we are running in 'demo mode' this call with generate a
+	 * struct se_node_acl for the tcm_vhost struct se_portal_group with
+	 * the SCSI Initiator port name of the passed configfs group 'name'.
+	 */
+	tv_nexus->tvn_se_sess->se_node_acl = core_tpg_check_initiator_node_acl(
+				se_tpg, (unsigned char *)name);
+	if (!tv_nexus->tvn_se_sess->se_node_acl) {
+		mutex_unlock(&tv_tpg->tv_tpg_mutex);
+		pr_debug("core_tpg_check_initiator_node_acl() failed"
+				" for %s\n", name);
+		transport_free_session(tv_nexus->tvn_se_sess);
+		kfree(tv_nexus);
+		return -ENOMEM;
+	}
+	/*
+	 * Now register the TCM vHost virtual I_T Nexus as active with the
+	 * call to __transport_register_session()
+	 */
+	__transport_register_session(se_tpg, tv_nexus->tvn_se_sess->se_node_acl,
+			tv_nexus->tvn_se_sess, tv_nexus);
+	tv_tpg->tpg_nexus = tv_nexus;
+
+	mutex_unlock(&tv_tpg->tv_tpg_mutex);
+	return 0;
+}
+
+static int tcm_vhost_drop_nexus(
+	struct tcm_vhost_tpg *tpg)
+{
+	struct se_session *se_sess;
+	struct tcm_vhost_nexus *tv_nexus;
+
+	mutex_lock(&tpg->tv_tpg_mutex);
+	tv_nexus = tpg->tpg_nexus;
+	if (!tv_nexus) {
+		mutex_unlock(&tpg->tv_tpg_mutex);
+		return -ENODEV;
+	}
+
+	se_sess = tv_nexus->tvn_se_sess;
+	if (!se_sess) {
+		mutex_unlock(&tpg->tv_tpg_mutex);
+		return -ENODEV;
+	}
+
+	if (atomic_read(&tpg->tv_tpg_port_count)) {
+		mutex_unlock(&tpg->tv_tpg_mutex);
+		pr_err("Unable to remove TCM_vHost I_T Nexus with"
+			" active TPG port count: %d\n",
+			atomic_read(&tpg->tv_tpg_port_count));
+		return -EPERM;
+	}
+
+	if (atomic_read(&tpg->tv_tpg_vhost_count)) {
+		pr_err("Unable to remove TCM_vHost I_T Nexus with"
+			" active TPG vhost count: %d\n",
+			atomic_read(&tpg->tv_tpg_vhost_count));
+		return -EPERM;
+	}
+
+	pr_debug("TCM_vHost_ConfigFS: Removing I_T Nexus to emulated"
+		" %s Initiator Port: %s\n", tcm_vhost_dump_proto_id(tpg->tport),
+		tv_nexus->tvn_se_sess->se_node_acl->initiatorname);
+	/*
+	 * Release the SCSI I_T Nexus to the emulated vHost Target Port
+	 */
+	transport_deregister_session(tv_nexus->tvn_se_sess);
+	tpg->tpg_nexus = NULL;
+	mutex_unlock(&tpg->tv_tpg_mutex);
+
+	kfree(tv_nexus);
+	return 0;
+}
+
+static ssize_t tcm_vhost_tpg_show_nexus(
+	struct se_portal_group *se_tpg,
+	char *page)
+{
+	struct tcm_vhost_tpg *tv_tpg = container_of(se_tpg,
+				struct tcm_vhost_tpg, se_tpg);
+	struct tcm_vhost_nexus *tv_nexus;
+	ssize_t ret;
+
+	mutex_lock(&tv_tpg->tv_tpg_mutex);
+	tv_nexus = tv_tpg->tpg_nexus;
+	if (!tv_nexus) {
+		mutex_unlock(&tv_tpg->tv_tpg_mutex);
+		return -ENODEV;
+	}
+	ret = snprintf(page, PAGE_SIZE, "%s\n",
+			tv_nexus->tvn_se_sess->se_node_acl->initiatorname);
+	mutex_unlock(&tv_tpg->tv_tpg_mutex);
+
+	return ret;
+}
+
+static ssize_t tcm_vhost_tpg_store_nexus(
+	struct se_portal_group *se_tpg,
+	const char *page,
+	size_t count)
+{
+	struct tcm_vhost_tpg *tv_tpg = container_of(se_tpg,
+				struct tcm_vhost_tpg, se_tpg);
+	struct tcm_vhost_tport *tport_wwn = tv_tpg->tport;
+	unsigned char i_port[TCM_VHOST_NAMELEN], *ptr, *port_ptr;
+	int ret;
+	/*
+	 * Shutdown the active I_T nexus if 'NULL' is passed..
+	 */
+	if (!strncmp(page, "NULL", 4)) {
+		ret = tcm_vhost_drop_nexus(tv_tpg);
+		return (!ret) ? count : ret;
+	}
+	/*
+	 * Otherwise make sure the passed virtual Initiator port WWN matches
+	 * the fabric protocol_id set in tcm_vhost_make_tport(), and call
+	 * tcm_vhost_make_nexus().
+	 */
+	if (strlen(page) > TCM_VHOST_NAMELEN) {
+		pr_err("Emulated NAA Sas Address: %s, exceeds"
+				" max: %d\n", page, TCM_VHOST_NAMELEN);
+		return -EINVAL;
+	}
+	snprintf(&i_port[0], TCM_VHOST_NAMELEN, "%s", page);
+
+	ptr = strstr(i_port, "naa.");
+	if (ptr) {
+		if (tport_wwn->tport_proto_id != SCSI_PROTOCOL_SAS) {
+			pr_err("Passed SAS Initiator Port %s does not"
+				" match target port protoid: %s\n", i_port,
+				tcm_vhost_dump_proto_id(tport_wwn));
+			return -EINVAL;
+		}
+		port_ptr = &i_port[0];
+		goto check_newline;
+	}
+	ptr = strstr(i_port, "fc.");
+	if (ptr) {
+		if (tport_wwn->tport_proto_id != SCSI_PROTOCOL_FCP) {
+			pr_err("Passed FCP Initiator Port %s does not"
+				" match target port protoid: %s\n", i_port,
+				tcm_vhost_dump_proto_id(tport_wwn));
+			return -EINVAL;
+		}
+		port_ptr = &i_port[3]; /* Skip over "fc." */
+		goto check_newline;
+	}
+	ptr = strstr(i_port, "iqn.");
+	if (ptr) {
+		if (tport_wwn->tport_proto_id != SCSI_PROTOCOL_ISCSI) {
+			pr_err("Passed iSCSI Initiator Port %s does not"
+				" match target port protoid: %s\n", i_port,
+				tcm_vhost_dump_proto_id(tport_wwn));
+			return -EINVAL;
+		}
+		port_ptr = &i_port[0];
+		goto check_newline;
+	}
+	pr_err("Unable to locate prefix for emulated Initiator Port:"
+			" %s\n", i_port);
+	return -EINVAL;
+	/*
+	 * Clear any trailing newline for the NAA WWN
+	 */
+check_newline:
+	if (i_port[strlen(i_port)-1] == '\n')
+		i_port[strlen(i_port)-1] = '\0';
+
+	ret = tcm_vhost_make_nexus(tv_tpg, port_ptr);
+	if (ret < 0)
+		return ret;
+
+	return count;
+}
+
+TF_TPG_BASE_ATTR(tcm_vhost, nexus, S_IRUGO | S_IWUSR);
+
+static struct configfs_attribute *tcm_vhost_tpg_attrs[] = {
+	&tcm_vhost_tpg_nexus.attr,
+	NULL,
+};
+
+static struct se_portal_group *tcm_vhost_make_tpg(
+	struct se_wwn *wwn,
+	struct config_group *group,
+	const char *name)
+{
+	struct tcm_vhost_tport*tport = container_of(wwn,
+			struct tcm_vhost_tport, tport_wwn);
+
+	struct tcm_vhost_tpg *tpg;
+	unsigned long tpgt;
+	int ret;
+
+	if (strstr(name, "tpgt_") != name)
+		return ERR_PTR(-EINVAL);
+	if (strict_strtoul(name + 5, 10, &tpgt) || tpgt > UINT_MAX)
+		return ERR_PTR(-EINVAL);
+
+	tpg = kzalloc(sizeof(struct tcm_vhost_tpg), GFP_KERNEL);
+	if (!tpg) {
+		pr_err("Unable to allocate struct tcm_vhost_tpg");
+		return ERR_PTR(-ENOMEM);
+	}
+	mutex_init(&tpg->tv_tpg_mutex);
+	INIT_LIST_HEAD(&tpg->tv_tpg_list);
+	tpg->tport = tport;
+	tpg->tport_tpgt = tpgt;
+
+	ret = core_tpg_register(&tcm_vhost_fabric_configfs->tf_ops, wwn,
+				&tpg->se_tpg, tpg, TRANSPORT_TPG_TYPE_NORMAL);
+	if (ret < 0) {
+		kfree(tpg);
+		return NULL;
+	}
+	mutex_lock(&tcm_vhost_mutex);
+	list_add_tail(&tpg->tv_tpg_list, &tcm_vhost_list);
+	mutex_unlock(&tcm_vhost_mutex);
+
+	return &tpg->se_tpg;
+}
+
+static void tcm_vhost_drop_tpg(struct se_portal_group *se_tpg)
+{
+	struct tcm_vhost_tpg *tpg = container_of(se_tpg,
+				struct tcm_vhost_tpg, se_tpg);
+
+	mutex_lock(&tcm_vhost_mutex);
+	list_del(&tpg->tv_tpg_list);
+	mutex_unlock(&tcm_vhost_mutex);
+	/*
+	 * Release the virtual I_T Nexus for this vHost TPG
+	 */
+	tcm_vhost_drop_nexus(tpg);
+	/*
+	 * Deregister the se_tpg from TCM..
+	 */
+	core_tpg_deregister(se_tpg);
+	kfree(tpg);
+}
+
+static struct se_wwn *tcm_vhost_make_tport(
+	struct target_fabric_configfs *tf,
+	struct config_group *group,
+	const char *name)
+{
+	struct tcm_vhost_tport *tport;
+	char *ptr;
+	u64 wwpn = 0;
+	int off = 0;
+
+	/* if (tcm_vhost_parse_wwn(name, &wwpn, 1) < 0)
+		return ERR_PTR(-EINVAL); */
+
+	tport = kzalloc(sizeof(struct tcm_vhost_tport), GFP_KERNEL);
+	if (!tport) {
+		pr_err("Unable to allocate struct tcm_vhost_tport");
+		return ERR_PTR(-ENOMEM);
+	}
+	tport->tport_wwpn = wwpn;
+	/* tcm_vhost_format_wwn(&tport->tport_name[0], TCM_VHOST__NAMELEN, wwpn); */
+	/*
+	 * Determine the emulated Protocol Identifier and Target Port Name
+	 * based on the incoming configfs directory name.
+	 */
+	ptr = strstr(name, "naa.");
+	if (ptr) {
+		tport->tport_proto_id = SCSI_PROTOCOL_SAS;
+		goto check_len;
+	}
+	ptr = strstr(name, "fc.");
+	if (ptr) {
+		tport->tport_proto_id = SCSI_PROTOCOL_FCP;
+		off = 3; /* Skip over "fc." */
+		goto check_len;
+	}
+	ptr = strstr(name, "iqn.");
+	if (ptr) {
+		tport->tport_proto_id = SCSI_PROTOCOL_ISCSI;
+		goto check_len;
+	}
+
+	pr_err("Unable to locate prefix for emulated Target Port:"
+			" %s\n", name);
+	return ERR_PTR(-EINVAL);
+
+check_len:
+	if (strlen(name) > TCM_VHOST_NAMELEN) {
+		pr_err("Emulated %s Address: %s, exceeds"
+			" max: %d\n", name, tcm_vhost_dump_proto_id(tport),
+			TCM_VHOST_NAMELEN);
+		kfree(tport);
+		return ERR_PTR(-EINVAL);
+	}
+	snprintf(&tport->tport_name[0], TCM_VHOST_NAMELEN, "%s", &name[off]);
+
+	pr_debug("TCM_VHost_ConfigFS: Allocated emulated Target"
+		" %s Address: %s\n", tcm_vhost_dump_proto_id(tport), name);
+
+	return &tport->tport_wwn;
+}
+
+static void tcm_vhost_drop_tport(struct se_wwn *wwn)
+{
+	struct tcm_vhost_tport *tport = container_of(wwn,
+				struct tcm_vhost_tport, tport_wwn);
+
+	pr_debug("TCM_VHost_ConfigFS: Deallocating emulated Target"
+		" %s Address: %s\n", tcm_vhost_dump_proto_id(tport),
+		tport->tport_name);;
+
+	kfree(tport);
+}
+
+static ssize_t tcm_vhost_wwn_show_attr_version(
+	struct target_fabric_configfs *tf,
+	char *page)
+{
+	return sprintf(page, "TCM_VHOST fabric module %s on %s/%s"
+		"on "UTS_RELEASE"\n", TCM_VHOST_VERSION, utsname()->sysname,
+		utsname()->machine);
+}
+
+TF_WWN_ATTR_RO(tcm_vhost, version);
+
+static struct configfs_attribute *tcm_vhost_wwn_attrs[] = {
+	&tcm_vhost_wwn_version.attr,
+	NULL,
+};
+
+static struct target_core_fabric_ops tcm_vhost_ops = {
+	.get_fabric_name		= tcm_vhost_get_fabric_name,
+	.get_fabric_proto_ident		= tcm_vhost_get_fabric_proto_ident,
+	.tpg_get_wwn			= tcm_vhost_get_fabric_wwn,
+	.tpg_get_tag			= tcm_vhost_get_tag,
+	.tpg_get_default_depth		= tcm_vhost_get_default_depth,
+	.tpg_get_pr_transport_id	= tcm_vhost_get_pr_transport_id,
+	.tpg_get_pr_transport_id_len	= tcm_vhost_get_pr_transport_id_len,
+	.tpg_parse_pr_out_transport_id	= tcm_vhost_parse_pr_out_transport_id,
+	.tpg_check_demo_mode		= tcm_vhost_check_true,
+	.tpg_check_demo_mode_cache	= tcm_vhost_check_true,
+	.tpg_check_demo_mode_write_protect = tcm_vhost_check_false,
+	.tpg_check_prod_mode_write_protect = tcm_vhost_check_false,
+	.tpg_alloc_fabric_acl		= tcm_vhost_alloc_fabric_acl,
+	.tpg_release_fabric_acl		= tcm_vhost_release_fabric_acl,
+	.tpg_get_inst_index		= tcm_vhost_tpg_get_inst_index,
+	.release_cmd			= tcm_vhost_release_cmd,
+	.shutdown_session		= tcm_vhost_shutdown_session,
+	.close_session			= tcm_vhost_close_session,
+	.sess_get_index			= tcm_vhost_sess_get_index,
+	.sess_get_initiator_sid		= NULL,
+	.write_pending			= tcm_vhost_write_pending,
+	.write_pending_status		= tcm_vhost_write_pending_status,
+	.set_default_node_attributes	= tcm_vhost_set_default_node_attrs,
+	.get_task_tag			= tcm_vhost_get_task_tag,
+	.get_cmd_state			= tcm_vhost_get_cmd_state,
+	.queue_data_in			= tcm_vhost_queue_data_in,
+	.queue_status			= tcm_vhost_queue_status,
+	.queue_tm_rsp			= tcm_vhost_queue_tm_rsp,
+	.get_fabric_sense_len		= tcm_vhost_get_fabric_sense_len,
+	.set_fabric_sense_len		= tcm_vhost_set_fabric_sense_len,
+	/*
+	 * Setup function pointers for generic logic in target_core_fabric_configfs.c
+	 */
+	.fabric_make_wwn		= tcm_vhost_make_tport,
+	.fabric_drop_wwn		= tcm_vhost_drop_tport,
+	.fabric_make_tpg		= tcm_vhost_make_tpg,
+	.fabric_drop_tpg		= tcm_vhost_drop_tpg,
+	.fabric_post_link		= tcm_vhost_port_link,
+	.fabric_pre_unlink		= tcm_vhost_port_unlink,
+	.fabric_make_np			= NULL,
+	.fabric_drop_np			= NULL,
+	.fabric_make_nodeacl		= tcm_vhost_make_nodeacl,
+	.fabric_drop_nodeacl		= tcm_vhost_drop_nodeacl,
+};
+
+static int tcm_vhost_register_configfs(void)
+{
+	struct target_fabric_configfs *fabric;
+	int ret;
+
+	pr_debug("TCM_VHOST fabric module %s on %s/%s"
+		" on "UTS_RELEASE"\n",TCM_VHOST_VERSION, utsname()->sysname,
+		utsname()->machine);
+	/*
+	 * Register the top level struct config_item_type with TCM core
+	 */
+	fabric = target_fabric_configfs_init(THIS_MODULE, "vhost");
+	if (IS_ERR(fabric)) {
+		pr_err("target_fabric_configfs_init() failed\n");
+		return PTR_ERR(fabric);
+	}
+	/*
+	 * Setup fabric->tf_ops from our local tcm_vhost_ops
+	 */
+	fabric->tf_ops = tcm_vhost_ops;
+	/*
+	 * Setup default attribute lists for various fabric->tf_cit_tmpl
+	 */
+	TF_CIT_TMPL(fabric)->tfc_wwn_cit.ct_attrs = tcm_vhost_wwn_attrs;
+	TF_CIT_TMPL(fabric)->tfc_tpg_base_cit.ct_attrs = tcm_vhost_tpg_attrs;
+	TF_CIT_TMPL(fabric)->tfc_tpg_attrib_cit.ct_attrs = NULL;
+	TF_CIT_TMPL(fabric)->tfc_tpg_param_cit.ct_attrs = NULL;
+	TF_CIT_TMPL(fabric)->tfc_tpg_np_base_cit.ct_attrs = NULL;
+	TF_CIT_TMPL(fabric)->tfc_tpg_nacl_base_cit.ct_attrs = NULL;
+	TF_CIT_TMPL(fabric)->tfc_tpg_nacl_attrib_cit.ct_attrs = NULL;
+	TF_CIT_TMPL(fabric)->tfc_tpg_nacl_auth_cit.ct_attrs = NULL;
+	TF_CIT_TMPL(fabric)->tfc_tpg_nacl_param_cit.ct_attrs = NULL;
+	/*
+	 * Register the fabric for use within TCM
+	 */
+	ret = target_fabric_configfs_register(fabric);
+	if (ret < 0) {
+		pr_err("target_fabric_configfs_register() failed"
+				" for TCM_VHOST\n");
+		return ret;
+	}
+	/*
+	 * Setup our local pointer to *fabric
+	 */
+	tcm_vhost_fabric_configfs = fabric;
+	pr_debug("TCM_VHOST[0] - Set fabric -> tcm_vhost_fabric_configfs\n");
+	return 0;
+};
+
+static void tcm_vhost_deregister_configfs(void)
+{
+	if (!tcm_vhost_fabric_configfs)
+		return;
+
+	target_fabric_configfs_deregister(tcm_vhost_fabric_configfs);
+	tcm_vhost_fabric_configfs = NULL;
+	pr_debug("TCM_VHOST[0] - Cleared tcm_vhost_fabric_configfs\n");
+};
+
+static int __init tcm_vhost_init(void)
+{
+	int ret = -ENOMEM;
+
+	tcm_vhost_workqueue = alloc_workqueue("tcm_vhost", 0, 0);
+	if (!tcm_vhost_workqueue)
+		goto out;
+
+	ret = vhost_scsi_register();
+	if (ret < 0)
+		goto out_destroy_workqueue;
+
+	ret = tcm_vhost_register_configfs();
+	if (ret < 0)
+		goto out_vhost_scsi_deregister;
+
+	return 0;
+
+out_vhost_scsi_deregister:
+	vhost_scsi_deregister();
+out_destroy_workqueue:
+	destroy_workqueue(tcm_vhost_workqueue);
+out:
+	return ret;
+};
+
+static void tcm_vhost_exit(void)
+{
+	tcm_vhost_deregister_configfs();
+	vhost_scsi_deregister();
+	destroy_workqueue(tcm_vhost_workqueue);
+};
+
+MODULE_DESCRIPTION("TCM_VHOST series fabric driver");
+MODULE_LICENSE("GPL");
+module_init(tcm_vhost_init);
+module_exit(tcm_vhost_exit);
diff --git a/drivers/vhost/tcm_vhost.h b/drivers/vhost/tcm_vhost.h
new file mode 100644
index 0000000..9d6cace
--- /dev/null
+++ b/drivers/vhost/tcm_vhost.h
@@ -0,0 +1,74 @@
+#define TCM_VHOST_VERSION  "v0.1"
+#define TCM_VHOST_NAMELEN 256
+#define TCM_VHOST_MAX_CDB_SIZE 32
+
+struct tcm_vhost_cmd {
+	/* Descriptor from vhost_get_vq_desc() for virt_queue segment */
+	int tvc_vq_desc;
+	/* The Tag from include/linux/virtio_scsi.h:struct virtio_scsi_cmd_req */
+	u64 tvc_tag;
+	/* The number of scatterlists associated with this cmd */
+	u32 tvc_sgl_count;
+	/* Saved unpacked SCSI LUN for tcm_vhost_submission_work() */
+	u32 tvc_lun;
+	/* Pointer to the SGL formatted memory from virtio-scsi */
+	struct scatterlist *tvc_sgl;
+	/* Pointer to response */
+	struct virtio_scsi_cmd_resp __user *tvc_resp;
+	/* Pointer to vhost_scsi for our device */
+	struct vhost_scsi *tvc_vhost;
+	/* The TCM I/O descriptor that is accessed via container_of() */
+	struct se_cmd tvc_se_cmd;
+	/* work item used for cmwq dispatch to tcm_vhost_submission_work() */
+	struct work_struct work;
+	/* Copy of the incoming SCSI command descriptor block (CDB) */
+	unsigned char tvc_cdb[TCM_VHOST_MAX_CDB_SIZE];
+	/* Sense buffer that will be mapped into outgoing status */
+	unsigned char tvc_sense_buf[TRANSPORT_SENSE_BUFFER];
+	/* Completed commands list, serviced from vhost worker thread */
+	struct list_head tvc_completion_list;
+};
+
+struct tcm_vhost_nexus {
+	/* Pointer to TCM session for I_T Nexus */
+	struct se_session *tvn_se_sess;
+};
+
+struct tcm_vhost_nacl {
+	/* Binary World Wide unique Port Name for Vhost Initiator port */
+	u64 iport_wwpn;
+	/* ASCII formatted WWPN for Sas Initiator port */
+	char iport_name[TCM_VHOST_NAMELEN];
+	/* Returned by tcm_vhost_make_nodeacl() */
+	struct se_node_acl se_node_acl;
+};
+
+struct tcm_vhost_tpg {
+	/* Vhost port target portal group tag for TCM */
+	u16 tport_tpgt;
+	/* Used to track number of TPG Port/Lun Links wrt to explict I_T Nexus shutdown */
+	atomic_t tv_tpg_port_count;
+	/* Used for vhost_scsi device reference to tpg_nexus */
+	atomic_t tv_tpg_vhost_count;
+	/* list for tcm_vhost_list */
+	struct list_head tv_tpg_list;
+	/* Used to protect access for tpg_nexus */
+	struct mutex tv_tpg_mutex;
+	/* Pointer to the TCM VHost I_T Nexus for this TPG endpoint */
+	struct tcm_vhost_nexus *tpg_nexus;
+	/* Pointer back to tcm_vhost_tport */
+	struct tcm_vhost_tport *tport;
+	/* Returned by tcm_vhost_make_tpg() */
+	struct se_portal_group se_tpg;
+};
+
+struct tcm_vhost_tport {
+	/* SCSI protocol the tport is providing */
+	u8 tport_proto_id;
+	/* Binary World Wide unique Port Name for Vhost Target port */
+	u64 tport_wwpn;
+	/* ASCII formatted WWPN for Vhost Target port */
+	char tport_name[TCM_VHOST_NAMELEN];
+	/* Returned by tcm_vhost_make_tport() */
+	struct se_wwn tport_wwn;
+};
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-11 21:15 [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6 Nicholas A. Bellinger
                   ` (7 preceding siblings ...)
  2012-07-11 21:15 ` Nicholas A. Bellinger
@ 2012-07-17 15:05 ` Michael S. Tsirkin
  2012-07-17 18:55   ` Anthony Liguori
                     ` (2 more replies)
  8 siblings, 3 replies; 48+ messages in thread
From: Michael S. Tsirkin @ 2012-07-17 15:05 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Jens Axboe, Stefan Hajnoczi, kvm-devel, Zhi Yong Wu,
	Anthony Liguori, target-devel, linux-scsi, Paolo Bonzini,
	lf-virt, Christoph Hellwig

On Wed, Jul 11, 2012 at 09:15:00PM +0000, Nicholas A. Bellinger wrote:
> From: Nicholas Bellinger <nab@linux-iscsi.org>
> 
> Hi folks,
> 
> The following is a RFC-v2 series of tcm_vhost target fabric driver code
> currently in-flight for-3.6 mainline code.
> 
> After last week's developments along with the help of some new folks, the
> changelog v1 -> v2 so far looks like:
> 
> *) Fix drivers/vhost/test.c to use VHOST_NET_FEATURES in patch #1 (Asias He)
> *) Fix tv_cmd completion -> release SGL memory leak (nab)
> *) Fix sparse warnings for static variable usage (Fengguang Wu)
> *) Fix sparse warnings for min() typing + printk format specs (Fengguang Wu)
> *) Convert to cmwq submission for I/O dispatch (nab + hch)
> 
> Also following Paolo's request, a patch for hw/virtio-scsi.c that sets
> scsi_host->max_target=0 that removes the need for virtio-scsi LLD to hardcode
> VirtIOSCSIConfig->max_id=1 in order to function with tcm_vhost.
> 
> Note this series has been pushed into target-pending.git/for-next-merge, and
> should be getting picked up for tomorrow's linux-next build.
> 
> Please let us know if you have any concerns and/or additional review feedback.
> 
> Thank you!


It still seems not 100% clear whether this driver will have major
userspace using it. And if not, it would be very hard to support a driver
when recent userspace does not use it in the end.

I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING.
Then we don't commit to an ABI.
For this, you can add a separate Kconfig and source it from drivers/staging/Kconfig.
Maybe it needs to be in a separate directory drivers/vhost/staging/Kconfig.


> Nicholas Bellinger (2):
>   vhost: Add vhost_scsi specific defines
>   tcm_vhost: Initial merge for vhost level target fabric driver
> 
> Stefan Hajnoczi (2):
>   vhost: Separate vhost-net features from vhost features
>   vhost: make vhost work queue visible
> 
>  drivers/vhost/Kconfig     |    6 +
>  drivers/vhost/Makefile    |    1 +
>  drivers/vhost/net.c       |    4 +-
>  drivers/vhost/tcm_vhost.c | 1609 +++++++++++++++++++++++++++++++++++++++++++++
>  drivers/vhost/tcm_vhost.h |   74 ++
>  drivers/vhost/test.c      |    4 +-
>  drivers/vhost/vhost.c     |    5 +-
>  drivers/vhost/vhost.h     |    6 +-
>  include/linux/vhost.h     |    9 +
>  9 files changed, 1710 insertions(+), 8 deletions(-)
>  create mode 100644 drivers/vhost/tcm_vhost.c
>  create mode 100644 drivers/vhost/tcm_vhost.h
> 
> -- 
> 1.7.2.5

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-17 15:05 ` [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6 Michael S. Tsirkin
@ 2012-07-17 18:55   ` Anthony Liguori
  2012-07-17 19:00     ` Michael S. Tsirkin
  2012-07-17 21:50     ` Nicholas A. Bellinger
  2012-07-17 21:17   ` Nicholas A. Bellinger
  2012-07-17 21:17   ` Nicholas A. Bellinger
  2 siblings, 2 replies; 48+ messages in thread
From: Anthony Liguori @ 2012-07-17 18:55 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jens Axboe, Stefan Hajnoczi, kvm-devel, lf-virt, Anthony Liguori,
	target-devel, linux-scsi, Paolo Bonzini, Zhi Yong Wu,
	Christoph Hellwig

On 07/17/2012 10:05 AM, Michael S. Tsirkin wrote:
> On Wed, Jul 11, 2012 at 09:15:00PM +0000, Nicholas A. Bellinger wrote:
>> From: Nicholas Bellinger<nab@linux-iscsi.org>
>>
>> Hi folks,
>>
>> The following is a RFC-v2 series of tcm_vhost target fabric driver code
>> currently in-flight for-3.6 mainline code.
>>
>> After last week's developments along with the help of some new folks, the
>> changelog v1 ->  v2 so far looks like:
>>
>> *) Fix drivers/vhost/test.c to use VHOST_NET_FEATURES in patch #1 (Asias He)
>> *) Fix tv_cmd completion ->  release SGL memory leak (nab)
>> *) Fix sparse warnings for static variable usage (Fengguang Wu)
>> *) Fix sparse warnings for min() typing + printk format specs (Fengguang Wu)
>> *) Convert to cmwq submission for I/O dispatch (nab + hch)
>>
>> Also following Paolo's request, a patch for hw/virtio-scsi.c that sets
>> scsi_host->max_target=0 that removes the need for virtio-scsi LLD to hardcode
>> VirtIOSCSIConfig->max_id=1 in order to function with tcm_vhost.
>>
>> Note this series has been pushed into target-pending.git/for-next-merge, and
>> should be getting picked up for tomorrow's linux-next build.
>>
>> Please let us know if you have any concerns and/or additional review feedback.
>>
>> Thank you!
>
>
> It still seems not 100% clear whether this driver will have major
> userspace using it. And if not, it would be very hard to support a driver
> when recent userspace does not use it in the end.

I don't think this is a good reason to exclude something from the kernel. 
However, there are good reasons why this doesn't make sense for something like 
QEMU--specifically because we have a large number of features in our block layer 
that tcm_vhost would bypass.

But perhaps it makes sense for something like native kvm tool.  And if it did go 
into the kernel, we would certainly support it in QEMU.

But I do think the kernel should carefully consider whether it wants to support 
an interface like this.  This an extremely complicated ABI with a lot of subtle 
details around state and compatibility.

Are you absolutely confident that you can support a userspace application that 
expects to get exactly the same response from all possible commands in 20 kernel 
versions from now?  Virtualization requires absolutely precise compatibility in 
terms of bugs and features.  This is probably not something the TCM stack has 
had to consider yet.

> I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING.
> Then we don't commit to an ABI.

I think this is a good idea.  Even if it goes in, a really clear policy would be 
needed wrt the userspace ABI.

While tcm_vhost is probably more useful than vhost_blk, it's a much more complex 
ABI to maintain.

Regards,

Anthony Liguori

> For this, you can add a separate Kconfig and source it from drivers/staging/Kconfig.
> Maybe it needs to be in a separate directory drivers/vhost/staging/Kconfig.
>
>
>> Nicholas Bellinger (2):
>>    vhost: Add vhost_scsi specific defines
>>    tcm_vhost: Initial merge for vhost level target fabric driver
>>
>> Stefan Hajnoczi (2):
>>    vhost: Separate vhost-net features from vhost features
>>    vhost: make vhost work queue visible
>>
>>   drivers/vhost/Kconfig     |    6 +
>>   drivers/vhost/Makefile    |    1 +
>>   drivers/vhost/net.c       |    4 +-
>>   drivers/vhost/tcm_vhost.c | 1609 +++++++++++++++++++++++++++++++++++++++++++++
>>   drivers/vhost/tcm_vhost.h |   74 ++
>>   drivers/vhost/test.c      |    4 +-
>>   drivers/vhost/vhost.c     |    5 +-
>>   drivers/vhost/vhost.h     |    6 +-
>>   include/linux/vhost.h     |    9 +
>>   9 files changed, 1710 insertions(+), 8 deletions(-)
>>   create mode 100644 drivers/vhost/tcm_vhost.c
>>   create mode 100644 drivers/vhost/tcm_vhost.h
>>
>> --
>> 1.7.2.5
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-17 18:55   ` Anthony Liguori
@ 2012-07-17 19:00     ` Michael S. Tsirkin
  2012-07-17 21:50     ` Nicholas A. Bellinger
  1 sibling, 0 replies; 48+ messages in thread
From: Michael S. Tsirkin @ 2012-07-17 19:00 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Jens Axboe, Stefan Hajnoczi, kvm-devel, lf-virt, Anthony Liguori,
	target-devel, linux-scsi, Paolo Bonzini, Zhi Yong Wu,
	Christoph Hellwig

On Tue, Jul 17, 2012 at 01:55:42PM -0500, Anthony Liguori wrote:
> On 07/17/2012 10:05 AM, Michael S. Tsirkin wrote:
> >On Wed, Jul 11, 2012 at 09:15:00PM +0000, Nicholas A. Bellinger wrote:
> >>From: Nicholas Bellinger<nab@linux-iscsi.org>
> >>
> >>Hi folks,
> >>
> >>The following is a RFC-v2 series of tcm_vhost target fabric driver code
> >>currently in-flight for-3.6 mainline code.
> >>
> >>After last week's developments along with the help of some new folks, the
> >>changelog v1 ->  v2 so far looks like:
> >>
> >>*) Fix drivers/vhost/test.c to use VHOST_NET_FEATURES in patch #1 (Asias He)
> >>*) Fix tv_cmd completion ->  release SGL memory leak (nab)
> >>*) Fix sparse warnings for static variable usage (Fengguang Wu)
> >>*) Fix sparse warnings for min() typing + printk format specs (Fengguang Wu)
> >>*) Convert to cmwq submission for I/O dispatch (nab + hch)
> >>
> >>Also following Paolo's request, a patch for hw/virtio-scsi.c that sets
> >>scsi_host->max_target=0 that removes the need for virtio-scsi LLD to hardcode
> >>VirtIOSCSIConfig->max_id=1 in order to function with tcm_vhost.
> >>
> >>Note this series has been pushed into target-pending.git/for-next-merge, and
> >>should be getting picked up for tomorrow's linux-next build.
> >>
> >>Please let us know if you have any concerns and/or additional review feedback.
> >>
> >>Thank you!
> >
> >
> >It still seems not 100% clear whether this driver will have major
> >userspace using it. And if not, it would be very hard to support a driver
> >when recent userspace does not use it in the end.
> 
> I don't think this is a good reason to exclude something from the
> kernel. However, there are good reasons why this doesn't make sense
> for something like QEMU--specifically because we have a large number
> of features in our block layer that tcm_vhost would bypass.
> 
> But perhaps it makes sense for something like native kvm tool.  And
> if it did go into the kernel, we would certainly support it in QEMU.
> 
> But I do think the kernel should carefully consider whether it wants
> to support an interface like this.  This an extremely complicated
> ABI with a lot of subtle details around state and compatibility.
> 
> Are you absolutely confident that you can support a userspace
> application that expects to get exactly the same response from all
> possible commands in 20 kernel versions from now?  Virtualization
> requires absolutely precise compatibility in terms of bugs and
> features.  This is probably not something the TCM stack has had to
> consider yet.
> 
> >I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING.
> >Then we don't commit to an ABI.
> 
> I think this is a good idea.  Even if it goes in, a really clear
> policy would be needed wrt the userspace ABI.
> 
> While tcm_vhost is probably more useful than vhost_blk, it's a much
> more complex ABI to maintain.
> 
> Regards,
> 
> Anthony Liguori

Maybe something like a whitelist of features will help?
Might even be a good idea to make it user controllable.

> >For this, you can add a separate Kconfig and source it from drivers/staging/Kconfig.
> >Maybe it needs to be in a separate directory drivers/vhost/staging/Kconfig.
> >
> >
> >>Nicholas Bellinger (2):
> >>   vhost: Add vhost_scsi specific defines
> >>   tcm_vhost: Initial merge for vhost level target fabric driver
> >>
> >>Stefan Hajnoczi (2):
> >>   vhost: Separate vhost-net features from vhost features
> >>   vhost: make vhost work queue visible
> >>
> >>  drivers/vhost/Kconfig     |    6 +
> >>  drivers/vhost/Makefile    |    1 +
> >>  drivers/vhost/net.c       |    4 +-
> >>  drivers/vhost/tcm_vhost.c | 1609 +++++++++++++++++++++++++++++++++++++++++++++
> >>  drivers/vhost/tcm_vhost.h |   74 ++
> >>  drivers/vhost/test.c      |    4 +-
> >>  drivers/vhost/vhost.c     |    5 +-
> >>  drivers/vhost/vhost.h     |    6 +-
> >>  include/linux/vhost.h     |    9 +
> >>  9 files changed, 1710 insertions(+), 8 deletions(-)
> >>  create mode 100644 drivers/vhost/tcm_vhost.c
> >>  create mode 100644 drivers/vhost/tcm_vhost.h
> >>
> >>--
> >>1.7.2.5
> >

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-17 15:05 ` [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6 Michael S. Tsirkin
  2012-07-17 18:55   ` Anthony Liguori
  2012-07-17 21:17   ` Nicholas A. Bellinger
@ 2012-07-17 21:17   ` Nicholas A. Bellinger
  2012-07-17 21:34     ` Michael S. Tsirkin
  2012-07-17 21:58     ` Michael S. Tsirkin
  2 siblings, 2 replies; 48+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-17 21:17 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: target-devel, linux-scsi, lf-virt, kvm-devel, Stefan Hajnoczi,
	Zhi Yong Wu, Anthony Liguori, Paolo Bonzini, Christoph Hellwig,
	Jens Axboe, Hannes Reinecke

On Tue, 2012-07-17 at 18:05 +0300, Michael S. Tsirkin wrote:
> On Wed, Jul 11, 2012 at 09:15:00PM +0000, Nicholas A. Bellinger wrote:
> > From: Nicholas Bellinger <nab@linux-iscsi.org>
> > 
> > Hi folks,
> > 
> > The following is a RFC-v2 series of tcm_vhost target fabric driver code
> > currently in-flight for-3.6 mainline code.
> > 
> > After last week's developments along with the help of some new folks, the
> > changelog v1 -> v2 so far looks like:
> > 
> > *) Fix drivers/vhost/test.c to use VHOST_NET_FEATURES in patch #1 (Asias He)
> > *) Fix tv_cmd completion -> release SGL memory leak (nab)
> > *) Fix sparse warnings for static variable usage (Fengguang Wu)
> > *) Fix sparse warnings for min() typing + printk format specs (Fengguang Wu)
> > *) Convert to cmwq submission for I/O dispatch (nab + hch)
> > 
> > Also following Paolo's request, a patch for hw/virtio-scsi.c that sets
> > scsi_host->max_target=0 that removes the need for virtio-scsi LLD to hardcode
> > VirtIOSCSIConfig->max_id=1 in order to function with tcm_vhost.
> > 
> > Note this series has been pushed into target-pending.git/for-next-merge, and
> > should be getting picked up for tomorrow's linux-next build.
> > 
> > Please let us know if you have any concerns and/or additional review feedback.
> > 
> > Thank you!
> 
> 
> It still seems not 100% clear whether this driver will have major
> userspace using it. And if not, it would be very hard to support a driver
> when recent userspace does not use it in the end.
> 

I'm happy to commit to working with QEMU + kvm-tool folks to get to a
series that can (eventually) see vhost-scsi support merged into upstream
userspace code.

It took roughly 2 years to get the megasas HBA emulation from Dr. Hannes
merged, but certainly vhost-scsi has alot less moving pieces and
hopefully alot less controversial bits than the buffer -> SGL
conversion..  The key word being here 'hopefully'..  ;)

> I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING.
> Then we don't commit to an ABI.
> For this, you can add a separate Kconfig and source it from drivers/staging/Kconfig.
> Maybe it needs to be in a separate directory drivers/vhost/staging/Kconfig.
> 

So tcm_vhost has been marked as Experimental following virtio-scsi.

Wrt to staging, I'd like to avoid mucking with staging because:

*) The code has been posted for review
*) The code has been converted to use the latest target-core primitives
*) The code does not require cleanups between staging -> merge
*) The code has been stable the last 7 days since RFC-v2 with heavy 

Also, tcm_vhost has been marked as Experimental following virtio-scsi.
I'd much rather leave it at Experimental until we merge upstream
userspace support.   If userspace support never ends up materializing,
I'm fine with dropping it all together.

However at this point given that there is a 3x performance gap between
virtio-scsi-raw + virtio-scsi+tcm_vhost for random mixed small block
I/O, and we still need the latter to do proper SCSI CDB passthrough for
non TYPE_DISK devices I'm hoping that we can agree on userspace bits
once tcm_vhost is merged.

--nab

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-17 15:05 ` [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6 Michael S. Tsirkin
  2012-07-17 18:55   ` Anthony Liguori
@ 2012-07-17 21:17   ` Nicholas A. Bellinger
  2012-07-17 21:17   ` Nicholas A. Bellinger
  2 siblings, 0 replies; 48+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-17 21:17 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jens Axboe, Stefan Hajnoczi, kvm-devel, Zhi Yong Wu,
	Anthony Liguori, target-devel, linux-scsi, Paolo Bonzini,
	lf-virt, Christoph Hellwig

On Tue, 2012-07-17 at 18:05 +0300, Michael S. Tsirkin wrote:
> On Wed, Jul 11, 2012 at 09:15:00PM +0000, Nicholas A. Bellinger wrote:
> > From: Nicholas Bellinger <nab@linux-iscsi.org>
> > 
> > Hi folks,
> > 
> > The following is a RFC-v2 series of tcm_vhost target fabric driver code
> > currently in-flight for-3.6 mainline code.
> > 
> > After last week's developments along with the help of some new folks, the
> > changelog v1 -> v2 so far looks like:
> > 
> > *) Fix drivers/vhost/test.c to use VHOST_NET_FEATURES in patch #1 (Asias He)
> > *) Fix tv_cmd completion -> release SGL memory leak (nab)
> > *) Fix sparse warnings for static variable usage (Fengguang Wu)
> > *) Fix sparse warnings for min() typing + printk format specs (Fengguang Wu)
> > *) Convert to cmwq submission for I/O dispatch (nab + hch)
> > 
> > Also following Paolo's request, a patch for hw/virtio-scsi.c that sets
> > scsi_host->max_target=0 that removes the need for virtio-scsi LLD to hardcode
> > VirtIOSCSIConfig->max_id=1 in order to function with tcm_vhost.
> > 
> > Note this series has been pushed into target-pending.git/for-next-merge, and
> > should be getting picked up for tomorrow's linux-next build.
> > 
> > Please let us know if you have any concerns and/or additional review feedback.
> > 
> > Thank you!
> 
> 
> It still seems not 100% clear whether this driver will have major
> userspace using it. And if not, it would be very hard to support a driver
> when recent userspace does not use it in the end.
> 

I'm happy to commit to working with QEMU + kvm-tool folks to get to a
series that can (eventually) see vhost-scsi support merged into upstream
userspace code.

It took roughly 2 years to get the megasas HBA emulation from Dr. Hannes
merged, but certainly vhost-scsi has alot less moving pieces and
hopefully alot less controversial bits than the buffer -> SGL
conversion..  The key word being here 'hopefully'..  ;)

> I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING.
> Then we don't commit to an ABI.
> For this, you can add a separate Kconfig and source it from drivers/staging/Kconfig.
> Maybe it needs to be in a separate directory drivers/vhost/staging/Kconfig.
> 

So tcm_vhost has been marked as Experimental following virtio-scsi.

Wrt to staging, I'd like to avoid mucking with staging because:

*) The code has been posted for review
*) The code has been converted to use the latest target-core primitives
*) The code does not require cleanups between staging -> merge
*) The code has been stable the last 7 days since RFC-v2 with heavy 

Also, tcm_vhost has been marked as Experimental following virtio-scsi.
I'd much rather leave it at Experimental until we merge upstream
userspace support.   If userspace support never ends up materializing,
I'm fine with dropping it all together.

However at this point given that there is a 3x performance gap between
virtio-scsi-raw + virtio-scsi+tcm_vhost for random mixed small block
I/O, and we still need the latter to do proper SCSI CDB passthrough for
non TYPE_DISK devices I'm hoping that we can agree on userspace bits
once tcm_vhost is merged.

--nab

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-17 21:17   ` Nicholas A. Bellinger
@ 2012-07-17 21:34     ` Michael S. Tsirkin
  2012-07-17 22:02       ` Nicholas A. Bellinger
  2012-07-17 22:02       ` Nicholas A. Bellinger
  2012-07-17 21:58     ` Michael S. Tsirkin
  1 sibling, 2 replies; 48+ messages in thread
From: Michael S. Tsirkin @ 2012-07-17 21:34 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Jens Axboe, Stefan Hajnoczi, kvm-devel, Zhi Yong Wu,
	Anthony Liguori, target-devel, linux-scsi, Paolo Bonzini,
	lf-virt, Christoph Hellwig

On Tue, Jul 17, 2012 at 02:17:22PM -0700, Nicholas A. Bellinger wrote:
> On Tue, 2012-07-17 at 18:05 +0300, Michael S. Tsirkin wrote:
> > On Wed, Jul 11, 2012 at 09:15:00PM +0000, Nicholas A. Bellinger wrote:
> > > From: Nicholas Bellinger <nab@linux-iscsi.org>
> > > 
> > > Hi folks,
> > > 
> > > The following is a RFC-v2 series of tcm_vhost target fabric driver code
> > > currently in-flight for-3.6 mainline code.
> > > 
> > > After last week's developments along with the help of some new folks, the
> > > changelog v1 -> v2 so far looks like:
> > > 
> > > *) Fix drivers/vhost/test.c to use VHOST_NET_FEATURES in patch #1 (Asias He)
> > > *) Fix tv_cmd completion -> release SGL memory leak (nab)
> > > *) Fix sparse warnings for static variable usage (Fengguang Wu)
> > > *) Fix sparse warnings for min() typing + printk format specs (Fengguang Wu)
> > > *) Convert to cmwq submission for I/O dispatch (nab + hch)
> > > 
> > > Also following Paolo's request, a patch for hw/virtio-scsi.c that sets
> > > scsi_host->max_target=0 that removes the need for virtio-scsi LLD to hardcode
> > > VirtIOSCSIConfig->max_id=1 in order to function with tcm_vhost.
> > > 
> > > Note this series has been pushed into target-pending.git/for-next-merge, and
> > > should be getting picked up for tomorrow's linux-next build.
> > > 
> > > Please let us know if you have any concerns and/or additional review feedback.
> > > 
> > > Thank you!
> > 
> > 
> > It still seems not 100% clear whether this driver will have major
> > userspace using it. And if not, it would be very hard to support a driver
> > when recent userspace does not use it in the end.
> > 
> 
> I'm happy to commit to working with QEMU + kvm-tool folks to get to a
> series that can (eventually) see vhost-scsi support merged into upstream
> userspace code.
> 
> It took roughly 2 years to get the megasas HBA emulation from Dr. Hannes
> merged, but certainly vhost-scsi has alot less moving pieces and
> hopefully alot less controversial bits than the buffer -> SGL
> conversion..  The key word being here 'hopefully'..  ;)
> 
> > I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING.
> > Then we don't commit to an ABI.
> > For this, you can add a separate Kconfig and source it from drivers/staging/Kconfig.
> > Maybe it needs to be in a separate directory drivers/vhost/staging/Kconfig.
> > 
> 
> So tcm_vhost has been marked as Experimental following virtio-scsi.
> 
> Wrt to staging, I'd like to avoid mucking with staging because:
> 
> *) The code has been posted for review
> *) The code has been converted to use the latest target-core primitives
> *) The code does not require cleanups between staging -> merge
> *) The code has been stable the last 7 days since RFC-v2 with heavy 

staging is not just for code that needs cleanups.
It's for anything that does not guarantee ABI stability yet.
And I think it's a bit early to guarantee ABI stability - 7 days
is not all that long.

See for example Anthony's comments that raise exactly the ABI
issues.

> Also, tcm_vhost has been marked as Experimental following virtio-scsi.
> I'd much rather leave it at Experimental until we merge upstream
> userspace support.   If userspace support never ends up materializing,
> I'm fine with dropping it all together.

Once it's in kernel you never know who will use this driver.
Experimental does not mean driver can be dropped, staging does.

> However at this point given that there is a 3x performance gap between
> virtio-scsi-raw + virtio-scsi+tcm_vhost for random mixed small block
> I/O, and we still need the latter to do proper SCSI CDB passthrough for
> non TYPE_DISK devices I'm hoping that we can agree on userspace bits
> once tcm_vhost is merged.
> 
> --nab

I do think upstream kernel would help you nail userspace issues too
but at this point it looks like either staging meterial or 3.6 is too
early.

-- 
MST

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-17 18:55   ` Anthony Liguori
  2012-07-17 19:00     ` Michael S. Tsirkin
@ 2012-07-17 21:50     ` Nicholas A. Bellinger
  2012-07-18 13:42       ` Anthony Liguori
  2012-07-18 13:42       ` Anthony Liguori
  1 sibling, 2 replies; 48+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-17 21:50 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Jens Axboe, Stefan Hajnoczi, kvm-devel, Michael S. Tsirkin,
	lf-virt, Anthony Liguori, target-devel, linux-scsi,
	Paolo Bonzini, Zhi Yong Wu, Christoph Hellwig

On Tue, 2012-07-17 at 13:55 -0500, Anthony Liguori wrote:
> On 07/17/2012 10:05 AM, Michael S. Tsirkin wrote:
> > On Wed, Jul 11, 2012 at 09:15:00PM +0000, Nicholas A. Bellinger wrote:

<SNIP>

> >
> > It still seems not 100% clear whether this driver will have major
> > userspace using it. And if not, it would be very hard to support a driver
> > when recent userspace does not use it in the end.
> 
> I don't think this is a good reason to exclude something from the kernel. 
> However, there are good reasons why this doesn't make sense for something like 
> QEMU--specifically because we have a large number of features in our block layer 
> that tcm_vhost would bypass.
> 

I can definitely appreciate your concern here as the QEMU maintainer.

> But perhaps it makes sense for something like native kvm tool.  And if it did go 
> into the kernel, we would certainly support it in QEMU.
> 

...

> But I do think the kernel should carefully consider whether it wants to support 
> an interface like this.  This an extremely complicated ABI with a lot of subtle 
> details around state and compatibility.
> 
> Are you absolutely confident that you can support a userspace application that 
> expects to get exactly the same response from all possible commands in 20 kernel 
> versions from now?  Virtualization requires absolutely precise compatibility in 
> terms of bugs and features.  This is probably not something the TCM stack has 
> had to consider yet.
> 

We most certainly have thought about long term userspace compatibility
with TCM.  Our userspace code (that's now available in all major
distros) is completely forward-compatible with new fabric modules such
as tcm_vhost.  No update required.

Also, by virtue of the fact that we are using configfs + rtslib (python
object library) on top, it's very easy to keep any type of compatibility
logic around in python code.  With rtslib, we are able to hide configfs
ABI changes from higher level apps.

So far we've had a track record of 100% userspace ABI compatibility in
mainline since .38, and I don't intend to merge a patch that breaks this
any time soon.  But if that ever happens, apps using rtslib are not
going to be effected.

> > I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING.
> > Then we don't commit to an ABI.
> 
> I think this is a good idea.  Even if it goes in, a really clear policy would be 
> needed wrt the userspace ABI.
> 
> While tcm_vhost is probably more useful than vhost_blk, it's a much more complex 
> ABI to maintain.
> 

As far as I am concerned, the kernel API (eg: configfs directory layout)
as it is now in sys/kernel/config/target/vhost/ is not going to change.
It's based on the same drivers/target/target_core_fabric_configfs.c
generic layout that we've had since .38.

The basic functional fabric layout in configfs is identical (with fabric
dependent WWPN naming of course) regardless of fabric driver, and by
virtue of being generic it means we can add things like fabric dependent
attributes + parameters in the future for existing fabrics without
breaking userspace.

So while I agree the ABI is more complex than vhost-blk, the logic in
target_core_fabric_configfs.c is a basic ABI fabric definition that we
are enforcing across all fabric modules in mainline for long term
compatibility.

--nab

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-17 21:17   ` Nicholas A. Bellinger
  2012-07-17 21:34     ` Michael S. Tsirkin
@ 2012-07-17 21:58     ` Michael S. Tsirkin
  2012-07-17 22:04       ` Nicholas A. Bellinger
  1 sibling, 1 reply; 48+ messages in thread
From: Michael S. Tsirkin @ 2012-07-17 21:58 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Jens Axboe, Stefan Hajnoczi, kvm-devel, Zhi Yong Wu,
	Anthony Liguori, target-devel, linux-scsi, Paolo Bonzini,
	lf-virt, Christoph Hellwig

On Tue, Jul 17, 2012 at 02:17:22PM -0700, Nicholas A. Bellinger wrote:
> Wrt to staging, I'd like to avoid mucking with staging because:
> 
> *) The code has been posted for review
> *) The code has been converted to use the latest target-core primitives
> *) The code does not require cleanups between staging -> merge
> *) The code has been stable the last 7 days since RFC-v2 with heavy 

BTW I don't suggest putting code itself in staging.  Just the config
flag to enable it.  Once we are more or less sure multiple userspaces
are using this driver, we'll move the config hopefully already in 3.7.
What's the downside?

-- 
MST

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-17 21:34     ` Michael S. Tsirkin
  2012-07-17 22:02       ` Nicholas A. Bellinger
@ 2012-07-17 22:02       ` Nicholas A. Bellinger
  2012-07-17 22:18         ` Michael S. Tsirkin
  2012-07-17 22:18         ` Michael S. Tsirkin
  1 sibling, 2 replies; 48+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-17 22:02 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: target-devel, linux-scsi, lf-virt, kvm-devel, Stefan Hajnoczi,
	Zhi Yong Wu, Anthony Liguori, Paolo Bonzini, Christoph Hellwig,
	Jens Axboe, Hannes Reinecke

On Wed, 2012-07-18 at 00:34 +0300, Michael S. Tsirkin wrote:
> On Tue, Jul 17, 2012 at 02:17:22PM -0700, Nicholas A. Bellinger wrote:
> > On Tue, 2012-07-17 at 18:05 +0300, Michael S. Tsirkin wrote:
> > > On Wed, Jul 11, 2012 at 09:15:00PM +0000, Nicholas A. Bellinger wrote:
> > > > From: Nicholas Bellinger <nab@linux-iscsi.org>

<SNIP>

> > I'm happy to commit to working with QEMU + kvm-tool folks to get to a
> > series that can (eventually) see vhost-scsi support merged into upstream
> > userspace code.
> > 
> > It took roughly 2 years to get the megasas HBA emulation from Dr. Hannes
> > merged, but certainly vhost-scsi has alot less moving pieces and
> > hopefully alot less controversial bits than the buffer -> SGL
> > conversion..  The key word being here 'hopefully'..  ;)
> > 
> > > I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING.
> > > Then we don't commit to an ABI.
> > > For this, you can add a separate Kconfig and source it from drivers/staging/Kconfig.
> > > Maybe it needs to be in a separate directory drivers/vhost/staging/Kconfig.
> > > 
> > 
> > So tcm_vhost has been marked as Experimental following virtio-scsi.
> > 
> > Wrt to staging, I'd like to avoid mucking with staging because:
> > 
> > *) The code has been posted for review
> > *) The code has been converted to use the latest target-core primitives
> > *) The code does not require cleanups between staging -> merge
> > *) The code has been stable the last 7 days since RFC-v2 with heavy 
> 
> staging is not just for code that needs cleanups.
> It's for anything that does not guarantee ABI stability yet.
> And I think it's a bit early to guarantee ABI stability - 7 days
> is not all that long.
> 

I was talking about the new I/O path has been running for 7 days.

> See for example Anthony's comments that raise exactly the ABI
> issues.
> 

As mentioned in the response to Anthony, we are using the same generic
fabric ABI in drivers/target/target_core_fabric_configfs.c since .38.

That part is not going to change, and it has not changed for any of the
other 7 target fabric modules we've merged into mainline since then.

> > Also, tcm_vhost has been marked as Experimental following virtio-scsi.
> > I'd much rather leave it at Experimental until we merge upstream
> > userspace support.   If userspace support never ends up materializing,
> > I'm fine with dropping it all together.
> 
> Once it's in kernel you never know who will use this driver.
> Experimental does not mean driver can be dropped, staging does.
> 

Yes, that's the point of being in mainline.  People using the code,
right..?

> > However at this point given that there is a 3x performance gap between
> > virtio-scsi-raw + virtio-scsi+tcm_vhost for random mixed small block
> > I/O, and we still need the latter to do proper SCSI CDB passthrough for
> > non TYPE_DISK devices I'm hoping that we can agree on userspace bits
> > once tcm_vhost is merged.
> > 
> > --nab
> 
> I do think upstream kernel would help you nail userspace issues too
> but at this point it looks like either staging meterial or 3.6 is too
> early.
> 

I think for-3.6 is just the right time for this kernel code.  Seriously,
The basic ABI fabric layout for /sys/kernel/config/target/vhost/ is
going to be the same now for-3.6, the same for-3.7, and the same for .38
code. 

I'd be happy to move tcm_vhost back to drivers/target/ for now, and we
move it to drivers/vhost/ once the userspace bits are worked out..?

Would that be a reasonable compromise to move forward..?

--nab


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-17 21:34     ` Michael S. Tsirkin
@ 2012-07-17 22:02       ` Nicholas A. Bellinger
  2012-07-17 22:02       ` Nicholas A. Bellinger
  1 sibling, 0 replies; 48+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-17 22:02 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jens Axboe, Stefan Hajnoczi, kvm-devel, Zhi Yong Wu,
	Anthony Liguori, target-devel, linux-scsi, Paolo Bonzini,
	lf-virt, Christoph Hellwig

On Wed, 2012-07-18 at 00:34 +0300, Michael S. Tsirkin wrote:
> On Tue, Jul 17, 2012 at 02:17:22PM -0700, Nicholas A. Bellinger wrote:
> > On Tue, 2012-07-17 at 18:05 +0300, Michael S. Tsirkin wrote:
> > > On Wed, Jul 11, 2012 at 09:15:00PM +0000, Nicholas A. Bellinger wrote:
> > > > From: Nicholas Bellinger <nab@linux-iscsi.org>

<SNIP>

> > I'm happy to commit to working with QEMU + kvm-tool folks to get to a
> > series that can (eventually) see vhost-scsi support merged into upstream
> > userspace code.
> > 
> > It took roughly 2 years to get the megasas HBA emulation from Dr. Hannes
> > merged, but certainly vhost-scsi has alot less moving pieces and
> > hopefully alot less controversial bits than the buffer -> SGL
> > conversion..  The key word being here 'hopefully'..  ;)
> > 
> > > I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING.
> > > Then we don't commit to an ABI.
> > > For this, you can add a separate Kconfig and source it from drivers/staging/Kconfig.
> > > Maybe it needs to be in a separate directory drivers/vhost/staging/Kconfig.
> > > 
> > 
> > So tcm_vhost has been marked as Experimental following virtio-scsi.
> > 
> > Wrt to staging, I'd like to avoid mucking with staging because:
> > 
> > *) The code has been posted for review
> > *) The code has been converted to use the latest target-core primitives
> > *) The code does not require cleanups between staging -> merge
> > *) The code has been stable the last 7 days since RFC-v2 with heavy 
> 
> staging is not just for code that needs cleanups.
> It's for anything that does not guarantee ABI stability yet.
> And I think it's a bit early to guarantee ABI stability - 7 days
> is not all that long.
> 

I was talking about the new I/O path has been running for 7 days.

> See for example Anthony's comments that raise exactly the ABI
> issues.
> 

As mentioned in the response to Anthony, we are using the same generic
fabric ABI in drivers/target/target_core_fabric_configfs.c since .38.

That part is not going to change, and it has not changed for any of the
other 7 target fabric modules we've merged into mainline since then.

> > Also, tcm_vhost has been marked as Experimental following virtio-scsi.
> > I'd much rather leave it at Experimental until we merge upstream
> > userspace support.   If userspace support never ends up materializing,
> > I'm fine with dropping it all together.
> 
> Once it's in kernel you never know who will use this driver.
> Experimental does not mean driver can be dropped, staging does.
> 

Yes, that's the point of being in mainline.  People using the code,
right..?

> > However at this point given that there is a 3x performance gap between
> > virtio-scsi-raw + virtio-scsi+tcm_vhost for random mixed small block
> > I/O, and we still need the latter to do proper SCSI CDB passthrough for
> > non TYPE_DISK devices I'm hoping that we can agree on userspace bits
> > once tcm_vhost is merged.
> > 
> > --nab
> 
> I do think upstream kernel would help you nail userspace issues too
> but at this point it looks like either staging meterial or 3.6 is too
> early.
> 

I think for-3.6 is just the right time for this kernel code.  Seriously,
The basic ABI fabric layout for /sys/kernel/config/target/vhost/ is
going to be the same now for-3.6, the same for-3.7, and the same for .38
code. 

I'd be happy to move tcm_vhost back to drivers/target/ for now, and we
move it to drivers/vhost/ once the userspace bits are worked out..?

Would that be a reasonable compromise to move forward..?

--nab

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-17 21:58     ` Michael S. Tsirkin
@ 2012-07-17 22:04       ` Nicholas A. Bellinger
  0 siblings, 0 replies; 48+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-17 22:04 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jens Axboe, Stefan Hajnoczi, kvm-devel, Zhi Yong Wu,
	Anthony Liguori, target-devel, linux-scsi, Paolo Bonzini,
	lf-virt, Christoph Hellwig

On Wed, 2012-07-18 at 00:58 +0300, Michael S. Tsirkin wrote:
> On Tue, Jul 17, 2012 at 02:17:22PM -0700, Nicholas A. Bellinger wrote:
> > Wrt to staging, I'd like to avoid mucking with staging because:
> > 
> > *) The code has been posted for review
> > *) The code has been converted to use the latest target-core primitives
> > *) The code does not require cleanups between staging -> merge
> > *) The code has been stable the last 7 days since RFC-v2 with heavy 
> 
> BTW I don't suggest putting code itself in staging.  Just the config
> flag to enable it.  Once we are more or less sure multiple userspaces
> are using this driver, we'll move the config hopefully already in 3.7.
> What's the downside?
> 

Ahh, sorry I managed to miss that part..  ;)

If it's just a CONFIG_STAGING flag for a release or two until we work
out the userspace bits, I don't have an objection doing something like
that if it helps getting the code exposed to a large set of eyes in
mainline.

--nab

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-17 22:02       ` Nicholas A. Bellinger
  2012-07-17 22:18         ` Michael S. Tsirkin
@ 2012-07-17 22:18         ` Michael S. Tsirkin
  2012-07-17 22:37           ` Nicholas A. Bellinger
  1 sibling, 1 reply; 48+ messages in thread
From: Michael S. Tsirkin @ 2012-07-17 22:18 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: target-devel, linux-scsi, lf-virt, kvm-devel, Stefan Hajnoczi,
	Zhi Yong Wu, Anthony Liguori, Paolo Bonzini, Christoph Hellwig,
	Jens Axboe, Hannes Reinecke

On Tue, Jul 17, 2012 at 03:02:08PM -0700, Nicholas A. Bellinger wrote:
> On Wed, 2012-07-18 at 00:34 +0300, Michael S. Tsirkin wrote:
> > On Tue, Jul 17, 2012 at 02:17:22PM -0700, Nicholas A. Bellinger wrote:
> > > On Tue, 2012-07-17 at 18:05 +0300, Michael S. Tsirkin wrote:
> > > > On Wed, Jul 11, 2012 at 09:15:00PM +0000, Nicholas A. Bellinger wrote:
> > > > > From: Nicholas Bellinger <nab@linux-iscsi.org>
> 
> <SNIP>
> 
> > > I'm happy to commit to working with QEMU + kvm-tool folks to get to a
> > > series that can (eventually) see vhost-scsi support merged into upstream
> > > userspace code.
> > > 
> > > It took roughly 2 years to get the megasas HBA emulation from Dr. Hannes
> > > merged, but certainly vhost-scsi has alot less moving pieces and
> > > hopefully alot less controversial bits than the buffer -> SGL
> > > conversion..  The key word being here 'hopefully'..  ;)
> > > 
> > > > I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING.
> > > > Then we don't commit to an ABI.
> > > > For this, you can add a separate Kconfig and source it from drivers/staging/Kconfig.
> > > > Maybe it needs to be in a separate directory drivers/vhost/staging/Kconfig.
> > > > 
> > > 
> > > So tcm_vhost has been marked as Experimental following virtio-scsi.
> > > 
> > > Wrt to staging, I'd like to avoid mucking with staging because:
> > > 
> > > *) The code has been posted for review
> > > *) The code has been converted to use the latest target-core primitives
> > > *) The code does not require cleanups between staging -> merge
> > > *) The code has been stable the last 7 days since RFC-v2 with heavy 
> > 
> > staging is not just for code that needs cleanups.
> > It's for anything that does not guarantee ABI stability yet.
> > And I think it's a bit early to guarantee ABI stability - 7 days
> > is not all that long.
> > 
> 
> I was talking about the new I/O path has been running for 7 days.
> 
> > See for example Anthony's comments that raise exactly the ABI
> > issues.
> > 
> 
> As mentioned in the response to Anthony, we are using the same generic
> fabric ABI in drivers/target/target_core_fabric_configfs.c since .38.
> 
> That part is not going to change, and it has not changed for any of the
> other 7 target fabric modules we've merged into mainline since then.
> 
> > > Also, tcm_vhost has been marked as Experimental following virtio-scsi.
> > > I'd much rather leave it at Experimental until we merge upstream
> > > userspace support.   If userspace support never ends up materializing,
> > > I'm fine with dropping it all together.
> > 
> > Once it's in kernel you never know who will use this driver.
> > Experimental does not mean driver can be dropped, staging does.
> > 
> 
> Yes, that's the point of being in mainline.  People using the code,
> right..?

Exactly. I am just worried about in the end being no major users and we
are being stuck with a niche driver that as a result is very hard to test.
And the reason for the fear is the initial negative reaction from the
qemu side.  And no if it's there we can't just drop it.

> > > However at this point given that there is a 3x performance gap between
> > > virtio-scsi-raw + virtio-scsi+tcm_vhost for random mixed small block
> > > I/O, and we still need the latter to do proper SCSI CDB passthrough for
> > > non TYPE_DISK devices I'm hoping that we can agree on userspace bits
> > > once tcm_vhost is merged.
> > > 
> > > --nab
> > 
> > I do think upstream kernel would help you nail userspace issues too
> > but at this point it looks like either staging meterial or 3.6 is too
> > early.
> > 
> 
> I think for-3.6 is just the right time for this kernel code.  Seriously,
> The basic ABI fabric layout for /sys/kernel/config/target/vhost/ is
> going to be the same now for-3.6, the same for-3.7, and the same for .38
> code. 
> 
> I'd be happy to move tcm_vhost back to drivers/target/ for now, and we
> move it to drivers/vhost/ once the userspace bits are worked out..?
> 
> Would that be a reasonable compromise to move forward..?
> 
> --nab

I don't see how it helps.  The driver is either a guaranteed ABI or not.
I'd prefer not to have vhost users outside drivers/vhost/ since it is
harder for me to keep track of them.

What's the problem with staging proposal? It's just another hoop to jump
through to enable it?

-- 
MST

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-17 22:02       ` Nicholas A. Bellinger
@ 2012-07-17 22:18         ` Michael S. Tsirkin
  2012-07-17 22:18         ` Michael S. Tsirkin
  1 sibling, 0 replies; 48+ messages in thread
From: Michael S. Tsirkin @ 2012-07-17 22:18 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Jens Axboe, Stefan Hajnoczi, kvm-devel, Zhi Yong Wu,
	Anthony Liguori, target-devel, linux-scsi, Paolo Bonzini,
	lf-virt, Christoph Hellwig

On Tue, Jul 17, 2012 at 03:02:08PM -0700, Nicholas A. Bellinger wrote:
> On Wed, 2012-07-18 at 00:34 +0300, Michael S. Tsirkin wrote:
> > On Tue, Jul 17, 2012 at 02:17:22PM -0700, Nicholas A. Bellinger wrote:
> > > On Tue, 2012-07-17 at 18:05 +0300, Michael S. Tsirkin wrote:
> > > > On Wed, Jul 11, 2012 at 09:15:00PM +0000, Nicholas A. Bellinger wrote:
> > > > > From: Nicholas Bellinger <nab@linux-iscsi.org>
> 
> <SNIP>
> 
> > > I'm happy to commit to working with QEMU + kvm-tool folks to get to a
> > > series that can (eventually) see vhost-scsi support merged into upstream
> > > userspace code.
> > > 
> > > It took roughly 2 years to get the megasas HBA emulation from Dr. Hannes
> > > merged, but certainly vhost-scsi has alot less moving pieces and
> > > hopefully alot less controversial bits than the buffer -> SGL
> > > conversion..  The key word being here 'hopefully'..  ;)
> > > 
> > > > I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING.
> > > > Then we don't commit to an ABI.
> > > > For this, you can add a separate Kconfig and source it from drivers/staging/Kconfig.
> > > > Maybe it needs to be in a separate directory drivers/vhost/staging/Kconfig.
> > > > 
> > > 
> > > So tcm_vhost has been marked as Experimental following virtio-scsi.
> > > 
> > > Wrt to staging, I'd like to avoid mucking with staging because:
> > > 
> > > *) The code has been posted for review
> > > *) The code has been converted to use the latest target-core primitives
> > > *) The code does not require cleanups between staging -> merge
> > > *) The code has been stable the last 7 days since RFC-v2 with heavy 
> > 
> > staging is not just for code that needs cleanups.
> > It's for anything that does not guarantee ABI stability yet.
> > And I think it's a bit early to guarantee ABI stability - 7 days
> > is not all that long.
> > 
> 
> I was talking about the new I/O path has been running for 7 days.
> 
> > See for example Anthony's comments that raise exactly the ABI
> > issues.
> > 
> 
> As mentioned in the response to Anthony, we are using the same generic
> fabric ABI in drivers/target/target_core_fabric_configfs.c since .38.
> 
> That part is not going to change, and it has not changed for any of the
> other 7 target fabric modules we've merged into mainline since then.
> 
> > > Also, tcm_vhost has been marked as Experimental following virtio-scsi.
> > > I'd much rather leave it at Experimental until we merge upstream
> > > userspace support.   If userspace support never ends up materializing,
> > > I'm fine with dropping it all together.
> > 
> > Once it's in kernel you never know who will use this driver.
> > Experimental does not mean driver can be dropped, staging does.
> > 
> 
> Yes, that's the point of being in mainline.  People using the code,
> right..?

Exactly. I am just worried about in the end being no major users and we
are being stuck with a niche driver that as a result is very hard to test.
And the reason for the fear is the initial negative reaction from the
qemu side.  And no if it's there we can't just drop it.

> > > However at this point given that there is a 3x performance gap between
> > > virtio-scsi-raw + virtio-scsi+tcm_vhost for random mixed small block
> > > I/O, and we still need the latter to do proper SCSI CDB passthrough for
> > > non TYPE_DISK devices I'm hoping that we can agree on userspace bits
> > > once tcm_vhost is merged.
> > > 
> > > --nab
> > 
> > I do think upstream kernel would help you nail userspace issues too
> > but at this point it looks like either staging meterial or 3.6 is too
> > early.
> > 
> 
> I think for-3.6 is just the right time for this kernel code.  Seriously,
> The basic ABI fabric layout for /sys/kernel/config/target/vhost/ is
> going to be the same now for-3.6, the same for-3.7, and the same for .38
> code. 
> 
> I'd be happy to move tcm_vhost back to drivers/target/ for now, and we
> move it to drivers/vhost/ once the userspace bits are worked out..?
> 
> Would that be a reasonable compromise to move forward..?
> 
> --nab

I don't see how it helps.  The driver is either a guaranteed ABI or not.
I'd prefer not to have vhost users outside drivers/vhost/ since it is
harder for me to keep track of them.

What's the problem with staging proposal? It's just another hoop to jump
through to enable it?

-- 
MST

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-17 22:18         ` Michael S. Tsirkin
@ 2012-07-17 22:37           ` Nicholas A. Bellinger
  2012-07-17 23:11             ` Michael S. Tsirkin
  0 siblings, 1 reply; 48+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-17 22:37 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jens Axboe, Stefan Hajnoczi, kvm-devel, Zhi Yong Wu,
	Anthony Liguori, target-devel, linux-scsi, Paolo Bonzini,
	lf-virt, Christoph Hellwig

On Wed, 2012-07-18 at 01:18 +0300, Michael S. Tsirkin wrote:
> On Tue, Jul 17, 2012 at 03:02:08PM -0700, Nicholas A. Bellinger wrote:
> > On Wed, 2012-07-18 at 00:34 +0300, Michael S. Tsirkin wrote:
> > > On Tue, Jul 17, 2012 at 02:17:22PM -0700, Nicholas A. Bellinger wrote:
> > > > On Tue, 2012-07-17 at 18:05 +0300, Michael S. Tsirkin wrote:
> > > > > On Wed, Jul 11, 2012 at 09:15:00PM +0000, Nicholas A. Bellinger wrote:
> > > > > > From: Nicholas Bellinger <nab@linux-iscsi.org>

<SNIP>

> > As mentioned in the response to Anthony, we are using the same generic
> > fabric ABI in drivers/target/target_core_fabric_configfs.c since .38.
> > 
> > That part is not going to change, and it has not changed for any of the
> > other 7 target fabric modules we've merged into mainline since then.
> > 
> > > > Also, tcm_vhost has been marked as Experimental following virtio-scsi.
> > > > I'd much rather leave it at Experimental until we merge upstream
> > > > userspace support.   If userspace support never ends up materializing,
> > > > I'm fine with dropping it all together.
> > > 
> > > Once it's in kernel you never know who will use this driver.
> > > Experimental does not mean driver can be dropped, staging does.
> > > 
> > 
> > Yes, that's the point of being in mainline.  People using the code,
> > right..?
> 
> Exactly. I am just worried about in the end being no major users and we
> are being stuck with a niche driver that as a result is very hard to test.
> And the reason for the fear is the initial negative reaction from the
> qemu side.  And no if it's there we can't just drop it.
> 

That is certainly a reasonable concern.. 

> > > > However at this point given that there is a 3x performance gap between
> > > > virtio-scsi-raw + virtio-scsi+tcm_vhost for random mixed small block
> > > > I/O, and we still need the latter to do proper SCSI CDB passthrough for
> > > > non TYPE_DISK devices I'm hoping that we can agree on userspace bits
> > > > once tcm_vhost is merged.
> > > > 
> > > > --nab
> > > 
> > > I do think upstream kernel would help you nail userspace issues too
> > > but at this point it looks like either staging meterial or 3.6 is too
> > > early.
> > > 
> > 
> > I think for-3.6 is just the right time for this kernel code.  Seriously,
> > The basic ABI fabric layout for /sys/kernel/config/target/vhost/ is
> > going to be the same now for-3.6, the same for-3.7, and the same for .38
> > code. 
> > 
> > I'd be happy to move tcm_vhost back to drivers/target/ for now, and we
> > move it to drivers/vhost/ once the userspace bits are worked out..?
> > 
> > Would that be a reasonable compromise to move forward..?
> > 
> > --nab
> 
> I don't see how it helps.  The driver is either a guaranteed ABI or not.
> I'd prefer not to have vhost users outside drivers/vhost/ since it is
> harder for me to keep track of them.
> 
> What's the problem with staging proposal? It's just another hoop to jump
> through to enable it?
> 

Yeah, I'm OK with just adding a CONFIG_STAGING tag is a reasonable step
forward for-3.6.

Adding the following patch into target-pending/for-next-merge now:

diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
index ccbeb6f..2cd7135 100644
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -11,7 +11,7 @@ config VHOST_NET
 
 config TCM_VHOST
        tristate "TCM_VHOST fabric module (EXPERIMENTAL)"
-       depends on TARGET_CORE && EVENTFD && EXPERIMENTAL && m
+       depends on TARGET_CORE && EVENTFD && EXPERIMENTAL && STAGING && m
        default n
        ---help---
        Say M here to enable the TCM_VHOST fabric module for use with virtio-scsi guests

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-17 22:37           ` Nicholas A. Bellinger
@ 2012-07-17 23:11             ` Michael S. Tsirkin
  2012-07-18  0:17               ` Nicholas A. Bellinger
  0 siblings, 1 reply; 48+ messages in thread
From: Michael S. Tsirkin @ 2012-07-17 23:11 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Jens Axboe, Stefan Hajnoczi, kvm-devel, Zhi Yong Wu,
	Anthony Liguori, target-devel, linux-scsi, Paolo Bonzini,
	lf-virt, Christoph Hellwig

On Tue, Jul 17, 2012 at 03:37:20PM -0700, Nicholas A. Bellinger wrote:
> On Wed, 2012-07-18 at 01:18 +0300, Michael S. Tsirkin wrote:
> > On Tue, Jul 17, 2012 at 03:02:08PM -0700, Nicholas A. Bellinger wrote:
> > > On Wed, 2012-07-18 at 00:34 +0300, Michael S. Tsirkin wrote:
> > > > On Tue, Jul 17, 2012 at 02:17:22PM -0700, Nicholas A. Bellinger wrote:
> > > > > On Tue, 2012-07-17 at 18:05 +0300, Michael S. Tsirkin wrote:
> > > > > > On Wed, Jul 11, 2012 at 09:15:00PM +0000, Nicholas A. Bellinger wrote:
> > > > > > > From: Nicholas Bellinger <nab@linux-iscsi.org>
> 
> <SNIP>
> 
> > > As mentioned in the response to Anthony, we are using the same generic
> > > fabric ABI in drivers/target/target_core_fabric_configfs.c since .38.
> > > 
> > > That part is not going to change, and it has not changed for any of the
> > > other 7 target fabric modules we've merged into mainline since then.
> > > 
> > > > > Also, tcm_vhost has been marked as Experimental following virtio-scsi.
> > > > > I'd much rather leave it at Experimental until we merge upstream
> > > > > userspace support.   If userspace support never ends up materializing,
> > > > > I'm fine with dropping it all together.
> > > > 
> > > > Once it's in kernel you never know who will use this driver.
> > > > Experimental does not mean driver can be dropped, staging does.
> > > > 
> > > 
> > > Yes, that's the point of being in mainline.  People using the code,
> > > right..?
> > 
> > Exactly. I am just worried about in the end being no major users and we
> > are being stuck with a niche driver that as a result is very hard to test.
> > And the reason for the fear is the initial negative reaction from the
> > qemu side.  And no if it's there we can't just drop it.
> > 
> 
> That is certainly a reasonable concern.. 
> 
> > > > > However at this point given that there is a 3x performance gap between
> > > > > virtio-scsi-raw + virtio-scsi+tcm_vhost for random mixed small block
> > > > > I/O, and we still need the latter to do proper SCSI CDB passthrough for
> > > > > non TYPE_DISK devices I'm hoping that we can agree on userspace bits
> > > > > once tcm_vhost is merged.
> > > > > 
> > > > > --nab
> > > > 
> > > > I do think upstream kernel would help you nail userspace issues too
> > > > but at this point it looks like either staging meterial or 3.6 is too
> > > > early.
> > > > 
> > > 
> > > I think for-3.6 is just the right time for this kernel code.  Seriously,
> > > The basic ABI fabric layout for /sys/kernel/config/target/vhost/ is
> > > going to be the same now for-3.6, the same for-3.7, and the same for .38
> > > code. 
> > > 
> > > I'd be happy to move tcm_vhost back to drivers/target/ for now, and we
> > > move it to drivers/vhost/ once the userspace bits are worked out..?
> > > 
> > > Would that be a reasonable compromise to move forward..?
> > > 
> > > --nab
> > 
> > I don't see how it helps.  The driver is either a guaranteed ABI or not.
> > I'd prefer not to have vhost users outside drivers/vhost/ since it is
> > harder for me to keep track of them.
> > 
> > What's the problem with staging proposal? It's just another hoop to jump
> > through to enable it?
> > 
> 
> Yeah, I'm OK with just adding a CONFIG_STAGING tag is a reasonable step
> forward for-3.6.
> 
> Adding the following patch into target-pending/for-next-merge now:
> 
> diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
> index ccbeb6f..2cd7135 100644
> --- a/drivers/vhost/Kconfig
> +++ b/drivers/vhost/Kconfig
> @@ -11,7 +11,7 @@ config VHOST_NET
>  
>  config TCM_VHOST
>         tristate "TCM_VHOST fabric module (EXPERIMENTAL)"
> -       depends on TARGET_CORE && EVENTFD && EXPERIMENTAL && m
> +       depends on TARGET_CORE && EVENTFD && EXPERIMENTAL && STAGING && m
>         default n
>         ---help---
>         Say M here to enable the TCM_VHOST fabric module for use with virtio-scsi guests
> 
> 

Hmm that's not explicit enough, someone might enable CONFIG_STAGING for
some other reason and won't notice the dependency.
We need it to appear with other staging drivers in the menu,
so there needs to be a Kconfig that is included from
drivers/staging/Kconfig.

For example, we can create
drivers/vhost/staging/Kconfig or drivers/vhost/tcm/Kconfig and include
it from drivers/staging/Kconfig.  nouveau did something like this for a
while, see f3c93cbde7eab38671ae085cb1027b08f5f36757.

No need to move the rest of the code.

-- 
MST

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-17 23:11             ` Michael S. Tsirkin
@ 2012-07-18  0:17               ` Nicholas A. Bellinger
  0 siblings, 0 replies; 48+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-18  0:17 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jens Axboe, Stefan Hajnoczi, kvm-devel, Greg KH, Zhi Yong Wu,
	Anthony Liguori, target-devel, linux-scsi, Paolo Bonzini,
	lf-virt, Christoph Hellwig

On Wed, 2012-07-18 at 02:11 +0300, Michael S. Tsirkin wrote:
> On Tue, Jul 17, 2012 at 03:37:20PM -0700, Nicholas A. Bellinger wrote:
> > On Wed, 2012-07-18 at 01:18 +0300, Michael S. Tsirkin wrote:
> > > On Tue, Jul 17, 2012 at 03:02:08PM -0700, Nicholas A. Bellinger wrote:
> > > > On Wed, 2012-07-18 at 00:34 +0300, Michael S. Tsirkin wrote:

<SNIP>

> > > 
> > > I don't see how it helps.  The driver is either a guaranteed ABI or not.
> > > I'd prefer not to have vhost users outside drivers/vhost/ since it is
> > > harder for me to keep track of them.
> > > 
> > > What's the problem with staging proposal? It's just another hoop to jump
> > > through to enable it?
> > > 
> > 
> > Yeah, I'm OK with just adding a CONFIG_STAGING tag is a reasonable step
> > forward for-3.6.
> > 
> > Adding the following patch into target-pending/for-next-merge now:
> > 
> > diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
> > index ccbeb6f..2cd7135 100644
> > --- a/drivers/vhost/Kconfig
> > +++ b/drivers/vhost/Kconfig
> > @@ -11,7 +11,7 @@ config VHOST_NET
> >  
> >  config TCM_VHOST
> >         tristate "TCM_VHOST fabric module (EXPERIMENTAL)"
> > -       depends on TARGET_CORE && EVENTFD && EXPERIMENTAL && m
> > +       depends on TARGET_CORE && EVENTFD && EXPERIMENTAL && STAGING && m
> >         default n
> >         ---help---
> >         Say M here to enable the TCM_VHOST fabric module for use with virtio-scsi guests
> > 
> > 
> 
> Hmm that's not explicit enough, someone might enable CONFIG_STAGING for
> some other reason and won't notice the dependency.
> We need it to appear with other staging drivers in the menu,
> so there needs to be a Kconfig that is included from
> drivers/staging/Kconfig.
> 

Mmmmm, I am sensing a linux-next merge conflict with staging-next and/or
another for-next-merge rebase coming on..

> For example, we can create
> drivers/vhost/staging/Kconfig or drivers/vhost/tcm/Kconfig and include
> it from drivers/staging/Kconfig.  nouveau did something like this for a
> while, see f3c93cbde7eab38671ae085cb1027b08f5f36757.
> 
> No need to move the rest of the code.
> 

OK, lets use drivers/vhost/tcm/Kconfig and I'll post a incremental patch
to make it appear under staging it shortly.

(CC'ing Greg-KH for good measure.)

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-17 21:50     ` Nicholas A. Bellinger
  2012-07-18 13:42       ` Anthony Liguori
@ 2012-07-18 13:42       ` Anthony Liguori
  2012-07-18 13:56         ` Paolo Bonzini
                           ` (5 more replies)
  1 sibling, 6 replies; 48+ messages in thread
From: Anthony Liguori @ 2012-07-18 13:42 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Michael S. Tsirkin, target-devel, linux-scsi, lf-virt, kvm-devel,
	Stefan Hajnoczi, Zhi Yong Wu, Anthony Liguori, Paolo Bonzini,
	Christoph Hellwig, Jens Axboe, Hannes Reinecke

On 07/17/2012 04:50 PM, Nicholas A. Bellinger wrote:
> On Tue, 2012-07-17 at 13:55 -0500, Anthony Liguori wrote:
>> On 07/17/2012 10:05 AM, Michael S. Tsirkin wrote:
>>> On Wed, Jul 11, 2012 at 09:15:00PM +0000, Nicholas A. Bellinger wrote:
>
> <SNIP>
>
>>>
>>> It still seems not 100% clear whether this driver will have major
>>> userspace using it. And if not, it would be very hard to support a driver
>>> when recent userspace does not use it in the end.
>>
>> I don't think this is a good reason to exclude something from the kernel.
>> However, there are good reasons why this doesn't make sense for something like
>> QEMU--specifically because we have a large number of features in our block layer
>> that tcm_vhost would bypass.
>>
>
> I can definitely appreciate your concern here as the QEMU maintainer.
>
>> But perhaps it makes sense for something like native kvm tool.  And if it did go
>> into the kernel, we would certainly support it in QEMU.
>>
>
> ...
>
>> But I do think the kernel should carefully consider whether it wants to support
>> an interface like this.  This an extremely complicated ABI with a lot of subtle
>> details around state and compatibility.
>>
>> Are you absolutely confident that you can support a userspace application that
>> expects to get exactly the same response from all possible commands in 20 kernel
>> versions from now?  Virtualization requires absolutely precise compatibility in
>> terms of bugs and features.  This is probably not something the TCM stack has
>> had to consider yet.
>>
>
> We most certainly have thought about long term userspace compatibility
> with TCM.  Our userspace code (that's now available in all major
> distros) is completely forward-compatible with new fabric modules such
> as tcm_vhost.  No update required.

I'm not sure we're talking about the same thing when we say compatibility.

I'm not talking about the API.  I'm talking about the behavior of the commands 
that tcm_vhost supports.

If you add support for a new command, you need to provide userspace a way to 
disable this command.  If you change what gets reported for VPD, you need to 
provide userspace a way to make VPD look like what it did in a previous version.

Basically, you need to be able to make a TCM device behave 100% the same as it 
did in an older version of the kernel.

This is unique to virtualization due to live migration.  If you migrate from a 
3.6 kernel to a 3.8 kernel, you need to make sure that the 3.8 kernel's TCM 
device behaves exactly like the 3.6 kernel because the guest that is interacting 
with it does not realize that live migration happened.

Yes, you can add knobs via configfs to control this behavior, but I think the 
question is, what's the plan for this?

BTW, I think this is a good thing to cover in Documentation/vhost/tcm_vhost.txt. 
  I think that's probably the only change that's needed here.

Regards,

Anthony Liguori

>
> Also, by virtue of the fact that we are using configfs + rtslib (python
> object library) on top, it's very easy to keep any type of compatibility
> logic around in python code.  With rtslib, we are able to hide configfs
> ABI changes from higher level apps.
>
> So far we've had a track record of 100% userspace ABI compatibility in
> mainline since .38, and I don't intend to merge a patch that breaks this
> any time soon.  But if that ever happens, apps using rtslib are not
> going to be effected.
>
>>> I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING.
>>> Then we don't commit to an ABI.
>>
>> I think this is a good idea.  Even if it goes in, a really clear policy would be
>> needed wrt the userspace ABI.
>>
>> While tcm_vhost is probably more useful than vhost_blk, it's a much more complex
>> ABI to maintain.
>>
>
> As far as I am concerned, the kernel API (eg: configfs directory layout)
> as it is now in sys/kernel/config/target/vhost/ is not going to change.
> It's based on the same drivers/target/target_core_fabric_configfs.c
> generic layout that we've had since .38.
>
> The basic functional fabric layout in configfs is identical (with fabric
> dependent WWPN naming of course) regardless of fabric driver, and by
> virtue of being generic it means we can add things like fabric dependent
> attributes + parameters in the future for existing fabrics without
> breaking userspace.
>
> So while I agree the ABI is more complex than vhost-blk, the logic in
> target_core_fabric_configfs.c is a basic ABI fabric definition that we
> are enforcing across all fabric modules in mainline for long term
> compatibility.
>
> --nab
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-17 21:50     ` Nicholas A. Bellinger
@ 2012-07-18 13:42       ` Anthony Liguori
  2012-07-18 13:42       ` Anthony Liguori
  1 sibling, 0 replies; 48+ messages in thread
From: Anthony Liguori @ 2012-07-18 13:42 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Jens Axboe, Stefan Hajnoczi, kvm-devel, Michael S. Tsirkin,
	lf-virt, Anthony Liguori, target-devel, linux-scsi,
	Paolo Bonzini, Zhi Yong Wu, Christoph Hellwig

On 07/17/2012 04:50 PM, Nicholas A. Bellinger wrote:
> On Tue, 2012-07-17 at 13:55 -0500, Anthony Liguori wrote:
>> On 07/17/2012 10:05 AM, Michael S. Tsirkin wrote:
>>> On Wed, Jul 11, 2012 at 09:15:00PM +0000, Nicholas A. Bellinger wrote:
>
> <SNIP>
>
>>>
>>> It still seems not 100% clear whether this driver will have major
>>> userspace using it. And if not, it would be very hard to support a driver
>>> when recent userspace does not use it in the end.
>>
>> I don't think this is a good reason to exclude something from the kernel.
>> However, there are good reasons why this doesn't make sense for something like
>> QEMU--specifically because we have a large number of features in our block layer
>> that tcm_vhost would bypass.
>>
>
> I can definitely appreciate your concern here as the QEMU maintainer.
>
>> But perhaps it makes sense for something like native kvm tool.  And if it did go
>> into the kernel, we would certainly support it in QEMU.
>>
>
> ...
>
>> But I do think the kernel should carefully consider whether it wants to support
>> an interface like this.  This an extremely complicated ABI with a lot of subtle
>> details around state and compatibility.
>>
>> Are you absolutely confident that you can support a userspace application that
>> expects to get exactly the same response from all possible commands in 20 kernel
>> versions from now?  Virtualization requires absolutely precise compatibility in
>> terms of bugs and features.  This is probably not something the TCM stack has
>> had to consider yet.
>>
>
> We most certainly have thought about long term userspace compatibility
> with TCM.  Our userspace code (that's now available in all major
> distros) is completely forward-compatible with new fabric modules such
> as tcm_vhost.  No update required.

I'm not sure we're talking about the same thing when we say compatibility.

I'm not talking about the API.  I'm talking about the behavior of the commands 
that tcm_vhost supports.

If you add support for a new command, you need to provide userspace a way to 
disable this command.  If you change what gets reported for VPD, you need to 
provide userspace a way to make VPD look like what it did in a previous version.

Basically, you need to be able to make a TCM device behave 100% the same as it 
did in an older version of the kernel.

This is unique to virtualization due to live migration.  If you migrate from a 
3.6 kernel to a 3.8 kernel, you need to make sure that the 3.8 kernel's TCM 
device behaves exactly like the 3.6 kernel because the guest that is interacting 
with it does not realize that live migration happened.

Yes, you can add knobs via configfs to control this behavior, but I think the 
question is, what's the plan for this?

BTW, I think this is a good thing to cover in Documentation/vhost/tcm_vhost.txt. 
  I think that's probably the only change that's needed here.

Regards,

Anthony Liguori

>
> Also, by virtue of the fact that we are using configfs + rtslib (python
> object library) on top, it's very easy to keep any type of compatibility
> logic around in python code.  With rtslib, we are able to hide configfs
> ABI changes from higher level apps.
>
> So far we've had a track record of 100% userspace ABI compatibility in
> mainline since .38, and I don't intend to merge a patch that breaks this
> any time soon.  But if that ever happens, apps using rtslib are not
> going to be effected.
>
>>> I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING.
>>> Then we don't commit to an ABI.
>>
>> I think this is a good idea.  Even if it goes in, a really clear policy would be
>> needed wrt the userspace ABI.
>>
>> While tcm_vhost is probably more useful than vhost_blk, it's a much more complex
>> ABI to maintain.
>>
>
> As far as I am concerned, the kernel API (eg: configfs directory layout)
> as it is now in sys/kernel/config/target/vhost/ is not going to change.
> It's based on the same drivers/target/target_core_fabric_configfs.c
> generic layout that we've had since .38.
>
> The basic functional fabric layout in configfs is identical (with fabric
> dependent WWPN naming of course) regardless of fabric driver, and by
> virtue of being generic it means we can add things like fabric dependent
> attributes + parameters in the future for existing fabrics without
> breaking userspace.
>
> So while I agree the ABI is more complex than vhost-blk, the logic in
> target_core_fabric_configfs.c is a basic ABI fabric definition that we
> are enforcing across all fabric modules in mainline for long term
> compatibility.
>
> --nab
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-18 13:42       ` Anthony Liguori
@ 2012-07-18 13:56         ` Paolo Bonzini
  2012-07-18 15:33         ` Michael S. Tsirkin
                           ` (4 subsequent siblings)
  5 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2012-07-18 13:56 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Jens Axboe, linux-scsi, kvm-devel, Michael S. Tsirkin, lf-virt,
	Anthony Liguori, target-devel, Zhi Yong Wu, Christoph Hellwig,
	Stefan Hajnoczi

Il 18/07/2012 15:42, Anthony Liguori ha scritto:
> If you add support for a new command, you need to provide userspace a
> way to disable this command.  If you change what gets reported for VPD,
> you need to provide userspace a way to make VPD look like what it did in
> a previous version.

The QEMU target is not enforcing this to this level.  We didn't for
CD-ROM ATAPI, and we're not doing it for SCSI.

It may indeed be useful for changes to VPD pages or major features.
However, so far we've never introduced any feature that deserved it.
This is also because OSes typically don't care: they use a small subset
of the features and all the remaining "decorations" are only needed to
be pedantically compliant to the spec.

Paolo

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-18 13:42       ` Anthony Liguori
  2012-07-18 13:56         ` Paolo Bonzini
@ 2012-07-18 15:33         ` Michael S. Tsirkin
  2012-07-18 15:53         ` Christoph Hellwig
                           ` (3 subsequent siblings)
  5 siblings, 0 replies; 48+ messages in thread
From: Michael S. Tsirkin @ 2012-07-18 15:33 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Jens Axboe, Stefan Hajnoczi, kvm-devel, lf-virt, Anthony Liguori,
	target-devel, linux-scsi, Paolo Bonzini, Zhi Yong Wu,
	Christoph Hellwig

On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:
> On 07/17/2012 04:50 PM, Nicholas A. Bellinger wrote:
> >On Tue, 2012-07-17 at 13:55 -0500, Anthony Liguori wrote:
> >>On 07/17/2012 10:05 AM, Michael S. Tsirkin wrote:
> >>>On Wed, Jul 11, 2012 at 09:15:00PM +0000, Nicholas A. Bellinger wrote:
> >
> ><SNIP>
> >
> >>>
> >>>It still seems not 100% clear whether this driver will have major
> >>>userspace using it. And if not, it would be very hard to support a driver
> >>>when recent userspace does not use it in the end.
> >>
> >>I don't think this is a good reason to exclude something from the kernel.
> >>However, there are good reasons why this doesn't make sense for something like
> >>QEMU--specifically because we have a large number of features in our block layer
> >>that tcm_vhost would bypass.
> >>
> >
> >I can definitely appreciate your concern here as the QEMU maintainer.
> >
> >>But perhaps it makes sense for something like native kvm tool.  And if it did go
> >>into the kernel, we would certainly support it in QEMU.
> >>
> >
> >...
> >
> >>But I do think the kernel should carefully consider whether it wants to support
> >>an interface like this.  This an extremely complicated ABI with a lot of subtle
> >>details around state and compatibility.
> >>
> >>Are you absolutely confident that you can support a userspace application that
> >>expects to get exactly the same response from all possible commands in 20 kernel
> >>versions from now?  Virtualization requires absolutely precise compatibility in
> >>terms of bugs and features.  This is probably not something the TCM stack has
> >>had to consider yet.
> >>
> >
> >We most certainly have thought about long term userspace compatibility
> >with TCM.  Our userspace code (that's now available in all major
> >distros) is completely forward-compatible with new fabric modules such
> >as tcm_vhost.  No update required.
> 
> I'm not sure we're talking about the same thing when we say compatibility.
> 
> I'm not talking about the API.  I'm talking about the behavior of
> the commands that tcm_vhost supports.
> 
> If you add support for a new command, you need to provide userspace
> a way to disable this command.  If you change what gets reported for
> VPD, you need to provide userspace a way to make VPD look like what
> it did in a previous version.
> 
> Basically, you need to be able to make a TCM device behave 100% the
> same as it did in an older version of the kernel.
> 
> This is unique to virtualization due to live migration.  If you
> migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure
> that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel
> because the guest that is interacting with it does not realize that
> live migration happened.
> 
> Yes, you can add knobs via configfs to control this behavior, but I
> think the question is, what's the plan for this?
> 
> BTW, I think this is a good thing to cover in
> Documentation/vhost/tcm_vhost.txt.  I think that's probably the only
> change that's needed here.
> 
> Regards,
> 
> Anthony Liguori

I agree it's needed but it's not a requirement for merging IMHO.
As a first step we can disable live migration.

> >
> >Also, by virtue of the fact that we are using configfs + rtslib (python
> >object library) on top, it's very easy to keep any type of compatibility
> >logic around in python code.  With rtslib, we are able to hide configfs
> >ABI changes from higher level apps.
> >
> >So far we've had a track record of 100% userspace ABI compatibility in
> >mainline since .38, and I don't intend to merge a patch that breaks this
> >any time soon.  But if that ever happens, apps using rtslib are not
> >going to be effected.
> >
> >>>I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING.
> >>>Then we don't commit to an ABI.
> >>
> >>I think this is a good idea.  Even if it goes in, a really clear policy would be
> >>needed wrt the userspace ABI.
> >>
> >>While tcm_vhost is probably more useful than vhost_blk, it's a much more complex
> >>ABI to maintain.
> >>
> >
> >As far as I am concerned, the kernel API (eg: configfs directory layout)
> >as it is now in sys/kernel/config/target/vhost/ is not going to change.
> >It's based on the same drivers/target/target_core_fabric_configfs.c
> >generic layout that we've had since .38.
> >
> >The basic functional fabric layout in configfs is identical (with fabric
> >dependent WWPN naming of course) regardless of fabric driver, and by
> >virtue of being generic it means we can add things like fabric dependent
> >attributes + parameters in the future for existing fabrics without
> >breaking userspace.
> >
> >So while I agree the ABI is more complex than vhost-blk, the logic in
> >target_core_fabric_configfs.c is a basic ABI fabric definition that we
> >are enforcing across all fabric modules in mainline for long term
> >compatibility.
> >
> >--nab
> >

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-18 13:42       ` Anthony Liguori
                           ` (2 preceding siblings ...)
  2012-07-18 15:53         ` Christoph Hellwig
@ 2012-07-18 15:53         ` Christoph Hellwig
  2012-07-18 16:00           ` Michael S. Tsirkin
                             ` (2 more replies)
  2012-07-18 22:45         ` Nicholas A. Bellinger
  2012-07-18 22:45         ` Nicholas A. Bellinger
  5 siblings, 3 replies; 48+ messages in thread
From: Christoph Hellwig @ 2012-07-18 15:53 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Nicholas A. Bellinger, Michael S. Tsirkin, target-devel,
	linux-scsi, lf-virt, kvm-devel, Stefan Hajnoczi, Zhi Yong Wu,
	Anthony Liguori, Paolo Bonzini, Christoph Hellwig, Jens Axboe,
	Hannes Reinecke

On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:
> 
> If you add support for a new command, you need to provide userspace
> a way to disable this command.  If you change what gets reported for
> VPD, you need to provide userspace a way to make VPD look like what
> it did in a previous version.
> 
> Basically, you need to be able to make a TCM device behave 100% the
> same as it did in an older version of the kernel.
> 
> This is unique to virtualization due to live migration.  If you
> migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure
> that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel
> because the guest that is interacting with it does not realize that
> live migration happened.

I don't think these strict live migration rules apply to SCSI targets.

Real life storage systems get new features and different behaviour with
firmware upgrades all the time, and SCSI initiators deal with that just
fine.  I don't see any reason to be more picky just because we're
virtualized.


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-18 13:42       ` Anthony Liguori
  2012-07-18 13:56         ` Paolo Bonzini
  2012-07-18 15:33         ` Michael S. Tsirkin
@ 2012-07-18 15:53         ` Christoph Hellwig
  2012-07-18 15:53         ` Christoph Hellwig
                           ` (2 subsequent siblings)
  5 siblings, 0 replies; 48+ messages in thread
From: Christoph Hellwig @ 2012-07-18 15:53 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Jens Axboe, linux-scsi, kvm-devel, Michael S. Tsirkin, lf-virt,
	Anthony Liguori, target-devel, Paolo Bonzini, Zhi Yong Wu,
	Christoph Hellwig, Stefan Hajnoczi

On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:
> 
> If you add support for a new command, you need to provide userspace
> a way to disable this command.  If you change what gets reported for
> VPD, you need to provide userspace a way to make VPD look like what
> it did in a previous version.
> 
> Basically, you need to be able to make a TCM device behave 100% the
> same as it did in an older version of the kernel.
> 
> This is unique to virtualization due to live migration.  If you
> migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure
> that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel
> because the guest that is interacting with it does not realize that
> live migration happened.

I don't think these strict live migration rules apply to SCSI targets.

Real life storage systems get new features and different behaviour with
firmware upgrades all the time, and SCSI initiators deal with that just
fine.  I don't see any reason to be more picky just because we're
virtualized.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-18 15:53         ` Christoph Hellwig
@ 2012-07-18 16:00           ` Michael S. Tsirkin
  2012-07-18 16:42             ` Rustad, Mark D
  2012-07-18 16:42             ` Rustad, Mark D
  2012-07-18 16:00           ` Anthony Liguori
  2012-07-18 16:00           ` Anthony Liguori
  2 siblings, 2 replies; 48+ messages in thread
From: Michael S. Tsirkin @ 2012-07-18 16:00 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Anthony Liguori, linux-scsi, kvm-devel, lf-virt,
	Anthony Liguori, target-devel, Paolo Bonzini, Zhi Yong Wu,
	Christoph Hellwig, Stefan Hajnoczi

On Wed, Jul 18, 2012 at 11:53:38AM -0400, Christoph Hellwig wrote:
> On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:
> > 
> > If you add support for a new command, you need to provide userspace
> > a way to disable this command.  If you change what gets reported for
> > VPD, you need to provide userspace a way to make VPD look like what
> > it did in a previous version.
> > 
> > Basically, you need to be able to make a TCM device behave 100% the
> > same as it did in an older version of the kernel.
> > 
> > This is unique to virtualization due to live migration.  If you
> > migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure
> > that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel
> > because the guest that is interacting with it does not realize that
> > live migration happened.
> 
> I don't think these strict live migration rules apply to SCSI targets.
> 
> Real life storage systems get new features and different behaviour with
> firmware upgrades all the time, and SCSI initiators deal with that just
> fine.
>  I don't see any reason to be more picky just because we're
> virtualized.


Presumably initiators are shut down for target firmware upgrades?
With virtualization your host can change without guest shutdown.
You can also *lose* commands when migrating to an older host.

-- 
MST

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-18 15:53         ` Christoph Hellwig
  2012-07-18 16:00           ` Michael S. Tsirkin
  2012-07-18 16:00           ` Anthony Liguori
@ 2012-07-18 16:00           ` Anthony Liguori
  2012-07-18 16:47             ` James Bottomley
  2 siblings, 1 reply; 48+ messages in thread
From: Anthony Liguori @ 2012-07-18 16:00 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Nicholas A. Bellinger, Michael S. Tsirkin, target-devel,
	linux-scsi, lf-virt, kvm-devel, Stefan Hajnoczi, Zhi Yong Wu,
	Anthony Liguori, Paolo Bonzini, Christoph Hellwig, Jens Axboe,
	Hannes Reinecke

On 07/18/2012 10:53 AM, Christoph Hellwig wrote:
> On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:
>>
>> If you add support for a new command, you need to provide userspace
>> a way to disable this command.  If you change what gets reported for
>> VPD, you need to provide userspace a way to make VPD look like what
>> it did in a previous version.
>>
>> Basically, you need to be able to make a TCM device behave 100% the
>> same as it did in an older version of the kernel.
>>
>> This is unique to virtualization due to live migration.  If you
>> migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure
>> that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel
>> because the guest that is interacting with it does not realize that
>> live migration happened.
>
> I don't think these strict live migration rules apply to SCSI targets.
>
> Real life storage systems get new features and different behaviour with
> firmware upgrades all the time, and SCSI initiators deal with that just
> fine.  I don't see any reason to be more picky just because we're
> virtualized.

But would this happen while a system is running live?

I agree that in general, SCSI targets don't need this, but I'm pretty sure that 
if a guest probes for a command, you migrate to an old version, and that command 
is no longer there, badness will ensue.

It's different when you're talking about a reboot happening or a 
disconnect/reconnect due to firmware upgrade.  The OS would naturally be 
reprobing in this case.

Regards,

Anthony Liguori

>


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-18 15:53         ` Christoph Hellwig
  2012-07-18 16:00           ` Michael S. Tsirkin
@ 2012-07-18 16:00           ` Anthony Liguori
  2012-07-18 16:00           ` Anthony Liguori
  2 siblings, 0 replies; 48+ messages in thread
From: Anthony Liguori @ 2012-07-18 16:00 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-scsi, kvm-devel, Michael S. Tsirkin, lf-virt,
	Anthony Liguori, target-devel, Paolo Bonzini, Zhi Yong Wu,
	Christoph Hellwig, Stefan Hajnoczi

On 07/18/2012 10:53 AM, Christoph Hellwig wrote:
> On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:
>>
>> If you add support for a new command, you need to provide userspace
>> a way to disable this command.  If you change what gets reported for
>> VPD, you need to provide userspace a way to make VPD look like what
>> it did in a previous version.
>>
>> Basically, you need to be able to make a TCM device behave 100% the
>> same as it did in an older version of the kernel.
>>
>> This is unique to virtualization due to live migration.  If you
>> migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure
>> that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel
>> because the guest that is interacting with it does not realize that
>> live migration happened.
>
> I don't think these strict live migration rules apply to SCSI targets.
>
> Real life storage systems get new features and different behaviour with
> firmware upgrades all the time, and SCSI initiators deal with that just
> fine.  I don't see any reason to be more picky just because we're
> virtualized.

But would this happen while a system is running live?

I agree that in general, SCSI targets don't need this, but I'm pretty sure that 
if a guest probes for a command, you migrate to an old version, and that command 
is no longer there, badness will ensue.

It's different when you're talking about a reboot happening or a 
disconnect/reconnect due to firmware upgrade.  The OS would naturally be 
reprobing in this case.

Regards,

Anthony Liguori

>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-18 16:00           ` Michael S. Tsirkin
@ 2012-07-18 16:42             ` Rustad, Mark D
  2012-07-18 17:17               ` Michael S. Tsirkin
  2012-07-18 16:42             ` Rustad, Mark D
  1 sibling, 1 reply; 48+ messages in thread
From: Rustad, Mark D @ 2012-07-18 16:42 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Christoph Hellwig, Anthony Liguori, Nicholas A. Bellinger,
	target-devel, linux-scsi, lf-virt, kvm-devel, Stefan Hajnoczi,
	Zhi Yong Wu, Anthony Liguori, Paolo Bonzini, Christoph Hellwig,
	Jens Axboe, Hannes Reinecke

On Jul 18, 2012, at 9:00 AM, Michael S. Tsirkin wrote:

> On Wed, Jul 18, 2012 at 11:53:38AM -0400, Christoph Hellwig wrote:
>> On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:
>>> 
>>> If you add support for a new command, you need to provide userspace
>>> a way to disable this command.  If you change what gets reported for
>>> VPD, you need to provide userspace a way to make VPD look like what
>>> it did in a previous version.
>>> 
>>> Basically, you need to be able to make a TCM device behave 100% the
>>> same as it did in an older version of the kernel.
>>> 
>>> This is unique to virtualization due to live migration.  If you
>>> migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure
>>> that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel
>>> because the guest that is interacting with it does not realize that
>>> live migration happened.
>> 
>> I don't think these strict live migration rules apply to SCSI targets.
>> 
>> Real life storage systems get new features and different behaviour with
>> firmware upgrades all the time, and SCSI initiators deal with that just
>> fine.
>> I don't see any reason to be more picky just because we're
>> virtualized.
> 
> Presumably initiators are shut down for target firmware upgrades?
> With virtualization your host can change without guest shutdown.
> You can also *lose* commands when migrating to an older host.


Actually no. Storage vendors do not want to impose a need to take initiators down for any reason. I have worked for a storage system vendor that routinely did firmware upgrades on-the-fly. This is done by multi-pathing and taking one path down, upgrade, bring up, repeat. There was even one non-redundant system that I am aware of that could upgrade firmware and reboot fast enough that the initiators would not notice.

You do have to pay very close attention to some things however. Don't change the device identity in any way - even version information, otherwise a Windows initiator will blue-screen. I made that mistake myself, so I remember it well. It seemed like such an innocent change. I don't recall there being any issue with adding commands and we did do that on occasion.

-- 
Mark Rustad, LAN Access Division, Intel Corporation


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-18 16:00           ` Michael S. Tsirkin
  2012-07-18 16:42             ` Rustad, Mark D
@ 2012-07-18 16:42             ` Rustad, Mark D
  1 sibling, 0 replies; 48+ messages in thread
From: Rustad, Mark D @ 2012-07-18 16:42 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jens Axboe, Anthony Liguori, kvm-devel, linux-scsi, lf-virt,
	Christoph Hellwig, Anthony Liguori, target-devel, Paolo Bonzini,
	Zhi Yong Wu, Christoph Hellwig, Stefan Hajnoczi

On Jul 18, 2012, at 9:00 AM, Michael S. Tsirkin wrote:

> On Wed, Jul 18, 2012 at 11:53:38AM -0400, Christoph Hellwig wrote:
>> On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:
>>> 
>>> If you add support for a new command, you need to provide userspace
>>> a way to disable this command.  If you change what gets reported for
>>> VPD, you need to provide userspace a way to make VPD look like what
>>> it did in a previous version.
>>> 
>>> Basically, you need to be able to make a TCM device behave 100% the
>>> same as it did in an older version of the kernel.
>>> 
>>> This is unique to virtualization due to live migration.  If you
>>> migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure
>>> that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel
>>> because the guest that is interacting with it does not realize that
>>> live migration happened.
>> 
>> I don't think these strict live migration rules apply to SCSI targets.
>> 
>> Real life storage systems get new features and different behaviour with
>> firmware upgrades all the time, and SCSI initiators deal with that just
>> fine.
>> I don't see any reason to be more picky just because we're
>> virtualized.
> 
> Presumably initiators are shut down for target firmware upgrades?
> With virtualization your host can change without guest shutdown.
> You can also *lose* commands when migrating to an older host.


Actually no. Storage vendors do not want to impose a need to take initiators down for any reason. I have worked for a storage system vendor that routinely did firmware upgrades on-the-fly. This is done by multi-pathing and taking one path down, upgrade, bring up, repeat. There was even one non-redundant system that I am aware of that could upgrade firmware and reboot fast enough that the initiators would not notice.

You do have to pay very close attention to some things however. Don't change the device identity in any way - even version information, otherwise a Windows initiator will blue-screen. I made that mistake myself, so I remember it well. It seemed like such an innocent change. I don't recall there being any issue with adding commands and we did do that on occasion.

-- 
Mark Rustad, LAN Access Division, Intel Corporation

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-18 16:00           ` Anthony Liguori
@ 2012-07-18 16:47             ` James Bottomley
  2012-07-18 19:12               ` Anthony Liguori
  2012-07-18 19:12               ` Anthony Liguori
  0 siblings, 2 replies; 48+ messages in thread
From: James Bottomley @ 2012-07-18 16:47 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Christoph Hellwig, Nicholas A. Bellinger, Michael S. Tsirkin,
	target-devel, linux-scsi, lf-virt, kvm-devel, Stefan Hajnoczi,
	Zhi Yong Wu, Anthony Liguori, Paolo Bonzini, Christoph Hellwig,
	Jens Axboe, Hannes Reinecke

On Wed, 2012-07-18 at 11:00 -0500, Anthony Liguori wrote:
> On 07/18/2012 10:53 AM, Christoph Hellwig wrote:
> > On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:
> >>
> >> If you add support for a new command, you need to provide userspace
> >> a way to disable this command.  If you change what gets reported for
> >> VPD, you need to provide userspace a way to make VPD look like what
> >> it did in a previous version.
> >>
> >> Basically, you need to be able to make a TCM device behave 100% the
> >> same as it did in an older version of the kernel.
> >>
> >> This is unique to virtualization due to live migration.  If you
> >> migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure
> >> that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel
> >> because the guest that is interacting with it does not realize that
> >> live migration happened.
> >
> > I don't think these strict live migration rules apply to SCSI targets.
> >
> > Real life storage systems get new features and different behaviour with
> > firmware upgrades all the time, and SCSI initiators deal with that just
> > fine.  I don't see any reason to be more picky just because we're
> > virtualized.
> 
> But would this happen while a system is running live?

Of course: Think about the consequences: you want to upgrade one array
on your SAN.  You definitely don't want to shut down your entire data
centre to achieve it.  In place upgrades on running SANs have been
common in enterprise environments for a while.

> I agree that in general, SCSI targets don't need this, but I'm pretty sure that 
> if a guest probes for a command, you migrate to an old version, and that command 
> is no longer there, badness will ensue.

What command are we talking about?  Operation of initiators is usually
just READ and WRITE.  So perhaps we might have inline UNMAP ... but the
world wouldn't come to an end even if the latter stopped working.

Most of the complex SCSI stuff is done at start of day; it's actually
only then we'd notice things like changes in INQUIRY strings or mode
pages.

Failover, which is what you're talking about, requires reinstatement of
all the operating parameters of the source/target system, but that's not
wholly the responsibility of the storage system ...

James

> It's different when you're talking about a reboot happening or a 
> disconnect/reconnect due to firmware upgrade.  The OS would naturally be 
> reprobing in this case.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-18 16:42             ` Rustad, Mark D
@ 2012-07-18 17:17               ` Michael S. Tsirkin
  2012-07-18 20:12                 ` Rustad, Mark D
  2012-07-18 20:12                 ` Rustad, Mark D
  0 siblings, 2 replies; 48+ messages in thread
From: Michael S. Tsirkin @ 2012-07-18 17:17 UTC (permalink / raw)
  To: Rustad, Mark D
  Cc: Jens Axboe, Anthony Liguori, kvm-devel, linux-scsi, lf-virt,
	Christoph Hellwig, Anthony Liguori, target-devel, Paolo Bonzini,
	Zhi Yong Wu, Christoph Hellwig, Stefan Hajnoczi

On Wed, Jul 18, 2012 at 04:42:33PM +0000, Rustad, Mark D wrote:
> On Jul 18, 2012, at 9:00 AM, Michael S. Tsirkin wrote:
> 
> > On Wed, Jul 18, 2012 at 11:53:38AM -0400, Christoph Hellwig wrote:
> >> On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:
> >>> 
> >>> If you add support for a new command, you need to provide userspace
> >>> a way to disable this command.  If you change what gets reported for
> >>> VPD, you need to provide userspace a way to make VPD look like what
> >>> it did in a previous version.
> >>> 
> >>> Basically, you need to be able to make a TCM device behave 100% the
> >>> same as it did in an older version of the kernel.
> >>> 
> >>> This is unique to virtualization due to live migration.  If you
> >>> migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure
> >>> that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel
> >>> because the guest that is interacting with it does not realize that
> >>> live migration happened.
> >> 
> >> I don't think these strict live migration rules apply to SCSI targets.
> >> 
> >> Real life storage systems get new features and different behaviour with
> >> firmware upgrades all the time, and SCSI initiators deal with that just
> >> fine.
> >> I don't see any reason to be more picky just because we're
> >> virtualized.
> > 
> > Presumably initiators are shut down for target firmware upgrades?
> > With virtualization your host can change without guest shutdown.
> > You can also *lose* commands when migrating to an older host.
> 
> 
> Actually no. Storage vendors do not want to impose a need to take initiators down for any reason. I have worked for a storage system vendor that routinely did firmware upgrades on-the-fly. This is done by multi-pathing and taking one path down, upgrade, bring up, repeat.

With live migration even that does not happen.

> There was even one non-redundant system that I am aware of that could upgrade firmware and reboot fast enough that the initiators would not notice.
> 
> You do have to pay very close attention to some things however. Don't change the device identity in any way - even version information, otherwise a Windows initiator will blue-screen. I made that mistake myself, so I remember it well. It seemed like such an innocent change. I don't recall there being any issue with adding commands and we did do that on occasion.

How about removing commands?

> -- 
> Mark Rustad, LAN Access Division, Intel Corporation

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-18 16:47             ` James Bottomley
@ 2012-07-18 19:12               ` Anthony Liguori
  2012-07-19  6:00                 ` Paolo Bonzini
  2012-07-19  6:00                 ` Paolo Bonzini
  2012-07-18 19:12               ` Anthony Liguori
  1 sibling, 2 replies; 48+ messages in thread
From: Anthony Liguori @ 2012-07-18 19:12 UTC (permalink / raw)
  To: James Bottomley
  Cc: Christoph Hellwig, Nicholas A. Bellinger, Michael S. Tsirkin,
	target-devel, linux-scsi, lf-virt, kvm-devel, Stefan Hajnoczi,
	Zhi Yong Wu, Anthony Liguori, Paolo Bonzini, Christoph Hellwig,
	Jens Axboe, Hannes Reinecke

On 07/18/2012 11:47 AM, James Bottomley wrote:
> On Wed, 2012-07-18 at 11:00 -0500, Anthony Liguori wrote:
>
> Of course: Think about the consequences: you want to upgrade one array
> on your SAN.  You definitely don't want to shut down your entire data
> centre to achieve it.  In place upgrades on running SANs have been
> common in enterprise environments for a while.

Would firmware upgrades ever result in major OS visible changes though?

Maybe OSes are more robust with SCSI than with other types of buses, but I don't 
think it's safe to completely ignore the problem.

>> I agree that in general, SCSI targets don't need this, but I'm pretty sure that
>> if a guest probes for a command, you migrate to an old version, and that command
>> is no longer there, badness will ensue.
>
> What command are we talking about?  Operation of initiators is usually
> just READ and WRITE.  So perhaps we might have inline UNMAP ... but the
> world wouldn't come to an end even if the latter stopped working.

Is that true for all OSes?  Linux may handle things gracefully if UNMAP starts 
throwing errors but that doesn't mean that Windows will.

There are other cases where this creates problems.  Windows (and some other 
OSes) fingerprint the hardware profile in order to do license enforcement.  If 
the hardware changes beyond a certain amount, then they refuse to boot.

Windows does this with a points system and I do believe that INQUIRY responses 
from any local disks are included in this tally.

> Most of the complex SCSI stuff is done at start of day; it's actually
> only then we'd notice things like changes in INQUIRY strings or mode
> pages.
>
> Failover, which is what you're talking about, requires reinstatement of
> all the operating parameters of the source/target system, but that's not
> wholly the responsibility of the storage system ...

It's the responsibility of the hypervisor when dealing with live migration.

Regards,

Anthony Liguori

>
> James
>
>> It's different when you're talking about a reboot happening or a
>> disconnect/reconnect due to firmware upgrade.  The OS would naturally be
>> reprobing in this case.
>
>
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-18 16:47             ` James Bottomley
  2012-07-18 19:12               ` Anthony Liguori
@ 2012-07-18 19:12               ` Anthony Liguori
  1 sibling, 0 replies; 48+ messages in thread
From: Anthony Liguori @ 2012-07-18 19:12 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jens Axboe, kvm-devel, linux-scsi, Michael S. Tsirkin, lf-virt,
	Christoph Hellwig, Anthony Liguori, target-devel, Paolo Bonzini,
	Zhi Yong Wu, Christoph Hellwig, Stefan Hajnoczi

On 07/18/2012 11:47 AM, James Bottomley wrote:
> On Wed, 2012-07-18 at 11:00 -0500, Anthony Liguori wrote:
>
> Of course: Think about the consequences: you want to upgrade one array
> on your SAN.  You definitely don't want to shut down your entire data
> centre to achieve it.  In place upgrades on running SANs have been
> common in enterprise environments for a while.

Would firmware upgrades ever result in major OS visible changes though?

Maybe OSes are more robust with SCSI than with other types of buses, but I don't 
think it's safe to completely ignore the problem.

>> I agree that in general, SCSI targets don't need this, but I'm pretty sure that
>> if a guest probes for a command, you migrate to an old version, and that command
>> is no longer there, badness will ensue.
>
> What command are we talking about?  Operation of initiators is usually
> just READ and WRITE.  So perhaps we might have inline UNMAP ... but the
> world wouldn't come to an end even if the latter stopped working.

Is that true for all OSes?  Linux may handle things gracefully if UNMAP starts 
throwing errors but that doesn't mean that Windows will.

There are other cases where this creates problems.  Windows (and some other 
OSes) fingerprint the hardware profile in order to do license enforcement.  If 
the hardware changes beyond a certain amount, then they refuse to boot.

Windows does this with a points system and I do believe that INQUIRY responses 
from any local disks are included in this tally.

> Most of the complex SCSI stuff is done at start of day; it's actually
> only then we'd notice things like changes in INQUIRY strings or mode
> pages.
>
> Failover, which is what you're talking about, requires reinstatement of
> all the operating parameters of the source/target system, but that's not
> wholly the responsibility of the storage system ...

It's the responsibility of the hypervisor when dealing with live migration.

Regards,

Anthony Liguori

>
> James
>
>> It's different when you're talking about a reboot happening or a
>> disconnect/reconnect due to firmware upgrade.  The OS would naturally be
>> reprobing in this case.
>
>
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-18 17:17               ` Michael S. Tsirkin
@ 2012-07-18 20:12                 ` Rustad, Mark D
  2012-07-18 20:12                 ` Rustad, Mark D
  1 sibling, 0 replies; 48+ messages in thread
From: Rustad, Mark D @ 2012-07-18 20:12 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Christoph Hellwig, Anthony Liguori, Nicholas A. Bellinger,
	target-devel, linux-scsi, lf-virt, kvm-devel, Stefan Hajnoczi,
	Zhi Yong Wu, Anthony Liguori, Paolo Bonzini, Christoph Hellwig,
	Jens Axboe, Hannes Reinecke

On Jul 18, 2012, at 10:17 AM, Michael S. Tsirkin wrote:

<snip>

>> You do have to pay very close attention to some things however. Don't change the device identity in any way - even version information, otherwise a Windows initiator will blue-screen. I made that mistake myself, so I remember it well. It seemed like such an innocent change. I don't recall there being any issue with adding commands and we did do that on occasion.
> 
> How about removing commands?

Good question. With the storage system I am familiar with, that would only be a risk if firmware were downgraded. Downgrading would never have been recommended. I am sure that if something like persistent reserve suddenly went away it would cause big trouble for some initiators.

-- 
Mark Rustad, LAN Access Division, Intel Corporation


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-18 17:17               ` Michael S. Tsirkin
  2012-07-18 20:12                 ` Rustad, Mark D
@ 2012-07-18 20:12                 ` Rustad, Mark D
  1 sibling, 0 replies; 48+ messages in thread
From: Rustad, Mark D @ 2012-07-18 20:12 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jens Axboe, Anthony Liguori, kvm-devel, linux-scsi, lf-virt,
	Christoph Hellwig, Anthony Liguori, target-devel, Paolo Bonzini,
	Zhi Yong Wu, Christoph Hellwig, Stefan Hajnoczi

On Jul 18, 2012, at 10:17 AM, Michael S. Tsirkin wrote:

<snip>

>> You do have to pay very close attention to some things however. Don't change the device identity in any way - even version information, otherwise a Windows initiator will blue-screen. I made that mistake myself, so I remember it well. It seemed like such an innocent change. I don't recall there being any issue with adding commands and we did do that on occasion.
> 
> How about removing commands?

Good question. With the storage system I am familiar with, that would only be a risk if firmware were downgraded. Downgrading would never have been recommended. I am sure that if something like persistent reserve suddenly went away it would cause big trouble for some initiators.

-- 
Mark Rustad, LAN Access Division, Intel Corporation

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-18 13:42       ` Anthony Liguori
                           ` (3 preceding siblings ...)
  2012-07-18 15:53         ` Christoph Hellwig
@ 2012-07-18 22:45         ` Nicholas A. Bellinger
  2012-07-18 22:45         ` Nicholas A. Bellinger
  5 siblings, 0 replies; 48+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-18 22:45 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Michael S. Tsirkin, target-devel, linux-scsi, lf-virt, kvm-devel,
	Stefan Hajnoczi, Zhi Yong Wu, Anthony Liguori, Paolo Bonzini,
	Christoph Hellwig, Jens Axboe, Hannes Reinecke

On Wed, 2012-07-18 at 08:42 -0500, Anthony Liguori wrote:
> On 07/17/2012 04:50 PM, Nicholas A. Bellinger wrote:
> > On Tue, 2012-07-17 at 13:55 -0500, Anthony Liguori wrote:
> >> On 07/17/2012 10:05 AM, Michael S. Tsirkin wrote:
> >>> On Wed, Jul 11, 2012 at 09:15:00PM +0000, Nicholas A. Bellinger wrote:
> >

<SNIP>

> >
> >> But I do think the kernel should carefully consider whether it wants to support
> >> an interface like this.  This an extremely complicated ABI with a lot of subtle
> >> details around state and compatibility.
> >>
> >> Are you absolutely confident that you can support a userspace application that
> >> expects to get exactly the same response from all possible commands in 20 kernel
> >> versions from now?  Virtualization requires absolutely precise compatibility in
> >> terms of bugs and features.  This is probably not something the TCM stack has
> >> had to consider yet.
> >>
> >
> > We most certainly have thought about long term userspace compatibility
> > with TCM.  Our userspace code (that's now available in all major
> > distros) is completely forward-compatible with new fabric modules such
> > as tcm_vhost.  No update required.
> 
> I'm not sure we're talking about the same thing when we say compatibility.
> 
> I'm not talking about the API.  I'm talking about the behavior of the commands 
> that tcm_vhost supports.
> 

OK, I understand what your getting at now..

> If you add support for a new command, you need to provide userspace a way to 
> disable this command.  If you change what gets reported for VPD, you need to 
> provide userspace a way to make VPD look like what it did in a previous version.
> 
> Basically, you need to be able to make a TCM device behave 100% the same as it 
> did in an older version of the kernel.
> 
> This is unique to virtualization due to live migration.  If you migrate from a 
> 3.6 kernel to a 3.8 kernel, you need to make sure that the 3.8 kernel's TCM 
> device behaves exactly like the 3.6 kernel because the guest that is interacting 
> with it does not realize that live migration happened.
> 
> Yes, you can add knobs via configfs to control this behavior, but I think the 
> question is, what's the plan for this?
> 

So we already allow for some types of CDBs emulation to be toggled via
backend device attributes:

root@tifa:/usr/src/target-pending.git# tree /sys/kernel/config/target/core/iblock_2/fioa/attrib/
/sys/kernel/config/target/core/iblock_2/fioa/attrib/
├── block_size
├── emulate_dpo
├── emulate_fua_read
├── emulate_fua_write
├── emulate_rest_reord
├── emulate_tas
├── emulate_tpu
├── emulate_tpws
├── emulate_ua_intlck_ctrl
├── emulate_write_cache
├── enforce_pr_isids

<SNIP>

Some things like SPC-3 persistent reservations + implict/explict ALUA
multipath currently can't be disabled, but adding two more backend
attributes to disable/enable this logic individual is easy enough to do.

So that said, I don't have a problem with adding the necessary device
attributes to limit what type of CDBs a backend device is capable of
processing.

Trying to limiting this per-guest (instead of per-device) is where
things get a little more tricky..

> BTW, I think this is a good thing to cover in Documentation/vhost/tcm_vhost.txt. 
>   I think that's probably the only change that's needed here.
> 

Sure, but I'll need to know what else that you'd like to optionally
restrict it terms of CDB processing that's not already there..

Thanks for your feedback!

--nab

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-18 13:42       ` Anthony Liguori
                           ` (4 preceding siblings ...)
  2012-07-18 22:45         ` Nicholas A. Bellinger
@ 2012-07-18 22:45         ` Nicholas A. Bellinger
  5 siblings, 0 replies; 48+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-18 22:45 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Jens Axboe, Stefan Hajnoczi, kvm-devel, Michael S. Tsirkin,
	lf-virt, Anthony Liguori, target-devel, linux-scsi,
	Paolo Bonzini, Zhi Yong Wu, Christoph Hellwig

On Wed, 2012-07-18 at 08:42 -0500, Anthony Liguori wrote:
> On 07/17/2012 04:50 PM, Nicholas A. Bellinger wrote:
> > On Tue, 2012-07-17 at 13:55 -0500, Anthony Liguori wrote:
> >> On 07/17/2012 10:05 AM, Michael S. Tsirkin wrote:
> >>> On Wed, Jul 11, 2012 at 09:15:00PM +0000, Nicholas A. Bellinger wrote:
> >

<SNIP>

> >
> >> But I do think the kernel should carefully consider whether it wants to support
> >> an interface like this.  This an extremely complicated ABI with a lot of subtle
> >> details around state and compatibility.
> >>
> >> Are you absolutely confident that you can support a userspace application that
> >> expects to get exactly the same response from all possible commands in 20 kernel
> >> versions from now?  Virtualization requires absolutely precise compatibility in
> >> terms of bugs and features.  This is probably not something the TCM stack has
> >> had to consider yet.
> >>
> >
> > We most certainly have thought about long term userspace compatibility
> > with TCM.  Our userspace code (that's now available in all major
> > distros) is completely forward-compatible with new fabric modules such
> > as tcm_vhost.  No update required.
> 
> I'm not sure we're talking about the same thing when we say compatibility.
> 
> I'm not talking about the API.  I'm talking about the behavior of the commands 
> that tcm_vhost supports.
> 

OK, I understand what your getting at now..

> If you add support for a new command, you need to provide userspace a way to 
> disable this command.  If you change what gets reported for VPD, you need to 
> provide userspace a way to make VPD look like what it did in a previous version.
> 
> Basically, you need to be able to make a TCM device behave 100% the same as it 
> did in an older version of the kernel.
> 
> This is unique to virtualization due to live migration.  If you migrate from a 
> 3.6 kernel to a 3.8 kernel, you need to make sure that the 3.8 kernel's TCM 
> device behaves exactly like the 3.6 kernel because the guest that is interacting 
> with it does not realize that live migration happened.
> 
> Yes, you can add knobs via configfs to control this behavior, but I think the 
> question is, what's the plan for this?
> 

So we already allow for some types of CDBs emulation to be toggled via
backend device attributes:

root@tifa:/usr/src/target-pending.git# tree /sys/kernel/config/target/core/iblock_2/fioa/attrib/
/sys/kernel/config/target/core/iblock_2/fioa/attrib/
├── block_size
├── emulate_dpo
├── emulate_fua_read
├── emulate_fua_write
├── emulate_rest_reord
├── emulate_tas
├── emulate_tpu
├── emulate_tpws
├── emulate_ua_intlck_ctrl
├── emulate_write_cache
├── enforce_pr_isids

<SNIP>

Some things like SPC-3 persistent reservations + implict/explict ALUA
multipath currently can't be disabled, but adding two more backend
attributes to disable/enable this logic individual is easy enough to do.

So that said, I don't have a problem with adding the necessary device
attributes to limit what type of CDBs a backend device is capable of
processing.

Trying to limiting this per-guest (instead of per-device) is where
things get a little more tricky..

> BTW, I think this is a good thing to cover in Documentation/vhost/tcm_vhost.txt. 
>   I think that's probably the only change that's needed here.
> 

Sure, but I'll need to know what else that you'd like to optionally
restrict it terms of CDB processing that's not already there..

Thanks for your feedback!

--nab

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-18 19:12               ` Anthony Liguori
  2012-07-19  6:00                 ` Paolo Bonzini
@ 2012-07-19  6:00                 ` Paolo Bonzini
  2012-07-19  7:28                   ` James Bottomley
  1 sibling, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2012-07-19  6:00 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: James Bottomley, Christoph Hellwig, Nicholas A. Bellinger,
	Michael S. Tsirkin, target-devel, linux-scsi, lf-virt, kvm-devel,
	Stefan Hajnoczi, Zhi Yong Wu, Anthony Liguori, Christoph Hellwig,
	Jens Axboe, Hannes Reinecke

Il 18/07/2012 21:12, Anthony Liguori ha scritto:
> Is that true for all OSes?  Linux may handle things gracefully if UNMAP
> starts throwing errors but that doesn't mean that Windows will.

There is so much USB crap (not just removable, think of ATA-to-USB
enclosures) that they have to deal with pretty much everything.

> Windows does this with a points system and I do believe that INQUIRY
> responses from any local disks are included in this tally.

INQUIRY responses (at least vendor/product/type) should not change.

Paolo

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-18 19:12               ` Anthony Liguori
@ 2012-07-19  6:00                 ` Paolo Bonzini
  2012-07-19  6:00                 ` Paolo Bonzini
  1 sibling, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2012-07-19  6:00 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Jens Axboe, Christoph Hellwig, kvm-devel, linux-scsi,
	Michael S. Tsirkin, lf-virt, James Bottomley, Anthony Liguori,
	target-devel, Zhi Yong Wu, Christoph Hellwig, Stefan Hajnoczi

Il 18/07/2012 21:12, Anthony Liguori ha scritto:
> Is that true for all OSes?  Linux may handle things gracefully if UNMAP
> starts throwing errors but that doesn't mean that Windows will.

There is so much USB crap (not just removable, think of ATA-to-USB
enclosures) that they have to deal with pretty much everything.

> Windows does this with a points system and I do believe that INQUIRY
> responses from any local disks are included in this tally.

INQUIRY responses (at least vendor/product/type) should not change.

Paolo

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-19  6:00                 ` Paolo Bonzini
@ 2012-07-19  7:28                   ` James Bottomley
  2012-07-19  7:30                     ` Paolo Bonzini
  0 siblings, 1 reply; 48+ messages in thread
From: James Bottomley @ 2012-07-19  7:28 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Anthony Liguori, Christoph Hellwig, Nicholas A. Bellinger,
	Michael S. Tsirkin, target-devel, linux-scsi, lf-virt, kvm-devel,
	Stefan Hajnoczi, Zhi Yong Wu, Anthony Liguori, Christoph Hellwig,
	Jens Axboe, Hannes Reinecke

On Thu, 2012-07-19 at 08:00 +0200, Paolo Bonzini wrote:
> Il 18/07/2012 21:12, Anthony Liguori ha scritto:
>  Windows does this with a points system and I do believe that INQUIRY
> > responses from any local disks are included in this tally.
> 
> INQUIRY responses (at least vendor/product/type) should not change.

INQUIRY responses often change for arrays because a firmware upgrade
enables new features and new features have to declare themselves,
usually in the INQUIRY data.  What you mean, I think, is that previously
exposed features in INQUIRY data, as well as strings
(vendor/product/type, as you say), shouldn't change, but unexposed data
(read 0 in the fields) may.

James



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
  2012-07-19  7:28                   ` James Bottomley
@ 2012-07-19  7:30                     ` Paolo Bonzini
  0 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2012-07-19  7:30 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jens Axboe, Anthony Liguori, kvm-devel, linux-scsi,
	Michael S. Tsirkin, lf-virt, Christoph Hellwig, Anthony Liguori,
	target-devel, Zhi Yong Wu, Christoph Hellwig, Stefan Hajnoczi

Il 19/07/2012 09:28, James Bottomley ha scritto:
>> > INQUIRY responses (at least vendor/product/type) should not change.
> INQUIRY responses often change for arrays because a firmware upgrade
> enables new features and new features have to declare themselves,
> usually in the INQUIRY data.  What you mean, I think, is that previously
> exposed features in INQUIRY data, as well as strings
> (vendor/product/type, as you say), shouldn't change, but unexposed data
> (read 0 in the fields) may.

What I meant is that it's unlikely that Windows fingerprinting is using
anything but vendor/product/type, because everything else can change.

Paolo

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2012-07-19  7:30 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-11 21:15 [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6 Nicholas A. Bellinger
2012-07-11 21:15 ` [RFC-v2 1/4] vhost: Separate vhost-net features from vhost features Nicholas A. Bellinger
2012-07-11 21:15 ` Nicholas A. Bellinger
2012-07-11 21:15 ` [RFC-v2 2/4] vhost: make vhost work queue visible Nicholas A. Bellinger
2012-07-11 21:15 ` Nicholas A. Bellinger
2012-07-11 21:15 ` [RFC-v2 3/4] vhost: Add vhost_scsi specific defines Nicholas A. Bellinger
2012-07-11 21:15 ` Nicholas A. Bellinger
2012-07-11 21:15 ` [RFC-v2 4/4] tcm_vhost: Initial merge for vhost level target fabric driver Nicholas A. Bellinger
2012-07-11 21:15 ` Nicholas A. Bellinger
2012-07-17 15:05 ` [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6 Michael S. Tsirkin
2012-07-17 18:55   ` Anthony Liguori
2012-07-17 19:00     ` Michael S. Tsirkin
2012-07-17 21:50     ` Nicholas A. Bellinger
2012-07-18 13:42       ` Anthony Liguori
2012-07-18 13:42       ` Anthony Liguori
2012-07-18 13:56         ` Paolo Bonzini
2012-07-18 15:33         ` Michael S. Tsirkin
2012-07-18 15:53         ` Christoph Hellwig
2012-07-18 15:53         ` Christoph Hellwig
2012-07-18 16:00           ` Michael S. Tsirkin
2012-07-18 16:42             ` Rustad, Mark D
2012-07-18 17:17               ` Michael S. Tsirkin
2012-07-18 20:12                 ` Rustad, Mark D
2012-07-18 20:12                 ` Rustad, Mark D
2012-07-18 16:42             ` Rustad, Mark D
2012-07-18 16:00           ` Anthony Liguori
2012-07-18 16:00           ` Anthony Liguori
2012-07-18 16:47             ` James Bottomley
2012-07-18 19:12               ` Anthony Liguori
2012-07-19  6:00                 ` Paolo Bonzini
2012-07-19  6:00                 ` Paolo Bonzini
2012-07-19  7:28                   ` James Bottomley
2012-07-19  7:30                     ` Paolo Bonzini
2012-07-18 19:12               ` Anthony Liguori
2012-07-18 22:45         ` Nicholas A. Bellinger
2012-07-18 22:45         ` Nicholas A. Bellinger
2012-07-17 21:17   ` Nicholas A. Bellinger
2012-07-17 21:17   ` Nicholas A. Bellinger
2012-07-17 21:34     ` Michael S. Tsirkin
2012-07-17 22:02       ` Nicholas A. Bellinger
2012-07-17 22:02       ` Nicholas A. Bellinger
2012-07-17 22:18         ` Michael S. Tsirkin
2012-07-17 22:18         ` Michael S. Tsirkin
2012-07-17 22:37           ` Nicholas A. Bellinger
2012-07-17 23:11             ` Michael S. Tsirkin
2012-07-18  0:17               ` Nicholas A. Bellinger
2012-07-17 21:58     ` Michael S. Tsirkin
2012-07-17 22:04       ` Nicholas A. Bellinger

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.