* [PATCH 0/8] Drivers: hv: Miscellaneous vmbus and util driver fixes
@ 2016-04-05 23:57 K. Y. Srinivasan
  2016-04-05 23:57 ` [PATCH 1/8] Drivers: hv: kvp: fix IP Failover K. Y. Srinivasan
  0 siblings, 1 reply; 13+ messages in thread
From: K. Y. Srinivasan @ 2016-04-05 23:57 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, jasowang
  Cc: K. Y. Srinivasan

Clean up the ring buffer code and implement APIs for "in place" consumption
of vmbus packets. This patchset also includes some other miscellaneous fixes.

K. Y. Srinivasan (6):
  Drivers: hv: vmbus: Introduce functions for estimating room in the
    ring buffer
  Drivers: hv: vmbus: Use READ_ONCE() to read variables that are
    volatile
  Drivers: hv: vmbus: Use the new virt_xx barrier code
  Drivers: hv: vmbus: Export the vmbus_set_event() API
  Drivers: hv: vmbus: Move some ring buffer functions to hyperv.h
  Drivers: hv: vmbus: Implement APIs to support "in place" consumption
    of vmbus packets

Vitaly Kuznetsov (2):
  Drivers: hv: kvp: fix IP Failover
  Drivers: hv: vmbus: handle various crash scenarios

 drivers/hv/channel_mgmt.c |   58 ++++++++++++----
 drivers/hv/connection.c   |    1 +
 drivers/hv/hv_kvp.c       |   31 ++++++++
 drivers/hv/hyperv_vmbus.h |   23 +++++-
 drivers/hv/ring_buffer.c  |   95 +++----------------------
 drivers/hv/vmbus_drv.c    |    7 +-
 include/linux/hyperv.h    |  168 +++++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 278 insertions(+), 105 deletions(-)

-- 
1.7.4.1


* [PATCH 1/8] Drivers: hv: kvp: fix IP Failover
  2016-04-05 23:57 [PATCH 0/8] Drivers: hv: Miscellaneous vmbus and util driver fixes K. Y. Srinivasan
@ 2016-04-05 23:57 ` K. Y. Srinivasan
  2016-04-05 23:57   ` [PATCH 2/8] Drivers: hv: vmbus: Introduce functions for estimating room in the ring buffer K. Y. Srinivasan
                     ` (7 more replies)
  0 siblings, 8 replies; 13+ messages in thread
From: K. Y. Srinivasan @ 2016-04-05 23:57 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, jasowang
  Cc: K. Y. Srinivasan

From: Vitaly Kuznetsov <vkuznets@redhat.com>

Hyper-V VMs can be replicated to other hosts, and a feature called
'Failover TCP/IP' allows a different IP to be set for the replicas. When
such a guest starts, the Hyper-V host sends it a KVP_OP_SET_IP_INFO message
as soon as we finish the negotiation procedure. The problem is that this can
happen (and actually does happen) before the userspace daemon connects, in
which case we reply to the message with HV_E_FAIL. As the host does not
retry, we fail to set the requested IP.

Solve the issue by postponing our reply to the negotiation message until
the userspace daemon is connected. We can't wait too long, as there is a
host-side timeout (approximately 75 seconds), and if we fail to reply within
this time frame the whole KVP service becomes inactive. The solution is not
ideal: if it takes the userspace daemon more than 60 seconds
(HV_UTIL_NEGO_TIMEOUT) to connect, IP Failover will still fail, but I don't
see a solution with our current separation between the kernel and userspace
parts.

The other two modules (VSS and FCOPY) don't require such a delay; leave
them untouched.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
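The deferral described above can be modelled outside the kernel as a small
state machine. The following is only a userspace sketch under simplifying
assumptions: struct kvp_model, enum action and on_channel_callback() are
hypothetical names, and daemon_ready stands in for the check of
kvp_transaction.state against HVUTIL_READY. It ignores the actual packet
handling and the HV_UTIL_NEGO_TIMEOUT delayed work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the deferral logic; not driver code. */
enum nego_state { NEGO_NOT_STARTED, NEGO_IN_PROGRESS, NEGO_FINISHED };

enum action { ACT_DEFER, ACT_REPLY };

struct kvp_model {
	enum nego_state nego;
	bool daemon_ready;	/* stands in for state >= HVUTIL_READY */
};

/*
 * Decide what to do when the channel callback runs.
 * The first call with no daemon connected defers the reply. Any later
 * call (daemon registered, or the delayed work fired) replies.
 */
static enum action on_channel_callback(struct kvp_model *m)
{
	assert(m != NULL);

	if (m->nego == NEGO_NOT_STARTED && !m->daemon_ready) {
		m->nego = NEGO_IN_PROGRESS;
		return ACT_DEFER;	/* the patch schedules delayed work here */
	}

	m->nego = NEGO_FINISHED;
	return ACT_REPLY;
}
```

In this model the second invocation always replies, which corresponds either
to the daemon registering (and cancelling the delayed work) or to the
timeout expiring first.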
 drivers/hv/hv_kvp.c       |   31 +++++++++++++++++++++++++++++++
 drivers/hv/hyperv_vmbus.h |    5 +++++
 2 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c
index 9b9b370..cb1a916 100644
--- a/drivers/hv/hv_kvp.c
+++ b/drivers/hv/hv_kvp.c
@@ -78,9 +78,11 @@ static void kvp_send_key(struct work_struct *dummy);
 
 static void kvp_respond_to_host(struct hv_kvp_msg *msg, int error);
 static void kvp_timeout_func(struct work_struct *dummy);
+static void kvp_host_handshake_func(struct work_struct *dummy);
 static void kvp_register(int);
 
 static DECLARE_DELAYED_WORK(kvp_timeout_work, kvp_timeout_func);
+static DECLARE_DELAYED_WORK(kvp_host_handshake_work, kvp_host_handshake_func);
 static DECLARE_WORK(kvp_sendkey_work, kvp_send_key);
 
 static const char kvp_devname[] = "vmbus/hv_kvp";
@@ -130,6 +132,11 @@ static void kvp_timeout_func(struct work_struct *dummy)
 	hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper);
 }
 
+static void kvp_host_handshake_func(struct work_struct *dummy)
+{
+	hv_poll_channel(kvp_transaction.recv_channel, hv_kvp_onchannelcallback);
+}
+
 static int kvp_handle_handshake(struct hv_kvp_msg *msg)
 {
 	switch (msg->kvp_hdr.operation) {
@@ -154,6 +161,12 @@ static int kvp_handle_handshake(struct hv_kvp_msg *msg)
 	pr_debug("KVP: userspace daemon ver. %d registered\n",
 		 KVP_OP_REGISTER);
 	kvp_register(dm_reg_value);
+
+	/*
+	 * If we're still negotiating with the host cancel the timeout
+	 * work to not poll the channel twice.
+	 */
+	cancel_delayed_work_sync(&kvp_host_handshake_work);
 	hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper);
 
 	return 0;
@@ -594,7 +607,22 @@ void hv_kvp_onchannelcallback(void *context)
 	struct icmsg_negotiate *negop = NULL;
 	int util_fw_version;
 	int kvp_srv_version;
+	static enum {NEGO_NOT_STARTED,
+		     NEGO_IN_PROGRESS,
+		     NEGO_FINISHED} host_negotiatied = NEGO_NOT_STARTED;
 
+	if (host_negotiatied == NEGO_NOT_STARTED &&
+	    kvp_transaction.state < HVUTIL_READY) {
+		/*
+		 * If userspace daemon is not connected and host is asking
+		 * us to negotiate we need to delay to not lose messages.
+		 * This is important for Failover IP setting.
+		 */
+		host_negotiatied = NEGO_IN_PROGRESS;
+		schedule_delayed_work(&kvp_host_handshake_work,
+				      HV_UTIL_NEGO_TIMEOUT * HZ);
+		return;
+	}
 	if (kvp_transaction.state > HVUTIL_READY)
 		return;
 
@@ -672,6 +700,8 @@ void hv_kvp_onchannelcallback(void *context)
 		vmbus_sendpacket(channel, recv_buffer,
 				       recvlen, requestid,
 				       VM_PKT_DATA_INBAND, 0);
+
+		host_negotiatied = NEGO_FINISHED;
 	}
 
 }
@@ -708,6 +738,7 @@ hv_kvp_init(struct hv_util_service *srv)
 void hv_kvp_deinit(void)
 {
 	kvp_transaction.state = HVUTIL_DEVICE_DYING;
+	cancel_delayed_work_sync(&kvp_host_handshake_work);
 	cancel_delayed_work_sync(&kvp_timeout_work);
 	cancel_work_sync(&kvp_sendkey_work);
 	hvutil_transport_destroy(hvt);
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 12321b9..8b07f9c 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -36,6 +36,11 @@
 #define HV_UTIL_TIMEOUT 30
 
 /*
+ * Timeout for guest-host handshake for services.
+ */
+#define HV_UTIL_NEGO_TIMEOUT 60
+
+/*
  * The below CPUID leaves are present if VersionAndFeatures.HypervisorPresent
  * is set by CPUID(HVCPUID_VERSION_FEATURES).
  */
-- 
1.7.4.1


* [PATCH 2/8] Drivers: hv: vmbus: Introduce functions for estimating room in the ring buffer
  2016-04-05 23:57 ` [PATCH 1/8] Drivers: hv: kvp: fix IP Failover K. Y. Srinivasan
@ 2016-04-05 23:57   ` K. Y. Srinivasan
  2016-04-05 23:57   ` [PATCH 3/8] Drivers: hv: vmbus: Use READ_ONCE() to read variables that are volatile K. Y. Srinivasan
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: K. Y. Srinivasan @ 2016-04-05 23:57 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, jasowang
  Cc: K. Y. Srinivasan

Introduce separate functions for estimating how much can be read from
and written to the ring buffer.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
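For reference, the arithmetic behind the two helpers can be exercised in
userspace. This is a minimal sketch that takes the indices as plain
arguments instead of a struct hv_ring_buffer_info; the function names
bytes_to_read() and bytes_to_write() are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes the consumer may read; same arithmetic as hv_get_bytes_to_read(). */
static uint32_t bytes_to_read(uint32_t read_loc, uint32_t write_loc,
			      uint32_t dsize)
{
	assert(read_loc < dsize && write_loc < dsize);
	return write_loc >= read_loc ? write_loc - read_loc
				     : (dsize - read_loc) + write_loc;
}

/* Bytes the producer may write; same arithmetic as hv_get_bytes_to_write(). */
static uint32_t bytes_to_write(uint32_t read_loc, uint32_t write_loc,
			       uint32_t dsize)
{
	assert(read_loc < dsize && write_loc < dsize);
	return write_loc >= read_loc ? dsize - (write_loc - read_loc)
				     : read_loc - write_loc;
}
```

For any pair of valid indices the two results add up to dsize, and an empty
ring (read_loc == write_loc) reports zero bytes to read and dsize bytes to
write.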
 drivers/hv/ring_buffer.c |   25 ++++---------------------
 include/linux/hyperv.h   |   27 +++++++++++++++++++++++++++
 2 files changed, 31 insertions(+), 21 deletions(-)

diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
index a40a73a..544362c 100644
--- a/drivers/hv/ring_buffer.c
+++ b/drivers/hv/ring_buffer.c
@@ -38,8 +38,6 @@ void hv_begin_read(struct hv_ring_buffer_info *rbi)
 
 u32 hv_end_read(struct hv_ring_buffer_info *rbi)
 {
-	u32 read;
-	u32 write;
 
 	rbi->ring_buffer->interrupt_mask = 0;
 	mb();
@@ -49,9 +47,7 @@ u32 hv_end_read(struct hv_ring_buffer_info *rbi)
 	 * If it is not, we raced and we need to process new
 	 * incoming messages.
 	 */
-	hv_get_ringbuffer_availbytes(rbi, &read, &write);
-
-	return read;
+	return hv_get_bytes_to_read(rbi);
 }
 
 /*
@@ -106,9 +102,6 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi)
 static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi)
 {
 	u32 cur_write_sz;
-	u32 r_size;
-	u32 write_loc;
-	u32 read_loc = rbi->ring_buffer->read_index;
 	u32 pending_sz;
 
 	/*
@@ -125,14 +118,11 @@ static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi)
 	mb();
 
 	pending_sz = rbi->ring_buffer->pending_send_sz;
-	write_loc = rbi->ring_buffer->write_index;
 	/* If the other end is not blocked on write don't bother. */
 	if (pending_sz == 0)
 		return false;
 
-	r_size = rbi->ring_datasize;
-	cur_write_sz = write_loc >= read_loc ? r_size - (write_loc - read_loc) :
-			read_loc - write_loc;
+	cur_write_sz = hv_get_bytes_to_write(rbi);
 
 	if (cur_write_sz >= pending_sz)
 		return true;
@@ -332,7 +322,6 @@ int hv_ringbuffer_write(struct hv_ring_buffer_info *outring_info,
 {
 	int i = 0;
 	u32 bytes_avail_towrite;
-	u32 bytes_avail_toread;
 	u32 totalbytes_towrite = 0;
 
 	u32 next_write_location;
@@ -348,9 +337,7 @@ int hv_ringbuffer_write(struct hv_ring_buffer_info *outring_info,
 	if (lock)
 		spin_lock_irqsave(&outring_info->ring_lock, flags);
 
-	hv_get_ringbuffer_availbytes(outring_info,
-				&bytes_avail_toread,
-				&bytes_avail_towrite);
+	bytes_avail_towrite = hv_get_bytes_to_write(outring_info);
 
 	/*
 	 * If there is only room for the packet, assume it is full.
@@ -401,7 +388,6 @@ int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info,
 		       void *buffer, u32 buflen, u32 *buffer_actual_len,
 		       u64 *requestid, bool *signal, bool raw)
 {
-	u32 bytes_avail_towrite;
 	u32 bytes_avail_toread;
 	u32 next_read_location = 0;
 	u64 prev_indices = 0;
@@ -417,10 +403,7 @@ int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info,
 	*buffer_actual_len = 0;
 	*requestid = 0;
 
-	hv_get_ringbuffer_availbytes(inring_info,
-				&bytes_avail_toread,
-				&bytes_avail_towrite);
-
+	bytes_avail_toread = hv_get_bytes_to_read(inring_info);
 	/* Make sure there is something to read */
 	if (bytes_avail_toread < sizeof(desc)) {
 		/*
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index ecd81c3..a6b053c 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -151,6 +151,33 @@ hv_get_ringbuffer_availbytes(struct hv_ring_buffer_info *rbi,
 	*read = dsize - *write;
 }
 
+static inline u32 hv_get_bytes_to_read(struct hv_ring_buffer_info *rbi)
+{
+	u32 read_loc, write_loc, dsize, read;
+
+	dsize = rbi->ring_datasize;
+	read_loc = rbi->ring_buffer->read_index;
+	write_loc = READ_ONCE(rbi->ring_buffer->write_index);
+
+	read = write_loc >= read_loc ? (write_loc - read_loc) :
+		(dsize - read_loc) + write_loc;
+
+	return read;
+}
+
+static inline u32 hv_get_bytes_to_write(struct hv_ring_buffer_info *rbi)
+{
+	u32 read_loc, write_loc, dsize, write;
+
+	dsize = rbi->ring_datasize;
+	read_loc = READ_ONCE(rbi->ring_buffer->read_index);
+	write_loc = rbi->ring_buffer->write_index;
+
+	write = write_loc >= read_loc ? dsize - (write_loc - read_loc) :
+		read_loc - write_loc;
+	return write;
+}
+
 /*
  * VMBUS version is 32 bit entity broken up into
  * two 16 bit quantities: major_number. minor_number.
-- 
1.7.4.1


* [PATCH 3/8] Drivers: hv: vmbus: Use READ_ONCE() to read variables that are volatile
  2016-04-05 23:57 ` [PATCH 1/8] Drivers: hv: kvp: fix IP Failover K. Y. Srinivasan
  2016-04-05 23:57   ` [PATCH 2/8] Drivers: hv: vmbus: Introduce functions for estimating room in the ring buffer K. Y. Srinivasan
@ 2016-04-05 23:57   ` K. Y. Srinivasan
  2016-04-05 23:57   ` [PATCH 4/8] Drivers: hv: vmbus: Use the new virt_xx barrier code K. Y. Srinivasan
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: K. Y. Srinivasan @ 2016-04-05 23:57 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, jasowang
  Cc: K. Y. Srinivasan

Use the READ_ONCE macro to access variables that can change asynchronously.
This is the recommended mechanism for dealing with "unsafe" compiler
optimizations.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
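To illustrate the idea in userspace: a volatile access forces the compiler
to emit exactly one load at that point rather than caching or re-reading the
value. The macro below is a simplified stand-in restricted to 32-bit values,
not the kernel's READ_ONCE(), and struct ring_hdr and need_to_signal() are
hypothetical names. The sketch also leaves out the memory barriers that
hv_need_to_signal() issues.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's READ_ONCE(), restricted to u32. */
#define READ_ONCE_U32(x) (*(const volatile uint32_t *)&(x))

/* Fields that the other end of the ring may change at any time. */
struct ring_hdr {
	uint32_t write_index;
	uint32_t read_index;
	uint32_t interrupt_mask;
};

/* Returns 1 if the writer should signal: mask clear and ring was empty. */
static int need_to_signal(uint32_t old_write, struct ring_hdr *hdr)
{
	assert(hdr != NULL);

	if (READ_ONCE_U32(hdr->interrupt_mask))
		return 0;

	return old_write == READ_ONCE_U32(hdr->read_index);
}
```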
 drivers/hv/ring_buffer.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
index 544362c..6ea1b55 100644
--- a/drivers/hv/ring_buffer.c
+++ b/drivers/hv/ring_buffer.c
@@ -69,7 +69,7 @@ u32 hv_end_read(struct hv_ring_buffer_info *rbi)
 static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi)
 {
 	mb();
-	if (rbi->ring_buffer->interrupt_mask)
+	if (READ_ONCE(rbi->ring_buffer->interrupt_mask))
 		return false;
 
 	/* check interrupt_mask before read_index */
@@ -78,7 +78,7 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi)
 	 * This is the only case we need to signal when the
 	 * ring transitions from being empty to non-empty.
 	 */
-	if (old_write == rbi->ring_buffer->read_index)
+	if (old_write == READ_ONCE(rbi->ring_buffer->read_index))
 		return true;
 
 	return false;
@@ -117,7 +117,7 @@ static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi)
 	 */
 	mb();
 
-	pending_sz = rbi->ring_buffer->pending_send_sz;
+	pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz);
 	/* If the other end is not blocked on write don't bother. */
 	if (pending_sz == 0)
 		return false;
-- 
1.7.4.1


* [PATCH 4/8] Drivers: hv: vmbus: Use the new virt_xx barrier code
  2016-04-05 23:57 ` [PATCH 1/8] Drivers: hv: kvp: fix IP Failover K. Y. Srinivasan
  2016-04-05 23:57   ` [PATCH 2/8] Drivers: hv: vmbus: Introduce functions for estimating room in the ring buffer K. Y. Srinivasan
  2016-04-05 23:57   ` [PATCH 3/8] Drivers: hv: vmbus: Use READ_ONCE() to read variables that are volatile K. Y. Srinivasan
@ 2016-04-05 23:57   ` K. Y. Srinivasan
  2016-04-05 23:57   ` [PATCH 5/8] Drivers: hv: vmbus: Export the vmbus_set_event() API K. Y. Srinivasan
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: K. Y. Srinivasan @ 2016-04-05 23:57 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, jasowang
  Cc: K. Y. Srinivasan

Use the virt_xx barriers that have been defined for use in virtual machines.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
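As I understand the kernel's memory-barriers documentation, the virt_xx
barriers keep SMP-strength ordering even in a guest built without
CONFIG_SMP, because the host on the other side of the ring may run on
another CPU. The ordering requirement itself can be sketched in userspace
with C11 fences. This is only an analogue: struct ring, RING_SIZE and
ring_publish() are hypothetical, and atomic_thread_fence() is standing in
for virt_mb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 64u

struct ring {
	_Atomic uint32_t write_index;
	uint8_t data[RING_SIZE];
};

/* Copy a payload, then publish it by advancing write_index. */
static void ring_publish(struct ring *r, const void *buf, uint32_t len)
{
	uint32_t w = atomic_load_explicit(&r->write_index,
					  memory_order_relaxed);

	assert(w + len <= RING_SIZE);	/* sketch: no wrap-around handling */
	memcpy(&r->data[w], buf, len);

	/* Userspace analogue of virt_mb(): payload must be visible first. */
	atomic_thread_fence(memory_order_seq_cst);

	atomic_store_explicit(&r->write_index, w + len,
			      memory_order_relaxed);
}
```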
 drivers/hv/ring_buffer.c |   14 +++++++-------
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
index 6ea1b55..8f518af 100644
--- a/drivers/hv/ring_buffer.c
+++ b/drivers/hv/ring_buffer.c
@@ -33,14 +33,14 @@
 void hv_begin_read(struct hv_ring_buffer_info *rbi)
 {
 	rbi->ring_buffer->interrupt_mask = 1;
-	mb();
+	virt_mb();
 }
 
 u32 hv_end_read(struct hv_ring_buffer_info *rbi)
 {
 
 	rbi->ring_buffer->interrupt_mask = 0;
-	mb();
+	virt_mb();
 
 	/*
 	 * Now check to see if the ring buffer is still empty.
@@ -68,12 +68,12 @@ u32 hv_end_read(struct hv_ring_buffer_info *rbi)
 
 static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi)
 {
-	mb();
+	virt_mb();
 	if (READ_ONCE(rbi->ring_buffer->interrupt_mask))
 		return false;
 
 	/* check interrupt_mask before read_index */
-	rmb();
+	virt_rmb();
 	/*
 	 * This is the only case we need to signal when the
 	 * ring transitions from being empty to non-empty.
@@ -115,7 +115,7 @@ static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi)
 	 * read index, we could miss sending the interrupt. Issue a full
 	 * memory barrier to address this.
 	 */
-	mb();
+	virt_mb();
 
 	pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz);
 	/* If the other end is not blocked on write don't bother. */
@@ -371,7 +371,7 @@ int hv_ringbuffer_write(struct hv_ring_buffer_info *outring_info,
 					     sizeof(u64));
 
 	/* Issue a full memory barrier before updating the write index */
-	mb();
+	virt_mb();
 
 	/* Now, update the write location */
 	hv_set_next_write_location(outring_info, next_write_location);
@@ -447,7 +447,7 @@ int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info,
 	 * the writer may start writing to the read area once the read index
 	 * is updated.
 	 */
-	mb();
+	virt_mb();
 
 	/* Update the read index */
 	hv_set_next_read_location(inring_info, next_read_location);
-- 
1.7.4.1


* [PATCH 5/8] Drivers: hv: vmbus: Export the vmbus_set_event() API
  2016-04-05 23:57 ` [PATCH 1/8] Drivers: hv: kvp: fix IP Failover K. Y. Srinivasan
                     ` (2 preceding siblings ...)
  2016-04-05 23:57   ` [PATCH 4/8] Drivers: hv: vmbus: Use the new virt_xx barrier code K. Y. Srinivasan
@ 2016-04-05 23:57   ` K. Y. Srinivasan
  2016-04-05 23:57   ` [PATCH 6/8] Drivers: hv: vmbus: Move some ring buffer functions to hyperv.h K. Y. Srinivasan
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: K. Y. Srinivasan @ 2016-04-05 23:57 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, jasowang
  Cc: K. Y. Srinivasan

In preparation for moving some ring buffer functionality out of the
vmbus driver, export the API for signaling the host.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
 drivers/hv/connection.c   |    1 +
 drivers/hv/hyperv_vmbus.h |    2 --
 include/linux/hyperv.h    |    1 +
 3 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index d02f137..fcf8a02 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -495,3 +495,4 @@ void vmbus_set_event(struct vmbus_channel *channel)
 
 	hv_do_hypercall(HVCALL_SIGNAL_EVENT, channel->sig_event, NULL);
 }
+EXPORT_SYMBOL_GPL(vmbus_set_event);
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 8b07f9c..e5203e4 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -672,8 +672,6 @@ void vmbus_disconnect(void);
 
 int vmbus_post_msg(void *buffer, size_t buflen);
 
-void vmbus_set_event(struct vmbus_channel *channel);
-
 void vmbus_on_event(unsigned long data);
 void vmbus_on_msg_dpc(unsigned long data);
 
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index a6b053c..4adeb6e 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1365,4 +1365,5 @@ extern __u32 vmbus_proto_version;
 
 int vmbus_send_tl_connect_request(const uuid_le *shv_guest_servie_id,
 				  const uuid_le *shv_host_servie_id);
+void vmbus_set_event(struct vmbus_channel *channel);
 #endif /* _HYPERV_H */
-- 
1.7.4.1


* [PATCH 6/8] Drivers: hv: vmbus: Move some ring buffer functions to hyperv.h
  2016-04-05 23:57 ` [PATCH 1/8] Drivers: hv: kvp: fix IP Failover K. Y. Srinivasan
                     ` (3 preceding siblings ...)
  2016-04-05 23:57   ` [PATCH 5/8] Drivers: hv: vmbus: Export the vmbus_set_event() API K. Y. Srinivasan
@ 2016-04-05 23:57   ` K. Y. Srinivasan
  2016-04-05 23:57   ` [PATCH 7/8] Drivers: hv: vmbus: Implement APIs to support "in place" consumption of vmbus packets K. Y. Srinivasan
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: K. Y. Srinivasan @ 2016-04-05 23:57 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, jasowang
  Cc: K. Y. Srinivasan

In preparation for implementing APIs for in-place consumption of VMBUS
packets, move some ring buffer functionality into hyperv.h.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
 drivers/hv/ring_buffer.c |   55 ----------------------------------------------
 include/linux/hyperv.h   |   54 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+), 55 deletions(-)

diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
index 8f518af..dd255c9 100644
--- a/drivers/hv/ring_buffer.c
+++ b/drivers/hv/ring_buffer.c
@@ -84,52 +84,6 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi)
 	return false;
 }
 
-/*
- * To optimize the flow management on the send-side,
- * when the sender is blocked because of lack of
- * sufficient space in the ring buffer, potential the
- * consumer of the ring buffer can signal the producer.
- * This is controlled by the following parameters:
- *
- * 1. pending_send_sz: This is the size in bytes that the
- *    producer is trying to send.
- * 2. The feature bit feat_pending_send_sz set to indicate if
- *    the consumer of the ring will signal when the ring
- *    state transitions from being full to a state where
- *    there is room for the producer to send the pending packet.
- */
-
-static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi)
-{
-	u32 cur_write_sz;
-	u32 pending_sz;
-
-	/*
-	 * Issue a full memory barrier before making the signaling decision.
-	 * Here is the reason for having this barrier:
-	 * If the reading of the pend_sz (in this function)
-	 * were to be reordered and read before we commit the new read
-	 * index (in the calling function)  we could
-	 * have a problem. If the host were to set the pending_sz after we
-	 * have sampled pending_sz and go to sleep before we commit the
-	 * read index, we could miss sending the interrupt. Issue a full
-	 * memory barrier to address this.
-	 */
-	virt_mb();
-
-	pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz);
-	/* If the other end is not blocked on write don't bother. */
-	if (pending_sz == 0)
-		return false;
-
-	cur_write_sz = hv_get_bytes_to_write(rbi);
-
-	if (cur_write_sz >= pending_sz)
-		return true;
-
-	return false;
-}
-
 /* Get the next write location for the specified ring buffer. */
 static inline u32
 hv_get_next_write_location(struct hv_ring_buffer_info *ring_info)
@@ -180,15 +134,6 @@ hv_set_next_read_location(struct hv_ring_buffer_info *ring_info,
 	ring_info->ring_buffer->read_index = next_read_location;
 }
 
-
-/* Get the start of the ring buffer. */
-static inline void *
-hv_get_ring_buffer(struct hv_ring_buffer_info *ring_info)
-{
-	return (void *)ring_info->ring_buffer->buffer;
-}
-
-
 /* Get the size of the ring buffer. */
 static inline u32
 hv_get_ring_buffersize(struct hv_ring_buffer_info *ring_info)
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 4adeb6e..6797a30 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1366,4 +1366,58 @@ extern __u32 vmbus_proto_version;
 int vmbus_send_tl_connect_request(const uuid_le *shv_guest_servie_id,
 				  const uuid_le *shv_host_servie_id);
 void vmbus_set_event(struct vmbus_channel *channel);
+
+/* Get the start of the ring buffer. */
+static inline void *
+hv_get_ring_buffer(struct hv_ring_buffer_info *ring_info)
+{
+	return (void *)ring_info->ring_buffer->buffer;
+}
+
+/*
+ * To optimize the flow management on the send-side,
+ * when the sender is blocked because of lack of
+ * sufficient space in the ring buffer, potential the
+ * consumer of the ring buffer can signal the producer.
+ * This is controlled by the following parameters:
+ *
+ * 1. pending_send_sz: This is the size in bytes that the
+ *    producer is trying to send.
+ * 2. The feature bit feat_pending_send_sz set to indicate if
+ *    the consumer of the ring will signal when the ring
+ *    state transitions from being full to a state where
+ *    there is room for the producer to send the pending packet.
+ */
+
+static inline  bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi)
+{
+	u32 cur_write_sz;
+	u32 pending_sz;
+
+	/*
+	 * Issue a full memory barrier before making the signaling decision.
+	 * Here is the reason for having this barrier:
+	 * If the reading of the pend_sz (in this function)
+	 * were to be reordered and read before we commit the new read
+	 * index (in the calling function)  we could
+	 * have a problem. If the host were to set the pending_sz after we
+	 * have sampled pending_sz and go to sleep before we commit the
+	 * read index, we could miss sending the interrupt. Issue a full
+	 * memory barrier to address this.
+	 */
+	virt_mb();
+
+	pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz);
+	/* If the other end is not blocked on write don't bother. */
+	if (pending_sz == 0)
+		return false;
+
+	cur_write_sz = hv_get_bytes_to_write(rbi);
+
+	if (cur_write_sz >= pending_sz)
+		return true;
+
+	return false;
+}
+
 #endif /* _HYPERV_H */
-- 
1.7.4.1


* [PATCH 7/8] Drivers: hv: vmbus: Implement APIs to support "in place" consumption of vmbus packets
  2016-04-05 23:57 ` [PATCH 1/8] Drivers: hv: kvp: fix IP Failover K. Y. Srinivasan
                     ` (4 preceding siblings ...)
  2016-04-05 23:57   ` [PATCH 6/8] Drivers: hv: vmbus: Move some ring buffer functions to hyperv.h K. Y. Srinivasan
@ 2016-04-05 23:57   ` K. Y. Srinivasan
  2016-04-05 23:57   ` [PATCH 8/8] Drivers: hv: vmbus: handle various crash scenarios K. Y. Srinivasan
  2016-04-30 21:04   ` [PATCH 1/8] Drivers: hv: kvp: fix IP Failover Greg KH
  7 siblings, 0 replies; 13+ messages in thread
From: K. Y. Srinivasan @ 2016-04-05 23:57 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, jasowang
  Cc: K. Y. Srinivasan

Implement APIs for in-place consumption of vmbus packets. Currently, each
packet is copied and processed one at a time, and as part of processing
each packet we may signal the host (if it is waiting for room to produce
a packet).

These APIs enable batched in-place processing of vmbus packets.
We also optimize host signaling by having a separate API to signal
the end of in-place consumption. With netvsc using these APIs,
on an iperf run I see on average about a 20X reduction in checks to
signal the host.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
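The batching idea can be shown with a small userspace model: walk the
packets with a private cursor and publish the shared read index once at the
end. Everything here is a simplified assumption rather than the vmbus
layout. struct ring, next_pkt() and consume_all() are hypothetical names,
the packet header is reduced to a single 16-bit len8 field, and the barrier
and host signaling done by commit_rd_index() are omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PKT_TRAILER 8u	/* same role as VMBUS_PKT_TRAILER */

struct ring {
	uint32_t read_index;	  /* shared with the producer */
	uint32_t write_index;	  /* shared with the producer */
	uint32_t priv_read_index; /* consumer-private cursor */
	uint32_t dsize;
	uint8_t *data;
};

/* Return the offset of the next whole packet, or -1 if none (or it wraps). */
static int64_t next_pkt(const struct ring *r)
{
	uint32_t loc = r->priv_read_index;
	uint32_t avail = r->write_index >= loc ?
			 r->write_index - loc :
			 (r->dsize - loc) + r->write_index;
	uint32_t total;
	uint16_t len8;

	if (avail < sizeof(len8) || loc + sizeof(len8) > r->dsize)
		return -1;

	memcpy(&len8, r->data + loc, sizeof(len8));
	total = ((uint32_t)len8 << 3) + PKT_TRAILER;
	if (avail < total || loc + total > r->dsize - 1)
		return -1;

	return loc;
}

/* Consume every whole packet in place, then commit the read index once. */
static unsigned consume_all(struct ring *r, uint32_t *payload_bytes)
{
	unsigned n = 0;
	int64_t off;

	assert(r && r->data && payload_bytes);

	while ((off = next_pkt(r)) >= 0) {
		uint16_t len8;

		memcpy(&len8, r->data + off, sizeof(len8));
		*payload_bytes += (uint32_t)len8 << 3;	/* "process" in place */
		r->priv_read_index += ((uint32_t)len8 << 3) + PKT_TRAILER;
		n++;
	}

	if (n)
		r->read_index = r->priv_read_index;	/* single commit */

	return n;
}
```

The point of the single commit is that the decision whether to signal the
host is evaluated once per batch instead of once per packet.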
 drivers/hv/ring_buffer.c |    1 +
 include/linux/hyperv.h   |   86 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 87 insertions(+), 0 deletions(-)

diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
index dd255c9..fe586bf 100644
--- a/drivers/hv/ring_buffer.c
+++ b/drivers/hv/ring_buffer.c
@@ -132,6 +132,7 @@ hv_set_next_read_location(struct hv_ring_buffer_info *ring_info,
 		    u32 next_read_location)
 {
 	ring_info->ring_buffer->read_index = next_read_location;
+	ring_info->priv_read_index = next_read_location;
 }
 
 /* Get the size of the ring buffer. */
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 6797a30..b10954a 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -126,6 +126,8 @@ struct hv_ring_buffer_info {
 
 	u32 ring_datasize;		/* < ring_size */
 	u32 ring_data_startoffset;
+	u32 priv_write_index;
+	u32 priv_read_index;
 };
 
 /*
@@ -1420,4 +1422,88 @@ static inline  bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi)
 	return false;
 }
 
+/*
+ * An API to support in-place processing of incoming VMBUS packets.
+ */
+#define VMBUS_PKT_TRAILER	8
+
+static inline struct vmpacket_descriptor *
+get_next_pkt_raw(struct vmbus_channel *channel)
+{
+	struct hv_ring_buffer_info *ring_info = &channel->inbound;
+	u32 read_loc = ring_info->priv_read_index;
+	void *ring_buffer = hv_get_ring_buffer(ring_info);
+	struct vmpacket_descriptor *cur_desc;
+	u32 packetlen;
+	u32 dsize = ring_info->ring_datasize;
+	u32 delta = read_loc - ring_info->ring_buffer->read_index;
+	u32 bytes_avail_toread = (hv_get_bytes_to_read(ring_info) - delta);
+
+	if (bytes_avail_toread < sizeof(struct vmpacket_descriptor))
+		return NULL;
+
+	if ((read_loc + sizeof(*cur_desc)) > dsize)
+		return NULL;
+
+	cur_desc = ring_buffer + read_loc;
+	packetlen = cur_desc->len8 << 3;
+
+	/*
+	 * If the packet under consideration is wrapping around,
+	 * return failure.
+	 */
+	if ((read_loc + packetlen + VMBUS_PKT_TRAILER) > (dsize - 1))
+		return NULL;
+
+	return cur_desc;
+}
+
+/*
+ * A helper function to step through packets "in-place"
+ * This API is to be called after each successful call
+ * get_next_pkt_raw().
+ */
+static inline void put_pkt_raw(struct vmbus_channel *channel,
+				struct vmpacket_descriptor *desc)
+{
+	struct hv_ring_buffer_info *ring_info = &channel->inbound;
+	u32 read_loc = ring_info->priv_read_index;
+	u32 packetlen = desc->len8 << 3;
+	u32 dsize = ring_info->ring_datasize;
+
+	if ((read_loc + packetlen + VMBUS_PKT_TRAILER) > dsize)
+		BUG();
+	/*
+	 * Include the packet trailer.
+	 */
+	ring_info->priv_read_index += packetlen + VMBUS_PKT_TRAILER;
+}
+
+/*
+ * This call commits the read index and potentially signals the host.
+ * Here is the pattern for using the "in-place" consumption APIs:
+ *
+ * while (get_next_pkt_raw()) {
+ *	process the packet "in-place";
+ *	put_pkt_raw();
+ * }
+ * if (packets processed in place)
+ *	commit_rd_index();
+ */
+static inline void commit_rd_index(struct vmbus_channel *channel)
+{
+	struct hv_ring_buffer_info *ring_info = &channel->inbound;
+	/*
+	 * Make sure all reads are done before we update the read index since
+	 * the writer may start writing to the read area once the read index
+	 * is updated.
+	 */
+	virt_rmb();
+	ring_info->ring_buffer->read_index = ring_info->priv_read_index;
+
+	if (hv_need_to_signal_on_read(ring_info))
+		vmbus_set_event(channel);
+}
+
+
 #endif /* _HYPERV_H */
-- 
1.7.4.1


* [PATCH 8/8] Drivers: hv: vmbus: handle various crash scenarios
  2016-04-05 23:57 ` [PATCH 1/8] Drivers: hv: kvp: fix IP Failover K. Y. Srinivasan
                     ` (5 preceding siblings ...)
  2016-04-05 23:57   ` [PATCH 7/8] Drivers: hv: vmbus: Implement APIs to support "in place" consumption of vmbus packets K. Y. Srinivasan
@ 2016-04-05 23:57   ` K. Y. Srinivasan
  2016-04-30 21:04   ` [PATCH 1/8] Drivers: hv: kvp: fix IP Failover Greg KH
  7 siblings, 0 replies; 13+ messages in thread
From: K. Y. Srinivasan @ 2016-04-05 23:57 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, jasowang
  Cc: Vitaly Kuznetsov, K. Y. Srinivasan

From: Vitaly Kuznetsov <vkuznets@redhat.com>

Kdump keeps biting. It turns out that CHANNELMSG_UNLOAD_RESPONSE is always
delivered to the CPU which was used for initial contact or to CPU0,
depending on the host version. vmbus_wait_for_unload() doesn't account for
the fact that if we're crashing on some other CPU, we won't get the
CHANNELMSG_UNLOAD_RESPONSE message and our wait on the current CPU will
never end.

Do the following:
1) Check for completion_done() in the loop. If the interrupt handler is
   still alive, we'll get the confirmation we need.

2) Read the message pages of all CPUs, as we're unsure where
   CHANNELMSG_UNLOAD_RESPONSE is going to be delivered. We can race with a
   still-alive interrupt handler doing the same, so add cmpxchg() to
   vmbus_signal_eom() to not lose the CHANNELMSG_UNLOAD_RESPONSE message.

3) Clean up message pages on all CPUs. This is required (at least for the
   current CPU, as we're clearing CPU0 messages now, but we may want to bring
   up additional CPUs on crash) because new messages won't be delivered until
   we consume what's pending. On boot we'll place message pages somewhere
   else and we won't be able to read stale messages.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
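The race in step 2 can be sketched in userspace with a C11 compare-exchange.
This is an assumption-laden model of the idea, not the vmbus_signal_eom()
implementation: struct msg_slot, MSG_NONE and signal_eom() are hypothetical
names, and the real function does more than clear the message type.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MSG_NONE 0u	/* plays the role of HVMSG_NONE */

struct msg_slot {
	_Atomic uint32_t message_type;
};

/*
 * Mark the slot free only if it still holds the message we processed.
 * Returns false when someone else already consumed the slot (and it may
 * since have been refilled), in which case we must not clear it.
 */
static bool signal_eom(struct msg_slot *slot, uint32_t old_type)
{
	assert(old_type != MSG_NONE);
	return atomic_compare_exchange_strong(&slot->message_type,
					      &old_type, MSG_NONE);
}
```

With an unconditional store, a slot that was cleared by the interrupt
handler and then refilled would be wiped by the crashing CPU, which is how a
pending CHANNELMSG_UNLOAD_RESPONSE could be lost.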
 drivers/hv/channel_mgmt.c |   58 +++++++++++++++++++++++++++++++++-----------
 drivers/hv/hyperv_vmbus.h |   16 +++++++++++-
 drivers/hv/vmbus_drv.c    |    7 +++--
 3 files changed, 61 insertions(+), 20 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 38b682ba..b6c1211 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -597,27 +597,55 @@ static void init_vp_index(struct vmbus_channel *channel, u16 dev_type)
 
 static void vmbus_wait_for_unload(void)
 {
-	int cpu = smp_processor_id();
-	void *page_addr = hv_context.synic_message_page[cpu];
-	struct hv_message *msg = (struct hv_message *)page_addr +
-				  VMBUS_MESSAGE_SINT;
+	int cpu;
+	void *page_addr;
+	struct hv_message *msg;
 	struct vmbus_channel_message_header *hdr;
-	bool unloaded = false;
+	u32 message_type;
 
+	/*
+	 * CHANNELMSG_UNLOAD_RESPONSE is always delivered to the CPU which was
+	 * used for initial contact or to CPU0 depending on host version. When
+	 * we're crashing on a different CPU let's hope that IRQ handler on
+	 * the cpu which receives CHANNELMSG_UNLOAD_RESPONSE is still
+	 * functional and vmbus_unload_response() will complete
+	 * vmbus_connection.unload_event. If not, the last thing we can do is
+	 * read message pages for all CPUs directly.
+	 */
 	while (1) {
-		if (READ_ONCE(msg->header.message_type) == HVMSG_NONE) {
-			mdelay(10);
-			continue;
-		}
+		if (completion_done(&vmbus_connection.unload_event))
+			break;
 
-		hdr = (struct vmbus_channel_message_header *)msg->u.payload;
-		if (hdr->msgtype == CHANNELMSG_UNLOAD_RESPONSE)
-			unloaded = true;
+		for_each_online_cpu(cpu) {
+			page_addr = hv_context.synic_message_page[cpu];
+			msg = (struct hv_message *)page_addr +
+				VMBUS_MESSAGE_SINT;
 
-		vmbus_signal_eom(msg);
+			message_type = READ_ONCE(msg->header.message_type);
+			if (message_type == HVMSG_NONE)
+				continue;
 
-		if (unloaded)
-			break;
+			hdr = (struct vmbus_channel_message_header *)
+				msg->u.payload;
+
+			if (hdr->msgtype == CHANNELMSG_UNLOAD_RESPONSE)
+				complete(&vmbus_connection.unload_event);
+
+			vmbus_signal_eom(msg, message_type);
+		}
+
+		mdelay(10);
+	}
+
+	/*
+	 * We're crashing and already got the UNLOAD_RESPONSE, cleanup all
+	 * maybe-pending messages on all CPUs to be able to receive new
+	 * messages after we reconnect.
+	 */
+	for_each_online_cpu(cpu) {
+		page_addr = hv_context.synic_message_page[cpu];
+		msg = (struct hv_message *)page_addr + VMBUS_MESSAGE_SINT;
+		msg->header.message_type = HVMSG_NONE;
 	}
 }
 
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index e5203e4..718b5c7 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -625,9 +625,21 @@ extern struct vmbus_channel_message_table_entry
 	channel_message_table[CHANNELMSG_COUNT];
 
 /* Free the message slot and signal end-of-message if required */
-static inline void vmbus_signal_eom(struct hv_message *msg)
+static inline void vmbus_signal_eom(struct hv_message *msg, u32 old_msg_type)
 {
-	msg->header.message_type = HVMSG_NONE;
+	/*
+	 * On crash we're reading some other CPU's message page and we need
+	 * to be careful: this other CPU may already have cleared the header
+	 * and the host may already have delivered some other message there.
+	 * In case we blindly write msg->header.message_type we're going
+	 * to lose it. We can still lose a message of the same type but
+	 * we count on the fact that there can only be one
+	 * CHANNELMSG_UNLOAD_RESPONSE and we don't care about other messages
+	 * on crash.
+	 */
+	if (cmpxchg(&msg->header.message_type, old_msg_type,
+		    HVMSG_NONE) != old_msg_type)
+		return;
 
 	/*
 	 * Make sure the write to MessageType (ie set to
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index a29a6c0..952f20f 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -712,7 +712,7 @@ static void hv_process_timer_expiration(struct hv_message *msg, int cpu)
 	if (dev->event_handler)
 		dev->event_handler(dev);
 
-	vmbus_signal_eom(msg);
+	vmbus_signal_eom(msg, HVMSG_TIMER_EXPIRED);
 }
 
 void vmbus_on_msg_dpc(unsigned long data)
@@ -724,8 +724,9 @@ void vmbus_on_msg_dpc(unsigned long data)
 	struct vmbus_channel_message_header *hdr;
 	struct vmbus_channel_message_table_entry *entry;
 	struct onmessage_work_context *ctx;
+	u32 message_type = msg->header.message_type;
 
-	if (msg->header.message_type == HVMSG_NONE)
+	if (message_type == HVMSG_NONE)
 		/* no msg */
 		return;
 
@@ -750,7 +751,7 @@ void vmbus_on_msg_dpc(unsigned long data)
 		entry->message_handler(hdr);
 
 msg_handled:
-	vmbus_signal_eom(msg);
+	vmbus_signal_eom(msg, message_type);
 }
 
 static void vmbus_isr(void)
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/8] Drivers: hv: kvp: fix IP Failover
  2016-04-05 23:57 ` [PATCH 1/8] Drivers: hv: kvp: fix IP Failover K. Y. Srinivasan
                     ` (6 preceding siblings ...)
  2016-04-05 23:57   ` [PATCH 8/8] Drivers: hv: vmbus: handle various crash scenarios K. Y. Srinivasan
@ 2016-04-30 21:04   ` Greg KH
  2016-04-30 21:43     ` KY Srinivasan
  7 siblings, 1 reply; 13+ messages in thread
From: Greg KH @ 2016-04-30 21:04 UTC (permalink / raw)
  To: K. Y. Srinivasan; +Cc: linux-kernel, devel, olaf, apw, vkuznets, jasowang

On Tue, Apr 05, 2016 at 04:57:40PM -0700, K. Y. Srinivasan wrote:
> From: Vitaly Kuznetsov <vkuznets@redhat.com>
> 
> Hyper-V VMs can be replicated to another hosts and there is a feature to
> set different IP for replicas, it is called 'Failover TCP/IP'. When
> such guest starts Hyper-V host sends it KVP_OP_SET_IP_INFO message as soon
> as we finish negotiation procedure. The problem is that it can happen (and
> it actually happens) before userspace daemon connects and we reply with
> HV_E_FAIL to the message. As there are no repetitions we fail to set the
> requested IP.
> 
> Solve the issue by postponing our reply to the negotiation message till
> userspace daemon is connected. We can't wait too long as there is a
> host-side timeout (cca. 75 seconds) and if we fail to reply in this time
> frame the whole KVP service will become inactive. The solution is not
> ideal - if it takes userspace daemon more than 60 seconds to connect
> IP Failover will still fail but I don't see a solution with our current
> separation between kernel and userspace parts.
> 
> Other two modules (VSS and FCOPY) don't require such delay, leave them
> untouched.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
> ---
>  drivers/hv/hv_kvp.c       |   31 +++++++++++++++++++++++++++++++
>  drivers/hv/hyperv_vmbus.h |    5 +++++
>  2 files changed, 36 insertions(+), 0 deletions(-)

This series doesn't apply to my tree :(

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH 1/8] Drivers: hv: kvp: fix IP Failover
  2016-04-30 21:04   ` [PATCH 1/8] Drivers: hv: kvp: fix IP Failover Greg KH
@ 2016-04-30 21:43     ` KY Srinivasan
  2016-04-30 21:54       ` Greg KH
  0 siblings, 1 reply; 13+ messages in thread
From: KY Srinivasan @ 2016-04-30 21:43 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel, devel, olaf, apw, vkuznets, jasowang



> -----Original Message-----
> From: Greg KH [mailto:gregkh@linuxfoundation.org]
> Sent: Saturday, April 30, 2016 2:05 PM
> To: KY Srinivasan <kys@microsoft.com>
> Cc: linux-kernel@vger.kernel.org; devel@linuxdriverproject.org;
> olaf@aepfle.de; apw@canonical.com; vkuznets@redhat.com;
> jasowang@redhat.com
> Subject: Re: [PATCH 1/8] Drivers: hv: kvp: fix IP Failover
> 
> On Tue, Apr 05, 2016 at 04:57:40PM -0700, K. Y. Srinivasan wrote:
> > From: Vitaly Kuznetsov <vkuznets@redhat.com>
> >
> > Hyper-V VMs can be replicated to another hosts and there is a feature to
> > set different IP for replicas, it is called 'Failover TCP/IP'. When
> > such guest starts Hyper-V host sends it KVP_OP_SET_IP_INFO message as
> soon
> > as we finish negotiation procedure. The problem is that it can happen (and
> > it actually happens) before userspace daemon connects and we reply with
> > HV_E_FAIL to the message. As there are no repetitions we fail to set the
> > requested IP.
> >
> > Solve the issue by postponing our reply to the negotiation message till
> > userspace daemon is connected. We can't wait too long as there is a
> > host-side timeout (cca. 75 seconds) and if we fail to reply in this time
> > frame the whole KVP service will become inactive. The solution is not
> > ideal - if it takes userspace daemon more than 60 seconds to connect
> > IP Failover will still fail but I don't see a solution with our current
> > separation between kernel and userspace parts.
> >
> > Other two modules (VSS and FCOPY) don't require such delay, leave them
> > untouched.
> >
> > Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
> > ---
> >  drivers/hv/hv_kvp.c       |   31 +++++++++++++++++++++++++++++++
> >  drivers/hv/hyperv_vmbus.h |    5 +++++
> >  2 files changed, 36 insertions(+), 0 deletions(-)
> 
> This series doesn't apply to my tree :(

Looks like you have already applied most of the patches in this series. I will resend what is not applied.

Thanks,

K. Y

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/8] Drivers: hv: kvp: fix IP Failover
  2016-04-30 21:43     ` KY Srinivasan
@ 2016-04-30 21:54       ` Greg KH
  2016-05-01  0:21         ` KY Srinivasan
  0 siblings, 1 reply; 13+ messages in thread
From: Greg KH @ 2016-04-30 21:54 UTC (permalink / raw)
  To: KY Srinivasan; +Cc: linux-kernel, devel, olaf, apw, vkuznets, jasowang

On Sat, Apr 30, 2016 at 09:43:09PM +0000, KY Srinivasan wrote:
> 
> 
> > -----Original Message-----
> > From: Greg KH [mailto:gregkh@linuxfoundation.org]
> > Sent: Saturday, April 30, 2016 2:05 PM
> > To: KY Srinivasan <kys@microsoft.com>
> > Cc: linux-kernel@vger.kernel.org; devel@linuxdriverproject.org;
> > olaf@aepfle.de; apw@canonical.com; vkuznets@redhat.com;
> > jasowang@redhat.com
> > Subject: Re: [PATCH 1/8] Drivers: hv: kvp: fix IP Failover
> > 
> > On Tue, Apr 05, 2016 at 04:57:40PM -0700, K. Y. Srinivasan wrote:
> > > From: Vitaly Kuznetsov <vkuznets@redhat.com>
> > >
> > > Hyper-V VMs can be replicated to another hosts and there is a feature to
> > > set different IP for replicas, it is called 'Failover TCP/IP'. When
> > > such guest starts Hyper-V host sends it KVP_OP_SET_IP_INFO message as
> > soon
> > > as we finish negotiation procedure. The problem is that it can happen (and
> > > it actually happens) before userspace daemon connects and we reply with
> > > HV_E_FAIL to the message. As there are no repetitions we fail to set the
> > > requested IP.
> > >
> > > Solve the issue by postponing our reply to the negotiation message till
> > > userspace daemon is connected. We can't wait too long as there is a
> > > host-side timeout (cca. 75 seconds) and if we fail to reply in this time
> > > frame the whole KVP service will become inactive. The solution is not
> > > ideal - if it takes userspace daemon more than 60 seconds to connect
> > > IP Failover will still fail but I don't see a solution with our current
> > > separation between kernel and userspace parts.
> > >
> > > Other two modules (VSS and FCOPY) don't require such delay, leave them
> > > untouched.
> > >
> > > Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > > Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
> > > ---
> > >  drivers/hv/hv_kvp.c       |   31 +++++++++++++++++++++++++++++++
> > >  drivers/hv/hyperv_vmbus.h |    5 +++++
> > >  2 files changed, 36 insertions(+), 0 deletions(-)
> > 
> > This series doesn't apply to my tree :(
> 
> Looks like you have already applied most of the patches in this series. I will resend what is not applied.

If this was a "resend", why didn't it show that in the patch
description?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH 1/8] Drivers: hv: kvp: fix IP Failover
  2016-04-30 21:54       ` Greg KH
@ 2016-05-01  0:21         ` KY Srinivasan
  0 siblings, 0 replies; 13+ messages in thread
From: KY Srinivasan @ 2016-05-01  0:21 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel, devel, olaf, apw, vkuznets, jasowang



> -----Original Message-----
> From: Greg KH [mailto:gregkh@linuxfoundation.org]
> Sent: Saturday, April 30, 2016 2:54 PM
> To: KY Srinivasan <kys@microsoft.com>
> Cc: linux-kernel@vger.kernel.org; devel@linuxdriverproject.org;
> olaf@aepfle.de; apw@canonical.com; vkuznets@redhat.com;
> jasowang@redhat.com
> Subject: Re: [PATCH 1/8] Drivers: hv: kvp: fix IP Failover
> 
> On Sat, Apr 30, 2016 at 09:43:09PM +0000, KY Srinivasan wrote:
> >
> >
> > > -----Original Message-----
> > > From: Greg KH [mailto:gregkh@linuxfoundation.org]
> > > Sent: Saturday, April 30, 2016 2:05 PM
> > > To: KY Srinivasan <kys@microsoft.com>
> > > Cc: linux-kernel@vger.kernel.org; devel@linuxdriverproject.org;
> > > olaf@aepfle.de; apw@canonical.com; vkuznets@redhat.com;
> > > jasowang@redhat.com
> > > Subject: Re: [PATCH 1/8] Drivers: hv: kvp: fix IP Failover
> > >
> > > On Tue, Apr 05, 2016 at 04:57:40PM -0700, K. Y. Srinivasan wrote:
> > > > From: Vitaly Kuznetsov <vkuznets@redhat.com>
> > > >
> > > > Hyper-V VMs can be replicated to another hosts and there is a feature
> to
> > > > set different IP for replicas, it is called 'Failover TCP/IP'. When
> > > > such guest starts Hyper-V host sends it KVP_OP_SET_IP_INFO message
> as
> > > soon
> > > > as we finish negotiation procedure. The problem is that it can happen
> (and
> > > > it actually happens) before userspace daemon connects and we reply
> with
> > > > HV_E_FAIL to the message. As there are no repetitions we fail to set
> the
> > > > requested IP.
> > > >
> > > > Solve the issue by postponing our reply to the negotiation message till
> > > > userspace daemon is connected. We can't wait too long as there is a
> > > > host-side timeout (cca. 75 seconds) and if we fail to reply in this time
> > > > frame the whole KVP service will become inactive. The solution is not
> > > > ideal - if it takes userspace daemon more than 60 seconds to connect
> > > > IP Failover will still fail but I don't see a solution with our current
> > > > separation between kernel and userspace parts.
> > > >
> > > > Other two modules (VSS and FCOPY) don't require such delay, leave
> them
> > > > untouched.
> > > >
> > > > Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > > > Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
> > > > ---
> > > >  drivers/hv/hv_kvp.c       |   31 +++++++++++++++++++++++++++++++
> > > >  drivers/hv/hyperv_vmbus.h |    5 +++++
> > > >  2 files changed, 36 insertions(+), 0 deletions(-)
> > >
> > > This series doesn't apply to my tree :(
> >
> > Looks like you have already applied most of the patches in this series. I will
> resend what is not applied.
> 
> If this was a "resend", why didn't it show that in the patch
> description?

My fault; sorry for the confusion. Greg, I am going to resend all the 
patches yet to be committed with the right "resend" tag.

Regards,

K. Y

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2016-05-01  0:21 UTC | newest]

Thread overview: 13+ messages
2016-04-05 23:57 [PATCH 0/8] Drivers: hv: Miscellaneous vmbus and util driver fixes K. Y. Srinivasan
2016-04-05 23:57 ` [PATCH 1/8] Drivers: hv: kvp: fix IP Failover K. Y. Srinivasan
2016-04-05 23:57   ` [PATCH 2/8] Drivers: hv: vmbus: Introduce functions for estimating room in the ring buffer K. Y. Srinivasan
2016-04-05 23:57   ` [PATCH 3/8] Drivers: hv: vmbus: Use READ_ONCE() to read variables that are volatile K. Y. Srinivasan
2016-04-05 23:57   ` [PATCH 4/8] Drivers: hv: vmbus: Use the new virt_xx barrier code K. Y. Srinivasan
2016-04-05 23:57   ` [PATCH 5/8] Drivers: hv: vmbus: Export the vmbus_set_event() API K. Y. Srinivasan
2016-04-05 23:57   ` [PATCH 6/8] Drivers: hv: vmbus: Move some ring buffer functions to hyperv.h K. Y. Srinivasan
2016-04-05 23:57   ` [PATCH 7/8] Drivers: hv: vmbus: Implement APIs to support "in place" consumption of vmbus packets K. Y. Srinivasan
2016-04-05 23:57   ` [PATCH 8/8] Drivers: hv: vmbus: handle various crash scenarios K. Y. Srinivasan
2016-04-30 21:04   ` [PATCH 1/8] Drivers: hv: kvp: fix IP Failover Greg KH
2016-04-30 21:43     ` KY Srinivasan
2016-04-30 21:54       ` Greg KH
2016-05-01  0:21         ` KY Srinivasan
