linux-hyperv.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/4] hv_utils: Add the support of hibernation
@ 2020-01-25 19:53 Dexuan Cui
  2020-01-25 19:53 ` [PATCH v3 1/4] Tools: hv: Reopen the devices if read() or write() returns errors Dexuan Cui
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Dexuan Cui @ 2020-01-25 19:53 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, sashal, linux-hyperv, linux-kernel,
	mikelley, vkuznets
  Cc: Dexuan Cui

Hi,
This is an updated version of the v2 patchset 
(https://lkml.org/lkml/2020/1/13/42).

I addressed all the comments from Michael, and documented the changes
after the "---" line in every patch's changelog.

Please review.

Thanks!

Dexuan Cui (4):
  Tools: hv: Reopen the devices if read() or write() returns errors
  hv_utils: Support host-initiated restart request
  hv_utils: Support host-initiated hibernation request
  hv_utils: Add the support of hibernation

 drivers/hv/hv_fcopy.c      |  54 +++++++++++++-
 drivers/hv/hv_kvp.c        |  43 ++++++++++-
 drivers/hv/hv_snapshot.c   |  55 +++++++++++++-
 drivers/hv/hv_util.c       | 144 ++++++++++++++++++++++++++++++++++++-
 drivers/hv/hyperv_vmbus.h  |   6 ++
 include/linux/hyperv.h     |   2 +
 tools/hv/hv_fcopy_daemon.c |  33 +++++++--
 tools/hv/hv_kvp_daemon.c   |  36 ++++++----
 tools/hv/hv_vss_daemon.c   |  49 ++++++++++---
 9 files changed, 385 insertions(+), 37 deletions(-)

-- 
2.19.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v3 1/4] Tools: hv: Reopen the devices if read() or write() returns errors
  2020-01-25 19:53 [PATCH v3 0/4] hv_utils: Add the support of hibernation Dexuan Cui
@ 2020-01-25 19:53 ` Dexuan Cui
  2020-01-26  4:49   ` Michael Kelley
  2020-01-25 19:53 ` [PATCH v3 2/4] hv_utils: Support host-initiated restart request Dexuan Cui
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Dexuan Cui @ 2020-01-25 19:53 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, sashal, linux-hyperv, linux-kernel,
	mikelley, vkuznets
  Cc: Dexuan Cui

The state machine in the hv_utils driver can run out of order in some
corner cases, e.g. if the kvp daemon doesn't call write() fast enough
due to some reason, kvp_timeout_func() can run first and move the state
to HVUTIL_READY; next, when kvp_on_msg() is called it returns -EINVAL
since kvp_transaction.state is smaller than HVUTIL_USERSPACE_REQ; later,
the daemon's write() gets an error -EINVAL, and the daemon will exit().

We can reproduce the issue by sending a SIGSTOP signal to the daemon, wait
for 1 minute, and send a SIGCONT signal to the daemon: the daemon will
exit() quickly.

We can fix the issue by forcing a reset of the device (which means the
daemon can close() and open() the device again) and doing extra necessary
clean-up.

Signed-off-by: Dexuan Cui <decui@microsoft.com>

---
Changes in v2:
    This is actually a new patch that makes the daemons more robust.

Changes in v3 (I addressed Michael's comments):
    Don't reset target_fd, since that's unnecessary.
    Reset target_fname by: target_fname[0] = '\0';
    Added the missing "fs_frozen = true;" in vss_operate().
    After reopen_vss_fd: if vss_operate(VSS_OP_THAW) can not clear
        fs_frozen due to an error, we just exit().
    Added comments.

 tools/hv/hv_fcopy_daemon.c | 33 +++++++++++++++++++++----
 tools/hv/hv_kvp_daemon.c   | 36 ++++++++++++++++------------
 tools/hv/hv_vss_daemon.c   | 49 +++++++++++++++++++++++++++++---------
 3 files changed, 88 insertions(+), 30 deletions(-)

diff --git a/tools/hv/hv_fcopy_daemon.c b/tools/hv/hv_fcopy_daemon.c
index aea2d91ab364..48cfa032432c 100644
--- a/tools/hv/hv_fcopy_daemon.c
+++ b/tools/hv/hv_fcopy_daemon.c
@@ -80,6 +80,8 @@ static int hv_start_fcopy(struct hv_start_fcopy *smsg)
 
 	error = 0;
 done:
+	if (error)
+		target_fname[0] = '\0';
 	return error;
 }
 
@@ -108,15 +110,29 @@ static int hv_copy_data(struct hv_do_fcopy *cpmsg)
 	return ret;
 }
 
+/*
+ * Reset target_fname to "" in the two below functions for hibernation: if
+ * the fcopy operation is aborted by hibernation, the daemon should remove the
+ * partially-copied file; to achieve this, the hv_utils driver always fakes a
+ * CANCEL_FCOPY message upon suspend, and later when the VM resumes back,
+ * the daemon calls hv_copy_cancel() to remove the file; if a file is copied
+ * successfully before suspend, hv_copy_finished() must reset target_fname to
+ * avoid that the file can be incorrectly removed upon resume, since the faked
+ * CANCEL_FCOPY message is spurious in this case.
+ */
 static int hv_copy_finished(void)
 {
 	close(target_fd);
+	target_fname[0] = '\0';
 	return 0;
 }
 static int hv_copy_cancel(void)
 {
 	close(target_fd);
-	unlink(target_fname);
+	if (strlen(target_fname) > 0) {
+		unlink(target_fname);
+		target_fname[0] = '\0';
+	}
 	return 0;
 
 }
@@ -141,7 +157,7 @@ int main(int argc, char *argv[])
 		struct hv_do_fcopy copy;
 		__u32 kernel_modver;
 	} buffer = { };
-	int in_handshake = 1;
+	int in_handshake;
 
 	static struct option long_options[] = {
 		{"help",	no_argument,	   0,  'h' },
@@ -170,6 +186,10 @@ int main(int argc, char *argv[])
 	openlog("HV_FCOPY", 0, LOG_USER);
 	syslog(LOG_INFO, "starting; pid is:%d", getpid());
 
+reopen_fcopy_fd:
+	/* Remove any possible partially-copied file on error */
+	hv_copy_cancel();
+	in_handshake = 1;
 	fcopy_fd = open("/dev/vmbus/hv_fcopy", O_RDWR);
 
 	if (fcopy_fd < 0) {
@@ -196,7 +216,7 @@ int main(int argc, char *argv[])
 		len = pread(fcopy_fd, &buffer, sizeof(buffer), 0);
 		if (len < 0) {
 			syslog(LOG_ERR, "pread failed: %s", strerror(errno));
-			exit(EXIT_FAILURE);
+			goto reopen_fcopy_fd;
 		}
 
 		if (in_handshake) {
@@ -231,9 +251,14 @@ int main(int argc, char *argv[])
 
 		}
 
+		/*
+		 * pwrite() may return an error due to the faked CANCEL_FCOPY
+		 * message upon hibernation. Ignore the error by resetting the
+		 * dev file, i.e. closing and re-opening it.
+		 */
 		if (pwrite(fcopy_fd, &error, sizeof(int), 0) != sizeof(int)) {
 			syslog(LOG_ERR, "pwrite failed: %s", strerror(errno));
-			exit(EXIT_FAILURE);
+			goto reopen_fcopy_fd;
 		}
 	}
 }
diff --git a/tools/hv/hv_kvp_daemon.c b/tools/hv/hv_kvp_daemon.c
index e9ef4ca6a655..ee9c1bb2293e 100644
--- a/tools/hv/hv_kvp_daemon.c
+++ b/tools/hv/hv_kvp_daemon.c
@@ -76,7 +76,7 @@ enum {
 	DNS
 };
 
-static int in_hand_shake = 1;
+static int in_hand_shake;
 
 static char *os_name = "";
 static char *os_major = "";
@@ -1360,7 +1360,7 @@ void print_usage(char *argv[])
 
 int main(int argc, char *argv[])
 {
-	int kvp_fd, len;
+	int kvp_fd = -1, len;
 	int error;
 	struct pollfd pfd;
 	char    *p;
@@ -1400,14 +1400,6 @@ int main(int argc, char *argv[])
 	openlog("KVP", 0, LOG_USER);
 	syslog(LOG_INFO, "KVP starting; pid is:%d", getpid());
 
-	kvp_fd = open("/dev/vmbus/hv_kvp", O_RDWR | O_CLOEXEC);
-
-	if (kvp_fd < 0) {
-		syslog(LOG_ERR, "open /dev/vmbus/hv_kvp failed; error: %d %s",
-			errno, strerror(errno));
-		exit(EXIT_FAILURE);
-	}
-
 	/*
 	 * Retrieve OS release information.
 	 */
@@ -1423,6 +1415,18 @@ int main(int argc, char *argv[])
 		exit(EXIT_FAILURE);
 	}
 
+reopen_kvp_fd:
+	if (kvp_fd != -1)
+		close(kvp_fd);
+	in_hand_shake = 1;
+	kvp_fd = open("/dev/vmbus/hv_kvp", O_RDWR | O_CLOEXEC);
+
+	if (kvp_fd < 0) {
+		syslog(LOG_ERR, "open /dev/vmbus/hv_kvp failed; error: %d %s",
+		       errno, strerror(errno));
+		exit(EXIT_FAILURE);
+	}
+
 	/*
 	 * Register ourselves with the kernel.
 	 */
@@ -1456,9 +1460,7 @@ int main(int argc, char *argv[])
 		if (len != sizeof(struct hv_kvp_msg)) {
 			syslog(LOG_ERR, "read failed; error:%d %s",
 			       errno, strerror(errno));
-
-			close(kvp_fd);
-			return EXIT_FAILURE;
+			goto reopen_kvp_fd;
 		}
 
 		/*
@@ -1617,13 +1619,17 @@ int main(int argc, char *argv[])
 			break;
 		}
 
-		/* Send the value back to the kernel. */
+		/*
+		 * Send the value back to the kernel. Note: the write() may
+		 * return an error due to hibernation; we can ignore the error
+		 * by resetting the dev file, i.e. closing and re-opening it.
+		 */
 kvp_done:
 		len = write(kvp_fd, hv_msg, sizeof(struct hv_kvp_msg));
 		if (len != sizeof(struct hv_kvp_msg)) {
 			syslog(LOG_ERR, "write failed; error: %d %s", errno,
 			       strerror(errno));
-			exit(EXIT_FAILURE);
+			goto reopen_kvp_fd;
 		}
 	}
 
diff --git a/tools/hv/hv_vss_daemon.c b/tools/hv/hv_vss_daemon.c
index 92902a88f671..dd111870beee 100644
--- a/tools/hv/hv_vss_daemon.c
+++ b/tools/hv/hv_vss_daemon.c
@@ -28,6 +28,8 @@
 #include <stdbool.h>
 #include <dirent.h>
 
+static bool fs_frozen;
+
 /* Don't use syslog() in the function since that can cause write to disk */
 static int vss_do_freeze(char *dir, unsigned int cmd)
 {
@@ -155,18 +157,27 @@ static int vss_operate(int operation)
 			continue;
 		}
 		error |= vss_do_freeze(ent->mnt_dir, cmd);
-		if (error && operation == VSS_OP_FREEZE)
-			goto err;
+		if (operation == VSS_OP_FREEZE) {
+			if (error)
+				goto err;
+			fs_frozen = true;
+		}
 	}
 
 	endmntent(mounts);
 
 	if (root_seen) {
 		error |= vss_do_freeze("/", cmd);
-		if (error && operation == VSS_OP_FREEZE)
-			goto err;
+		if (operation == VSS_OP_FREEZE) {
+			if (error)
+				goto err;
+			fs_frozen = true;
+		}
 	}
 
+	if (operation == VSS_OP_THAW && !error)
+		fs_frozen = false;
+
 	goto out;
 err:
 	save_errno = errno;
@@ -175,6 +186,7 @@ static int vss_operate(int operation)
 		endmntent(mounts);
 	}
 	vss_operate(VSS_OP_THAW);
+	fs_frozen = false;
 	/* Call syslog after we thaw all filesystems */
 	if (ent)
 		syslog(LOG_ERR, "FREEZE of %s failed; error:%d %s",
@@ -196,13 +208,13 @@ void print_usage(char *argv[])
 
 int main(int argc, char *argv[])
 {
-	int vss_fd, len;
+	int vss_fd = -1, len;
 	int error;
 	struct pollfd pfd;
 	int	op;
 	struct hv_vss_msg vss_msg[1];
 	int daemonize = 1, long_index = 0, opt;
-	int in_handshake = 1;
+	int in_handshake;
 	__u32 kernel_modver;
 
 	static struct option long_options[] = {
@@ -232,6 +244,18 @@ int main(int argc, char *argv[])
 	openlog("Hyper-V VSS", 0, LOG_USER);
 	syslog(LOG_INFO, "VSS starting; pid is:%d", getpid());
 
+reopen_vss_fd:
+	if (vss_fd != -1)
+		close(vss_fd);
+	if (fs_frozen) {
+		if (vss_operate(VSS_OP_THAW) || fs_frozen) {
+			syslog(LOG_ERR, "failed to thaw file system: err=%d",
+			       errno);
+			exit(EXIT_FAILURE);
+		}
+	}
+
+	in_handshake = 1;
 	vss_fd = open("/dev/vmbus/hv_vss", O_RDWR);
 	if (vss_fd < 0) {
 		syslog(LOG_ERR, "open /dev/vmbus/hv_vss failed; error: %d %s",
@@ -284,8 +308,7 @@ int main(int argc, char *argv[])
 		if (len != sizeof(struct hv_vss_msg)) {
 			syslog(LOG_ERR, "read failed; error:%d %s",
 			       errno, strerror(errno));
-			close(vss_fd);
-			return EXIT_FAILURE;
+			goto reopen_vss_fd;
 		}
 
 		op = vss_msg->vss_hdr.operation;
@@ -312,14 +335,18 @@ int main(int argc, char *argv[])
 		default:
 			syslog(LOG_ERR, "Illegal op:%d\n", op);
 		}
+
+		/*
+		 * The write() may return an error due to the faked VSS_OP_THAW
+		 * message upon hibernation. Ignore the error by resetting the
+		 * dev file, i.e. closing and re-opening it.
+		 */
 		vss_msg->error = error;
 		len = write(vss_fd, vss_msg, sizeof(struct hv_vss_msg));
 		if (len != sizeof(struct hv_vss_msg)) {
 			syslog(LOG_ERR, "write failed; error: %d %s", errno,
 			       strerror(errno));
-
-			if (op == VSS_OP_FREEZE)
-				vss_operate(VSS_OP_THAW);
+			goto reopen_vss_fd;
 		}
 	}
 
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 2/4] hv_utils: Support host-initiated restart request
  2020-01-25 19:53 [PATCH v3 0/4] hv_utils: Add the support of hibernation Dexuan Cui
  2020-01-25 19:53 ` [PATCH v3 1/4] Tools: hv: Reopen the devices if read() or write() returns errors Dexuan Cui
@ 2020-01-25 19:53 ` Dexuan Cui
  2020-01-26  4:57   ` Michael Kelley
  2020-01-25 19:53 ` [PATCH v3 3/4] hv_utils: Support host-initiated hibernation request Dexuan Cui
  2020-01-25 19:53 ` [PATCH v3 4/4] hv_utils: Add the support of hibernation Dexuan Cui
  3 siblings, 1 reply; 10+ messages in thread
From: Dexuan Cui @ 2020-01-25 19:53 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, sashal, linux-hyperv, linux-kernel,
	mikelley, vkuznets
  Cc: Dexuan Cui

The hv_utils driver currently supports a "shutdown" operation initiated
from the Hyper-V host. Newer versions of Hyper-V also support a "restart"
operation. So add support for the updated protocol version that has
"restart" support, and perform a clean reboot when such a message is
received from Hyper-V.

To test the restart functionality, run this PowerShell command on the
Hyper-V host:

Restart-VM  <vmname>  -Type Reboot

Signed-off-by: Dexuan Cui <decui@microsoft.com>

---
Changes in v2:
   It's the same as v1. 

Changes in v3 (I addressed Michael's comments):
    Used a better version of changelog from  Michael.
    Added a comment about the meaning of shutdown_msg->flags.
    Call schedule_work() at the end of the function for consistency.

 drivers/hv/hv_util.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c
index 766bd8457346..d815bea0fda3 100644
--- a/drivers/hv/hv_util.c
+++ b/drivers/hv/hv_util.c
@@ -24,6 +24,8 @@
 
 #define SD_MAJOR	3
 #define SD_MINOR	0
+#define SD_MINOR_1	1
+#define SD_VERSION_3_1	(SD_MAJOR << 16 | SD_MINOR_1)
 #define SD_VERSION	(SD_MAJOR << 16 | SD_MINOR)
 
 #define SD_MAJOR_1	1
@@ -50,8 +52,9 @@ static int sd_srv_version;
 static int ts_srv_version;
 static int hb_srv_version;
 
-#define SD_VER_COUNT 2
+#define SD_VER_COUNT 3
 static const int sd_versions[] = {
+	SD_VERSION_3_1,
 	SD_VERSION,
 	SD_VERSION_1
 };
@@ -118,17 +121,28 @@ static void perform_shutdown(struct work_struct *dummy)
 	orderly_poweroff(true);
 }
 
+static void perform_restart(struct work_struct *dummy)
+{
+	orderly_reboot();
+}
+
 /*
  * Perform the shutdown operation in a thread context.
  */
 static DECLARE_WORK(shutdown_work, perform_shutdown);
 
+/*
+ * Perform the restart operation in a thread context.
+ */
+static DECLARE_WORK(restart_work, perform_restart);
+
 static void shutdown_onchannelcallback(void *context)
 {
 	struct vmbus_channel *channel = context;
 	u32 recvlen;
 	u64 requestid;
 	bool execute_shutdown = false;
+	bool execute_reboot = false;
 	u8  *shut_txf_buf = util_shutdown.recv_buffer;
 
 	struct shutdown_msg_data *shutdown_msg;
@@ -157,6 +171,12 @@ static void shutdown_onchannelcallback(void *context)
 					sizeof(struct vmbuspipe_hdr) +
 					sizeof(struct icmsg_hdr)];
 
+			/*
+			 * shutdown_msg->flags can be 0 (shut down), 2(reboot),
+			 * or 4(hibernate). It may bitwise-OR 1, which means
+			 * performing the request by force. Linux always tries
+			 * to perform the request by force.
+			 */
 			switch (shutdown_msg->flags) {
 			case 0:
 			case 1:
@@ -166,6 +186,14 @@ static void shutdown_onchannelcallback(void *context)
 				pr_info("Shutdown request received -"
 					    " graceful shutdown initiated\n");
 				break;
+			case 2:
+			case 3:
+				icmsghdrp->status = HV_S_OK;
+				execute_reboot = true;
+
+				pr_info("Restart request received -"
+					    " graceful restart initiated\n");
+				break;
 			default:
 				icmsghdrp->status = HV_E_FAIL;
 				execute_shutdown = false;
@@ -186,6 +214,8 @@ static void shutdown_onchannelcallback(void *context)
 
 	if (execute_shutdown == true)
 		schedule_work(&shutdown_work);
+	if (execute_reboot == true)
+		schedule_work(&restart_work);
 }
 
 /*
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 3/4] hv_utils: Support host-initiated hibernation request
  2020-01-25 19:53 [PATCH v3 0/4] hv_utils: Add the support of hibernation Dexuan Cui
  2020-01-25 19:53 ` [PATCH v3 1/4] Tools: hv: Reopen the devices if read() or write() returns errors Dexuan Cui
  2020-01-25 19:53 ` [PATCH v3 2/4] hv_utils: Support host-initiated restart request Dexuan Cui
@ 2020-01-25 19:53 ` Dexuan Cui
  2020-01-25 19:53 ` [PATCH v3 4/4] hv_utils: Add the support of hibernation Dexuan Cui
  3 siblings, 0 replies; 10+ messages in thread
From: Dexuan Cui @ 2020-01-25 19:53 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, sashal, linux-hyperv, linux-kernel,
	mikelley, vkuznets
  Cc: Dexuan Cui

Update the Shutdown IC version to 3.2, which is required for the host to
send the hibernation request.

The user is expected to create the below udev rule file, which is applied
upon the host-initiated hibernation request:

root@localhost:~# cat /usr/lib/udev/rules.d/40-vm-hibernation.rules
SUBSYSTEM=="vmbus", ACTION=="change", DRIVER=="hv_utils", ENV{EVENT}=="hibernate", RUN+="/usr/bin/systemctl hibernate"

Signed-off-by: Dexuan Cui <decui@microsoft.com>

---
Changes in v2:
    Send the host-initiated hibernation request to the user space via udev.
    (v1 used call_usermodehelper() and "/sbin/hyperv-hibernate".)

Changes in v3 (I addressed Michael's comoments):
    Fixed the order issue in sd_versions[].
    Moved schedule_work() to a later place for consistency.

 drivers/hv/hv_util.c | 52 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 51 insertions(+), 1 deletion(-)

diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c
index d815bea0fda3..e07197dfb4a2 100644
--- a/drivers/hv/hv_util.c
+++ b/drivers/hv/hv_util.c
@@ -25,7 +25,9 @@
 #define SD_MAJOR	3
 #define SD_MINOR	0
 #define SD_MINOR_1	1
+#define SD_MINOR_2	2
 #define SD_VERSION_3_1	(SD_MAJOR << 16 | SD_MINOR_1)
+#define SD_VERSION_3_2	(SD_MAJOR << 16 | SD_MINOR_2)
 #define SD_VERSION	(SD_MAJOR << 16 | SD_MINOR)
 
 #define SD_MAJOR_1	1
@@ -52,8 +54,9 @@ static int sd_srv_version;
 static int ts_srv_version;
 static int hb_srv_version;
 
-#define SD_VER_COUNT 3
+#define SD_VER_COUNT 4
 static const int sd_versions[] = {
+	SD_VERSION_3_2,
 	SD_VERSION_3_1,
 	SD_VERSION,
 	SD_VERSION_1
@@ -78,9 +81,45 @@ static const int fw_versions[] = {
 	UTIL_WS2K8_FW_VERSION
 };
 
+/*
+ * Send the "hibernate" udev event in a thread context.
+ */
+struct hibernate_work_context {
+	struct work_struct work;
+	struct hv_device *dev;
+};
+
+static struct hibernate_work_context hibernate_context;
+static bool hibernation_supported;
+
+static void send_hibernate_uevent(struct work_struct *work)
+{
+	char *uevent_env[2] = { "EVENT=hibernate", NULL };
+	struct hibernate_work_context *ctx;
+
+	ctx = container_of(work, struct hibernate_work_context, work);
+
+	kobject_uevent_env(&ctx->dev->device.kobj, KOBJ_CHANGE, uevent_env);
+
+	pr_info("Sent hibernation uevent\n");
+}
+
+static int hv_shutdown_init(struct hv_util_service *srv)
+{
+	struct vmbus_channel *channel = srv->channel;
+
+	INIT_WORK(&hibernate_context.work, send_hibernate_uevent);
+	hibernate_context.dev = channel->device_obj;
+
+	hibernation_supported = hv_is_hibernation_supported();
+
+	return 0;
+}
+
 static void shutdown_onchannelcallback(void *context);
 static struct hv_util_service util_shutdown = {
 	.util_cb = shutdown_onchannelcallback,
+	.util_init = hv_shutdown_init,
 };
 
 static int hv_timesync_init(struct hv_util_service *srv);
@@ -143,6 +182,7 @@ static void shutdown_onchannelcallback(void *context)
 	u64 requestid;
 	bool execute_shutdown = false;
 	bool execute_reboot = false;
+	bool execute_hibernate = false;
 	u8  *shut_txf_buf = util_shutdown.recv_buffer;
 
 	struct shutdown_msg_data *shutdown_msg;
@@ -194,6 +234,14 @@ static void shutdown_onchannelcallback(void *context)
 				pr_info("Restart request received -"
 					    " graceful restart initiated\n");
 				break;
+			case 4:
+			case 5:
+				pr_info("Hibernation request received\n");
+
+				icmsghdrp->status = hibernation_supported ?
+					HV_S_OK : HV_E_FAIL;
+				execute_hibernate = hibernation_supported;
+				break;
 			default:
 				icmsghdrp->status = HV_E_FAIL;
 				execute_shutdown = false;
@@ -216,6 +264,8 @@ static void shutdown_onchannelcallback(void *context)
 		schedule_work(&shutdown_work);
 	if (execute_reboot == true)
 		schedule_work(&restart_work);
+	if (execute_hibernate == true)
+		schedule_work(&hibernate_context.work);
 }
 
 /*
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 4/4] hv_utils: Add the support of hibernation
  2020-01-25 19:53 [PATCH v3 0/4] hv_utils: Add the support of hibernation Dexuan Cui
                   ` (2 preceding siblings ...)
  2020-01-25 19:53 ` [PATCH v3 3/4] hv_utils: Support host-initiated hibernation request Dexuan Cui
@ 2020-01-25 19:53 ` Dexuan Cui
  2020-01-26  4:58   ` Michael Kelley
  3 siblings, 1 reply; 10+ messages in thread
From: Dexuan Cui @ 2020-01-25 19:53 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, sashal, linux-hyperv, linux-kernel,
	mikelley, vkuznets
  Cc: Dexuan Cui

Add util_pre_suspend() and util_pre_resume() for some hv_utils devices
(e.g. kvp/vss/fcopy), because they need special handling before
util_suspend() calls vmbus_close().

For kvp, all the possible pending work items should be cancelled.

For vss and fcopy, some extra clean-up needs to be done, i.e. fake a
THAW message for hv_vss_daemon and fake a CANCEL_FCOPY message for
hv_fcopy_daemon, otherwise when the VM resums back, the daemons
can end up in an inconsistent state (i.e. the file systems are
frozen but will never be thawed; the file transmitted via fcopy
may not be complete). Note: there is an extra patch for the daemons:
"Tools: hv: Reopen the devices if read() or write() returns errors",
because the hv_utils driver can not guarantee the whole transaction
finishes completely once util_suspend() starts to run (at this time,
all the userspace processes are frozen).

util_probe() disables channel->callback_event to avoid the race with
the channel callback.

Signed-off-by: Dexuan Cui <decui@microsoft.com>

---
Changes in v2:
    Handles fcopy/vss specially to avoid possible inconsistent states.

Changes in v3 (I addressed Michael's comments):
    Removed unneeded blank lines.
    Simplified the error handling logic by allocating memory earlier.
    Added a comment before util_suspend(): when we're in the function,
      all the userspace processes have been frozen.

 drivers/hv/hv_fcopy.c     | 54 +++++++++++++++++++++++++++++++++-
 drivers/hv/hv_kvp.c       | 43 +++++++++++++++++++++++++--
 drivers/hv/hv_snapshot.c  | 55 ++++++++++++++++++++++++++++++++--
 drivers/hv/hv_util.c      | 62 ++++++++++++++++++++++++++++++++++++++-
 drivers/hv/hyperv_vmbus.h |  6 ++++
 include/linux/hyperv.h    |  2 ++
 6 files changed, 216 insertions(+), 6 deletions(-)

diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c
index 08fa4a5de644..bb9ba3f7c794 100644
--- a/drivers/hv/hv_fcopy.c
+++ b/drivers/hv/hv_fcopy.c
@@ -346,9 +346,61 @@ int hv_fcopy_init(struct hv_util_service *srv)
 	return 0;
 }
 
+static void hv_fcopy_cancel_work(void)
+{
+	cancel_delayed_work_sync(&fcopy_timeout_work);
+	cancel_work_sync(&fcopy_send_work);
+}
+
+int hv_fcopy_pre_suspend(void)
+{
+	struct vmbus_channel *channel = fcopy_transaction.recv_channel;
+	struct hv_fcopy_hdr *fcopy_msg;
+
+	/*
+	 * Fake a CANCEL_FCOPY message for the user space daemon in case the
+	 * daemon is in the middle of copying some file. It doesn't matter if
+	 * there is already a message pending to be delivered to the user
+	 * space since we force fcopy_transaction.state to be HVUTIL_READY, so
+	 * the user space daemon's write() will fail with EINVAL (see
+	 * fcopy_on_msg()), and the daemon will reset the device by closing
+	 * and re-opening it.
+	 */
+	fcopy_msg = kzalloc(sizeof(*fcopy_msg), GFP_KERNEL);
+	if (!fcopy_msg)
+		return -ENOMEM;
+
+	tasklet_disable(&channel->callback_event);
+
+	fcopy_msg->operation = CANCEL_FCOPY;
+
+	hv_fcopy_cancel_work();
+
+	/* We don't care about the return value. */
+	hvutil_transport_send(hvt, fcopy_msg, sizeof(*fcopy_msg), NULL);
+
+	kfree(fcopy_msg);
+
+	fcopy_transaction.state = HVUTIL_READY;
+
+	/* tasklet_enable() will be called in hv_fcopy_pre_resume(). */
+	return 0;
+}
+
+int hv_fcopy_pre_resume(void)
+{
+	struct vmbus_channel *channel = fcopy_transaction.recv_channel;
+
+	tasklet_enable(&channel->callback_event);
+
+	return 0;
+}
+
 void hv_fcopy_deinit(void)
 {
 	fcopy_transaction.state = HVUTIL_DEVICE_DYING;
-	cancel_delayed_work_sync(&fcopy_timeout_work);
+
+	hv_fcopy_cancel_work();
+
 	hvutil_transport_destroy(hvt);
 }
diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c
index ae7c028dc5a8..e74b144b8f3d 100644
--- a/drivers/hv/hv_kvp.c
+++ b/drivers/hv/hv_kvp.c
@@ -758,11 +758,50 @@ hv_kvp_init(struct hv_util_service *srv)
 	return 0;
 }
 
-void hv_kvp_deinit(void)
+static void hv_kvp_cancel_work(void)
 {
-	kvp_transaction.state = HVUTIL_DEVICE_DYING;
 	cancel_delayed_work_sync(&kvp_host_handshake_work);
 	cancel_delayed_work_sync(&kvp_timeout_work);
 	cancel_work_sync(&kvp_sendkey_work);
+}
+
+int hv_kvp_pre_suspend(void)
+{
+	struct vmbus_channel *channel = kvp_transaction.recv_channel;
+
+	tasklet_disable(&channel->callback_event);
+
+	/*
+	 * If there is a pending transtion, it's unnecessary to tell the host
+	 * that the transaction will fail, because that is implied when
+	 * util_suspend() calls vmbus_close() later.
+	 */
+	hv_kvp_cancel_work();
+
+	/*
+	 * Forece the state to READY to handle the ICMSGTYPE_NEGOTIATE message
+	 * later. The user space daemon may go out of order and its write()
+	 * may fail with EINVAL: this doesn't matter since the daemon will
+	 * reset the device by closing and re-opening it.
+	 */
+	kvp_transaction.state = HVUTIL_READY;
+	return 0;
+}
+
+int hv_kvp_pre_resume(void)
+{
+	struct vmbus_channel *channel = kvp_transaction.recv_channel;
+
+	tasklet_enable(&channel->callback_event);
+
+	return 0;
+}
+
+void hv_kvp_deinit(void)
+{
+	kvp_transaction.state = HVUTIL_DEVICE_DYING;
+
+	hv_kvp_cancel_work();
+
 	hvutil_transport_destroy(hvt);
 }
diff --git a/drivers/hv/hv_snapshot.c b/drivers/hv/hv_snapshot.c
index 03b6454268b3..1c75b38f0d6d 100644
--- a/drivers/hv/hv_snapshot.c
+++ b/drivers/hv/hv_snapshot.c
@@ -379,10 +379,61 @@ hv_vss_init(struct hv_util_service *srv)
 	return 0;
 }
 
-void hv_vss_deinit(void)
+static void hv_vss_cancel_work(void)
 {
-	vss_transaction.state = HVUTIL_DEVICE_DYING;
 	cancel_delayed_work_sync(&vss_timeout_work);
 	cancel_work_sync(&vss_handle_request_work);
+}
+
+int hv_vss_pre_suspend(void)
+{
+	struct vmbus_channel *channel = vss_transaction.recv_channel;
+	struct hv_vss_msg *vss_msg;
+
+	/*
+	 * Fake a THAW message for the user space daemon in case the daemon
+	 * has frozen the file systems. It doesn't matter if there is already
+	 * a message pending to be delivered to the user space since we force
+	 * vss_transaction.state to be HVUTIL_READY, so the user space daemon's
+	 * write() will fail with EINVAL (see vss_on_msg()), and the daemon
+	 * will reset the device by closing and re-opening it.
+	 */
+	vss_msg = kzalloc(sizeof(*vss_msg), GFP_KERNEL);
+	if (!vss_msg)
+		return -ENOMEM;
+
+	tasklet_disable(&channel->callback_event);
+
+	vss_msg->vss_hdr.operation = VSS_OP_THAW;
+
+	/* Cancel any possible pending work. */
+	hv_vss_cancel_work();
+
+	/* We don't care about the return value. */
+	hvutil_transport_send(hvt, vss_msg, sizeof(*vss_msg), NULL);
+
+	kfree(vss_msg);
+
+	vss_transaction.state = HVUTIL_READY;
+
+	/* tasklet_enable() will be called in hv_vss_pre_resume(). */
+	return 0;
+}
+
+int hv_vss_pre_resume(void)
+{
+	struct vmbus_channel *channel = vss_transaction.recv_channel;
+
+	tasklet_enable(&channel->callback_event);
+
+	return 0;
+}
+
+void hv_vss_deinit(void)
+{
+	vss_transaction.state = HVUTIL_DEVICE_DYING;
+
+	hv_vss_cancel_work();
+
 	hvutil_transport_destroy(hvt);
 }
diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c
index e07197dfb4a2..0fe5ff845d07 100644
--- a/drivers/hv/hv_util.c
+++ b/drivers/hv/hv_util.c
@@ -123,12 +123,14 @@ static struct hv_util_service util_shutdown = {
 };
 
 static int hv_timesync_init(struct hv_util_service *srv);
+static int hv_timesync_pre_suspend(void);
 static void hv_timesync_deinit(void);
 
 static void timesync_onchannelcallback(void *context);
 static struct hv_util_service util_timesynch = {
 	.util_cb = timesync_onchannelcallback,
 	.util_init = hv_timesync_init,
+	.util_pre_suspend = hv_timesync_pre_suspend,
 	.util_deinit = hv_timesync_deinit,
 };
 
@@ -140,18 +142,24 @@ static struct hv_util_service util_heartbeat = {
 static struct hv_util_service util_kvp = {
 	.util_cb = hv_kvp_onchannelcallback,
 	.util_init = hv_kvp_init,
+	.util_pre_suspend = hv_kvp_pre_suspend,
+	.util_pre_resume = hv_kvp_pre_resume,
 	.util_deinit = hv_kvp_deinit,
 };
 
 static struct hv_util_service util_vss = {
 	.util_cb = hv_vss_onchannelcallback,
 	.util_init = hv_vss_init,
+	.util_pre_suspend = hv_vss_pre_suspend,
+	.util_pre_resume = hv_vss_pre_resume,
 	.util_deinit = hv_vss_deinit,
 };
 
 static struct hv_util_service util_fcopy = {
 	.util_cb = hv_fcopy_onchannelcallback,
 	.util_init = hv_fcopy_init,
+	.util_pre_suspend = hv_fcopy_pre_suspend,
+	.util_pre_resume = hv_fcopy_pre_resume,
 	.util_deinit = hv_fcopy_deinit,
 };
 
@@ -521,6 +529,44 @@ static int util_remove(struct hv_device *dev)
 	return 0;
 }
 
+/*
+ * When we're in util_suspend(), all the userspace processes have been frozen
+ * (refer to hibernate() -> freeze_processes()). The userspace is thawed only
+ * after the whole resume procedure, including util_resume(), finishes.
+ */
+static int util_suspend(struct hv_device *dev)
+{
+	struct hv_util_service *srv = hv_get_drvdata(dev);
+	int ret = 0;
+
+	if (srv->util_pre_suspend) {
+		ret = srv->util_pre_suspend();
+		if (ret)
+			return ret;
+	}
+
+	vmbus_close(dev->channel);
+
+	return 0;
+}
+
+static int util_resume(struct hv_device *dev)
+{
+	struct hv_util_service *srv = hv_get_drvdata(dev);
+	int ret = 0;
+
+	if (srv->util_pre_resume) {
+		ret = srv->util_pre_resume();
+		if (ret)
+			return ret;
+	}
+
+	ret = vmbus_open(dev->channel, 4 * HV_HYP_PAGE_SIZE,
+			 4 * HV_HYP_PAGE_SIZE, NULL, 0, srv->util_cb,
+			 dev->channel);
+	return ret;
+}
+
 static const struct hv_vmbus_device_id id_table[] = {
 	/* Shutdown guid */
 	{ HV_SHUTDOWN_GUID,
@@ -557,6 +603,8 @@ static  struct hv_driver util_drv = {
 	.id_table = id_table,
 	.probe =  util_probe,
 	.remove =  util_remove,
+	.suspend = util_suspend,
+	.resume =  util_resume,
 	.driver = {
 		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
 	},
@@ -626,11 +674,23 @@ static int hv_timesync_init(struct hv_util_service *srv)
 	return 0;
 }
 
+static void hv_timesync_cancel_work(void)
+{
+	cancel_work_sync(&adj_time_work);
+}
+
+static int hv_timesync_pre_suspend(void)
+{
+	hv_timesync_cancel_work();
+	return 0;
+}
+
 static void hv_timesync_deinit(void)
 {
 	if (hv_ptp_clock)
 		ptp_clock_unregister(hv_ptp_clock);
-	cancel_work_sync(&adj_time_work);
+
+	hv_timesync_cancel_work();
 }
 
 static int __init init_hyperv_utils(void)
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 20edcfd3b96c..f5fa3b3c9baf 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -352,14 +352,20 @@ void vmbus_on_msg_dpc(unsigned long data);
 
 int hv_kvp_init(struct hv_util_service *srv);
 void hv_kvp_deinit(void);
+int hv_kvp_pre_suspend(void);
+int hv_kvp_pre_resume(void);
 void hv_kvp_onchannelcallback(void *context);
 
 int hv_vss_init(struct hv_util_service *srv);
 void hv_vss_deinit(void);
+int hv_vss_pre_suspend(void);
+int hv_vss_pre_resume(void);
 void hv_vss_onchannelcallback(void *context);
 
 int hv_fcopy_init(struct hv_util_service *srv);
 void hv_fcopy_deinit(void);
+int hv_fcopy_pre_suspend(void);
+int hv_fcopy_pre_resume(void);
 void hv_fcopy_onchannelcallback(void *context);
 void vmbus_initiate_unload(bool crash);
 
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 41c58011431e..692c89ccf5df 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1435,6 +1435,8 @@ struct hv_util_service {
 	void (*util_cb)(void *);
 	int (*util_init)(struct hv_util_service *);
 	void (*util_deinit)(void);
+	int (*util_pre_suspend)(void);
+	int (*util_pre_resume)(void);
 };
 
 struct vmbuspipe_hdr {
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* RE: [PATCH v3 1/4] Tools: hv: Reopen the devices if read() or write() returns errors
  2020-01-25 19:53 ` [PATCH v3 1/4] Tools: hv: Reopen the devices if read() or write() returns errors Dexuan Cui
@ 2020-01-26  4:49   ` Michael Kelley
  2020-01-26  4:59     ` Dexuan Cui
  0 siblings, 1 reply; 10+ messages in thread
From: Michael Kelley @ 2020-01-26  4:49 UTC (permalink / raw)
  To: Dexuan Cui, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal, linux-hyperv, linux-kernel, vkuznets

From: Dexuan Cui <decui@microsoft.com> Sent: Saturday, January 25, 2020 11:54 AM
> 
> The state machine in the hv_utils driver can run out of order in some
> corner cases, e.g. if the kvp daemon doesn't call write() fast enough
> due to some reason, kvp_timeout_func() can run first and move the state
> to HVUTIL_READY; next, when kvp_on_msg() is called it returns -EINVAL
> since kvp_transaction.state is smaller than HVUTIL_USERSPACE_REQ; later,
> the daemon's write() gets an error -EINVAL, and the daemon will exit().
> 
> We can reproduce the issue by sending a SIGSTOP signal to the daemon, wait
> for 1 minute, and send a SIGCONT signal to the daemon: the daemon will
> exit() quickly.
> 
> We can fix the issue by forcing a reset of the device (which means the
> daemon can close() and open() the device again) and doing extra necessary
> clean-up.
> 
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> 
> ---
> Changes in v2:
>     This is actually a new patch that makes the daemons more robust.
> 
> Changes in v3 (I addressed Michael's comments):
>     Don't reset target_fd, since that's unnecessary.
>     Reset target_fname by: target_fname[0] = '\0';
>     Added the missing "fs_frozen = true;" in vss_operate().
>     After reopen_vss_fd: if vss_operate(VSS_OP_THAW) can not clear
>         fs_frozen due to an error, we just exit().
>     Added comments.
> 
>  tools/hv/hv_fcopy_daemon.c | 33 +++++++++++++++++++++----
>  tools/hv/hv_kvp_daemon.c   | 36 ++++++++++++++++------------
>  tools/hv/hv_vss_daemon.c   | 49 +++++++++++++++++++++++++++++---------
>  3 files changed, 88 insertions(+), 30 deletions(-)
> 
> diff --git a/tools/hv/hv_fcopy_daemon.c b/tools/hv/hv_fcopy_daemon.c
> index aea2d91ab364..48cfa032432c 100644
> --- a/tools/hv/hv_fcopy_daemon.c
> +++ b/tools/hv/hv_fcopy_daemon.c
> @@ -80,6 +80,8 @@ static int hv_start_fcopy(struct hv_start_fcopy *smsg)
> 
>  	error = 0;
>  done:
> +	if (error)
> +		target_fname[0] = '\0';
>  	return error;
>  }
> 
> @@ -108,15 +110,29 @@ static int hv_copy_data(struct hv_do_fcopy *cpmsg)
>  	return ret;
>  }
> 
> +/*
> + * Reset target_fname to "" in the two below functions for hibernation: if
> + * the fcopy operation is aborted by hibernation, the daemon should remove the
> + * partially-copied file; to achieve this, the hv_utils driver always fakes a
> + * CANCEL_FCOPY message upon suspend, and later when the VM resumes back,
> + * the daemon calls hv_copy_cancel() to remove the file; if a file is copied
> + * successfully before suspend, hv_copy_finished() must reset target_fname to
> + * avoid that the file can be incorrectly removed upon resume, since the faked
> + * CANCEL_FCOPY message is spurious in this case.
> + */
>  static int hv_copy_finished(void)
>  {
>  	close(target_fd);
> +	target_fname[0] = '\0';
>  	return 0;
>  }
>  static int hv_copy_cancel(void)
>  {
>  	close(target_fd);
> -	unlink(target_fname);
> +	if (strlen(target_fname) > 0) {
> +		unlink(target_fname);
> +		target_fname[0] = '\0';
> +	}
>  	return 0;
> 
>  }
> @@ -141,7 +157,7 @@ int main(int argc, char *argv[])
>  		struct hv_do_fcopy copy;
>  		__u32 kernel_modver;
>  	} buffer = { };
> -	int in_handshake = 1;
> +	int in_handshake;
> 
>  	static struct option long_options[] = {
>  		{"help",	no_argument,	   0,  'h' },
> @@ -170,6 +186,10 @@ int main(int argc, char *argv[])
>  	openlog("HV_FCOPY", 0, LOG_USER);
>  	syslog(LOG_INFO, "starting; pid is:%d", getpid());
> 
> +reopen_fcopy_fd:
> +	/* Remove any possible partially-copied file on error */
> +	hv_copy_cancel();

Since you have removed the calls to close(fcopy_fd) after a
pread() or pwrite() failure that were in v2 of the patch, I was
expecting to see

	if (fcopy_fd != -1)
		close(fcopy_fd)

here, like you've done with the kvp and vss code.  And
remember to initialize fcopy_fd to -1. :-)

Michael

> +	in_handshake = 1;
>  	fcopy_fd = open("/dev/vmbus/hv_fcopy", O_RDWR);
> 
>  	if (fcopy_fd < 0) {
> @@ -196,7 +216,7 @@ int main(int argc, char *argv[])
>  		len = pread(fcopy_fd, &buffer, sizeof(buffer), 0);
>  		if (len < 0) {
>  			syslog(LOG_ERR, "pread failed: %s", strerror(errno));
> -			exit(EXIT_FAILURE);
> +			goto reopen_fcopy_fd;
>  		}
> 
>  		if (in_handshake) {
> @@ -231,9 +251,14 @@ int main(int argc, char *argv[])
> 
>  		}
> 
> +		/*
> +		 * pwrite() may return an error due to the faked CANCEL_FCOPY
> +		 * message upon hibernation. Ignore the error by resetting the
> +		 * dev file, i.e. closing and re-opening it.
> +		 */
>  		if (pwrite(fcopy_fd, &error, sizeof(int), 0) != sizeof(int)) {
>  			syslog(LOG_ERR, "pwrite failed: %s", strerror(errno));
> -			exit(EXIT_FAILURE);
> +			goto reopen_fcopy_fd;
>  		}
>  	}
>  }

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH v3 2/4] hv_utils: Support host-initiated restart request
  2020-01-25 19:53 ` [PATCH v3 2/4] hv_utils: Support host-initiated restart request Dexuan Cui
@ 2020-01-26  4:57   ` Michael Kelley
  2020-01-26  5:05     ` Dexuan Cui
  0 siblings, 1 reply; 10+ messages in thread
From: Michael Kelley @ 2020-01-26  4:57 UTC (permalink / raw)
  To: Dexuan Cui, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal, linux-hyperv, linux-kernel, vkuznets

From: Dexuan Cui <decui@microsoft.com> Sent: Saturday, January 25, 2020 11:54 AM
>
> The hv_utils driver currently supports a "shutdown" operation initiated
> from the Hyper-V host. Newer versions of Hyper-V also support a "restart"
> operation. So add support for the updated protocol version that has
> "restart" support, and perform a clean reboot when such a message is
> received from Hyper-V.
> 
> To test the restart functionality, run this PowerShell command on the
> Hyper-V host:
> 
> Restart-VM  <vmname>  -Type Reboot
> 
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> 
> ---
> Changes in v2:
>    It's the same as v1.
> 
> Changes in v3 (I addressed Michael's comments):
>     Used a better version of changelog from  Michael.
>     Added a comment about the meaning of shutdown_msg->flags.
>     Call schedule_work() at the end of the function for consistency.
> 
>  drivers/hv/hv_util.c | 32 +++++++++++++++++++++++++++++++-
>  1 file changed, 31 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c
> index 766bd8457346..d815bea0fda3 100644
> --- a/drivers/hv/hv_util.c
> +++ b/drivers/hv/hv_util.c
> @@ -24,6 +24,8 @@
> 
>  #define SD_MAJOR	3
>  #define SD_MINOR	0
> +#define SD_MINOR_1	1
> +#define SD_VERSION_3_1	(SD_MAJOR << 16 | SD_MINOR_1)
>  #define SD_VERSION	(SD_MAJOR << 16 | SD_MINOR)
> 
>  #define SD_MAJOR_1	1
> @@ -50,8 +52,9 @@ static int sd_srv_version;
>  static int ts_srv_version;
>  static int hb_srv_version;
> 
> -#define SD_VER_COUNT 2
> +#define SD_VER_COUNT 3
>  static const int sd_versions[] = {
> +	SD_VERSION_3_1,
>  	SD_VERSION,
>  	SD_VERSION_1
>  };
> @@ -118,17 +121,28 @@ static void perform_shutdown(struct work_struct *dummy)
>  	orderly_poweroff(true);
>  }
> 
> +static void perform_restart(struct work_struct *dummy)
> +{
> +	orderly_reboot();
> +}
> +
>  /*
>   * Perform the shutdown operation in a thread context.
>   */
>  static DECLARE_WORK(shutdown_work, perform_shutdown);
> 
> +/*
> + * Perform the restart operation in a thread context.
> + */
> +static DECLARE_WORK(restart_work, perform_restart);
> +
>  static void shutdown_onchannelcallback(void *context)
>  {
>  	struct vmbus_channel *channel = context;
>  	u32 recvlen;
>  	u64 requestid;
>  	bool execute_shutdown = false;
> +	bool execute_reboot = false;
>  	u8  *shut_txf_buf = util_shutdown.recv_buffer;
> 
>  	struct shutdown_msg_data *shutdown_msg;
> @@ -157,6 +171,12 @@ static void shutdown_onchannelcallback(void *context)
>  					sizeof(struct vmbuspipe_hdr) +
>  					sizeof(struct icmsg_hdr)];
> 
> +			/*
> +			 * shutdown_msg->flags can be 0 (shut down), 2(reboot),
> +			 * or 4(hibernate). It may bitwise-OR 1, which means
> +			 * performing the request by force. Linux always tries
> +			 * to perform the request by force.
> +			 */
>  			switch (shutdown_msg->flags) {
>  			case 0:
>  			case 1:
> @@ -166,6 +186,14 @@ static void shutdown_onchannelcallback(void *context)
>  				pr_info("Shutdown request received -"
>  					    " graceful shutdown initiated\n");
>  				break;
> +			case 2:
> +			case 3:
> +				icmsghdrp->status = HV_S_OK;
> +				execute_reboot = true;
> +
> +				pr_info("Restart request received -"
> +					    " graceful restart initiated\n");
> +				break;
>  			default:
>  				icmsghdrp->status = HV_E_FAIL;
>  				execute_shutdown = false;
> @@ -186,6 +214,8 @@ static void shutdown_onchannelcallback(void *context)
> 
>  	if (execute_shutdown == true)
>  		schedule_work(&shutdown_work);
> +	if (execute_reboot == true)
> +		schedule_work(&restart_work);

This works, and responds to my comment.  But FWIW, a more compact
approach would be to drop the boolean execute_shutdown, execute_restart,
etc. local variables and have just this local variable:

	struct work_struct *work_to_do = NULL;

In the "case" branches do:

	work_to_do = &shutdown_work;

or
	work_to_do = &restart_work;

Then at the bottom of the function, just do:

	if (work_to_do)
		schedule_work(work_to_do);

Patch 3 of this series would then be a little simpler as well.

>  }
> 
>  /*
> --
> 2.19.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH v3 4/4] hv_utils: Add the support of hibernation
  2020-01-25 19:53 ` [PATCH v3 4/4] hv_utils: Add the support of hibernation Dexuan Cui
@ 2020-01-26  4:58   ` Michael Kelley
  0 siblings, 0 replies; 10+ messages in thread
From: Michael Kelley @ 2020-01-26  4:58 UTC (permalink / raw)
  To: Dexuan Cui, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal, linux-hyperv, linux-kernel, vkuznets

From: Dexuan Cui <decui@microsoft.com> Sent: Saturday, January 25, 2020 11:54 AM
> 
> Add util_pre_suspend() and util_pre_resume() for some hv_utils devices
> (e.g. kvp/vss/fcopy), because they need special handling before
> util_suspend() calls vmbus_close().
> 
> For kvp, all the possible pending work items should be cancelled.
> 
> For vss and fcopy, some extra clean-up needs to be done, i.e. fake a
> THAW message for hv_vss_daemon and fake a CANCEL_FCOPY message for
> hv_fcopy_daemon, otherwise when the VM resums back, the daemons
> can end up in an inconsistent state (i.e. the file systems are
> frozen but will never be thawed; the file transmitted via fcopy
> may not be complete). Note: there is an extra patch for the daemons:
> "Tools: hv: Reopen the devices if read() or write() returns errors",
> because the hv_utils driver can not guarantee the whole transaction
> finishes completely once util_suspend() starts to run (at this time,
> all the userspace processes are frozen).
> 
> util_probe() disables channel->callback_event to avoid the race with
> the channel callback.
> 
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> 
> ---
> Changes in v2:
>     Handles fcopy/vss specially to avoid possible inconsistent states.
> 
> Changes in v3 (I addressed Michael's comments):
>     Removed unneeded blank lines.
>     Simplified the error handling logic by allocating memory earlier.
>     Added a comment before util_suspend(): when we're in the function,
>       all the userspace processes have been frozen.
> 
>  drivers/hv/hv_fcopy.c     | 54 +++++++++++++++++++++++++++++++++-
>  drivers/hv/hv_kvp.c       | 43 +++++++++++++++++++++++++--
>  drivers/hv/hv_snapshot.c  | 55 ++++++++++++++++++++++++++++++++--
>  drivers/hv/hv_util.c      | 62 ++++++++++++++++++++++++++++++++++++++-
>  drivers/hv/hyperv_vmbus.h |  6 ++++
>  include/linux/hyperv.h    |  2 ++
>  6 files changed, 216 insertions(+), 6 deletions(-)
> 

Reviewed-by: Michael Kelley <mikelley@microsoft.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH v3 1/4] Tools: hv: Reopen the devices if read() or write() returns errors
  2020-01-26  4:49   ` Michael Kelley
@ 2020-01-26  4:59     ` Dexuan Cui
  0 siblings, 0 replies; 10+ messages in thread
From: Dexuan Cui @ 2020-01-26  4:59 UTC (permalink / raw)
  To: Michael Kelley, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal, linux-hyperv, linux-kernel, vkuznets

> From: Michael Kelley <mikelley@microsoft.com>
> Sent: Saturday, January 25, 2020 8:49 PM
> > +reopen_fcopy_fd:
> > +	/* Remove any possible partially-copied file on error */
> > +	hv_copy_cancel();
> 
> Since you have removed the calls to close(fcopy_fd) after a
> pread() or pwrite() failure that were in v2 of the patch, I was
> expecting to see
> 
> 	if (fcopy_fd != -1)
> 		close(fcopy_fd)

I missed this... Thanks for catching it!
 
> here, like you've done with the kvp and vss code.  And
> remember to initialize fcopy_fd to -1. :-)
> 
> Michael

Will do it in v4.

Thanks,
-- Dexuan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH v3 2/4] hv_utils: Support host-initiated restart request
  2020-01-26  4:57   ` Michael Kelley
@ 2020-01-26  5:05     ` Dexuan Cui
  0 siblings, 0 replies; 10+ messages in thread
From: Dexuan Cui @ 2020-01-26  5:05 UTC (permalink / raw)
  To: Michael Kelley, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal, linux-hyperv, linux-kernel, vkuznets

> From: Michael Kelley <mikelley@microsoft.com>
> Sent: Saturday, January 25, 2020 8:57 PM
> > @@ -186,6 +214,8 @@ static void shutdown_onchannelcallback(void
> *context)
> >
> >  	if (execute_shutdown == true)
> >  		schedule_work(&shutdown_work);
> > +	if (execute_reboot == true)
> > +		schedule_work(&restart_work);
> 
> This works, and responds to my comment.  But FWIW, a more compact
> approach would be to drop the boolean execute_shutdown, execute_restart,
> etc. local variables and have just this local variable:
> 
> 	struct work_struct *work_to_do = NULL;
> 
> In the "case" branches do:
> 
> 	work_to_do = &shutdown_work;
> 
> or
> 	work_to_do = &restart_work;
> 
> Then at the bottom of the function, just do:
> 
> 	if (work_to_do)
> 		schedule_work(work_to_do);
> 
> Patch 3 of this series would then be a little simpler as well.

Good idea! I'll do this in v4.

Thanks,
-- Dexuan

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-01-26  5:05 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-25 19:53 [PATCH v3 0/4] hv_utils: Add the support of hibernation Dexuan Cui
2020-01-25 19:53 ` [PATCH v3 1/4] Tools: hv: Reopen the devices if read() or write() returns errors Dexuan Cui
2020-01-26  4:49   ` Michael Kelley
2020-01-26  4:59     ` Dexuan Cui
2020-01-25 19:53 ` [PATCH v3 2/4] hv_utils: Support host-initiated restart request Dexuan Cui
2020-01-26  4:57   ` Michael Kelley
2020-01-26  5:05     ` Dexuan Cui
2020-01-25 19:53 ` [PATCH v3 3/4] hv_utils: Support host-initiated hibernation request Dexuan Cui
2020-01-25 19:53 ` [PATCH v3 4/4] hv_utils: Add the support of hibernation Dexuan Cui
2020-01-26  4:58   ` Michael Kelley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).