All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Rafael J. Wysocki" <rjw@rjwysocki.net>
To: Dan Williams <dan.j.williams@intel.com>
Cc: linux-nvdimm@lists.01.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Doug Ledford <dledford@redhat.com>,
	Jason Gunthorpe <jgg@mellanox.com>, Pavel Machek <pavel@ucw.cz>,
	Len Brown <len.brown@intel.com>,
	linux-acpi@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 11/12] PM, libnvdimm: Add 'mem-quiet' state and callback for firmware activation
Date: Thu, 09 Jul 2020 16:57:12 +0200	[thread overview]
Message-ID: <23449996.3uVv1d17cZ@kreacher> (raw)
In-Reply-To: <159408717289.2385045.14094866475168644020.stgit@dwillia2-desk3.amr.corp.intel.com>

On Tuesday, July 7, 2020 3:59:32 AM CEST Dan Williams wrote:
> The runtime firmware activation capability of Intel NVDIMM devices
> requires memory transactions to be disabled for 100s of microseconds.
> This timeout is large enough to cause in-flight DMA to fail and other
> application detectable timeouts. Arrange for firmware activation to be
> executed while the system is "quiesced", all processes and device-DMA
> frozen.
> 
> It is already required that invoking device ->freeze() callbacks is
> sufficient to cease DMA. A device that continues memory writes outside
> of user-direction violates expectations of the PM core to be to
> establish a coherent hibernation image.
> 
> That said, RDMA devices are an example of a device that access memory
> outside of user process direction. RDMA drivers also typically assume
> the system they are operating in will never be hibernated. A solution
> for RDMA collisions with firmware activation is outside the scope of
> this change and may need to rely on being able to survive the platform
> imposed memory controller quiesce period.

Thanks for following my suggestion to use the hibernation infrastructure
rather than the suspend one, but I think it would be better to go a bit
further with that.

Namely, after thinking about this a bit more I have come to the conclusion
that what is needed is an ability to execute a function, inside of the
kernel, in a "quiet" environment in which memory updates are unlikely.

While the hibernation infrastructure as is can be used for that, kind of, IMO
it would be cleaner to introduce a helper for that, like in the (untested)
patch below, so if the "quiet execution environment" is needed, whoever
needs it may simply pass a function to hibernate_quiet_exec() and provide
whatever user-space I/F is suitable on top of that.

Please let me know what you think.

Cheers!

---
 include/linux/suspend.h  |    6 ++
 kernel/power/hibernate.c |   97 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 103 insertions(+)

Index: linux-pm/kernel/power/hibernate.c
===================================================================
--- linux-pm.orig/kernel/power/hibernate.c
+++ linux-pm/kernel/power/hibernate.c
@@ -795,6 +795,103 @@ int hibernate(void)
 	return error;
 }
 
+/**
+ * hibernate_quiet_exec - Execute a function with all devices frozen.
+ * @func: Function to execute.
+ * @data: Data pointer to pass to @func.
+ *
+ * Return the @func return value or an error code if it cannot be executed.
+ */
+int hibernate_quiet_exec(int (*func)(void *data), void *data)
+{
+	int error, nr_calls = 0;
+
+	lock_system_sleep();
+
+	if (!hibernate_acquire()) {
+		error = -EBUSY;
+		goto unlock;
+	}
+
+	pm_prepare_console();
+
+	error = __pm_notifier_call_chain(PM_HIBERNATION_PREPARE, -1, &nr_calls);
+	if (error) {
+		nr_calls--;
+		goto exit;
+	}
+
+	error = freeze_processes();
+	if (error)
+		goto exit;
+
+	lock_device_hotplug();
+
+	pm_suspend_clear_flags();
+
+	error = platform_begin(true);
+	if (error)
+		goto thaw;
+
+	error = freeze_kernel_threads();
+	if (error)
+		goto thaw;
+
+	error = dpm_prepare(PMSG_FREEZE);
+	if (error)
+		goto dpm_complete;
+
+	suspend_console();
+
+	error = dpm_suspend(PMSG_FREEZE);
+	if (error)
+		goto dpm_resume;
+
+	error = dpm_suspend_end(PMSG_FREEZE);
+	if (error)
+		goto dpm_resume;
+
+	error = platform_pre_snapshot(true);
+	if (error)
+		goto skip;
+
+	error = func(data);
+
+skip:
+	platform_finish(true);
+
+	dpm_resume_start(PMSG_THAW);
+
+dpm_resume:
+	dpm_resume(PMSG_THAW);
+
+	resume_console();
+
+dpm_complete:
+	dpm_complete(PMSG_THAW);
+
+	thaw_kernel_threads();
+
+thaw:
+	platform_end(true);
+
+	unlock_device_hotplug();
+
+	thaw_processes();
+
+exit:
+	__pm_notifier_call_chain(PM_POST_HIBERNATION, nr_calls, NULL);
+
+	pm_restore_console();
+
+	hibernate_release();
+
+unlock:
+	unlock_system_sleep();
+
+	return error;
+}
+EXPORT_SYMBOL_GPL(hibernate_quiet_exec);
 
 /**
  * software_resume - Resume from a saved hibernation image.
Index: linux-pm/include/linux/suspend.h
===================================================================
--- linux-pm.orig/include/linux/suspend.h
+++ linux-pm/include/linux/suspend.h
@@ -453,6 +453,8 @@ extern bool hibernation_available(void);
 asmlinkage int swsusp_save(void);
 extern struct pbe *restore_pblist;
 int pfn_is_nosave(unsigned long pfn);
+
+int hibernate_quiet_exec(int (*func)(void *data), void *data);
 #else /* CONFIG_HIBERNATION */
 static inline void register_nosave_region(unsigned long b, unsigned long e) {}
 static inline void register_nosave_region_late(unsigned long b, unsigned long e) {}
@@ -464,6 +466,10 @@ static inline void hibernation_set_ops(c
 static inline int hibernate(void) { return -ENOSYS; }
 static inline bool system_entering_hibernation(void) { return false; }
 static inline bool hibernation_available(void) { return false; }
+
+static inline hibernate_quiet_exec(int (*func)(void *data), void *data) {
+	return -ENOTSUPP;
+}
 #endif /* CONFIG_HIBERNATION */
 
 #ifdef CONFIG_HIBERNATION_SNAPSHOT_DEV


_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

WARNING: multiple messages have this Message-ID (diff)
From: "Rafael J. Wysocki" <rjw@rjwysocki.net>
To: Dan Williams <dan.j.williams@intel.com>
Cc: linux-nvdimm@lists.01.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Doug Ledford <dledford@redhat.com>,
	Jason Gunthorpe <jgg@mellanox.com>,
	Dave Jiang <dave.jiang@intel.com>,
	Ira Weiny <ira.weiny@intel.com>, Pavel Machek <pavel@ucw.cz>,
	Len Brown <len.brown@intel.com>,
	linux-acpi@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 11/12] PM, libnvdimm: Add 'mem-quiet' state and callback for firmware activation
Date: Thu, 09 Jul 2020 16:57:12 +0200	[thread overview]
Message-ID: <23449996.3uVv1d17cZ@kreacher> (raw)
In-Reply-To: <159408717289.2385045.14094866475168644020.stgit@dwillia2-desk3.amr.corp.intel.com>

On Tuesday, July 7, 2020 3:59:32 AM CEST Dan Williams wrote:
> The runtime firmware activation capability of Intel NVDIMM devices
> requires memory transactions to be disabled for 100s of microseconds.
> This timeout is large enough to cause in-flight DMA to fail and other
> application detectable timeouts. Arrange for firmware activation to be
> executed while the system is "quiesced", all processes and device-DMA
> frozen.
> 
> It is already required that invoking device ->freeze() callbacks is
> sufficient to cease DMA. A device that continues memory writes outside
> of user-direction violates expectations of the PM core to be to
> establish a coherent hibernation image.
> 
> That said, RDMA devices are an example of a device that access memory
> outside of user process direction. RDMA drivers also typically assume
> the system they are operating in will never be hibernated. A solution
> for RDMA collisions with firmware activation is outside the scope of
> this change and may need to rely on being able to survive the platform
> imposed memory controller quiesce period.

Thanks for following my suggestion to use the hibernation infrastructure
rather than the suspend one, but I think it would be better to go a bit
further with that.

Namely, after thinking about this a bit more I have come to the conclusion
that what is needed is an ability to execute a function, inside of the
kernel, in a "quiet" environment in which memory updates are unlikely.

While the hibernation infrastructure as is can be used for that, kind of, IMO
it would be cleaner to introduce a helper for that, like in the (untested)
patch below, so if the "quiet execution environment" is needed, whoever
needs it may simply pass a function to hibernate_quiet_exec() and provide
whatever user-space I/F is suitable on top of that.

Please let me know what you think.

Cheers!

---
 include/linux/suspend.h  |    6 ++
 kernel/power/hibernate.c |   97 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 103 insertions(+)

Index: linux-pm/kernel/power/hibernate.c
===================================================================
--- linux-pm.orig/kernel/power/hibernate.c
+++ linux-pm/kernel/power/hibernate.c
@@ -795,6 +795,103 @@ int hibernate(void)
 	return error;
 }
 
+/**
+ * hibernate_quiet_exec - Execute a function with all devices frozen.
+ * @func: Function to execute.
+ * @data: Data pointer to pass to @func.
+ *
+ * Return the @func return value or an error code if it cannot be executed.
+ */
+int hibernate_quiet_exec(int (*func)(void *data), void *data)
+{
+	int error, nr_calls = 0;
+
+	lock_system_sleep();
+
+	if (!hibernate_acquire()) {
+		error = -EBUSY;
+		goto unlock;
+	}
+
+	pm_prepare_console();
+
+	error = __pm_notifier_call_chain(PM_HIBERNATION_PREPARE, -1, &nr_calls);
+	if (error) {
+		nr_calls--;
+		goto exit;
+	}
+
+	error = freeze_processes();
+	if (error)
+		goto exit;
+
+	lock_device_hotplug();
+
+	pm_suspend_clear_flags();
+
+	error = platform_begin(true);
+	if (error)
+		goto thaw;
+
+	error = freeze_kernel_threads();
+	if (error)
+		goto thaw;
+
+	error = dpm_prepare(PMSG_FREEZE);
+	if (error)
+		goto dpm_complete;
+
+	suspend_console();
+
+	error = dpm_suspend(PMSG_FREEZE);
+	if (error)
+		goto dpm_resume;
+
+	error = dpm_suspend_end(PMSG_FREEZE);
+	if (error)
+		goto dpm_resume;
+
+	error = platform_pre_snapshot(true);
+	if (error)
+		goto skip;
+
+	error = func(data);
+
+skip:
+	platform_finish(true);
+
+	dpm_resume_start(PMSG_THAW);
+
+dpm_resume:
+	dpm_resume(PMSG_THAW);
+
+	resume_console();
+
+dpm_complete:
+	dpm_complete(PMSG_THAW);
+
+	thaw_kernel_threads();
+
+thaw:
+	platform_end(true);
+
+	unlock_device_hotplug();
+
+	thaw_processes();
+
+exit:
+	__pm_notifier_call_chain(PM_POST_HIBERNATION, nr_calls, NULL);
+
+	pm_restore_console();
+
+	hibernate_release();
+
+unlock:
+	unlock_system_sleep();
+
+	return error;
+}
+EXPORT_SYMBOL_GPL(hibernate_quiet_exec);
 
 /**
  * software_resume - Resume from a saved hibernation image.
Index: linux-pm/include/linux/suspend.h
===================================================================
--- linux-pm.orig/include/linux/suspend.h
+++ linux-pm/include/linux/suspend.h
@@ -453,6 +453,8 @@ extern bool hibernation_available(void);
 asmlinkage int swsusp_save(void);
 extern struct pbe *restore_pblist;
 int pfn_is_nosave(unsigned long pfn);
+
+int hibernate_quiet_exec(int (*func)(void *data), void *data);
 #else /* CONFIG_HIBERNATION */
 static inline void register_nosave_region(unsigned long b, unsigned long e) {}
 static inline void register_nosave_region_late(unsigned long b, unsigned long e) {}
@@ -464,6 +466,10 @@ static inline void hibernation_set_ops(c
 static inline int hibernate(void) { return -ENOSYS; }
 static inline bool system_entering_hibernation(void) { return false; }
 static inline bool hibernation_available(void) { return false; }
+
+static inline hibernate_quiet_exec(int (*func)(void *data), void *data) {
+	return -ENOTSUPP;
+}
 #endif /* CONFIG_HIBERNATION */
 
 #ifdef CONFIG_HIBERNATION_SNAPSHOT_DEV




  parent reply	other threads:[~2020-07-09 14:57 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-07  1:58 [PATCH v2 00/12] ACPI/NVDIMM: Runtime Firmware Activation Dan Williams
2020-07-07  1:58 ` Dan Williams
2020-07-07  1:58 ` [PATCH v2 01/12] libnvdimm: Validate command family indices Dan Williams
2020-07-07  1:58   ` Dan Williams
2020-07-10 14:02   ` Sasha Levin
2020-07-10 14:02     ` Sasha Levin
2020-07-07  1:58 ` [PATCH v2 02/12] ACPI: NFIT: Move bus_dsm_mask out of generic nvdimm_bus_descriptor Dan Williams
2020-07-07  1:58   ` Dan Williams
2020-07-07  1:58 ` [PATCH v2 03/12] ACPI: NFIT: Define runtime firmware activation commands Dan Williams
2020-07-07  1:58   ` Dan Williams
2020-07-07  1:58 ` [PATCH v2 04/12] tools/testing/nvdimm: Cleanup dimm index passing Dan Williams
2020-07-07  1:58   ` Dan Williams
2020-07-07  1:59 ` [PATCH v2 05/12] tools/testing/nvdimm: Add command debug messages Dan Williams
2020-07-07  1:59   ` Dan Williams
2020-07-07  1:59 ` [PATCH v2 06/12] tools/testing/nvdimm: Prepare nfit_ctl_test() for ND_CMD_CALL emulation Dan Williams
2020-07-07  1:59   ` Dan Williams
2020-07-07  1:59 ` [PATCH v2 07/12] tools/testing/nvdimm: Emulate firmware activation commands Dan Williams
2020-07-07  1:59   ` Dan Williams
2020-07-07  1:59 ` [PATCH v2 08/12] driver-core: Introduce DEVICE_ATTR_ADMIN_{RO,RW} Dan Williams
2020-07-07  1:59   ` Dan Williams
2020-07-07  1:59 ` [PATCH v2 09/12] libnvdimm: Convert to DEVICE_ATTR_ADMIN_RO() Dan Williams
2020-07-07  1:59   ` Dan Williams
2020-07-07  1:59 ` [PATCH v2 10/12] libnvdimm: Add runtime firmware activation sysfs interface Dan Williams
2020-07-07  1:59   ` Dan Williams
2020-07-07  1:59 ` [PATCH v2 11/12] PM, libnvdimm: Add 'mem-quiet' state and callback for firmware activation Dan Williams
2020-07-07  1:59   ` Dan Williams
2020-07-07 16:56   ` Pavel Machek
2020-07-07 16:56     ` Pavel Machek
2020-07-09 14:57   ` Rafael J. Wysocki [this message]
2020-07-09 14:57     ` Rafael J. Wysocki
2020-07-09 19:04     ` Dan Williams
2020-07-09 19:04       ` Dan Williams
2020-07-13 14:03       ` Rafael J. Wysocki
2020-07-13 14:03         ` Rafael J. Wysocki
2020-07-09 15:00   ` Christoph Hellwig
2020-07-09 15:00     ` Christoph Hellwig
2020-07-09 15:38     ` Jason Gunthorpe
2020-07-09 15:38       ` Jason Gunthorpe
2020-07-09 15:43       ` Rafael J. Wysocki
2020-07-09 15:43         ` Rafael J. Wysocki
2020-07-09 16:10       ` Dan Williams
2020-07-09 16:10         ` Dan Williams
2020-07-09 16:34         ` Jason Gunthorpe
2020-07-09 16:34           ` Jason Gunthorpe
2020-07-09 15:56     ` Dan Williams
2020-07-09 15:56       ` Dan Williams
2020-07-07  1:59 ` [PATCH v2 12/12] ACPI: NFIT: Add runtime firmware activate support Dan Williams
2020-07-07  1:59   ` Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=23449996.3uVv1d17cZ@kreacher \
    --to=rjw@rjwysocki.net \
    --cc=dan.j.williams@intel.com \
    --cc=dledford@redhat.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jgg@mellanox.com \
    --cc=len.brown@intel.com \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=pavel@ucw.cz \
    --cc=rafael@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.