linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/3] powerpc/papr_scm: Workaround for failure of drc bind after kexec
@ 2019-06-29 16:06 Vaibhav Jain
  2019-06-29 16:06 ` [PATCH v4 1/3] powerpc/pseries: Update SCM hcall op-codes in hvcall.h Vaibhav Jain
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Vaibhav Jain @ 2019-06-29 16:06 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Aneesh Kumar K . V, Oliver O'Halloran, Vaibhav Jain,
	Laurent Dufour, David Gibson

Presently an error is returned in response to hcall H_SCM_BIND_MEM when a
new kernel boots on lpar via kexec. This prevents papr_scm from registering
drc memory regions with nvdimm. The error reported is of the form below:

"papr_scm ibm,persistent-memory:ibm,pmemory@44100002: bind err: -68"

On investigation it was revealed that phyp returns this error as previous
kernel did not completely release bindings for drc scm-memory blocks and
hence phyp rejected request for re-binding these block to lpar with error
H_OVERLAP. Also support for a new H_SCM_UNBIND_ALL is recently added which
is better suited for releasing all the bound scm-memory block from an lpar.

So leveraging new hcall H_SCM_UNBIND_ALL, we can workaround H_OVERLAP issue
during kexec by forcing an unbind of all drm scm-memory blocks and issuing
H_SCM_BIND_MEM to re-bind the drc scm-memory blocks to lpar. This sequence
will also be needed when a new kernel boot on lpar after previous kernel
panicked and it never got an opportunity to call H_SCM_UNBIND_MEM/ALL.

Hence this patch-set implements following changes to papr_scm module:

* Update hvcall.h to include opcodes for new hcall H_SCM_UNBIND_ALL.

* Update it to use H_SCM_UNBIND_ALL instead of H_SCM_UNBIND_MEM

* In case hcall H_SCM_BIND_MEM fails with error H_OVERLAP, force
  H_SCM_UNBIND_ALL and retry the bind operation again.

With the patch-set applied re-bind of drc scm-memory to lpar succeeds after
a kexec to new kernel as illustrated below:

# Old kernel
$ sudo ndctl list -R
[
  {
    "dev":"region0",
    <snip>
    ....
  }
]
# kexec to new kernel
$ sudo kexec --initrd=... vmlinux
...
...
I'm in purgatory
...
papr_scm ibm,persistent-memory:ibm,pmemory@44100002: Un-binding and retrying
...
# New kernel
$ sudo ndctl list -R
[
  {
    "dev":"region0",
    <snip>
    ....
  }
]

---
Change-log:
v4:
* Updated the patch description of first patch in the series as suggested
  by Mpe.

v3:
* Fixed a build warning reported by kbuild test robot.
* Updated the hcall opcode from latest papr-scm specification.
* Fixed a minor code comment & patch description as pointed out by Oliver.

v2:
* Addressed review comments from Oliver on v1 patchset.

Vaibhav Jain (3):
  powerpc/pseries: Update SCM hcall op-codes in hvcall.h
  powerpc/papr_scm: Update drc_pmem_unbind() to use H_SCM_UNBIND_ALL
  powerpc/papr_scm: Force a scm-unbind if initial scm-bind fails

 arch/powerpc/include/asm/hvcall.h         | 11 ++++--
 arch/powerpc/platforms/pseries/papr_scm.c | 44 ++++++++++++++++++-----
 2 files changed, 44 insertions(+), 11 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v4 1/3] powerpc/pseries: Update SCM hcall op-codes in hvcall.h
  2019-06-29 16:06 [PATCH v4 0/3] powerpc/papr_scm: Workaround for failure of drc bind after kexec Vaibhav Jain
@ 2019-06-29 16:06 ` Vaibhav Jain
  2019-07-24 13:32   ` Michael Ellerman
  2019-06-29 16:06 ` [PATCH v4 2/3] powerpc/papr_scm: Update drc_pmem_unbind() to use H_SCM_UNBIND_ALL Vaibhav Jain
  2019-06-29 16:06 ` [PATCH v4 3/3] powerpc/papr_scm: Force a scm-unbind if initial scm-bind fails Vaibhav Jain
  2 siblings, 1 reply; 5+ messages in thread
From: Vaibhav Jain @ 2019-06-29 16:06 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Aneesh Kumar K . V, Oliver O'Halloran, Vaibhav Jain,
	Laurent Dufour, David Gibson

Update the hvcalls.h to include op-codes for new hcalls introduce to
manage SCM memory. Also update existing hcall definitions to reflect
current papr specification for SCM.

The removed hcall op-codes H_SCM_MEM_QUERY, H_SCM_BLOCK_CLEAR were
transient proposals and there support was never implemented by
Power-VM nor they were used anywhere in Linux kernel. Hence we don't
expect anyone to be impacted by this change.

Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Change-log:

v4:
* Updated the patch description mentioned current status of removed
  hcall opcodes. [Mpe]

v3:
* Added updated opcode for H_SCM_HEALTH [Oliver]

v2:
* None new patch in this series.
---
 arch/powerpc/include/asm/hvcall.h | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index 463c63a9fcf1..11112023e327 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -302,9 +302,14 @@
 #define H_SCM_UNBIND_MEM        0x3F0
 #define H_SCM_QUERY_BLOCK_MEM_BINDING 0x3F4
 #define H_SCM_QUERY_LOGICAL_MEM_BINDING 0x3F8
-#define H_SCM_MEM_QUERY	        0x3FC
-#define H_SCM_BLOCK_CLEAR       0x400
-#define MAX_HCALL_OPCODE	H_SCM_BLOCK_CLEAR
+#define H_SCM_UNBIND_ALL        0x3FC
+#define H_SCM_HEALTH            0x400
+#define H_SCM_PERFORMANCE_STATS 0x418
+#define MAX_HCALL_OPCODE	H_SCM_PERFORMANCE_STATS
+
+/* Scope args for H_SCM_UNBIND_ALL */
+#define H_UNBIND_SCOPE_ALL (0x1)
+#define H_UNBIND_SCOPE_DRC (0x2)
 
 /* H_VIOCTL functions */
 #define H_GET_VIOA_DUMP_SIZE	0x01
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v4 2/3] powerpc/papr_scm: Update drc_pmem_unbind() to use H_SCM_UNBIND_ALL
  2019-06-29 16:06 [PATCH v4 0/3] powerpc/papr_scm: Workaround for failure of drc bind after kexec Vaibhav Jain
  2019-06-29 16:06 ` [PATCH v4 1/3] powerpc/pseries: Update SCM hcall op-codes in hvcall.h Vaibhav Jain
@ 2019-06-29 16:06 ` Vaibhav Jain
  2019-06-29 16:06 ` [PATCH v4 3/3] powerpc/papr_scm: Force a scm-unbind if initial scm-bind fails Vaibhav Jain
  2 siblings, 0 replies; 5+ messages in thread
From: Vaibhav Jain @ 2019-06-29 16:06 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Aneesh Kumar K . V, Oliver O'Halloran, Vaibhav Jain,
	Laurent Dufour, David Gibson

The new hcall named H_SCM_UNBIND_ALL has been introduce that can
unbind all or specific scm memory assigned to an lpar. This is
more efficient than using H_SCM_UNBIND_MEM as currently we don't
support partial unbind of scm memory.

Hence this patch proposes following changes to drc_pmem_unbind():

    * Update drc_pmem_unbind() to replace hcall H_SCM_UNBIND_MEM to
      H_SCM_UNBIND_ALL.

    * Update drc_pmem_unbind() to handles cases when PHYP asks the guest
      kernel to wait for specific amount of time before retrying the
      hcall via the 'LONG_BUSY' return value.

    * Ensure appropriate error code is returned back from the function
      in case of an error.

Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Change-log:

v4:
* None. Re-spinning the patchset.

v3:
* Fixed a build warning reported by kbuild-robot.
* Updated patch description to put emphasis on 'scm memory' instead of
  'scm drc memory blocks' as for phyp there is a stark difference
  between how drc are managed for scm memory v/s regular memory. [Oliver]

v2:
* Added a dev_dbg when unbind operation succeeds [Oliver]
* Changed type of variable 'rc' to int64_t [Oliver]
* Removed the code that was logging a warning in case bind operation
  takes >1-seconds [Oliver]
* Spinned off changes to hvcall.h as a separate patch. [Oliver]
---
 arch/powerpc/platforms/pseries/papr_scm.c | 29 +++++++++++++++++------
 1 file changed, 22 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index 96c53b23e58f..c01a03fd3ee7 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -11,6 +11,7 @@
 #include <linux/sched.h>
 #include <linux/libnvdimm.h>
 #include <linux/platform_device.h>
+#include <linux/delay.h>
 
 #include <asm/plpar_wrappers.h>
 
@@ -77,22 +78,36 @@ static int drc_pmem_bind(struct papr_scm_priv *p)
 static int drc_pmem_unbind(struct papr_scm_priv *p)
 {
 	unsigned long ret[PLPAR_HCALL_BUFSIZE];
-	uint64_t rc, token;
+	uint64_t token = 0;
+	int64_t rc;
 
-	token = 0;
+	dev_dbg(&p->pdev->dev, "unbind drc %x\n", p->drc_index);
 
-	/* NB: unbind has the same retry requirements mentioned above */
+	/* NB: unbind has the same retry requirements as drc_pmem_bind() */
 	do {
-		rc = plpar_hcall(H_SCM_UNBIND_MEM, ret, p->drc_index,
-				p->bound_addr, p->blocks, token);
+
+		/* Unbind of all SCM resources associated with drcIndex */
+		rc = plpar_hcall(H_SCM_UNBIND_ALL, ret, H_UNBIND_SCOPE_DRC,
+				 p->drc_index, token);
 		token = ret[0];
-		cond_resched();
+
+		/* Check if we are stalled for some time */
+		if (H_IS_LONG_BUSY(rc)) {
+			msleep(get_longbusy_msecs(rc));
+			rc = H_BUSY;
+		} else if (rc == H_BUSY) {
+			cond_resched();
+		}
+
 	} while (rc == H_BUSY);
 
 	if (rc)
 		dev_err(&p->pdev->dev, "unbind error: %lld\n", rc);
+	else
+		dev_dbg(&p->pdev->dev, "unbind drc %x complete\n",
+			p->drc_index);
 
-	return !!rc;
+	return rc == H_SUCCESS ? 0 : -ENXIO;
 }
 
 static int papr_scm_meta_get(struct papr_scm_priv *p,
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v4 3/3] powerpc/papr_scm: Force a scm-unbind if initial scm-bind fails
  2019-06-29 16:06 [PATCH v4 0/3] powerpc/papr_scm: Workaround for failure of drc bind after kexec Vaibhav Jain
  2019-06-29 16:06 ` [PATCH v4 1/3] powerpc/pseries: Update SCM hcall op-codes in hvcall.h Vaibhav Jain
  2019-06-29 16:06 ` [PATCH v4 2/3] powerpc/papr_scm: Update drc_pmem_unbind() to use H_SCM_UNBIND_ALL Vaibhav Jain
@ 2019-06-29 16:06 ` Vaibhav Jain
  2 siblings, 0 replies; 5+ messages in thread
From: Vaibhav Jain @ 2019-06-29 16:06 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Aneesh Kumar K . V, Oliver O'Halloran, Vaibhav Jain,
	Laurent Dufour, David Gibson

In some cases initial bind of scm memory for an lpar can fail if
previously it wasn't released using a scm-unbind hcall. This situation
can arise due to panic of the previous kernel or forced lpar
fadump. In such cases the H_SCM_BIND_MEM return a H_OVERLAP error.

To mitigate such cases the patch updates papr_scm_probe() to force a
call to drc_pmem_unbind() in case the initial bind of scm memory fails
with EBUSY error. In case scm-bind operation again fails after the
forced scm-unbind then we follow the existing error path. We also
update drc_pmem_bind() to handle the H_OVERLAP error returned by phyp
and indicate it as a EBUSY error back to the caller.

Suggested-by: "Oliver O'Halloran" <oohall@gmail.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
---
Change-log:

v4:
* None. Re-spinning the patchset.

v3:
* Minor update to a code comment. [Oliver]

v2:
* Moved the retry code from drc_pmem_bind() to papr_scm_probe()
  [Oliver]
* Changed the type of variable 'rc' in drc_pmem_bind() to
  int64_t. [Oliver]
---
 arch/powerpc/platforms/pseries/papr_scm.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index c01a03fd3ee7..7c5e10c063a0 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -43,8 +43,9 @@ struct papr_scm_priv {
 static int drc_pmem_bind(struct papr_scm_priv *p)
 {
 	unsigned long ret[PLPAR_HCALL_BUFSIZE];
-	uint64_t rc, token;
 	uint64_t saved = 0;
+	uint64_t token;
+	int64_t rc;
 
 	/*
 	 * When the hypervisor cannot map all the requested memory in a single
@@ -64,6 +65,10 @@ static int drc_pmem_bind(struct papr_scm_priv *p)
 	} while (rc == H_BUSY);
 
 	if (rc) {
+		/* H_OVERLAP needs a separate error path */
+		if (rc == H_OVERLAP)
+			return -EBUSY;
+
 		dev_err(&p->pdev->dev, "bind err: %lld\n", rc);
 		return -ENXIO;
 	}
@@ -331,6 +336,14 @@ static int papr_scm_probe(struct platform_device *pdev)
 
 	/* request the hypervisor to bind this region to somewhere in memory */
 	rc = drc_pmem_bind(p);
+
+	/* If phyp says drc memory still bound then force unbound and retry */
+	if (rc == -EBUSY) {
+		dev_warn(&pdev->dev, "Retrying bind after unbinding\n");
+		drc_pmem_unbind(p);
+		rc = drc_pmem_bind(p);
+	}
+
 	if (rc)
 		goto err;
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v4 1/3] powerpc/pseries: Update SCM hcall op-codes in hvcall.h
  2019-06-29 16:06 ` [PATCH v4 1/3] powerpc/pseries: Update SCM hcall op-codes in hvcall.h Vaibhav Jain
@ 2019-07-24 13:32   ` Michael Ellerman
  0 siblings, 0 replies; 5+ messages in thread
From: Michael Ellerman @ 2019-07-24 13:32 UTC (permalink / raw)
  To: Vaibhav Jain, linuxppc-dev
  Cc: Aneesh Kumar K . V, Laurent Dufour, Oliver O'Halloran,
	David Gibson, Vaibhav Jain

On Sat, 2019-06-29 at 16:06:08 UTC, Vaibhav Jain wrote:
> Update the hvcalls.h to include op-codes for new hcalls introduce to
> manage SCM memory. Also update existing hcall definitions to reflect
> current papr specification for SCM.
> 
> The removed hcall op-codes H_SCM_MEM_QUERY, H_SCM_BLOCK_CLEAR were
> transient proposals and there support was never implemented by
> Power-VM nor they were used anywhere in Linux kernel. Hence we don't
> expect anyone to be impacted by this change.
> 
> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>

Series applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/6d140e7569db89a1b596c1c2d1c2293d5c594432

cheers

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-07-24 13:36 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-29 16:06 [PATCH v4 0/3] powerpc/papr_scm: Workaround for failure of drc bind after kexec Vaibhav Jain
2019-06-29 16:06 ` [PATCH v4 1/3] powerpc/pseries: Update SCM hcall op-codes in hvcall.h Vaibhav Jain
2019-07-24 13:32   ` Michael Ellerman
2019-06-29 16:06 ` [PATCH v4 2/3] powerpc/papr_scm: Update drc_pmem_unbind() to use H_SCM_UNBIND_ALL Vaibhav Jain
2019-06-29 16:06 ` [PATCH v4 3/3] powerpc/papr_scm: Force a scm-unbind if initial scm-bind fails Vaibhav Jain

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).