Linux-NVDIMM Archive on lore.kernel.org
From: Vaibhav Jain <vaibhav@linux.ibm.com>
To: Michael Ellerman <michaele@au1.ibm.com>, Ira Weiny <ira.weiny@intel.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	Steven Rostedt <rostedt@goodmis.org>,
	linux-nvdimm@lists.01.org
Subject: Re: [RESEND PATCH v7 3/5] powerpc/papr_scm: Fetch nvdimm health information from PHYP
Date: Thu, 21 May 2020 22:29:38 +0530
Message-ID: <87imgpw7et.fsf@linux.ibm.com> (raw)
In-Reply-To: <87k115gy0i.fsf@mpe.ellerman.id.au>

Michael Ellerman <michaele@au1.ibm.com> writes:

> Vaibhav Jain <vaibhav@linux.ibm.com> writes:
>> Thanks for reviewing this patch, Ira. My responses below:
>> Ira Weiny <ira.weiny@intel.com> writes:
>>> On Wed, May 20, 2020 at 12:30:56AM +0530, Vaibhav Jain wrote:
>>>> Implement support for fetching nvdimm health information via the
>>>> H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair
>>>> of 64-bit big-endian integers, the bitwise AND of which is then stored
>>>> in 'struct papr_scm_priv' and subsequently partially exposed to
>>>> user-space via the newly introduced dimm-specific attribute
>>>> 'papr/flags'. Since the hcall is costly, the health information is
>>>> cached and only re-queried 60s after the previous successful hcall.
> ...
>>>> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
>>>> index f35592423380..142636e1a59f 100644
>>>> --- a/arch/powerpc/platforms/pseries/papr_scm.c
>>>> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
>>>> @@ -39,6 +78,15 @@ struct papr_scm_priv {
>>>>  	struct resource res;
>>>>  	struct nd_region *region;
>>>>  	struct nd_interleave_set nd_set;
>>>> +
>>>> +	/* Protect dimm health data from concurrent read/writes */
>>>> +	struct mutex health_mutex;
>>>> +
>>>> +	/* Last time the health information of the dimm was updated */
>>>> +	unsigned long lasthealth_jiffies;
>>>> +
>>>> +	/* Health information for the dimm */
>>>> +	u64 health_bitmap;
>>>
>>> I wonder if this should be typed big endian as you mention that it is in the
>>> commit message?
>> This was discussed in an earlier review of the patch series at
>> https://lore.kernel.org/linux-nvdimm/878sjetcis.fsf@mpe.ellerman.id.au
>>
>> Even though the health bitmap is returned in big-endian format (for
>> example, the value 0xC000000000000000 indicates bits 0 and 1 set), its
>> value is never used directly. Instead, we only test for specific bits
>> being set in the register.
>
> This has already caused a lot of confusion, so let me try and clear it
> up. I will probably fail :)
>
> The value is not big endian.
>
> It's returned in a GPR (a register), from the hypervisor. The ordering
> of bytes in a register is not dependent on what endian we're executing
> in.
>
> It's true that the hypervisor will have been running big endian, and
> when it returns to us we will now be running little endian. But the
> value is unchanged, it was 0xC000000000000000 in the GPR while the HV was
> running and it's still 0xC000000000000000 when we return to Linux. You
> can see this in mambo, see below for an example.
>
>
> _However_, the specification of the bits in the bitmap value uses MSB 0
> ordering, as is traditional for IBM documentation. That means the most
> significant bit, aka. the left most bit, is called "bit 0".
>
> See: https://en.wikipedia.org/wiki/Bit_numbering#MSB_0_bit_numbering
>
> That is the opposite numbering from what most people use, and in
> particular what most code in Linux uses, which is that bit 0 is the
> least significant bit.
>
> Which is where the confusion comes in. It's not that the bytes are
> returned in a different order, it's that the bits are numbered
> differently in the IBM documentation.
>
> The way to fix this kind of thing is to read the docs, and convert all
> the bits into correct numbering (LSB=0), and then throw away the docs ;)
>
> cheers
Thanks a lot for clarifying this, mpe, and for the detailed explanation.

I have removed the term Big-Endian from v8 patch description to avoid
any further confusion.

>
>
>
> In mambo we can set a breakpoint just before the kernel enters skiboot,
> towards the end of __opal_call. The kernel is running LE and skiboot
> runs BE.
>
>   systemsim-p9 [~/skiboot/skiboot/external/mambo] b 0xc0000000000c1744
>   breakpoint set at [0:0:0]: 0xc0000000000c1744 (0x00000000000C1744) Enc:0x2402004C : hrfid
>
> Then run:
>
>   systemsim-p9 [~/skiboot/skiboot/external/mambo] c
>   [0:0:0]: 0xC0000000000C1744 (0x00000000000C1744) Enc:0x2402004C : hrfid
>   INFO: 121671618: (121671618): ** Execution stopped: user (tcl),  **
>   121671618: ** finished running 121671618 instructions **
>
> And we stop there, on an hrfid that we haven't executed yet.
> We can print r0, to see the OPAL token:
>
>   systemsim-p9 [~/skiboot/skiboot/external/mambo] p r0
>   0x0000000000000019
>
> ie. we're calling OPAL_CONSOLE_WRITE_BUFFER_SPACE (25).
>
> And we can print the MSR:
>
>   systemsim-p9 [~/skiboot/skiboot/external/mambo] p msr
>   0x9000000002001033
>   
>                      64-bit mode (SF): 0x1 [64-bit mode]
>                 Hypervisor State (HV): 0x1
>                Vector Available (VEC): 0x1
>   Machine Check Interrupt Enable (ME): 0x1
>             Instruction Relocate (IR): 0x1
>                    Data Relocate (DR): 0x1
>            Recoverable Interrupt (RI): 0x1
>               Little-Endian Mode (LE): 0x1 [little-endian]
>
> ie. we're little endian.
>
> We then step one instruction:
>
>   systemsim-p9 [~/skiboot/skiboot/external/mambo] s
>   [0:0:0]: 0x0000000030002BF0 (0x0000000030002BF0) Enc:0x7D9FFAA6 : mfspr   r12,PIR
>
> Now we're in skiboot. Print the MSR again:
>
>   systemsim-p9 [~/skiboot/skiboot/external/mambo] p msr
>   0x9000000002001002
>   
>                      64-bit mode (SF): 0x1 [64-bit mode]
>                 Hypervisor State (HV): 0x1
>                Vector Available (VEC): 0x1
>   Machine Check Interrupt Enable (ME): 0x1
>            Recoverable Interrupt (RI): 0x1
>
> We're big endian.
> Print r0:
>
>   systemsim-p9 [~/skiboot/skiboot/external/mambo] p r0
>   0x0000000000000019
>
> r0 is unchanged!
Got it. Thanks again.

-- 
Cheers
~ Vaibhav
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

Thread overview: 20+ messages
2020-05-19 19:00 [RESEND PATCH v7 0/5] powerpc/papr_scm: Add support for reporting nvdimm health Vaibhav Jain
2020-05-19 19:00 ` [RESEND PATCH v7 1/5] powerpc: Document details on H_SCM_HEALTH hcall Vaibhav Jain
2020-05-19 19:00 ` [RESEND PATCH v7 2/5] seq_buf: Export seq_buf_printf() to external modules Vaibhav Jain
2020-05-20 17:01   ` Christoph Hellwig
2020-05-19 19:00 ` [RESEND PATCH v7 3/5] powerpc/papr_scm: Fetch nvdimm health information from PHYP Vaibhav Jain
2020-05-20 14:54   ` Ira Weiny
2020-05-20 17:15     ` Vaibhav Jain
2020-05-21 14:31       ` Michael Ellerman
2020-05-21 16:59         ` Vaibhav Jain [this message]
2020-05-21 23:32       ` Ira Weiny
2020-05-19 19:00 ` [RESEND PATCH v7 4/5] ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods Vaibhav Jain
2020-05-20  7:09   ` Aneesh Kumar K.V
2020-05-20 10:19     ` Vaibhav Jain
2020-05-20 15:32   ` Ira Weiny
2020-05-20 18:37     ` Vaibhav Jain
2020-05-21  6:48     ` Michael Ellerman
2020-05-22  7:38       ` [RESEND PATCH v7 4/5] ndctl/papr_scm, uapi: " Vaibhav Jain
2020-05-25 12:00         ` Vaibhav Jain
2020-05-26 12:14           ` Michael Ellerman
2020-05-19 19:00 ` [RESEND PATCH v7 5/5] powerpc/papr_scm: Implement support for PAPR_SCM_PDSM_HEALTH Vaibhav Jain
