linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH v2] powerpc/vphn: fix endian issue in NUMA device node code
@ 2014-10-03  9:13 Greg Kurz
  2014-10-06 22:36 ` Nishanth Aravamudan
  2014-10-07  9:28 ` [v2] " Michael Ellerman
  0 siblings, 2 replies; 5+ messages in thread
From: Greg Kurz @ 2014-10-03  9:13 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nishanth Aravamudan

The associativity domain numbers are obtained from the hypervisor through
registers and written into memory by the guest: the packed array passed to
vphn_unpack_associativity() is then native-endian, unlike what was assumed
in the following commit:

commit b08a2a12e44eaec5024b2b969f4fcb98169d1ca3
Author: Alistair Popple <alistair@popple.id.au>
Date:   Wed Aug 7 02:01:44 2013 +1000

    powerpc: Make NUMA device node code endian safe

If a CPU home node changes, the topology gets filled with
bogus values. This leads to severe performance breakdowns.

This patch does two things:
- extract values from the packed array with shifts, in order to be endian
  neutral
- convert the resulting values to be32 as expected

Suggested-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---

Changes in v2:
- removed the leftover __be16 *field declaration
- removed the leftover be16_to_cpup() call
- updated the comment on the magic formula

Thanks again Nish... the two leftovers probably explain why PowerVM wasn't
happy with that patch. :P

--
Greg

 arch/powerpc/mm/numa.c |   64 +++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 52 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index b835bf0..4547f91 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1369,38 +1369,78 @@ static int update_cpu_associativity_changes_mask(void)
 #define VPHN_ASSOC_BUFSIZE (6*sizeof(u64)/sizeof(u32) + 1)
 
 /*
+ * The associativity values are either 16-bit (VPHN_FIELD_MSB set) or 32-bit
+ * (data or VPHN_FIELD_UNUSED), so we need to parse the packed array in
+ * 16-bit chunks. Let's do that with bit shifts to stay endian neutral.
+ *
+ *    --- 16-bit chunks -->
+ *  _________________________
+ *  |  0  |  1  |  2  |  3  |   packed[0]
+ *  -------------------------
+ *  _________________________
+ *  |  4  |  5  |  6  |  7  |   packed[1]
+ *  -------------------------
+ *            ...
+ *  _________________________
+ *  | 20  | 21  | 22  | 23  |   packed[5]
+ *  -------------------------
+ *       48    32    16     0
+ *    <------ bits -------- 
+ *
+ * We need 48-bit shift for chunks 0,4,8,12,16,20
+ *         32-bit shift for chunks 1,5,9,13,17,21
+ *         16-bit shift for chunks 2,6,10,14,18,22
+ *             no shift for chunks 3,7,11,15,19,23
+ *
+ * The complement of the 2 lo bits of the chunk index, multiplied by 16,
+ * gives the shift. The chunk index divided by 4 gives the index in packed[].
+ *
+ * For example:
+ * chunk  0 = packed[0/4]  >> (~0b00000 & 0b11) * 16 = bits 48..63 in packed[0]
+ * chunk  5 = packed[5/4]  >> (~0b00101 & 0b11) * 16 = bits 32..47 in packed[1]
+ * chunk 22 = packed[22/4] >> (~0b10110 & 0b11) * 16 = bits 16..31 in packed[5]
+ */
+static inline u16 read_vphn_chunk(const long *packed, unsigned int i)
+{
+	return packed[i >> 2] >> ((~i & 3) << 4);
+}
+
+/*
  * Convert the associativity domain numbers returned from the hypervisor
  * to the sequence they would appear in the ibm,associativity property.
  */
 static int vphn_unpack_associativity(const long *packed, __be32 *unpacked)
 {
-	int i, nr_assoc_doms = 0;
-	const __be16 *field = (const __be16 *) packed;
+	unsigned int i, j, nr_assoc_doms = 0;
 
 #define VPHN_FIELD_UNUSED	(0xffff)
 #define VPHN_FIELD_MSB		(0x8000)
 #define VPHN_FIELD_MASK		(~VPHN_FIELD_MSB)
 
-	for (i = 1; i < VPHN_ASSOC_BUFSIZE; i++) {
-		if (be16_to_cpup(field) == VPHN_FIELD_UNUSED) {
+	for (i = 1, j = 0; i < VPHN_ASSOC_BUFSIZE; i++) {
+		u16 field = read_vphn_chunk(packed, j);
+
+		if (field == VPHN_FIELD_UNUSED) {
 			/* All significant fields processed, and remaining
 			 * fields contain the reserved value of all 1's.
 			 * Just store them.
 			 */
-			unpacked[i] = *((__be32 *)field);
-			field += 2;
-		} else if (be16_to_cpup(field) & VPHN_FIELD_MSB) {
+			unpacked[i] = (VPHN_FIELD_UNUSED << 16 |
+				       VPHN_FIELD_UNUSED);
+			j += 2;
+		} else if (field & VPHN_FIELD_MSB) {
 			/* Data is in the lower 15 bits of this field */
-			unpacked[i] = cpu_to_be32(
-				be16_to_cpup(field) & VPHN_FIELD_MASK);
-			field++;
+			unpacked[i] = cpu_to_be32(field & VPHN_FIELD_MASK);
+			j++;
 			nr_assoc_doms++;
 		} else {
 			/* Data is in the lower 15 bits of this field
 			 * concatenated with the next 16 bit field
 			 */
-			unpacked[i] = *((__be32 *)field);
-			field += 2;
+			unpacked[i] =
+				cpu_to_be32((u32) field << 16 |
+					    read_vphn_chunk(packed, j + 1));
+			j += 2;
 			nr_assoc_doms++;
 		}
 	}

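For illustration, here is a minimal user-space sketch of the same chunk
arithmetic (not part of the patch; the helper name and the sample value are
made up). Loading the register as a native 64-bit integer yields the same
chunks regardless of host endianness:

#include <stdint.h>
#include <stdio.h>

/* Same arithmetic as the patch's read_vphn_chunk(): chunk i lives in
 * packed[i >> 2], shifted down by (~i & 3) * 16 bits.
 */
static uint16_t read_chunk(const uint64_t *packed, unsigned int i)
{
	return packed[i >> 2] >> ((~i & 3) << 4);
}

int main(void)
{
	/* One register holding the 16-bit chunks 0x8001 0x8002 0x8003 0xffff. */
	uint64_t packed[1] = { 0x800180028003ffffULL };
	unsigned int i;

	for (i = 0; i < 4; i++)
		printf("chunk %u = 0x%04x\n", i, read_chunk(packed, i));

	/* Prints 0x8001, 0x8002, 0x8003, 0xffff on little- and big-endian
	 * hosts alike, because the value was loaded as a native integer.
	 */
	return 0;
}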

* Re: [PATCH v2] powerpc/vphn: fix endian issue in NUMA device node code
  2014-10-03  9:13 [PATCH v2] powerpc/vphn: fix endian issue in NUMA device node code Greg Kurz
@ 2014-10-06 22:36 ` Nishanth Aravamudan
  2014-10-07  9:28 ` [v2] " Michael Ellerman
  1 sibling, 0 replies; 5+ messages in thread
From: Nishanth Aravamudan @ 2014-10-06 22:36 UTC (permalink / raw)
  To: Greg Kurz; +Cc: linuxppc-dev

On 03.10.2014 [11:13:17 +0200], Greg Kurz wrote:
> The associativity domain numbers are obtained from the hypervisor through
> registers and written into memory by the guest: the packed array passed to
> vphn_unpack_associativity() is then native-endian, unlike what was assumed
> in the following commit:
> 
> commit b08a2a12e44eaec5024b2b969f4fcb98169d1ca3
> Author: Alistair Popple <alistair@popple.id.au>
> Date:   Wed Aug 7 02:01:44 2013 +1000
> 
>     powerpc: Make NUMA device node code endian safe
> 
> If a CPU home node changes, the topology gets filled with
> bogus values. This leads to severe performance breakdowns.
> 
> This patch does two things:
> - extract values from the packed array with shifts, in order to be endian
>   neutral
> - convert the resulting values to be32 as expected
> 
> Suggested-by: Anton Blanchard <anton@samba.org>
> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>

Reviewed-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Tested-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>

> ---
> 
> Changes in v2:
> - removed the leftover __be16 *field declaration
> - removed the leftover be16_to_cpup() call
> - updated the comment on the magic formula
> 
> Thanks again Nish... the two leftovers probably explain why PowerVM wasn't
> happy with that patch. :P

Yep, I've tested this v2 patch successfully now!

-Nish


* Re: [v2] powerpc/vphn: fix endian issue in NUMA device node code
  2014-10-03  9:13 [PATCH v2] powerpc/vphn: fix endian issue in NUMA device node code Greg Kurz
  2014-10-06 22:36 ` Nishanth Aravamudan
@ 2014-10-07  9:28 ` Michael Ellerman
  2014-10-10  8:20   ` Greg Kurz
  1 sibling, 1 reply; 5+ messages in thread
From: Michael Ellerman @ 2014-10-07  9:28 UTC (permalink / raw)
  To: Greg Kurz, linuxppc-dev; +Cc: Nishanth Aravamudan

On Fri, 2014-10-03 at 09:13:17 UTC, Greg Kurz wrote:
> The associativity domain numbers are obtained from the hypervisor through
> registers and written into memory by the guest: the packed array passed to
> vphn_unpack_associativity() is then native-endian, unlike what was assumed
> in the following commit:
> 
> This patch does two things:
> - extract values from the packed array with shifts, in order to be endian
>   neutral
> - convert the resulting values to be32 as expected
> 
> Suggested-by: Anton Blanchard <anton@samba.org>
> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> Reviewed-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
> Tested-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>


Hi Greg,

I'm a bit dense, it's after 8pm, but this seems like it's more complicated than
it needs to be?

We get six 64-bit registers back from the hypervisor; they're cpu endian
obviously, and each is defined to consist of four 2-byte fields.

So to unpack them, can't we just iterate over those six 64-bit values, which,
if we load them as 64-bit values, will be back in cpu endian?

cheers


* Re: [v2] powerpc/vphn: fix endian issue in NUMA device node code
  2014-10-07  9:28 ` [v2] " Michael Ellerman
@ 2014-10-10  8:20   ` Greg Kurz
  2014-10-15  2:20     ` Michael Ellerman
  0 siblings, 1 reply; 5+ messages in thread
From: Greg Kurz @ 2014-10-10  8:20 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: Nishanth Aravamudan, linuxppc-dev

On Tue,  7 Oct 2014 20:28:23 +1100 (EST)
Michael Ellerman <mpe@ellerman.id.au> wrote:

> On Fri, 2014-10-03 at 09:13:17 UTC, Greg Kurz wrote:
> > The associativity domain numbers are obtained from the hypervisor through
> > registers and written into memory by the guest: the packed array passed to
> > vphn_unpack_associativity() is then native-endian, unlike what was assumed
> > in the following commit:
> > 
> > This patch does two things:
> > - extract values from the packed array with shifts, in order to be endian
> >   neutral
> > - convert the resulting values to be32 as expected
> > 
> > Suggested-by: Anton Blanchard <anton@samba.org>
> > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > Reviewed-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
> > Tested-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
> 
> 
> Hi Greg,
> 
> I'm a bit dense, it's after 8pm, but this seems like it's more complicated than
> it needs to be?
> 
> We get six 64-bit registers back from the hypervisor, they're cpu endian
> obviously, and each is defined to consist of four 2 byte fields.
> 
> So to unpack them, can't we just iterate over those six 64-bit values, which if
> we load them as 64-bit values will be back in cpu endian?
> 
> cheers
> 

First, I was sure I had Cc'd Benjamin... sorry for this omission :)

Hi Michael,

Do you mean something like the following?

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index b835bf0..fbe5a8b 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1421,8 +1421,11 @@ static long hcall_vphn(unsigned long cpu, __be32 *associativity)
        long retbuf[PLPAR_HCALL9_BUFSIZE] = {0};
        u64 flags = 1;
        int hwcpu = get_hard_smp_processor_id(cpu);
+       int i;
 
        rc = plpar_hcall9(H_HOME_NODE_ASSOCIATIVITY, retbuf, flags, hwcpu);
+       for (i = 0; i < 6; i++)
+               retbuf[i] = cpu_to_be64(retbuf[i]);
        vphn_unpack_associativity(retbuf, associativity);
 
        return rc;

Sure, it also works and is a lot simpler... but it adds an extra loop. Also,
if the first 3 elements of the array contain 12 VPHN_FIELD_MSB fields, then
we don't even need to swap the remaining elements: only the parsing code
knows that.

On the other hand, I understand this is not a hot path... so what should we
do?

Cheers.

--
Greg


* Re: [v2] powerpc/vphn: fix endian issue in NUMA device node code
  2014-10-10  8:20   ` Greg Kurz
@ 2014-10-15  2:20     ` Michael Ellerman
  0 siblings, 0 replies; 5+ messages in thread
From: Michael Ellerman @ 2014-10-15  2:20 UTC (permalink / raw)
  To: Greg Kurz; +Cc: Nishanth Aravamudan, linuxppc-dev

On Fri, 2014-10-10 at 08:20:33 UTC, Greg Kurz wrote:
> On Tue,  7 Oct 2014 20:28:23 +1100 (EST)
> Michael Ellerman <mpe@ellerman.id.au> wrote:
> 
> > On Fri, 2014-10-03 at 09:13:17 UTC, Greg Kurz wrote:
> > > The associativity domain numbers are obtained from the hypervisor through
> > > registers and written into memory by the guest: the packed array passed to
> > > vphn_unpack_associativity() is then native-endian, unlike what was assumed
> > > in the following commit:
> > > 
> > > This patch does two things:
> > > - extract values from the packed array with shifts, in order to be endian
> > >   neutral
> > > - convert the resulting values to be32 as expected
> > 
> > So to unpack them, can't we just iterate over those six 64-bit values, which if
> > we load them as 64-bit values will be back in cpu endian?
> 
> Hi Michael,
> 
> Do you mean something like the following?

Not exactly, but that's certainly a lot simpler than the original patch.
 
> Sure, it also works and is a lot simpler... but it adds an extra loop. Also,
> if the first 3 elements of the array contain 12 VPHN_FIELD_MSB fields, then
> we don't even need to swap the remaining elements: only the parsing code
> knows that.
>
> On the other hand, I understand this is not a hot path... so what should we
> do?

So I prefer this to your original patch. I know the original was cool and
tricky but it was also complicated :)

What I was thinking of was that the actual parsing code, in
vphn_unpack_associativity(), would do the 64-bit loads. That will require
changing the parsing loop quite a bit, but I think it might be worth it.

So can you resend this version with a proper changelog, and I'll merge that
straight away as a bug fix. Then if you can look at reworking the parsing loop
to avoid all the endian swapping on the input side, that'd be great.

cheers
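For reference, a minimal sketch of one way such a reworked parsing loop could
consume the six 64-bit hcall values directly, avoiding the input-side swap.
This is hypothetical user-space code with made-up names (htonl() standing in
for cpu_to_be32()), not the actual kernel rework:

#include <stdint.h>
#include <arpa/inet.h>	/* htonl() stands in for cpu_to_be32() */

#define VPHN_ASSOC_BUFSIZE	(6 * sizeof(uint64_t) / sizeof(uint32_t) + 1)
#define VPHN_FIELD_UNUSED	0xffff
#define VPHN_FIELD_MSB		0x8000
#define VPHN_FIELD_MASK		(~VPHN_FIELD_MSB & 0xffff)

/* Pull 16-bit chunk 'j' out of the six 64-bit hcall return values,
 * which are already in CPU endian. */
static uint16_t get_chunk(const uint64_t *packed, unsigned int j)
{
	return packed[j / 4] >> (16 * (3 - j % 4));
}

static int vphn_unpack_sketch(const uint64_t *packed, uint32_t *unpacked)
{
	unsigned int i, j, nr_assoc_doms = 0;

	for (i = 1, j = 0; i < VPHN_ASSOC_BUFSIZE; i++) {
		uint16_t field = get_chunk(packed, j);

		if (field == VPHN_FIELD_UNUSED) {
			/* Reserved value of all 1s, endian neutral. */
			unpacked[i] = 0xffffffff;
			j += 2;
		} else if (field & VPHN_FIELD_MSB) {
			/* Data is in the lower 15 bits of this chunk. */
			unpacked[i] = htonl(field & VPHN_FIELD_MASK);
			j++;
			nr_assoc_doms++;
		} else {
			/* Lower 15 bits of this chunk concatenated with
			 * the next 16-bit chunk. */
			unpacked[i] = htonl((uint32_t)field << 16 |
					    get_chunk(packed, j + 1));
			j += 2;
			nr_assoc_doms++;
		}
	}

	return nr_assoc_doms;
}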


