All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michael Neuling <mikey@neuling.org>
To: Eric Van Hensbergen <ericvh@gmail.com>
Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	bg-linux@lists.anl-external.org
Subject: Re: [PATCH 3/7] [RFC] add support for BlueGene/P FPU
Date: Fri, 20 May 2011 07:36:32 +1000	[thread overview]
Message-ID: <29601.1305840992@neuling.org> (raw)
In-Reply-To: <BANLkTimKhApFW8G1-pG0u_9Kv2YB0R1O0w@mail.gmail.com>

In message <BANLkTimKhApFW8G1-pG0u_9Kv2YB0R1O0w@mail.gmail.com> you wrote:
> On Thu, May 19, 2011 at 12:58 AM, Michael Neuling <mikey@neuling.org> wrote=
> :
> > Eric,
> >
> >> This patch adds save/restore register support for the BlueGene/P
> >> double hummer FPU.
> >
> > What does this mean? =A0Needs more details here.
> >
> 
> Hi Mikey,
> 
> any specific details you are looking for here?  AFAIK these patches
> are required for the kernel to save/restore the double hummer
> properly.

I should have been more specific.  What does double hammer mean?

I description of how double hammer differs from normal and why a change
in the fpu code is needed would be great.

> 
> >>
> >> +#ifdef CONFIG_BGP
> >> +#define LFPDX(frt, ra, rb) =A0 .long (31<<26)|((frt)<<21)|((ra)<<16)| \
> >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 ((rb)<<11)|(462<<1)
> >> +#define STFPDX(frt, ra, rb) =A0.long (31<<26)|((frt)<<21)|((ra)<<16)| \
> >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 ((rb)<<11)|(974<<1)
> >> +#endif /* CONFIG_BGP */
> >
> > Put these in arch/powerpc/include/asm/ppc-opcode.h and reformat to fit
> > whats there already.
> >
> > Also, don't need to put these defines inside a #ifdef.
> >
> 
> Sure, I'll fix that up.
> 
> >> +#ifdef CONFIG_BGP
> >> +#define SAVE_FPR(n, b, base) li b, THREAD_FPR0+(16*(n)); STFPDX(n, base=
> , b)
> >> +#define REST_FPR(n, b, base) li b, THREAD_FPR0+(16*(n)); LFPDX(n, base,=
>  b)
> >
> > 16*? =A0Are these FP regs 64 or 128 bits wide? =A0If 128 you are doing to
> > have to play with TS_WIDTH to get the size of the FPs correct in the
> > thread_struct.
> >
> > I think there's a bug here.
> >
> 
> I actually have three different versions of this code from different
> source patches that I'm drawing from - so your help in figuring out
> the best way to approach this is appreciated.  The kittyhawk version
> of the code has 8* instead of 16*.  According to the docs:
> "Each of the two FPU units contains 32 64-bit floating point registers
> for a total of 64 FP registers per processor." which would seem to
> point to the kittyhawk version - but they have a second SAVE_32SFPRS
> for the second hummer.  What wasn't clear to me with this version of
> the code was whether or not they were doing something clever like
> saving the pair of the 64-bit FPU registers in a single 128-bit slot
> (seems plausible).  

Ok, sounds like there is 32*8*2 bytes of data, rather than the normal
32*8 bytes for FP only (ignoring VSX).  If this is the case, then you'll
need make 'fpr' in the thread struct bigger which you can do by setting
TS_FPRWIDTH = 2 like we do for VSX.

If there is some instruction that saves and restores two of these at a
time (which LFPDX/STFPDX might I guess), then we can use that, otherwise
we'll have to do 64 saves/restores.  Double load/stores will be faster
I'm guessing though.  

If two at a time, do we need to increase the index in pairs?

> If this is not the way to go, I can certainly
> switch the kittyhawk version of the patch with the *, the extra
> SAVE32SFPR and the extra double hummer specific storage space in the
> thread_struct.  

I'd be tempted to keep it in the 'fpr' part of the struct so you can
then access it with ptrace/signals/core dumps.

> If it would help I can post an alternate version of the patch for
> discussion with the kittyhawk version.

Sure.

The most useful thing would be to see the instruction definition for
STFPDX/LFPDX.

> 
> >> =A0/*
> >> diff --git a/arch/powerpc/platforms/44x/Kconfig b/arch/powerpc/platforms=
> /44x/
> > Kconfig
> >> index f485fc5f..24a515e 100644
> >> --- a/arch/powerpc/platforms/44x/Kconfig
> >> +++ b/arch/powerpc/platforms/44x/Kconfig
> >> @@ -169,6 +169,15 @@ config YOSEMITE
> >> =A0 =A0 =A0 help
> >> =A0 =A0 =A0 =A0 This option enables support for the AMCC PPC440EP evalua=
> tion board.
> >>
> >> +config =A0 =A0 =A0 BGP
> >
> > Does this FPU feature have a specific name like double hammer? =A0I'd
> > rather have the BGP defconfig depend on PPC_FPU_DOUBLE_HUMMER, or
> > something like that...
> >
> >> + =A0 =A0 bool "Blue Gene/P"
> >> + =A0 =A0 depends on 44x
> >> + =A0 =A0 default n
> >> + =A0 =A0 select PPC_FPU
> >> + =A0 =A0 select PPC_DOUBLE_FPU
> >
> > ... in fact, it seem you are doing something like these here but you
> > don't use PPC_DOUBLE_FPU anywhere?
> >
> 
> A fair point.  I'm fine with calling it DOUBLE_HUMMER, but I wasn't sure if
> that was "too internal" of a name for the kernel.  Let me know and
> I'll fix it up.

What I'm mostly concerned about is disassociating it with a particular
CPU.  

If it has an external name, then all the better.

> I'll also change the CONFIG_BGP defines in the FPU code to PPC_DOUBLE_FPU
> or PPC_DOUBLE_HUMMER depending on what the community decides.

Mikey

WARNING: multiple messages have this Message-ID (diff)
From: Michael Neuling <mikey@neuling.org>
To: Eric Van Hensbergen <ericvh@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	bg-linux@lists.anl-external.org
Subject: Re: [PATCH 3/7] [RFC] add support for BlueGene/P FPU
Date: Fri, 20 May 2011 07:36:32 +1000	[thread overview]
Message-ID: <29601.1305840992@neuling.org> (raw)
In-Reply-To: <BANLkTimKhApFW8G1-pG0u_9Kv2YB0R1O0w@mail.gmail.com>

In message <BANLkTimKhApFW8G1-pG0u_9Kv2YB0R1O0w@mail.gmail.com> you wrote:
> On Thu, May 19, 2011 at 12:58 AM, Michael Neuling <mikey@neuling.org> wrote=
> :
> > Eric,
> >
> >> This patch adds save/restore register support for the BlueGene/P
> >> double hummer FPU.
> >
> > What does this mean? =A0Needs more details here.
> >
> 
> Hi Mikey,
> 
> any specific details you are looking for here?  AFAIK these patches
> are required for the kernel to save/restore the double hummer
> properly.

I should have been more specific.  What does double hammer mean?

I description of how double hammer differs from normal and why a change
in the fpu code is needed would be great.

> 
> >>
> >> +#ifdef CONFIG_BGP
> >> +#define LFPDX(frt, ra, rb) =A0 .long (31<<26)|((frt)<<21)|((ra)<<16)| \
> >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 ((rb)<<11)|(462<<1)
> >> +#define STFPDX(frt, ra, rb) =A0.long (31<<26)|((frt)<<21)|((ra)<<16)| \
> >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 ((rb)<<11)|(974<<1)
> >> +#endif /* CONFIG_BGP */
> >
> > Put these in arch/powerpc/include/asm/ppc-opcode.h and reformat to fit
> > whats there already.
> >
> > Also, don't need to put these defines inside a #ifdef.
> >
> 
> Sure, I'll fix that up.
> 
> >> +#ifdef CONFIG_BGP
> >> +#define SAVE_FPR(n, b, base) li b, THREAD_FPR0+(16*(n)); STFPDX(n, base=
> , b)
> >> +#define REST_FPR(n, b, base) li b, THREAD_FPR0+(16*(n)); LFPDX(n, base,=
>  b)
> >
> > 16*? =A0Are these FP regs 64 or 128 bits wide? =A0If 128 you are doing to
> > have to play with TS_WIDTH to get the size of the FPs correct in the
> > thread_struct.
> >
> > I think there's a bug here.
> >
> 
> I actually have three different versions of this code from different
> source patches that I'm drawing from - so your help in figuring out
> the best way to approach this is appreciated.  The kittyhawk version
> of the code has 8* instead of 16*.  According to the docs:
> "Each of the two FPU units contains 32 64-bit floating point registers
> for a total of 64 FP registers per processor." which would seem to
> point to the kittyhawk version - but they have a second SAVE_32SFPRS
> for the second hummer.  What wasn't clear to me with this version of
> the code was whether or not they were doing something clever like
> saving the pair of the 64-bit FPU registers in a single 128-bit slot
> (seems plausible).  

Ok, sounds like there is 32*8*2 bytes of data, rather than the normal
32*8 bytes for FP only (ignoring VSX).  If this is the case, then you'll
need make 'fpr' in the thread struct bigger which you can do by setting
TS_FPRWIDTH = 2 like we do for VSX.

If there is some instruction that saves and restores two of these at a
time (which LFPDX/STFPDX might I guess), then we can use that, otherwise
we'll have to do 64 saves/restores.  Double load/stores will be faster
I'm guessing though.  

If two at a time, do we need to increase the index in pairs?

> If this is not the way to go, I can certainly
> switch the kittyhawk version of the patch with the *, the extra
> SAVE32SFPR and the extra double hummer specific storage space in the
> thread_struct.  

I'd be tempted to keep it in the 'fpr' part of the struct so you can
then access it with ptrace/signals/core dumps.

> If it would help I can post an alternate version of the patch for
> discussion with the kittyhawk version.

Sure.

The most useful thing would be to see the instruction definition for
STFPDX/LFPDX.

> 
> >> =A0/*
> >> diff --git a/arch/powerpc/platforms/44x/Kconfig b/arch/powerpc/platforms=
> /44x/
> > Kconfig
> >> index f485fc5f..24a515e 100644
> >> --- a/arch/powerpc/platforms/44x/Kconfig
> >> +++ b/arch/powerpc/platforms/44x/Kconfig
> >> @@ -169,6 +169,15 @@ config YOSEMITE
> >> =A0 =A0 =A0 help
> >> =A0 =A0 =A0 =A0 This option enables support for the AMCC PPC440EP evalua=
> tion board.
> >>
> >> +config =A0 =A0 =A0 BGP
> >
> > Does this FPU feature have a specific name like double hammer? =A0I'd
> > rather have the BGP defconfig depend on PPC_FPU_DOUBLE_HUMMER, or
> > something like that...
> >
> >> + =A0 =A0 bool "Blue Gene/P"
> >> + =A0 =A0 depends on 44x
> >> + =A0 =A0 default n
> >> + =A0 =A0 select PPC_FPU
> >> + =A0 =A0 select PPC_DOUBLE_FPU
> >
> > ... in fact, it seem you are doing something like these here but you
> > don't use PPC_DOUBLE_FPU anywhere?
> >
> 
> A fair point.  I'm fine with calling it DOUBLE_HUMMER, but I wasn't sure if
> that was "too internal" of a name for the kernel.  Let me know and
> I'll fix it up.

What I'm mostly concerned about is disassociating it with a particular
CPU.  

If it has an external name, then all the better.

> I'll also change the CONFIG_BGP defines in the FPU code to PPC_DOUBLE_FPU
> or PPC_DOUBLE_HUMMER depending on what the community decides.

Mikey

  parent reply	other threads:[~2011-05-19 21:36 UTC|newest]

Thread overview: 79+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-05-18 21:24 [PATCH 1/7] [RFC] Mainline BG/P platform support Eric Van Hensbergen
2011-05-18 21:24 ` Eric Van Hensbergen
2011-05-18 21:24 ` [PATCH 2/7] [RFC] add bluegene entry to cputable Eric Van Hensbergen
2011-05-18 21:24   ` Eric Van Hensbergen
2011-05-20  0:35   ` Benjamin Herrenschmidt
2011-05-20  0:35     ` Benjamin Herrenschmidt
2011-05-20  1:08     ` Eric Van Hensbergen
2011-05-20  1:08       ` Eric Van Hensbergen
2011-05-20  1:50       ` Benjamin Herrenschmidt
2011-05-20  1:50         ` Benjamin Herrenschmidt
2011-05-18 21:24 ` [PATCH 3/7] [RFC] add support for BlueGene/P FPU Eric Van Hensbergen
2011-05-18 21:24   ` Eric Van Hensbergen
2011-05-19  5:58   ` Michael Neuling
2011-05-19  5:58     ` Michael Neuling
2011-05-19 13:53     ` Eric Van Hensbergen
2011-05-19 13:53       ` Eric Van Hensbergen
2011-05-19 15:22       ` [bg-linux] " Kazutomo Yoshii
2011-05-19 21:36       ` Michael Neuling [this message]
2011-05-19 21:36         ` Michael Neuling
2011-05-19 21:55         ` Eric Van Hensbergen
2011-05-19 21:55           ` Eric Van Hensbergen
2011-05-19 23:16           ` Michael Neuling
2011-05-19 23:16             ` Michael Neuling
2011-05-20  0:30             ` Eric Van Hensbergen
2011-05-20  0:30               ` Eric Van Hensbergen
2011-05-20  0:43               ` Michael Neuling
2011-05-20  0:43                 ` Michael Neuling
2011-05-20  0:53       ` Benjamin Herrenschmidt
2011-05-20  0:52     ` Benjamin Herrenschmidt
2011-05-20  0:52       ` Benjamin Herrenschmidt
2011-05-19 21:41   ` [PATCH 3/7] [RFC][V2] add support for BlueGene/P Double FPU Eric Van Hensbergen
2011-05-19 21:41     ` Eric Van Hensbergen
2011-05-18 21:24 ` [PATCH 4/7] [RFC] enable L1_WRITETHROUGH mode for BG/P Eric Van Hensbergen
2011-05-18 21:24   ` Eric Van Hensbergen
2011-05-19 10:43   ` Josh Boyer
2011-05-19 10:43     ` Josh Boyer
2011-05-19 12:53     ` Eric Van Hensbergen
2011-05-19 12:53       ` Eric Van Hensbergen
2011-05-19 21:42   ` [PATCH 4/7] [RFC][V2] enable BGP_L1_WRITETHROUGH " Eric Van Hensbergen
2011-05-19 21:42     ` Eric Van Hensbergen
2011-05-20  1:01   ` [PATCH 4/7] [RFC] enable L1_WRITETHROUGH " Benjamin Herrenschmidt
2011-05-20  1:01     ` Benjamin Herrenschmidt
2011-05-18 21:24 ` [PATCH 5/7] [RFC] force 32-byte aligned kmallocs Eric Van Hensbergen
2011-05-18 21:24   ` Eric Van Hensbergen
2011-05-20  0:36   ` Benjamin Herrenschmidt
2011-05-20  0:36     ` Benjamin Herrenschmidt
2011-05-20  0:47     ` Eric Van Hensbergen
2011-05-20  0:47       ` Eric Van Hensbergen
2011-05-20  1:50       ` Benjamin Herrenschmidt
2011-05-20  1:50         ` Benjamin Herrenschmidt
2011-05-20  1:32     ` [bg-linux] " Kazutomo Yoshii
2011-05-20  2:08       ` Benjamin Herrenschmidt
2011-05-20  2:08         ` Benjamin Herrenschmidt
2011-05-20  2:13         ` Benjamin Herrenschmidt
2011-05-20  2:13           ` Benjamin Herrenschmidt
2011-05-20  3:02           ` Kazutomo Yoshii
2011-05-20  3:13             ` Benjamin Herrenschmidt
2011-05-18 21:24 ` [PATCH 6/7] [RFC] enable early TLBs for BG/P Eric Van Hensbergen
2011-05-18 21:24   ` Eric Van Hensbergen
2011-05-20  0:39   ` Benjamin Herrenschmidt
2011-05-20  0:39     ` Benjamin Herrenschmidt
2011-05-20  1:21     ` Eric Van Hensbergen
2011-05-20  1:21       ` Eric Van Hensbergen
2011-05-20  1:54       ` Benjamin Herrenschmidt
2011-05-20  1:54         ` Benjamin Herrenschmidt
2011-05-20  3:38         ` [bg-linux] " Kazutomo Yoshii
2011-05-20  3:38           ` Kazutomo Yoshii
2011-05-20  3:52           ` Benjamin Herrenschmidt
2011-05-20  3:52             ` Benjamin Herrenschmidt
2011-05-20 13:01             ` Eric Van Hensbergen
2011-05-20 22:20               ` Benjamin Herrenschmidt
2011-05-18 21:24 ` [PATCH 7/7] [RFC] SMP support code Eric Van Hensbergen
2011-05-18 21:24   ` Eric Van Hensbergen
2011-05-20  1:05   ` Benjamin Herrenschmidt
2011-05-20  1:05     ` Benjamin Herrenschmidt
2011-05-19 11:01 ` [PATCH 1/7] [RFC] Mainline BG/P platform support Josh Boyer
2011-05-19 11:01   ` Josh Boyer
2011-05-19 12:35   ` Eric Van Hensbergen
2011-05-19 12:35     ` Eric Van Hensbergen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=29601.1305840992@neuling.org \
    --to=mikey@neuling.org \
    --cc=bg-linux@lists.anl-external.org \
    --cc=ericvh@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.