* SPE & Interrupt context (was how to make use of SPE instructions)
@ 2015-01-27  5:09 Markus Stockhausen
  2015-01-28  4:21 ` Scott Wood
  0 siblings, 1 reply; 9+ messages in thread
From: Markus Stockhausen @ 2015-01-27  5:09 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, Herbert Xu

[-- Attachment #1: Type: text/plain, Size: 1464 bytes --]

On Tue, 2015-01-20 at 14:53 +0000, Markus Stockhausen wrote:
> > From: Scott Wood [scottwood@freescale.com]
> > Sent: Tuesday, 20 January 2015 08:38
> > To: Markus Stockhausen
> > Cc: Michael Ellerman; linuxppc-dev@lists.ozlabs.org
> > Subject: Re: AW: How to make use of SPE instructions?
> > ...
> > With your advice I would place an enable/disable preemption call after
> > 1K of processed data. But will that be sufficient if I only re-enable it
> > for a short timeframe like this:
> >
> >   do {
> >     preempt_disable();
> >     ... calc hashes for 1K of data with 16,000 CPU cycles (or 20us) ...
> >     preempt_enable();
> >   } while (dataleft > 0);
> 
> Yes, it's sufficient.  When you enable preemption it will check to see
> whether there is a pending reschedule.
> 
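
The chunked loop discussed above can be modelled in user space. This is a
minimal sketch, not kernel code: preempt_disable() and preempt_enable() are
stand-ins that only track nesting depth, and hash_chunk(), hash_all() and
CHUNK are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel's preempt_disable()/preempt_enable().
 * They only track nesting depth so the sketch runs in user space. */
static int preempt_depth;
static int resched_points;

static void preempt_disable(void)
{
	preempt_depth++;
}

static void preempt_enable(void)
{
	assert(preempt_depth > 0);
	if (--preempt_depth == 0)
		resched_points++;	/* the kernel would check for a pending reschedule here */
}

/* Hypothetical per-chunk worker; a real one would run the SPE hash rounds. */
static void hash_chunk(const unsigned char *p, size_t len, unsigned long *state)
{
	for (size_t i = 0; i < len; i++)
		*state = *state * 31 + p[i];
}

#define CHUNK 1024u

/* Process the data in pieces of at most CHUNK bytes, so preemption is
 * never held off for longer than one piece. */
static void hash_all(const unsigned char *data, size_t len, unsigned long *state)
{
	while (len > 0) {
		size_t n = len < CHUNK ? len : CHUNK;

		preempt_disable();
		hash_chunk(data, n, state);
		preempt_enable();

		data += n;
		len -= n;
	}
}
```

Each pass through the loop ends with preemption enabled again, which is the
point at which a pending reschedule can be honoured.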

Hi Scott,

thanks for your helpful feedback. As you might have seen, I sent a first
patch for the sha256 kernel module that takes care of preemption.

Herbert Xu noticed that my module won't work for IPsec, as all the
work there will be done from interrupt context. Do you have a tip on how
I can relax the check I implemented:

static bool spe_usable(void)
{
  return !in_interrupt();
}

The Intel (x86) guys have something like this:

bool irq_fpu_usable(void)
{
  return !in_interrupt() ||
    interrupted_user_mode() ||
    interrupted_kernel_fpu_idle();
}

But I have no idea how to transfer it to the PPC/SPE case.

Thanks in advance.

Markus

[-- Attachment #2: InterScan_Disclaimer.txt --]
[-- Type: text/plain, Size: 1650 bytes --]

****************************************************************************
Diese E-Mail enthält vertrauliche und/oder rechtlich geschützte
Informationen. Wenn Sie nicht der richtige Adressat sind oder diese E-Mail
irrtümlich erhalten haben, informieren Sie bitte sofort den Absender und
vernichten Sie diese Mail. Das unerlaubte Kopieren sowie die unbefugte
Weitergabe dieser Mail ist nicht gestattet.

Über das Internet versandte E-Mails können unter fremden Namen erstellt oder
manipuliert werden. Deshalb ist diese als E-Mail verschickte Nachricht keine
rechtsverbindliche Willenserklärung.

Collogia
Unternehmensberatung AG
Ubierring 11
D-50678 Köln

Vorstand:
Kadir Akin
Dr. Michael Höhnerbach

Vorsitzender des Aufsichtsrates:
Hans Kristian Langva

Registergericht: Amtsgericht Köln
Registernummer: HRB 52 497

This e-mail may contain confidential and/or privileged information. If you
are not the intended recipient (or have received this e-mail in error)
please notify the sender immediately and destroy this e-mail. Any
unauthorized copying, disclosure or distribution of the material in this
e-mail is strictly forbidden.

e-mails sent over the internet may have been written under a wrong name or
been manipulated. That is why this message sent as an e-mail is not a
legally binding declaration of intention.

Collogia
Unternehmensberatung AG
Ubierring 11
D-50678 Köln

executive board:
Kadir Akin
Dr. Michael Höhnerbach

President of the supervisory board:
Hans Kristian Langva

Registry office: district court Cologne
Register number: HRB 52 497

****************************************************************************


* Re: SPE & Interrupt context (was how to make use of SPE instructions)
  2015-01-27  5:09 SPE & Interrupt context (was how to make use of SPE instructions) Markus Stockhausen
@ 2015-01-28  4:21 ` Scott Wood
  2015-01-28  5:00   ` AW: " Markus Stockhausen
  0 siblings, 1 reply; 9+ messages in thread
From: Scott Wood @ 2015-01-28  4:21 UTC (permalink / raw)
  To: Markus Stockhausen; +Cc: linuxppc-dev, Herbert Xu

On Tue, 2015-01-27 at 05:09 +0000, Markus Stockhausen wrote:
> On Tue, 2015-01-20 at 14:53 +0000, Markus Stockhausen wrote:
> > > From: Scott Wood [scottwood@freescale.com]
> > > Sent: Tuesday, 20 January 2015 08:38
> > > To: Markus Stockhausen
> > > Cc: Michael Ellerman; linuxppc-dev@lists.ozlabs.org
> > > Subject: Re: AW: How to make use of SPE instructions?
> > > ...
> > > With your advice I would place an enable/disable preemption call after
> > > 1K of processed data. But will that be sufficient if I only re-enable it
> > > for a short timeframe like this:
> > >
> > >   do {
> > >     preempt_disable();
> > >     ... calc hashes for 1K of data with 16,000 CPU cycles (or 20us) ...
> > >     preempt_enable();
> > >   } while (dataleft > 0);
> > 
> > Yes, it's sufficient.  When you enable preemption it will check to see
> > whether there is a pending reschedule.
> > 
> 
> Hi Scott,
> 
> thanks for your helpful feedback. As you might have seen I sent a first
> patch for the sha256 kernel module that takes care about preemption.
> 
> Herbert Xu noticed that my module won't run in for IPsec as all
> work will be done from interrupt context. Do you have a tip how I can
> mitigate the check I implemented:
> 
> static bool spe_usable(void)
> {
>   return !in_interrupt();
> }
> 
> Intel guys have something like that
> 
> bool irq_fpu_usable(void)
> {
>   return !in_interrupt() ||
>     interrupted_user_mode() ||
>     interrupted_kernel_fpu_idle();
> }
> 
> But I have no idea how to transfer it to the PPC/SPE case.

I'm not sure what sort of tip you're looking for, other than
implementing it myself. :-)

-Scott


* AW: SPE & Interrupt context (was how to make use of SPE instructions)
  2015-01-28  4:21 ` Scott Wood
@ 2015-01-28  5:00   ` Markus Stockhausen
  2015-01-30  0:49     ` Scott Wood
  0 siblings, 1 reply; 9+ messages in thread
From: Markus Stockhausen @ 2015-01-28  5:00 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, Herbert Xu

[-- Attachment #1: Type: text/plain, Size: 1777 bytes --]

> > From: Scott Wood [scottwood@freescale.com]
> > Sent: Wednesday, 28 January 2015 05:21
> > To: Markus Stockhausen
> > Cc: Michael Ellerman; linuxppc-dev@lists.ozlabs.org; Herbert Xu
> > Subject: Re: SPE & Interrupt context (was how to make use of SPE instructions)
> > 
> > Hi Scott,
> >
> > thanks for your helpful feedback. As you might have seen I sent a first
> > patch for the sha256 kernel module that takes care about preemption.
> >
> > Herbert Xu noticed that my module won't run in for IPsec as all
> > work will be done from interrupt context. Do you have a tip how I can
> > mitigate the check I implemented:
> >
> > static bool spe_usable(void)
> > {
> >   return !in_interrupt();
> > }
> >
> > Intel guys have something like that
> >
> > bool irq_fpu_usable(void)
> > {
> >   return !in_interrupt() ||
> >     interrupted_user_mode() ||
> >     interrupted_kernel_fpu_idle();
> > }
> >
> > But I have no idea how to transfer it to the PPC/SPE case.
> 
> I'm not sure what sort of tip you're looking for, other than
> implementing it myself. :-)

Hi Scott,

maybe I did not explain it correctly. interrupted_kernel_fpu_idle()
is x86 specific. The same applies to interrupted_user_mode().
I'm just searching for a similar feature in the PPC/SPE world.
I can see that enable_kernel_spe() does something with the
MSR_SPE flag, but I have no idea how to determine whether I'm allowed
to enable SPE while I'm inside an interrupt context.

I'm asking because from the previous posts I conclude that
running SPE instructions inside an interrupt might be problematic.
Is that because the registers are not saved?

Or can I just save the register contents myself, so that interrupt
context is no longer a showstopper?

Markus



* Re: AW: SPE & Interrupt context (was how to make use of SPE instructions)
  2015-01-28  5:00   ` AW: " Markus Stockhausen
@ 2015-01-30  0:49     ` Scott Wood
  2015-01-30  5:37       ` AW: " Markus Stockhausen
  0 siblings, 1 reply; 9+ messages in thread
From: Scott Wood @ 2015-01-30  0:49 UTC (permalink / raw)
  To: Markus Stockhausen; +Cc: linuxppc-dev, Herbert Xu

On Wed, 2015-01-28 at 05:00 +0000, Markus Stockhausen wrote:
> > > From: Scott Wood [scottwood@freescale.com]
> > > Sent: Wednesday, 28 January 2015 05:21
> > > To: Markus Stockhausen
> > > Cc: Michael Ellerman; linuxppc-dev@lists.ozlabs.org; Herbert Xu
> > > Subject: Re: SPE & Interrupt context (was how to make use of SPE instructions)
> > > 
> > > Hi Scott,
> > >
> > > thanks for your helpful feedback. As you might have seen I sent a first
> > > patch for the sha256 kernel module that takes care about preemption.
> > >
> > > Herbert Xu noticed that my module won't run in for IPsec as all
> > > work will be done from interrupt context. Do you have a tip how I can
> > > mitigate the check I implemented:
> > >
> > > static bool spe_usable(void)
> > > {
> > >   return !in_interrupt();
> > > }
> > >
> > > Intel guys have something like that
> > >
> > > bool irq_fpu_usable(void)
> > > {
> > >   return !in_interrupt() ||
> > >     interrupted_user_mode() ||
> > >     interrupted_kernel_fpu_idle();
> > > }
> > >
> > > But I have no idea how to transfer it to the PPC/SPE case.
> > 
> > I'm not sure what sort of tip you're looking for, other than
> > implementing it myself. :-)
> 
> Hi Scott,
> 
> maybe I did not explain it correctly. interrupted_kernel_fpu_idle()
> is x86 specific. The same applies to interrupted_user_mode().
> I'm just searching for a similar feature in the PPC/SPE world.

There isn't one.

> I can see that enable_kernel_spe() does something with the
> MSR_SPE flag, but I have no idea  how to determine if I'm allowed
> to enable SPE although I'm inside an interrupt context.

As with x86, you'd want to check whether the kernel interrupted
userspace.  I don't know what x86 is doing with TS, but on PPC you might
check whether the interrupted thread had MSR_FP enabled.

> I'm asking because from the previous posts I conclude that 
> running SPE instructions inside an interrupt might be critical. 
> Because of registers not being saved?

Yes.  Currently callers of enable_kernel_spe() only need to disable
preemption, not interrupts.

> Or can I just save the register contents myself and interrupt
> context is no longer a showstopper?

If you only need a small number of registers that might be reasonable,
but if you need a bunch then you don't want to save them when you don't
have to.

Another option is to change enable_kernel_spe() to require interrupts to
be disabled.

-Scott
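
One possible reading of the advice above is a predicate along the lines of
the x86 irq_fpu_usable(). The sketch below is a user-space model under
stated assumptions: spe_usable_model() and struct regs_model are
hypothetical names, the MSR bit positions are quoted from memory of
arch/powerpc/include/asm/reg.h and should be checked there, and the model
tests MSR_SPE rather than MSR_FP on the assumption that the SPE bit is the
relevant one here. It does not show that such a check is sufficient in a
real kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions; verify against asm/reg.h before relying on them. */
#define MSR_PR  (1u << 14)	/* problem state: set while user code runs */
#define MSR_SPE (1u << 25)	/* SPE enabled for the context */

/* Minimal model of the saved state of the interrupted context. */
struct regs_model {
	uint32_t msr;
};

/* Hypothetical helper, not an existing kernel function. */
static bool spe_usable_model(bool in_irq, const struct regs_model *interrupted)
{
	if (!in_irq)
		return true;	/* process context: disabling preemption is enough */

	assert(interrupted);
	if (interrupted->msr & MSR_PR)
		return true;	/* interrupted user space: no kernel SPE use in flight */

	/* Interrupted kernel code: treat SPE as busy if it was enabled. */
	return !(interrupted->msr & MSR_SPE);
}
```

A caller would fall back to the generic implementation whenever the
predicate returns false.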


* AW: AW: SPE & Interrupt context (was how to make use of SPE instructions)
  2015-01-30  0:49     ` Scott Wood
@ 2015-01-30  5:37       ` Markus Stockhausen
  2015-01-30  8:49         ` Gabriel Paubert
  0 siblings, 1 reply; 9+ messages in thread
From: Markus Stockhausen @ 2015-01-30  5:37 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, Herbert Xu

[-- Attachment #1: Type: text/plain, Size: 4453 bytes --]

> From: Scott Wood [scottwood@freescale.com]
> Sent: Friday, 30 January 2015 01:49
> To: Markus Stockhausen
> Cc: Michael Ellerman; linuxppc-dev@lists.ozlabs.org; Herbert Xu
> Subject: Re: AW: SPE & Interrupt context (was how to make use of SPE instructions)
> 
> On Wed, 2015-01-28 at 05:00 +0000, Markus Stockhausen wrote:
> > > > From: Scott Wood [scottwood@freescale.com]
> > > > Sent: Wednesday, 28 January 2015 05:21
> > > > To: Markus Stockhausen
> > > > Cc: Michael Ellerman; linuxppc-dev@lists.ozlabs.org; Herbert Xu
> > > > Subject: Re: SPE & Interrupt context (was how to make use of SPE instructions)
> > > >
> > > > Hi Scott,
> > > >
> > > > thanks for your helpful feedback. As you might have seen I sent a first
> > > > patch for the sha256 kernel module that takes care about preemption.
> > > >
> > > > Herbert Xu noticed that my module won't run in for IPsec as all
> > > > work will be done from interrupt context. Do you have a tip how I can
> > > > mitigate the check I implemented:
> > > >
> > > > static bool spe_usable(void)
> > > > {
> > > >   return !in_interrupt();
> > > > }
> > > >
> > > > Intel guys have something like that
> > > >
> > > > bool irq_fpu_usable(void)
> > > > {
> > > >   return !in_interrupt() ||
> > > >     interrupted_user_mode() ||
> > > >     interrupted_kernel_fpu_idle();
> > > > }
> > > >
> > > > But I have no idea how to transfer it to the PPC/SPE case.
> > >
> > > I'm not sure what sort of tip you're looking for, other than
> > > implementing it myself. :-)
> >
> > Hi Scott,
> >
> > maybe I did not explain it correctly. interrupted_kernel_fpu_idle()
> > is x86 specific. The same applies to interrupted_user_mode().
> > I'm just searching for a similar feature in the PPC/SPE world.
> 
> There isn't one.
> 
> > I can see that enable_kernel_spe() does something with the
> > MSR_SPE flag, but I have no idea  how to determine if I'm allowed
> > to enable SPE although I'm inside an interrupt context.
> 
> As with x86, you'd want to check whether the kernel interrupted
> userspace.  I don't know what x86 is doing with TS, but on PPC you might
> check whether the interrupted thread had MSR_FP enabled.
> 
> > I'm asking because from the previous posts I conclude that
> > running SPE instructions inside an interrupt might be critical.
> > Because of registers not being saved?
> 
> Yes.  Currently callers of enable_kernel_spe() only need to disable
> preemption, not interrupts.
> 
> > Or can I just save the register contents myself and interrupt
> > context is no longer a showstopper?
> 
> If you only need a small number of registers that might be reasonable,
> but if you need a bunch then you don't want to save them when you don't
> have to.
> 
> Another option is to change enable_kernel_spe() to require interrupts to
> be disabled.

Phew, that is going deeper than I expected.

I'm a newbie when it comes to interrupts and FPU/SPE registers. Nevertheless,
restricting enable_kernel_spe() so that it is only available outside of
interrupt context sounds too restrictive to me. Also, checking thread/CPU
flags of an interrupted process is nothing I can or want to implement. There
is a risk that I'm starting something that will be too complex for me.

BUT! Given the fact that SPE registers are only extended GPRs and my
algorithm needs just 10 of them, I can live with the following design.

- I must already save several non-volatile registers. Putting 64 bit values
into them requires me to save their contents with evstdd instead of
stw. Of course that requires the stack to be aligned to 8 bytes, so only a
few additional alignment instructions are needed during initialization.

- During function cleanup I will restore the registers the same way.

- In case I interrupted myself, I might have saved sensitive data of another
thread on my stack. So I will zero that area after I have restored the
registers. That needs an additional 10 instructions. Compared with ~2000
instructions for one sha256 round that should be negligible.

This little overhead will save me lots of trouble in other places:

- I can avoid checking for an interrupt context.

- I don't need a fallback to the generic implementation.

Thinking about it more and more, I think performance will stay the same.
Can you confirm that this will work? If yes, I will send a v2 patch.

Markus
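
The save, compute, restore, wipe sequence proposed above can be modelled in
plain C. This is only a sketch under assumptions: the evr[] array stands in
for the 64-bit view of the ten registers, transform_body() is a hypothetical
placeholder for the hash rounds, and the save area is passed in by the
caller purely so that it can be inspected afterwards (in the assembly it
would live in the function's own stack frame).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NREGS_USED 10	/* the message mentions about 10 registers */

/* Model of the 64-bit SPE view of the registers the routine uses. */
static uint64_t evr[NREGS_USED];

/* Hypothetical transform body: clobbers every register it uses. */
static uint64_t transform_body(uint64_t input)
{
	for (int i = 0; i < NREGS_USED; i++)
		evr[i] = input + (uint64_t)i;
	return evr[NREGS_USED - 1];
}

static uint64_t transform(uint64_t input, uint64_t save_area[NREGS_USED])
{
	uint64_t result;

	/* 64-bit stores such as evstdd need an 8-byte aligned slot. */
	assert(((uintptr_t)save_area & 7u) == 0);

	memcpy(save_area, evr, sizeof(evr));	/* prologue: save full 64-bit values */
	result = transform_body(input);
	memcpy(evr, save_area, sizeof(evr));	/* epilogue: restore them */

	/* Wipe the save area so no foreign register contents stay behind;
	 * the volatile pointer keeps the compiler from dropping the stores. */
	volatile uint64_t *p = save_area;
	for (int i = 0; i < NREGS_USED; i++)
		p[i] = 0;

	return result;
}
```

The wipe costs one store per saved register, which matches the estimate of
roughly ten extra instructions.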

* Re: AW: SPE & Interrupt context (was how to make use of SPE instructions)
  2015-01-30  5:37       ` AW: " Markus Stockhausen
@ 2015-01-30  8:49         ` Gabriel Paubert
  2015-01-30  9:39           ` AW: " Markus Stockhausen
  0 siblings, 1 reply; 9+ messages in thread
From: Gabriel Paubert @ 2015-01-30  8:49 UTC (permalink / raw)
  To: Markus Stockhausen; +Cc: Scott Wood, linuxppc-dev, Herbert Xu

On Fri, Jan 30, 2015 at 05:37:29AM +0000, Markus Stockhausen wrote:
> > From: Scott Wood [scottwood@freescale.com]
> > Sent: Friday, 30 January 2015 01:49
> > To: Markus Stockhausen
> > Cc: Michael Ellerman; linuxppc-dev@lists.ozlabs.org; Herbert Xu
> > Subject: Re: AW: SPE & Interrupt context (was how to make use of SPE instructions)
> > 
> > On Wed, 2015-01-28 at 05:00 +0000, Markus Stockhausen wrote:
> > > > > From: Scott Wood [scottwood@freescale.com]
> > > > > Sent: Wednesday, 28 January 2015 05:21
> > > > > To: Markus Stockhausen
> > > > > Cc: Michael Ellerman; linuxppc-dev@lists.ozlabs.org; Herbert Xu
> > > > > Subject: Re: SPE & Interrupt context (was how to make use of SPE instructions)
> > > > >
> > > > > Hi Scott,
> > > > >
> > > > > thanks for your helpful feedback. As you might have seen I sent a first
> > > > > patch for the sha256 kernel module that takes care about preemption.
> > > > >
> > > > > Herbert Xu noticed that my module won't run in for IPsec as all
> > > > > work will be done from interrupt context. Do you have a tip how I can
> > > > > mitigate the check I implemented:
> > > > >
> > > > > static bool spe_usable(void)
> > > > > {
> > > > >   return !in_interrupt();
> > > > > }
> > > > >
> > > > > Intel guys have something like that
> > > > >
> > > > > bool irq_fpu_usable(void)
> > > > > {
> > > > >   return !in_interrupt() ||
> > > > >     interrupted_user_mode() ||
> > > > >     interrupted_kernel_fpu_idle();
> > > > > }
> > > > >
> > > > > But I have no idea how to transfer it to the PPC/SPE case.
> > > >
> > > > I'm not sure what sort of tip you're looking for, other than
> > > > implementing it myself. :-)
> > >
> > > Hi Scott,
> > >
> > > maybe I did not explain it correctly. interrupted_kernel_fpu_idle()
> > > is x86 specific. The same applies to interrupted_user_mode().
> > > I'm just searching for a similar feature in the PPC/SPE world.
> > 
> > There isn't one.
> > 
> > > I can see that enable_kernel_spe() does something with the
> > > MSR_SPE flag, but I have no idea  how to determine if I'm allowed
> > > to enable SPE although I'm inside an interrupt context.
> > 
> > As with x86, you'd want to check whether the kernel interrupted
> > userspace.  I don't know what x86 is doing with TS, but on PPC you might
> > check whether the interrupted thread had MSR_FP enabled.
> > 
> > > I'm asking because from the previous posts I conclude that
> > > running SPE instructions inside an interrupt might be critical.
> > > Because of registers not being saved?
> > 
> > Yes.  Currently callers of enable_kernel_spe() only need to disable
> > preemption, not interrupts.
> > 
> > > Or can I just save the register contents myself and interrupt
> > > context is no longer a showstopper?
> > 
> > If you only need a small number of registers that might be reasonable,
> > but if you need a bunch then you don't want to save them when you don't
> > have to.
> > 
> > Another option is to change enable_kernel_spe() to require interrupts to
> > be disabled.
> 
> Phew, that is going deeper than I expected. 
> 
> I'm a newbie in the topic of interrupts and FPU/SPE registers. Nevertheless
> enforcing enable_kernel_spe() to only be available outside of interrupt
> context sounds too restrictive for me. Also checking for thread/CPU flags 
> of an interrupted process is nothing I can or want to implement. There
> might be the risk that I'm starting something that will be too complex
> for me.
> 
> BUT! Given the fact that SPE registers are only extended GPRs and my
> algorithm needs just 10 of them I can live with the following design.
> 
> - I must already save several non-volatile registers. Putting the 64 bit values 
> into them would require me to save their contents with evstdd instead of 
> stw. Of course stack alignment to 8 bytes required. So only a few alignment
> instructions needed additionally during initialization.

On most PPC ABIs the stack is guaranteed to be aligned to a 16 byte
boundary. In some it may be only 8, but I can't remember any with only
4 byte alignment.

I checked my 32 bit kernel images with:

objdump -d vmlinux |awk '/stwu.*r1,/{print $6,$7}'|sort -u

and the stack seems to always be 16 byte aligned.
For 64 bit, use stdu instead of stwu.

I've also found a few stwux/stdux which are hopefully known
to be harmless.
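
The alignment argument is simple arithmetic: if r1 is 16 byte aligned on
entry and every frame size is a multiple of 16, then any offset that is a
multiple of 8 gives a valid slot for a 64-bit store. A small sketch, with
frame_align_up() and dword_slot_ok() as hypothetical helper names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round a frame size up to the next multiple of align (a power of two). */
static uint32_t frame_align_up(uint32_t size, uint32_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}

/* True if a doubleword store at sp + offset is 8-byte aligned. */
static bool dword_slot_ok(uint32_t sp, uint32_t offset)
{
	return ((sp + offset) & 7u) == 0;
}
```

For example, a 100 byte frame rounds up to 112, and with a 16 byte aligned
stack pointer the offsets 16, 24 and 32 are all usable while offset 12 is
not.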

> 
> - During function cleanup I will restore the registers the same way.
> 
> - In case I interrupted myself, I might have saved sensitive data of another 
> thread on my stack. So I will zero that area after I restored the registers.
> That needs an additional 10 instructions. In contrast to ~2000 instructions
> for one sha256 round that should be neglectable.
> 
> This little overhead will save me lots of trouble at other locations:
> 
> - I can avoid checking for an interrupt context.
> 
> - I don't need a fallback to the generic implementation. 
> 
> Thinking about it more and more I think I performance will stay the same. 
> Can you confirm that this will work? If yes I will send a v2 patch.
> 
> Markus

    Gabriel


* AW: AW: SPE & Interrupt context (was how to make use of SPE instructions)
  2015-01-30  8:49         ` Gabriel Paubert
@ 2015-01-30  9:39           ` Markus Stockhausen
  2015-01-30 10:41             ` Gabriel Paubert
  0 siblings, 1 reply; 9+ messages in thread
From: Markus Stockhausen @ 2015-01-30  9:39 UTC (permalink / raw)
  To: Gabriel Paubert; +Cc: Scott Wood, linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 2163 bytes --]

> From: Gabriel Paubert [paubert@iram.es]
> Sent: Friday, 30 January 2015 09:49
> To: Markus Stockhausen
> Cc: Scott Wood; linuxppc-dev@lists.ozlabs.org; Herbert Xu
> Subject: Re: AW: SPE & Interrupt context (was how to make use of SPE instructions)
>
> > ...
> > - I must already save several non-volatile registers. Putting the 64 bit values
> > into them would require me to save their contents with evstdd instead of
> > stw. Of course stack alignment to 8 bytes required. So only a few alignment
> > instructions needed additionally during initialization.
> 
> On most PPC ABI the stack is guaranteed to be aligned to a 16 byte
> boundary. In some it may be only 8, but I can't remember any 4 byte
> only alignment.
> 
> I checked my 32 bit kernel images with:
> 
> objdump -d vmlinux |awk '/stwu.*r1,/{print $6,$7}'|sort -u
> 
> and the stack seems to always be 16 byte aligned.
> For 64 bit, use stdu instead of stwu.
> 
> I've also found a few stwux/stdux which are hopefully known
> to be harmless.
>
> Gabriel

A helpful remark. But now I'm unsure about how to set up the function. SPE
seems to be 32 bit only and I would use its evxxx instructions. Do you think
the following sequence is the right way?

_GLOBAL(ppc_spe_sha256_transform)
  stwu            r1,-128(r1);    /* create stack frame           */
  stw             r24,8(r1);      /* save normal registers        */
  stw             r25,12(r1);
  evstdw          r14,16(r1);     /* We must save non volatile    */
  evstdw          r15,24(r1);     /* registers. Take the chance   */
  evstdw          r16,32(r1);     /* and save the SPE part too    */
  ...
  lwz             r24,8(r1);      /* restore normal registers     */
  lwz             r25,12(r1);
  evldw           r14,16(r1);     /* restore non-v. + SPE registers */
  evldw           r15,24(r1);
  evldw           r16,32(r1);
  addi            r1,r1,128;      /* cleanup stack frame          */

Or must I use the kernel provided defines with PPC_STLU r1,-INT_FRAME_SIZE(r1)
plus SAVE_GPR/SAVE_EVR/REST_GPR/REST_EVR?
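
The frame layout implied by the offsets in the sequence above can be written
down as a C struct to check that the 64-bit slots do not overlap the 32-bit
ones. This is a sketch for illustration only: struct spe_frame_model and
evr_slot_offset() are hypothetical names, and the description of the word at
offset 4 as the LR save word follows my understanding of the 32-bit SVR4
ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of the 128-byte frame created by "stwu r1,-128(r1)". */
struct spe_frame_model {
	uint32_t back_chain;	/*  0(r1): written by stwu */
	uint32_t lr_save;	/*  4(r1): assumed LR save word */
	uint32_t r24;		/*  8(r1) */
	uint32_t r25;		/* 12(r1) */
	uint64_t evr14;		/* 16(r1): full 64-bit r14 */
	uint64_t evr15;		/* 24(r1) */
	uint64_t evr16;		/* 32(r1) */
	uint8_t  rest[128 - 40];	/* remaining save slots and scratch space */
};

_Static_assert(sizeof(struct spe_frame_model) == 128, "frame must be 128 bytes");

/* Byte offset of the n-th 64-bit save slot (n = 0 for r14). */
static size_t evr_slot_offset(unsigned n)
{
	assert(n < 3);
	return offsetof(struct spe_frame_model, evr14) + 8u * n;
}
```

All three 64-bit slots start at multiples of 8, so with a 16 byte aligned r1
they meet the alignment that doubleword stores need.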

Markus


* Re: AW: SPE & Interrupt context (was how to make use of SPE instructions)
  2015-01-30  9:39           ` AW: " Markus Stockhausen
@ 2015-01-30 10:41             ` Gabriel Paubert
  2015-01-30 17:57               ` Scott Wood
  0 siblings, 1 reply; 9+ messages in thread
From: Gabriel Paubert @ 2015-01-30 10:41 UTC (permalink / raw)
  To: Markus Stockhausen; +Cc: Scott Wood, linuxppc-dev

On Fri, Jan 30, 2015 at 09:39:41AM +0000, Markus Stockhausen wrote:
> > From: Gabriel Paubert [paubert@iram.es]
> > Sent: Friday, 30 January 2015 09:49
> > To: Markus Stockhausen
> > Cc: Scott Wood; linuxppc-dev@lists.ozlabs.org; Herbert Xu
> > Subject: Re: AW: SPE & Interrupt context (was how to make use of SPE instructions)
> >
> > > ...
> > > - I must already save several non-volatile registers. Putting the 64 bit values
> > > into them would require me to save their contents with evstdd instead of
> > > stw. Of course stack alignment to 8 bytes required. So only a few alignment
> > > instructions needed additionally during initialization.
> > 
> > On most PPC ABI the stack is guaranteed to be aligned to a 16 byte
> > boundary. In some it may be only 8, but I can't remember any 4 byte
> > only alignment.
> > 
> > I checked my 32 bit kernel images with:
> > 
> > objdump -d vmlinux |awk '/stwu.*r1,/{print $6,$7}'|sort -u
> > 
> > and the stack seems to always be 16 byte aligned.
> > For 64 bit, use stdu instead of stwu.
> > 
> > I've also found a few stwux/stdux which are hopefully known
> > to be harmless.
> >
> > Gabriel
> 
> A helpful annotation. But now I'm unsure about function usage. SPE seems to be
> 32bit only and I would use their evxxx instructions. Do you think the following
> sequence will be the right way? 
> 
> _GLOBAL(ppc_spe_sha256_transform)
>   stwu            r1,-128(r1);    /* create stack frame           */
>   stw             r24,8(r1);      /* save normal registers        */
>   stw             r25,12(r1);
>   evstdw          r14,16(r1);     /* We must save non volatile    */
>   evstdw          r15,24(r1);     /* registers. Take the chance   */
>   evstdw          r16,32(r1);     /* and save the SPE part too    */
>   ...
>   lwz             r24,8(r1);      /* restore normal registers     */
>   lwz             r25,12(r1);
>   evldw           r14,16(r1);     /* restore non-v. + SPE registers */
>   evldw           r15,24(r1);
>   evldw           r16,32(r1);
>   addi            r1,r1,128;      /* cleanup stack frame          */
> 

Yes. But there is probably also a status/control register somewhere that
you might need to save/restore, unless it is never used and/or affected by
the instructions you use.

> Or must I use the kernel provided defines with PPC_STLU r1,-INT_FRAME_SIZE(r1) 
> plus SAVE_GPR/SAVE_EVR/REST_GPR/REST_EVR?
> 

From what I understand, INT_FRAME_SIZE is for interrupt entry code. That
is not the case for your code, which is a standard function except for
the fact that it clobbers the upper 32 bits of some registers by using
SPE instructions. Therefore INT_FRAME_SIZE is overkill. I also believe that
you can save the registers as you suggest; there is no need to split them
into high and low parts.

By the way, I wonder where the SAVE_EVR/REST_EVR macros are used. I only
see the definitions, no use in a 3.18 source tree.

    Gabriel


* Re: AW: SPE & Interrupt context (was how to make use of SPE instructions)
  2015-01-30 10:41             ` Gabriel Paubert
@ 2015-01-30 17:57               ` Scott Wood
  0 siblings, 0 replies; 9+ messages in thread
From: Scott Wood @ 2015-01-30 17:57 UTC (permalink / raw)
  To: Gabriel Paubert; +Cc: linuxppc-dev, Markus Stockhausen

On Fri, 2015-01-30 at 11:41 +0100, Gabriel Paubert wrote:
> By the way, I wonder where the SAVE_EVR/REST_EVR macros are used. I only
> see the definitions, no use in a 3.18 source tree.

SAVE_EVR is used by SAVE_2EVRS, which is used by SAVE_4EVRS, etc.

The 32EVRS version is used in load_up_spe() and kvm_save_guest_spe().

-Scott
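
The nesting described above, where each macro is built from two copies of
the next smaller one, can be modelled with the C preprocessor. This is only
a sketch: the kernel's macros are assembler macros with different
parameters, save_evr() is a hypothetical helper, and storing just the upper
32 bits of each register is an assumption made for this model.

```c
#include <assert.h>
#include <stdint.h>

static int saves_done;

/* Store the upper half of register n; the lower half is the ordinary GPR. */
static void save_evr(int n, uint32_t *high_words, const uint64_t *regs)
{
	assert(n >= 0 && n < 32);
	high_words[n] = (uint32_t)(regs[n] >> 32);
	saves_done++;
}

/* Each macro expands to two copies of the next smaller one. */
#define SAVE_EVR(n, buf, regs)    save_evr((n), (buf), (regs))
#define SAVE_2EVRS(n, buf, regs)  do { SAVE_EVR(n, buf, regs);     SAVE_EVR((n) + 1, buf, regs); } while (0)
#define SAVE_4EVRS(n, buf, regs)  do { SAVE_2EVRS(n, buf, regs);   SAVE_2EVRS((n) + 2, buf, regs); } while (0)
#define SAVE_8EVRS(n, buf, regs)  do { SAVE_4EVRS(n, buf, regs);   SAVE_4EVRS((n) + 4, buf, regs); } while (0)
#define SAVE_16EVRS(n, buf, regs) do { SAVE_8EVRS(n, buf, regs);   SAVE_8EVRS((n) + 8, buf, regs); } while (0)
#define SAVE_32EVRS(n, buf, regs) do { SAVE_16EVRS(n, buf, regs);  SAVE_16EVRS((n) + 16, buf, regs); } while (0)
```

A search for direct uses of SAVE_EVR therefore finds only the definitions,
because callers invoke the 32-register variant at the top of the chain.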


end of thread, other threads:[~2015-01-30 17:58 UTC | newest]

Thread overview: 9+ messages
2015-01-27  5:09 SPE & Interrupt context (was how to make use of SPE instructions) Markus Stockhausen
2015-01-28  4:21 ` Scott Wood
2015-01-28  5:00   ` AW: " Markus Stockhausen
2015-01-30  0:49     ` Scott Wood
2015-01-30  5:37       ` AW: " Markus Stockhausen
2015-01-30  8:49         ` Gabriel Paubert
2015-01-30  9:39           ` AW: " Markus Stockhausen
2015-01-30 10:41             ` Gabriel Paubert
2015-01-30 17:57               ` Scott Wood
