* [Qemu-devel] [PATCH v2 0/6] ARM: AREG0 conversion
@ 2012-03-24 18:58 Blue Swirl
  2012-03-25 22:29 ` Richard Henderson
  2012-03-26 11:52 ` Peter Maydell
  0 siblings, 2 replies; 14+ messages in thread
From: Blue Swirl @ 2012-03-24 18:58 UTC
  To: qemu-devel, Paul Brook, Peter Maydell

v2: fix patch 1, tweak patch 2 and rebase to master.

URL	git://repo.or.cz/qemu/blueswirl.git
	http://repo.or.cz/r/qemu/blueswirl.git

Blue Swirl (6):
  arm: move neon_tbl to neon_helper.c
  arm: move saturating arithmetic to helper.c
  arm: move other arithmetic to helper.c
  arm: move cpsr and banked register access to helper.c
  arm: move exception and wfi helpers to helper.c
  arm: move load and store helpers, switch to AREG0 free mode

 Makefile.target          |    6 +-
 configure                |    2 +-
 target-arm/helper.c      |  387 +++++++++++++++++++++++++++++++++++++++++-
 target-arm/helper.h      |   60 ++++----
 target-arm/neon_helper.c |   22 +++
 target-arm/op_helper.c   |  430 ----------------------------------------------
 target-arm/translate.c   |  148 ++++++++--------
 7 files changed, 512 insertions(+), 543 deletions(-)
 delete mode 100644 target-arm/op_helper.c

-- 
1.7.9


* Re: [Qemu-devel] [PATCH v2 0/6] ARM: AREG0 conversion
  2012-03-24 18:58 [Qemu-devel] [PATCH v2 0/6] ARM: AREG0 conversion Blue Swirl
@ 2012-03-25 22:29 ` Richard Henderson
  2012-03-26 11:52 ` Peter Maydell
  1 sibling, 0 replies; 14+ messages in thread
From: Richard Henderson @ 2012-03-25 22:29 UTC
  To: Blue Swirl; +Cc: Peter Maydell, qemu-devel, Paul Brook

On 03/24/2012 11:58 AM, Blue Swirl wrote:
> v2: fix patch 1, tweak patch 2 and rebase to master.
> 
> URL	git://repo.or.cz/qemu/blueswirl.git
> 	http://repo.or.cz/r/qemu/blueswirl.git
> 
> Blue Swirl (6):
>   arm: move neon_tbl to neon_helper.c
>   arm: move saturating arithmetic to helper.c
>   arm: move other arithmetic to helper.c
>   arm: move cpsr and banked register access to helper.c
>   arm: move exception and wfi helpers to helper.c
>   arm: move load and store helpers, switch to AREG0 free mode
> 
>  Makefile.target          |    6 +-
>  configure                |    2 +-
>  target-arm/helper.c      |  387 +++++++++++++++++++++++++++++++++++++++++-
>  target-arm/helper.h      |   60 ++++----
>  target-arm/neon_helper.c |   22 +++
>  target-arm/op_helper.c   |  430 ----------------------------------------------
>  target-arm/translate.c   |  148 ++++++++--------
>  7 files changed, 512 insertions(+), 543 deletions(-)
>  delete mode 100644 target-arm/op_helper.c
> 

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH v2 0/6] ARM: AREG0 conversion
  2012-03-24 18:58 [Qemu-devel] [PATCH v2 0/6] ARM: AREG0 conversion Blue Swirl
  2012-03-25 22:29 ` Richard Henderson
@ 2012-03-26 11:52 ` Peter Maydell
  2012-03-26 12:46   ` Lluís Vilanova
  2012-03-26 13:05   ` Paul Brook
  1 sibling, 2 replies; 14+ messages in thread
From: Peter Maydell @ 2012-03-26 11:52 UTC
  To: Blue Swirl; +Cc: qemu-devel, Paul Brook

On 24 March 2012 18:58, Blue Swirl <blauwirbel@gmail.com> wrote:
> v2: fix patch 1, tweak patch 2 and rebase to master.
>
> URL     git://repo.or.cz/qemu/blueswirl.git
>        http://repo.or.cz/r/qemu/blueswirl.git
>
> Blue Swirl (6):
>  arm: move neon_tbl to neon_helper.c
>  arm: move saturating arithmetic to helper.c
>  arm: move other arithmetic to helper.c
>  arm: move cpsr and banked register access to helper.c
>  arm: move exception and wfi helpers to helper.c
>  arm: move load and store helpers, switch to AREG0 free mode

The patches themselves look OK, but do we really want to take
a 5% performance hit for this cleanup?

-- PMM


* Re: [Qemu-devel] [PATCH v2 0/6] ARM: AREG0 conversion
  2012-03-26 11:52 ` Peter Maydell
@ 2012-03-26 12:46   ` Lluís Vilanova
  2012-03-26 12:48     ` Andreas Färber
  2012-03-26 13:05   ` Paul Brook
  1 sibling, 1 reply; 14+ messages in thread
From: Lluís Vilanova @ 2012-03-26 12:46 UTC
  To: Peter Maydell; +Cc: Blue Swirl, qemu-devel, Paul Brook

Peter Maydell writes:

> On 24 March 2012 18:58, Blue Swirl <blauwirbel@gmail.com> wrote:
>> v2: fix patch 1, tweak patch 2 and rebase to master.
>> 
>> URL     git://repo.or.cz/qemu/blueswirl.git
>>        http://repo.or.cz/r/qemu/blueswirl.git
>> 
>> Blue Swirl (6):
>>  arm: move neon_tbl to neon_helper.c
>>  arm: move saturating arithmetic to helper.c
>>  arm: move other arithmetic to helper.c
>>  arm: move cpsr and banked register access to helper.c
>>  arm: move exception and wfi helpers to helper.c
>>  arm: move load and store helpers, switch to AREG0 free mode

> The patches themselves look OK, but do we really want to take
> a 5% performance hit for this cleanup?

I was also wondering this. Is there any plan for recovering that 5% afterwards?


Thanks,
    Lluis

-- 
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth


* Re: [Qemu-devel] [PATCH v2 0/6] ARM: AREG0 conversion
  2012-03-26 12:46   ` Lluís Vilanova
@ 2012-03-26 12:48     ` Andreas Färber
  0 siblings, 0 replies; 14+ messages in thread
From: Andreas Färber @ 2012-03-26 12:48 UTC
  To: Lluís Vilanova; +Cc: Blue Swirl, Peter Maydell, qemu-devel, Paul Brook

On 26.03.2012 14:46, Lluís Vilanova wrote:
> Peter Maydell writes:
> 
>> On 24 March 2012 18:58, Blue Swirl <blauwirbel@gmail.com> wrote:
>>> v2: fix patch 1, tweak patch 2 and rebase to master.
>>>
>>> URL     git://repo.or.cz/qemu/blueswirl.git
>>>        http://repo.or.cz/r/qemu/blueswirl.git
>>>
>>> Blue Swirl (6):
>>>  arm: move neon_tbl to neon_helper.c
>>>  arm: move saturating arithmetic to helper.c
>>>  arm: move other arithmetic to helper.c
>>>  arm: move cpsr and banked register access to helper.c
>>>  arm: move exception and wfi helpers to helper.c
>>>  arm: move load and store helpers, switch to AREG0 free mode
> 
>> The patches themselves look OK, but do we really want to take
>> a 5% performance hit for this cleanup?
> 
> I was also wondering this. Is there any plan for recovering that 5% afterwards?

Maybe by switching to clang? ;)

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg


* Re: [Qemu-devel] [PATCH v2 0/6] ARM: AREG0 conversion
  2012-03-26 11:52 ` Peter Maydell
  2012-03-26 12:46   ` Lluís Vilanova
@ 2012-03-26 13:05   ` Paul Brook
  2012-03-26 17:02     ` Blue Swirl
  1 sibling, 1 reply; 14+ messages in thread
From: Paul Brook @ 2012-03-26 13:05 UTC
  To: Peter Maydell; +Cc: Blue Swirl, qemu-devel

> On 24 March 2012 18:58, Blue Swirl <blauwirbel@gmail.com> wrote:
> > v2: fix patch 1, tweak patch 2 and rebase to master.
> > 
> > URL     git://repo.or.cz/qemu/blueswirl.git
> >        http://repo.or.cz/r/qemu/blueswirl.git
> > 
> > Blue Swirl (6):
> >  arm: move neon_tbl to neon_helper.c
> >  arm: move saturating arithmetic to helper.c
> >  arm: move other arithmetic to helper.c
> >  arm: move cpsr and banked register access to helper.c
> >  arm: move exception and wfi helpers to helper.c
> >  arm: move load and store helpers, switch to AREG0 free mode
> 
> The patches themselves look OK, but do we really want to take
> a 5% performance hit for this cleanup?

I have a similar concern.  I'd like to at least have some idea where this 
slowdown is coming from.

Paul


* Re: [Qemu-devel] [PATCH v2 0/6] ARM: AREG0 conversion
  2012-03-26 13:05   ` Paul Brook
@ 2012-03-26 17:02     ` Blue Swirl
  2012-03-26 17:59       ` Lluís Vilanova
  2012-03-27 13:40       ` Laurent Desnogues
  0 siblings, 2 replies; 14+ messages in thread
From: Blue Swirl @ 2012-03-26 17:02 UTC
  To: Paul Brook; +Cc: Peter Maydell, qemu-devel

On Mon, Mar 26, 2012 at 13:05, Paul Brook <paul@codesourcery.com> wrote:
>> On 24 March 2012 18:58, Blue Swirl <blauwirbel@gmail.com> wrote:
>> > v2: fix patch 1, tweak patch 2 and rebase to master.
>> >
>> > URL     git://repo.or.cz/qemu/blueswirl.git
>> >        http://repo.or.cz/r/qemu/blueswirl.git
>> >
>> > Blue Swirl (6):
>> >  arm: move neon_tbl to neon_helper.c
>> >  arm: move saturating arithmetic to helper.c
>> >  arm: move other arithmetic to helper.c
>> >  arm: move cpsr and banked register access to helper.c
>> >  arm: move exception and wfi helpers to helper.c
>> >  arm: move load and store helpers, switch to AREG0 free mode
>>
>> The patches themselves look OK, but do we really want to take
>> a 5% performance hit for this cleanup?
>
> I have a similar concern.  I'd like to at least have some idea where this
> slowdown is coming from.

At least stack protector is protecting more code than before (for
example TLB miss handler), but could overhead from that amount to 5%?

Otherwise there should be just a few extra register moves here and
there, that should be cheap on modern processors.


* Re: [Qemu-devel] [PATCH v2 0/6] ARM: AREG0 conversion
  2012-03-26 17:02     ` Blue Swirl
@ 2012-03-26 17:59       ` Lluís Vilanova
  2012-03-27 13:40       ` Laurent Desnogues
  1 sibling, 0 replies; 14+ messages in thread
From: Lluís Vilanova @ 2012-03-26 17:59 UTC
  To: Blue Swirl; +Cc: Peter Maydell, Paul Brook, qemu-devel

Blue Swirl writes:

> On Mon, Mar 26, 2012 at 13:05, Paul Brook <paul@codesourcery.com> wrote:
>>> On 24 March 2012 18:58, Blue Swirl <blauwirbel@gmail.com> wrote:
>>> > v2: fix patch 1, tweak patch 2 and rebase to master.
>>> >
>>> > URL     git://repo.or.cz/qemu/blueswirl.git
>>> >        http://repo.or.cz/r/qemu/blueswirl.git
>>> >
>>> > Blue Swirl (6):
>>> >  arm: move neon_tbl to neon_helper.c
>>> >  arm: move saturating arithmetic to helper.c
>>> >  arm: move other arithmetic to helper.c
>>> >  arm: move cpsr and banked register access to helper.c
>>> >  arm: move exception and wfi helpers to helper.c
>>> >  arm: move load and store helpers, switch to AREG0 free mode
>>> 
>>> The patches themselves look OK, but do we really want to take
>>> a 5% performance hit for this cleanup?
>> 
>> I have a similar concern.  I'd like to at least have some idea where this
>> slowdown is coming from.

> At least stack protector is protecting more code than before (for
> example TLB miss handler), but could overhead from that amount to 5%?

Then you can try comparing both builds with a modified configure that does not
add the "-fstack-protector-all" option.

If you want to fine-tune it, you can add
"__attribute__((optimize("no-stack-protector")))" to those functions or just
add:

    #pragma GCC push_options
    #pragma GCC optimize ("no-stack-protector")

at the beginning of the "softmmu_template.h", and:

    #pragma GCC pop_options

at the end of it.
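
As a rough illustration of the pragma approach just described, here is a
minimal sketch. The function sketch_copy_byte and its body are hypothetical
stand-ins for a generated helper, not code from softmmu_template.h. Whether
GCC actually honours "no-stack-protector" through the optimize pragma
depends on the compiler version, so it is worth checking the generated
assembly rather than assuming the canary is gone.

```c
#include <stddef.h>
#include <string.h>

/* Ask GCC to compile everything up to pop_options with a modified option
 * set. These pragmas are GCC-specific; other compilers may ignore them
 * or emit a warning. */
#pragma GCC push_options
#pragma GCC optimize ("no-stack-protector")

/* Hypothetical stand-in for a helper: the local array is the kind of
 * object that normally makes the compiler insert a stack canary. */
unsigned char sketch_copy_byte(const unsigned char *src, size_t len,
                               size_t idx)
{
    unsigned char buf[16];

    if (len > sizeof(buf)) {
        len = sizeof(buf);      /* never copy more than the buffer holds */
    }
    if (idx >= len) {
        return 0;               /* out-of-range index: nothing to read */
    }
    memcpy(buf, src, len);
    return buf[idx];
}

/* Restore the options that were active before push_options. */
#pragma GCC pop_options
```

Code after the pop_options line is compiled with the original options
again, so only the functions inside the bracket are affected.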

Or even better, use it for the whole "target-*/*helper.c" file, as there should
be no user-induced overflow in helpers (unless the instr decoding code in
"translate.c" is exploitable).


Thanks,
  Lluis

-- 
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth


* Re: [Qemu-devel] [PATCH v2 0/6] ARM: AREG0 conversion
  2012-03-26 17:02     ` Blue Swirl
  2012-03-26 17:59       ` Lluís Vilanova
@ 2012-03-27 13:40       ` Laurent Desnogues
  2012-03-27 16:48         ` Blue Swirl
  1 sibling, 1 reply; 14+ messages in thread
From: Laurent Desnogues @ 2012-03-27 13:40 UTC
  To: Blue Swirl; +Cc: Peter Maydell, Paul Brook, qemu-devel

On Mon, Mar 26, 2012 at 7:02 PM, Blue Swirl <blauwirbel@gmail.com> wrote:
[...]
> At least stack protector is protecting more code than before (for
> example TLB miss handler), but could overhead from that amount to 5%?
>
> Otherwise there should be just a few extra register moves here and
> there, that should be cheap on modern processors.

The extra moves might be cheap but their cost is obviously not 0:
on top of using extra CPU core resources, code size is increased
which results in more instruction cache misses.

I didn't like the idea when we discussed it back in May, now it
looks like we have concrete evidence the speed impact is
measurable (though I'd like some more numbers than the rough
5% estimate I gave).


Laurent


* Re: [Qemu-devel] [PATCH v2 0/6] ARM: AREG0 conversion
  2012-03-27 13:40       ` Laurent Desnogues
@ 2012-03-27 16:48         ` Blue Swirl
  2012-03-27 17:01           ` Laurent Desnogues
  0 siblings, 1 reply; 14+ messages in thread
From: Blue Swirl @ 2012-03-27 16:48 UTC
  To: Laurent Desnogues; +Cc: Peter Maydell, Paul Brook, qemu-devel

On Tue, Mar 27, 2012 at 13:40, Laurent Desnogues
<laurent.desnogues@gmail.com> wrote:
> On Mon, Mar 26, 2012 at 7:02 PM, Blue Swirl <blauwirbel@gmail.com> wrote:
> [...]
>> At least stack protector is protecting more code than before (for
>> example TLB miss handler), but could overhead from that amount to 5%?
>>
>> Otherwise there should be just a few extra register moves here and
>> there, that should be cheap on modern processors.
>
> The extra moves might be cheap but their cost is obviously not 0:
> on top of using extra CPU core resources, code size is increased
> which results in more instruction cache misses.
>
> I didn't like the idea when we discussed it back in May, now it
> looks like we have concrete evidence the speed impact is
> measurable (though I'd like some more numbers than the rough
> 5% estimate I gave).

A clearly defined test case running on a host that does not adjust
clock frequencies would be nice. It would be interesting to find out
where exactly the slowdown comes from.

Perhaps the access helpers ({helper,_}_{ld,st}{b,w,l}_mmu) generated
by softmmu_template.h are the culprit. If so, they could be split from
other code and moved to TCG back ends. That way the interface could be
improved while keeping all other cleanups.


* Re: [Qemu-devel] [PATCH v2 0/6] ARM: AREG0 conversion
  2012-03-27 16:48         ` Blue Swirl
@ 2012-03-27 17:01           ` Laurent Desnogues
  2012-03-27 19:59             ` Artyom Tarasenko
  0 siblings, 1 reply; 14+ messages in thread
From: Laurent Desnogues @ 2012-03-27 17:01 UTC
  To: Blue Swirl; +Cc: Peter Maydell, Paul Brook, qemu-devel

On Tue, Mar 27, 2012 at 6:48 PM, Blue Swirl <blauwirbel@gmail.com> wrote:
> On Tue, Mar 27, 2012 at 13:40, Laurent Desnogues
> <laurent.desnogues@gmail.com> wrote:
>> On Mon, Mar 26, 2012 at 7:02 PM, Blue Swirl <blauwirbel@gmail.com> wrote:
>> [...]
>>> At least stack protector is protecting more code than before (for
>>> example TLB miss handler), but could overhead from that amount to 5%?
>>>
>>> Otherwise there should be just a few extra register moves here and
>>> there, that should be cheap on modern processors.
>>
>> The extra moves might be cheap but their cost is obviously not 0:
>> on top of using extra CPU core resources, code size is increased
>> which results in more instruction cache misses.
>>
>> I didn't like the idea when we discussed it back in May, now it
>> looks like we have concrete evidence the speed impact is
>> measurable (though I'd like some more numbers than the rough
>> 5% estimate I gave).
>
> A clearly defined test case running on a host that does not adjust
> clock frequencies would be nice. It would be interesting to find out
> where exactly the slowdown comes from.
>
> Perhaps the access helpers ({helper,_}_{ld,st}{b,w,l}_mmu) generated
> by softmmu_template.h are the culprit. If so, they could be split from
> other code and moved to TCG back ends. That way the interface could be
> improved while keeping all other cleanups.

I also get a slowdown running in user mode, so I don't think
improving the mmu ld/st will completely remove the issue.
In that case the slowdown comes from the extra move
instructions for helper calls.  The ARM target uses way too
many helpers, but that's another discussion :-)


Laurent


* Re: [Qemu-devel] [PATCH v2 0/6] ARM: AREG0 conversion
  2012-03-27 17:01           ` Laurent Desnogues
@ 2012-03-27 19:59             ` Artyom Tarasenko
  2012-03-29 15:42               ` Laurent Desnogues
  0 siblings, 1 reply; 14+ messages in thread
From: Artyom Tarasenko @ 2012-03-27 19:59 UTC
  To: Laurent Desnogues
  Cc: Blue Swirl, Peter Maydell, Lluís Vilanova, Paul Brook, qemu-devel

On Tue, Mar 27, 2012 at 7:01 PM, Laurent Desnogues
<laurent.desnogues@gmail.com> wrote:
> On Tue, Mar 27, 2012 at 6:48 PM, Blue Swirl <blauwirbel@gmail.com> wrote:
>> On Tue, Mar 27, 2012 at 13:40, Laurent Desnogues
>> <laurent.desnogues@gmail.com> wrote:
>>> On Mon, Mar 26, 2012 at 7:02 PM, Blue Swirl <blauwirbel@gmail.com> wrote:
>>> [...]
>>>> At least stack protector is protecting more code than before (for
>>>> example TLB miss handler), but could overhead from that amount to 5%?
>>>>
>>>> Otherwise there should be just a few extra register moves here and
>>>> there, that should be cheap on modern processors.
>>>
>>> The extra moves might be cheap but their cost is obviously not 0:
>>> on top of using extra CPU core resources, code size is increased
>>> which results in more instruction cache misses.
>>>
>>> I didn't like the idea when we discussed it back in May, now it
>>> looks like we have concrete evidence the speed impact is
>>> measurable (though I'd like some more numbers than the rough
>>> 5% estimate I gave).
>>
>> A clearly defined test case running on a host that does not adjust
>> clock frequencies would be nice. It would be interesting to find out
>> where exactly the slowdown comes from.
>>
>> Perhaps the access helpers ({helper,_}_{ld,st}{b,w,l}_mmu) generated
>> by softmmu_template.h are the culprit. If so, they could be split from
>> other code and moved to TCG back ends. That way the interface could be
>> improved while keeping all other cleanups.
>
> I also get a slowdown running in user mode, so I don't think
> improving the mmu ld/st will completely remove the issue.
> In that case the slowdown comes from the extra move
> instructions for helper calls.  The ARM target uses way too
> many helpers, but that's another discussion :-)
>

Have you tried compiling without -fstack-protector-all as suggested by Lluís?
I observe a similar slowdown on a sparc target, and there compiling
without stack protection definitely helps.


Artyom

-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/search/label/qemu


* Re: [Qemu-devel] [PATCH v2 0/6] ARM: AREG0 conversion
  2012-03-27 19:59             ` Artyom Tarasenko
@ 2012-03-29 15:42               ` Laurent Desnogues
  2012-03-29 16:28                 ` Richard Henderson
  0 siblings, 1 reply; 14+ messages in thread
From: Laurent Desnogues @ 2012-03-29 15:42 UTC
  To: Artyom Tarasenko
  Cc: Blue Swirl, Peter Maydell, Lluís Vilanova, Paul Brook, qemu-devel

On Tue, Mar 27, 2012 at 9:59 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
> On Tue, Mar 27, 2012 at 7:01 PM, Laurent Desnogues
> <laurent.desnogues@gmail.com> wrote:
>> On Tue, Mar 27, 2012 at 6:48 PM, Blue Swirl <blauwirbel@gmail.com> wrote:
>>> On Tue, Mar 27, 2012 at 13:40, Laurent Desnogues
>>> <laurent.desnogues@gmail.com> wrote:
>>>> On Mon, Mar 26, 2012 at 7:02 PM, Blue Swirl <blauwirbel@gmail.com> wrote:
>>>> [...]
>>>>> At least stack protector is protecting more code than before (for
>>>>> example TLB miss handler), but could overhead from that amount to 5%?
>>>>>
>>>>> Otherwise there should be just a few extra register moves here and
>>>>> there, that should be cheap on modern processors.
>>>>
>>>> The extra moves might be cheap but their cost is obviously not 0:
>>>> on top of using extra CPU core resources, code size is increased
>>>> which results in more instruction cache misses.
>>>>
>>>> I didn't like the idea when we discussed it back in May, now it
>>>> looks like we have concrete evidence the speed impact is
>>>> measurable (though I'd like some more numbers than the rough
>>>> 5% estimate I gave).
>>>
>>> A clearly defined test case running on a host that does not adjust
>>> clock frequencies would be nice. It would be interesting to find out
>>> where exactly the slowdown comes from.
>>>
>>> Perhaps the access helpers ({helper,_}_{ld,st}{b,w,l}_mmu) generated
>>> by softmmu_template.h are the culprit. If so, they could be split from
>>> other code and moved to TCG back ends. That way the interface could be
>>> improved while keeping all other cleanups.
>>
>> I also get a slowdown running in user mode, so I don't think
>> improving the mmu ld/st will completely remove the issue.
>> In that case the slowdown comes from the extra move
>> instructions for helper calls.  The ARM target uses way too
>> many helpers, but that's another discussion :-)
>>
>
> Have you tried compiling without -fstack-protector-all as suggested by Lluís?
> I observe a similar slowdown on a sparc target, and there compiling
> without stack protection definitely helps.

That will indeed probably make the real problem, which is that
this patch increases the size of generated code, less obvious
on small benchmarks that don't put pressure on instruction
cache.  But the fact is that generated code is larger and will
have to execute more instructions, so no matter what you do,
this will have an impact on speed.


Laurent


* Re: [Qemu-devel] [PATCH v2 0/6] ARM: AREG0 conversion
  2012-03-29 15:42               ` Laurent Desnogues
@ 2012-03-29 16:28                 ` Richard Henderson
  0 siblings, 0 replies; 14+ messages in thread
From: Richard Henderson @ 2012-03-29 16:28 UTC
  To: Laurent Desnogues
  Cc: Peter Maydell, qemu-devel, Blue Swirl, Paul Brook,
	Lluís Vilanova, Artyom Tarasenko

On 03/29/2012 11:42 AM, Laurent Desnogues wrote:
> That will indeed probably make the real problem, which is that
> this patch increases the size of generated code, less obvious
> on small benchmarks that don't put pressure on instruction
> cache.  But the fact is that generated code is larger and will
> have to execute more instructions, so no matter what you do,
> this will have an impact on speed.

While this is true, the benefit of using a more standard calling
convention on reliability and debug-ability is enormous.

Consider the i686 host, where we currently abscond with EBP.
While I'm not aware of any current problems with -O0 or spill
failures under optimization, it's not inconceivable.

Consider the sparc-linux host, where we have *no* call-saved global
register at all, and (currently) try very hard to use a call-clobbered
global register, with occasionally disastrous results.
See the patch set I posted recently, where I give up on this entirely
and make sparc use a TLS variable instead of a hard register.
This costs the Sparc host some speed in exchange for better reliability.
The conversion to explicit env arguments recovers essentially all of
that speed regression, since we then receive ENV in %o0 instead of
having to read it from TLS.
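
To make the difference in calling style concrete, here is a minimal sketch
of a helper written both ways. All names (SketchCPUState, sketch_env,
helper_add_implicit, helper_add_explicit) are hypothetical and much reduced;
they are not the real target-arm helpers. A plain global pointer stands in
for the global register variable so that the sketch stays portable.

```c
#include <stdint.h>

/* Hypothetical, heavily reduced CPU state; not QEMU's real structure. */
typedef struct SketchCPUState {
    uint32_t regs[16];
} SketchCPUState;

/* Old style: helpers reach the state through a global. In QEMU this was
 * a global register variable bound to AREG0; an ordinary global pointer
 * is used here as a portable stand-in. */
SketchCPUState *sketch_env;

uint32_t helper_add_implicit(uint32_t rn, uint32_t rm)
{
    return sketch_env->regs[rn] + sketch_env->regs[rm];
}

/* AREG0-free style: the state arrives as an ordinary first argument, so
 * the host's normal calling convention decides where it lives. */
uint32_t helper_add_explicit(SketchCPUState *env, uint32_t rn, uint32_t rm)
{
    return env->regs[rn] + env->regs[rm];
}
```

In the explicit form the caller has to place env in the first argument
slot before each call, which is where the extra moves discussed earlier in
the thread come from. In exchange, the helper no longer depends on a
reserved host register or on a TLS lookup.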


r~
