All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH] exec-all.h: Make MAX_OP_PER_INSTR large enough for target-arm's uses
@ 2011-06-22 14:16 Peter Maydell
  2011-07-06 11:15 ` Peter Maydell
  0 siblings, 1 reply; 3+ messages in thread
From: Peter Maydell @ 2011-06-22 14:16 UTC (permalink / raw)
  To: qemu-devel; +Cc: patches

The target-arm frontend's worst-case TCG ops per instr is 194 (and in
general many of the "load multiple registers" ARM instructions generate
more than 100 TCG ops). Raise MAX_OP_PER_INSTR accordingly to avoid
possible buffer overruns.

Since it doesn't make any sense for the "64 bit guest on 32 bit host"
case to have a smaller limit than the normal case, we collapse the
two cases back into each other again.

(This increase costs us about 14K in extra static buffer space and
21K of extra margin at the end of a 32MB codegen buffer.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
You might recall the patchset which moves the Neon load/store multiple
instructions to helper functions, and which turns out to slow them
down rather a lot. This is the other approach, which is just to
raise the limit so that the existing implementations don't risk
buffer overruns. The extra memory costs are tiny IMHO.
(The Neon instructions are the worst offenders but the VFP load/store
multiple insns also breach the previous limit. I think we should
consider an implementation of an instruction that's been basically
the same since VFP support was added to QEMU in 2005 to be an
acceptable one, and make sure our buffer sizes cope with it :-))

 exec-all.h |    6 +-----
 1 files changed, 1 insertions(+), 5 deletions(-)

diff --git a/exec-all.h b/exec-all.h
index 2a13a95..ef5f5b6 100644
--- a/exec-all.h
+++ b/exec-all.h
@@ -43,11 +43,7 @@ typedef ram_addr_t tb_page_addr_t;
 typedef struct TranslationBlock TranslationBlock;
 
 /* XXX: make safe guess about sizes */
-#if (HOST_LONG_BITS == 32) && (TARGET_LONG_BITS == 64)
-#define MAX_OP_PER_INSTR 128
-#else
-#define MAX_OP_PER_INSTR 96
-#endif
+#define MAX_OP_PER_INSTR 208
 
 #if HOST_LONG_BITS == 32
 #define MAX_OPC_PARAM_PER_ARG 2
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [Qemu-devel] [PATCH] exec-all.h: Make MAX_OP_PER_INSTR large enough for target-arm's uses
  2011-06-22 14:16 [Qemu-devel] [PATCH] exec-all.h: Make MAX_OP_PER_INSTR large enough for target-arm's uses Peter Maydell
@ 2011-07-06 11:15 ` Peter Maydell
  2011-07-12 20:52   ` Blue Swirl
  0 siblings, 1 reply; 3+ messages in thread
From: Peter Maydell @ 2011-07-06 11:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: patches

Ping?

On 22 June 2011 15:16, Peter Maydell <peter.maydell@linaro.org> wrote:
> The target-arm frontend's worst-case TCG ops per instr is 194 (and in
> general many of the "load multiple registers" ARM instructions generate
> more than 100 TCG ops). Raise MAX_OP_PER_INSTR accordingly to avoid
> possible buffer overruns.
>
> Since it doesn't make any sense for the "64 bit guest on 32 bit host"
> case to have a smaller limit than the normal case, we collapse the
> two cases back into each other again.
>
> (This increase costs us about 14K in extra static buffer space and
> 21K of extra margin at the end of a 32MB codegen buffer.)
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> You might recall the patchset which moves the Neon load/store multiple
> instructions to helper functions, and which turns out to slow them
> down rather a lot. This is the other approach, which is just to
> raise the limit so that the existing implementations don't risk
> buffer overruns. The extra memory costs are tiny IMHO.
> (The Neon instructions are the worst offenders but the VFP load/store
> multiple insns also breach the previous limit. I think we should
> consider an implementation of an instruction that's been basically
> the same since VFP support was added to QEMU in 2005 to be an
> acceptable one, and make sure our buffer sizes cope with it :-))
>
>  exec-all.h |    6 +-----
>  1 files changed, 1 insertions(+), 5 deletions(-)
>
> diff --git a/exec-all.h b/exec-all.h
> index 2a13a95..ef5f5b6 100644
> --- a/exec-all.h
> +++ b/exec-all.h
> @@ -43,11 +43,7 @@ typedef ram_addr_t tb_page_addr_t;
>  typedef struct TranslationBlock TranslationBlock;
>
>  /* XXX: make safe guess about sizes */
> -#if (HOST_LONG_BITS == 32) && (TARGET_LONG_BITS == 64)
> -#define MAX_OP_PER_INSTR 128
> -#else
> -#define MAX_OP_PER_INSTR 96
> -#endif
> +#define MAX_OP_PER_INSTR 208
>
>  #if HOST_LONG_BITS == 32
>  #define MAX_OPC_PARAM_PER_ARG 2
> --
> 1.7.1

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Qemu-devel] [PATCH] exec-all.h: Make MAX_OP_PER_INSTR large enough for target-arm's uses
  2011-07-06 11:15 ` Peter Maydell
@ 2011-07-12 20:52   ` Blue Swirl
  0 siblings, 0 replies; 3+ messages in thread
From: Blue Swirl @ 2011-07-12 20:52 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel, patches

Thanks, applied.

On Wed, Jul 6, 2011 at 2:15 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> Ping?
>
> On 22 June 2011 15:16, Peter Maydell <peter.maydell@linaro.org> wrote:
>> The target-arm frontend's worst-case TCG ops per instr is 194 (and in
>> general many of the "load multiple registers" ARM instructions generate
>> more than 100 TCG ops). Raise MAX_OP_PER_INSTR accordingly to avoid
>> possible buffer overruns.
>>
>> Since it doesn't make any sense for the "64 bit guest on 32 bit host"
>> case to have a smaller limit than the normal case, we collapse the
>> two cases back into each other again.
>>
>> (This increase costs us about 14K in extra static buffer space and
>> 21K of extra margin at the end of a 32MB codegen buffer.)
>>
>> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
>> ---
>> You might recall the patchset which moves the Neon load/store multiple
>> instructions to helper functions, and which turns out to slow them
>> down rather a lot. This is the other approach, which is just to
>> raise the limit so that the existing implementations don't risk
>> buffer overruns. The extra memory costs are tiny IMHO.
>> (The Neon instructions are the worst offenders but the VFP load/store
>> multiple insns also breach the previous limit. I think we should
>> consider an implementation of an instruction that's been basically
>> the same since VFP support was added to QEMU in 2005 to be an
>> acceptable one, and make sure our buffer sizes cope with it :-))
>>
>>  exec-all.h |    6 +-----
>>  1 files changed, 1 insertions(+), 5 deletions(-)
>>
>> diff --git a/exec-all.h b/exec-all.h
>> index 2a13a95..ef5f5b6 100644
>> --- a/exec-all.h
>> +++ b/exec-all.h
>> @@ -43,11 +43,7 @@ typedef ram_addr_t tb_page_addr_t;
>>  typedef struct TranslationBlock TranslationBlock;
>>
>>  /* XXX: make safe guess about sizes */
>> -#if (HOST_LONG_BITS == 32) && (TARGET_LONG_BITS == 64)
>> -#define MAX_OP_PER_INSTR 128
>> -#else
>> -#define MAX_OP_PER_INSTR 96
>> -#endif
>> +#define MAX_OP_PER_INSTR 208
>>
>>  #if HOST_LONG_BITS == 32
>>  #define MAX_OPC_PARAM_PER_ARG 2
>> --
>> 1.7.1
>
>

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2011-07-12 20:52 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-06-22 14:16 [Qemu-devel] [PATCH] exec-all.h: Make MAX_OP_PER_INSTR large enough for target-arm's uses Peter Maydell
2011-07-06 11:15 ` Peter Maydell
2011-07-12 20:52   ` Blue Swirl

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.