* [PATCH] x86/irq: Improve local_irq_restore() code generation and performance
From: Andrew Cooper @ 2021-12-06 13:38 UTC
To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu
popf is a horribly expensive instruction, while sti is an optimised fastpath.
Switching popf for a conditional branch and sti caused an 8% perf improvement
in various Linux measurements.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
xen/include/asm-x86/system.h | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h
index 65e63de69a67..4be235472ecd 100644
--- a/xen/include/asm-x86/system.h
+++ b/xen/include/asm-x86/system.h
@@ -267,13 +267,8 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
 })
 #define local_irq_restore(x)                                     \
 ({                                                               \
-    BUILD_BUG_ON(sizeof(x) != sizeof(long));                     \
-    asm volatile ( "pushfq\n\t"                                  \
-                   "andq %0, (%%rsp)\n\t"                        \
-                   "orq %1, (%%rsp)\n\t"                         \
-                   "popfq"                                       \
-                   : : "i?r" ( ~X86_EFLAGS_IF ),                 \
-                       "ri" ( (x) & X86_EFLAGS_IF ) );           \
+    if ( (x) & X86_EFLAGS_IF )                                   \
+        local_irq_enable();                                      \
 })

 static inline int local_irq_is_enabled(void)
--
2.11.0
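
For readers following along, here is a minimal user-space sketch of why the two forms of the macro behave the same in the usual save/restore pairing. It is a model, not Xen code: the `sim_*` names, the `sim_eflags` variable and the `run_pair()` helper are all hypothetical, and plain bit operations on a variable stand in for the privileged `sti`, `cli` and `popfq` instructions.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit 9 of EFLAGS is the interrupt flag (IF). */
#define X86_EFLAGS_IF 0x200UL

/* Hypothetical stand-in for the real EFLAGS register. */
static unsigned long sim_eflags = X86_EFLAGS_IF;

static void sim_irq_enable(void)  { sim_eflags |= X86_EFLAGS_IF; }   /* models sti */
static void sim_irq_disable(void) { sim_eflags &= ~X86_EFLAGS_IF; }  /* models cli */

static bool sim_irq_is_enabled(void)
{
    return (sim_eflags & X86_EFLAGS_IF) != 0;
}

/* Models local_irq_save(): record the flags, then disable interrupts. */
static unsigned long sim_irq_save(void)
{
    unsigned long flags = sim_eflags;

    sim_irq_disable();
    return flags;
}

/* Old form: pushfq/andq/orq/popfq copies IF from x, in either direction. */
static void sim_irq_restore_popf(unsigned long x)
{
    sim_eflags = (sim_eflags & ~X86_EFLAGS_IF) | (x & X86_EFLAGS_IF);
}

/* New form: only ever re-enables, and leaves IF alone otherwise. */
static void sim_irq_restore_sti(unsigned long x)
{
    if ( x & X86_EFLAGS_IF )
        sim_irq_enable();
}

/*
 * Run one save/restore pair from the given starting state and report
 * whether interrupts end up enabled.
 */
static bool run_pair(bool start_enabled, void (*restore)(unsigned long))
{
    unsigned long flags;

    if ( start_enabled )
        sim_irq_enable();
    else
        sim_irq_disable();

    flags = sim_irq_save();
    assert(!sim_irq_is_enabled());  /* the critical section runs with IF clear */
    restore(flags);

    return sim_irq_is_enabled();
}
```

Because the save step always leaves IF clear, a restore that follows it only ever has to decide whether to set IF again, so in this model both forms return to the starting state whichever state that was.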
* Re: [PATCH] x86/irq: Improve local_irq_restore() code generation and performance
From: Andrew Cooper @ 2021-12-06 13:55 UTC
To: Andrew Cooper, Xen-devel; +Cc: Jan Beulich, Roger Pau Monné, Wei Liu
On 06/12/2021 13:38, Andrew Cooper wrote:
> popf is a horribly expensive instruction, while sti is an optimised fastpath.
> Switching popf for a conditional branch and sti caused an 8% perf improvement
> in various Linux measurements.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> CC: Wei Liu <wl@xen.org>
> ---
> xen/include/asm-x86/system.h | 9 ++-------
> 1 file changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h
> index 65e63de69a67..4be235472ecd 100644
> --- a/xen/include/asm-x86/system.h
> +++ b/xen/include/asm-x86/system.h
> @@ -267,13 +267,8 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
> })
> #define local_irq_restore(x) \
> ({ \
> - BUILD_BUG_ON(sizeof(x) != sizeof(long)); \
> - asm volatile ( "pushfq\n\t" \
> - "andq %0, (%%rsp)\n\t" \
> - "orq %1, (%%rsp)\n\t" \
> - "popfq" \
> - : : "i?r" ( ~X86_EFLAGS_IF ), \
> - "ri" ( (x) & X86_EFLAGS_IF ) ); \
> + if ( (x) & X86_EFLAGS_IF ) \
> + local_irq_enable(); \
> })
Bah. There's still the one total abuse of local_irq_restore() to
disable interrupts.
I'll do a pre-requisite patch.
~Andrew
* Re: [PATCH] x86/irq: Improve local_irq_restore() code generation and performance
From: Jan Beulich @ 2021-12-06 14:07 UTC
To: Andrew Cooper, Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel
On 06.12.2021 14:55, Andrew Cooper wrote:
> On 06/12/2021 13:38, Andrew Cooper wrote:
>> popf is a horribly expensive instruction, while sti is an optimised fastpath.
>> Switching popf for a conditional branch and sti caused an 8% perf improvement
>> in various Linux measurements.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> CC: Jan Beulich <JBeulich@suse.com>
>> CC: Roger Pau Monné <roger.pau@citrix.com>
>> CC: Wei Liu <wl@xen.org>
>> ---
>> xen/include/asm-x86/system.h | 9 ++-------
>> 1 file changed, 2 insertions(+), 7 deletions(-)
>>
>> diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h
>> index 65e63de69a67..4be235472ecd 100644
>> --- a/xen/include/asm-x86/system.h
>> +++ b/xen/include/asm-x86/system.h
>> @@ -267,13 +267,8 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
>> })
>> #define local_irq_restore(x) \
>> ({ \
>> - BUILD_BUG_ON(sizeof(x) != sizeof(long)); \
>> - asm volatile ( "pushfq\n\t" \
>> - "andq %0, (%%rsp)\n\t" \
>> - "orq %1, (%%rsp)\n\t" \
>> - "popfq" \
>> - : : "i?r" ( ~X86_EFLAGS_IF ), \
>> - "ri" ( (x) & X86_EFLAGS_IF ) ); \
>> + if ( (x) & X86_EFLAGS_IF ) \
>> + local_irq_enable(); \
>> })
>
> Bah. There's still the one total abuse of local_irq_restore() to
> disable interrupts.
Question is whether that's really to be considered an abuse: To me
"restore" doesn't mean only "maybe re-enable", but also "maybe
re-disable". And a conditional STI-or-CLI is likely still better
than POPF.
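
The case under discussion can be sketched in user space as follows. This is a model under stated assumptions rather than Xen code: `sim_eflags` and every `sim_*` function are hypothetical names, and bit operations on a variable stand in for `sti`, `cli` and `popfq`. The third function shows one way the conditional STI-or-CLI form could look.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit 9 of EFLAGS is the interrupt flag (IF). */
#define X86_EFLAGS_IF 0x200UL

/* Hypothetical stand-in for the real EFLAGS register. */
static unsigned long sim_eflags;

/* POPF-based form: IF is copied from x, whatever its current value. */
static void sim_restore_popf(unsigned long x)
{
    sim_eflags = (sim_eflags & ~X86_EFLAGS_IF) | (x & X86_EFLAGS_IF);
}

/* Form from the patch: a conditional branch and sti, never cli. */
static void sim_restore_sti_only(unsigned long x)
{
    if ( x & X86_EFLAGS_IF )
        sim_eflags |= X86_EFLAGS_IF;       /* models sti */
}

/* Conditional STI-or-CLI: still branch-based, but handles both directions. */
static void sim_restore_sti_or_cli(unsigned long x)
{
    if ( x & X86_EFLAGS_IF )
        sim_eflags |= X86_EFLAGS_IF;       /* models sti */
    else
        sim_eflags &= ~X86_EFLAGS_IF;      /* models cli */
}

/*
 * Start with interrupts enabled, "restore" a value with IF clear, and
 * report whether interrupts are still enabled afterwards.
 */
static bool sim_enabled_after_restore_to_disabled(void (*restore)(unsigned long))
{
    sim_eflags = X86_EFLAGS_IF;
    assert((sim_eflags & X86_EFLAGS_IF) != 0);

    restore(0);

    return (sim_eflags & X86_EFLAGS_IF) != 0;
}
```

In this model only the sti-only form leaves interrupts enabled when asked to restore a disabled state, which is exactly the situation a caller relying on "re-disable" would hit.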
Jan
* Re: [PATCH] x86/irq: Improve local_irq_restore() code generation and performance
From: Andrew Cooper @ 2021-12-06 15:10 UTC
To: Jan Beulich, Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel
On 06/12/2021 14:07, Jan Beulich wrote:
> On 06.12.2021 14:55, Andrew Cooper wrote:
>> On 06/12/2021 13:38, Andrew Cooper wrote:
>>> popf is a horribly expensive instruction, while sti is an optimised fastpath.
>>> Switching popf for a conditional branch and sti caused an 8% perf improvement
>>> in various Linux measurements.
>>>
>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>> ---
>>> CC: Jan Beulich <JBeulich@suse.com>
>>> CC: Roger Pau Monné <roger.pau@citrix.com>
>>> CC: Wei Liu <wl@xen.org>
>>> ---
>>> xen/include/asm-x86/system.h | 9 ++-------
>>> 1 file changed, 2 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h
>>> index 65e63de69a67..4be235472ecd 100644
>>> --- a/xen/include/asm-x86/system.h
>>> +++ b/xen/include/asm-x86/system.h
>>> @@ -267,13 +267,8 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
>>> })
>>> #define local_irq_restore(x) \
>>> ({ \
>>> - BUILD_BUG_ON(sizeof(x) != sizeof(long)); \
>>> - asm volatile ( "pushfq\n\t" \
>>> - "andq %0, (%%rsp)\n\t" \
>>> - "orq %1, (%%rsp)\n\t" \
>>> - "popfq" \
>>> - : : "i?r" ( ~X86_EFLAGS_IF ), \
>>> - "ri" ( (x) & X86_EFLAGS_IF ) ); \
>>> + if ( (x) & X86_EFLAGS_IF ) \
>>> + local_irq_enable(); \
>>> })
>> Bah. There's still the one total abuse of local_irq_restore() to
>> disable interrupts.
> Question is whether that's really to be considered an abuse:
These are Linux's APIs, not ours, and they've spoken on the matter.
Furthermore, I agree with this being an abuse of the mechanism.
> To me
> "restore" doesn't mean only "maybe re-enable", but also "maybe
> re-disable".
nor does "save" mean "save and disable", but that's what it does.
The naming may not be completely ideal, but the expected usage is very
much one way.
> And a conditional STI-or-CLI is likely still better
> than POPF.
It likely is better than popf, but for one single abuse which can be
written in a better way anyway, it's really not worth it.
~Andrew
* Re: [PATCH] x86/irq: Improve local_irq_restore() code generation and performance
From: Jan Beulich @ 2021-12-06 15:15 UTC
To: Andrew Cooper, Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel
On 06.12.2021 16:10, Andrew Cooper wrote:
> On 06/12/2021 14:07, Jan Beulich wrote:
>> On 06.12.2021 14:55, Andrew Cooper wrote:
>>> On 06/12/2021 13:38, Andrew Cooper wrote:
>>>> popf is a horribly expensive instruction, while sti is an optimised fastpath.
>>>> Switching popf for a conditional branch and sti caused an 8% perf improvement
>>>> in various Linux measurements.
>>>>
>>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>>> ---
>>>> CC: Jan Beulich <JBeulich@suse.com>
>>>> CC: Roger Pau Monné <roger.pau@citrix.com>
>>>> CC: Wei Liu <wl@xen.org>
>>>> ---
>>>> xen/include/asm-x86/system.h | 9 ++-------
>>>> 1 file changed, 2 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h
>>>> index 65e63de69a67..4be235472ecd 100644
>>>> --- a/xen/include/asm-x86/system.h
>>>> +++ b/xen/include/asm-x86/system.h
>>>> @@ -267,13 +267,8 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
>>>> })
>>>> #define local_irq_restore(x) \
>>>> ({ \
>>>> - BUILD_BUG_ON(sizeof(x) != sizeof(long)); \
>>>> - asm volatile ( "pushfq\n\t" \
>>>> - "andq %0, (%%rsp)\n\t" \
>>>> - "orq %1, (%%rsp)\n\t" \
>>>> - "popfq" \
>>>> - : : "i?r" ( ~X86_EFLAGS_IF ), \
>>>> - "ri" ( (x) & X86_EFLAGS_IF ) ); \
>>>> + if ( (x) & X86_EFLAGS_IF ) \
>>>> + local_irq_enable(); \
>>>> })
>>> Bah. There's still the one total abuse of local_irq_restore() to
>>> disable interrupts.
>> Question is whether that's really to be considered an abuse:
>
> These are Linux's APIs, not ours, and they've spoken on the matter.
> Furthermore, I agree with this being an abuse of the mechanism.
>
>> To me
>> "restore" doesn't mean only "maybe re-enable", but also "maybe
>> re-disable".
>
> nor does "save" mean "save and disable", but that's what it does.
>
> The naming may not be completely ideal, but the expected usage is very
> much one way.
>
>> And a conditional STI-or-CLI is likely still better
>> than POPF.
>
> It likely is better than popf, but for one single abuse which can be
> written in a better way anyway, it's really not worth it.
Fine with me as long as we can be very certain that it's really only
one such case.
Jan