* x86 system call (using sysenter) perf regression
@ 2021-07-12  4:17 vivek thakkar
  2021-07-12 20:50 ` Andy Lutomirski
  0 siblings, 1 reply; 3+ messages in thread
From: vivek thakkar @ 2021-07-12  4:17 UTC
  To: x86; +Cc: linux-arch

We ran a test to measure the syscall performance on two different
kernels (v4.9.x and v4.0.9). The program is as simple as this:

for (int i =0; i< 100'000'000; i++) {
     syscall(SYS_getpid);
}

The program was built for x86 and was using the "vdso" mechanism to
generate the system call and we could confirm that it was
transitioning into the kernel using sysenter.

We find that each system call takes 10ns longer on 4.9.x than on
4.0.9. On deeper analysis, we found that the newer kernel version
executes 40 more instructions, while the vdso-based user-space
transition mechanism remains the same.
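
For reference, a per-call figure like the one above could be obtained
roughly as follows. This is only a minimal sketch, not the original
test program: the helper name ns_per_getpid and the use of
clock_gettime(CLOCK_MONOTONIC) for timing are assumptions made for
illustration.

```cpp
#include <cassert>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

// Hypothetical helper (not from the original test): returns the average
// wall-clock time, in nanoseconds, of one getpid system call.
double ns_per_getpid(long iterations) {
    assert(iterations > 0);

    timespec start{}, end{};
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; i++) {
        // Go through syscall() so every iteration really enters the kernel.
        syscall(SYS_getpid);
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    // Combine the seconds and nanoseconds fields into one elapsed value.
    double elapsed_ns = (end.tv_sec - start.tv_sec) * 1e9 +
                        (end.tv_nsec - start.tv_nsec);
    return elapsed_ns / iterations;
}
```

Calling ns_per_getpid(100'000'000) from main on each kernel and
comparing the two results would give the per-call difference; the
loop overhead is included in both numbers and should largely cancel.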

commit 5f310f739b4cc343f3f087681e41bbc2f0ce902d
Author: Andy Lutomirski <luto@kernel.org>
Date:   Mon Oct 5 17:48:15 2015 -0700

    x86/entry/32: Re-implement SYSENTER using the new C path

Is that something that is already known to the community?

Regards,
Vivek Thakkar


* Re: x86 system call (using sysenter) perf regression
  2021-07-12  4:17 x86 system call (using sysenter) perf regression vivek thakkar
@ 2021-07-12 20:50 ` Andy Lutomirski
  2021-07-13  2:36   ` vivek thakkar
  0 siblings, 1 reply; 3+ messages in thread
From: Andy Lutomirski @ 2021-07-12 20:50 UTC
  To: vivek thakkar, the arch/x86 maintainers; +Cc: linux-arch



On Sun, Jul 11, 2021, at 9:17 PM, vivek thakkar wrote:
> We ran a test to measure the syscall performance on two different
> kernels (v4.9.x and v4.0.9). The program is as simple as this:
> 
> for (int i =0; i< 100'000'000; i++) {
>      syscall(SYS_getpid);
> }
> 
> The program was built for x86 and was using the "vdso" mechanism to
> generate the system call and we could confirm that it was
> transitioning into the kernel using sysenter.
> 
> We find that each system call takes 10ns longer on 4.9.x than on
> 4.0.9. On deeper analysis, we found that the newer kernel version
> executes 40 more instructions, while the vdso-based user-space
> transition mechanism remains the same.
> 
> commit 5f310f739b4cc343f3f087681e41bbc2f0ce902d
> Author: Andy Lutomirski <luto@kernel.org>
> Date:   Mon Oct 5 17:48:15 2015 -0700
> 
>     x86/entry/32: Re-implement SYSENTER using the new C path
> 
> Is that something that is already known to the community?
> 

Yes. We made a conscious decision to trade a bit of performance for a lot of maintainability, especially for a legacy-ish path. The old SYSENTER code was a mess.

> Regards,
> Vivek Thakkar
> 


* Re: x86 system call (using sysenter) perf regression
  2021-07-12 20:50 ` Andy Lutomirski
@ 2021-07-13  2:36   ` vivek thakkar
  0 siblings, 0 replies; 3+ messages in thread
From: vivek thakkar @ 2021-07-13  2:36 UTC
  To: Andy Lutomirski; +Cc: the arch/x86 maintainers, linux-arch

> Yes. We made a conscious decision to trade a bit of performance for a lot of maintainability, especially for a legacy-ish path. The old SYSENTER code was a mess.
>

Thanks Andy.

Regards,
Vivek Thakkar
