All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andy Lutomirski <luto@kernel.org>
To: christophe leroy <christophe.leroy@c-s.fr>
Cc: Andy Lutomirski <luto@kernel.org>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Arnd Bergmann <arnd@arndb.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Vincenzo Frascino <vincenzo.frascino@arm.com>,
	LKML <linux-kernel@vger.kernel.org>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
	"open list:MIPS" <linux-mips@vger.kernel.org>,
	X86 ML <x86@kernel.org>
Subject: Re: [RFC PATCH v2 04/10] lib: vdso: get pointer to vdso data from the arch
Date: Tue, 24 Dec 2019 04:41:48 -0800	[thread overview]
Message-ID: <CALCETrUR-NgVMeTPh3TmgNSTsA=2xE03_KBeO9DSk0P-JxD_fQ@mail.gmail.com> (raw)
In-Reply-To: <D2614EC4-5B80-4846-994D-22730ACD44A1@amacapital.net>

On Tue, Dec 24, 2019 at 4:15 AM Andy Lutomirski <luto@amacapital.net> wrote:
>
>
>
> > On Dec 24, 2019, at 7:53 PM, christophe leroy <christophe.leroy@c-s.fr> wrote:
> >
> > 
> >
> >> Le 24/12/2019 à 03:27, Andy Lutomirski a écrit :
> >>> On Mon, Dec 23, 2019 at 6:31 AM Christophe Leroy
> >>> <christophe.leroy@c-s.fr> wrote:
> >>>
> >>> On powerpc, __arch_get_vdso_data() clobbers the link register,
> >>> requiring the caller to set a stack frame in order to save it.
> >>>
> >>> As the parent function already has to set a stack frame and save
> >>> the link register to call the C vdso function, retriving the
> >>> vdso data pointer there is lighter.
> >> I'm confused.  Can't you inline __arch_get_vdso_data()?  Or is the
> >> issue that you can't retrieve the program counter on power without
> >> clobbering the link register?
> >
> > Yes it can be inlined (I did it in V1 https://patchwork.ozlabs.org/patch/1180571/), but you can't do it without clobbering the link register, because the only way to get the program counter is to do to as if you were calling another function but you call to the address which just follows where you are, so that it sets LR which the simulated return address which corresponds to the address following the branch.
> >
> > static __always_inline
> > const struct vdso_data *__arch_get_vdso_data(void)
> > {
> >    void *ptr;
> >
> >    asm volatile(
> >        "    bcl    20, 31, .+4;\n"
> >        "    mflr    %0;\n"
> >        "    addi    %0, %0, __kernel_datapage_offset - (.-4);\n"
> >        : "=b"(ptr) : : "lr");
> >
> >    return ptr + *(unsigned long *)ptr;
> > }
> >
> >> I would imagine that this patch generates worse code on any
> >> architecture with PC-relative addressing modes (which includes at
> >> least x86_64, and I would guess includes most modern architectures).
> >
> > Why ? Powerpc is also using PC-relative addressing for all calls but indirect calls.
>
> I mean PC-relative access for data.  The data page is at a constant, known offset from the vDSO text.
>
> I haven’t checked how much x86_64 benefits from this, but at least the non-array fields ought to be accessible with a PC-relative access.
>
> It should be possible to refactor a little bit so that the compiler can still see what’s going on.  Maybe your patch actually does this. I’d want to look at the assembly.  This also might not matter much on x86_64 in particular, since x86_64 can convert a PC-relative address to an absolute address with a single instruction with no clobbers.
>
> Does power have PC-relative data access?  If so, I wonder if the code can be arranged so that even the array accesses don’t require computing an absolute address at any point.

Indeed the x86 code is also suboptimal, but at least the unnecessary
absolute address calculation is cheap on x86_64.  Ideally we'd pass
around offsets into the vdso data instead of passing pointers, and
maybe the compiler will figure it out.  I can try to play with this in
the morning.

WARNING: multiple messages have this Message-ID (diff)
From: Andy Lutomirski <luto@kernel.org>
To: christophe leroy <christophe.leroy@c-s.fr>
Cc: Arnd Bergmann <arnd@arndb.de>, X86 ML <x86@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"open list:MIPS" <linux-mips@vger.kernel.org>,
	Paul Mackerras <paulus@samba.org>,
	Andy Lutomirski <luto@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Vincenzo Frascino <vincenzo.frascino@arm.com>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>
Subject: Re: [RFC PATCH v2 04/10] lib: vdso: get pointer to vdso data from the arch
Date: Tue, 24 Dec 2019 04:41:48 -0800	[thread overview]
Message-ID: <CALCETrUR-NgVMeTPh3TmgNSTsA=2xE03_KBeO9DSk0P-JxD_fQ@mail.gmail.com> (raw)
In-Reply-To: <D2614EC4-5B80-4846-994D-22730ACD44A1@amacapital.net>

On Tue, Dec 24, 2019 at 4:15 AM Andy Lutomirski <luto@amacapital.net> wrote:
>
>
>
> > On Dec 24, 2019, at 7:53 PM, christophe leroy <christophe.leroy@c-s.fr> wrote:
> >
> > 
> >
> >> Le 24/12/2019 à 03:27, Andy Lutomirski a écrit :
> >>> On Mon, Dec 23, 2019 at 6:31 AM Christophe Leroy
> >>> <christophe.leroy@c-s.fr> wrote:
> >>>
> >>> On powerpc, __arch_get_vdso_data() clobbers the link register,
> >>> requiring the caller to set a stack frame in order to save it.
> >>>
> >>> As the parent function already has to set a stack frame and save
> >>> the link register to call the C vdso function, retriving the
> >>> vdso data pointer there is lighter.
> >> I'm confused.  Can't you inline __arch_get_vdso_data()?  Or is the
> >> issue that you can't retrieve the program counter on power without
> >> clobbering the link register?
> >
> > Yes it can be inlined (I did it in V1 https://patchwork.ozlabs.org/patch/1180571/), but you can't do it without clobbering the link register, because the only way to get the program counter is to do to as if you were calling another function but you call to the address which just follows where you are, so that it sets LR which the simulated return address which corresponds to the address following the branch.
> >
> > static __always_inline
> > const struct vdso_data *__arch_get_vdso_data(void)
> > {
> >    void *ptr;
> >
> >    asm volatile(
> >        "    bcl    20, 31, .+4;\n"
> >        "    mflr    %0;\n"
> >        "    addi    %0, %0, __kernel_datapage_offset - (.-4);\n"
> >        : "=b"(ptr) : : "lr");
> >
> >    return ptr + *(unsigned long *)ptr;
> > }
> >
> >> I would imagine that this patch generates worse code on any
> >> architecture with PC-relative addressing modes (which includes at
> >> least x86_64, and I would guess includes most modern architectures).
> >
> > Why ? Powerpc is also using PC-relative addressing for all calls but indirect calls.
>
> I mean PC-relative access for data.  The data page is at a constant, known offset from the vDSO text.
>
> I haven’t checked how much x86_64 benefits from this, but at least the non-array fields ought to be accessible with a PC-relative access.
>
> It should be possible to refactor a little bit so that the compiler can still see what’s going on.  Maybe your patch actually does this. I’d want to look at the assembly.  This also might not matter much on x86_64 in particular, since x86_64 can convert a PC-relative address to an absolute address with a single instruction with no clobbers.
>
> Does power have PC-relative data access?  If so, I wonder if the code can be arranged so that even the array accesses don’t require computing an absolute address at any point.

Indeed the x86 code is also suboptimal, but at least the unnecessary
absolute address calculation is cheap on x86_64.  Ideally we'd pass
around offsets into the vdso data instead of passing pointers, and
maybe the compiler will figure it out.  I can try to play with this in
the morning.

WARNING: multiple messages have this Message-ID (diff)
From: Andy Lutomirski <luto@kernel.org>
To: christophe leroy <christophe.leroy@c-s.fr>
Cc: Arnd Bergmann <arnd@arndb.de>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	X86 ML <x86@kernel.org>, LKML <linux-kernel@vger.kernel.org>,
	"open list:MIPS" <linux-mips@vger.kernel.org>,
	Paul Mackerras <paulus@samba.org>,
	Andy Lutomirski <luto@kernel.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Thomas Gleixner <tglx@linutronix.de>,
	Vincenzo Frascino <vincenzo.frascino@arm.com>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>
Subject: Re: [RFC PATCH v2 04/10] lib: vdso: get pointer to vdso data from the arch
Date: Tue, 24 Dec 2019 04:41:48 -0800	[thread overview]
Message-ID: <CALCETrUR-NgVMeTPh3TmgNSTsA=2xE03_KBeO9DSk0P-JxD_fQ@mail.gmail.com> (raw)
In-Reply-To: <D2614EC4-5B80-4846-994D-22730ACD44A1@amacapital.net>

On Tue, Dec 24, 2019 at 4:15 AM Andy Lutomirski <luto@amacapital.net> wrote:
>
>
>
> > On Dec 24, 2019, at 7:53 PM, christophe leroy <christophe.leroy@c-s.fr> wrote:
> >
> > 
> >
> >> Le 24/12/2019 à 03:27, Andy Lutomirski a écrit :
> >>> On Mon, Dec 23, 2019 at 6:31 AM Christophe Leroy
> >>> <christophe.leroy@c-s.fr> wrote:
> >>>
> >>> On powerpc, __arch_get_vdso_data() clobbers the link register,
> >>> requiring the caller to set a stack frame in order to save it.
> >>>
> >>> As the parent function already has to set a stack frame and save
> >>> the link register to call the C vdso function, retriving the
> >>> vdso data pointer there is lighter.
> >> I'm confused.  Can't you inline __arch_get_vdso_data()?  Or is the
> >> issue that you can't retrieve the program counter on power without
> >> clobbering the link register?
> >
> > Yes it can be inlined (I did it in V1 https://patchwork.ozlabs.org/patch/1180571/), but you can't do it without clobbering the link register, because the only way to get the program counter is to do to as if you were calling another function but you call to the address which just follows where you are, so that it sets LR which the simulated return address which corresponds to the address following the branch.
> >
> > static __always_inline
> > const struct vdso_data *__arch_get_vdso_data(void)
> > {
> >    void *ptr;
> >
> >    asm volatile(
> >        "    bcl    20, 31, .+4;\n"
> >        "    mflr    %0;\n"
> >        "    addi    %0, %0, __kernel_datapage_offset - (.-4);\n"
> >        : "=b"(ptr) : : "lr");
> >
> >    return ptr + *(unsigned long *)ptr;
> > }
> >
> >> I would imagine that this patch generates worse code on any
> >> architecture with PC-relative addressing modes (which includes at
> >> least x86_64, and I would guess includes most modern architectures).
> >
> > Why ? Powerpc is also using PC-relative addressing for all calls but indirect calls.
>
> I mean PC-relative access for data.  The data page is at a constant, known offset from the vDSO text.
>
> I haven’t checked how much x86_64 benefits from this, but at least the non-array fields ought to be accessible with a PC-relative access.
>
> It should be possible to refactor a little bit so that the compiler can still see what’s going on.  Maybe your patch actually does this. I’d want to look at the assembly.  This also might not matter much on x86_64 in particular, since x86_64 can convert a PC-relative address to an absolute address with a single instruction with no clobbers.
>
> Does power have PC-relative data access?  If so, I wonder if the code can be arranged so that even the array accesses don’t require computing an absolute address at any point.

Indeed the x86 code is also suboptimal, but at least the unnecessary
absolute address calculation is cheap on x86_64.  Ideally we'd pass
around offsets into the vdso data instead of passing pointers, and
maybe the compiler will figure it out.  I can try to play with this in
the morning.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2019-12-24 12:42 UTC|newest]

Thread overview: 137+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-23 14:31 [RFC PATCH v2 00/10] powerpc/32: switch VDSO to C implementation Christophe Leroy
2019-12-23 14:31 ` Christophe Leroy
2019-12-23 14:31 ` Christophe Leroy
2019-12-23 14:31 ` [RFC PATCH v2 01/10] lib: vdso: ensure all arches have 32bit fallback Christophe Leroy
2019-12-23 14:31   ` Christophe Leroy
2019-12-23 14:31   ` Christophe Leroy
2019-12-24  2:07   ` Andy Lutomirski
2019-12-24  2:07     ` Andy Lutomirski
2019-12-24  2:07     ` Andy Lutomirski
2020-01-10 20:56     ` Thomas Gleixner
2020-01-10 20:56       ` Thomas Gleixner
2020-01-10 20:56       ` Thomas Gleixner
2020-01-10 21:02       ` Andy Lutomirski
2020-01-10 21:02         ` Andy Lutomirski
2020-01-10 21:02         ` Andy Lutomirski
2019-12-25  2:05   ` kbuild test robot
2019-12-25  6:01   ` kbuild test robot
2019-12-30 12:27   ` Arnd Bergmann
2019-12-30 12:27     ` Arnd Bergmann
2019-12-30 12:27     ` Arnd Bergmann
2020-01-02 11:29     ` Arnd Bergmann
2020-01-02 11:29       ` Arnd Bergmann
2020-01-02 11:29       ` Arnd Bergmann
2020-01-09 15:43       ` Christophe Leroy
2020-01-09 15:43         ` Christophe Leroy
2020-01-09 15:43         ` Christophe Leroy
2019-12-23 14:31 ` [RFC PATCH v2 02/10] lib: vdso: move call to fallback out of common code Christophe Leroy
2019-12-23 14:31   ` Christophe Leroy
2019-12-23 14:31   ` Christophe Leroy
2019-12-24  2:24   ` Andy Lutomirski
2019-12-24  2:24     ` Andy Lutomirski
2019-12-24  2:24     ` Andy Lutomirski
2019-12-24 11:41     ` christophe leroy
2019-12-24 11:41       ` christophe leroy
2019-12-24 11:41       ` christophe leroy
2019-12-24 12:09       ` Andy Lutomirski
2019-12-24 12:09         ` Andy Lutomirski
2019-12-24 12:09         ` Andy Lutomirski
2019-12-25  2:19   ` kbuild test robot
2019-12-23 14:31 ` [RFC PATCH v2 03/10] lib: vdso: Change __cvdso_clock_gettime/getres_common() to __cvdso_clock_gettime/getres() Christophe Leroy
2019-12-23 14:31   ` Christophe Leroy
2019-12-23 14:31   ` Christophe Leroy
2019-12-23 14:31 ` [RFC PATCH v2 04/10] lib: vdso: get pointer to vdso data from the arch Christophe Leroy
2019-12-23 14:31   ` Christophe Leroy
2019-12-23 14:31   ` Christophe Leroy
2019-12-24  2:27   ` Andy Lutomirski
2019-12-24  2:27     ` Andy Lutomirski
2019-12-24  2:27     ` Andy Lutomirski
2019-12-24 11:53     ` christophe leroy
2019-12-24 11:53       ` christophe leroy
2019-12-24 11:53       ` christophe leroy
2019-12-24 12:15       ` Andy Lutomirski
2019-12-24 12:15         ` Andy Lutomirski
2019-12-24 12:15         ` Andy Lutomirski
2019-12-24 12:41         ` Andy Lutomirski [this message]
2019-12-24 12:41           ` Andy Lutomirski
2019-12-24 12:41           ` Andy Lutomirski
2019-12-24 14:46         ` Segher Boessenkool
2019-12-24 14:46           ` Segher Boessenkool
2019-12-24 14:46           ` Segher Boessenkool
2019-12-23 14:31 ` [RFC PATCH v2 05/10] lib: vdso: inline do_hres() Christophe Leroy
2019-12-23 14:31   ` Christophe Leroy
2019-12-23 14:31   ` Christophe Leroy
2019-12-24  2:29   ` Andy Lutomirski
2019-12-24  2:29     ` Andy Lutomirski
2019-12-24  2:29     ` Andy Lutomirski
2019-12-30 12:07   ` Arnd Bergmann
2019-12-30 12:07     ` Arnd Bergmann
2019-12-30 12:07     ` Arnd Bergmann
2020-01-10 21:07     ` Thomas Gleixner
2020-01-10 21:07       ` Thomas Gleixner
2020-01-10 21:07       ` Thomas Gleixner
2020-01-11  9:06       ` Christophe Leroy
2020-01-11  9:06         ` Christophe Leroy
2020-01-11  9:06         ` Christophe Leroy
2019-12-23 14:31 ` [RFC PATCH v2 06/10] lib: vdso: make do_coarse() return 0 Christophe Leroy
2019-12-23 14:31   ` Christophe Leroy
2019-12-23 14:31   ` Christophe Leroy
2019-12-23 14:31 ` [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in __c_kernel_time() Christophe Leroy
2019-12-23 14:31   ` Christophe Leroy
2019-12-23 14:31   ` Christophe Leroy
2019-12-24  1:58   ` Andy Lutomirski
2019-12-24  1:58     ` Andy Lutomirski
2019-12-24  1:58     ` Andy Lutomirski
2019-12-24 11:12     ` christophe leroy
2019-12-24 11:12       ` christophe leroy
2019-12-24 11:12       ` christophe leroy
2019-12-24 12:04       ` Andy Lutomirski
2019-12-24 12:04         ` Andy Lutomirski
2019-12-24 12:04         ` Andy Lutomirski
2020-01-10 21:12   ` Thomas Gleixner
2020-01-10 21:12     ` Thomas Gleixner
2020-01-10 21:12     ` Thomas Gleixner
2020-01-11  8:05     ` Christophe Leroy
2020-01-11  8:05       ` Christophe Leroy
2020-01-11  8:05       ` Christophe Leroy
2020-01-11 11:07       ` Thomas Gleixner
2020-01-11 11:07         ` Thomas Gleixner
2020-01-11 11:07         ` Thomas Gleixner
2020-01-13  6:52         ` Christophe Leroy
2020-01-13  6:52           ` Christophe Leroy
2020-01-13  6:52           ` Christophe Leroy
2019-12-23 14:31 ` [RFC PATCH v2 08/10] lib: vdso: Avoid duplication in __cvdso_clock_getres() Christophe Leroy
2019-12-23 14:31   ` Christophe Leroy
2019-12-23 14:31   ` Christophe Leroy
2019-12-24  1:59   ` Andy Lutomirski
2019-12-24  1:59     ` Andy Lutomirski
2019-12-24  1:59     ` Andy Lutomirski
2019-12-23 14:31 ` [RFC PATCH v2 09/10] powerpc/vdso32: inline __get_datapage() Christophe Leroy
2019-12-23 14:31   ` Christophe Leroy
2019-12-23 14:31   ` Christophe Leroy
2019-12-23 14:31 ` [RFC PATCH v2 10/10] powerpc/32: Switch VDSO to C implementation Christophe Leroy
2019-12-23 14:31   ` Christophe Leroy
2019-12-23 14:31   ` Christophe Leroy
2019-12-25  6:54   ` kbuild test robot
2020-01-09 17:52 ` Surprising code generated for vdso_read_begin() Christophe Leroy
2020-01-09 20:07   ` Segher Boessenkool
2020-01-09 20:07     ` Segher Boessenkool
2020-01-09 20:07     ` Segher Boessenkool
2020-01-10  6:45     ` Christophe Leroy
2020-01-10  6:45       ` Christophe Leroy
2020-01-10  6:45       ` Christophe Leroy
2020-01-11 11:33       ` Segher Boessenkool
2020-01-11 11:33         ` Segher Boessenkool
2020-01-11 11:33         ` Segher Boessenkool
2020-02-16 18:10         ` Arnd Bergmann
2020-02-16 18:10           ` Arnd Bergmann
2020-02-16 18:10           ` Arnd Bergmann
2020-02-19  8:45           ` Christophe Leroy
2020-02-19  8:45             ` Christophe Leroy
2020-02-19  8:45             ` Christophe Leroy
2020-02-19  9:52             ` Arnd Bergmann
2020-02-19  9:52               ` Arnd Bergmann
2020-02-19  9:52               ` Arnd Bergmann
2020-02-19 13:08               ` Segher Boessenkool
2020-02-19 13:08                 ` Segher Boessenkool
2020-02-19 13:08                 ` Segher Boessenkool

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CALCETrUR-NgVMeTPh3TmgNSTsA=2xE03_KBeO9DSk0P-JxD_fQ@mail.gmail.com' \
    --to=luto@kernel.org \
    --cc=arnd@arndb.de \
    --cc=benh@kernel.crashing.org \
    --cc=christophe.leroy@c-s.fr \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mpe@ellerman.id.au \
    --cc=paulus@samba.org \
    --cc=tglx@linutronix.de \
    --cc=vincenzo.frascino@arm.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.