LinuxPPC-Dev Archive on lore.kernel.org
 help / color / Atom feed
From: Segher Boessenkool <segher@kernel.crashing.org>
To: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: nathanl@linux.ibm.com, arnd@arndb.de, x86@kernel.org,
	linux-kernel@vger.kernel.org, linux-mips@vger.kernel.org,
	Paul Mackerras <paulus@samba.org>,
	luto@kernel.org, tglx@linutronix.de, vincenzo.frascino@arm.com,
	linuxppc-dev@lists.ozlabs.org,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [RFC PATCH v4 00/11] powerpc: switch VDSO to C implementation.
Date: Mon, 20 Jan 2020 11:27:32 -0600
Message-ID: <20200120172732.GC3191@gate.crashing.org> (raw)
In-Reply-To: <4b0e5941-c37e-3c85-3809-45f33ce35657@c-s.fr>

On Mon, Jan 20, 2020 at 06:08:23PM +0100, Christophe Leroy wrote:
> Not easy I think.
> 
> First we have the unavoidable ASM entry function that can't be dropped 
> because of the CR[SO] bit the set on error or clear on no error and that 
> can't be done in C.

Yup.

> In our ASM VDSO, fixed shifts are used, while in generic C VDSO, shifts 
> are generic and read from the VDSO data.

Does that cost more than just a few cycles?

> And there is still some funny code generated by GCC (8.1), like:
> 
>  620:	7d 29 3c 30 	srw     r9,r9,r7
>  624:	21 87 00 20 	subfic  r12,r7,32
>  628:	7d 07 3c 31 	srw.    r7,r8,r7
>  62c:	7d 08 60 30 	slw     r8,r8,r12
>  630:	7d 0b 4b 78 	or      r11,r8,r9

(This can be done cheaper for fixed shifts, you can use rlwimi then).

>  634:	39 40 00 00 	li      r10,0
>  638:	40 82 00 84 	bne     6bc <__c_kernel_clock_gettime+0x114>
>  63c:	81 23 00 24 	lwz     r9,36(r3)
>  640:	81 05 00 00 	lwz     r8,0(r5)
> ...
>  6bc:	7d 69 5b 78 	mr      r9,r11
>  6c0:	7c ea 3b 78 	mr      r10,r7
>  6c4:	7d 2b 4b 78 	mr      r11,r9
>  6c8:	4b ff ff 74 	b       63c <__c_kernel_clock_gettime+0x94>
> 
> This branch to 6bc is totally useless:
> - copying r11 into r9 is pointless as r9 is overwritten in 63c
> - copying back r9 into r11 is pointless as r11 has not been modified 
> inbetween.

Yeah, huh, how did that happen.

> - loading r10 with 0 then overwritting r10 with r7 when r7 is not 0 is 
> pointless as well, could have directly put the result of srw. in r10.

This may be harder to make the compiler do.

But the r9/r11 thing suggests you are preventing optimisation somewhere,
maybe with some asm?  Do you have some small testcase I can compile?


Segher

      reply index

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-16 17:58 Christophe Leroy
2020-01-16 17:58 ` [RFC PATCH v4 01/11] powerpc/64: Don't provide time functions in compat VDSO32 Christophe Leroy
2020-01-16 17:58 ` [RFC PATCH v4 02/11] powerpc/vdso: Switch VDSO to generic C implementation Christophe Leroy
2020-01-16 17:58 ` [RFC PATCH v4 03/11] lib: vdso: only read hrtimer_res when needed in __cvdso_clock_getres() Christophe Leroy
2020-01-16 17:58 ` [RFC PATCH v4 04/11] powerpc/vdso: simplify __get_datapage() Christophe Leroy
2020-01-16 17:58 ` [RFC PATCH v4 05/11] lib: vdso: allow arches to provide vdso data pointer Christophe Leroy
2020-01-16 17:58 ` [RFC PATCH v4 06/11] powerpc/vdso: provide inline alternative to __get_datapage() Christophe Leroy
2020-01-16 17:58 ` [RFC PATCH v4 07/11] powerpc/vdso: provide vdso data pointer from the ASM caller Christophe Leroy
2020-01-16 17:58 ` [RFC PATCH v4 08/11] lib: vdso: allow fixed clock mode Christophe Leroy
2020-01-16 20:13   ` Thomas Gleixner
2020-01-16 20:19     ` Andy Lutomirski
2020-01-16 21:07       ` Thomas Gleixner
2020-01-16 17:58 ` [RFC PATCH v4 09/11] powerpc/vdso: override __arch_vdso_capable() Christophe Leroy
2020-01-16 17:58 ` [RFC PATCH v4 10/11] lib: vdso: Allow arches to override the ns shift operation Christophe Leroy
2020-01-16 19:47   ` Andy Lutomirski
2020-01-16 19:57     ` Thomas Gleixner
2020-01-16 20:20       ` Andy Lutomirski
2020-01-29  7:14         ` Thomas Gleixner
2020-01-29  7:26           ` Christophe Leroy
2020-01-16 17:58 ` [RFC PATCH v4 11/11] powerpc/32: provide vdso_shift_ns() Christophe Leroy
2020-01-17  8:58 ` [RFC PATCH v4 00/11] powerpc: switch VDSO to C implementation Segher Boessenkool
2020-01-17  9:26   ` Christophe Leroy
2020-01-20 14:56   ` Christophe Leroy
2020-01-20 15:19     ` Segher Boessenkool
2020-01-20 17:08       ` Christophe Leroy
2020-01-20 17:27         ` Segher Boessenkool [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200120172732.GC3191@gate.crashing.org \
    --to=segher@kernel.crashing.org \
    --cc=arnd@arndb.de \
    --cc=christophe.leroy@c-s.fr \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=luto@kernel.org \
    --cc=nathanl@linux.ibm.com \
    --cc=paulus@samba.org \
    --cc=tglx@linutronix.de \
    --cc=vincenzo.frascino@arm.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LinuxPPC-Dev Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linuxppc-dev/0 linuxppc-dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linuxppc-dev linuxppc-dev/ https://lore.kernel.org/linuxppc-dev \
		linuxppc-dev@lists.ozlabs.org linuxppc-dev@ozlabs.org
	public-inbox-index linuxppc-dev

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.ozlabs.lists.linuxppc-dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git