* [PATCH v4 0/8] powerpc/vdso32 enhancement and optimisation
@ 2019-12-02 7:57 Christophe Leroy
2019-12-02 7:57 ` [PATCH v4 1/8] powerpc/32: Add VDSO version of getcpu on non SMP Christophe Leroy
` (7 more replies)
0 siblings, 8 replies; 12+ messages in thread
From: Christophe Leroy @ 2019-12-02 7:57 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel, arnd
This series:
- adds getcpu() on non SMP ppc32
- adds coarse clocks in clock_gettime
- fixes and adds all clocks in clock_getres
- optimises the retrieval of the datapage address
- optimises the cache functions
v4:
- Rebased on top of ceb307474506 ("Merge tag 'y2038-cleanups-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground")
- Fixed build failure with old binutils reported by mpe (patch 4)
v3:
- Dropped the 'fast syscall' hack for getcpu() on SMP.
- Moved get_datapage macro into asm/vdso_datapage.h so that it can be used on PPC64 as well.
v2:
- Used named labels in patch 2
- Added patch from Vincenzo to fix clock_getres() (patch 3)
- Removed unnecessary label in patch 4 as suggested by Segher
- Added patches 5 to 8
Christophe Leroy (8):
powerpc/32: Add VDSO version of getcpu on non SMP
powerpc/vdso32: Add support for CLOCK_{REALTIME/MONOTONIC}_COARSE
powerpc: Fix vDSO clock_getres()
powerpc/vdso32: inline __get_datapage()
powerpc/vdso32: Don't read cache line size from the datapage on PPC32.
powerpc/vdso32: use LOAD_REG_IMMEDIATE()
powerpc/vdso32: implement clock_getres entirely
powerpc/vdso32: miscellaneous optimisations
arch/powerpc/include/asm/vdso_datapage.h | 16 +++-
arch/powerpc/kernel/asm-offsets.c | 7 +-
arch/powerpc/kernel/time.c | 1 +
arch/powerpc/kernel/vdso.c | 5 --
arch/powerpc/kernel/vdso32/Makefile | 4 +-
arch/powerpc/kernel/vdso32/cacheflush.S | 32 ++++++--
arch/powerpc/kernel/vdso32/datapage.S | 31 +-------
arch/powerpc/kernel/vdso32/getcpu.S | 23 +++++-
arch/powerpc/kernel/vdso32/gettimeofday.S | 124 +++++++++++++++++++++---------
arch/powerpc/kernel/vdso32/vdso32.lds.S | 2 +-
arch/powerpc/kernel/vdso64/gettimeofday.S | 7 +-
11 files changed, 164 insertions(+), 88 deletions(-)
--
2.13.3
* [PATCH v4 1/8] powerpc/32: Add VDSO version of getcpu on non SMP
2019-12-02 7:57 [PATCH v4 0/8] powerpc/vdso32 enhancement and optimisation Christophe Leroy
@ 2019-12-02 7:57 ` Christophe Leroy
2020-01-29 5:17 ` Michael Ellerman
2019-12-02 7:57 ` [PATCH v4 2/8] powerpc/vdso32: Add support for CLOCK_{REALTIME/MONOTONIC}_COARSE Christophe Leroy
` (6 subsequent siblings)
7 siblings, 1 reply; 12+ messages in thread
From: Christophe Leroy @ 2019-12-02 7:57 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel, arnd
Commit 18ad51dd342a ("powerpc: Add VDSO version of getcpu") added
getcpu() for PPC64 only, by making use of a user readable general
purpose SPR.
PPC32 doesn't have any such SPR.
For non SMP, just return CPU id 0 from the VDSO directly.
PPC32 doesn't support CONFIG_NUMA so NUMA node is always 0.
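For readers less familiar with PowerPC assembly, the non SMP path added below can be sketched in C as follows. This is only an illustration: the name sketch_getcpu_up is hypothetical, and the real function is the assembly __kernel_getcpu, which also clears cr0.SO to signal success.

```c
#include <stddef.h>

/*
 * Hypothetical C rendering of the non SMP PPC32 __kernel_getcpu path.
 * Either pointer may be NULL, in which case it is simply skipped.
 */
static int sketch_getcpu_up(unsigned *cpu, unsigned *node)
{
	if (cpu)
		*cpu = 0;	/* only one CPU on a non SMP kernel */
	if (node)
		*node = 0;	/* no CONFIG_NUMA on PPC32, so a single node */
	return 0;		/* always success */
}
```

Calling sketch_getcpu_up(&cpu, &node) leaves both variables at 0, and passing NULL for either argument is accepted.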
Before the patch, vdsotest reported:
getcpu: syscall: 1572 nsec/call
getcpu: libc: 1787 nsec/call
getcpu: vdso: not tested
Now, vdsotest reports:
getcpu: syscall: 1582 nsec/call
getcpu: libc: 502 nsec/call
getcpu: vdso: 187 nsec/call
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: fixed build error in getcpu.S
v3: dropped the fast system call, only support non SMP for now.
---
arch/powerpc/kernel/vdso32/Makefile | 4 +---
arch/powerpc/kernel/vdso32/getcpu.S | 17 +++++++++++++++++
arch/powerpc/kernel/vdso32/vdso32.lds.S | 2 +-
3 files changed, 19 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/kernel/vdso32/Makefile b/arch/powerpc/kernel/vdso32/Makefile
index 06f54d947057..e147bbdc12cd 100644
--- a/arch/powerpc/kernel/vdso32/Makefile
+++ b/arch/powerpc/kernel/vdso32/Makefile
@@ -2,9 +2,7 @@
# List of files in the vdso, has to be asm only for now
-obj-vdso32-$(CONFIG_PPC64) = getcpu.o
-obj-vdso32 = sigtramp.o gettimeofday.o datapage.o cacheflush.o note.o \
- $(obj-vdso32-y)
+obj-vdso32 = sigtramp.o gettimeofday.o datapage.o cacheflush.o note.o getcpu.o
# Build rules
diff --git a/arch/powerpc/kernel/vdso32/getcpu.S b/arch/powerpc/kernel/vdso32/getcpu.S
index 63e914539e1a..90b39af14383 100644
--- a/arch/powerpc/kernel/vdso32/getcpu.S
+++ b/arch/powerpc/kernel/vdso32/getcpu.S
@@ -15,6 +15,7 @@
* int __kernel_getcpu(unsigned *cpu, unsigned *node);
*
*/
+#if defined(CONFIG_PPC64)
V_FUNCTION_BEGIN(__kernel_getcpu)
.cfi_startproc
mfspr r5,SPRN_SPRG_VDSO_READ
@@ -31,3 +32,19 @@ V_FUNCTION_BEGIN(__kernel_getcpu)
blr
.cfi_endproc
V_FUNCTION_END(__kernel_getcpu)
+#elif !defined(CONFIG_SMP)
+V_FUNCTION_BEGIN(__kernel_getcpu)
+ .cfi_startproc
+ cmpwi cr0, r3, 0
+ cmpwi cr1, r4, 0
+ li r5, 0
+ beq cr0, 1f
+ stw r5, 0(r3)
+1: li r3, 0 /* always success */
+ crclr cr0*4+so
+ beqlr cr1
+ stw r5, 0(r4)
+ blr
+ .cfi_endproc
+V_FUNCTION_END(__kernel_getcpu)
+#endif
diff --git a/arch/powerpc/kernel/vdso32/vdso32.lds.S b/arch/powerpc/kernel/vdso32/vdso32.lds.S
index 00c025ba4a92..5206c2eb2a1d 100644
--- a/arch/powerpc/kernel/vdso32/vdso32.lds.S
+++ b/arch/powerpc/kernel/vdso32/vdso32.lds.S
@@ -155,7 +155,7 @@ VERSION
__kernel_sync_dicache_p5;
__kernel_sigtramp32;
__kernel_sigtramp_rt32;
-#ifdef CONFIG_PPC64
+#if defined(CONFIG_PPC64) || !defined(CONFIG_SMP)
__kernel_getcpu;
#endif
--
2.13.3
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v4 2/8] powerpc/vdso32: Add support for CLOCK_{REALTIME/MONOTONIC}_COARSE
2019-12-02 7:57 [PATCH v4 0/8] powerpc/vdso32 enhancement and optimisation Christophe Leroy
2019-12-02 7:57 ` [PATCH v4 1/8] powerpc/32: Add VDSO version of getcpu on non SMP Christophe Leroy
@ 2019-12-02 7:57 ` Christophe Leroy
2019-12-02 7:57 ` [PATCH v4 3/8] powerpc: Fix vDSO clock_getres() Christophe Leroy
` (5 subsequent siblings)
7 siblings, 0 replies; 12+ messages in thread
From: Christophe Leroy @ 2019-12-02 7:57 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel, arnd
This is copied and adapted from commit 5c929885f1bb ("powerpc/vdso64:
Add support for CLOCK_{REALTIME/MONOTONIC}_COARSE")
from Santosh Sivaraj <santosh@fossix.org>
Benchmark from vdsotest-all:
clock-gettime-realtime: syscall: 3601 nsec/call
clock-gettime-realtime: libc: 1072 nsec/call
clock-gettime-realtime: vdso: 931 nsec/call
clock-gettime-monotonic: syscall: 4034 nsec/call
clock-gettime-monotonic: libc: 1213 nsec/call
clock-gettime-monotonic: vdso: 1076 nsec/call
clock-gettime-realtime-coarse: syscall: 2722 nsec/call
clock-gettime-realtime-coarse: libc: 805 nsec/call
clock-gettime-realtime-coarse: vdso: 668 nsec/call
clock-gettime-monotonic-coarse: syscall: 2949 nsec/call
clock-gettime-monotonic-coarse: libc: 882 nsec/call
clock-gettime-monotonic-coarse: vdso: 745 nsec/call
Additional test passed with:
vdsotest -d 30 clock-gettime-monotonic-coarse verify
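The coarse path in the diff below reads the timestamp straight from the data page and retries if the update counter changed in the meantime. A rough C sketch of that read loop follows. The struct and function names are hypothetical stand-ins, the field layout is simplified, and the memory ordering that the assembly obtains through a data dependency is not modelled here (volatile alone would not be enough on real hardware).

```c
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC 1000000000

/* Hypothetical, simplified stand-in for the fields the coarse path reads. */
struct sketch_vdso_data {
	volatile uint32_t tb_update_count;	/* odd while an update is pending */
	int32_t stamp_xtime_sec;
	int32_t stamp_xtime_nsec;
	int32_t wtom_clock_sec;
	int32_t wtom_clock_nsec;
};

struct sketch_timespec {
	int32_t tv_sec;
	int32_t tv_nsec;
};

static void sketch_gettime_coarse(const struct sketch_vdso_data *d,
				  int monotonic, struct sketch_timespec *ts)
{
	uint32_t seq;
	int32_t sec, nsec;

	do {
		/* Wait for any pending update (odd count) to finish. */
		do {
			seq = d->tb_update_count;
		} while (seq & 1);

		sec = d->stamp_xtime_sec;
		nsec = d->stamp_xtime_nsec;
		if (monotonic) {
			sec += d->wtom_clock_sec;
			nsec += d->wtom_clock_nsec;
		}
		/* Retry if the kernel updated the data while we were reading. */
	} while (seq != d->tb_update_count);

	/* Bring nsec back into [0, NSEC_PER_SEC), as the assembly does. */
	if (nsec >= SKETCH_NSEC_PER_SEC) {
		nsec -= SKETCH_NSEC_PER_SEC;
		sec += 1;
	} else if (nsec < 0) {
		nsec += SKETCH_NSEC_PER_SEC;
		sec -= 1;
	}

	ts->tv_sec = sec;
	ts->tv_nsec = nsec;
}
```

The realtime coarse clock returns the stamp unchanged. The monotonic coarse clock adds the wall-to-monotonic offset first, so only that case can actually need the normalisation step.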
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Santosh Sivaraj <santosh@fossix.org>
Link: https://github.com/linuxppc/issues/issues/41
---
v4: Using STAMP_XTIME_SEC and STAMP_XTIME_NSEC instead of STAMP_XTIME following merge of the latest y2038 fix series.
---
arch/powerpc/kernel/vdso32/gettimeofday.S | 64 +++++++++++++++++++++++++++----
1 file changed, 57 insertions(+), 7 deletions(-)
diff --git a/arch/powerpc/kernel/vdso32/gettimeofday.S b/arch/powerpc/kernel/vdso32/gettimeofday.S
index c8e6902cb01b..7c1be86c1e90 100644
--- a/arch/powerpc/kernel/vdso32/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso32/gettimeofday.S
@@ -69,7 +69,13 @@ V_FUNCTION_BEGIN(__kernel_clock_gettime)
cmpli cr0,r3,CLOCK_REALTIME
cmpli cr1,r3,CLOCK_MONOTONIC
cror cr0*4+eq,cr0*4+eq,cr1*4+eq
- bne cr0,99f
+
+ cmpli cr5,r3,CLOCK_REALTIME_COARSE
+ cmpli cr6,r3,CLOCK_MONOTONIC_COARSE
+ cror cr5*4+eq,cr5*4+eq,cr6*4+eq
+
+ cror cr0*4+eq,cr0*4+eq,cr5*4+eq
+ bne cr0, .Lgettime_fallback
mflr r12 /* r12 saves lr */
.cfi_register lr,r12
@@ -78,8 +84,10 @@ V_FUNCTION_BEGIN(__kernel_clock_gettime)
mr r9,r3 /* datapage ptr in r9 */
lis r7,NSEC_PER_SEC@h /* want nanoseconds */
ori r7,r7,NSEC_PER_SEC@l
-50: bl __do_get_tspec@local /* get sec/nsec from tb & kernel */
- bne cr1,80f /* not monotonic -> all done */
+ beq cr5, .Lcoarse_clocks
+.Lprecise_clocks:
+ bl __do_get_tspec@local /* get sec/nsec from tb & kernel */
+ bne cr1, .Lfinish /* not monotonic -> all done */
/*
* CLOCK_MONOTONIC
@@ -103,12 +111,53 @@ V_FUNCTION_BEGIN(__kernel_clock_gettime)
add r9,r9,r0
lwz r0,(CFG_TB_UPDATE_COUNT+LOPART)(r9)
cmpl cr0,r8,r0 /* check if updated */
- bne- 50b
+ bne- .Lprecise_clocks
+ b .Lfinish_monotonic
+
+ /*
+ * For coarse clocks we get data directly from the vdso data page, so
+ * we don't need to call __do_get_tspec, but we still need to do the
+ * counter trick.
+ */
+.Lcoarse_clocks:
+ lwz r8,(CFG_TB_UPDATE_COUNT+LOPART)(r9)
+ andi. r0,r8,1 /* pending update ? loop */
+ bne- .Lcoarse_clocks
+ add r9,r9,r0 /* r0 is already 0 */
+
+ /*
+ * CLOCK_REALTIME_COARSE, below values are needed for MONOTONIC_COARSE
+ * too
+ */
+ lwz r3,STAMP_XTIME_SEC+LOPART(r9)
+ lwz r4,STAMP_XTIME_NSEC+LOPART(r9)
+ bne cr6,1f
+
+ /* CLOCK_MONOTONIC_COARSE */
+ lwz r5,(WTOM_CLOCK_SEC+LOPART)(r9)
+ lwz r6,WTOM_CLOCK_NSEC(r9)
+
+ /* check if counter has updated */
+ or r0,r6,r5
+1: or r0,r0,r3
+ or r0,r0,r4
+ xor r0,r0,r0
+ add r3,r3,r0
+ lwz r0,CFG_TB_UPDATE_COUNT+LOPART(r9)
+ cmpl cr0,r0,r8 /* check if updated */
+ bne- .Lcoarse_clocks
+
+ /* Counter has not updated, so continue calculating proper values for
+ * sec and nsec if monotonic coarse, or just return with the proper
+ * values for realtime.
+ */
+ bne cr6, .Lfinish
/* Calculate and store result. Note that this mimics the C code,
* which may cause funny results if nsec goes negative... is that
* possible at all ?
*/
+.Lfinish_monotonic:
add r3,r3,r5
add r4,r4,r6
cmpw cr0,r4,r7
@@ -116,11 +165,12 @@ V_FUNCTION_BEGIN(__kernel_clock_gettime)
blt 1f
subf r4,r7,r4
addi r3,r3,1
-1: bge cr1,80f
+1: bge cr1, .Lfinish
addi r3,r3,-1
add r4,r4,r7
-80: stw r3,TSPC32_TV_SEC(r11)
+.Lfinish:
+ stw r3,TSPC32_TV_SEC(r11)
stw r4,TSPC32_TV_NSEC(r11)
mtlr r12
@@ -131,7 +181,7 @@ V_FUNCTION_BEGIN(__kernel_clock_gettime)
/*
* syscall fallback
*/
-99:
+.Lgettime_fallback:
li r0,__NR_clock_gettime
.cfi_restore lr
sc
--
2.13.3
* [PATCH v4 3/8] powerpc: Fix vDSO clock_getres()
2019-12-02 7:57 [PATCH v4 0/8] powerpc/vdso32 enhancement and optimisation Christophe Leroy
2019-12-02 7:57 ` [PATCH v4 1/8] powerpc/32: Add VDSO version of getcpu on non SMP Christophe Leroy
2019-12-02 7:57 ` [PATCH v4 2/8] powerpc/vdso32: Add support for CLOCK_{REALTIME/MONOTONIC}_COARSE Christophe Leroy
@ 2019-12-02 7:57 ` Christophe Leroy
2019-12-04 13:30 ` Michael Ellerman
2019-12-02 7:57 ` [PATCH v4 4/8] powerpc/vdso32: inline __get_datapage() Christophe Leroy
` (4 subsequent siblings)
7 siblings, 1 reply; 12+ messages in thread
From: Christophe Leroy @ 2019-12-02 7:57 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel, arnd
From: Vincenzo Frascino <vincenzo.frascino@arm.com>
clock_getres in the vDSO library has to preserve the behaviour
of posix_get_hrtimer_res().
In particular, posix_get_hrtimer_res() does:
sec = 0;
ns = hrtimer_resolution;
and hrtimer_resolution depends on whether high resolution timers
are enabled, which can happen either at compile time or at run time.
Fix the powerpc vdso implementation of clock_getres by keeping a copy
of hrtimer_resolution in the vdso data and using that directly.
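The idea can be sketched in C as below: the kernel side mirrors hrtimer_resolution into the data page on every update, and the vDSO side reads that copy instead of a build-time constant. All names prefixed with sketch_ are hypothetical, and the struct keeps only the one field that matters here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical one-field stand-in for struct vdso_data. */
struct sketch_vdso_data {
	uint32_t hrtimer_res;	/* copy of hrtimer_resolution, in nsec */
};

struct sketch_timespec {
	int64_t tv_sec;
	int64_t tv_nsec;
};

/* Kernel side: mirrors the assignment added to update_vsyscall(). */
static void sketch_update_vsyscall(struct sketch_vdso_data *d,
				   uint32_t hrtimer_resolution)
{
	d->hrtimer_res = hrtimer_resolution;
}

/* vDSO side: same result as posix_get_hrtimer_res(), read from the data page. */
static int sketch_clock_getres(const struct sketch_vdso_data *d,
			       struct sketch_timespec *res)
{
	if (res) {
		res->tv_sec = 0;
		res->tv_nsec = d->hrtimer_res;
	}
	return 0;
}
```

If the resolution changes at run time, the next update refreshes the copy and later clock_getres calls report the new value, which a constant baked in at build time cannot do.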
Fixes: a7f290dad32e ("[PATCH] powerpc: Merge vdso's and add vdso support
to 32 bits kernel")
Cc: stable@vger.kernel.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
[chleroy: changed CLOCK_REALTIME_RES to CLOCK_HRTIMER_RES]
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
arch/powerpc/include/asm/vdso_datapage.h | 2 ++
arch/powerpc/kernel/asm-offsets.c | 2 +-
arch/powerpc/kernel/time.c | 1 +
arch/powerpc/kernel/vdso32/gettimeofday.S | 7 +++++--
arch/powerpc/kernel/vdso64/gettimeofday.S | 7 +++++--
5 files changed, 14 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/include/asm/vdso_datapage.h b/arch/powerpc/include/asm/vdso_datapage.h
index a115970a6809..40f13f3626d3 100644
--- a/arch/powerpc/include/asm/vdso_datapage.h
+++ b/arch/powerpc/include/asm/vdso_datapage.h
@@ -83,6 +83,7 @@ struct vdso_data {
__s64 wtom_clock_sec; /* Wall to monotonic clock sec */
__s64 stamp_xtime_sec; /* xtime secs as at tb_orig_stamp */
__s64 stamp_xtime_nsec; /* xtime nsecs as at tb_orig_stamp */
+ __u32 hrtimer_res; /* hrtimer resolution */
__u32 syscall_map_64[SYSCALL_MAP_SIZE]; /* map of syscalls */
__u32 syscall_map_32[SYSCALL_MAP_SIZE]; /* map of syscalls */
};
@@ -105,6 +106,7 @@ struct vdso_data {
__s32 stamp_xtime_sec; /* xtime seconds as at tb_orig_stamp */
__s32 stamp_xtime_nsec; /* xtime nsecs as at tb_orig_stamp */
__u32 stamp_sec_fraction; /* fractional seconds of stamp_xtime */
+ __u32 hrtimer_res; /* hrtimer resolution */
__u32 syscall_map_32[SYSCALL_MAP_SIZE]; /* map of syscalls */
__u32 dcache_block_size; /* L1 d-cache block size */
__u32 icache_block_size; /* L1 i-cache block size */
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index f22bd6d1fe93..3d47aec7becf 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -388,6 +388,7 @@ int main(void)
OFFSET(STAMP_XTIME_SEC, vdso_data, stamp_xtime_sec);
OFFSET(STAMP_XTIME_NSEC, vdso_data, stamp_xtime_nsec);
OFFSET(STAMP_SEC_FRAC, vdso_data, stamp_sec_fraction);
+ OFFSET(CLOCK_HRTIMER_RES, vdso_data, hrtimer_res);
OFFSET(CFG_ICACHE_BLOCKSZ, vdso_data, icache_block_size);
OFFSET(CFG_DCACHE_BLOCKSZ, vdso_data, dcache_block_size);
OFFSET(CFG_ICACHE_LOGBLOCKSZ, vdso_data, icache_log_block_size);
@@ -413,7 +414,6 @@ int main(void)
DEFINE(CLOCK_REALTIME_COARSE, CLOCK_REALTIME_COARSE);
DEFINE(CLOCK_MONOTONIC_COARSE, CLOCK_MONOTONIC_COARSE);
DEFINE(NSEC_PER_SEC, NSEC_PER_SEC);
- DEFINE(CLOCK_REALTIME_RES, MONOTONIC_RES_NSEC);
#ifdef CONFIG_BUG
DEFINE(BUG_ENTRY_SIZE, sizeof(struct bug_entry));
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 2d13cea13954..1168e8b37e30 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -960,6 +960,7 @@ void update_vsyscall(struct timekeeper *tk)
vdso_data->stamp_xtime_sec = xt.tv_sec;
vdso_data->stamp_xtime_nsec = xt.tv_nsec;
vdso_data->stamp_sec_fraction = frac_sec;
+ vdso_data->hrtimer_res = hrtimer_resolution;
smp_wmb();
++(vdso_data->tb_update_count);
}
diff --git a/arch/powerpc/kernel/vdso32/gettimeofday.S b/arch/powerpc/kernel/vdso32/gettimeofday.S
index 7c1be86c1e90..e9ce8ee56edb 100644
--- a/arch/powerpc/kernel/vdso32/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso32/gettimeofday.S
@@ -204,12 +204,15 @@ V_FUNCTION_BEGIN(__kernel_clock_getres)
cror cr0*4+eq,cr0*4+eq,cr1*4+eq
bne cr0,99f
+ mflr r12
+ .cfi_register lr,r12
+ bl __get_datapage@local /* get data page */
+ lwz r5, CLOCK_HRTIMER_RES(r3)
+ mtlr r12
li r3,0
cmpli cr0,r4,0
crclr cr0*4+so
beqlr
- lis r5,CLOCK_REALTIME_RES@h
- ori r5,r5,CLOCK_REALTIME_RES@l
stw r3,TSPC32_TV_SEC(r4)
stw r5,TSPC32_TV_NSEC(r4)
blr
diff --git a/arch/powerpc/kernel/vdso64/gettimeofday.S b/arch/powerpc/kernel/vdso64/gettimeofday.S
index 1f24e411af80..1c9a04703250 100644
--- a/arch/powerpc/kernel/vdso64/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso64/gettimeofday.S
@@ -186,12 +186,15 @@ V_FUNCTION_BEGIN(__kernel_clock_getres)
cror cr0*4+eq,cr0*4+eq,cr1*4+eq
bne cr0,99f
+ mflr r12
+ .cfi_register lr,r12
+ bl V_LOCAL_FUNC(__get_datapage)
+ lwz r5, CLOCK_HRTIMER_RES(r3)
+ mtlr r12
li r3,0
cmpldi cr0,r4,0
crclr cr0*4+so
beqlr
- lis r5,CLOCK_REALTIME_RES@h
- ori r5,r5,CLOCK_REALTIME_RES@l
std r3,TSPC64_TV_SEC(r4)
std r5,TSPC64_TV_NSEC(r4)
blr
--
2.13.3
* [PATCH v4 4/8] powerpc/vdso32: inline __get_datapage()
2019-12-02 7:57 [PATCH v4 0/8] powerpc/vdso32 enhancement and optimisation Christophe Leroy
` (2 preceding siblings ...)
2019-12-02 7:57 ` [PATCH v4 3/8] powerpc: Fix vDSO clock_getres() Christophe Leroy
@ 2019-12-02 7:57 ` Christophe Leroy
2019-12-02 7:57 ` [PATCH v4 5/8] powerpc/vdso32: Don't read cache line size from the datapage on PPC32 Christophe Leroy
` (3 subsequent siblings)
7 siblings, 0 replies; 12+ messages in thread
From: Christophe Leroy @ 2019-12-02 7:57 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel, arnd
__get_datapage() is only a few instructions that retrieve the
address of the page where the kernel stores data for the VDSO.
By inlining this function into its users, a bl/blr pair and
a mflr/mtlr pair are avoided, plus a few register moves.
The improvement is noticeable (about 55 nsec/call on an 8xx).
vdsotest before the patch:
gettimeofday: vdso: 731 nsec/call
clock-gettime-realtime-coarse: vdso: 668 nsec/call
clock-gettime-monotonic-coarse: vdso: 745 nsec/call
vdsotest after the patch:
gettimeofday: vdso: 677 nsec/call
clock-gettime-realtime-coarse: vdso: 613 nsec/call
clock-gettime-monotonic-coarse: vdso: 690 nsec/call
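The address computation done by the get_datapage macro can be sketched in C as follows: the word at __kernel_datapage_offset holds a signed offset relative to its own address, so the data page address is the address of that word plus the value stored in it. The struct and function names here are hypothetical, and the image layout is made up purely for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature "vDSO image": an offset word, some text, then data. */
struct sketch_image {
	int32_t datapage_offset;	/* plays the role of __kernel_datapage_offset */
	char text[60];
	char datapage[16];
};

/* Data page address = address of the offset word + offset stored in it. */
static const void *sketch_get_datapage(const int32_t *offset_word)
{
	return (const char *)offset_word + *offset_word;
}
```

In the macro, the bcl/mflr pair obtains the current address so that the offset word can be located position-independently. The C version skips that step because it is simply handed the pointer.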
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v3: define get_datapage macro in asm/vdso_datapage.h
v4: fixed build failure with old binutils
---
arch/powerpc/include/asm/vdso_datapage.h | 10 ++++++++++
arch/powerpc/kernel/vdso32/cacheflush.S | 9 ++++-----
arch/powerpc/kernel/vdso32/datapage.S | 28 +++-------------------------
arch/powerpc/kernel/vdso32/gettimeofday.S | 12 +++++-------
4 files changed, 22 insertions(+), 37 deletions(-)
diff --git a/arch/powerpc/include/asm/vdso_datapage.h b/arch/powerpc/include/asm/vdso_datapage.h
index 40f13f3626d3..ee5319a6f4e3 100644
--- a/arch/powerpc/include/asm/vdso_datapage.h
+++ b/arch/powerpc/include/asm/vdso_datapage.h
@@ -118,6 +118,16 @@ struct vdso_data {
extern struct vdso_data *vdso_data;
+#else /* __ASSEMBLY__ */
+
+.macro get_datapage ptr, tmp
+ bcl 20, 31, .+4
+ mflr \ptr
+ addi \ptr, \ptr, (__kernel_datapage_offset - (.-4))@l
+ lwz \tmp, 0(\ptr)
+ add \ptr, \tmp, \ptr
+.endm
+
#endif /* __ASSEMBLY__ */
#endif /* __KERNEL__ */
diff --git a/arch/powerpc/kernel/vdso32/cacheflush.S b/arch/powerpc/kernel/vdso32/cacheflush.S
index 7f882e7b9f43..d178ec8c279d 100644
--- a/arch/powerpc/kernel/vdso32/cacheflush.S
+++ b/arch/powerpc/kernel/vdso32/cacheflush.S
@@ -8,6 +8,7 @@
#include <asm/processor.h>
#include <asm/ppc_asm.h>
#include <asm/vdso.h>
+#include <asm/vdso_datapage.h>
#include <asm/asm-offsets.h>
.text
@@ -24,14 +25,12 @@ V_FUNCTION_BEGIN(__kernel_sync_dicache)
.cfi_startproc
mflr r12
.cfi_register lr,r12
- mr r11,r3
- bl __get_datapage@local
+ get_datapage r10, r0
mtlr r12
- mr r10,r3
lwz r7,CFG_DCACHE_BLOCKSZ(r10)
addi r5,r7,-1
- andc r6,r11,r5 /* round low to line bdy */
+ andc r6,r3,r5 /* round low to line bdy */
subf r8,r6,r4 /* compute length */
add r8,r8,r5 /* ensure we get enough */
lwz r9,CFG_DCACHE_LOGBLOCKSZ(r10)
@@ -48,7 +47,7 @@ V_FUNCTION_BEGIN(__kernel_sync_dicache)
lwz r7,CFG_ICACHE_BLOCKSZ(r10)
addi r5,r7,-1
- andc r6,r11,r5 /* round low to line bdy */
+ andc r6,r3,r5 /* round low to line bdy */
subf r8,r6,r4 /* compute length */
add r8,r8,r5
lwz r9,CFG_ICACHE_LOGBLOCKSZ(r10)
diff --git a/arch/powerpc/kernel/vdso32/datapage.S b/arch/powerpc/kernel/vdso32/datapage.S
index 6c7401bd284e..1095d818f94a 100644
--- a/arch/powerpc/kernel/vdso32/datapage.S
+++ b/arch/powerpc/kernel/vdso32/datapage.S
@@ -10,35 +10,13 @@
#include <asm/asm-offsets.h>
#include <asm/unistd.h>
#include <asm/vdso.h>
+#include <asm/vdso_datapage.h>
.text
.global __kernel_datapage_offset;
__kernel_datapage_offset:
.long 0
-V_FUNCTION_BEGIN(__get_datapage)
- .cfi_startproc
- /* We don't want that exposed or overridable as we want other objects
- * to be able to bl directly to here
- */
- .protected __get_datapage
- .hidden __get_datapage
-
- mflr r0
- .cfi_register lr,r0
-
- bcl 20,31,data_page_branch
-data_page_branch:
- mflr r3
- mtlr r0
- addi r3, r3, __kernel_datapage_offset-data_page_branch
- lwz r0,0(r3)
- .cfi_restore lr
- add r3,r0,r3
- blr
- .cfi_endproc
-V_FUNCTION_END(__get_datapage)
-
/*
* void *__kernel_get_syscall_map(unsigned int *syscall_count) ;
*
@@ -53,7 +31,7 @@ V_FUNCTION_BEGIN(__kernel_get_syscall_map)
mflr r12
.cfi_register lr,r12
mr r4,r3
- bl __get_datapage@local
+ get_datapage r3, r0
mtlr r12
addi r3,r3,CFG_SYSCALL_MAP32
cmpli cr0,r4,0
@@ -75,7 +53,7 @@ V_FUNCTION_BEGIN(__kernel_get_tbfreq)
.cfi_startproc
mflr r12
.cfi_register lr,r12
- bl __get_datapage@local
+ get_datapage r3, r0
lwz r4,(CFG_TB_TICKS_PER_SEC + 4)(r3)
lwz r3,CFG_TB_TICKS_PER_SEC(r3)
mtlr r12
diff --git a/arch/powerpc/kernel/vdso32/gettimeofday.S b/arch/powerpc/kernel/vdso32/gettimeofday.S
index e9ce8ee56edb..74973548529a 100644
--- a/arch/powerpc/kernel/vdso32/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso32/gettimeofday.S
@@ -9,6 +9,7 @@
#include <asm/processor.h>
#include <asm/ppc_asm.h>
#include <asm/vdso.h>
+#include <asm/vdso_datapage.h>
#include <asm/asm-offsets.h>
#include <asm/unistd.h>
@@ -33,8 +34,7 @@ V_FUNCTION_BEGIN(__kernel_gettimeofday)
mr r10,r3 /* r10 saves tv */
mr r11,r4 /* r11 saves tz */
- bl __get_datapage@local /* get data page */
- mr r9, r3 /* datapage ptr in r9 */
+ get_datapage r9, r0
cmplwi r10,0 /* check if tv is NULL */
beq 3f
lis r7,1000000@ha /* load up USEC_PER_SEC */
@@ -80,8 +80,7 @@ V_FUNCTION_BEGIN(__kernel_clock_gettime)
mflr r12 /* r12 saves lr */
.cfi_register lr,r12
mr r11,r4 /* r11 saves tp */
- bl __get_datapage@local /* get data page */
- mr r9,r3 /* datapage ptr in r9 */
+ get_datapage r9, r0
lis r7,NSEC_PER_SEC@h /* want nanoseconds */
ori r7,r7,NSEC_PER_SEC@l
beq cr5, .Lcoarse_clocks
@@ -206,7 +205,7 @@ V_FUNCTION_BEGIN(__kernel_clock_getres)
mflr r12
.cfi_register lr,r12
- bl __get_datapage@local /* get data page */
+ get_datapage r3, r0
lwz r5, CLOCK_HRTIMER_RES(r3)
mtlr r12
li r3,0
@@ -240,8 +239,7 @@ V_FUNCTION_BEGIN(__kernel_time)
.cfi_register lr,r12
mr r11,r3 /* r11 holds t */
- bl __get_datapage@local
- mr r9, r3 /* datapage ptr in r9 */
+ get_datapage r9, r0
lwz r3,STAMP_XTIME_SEC+LOPART(r9)
--
2.13.3
* [PATCH v4 5/8] powerpc/vdso32: Don't read cache line size from the datapage on PPC32.
2019-12-02 7:57 [PATCH v4 0/8] powerpc/vdso32 enhancement and optimisation Christophe Leroy
` (3 preceding siblings ...)
2019-12-02 7:57 ` [PATCH v4 4/8] powerpc/vdso32: inline __get_datapage() Christophe Leroy
@ 2019-12-02 7:57 ` Christophe Leroy
2019-12-02 7:57 ` [PATCH v4 6/8] powerpc/vdso32: use LOAD_REG_IMMEDIATE() Christophe Leroy
` (2 subsequent siblings)
7 siblings, 0 replies; 12+ messages in thread
From: Christophe Leroy @ 2019-12-02 7:57 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel, arnd
On PPC32, the cache lines have a fixed size known at build time.
Don't read it from the datapage.
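With a build-time line size, the range arithmetic in __kernel_sync_dicache reduces to constants. A C sketch of the line-count computation is given below. The SKETCH_ constants are hypothetical example values (16-byte lines), not taken from any particular configuration, and the function name is made up.

```c
#include <stdint.h>

#define SKETCH_L1_CACHE_SHIFT	4	/* hypothetical: 16-byte cache lines */
#define SKETCH_L1_CACHE_BYTES	(1u << SKETCH_L1_CACHE_SHIFT)

/*
 * Returns the number of cache lines covering [start, end) and stores the
 * address of the first line in *first_line.
 */
static uint32_t sketch_line_count(uint32_t start, uint32_t end,
				  uint32_t *first_line)
{
	uint32_t mask = SKETCH_L1_CACHE_BYTES - 1;
	uint32_t first = start & ~mask;		/* round low to line boundary */
	uint32_t len = end - first + mask;	/* length, rounded up */

	*first_line = first;
	return len >> SKETCH_L1_CACHE_SHIFT;	/* line count */
}
```

For example, the range 0x1005 to 0x1025 starts in the line at 0x1000 and spans three 16-byte lines, while an empty range yields a count of zero, which corresponds to the early return in the assembly.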
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
arch/powerpc/include/asm/vdso_datapage.h | 4 ----
arch/powerpc/kernel/asm-offsets.c | 2 +-
arch/powerpc/kernel/vdso.c | 5 -----
arch/powerpc/kernel/vdso32/cacheflush.S | 23 +++++++++++++++++++++++
4 files changed, 24 insertions(+), 10 deletions(-)
diff --git a/arch/powerpc/include/asm/vdso_datapage.h b/arch/powerpc/include/asm/vdso_datapage.h
index ee5319a6f4e3..b9ef6cf50ea5 100644
--- a/arch/powerpc/include/asm/vdso_datapage.h
+++ b/arch/powerpc/include/asm/vdso_datapage.h
@@ -108,10 +108,6 @@ struct vdso_data {
__u32 stamp_sec_fraction; /* fractional seconds of stamp_xtime */
__u32 hrtimer_res; /* hrtimer resolution */
__u32 syscall_map_32[SYSCALL_MAP_SIZE]; /* map of syscalls */
- __u32 dcache_block_size; /* L1 d-cache block size */
- __u32 icache_block_size; /* L1 i-cache block size */
- __u32 dcache_log_block_size; /* L1 d-cache log block size */
- __u32 icache_log_block_size; /* L1 i-cache log block size */
};
#endif /* CONFIG_PPC64 */
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 3d47aec7becf..0013197d89a6 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -389,11 +389,11 @@ int main(void)
OFFSET(STAMP_XTIME_NSEC, vdso_data, stamp_xtime_nsec);
OFFSET(STAMP_SEC_FRAC, vdso_data, stamp_sec_fraction);
OFFSET(CLOCK_HRTIMER_RES, vdso_data, hrtimer_res);
+#ifdef CONFIG_PPC64
OFFSET(CFG_ICACHE_BLOCKSZ, vdso_data, icache_block_size);
OFFSET(CFG_DCACHE_BLOCKSZ, vdso_data, dcache_block_size);
OFFSET(CFG_ICACHE_LOGBLOCKSZ, vdso_data, icache_log_block_size);
OFFSET(CFG_DCACHE_LOGBLOCKSZ, vdso_data, dcache_log_block_size);
-#ifdef CONFIG_PPC64
OFFSET(CFG_SYSCALL_MAP64, vdso_data, syscall_map_64);
OFFSET(TVAL64_TV_SEC, __kernel_old_timeval, tv_sec);
OFFSET(TVAL64_TV_USEC, __kernel_old_timeval, tv_usec);
diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index eae9ddaecbcf..b9a108411c0d 100644
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -728,11 +728,6 @@ static int __init vdso_init(void)
*/
vdso64_pages = (&vdso64_end - &vdso64_start) >> PAGE_SHIFT;
DBG("vdso64_kbase: %p, 0x%x pages\n", vdso64_kbase, vdso64_pages);
-#else
- vdso_data->dcache_block_size = L1_CACHE_BYTES;
- vdso_data->dcache_log_block_size = L1_CACHE_SHIFT;
- vdso_data->icache_block_size = L1_CACHE_BYTES;
- vdso_data->icache_log_block_size = L1_CACHE_SHIFT;
#endif /* CONFIG_PPC64 */
diff --git a/arch/powerpc/kernel/vdso32/cacheflush.S b/arch/powerpc/kernel/vdso32/cacheflush.S
index d178ec8c279d..3440ddf21c8b 100644
--- a/arch/powerpc/kernel/vdso32/cacheflush.S
+++ b/arch/powerpc/kernel/vdso32/cacheflush.S
@@ -10,6 +10,7 @@
#include <asm/vdso.h>
#include <asm/vdso_datapage.h>
#include <asm/asm-offsets.h>
+#include <asm/cache.h>
.text
@@ -23,28 +24,44 @@
*/
V_FUNCTION_BEGIN(__kernel_sync_dicache)
.cfi_startproc
+#ifdef CONFIG_PPC64
mflr r12
.cfi_register lr,r12
get_datapage r10, r0
mtlr r12
+#endif
+#ifdef CONFIG_PPC64
lwz r7,CFG_DCACHE_BLOCKSZ(r10)
addi r5,r7,-1
+#else
+ li r5, L1_CACHE_BYTES - 1
+#endif
andc r6,r3,r5 /* round low to line bdy */
subf r8,r6,r4 /* compute length */
add r8,r8,r5 /* ensure we get enough */
+#ifdef CONFIG_PPC64
lwz r9,CFG_DCACHE_LOGBLOCKSZ(r10)
srw. r8,r8,r9 /* compute line count */
+#else
+ srwi. r8, r8, L1_CACHE_SHIFT
+ mr r7, r6
+#endif
crclr cr0*4+so
beqlr /* nothing to do? */
mtctr r8
1: dcbst 0,r6
+#ifdef CONFIG_PPC64
add r6,r6,r7
+#else
+ addi r6, r6, L1_CACHE_BYTES
+#endif
bdnz 1b
sync
/* Now invalidate the instruction cache */
+#ifdef CONFIG_PPC64
lwz r7,CFG_ICACHE_BLOCKSZ(r10)
addi r5,r7,-1
andc r6,r3,r5 /* round low to line bdy */
@@ -54,9 +71,15 @@ V_FUNCTION_BEGIN(__kernel_sync_dicache)
srw. r8,r8,r9 /* compute line count */
crclr cr0*4+so
beqlr /* nothing to do? */
+#endif
mtctr r8
+#ifdef CONFIG_PPC64
2: icbi 0,r6
add r6,r6,r7
+#else
+2: icbi 0, r7
+ addi r7, r7, L1_CACHE_BYTES
+#endif
bdnz 2b
isync
li r3,0
--
2.13.3
* [PATCH v4 6/8] powerpc/vdso32: use LOAD_REG_IMMEDIATE()
2019-12-02 7:57 [PATCH v4 0/8] powerpc/vdso32 enhancement and optimisation Christophe Leroy
` (4 preceding siblings ...)
2019-12-02 7:57 ` [PATCH v4 5/8] powerpc/vdso32: Don't read cache line size from the datapage on PPC32 Christophe Leroy
@ 2019-12-02 7:57 ` Christophe Leroy
2019-12-02 7:57 ` [PATCH v4 7/8] powerpc/vdso32: implement clock_getres entirely Christophe Leroy
2019-12-02 7:57 ` [PATCH v4 8/8] powerpc/vdso32: miscellaneous optimisations Christophe Leroy
7 siblings, 0 replies; 12+ messages in thread
From: Christophe Leroy @ 2019-12-02 7:57 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel, arnd
Use LOAD_REG_IMMEDIATE() to load registers with an immediate value.
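For reference, the two open-coded sequences removed by this patch build the same kind of 32-bit constant in slightly different ways: lis/ori pairs with @h/@l because ori zero-extends its operand, while lis/addi pairs with @ha/@l because addi sign-extends it. The C sketch below models only that arithmetic. The helper names are hypothetical, and it makes no claim about what LOAD_REG_IMMEDIATE() itself expands to.

```c
#include <stdint.h>

/* Assembler operators on a 32-bit value. */
static uint32_t sketch_h(uint32_t x)  { return (x >> 16) & 0xffff; }
static uint32_t sketch_l(uint32_t x)  { return x & 0xffff; }
static uint32_t sketch_ha(uint32_t x) { return ((x + 0x8000) >> 16) & 0xffff; }

/* lis rD,hi ; ori rD,rD,lo   (ori zero-extends lo) */
static uint32_t sketch_lis_ori(uint32_t hi, uint32_t lo)
{
	return (hi << 16) | lo;
}

/* lis rD,hi ; addi rD,rD,lo  (addi sign-extends lo) */
static uint32_t sketch_lis_addi(uint32_t hi, uint32_t lo)
{
	uint32_t borrow = (lo & 0x8000) ? 0x10000u : 0;

	return (hi << 16) + lo - borrow;
}
```

Both pairings reproduce 1000000 and 1000000000 exactly. The @ha adjustment only matters when bit 15 of the low half is set, as it is for 1000000000 (low half 0xca00).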
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
arch/powerpc/kernel/vdso32/gettimeofday.S | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/kernel/vdso32/gettimeofday.S b/arch/powerpc/kernel/vdso32/gettimeofday.S
index 74973548529a..9aafacea9c4a 100644
--- a/arch/powerpc/kernel/vdso32/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso32/gettimeofday.S
@@ -37,8 +37,7 @@ V_FUNCTION_BEGIN(__kernel_gettimeofday)
get_datapage r9, r0
cmplwi r10,0 /* check if tv is NULL */
beq 3f
- lis r7,1000000@ha /* load up USEC_PER_SEC */
- addi r7,r7,1000000@l /* so we get microseconds in r4 */
+ LOAD_REG_IMMEDIATE(r7, 1000000) /* load up USEC_PER_SEC */
bl __do_get_tspec@local /* get sec/usec from tb & kernel */
stw r3,TVAL32_TV_SEC(r10)
stw r4,TVAL32_TV_USEC(r10)
@@ -81,8 +80,7 @@ V_FUNCTION_BEGIN(__kernel_clock_gettime)
.cfi_register lr,r12
mr r11,r4 /* r11 saves tp */
get_datapage r9, r0
- lis r7,NSEC_PER_SEC@h /* want nanoseconds */
- ori r7,r7,NSEC_PER_SEC@l
+ LOAD_REG_IMMEDIATE(r7, NSEC_PER_SEC) /* load up NSEC_PER_SEC */
beq cr5, .Lcoarse_clocks
.Lprecise_clocks:
bl __do_get_tspec@local /* get sec/nsec from tb & kernel */
--
2.13.3
* [PATCH v4 7/8] powerpc/vdso32: implement clock_getres entirely
2019-12-02 7:57 [PATCH v4 0/8] powerpc/vdso32 enhancement and optimisation Christophe Leroy
` (5 preceding siblings ...)
2019-12-02 7:57 ` [PATCH v4 6/8] powerpc/vdso32: use LOAD_REG_IMMEDIATE() Christophe Leroy
@ 2019-12-02 7:57 ` Christophe Leroy
2020-05-05 22:52 ` Aurelien Jarno
2019-12-02 7:57 ` [PATCH v4 8/8] powerpc/vdso32: miscellaneous optimisations Christophe Leroy
7 siblings, 1 reply; 12+ messages in thread
From: Christophe Leroy @ 2019-12-02 7:57 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel, arnd
clock_getres returns hrtimer_res for all clocks except the coarse
ones, for which it returns KTIME_LOW_RES.
Return EINVAL for unknown clocks.
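The decision logic of the new assembly can be sketched in C as below. The SK_ names are hypothetical. The clock IDs and EINVAL carry their usual Linux values, the SK_KTIME_LOW_RES value is only an example (it depends on HZ), and the error is modelled as a plain positive return value, whereas the assembly returns it in r3 with the SO bit set.

```c
#include <stddef.h>
#include <stdint.h>

enum {
	SK_CLOCK_REALTIME_COARSE  = 5,
	SK_CLOCK_MONOTONIC_COARSE = 6,
	SK_CLOCK_TAI              = 11,
	SK_CLOCK_MAX              = SK_CLOCK_TAI,
};

#define SK_EINVAL		22
#define SK_KTIME_LOW_RES	4000000u	/* hypothetical example value */

struct sketch_timespec {
	int32_t tv_sec;
	int32_t tv_nsec;
};

static int sketch_clock_getres(uint32_t clock_id, uint32_t hrtimer_res,
			       struct sketch_timespec *res)
{
	uint32_t ns;

	/* Unsigned compare, as with cmplwi: negative IDs are rejected too. */
	if (clock_id > SK_CLOCK_MAX)
		return SK_EINVAL;

	if (clock_id == SK_CLOCK_REALTIME_COARSE ||
	    clock_id == SK_CLOCK_MONOTONIC_COARSE)
		ns = SK_KTIME_LOW_RES;
	else
		ns = hrtimer_res;	/* value read from the data page */

	if (res) {
		res->tv_sec = 0;
		res->tv_nsec = (int32_t)ns;
	}
	return 0;
}
```

Note that a NULL res pointer is still a success, matching the beqlr in the assembly.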
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
arch/powerpc/kernel/asm-offsets.c | 3 +++
arch/powerpc/kernel/vdso32/gettimeofday.S | 19 +++++++++++--------
2 files changed, 14 insertions(+), 8 deletions(-)
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 0013197d89a6..90e53d432f2e 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -413,7 +413,10 @@ int main(void)
DEFINE(CLOCK_MONOTONIC, CLOCK_MONOTONIC);
DEFINE(CLOCK_REALTIME_COARSE, CLOCK_REALTIME_COARSE);
DEFINE(CLOCK_MONOTONIC_COARSE, CLOCK_MONOTONIC_COARSE);
+ DEFINE(CLOCK_MAX, CLOCK_TAI);
DEFINE(NSEC_PER_SEC, NSEC_PER_SEC);
+ DEFINE(EINVAL, EINVAL);
+ DEFINE(KTIME_LOW_RES, KTIME_LOW_RES);
#ifdef CONFIG_BUG
DEFINE(BUG_ENTRY_SIZE, sizeof(struct bug_entry));
diff --git a/arch/powerpc/kernel/vdso32/gettimeofday.S b/arch/powerpc/kernel/vdso32/gettimeofday.S
index 9aafacea9c4a..20ae38f3a5a3 100644
--- a/arch/powerpc/kernel/vdso32/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso32/gettimeofday.S
@@ -196,17 +196,20 @@ V_FUNCTION_END(__kernel_clock_gettime)
V_FUNCTION_BEGIN(__kernel_clock_getres)
.cfi_startproc
/* Check for supported clock IDs */
- cmpwi cr0,r3,CLOCK_REALTIME
- cmpwi cr1,r3,CLOCK_MONOTONIC
- cror cr0*4+eq,cr0*4+eq,cr1*4+eq
- bne cr0,99f
+ cmplwi cr0, r3, CLOCK_MAX
+ cmpwi cr1, r3, CLOCK_REALTIME_COARSE
+ cmpwi cr7, r3, CLOCK_MONOTONIC_COARSE
+ bgt cr0, 99f
+ LOAD_REG_IMMEDIATE(r5, KTIME_LOW_RES)
+ beq cr1, 1f
+ beq cr7, 1f
mflr r12
.cfi_register lr,r12
get_datapage r3, r0
lwz r5, CLOCK_HRTIMER_RES(r3)
mtlr r12
- li r3,0
+1: li r3,0
cmpli cr0,r4,0
crclr cr0*4+so
beqlr
@@ -215,11 +218,11 @@ V_FUNCTION_BEGIN(__kernel_clock_getres)
blr
/*
- * syscall fallback
+ * invalid clock
*/
99:
- li r0,__NR_clock_getres
- sc
+ li r3, EINVAL
+ crset so
blr
.cfi_endproc
V_FUNCTION_END(__kernel_clock_getres)
--
2.13.3
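For readers following the assembly, the clock selection described in this patch can be modelled in C. This is only a sketch under stated assumptions, not the kernel's code: `sketch_clock_getres` and its parameters are hypothetical names, the numeric clock IDs are the Linux UAPI values (CLOCK_REALTIME_COARSE = 5, CLOCK_MONOTONIC_COARSE = 6, CLOCK_TAI = 11), and the two resolutions are passed in as arguments rather than read from the datapage or from KTIME_LOW_RES. Only the tv_nsec result is modelled; the condition-register SO bit is represented by a non-zero return value.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Linux UAPI clock IDs, copied here so the sketch is self-contained. */
enum {
	SKETCH_CLOCK_REALTIME_COARSE  = 5,
	SKETCH_CLOCK_MONOTONIC_COARSE = 6,
	SKETCH_CLOCK_MAX              = 11,	/* CLOCK_TAI, as in asm-offsets.c */
};

/*
 * Hypothetical C model of the patched __kernel_clock_getres.
 * Returns 0 on success or EINVAL for a clock ID outside 0..CLOCK_MAX.
 * hrtimer_res_ns and low_res_ns stand in for the datapage value and
 * KTIME_LOW_RES respectively.
 */
static int sketch_clock_getres(int32_t clock_id, long hrtimer_res_ns,
			       long low_res_ns, long *res_ns)
{
	long ns;

	/* cmplwi is an unsigned compare, so negative IDs also land here. */
	if ((uint32_t)clock_id > SKETCH_CLOCK_MAX)
		return EINVAL;

	if (clock_id == SKETCH_CLOCK_REALTIME_COARSE ||
	    clock_id == SKETCH_CLOCK_MONOTONIC_COARSE)
		ns = low_res_ns;	/* coarse clocks skip the datapage read */
	else
		ns = hrtimer_res_ns;

	/* A NULL result pointer is accepted and still reports success. */
	if (res_ns != NULL)
		*res_ns = ns;
	return 0;
}
```

Because the range check is unsigned, every negative clock ID is treated as greater than CLOCK_MAX and gets EINVAL, since the patch no longer falls back to the syscall.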
* [PATCH v4 8/8] powerpc/vdso32: miscellaneous optimisations
2019-12-02 7:57 [PATCH v4 0/8] powerpc/vdso32 enhancement and optimisation Christophe Leroy
` (6 preceding siblings ...)
2019-12-02 7:57 ` [PATCH v4 7/8] powerpc/vdso32: implement clock_getres entirely Christophe Leroy
@ 2019-12-02 7:57 ` Christophe Leroy
7 siblings, 0 replies; 12+ messages in thread
From: Christophe Leroy @ 2019-12-02 7:57 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel, arnd
Various optimisations by inverting branches and removing
redundant instructions.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
arch/powerpc/kernel/vdso32/datapage.S | 3 +--
arch/powerpc/kernel/vdso32/getcpu.S | 6 +++---
arch/powerpc/kernel/vdso32/gettimeofday.S | 18 +++++++++---------
3 files changed, 13 insertions(+), 14 deletions(-)
diff --git a/arch/powerpc/kernel/vdso32/datapage.S b/arch/powerpc/kernel/vdso32/datapage.S
index 1095d818f94a..217bb630f8f9 100644
--- a/arch/powerpc/kernel/vdso32/datapage.S
+++ b/arch/powerpc/kernel/vdso32/datapage.S
@@ -30,11 +30,10 @@ V_FUNCTION_BEGIN(__kernel_get_syscall_map)
.cfi_startproc
mflr r12
.cfi_register lr,r12
- mr r4,r3
+ mr. r4,r3
get_datapage r3, r0
mtlr r12
addi r3,r3,CFG_SYSCALL_MAP32
- cmpli cr0,r4,0
beqlr
li r0,NR_syscalls
stw r0,0(r4)
diff --git a/arch/powerpc/kernel/vdso32/getcpu.S b/arch/powerpc/kernel/vdso32/getcpu.S
index 90b39af14383..ff5e214fec41 100644
--- a/arch/powerpc/kernel/vdso32/getcpu.S
+++ b/arch/powerpc/kernel/vdso32/getcpu.S
@@ -25,10 +25,10 @@ V_FUNCTION_BEGIN(__kernel_getcpu)
rlwinm r7,r5,16,31-15,31-0
beq cr0,1f
stw r6,0(r3)
-1: beq cr1,2f
- stw r7,0(r4)
-2: crclr cr0*4+so
+1: crclr cr0*4+so
li r3,0 /* always success */
+ beqlr cr1
+ stw r7,0(r4)
blr
.cfi_endproc
V_FUNCTION_END(__kernel_getcpu)
diff --git a/arch/powerpc/kernel/vdso32/gettimeofday.S b/arch/powerpc/kernel/vdso32/gettimeofday.S
index 20ae38f3a5a3..a3951567118a 100644
--- a/arch/powerpc/kernel/vdso32/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso32/gettimeofday.S
@@ -32,10 +32,9 @@ V_FUNCTION_BEGIN(__kernel_gettimeofday)
mflr r12
.cfi_register lr,r12
- mr r10,r3 /* r10 saves tv */
+ mr. r10,r3 /* r10 saves tv */
mr r11,r4 /* r11 saves tz */
get_datapage r9, r0
- cmplwi r10,0 /* check if tv is NULL */
beq 3f
LOAD_REG_IMMEDIATE(r7, 1000000) /* load up USEC_PER_SEC */
bl __do_get_tspec@local /* get sec/usec from tb & kernel */
@@ -43,15 +42,16 @@ V_FUNCTION_BEGIN(__kernel_gettimeofday)
stw r4,TVAL32_TV_USEC(r10)
3: cmplwi r11,0 /* check if tz is NULL */
- beq 1f
+ mtlr r12
+ crclr cr0*4+so
+ li r3,0
+ beqlr
+
lwz r4,CFG_TZ_MINUTEWEST(r9)/* fill tz */
lwz r5,CFG_TZ_DSTTIME(r9)
stw r4,TZONE_TZ_MINWEST(r11)
stw r5,TZONE_TZ_DSTTIME(r11)
-1: mtlr r12
- crclr cr0*4+so
- li r3,0
blr
.cfi_endproc
V_FUNCTION_END(__kernel_gettimeofday)
@@ -245,10 +245,10 @@ V_FUNCTION_BEGIN(__kernel_time)
lwz r3,STAMP_XTIME_SEC+LOPART(r9)
cmplwi r11,0 /* check if t is NULL */
- beq 2f
- stw r3,0(r11) /* store result at *t */
-2: mtlr r12
+ mtlr r12
crclr cr0*4+so
+ beqlr
+ stw r3,0(r11) /* store result at *t */
blr
.cfi_endproc
V_FUNCTION_END(__kernel_time)
--
2.13.3
* Re: [PATCH v4 3/8] powerpc: Fix vDSO clock_getres()
2019-12-02 7:57 ` [PATCH v4 3/8] powerpc: Fix vDSO clock_getres() Christophe Leroy
@ 2019-12-04 13:30 ` Michael Ellerman
0 siblings, 0 replies; 12+ messages in thread
From: Michael Ellerman @ 2019-12-04 13:30 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras
Cc: linuxppc-dev, linux-kernel, arnd
On Mon, 2019-12-02 at 07:57:29 UTC, Christophe Leroy wrote:
> From: Vincenzo Frascino <vincenzo.frascino@arm.com>
>
> clock_getres in the vDSO library has to preserve the same behaviour
> of posix_get_hrtimer_res().
>
> In particular, posix_get_hrtimer_res() does:
> sec = 0;
> ns = hrtimer_resolution;
> and hrtimer_resolution depends on the enablement of the high
> resolution timers that can happen either at compile or at run time.
>
> Fix the powerpc vdso implementation of clock_getres keeping a copy of
> hrtimer_resolution in vdso data and using that directly.
>
> Fixes: a7f290dad32e ("[PATCH] powerpc: Merge vdso's and add vdso support
> to 32 bits kernel")
> Cc: stable@vger.kernel.org
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
> Acked-by: Shuah Khan <skhan@linuxfoundation.org>
> [chleroy: changed CLOCK_REALTIME_RES to CLOCK_HRTIMER_RES]
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Applied to powerpc fixes, thanks.
https://git.kernel.org/powerpc/c/552263456215ada7ee8700ce022d12b0cffe4802
cheers
* Re: [PATCH v4 1/8] powerpc/32: Add VDSO version of getcpu on non SMP
2019-12-02 7:57 ` [PATCH v4 1/8] powerpc/32: Add VDSO version of getcpu on non SMP Christophe Leroy
@ 2020-01-29 5:17 ` Michael Ellerman
0 siblings, 0 replies; 12+ messages in thread
From: Michael Ellerman @ 2020-01-29 5:17 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras
Cc: linuxppc-dev, linux-kernel, arnd
On Mon, 2019-12-02 at 07:57:27 UTC, Christophe Leroy wrote:
> Commit 18ad51dd342a ("powerpc: Add VDSO version of getcpu") added
> getcpu() for PPC64 only, by making use of a user readable general
> purpose SPR.
>
> PPC32 doesn't have any such SPR.
>
> For non SMP, just return CPU id 0 from the VDSO directly.
> PPC32 doesn't support CONFIG_NUMA so NUMA node is always 0.
>
> Before the patch, vdsotest reported:
> getcpu: syscall: 1572 nsec/call
> getcpu: libc: 1787 nsec/call
> getcpu: vdso: not tested
>
> Now, vdsotest reports:
> getcpu: syscall: 1582 nsec/call
> getcpu: libc: 502 nsec/call
> getcpu: vdso: 187 nsec/call
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Patches 1, 2 and 4-8, applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/902137ba8e469ed07c7f120a390161937a6288fb
cheers
* Re: [PATCH v4 7/8] powerpc/vdso32: implement clock_getres entirely
2019-12-02 7:57 ` [PATCH v4 7/8] powerpc/vdso32: implement clock_getres entirely Christophe Leroy
@ 2020-05-05 22:52 ` Aurelien Jarno
0 siblings, 0 replies; 12+ messages in thread
From: Aurelien Jarno @ 2020-05-05 22:52 UTC (permalink / raw)
To: Christophe Leroy; +Cc: arnd, linux-kernel, Paul Mackerras, linuxppc-dev
Hi,
On 2019-12-02 07:57, Christophe Leroy wrote:
> clock_getres returns hrtimer_res for all clocks but coarse ones
> for which it returns KTIME_LOW_RES.
>
> return EINVAL for unknown clocks.
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> ---
> arch/powerpc/kernel/asm-offsets.c | 3 +++
> arch/powerpc/kernel/vdso32/gettimeofday.S | 19 +++++++++++--------
> 2 files changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
> index 0013197d89a6..90e53d432f2e 100644
> --- a/arch/powerpc/kernel/asm-offsets.c
> +++ b/arch/powerpc/kernel/asm-offsets.c
> @@ -413,7 +413,10 @@ int main(void)
> DEFINE(CLOCK_MONOTONIC, CLOCK_MONOTONIC);
> DEFINE(CLOCK_REALTIME_COARSE, CLOCK_REALTIME_COARSE);
> DEFINE(CLOCK_MONOTONIC_COARSE, CLOCK_MONOTONIC_COARSE);
> + DEFINE(CLOCK_MAX, CLOCK_TAI);
> DEFINE(NSEC_PER_SEC, NSEC_PER_SEC);
> + DEFINE(EINVAL, EINVAL);
> + DEFINE(KTIME_LOW_RES, KTIME_LOW_RES);
>
> #ifdef CONFIG_BUG
> DEFINE(BUG_ENTRY_SIZE, sizeof(struct bug_entry));
> diff --git a/arch/powerpc/kernel/vdso32/gettimeofday.S b/arch/powerpc/kernel/vdso32/gettimeofday.S
> index 9aafacea9c4a..20ae38f3a5a3 100644
> --- a/arch/powerpc/kernel/vdso32/gettimeofday.S
> +++ b/arch/powerpc/kernel/vdso32/gettimeofday.S
> @@ -196,17 +196,20 @@ V_FUNCTION_END(__kernel_clock_gettime)
> V_FUNCTION_BEGIN(__kernel_clock_getres)
> .cfi_startproc
> /* Check for supported clock IDs */
> - cmpwi cr0,r3,CLOCK_REALTIME
> - cmpwi cr1,r3,CLOCK_MONOTONIC
> - cror cr0*4+eq,cr0*4+eq,cr1*4+eq
> - bne cr0,99f
> + cmplwi cr0, r3, CLOCK_MAX
> + cmpwi cr1, r3, CLOCK_REALTIME_COARSE
> + cmpwi cr7, r3, CLOCK_MONOTONIC_COARSE
> + bgt cr0, 99f
> + LOAD_REG_IMMEDIATE(r5, KTIME_LOW_RES)
> + beq cr1, 1f
> + beq cr7, 1f
>
> mflr r12
> .cfi_register lr,r12
> get_datapage r3, r0
> lwz r5, CLOCK_HRTIMER_RES(r3)
> mtlr r12
> - li r3,0
> +1: li r3,0
> cmpli cr0,r4,0
> crclr cr0*4+so
> beqlr
> @@ -215,11 +218,11 @@ V_FUNCTION_BEGIN(__kernel_clock_getres)
> blr
>
> /*
> - * syscall fallback
> + * invalid clock
> */
> 99:
> - li r0,__NR_clock_getres
> - sc
> + li r3, EINVAL
> + crset so
> blr
> .cfi_endproc
> V_FUNCTION_END(__kernel_clock_getres)
Removing the syscall fallback looks wrong: it broke access to
per-process clocks. With this change, a few glibc tests now fail.
This can be reproduced with the simple code below:
| #include <errno.h>
| #include <stdio.h>
| #include <string.h>
| #include <sys/types.h>
| #include <time.h>
| #include <unistd.h>
|
| int main()
| {
| struct timespec res;
| clockid_t ci;
| int e;
|
| e = clock_getcpuclockid(getpid(), &ci);
| if (e) {
| printf("clock_getcpuclockid returned %d\n", e);
| return e;
| }
| e = clock_getres (ci, &res);
| printf("clock_getres returned %d\n", e);
| if (e) {
| printf(" errno: %d, %s\n", errno, strerror(errno));
| }
|
| return e;
| }
Without this patch or with -m64, it returns:
| clock_getres returned 0
With this patch and -m32, it returns:
| clock_getres returned -1
| errno: 22, Invalid argument
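One plausible reading of this failure, sketched under assumptions rather than confirmed in this thread: Linux encodes a per-process CPU clock ID as `(~pid << 3) | type`, which is a negative number, and the patched vDSO compares the clock ID against CLOCK_MAX with an unsigned compare (cmplwi). The helper names below are hypothetical, and the constant 2 is assumed to be the CPUCLOCK_SCHED type that glibc's clock_getcpuclockid() uses.

```c
#include <stdint.h>

#define SKETCH_CLOCK_MAX      11u	/* CLOCK_TAI */
#define SKETCH_CPUCLOCK_SCHED  2u	/* assumed clock type for process clocks */

/*
 * Hypothetical helper mirroring the kernel's per-process clock ID
 * encoding: the inverted pid in the upper bits, the type in the low
 * three bits. The final conversion to int32_t relies on the usual
 * two's-complement behaviour.
 */
static int32_t sketch_process_cpuclock(int32_t pid)
{
	return (int32_t)((~(uint32_t)pid << 3) | SKETCH_CPUCLOCK_SCHED);
}

/* Models "cmplwi cr0, r3, CLOCK_MAX; bgt cr0, 99f": an unsigned test. */
static int sketch_vdso_rejects(int32_t clock_id)
{
	return (uint32_t)clock_id > SKETCH_CLOCK_MAX;
}
```

Under these assumptions, pid 1 yields clock ID -14 (0xfffffff2), which the unsigned test sends to the EINVAL path, whereas the removed syscall fallback would have let the kernel handle it.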
Regards,
Aurelien
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net