* [PATCH 0/2] generic/bitops: Always inline some more generic helpers
From: Borislav Petkov @ 2022-01-13 15:53 UTC
To: Peter Zijlstra
Cc: Arnd Bergmann, Boqun Feng, Marco Elver, Paul E. McKenney, Will Deacon, X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Hi all,

a build report by the 0day robot:

  https://lore.kernel.org/r/Yc7t934f%2Bf/mO8lj@zn.tnic

made me look at the generated asm and at how gcc, at least, emits funky
calls to the *_bit() bit manipulation functions on x86 instead of
inlining them into the call sites, even though on x86 each of them is a
single insn in most cases.

PeterZ says the way to go is to always inline them, so here they are.
The fun thing is that on x86 there is even a size decrease of more than
a kilobyte for a defconfig, which is nice — see patch 1.

As always, comments and suggestions are welcome.

Thx.

Borislav Petkov (2):
  asm-generic/bitops: Always inline all bit manipulation helpers
  cpumask: Always inline helpers which use bit manipulation functions

 include/asm-generic/bitops/instrumented-atomic.h     | 12 ++++++------
 .../asm-generic/bitops/instrumented-non-atomic.h     | 16 ++++++++--------
 include/linux/cpumask.h                              | 14 +++++++-------
 3 files changed, 21 insertions(+), 21 deletions(-)

-- 
2.29.2
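[Editor's illustration] The distinction the series relies on can be sketched
in userspace C. This is not the kernel code itself: `BIT_WORD`/`BIT_MASK`
mirror the kernel macros of the same names, and `__always_inline` is spelled
out here the way the kernel defines it — plain `inline` is only a hint that
gcc's heuristics may ignore, while the `always_inline` attribute forces
inlining at every call site.

```c
#include <assert.h>
#include <limits.h>

/* Userspace sketch of the kernel pattern under discussion. */
#define BITS_PER_LONG	(sizeof(long) * CHAR_BIT)
#define BIT_WORD(nr)	((nr) / BITS_PER_LONG)
#define BIT_MASK(nr)	(1UL << ((nr) % BITS_PER_LONG))

/* How the kernel spells it (include/linux/compiler_types.h does similar). */
#define __always_inline	inline __attribute__((__always_inline__))

/* A *_bit()-style helper: one OR on the word containing bit @nr. */
static __always_inline void sketch_set_bit(long nr, volatile unsigned long *addr)
{
	addr[BIT_WORD(nr)] |= BIT_MASK(nr);
}

static __always_inline int sketch_test_bit(long nr, const volatile unsigned long *addr)
{
	return (addr[BIT_WORD(nr)] >> (nr % BITS_PER_LONG)) & 1;
}
```

Compiled with gcc, the attribute guarantees the single-instruction body is
emitted inline rather than behind a call, which is the whole point of the
series.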
* [PATCH 1/2] asm-generic/bitops: Always inline all bit manipulation helpers
From: Borislav Petkov @ 2022-01-13 15:53 UTC
To: Peter Zijlstra
Cc: Arnd Bergmann, Boqun Feng, Marco Elver, Paul E. McKenney, Will Deacon, X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Make it consistent with the atomic/atomic-instrumented.h helpers.

And defconfig size is actually going down!

     text    data     bss      dec     hex filename
 22352096 8213152 1917164 32482412 1efa46c vmlinux.x86-64.defconfig.before
 22350551 8213184 1917164 32480899 1ef9e83 vmlinux.x86-64.defconfig.after

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 include/asm-generic/bitops/instrumented-atomic.h     | 12 ++++++------
 .../asm-generic/bitops/instrumented-non-atomic.h     | 16 ++++++++--------
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/include/asm-generic/bitops/instrumented-atomic.h b/include/asm-generic/bitops/instrumented-atomic.h
index c90192b1c755..4225a8ca9c1a 100644
--- a/include/asm-generic/bitops/instrumented-atomic.h
+++ b/include/asm-generic/bitops/instrumented-atomic.h
@@ -23,7 +23,7 @@
  * Note that @nr may be almost arbitrarily large; this function is not
  * restricted to acting on a single-word quantity.
  */
-static inline void set_bit(long nr, volatile unsigned long *addr)
+static __always_inline void set_bit(long nr, volatile unsigned long *addr)
 {
 	instrument_atomic_write(addr + BIT_WORD(nr), sizeof(long));
 	arch_set_bit(nr, addr);
@@ -36,7 +36,7 @@ static inline void set_bit(long nr, volatile unsigned long *addr)
  *
  * This is a relaxed atomic operation (no implied memory barriers).
  */
-static inline void clear_bit(long nr, volatile unsigned long *addr)
+static __always_inline void clear_bit(long nr, volatile unsigned long *addr)
 {
 	instrument_atomic_write(addr + BIT_WORD(nr), sizeof(long));
 	arch_clear_bit(nr, addr);
@@ -52,7 +52,7 @@ static inline void clear_bit(long nr, volatile unsigned long *addr)
  * Note that @nr may be almost arbitrarily large; this function is not
  * restricted to acting on a single-word quantity.
  */
-static inline void change_bit(long nr, volatile unsigned long *addr)
+static __always_inline void change_bit(long nr, volatile unsigned long *addr)
 {
 	instrument_atomic_write(addr + BIT_WORD(nr), sizeof(long));
 	arch_change_bit(nr, addr);
@@ -65,7 +65,7 @@ static inline void change_bit(long nr, volatile unsigned long *addr)
  *
  * This is an atomic fully-ordered operation (implied full memory barrier).
  */
-static inline bool test_and_set_bit(long nr, volatile unsigned long *addr)
+static __always_inline bool test_and_set_bit(long nr, volatile unsigned long *addr)
 {
 	kcsan_mb();
 	instrument_atomic_read_write(addr + BIT_WORD(nr), sizeof(long));
@@ -79,7 +79,7 @@ static inline bool test_and_set_bit(long nr, volatile unsigned long *addr)
  *
  * This is an atomic fully-ordered operation (implied full memory barrier).
  */
-static inline bool test_and_clear_bit(long nr, volatile unsigned long *addr)
+static __always_inline bool test_and_clear_bit(long nr, volatile unsigned long *addr)
 {
 	kcsan_mb();
 	instrument_atomic_read_write(addr + BIT_WORD(nr), sizeof(long));
@@ -93,7 +93,7 @@ static inline bool test_and_clear_bit(long nr, volatile unsigned long *addr)
  *
  * This is an atomic fully-ordered operation (implied full memory barrier).
  */
-static inline bool test_and_change_bit(long nr, volatile unsigned long *addr)
+static __always_inline bool test_and_change_bit(long nr, volatile unsigned long *addr)
 {
 	kcsan_mb();
 	instrument_atomic_read_write(addr + BIT_WORD(nr), sizeof(long));
diff --git a/include/asm-generic/bitops/instrumented-non-atomic.h b/include/asm-generic/bitops/instrumented-non-atomic.h
index 37363d570b9b..7ab1ecc37782 100644
--- a/include/asm-generic/bitops/instrumented-non-atomic.h
+++ b/include/asm-generic/bitops/instrumented-non-atomic.h
@@ -22,7 +22,7 @@
  * region of memory concurrently, the effect may be that only one operation
  * succeeds.
  */
-static inline void __set_bit(long nr, volatile unsigned long *addr)
+static __always_inline void __set_bit(long nr, volatile unsigned long *addr)
 {
 	instrument_write(addr + BIT_WORD(nr), sizeof(long));
 	arch___set_bit(nr, addr);
@@ -37,7 +37,7 @@ static inline void __set_bit(long nr, volatile unsigned long *addr)
  * region of memory concurrently, the effect may be that only one operation
  * succeeds.
  */
-static inline void __clear_bit(long nr, volatile unsigned long *addr)
+static __always_inline void __clear_bit(long nr, volatile unsigned long *addr)
 {
 	instrument_write(addr + BIT_WORD(nr), sizeof(long));
 	arch___clear_bit(nr, addr);
@@ -52,13 +52,13 @@ static inline void __clear_bit(long nr, volatile unsigned long *addr)
  * region of memory concurrently, the effect may be that only one operation
  * succeeds.
  */
-static inline void __change_bit(long nr, volatile unsigned long *addr)
+static __always_inline void __change_bit(long nr, volatile unsigned long *addr)
 {
 	instrument_write(addr + BIT_WORD(nr), sizeof(long));
 	arch___change_bit(nr, addr);
 }
 
-static inline void __instrument_read_write_bitop(long nr, volatile unsigned long *addr)
+static __always_inline void __instrument_read_write_bitop(long nr, volatile unsigned long *addr)
 {
 	if (IS_ENABLED(CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC)) {
 		/*
@@ -90,7 +90,7 @@ static inline void __instrument_read_write_bitop(long nr, volatile unsigned long
 * This operation is non-atomic. If two instances of this operation race, one
 * can appear to succeed but actually fail.
 */
-static inline bool __test_and_set_bit(long nr, volatile unsigned long *addr)
+static __always_inline bool __test_and_set_bit(long nr, volatile unsigned long *addr)
 {
 	__instrument_read_write_bitop(nr, addr);
 	return arch___test_and_set_bit(nr, addr);
@@ -104,7 +104,7 @@ static inline bool __test_and_set_bit(long nr, volatile unsigned long *addr)
 * This operation is non-atomic. If two instances of this operation race, one
 * can appear to succeed but actually fail.
 */
-static inline bool __test_and_clear_bit(long nr, volatile unsigned long *addr)
+static __always_inline bool __test_and_clear_bit(long nr, volatile unsigned long *addr)
 {
 	__instrument_read_write_bitop(nr, addr);
 	return arch___test_and_clear_bit(nr, addr);
@@ -118,7 +118,7 @@ static inline bool __test_and_clear_bit(long nr, volatile unsigned long *addr)
 * This operation is non-atomic. If two instances of this operation race, one
 * can appear to succeed but actually fail.
 */
-static inline bool __test_and_change_bit(long nr, volatile unsigned long *addr)
+static __always_inline bool __test_and_change_bit(long nr, volatile unsigned long *addr)
 {
 	__instrument_read_write_bitop(nr, addr);
 	return arch___test_and_change_bit(nr, addr);
@@ -129,7 +129,7 @@ static inline bool __test_and_change_bit(long nr, volatile unsigned long *addr)
 * @nr: bit number to test
 * @addr: Address to start counting from
 */
-static inline bool test_bit(long nr, const volatile unsigned long *addr)
+static __always_inline bool test_bit(long nr, const volatile unsigned long *addr)
 {
 	instrument_atomic_read(addr + BIT_WORD(nr), sizeof(long));
 	return arch_test_bit(nr, addr);
-- 
2.29.2
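[Editor's illustration] The two-layer shape that this patch annotates — a
public helper that first instruments the memory access for the sanitizers
and then defers to an `arch_` implementation — can be sketched in userspace.
`instrument_atomic_write()` and `arch_set_bit()` below are simplified
stand-ins, not the real kernel functions: the stub just counts calls, and
the real x86 `arch_set_bit()` is a single `lock bts`-style instruction
rather than a plain read-modify-write.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG	(sizeof(long) * CHAR_BIT)
#define BIT_WORD(nr)	((nr) / BITS_PER_LONG)

/* Stand-in for the KASAN/KCSAN instrumentation hook: count invocations. */
static int instrumented_accesses;

static inline void instrument_atomic_write(const volatile void *addr, long size)
{
	(void)addr;
	(void)size;
	instrumented_accesses++;
}

/* Stand-in arch implementation (the real x86 one is atomic). */
static inline void arch_set_bit(long nr, volatile unsigned long *addr)
{
	addr[BIT_WORD(nr)] |= 1UL << (nr % BITS_PER_LONG);
}

/* The wrapper shape from instrumented-atomic.h: instrument, then defer. */
static inline __attribute__((__always_inline__))
void sketch_public_set_bit(long nr, volatile unsigned long *addr)
{
	instrument_atomic_write(addr + BIT_WORD(nr), sizeof(long));
	arch_set_bit(nr, addr);
}
```

With `__always_inline` on the wrapper, both the instrumentation call and the
single-instruction arch body land directly in the caller, which is why the
patch can shrink the image while adding an attribute.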
* [tip: locking/core] asm-generic/bitops: Always inline all bit manipulation helpers
From: tip-bot2 for Borislav Petkov @ 2022-01-26 13:30 UTC
To: linux-tip-commits
Cc: Peter Zijlstra (Intel), Borislav Petkov, Marco Elver, x86, linux-kernel

The following commit has been merged into the locking/core branch of tip:

Commit-ID:     acb13ea0baf8db8d05a3910c06e997c90825faad
Gitweb:        https://git.kernel.org/tip/acb13ea0baf8db8d05a3910c06e997c90825faad
Author:        Borislav Petkov <bp@suse.de>
AuthorDate:    Thu, 13 Jan 2022 16:53:56 +01:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 25 Jan 2022 22:30:28 +01:00

asm-generic/bitops: Always inline all bit manipulation helpers

Make it consistent with the atomic/atomic-instrumented.h helpers.

And defconfig size is actually going down!

     text    data     bss      dec     hex filename
 22352096 8213152 1917164 32482412 1efa46c vmlinux.x86-64.defconfig.before
 22350551 8213184 1917164 32480899 1ef9e83 vmlinux.x86-64.defconfig.after

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Marco Elver <elver@google.com>
Link: https://lore.kernel.org/r/20220113155357.4706-2-bp@alien8.de
---
 include/asm-generic/bitops/instrumented-atomic.h     | 12 ++++----
 include/asm-generic/bitops/instrumented-non-atomic.h | 16 +++++------
 2 files changed, 14 insertions(+), 14 deletions(-)

[The commit's diff is identical to the one posted in PATCH 1/2 above.]
* [PATCH 2/2] cpumask: Always inline helpers which use bit manipulation functions
From: Borislav Petkov @ 2022-01-13 15:53 UTC
To: Peter Zijlstra
Cc: Arnd Bergmann, Boqun Feng, Marco Elver, Paul E. McKenney, Will Deacon, X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Former are always inlined so do that for the latter too, for
consistency.

Size impact is a whopping 5 bytes increase! :-)

     text    data     bss      dec     hex filename
 22350551 8213184 1917164 32480899 1ef9e83 vmlinux.x86-64.defconfig.before
 22350556 8213152 1917164 32480872 1ef9e68 vmlinux.x86-64.defconfig.after

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 include/linux/cpumask.h | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 1e7399fc69c0..676ada1fbccd 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -306,12 +306,12 @@ extern int cpumask_next_wrap(int n, const struct cpumask *mask, int start, bool
 * @cpu: cpu number (< nr_cpu_ids)
 * @dstp: the cpumask pointer
 */
-static inline void cpumask_set_cpu(unsigned int cpu, struct cpumask *dstp)
+static __always_inline void cpumask_set_cpu(unsigned int cpu, struct cpumask *dstp)
 {
 	set_bit(cpumask_check(cpu), cpumask_bits(dstp));
 }
 
-static inline void __cpumask_set_cpu(unsigned int cpu, struct cpumask *dstp)
+static __always_inline void __cpumask_set_cpu(unsigned int cpu, struct cpumask *dstp)
 {
 	__set_bit(cpumask_check(cpu), cpumask_bits(dstp));
 }
@@ -322,12 +322,12 @@ static inline void __cpumask_set_cpu(unsigned int cpu, struct cpumask *dstp)
 * @cpu: cpu number (< nr_cpu_ids)
 * @dstp: the cpumask pointer
 */
-static inline void cpumask_clear_cpu(int cpu, struct cpumask *dstp)
+static __always_inline void cpumask_clear_cpu(int cpu, struct cpumask *dstp)
 {
 	clear_bit(cpumask_check(cpu), cpumask_bits(dstp));
 }
 
-static inline void __cpumask_clear_cpu(int cpu, struct cpumask *dstp)
+static __always_inline void __cpumask_clear_cpu(int cpu, struct cpumask *dstp)
 {
 	__clear_bit(cpumask_check(cpu), cpumask_bits(dstp));
 }
@@ -339,7 +339,7 @@ static inline void __cpumask_clear_cpu(int cpu, struct cpumask *dstp)
 *
 * Returns 1 if @cpu is set in @cpumask, else returns 0
 */
-static inline int cpumask_test_cpu(int cpu, const struct cpumask *cpumask)
+static __always_inline int cpumask_test_cpu(int cpu, const struct cpumask *cpumask)
 {
 	return test_bit(cpumask_check(cpu), cpumask_bits((cpumask)));
 }
@@ -353,7 +353,7 @@ static inline int cpumask_test_cpu(int cpu, const struct cpumask *cpumask)
 *
 * test_and_set_bit wrapper for cpumasks.
 */
-static inline int cpumask_test_and_set_cpu(int cpu, struct cpumask *cpumask)
+static __always_inline int cpumask_test_and_set_cpu(int cpu, struct cpumask *cpumask)
 {
 	return test_and_set_bit(cpumask_check(cpu), cpumask_bits(cpumask));
 }
@@ -367,7 +367,7 @@ static inline int cpumask_test_and_set_cpu(int cpu, struct cpumask *cpumask)
 *
 * test_and_clear_bit wrapper for cpumasks.
 */
-static inline int cpumask_test_and_clear_cpu(int cpu, struct cpumask *cpumask)
+static __always_inline int cpumask_test_and_clear_cpu(int cpu, struct cpumask *cpumask)
 {
 	return test_and_clear_bit(cpumask_check(cpu), cpumask_bits(cpumask));
 }
-- 
2.29.2
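[Editor's illustration] The cpumask helpers in this patch are thin wrappers:
a cpumask is a bitmap of CPU ids, and each helper range-checks the id and
forwards to the corresponding bit helper, so forcing the wrappers inline
collapses the whole chain into one instruction. The sketch below is a
simplified stand-in, not the kernel code: `NR_CPUS`, the struct layout, and
this `cpumask_check()` (which just asserts instead of warning) are
assumptions made for the example.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG	 (sizeof(long) * CHAR_BIT)
#define NR_CPUS		 64
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Simplified cpumask: just the backing bitmap. */
struct sketch_cpumask { unsigned long bits[BITS_TO_LONGS(NR_CPUS)]; };

/* Stand-in range check; the kernel version warns on out-of-range ids. */
static inline unsigned int sketch_cpumask_check(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	return cpu;
}

static inline __attribute__((__always_inline__))
void sketch_cpumask_set_cpu(unsigned int cpu, struct sketch_cpumask *dstp)
{
	cpu = sketch_cpumask_check(cpu);
	dstp->bits[cpu / BITS_PER_LONG] |= 1UL << (cpu % BITS_PER_LONG);
}

static inline __attribute__((__always_inline__))
int sketch_cpumask_test_cpu(unsigned int cpu, const struct sketch_cpumask *srcp)
{
	cpu = sketch_cpumask_check(cpu);
	return (srcp->bits[cpu / BITS_PER_LONG] >> (cpu % BITS_PER_LONG)) & 1;
}
```

This also shows why the size impact is tiny: each wrapper body is already a
handful of instructions, so the attribute mostly removes call/return
overhead rather than duplicating large bodies.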
* [tip: locking/core] cpumask: Always inline helpers which use bit manipulation functions
From: tip-bot2 for Borislav Petkov @ 2022-01-26 13:30 UTC
To: linux-tip-commits
Cc: Borislav Petkov, Peter Zijlstra (Intel), Marco Elver, x86, linux-kernel

The following commit has been merged into the locking/core branch of tip:

Commit-ID:     1dc01abad6544cb9d884071b626b706e37aa9601
Gitweb:        https://git.kernel.org/tip/1dc01abad6544cb9d884071b626b706e37aa9601
Author:        Borislav Petkov <bp@suse.de>
AuthorDate:    Thu, 13 Jan 2022 16:53:57 +01:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 25 Jan 2022 22:30:28 +01:00

cpumask: Always inline helpers which use bit manipulation functions

Former are always inlined so do that for the latter too, for
consistency.

Size impact is a whopping 5 bytes increase! :-)

     text    data     bss      dec     hex filename
 22350551 8213184 1917164 32480899 1ef9e83 vmlinux.x86-64.defconfig.before
 22350556 8213152 1917164 32480872 1ef9e68 vmlinux.x86-64.defconfig.after

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Marco Elver <elver@google.com>
Link: https://lore.kernel.org/r/20220113155357.4706-3-bp@alien8.de
---
 include/linux/cpumask.h | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

[The commit's diff matches the one posted in PATCH 2/2 above, rebased
(index 64dae70..6b06c69, hunks shifted to lines 341-402).]
* Re: [PATCH 0/2] generic/bitops: Always inline some more generic helpers
From: Marco Elver @ 2022-01-13 16:58 UTC
To: Borislav Petkov
Cc: Peter Zijlstra, Arnd Bergmann, Boqun Feng, Paul E. McKenney, Will Deacon, X86 ML, LKML

On Thu, 13 Jan 2022 at 16:53, Borislav Petkov <bp@alien8.de> wrote:
>
> From: Borislav Petkov <bp@suse.de>
>
> Hi all,
>
> a build report by the 0day robot:
>
> https://lore.kernel.org/r/Yc7t934f%2Bf/mO8lj@zn.tnic
>
> made me look at asm and how gcc, at least, generates funky calls to the
> *_bit() bit manipulation functions on x86 instead of inlining them into
> the call sites as on x86 that's a single insn, in most of the cases.
>
> So PeterZ says the way to go is to always inline them. So here they are.
> The fun thing is that on x86 there is even a size decrease of more than
> a Kilobyte for a defconfig, which is nice, see patch 1.
>
> As always, comments and suggestions are welcome.
>
> Thx.
>
> Borislav Petkov (2):
>   asm-generic/bitops: Always inline all bit manipulation helpers
>   cpumask: Always inline helpers which use bit manipulation functions
>
>  include/asm-generic/bitops/instrumented-atomic.h | 12 ++++++------
>  .../asm-generic/bitops/instrumented-non-atomic.h | 16 ++++++++--------
>  include/linux/cpumask.h | 14 +++++++-------
>  3 files changed, 21 insertions(+), 21 deletions(-)

Acked-by: Marco Elver <elver@google.com>

Yup, this is probably something we should have done a long time ago. :-)

Thanks!
* Re: [PATCH 0/2] generic/bitops: Always inline some more generic helpers
From: Peter Zijlstra @ 2022-01-14 8:12 UTC
To: Marco Elver
Cc: Borislav Petkov, Arnd Bergmann, Boqun Feng, Paul E. McKenney, Will Deacon, X86 ML, LKML

On Thu, Jan 13, 2022 at 05:58:57PM +0100, Marco Elver wrote:
> On Thu, 13 Jan 2022 at 16:53, Borislav Petkov <bp@alien8.de> wrote:
> >
> > From: Borislav Petkov <bp@suse.de>
> >
> > Hi all,
> >
> > a build report by the 0day robot:
> >
> > https://lore.kernel.org/r/Yc7t934f%2Bf/mO8lj@zn.tnic
> >
> > made me look at asm and how gcc, at least, generates funky calls to the
> > *_bit() bit manipulation functions on x86 instead of inlining them into
> > the call sites as on x86 that's a single insn, in most of the cases.
> >
> > So PeterZ says the way to go is to always inline them. So here they are.
> > The fun thing is that on x86 there is even a size decrease of more than
> > a Kilobyte for a defconfig, which is nice, see patch 1.
> >
> > As always, comments and suggestions are welcome.
> >
> > Thx.
> >
> > Borislav Petkov (2):
> >   asm-generic/bitops: Always inline all bit manipulation helpers
> >   cpumask: Always inline helpers which use bit manipulation functions
> >
> >  include/asm-generic/bitops/instrumented-atomic.h | 12 ++++++------
> >  .../asm-generic/bitops/instrumented-non-atomic.h | 16 ++++++++--------
> >  include/linux/cpumask.h | 14 +++++++-------
> >  3 files changed, 21 insertions(+), 21 deletions(-)
>
> Acked-by: Marco Elver <elver@google.com>
>
> Yup, this is probably something we should have done a long time ago. :-)

Thanks, I'll go stuff it in locking/core.