* [PATCH] hrtimer: avoid retrigger_next_event IPI @ 2021-04-07 13:53 Marcelo Tosatti 2021-04-07 19:28 ` kernel test robot ` (2 more replies) 0 siblings, 3 replies; 20+ messages in thread From: Marcelo Tosatti @ 2021-04-07 13:53 UTC (permalink / raw) To: Thomas Gleixner Cc: linux-kernel, Frederic Weisbecker, Peter Xu, Nitesh Narayan Lal, Alex Belits Setting the realtime clock triggers an IPI to all CPUs to reprogram hrtimers. However, only base, boottime and tai clocks have their offsets updated (and therefore potentially require a reprogram). If the CPU is a nohz_full one, check if it only has monotonic active timers, and in that case update the realtime base offsets, skipping the IPI. This reduces interruptions to nohz_full CPUs. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c index 743c852e10f2..b42b1a434b22 100644 --- a/kernel/time/hrtimer.c +++ b/kernel/time/hrtimer.c @@ -853,6 +853,28 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram) tick_program_event(expires, 1); } +#define CLOCK_SET_BASES ((1U << HRTIMER_BASE_REALTIME)| \ + (1U << HRTIMER_BASE_REALTIME_SOFT)| \ + (1U << HRTIMER_BASE_BOOTTIME)| \ + (1U << HRTIMER_BASE_BOOTTIME_SOFT)| \ + (1U << HRTIMER_BASE_TAI)| \ + (1U << HRTIMER_BASE_TAI_SOFT)) + +static bool need_reprogram_timer(struct hrtimer_cpu_base *cpu_base) +{ + unsigned int active = 0; + + if (!cpu_base->softirq_activated) + active = cpu_base->active_bases & HRTIMER_ACTIVE_SOFT; + + active = active | (cpu_base->active_bases & HRTIMER_ACTIVE_HARD); + + if ((active & CLOCK_SET_BASES) == 0) + return false; + + return true; +} + /* * Clock realtime was set * @@ -867,9 +889,41 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram) void clock_was_set(void) { #ifdef CONFIG_HIGH_RES_TIMERS - /* Retrigger the CPU local events everywhere */ - on_each_cpu(retrigger_next_event, NULL, 1); + cpumask_var_t mask; + int cpu; + + if (!tick_nohz_full_enabled()) { + /* Retrigger the CPU local events everywhere */ + on_each_cpu(retrigger_next_event, NULL, 1); + goto set_timerfd; + } + + if (!zalloc_cpumask_var(&mask, GFP_KERNEL)) { + on_each_cpu(retrigger_next_event, NULL, 1); + goto set_timerfd; + } + + /* Avoid interrupting nohz_full CPUs if possible */ + preempt_disable(); + for_each_online_cpu(cpu) { + if (tick_nohz_full_cpu(cpu)) { + unsigned long flags; + struct hrtimer_cpu_base *cpu_base = &per_cpu(hrtimer_bases, cpu); + + raw_spin_lock_irqsave(&cpu_base->lock, flags); + if (need_reprogram_timer(cpu_base)) + cpumask_set_cpu(cpu, mask); + else + hrtimer_update_base(cpu_base); + raw_spin_unlock_irqrestore(&cpu_base->lock, flags); + } + } + + smp_call_function_many(mask, retrigger_next_event, NULL, 1); + preempt_enable(); + free_cpumask_var(mask); #endif +set_timerfd: timerfd_clock_was_set(); } ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] hrtimer: avoid retrigger_next_event IPI 2021-04-07 13:53 [PATCH] hrtimer: avoid retrigger_next_event IPI Marcelo Tosatti @ 2021-04-07 19:28 ` kernel test robot 2021-04-07 22:14 ` Frederic Weisbecker 2021-04-09 14:15 ` Thomas Gleixner 2 siblings, 0 replies; 20+ messages in thread From: kernel test robot @ 2021-04-07 19:28 UTC (permalink / raw) To: Marcelo Tosatti, Thomas Gleixner Cc: kbuild-all, clang-built-linux, linux-kernel, Frederic Weisbecker, Peter Xu, Nitesh Narayan Lal, Alex Belits [-- Attachment #1: Type: text/plain, Size: 6205 bytes --] Hi Marcelo, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on tip/timers/core] [also build test WARNING on linux/master linus/master v5.12-rc6 next-20210407] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Marcelo-Tosatti/hrtimer-avoid-retrigger_next_event-IPI/20210407-233005 base: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git d4c7c28806616809e3baa0b7cd8c665516b2726d config: arm64-randconfig-r032-20210407 (attached as .config) compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project c060945b23a1c54d4b2a053ff4b093a2277b303d) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install arm64 cross compiling tool for clang build # apt-get install binutils-aarch64-linux-gnu # https://github.com/0day-ci/linux/commit/defd4db9d63d1f16e3e21862bd9c9105a9f6a7e9 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Marcelo-Tosatti/hrtimer-avoid-retrigger_next_event-IPI/20210407-233005 git checkout defd4db9d63d1f16e3e21862bd9c9105a9f6a7e9 # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All warnings (new ones prefixed by >>): kernel/time/hrtimer.c:120:21: warning: initializer overrides prior initialization of this subobject [-Winitializer-overrides] [CLOCK_REALTIME] = HRTIMER_BASE_REALTIME, ^~~~~~~~~~~~~~~~~~~~~ kernel/time/hrtimer.c:118:27: note: previous initialization is here [0 ... MAX_CLOCKS - 1] = HRTIMER_MAX_CLOCK_BASES, ^~~~~~~~~~~~~~~~~~~~~~~ kernel/time/hrtimer.c:121:22: warning: initializer overrides prior initialization of this subobject [-Winitializer-overrides] [CLOCK_MONOTONIC] = HRTIMER_BASE_MONOTONIC, ^~~~~~~~~~~~~~~~~~~~~~ kernel/time/hrtimer.c:118:27: note: previous initialization is here [0 ... MAX_CLOCKS - 1] = HRTIMER_MAX_CLOCK_BASES, ^~~~~~~~~~~~~~~~~~~~~~~ kernel/time/hrtimer.c:122:21: warning: initializer overrides prior initialization of this subobject [-Winitializer-overrides] [CLOCK_BOOTTIME] = HRTIMER_BASE_BOOTTIME, ^~~~~~~~~~~~~~~~~~~~~ kernel/time/hrtimer.c:118:27: note: previous initialization is here [0 ... MAX_CLOCKS - 1] = HRTIMER_MAX_CLOCK_BASES, ^~~~~~~~~~~~~~~~~~~~~~~ kernel/time/hrtimer.c:123:17: warning: initializer overrides prior initialization of this subobject [-Winitializer-overrides] [CLOCK_TAI] = HRTIMER_BASE_TAI, ^~~~~~~~~~~~~~~~ kernel/time/hrtimer.c:118:27: note: previous initialization is here [0 ... MAX_CLOCKS - 1] = HRTIMER_MAX_CLOCK_BASES, ^~~~~~~~~~~~~~~~~~~~~~~ >> kernel/time/hrtimer.c:944:1: warning: unused label 'set_timerfd' [-Wunused-label] set_timerfd: ^~~~~~~~~~~~ kernel/time/hrtimer.c:147:20: warning: unused function 'is_migration_base' [-Wunused-function] static inline bool is_migration_base(struct hrtimer_clock_base *base) ^ kernel/time/hrtimer.c:650:19: warning: unused function 'hrtimer_hres_active' [-Wunused-function] static inline int hrtimer_hres_active(void) ^ kernel/time/hrtimer.c:881:13: warning: unused function 'need_reprogram_timer' [-Wunused-function] static bool need_reprogram_timer(struct hrtimer_cpu_base *cpu_base) ^ kernel/time/hrtimer.c:1793:20: warning: unused function '__hrtimer_peek_ahead_timers' [-Wunused-function] static inline void __hrtimer_peek_ahead_timers(void) { } ^ 9 warnings generated. vim +/set_timerfd +944 kernel/time/hrtimer.c 895 896 /* 897 * Clock realtime was set 898 * 899 * Change the offset of the realtime clock vs. the monotonic 900 * clock. 901 * 902 * We might have to reprogram the high resolution timer interrupt. On 903 * SMP we call the architecture specific code to retrigger _all_ high 904 * resolution timer interrupts. On UP we just disable interrupts and 905 * call the high resolution interrupt code. 906 */ 907 void clock_was_set(void) 908 { 909 #ifdef CONFIG_HIGH_RES_TIMERS 910 cpumask_var_t mask; 911 int cpu; 912 913 if (!tick_nohz_full_enabled()) { 914 /* Retrigger the CPU local events everywhere */ 915 on_each_cpu(retrigger_next_event, NULL, 1); 916 goto set_timerfd; 917 } 918 919 if (!zalloc_cpumask_var(&mask, GFP_KERNEL)) { 920 on_each_cpu(retrigger_next_event, NULL, 1); 921 goto set_timerfd; 922 } 923 924 /* Avoid interrupting nohz_full CPUs if possible */ 925 preempt_disable(); 926 for_each_online_cpu(cpu) { 927 if (tick_nohz_full_cpu(cpu)) { 928 unsigned long flags; 929 struct hrtimer_cpu_base *cpu_base = &per_cpu(hrtimer_bases, cpu); 930 931 raw_spin_lock_irqsave(&cpu_base->lock, flags); 932 if (need_reprogram_timer(cpu_base)) 933 cpumask_set_cpu(cpu, mask); 934 else 935 hrtimer_update_base(cpu_base); 936 raw_spin_unlock_irqrestore(&cpu_base->lock, flags); 937 } 938 } 939 940 smp_call_function_many(mask, retrigger_next_event, NULL, 1); 941 preempt_enable(); 942 free_cpumask_var(mask); 943 #endif > 944 set_timerfd: 945 timerfd_clock_was_set(); 946 } 947 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org [-- Attachment #2: .config.gz --] [-- Type: application/gzip, Size: 27570 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] hrtimer: avoid retrigger_next_event IPI @ 2021-04-07 19:28 ` kernel test robot 0 siblings, 0 replies; 20+ messages in thread From: kernel test robot @ 2021-04-07 19:28 UTC (permalink / raw) To: kbuild-all [-- Attachment #1: Type: text/plain, Size: 6339 bytes --] Hi Marcelo, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on tip/timers/core] [also build test WARNING on linux/master linus/master v5.12-rc6 next-20210407] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Marcelo-Tosatti/hrtimer-avoid-retrigger_next_event-IPI/20210407-233005 base: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git d4c7c28806616809e3baa0b7cd8c665516b2726d config: arm64-randconfig-r032-20210407 (attached as .config) compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project c060945b23a1c54d4b2a053ff4b093a2277b303d) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install arm64 cross compiling tool for clang build # apt-get install binutils-aarch64-linux-gnu # https://github.com/0day-ci/linux/commit/defd4db9d63d1f16e3e21862bd9c9105a9f6a7e9 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Marcelo-Tosatti/hrtimer-avoid-retrigger_next_event-IPI/20210407-233005 git checkout defd4db9d63d1f16e3e21862bd9c9105a9f6a7e9 # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All warnings (new ones prefixed by >>): kernel/time/hrtimer.c:120:21: warning: initializer overrides prior initialization of this subobject [-Winitializer-overrides] [CLOCK_REALTIME] = HRTIMER_BASE_REALTIME, ^~~~~~~~~~~~~~~~~~~~~ kernel/time/hrtimer.c:118:27: note: previous initialization is here [0 ... MAX_CLOCKS - 1] = HRTIMER_MAX_CLOCK_BASES, ^~~~~~~~~~~~~~~~~~~~~~~ kernel/time/hrtimer.c:121:22: warning: initializer overrides prior initialization of this subobject [-Winitializer-overrides] [CLOCK_MONOTONIC] = HRTIMER_BASE_MONOTONIC, ^~~~~~~~~~~~~~~~~~~~~~ kernel/time/hrtimer.c:118:27: note: previous initialization is here [0 ... MAX_CLOCKS - 1] = HRTIMER_MAX_CLOCK_BASES, ^~~~~~~~~~~~~~~~~~~~~~~ kernel/time/hrtimer.c:122:21: warning: initializer overrides prior initialization of this subobject [-Winitializer-overrides] [CLOCK_BOOTTIME] = HRTIMER_BASE_BOOTTIME, ^~~~~~~~~~~~~~~~~~~~~ kernel/time/hrtimer.c:118:27: note: previous initialization is here [0 ... MAX_CLOCKS - 1] = HRTIMER_MAX_CLOCK_BASES, ^~~~~~~~~~~~~~~~~~~~~~~ kernel/time/hrtimer.c:123:17: warning: initializer overrides prior initialization of this subobject [-Winitializer-overrides] [CLOCK_TAI] = HRTIMER_BASE_TAI, ^~~~~~~~~~~~~~~~ kernel/time/hrtimer.c:118:27: note: previous initialization is here [0 ... MAX_CLOCKS - 1] = HRTIMER_MAX_CLOCK_BASES, ^~~~~~~~~~~~~~~~~~~~~~~ >> kernel/time/hrtimer.c:944:1: warning: unused label 'set_timerfd' [-Wunused-label] set_timerfd: ^~~~~~~~~~~~ kernel/time/hrtimer.c:147:20: warning: unused function 'is_migration_base' [-Wunused-function] static inline bool is_migration_base(struct hrtimer_clock_base *base) ^ kernel/time/hrtimer.c:650:19: warning: unused function 'hrtimer_hres_active' [-Wunused-function] static inline int hrtimer_hres_active(void) ^ kernel/time/hrtimer.c:881:13: warning: unused function 'need_reprogram_timer' [-Wunused-function] static bool need_reprogram_timer(struct hrtimer_cpu_base *cpu_base) ^ kernel/time/hrtimer.c:1793:20: warning: unused function '__hrtimer_peek_ahead_timers' [-Wunused-function] static inline void __hrtimer_peek_ahead_timers(void) { } ^ 9 warnings generated. vim +/set_timerfd +944 kernel/time/hrtimer.c 895 896 /* 897 * Clock realtime was set 898 * 899 * Change the offset of the realtime clock vs. the monotonic 900 * clock. 901 * 902 * We might have to reprogram the high resolution timer interrupt. On 903 * SMP we call the architecture specific code to retrigger _all_ high 904 * resolution timer interrupts. On UP we just disable interrupts and 905 * call the high resolution interrupt code. 906 */ 907 void clock_was_set(void) 908 { 909 #ifdef CONFIG_HIGH_RES_TIMERS 910 cpumask_var_t mask; 911 int cpu; 912 913 if (!tick_nohz_full_enabled()) { 914 /* Retrigger the CPU local events everywhere */ 915 on_each_cpu(retrigger_next_event, NULL, 1); 916 goto set_timerfd; 917 } 918 919 if (!zalloc_cpumask_var(&mask, GFP_KERNEL)) { 920 on_each_cpu(retrigger_next_event, NULL, 1); 921 goto set_timerfd; 922 } 923 924 /* Avoid interrupting nohz_full CPUs if possible */ 925 preempt_disable(); 926 for_each_online_cpu(cpu) { 927 if (tick_nohz_full_cpu(cpu)) { 928 unsigned long flags; 929 struct hrtimer_cpu_base *cpu_base = &per_cpu(hrtimer_bases, cpu); 930 931 raw_spin_lock_irqsave(&cpu_base->lock, flags); 932 if (need_reprogram_timer(cpu_base)) 933 cpumask_set_cpu(cpu, mask); 934 else 935 hrtimer_update_base(cpu_base); 936 raw_spin_unlock_irqrestore(&cpu_base->lock, flags); 937 } 938 } 939 940 smp_call_function_many(mask, retrigger_next_event, NULL, 1); 941 preempt_enable(); 942 free_cpumask_var(mask); 943 #endif > 944 set_timerfd: 945 timerfd_clock_was_set(); 946 } 947 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org [-- Attachment #2: config.gz --] [-- Type: application/gzip, Size: 27570 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] hrtimer: avoid retrigger_next_event IPI 2021-04-07 13:53 [PATCH] hrtimer: avoid retrigger_next_event IPI Marcelo Tosatti 2021-04-07 19:28 ` kernel test robot @ 2021-04-07 22:14 ` Frederic Weisbecker 2021-04-08 12:27 ` Marcelo Tosatti 2021-04-09 14:15 ` Thomas Gleixner 2 siblings, 1 reply; 20+ messages in thread From: Frederic Weisbecker @ 2021-04-07 22:14 UTC (permalink / raw) To: Marcelo Tosatti Cc: Thomas Gleixner, linux-kernel, Peter Xu, Nitesh Narayan Lal, Alex Belits On Wed, Apr 07, 2021 at 10:53:01AM -0300, Marcelo Tosatti wrote: > > Setting the realtime clock triggers an IPI to all CPUs to reprogram > hrtimers. > > However, only base, boottime and tai clocks have their offsets updated > (and therefore potentially require a reprogram). > > If the CPU is a nohz_full one, check if it only has > monotonic active timers, and in that case update the > realtime base offsets, skipping the IPI. > > This reduces interruptions to nohz_full CPUs. > > Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> > > diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c > index 743c852e10f2..b42b1a434b22 100644 > --- a/kernel/time/hrtimer.c > +++ b/kernel/time/hrtimer.c > @@ -853,6 +853,28 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram) > tick_program_event(expires, 1); > } > > +#define CLOCK_SET_BASES ((1U << HRTIMER_BASE_REALTIME)| \ > + (1U << HRTIMER_BASE_REALTIME_SOFT)| \ > + (1U << HRTIMER_BASE_BOOTTIME)| \ > + (1U << HRTIMER_BASE_BOOTTIME_SOFT)| \ > + (1U << HRTIMER_BASE_TAI)| \ > + (1U << HRTIMER_BASE_TAI_SOFT)) > + > +static bool need_reprogram_timer(struct hrtimer_cpu_base *cpu_base) > +{ > + unsigned int active = 0; > + > + if (!cpu_base->softirq_activated) > + active = cpu_base->active_bases & HRTIMER_ACTIVE_SOFT; > + > + active = active | (cpu_base->active_bases & HRTIMER_ACTIVE_HARD); > + > + if ((active & CLOCK_SET_BASES) == 0) > + return false; > + > + return true; > +} > + > /* > * Clock realtime was set > * > @@ -867,9 +889,41 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram) > void clock_was_set(void) > { > #ifdef CONFIG_HIGH_RES_TIMERS > - /* Retrigger the CPU local events everywhere */ > - on_each_cpu(retrigger_next_event, NULL, 1); > + cpumask_var_t mask; > + int cpu; > + > + if (!tick_nohz_full_enabled()) { > + /* Retrigger the CPU local events everywhere */ > + on_each_cpu(retrigger_next_event, NULL, 1); > + goto set_timerfd; > + } > + > + if (!zalloc_cpumask_var(&mask, GFP_KERNEL)) { > + on_each_cpu(retrigger_next_event, NULL, 1); > + goto set_timerfd; > + } > + > + /* Avoid interrupting nohz_full CPUs if possible */ > + preempt_disable(); > + for_each_online_cpu(cpu) { > + if (tick_nohz_full_cpu(cpu)) { > + unsigned long flags; > + struct hrtimer_cpu_base *cpu_base = &per_cpu(hrtimer_bases, cpu); > + > + raw_spin_lock_irqsave(&cpu_base->lock, flags); > + if (need_reprogram_timer(cpu_base)) > + cpumask_set_cpu(cpu, mask); > + else > + hrtimer_update_base(cpu_base); > + raw_spin_unlock_irqrestore(&cpu_base->lock, flags); > + } You forgot to add the housekeeping CPUs to the mask. As for the need_reprogram_timer() trick, I'll rather defer to Thomas review... Thanks. > + } > + > + smp_call_function_many(mask, retrigger_next_event, NULL, 1); > + preempt_enable(); > + free_cpumask_var(mask); > #endif > +set_timerfd: > timerfd_clock_was_set(); > } > > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] hrtimer: avoid retrigger_next_event IPI 2021-04-07 22:14 ` Frederic Weisbecker @ 2021-04-08 12:27 ` Marcelo Tosatti 0 siblings, 0 replies; 20+ messages in thread From: Marcelo Tosatti @ 2021-04-08 12:27 UTC (permalink / raw) To: Frederic Weisbecker Cc: Thomas Gleixner, linux-kernel, Peter Xu, Nitesh Narayan Lal, Alex Belits On Thu, Apr 08, 2021 at 12:14:57AM +0200, Frederic Weisbecker wrote: > On Wed, Apr 07, 2021 at 10:53:01AM -0300, Marcelo Tosatti wrote: > > > > Setting the realtime clock triggers an IPI to all CPUs to reprogram > > hrtimers. > > > > However, only base, boottime and tai clocks have their offsets updated > > (and therefore potentially require a reprogram). > > > > If the CPU is a nohz_full one, check if it only has > > monotonic active timers, and in that case update the > > realtime base offsets, skipping the IPI. > > > > This reduces interruptions to nohz_full CPUs. > > > > Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> > > > > diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c > > index 743c852e10f2..b42b1a434b22 100644 > > --- a/kernel/time/hrtimer.c > > +++ b/kernel/time/hrtimer.c > > @@ -853,6 +853,28 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram) > > tick_program_event(expires, 1); > > } > > > > +#define CLOCK_SET_BASES ((1U << HRTIMER_BASE_REALTIME)| \ > > + (1U << HRTIMER_BASE_REALTIME_SOFT)| \ > > + (1U << HRTIMER_BASE_BOOTTIME)| \ > > + (1U << HRTIMER_BASE_BOOTTIME_SOFT)| \ > > + (1U << HRTIMER_BASE_TAI)| \ > > + (1U << HRTIMER_BASE_TAI_SOFT)) > > + > > +static bool need_reprogram_timer(struct hrtimer_cpu_base *cpu_base) > > +{ > > + unsigned int active = 0; > > + > > + if (!cpu_base->softirq_activated) > > + active = cpu_base->active_bases & HRTIMER_ACTIVE_SOFT; If cpu_base->softirq_activated == 1, should IPI as well. > > + active = active | (cpu_base->active_bases & HRTIMER_ACTIVE_HARD); > > + > > + if ((active & CLOCK_SET_BASES) == 0) > > + return false; > > + > > + return true; > > +} > > + > > /* > > * Clock realtime was set > > * > > @@ -867,9 +889,41 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram) > > void clock_was_set(void) > > { > > #ifdef CONFIG_HIGH_RES_TIMERS > > - /* Retrigger the CPU local events everywhere */ > > - on_each_cpu(retrigger_next_event, NULL, 1); > > + cpumask_var_t mask; > > + int cpu; > > + > > + if (!tick_nohz_full_enabled()) { > > + /* Retrigger the CPU local events everywhere */ > > + on_each_cpu(retrigger_next_event, NULL, 1); > > + goto set_timerfd; > > + } > > + > > + if (!zalloc_cpumask_var(&mask, GFP_KERNEL)) { > > + on_each_cpu(retrigger_next_event, NULL, 1); > > + goto set_timerfd; > > + } > > + > > + /* Avoid interrupting nohz_full CPUs if possible */ > > + preempt_disable(); > > + for_each_online_cpu(cpu) { > > + if (tick_nohz_full_cpu(cpu)) { > > + unsigned long flags; > > + struct hrtimer_cpu_base *cpu_base = &per_cpu(hrtimer_bases, cpu); > > + > > + raw_spin_lock_irqsave(&cpu_base->lock, flags); > > + if (need_reprogram_timer(cpu_base)) > > + cpumask_set_cpu(cpu, mask); > > + else > > + hrtimer_update_base(cpu_base); > > + raw_spin_unlock_irqrestore(&cpu_base->lock, flags); > > + } > > You forgot to add the housekeeping CPUs to the mask. So people are using: console=tty0 console=ttyS0,115200n8 skew_tick=1 nohz=on rcu_nocbs=8-31 tuned.non_isolcpus=000000ff intel_pstate=disable nosoftlockup tsc=nowatchdog intel_iommu=on iommu=pt isolcpus=managed_irq,8-31 systemd.cpu_affinity=0,1,2,3,4,5,6,7 default_hugepagesz=1G hugepagesz=2M hugepages=128 nohz_full=8-31 And using the nohz_full= CPUs (or subsets of nohz_full= CPUs) in two modes: Either "generic non-isolated applications" (with load-balancing enabled for those CPUs), or for latency sensitive applications. And switching between the modes. In this case, it would only be possible to check for housekeeping CPUs of type MANAGED_IRQ, which would be strange. > As for the need_reprogram_timer() trick, I'll rather defer to Thomas review... > > Thanks. Thanks! > > > + } > > + > > + smp_call_function_many(mask, retrigger_next_event, NULL, 1); > > + preempt_enable(); > > + free_cpumask_var(mask); > > #endif > > +set_timerfd: > > timerfd_clock_was_set(); > > } > > > > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] hrtimer: avoid retrigger_next_event IPI 2021-04-07 13:53 [PATCH] hrtimer: avoid retrigger_next_event IPI Marcelo Tosatti 2021-04-07 19:28 ` kernel test robot 2021-04-07 22:14 ` Frederic Weisbecker @ 2021-04-09 14:15 ` Thomas Gleixner 2021-04-09 16:51 ` Marcelo Tosatti 2 siblings, 1 reply; 20+ messages in thread From: Thomas Gleixner @ 2021-04-09 14:15 UTC (permalink / raw) To: Marcelo Tosatti Cc: linux-kernel, Frederic Weisbecker, Peter Xu, Nitesh Narayan Lal, Alex Belits On Wed, Apr 07 2021 at 10:53, Marcelo Tosatti wrote: > Setting the realtime clock triggers an IPI to all CPUs to reprogram > hrtimers. > > However, only base, boottime and tai clocks have their offsets updated base clock? And why boottime? Boottime is not affected by a clock realtime set. It's clock REALTIME and TAI, nothing else. > +#define CLOCK_SET_BASES ((1U << HRTIMER_BASE_REALTIME)| \ > + (1U << HRTIMER_BASE_REALTIME_SOFT)| \ > + (1U << HRTIMER_BASE_BOOTTIME)| \ > + (1U << HRTIMER_BASE_BOOTTIME_SOFT)| \ > + (1U << HRTIMER_BASE_TAI)| \ > + (1U << HRTIMER_BASE_TAI_SOFT)) > + > +static bool need_reprogram_timer(struct hrtimer_cpu_base *cpu_base) > +{ > + unsigned int active = 0; > + > + if (!cpu_base->softirq_activated) > + active = cpu_base->active_bases & HRTIMER_ACTIVE_SOFT; > + > + active = active | (cpu_base->active_bases & HRTIMER_ACTIVE_HARD); > + > + if ((active & CLOCK_SET_BASES) == 0) > + return false; > + > + return true; > +} Errm. > + /* Avoid interrupting nohz_full CPUs if possible */ > + preempt_disable(); > + for_each_online_cpu(cpu) { > + if (tick_nohz_full_cpu(cpu)) { > + unsigned long flags; > + struct hrtimer_cpu_base *cpu_base = &per_cpu(hrtimer_bases, cpu); > + > + raw_spin_lock_irqsave(&cpu_base->lock, flags); > + if (need_reprogram_timer(cpu_base)) > + cpumask_set_cpu(cpu, mask); > + else > + hrtimer_update_base(cpu_base); > + raw_spin_unlock_irqrestore(&cpu_base->lock, flags); > + } > + } How is that supposed to be correct? CPU0 CPU1 clock_was_set() hrtimer_start(CLOCK_REALTIME) if (!active_mask[CPU1] & XXX) continue; active_mask |= REALTIME; ---> fail because that newly started timer is on the old offset. Thanks, tglx ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] hrtimer: avoid retrigger_next_event IPI 2021-04-09 14:15 ` Thomas Gleixner @ 2021-04-09 16:51 ` Marcelo Tosatti 2021-04-10 7:53 ` Thomas Gleixner 0 siblings, 1 reply; 20+ messages in thread From: Marcelo Tosatti @ 2021-04-09 16:51 UTC (permalink / raw) To: Thomas Gleixner, Anna-Maria Behnsen Cc: linux-kernel, Frederic Weisbecker, Peter Xu, Nitesh Narayan Lal, Alex Belits +CC Anna-Maria. On Fri, Apr 09, 2021 at 04:15:13PM +0200, Thomas Gleixner wrote: > On Wed, Apr 07 2021 at 10:53, Marcelo Tosatti wrote: > > Setting the realtime clock triggers an IPI to all CPUs to reprogram > > hrtimers. > > > > However, only base, boottime and tai clocks have their offsets updated > > base clock? Heh... > And why boottime? Boottime is not affected by a clock > realtime set. It's clock REALTIME and TAI, nothing else. OK! > > +#define CLOCK_SET_BASES ((1U << HRTIMER_BASE_REALTIME)| \ > > + (1U << HRTIMER_BASE_REALTIME_SOFT)| \ > > + (1U << HRTIMER_BASE_BOOTTIME)| \ > > + (1U << HRTIMER_BASE_BOOTTIME_SOFT)| \ > > + (1U << HRTIMER_BASE_TAI)| \ > > + (1U << HRTIMER_BASE_TAI_SOFT)) > > + > > +static bool need_reprogram_timer(struct hrtimer_cpu_base *cpu_base) > > +{ > > + unsigned int active = 0; > > + > > + if (!cpu_base->softirq_activated) > > + active = cpu_base->active_bases & HRTIMER_ACTIVE_SOFT; Again, if (cpu_base->softirq_activated), need to IPI (will resend). > > + active = active | (cpu_base->active_bases & HRTIMER_ACTIVE_HARD); > > + > > + if ((active & CLOCK_SET_BASES) == 0) > > + return false; > > + > > + return true; > > +} > > Errm. What? > > + /* Avoid interrupting nohz_full CPUs if possible */ > > + preempt_disable(); > > + for_each_online_cpu(cpu) { > > + if (tick_nohz_full_cpu(cpu)) { > > + unsigned long flags; > > + struct hrtimer_cpu_base *cpu_base = &per_cpu(hrtimer_bases, cpu); > > + > > + raw_spin_lock_irqsave(&cpu_base->lock, flags); > > + if (need_reprogram_timer(cpu_base)) > > + cpumask_set_cpu(cpu, mask); > > + else > > + hrtimer_update_base(cpu_base); > > + raw_spin_unlock_irqrestore(&cpu_base->lock, flags); > > + } > > + } > > How is that supposed to be correct? > > CPU0 CPU1 > > clock_was_set() hrtimer_start(CLOCK_REALTIME) > > if (!active_mask[CPU1] & XXX) > continue; > active_mask |= REALTIME; > > ---> fail because that newly started timer is on the old offset. CPU0 CPU1 clock_was_set() Case-1: CPU-1 grabs base->lock before CPU-0: CPU-0 sees active_mask[CPU1] and IPIs. base = lock_hrtimer_base(timer, &flags); if (__hrtimer_start_range_ns(timer, tim, ... hrtimer_reprogram(timer, true); unlock_hrtimer_base(timer, &flags); raw_spin_lock_irqsave(&cpu_base->lock, flags); if (need_reprogram_timer(cpu_base)) cpumask_set_cpu(cpu, mask); else hrtimer_update_base(cpu_base); raw_spin_unlock_irqrestore(&cpu_base->lock, flags); Case-2: CPU-1 grabs base->lock after CPU-0: CPU-0 will have updated the offsets remotely. base = lock_hrtimer_base(timer, &flags); if (__hrtimer_start_range_ns(timer, tim, ... hrtimer_reprogram(timer, true); unlock_hrtimer_base(timer, &flags); No? ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] hrtimer: avoid retrigger_next_event IPI 2021-04-09 16:51 ` Marcelo Tosatti @ 2021-04-10 7:53 ` Thomas Gleixner 2021-04-13 17:04 ` [PATCH v2] " Marcelo Tosatti 0 siblings, 1 reply; 20+ messages in thread From: Thomas Gleixner @ 2021-04-10 7:53 UTC (permalink / raw) To: Marcelo Tosatti, Anna-Maria Behnsen Cc: linux-kernel, Frederic Weisbecker, Peter Xu, Nitesh Narayan Lal, Alex Belits On Fri, Apr 09 2021 at 13:51, Marcelo Tosatti wrote: > On Fri, Apr 09, 2021 at 04:15:13PM +0200, Thomas Gleixner wrote: >> On Wed, Apr 07 2021 at 10:53, Marcelo Tosatti wrote: >> ---> fail because that newly started timer is on the old offset. > > CPU0 CPU1 > > > clock_was_set() > Case-1: CPU-1 grabs base->lock before CPU-0: > CPU-0 sees active_mask[CPU1] and IPIs. > > base = lock_hrtimer_base(timer, &flags); > if (__hrtimer_start_range_ns(timer, tim, ... > hrtimer_reprogram(timer, true); > > unlock_hrtimer_base(timer, &flags); > > > raw_spin_lock_irqsave(&cpu_base->lock, flags); > if (need_reprogram_timer(cpu_base)) > cpumask_set_cpu(cpu, mask); > else > hrtimer_update_base(cpu_base); > raw_spin_unlock_irqrestore(&cpu_base->lock, flags); > > Case-2: CPU-1 grabs base->lock after CPU-0: > CPU-0 will have updated the offsets remotely. > > base = lock_hrtimer_base(timer, &flags); > if (__hrtimer_start_range_ns(timer, tim, ... > hrtimer_reprogram(timer, true); > > unlock_hrtimer_base(timer, &flags); > > > No? Yeah, you're right. I misread the loop logic. Can we please make that unconditional independent of nohz full. There is no reason to special case it. Thanks, tglx ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2] hrtimer: avoid retrigger_next_event IPI 2021-04-10 7:53 ` Thomas Gleixner @ 2021-04-13 17:04 ` Marcelo Tosatti 2021-04-14 17:19 ` Thomas Gleixner 2021-04-15 15:39 ` [PATCH v3] " Marcelo Tosatti 0 siblings, 2 replies; 20+ messages in thread From: Marcelo Tosatti @ 2021-04-13 17:04 UTC (permalink / raw) To: Thomas Gleixner Cc: Anna-Maria Behnsen, linux-kernel, Frederic Weisbecker, Peter Xu, Nitesh Narayan Lal, Alex Belits Setting the realtime clock triggers an IPI to all CPUs to reprogram hrtimers. However, only realtime and TAI clocks have their offsets updated (and therefore potentially require a reprogram). Check if it only has monotonic active timers, and in that case update the realtime and TAI base offsets remotely, skipping the IPI. This reduces interruptions to latency sensitive applications. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> --- v2: - Only REALTIME and TAI bases are affected by offset-to-monotonic changes (Thomas). - Don't special case nohz_full CPUs (Thomas). diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c index 5c9d968187ae..be21b85c679d 100644 --- a/kernel/time/hrtimer.c +++ b/kernel/time/hrtimer.c @@ -871,6 +871,28 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram) tick_program_event(expires, 1); } +#define CLOCK_SET_BASES ((1U << HRTIMER_BASE_REALTIME)| \ + (1U << HRTIMER_BASE_REALTIME_SOFT)| \ + (1U << HRTIMER_BASE_TAI)| \ + (1U << HRTIMER_BASE_TAI_SOFT)) + +static bool need_reprogram_timer(struct hrtimer_cpu_base *cpu_base) +{ + unsigned int active = 0; + + if (cpu_base->softirq_activated) + return true; + + active = cpu_base->active_bases & HRTIMER_ACTIVE_SOFT; + + active = active | (cpu_base->active_bases & HRTIMER_ACTIVE_HARD); + + if ((active & CLOCK_SET_BASES) == 0) + return false; + + return true; +} + /* * Clock realtime was set * @@ -885,9 +907,31 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram) void clock_was_set(void) { #ifdef CONFIG_HIGH_RES_TIMERS - /* Retrigger the CPU local events everywhere */ - on_each_cpu(retrigger_next_event, NULL, 1); + cpumask_var_t mask; + int cpu; + + if (!zalloc_cpumask_var(&mask, GFP_KERNEL)) { + on_each_cpu(retrigger_next_event, NULL, 1); + goto set_timerfd; + } + + /* Avoid interrupting CPUs if possible */ + preempt_disable(); + for_each_online_cpu(cpu) { + unsigned long flags; + struct hrtimer_cpu_base *cpu_base = &per_cpu(hrtimer_bases, cpu); + + raw_spin_lock_irqsave(&cpu_base->lock, flags); + if (need_reprogram_timer(cpu_base)) + cpumask_set_cpu(cpu, mask); + raw_spin_unlock_irqrestore(&cpu_base->lock, flags); + } + + smp_call_function_many(mask, retrigger_next_event, NULL, 1); + preempt_enable(); + free_cpumask_var(mask); #endif +set_timerfd: timerfd_clock_was_set(); } ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v2] hrtimer: avoid retrigger_next_event IPI 2021-04-13 17:04 ` [PATCH v2] " Marcelo Tosatti @ 2021-04-14 17:19 ` Thomas Gleixner 2021-04-15 15:39 ` [PATCH v3] " Marcelo Tosatti 1 sibling, 0 replies; 20+ messages in thread From: Thomas Gleixner @ 2021-04-14 17:19 UTC (permalink / raw) To: Marcelo Tosatti Cc: Anna-Maria Behnsen, linux-kernel, Frederic Weisbecker, Peter Xu, Nitesh Narayan Lal, Alex Belits Marcelo, On Tue, Apr 13 2021 at 14:04, Marcelo Tosatti wrote: > Setting the realtime clock triggers an IPI to all CPUs to reprogram > hrtimers. s/hrtimers/clock event device/ > However, only realtime and TAI clocks have their offsets updated > (and therefore potentially require a reprogram). > > Check if it only has monotonic active timers, and in that case boottime != monotonic > update the realtime and TAI base offsets remotely, skipping the IPI. Something like this perhaps: Instead of sending an IPI unconditionally, check each per CPU hrtimer base whether it has active timers in the CLOCK_REALTIME and CLOCK_TAI bases. If that's not the case, update the realtime and TAI base offsets remotely and skip the IPI. This ensures that any subsequently armed timers on CLOCK_REALTIME and CLOCK_TAI are evaluated with the correct offsets. Hmm? > +#define CLOCK_SET_BASES ((1U << HRTIMER_BASE_REALTIME)| \ Missing space between ) and | > + (1U << HRTIMER_BASE_REALTIME_SOFT)| \ > + (1U << HRTIMER_BASE_TAI)| \ > + (1U << HRTIMER_BASE_TAI_SOFT)) > + > +static bool need_reprogram_timer(struct hrtimer_cpu_base *cpu_base) > +{ > + unsigned int active = 0; > + > + if (cpu_base->softirq_activated) > + return true; > + > + active = cpu_base->active_bases & HRTIMER_ACTIVE_SOFT; > + > + active = active | (cpu_base->active_bases & HRTIMER_ACTIVE_HARD); > + > + if ((active & CLOCK_SET_BASES) == 0) > + return false; > + > + return true; That's a pretty elaborate way of writing: return (cpu_base->active_bases & CLOCK_SET_BASES) != 0; > +} > + > /* > * Clock realtime was set > * > @@ -885,9 +907,31 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram) > void clock_was_set(void) > { > #ifdef CONFIG_HIGH_RES_TIMERS > - /* Retrigger the CPU local events everywhere */ > - on_each_cpu(retrigger_next_event, NULL, 1); > + cpumask_var_t mask; > + int cpu; > + > + if (!zalloc_cpumask_var(&mask, GFP_KERNEL)) { > + on_each_cpu(retrigger_next_event, NULL, 1); > + goto set_timerfd; > + } > + > + /* Avoid interrupting CPUs if possible */ > + preempt_disable(); That preempt disable is only required around the function call, for the loop cpus_read_lock() is sufficient. > + for_each_online_cpu(cpu) { > + unsigned long flags; > + struct hrtimer_cpu_base *cpu_base = &per_cpu(hrtimer_bases, cpu); > + > + raw_spin_lock_irqsave(&cpu_base->lock, flags); > + if (need_reprogram_timer(cpu_base)) > + cpumask_set_cpu(cpu, mask); > + raw_spin_unlock_irqrestore(&cpu_base->lock, flags); > + } > + > + smp_call_function_many(mask, retrigger_next_event, NULL, 1); > + preempt_enable(); > + free_cpumask_var(mask); > #endif > +set_timerfd: My brain compiler tells me that with CONFIG_HIGH_RES_TIMERS=n a real compiler will emit a warning about a defined, but unused label.... > timerfd_clock_was_set(); > } Thanks, tglx ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v3] hrtimer: avoid retrigger_next_event IPI 2021-04-13 17:04 ` [PATCH v2] " Marcelo Tosatti 2021-04-14 17:19 ` Thomas Gleixner @ 2021-04-15 15:39 ` Marcelo Tosatti 2021-04-15 18:59 ` Thomas Gleixner 1 sibling, 1 reply; 20+ messages in thread From: Marcelo Tosatti @ 2021-04-15 15:39 UTC (permalink / raw) To: Thomas Gleixner Cc: Anna-Maria Behnsen, linux-kernel, Frederic Weisbecker, Peter Xu, Nitesh Narayan Lal, Alex Belits Setting the realtime clock triggers an IPI to all CPUs to reprogram the clock event device. However, only realtime and TAI clocks have their offsets updated (and therefore potentially require a reprogram). Instead of sending an IPI unconditionally, check each per CPU hrtimer base whether it has active timers in the CLOCK_REALTIME and CLOCK_TAI bases. If that's not the case, update the realtime and TAI base offsets remotely and skip the IPI. This ensures that any subsequently armed timers on CLOCK_REALTIME and CLOCK_TAI are evaluated with the correct offsets. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> --- v3: - Nicer changelog (Thomas). - Code style fixes (Thomas). - Compilation warning with CONFIG_HIGH_RES_TIMERS=n (Thomas). - Shrink preemption disabled section (Thomas). v2: - Only REALTIME and TAI bases are affected by offset-to-monotonic changes (Thomas). - Don't special case nohz_full CPUs (Thomas). diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c index 5c9d968187ae..dd9c0d2f469f 100644 --- a/kernel/time/hrtimer.c +++ b/kernel/time/hrtimer.c @@ -871,6 +871,24 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram) tick_program_event(expires, 1); } +#define CLOCK_SET_BASES ((1U << HRTIMER_BASE_REALTIME) | \ + (1U << HRTIMER_BASE_REALTIME_SOFT) | \ + (1U << HRTIMER_BASE_TAI) | \ + (1U << HRTIMER_BASE_TAI_SOFT)) + +static bool need_reprogram_timer(struct hrtimer_cpu_base *cpu_base) +{ + unsigned int active = 0; + + if (cpu_base->softirq_activated) + return true; + + active = cpu_base->active_bases & HRTIMER_ACTIVE_SOFT; + active = active | (cpu_base->active_bases & HRTIMER_ACTIVE_HARD); + + return (cpu_base->active_bases & CLOCK_SET_BASES) != 0; +} + /* * Clock realtime was set * @@ -885,8 +903,32 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram) void clock_was_set(void) { #ifdef CONFIG_HIGH_RES_TIMERS - /* Retrigger the CPU local events everywhere */ - on_each_cpu(retrigger_next_event, NULL, 1); + cpumask_var_t mask; + int cpu; + + if (!zalloc_cpumask_var(&mask, GFP_KERNEL)) { + on_each_cpu(retrigger_next_event, NULL, 1); + goto set_timerfd; + } + + /* Avoid interrupting CPUs if possible */ + cpus_read_lock(); + for_each_online_cpu(cpu) { + unsigned long flags; + struct hrtimer_cpu_base *cpu_base = &per_cpu(hrtimer_bases, cpu); + + raw_spin_lock_irqsave(&cpu_base->lock, flags); + if (need_reprogram_timer(cpu_base)) + cpumask_set_cpu(cpu, mask); + raw_spin_unlock_irqrestore(&cpu_base->lock, flags); + } + + preempt_disable(); + smp_call_function_many(mask, retrigger_next_event, NULL, 1); + preempt_enable(); + cpus_read_unlock(); + free_cpumask_var(mask); +set_timerfd: #endif timerfd_clock_was_set(); } ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v3] hrtimer: avoid retrigger_next_event IPI 2021-04-15 15:39 ` [PATCH v3] " Marcelo Tosatti @ 2021-04-15 18:59 ` Thomas Gleixner 2021-04-15 20:40 ` [PATCH v4] " Marcelo Tosatti 0 siblings, 1 reply; 20+ messages in thread From: Thomas Gleixner @ 2021-04-15 18:59 UTC (permalink / raw) To: Marcelo Tosatti Cc: Anna-Maria Behnsen, linux-kernel, Frederic Weisbecker, Peter Xu, Nitesh Narayan Lal, Alex Belits On Thu, Apr 15 2021 at 12:39, Marcelo Tosatti wrote: > +static bool need_reprogram_timer(struct hrtimer_cpu_base *cpu_base) > +{ > + unsigned int active = 0; > + > + if (cpu_base->softirq_activated) > + return true; > + > + active = cpu_base->active_bases & HRTIMER_ACTIVE_SOFT; > + active = active | (cpu_base->active_bases & HRTIMER_ACTIVE_HARD); It's not entirely clear to me what that 'active' variable business above is doing and why it's needed at all. But I might be missing something. > + return (cpu_base->active_bases & CLOCK_SET_BASES) != 0; > +} ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v4] hrtimer: avoid retrigger_next_event IPI 2021-04-15 18:59 ` Thomas Gleixner @ 2021-04-15 20:40 ` Marcelo Tosatti 2021-04-16 16:00 ` [PATCH v5] " Marcelo Tosatti 0 siblings, 1 reply; 20+ messages in thread From: Marcelo Tosatti @ 2021-04-15 20:40 UTC (permalink / raw) To: Thomas Gleixner Cc: Anna-Maria Behnsen, linux-kernel, Frederic Weisbecker, Peter Xu, Nitesh Narayan Lal, Alex Belits Setting the realtime clock triggers an IPI to all CPUs to reprogram the clock event device. However, only realtime and TAI clocks have their offsets updated (and therefore potentially require a reprogram). Instead of sending an IPI unconditionally, check each per CPU hrtimer base whether it has active timers in the CLOCK_REALTIME and CLOCK_TAI bases. If that's not the case, update the realtime and TAI base offsets remotely and skip the IPI. This ensures that any subsequently armed timers on CLOCK_REALTIME and CLOCK_TAI are evaluated with the correct offsets. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> --- v4: - Drop unused code (Thomas). v3: - Nicer changelog (Thomas). - Code style fixes (Thomas). - Compilation warning with CONFIG_HIGH_RES_TIMERS=n (Thomas). - Shrink preemption disabled section (Thomas). v2: - Only REALTIME and TAI bases are affected by offset-to-monotonic changes (Thomas). - Don't special case nohz_full CPUs (Thomas). diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c index 5c9d968187ae..e228c0a0c98f 100644 --- a/kernel/time/hrtimer.c +++ b/kernel/time/hrtimer.c @@ -871,6 +871,19 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram) tick_program_event(expires, 1); } +#define CLOCK_SET_BASES ((1U << HRTIMER_BASE_REALTIME) | \ + (1U << HRTIMER_BASE_REALTIME_SOFT) | \ + (1U << HRTIMER_BASE_TAI) | \ + (1U << HRTIMER_BASE_TAI_SOFT)) + +static bool need_reprogram_timer(struct hrtimer_cpu_base *cpu_base) +{ + if (cpu_base->softirq_activated) + return true; + + return (cpu_base->active_bases & CLOCK_SET_BASES) != 0; +} + /* * Clock realtime was set * @@ -885,8 +898,32 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram) void clock_was_set(void) { #ifdef CONFIG_HIGH_RES_TIMERS - /* Retrigger the CPU local events everywhere */ - on_each_cpu(retrigger_next_event, NULL, 1); + cpumask_var_t mask; + int cpu; + + if (!zalloc_cpumask_var(&mask, GFP_KERNEL)) { + on_each_cpu(retrigger_next_event, NULL, 1); + goto set_timerfd; + } + + /* Avoid interrupting CPUs if possible */ + cpus_read_lock(); + for_each_online_cpu(cpu) { + unsigned long flags; + struct hrtimer_cpu_base *cpu_base = &per_cpu(hrtimer_bases, cpu); + + raw_spin_lock_irqsave(&cpu_base->lock, flags); + if (need_reprogram_timer(cpu_base)) + cpumask_set_cpu(cpu, mask); + raw_spin_unlock_irqrestore(&cpu_base->lock, flags); + } + + preempt_disable(); + smp_call_function_many(mask, retrigger_next_event, NULL, 1); + preempt_enable(); + cpus_read_unlock(); + free_cpumask_var(mask); +set_timerfd: #endif timerfd_clock_was_set(); } ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v5] hrtimer: avoid retrigger_next_event IPI 2021-04-15 20:40 ` [PATCH v4] " Marcelo Tosatti @ 2021-04-16 16:00 ` Marcelo Tosatti 2021-04-16 17:13 ` Peter Xu 2021-04-19 19:39 ` [PATCH v6] " Marcelo Tosatti 0 siblings, 2 replies; 20+ messages in thread From: Marcelo Tosatti @ 2021-04-16 16:00 UTC (permalink / raw) To: Thomas Gleixner Cc: Anna-Maria Behnsen, linux-kernel, Frederic Weisbecker, Peter Xu, Nitesh Narayan Lal, Alex Belits Setting the realtime clock triggers an IPI to all CPUs to reprogram the clock event device. However, only realtime and TAI clocks have their offsets updated (and therefore potentially require a reprogram). Instead of sending an IPI unconditionally, check each per CPU hrtimer base whether it has active timers in the CLOCK_REALTIME and CLOCK_TAI bases. If that's not the case, update the realtime and TAI base offsets remotely and skip the IPI. This ensures that any subsequently armed timers on CLOCK_REALTIME and CLOCK_TAI are evaluated with the correct offsets. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> --- v5: - Add missing hrtimer_update_base (Peter Xu). v4: - Drop unused code (Thomas). v3: - Nicer changelog (Thomas). - Code style fixes (Thomas). - Compilation warning with CONFIG_HIGH_RES_TIMERS=n (Thomas). - Shrink preemption disabled section (Thomas). v2: - Only REALTIME and TAI bases are affected by offset-to-monotonic changes (Thomas). - Don't special case nohz_full CPUs (Thomas). diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c index 5c9d968187ae..06fcc272e28d 100644 --- a/kernel/time/hrtimer.c +++ b/kernel/time/hrtimer.c @@ -871,6 +871,19 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram) tick_program_event(expires, 1); } +#define CLOCK_SET_BASES ((1U << HRTIMER_BASE_REALTIME) | \ + (1U << HRTIMER_BASE_REALTIME_SOFT) | \ + (1U << HRTIMER_BASE_TAI) | \ + (1U << HRTIMER_BASE_TAI_SOFT)) + +static bool need_reprogram_timer(struct hrtimer_cpu_base *cpu_base) +{ + if (cpu_base->softirq_activated) + return true; + + return (cpu_base->active_bases & CLOCK_SET_BASES) != 0; +} + /* * Clock realtime was set * @@ -885,8 +898,34 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram) void clock_was_set(void) { #ifdef CONFIG_HIGH_RES_TIMERS - /* Retrigger the CPU local events everywhere */ - on_each_cpu(retrigger_next_event, NULL, 1); + cpumask_var_t mask; + int cpu; + + if (!zalloc_cpumask_var(&mask, GFP_KERNEL)) { + on_each_cpu(retrigger_next_event, NULL, 1); + goto set_timerfd; + } + + /* Avoid interrupting CPUs if possible */ + cpus_read_lock(); + for_each_online_cpu(cpu) { + unsigned long flags; + struct hrtimer_cpu_base *cpu_base = &per_cpu(hrtimer_bases, cpu); + + raw_spin_lock_irqsave(&cpu_base->lock, flags); + if (need_reprogram_timer(cpu_base)) + cpumask_set_cpu(cpu, mask); + else + hrtimer_update_base(cpu_base); + raw_spin_unlock_irqrestore(&cpu_base->lock, flags); + } + + preempt_disable(); + smp_call_function_many(mask, retrigger_next_event, NULL, 1); + preempt_enable(); + cpus_read_unlock(); + free_cpumask_var(mask); +set_timerfd: #endif timerfd_clock_was_set(); } ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v5] hrtimer: avoid retrigger_next_event IPI 2021-04-16 16:00 ` [PATCH v5] " Marcelo Tosatti @ 2021-04-16 17:13 ` Peter Xu 2021-04-17 16:24 ` Thomas Gleixner 2021-04-19 19:39 ` [PATCH v6] " Marcelo Tosatti 1 sibling, 1 reply; 20+ messages in thread From: Peter Xu @ 2021-04-16 17:13 UTC (permalink / raw) To: Marcelo Tosatti Cc: Thomas Gleixner, Anna-Maria Behnsen, linux-kernel, Frederic Weisbecker, Nitesh Narayan Lal, Alex Belits Hi, all, On Fri, Apr 16, 2021 at 01:00:23PM -0300, Marcelo Tosatti wrote: > > Setting the realtime clock triggers an IPI to all CPUs to reprogram > the clock event device. > > However, only realtime and TAI clocks have their offsets updated > (and therefore potentially require a reprogram). > > Instead of sending an IPI unconditionally, check each per CPU hrtimer base > whether it has active timers in the CLOCK_REALTIME and CLOCK_TAI bases. If > that's not the case, update the realtime and TAI base offsets remotely and > skip the IPI. This ensures that any subsequently armed timers on > CLOCK_REALTIME and CLOCK_TAI are evaluated with the correct offsets. > > Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> > > --- > > v5: > - Add missing hrtimer_update_base (Peter Xu). > > v4: > - Drop unused code (Thomas). > > v3: > - Nicer changelog (Thomas). > - Code style fixes (Thomas). > - Compilation warning with CONFIG_HIGH_RES_TIMERS=n (Thomas). > - Shrink preemption disabled section (Thomas). > > v2: > - Only REALTIME and TAI bases are affected by offset-to-monotonic changes (Thomas). > - Don't special case nohz_full CPUs (Thomas). > > > diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c > index 5c9d968187ae..06fcc272e28d 100644 > --- a/kernel/time/hrtimer.c > +++ b/kernel/time/hrtimer.c > @@ -871,6 +871,19 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram) > tick_program_event(expires, 1); > } > > +#define CLOCK_SET_BASES ((1U << HRTIMER_BASE_REALTIME) | \ > + (1U << HRTIMER_BASE_REALTIME_SOFT) | \ > + (1U << HRTIMER_BASE_TAI) | \ > + (1U << HRTIMER_BASE_TAI_SOFT)) > + > +static bool need_reprogram_timer(struct hrtimer_cpu_base *cpu_base) > +{ > + if (cpu_base->softirq_activated) > + return true; A pure question on whether this check is needed... Here even if softirq_activated==1 (as softirq is going to happen), as long as (cpu_base->active_bases & CLOCK_SET_BASES)==0, shouldn't it already mean that "yes indeed clock was set, but no need to kick this cpu as no relevant timer"? As that question seems to be orthogonal to whether a softirq is going to trigger on that cpu. Thanks, > + > + return (cpu_base->active_bases & CLOCK_SET_BASES) != 0; > +} -- Peter Xu ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v5] hrtimer: avoid retrigger_next_event IPI 2021-04-16 17:13 ` Peter Xu @ 2021-04-17 16:24 ` Thomas Gleixner 2021-04-17 16:51 ` Thomas Gleixner 0 siblings, 1 reply; 20+ messages in thread From: Thomas Gleixner @ 2021-04-17 16:24 UTC (permalink / raw) To: Peter Xu, Marcelo Tosatti Cc: Anna-Maria Behnsen, linux-kernel, Frederic Weisbecker, Nitesh Narayan Lal, Alex Belits On Fri, Apr 16 2021 at 13:13, Peter Xu wrote: > On Fri, Apr 16, 2021 at 01:00:23PM -0300, Marcelo Tosatti wrote: >> >> +#define CLOCK_SET_BASES ((1U << HRTIMER_BASE_REALTIME) | \ >> + (1U << HRTIMER_BASE_REALTIME_SOFT) | \ >> + (1U << HRTIMER_BASE_TAI) | \ >> + (1U << HRTIMER_BASE_TAI_SOFT)) >> + >> +static bool need_reprogram_timer(struct hrtimer_cpu_base *cpu_base) >> +{ >> + if (cpu_base->softirq_activated) >> + return true; > > A pure question on whether this check is needed... > > Here even if softirq_activated==1 (as softirq is going to happen), as long as > (cpu_base->active_bases & CLOCK_SET_BASES)==0, shouldn't it already mean that > "yes indeed clock was set, but no need to kick this cpu as no relevant timer"? > As that question seems to be orthogonal to whether a softirq is going to > trigger on that cpu. That's correct and it's not any different from firing the IPI because in both cases the update happens with the base lock of the CPU in question held. And if there are no active timers in any of the affected bases, then there is no need to reevaluate the next expiry because the offset update does not affect any armed timers. It just makes sure that the next enqueu of a timer on such a base will see the the correct offset. I'll just zap it. Thanks, tglx ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v5] hrtimer: avoid retrigger_next_event IPI 2021-04-17 16:24 ` Thomas Gleixner @ 2021-04-17 16:51 ` Thomas Gleixner 2021-04-19 18:56 ` Marcelo Tosatti 0 siblings, 1 reply; 20+ messages in thread From: Thomas Gleixner @ 2021-04-17 16:51 UTC (permalink / raw) To: Peter Xu, Marcelo Tosatti Cc: Anna-Maria Behnsen, linux-kernel, Frederic Weisbecker, Nitesh Narayan Lal, Alex Belits On Sat, Apr 17 2021 at 18:24, Thomas Gleixner wrote: > On Fri, Apr 16 2021 at 13:13, Peter Xu wrote: >> On Fri, Apr 16, 2021 at 01:00:23PM -0300, Marcelo Tosatti wrote: >>> >>> +#define CLOCK_SET_BASES ((1U << HRTIMER_BASE_REALTIME) | \ >>> + (1U << HRTIMER_BASE_REALTIME_SOFT) | \ >>> + (1U << HRTIMER_BASE_TAI) | \ >>> + (1U << HRTIMER_BASE_TAI_SOFT)) >>> + >>> +static bool need_reprogram_timer(struct hrtimer_cpu_base *cpu_base) >>> +{ >>> + if (cpu_base->softirq_activated) >>> + return true; >> >> A pure question on whether this check is needed... >> >> Here even if softirq_activated==1 (as softirq is going to happen), as long as >> (cpu_base->active_bases & CLOCK_SET_BASES)==0, shouldn't it already mean that >> "yes indeed clock was set, but no need to kick this cpu as no relevant timer"? >> As that question seems to be orthogonal to whether a softirq is going to >> trigger on that cpu. > > That's correct and it's not any different from firing the IPI because in > both cases the update happens with the base lock of the CPU in question > held. And if there are no active timers in any of the affected bases, > then there is no need to reevaluate the next expiry because the offset > update does not affect any armed timers. It just makes sure that the > next enqueu of a timer on such a base will see the the correct offset. > > I'll just zap it. But the whole thing is still wrong in two aspects: 1) BOOTTIME can be one of the affected clocks when sleep time (suspended time) is injected because that uses the same mechanism. Sorry for missing that earlier when I asked to remove it, but that's trivial to fix by adding the BOOTTIME base back. 2) What's worse is that on resume this might break because that mechanism is also used to enforce the reprogramming of the clock event devices and there we cannot be selective on clock bases. I need to dig deeper into that because suspend/resume has changed a lot over time, so this might be just a historical leftover. But without proper analysis we might end up with subtle and hard to debug wreckage. Thanks, tglx ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v5] hrtimer: avoid retrigger_next_event IPI 2021-04-17 16:51 ` Thomas Gleixner @ 2021-04-19 18:56 ` Marcelo Tosatti 0 siblings, 0 replies; 20+ messages in thread From: Marcelo Tosatti @ 2021-04-19 18:56 UTC (permalink / raw) To: Thomas Gleixner Cc: Peter Xu, Anna-Maria Behnsen, linux-kernel, Frederic Weisbecker, Nitesh Narayan Lal, Alex Belits On Sat, Apr 17, 2021 at 06:51:08PM +0200, Thomas Gleixner wrote: > On Sat, Apr 17 2021 at 18:24, Thomas Gleixner wrote: > > On Fri, Apr 16 2021 at 13:13, Peter Xu wrote: > >> On Fri, Apr 16, 2021 at 01:00:23PM -0300, Marcelo Tosatti wrote: > >>> > >>> +#define CLOCK_SET_BASES ((1U << HRTIMER_BASE_REALTIME) | \ > >>> + (1U << HRTIMER_BASE_REALTIME_SOFT) | \ > >>> + (1U << HRTIMER_BASE_TAI) | \ > >>> + (1U << HRTIMER_BASE_TAI_SOFT)) > >>> + > >>> +static bool need_reprogram_timer(struct hrtimer_cpu_base *cpu_base) > >>> +{ > >>> + if (cpu_base->softirq_activated) > >>> + return true; > >> > >> A pure question on whether this check is needed... > >> > >> Here even if softirq_activated==1 (as softirq is going to happen), as long as > >> (cpu_base->active_bases & CLOCK_SET_BASES)==0, shouldn't it already mean that > >> "yes indeed clock was set, but no need to kick this cpu as no relevant timer"? > >> As that question seems to be orthogonal to whether a softirq is going to > >> trigger on that cpu. > > > > That's correct and it's not any different from firing the IPI because in > > both cases the update happens with the base lock of the CPU in question > > held. And if there are no active timers in any of the affected bases, > > then there is no need to reevaluate the next expiry because the offset > > update does not affect any armed timers. It just makes sure that the > > next enqueu of a timer on such a base will see the the correct offset. > > > > I'll just zap it. > > But the whole thing is still wrong in two aspects: > > 1) BOOTTIME can be one of the affected clocks when sleep time > (suspended time) is injected because that uses the same mechanism. > > Sorry for missing that earlier when I asked to remove it, but > that's trivial to fix by adding the BOOTTIME base back. > > 2) What's worse is that on resume this might break because that > mechanism is also used to enforce the reprogramming of the clock > event devices and there we cannot be selective on clock bases. > > I need to dig deeper into that because suspend/resume has changed > a lot over time, so this might be just a historical leftover. But > without proper analysis we might end up with subtle and hard to > debug wreckage. > > Thanks, > > tglx Thomas, There is no gain in avoiding the IPIs for the suspend/resume case (since suspending is a large interruption anyway). To avoid the potential complexity (and associated bugs), one option would be to NOT skip IPIs for the resume case. Sending -v6 with that (and other suggestions/fixes). ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v6] hrtimer: avoid retrigger_next_event IPI 2021-04-16 16:00 ` [PATCH v5] " Marcelo Tosatti 2021-04-16 17:13 ` Peter Xu @ 2021-04-19 19:39 ` Marcelo Tosatti 2021-04-19 20:52 ` Thomas Gleixner 1 sibling, 1 reply; 20+ messages in thread From: Marcelo Tosatti @ 2021-04-19 19:39 UTC (permalink / raw) To: Thomas Gleixner Cc: Anna-Maria Behnsen, linux-kernel, Frederic Weisbecker, Peter Xu, Nitesh Narayan Lal, Alex Belits Setting the realtime clock triggers an IPI to all CPUs to reprogram the clock event device. However, only realtime and TAI clocks have their offsets updated (and therefore potentially require a reprogram). Instead of sending an IPI unconditionally, check each per CPU hrtimer base whether it has active timers in the CLOCK_REALTIME and CLOCK_TAI bases. If that's not the case, update the realtime and TAI base offsets remotely and skip the IPI. This ensures that any subsequently armed timers on CLOCK_REALTIME and CLOCK_TAI are evaluated with the correct offsets. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> --- v6: - Do not take softirq_raised into account (Peter Xu). - Include BOOTTIME as base that requires IPI (Thomas). - Unconditional reprogram on resume path, since there is nothing to gain in such path anyway. v5: - Add missing hrtimer_update_base (Peter Xu). v4: - Drop unused code (Thomas). v3: - Nicer changelog (Thomas). - Code style fixes (Thomas). - Compilation warning with CONFIG_HIGH_RES_TIMERS=n (Thomas). - Shrink preemption disabled section (Thomas). v2: - Only REALTIME and TAI bases are affected by offset-to-monotonic changes (Thomas). - Don't special case nohz_full CPUs (Thomas). diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h index bb5e7b0a4274..14a6e449b221 100644 --- a/include/linux/hrtimer.h +++ b/include/linux/hrtimer.h @@ -318,7 +318,7 @@ struct clock_event_device; extern void hrtimer_interrupt(struct clock_event_device *dev); -extern void clock_was_set_delayed(void); +extern void clock_was_set_delayed(bool force_reprogram); extern unsigned int hrtimer_resolution; @@ -326,7 +326,7 @@ extern unsigned int hrtimer_resolution; #define hrtimer_resolution (unsigned int)LOW_RES_NSEC -static inline void clock_was_set_delayed(void) { } +static inline void clock_was_set_delayed(bool force_reprogram) { } #endif @@ -351,7 +351,7 @@ hrtimer_expires_remaining_adjusted(const struct hrtimer *timer) timer->base->get_time()); } -extern void clock_was_set(void); +extern void clock_was_set(bool); #ifdef CONFIG_TIMERFD extern void timerfd_clock_was_set(void); #else diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c index 5c9d968187ae..2258782fd714 100644 --- a/kernel/time/hrtimer.c +++ b/kernel/time/hrtimer.c @@ -758,9 +758,17 @@ static void hrtimer_switch_to_hres(void) retrigger_next_event(NULL); } +static void clock_was_set_force_reprogram_work(struct work_struct *work) +{ + clock_was_set(true); +} + +static DECLARE_WORK(hrtimer_force_reprogram_work, clock_was_set_force_reprogram_work); + + static void clock_was_set_work(struct work_struct *work) { - clock_was_set(); + clock_was_set(false); } static DECLARE_WORK(hrtimer_work, clock_was_set_work); @@ -769,9 +777,12 @@ static DECLARE_WORK(hrtimer_work, clock_was_set_work); * Called from timekeeping and resume code to reprogram the hrtimer * interrupt device on all cpus. */ -void clock_was_set_delayed(void) +void clock_was_set_delayed(bool force_reprogram) { - schedule_work(&hrtimer_work); + if (force_reprogram) + schedule_work(&hrtimer_force_reprogram_work); + else + schedule_work(&hrtimer_work); } #else @@ -871,6 +882,18 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram) tick_program_event(expires, 1); } +#define CLOCK_SET_BASES ((1U << HRTIMER_BASE_REALTIME) | \ + (1U << HRTIMER_BASE_REALTIME_SOFT) | \ + (1U << HRTIMER_BASE_TAI) | \ + (1U << HRTIMER_BASE_TAI_SOFT) | \ + (1U << HRTIMER_BASE_BOOTTIME) | \ + (1U << HRTIMER_BASE_BOOTTIME_SOFT)) + +static bool need_reprogram_timer(struct hrtimer_cpu_base *cpu_base) +{ + return (cpu_base->active_bases & CLOCK_SET_BASES) != 0; +} + /* * Clock realtime was set * @@ -882,11 +905,42 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram) * resolution timer interrupts. On UP we just disable interrupts and * call the high resolution interrupt code. */ -void clock_was_set(void) +void clock_was_set(bool force_reprogram) { #ifdef CONFIG_HIGH_RES_TIMERS - /* Retrigger the CPU local events everywhere */ - on_each_cpu(retrigger_next_event, NULL, 1); + cpumask_var_t mask; + int cpu; + + if (force_reprogram == true) { + on_each_cpu(retrigger_next_event, NULL, 1); + goto set_timerfd; + } + + if (!zalloc_cpumask_var(&mask, GFP_KERNEL)) { + on_each_cpu(retrigger_next_event, NULL, 1); + goto set_timerfd; + } + + /* Avoid interrupting CPUs if possible */ + cpus_read_lock(); + for_each_online_cpu(cpu) { + unsigned long flags; + struct hrtimer_cpu_base *cpu_base = &per_cpu(hrtimer_bases, cpu); + + raw_spin_lock_irqsave(&cpu_base->lock, flags); + if (need_reprogram_timer(cpu_base)) + cpumask_set_cpu(cpu, mask); + else + hrtimer_update_base(cpu_base); + raw_spin_unlock_irqrestore(&cpu_base->lock, flags); + } + + preempt_disable(); + smp_call_function_many(mask, retrigger_next_event, NULL, 1); + preempt_enable(); + cpus_read_unlock(); + free_cpumask_var(mask); +set_timerfd: #endif timerfd_clock_was_set(); } @@ -903,7 +957,7 @@ void hrtimers_resume(void) /* Retrigger on the local CPU */ retrigger_next_event(NULL); /* And schedule a retrigger for all others */ - clock_was_set_delayed(); + clock_was_set_delayed(true); } /* diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c index 6aee5768c86f..3fef237267bd 100644 --- a/kernel/time/timekeeping.c +++ b/kernel/time/timekeeping.c @@ -1323,7 +1323,7 @@ int do_settimeofday64(const struct timespec64 *ts) raw_spin_unlock_irqrestore(&timekeeper_lock, flags); /* signal hrtimers about time change */ - clock_was_set(); + clock_was_set(false); if (!ret) audit_tk_injoffset(ts_delta); @@ -1371,7 +1371,7 @@ static int timekeeping_inject_offset(const struct timespec64 *ts) raw_spin_unlock_irqrestore(&timekeeper_lock, flags); /* signal hrtimers about time change */ - clock_was_set(); + clock_was_set(false); return ret; } @@ -1736,7 +1736,7 @@ void timekeeping_inject_sleeptime64(const struct timespec64 *delta) raw_spin_unlock_irqrestore(&timekeeper_lock, flags); /* signal hrtimers about time change */ - clock_was_set(); + clock_was_set(true); } #endif @@ -2187,7 +2187,7 @@ static void timekeeping_advance(enum timekeeping_adv_mode mode) raw_spin_unlock_irqrestore(&timekeeper_lock, flags); if (clock_set) /* Have to call _delayed version, since in irq context*/ - clock_was_set_delayed(); + clock_was_set_delayed(false); } /** @@ -2425,7 +2425,7 @@ int do_adjtimex(struct __kernel_timex *txc) timekeeping_advance(TK_ADV_FREQ); if (tai != orig_tai) - clock_was_set(); + clock_was_set(false); ntp_notify_cmos_timer(); ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v6] hrtimer: avoid retrigger_next_event IPI 2021-04-19 19:39 ` [PATCH v6] " Marcelo Tosatti @ 2021-04-19 20:52 ` Thomas Gleixner 0 siblings, 0 replies; 20+ messages in thread From: Thomas Gleixner @ 2021-04-19 20:52 UTC (permalink / raw) To: Marcelo Tosatti Cc: Anna-Maria Behnsen, linux-kernel, Frederic Weisbecker, Peter Xu, Nitesh Narayan Lal, Alex Belits On Mon, Apr 19 2021 at 16:39, Marcelo Tosatti wrote: > > +static void clock_was_set_force_reprogram_work(struct work_struct *work) > +{ > + clock_was_set(true); > +} > + > +static DECLARE_WORK(hrtimer_force_reprogram_work, clock_was_set_force_reprogram_work); > + > + > static void clock_was_set_work(struct work_struct *work) > { > - clock_was_set(); > + clock_was_set(false); > } > > static DECLARE_WORK(hrtimer_work, clock_was_set_work); > @@ -769,9 +777,12 @@ static DECLARE_WORK(hrtimer_work, clock_was_set_work); > * Called from timekeeping and resume code to reprogram the hrtimer > * interrupt device on all cpus. > */ > -void clock_was_set_delayed(void) > +void clock_was_set_delayed(bool force_reprogram) > { > - schedule_work(&hrtimer_work); > + if (force_reprogram) > + schedule_work(&hrtimer_force_reprogram_work); > + else > + schedule_work(&hrtimer_work); > } > > #else > @@ -871,6 +882,18 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram) > tick_program_event(expires, 1); > } > > +#define CLOCK_SET_BASES ((1U << HRTIMER_BASE_REALTIME) | \ > + (1U << HRTIMER_BASE_REALTIME_SOFT) | \ > + (1U << HRTIMER_BASE_TAI) | \ > + (1U << HRTIMER_BASE_TAI_SOFT) | \ > + (1U << HRTIMER_BASE_BOOTTIME) | \ > + (1U << HRTIMER_BASE_BOOTTIME_SOFT)) > + > +static bool need_reprogram_timer(struct hrtimer_cpu_base *cpu_base) > +{ > + return (cpu_base->active_bases & CLOCK_SET_BASES) != 0; > +} > + > /* > * Clock realtime was set > * > @@ -882,11 +905,42 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram) > * resolution timer interrupts. On UP we just disable interrupts and > * call the high resolution interrupt code. > */ > -void clock_was_set(void) > +void clock_was_set(bool force_reprogram) > { > #ifdef CONFIG_HIGH_RES_TIMERS > - /* Retrigger the CPU local events everywhere */ > - on_each_cpu(retrigger_next_event, NULL, 1); > + cpumask_var_t mask; > + int cpu; > + > + if (force_reprogram == true) { > + on_each_cpu(retrigger_next_event, NULL, 1); > + goto set_timerfd; > + } > + > + if (!zalloc_cpumask_var(&mask, GFP_KERNEL)) { > + on_each_cpu(retrigger_next_event, NULL, 1); > + goto set_timerfd; > + } This is really horrible and the proper thing to do is to seperate the s2idle/resume related functionality which allows to do some other cleanups as well. I already started to look at that over the weekend, but got stuck due to a couple of correctness issues I found with the whole clock_was_set() mechanism while looking. That stuff needs to be fixed first. Back to the above. For the s2idle resume path there is no reason for an IPI at all. It's just that way because it has been bolted on the existing machinery. So today it does: tick_unfreeze() { if (first_cpu_to_unfreeze()) { timekeeping_resume(); tick_resume(); tick_resume_local(); clock_was_set_delayed(); } else { tick_resume_local(); } } Then after everything is unfrozen including workqueues the delayed clock_was_set() runs and does the IPI dance. That's just silly. We can be smarter than that: tick_unfreeze() { if (first_cpu_to_unfreeze()) timekeeping_resume(); tick_resume(); tick_resume_local(); hrtimer_resume_local(); } else { tick_resume_local(); hrtimer_resume_local(); } } See? No IPI required at all and no magic force argument and as a bonus we get also understandable code. Splitting this properly allows to be smarter about the affected clocks because most of the clock_was_set() events only care about CLOCK_REALTIME and CLOCK_TAI and just the sleep time injection affects CLOCK_BOOTTIME as well. There are a few other things which can be done to avoid even more IPIs, but let me address the correctness issues of clock_was_set() first. Thanks, tglx ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2021-04-19 20:54 UTC | newest] Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-04-07 13:53 [PATCH] hrtimer: avoid retrigger_next_event IPI Marcelo Tosatti 2021-04-07 19:28 ` kernel test robot 2021-04-07 19:28 ` kernel test robot 2021-04-07 22:14 ` Frederic Weisbecker 2021-04-08 12:27 ` Marcelo Tosatti 2021-04-09 14:15 ` Thomas Gleixner 2021-04-09 16:51 ` Marcelo Tosatti 2021-04-10 7:53 ` Thomas Gleixner 2021-04-13 17:04 ` [PATCH v2] " Marcelo Tosatti 2021-04-14 17:19 ` Thomas Gleixner 2021-04-15 15:39 ` [PATCH v3] " Marcelo Tosatti 2021-04-15 18:59 ` Thomas Gleixner 2021-04-15 20:40 ` [PATCH v4] " Marcelo Tosatti 2021-04-16 16:00 ` [PATCH v5] " Marcelo Tosatti 2021-04-16 17:13 ` Peter Xu 2021-04-17 16:24 ` Thomas Gleixner 2021-04-17 16:51 ` Thomas Gleixner 2021-04-19 18:56 ` Marcelo Tosatti 2021-04-19 19:39 ` [PATCH v6] " Marcelo Tosatti 2021-04-19 20:52 ` Thomas Gleixner
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.