From: peterz@infradead.org To: qianjun.kernel@gmail.com Cc: tglx@linutronix.de, will@kernel.org, luto@kernel.org, linux-kernel@vger.kernel.org, laoar.shao@gmail.com, urezki@gmail.com, frederic@kernel.org Subject: Re: [PATCH V6 1/1] Softirq:avoid large sched delay from the pending softirqs Date: Fri, 11 Sep 2020 17:55:55 +0200 [thread overview] Message-ID: <20200911155555.GX2674@hirez.programming.kicks-ass.net> (raw) In-Reply-To: <20200909090931.8836-1-qianjun.kernel@gmail.com> [-- Attachment #1: Type: text/plain, Size: 4585 bytes --] On Wed, Sep 09, 2020 at 05:09:31PM +0800, qianjun.kernel@gmail.com wrote: > From: jun qian <qianjun.kernel@gmail.com> > > When get the pending softirqs, it need to process all the pending > softirqs in the while loop. If the processing time of each pending > softirq is need more than 2 msec in this loop, or one of the softirq > will running a long time, according to the original code logic, it > will process all the pending softirqs without wakeuping ksoftirqd, > which will cause a relatively large scheduling delay on the > corresponding CPU, which we do not wish to see. The patch will check > the total time to process pending softirq, if the time exceeds 2 ms > we need to wakeup the ksofirqd to aviod large sched delay. But what is all that unreadaable gibberish with pending_new_{flag,bit} ? Random comments below.. > +#define MAX_SOFTIRQ_TIME_NS 2000000 2*NSEC_PER_MSEC > +DEFINE_PER_CPU(__u32, pending_new_flag); > +DEFINE_PER_CPU(__u32, pending_next_bit); __u32 is for userspace ABI, this is not it, use u32 > +#define SOFTIRQ_PENDING_MASK ((1UL << NR_SOFTIRQS) - 1) > + > asmlinkage __visible void __softirq_entry __do_softirq(void) > { > - unsigned long end = jiffies + MAX_SOFTIRQ_TIME; > + u64 end = sched_clock() + MAX_SOFTIRQ_TIME_NS; > unsigned long old_flags = current->flags; > int max_restart = MAX_SOFTIRQ_RESTART; > struct softirq_action *h; > bool in_hardirq; > - __u32 pending; > - int softirq_bit; > + __u32 pending, pending_left, pending_new; > + int softirq_bit, next_bit; > + unsigned long flags; > > /* > * Mask out PF_MEMALLOC as the current task context is borrowed for the > @@ -277,10 +282,33 @@ asmlinkage __visible void __softirq_entry __do_softirq(void) > > h = softirq_vec; > > - while ((softirq_bit = ffs(pending))) { > - unsigned int vec_nr; > + next_bit = per_cpu(pending_next_bit, smp_processor_id()); > + per_cpu(pending_new_flag, smp_processor_id()) = 0; __this_cpu_read() / __this_cpu_write() > + > + pending_left = pending & > + (SOFTIRQ_PENDING_MASK << next_bit); > + pending_new = pending & > + (SOFTIRQ_PENDING_MASK >> (NR_SOFTIRQS - next_bit)); The second mask is the inverse of the first. > + /* > + * In order to be fair, we shold process the pengding bits by the > + * last processing order. > + */ > + while ((softirq_bit = ffs(pending_left)) || > + (softirq_bit = ffs(pending_new))) { > int prev_count; > + unsigned int vec_nr = 0; > > + /* > + * when the left pengding bits have been handled, we should > + * to reset the h to softirq_vec. > + */ > + if (!ffs(pending_left)) { > + if (per_cpu(pending_new_flag, smp_processor_id()) == 0) { > + h = softirq_vec; > + per_cpu(pending_new_flag, smp_processor_id()) = 1; > + } > + } > h += softirq_bit - 1; > > vec_nr = h - softirq_vec; > @@ -298,17 +326,44 @@ asmlinkage __visible void __softirq_entry __do_softirq(void) > preempt_count_set(prev_count); > } > h++; > - pending >>= softirq_bit; > + > + if (ffs(pending_left)) This is the _third_ ffs(pending_left), those things are _expensive_ (on some archs, see include/asm-generic/bitops/__ffs.h). > + pending_left >>= softirq_bit; > + else > + pending_new >>= softirq_bit; > + > + /* > + * the softirq's action has been run too much time, > + * so it may need to wakeup the ksoftirqd > + */ > + if (need_resched() && sched_clock() > end) { > + /* > + * Ensure that the remaining pending bits will be > + * handled. > + */ > + local_irq_save(flags); > + if (ffs(pending_left)) *fourth*... > + or_softirq_pending((pending_left << (vec_nr + 1)) | > + pending_new); > + else > + or_softirq_pending(pending_new << (vec_nr + 1)); > + local_irq_restore(flags); > + per_cpu(pending_next_bit, smp_processor_id()) = vec_nr + 1; > + break; > + } > } > > + /* reset the pending_next_bit */ > + per_cpu(pending_next_bit, smp_processor_id()) = 0; > + > if (__this_cpu_read(ksoftirqd) == current) > rcu_softirq_qs(); > local_irq_disable(); > > pending = local_softirq_pending(); > if (pending) { > - if (time_before(jiffies, end) && !need_resched() && > - --max_restart) > + if (!need_resched() && --max_restart && > + sched_clock() <= end) > goto restart; > > wakeup_softirqd(); This really wants to be a number of separate patches; and I quickly lost the plot in your code. Instead of cleaning things up, you're making an even bigger mess of things. That said, I _think_ I've managed to decode what you want. See the completely untested patches attached. [-- Attachment #2: peterz-softirq-fix-loop.patch --] [-- Type: text/x-diff, Size: 1456 bytes --] Subject: softirq: Rewrite softirq processing loop From: Peter Zijlstra <peterz@infradead.org> Date: Fri Sep 11 17:00:03 CEST 2020 Simplify the softirq processing loop by using the bitmap APIs Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> --- kernel/softirq.c | 16 ++++++---------- 1 file changed, 6 insertions(+), 10 deletions(-) --- a/kernel/softirq.c +++ b/kernel/softirq.c @@ -258,9 +258,9 @@ asmlinkage __visible void __softirq_entr unsigned long old_flags = current->flags; int max_restart = MAX_SOFTIRQ_RESTART; struct softirq_action *h; + unsigned long pending; + unsigned int vec_nr; bool in_hardirq; - __u32 pending; - int softirq_bit; /* * Mask out PF_MEMALLOC as the current task context is borrowed for the @@ -281,15 +281,13 @@ asmlinkage __visible void __softirq_entr local_irq_enable(); - h = softirq_vec; + for_each_set_bit(vec_nr, &pending, NR_SOFTIRQS) { + unsigned int prev_count; - while ((softirq_bit = ffs(pending))) { - unsigned int vec_nr; - int prev_count; + __clear_bit(vec_nr, &pending); - h += softirq_bit - 1; + h = softirq_vec + vec_nr; - vec_nr = h - softirq_vec; prev_count = preempt_count(); kstat_incr_softirqs_this_cpu(vec_nr); @@ -303,8 +301,6 @@ asmlinkage __visible void __softirq_entr prev_count, preempt_count()); preempt_count_set(prev_count); } - h++; - pending >>= softirq_bit; } if (__this_cpu_read(ksoftirqd) == current) [-- Attachment #3: peterz-softirq-timo.patch --] [-- Type: text/x-diff, Size: 1552 bytes --] Subject: softirq: Use sched_clock() based timeout From: Peter Zijlstra <peterz@infradead.org> Date: Fri Sep 11 17:30:01 CEST 2020 Replace the jiffies based timeout with a sched_clock() based one. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> --- kernel/softirq.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) --- a/kernel/softirq.c +++ b/kernel/softirq.c @@ -25,6 +25,7 @@ #include <linux/smpboot.h> #include <linux/tick.h> #include <linux/irq.h> +#include <linux/sched/clock.h> #define CREATE_TRACE_POINTS #include <trace/events/irq.h> @@ -216,7 +217,7 @@ EXPORT_SYMBOL(__local_bh_enable_ip); * we want to handle softirqs as soon as possible, but they * should not be able to lock up the box. */ -#define MAX_SOFTIRQ_TIME msecs_to_jiffies(2) +#define MAX_SOFTIRQ_TIME 2*NSEC_PER_MSEC #define MAX_SOFTIRQ_RESTART 10 #ifdef CONFIG_TRACE_IRQFLAGS @@ -254,9 +255,9 @@ static inline void lockdep_softirq_end(b asmlinkage __visible void __softirq_entry __do_softirq(void) { - unsigned long end = jiffies + MAX_SOFTIRQ_TIME; unsigned long old_flags = current->flags; int max_restart = MAX_SOFTIRQ_RESTART; + u64 start = sched_clock(); struct softirq_action *h; unsigned long pending; unsigned int vec_nr; @@ -309,7 +310,7 @@ asmlinkage __visible void __softirq_entr pending = local_softirq_pending(); if (pending) { - if (time_before(jiffies, end) && !need_resched() && + if (sched_clock() - start < MAX_SOFTIRQ_TIME && !need_resched() && --max_restart) goto restart; [-- Attachment #4: peterz-softirq-needs-break.patch --] [-- Type: text/x-diff, Size: 2703 bytes --] Subject: softirq: Factor loop termination condition From: Peter Zijlstra <peterz@infradead.org> Date: Fri Sep 11 17:17:20 CEST 2020 Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> --- kernel/softirq.c | 44 +++++++++++++++++++++++++------------------- 1 file changed, 25 insertions(+), 19 deletions(-) --- a/kernel/softirq.c +++ b/kernel/softirq.c @@ -204,22 +204,6 @@ void __local_bh_enable_ip(unsigned long } EXPORT_SYMBOL(__local_bh_enable_ip); -/* - * We restart softirq processing for at most MAX_SOFTIRQ_RESTART times, - * but break the loop if need_resched() is set or after 2 ms. - * The MAX_SOFTIRQ_TIME provides a nice upper bound in most cases, but in - * certain cases, such as stop_machine(), jiffies may cease to - * increment and so we need the MAX_SOFTIRQ_RESTART limit as - * well to make sure we eventually return from this method. - * - * These limits have been established via experimentation. - * The two things to balance is latency against fairness - - * we want to handle softirqs as soon as possible, but they - * should not be able to lock up the box. - */ -#define MAX_SOFTIRQ_TIME 2*NSEC_PER_MSEC -#define MAX_SOFTIRQ_RESTART 10 - #ifdef CONFIG_TRACE_IRQFLAGS /* * When we run softirqs from irq_exit() and thus on the hardirq stack we need @@ -253,10 +237,33 @@ static inline bool lockdep_softirq_start static inline void lockdep_softirq_end(bool in_hardirq) { } #endif +/* + * We restart softirq processing but break the loop if need_resched() is set or + * after 2 ms. The MAX_SOFTIRQ_RESTART guarantees a loop termination if + * sched_clock() were ever to stall. + * + * These limits have been established via experimentation. The two things to + * balance is latency against fairness - we want to handle softirqs as soon as + * possible, but they should not be able to lock up the box. + */ +#define MAX_SOFTIRQ_TIME 2*NSEC_PER_MSEC +#define MAX_SOFTIRQ_RESTART 10 + +static inline bool __softirq_needs_break(u64 start) +{ + if (need_resched()) + return true; + + if (sched_clock() - start >= MAX_SOFTIRQ_TIME) + return true; + + return false; +} + asmlinkage __visible void __softirq_entry __do_softirq(void) { + unsigned int max_restart = MAX_SOFTIRQ_RESTART; unsigned long old_flags = current->flags; - int max_restart = MAX_SOFTIRQ_RESTART; u64 start = sched_clock(); struct softirq_action *h; unsigned long pending; @@ -310,8 +317,7 @@ asmlinkage __visible void __softirq_entr pending = local_softirq_pending(); if (pending) { - if (sched_clock() - start < MAX_SOFTIRQ_TIME && !need_resched() && - --max_restart) + if (!__softirq_needs_break(start) && --max_restart) goto restart; wakeup_softirqd(); [-- Attachment #5: peterz-softirq-break-more.patch --] [-- Type: text/x-diff, Size: 1118 bytes --] Subject: softirq: Allow early break From: Peter Zijlstra <peterz@infradead.org> Date: Fri Sep 11 17:50:17 CEST 2020 Allow terminating the softirq processing loop without finishing the vectors. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> --- kernel/softirq.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) --- a/kernel/softirq.c +++ b/kernel/softirq.c @@ -309,19 +309,23 @@ asmlinkage __visible void __softirq_entr prev_count, preempt_count()); preempt_count_set(prev_count); } + + if (pending && __softirq_needs_break(start)) + break; } if (__this_cpu_read(ksoftirqd) == current) rcu_softirq_qs(); local_irq_disable(); - pending = local_softirq_pending(); - if (pending) { - if (!__softirq_needs_break(start) && --max_restart) - goto restart; + if (pending) + or_softirq_pending(pending); + else if ((pending = local_softirq_pending()) && + !__softirq_needs_break(start) && + --max_restart) + goto restart; - wakeup_softirqd(); - } + wakeup_softirqd(); lockdep_softirq_end(in_hardirq); account_irq_exit_time(current);
next prev parent reply other threads:[~2020-09-11 15:57 UTC|newest] Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-09-09 9:09 qianjun.kernel 2020-09-11 15:55 ` peterz [this message] 2020-09-12 7:17 ` jun qian 2020-09-11 16:46 ` Qais Yousef 2020-09-11 18:28 ` peterz 2020-09-14 11:27 ` Qais Yousef 2020-09-14 14:14 ` peterz 2020-09-14 15:28 ` Qais Yousef [not found] ` <CA+njcd3HFV5Gqtt9qzTAzpnA4-4ngPBQ7T0gwgc0Fm9_VoJLcQ@mail.gmail.com> 2020-09-14 11:41 ` Qais Yousef
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20200911155555.GX2674@hirez.programming.kicks-ass.net \ --to=peterz@infradead.org \ --cc=frederic@kernel.org \ --cc=laoar.shao@gmail.com \ --cc=linux-kernel@vger.kernel.org \ --cc=luto@kernel.org \ --cc=qianjun.kernel@gmail.com \ --cc=tglx@linutronix.de \ --cc=urezki@gmail.com \ --cc=will@kernel.org \ --subject='Re: [PATCH V6 1/1] Softirq:avoid large sched delay from the pending softirqs' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.