From: jun qian <qianjun.kernel@gmail.com>
To: peterz@infradead.org
Cc: Thomas Gleixner <tglx@linutronix.de>,
	will@kernel.org, luto@kernel.org, linux-kernel@vger.kernel.org,
	Yafang Shao <laoar.shao@gmail.com>,
	Uladzislau Rezki <urezki@gmail.com>,
	frederic@kernel.org
Subject: Re: [PATCH V6 1/1] Softirq:avoid large sched delay from the pending softirqs
Date: Sat, 12 Sep 2020 15:17:17 +0800
Message-ID: <CAKc596LcHyjbTdnjSvhrPrCd3BMjDGGTe4DDzOqApeqb2ypSgA@mail.gmail.com>
In-Reply-To: <20200911155555.GX2674@hirez.programming.kicks-ass.net>

On Fri, Sep 11, 2020 at 11:55 PM <peterz@infradead.org> wrote:
>
> On Wed, Sep 09, 2020 at 05:09:31PM +0800, qianjun.kernel@gmail.com wrote:
> > From: jun qian <qianjun.kernel@gmail.com>
> >
> > When softirqs are pending, all of them are processed in the while
> > loop. If each pending softirq needs more than 2 msec of processing
> > time in this loop, or if one of the softirqs runs for a long time,
> > the original code still processes every pending softirq without
> > waking up ksoftirqd. This causes a relatively large scheduling delay
> > on the corresponding CPU, which we do not want. This patch checks
> > the total time spent processing pending softirqs; if it exceeds 2 ms,
> > ksoftirqd is woken up to avoid a large scheduling delay.
>
> But what is all that unreadable gibberish with pending_new_{flag,bit} ?
>
> Random comments below..
>
>
> > +#define MAX_SOFTIRQ_TIME_NS 2000000
>
>         2*NSEC_PER_MSEC
>
>
> > +DEFINE_PER_CPU(__u32, pending_new_flag);
> > +DEFINE_PER_CPU(__u32, pending_next_bit);
>
> __u32 is for userspace ABI, this is not it, use u32
>
> > +#define SOFTIRQ_PENDING_MASK ((1UL << NR_SOFTIRQS) - 1)
> > +
> >  asmlinkage __visible void __softirq_entry __do_softirq(void)
> >  {
> > -     unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
> > +     u64 end = sched_clock() + MAX_SOFTIRQ_TIME_NS;
> >       unsigned long old_flags = current->flags;
> >       int max_restart = MAX_SOFTIRQ_RESTART;
> >       struct softirq_action *h;
> >       bool in_hardirq;
> > -     __u32 pending;
> > -     int softirq_bit;
> > +     __u32 pending, pending_left, pending_new;
> > +     int softirq_bit, next_bit;
> > +     unsigned long flags;
> >
> >       /*
> >        * Mask out PF_MEMALLOC as the current task context is borrowed for the
> > @@ -277,10 +282,33 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
> >
> >       h = softirq_vec;
> >
> > -     while ((softirq_bit = ffs(pending))) {
> > -             unsigned int vec_nr;
> > +     next_bit = per_cpu(pending_next_bit, smp_processor_id());
> > +     per_cpu(pending_new_flag, smp_processor_id()) = 0;
>
>         __this_cpu_read() / __this_cpu_write()
>
> > +
> > +     pending_left = pending &
> > +             (SOFTIRQ_PENDING_MASK << next_bit);
> > +     pending_new = pending &
> > +             (SOFTIRQ_PENDING_MASK >> (NR_SOFTIRQS - next_bit));
>
> The second mask is the inverse of the first.
>
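The remark above can be checked outside the kernel. The following is a minimal userspace sketch, not kernel code: it assumes NR_SOFTIRQS is 10 and that pending only carries bits below NR_SOFTIRQS. The helper names (pending_left_of, pending_new_of, pending_new_orig, masks_partition_pending) are hypothetical and exist only for this illustration. It shows that the second mask can be written as the complement of the first, and that the two results partition pending.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 10 softirq vectors. */
#define NR_SOFTIRQS 10
#define SOFTIRQ_PENDING_MASK ((1UL << NR_SOFTIRQS) - 1)

/* Bits at or above next_bit: vectors left over from the previous pass. */
static uint32_t pending_left_of(uint32_t pending, unsigned int next_bit)
{
	return pending & (uint32_t)(SOFTIRQ_PENDING_MASK << next_bit);
}

/* Bits below next_bit, written as the complement of the first mask. */
static uint32_t pending_new_of(uint32_t pending, unsigned int next_bit)
{
	return pending & ~(uint32_t)(SOFTIRQ_PENDING_MASK << next_bit);
}

/* The second expression as written in the quoted patch, for comparison. */
static uint32_t pending_new_orig(uint32_t pending, unsigned int next_bit)
{
	return pending &
		(uint32_t)(SOFTIRQ_PENDING_MASK >> (NR_SOFTIRQS - next_bit));
}

/* Exhaustively check every pending value and every split point. */
static int masks_partition_pending(void)
{
	for (uint32_t pending = 0; pending <= SOFTIRQ_PENDING_MASK; pending++) {
		for (unsigned int next_bit = 0; next_bit <= NR_SOFTIRQS; next_bit++) {
			uint32_t left = pending_left_of(pending, next_bit);
			uint32_t fresh = pending_new_of(pending, next_bit);

			/* The two halves cover pending and never overlap. */
			assert((left | fresh) == pending);
			assert((left & fresh) == 0);
			/* The complement form matches the patch's expression. */
			assert(fresh == pending_new_orig(pending, next_bit));
		}
	}
	return 1;
}
```

Calling masks_partition_pending() from a test program runs the exhaustive check; the equivalence holds only while pending has no bits at or above NR_SOFTIRQS.
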
> > +     /*
> > +      * In order to be fair, we should process the pending bits
> > +      * following the last processing order.
> > +      */
> > +     while ((softirq_bit = ffs(pending_left)) ||
> > +             (softirq_bit = ffs(pending_new))) {
> >               int prev_count;
> > +             unsigned int vec_nr = 0;
> >
> > +             /*
> > +              * When the left pending bits have been handled, we should
> > +              * reset h to softirq_vec.
> > +              */
> > +             if (!ffs(pending_left)) {
> > +                     if (per_cpu(pending_new_flag, smp_processor_id()) == 0) {
> > +                             h = softirq_vec;
> > +                             per_cpu(pending_new_flag, smp_processor_id()) = 1;
> > +                     }
> > +             }
> >               h += softirq_bit - 1;
> >
> >               vec_nr = h - softirq_vec;
> > @@ -298,17 +326,44 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
> >                       preempt_count_set(prev_count);
> >               }
> >               h++;
> > -             pending >>= softirq_bit;
> > +
> > +             if (ffs(pending_left))
>
> This is the _third_ ffs(pending_left), those things are _expensive_ (on
> some archs, see include/asm-generic/bitops/__ffs.h).
>
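One way to keep the fairness goal while calling ffs() only once per handled vector is to rotate the pending word so the resume point becomes bit 0. This is a userspace sketch under stated assumptions, not the patches referred to in this thread: NR_SOFTIRQS is assumed to be 10, ffs() is the POSIX one from <strings.h>, and rotate_pending() and visit_pending() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <strings.h>	/* POSIX ffs() */

/* Assumption for this sketch: 10 softirq vectors. */
#define NR_SOFTIRQS 10
#define SOFTIRQ_PENDING_MASK ((1U << NR_SOFTIRQS) - 1)

/* Rotate the low NR_SOFTIRQS bits right so that bit `start` lands on bit 0. */
static uint32_t rotate_pending(uint32_t pending, unsigned int start)
{
	assert(start < NR_SOFTIRQS);
	pending &= SOFTIRQ_PENDING_MASK;
	if (start == 0)
		return pending;
	return ((pending >> start) | (pending << (NR_SOFTIRQS - start))) &
		SOFTIRQ_PENDING_MASK;
}

/*
 * Visit each pending vector once, beginning at `start` and wrapping around.
 * The visiting order is stored in order[]; the number of visits is returned.
 * ffs() runs exactly once per loop test.
 */
static int visit_pending(uint32_t pending, unsigned int start,
			 unsigned int order[NR_SOFTIRQS])
{
	uint32_t rotated;
	int bit, n = 0;

	start %= NR_SOFTIRQS;
	rotated = rotate_pending(pending, start);

	while ((bit = ffs((int)rotated))) {
		/* Map the rotated bit position back to the real vector number. */
		order[n++] = (start + (unsigned int)bit - 1) % NR_SOFTIRQS;
		rotated &= rotated - 1;	/* clear the bit just handled */
	}
	return n;
}
```

With bits 1, 4 and 8 pending and a resume point of 4, the visiting order is 4, 8, 1: the vectors that were left over are served first, then the ones below the resume point.
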
> > +                     pending_left >>= softirq_bit;
> > +             else
> > +                     pending_new >>= softirq_bit;
> > +
> > +             /*
> > +              * The softirq actions have run for too long,
> > +              * so ksoftirqd may need to be woken up.
> > +              */
> > +             if (need_resched() && sched_clock() > end) {
> > +                     /*
> > +                      * Ensure that the remaining pending bits will be
> > +                      * handled.
> > +                      */
> > +                     local_irq_save(flags);
> > +                     if (ffs(pending_left))
>
> *fourth*...
>
> > +                             or_softirq_pending((pending_left << (vec_nr + 1)) |
> > +                                                     pending_new);
> > +                     else
> > +                             or_softirq_pending(pending_new << (vec_nr + 1));
> > +                     local_irq_restore(flags);
> > +                     per_cpu(pending_next_bit, smp_processor_id()) = vec_nr + 1;
> > +                     break;
> > +             }
> >       }
> >
> > +     /* reset the pending_next_bit */
> > +     per_cpu(pending_next_bit, smp_processor_id()) = 0;
> > +
> >       if (__this_cpu_read(ksoftirqd) == current)
> >               rcu_softirq_qs();
> >       local_irq_disable();
> >
> >       pending = local_softirq_pending();
> >       if (pending) {
> > -             if (time_before(jiffies, end) && !need_resched() &&
> > -                 --max_restart)
> > +             if (!need_resched() && --max_restart &&
> > +                 sched_clock() <= end)
> >                       goto restart;
> >
> >               wakeup_softirqd();
>
> This really wants to be a number of separate patches; and I quickly lost
> the plot in your code. Instead of cleaning things up, you're making an
> even bigger mess of things.
>
> That said, I _think_ I've managed to decode what you want. See the
> completely untested patches attached.
>
>

Thanks a lot for your suggestions. I will rewrite the patch following
them, using multiple patches to clarify the ideas.
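
For reference, the core idea in the quoted commit message (stop after a time budget and hand the remaining bits to someone else) can be sketched in userspace. This is only an illustration under assumptions: the clock and handler are passed in as function pointers so the loop can be exercised with a fake clock, the need_resched() check from the patch is left out, and run_with_budget(), fake_clock() and slow_handler() are hypothetical names. The budget is spelled 2 * NSEC_PER_MSEC as suggested in the review.

```c
#include <assert.h>
#include <stdint.h>
#include <strings.h>	/* POSIX ffs() */

#define NSEC_PER_MSEC 1000000ULL
#define MAX_SOFTIRQ_TIME_NS (2 * NSEC_PER_MSEC)

typedef uint64_t (*clock_fn)(void);
typedef void (*handler_fn)(unsigned int vec);

/*
 * Run the handler for each pending bit, lowest first, until the time
 * budget is used up. Returns the bits that were not handled; a caller
 * would hand those to a deferred context (ksoftirqd in the kernel case).
 */
static uint32_t run_with_budget(uint32_t pending, clock_fn now,
				handler_fn handle)
{
	uint64_t end;
	int bit;

	assert(now && handle);
	end = now() + MAX_SOFTIRQ_TIME_NS;

	while ((bit = ffs((int)pending))) {
		handle((unsigned int)bit - 1);
		pending &= pending - 1;	/* clear the bit just handled */

		/* Budget exceeded with work left: stop and report the rest. */
		if (pending && now() > end)
			break;
	}
	return pending;
}

/* Hypothetical test doubles: a fake clock advanced by a slow handler. */
static uint64_t fake_ns;

static uint64_t fake_clock(void)
{
	return fake_ns;
}

static void slow_handler(unsigned int vec)
{
	(void)vec;
	fake_ns += 1500000;	/* pretend each handler takes 1.5 ms */
}
```

With three bits pending and 1.5 ms per handler, the loop handles two vectors (3 ms elapsed, over the 2 ms budget) and returns the third bit as unhandled.
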

