linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Tejun Heo <tj@kernel.org>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org, torvalds@linux-foundation.org,
	mingo@redhat.com, akpm@linux-foundation.org,
	peterz@infradead.org
Subject: Re: [PATCHSET] timer: clean up initializers and implement irqsafe timers
Date: Tue, 14 Aug 2012 12:15:49 -0700	[thread overview]
Message-ID: <20120814191549.GX25632@google.com> (raw)
In-Reply-To: <alpine.LFD.2.02.1208131720070.32033@ionos>

Hello, Thomas.

On Tue, Aug 14, 2012 at 08:55:16PM +0200, Thomas Gleixner wrote:
> > * mod_delayed_work() can't be used from IRQ handlers.
> 
> This function does not exist. So what?

It makes the workqueue users messy.  It's difficult to get completely
correct and subtle errors are difficult to detect / verify.

> > * __cancel_delayed_work() can't use the usual try_to_grab_pending()
> >   which handles all three states but instead only deals with the first
> >   state using a separate implementation.  There's no way to make a
> >   delayed_work not pending from IRQ handlers.
> 
> And why is that desired and justifies the mess you are trying to
> create in the timer code?

Because API forcing its users to be messy is stupid.

> > * The context / behavior differences among cancel_delayed_work(),
> >   __cancel_delayed_work(), cancel_delayed_work_sync() are subtle and
> >   confusing (the first two are mostly historical tho).
> 
> We have a lot of subtle differences in interfaces for similar
> reasons.

And we should work to make them better.

> > This patchset implements irqsafe timers.  For an irqsafe timer, IRQ is
> > not enabled from dispatch till the end of its execution making it safe
> > to drain the timer regardless of context.  This will enable cleaning
> > up delayed_work interface.
> 
> By burdening crap on the timer code. We had a similar context case
> handling in the original hrtimers code and Linus went berserk on it.
> There is no real good reason to reinvent it in a different flavour.
> 
> Your general approach about workqueues seems to be adding hacks into
> other sensitive code paths [ see __schedule() ]. Can we please stop
> that? workqueues are not so special to justify that.

The schedule thing worked out pretty well, didn't it?  If it improves
the kernel in general, I don't see why timer shouldn't participate in
it.  Timer ain't that special either.  However, it does suck to add
one-off feature which isn't used by anyone else but I couldn't find a
better way.

So, if you can think of something better, sure.  Let's see.

> Right now delayed work arms a timer, whose callback enqueues the work
> and wakes the worker thread, which then executes the work.
> 
> So what about changing delayed_work into:
> 
> struct delayed_work {
>        struct work_struct work;
>        unsigned long expires;
> };
> 
> Now when delayed work gets scheduled it gets enqueued into a separate
> list in the workqueue core with the proper worker lock held. Then
> check the expiry time of the new work against the current expiry time
> of a timer in the worker itself.

Work items aren't assigned to worker on queue.  It's a shared worker
pool.  Workers take work items when they can.

> If the new expiry time is after the
> current expiry time, nothing to do. If the new expiry is before the
> current expiry time or the timer is not armed, then (re)arm the timer.
> 
> When the timer expires it wakes the worker and that evaluates the
> delayed list for expired works and executes them and rearms the timer
> if necessary.

How are you gonna decide which worker a delayed work item should be
queued on?  What if the work item before it takes a very long time to
finish?  Do we migrate those work items to a different worker?

> To cancel delayed work you don't have to worry about the timer
> callback being executed at all, simply because the timer callback is
> just a wakeup of the worker and not fiddling with the work itself. If
> the work is removed before the worker thread runs, life goes on as
> usual.
> 
> So all you have to do is to remove the work from the delayed list. If
> the timer is armed, just leave it alone and let it fire. Canceling
> delayed work is probably not a high frequency operation.
> 
> In fact that would make cancel_delayed_work and cancel_work basically
> the same operation.
> 
> I have no idea how many concurrent delayed works are on the fly, so I
> can't tell whether a simple ordered list is sufficient or if you need
> a tree which is better suited for a large number of sorted items. But
> that's a trivial to solve detail.

Aside from work <-> worker association confusion, you're basically
suggesting for workqueue to implement its own tvec_base in suboptimal
way.  Doesn't make much sense to me.

Thanks.

-- 
tejun

  reply	other threads:[~2012-08-14 19:15 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-08-08 18:10 Tejun Heo
2012-08-08 18:10 ` [PATCH 1/4] timer: generalize timer->base flags handling Tejun Heo
2012-08-21 16:40   ` [tip:timers/core] timer: Generalize " tip-bot for Tejun Heo
2012-08-08 18:10 ` [PATCH 2/4] timer: relocate declarations of init_timer_on_stack_key() Tejun Heo
2012-08-21 16:41   ` [tip:timers/core] timer: Relocate " tip-bot for Tejun Heo
2012-08-08 18:10 ` [PATCH 3/4] timer: clean up timer initializers Tejun Heo
2012-08-21 16:42   ` [tip:timers/core] timer: Clean " tip-bot for Tejun Heo
2012-08-08 18:10 ` [PATCH 4/4] timer: implement TIMER_IRQSAFE Tejun Heo
2012-08-21 16:43   ` [tip:timers/core] timer: Implement TIMER_IRQSAFE tip-bot for Tejun Heo
2012-08-21 19:26     ` Tejun Heo
2012-08-08 18:13 ` $SUBJ should have been "[PATCHSET] timer: clean up initializers and implement irqsafe timers" Tejun Heo
2012-08-13 23:35 ` [PATCHSET] timer: clean up initializers and implement irqsafe timers Tejun Heo
2012-08-14  8:32   ` Thomas Gleixner
2012-08-14 19:16   ` Thomas Gleixner
2012-08-14 19:22     ` Tejun Heo
2012-08-14 21:03       ` Thomas Gleixner
2012-08-14 21:56         ` Tejun Heo
2012-08-14 22:45           ` Thomas Gleixner
2012-08-14 23:01             ` Tejun Heo
2012-08-14 23:33               ` Thomas Gleixner
2012-08-15  0:18                 ` Tejun Heo
2012-08-15 10:58                   ` Peter Zijlstra
2012-08-16 19:36                     ` Tejun Heo
2012-08-14 18:55 ` Thomas Gleixner
2012-08-14 19:15   ` Tejun Heo [this message]
2012-08-14 20:43     ` Thomas Gleixner
2012-08-14 21:40       ` Tejun Heo
2012-08-14 23:12         ` Thomas Gleixner
2012-08-14 23:27           ` Tejun Heo
2012-08-14 23:46             ` Thomas Gleixner
2012-08-14 23:52               ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120814191549.GX25632@google.com \
    --to=tj@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).