From: Viresh Kumar <viresh.kumar@linaro.org>
To: Tejun Heo <tj@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
vlevenetz@mm-sol.com, vaibhav.hiremath@linaro.org,
alex.elder@linaro.org, johan@kernel.org
Subject: [Query] Preemption (hogging) of the work handler
Date: Fri, 1 Jul 2016 09:59:59 -0700
Message-ID: <20160701165959.GR12473@ubuntu>
Hi Tejun,
We are stuck with a typical issue on our octa-core ARM platform and
wanted to make sure that we aren't abusing the workqueue API by using
it for the wrong use case.
Setup:
The system watchdog uses a delayed-work (1 second) for petting the
watchdog (resetting its counter) and if the work doesn't reset the
counters in time (another 1 second), the watchdog resets the system.
Petting-time: 1 second
Watchdog Reset-time: 2 seconds
The wq is allocated with:
    wdog_wq = alloc_workqueue("wdog", WQ_HIGHPRI, 0);
The watchdog's work-handler looks like this:
    static void pet_watchdog_work(struct work_struct *work)
    {
        ...
        pet_watchdog(); /* Reset its counters */

        /* CONFIG_HZ=300, queuing for 1 second */
        queue_delayed_work(wdog_wq, &wdog_dwork, 300);
    }
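The delay of 300 above only means one second because CONFIG_HZ=300. As a minimal user-space sketch of that conversion (not kernel code; the helper name ms_to_jiffies_model is hypothetical and only models a round-up milliseconds-to-jiffies conversion), the arithmetic looks roughly like this:

```c
/*
 * Hypothetical user-space model of a milliseconds -> jiffies conversion.
 * It rounds up, so a non-zero delay never becomes zero ticks.
 * This is an illustration only, not the kernel's implementation.
 */
static unsigned long ms_to_jiffies_model(unsigned long ms, unsigned long hz)
{
    return (ms * hz + 999) / 1000;
}
```

With hz = 300, ms_to_jiffies_model(1000, 300) evaluates to 300, which matches the hard-coded delay in the handler. In kernel code the same one-second interval is usually expressed with HZ or msecs_to_jiffies(1000) rather than a literal, so that it stays correct if CONFIG_HZ changes.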
kernel: 3.10 (Yeah, you can rant at me for that, but it's not something
I can decide on :)
Symptoms:
- The watchdog reboots the system sometimes. It is more reproducible
in cases where an (out-of-tree) bus enumerated over USB is suddenly
disconnected, which leads to the removal of lots of kernel devices on
that bus and a lot of print messages as well, due to failures to
send any more data to those devices.
Observations:
I tried to dig into it and found this:
- The timer used by the delayed work fires at the time it was
programmed for (checked timer->expires with value of jiffies) and
the work-handler gets a chance to run and reset the counters pretty
quickly after that.
- But somehow, the timer isn't programmed for the right time.
- Something is happening between the time the work-handler starts
running and the time we read jiffies in add_timer(), which gets
called from within queue_delayed_work().
- For example, if the value of jiffies in the pet_watchdog_work()
handler (before calling queue_delayed_work()) is, say, 1000000, then
the value of jiffies after the call to queue_delayed_work() has
returned is 1000310, i.e. it sometimes increases by more than 300,
which is 1 second in our setup. I have seen this delta vary from 50
to 350. If it crosses 300, the watchdog resets the system (as it was
programmed for 2 seconds).
So, we aren't able to queue the next timer in time and that causes all
these problems. I haven't yet worked out why that happens.
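To make the arithmetic behind these observations concrete, here is a minimal user-space sketch (not kernel code). It assumes the timeline described above: the handler pets the watchdog, then a stall of some number of jiffies passes before the timer is armed for another 300 jiffies. The function names pet_gap and watchdog_fires are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>

#define HZ                300       /* CONFIG_HZ from the report */
#define PET_DELAY_JIFFIES (1 * HZ)  /* delayed-work period: 1 second */
#define RESET_JIFFIES     (2 * HZ)  /* watchdog reset time: 2 seconds */

/*
 * Gap between two consecutive pets, assuming the timer is armed
 * `stall` jiffies after the previous pet and then runs on time.
 */
static unsigned long pet_gap(unsigned long stall)
{
    return stall + PET_DELAY_JIFFIES;
}

/* The watchdog resets the system once the gap exceeds its reset time. */
static bool watchdog_fires(unsigned long stall)
{
    return pet_gap(stall) > RESET_JIFFIES;
}
```

Under these assumptions a stall of 50 jiffies gives a gap of 350 jiffies and is harmless, while a stall of 310 jiffies gives a gap of 610 jiffies, which is past the 600-jiffy reset time, so watchdog_fires(310) is true.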
Questions:
- I hope that the wq handler can be preempted, but can it be this bad?
- Is it fine to use the wq-handler for petting the watchdog? Or should
that only be done with the help of interrupt handlers?
- Any other clues you can give which can help us figure out what's
going on?
Thanks in advance and sorry to bother you :)
--
viresh