* Workqueue "scheduling while atomic" issues in v3.12.19-rt30
@ 2014-09-23 10:55 Chris.Pringle
From: Chris.Pringle @ 2014-09-23 10:55 UTC (permalink / raw)
To: linux-rt-users
Dear All,
We've come across a number of "scheduling while atomic" BUGs after
upgrading from v3.0 (Freescale SDK v1.2.1) to v3.12.19-rt30 (Freescale SDK
v1.6) on a P2041 (e500mc core).
2014 Sep 22 17:20:07.984 (none) kernel: BUG: scheduling while atomic: rcuc/3/31/0x00000002
2014 Sep 22 17:20:07.984 (none) kernel: Modules linked in: cryptodev(O) ssp(O)
2014 Sep 22 17:20:07.984 (none) kernel: Preemption disabled at:[< (null)>] (null)
2014 Sep 22 17:20:07.984 (none) kernel:
2014 Sep 22 17:20:07.984 (none) kernel: CPU: 3 PID: 31 Comm: rcuc/3 Tainted: G O 3.12.19-rt30 #1
2014 Sep 22 17:20:07.984 (none) kernel: Call Trace:
2014 Sep 22 17:20:07.984 (none) kernel: [ea0e3bc0] [80006d24] show_stack+0x44/0x150 (unreliable)
2014 Sep 22 17:20:07.984 (none) kernel: [ea0e3c00] [80677eb4] dump_stack+0x7c/0xdc
2014 Sep 22 17:20:07.984 (none) kernel: [ea0e3c20] [80675490] __schedule_bug+0x84/0xa0
2014 Sep 22 17:20:07.984 (none) kernel: [ea0e3c30] [806725b8] __schedule+0x458/0x4e0
2014 Sep 22 17:20:07.985 (none) kernel: [ea0e3d30] [80672710] schedule+0x30/0xd0
2014 Sep 22 17:20:07.985 (none) kernel: [ea0e3d40] [80673338] rt_spin_lock_slowlock+0x140/0x288
2014 Sep 22 17:20:07.985 (none) kernel: [ea0e3db0] [8003f43c] __queue_work+0x18c/0x280
2014 Sep 22 17:20:07.985 (none) kernel: [ea0e3de0] [8003f684] queue_work_on+0x144/0x150
2014 Sep 22 17:20:07.985 (none) kernel: [ea0e3e10] [802ee29c] percpu_ref_kill_rcu+0x16c/0x170
2014 Sep 22 17:20:07.985 (none) kernel: [ea0e3e40] [80093010] rcu_cpu_kthread+0x2f0/0x640
2014 Sep 22 17:20:07.985 (none) kernel: [ea0e3eb0] [800528c8] smpboot_thread_fn+0x268/0x2e0
2014 Sep 22 17:20:07.985 (none) kernel: [ea0e3ee0] [80047fb8] kthread+0x98/0xa0
2014 Sep 22 17:20:07.985 (none) kernel: [ea0e3f40] [8000f5a4] ret_from_kernel_thread+0x5c/0x64
We are using asynchronous I/O in our applications and rapid calls to
io_queue_release appear to make this happen more frequently. This appears
to be related to this thread: https://lkml.org/lkml/2014/6/8/10
The thread suggests that the fix should be made in schedule_work() rather
than in the aio implementation:
"I think you should fix schedule_work(), because that should be callable
from any context"
Looking at the schedule_work() function and following it through to
__queue_work(), it calls spin_lock() (which can sleep on RT) inside
rcu_read_lock() (which I believe also disables preemption, requiring
everything inside the RCU read-side section to be atomic and not sleep).
I believe this is what produces the messages above.
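In other words, the shape of __queue_work() involved here is roughly the
following (heavily condensed from the v3.12 source for illustration - not
the literal function body; see kernel/workqueue.c for the details):

static void __queue_work(int cpu, struct workqueue_struct *wq,
                         struct work_struct *work)
{
        struct pool_workqueue *pwq;

        rcu_read_lock();                  /* RCU read-side section */
        /* ... pwq/pool selection elided ... */
        spin_lock(&pwq->pool->lock);      /* sleeping rt_mutex on -rt */
        /* ... */
        insert_work(pwq, work, worklist, work_flags);
        spin_unlock(&pwq->pool->lock);
        rcu_read_unlock();
}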
I tried changing the spin lock to a raw spin lock; this appears to fix the
problem for me and so far hasn't caused any adverse effects for our
application. Despite it working, this doesn't feel like the right fix, and
I'm wondering whether anyone else has any thoughts on it. Is there a
kinder way to fix this problem? Have I perhaps misunderstood what this
code is actually doing?
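In case it helps anyone reproduce this, a loop along the following lines
triggers it for us fairly quickly (a minimal hypothetical sketch, not our
actual application code; build with "gcc -o repro repro.c -laio"):

#include <libaio.h>
#include <stdio.h>

int main(void)
{
        io_context_t ctx;
        int i, ret;

        /* Rapidly create and release AIO contexts; io_queue_release()
         * calls io_destroy(), which tears the context down via the
         * percpu ref counting path shown in the trace above. */
        for (i = 0; i < 100000; i++) {
                ctx = 0;
                ret = io_queue_init(128, &ctx);
                if (ret < 0) {
                        fprintf(stderr, "io_queue_init: %d\n", ret);
                        return 1;
                }
                io_queue_release(ctx);
        }
        return 0;
}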
I've attached the patch for reference.
Cheers,
Chris
--- linux-fsl-sdk-v1.6/kernel/workqueue.c	2014-09-23 09:18:09.000000000 +0100
+++ linux-fsl-sdk-v1.6/kernel/workqueue.c	2014-09-23 09:27:39.000000000 +0100
@@ -143,7 +143,7 @@
/* struct worker is defined in workqueue_internal.h */
struct worker_pool {
- spinlock_t lock; /* the pool lock */
+ raw_spinlock_t lock; /* the pool lock */
int cpu; /* I: the associated cpu */
int node; /* I: the associated node ID */
int id; /* I: pool ID */
@@ -797,7 +797,7 @@
* Wake up the first idle worker of @pool.
*
* CONTEXT:
- * spin_lock_irq(pool->lock).
+ * raw_spin_lock_irq(pool->lock).
*/
static void wake_up_worker(struct worker_pool *pool)
{
@@ -849,7 +849,7 @@
return;
worker->sleeping = 1;
- spin_lock_irq(&pool->lock);
+ raw_spin_lock_irq(&pool->lock);
/*
* The counterpart of the following dec_and_test, implied mb,
* worklist not empty test sequence is in insert_work().
@@ -867,7 +867,7 @@
if (next)
wake_up_process(next->task);
}
- spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irq(&pool->lock);
}
/**
@@ -881,7 +881,7 @@
* woken up.
*
* CONTEXT:
- * spin_lock_irq(pool->lock)
+ * raw_spin_lock_irq(pool->lock)
*/
static inline void worker_set_flags(struct worker *worker, unsigned int flags, bool wakeup)
@@ -916,7 +916,7 @@
* Clear @flags in @worker->flags and adjust nr_running accordingly.
*
* CONTEXT:
- * spin_lock_irq(pool->lock)
+ * raw_spin_lock_irq(pool->lock)
*/
static inline void worker_clr_flags(struct worker *worker, unsigned int flags)
{
@@ -964,7 +964,7 @@
* actually occurs, it should be easy to locate the culprit work function.
*
* CONTEXT:
- * spin_lock_irq(pool->lock).
+ * raw_spin_lock_irq(pool->lock).
*
* Return:
* Pointer to worker which is executing @work if found, %NULL
@@ -999,7 +999,7 @@
* nested inside outer list_for_each_entry_safe().
*
* CONTEXT:
- * spin_lock_irq(pool->lock).
+ * raw_spin_lock_irq(pool->lock).
*/
static void move_linked_works(struct work_struct *work, struct list_head *head, struct work_struct **nextp)
@@ -1077,9 +1077,9 @@
* As both pwqs and pools are RCU protected, the
* following lock operations are safe.
*/
- local_spin_lock_irq(pendingb_lock, &pwq->pool->lock);
+ raw_spin_lock_irq(&pwq->pool->lock);
put_pwq(pwq);
- local_spin_unlock_irq(pendingb_lock, &pwq->pool->lock);
+ raw_spin_unlock_irq(&pwq->pool->lock);
}
}
@@ -1110,7 +1110,7 @@
* decrement nr_in_flight of its pwq and handle workqueue flushing.
*
* CONTEXT:
- * spin_lock_irq(pool->lock).
+ * raw_spin_lock_irq(pool->lock).
*/
static void pwq_dec_nr_in_flight(struct pool_workqueue *pwq, int color)
{
@@ -1209,7 +1209,7 @@
if (!pool)
goto fail;
- spin_lock(&pool->lock);
+ raw_spin_lock(&pool->lock);
/*
* work->data is guaranteed to point to pwq only while the work
* item is queued on pwq->wq, and both updating work->data to point
@@ -1238,11 +1238,11 @@
/* work->data points to pwq iff queued, point to pool */
set_work_pool_and_keep_pending(work, pool->id);
- spin_unlock(&pool->lock);
+ raw_spin_unlock(&pool->lock);
rcu_read_unlock();
return 1;
}
- spin_unlock(&pool->lock);
+ raw_spin_unlock(&pool->lock);
fail:
rcu_read_unlock();
local_unlock_irqrestore(pendingb_lock, *flags);
@@ -1263,7 +1263,7 @@
* work_struct flags.
*
* CONTEXT:
- * spin_lock_irq(pool->lock).
+ * raw_spin_lock_irq(pool->lock).
*/
static void insert_work(struct pool_workqueue *pwq, struct work_struct *work, struct list_head *head, unsigned int extra_flags)
@@ -1346,7 +1346,7 @@
if (last_pool && last_pool != pwq->pool) {
struct worker *worker;
- spin_lock(&last_pool->lock);
+ raw_spin_lock(&last_pool->lock);
worker = find_worker_executing_work(last_pool, work);
@@ -1354,11 +1354,11 @@
pwq = worker->current_pwq;
} else {
/* meh... not running there, queue here */
- spin_unlock(&last_pool->lock);
- spin_lock(&pwq->pool->lock);
+ raw_spin_unlock(&last_pool->lock);
+ raw_spin_lock(&pwq->pool->lock);
}
} else {
- spin_lock(&pwq->pool->lock);
+ raw_spin_lock(&pwq->pool->lock);
}
/*
@@ -1371,7 +1371,7 @@
*/
if (unlikely(!pwq->refcnt)) {
if (wq->flags & WQ_UNBOUND) {
- spin_unlock(&pwq->pool->lock);
+ raw_spin_unlock(&pwq->pool->lock);
cpu_relax();
goto retry;
}
@@ -1401,7 +1401,7 @@
insert_work(pwq, work, worklist, work_flags);
out:
- spin_unlock(&pwq->pool->lock);
+ raw_spin_unlock(&pwq->pool->lock);
rcu_read_unlock();
}
@@ -1554,7 +1554,7 @@
* necessary.
*
* LOCKING:
- * spin_lock_irq(pool->lock).
+ * raw_spin_lock_irq(pool->lock).
*/
static void worker_enter_idle(struct worker *worker)
{
@@ -1594,7 +1594,7 @@
* @worker is leaving idle state. Update stats.
*
* LOCKING:
- * spin_lock_irq(pool->lock).
+ * raw_spin_lock_irq(pool->lock).
*/
static void worker_leave_idle(struct worker *worker)
{
@@ -1652,13 +1652,13 @@
if (!(pool->flags & POOL_DISASSOCIATED))
set_cpus_allowed_ptr(current, pool->attrs->cpumask);
- spin_lock_irq(&pool->lock);
+ raw_spin_lock_irq(&pool->lock);
if (pool->flags & POOL_DISASSOCIATED)
return false;
if (task_cpu(current) == pool->cpu &&
cpumask_equal(&current->cpus_allowed, pool->attrs->cpumask))
return true;
- spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irq(&pool->lock);
/*
* We've raced with CPU hot[un]plug. Give it a breather
@@ -1712,11 +1712,11 @@
* without installing the pointer.
*/
idr_preload(GFP_KERNEL);
- spin_lock_irq(&pool->lock);
+ raw_spin_lock_irq(&pool->lock);
id = idr_alloc(&pool->worker_idr, NULL, 0, 0, GFP_NOWAIT);
- spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irq(&pool->lock);
idr_preload_end();
if (id < 0)
goto fail;
@@ -1758,17 +1758,17 @@
worker->flags |= WORKER_UNBOUND;
/* successful, commit the pointer to idr */
- spin_lock_irq(&pool->lock);
+ raw_spin_lock_irq(&pool->lock);
idr_replace(&pool->worker_idr, worker, worker->id);
- spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irq(&pool->lock);
return worker;
fail:
if (id >= 0) {
- spin_lock_irq(&pool->lock);
+ raw_spin_lock_irq(&pool->lock);
idr_remove(&pool->worker_idr, id);
- spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irq(&pool->lock);
}
kfree(worker);
return NULL;
@@ -1781,7 +1781,7 @@
* Make the pool aware of @worker and start it.
*
* CONTEXT:
- * spin_lock_irq(pool->lock).
+ * raw_spin_lock_irq(pool->lock).
*/
static void start_worker(struct worker *worker)
{
@@ -1807,9 +1807,9 @@
worker = create_worker(pool);
if (worker) {
- spin_lock_irq(&pool->lock);
+ raw_spin_lock_irq(&pool->lock);
start_worker(worker);
- spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irq(&pool->lock);
}
mutex_unlock(&pool->manager_mutex);
@@ -1824,7 +1824,7 @@
* Destroy @worker and adjust @pool stats accordingly.
*
* CONTEXT:
- * spin_lock_irq(pool->lock) which is released and regrabbed.
+ * raw_spin_lock_irq(pool->lock) which is released and regrabbed.
*/
static void destroy_worker(struct worker *worker)
{
@@ -1854,20 +1854,20 @@
idr_remove(&pool->worker_idr, worker->id);
- spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irq(&pool->lock);
kthread_stop(worker->task);
put_task_struct(worker->task);
kfree(worker);
- spin_lock_irq(&pool->lock);
+ raw_spin_lock_irq(&pool->lock);
}
static void idle_worker_timeout(unsigned long __pool)
{
struct worker_pool *pool = (void *)__pool;
- spin_lock_irq(&pool->lock);
+ raw_spin_lock_irq(&pool->lock);
if (too_many_workers(pool)) {
struct worker *worker;
@@ -1886,7 +1886,7 @@
}
}
- spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irq(&pool->lock);
}
static void send_mayday(struct work_struct *work)
@@ -1912,7 +1912,7 @@
struct work_struct *work;
spin_lock_irq(&wq_mayday_lock); /* for wq->maydays */
- spin_lock(&pool->lock);
+ raw_spin_lock(&pool->lock);
if (need_to_create_worker(pool)) {
/*
@@ -1925,7 +1925,7 @@
send_mayday(work);
}
- spin_unlock(&pool->lock);
+ raw_spin_unlock(&pool->lock);
spin_unlock_irq(&wq_mayday_lock);
mod_timer(&pool->mayday_timer, jiffies + MAYDAY_INTERVAL);
@@ -1945,7 +1945,7 @@
* may_start_working() %true.
*
* LOCKING:
- * spin_lock_irq(pool->lock) which may be released and regrabbed
+ * raw_spin_lock_irq(pool->lock) which may be released and regrabbed
* multiple times. Does GFP_KERNEL allocations. Called only from manager.
*
@@ -1960,7 +1960,7 @@
if (!need_to_create_worker(pool))
return false;
restart:
- spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irq(&pool->lock);
/* if we don't make progress in MAYDAY_INITIAL_TIMEOUT, call for help */
mod_timer(&pool->mayday_timer, jiffies + MAYDAY_INITIAL_TIMEOUT);
@@ -1971,7 +1971,7 @@
worker = create_worker(pool);
if (worker) {
del_timer_sync(&pool->mayday_timer);
- spin_lock_irq(&pool->lock);
+ raw_spin_lock_irq(&pool->lock);
start_worker(worker);
if (WARN_ON_ONCE(need_to_create_worker(pool)))
goto restart;
@@ -1989,7 +1989,7 @@
}
del_timer_sync(&pool->mayday_timer);
- spin_lock_irq(&pool->lock);
+ raw_spin_lock_irq(&pool->lock);
if (need_to_create_worker(pool))
goto restart;
return true;
@@ -2003,7 +2003,7 @@
* IDLE_WORKER_TIMEOUT.
*
* LOCKING:
- * spin_lock_irq(pool->lock) which may be released and regrabbed
+ * raw_spin_lock_irq(pool->lock) which may be released and regrabbed
* multiple times. Called only from manager.
*
* Return:
@@ -2046,7 +2046,7 @@
* and may_start_working() is true.
*
* CONTEXT:
- * spin_lock_irq(pool->lock) which may be released and regrabbed
+ * raw_spin_lock_irq(pool->lock) which may be released and regrabbed
* multiple times. Does GFP_KERNEL allocations.
*
* Return:
@@ -2090,9 +2090,9 @@
* most cases. trylock first without dropping @pool->lock.
*/
if (unlikely(!mutex_trylock(&pool->manager_mutex))) {
- spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irq(&pool->lock);
mutex_lock(&pool->manager_mutex);
- spin_lock_irq(&pool->lock);
+ raw_spin_lock_irq(&pool->lock);
ret = true;
}
@@ -2122,7 +2122,7 @@
* call this function to process a work.
*
* CONTEXT:
- * spin_lock_irq(pool->lock) which is released and regrabbed.
+ * raw_spin_lock_irq(pool->lock) which is released and regrabbed.
*/
static void process_one_work(struct worker *worker, struct work_struct *work)
__releases(&pool->lock)
@@ -2198,7 +2198,7 @@
*/
set_work_pool_and_clear_pending(work, pool->id);
- spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irq(&pool->lock);
lock_map_acquire_read(&pwq->wq->lockdep_map);
lock_map_acquire(&lockdep_map);
@@ -2230,7 +2230,7 @@
*/
cond_resched();
- spin_lock_irq(&pool->lock);
+ raw_spin_lock_irq(&pool->lock);
/* clear cpu intensive status */
if (unlikely(cpu_intensive))
@@ -2254,7 +2254,7 @@
* fetches a work from the top and executes it.
*
* CONTEXT:
- * spin_lock_irq(pool->lock) which may be released and regrabbed
+ * raw_spin_lock_irq(pool->lock) which may be released and regrabbed
* multiple times.
*/
static void process_scheduled_works(struct worker *worker)
@@ -2286,11 +2286,11 @@
/* tell the scheduler that this is a workqueue worker */
worker->task->flags |= PF_WQ_WORKER;
woke_up:
- spin_lock_irq(&pool->lock);
+ raw_spin_lock_irq(&pool->lock);
/* am I supposed to die? */
if (unlikely(worker->flags & WORKER_DIE)) {
- spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irq(&pool->lock);
WARN_ON_ONCE(!list_empty(&worker->entry));
worker->task->flags &= ~PF_WQ_WORKER;
return 0;
@@ -2352,7 +2352,7 @@
*/
worker_enter_idle(worker);
__set_current_state(TASK_INTERRUPTIBLE);
- spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irq(&pool->lock);
schedule();
goto woke_up;
}
@@ -2438,7 +2438,7 @@
wake_up_worker(pool);
rescuer->pool = NULL;
- spin_unlock(&pool->lock);
+ raw_spin_unlock(&pool->lock);
spin_lock(&wq_mayday_lock);
}
@@ -2483,7 +2483,7 @@
* underneath us, so we can't reliably determine pwq from @target.
*
* CONTEXT:
- * spin_lock_irq(pool->lock).
+ * raw_spin_lock_irq(pool->lock).
*/
static void insert_wq_barrier(struct pool_workqueue *pwq,
struct wq_barrier *barr,
@@ -2567,7 +2567,7 @@
for_each_pwq(pwq, wq) {
struct worker_pool *pool = pwq->pool;
- spin_lock_irq(&pool->lock);
+ raw_spin_lock_irq(&pool->lock);
if (flush_color >= 0) {
WARN_ON_ONCE(pwq->flush_color != -1);
@@ -2584,7 +2584,7 @@
pwq->work_color = work_color;
}
- spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irq(&pool->lock);
}
if (flush_color >= 0 &&
atomic_dec_and_test(&wq->nr_pwqs_to_flush))
@@ -2779,9 +2779,9 @@
for_each_pwq(pwq, wq) {
bool drained;
- spin_lock_irq(&pwq->pool->lock);
+ raw_spin_lock_irq(&pwq->pool->lock);
drained = !pwq->nr_active && list_empty(&pwq->delayed_works);
- spin_unlock_irq(&pwq->pool->lock);
+ raw_spin_unlock_irq(&pwq->pool->lock);
if (drained)
continue;
@@ -2816,7 +2816,7 @@
return false;
}
- spin_lock_irq(&pool->lock);
+ raw_spin_lock_irq(&pool->lock);
/* see the comment in try_to_grab_pending() with the same code */
pwq = get_work_pwq(work);
if (pwq) {
@@ -2830,7 +2830,7 @@
}
insert_wq_barrier(pwq, barr, work, worker);
- spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irq(&pool->lock);
/*
* If @max_active is 1 or rescuer is in use, flushing another work
@@ -2846,7 +2846,7 @@
rcu_read_unlock();
return true;
already_gone:
- spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irq(&pool->lock);
rcu_read_unlock();
return false;
}
@@ -3503,7 +3503,7 @@
*/
static int init_worker_pool(struct worker_pool *pool)
{
- spin_lock_init(&pool->lock);
+ raw_spin_lock_init(&pool->lock);
pool->id = -1;
pool->cpu = -1;
pool->node = NUMA_NO_NODE;
@@ -3579,13 +3579,13 @@
*/
mutex_lock(&pool->manager_arb);
mutex_lock(&pool->manager_mutex);
- spin_lock_irq(&pool->lock);
+ raw_spin_lock_irq(&pool->lock);
while ((worker = first_worker(pool)))
destroy_worker(worker);
WARN_ON(pool->nr_workers || pool->nr_idle);
- spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irq(&pool->lock);
mutex_unlock(&pool->manager_mutex);
mutex_unlock(&pool->manager_arb);
@@ -3739,7 +3739,7 @@
if (!freezable && pwq->max_active == wq->saved_max_active)
return;
- spin_lock_irq(&pwq->pool->lock);
+ raw_spin_lock_irq(&pwq->pool->lock);
if (!freezable || !(pwq->pool->flags & POOL_FREEZING)) {
pwq->max_active = wq->saved_max_active;
@@ -3757,7 +3757,7 @@
pwq->max_active = 0;
}
- spin_unlock_irq(&pwq->pool->lock);
+ raw_spin_unlock_irq(&pwq->pool->lock);
}
/* initialize newly alloced @pwq which is associated with @wq and @pool */
@@ -4107,9 +4107,9 @@
goto out_unlock;
use_dfl_pwq:
- spin_lock_irq(&wq->dfl_pwq->pool->lock);
+ raw_spin_lock_irq(&wq->dfl_pwq->pool->lock);
get_pwq(wq->dfl_pwq);
- spin_unlock_irq(&wq->dfl_pwq->pool->lock);
+ raw_spin_unlock_irq(&wq->dfl_pwq->pool->lock);
old_pwq = numa_pwq_tbl_install(wq, node, wq->dfl_pwq);
out_unlock:
mutex_unlock(&wq->mutex);
@@ -4462,10 +4462,10 @@
rcu_read_lock();
pool = get_work_pool(work);
if (pool) {
- spin_lock_irqsave(&pool->lock, flags);
+ raw_spin_lock_irqsave(&pool->lock, flags);
if (find_worker_executing_work(pool, work))
ret |= WORK_BUSY_RUNNING;
- spin_unlock_irqrestore(&pool->lock, flags);
+ raw_spin_unlock_irqrestore(&pool->lock, flags);
}
rcu_read_unlock();
return ret;
@@ -4575,7 +4575,7 @@
WARN_ON_ONCE(cpu != smp_processor_id());
mutex_lock(&pool->manager_mutex);
- spin_lock_irq(&pool->lock);
+ raw_spin_lock_irq(&pool->lock);
/*
* We've blocked all manager operations. Make all workers
@@ -4589,7 +4589,7 @@
pool->flags |= POOL_DISASSOCIATED;
- spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irq(&pool->lock);
mutex_unlock(&pool->manager_mutex);
/*
@@ -4615,9 +4615,9 @@
* worker blocking could lead to lengthy stalls. Kick off
* unbound chain execution of currently pending work items.
*/
- spin_lock_irq(&pool->lock);
+ raw_spin_lock_irq(&pool->lock);
wake_up_worker(pool);
- spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irq(&pool->lock);
}
}
@@ -4645,7 +4645,7 @@
WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask) < 0);
- spin_lock_irq(&pool->lock);
+ raw_spin_lock_irq(&pool->lock);
for_each_pool_worker(worker, wi, pool) {
unsigned int worker_flags = worker->flags;
@@ -4682,7 +4682,7 @@
ACCESS_ONCE(worker->flags) = worker_flags;
}
- spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irq(&pool->lock);
}
/**
@@ -4749,9 +4749,9 @@
mutex_lock(&pool->manager_mutex);
if (pool->cpu == cpu) {
- spin_lock_irq(&pool->lock);
+ raw_spin_lock_irq(&pool->lock);
pool->flags &= ~POOL_DISASSOCIATED;
- spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irq(&pool->lock);
rebind_workers(pool);
} else if (pool->cpu < 0) {
@@ -4874,10 +4874,10 @@
/* set FREEZING */
for_each_pool(pool, pi) {
- spin_lock_irq(&pool->lock);
+ raw_spin_lock_irq(&pool->lock);
WARN_ON_ONCE(pool->flags & POOL_FREEZING);
pool->flags |= POOL_FREEZING;
- spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irq(&pool->lock);
}
list_for_each_entry(wq, &workqueues, list) {
@@ -4959,10 +4959,10 @@
/* clear FREEZING */
for_each_pool(pool, pi) {
- spin_lock_irq(&pool->lock);
+ raw_spin_lock_irq(&pool->lock);
WARN_ON_ONCE(!(pool->flags & POOL_FREEZING));
pool->flags &= ~POOL_FREEZING;
- spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irq(&pool->lock);
}
/* restore max_active and repopulate worklist */
* Re: Workqueue "scheduling while atomic" issues in v3.12.19-rt30
From: Pavel Vasilyev @ 2014-09-23 11:27 UTC (permalink / raw)
To: Chris.Pringle, linux-rt-users
23.09.2014 14:55, Chris.Pringle@grassvalley.com wrote:
> Dear All,
>
> We've come across a number of "scheduling while atomic" BUGs after
> upgrading from v3.0 (Freescale SDK v1.2.1) to v3.12.19-rt30 (Freescale SDK
> v1.6) on a P2041 (e500mc core).
Use 3.2.x or 3.14; the other versions have buggy patches.
--
Pavel.
* Re: Workqueue "scheduling while atomic" issues in v3.12.19-rt30
From: Chris.Pringle @ 2014-09-23 11:59 UTC (permalink / raw)
To: pavel; +Cc: linux-rt-users
> 23.09.2014 14:55, Chris.Pringle@grassvalley.com wrote:
> > We've come across a number of "scheduling while atomic" BUGs after
> > upgrading from v3.0 (Freescale SDK v1.2.1) to v3.12.19-rt30 (Freescale SDK
> > v1.6) on a P2041 (e500mc core).
>
> Use 3.2.x or 3.14; the other versions have buggy patches.
Pavel - thanks for the feedback. Unfortunately, upgrading/downgrading is
not an option for us, as we rely on the Freescale drivers contained in
each of their SDKs. We had tried an earlier SDK but hit other
kernel-related issues (now fixed in v3.12).
Do you know what changes were made in more recent patches that fix this? I
had a look at the v3.12 rt40 patches as well as the v3.14 rt patches, but
it wasn't obvious to me what had changed.
* Re: Workqueue "scheduling while atomic" issues in v3.12.19-rt30
From: Pavel Vasilyev @ 2014-09-23 13:39 UTC (permalink / raw)
To: Chris.Pringle, linux-rt-users
23.09.2014 14:55, Chris.Pringle@grassvalley.com wrote:
> I've attached the patch for reference.
Ported your fixes to 3.14.19 + patch-3.14.12-rt9.patch.gz.
Intel Celeron G1620, working so far :)
http://i67.fastpic.ru/big/2014/0923/39/df38663bd2414afe064c1978ad7b3d39.jpg
--
Pavel.
* Re: Workqueue "scheduling while atomic" issues in v3.12.19-rt30
From: Chris.Pringle @ 2014-09-24 9:10 UTC (permalink / raw)
To: pavel; +Cc: linux-rt-users
> 23.09.2014 14:55, Chris.Pringle@grassvalley.com wrote:
> > I've attached the patch for reference.
>
> Ported your fixes to 3.14.19 + patch-3.14.12-rt9.patch.gz.
> Intel Celeron G1620, working so far :)
>
> http://i67.fastpic.ru/big/2014/0923/39/df38663bd2414afe064c1978ad7b3d39.jpg
Thanks for trying. Unfortunately, I don't think this is a workable fix,
as I'm pretty convinced that changing the spin lock type is not the right
thing to do for RT. I have backported all the fixes in aio.c and
workqueue.c from v3.14 (rt9) to v3.12 and have seen some improvement in
one of my test cases (although I suspect it's just masking the problem);
however, the other is still failing as previously posted. In v3.0 of the
kernel, aio never used the new percpu ref counting, so I think this is a
new problem introduced/revealed by that. I guess the question is: do we
expect it to be safe to call schedule_work() from atomic context (as Ben
LaHaise suggested)? If the answer is yes, then there is a bug in the
workqueues; if the answer is no, then the bug is in fs/aio.c in the way
it cleans up reqs via the percpu refcounting (which eventually calls
schedule_work()).
* Re: Workqueue "scheduling while atomic" issues in v3.12.19-rt30
From: Chris.Pringle @ 2014-09-25 9:01 UTC (permalink / raw)
To: linux-rt-users; +Cc: pavel
> > 23.09.2014 14:55, Chris.Pringle@grassvalley.com wrote:
> > > I've attached the patch for reference.
> >
> > Ported your fixes to 3.14.19 + patch-3.14.12-rt9.patch.gz.
> > Intel Celeron G1620, working so far :)
> >
> > http://i67.fastpic.ru/big/2014/0923/39/df38663bd2414afe064c1978ad7b3d39.jpg
>
> Thanks for trying. Unfortunately, I don't think this is a workable fix,
> as I'm pretty convinced that changing the spin lock type is not the right
> thing to do for RT. I have backported all the fixes in aio.c and
> workqueue.c from v3.14 (rt9) to v3.12 and have seen some improvement in
> one of my test cases (although I suspect it's just masking the problem);
> however, the other is still failing as previously posted. In v3.0 of the
> kernel, aio never used the new percpu ref counting, so I think this is a
> new problem introduced/revealed by that. I guess the question is: do we
> expect it to be safe to call schedule_work() from atomic context (as Ben
> LaHaise suggested)? If the answer is yes, then there is a bug in the
> workqueues; if the answer is no, then the bug is in fs/aio.c in the way
> it cleans up reqs via the percpu refcounting (which eventually calls
> schedule_work()).
Okay - just for completeness, in case anyone else runs into this issue or
wants to pick it up... The original patch I sent out does not fix the
problem, as there is another lock (local_lock_irqsave etc.) in there which
would also need changing; changing that to a raw spinlock causes other
areas of the kernel to complain about scheduling while atomic. I didn't
think this was the right way to fix it anyway.
I believe the problem is with the way aio uses the percpu ref counting;
it's fine without RT, but with RT the aio teardown path eventually calls
schedule_work() (not currently atomic-safe) from within
percpu_ref_kill_rcu() (atomic context), and the kernel then spits out the
BUG above.
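Reduced to its essence, the pattern looks like this (a hypothetical
stand-alone sketch with invented names, not the literal fs/aio.c code;
the exact function names differ slightly in v3.12):

#include <linux/kernel.h>
#include <linux/percpu-refcount.h>
#include <linux/workqueue.h>

struct my_ctx {
        struct percpu_ref users;
        struct work_struct free_work;
};

static void my_ctx_free(struct work_struct *work)
{
        /* the actual teardown; may sleep, hence deferred to a workqueue */
}

static void my_ctx_release(struct percpu_ref *ref)
{
        struct my_ctx *ctx = container_of(ref, struct my_ctx, users);

        /* Runs from percpu_ref_kill_rcu(), i.e. RCU callback context,
         * which is atomic. On -rt, schedule_work() then takes
         * pool->lock, a sleeping rt_mutex -> "scheduling while atomic". */
        INIT_WORK(&ctx->free_work, my_ctx_free);
        schedule_work(&ctx->free_work);
}

/* setup:    percpu_ref_init(&ctx->users, my_ctx_release);
 * teardown: percpu_ref_kill(&ctx->users); */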
I think there are a few ways we could fix this specific problem:
1) We could fix schedule_work() so it can be called from atomic context,
which would solve the problem for aio and the way it uses percpu ref
counting (a rough sketch of one possible approach follows this list). It
only solves this one case, however, and I think some clarification around
what should and shouldn't be callable from atomic context would be
helpful.
2) For RT we could change aio so it's not reliant on percpu ref counting
(like the v3.0 implementation), but this is likely to give quite a
different implementation than the stock kernel; either that, or look at a
change to the mainline kernel's aio implementation (i.e. not in the RT
patches).
3) Maybe we could find a way of changing the percpu ref counting to make
it safe to call non-atomic code (with RT patches) - so it executes
functions in a non-atomic context perhaps?...
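For (1), the sort of thing I have in mind is a small atomic-safe
front-end that defers the actual queueing to process context (a rough,
untested sketch; all names here are invented for illustration -
llist_add() is lock-free and wake_up_process() only takes raw locks, so
both should be safe from atomic context):

#include <linux/llist.h>
#include <linux/kthread.h>
#include <linux/workqueue.h>
#include <linux/sched.h>

struct atomic_work {
        struct llist_node node;
        struct work_struct *work;
};

static LLIST_HEAD(atomic_work_list);
static struct task_struct *atomic_work_task;

/* Callable from atomic context: just stash the work item and kick
 * the helper thread. */
void schedule_work_atomic(struct atomic_work *aw, struct work_struct *work)
{
        aw->work = work;
        llist_add(&aw->node, &atomic_work_list);
        wake_up_process(atomic_work_task);
}

/* Helper thread: runs in process context, where taking the sleeping
 * pool->lock inside schedule_work() is fine. Created at boot with
 * atomic_work_task = kthread_run(atomic_work_fn, NULL, "katomicworkd"); */
static int atomic_work_fn(void *unused)
{
        while (!kthread_should_stop()) {
                struct llist_node *n = llist_del_all(&atomic_work_list);

                while (n) {
                        struct atomic_work *aw =
                                llist_entry(n, struct atomic_work, node);
                        n = n->next;   /* read next before queueing */
                        schedule_work(aw->work);
                }

                set_current_state(TASK_INTERRUPTIBLE);
                if (llist_empty(&atomic_work_list))
                        schedule();
                __set_current_state(TASK_RUNNING);
        }
        return 0;
}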
I also think clarification on the following two points would help decide
on the best way to address this:
1) Is it intended that schedule_work() be callable from an atomic context?
Currently it's not, but I've not seen any documentation to suggest whether
it is or is not supposed to be callable from atomic context
2) Should functions run by percpu_ref_kill_rcu be running in an atomic or
non-atomic context? Currently it's atomic, but again, I've not seen any
documentation to suggest whether it is or is not supposed to be running in
atomic context
I think it needs someone who has extensive knowledge/experience with these
areas of code to take care of this issue as this is not really my area of
expertise. For now I've gone with solution (2) and ported the original
v3.0 aio implementation onto v3.12 to resolve the problem for our use.
Happy to assist with testing a v3.12 fix on our platform (we can't upgrade
to v3.14 due to lock-in with the Freescale SDK) if anyone has any
suggested fixes.