* 3.14.23-rt20 - sched/numa: Fix task_numa_free() lockdep splat
[not found] ` <1414910967.5380.81.camel@marge.simpson.net>
@ 2014-11-02 7:30 ` Mike Galbraith
2014-11-02 7:31 ` 3.14.23-rt20 - wwmutex: fix __ww_mutex_lock_interruptible lockdep annotation Mike Galbraith
` (5 subsequent siblings)
6 siblings, 0 replies; 29+ messages in thread
From: Mike Galbraith @ 2014-11-02 7:30 UTC (permalink / raw)
To: Steven Rostedt
Cc: LKML, linux-rt-users, Thomas Gleixner, Carsten Emde, John Kacur,
Sebastian Andrzej Siewior, Clark Williams
From 60e69eed85bb7b5198ef70643b5895c26ad76ef7 Mon Sep 17 00:00:00 2001
From: Mike Galbraith <bitbucket@online.de>
Date: Mon, 7 Apr 2014 10:55:15 +0200
Subject: sched/numa: Fix task_numa_free() lockdep splat
Sasha reported that lockdep claims that the following commit made
numa_group.lock interrupt unsafe:
156654f491dd ("sched/numa: Move task_numa_free() to __put_task_struct()")
While I don't see how that could be, given the commit in question moved
task_numa_free() from one irq enabled region to another, the below does
make both gripes and lockups upon gripe with numa=fake=4 go away.
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Fixes: 156654f491dd ("sched/numa: Move task_numa_free() to __put_task_struct()")
Signed-off-by: Mike Galbraith <bitbucket@online.de>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: torvalds@linux-foundation.org
Cc: mgorman@suse.com
Cc: akpm@linux-foundation.org
Cc: Dave Jones <davej@redhat.com>
Link: http://lkml.kernel.org/r/1396860915.5170.5.camel@marge.simpson.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
kernel/sched/fair.c | 13 +++++++------
kernel/sched/sched.h | 9 +++++++++
2 files changed, 16 insertions(+), 6 deletions(-)
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1371,7 +1371,7 @@ static void task_numa_placement(struct t
/* If the task is part of a group prevent parallel updates to group stats */
if (p->numa_group) {
group_lock = &p->numa_group->lock;
- spin_lock(group_lock);
+ spin_lock_irq(group_lock);
}
/* Find the node with the highest number of faults */
@@ -1432,7 +1432,7 @@ static void task_numa_placement(struct t
}
}
- spin_unlock(group_lock);
+ spin_unlock_irq(group_lock);
}
/* Preferred node as the node with the most faults */
@@ -1532,7 +1532,8 @@ static void task_numa_group(struct task_
if (!join)
return;
- double_lock(&my_grp->lock, &grp->lock);
+ BUG_ON(irqs_disabled());
+ double_lock_irq(&my_grp->lock, &grp->lock);
for (i = 0; i < 2*nr_node_ids; i++) {
my_grp->faults[i] -= p->numa_faults[i];
@@ -1546,7 +1547,7 @@ static void task_numa_group(struct task_
grp->nr_tasks++;
spin_unlock(&my_grp->lock);
- spin_unlock(&grp->lock);
+ spin_unlock_irq(&grp->lock);
rcu_assign_pointer(p->numa_group, grp);
@@ -1565,14 +1566,14 @@ void task_numa_free(struct task_struct *
void *numa_faults = p->numa_faults;
if (grp) {
- spin_lock(&grp->lock);
+ spin_lock_irq(&grp->lock);
for (i = 0; i < 2*nr_node_ids; i++)
grp->faults[i] -= p->numa_faults[i];
grp->total_faults -= p->total_numa_faults;
list_del(&p->numa_entry);
grp->nr_tasks--;
- spin_unlock(&grp->lock);
+ spin_unlock_irq(&grp->lock);
rcu_assign_pointer(p->numa_group, NULL);
put_numa_group(grp);
}
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1402,6 +1402,15 @@ static inline void double_lock(spinlock_
spin_lock_nested(l2, SINGLE_DEPTH_NESTING);
}
+static inline void double_lock_irq(spinlock_t *l1, spinlock_t *l2)
+{
+ if (l1 > l2)
+ swap(l1, l2);
+
+ spin_lock_irq(l1);
+ spin_lock_nested(l2, SINGLE_DEPTH_NESTING);
+}
+
static inline void double_raw_lock(raw_spinlock_t *l1, raw_spinlock_t *l2)
{
if (l1 > l2)
^ permalink raw reply [flat|nested] 29+ messages in thread
* 3.14.23-rt20 - wwmutex: fix __ww_mutex_lock_interruptible lockdep annotation
[not found] ` <1414910967.5380.81.camel@marge.simpson.net>
2014-11-02 7:30 ` 3.14.23-rt20 - sched/numa: Fix task_numa_free() lockdep splat Mike Galbraith
@ 2014-11-02 7:31 ` Mike Galbraith
2014-11-02 7:31 ` 3.14.23-rt20 - mm,memcg: make refill_stock()/consume_stock() use get_cpu_light() Mike Galbraith
` (4 subsequent siblings)
6 siblings, 0 replies; 29+ messages in thread
From: Mike Galbraith @ 2014-11-02 7:31 UTC (permalink / raw)
To: Steven Rostedt
Cc: LKML, linux-rt-users, Thomas Gleixner, Carsten Emde, John Kacur,
Sebastian Andrzej Siewior, Clark Williams
Adding mutex_acquire_nest() as used in __ww_mutex_lock() fixes the
lockdep splat below. Remove superfluous linebreak in __ww_mutex_lock()
as well.
[ 56.999063] =============================================
[ 56.999063] [ INFO: possible recursive locking detected ]
[ 56.999065] 3.14.4-rt5 #26 Not tainted
[ 56.999065] ---------------------------------------------
[ 56.999067] Xorg/4298 is trying to acquire lock:
[ 56.999107] (reservation_ww_class_mutex){+.+.+.}, at: [<ffffffffa02b4270>] nouveau_gem_ioctl_pushbuf+0x870/0x19f0 [nouveau]
[ 56.999107]
but task is already holding lock:
[ 56.999130] (reservation_ww_class_mutex){+.+.+.}, at: [<ffffffffa02b4270>] nouveau_gem_ioctl_pushbuf+0x870/0x19f0 [nouveau]
[ 56.999131]
other info that might help us debug this:
[ 56.999131] Possible unsafe locking scenario:
[ 56.999132] CPU0
[ 56.999132] ----
[ 56.999133] lock(reservation_ww_class_mutex);
[ 56.999134] lock(reservation_ww_class_mutex);
[ 56.999134]
*** DEADLOCK ***
[ 56.999135] May be due to missing lock nesting notation
[ 56.999136] 3 locks held by Xorg/4298:
[ 56.999160] #0: (&cli->mutex){+.+.+.}, at: [<ffffffffa02b597b>] nouveau_abi16_get+0x2b/0x100 [nouveau]
[ 56.999174] #1: (reservation_ww_class_acquire){+.+...}, at: [<ffffffffa0160cd2>] drm_ioctl+0x4d2/0x610 [drm]
[ 56.999197] #2: (reservation_ww_class_mutex){+.+.+.}, at: [<ffffffffa02b4270>] nouveau_gem_ioctl_pushbuf+0x870/0x19f0 [nouveau]
[ 56.999198]
stack backtrace:
[ 56.999200] CPU: 1 PID: 4298 Comm: Xorg Not tainted 3.14.4-rt5 #26
[ 56.999200] Hardware name: MEDIONPC MS-7502/MS-7502, BIOS 6.00 PG 12/26/2007
[ 56.999203] ffffffff820309b0 ffff8802217a9ac8 ffffffff8157e53e ffffffff820309b0
[ 56.999205] ffff8802217a9b90 ffffffff810964ce ffff8802217a9b40 0000000000000246
[ 56.999207] 0000000000000003 ffffffff8204f130 0001e09300000000 0001e093347523a9
[ 56.999207] Call Trace:
[ 56.999212] [<ffffffff8157e53e>] dump_stack+0x4f/0x9d
[ 56.999215] [<ffffffff810964ce>] __lock_acquire+0x169e/0x1a50
[ 56.999218] [<ffffffff81588da9>] ? preempt_count_sub+0x49/0x50
[ 56.999219] [<ffffffff81584541>] ? _raw_spin_unlock+0x31/0x60
[ 56.999221] [<ffffffff8109701d>] lock_acquire+0x8d/0x140
[ 56.999243] [<ffffffffa02b4270>] ? nouveau_gem_ioctl_pushbuf+0x870/0x19f0 [nouveau]
[ 56.999246] [<ffffffff8158313d>] __ww_mutex_lock_interruptible+0x3d/0xf0
[ 56.999268] [<ffffffffa02b4270>] ? nouveau_gem_ioctl_pushbuf+0x870/0x19f0 [nouveau]
[ 56.999290] [<ffffffffa02b4270>] nouveau_gem_ioctl_pushbuf+0x870/0x19f0 [nouveau]
[ 56.999298] [<ffffffffa0160cd2>] ? drm_ioctl+0x4d2/0x610 [drm]
[ 56.999308] [<ffffffffa0160cd2>] drm_ioctl+0x4d2/0x610 [drm]
[ 56.999311] [<ffffffff81588da9>] ? preempt_count_sub+0x49/0x50
[ 56.999313] [<ffffffff8107cd60>] ? migrate_enable+0xe0/0x1e0
[ 56.999335] [<ffffffffa02aaece>] nouveau_drm_ioctl+0x4e/0x90 [nouveau]
[ 56.999338] [<ffffffff81185ff0>] do_vfs_ioctl+0x300/0x530
[ 56.999340] [<ffffffff8158d33b>] ? sysret_check+0x1b/0x56
[ 56.999342] [<ffffffff81094a8d>] ? trace_hardirqs_on_caller+0x10d/0x1d0
[ 56.999344] [<ffffffff81186261>] SyS_ioctl+0x41/0x80
[ 56.999345] [<ffffffff8158d316>] system_call_fastpath+0x1a/0x1f
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
---
kernel/locking/rtmutex.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1964,7 +1964,7 @@ __ww_mutex_lock_interruptible(struct ww_
might_sleep();
- mutex_acquire(&lock->base.dep_map, 0, 0, _RET_IP_);
+ mutex_acquire_nest(&lock->base.dep_map, 0, 0, &ww_ctx->dep_map, _RET_IP_);
ret = rt_mutex_slowlock(&lock->base.lock, TASK_INTERRUPTIBLE, NULL, 0, ww_ctx);
if (ret)
mutex_release(&lock->base.dep_map, 1, _RET_IP_);
@@ -1982,8 +1982,7 @@ __ww_mutex_lock(struct ww_mutex *lock, s
might_sleep();
- mutex_acquire_nest(&lock->base.dep_map, 0, 0, &ww_ctx->dep_map,
- _RET_IP_);
+ mutex_acquire_nest(&lock->base.dep_map, 0, 0, &ww_ctx->dep_map, _RET_IP_);
ret = rt_mutex_slowlock(&lock->base.lock, TASK_UNINTERRUPTIBLE, NULL, 0, ww_ctx);
if (ret)
mutex_release(&lock->base.dep_map, 1, _RET_IP_);
* 3.14.23-rt20 - mm,memcg: make refill_stock()/consume_stock() use get_cpu_light()
[not found] ` <1414910967.5380.81.camel@marge.simpson.net>
2014-11-02 7:30 ` 3.14.23-rt20 - sched/numa: Fix task_numa_free() lockdep splat Mike Galbraith
2014-11-02 7:31 ` 3.14.23-rt20 - wwmutex: fix __ww_mutex_lock_interruptible lockdep annotation Mike Galbraith
@ 2014-11-02 7:31 ` Mike Galbraith
2014-11-02 7:31 ` 3.14.23-rt20 - fs,btrfs: fix rt deadlock on extent_buffer->lock Mike Galbraith
` (3 subsequent siblings)
6 siblings, 0 replies; 29+ messages in thread
From: Mike Galbraith @ 2014-11-02 7:31 UTC (permalink / raw)
To: Steven Rostedt
Cc: LKML, linux-rt-users, Thomas Gleixner, Carsten Emde, John Kacur,
Sebastian Andrzej Siewior, Clark Williams
Nikita reported the following memcg scheduling while atomic bug:
Call Trace:
[e22d5a90] [c0007ea8] show_stack+0x4c/0x168 (unreliable)
[e22d5ad0] [c0618c04] __schedule_bug+0x94/0xb0
[e22d5ae0] [c060b9ec] __schedule+0x530/0x550
[e22d5bf0] [c060bacc] schedule+0x30/0xbc
[e22d5c00] [c060ca24] rt_spin_lock_slowlock+0x180/0x27c
[e22d5c70] [c00b39dc] res_counter_uncharge_until+0x40/0xc4
[e22d5ca0] [c013ca88] drain_stock.isra.20+0x54/0x98
[e22d5cc0] [c01402ac] __mem_cgroup_try_charge+0x2e8/0xbac
[e22d5d70] [c01410d4] mem_cgroup_charge_common+0x3c/0x70
[e22d5d90] [c0117284] __do_fault+0x38c/0x510
[e22d5df0] [c011a5f4] handle_pte_fault+0x98/0x858
[e22d5e50] [c060ed08] do_page_fault+0x42c/0x6fc
[e22d5f40] [c000f5b4] handle_page_fault+0xc/0x80
What happens:
refill_stock()
get_cpu_var()
drain_stock()
res_counter_uncharge()
res_counter_uncharge_until()
spin_lock() <== boom
Fix it by replacing get/put_cpu_var() with get/put_cpu_light().
Reported-by: Nikita Yushchenko <nyushchenko@dev.rtsoft.ru>
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
---
mm/memcontrol.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2438,16 +2438,18 @@ static bool consume_stock(struct mem_cgr
{
struct memcg_stock_pcp *stock;
bool ret = true;
+ int cpu;
if (nr_pages > CHARGE_BATCH)
return false;
- stock = &get_cpu_var(memcg_stock);
+ cpu = get_cpu_light();
+ stock = &per_cpu(memcg_stock, cpu);
if (memcg == stock->cached && stock->nr_pages >= nr_pages)
stock->nr_pages -= nr_pages;
else /* need to call res_counter_charge */
ret = false;
- put_cpu_var(memcg_stock);
+ put_cpu_light();
return ret;
}
@@ -2497,14 +2499,17 @@ static void __init memcg_stock_init(void
*/
static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
{
- struct memcg_stock_pcp *stock = &get_cpu_var(memcg_stock);
+ struct memcg_stock_pcp *stock;
+ int cpu = get_cpu_light();
+
+ stock = &per_cpu(memcg_stock, cpu);
if (stock->cached != memcg) { /* reset if necessary */
drain_stock(stock);
stock->cached = memcg;
}
stock->nr_pages += nr_pages;
- put_cpu_var(memcg_stock);
+ put_cpu_light();
}
/*
* 3.14.23-rt20 - fs,btrfs: fix rt deadlock on extent_buffer->lock
[not found] ` <1414910967.5380.81.camel@marge.simpson.net>
` (2 preceding siblings ...)
2014-11-02 7:31 ` 3.14.23-rt20 - mm,memcg: make refill_stock()/consume_stock() use get_cpu_light() Mike Galbraith
@ 2014-11-02 7:31 ` Mike Galbraith
2015-02-17 11:56 ` Sebastian Andrzej Siewior
2014-11-02 7:31 ` 3.14.23-rt20 - aio: fix rcu garbage collection might_sleep() splat Mike Galbraith
` (2 subsequent siblings)
6 siblings, 1 reply; 29+ messages in thread
From: Mike Galbraith @ 2014-11-02 7:31 UTC (permalink / raw)
To: Steven Rostedt
Cc: LKML, linux-rt-users, Thomas Gleixner, Carsten Emde, John Kacur,
Sebastian Andrzej Siewior, Clark Williams
Subject: fs,btrfs: fix rt deadlock on extent_buffer->lock
From: Mike Galbraith <mgalbraith@suse.de>
Sat Jul 14 12:30:41 CEST 2012
Trivially repeatable deadlock is cured by enabling lockdep code in
btrfs_clear_path_blocking() as suggested by Chris Mason. He also
suggested restricting blocking reader count to one, and not allowing
a spinning reader while blocking reader exists. This has proven to
be unnecessary, the strict lock order enforcement is enough, or
rather that's my box's opinion after long hours of hard pounding.
Note: extent-tree.c bit is additional recommendation from Chris
Mason, split into a separate patch after discussion.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Chris Mason <chris.mason@fusionio.com>
---
fs/btrfs/ctree.c | 4 ++--
fs/btrfs/extent-tree.c | 8 --------
2 files changed, 2 insertions(+), 10 deletions(-)
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -80,7 +80,7 @@ noinline void btrfs_clear_path_blocking(
{
int i;
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
+#if (defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_PREEMPT_RT_BASE))
/* lockdep really cares that we take all of these spinlocks
* in the right order. If any of the locks in the path are not
* currently blocking, it is going to complain. So, make really
@@ -107,7 +107,7 @@ noinline void btrfs_clear_path_blocking(
}
}
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
+#if (defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_PREEMPT_RT_BASE))
if (held)
btrfs_clear_lock_blocking_rw(held, held_rw);
#endif
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -6938,14 +6938,6 @@ use_block_rsv(struct btrfs_trans_handle
goto again;
}
- if (btrfs_test_opt(root, ENOSPC_DEBUG)) {
- static DEFINE_RATELIMIT_STATE(_rs,
- DEFAULT_RATELIMIT_INTERVAL * 10,
- /*DEFAULT_RATELIMIT_BURST*/ 1);
- if (__ratelimit(&_rs))
- WARN(1, KERN_DEBUG
- "BTRFS: block rsv returned %d\n", ret);
- }
try_reserve:
ret = reserve_metadata_bytes(root, block_rsv, blocksize,
BTRFS_RESERVE_NO_FLUSH);
* Re: 3.14.23-rt20 - fs,btrfs: fix rt deadlock on extent_buffer->lock
2014-11-02 7:31 ` 3.14.23-rt20 - fs,btrfs: fix rt deadlock on extent_buffer->lock Mike Galbraith
@ 2015-02-17 11:56 ` Sebastian Andrzej Siewior
2015-02-17 12:23 ` Mike Galbraith
2015-02-18 10:47 ` Mike Galbraith
0 siblings, 2 replies; 29+ messages in thread
From: Sebastian Andrzej Siewior @ 2015-02-17 11:56 UTC (permalink / raw)
To: Mike Galbraith
Cc: Steven Rostedt, LKML, linux-rt-users, Thomas Gleixner,
Carsten Emde, John Kacur, Clark Williams
* Mike Galbraith | 2014-11-02 08:31:18 [+0100]:
>--- a/fs/btrfs/ctree.c
>+++ b/fs/btrfs/ctree.c
>@@ -80,7 +80,7 @@ noinline void btrfs_clear_path_blocking(
> {
> int i;
>
>-#ifdef CONFIG_DEBUG_LOCK_ALLOC
>+#if (defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_PREEMPT_RT_BASE))
> /* lockdep really cares that we take all of these spinlocks
> * in the right order. If any of the locks in the path are not
> * currently blocking, it is going to complain. So, make really
This is gone since commit f82c458 ("btrfs: fix lockups from
btrfs_clear_path_blocking")
>@@ -107,7 +107,7 @@ noinline void btrfs_clear_path_blocking(
> }
> }
>
>-#ifdef CONFIG_DEBUG_LOCK_ALLOC
>+#if (defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_PREEMPT_RT_BASE))
> if (held)
> btrfs_clear_lock_blocking_rw(held, held_rw);
> #endif
>--- a/fs/btrfs/extent-tree.c
>+++ b/fs/btrfs/extent-tree.c
>@@ -6938,14 +6938,6 @@ use_block_rsv(struct btrfs_trans_handle
> goto again;
> }
>
>- if (btrfs_test_opt(root, ENOSPC_DEBUG)) {
>- static DEFINE_RATELIMIT_STATE(_rs,
>- DEFAULT_RATELIMIT_INTERVAL * 10,
>- /*DEFAULT_RATELIMIT_BURST*/ 1);
>- if (__ratelimit(&_rs))
>- WARN(1, KERN_DEBUG
>- "BTRFS: block rsv returned %d\n", ret);
>- }
> try_reserve:
> ret = reserve_metadata_bytes(root, block_rsv, blocksize,
> BTRFS_RESERVE_NO_FLUSH);
>
and this looks like just a debug-only warning being suppressed.
May I drop this patch?
Sebastian
* Re: 3.14.23-rt20 - fs,btrfs: fix rt deadlock on extent_buffer->lock
2015-02-17 11:56 ` Sebastian Andrzej Siewior
@ 2015-02-17 12:23 ` Mike Galbraith
2015-02-18 10:47 ` Mike Galbraith
1 sibling, 0 replies; 29+ messages in thread
From: Mike Galbraith @ 2015-02-17 12:23 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Steven Rostedt, LKML, linux-rt-users, Thomas Gleixner,
Carsten Emde, John Kacur, Clark Williams
On Tue, 2015-02-17 at 12:56 +0100, Sebastian Andrzej Siewior wrote:
> * Mike Galbraith | 2014-11-02 08:31:18 [+0100]:
>
> >--- a/fs/btrfs/ctree.c
> >+++ b/fs/btrfs/ctree.c
> >@@ -80,7 +80,7 @@ noinline void btrfs_clear_path_blocking(
> > {
> > int i;
> >
> >-#ifdef CONFIG_DEBUG_LOCK_ALLOC
> >+#if (defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_PREEMPT_RT_BASE))
> > /* lockdep really cares that we take all of these spinlocks
> > * in the right order. If any of the locks in the path are not
> > * currently blocking, it is going to complain. So, make really
>
> This is gone since commit f82c458 ("btrfs: fix lockups from
> btrfs_clear_path_blocking")
>
> >@@ -107,7 +107,7 @@ noinline void btrfs_clear_path_blocking(
> > }
> > }
> >
> >-#ifdef CONFIG_DEBUG_LOCK_ALLOC
> >+#if (defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_PREEMPT_RT_BASE))
> > if (held)
> > btrfs_clear_lock_blocking_rw(held, held_rw);
> > #endif
> >--- a/fs/btrfs/extent-tree.c
> >+++ b/fs/btrfs/extent-tree.c
> >@@ -6938,14 +6938,6 @@ use_block_rsv(struct btrfs_trans_handle
> > goto again;
> > }
> >
> >- if (btrfs_test_opt(root, ENOSPC_DEBUG)) {
> >- static DEFINE_RATELIMIT_STATE(_rs,
> >- DEFAULT_RATELIMIT_INTERVAL * 10,
> >- /*DEFAULT_RATELIMIT_BURST*/ 1);
> >- if (__ratelimit(&_rs))
> >- WARN(1, KERN_DEBUG
> >- "BTRFS: block rsv returned %d\n", ret);
> >- }
> > try_reserve:
> > ret = reserve_metadata_bytes(root, block_rsv, blocksize,
> > BTRFS_RESERVE_NO_FLUSH);
> >
> and this look like just a warning with enabled debug that is supressed.
> May I drop this patch?
Hm, I'll try to find some time to beat on filesystems again.
-Mike
* Re: 3.14.23-rt20 - fs,btrfs: fix rt deadlock on extent_buffer->lock
2015-02-17 11:56 ` Sebastian Andrzej Siewior
2015-02-17 12:23 ` Mike Galbraith
@ 2015-02-18 10:47 ` Mike Galbraith
1 sibling, 0 replies; 29+ messages in thread
From: Mike Galbraith @ 2015-02-18 10:47 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Steven Rostedt, LKML, linux-rt-users, Thomas Gleixner,
Carsten Emde, John Kacur, Clark Williams
On Tue, 2015-02-17 at 12:56 +0100, Sebastian Andrzej Siewior wrote:
> * Mike Galbraith | 2014-11-02 08:31:18 [+0100]:
>
> >--- a/fs/btrfs/ctree.c
> >+++ b/fs/btrfs/ctree.c
> >@@ -80,7 +80,7 @@ noinline void btrfs_clear_path_blocking(
> > {
> > int i;
> >
> >-#ifdef CONFIG_DEBUG_LOCK_ALLOC
> >+#if (defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_PREEMPT_RT_BASE))
> > /* lockdep really cares that we take all of these spinlocks
> > * in the right order. If any of the locks in the path are not
> > * currently blocking, it is going to complain. So, make really
>
> This is gone since commit f82c458 ("btrfs: fix lockups from
> btrfs_clear_path_blocking")
Goody, BTRFS took a modest workout without a whimper. When that patch
was born, you didn't have to try, deadlock was immediate gratification.
> >@@ -107,7 +107,7 @@ noinline void btrfs_clear_path_blocking(
> > }
> > }
> >
> >-#ifdef CONFIG_DEBUG_LOCK_ALLOC
> >+#if (defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_PREEMPT_RT_BASE))
> > if (held)
> > btrfs_clear_lock_blocking_rw(held, held_rw);
> > #endif
> >--- a/fs/btrfs/extent-tree.c
> >+++ b/fs/btrfs/extent-tree.c
> >@@ -6938,14 +6938,6 @@ use_block_rsv(struct btrfs_trans_handle
> > goto again;
> > }
> >
> >- if (btrfs_test_opt(root, ENOSPC_DEBUG)) {
> >- static DEFINE_RATELIMIT_STATE(_rs,
> >- DEFAULT_RATELIMIT_INTERVAL * 10,
> >- /*DEFAULT_RATELIMIT_BURST*/ 1);
> >- if (__ratelimit(&_rs))
> >- WARN(1, KERN_DEBUG
> >- "BTRFS: block rsv returned %d\n", ret);
> >- }
> > try_reserve:
> > ret = reserve_metadata_bytes(root, block_rsv, blocksize,
> > BTRFS_RESERVE_NO_FLUSH);
> >
> and this look like just a warning with enabled debug that is supressed.
> May I drop this patch?
Yup, that was a cosmetic thing, also suggested by Chris Mason. BTRFS
seems to now just work. Hopefully it'll keep on just working, I don't
want to ever go back there :)
-Mike
* 3.14.23-rt20 - aio: fix rcu garbage collection might_sleep() splat
[not found] ` <1414910967.5380.81.camel@marge.simpson.net>
` (3 preceding siblings ...)
2014-11-02 7:31 ` 3.14.23-rt20 - fs,btrfs: fix rt deadlock on extent_buffer->lock Mike Galbraith
@ 2014-11-02 7:31 ` Mike Galbraith
2015-02-17 12:53 ` Sebastian Andrzej Siewior
2014-11-02 7:31 ` 3.14.23-rt20 - x86, UV: raw_spinlock conversion Mike Galbraith
2014-11-02 7:31 ` 3.14.23-rt20 - softirq: resurrect softirq threads Mike Galbraith
6 siblings, 1 reply; 29+ messages in thread
From: Mike Galbraith @ 2014-11-02 7:31 UTC (permalink / raw)
To: Steven Rostedt
Cc: LKML, linux-rt-users, Thomas Gleixner, Carsten Emde, John Kacur,
Sebastian Andrzej Siewior, Clark Williams
(not pretty)
[ 172.743098] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:768
[ 172.743116] in_atomic(): 1, irqs_disabled(): 0, pid: 26, name: rcuos/2
[ 172.743117] 2 locks held by rcuos/2/26:
[ 172.743128] #0: (rcu_callback){.+.+..}, at: [<ffffffff810b1a12>] rcu_nocb_kthread+0x1e2/0x380
[ 172.743135] #1: (rcu_read_lock_sched){.+.+..}, at: [<ffffffff812acd26>] percpu_ref_kill_rcu+0xa6/0x1c0
[ 172.743138] Preemption disabled at:[<ffffffff810b1a93>] rcu_nocb_kthread+0x263/0x380
[ 172.743138]
[ 172.743142] CPU: 0 PID: 26 Comm: rcuos/2 Not tainted 3.14.4-rt5 #31
[ 172.743143] Hardware name: MEDIONPC MS-7502/MS-7502, BIOS 6.00 PG 12/26/2007
[ 172.743148] ffff8802231aa190 ffff8802231a5d08 ffffffff81582e9e 0000000000000000
[ 172.743151] ffff8802231a5d28 ffffffff81077aeb ffff880209f68140 ffff880209f681c0
[ 172.743154] ffff8802231a5d48 ffffffff81589304 ffff880209f68000 ffff880209f68000
[ 172.743155] Call Trace:
[ 172.743160] [<ffffffff81582e9e>] dump_stack+0x4e/0x9c
[ 172.743163] [<ffffffff81077aeb>] __might_sleep+0xfb/0x170
[ 172.743167] [<ffffffff81589304>] rt_spin_lock+0x24/0x70
[ 172.743171] [<ffffffff811c5790>] free_ioctx_users+0x30/0x130
[ 172.743174] [<ffffffff812ace34>] percpu_ref_kill_rcu+0x1b4/0x1c0
[ 172.743177] [<ffffffff812acd26>] ? percpu_ref_kill_rcu+0xa6/0x1c0
[ 172.743180] [<ffffffff812acc80>] ? percpu_ref_kill_and_confirm+0x70/0x70
[ 172.743183] [<ffffffff810b1a93>] rcu_nocb_kthread+0x263/0x380
[ 172.743185] [<ffffffff810b1a12>] ? rcu_nocb_kthread+0x1e2/0x380
[ 172.743189] [<ffffffff810b1830>] ? rcu_report_exp_rnp.isra.52+0xc0/0xc0
[ 172.743192] [<ffffffff8106e046>] kthread+0xd6/0xf0
[ 172.743194] [<ffffffff8158900c>] ? _raw_spin_unlock_irq+0x2c/0x70
[ 172.743197] [<ffffffff8106df70>] ? __kthread_parkme+0x70/0x70
[ 172.743200] [<ffffffff81591eec>] ret_from_fork+0x7c/0xb0
[ 172.743203] [<ffffffff8106df70>] ? __kthread_parkme+0x70/0x70
crash> gdb list *percpu_ref_kill_rcu+0x1b4
0xffffffff812ace34 is in percpu_ref_kill_rcu (include/linux/percpu-refcount.h:169).
164 pcpu_count = ACCESS_ONCE(ref->pcpu_count);
165
166 if (likely(REF_STATUS(pcpu_count) == PCPU_REF_PTR))
167 __this_cpu_dec(*pcpu_count);
168 else if (unlikely(atomic_dec_and_test(&ref->count)))
169 ref->release(ref);
170
171 rcu_read_unlock_sched();
172 }
173
Ok, so ->release() can't do anything where it may meet a sleeping lock,
but in an -rt kernel, it does that.
Convert struct kioctx ctx_lock/completion_lock to raw_spinlock_t, and
defer final free to a time when we're not under rcu_read_lock_sched().
runltp -f ltp-aio-stress.part1 runs kernel/ltp gripe-free.
INFO: ltp-pan reported all tests PASS
LTP Version: 20140422
###############################################################
Done executing testcases.
LTP Version: 20140422
###############################################################
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
---
fs/aio.c | 65 ++++++++++++++++++++++++++++++++++++++++++++-------------------
1 file changed, 46 insertions(+), 19 deletions(-)
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -130,7 +130,7 @@ struct kioctx {
} ____cacheline_aligned_in_smp;
struct {
- spinlock_t ctx_lock;
+ raw_spinlock_t ctx_lock;
struct list_head active_reqs; /* used for cancellation */
} ____cacheline_aligned_in_smp;
@@ -142,13 +142,16 @@ struct kioctx {
struct {
unsigned tail;
unsigned completed_events;
- spinlock_t completion_lock;
+ raw_spinlock_t completion_lock;
} ____cacheline_aligned_in_smp;
struct page *internal_pages[AIO_RING_PAGES];
struct file *aio_ring_file;
unsigned id;
+#ifdef CONFIG_PREEMPT_RT_BASE
+ struct rcu_head rcu;
+#endif
};
/*------ sysctl variables----*/
@@ -340,11 +343,11 @@ static int aio_migratepage(struct addres
* while the old page is copied to the new. This prevents new
* events from being lost.
*/
- spin_lock_irqsave(&ctx->completion_lock, flags);
+ raw_spin_lock_irqsave(&ctx->completion_lock, flags);
migrate_page_copy(new, old);
BUG_ON(ctx->ring_pages[idx] != old);
ctx->ring_pages[idx] = new;
- spin_unlock_irqrestore(&ctx->completion_lock, flags);
+ raw_spin_unlock_irqrestore(&ctx->completion_lock, flags);
/* The old page is no longer accessible. */
put_page(old);
@@ -467,14 +470,14 @@ void kiocb_set_cancel_fn(struct kiocb *r
struct kioctx *ctx = req->ki_ctx;
unsigned long flags;
- spin_lock_irqsave(&ctx->ctx_lock, flags);
+ raw_spin_lock_irqsave(&ctx->ctx_lock, flags);
if (!req->ki_list.next)
list_add(&req->ki_list, &ctx->active_reqs);
req->ki_cancel = cancel;
- spin_unlock_irqrestore(&ctx->ctx_lock, flags);
+ raw_spin_unlock_irqrestore(&ctx->ctx_lock, flags);
}
EXPORT_SYMBOL(kiocb_set_cancel_fn);
@@ -499,6 +502,7 @@ static int kiocb_cancel(struct kioctx *c
return cancel(kiocb);
}
+#ifndef CONFIG_PREEMPT_RT_BASE
static void free_ioctx(struct work_struct *work)
{
struct kioctx *ctx = container_of(work, struct kioctx, free_work);
@@ -509,6 +513,18 @@ static void free_ioctx(struct work_struc
free_percpu(ctx->cpu);
kmem_cache_free(kioctx_cachep, ctx);
}
+#else
+static void free_ioctx_rcu(struct rcu_head *rcu)
+{
+ struct kioctx *ctx = container_of(rcu, struct kioctx, rcu);
+
+ pr_debug("freeing %p\n", ctx);
+
+ aio_free_ring(ctx);
+ free_percpu(ctx->cpu);
+ kmem_cache_free(kioctx_cachep, ctx);
+}
+#endif
static void free_ioctx_reqs(struct percpu_ref *ref)
{
@@ -518,8 +534,19 @@ static void free_ioctx_reqs(struct percp
if (ctx->requests_done)
complete(ctx->requests_done);
+#ifndef CONFIG_PREEMPT_RT_BASE
INIT_WORK(&ctx->free_work, free_ioctx);
schedule_work(&ctx->free_work);
+#else
+ /*
+ * We're in ->release() under rcu_read_lock_sched(), and can't do
+ * anything that requires taking a sleeping lock, so ->release()
+ * becomes a two stage rcu process for -rt. We've now done the
+ * rcu work that we can under locks made raw to get us this far.
+ * Defer the freeing bit until we're again allowed to schedule().
+ */
+ call_rcu(&ctx->rcu, free_ioctx_rcu);
+#endif
}
/*
@@ -532,7 +559,7 @@ static void free_ioctx_users(struct perc
struct kioctx *ctx = container_of(ref, struct kioctx, users);
struct kiocb *req;
- spin_lock_irq(&ctx->ctx_lock);
+ raw_spin_lock_irq(&ctx->ctx_lock);
while (!list_empty(&ctx->active_reqs)) {
req = list_first_entry(&ctx->active_reqs,
@@ -542,7 +569,7 @@ static void free_ioctx_users(struct perc
kiocb_cancel(ctx, req);
}
- spin_unlock_irq(&ctx->ctx_lock);
+ raw_spin_unlock_irq(&ctx->ctx_lock);
percpu_ref_kill(&ctx->reqs);
percpu_ref_put(&ctx->reqs);
@@ -655,8 +682,8 @@ static struct kioctx *ioctx_alloc(unsign
ctx->max_reqs = nr_events;
- spin_lock_init(&ctx->ctx_lock);
- spin_lock_init(&ctx->completion_lock);
+ raw_spin_lock_init(&ctx->ctx_lock);
+ raw_spin_lock_init(&ctx->completion_lock);
mutex_init(&ctx->ring_lock);
/* Protect against page migration throughout kiotx setup by keeping
* the ring_lock mutex held until setup is complete. */
@@ -925,7 +952,7 @@ static void refill_reqs_available(struct
*/
static void user_refill_reqs_available(struct kioctx *ctx)
{
- spin_lock_irq(&ctx->completion_lock);
+ raw_spin_lock_irq(&ctx->completion_lock);
if (ctx->completed_events) {
struct aio_ring *ring;
unsigned head;
@@ -946,7 +973,7 @@ static void user_refill_reqs_available(s
refill_reqs_available(ctx, head, ctx->tail);
}
- spin_unlock_irq(&ctx->completion_lock);
+ raw_spin_unlock_irq(&ctx->completion_lock);
}
/* aio_get_req
@@ -1041,9 +1068,9 @@ void aio_complete(struct kiocb *iocb, lo
if (iocb->ki_list.next) {
unsigned long flags;
- spin_lock_irqsave(&ctx->ctx_lock, flags);
+ raw_spin_lock_irqsave(&ctx->ctx_lock, flags);
list_del(&iocb->ki_list);
- spin_unlock_irqrestore(&ctx->ctx_lock, flags);
+ raw_spin_unlock_irqrestore(&ctx->ctx_lock, flags);
}
/*
@@ -1051,7 +1078,7 @@ void aio_complete(struct kiocb *iocb, lo
* ctx->completion_lock to prevent other code from messing with the tail
* pointer since we might be called from irq context.
*/
- spin_lock_irqsave(&ctx->completion_lock, flags);
+ raw_spin_lock_irqsave(&ctx->completion_lock, flags);
tail = ctx->tail;
pos = tail + AIO_EVENTS_OFFSET;
@@ -1090,7 +1117,7 @@ void aio_complete(struct kiocb *iocb, lo
ctx->completed_events++;
if (ctx->completed_events > 1)
refill_reqs_available(ctx, head, tail);
- spin_unlock_irqrestore(&ctx->completion_lock, flags);
+ raw_spin_unlock_irqrestore(&ctx->completion_lock, flags);
pr_debug("added to ring %p at [%u]\n", iocb, tail);
@@ -1631,7 +1658,7 @@ static struct kiocb *lookup_kiocb(struct
{
struct list_head *pos;
- assert_spin_locked(&ctx->ctx_lock);
+ assert_raw_spin_locked(&ctx->ctx_lock);
if (key != KIOCB_KEY)
return NULL;
@@ -1671,7 +1698,7 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t
if (unlikely(!ctx))
return -EINVAL;
- spin_lock_irq(&ctx->ctx_lock);
+ raw_spin_lock_irq(&ctx->ctx_lock);
kiocb = lookup_kiocb(ctx, iocb, key);
if (kiocb)
@@ -1679,7 +1706,7 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t
else
ret = -EINVAL;
- spin_unlock_irq(&ctx->ctx_lock);
+ raw_spin_unlock_irq(&ctx->ctx_lock);
if (!ret) {
/*
* Re: 3.14.23-rt20 - aio: fix rcu garbage collection might_sleep() splat
2014-11-02 7:31 ` 3.14.23-rt20 - aio: fix rcu garbage collection might_sleep() splat Mike Galbraith
@ 2015-02-17 12:53 ` Sebastian Andrzej Siewior
2015-02-17 14:01 ` Mike Galbraith
0 siblings, 1 reply; 29+ messages in thread
From: Sebastian Andrzej Siewior @ 2015-02-17 12:53 UTC (permalink / raw)
To: Mike Galbraith
Cc: Steven Rostedt, LKML, linux-rt-users, Thomas Gleixner,
Carsten Emde, John Kacur, Clark Williams
* Mike Galbraith | 2014-11-02 08:31:28 [+0100]:
>(not pretty)
You do the RCU thingy and replace locking. Is this required after the
patch I've sent in "Re: [RFC PATCH V2] rt/aio: fix rcu garbage
collection might_sleep() splat"? From a quick browse I've seen that
ffs_epfile_aio_write() invokes kiocb_set_cancel_fn(, ffs_aio_cancel) and
the function does hold a sleeping lock.
Sebastian
* Re: 3.14.23-rt20 - aio: fix rcu garbage collection might_sleep() splat
2015-02-17 12:53 ` Sebastian Andrzej Siewior
@ 2015-02-17 14:01 ` Mike Galbraith
0 siblings, 0 replies; 29+ messages in thread
From: Mike Galbraith @ 2015-02-17 14:01 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Steven Rostedt, LKML, linux-rt-users, Thomas Gleixner,
Carsten Emde, John Kacur, Clark Williams
On Tue, 2015-02-17 at 13:53 +0100, Sebastian Andrzej Siewior wrote:
> * Mike Galbraith | 2014-11-02 08:31:28 [+0100]:
>
> >(not pretty)
>
> You do the RCU thingy and replace locking. Is this required after the
> patch I've sent in "Re: [RFC PATCH V2] rt/aio: fix rcu garbage
> collection might_sleep() splat"? From a quick browse I've seen that
> ffs_epfile_aio_write() invokes kiocb_set_cancel_fn(, ffs_aio_cancel) and
> the function does hold a sleeping lock.
Yeah, I don't think that's changed, it should still gripe.
-Mike
* 3.14.23-rt20 - x86, UV: raw_spinlock conversion
[not found] ` <1414910967.5380.81.camel@marge.simpson.net>
` (4 preceding siblings ...)
2014-11-02 7:31 ` 3.14.23-rt20 - aio: fix rcu garbage collection might_sleep() splat Mike Galbraith
@ 2014-11-02 7:31 ` Mike Galbraith
2015-02-17 13:02 ` Sebastian Andrzej Siewior
2014-11-02 7:31 ` 3.14.23-rt20 - softirq: resurrect softirq threads Mike Galbraith
6 siblings, 1 reply; 29+ messages in thread
From: Mike Galbraith @ 2014-11-02 7:31 UTC (permalink / raw)
To: Steven Rostedt
Cc: LKML, linux-rt-users, Thomas Gleixner, Carsten Emde, John Kacur,
Sebastian Andrzej Siewior, Clark Williams
Shrug. Lots of hobbyists have a beast in their basement, right?
From: Mike Galbraith <mgalbraith@suse.de>
Date: Mon Dec 5 10:01:47 CET 2011
Subject: x86, UV: raw_spinlock conversion
Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
---
arch/x86/include/asm/uv/uv_bau.h | 14 +++++++-------
arch/x86/include/asm/uv/uv_hub.h | 2 +-
arch/x86/kernel/apic/x2apic_uv_x.c | 2 +-
arch/x86/platform/uv/tlb_uv.c | 26 +++++++++++++-------------
arch/x86/platform/uv/uv_time.c | 21 +++++++++++++--------
5 files changed, 35 insertions(+), 30 deletions(-)
--- a/arch/x86/include/asm/uv/uv_bau.h
+++ b/arch/x86/include/asm/uv/uv_bau.h
@@ -611,9 +611,9 @@ struct bau_control {
cycles_t send_message;
cycles_t period_end;
cycles_t period_time;
- spinlock_t uvhub_lock;
- spinlock_t queue_lock;
- spinlock_t disable_lock;
+ raw_spinlock_t uvhub_lock;
+ raw_spinlock_t queue_lock;
+ raw_spinlock_t disable_lock;
/* tunables */
int max_concurr;
int max_concurr_const;
@@ -773,15 +773,15 @@ static inline int atom_asr(short i, stru
* to be lowered below the current 'v'. atomic_add_unless can only stop
* on equal.
*/
-static inline int atomic_inc_unless_ge(spinlock_t *lock, atomic_t *v, int u)
+static inline int atomic_inc_unless_ge(raw_spinlock_t *lock, atomic_t *v, int u)
{
- spin_lock(lock);
+ raw_spin_lock(lock);
if (atomic_read(v) >= u) {
- spin_unlock(lock);
+ raw_spin_unlock(lock);
return 0;
}
atomic_inc(v);
- spin_unlock(lock);
+ raw_spin_unlock(lock);
return 1;
}
--- a/arch/x86/include/asm/uv/uv_hub.h
+++ b/arch/x86/include/asm/uv/uv_hub.h
@@ -502,7 +502,7 @@ struct uv_blade_info {
unsigned short nr_online_cpus;
unsigned short pnode;
short memory_nid;
- spinlock_t nmi_lock; /* obsolete, see uv_hub_nmi */
+ raw_spinlock_t nmi_lock; /* obsolete, see uv_hub_nmi */
unsigned long nmi_count; /* obsolete, see uv_hub_nmi */
};
extern struct uv_blade_info *uv_blade_info;
--- a/arch/x86/kernel/apic/x2apic_uv_x.c
+++ b/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -911,7 +911,7 @@ void __init uv_system_init(void)
uv_blade_info[blade].pnode = pnode;
uv_blade_info[blade].nr_possible_cpus = 0;
uv_blade_info[blade].nr_online_cpus = 0;
- spin_lock_init(&uv_blade_info[blade].nmi_lock);
+ raw_spin_lock_init(&uv_blade_info[blade].nmi_lock);
min_pnode = min(pnode, min_pnode);
max_pnode = max(pnode, max_pnode);
blade++;
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -719,9 +719,9 @@ static void destination_plugged(struct b
quiesce_local_uvhub(hmaster);
- spin_lock(&hmaster->queue_lock);
+ raw_spin_lock(&hmaster->queue_lock);
reset_with_ipi(&bau_desc->distribution, bcp);
- spin_unlock(&hmaster->queue_lock);
+ raw_spin_unlock(&hmaster->queue_lock);
end_uvhub_quiesce(hmaster);
@@ -741,9 +741,9 @@ static void destination_timeout(struct b
quiesce_local_uvhub(hmaster);
- spin_lock(&hmaster->queue_lock);
+ raw_spin_lock(&hmaster->queue_lock);
reset_with_ipi(&bau_desc->distribution, bcp);
- spin_unlock(&hmaster->queue_lock);
+ raw_spin_unlock(&hmaster->queue_lock);
end_uvhub_quiesce(hmaster);
@@ -764,7 +764,7 @@ static void disable_for_period(struct ba
cycles_t tm1;
hmaster = bcp->uvhub_master;
- spin_lock(&hmaster->disable_lock);
+ raw_spin_lock(&hmaster->disable_lock);
if (!bcp->baudisabled) {
stat->s_bau_disabled++;
tm1 = get_cycles();
@@ -777,7 +777,7 @@ static void disable_for_period(struct ba
}
}
}
- spin_unlock(&hmaster->disable_lock);
+ raw_spin_unlock(&hmaster->disable_lock);
}
static void count_max_concurr(int stat, struct bau_control *bcp,
@@ -840,7 +840,7 @@ static void record_send_stats(cycles_t t
*/
static void uv1_throttle(struct bau_control *hmaster, struct ptc_stats *stat)
{
- spinlock_t *lock = &hmaster->uvhub_lock;
+ raw_spinlock_t *lock = &hmaster->uvhub_lock;
atomic_t *v;
v = &hmaster->active_descriptor_count;
@@ -972,7 +972,7 @@ static int check_enable(struct bau_contr
struct bau_control *hmaster;
hmaster = bcp->uvhub_master;
- spin_lock(&hmaster->disable_lock);
+ raw_spin_lock(&hmaster->disable_lock);
if (bcp->baudisabled && (get_cycles() >= bcp->set_bau_on_time)) {
stat->s_bau_reenabled++;
for_each_present_cpu(tcpu) {
@@ -984,10 +984,10 @@ static int check_enable(struct bau_contr
tbcp->period_giveups = 0;
}
}
- spin_unlock(&hmaster->disable_lock);
+ raw_spin_unlock(&hmaster->disable_lock);
return 0;
}
- spin_unlock(&hmaster->disable_lock);
+ raw_spin_unlock(&hmaster->disable_lock);
return -1;
}
@@ -1895,9 +1895,9 @@ static void __init init_per_cpu_tunables
bcp->cong_reps = congested_reps;
bcp->disabled_period = sec_2_cycles(disabled_period);
bcp->giveup_limit = giveup_limit;
- spin_lock_init(&bcp->queue_lock);
- spin_lock_init(&bcp->uvhub_lock);
- spin_lock_init(&bcp->disable_lock);
+ raw_spin_lock_init(&bcp->queue_lock);
+ raw_spin_lock_init(&bcp->uvhub_lock);
+ raw_spin_lock_init(&bcp->disable_lock);
}
}
--- a/arch/x86/platform/uv/uv_time.c
+++ b/arch/x86/platform/uv/uv_time.c
@@ -58,7 +58,7 @@ static DEFINE_PER_CPU(struct clock_event
/* There is one of these allocated per node */
struct uv_rtc_timer_head {
- spinlock_t lock;
+ raw_spinlock_t lock;
/* next cpu waiting for timer, local node relative: */
int next_cpu;
/* number of cpus on this node: */
@@ -178,7 +178,7 @@ static __init int uv_rtc_allocate_timers
uv_rtc_deallocate_timers();
return -ENOMEM;
}
- spin_lock_init(&head->lock);
+ raw_spin_lock_init(&head->lock);
head->ncpus = uv_blade_nr_possible_cpus(bid);
head->next_cpu = -1;
blade_info[bid] = head;
@@ -232,7 +232,7 @@ static int uv_rtc_set_timer(int cpu, u64
unsigned long flags;
int next_cpu;
- spin_lock_irqsave(&head->lock, flags);
+ raw_spin_lock_irqsave(&head->lock, flags);
next_cpu = head->next_cpu;
*t = expires;
@@ -244,12 +244,12 @@ static int uv_rtc_set_timer(int cpu, u64
if (uv_setup_intr(cpu, expires)) {
*t = ULLONG_MAX;
uv_rtc_find_next_timer(head, pnode);
- spin_unlock_irqrestore(&head->lock, flags);
+ raw_spin_unlock_irqrestore(&head->lock, flags);
return -ETIME;
}
}
- spin_unlock_irqrestore(&head->lock, flags);
+ raw_spin_unlock_irqrestore(&head->lock, flags);
return 0;
}
@@ -268,7 +268,7 @@ static int uv_rtc_unset_timer(int cpu, i
unsigned long flags;
int rc = 0;
- spin_lock_irqsave(&head->lock, flags);
+ raw_spin_lock_irqsave(&head->lock, flags);
if ((head->next_cpu == bcpu && uv_read_rtc(NULL) >= *t) || force)
rc = 1;
@@ -280,7 +280,7 @@ static int uv_rtc_unset_timer(int cpu, i
uv_rtc_find_next_timer(head, pnode);
}
- spin_unlock_irqrestore(&head->lock, flags);
+ raw_spin_unlock_irqrestore(&head->lock, flags);
return rc;
}
@@ -300,13 +300,18 @@ static int uv_rtc_unset_timer(int cpu, i
static cycle_t uv_read_rtc(struct clocksource *cs)
{
unsigned long offset;
+ cycle_t cycles;
+ migrate_disable();
if (uv_get_min_hub_revision_id() == 1)
offset = 0;
else
offset = (uv_blade_processor_id() * L1_CACHE_BYTES) % PAGE_SIZE;
- return (cycle_t)uv_read_local_mmr(UVH_RTC | offset);
+ cycles = (cycle_t)uv_read_local_mmr(UVH_RTC | offset);
+ migrate_enable();
+
+ return cycles;
}
/*
* Re: 3.14.23-rt20 - x86, UV: raw_spinlock conversion
2014-11-02 7:31 ` 3.14.23-rt20 - x86, UV: raw_spinlock conversion Mike Galbraith
@ 2015-02-17 13:02 ` Sebastian Andrzej Siewior
2015-02-17 14:11 ` Mike Galbraith
0 siblings, 1 reply; 29+ messages in thread
From: Sebastian Andrzej Siewior @ 2015-02-17 13:02 UTC (permalink / raw)
To: Mike Galbraith
Cc: Steven Rostedt, LKML, linux-rt-users, Thomas Gleixner,
Carsten Emde, John Kacur, Clark Williams
* Mike Galbraith | 2014-11-02 08:31:37 [+0100]:
>Shrug. Lots of hobbyists have a beast in their basement, right?
I can take this as is if you want.
>--- a/arch/x86/platform/uv/uv_time.c
>+++ b/arch/x86/platform/uv/uv_time.c
>@@ -300,13 +300,18 @@ static int uv_rtc_unset_timer(int cpu, i
> static cycle_t uv_read_rtc(struct clocksource *cs)
> {
> unsigned long offset;
>+ cycle_t cycles;
>
>+ migrate_disable();
> if (uv_get_min_hub_revision_id() == 1)
> offset = 0;
> else
> offset = (uv_blade_processor_id() * L1_CACHE_BYTES) % PAGE_SIZE;
>
>- return (cycle_t)uv_read_local_mmr(UVH_RTC | offset);
>+ cycles = (cycle_t)uv_read_local_mmr(UVH_RTC | offset);
>+ migrate_enable();
>+
>+ return cycles;
> }
but do you really want a migrate_disable() in here? The only problem I
can imagine is that you switch CPUs between uv_blade_processor_id()
and the actual read. I recommend preempt_disable() and sending this
upstream as well since it is not limited to -RT.
Sebastian
* Re: 3.14.23-rt20 - x86, UV: raw_spinlock conversion
2015-02-17 13:02 ` Sebastian Andrzej Siewior
@ 2015-02-17 14:11 ` Mike Galbraith
0 siblings, 0 replies; 29+ messages in thread
From: Mike Galbraith @ 2015-02-17 14:11 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Steven Rostedt, LKML, linux-rt-users, Thomas Gleixner,
Carsten Emde, John Kacur, Clark Williams
On Tue, 2015-02-17 at 14:02 +0100, Sebastian Andrzej Siewior wrote:
> * Mike Galbraith | 2014-11-02 08:31:37 [+0100]:
>
> >Shrug. Lots of hobbyists have a beast in their basement, right?
>
> I can take this as is if you want.
Or you can trash it since SGI didn't seem to care. This patch was born
while getting 2.6.33-rt running on a UV100 box, I've been carrying it
ever since.
-Mike
* 3.14.23-rt20 - softirq: resurrect softirq threads
[not found] ` <1414910967.5380.81.camel@marge.simpson.net>
` (5 preceding siblings ...)
2014-11-02 7:31 ` 3.14.23-rt20 - x86, UV: raw_spinlock conversion Mike Galbraith
@ 2014-11-02 7:31 ` Mike Galbraith
2015-02-17 13:05 ` Sebastian Andrzej Siewior
6 siblings, 1 reply; 29+ messages in thread
From: Mike Galbraith @ 2014-11-02 7:31 UTC (permalink / raw)
To: Steven Rostedt
Cc: LKML, linux-rt-users, Thomas Gleixner, Carsten Emde, John Kacur,
Sebastian Andrzej Siewior, Clark Williams
(sirqs suck, this makes them suck less for some boxen/loads)
Subject: softirq: resurrect softirq threads
From: Mike Galbraith <mgalbraith@suse.de>
Date: Mon Jan 6 08:42:11 CET 2014
Some loads cannot tolerate the jitter induced by all softirqs being processed
at the same priority. Let the user prioritize them again.
Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
---
Documentation/kernel-parameters.txt | 3
include/linux/interrupt.h | 9 -
include/linux/sched.h | 6 +
kernel/sched/cputime.c | 4
kernel/softirq.c | 182 +++++++++++++++++++++++++++++++-----
5 files changed, 173 insertions(+), 31 deletions(-)
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -3187,6 +3187,9 @@ bytes respectively. Such letter suffixes
Force threading of all interrupt handlers except those
marked explicitly IRQF_NO_THREAD.
+ threadsirqs [KNL]
+ Enable or disable threading of all softirqs for -rt.
+
tmem [KNL,XEN]
Enable the Transcendent memory driver if built-in.
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -425,6 +425,7 @@ struct softirq_action
asmlinkage void do_softirq(void);
asmlinkage void __do_softirq(void);
static inline void thread_do_softirq(void) { do_softirq(); }
+#define NR_SOFTIRQ_THREADS 1
#ifdef __ARCH_HAS_DO_SOFTIRQ
void do_softirq_own_stack(void);
#else
@@ -435,6 +436,7 @@ static inline void do_softirq_own_stack(
#endif
#else
extern void thread_do_softirq(void);
+#define NR_SOFTIRQ_THREADS NR_SOFTIRQS
#endif
extern void open_softirq(int nr, void (*action)(struct softirq_action *));
@@ -445,12 +447,7 @@ extern void raise_softirq_irqoff(unsigne
extern void raise_softirq(unsigned int nr);
extern void softirq_check_pending_idle(void);
-DECLARE_PER_CPU(struct task_struct *, ksoftirqd);
-
-static inline struct task_struct *this_cpu_ksoftirqd(void)
-{
- return this_cpu_read(ksoftirqd);
-}
+DECLARE_PER_CPU(struct task_struct * [NR_SOFTIRQ_THREADS], ksoftirqd);
/* Tasklets --- multithreaded analogue of BHs.
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1263,6 +1263,7 @@ struct task_struct {
/* Revert to default priority/policy when forking */
unsigned sched_reset_on_fork:1;
unsigned sched_contributes_to_load:1;
+ unsigned sched_is_softirqd:1;
pid_t pid;
pid_t tgid;
@@ -1678,6 +1679,11 @@ static inline struct pid *task_tgid(stru
return task->group_leader->pids[PIDTYPE_PID].pid;
}
+static inline bool task_is_softirqd(struct task_struct *task)
+{
+ return task->sched_is_softirqd;
+}
+
/*
* Without tasklist or rcu lock it is not safe to dereference
* the result of task_pgrp/task_session even if task == current,
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -68,7 +68,7 @@ void irqtime_account_irq(struct task_str
*/
if (hardirq_count())
__this_cpu_add(cpu_hardirq_time, delta);
- else if (in_serving_softirq() && curr != this_cpu_ksoftirqd())
+ else if (in_serving_softirq() && !task_is_softirqd(curr))
__this_cpu_add(cpu_softirq_time, delta);
irq_time_write_end();
@@ -342,7 +342,7 @@ static void irqtime_account_process_tick
cpustat[CPUTIME_IRQ] += cputime;
} else if (irqtime_account_si_update()) {
cpustat[CPUTIME_SOFTIRQ] += cputime;
- } else if (this_cpu_ksoftirqd() == p) {
+ } else if (task_is_softirqd(p)) {
/*
* ksoftirqd time do not get accounted in cpu_softirq_time.
* So, we have to handle it separately here.
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -56,7 +56,14 @@ EXPORT_SYMBOL(irq_stat);
static struct softirq_action softirq_vec[NR_SOFTIRQS] __cacheline_aligned_in_smp;
-DEFINE_PER_CPU(struct task_struct *, ksoftirqd);
+DEFINE_PER_CPU(struct task_struct * [NR_SOFTIRQ_THREADS], ksoftirqd);
+
+static unsigned int __read_mostly threadsirqs;
+
+static struct task_struct *__this_cpu_ksoftirqd(int nr)
+{
+ return __this_cpu_read(ksoftirqd[nr && threadsirqs ? nr : 0]);
+}
const char * const softirq_to_name[NR_SOFTIRQS] = {
"HI", "TIMER", "NET_TX", "NET_RX", "BLOCK", "BLOCK_IOPOLL",
@@ -161,10 +168,10 @@ static inline void softirq_clr_runner(un
* to the pending events, so lets the scheduler to balance
* the softirq load for us.
*/
-static void wakeup_softirqd(void)
+static void wakeup_softirqd(int nr)
{
/* Interrupts are disabled: no need to stop preemption */
- struct task_struct *tsk = __this_cpu_read(ksoftirqd);
+ struct task_struct *tsk = __this_cpu_ksoftirqd(nr);
if (tsk && tsk->state != TASK_RUNNING)
wake_up_process(tsk);
@@ -413,7 +420,7 @@ asmlinkage void __do_softirq(void)
--max_restart)
goto restart;
- wakeup_softirqd();
+ wakeup_softirqd(0);
}
lockdep_softirq_end(in_hardirq);
@@ -458,7 +465,7 @@ void raise_softirq_irqoff(unsigned int n
* schedule the softirq soon.
*/
if (!in_interrupt())
- wakeup_softirqd();
+ wakeup_softirqd(0);
}
void __raise_softirq_irqoff(unsigned int nr)
@@ -469,8 +476,18 @@ void __raise_softirq_irqoff(unsigned int
static inline void local_bh_disable_nort(void) { local_bh_disable(); }
static inline void _local_bh_enable_nort(void) { _local_bh_enable(); }
-static void ksoftirqd_set_sched_params(unsigned int cpu) { }
-static void ksoftirqd_clr_sched_params(unsigned int cpu, bool online) { }
+static void ksoftirqd_set_sched_params(unsigned int cpu)
+{
+ local_irq_disable();
+ current->sched_is_softirqd = 1;
+ local_irq_enable();
+}
+static void ksoftirqd_clr_sched_params(unsigned int cpu, bool online)
+{
+ local_irq_disable();
+ current->sched_is_softirqd = 0;
+ local_irq_enable();
+}
#else /* !PREEMPT_RT_FULL */
@@ -656,15 +673,15 @@ static void do_raise_softirq_irqoff(unsi
if (!in_irq() && current->softirq_nestcnt)
current->softirqs_raised |= (1U << nr);
- else if (__this_cpu_read(ksoftirqd))
- __this_cpu_read(ksoftirqd)->softirqs_raised |= (1U << nr);
+ else if (__this_cpu_ksoftirqd(nr))
+ __this_cpu_ksoftirqd(nr)->softirqs_raised |= (1U << nr);
}
void __raise_softirq_irqoff(unsigned int nr)
{
do_raise_softirq_irqoff(nr);
if (!in_irq() && !current->softirq_nestcnt)
- wakeup_softirqd();
+ wakeup_softirqd(nr);
}
/*
@@ -691,7 +708,7 @@ void raise_softirq_irqoff(unsigned int n
*
*/
if (!current->softirq_nestcnt)
- wakeup_softirqd();
+ wakeup_softirqd(nr);
}
static inline int ksoftirqd_softirq_pending(void)
@@ -709,6 +726,7 @@ static inline void ksoftirqd_set_sched_p
sched_setscheduler(current, SCHED_FIFO, &param);
/* Take over all pending softirqs when starting */
local_irq_disable();
+ current->sched_is_softirqd = 1;
current->softirqs_raised = local_softirq_pending();
local_irq_enable();
}
@@ -717,9 +735,26 @@ static inline void ksoftirqd_clr_sched_p
{
struct sched_param param = { .sched_priority = 0 };
+ local_irq_disable();
+ current->sched_is_softirqd = 0;
+ current->softirqs_raised = 0;
+ local_irq_enable();
sched_setscheduler(current, SCHED_NORMAL, &param);
}
+static int __init threadsoftirqs(char *str)
+{
+ int thread = 0;
+
+ if (!get_option(&str, &thread))
+ thread = 1;
+
+ threadsirqs = !!thread;
+
+ return 0;
+}
+
+early_param("threadsirqs", threadsoftirqs);
#endif /* PREEMPT_RT_FULL */
/*
* Enter an interrupt context.
@@ -760,15 +795,25 @@ static inline void invoke_softirq(void)
do_softirq_own_stack();
#endif
} else {
- wakeup_softirqd();
+ wakeup_softirqd(0);
}
#else /* PREEMPT_RT_FULL */
+ struct task_struct *tsk;
unsigned long flags;
+ u32 pending, nr;
local_irq_save(flags);
- if (__this_cpu_read(ksoftirqd) &&
- __this_cpu_read(ksoftirqd)->softirqs_raised)
- wakeup_softirqd();
+ pending = local_softirq_pending();
+
+ while (pending) {
+ nr = __ffs(pending);
+ tsk = __this_cpu_ksoftirqd(nr);
+ if (tsk && tsk->softirqs_raised)
+ wakeup_softirqd(nr);
+ if (!threadsirqs)
+ break;
+ pending &= ~(1U << nr);
+ }
local_irq_restore(flags);
#endif
}
@@ -1201,20 +1246,111 @@ static struct notifier_block cpu_nfb = {
.notifier_call = cpu_callback
};
-static struct smp_hotplug_thread softirq_threads = {
- .store = &ksoftirqd,
- .setup = ksoftirqd_set_sched_params,
- .cleanup = ksoftirqd_clr_sched_params,
- .thread_should_run = ksoftirqd_should_run,
- .thread_fn = run_ksoftirqd,
- .thread_comm = "ksoftirqd/%u",
+static struct smp_hotplug_thread softirq_threads[] = {
+ {
+ .store = &ksoftirqd[0],
+ .setup = ksoftirqd_set_sched_params,
+ .cleanup = ksoftirqd_clr_sched_params,
+ .thread_should_run = ksoftirqd_should_run,
+ .thread_fn = run_ksoftirqd,
+ .thread_comm = "ksoftirqd/%u",
+ },
+#ifdef CONFIG_PREEMPT_RT_FULL
+ {
+ .store = &ksoftirqd[HI_SOFTIRQ],
+ .setup = ksoftirqd_set_sched_params,
+ .cleanup = ksoftirqd_clr_sched_params,
+ .thread_should_run = ksoftirqd_should_run,
+ .thread_fn = run_ksoftirqd,
+ .thread_comm = "sirq-high/%u",
+ },
+ {
+ .store = &ksoftirqd[TIMER_SOFTIRQ],
+ .setup = ksoftirqd_set_sched_params,
+ .cleanup = ksoftirqd_clr_sched_params,
+ .thread_should_run = ksoftirqd_should_run,
+ .thread_fn = run_ksoftirqd,
+ .thread_comm = "sirq-timer/%u",
+ },
+ {
+ .store = &ksoftirqd[NET_TX_SOFTIRQ],
+ .setup = ksoftirqd_set_sched_params,
+ .cleanup = ksoftirqd_clr_sched_params,
+ .thread_should_run = ksoftirqd_should_run,
+ .thread_fn = run_ksoftirqd,
+ .thread_comm = "sirq-net-tx/%u",
+ },
+ {
+ .store = &ksoftirqd[NET_RX_SOFTIRQ],
+ .setup = ksoftirqd_set_sched_params,
+ .cleanup = ksoftirqd_clr_sched_params,
+ .thread_should_run = ksoftirqd_should_run,
+ .thread_fn = run_ksoftirqd,
+ .thread_comm = "sirq-net-rx/%u",
+ },
+ {
+ .store = &ksoftirqd[BLOCK_SOFTIRQ],
+ .setup = ksoftirqd_set_sched_params,
+ .cleanup = ksoftirqd_clr_sched_params,
+ .thread_should_run = ksoftirqd_should_run,
+ .thread_fn = run_ksoftirqd,
+ .thread_comm = "sirq-blk/%u",
+ },
+ {
+ .store = &ksoftirqd[BLOCK_IOPOLL_SOFTIRQ],
+ .setup = ksoftirqd_set_sched_params,
+ .cleanup = ksoftirqd_clr_sched_params,
+ .thread_should_run = ksoftirqd_should_run,
+ .thread_fn = run_ksoftirqd,
+ .thread_comm = "sirq-blk-pol/%u",
+ },
+ {
+ .store = &ksoftirqd[TASKLET_SOFTIRQ],
+ .setup = ksoftirqd_set_sched_params,
+ .cleanup = ksoftirqd_clr_sched_params,
+ .thread_should_run = ksoftirqd_should_run,
+ .thread_fn = run_ksoftirqd,
+ .thread_comm = "sirq-tasklet/%u",
+ },
+ {
+ .store = &ksoftirqd[SCHED_SOFTIRQ],
+ .setup = ksoftirqd_set_sched_params,
+ .cleanup = ksoftirqd_clr_sched_params,
+ .thread_should_run = ksoftirqd_should_run,
+ .thread_fn = run_ksoftirqd,
+ .thread_comm = "sirq-sched/%u",
+ },
+ {
+ .store = &ksoftirqd[HRTIMER_SOFTIRQ],
+ .setup = ksoftirqd_set_sched_params,
+ .cleanup = ksoftirqd_clr_sched_params,
+ .thread_should_run = ksoftirqd_should_run,
+ .thread_fn = run_ksoftirqd,
+ .thread_comm = "sirq-hrtimer/%u",
+ },
+ {
+ .store = &ksoftirqd[RCU_SOFTIRQ],
+ .setup = ksoftirqd_set_sched_params,
+ .cleanup = ksoftirqd_clr_sched_params,
+ .thread_should_run = ksoftirqd_should_run,
+ .thread_fn = run_ksoftirqd,
+ .thread_comm = "sirq-rcu/%u",
+ },
+#endif
};
static __init int spawn_ksoftirqd(void)
{
+ struct smp_hotplug_thread *t = &softirq_threads[threadsirqs];
+ int i, threads = NR_SOFTIRQ_THREADS;
+
register_cpu_notifier(&cpu_nfb);
- BUG_ON(smpboot_register_percpu_thread(&softirq_threads));
+ for (i = 0; i < threads; i++, t++) {
+ BUG_ON(smpboot_register_percpu_thread(t));
+ if (!threadsirqs)
+ break;
+ }
return 0;
}
* Re: 3.14.23-rt20 - softirq: resurrect softirq threads
2014-11-02 7:31 ` 3.14.23-rt20 - softirq: resurrect softirq threads Mike Galbraith
@ 2015-02-17 13:05 ` Sebastian Andrzej Siewior
2015-02-17 14:00 ` Mike Galbraith
0 siblings, 1 reply; 29+ messages in thread
From: Sebastian Andrzej Siewior @ 2015-02-17 13:05 UTC (permalink / raw)
To: Mike Galbraith
Cc: Steven Rostedt, LKML, linux-rt-users, Thomas Gleixner,
Carsten Emde, John Kacur, Clark Williams
* Mike Galbraith | 2014-11-02 08:31:47 [+0100]:
>(sirqs suck, this makes them suck less for some boxen/loads)
>
>Subject: softirq: resurrect softirq threads
>From: Mike Galbraith <mgalbraith@suse.de>
>Date: Mon Jan 6 08:42:11 CET 2014
>
>Some loads cannot tolerate the jitter induced by all softirqs being processed
>at the same priority. Let the user prioritize them again.
>
>Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
I'm going to postpone this. While it might make sense in general I'm
going to wait for tglx, Steven and others to see if this is what we want
or if there are some plans redoing the softirq handling.
Sebastian
* Re: 3.14.23-rt20 - softirq: resurrect softirq threads
2015-02-17 13:05 ` Sebastian Andrzej Siewior
@ 2015-02-17 14:00 ` Mike Galbraith
0 siblings, 0 replies; 29+ messages in thread
From: Mike Galbraith @ 2015-02-17 14:00 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Steven Rostedt, LKML, linux-rt-users, Thomas Gleixner,
Carsten Emde, John Kacur, Clark Williams
On Tue, 2015-02-17 at 14:05 +0100, Sebastian Andrzej Siewior wrote:
> * Mike Galbraith | 2014-11-02 08:31:47 [+0100]:
>
> >(sirqs suck, this makes them suck less for some boxen/loads)
> >
> >Subject: softirq: resurrect softirq threads
> >From: Mike Galbraith <mgalbraith@suse.de>
> >Date: Mon Jan 6 08:42:11 CET 2014
> >
> >Some loads cannot tolerate the jitter induced by all softirqs being processed
> >at the same priority. Let the user prioritize them again.
> >
> >Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
>
> I'm going to postpone this. While it might make sense in general I'm
> going to wait for tglx, Steven and others to see if this is what we want
> or if there are some plans redoing the softirq handling.
You can postpone this one forever, they don't want it. I posted it for
folks who may want the option.
-Mike