* Re: schedule under irqs_disabled in SLUB problem [not found] <CADF-jezvVP2O++FR2KiRSSSJF7oObjy8LSP3-yj1HCmxyTzB_Q@mail.gmail.com> @ 2017-11-02 16:50 ` Sebastian Andrzej Siewior 2017-11-02 20:55 ` Grygorii Strashko [not found] ` <CADF-jexLs9vRuiuoRmcA+0L6Mp-XxW75okheWV+ipGf1b_Ua1w@mail.gmail.com> 0 siblings, 2 replies; 19+ messages in thread From: Sebastian Andrzej Siewior @ 2017-11-02 16:50 UTC (permalink / raw) To: Pavel V. Panteleev; +Cc: linux-rt-users, tglx On 2017-11-01 12:41:33 [+0300], Pavel V. Panteleev wrote: > Hello! I have a problem on kernel 3.14.79-rt85 and I don't know, if it's > solved on current kernel or, may be, it's our additions problem. Could you > help me, please? will do my best. > Kernel is broken in migrate_enable(): > WARN_ON_ONCE(p->migrate_disable <= 0); > > I have a trace, where I see, that it happened, because few > migrate_disable() were called under irqs_disabled (migrate_disable counter > wasn't changed). But after schedule() call irqs were enabled and the > following migrate_disable() and migrate_enable() changed migrate_disable > counter. I suppose, that problem is schedule() call under irqs_disabled. that is basically what happens if the counter goes negative. … > As I see, in allocate_slab() kernel could be under irqs_disabled. And irqs > would be enabled in case of SYSTEM_RUNNING (why only in case of > SYSTEM_RUNNING?). But in our case system isn't running yet (it's always > before /sbin/init), so irqs wouldn't be enabled: ideally the interrupts would be always enabled because they have to on -RT but we can't do so early in the boot process. > enableirqs = (flags & __GFP_WAIT) != 0; > #ifdef CONFIG_PREEMPT_RT_FULL > enableirqs |= system_state == SYSTEM_RUNNING; > #endif > if (enableirqs) > local_irq_enable(); … > P. S. > Today I reproduced this on 4.9.47-rt37. Try to reproduce on x86. In newer -RT we have if (system_state > SYSTEM_BOOTING) instead but this won't work even in v4.9 because it does not have SYSTEM_SCHEDULING yet. So v4.11/v4.13 should not be affected. If that is true, then something like diff --git a/init/main.c b/init/main.c index 6470deef01c9..362761effeac 100644 --- a/init/main.c +++ b/init/main.c @@ -379,6 +379,7 @@ static void __init setup_command_line(char *command_line) */ static __initdata DECLARE_COMPLETION(kthreadd_done); +extern bool slab_do_irq_on; static noinline void __ref rest_init(void) { @@ -396,6 +397,7 @@ static noinline void __ref rest_init(void) rcu_read_lock(); kthreadd_task = find_task_by_pid_ns(pid, &init_pid_ns); rcu_read_unlock(); + slab_do_irq_on = true; complete(&kthreadd_done); /* diff --git a/mm/slub.c b/mm/slub.c index 67eb368b9314..22ec805130e5 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -1527,6 +1527,7 @@ static inline bool shuffle_freelist(struct kmem_cache *s, struct page *page) return false; } #endif /* CONFIG_SLAB_FREELIST_RANDOM */ +bool slab_do_irq_on; static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node) { @@ -1543,7 +1544,7 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node) if (gfpflags_allow_blocking(flags)) enableirqs = true; #ifdef CONFIG_PREEMPT_RT_FULL - if (system_state == SYSTEM_RUNNING) + if (slab_do_irq_on) enableirqs = true; #endif if (enableirqs) might just work. Can you test this please? Sebastian ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: schedule under irqs_disabled in SLUB problem 2017-11-02 16:50 ` schedule under irqs_disabled in SLUB problem Sebastian Andrzej Siewior @ 2017-11-02 20:55 ` Grygorii Strashko [not found] ` <CADF-jexLs9vRuiuoRmcA+0L6Mp-XxW75okheWV+ipGf1b_Ua1w@mail.gmail.com> 1 sibling, 0 replies; 19+ messages in thread From: Grygorii Strashko @ 2017-11-02 20:55 UTC (permalink / raw) To: Sebastian Andrzej Siewior, Pavel V. Panteleev; +Cc: linux-rt-users, tglx On 11/02/2017 11:50 AM, Sebastian Andrzej Siewior wrote: > On 2017-11-01 12:41:33 [+0300], Pavel V. Panteleev wrote: >> Hello! I have a problem on kernel 3.14.79-rt85 and I don't know, if it's >> solved on current kernel or, may be, it's our additions problem. Could you >> help me, please? > > will do my best. > >> Kernel is broken in migrate_enable(): >> WARN_ON_ONCE(p->migrate_disable <= 0); >> >> I have a trace, where I see, that it happened, because few >> migrate_disable() were called under irqs_disabled (migrate_disable counter >> wasn't changed). But after schedule() call irqs were enabled and the >> following migrate_disable() and migrate_enable() changed migrate_disable >> counter. I suppose, that problem is schedule() call under irqs_disabled. > > that is basically what happens if the counter goes negative. > > … >> As I see, in allocate_slab() kernel could be under irqs_disabled. And irqs >> would be enabled in case of SYSTEM_RUNNING (why only in case of >> SYSTEM_RUNNING?). But in our case system isn't running yet (it's always >> before /sbin/init), so irqs wouldn't be enabled: > > ideally the interrupts would be always enabled because they have to on > -RT but we can't do so early in the boot process. > >> enableirqs = (flags & __GFP_WAIT) != 0; >> #ifdef CONFIG_PREEMPT_RT_FULL >> enableirqs |= system_state == SYSTEM_RUNNING; >> #endif >> if (enableirqs) >> local_irq_enable(); > … >> P. S. >> Today I reproduced this on 4.9.47-rt37. Try to reproduce on x86. > > In newer -RT we have > if (system_state > SYSTEM_BOOTING) > instead but this won't work even in v4.9 because it does not have > SYSTEM_SCHEDULING yet. So v4.11/v4.13 should not be affected. > > If that is true, then something like > > diff --git a/init/main.c b/init/main.c > index 6470deef01c9..362761effeac 100644 > --- a/init/main.c > +++ b/init/main.c > @@ -379,6 +379,7 @@ static void __init setup_command_line(char *command_line) > */ > > static __initdata DECLARE_COMPLETION(kthreadd_done); > +extern bool slab_do_irq_on; > > static noinline void __ref rest_init(void) > { > @@ -396,6 +397,7 @@ static noinline void __ref rest_init(void) > rcu_read_lock(); > kthreadd_task = find_task_by_pid_ns(pid, &init_pid_ns); > rcu_read_unlock(); > + slab_do_irq_on = true; > complete(&kthreadd_done); > > /* > diff --git a/mm/slub.c b/mm/slub.c > index 67eb368b9314..22ec805130e5 100644 > --- a/mm/slub.c > +++ b/mm/slub.c > @@ -1527,6 +1527,7 @@ static inline bool shuffle_freelist(struct kmem_cache *s, struct page *page) > return false; > } > #endif /* CONFIG_SLAB_FREELIST_RANDOM */ > +bool slab_do_irq_on; > > static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node) > { > @@ -1543,7 +1544,7 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node) > if (gfpflags_allow_blocking(flags)) > enableirqs = true; > #ifdef CONFIG_PREEMPT_RT_FULL > - if (system_state == SYSTEM_RUNNING) > + if (slab_do_irq_on) > enableirqs = true; > #endif > if (enableirqs) > > might just work. Can you test this please? I've used to see the same and created patch for 4.9 (should work for 4.1) which does mostly the same things, but it also enables __might_sleep() during boot. Of course, it's possible that I've selected wrong place to set SYSTEM_BOOTING_LATE and it's more correct to do it in rest_init(). NOTE: As Sebastian mentioned below patch is not relevant for newer Kernel versions. >From a9600a6a18ffe285bccd299985ad57450e4aa1b4 Mon Sep 17 00:00:00 2001 From: Grygorii Strashko <grygorii.strashko@ti.com> Date: Fri, 9 Oct 2015 08:52:16 -0500 Subject: [DEBUG PATCH] core: introduce SYSTEM_BOOTING_LATE state This patch introduces new SYSTEM_BOOTING_LATE system state which will be set right after scheduler is fully operational during the boot. And reuse this state in update ___might_sleep() to allow catching "BUG: sleeping function called from invalid context" early during the boot which is very helping for debugging. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> --- include/linux/kernel.h | 1 + init/main.c | 1 + kernel/sched/core.c | 4 +++- mm/slub.c | 3 +-- 4 files changed, 6 insertions(+), 3 deletions(-) diff --git a/include/linux/kernel.h b/include/linux/kernel.h index 7894d55..5dfed24 100644 --- a/include/linux/kernel.h +++ b/include/linux/kernel.h @@ -488,6 +488,7 @@ extern bool early_boot_irqs_disabled; /* Values used for system_state */ extern enum system_states { SYSTEM_BOOTING, + SYSTEM_BOOTING_LATE, SYSTEM_RUNNING, SYSTEM_HALT, SYSTEM_POWER_OFF, diff --git a/init/main.c b/init/main.c index a4a61e7..e5e5906 100644 --- a/init/main.c +++ b/init/main.c @@ -989,6 +989,7 @@ static noinline void __init kernel_init_freeable(void) /* Now the scheduler is fully set up and can do blocking allocations */ gfp_allowed_mask = __GFP_BITS_MASK; + system_state = SYSTEM_BOOTING_LATE; /* * init can allocate pages on any node diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 127aaf4..04c1805 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -7977,7 +7977,9 @@ void ___might_sleep(const char *file, int line, int preempt_offset) rcu_sleep_check(); /* WARN_ON_ONCE() by default, no rate limit reqd. */ if ((preempt_count_equals(preempt_offset) && !irqs_disabled() && !is_idle_task(current)) || - (system_state != SYSTEM_RUNNING && system_state != SYSTEM_SUSPEND) || oops_in_progress) + (system_state != SYSTEM_RUNNING && + system_state != SYSTEM_SUSPEND && + system_state != SYSTEM_BOOTING_LATE) || oops_in_progress) return; if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy) return; diff --git a/mm/slub.c b/mm/slub.c index 6d72b7f..993c89b 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -1540,8 +1540,7 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node) if (gfpflags_allow_blocking(flags)) enableirqs = true; #ifdef CONFIG_PREEMPT_RT_FULL - if (system_state == SYSTEM_RUNNING) - enableirqs = true; + enableirqs = system_state == SYSTEM_RUNNING || system_state == SYSTEM_BOOTING_LATE; #endif if (enableirqs) local_irq_enable(); -- 2.10.5 -- regards, -grygorii ^ permalink raw reply related [flat|nested] 19+ messages in thread
[parent not found: <CADF-jexLs9vRuiuoRmcA+0L6Mp-XxW75okheWV+ipGf1b_Ua1w@mail.gmail.com>]
* Re: schedule under irqs_disabled in SLUB problem [not found] ` <CADF-jexLs9vRuiuoRmcA+0L6Mp-XxW75okheWV+ipGf1b_Ua1w@mail.gmail.com> @ 2017-11-03 10:23 ` Pavel V. Panteleev 2017-11-07 9:00 ` Pavel V. Panteleev ` (2 more replies) 0 siblings, 3 replies; 19+ messages in thread From: Pavel V. Panteleev @ 2017-11-03 10:23 UTC (permalink / raw) To: linux-rt-users I'll try Sebastian's patch on the next week. Thanks! ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: schedule under irqs_disabled in SLUB problem 2017-11-03 10:23 ` Pavel V. Panteleev @ 2017-11-07 9:00 ` Pavel V. Panteleev 2017-11-07 9:14 ` Pavel V. Panteleev 2017-11-07 9:47 ` Pavel V. Panteleev 2 siblings, 0 replies; 19+ messages in thread From: Pavel V. Panteleev @ 2017-11-07 9:00 UTC (permalink / raw) To: linux-rt-users Thanks, it works. On 03.11.2017 13:23, Pavel V. Panteleev wrote: > I'll try Sebastian's patch on the next week. Thanks! ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: schedule under irqs_disabled in SLUB problem 2017-11-03 10:23 ` Pavel V. Panteleev 2017-11-07 9:00 ` Pavel V. Panteleev @ 2017-11-07 9:14 ` Pavel V. Panteleev 2017-11-07 9:47 ` Pavel V. Panteleev 2 siblings, 0 replies; 19+ messages in thread From: Pavel V. Panteleev @ 2017-11-07 9:14 UTC (permalink / raw) To: linux-rt-users Thanks, it works. On 03.11.2017 13:23, Pavel V. Panteleev wrote: > I'll try Sebastian's patch on the next week. Thanks! ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: schedule under irqs_disabled in SLUB problem 2017-11-03 10:23 ` Pavel V. Panteleev 2017-11-07 9:00 ` Pavel V. Panteleev 2017-11-07 9:14 ` Pavel V. Panteleev @ 2017-11-07 9:47 ` Pavel V. Panteleev 2017-11-16 16:08 ` Sebastian Andrzej Siewior 2 siblings, 1 reply; 19+ messages in thread From: Pavel V. Panteleev @ 2017-11-07 9:47 UTC (permalink / raw) To: linux-rt-users Thanks, it works. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: schedule under irqs_disabled in SLUB problem 2017-11-07 9:47 ` Pavel V. Panteleev @ 2017-11-16 16:08 ` Sebastian Andrzej Siewior 2017-11-16 16:39 ` Pavel V. Panteleev 2017-11-17 17:38 ` Julia Cartwright 0 siblings, 2 replies; 19+ messages in thread From: Sebastian Andrzej Siewior @ 2017-11-16 16:08 UTC (permalink / raw) To: Pavel V. Panteleev; +Cc: linux-rt-users, Julia Cartwright, Steven Rostedt + Steven & Julia On 2017-11-07 12:47:27 [+0300], Pavel V. Panteleev wrote: > Thanks, it works. Okay, good to hear. Steven + Julia: We need to decide what are going to do about this stable-wise. The bug was reported against 3.14.79-rt85 and the devel tree is not affected*. The thread starts at https://www.spinics.net/lists/linux-rt-users/msg17560.html [*] probably since v4.11.5-rt1 once "sched/core: Enable might_sleep() and smp_processor_id() checks early" and its prerequisites appeared in the queue. Sebastian ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: schedule under irqs_disabled in SLUB problem 2017-11-16 16:08 ` Sebastian Andrzej Siewior @ 2017-11-16 16:39 ` Pavel V. Panteleev 2017-11-17 17:38 ` Julia Cartwright 1 sibling, 0 replies; 19+ messages in thread From: Pavel V. Panteleev @ 2017-11-16 16:39 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: linux-rt-users, Julia Cartwright, Steven Rostedt On 16.11.2017 19:08, Sebastian Andrzej Siewior wrote: > + Steven & Julia > > On 2017-11-07 12:47:27 [+0300], Pavel V. Panteleev wrote: >> Thanks, it works. > Okay, good to hear. > > Steven + Julia: > We need to decide what are going to do about this stable-wise. The bug > was reported against 3.14.79-rt85 and the devel tree is not affected*. > The thread starts at > https://www.spinics.net/lists/linux-rt-users/msg17560.html > > [*] probably since v4.11.5-rt1 once > "sched/core: Enable might_sleep() and smp_processor_id() checks > early" and its prerequisites appeared in the queue. > > Sebastian > > kernel 4.9 has the same problem. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: schedule under irqs_disabled in SLUB problem 2017-11-16 16:08 ` Sebastian Andrzej Siewior 2017-11-16 16:39 ` Pavel V. Panteleev @ 2017-11-17 17:38 ` Julia Cartwright 2017-11-24 6:39 ` Sam Kappen 2017-11-24 9:35 ` [PATCH] mm/slub: enable IRQs once scheduling is working Sebastian Andrzej Siewior 1 sibling, 2 replies; 19+ messages in thread From: Julia Cartwright @ 2017-11-17 17:38 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: Pavel V. Panteleev, linux-rt-users, Steven Rostedt On Thu, Nov 16, 2017 at 05:08:37PM +0100, Sebastian Andrzej Siewior wrote: > + Steven & Julia > > On 2017-11-07 12:47:27 [+0300], Pavel V. Panteleev wrote: > > Thanks, it works. > > Okay, good to hear. > > Steven + Julia: > We need to decide what are going to do about this stable-wise. The bug > was reported against 3.14.79-rt85 and the devel tree is not affected*. > The thread starts at > https://www.spinics.net/lists/linux-rt-users/msg17560.html Your proposed patch seems reasonable to me to pull back into the relevant releases. Can you send a proper patch against the latest affected tree (4.9?) and the stable team will pull it back? It looks like it will need some minor massaging on it's way back, but that shouldn't be a problem. Thanks, Julia ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: schedule under irqs_disabled in SLUB problem 2017-11-17 17:38 ` Julia Cartwright @ 2017-11-24 6:39 ` Sam Kappen 2017-11-24 9:37 ` Sebastian Andrzej Siewior 2017-11-24 9:35 ` [PATCH] mm/slub: enable IRQs once scheduling is working Sebastian Andrzej Siewior 1 sibling, 1 reply; 19+ messages in thread From: Sam Kappen @ 2017-11-24 6:39 UTC (permalink / raw) To: linux-rt-users Hi, I am also faces a similar kind of issue on X86 target, while testing 3.10.105-rt119. The issue is seen during boot-up when USB/SCSI enumeration starts. Below is the log from my console scsi 0:0:0:0: Direct-Access Linux scsi_debug 0004 PQ: 0 ANSI: 5 ------------[ cut here ]------------ ------------[ cut here ]------------ WARNING: at kernel/sched/core.c:3052 migrate_disable+0xee/0x100() Modules linked in: CPU: 3 PID: 7 Comm: kworker/u16:0 Not tainted 3.10.107-rt120+ #2 Hardware name: Intel Corporation S1200RP_SE/S1200RP_SE, BIOS S1200RP.86B.02.02.0005.102320140911 10/23/2014 Workqueue: events_unbound async_run_entry_fn 0000000000000000 ffff880244927338 ffffffff8168b2f0 0000000000000000 0000000000000009 ffff880244927370 ffffffff8105ef8c ffff8802448fb540 0000000000000025 0000000000000004 0000000000000025 ffffffff81d9810c Call Trace: [<ffffffff8168b2f0>] dump_stack+0x4f/0x65 [<ffffffff8105ef8c>] warn_slowpath_common+0x5c/0xa0 [<ffffffff8105f085>] warn_slowpath_null+0x15/0x20 [<ffffffff8109355e>] migrate_disable+0xee/0x100 [<ffffffff810600af>] call_console_drivers.constprop.14+0x4f/0xd0 [<ffffffff81061241>] console_unlock+0x2a1/0x470 [<ffffffff810616e2>] vprintk_emit+0x2d2/0x550 [<ffffffff8168eb49>] ? _raw_spin_unlock_irqrestore+0x19/0x50 [<ffffffff810936ce>] ? migrate_enable+0x15e/0x1f0 [<ffffffff816892d3>] printk+0x4a/0x52 [<ffffffff810936ce>] ? migrate_enable+0x15e/0x1f0 [<ffffffff8105ef5a>] warn_slowpath_common+0x2a/0xa0 [<ffffffff8105f085>] warn_slowpath_null+0x15/0x20 [<ffffffff810936ce>] migrate_enable+0x15e/0x1f0 [<ffffffff810fce40>] get_page_from_freelist+0x630/0xb90 [<ffffffff8168e32a>] ? rt_spin_lock_slowlock+0x2ca/0x310 [<ffffffff810fe36d>] __alloc_pages_nodemask+0x13d/0x9e0 [<ffffffff810fce72>] ? get_page_from_freelist+0x662/0xb90 [<ffffffff81133dd0>] alloc_pages_current+0xb0/0x150 [<ffffffff81138e05>] new_slab+0x2b5/0x380 [<ffffffff8113b67a>] __slab_alloc.isra.18+0x58a/0x670 [<ffffffff813d3f40>] ? scsi_pool_alloc_command+0x20/0x70 [<ffffffff81133dd0>] ? alloc_pages_current+0xb0/0x150 [<ffffffff8113b956>] kmem_cache_alloc+0xd6/0x100 [<ffffffff813d3f40>] ? scsi_pool_alloc_command+0x20/0x70 [<ffffffff813d3f40>] scsi_pool_alloc_command+0x20/0x70 [<ffffffff813d492e>] scsi_host_alloc_command.isra.1+0x1e/0x80 [<ffffffff813d49b0>] __scsi_get_command+0x20/0xc0 [<ffffffff813d4a83>] scsi_get_command+0x33/0xc0 [<ffffffff813dad1a>] scsi_get_cmd_from_req+0x4a/0x60 [<ffffffff813db6cb>] scsi_setup_blk_pc_cmnd+0x2b/0xf0 [<ffffffff813db8fc>] scsi_prep_fn+0x3c/0x50 [<ffffffff812c9ef3>] blk_peek_request+0xf3/0x1c0 [<ffffffff813db960>] scsi_request_fn+0x50/0x570 [<ffffffff812c6c6e>] __blk_run_queue+0x2e/0x40 [<ffffffff812cdde0>] blk_execute_rq_nowait+0x70/0x100 [<ffffffff812cdef8>] blk_execute_rq+0x88/0xe0 sd 0:0:0:0: Attached scsi generic sg0 type 0 [<ffffffff812ca040>] ? blk_rq_bio_prep+0x60/0xc0 [<ffffffff812cdcf0>] ? blk_rq_map_kern+0xf0/0x170 [<ffffffff812c86c0>] ? blk_get_request+0x60/0xe0 [<ffffffff813da050>] scsi_execute+0xf0/0x150 [<ffffffff813da182>] scsi_execute_req_flags+0x82/0xf0 [<ffffffff8145d87f>] read_capacity_16+0xcf/0x520 [<ffffffff8145e060>] sd_revalidate_disk+0x350/0x1bd0 [<ffffffff8145f9a4>] sd_probe_async+0xc4/0x1d0 [<ffffffff8108e7c2>] async_run_entry_fn+0x32/0x130 [<ffffffff8107f5a5>] process_one_work+0x145/0x420 [<ffffffff81080903>] worker_thread+0x163/0x470 [<ffffffff8168d91c>] ? preempt_schedule+0x4c/0x70 [<ffffffff810807a0>] ? manage_workers.isra.7+0x2d0/0x2d0 [<ffffffff8108735f>] kthread+0xbf/0xd0 [<ffffffff810872a0>] ? kthread_worker_fn+0x1a0/0x1a0 [<ffffffff8168f6be>] ret_from_fork+0x4e/0x80 [<ffffffff810872a0>] ? kthread_worker_fn+0x1a0/0x1a0 ---[ end trace 0000000000000001 ]--- ------------[ cut here ]------------ WARNING: at kernel/sched/core.c:3087 migrate_enable+0x15e/0x1f0() Modules linked in: CPU: 3 PID: 7 Comm: kworker/u16:0 Tainted: G W 3.10.1 Test case to reproduce: 1. Enable PXE boot and mount file-system on USB stick 2. Continuously reboot the system with USB stick connected 3. We generally see the issue after every 3 to 5 hours. On looking at the issue it is identified that there is some piece of code someplace that calls migrate_disable() with interrupts off, enables interrupts, then calls migrate_enable(). On instrumentation it is observed that for some SCSI layer calls(calls from get_requests) the above condition is not evaluated to true hence reaches at buffered_rmqueue with irqs in disabled state. >From the below call chain buffered_rmqueue-> local_spin_lock_irqsave -> local_lock_irqsave -> spin_lock ->rt_spin_lock -> rt_spin_lock_fastlock -> rt_spin_lock_slowlock In a normal case, when it enters rt_spin_lock_slowlock with irqs_disabled, the same is returned in below case, if (__try_to_take_rt_mutex(lock, self, NULL, STEAL_LATERAL)) { raw_spin_unlock(&lock->wait_lock); return; } But in the some case above condition is meet true and the control reaches below in same function, pi_lock(&self->pi_lock); self->saved_state = self->state; __set_current_state(TASK_UNINTERRUPTIBLE); pi_unlock(&self->pi_lock); pi_lock & pi_unlock disables and enables the irqs respectively, so in this special case the irq state is not retained while exiting rt_spin_lock_slowlock function and this results in the crash! Could you please help to resolve the issue. Regards, Sam On Fri, Nov 17, 2017 at 11:08 PM, Julia Cartwright <julia@ni.com> wrote: > On Thu, Nov 16, 2017 at 05:08:37PM +0100, Sebastian Andrzej Siewior wrote: >> + Steven & Julia >> >> On 2017-11-07 12:47:27 [+0300], Pavel V. Panteleev wrote: >> > Thanks, it works. >> >> Okay, good to hear. >> >> Steven + Julia: >> We need to decide what are going to do about this stable-wise. The bug >> was reported against 3.14.79-rt85 and the devel tree is not affected*. >> The thread starts at >> https://www.spinics.net/lists/linux-rt-users/msg17560.html > > Your proposed patch seems reasonable to me to pull back into the > relevant releases. Can you send a proper patch against the latest > affected tree (4.9?) and the stable team will pull it back? It looks > like it will need some minor massaging on it's way back, but that > shouldn't be a problem. > > Thanks, > Julia > -- > To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: schedule under irqs_disabled in SLUB problem 2017-11-24 6:39 ` Sam Kappen @ 2017-11-24 9:37 ` Sebastian Andrzej Siewior 2017-11-27 6:46 ` Sam Kappen 0 siblings, 1 reply; 19+ messages in thread From: Sebastian Andrzej Siewior @ 2017-11-24 9:37 UTC (permalink / raw) To: Sam Kappen; +Cc: linux-rt-users On 2017-11-24 12:09:16 [+0530], Sam Kappen wrote: > Hi, Hi, > I am also faces a similar kind of issue on X86 target, while testing > 3.10.105-rt119. > The issue is seen during boot-up when USB/SCSI enumeration starts. > > Below is the log from my console Can you try if the patch I posted solves that? From the callchain it looks like the same thing. Sebastian ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: schedule under irqs_disabled in SLUB problem 2017-11-24 9:37 ` Sebastian Andrzej Siewior @ 2017-11-27 6:46 ` Sam Kappen 2017-12-04 9:59 ` Sebastian Andrzej Siewior 0 siblings, 1 reply; 19+ messages in thread From: Sam Kappen @ 2017-11-27 6:46 UTC (permalink / raw) To: Sebastian Andrzej Siewior; +Cc: linux-rt-users Hi, Many thanks for your kind response. I will put it for long run test and update. Could you please look at my below queries? 1.) I had derived and tried a patch based on the below analysis. ( I referred below open source commit, to derive on this patch. https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v4.9.47-rt37-rebase&id=7a347757f027190c95a363a491c18156a926a370 ) In some cases pi_lock in rt_spin_lock_slowlock does not retain the irqs state while exiting function, this causes issue in migrate_disable() + enable as they are not symmetrical in regard to the status of interrupts. To fix pi_lock & pi_unlock in rt_spin_lock_slowlock, it has been modified to retain irq state by using raw_spin_lock and raw_spin_unlock and also modified wait_lock in rt_spin_lock_slowlock with raw_spin_lock_irqsave & *_restore. kernel/rtmutex.c | 25 ++++++++++++------------- 1 file changed, 12 insertions(+), 13 deletions(-) diff --git a/kernel/rtmutex.c b/kernel/rtmutex.c index 7cf4b8b..9c67d80 100644 --- a/kernel/rtmutex.c +++ b/kernel/rtmutex.c @@ -1191,8 +1191,6 @@ static int adaptive_wait(struct rt_mutex *lock, } #endif -# define pi_lock(lock) raw_spin_lock_irq(lock) -# define pi_unlock(lock) raw_spin_unlock_irq(lock) /* * Slow path lock function spin_lock style: this variant is very @@ -1206,14 +1204,15 @@ static void noinline __sched rt_spin_lock_slowlock(struct rt_mutex *lock) struct task_struct *lock_owner, *self = current; struct rt_mutex_waiter waiter, *top_waiter; int ret; + unsigned long flags; rt_mutex_init_waiter(&waiter, true); - raw_spin_lock(&lock->wait_lock); + raw_spin_lock_irqsave(&lock->wait_lock, flags); init_lists(lock); if (__try_to_take_rt_mutex(lock, self, NULL, STEAL_LATERAL)) { - raw_spin_unlock(&lock->wait_lock); + raw_spin_unlock_irqrestore(&lock->wait_lock, flags); return; } @@ -1225,10 +1224,10 @@ static void noinline __sched rt_spin_lock_slowlock(struct rt_mutex *lock) * as well. We are serialized via pi_lock against wakeups. See * try_to_wake_up(). */ - pi_lock(&self->pi_lock); + raw_spin_lock(&self->pi_lock); self->saved_state = self->state; __set_current_state(TASK_UNINTERRUPTIBLE); - pi_unlock(&self->pi_lock); + raw_spin_unlock(&self->pi_lock); ret = task_blocks_on_rt_mutex(lock, &waiter, self, RT_MUTEX_MIN_CHAINWALK); BUG_ON(ret); @@ -1241,18 +1240,18 @@ static void noinline __sched rt_spin_lock_slowlock(struct rt_mutex *lock) top_waiter = rt_mutex_top_waiter(lock); lock_owner = rt_mutex_owner(lock); - raw_spin_unlock(&lock->wait_lock); + raw_spin_unlock_irqrestore(&lock->wait_lock, flags); debug_rt_mutex_print_deadlock(&waiter); if (top_waiter != &waiter || adaptive_wait(lock, lock_owner)) schedule_rt_mutex(lock); - raw_spin_lock(&lock->wait_lock); + raw_spin_lock_irqsave(&lock->wait_lock, flags); - pi_lock(&self->pi_lock); + raw_spin_lock(&self->pi_lock); __set_current_state(TASK_UNINTERRUPTIBLE); - pi_unlock(&self->pi_lock); + raw_spin_unlock(&self->pi_lock); } /* @@ -1262,10 +1261,10 @@ static void noinline __sched rt_spin_lock_slowlock(struct rt_mutex *lock) * happened while we were blocked. Clear saved_state so * try_to_wakeup() does not get confused. */ - pi_lock(&self->pi_lock); + raw_spin_lock(&self->pi_lock); __set_current_state(self->saved_state); self->saved_state = TASK_RUNNING; - pi_unlock(&self->pi_lock); + raw_spin_unlock(&self->pi_lock); /* * try_to_take_rt_mutex() sets the waiter bit @@ -1276,7 +1275,7 @@ static void noinline __sched rt_spin_lock_slowlock(struct rt_mutex *lock) BUG_ON(rt_mutex_has_waiters(lock) && &waiter == rt_mutex_top_waiter(lock)); BUG_ON(!plist_node_empty(&waiter.list_entry)); - raw_spin_unlock(&lock->wait_lock); + raw_spin_unlock_irqrestore(&lock->wait_lock, flags); debug_rt_mutex_free_waiter(&waiter); } -- 2.7.4 We were testing above patch on multiple targets we could experience some stuck issue on some remote target after 2 days. I am not sure what really happens there, may be the issue when try for scheduling with irq in disabled state. The systems I have tested found to be worked 7 days after that I stopped the test. 2.) With your patch during the slab allocations irqs will be in enabled state. So if we enable irqs in early stage will there be any side effects? I am sorry if my question doesn't seem to be logical. Regards, Sam On Fri, Nov 24, 2017 at 3:07 PM, Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote: > On 2017-11-24 12:09:16 [+0530], Sam Kappen wrote: >> Hi, > Hi, > >> I am also faces a similar kind of issue on X86 target, while testing >> 3.10.105-rt119. >> The issue is seen during boot-up when USB/SCSI enumeration starts. >> >> Below is the log from my console > > Can you try if the patch I posted solves that? From the callchain it > looks like the same thing. > > Sebastian ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: schedule under irqs_disabled in SLUB problem 2017-11-27 6:46 ` Sam Kappen @ 2017-12-04 9:59 ` Sebastian Andrzej Siewior 2017-12-05 16:31 ` Sam Kappen 0 siblings, 1 reply; 19+ messages in thread From: Sebastian Andrzej Siewior @ 2017-12-04 9:59 UTC (permalink / raw) To: Sam Kappen; +Cc: linux-rt-users On 2017-11-27 12:16:36 [+0530], Sam Kappen wrote: > Hi, Hi, > 1.) > I had derived and tried a patch based on the below analysis. > ( I referred below open source commit, to derive on this patch. > https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v4.9.47-rt37-rebase&id=7a347757f027190c95a363a491c18156a926a370 > ) > > In some cases pi_lock in rt_spin_lock_slowlock does not retain the > irqs state while exiting function, this causes > issue in migrate_disable() + enable as they are not symmetrical in > regard to the status of interrupts. > To fix pi_lock & pi_unlock in rt_spin_lock_slowlock, it has been > modified to retain irq state by using > raw_spin_lock and raw_spin_unlock and also modified wait_lock in > rt_spin_lock_slowlock with raw_spin_lock_irqsave & *_restore. Can you provide more informations on this? Like a stack strace that shows that this happens and when it happens? It should not happen. … > We were testing above patch on multiple targets we could experience > some stuck issue on some remote target after 2 days. I am not > sure what really happens there, may be the issue when try for > scheduling with irq in disabled state. > The systems I have tested found to be worked 7 days after that I > stopped the test. Which patch? The patch I've sent and ask you for testing or the patch you had in this email? > > 2.) With your patch during the slab allocations irqs will be in enabled state. > So if we enable irqs in early stage will there be any side effects? I > am sorry if my question doesn't seem > to be logical. You must not enable the interrupts too early. At the time of scheduling it is okay. > Regards, > Sam Sebastian ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: schedule under irqs_disabled in SLUB problem 2017-12-04 9:59 ` Sebastian Andrzej Siewior @ 2017-12-05 16:31 ` Sam Kappen 2017-12-12 10:18 ` Sebastian Andrzej Siewior 0 siblings, 1 reply; 19+ messages in thread From: Sam Kappen @ 2017-12-05 16:31 UTC (permalink / raw) To: Sebastian Andrzej Siewior; +Cc: linux-rt-users Hi, Thanks for looking at my queries. Please see my answers inline. 1.) > I had derived and tried a patch based on the below analysis. > ( I referred below open source commit, to derive on this patch. > https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v4.9.47-rt37-rebase&id=7a347757f027190c95a363a491c18156a926a370 > ) > > In some cases pi_lock in rt_spin_lock_slowlock does not retain the > irqs state while exiting function, this causes > issue in migrate_disable() + enable as they are not symmetrical in > regard to the status of interrupts. > To fix pi_lock & pi_unlock in rt_spin_lock_slowlock, it has been > modified to retain irq state by using > raw_spin_lock and raw_spin_unlock and also modified wait_lock in > rt_spin_lock_slowlock with raw_spin_lock_irqsave & *_restore. Can you provide more informations on this? Like a stack strace that shows that this happens and when it happens? It should not happen. As we were experiencing a panic issue with in 3 to 6 hours during the test(Test is continues soft reboot of the system as mentioned in previous mail). With the help of instrument code we have been tracked it down to the function rt_spin_lock_slowlock() in rtmutex.c. We see this issue when there is a state change for irqs from disabled to enabled. During slab allocations for SCSI on bootup the irqs are found to be in disabled state since the system state is not yet in "RUNNING". So we have added instrument code throughout the call trace and confirmed culprit as pi_lock()/pi_unlock for changing the irqs state. Basically it happens when it acquires the lock with irqs in disabled state. I guess below scenario is happens when issue hits. It looks like during normal cases with irqs in disabled state from the function rt_spin_lock_slowlock(), It gets mutex lock in its first try and takes first return path so it need not have to take pi_lock/unlock. But in some special case (error case) mutex lock is not available(I am not sure why this happens? ) and go further retry hence it acquires pi_lock/unlock then into panic. I am providing below some stack traces which we have seen during the test. All relevant debug configs were enabled while testing. scsi 0:0:0:0: Direct-Access Linux scsi_debug 0004 PQ: 0 ANSI: 5 mm/mempolicy.c alloc_pages_current 2067 irq disabled!!! ==> debug print added by me mm/mempolicy.c alloc_pages_current 2067 irq disabled!!! ==> debug print added by me mm/mempolicy.c alloc_pages_current 2067 irq disabled!!! ==> debug print added by me mm/mempolicy.c alloc_pages_current 2067 irq disabled!!! ==> debug print added by me ------------[ cut here ]------------ ------------[ cut here ]------------ WARNING: at kernel/sched/core.c:3052 migrate_disable+0x10b/0x120() Modules linked in: CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 3.10.107-rt120+ #49 Hardware name: To be filled by O.E.M. To be filled by O.E.M./WADE-8078, BIOS R1.00.E0 07/07/2014 Workqueue: events_unbound async_run_entry_fn 0000000000000000 ffff880159ee72e8 ffffffff816b617c 0000000000000000 0000000000000009 ffff880159ee7328 ffffffff8105fc8b ffff880159ee7348 ffff880159ea3540 0000000000000038 0000000000000001 0000000000000004 Call Trace: [<ffffffff816b617c>] dump_stack+0x4f/0x65 [<ffffffff8105fc8b>] warn_slowpath_common+0x6b/0xa0 [<ffffffff8105fcd5>] warn_slowpath_null+0x15/0x20 [<ffffffff8109585b>] migrate_disable+0x10b/0x120 [<ffffffff81060c45>] call_console_drivers.constprop.20+0x65/0x100 [<ffffffff81061da8>] console_unlock+0x398/0x3d0 [<ffffffff81062303>] vprintk_emit+0x2b3/0x500 [<ffffffff810b9526>] ? __try_to_take_rt_mutex+0x146/0x190 [<ffffffff8109569c>] ? migrate_enable+0x14c/0x200 [<ffffffff816b17e5>] printk+0x48/0x4a [<ffffffff8109569c>] ? migrate_enable+0x14c/0x200 [<ffffffff8105fc59>] warn_slowpath_common+0x39/0xa0 [<ffffffff8105fcd5>] warn_slowpath_null+0x15/0x20 [<ffffffff8109569c>] migrate_enable+0x14c/0x200 [<ffffffff81100fb1>] get_page_from_freelist+0x9a1/0xbc0 [<ffffffff812fc3d9>] ? number.isra.1+0x329/0x360 [<ffffffff81101f89>] __alloc_pages_nodemask+0x179/0xa50 [<ffffffff816ba6b3>] ? _raw_spin_unlock+0x13/0x40 [<ffffffff8109de14>] ? update_curr+0xa4/0xf0 [<ffffffff81138ab1>] alloc_pages_current+0x101/0x1f0 [<ffffffff8113cf95>] new_slab+0x265/0x310 [<ffffffff816b386e>] __slab_alloc.isra.62+0x4e0/0x6ca [<ffffffff816ba744>] ? _raw_spin_unlock_irq+0x14/0x40 [<ffffffff810fbd0a>] ? mempool_alloc_slab+0x3a/0x70 [<ffffffff816ba6b3>] ? _raw_spin_unlock+0x13/0x40 [<ffffffff8113f5d0>] kmem_cache_alloc+0x170/0x190 [<ffffffff810fbd0a>] ? mempool_alloc_slab+0x3a/0x70 [<ffffffff810fbd0a>] mempool_alloc_slab+0x3a/0x70 [<ffffffff810fc0be>] mempool_alloc+0xae/0x210 [<ffffffff81063f85>] ? unpin_current_cpu+0x15/0x70 [<ffffffff812d5ce8>] get_request+0x3a8/0x7c0 [<ffffffff81089c30>] ? __init_waitqueue_head+0x60/0x60 [<ffffffff812d619a>] blk_get_request+0x9a/0x140 [<ffffffff8113fb68>] ? kmem_cache_free+0x188/0x1a0 [<ffffffff813ef02a>] scsi_execute+0x4a/0x170 [<ffffffff813ef226>] scsi_execute_req_flags+0xd6/0x190 [<ffffffff81478349>] read_capacity_16+0xb9/0x550 [<ffffffff81478fc8>] sd_revalidate_disk+0x4c8/0x1c90 [<ffffffff8147a855>] sd_probe_async+0xc5/0x1d0 [<ffffffff81090a96>] async_run_entry_fn+0x36/0x130 [<ffffffff81081047>] process_one_work+0x147/0x3d0 [<ffffffff810824e1>] worker_thread+0x161/0x3d0 [<ffffffff81082380>] ? manage_workers.isra.33+0x2f0/0x2f0 [<ffffffff81088f75>] kthread+0xc5/0xd0 [<ffffffff81088eb0>] ? __init_kthread_worker+0x60/0x60 [<ffffffff816bb37e>] ret_from_fork+0x4e/0x80 [<ffffffff81088eb0>] ? __init_kthread_worker+0x60/0x60 ---[ end trace 0000000000000001 ]--- ------------[ cut here ]------------ WARNING: at kernel/sched/core.c:3087 migrate_enable+0x14c/0x200() Modules linked in: CPU: 1 PID: 7 Comm: kworker/u8:0 Tainted: G W 3.10.107-rt120+ #49 Hardware name: To be filled by O.E.M. To be filled by O.E.M./WADE-8078, BIOS R1.00.E0 07/07/2014 Workqueue: events_unbound async_run_entry_fn 0000000000000000 ffff880159ee7228 ffffffff816b617c 0000000000000000 0000000000000009 ffff880159ee7268 ffffffff8105fc8b ffff880159ee7258 ffff880159ea3540 ffffffff81e1a9e8 ffffffff81e1a840 0000000000000000 Call Trace: [<ffffffff816b617c>] dump_stack+0x4f/0x65 [<ffffffff8105fc8b>] warn_slowpath_common+0x6b/0xa0 [<ffffffff8105fcd5>] warn_slowpath_null+0x15/0x20 [<ffffffff8109569c>] migrate_enable+0x14c/0x200 [<ffffffff8138ae56>] serial8250_poll+0xd6/0x120 [<ffffffff813878dd>] uartdrv_console_write+0xdd/0x330 [<ffffffff81060cad>] call_console_drivers.constprop.20+0xcd/0x100 [<ffffffff81061da8>] console_unlock+0x398/0x3d0 [<ffffffff81062303>] vprintk_emit+0x2b3/0x500 [<ffffffff810b9526>] ? __try_to_take_rt_mutex+0x146/0x190 sd 0:0:0:0: Attached scsi generic sg0 type 0 [<ffffffff8109569c>] ? migrate_enable+0x14c/0x200 [<ffffffff816b17e5>] printk+0x48/0x4a [<ffffffff8109569c>] ? migrate_enable+0x14c/0x200 [<ffffffff8105fc59>] warn_slowpath_common+0x39/0xa0 [<ffffffff8105fcd5>] warn_slowpath_null+0x15/0x20 [<ffffffff8109569c>] migrate_enable+0x14c/0x200 [<ffffffff81100fb1>] get_page_from_freelist+0x9a1/0xbc0 [<ffffffff812fc3d9>] ? number.isra.1+0x329/0x360 [<ffffffff81101f89>] __alloc_pages_nodemask+0x179/0xa50 [<ffffffff816ba6b3>] ? _raw_spin_unlock+0x13/0x40 [<ffffffff8109de14>] ? update_curr+0xa4/0xf0 [<ffffffff81138ab1>] alloc_pages_current+0x101/0x1f0 [<ffffffff8113cf95>] new_slab+0x265/0x310 [<ffffffff816b386e>] __slab_alloc.isra.62+0x4e0/0x6ca [<ffffffff816ba744>] ? _raw_spin_unlock_irq+0x14/0x40 [<ffffffff810fbd0a>] ? mempool_alloc_slab+0x3a/0x70 [<ffffffff816ba6b3>] ? _raw_spin_unlock+0x13/0x40 [<ffffffff8113f5d0>] kmem_cache_alloc+0x170/0x190 [<ffffffff810fbd0a>] ? mempool_alloc_slab+0x3a/0x70 [<ffffffff810fbd0a>] mempool_alloc_slab+0x3a/0x70 [<ffffffff810fc0be>] mempool_alloc+0xae/0x210 [<ffffffff81063f85>] ? unpin_current_cpu+0x15/0x70 [<ffffffff812d5ce8>] get_request+0x3a8/0x7c0 [<ffffffff81089c30>] ? __init_waitqueue_head+0x60/0x60 [<ffffffff812d619a>] blk_get_request+0x9a/0x140 [<ffffffff8113fb68>] ? kmem_cache_free+0x188/0x1a0 [<ffffffff813ef02a>] scsi_execute+0x4a/0x170 [<ffffffff813ef226>] scsi_execute_req_flags+0xd6/0x190 [<ffffffff81478349>] read_capacity_16+0xb9/0x550 [<ffffffff81478fc8>] sd_revalidate_disk+0x4c8/0x ------------------------------------------------------------------------------------ scsi0 : scsi_debug, version 1.82 [20100324], dev_size_mb=8, opts=0x0 scsi 0:0:0:0: Direct-Access Linux scsi_debug 0004 PQ: 0 ANSI: 5 kernel/rtmutex.c rt_spin_lock_slowlock 1266 enterirq 1 exitirq 0!!!==> debug print added by me XXX: After local_spin_lock_irqsave enterirqs 1 exitirqs 0!!! ==> debug print added by me ------------[ cut here ]------------ WARNING: at kernel/sched/core.c:3052 migrate_disable+0x10b/0x120() Modules linked in: CPU: 0 PID: 7 Comm: kworker/u8:0 Not tainted 3.10.107-rt120+ #106 Hardware name: To be filled by O.E.M. To be filled by O.E.M./WADE-8078, BIOS R1.00.E0 07/07/2014 Workqueue: events_unbound async_run_entry_fn 0000000000000000 ffff880159ee73f8 ffffffff816b65bc 0000000000000000 0000000000000009 ffff880159ee7438 ffffffff8105fc8b ffff880159ee7438 ffff880159ea3540 0000000000000044 0000000000000001 0000000000000004 Call Trace: [<ffffffff816b65bc>] dump_stack+0x4f/0x65 [<ffffffff8105fc8b>] warn_slowpath_common+0x6b/0xa0 [<ffffffff8105fcd5>] warn_slowpath_null+0x15/0x20 [<ffffffff8109585b>] migrate_disable+0x10b/0x120 [<ffffffff81060c45>] call_console_drivers.constprop.20+0x65/0x100 [<ffffffff81061da8>] console_unlock+0x398/0x3d0 [<ffffffff81062303>] vprintk_emit+0x2b3/0x500 [<ffffffff816b1c25>] printk+0x48/0x4a [<ffffffff81100f85>] get_page_from_freelist+0x985/0xc70 [<ffffffff81102029>] __alloc_pages_nodemask+0x179/0xa50 [<ffffffff81100b7a>] ? get_page_from_freelist+0x57a/0xc70 [<ffffffff81063f85>] ? unpin_current_cpu+0x15/0x70 [<ffffffff81138b2c>] alloc_pages_current+0xdc/0x1b0 [<ffffffff8113d015>] new_slab+0x285/0x370 [<ffffffff816b3cae>] __slab_alloc.isra.62+0x4e0/0x6ca [<ffffffff810fbcf8>] ? mempool_alloc_slab+0x28/0x60 [<ffffffff8113f752>] kmem_cache_alloc+0x1a2/0x1e0 [<ffffffff810fbcf8>] ? mempool_alloc_slab+0x28/0x60 [<ffffffff810fbcf8>] mempool_alloc_slab+0x28/0x60 [<ffffffff810fc0ae>] mempool_alloc+0xae/0x210 [<ffffffff81063f85>] ? unpin_current_cpu+0x15/0x70 [<ffffffff812d60f8>] get_request+0x3a8/0x7c0 [<ffffffff81089c30>] ? __init_waitqueue_head+0x60/0x60 [<ffffffff812d65aa>] blk_get_request+0x9a/0x140 [<ffffffff813ef46a>] scsi_execute+0x4a/0x170 [<ffffffff813ef666>] scsi_execute_req_flags+0xd6/0x190 [<ffffffff81479008>] sd_revalidate_disk+0xc8/0x1c90 [<ffffffff816ba561>] ? rt_spin_lock_slowlock+0x291/0x340 [<ffffffff8147ac95>] sd_probe_async+0xc5/0x1d0 [<ffffffff81090a96>] async_run_entry_fn+0x36/0x130 [<ffffffff81081047>] process_one_work+0x147/0x3d0 [<ffffffff810824e1>] worker_thread+0x161/0x3d0 [<ffffffff81082380>] ? manage_workers.isra.33+0x2f0/0x2f0 [<ffffffff81088f75>] kthread+0xc5/0xd0 [<ffffffff81088eb0>] ? __init_kthread_worker+0x60/0x60 [<ffffffff816bb8fe>] ret_from_fork+0x4e/0x80 [<ffffffff81088eb0>] ? __init_kthread_worker+0x60/0x60 ---[ end trace 0000000000000001 ]--- ------------[ cut here ]------------ WARNING: at kernel/sched/core.c:3087 migrate_enable+0x14c/0x200() Modules linked in: CPU: 0 PID: 7 Comm: kworker/u8:0 Tainted: G W 3.10.107-rt120+ #106 Hardware name: To be filled by O.E.M. To be filled by O.E.M./WADE-8078, BIOS R1.00.E0 07/07/2014 Workqueue: events_unbound async_run_entry_fn 0000000000000000 ffff880159ee7338 ffffffff816b65bc 0000000000000000 0000000000000009 ffff880159ee7378 ffffffff8105fc8b ffff880159ee7368 ffff880159ea3540 ffffffff81e1aa28 ffffffff81e1a880 0000000000000000 Call Trace: [<ffffffff816b65bc>] dump_stack+0x4f/0x65 [<ffffffff8105fc8b>] warn_slowpath_common+0x6b/0xa0 [<ffffffff8105fcd5>] warn_slowpath_null+0x15/0x20 [<ffffffff8109569c>] migrate_enable+0x14c/0x200 [<ffffffff8138b266>] serial8250_poll+0xd6/0x120 [<ffffffff81387ced>] uartdrv_console_write+0xdd/0x330 [<ffffffff81060cad>] call_console_drivers.constprop.20+0xcd/0x100 [<ffffffff81061da8>] console_unlock+0x398/0x3d0 [<ffffffff81062303>] vprintk_emit+0x2b3/0x500 [<ffffffff816b1c25>] printk+0x48/0x4a [<ffffffff81100f85>] get_page_from_freelist+0x985/0xc70 [<ffffffff81102029>] __alloc_pages_nodemask+0x179/0xa50 [<ffffffff81100b7a>] ? get_page_from_freelist+0x57a/0xc70 [<ffffffff81063f85>] ? unpin_current_cpu+0x15/0x70 [<ffffffff81138b2c>] alloc_pages_current+0xdc/0x1b0 [<ffffffff8113d015>] new_slab+0x285/0x370 [<ffffffff816b3cae>] __slab_alloc.isra.62+0x4e0/0x6ca [<ffffffff810fbcf8>] ? mempool_alloc_slab+0x28/0x60 [<ffffffff8113f752>] kmem_cache_alloc+0x1a2/0x1e0 [<ffffffff810fbcf8>] ? mempool_alloc_slab+0x28/0x60 [<ffffffff810fbcf8>] mempool_alloc_slab+0x28/0x60 [<ffffffff810fc0ae>] mempool_alloc+0xae/0x210 [<ffffffff81063f85>] ? unpin_current_cpu+0x15/0x70 [<ffffffff812d60f8>] get_request+0x3a8/0x7c0 [<ffffffff81089c30>] ? __init_waitqueue_head+0x60/0x60 [<ffffffff812d65aa>] blk_get_request+0x9a/0x140 [<ffffffff813ef46a>] scsi_execute+0x4a/0x170 [<ffffffff813ef666>] scsi_execute_req_flags+0xd6/0x190 [<ffffffff81479008>] sd_revalidate_disk+0xc8/0x1c90 [<ffffffff816ba561>] ? rt_spin_lock_slowlock+0x291/0x340 [<ffffffff8147ac95>] sd_probe_async+0xc5/0x1d0 [<ffffffff81090a96>] async_run_entry_fn+0x36/0x130 [<ffffffff81081047>] process_one_work+0x147/0x3d0 [<ffffffff810824e1>] worker_thread+0x161/0x3d0 [<ffffffff81082380>] ? manage_workers.isra.33+0x2f0/0x2f0 [<ffffffff81088f75>] kthread+0xc5/0xd0 [<ffffffff81088eb0>] ? __init_kthread_worker+0x60/0x60 [<ffffffff816bb8fe>] ret_from_fork+0x4e/0x80 [<ffffffff81088eb0>] ? __init_kthread_worker+0x60/0x60 ---[ end trace 0000000000000002 ]--- XXX: __rmqueue enterirqs 1 exitirqs 0!!! ==> debug print added by me XXX: __mod_zone_freepage_state enterirq 1 exitirq 0!!! ==> debug print added by me ------------[ cut here ]------------ sd 0:0:0:0: Attached scsi generic sg0 type 0 ahci 0000:00:13.0: controller can't do DEVSLP, turning off ahci 0000:00:13.0: AHCI 0001.0300 32 slots 2 ports 3 Gbps 0x0 impl SATA mode ahci 0000:00:13.0: flags: 64bit ncq pm led clo pio slum part deso scsi1 : ahci ---------------------------------------------------------------------------------------------- WARNING: at kernel/sched/core.c:3052 migrate_disable+0x10b/0x120() Modules linked in: CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 3.10.107-rt120+ #49 Hardware name: To be filled by O.E.M. To be filled by O.E.M./WADE-8078, BIOS R1.00.E0 07/07/2014 Workqueue: events_unbound async_run_entry_fn 0000000000000000 ffff880159ee72e8 ffffffff816b617c 0000000000000000 0000000000000009 ffff880159ee7328 ffffffff8105fc8b ffff880159ee7348 ffff880159ea3540 0000000000000038 0000000000000001 0000000000000004 Call Trace: [<ffffffff816b617c>] dump_stack+0x4f/0x65 [<ffffffff8105fc8b>] warn_slowpath_common+0x6b/0xa0 [<ffffffff8105fcd5>] warn_slowpath_null+0x15/0x20 [<ffffffff8109585b>] migrate_disable+0x10b/0x120 [<ffffffff81060c45>] call_console_drivers.constprop.20+0x65/0x100 [<ffffffff81061da8>] console_unlock+0x398/0x3d0 [<ffffffff81062303>] vprintk_emit+0x2b3/0x500 [<ffffffff810b9526>] ? __try_to_take_rt_mutex+0x146/0x190 [<ffffffff8109569c>] ? migrate_enable+0x14c/0x200 [<ffffffff816b17e5>] printk+0x48/0x4a [<ffffffff8109569c>] ? migrate_enable+0x14c/0x200 [<ffffffff8105fc59>] warn_slowpath_common+0x39/0xa0 [<ffffffff8105fcd5>] warn_slowpath_null+0x15/0x20 [<ffffffff8109569c>] migrate_enable+0x14c/0x200 [<ffffffff81100fb1>] get_page_from_freelist+0x9a1/0xbc0 [<ffffffff812fc3d9>] ? number.isra.1+0x329/0x360 [<ffffffff81101f89>] __alloc_pages_nodemask+0x179/0xa50 [<ffffffff816ba6b3>] ? _raw_spin_unlock+0x13/0x40 [<ffffffff8109de14>] ? update_curr+0xa4/0xf0 [<ffffffff81138ab1>] alloc_pages_current+0x101/0x1f0 [<ffffffff8113cf95>] new_slab+0x265/0x310 [<ffffffff816b386e>] __slab_alloc.isra.62+0x4e0/0x6ca [<ffffffff816ba744>] ? _raw_spin_unlock_irq+0x14/0x40 [<ffffffff810fbd0a>] ? mempool_alloc_slab+0x3a/0x70 [<ffffffff816ba6b3>] ? _raw_spin_unlock+0x13/0x40 [<ffffffff8113f5d0>] kmem_cache_alloc+0x170/0x190 [<ffffffff810fbd0a>] ? mempool_alloc_slab+0x3a/0x70 [<ffffffff810fbd0a>] mempool_alloc_slab+0x3a/0x70 [<ffffffff810fc0be>] mempool_alloc+0xae/0x210 [<ffffffff81063f85>] ? unpin_current_cpu+0x15/0x70 [<ffffffff812d5ce8>] get_request+0x3a8/0x7c0 [<ffffffff81089c30>] ? __init_waitqueue_head+0x60/0x60 [<ffffffff812d619a>] blk_get_request+0x9a/0x140 [<ffffffff8113fb68>] ? kmem_cache_free+0x188/0x1a0 [<ffffffff813ef02a>] scsi_execute+0x4a/0x170 [<ffffffff813ef226>] scsi_execute_req_flags+0xd6/0x190 [<ffffffff81478349>] read_capacity_16+0xb9/0x550 [<ffffffff81478fc8>] sd_revalidate_disk+0x4c8/0x1c90 [<ffffffff8147a855>] sd_probe_async+0xc5/0x1d0 [<ffffffff81090a96>] async_run_entry_fn+0x36/0x130 [<ffffffff81081047>] process_one_work+0x147/0x3d0 [<ffffffff810824e1>] worker_thread+0x161/0x3d0 [<ffffffff81082380>] ? manage_workers.isra.33+0x2f0/0x2f0 [<ffffffff81088f75>] kthread+0xc5/0xd0 [<ffffffff81088eb0>] ? __init_kthread_worker+0x60/0x60 [<ffffffff816bb37e>] ret_from_fork+0x4e/0x80 [<ffffffff81088eb0>] ? __init_kthread_worker+0x60/0x60 ---[ end trace 0000000000000001 ]--- ------------[ cut here ]------------ WARNING: at kernel/sched/core.c:3087 migrate_enable+0x14c/0x200() Modules linked in: … > We were testing above patch on multiple targets we could experience > some stuck issue on some remote target after 2 days. I am not > sure what really happens there, may be the issue when try for > scheduling with irq in disabled state. > The systems I have tested found to be worked 7 days after that I > stopped the test. Which patch? The patch I've sent and ask you for testing or the patch you had in this email? Patch I had in this mail. > > 2.) With your patch during the slab allocations irqs will be in enabled state. > So if we enable irqs in early stage will there be any side effects? I > am sorry if my question doesn't seem > to be logical. You must not enable the interrupts too early. At the time of scheduling it is okay. Thanks. I have been testing your patch, I will update once I finish the long run test. Regards, Sam On Mon, Dec 4, 2017 at 3:29 PM, Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote: > On 2017-11-27 12:16:36 [+0530], Sam Kappen wrote: >> Hi, > Hi, > >> 1.) >> I had derived and tried a patch based on the below analysis. >> ( I referred below open source commit, to derive on this patch. >> https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v4.9.47-rt37-rebase&id=7a347757f027190c95a363a491c18156a926a370 >> ) >> >> In some cases pi_lock in rt_spin_lock_slowlock does not retain the >> irqs state while exiting function, this causes >> issue in migrate_disable() + enable as they are not symmetrical in >> regard to the status of interrupts. >> To fix pi_lock & pi_unlock in rt_spin_lock_slowlock, it has been >> modified to retain irq state by using >> raw_spin_lock and raw_spin_unlock and also modified wait_lock in >> rt_spin_lock_slowlock with raw_spin_lock_irqsave & *_restore. > > Can you provide more informations on this? Like a stack strace that > shows that this happens and when it happens? It should not happen. > > … >> We were testing above patch on multiple targets we could experience >> some stuck issue on some remote target after 2 days. I am not >> sure what really happens there, may be the issue when try for >> scheduling with irq in disabled state. >> The systems I have tested found to be worked 7 days after that I >> stopped the test. > > Which patch? The patch I've sent and ask you for testing or the patch > you had in this email? > >> >> 2.) With your patch during the slab allocations irqs will be in enabled state. >> So if we enable irqs in early stage will there be any side effects? I >> am sorry if my question doesn't seem >> to be logical. > > You must not enable the interrupts too early. At the time of scheduling > it is okay. > >> Regards, >> Sam > > Sebastian ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: schedule under irqs_disabled in SLUB problem 2017-12-05 16:31 ` Sam Kappen @ 2017-12-12 10:18 ` Sebastian Andrzej Siewior 2018-03-05 8:47 ` Sam Kappen 0 siblings, 1 reply; 19+ messages in thread From: Sebastian Andrzej Siewior @ 2017-12-12 10:18 UTC (permalink / raw) To: Sam Kappen; +Cc: linux-rt-users On 2017-12-05 22:01:19 [+0530], Sam Kappen wrote: > Hi, Hi, > Thanks for looking at my queries. Please see my answers inline. please don't top-post. Please use a client which adds proper indention while quoting the email. > 1.) > > I had derived and tried a patch based on the below analysis. > > ( I referred below open source commit, to derive on this patch. > > https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v4.9.47-rt37-rebase&id=7a347757f027190c95a363a491c18156a926a370 > > ) > > > We see this issue when there is a state change for irqs from disabled > to enabled. During slab allocations for SCSI on bootup > the irqs are found to be in disabled state since the system state is > not yet in "RUNNING". > > So we have added instrument code throughout the call trace and > confirmed culprit as pi_lock()/pi_unlock for changing the irqs state. > Basically it happens when it acquires the lock with irqs in disabled state. but by pi_lock/pi_unlock you don't mean the futex operation, do you? based on the fact that the system is not in state "running" yet and this trace here: > ------------[ cut here ]------------ > WARNING: at kernel/sched/core.c:3052 migrate_disable+0x10b/0x120() > Modules linked in: > CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 3.10.107-rt120+ #49 > Hardware name: To be filled by O.E.M. To be filled by … > Call Trace: … > [<ffffffff8105fcd5>] warn_slowpath_null+0x15/0x20 > [<ffffffff8109569c>] migrate_enable+0x14c/0x200 > [<ffffffff81100fb1>] get_page_from_freelist+0x9a1/0xbc0 > [<ffffffff81101f89>] __alloc_pages_nodemask+0x179/0xa50 > [<ffffffff81138ab1>] alloc_pages_current+0x101/0x1f0 > [<ffffffff8113cf95>] new_slab+0x265/0x310 > [<ffffffff816b386e>] __slab_alloc.isra.62+0x4e0/0x6ca > [<ffffffff8113f5d0>] kmem_cache_alloc+0x170/0x190 > [<ffffffff810fbd0a>] mempool_alloc_slab+0x3a/0x70 > [<ffffffff810fc0be>] mempool_alloc+0xae/0x210 > [<ffffffff812d5ce8>] get_request+0x3a8/0x7c0 > [<ffffffff812d619a>] blk_get_request+0x9a/0x140 > [<ffffffff813ef02a>] scsi_execute+0x4a/0x170 … > ---[ end trace 0000000000000001 ]--- I would that this is the same issue and the patch I posted should help. > > 2.) With your patch during the slab allocations irqs will be in enabled state. > > Thanks. I have been testing your patch, I will update once I finish the long > run test. Okay, so a note to myself, there is nothing outstanding for me to do so far. > Regards, > Sam Sebastian ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: schedule under irqs_disabled in SLUB problem 2017-12-12 10:18 ` Sebastian Andrzej Siewior @ 2018-03-05 8:47 ` Sam Kappen 2018-03-05 17:40 ` Sebastian Andrzej Siewior 0 siblings, 1 reply; 19+ messages in thread From: Sam Kappen @ 2018-03-05 8:47 UTC (permalink / raw) To: Sebastian Andrzej Siewior; +Cc: linux-rt-users On Tue, Dec 12, 2017 at 3:48 PM, Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote: > On 2017-12-05 22:01:19 [+0530], Sam Kappen wrote: >> Hi, > Hi, > >> Thanks for looking at my queries. Please see my answers inline. > please don't top-post. Please use a client which adds proper indention > while quoting the email. > >> 1.) >> > I had derived and tried a patch based on the below analysis. >> > ( I referred below open source commit, to derive on this patch. >> > https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v4.9.47-rt37-rebase&id=7a347757f027190c95a363a491c18156a926a370 >> > ) >> > >> We see this issue when there is a state change for irqs from disabled >> to enabled. During slab allocations for SCSI on bootup >> the irqs are found to be in disabled state since the system state is >> not yet in "RUNNING". >> >> So we have added instrument code throughout the call trace and >> confirmed culprit as pi_lock()/pi_unlock for changing the irqs state. >> Basically it happens when it acquires the lock with irqs in disabled state. > > but by pi_lock/pi_unlock you don't mean the futex operation, do you? > > based on the fact that the system is not in state "running" yet and this > trace here: >> ------------[ cut here ]------------ >> WARNING: at kernel/sched/core.c:3052 migrate_disable+0x10b/0x120() >> Modules linked in: >> CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 3.10.107-rt120+ #49 >> Hardware name: To be filled by O.E.M. To be filled by > … >> Call Trace: > … >> [<ffffffff8105fcd5>] warn_slowpath_null+0x15/0x20 >> [<ffffffff8109569c>] migrate_enable+0x14c/0x200 >> [<ffffffff81100fb1>] get_page_from_freelist+0x9a1/0xbc0 >> [<ffffffff81101f89>] __alloc_pages_nodemask+0x179/0xa50 >> [<ffffffff81138ab1>] alloc_pages_current+0x101/0x1f0 >> [<ffffffff8113cf95>] new_slab+0x265/0x310 >> [<ffffffff816b386e>] __slab_alloc.isra.62+0x4e0/0x6ca >> [<ffffffff8113f5d0>] kmem_cache_alloc+0x170/0x190 >> [<ffffffff810fbd0a>] mempool_alloc_slab+0x3a/0x70 >> [<ffffffff810fc0be>] mempool_alloc+0xae/0x210 >> [<ffffffff812d5ce8>] get_request+0x3a8/0x7c0 >> [<ffffffff812d619a>] blk_get_request+0x9a/0x140 >> [<ffffffff813ef02a>] scsi_execute+0x4a/0x170 > … >> ---[ end trace 0000000000000001 ]--- > > I would that this is the same issue and the patch I posted should help. > >> > 2.) With your patch during the slab allocations irqs will be in enabled state. >> >> Thanks. I have been testing your patch, I will update once I finish the long >> run test. > > Okay, so a note to myself, there is nothing outstanding for me to do so > far. >We have tested it for nearly a month and issue is not reproducible with your patch. Many thanks. >> Regards, >> Sam > > Sebastian ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: schedule under irqs_disabled in SLUB problem 2018-03-05 8:47 ` Sam Kappen @ 2018-03-05 17:40 ` Sebastian Andrzej Siewior 0 siblings, 0 replies; 19+ messages in thread From: Sebastian Andrzej Siewior @ 2018-03-05 17:40 UTC (permalink / raw) To: Sam Kappen; +Cc: linux-rt-users On 2018-03-05 14:17:29 [+0530], Sam Kappen wrote: > On Tue, Dec 12, 2017 at 3:48 PM, Sebastian Andrzej Siewior > <bigeasy@linutronix.de> wrote: The whole email looks like quote of my email with no additional content. Did you press the send button too soon or did you hide the new content somewhere where I could not find it :)? Sebastian ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH] mm/slub: enable IRQs once scheduling is working 2017-11-17 17:38 ` Julia Cartwright 2017-11-24 6:39 ` Sam Kappen @ 2017-11-24 9:35 ` Sebastian Andrzej Siewior 1 sibling, 0 replies; 19+ messages in thread From: Sebastian Andrzej Siewior @ 2017-11-24 9:35 UTC (permalink / raw) To: Julia Cartwright; +Cc: Pavel V. Panteleev, linux-rt-users, Steven Rostedt Pavel V. Panteleev reported that a WARN_ON_ONCE(p->migrate_disable <= 0) triggered in 3.14.79-rt85 during boot from radix_tree_insert(). The problem is that radix_tree_insert() allocates memory and SLUB does not enable interrupts in allocate_slab(). They can't be be enabled unconditionally because early in the boot process they have to remain off. Enabling them at the "SYSTEM_RUNNING" state is too late. The earliest point is once we have scheduling working that is once the kthread is running. In newer -RT (v4.11+) there is the check if (system_state > SYSTEM_BOOTING) to enable interrupts in the allocator which is what this patch intends: Once scheduling is working enable interrupts in the allocator path so we can take sleeping locks. Reported-by: "Pavel V. Panteleev" <panteleev_p@mcst.ru> Tested-by: "Pavel V. Panteleev" <panteleev_p@mcst.ru> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> --- init/main.c | 2 ++ mm/slub.c | 3 ++- 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/init/main.c b/init/main.c index 6470deef01c9..362761effeac 100644 --- a/init/main.c +++ b/init/main.c @@ -379,6 +379,7 @@ static void __init setup_command_line(char *command_line) */ static __initdata DECLARE_COMPLETION(kthreadd_done); +extern bool slab_do_irq_on; static noinline void __ref rest_init(void) { @@ -396,6 +397,7 @@ static noinline void __ref rest_init(void) rcu_read_lock(); kthreadd_task = find_task_by_pid_ns(pid, &init_pid_ns); rcu_read_unlock(); + slab_do_irq_on = true; complete(&kthreadd_done); /* diff --git a/mm/slub.c b/mm/slub.c index 67eb368b9314..22ec805130e5 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -1527,6 +1527,7 @@ static inline bool shuffle_freelist(struct kmem_cache *s, struct page *page) return false; } #endif /* CONFIG_SLAB_FREELIST_RANDOM */ +bool slab_do_irq_on; static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node) { @@ -1543,7 +1544,7 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node) if (gfpflags_allow_blocking(flags)) enableirqs = true; #ifdef CONFIG_PREEMPT_RT_FULL - if (system_state == SYSTEM_RUNNING) + if (slab_do_irq_on) enableirqs = true; #endif if (enableirqs) -- 2.15.0 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* schedule under irqs_disabled in SLUB problem @ 2017-11-01 11:31 Pavel V. Panteleev 0 siblings, 0 replies; 19+ messages in thread From: Pavel V. Panteleev @ 2017-11-01 11:31 UTC (permalink / raw) To: linux-rt-users Hello! I have a problem on kernel 3.14.79-rt85 and I don't know, if it's solved on current kernel or, may be, it's our additions problem. Could you help me, please? Kernel is broken in migrate_enable(): WARN_ON_ONCE(p->migrate_disable <= 0); I have a trace, where I see, that it happened, because few migrate_disable() were called under irqs_disabled (migrate_disable counter wasn't changed). But after schedule() call irqs were enabled and the following migrate_disable() and migrate_enable() changed migrate_disable counter. I suppose, that problem is schedule() call under irqs_disabled. There is a stack: PROCESS: kworker/u8:4, PID: 744, CPU: 3, state: R oncpu (0x0), flags: 0x4208060 --------------------------------------------------------------------- IP (hex) FILENAME PROCEDURE --------------------------------------------------------------------- e200020e0700 <kernel> __schedule+0x1b60/0x1f90 e200020e0da8 <kernel> schedule+0x278/0x2e8 e200020e8a68 <kernel> rt_spin_lock_slowlock+0x740/0x15c8 e200020f5088 <kernel> rt_spin_lock+0x3b8/0x408 e20000578290 <kernel> get_page_from_freelist+0x1528/0x1b48 e20000579b28 <kernel> __alloc_pages_nodemask+0x270/0x1b90 e200006ac9d8 <kernel> alloc_pages_current+0x638/0xad0 e200006b24d8 <kernel> allocate_slab+0x210/0xa70 e200006b9468 <kernel> __slab_alloc+0x1300/0x1d80 e200006ba900 <kernel> kmem_cache_alloc+0xa18/0xa78 e2000101ed78 <kernel> radix_tree_node_alloc+0x280/0x478 e2000101f2a8 <kernel> radix_tree_extend+0x248/0x588 e2000101f6c0 <kernel> radix_tree_insert+0xd8/0x1258 e20000550e10 <kernel> page_cache_tree_insert+0xd0/0x498 e200005513e8 <kernel> add_to_page_cache_locked+0x210/0x7b0 e2000055d7f8 <kernel> __read_cache_page+0x160/0x7e8 e2000055df00 <kernel> do_read_cache_page+0x80/0xab8 e2000055e9c8 <kernel> read_cache_page+0x90/0xa8 e20000fdf110 <kernel> read_dev_sector+0xa8/0x180 e20000fe0178 <kernel> parse_extended+0x198/0x8f0 e20000fe1288 <kernel> msdos_partition+0x7a8/0xed8 e20000fdf790 <kernel> check_partition+0x508/0x870 e20000fddfe8 <kernel> rescan_partitions+0x3a8/0x1128 e20000818328 <kernel> __blkdev_get+0xcb8/0xe48 e20000818d20 <kernel> blkdev_get+0x868/0xf90 e20000fcd100 <kernel> register_disk+0x890/0xad0 e20000fcda08 <kernel> add_disk+0x6c8/0xc98 e20001801ab0 <kernel> sd_probe_async+0x328/0x648 e200002935a8 <kernel> async_run_entry_fn+0x100/0x598 e2000023fd28 <kernel> process_one_work+0x5a0/0x1940 e20000241508 <kernel> worker_thread+0x440/0xff8 e2000026c1a8 <kernel> kthread+0x4e8/0x5a0 As I see, in allocate_slab() kernel could be under irqs_disabled. And irqs would be enabled in case of SYSTEM_RUNNING (why only in case of SYSTEM_RUNNING?). But in our case system isn't running yet (it's always before /sbin/init), so irqs wouldn't be enabled: enableirqs = (flags & __GFP_WAIT) != 0; #ifdef CONFIG_PREEMPT_RT_FULL enableirqs |= system_state == SYSTEM_RUNNING; #endif if (enableirqs) local_irq_enable(); So we call schedule() under irqs_disabled in ...->get_page_from_freelist->buffered_rmqueue->local_spin_lock_irqsave->local_lock_irqsave->__local_lock_irqsave->__local_lock_irq->spin_lock_irqsave->spin_lock->rt_spin_lock->rt_spin_lock_slowlock->schedule_rt_mutex. P. S. Today I reproduced this on 4.9.47-rt37. Try to reproduce on x86. Best regards, Pavel V. Panteleev ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2018-03-05 17:40 UTC | newest] Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <CADF-jezvVP2O++FR2KiRSSSJF7oObjy8LSP3-yj1HCmxyTzB_Q@mail.gmail.com> 2017-11-02 16:50 ` schedule under irqs_disabled in SLUB problem Sebastian Andrzej Siewior 2017-11-02 20:55 ` Grygorii Strashko [not found] ` <CADF-jexLs9vRuiuoRmcA+0L6Mp-XxW75okheWV+ipGf1b_Ua1w@mail.gmail.com> 2017-11-03 10:23 ` Pavel V. Panteleev 2017-11-07 9:00 ` Pavel V. Panteleev 2017-11-07 9:14 ` Pavel V. Panteleev 2017-11-07 9:47 ` Pavel V. Panteleev 2017-11-16 16:08 ` Sebastian Andrzej Siewior 2017-11-16 16:39 ` Pavel V. Panteleev 2017-11-17 17:38 ` Julia Cartwright 2017-11-24 6:39 ` Sam Kappen 2017-11-24 9:37 ` Sebastian Andrzej Siewior 2017-11-27 6:46 ` Sam Kappen 2017-12-04 9:59 ` Sebastian Andrzej Siewior 2017-12-05 16:31 ` Sam Kappen 2017-12-12 10:18 ` Sebastian Andrzej Siewior 2018-03-05 8:47 ` Sam Kappen 2018-03-05 17:40 ` Sebastian Andrzej Siewior 2017-11-24 9:35 ` [PATCH] mm/slub: enable IRQs once scheduling is working Sebastian Andrzej Siewior 2017-11-01 11:31 schedule under irqs_disabled in SLUB problem Pavel V. Panteleev
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.