From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752321AbbEGOgI (ORCPT ); Thu, 7 May 2015 10:36:08 -0400 Received: from arcturus.aphlor.org ([188.246.204.175]:51497 "EHLO arcturus.aphlor.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751303AbbEGOgB (ORCPT ); Thu, 7 May 2015 10:36:01 -0400 Date: Thu, 7 May 2015 10:35:52 -0400 From: Dave Jones To: Linux Kernel Cc: linux-ext4@vger.kernel.org, paulmck@linux.vnet.ibm.com Subject: ext4_es_lookup_extent causing rcu stalls. Message-ID: <20150507143552.GA19619@codemonkey.org.uk> Mail-Followup-To: Dave Jones , Linux Kernel , linux-ext4@vger.kernel.org, paulmck@linux.vnet.ibm.com MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Score: -2.9 (--) X-Spam-Report: Spam report generated by SpamAssassin on "arcturus.aphlor.org" Content analysis details: (-2.9 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] X-Authenticated-User: davej@codemonkey.org.uk Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org I got similar traces to these from the soft lockup detector until I disabled it. It isn't so easy to disable RCU, so perhaps we could do something more creative.. Traces from 4.1-rc2 Dave INFO: rcu_sched self-detected stall on CPU 5: (6499 ticks this GP) idle=861/140000000000001/0 softirq=18210/18210 fqs=6500 (t=6500 jiffies g=57760 c=57759 q=0) Task dump for CPU 5: trinity-c1 R running task 13136 17998 31742 0x0048000c 0000000000003350 00000000a3d4db28 ffff88043d943d58 ffffffffb30abd15 ffffffffb30abc32 0000000000000005 0000000000000000 0000000000000005 ffffffffb3c3b080 0000000000000082 ffff88043d943d78 ffffffffb30afaad Call Trace: [] sched_show_task+0x165/0x280 [] ? sched_show_task+0x82/0x280 [] dump_cpu_task+0x3d/0x50 [] rcu_dump_cpu_stacks+0x91/0xd0 [] rcu_check_callbacks+0x7ec/0xb30 [] ? tick_do_update_jiffies64+0xde/0x140 [] update_process_times+0x39/0x60 [] tick_sched_handle.isra.20+0x25/0x60 [] tick_sched_timer+0x41/0x80 [] __run_hrtimer+0xba/0x5f0 [] ? tick_sched_do_timer+0x40/0x40 [] hrtimer_interrupt+0x103/0x230 [] ? vtime_common_account_irq_enter+0x32/0x50 [] local_apic_timer_interrupt+0x3c/0x70 [] smp_apic_timer_interrupt+0x41/0x60 [] apic_timer_interrupt+0x70/0x80 [] ? do_raw_read_lock+0x1a/0x50 [] _raw_read_lock+0x43/0x50 [] ? ext4_es_lookup_extent+0x5e/0x470 [] ext4_es_lookup_extent+0x5e/0x470 [] ext4_map_blocks+0x5b/0x590 [] ? ext4_es_find_delayed_extent_range+0x42c/0x480 [] ext4_llseek+0x271/0x410 [] ? trace_hardirqs_on+0xd/0x10 [] SyS_lseek+0x99/0xc0 [] system_call_fastpath+0x12/0x6f INFO: rcu_sched detected stalls on CPUs/tasks: (detected by 0, t=6502 jiffies, g=75215, c=75214, q=0) All QSes seen, last rcu_sched kthread activity 6502 (4295121467-4295114965), jiffies_till_next_fqs=1, root ->qsmask 0x0 trinity-c21 R running task 12768 20468 30399 0x0008000c 00000000000031e0 00000000f34c833f ffff88043d803da8 ffffffffb30abd15 ffffffffb30abc32 ffffffffb3c3b500 0000000000001966 ffff88043d817580 ffffffffb3c3b080 0000000000000000 ffff88043d803e28 ffffffffb30ef556 Call Trace: [] sched_show_task+0x165/0x280 [] ? sched_show_task+0x82/0x280 [] rcu_check_callbacks+0xb26/0xb30 [] update_process_times+0x39/0x60 [] tick_sched_handle.isra.20+0x25/0x60 [] tick_sched_timer+0x41/0x80 [] __run_hrtimer+0xba/0x5f0 [] ? tick_sched_do_timer+0x40/0x40 [] hrtimer_interrupt+0x103/0x230 [] ? vtime_common_account_irq_enter+0x32/0x50 [] local_apic_timer_interrupt+0x3c/0x70 [] smp_apic_timer_interrupt+0x41/0x60 [] apic_timer_interrupt+0x70/0x80 [] ? lock_acquire+0xd7/0x270 [] ? ext4_es_find_delayed_extent_range+0x76/0x480 [] _raw_read_lock+0x3b/0x50 [] ? ext4_es_find_delayed_extent_range+0x76/0x480 [] ext4_es_find_delayed_extent_range+0x76/0x480 [] ext4_llseek+0x291/0x410 [] ? trace_hardirqs_on+0xd/0x10 [] SyS_lseek+0x99/0xc0 [] system_call_fastpath+0x12/0x6f rcu_sched kthread starved for 6502 jiffies! INFO: rcu_sched detected stalls on CPUs/tasks: (detected by 0, t=26007 jiffies, g=75215, c=75214, q=0) All QSes seen, last rcu_sched kthread activity 26007 (4295140972-4295114965), jiffies_till_next_fqs=1, root ->qsmask 0x0 trinity-c21 R running task 12768 20468 30399 0x0008000c 00000000000031e0 00000000f34c833f ffff88043d803da8 ffffffffb30abd15 ffffffffb30abc32 ffffffffb3c3b500 0000000000006597 ffff88043d817580 ffffffffb3c3b080 0000000000000000 ffff88043d803e28 ffffffffb30ef556 Call Trace: [] sched_show_task+0x165/0x280 [] ? sched_show_task+0x82/0x280 [] rcu_check_callbacks+0xb26/0xb30 [] update_process_times+0x39/0x60 [] tick_sched_handle.isra.20+0x25/0x60 [] tick_sched_timer+0x41/0x80 [] __run_hrtimer+0xba/0x5f0 [] ? tick_sched_do_timer+0x40/0x40 [] hrtimer_interrupt+0x103/0x230 [] ? vtime_common_account_irq_enter+0x32/0x50 [] local_apic_timer_interrupt+0x3c/0x70 [] smp_apic_timer_interrupt+0x41/0x60 [] apic_timer_interrupt+0x70/0x80 [] ? debug_lockdep_rcu_enabled+0x2f/0x40 [] ext4_es_find_delayed_extent_range+0x2cd/0x480 [] ? ext4_es_find_delayed_extent_range+0x1a/0x480 [] ext4_llseek+0x291/0x410 [] ? trace_hardirqs_on+0xd/0x10 [] SyS_lseek+0x99/0xc0 [] system_call_fastpath+0x12/0x6f rcu_sched kthread starved for 26007 jiffies!