From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753059Ab1IVR0j (ORCPT );
        Thu, 22 Sep 2011 13:26:39 -0400
Received: from mail-wy0-f174.google.com ([74.125.82.174]:53421 "EHLO
        mail-wy0-f174.google.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1751881Ab1IVR0g convert rfc822-to-8bit
        (ORCPT ); Thu, 22 Sep 2011 13:26:36 -0400
MIME-Version: 1.0
In-Reply-To:
References:
From: Bart Van Assche
Date: Thu, 22 Sep 2011 19:26:14 +0200
X-Google-Sender-Auth: 94qtz_Utt6CoeZvGCUSIugLV8aM
Message-ID:
Subject: Re: blkdev_issue_discard() hangs forever if the underlying storage device is removed
To: Jens Axboe , Mike Snitzer , Lukas Czerner
Cc: LKML
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8BIT
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Aug 27, 2011 at 8:11 AM, Bart Van Assche wrote:
> Apparently blkdev_issue_discard() never times out, not even if the
> device has been removed. This is what appeared in the kernel log after
> device removal (triggered by running mkfs.ext4 on an SRP SCSI device
> node):
>
> [ ... ]

In case anyone is interested, I ran into a similar call stack with
3.1-rc6, this time in the truncate_inode_pages() call. I/O was started
while the SRP connection was fully operational, and the call stack was
reported after ib_srp had invoked scsi_remove_host(). That excludes the
ib_srp driver as a potential cause of this hang, doesn't it?

INFO: task fio:17621 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fio             D 000000010003baef     0 17621  17606 0x00000004
 ffff8800952498c8 0000000000000046 ffffffff813d81ef ffffffff81082bee
 ffff880000000000 ffff880095249fd8 ffff880095249fd8 ffff880095249fd8
 ffff8801a8bf4ce0 ffff880095249fd8 ffff880095249fd8 ffff880095248000
Call Trace:
 [] ? __schedule+0x66f/0x7d0
 [] ? mark_held_locks+0x6e/0x130
 [] ? __lock_page+0x70/0x70
 [] schedule+0x3f/0x60
 [] io_schedule+0x60/0x80
 [] sleep_on_page+0xe/0x20
 [] __wait_on_bit_lock+0x5a/0xc0
 [] ? find_get_pages+0x10f/0x1c0
 [] ? filemap_fault+0x4b0/0x4b0
 [] __lock_page+0x67/0x70
 [] ? autoremove_wake_function+0x50/0x50
 [] truncate_inode_pages_range+0x493/0x4a0
 [] truncate_inode_pages+0x15/0x20
 [] kill_bdev+0x37/0x40
 [] __blkdev_put+0x74/0x1c0
 [] blkdev_put+0x60/0x190
 [] blkdev_close+0x24/0x30
 [] fput+0xf8/0x230
 [] filp_close+0x66/0x90
 [] put_files_struct+0xf2/0x1d0
 [] ? put_files_struct+0x38/0x1d0
 [] exit_files+0x52/0x60
 [] do_exit+0x158/0x850
 [] ? get_signal_to_deliver+0xee/0x5d0
 [] ? _raw_spin_lock_irq+0x17/0x60
 [] ? _raw_spin_unlock_irq+0x30/0x50
 [] do_group_exit+0x5c/0xd0
 [] get_signal_to_deliver+0x230/0x5d0
 [] do_signal+0x6b/0x750
 [] ? hrtimer_cancel+0x22/0x30
 [] ? do_nanosleep+0xa4/0xd0
 [] ? hrtimer_nanosleep+0xac/0x150
 [] ? sysret_signal+0x5/0x3d
 [] do_notify_resume+0x5d/0x70
 [] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [] int_signal+0x12/0x17
1 lock held by fio/17621:
 #0:  (&bdev->bd_mutex){+.+.+.}, at: [] __blkdev_put+0x3f/0x1c0

Bart.