From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752223AbdKFXMe (ORCPT ); Mon, 6 Nov 2017 18:12:34 -0500
Received: from mail-it0-f66.google.com ([209.85.214.66]:44336 "EHLO
        mail-it0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751552AbdKFXMc (ORCPT ); Mon, 6 Nov 2017 18:12:32 -0500
X-Google-Smtp-Source: ABhQp+T3g0jO83NYLQKgUX/lglkwdOiAbZAb5Pp6a1N/eKSkHyQ5ayxIaUx9exYiFOzxhyS+Q/g0VFFTJKiBXXLuM+Y=
MIME-Version: 1.0
In-Reply-To: <20171106225354.6ucl4f4ipsjlntzl@wfg-t540p.sh.intel.com>
References: <20171029225155.qcum5i75awrt5tzm@wfg-t540p.sh.intel.com>
        <20171029231835.3725fnd5yehlmqob@wfg-t540p.sh.intel.com>
        <20171030110511.scfrdtlnf5lbdhu5@pd.tnic>
        <526e7cf2-0672-e44b-c32f-26128a2dfd37@codeaurora.org>
        <20171106224635.qopgsszwxzuitkpf@wfg-t540p.sh.intel.com>
        <20171106225354.6ucl4f4ipsjlntzl@wfg-t540p.sh.intel.com>
From: Linus Torvalds
Date: Mon, 6 Nov 2017 15:12:31 -0800
X-Google-Sender-Auth: KzQu5p6xxRLnpA8XaqXDJOgfung
Message-ID:
Subject: Re: [ata_scsi_offline_dev] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:238
To: Fengguang Wu
Cc: IDE-ML, Christoph Hellwig, Tejun Heo, Hannes Reinecke,
        Linux Kernel Mailing List, Johannes Thumshirn,
        "Martin K. Petersen", linux-scsi, James Bottomley
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Nov 6, 2017 at 2:53 PM, Fengguang Wu wrote:
>
> The same dmesg happen to contain another libata related bug. Attached again.
> It's rare and in the error handling path, so unlikely a new regression.
>
> [ 49.608280] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:238
> [ 49.647821] mutex_lock+0x20/0x50
> [ 49.651443] kernfs_find_and_get_ns+0x23/0x60
> [ 49.656104] sysfs_notify+0x77/0x90
> [ 49.659900] scsi_device_set_state+0x63/0x150
> [ 49.664559] ata_scsi_offline_dev+0x1c/0x30 [libata]
> [ 49.669817] ata_eh_detach_dev+0x3b/0xb0 [libata]

ata_eh_detach_dev() does

        spin_lock_irqsave(ap->lock, flags);

and then does

        if (ata_scsi_offline_dev(dev)) {
                dev->flags |= ATA_DFLAG_DETACHED;
                ap->pflags |= ATA_PFLAG_SCSI_HOTPLUG;
        }

inside that spinlock.

And this code is not new - it has done it since 2006 or so.

But it does seem to be a new regression in 4.14, caused by commit
8a97712e5314 ("scsi: make 'state' device attribute pollable"), because
that's what added the sysfs_notify() call to scsi_device_set_state(),
which made that spinlock be a problem.

That commit came in through the SCSI merge this merge window, and it
seems to still revert cleanly.

So I do suspect that by now we should just revert that commit. It's
not clear why that state attribute should be pollable, and the new
code is clearly very much buggy.

Hannes, Martin?

             Linus
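
For readers following the call chain above, here is a rough userspace
model of why the notify call trips the check. It is only a sketch, not
kernel code: every model_* and detach_* name is hypothetical, the
"spinlock" is just a depth counter, and the "might sleep" check counts
violations instead of printing a BUG. The deferred variant shows one
general way to keep a sleeping call out of an atomic section; it is an
illustration only, not the fix proposed in this mail, which is a revert.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
static int atomic_depth;     /* > 0 while a modeled spinlock is held */
static int sleep_violations; /* times a sleeping call ran in atomic context */

static void model_spin_lock(void)
{
    atomic_depth++;
}

static void model_spin_unlock(void)
{
    assert(atomic_depth > 0); /* unlock without a matching lock is a bug */
    atomic_depth--;
}

/* Stand-in for a sleeping notify (it would take a mutex in the kernel). */
static void model_notify(void)
{
    if (atomic_depth > 0)
        sleep_violations++; /* "sleeping function called from invalid context" */
}

/* Stand-in for a state setter that also notifies, as in the trace above. */
static void model_set_state_and_notify(int *state, int new_state)
{
    *state = new_state;
    model_notify();
}

/* Problem shape: the notifying setter is called with the lock held. */
static void detach_buggy(int *state)
{
    model_spin_lock();
    model_set_state_and_notify(state, 1);
    model_spin_unlock();
}

/* Alternative shape: change state under the lock, notify after unlocking. */
static void detach_deferred(int *state)
{
    bool need_notify = false;

    model_spin_lock();
    if (*state != 1) {
        *state = 1;
        need_notify = true;
    }
    model_spin_unlock();

    if (need_notify)
        model_notify();
}
```

In this model detach_buggy() records one violation per call, while
detach_deferred() records none, because the notify runs only after the
lock depth has dropped back to zero.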