From mboxrd@z Thu Jan 1 00:00:00 1970 From: Bart Van Assche Subject: Re: [PATCH 0/3] Unblock SCSI devices even if the LLD is being unloaded Date: Thu, 16 Mar 2017 23:19:52 +0000 Message-ID: <1489706379.2574.24.camel@sandisk.com> References: <20170316205650.3322-1-bart.vanassche@sandisk.com> <1489704797.4650.8.camel@HansenPartnership.com> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Return-path: Received: from esa1.hgst.iphmx.com ([68.232.141.245]:52650 "EHLO esa1.hgst.iphmx.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753465AbdCPXZb (ORCPT ); Thu, 16 Mar 2017 19:25:31 -0400 In-Reply-To: <1489704797.4650.8.camel@HansenPartnership.com> Content-Language: en-US Content-ID: <8AB17B94D67ACB4C9F047893321BE62A@sandisk.com> Sender: linux-scsi-owner@vger.kernel.org List-Id: linux-scsi@vger.kernel.org To: "James.Bottomley@HansenPartnership.com" , "martin.petersen@oracle.com" Cc: "linux-scsi@vger.kernel.org" , "maxg@mellanox.com" , "israelr@mellanox.com" , "hare@suse.de" On Thu, 2017-03-16 at 15:53 -0700, James Bottomley wrote: > On Thu, 2017-03-16 at 13:56 -0700, Bart Van Assche wrote: > > scsi_target_unblock() must unblock SCSI devices even if this function > > is called after unloading of the LLD that created these devices has > > started. This is necessary to prevent that __scsi_remove_device()=20 > > hangs on the SYNCHRONIZE CACHE command issued by the sd driver during > > shutdown. >=20 > Your special get function misses the try_module_get(). But this is all > really a bit ugly. Since the only problem is the SYNC CACHE triggered > by device_del, isn't a better solution a new state: SDEV_CANCEL_BLOCK.=20 > This will make the device visible to scsi_get_device() and we can take > it back from CANCEL_BLOCKED->CANCEL when the queue is unblocked. I > suspect we could also simply throw away the sync cache command when the > device is blocked (the cache should destage naturally in the time it > takes for the device to be unblocked). Hello James, The purpose of this patch series is to make sure that unblock also occurs after module unloading has started. My understanding of try_module_get() is that it fails once module unloading has started.=A0In other words, it is on purpose that try_module_get() is not called. From the kernel module code: bool try_module_get(struct module *module) { bool ret =3D true; if (module) { preempt_disable(); /* Note: here, we can fail to get a reference */ if (likely(module_is_live(module) && atomic_inc_not_zero(&module->refcnt) !=3D 0)) trace_module_get(module, _RET_IP_); else ret =3D false; preempt_enable(); } return ret; } static inline int module_is_live(struct module *mod) { return mod->state !=3D MODULE_STATE_GOING; } Regarding introducing a new device state: this is something I would like to avoid. Any code that manipulates the SCSI device state is unnecessarily har= d to modify because multiple types of state information have been mixed up in a single state variable (blocked / not blocked; created / running / cancel = / offline). Additionally, the SCSI device state is visible to user space. Adding a new SCSI device state could break existing user space applications= . There is another problem with the introduction of the SDEV_CANCEL_BLOCKED state: we do not want open() calls to succeed for devices that are in the SDEV_DEL, SDEV_CANCEL nor for devices that are in the SDEV_CANCEL_BLOCKED state. scsi_disk_get() in the sd driver relies on scsi_device_get() to chec= k the SCSI device state. If scsi_device_get() would succeed for devices in th= e SDEV_CANCEL_BLOCKED state then an explicit check for that state would have to be added in several users of scsi_device_get(). In other words, I think adding the SDEV_CANCEL_BLOCKED state would result in a much more complex an= d also harder to test patch. Thanks, Bart.=