From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S965897AbeCHGUd (ORCPT ); Thu, 8 Mar 2018 01:20:33 -0500
Received: from aserp2130.oracle.com ([141.146.126.79]:47090 "EHLO
	aserp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S935407AbeCHGUa (ORCPT );
	Thu, 8 Mar 2018 01:20:30 -0500
From: Jianchao Wang
To: keith.busch@intel.com, axboe@fb.com, hch@lst.de, sagi@grimberg.me
Cc: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: [PATCH V4 0/5] nvme-pci: fixes on nvme_timeout and nvme_dev_disable
Date: Thu, 8 Mar 2018 14:19:26 +0800
Message-Id: <1520489971-31174-1-git-send-email-jianchao.w.wang@oracle.com>
X-Mailer: git-send-email 2.7.4
X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=8825
	signatures=668685
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0
	suspectscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0
	mlxscore=0 mlxlogscore=795 adultscore=0 classifier=spam adjust=0
	reason=mlx scancount=1 engine=8.0.1-1711220000
	definitions=main-1803080079
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

First of all, many thanks to Keith and Sagi for their valuable advice
on the previous versions. This is version 4. Some patches from the
previous patchset have already been submitted; what remains is this
patchset, which has been refactored. Please consider it for 4.17.

The goal of this patchset is to avoid invoking nvme_dev_disable from
nvme_timeout. nvme_dev_disable issues commands on the adminq, and if
the controller does not respond, it has to rely on the timeout path.
However, nvme_timeout also needs to invoke nvme_dev_disable. This
introduces a dangerous circular dependency. Moreover, nvme_dev_disable
holds the shutdown_lock even while it sleeps, which makes things worse.
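To make the dependency concrete, here is a small user-space model of
it. This is only a sketch, not driver code: the atomic flag stands in
for shutdown_lock, and disable_path_begin, disable_path_end and
timeout_path_try_disable are hypothetical names made up for the
illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for shutdown_lock: set while the disable path owns it. */
static atomic_flag shutdown_lock = ATOMIC_FLAG_INIT;

/* Try to take the model lock without blocking; true on success. */
static bool model_trylock(void)
{
	return !atomic_flag_test_and_set(&shutdown_lock);
}

static void model_unlock(void)
{
	atomic_flag_clear(&shutdown_lock);
}

/*
 * Hypothetical disable path: takes the lock, then would issue an admin
 * command and sleep until it completes or times out.
 */
static void disable_path_begin(void)
{
	bool locked = model_trylock();

	assert(locked);	/* the model allows one disable path at a time */
	(void)locked;
}

/* Hypothetical end of the disable path: the lock is released. */
static void disable_path_end(void)
{
	model_unlock();
}

/*
 * Hypothetical timeout path that wants to run the disable path itself
 * in order to complete the expired admin command.
 */
static bool timeout_path_try_disable(void)
{
	if (!model_trylock())
		return false;	/* owner sleeps, waiting on this timeout */
	model_unlock();
	return true;
}
```

While the disable path is between disable_path_begin() and
disable_path_end(), timeout_path_try_disable() cannot make progress,
yet the disable path is waiting for exactly that timeout handling. In
the real driver the timeout path would block on the lock instead of
returning false.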
The basic idea of this patchset is:
 - When reset_work needs to be scheduled, hand the expired requests
   over to nvme_dev_disable. They will be completed after the
   controller is disabled/shutdown.
 - When requests issued by nvme_dev_disable or nvme_reset_work expire,
   disable the controller directly so that the request can be
   completed to wake up the waiter. 'Disable the controller directly'
   here means that no commands are sent on the adminq. A new
   interface, nvme_pci_disable_ctrl_directly, is introduced for this.
   For more details, please refer to the comment on that function.

With this, nvme_timeout no longer depends on nvme_dev_disable.

Because this version differs greatly from the previous one, and some
relatively independent patches have already been submitted, I keep
only the key parts of the previous change log below.

Changes V3->V4:
 - refactor the interfaces for flushing in-flight requests and add
   them to nvme core
 - refactor nvme_timeout to make it clearer

Changes V2->V3:
 - discard the patch which unfreezes the queue after nvme_dev_disable

Changes V1->V2:
 - disable PCI controller bus master in nvme_pci_disable_ctrl_directly

There are 5 patches:
1st changes the operations on nvme_request->flags to atomic
    operations, so that another flag, NVME_REQ_ABORTED, can be
    introduced next.
2nd introduces two new interfaces in nvme core to flush in-flight
    requests.
3rd avoids calling nvme_dev_disable in nvme_timeout; it introduces the
    new interface nvme_pci_disable_ctrl_directly and refactors
    nvme_timeout.
4th~5th fix issues that appear once the 3rd patch is applied.
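The decision made in the timeout path, together with the atomic flag
update from the 1st patch, can be modelled in user space roughly as
follows. This is a sketch under assumptions, not the code in the
patches: struct model_req, req_test_and_set, timeout_decide, the bit
values and the choice to set the aborted bit when handing a request
over are all invented for illustration. Only the name NVME_REQ_ABORTED
comes from the series.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits; the values are made up for this model. */
enum {
	MODEL_REQ_CANCELLED = 1u << 0,
	MODEL_REQ_ABORTED   = 1u << 1,	/* models NVME_REQ_ABORTED */
};

/* What the timeout path decides to do with an expired request. */
enum timeout_action {
	HAND_OVER_TO_DISABLE,	/* reset_work scheduled, disable completes it */
	DISABLE_DIRECTLY,	/* no adminq commands, complete it right away */
};

/* Minimal stand-in for a request carrying atomically updated flags. */
struct model_req {
	atomic_uint flags;
};

/* Atomically set a flag bit; returns true if the bit was clear before. */
static bool req_test_and_set(struct model_req *rq, unsigned int bit)
{
	assert(rq);
	return !(atomic_fetch_or(&rq->flags, bit) & bit);
}

/*
 * from_disable_or_reset says whether the expired request was issued by
 * the disable or reset path itself (assumption: the caller knows this).
 */
static enum timeout_action timeout_decide(bool from_disable_or_reset,
					  struct model_req *rq)
{
	if (from_disable_or_reset)
		return DISABLE_DIRECTLY;

	/* Assumption: mark the request before handing it over. */
	req_test_and_set(rq, MODEL_REQ_ABORTED);
	return HAND_OVER_TO_DISABLE;
}
```

The flag is updated with an atomic fetch-or, so a concurrent update of
another bit in the same word cannot be lost, which a plain
read-modify-write would not guarantee.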
Jianchao Wang (5)
 0001-nvme-do-atomically-bit-operations-on-nvme_request.fl.patch
 0002-nvme-add-helper-interface-to-flush-in-flight-request.patch
 0003-nvme-pci-avoid-nvme_dev_disable-to-be-invoked-in-nvm.patch
 0004-nvme-pci-discard-wait-timeout-when-delete-cq-sq.patch
 0005-nvme-pci-add-the-timeout-case-for-DELETEING-state.patch

diff stat
 drivers/nvme/host/core.c |  96 +++++++++++++++++++++++++++++++++++++++++++++++
 drivers/nvme/host/nvme.h |   4 +-
 drivers/nvme/host/pci.c  | 224 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------------------------

Thanks
Jianchao