From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <5397636F.9050209@fb.com>
Date: Tue, 10 Jun 2014 13:58:39 -0600
From: Jens Axboe <axboe@fb.com>
To: Keith Busch
CC: Matias Bjørling, "willy@linux.intel.com", "sbradshaw@micron.com",
    "tom.leiming@gmail.com", "hch@infradead.org",
    "linux-kernel@vger.kernel.org", "linux-nvme@lists.infradead.org"
Subject: Re: [PATCH v7] NVMe: conversion to blk-mq
References: <1402392038-5268-1-git-send-email-m@bjorling.me>
    <1402392038-5268-2-git-send-email-m@bjorling.me>,
X-Mailing-List: linux-kernel@vger.kernel.org

On 06/10/2014 01:29 PM, Keith Busch wrote:
> On Tue, 10 Jun 2014, Jens Axboe wrote:
>>> On Jun 10, 2014, at 9:52 AM, Keith Busch wrote:
>>>
>>>> On Tue, 10 Jun 2014, Matias Bjørling wrote:
>>>> This converts the current NVMe driver to utilize the blk-mq layer.
>>>
>>> I'd like to run xfstests on this, but it is failing mkfs.xfs. I honestly
>>> don't know much about this area, but I think this may be from the recent
>>> chunk sectors patch causing a __bio_add_page to reject adding a new
>>> page.
>>
>> Gah, yes that's a bug in the chunk patch. It must always allow a
>> single page at any offset. I'll test and send out a fix.
>
> I have two devices, one formatted 4k, the other 512. The 4k is used as
> the TEST_DEV and 512 is used as SCRATCH_DEV. I'm always hitting a BUG when
> unmounting the scratch dev in xfstests generic/068. The bug looks like
> nvme was trying to use an SGL that doesn't map correctly to a PRP.

I'm guessing it's some of the coalescing settings, since the driver is
now using the generic block rq mapping.

> Also, it doesn't look like this driver can recover from an unresponsive
> device, leaving tasks in uninterruptible sleep state forever. Still looking
> into that one though; as far as I can tell the device is perfectly fine,
> but lots of "Cancelling I/O" messages are getting logged.

If the task is still stuck, some of the IOs must not be getting cancelled.
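
For readers following the SGL/PRP point above: the sketch below illustrates the general NVMe PRP rule that a scatter list only maps cleanly to PRPs if every segment after the first starts on a page boundary and every segment before the last ends on one. It is only an illustration of that rule, not code from the driver or the patch. The function name segments_fit_prp, the (address, length) tuple layout, and the 4096-byte page size are all assumptions made for the example.

```python
from typing import List, Tuple

# Assumed memory page size for the example; real controllers may use another.
PAGE_SIZE = 4096

# Hypothetical representation of one scatter list entry:
# (bus address, length in bytes).
Segment = Tuple[int, int]


def segments_fit_prp(segments: List[Segment], page_size: int = PAGE_SIZE) -> bool:
    """Return True if the segments could be described by a PRP list.

    Only the first segment may start at a non-zero page offset, and only
    the last segment may end short of a page boundary.
    """
    last = len(segments) - 1
    for i, (addr, length) in enumerate(segments):
        if length <= 0:
            return False
        # Every segment after the first must start on a page boundary.
        if i > 0 and addr % page_size != 0:
            return False
        # Every segment before the last must end on a page boundary.
        if i < last and (addr + length) % page_size != 0:
            return False
    return True
```

Under these assumptions, a single segment passes at any offset, while two 512-byte segments on different pages fail because the first one ends mid-page. That is the kind of layout that merging small requests could plausibly produce on a 512-byte-formatted device, though whether that is what happens in generic/068 is not established in this thread.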