From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754900AbbHJXEK (ORCPT ); Mon, 10 Aug 2015 19:04:10 -0400
Received: from g4t3426.houston.hp.com ([15.201.208.54]:49350 "EHLO
        g4t3426.houston.hp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751890AbbHJXEI (ORCPT );
        Mon, 10 Aug 2015 19:04:08 -0400
Subject: Re: regression introduced by "block: Add support for DAX reads/writes to block devices"
To: Dave Chinner
References: <20150805220113.GC3902@dastard> <55C2BB9E.3040709@hp.com>
        <20150806032421.GA16638@dastard> <55C3124F.3020602@plexistor.com>
        <20150806203450.GB16638@dastard> <55C714D0.8070003@plexistor.com>
        <55C8D208.1070903@hp.com> <20150810212728.GJ3902@dastard>
Cc: Boaz Harrosh , Jeff Moyer , "matthew r. wilcox" ,
        linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
        Vishal Verma
From: Linda Knippers
X-Enigmail-Draft-Status: N1110
Message-ID: <55C92DE6.4080505@hp.com>
Date: Mon, 10 Aug 2015 19:04:06 -0400
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.1.0
MIME-Version: 1.0
In-Reply-To: <20150810212728.GJ3902@dastard>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 8/10/2015 5:27 PM, Dave Chinner wrote:
> On Mon, Aug 10, 2015 at 12:32:08PM -0400, Linda Knippers wrote:
>> On 8/9/2015 4:52 AM, Boaz Harrosh wrote:
>>> On 08/06/2015 11:34 PM, Dave Chinner wrote:
>>>> On Thu, Aug 06, 2015 at 10:52:47AM +0300, Boaz Harrosh wrote:
>>>>> On 08/06/2015 06:24 AM, Dave Chinner wrote:
>>>>>> On Wed, Aug 05, 2015 at 09:42:54PM -0400, Linda Knippers wrote:
>>>>>>> On 08/05/2015 06:01 PM, Dave Chinner wrote:
>>>>>>>> On Wed, Aug 05, 2015 at 04:19:08PM -0400, Jeff Moyer wrote:
>>>>> <>
>>>>>>>>>
>>>>>>>>> I sat down with Linda to look into it, and the problem is that mkfs.xfs
>>>>>>>>> sets the blocksize of the device to 512 (via BLKBSZSET), and then reads
>>>>>>>>> from the last sector of the device. This results in dax_io trying to do
>>>>>>>>> a page-sized I/O at 512 bytes from the end of the device.
>>>>>>>>
>>>>>
>>>>> This part I do not understand. how is mkfs.xfs reading the sector?
>>>>> Is it through open(/dev/pmem0,...) ? O_DIRECT?
>>>>
>>>> mkfs.xfs uses O_DIRECT. Only if open(O_DIRECT) fails or mkfs.xfs is
>>>> told that it is working on an image file does it fall back to
>>>> buffered IO. All of the XFS userspace tools work this way to prevent
>>>> page cache pollution issues with read-once or write-once data during
>>>> operation.
> ....
>> That patch does cause 'mkfs -t xfs' to work.
>>
>> Before:
>> $ sudo mkfs -t xfs -f /dev/pmem3
>> meta-data=/dev/pmem3             isize=256    agcount=4, agsize=524288 blks
>>          =                       sectsz=512   attr=2, projid32bit=1
>                                   ^^^^^^^^^^
> ....
>
>> $ sudo mkfs -t xfs -f /dev/pmem3
>> meta-data=/dev/pmem3             isize=256    agcount=4, agsize=524288 blks
>>          =                       sectsz=4096  attr=2, projid32bit=1
>                                   ^^^^^^^^^^^
>
> So in the after case, mkfs.xfs is behaving differently and not
> exercising the bug. It's seen the:
>
>> $ cat /sys/block/pmem3/queue/logical_block_size
>> 512
>> $ cat /sys/block/pmem3/queue/physical_block_size
>> 4096
>  ^^^^
>
> 4k physical block size, and hence configured the filesystem with a
> 4k sector size so all IO it issues is physically aligned. IOWs,
> mkfs.xfs's last sector read is 4k aligned and sized, and therefore
> the test has not confirmed that the patch fixes the 512 byte last
> sector read is fixed at all.

That is true. All I reported is that mkfs.xfs now works. I had some
questions about whether that was right.

The underlying problem is still there, as I can demonstrate with a simple
reproducer that just does what mkfs.xfs would have done before.

> Isn't there a regression test suite that covers basic block device
> functionality that you can use to test these simple corner cases?

If there is, seems like DAX adds a few twists.

-- ljk

>
> Cheers,
>
> Dave.
>
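
For illustration only: the reproducer mentioned above is not included in
this message, so what follows is a minimal Python sketch of the access
pattern described in the thread (set the block size to 512 via BLKBSZSET,
then O_DIRECT-read the last 512-byte sector), not the actual test. The
function names are hypothetical, and the ioctl request numbers are an
assumption: they are the x86-64 encodings of the <linux/fs.h> definitions
and may differ on other architectures. The two helper functions only do the
arithmetic behind "a page-sized I/O at 512 bytes from the end of the
device".

```python
import fcntl
import mmap
import os
import struct

# Assumed x86-64 encodings of the <linux/fs.h> ioctls; other
# architectures may encode the direction/size bits differently.
BLKBSZSET = 0x40081271     # _IOW(0x12, 113, size_t)
BLKGETSIZE64 = 0x80081272  # _IOR(0x12, 114, size_t)

SECTOR = 512
PAGE = 4096


def last_sector_offset(dev_bytes, sector=SECTOR):
    """Byte offset of the final full sector on a device of dev_bytes bytes."""
    return (dev_bytes // sector - 1) * sector


def overrun(dev_bytes, offset, io_size=PAGE):
    """Bytes by which an io_size transfer starting at offset passes the end."""
    return max(0, offset + io_size - dev_bytes)


def read_last_sector(path):
    """Set a 512-byte block size, then O_DIRECT-read the last sector.

    Hypothetical helper: needs root and a real block device, and raises
    OSError if any step fails. Returns the number of bytes read.
    """
    fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
    try:
        # Device size in bytes (the kernel writes a 64-bit value).
        raw = fcntl.ioctl(fd, BLKGETSIZE64, b"\0" * 8)
        dev_bytes = struct.unpack("Q", raw)[0]
        # The kernel reads the new block size as an int.
        fcntl.ioctl(fd, BLKBSZSET, struct.pack("i", SECTOR))
        # Anonymous mappings are page-aligned, which O_DIRECT requires.
        buf = mmap.mmap(-1, SECTOR)
        return os.preadv(fd, [buf], last_sector_offset(dev_bytes))
    finally:
        os.close(fd)
```

On a hypothetical 8 GiB device, last_sector_offset() gives the device size
minus 512, and overrun() shows that a 4096-byte transfer started there
would extend 3584 bytes past the end. Calling read_last_sector("/dev/pmem3")
as root exercises the read itself; what an affected kernel returns for it
is not stated in this thread, so no expected output is given here.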