From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753553AbbHMTcl (ORCPT ); Thu, 13 Aug 2015 15:32:41 -0400
Received: from mga01.intel.com ([192.55.52.88]:53557 "EHLO mga01.intel.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1752012AbbHMTcj convert rfc822-to-8bit (ORCPT );
        Thu, 13 Aug 2015 15:32:39 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.15,672,1432623600"; d="scan'208";a="541425433"
From: "Wilcox, Matthew R"
To: Jeff Moyer, Linda Knippers
CC: Boaz Harrosh, "linux-kernel@vger.kernel.org", "linux-fsdevel@vger.kernel.org", "Verma, Vishal L"
Subject: RE: regression introduced by "block: Add support for DAX reads/writes to block devices"
Thread-Topic: regression introduced by "block: Add support for DAX reads/writes to block devices"
Thread-Index: AQHQ1fSJbz3w0Nd/aUKgc2IIOADimZ4KQ6BA
Date: Thu, 13 Aug 2015 19:32:10 +0000
Message-ID: <100D68C7BA14664A8938383216E40DE040915418@FMSMSX114.amr.corp.intel.com>
References: <100D68C7BA14664A8938383216E40DE04091408C@FMSMSX114.amr.corp.intel.com> <100D68C7BA14664A8938383216E40DE0409144D9@FMSMSX114.amr.corp.intel.com> <55C855D5.1070001@plexistor.com> <55CC2BDA.3080906@plexistor.com> <55CCC8F8.6080204@hp.com> <55CCD94D.7040807@hp.com>
In-Reply-To:
Accept-Language: en-CA, en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-originating-ip: [10.1.200.108]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 8BIT
MIME-Version: 1.0
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

I liked the patch you were pushing to request the *page* containing the requested bytes instead of the *block* containing the requested bytes. For the misaligned partition problem, I was thinking we should change the direct_access API to return a phys_addr_t instead of a pfn.
That way we can return something that isn't actually page aligned, and DAX can take care of making sure it doesn't overshoot the end.

-----Original Message-----
From: Jeff Moyer [mailto:jmoyer@redhat.com]
Sent: Thursday, August 13, 2015 11:19 AM
To: Linda Knippers
Cc: Boaz Harrosh; Wilcox, Matthew R; linux-kernel@vger.kernel.org; linux-fsdevel@vger.kernel.org; Verma, Vishal L
Subject: Re: regression introduced by "block: Add support for DAX reads/writes to block devices"

Linda Knippers writes:

>>> It causes the physical block size to be PAGE_SIZE but the
>>> logical block size is still 512. However, the minimum_io_size
>>> is now 4096 (same as physical block size, I assume). The
>>> optimal_io_size is still 0. What does that mean?
>>
>> physical block size - device's internal block size
>> logical block size - addressable unit
>
> Right, but it's still reported as 512 and that doesn't work.

Understood. :)

>> optimal io size - device's preferred unit for streaming
>
> So 0 is ok.

Correct.

>> We can change the block device to export logical/physical block sizes of
>> PAGE_SIZE. However, when persistent memory support comes to platforms
>> that support page sizes > 32k, xfs will again run into problems (Dave
>> Chinner mentioned that xfs can't deal with logical block sizes >32k.)
>> Arguably, you can use pmem and dax on such platforms using RAM today for
>> testing. Do we care about breaking that?
>
> I would think so. AARCH64 uses 64k pages today.

So does powerpc, but I guess nobody cares about that anymore. ;-)

If the logical block size is smaller than the page size, we're going to
have to deal with sub-page I/O. For now, we can do as Boaz suggested,
and just turn off dax for those configurations.

We could also just revert the patch that introduced this problem. I
really don't know who is going to care about O_DIRECT I/O performance to
a persistent memory block device. Willy? What was the real motivation
there?

> I think Documentation/filesystems/dax.txt could use a little update
> too. It has a section "Implementation Tips for Block Driver Writers"
> that makes it sound easy but now I wonder if it even works with the
> example ram drivers. Should we be able to read any 512 byte
> "sector"?

If the logical block size is 512 bytes, then you have to be able to do
(direct) I/O to any 512 byte sector. Simple as that.

Cheers,
Jeff
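
The misaligned-partition point raised in this thread comes down to a little address arithmetic. The following is only a sketch, not kernel code: it assumes a 4096-byte page and a 512-byte logical block size, and the device base, device size, and partition start are hypothetical values chosen for illustration. It shows that a page frame number can only name the page containing a sector, while a byte-granular physical address keeps the sub-page offset, and that the caller still has to clamp the length so it does not run past the end of the device.

```python
PAGE_SIZE = 4096     # assumed page size; the thread notes 64k pages on some platforms
SECTOR_SIZE = 512    # logical block size discussed in the thread


def sector_to_phys(dev_base, part_start_sector, sector):
    """Byte address of a sector inside a partition (hypothetical helper)."""
    return dev_base + (part_start_sector + sector) * SECTOR_SIZE


def as_pfn(phys):
    """All a pfn-style return value can express: the containing page."""
    return phys // PAGE_SIZE


def usable_bytes(phys, dev_base, dev_size, want):
    """Clamp a request so it does not overshoot the end of the device."""
    end = dev_base + dev_size
    return max(0, min(want, end - phys))


# Hypothetical layout: page-aligned device, partition starting at sector 63.
dev_base = 0x1_0000_0000
dev_size = 16 * PAGE_SIZE
part_start = 63

phys = sector_to_phys(dev_base, part_start, 0)
offset_in_page = phys % PAGE_SIZE

# The partition's first sector is not page aligned, so converting to a
# pfn and back loses the offset within the page.
print(offset_in_page)
print(as_pfn(phys) * PAGE_SIZE == phys)

# The last sector of the device: only one sector's worth of bytes remains,
# even if the caller asked for a whole page.
last = sector_to_phys(dev_base, part_start, 64)
print(usable_bytes(last, dev_base, dev_size, PAGE_SIZE))
```

With a partition that starts on a page boundary the offset is zero and the two representations agree, which is why the problem only shows up for misaligned partitions or for logical block sizes smaller than the page size.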