Date: Mon, 20 Jun 2016 15:40:26 -0400
From: Mike Snitzer
To: "Kani, Toshimitsu"
Cc: "linux-kernel@vger.kernel.org", "linux-nvdimm@ml01.01.org",
    "agk@redhat.com", "linux-raid@vger.kernel.org",
    "viro@zeniv.linux.org.uk", "dan.j.williams@intel.com",
    "axboe@kernel.dk", "ross.zwisler@linux.intel.com",
    "dm-devel@redhat.com", sandeen@redhat.com
Subject: Re: [PATCH 0/6] Support DAX for device-mapper dm-linear devices
Message-ID: <20160620194026.GA21657@redhat.com>
References: <1465856497-19698-1-git-send-email-toshi.kani@hpe.com>
 <20160613225756.GA18417@redhat.com>
 <20160620180043.GA21261@redhat.com>
 <1466446861.3504.243.camel@hpe.com>
In-Reply-To: <1466446861.3504.243.camel@hpe.com>

On Mon, Jun 20 2016 at 2:31pm -0400,
Kani, Toshimitsu wrote:

> On Mon, 2016-06-20 at 14:00 -0400, Mike Snitzer wrote:
> >
> > I rebased your patches on linux-dm.git's 'for-next' (which includes what
> > I've already staged for the 4.8 merge window).  And I folded/changed
> > some of the DM patches so that there are only 2 now (1 for DM core and 1
> > for dm-linear).
> > Please see the 4 topmost commits in my 'wip' here:
> >
> > http://git.kernel.org/cgit/linux/kernel/git/snitzer/linux.git/log/?h=wip
> >
> > Feel free to pick these patches up to use as the basis for continued
> > work or re-posting of this set... either that or I could post them as
> > v2 on your behalf.
> >
> > As for testing, I've verified that basic IO works to a pmem-based DM
> > linear device and that mixed table types are rejected as expected.
>
> Great!  I will send an additional patch, adding DAX support to dm-stripe,
> on top of these once I finish my testing.

I did some further testing and am seeing some XFS corruption when
testing a DM linear device that spans multiple pmem devices.

I created 2 partitions on top of /dev/pmem0 (which I created using the
howto from https://nvdimm.wiki.kernel.org).  Then I did:

# pvcreate /dev/pmem0p1
# pvcreate /dev/pmem0p2
# vgcreate pmem /dev/pmem0p1 /dev/pmem0p2
# lvcreate -L 2.9G -n lv pmem

# lsblk /dev/pmem0
NAME        MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
pmem0       259:0    0   6G  0 disk
├─pmem0p1   259:1    0   1G  0 part
│ └─pmem-lv 253:4    0 2.9G  0 lvm
└─pmem0p2   259:2    0   2G  0 part
  └─pmem-lv 253:4    0 2.9G  0 lvm

# dmsetup table pmem-lv
0 4186112 linear 259:2 2048
4186112 1900544 linear 259:1 2048

# mkfs.xfs /dev/pmem/lv
# mount -o dax -t xfs /dev/pmem/lv /mnt/dax
[11452.212034] XFS (dm-4): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
[11452.220323] XFS (dm-4): Mounting V4 Filesystem
[11452.226526] XFS (dm-4): Ending clean mount

# dd if=/dev/zero of=/mnt/dax/meh bs=1024K oflag=direct
[11729.754671] XFS (dm-4): Metadata corruption detected at xfs_agf_read_verify+0x70/0x120 [xfs], xfs_agf block 0x45a808
[11729.766423] XFS (dm-4): Unmount and run xfs_repair
[11729.771774] XFS (dm-4): First 64 bytes of corrupted metadata buffer:
[11729.778869] ffff8800b8038000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[11729.788582] ffff8800b8038010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[11729.798293] ffff8800b8038020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[11729.808002] ffff8800b8038030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[11729.817715] XFS (dm-4): metadata I/O error: block 0x45a808 ("xfs_trans_read_buf_map") error 117 numblks 8

When this XFS corruption occurs, corruption also manifests in lvm2's
metadata:

# vgremove pmem
Do you really want to remove volume group "pmem" containing 1 logical volumes? [y/n]: y
Do you really want to remove active logical volume lv? [y/n]: y
  Incorrect metadata area header checksum on /dev/pmem0p1 at offset 4096
  WARNING: Failed to write an MDA of VG pmem.
  Incorrect metadata area header checksum on /dev/pmem0p2 at offset 4096
  WARNING: Failed to write an MDA of VG pmem.
  Failed to write VG pmem.
  Incorrect metadata area header checksum on /dev/pmem0p2 at offset 4096
  Incorrect metadata area header checksum on /dev/pmem0p1 at offset 4096

If I don't use XFS, and only issue IO directly to /dev/pmem/lv, I don't
see this corruption.
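For what it's worth, a quick sanity check of the sector arithmetic in the
dmsetup table above (a sketch, not part of the reproducer; it assumes XFS
reports the failing address in 512-byte basic blocks, which is the usual
xfs_daddr_t convention):

```python
# Segments from 'dmsetup table pmem-lv', all values in 512-byte sectors:
#   (start offset in LV, length, backing device, offset on backing device)
segments = [
    (0,       4186112, "259:2", 2048),  # pmem0p2
    (4186112, 1900544, "259:1", 2048),  # pmem0p1
]

total = sum(length for _, length, _, _ in segments)
print("LV size: %.2f GiB" % (total * 512 / 2**30))  # matches lvcreate -L 2.9G

# Failing xfs_agf block from the dmesg output (assumed 512-byte units).
daddr = 0x45A808
for lv_start, length, dev, dev_start in segments:
    if lv_start <= daddr < lv_start + length:
        print("block 0x%x maps to %s sector %d"
              % (daddr, dev, daddr - lv_start + dev_start))
```

Under that assumption the failing block lands in the second linear segment
(on 259:1, /dev/pmem0p1), i.e. past the segment boundary at sector 4186112,
which is consistent with the corruption only appearing once the LV spans
both pmem partitions.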