linux-kernel.vger.kernel.org archive mirror
From: Jan Kara <jack@suse.cz>
To: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Jan Kara <jack@suse.cz>,
	Xiao Guangrong <guangrong.xiao@linux.intel.com>,
	linux-ext4@vger.kernel.org, "Theodore Ts'o" <tytso@mit.edu>,
	Dan Williams <dan.j.williams@intel.com>,
	Yumei Huang <yuhuang@redhat.com>, KVM <kvm@vger.kernel.org>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@ml01.01.org>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux ACPI <linux-acpi@vger.kernel.org>,
	Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: DAX can not work on virtual nvdimm device
Date: Fri, 9 Sep 2016 11:19:25 +0200
Message-ID: <20160909091925.GF22777@quack2.suse.cz>
In-Reply-To: <20160908204708.GA15167@linux.intel.com>

On Thu 08-09-16 14:47:08, Ross Zwisler wrote:
> On Tue, Sep 06, 2016 at 05:06:20PM +0200, Jan Kara wrote:
> > On Thu 01-09-16 20:57:38, Ross Zwisler wrote:
> > > On Wed, Aug 31, 2016 at 04:44:47PM +0800, Xiao Guangrong wrote:
> > > > On 08/31/2016 01:09 AM, Dan Williams wrote:
> > > > > 
> > > > > Can you post your exact reproduction steps?  This test is not failing for me.
> > > > > 
> > > > 
> > > > Sure.
> > > > 
> > > > 1. make the guest kernel based on your tree, the top commit is
> > > >    10d7902fa0e82b (dax: unmap/truncate on device shutdown) and
> > > >    the config file can be found in this thread.
> > > > 
> > > > 2. add guest kernel command line: memmap=6G!10G
> > > > 
> > > > 3: start the guest:
> > > >    x86_64-softmmu/qemu-system-x86_64 -machine pc,nvdimm --enable-kvm \
> > > >    -smp 16 -m 32G,maxmem=100G,slots=100 /other/VMs/centos6.img -monitor stdio
> > > > 
> > > > 4: in guest:
> > > >    mkfs.ext4 /dev/pmem0
> > > >    mount -o dax /dev/pmem0  /mnt/pmem/
> > > >    echo > /mnt/pmem/xxx
> > > >    ./mmap /mnt/pmem/xxx
> > > >    ./read /mnt/pmem/xxx
> > > > 
> > > >   The source code of mmap and read has been attached in this mail.
> > > > 
> > > >   Hopefully, you can detect the error triggered by read test.
> > > > 
> > > > Thanks!
> > > 
> > > Okay, I think I've isolated this issue.  Xiao's VM was an old CentOS 6 system,
> > > and for some reason ext4+DAX with the old tools found in that VM fails.  I was
> > > able to reproduce this failure with a freshly installed CentOS 6.8 VM.
> > > 
> > > You can see the failure with his tests, or perhaps more easily with this
> > > series of commands:
> > > 
> > >   # mkfs.ext4 /dev/pmem0
> > >   # mount -o dax /dev/pmem0  /mnt/pmem/
> > >   # touch /mnt/pmem/x
> > >   # md5sum /mnt/pmem/x
> > >   md5sum: /mnt/pmem/x: Bad address
> > > 
> > > This sequence of commands works fine in the old CentOS 6 system if you use XFS
> > > instead of ext4, and it works fine with both ext4 and XFS in CentOS 7 and
> > > with recent versions of Fedora.
> > > 
> > > I've added the ext4 folks to this mail in case they care, but my guess is that
> > > the tools in CentOS 6 are so old that it's not worth worrying about.  For
> > > reference, the kernel in CentOS 6 is based on 2.6.32.  :)  DAX was introduced
> > > in v4.0.
> > 
> > Hum, can you post 'dumpe2fs -h /dev/pmem0' output from that system when the
> > md5sum fails? Because the only idea I have is that mkfs.ext4 in CentOS 6
> > creates the filesystem with a different set of features than more recent
> > e2fsprogs and so we hit some untested path...
> 
> Sure, here's the output:
> 
> # dumpe2fs -h /dev/pmem0 
> dumpe2fs 1.41.12 (17-May-2010)
> Filesystem volume name:   <none>
> Last mounted on:          /mnt/pmem
> Filesystem UUID:          4cd8a836-cc54-4c59-ae0a-4a26bab0f8bc
> Filesystem magic number:  0xEF53
> Filesystem revision #:    1 (dynamic)
> Filesystem features:      has_journal ext_attr resize_inode dir_index filetype
> needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg
> dir_nlink extra_isize
> Filesystem flags:         signed_directory_hash 
> Default mount options:    (none)
> Filesystem state:         clean
> Errors behavior:          Continue
> Filesystem OS type:       Linux
> Inode count:              1048576
> Block count:              4194304
> Reserved block count:     209715
> Free blocks:              4084463
> Free inodes:              1048565
> First block:              0
> Block size:               4096
> Fragment size:            4096
> Reserved GDT blocks:      1023
> Blocks per group:         32768
> Fragments per group:      32768
> Inodes per group:         8192
> Inode blocks per group:   512
> RAID stride:              1
> Flex block group size:    16
> Filesystem created:       Thu Sep  8 14:45:31 2016
> Last mount time:          Thu Sep  8 14:45:39 2016
> Last write time:          Thu Sep  8 14:45:39 2016
> Mount count:              1
> Maximum mount count:      21
> Last checked:             Thu Sep  8 14:45:31 2016
> Check interval:           15552000 (6 months)
> Next check after:         Tue Mar  7 13:45:31 2017
> Lifetime writes:          388 MB
> Reserved blocks uid:      0 (user root)
> Reserved blocks gid:      0 (group root)
> First inode:              11
> Inode size:	          256
> Required extra isize:     28
> Desired extra isize:      28
> Journal inode:            8
> Default directory hash:   half_md4
> Directory Hash Seed:      19cad581-c46a-4212-bfa0-d527ff55db49
> Journal backup:           inode blocks
> Journal features:         (none)
> Journal size:             128M
> Journal length:           32768
> Journal sequence:         0x00000002
> Journal start:            1

Hum, nothing unusual in there. I've tried reproducing on a local SLE11 SP3
machine (which is from about the same time) but everything works as
expected there. Shrug...
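
For anyone who wants to check the feature-set theory mechanically, the
"Filesystem features:" line from a failing and a working system can be
diffed. What follows is a minimal sketch, assuming the 'dumpe2fs -h' output
has been saved as text. The names parse_features and diff_features are
hypothetical helpers, not part of e2fsprogs, and the sample holds only the
features line quoted above (wrapped across lines as it appears in this mail).

```python
# Sample: the feature list quoted in this thread (CentOS 6, e2fsprogs 1.41.12).
CENTOS6_SAMPLE = """\
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype
needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg
dir_nlink extra_isize
Filesystem flags:         signed_directory_hash
"""


def parse_features(dumpe2fs_text):
    """Return the set of feature names found in `dumpe2fs -h` output.

    Continuation lines are accepted because mail clients may wrap the
    features line; collection stops at the next "Key: value" line.
    """
    features = []
    collecting = False
    for line in dumpe2fs_text.splitlines():
        # Drop any mail quoting prefix such as "> " before parsing.
        line = line.lstrip("> ").strip()
        if line.startswith("Filesystem features:"):
            collecting = True
            features.extend(line.split(":", 1)[1].split())
        elif collecting:
            if not line or ":" in line:
                break
            features.extend(line.split())
    return set(features)


def diff_features(first, second):
    """Return (only_in_first, only_in_second) as sorted lists."""
    return sorted(first - second), sorted(second - first)


if __name__ == "__main__":
    old = parse_features(CENTOS6_SAMPLE)
    # Illustrative only: pretend the other system also reports "64bit".
    # This second set is an assumption, not output from any real machine.
    other = old | {"64bit"}
    only_old, only_other = diff_features(old, other)
    print("only in first: ", only_old)
    print("only in second:", only_other)
```

Two empty lists would mean both mkfs.ext4 versions enabled the same features,
in which case the difference has to be looked for elsewhere.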

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

Thread overview: 16+ messages
2016-08-19 11:19 DAX can not work on virtual nvdimm device Xiao Guangrong
2016-08-19 14:59 ` Dan Williams
2016-08-19 18:30   ` Ross Zwisler
2016-08-21  9:55     ` Boaz Harrosh
2016-08-29  7:54     ` Xiao Guangrong
2016-08-29 19:30       ` Ross Zwisler
2016-08-30  6:53         ` Xiao Guangrong
2016-08-30 17:09           ` Dan Williams
2016-08-31  8:44             ` Xiao Guangrong
2016-08-31 16:46               ` Ross Zwisler
2016-09-02  2:57               ` Ross Zwisler
2016-09-06 15:06                 ` Jan Kara
2016-09-08 20:47                   ` Ross Zwisler
2016-09-09  9:19                     ` Jan Kara [this message]
2016-09-09 14:03                       ` Theodore Ts'o
2016-09-09 16:34                         ` Ross Zwisler
