From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756728AbcH2H7g (ORCPT ); Mon, 29 Aug 2016 03:59:36 -0400
Received: from mga05.intel.com ([192.55.52.43]:61649 "EHLO mga05.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1756681AbcH2H7b (ORCPT ); Mon, 29 Aug 2016 03:59:31 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.28,595,1464678000"; d="scan'208";a="872104574"
Subject: Re: DAX can not work on virtual nvdimm device
To: Ross Zwisler, Dan Williams
References: <436d7526-bf06-633d-afce-4333552d9e31@linux.intel.com>
	<20160819183047.GA7216@linux.intel.com>
Cc: Yumei Huang, KVM, "linux-nvdimm@lists.01.org", "qemu-devel@nongnu.org",
	LKML, Linux ACPI, Stefan Hajnoczi
From: Xiao Guangrong
Message-ID: <600ac51c-0f61-6e53-9bfa-669c85494d1f@linux.intel.com>
Date: Mon, 29 Aug 2016 15:54:10 +0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0
MIME-Version: 1.0
In-Reply-To: <20160819183047.GA7216@linux.intel.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Ross,

Sorry for the delay, I just returned from KVM Forum.

On 08/20/2016 02:30 AM, Ross Zwisler wrote:
> On Fri, Aug 19, 2016 at 07:59:29AM -0700, Dan Williams wrote:
>> On Fri, Aug 19, 2016 at 4:19 AM, Xiao Guangrong
>> wrote:
>>>
>>> Hi Dan,
>>>
>>> Recently, Redhat reported that nvml test suite failed on QEMU/KVM,
>>> more detailed info please refer to:
>>>    https://bugzilla.redhat.com/show_bug.cgi?id=1365721
>>>
>>> The reason for this bug is that the memory region created by mmap()
>>> on the dax-based file was gone so that the region can not be found
>>> in /proc/self/smaps during the runtime.
>>>
>>> This is a simple way to trigger this issue:
>>>    mount -o dax /dev/pmem0 /mnt/pmem/
>>>    vim /mnt/pmem/xxx
>>> then 'vim' is crashed due to segment fault.
>>>
>>> This bug can be reproduced on your tree, the top commit is
>>> 10d7902fa0e82b (dax: unmap/truncate on device shutdown), the kernel
>>> configure file is attached.
>>>
>>> Your thought or comment is highly appreciated.
>>
>> I'm going to be offline until Tuesday, but I will investigate when I'm
>> back. In the meantime if Ross or Vishal had an opportunity to take a
>> look I wouldn't say "no" :).
>
> I haven't been able to reproduce this vim segfault. I'm using QEMU v2.6.0,
> and the kernel commit you mentioned, and your kernel config.
>
> Here's my QEMU command line:
>
>	sudo ~/qemu/bin/qemu-system-x86_64 /var/lib/libvirt/images/alara.qcow2 \
>	-machine pc,nvdimm -m 8G,maxmem=100G,slots=100 -object \
>	memory-backend-file,id=mem1,share,mem-path=/dev/pmem0,size=8G -device \
>	nvdimm,memdev=mem1,id=nv1 -smp 6 -machine pc,accel=kvm
>
> With this I'm able to mkfs the guest's /dev/pmem0, mount it with -o dax, and
> write a file with vim.

Thanks for your test. That's strange...

>
> Can you reproduce your results with a pmem device created via a memmap kernel
> command line parameter in the guest? You'll need to update your kernel
> config to enable CONFIG_X86_PMEM_LEGACY and CONFIG_X86_PMEM_LEGACY_DEVICE.
>

Okay, I tested it with memmap=6G!10G, and it failed too. So it looks like a
filesystem or DAX issue.

More precisely, I figured out the root cause: read() returns a wrong value
when it reaches the end of the file. The following test case can trigger it:

	#include <stdio.h>
	#include <stdlib.h>
	#include <unistd.h>
	#include <sys/types.h>
	#include <sys/stat.h>
	#include <fcntl.h>

	int main(int argc, char *argv[])
	{
		char *filename;

		if (argc < 2) {
			printf("arg: filename.\n");
			return -1;
		}

		filename = argv[1];
		printf("test on %s.\n", filename);

		int fd = open(filename, O_RDWR);

		if (fd < 0) {
			perror("open");
			return -1;
		}

		int count = 0;

		/* Read the file one byte at a time until EOF. */
		while (1) {
			ssize_t ret;
			char buf;

			ret = read(fd, &buf, sizeof(buf));
			if (ret < 0) {
				perror("READ");
				return -1;
			}

			/* A return value of 0 means end of file. */
			if (ret == 0)
				break;

			/* Anything other than exactly one byte is unexpected. */
			if (ret != sizeof(buf)) {
				printf("Count %x Ret %lx sizeof(buf) %lx.\n",
				       count, ret, sizeof(buf));
				return -1;
			}

			count++;
			printf("%c", buf);
		}

		printf("\n Good Read.\n");
		return 0;
	}

It fails at the "ret != sizeof(buf)" check. For example, the error output on
my test env is:

	Count 1000 Ret 22f84200 sizeof(buf) 1.
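
For reference, the values in that output are printed in hex: the failure
shows up after 0x1000 (4096) successful one-byte reads, and the returned
value 0x22f84200 is far larger than the one byte that was requested. A
minimal sketch of a stricter checker follows, assuming only the POSIX rule
that read() returns -1 on error, 0 at end of file, and otherwise no more
than the requested count. The names classify_read() and check_file() are
hypothetical helpers made up for illustration; they are not part of the
test case above or of any kernel or library API.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical classification of a single read() return value. */
enum read_status { READ_OK, READ_EOF, READ_ERROR, READ_BOGUS };

enum read_status classify_read(ssize_t ret, size_t requested)
{
	/* With a zero-length request, 0 would not unambiguously mean EOF. */
	assert(requested > 0);

	if (ret < 0)
		return READ_ERROR;
	if (ret == 0)
		return READ_EOF;
	if ((size_t)ret > requested)
		return READ_BOGUS;	/* more bytes reported than asked for */
	return READ_OK;			/* full or short read */
}

/*
 * Hypothetical helper: read the file byte by byte and compare the number
 * of bytes actually read with the size reported by fstat().
 * Returns 0 if they agree, -1 on any error or mismatch.
 */
int check_file(const char *path)
{
	struct stat st;
	off_t total = 0;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0) {
		close(fd);
		return -1;
	}

	for (;;) {
		char buf;
		ssize_t ret = read(fd, &buf, sizeof(buf));
		enum read_status s = classify_read(ret, sizeof(buf));

		if (s == READ_EOF)
			break;
		if (s != READ_OK) {
			fprintf(stderr, "offset %lld: read() returned %zd\n",
				(long long)total, ret);
			close(fd);
			return -1;
		}
		total += ret;
	}

	close(fd);
	return total == st.st_size ? 0 : -1;
}
```

Calling check_file() on a file on the dax mount and on a copy of the same
file on a non-dax filesystem would show whether the two disagree. Comparing
against st_size is a design choice of this sketch: it also catches a read
loop that stops early or runs past the end, not only an oversized return
value.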