Subject: Re: [PATCH] mm: Fix mmap MAP_POPULATE for DAX pmd mapping
From: Dan Williams
To: Toshi Kani
Cc: Andrew Morton, "Kirill A. Shutemov", Matthew Wilcox, Ross Zwisler,
    mauricio.porto@hpe.com, Linux MM, linux-fsdevel,
    "linux-nvdimm@lists.01.org", "linux-kernel@vger.kernel.org"
Date: Tue, 1 Dec 2015 19:45:01 -0800
In-Reply-To: <1449022764.31589.24.camel@hpe.com>
References: <1448309082-20851-1-git-send-email-toshi.kani@hpe.com>
    <1449022764.31589.24.camel@hpe.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Dec 1, 2015 at 6:19 PM, Toshi Kani wrote:
> On Mon, 2015-11-30 at 14:08 -0800, Dan Williams wrote:
>> On Mon, Nov 23, 2015 at 12:04 PM, Toshi Kani wrote:
>> > The following oops was observed when mmap() with MAP_POPULATE
>> > pre-faulted pmd mappings of a DAX file.  follow_trans_huge_pmd()
>> > expects that a target address has a struct page.
>> >
>> >   BUG: unable to handle kernel paging request at ffffea0012220000
>> >    follow_trans_huge_pmd+0xba/0x390
>> >    follow_page_mask+0x33d/0x420
>> >    __get_user_pages+0xdc/0x800
>> >    populate_vma_page_range+0xb5/0xe0
>> >    __mm_populate+0xc5/0x150
>> >    vm_mmap_pgoff+0xd5/0xe0
>> >    SyS_mmap_pgoff+0x1c1/0x290
>> >    SyS_mmap+0x1b/0x30
>> >
>> > Fix it by making the PMD pre-fault handling consistent with PTE.
>> > After pre-faulted in faultin_page(), follow_page_mask() calls
>> > follow_trans_huge_pmd(), which is changed to call follow_pfn_pmd()
>> > for VM_PFNMAP or VM_MIXEDMAP.  follow_pfn_pmd() handles FOLL_TOUCH
>> > and returns with -EEXIST.
>> >
>> > Reported-by: Mauricio Porto
>> > Signed-off-by: Toshi Kani
>> > Cc: Andrew Morton
>> > Cc: Kirill A. Shutemov
>> > Cc: Matthew Wilcox
>> > Cc: Dan Williams
>> > Cc: Ross Zwisler
>> > ---
>>
>> Hey Toshi,
>>
>> I ended up fixing this differently with follow_pmd_devmap() introduced
>> in this series:
>>
>> https://lists.01.org/pipermail/linux-nvdimm/2015-November/003033.html
>>
>> Does the latest libnvdimm-pending branch [1] pass your test case?
>
> Hi Dan,
>
> I ran several test cases, and they all hit the case "pfn not in memmap" in
> __dax_pmd_fault() during mmap(MAP_POPULATE).  Looking at the dax.pfn, PFN_DEV is
> set but PFN_MAP is not.  I have not looked into why, but I thought I let you
> know first.  I've also seen the test thread got hung up at the end sometime.

That PFN_MAP flag will not be set by default for NFIT-defined
persistent memory.  See pmem_should_map_pages() for pmem namespaces
that will have it set by default, currently only e820 type-12 memory
ranges.

NFIT-defined persistent memory can have a memmap array dynamically
allocated by setting up a pfn device (similar to setting up a btt).
We don't map it by default because the NFIT may describe hundreds of
gigabytes of persistent memory, and the overhead of the memmap may be
too large to locate the memmap in RAM.

I have a pending patch in libnvdimm-pending that allows the capacity
for the memmap to come from pmem instead of RAM:

https://git.kernel.org/cgit/linux/kernel/git/djbw/nvdimm.git/commit/?h=libnvdimm-pending&id=3117a24e07fe

> I also noticed that reason is not set in the case below.
>
>         if (length < PMD_SIZE
>                         || (pfn_t_to_pfn(dax.pfn) & PG_PMD_COLOUR)) {
>                 dax_unmap_atomic(bdev, &dax);
>                 goto fallback;
>         }

Thanks, I'll fix that up.
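
For a rough sense of the memmap overhead mentioned above, here is a
back-of-the-envelope sketch.  It is not kernel code.  The 64-byte
struct page and 4 KiB base page are assumptions (typical of x86_64
builds, not stated in this thread), and memmap_bytes() is a
hypothetical helper name used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one struct page is 64 bytes, as is typical on x86_64. */
#define ASSUMED_STRUCT_PAGE_SIZE 64ULL
/* Assumption: 4 KiB base pages. */
#define ASSUMED_PAGE_SIZE 4096ULL

/*
 * Hypothetical helper: bytes of memmap needed to give every base page
 * in a range of range_bytes its own struct page.
 */
static uint64_t memmap_bytes(uint64_t range_bytes)
{
	/* Keep the arithmetic exact by requiring a page-aligned size. */
	assert(range_bytes % ASSUMED_PAGE_SIZE == 0);

	uint64_t pages = range_bytes / ASSUMED_PAGE_SIZE;

	return pages * ASSUMED_STRUCT_PAGE_SIZE;
}
```

Under those assumptions the memmap costs about 1.6% of the capacity it
describes, so memmap_bytes(512ULL << 30) comes to 8 GiB for a 512 GiB
range.  That is the kind of footprint that motivates taking the memmap
capacity from pmem rather than RAM.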