From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752747AbeEGTIh convert rfc822-to-8bit (ORCPT ); Mon, 7 May 2018 15:08:37 -0400
Received: from mx3-rdu2.redhat.com ([66.187.233.73]:59670 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1752099AbeEGTIg (ORCPT ); Mon, 7 May 2018 15:08:36 -0400
From: Jeff Moyer
To: Dan Williams
Cc: Matthew Wilcox , Michal Hocko , Huaisheng Ye , linux-nvdimm , Tetsuo Handa , chengnt@lenovo.com, Dave Hansen , Linux Kernel Mailing List , pasha.tatashin@oracle.com, Linux MM , colyli@suse.de, Johannes Weiner , Andrew Morton , Sasha Levin , Mel Gorman , Vlastimil Babka
Subject: Re: [RFC PATCH v1 0/6] use mm to manage NVDIMM (pmem) zone
References: <1525704627-30114-1-git-send-email-yehs1@lenovo.com> <20180507184622.GB12361@bombadil.infradead.org>
X-PGP-KeyID: 1F78E1B4
X-PGP-CertKey: F6FE 280D 8293 F72C 65FD 5A58 1FF8 A7CA 1F78 E1B4
X-PCLoadLetter: What the f**k does that mean?
Date: Mon, 07 May 2018 15:08:32 -0400
In-Reply-To: (Dan Williams's message of "Mon, 7 May 2018 11:57:10 -0700")
Message-ID:
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8BIT
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Dan Williams writes:

> On Mon, May 7, 2018 at 11:46 AM, Matthew Wilcox wrote:
>> On Mon, May 07, 2018 at 10:50:21PM +0800, Huaisheng Ye wrote:
>>> Traditionally, NVDIMMs are treated by mm(memory management) subsystem as
>>> DEVICE zone, which is a virtual zone and both its start and end of pfn
>>> are equal to 0, mm wouldn’t manage NVDIMM directly as DRAM, kernel uses
>>> corresponding drivers, which locate at \drivers\nvdimm\ and
>>> \drivers\acpi\nfit and fs, to realize NVDIMM memory alloc and free with
>>> memory hot plug implementation.
>>
>> You probably want to let linux-nvdimm know about this patch set.
>> Adding to the cc.
>
> Yes, thanks for that!
>
>> Also, I only received patch 0 and 4.  What happened
>> to 1-3,5 and 6?
>>
>>> With current kernel, many mm’s classical features like the buddy
>>> system, swap mechanism and page cache couldn’t be supported to NVDIMM.
>>> What we are doing is to expand kernel mm’s capacity to make it to handle
>>> NVDIMM like DRAM. Furthermore we make mm could treat DRAM and NVDIMM
>>> separately, that means mm can only put the critical pages to NVDIMM

Please define "critical pages."

>>> zone, here we created a new zone type as NVM zone. That is to say for
>>> traditional(or normal) pages which would be stored at DRAM scope like
>>> Normal, DMA32 and DMA zones. But for the critical pages, which we hope
>>> them could be recovered from power fail or system crash, we make them
>>> to be persistent by storing them to NVM zone.

[...]

> I think adding yet one more mm-zone is the wrong direction. Instead,
> what we have been considering is a mechanism to allow a device-dax
> instance to be given back to the kernel as a distinct numa node
> managed by the VM. It seems it times to dust off those patches.

What's the use case?  The above patch description seems to indicate an
intent to recover contents after a power loss.  Without seeing the whole
series, I'm not sure how that's accomplished in a safe or meaningful
way.

Huaisheng, could you provide a bit more background?

Thanks!
Jeff