From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: "Pasha Tatashin" <pavel.tatashin@microsoft.com>,
	linux-nvdimm <linux-nvdimm@lists.01.org>,
	"Dave Hansen" <dave.hansen@intel.com>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	"Ingo Molnar" <mingo@kernel.org>,
	"Linux MM" <linux-mm@kvack.org>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	rppt@linux.vnet.ibm.com,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: Re: [PATCH v5 4/4] mm: Defer ZONE_DEVICE page initialization to the point where we init pgmap
Date: Wed, 10 Oct 2018 10:39:01 -0700
Message-ID: <98c35e19-13b9-0913-87d9-b3f1ab738b61@linux.intel.com>
In-Reply-To: <20181010172451.GK5873@dhcp22.suse.cz>

On 10/10/2018 10:24 AM, Michal Hocko wrote:
> On Wed 10-10-18 09:39:08, Alexander Duyck wrote:
>> On 10/10/2018 2:58 AM, Michal Hocko wrote:
>>> On Tue 09-10-18 13:26:41, Alexander Duyck wrote:
>>> [...]
>>>> I would think with that being the case we still probably need the call to
>>>> __SetPageReserved to set the bit with the expectation that it will not be
>>>> cleared for device-pages since the pages are not onlined. Removing the call
>>>> to __SetPageReserved would probably introduce a number of regressions as
>>>> there are multiple spots that use the reserved bit to determine if a page
>>>> can be swapped out to disk, mapped as system memory, or migrated.
>>>
>>> PageReserved is meant to tell any potential pfn walkers that might get
>>> to this struct page to back off and not touch it. Even though
>>> ZONE_DEVICE doesn't online pages in traditional sense it makes those
>>> pages available for further use so the page reserved bit should be
>>> cleared.
>>
>> So from what I can tell that isn't necessarily the case. Specifically if the
>> pagemap type is MEMORY_DEVICE_PRIVATE or MEMORY_DEVICE_PUBLIC both are
>> special cases where the memory may not be accessible to the CPU or cannot be
>> pinned in order to allow for eviction.
>
> Could you give me an example please?

Honestly I am getting a bit beyond my depth here so maybe Dan could
explain better. I am basing the above comment on Dan's earlier comment
in this thread combined with the comment that explains the "memory_type"
field for the pgmap:
https://elixir.bootlin.com/linux/v4.19-rc7/source/include/linux/memremap.h#L28

>> The specific case that Dan and Yi are referring to is for the type
>> MEMORY_DEVICE_FS_DAX. For that type I could probably look at not setting the
>> reserved bit. Part of me wants to say that we should wait and clear the bit
>> later, but that would end up just adding time back to initialization. At
>> this point I would consider the change more of a follow-up optimization
>> rather than a fix though since this is tailoring things specifically for DAX
>> versus the other ZONE_DEVICE types.
>
> I thought I have already made it clear that these zone device hacks are
> not acceptable to the generic hotplug code. If the current reserve bit
> handling is not correct then give us a specific reason for that and we
> can start thinking about the proper fix.

I might have misunderstood your earlier comment then. I thought you were
saying that we shouldn't bother with setting the reserved bit. Now it
sounds like you were thinking more along the lines of what I was here in
my comment where I thought the bit should be cleared later in some code
specifically related to DAX when it is exposing it for use to userspace
or KVM.

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm