linux-mm.kvack.org archive mirror
From: Dan Williams <dan.j.williams@intel.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: "Andrew Morton" <akpm@linux-foundation.org>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	"Logan Gunthorpe" <logang@deltatee.com>,
	"Toshi Kani" <toshi.kani@hpe.com>,
	"Jeff Moyer" <jmoyer@redhat.com>,
	"Vlastimil Babka" <vbabka@suse.cz>,
	stable <stable@vger.kernel.org>, "Linux MM" <linux-mm@kvack.org>,
	linux-nvdimm <linux-nvdimm@lists.01.org>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v5 00/10] mm: Sub-section memory hotplug support
Date: Wed, 27 Mar 2019 09:17:37 -0700
Message-ID: <CAPcyv4heVUMUVrFz4HDX11OxW0ZWkS6EpJJ4aT3QJcUmPTFpRg@mail.gmail.com>
In-Reply-To: <20190327161306.GM11927@dhcp22.suse.cz>

On Wed, Mar 27, 2019 at 9:13 AM Michal Hocko <mhocko@kernel.org> wrote:
>
> On Tue 26-03-19 17:20:41, Dan Williams wrote:
> > On Tue, Mar 26, 2019 at 1:04 AM Michal Hocko <mhocko@kernel.org> wrote:
> > >
> > > On Mon 25-03-19 13:03:47, Dan Williams wrote:
> > > > On Mon, Mar 25, 2019 at 3:20 AM Michal Hocko <mhocko@kernel.org> wrote:
> > > [...]
> > > > > > User-defined memory namespaces have this problem, but 2MB is the
> > > > > > default alignment and is sufficient for most uses.
> > > > >
> > > > > > What prevents users from using a larger alignment?
> > > >
> > > > Given that we are living with 64MB granularity on mainstream platforms
> > > > for the foreseeable future, the reason users can't rely on a larger
> > > > alignment to address the issue is that the physical alignment may
> > > > change from one boot to the next.
> > >
> > > > I would love to learn more about this inter-boot volatility. Could you
> > > > expand on that some more? I thought that the HW configuration presented
> > > > to the OS would be more or less stable unless the underlying HW changes.
> >
> > > Even if the configuration is static there can be hardware failures
> > > that prevent a DIMM or a PCI device from being included in the memory
> > > map. When that happens the BIOS needs to lay out the map again, and the
> > > result is not guaranteed to maintain the previous alignment.
> >
> > > > No, you can't just wish hardware / platform firmware won't do this,
> > > > because there are not enough platform resources to give every hardware
> > > > device a guaranteed alignment.
> > >
> > > > A guarantee is one part, and I can see how nobody wants to give you
> > > > something that strong, but how often does that happen in real life?
> >
> > > I expect a "rare" event to happen every day in a data-center fleet.
> > Failure rates tend towards 100% daily occurrence at scale and in this
> > case the kernel has everything it needs to mitigate such an event.
> >
> > > Setting aside the success rate of a software-alignment mitigation, the
> > > reason I am charging this hill again after a 2-year hiatus is the
> > > realization that this problem is more widespread than the original
> > > failing scenario. Back in 2017 the problem seemed limited to custom
> > memmap= configurations, and collisions between PMEM and System RAM.
> > Now it is clear that the collisions can happen between PMEM regions
> > and namespaces as well, and the problem spans platforms from multiple
> > vendors. Here is the most recent collision problem:
> > https://github.com/pmem/ndctl/issues/76, from a third-party platform.
> >
> > The fix for that issue uncovered a bug in the padding implementation,
> > and a fix for that bug would result in even more hacks in the nvdimm
> > code for what is a core kernel deficiency. Code review of those
> > changes resulted in changing direction to go after the core
> > deficiency.
>
> > This kind of information, along with real-world examples, is exactly what
> > you should have added to the cover letter. The previous, very vague
> > claims were not really convincing, nor could they be considered a
> > proper justification. Please do realize that people who are not working
> > with the affected HW are unlikely to have an idea how serious/relevant
> > those problems really are.
>
> > People are asking for a smaller memory hotplug granularity for other
> > use cases (e.g. memory ballooning into VMs) which are quite dubious, to
> > be honest, and not really worth all the code rework. If we are talking
> > about something that can be worked around elsewhere, then that is
> > preferred because the code base is not in excellent shape and putting
> > more on top is just going to cause more headaches.
>
> > I will try to find some time to review this more deeply (no promises
> > though because time is hectic and this is not a simple feature). For the
> > future, please try harder to write up a proper justification and a
> > high-level design description which tells a bit about all the important
> > parts of the new scheme.

Fair enough. I've been steeped in this for too long, and should have
taken a wider view to bring reviewers up to speed.

