linux-kernel.vger.kernel.org archive mirror
From: Pavel Tatashin <pasha.tatashin@soleen.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: "James Morris" <jmorris@namei.org>,
	"Sasha Levin" <sashal@kernel.org>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	"Linux MM" <linux-mm@kvack.org>,
	linux-nvdimm <linux-nvdimm@lists.01.org>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Michal Hocko" <mhocko@suse.com>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	"Keith Busch" <keith.busch@intel.com>,
	"Vishal L Verma" <vishal.l.verma@intel.com>,
	"Dave Jiang" <dave.jiang@intel.com>,
	"Ross Zwisler" <zwisler@kernel.org>,
	"Tom Lendacky" <thomas.lendacky@amd.com>,
	"Huang, Ying" <ying.huang@intel.com>,
	"Fengguang Wu" <fengguang.wu@intel.com>,
	"Borislav Petkov" <bp@suse.de>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Yaowei Bai" <baiyaowei@cmss.chinamobile.com>,
	"Takashi Iwai" <tiwai@suse.de>,
	"Jérôme Glisse" <jglisse@redhat.com>
Subject: Re: [v1 2/2] device-dax: "Hotremove" persistent memory that is used like normal RAM
Date: Sat, 20 Apr 2019 18:04:07 -0400
Message-ID: <CA+CK2bBFqq0tNOE9gh7JAhjw8XLW_pMpVQtUwbm6JwW=LWt_iQ@mail.gmail.com>
In-Reply-To: <CAPcyv4gBu5QhgRQ+maJs108JwBrcCa9U1e9wgO8FP6Q3qwy69g@mail.gmail.com>

On Sat, Apr 20, 2019 at 5:02 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> On Sat, Apr 20, 2019 at 10:02 AM Pavel Tatashin
> <pasha.tatashin@soleen.com> wrote:
> >
> > > > Thank you for looking at this.  Are you saying, that if drv.remove()
> > > > returns a failure it is simply ignored, and unbind proceeds?
> > >
> > > Yeah, that's the problem. I've looked at making unbind able to fail,
> > > but that can lead to general bad behavior in device-drivers. I.e. why
> > > spend time unwinding allocated resources when the driver can simply
> > > fail unbind? About the best a driver can do is make unbind wait on
> > > some event, but any return results in device-unbind.
> >
> > Hm, just tested, and it is indeed so.
> >
> > I see the following options:
> >
> > 1. Move hot remove code to some other interface, that can fail. Not
> > sure what that would be, but outside of unbind/remove_id. Any
> > suggestion?
> > > 2. Option two is don't attempt to offline memory in unbind. Do
> > hot-remove memory in unbind if every section is already offlined.
> > Basically, do a walk through memblocks, and if every section is
> > offlined, also do the cleanup.
>
> I think something like option-2 could work just as long as the user is
> ok with failure and prepared to handle it. It's already the case that
> the request_region() in kmem permanently prevents the memory range
> from being reused by any other driver. So if the hot-unplug fails it
> could skip the corresponding release_region() and effectively it's the
> same as what we have now in terms of reuse protection. In your flow if
> the memory remove failed then the conversion attempt from devdax to
> raw mode would also fail and presumably you could fall back to doing a
> full reboot / rebuild of the application state?

With option two, where we will simply check that every memory_block is
offlined, we will have deterministic behavior:

1. If the user did not offline every dax memory section beforehand via
echo offline > /sys/devices/system/memory/memoryN/state
then
echo dax0.0 > /sys/bus/dax/drivers/kmem/unbind
will behave the same as it does now: it will simply return, and the
user won't be able to use dax afterwards or hotremove it.

2. If the user did offline every dax memory section beforehand, then
echo dax0.0 > /sys/bus/dax/drivers/kmem/unbind
is guaranteed to succeed in hotremoving the memory, as there is
nothing that can fail.

So, if the user wants to hotremove dax memory, they must ensure that
every section is offlined before unbinding.
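
The user-side flow described above could be sketched roughly as follows.
This is only an illustration and assumes a kernel that implements option
two. The helper names (all_blocks_offline, safe_unbind), the SYSFS_MEM and
KMEM_UNBIND override variables, and the way the caller learns which
memoryN blocks belong to the dax device are all assumptions, not part of
the patch.

```shell
#!/bin/sh
# Sketch only. SYSFS_MEM and KMEM_UNBIND are hypothetical overrides so the
# helpers can be pointed at a scratch directory instead of real sysfs.
SYSFS_MEM="${SYSFS_MEM:-/sys/devices/system/memory}"

# Return 0 only if at least one block is named and every named memory
# block reports "offline" in its state file.
all_blocks_offline() {
    [ "$#" -gt 0 ] || return 1
    for blk in "$@"; do
        state=$(cat "$SYSFS_MEM/$blk/state" 2>/dev/null) || return 1
        [ "$state" = "offline" ] || return 1
    done
    return 0
}

# Unbind the dax device from kmem only when every block is already
# offline, which is the case where hotremove has nothing left to fail on.
# $1 = dax device name (e.g. dax0.0); remaining args = memoryN block names,
# which the caller must determine (that mapping is not covered here).
safe_unbind() {
    dev=$1
    shift
    if all_blocks_offline "$@"; then
        echo "$dev" > "${KMEM_UNBIND:-/sys/bus/dax/drivers/kmem/unbind}"
    else
        echo "not all memory blocks are offline; not unbinding $dev" >&2
        return 1
    fi
}
```

For example, "safe_unbind dax0.0 memory32 memory33" (block names
hypothetical) would write to unbind only after both blocks read
"offline"; otherwise it leaves the device bound so the user can retry
offlining first.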

Pasha
