All of lore.kernel.org
 help / color / mirror / Atom feed
From: "HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	Naoya Horiguchi <nao.horiguchi@gmail.com>,
	Michal Hocko <mhocko@kernel.org>,
	David Hildenbrand <david@redhat.com>,
	Oscar Salvador <osalvador@suse.de>,
	"stable@vger.kernel.org" <stable@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v4 4/5] mm: Fix page reference leak in soft_offline_page()
Date: Thu, 14 Jan 2021 07:10:25 +0000	[thread overview]
Message-ID: <20210114071024.GC16873@hori.linux.bs1.fc.nec.co.jp> (raw)
In-Reply-To: <CAPcyv4gVPMUD7P0KwAY9xk3xBkodPExeJVG6i9=9FGexbJZpBw@mail.gmail.com>

On Wed, Jan 13, 2021 at 10:18:09PM -0800, Dan Williams wrote:
> On Wed, Jan 13, 2021 at 5:50 PM HORIGUCHI NAOYA(堀口 直也)
> <naoya.horiguchi@nec.com> wrote:
> >
> > On Wed, Jan 13, 2021 at 04:43:32PM -0800, Dan Williams wrote:
> > > The conversion to move pfn_to_online_page() internal to
> > > soft_offline_page() missed that the get_user_pages() reference taken by
> > > the madvise() path needs to be dropped when pfn_to_online_page() fails.
> > > Note the direct sysfs-path to soft_offline_page() does not perform a
> > > get_user_pages() lookup.
> > >
> > > When soft_offline_page() is handed a pfn_valid() &&
> > > !pfn_to_online_page() pfn the kernel hangs at dax-device shutdown due to
> > > a leaked reference.
> > >
> > > Fixes: feec24a6139d ("mm, soft-offline: convert parameter to pfn")
> > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
> > > Cc: Michal Hocko <mhocko@kernel.org>
> > > Reviewed-by: David Hildenbrand <david@redhat.com>
> > > Reviewed-by: Oscar Salvador <osalvador@suse.de>
> > > Cc: <stable@vger.kernel.org>
> > > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> >
> > I'm OK if we don't have any other better approach, but the proposed changes
> > make code a little messy, and I feel that get_user_pages() might be the
> > right place to fix. Is get_user_pages() expected to return struct page with
> > holding refcount for offline valid pages?  I thought that such pages are
> > only used by drivers for dax-devices, but that might be wrong. Can I ask for
> > a little more explanation from this perspective?
> 
> The motivation for ZONE_DEVICE is to allow get_user_pages() for "offline" pfns.

I missed this point, thank you.

> 
> soft_offline_page() wants to only operate on "online" pfns.

Right.

> 
> get_user_pages() on a dax-mapping returns an "offline" ZONE_DEVICE page.

OK.

> 
> When pfn_to_online_page() fails the get_user_pages() reference needs
> to be dropped.

The background is clear to me, and I agree with this patch.

Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

> 
> To be honest I dislike the entire flags based scheme for communicating
> the fact that page reference obtained by madvise needs to be dropped
> later. I'd rather pass a non-NULL 'struct page *' than redo
> pfn_to_page() conversions in the leaf functions, but that's a much
> larger change.

As Oscar mentions in another email, removing MF_COUNT_INCREASED was
recently discussed and rejected. I think that if pfn-page conversion
layer is somehow factored out from soft/hard offline handler, code would
be more readable/maintainable.

Thanks,
Naoya Horiguchi
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

WARNING: multiple messages have this Message-ID (diff)
From: "HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	Naoya Horiguchi <nao.horiguchi@gmail.com>,
	Michal Hocko <mhocko@kernel.org>,
	David Hildenbrand <david@redhat.com>,
	Oscar Salvador <osalvador@suse.de>,
	"stable@vger.kernel.org" <stable@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v4 4/5] mm: Fix page reference leak in soft_offline_page()
Date: Thu, 14 Jan 2021 07:10:25 +0000	[thread overview]
Message-ID: <20210114071024.GC16873@hori.linux.bs1.fc.nec.co.jp> (raw)
In-Reply-To: <CAPcyv4gVPMUD7P0KwAY9xk3xBkodPExeJVG6i9=9FGexbJZpBw@mail.gmail.com>

On Wed, Jan 13, 2021 at 10:18:09PM -0800, Dan Williams wrote:
> On Wed, Jan 13, 2021 at 5:50 PM HORIGUCHI NAOYA(堀口 直也)
> <naoya.horiguchi@nec.com> wrote:
> >
> > On Wed, Jan 13, 2021 at 04:43:32PM -0800, Dan Williams wrote:
> > > The conversion to move pfn_to_online_page() internal to
> > > soft_offline_page() missed that the get_user_pages() reference taken by
> > > the madvise() path needs to be dropped when pfn_to_online_page() fails.
> > > Note the direct sysfs-path to soft_offline_page() does not perform a
> > > get_user_pages() lookup.
> > >
> > > When soft_offline_page() is handed a pfn_valid() &&
> > > !pfn_to_online_page() pfn the kernel hangs at dax-device shutdown due to
> > > a leaked reference.
> > >
> > > Fixes: feec24a6139d ("mm, soft-offline: convert parameter to pfn")
> > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
> > > Cc: Michal Hocko <mhocko@kernel.org>
> > > Reviewed-by: David Hildenbrand <david@redhat.com>
> > > Reviewed-by: Oscar Salvador <osalvador@suse.de>
> > > Cc: <stable@vger.kernel.org>
> > > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> >
> > I'm OK if we don't have any other better approach, but the proposed changes
> > make code a little messy, and I feel that get_user_pages() might be the
> > right place to fix. Is get_user_pages() expected to return struct page with
> > holding refcount for offline valid pages?  I thought that such pages are
> > only used by drivers for dax-devices, but that might be wrong. Can I ask for
> > a little more explanation from this perspective?
> 
> The motivation for ZONE_DEVICE is to allow get_user_pages() for "offline" pfns.

I missed this point, thank you.

> 
> soft_offline_page() wants to only operate on "online" pfns.

Right.

> 
> get_user_pages() on a dax-mapping returns an "offline" ZONE_DEVICE page.

OK.

> 
> When pfn_to_online_page() fails the get_user_pages() reference needs
> to be dropped.

The background is clear to me, and I agree with this patch.

Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

> 
> To be honest I dislike the entire flags based scheme for communicating
> the fact that page reference obtained by madvise needs to be dropped
> later. I'd rather pass a non-NULL 'struct page *' than redo
> pfn_to_page() conversions in the leaf functions, but that's a much
> larger change.

As Oscar mentions in another email, removing MF_COUNT_INCREASED was
recently discussed and rejected. I think that if pfn-page conversion
layer is somehow factored out from soft/hard offline handler, code would
be more readable/maintainable.

Thanks,
Naoya Horiguchi

  parent reply	other threads:[~2021-01-14  7:10 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-14  0:43 [PATCH v4 0/5] mm: Fix pfn_to_online_page() with respect to ZONE_DEVICE Dan Williams
2021-01-14  0:43 ` Dan Williams
2021-01-14  0:43 ` [PATCH v4 1/5] mm: Move pfn_to_online_page() out of line Dan Williams
2021-01-14  0:43   ` Dan Williams
2021-01-20 10:11   ` Michal Hocko
2021-01-20 10:11     ` Michal Hocko
2021-01-14  0:43 ` [PATCH v4 2/5] mm: Teach pfn_to_online_page() to consider subsection validity Dan Williams
2021-01-14  0:43   ` Dan Williams
2021-01-20 10:24   ` Michal Hocko
2021-01-20 10:24     ` Michal Hocko
2021-01-14  0:43 ` [PATCH v4 3/5] mm: Teach pfn_to_online_page() about ZONE_DEVICE section collisions Dan Williams
2021-01-14  0:43   ` Dan Williams
2021-01-20 10:30   ` Michal Hocko
2021-01-20 10:30     ` Michal Hocko
2021-01-14  0:43 ` [PATCH v4 4/5] mm: Fix page reference leak in soft_offline_page() Dan Williams
2021-01-14  0:43   ` Dan Williams
2021-01-14  1:49   ` HORIGUCHI NAOYA(堀口 直也)
2021-01-14  1:49     ` HORIGUCHI NAOYA(堀口 直也)
2021-01-14  6:18     ` Dan Williams
2021-01-14  6:18       ` Dan Williams
2021-01-14  6:18       ` Dan Williams
2021-01-14  6:30       ` Oscar Salvador
2021-01-14  6:30         ` Oscar Salvador
2021-01-14  7:10       ` HORIGUCHI NAOYA(堀口 直也) [this message]
2021-01-14  7:10         ` HORIGUCHI NAOYA(堀口 直也)
2021-01-17 22:01   ` Andrew Morton
2021-01-17 22:01     ` Andrew Morton
2021-01-17 22:35     ` Dan Williams
2021-01-17 22:35       ` Dan Williams
2021-01-17 22:35       ` Dan Williams
2021-01-14  0:43 ` [PATCH v4 5/5] mm: Fix memory_failure() handling of dax-namespace metadata Dan Williams
2021-01-14  0:43   ` Dan Williams
2021-01-14  2:25   ` HORIGUCHI NAOYA(堀口 直也)
2021-01-14  2:25     ` HORIGUCHI NAOYA(堀口 直也)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210114071024.GC16873@hori.linux.bs1.fc.nec.co.jp \
    --to=naoya.horiguchi@nec.com \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=mhocko@kernel.org \
    --cc=nao.horiguchi@gmail.com \
    --cc=osalvador@suse.de \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.