All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mark Rutland <mark.rutland@arm.com>
To: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, will.deacon@arm.com,
	catalin.marinas@arm.com, mhocko@suse.com,
	mgorman@techsingularity.net, james.morse@arm.com,
	robin.murphy@arm.com, cpandya@codeaurora.org,
	arunks@codeaurora.org, dan.j.williams@intel.com,
	osalvador@suse.de, david@redhat.com, cai@lca.pw,
	logang@deltatee.com, ira.weiny@intel.com
Subject: Re: [PATCH V2 2/2] arm64/mm: Enable memory hot remove
Date: Wed, 17 Apr 2019 18:39:48 +0100	[thread overview]
Message-ID: <20190417173948.GB15589@lakrids.cambridge.arm.com> (raw)
In-Reply-To: <bba0b71c-2d04-d589-e2bf-5de37806548f@arm.com>

On Wed, Apr 17, 2019 at 10:15:35PM +0530, Anshuman Khandual wrote:
> On 04/17/2019 07:51 PM, Mark Rutland wrote:
> > On Wed, Apr 17, 2019 at 03:28:18PM +0530, Anshuman Khandual wrote:
> >> On 04/15/2019 07:18 PM, Mark Rutland wrote:
> >>> On Sun, Apr 14, 2019 at 11:29:13AM +0530, Anshuman Khandual wrote:

> >>>> +	spin_unlock(&init_mm.page_table_lock);
> >>>
> >>> What precisely is the page_table_lock intended to protect?
> >>
> >> Concurrent modification to kernel page table (init_mm) while clearing entries.
> > 
> > Concurrent modification by what code?
> > 
> > If something else can *modify* the portion of the table that we're
> > manipulating, then I don't see how we can safely walk the table up to
> > this point without holding the lock, nor how we can safely add memory.
> > 
> > Even if this is to protect something else which *reads* the tables,
> > other code in arm64 which modifies the kernel page tables doesn't take
> > the lock.
> > 
> > Usually, if you can do a lockless walk you have to verify that things
> > didn't change once you've taken the lock, but we don't follow that
> > pattern here.
> > 
> > As things stand it's not clear to me whether this is necessary or
> > sufficient.
> 
> Hence lets take more conservative approach and wrap the entire process of
> remove_pagetable() under init_mm.page_table_lock which looks safe unless
> in the worst case when free_pages() gets stuck for some reason in which
> case we have bigger memory problem to deal with than a soft lock up.

Sorry, but I'm not happy with _any_ solution until we understand where
and why we need to take the init_mm ptl, and have made some effort to
ensure that the kernel correctly does so elsewhere. It is not sufficient
to consider this code in isolation.

IIUC, before this patch we never clear non-leaf entries in the kernel
page tables, so readers don't presently need to take the ptl in order to
safely walk down to a leaf entry.

For example, the arm64 ptdump code never takes the ptl, and as of this
patch it will blow up if it races with a hot-remove, regardless of
whether the hot-remove code itself holds the ptl.

Note that the same applies to the x86 ptdump code; we cannot assume that
just because x86 does something that it happens to be correct.

I strongly suspect there are other cases that would fall afoul of this,
in both arm64 and generic code.

Thanks,
Mark.

WARNING: multiple messages have this Message-ID (diff)
From: Mark Rutland <mark.rutland@arm.com>
To: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: cai@lca.pw, mhocko@suse.com, ira.weiny@intel.com,
	david@redhat.com, catalin.marinas@arm.com, will.deacon@arm.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	logang@deltatee.com, james.morse@arm.com, cpandya@codeaurora.org,
	arunks@codeaurora.org, akpm@linux-foundation.org,
	osalvador@suse.de, mgorman@techsingularity.net,
	dan.j.williams@intel.com, linux-arm-kernel@lists.infradead.org,
	robin.murphy@arm.com
Subject: Re: [PATCH V2 2/2] arm64/mm: Enable memory hot remove
Date: Wed, 17 Apr 2019 18:39:48 +0100	[thread overview]
Message-ID: <20190417173948.GB15589@lakrids.cambridge.arm.com> (raw)
In-Reply-To: <bba0b71c-2d04-d589-e2bf-5de37806548f@arm.com>

On Wed, Apr 17, 2019 at 10:15:35PM +0530, Anshuman Khandual wrote:
> On 04/17/2019 07:51 PM, Mark Rutland wrote:
> > On Wed, Apr 17, 2019 at 03:28:18PM +0530, Anshuman Khandual wrote:
> >> On 04/15/2019 07:18 PM, Mark Rutland wrote:
> >>> On Sun, Apr 14, 2019 at 11:29:13AM +0530, Anshuman Khandual wrote:

> >>>> +	spin_unlock(&init_mm.page_table_lock);
> >>>
> >>> What precisely is the page_table_lock intended to protect?
> >>
> >> Concurrent modification to kernel page table (init_mm) while clearing entries.
> > 
> > Concurrent modification by what code?
> > 
> > If something else can *modify* the portion of the table that we're
> > manipulating, then I don't see how we can safely walk the table up to
> > this point without holding the lock, nor how we can safely add memory.
> > 
> > Even if this is to protect something else which *reads* the tables,
> > other code in arm64 which modifies the kernel page tables doesn't take
> > the lock.
> > 
> > Usually, if you can do a lockless walk you have to verify that things
> > didn't change once you've taken the lock, but we don't follow that
> > pattern here.
> > 
> > As things stand it's not clear to me whether this is necessary or
> > sufficient.
> 
> Hence lets take more conservative approach and wrap the entire process of
> remove_pagetable() under init_mm.page_table_lock which looks safe unless
> in the worst case when free_pages() gets stuck for some reason in which
> case we have bigger memory problem to deal with than a soft lock up.

Sorry, but I'm not happy with _any_ solution until we understand where
and why we need to take the init_mm ptl, and have made some effort to
ensure that the kernel correctly does so elsewhere. It is not sufficient
to consider this code in isolation.

IIUC, before this patch we never clear non-leaf entries in the kernel
page tables, so readers don't presently need to take the ptl in order to
safely walk down to a leaf entry.

For example, the arm64 ptdump code never takes the ptl, and as of this
patch it will blow up if it races with a hot-remove, regardless of
whether the hot-remove code itself holds the ptl.

Note that the same applies to the x86 ptdump code; we cannot assume that
just because x86 does something that it happens to be correct.

I strongly suspect there are other cases that would fall afoul of this,
in both arm64 and generic code.

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2019-04-17 17:39 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-14  5:59 [PATCH V2 0/2] arm64/mm: Enable memory hot remove Anshuman Khandual
2019-04-14  5:59 ` Anshuman Khandual
2019-04-14  5:59 ` [PATCH V2 1/2] mm/hotplug: Reorder arch_remove_memory() call in __remove_memory() Anshuman Khandual
2019-04-14  5:59   ` Anshuman Khandual
2019-04-15 13:58   ` David Hildenbrand
2019-04-15 13:58     ` David Hildenbrand
2019-04-16 10:12     ` Anshuman Khandual
2019-04-16 10:12       ` Anshuman Khandual
2019-04-14  5:59 ` [PATCH V2 2/2] arm64/mm: Enable memory hot remove Anshuman Khandual
2019-04-14  5:59   ` Anshuman Khandual
2019-04-15 13:48   ` Mark Rutland
2019-04-15 13:48     ` Mark Rutland
2019-04-17  9:58     ` Anshuman Khandual
2019-04-17  9:58       ` Anshuman Khandual
2019-04-17 14:21       ` Mark Rutland
2019-04-17 14:21         ` Mark Rutland
2019-04-17 16:45         ` Anshuman Khandual
2019-04-17 16:45           ` Anshuman Khandual
2019-04-17 17:39           ` Mark Rutland [this message]
2019-04-17 17:39             ` Mark Rutland
2019-04-18  5:28             ` Anshuman Khandual
2019-04-18  5:28               ` Anshuman Khandual
2019-04-23  7:31               ` Anshuman Khandual
2019-04-23  7:31                 ` Anshuman Khandual
2019-04-23  7:37                 ` David Hildenbrand
2019-04-23  7:37                   ` David Hildenbrand
2019-04-23  7:45                   ` Anshuman Khandual
2019-04-23  7:45                     ` Anshuman Khandual
2019-04-23  7:51                     ` David Hildenbrand
2019-04-23  7:51                       ` David Hildenbrand
2019-04-23  8:37                       ` Anshuman Khandual
2019-04-23  8:37                         ` Anshuman Khandual
2019-04-23 16:05                 ` Mark Rutland
2019-04-23 16:05                   ` Mark Rutland
2019-04-24  5:59                   ` Anshuman Khandual
2019-04-24  5:59                     ` Anshuman Khandual
2019-04-24  8:19                     ` Mark Rutland
2019-04-24  8:19                       ` Mark Rutland
2019-04-15 13:55   ` David Hildenbrand
2019-04-15 13:55     ` David Hildenbrand
2019-04-16  9:52     ` Anshuman Khandual
2019-04-16  9:52       ` Anshuman Khandual
2019-05-13  8:22 ` [PATCH V2 0/2] " David Hildenbrand
2019-05-13  8:22   ` David Hildenbrand
2019-05-13  8:37   ` Anshuman Khandual
2019-05-13  8:37     ` Anshuman Khandual
2019-05-13 10:01     ` David Hildenbrand
2019-05-13 10:01       ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190417173948.GB15589@lakrids.cambridge.arm.com \
    --to=mark.rutland@arm.com \
    --cc=akpm@linux-foundation.org \
    --cc=anshuman.khandual@arm.com \
    --cc=arunks@codeaurora.org \
    --cc=cai@lca.pw \
    --cc=catalin.marinas@arm.com \
    --cc=cpandya@codeaurora.org \
    --cc=dan.j.williams@intel.com \
    --cc=david@redhat.com \
    --cc=ira.weiny@intel.com \
    --cc=james.morse@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=logang@deltatee.com \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@suse.com \
    --cc=osalvador@suse.de \
    --cc=robin.murphy@arm.com \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.