Linux-mm Archive on lore.kernel.org
 help / color / Atom feed
From: Anshuman Khandual <anshuman.khandual@arm.com>
To: Oscar Salvador <osalvador@suse.de>, akpm@linux-foundation.org
Cc: mhocko@suse.com, dan.j.williams@intel.com,
	pavel.tatashin@microsoft.com, jglisse@redhat.com,
	Jonathan.Cameron@huawei.com, rafael@kernel.org, david@redhat.com,
	linux-mm@kvack.org, Oscar Salvador <osalvador@suse.com>
Subject: Re: [PATCH v2 4/5] mm, memory-hotplug: Rework unregister_mem_sect_under_nodes
Date: Sun, 24 Mar 2019 12:18:26 +0530
Message-ID: <45d6b6ed-ae84-f2d5-0d57-dc2e28938ce0@arm.com> (raw)
In-Reply-To: <20181127162005.15833-5-osalvador@suse.de>



On 11/27/2018 09:50 PM, Oscar Salvador wrote:
> From: Oscar Salvador <osalvador@suse.com>
> 
> This tries to address another issue about accessing
> unitiliazed pages.
> 
> Jonathan reported a problem [1] where we can access steal pages
> in case we hot-remove memory without onlining it first.
> 
> This time is in unregister_mem_sect_under_nodes.
> This function tries to get the nid from the pfn and then
> tries to remove the symlink between mem_blk <-> nid and vice versa.
> 
> Since we already know the nid in remove_memory(), we can pass
> it down the chain to unregister_mem_sect_under_nodes.
> There we can just remove the symlinks without the need
> to look into the pages.
> 
> This also allows us to cleanup unregister_mem_sect_under_nodes.
> 
> [1] https://www.spinics.net/lists/linux-mm/msg161316.html
> 
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
>  drivers/base/memory.c  |  9 ++++-----
>  drivers/base/node.c    | 39 ++++++---------------------------------
>  include/linux/memory.h |  2 +-
>  include/linux/node.h   |  9 ++++-----
>  mm/memory_hotplug.c    |  2 +-
>  5 files changed, 16 insertions(+), 45 deletions(-)
> 
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index 0e5985682642..3d8c65d84bea 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -744,8 +744,7 @@ unregister_memory(struct memory_block *memory)
>  	device_unregister(&memory->dev);
>  }
>  
> -static int remove_memory_section(unsigned long node_id,
> -			       struct mem_section *section, int phys_device)
> +static int remove_memory_section(unsigned long nid, struct mem_section *section)
>  {
>  	struct memory_block *mem;
>  
> @@ -759,7 +758,7 @@ static int remove_memory_section(unsigned long node_id,
>  	if (!mem)
>  		goto out_unlock;
>  
> -	unregister_mem_sect_under_nodes(mem, __section_nr(section));
> +	unregister_mem_sect_under_nodes(nid, mem);
>  
>  	mem->section_count--;
>  	if (mem->section_count == 0)
> @@ -772,12 +771,12 @@ static int remove_memory_section(unsigned long node_id,
>  	return 0;
>  }
>  
> -int unregister_memory_section(struct mem_section *section)
> +int unregister_memory_section(int nid, struct mem_section *section)
>  {
>  	if (!present_section(section))
>  		return -EINVAL;
>  
> -	return remove_memory_section(0, section, 0);
> +	return remove_memory_section(nid, section);
>  }
>  #endif /* CONFIG_MEMORY_HOTREMOVE */
>  
> diff --git a/drivers/base/node.c b/drivers/base/node.c
> index 86d6cd92ce3d..0858f7f3c7cd 100644
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -453,40 +453,13 @@ int register_mem_sect_under_node(struct memory_block *mem_blk, void *arg)
>  	return 0;
>  }
>  
> -/* unregister memory section under all nodes that it spans */
> -int unregister_mem_sect_under_nodes(struct memory_block *mem_blk,
> -				    unsigned long phys_index)
> +/* Remove symlink between node <-> mem_blk */
> +void unregister_mem_sect_under_nodes(int nid, struct memory_block *mem_blk)
>  {
> -	NODEMASK_ALLOC(nodemask_t, unlinked_nodes, GFP_KERNEL);
> -	unsigned long pfn, sect_start_pfn, sect_end_pfn;
> -
> -	if (!mem_blk) {
> -		NODEMASK_FREE(unlinked_nodes);
> -		return -EFAULT;
> -	}
> -	if (!unlinked_nodes)
> -		return -ENOMEM;
> -	nodes_clear(*unlinked_nodes);
> -
> -	sect_start_pfn = section_nr_to_pfn(phys_index);
> -	sect_end_pfn = sect_start_pfn + PAGES_PER_SECTION - 1;
> -	for (pfn = sect_start_pfn; pfn <= sect_end_pfn; pfn++) {
> -		int nid;
> -
> -		nid = get_nid_for_pfn(pfn);
> -		if (nid < 0)
> -			continue;
> -		if (!node_online(nid))
> -			continue;
> -		if (node_test_and_set(nid, *unlinked_nodes))
> -			continue;
> -		sysfs_remove_link(&node_devices[nid]->dev.kobj,
> -			 kobject_name(&mem_blk->dev.kobj));
> -		sysfs_remove_link(&mem_blk->dev.kobj,
> -			 kobject_name(&node_devices[nid]->dev.kobj));
> -	}
> -	NODEMASK_FREE(unlinked_nodes);
> -	return 0;
> +	sysfs_remove_link(&node_devices[nid]->dev.kobj,
> +			kobject_name(&mem_blk->dev.kobj));
> +	sysfs_remove_link(&mem_blk->dev.kobj,
> +			kobject_name(&node_devices[nid]->dev.kobj));

Hello Oscar,

Passing down node ID till unregister_mem_sect_under_nodes() solves the problem of
querying struct page for nid but the current code assumes that the pfn range for
any given memory section can have different node IDs. Hence it scans over the
section and try to remove all possible node <---> memory block sysfs links.

I am just wondering is that assumption even correct ? Can we really have a memory
section which belongs to different nodes ? Is that even possible.

- Anshuman


  reply index

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-27 16:20 [PATCH v2 0/5] Do not touch pages in hot-remove path Oscar Salvador
2018-11-27 16:20 ` [PATCH v2 1/5] mm, memory_hotplug: Add nid parameter to arch_remove_memory Oscar Salvador
2018-11-27 16:20 ` [PATCH v2 2/5] kernel, resource: Check for IORESOURCE_SYSRAM in release_mem_region_adjustable Oscar Salvador
2018-11-27 16:20 ` [PATCH v2 3/5] mm, memory_hotplug: Move zone/pages handling to offline stage Oscar Salvador
2018-11-28  7:52   ` Mike Rapoport
2018-11-28 14:25     ` osalvador
2018-11-28 14:15   ` osalvador
2018-11-27 16:20 ` [PATCH v2 4/5] mm, memory-hotplug: Rework unregister_mem_sect_under_nodes Oscar Salvador
2019-03-24  6:48   ` Anshuman Khandual [this message]
2019-03-25  7:40     ` Oscar Salvador
2019-03-25  8:04       ` Michal Hocko
2019-03-25  8:14         ` Oscar Salvador
2018-11-27 16:20 ` [PATCH v2 5/5] mm, memory_hotplug: Refactor shrink_zone/pgdat_span Oscar Salvador
2018-11-28  6:50   ` Michal Hocko
2018-11-28  7:07     ` Oscar Salvador
2018-11-28 10:03       ` David Hildenbrand
2018-11-28 10:14       ` Michal Hocko
2018-11-28 11:00         ` osalvador
2018-11-28 12:31           ` Michal Hocko
2018-11-28 12:51             ` osalvador
2018-11-28 13:08               ` Michal Hocko
2018-11-28 13:18                 ` osalvador
2018-11-28 15:50                   ` Michal Hocko
2018-11-28 16:02                     ` osalvador
2018-11-29  9:29                     ` osalvador
2018-11-28 13:09               ` osalvador
2018-11-27 16:25 [PATCH v2 4/5] mm, memory-hotplug: Rework unregister_mem_sect_under_nodes Oscar Salvador

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=45d6b6ed-ae84-f2d5-0d57-dc2e28938ce0@arm.com \
    --to=anshuman.khandual@arm.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=david@redhat.com \
    --cc=jglisse@redhat.com \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=osalvador@suse.com \
    --cc=osalvador@suse.de \
    --cc=pavel.tatashin@microsoft.com \
    --cc=rafael@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-mm Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-mm/0 linux-mm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-mm linux-mm/ https://lore.kernel.org/linux-mm \
		linux-mm@kvack.org
	public-inbox-index linux-mm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kvack.linux-mm


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git