From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 15 Dec 2016 18:31:55 +0000
From: Andrea Reale <ar@linux.vnet.ibm.com>
To: Xishi Qiu
Cc: Maciej Bielski, scott.branden@broadcom.com, will.deacon@arm.com,
	tech@virtualopensystems.com, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] Memory hotplug support for arm64 platform
References: <1481717765-31186-1-git-send-email-m.bielski@virtualopensystems.com>
	<58523598.3080902@huawei.com> <58523AC5.10800@huawei.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <58523AC5.10800@huawei.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Message-Id: <20161215183154.GC4806@samekh>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Xishi Qiu,

thanks for your comments. The short answer to your question is the
following.

As you hinted, it is related to the way pfn_valid() is implemented on
arm64 when CONFIG_HAVE_ARCH_PFN_VALID is true (the default), i.e., as
just a check of the NOMAP flag on the corresponding memblocks.

Since arch_add_memory()->__add_pages() expects pfn_valid() to return
false when it is first called, we mark the corresponding memory blocks
with NOMAP; however, arch_add_memory()->__add_pages()->__add_section()
->__add_zone() expects pfn_valid() to return true when, at the end of
its body, it cycles through the pages to call SetPageReserved(). Since
the blocks are marked with NOMAP, the pages will not be reserved there;
hence, we need to reserve them after we clear the NOMAP flag inside the
body of arch_add_memory(). Having the pages reserved at the end of
arch_add_memory() is a precondition for the upcoming onlining of the
memory blocks (see memory_block_change_state()).

> It's because that in memmap_init_zone() -> early_pfn_valid(), the new
> page is still invalid, so we need to init it after
> memblock_clear_nomap()
>
> So why not use __init_single_page() and set_pageblock_migratetype()?

About your comment on memmap_init_zone()->early_pfn_valid(): I think
that particular check is never executed in the context of memory
hotplug. In fact, just before the call to early_pfn_valid(), the code
jumps to the `not_early` label, because context == MEMMAP_HOTPLUG.
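For reference, the relevant loop in memmap_init_zone() (mm/page_alloc.c)
looks roughly like the sketch below; I am quoting from memory and
eliding details, so please check the actual source:

	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
		/*
		 * Holes can exist in boot-time mem_map[]s, but they do
		 * not exist on hotplugged memory: skip the early checks.
		 */
		if (context != MEMMAP_EARLY)
			goto not_early;

		if (!early_pfn_valid(pfn))
			continue;
		if (!early_pfn_in_nid(pfn, nid))
			continue;
		/* ... */
not_early:
		/* ... page init common to early and hotplug paths ... */
	}

Since the hotplug path takes the goto on the very first check, neither
early_pfn_valid() nor early_pfn_in_nid() is ever reached there.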
Now, the question would be: why is there no need for this hack in the
memory hotplug implementations of other architectures (e.g., x86)?

The answer we have found lies in the following. As said above, when
CONFIG_HAVE_ARCH_PFN_VALID is true (the default), the arm64
implementation of pfn_valid() just checks the NOMAP flag on the
memblocks and returns true if the corresponding memblock_add_node() has
succeeded (it seems it does not consider the sparse memory model
structures created by arm64_memory_present() and sparse_init()). By
contrast, the implementation of pfn_valid() used by other architectures
is defined in include/linux/mmzone.h. This implementation, among other
things, checks for the SECTION_HAS_MEM_MAP flag on
section->section_mem_map. Now, when we get to memory hotplug, this flag
is actually set inside
__add_section()->sparse_add_one_section()->sparse_init_one_section().
This happens before the call to __add_zone(), so that, when the latter
is eventually invoked, pfn_valid() returns true and the pages are
correctly reserved. On arm64, instead, pfn_valid() keeps returning
false because of its different implementation.

Given the above, we believed it was cleaner to go for this small and
well-confined hack rather than possibly changing the logic of
pfn_valid(). However, we are very open to discussion and suggestions
for improvement. In particular, any advice on how to effectively modify
pfn_valid() without breaking existing functionality is very welcome.

Thanks,
Andrea
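P.S. For quick reference, the two pfn_valid() implementations being
contrasted look roughly as follows; both sketches are quoted from
memory, so please check the actual sources. The arm64 version, in
arch/arm64/mm/init.c, reduces to a memblock lookup that returns false
for regions marked NOMAP:

	#ifdef CONFIG_HAVE_ARCH_PFN_VALID
	int pfn_valid(unsigned long pfn)
	{
		/* false for memblock regions flagged MEMBLOCK_NOMAP */
		return memblock_is_map_memory(pfn << PAGE_SHIFT);
	}
	EXPORT_SYMBOL(pfn_valid);
	#endif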
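The generic sparsemem fallback, in include/linux/mmzone.h, instead only
looks at the section flags; this is why sparse_add_one_section()
setting SECTION_HAS_MEM_MAP is enough to make it return true during
hotplug, before __add_zone() runs:

	static inline int valid_section(struct mem_section *section)
	{
		return (section &&
			(section->section_mem_map & SECTION_HAS_MEM_MAP));
	}

	static inline int pfn_valid(unsigned long pfn)
	{
		if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
			return 0;
		return valid_section(__nr_to_section(pfn_to_section_nr(pfn)));
	}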