From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1757726AbcIWGkw (ORCPT ); Fri, 23 Sep 2016 02:40:52 -0400
Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:47750
	"EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1754790AbcIWGku (ORCPT );
	Fri, 23 Sep 2016 02:40:50 -0400
Subject: Re: [PATCH v2 1/1] mm/hugetlb: fix memory offline with hugepage
	size > memory block size
To: Michal Hocko, Gerald Schaefer
References: <20160920155354.54403-1-gerald.schaefer@de.ibm.com>
	<20160920155354.54403-2-gerald.schaefer@de.ibm.com>
	<05d701d213d1$7fb70880$7f251980$@alibaba-inc.com>
	<20160921143534.0dd95fe7@thinkpad>
	<20160922095137.GC11875@dhcp22.suse.cz>
Cc: Andrew Morton, Naoya Horiguchi, Hillf Danton, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, "Kirill A . Shutemov",
	Vlastimil Babka, Mike Kravetz, "Aneesh Kumar K . V",
	Martin Schwidefsky, Heiko Carstens, Dave Hansen
From: Rui Teng
Date: Fri, 23 Sep 2016 14:40:33 +0800
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:45.0)
	Gecko/20100101 Thunderbird/45.3.0
MIME-Version: 1.0
In-Reply-To: <20160922095137.GC11875@dhcp22.suse.cz>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-TM-AS-GCONF: 00
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 16092306-0004-0000-0000-000010739217
X-IBM-SpamModules-Scores:
X-IBM-SpamModules-Versions: BY=3.00005807; HX=3.00000240; KW=3.00000007;
	PH=3.00000004; SC=3.00000185; SDB=6.00760516; UDB=6.00361813;
	IPR=6.00535038; BA=6.00004748; NDR=6.00000001; ZLA=6.00000005;
	ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000;
	ZU=6.00000002; MB=3.00012756; XFM=3.00000011;
	UTC=2016-09-23 06:40:47
X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused
x-cbparentid: 16092306-0005-0000-0000-00007924CDC0
Message-Id: <4ef25b67-13bc-57bd-f322-04310e6d6a00@linux.vnet.ibm.com>
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,,
	definitions=2016-09-23_03:,, signatures=0
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0
	spamscore=0 suspectscore=2 malwarescore=0 phishscore=0 adultscore=0
	bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1
	engine=8.0.1-1609020000 definitions=main-1609230121
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 9/22/16 5:51 PM, Michal Hocko wrote:
> On Wed 21-09-16 14:35:34, Gerald Schaefer wrote:
>> dissolve_free_huge_pages() will either run into the VM_BUG_ON() or a
>> list corruption and addressing exception when trying to set a memory
>> block offline that is part (but not the first part) of a hugetlb page
>> with a size > memory block size.
>>
>> When no other smaller hugetlb page sizes are present, the VM_BUG_ON()
>> will trigger directly. In the other case we will run into an addressing
>> exception later, because dissolve_free_huge_page() will not work on the
>> head page of the compound hugetlb page which will result in a NULL
>> hstate from page_hstate().
>>
>> To fix this, first remove the VM_BUG_ON() because it is wrong, and then
>> use the compound head page in dissolve_free_huge_page().
>
> OK so dissolve_free_huge_page will work also on tail pages now which
> makes some sense. I would appreciate also few words why do we want to
> sacrifice something as precious as gigantic page rather than fail the
> page block offline. Dave pointed out dim offline usecase for example.
>
>> Also change locking in dissolve_free_huge_page(), so that it only takes
>> the lock when actually removing a hugepage.
>
> From a quick look it seems this has been broken since introduced by
> c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle
> hugepage"). Do we want to have this backported to stable? In any way
> Fixes: SHA1 would be really nice.
>

If the huge page hot-plug support was introduced by c8721bbbdd36, that
commit already indicated that gigantic pages are not supported:

  "As for larger hugepages (1GB for x86_64), it's not easy to do
  hotremove over them because it's larger than memory block. So we now
  simply leave it to fail as it is."

Is it possible that hot-plug of gigantic pages has never been supported?

I made another patch for this problem, and I also tried applying the
first version of this patch on my system. Both only postpone the error.
HugePages_Free drops from 2 to 1 when I offline a huge page, so I think
the rollback is not handled correctly.

  # cat /proc/meminfo | grep -i huge
  AnonHugePages:         0 kB
  HugePages_Total:       2
  HugePages_Free:        1
  HugePages_Rsvd:        0
  HugePages_Surp:        0
  Hugepagesize:   16777216 kB

I will do more testing, but can anyone confirm that this function has
been implemented and tested before?
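
The counter check described above (HugePages_Free dropping from 2 to 1 around an offline attempt) could be automated roughly as follows. This is only a minimal sketch and not part of any posted patch: the helper names `parse_hugepage_counters`, `changed_counters` and `read_meminfo` are hypothetical, and the script assumes the usual "Name: value" layout of /proc/meminfo.

```python
import re

# Counters from /proc/meminfo that describe the hugetlb pool.
HUGE_FIELDS = ("HugePages_Total", "HugePages_Free",
               "HugePages_Rsvd", "HugePages_Surp")


def parse_hugepage_counters(meminfo_text):
    """Return the HugePages_* counters found in /proc/meminfo-style text."""
    counters = {}
    for line in meminfo_text.splitlines():
        match = re.match(r"\s*(\w+):\s+(\d+)", line)
        if match and match.group(1) in HUGE_FIELDS:
            counters[match.group(1)] = int(match.group(2))
    return counters


def changed_counters(before, after):
    """Map each counter whose value differs to a (before, after) pair."""
    return {name: (before[name], after[name])
            for name in before
            if name in after and before[name] != after[name]}


def read_meminfo(path="/proc/meminfo"):
    """Read the live counters; only meaningful on a Linux system."""
    with open(path) as f:
        return f.read()
```

Taking one snapshot with `parse_hugepage_counters(read_meminfo())` before the offline attempt and one after, then passing both to `changed_counters()`, shows which pool counters moved. If the offline fails, any counter that still appears in the result was not restored.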