Date: Mon, 17 Feb 2020 18:33:47 +0800
From: Baoquan He
To: David Hildenbrand
Cc: kkabe@vega.pgw.jp, Oscar Salvador, bugzilla-daemon@bugzilla.kernel.org,
	akpm@linux-foundation.org, richardw.yang@linux.intel.com,
	mhocko@kernel.org, n-horiguchi@ah.jp.nec.com, linux-mm@kvack.org
Subject: Re: [Bug 206401] kernel panic on Hyper-V after 5 minutes due to memory hot-add
Message-ID: <20200217103347.GM26758@MiWiFi-R3L-srv>
References: <20200212073123.GG8965@MiWiFi-R3L-srv>
	<200217144627.M0113305@vega.pgw.jp>
	<20200217093447.GA1139@linux>
	<20200217101318.GL26758@MiWiFi-R3L-srv>
	<383e5dbf-b402-575d-8dae-5e92b51e9834@redhat.com>
In-Reply-To: <383e5dbf-b402-575d-8dae-5e92b51e9834@redhat.com>
User-Agent: Mutt/1.10.1 (2018-07-13)

On 02/17/20 at 11:24am, David Hildenbrand wrote:
> On 17.02.20 11:13, Baoquan He wrote:
> > On 02/17/20 at 10:34am, Oscar Salvador wrote:
> >> On Mon, Feb 17, 2020 at 02:46:27PM +0900, kkabe@vega.pgw.jp wrote:
> >>> ===========================================
> >>> struct page * __meminit populate_section_memmap(unsigned long pfn,
> >>>                 unsigned long nr_pages, int nid, struct vmem_altmap *altmap)
> >>> {
> >>>         struct page *page, *ret;
> >>>         unsigned long memmap_size = sizeof(struct page) * PAGES_PER_SECTION;
> >>>
> >>>         page = alloc_pages(GFP_KERNEL|__GFP_NOWARN, get_order(memmap_size));
> >>>         if (page) {
> >>>                 goto got_map_page;
> >>>         }
> >>> pr_info("%s: alloc_pages() returned 0x%p (should be 0), reverting to vmalloc(memmap_size=%lu)\n", __func__, page, memmap_size);
> >>>         BUG_ON(page != 0);
> >>>
> >>>         ret = vmalloc(memmap_size);
> >>> pr_info("%s: vmalloc(%lu) returned 0x%p\n", __func__, memmap_size, ret);
> >>>         if (ret) {
> >>>                 goto got_map_ptr;
> >>>         }
> >>>
> >>>         return NULL;
> >>> got_map_page:
> >>>         ret = (struct page *)pfn_to_kaddr(page_to_pfn(page));
> >>> pr_info("%s: allocated struct page *page=0x%p\n", __func__, page);
> >>> got_map_ptr:
> >>>
> >>> pr_info("%s: returning struct page * =0x%p\n", __func__, ret);
> >>>         return ret;
> >>> }
> >>
> >> Could you please replace %p with %px. With the first, pointers are hashed so it is trickier
> >> to get an overview of the meaning.
> >>
> >> David could be right about ZONE_NORMAL vs ZONE_HIGHMEM.
> >> IIUC, default_kernel_zone_for_pfn and default_zone_for_pfn seem to only deal with
> >> (ZONE_DMA,ZONE_NORMAL] or ZONE_MOVABLE.
> >
> > Ah, I think you both have spotted the problem.
> >
> > In i386, if w/o memory hot add, normal memory will only include those
> > below 896M and they are added into normal zone. The rest are added into
> > highmem zone.
> >
> > How this influence the page allocation?
> >
> > Very huge. As we know, in i386, normal memory can be accessed with
> > virt_to_phys, namely PAGE_OFFSET + phys. But highmem has to be accessed
> > with kmap. However, the later hot added memory are all put into normal
> > memory, accessing into them will stump into vmalloc area, I would say.
> >
> > So, i386 doesn't support memory hot add well. Not sure if below change
> > can make it work normally.
> >
> > We can just adjust the hot adding code as we have done for boot memory.
> > Iterate zone from highmem if allowed when hot add memory.
> >
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index 475d0d68a32c..1380392d9ef5 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -716,7 +716,10 @@ static struct zone *default_kernel_zone_for_pfn(int nid, unsigned long start_pfn
> >  	struct pglist_data *pgdat = NODE_DATA(nid);
> >  	int zid;
> >
> > -	for (zid = 0; zid <= ZONE_NORMAL; zid++) {
> > +	for (zid = 0; zid < MAX_NR_ZONES; zid++) {
>
> ZONE_DEVICE? :/

Not sure if ZONE_DEVICE will be supported on 32-bit systems.

>
> > +		if (zid == ZONE_MOVABLE)
> > +			continue;
> > +
> >  		struct zone *zone = &pgdat->node_zones[zid];
> >
> >  		if (zone_intersects(zone, start_pfn, nr_pages))
> >
> >
>
> What if somebody onlines memory from user space explicitly to the normal
> zone? We can trigger crashes?

Seems the current i386 code doesn't support it, unless we change that
too. Without reserving virtual address space, any memory added later has
to be highmem.

>
> This doesn't look like it ever worked reliably, can we just disable
> memory hotplug in case we have PAE? (especially, as continued i386
> support is questionable)

This is not PAE, this is only HIGHMEM4G.
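
To make the lowmem vs. highmem point above concrete, here is a minimal
userspace sketch (not kernel code). It assumes the default 3G/1G split,
i.e. a PAGE_OFFSET of 0xC0000000, and the 896M lowmem limit mentioned
above; the SKETCH_* macros and sketch_* helpers are hypothetical names
used only for illustration. It shows that PAGE_OFFSET + phys is only a
usable direct-map address while phys stays below the lowmem limit.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed values: the default 3G/1G split and the 896M lowmem limit
 * from the discussion. A real kernel derives both from its config.
 */
#define SKETCH_PAGE_OFFSET	UINT32_C(0xC0000000)
#define SKETCH_LOWMEM_LIMIT	(UINT32_C(896) << 20)

/* Hypothetical helper: PAGE_OFFSET + phys in 32-bit arithmetic. */
static uint32_t sketch_direct_map(uint32_t phys)
{
	/* Unsigned addition wraps modulo 2^32, like a 32-bit pointer. */
	return (uint32_t)(SKETCH_PAGE_OFFSET + phys);
}

/* Hypothetical helper: is this physical address covered by lowmem? */
static bool sketch_is_lowmem(uint32_t phys)
{
	return phys < SKETCH_LOWMEM_LIMIT;
}

/*
 * Hypothetical helper: the direct-map result is only meaningful when
 * the physical address is lowmem; everything else would need kmap.
 */
static bool sketch_direct_map_valid(uint32_t phys)
{
	return sketch_is_lowmem(phys) &&
	       sketch_direct_map(phys) >= SKETCH_PAGE_OFFSET;
}
```

Under these assumptions, sketch_direct_map(0x38000000) gives 0xF8000000,
the first address past the direct map, and sketch_direct_map(0x40000000)
wraps to 0. That is the reason memory hot-added above the lowmem limit
cannot simply be treated as ZONE_NORMAL.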