Date: Mon, 17 Feb 2020 15:44:57 +0800
From: Baoquan He
To: kkabe@vega.pgw.jp
Cc: bugzilla-daemon@bugzilla.kernel.org, akpm@linux-foundation.org,
    richardw.yang@linux.intel.com, david@redhat.com, mhocko@kernel.org,
    n-horiguchi@ah.jp.nec.com, linux-mm@kvack.org
Subject: Re: [Bug 206401] kernel panic on Hyper-V after 5 minutes due to memory hot-add
Message-ID: <20200217074457.GB19207@MiWiFi-R3L-srv>
References: <20200212073123.GG8965@MiWiFi-R3L-srv> <200217144627.M0113305@vega.pgw.jp>
In-Reply-To: <200217144627.M0113305@vega.pgw.jp>

On 02/17/20 at 02:46pm, kkabe@vega.pgw.jp wrote:
> bhe@redhat.com sed in <20200212073123.GG8965@MiWiFi-R3L-srv>
>
> >> On 02/11/20 at 04:41pm, Andrew Morton wrote:
> >> > On Tue, 11 Feb 2020 07:07:41 +0800 Wei Yang wrote:
> >> >
> >> > > On Mon, Feb 10, 2020 at 02:15:51PM +0800, Baoquan He wrote:
> >> > > >On 02/10/20 at 02:09pm, Baoquan He wrote:
> >> > > >> On 02/09/20 at 09:56pm, Andrew Morton wrote:
> >> > > >> > On Mon, 10 Feb 2020 13:40:27 +0800 Baoquan He wrote:
> >> > > >> >
> >> > > >> > > Hi Andrew,
> >> > > >> > >
> >> > > >> > > On 02/09/20 at 09:32pm, Andrew Morton wrote:
> >> > > >> > > > On Tue, 04 Feb 2020 11:25:48 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:
> >> > > >> > > >
> >> > > >> > > > > https://bugzilla.kernel.org/show_bug.cgi?id=206401
> >> > > >> > > > >
> >> > > >> > > >
> >> > > >> > > > An oops during mem hotadd. Could someone please take a look when
> >> > > >> > > > convenient?
> >> > > >> > >
> >> > > >> > > This has been addressed by Wei Yang's patch, please check it here:
> >> > > >> > >
> >> > > >> > > http://lkml.kernel.org/r/20200209104826.3385-7-bhe@redhat.com
> >> > > >> > >
> >> > > >> >
> >> > > >> > hm, OK, thanks. It's unfortunate that a 5.5 fix is buried in a
> >> > > >> > six-patch series which is still in progress! Can we please merge that
> >> > > >> > as a standalone fix with a cc:stable, Fixes:, etc?
> >> > > >
> >> > > >Maybe can add Fixes tag as follow when merge:
> >> > > >
> >> > > >Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
> >> > > >
> >> >
> >> > The reporter (cc'ed here) is still seeing issues:
> >> > https://bugzilla.kernel.org/show_bug.cgi?id=206401
> >> >
> >> > Could we please continue this investigation via emailed reply-to-all,
> >> > rather than via the bugzilla interface?
> >>
> >> Yes, people prefer mailing list to discuss issues.
>
>
> I found perplexing behavior in populate_section_memmap().
>
> populate_section_memmap() calls alloc_pages(), and if that fails,
> falls back to vmalloc().
>
> But according to the trace, populate_section_memmap() seems to
> throw out the alloc_pages() result and always falls back to vmalloc(),
> which could be a wrong area to use.
>
> I sprinkled pr_info() in mm/sparse.c:populate_section_memmap() as below:
>
> ===========================================
> struct page * __meminit populate_section_memmap(unsigned long pfn,
>                 unsigned long nr_pages, int nid, struct vmem_altmap *altmap)
> {
>         struct page *page, *ret;
>         unsigned long memmap_size = sizeof(struct page) * PAGES_PER_SECTION;
>
>         page = alloc_pages(GFP_KERNEL|__GFP_NOWARN, get_order(memmap_size));
>         if (page) {
>                 goto got_map_page;
>         }
> pr_info("%s: alloc_pages() returned 0x%p (should be 0), reverting to vmalloc(memmap_size=%lu)\n", __func__, page, memmap_size);
>         BUG_ON(page != 0);
>
>         ret = vmalloc(memmap_size);
> pr_info("%s: vmalloc(%lu) returned 0x%p\n", __func__, memmap_size, ret);
>         if (ret) {
>                 goto got_map_ptr;
>         }
>
>         return NULL;
> got_map_page:
>         ret = (struct page *)pfn_to_kaddr(page_to_pfn(page));
> pr_info("%s: allocated struct page *page=0x%p\n", __func__, page);
> got_map_ptr:
>
> pr_info("%s: returning struct page * =0x%p\n", __func__, ret);
>         return ret;
> }
> ==================================================
>
> and got a following panic.
> It even ignores BUG_ON() (perhaps optimized out).
>
> Is this worth investigating?
> Disassembly doesn't reveal anything suspicious, but I have feeling that
> I'm looking at disassembly different than that the CPU is seeing.
> It's too trivial to be a compiler bug.
>
>
> ==================================================
> [root@localhost ~]# readelf -l /proc/kcore
>
> Elf file type is CORE (Core file)
> Entry point 0x0
> There are 3 program headers, starting at offset 52
>
> Program Headers:
>   Type  Offset    VirtAddr   PhysAddr   FileSiz    MemSiz     Flg Align
>   NOTE  0x000094  0x00000000 0x00000000 0x01304    0x00000        0
>   LOAD  0xaff2000 0xcaff0000 0xffffffff 0x3400e000 0x3400e000 RWE 0x1000

This should be the vmalloc area. The region covers
[0xcaff0000, 0xcaff0000+0x3400e000], i.e. [0xcaff0000, 0xfeffe000].

>   LOAD  0x002000  0xc0000000 0x00000000 0xa7f0000  0xa7f0000  RWE 0x1000

This should be the direct mapping starting from 0xc0000000. It covers the
boot memory you set for the guest kernel, 168M: [0xc0000000, 0xca7f0000].

Since the system only detects your boot memory, max_pfn corresponds to
168M, and

	VMALLOC_START = high_memory + VMALLOC_OFFSET;

so any hot-added memory will be treated as high memory.

Sorry, I have forgotten most of the details of i386; this is just my rough
understanding of it.

>
>
> [ 302.784196] hv_balloon: Max. dynamic memory size: 1048576 MB
> [ 643.475080] hv_balloon: hv_mem_hot_add: calling add_memory(nid=0, ((start_pfn=0x10000) << PAGE_SHIFT)=0x10000000, (HA_CHUNK << PAGE_SHIFT)=134217728)
> [ 643.513804] populate_section_memmap: alloc_pages() returned 0xb1a7c4b2 (should be 0), reverting to vmalloc(memmap_size=655360)

This pr_info output is truly weird.

> [ 643.513849] populate_section_memmap: vmalloc(655360) returned 0x11b0e715
> [ 643.513872] populate_section_memmap: returning struct page * =0x11b0e715

But here the returned page address is 0x11b0e715, which is also bizarre.
Kernel addresses are above 3G, right?

> [ 643.525352] populate_section_memmap: alloc_pages() returned 0xb1a7c4b2 (should be 0), reverting to vmalloc(memmap_size=655360)
> [ 643.536698] populate_section_memmap: vmalloc(655360) returned 0xf2ba6510
> [ 643.536722] populate_section_memmap: returning struct page * =0xf2ba6510

Here, the returned page address looks regular.

> [ 643.536749] hv_balloon: hv_mem_hot_add: add_memory() returned 0
> [ 645.394458] BUG: unable to handle page fault for address: d13ff000
> [ 645.394518] #PF: supervisor write access in kernel mode
> [ 645.394565] #PF: error_code(0x0002) - not-present page
> [ 645.394584] *pde = 00000000
> [ 645.394601] Oops: 0002 [#1] SMP
> [ 645.394614] CPU: 0 PID: 361 Comm: systemd-udevd Not tainted 5.6.0-rc1.el8.i586 #1
> [ 645.394636] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006 05/23/2012
> [ 645.394670] EIP: wp_page_copy+0x8e/0x750
> [ 645.394690] Code: 03 00 00 8b 45 d0 85 c0 0f 84 46 05 00 00 e8 d9 85 e5 ff 89 45 bc 89 f8 e8 cf 85 e5 ff 8b 55 bc 8d 78 04 8b 0a 83 e7 fc 89 d6 <89> 08 8b 8a fc 0f 00 00 89 88 fc 0f 00 00 89 c1 29 f9 89 55 bc 29
> [ 645.394739] EAX: d13ff000 EBX: c752df28 ECX: 00000000 EDX: c5e0d000
> [ 645.394767] ESI: c5e0d000 EDI: d13ff004 EBP: c752deec ESP: c752dea8
> [ 645.394790] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210282
> [ 645.394815] CR0: 80050033 CR2: d13ff000 CR3: 08e5a000 CR4: 003406d0
> [ 645.394840] Call Trace:
> [ 645.394852] ? reuse_swap_page+0x83/0x390
> [ 645.394873] do_wp_page+0x87/0x6e0
> [ 645.394885] handle_mm_fault+0x808/0xe30
> [ 645.394893] do_page_fault+0x19f/0x4d0
> [ 645.394901] ? do_kern_addr_fault+0x80/0x80
> [ 645.394915] common_exception_read_cr2+0x15a/0x15f
> [ 645.394930] EIP: 0xb7aaf8bb
> [ 645.394944] Code: 24 0c e3 2c 89 d7 83 e2 03 74 11 7a 04 aa 49 74 1f aa 49 74 1b 83 f2 01 75 02 aa 49 89 ca c1 e9 02 83 e2 03 69 c0 01 01 01 01 <f3> ab 89 d1 f3 aa 8b 44 24 08 5f c3 66 90 66 90 66 90 66 90 90 f3
> [ 645.394973] EAX: 00000000 EBX: b7f05f60 ECX: 0000000d EDX: 00000000
> [ 645.394988] ESI: 02194db4 EDI: 02194db4 EBP: b7f05db4 ESP: bffed978
> [ 645.395003] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00210206
> [ 645.395018] Modules linked in: rfkill intel_rapl_msr intel_rapl_common crc32_pclmul snd_pcm snd_timer snd soundcore intel_rapl_perf sg pcspkr hv_netvsc i2c_piix4 hyperv_fb hv_utils hv_balloon joydev ip_tables ext4 mbcache jbd2 sr_mod cdrom sd_mod t10_pi ata_generic hyperv_keyboard hid_hyperv hv_storvsc scsi_transport_fc ata_piix crc32c_intel serio_raw hv_vmbus libata
> [ 645.395101] CR2: 00000000d13ff000
> [ 645.395121] ---[ end trace 3bb1d66cb8b20841 ]---
> [ 645.395144] EIP: wp_page_copy+0x8e/0x750
> [ 645.395157] Code: 03 00 00 8b 45 d0 85 c0 0f 84 46 05 00 00 e8 d9 85 e5 ff 89 45 bc 89 f8 e8 cf 85 e5 ff 8b 55 bc 8d 78 04 8b 0a 83 e7 fc 89 d6 <89> 08 8b 8a fc 0f 00 00 89 88 fc 0f 00 00 89 c1 29 f9 89 55 bc 29
> [ 645.395206] EAX: d13ff000 EBX: c752df28 ECX: 00000000 EDX: c5e0d000
> [ 645.395235] ESI: c5e0d000 EDI: d13ff004 EBP: c752deec ESP: c752dea8
> [ 645.395261] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210282
> [ 645.395278] CR0: 80050033 CR2: d13ff000 CR3: 08e5a000 CR4: 003406d0
> [ 645.395308] Kernel panic - not syncing: Fatal exception
> [ 645.395329] Kernel Offset: 0x3e00000 from 0xc1000000 (relocation range: 0xc0000000-0xcafeffff)
> [ 645.395354] ---[ end Kernel panic - not syncing: Fatal exception ]---
> ==================================================
>