From: Hillf Danton
To: Matthew Wilcox
Cc: Hillf Danton, "Kirill A. Shutemov", "Kirill A. Shutemov",
	Andrew Morton, syzbot, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, syzkaller-bugs@googlegroups.com, Mike Kravetz,
	Johannes Weiner, Jens Axboe, Markus Elfring
Subject: Re: kernel BUG at include/linux/swapops.h:LINE!
Date: Mon, 27 Jul 2020 22:46:56 +0800
Message-Id: <20200727144656.15200-1-hdanton@sina.com>
In-Reply-To: <20200727134446.GL23808@casper.infradead.org>
References: <000000000000bc4fd705a6e090e2@google.com>
	<0000000000004c38cd05aad1d13f@google.com>
	<20200720165144.93189f7825bd28e234a42cb8@linux-foundation.org>
	<20200723073744.5268-1-hdanton@sina.com>
	<20200724111311.rcjqigtjqpkenxg6@box>
	<20200726164904.GG23808@casper.infradead.org>
	<20200727103140.xycdx6ctecomqsoe@box>
	<20200727125950.12048-1-hdanton@sina.com>

On Mon, 27 Jul 2020 14:44:46 +0100 Matthew Wilcox wrote:
> On Mon, Jul 27, 2020 at 08:59:50PM +0800, Hillf Danton wrote:
> > Can you elaborate on the difference between the two dumps?
>
> You didn't trim anything, so I have no idea which two dumps you mean.
>
> I'll annotate below ...

Double thanks.
>
> > > > On Sun, Jul 26, 2020 at 05:49:04PM +0100, Matthew Wilcox wrote:
> > > > > 1457 086 (20181): drop_caches: 3
> > > > > 1457 page:00000000a216ae9a refcount:2 mapcount:0 mapping:000000009ba7bfed index:0x2227 pfn:0x229e7
> > > > > 1457 aops:def_blk_aops ino:0
> > > > > 1457 flags: 0x4000000000002030(lru|active|private)
> > > > > 1457 raw: 4000000000002030 fffff5b4416b5a48 fffff5b4408a7988 ffff9e9c34848578
> > > > > 1457 raw: 0000000000002227 ffff9e9bd18f0d00 00000002ffffffff 0000000000000000
> > > > > 1457 page dumped because: not locked
> > > > > 1457 swap entry 30.229e7
>
> This is a dump of the page that was found when looking up the migration entry.

It can be understood without difficulty, as the page (with mapping) is not locked.

>
> > On Mon, 27 Jul 2020 13:03:10 +0100 Matthew Wilcox wrote:
> > > It's not mapped with a PMD.  I tweaked my debugging slightly:
> > >
> > >  static inline swp_entry_t make_migration_entry(struct page *page, int write)
> > >  {
> > > -	BUG_ON(!PageLocked(compound_head(page)));
> > > +	VM_BUG_ON_PAGE(!PageLocked(page), page);
> > >
> > > +if (PageHead(page)) dump_page(page, "make entry");
> > > +if (PageTail(page)) printk("pfn %lx order %d\n", page_to_pfn(page), thp_order(thp_head(page)));
> > >
> > > 1523 page:0000000006f62206 refcount:490 mapcount:1 mapping:0000000000000000 index:0x562b12a00 pfn:0x1dc00
> > > 1523 head:0000000006f62206 order:9 compound_mapcount:0 compound_pincount:0
> > > 1523 anon flags: 0x400000000009003d(locked|uptodate|dirty|lru|active|head|swapbacked)
> > > 1523 raw: 400000000009003d ffffecfd41301308 ffffecfd41b08008 ffff9e9971c00059
> > > 1523 raw: 0000000562b12a00 0000000000000000 000001ea00000000 0000000000000000
> > > 1523 page dumped because: make entry
>
> This is dumping the page when we create the entry.

It is hard to understand why a locked page is dumped.

>
> For completeness, here's the page that we find from the same run.
>
> 1523 page:00000000a18100e6 refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x1ddde
> 1523 flags: 0x4000000000000000()
> 1523 raw: 4000000000000000 dead000000000100 dead000000000122 0000000000000000
> 1523 raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
> 1523 page dumped because: not locked
>
> (an order-9 page will occupy PFNs 0x1dc00-0x1ddff)
>
> It's clearly been freed and is still sitting on the per-CPU free list.

As it survived free, it is simple to see refcount or lock; what's unclear is
why there is a migrate entry left two miles behind, anon or not.

> I've also seen them as PageBuddy and, as in the first example above,
> reallocated to a different user.
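
The PFN arithmetic quoted above (an order-9 page at pfn 0x1dc00 covering
0x1dc00-0x1ddff, with the freed page at pfn 0x1ddde falling inside it) can be
checked with a small user-space sketch. The helper names below are
hypothetical and are not kernel APIs; the sketch assumes the compound page is
naturally aligned to its order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: first PFN of a naturally aligned order-N block
 * that contains the given PFN (assumes natural alignment). */
unsigned long compound_first_pfn(unsigned long pfn, unsigned int order)
{
	return pfn & ~((1UL << order) - 1);
}

/* Hypothetical helper: last PFN covered by an order-N page whose head
 * page sits at head_pfn. */
unsigned long compound_last_pfn(unsigned long head_pfn, unsigned int order)
{
	return head_pfn + (1UL << order) - 1;
}

/* Hypothetical helper: does pfn fall inside the order-N page at head_pfn? */
bool pfn_in_compound(unsigned long pfn, unsigned long head_pfn,
		     unsigned int order)
{
	return pfn >= head_pfn && pfn <= compound_last_pfn(head_pfn, order);
}

/* Print the range in the same form as the parenthetical in the mail. */
void print_compound_range(unsigned long head_pfn, unsigned int order)
{
	printf("order-%u page occupies PFNs 0x%lx-0x%lx\n",
	       order, head_pfn, compound_last_pfn(head_pfn, order));
}
```

Calling print_compound_range(0x1dc00, 9) reproduces the range given in the
mail, and pfn_in_compound(0x1ddde, 0x1dc00, 9) confirms that the page found
through the migration entry was a tail page of the THP dumped in
make_migration_entry(). The pfn 0x229e7 from the earlier run is outside that
range, as expected for a separate run.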