From: Matthew Wilcox <willy@infradead.org>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Hillf Danton <hdanton@sina.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	syzbot <syzbot+c48f34012b06c4ac67dd@syzkaller.appspotmail.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	syzkaller-bugs@googlegroups.com,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Jens Axboe <axboe@kernel.dk>,
	Markus Elfring <Markus.Elfring@web.de>
Subject: Re: kernel BUG at include/linux/swapops.h:LINE!
Date: Sun, 26 Jul 2020 17:49:04 +0100
Message-ID: <20200726164904.GG23808@casper.infradead.org>
In-Reply-To: <20200724111311.rcjqigtjqpkenxg6@box>

On Fri, Jul 24, 2020 at 02:13:11PM +0300, Kirill A. Shutemov wrote:
> On Thu, Jul 23, 2020 at 03:37:44PM +0800, Hillf Danton wrote:
> > 
> > On Tue, 21 Jul 2020 14:11:31 +0300 Kirill A. Shutemov wrote:
> > > On Mon, Jul 20, 2020 at 04:51:44PM -0700, Andrew Morton wrote:
> > > > On Sun, 19 Jul 2020 14:10:19 -0700 syzbot wrote:
> > > > 
> > > > > syzbot has found a reproducer for the following issue on:
> > > > > 
> > > > > HEAD commit:    4c43049f Add linux-next specific files for 20200716
> > > > > git tree:       linux-next
> > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=12c56087100000
> > > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=2c76d72659687242
> > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=c48f34012b06c4ac67dd
> > > > > compiler:       gcc (GCC) 10.1.0-syz 20200507
> > > > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1344abeb100000
> > > > > 
> > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > > Reported-by: syzbot+c48f34012b06c4ac67dd@syzkaller.appspotmail.com
> > > > 
> > > > Thanks.
> > > > 
> > > > __handle_mm_fault
> > > >   ->pmd_migration_entry_wait
> > > >     ->migration_entry_to_page
> > > > 
> > > > stumbled onto an unlocked page.
> > > > 
> > > > I don't immediately see a cause.  Perhaps Matthew's "THP prep patches",
> > > > perhaps something else.
> > > > 
> > > > Is it possible to perform a bisection?
> > > 
> > > Maybe it's related to the new lock_page_async()?
> > 
> > Or is there possibly a window where, after copy_huge_pmd(), the src pmd
> > migration entry is removed and the page unlocked, but the dst is not?
> 
> No.
> 
> copy_huge_pmd() runs with exclusive mmap_lock on the source side and
> destination side is not running yet.

The one I'm hitting is huge-page related, though.

I added this debug:

+++ b/include/linux/swapops.h
@@ -165,8 +165,9 @@ static inline struct page *device_private_entry_to_page(swp_entry_t entry)
 #ifdef CONFIG_MIGRATION
 static inline swp_entry_t make_migration_entry(struct page *page, int write)
 {
-       BUG_ON(!PageLocked(compound_head(page)));
+       VM_BUG_ON_PAGE(!PageLocked(page), page);
 
+if (PageCompound(page)) printk("pfn %lx order %d\n", page_to_pfn(page), thp_order(thp_head(page)));
        return swp_entry(write ? SWP_MIGRATION_WRITE : SWP_MIGRATION_READ,
                        page_to_pfn(page));
 }
@@ -194,7 +195,11 @@ static inline struct page *migration_entry_to_page(swp_entry_t entry)
         * Any use of migration entries may only occur while the
         * corresponding page is locked
         */
-       BUG_ON(!PageLocked(compound_head(p)));
+       if (!PageLocked(p)) {
+               dump_page(p, "not locked");
+               printk("swap entry %d.%lx\n", swp_type(entry), swp_offset(entry));
+               BUG();
+       }
        return p;
 }
 

and got useful output (while running generic/086):

1457 086 (20181): drop_caches: 3
1457 page:00000000a216ae9a refcount:2 mapcount:0 mapping:000000009ba7bfed index:0x2227 pfn:0x229e7
1457 aops:def_blk_aops ino:0
1457 flags: 0x4000000000002030(lru|active|private)
1457 raw: 4000000000002030 fffff5b4416b5a48 fffff5b4408a7988 ffff9e9c34848578
1457 raw: 0000000000002227 ffff9e9bd18f0d00 00000002ffffffff 0000000000000000
1457 page dumped because: not locked
1457 swap entry 30.229e7
1457 ------------[ cut here ]------------
1457 kernel BUG at include/linux/swapops.h:201!
1457 invalid opcode: 0000 [#1] SMP PTI
1457 CPU: 3 PID: 646 Comm: check Kdump: loaded Tainted: G        W         5.8.0-rc6-00067-gd8b18bdf9870-dirty #355
1457 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
1457 RIP: 0010:__migration_entry_wait+0x109/0x110
[...]

Looking back in the trace, I see:

...
1457 pfn 229e5 order 9
1457 pfn 229e6 order 9
1457 pfn 229e7 order 9
1457 pfn 229e8 order 9
1457 pfn 229e9 order 9
...

so I would say we have a refcount problem.  I've probably made it worse by
creating more THPs, but I don't think I'm the originator of the problem.
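
For anyone reading the trace, here is a minimal sketch of the pfn arithmetic
behind it.  It assumes the compound page is naturally aligned to its order
(as THPs are).  head_pfn() and subpage_index() are hypothetical helper names
made up for illustration; they are not kernel functions.

```c
#include <assert.h>

/*
 * Illustration only: these helpers are hypothetical, not kernel APIs.
 * Both assume the compound page starts on a pfn that is a multiple of
 * (1 << order).
 */

/* pfn of the head page of the compound page containing 'pfn'. */
static unsigned long head_pfn(unsigned long pfn, unsigned int order)
{
	assert(order < sizeof(unsigned long) * 8);
	return pfn & ~((1UL << order) - 1);
}

/* Index of 'pfn' within its compound page (0 means the head page). */
static unsigned long subpage_index(unsigned long pfn, unsigned int order)
{
	assert(order < sizeof(unsigned long) * 8);
	return pfn & ((1UL << order) - 1);
}
```

Under that assumption, head_pfn(0x229e7, 9) is 0x22800 and
subpage_index(0x229e7, 9) is 0x1e7, so the pfn in the failing swap entry
would have been a tail page of the order-9 page when the migration entries
were made.  The later dump_page() output for the same pfn has no head flag
set.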

I know very little about the migration code today.  I suspect I'm going
to have to learn about it next week.
