linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Luis Chamberlain <mcgrof@kernel.org>
To: Zi Yan <ziy@nvidia.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	Sean Christopherson <seanjc@google.com>,
	Matthew Wilcox <willy@infradead.org>,
	"Pankaj Raghav (Samsung)" <kernel@pankajraghav.com>,
	djwong@kernel.org, brauner@kernel.org, david@fromorbit.com,
	chandan.babu@oracle.com, akpm@linux-foundation.org,
	linux-fsdevel@vger.kernel.org, hare@suse.de,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-xfs@vger.kernel.org, gost.dev@samsung.com,
	p.raghav@samsung.com
Subject: Re: [PATCH v4 05/11] mm: do not split a folio if it has minimum folio order requirement
Date: Mon, 29 Apr 2024 17:49:16 -0700	[thread overview]
Message-ID: <ZjBADFMehRG1U6ln@bombadil.infradead.org> (raw)
In-Reply-To: <ZjA7yBQjkh52TM_T@bombadil.infradead.org>

On Mon, Apr 29, 2024 at 05:31:04PM -0700, Luis Chamberlain wrote:
> On Mon, Apr 29, 2024 at 10:29:29AM -0400, Zi Yan wrote:
> > On 28 Apr 2024, at 23:56, Luis Chamberlain wrote:
> > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > index 73a052a382f1..83b528eb7100 100644
> > > --- a/mm/migrate.c
> > > +++ b/mm/migrate.c
> > > @@ -1484,7 +1484,12 @@ static inline int try_split_folio(struct folio *folio, struct list_head *split_f
> > >  	int rc;
> > >
> > >  	folio_lock(folio);
> > > +	if (folio_test_dirty(folio)) {
> > > +		rc = -EBUSY;
> > > +		goto out;
> > > +	}
> > >  	rc = split_folio_to_list(folio, split_folios);
> > > +out:
> > >  	folio_unlock(folio);
> > >  	if (!rc)
> > >  		list_move_tail(&folio->lru, split_folios);
> > >
> > > However I'd like compaction folks to review this. I see some indications
> > > in the code that migration can race with truncation but we feel fine by
> > > it by taking the folio lock. However here we have a case where we see
> > > the folio clearly locked and the folio is dirty. Other migraiton code
> > > seems to write back the code and can wait, here we just move on. Further
> > > reading on commit 0003e2a414687 ("mm: Add AS_UNMOVABLE to mark mapping
> > > as completely unmovable") seems to hint that migration is safe if the
> > > mapping either does not exist or the mapping does exist but has
> > > mapping->a_ops->migrate_folio so I'd like further feedback on this.
> > 
> > During migration, all page table entries pointing to this dirty folio
> > are invalid, and accesses to this folio will cause page fault and
> > wait on the migration entry. I am not sure we need to skip dirty folios.

That would seem to explain why we get this, if we allow these to
continue, ie, without the hunk above, so it begs to ask, are we sure
we are waiting for migration to complete if we're doing writeback?

Apr 29 17:28:54 min-xfs-reflink-16k-4ks unknown: run fstests generic/447 at 2024-04-29 17:28:54
Apr 29 17:28:55 min-xfs-reflink-16k-4ks kernel: XFS (loop5): EXPERIMENTAL: Filesystem with Large Block Size (16384 bytes) enabled.
Apr 29 17:28:55 min-xfs-reflink-16k-4ks kernel: XFS (loop5): Mounting V5 Filesystem 89a3d14d-8147-455d-8897-927a0b7baf65
Apr 29 17:28:55 min-xfs-reflink-16k-4ks kernel: XFS (loop5): Ending clean mount
Apr 29 17:28:56 min-xfs-reflink-16k-4ks kernel: XFS (loop5): Unmounting Filesystem 89a3d14d-8147-455d-8897-927a0b7baf65
Apr 29 17:28:59 min-xfs-reflink-16k-4ks kernel: XFS (loop5): EXPERIMENTAL: Filesystem with Large Block Size (16384 bytes) enabled.
Apr 29 17:28:59 min-xfs-reflink-16k-4ks kernel: XFS (loop5): Mounting V5 Filesystem e29c3eee-18d1-4fec-9a17-076b3eccbd74
Apr 29 17:28:59 min-xfs-reflink-16k-4ks kernel: XFS (loop5): Ending clean mount
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: BUG: kernel NULL pointer dereference, address: 0000000000000036
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: #PF: supervisor read access in kernel mode
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: #PF: error_code(0x0000) - not-present page
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: PGD 0 P4D 0
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: CPU: 3 PID: 2245 Comm: kworker/u36:5 Not tainted 6.9.0-rc5+ #26
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: Workqueue: writeback wb_workfn (flush-7:5)
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: RIP: 0010:filemap_get_folios_tag (./arch/x86/include/asm/atomic.h:23 ./include/linux/atomic/atomic-arch-fallback.h:457 ./include/linux/atomic/atomic-arch-fallback.h:2426 ./include/linux/atomic/atomic-arch-fallback.h:2456 ./include/linux/atomic/atomic-instrumented.h:1518 ./include/linux/page_ref.h:238 ./include/linux/page_ref.h:247 ./include/linux/page_ref.h:280 ./include/linux/page_ref.h:313 mm/filemap.c:1984 mm/filemap.c:2222) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: Code: ad 05 86 00 48 89 c3 48 3d 06 04 00 00 74 e8 48 81 fb 02 04 00 00 0f 84 d0 00 00 00 48 85 db 0f 84 04 01 00 00 f6 c3 01 75 c4 <8b> 43 34 85 c0 0f 84 b7 00 00 00 8d 50 01 48 8d 73 34 f0 0f b1 53
All code
========
   0:	ad                   	lods   %ds:(%rsi),%eax
   1:	05 86 00 48 89       	add    $0x89480086,%eax
   6:	c3                   	ret
   7:	48 3d 06 04 00 00    	cmp    $0x406,%rax
   d:	74 e8                	je     0xfffffffffffffff7
   f:	48 81 fb 02 04 00 00 	cmp    $0x402,%rbx
  16:	0f 84 d0 00 00 00    	je     0xec
  1c:	48 85 db             	test   %rbx,%rbx
  1f:	0f 84 04 01 00 00    	je     0x129
  25:	f6 c3 01             	test   $0x1,%bl
  28:	75 c4                	jne    0xffffffffffffffee
  2a:*	8b 43 34             	mov    0x34(%rbx),%eax		<-- trapping instruction
  2d:	85 c0                	test   %eax,%eax
  2f:	0f 84 b7 00 00 00    	je     0xec
  35:	8d 50 01             	lea    0x1(%rax),%edx
  38:	48 8d 73 34          	lea    0x34(%rbx),%rsi
  3c:	f0                   	lock
  3d:	0f                   	.byte 0xf
  3e:	b1 53                	mov    $0x53,%cl

Code starting with the faulting instruction
===========================================
   0:	8b 43 34             	mov    0x34(%rbx),%eax
   3:	85 c0                	test   %eax,%eax
   5:	0f 84 b7 00 00 00    	je     0xc2
   b:	8d 50 01             	lea    0x1(%rax),%edx
   e:	48 8d 73 34          	lea    0x34(%rbx),%rsi
  12:	f0                   	lock
  13:	0f                   	.byte 0xf
  14:	b1 53                	mov    $0x53,%cl
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: RSP: 0018:ffffad0a0396b8f8 EFLAGS: 00010246
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: RAX: 0000000000000002 RBX: 0000000000000002 RCX: 00000000000ba200
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: RDX: 0000000000000002 RSI: 0000000000000002 RDI: ffff9f408d20d488
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: RBP: 0000000000000000 R08: ffffffffffffffff R09: 0000000000000000
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: R10: 0000000000000228 R11: 0000000000000000 R12: ffffffffffffffff
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: R13: ffffad0a0396bbb8 R14: ffffad0a0396bcb8 R15: ffff9f40633bfc00
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: FS:  0000000000000000(0000) GS:ffff9f40bbcc0000(0000) knlGS:0000000000000000
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: CR2: 0000000000000036 CR3: 000000010bd44006 CR4: 0000000000770ef0
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: PKRU: 55555554
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: Call Trace:
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel:  <TASK>
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: ? page_fault_oops (arch/x86/mm/fault.c:713) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: ? write_cache_pages (mm/page-writeback.c:2569) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: ? xfs_vm_writepages (fs/xfs/xfs_aops.c:508) xfs
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: ? do_user_addr_fault (./include/linux/kprobes.h:591 (discriminator 1) arch/x86/mm/fault.c:1265 (discriminator 1)) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: ? exc_page_fault (./arch/x86/include/asm/paravirt.h:693 arch/x86/mm/fault.c:1513 arch/x86/mm/fault.c:1563) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: ? filemap_get_folios_tag (./arch/x86/include/asm/atomic.h:23 ./include/linux/atomic/atomic-arch-fallback.h:457 ./include/linux/atomic/atomic-arch-fallback.h:2426 ./include/linux/atomic/atomic-arch-fallback.h:2456 ./include/linux/atomic/atomic-instrumented.h:1518 ./include/linux/page_ref.h:238 ./include/linux/page_ref.h:247 ./include/linux/page_ref.h:280 ./include/linux/page_ref.h:313 mm/filemap.c:1984 mm/filemap.c:2222) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: ? filemap_get_folios_tag (mm/filemap.c:1972 mm/filemap.c:2222) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: ? __pfx_iomap_do_writepage (fs/iomap/buffered-io.c:1963) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: writeback_iter (./include/linux/pagevec.h:91 mm/page-writeback.c:2421 mm/page-writeback.c:2520) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: write_cache_pages (mm/page-writeback.c:2568) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: iomap_writepages (fs/iomap/buffered-io.c:1984) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: xfs_vm_writepages (fs/xfs/xfs_aops.c:508) xfs
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: do_writepages (mm/page-writeback.c:2612) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: __writeback_single_inode (fs/fs-writeback.c:1659) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: ? _raw_spin_lock (./arch/x86/include/asm/atomic.h:115 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:134 (discriminator 4) kernel/locking/spinlock.c:154 (discriminator 4)) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: writeback_sb_inodes (fs/fs-writeback.c:1943) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: __writeback_inodes_wb (fs/fs-writeback.c:2013) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: wb_writeback (fs/fs-writeback.c:2119) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: wb_workfn (fs/fs-writeback.c:2277 (discriminator 1) fs/fs-writeback.c:2304 (discriminator 1)) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: process_one_work (kernel/workqueue.c:3254) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: worker_thread (kernel/workqueue.c:3329 (discriminator 2) kernel/workqueue.c:3416 (discriminator 2)) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: ? __pfx_worker_thread (kernel/workqueue.c:3362) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: kthread (kernel/kthread.c:388) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: ? __pfx_kthread (kernel/kthread.c:341) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: ret_from_fork (arch/x86/kernel/process.c:147) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: ? __pfx_kthread (kernel/kthread.c:341) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: ret_from_fork_asm (arch/x86/entry/entry_64.S:257) 
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel:  </TASK>
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: Modules linked in: xfs nvme_fabrics nvme_core t10_pi crc64_rocksoft_generic crc64_rocksoft crc64 sunrpc 9p netfs kvm_intel kvm crct10dif_pclmul ghash_clmulni_intel sha512_ssse3 sha512_generic sha256_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd virtio_balloon pcspkr 9pnet_virtio virtio_console button evdev joydev serio_raw loop drm dm_mod autofs4 ext4 crc16 mbcache jbd2 btrfs blake2b_generic efivarfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 md_mod virtio_net net_failover failover virtio_blk crc32_pclmul crc32c_intel psmouse virtio_pci virtio_pci_legacy_dev virtio_pci_modern_dev virtio virtio_ring
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: CR2: 0000000000000036
Apr 29 17:29:20 min-xfs-reflink-16k-4ks kernel: ---[ end trace 0000000000000000 ]---

  reply	other threads:[~2024-04-30  0:49 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-25 11:37 [PATCH v4 00/11] enable bs > ps in XFS Pankaj Raghav (Samsung)
2024-04-25 11:37 ` [PATCH v4 01/11] readahead: rework loop in page_cache_ra_unbounded() Pankaj Raghav (Samsung)
2024-04-25 11:37 ` [PATCH v4 02/11] fs: Allow fine-grained control of folio sizes Pankaj Raghav (Samsung)
2024-04-25 18:07   ` Hannes Reinecke
2024-04-26 15:09   ` Darrick J. Wong
2024-04-25 11:37 ` [PATCH v4 03/11] filemap: allocate mapping_min_order folios in the page cache Pankaj Raghav (Samsung)
2024-04-25 19:04   ` Hannes Reinecke
2024-04-26 15:12   ` Darrick J. Wong
2024-04-28 20:59     ` Pankaj Raghav (Samsung)
2024-04-25 11:37 ` [PATCH v4 04/11] readahead: allocate folios with mapping_min_order in readahead Pankaj Raghav (Samsung)
2024-04-25 18:53   ` Matthew Wilcox
2024-04-25 11:37 ` [PATCH v4 05/11] mm: do not split a folio if it has minimum folio order requirement Pankaj Raghav (Samsung)
2024-04-25 20:10   ` Matthew Wilcox
2024-04-26  0:47     ` Luis Chamberlain
2024-04-26 23:46       ` Luis Chamberlain
2024-04-28  0:57         ` Luis Chamberlain
2024-04-29  3:56           ` Luis Chamberlain
2024-04-29 14:29             ` Zi Yan
2024-04-30  0:31               ` Luis Chamberlain
2024-04-30  0:49                 ` Luis Chamberlain [this message]
2024-04-30  2:43                 ` Zi Yan
2024-04-30 19:27                   ` Luis Chamberlain
2024-05-01  4:13                     ` Matthew Wilcox
2024-05-01 14:28                       ` Matthew Wilcox
2024-04-26 15:49     ` Pankaj Raghav (Samsung)
2024-04-25 11:37 ` [PATCH v4 06/11] filemap: cap PTE range to be created to i_size in folio_map_range() Pankaj Raghav (Samsung)
2024-04-25 20:24   ` Matthew Wilcox
2024-04-26 12:54     ` Pankaj Raghav (Samsung)
2024-04-25 11:37 ` [PATCH v4 07/11] iomap: fix iomap_dio_zero() for fs bs > system page size Pankaj Raghav (Samsung)
2024-04-26  6:22   ` Christoph Hellwig
2024-04-26 11:43     ` Pankaj Raghav (Samsung)
2024-04-27  5:12       ` Christoph Hellwig
2024-04-29 21:02         ` Pankaj Raghav (Samsung)
2024-04-27  3:26     ` Matthew Wilcox
2024-04-27  4:52       ` Christoph Hellwig
2024-04-25 11:37 ` [PATCH v4 08/11] xfs: use kvmalloc for xattr buffers Pankaj Raghav (Samsung)
2024-04-26 15:18   ` Darrick J. Wong
2024-04-28 21:06     ` Pankaj Raghav (Samsung)
2024-04-25 11:37 ` [PATCH v4 09/11] xfs: expose block size in stat Pankaj Raghav (Samsung)
2024-04-26 15:15   ` Darrick J. Wong
2024-04-25 11:37 ` [PATCH v4 10/11] xfs: make the calculation generic in xfs_sb_validate_fsb_count() Pankaj Raghav (Samsung)
2024-04-26 15:16   ` Darrick J. Wong
2024-04-25 11:37 ` [PATCH v4 11/11] xfs: enable block size larger than page size support Pankaj Raghav (Samsung)
2024-04-26 15:18   ` Darrick J. Wong
2024-04-27  4:42 ` [PATCH v4 00/11] enable bs > ps in XFS Ritesh Harjani
2024-04-27  5:05   ` Darrick J. Wong
2024-04-29 20:39   ` Pankaj Raghav (Samsung)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZjBADFMehRG1U6ln@bombadil.infradead.org \
    --to=mcgrof@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=brauner@kernel.org \
    --cc=chandan.babu@oracle.com \
    --cc=david@fromorbit.com \
    --cc=djwong@kernel.org \
    --cc=gost.dev@samsung.com \
    --cc=hare@suse.de \
    --cc=kernel@pankajraghav.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=p.raghav@samsung.com \
    --cc=seanjc@google.com \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).