From: Matthew Wilcox <willy@infradead.org>
To: Ryan Roberts <ryan.roberts@arm.com>
Cc: Zi Yan <ziy@nvidia.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, Yang Shi <shy828301@gmail.com>,
	Huang Ying <ying.huang@intel.com>
Subject: Re: [PATCH v3 10/18] mm: Allow non-hugetlb large folios to be batch processed
Date: Fri, 8 Mar 2024 15:11:35 +0000
Message-ID: <Zesqp6SAlsBQzYaq@casper.infradead.org>
In-Reply-To: <644c2f60-dbb0-4fdb-8505-96f8101b2399@arm.com>

On Fri, Mar 08, 2024 at 02:21:30PM +0000, Ryan Roberts wrote:
> > [  247.788985] BUG: Bad page state in process usemem  pfn:ae58c2
> > [  247.789617] page: refcount:0 mapcount:0 mapping:00000000dc16b680 index:0x1 pfn:0xae58c2
> > [  247.790129] aops:0x0 ino:dead000000000122
> > [  247.790394] flags: 0xbfffc0000000000(node=0|zone=2|lastcpupid=0xffff)
> > [  247.790821] page_type: 0xffffffff()
> > [  247.791052] raw: 0bfffc0000000000 0000000000000000 fffffc002a963090 fffffc002a963090
> > [  247.791546] raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
> > [  247.792258] page dumped because: non-NULL mapping
> > [  247.792567] Modules linked in:
> > [  247.792772] CPU: 0 PID: 2052 Comm: usemem Not tainted 6.8.0-rc5-00456-g52fd6cd3bee5 #30
> > [  247.793300] Hardware name: linux,dummy-virt (DT)
> > [  247.793680] Call trace:
> > [  247.793894]  dump_backtrace+0x9c/0x100
> > [  247.794200]  show_stack+0x20/0x38
> > [  247.794460]  dump_stack_lvl+0x90/0xb0
> > [  247.794726]  dump_stack+0x18/0x28
> > [  247.794964]  bad_page+0x88/0x128
> > [  247.795196]  get_page_from_freelist+0xdc4/0x1280
> > [  247.795520]  __alloc_pages+0xe8/0x1038
...
> > My sense is that the first deferred split issue is now fully resolved once the
> > extra code above is reinserted, but we still have a second problem. Thoughts?

That seems likely ;-(  It doesn't fit the same pattern as the ones we've
been looking at.

> bisect lands back on the same patch it always does; "mm: Allow non-hugetlb large
> folios to be batch processed". Without this change, I can't reproduce the above
> oops.
> 
> With that change present, if I "re-narrow" the window as you suggested, I also
> can't reproduce the problem.

Ah, a pre-existing condition ;-(

> As far as I can tell, mapping is zeroed when the page is freed, and the same
> page checks are run at that point too. So mapping must be written to while
> the page is in the buddy? Perhaps something thinks it's still a tail page during
> split, but the buddy thinks it's been freed?

I'll stare at those codepaths; see if I can see anything.

> Also the mapping value 00000000dc16b680 is not a valid kernel address, I don't
> think. So surprised that get_kernel_nofault(host, &mapping->host) works.

Ah, you've been caught by hashed kernel pointers.  You can tell because
the top 32 bits are 0.  The real pointer is fffffc002a963090 (see the
raw dump).
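
To make the hashed-pointer point concrete, here is a minimal sketch that
applies the "top 32 bits are 0" check to the values in the dump. The helper
name looks_hashed() is hypothetical, and the assumption that mapping is the
fourth 64-bit word of struct page (after flags and a two-word list_head) is
mine, taken from how the raw words line up with the decoded fields above.
The check is only a heuristic for pointer fields; non-pointer words such as
the mapcount/refcount word also have a zero upper half.

```python
# The eight 64-bit words from the two "raw:" lines of the dump.
RAW = [
    0x0bfffc0000000000, 0x0000000000000000,
    0xfffffc002a963090, 0xfffffc002a963090,
    0x0000000000000001, 0x0000000000000000,
    0x00000000ffffffff, 0x0000000000000000,
]

def looks_hashed(value: int) -> bool:
    """Heuristic: a nonzero pointer field whose top 32 bits are all zero
    is probably a hashed pointer rather than a real kernel address."""
    return value != 0 and (value >> 32) == 0

# Value shown after "mapping:" in the decoded page line.
printed_mapping = 0x00000000dc16b680

# Assumption: word index 3 (zero-based) of struct page is the mapping field.
real_mapping = RAW[3]

print(f"printed {printed_mapping:#018x} hashed? {looks_hashed(printed_mapping)}")
print(f"raw     {real_mapping:#018x} hashed? {looks_hashed(real_mapping)}")
```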

Actually, I have a clue!  The third and fourth words have the same value.
That's indicative of an empty list_head (next and prev both pointing back
at the list_head itself).  And if this were LRU, that would be the second
and third words.  And the PFN is congruent to 2 modulo 4.
So this is the second tail page, and that's an empty deferred_list.
So how do we init a list_head after a folio gets freed?
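
The reasoning above can be checked mechanically against the raw dump. This
is a rough sketch, not kernel code: the 64-byte struct page size and the
16-byte offset of the deferred list inside the second tail page are
assumptions, and empty_list_head_at() is a hypothetical helper. If both
assumptions hold, an empty list_head stores its own address in next and
prev, so subtracting the offset should give a struct-page-aligned address.

```python
# The eight 64-bit words from the two "raw:" lines of the dump.
RAW = [
    0x0bfffc0000000000, 0x0000000000000000,
    0xfffffc002a963090, 0xfffffc002a963090,
    0x0000000000000001, 0x0000000000000000,
    0x00000000ffffffff, 0x0000000000000000,
]
PFN = 0xae58c2

STRUCT_PAGE_SIZE = 64      # assumption: sizeof(struct page) on this build
DEFERRED_LIST_OFFSET = 16  # assumption: list_head starts at the third word

def empty_list_head_at(words, index):
    """True if words[index] (next) and words[index + 1] (prev) hold the
    same nonzero value, as an empty list_head would."""
    return words[index] != 0 and words[index] == words[index + 1]

# Third and fourth words (zero-based 2 and 3) match; the LRU slot
# (second and third words, zero-based 1 and 2) does not.
deferred_slot_empty = empty_list_head_at(RAW, 2)
lru_slot_empty = empty_list_head_at(RAW, 1)

# PFN modulo 4 gives 2, consistent with the second tail page of a
# naturally aligned large folio.
tail_index = PFN % 4

# An empty list_head points at itself, so this recovers the address of
# the struct page that contains it.
page_addr = RAW[2] - DEFERRED_LIST_OFFSET

print(deferred_slot_empty, lru_slot_empty, tail_index)
print(hex(page_addr), page_addr % STRUCT_PAGE_SIZE == 0)
```

The recovered address being a multiple of the assumed struct page size is
what you would expect if the value really is a self-pointing list_head
embedded in this page rather than a mapping pointer.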


