All of lore.kernel.org
 help / color / mirror / Atom feed
* mm: swap: locking in release_pages()
@ 2022-04-05 10:20 Alexander Sverdlin
  2022-04-05 10:43 ` Michal Hocko
  0 siblings, 1 reply; 6+ messages in thread
From: Alexander Sverdlin @ 2022-04-05 10:20 UTC (permalink / raw)
  To: Nicholas Piggin, Alexander Duyck, Matthew Wilcox, Michal Hocko,
	Hugh Dickins, Yu Zhao, Mel Gorman, Lee Schermerhorn,
	Hisashi Hifumi, Sasha Levin, Andrew Morton, linux-mm
  Cc: linux-kernel

Dear mm developers!

After experiencing a crash in release_pages() [1] I'm trying to understand the locking in the release_pages():

No matter if we consider v5.17 or v5.4 (as in my case), they both have similar locking patterns:

v5.17:

if (PageLRU(page)) {
        struct lruvec *prev_lruvec = lruvec;

        lruvec = folio_lruvec_relock_irqsave(folio, lruvec,
                                                        &flags);
        if (prev_lruvec != lruvec)
                lock_batch = 0;

        del_page_from_lru_list(page, lruvec);
        __clear_page_lru_flags(page);
}

v5.4:

if (PageLRU(page)) {
        struct pglist_data *pgdat = page_pgdat(page);

        if (pgdat != locked_pgdat) {
                if (locked_pgdat)
                        spin_unlock_irqrestore(&locked_pgdat->lru_lock,
                                                        flags);
                lock_batch = 0;
                locked_pgdat = pgdat;
                spin_lock_irqsave(&locked_pgdat->lru_lock, flags);
        }

        lruvec = mem_cgroup_page_lruvec(page, locked_pgdat);
        VM_BUG_ON_PAGE(!PageLRU(page), page);
        __ClearPageLRU(page);
        del_page_from_lru_list(page, lruvec, page_off_lru(page));
}

What I don't understand here is, what guarantees us that "if (PageLRU(page))" condition
is still valid after we swap the locks in "if (pgdat != locked_pgdat)" case?
If we check under one lock and VM_BUG_ON_PAGE() under another lock, what actually stops
it from crashing as below or BUG() from time to time?

1. Crash of v5.4.170 on an ARM32 machine:

Unable to handle kernel NULL pointer dereference at virtual address 00000104
pgd = e138149d
[00000104] *pgd=84d2fd003, *pmd=8ffd6f003
Internal error: Oops: a07 [#1] PREEMPT SMP ARM
...
CPU: 1 PID: 6172 Comm: AaSysInfoRColle Tainted: G    B      O      5.4.170-... #1
Hardware name: Keystone
PC is at release_pages+0x194/0x358
LR is at release_pages+0x10c/0x358
pc : [<c03367a4>]    lr : [<c033671c>]    psr: 600c0093
sp : ccec7da8  ip : 00000000  fp : 00000002
r10: 600c0013  r9 : c0e054a0  r8 : 00000001
r7 : 00000000  r6 : ccec7e68  r5 : 00000000  r4 : d435e280
r3 : 00000122  r2 : 00000100  r1 : 00000001  r0 : 600c0013
Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 30c5387d  Table: 4d2b2cc0  DAC: 55555555
Process AaSysInfoRColle (pid: 6172, stack limit = 0x36239b43)
Stack: (0xccec7da8 to 0xccec8000)
7da0:                   c0a11970 c0ea3380 00000000 c0e04e48 cb8fbef8 ccec7dbc
7dc0: ccec7dbc c0e04e48 d435e280 ccec7e64 d435e280 00000000 ffffffff 00000000
7de0: 00000000 cb8fbef8 00000000 c0338654 00000001 c0345f0c ccec7e28 d0d0d0d0
7e00: 7fffffff c03af3e0 000000fc fffffe00 00000000 00000000 cb8fbdf8 00000000
7e20: 00000000 2f2f2f2f 00000000 c03ac448 cefdc02d c03af738 cefdc02d fefefeff
7e40: ccec7ea0 00000002 ccec7f60 c03ad6e8 00000010 00000000 00000000 ffffffff
7e60: ccec7ea0 00000101 d435e280 0000000a fffffffe c03ad6f0 00000000 ccec7ea0
7e80: ccec7f60 c03bf24c ccec7ef0 c03b9738 cb8fbdf8 c03bf24c ccec7f68 c03b1eac
7ea0: 62395b18 c0e04e48 20d0cab6 ccec7ee0 cb8fbdf8 ced8e580 ccec7f68 ffffffff
7ec0: ffffffff cb8fbdf8 fffffffe c0346468 ffffffff ffffffff 00000000 c03ee290
7ee0: 00000000 c03d16e8 cb8fbdf8 cb8fbe7c 00000007 00000000 00000000 cbb26580
7f00: c0268e64 ccec7f04 ccec7f04 c0e04e48 cb8fbdf8 c0e04e48 c0a108e4 cb8fbdf8
7f20: cb8fbeb8 c0a108e4 ccec7f68 00000000 ffffff9c c03bf790 c9cced48 00000000
7f40: cefdc000 c03b3950 ccec7f68 ccec7f58 cd2a2540 00000002 00000000 00000000
7f60: c3043a90 cd768440 79377e71 00000047 cefdc02d c03c55dc 0000000a c0e04e48
7f80: b0a9b7e8 b0a9b7e8 b6d695a1 aaa0fd60 0000000a c0201204 ccec6000 0000000a
7fa0: 000002a4 c0201000 b0a9b7e8 b6d695a1 b0a9b7e8 b07056b0 a5c2b128 00000000
7fc0: b0a9b7e8 b6d695a1 aaa0fd60 0000000a a5c2a344 a5c2a348 a5c2a474 000002a4
7fe0: b6e5191c a5c2a2bc b68922d3 b68e15d8 600c0030 b0a9b7e8 00000000 00000000
[<c03367a4>] (release_pages) from [<c0338654>] (__pagevec_release+0x28/0x44)
[<c0338654>] (__pagevec_release) from [<c0345f0c>] (shmem_undo_range+0x358/0x714)
[<c0345f0c>] (shmem_undo_range) from [<c0346468>] (shmem_evict_inode+0x11c/0x2ac)
[<c0346468>] (shmem_evict_inode) from [<c03bf790>] (evict+0xa0/0x1b8)
[<c03bf790>] (evict) from [<c03b3950>] (do_unlinkat+0x1d0/0x2a0)
[<c03b3950>] (do_unlinkat) from [<c0201000>] (ret_fast_syscall+0x0/0x4c)
Exception stack(0xccec7fa8 to 0xccec7ff0)
7fa0:                   b0a9b7e8 b6d695a1 b0a9b7e8 b07056b0 a5c2b128 00000000
7fc0: b0a9b7e8 b6d695a1 aaa0fd60 0000000a a5c2a344 a5c2a348 a5c2a474 000002a4
7fe0: b6e5191c a5c2a2bc b68922d3 b68e15d8
Code: e5933000 e3130020 1a000065 e1c420d4 (e5823004)
Kernel panic - not syncing: Fatal exception
 

-- 
Best regards,
Alexander Sverdlin.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: mm: swap: locking in release_pages()
  2022-04-05 10:20 mm: swap: locking in release_pages() Alexander Sverdlin
@ 2022-04-05 10:43 ` Michal Hocko
  2022-04-05 11:36   ` Alexander Sverdlin
  0 siblings, 1 reply; 6+ messages in thread
From: Michal Hocko @ 2022-04-05 10:43 UTC (permalink / raw)
  To: Alexander Sverdlin
  Cc: Nicholas Piggin, Alexander Duyck, Matthew Wilcox, Hugh Dickins,
	Yu Zhao, Mel Gorman, Lee Schermerhorn, Sasha Levin,
	Andrew Morton, linux-mm, linux-kernel

On Tue 05-04-22 12:20:15, Alexander Sverdlin wrote:
> Dear mm developers!
> 
> After experiencing a crash in release_pages() [1] I'm trying to understand the locking in the release_pages():
> 
> No matter if we consider v5.17 or v5.4 (as in my case), they both have similar locking patterns:

Similar but the notable difference is that 5.4 used per node lru locking
while newer versions 5.11+ kernels use per memcg locking. If you see the
issue on 5.4 then this is unlikely a regression.
[...]
> What I don't understand here is, what guarantees us that "if (PageLRU(page))" condition
> is still valid after we swap the locks in "if (pgdat != locked_pgdat)" case?

The underlying reasoning is that the PageLRU handling is done after the
last reference has been dropped. isolate_lru_page and others should
elevate the reference count before isolating page from LRU lists.
Some callers user TestClearPageLRU

> If we check under one lock and VM_BUG_ON_PAGE() under another lock, what actually stops
> it from crashing as below or BUG() from time to time?
G

> 
> 1. Crash of v5.4.170 on an ARM32 machine:
> 
> Unable to handle kernel NULL pointer dereference at virtual address 00000104
> pgd = e138149d
> [00000104] *pgd=84d2fd003, *pmd=8ffd6f003
> Internal error: Oops: a07 [#1] PREEMPT SMP ARM
> ...
> CPU: 1 PID: 6172 Comm: AaSysInfoRColle Tainted: G    B      O      5.4.170-... #1
> Hardware name: Keystone
> PC is at release_pages+0x194/0x358
> LR is at release_pages+0x10c/0x358

Which LOC does this correspond to? (faddr2line should give you a nice
output).
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: mm: swap: locking in release_pages()
  2022-04-05 10:43 ` Michal Hocko
@ 2022-04-05 11:36   ` Alexander Sverdlin
  2022-04-05 11:45     ` Michal Hocko
  0 siblings, 1 reply; 6+ messages in thread
From: Alexander Sverdlin @ 2022-04-05 11:36 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Nicholas Piggin, Alexander Duyck, Matthew Wilcox, Hugh Dickins,
	Yu Zhao, Mel Gorman, Lee Schermerhorn, Sasha Levin,
	Andrew Morton, linux-mm, linux-kernel

Hello Michal,

thanks for the quick reply!

On 05/04/2022 12:43, Michal Hocko wrote:
>> 1. Crash of v5.4.170 on an ARM32 machine:
>>
>> Unable to handle kernel NULL pointer dereference at virtual address 00000104
>> pgd = e138149d
>> [00000104] *pgd=84d2fd003, *pmd=8ffd6f003
>> Internal error: Oops: a07 [#1] PREEMPT SMP ARM
>> ...
>> CPU: 1 PID: 6172 Comm: AaSysInfoRColle Tainted: G    B      O      5.4.170-... #1
>> Hardware name: Keystone
>> PC is at release_pages+0x194/0x358
>> LR is at release_pages+0x10c/0x358
> Which LOC does this correspond to? (faddr2line should give you a nice
> output).

Sorry, I forgot this info in the initial report:

this is indeed the del_page_from_lru_list() in this crash. So it's either two different ->lru_lock
or some other code path doesn't take lru_lock at all...

-- 
Best regards,
Alexander Sverdlin.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: mm: swap: locking in release_pages()
  2022-04-05 11:36   ` Alexander Sverdlin
@ 2022-04-05 11:45     ` Michal Hocko
  2022-04-05 14:00       ` Alexander Sverdlin
  0 siblings, 1 reply; 6+ messages in thread
From: Michal Hocko @ 2022-04-05 11:45 UTC (permalink / raw)
  To: Alexander Sverdlin
  Cc: Nicholas Piggin, Alexander Duyck, Matthew Wilcox, Hugh Dickins,
	Yu Zhao, Mel Gorman, Lee Schermerhorn, Sasha Levin,
	Andrew Morton, linux-mm, linux-kernel

On Tue 05-04-22 13:36:09, Alexander Sverdlin wrote:
> Hello Michal,
> 
> thanks for the quick reply!
> 
> On 05/04/2022 12:43, Michal Hocko wrote:
> >> 1. Crash of v5.4.170 on an ARM32 machine:
> >>
> >> Unable to handle kernel NULL pointer dereference at virtual address 00000104
> >> pgd = e138149d
> >> [00000104] *pgd=84d2fd003, *pmd=8ffd6f003
> >> Internal error: Oops: a07 [#1] PREEMPT SMP ARM
> >> ...
> >> CPU: 1 PID: 6172 Comm: AaSysInfoRColle Tainted: G    B      O      5.4.170-... #1
> >> Hardware name: Keystone
> >> PC is at release_pages+0x194/0x358
> >> LR is at release_pages+0x10c/0x358
> > Which LOC does this correspond to? (faddr2line should give you a nice
> > output).
> 
> Sorry, I forgot this info in the initial report:
> 
> this is indeed the del_page_from_lru_list() in this crash.

Could you be more specific please? Is the problem in list_del or
update_lru_size?

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: mm: swap: locking in release_pages()
  2022-04-05 11:45     ` Michal Hocko
@ 2022-04-05 14:00       ` Alexander Sverdlin
  2022-04-05 14:17         ` Michal Hocko
  0 siblings, 1 reply; 6+ messages in thread
From: Alexander Sverdlin @ 2022-04-05 14:00 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Nicholas Piggin, Alexander Duyck, Matthew Wilcox, Hugh Dickins,
	Yu Zhao, Mel Gorman, Lee Schermerhorn, Sasha Levin,
	Andrew Morton, linux-mm, linux-kernel

Hello Michal!

On 05/04/2022 13:45, Michal Hocko wrote:
>>>> 1. Crash of v5.4.170 on an ARM32 machine:
>>>>
>>>> Unable to handle kernel NULL pointer dereference at virtual address 00000104
>>>> pgd = e138149d
>>>> [00000104] *pgd=84d2fd003, *pmd=8ffd6f003
>>>> Internal error: Oops: a07 [#1] PREEMPT SMP ARM
>>>> ...
>>>> CPU: 1 PID: 6172 Comm: AaSysInfoRColle Tainted: G    B      O      5.4.170-... #1
>>>> Hardware name: Keystone
>>>> PC is at release_pages+0x194/0x358
>>>> LR is at release_pages+0x10c/0x358
>>> Which LOC does this correspond to? (faddr2line should give you a nice
>>> output).
>> Sorry, I forgot this info in the initial report:
>>
>> this is indeed the del_page_from_lru_list() in this crash.
> Could you be more specific please? Is the problem in list_del or
> update_lru_size?

static inline void __list_del(struct list_head * prev, struct list_head * next)
{
        next->prev = prev;    <--

-- 
Best regards,
Alexander Sverdlin.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: mm: swap: locking in release_pages()
  2022-04-05 14:00       ` Alexander Sverdlin
@ 2022-04-05 14:17         ` Michal Hocko
  0 siblings, 0 replies; 6+ messages in thread
From: Michal Hocko @ 2022-04-05 14:17 UTC (permalink / raw)
  To: Alexander Sverdlin
  Cc: Nicholas Piggin, Alexander Duyck, Matthew Wilcox, Hugh Dickins,
	Yu Zhao, Mel Gorman, Lee Schermerhorn, Sasha Levin,
	Andrew Morton, linux-mm, linux-kernel

On Tue 05-04-22 16:00:54, Alexander Sverdlin wrote:
> Hello Michal!
> 
> On 05/04/2022 13:45, Michal Hocko wrote:
> >>>> 1. Crash of v5.4.170 on an ARM32 machine:
> >>>>
> >>>> Unable to handle kernel NULL pointer dereference at virtual address 00000104
> >>>> pgd = e138149d
> >>>> [00000104] *pgd=84d2fd003, *pmd=8ffd6f003
> >>>> Internal error: Oops: a07 [#1] PREEMPT SMP ARM
> >>>> ...
> >>>> CPU: 1 PID: 6172 Comm: AaSysInfoRColle Tainted: G    B      O      5.4.170-... #1
> >>>> Hardware name: Keystone
> >>>> PC is at release_pages+0x194/0x358
> >>>> LR is at release_pages+0x10c/0x358
> >>> Which LOC does this correspond to? (faddr2line should give you a nice
> >>> output).
> >> Sorry, I forgot this info in the initial report:
> >>
> >> this is indeed the del_page_from_lru_list() in this crash.
> > Could you be more specific please? Is the problem in list_del or
> > update_lru_size?
> 
> static inline void __list_del(struct list_head * prev, struct list_head * next)
> {
>         next->prev = prev;    <--

OK, I see. AFAICS this means that entry->next is NULL which doesn't look
like somebody else has done list_del as that would leave poison values
behind. Maybe somebody has clobbered the page state.

In any case I would recommend reproducing without stable patches and/or
with the current Linus tree.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-04-06  1:10 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-05 10:20 mm: swap: locking in release_pages() Alexander Sverdlin
2022-04-05 10:43 ` Michal Hocko
2022-04-05 11:36   ` Alexander Sverdlin
2022-04-05 11:45     ` Michal Hocko
2022-04-05 14:00       ` Alexander Sverdlin
2022-04-05 14:17         ` Michal Hocko

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.