Re: linux-5.13.2: warning from kernel/rcu/tree

rcu.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
       [not found] <c9fd1311-662c-f993-c8ef-54af036f2f78@googlemail.com>
@ 2021-07-18 21:01 ` Oleksandr Natalenko
  2021-07-18 21:03   ` Oleksandr Natalenko
  0 siblings, 1 reply; 34+ messages in thread
From: Oleksandr Natalenko @ 2021-07-18 21:01 UTC (permalink / raw)
  To: LKML
  Cc: linux-stable, Chris Clayton, Chris Rankin, Paul E. McKenney,
	Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Joel Fernandes, rcu, Andrew Morton, linux-mm

Hello.

On sobota 17. července 2021 22:22:08 CEST Chris Clayton wrote:
> I checked the output from dmesg yesterday and found the following warning:
> 
> [Fri Jul 16 09:15:29 2021] ------------[ cut here ]------------
> [Fri Jul 16 09:15:29 2021] WARNING: CPU: 11 PID: 2701 at
> kernel/rcu/tree_plugin.h:359 rcu_note_context_switch+0x37/0x3d0 [Fri Jul 16
> 09:15:29 2021] Modules linked in: uas hidp rfcomm bnep xt_MASQUERADE
> iptable_nat nf_nat xt_LOG nf_log_syslog xt_limit xt_multiport xt_conntrack
> iptable_filter btusb btintel wmi_bmof uvcvideo videobuf2_vmalloc
> videobuf2_memops videobuf2_v4l2 videobuf2_common coretemp hwmon
> snd_hda_codec_hdmi x86_pkg_temp_thermal snd_hda_codec_realtek
> snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg
> snd_hda_codec snd_hwdep snd_hda_core i2c_i801 i2c_smbus iwlmvm mac80211
> iwlwifi i915 mei_me mei cfg80211 intel_lpss_pci intel_lpss wmi
> nf_conntrack_ftp xt_helper nf_conntrack nf_defrag_ipv4 tun
> [Fri Jul 16 09:15:29 2021] CPU: 11 PID: 2701 Comm: lpqd Not tainted 5.13.2
> #1 [Fri Jul 16 09:15:29 2021] Hardware name: Notebook                      
>   NP50DE_DB                       /NP50DE_DB , BIOS 1.07.04 02/17/2020
> [Fri Jul 16 09:15:29 2021] RIP: 0010:rcu_note_context_switch+0x37/0x3d0
> [Fri Jul 16 09:15:29 2021] Code: 02 00 e8 ec a0 6c 00 89 c0 65 4c 8b 2c 25
> 00 6d 01 00 48 03 1c c5 80 56 e1 b6 40 84 ed 75 0d 41 8b 95 04 03 00 00 85
> d2 7e 02 <0f> 0b 65 48 8b 04 25 00 6d 01 00 8b 80 04 03 00 00 85 c0 7e 0a
> 41 [Fri Jul 16 09:15:29 2021] RSP: 0000:ffffb5d483837c70 EFLAGS: 00010002
> [Fri Jul 16 09:15:29 2021] RAX: 000000000000000b RBX: ffff9b77806e1d80 RCX:
> 0000000000000100 [Fri Jul 16 09:15:29 2021] RDX: 0000000000000001 RSI:
> ffffffffb6d82ead RDI: ffffffffb6da5e4e [Fri Jul 16 09:15:29 2021] RBP:
> 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 [Fri Jul 16
> 09:15:29 2021] R10: 000000067bce4fff R11: 0000000000000000 R12:
> ffff9b77806e1100 [Fri Jul 16 09:15:29 2021] R13: ffff9b734a833a00 R14:
> ffff9b734a833a00 R15: 0000000000000000 [Fri Jul 16 09:15:29 2021] FS: 
> 00007fccbfc5fe40(0000) GS:ffff9b77806c0000(0000) knlGS:0000000000000000
> [Fri Jul 16 09:15:29 2021] CS:  0010 DS: 0000 ES: 0000 CR0:
> 0000000080050033 [Fri Jul 16 09:15:29 2021] CR2: 00007fccc2db7290 CR3:
> 00000003fb0b8002 CR4: 00000000007706e0 [Fri Jul 16 09:15:29 2021] PKRU:
> 55555554
> [Fri Jul 16 09:15:29 2021] Call Trace:
> [Fri Jul 16 09:15:29 2021]  __schedule+0x86/0x810
> [Fri Jul 16 09:15:29 2021]  schedule+0x40/0xe0
> [Fri Jul 16 09:15:29 2021]  io_schedule+0x3d/0x60
> [Fri Jul 16 09:15:29 2021]  wait_on_page_bit_common+0x129/0x390
> [Fri Jul 16 09:15:29 2021]  ? __filemap_set_wb_err+0x10/0x10
> [Fri Jul 16 09:15:29 2021]  __lock_page_or_retry+0x13f/0x1d0
> [Fri Jul 16 09:15:29 2021]  do_swap_page+0x335/0x5b0
> [Fri Jul 16 09:15:29 2021]  __handle_mm_fault+0x444/0xb20
> [Fri Jul 16 09:15:29 2021]  handle_mm_fault+0x5c/0x170
> [Fri Jul 16 09:15:29 2021]  ? find_vma+0x5b/0x70
> [Fri Jul 16 09:15:29 2021]  exc_page_fault+0x1ab/0x610
> [Fri Jul 16 09:15:29 2021]  ? fpregs_assert_state_consistent+0x19/0x40
> [Fri Jul 16 09:15:29 2021]  ? asm_exc_page_fault+0x8/0x30
> [Fri Jul 16 09:15:29 2021]  asm_exc_page_fault+0x1e/0x30
> [Fri Jul 16 09:15:29 2021] RIP: 0033:0x7fccc2d3c520
> [Fri Jul 16 09:15:29 2021] Code: 68 4c 00 00 00 e9 20 fb ff ff ff 25 7a ad
> 07 00 68 4d 00 00 00 e9 10 fb ff ff ff 25 72 ad 07 00 68 4e 00 00 00 e9 00
> fb ff ff <ff> 25 6a ad 07 00 68 4f 00 00 00 e9 f0 fa ff ff ff 25 62 ad 07
> 00 [Fri Jul 16 09:15:29 2021] RSP: 002b:00007ffebd529048 EFLAGS: 00010293
> [Fri Jul 16 09:15:29 2021] RAX: 0000000000000001 RBX: 00007fccc46e2890 RCX:
> 0000000000000010 [Fri Jul 16 09:15:29 2021] RDX: 0000000000000010 RSI:
> 0000000000000000 RDI: 00007fccc46e2890 [Fri Jul 16 09:15:29 2021] RBP:
> 000056264f1dd4a0 R08: 000056264f21aba0 R09: 000056264f1f58a0 [Fri Jul 16
> 09:15:29 2021] R10: 0000000000000007 R11: 0000000000000246 R12:
> 000056264f21ac00 [Fri Jul 16 09:15:29 2021] R13: 000056264f1e0a30 R14:
> 00007ffebd529080 R15: 00000000000dd87b [Fri Jul 16 09:15:29 2021] ---[ end
> trace c8b06e067d8b0fc2 ]---
> 
> At the time the warning was issued I was creating a (weekly) backup of my
> linux system (home-brewed based on the guidance from Linux From Scratch).
> My backup routine is completed by copying the archive files (created with
> dar) and a directory that contains about 7000 source and binary rpm files
> to an external USB drive. I didn't spot the warning until later in the day,
> so I'm not sure exactly where I was in my backup process.
> 
> I haven't seen this warning before. Consequently, I don;t know how easy (or
> otherwise) it is to reproduce.
> 
> Let me know if I can provide any additional diagnostics, but please cc me as
> I'm not subscribed.

Confirming the same for me with v5.13.2, and cross-referencing another
report [1] against v5.12.17.

Also Cc'ing relevant people on this.

Thanks.

[1] https://lore.kernel.org/lkml/CAK2bqVK0Q9YcpakE7_Rc6nr-E4e2GnMOgi5jJj=_Eh_1kEHLHA@mail.gmail.com/


-- 
Oleksandr Natalenko (post-factum)



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-18 21:01 ` linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359 Oleksandr Natalenko
@ 2021-07-18 21:03   ` Oleksandr Natalenko
  2021-07-18 21:22     ` Matthew Wilcox
  2021-07-18 21:59     ` Paul E. McKenney
  0 siblings, 2 replies; 34+ messages in thread
From: Oleksandr Natalenko @ 2021-07-18 21:03 UTC (permalink / raw)
  To: linux-kernel
  Cc: stable, Chris Clayton, Chris Rankin, Paul E. McKenney,
	Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Joel Fernandes, rcu, Andrew Morton, linux-mm

+ stable@vger.kernel.org

On neděle 18. července 2021 23:01:24 CEST Oleksandr Natalenko wrote:
> Hello.
> 
> On sobota 17. července 2021 22:22:08 CEST Chris Clayton wrote:
> > I checked the output from dmesg yesterday and found the following warning:
> > 
> > [Fri Jul 16 09:15:29 2021] ------------[ cut here ]------------
> > [Fri Jul 16 09:15:29 2021] WARNING: CPU: 11 PID: 2701 at
> > kernel/rcu/tree_plugin.h:359 rcu_note_context_switch+0x37/0x3d0 [Fri Jul
> > 16
> > 09:15:29 2021] Modules linked in: uas hidp rfcomm bnep xt_MASQUERADE
> > iptable_nat nf_nat xt_LOG nf_log_syslog xt_limit xt_multiport xt_conntrack
> > iptable_filter btusb btintel wmi_bmof uvcvideo videobuf2_vmalloc
> > videobuf2_memops videobuf2_v4l2 videobuf2_common coretemp hwmon
> > snd_hda_codec_hdmi x86_pkg_temp_thermal snd_hda_codec_realtek
> > snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg
> > snd_hda_codec snd_hwdep snd_hda_core i2c_i801 i2c_smbus iwlmvm mac80211
> > iwlwifi i915 mei_me mei cfg80211 intel_lpss_pci intel_lpss wmi
> > nf_conntrack_ftp xt_helper nf_conntrack nf_defrag_ipv4 tun
> > [Fri Jul 16 09:15:29 2021] CPU: 11 PID: 2701 Comm: lpqd Not tainted 5.13.2
> > #1 [Fri Jul 16 09:15:29 2021] Hardware name: Notebook
> > 
> >   NP50DE_DB                       /NP50DE_DB , BIOS 1.07.04 02/17/2020
> > 
> > [Fri Jul 16 09:15:29 2021] RIP: 0010:rcu_note_context_switch+0x37/0x3d0
> > [Fri Jul 16 09:15:29 2021] Code: 02 00 e8 ec a0 6c 00 89 c0 65 4c 8b 2c 25
> > 00 6d 01 00 48 03 1c c5 80 56 e1 b6 40 84 ed 75 0d 41 8b 95 04 03 00 00 85
> > d2 7e 02 <0f> 0b 65 48 8b 04 25 00 6d 01 00 8b 80 04 03 00 00 85 c0 7e 0a
> > 41 [Fri Jul 16 09:15:29 2021] RSP: 0000:ffffb5d483837c70 EFLAGS: 00010002
> > [Fri Jul 16 09:15:29 2021] RAX: 000000000000000b RBX: ffff9b77806e1d80
> > RCX:
> > 0000000000000100 [Fri Jul 16 09:15:29 2021] RDX: 0000000000000001 RSI:
> > ffffffffb6d82ead RDI: ffffffffb6da5e4e [Fri Jul 16 09:15:29 2021] RBP:
> > 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 [Fri Jul 16
> > 09:15:29 2021] R10: 000000067bce4fff R11: 0000000000000000 R12:
> > ffff9b77806e1100 [Fri Jul 16 09:15:29 2021] R13: ffff9b734a833a00 R14:
> > ffff9b734a833a00 R15: 0000000000000000 [Fri Jul 16 09:15:29 2021] FS:
> > 00007fccbfc5fe40(0000) GS:ffff9b77806c0000(0000) knlGS:0000000000000000
> > [Fri Jul 16 09:15:29 2021] CS:  0010 DS: 0000 ES: 0000 CR0:
> > 0000000080050033 [Fri Jul 16 09:15:29 2021] CR2: 00007fccc2db7290 CR3:
> > 00000003fb0b8002 CR4: 00000000007706e0 [Fri Jul 16 09:15:29 2021] PKRU:
> > 55555554
> > [Fri Jul 16 09:15:29 2021] Call Trace:
> > [Fri Jul 16 09:15:29 2021]  __schedule+0x86/0x810
> > [Fri Jul 16 09:15:29 2021]  schedule+0x40/0xe0
> > [Fri Jul 16 09:15:29 2021]  io_schedule+0x3d/0x60
> > [Fri Jul 16 09:15:29 2021]  wait_on_page_bit_common+0x129/0x390
> > [Fri Jul 16 09:15:29 2021]  ? __filemap_set_wb_err+0x10/0x10
> > [Fri Jul 16 09:15:29 2021]  __lock_page_or_retry+0x13f/0x1d0
> > [Fri Jul 16 09:15:29 2021]  do_swap_page+0x335/0x5b0
> > [Fri Jul 16 09:15:29 2021]  __handle_mm_fault+0x444/0xb20
> > [Fri Jul 16 09:15:29 2021]  handle_mm_fault+0x5c/0x170
> > [Fri Jul 16 09:15:29 2021]  ? find_vma+0x5b/0x70
> > [Fri Jul 16 09:15:29 2021]  exc_page_fault+0x1ab/0x610
> > [Fri Jul 16 09:15:29 2021]  ? fpregs_assert_state_consistent+0x19/0x40
> > [Fri Jul 16 09:15:29 2021]  ? asm_exc_page_fault+0x8/0x30
> > [Fri Jul 16 09:15:29 2021]  asm_exc_page_fault+0x1e/0x30
> > [Fri Jul 16 09:15:29 2021] RIP: 0033:0x7fccc2d3c520
> > [Fri Jul 16 09:15:29 2021] Code: 68 4c 00 00 00 e9 20 fb ff ff ff 25 7a ad
> > 07 00 68 4d 00 00 00 e9 10 fb ff ff ff 25 72 ad 07 00 68 4e 00 00 00 e9 00
> > fb ff ff <ff> 25 6a ad 07 00 68 4f 00 00 00 e9 f0 fa ff ff ff 25 62 ad 07
> > 00 [Fri Jul 16 09:15:29 2021] RSP: 002b:00007ffebd529048 EFLAGS: 00010293
> > [Fri Jul 16 09:15:29 2021] RAX: 0000000000000001 RBX: 00007fccc46e2890
> > RCX:
> > 0000000000000010 [Fri Jul 16 09:15:29 2021] RDX: 0000000000000010 RSI:
> > 0000000000000000 RDI: 00007fccc46e2890 [Fri Jul 16 09:15:29 2021] RBP:
> > 000056264f1dd4a0 R08: 000056264f21aba0 R09: 000056264f1f58a0 [Fri Jul 16
> > 09:15:29 2021] R10: 0000000000000007 R11: 0000000000000246 R12:
> > 000056264f21ac00 [Fri Jul 16 09:15:29 2021] R13: 000056264f1e0a30 R14:
> > 00007ffebd529080 R15: 00000000000dd87b [Fri Jul 16 09:15:29 2021] ---[ end
> > trace c8b06e067d8b0fc2 ]---
> > 
> > At the time the warning was issued I was creating a (weekly) backup of my
> > linux system (home-brewed based on the guidance from Linux From Scratch).
> > My backup routine is completed by copying the archive files (created with
> > dar) and a directory that contains about 7000 source and binary rpm files
> > to an external USB drive. I didn't spot the warning until later in the
> > day,
> > so I'm not sure exactly where I was in my backup process.
> > 
> > I haven't seen this warning before. Consequently, I don;t know how easy
> > (or
> > otherwise) it is to reproduce.
> > 
> > Let me know if I can provide any additional diagnostics, but please cc me
> > as I'm not subscribed.
> 
> Confirming the same for me with v5.13.2, and cross-referencing another
> report [1] against v5.12.17.
> 
> Also Cc'ing relevant people on this.
> 
> Thanks.
> 
> [1]
> https://lore.kernel.org/lkml/CAK2bqVK0Q9YcpakE7_Rc6nr-E4e2GnMOgi5jJj=_Eh_1k
> EHLHA@mail.gmail.com/


-- 
Oleksandr Natalenko (post-factum)



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-18 21:03   ` Oleksandr Natalenko
@ 2021-07-18 21:22     ` Matthew Wilcox
  2021-07-18 21:36       ` Chris Clayton
  2021-07-18 21:59     ` Paul E. McKenney
  1 sibling, 1 reply; 34+ messages in thread
From: Matthew Wilcox @ 2021-07-18 21:22 UTC (permalink / raw)
  To: Oleksandr Natalenko
  Cc: linux-kernel, stable, Chris Clayton, Chris Rankin,
	Paul E. McKenney, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, rcu,
	Andrew Morton, linux-mm

On Sun, Jul 18, 2021 at 11:03:51PM +0200, Oleksandr Natalenko wrote:
> + stable@vger.kernel.org
> 
> On neděle 18. července 2021 23:01:24 CEST Oleksandr Natalenko wrote:
> > Hello.
> > 
> > On sobota 17. července 2021 22:22:08 CEST Chris Clayton wrote:
> > > I checked the output from dmesg yesterday and found the following warning:
> > > 
> > > [Fri Jul 16 09:15:29 2021] ------------[ cut here ]------------
> > > [Fri Jul 16 09:15:29 2021] WARNING: CPU: 11 PID: 2701 at
> > > kernel/rcu/tree_plugin.h:359 rcu_note_context_switch+0x37/0x3d0 [Fri Jul

Could you run ./scripts/faddr2line vmlinux rcu_note_context_switch+0x37/0x3d0

> > > [Fri Jul 16 09:15:29 2021] Call Trace:
> > > [Fri Jul 16 09:15:29 2021]  __schedule+0x86/0x810
> > > [Fri Jul 16 09:15:29 2021]  schedule+0x40/0xe0
> > > [Fri Jul 16 09:15:29 2021]  io_schedule+0x3d/0x60
> > > [Fri Jul 16 09:15:29 2021]  wait_on_page_bit_common+0x129/0x390
> > > [Fri Jul 16 09:15:29 2021]  ? __filemap_set_wb_err+0x10/0x10
> > > [Fri Jul 16 09:15:29 2021]  __lock_page_or_retry+0x13f/0x1d0
> > > [Fri Jul 16 09:15:29 2021]  do_swap_page+0x335/0x5b0
> > > [Fri Jul 16 09:15:29 2021]  __handle_mm_fault+0x444/0xb20
> > > [Fri Jul 16 09:15:29 2021]  handle_mm_fault+0x5c/0x170

You were handling a page fault at the time.  The page you wanted was
on swap and this warning fired as a result of waiting for the page
to come back in from swap.  There are a number of warnings in that
function, so it'd be good to track down exactly which one it is.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-18 21:22     ` Matthew Wilcox
@ 2021-07-18 21:36       ` Chris Clayton
  0 siblings, 0 replies; 34+ messages in thread
From: Chris Clayton @ 2021-07-18 21:36 UTC (permalink / raw)
  To: Matthew Wilcox, Oleksandr Natalenko
  Cc: linux-kernel, stable, Chris Rankin, Paul E. McKenney,
	Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Joel Fernandes, rcu, Andrew Morton, linux-mm



On 18/07/2021 22:22, Matthew Wilcox wrote:
> On Sun, Jul 18, 2021 at 11:03:51PM +0200, Oleksandr Natalenko wrote:
>> + stable@vger.kernel.org
>>
>> On neděle 18. července 2021 23:01:24 CEST Oleksandr Natalenko wrote:
>>> Hello.
>>>
>>> On sobota 17. července 2021 22:22:08 CEST Chris Clayton wrote:
>>>> I checked the output from dmesg yesterday and found the following warning:
>>>>
>>>> [Fri Jul 16 09:15:29 2021] ------------[ cut here ]------------
>>>> [Fri Jul 16 09:15:29 2021] WARNING: CPU: 11 PID: 2701 at
>>>> kernel/rcu/tree_plugin.h:359 rcu_note_context_switch+0x37/0x3d0 [Fri Jul
> 
> Could you run ./scripts/faddr2line vmlinux rcu_note_context_switch+0x37/0x3d0
> 
$ ./scripts/faddr2line vmlinux rcu_note_context_switch+0x37/0x3d0
rcu_note_context_switch+0x37/0x3d0:
rcu_note_context_switch at ??:?

>>>> [Fri Jul 16 09:15:29 2021] Call Trace:
>>>> [Fri Jul 16 09:15:29 2021]  __schedule+0x86/0x810
>>>> [Fri Jul 16 09:15:29 2021]  schedule+0x40/0xe0
>>>> [Fri Jul 16 09:15:29 2021]  io_schedule+0x3d/0x60
>>>> [Fri Jul 16 09:15:29 2021]  wait_on_page_bit_common+0x129/0x390
>>>> [Fri Jul 16 09:15:29 2021]  ? __filemap_set_wb_err+0x10/0x10
>>>> [Fri Jul 16 09:15:29 2021]  __lock_page_or_retry+0x13f/0x1d0
>>>> [Fri Jul 16 09:15:29 2021]  do_swap_page+0x335/0x5b0
>>>> [Fri Jul 16 09:15:29 2021]  __handle_mm_fault+0x444/0xb20
>>>> [Fri Jul 16 09:15:29 2021]  handle_mm_fault+0x5c/0x170
> 
> You were handling a page fault at the time.  The page you wanted was
> on swap and this warning fired as a result of waiting for the page
> to come back in from swap.  There are a number of warnings in that
> function, so it'd be good to track down exactly which one it is.
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-18 21:03   ` Oleksandr Natalenko
  2021-07-18 21:22     ` Matthew Wilcox
@ 2021-07-18 21:59     ` Paul E. McKenney
  2021-07-18 22:51       ` Matthew Wilcox
  2021-07-19  7:32       ` Chris Clayton
  1 sibling, 2 replies; 34+ messages in thread
From: Paul E. McKenney @ 2021-07-18 21:59 UTC (permalink / raw)
  To: Oleksandr Natalenko
  Cc: linux-kernel, stable, Chris Clayton, Chris Rankin, Josh Triplett,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes,
	rcu, Andrew Morton, linux-mm

On Sun, Jul 18, 2021 at 11:03:51PM +0200, Oleksandr Natalenko wrote:
> + stable@vger.kernel.org
> 
> On neděle 18. července 2021 23:01:24 CEST Oleksandr Natalenko wrote:
> > Hello.
> > 
> > On sobota 17. července 2021 22:22:08 CEST Chris Clayton wrote:
> > > I checked the output from dmesg yesterday and found the following warning:
> > > 
> > > [Fri Jul 16 09:15:29 2021] ------------[ cut here ]------------
> > > [Fri Jul 16 09:15:29 2021] WARNING: CPU: 11 PID: 2701 at
> > > kernel/rcu/tree_plugin.h:359 rcu_note_context_switch+0x37/0x3d0 [Fri Jul
> > > 16

I am not seeing a warning at line 359 of either v5.13.2 or v5.12.7.

> > > 09:15:29 2021] Modules linked in: uas hidp rfcomm bnep xt_MASQUERADE
> > > iptable_nat nf_nat xt_LOG nf_log_syslog xt_limit xt_multiport xt_conntrack
> > > iptable_filter btusb btintel wmi_bmof uvcvideo videobuf2_vmalloc
> > > videobuf2_memops videobuf2_v4l2 videobuf2_common coretemp hwmon
> > > snd_hda_codec_hdmi x86_pkg_temp_thermal snd_hda_codec_realtek
> > > snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg
> > > snd_hda_codec snd_hwdep snd_hda_core i2c_i801 i2c_smbus iwlmvm mac80211
> > > iwlwifi i915 mei_me mei cfg80211 intel_lpss_pci intel_lpss wmi
> > > nf_conntrack_ftp xt_helper nf_conntrack nf_defrag_ipv4 tun
> > > [Fri Jul 16 09:15:29 2021] CPU: 11 PID: 2701 Comm: lpqd Not tainted 5.13.2
> > > #1 [Fri Jul 16 09:15:29 2021] Hardware name: Notebook
> > > 
> > >   NP50DE_DB                       /NP50DE_DB , BIOS 1.07.04 02/17/2020
> > > 
> > > [Fri Jul 16 09:15:29 2021] RIP: 0010:rcu_note_context_switch+0x37/0x3d0
> > > [Fri Jul 16 09:15:29 2021] Code: 02 00 e8 ec a0 6c 00 89 c0 65 4c 8b 2c 25
> > > 00 6d 01 00 48 03 1c c5 80 56 e1 b6 40 84 ed 75 0d 41 8b 95 04 03 00 00 85
> > > d2 7e 02 <0f> 0b 65 48 8b 04 25 00 6d 01 00 8b 80 04 03 00 00 85 c0 7e 0a
> > > 41 [Fri Jul 16 09:15:29 2021] RSP: 0000:ffffb5d483837c70 EFLAGS: 00010002
> > > [Fri Jul 16 09:15:29 2021] RAX: 000000000000000b RBX: ffff9b77806e1d80
> > > RCX:
> > > 0000000000000100 [Fri Jul 16 09:15:29 2021] RDX: 0000000000000001 RSI:
> > > ffffffffb6d82ead RDI: ffffffffb6da5e4e [Fri Jul 16 09:15:29 2021] RBP:
> > > 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 [Fri Jul 16
> > > 09:15:29 2021] R10: 000000067bce4fff R11: 0000000000000000 R12:
> > > ffff9b77806e1100 [Fri Jul 16 09:15:29 2021] R13: ffff9b734a833a00 R14:
> > > ffff9b734a833a00 R15: 0000000000000000 [Fri Jul 16 09:15:29 2021] FS:
> > > 00007fccbfc5fe40(0000) GS:ffff9b77806c0000(0000) knlGS:0000000000000000
> > > [Fri Jul 16 09:15:29 2021] CS:  0010 DS: 0000 ES: 0000 CR0:
> > > 0000000080050033 [Fri Jul 16 09:15:29 2021] CR2: 00007fccc2db7290 CR3:
> > > 00000003fb0b8002 CR4: 00000000007706e0 [Fri Jul 16 09:15:29 2021] PKRU:
> > > 55555554
> > > [Fri Jul 16 09:15:29 2021] Call Trace:
> > > [Fri Jul 16 09:15:29 2021]  __schedule+0x86/0x810
> > > [Fri Jul 16 09:15:29 2021]  schedule+0x40/0xe0
> > > [Fri Jul 16 09:15:29 2021]  io_schedule+0x3d/0x60
> > > [Fri Jul 16 09:15:29 2021]  wait_on_page_bit_common+0x129/0x390
> > > [Fri Jul 16 09:15:29 2021]  ? __filemap_set_wb_err+0x10/0x10
> > > [Fri Jul 16 09:15:29 2021]  __lock_page_or_retry+0x13f/0x1d0
> > > [Fri Jul 16 09:15:29 2021]  do_swap_page+0x335/0x5b0
> > > [Fri Jul 16 09:15:29 2021]  __handle_mm_fault+0x444/0xb20
> > > [Fri Jul 16 09:15:29 2021]  handle_mm_fault+0x5c/0x170
> > > [Fri Jul 16 09:15:29 2021]  ? find_vma+0x5b/0x70
> > > [Fri Jul 16 09:15:29 2021]  exc_page_fault+0x1ab/0x610
> > > [Fri Jul 16 09:15:29 2021]  ? fpregs_assert_state_consistent+0x19/0x40
> > > [Fri Jul 16 09:15:29 2021]  ? asm_exc_page_fault+0x8/0x30
> > > [Fri Jul 16 09:15:29 2021]  asm_exc_page_fault+0x1e/0x30
> > > [Fri Jul 16 09:15:29 2021] RIP: 0033:0x7fccc2d3c520
> > > [Fri Jul 16 09:15:29 2021] Code: 68 4c 00 00 00 e9 20 fb ff ff ff 25 7a ad
> > > 07 00 68 4d 00 00 00 e9 10 fb ff ff ff 25 72 ad 07 00 68 4e 00 00 00 e9 00
> > > fb ff ff <ff> 25 6a ad 07 00 68 4f 00 00 00 e9 f0 fa ff ff ff 25 62 ad 07
> > > 00 [Fri Jul 16 09:15:29 2021] RSP: 002b:00007ffebd529048 EFLAGS: 00010293
> > > [Fri Jul 16 09:15:29 2021] RAX: 0000000000000001 RBX: 00007fccc46e2890
> > > RCX:
> > > 0000000000000010 [Fri Jul 16 09:15:29 2021] RDX: 0000000000000010 RSI:
> > > 0000000000000000 RDI: 00007fccc46e2890 [Fri Jul 16 09:15:29 2021] RBP:
> > > 000056264f1dd4a0 R08: 000056264f21aba0 R09: 000056264f1f58a0 [Fri Jul 16
> > > 09:15:29 2021] R10: 0000000000000007 R11: 0000000000000246 R12:
> > > 000056264f21ac00 [Fri Jul 16 09:15:29 2021] R13: 000056264f1e0a30 R14:
> > > 00007ffebd529080 R15: 00000000000dd87b [Fri Jul 16 09:15:29 2021] ---[ end
> > > trace c8b06e067d8b0fc2 ]---
> > > 
> > > At the time the warning was issued I was creating a (weekly) backup of my
> > > linux system (home-brewed based on the guidance from Linux From Scratch).
> > > My backup routine is completed by copying the archive files (created with
> > > dar) and a directory that contains about 7000 source and binary rpm files
> > > to an external USB drive. I didn't spot the warning until later in the
> > > day,
> > > so I'm not sure exactly where I was in my backup process.
> > > 
> > > I haven't seen this warning before. Consequently, I don;t know how easy
> > > (or
> > > otherwise) it is to reproduce.
> > > 
> > > Let me know if I can provide any additional diagnostics, but please cc me
> > > as I'm not subscribed.
> > 
> > Confirming the same for me with v5.13.2, and cross-referencing another
> > report [1] against v5.12.17.
> > 
> > Also Cc'ing relevant people on this.
> > 
> > Thanks.
> > 
> > [1]
> > https://lore.kernel.org/lkml/CAK2bqVK0Q9YcpakE7_Rc6nr-E4e2GnMOgi5jJj=_Eh_1k
> > EHLHA@mail.gmail.com/

But this one does show this warning in v5.12.17:

	WARN_ON_ONCE(!preempt && rcu_preempt_depth() > 0);

This is in rcu_note_context_switch(), and could be caused by something
like a schedule() within an RCU read-side critical section.  This would
of course be RCU-usage bugs, given that you are not permitted to block
within an RCU read-side critical section.

I suggest checking the functions in the stack trace to see where the
rcu_read_lock() is hiding.  CONFIG_PROVE_LOCKING might also be helpful.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-18 21:59     ` Paul E. McKenney
@ 2021-07-18 22:51       ` Matthew Wilcox
  2021-07-19  1:53         ` Paul E. McKenney
  2021-07-19  7:32       ` Chris Clayton
  1 sibling, 1 reply; 34+ messages in thread
From: Matthew Wilcox @ 2021-07-18 22:51 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Oleksandr Natalenko, linux-kernel, stable, Chris Clayton,
	Chris Rankin, Josh Triplett, Steven Rostedt, Mathieu Desnoyers,
	Lai Jiangshan, Joel Fernandes, rcu, Andrew Morton, linux-mm

On Sun, Jul 18, 2021 at 02:59:14PM -0700, Paul E. McKenney wrote:
> > > https://lore.kernel.org/lkml/CAK2bqVK0Q9YcpakE7_Rc6nr-E4e2GnMOgi5jJj=_Eh_1k
> > > EHLHA@mail.gmail.com/
> 
> But this one does show this warning in v5.12.17:
> 
> 	WARN_ON_ONCE(!preempt && rcu_preempt_depth() > 0);
> 
> This is in rcu_note_context_switch(), and could be caused by something
> like a schedule() within an RCU read-side critical section.  This would
> of course be RCU-usage bugs, given that you are not permitted to block
> within an RCU read-side critical section.
> 
> I suggest checking the functions in the stack trace to see where the
> rcu_read_lock() is hiding.  CONFIG_PROVE_LOCKING might also be helpful.

I'm not sure I see it in this stack trace.

Is it possible that there's something taking the rcu read lock in an
interrupt handler, then returning from the interrupt handler without
releasing the rcu lock?  Do we have debugging that would fire if
somebody did this?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-18 22:51       ` Matthew Wilcox
@ 2021-07-19  1:53         ` Paul E. McKenney
  2021-07-19  2:24           ` Zhouyi Zhou
  0 siblings, 1 reply; 34+ messages in thread
From: Paul E. McKenney @ 2021-07-19  1:53 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Oleksandr Natalenko, linux-kernel, stable, Chris Clayton,
	Chris Rankin, Josh Triplett, Steven Rostedt, Mathieu Desnoyers,
	Lai Jiangshan, Joel Fernandes, rcu, Andrew Morton, linux-mm

On Sun, Jul 18, 2021 at 11:51:36PM +0100, Matthew Wilcox wrote:
> On Sun, Jul 18, 2021 at 02:59:14PM -0700, Paul E. McKenney wrote:
> > > > https://lore.kernel.org/lkml/CAK2bqVK0Q9YcpakE7_Rc6nr-E4e2GnMOgi5jJj=_Eh_1k
> > > > EHLHA@mail.gmail.com/
> > 
> > But this one does show this warning in v5.12.17:
> > 
> > 	WARN_ON_ONCE(!preempt && rcu_preempt_depth() > 0);
> > 
> > This is in rcu_note_context_switch(), and could be caused by something
> > like a schedule() within an RCU read-side critical section.  This would
> > of course be RCU-usage bugs, given that you are not permitted to block
> > within an RCU read-side critical section.
> > 
> > I suggest checking the functions in the stack trace to see where the
> > rcu_read_lock() is hiding.  CONFIG_PROVE_LOCKING might also be helpful.
> 
> I'm not sure I see it in this stack trace.
> 
> Is it possible that there's something taking the rcu read lock in an
> interrupt handler, then returning from the interrupt handler without
> releasing the rcu lock?  Do we have debugging that would fire if
> somebody did this?

Lockdep should complain, but in the absence of lockdep I don't know
that anything would gripe in this situation.

Also, this is a preemptible kernel, so it is possible to trace
__rcu_read_lock(), if that helps.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-19  1:53         ` Paul E. McKenney
@ 2021-07-19  2:24           ` Zhouyi Zhou
  2021-07-19  2:27             ` Zhouyi Zhou
  2021-07-19  2:43             ` Matthew Wilcox
  0 siblings, 2 replies; 34+ messages in thread
From: Zhouyi Zhou @ 2021-07-19  2:24 UTC (permalink / raw)
  To: paulmck
  Cc: Matthew Wilcox, Oleksandr Natalenko, linux-kernel, stable,
	Chris Clayton, Chris Rankin, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, rcu,
	Andrew Morton, Linux-MM

On Mon, Jul 19, 2021 at 9:53 AM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Sun, Jul 18, 2021 at 11:51:36PM +0100, Matthew Wilcox wrote:
> > On Sun, Jul 18, 2021 at 02:59:14PM -0700, Paul E. McKenney wrote:
> > > > > https://lore.kernel.org/lkml/CAK2bqVK0Q9YcpakE7_Rc6nr-E4e2GnMOgi5jJj=_Eh_1k
> > > > > EHLHA@mail.gmail.com/
> > >
> > > But this one does show this warning in v5.12.17:
> > >
> > >     WARN_ON_ONCE(!preempt && rcu_preempt_depth() > 0);
> > >
> > > This is in rcu_note_context_switch(), and could be caused by something
> > > like a schedule() within an RCU read-side critical section.  This would
> > > of course be RCU-usage bugs, given that you are not permitted to block
> > > within an RCU read-side critical section.
> > >
> > > I suggest checking the functions in the stack trace to see where the
> > > rcu_read_lock() is hiding.  CONFIG_PROVE_LOCKING might also be helpful.
> >
> > I'm not sure I see it in this stack trace.
> >
> > Is it possible that there's something taking the rcu read lock in an
> > interrupt handler, then returning from the interrupt handler without
> > releasing the rcu lock?  Do we have debugging that would fire if
> > somebody did this?
>
> Lockdep should complain, but in the absence of lockdep I don't know
> that anything would gripe in this situation.
I think Lockdep should complain.
Meanwhile, I examined the 5.12.17 by naked eye, and found a suspicious place
that could possibly trigger that problem:

struct swap_info_struct *get_swap_device(swp_entry_t entry)
{
     struct swap_info_struct *si;
     unsigned long offset;

     if (!entry.val)
             goto out;
    si = swp_swap_info(entry);
    if (!si)
       goto bad_nofile;

   rcu_read_lock();
  if (data_race(!(si->flags & SWP_VALID)))
     goto unlock_out;
  offset = swp_offset(entry);
  if (offset >= si->max)
   goto unlock_out;

  return si;
bad_nofile:
  pr_err("%s: %s%08lx\n", __func__, Bad_file, entry.val);
out:
  return NULL;
unlock_out:
  rcu_read_unlock();
  return NULL;
}
I guess the function "return si" without a rcu_read_unlock.

However the get_swap_device has changed in the mainline tree,
there is no rcu_read_lock anymore.

>
> Also, this is a preemptible kernel, so it is possible to trace
> __rcu_read_lock(), if that helps.
>
>                                                         Thanx, Paul
Thanx
Zhouyi

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-19  2:24           ` Zhouyi Zhou
@ 2021-07-19  2:27             ` Zhouyi Zhou
  2021-07-19  2:43             ` Matthew Wilcox
  1 sibling, 0 replies; 34+ messages in thread
From: Zhouyi Zhou @ 2021-07-19  2:27 UTC (permalink / raw)
  To: paulmck
  Cc: Matthew Wilcox, Oleksandr Natalenko, linux-kernel, stable,
	Chris Clayton, Chris Rankin, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, rcu,
	Andrew Morton, Linux-MM

On Mon, Jul 19, 2021 at 10:24 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
>
> On Mon, Jul 19, 2021 at 9:53 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Sun, Jul 18, 2021 at 11:51:36PM +0100, Matthew Wilcox wrote:
> > > On Sun, Jul 18, 2021 at 02:59:14PM -0700, Paul E. McKenney wrote:
> > > > > > https://lore.kernel.org/lkml/CAK2bqVK0Q9YcpakE7_Rc6nr-E4e2GnMOgi5jJj=_Eh_1k
> > > > > > EHLHA@mail.gmail.com/
> > > >
> > > > But this one does show this warning in v5.12.17:
> > > >
> > > >     WARN_ON_ONCE(!preempt && rcu_preempt_depth() > 0);
> > > >
> > > > This is in rcu_note_context_switch(), and could be caused by something
> > > > like a schedule() within an RCU read-side critical section.  This would
> > > > of course be RCU-usage bugs, given that you are not permitted to block
> > > > within an RCU read-side critical section.
> > > >
> > > > I suggest checking the functions in the stack trace to see where the
> > > > rcu_read_lock() is hiding.  CONFIG_PROVE_LOCKING might also be helpful.
> > >
> > > I'm not sure I see it in this stack trace.
> > >
> > > Is it possible that there's something taking the rcu read lock in an
> > > interrupt handler, then returning from the interrupt handler without
> > > releasing the rcu lock?  Do we have debugging that would fire if
> > > somebody did this?
> >
> > Lockdep should complain, but in the absence of lockdep I don't know
> > that anything would gripe in this situation.
> I think Lockdep should complain.
> Meanwhile, I examined the 5.12.17 by naked eye, and found a suspicious place
I examined 5.13.2 the unpaired rcu_read_lock is still there
> that could possibly trigger that problem:
>
> struct swap_info_struct *get_swap_device(swp_entry_t entry)
> {
>      struct swap_info_struct *si;
>      unsigned long offset;
>
>      if (!entry.val)
>              goto out;
>     si = swp_swap_info(entry);
>     if (!si)
>        goto bad_nofile;
>
>    rcu_read_lock();
>   if (data_race(!(si->flags & SWP_VALID)))
>      goto unlock_out;
>   offset = swp_offset(entry);
>   if (offset >= si->max)
>    goto unlock_out;
>
>   return si;
> bad_nofile:
>   pr_err("%s: %s%08lx\n", __func__, Bad_file, entry.val);
> out:
>   return NULL;
> unlock_out:
>   rcu_read_unlock();
>   return NULL;
> }
> I guess the function "return si" without a rcu_read_unlock.
>
> However the get_swap_device has changed in the mainline tree,
> there is no rcu_read_lock anymore.
>
> >
> > Also, this is a preemptible kernel, so it is possible to trace
> > __rcu_read_lock(), if that helps.
> >
> >                                                         Thanx, Paul
> Thanx
> Zhouyi

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-19  2:24           ` Zhouyi Zhou
  2021-07-19  2:27             ` Zhouyi Zhou
@ 2021-07-19  2:43             ` Matthew Wilcox
  2021-07-19  2:59               ` Zhouyi Zhou
  2021-07-19 10:14               ` Boqun Feng
  1 sibling, 2 replies; 34+ messages in thread
From: Matthew Wilcox @ 2021-07-19  2:43 UTC (permalink / raw)
  To: Zhouyi Zhou
  Cc: paulmck, Oleksandr Natalenko, linux-kernel, stable,
	Chris Clayton, Chris Rankin, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, rcu,
	Andrew Morton, Linux-MM

On Mon, Jul 19, 2021 at 10:24:18AM +0800, Zhouyi Zhou wrote:
> Meanwhile, I examined the 5.12.17 by naked eye, and found a suspicious place
> that could possibly trigger that problem:
> 
> struct swap_info_struct *get_swap_device(swp_entry_t entry)
> {
>      struct swap_info_struct *si;
>      unsigned long offset;
> 
>      if (!entry.val)
>              goto out;
>     si = swp_swap_info(entry);
>     if (!si)
>        goto bad_nofile;
> 
>    rcu_read_lock();
>   if (data_race(!(si->flags & SWP_VALID)))
>      goto unlock_out;
>   offset = swp_offset(entry);
>   if (offset >= si->max)
>    goto unlock_out;
> 
>   return si;
> bad_nofile:
>   pr_err("%s: %s%08lx\n", __func__, Bad_file, entry.val);
> out:
>   return NULL;
> unlock_out:
>   rcu_read_unlock();
>   return NULL;
> }
> I guess the function "return si" without a rcu_read_unlock.

Yes, but the caller is supposed to call put_swap_device() which
calls rcu_read_unlock().  See commit eb085574a752.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-19  2:43             ` Matthew Wilcox
@ 2021-07-19  2:59               ` Zhouyi Zhou
  2021-07-19 10:14               ` Boqun Feng
  1 sibling, 0 replies; 34+ messages in thread
From: Zhouyi Zhou @ 2021-07-19  2:59 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: paulmck, Oleksandr Natalenko, linux-kernel, stable,
	Chris Clayton, Chris Rankin, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, rcu,
	Andrew Morton, Linux-MM

On Mon, Jul 19, 2021 at 10:44 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Mon, Jul 19, 2021 at 10:24:18AM +0800, Zhouyi Zhou wrote:
> > Meanwhile, I examined the 5.12.17 by naked eye, and found a suspicious place
> > that could possibly trigger that problem:
> >
> > struct swap_info_struct *get_swap_device(swp_entry_t entry)
> > {
> >      struct swap_info_struct *si;
> >      unsigned long offset;
> >
> >      if (!entry.val)
> >              goto out;
> >     si = swp_swap_info(entry);
> >     if (!si)
> >        goto bad_nofile;
> >
> >    rcu_read_lock();
> >   if (data_race(!(si->flags & SWP_VALID)))
> >      goto unlock_out;
> >   offset = swp_offset(entry);
> >   if (offset >= si->max)
> >    goto unlock_out;
> >
> >   return si;
> > bad_nofile:
> >   pr_err("%s: %s%08lx\n", __func__, Bad_file, entry.val);
> > out:
> >   return NULL;
> > unlock_out:
> >   rcu_read_unlock();
> >   return NULL;
> > }
> > I guess the function "return si" without a rcu_read_unlock.
>
> Yes, but the caller is supposed to call put_swap_device() which
> calls rcu_read_unlock().  See commit eb085574a752.
I see, sorry for the mistake

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-19  2:43             ` Matthew Wilcox
  2021-07-19  2:59               ` Zhouyi Zhou
@ 2021-07-19 10:14               ` Boqun Feng
  2021-07-19 11:12                 ` Miaohe Lin
  1 sibling, 1 reply; 34+ messages in thread
From: Boqun Feng @ 2021-07-19 10:14 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Zhouyi Zhou, paulmck, Oleksandr Natalenko, linux-kernel, stable,
	Chris Clayton, Chris Rankin, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, rcu,
	Andrew Morton, Linux-MM, Miaohe Lin, Huang, Ying

On Mon, Jul 19, 2021 at 03:43:00AM +0100, Matthew Wilcox wrote:
> On Mon, Jul 19, 2021 at 10:24:18AM +0800, Zhouyi Zhou wrote:
> > Meanwhile, I examined the 5.12.17 by naked eye, and found a suspicious place
> > that could possibly trigger that problem:
> > 
> > struct swap_info_struct *get_swap_device(swp_entry_t entry)
> > {
> >      struct swap_info_struct *si;
> >      unsigned long offset;
> > 
> >      if (!entry.val)
> >              goto out;
> >     si = swp_swap_info(entry);
> >     if (!si)
> >        goto bad_nofile;
> > 
> >    rcu_read_lock();
> >   if (data_race(!(si->flags & SWP_VALID)))
> >      goto unlock_out;
> >   offset = swp_offset(entry);
> >   if (offset >= si->max)
> >    goto unlock_out;
> > 
> >   return si;
> > bad_nofile:
> >   pr_err("%s: %s%08lx\n", __func__, Bad_file, entry.val);
> > out:
> >   return NULL;
> > unlock_out:
> >   rcu_read_unlock();
> >   return NULL;
> > }
> > I guess the function "return si" without a rcu_read_unlock.
> 
> Yes, but the caller is supposed to call put_swap_device() which
> calls rcu_read_unlock().  See commit eb085574a752.

Right, but we need to make sure there is no sleepable function called
before put_swap_device() called, and the call trace showed the following
happened:

	do_swap_page():
	  si = get_swap_device():
	    rcu_read_lock();
	  lock_page_or_retry():
	    might_sleep(); // call a sleepable function inside RCU read-side c.s.
	    __lock_page_or_retry():
	      wait_on_page_bit_common():
	        schedule():
		  rcu_note_context_switch();
		  // Warn here
	  put_swap_device();
	    rcu_read_unlock();

, which introduced by commit 2799e77529c2a

[Copy the author]

Regards,
Boqun

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-19 10:14               ` Boqun Feng
@ 2021-07-19 11:12                 ` Miaohe Lin
  2021-07-19 11:17                   ` Oleksandr Natalenko
  2021-07-19 11:22                   ` Matthew Wilcox
  0 siblings, 2 replies; 34+ messages in thread
From: Miaohe Lin @ 2021-07-19 11:12 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Zhouyi Zhou, paulmck, Oleksandr Natalenko, linux-kernel, stable,
	Chris Clayton, Chris Rankin, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, rcu,
	Andrew Morton, Linux-MM, Huang, Ying, Matthew Wilcox

On 2021/7/19 18:14, Boqun Feng wrote:
> On Mon, Jul 19, 2021 at 03:43:00AM +0100, Matthew Wilcox wrote:
>> On Mon, Jul 19, 2021 at 10:24:18AM +0800, Zhouyi Zhou wrote:
>>> Meanwhile, I examined the 5.12.17 by naked eye, and found a suspicious place
>>> that could possibly trigger that problem:
>>>
>>> struct swap_info_struct *get_swap_device(swp_entry_t entry)
>>> {
>>>      struct swap_info_struct *si;
>>>      unsigned long offset;
>>>
>>>      if (!entry.val)
>>>              goto out;
>>>     si = swp_swap_info(entry);
>>>     if (!si)
>>>        goto bad_nofile;
>>>
>>>    rcu_read_lock();
>>>   if (data_race(!(si->flags & SWP_VALID)))
>>>      goto unlock_out;
>>>   offset = swp_offset(entry);
>>>   if (offset >= si->max)
>>>    goto unlock_out;
>>>
>>>   return si;
>>> bad_nofile:
>>>   pr_err("%s: %s%08lx\n", __func__, Bad_file, entry.val);
>>> out:
>>>   return NULL;
>>> unlock_out:
>>>   rcu_read_unlock();
>>>   return NULL;
>>> }
>>> I guess the function "return si" without a rcu_read_unlock.
>>
>> Yes, but the caller is supposed to call put_swap_device() which
>> calls rcu_read_unlock().  See commit eb085574a752.
> 
> Right, but we need to make sure there is no sleepable function called
> before put_swap_device() called, and the call trace showed the following
> happened:
> 
> 	do_swap_page():
> 	  si = get_swap_device():
> 	    rcu_read_lock();
> 	  lock_page_or_retry():
> 	    might_sleep(); // call a sleepable function inside RCU read-side c.s.
> 	    __lock_page_or_retry():
> 	      wait_on_page_bit_common():
> 	        schedule():
> 		  rcu_note_context_switch();
> 		  // Warn here
> 	  put_swap_device();
> 	    rcu_read_unlock();
> 
> , which introduced by commit 2799e77529c2a

When in the commit 2799e77529c2a, we're using the percpu_ref to serialize against
concurrent swapoff, i.e. there's percpu_ref inside get_swap_device() instead of
rcu_read_lock(). Please see commit 63d8620ecf93 ("mm/swapfile: use percpu_ref to
serialize against concurrent swapoff") for detail.

Thanks.

> 
> [Copy the author]
> 
> Regards,
> Boqun
> 
> .
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-19 11:12                 ` Miaohe Lin
@ 2021-07-19 11:17                   ` Oleksandr Natalenko
  2021-07-19 11:22                   ` Matthew Wilcox
  1 sibling, 0 replies; 34+ messages in thread
From: Oleksandr Natalenko @ 2021-07-19 11:17 UTC (permalink / raw)
  To: Boqun Feng, Miaohe Lin
  Cc: Zhouyi Zhou, paulmck, linux-kernel, stable, Chris Clayton,
	Chris Rankin, Josh Triplett, Steven Rostedt, Mathieu Desnoyers,
	Lai Jiangshan, Joel Fernandes, rcu, Andrew Morton, Linux-MM,
	Huang, Ying, Matthew Wilcox

Hello.

On pondělí 19. července 2021 13:12:58 CEST Miaohe Lin wrote:
> On 2021/7/19 18:14, Boqun Feng wrote:
> > On Mon, Jul 19, 2021 at 03:43:00AM +0100, Matthew Wilcox wrote:
> >> On Mon, Jul 19, 2021 at 10:24:18AM +0800, Zhouyi Zhou wrote:
> >>> Meanwhile, I examined the 5.12.17 by naked eye, and found a suspicious
> >>> place that could possibly trigger that problem:
> >>> 
> >>> struct swap_info_struct *get_swap_device(swp_entry_t entry)
> >>> {
> >>> 
> >>>      struct swap_info_struct *si;
> >>>      unsigned long offset;
> >>>      
> >>>      if (!entry.val)
> >>>      
> >>>              goto out;
> >>>     
> >>>     si = swp_swap_info(entry);
> >>>     if (!si)
> >>>     
> >>>        goto bad_nofile;
> >>>    
> >>>    rcu_read_lock();
> >>>   
> >>>   if (data_race(!(si->flags & SWP_VALID)))
> >>>   
> >>>      goto unlock_out;
> >>>   
> >>>   offset = swp_offset(entry);
> >>>   if (offset >= si->max)
> >>>   
> >>>    goto unlock_out;
> >>>   
> >>>   return si;
> >>> 
> >>> bad_nofile:
> >>>   pr_err("%s: %s%08lx\n", __func__, Bad_file, entry.val);
> >>> 
> >>> out:
> >>>   return NULL;
> >>> 
> >>> unlock_out:
> >>>   rcu_read_unlock();
> >>>   return NULL;
> >>> 
> >>> }
> >>> I guess the function "return si" without a rcu_read_unlock.
> >> 
> >> Yes, but the caller is supposed to call put_swap_device() which
> >> calls rcu_read_unlock().  See commit eb085574a752.
> > 
> > Right, but we need to make sure there is no sleepable function called
> > before put_swap_device() called, and the call trace showed the following
> > 
> > happened:
> > 	do_swap_page():
> > 	  si = get_swap_device():
> > 	    rcu_read_lock();
> > 	  
> > 	  lock_page_or_retry():
> > 	    might_sleep(); // call a sleepable function inside RCU read-side 
c.s.
> > 	    
> > 	    __lock_page_or_retry():
> > 	      wait_on_page_bit_common():
> > 	        schedule():
> > 		  rcu_note_context_switch();
> > 		  // Warn here
> > 	  
> > 	  put_swap_device();
> > 	  
> > 	    rcu_read_unlock();
> > 
> > , which introduced by commit 2799e77529c2a
> 
> When in the commit 2799e77529c2a, we're using the percpu_ref to serialize
> against concurrent swapoff, i.e. there's percpu_ref inside
> get_swap_device() instead of rcu_read_lock(). Please see commit
> 63d8620ecf93 ("mm/swapfile: use percpu_ref to serialize against concurrent
> swapoff") for detail.

The problem here is that 2799e77529c2a got pulled into stable, but 
63d8620ecf93 was not pulled. Are you suggesting that 63d8620ecf93 should be 
pulled into the stable kernel as well?

Thanks.

-- 
Oleksandr Natalenko (post-factum)



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-19 11:12                 ` Miaohe Lin
  2021-07-19 11:17                   ` Oleksandr Natalenko
@ 2021-07-19 11:22                   ` Matthew Wilcox
  2021-07-19 11:50                     ` Miaohe Lin
  1 sibling, 1 reply; 34+ messages in thread
From: Matthew Wilcox @ 2021-07-19 11:22 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: Boqun Feng, Zhouyi Zhou, paulmck, Oleksandr Natalenko,
	linux-kernel, stable, Chris Clayton, Chris Rankin, Josh Triplett,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes,
	rcu, Andrew Morton, Linux-MM, Huang, Ying, gregkh

On Mon, Jul 19, 2021 at 07:12:58PM +0800, Miaohe Lin wrote:
> When in the commit 2799e77529c2a, we're using the percpu_ref to serialize against
> concurrent swapoff, i.e. there's percpu_ref inside get_swap_device() instead of
> rcu_read_lock(). Please see commit 63d8620ecf93 ("mm/swapfile: use percpu_ref to
> serialize against concurrent swapoff") for detail.

Oh, so this is a backport problem.  2799e77529c2 was backported without
its prerequisite 63d8620ecf93.  Greg, probably best to just drop
2799e77529c2 from all stable trees; the race described is not very
important (swapoff vs reading a page back from that swap device).

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-19 11:22                   ` Matthew Wilcox
@ 2021-07-19 11:50                     ` Miaohe Lin
  2021-07-19 11:59                       ` Oleksandr Natalenko
  0 siblings, 1 reply; 34+ messages in thread
From: Miaohe Lin @ 2021-07-19 11:50 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Boqun Feng, Zhouyi Zhou, paulmck, Oleksandr Natalenko,
	linux-kernel, stable, Chris Clayton, Chris Rankin, Josh Triplett,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes,
	rcu, Andrew Morton, Linux-MM, Huang, Ying, gregkh

On 2021/7/19 19:22, Matthew Wilcox wrote:
> On Mon, Jul 19, 2021 at 07:12:58PM +0800, Miaohe Lin wrote:
>> When in the commit 2799e77529c2a, we're using the percpu_ref to serialize against
>> concurrent swapoff, i.e. there's percpu_ref inside get_swap_device() instead of
>> rcu_read_lock(). Please see commit 63d8620ecf93 ("mm/swapfile: use percpu_ref to
>> serialize against concurrent swapoff") for detail.
> 
> Oh, so this is a backport problem.  2799e77529c2 was backported without
> its prerequisite 63d8620ecf93.  Greg, probably best to just drop

Yes, they're posted as a patch set:

https://lkml.kernel.org/r/20210426123316.806267-1-linmiaohe@huawei.com

> 2799e77529c2 from all stable trees; the race described is not very
> important (swapoff vs reading a page back from that swap device).
> .
> 

The swapoff races with reading a page back from that swap device should be really
uncommon as most users only do swapoff when the system is going to shutdown.

Sorry for the trouble!

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-19 11:50                     ` Miaohe Lin
@ 2021-07-19 11:59                       ` Oleksandr Natalenko
  2021-07-19 12:08                         ` Miaohe Lin
  0 siblings, 1 reply; 34+ messages in thread
From: Oleksandr Natalenko @ 2021-07-19 11:59 UTC (permalink / raw)
  To: Matthew Wilcox, Miaohe Lin
  Cc: Boqun Feng, Zhouyi Zhou, paulmck, linux-kernel, stable,
	Chris Clayton, Chris Rankin, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, rcu,
	Andrew Morton, Linux-MM, Huang, Ying, gregkh

On pondělí 19. července 2021 13:50:07 CEST Miaohe Lin wrote:
> On 2021/7/19 19:22, Matthew Wilcox wrote:
> > On Mon, Jul 19, 2021 at 07:12:58PM +0800, Miaohe Lin wrote:
> >> When in the commit 2799e77529c2a, we're using the percpu_ref to serialize
> >> against concurrent swapoff, i.e. there's percpu_ref inside
> >> get_swap_device() instead of rcu_read_lock(). Please see commit
> >> 63d8620ecf93 ("mm/swapfile: use percpu_ref to serialize against
> >> concurrent swapoff") for detail.
> > 
> > Oh, so this is a backport problem.  2799e77529c2 was backported without
> > its prerequisite 63d8620ecf93.  Greg, probably best to just drop
> 
> Yes, they're posted as a patch set:
> 
> https://lkml.kernel.org/r/20210426123316.806267-1-linmiaohe@huawei.com
> 
> > 2799e77529c2 from all stable trees; the race described is not very
> > important (swapoff vs reading a page back from that swap device).
> > .
> 
> The swapoff races with reading a page back from that swap device should be
> really uncommon as most users only do swapoff when the system is going to
> shutdown.
> 
> Sorry for the trouble!

git log --oneline v5.13..v5.13.3 --author="Miaohe Lin"
11ebc09e50dc mm/zswap.c: fix two bugs in zswap_writeback_entry()
95d192da198d mm/z3fold: use release_z3fold_page_locked() to release locked 
z3fold page
ccb7848e2344 mm/z3fold: fix potential memory leak in z3fold_destroy_pool()
9f7229c901c1 mm/huge_memory.c: don't discard hugepage if other processes are 
mapping it
f13259175e4f mm/huge_memory.c: add missing read-only THP checking in 
transparent_hugepage_enabled()
afafd371e7de mm/huge_memory.c: remove dedicated macro HPAGE_CACHE_INDEX_MASK
a533a21b692f mm/shmem: fix shmem_swapin() race with swapoff
c3b39134bbd0 swap: fix do_swap_page() race with swapoff

Do you suggest reverting "mm/shmem: fix shmem_swapin() race with swapoff" as 
well?

-- 
Oleksandr Natalenko (post-factum)



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-19 11:59                       ` Oleksandr Natalenko
@ 2021-07-19 12:08                         ` Miaohe Lin
  2021-07-19 12:12                           ` Oleksandr Natalenko
  0 siblings, 1 reply; 34+ messages in thread
From: Miaohe Lin @ 2021-07-19 12:08 UTC (permalink / raw)
  To: Oleksandr Natalenko, Matthew Wilcox
  Cc: Boqun Feng, Zhouyi Zhou, paulmck, linux-kernel, stable,
	Chris Clayton, Chris Rankin, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, rcu,
	Andrew Morton, Linux-MM, Huang, Ying, gregkh

On 2021/7/19 19:59, Oleksandr Natalenko wrote:
> On pondělí 19. července 2021 13:50:07 CEST Miaohe Lin wrote:
>> On 2021/7/19 19:22, Matthew Wilcox wrote:
>>> On Mon, Jul 19, 2021 at 07:12:58PM +0800, Miaohe Lin wrote:
>>>> When in the commit 2799e77529c2a, we're using the percpu_ref to serialize
>>>> against concurrent swapoff, i.e. there's percpu_ref inside
>>>> get_swap_device() instead of rcu_read_lock(). Please see commit
>>>> 63d8620ecf93 ("mm/swapfile: use percpu_ref to serialize against
>>>> concurrent swapoff") for detail.
>>>
>>> Oh, so this is a backport problem.  2799e77529c2 was backported without
>>> its prerequisite 63d8620ecf93.  Greg, probably best to just drop
>>
>> Yes, they're posted as a patch set:
>>
>> https://lkml.kernel.org/r/20210426123316.806267-1-linmiaohe@huawei.com
>>
>>> 2799e77529c2 from all stable trees; the race described is not very
>>> important (swapoff vs reading a page back from that swap device).
>>> .
>>
>> The swapoff races with reading a page back from that swap device should be
>> really uncommon as most users only do swapoff when the system is going to
>> shutdown.
>>
>> Sorry for the trouble!
> 
> git log --oneline v5.13..v5.13.3 --author="Miaohe Lin"
> 11ebc09e50dc mm/zswap.c: fix two bugs in zswap_writeback_entry()
> 95d192da198d mm/z3fold: use release_z3fold_page_locked() to release locked 
> z3fold page
> ccb7848e2344 mm/z3fold: fix potential memory leak in z3fold_destroy_pool()
> 9f7229c901c1 mm/huge_memory.c: don't discard hugepage if other processes are 
> mapping it
> f13259175e4f mm/huge_memory.c: add missing read-only THP checking in 
> transparent_hugepage_enabled()
> afafd371e7de mm/huge_memory.c: remove dedicated macro HPAGE_CACHE_INDEX_MASK
> a533a21b692f mm/shmem: fix shmem_swapin() race with swapoff
> c3b39134bbd0 swap: fix do_swap_page() race with swapoff
> 
> Do you suggest reverting "mm/shmem: fix shmem_swapin() race with swapoff" as 
> well?
This patch also rely on its prerequisite 63d8620ecf93. I think we should either revert
any commit in this series or just backport the entire series.

Many thanks.

> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-19 12:08                         ` Miaohe Lin
@ 2021-07-19 12:12                           ` Oleksandr Natalenko
  2021-07-19 12:16                             ` Matthew Wilcox
  2021-07-19 12:17                             ` Miaohe Lin
  0 siblings, 2 replies; 34+ messages in thread
From: Oleksandr Natalenko @ 2021-07-19 12:12 UTC (permalink / raw)
  To: Matthew Wilcox, Miaohe Lin
  Cc: Boqun Feng, Zhouyi Zhou, paulmck, linux-kernel, stable,
	Chris Clayton, Chris Rankin, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, rcu,
	Andrew Morton, Linux-MM, Huang, Ying, gregkh

On pondělí 19. července 2021 14:08:37 CEST Miaohe Lin wrote:
> On 2021/7/19 19:59, Oleksandr Natalenko wrote:
> > On pondělí 19. července 2021 13:50:07 CEST Miaohe Lin wrote:
> >> On 2021/7/19 19:22, Matthew Wilcox wrote:
> >>> On Mon, Jul 19, 2021 at 07:12:58PM +0800, Miaohe Lin wrote:
> >>>> When in the commit 2799e77529c2a, we're using the percpu_ref to
> >>>> serialize
> >>>> against concurrent swapoff, i.e. there's percpu_ref inside
> >>>> get_swap_device() instead of rcu_read_lock(). Please see commit
> >>>> 63d8620ecf93 ("mm/swapfile: use percpu_ref to serialize against
> >>>> concurrent swapoff") for detail.
> >>> 
> >>> Oh, so this is a backport problem.  2799e77529c2 was backported without
> >>> its prerequisite 63d8620ecf93.  Greg, probably best to just drop
> >> 
> >> Yes, they're posted as a patch set:
> >> 
> >> https://lkml.kernel.org/r/20210426123316.806267-1-linmiaohe@huawei.com
> >> 
> >>> 2799e77529c2 from all stable trees; the race described is not very
> >>> important (swapoff vs reading a page back from that swap device).
> >>> .
> >> 
> >> The swapoff races with reading a page back from that swap device should
> >> be
> >> really uncommon as most users only do swapoff when the system is going to
> >> shutdown.
> >> 
> >> Sorry for the trouble!
> > 
> > git log --oneline v5.13..v5.13.3 --author="Miaohe Lin"
> > 11ebc09e50dc mm/zswap.c: fix two bugs in zswap_writeback_entry()
> > 95d192da198d mm/z3fold: use release_z3fold_page_locked() to release locked
> > z3fold page
> > ccb7848e2344 mm/z3fold: fix potential memory leak in z3fold_destroy_pool()
> > 9f7229c901c1 mm/huge_memory.c: don't discard hugepage if other processes
> > are mapping it
> > f13259175e4f mm/huge_memory.c: add missing read-only THP checking in
> > transparent_hugepage_enabled()
> > afafd371e7de mm/huge_memory.c: remove dedicated macro
> > HPAGE_CACHE_INDEX_MASK a533a21b692f mm/shmem: fix shmem_swapin() race
> > with swapoff
> > c3b39134bbd0 swap: fix do_swap_page() race with swapoff
> > 
> > Do you suggest reverting "mm/shmem: fix shmem_swapin() race with swapoff"
> > as well?
> 
> This patch also rely on its prerequisite 63d8620ecf93. I think we should
> either revert any commit in this series or just backport the entire series.

Then why not just pick up 2 more patches instead of dropping 2 patches. Greg, 
could you please make sure the whole series from [1] gets pulled?

Thanks.

[1] https://lkml.kernel.org/r/20210426123316.806267-1-linmiaohe@huawei.com

-- 
Oleksandr Natalenko (post-factum)



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-19 12:12                           ` Oleksandr Natalenko
@ 2021-07-19 12:16                             ` Matthew Wilcox
  2021-07-19 12:23                               ` Oleksandr Natalenko
  2021-07-19 12:17                             ` Miaohe Lin
  1 sibling, 1 reply; 34+ messages in thread
From: Matthew Wilcox @ 2021-07-19 12:16 UTC (permalink / raw)
  To: Oleksandr Natalenko
  Cc: Miaohe Lin, Boqun Feng, Zhouyi Zhou, paulmck, linux-kernel,
	stable, Chris Clayton, Chris Rankin, Josh Triplett,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes,
	rcu, Andrew Morton, Linux-MM, Huang, Ying, gregkh

On Mon, Jul 19, 2021 at 02:12:15PM +0200, Oleksandr Natalenko wrote:
> On pondělí 19. července 2021 14:08:37 CEST Miaohe Lin wrote:
> > On 2021/7/19 19:59, Oleksandr Natalenko wrote:
> > > On pondělí 19. července 2021 13:50:07 CEST Miaohe Lin wrote:
> > >> On 2021/7/19 19:22, Matthew Wilcox wrote:
> > >>> On Mon, Jul 19, 2021 at 07:12:58PM +0800, Miaohe Lin wrote:
> > >>>> When in the commit 2799e77529c2a, we're using the percpu_ref to
> > >>>> serialize
> > >>>> against concurrent swapoff, i.e. there's percpu_ref inside
> > >>>> get_swap_device() instead of rcu_read_lock(). Please see commit
> > >>>> 63d8620ecf93 ("mm/swapfile: use percpu_ref to serialize against
> > >>>> concurrent swapoff") for detail.
> > >>> 
> > >>> Oh, so this is a backport problem.  2799e77529c2 was backported without
> > >>> its prerequisite 63d8620ecf93.  Greg, probably best to just drop
> > >> 
> > >> Yes, they're posted as a patch set:
> > >> 
> > >> https://lkml.kernel.org/r/20210426123316.806267-1-linmiaohe@huawei.com
> > >> 
> > >>> 2799e77529c2 from all stable trees; the race described is not very
> > >>> important (swapoff vs reading a page back from that swap device).
> > >>> .
> > >> 
> > >> The swapoff races with reading a page back from that swap device should
> > >> be
> > >> really uncommon as most users only do swapoff when the system is going to
> > >> shutdown.
> > >> 
> > >> Sorry for the trouble!
> > > 
> > > git log --oneline v5.13..v5.13.3 --author="Miaohe Lin"
> > > 11ebc09e50dc mm/zswap.c: fix two bugs in zswap_writeback_entry()
> > > 95d192da198d mm/z3fold: use release_z3fold_page_locked() to release locked
> > > z3fold page
> > > ccb7848e2344 mm/z3fold: fix potential memory leak in z3fold_destroy_pool()
> > > 9f7229c901c1 mm/huge_memory.c: don't discard hugepage if other processes
> > > are mapping it
> > > f13259175e4f mm/huge_memory.c: add missing read-only THP checking in
> > > transparent_hugepage_enabled()
> > > afafd371e7de mm/huge_memory.c: remove dedicated macro
> > > HPAGE_CACHE_INDEX_MASK a533a21b692f mm/shmem: fix shmem_swapin() race
> > > with swapoff
> > > c3b39134bbd0 swap: fix do_swap_page() race with swapoff
> > > 
> > > Do you suggest reverting "mm/shmem: fix shmem_swapin() race with swapoff"
> > > as well?
> > 
> > This patch also rely on its prerequisite 63d8620ecf93. I think we should
> > either revert any commit in this series or just backport the entire series.
> 
> Then why not just pick up 2 more patches instead of dropping 2 patches. Greg, 
> could you please make sure the whole series from [1] gets pulled?

Because none of these patches should have been backported in the first
place.  It's just not worth the destabilisation.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-19 12:16                             ` Matthew Wilcox
@ 2021-07-19 12:23                               ` Oleksandr Natalenko
  2021-07-19 16:47                                 ` Zhouyi Zhou
  0 siblings, 1 reply; 34+ messages in thread
From: Oleksandr Natalenko @ 2021-07-19 12:23 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Miaohe Lin, Boqun Feng, Zhouyi Zhou, paulmck, linux-kernel,
	stable, Chris Clayton, Chris Rankin, Josh Triplett,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes,
	rcu, Andrew Morton, Linux-MM, Huang, Ying, gregkh

On pondělí 19. července 2021 14:16:04 CEST Matthew Wilcox wrote:
> On Mon, Jul 19, 2021 at 02:12:15PM +0200, Oleksandr Natalenko wrote:
> > On pondělí 19. července 2021 14:08:37 CEST Miaohe Lin wrote:
> > > On 2021/7/19 19:59, Oleksandr Natalenko wrote:
> > > > On pondělí 19. července 2021 13:50:07 CEST Miaohe Lin wrote:
> > > >> On 2021/7/19 19:22, Matthew Wilcox wrote:
> > > >>> On Mon, Jul 19, 2021 at 07:12:58PM +0800, Miaohe Lin wrote:
> > > >>>> When in the commit 2799e77529c2a, we're using the percpu_ref to
> > > >>>> serialize
> > > >>>> against concurrent swapoff, i.e. there's percpu_ref inside
> > > >>>> get_swap_device() instead of rcu_read_lock(). Please see commit
> > > >>>> 63d8620ecf93 ("mm/swapfile: use percpu_ref to serialize against
> > > >>>> concurrent swapoff") for detail.
> > > >>> 
> > > >>> Oh, so this is a backport problem.  2799e77529c2 was backported
> > > >>> without
> > > >>> its prerequisite 63d8620ecf93.  Greg, probably best to just drop
> > > >> 
> > > >> Yes, they're posted as a patch set:
> > > >> 
> > > >> https://lkml.kernel.org/r/20210426123316.806267-1-linmiaohe@huawei.co
> > > >> m
> > > >> 
> > > >>> 2799e77529c2 from all stable trees; the race described is not very
> > > >>> important (swapoff vs reading a page back from that swap device).
> > > >>> .
> > > >> 
> > > >> The swapoff races with reading a page back from that swap device
> > > >> should
> > > >> be
> > > >> really uncommon as most users only do swapoff when the system is
> > > >> going to
> > > >> shutdown.
> > > >> 
> > > >> Sorry for the trouble!
> > > > 
> > > > git log --oneline v5.13..v5.13.3 --author="Miaohe Lin"
> > > > 11ebc09e50dc mm/zswap.c: fix two bugs in zswap_writeback_entry()
> > > > 95d192da198d mm/z3fold: use release_z3fold_page_locked() to release
> > > > locked
> > > > z3fold page
> > > > ccb7848e2344 mm/z3fold: fix potential memory leak in
> > > > z3fold_destroy_pool()
> > > > 9f7229c901c1 mm/huge_memory.c: don't discard hugepage if other
> > > > processes
> > > > are mapping it
> > > > f13259175e4f mm/huge_memory.c: add missing read-only THP checking in
> > > > transparent_hugepage_enabled()
> > > > afafd371e7de mm/huge_memory.c: remove dedicated macro
> > > > HPAGE_CACHE_INDEX_MASK a533a21b692f mm/shmem: fix shmem_swapin() race
> > > > with swapoff
> > > > c3b39134bbd0 swap: fix do_swap_page() race with swapoff
> > > > 
> > > > Do you suggest reverting "mm/shmem: fix shmem_swapin() race with
> > > > swapoff"
> > > > as well?
> > > 
> > > This patch also rely on its prerequisite 63d8620ecf93. I think we should
> > > either revert any commit in this series or just backport the entire
> > > series.
> > 
> > Then why not just pick up 2 more patches instead of dropping 2 patches.
> > Greg, could you please make sure the whole series from [1] gets pulled?
> 
> Because none of these patches should have been backported in the first
> place.  It's just not worth the destabilisation.

What about the rest then?

git log --oneline v5.13..v5.13.3 -- mm/ | wc -l
18

Those look to be fixes, these ones too.

-- 
Oleksandr Natalenko (post-factum)



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-19 12:23                               ` Oleksandr Natalenko
@ 2021-07-19 16:47                                 ` Zhouyi Zhou
       [not found]                                   ` <CAABZP2yh3J8+P=3PLZVaC47ymKC7PcfQCBBxjXJ9Ybn+HREbdg@mail.gmail.com>
  0 siblings, 1 reply; 34+ messages in thread
From: Zhouyi Zhou @ 2021-07-19 16:47 UTC (permalink / raw)
  To: Oleksandr Natalenko
  Cc: Matthew Wilcox, Miaohe Lin, Boqun Feng, paulmck, linux-kernel,
	stable, Chris Clayton, Chris Rankin, Josh Triplett,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes,
	rcu, Andrew Morton, Linux-MM, Huang, Ying, Greg KH

I downloaded linux-5.13.2, configure and compile the kernel with
CONFIG_LOCKDEP=y
CONFIG_LOCK_DEBUGGING_SUPPORT=y
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_LOCK_ALLOC=y
install the kernel on a qemu box, then run following C Program
#include <stdlib.h>
int main()
{
        while(1)
        {
                malloc(1024*1024);
        }
        return 0;
}
And following is the dmesg:
[   96.155017] ------------[ cut here ]------------
[   96.155030] WARNING: CPU: 10 PID: 770 at
kernel/rcu/tree_plugin.h:359 rcu_note_context_switch+0x91/0x610
[   96.155074] Modules linked in: ppdev intel_rapl_msr
intel_rapl_common crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
bochs_drm aesni_intel drm_vram_helper evdev crypto_simd drm_ttm_helper
snd_pcm cryptd ttm snd_timer parport_pc serio_raw drm_kms_helper sg
snd parport soundcore drm pcspkr button ip_tables autofs4 psmouse
sr_mod i2c_piix4 sd_mod crc32c_intel t10_pi cdrom i2c_core e1000
ata_generic floppy
[   96.155180] CPU: 10 PID: 770 Comm: containerd Not tainted 5.13.2 #1
[   96.155185] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
[   96.155189] RIP: 0010:rcu_note_context_switch+0x91/0x610
[   96.155195] Code: ff 74 0f 65 8b 05 a7 10 e5 7e 85 c0 0f 84 22 01
00 00 45 84 ed 75 15 65 48 8b 04 25 80 7f 01 00 8b b0 44 03 00 00 85
f6 7e 02 <0f> 0b 65 48 8b 04 25 80 7f 01 00 8b 88 44 03 00 00 85 c9 7e
0f 41
[   96.155200] RSP: 0000:ffffc90000a8bc50 EFLAGS: 00010002
[   96.155204] RAX: ffff88810830c300 RBX: ffff88813bcae680 RCX: 0000000000000000
[   96.155208] RDX: 0000000000000002 RSI: 0000000000000001 RDI: 0000000000000001
[   96.155210] RBP: ffffc90000a8bcd0 R08: 0000000000000001 R09: 0000000000000001
[   96.155213] R10: 0000000000000000 R11: ffffffff81319b93 R12: ffff88810830c300
[   96.155216] R13: 0000000000000000 R14: ffff88813bcad958 R15: 0000000000004970
[   96.155220] FS:  00007fc09cff9700(0000) GS:ffff88813bc80000(0000)
knlGS:0000000000000000
[   96.155223] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   96.155226] CR2: 00005594d2ff0530 CR3: 0000000103be4001 CR4: 0000000000060ee0
[   96.155233] Call Trace:
[   96.155243]  __schedule+0xda/0xa30
[   96.155283]  schedule+0x46/0xf0
[   96.155289]  io_schedule+0x12/0x40
[   96.155295]  __lock_page_or_retry+0x1f9/0x510
[   96.155319]  ? __page_cache_alloc+0x140/0x140
[   96.155345]  do_swap_page+0x33f/0x930
[   96.155364]  __handle_mm_fault+0xa54/0x1550
[   96.155390]  handle_mm_fault+0x17f/0x420
[   96.155400]  do_user_addr_fault+0x1be/0x770
[   96.155422]  exc_page_fault+0x69/0x280
[   96.155435]  ? asm_exc_page_fault+0x8/0x30
[   96.155443]  asm_exc_page_fault+0x1e/0x30
[   96.155448] RIP: 0033:0x5594d15ec98f
[   96.155454] Code: 44 24 58 48 85 c0 48 b9 00 e4 0b 54 02 00 00 00
48 0f 44 c1 48 89 44 24 58 eb 05 48 8b 44 24 58 48 89 04 24 e8 e2 c9
9c ff 90 <48> 8b 05 9a 3b a0 01 48 8b 4c 24 50 48 89 0c 24 48 8d 15 9a
c4 ec
[   96.155457] RSP: 002b:000000c0001f7f80 EFLAGS: 00010206
[   96.155462] RAX: 0000000000000000 RBX: 00005594d0f9cf55 RCX: fffffffffffffff8
[   96.155465] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00005594d0fa2d01
[   96.155467] RBP: 000000c0001f7fc0 R08: 0000000000000000 R09: 0000000000000000
[   96.155470] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000004
[   96.155472] R13: 0000000000000013 R14: 00005594d1ee96e6 R15: 0000000000000039
[   96.155495] irq event stamp: 10464
[   96.155497] hardirqs last  enabled at (10463): [<ffffffff81c0a574>]
_raw_spin_unlock_irq+0x24/0x50
[   96.155508] hardirqs last disabled at (10464): [<ffffffff81c01882>]
__schedule+0x412/0xa30
[   96.155512] softirqs last  enabled at (8668): [<ffffffff82000401>]
__do_softirq+0x401/0x51b
[   96.155517] softirqs last disabled at (8657): [<ffffffff81129c22>]
irq_exit_rcu+0x142/0x150
[   96.155531] ---[ end trace 165ff31fd86ffc12 ]---

[   96.177669] =============================
[   96.177693] [ BUG: Invalid wait context ]
[   96.177717] 5.13.2 #1 Tainted: G        W
[   96.177743] -----------------------------
[   96.177765] containerd/770 is trying to lock:
[   96.177790] ffff88813ba69b30 (&cache->alloc_lock){+.+.}-{3:3}, at:
get_swap_page+0x126/0x200
[   96.177867] other info that might help us debug this:
[   96.177894] context-{4:4}
[   96.177910] 3 locks held by containerd/770:
[   96.177934]  #0: ffff88810815ea28 (&mm->mmap_lock#2){++++}-{3:3},
at: do_user_addr_fault+0x115/0x770
[   96.177999]  #1: ffffffff82915020 (rcu_read_lock){....}-{1:2}, at:
get_swap_device+0x33/0x140
[   96.178057]  #2: ffffffff82955ba0 (fs_reclaim){+.+.}-{0:0}, at:
__fs_reclaim_acquire+0x5/0x30
[   96.178115] stack backtrace:
[   96.178133] CPU: 1 PID: 770 Comm: containerd Tainted: G        W
     5.13.2 #1
[   96.178183] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
[   96.178254] Call Trace:
[   96.178274]  dump_stack+0x82/0xa4
[   96.178314]  __lock_acquire+0x9a5/0x20a0
[   96.178347]  ? find_held_lock+0x3a/0xb0
[   96.178381]  lock_acquire+0xe9/0x320
[   96.178405]  ? get_swap_page+0x126/0x200
[   96.178433]  ? _raw_spin_unlock+0x29/0x40
[   96.178461]  ? page_vma_mapped_walk+0x3a3/0x960
[   96.178491]  __mutex_lock+0x99/0x980
[   96.178515]  ? get_swap_page+0x126/0x200
[   96.178541]  ? find_held_lock+0x3a/0xb0
[   96.178577]  ? get_swap_page+0x126/0x200
[   96.178603]  ? page_referenced+0xd5/0x170
[   96.178637]  ? lock_release+0x1b4/0x300
[   96.178663]  ? get_swap_page+0x126/0x200
[   96.178698]  get_swap_page+0x126/0x200
[   96.178725]  add_to_swap+0x14/0x60
[   96.178749]  shrink_page_list+0xb13/0xe70
[   96.178787]  shrink_inactive_list+0x243/0x550
[   96.178819]  shrink_lruvec+0x4fd/0x780
[   96.178849]  ? shrink_node+0x257/0x7c0
[   96.178873]  shrink_node+0x257/0x7c0
[   96.178900]  do_try_to_free_pages+0xdd/0x410
[   96.178929]  try_to_free_pages+0x110/0x300
[   96.178966]  __alloc_pages_slowpath.constprop.126+0x2ae/0xfa0
[   96.179002]  ? lock_release+0x1b4/0x300
[   96.179028]  __alloc_pages+0x37d/0x400
[   96.179054]  alloc_pages_vma+0x73/0x1d0
[   96.179878]  __read_swap_cache_async+0xb8/0x280
[   96.180684]  swap_cluster_readahead+0x194/0x270
[   96.181459]  ? swapin_readahead+0x62/0x530
[   96.182008]  swapin_readahead+0x62/0x530
[   96.182558]  ? find_held_lock+0x3a/0xb0
[   96.183109]  ? lookup_swap_cache+0x5c/0x1c0
[   96.183657]  ? lock_release+0x1b4/0x300
[   96.184207]  ? do_swap_page+0x232/0x930
[   96.184753]  do_swap_page+0x232/0x930
[   96.185326]  __handle_mm_fault+0xa54/0x1550
[   96.185850]  handle_mm_fault+0x17f/0x420
[   96.186361]  do_user_addr_fault+0x1be/0x770
[   96.186880]  exc_page_fault+0x69/0x280
[   96.187382]  ? asm_exc_page_fault+0x8/0x30
[   96.187879]  asm_exc_page_fault+0x1e/0x30
[   96.188363] RIP: 0033:0x5594d0f78da4
[   96.188829] Code: cc cc cc cc cc cc 48 8b 0d 39 d2 01 02 64 48 8b
09 48 3b 61 10 76 3d 48 83 ec 28 48 89 6c 24 20 48 8d 6c 24 20 48 8b
44 24 30 <48> 8b 08 48 89 0c 24 48 89 44 24 08 c6 44 24 10 01 e8 76 f4
ff ff
[   96.189894] RSP: 002b:000000c0001f7de8 EFLAGS: 00010216
[   96.190406] RAX: 00005594d2308160 RBX: 0000000000000000 RCX: 000000c0004ea480
[   96.190902] RDX: 000000c0002a4270 RSI: 0000000000000010 RDI: 0000000000000011
[   96.191394] RBP: 000000c0001f7e08 R08: 0000000000000002 R09: 0000000000000011
[   96.191891] R10: 00005594d22f6ce0 R11: 00005594d1ee96e4 R12: ffffffffffffffff
[   96.192396] R13: 0000000000000028 R14: 0000000000000027 R15: 0000000000000200
[  115.344546] exaust invoked oom-killer:
gfp_mask=0x100dca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0,
oom_score_adj=0
[  115.346019] CPU: 1 PID: 969 Comm: exaust Tainted: G        W
 5.13.2 #1
[  115.346569] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
[  115.347712] Call Trace:
[  115.348287]  dump_stack+0x82/0xa4
[  115.348867]  dump_header+0x55/0x3f0
[  115.349491]  oom_kill_process+0x160/0x210
[  115.350068]  out_of_memory+0x10b/0x630
[  115.350646]  __alloc_pages_slowpath.constprop.126+0xec7/0xfa0
[  115.351255]  __alloc_pages+0x37d/0x400
[  115.351861]  alloc_pages_vma+0x73/0x1d0
[  115.352449]  __handle_mm_fault+0xe1b/0x1550
[  115.353042]  handle_mm_fault+0x17f/0x420
[  115.353646]  do_user_addr_fault+0x1be/0x770
[  115.354239]  exc_page_fault+0x69/0x280
[  115.354833]  ? asm_exc_page_fault+0x8/0x30
[  115.355428]  asm_exc_page_fault+0x1e/0x30
[  115.356021] RIP: 0033:0x7fe8ee633543
[  115.356617] Code: Unable to access opcode bytes at RIP 0x7fe8ee633519.
[  115.357245] RSP: 002b:00007fff742c68c8 EFLAGS: 00010206
[  115.357857] RAX: 00007fe798041010 RBX: 0000000000000000 RCX: 00007fe7980fd000
[  115.358469] RDX: 00007fe798141000 RSI: 0000000000000000 RDI: 00007fe798041010
[  115.359077] RBP: 00007fff742c68e0 R08: 00000000ffffffff R09: 0000000000000000
[  115.359686] R10: 0000000000000022 R11: 0000000000000246 R12: 0000561abe3fa060
[  115.360300] R13: 00007fff742c69c0 R14: 0000000000000000 R15: 0000000000000000
[  115.361313] Mem-Info:
[  115.362285] active_anon:188386 inactive_anon:764572 isolated_anon:64
                active_file:61 inactive_file:0 isolated_file:0
                unevictable:0 dirty:0 writeback:2
                slab_reclaimable:6676 slab_unreclaimable:6200
                mapped:155 shmem:180 pagetables:3124 bounce:0
                free:25647 free_pcp:0 free_cma:0
[  115.366889] Node 0 active_anon:782092kB inactive_anon:3029744kB
active_file:244kB inactive_file:0kB unevictable:0kB
isolated(anon):256kB isolated(file):0kB mapped:620kB dirty:0kB
writeback:8kB shmem:720kB shmem_thp: 0kB shmem_pmdmapped: 0kB
anon_thp: 0kB writeback_tmp:0kB kernel_stack:3904kB pagetables:12496kB
all_unreclaimable? no
[  115.369378] Node 0 DMA free:15296kB min:260kB low:324kB high:388kB
reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB
active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB
present:15992kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB
local_pcp:0kB free_cma:0kB
[  115.372121] lowmem_reserve[]: 0 2925 3874 3874
[  115.373195] Node 0 DMA32 free:54436kB min:50824kB low:63528kB
high:76232kB reserved_highatomic:0KB active_anon:819596kB
inactive_anon:2135980kB active_file:48kB inactive_file:24kB
unevictable:0kB writepending:8kB present:3129212kB managed:3021488kB
mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  115.375958] lowmem_reserve[]: 0 0 949 949
[  115.377064] Node 0 Normal free:32856kB min:32880kB low:37004kB
high:41128kB reserved_highatomic:0KB active_anon:2232kB
inactive_anon:853384kB active_file:80kB inactive_file:76kB
unevictable:0kB writepending:0kB present:1048576kB managed:972400kB
mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  115.380050] lowmem_reserve[]: 0 0 0 0
[  115.380908] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 1*64kB (U)
1*128kB (U) 1*256kB (U) 1*512kB (U) 0*1024kB 1*2048kB (M) 3*4096kB (M)
= 15296kB
[  115.382728] Node 0 DMA32: 86*4kB (UM) 52*8kB (UME) 33*16kB (UM)
28*32kB (UME) 11*64kB (U) 2*128kB (UE) 0*256kB 1*512kB (U) 2*1024kB
(UM) 2*2048kB (ME) 11*4096kB (M) = 54856kB
[  115.384560] Node 0 Normal: 597*4kB (UME) 904*8kB (UME) 414*16kB
(UME) 187*32kB (UME) 64*64kB (UME) 19*128kB (UME) 3*256kB (U) 3*512kB
(UE) 2*1024kB (M) 0*2048kB 0*4096kB = 33108kB
[  115.386558] Node 0 hugepages_total=0 hugepages_free=0
hugepages_surp=0 hugepages_size=1048576kB
[  115.387543] Node 0 hugepages_total=0 hugepages_free=0
hugepages_surp=0 hugepages_size=2048kB
[  115.388503] 61467 total pagecache pages
[  115.389532] 61242 pages in swap cache
[  115.390450] Swap cache stats: add 523756, delete 462538, find 69/171
[  115.391383] Free swap  = 0kB
[  115.392312] Total swap = 2094076kB
[  115.393506] 1048445 pages RAM
[  115.394538] 0 pages HighMem/MovableOnly
[  115.395514] 46133 pages reserved
[  115.396525] 0 pages hwpoisoned
[  115.397824] Tasks state (memory values in pages):
[  115.398778] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes
swapents oom_score_adj name
[  115.399792] [    312]     0   312     6603      217    98304
181             0 systemd-journal
[  115.400749] [    339]     0   339     5562        1    69632
395         -1000 systemd-udevd
[  115.401869] [    379]   101   379    23271       17    81920
197             0 systemd-timesyn
[  115.402873] [    388]   107   388     1707        7    53248
101             0 rpcbind
[  115.403869] [    479]     0   479     2373       39    53248
284             0 dhclient
[  115.404849] [    485]     0   485     2120        8    53248
43             0 cron
[  115.406291] [    487]     0   487     4869       44    77824
197             0 systemd-logind
[  115.407536] [    494]   104   494     2247       58    57344
80          -900 dbus-daemon
[  115.408772] [    501]     0   501    56457        0    86016
226             0 rsyslogd
[  115.410041] [    503]     0   503   315093      769   311296
3943             0 containerd
[  115.411289] [    507]     0   507     1404        2    45056
26             0 agetty
[  115.412563] [    521]     0   521    27588      235   110592
1678             0 unattended-upgr
[  115.413834] [    860]     0   860     3964       28    73728
187         -1000 sshd
[  115.415062] [    866]   106   866     5015       19    81920
187             0 exim4
[  115.416286] [    901]     0   901     4233        1    73728
281             0 sshd
[  115.417534] [    928]     0   928     5287       57    73728
280             0 systemd
[  115.418755] [    929]     0   929     5710       77    90112
488             0 (sd-pam)
[  115.419921] [    943]     0   943     2013        1    53248
407             0 bash
[  115.421061] [    969]     0   969  1402762   888995 11276288
513120             0 exaust
[  115.421973] [    970]     0   970     4233      272    77824
5             0 sshd
[  115.422840] [    976]     0   976     1980      297    53248
64             0 bash
[  115.423664] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-0.slice/session-1.scope,task=exaust,pid=969,uid=0
[  115.425344] Out of memory: Killed process 969 (exaust)
total-vm:5611048kB, anon-rss:3555976kB, file-rss:4kB, shmem-rss:0kB,
UID:0 pgtables:11012kB oom_score_adj:0
[  115.912696] oom_reaper: reaped process 969 (exaust), now
anon-rss:0kB, file-rss:0kB, shmem-rss:0kB


I can tell from above that:  [   96.177999]  #1: ffffffff82915020
(rcu_read_lock){....}-{1:2}, at: get_swap_device+0x33/0x140
 get_swap_device did help the rcu_read_lock

Thanks
Zhouyi

On Mon, Jul 19, 2021 at 8:23 PM Oleksandr Natalenko
<oleksandr@natalenko.name> wrote:
>
> On pondělí 19. července 2021 14:16:04 CEST Matthew Wilcox wrote:
> > On Mon, Jul 19, 2021 at 02:12:15PM +0200, Oleksandr Natalenko wrote:
> > > On pondělí 19. července 2021 14:08:37 CEST Miaohe Lin wrote:
> > > > On 2021/7/19 19:59, Oleksandr Natalenko wrote:
> > > > > On pondělí 19. července 2021 13:50:07 CEST Miaohe Lin wrote:
> > > > >> On 2021/7/19 19:22, Matthew Wilcox wrote:
> > > > >>> On Mon, Jul 19, 2021 at 07:12:58PM +0800, Miaohe Lin wrote:
> > > > >>>> When in the commit 2799e77529c2a, we're using the percpu_ref to
> > > > >>>> serialize
> > > > >>>> against concurrent swapoff, i.e. there's percpu_ref inside
> > > > >>>> get_swap_device() instead of rcu_read_lock(). Please see commit
> > > > >>>> 63d8620ecf93 ("mm/swapfile: use percpu_ref to serialize against
> > > > >>>> concurrent swapoff") for detail.
> > > > >>>
> > > > >>> Oh, so this is a backport problem.  2799e77529c2 was backported
> > > > >>> without
> > > > >>> its prerequisite 63d8620ecf93.  Greg, probably best to just drop
> > > > >>
> > > > >> Yes, they're posted as a patch set:
> > > > >>
> > > > >> https://lkml.kernel.org/r/20210426123316.806267-1-linmiaohe@huawei.co
> > > > >> m
> > > > >>
> > > > >>> 2799e77529c2 from all stable trees; the race described is not very
> > > > >>> important (swapoff vs reading a page back from that swap device).
> > > > >>> .
> > > > >>
> > > > >> The swapoff races with reading a page back from that swap device
> > > > >> should
> > > > >> be
> > > > >> really uncommon as most users only do swapoff when the system is
> > > > >> going to
> > > > >> shutdown.
> > > > >>
> > > > >> Sorry for the trouble!
> > > > >
> > > > > git log --oneline v5.13..v5.13.3 --author="Miaohe Lin"
> > > > > 11ebc09e50dc mm/zswap.c: fix two bugs in zswap_writeback_entry()
> > > > > 95d192da198d mm/z3fold: use release_z3fold_page_locked() to release
> > > > > locked
> > > > > z3fold page
> > > > > ccb7848e2344 mm/z3fold: fix potential memory leak in
> > > > > z3fold_destroy_pool()
> > > > > 9f7229c901c1 mm/huge_memory.c: don't discard hugepage if other
> > > > > processes
> > > > > are mapping it
> > > > > f13259175e4f mm/huge_memory.c: add missing read-only THP checking in
> > > > > transparent_hugepage_enabled()
> > > > > afafd371e7de mm/huge_memory.c: remove dedicated macro
> > > > > HPAGE_CACHE_INDEX_MASK a533a21b692f mm/shmem: fix shmem_swapin() race
> > > > > with swapoff
> > > > > c3b39134bbd0 swap: fix do_swap_page() race with swapoff
> > > > >
> > > > > Do you suggest reverting "mm/shmem: fix shmem_swapin() race with
> > > > > swapoff"
> > > > > as well?
> > > >
> > > > This patch also rely on its prerequisite 63d8620ecf93. I think we should
> > > > either revert any commit in this series or just backport the entire
> > > > series.
> > >
> > > Then why not just pick up 2 more patches instead of dropping 2 patches.
> > > Greg, could you please make sure the whole series from [1] gets pulled?
> >
> > Because none of these patches should have been backported in the first
> > place.  It's just not worth the destabilisation.
>
> What about the rest then?
>
> git log --oneline v5.13..v5.13.3 -- mm/ | wc -l
> 18
>
> Those look to be fixes, these ones too.
>
> --
> Oleksandr Natalenko (post-factum)
>
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

[parent not found: <CAABZP2yh3J8+P=3PLZVaC47ymKC7PcfQCBBxjXJ9Ybn+HREbdg@mail.gmail.com>]

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
       [not found]                                   ` <CAABZP2yh3J8+P=3PLZVaC47ymKC7PcfQCBBxjXJ9Ybn+HREbdg@mail.gmail.com>
@ 2021-07-22  7:30                                     ` Chris Clayton
  2021-07-22  8:57                                       ` Zhouyi Zhou
  2021-07-22 14:05                                       ` Paul E. McKenney
  0 siblings, 2 replies; 34+ messages in thread
From: Chris Clayton @ 2021-07-22  7:30 UTC (permalink / raw)
  To: Zhouyi Zhou, Oleksandr Natalenko
  Cc: Matthew Wilcox, Miaohe Lin, Boqun Feng, paulmck, linux-kernel,
	stable, Chris Rankin, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, rcu,
	Andrew Morton, Linux-MM, Huang, Ying, Greg KH



On 19/07/2021 17:56, Zhouyi Zhou wrote:
> Attached is my kernel configuration file (config-5.13.2), dmesg (dmesg.txt)
> My qemu box's memory layout is:
>                     total        used        free      shared
> buff/cache   available
> Mem:           3915          92        3764           0          58        3676
> Swap:          2044          39        2005
> The memory exhausting program:
> #include <stdlib.h>
> int main()
> {
>         while(1)
>         {
>         void *ptr =
>                 malloc(1024*1024);
>          memset(ptr, 0, 1024*1024);
>         }
>         return 0;
> }
> 

I'm not sure what I'm supposed to make of this. Is it that my laptop simply ran out of memory? That would be odd
because, as I said earlier, I was simply doing a weekly archive of my system. That involved using the dar backup utility
to create the incremental backup (which in this instance resulted in a single archive file), compressing that archive
with xz and copying (again one at a time) it and a directory containing about 7000 source and binary rpm files from an
internal HDD to and external HDD connected via USB

My laptop, which I've had for a little over 3 months, has 32GB of memory. I have been using dar for a few years now (and
the version currently installed since May) and have had no trouble with it. Similarly, I've used xz to compress the
backup archives for year and the version currently installed since December 2020. The (incremental) archive file is
about 7GB and 1.2GB when compressed. If copying files, the largest of which is 1.2GB, serially can cause 32GB fn RAM to
be exhausted , then we are all up the creek. So I don't see where in my normal archiving process, the memory exhaustion
would arise from (but I'm happy to be educated on this).

Or is the memory exhaustion likely to be a side effect of the fact that two patches applied in 5.13.2 are missing a
pre-requisite? If that's the case there, seems to no disagreement on whether the two patches (which had not been tagged
for stable) should be reverted or the missing prerequisite should be applied (along with another, related patch that was
in the patch set). Perhaps the stable and mm teams are resolving this issue behind the scenes, but in the meantime I
have backported the missing patches to 5.13.4 and 5.10.52 and am currently running the former as my default kernel.

I also have a patch that reverts the two patches that were applied to stable, so could run with that applied if it would
be more helpful. It would, of course, leave open the races that the patches are designed to close., but if I've manually
run swapoff more than once or twice in the twenty years I've been using Linux-based systems, I'd be very surprised.

> On Tue, Jul 20, 2021 at 12:47 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
>>
>> I downloaded linux-5.13.2, configure and compile the kernel with
>> CONFIG_LOCKDEP=y
>> CONFIG_LOCK_DEBUGGING_SUPPORT=y
>> CONFIG_PROVE_LOCKING=y
>> CONFIG_DEBUG_SPINLOCK=y
>> CONFIG_DEBUG_LOCK_ALLOC=y
>> install the kernel on a qemu box, then run following C Program
>> #include <stdlib.h>
>> int main()
>> {
>>         while(1)
>>         {
>>                 malloc(1024*1024);
>>         }
>>         return 0;
>> }
>> And following is the dmesg:
>> [   96.155017] ------------[ cut here ]------------
>> [   96.155030] WARNING: CPU: 10 PID: 770 at
>> kernel/rcu/tree_plugin.h:359 rcu_note_context_switch+0x91/0x610
>> [   96.155074] Modules linked in: ppdev intel_rapl_msr
>> intel_rapl_common crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
>> bochs_drm aesni_intel drm_vram_helper evdev crypto_simd drm_ttm_helper
>> snd_pcm cryptd ttm snd_timer parport_pc serio_raw drm_kms_helper sg
>> snd parport soundcore drm pcspkr button ip_tables autofs4 psmouse
>> sr_mod i2c_piix4 sd_mod crc32c_intel t10_pi cdrom i2c_core e1000
>> ata_generic floppy
>> [   96.155180] CPU: 10 PID: 770 Comm: containerd Not tainted 5.13.2 #1
>> [   96.155185] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>> BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
>> [   96.155189] RIP: 0010:rcu_note_context_switch+0x91/0x610
>> [   96.155195] Code: ff 74 0f 65 8b 05 a7 10 e5 7e 85 c0 0f 84 22 01
>> 00 00 45 84 ed 75 15 65 48 8b 04 25 80 7f 01 00 8b b0 44 03 00 00 85
>> f6 7e 02 <0f> 0b 65 48 8b 04 25 80 7f 01 00 8b 88 44 03 00 00 85 c9 7e
>> 0f 41
>> [   96.155200] RSP: 0000:ffffc90000a8bc50 EFLAGS: 00010002
>> [   96.155204] RAX: ffff88810830c300 RBX: ffff88813bcae680 RCX: 0000000000000000
>> [   96.155208] RDX: 0000000000000002 RSI: 0000000000000001 RDI: 0000000000000001
>> [   96.155210] RBP: ffffc90000a8bcd0 R08: 0000000000000001 R09: 0000000000000001
>> [   96.155213] R10: 0000000000000000 R11: ffffffff81319b93 R12: ffff88810830c300
>> [   96.155216] R13: 0000000000000000 R14: ffff88813bcad958 R15: 0000000000004970
>> [   96.155220] FS:  00007fc09cff9700(0000) GS:ffff88813bc80000(0000)
>> knlGS:0000000000000000
>> [   96.155223] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [   96.155226] CR2: 00005594d2ff0530 CR3: 0000000103be4001 CR4: 0000000000060ee0
>> [   96.155233] Call Trace:
>> [   96.155243]  __schedule+0xda/0xa30
>> [   96.155283]  schedule+0x46/0xf0
>> [   96.155289]  io_schedule+0x12/0x40
>> [   96.155295]  __lock_page_or_retry+0x1f9/0x510
>> [   96.155319]  ? __page_cache_alloc+0x140/0x140
>> [   96.155345]  do_swap_page+0x33f/0x930
>> [   96.155364]  __handle_mm_fault+0xa54/0x1550
>> [   96.155390]  handle_mm_fault+0x17f/0x420
>> [   96.155400]  do_user_addr_fault+0x1be/0x770
>> [   96.155422]  exc_page_fault+0x69/0x280
>> [   96.155435]  ? asm_exc_page_fault+0x8/0x30
>> [   96.155443]  asm_exc_page_fault+0x1e/0x30
>> [   96.155448] RIP: 0033:0x5594d15ec98f
>> [   96.155454] Code: 44 24 58 48 85 c0 48 b9 00 e4 0b 54 02 00 00 00
>> 48 0f 44 c1 48 89 44 24 58 eb 05 48 8b 44 24 58 48 89 04 24 e8 e2 c9
>> 9c ff 90 <48> 8b 05 9a 3b a0 01 48 8b 4c 24 50 48 89 0c 24 48 8d 15 9a
>> c4 ec
>> [   96.155457] RSP: 002b:000000c0001f7f80 EFLAGS: 00010206
>> [   96.155462] RAX: 0000000000000000 RBX: 00005594d0f9cf55 RCX: fffffffffffffff8
>> [   96.155465] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00005594d0fa2d01
>> [   96.155467] RBP: 000000c0001f7fc0 R08: 0000000000000000 R09: 0000000000000000
>> [   96.155470] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000004
>> [   96.155472] R13: 0000000000000013 R14: 00005594d1ee96e6 R15: 0000000000000039
>> [   96.155495] irq event stamp: 10464
>> [   96.155497] hardirqs last  enabled at (10463): [<ffffffff81c0a574>]
>> _raw_spin_unlock_irq+0x24/0x50
>> [   96.155508] hardirqs last disabled at (10464): [<ffffffff81c01882>]
>> __schedule+0x412/0xa30
>> [   96.155512] softirqs last  enabled at (8668): [<ffffffff82000401>]
>> __do_softirq+0x401/0x51b
>> [   96.155517] softirqs last disabled at (8657): [<ffffffff81129c22>]
>> irq_exit_rcu+0x142/0x150
>> [   96.155531] ---[ end trace 165ff31fd86ffc12 ]---
>>
>> [   96.177669] =============================
>> [   96.177693] [ BUG: Invalid wait context ]
>> [   96.177717] 5.13.2 #1 Tainted: G        W
>> [   96.177743] -----------------------------
>> [   96.177765] containerd/770 is trying to lock:
>> [   96.177790] ffff88813ba69b30 (&cache->alloc_lock){+.+.}-{3:3}, at:
>> get_swap_page+0x126/0x200
>> [   96.177867] other info that might help us debug this:
>> [   96.177894] context-{4:4}
>> [   96.177910] 3 locks held by containerd/770:
>> [   96.177934]  #0: ffff88810815ea28 (&mm->mmap_lock#2){++++}-{3:3},
>> at: do_user_addr_fault+0x115/0x770
>> [   96.177999]  #1: ffffffff82915020 (rcu_read_lock){....}-{1:2}, at:
>> get_swap_device+0x33/0x140
>> [   96.178057]  #2: ffffffff82955ba0 (fs_reclaim){+.+.}-{0:0}, at:
>> __fs_reclaim_acquire+0x5/0x30
>> [   96.178115] stack backtrace:
>> [   96.178133] CPU: 1 PID: 770 Comm: containerd Tainted: G        W
>>      5.13.2 #1
>> [   96.178183] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>> BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
>> [   96.178254] Call Trace:
>> [   96.178274]  dump_stack+0x82/0xa4
>> [   96.178314]  __lock_acquire+0x9a5/0x20a0
>> [   96.178347]  ? find_held_lock+0x3a/0xb0
>> [   96.178381]  lock_acquire+0xe9/0x320
>> [   96.178405]  ? get_swap_page+0x126/0x200
>> [   96.178433]  ? _raw_spin_unlock+0x29/0x40
>> [   96.178461]  ? page_vma_mapped_walk+0x3a3/0x960
>> [   96.178491]  __mutex_lock+0x99/0x980
>> [   96.178515]  ? get_swap_page+0x126/0x200
>> [   96.178541]  ? find_held_lock+0x3a/0xb0
>> [   96.178577]  ? get_swap_page+0x126/0x200
>> [   96.178603]  ? page_referenced+0xd5/0x170
>> [   96.178637]  ? lock_release+0x1b4/0x300
>> [   96.178663]  ? get_swap_page+0x126/0x200
>> [   96.178698]  get_swap_page+0x126/0x200
>> [   96.178725]  add_to_swap+0x14/0x60
>> [   96.178749]  shrink_page_list+0xb13/0xe70
>> [   96.178787]  shrink_inactive_list+0x243/0x550
>> [   96.178819]  shrink_lruvec+0x4fd/0x780
>> [   96.178849]  ? shrink_node+0x257/0x7c0
>> [   96.178873]  shrink_node+0x257/0x7c0
>> [   96.178900]  do_try_to_free_pages+0xdd/0x410
>> [   96.178929]  try_to_free_pages+0x110/0x300
>> [   96.178966]  __alloc_pages_slowpath.constprop.126+0x2ae/0xfa0
>> [   96.179002]  ? lock_release+0x1b4/0x300
>> [   96.179028]  __alloc_pages+0x37d/0x400
>> [   96.179054]  alloc_pages_vma+0x73/0x1d0
>> [   96.179878]  __read_swap_cache_async+0xb8/0x280
>> [   96.180684]  swap_cluster_readahead+0x194/0x270
>> [   96.181459]  ? swapin_readahead+0x62/0x530
>> [   96.182008]  swapin_readahead+0x62/0x530
>> [   96.182558]  ? find_held_lock+0x3a/0xb0
>> [   96.183109]  ? lookup_swap_cache+0x5c/0x1c0
>> [   96.183657]  ? lock_release+0x1b4/0x300
>> [   96.184207]  ? do_swap_page+0x232/0x930
>> [   96.184753]  do_swap_page+0x232/0x930
>> [   96.185326]  __handle_mm_fault+0xa54/0x1550
>> [   96.185850]  handle_mm_fault+0x17f/0x420
>> [   96.186361]  do_user_addr_fault+0x1be/0x770
>> [   96.186880]  exc_page_fault+0x69/0x280
>> [   96.187382]  ? asm_exc_page_fault+0x8/0x30
>> [   96.187879]  asm_exc_page_fault+0x1e/0x30
>> [   96.188363] RIP: 0033:0x5594d0f78da4
>> [   96.188829] Code: cc cc cc cc cc cc 48 8b 0d 39 d2 01 02 64 48 8b
>> 09 48 3b 61 10 76 3d 48 83 ec 28 48 89 6c 24 20 48 8d 6c 24 20 48 8b
>> 44 24 30 <48> 8b 08 48 89 0c 24 48 89 44 24 08 c6 44 24 10 01 e8 76 f4
>> ff ff
>> [   96.189894] RSP: 002b:000000c0001f7de8 EFLAGS: 00010216
>> [   96.190406] RAX: 00005594d2308160 RBX: 0000000000000000 RCX: 000000c0004ea480
>> [   96.190902] RDX: 000000c0002a4270 RSI: 0000000000000010 RDI: 0000000000000011
>> [   96.191394] RBP: 000000c0001f7e08 R08: 0000000000000002 R09: 0000000000000011
>> [   96.191891] R10: 00005594d22f6ce0 R11: 00005594d1ee96e4 R12: ffffffffffffffff
>> [   96.192396] R13: 0000000000000028 R14: 0000000000000027 R15: 0000000000000200
>> [  115.344546] exaust invoked oom-killer:
>> gfp_mask=0x100dca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0,
>> oom_score_adj=0
>> [  115.346019] CPU: 1 PID: 969 Comm: exaust Tainted: G        W
>>  5.13.2 #1
>> [  115.346569] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>> BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
>> [  115.347712] Call Trace:
>> [  115.348287]  dump_stack+0x82/0xa4
>> [  115.348867]  dump_header+0x55/0x3f0
>> [  115.349491]  oom_kill_process+0x160/0x210
>> [  115.350068]  out_of_memory+0x10b/0x630
>> [  115.350646]  __alloc_pages_slowpath.constprop.126+0xec7/0xfa0
>> [  115.351255]  __alloc_pages+0x37d/0x400
>> [  115.351861]  alloc_pages_vma+0x73/0x1d0
>> [  115.352449]  __handle_mm_fault+0xe1b/0x1550
>> [  115.353042]  handle_mm_fault+0x17f/0x420
>> [  115.353646]  do_user_addr_fault+0x1be/0x770
>> [  115.354239]  exc_page_fault+0x69/0x280
>> [  115.354833]  ? asm_exc_page_fault+0x8/0x30
>> [  115.355428]  asm_exc_page_fault+0x1e/0x30
>> [  115.356021] RIP: 0033:0x7fe8ee633543
>> [  115.356617] Code: Unable to access opcode bytes at RIP 0x7fe8ee633519.
>> [  115.357245] RSP: 002b:00007fff742c68c8 EFLAGS: 00010206
>> [  115.357857] RAX: 00007fe798041010 RBX: 0000000000000000 RCX: 00007fe7980fd000
>> [  115.358469] RDX: 00007fe798141000 RSI: 0000000000000000 RDI: 00007fe798041010
>> [  115.359077] RBP: 00007fff742c68e0 R08: 00000000ffffffff R09: 0000000000000000
>> [  115.359686] R10: 0000000000000022 R11: 0000000000000246 R12: 0000561abe3fa060
>> [  115.360300] R13: 00007fff742c69c0 R14: 0000000000000000 R15: 0000000000000000
>> [  115.361313] Mem-Info:
>> [  115.362285] active_anon:188386 inactive_anon:764572 isolated_anon:64
>>                 active_file:61 inactive_file:0 isolated_file:0
>>                 unevictable:0 dirty:0 writeback:2
>>                 slab_reclaimable:6676 slab_unreclaimable:6200
>>                 mapped:155 shmem:180 pagetables:3124 bounce:0
>>                 free:25647 free_pcp:0 free_cma:0
>> [  115.366889] Node 0 active_anon:782092kB inactive_anon:3029744kB
>> active_file:244kB inactive_file:0kB unevictable:0kB
>> isolated(anon):256kB isolated(file):0kB mapped:620kB dirty:0kB
>> writeback:8kB shmem:720kB shmem_thp: 0kB shmem_pmdmapped: 0kB
>> anon_thp: 0kB writeback_tmp:0kB kernel_stack:3904kB pagetables:12496kB
>> all_unreclaimable? no
>> [  115.369378] Node 0 DMA free:15296kB min:260kB low:324kB high:388kB
>> reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB
>> active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB
>> present:15992kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB
>> local_pcp:0kB free_cma:0kB
>> [  115.372121] lowmem_reserve[]: 0 2925 3874 3874
>> [  115.373195] Node 0 DMA32 free:54436kB min:50824kB low:63528kB
>> high:76232kB reserved_highatomic:0KB active_anon:819596kB
>> inactive_anon:2135980kB active_file:48kB inactive_file:24kB
>> unevictable:0kB writepending:8kB present:3129212kB managed:3021488kB
>> mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
>> [  115.375958] lowmem_reserve[]: 0 0 949 949
>> [  115.377064] Node 0 Normal free:32856kB min:32880kB low:37004kB
>> high:41128kB reserved_highatomic:0KB active_anon:2232kB
>> inactive_anon:853384kB active_file:80kB inactive_file:76kB
>> unevictable:0kB writepending:0kB present:1048576kB managed:972400kB
>> mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
>> [  115.380050] lowmem_reserve[]: 0 0 0 0
>> [  115.380908] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 1*64kB (U)
>> 1*128kB (U) 1*256kB (U) 1*512kB (U) 0*1024kB 1*2048kB (M) 3*4096kB (M)
>> = 15296kB
>> [  115.382728] Node 0 DMA32: 86*4kB (UM) 52*8kB (UME) 33*16kB (UM)
>> 28*32kB (UME) 11*64kB (U) 2*128kB (UE) 0*256kB 1*512kB (U) 2*1024kB
>> (UM) 2*2048kB (ME) 11*4096kB (M) = 54856kB
>> [  115.384560] Node 0 Normal: 597*4kB (UME) 904*8kB (UME) 414*16kB
>> (UME) 187*32kB (UME) 64*64kB (UME) 19*128kB (UME) 3*256kB (U) 3*512kB
>> (UE) 2*1024kB (M) 0*2048kB 0*4096kB = 33108kB
>> [  115.386558] Node 0 hugepages_total=0 hugepages_free=0
>> hugepages_surp=0 hugepages_size=1048576kB
>> [  115.387543] Node 0 hugepages_total=0 hugepages_free=0
>> hugepages_surp=0 hugepages_size=2048kB
>> [  115.388503] 61467 total pagecache pages
>> [  115.389532] 61242 pages in swap cache
>> [  115.390450] Swap cache stats: add 523756, delete 462538, find 69/171
>> [  115.391383] Free swap  = 0kB
>> [  115.392312] Total swap = 2094076kB
>> [  115.393506] 1048445 pages RAM
>> [  115.394538] 0 pages HighMem/MovableOnly
>> [  115.395514] 46133 pages reserved
>> [  115.396525] 0 pages hwpoisoned
>> [  115.397824] Tasks state (memory values in pages):
>> [  115.398778] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes
>> swapents oom_score_adj name
>> [  115.399792] [    312]     0   312     6603      217    98304
>> 181             0 systemd-journal
>> [  115.400749] [    339]     0   339     5562        1    69632
>> 395         -1000 systemd-udevd
>> [  115.401869] [    379]   101   379    23271       17    81920
>> 197             0 systemd-timesyn
>> [  115.402873] [    388]   107   388     1707        7    53248
>> 101             0 rpcbind
>> [  115.403869] [    479]     0   479     2373       39    53248
>> 284             0 dhclient
>> [  115.404849] [    485]     0   485     2120        8    53248
>> 43             0 cron
>> [  115.406291] [    487]     0   487     4869       44    77824
>> 197             0 systemd-logind
>> [  115.407536] [    494]   104   494     2247       58    57344
>> 80          -900 dbus-daemon
>> [  115.408772] [    501]     0   501    56457        0    86016
>> 226             0 rsyslogd
>> [  115.410041] [    503]     0   503   315093      769   311296
>> 3943             0 containerd
>> [  115.411289] [    507]     0   507     1404        2    45056
>> 26             0 agetty
>> [  115.412563] [    521]     0   521    27588      235   110592
>> 1678             0 unattended-upgr
>> [  115.413834] [    860]     0   860     3964       28    73728
>> 187         -1000 sshd
>> [  115.415062] [    866]   106   866     5015       19    81920
>> 187             0 exim4
>> [  115.416286] [    901]     0   901     4233        1    73728
>> 281             0 sshd
>> [  115.417534] [    928]     0   928     5287       57    73728
>> 280             0 systemd
>> [  115.418755] [    929]     0   929     5710       77    90112
>> 488             0 (sd-pam)
>> [  115.419921] [    943]     0   943     2013        1    53248
>> 407             0 bash
>> [  115.421061] [    969]     0   969  1402762   888995 11276288
>> 513120             0 exaust
>> [  115.421973] [    970]     0   970     4233      272    77824
>> 5             0 sshd
>> [  115.422840] [    976]     0   976     1980      297    53248
>> 64             0 bash
>> [  115.423664] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-0.slice/session-1.scope,task=exaust,pid=969,uid=0
>> [  115.425344] Out of memory: Killed process 969 (exaust)
>> total-vm:5611048kB, anon-rss:3555976kB, file-rss:4kB, shmem-rss:0kB,
>> UID:0 pgtables:11012kB oom_score_adj:0
>> [  115.912696] oom_reaper: reaped process 969 (exaust), now
>> anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
>>
>>
>> I can tell from above that:  [   96.177999]  #1: ffffffff82915020
>> (rcu_read_lock){....}-{1:2}, at: get_swap_device+0x33/0x140
>>  get_swap_device did help the rcu_read_lock
>>
>> Thanks
>> Zhouyi
>>
>> On Mon, Jul 19, 2021 at 8:23 PM Oleksandr Natalenko
>> <oleksandr@natalenko.name> wrote:
>>>
>>> On pondělí 19. července 2021 14:16:04 CEST Matthew Wilcox wrote:
>>>> On Mon, Jul 19, 2021 at 02:12:15PM +0200, Oleksandr Natalenko wrote:
>>>>> On pondělí 19. července 2021 14:08:37 CEST Miaohe Lin wrote:
>>>>>> On 2021/7/19 19:59, Oleksandr Natalenko wrote:
>>>>>>> On pondělí 19. července 2021 13:50:07 CEST Miaohe Lin wrote:
>>>>>>>> On 2021/7/19 19:22, Matthew Wilcox wrote:
>>>>>>>>> On Mon, Jul 19, 2021 at 07:12:58PM +0800, Miaohe Lin wrote:
>>>>>>>>>> When in the commit 2799e77529c2a, we're using the percpu_ref to
>>>>>>>>>> serialize
>>>>>>>>>> against concurrent swapoff, i.e. there's percpu_ref inside
>>>>>>>>>> get_swap_device() instead of rcu_read_lock(). Please see commit
>>>>>>>>>> 63d8620ecf93 ("mm/swapfile: use percpu_ref to serialize against
>>>>>>>>>> concurrent swapoff") for detail.
>>>>>>>>>
>>>>>>>>> Oh, so this is a backport problem.  2799e77529c2 was backported
>>>>>>>>> without
>>>>>>>>> its prerequisite 63d8620ecf93.  Greg, probably best to just drop
>>>>>>>>
>>>>>>>> Yes, they're posted as a patch set:
>>>>>>>>
>>>>>>>> https://lkml.kernel.org/r/20210426123316.806267-1-linmiaohe@huawei.co
>>>>>>>> m
>>>>>>>>
>>>>>>>>> 2799e77529c2 from all stable trees; the race described is not very
>>>>>>>>> important (swapoff vs reading a page back from that swap device).
>>>>>>>>> .
>>>>>>>>
>>>>>>>> The swapoff races with reading a page back from that swap device
>>>>>>>> should
>>>>>>>> be
>>>>>>>> really uncommon as most users only do swapoff when the system is
>>>>>>>> going to
>>>>>>>> shutdown.
>>>>>>>>
>>>>>>>> Sorry for the trouble!
>>>>>>>
>>>>>>> git log --oneline v5.13..v5.13.3 --author="Miaohe Lin"
>>>>>>> 11ebc09e50dc mm/zswap.c: fix two bugs in zswap_writeback_entry()
>>>>>>> 95d192da198d mm/z3fold: use release_z3fold_page_locked() to release
>>>>>>> locked
>>>>>>> z3fold page
>>>>>>> ccb7848e2344 mm/z3fold: fix potential memory leak in
>>>>>>> z3fold_destroy_pool()
>>>>>>> 9f7229c901c1 mm/huge_memory.c: don't discard hugepage if other
>>>>>>> processes
>>>>>>> are mapping it
>>>>>>> f13259175e4f mm/huge_memory.c: add missing read-only THP checking in
>>>>>>> transparent_hugepage_enabled()
>>>>>>> afafd371e7de mm/huge_memory.c: remove dedicated macro
>>>>>>> HPAGE_CACHE_INDEX_MASK a533a21b692f mm/shmem: fix shmem_swapin() race
>>>>>>> with swapoff
>>>>>>> c3b39134bbd0 swap: fix do_swap_page() race with swapoff
>>>>>>>
>>>>>>> Do you suggest reverting "mm/shmem: fix shmem_swapin() race with
>>>>>>> swapoff"
>>>>>>> as well?
>>>>>>
>>>>>> This patch also rely on its prerequisite 63d8620ecf93. I think we should
>>>>>> either revert any commit in this series or just backport the entire
>>>>>> series.
>>>>>
>>>>> Then why not just pick up 2 more patches instead of dropping 2 patches.
>>>>> Greg, could you please make sure the whole series from [1] gets pulled?
>>>>
>>>> Because none of these patches should have been backported in the first
>>>> place.  It's just not worth the destabilisation.
>>>
>>> What about the rest then?
>>>
>>> git log --oneline v5.13..v5.13.3 -- mm/ | wc -l
>>> 18
>>>
>>> Those look to be fixes, these ones too.
>>>
>>> --
>>> Oleksandr Natalenko (post-factum)
>>>
>>>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-22  7:30                                     ` Chris Clayton
@ 2021-07-22  8:57                                       ` Zhouyi Zhou
  2021-07-22 12:36                                         ` Matthew Wilcox
  2021-07-22 14:05                                       ` Paul E. McKenney
  1 sibling, 1 reply; 34+ messages in thread
From: Zhouyi Zhou @ 2021-07-22  8:57 UTC (permalink / raw)
  To: Chris Clayton
  Cc: Oleksandr Natalenko, Matthew Wilcox, Miaohe Lin, Boqun Feng,
	paulmck, linux-kernel, stable, Chris Rankin, Josh Triplett,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes,
	rcu, Andrew Morton, Linux-MM, Huang, Ying, Greg KH

Thanks for reviewing,

What I have deduced from the dmesg  is:
In function do_swap_page,
after invoking
3385        si = get_swap_device(entry); /* rcu_read_lock */
and before
3561    out:
3562        if (si)
3563            put_swap_device(si);
The thread got scheduled out in
3454        locked = lock_page_or_retry(page, vma->vm_mm, vmf->flags);

I am only familiar with Linux RCU subsystem, hope mm people can solve our
confusions.

On Thu, Jul 22, 2021 at 3:30 PM Chris Clayton <chris2553@googlemail.com> wrote:
>
>
>
> On 19/07/2021 17:56, Zhouyi Zhou wrote:
> > Attached is my kernel configuration file (config-5.13.2), dmesg (dmesg.txt)
> > My qemu box's memory layout is:
> >                     total        used        free      shared
> > buff/cache   available
> > Mem:           3915          92        3764           0          58        3676
> > Swap:          2044          39        2005
> > The memory exhausting program:
> > #include <stdlib.h>
> > int main()
> > {
> >         while(1)
> >         {
> >         void *ptr =
> >                 malloc(1024*1024);
> >          memset(ptr, 0, 1024*1024);
> >         }
> >         return 0;
> > }
> >
The warning and the BUG report is very each to trigger.

>
> I'm not sure what I'm supposed to make of this. Is it that my laptop simply ran out of memory? That would be odd
> because, as I said earlier, I was simply doing a weekly archive of my system. That involved using the dar backup utility
> to create the incremental backup (which in this instance resulted in a single archive file), compressing that archive
> with xz and copying (again one at a time) it and a directory containing about 7000 source and binary rpm files from an
> internal HDD to and external HDD connected via USB
>
> My laptop, which I've had for a little over 3 months, has 32GB of memory. I have been using dar for a few years now (and
> the version currently installed since May) and have had no trouble with it. Similarly, I've used xz to compress the
> backup archives for year and the version currently installed since December 2020. The (incremental) archive file is
> about 7GB and 1.2GB when compressed. If copying files, the largest of which is 1.2GB, serially can cause 32GB fn RAM to
> be exhausted , then we are all up the creek. So I don't see where in my normal archiving process, the memory exhaustion
> would arise from (but I'm happy to be educated on this).
>
> Or is the memory exhaustion likely to be a side effect of the fact that two patches applied in 5.13.2 are missing a
> pre-requisite? If that's the case there, seems to no disagreement on whether the two patches (which had not been tagged
> for stable) should be reverted or the missing prerequisite should be applied (along with another, related patch that was
> in the patch set). Perhaps the stable and mm teams are resolving this issue behind the scenes, but in the meantime I
> have backported the missing patches to 5.13.4 and 5.10.52 and am currently running the former as my default kernel.
>
> I also have a patch that reverts the two patches that were applied to stable, so could run with that applied if it would
> be more helpful. It would, of course, leave open the races that the patches are designed to close., but if I've manually
> run swapoff more than once or twice in the twenty years I've been using Linux-based systems, I'd be very surprised.
>
> > On Tue, Jul 20, 2021 at 12:47 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> >>
> >> I downloaded linux-5.13.2, configure and compile the kernel with
> >> CONFIG_LOCKDEP=y
> >> CONFIG_LOCK_DEBUGGING_SUPPORT=y
> >> CONFIG_PROVE_LOCKING=y
> >> CONFIG_DEBUG_SPINLOCK=y
> >> CONFIG_DEBUG_LOCK_ALLOC=y
> >> install the kernel on a qemu box, then run following C Program
> >> #include <stdlib.h>
> >> int main()
> >> {
> >>         while(1)
> >>         {
> >>                 malloc(1024*1024);
+memset(ptr, 0, 1024*1024); /*touch the allocated virtual memory*/
> >>         }
> >>         return 0;
> >> }
> >> And following is the dmesg:
> >> [   96.155017] ------------[ cut here ]------------
> >> [   96.155030] WARNING: CPU: 10 PID: 770 at
> >> kernel/rcu/tree_plugin.h:359 rcu_note_context_switch+0x91/0x610
> >> [   96.155074] Modules linked in: ppdev intel_rapl_msr
> >> intel_rapl_common crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
> >> bochs_drm aesni_intel drm_vram_helper evdev crypto_simd drm_ttm_helper
> >> snd_pcm cryptd ttm snd_timer parport_pc serio_raw drm_kms_helper sg
> >> snd parport soundcore drm pcspkr button ip_tables autofs4 psmouse
> >> sr_mod i2c_piix4 sd_mod crc32c_intel t10_pi cdrom i2c_core e1000
> >> ata_generic floppy
> >> [   96.155180] CPU: 10 PID: 770 Comm: containerd Not tainted 5.13.2 #1
> >> [   96.155185] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> >> BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
> >> [   96.155189] RIP: 0010:rcu_note_context_switch+0x91/0x610
> >> [   96.155195] Code: ff 74 0f 65 8b 05 a7 10 e5 7e 85 c0 0f 84 22 01
> >> 00 00 45 84 ed 75 15 65 48 8b 04 25 80 7f 01 00 8b b0 44 03 00 00 85
> >> f6 7e 02 <0f> 0b 65 48 8b 04 25 80 7f 01 00 8b 88 44 03 00 00 85 c9 7e
> >> 0f 41
> >> [   96.155200] RSP: 0000:ffffc90000a8bc50 EFLAGS: 00010002
> >> [   96.155204] RAX: ffff88810830c300 RBX: ffff88813bcae680 RCX: 0000000000000000
> >> [   96.155208] RDX: 0000000000000002 RSI: 0000000000000001 RDI: 0000000000000001
> >> [   96.155210] RBP: ffffc90000a8bcd0 R08: 0000000000000001 R09: 0000000000000001
> >> [   96.155213] R10: 0000000000000000 R11: ffffffff81319b93 R12: ffff88810830c300
> >> [   96.155216] R13: 0000000000000000 R14: ffff88813bcad958 R15: 0000000000004970
> >> [   96.155220] FS:  00007fc09cff9700(0000) GS:ffff88813bc80000(0000)
> >> knlGS:0000000000000000
> >> [   96.155223] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> [   96.155226] CR2: 00005594d2ff0530 CR3: 0000000103be4001 CR4: 0000000000060ee0
> >> [   96.155233] Call Trace:
> >> [   96.155243]  __schedule+0xda/0xa30
> >> [   96.155283]  schedule+0x46/0xf0
> >> [   96.155289]  io_schedule+0x12/0x40
> >> [   96.155295]  __lock_page_or_retry+0x1f9/0x510
> >> [   96.155319]  ? __page_cache_alloc+0x140/0x140
> >> [   96.155345]  do_swap_page+0x33f/0x930
> >> [   96.155364]  __handle_mm_fault+0xa54/0x1550
> >> [   96.155390]  handle_mm_fault+0x17f/0x420
> >> [   96.155400]  do_user_addr_fault+0x1be/0x770
> >> [   96.155422]  exc_page_fault+0x69/0x280
> >> [   96.155435]  ? asm_exc_page_fault+0x8/0x30
> >> [   96.155443]  asm_exc_page_fault+0x1e/0x30
> >> [   96.155448] RIP: 0033:0x5594d15ec98f
> >> [   96.155454] Code: 44 24 58 48 85 c0 48 b9 00 e4 0b 54 02 00 00 00
> >> 48 0f 44 c1 48 89 44 24 58 eb 05 48 8b 44 24 58 48 89 04 24 e8 e2 c9
> >> 9c ff 90 <48> 8b 05 9a 3b a0 01 48 8b 4c 24 50 48 89 0c 24 48 8d 15 9a
> >> c4 ec
> >> [   96.155457] RSP: 002b:000000c0001f7f80 EFLAGS: 00010206
> >> [   96.155462] RAX: 0000000000000000 RBX: 00005594d0f9cf55 RCX: fffffffffffffff8
> >> [   96.155465] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00005594d0fa2d01
> >> [   96.155467] RBP: 000000c0001f7fc0 R08: 0000000000000000 R09: 0000000000000000
> >> [   96.155470] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000004
> >> [   96.155472] R13: 0000000000000013 R14: 00005594d1ee96e6 R15: 0000000000000039
> >> [   96.155495] irq event stamp: 10464
> >> [   96.155497] hardirqs last  enabled at (10463): [<ffffffff81c0a574>]
> >> _raw_spin_unlock_irq+0x24/0x50
> >> [   96.155508] hardirqs last disabled at (10464): [<ffffffff81c01882>]
> >> __schedule+0x412/0xa30
> >> [   96.155512] softirqs last  enabled at (8668): [<ffffffff82000401>]
> >> __do_softirq+0x401/0x51b
> >> [   96.155517] softirqs last disabled at (8657): [<ffffffff81129c22>]
> >> irq_exit_rcu+0x142/0x150
> >> [   96.155531] ---[ end trace 165ff31fd86ffc12 ]---
> >>
> >> [   96.177669] =============================
> >> [   96.177693] [ BUG: Invalid wait context ]
> >> [   96.177717] 5.13.2 #1 Tainted: G        W
> >> [   96.177743] -----------------------------
> >> [   96.177765] containerd/770 is trying to lock:
> >> [   96.177790] ffff88813ba69b30 (&cache->alloc_lock){+.+.}-{3:3}, at:
> >> get_swap_page+0x126/0x200
> >> [   96.177867] other info that might help us debug this:
> >> [   96.177894] context-{4:4}
> >> [   96.177910] 3 locks held by containerd/770:
> >> [   96.177934]  #0: ffff88810815ea28 (&mm->mmap_lock#2){++++}-{3:3},
> >> at: do_user_addr_fault+0x115/0x770
> >> [   96.177999]  #1: ffffffff82915020 (rcu_read_lock){....}-{1:2}, at:
> >> get_swap_device+0x33/0x140
> >> [   96.178057]  #2: ffffffff82955ba0 (fs_reclaim){+.+.}-{0:0}, at:
> >> __fs_reclaim_acquire+0x5/0x30
When lock related bug occurs, above three locks are held by the task.

> >> [   96.178115] stack backtrace:
> >> [   96.178133] CPU: 1 PID: 770 Comm: containerd Tainted: G        W
> >>      5.13.2 #1
> >> [   96.178183] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> >> BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
> >> [   96.178254] Call Trace:
> >> [   96.178274]  dump_stack+0x82/0xa4
> >> [   96.178314]  __lock_acquire+0x9a5/0x20a0
> >> [   96.178347]  ? find_held_lock+0x3a/0xb0
> >> [   96.178381]  lock_acquire+0xe9/0x320
> >> [   96.178405]  ? get_swap_page+0x126/0x200
> >> [   96.178433]  ? _raw_spin_unlock+0x29/0x40
> >> [   96.178461]  ? page_vma_mapped_walk+0x3a3/0x960
> >> [   96.178491]  __mutex_lock+0x99/0x980
> >> [   96.178515]  ? get_swap_page+0x126/0x200
> >> [   96.178541]  ? find_held_lock+0x3a/0xb0
> >> [   96.178577]  ? get_swap_page+0x126/0x200
> >> [   96.178603]  ? page_referenced+0xd5/0x170
> >> [   96.178637]  ? lock_release+0x1b4/0x300
> >> [   96.178663]  ? get_swap_page+0x126/0x200
> >> [   96.178698]  get_swap_page+0x126/0x200
> >> [   96.178725]  add_to_swap+0x14/0x60
> >> [   96.178749]  shrink_page_list+0xb13/0xe70
> >> [   96.178787]  shrink_inactive_list+0x243/0x550
> >> [   96.178819]  shrink_lruvec+0x4fd/0x780
> >> [   96.178849]  ? shrink_node+0x257/0x7c0
> >> [   96.178873]  shrink_node+0x257/0x7c0
> >> [   96.178900]  do_try_to_free_pages+0xdd/0x410
> >> [   96.178929]  try_to_free_pages+0x110/0x300
> >> [   96.178966]  __alloc_pages_slowpath.constprop.126+0x2ae/0xfa0
> >> [   96.179002]  ? lock_release+0x1b4/0x300
> >> [   96.179028]  __alloc_pages+0x37d/0x400
> >> [   96.179054]  alloc_pages_vma+0x73/0x1d0
> >> [   96.179878]  __read_swap_cache_async+0xb8/0x280
> >> [   96.180684]  swap_cluster_readahead+0x194/0x270
> >> [   96.181459]  ? swapin_readahead+0x62/0x530
> >> [   96.182008]  swapin_readahead+0x62/0x530
> >> [   96.182558]  ? find_held_lock+0x3a/0xb0
> >> [   96.183109]  ? lookup_swap_cache+0x5c/0x1c0
> >> [   96.183657]  ? lock_release+0x1b4/0x300
> >> [   96.184207]  ? do_swap_page+0x232/0x930
> >> [   96.184753]  do_swap_page+0x232/0x930
do_swap_page is the function that cause the problem.

> >> [   96.185326]  __handle_mm_fault+0xa54/0x1550
> >> [   96.185850]  handle_mm_fault+0x17f/0x420
> >> [   96.186361]  do_user_addr_fault+0x1be/0x770
> >> [   96.186880]  exc_page_fault+0x69/0x280
> >> [   96.187382]  ? asm_exc_page_fault+0x8/0x30
> >> [   96.187879]  asm_exc_page_fault+0x1e/0x30
> >> [   96.188363] RIP: 0033:0x5594d0f78da4
> >> [   96.188829] Code: cc cc cc cc cc cc 48 8b 0d 39 d2 01 02 64 48 8b
> >> 09 48 3b 61 10 76 3d 48 83 ec 28 48 89 6c 24 20 48 8d 6c 24 20 48 8b
> >> 44 24 30 <48> 8b 08 48 89 0c 24 48 89 44 24 08 c6 44 24 10 01 e8 76 f4
> >> ff ff
> >> [   96.189894] RSP: 002b:000000c0001f7de8 EFLAGS: 00010216
> >> [   96.190406] RAX: 00005594d2308160 RBX: 0000000000000000 RCX: 000000c0004ea480
> >> [   96.190902] RDX: 000000c0002a4270 RSI: 0000000000000010 RDI: 0000000000000011
> >> [   96.191394] RBP: 000000c0001f7e08 R08: 0000000000000002 R09: 0000000000000011
> >> [   96.191891] R10: 00005594d22f6ce0 R11: 00005594d1ee96e4 R12: ffffffffffffffff
> >> [   96.192396] R13: 0000000000000028 R14: 0000000000000027 R15: 0000000000000200
> >> [  115.344546] exaust invoked oom-killer:
> >> gfp_mask=0x100dca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0,
> >> oom_score_adj=0
> >> [  115.346019] CPU: 1 PID: 969 Comm: exaust Tainted: G        W
> >>  5.13.2 #1
> >> [  115.346569] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> >> BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
> >> [  115.347712] Call Trace:
> >> [  115.348287]  dump_stack+0x82/0xa4
> >> [  115.348867]  dump_header+0x55/0x3f0
> >> [  115.349491]  oom_kill_process+0x160/0x210
> >> [  115.350068]  out_of_memory+0x10b/0x630
> >> [  115.350646]  __alloc_pages_slowpath.constprop.126+0xec7/0xfa0
> >> [  115.351255]  __alloc_pages+0x37d/0x400
> >> [  115.351861]  alloc_pages_vma+0x73/0x1d0
> >> [  115.352449]  __handle_mm_fault+0xe1b/0x1550
> >> [  115.353042]  handle_mm_fault+0x17f/0x420
> >> [  115.353646]  do_user_addr_fault+0x1be/0x770
> >> [  115.354239]  exc_page_fault+0x69/0x280
> >> [  115.354833]  ? asm_exc_page_fault+0x8/0x30
> >> [  115.355428]  asm_exc_page_fault+0x1e/0x30
> >> [  115.356021] RIP: 0033:0x7fe8ee633543
> >> [  115.356617] Code: Unable to access opcode bytes at RIP 0x7fe8ee633519.
> >> [  115.357245] RSP: 002b:00007fff742c68c8 EFLAGS: 00010206
> >> [  115.357857] RAX: 00007fe798041010 RBX: 0000000000000000 RCX: 00007fe7980fd000
> >> [  115.358469] RDX: 00007fe798141000 RSI: 0000000000000000 RDI: 00007fe798041010
> >> [  115.359077] RBP: 00007fff742c68e0 R08: 00000000ffffffff R09: 0000000000000000
> >> [  115.359686] R10: 0000000000000022 R11: 0000000000000246 R12: 0000561abe3fa060
> >> [  115.360300] R13: 00007fff742c69c0 R14: 0000000000000000 R15: 0000000000000000
> >> [  115.361313] Mem-Info:
> >> [  115.362285] active_anon:188386 inactive_anon:764572 isolated_anon:64
> >>                 active_file:61 inactive_file:0 isolated_file:0
> >>                 unevictable:0 dirty:0 writeback:2
> >>                 slab_reclaimable:6676 slab_unreclaimable:6200
> >>                 mapped:155 shmem:180 pagetables:3124 bounce:0
> >>                 free:25647 free_pcp:0 free_cma:0
> >> [  115.366889] Node 0 active_anon:782092kB inactive_anon:3029744kB
> >> active_file:244kB inactive_file:0kB unevictable:0kB
> >> isolated(anon):256kB isolated(file):0kB mapped:620kB dirty:0kB
> >> writeback:8kB shmem:720kB shmem_thp: 0kB shmem_pmdmapped: 0kB
> >> anon_thp: 0kB writeback_tmp:0kB kernel_stack:3904kB pagetables:12496kB
> >> all_unreclaimable? no
> >> [  115.369378] Node 0 DMA free:15296kB min:260kB low:324kB high:388kB
> >> reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB
> >> active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB
> >> present:15992kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB
> >> local_pcp:0kB free_cma:0kB
> >> [  115.372121] lowmem_reserve[]: 0 2925 3874 3874
> >> [  115.373195] Node 0 DMA32 free:54436kB min:50824kB low:63528kB
> >> high:76232kB reserved_highatomic:0KB active_anon:819596kB
> >> inactive_anon:2135980kB active_file:48kB inactive_file:24kB
> >> unevictable:0kB writepending:8kB present:3129212kB managed:3021488kB
> >> mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> >> [  115.375958] lowmem_reserve[]: 0 0 949 949
> >> [  115.377064] Node 0 Normal free:32856kB min:32880kB low:37004kB
> >> high:41128kB reserved_highatomic:0KB active_anon:2232kB
> >> inactive_anon:853384kB active_file:80kB inactive_file:76kB
> >> unevictable:0kB writepending:0kB present:1048576kB managed:972400kB
> >> mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> >> [  115.380050] lowmem_reserve[]: 0 0 0 0
> >> [  115.380908] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 1*64kB (U)
> >> 1*128kB (U) 1*256kB (U) 1*512kB (U) 0*1024kB 1*2048kB (M) 3*4096kB (M)
> >> = 15296kB
> >> [  115.382728] Node 0 DMA32: 86*4kB (UM) 52*8kB (UME) 33*16kB (UM)
> >> 28*32kB (UME) 11*64kB (U) 2*128kB (UE) 0*256kB 1*512kB (U) 2*1024kB
> >> (UM) 2*2048kB (ME) 11*4096kB (M) = 54856kB
> >> [  115.384560] Node 0 Normal: 597*4kB (UME) 904*8kB (UME) 414*16kB
> >> (UME) 187*32kB (UME) 64*64kB (UME) 19*128kB (UME) 3*256kB (U) 3*512kB
> >> (UE) 2*1024kB (M) 0*2048kB 0*4096kB = 33108kB
> >> [  115.386558] Node 0 hugepages_total=0 hugepages_free=0
> >> hugepages_surp=0 hugepages_size=1048576kB
> >> [  115.387543] Node 0 hugepages_total=0 hugepages_free=0
> >> hugepages_surp=0 hugepages_size=2048kB
> >> [  115.388503] 61467 total pagecache pages
> >> [  115.389532] 61242 pages in swap cache
> >> [  115.390450] Swap cache stats: add 523756, delete 462538, find 69/171
> >> [  115.391383] Free swap  = 0kB
> >> [  115.392312] Total swap = 2094076kB
> >> [  115.393506] 1048445 pages RAM
> >> [  115.394538] 0 pages HighMem/MovableOnly
> >> [  115.395514] 46133 pages reserved
> >> [  115.396525] 0 pages hwpoisoned
> >> [  115.397824] Tasks state (memory values in pages):
> >> [  115.398778] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes
> >> swapents oom_score_adj name
> >> [  115.399792] [    312]     0   312     6603      217    98304
> >> 181             0 systemd-journal
> >> [  115.400749] [    339]     0   339     5562        1    69632
> >> 395         -1000 systemd-udevd
> >> [  115.401869] [    379]   101   379    23271       17    81920
> >> 197             0 systemd-timesyn
> >> [  115.402873] [    388]   107   388     1707        7    53248
> >> 101             0 rpcbind
> >> [  115.403869] [    479]     0   479     2373       39    53248
> >> 284             0 dhclient
> >> [  115.404849] [    485]     0   485     2120        8    53248
> >> 43             0 cron
> >> [  115.406291] [    487]     0   487     4869       44    77824
> >> 197             0 systemd-logind
> >> [  115.407536] [    494]   104   494     2247       58    57344
> >> 80          -900 dbus-daemon
> >> [  115.408772] [    501]     0   501    56457        0    86016
> >> 226             0 rsyslogd
> >> [  115.410041] [    503]     0   503   315093      769   311296
> >> 3943             0 containerd
> >> [  115.411289] [    507]     0   507     1404        2    45056
> >> 26             0 agetty
> >> [  115.412563] [    521]     0   521    27588      235   110592
> >> 1678             0 unattended-upgr
> >> [  115.413834] [    860]     0   860     3964       28    73728
> >> 187         -1000 sshd
> >> [  115.415062] [    866]   106   866     5015       19    81920
> >> 187             0 exim4
> >> [  115.416286] [    901]     0   901     4233        1    73728
> >> 281             0 sshd
> >> [  115.417534] [    928]     0   928     5287       57    73728
> >> 280             0 systemd
> >> [  115.418755] [    929]     0   929     5710       77    90112
> >> 488             0 (sd-pam)
> >> [  115.419921] [    943]     0   943     2013        1    53248
> >> 407             0 bash
> >> [  115.421061] [    969]     0   969  1402762   888995 11276288
> >> 513120             0 exaust
> >> [  115.421973] [    970]     0   970     4233      272    77824
> >> 5             0 sshd
> >> [  115.422840] [    976]     0   976     1980      297    53248
> >> 64             0 bash
> >> [  115.423664] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-0.slice/session-1.scope,task=exaust,pid=969,uid=0
> >> [  115.425344] Out of memory: Killed process 969 (exaust)
> >> total-vm:5611048kB, anon-rss:3555976kB, file-rss:4kB, shmem-rss:0kB,
> >> UID:0 pgtables:11012kB oom_score_adj:0
> >> [  115.912696] oom_reaper: reaped process 969 (exaust), now
> >> anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> >>
> >>
> >> I can tell from above that:  [   96.177999]  #1: ffffffff82915020
> >> (rcu_read_lock){....}-{1:2}, at: get_swap_device+0x33/0x140
> >>  get_swap_device did help the rcu_read_lock
> >>
> >> Thanks
> >> Zhouyi
> >>
> >> On Mon, Jul 19, 2021 at 8:23 PM Oleksandr Natalenko
> >> <oleksandr@natalenko.name> wrote:
> >>>
> >>> On pondělí 19. července 2021 14:16:04 CEST Matthew Wilcox wrote:
> >>>> On Mon, Jul 19, 2021 at 02:12:15PM +0200, Oleksandr Natalenko wrote:
> >>>>> On pondělí 19. července 2021 14:08:37 CEST Miaohe Lin wrote:
> >>>>>> On 2021/7/19 19:59, Oleksandr Natalenko wrote:
> >>>>>>> On pondělí 19. července 2021 13:50:07 CEST Miaohe Lin wrote:
> >>>>>>>> On 2021/7/19 19:22, Matthew Wilcox wrote:
> >>>>>>>>> On Mon, Jul 19, 2021 at 07:12:58PM +0800, Miaohe Lin wrote:
> >>>>>>>>>> When in the commit 2799e77529c2a, we're using the percpu_ref to
> >>>>>>>>>> serialize
> >>>>>>>>>> against concurrent swapoff, i.e. there's percpu_ref inside
> >>>>>>>>>> get_swap_device() instead of rcu_read_lock(). Please see commit
> >>>>>>>>>> 63d8620ecf93 ("mm/swapfile: use percpu_ref to serialize against
> >>>>>>>>>> concurrent swapoff") for detail.
> >>>>>>>>>
> >>>>>>>>> Oh, so this is a backport problem.  2799e77529c2 was backported
> >>>>>>>>> without
> >>>>>>>>> its prerequisite 63d8620ecf93.  Greg, probably best to just drop
> >>>>>>>>
> >>>>>>>> Yes, they're posted as a patch set:
> >>>>>>>>
> >>>>>>>> https://lkml.kernel.org/r/20210426123316.806267-1-linmiaohe@huawei.co
> >>>>>>>> m
> >>>>>>>>
> >>>>>>>>> 2799e77529c2 from all stable trees; the race described is not very
> >>>>>>>>> important (swapoff vs reading a page back from that swap device).
> >>>>>>>>> .
> >>>>>>>>
> >>>>>>>> The swapoff races with reading a page back from that swap device
> >>>>>>>> should
> >>>>>>>> be
> >>>>>>>> really uncommon as most users only do swapoff when the system is
> >>>>>>>> going to
> >>>>>>>> shutdown.
> >>>>>>>>
> >>>>>>>> Sorry for the trouble!
> >>>>>>>
> >>>>>>> git log --oneline v5.13..v5.13.3 --author="Miaohe Lin"
> >>>>>>> 11ebc09e50dc mm/zswap.c: fix two bugs in zswap_writeback_entry()
> >>>>>>> 95d192da198d mm/z3fold: use release_z3fold_page_locked() to release
> >>>>>>> locked
> >>>>>>> z3fold page
> >>>>>>> ccb7848e2344 mm/z3fold: fix potential memory leak in
> >>>>>>> z3fold_destroy_pool()
> >>>>>>> 9f7229c901c1 mm/huge_memory.c: don't discard hugepage if other
> >>>>>>> processes
> >>>>>>> are mapping it
> >>>>>>> f13259175e4f mm/huge_memory.c: add missing read-only THP checking in
> >>>>>>> transparent_hugepage_enabled()
> >>>>>>> afafd371e7de mm/huge_memory.c: remove dedicated macro
> >>>>>>> HPAGE_CACHE_INDEX_MASK a533a21b692f mm/shmem: fix shmem_swapin() race
> >>>>>>> with swapoff
> >>>>>>> c3b39134bbd0 swap: fix do_swap_page() race with swapoff
> >>>>>>>
> >>>>>>> Do you suggest reverting "mm/shmem: fix shmem_swapin() race with
> >>>>>>> swapoff"
> >>>>>>> as well?
> >>>>>>
> >>>>>> This patch also rely on its prerequisite 63d8620ecf93. I think we should
> >>>>>> either revert any commit in this series or just backport the entire
> >>>>>> series.
> >>>>>
> >>>>> Then why not just pick up 2 more patches instead of dropping 2 patches.
> >>>>> Greg, could you please make sure the whole series from [1] gets pulled?
> >>>>
> >>>> Because none of these patches should have been backported in the first
> >>>> place.  It's just not worth the destabilisation.
> >>>
> >>> What about the rest then?
> >>>
> >>> git log --oneline v5.13..v5.13.3 -- mm/ | wc -l
> >>> 18
> >>>
> >>> Those look to be fixes, these ones too.
> >>>
> >>> --
> >>> Oleksandr Natalenko (post-factum)
> >>>
> >>>

Best Wishes
Zhouyi

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-22  8:57                                       ` Zhouyi Zhou
@ 2021-07-22 12:36                                         ` Matthew Wilcox
  2021-07-22 13:26                                           ` Greg KH
  2021-07-22 17:44                                           ` Zhouyi Zhou
  0 siblings, 2 replies; 34+ messages in thread
From: Matthew Wilcox @ 2021-07-22 12:36 UTC (permalink / raw)
  To: Zhouyi Zhou
  Cc: Chris Clayton, Oleksandr Natalenko, Miaohe Lin, Boqun Feng,
	paulmck, linux-kernel, stable, Chris Rankin, Josh Triplett,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes,
	rcu, Andrew Morton, Linux-MM, Huang, Ying, Greg KH

On Thu, Jul 22, 2021 at 04:57:57PM +0800, Zhouyi Zhou wrote:
> Thanks for reviewing,
> 
> What I have deduced from the dmesg  is:
> In function do_swap_page,
> after invoking
> 3385        si = get_swap_device(entry); /* rcu_read_lock */
> and before
> 3561    out:
> 3562        if (si)
> 3563            put_swap_device(si);
> The thread got scheduled out in
> 3454        locked = lock_page_or_retry(page, vma->vm_mm, vmf->flags);
> 
> I am only familiar with Linux RCU subsystem, hope mm people can solve our
> confusions.

I don't understamd why you're still talking.  The problem is understood.
You need to revert the unnecessary backport of 2799e77529c2 and
2efa33fc7f6e

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-22 12:36                                         ` Matthew Wilcox
@ 2021-07-22 13:26                                           ` Greg KH
  2021-07-22 14:00                                             ` Greg KH
  2021-07-22 17:44                                           ` Zhouyi Zhou
  1 sibling, 1 reply; 34+ messages in thread
From: Greg KH @ 2021-07-22 13:26 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Zhouyi Zhou, Chris Clayton, Oleksandr Natalenko, Miaohe Lin,
	Boqun Feng, paulmck, linux-kernel, stable, Chris Rankin,
	Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Joel Fernandes, rcu, Andrew Morton, Linux-MM, Huang, Ying

On Thu, Jul 22, 2021 at 01:36:02PM +0100, Matthew Wilcox wrote:
> On Thu, Jul 22, 2021 at 04:57:57PM +0800, Zhouyi Zhou wrote:
> > Thanks for reviewing,
> > 
> > What I have deduced from the dmesg  is:
> > In function do_swap_page,
> > after invoking
> > 3385        si = get_swap_device(entry); /* rcu_read_lock */
> > and before
> > 3561    out:
> > 3562        if (si)
> > 3563            put_swap_device(si);
> > The thread got scheduled out in
> > 3454        locked = lock_page_or_retry(page, vma->vm_mm, vmf->flags);
> > 
> > I am only familiar with Linux RCU subsystem, hope mm people can solve our
> > confusions.
> 
> I don't understamd why you're still talking.  The problem is understood.
> You need to revert the unnecessary backport of 2799e77529c2 and
> 2efa33fc7f6e

Sorry for the delay, will go do so in a minute...

greg k-h

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-22 13:26                                           ` Greg KH
@ 2021-07-22 14:00                                             ` Greg KH
  2021-07-23  1:51                                               ` Miaohe Lin
  0 siblings, 1 reply; 34+ messages in thread
From: Greg KH @ 2021-07-22 14:00 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Zhouyi Zhou, Chris Clayton, Oleksandr Natalenko, Miaohe Lin,
	Boqun Feng, paulmck, linux-kernel, stable, Chris Rankin,
	Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Joel Fernandes, rcu, Andrew Morton, Linux-MM, Huang, Ying

On Thu, Jul 22, 2021 at 03:26:52PM +0200, Greg KH wrote:
> On Thu, Jul 22, 2021 at 01:36:02PM +0100, Matthew Wilcox wrote:
> > On Thu, Jul 22, 2021 at 04:57:57PM +0800, Zhouyi Zhou wrote:
> > > Thanks for reviewing,
> > > 
> > > What I have deduced from the dmesg  is:
> > > In function do_swap_page,
> > > after invoking
> > > 3385        si = get_swap_device(entry); /* rcu_read_lock */
> > > and before
> > > 3561    out:
> > > 3562        if (si)
> > > 3563            put_swap_device(si);
> > > The thread got scheduled out in
> > > 3454        locked = lock_page_or_retry(page, vma->vm_mm, vmf->flags);
> > > 
> > > I am only familiar with Linux RCU subsystem, hope mm people can solve our
> > > confusions.
> > 
> > I don't understamd why you're still talking.  The problem is understood.
> > You need to revert the unnecessary backport of 2799e77529c2 and
> > 2efa33fc7f6e
> 
> Sorry for the delay, will go do so in a minute...

Both now reverted from 5.10.y and 5.13.y.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-22 14:00                                             ` Greg KH
@ 2021-07-23  1:51                                               ` Miaohe Lin
  2021-07-23  7:02                                                 ` Greg KH
  0 siblings, 1 reply; 34+ messages in thread
From: Miaohe Lin @ 2021-07-23  1:51 UTC (permalink / raw)
  To: Greg KH, Matthew Wilcox
  Cc: Zhouyi Zhou, Chris Clayton, Oleksandr Natalenko, Boqun Feng,
	paulmck, linux-kernel, stable, Chris Rankin, Josh Triplett,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes,
	rcu, Andrew Morton, Linux-MM, Huang, Ying

On 2021/7/22 22:00, Greg KH wrote:
> On Thu, Jul 22, 2021 at 03:26:52PM +0200, Greg KH wrote:
>> On Thu, Jul 22, 2021 at 01:36:02PM +0100, Matthew Wilcox wrote:
>>> On Thu, Jul 22, 2021 at 04:57:57PM +0800, Zhouyi Zhou wrote:
>>>> Thanks for reviewing,
>>>>
>>>> What I have deduced from the dmesg  is:
>>>> In function do_swap_page,
>>>> after invoking
>>>> 3385        si = get_swap_device(entry); /* rcu_read_lock */
>>>> and before
>>>> 3561    out:
>>>> 3562        if (si)
>>>> 3563            put_swap_device(si);
>>>> The thread got scheduled out in
>>>> 3454        locked = lock_page_or_retry(page, vma->vm_mm, vmf->flags);
>>>>
>>>> I am only familiar with Linux RCU subsystem, hope mm people can solve our
>>>> confusions.
>>>
>>> I don't understamd why you're still talking.  The problem is understood.
>>> You need to revert the unnecessary backport of 2799e77529c2 and
>>> 2efa33fc7f6e
>>
>> Sorry for the delay, will go do so in a minute...
> 
> Both now reverted from 5.10.y and 5.13.y.
> 

I browsed my previous backport notifying email and found that these two patches are also
backported into 5.12. And it seems it's missed.

Thanks.

> thanks,
> 
> greg k-h
> .
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-23  1:51                                               ` Miaohe Lin
@ 2021-07-23  7:02                                                 ` Greg KH
  2021-07-23  7:13                                                   ` Miaohe Lin
  0 siblings, 1 reply; 34+ messages in thread
From: Greg KH @ 2021-07-23  7:02 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: Matthew Wilcox, Zhouyi Zhou, Chris Clayton, Oleksandr Natalenko,
	Boqun Feng, paulmck, linux-kernel, stable, Chris Rankin,
	Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Joel Fernandes, rcu, Andrew Morton, Linux-MM, Huang, Ying

On Fri, Jul 23, 2021 at 09:51:09AM +0800, Miaohe Lin wrote:
> On 2021/7/22 22:00, Greg KH wrote:
> > On Thu, Jul 22, 2021 at 03:26:52PM +0200, Greg KH wrote:
> >> On Thu, Jul 22, 2021 at 01:36:02PM +0100, Matthew Wilcox wrote:
> >>> On Thu, Jul 22, 2021 at 04:57:57PM +0800, Zhouyi Zhou wrote:
> >>>> Thanks for reviewing,
> >>>>
> >>>> What I have deduced from the dmesg  is:
> >>>> In function do_swap_page,
> >>>> after invoking
> >>>> 3385        si = get_swap_device(entry); /* rcu_read_lock */
> >>>> and before
> >>>> 3561    out:
> >>>> 3562        if (si)
> >>>> 3563            put_swap_device(si);
> >>>> The thread got scheduled out in
> >>>> 3454        locked = lock_page_or_retry(page, vma->vm_mm, vmf->flags);
> >>>>
> >>>> I am only familiar with Linux RCU subsystem, hope mm people can solve our
> >>>> confusions.
> >>>
> >>> I don't understamd why you're still talking.  The problem is understood.
> >>> You need to revert the unnecessary backport of 2799e77529c2 and
> >>> 2efa33fc7f6e
> >>
> >> Sorry for the delay, will go do so in a minute...
> > 
> > Both now reverted from 5.10.y and 5.13.y.
> > 
> 
> I browsed my previous backport notifying email and found that these two patches are also
> backported into 5.12. And it seems it's missed.

5.12 is now end-of-life, it's not being touched anymore, and no one
should continue to use it.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-23  7:02                                                 ` Greg KH
@ 2021-07-23  7:13                                                   ` Miaohe Lin
  0 siblings, 0 replies; 34+ messages in thread
From: Miaohe Lin @ 2021-07-23  7:13 UTC (permalink / raw)
  To: Greg KH
  Cc: Matthew Wilcox, Zhouyi Zhou, Chris Clayton, Oleksandr Natalenko,
	Boqun Feng, paulmck, linux-kernel, stable, Chris Rankin,
	Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Joel Fernandes, rcu, Andrew Morton, Linux-MM, Huang, Ying

On 2021/7/23 15:02, Greg KH wrote:
> On Fri, Jul 23, 2021 at 09:51:09AM +0800, Miaohe Lin wrote:
>> On 2021/7/22 22:00, Greg KH wrote:
>>> On Thu, Jul 22, 2021 at 03:26:52PM +0200, Greg KH wrote:
>>>> On Thu, Jul 22, 2021 at 01:36:02PM +0100, Matthew Wilcox wrote:
>>>>> On Thu, Jul 22, 2021 at 04:57:57PM +0800, Zhouyi Zhou wrote:
>>>>>> Thanks for reviewing,
>>>>>>
>>>>>> What I have deduced from the dmesg  is:
>>>>>> In function do_swap_page,
>>>>>> after invoking
>>>>>> 3385        si = get_swap_device(entry); /* rcu_read_lock */
>>>>>> and before
>>>>>> 3561    out:
>>>>>> 3562        if (si)
>>>>>> 3563            put_swap_device(si);
>>>>>> The thread got scheduled out in
>>>>>> 3454        locked = lock_page_or_retry(page, vma->vm_mm, vmf->flags);
>>>>>>
>>>>>> I am only familiar with Linux RCU subsystem, hope mm people can solve our
>>>>>> confusions.
>>>>>
>>>>> I don't understamd why you're still talking.  The problem is understood.
>>>>> You need to revert the unnecessary backport of 2799e77529c2 and
>>>>> 2efa33fc7f6e
>>>>
>>>> Sorry for the delay, will go do so in a minute...
>>>
>>> Both now reverted from 5.10.y and 5.13.y.
>>>
>>
>> I browsed my previous backport notifying email and found that these two patches are also
>> backported into 5.12. And it seems it's missed.
> 
> 5.12 is now end-of-life, it's not being touched anymore, and no one
> should continue to use it.
> 
> thanks,

I see! Many thanks for your kindly explanation! :)

> 
> greg k-h
> .
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-22 12:36                                         ` Matthew Wilcox
  2021-07-22 13:26                                           ` Greg KH
@ 2021-07-22 17:44                                           ` Zhouyi Zhou
  1 sibling, 0 replies; 34+ messages in thread
From: Zhouyi Zhou @ 2021-07-22 17:44 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Chris Clayton, Oleksandr Natalenko, Miaohe Lin, Boqun Feng,
	paulmck, linux-kernel, stable, Chris Rankin, Josh Triplett,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes,
	rcu, Andrew Morton, Linux-MM, Huang, Ying, Greg KH

I apologize sincerely for my irresponsible and hasty email.
I reverted the unnecessary backport of 2799e77529c2 and 2efa33fc7f6e,
tested on the same qemu box as before with the same C program, there
is no warning about RCU this time. dmesg only shows the backtrace of
OOM kill.

As for memory OOMs caused by grace period's undue ends, I found each
deletion of a inode will cause a leak.
1035    void security_inode_free(struct inode *inode)
1036    {
1037        integrity_inode_free(inode);
1038        call_void_hook(inode_free_security, inode);
1039        /*
1040         * The inode may still be referenced in a path walk and
1041         * a call to security_inode_permission() can be made
1042         * after inode_free_security() is called. Ideally, the VFS
1043         * wouldn't do this, but fixing that is a much harder
1044         * job. For now, simply free the i_security via RCU, and
1045         * leave the current inode->i_security pointer intact.
1046         * The inode will be freed after the RCU grace period too.
1047         */
1048        if (inode->i_security)
1049            call_rcu((struct rcu_head *)inode->i_security,
1050                    inode_free_by_rcu);
1051    }

I am willing to do any experiment if there is a need.

Sorry again
Best Wishes
Zhouyi

On Thu, Jul 22, 2021 at 8:36 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Thu, Jul 22, 2021 at 04:57:57PM +0800, Zhouyi Zhou wrote:
> > Thanks for reviewing,
> >
> > What I have deduced from the dmesg  is:
> > In function do_swap_page,
> > after invoking
> > 3385        si = get_swap_device(entry); /* rcu_read_lock */
> > and before
> > 3561    out:
> > 3562        if (si)
> > 3563            put_swap_device(si);
> > The thread got scheduled out in
> > 3454        locked = lock_page_or_retry(page, vma->vm_mm, vmf->flags);
> >
> > I am only familiar with Linux RCU subsystem, hope mm people can solve our
> > confusions.
>
> I don't understamd why you're still talking.  The problem is understood.
> You need to revert the unnecessary backport of 2799e77529c2 and
> 2efa33fc7f6e

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-22  7:30                                     ` Chris Clayton
  2021-07-22  8:57                                       ` Zhouyi Zhou
@ 2021-07-22 14:05                                       ` Paul E. McKenney
  1 sibling, 0 replies; 34+ messages in thread
From: Paul E. McKenney @ 2021-07-22 14:05 UTC (permalink / raw)
  To: Chris Clayton
  Cc: Zhouyi Zhou, Oleksandr Natalenko, Matthew Wilcox, Miaohe Lin,
	Boqun Feng, linux-kernel, stable, Chris Rankin, Josh Triplett,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes,
	rcu, Andrew Morton, Linux-MM, Huang, Ying, Greg KH

On Thu, Jul 22, 2021 at 08:30:10AM +0100, Chris Clayton wrote:
> On 19/07/2021 17:56, Zhouyi Zhou wrote:
> > Attached is my kernel configuration file (config-5.13.2), dmesg (dmesg.txt)
> > My qemu box's memory layout is:
> >                     total        used        free      shared
> > buff/cache   available
> > Mem:           3915          92        3764           0          58        3676
> > Swap:          2044          39        2005
> > The memory exhausting program:
> > #include <stdlib.h>
> > int main()
> > {
> >         while(1)
> >         {
> >         void *ptr =
> >                 malloc(1024*1024);
> >          memset(ptr, 0, 1024*1024);
> >         }
> >         return 0;
> > }
> > 
> 
> I'm not sure what I'm supposed to make of this. Is it that my laptop simply ran out of memory? That would be odd
> because, as I said earlier, I was simply doing a weekly archive of my system. That involved using the dar backup utility
> to create the incremental backup (which in this instance resulted in a single archive file), compressing that archive
> with xz and copying (again one at a time) it and a directory containing about 7000 source and binary rpm files from an
> internal HDD to and external HDD connected via USB
> 
> My laptop, which I've had for a little over 3 months, has 32GB of memory. I have been using dar for a few years now (and
> the version currently installed since May) and have had no trouble with it. Similarly, I've used xz to compress the
> backup archives for year and the version currently installed since December 2020. The (incremental) archive file is
> about 7GB and 1.2GB when compressed. If copying files, the largest of which is 1.2GB, serially can cause 32GB fn RAM to
> be exhausted , then we are all up the creek. So I don't see where in my normal archiving process, the memory exhaustion
> would arise from (but I'm happy to be educated on this).
> 
> Or is the memory exhaustion likely to be a side effect of the fact that two patches applied in 5.13.2 are missing a
> pre-requisite? If that's the case there, seems to no disagreement on whether the two patches (which had not been tagged
> for stable) should be reverted or the missing prerequisite should be applied (along with another, related patch that was
> in the patch set). Perhaps the stable and mm teams are resolving this issue behind the scenes, but in the meantime I
> have backported the missing patches to 5.13.4 and 5.10.52 and am currently running the former as my default kernel.

Indeed, one of the possible side effects of omitting an rcu_read_unlock()
is running out of memory.  This is because RCU never sees a quiescent
state from the task that failed to execute rcu_read_unlock(), which
in turn means that RCU can never again free any memory, which in
turn of course means that your system will sooner or later exhaust
memory.  Greg's revert should restore the balance of rcu_read_lock()
and rcu_read_unlock(), which will eliminate this particular cause of
memory exhaustion.

							Thanx, Paul

> I also have a patch that reverts the two patches that were applied to stable, so could run with that applied if it would
> be more helpful. It would, of course, leave open the races that the patches are designed to close., but if I've manually
> run swapoff more than once or twice in the twenty years I've been using Linux-based systems, I'd be very surprised.
> 
> > On Tue, Jul 20, 2021 at 12:47 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> >>
> >> I downloaded linux-5.13.2, configure and compile the kernel with
> >> CONFIG_LOCKDEP=y
> >> CONFIG_LOCK_DEBUGGING_SUPPORT=y
> >> CONFIG_PROVE_LOCKING=y
> >> CONFIG_DEBUG_SPINLOCK=y
> >> CONFIG_DEBUG_LOCK_ALLOC=y
> >> install the kernel on a qemu box, then run following C Program
> >> #include <stdlib.h>
> >> int main()
> >> {
> >>         while(1)
> >>         {
> >>                 malloc(1024*1024);
> >>         }
> >>         return 0;
> >> }
> >> And following is the dmesg:
> >> [   96.155017] ------------[ cut here ]------------
> >> [   96.155030] WARNING: CPU: 10 PID: 770 at
> >> kernel/rcu/tree_plugin.h:359 rcu_note_context_switch+0x91/0x610
> >> [   96.155074] Modules linked in: ppdev intel_rapl_msr
> >> intel_rapl_common crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
> >> bochs_drm aesni_intel drm_vram_helper evdev crypto_simd drm_ttm_helper
> >> snd_pcm cryptd ttm snd_timer parport_pc serio_raw drm_kms_helper sg
> >> snd parport soundcore drm pcspkr button ip_tables autofs4 psmouse
> >> sr_mod i2c_piix4 sd_mod crc32c_intel t10_pi cdrom i2c_core e1000
> >> ata_generic floppy
> >> [   96.155180] CPU: 10 PID: 770 Comm: containerd Not tainted 5.13.2 #1
> >> [   96.155185] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> >> BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
> >> [   96.155189] RIP: 0010:rcu_note_context_switch+0x91/0x610
> >> [   96.155195] Code: ff 74 0f 65 8b 05 a7 10 e5 7e 85 c0 0f 84 22 01
> >> 00 00 45 84 ed 75 15 65 48 8b 04 25 80 7f 01 00 8b b0 44 03 00 00 85
> >> f6 7e 02 <0f> 0b 65 48 8b 04 25 80 7f 01 00 8b 88 44 03 00 00 85 c9 7e
> >> 0f 41
> >> [   96.155200] RSP: 0000:ffffc90000a8bc50 EFLAGS: 00010002
> >> [   96.155204] RAX: ffff88810830c300 RBX: ffff88813bcae680 RCX: 0000000000000000
> >> [   96.155208] RDX: 0000000000000002 RSI: 0000000000000001 RDI: 0000000000000001
> >> [   96.155210] RBP: ffffc90000a8bcd0 R08: 0000000000000001 R09: 0000000000000001
> >> [   96.155213] R10: 0000000000000000 R11: ffffffff81319b93 R12: ffff88810830c300
> >> [   96.155216] R13: 0000000000000000 R14: ffff88813bcad958 R15: 0000000000004970
> >> [   96.155220] FS:  00007fc09cff9700(0000) GS:ffff88813bc80000(0000)
> >> knlGS:0000000000000000
> >> [   96.155223] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> [   96.155226] CR2: 00005594d2ff0530 CR3: 0000000103be4001 CR4: 0000000000060ee0
> >> [   96.155233] Call Trace:
> >> [   96.155243]  __schedule+0xda/0xa30
> >> [   96.155283]  schedule+0x46/0xf0
> >> [   96.155289]  io_schedule+0x12/0x40
> >> [   96.155295]  __lock_page_or_retry+0x1f9/0x510
> >> [   96.155319]  ? __page_cache_alloc+0x140/0x140
> >> [   96.155345]  do_swap_page+0x33f/0x930
> >> [   96.155364]  __handle_mm_fault+0xa54/0x1550
> >> [   96.155390]  handle_mm_fault+0x17f/0x420
> >> [   96.155400]  do_user_addr_fault+0x1be/0x770
> >> [   96.155422]  exc_page_fault+0x69/0x280
> >> [   96.155435]  ? asm_exc_page_fault+0x8/0x30
> >> [   96.155443]  asm_exc_page_fault+0x1e/0x30
> >> [   96.155448] RIP: 0033:0x5594d15ec98f
> >> [   96.155454] Code: 44 24 58 48 85 c0 48 b9 00 e4 0b 54 02 00 00 00
> >> 48 0f 44 c1 48 89 44 24 58 eb 05 48 8b 44 24 58 48 89 04 24 e8 e2 c9
> >> 9c ff 90 <48> 8b 05 9a 3b a0 01 48 8b 4c 24 50 48 89 0c 24 48 8d 15 9a
> >> c4 ec
> >> [   96.155457] RSP: 002b:000000c0001f7f80 EFLAGS: 00010206
> >> [   96.155462] RAX: 0000000000000000 RBX: 00005594d0f9cf55 RCX: fffffffffffffff8
> >> [   96.155465] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00005594d0fa2d01
> >> [   96.155467] RBP: 000000c0001f7fc0 R08: 0000000000000000 R09: 0000000000000000
> >> [   96.155470] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000004
> >> [   96.155472] R13: 0000000000000013 R14: 00005594d1ee96e6 R15: 0000000000000039
> >> [   96.155495] irq event stamp: 10464
> >> [   96.155497] hardirqs last  enabled at (10463): [<ffffffff81c0a574>]
> >> _raw_spin_unlock_irq+0x24/0x50
> >> [   96.155508] hardirqs last disabled at (10464): [<ffffffff81c01882>]
> >> __schedule+0x412/0xa30
> >> [   96.155512] softirqs last  enabled at (8668): [<ffffffff82000401>]
> >> __do_softirq+0x401/0x51b
> >> [   96.155517] softirqs last disabled at (8657): [<ffffffff81129c22>]
> >> irq_exit_rcu+0x142/0x150
> >> [   96.155531] ---[ end trace 165ff31fd86ffc12 ]---
> >>
> >> [   96.177669] =============================
> >> [   96.177693] [ BUG: Invalid wait context ]
> >> [   96.177717] 5.13.2 #1 Tainted: G        W
> >> [   96.177743] -----------------------------
> >> [   96.177765] containerd/770 is trying to lock:
> >> [   96.177790] ffff88813ba69b30 (&cache->alloc_lock){+.+.}-{3:3}, at:
> >> get_swap_page+0x126/0x200
> >> [   96.177867] other info that might help us debug this:
> >> [   96.177894] context-{4:4}
> >> [   96.177910] 3 locks held by containerd/770:
> >> [   96.177934]  #0: ffff88810815ea28 (&mm->mmap_lock#2){++++}-{3:3},
> >> at: do_user_addr_fault+0x115/0x770
> >> [   96.177999]  #1: ffffffff82915020 (rcu_read_lock){....}-{1:2}, at:
> >> get_swap_device+0x33/0x140
> >> [   96.178057]  #2: ffffffff82955ba0 (fs_reclaim){+.+.}-{0:0}, at:
> >> __fs_reclaim_acquire+0x5/0x30
> >> [   96.178115] stack backtrace:
> >> [   96.178133] CPU: 1 PID: 770 Comm: containerd Tainted: G        W
> >>      5.13.2 #1
> >> [   96.178183] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> >> BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
> >> [   96.178254] Call Trace:
> >> [   96.178274]  dump_stack+0x82/0xa4
> >> [   96.178314]  __lock_acquire+0x9a5/0x20a0
> >> [   96.178347]  ? find_held_lock+0x3a/0xb0
> >> [   96.178381]  lock_acquire+0xe9/0x320
> >> [   96.178405]  ? get_swap_page+0x126/0x200
> >> [   96.178433]  ? _raw_spin_unlock+0x29/0x40
> >> [   96.178461]  ? page_vma_mapped_walk+0x3a3/0x960
> >> [   96.178491]  __mutex_lock+0x99/0x980
> >> [   96.178515]  ? get_swap_page+0x126/0x200
> >> [   96.178541]  ? find_held_lock+0x3a/0xb0
> >> [   96.178577]  ? get_swap_page+0x126/0x200
> >> [   96.178603]  ? page_referenced+0xd5/0x170
> >> [   96.178637]  ? lock_release+0x1b4/0x300
> >> [   96.178663]  ? get_swap_page+0x126/0x200
> >> [   96.178698]  get_swap_page+0x126/0x200
> >> [   96.178725]  add_to_swap+0x14/0x60
> >> [   96.178749]  shrink_page_list+0xb13/0xe70
> >> [   96.178787]  shrink_inactive_list+0x243/0x550
> >> [   96.178819]  shrink_lruvec+0x4fd/0x780
> >> [   96.178849]  ? shrink_node+0x257/0x7c0
> >> [   96.178873]  shrink_node+0x257/0x7c0
> >> [   96.178900]  do_try_to_free_pages+0xdd/0x410
> >> [   96.178929]  try_to_free_pages+0x110/0x300
> >> [   96.178966]  __alloc_pages_slowpath.constprop.126+0x2ae/0xfa0
> >> [   96.179002]  ? lock_release+0x1b4/0x300
> >> [   96.179028]  __alloc_pages+0x37d/0x400
> >> [   96.179054]  alloc_pages_vma+0x73/0x1d0
> >> [   96.179878]  __read_swap_cache_async+0xb8/0x280
> >> [   96.180684]  swap_cluster_readahead+0x194/0x270
> >> [   96.181459]  ? swapin_readahead+0x62/0x530
> >> [   96.182008]  swapin_readahead+0x62/0x530
> >> [   96.182558]  ? find_held_lock+0x3a/0xb0
> >> [   96.183109]  ? lookup_swap_cache+0x5c/0x1c0
> >> [   96.183657]  ? lock_release+0x1b4/0x300
> >> [   96.184207]  ? do_swap_page+0x232/0x930
> >> [   96.184753]  do_swap_page+0x232/0x930
> >> [   96.185326]  __handle_mm_fault+0xa54/0x1550
> >> [   96.185850]  handle_mm_fault+0x17f/0x420
> >> [   96.186361]  do_user_addr_fault+0x1be/0x770
> >> [   96.186880]  exc_page_fault+0x69/0x280
> >> [   96.187382]  ? asm_exc_page_fault+0x8/0x30
> >> [   96.187879]  asm_exc_page_fault+0x1e/0x30
> >> [   96.188363] RIP: 0033:0x5594d0f78da4
> >> [   96.188829] Code: cc cc cc cc cc cc 48 8b 0d 39 d2 01 02 64 48 8b
> >> 09 48 3b 61 10 76 3d 48 83 ec 28 48 89 6c 24 20 48 8d 6c 24 20 48 8b
> >> 44 24 30 <48> 8b 08 48 89 0c 24 48 89 44 24 08 c6 44 24 10 01 e8 76 f4
> >> ff ff
> >> [   96.189894] RSP: 002b:000000c0001f7de8 EFLAGS: 00010216
> >> [   96.190406] RAX: 00005594d2308160 RBX: 0000000000000000 RCX: 000000c0004ea480
> >> [   96.190902] RDX: 000000c0002a4270 RSI: 0000000000000010 RDI: 0000000000000011
> >> [   96.191394] RBP: 000000c0001f7e08 R08: 0000000000000002 R09: 0000000000000011
> >> [   96.191891] R10: 00005594d22f6ce0 R11: 00005594d1ee96e4 R12: ffffffffffffffff
> >> [   96.192396] R13: 0000000000000028 R14: 0000000000000027 R15: 0000000000000200
> >> [  115.344546] exaust invoked oom-killer:
> >> gfp_mask=0x100dca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0,
> >> oom_score_adj=0
> >> [  115.346019] CPU: 1 PID: 969 Comm: exaust Tainted: G        W
> >>  5.13.2 #1
> >> [  115.346569] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> >> BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
> >> [  115.347712] Call Trace:
> >> [  115.348287]  dump_stack+0x82/0xa4
> >> [  115.348867]  dump_header+0x55/0x3f0
> >> [  115.349491]  oom_kill_process+0x160/0x210
> >> [  115.350068]  out_of_memory+0x10b/0x630
> >> [  115.350646]  __alloc_pages_slowpath.constprop.126+0xec7/0xfa0
> >> [  115.351255]  __alloc_pages+0x37d/0x400
> >> [  115.351861]  alloc_pages_vma+0x73/0x1d0
> >> [  115.352449]  __handle_mm_fault+0xe1b/0x1550
> >> [  115.353042]  handle_mm_fault+0x17f/0x420
> >> [  115.353646]  do_user_addr_fault+0x1be/0x770
> >> [  115.354239]  exc_page_fault+0x69/0x280
> >> [  115.354833]  ? asm_exc_page_fault+0x8/0x30
> >> [  115.355428]  asm_exc_page_fault+0x1e/0x30
> >> [  115.356021] RIP: 0033:0x7fe8ee633543
> >> [  115.356617] Code: Unable to access opcode bytes at RIP 0x7fe8ee633519.
> >> [  115.357245] RSP: 002b:00007fff742c68c8 EFLAGS: 00010206
> >> [  115.357857] RAX: 00007fe798041010 RBX: 0000000000000000 RCX: 00007fe7980fd000
> >> [  115.358469] RDX: 00007fe798141000 RSI: 0000000000000000 RDI: 00007fe798041010
> >> [  115.359077] RBP: 00007fff742c68e0 R08: 00000000ffffffff R09: 0000000000000000
> >> [  115.359686] R10: 0000000000000022 R11: 0000000000000246 R12: 0000561abe3fa060
> >> [  115.360300] R13: 00007fff742c69c0 R14: 0000000000000000 R15: 0000000000000000
> >> [  115.361313] Mem-Info:
> >> [  115.362285] active_anon:188386 inactive_anon:764572 isolated_anon:64
> >>                 active_file:61 inactive_file:0 isolated_file:0
> >>                 unevictable:0 dirty:0 writeback:2
> >>                 slab_reclaimable:6676 slab_unreclaimable:6200
> >>                 mapped:155 shmem:180 pagetables:3124 bounce:0
> >>                 free:25647 free_pcp:0 free_cma:0
> >> [  115.366889] Node 0 active_anon:782092kB inactive_anon:3029744kB
> >> active_file:244kB inactive_file:0kB unevictable:0kB
> >> isolated(anon):256kB isolated(file):0kB mapped:620kB dirty:0kB
> >> writeback:8kB shmem:720kB shmem_thp: 0kB shmem_pmdmapped: 0kB
> >> anon_thp: 0kB writeback_tmp:0kB kernel_stack:3904kB pagetables:12496kB
> >> all_unreclaimable? no
> >> [  115.369378] Node 0 DMA free:15296kB min:260kB low:324kB high:388kB
> >> reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB
> >> active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB
> >> present:15992kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB
> >> local_pcp:0kB free_cma:0kB
> >> [  115.372121] lowmem_reserve[]: 0 2925 3874 3874
> >> [  115.373195] Node 0 DMA32 free:54436kB min:50824kB low:63528kB
> >> high:76232kB reserved_highatomic:0KB active_anon:819596kB
> >> inactive_anon:2135980kB active_file:48kB inactive_file:24kB
> >> unevictable:0kB writepending:8kB present:3129212kB managed:3021488kB
> >> mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> >> [  115.375958] lowmem_reserve[]: 0 0 949 949
> >> [  115.377064] Node 0 Normal free:32856kB min:32880kB low:37004kB
> >> high:41128kB reserved_highatomic:0KB active_anon:2232kB
> >> inactive_anon:853384kB active_file:80kB inactive_file:76kB
> >> unevictable:0kB writepending:0kB present:1048576kB managed:972400kB
> >> mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> >> [  115.380050] lowmem_reserve[]: 0 0 0 0
> >> [  115.380908] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 1*64kB (U)
> >> 1*128kB (U) 1*256kB (U) 1*512kB (U) 0*1024kB 1*2048kB (M) 3*4096kB (M)
> >> = 15296kB
> >> [  115.382728] Node 0 DMA32: 86*4kB (UM) 52*8kB (UME) 33*16kB (UM)
> >> 28*32kB (UME) 11*64kB (U) 2*128kB (UE) 0*256kB 1*512kB (U) 2*1024kB
> >> (UM) 2*2048kB (ME) 11*4096kB (M) = 54856kB
> >> [  115.384560] Node 0 Normal: 597*4kB (UME) 904*8kB (UME) 414*16kB
> >> (UME) 187*32kB (UME) 64*64kB (UME) 19*128kB (UME) 3*256kB (U) 3*512kB
> >> (UE) 2*1024kB (M) 0*2048kB 0*4096kB = 33108kB
> >> [  115.386558] Node 0 hugepages_total=0 hugepages_free=0
> >> hugepages_surp=0 hugepages_size=1048576kB
> >> [  115.387543] Node 0 hugepages_total=0 hugepages_free=0
> >> hugepages_surp=0 hugepages_size=2048kB
> >> [  115.388503] 61467 total pagecache pages
> >> [  115.389532] 61242 pages in swap cache
> >> [  115.390450] Swap cache stats: add 523756, delete 462538, find 69/171
> >> [  115.391383] Free swap  = 0kB
> >> [  115.392312] Total swap = 2094076kB
> >> [  115.393506] 1048445 pages RAM
> >> [  115.394538] 0 pages HighMem/MovableOnly
> >> [  115.395514] 46133 pages reserved
> >> [  115.396525] 0 pages hwpoisoned
> >> [  115.397824] Tasks state (memory values in pages):
> >> [  115.398778] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes
> >> swapents oom_score_adj name
> >> [  115.399792] [    312]     0   312     6603      217    98304
> >> 181             0 systemd-journal
> >> [  115.400749] [    339]     0   339     5562        1    69632
> >> 395         -1000 systemd-udevd
> >> [  115.401869] [    379]   101   379    23271       17    81920
> >> 197             0 systemd-timesyn
> >> [  115.402873] [    388]   107   388     1707        7    53248
> >> 101             0 rpcbind
> >> [  115.403869] [    479]     0   479     2373       39    53248
> >> 284             0 dhclient
> >> [  115.404849] [    485]     0   485     2120        8    53248
> >> 43             0 cron
> >> [  115.406291] [    487]     0   487     4869       44    77824
> >> 197             0 systemd-logind
> >> [  115.407536] [    494]   104   494     2247       58    57344
> >> 80          -900 dbus-daemon
> >> [  115.408772] [    501]     0   501    56457        0    86016
> >> 226             0 rsyslogd
> >> [  115.410041] [    503]     0   503   315093      769   311296
> >> 3943             0 containerd
> >> [  115.411289] [    507]     0   507     1404        2    45056
> >> 26             0 agetty
> >> [  115.412563] [    521]     0   521    27588      235   110592
> >> 1678             0 unattended-upgr
> >> [  115.413834] [    860]     0   860     3964       28    73728
> >> 187         -1000 sshd
> >> [  115.415062] [    866]   106   866     5015       19    81920
> >> 187             0 exim4
> >> [  115.416286] [    901]     0   901     4233        1    73728
> >> 281             0 sshd
> >> [  115.417534] [    928]     0   928     5287       57    73728
> >> 280             0 systemd
> >> [  115.418755] [    929]     0   929     5710       77    90112
> >> 488             0 (sd-pam)
> >> [  115.419921] [    943]     0   943     2013        1    53248
> >> 407             0 bash
> >> [  115.421061] [    969]     0   969  1402762   888995 11276288
> >> 513120             0 exaust
> >> [  115.421973] [    970]     0   970     4233      272    77824
> >> 5             0 sshd
> >> [  115.422840] [    976]     0   976     1980      297    53248
> >> 64             0 bash
> >> [  115.423664] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-0.slice/session-1.scope,task=exaust,pid=969,uid=0
> >> [  115.425344] Out of memory: Killed process 969 (exaust)
> >> total-vm:5611048kB, anon-rss:3555976kB, file-rss:4kB, shmem-rss:0kB,
> >> UID:0 pgtables:11012kB oom_score_adj:0
> >> [  115.912696] oom_reaper: reaped process 969 (exaust), now
> >> anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> >>
> >>
> >> I can tell from above that:  [   96.177999]  #1: ffffffff82915020
> >> (rcu_read_lock){....}-{1:2}, at: get_swap_device+0x33/0x140
> >>  get_swap_device did help the rcu_read_lock
> >>
> >> Thanks
> >> Zhouyi
> >>
> >> On Mon, Jul 19, 2021 at 8:23 PM Oleksandr Natalenko
> >> <oleksandr@natalenko.name> wrote:
> >>>
> >>> On pondělí 19. července 2021 14:16:04 CEST Matthew Wilcox wrote:
> >>>> On Mon, Jul 19, 2021 at 02:12:15PM +0200, Oleksandr Natalenko wrote:
> >>>>> On pondělí 19. července 2021 14:08:37 CEST Miaohe Lin wrote:
> >>>>>> On 2021/7/19 19:59, Oleksandr Natalenko wrote:
> >>>>>>> On pondělí 19. července 2021 13:50:07 CEST Miaohe Lin wrote:
> >>>>>>>> On 2021/7/19 19:22, Matthew Wilcox wrote:
> >>>>>>>>> On Mon, Jul 19, 2021 at 07:12:58PM +0800, Miaohe Lin wrote:
> >>>>>>>>>> When in the commit 2799e77529c2a, we're using the percpu_ref to
> >>>>>>>>>> serialize
> >>>>>>>>>> against concurrent swapoff, i.e. there's percpu_ref inside
> >>>>>>>>>> get_swap_device() instead of rcu_read_lock(). Please see commit
> >>>>>>>>>> 63d8620ecf93 ("mm/swapfile: use percpu_ref to serialize against
> >>>>>>>>>> concurrent swapoff") for detail.
> >>>>>>>>>
> >>>>>>>>> Oh, so this is a backport problem.  2799e77529c2 was backported
> >>>>>>>>> without
> >>>>>>>>> its prerequisite 63d8620ecf93.  Greg, probably best to just drop
> >>>>>>>>
> >>>>>>>> Yes, they're posted as a patch set:
> >>>>>>>>
> >>>>>>>> https://lkml.kernel.org/r/20210426123316.806267-1-linmiaohe@huawei.co
> >>>>>>>> m
> >>>>>>>>
> >>>>>>>>> 2799e77529c2 from all stable trees; the race described is not very
> >>>>>>>>> important (swapoff vs reading a page back from that swap device).
> >>>>>>>>> .
> >>>>>>>>
> >>>>>>>> The swapoff races with reading a page back from that swap device
> >>>>>>>> should
> >>>>>>>> be
> >>>>>>>> really uncommon as most users only do swapoff when the system is
> >>>>>>>> going to
> >>>>>>>> shutdown.
> >>>>>>>>
> >>>>>>>> Sorry for the trouble!
> >>>>>>>
> >>>>>>> git log --oneline v5.13..v5.13.3 --author="Miaohe Lin"
> >>>>>>> 11ebc09e50dc mm/zswap.c: fix two bugs in zswap_writeback_entry()
> >>>>>>> 95d192da198d mm/z3fold: use release_z3fold_page_locked() to release
> >>>>>>> locked
> >>>>>>> z3fold page
> >>>>>>> ccb7848e2344 mm/z3fold: fix potential memory leak in
> >>>>>>> z3fold_destroy_pool()
> >>>>>>> 9f7229c901c1 mm/huge_memory.c: don't discard hugepage if other
> >>>>>>> processes
> >>>>>>> are mapping it
> >>>>>>> f13259175e4f mm/huge_memory.c: add missing read-only THP checking in
> >>>>>>> transparent_hugepage_enabled()
> >>>>>>> afafd371e7de mm/huge_memory.c: remove dedicated macro
> >>>>>>> HPAGE_CACHE_INDEX_MASK a533a21b692f mm/shmem: fix shmem_swapin() race
> >>>>>>> with swapoff
> >>>>>>> c3b39134bbd0 swap: fix do_swap_page() race with swapoff
> >>>>>>>
> >>>>>>> Do you suggest reverting "mm/shmem: fix shmem_swapin() race with
> >>>>>>> swapoff"
> >>>>>>> as well?
> >>>>>>
> >>>>>> This patch also rely on its prerequisite 63d8620ecf93. I think we should
> >>>>>> either revert any commit in this series or just backport the entire
> >>>>>> series.
> >>>>>
> >>>>> Then why not just pick up 2 more patches instead of dropping 2 patches.
> >>>>> Greg, could you please make sure the whole series from [1] gets pulled?
> >>>>
> >>>> Because none of these patches should have been backported in the first
> >>>> place.  It's just not worth the destabilisation.
> >>>
> >>> What about the rest then?
> >>>
> >>> git log --oneline v5.13..v5.13.3 -- mm/ | wc -l
> >>> 18
> >>>
> >>> Those look to be fixes, these ones too.
> >>>
> >>> --
> >>> Oleksandr Natalenko (post-factum)
> >>>
> >>>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-19 12:12                           ` Oleksandr Natalenko
  2021-07-19 12:16                             ` Matthew Wilcox
@ 2021-07-19 12:17                             ` Miaohe Lin
  1 sibling, 0 replies; 34+ messages in thread
From: Miaohe Lin @ 2021-07-19 12:17 UTC (permalink / raw)
  To: Oleksandr Natalenko, Matthew Wilcox
  Cc: Boqun Feng, Zhouyi Zhou, paulmck, linux-kernel, stable,
	Chris Clayton, Chris Rankin, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, rcu,
	Andrew Morton, Linux-MM, Huang, Ying, gregkh

On 2021/7/19 20:12, Oleksandr Natalenko wrote:
> On pondělí 19. července 2021 14:08:37 CEST Miaohe Lin wrote:
>> On 2021/7/19 19:59, Oleksandr Natalenko wrote:
>>> On pondělí 19. července 2021 13:50:07 CEST Miaohe Lin wrote:
>>>> On 2021/7/19 19:22, Matthew Wilcox wrote:
>>>>> On Mon, Jul 19, 2021 at 07:12:58PM +0800, Miaohe Lin wrote:
>>>>>> When in the commit 2799e77529c2a, we're using the percpu_ref to
>>>>>> serialize
>>>>>> against concurrent swapoff, i.e. there's percpu_ref inside
>>>>>> get_swap_device() instead of rcu_read_lock(). Please see commit
>>>>>> 63d8620ecf93 ("mm/swapfile: use percpu_ref to serialize against
>>>>>> concurrent swapoff") for detail.
>>>>>
>>>>> Oh, so this is a backport problem.  2799e77529c2 was backported without
>>>>> its prerequisite 63d8620ecf93.  Greg, probably best to just drop
>>>>
>>>> Yes, they're posted as a patch set:
>>>>
>>>> https://lkml.kernel.org/r/20210426123316.806267-1-linmiaohe@huawei.com
>>>>
>>>>> 2799e77529c2 from all stable trees; the race described is not very
>>>>> important (swapoff vs reading a page back from that swap device).
>>>>> .
>>>>
>>>> The swapoff races with reading a page back from that swap device should
>>>> be
>>>> really uncommon as most users only do swapoff when the system is going to
>>>> shutdown.
>>>>
>>>> Sorry for the trouble!
>>>
>>> git log --oneline v5.13..v5.13.3 --author="Miaohe Lin"
>>> 11ebc09e50dc mm/zswap.c: fix two bugs in zswap_writeback_entry()
>>> 95d192da198d mm/z3fold: use release_z3fold_page_locked() to release locked
>>> z3fold page
>>> ccb7848e2344 mm/z3fold: fix potential memory leak in z3fold_destroy_pool()
>>> 9f7229c901c1 mm/huge_memory.c: don't discard hugepage if other processes
>>> are mapping it
>>> f13259175e4f mm/huge_memory.c: add missing read-only THP checking in
>>> transparent_hugepage_enabled()
>>> afafd371e7de mm/huge_memory.c: remove dedicated macro
>>> HPAGE_CACHE_INDEX_MASK a533a21b692f mm/shmem: fix shmem_swapin() race
>>> with swapoff
>>> c3b39134bbd0 swap: fix do_swap_page() race with swapoff
>>>
>>> Do you suggest reverting "mm/shmem: fix shmem_swapin() race with swapoff"
>>> as well?
>>
>> This patch also rely on its prerequisite 63d8620ecf93. I think we should
>> either revert any commit in this series or just backport the entire series.
> 
> Then why not just pick up 2 more patches instead of dropping 2 patches. Greg, 
> could you please make sure the whole series from [1] gets pulled?
> 
> Thanks.
> 
> [1] https://lkml.kernel.org/r/20210426123316.806267-1-linmiaohe@huawei.com
> 

I browsed the previous backport notifying email and found that these two patches are also
backported into 5.10 and 5.12. So we might also need to make sure the whole series from
[1] gets pulled in 5.10 and 5.12.

Thanks.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
  2021-07-18 21:59     ` Paul E. McKenney
  2021-07-18 22:51       ` Matthew Wilcox
@ 2021-07-19  7:32       ` Chris Clayton
  1 sibling, 0 replies; 34+ messages in thread
From: Chris Clayton @ 2021-07-19  7:32 UTC (permalink / raw)
  To: paulmck, Oleksandr Natalenko
  Cc: linux-kernel, stable, Chris Rankin, Josh Triplett,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes,
	rcu, Andrew Morton, linux-mm



On 18/07/2021 22:59, Paul E. McKenney wrote:
> On Sun, Jul 18, 2021 at 11:03:51PM +0200, Oleksandr Natalenko wrote:
>> + stable@vger.kernel.org
>>
>> On neděle 18. července 2021 23:01:24 CEST Oleksandr Natalenko wrote:
>>> Hello.
>>>
>>> On sobota 17. července 2021 22:22:08 CEST Chris Clayton wrote:
>>>> I checked the output from dmesg yesterday and found the following warning:
>>>>
>>>> [Fri Jul 16 09:15:29 2021] ------------[ cut here ]------------
>>>> [Fri Jul 16 09:15:29 2021] WARNING: CPU: 11 PID: 2701 at
>>>> kernel/rcu/tree_plugin.h:359 rcu_note_context_switch+0x37/0x3d0 [Fri Jul
>>>> 16
> 
> I am not seeing a warning at line 359 of either v5.13.2 or v5.12.7.
> 

Mmm, in the 5.13.2 tarball downloaded from https://cdn.kernel.org/pub/linux/kernel/v5.x/ I see:

    350  */
    351 void rcu_note_context_switch(bool preempt)
    352 {
    353         struct task_struct *t = current;
    354         struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
    355         struct rcu_node *rnp;
    356
    357         trace_rcu_utilization(TPS("Start context switch"));
    358         lockdep_assert_irqs_disabled();
    359         WARN_ON_ONCE(!preempt && rcu_preempt_depth() > 0);
    360         if (rcu_preempt_depth() > 0 &&
    361             !t->rcu_read_unlock_special.b.blocked) {
    362

>>>> 09:15:29 2021] Modules linked in: uas hidp rfcomm bnep xt_MASQUERADE
>>>> iptable_nat nf_nat xt_LOG nf_log_syslog xt_limit xt_multiport xt_conntrack
>>>> iptable_filter btusb btintel wmi_bmof uvcvideo videobuf2_vmalloc
>>>> videobuf2_memops videobuf2_v4l2 videobuf2_common coretemp hwmon
>>>> snd_hda_codec_hdmi x86_pkg_temp_thermal snd_hda_codec_realtek
>>>> snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg
>>>> snd_hda_codec snd_hwdep snd_hda_core i2c_i801 i2c_smbus iwlmvm mac80211
>>>> iwlwifi i915 mei_me mei cfg80211 intel_lpss_pci intel_lpss wmi
>>>> nf_conntrack_ftp xt_helper nf_conntrack nf_defrag_ipv4 tun
>>>> [Fri Jul 16 09:15:29 2021] CPU: 11 PID: 2701 Comm: lpqd Not tainted 5.13.2
>>>> #1 [Fri Jul 16 09:15:29 2021] Hardware name: Notebook
>>>>
>>>>   NP50DE_DB                       /NP50DE_DB , BIOS 1.07.04 02/17/2020
>>>>
>>>> [Fri Jul 16 09:15:29 2021] RIP: 0010:rcu_note_context_switch+0x37/0x3d0
>>>> [Fri Jul 16 09:15:29 2021] Code: 02 00 e8 ec a0 6c 00 89 c0 65 4c 8b 2c 25
>>>> 00 6d 01 00 48 03 1c c5 80 56 e1 b6 40 84 ed 75 0d 41 8b 95 04 03 00 00 85
>>>> d2 7e 02 <0f> 0b 65 48 8b 04 25 00 6d 01 00 8b 80 04 03 00 00 85 c0 7e 0a
>>>> 41 [Fri Jul 16 09:15:29 2021] RSP: 0000:ffffb5d483837c70 EFLAGS: 00010002
>>>> [Fri Jul 16 09:15:29 2021] RAX: 000000000000000b RBX: ffff9b77806e1d80
>>>> RCX:
>>>> 0000000000000100 [Fri Jul 16 09:15:29 2021] RDX: 0000000000000001 RSI:
>>>> ffffffffb6d82ead RDI: ffffffffb6da5e4e [Fri Jul 16 09:15:29 2021] RBP:
>>>> 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 [Fri Jul 16
>>>> 09:15:29 2021] R10: 000000067bce4fff R11: 0000000000000000 R12:
>>>> ffff9b77806e1100 [Fri Jul 16 09:15:29 2021] R13: ffff9b734a833a00 R14:
>>>> ffff9b734a833a00 R15: 0000000000000000 [Fri Jul 16 09:15:29 2021] FS:
>>>> 00007fccbfc5fe40(0000) GS:ffff9b77806c0000(0000) knlGS:0000000000000000
>>>> [Fri Jul 16 09:15:29 2021] CS:  0010 DS: 0000 ES: 0000 CR0:
>>>> 0000000080050033 [Fri Jul 16 09:15:29 2021] CR2: 00007fccc2db7290 CR3:
>>>> 00000003fb0b8002 CR4: 00000000007706e0 [Fri Jul 16 09:15:29 2021] PKRU:
>>>> 55555554
>>>> [Fri Jul 16 09:15:29 2021] Call Trace:
>>>> [Fri Jul 16 09:15:29 2021]  __schedule+0x86/0x810
>>>> [Fri Jul 16 09:15:29 2021]  schedule+0x40/0xe0
>>>> [Fri Jul 16 09:15:29 2021]  io_schedule+0x3d/0x60
>>>> [Fri Jul 16 09:15:29 2021]  wait_on_page_bit_common+0x129/0x390
>>>> [Fri Jul 16 09:15:29 2021]  ? __filemap_set_wb_err+0x10/0x10
>>>> [Fri Jul 16 09:15:29 2021]  __lock_page_or_retry+0x13f/0x1d0
>>>> [Fri Jul 16 09:15:29 2021]  do_swap_page+0x335/0x5b0
>>>> [Fri Jul 16 09:15:29 2021]  __handle_mm_fault+0x444/0xb20
>>>> [Fri Jul 16 09:15:29 2021]  handle_mm_fault+0x5c/0x170
>>>> [Fri Jul 16 09:15:29 2021]  ? find_vma+0x5b/0x70
>>>> [Fri Jul 16 09:15:29 2021]  exc_page_fault+0x1ab/0x610
>>>> [Fri Jul 16 09:15:29 2021]  ? fpregs_assert_state_consistent+0x19/0x40
>>>> [Fri Jul 16 09:15:29 2021]  ? asm_exc_page_fault+0x8/0x30
>>>> [Fri Jul 16 09:15:29 2021]  asm_exc_page_fault+0x1e/0x30
>>>> [Fri Jul 16 09:15:29 2021] RIP: 0033:0x7fccc2d3c520
>>>> [Fri Jul 16 09:15:29 2021] Code: 68 4c 00 00 00 e9 20 fb ff ff ff 25 7a ad
>>>> 07 00 68 4d 00 00 00 e9 10 fb ff ff ff 25 72 ad 07 00 68 4e 00 00 00 e9 00
>>>> fb ff ff <ff> 25 6a ad 07 00 68 4f 00 00 00 e9 f0 fa ff ff ff 25 62 ad 07
>>>> 00 [Fri Jul 16 09:15:29 2021] RSP: 002b:00007ffebd529048 EFLAGS: 00010293
>>>> [Fri Jul 16 09:15:29 2021] RAX: 0000000000000001 RBX: 00007fccc46e2890
>>>> RCX:
>>>> 0000000000000010 [Fri Jul 16 09:15:29 2021] RDX: 0000000000000010 RSI:
>>>> 0000000000000000 RDI: 00007fccc46e2890 [Fri Jul 16 09:15:29 2021] RBP:
>>>> 000056264f1dd4a0 R08: 000056264f21aba0 R09: 000056264f1f58a0 [Fri Jul 16
>>>> 09:15:29 2021] R10: 0000000000000007 R11: 0000000000000246 R12:
>>>> 000056264f21ac00 [Fri Jul 16 09:15:29 2021] R13: 000056264f1e0a30 R14:
>>>> 00007ffebd529080 R15: 00000000000dd87b [Fri Jul 16 09:15:29 2021] ---[ end
>>>> trace c8b06e067d8b0fc2 ]---
>>>>
>>>> At the time the warning was issued I was creating a (weekly) backup of my
>>>> linux system (home-brewed based on the guidance from Linux From Scratch).
>>>> My backup routine is completed by copying the archive files (created with
>>>> dar) and a directory that contains about 7000 source and binary rpm files
>>>> to an external USB drive. I didn't spot the warning until later in the
>>>> day,
>>>> so I'm not sure exactly where I was in my backup process.
>>>>
>>>> I haven't seen this warning before. Consequently, I don;t know how easy
>>>> (or
>>>> otherwise) it is to reproduce.
>>>>
>>>> Let me know if I can provide any additional diagnostics, but please cc me
>>>> as I'm not subscribed.
>>>
>>> Confirming the same for me with v5.13.2, and cross-referencing another
>>> report [1] against v5.12.17.
>>>
>>> Also Cc'ing relevant people on this.
>>>
>>> Thanks.
>>>
>>> [1]
>>> https://lore.kernel.org/lkml/CAK2bqVK0Q9YcpakE7_Rc6nr-E4e2GnMOgi5jJj=_Eh_1k
>>> EHLHA@mail.gmail.com/
> 
> But this one does show this warning in v5.12.17:
> 
> 	WARN_ON_ONCE(!preempt && rcu_preempt_depth() > 0);
> 
> This is in rcu_note_context_switch(), and could be caused by something
> like a schedule() within an RCU read-side critical section.  This would
> of course be RCU-usage bugs, given that you are not permitted to block
> within an RCU read-side critical section.
> 
> I suggest checking the functions in the stack trace to see where the
> rcu_read_lock() is hiding.  CONFIG_PROVE_LOCKING might also be helpful.
> 
> 							Thanx, Paul
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2021-07-23  7:13 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <c9fd1311-662c-f993-c8ef-54af036f2f78@googlemail.com>
2021-07-18 21:01 ` linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359 Oleksandr Natalenko
2021-07-18 21:03   ` Oleksandr Natalenko
2021-07-18 21:22     ` Matthew Wilcox
2021-07-18 21:36       ` Chris Clayton
2021-07-18 21:59     ` Paul E. McKenney
2021-07-18 22:51       ` Matthew Wilcox
2021-07-19  1:53         ` Paul E. McKenney
2021-07-19  2:24           ` Zhouyi Zhou
2021-07-19  2:27             ` Zhouyi Zhou
2021-07-19  2:43             ` Matthew Wilcox
2021-07-19  2:59               ` Zhouyi Zhou
2021-07-19 10:14               ` Boqun Feng
2021-07-19 11:12                 ` Miaohe Lin
2021-07-19 11:17                   ` Oleksandr Natalenko
2021-07-19 11:22                   ` Matthew Wilcox
2021-07-19 11:50                     ` Miaohe Lin
2021-07-19 11:59                       ` Oleksandr Natalenko
2021-07-19 12:08                         ` Miaohe Lin
2021-07-19 12:12                           ` Oleksandr Natalenko
2021-07-19 12:16                             ` Matthew Wilcox
2021-07-19 12:23                               ` Oleksandr Natalenko
2021-07-19 16:47                                 ` Zhouyi Zhou
     [not found]                                   ` <CAABZP2yh3J8+P=3PLZVaC47ymKC7PcfQCBBxjXJ9Ybn+HREbdg@mail.gmail.com>
2021-07-22  7:30                                     ` Chris Clayton
2021-07-22  8:57                                       ` Zhouyi Zhou
2021-07-22 12:36                                         ` Matthew Wilcox
2021-07-22 13:26                                           ` Greg KH
2021-07-22 14:00                                             ` Greg KH
2021-07-23  1:51                                               ` Miaohe Lin
2021-07-23  7:02                                                 ` Greg KH
2021-07-23  7:13                                                   ` Miaohe Lin
2021-07-22 17:44                                           ` Zhouyi Zhou
2021-07-22 14:05                                       ` Paul E. McKenney
2021-07-19 12:17                             ` Miaohe Lin
2021-07-19  7:32       ` Chris Clayton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).