KCSAN: data-race in cgroup_rstat_flush_locked / cgroup_rstat


* KCSAN: data-race in cgroup_rstat_flush_locked / cgroup_rstat_updated
@ 2021-09-16 13:53 ` Hao Sun
  0 siblings, 0 replies; 8+ messages in thread
From: Hao Sun @ 2021-09-16 13:53 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: cgroups, hannes, lizefan.x, tj

Hi

KCSAN reported the following data race during kernel boot while I was
using Healer to fuzz the latest Linux kernel.

HEAD commit: ff1ffd71d5f0 Merge tag 'hyperv-fixes-signed-20210915'
git tree: upstream
console output: https://paste.ubuntu.com/p/s4kFHrHCNh/
kernel config: https://paste.ubuntu.com/p/FjTsrWnBVM/

If you fix this issue, please add the following tag to the commit:
Reported-by: Hao Sun <sunhao.th@gmail.com>

==================================================================
BUG: KCSAN: data-race in cgroup_rstat_flush_locked / cgroup_rstat_updated

write to 0xffffe8ffffc194d0 of 8 bytes by task 8 on cpu 1:
 cgroup_rstat_cpu_pop_updated kernel/cgroup/rstat.c:139 [inline]
 cgroup_rstat_flush_locked+0x282/0x760 kernel/cgroup/rstat.c:161
 cgroup_rstat_flush_irqsafe+0x24/0x40 kernel/cgroup/rstat.c:218
 mem_cgroup_flush_stats mm/memcontrol.c:5354 [inline]
 flush_memcg_stats_work+0x34/0x60 mm/memcontrol.c:5366
 process_one_work+0x402/0x910 kernel/workqueue.c:2297
 worker_thread+0x638/0xac0 kernel/workqueue.c:2444
 kthread+0x243/0x280 kernel/kthread.c:319
 ret_from_fork+0x1f/0x30

read to 0xffffe8ffffc194d0 of 8 bytes by task 1245 on cpu 0:
 cgroup_rstat_updated+0x53/0x1b0 kernel/cgroup/rstat.c:38
 __count_memcg_events+0x43/0x50 mm/memcontrol.c:788
 __activate_page+0x50b/0x5f0 mm/swap.c:309
 pagevec_lru_move_fn+0x1c4/0x2d0 mm/swap.c:197
 activate_page mm/swap.c:338 [inline]
 mark_page_accessed+0x47d/0x550 mm/swap.c:422
 zap_pte_range+0x5cc/0xdb0 mm/memory.c:1359
 zap_pmd_range mm/memory.c:1481 [inline]
 zap_pud_range mm/memory.c:1510 [inline]
 zap_p4d_range mm/memory.c:1531 [inline]
 unmap_page_range+0x2dc/0x3d0 mm/memory.c:1552
 unmap_single_vma+0x157/0x210 mm/memory.c:1597
 unmap_vmas+0xd0/0x180 mm/memory.c:1629
 exit_mmap+0x235/0x470 mm/mmap.c:3171
 __mmput+0x27/0x1d0 kernel/fork.c:1115
 mmput+0x3d/0x50 kernel/fork.c:1136
 exit_mm+0x2dc/0x3d0 kernel/exit.c:501
 do_exit+0x3e0/0x14f0 kernel/exit.c:812
 do_group_exit+0xa4/0x1a0 kernel/exit.c:922
 __do_sys_exit_group+0xb/0x10 kernel/exit.c:933
 __se_sys_exit_group+0x5/0x10 kernel/exit.c:931
 __x64_sys_exit_group+0x16/0x20 kernel/exit.c:931
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x44/0xa0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

value changed: 0xffff888101bc2010 -> 0x0000000000000000

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 1245 Comm: syz-executor Not tainted 5.15.0-rc1+ #8
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014


* Re: KCSAN: data-race in cgroup_rstat_flush_locked / cgroup_rstat_updated
@ 2021-09-17 16:41   ` Michal Koutný
  0 siblings, 0 replies; 8+ messages in thread
From: Michal Koutný @ 2021-09-17 16:41 UTC (permalink / raw)
  To: Hao Sun; +Cc: Linux Kernel Mailing List, cgroups, hannes, lizefan.x, tj

Hello Hao.

On Thu, Sep 16, 2021 at 09:53:55PM +0800, Hao Sun <sunhao.th@gmail.com> wrote:
> KCSAN reported the following data race during the kernel booting when
> using Healer to fuzz the latest Linux kernel.
> [...]
>  cgroup_rstat_cpu_pop_updated kernel/cgroup/rstat.c:139 [inline]
>  [...]
>  cgroup_rstat_updated+0x53/0x1b0 kernel/cgroup/rstat.c:38

FWIW, it's a "safe" race between updaters and flushers (a flusher may
miss the latest update(s)). This is expected, as explained in the
comment in cgroup_rstat_updated().
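
A minimal userspace sketch of the pattern described above, not the kernel code: an updater checks a "next" pointer without taking the lock, and a flusher clears that pointer while popping the node. The names `stat_node`, `node_updated`, `flush_all` and the tiny spinlock are hypothetical, and a single global list stands in for the per-CPU lists. It assumes GCC or Clang for the `__atomic` builtins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU rstat state; not the kernel's struct. */
struct stat_node {
    struct stat_node *updated_next; /* NULL means "not on the updated list" */
    long pending;                   /* updates not yet flushed */
};

static struct stat_node list_end;                  /* non-NULL list terminator */
static struct stat_node *updated_head = &list_end;
static _Bool list_lock;                            /* toy spinlock, sketch only */

static void sketch_lock(void)
{
    while (__atomic_test_and_set(&list_lock, __ATOMIC_ACQUIRE))
        ;
}

static void sketch_unlock(void)
{
    __atomic_clear(&list_lock, __ATOMIC_RELEASE);
}

/* Updater: record an update and make sure the node is queued for a flush. */
static void node_updated(struct stat_node *n)
{
    __atomic_fetch_add(&n->pending, 1, __ATOMIC_RELAXED);

    /* Lockless check: this is the read that can race with a flusher. */
    if (__atomic_load_n(&n->updated_next, __ATOMIC_RELAXED))
        return;

    sketch_lock();
    if (!n->updated_next) {          /* recheck now that we hold the lock */
        __atomic_store_n(&n->updated_next, updated_head, __ATOMIC_RELAXED);
        updated_head = n;
    }
    sketch_unlock();
}

/* Flusher: pop every queued node and return the total that was collected. */
static long flush_all(void)
{
    long total = 0;

    sketch_lock();
    while (updated_head != &list_end) {
        struct stat_node *n = updated_head;

        updated_head = n->updated_next;
        /* This is the write that can race with the lockless check above. */
        __atomic_store_n(&n->updated_next, (struct stat_node *)NULL,
                         __ATOMIC_RELAXED);
        total += __atomic_exchange_n(&n->pending, 0L, __ATOMIC_RELAXED);
    }
    sketch_unlock();
    return total;
}

/* Single-threaded usage: three updates on two nodes, then one flush. */
static long demo(void)
{
    struct stat_node a = {0}, b = {0};

    node_updated(&a);
    node_updated(&a);   /* takes the lockless fast path */
    node_updated(&b);
    assert(a.updated_next != NULL);
    return flush_all();
}
```

In this sketch, an updater that still sees a non-NULL pointer while the flusher is popping the node returns early, so its increment may sit in `pending` until a later update queues the node again. That is the "missed latest update" case, and the count is delayed rather than lost.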

Michal

* Re: KCSAN: data-race in cgroup_rstat_flush_locked / cgroup_rstat_updated
@ 2021-09-18  1:27     ` Hao Sun
  0 siblings, 0 replies; 8+ messages in thread
From: Hao Sun @ 2021-09-18  1:27 UTC (permalink / raw)
  To: Michal Koutný
  Cc: Linux Kernel Mailing List, cgroups, hannes, lizefan.x, tj

Hi Michal,

On Sat, Sep 18, 2021 at 12:41 AM Michal Koutný <mkoutny@suse.com> wrote:
>
> Hello Hao.
>
> On Thu, Sep 16, 2021 at 09:53:55PM +0800, Hao Sun <sunhao.th@gmail.com> wrote:
> > KCSAN reported the following data race during the kernel booting when
> > using Healer to fuzz the latest Linux kernel.
> > [...]
> >  cgroup_rstat_cpu_pop_updated kernel/cgroup/rstat.c:139 [inline]
> >  [...]
> >  cgroup_rstat_updated+0x53/0x1b0 kernel/cgroup/rstat.c:38
>
> FWIW, it's a "safe" race between updaters and flushers (possibly
> missing the latest update(s)). This is expected as explained in
> cgroup_rstat_updated() comment.
>

Would it be better to add a `data_race` macro to the corresponding
location so that the false report can be disabled?
See https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/access-marking.txt#n58
for more details.
Currently, the fuzzer cannot test a KCSAN-enabled kernel for long,
because cgroup setup is a basic step that runs before any test case
is executed.
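
As a rough illustration of what such an annotation does, here is a userspace sketch of a `data_race()`-style wrapper: it evaluates the expression once, with the checker told to ignore that access, and yields its value. Everything here is an assumption for illustration, not the kernel's definition: `data_race_sketch`, the two `sketch_*` hooks (which only count calls), and `struct rstat_sketch` are hypothetical. It relies on GNU C statement expressions and `__typeof__`, so it needs GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the sanitizer hooks; here they only keep counts. */
static int checks_disabled_depth;
static int marked_accesses;

static void sketch_disable_checks(void)
{
    checks_disabled_depth++;
    marked_accesses++;
}

static void sketch_enable_checks(void)
{
    checks_disabled_depth--;
}

/*
 * Evaluate expr exactly once with checking disabled and yield its value.
 * The access itself is unchanged; only the tooling is told it is intentional.
 */
#define data_race_sketch(expr)                  \
    ({                                          \
        sketch_disable_checks();                \
        __typeof__(expr) v_ = (expr);           \
        sketch_enable_checks();                 \
        v_;                                     \
    })

/* Hypothetical node with the same "NULL means not queued" convention. */
struct rstat_sketch {
    struct rstat_sketch *updated_next;
};

/* The lockless check, marked as an intentional race. */
static int already_queued(struct rstat_sketch *n)
{
    return data_race_sketch(n->updated_next) != NULL;
}
```

The marking does not add ordering or atomicity. It only documents that the racy read is deliberate, which is why it suits a check whose stale result is tolerated.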

Regards
Hao

* Re: KCSAN: data-race in cgroup_rstat_flush_locked / cgroup_rstat_updated
@ 2021-09-20 18:09       ` Michal Koutný
  0 siblings, 0 replies; 8+ messages in thread
From: Michal Koutný @ 2021-09-20 18:09 UTC (permalink / raw)
  To: Hao Sun; +Cc: Linux Kernel Mailing List, cgroups, hannes, lizefan.x, tj

On Sat, Sep 18, 2021 at 09:27:08AM +0800, Hao Sun <sunhao.th@gmail.com> wrote:
> Would it be better to add a `data_race` macro to the corresponding
> location so that the false report can be disabled?

Something like this

--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -35,7 +35,7 @@ void cgroup_rstat_updated(struct cgroup *cgrp, int cpu)
         * instead of NULL, we can tell whether @cgrp is on the list by
         * testing the next pointer for NULL.
         */
-       if (cgroup_rstat_cpu(cgrp, cpu)->updated_next)
+       if (data_race(cgroup_rstat_cpu(cgrp, cpu)->updated_next))
                return;

        raw_spin_lock_irqsave(cpu_lock, flags);
?

Makes sense to me. Will you send a patch (if this resolves your KCSAN
noise)?

(IIUC, this became more visible after commit aa48e47e3906 ("memcg:
infrastructure to flush memcg stats") in v5.15-rc1, but it has been
present since d8ef4b38cb69 ("Revert "cgroup: Add memory barriers to plug
cgroup_rstat_updated() race window"") in v5.7.)

> See https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/access-marking.txt#n58
> for more details.

(Interesting, learning...)

Thanks,
Michal
