* BUG: KASAN: global-out-of-bounds in vprintk_store on x86_64 @ 2020-10-14 10:49 Naresh Kamboju 2020-10-14 12:47 ` Sergey Senozhatsky 2020-10-14 14:05 ` Petr Mladek 0 siblings, 2 replies; 8+ messages in thread From: Naresh Kamboju @ 2020-10-14 10:49 UTC (permalink / raw) To: Linux-Next Mailing List, linux-mm, open list, lkft-triage, LTP List Cc: Andrew Morton, Mike Rapoport, Stephen Rothwell, John Ogness, Petr Mladek, Sergey Senozhatsky, Rasmus Villemoes, Steven Rostedt, Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner While testing LTP controllers testing on x86_64 KASAN enabled linux next 20201013 tag kernel this kernel BUG noticed. The oom-killer log also noticed while running tests. metadata: git branch: master git repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next git commit: f2fb1afc57304f9dd68c20a08270e287470af2eb git describe: next-20201013 make_kernelversion: 5.9.0 kernel-config: https://builds.tuxbuild.com/TXP6cqokP8BCYJrv7zf6kw/kernel.config steps to reproduce: ------------------------- # cd /opt/ltp # ./runltp -f controllers Crash log: -------------- [ 221.921944] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=c,mems_allowed=0,oom_memcg=/0,task_memcg=in [ 221.922108] ================================================================== [ 221.922111] BUG: KASAN: global-out-of-bounds in vprintk_store+0x362/0x3d0 [ 221.922112] Write of size 2 at addr ffffffffba51dbcd by task memcg_test_1/11282 [ 221.922113] [ 221.922114] CPU: 1 PID: 11282 Comm: memcg_test_1 Not tainted 5.9.0-next-20201013 #1 [ 221.922116] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.0b 07/27/2017 [ 221.922116] Call Trace: [ 221.922117] dump_stack+0xa4/0xd9 [ 221.922118] print_address_description.constprop.0+0x21/0x210 [ 221.922119] ? _raw_write_lock_bh+0xe0/0xe0 [ 221.922120] ? vprintk_store+0x362/0x3d0 [ 221.922121] kasan_report.cold+0x37/0x7c [ 221.922122] ? vprintk_store+0x362/0x3d0 [ 221.922123] check_memory_region+0x18c/0x1f0 [ 221.922124] memcpy+0x3c/0x60 [ 221.922125] vprintk_store+0x362/0x3d0 [ 221.922125] ? __ia32_sys_syslog+0x50/0x50 [ 221.922126] ? _raw_spin_lock_irqsave+0x9b/0x100 [ 221.922127] ? _raw_spin_lock_irq+0xf0/0xf0 [ 221.922128] ? __kasan_check_write+0x14/0x20 [ 221.922129] vprintk_emit+0x8d/0x1f0 [ 221.922130] vprintk_default+0x1d/0x20 [ 221.922131] vprintk_func+0x5a/0x100 [ 221.922132] printk+0xb2/0xe3 [ 221.922133] ? swsusp_write.cold+0x189/0x189 [ 221.922134] ? kernfs_vfs_xattr_set+0x60/0x60 [ 221.922134] ? _raw_write_lock_bh+0xe0/0xe0 [ 221.922135] ? trace_hardirqs_on+0x38/0x100 [ 221.922136] pr_cont_kernfs_path.cold+0x49/0x4b [ 221.922137] mem_cgroup_print_oom_context.cold+0x74/0xc3 [ 221.922138] dump_header+0x340/0x3bf [ 221.922139] oom_kill_process.cold+0xb/0x10 [ 221.922140] out_of_memory+0x1e9/0x860 [ 221.922141] ? oom_killer_disable+0x210/0x210 [ 221.922142] mem_cgroup_out_of_memory+0x198/0x1c0 [ 221.922143] ? mem_cgroup_count_precharge_pte_range+0x250/0x250 [ 221.922144] try_charge+0xa9b/0xc50 [ 221.922145] ? arch_stack_walk+0x9e/0xf0 [ 221.922146] ? memory_high_write+0x230/0x230 [ 221.922146] ? avc_has_extended_perms+0x830/0x830 [ 221.922147] ? stack_trace_save+0x94/0xc0 [ 221.922148] ? stack_trace_consume_entry+0x90/0x90 [ 221.922149] __memcg_kmem_charge+0x73/0x120 [ 221.922150] ? cred_has_capability+0x10f/0x200 [ 221.922151] ? mem_cgroup_can_attach+0x260/0x260 [ 221.922152] ? selinux_sb_eat_lsm_opts+0x2f0/0x2f0 [ 221.922153] ? obj_cgroup_charge+0x16b/0x220 [ 221.922154] ? kmem_cache_alloc+0x78/0x4c0 [ 221.922155] obj_cgroup_charge+0x122/0x220 [ 221.922156] ? vm_area_alloc+0x20/0x90 [ 221.922156] kmem_cache_alloc+0x78/0x4c0 [ 221.922157] vm_area_alloc+0x20/0x90 [ 221.922158] mmap_region+0x3ed/0x9a0 [ 221.922159] ? cap_mmap_addr+0x1d/0x80 [ 221.922160] do_mmap+0x3ee/0x720 [ 221.922161] vm_mmap_pgoff+0x16a/0x1c0 [ 221.922162] ? randomize_stack_top+0x90/0x90 [ 221.922163] ? copy_page_range+0x1980/0x1980 [ 221.922163] ksys_mmap_pgoff+0xab/0x350 [ 221.922164] ? find_mergeable_anon_vma+0x110/0x110 [ 221.922165] ? __audit_syscall_entry+0x1a6/0x1e0 [ 221.922166] __x64_sys_mmap+0x8d/0xb0 [ 221.922167] do_syscall_64+0x38/0x50 [ 221.922168] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 221.922169] RIP: 0033:0x7fe8f5e75103 [ 221.922172] Code: 54 41 89 d4 55 48 89 fd 53 4c 89 cb 48 85 ff 74 56 49 89 d9 45 89 f8 45 89 f2 44 89 e2 4c 89 ee 48 89 ef b8 09 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7d 5b 5d 41 5c 41 5d 41 5e 41 5f c3 66 2e 0f [ 221.922173] RSP: 002b:00007ffd38c90198 EFLAGS: 00000246 ORIG_RAX: 0000000000000009 [ 221.922175] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fe8f5e75103 [ 221.922176] RDX: 0000000000000003 RSI: 0000000000001000 RDI: 0000000000000000 [ 221.922178] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [ 221.922179] R10: 0000000000002022 R11: 0000000000000246 R12: 0000000000000003 [ 221.922180] R13: 0000000000001000 R14: 0000000000002022 R15: 0000000000000000 [ 221.922181] [ 213O[ 221.922182] The buggy address belongs to the variable: [ 221.922183] clear_seq+0x2d/0x40 [ 221.922183] [ 221.922184] Memory state around the buggy address: [ 221.922185] ffffffffba51da80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 221.922187] ffffffffba51db00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 221.922188] >ffffffffba51db80: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 [ 221.922189] ^ [ 221.922190] ffffffffba51dc00: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 [ 221.922191] ffffffffba51dc80: f9 f9 f9 f9 01 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 [ 221.922193] ================================================================== [ 221.922194] Disabling lock debugging due to kernel taint [ 221.922196] ,task=memcg_test_1,pid=11280,uid=0 [ 221.922205] Memory cgroup out of memory: Killed process 11280 (memcg_test_1) total-vm:4124kB, anon-rss:72kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:48kB oom_score_adj:0 [ 221.922509] memcg_test_1 invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0 [ 222.885676] CPU: 2 PID: 11283 Comm: memcg_test_1 Tainted: G B 5.9.0-next-20201013 #1 [ 222.885678] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.0b 07/27/2017 [ 222.885679] Call Trace: [ 222.885683] dump_stack+0xa4/0xd9 [ 222.885686] dump_header+0x8f/0x3bf [ 222.885689] oom_kill_process.cold+0xb/0x10 [ 222.885692] out_of_memory+0x1e9/0x860 [ 222.885697] ? oom_killer_disable+0x210/0x210 [ 222.923728] mem_cgroup_out_of_memory+0x198/0x1c0 [ 222.923731] ? mem_cgroup_count_precharge_pte_range+0x250/0x250 Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> full test details and log link, https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20201013/testrun/3302398/suite/linux-log-parser/tests/ -- Linaro LKFT https://lkft.linaro.org ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: BUG: KASAN: global-out-of-bounds in vprintk_store on x86_64 2020-10-14 10:49 BUG: KASAN: global-out-of-bounds in vprintk_store on x86_64 Naresh Kamboju @ 2020-10-14 12:47 ` Sergey Senozhatsky 2020-10-14 14:05 ` Petr Mladek 1 sibling, 0 replies; 8+ messages in thread From: Sergey Senozhatsky @ 2020-10-14 12:47 UTC (permalink / raw) To: John Ogness, Petr Mladek Cc: Linux-Next Mailing List, linux-mm, open list, lkft-triage, LTP List, Andrew Morton, Mike Rapoport, Stephen Rothwell, Sergey Senozhatsky, Rasmus Villemoes, Steven Rostedt, Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner, Naresh Kamboju On (20/10/14 16:19), Naresh Kamboju wrote: > Crash log: > -------------- > [ 221.921944] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=c,mems_allowed=0,oom_memcg=/0,task_memcg=in > [ 221.922108] ================================================================== > [ 221.922111] BUG: KASAN: global-out-of-bounds in vprintk_store+0x362/0x3d0 > [ 221.922112] Write of size 2 at addr ffffffffba51dbcd by task > memcg_test_1/11282 > [ 221.922113] > [ 221.922114] CPU: 1 PID: 11282 Comm: memcg_test_1 Not tainted > 5.9.0-next-20201013 #1 > [ 221.922116] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS > 2.0b 07/27/2017 > [ 221.922116] Call Trace: > [ 221.922117] dump_stack+0xa4/0xd9 > [ 221.922118] print_address_description.constprop.0+0x21/0x210 > [ 221.922119] ? _raw_write_lock_bh+0xe0/0xe0 > [ 221.922120] ? vprintk_store+0x362/0x3d0 > [ 221.922121] kasan_report.cold+0x37/0x7c > [ 221.922122] ? vprintk_store+0x362/0x3d0 > [ 221.922123] check_memory_region+0x18c/0x1f0 > [ 221.922124] memcpy+0x3c/0x60 > [ 221.922125] vprintk_store+0x362/0x3d0 > [ 221.922125] ? __ia32_sys_syslog+0x50/0x50 > [ 221.922126] ? _raw_spin_lock_irqsave+0x9b/0x100 > [ 221.922127] ? _raw_spin_lock_irq+0xf0/0xf0 > [ 221.922128] ? __kasan_check_write+0x14/0x20 > [ 221.922129] vprintk_emit+0x8d/0x1f0 > [ 221.922130] vprintk_default+0x1d/0x20 > [ 221.922131] vprintk_func+0x5a/0x100 > [ 221.922132] printk+0xb2/0xe3 > [ 221.922133] ? swsusp_write.cold+0x189/0x189 > [ 221.922134] ? kernfs_vfs_xattr_set+0x60/0x60 > [ 221.922134] ? _raw_write_lock_bh+0xe0/0xe0 > [ 221.922135] ? trace_hardirqs_on+0x38/0x100 > [ 221.922136] pr_cont_kernfs_path.cold+0x49/0x4b A side note, So this brings the question: #define PATH_MAX 4096 static char kernfs_pr_cont_buf[PATH_MAX]; void pr_cont_kernfs_path(struct kernfs_node *kn) { ... if (sz >= sizeof(kernfs_pr_cont_buf)) { pr_cont("(name too long)"); goto out; } pr_cont("%s", kernfs_pr_cont_buf); ... } Do we still want to limit printk line to 1024? There are functions that want longer lines (e.g. kernel boot command line). -ss ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: BUG: KASAN: global-out-of-bounds in vprintk_store on x86_64 2020-10-14 10:49 BUG: KASAN: global-out-of-bounds in vprintk_store on x86_64 Naresh Kamboju 2020-10-14 12:47 ` Sergey Senozhatsky @ 2020-10-14 14:05 ` Petr Mladek 2020-10-14 15:13 ` Petr Mladek 1 sibling, 1 reply; 8+ messages in thread From: Petr Mladek @ 2020-10-14 14:05 UTC (permalink / raw) To: Naresh Kamboju Cc: Linux-Next Mailing List, linux-mm, open list, lkft-triage, LTP List, Andrew Morton, Mike Rapoport, Stephen Rothwell, John Ogness, Sergey Senozhatsky, Rasmus Villemoes, Steven Rostedt, Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner On Wed 2020-10-14 16:19:18, Naresh Kamboju wrote: > While testing LTP controllers testing on x86_64 KASAN enabled linux > next 20201013 tag > kernel this kernel BUG noticed. The oom-killer log also noticed while > running tests. > > metadata: > git branch: master > git repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next > git commit: f2fb1afc57304f9dd68c20a08270e287470af2eb > git describe: next-20201013 > make_kernelversion: 5.9.0 > kernel-config: > https://builds.tuxbuild.com/TXP6cqokP8BCYJrv7zf6kw/kernel.config > > > steps to reproduce: > ------------------------- > # cd /opt/ltp > # ./runltp -f controllers > > Crash log: > -------------- > [ 221.921944] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=c,mems_allowed=0,oom_memcg=/0,task_memcg=in This is the last string stored in the ring buffer before KASAN trigerred. > [ 221.922108] ================================================================== > [ 221.922111] BUG: KASAN: global-out-of-bounds in vprintk_store+0x362/0x3d0 > [ 221.922112] Write of size 2 at addr ffffffffba51dbcd by task > memcg_test_1/11282 My guess is that the size 2 is related to the string "in" from the above line. It is likely done by void pr_cont_kernfs_path(struct kernfs_node *kn) { [...] pr_cont("%s", kernfs_pr_cont_buf); called from void mem_cgroup_print_oom_context(struct mem_cgroup *memcg, struct task_struct *p) { [...] if (p) { pr_cont(",task_memcg="); pr_cont_cgroup_path(task_cgroup(p, memory_cgrp_id)); > [ 221.922113] > [ 221.922114] CPU: 1 PID: 11282 Comm: memcg_test_1 Not tainted > 5.9.0-next-20201013 #1 > [ 221.922116] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS > 2.0b 07/27/2017 > [ 221.922116] Call Trace: > [ 221.922117] dump_stack+0xa4/0xd9 > [ 221.922118] print_address_description.constprop.0+0x21/0x210 > [ 221.922119] ? _raw_write_lock_bh+0xe0/0xe0 > [ 221.922120] ? vprintk_store+0x362/0x3d0 > [ 221.922121] kasan_report.cold+0x37/0x7c > [ 221.922122] ? vprintk_store+0x362/0x3d0 > [ 221.922123] check_memory_region+0x18c/0x1f0 > [ 221.922124] memcpy+0x3c/0x60 > [ 221.922125] vprintk_store+0x362/0x3d0 It seems that vprintk() store was able to extend the last string by the two characters. So this is likely: static size_t log_output(int facility, int level, enum log_flags lflags, const struct dev_printk_info *dev_info, char *text, size_t text_len) { [...] if (prb_reserve_in_last(&e, prb, &r, caller_id, LOG_LINE_MAX)) { memcpy(&r.text_buf[r.info->text_len], text, text_len); But very likely the two characters were copied to wrong location. There are many similar lines in the full log and they always contain task_memcg=/0 It means that the size of the path is 2 characters but it should be "/0". I guess that "in" was in the log buffer from the previous wrap. So, it seems that prb_reserve_in_last() correctly updated the size of the extended record but it returned wrong pointer to the buffer or wrong current length. Anyway, prb_commit(&e) moved the buffer back to consistent state. > [ 221.922125] ? __ia32_sys_syslog+0x50/0x50 > [ 221.922126] ? _raw_spin_lock_irqsave+0x9b/0x100 > [ 221.922127] ? _raw_spin_lock_irq+0xf0/0xf0 > [ 221.922128] ? __kasan_check_write+0x14/0x20 > [ 221.922129] vprintk_emit+0x8d/0x1f0 > [ 221.922130] vprintk_default+0x1d/0x20 > [ 221.922131] vprintk_func+0x5a/0x100 > [ 221.922132] printk+0xb2/0xe3 > [ 221.922133] ? swsusp_write.cold+0x189/0x189 > [ 221.922134] ? kernfs_vfs_xattr_set+0x60/0x60 > [ 221.922134] ? _raw_write_lock_bh+0xe0/0xe0 > [ 221.922135] ? trace_hardirqs_on+0x38/0x100 > [ 221.922136] pr_cont_kernfs_path.cold+0x49/0x4b > [ 221.922137] mem_cgroup_print_oom_context.cold+0x74/0xc3 > [ 221.922138] dump_header+0x340/0x3bf > [ 221.922139] oom_kill_process.cold+0xb/0x10 > [ 221.922140] out_of_memory+0x1e9/0x860 > [ 221.922141] ? oom_killer_disable+0x210/0x210 > [ 221.922142] mem_cgroup_out_of_memory+0x198/0x1c0 > [ 221.922143] ? mem_cgroup_count_precharge_pte_range+0x250/0x250 > [ 221.922144] try_charge+0xa9b/0xc50 > [ 221.922145] ? arch_stack_walk+0x9e/0xf0 > [ 221.922146] ? memory_high_write+0x230/0x230 > [ 221.922146] ? avc_has_extended_perms+0x830/0x830 > [ 221.922147] ? stack_trace_save+0x94/0xc0 > [ 221.922148] ? stack_trace_consume_entry+0x90/0x90 > [ 221.922149] __memcg_kmem_charge+0x73/0x120 > [ 221.922150] ? cred_has_capability+0x10f/0x200 > [ 221.922151] ? mem_cgroup_can_attach+0x260/0x260 > [ 221.922152] ? selinux_sb_eat_lsm_opts+0x2f0/0x2f0 > [ 221.922153] ? obj_cgroup_charge+0x16b/0x220 > [ 221.922154] ? kmem_cache_alloc+0x78/0x4c0 > [ 221.922155] obj_cgroup_charge+0x122/0x220 > [ 221.922156] ? vm_area_alloc+0x20/0x90 > [ 221.922156] kmem_cache_alloc+0x78/0x4c0 > [ 221.922157] vm_area_alloc+0x20/0x90 > [ 221.922158] mmap_region+0x3ed/0x9a0 > [ 221.922159] ? cap_mmap_addr+0x1d/0x80 > [ 221.922160] do_mmap+0x3ee/0x720 > [ 221.922161] vm_mmap_pgoff+0x16a/0x1c0 > [ 221.922162] ? randomize_stack_top+0x90/0x90 > [ 221.922163] ? copy_page_range+0x1980/0x1980 > [ 221.922163] ksys_mmap_pgoff+0xab/0x350 > [ 221.922164] ? find_mergeable_anon_vma+0x110/0x110 > [ 221.922165] ? __audit_syscall_entry+0x1a6/0x1e0 > [ 221.922166] __x64_sys_mmap+0x8d/0xb0 > [ 221.922167] do_syscall_64+0x38/0x50 > [ 221.922168] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [ 221.922169] RIP: 0033:0x7fe8f5e75103 > [ 221.922172] Code: 54 41 89 d4 55 48 89 fd 53 4c 89 cb 48 85 ff 74 > 56 49 89 d9 45 89 f8 45 89 f2 44 89 e2 4c 89 ee 48 89 ef b8 09 00 00 > 00 0f 05 <48> 3d 00 f0 ff ff 77 7d 5b 5d 41 5c 41 5d 41 5e 41 5f c3 66 > 2e 0f > [ 221.922173] RSP: 002b:00007ffd38c90198 EFLAGS: 00000246 ORIG_RAX: > 0000000000000009 > [ 221.922175] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fe8f5e75103 > [ 221.922176] RDX: 0000000000000003 RSI: 0000000000001000 RDI: 0000000000000000 > [ 221.922178] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 > [ 221.922179] R10: 0000000000002022 R11: 0000000000000246 R12: 0000000000000003 > [ 221.922180] R13: 0000000000001000 R14: 0000000000002022 R15: 0000000000000000 > [ 221.922181] > [ 213O[ 221.922182] The buggy address belongs to the variable: > [ 221.922183] clear_seq+0x2d/0x40 I am slightly confused by this. The address ffffffffba51dbcd seems to be from the range of "module mapping space" adresses. "clear_seq" is static variable from kernel/printk/printk.c. I would expect that it is built-in and thus in the range "kernel text mapping". Anyway, the variable is defined as: static u64 clear_seq; The size should be 8B. I wonder where 0x2d/0x40 comes from. Well, it probaly is not important. I guess that there is a bug somewhere in prb_reserve_in_last(). > [ 221.922183] > [ 221.922184] Memory state around the buggy address: > [ 221.922185] ffffffffba51da80: 00 00 00 00 00 00 00 00 00 00 00 00 > 00 00 00 00 > [ 221.922187] ffffffffba51db00: 00 00 00 00 00 00 00 00 00 00 00 00 > 00 00 00 00 > [ 221.922188] >ffffffffba51db80: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 > 00 f9 f9 f9 > [ 221.922189] ^ > [ 221.922190] ffffffffba51dc00: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 > 00 f9 f9 f9 > [ 221.922191] ffffffffba51dc80: f9 f9 f9 f9 01 f9 f9 f9 f9 f9 f9 f9 > 00 f9 f9 f9 > [ 221.922193] ================================================================== > [ 221.922194] Disabling lock debugging due to kernel taint > [ 221.922196] ,task=memcg_test_1,pid=11280,uid=0 This is the rest of the line that should have been concatenated. It means that the code. > [ 221.922205] Memory cgroup out of memory: Killed process 11280 > (memcg_test_1) total-vm:4124kB, anon-rss:72kB, file-rss:0kB, > shmem-rss:0kB, UID:0 pgtables:48kB oom_score_adj:0 > [ 221.922509] memcg_test_1 invoked oom-killer: > gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0 > [ 222.885676] CPU: 2 PID: 11283 Comm: memcg_test_1 Tainted: G B > 5.9.0-next-20201013 #1 > [ 222.885678] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS > 2.0b 07/27/2017 > [ 222.885679] Call Trace: > [ 222.885683] dump_stack+0xa4/0xd9 > [ 222.885686] dump_header+0x8f/0x3bf > [ 222.885689] oom_kill_process.cold+0xb/0x10 > [ 222.885692] out_of_memory+0x1e9/0x860 > [ 222.885697] ? oom_killer_disable+0x210/0x210 > [ 222.923728] mem_cgroup_out_of_memory+0x198/0x1c0 > [ 222.923731] ? mem_cgroup_count_precharge_pte_range+0x250/0x250 The OOM is likely expected. It seems that the test is testing OOM situations in memcg code. I am going to dig more into it. Best Regards, Petr ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: BUG: KASAN: global-out-of-bounds in vprintk_store on x86_64 2020-10-14 14:05 ` Petr Mladek @ 2020-10-14 15:13 ` Petr Mladek 2020-10-14 17:50 ` [PATCH] printk: ringbuffer: Wrong data pointer when appending small string Petr Mladek 0 siblings, 1 reply; 8+ messages in thread From: Petr Mladek @ 2020-10-14 15:13 UTC (permalink / raw) To: Naresh Kamboju Cc: Linux-Next Mailing List, linux-mm, open list, lkft-triage, LTP List, Andrew Morton, Mike Rapoport, Stephen Rothwell, John Ogness, Sergey Senozhatsky, Rasmus Villemoes, Steven Rostedt, Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner On Wed 2020-10-14 16:05:20, Petr Mladek wrote: > On Wed 2020-10-14 16:19:18, Naresh Kamboju wrote: > > While testing LTP controllers testing on x86_64 KASAN enabled linux > > next 20201013 tag > > kernel this kernel BUG noticed. The oom-killer log also noticed while > > running tests. > > > > metadata: > > git branch: master > > git repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next > > git commit: f2fb1afc57304f9dd68c20a08270e287470af2eb > > git describe: next-20201013 > > make_kernelversion: 5.9.0 > > kernel-config: > > https://builds.tuxbuild.com/TXP6cqokP8BCYJrv7zf6kw/kernel.config > > > This is the last string stored in the ring buffer before KASAN trigerred. > > > [ 221.922108] ================================================================== > > [ 221.922111] BUG: KASAN: global-out-of-bounds in vprintk_store+0x362/0x3d0 > > [ 221.922112] Write of size 2 at addr ffffffffba51dbcd by task > > memcg_test_1/11282 > > > [ 221.922114] CPU: 1 PID: 11282 Comm: memcg_test_1 Not tainted > > 5.9.0-next-20201013 #1 > > [ 221.922116] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS > > 2.0b 07/27/2017 > > [ 221.922116] Call Trace: > > [ 221.922117] dump_stack+0xa4/0xd9 > > [ 221.922118] print_address_description.constprop.0+0x21/0x210 > > [ 221.922119] ? _raw_write_lock_bh+0xe0/0xe0 > > [ 221.922120] ? vprintk_store+0x362/0x3d0 > > [ 221.922121] kasan_report.cold+0x37/0x7c > > [ 221.922122] ? vprintk_store+0x362/0x3d0 > > [ 221.922123] check_memory_region+0x18c/0x1f0 > > [ 221.922124] memcpy+0x3c/0x60 > > [ 221.922125] vprintk_store+0x362/0x3d0 > > It seems that vprintk() store was able to extend the last string > by the two characters. So this is likely: > > static size_t log_output(int facility, int level, enum log_flags lflags, > const struct dev_printk_info *dev_info, > char *text, size_t text_len) > { > [...] > if (prb_reserve_in_last(&e, prb, &r, caller_id, LOG_LINE_MAX)) { > memcpy(&r.text_buf[r.info->text_len], text, text_len); > > But very likely the two characters were copied to wrong location. > There are many similar lines in the full log and they always contain > > task_memcg=/0 > > It means that the size of the path is 2 characters but it should be > "/0". I guess that "in" was in the log buffer from the previous > wrap. > > So, it seems that prb_reserve_in_last() correctly updated the size > of the extended record but it returned wrong pointer to the buffer > or wrong current length. > > Anyway, prb_commit(&e) moved the buffer back to consistent state. > > [ 213O[ 221.922182] The buggy address belongs to the variable: > > [ 221.922183] clear_seq+0x2d/0x40 > > The address ffffffffba51dbcd seems to be from the range of > "module mapping space" adresses. > > "clear_seq" is static variable from kernel/printk/printk.c. There is a bug in data_realloc(). We return wrong pointer when there is no need to resize the record and it has been wrapped. It might happen when we add only few characters and there was a space because of alignment. It would cause writing outside the buffer into another statically defined variables. So, it might be it. The following change patch should fix it: diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c index 2493348a1631..24a960a89aa8 100644 --- a/kernel/printk/printk_ringbuffer.c +++ b/kernel/printk/printk_ringbuffer.c @@ -1125,7 +1125,10 @@ static char *data_realloc(struct printk_ringbuffer *rb, /* If the data block does not increase, there is nothing to do. */ if (head_lpos - next_lpos < DATA_SIZE(data_ring)) { - blk = to_block(data_ring, blk_lpos->begin); + if (wrapped) + blk = to_block(data_ring, 0); + else + blk = to_block(data_ring, blk_lpos->begin); return &blk->data[0]; } ^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH] printk: ringbuffer: Wrong data pointer when appending small string 2020-10-14 15:13 ` Petr Mladek @ 2020-10-14 17:50 ` Petr Mladek 2020-10-15 6:31 ` Sergey Senozhatsky 2020-10-15 12:22 ` Petr Mladek 0 siblings, 2 replies; 8+ messages in thread From: Petr Mladek @ 2020-10-14 17:50 UTC (permalink / raw) To: John Ogness, Sergey Senozhatsky, Steven Rostedt Cc: Linux-Next Mailing List, linux-mm, open list, lkft-triage, LTP List, Andrew Morton, Mike Rapoport, Stephen Rothwell, Rasmus Villemoes, Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner, Naresh Kamboju data_realloc() returns wrong data pointer when the block is wrapped and the size is not increased. It might happen when pr_cont() wants to add only few characters and there is already a space for them because of alignment. It might cause writing outsite the buffer. It has been detected by LTP tests with KASAN enabled: [ 221.921944] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=c,mems_allowed=0,oom_memcg=/0,task_memcg=in [ 221.922108] ================================================================== [ 221.922111] BUG: KASAN: global-out-of-bounds in vprintk_store+0x362/0x3d0 [ 221.922112] Write of size 2 at addr ffffffffba51dbcd by task memcg_test_1/11282 [ 221.922113] [ 221.922114] CPU: 1 PID: 11282 Comm: memcg_test_1 Not tainted 5.9.0-next-20201013 #1 [ 221.922116] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.0b 07/27/2017 [ 221.922116] Call Trace: [ 221.922117] dump_stack+0xa4/0xd9 [ 221.922118] print_address_description.constprop.0+0x21/0x210 [ 221.922119] ? _raw_write_lock_bh+0xe0/0xe0 [ 221.922120] ? vprintk_store+0x362/0x3d0 [ 221.922121] kasan_report.cold+0x37/0x7c [ 221.922122] ? vprintk_store+0x362/0x3d0 [ 221.922123] check_memory_region+0x18c/0x1f0 [ 221.922124] memcpy+0x3c/0x60 [ 221.922125] vprintk_store+0x362/0x3d0 [ 221.922125] ? __ia32_sys_syslog+0x50/0x50 [ 221.922126] ? _raw_spin_lock_irqsave+0x9b/0x100 [ 221.922127] ? _raw_spin_lock_irq+0xf0/0xf0 [ 221.922128] ? __kasan_check_write+0x14/0x20 [ 221.922129] vprintk_emit+0x8d/0x1f0 [ 221.922130] vprintk_default+0x1d/0x20 [ 221.922131] vprintk_func+0x5a/0x100 [ 221.922132] printk+0xb2/0xe3 [ 221.922133] ? swsusp_write.cold+0x189/0x189 [ 221.922134] ? kernfs_vfs_xattr_set+0x60/0x60 [ 221.922134] ? _raw_write_lock_bh+0xe0/0xe0 [ 221.922135] ? trace_hardirqs_on+0x38/0x100 [ 221.922136] pr_cont_kernfs_path.cold+0x49/0x4b [ 221.922137] mem_cgroup_print_oom_context.cold+0x74/0xc3 [ 221.922138] dump_header+0x340/0x3bf [ 221.922139] oom_kill_process.cold+0xb/0x10 [ 221.922140] out_of_memory+0x1e9/0x860 [ 221.922141] ? oom_killer_disable+0x210/0x210 [ 221.922142] mem_cgroup_out_of_memory+0x198/0x1c0 [ 221.922143] ? mem_cgroup_count_precharge_pte_range+0x250/0x250 [ 221.922144] try_charge+0xa9b/0xc50 [ 221.922145] ? arch_stack_walk+0x9e/0xf0 [ 221.922146] ? memory_high_write+0x230/0x230 [ 221.922146] ? avc_has_extended_perms+0x830/0x830 [ 221.922147] ? stack_trace_save+0x94/0xc0 [ 221.922148] ? stack_trace_consume_entry+0x90/0x90 [ 221.922149] __memcg_kmem_charge+0x73/0x120 [ 221.922150] ? cred_has_capability+0x10f/0x200 [ 221.922151] ? mem_cgroup_can_attach+0x260/0x260 [ 221.922152] ? selinux_sb_eat_lsm_opts+0x2f0/0x2f0 [ 221.922153] ? obj_cgroup_charge+0x16b/0x220 [ 221.922154] ? kmem_cache_alloc+0x78/0x4c0 [ 221.922155] obj_cgroup_charge+0x122/0x220 [ 221.922156] ? vm_area_alloc+0x20/0x90 [ 221.922156] kmem_cache_alloc+0x78/0x4c0 [ 221.922157] vm_area_alloc+0x20/0x90 [ 221.922158] mmap_region+0x3ed/0x9a0 [ 221.922159] ? cap_mmap_addr+0x1d/0x80 [ 221.922160] do_mmap+0x3ee/0x720 [ 221.922161] vm_mmap_pgoff+0x16a/0x1c0 [ 221.922162] ? randomize_stack_top+0x90/0x90 [ 221.922163] ? copy_page_range+0x1980/0x1980 [ 221.922163] ksys_mmap_pgoff+0xab/0x350 [ 221.922164] ? find_mergeable_anon_vma+0x110/0x110 [ 221.922165] ? __audit_syscall_entry+0x1a6/0x1e0 [ 221.922166] __x64_sys_mmap+0x8d/0xb0 [ 221.922167] do_syscall_64+0x38/0x50 [ 221.922168] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 221.922169] RIP: 0033:0x7fe8f5e75103 [ 221.922172] Code: 54 41 89 d4 55 48 89 fd 53 4c 89 cb 48 85 ff 74 56 49 89 d9 45 89 f8 45 89 f2 44 89 e2 4c 89 ee 48 89 ef b8 09 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7d 5b 5d 41 5c 41 5d 41 5e 41 5f c3 66 2e 0f [ 221.922173] RSP: 002b:00007ffd38c90198 EFLAGS: 00000246 ORIG_RAX: 0000000000000009 [ 221.922175] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fe8f5e75103 [ 221.922176] RDX: 0000000000000003 RSI: 0000000000001000 RDI: 0000000000000000 [ 221.922178] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [ 221.922179] R10: 0000000000002022 R11: 0000000000000246 R12: 0000000000000003 [ 221.922180] R13: 0000000000001000 R14: 0000000000002022 R15: 0000000000000000 [ 221.922181] [ 213O[ 221.922182] The buggy address belongs to the variable: [ 221.922183] clear_seq+0x2d/0x40 [ 221.922183] [ 221.922184] Memory state around the buggy address: [ 221.922185] ffffffffba51da80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 221.922187] ffffffffba51db00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 221.922188] >ffffffffba51db80: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 [ 221.922189] ^ [ 221.922190] ffffffffba51dc00: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 [ 221.922191] ffffffffba51dc80: f9 f9 f9 f9 01 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 [ 221.922193] ================================================================== [ 221.922194] Disabling lock debugging due to kernel taint [ 221.922196] ,task=memcg_test_1,pid=11280,uid=0 [ 221.922205] Memory cgroup out of memory: Killed process 11280 Link: https://lore.kernel.org/r/CA+G9fYt46oC7-BKryNDaaXPJ9GztvS2cs_7GjYRjanRi4+ryCQ@mail.gmail.com Fixes: 4cfc7258f876a7feba673ac ("printk: ringbuffer: add finalization/extension support") Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Petr Mladek <pmladek@suse.com> --- kernel/printk/printk_ringbuffer.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c index 2493348a1631..24a960a89aa8 100644 --- a/kernel/printk/printk_ringbuffer.c +++ b/kernel/printk/printk_ringbuffer.c @@ -1125,7 +1125,10 @@ static char *data_realloc(struct printk_ringbuffer *rb, /* If the data block does not increase, there is nothing to do. */ if (head_lpos - next_lpos < DATA_SIZE(data_ring)) { - blk = to_block(data_ring, blk_lpos->begin); + if (wrapped) + blk = to_block(data_ring, 0); + else + blk = to_block(data_ring, blk_lpos->begin); return &blk->data[0]; } -- 2.26.2 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] printk: ringbuffer: Wrong data pointer when appending small string 2020-10-14 17:50 ` [PATCH] printk: ringbuffer: Wrong data pointer when appending small string Petr Mladek @ 2020-10-15 6:31 ` Sergey Senozhatsky 2020-10-15 9:59 ` John Ogness 2020-10-15 12:22 ` Petr Mladek 1 sibling, 1 reply; 8+ messages in thread From: Sergey Senozhatsky @ 2020-10-15 6:31 UTC (permalink / raw) To: Petr Mladek Cc: John Ogness, Sergey Senozhatsky, Steven Rostedt, Linux-Next Mailing List, linux-mm, open list, lkft-triage, LTP List, Andrew Morton, Mike Rapoport, Stephen Rothwell, Rasmus Villemoes, Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner, Naresh Kamboju On (20/10/14 19:50), Petr Mladek wrote: > diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c > index 2493348a1631..24a960a89aa8 100644 > --- a/kernel/printk/printk_ringbuffer.c > +++ b/kernel/printk/printk_ringbuffer.c > @@ -1125,7 +1125,10 @@ static char *data_realloc(struct printk_ringbuffer *rb, > > /* If the data block does not increase, there is nothing to do. */ > if (head_lpos - next_lpos < DATA_SIZE(data_ring)) { > - blk = to_block(data_ring, blk_lpos->begin); > + if (wrapped) > + blk = to_block(data_ring, 0); > + else > + blk = to_block(data_ring, blk_lpos->begin); > return &blk->data[0]; > } Great catch. Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> -ss ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] printk: ringbuffer: Wrong data pointer when appending small string 2020-10-15 6:31 ` Sergey Senozhatsky @ 2020-10-15 9:59 ` John Ogness 0 siblings, 0 replies; 8+ messages in thread From: John Ogness @ 2020-10-15 9:59 UTC (permalink / raw) To: Sergey Senozhatsky, Petr Mladek Cc: Sergey Senozhatsky, Steven Rostedt, Linux-Next Mailing List, linux-mm, open list, lkft-triage, LTP List, Andrew Morton, Mike Rapoport, Stephen Rothwell, Rasmus Villemoes, Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner, Naresh Kamboju On 2020-10-15, Sergey Senozhatsky <sergey.senozhatsky@gmail.com> wrote: >> diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c >> index 2493348a1631..24a960a89aa8 100644 >> --- a/kernel/printk/printk_ringbuffer.c >> +++ b/kernel/printk/printk_ringbuffer.c >> @@ -1125,7 +1125,10 @@ static char *data_realloc(struct printk_ringbuffer *rb, >> >> /* If the data block does not increase, there is nothing to do. */ >> if (head_lpos - next_lpos < DATA_SIZE(data_ring)) { >> - blk = to_block(data_ring, blk_lpos->begin); >> + if (wrapped) >> + blk = to_block(data_ring, 0); >> + else >> + blk = to_block(data_ring, blk_lpos->begin); >> return &blk->data[0]; >> } > > Great catch. Indeed. I don't know how we missed that one. :-/ Reviewed-by: John Ogness <john.ogness@linutronix.de> ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] printk: ringbuffer: Wrong data pointer when appending small string 2020-10-14 17:50 ` [PATCH] printk: ringbuffer: Wrong data pointer when appending small string Petr Mladek 2020-10-15 6:31 ` Sergey Senozhatsky @ 2020-10-15 12:22 ` Petr Mladek 1 sibling, 0 replies; 8+ messages in thread From: Petr Mladek @ 2020-10-15 12:22 UTC (permalink / raw) To: John Ogness, Sergey Senozhatsky, Steven Rostedt Cc: Linux-Next Mailing List, linux-mm, open list, lkft-triage, LTP List, Andrew Morton, Mike Rapoport, Stephen Rothwell, Rasmus Villemoes, Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner, Naresh Kamboju On Wed 2020-10-14 19:50:54, Petr Mladek wrote: > data_realloc() returns wrong data pointer when the block is wrapped and > the size is not increased. It might happen when pr_cont() wants to > add only few characters and there is already a space for them because > of alignment. > > It might cause writing outsite the buffer. It has been detected by LTP > tests with KASAN enabled: > > Link: https://lore.kernel.org/r/CA+G9fYt46oC7-BKryNDaaXPJ9GztvS2cs_7GjYRjanRi4+ryCQ@mail.gmail.com > Fixes: 4cfc7258f876a7feba673ac ("printk: ringbuffer: add finalization/extension support") > Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> > Signed-off-by: Petr Mladek <pmladek@suse.com> The patch is committed into printk/linux.git, branch for-5.10-fixup. I am going to send a pull request with it tomorrow unless something happens in the meantime. Best Regards, Petr ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2020-10-15 12:22 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-10-14 10:49 BUG: KASAN: global-out-of-bounds in vprintk_store on x86_64 Naresh Kamboju 2020-10-14 12:47 ` Sergey Senozhatsky 2020-10-14 14:05 ` Petr Mladek 2020-10-14 15:13 ` Petr Mladek 2020-10-14 17:50 ` [PATCH] printk: ringbuffer: Wrong data pointer when appending small string Petr Mladek 2020-10-15 6:31 ` Sergey Senozhatsky 2020-10-15 9:59 ` John Ogness 2020-10-15 12:22 ` Petr Mladek
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).