* zram: lockdep spew for zram->init_lock @ 2014-02-27 14:48 Sasha Levin 2014-02-27 23:56 ` Minchan Kim 0 siblings, 1 reply; 6+ messages in thread From: Sasha Levin @ 2014-02-27 14:48 UTC (permalink / raw) To: Minchan Kim, ngupta; +Cc: LKML Hi all, I've stumbled on the following spew while fuzzing with trinity inside a KVM tools guest running latest -next. It looks like a false positive (we only set size for uninitialized devices, so we can't deadlock on them being in-use) but I'd really like someone to confirm it before I write it down as such. [ 2655.365684] ================================= [ 2655.368278] [ INFO: inconsistent lock state ] [ 2655.370163] 3.14.0-rc4-next-20140226-sasha-00013-g082bdac-dirty #4 Tainted: G W [ 2655.371972] --------------------------------- [ 2655.371972] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-R} usage. [ 2655.371972] kswapd30/5352 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 2655.371972] (&zram->init_lock){+++++-}, at: [<drivers/block/zram/zram_drv.c:663>] zram_make_request+0x2a/0xc0 [ 2655.371972] {RECLAIM_FS-ON-W} state was registered at: [ 2655.371972] [<kernel/locking/lockdep.c:2523>] mark_held_locks+0x6c/0x90 [ 2655.371972] [<kernel/locking/lockdep.c:2745 kernel/locking/lockdep.c:2760>] lockdep_trace_alloc+0xfd/0x140 [ 2655.371972] [<mm/slub.c:965 mm/slub.c:2402 mm/slub.c:2475 mm/slub.c:2492>] kmem_cache_alloc_trace+0x32/0x2e0 [ 2655.371972] [<include/linux/slab.h:453 drivers/block/zram/zram_drv.c:172>] zram_meta_alloc+0x20/0x150 [ 2655.371972] [<drivers/block/zram/zram_drv.c:554>] disksize_store+0x8e/0xf0 [ 2655.371972] [<drivers/base/core.c:139>] dev_attr_store+0x1b/0x20 [ 2655.371972] [<fs/sysfs/file.c:114>] sysfs_kf_write+0x4a/0x60 [ 2655.371972] [<fs/kernfs/file.c:299>] kernfs_fop_write+0x110/0x190 [ 2655.371972] [<fs/read_write.c:473>] vfs_write+0xe3/0x1d0 [ 2655.371972] [<fs/read_write.c:523 fs/read_write.c:515>] SyS_write+0x5d/0xa0 [ 2655.371972] [<arch/x86/kernel/entry_64.S:749>] tracesys+0xdd/0xe2 [ 2655.371972] irq event stamp: 10207 [ 2655.371972] hardirqs last enabled at (10207): [<arch/x86/include/asm/paravirt.h:809 block/blk-throttle.c:982>] throtl_update_dispatch_stats+0x15d/0x1a0 [ 2655.371972] hardirqs last disabled at (10206): [<block/blk-throttle.c:977>] throtl_update_dispatch_stats+0xa4/0x1a0 [ 2655.371972] softirqs last enabled at (10172): [<arch/x86/include/asm/preempt.h:22 kernel/softirq.c:297>] __do_softirq+0x447/0x4f0 [ 2655.371972] softirqs last disabled at (10165): [<kernel/softirq.c:347 kernel/softirq.c:388>] irq_exit+0x83/0x160 [ 2655.371972] [ 2655.371972] other info that might help us debug this: [ 2655.371972] Possible unsafe locking scenario: [ 2655.371972] [ 2655.371972] CPU0 [ 2655.371972] ---- [ 2655.371972] lock(&zram->init_lock); [ 2655.371972] <Interrupt> [ 2655.371972] lock(&zram->init_lock); [ 2655.371972] [ 2655.371972] *** DEADLOCK *** [ 2655.371972] [ 2655.371972] no locks held by kswapd30/5352. [ 2655.371972] [ 2655.371972] stack backtrace: [ 2655.371972] CPU: 78 PID: 5352 Comm: kswapd30 Tainted: G W 3.14.0-rc4-next-20140226-sasha-00013-g082bdac-dirty #4 [ 2655.371972] ffff880636f98cc0 ffff880636f8d3f8 ffffffff843882e5 0000000000000000 [ 2655.371972] ffff880636f98000 ffff880636f8d458 ffffffff811a09f7 0000000000000000 [ 2655.371972] 0000000000000001 ffff880600000001 ffffffff876c2060 0000000000000009 [ 2655.371972] Call Trace: [ 2655.371972] [<lib/dump_stack.c:52>] dump_stack+0x52/0x7f [ 2655.371972] [<kernel/locking/lockdep.c:2254>] print_usage_bug+0x1a7/0x1e0 [ 2655.371972] [<kernel/locking/lockdep.c:2347>] ? print_usage_bug+0x1e0/0x1e0 [ 2655.371972] [<kernel/locking/lockdep.c:2465>] mark_lock_irq+0xd9/0x2a0 [ 2655.371972] [<kernel/locking/lockdep.c:2920>] mark_lock+0x128/0x210 [ 2655.371972] [<kernel/locking/lockdep.c:2821>] mark_irqflags+0x144/0x170 [ 2655.371972] [<kernel/locking/lockdep.c:3138>] __lock_acquire+0x2de/0x5a0 [ 2655.371972] [<arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602>] lock_acquire+0x182/0x1d0 [ 2655.371972] [<drivers/block/zram/zram_drv.c:663>] ? zram_make_request+0x2a/0xc0 [ 2655.371972] [<arch/x86/include/asm/rwsem.h:83 kernel/locking/rwsem.c:23>] down_read+0x47/0xa0 [ 2655.371972] [<drivers/block/zram/zram_drv.c:663>] ? zram_make_request+0x2a/0xc0 [ 2655.371972] [<arch/x86/include/asm/current.h:14 kernel/sched/core.c:2508>] ? preempt_count_add+0x96/0xc0 [ 2655.371972] [<drivers/block/zram/zram_drv.c:663>] zram_make_request+0x2a/0xc0 [ 2655.371972] [<block/blk-core.c:1862>] generic_make_request+0xb6/0x110 [ 2655.371972] [<block/blk-core.c:1913>] submit_bio+0x148/0x170 [ 2655.371972] [<include/linux/rcupdate.h:800 include/linux/memcontrol.h:180 mm/page-writeback.c:2408>] ? test_set_page_writeback+0x24e/0x2a0 [ 2655.371972] [<mm/page_io.c:315>] __swap_writepage+0x1fc/0x220 [ 2655.371972] [<arch/x86/include/asm/preempt.h:98 include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:183>] ? _raw_spin_unlock+0x30/0x50 [ 2655.371972] [<mm/swapfile.c:898>] ? page_swapcount+0x4e/0x60 [ 2655.371972] [<mm/page_io.c:249>] swap_writepage+0x72/0x80 [ 2655.371972] [<mm/vmscan.c:502>] pageout+0x167/0x2e0 [ 2655.371972] [<mm/vmscan.c:1015>] shrink_page_list+0x4f4/0x7c0 [ 2655.371972] [<include/linux/spinlock.h:328 mm/vmscan.c:1503>] shrink_inactive_list+0x31c/0x570 [ 2655.371972] [<mm/vmscan.c:1744>] ? shrink_active_list+0x30b/0x320 [ 2655.371972] [<mm/vmscan.c:1830 mm/vmscan.c:2054>] shrink_lruvec+0x124/0x300 [ 2655.371972] [<arch/x86/include/asm/paravirt.h:192 arch/x86/kernel/tsc.c:305>] ? sched_clock+0x1d/0x30 [ 2655.371972] [<mm/vmscan.c:2235>] shrink_zone+0x8e/0x1d0 [ 2655.371972] [<include/linux/bitmap.h:165 include/linux/nodemask.h:131 mm/vmscan.c:2904>] kswapd_shrink_zone+0xf1/0x1b0 [ 2655.371972] [<mm/vmscan.c:3088>] balance_pgdat+0x363/0x540 [ 2655.371972] [<kernel/sched/wait.c:254>] ? finish_wait+0x70/0x90 [ 2655.371972] [<mm/vmscan.c:3296>] kswapd+0x2eb/0x350 [ 2655.371972] [<mm/vmscan.c:3213>] ? ftrace_raw_event_mm_vmscan_writepage+0x180/0x180 [ 2655.371972] [<kernel/kthread.c:216>] kthread+0x105/0x110 [ 2655.371972] [<kernel/locking/lockdep.c:3506>] ? __lock_release+0x1e2/0x200 [ 2655.371972] [<kernel/kthread.c:185>] ? set_kthreadd_affinity+0x30/0x30 [ 2655.371972] [<arch/x86/kernel/entry_64.S:555>] ret_from_fork+0x7c/0xb0 [ 2655.371972] [<kernel/kthread.c:185>] ? set_kthreadd_affinity+0x30/0x30 Thanks, Sasha ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: zram: lockdep spew for zram->init_lock 2014-02-27 14:48 zram: lockdep spew for zram->init_lock Sasha Levin @ 2014-02-27 23:56 ` Minchan Kim 2014-03-01 0:32 ` Andrew Morton 0 siblings, 1 reply; 6+ messages in thread From: Minchan Kim @ 2014-02-27 23:56 UTC (permalink / raw) To: Sasha Levin; +Cc: ngupta, LKML, Sergey Senozhatsky, Andrew Morton Hello Sasha, On Thu, Feb 27, 2014 at 09:48:37AM -0500, Sasha Levin wrote: > Hi all, > > I've stumbled on the following spew while fuzzing with trinity > inside a KVM tools guest running latest -next. It looks like a false > positive (we only set size for uninitialized devices, so we can't > deadlock on them being in-use) but I'd really like someone to > confirm it before I write it down as such. Thanks for the info! As you said, it's false positive and introduced by [1] in recent linux-next but I don't feel we need to annotate to prevent lockdep spew. Just enough to move zram_meta_alloc out of lock as old. Sergey claimed that "let's avoid unnecessary allocation/free of zram_meta on currently used device" but I think it's not worth to add annotate lock so that lockdep would be void. [1] zram: delete zram_init_device() >From 89353c8319e183826b03bf8562569f46ae460763 Mon Sep 17 00:00:00 2001 From: Minchan Kim <minchan@kernel.org> Date: Fri, 28 Feb 2014 08:39:21 +0900 Subject: [PATCH] zram: prevent lockdep spew of init_lock Sasha reported following below lockdep spew of zram. It was introduced by [1] in recent linux-next but it's false positive because zram_meta_alloc with down_write(init_lock) couldn't be called during zram is working as swap device so we could annotate the lock. But I don't think it's worthy because it would make greate lockdep less effective. Instead, move zram_meta_alloc out of the lock as good old day so we could do unnecessary allocation/free of zram_meta for initialied device as Sergey claimed in [1] but it wouldn't be common and be harmful if someone might do it. Rather than, I'd like to respect lockdep which is great tool to prevent upcoming subtle bugs. [1] zram: delete zram_init_device [ 2655.365684] ================================= [ 2655.368278] [ INFO: inconsistent lock state ] [ 2655.370163] 3.14.0-rc4-next-20140226-sasha-00013-g082bdac-dirty #4 Tainted: G W [ 2655.371972] --------------------------------- [ 2655.371972] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-R} usage. [ 2655.371972] kswapd30/5352 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 2655.371972] (&zram->init_lock){+++++-}, at: [<drivers/block/zram/zram_drv.c:663>] zram_make_request+0x2a/0xc0 [ 2655.371972] {RECLAIM_FS-ON-W} state was registered at: [ 2655.371972] [<kernel/locking/lockdep.c:2523>] mark_held_locks+0x6c/0x90 [ 2655.371972] [<kernel/locking/lockdep.c:2745 kernel/locking/lockdep.c:2760>] lockdep_trace_alloc+0xfd/0x140 [ 2655.371972] [<mm/slub.c:965 mm/slub.c:2402 mm/slub.c:2475 mm/slub.c:2492>] kmem_cache_alloc_trace+0x32/0x2e0 [ 2655.371972] [<include/linux/slab.h:453 drivers/block/zram/zram_drv.c:172>] zram_meta_alloc+0x20/0x150 [ 2655.371972] [<drivers/block/zram/zram_drv.c:554>] disksize_store+0x8e/0xf0 [ 2655.371972] [<drivers/base/core.c:139>] dev_attr_store+0x1b/0x20 [ 2655.371972] [<fs/sysfs/file.c:114>] sysfs_kf_write+0x4a/0x60 [ 2655.371972] [<fs/kernfs/file.c:299>] kernfs_fop_write+0x110/0x190 [ 2655.371972] [<fs/read_write.c:473>] vfs_write+0xe3/0x1d0 [ 2655.371972] [<fs/read_write.c:523 fs/read_write.c:515>] SyS_write+0x5d/0xa0 [ 2655.371972] [<arch/x86/kernel/entry_64.S:749>] tracesys+0xdd/0xe2 [ 2655.371972] irq event stamp: 10207 [ 2655.371972] hardirqs last enabled at (10207): [<arch/x86/include/asm/paravirt.h:809 block/blk-throttle.c:982>] throtl_update_dispatch_stats+0x15d/0x1a0 [ 2655.371972] hardirqs last disabled at (10206): [<block/blk-throttle.c:977>] throtl_update_dispatch_stats+0xa4/0x1a0 [ 2655.371972] softirqs last enabled at (10172): [<arch/x86/include/asm/preempt.h:22 kernel/softirq.c:297>] __do_softirq+0x447/0x4f0 [ 2655.371972] softirqs last disabled at (10165): [<kernel/softirq.c:347 kernel/softirq.c:388>] irq_exit+0x83/0x160 [ 2655.371972] [ 2655.371972] other info that might help us debug this: [ 2655.371972] Possible unsafe locking scenario: [ 2655.371972] [ 2655.371972] CPU0 [ 2655.371972] ---- [ 2655.371972] lock(&zram->init_lock); [ 2655.371972] <Interrupt> [ 2655.371972] lock(&zram->init_lock); [ 2655.371972] [ 2655.371972] *** DEADLOCK *** [ 2655.371972] [ 2655.371972] no locks held by kswapd30/5352. [ 2655.371972] [ 2655.371972] stack backtrace: [ 2655.371972] CPU: 78 PID: 5352 Comm: kswapd30 Tainted: G W 3.14.0-rc4-next-20140226-sasha-00013-g082bdac-dirty #4 [ 2655.371972] ffff880636f98cc0 ffff880636f8d3f8 ffffffff843882e5 0000000000000000 [ 2655.371972] ffff880636f98000 ffff880636f8d458 ffffffff811a09f7 0000000000000000 [ 2655.371972] 0000000000000001 ffff880600000001 ffffffff876c2060 0000000000000009 [ 2655.371972] Call Trace: [ 2655.371972] [<lib/dump_stack.c:52>] dump_stack+0x52/0x7f [ 2655.371972] [<kernel/locking/lockdep.c:2254>] print_usage_bug+0x1a7/0x1e0 [ 2655.371972] [<kernel/locking/lockdep.c:2347>] ? print_usage_bug+0x1e0/0x1e0 [ 2655.371972] [<kernel/locking/lockdep.c:2465>] mark_lock_irq+0xd9/0x2a0 [ 2655.371972] [<kernel/locking/lockdep.c:2920>] mark_lock+0x128/0x210 [ 2655.371972] [<kernel/locking/lockdep.c:2821>] mark_irqflags+0x144/0x170 [ 2655.371972] [<kernel/locking/lockdep.c:3138>] __lock_acquire+0x2de/0x5a0 [ 2655.371972] [<arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602>] lock_acquire+0x182/0x1d0 [ 2655.371972] [<drivers/block/zram/zram_drv.c:663>] ? zram_make_request+0x2a/0xc0 [ 2655.371972] [<arch/x86/include/asm/rwsem.h:83 kernel/locking/rwsem.c:23>] down_read+0x47/0xa0 [ 2655.371972] [<drivers/block/zram/zram_drv.c:663>] ? zram_make_request+0x2a/0xc0 [ 2655.371972] [<arch/x86/include/asm/current.h:14 kernel/sched/core.c:2508>] ? preempt_count_add+0x96/0xc0 [ 2655.371972] [<drivers/block/zram/zram_drv.c:663>] zram_make_request+0x2a/0xc0 [ 2655.371972] [<block/blk-core.c:1862>] generic_make_request+0xb6/0x110 [ 2655.371972] [<block/blk-core.c:1913>] submit_bio+0x148/0x170 [ 2655.371972] [<include/linux/rcupdate.h:800 include/linux/memcontrol.h:180 mm/page-writeback.c:2408>] ? test_set_page_writeback+0x24e/0x2a0 [ 2655.371972] [<mm/page_io.c:315>] __swap_writepage+0x1fc/0x220 [ 2655.371972] [<arch/x86/include/asm/preempt.h:98 include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:183>] ? _raw_spin_unlock+0x30/0x50 [ 2655.371972] [<mm/swapfile.c:898>] ? page_swapcount+0x4e/0x60 [ 2655.371972] [<mm/page_io.c:249>] swap_writepage+0x72/0x80 [ 2655.371972] [<mm/vmscan.c:502>] pageout+0x167/0x2e0 [ 2655.371972] [<mm/vmscan.c:1015>] shrink_page_list+0x4f4/0x7c0 [ 2655.371972] [<include/linux/spinlock.h:328 mm/vmscan.c:1503>] shrink_inactive_list+0x31c/0x570 [ 2655.371972] [<mm/vmscan.c:1744>] ? shrink_active_list+0x30b/0x320 [ 2655.371972] [<mm/vmscan.c:1830 mm/vmscan.c:2054>] shrink_lruvec+0x124/0x300 [ 2655.371972] [<arch/x86/include/asm/paravirt.h:192 arch/x86/kernel/tsc.c:305>] ? sched_clock+0x1d/0x30 [ 2655.371972] [<mm/vmscan.c:2235>] shrink_zone+0x8e/0x1d0 [ 2655.371972] [<include/linux/bitmap.h:165 include/linux/nodemask.h:131 mm/vmscan.c:2904>] kswapd_shrink_zone+0xf1/0x1b0 [ 2655.371972] [<mm/vmscan.c:3088>] balance_pgdat+0x363/0x540 [ 2655.371972] [<kernel/sched/wait.c:254>] ? finish_wait+0x70/0x90 [ 2655.371972] [<mm/vmscan.c:3296>] kswapd+0x2eb/0x350 [ 2655.371972] [<mm/vmscan.c:3213>] ? ftrace_raw_event_mm_vmscan_writepage+0x180/0x180 [ 2655.371972] [<kernel/kthread.c:216>] kthread+0x105/0x110 [ 2655.371972] [<kernel/locking/lockdep.c:3506>] ? __lock_release+0x1e2/0x200 [ 2655.371972] [<kernel/kthread.c:185>] ? set_kthreadd_affinity+0x30/0x30 [ 2655.371972] [<arch/x86/kernel/entry_64.S:555>] ret_from_fork+0x7c/0xb0 [ 2655.371972] [<kernel/kthread.c:185>] ? set_kthreadd_affinity+0x30/0x30 Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Minchan Kim <minchan@kernel.org> --- drivers/block/zram/zram_drv.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c index 9baac5b76bfe..76ba67673a90 100644 --- a/drivers/block/zram/zram_drv.c +++ b/drivers/block/zram/zram_drv.c @@ -537,26 +537,27 @@ static ssize_t disksize_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t len) { u64 disksize; + struct zram_meta *meta; struct zram *zram = dev_to_zram(dev); disksize = memparse(buf, NULL); if (!disksize) return -EINVAL; + disksize = PAGE_ALIGN(disksize); + meta = zram_meta_alloc(disksize); + if (!meta) + return -ENOMEM; + down_write(&zram->init_lock); if (init_done(zram)) { + zram_meta_free(meta); up_write(&zram->init_lock); pr_info("Cannot change disksize for initialized device\n"); return -EBUSY; } - disksize = PAGE_ALIGN(disksize); - zram->meta = zram_meta_alloc(disksize); - if (!zram->meta) { - up_write(&zram->init_lock); - return -ENOMEM; - } - + zram->meta = meta; zram->disksize = disksize; set_capacity(zram->disk, zram->disksize >> SECTOR_SHIFT); up_write(&zram->init_lock); -- 1.8.5.3 -- Kind regards, Minchan Kim ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: zram: lockdep spew for zram->init_lock 2014-02-27 23:56 ` Minchan Kim @ 2014-03-01 0:32 ` Andrew Morton 2014-03-01 8:19 ` Sergey Senozhatsky 2014-03-03 7:30 ` Minchan Kim 0 siblings, 2 replies; 6+ messages in thread From: Andrew Morton @ 2014-03-01 0:32 UTC (permalink / raw) To: Minchan Kim; +Cc: Sasha Levin, ngupta, LKML, Sergey Senozhatsky On Fri, 28 Feb 2014 08:56:29 +0900 Minchan Kim <minchan@kernel.org> wrote: > Sasha reported following below lockdep spew of zram. > > It was introduced by [1] in recent linux-next but it's false positive > because zram_meta_alloc with down_write(init_lock) couldn't be called > during zram is working as swap device so we could annotate the lock. > > But I don't think it's worthy because it would make greate lockdep > less effective. Instead, move zram_meta_alloc out of the lock as good > old day so we could do unnecessary allocation/free of zram_meta for > initialied device as Sergey claimed in [1] but it wouldn't be common > and be harmful if someone might do it. Rather than, I'd like to respect > lockdep which is great tool to prevent upcoming subtle bugs. > > [1] zram: delete zram_init_device > > ... > > --- a/drivers/block/zram/zram_drv.c > +++ b/drivers/block/zram/zram_drv.c > @@ -537,26 +537,27 @@ static ssize_t disksize_store(struct device *dev, > struct device_attribute *attr, const char *buf, size_t len) > { > u64 disksize; > + struct zram_meta *meta; > struct zram *zram = dev_to_zram(dev); > > disksize = memparse(buf, NULL); > if (!disksize) > return -EINVAL; > > + disksize = PAGE_ALIGN(disksize); > + meta = zram_meta_alloc(disksize); > + if (!meta) > + return -ENOMEM; > + > down_write(&zram->init_lock); > if (init_done(zram)) { > + zram_meta_free(meta); > up_write(&zram->init_lock); > pr_info("Cannot change disksize for initialized device\n"); > return -EBUSY; > } > > - disksize = PAGE_ALIGN(disksize); > - zram->meta = zram_meta_alloc(disksize); > - if (!zram->meta) { > - up_write(&zram->init_lock); > - return -ENOMEM; > - } > - > + zram->meta = meta; > zram->disksize = disksize; > set_capacity(zram->disk, zram->disksize >> SECTOR_SHIFT); > up_write(&zram->init_lock); When applying zram-use-zcomp-compressing-backends.patch on top of this we get a bit of a mess, and simple conflict resolution results in a leak. disksize_store() was one of those nasty functions which does multiple "return" statements after performing locking and resource allocation. As usual, this led to a resource leak. Remember folks, "return" is a goto in disguise. Here's what I ended up with. Please review. static ssize_t disksize_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t len) { u64 disksize; struct zram_meta *meta; struct zram *zram = dev_to_zram(dev); int err; disksize = memparse(buf, NULL); if (!disksize) return -EINVAL; disksize = PAGE_ALIGN(disksize); meta = zram_meta_alloc(disksize); if (!meta) return -ENOMEM; down_write(&zram->init_lock); if (init_done(zram)) { pr_info("Cannot change disksize for initialized device\n"); err = -EBUSY; goto out_free_meta; } zram->comp = zcomp_create(default_compressor); if (!zram->comp) { pr_info("Cannot initialise %s compressing backend\n", default_compressor); err = -EINVAL; goto out_free_meta; } zram->meta = meta; zram->disksize = disksize; set_capacity(zram->disk, zram->disksize >> SECTOR_SHIFT); up_write(&zram->init_lock); return len; out_free_meta: up_write(&zram->init_lock); zram_meta_free(meta); return err; } ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: zram: lockdep spew for zram->init_lock 2014-03-01 0:32 ` Andrew Morton @ 2014-03-01 8:19 ` Sergey Senozhatsky 2014-03-03 7:30 ` Minchan Kim 1 sibling, 0 replies; 6+ messages in thread From: Sergey Senozhatsky @ 2014-03-01 8:19 UTC (permalink / raw) To: Andrew Morton; +Cc: Minchan Kim, Sasha Levin, ngupta, LKML, Sergey Senozhatsky On (02/28/14 16:32), Andrew Morton wrote: > Date: Fri, 28 Feb 2014 16:32:06 -0800 > From: Andrew Morton <akpm@linux-foundation.org> > To: Minchan Kim <minchan@kernel.org> > Cc: Sasha Levin <sasha.levin@oracle.com>, ngupta@vflare.org, LKML > <linux-kernel@vger.kernel.org>, Sergey Senozhatsky > <sergey.senozhatsky@gmail.com> > Subject: Re: zram: lockdep spew for zram->init_lock > X-Mailer: Sylpheed 3.2.0beta5 (GTK+ 2.24.10; x86_64-pc-linux-gnu) > > On Fri, 28 Feb 2014 08:56:29 +0900 Minchan Kim <minchan@kernel.org> wrote: > > > Sasha reported following below lockdep spew of zram. > > > > It was introduced by [1] in recent linux-next but it's false positive > > because zram_meta_alloc with down_write(init_lock) couldn't be called > > during zram is working as swap device so we could annotate the lock. > > > > But I don't think it's worthy because it would make greate lockdep > > less effective. Instead, move zram_meta_alloc out of the lock as good > > old day so we could do unnecessary allocation/free of zram_meta for > > initialied device as Sergey claimed in [1] but it wouldn't be common > > and be harmful if someone might do it. Rather than, I'd like to respect > > lockdep which is great tool to prevent upcoming subtle bugs. > > > > [1] zram: delete zram_init_device > > > > ... > > > > --- a/drivers/block/zram/zram_drv.c > > +++ b/drivers/block/zram/zram_drv.c > > @@ -537,26 +537,27 @@ static ssize_t disksize_store(struct device *dev, > > struct device_attribute *attr, const char *buf, size_t len) > > { > > u64 disksize; > > + struct zram_meta *meta; > > struct zram *zram = dev_to_zram(dev); > > > > disksize = memparse(buf, NULL); > > if (!disksize) > > return -EINVAL; > > > > + disksize = PAGE_ALIGN(disksize); > > + meta = zram_meta_alloc(disksize); > > + if (!meta) > > + return -ENOMEM; > > + > > down_write(&zram->init_lock); > > if (init_done(zram)) { > > + zram_meta_free(meta); > > up_write(&zram->init_lock); > > pr_info("Cannot change disksize for initialized device\n"); > > return -EBUSY; > > } > > > > - disksize = PAGE_ALIGN(disksize); > > - zram->meta = zram_meta_alloc(disksize); > > - if (!zram->meta) { > > - up_write(&zram->init_lock); > > - return -ENOMEM; > > - } > > - > > + zram->meta = meta; > > zram->disksize = disksize; > > set_capacity(zram->disk, zram->disksize >> SECTOR_SHIFT); > > up_write(&zram->init_lock); > > When applying zram-use-zcomp-compressing-backends.patch on top of this > we get a bit of a mess, and simple conflict resolution results in a > leak. > > disksize_store() was one of those nasty functions which does multiple > "return" statements after performing locking and resource allocation. > As usual, this led to a resource leak. Remember folks, "return" is a > goto in disguise. > > > Here's what I ended up with. Please review. > looks good to me. Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> > static ssize_t disksize_store(struct device *dev, > struct device_attribute *attr, const char *buf, size_t len) > { > u64 disksize; > struct zram_meta *meta; > struct zram *zram = dev_to_zram(dev); > int err; > > disksize = memparse(buf, NULL); > if (!disksize) > return -EINVAL; > > disksize = PAGE_ALIGN(disksize); > meta = zram_meta_alloc(disksize); > if (!meta) > return -ENOMEM; > > down_write(&zram->init_lock); > if (init_done(zram)) { > pr_info("Cannot change disksize for initialized device\n"); > err = -EBUSY; > goto out_free_meta; > } > > zram->comp = zcomp_create(default_compressor); > if (!zram->comp) { > pr_info("Cannot initialise %s compressing backend\n", > default_compressor); > err = -EINVAL; > goto out_free_meta; > } > > zram->meta = meta; > zram->disksize = disksize; > set_capacity(zram->disk, zram->disksize >> SECTOR_SHIFT); > up_write(&zram->init_lock); > > return len; > > out_free_meta: > up_write(&zram->init_lock); > zram_meta_free(meta); > return err; > } > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: zram: lockdep spew for zram->init_lock 2014-03-01 0:32 ` Andrew Morton 2014-03-01 8:19 ` Sergey Senozhatsky @ 2014-03-03 7:30 ` Minchan Kim 2014-03-03 19:51 ` Sergey Senozhatsky 1 sibling, 1 reply; 6+ messages in thread From: Minchan Kim @ 2014-03-03 7:30 UTC (permalink / raw) To: Andrew Morton; +Cc: Sasha Levin, ngupta, LKML, Sergey Senozhatsky Hello Andrew, I'm not in office now and I would be off in this week, maybe so I don't have source code on top of Sergey's recent change but it seems below code has same problem. Pz, Sergey or Jerome Could you confirm it instead of me? On Fri, Feb 28, 2014 at 04:32:06PM -0800, Andrew Morton wrote: > On Fri, 28 Feb 2014 08:56:29 +0900 Minchan Kim <minchan@kernel.org> wrote: > > > Sasha reported following below lockdep spew of zram. > > > > It was introduced by [1] in recent linux-next but it's false positive > > because zram_meta_alloc with down_write(init_lock) couldn't be called > > during zram is working as swap device so we could annotate the lock. > > > > But I don't think it's worthy because it would make greate lockdep > > less effective. Instead, move zram_meta_alloc out of the lock as good > > old day so we could do unnecessary allocation/free of zram_meta for > > initialied device as Sergey claimed in [1] but it wouldn't be common > > and be harmful if someone might do it. Rather than, I'd like to respect > > lockdep which is great tool to prevent upcoming subtle bugs. > > > > [1] zram: delete zram_init_device > > > > ... > > > > --- a/drivers/block/zram/zram_drv.c > > +++ b/drivers/block/zram/zram_drv.c > > @@ -537,26 +537,27 @@ static ssize_t disksize_store(struct device *dev, > > struct device_attribute *attr, const char *buf, size_t len) > > { > > u64 disksize; > > + struct zram_meta *meta; > > struct zram *zram = dev_to_zram(dev); > > > > disksize = memparse(buf, NULL); > > if (!disksize) > > return -EINVAL; > > > > + disksize = PAGE_ALIGN(disksize); > > + meta = zram_meta_alloc(disksize); > > + if (!meta) > > + return -ENOMEM; > > + > > down_write(&zram->init_lock); > > if (init_done(zram)) { > > + zram_meta_free(meta); > > up_write(&zram->init_lock); > > pr_info("Cannot change disksize for initialized device\n"); > > return -EBUSY; > > } > > > > - disksize = PAGE_ALIGN(disksize); > > - zram->meta = zram_meta_alloc(disksize); > > - if (!zram->meta) { > > - up_write(&zram->init_lock); > > - return -ENOMEM; > > - } > > - > > + zram->meta = meta; > > zram->disksize = disksize; > > set_capacity(zram->disk, zram->disksize >> SECTOR_SHIFT); > > up_write(&zram->init_lock); > > When applying zram-use-zcomp-compressing-backends.patch on top of this > we get a bit of a mess, and simple conflict resolution results in a > leak. > > disksize_store() was one of those nasty functions which does multiple > "return" statements after performing locking and resource allocation. > As usual, this led to a resource leak. Remember folks, "return" is a > goto in disguise. > > > Here's what I ended up with. Please review. > > static ssize_t disksize_store(struct device *dev, > struct device_attribute *attr, const char *buf, size_t len) > { > u64 disksize; > struct zram_meta *meta; > struct zram *zram = dev_to_zram(dev); > int err; > > disksize = memparse(buf, NULL); > if (!disksize) > return -EINVAL; > > disksize = PAGE_ALIGN(disksize); > meta = zram_meta_alloc(disksize); > if (!meta) > return -ENOMEM; > > down_write(&zram->init_lock); > if (init_done(zram)) { > pr_info("Cannot change disksize for initialized device\n"); > err = -EBUSY; > goto out_free_meta; > } > > zram->comp = zcomp_create(default_compressor); AFAIR, zcomp_create try to allocate memory with GFP_KERNEL so it could make a same problem. If so, let's allocate it out of the lock. Please check it. > if (!zram->comp) { > pr_info("Cannot initialise %s compressing backend\n", > default_compressor); > err = -EINVAL; > goto out_free_meta; > } > > zram->meta = meta; > zram->disksize = disksize; > set_capacity(zram->disk, zram->disksize >> SECTOR_SHIFT); > up_write(&zram->init_lock); > > return len; > > out_free_meta: > up_write(&zram->init_lock); > zram_meta_free(meta); > return err; > } > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: zram: lockdep spew for zram->init_lock 2014-03-03 7:30 ` Minchan Kim @ 2014-03-03 19:51 ` Sergey Senozhatsky 0 siblings, 0 replies; 6+ messages in thread From: Sergey Senozhatsky @ 2014-03-03 19:51 UTC (permalink / raw) To: Minchan Kim; +Cc: Andrew Morton, Sasha Levin, ngupta, LKML, Sergey Senozhatsky Hello, On (03/03/14 07:30), Minchan Kim wrote: > Hello Andrew, > > I'm not in office now and I would be off in this week, maybe > so I don't have source code on top of Sergey's recent change > but it seems below code has same problem. > > Pz, Sergey or Jerome Could you confirm it instead of me? > > On Fri, Feb 28, 2014 at 04:32:06PM -0800, Andrew Morton wrote: > > On Fri, 28 Feb 2014 08:56:29 +0900 Minchan Kim <minchan@kernel.org> wrote: > > > > > Sasha reported following below lockdep spew of zram. > > > > > > It was introduced by [1] in recent linux-next but it's false positive > > > because zram_meta_alloc with down_write(init_lock) couldn't be called > > > during zram is working as swap device so we could annotate the lock. > > > > > > But I don't think it's worthy because it would make greate lockdep > > > less effective. Instead, move zram_meta_alloc out of the lock as good > > > old day so we could do unnecessary allocation/free of zram_meta for > > > initialied device as Sergey claimed in [1] but it wouldn't be common > > > and be harmful if someone might do it. Rather than, I'd like to respect > > > lockdep which is great tool to prevent upcoming subtle bugs. > > > > > > [1] zram: delete zram_init_device > > > > > > ... > > > > > > --- a/drivers/block/zram/zram_drv.c > > > +++ b/drivers/block/zram/zram_drv.c > > > @@ -537,26 +537,27 @@ static ssize_t disksize_store(struct device *dev, > > > struct device_attribute *attr, const char *buf, size_t len) > > > { > > > u64 disksize; > > > + struct zram_meta *meta; > > > struct zram *zram = dev_to_zram(dev); > > > > > > disksize = memparse(buf, NULL); > > > if (!disksize) > > > return -EINVAL; > > > > > > + disksize = PAGE_ALIGN(disksize); > > > + meta = zram_meta_alloc(disksize); > > > + if (!meta) > > > + return -ENOMEM; > > > + > > > down_write(&zram->init_lock); > > > if (init_done(zram)) { > > > + zram_meta_free(meta); > > > up_write(&zram->init_lock); > > > pr_info("Cannot change disksize for initialized device\n"); > > > return -EBUSY; > > > } > > > > > > - disksize = PAGE_ALIGN(disksize); > > > - zram->meta = zram_meta_alloc(disksize); > > > - if (!zram->meta) { > > > - up_write(&zram->init_lock); > > > - return -ENOMEM; > > > - } > > > - > > > + zram->meta = meta; > > > zram->disksize = disksize; > > > set_capacity(zram->disk, zram->disksize >> SECTOR_SHIFT); > > > up_write(&zram->init_lock); > > > > When applying zram-use-zcomp-compressing-backends.patch on top of this > > we get a bit of a mess, and simple conflict resolution results in a > > leak. > > > > disksize_store() was one of those nasty functions which does multiple > > "return" statements after performing locking and resource allocation. > > As usual, this led to a resource leak. Remember folks, "return" is a > > goto in disguise. > > > > > > Here's what I ended up with. Please review. > > > > static ssize_t disksize_store(struct device *dev, > > struct device_attribute *attr, const char *buf, size_t len) > > { > > u64 disksize; > > struct zram_meta *meta; > > struct zram *zram = dev_to_zram(dev); > > int err; > > > > disksize = memparse(buf, NULL); > > if (!disksize) > > return -EINVAL; > > > > disksize = PAGE_ALIGN(disksize); > > meta = zram_meta_alloc(disksize); > > if (!meta) > > return -ENOMEM; > > > > down_write(&zram->init_lock); > > if (init_done(zram)) { > > pr_info("Cannot change disksize for initialized device\n"); > > err = -EBUSY; > > goto out_free_meta; > > } > > > > zram->comp = zcomp_create(default_compressor); > > AFAIR, zcomp_create try to allocate memory with GFP_KERNEL so it could > make a same problem. If so, let's allocate it out of the lock. > > Please check it. > good catch. you're right, zcomp allocation use GPF_KERNEL. since Minchan is out of the office this week should I send out a patch (combined Minchan's original patch + Andrew's rework) or, Andrew, you will handle it? -ss > > if (!zram->comp) { > > pr_info("Cannot initialise %s compressing backend\n", > > default_compressor); > > err = -EINVAL; > > goto out_free_meta; > > } > > > > zram->meta = meta; > > zram->disksize = disksize; > > set_capacity(zram->disk, zram->disksize >> SECTOR_SHIFT); > > up_write(&zram->init_lock); > > > > return len; > > > > out_free_meta: > > up_write(&zram->init_lock); > > zram_meta_free(meta); > > return err; > > } > > > ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2014-03-03 19:55 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-02-27 14:48 zram: lockdep spew for zram->init_lock Sasha Levin 2014-02-27 23:56 ` Minchan Kim 2014-03-01 0:32 ` Andrew Morton 2014-03-01 8:19 ` Sergey Senozhatsky 2014-03-03 7:30 ` Minchan Kim 2014-03-03 19:51 ` Sergey Senozhatsky
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).