* request_queue use-after-free - inode_detach_wb()
@ 2015-11-16 20:59 Ilya Dryomov
  2015-11-17 20:56 ` Tejun Heo
  0 siblings, 1 reply; 10+ messages in thread
From: Ilya Dryomov @ 2015-11-16 20:59 UTC (permalink / raw)
  To: Christoph Hellwig, Tejun Heo
  Cc: linux-kernel, linux-fsdevel, Ceph Development

[-- Attachment #1: Type: text/plain, Size: 3115 bytes --]

Hello,

Last week, while running an rbd test that does a lot of maps and
unmaps (read: losetup / losetup -d) with slab debugging enabled, I hit
the attached splat.  That 6a byte corresponds to the atomic_long_t
count of the percpu_ref refcnt in request_queue::backing_dev_info::wb,
pointing to a percpu_ref_put() on freed memory.  It hasn't reproduced
since.

After a prolonged stare at rbd (we had just fixed an rbd vs sysfs
lifecycle issue, so I naturally assumed we had either missed something
or it had something to do with that patch), I looked wider and
concluded that the most likely place a stray percpu_ref_put() could
have come from was inode_detach_wb().  It's called from
__destroy_inode(), which means iput(), which means bdput().

Looking at __blkdev_put(), the issue becomes clear: we take
precautions to flush before calling out to ->release() because, at
least according to the comment, ->release() can free the queue; we
record the owner pointer because put_disk() may free both the gendisk
and the queue; and then, after all that, we call bdput(), which may
touch the queue through wb_put() in inode_detach_wb().  (The fun part
is that wb_put() is supposed to be a noop for root wbs, but slab
debugging interferes with that by poisoning the wb->bdi pointer.)

1514                  * dirty data before.
1515                  */
1516                 bdev_write_inode(bdev);
1517         }
1518         if (bdev->bd_contains == bdev) {
1519                 if (disk->fops->release)
1520                         disk->fops->release(disk, mode);
1521         }
1522         if (!bdev->bd_openers) {
1523                 struct module *owner = disk->fops->owner;
1524
1525                 disk_put_part(bdev->bd_part);
1526                 bdev->bd_part = NULL;
1527                 bdev->bd_disk = NULL;
1528                 if (bdev != bdev->bd_contains)
1529                         victim = bdev->bd_contains;
1530                 bdev->bd_contains = NULL;
1531
1532                 put_disk(disk); <-- may free q
1533                 module_put(owner);
1534         }
1535         mutex_unlock(&bdev->bd_mutex);
1536         bdput(bdev); <-- may touch q.backing_dev_info.wb
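
(For reference, the root-wb special case that gets defeated is the
guard in wb_put() - paraphrasing the 4.3 headers from memory, so treat
this as a sketch rather than an exact quote:

static inline void wb_put(struct bdi_writeback *wb)
{
	/* the root wb is embedded in the bdi and isn't refcounted */
	if (wb != &wb->bdi->wb)
		percpu_ref_put(&wb->refcnt);
}

With the queue freed, wb->bdi is poisoned to 0x6b6b..., the comparison
can't possibly match, and percpu_ref_put() lands on the freed count -
hence the single 6a byte.)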

To reproduce, apply the attached patch (the systemd-udevd condition is
just a convenience: udev reacts to change events by getting the bdev,
which it then has to put), boot with slub_debug=,blkdev_queue and do:

$ sudo modprobe loop
$ sudo losetup /dev/loop0 foo.img
$ sudo dd if=/dev/urandom of=/dev/loop0 bs=1M count=1
$ sudo losetup -d /dev/loop0
$ sudo rmmod loop

(rmmod is key - it's the only way to get loop to do put_disk().  For
rbd, it's just rbd map - dd - rbd unmap.)

In the past we used to reassign to default_backing_dev_info here, but
it was nuked in b83ae6d42143 ("fs: remove mapping->backing_dev_info").
Shortly after that, the cgroup-specific writeback patches from Tejun
got merged, adding inode::i_wb and the inode_detach_wb() call.  The
fix seems to be to detach the inode earlier, but I'm not familiar
enough with the cgroup writeback code, so I'm sending my findings
instead of a patch.  Christoph, Tejun?

Thanks,

                Ilya

[-- Attachment #2: blkdev_queue-poison.txt --]
[-- Type: text/plain, Size: 25222 bytes --]

[18513.199040] =============================================================================
[18513.199459] BUG blkdev_queue (Not tainted): Poison overwritten
[18513.199459] -----------------------------------------------------------------------------
[18513.199459] 
[18513.205765] Disabling lock debugging due to kernel taint
[18513.205765] INFO: 0xffff8800659a05d8-0xffff8800659a05d8. First byte 0x6a instead of 0x6b
[18513.205765] INFO: Allocated in blk_alloc_queue_node+0x28/0x2c0 age=10215 cpu=1 pid=1920
[18513.205765] 	__slab_alloc.constprop.50+0x4d5/0x540
[18513.205765] 	kmem_cache_alloc+0x2ba/0x320
[18513.205765] 	blk_alloc_queue_node+0x28/0x2c0
[18513.205765] 	blk_mq_init_queue+0x20/0x60
[18513.205765] 	do_rbd_add.isra.23+0x833/0xd70
[18513.205765] 	rbd_add+0x1d/0x30
[18513.205765] 	bus_attr_store+0x25/0x30
[18513.205765] 	sysfs_kf_write+0x45/0x60
[18513.205765] 	kernfs_fop_write+0x141/0x190
[18513.205765] 	__vfs_write+0x28/0xe0
[18513.205765] 	vfs_write+0xa2/0x180
[18513.205765] 	SyS_write+0x49/0xa0
[18513.219265] 	entry_SYSCALL_64_fastpath+0x12/0x6f
[18513.219265] INFO: Freed in blk_free_queue_rcu+0x1c/0x20 age=122 cpu=1 pid=1959
[18513.219265] 	__slab_free+0x148/0x290
[18513.219265] 	kmem_cache_free+0x2b7/0x340
[18513.219265] 	blk_free_queue_rcu+0x1c/0x20
[18513.219265] 	rcu_process_callbacks+0x2fb/0x820
[18513.219265] 	__do_softirq+0xd4/0x460
[18513.219265] 	irq_exit+0x95/0xa0
[18513.219265] 	smp_apic_timer_interrupt+0x42/0x50
[18513.219265] 	apic_timer_interrupt+0x81/0x90
[18513.219265] 	__slab_free+0xb5/0x290
[18513.219265] 	kmem_cache_free+0x2b7/0x340
[18513.219265] 	ptlock_free+0x19/0x20
[18513.219265] 	___pte_free_tlb+0x22/0x50
[18513.219265] 	free_pgd_range+0x258/0x440
[18513.219265] 	free_pgtables+0xc4/0x120
[18513.219265] 	exit_mmap+0xc3/0x130
[18513.219265] 	mmput+0x3d/0xf0
[18513.219265] INFO: Slab 0xffffea0001966800 objects=9 used=9 fp=0x          (null) flags=0x4000000000004080
[18513.219265] INFO: Object 0xffff8800659a0000 @offset=0 fp=0xffff8800659a6c40
[18513.219265] 
[18513.219265] Object ffff8800659a0000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[... Object ffff8800659a0010 through ffff8800659a05c0: all 6b (POISON_FREE), snipped ...]
[18513.219265] Object ffff8800659a05d0: 6b 6b 6b 6b 6b 6b 6b 6b 6a 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkjkkkkkkk
[... Object ffff8800659a05e0 through ffff8800659a0c20: all 6b (POISON_FREE), snipped ...]
[18513.219265] Object ffff8800659a0c30: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5  kkkkkkkkkkkkkkk.
[18513.219265] Redzone ffff8800659a0c40: bb bb bb bb bb bb bb bb                          ........
[18513.219265] Padding ffff8800659a0d80: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ
[18513.219265] CPU: 1 PID: 1920 Comm: test_librbd_fsx Tainted: G    B           4.3.0-rc7-vm+ #129
[18513.219265] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
[18513.219265]  ffff8800659a0000 ffff880065a73b08 ffffffff813750bc ffff88007cb5d440
[18513.219265]  ffff880065a73b48 ffffffff8115bd07 0000000000000008 ffff880000000001
[18513.219265]  ffff8800659a05d9 ffff88007cb5d440 000000000000006b ffff8800659a0000
[18513.219265] Call Trace:
[18513.219265]  [<ffffffff813750bc>] dump_stack+0x4e/0x82
[18513.219265]  [<ffffffff8115bd07>] print_trailer+0x147/0x1e0
[18513.219265]  [<ffffffff8115c0c5>] check_bytes_and_report+0xc5/0x110
[18513.219265]  [<ffffffff8115c2d4>] check_object+0x1c4/0x240
[18513.219265]  [<ffffffff8134da48>] ? blk_alloc_queue_node+0x28/0x2c0
[18513.219265]  [<ffffffff8115d044>] alloc_debug_processing+0x104/0x180
[18513.219265]  [<ffffffff8115ea45>] __slab_alloc.constprop.50+0x4d5/0x540
[18513.219265]  [<ffffffff8134da48>] ? blk_alloc_queue_node+0x28/0x2c0
[18513.219265]  [<ffffffff8134da48>] ? blk_alloc_queue_node+0x28/0x2c0
[18513.219265]  [<ffffffff8115f80a>] kmem_cache_alloc+0x2ba/0x320
[18513.219265]  [<ffffffff8134da48>] blk_alloc_queue_node+0x28/0x2c0
[18513.219265]  [<ffffffff8135e520>] blk_mq_init_queue+0x20/0x60
[18513.219265]  [<ffffffff81430f83>] do_rbd_add.isra.23+0x833/0xd70
[18513.219265]  [<ffffffff814314fd>] rbd_add+0x1d/0x30
[18513.219265]  [<ffffffff8141aa05>] bus_attr_store+0x25/0x30
[18513.219265]  [<ffffffff811eed15>] sysfs_kf_write+0x45/0x60
[18513.219265]  [<ffffffff811ee081>] kernfs_fop_write+0x141/0x190
[18513.219265]  [<ffffffff81173e38>] __vfs_write+0x28/0xe0
[18513.219265]  [<ffffffff8108464a>] ? percpu_down_read+0x5a/0xa0
[18513.219265]  [<ffffffff81176be9>] ? __sb_start_write+0xc9/0x110
[18513.219265]  [<ffffffff81176be9>] ? __sb_start_write+0xc9/0x110
[18513.219265]  [<ffffffff81174522>] vfs_write+0xa2/0x180
[18513.219265]  [<ffffffff81175089>] SyS_write+0x49/0xa0
[18513.219265]  [<ffffffff8155b257>] entry_SYSCALL_64_fastpath+0x12/0x6f
[18513.219265] FIX blkdev_queue: Restoring 0xffff8800659a05d8-0xffff8800659a05d8=0x6b
[18513.219265] 
[18513.219265] FIX blkdev_queue: Marking all objects used

[-- Attachment #3: blkdev_put-delay.diff --]
[-- Type: text/plain, Size: 1580 bytes --]

diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 565b8dac5782..5a4ff505fd12 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -552,6 +552,7 @@ static void blk_free_queue_rcu(struct rcu_head *rcu_head)
 {
 	struct request_queue *q = container_of(rcu_head, struct request_queue,
 					       rcu_head);
+	printk("freeing q\n");
 	kmem_cache_free(blk_requestq_cachep, q);
 }
 
diff --git a/fs/block_dev.c b/fs/block_dev.c
index bb0dfb1c7af1..6475dac5f3bc 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1496,6 +1496,8 @@ static int blkdev_open(struct inode * inode, struct file * filp)
 	return blkdev_get(bdev, filp->f_mode, filp);
 }
 
+#include <linux/delay.h>
+
 static void __blkdev_put(struct block_device *bdev, fmode_t mode, int for_part)
 {
 	struct gendisk *disk = bdev->bd_disk;
@@ -1531,6 +1533,12 @@ static void __blkdev_put(struct block_device *bdev, fmode_t mode, int for_part)
 
 		put_disk(disk);
 		module_put(owner);
+
+		if (!strcmp(current->comm, "systemd-udevd")) {
+			printk("sleep start %d\n", task_pid_nr(current));
+			ssleep(3);
+			printk("sleep end %d\n", task_pid_nr(current));
+		}
 	}
 	mutex_unlock(&bdev->bd_mutex);
 	bdput(bdev);
diff --git a/fs/inode.c b/fs/inode.c
index 1be5f9003eb3..10625eeb7816 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1490,6 +1490,8 @@ void iput(struct inode *inode)
 {
 	if (!inode)
 		return;
+	if (inode->i_wb)
+		BUG_ON(atomic_long_read(&inode->i_wb->refcnt.count) == 0x6b6b6b6b6b6b6b6b);
 	BUG_ON(inode->i_state & I_CLEAR);
 retry:
 	if (atomic_dec_and_lock(&inode->i_count, &inode->i_lock)) {


* Re: request_queue use-after-free - inode_detach_wb()
  2015-11-16 20:59 request_queue use-after-free - inode_detach_wb() Ilya Dryomov
@ 2015-11-17 20:56 ` Tejun Heo
  2015-11-18 15:12   ` Ilya Dryomov
  0 siblings, 1 reply; 10+ messages in thread
From: Tejun Heo @ 2015-11-17 20:56 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Christoph Hellwig, linux-kernel, linux-fsdevel, Ceph Development

Hello, Ilya.

On Mon, Nov 16, 2015 at 09:59:18PM +0100, Ilya Dryomov wrote:
...
> Looking at __blkdev_put(), the issue becomes clear: we take
> precautions to flush before calling out to ->release() because, at
> least according to the comment, ->release() can free the queue; we
> record the owner pointer because put_disk() may free both the gendisk
> and the queue; and then, after all that, we call bdput(), which may
> touch the queue through wb_put() in inode_detach_wb().  (The fun part
> is that wb_put() is supposed to be a noop for root wbs, but slab
> debugging interferes with that by poisoning the wb->bdi pointer.)
> 
> 1514                  * dirty data before.
> 1515                  */
> 1516                 bdev_write_inode(bdev);
> 1517         }
> 1518         if (bdev->bd_contains == bdev) {
> 1519                 if (disk->fops->release)
> 1520                         disk->fops->release(disk, mode);
> 1521         }
> 1522         if (!bdev->bd_openers) {
> 1523                 struct module *owner = disk->fops->owner;
> 1524
> 1525                 disk_put_part(bdev->bd_part);
> 1526                 bdev->bd_part = NULL;
> 1527                 bdev->bd_disk = NULL;
> 1528                 if (bdev != bdev->bd_contains)
> 1529                         victim = bdev->bd_contains;
> 1530                 bdev->bd_contains = NULL;
> 1531
> 1532                 put_disk(disk); <-- may free q
> 1533                 module_put(owner);
> 1534         }
> 1535         mutex_unlock(&bdev->bd_mutex);
> 1536         bdput(bdev); <-- may touch q.backing_dev_info.wb

Ah, that's a sneaky bug.  Thanks a lot for chasing it down.  The
scenario sounds completely plausible to me.

> To reproduce, apply the attached patch (the systemd-udevd condition is
> just a convenience: udev reacts to change events by getting the bdev,
> which it then has to put), boot with slub_debug=,blkdev_queue and do:
> 
> $ sudo modprobe loop
> $ sudo losetup /dev/loop0 foo.img
> $ sudo dd if=/dev/urandom of=/dev/loop0 bs=1M count=1
> $ sudo losetup -d /dev/loop0
> $ sudo rmmod loop
> 
> (rmmod is key - it's the only way to get loop to do put_disk().  For
> rbd, it's just rbd map - dd - rbd unmap.)
> 
> In the past we used to reassign to default_backing_dev_info here, but
> it was nuked in b83ae6d42143 ("fs: remove mapping->backing_dev_info").

Woohoo, it wasn't me. :)

> Shortly after that, the cgroup-specific writeback patches from Tejun
> got merged, adding inode::i_wb and the inode_detach_wb() call.  The
> fix seems to be to detach the inode earlier, but I'm not familiar
> enough with the cgroup writeback code, so I'm sending my findings
> instead of a patch.  Christoph, Tejun?

It's stinky that the bdi is going away while the inode is still there.
Yeah, blkdev inodes are special and created early, but I think it makes
sense to keep the underlying structures (queue and bdi) around while
a bdev is associated with them.  Would simply moving put_disk() after
bdput() work?
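
Something like the below, I mean - a completely untested sketch, with
a hypothetical disk_to_put local recorded under bd_mutex (put_disk()
and module_put() would still have to stay conditional on the
!bd_openers case):

	mutex_unlock(&bdev->bd_mutex);
	bdput(bdev);			/* may touch q via i_wb */
	if (disk_to_put) {
		put_disk(disk_to_put);	/* now the queue can go */
		module_put(owner);
	}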

Thanks.

-- 
tejun


* Re: request_queue use-after-free - inode_detach_wb()
  2015-11-17 20:56 ` Tejun Heo
@ 2015-11-18 15:12   ` Ilya Dryomov
  2015-11-18 15:30     ` Tejun Heo
  0 siblings, 1 reply; 10+ messages in thread
From: Ilya Dryomov @ 2015-11-18 15:12 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Christoph Hellwig, linux-kernel, linux-fsdevel, Ceph Development

On Tue, Nov 17, 2015 at 9:56 PM, Tejun Heo <tj@kernel.org> wrote:
> Hello, Ilya.
>
> On Mon, Nov 16, 2015 at 09:59:18PM +0100, Ilya Dryomov wrote:
> ...
>> Looking at __blkdev_put(), the issue becomes clear: we take
>> precautions to flush before calling out to ->release() because, at
>> least according to the comment, ->release() can free the queue; we
>> record the owner pointer because put_disk() may free both the gendisk
>> and the queue; and then, after all that, we call bdput(), which may
>> touch the queue through wb_put() in inode_detach_wb().  (The fun part
>> is that wb_put() is supposed to be a noop for root wbs, but slab
>> debugging interferes with that by poisoning the wb->bdi pointer.)
>>
>> 1514                  * dirty data before.
>> 1515                  */
>> 1516                 bdev_write_inode(bdev);
>> 1517         }
>> 1518         if (bdev->bd_contains == bdev) {
>> 1519                 if (disk->fops->release)
>> 1520                         disk->fops->release(disk, mode);
>> 1521         }
>> 1522         if (!bdev->bd_openers) {
>> 1523                 struct module *owner = disk->fops->owner;
>> 1524
>> 1525                 disk_put_part(bdev->bd_part);
>> 1526                 bdev->bd_part = NULL;
>> 1527                 bdev->bd_disk = NULL;
>> 1528                 if (bdev != bdev->bd_contains)
>> 1529                         victim = bdev->bd_contains;
>> 1530                 bdev->bd_contains = NULL;
>> 1531
>> 1532                 put_disk(disk); <-- may free q
>> 1533                 module_put(owner);
>> 1534         }
>> 1535         mutex_unlock(&bdev->bd_mutex);
>> 1536         bdput(bdev); <-- may touch q.backing_dev_info.wb
>
> Ah, that's a sneaky bug.  Thanks a lot for chasing it down.  The
> scenario sounds completely plausible to me.
>
>> To reproduce, apply the attached patch (the systemd-udevd condition is
>> just a convenience: udev reacts to change events by getting the bdev,
>> which it then has to put), boot with slub_debug=,blkdev_queue and do:
>>
>> $ sudo modprobe loop
>> $ sudo losetup /dev/loop0 foo.img
>> $ sudo dd if=/dev/urandom of=/dev/loop0 bs=1M count=1
>> $ sudo losetup -d /dev/loop0
>> $ sudo rmmod loop
>>
>> (rmmod is key - it's the only way to get loop to do put_disk().  For
>> rbd, it's just rbd map - dd - rbd unmap.)
>>
>> In the past we used to reassign to default_backing_dev_info here, but
>> it was nuked in b83ae6d42143 ("fs: remove mapping->backing_dev_info").
>
> Woohoo, it wasn't me. :)

Well, you and Christoph have been pulling it in different directions.
He removed default_backing_dev_info along with mapping->backing_dev_info,
but you then kind of re-added that link with inode->i_wb.

>
>> Shortly after that, the cgroup-specific writeback patches from Tejun
>> got merged, adding inode::i_wb and the inode_detach_wb() call.  The
>> fix seems to be to detach the inode earlier, but I'm not familiar
>> enough with the cgroup writeback code, so I'm sending my findings
>> instead of a patch.  Christoph, Tejun?
>
> It's stinky that the bdi is going away while the inode is still there.
> Yeah, blkdev inodes are special and created early but I think it makes
> sense to keep the underlying structures (queue and bdi) around while
> bdev is associated with it.  Would simply moving put_disk() after
> bdput() work?

I'd think so.  struct block_device is essentially a "block device"
pseudo-filesystem inode, and as such, may not be around during the
entire lifetime of gendisk / queue.  It may be kicked out of the inode
cache as soon as the device is closed, so it makes sense to put it
before putting gendisk / queue, which will outlive it.

However, I'm confused by this comment

/*
 * ->release can cause the queue to disappear, so flush all
 * dirty data before.
 */
bdev_write_inode(bdev);

It's not true, at least since your 523e1d399ce0 ("block: make gendisk
hold a reference to its queue"), right?  (It used to say "->release can
cause the old bdi to disappear, so must switch it out first" and was
changed by Christoph in the middle of his backing_dev_info series.)

Thanks,

                Ilya


* Re: request_queue use-after-free - inode_detach_wb()
  2015-11-18 15:12   ` Ilya Dryomov
@ 2015-11-18 15:30     ` Tejun Heo
  2015-11-18 15:48       ` Ilya Dryomov
  0 siblings, 1 reply; 10+ messages in thread
From: Tejun Heo @ 2015-11-18 15:30 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Christoph Hellwig, linux-kernel, linux-fsdevel, Ceph Development

Hello, Ilya.

On Wed, Nov 18, 2015 at 04:12:07PM +0100, Ilya Dryomov wrote:
> > It's stinky that the bdi is going away while the inode is still there.
> > Yeah, blkdev inodes are special and created early, but I think it makes
> > sense to keep the underlying structures (queue and bdi) around while
> > a bdev is associated with them.  Would simply moving put_disk() after
> > bdput() work?
> 
> I'd think so.  struct block_device is essentially a "block device"
> pseudo-filesystem inode, and as such, may not be around during the
> entire lifetime of gendisk / queue.  It may be kicked out of the inode
> cache as soon as the device is closed, so it makes sense to put it
> before putting gendisk / queue, which will outlive it.
> 
> However, I'm confused by this comment
> 
> /*
>  * ->release can cause the queue to disappear, so flush all
>  * dirty data before.
>  */
> bdev_write_inode(bdev);
> 
> It's not true, at least since your 523e1d399ce0 ("block: make gendisk
> hold a reference to its queue"), right?  (It used to say "->release can
> cause the old bdi to disappear, so must switch it out first" and was
> changed by Christoph in the middle of his backing_dev_info series.)

Right, it started with each layer going away separately, which tends
to get tricky with hotunplug, and we've been gradually moving towards
a model where the entire stack stays till the last ref is gone, so
yeah the comment isn't true anymore.

Thanks.

-- 
tejun


* Re: request_queue use-after-free - inode_detach_wb()
  2015-11-18 15:30     ` Tejun Heo
@ 2015-11-18 15:48       ` Ilya Dryomov
  2015-11-18 15:56         ` Tejun Heo
  2015-11-19 20:56         ` Ilya Dryomov
  0 siblings, 2 replies; 10+ messages in thread
From: Ilya Dryomov @ 2015-11-18 15:48 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Christoph Hellwig, linux-kernel, linux-fsdevel, Ceph Development

On Wed, Nov 18, 2015 at 4:30 PM, Tejun Heo <tj@kernel.org> wrote:
> Hello, Ilya.
>
> On Wed, Nov 18, 2015 at 04:12:07PM +0100, Ilya Dryomov wrote:
>> > It's stinky that the bdi is going away while the inode is still there.
>> > Yeah, blkdev inodes are special and created early, but I think it makes
>> > sense to keep the underlying structures (queue and bdi) around while
>> > a bdev is associated with them.  Would simply moving put_disk() after
>> > bdput() work?
>>
>> I'd think so.  struct block_device is essentially a "block device"
>> pseudo-filesystem inode, and as such, may not be around during the
>> entire lifetime of gendisk / queue.  It may be kicked out of the inode
>> cache as soon as the device is closed, so it makes sense to put it
>> before putting gendisk / queue, which will outlive it.
>>
>> However, I'm confused by this comment
>>
>> /*
>>  * ->release can cause the queue to disappear, so flush all
>>  * dirty data before.
>>  */
>> bdev_write_inode(bdev);
>>
>> It's not true, at least since your 523e1d399ce0 ("block: make gendisk
>> hold a reference to its queue"), right?  (It used to say "->release can
>> cause the old bdi to disappear, so must switch it out first" and was
>> changed by Christoph in the middle of his backing_dev_info series.)
>
> Right, it started with each layer going away separately, which tends
> to get tricky with hotunplug, and we've been gradually moving towards
> a model where the entire stack stays till the last ref is gone, so
> yeah the comment isn't true anymore.

OK, I'll try to work up a patch to do bdput before put_disk and also
drop this comment.

Just to be clear, the bdi/wb vs inode lifetime rule is that an
inode's lifetime should always be nested within that of its bdi/wb?
There's been a lot of churn in this
and related areas recently, including in block drivers: 6cd18e711dd8
("block: destroy bdi before blockdev is unregistered"), b02176f30cd3
("block: don't release bdi while request_queue has live references"),
so I want to fully get my head around this.

Thanks,

                Ilya


* Re: request_queue use-after-free - inode_detach_wb()
  2015-11-18 15:48       ` Ilya Dryomov
@ 2015-11-18 15:56         ` Tejun Heo
  2015-11-19 20:56         ` Ilya Dryomov
  1 sibling, 0 replies; 10+ messages in thread
From: Tejun Heo @ 2015-11-18 15:56 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Christoph Hellwig, linux-kernel, linux-fsdevel, Ceph Development

Hello, Ilya.

On Wed, Nov 18, 2015 at 04:48:06PM +0100, Ilya Dryomov wrote:
> Just to be clear, the bdi/wb vs inode lifetime rule is that an
> inode's lifetime should always be nested within that of its bdi/wb?
> There's been a lot of churn in this

Yes, that's where *I* think we should be headed.  Stuff in lower
layers should stick around while upper layer things are around.

> and related areas recently, including in block drivers: 6cd18e711dd8
> ("block: destroy bdi before blockdev is unregistered"), b02176f30cd3
> ("block: don't release bdi while request_queue has live references"),
> so I want to fully get my head around this.

The end-of-life issue has always been a bit of a mess in the block
layer.  Thanks a lot for working on this.

-- 
tejun


* Re: request_queue use-after-free - inode_detach_wb()
  2015-11-18 15:48       ` Ilya Dryomov
  2015-11-18 15:56         ` Tejun Heo
@ 2015-11-19 20:56         ` Ilya Dryomov
  2015-11-19 21:18           ` Tejun Heo
  1 sibling, 1 reply; 10+ messages in thread
From: Ilya Dryomov @ 2015-11-19 20:56 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Christoph Hellwig, linux-kernel, linux-fsdevel, Ceph Development

On Wed, Nov 18, 2015 at 4:48 PM, Ilya Dryomov <idryomov@gmail.com> wrote:
> On Wed, Nov 18, 2015 at 4:30 PM, Tejun Heo <tj@kernel.org> wrote:
>> Hello, Ilya.
>>
>> On Wed, Nov 18, 2015 at 04:12:07PM +0100, Ilya Dryomov wrote:
>>> > It's stinky that the bdi is going away while the inode is still there.
>>> > Yeah, blkdev inodes are special and created early, but I think it makes
>>> > sense to keep the underlying structures (queue and bdi) around while
>>> > a bdev is associated with them.  Would simply moving put_disk() after
>>> > bdput() work?
>>>
>>> I'd think so.  struct block_device is essentially a "block device"
>>> pseudo-filesystem inode, and as such, may not be around during the
>>> entire lifetime of gendisk / queue.  It may be kicked out of the inode
>>> cache as soon as the device is closed, so it makes sense to put it
>>> before putting gendisk / queue, which will outlive it.
>>>
>>> However, I'm confused by this comment
>>>
>>> /*
>>>  * ->release can cause the queue to disappear, so flush all
>>>  * dirty data before.
>>>  */
>>> bdev_write_inode(bdev);
>>>
>>> It's not true, at least since your 523e1d399ce0 ("block: make gendisk
>>> hold a reference to its queue"), right?  (It used to say "->release can
>>> cause the old bdi to disappear, so must switch it out first" and was
>>> changed by Christoph in the middle of his backing_dev_info series.)
>>
>> Right, it started with each layer going away separately, which tends
>> to get tricky with hotunplug, and we've been gradually moving towards
>> a model where the entire stack stays till the last ref is gone, so
>> yeah the comment isn't true anymore.
>
> OK, I'll try to work up a patch to do bdput before put_disk and also
> drop this comment.

Doing bdput before put_disk in fs/block_dev.c helps, but isn't enough:
there is nothing guaranteeing that our bdput in __blkdev_put() is the
last one.  One particular issue is the linkage between /dev inodes and
internal bdev inodes - /dev inodes hold references to bdev inodes, so:

186 static void __fput(struct file *file)
187 {
188         struct dentry *dentry = file->f_path.dentry;
189         struct vfsmount *mnt = file->f_path.mnt;
190         struct inode *inode = file->f_inode;
191
192         might_sleep();
193
194         fsnotify_close(file);
195         /*
196          * The function eventpoll_release() should be the first called
197          * in the file cleanup chain.
198          */
199         eventpoll_release(file);
200         locks_remove_file(file);
201
202         if (unlikely(file->f_flags & FASYNC)) {
203                 if (file->f_op->fasync)
204                         file->f_op->fasync(-1, file, 0);
205         }
206         ima_file_free(file);
207         if (file->f_op->release)
208                 file->f_op->release(inode, file);

This translates to blkdev_put().  Suppose that, in response to this
release, the block device driver dropped the gendisk, queue, etc.
Then we, still in blkdev_put(), did our bdput and put_disk.  The queue
is now gone, but there's still a ref on the bdev inode - from the /dev
inode.  When the latter gets evicted thanks to the dput() below, we
end up in bd_forget(), which finishes up with iput(bdev->bd_inode)...

209         security_file_free(file);
210         if (unlikely(S_ISCHR(inode->i_mode) && inode->i_cdev != NULL &&
211                      !(file->f_mode & FMODE_PATH))) {
212                 cdev_put(inode->i_cdev);
213         }
214         fops_put(file->f_op);
215         put_pid(file->f_owner.pid);
216         if ((file->f_mode & (FMODE_READ | FMODE_WRITE)) == FMODE_READ)
217                 i_readcount_dec(inode);
218         if (file->f_mode & FMODE_WRITER) {
219                 put_write_access(inode);
220                 __mnt_drop_write(mnt);
221         }
222         file->f_path.dentry = NULL;
223         file->f_path.mnt = NULL;
224         file->f_inode = NULL;
225         file_free(file);
226         dput(dentry);
227         mntput(mnt);
228 }
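
(bd_forget(), for reference - roughly, paraphrasing fs/block_dev.c, so
treat this as a sketch:

void bd_forget(struct inode *inode)
{
	struct block_device *bdev = NULL;

	spin_lock(&bdev_lock);
	if (!sb_is_blkdev_sb(inode->i_sb))
		bdev = inode->i_bdev;
	inode->i_bdev = NULL;
	inode->i_mapping = &inode->i_data;
	spin_unlock(&bdev_lock);

	if (bdev)
		iput(bdev->bd_inode);	/* may be the final iput */
}

and that final iput() is what walks into the freed queue via
inode_detach_wb() -> wb_put().)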

On Wed, Nov 18, 2015 at 4:56 PM, Tejun Heo <tj@kernel.org> wrote:
> On Wed, Nov 18, 2015 at 04:48:06PM +0100, Ilya Dryomov wrote:
>> Just to be clear, the bdi/wb vs inode lifetime rules are that inodes
>> should always be within bdi/wb?  There's been a lot of churn in this
>
> Yes, that's where *I* think we should be headed.  Stuff in lower
> layers should stick around while upper layer things are around

I think the fundamental problem is the embedding of bdi in the queue.
The lifetime rules (or, rather, expectations) for the two seem to be
completely different and, while used together, they belong to different
subsystems.  Even if we find a way to fix this particular race, there
is a good chance someone will reintroduce it in the future, perhaps in
a more subtle way.
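
To spell out the embedding - roughly, per the 4.3 headers:

struct request_queue {
	...
	struct backing_dev_info backing_dev_info;  /* bdi lives and dies
						      with the queue */
	...
};

struct backing_dev_info {
	...
	struct bdi_writeback wb;  /* root wb embedded in the bdi */
	...
};

so a blkdev inode's i_wb can end up pointing into the middle of
a request_queue allocation.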

Thanks,

                Ilya


* Re: request_queue use-after-free - inode_detach_wb()
  2015-11-19 20:56         ` Ilya Dryomov
@ 2015-11-19 21:18           ` Tejun Heo
  2015-11-19 21:56             ` Ilya Dryomov
  0 siblings, 1 reply; 10+ messages in thread
From: Tejun Heo @ 2015-11-19 21:18 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Christoph Hellwig, linux-kernel, linux-fsdevel, Ceph Development

Hello, Ilya.

On Thu, Nov 19, 2015 at 09:56:21PM +0100, Ilya Dryomov wrote:
> > Yes, that's where *I* think we should be headed.  Stuff in lower
> > layers should stick around while upper layer things are around
> 
> I think the fundamental problem is the embedding of bdi in the queue.
> The lifetime rules (or, rather, expectations) for the two seem to be
> completely different and, while used together, they belong to different
> subsystems.  Even if we find a way to fix this particular race, there
> is a good chance someone will reintroduce it in the future, perhaps in
> a more subtle way.

You're right.  This is nasty.  Hmmm... the root problem is that beyond
the last __blkdev_put() the bdev and the disk don't really have
anything to do with each other anymore, but the bdev is still pointing
at the disk.  We already guarantee that the underlying disk hangs
around while there are bdevs associated with it.

We already know that the bdev is idle once bd_openers hits zero and
the inode gets flushed, so at that point, the problem is bdev's
inode->i_wb is still pointing to something that the bdev doesn't have
anything to do with.  So, can we do inode_detach_wb() after flushing
the inode?
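
IOW, something like the below in __blkdev_put()'s !bd_openers path -
completely untested, and the exact placement would need verifying:

	bdev_write_inode(bdev);
	/* bdev is idle now; stop pointing at the wb embedded in the
	 * queue's bdi before put_disk() can take the queue away */
	inode_detach_wb(bdev->bd_inode);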

Thanks.

-- 
tejun


* Re: request_queue use-after-free - inode_detach_wb()
  2015-11-19 21:18           ` Tejun Heo
@ 2015-11-19 21:56             ` Ilya Dryomov
  2015-11-19 22:14               ` Tejun Heo
  0 siblings, 1 reply; 10+ messages in thread
From: Ilya Dryomov @ 2015-11-19 21:56 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Christoph Hellwig, linux-kernel, linux-fsdevel, Ceph Development

On Thu, Nov 19, 2015 at 10:18 PM, Tejun Heo <tj@kernel.org> wrote:
> Hello, Ilya.
>
> On Thu, Nov 19, 2015 at 09:56:21PM +0100, Ilya Dryomov wrote:
>> > Yes, that's where *I* think we should be headed.  Stuff in lower
>> > layers should stick around while upper layer things are around
>>
>> I think the fundamental problem is the embedding of bdi in the queue.
>> The lifetime rules (or, rather, expectations) for the two seem to be
>> completely different and, while used together, they belong to different
>> subsystems.  Even if we find a way to fix this particular race, there
>> is a good chance someone will reintroduce it in the future, perhaps in
>> a more subtle way.
>
> You're right.  This is nasty.  Hmmm... the root problem is that beyond
> the last __blkdev_put() the bdev and the disk don't really have
> anything to do with each other anymore, but the bdev is still pointing
> at the disk.  We already guarantee that the underlying disk hangs
> around while there are bdevs associated with it.
>
> We already know that the bdev is idle once bd_openers hits zero and
> the inode gets flushed, so at that point, the problem is bdev's
> inode->i_wb is still pointing to something that the bdev doesn't have
> anything to do with.  So, can we do inode_detach_wb() after flushing
> the inode?

Detaching the inode earlier is what I suggested in the first email, but
I didn't know if this kind of special casing was OK.  I'll try it out.

Thanks,

                Ilya


* Re: request_queue use-after-free - inode_detach_wb()
  2015-11-19 21:56             ` Ilya Dryomov
@ 2015-11-19 22:14               ` Tejun Heo
  0 siblings, 0 replies; 10+ messages in thread
From: Tejun Heo @ 2015-11-19 22:14 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Christoph Hellwig, linux-kernel, linux-fsdevel, Ceph Development

Hello,

On Thu, Nov 19, 2015 at 10:56:43PM +0100, Ilya Dryomov wrote:
> Detaching the inode earlier is what I suggested in the first email, but
> I didn't know if this kind of special casing was OK.  I'll try it out.

Yeah, I was confused.  Sorry about that.  On the surface, it looks
like a special case, but everything around bdev is a special case
anyway, and looking at the underlying lifetime rules, I think this is
the right thing to do.

Thanks.

-- 
tejun

