* [PATCH] Btrfs: don't remove raid type sysfs entries until unmount @ 2014-05-22 0:12 Chris Mason 2014-05-22 1:21 ` Jeff Mahoney 0 siblings, 1 reply; 10+ messages in thread From: Chris Mason @ 2014-05-22 0:12 UTC (permalink / raw) To: linux-btrfs, David Sterba, Jeff Mahoney The Btrfs sysfs code removes entries for raid types that are no longer in use. This means that if you have a raid0 FS and use balance to turn it into a raid1 FS, the raid0 sysfs entries will go away. The rough chain of events is: __link_block_group() -> see we're the first RAIDX, add sysfs entry btrfs_remove_block_group() -> notice we're removing the last RAIDX remove sysfs entry This all makes sense until we try to add RAIDX back into the FS again. The problem is that our RAID kobjects are just in an array that gets freed at unmount time, instead of an array of pointers to kobjects that get freed when the great sysfs in the sky is done with them. When we remove the sysfs entry for a given raid level, the syfs code free's the name. When we use the same kobject to add back the RAIDX entry again, sysfs sees the old name pointer and tries to free it again. All of which is a long way of saying we're using sysfs wrong. For now, just don't remove entries for raid levels that we're no longer using. Signed-off-by: Chris Mason <clm@fb.com> Reported-by: Dave Sterba <dsterba@suse.cz> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index ddf16bf..acdc7ed 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -8535,12 +8535,14 @@ static void __link_block_group(struct btrfs_space_info *space_info, struct kobject *kobj = &space_info->block_group_kobjs[index]; int ret; - kobject_get(&space_info->kobj); /* put in release */ - ret = kobject_add(kobj, &space_info->kobj, "%s", - get_raid_name(index)); - if (ret) { - pr_warn("BTRFS: failed to add kobject for block cache. ignoring.\n"); - kobject_put(&space_info->kobj); + if (!kobj->name) { + kobject_get(&space_info->kobj); /* put in release */ + ret = kobject_add(kobj, &space_info->kobj, "%s", + get_raid_name(index)); + if (ret) { + pr_warn("BTRFS: failed to add kobject for block cache. ignoring.\n"); + kobject_put(&space_info->kobj); + } } } } @@ -8976,8 +8978,6 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans, */ list_del_init(&block_group->list); if (list_empty(&block_group->space_info->block_groups[index])) { - kobject_del(&block_group->space_info->block_group_kobjs[index]); - kobject_put(&block_group->space_info->block_group_kobjs[index]); clear_avail_alloc_bits(root->fs_info, block_group->flags); } up_write(&block_group->space_info->groups_sem); ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] Btrfs: don't remove raid type sysfs entries until unmount 2014-05-22 0:12 [PATCH] Btrfs: don't remove raid type sysfs entries until unmount Chris Mason @ 2014-05-22 1:21 ` Jeff Mahoney 2014-05-22 12:19 ` Chris Mason 0 siblings, 1 reply; 10+ messages in thread From: Jeff Mahoney @ 2014-05-22 1:21 UTC (permalink / raw) To: Chris Mason, linux-btrfs, David Sterba On 05/21/2014 08:12 PM, Chris Mason wrote: > > The Btrfs sysfs code removes entries for raid types that are no > longer in use. This means that if you have a raid0 FS and use balance > to turn it into a raid1 FS, the raid0 sysfs entries will go away. > > The rough chain of events is: > > __link_block_group() -> see we're the first RAIDX, add sysfs entry > > btrfs_remove_block_group() -> notice we're removing the last RAIDX > remove sysfs entry > > This all makes sense until we try to add RAIDX back into the FS again. > The problem is that our RAID kobjects are just in an array that gets > freed at unmount time, instead of an array of pointers to kobjects that > get freed when the great sysfs in the sky is done with them. > > When we remove the sysfs entry for a given raid level, the syfs code > free's the name. When we use the same kobject to add back the RAIDX > entry again, sysfs sees the old name pointer and tries to free it again. > > All of which is a long way of saying we're using sysfs wrong. For now, > just don't remove entries for raid levels that we're no longer using. Hi Chris - Thanks for posting the problem. I disagree that sysfs is being used wrong here - or well, not on purpose. The problem is only that I didn't anticipate raid levels going away and only initialize the kobjects i update_space_info. The name being double-freed is only part of the problem. The kobject isn't reinitialized at all. Clearing ->name and re-calling kobject_init would fix it - but I think we can do one better. kobject_cleanup's comment for freeing ->name claims that ->name being set means that it was allocated by the kobject infrastructure. The raid names come directly out of an array anyway, so we don't even need to copy it. We can avoid the string allocation entirely and handle the reuse properly. The following (untested, tonight anyway) patch should fix it. -Jeff --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -3401,10 +3401,8 @@ static int update_space_info(struct btrfs_fs_info *info, u64 flags, return ret; } - for (i = 0; i < BTRFS_NR_RAID_TYPES; i++) { + for (i = 0; i < BTRFS_NR_RAID_TYPES; i++) INIT_LIST_HEAD(&found->block_groups[i]); - kobject_init(&found->block_group_kobjs[i], &btrfs_raid_ktype); - } init_rwsem(&found->groups_sem); spin_lock_init(&found->lock); found->flags = flags & BTRFS_BLOCK_GROUP_TYPE_MASK; @@ -8356,8 +8354,9 @@ static void __link_block_group(struct btrfs_space_info *space_info, int ret; kobject_get(&space_info->kobj); /* put in release */ - ret = kobject_add(kobj, &space_info->kobj, "%s", - get_raid_name(index)); + kobject->name = get_raid_name(index); + ret = kobject_init_and_add(kobj, &btrfs_raid_ktype, + &space_info->kobj, NULL); if (ret) { pr_warn("BTRFS: failed to add kobject for block cache. ignoring.\n"); kobject_put(&space_info->kobj); diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index c5eb214..d742d79 100644 --- a/fs/btrfs/sysfs.c +++ b/fs/btrfs/sysfs.c @@ -288,6 +288,7 @@ static struct attribute *raid_attributes[] = { static void release_raid_kobj(struct kobject *kobj) { + kobj->name = NULL; kobject_put(kobj->parent); } -Jeff > Signed-off-by: Chris Mason <clm@fb.com> > Reported-by: Dave Sterba <dsterba@suse.cz> > > diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c > index ddf16bf..acdc7ed 100644 > --- a/fs/btrfs/extent-tree.c > +++ b/fs/btrfs/extent-tree.c > @@ -8535,12 +8535,14 @@ static void __link_block_group(struct btrfs_space_info *space_info, > struct kobject *kobj = &space_info->block_group_kobjs[index]; > int ret; > > - kobject_get(&space_info->kobj); /* put in release */ > - ret = kobject_add(kobj, &space_info->kobj, "%s", > - get_raid_name(index)); > - if (ret) { > - pr_warn("BTRFS: failed to add kobject for block cache. ignoring.\n"); > - kobject_put(&space_info->kobj); > + if (!kobj->name) { > + kobject_get(&space_info->kobj); /* put in release */ > + ret = kobject_add(kobj, &space_info->kobj, "%s", > + get_raid_name(index)); > + if (ret) { > + pr_warn("BTRFS: failed to add kobject for block cache. ignoring.\n"); > + kobject_put(&space_info->kobj); > + } > } > } > } > @@ -8976,8 +8978,6 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans, > */ > list_del_init(&block_group->list); > if (list_empty(&block_group->space_info->block_groups[index])) { > - kobject_del(&block_group->space_info->block_group_kobjs[index]); > - kobject_put(&block_group->space_info->block_group_kobjs[index]); > clear_avail_alloc_bits(root->fs_info, block_group->flags); > } > up_write(&block_group->space_info->groups_sem); > ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] Btrfs: don't remove raid type sysfs entries until unmount 2014-05-22 1:21 ` Jeff Mahoney @ 2014-05-22 12:19 ` Chris Mason 2014-05-22 15:05 ` Jeff Mahoney 0 siblings, 1 reply; 10+ messages in thread From: Chris Mason @ 2014-05-22 12:19 UTC (permalink / raw) To: Jeff Mahoney, linux-btrfs, David Sterba On 05/21/2014 09:21 PM, Jeff Mahoney wrote: > On 05/21/2014 08:12 PM, Chris Mason wrote: >> >> The Btrfs sysfs code removes entries for raid types that are no >> longer in use. This means that if you have a raid0 FS and use balance >> to turn it into a raid1 FS, the raid0 sysfs entries will go away. >> >> The rough chain of events is: >> >> __link_block_group() -> see we're the first RAIDX, add sysfs entry >> >> btrfs_remove_block_group() -> notice we're removing the last RAIDX >> remove sysfs entry >> >> This all makes sense until we try to add RAIDX back into the FS again. >> The problem is that our RAID kobjects are just in an array that gets >> freed at unmount time, instead of an array of pointers to kobjects that >> get freed when the great sysfs in the sky is done with them. >> >> When we remove the sysfs entry for a given raid level, the syfs code >> free's the name. When we use the same kobject to add back the RAIDX >> entry again, sysfs sees the old name pointer and tries to free it again. >> >> All of which is a long way of saying we're using sysfs wrong. For now, >> just don't remove entries for raid levels that we're no longer using. > > Hi Chris - > > Thanks for posting the problem. I disagree that sysfs is being used > wrong here - or well, not on purpose. The problem is only that I didn't > anticipate raid levels going away and only initialize the kobjects i > update_space_info. Thanks Jeff, this is more a comment on the sadness that comes from using sysfs, not a knock on your patch ;) > > The name being double-freed is only part of the problem. The kobject > isn't reinitialized at all. Clearing ->name and re-calling kobject_init > would fix it - but I think we can do one better. > > kobject_cleanup's comment for freeing ->name claims that ->name being > set means that it was allocated by the kobject infrastructure. The raid > names come directly out of an array anyway, so we don't even need to > copy it. We can avoid the string allocation entirely and handle the > reuse properly. > > The following (untested, tonight anyway) patch should fix it. Can we safely reinit a kobject that has been put in use in sysfs? Given all the things that can hold refs etc is this legal? -chris ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Btrfs: don't remove raid type sysfs entries until unmount 2014-05-22 12:19 ` Chris Mason @ 2014-05-22 15:05 ` Jeff Mahoney 2014-05-22 17:17 ` Chris Mason 0 siblings, 1 reply; 10+ messages in thread From: Jeff Mahoney @ 2014-05-22 15:05 UTC (permalink / raw) To: Chris Mason, linux-btrfs, David Sterba -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 5/22/14, 8:19 AM, Chris Mason wrote: > Can we safely reinit a kobject that has been put in use in sysfs? > Given all the things that can hold refs etc is this legal? It depends on how the kobject is being used. It wouldn't be safe to re-use the kobject embedded in space_info since it controls the lifetime of the object, regardless of its use in sysfs. The kobjects for block groups only exist for creating the subdirectories and their lifetime is actually the lifetime of the space_info. We take a reference to the space_info when we add them to sysfs because that's where they draw their data. The only reference to a block group kobject is taken when we add it to sysfs and is dropped when we remove it. Holding a sysfs file open doesn't pin the kobject, so once we remove it from sysfs (kobject_del waits for readers to complete), it's safe to reinitialize it. - -Jeff - -- Jeff Mahoney SUSE Labs -----BEGIN PGP SIGNATURE----- Version: GnuPG/MacGPG2 v2.0.22 (Darwin) iQIcBAEBAgAGBQJTfhJBAAoJEB57S2MheeWymZoP/Ru0dB9DDQW7eCgeOuXTNMZt uy29gJsUjC6QoQagJ0aHT1SiNw9EFHuDnfrnUaOGxO2l0dDostdrU4xqH/632xVv 9o4QrTWtg0d7mN2+D4YaYLB/74Nd1O86CE+3zbmM8CY8Ueh+JMR63jSSU5k2zeCX ts0dcdK31DBP/aoof6NS8KNdJOpi0dIiefH2Si4Zm2RYBxcMBrKpI8KIrhab1Bo6 vomgq4XkAZUK7G2C/gOmGdqMlSMwMsOhiZ2fYH7pdR74q4vPu6K+RnxFnIuOzmPi nzMxvgYnjjn1+cynTmZD3xMj94MyCamV1wbj4jciIiDdNrr1XRJH1EMqA5T3wDN4 vw52dVm6sDGizO4/coU21WMum2IUrIbCESXK74Tef3FLWzRWerKeZStIYBYGmVuz WGTfwUJ+SnhEwSAEHbYIDM8fogVvei2UlPAwQ4o2kSzM9kKf+E5YbwYJDbdKlQ0C tF6kyZFzkJ2ZEEVGv7fqswHudkWGafrPxRCN9iMvxWUWcxNybh1PZ3MW45uvfyA7 UrZYuDruTS2WorYSivKdFi+9DgTxms/ekfiNi4SZQ5Z1JteAPU8AS5OAQOuo+UCD uIdlRzorK/pzYH0c/qbN5vy/t+cBVYU6ya238R9eTbq1jQGWWZlZuwaUJQb9nhmS yKBOJnpRMrJOpO6Onrrh =0M9G -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Btrfs: don't remove raid type sysfs entries until unmount 2014-05-22 15:05 ` Jeff Mahoney @ 2014-05-22 17:17 ` Chris Mason 2014-05-23 12:38 ` David Sterba 0 siblings, 1 reply; 10+ messages in thread From: Chris Mason @ 2014-05-22 17:17 UTC (permalink / raw) To: Jeff Mahoney, linux-btrfs, David Sterba On 05/22/2014 11:05 AM, Jeff Mahoney wrote: > - gpg control packet > On 5/22/14, 8:19 AM, Chris Mason wrote: >> Can we safely reinit a kobject that has been put in use in sysfs? >> Given all the things that can hold refs etc is this legal? > > It depends on how the kobject is being used. > > It wouldn't be safe to re-use the kobject embedded in space_info since > it controls the lifetime of the object, regardless of its use in sysfs. > > The kobjects for block groups only exist for creating the > subdirectories and their lifetime is actually the lifetime of the > space_info. We take a reference to the space_info when we add them to > sysfs because that's where they draw their data. The only reference to > a block group kobject is taken when we add it to sysfs and is dropped > when we remove it. Holding a sysfs file open doesn't pin the kobject, > so once we remove it from sysfs (kobject_del waits for readers to > complete), it's safe to reinitialize it. > Fair enough, once you've tested this new patch a bit I'll drop mine for yours. Thanks! -chris ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Btrfs: don't remove raid type sysfs entries until unmount 2014-05-22 17:17 ` Chris Mason @ 2014-05-23 12:38 ` David Sterba 2014-05-23 12:56 ` Jeff Mahoney 0 siblings, 1 reply; 10+ messages in thread From: David Sterba @ 2014-05-23 12:38 UTC (permalink / raw) To: Chris Mason; +Cc: Jeff Mahoney, linux-btrfs On Thu, May 22, 2014 at 01:17:50PM -0400, Chris Mason wrote: > On 05/22/2014 11:05 AM, Jeff Mahoney wrote: > > - gpg control packet > > On 5/22/14, 8:19 AM, Chris Mason wrote: > >> Can we safely reinit a kobject that has been put in use in sysfs? > >> Given all the things that can hold refs etc is this legal? > > > > It depends on how the kobject is being used. > > > > It wouldn't be safe to re-use the kobject embedded in space_info since > > it controls the lifetime of the object, regardless of its use in sysfs. > > > > The kobjects for block groups only exist for creating the > > subdirectories and their lifetime is actually the lifetime of the > > space_info. We take a reference to the space_info when we add them to > > sysfs because that's where they draw their data. The only reference to > > a block group kobject is taken when we add it to sysfs and is dropped > > when we remove it. Holding a sysfs file open doesn't pin the kobject, > > so once we remove it from sysfs (kobject_del waits for readers to > > complete), it's safe to reinitialize it. > > > > Fair enough, once you've tested this new patch a bit I'll drop mine for > yours. With a fixup in __link_block_group - kobject->name = get_raid_name(index); + kobj->name = get_raid_name(index); it crashes right away, tests/btrfs/001, patch applied on top of 3.15-rc6, I haven't analyzed the cause yet. [ 520.880250] ------------[ cut here ]------------ [ 520.884185] kernel BUG at mm/slub.c:3376! [ 520.884185] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC [ 520.884185] Modules linked in: rpcsec_gss_krb5 dm_crypt loop btrfs [ 520.884185] CPU: 1 PID: 3698 Comm: umount Not tainted 3.15.0-rc6-default+ #144 [ 520.884185] Hardware name: Intel Corporation Santa Rosa platform/Matanzas, BIOS TSRSCRB1.86C.0047.B00.0610170821 10/17/06 [ 520.884185] task: ffff880075892370 ti: ffff88006e0ce000 task.ti: ffff88006e0ce000 [ 520.884185] RIP: 0010:[<ffffffff8116efcf>] [<ffffffff8116efcf>] kfree+0x1bf/0x1d0 [ 520.884185] RSP: 0018:ffff88006e0cfcb8 EFLAGS: 00010246 [ 520.884185] RAX: 4000000000000000 RBX: ffffffffa00bb63e RCX: 0000000180190007 [ 520.884185] RDX: 0000000000000000 RSI: ffffea0001d5ad40 RDI: ffffffffa00bb63e [ 520.884185] RBP: ffff88006e0cfce8 R08: 0000000000000001 R09: 0000000000000000 [ 520.884185] R10: 0000000000000000 R11: 0000000000000001 R12: ffffea0000802ec0 [ 520.884185] R13: ffffffff813ba137 R14: ffffffffa00bb63e R15: 0000000000000003 [ 520.884185] FS: 00007fba2c0cd7e0(0000) GS:ffff88007d400000(0000) knlGS:0000000000000000 [ 520.884185] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 520.884185] CR2: 00007fba2b759610 CR3: 00000000756a6000 CR4: 00000000000007e0 [ 520.884185] Stack: [ 520.884185] ffff88007a849000 ffffffffa00cde20 ffff880075950be0 ffff880075950ba8 [ 520.884185] ffffffffa00bb63e 0000000000000003 ffff88006e0cfd18 ffffffff813ba137 [ 520.884185] ffff880075950ba8 ffff880075950800 ffff8800755e5a58 0000000000000004 [ 520.884185] Call Trace: [ 520.884185] [<ffffffff813ba137>] kobject_release+0xa7/0x1d0 [ 520.884185] [<ffffffff813ba291>] kobject_put+0x31/0x70 [ 520.884185] [<ffffffffa00223ca>] btrfs_free_block_groups+0x30a/0x420 [btrfs] [ 520.884185] [<ffffffffa002f939>] close_ctree+0x149/0x2e0 [btrfs] [ 520.884185] [<ffffffff8119560f>] ? dispose_list+0x4f/0x60 [ 520.884185] [<ffffffff81196544>] ? evict_inodes+0x114/0x130 [ 520.884185] [<ffffffffa00019b9>] btrfs_put_super+0x19/0x20 [btrfs] [ 520.884185] [<ffffffff8117c78e>] generic_shutdown_super+0x7e/0x110 [ 520.884185] [<ffffffff8117c8b6>] kill_anon_super+0x16/0x30 [ 520.884185] [<ffffffffa000289a>] btrfs_kill_super+0x1a/0xa0 [btrfs] [ 520.884185] [<ffffffff8117c94d>] deactivate_locked_super+0x4d/0x80 [ 520.884185] [<ffffffff8117cf5a>] deactivate_super+0x4a/0x70 [ 520.884185] [<ffffffff8119a341>] mntput_no_expire+0x111/0x180 [ 520.884185] [<ffffffff8119a247>] ? mntput_no_expire+0x17/0x180 [ 520.884185] [<ffffffff81190213>] ? dput+0x23/0x110 [ 520.884185] [<ffffffff8119b8eb>] SyS_umount+0xcb/0x420 [ 520.884185] [<ffffffff81a16092>] ? system_call_fastpath+0x16/0x1b [ 520.884185] [<ffffffff81a16092>] system_call_fastpath+0x16/0x1b [ 520.884185] Code: c4 40 74 05 41 8b 74 24 68 4c 89 e7 e8 4b 25 fc ff e9 4f ff ff ff 49 8b 44 24 30 49 8b 14 24 80 e6 80 4c 0f 45 e0 e9 ae fe ff ff <0f> 0b 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 [ 520.884185] RIP [<ffffffff8116efcf>] kfree+0x1bf/0x1d0 [ 520.884185] RSP <ffff88006e0cfcb8> [ 521.242658] ---[ end trace 4228a5dbb43499c0 ]--- ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Btrfs: don't remove raid type sysfs entries until unmount 2014-05-23 12:38 ` David Sterba @ 2014-05-23 12:56 ` Jeff Mahoney 2014-05-23 14:32 ` David Sterba 0 siblings, 1 reply; 10+ messages in thread From: Jeff Mahoney @ 2014-05-23 12:56 UTC (permalink / raw) To: dsterba, Chris Mason, linux-btrfs -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 5/23/14, 8:38 AM, David Sterba wrote: > On Thu, May 22, 2014 at 01:17:50PM -0400, Chris Mason wrote: >> On 05/22/2014 11:05 AM, Jeff Mahoney wrote: >>> - gpg control packet On 5/22/14, 8:19 AM, Chris Mason wrote: >>>> Can we safely reinit a kobject that has been put in use in >>>> sysfs? Given all the things that can hold refs etc is this >>>> legal? >>> >>> It depends on how the kobject is being used. >>> >>> It wouldn't be safe to re-use the kobject embedded in >>> space_info since it controls the lifetime of the object, >>> regardless of its use in sysfs. >>> >>> The kobjects for block groups only exist for creating the >>> subdirectories and their lifetime is actually the lifetime of >>> the space_info. We take a reference to the space_info when we >>> add them to sysfs because that's where they draw their data. >>> The only reference to a block group kobject is taken when we >>> add it to sysfs and is dropped when we remove it. Holding a >>> sysfs file open doesn't pin the kobject, so once we remove it >>> from sysfs (kobject_del waits for readers to complete), it's >>> safe to reinitialize it. >>> >> >> Fair enough, once you've tested this new patch a bit I'll drop >> mine for yours. > > With a fixup in __link_block_group > > - kobject->name = get_raid_name(index); + > kobj->name = get_raid_name(index); > > it crashes right away, tests/btrfs/001, patch applied on top of > 3.15-rc6, I haven't analyzed the cause yet. Sigh. Yep. kobject_cleanup caches ->name before the release function and ignores the cleared value. It seems the "free name if we alloced it" comment in there was leftover from the middle of a patch series Kay applied in late 2007. Commit 0f4dafc05 adds the comment while adding a flag to indicate that kobject_set_name_vargs set the name. The commit af5ca3f4e says kobject names must be dynamic but didn't update the comment. Ok, so we can't save the strdup. - -Jeff > [ 520.880250] ------------[ cut here ]------------ [ 520.884185] > kernel BUG at mm/slub.c:3376! [ 520.884185] invalid opcode: 0000 > [#1] SMP DEBUG_PAGEALLOC [ 520.884185] Modules linked in: > rpcsec_gss_krb5 dm_crypt loop btrfs [ 520.884185] CPU: 1 PID: 3698 > Comm: umount Not tainted 3.15.0-rc6-default+ #144 [ 520.884185] > Hardware name: Intel Corporation Santa Rosa platform/Matanzas, BIOS > TSRSCRB1.86C.0047.B00.0610170821 10/17/06 [ 520.884185] task: > ffff880075892370 ti: ffff88006e0ce000 task.ti: ffff88006e0ce000 [ > 520.884185] RIP: 0010:[<ffffffff8116efcf>] [<ffffffff8116efcf>] > kfree+0x1bf/0x1d0 [ 520.884185] RSP: 0018:ffff88006e0cfcb8 > EFLAGS: 00010246 [ 520.884185] RAX: 4000000000000000 RBX: > ffffffffa00bb63e RCX: 0000000180190007 [ 520.884185] RDX: > 0000000000000000 RSI: ffffea0001d5ad40 RDI: ffffffffa00bb63e [ > 520.884185] RBP: ffff88006e0cfce8 R08: 0000000000000001 R09: > 0000000000000000 [ 520.884185] R10: 0000000000000000 R11: > 0000000000000001 R12: ffffea0000802ec0 [ 520.884185] R13: > ffffffff813ba137 R14: ffffffffa00bb63e R15: 0000000000000003 [ > 520.884185] FS: 00007fba2c0cd7e0(0000) GS:ffff88007d400000(0000) > knlGS:0000000000000000 [ 520.884185] CS: 0010 DS: 0000 ES: 0000 > CR0: 000000008005003b [ 520.884185] CR2: 00007fba2b759610 CR3: > 00000000756a6000 CR4: 00000000000007e0 [ 520.884185] Stack: [ > 520.884185] ffff88007a849000 ffffffffa00cde20 ffff880075950be0 > ffff880075950ba8 [ 520.884185] ffffffffa00bb63e 0000000000000003 > ffff88006e0cfd18 ffffffff813ba137 [ 520.884185] ffff880075950ba8 > ffff880075950800 ffff8800755e5a58 0000000000000004 [ 520.884185] > Call Trace: [ 520.884185] [<ffffffff813ba137>] > kobject_release+0xa7/0x1d0 [ 520.884185] [<ffffffff813ba291>] > kobject_put+0x31/0x70 [ 520.884185] [<ffffffffa00223ca>] > btrfs_free_block_groups+0x30a/0x420 [btrfs] [ 520.884185] > [<ffffffffa002f939>] close_ctree+0x149/0x2e0 [btrfs] [ 520.884185] > [<ffffffff8119560f>] ? dispose_list+0x4f/0x60 [ 520.884185] > [<ffffffff81196544>] ? evict_inodes+0x114/0x130 [ 520.884185] > [<ffffffffa00019b9>] btrfs_put_super+0x19/0x20 [btrfs] [ > 520.884185] [<ffffffff8117c78e>] > generic_shutdown_super+0x7e/0x110 [ 520.884185] > [<ffffffff8117c8b6>] kill_anon_super+0x16/0x30 [ 520.884185] > [<ffffffffa000289a>] btrfs_kill_super+0x1a/0xa0 [btrfs] [ > 520.884185] [<ffffffff8117c94d>] > deactivate_locked_super+0x4d/0x80 [ 520.884185] > [<ffffffff8117cf5a>] deactivate_super+0x4a/0x70 [ 520.884185] > [<ffffffff8119a341>] mntput_no_expire+0x111/0x180 [ 520.884185] > [<ffffffff8119a247>] ? mntput_no_expire+0x17/0x180 [ 520.884185] > [<ffffffff81190213>] ? dput+0x23/0x110 [ 520.884185] > [<ffffffff8119b8eb>] SyS_umount+0xcb/0x420 [ 520.884185] > [<ffffffff81a16092>] ? system_call_fastpath+0x16/0x1b [ > 520.884185] [<ffffffff81a16092>] system_call_fastpath+0x16/0x1b [ > 520.884185] Code: c4 40 74 05 41 8b 74 24 68 4c 89 e7 e8 4b 25 fc > ff e9 4f ff ff ff 49 8b 44 24 30 49 8b 14 24 80 e6 80 4c 0f 45 e0 > e9 ae fe ff ff <0f> 0b 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 > 55 48 89 e5 [ 520.884185] RIP [<ffffffff8116efcf>] > kfree+0x1bf/0x1d0 [ 520.884185] RSP <ffff88006e0cfcb8> [ > 521.242658] ---[ end trace 4228a5dbb43499c0 ]--- > - -- Jeff Mahoney SUSE Labs -----BEGIN PGP SIGNATURE----- Version: GnuPG/MacGPG2 v2.0.22 (Darwin) iQIcBAEBAgAGBQJTf0VmAAoJEB57S2MheeWyzg4QAL8ZbJ4t9Pky0iE5p9odTbHe RaZkS6ctraqHtHgmViqTxu78cnbRvxoQ6oOE9ZSCS9YczWkuABFk8EQ0AbTRnblo wO6awRI0++rHmf1gA+soAtu6TfLOYR43SOFu2H9aMK/YLAGBzFnaLefeFfloj8gb xVZnBlv3M4syKtcOmrxkCraxz9G9XakwFDI1AGtSTgxBU4tFIE0cyQOtQZf0sAlI XqphtcoyWMa+T9GOLdFlUpV0p1OMArZhj+/tyMV7UotwHG5Y754kBU4Vd1wCiP/q p/+lmnlMxVRdREHkGYZLpeG39bQQjMhp8qBgznPBln/R3IQq0nFJxrcPTO3ADH5T LuLZwl63vW/uetWbVCvDLev27RXmg9aU1RtOGes0bK7HKDx+EymE5CpayADPnmlo ZLcRol6wol8+R4BJQI9tKffy30PN7Uzhtt79X7EZiVHX5DMGH3aMnOwUihdlgvNd YBgzwuXFjceoby7y+K1p7CpsDuh2VTBDnQ5e5SiJ3uyLpLeLwvctJ2BmVsEyS076 e4TLQRo+bW9M1y2Pn70B3tjKcQDPnufavk5bvsJqXqXLbpAp2xQDaN0RQrbnXeZG PU8uknhLvOmdMdxw9wqVPOGpUOIZytDofAVK4QSXGzjCsMFnl3NLqj0tGgUh9DOU mNH6TpALtxBaY1sJh8DN =3sYp -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Btrfs: don't remove raid type sysfs entries until unmount 2014-05-23 12:56 ` Jeff Mahoney @ 2014-05-23 14:32 ` David Sterba 2014-05-23 14:34 ` Jeff Mahoney 0 siblings, 1 reply; 10+ messages in thread From: David Sterba @ 2014-05-23 14:32 UTC (permalink / raw) To: Jeff Mahoney; +Cc: Chris Mason, linux-btrfs On Fri, May 23, 2014 at 08:56:06AM -0400, Jeff Mahoney wrote: > Sigh. Yep. kobject_cleanup caches ->name before the release function > and ignores the cleared value. It seems the "free name if we alloced > it" comment in there was leftover from the middle of a patch series > Kay applied in late 2007. Commit 0f4dafc05 adds the comment while > adding a flag to indicate that kobject_set_name_vargs set the name. > The commit af5ca3f4e says kobject names must be dynamic but didn't > update the comment. > > Ok, so we can't save the strdup. The tests have been running fine for more than 40 minutes, previously it took a few minutes to trigger the bug. my quick fix: --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -8354,7 +8354,8 @@ static void __link_block_group(struct btrfs_space_info *space_info, int ret; kobject_get(&space_info->kobj); /* put in release */ - kobj->name = get_raid_name(index); + kobj->name = kstrdup(get_raid_name(index), GFP_ATOMIC); + BUG_ON(!kobj->name); ret = kobject_init_and_add(kobj, &btrfs_raid_ktype, &space_info->kobj, NULL); if (ret) { diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index d742d7973d3e..ac26f91968a7 100644 --- a/fs/btrfs/sysfs.c +++ b/fs/btrfs/sysfs.c @@ -288,7 +288,7 @@ static struct attribute *raid_attributes[] = { static void release_raid_kobj(struct kobject *kobj) { - kobj->name = NULL; + /* kobj->name = NULL; */ kobject_put(kobj->parent); } ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] Btrfs: don't remove raid type sysfs entries until unmount 2014-05-23 14:32 ` David Sterba @ 2014-05-23 14:34 ` Jeff Mahoney 2014-05-23 18:09 ` Jeff Mahoney 0 siblings, 1 reply; 10+ messages in thread From: Jeff Mahoney @ 2014-05-23 14:34 UTC (permalink / raw) To: dsterba, Chris Mason, linux-btrfs -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 5/23/14, 10:32 AM, David Sterba wrote: > On Fri, May 23, 2014 at 08:56:06AM -0400, Jeff Mahoney wrote: >> Sigh. Yep. kobject_cleanup caches ->name before the release >> function and ignores the cleared value. It seems the "free name >> if we alloced it" comment in there was leftover from the middle >> of a patch series Kay applied in late 2007. Commit 0f4dafc05 adds >> the comment while adding a flag to indicate that >> kobject_set_name_vargs set the name. The commit af5ca3f4e says >> kobject names must be dynamic but didn't update the comment. >> >> Ok, so we can't save the strdup. > > The tests have been running fine for more than 40 minutes, > previously it took a few minutes to trigger the bug. > > my quick fix: > > --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ > -8354,7 +8354,8 @@ static void __link_block_group(struct > btrfs_space_info *space_info, int ret; > > kobject_get(&space_info->kobj); /* put in release */ - > kobj->name = get_raid_name(index); + kobj->name = > kstrdup(get_raid_name(index), GFP_ATOMIC); + > BUG_ON(!kobj->name); ret = kobject_init_and_add(kobj, > &btrfs_raid_ktype, &space_info->kobj, NULL); if (ret) { diff --git > a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index > d742d7973d3e..ac26f91968a7 100644 --- a/fs/btrfs/sysfs.c +++ > b/fs/btrfs/sysfs.c @@ -288,7 +288,7 @@ static struct attribute > *raid_attributes[] = { > > static void release_raid_kobj(struct kobject *kobj) { - > kobj->name = NULL; + /* kobj->name = NULL; */ > kobject_put(kobj->parent); } > I'm testing as well. I'm seeing: kobject (ffff88033f847e78): tried to init an initialized object, something is seriously wrong. .. as well. A memset would clear this up, but perhaps that's not the right solution after all. - -Jeff - -- Jeff Mahoney SUSE Labs -----BEGIN PGP SIGNATURE----- Version: GnuPG/MacGPG2 v2.0.22 (Darwin) iQIcBAEBAgAGBQJTf1yEAAoJEB57S2MheeWyZCkQAK09TN//RPDTlTidMwg76LOh As1Q+dHPvCnIa3nvlI4XpCtVaIdOHiaDzGLHOSA/Dfq3PpNvETg/9R5eqcnMEXN+ W74Wr+ZJ5iwSq5jeMP4p73Ph+Dwzmc6OVr7mD4+CqVkdMuPBzWBl+aKA6d1UEWFb 2J/nUTS4EQz9I7O6UBgXt0pkZXavwp0gqqpHn4YV848gAnNyz6qhHe3nXLS/3XCW LX3WUS1HROoD2fTzKMSxlw1xHiHFZ2XItOMgP9Ma1pr/cQA1InGxTCIVshtkx8Hq uxgSorK/6H0iRsCVRuXX6JowYodX3UJ9x3VUQQbpI5sm5X6OlPzcwS9C4MqjjVNn 3Qqx7CJk79+CZBcBYMdK2borjrmCzQeyGQb49cH4eVNKUYLL2ja/ygoFq7aKOMgc dTuZbcwQc43uVm768fAxTYEDqJEdHI+LUKoK05XDEm6SbaDt81T2to0wxyJjpiE3 rbNmS1ubjE+WvoZzyYREd2WRUqZsuJV27017CGkJSHqj4OjfG3lvhkrVtM5Tu+jw 8f+KlgAHHN674mRXqUDU4a8Twc7aC5Y36v5mscXAIm2wQKh5EKdPAxedUOqHmMT9 t3WJG8ErmitEuZ8Hb4FzIeL1U3ztnULg+K7vYLCuQS1/C4p3xXU6jtOgPngIuGmj BGMMRo82DAEc+pxHJI7w =RN4K -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Btrfs: don't remove raid type sysfs entries until unmount 2014-05-23 14:34 ` Jeff Mahoney @ 2014-05-23 18:09 ` Jeff Mahoney 0 siblings, 0 replies; 10+ messages in thread From: Jeff Mahoney @ 2014-05-23 18:09 UTC (permalink / raw) To: dsterba, Chris Mason, linux-btrfs -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 5/23/14, 10:34 AM, Jeff Mahoney wrote: > On 5/23/14, 10:32 AM, David Sterba wrote: >> On Fri, May 23, 2014 at 08:56:06AM -0400, Jeff Mahoney wrote: >>> Sigh. Yep. kobject_cleanup caches ->name before the release >>> function and ignores the cleared value. It seems the "free >>> name if we alloced it" comment in there was leftover from the >>> middle of a patch series Kay applied in late 2007. Commit >>> 0f4dafc05 adds the comment while adding a flag to indicate >>> that kobject_set_name_vargs set the name. The commit af5ca3f4e >>> says kobject names must be dynamic but didn't update the >>> comment. >>> >>> Ok, so we can't save the strdup. > >> The tests have been running fine for more than 40 minutes, >> previously it took a few minutes to trigger the bug. > >> my quick fix: > >> --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ >> -8354,7 +8354,8 @@ static void __link_block_group(struct >> btrfs_space_info *space_info, int ret; > >> kobject_get(&space_info->kobj); /* put in release */ - kobj->name >> = get_raid_name(index); + kobj->name = >> kstrdup(get_raid_name(index), GFP_ATOMIC); + BUG_ON(!kobj->name); >> ret = kobject_init_and_add(kobj, &btrfs_raid_ktype, >> &space_info->kobj, NULL); if (ret) { diff --git >> a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index >> d742d7973d3e..ac26f91968a7 100644 --- a/fs/btrfs/sysfs.c +++ >> b/fs/btrfs/sysfs.c @@ -288,7 +288,7 @@ static struct attribute >> *raid_attributes[] = { > >> static void release_raid_kobj(struct kobject *kobj) { - >> kobj->name = NULL; + /* kobj->name = NULL; */ >> kobject_put(kobj->parent); } > > > I'm testing as well. I'm seeing: kobject (ffff88033f847e78): tried > to init an initialized object, something is seriously wrong. > > .. as well. A memset would clear this up, but perhaps that's not > the right solution after all. For background, the reason the kobjects are statically allocated is because they cover all of the block group caches for that raid type and it seemed silly to have a special structure that only contains a kobject and an int. So it uses its index in that array to determine which raid type it covers. Putting the kobject, list head, and type in a single structure ends up introducing more complexity for little gain since we end up with that lockdep alloc during reclaim problem again (or end up allocating every time and freeing if it's not needed.) In retrospect, we waste less memory allocating a special structure on-demand anyway and it eliminates the questions. - -Jeff - -- Jeff Mahoney SUSE Labs -----BEGIN PGP SIGNATURE----- Version: GnuPG/MacGPG2 v2.0.22 (Darwin) iQIcBAEBAgAGBQJTf47SAAoJEB57S2MheeWyJrAP/RLfyy90y+6pARBi1bseMW0d maQcNk6DkWYt4DYenMe+lvJJf+RPBotsFaDzq1Eblq8DDi1r7hb2wZvpS0AaCbCg hRLAWXpRi4cXLfjt5fQp+chymE6QUc3PYCFa9DvqcCjKBv3Wm2ArVAUbL4V1mucJ klLwI1O2Y8i1EbBifPIHzClK6xeiAYhe7yOs+CoH9Q2oZA+vuHtAuNyUTOMXS0sY 88/2o/cnqX762fxT6J8DPPMqKiu3YwoiNGOW2gEpHyujKmY66BnfHsrx07lR3atj eOD5VmUsHKYWvEWqbmGoCTHmT3wJbaWw1tGlcOIS2LVdbq538HUzsEk3ENhB9kfU ANz7b4X7iGSzAeLdkLzMKGt5ApAqqEOGO5BBtcQhmPaMtLXNXgEraSjEkd4dO1Xs lyte4PQ+FICI/hKwOVcZPt/64mC1ABfmdnAwBx4zSECsWv7HuXTtVAX5MhZX1vGQ l9sdkwq1nfGhcY4Zf/KIF8Q2ljRH/k1I9XCdFJpyUSnk5NBvV7qLBw+9C8rQzirs J5DTYWDAq1h1W0vn3JRzPi9AmO7DGPBeJIyGlR8JAxxjw29PX9j0mzB6T2Hc3zZe jzIzjBnBdJRc2fiFIJXkKc1QpdJ6J6DKQaMGv6Yz58TYu3/juAKRD2C9DuKvPWWb FvK4VPhhsCQhZzJvQsko =MwZr -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2014-05-23 18:09 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-05-22 0:12 [PATCH] Btrfs: don't remove raid type sysfs entries until unmount Chris Mason 2014-05-22 1:21 ` Jeff Mahoney 2014-05-22 12:19 ` Chris Mason 2014-05-22 15:05 ` Jeff Mahoney 2014-05-22 17:17 ` Chris Mason 2014-05-23 12:38 ` David Sterba 2014-05-23 12:56 ` Jeff Mahoney 2014-05-23 14:32 ` David Sterba 2014-05-23 14:34 ` Jeff Mahoney 2014-05-23 18:09 ` Jeff Mahoney
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).