From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from cn.fujitsu.com ([59.151.112.132]:20304 "EHLO heian.cn.fujitsu.com" rhost-flags-OK-FAIL-OK-FAIL) by vger.kernel.org with ESMTP id S1754725AbbA3DaY convert rfc822-to-8bit (ORCPT ); Thu, 29 Jan 2015 22:30:24 -0500 Message-ID: <54CAFAC6.9030705@cn.fujitsu.com> Date: Fri, 30 Jan 2015 11:30:14 +0800 From: Qu Wenruo MIME-Version: 1.0 To: Miao Xie , CC: linux-fsdevel Subject: Re: [PATCH RESEND v4 6/8] vfs: Add get_vfsmount_sb() function to get vfsmount from a given sb. References: <1422498281-20493-1-git-send-email-quwenruo@cn.fujitsu.com> <1422498281-20493-7-git-send-email-quwenruo@cn.fujitsu.com> <54CAD5DC.2060603@huawei.com> <54CAE1E3.1040406@cn.fujitsu.com> <54CAE632.1020908@cn.fujitsu.com> <54CAF8E6.8030100@huawei.com> In-Reply-To: <54CAF8E6.8030100@huawei.com> Content-Type: text/plain; charset="utf-8"; format=flowed Sender: linux-btrfs-owner@vger.kernel.org List-ID: -------- Original Message -------- Subject: Re: [PATCH RESEND v4 6/8] vfs: Add get_vfsmount_sb() function to get vfsmount from a given sb. From: Miao Xie To: Qu Wenruo , Date: 2015年01月30日 11:22 > On Fri, 30 Jan 2015 10:02:26 +0800, Qu Wenruo wrote: >> -------- Original Message -------- >> Subject: Re: [PATCH RESEND v4 6/8] vfs: Add get_vfsmount_sb() function to get >> vfsmount from a given sb. >> From: Qu Wenruo >> To: Miao Xie , linux-btrfs@vger.kernel.org >> Date: 2015年01月30日 09:44 >>> -------- Original Message -------- >>> Subject: Re: [PATCH RESEND v4 6/8] vfs: Add get_vfsmount_sb() function to get >>> vfsmount from a given sb. >>> From: Miao Xie >>> To: Qu Wenruo , >>> Date: 2015年01月30日 08:52 >>>> On Thu, 29 Jan 2015 10:24:39 +0800, Qu Wenruo wrote: >>>>> There are sysfs interfaces in some fs, only btrfs yet, which will modify >>>>> on-disk data. >>>>> Unlike normal file operation routine we can use mnt_want_write_file() to >>>>> protect the operation, change through sysfs won't to be binded to any file >>>>> in the filesystem. >>>>> So we can only extract the first vfsmount of a superblock and pass it to >>>>> mnt_want_write() to do the protection. >>>> This method is wrong, becasue one fs may be mounted on the multi places >>>> at the same time, someone is R/O, someone is R/W, you may get a R/O and >>>> fail to get the write permission. >>> This shouldn't happen. If someone is ro, the whole fs should be ro, right? >>> You can mount a device which is already mounted as rw to other point as ro, >>> and remount a mount point to ro will also cause all other mount point to ro. >>> >>> So I didn't see the problem here. >>>> I think you do label/feature change by sysfs interface by the following way >>>> >>>> btrfs_sysfs_change_XXXX() >>>> { >>>> /* Use trylock to avoid the race with umount */ >>>> if(!mutex_trylock(&sb->s_umount)) >>>> return -EBUSY; >>>> >>>> check R/O and FREEZE >>>> >>>> mutex_unlock(&sb->s_umount); >>>> } >>> This looks better since it not introduce changes to VFS. >>> >>> Thanks, >>> Qu >> Oh, wait a second, this one leads to the old problem and old solution. >> >> If we hold s_umount mutex, we must do freeze check and can't start transaction >> since it will deadlock. >> >> And for freeze check, we must use sb_try_start_intwrite() to hold the freeze >> lock and then add a new >> btrfs_start_transaction_freeze() which will not call sb_start_write()... >> >> Oh this seems so similar, v2 or v3 version RFC patch? >> So still goes to the old method? > No. Just check R/O and RREEZE, if failed, go out. if the check pass, > we start_transaction. Because we do it in s_umount lock, no one can > change fs to R/O or FREEZE. > > Maybe the above description is not so clear, explain it again. > > btrfs_sysfs_change_XXXX() > { > /* Use trylock to avoid the race with umount */ > if(!mutex_trylock(&sb->s_umount)) > return -EBUSY; > > if (fs is R/O or FREEZED) { > mutex_unlock(&sb->s_umount); > return -EACCES; > } > > btrfs_start_transaction() > change label/feature > btrfs_commit_transaction() > > mutex_unlock(&sb->s_umount); > } I prefer the sb_want_write() method, since it doesn't even need to hold the s_umount mutex. Thanks, Qu > Thanks > Miao > >> Thanks, >> Qu >>>> Thanks >>>> Miao >>>> >>>>> Cc: linux-fsdevel >>>>> Signed-off-by: Qu Wenruo >>>>> --- >>>>> fs/namespace.c | 25 +++++++++++++++++++++++++ >>>>> include/linux/mount.h | 1 + >>>>> 2 files changed, 26 insertions(+) >>>>> >>>>> diff --git a/fs/namespace.c b/fs/namespace.c >>>>> index cd1e968..5a16a62 100644 >>>>> --- a/fs/namespace.c >>>>> +++ b/fs/namespace.c >>>>> @@ -1105,6 +1105,31 @@ struct vfsmount *mntget(struct vfsmount *mnt) >>>>> } >>>>> EXPORT_SYMBOL(mntget); >>>>> +/* >>>>> + * get a vfsmount from a given sb >>>>> + * >>>>> + * This is especially used for case where change fs' sysfs interface >>>>> + * will lead to a write, e.g. Change label through sysfs in btrfs. >>>>> + * So vfs can get a vfsmount and then use mnt_want_write() to protect. >>>>> + */ >>>>> +struct vfsmount *get_vfsmount_sb(struct super_block *sb) >>>>> +{ >>>>> + struct vfsmount *ret_vfs = NULL; >>>>> + struct mount *mnt; >>>>> + int ret = 0; >>>>> + >>>>> + lock_mount_hash(); >>>>> + if (list_empty(&sb->s_mounts)) >>>>> + goto out; >>>>> + mnt = list_entry(sb->s_mounts.next, struct mount, mnt_instance); >>>>> + ret_vfs = &mnt->mnt; >>>>> + ret_vfs = mntget(ret_vfs); >>>>> +out: >>>>> + unlock_mount_hash(); >>>>> + return ret_vfs; >>>>> +} >>>>> +EXPORT_SYMBOL(get_vfsmount_sb); >>>>> + >>>>> struct vfsmount *mnt_clone_internal(struct path *path) >>>>> { >>>>> struct mount *p; >>>>> diff --git a/include/linux/mount.h b/include/linux/mount.h >>>>> index c2c561d..cf1b0f5 100644 >>>>> --- a/include/linux/mount.h >>>>> +++ b/include/linux/mount.h >>>>> @@ -79,6 +79,7 @@ extern void mnt_drop_write_file(struct file *file); >>>>> extern void mntput(struct vfsmount *mnt); >>>>> extern struct vfsmount *mntget(struct vfsmount *mnt); >>>>> extern struct vfsmount *mnt_clone_internal(struct path *path); >>>>> +extern struct vfsmount *get_vfsmount_sb(struct super_block *sb); >>>>> extern int __mnt_is_readonly(struct vfsmount *mnt); >>>>> struct path; >>>>> >> . >> From mboxrd@z Thu Jan 1 00:00:00 1970 From: Qu Wenruo Subject: Re: [PATCH RESEND v4 6/8] vfs: Add get_vfsmount_sb() function to get vfsmount from a given sb. Date: Fri, 30 Jan 2015 11:30:14 +0800 Message-ID: <54CAFAC6.9030705@cn.fujitsu.com> References: <1422498281-20493-1-git-send-email-quwenruo@cn.fujitsu.com> <1422498281-20493-7-git-send-email-quwenruo@cn.fujitsu.com> <54CAD5DC.2060603@huawei.com> <54CAE1E3.1040406@cn.fujitsu.com> <54CAE632.1020908@cn.fujitsu.com> <54CAF8E6.8030100@huawei.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: linux-fsdevel To: Miao Xie , Return-path: Received: from cn.fujitsu.com ([59.151.112.132]:20304 "EHLO heian.cn.fujitsu.com" rhost-flags-OK-FAIL-OK-FAIL) by vger.kernel.org with ESMTP id S1754725AbbA3DaY convert rfc822-to-8bit (ORCPT ); Thu, 29 Jan 2015 22:30:24 -0500 In-Reply-To: <54CAF8E6.8030100@huawei.com> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: -------- Original Message -------- Subject: Re: [PATCH RESEND v4 6/8] vfs: Add get_vfsmount_sb() function=20 to get vfsmount from a given sb. =46rom: Miao Xie To: Qu Wenruo , Date: 2015=E5=B9=B401=E6=9C=8830=E6=97=A5 11:22 > On Fri, 30 Jan 2015 10:02:26 +0800, Qu Wenruo wrote: >> -------- Original Message -------- >> Subject: Re: [PATCH RESEND v4 6/8] vfs: Add get_vfsmount_sb() functi= on to get >> vfsmount from a given sb. >> From: Qu Wenruo >> To: Miao Xie , linux-btrfs@vger.kernel.org >> Date: 2015=E5=B9=B401=E6=9C=8830=E6=97=A5 09:44 >>> -------- Original Message -------- >>> Subject: Re: [PATCH RESEND v4 6/8] vfs: Add get_vfsmount_sb() funct= ion to get >>> vfsmount from a given sb. >>> From: Miao Xie >>> To: Qu Wenruo , >>> Date: 2015=E5=B9=B401=E6=9C=8830=E6=97=A5 08:52 >>>> On Thu, 29 Jan 2015 10:24:39 +0800, Qu Wenruo wrote: >>>>> There are sysfs interfaces in some fs, only btrfs yet, which will= modify >>>>> on-disk data. >>>>> Unlike normal file operation routine we can use mnt_want_write_fi= le() to >>>>> protect the operation, change through sysfs won't to be binded to= any file >>>>> in the filesystem. >>>>> So we can only extract the first vfsmount of a superblock and pas= s it to >>>>> mnt_want_write() to do the protection. >>>> This method is wrong, becasue one fs may be mounted on the multi = places >>>> at the same time, someone is R/O, someone is R/W, you may get a R/= O and >>>> fail to get the write permission. >>> This shouldn't happen. If someone is ro, the whole fs should be ro,= right? >>> You can mount a device which is already mounted as rw to other poin= t as ro, >>> and remount a mount point to ro will also cause all other mount poi= nt to ro. >>> >>> So I didn't see the problem here. >>>> I think you do label/feature change by sysfs interface by the foll= owing way >>>> >>>> btrfs_sysfs_change_XXXX() >>>> { >>>> /* Use trylock to avoid the race with umount */ >>>> if(!mutex_trylock(&sb->s_umount)) >>>> return -EBUSY; >>>> >>>> check R/O and FREEZE >>>> >>>> mutex_unlock(&sb->s_umount); >>>> } >>> This looks better since it not introduce changes to VFS. >>> >>> Thanks, >>> Qu >> Oh, wait a second, this one leads to the old problem and old solutio= n. >> >> If we hold s_umount mutex, we must do freeze check and can't start t= ransaction >> since it will deadlock. >> >> And for freeze check, we must use sb_try_start_intwrite() to hold th= e freeze >> lock and then add a new >> btrfs_start_transaction_freeze() which will not call sb_start_write(= )... >> >> Oh this seems so similar, v2 or v3 version RFC patch? >> So still goes to the old method? > No. Just check R/O and RREEZE, if failed, go out. if the check pass, > we start_transaction. Because we do it in s_umount lock, no one can > change fs to R/O or FREEZE. > > Maybe the above description is not so clear, explain it again. > > btrfs_sysfs_change_XXXX() > { > /* Use trylock to avoid the race with umount */ > if(!mutex_trylock(&sb->s_umount)) > return -EBUSY; > > if (fs is R/O or FREEZED) { > mutex_unlock(&sb->s_umount); > return -EACCES; > } > > btrfs_start_transaction() > change label/feature > btrfs_commit_transaction() > > mutex_unlock(&sb->s_umount); > } I prefer the sb_want_write() method, since it doesn't even need to hold= =20 the s_umount mutex. Thanks, Qu > Thanks > Miao > >> Thanks, >> Qu >>>> Thanks >>>> Miao >>>> >>>>> Cc: linux-fsdevel >>>>> Signed-off-by: Qu Wenruo >>>>> --- >>>>> fs/namespace.c | 25 +++++++++++++++++++++++++ >>>>> include/linux/mount.h | 1 + >>>>> 2 files changed, 26 insertions(+) >>>>> >>>>> diff --git a/fs/namespace.c b/fs/namespace.c >>>>> index cd1e968..5a16a62 100644 >>>>> --- a/fs/namespace.c >>>>> +++ b/fs/namespace.c >>>>> @@ -1105,6 +1105,31 @@ struct vfsmount *mntget(struct vfsmount *m= nt) >>>>> } >>>>> EXPORT_SYMBOL(mntget); >>>>> +/* >>>>> + * get a vfsmount from a given sb >>>>> + * >>>>> + * This is especially used for case where change fs' sysfs inter= face >>>>> + * will lead to a write, e.g. Change label through sysfs in btrf= s. >>>>> + * So vfs can get a vfsmount and then use mnt_want_write() to pr= otect. >>>>> + */ >>>>> +struct vfsmount *get_vfsmount_sb(struct super_block *sb) >>>>> +{ >>>>> + struct vfsmount *ret_vfs =3D NULL; >>>>> + struct mount *mnt; >>>>> + int ret =3D 0; >>>>> + >>>>> + lock_mount_hash(); >>>>> + if (list_empty(&sb->s_mounts)) >>>>> + goto out; >>>>> + mnt =3D list_entry(sb->s_mounts.next, struct mount, mnt_inst= ance); >>>>> + ret_vfs =3D &mnt->mnt; >>>>> + ret_vfs =3D mntget(ret_vfs); >>>>> +out: >>>>> + unlock_mount_hash(); >>>>> + return ret_vfs; >>>>> +} >>>>> +EXPORT_SYMBOL(get_vfsmount_sb); >>>>> + >>>>> struct vfsmount *mnt_clone_internal(struct path *path) >>>>> { >>>>> struct mount *p; >>>>> diff --git a/include/linux/mount.h b/include/linux/mount.h >>>>> index c2c561d..cf1b0f5 100644 >>>>> --- a/include/linux/mount.h >>>>> +++ b/include/linux/mount.h >>>>> @@ -79,6 +79,7 @@ extern void mnt_drop_write_file(struct file *fi= le); >>>>> extern void mntput(struct vfsmount *mnt); >>>>> extern struct vfsmount *mntget(struct vfsmount *mnt); >>>>> extern struct vfsmount *mnt_clone_internal(struct path *path); >>>>> +extern struct vfsmount *get_vfsmount_sb(struct super_block *sb); >>>>> extern int __mnt_is_readonly(struct vfsmount *mnt); >>>>> struct path; >>>>> >> . >> -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel= " in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html