linux-btrfs.vger.kernel.org archive mirror
* [PATCH] Btrfs: do not allow trimming when a fs is mounted with the nologreplay option
@ 2019-03-26 10:49 fdmanana
  2019-03-26 12:17 ` Nikolay Borisov
  0 siblings, 1 reply; 7+ messages in thread
From: fdmanana @ 2019-03-26 10:49 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

When a filesystem is mounted with the nologreplay mount option, which
requires it to be mounted in RO mode as well, we can not allow discard on
free space inside block groups, because log trees refer to extents that
are not pinned in a block group's free space cache (pinning the extents is
precisely the first phase of replaying a log tree).

So do not allow the fitrim ioctl to do anything when the filesystem is
mounted with the nologreplay option, because later it can be mounted RW
without that option, which causes log replay to happen and result in
either a failure to replay the log trees (leading to a mount failure), a
crash or some silent corruption.

Reported-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/ioctl.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 494f0f10d70e..01808934d21f 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -501,6 +501,16 @@ static noinline int btrfs_ioctl_fitrim(struct file *file, void __user *arg)
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 
+	/*
+	 * If the fs is mounted with nologreplay, which requires it to be
+	 * mounted in RO mode as well, we can not allow discard on free space
+	 * inside block groups, because log trees refer to extents that are not
+	 * pinned in a block group's free space cache (pinning the extents is
+	 * precisely the first phase of replaying a log tree).
+	 */
+	if (btrfs_test_opt(fs_info, NOLOGREPLAY))
+		return -EROFS;
+
 	rcu_read_lock();
 	list_for_each_entry_rcu(device, &fs_info->fs_devices->devices,
 				dev_list) {
-- 
2.11.0



* Re: [PATCH] Btrfs: do not allow trimming when a fs is mounted with the nologreplay option
  2019-03-26 10:49 [PATCH] Btrfs: do not allow trimming when a fs is mounted with the nologreplay option fdmanana
@ 2019-03-26 12:17 ` Nikolay Borisov
  2019-03-26 12:35   ` Filipe Manana
  2019-03-26 13:40   ` Qu Wenruo
  0 siblings, 2 replies; 7+ messages in thread
From: Nikolay Borisov @ 2019-03-26 12:17 UTC (permalink / raw)
  To: fdmanana, linux-btrfs



On 26.03.19 г. 12:49 ч., fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> When a filesystem is mounted with the nologreplay mount option, which
> requires it to be mounted in RO mode as well, we can not allow discard on
> free space inside block groups, because log trees refer to extents that
> are not pinned in a block group's free space cache (pinning the extents is
> precisely the first phase of replaying a log tree).
> 
> So do not allow the fitrim ioctl to do anything when the filesystem is
> mounted with the nologreplay option, because later it can be mounted RW
> without that option, which causes log replay to happen and result in
> either a failure to replay the log trees (leading to a mount failure), a
> crash or some silent corruption.
> 
> Reported-by: Darrick J. Wong <darrick.wong@oracle.com>
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Does it make sense to make the check a bit more specific and only return
EROFS when NOLOGREPLAY and the log tree has non-null generation?

In any case:

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

> ---
>  fs/btrfs/ioctl.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 494f0f10d70e..01808934d21f 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -501,6 +501,16 @@ static noinline int btrfs_ioctl_fitrim(struct file *file, void __user *arg)
>  	if (!capable(CAP_SYS_ADMIN))
>  		return -EPERM;
>  
> +	/*
> +	 * If the fs is mounted with nologreplay, which requires it to be
> +	 * mounted in RO mode as well, we can not allow discard on free space
> +	 * inside block groups, because log trees refer to extents that are not
> +	 * pinned in a block group's free space cache (pinning the extents is
> +	 * precisely the first phase of replaying a log tree).
> +	 */
> +	if (btrfs_test_opt(fs_info, NOLOGREPLAY))
> +		return -EROFS;
> +
>  	rcu_read_lock();
>  	list_for_each_entry_rcu(device, &fs_info->fs_devices->devices,
>  				dev_list) {
> 


* Re: [PATCH] Btrfs: do not allow trimming when a fs is mounted with the nologreplay option
  2019-03-26 12:17 ` Nikolay Borisov
@ 2019-03-26 12:35   ` Filipe Manana
  2019-03-26 12:39     ` Nikolay Borisov
  2019-03-26 13:40   ` Qu Wenruo
  1 sibling, 1 reply; 7+ messages in thread
From: Filipe Manana @ 2019-03-26 12:35 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: linux-btrfs, Darrick J. Wong

On Tue, Mar 26, 2019 at 12:17 PM Nikolay Borisov <nborisov@suse.com> wrote:
>
>
>
> On 26.03.19 г. 12:49 ч., fdmanana@kernel.org wrote:
> > From: Filipe Manana <fdmanana@suse.com>
> >
> > When a filesystem is mounted with the nologreplay mount option, which
> > requires it to be mounted in RO mode as well, we can not allow discard on
> > free space inside block groups, because log trees refer to extents that
> > are not pinned in a block group's free space cache (pinning the extents is
> > precisely the first phase of replaying a log tree).
> >
> > So do not allow the fitrim ioctl to do anything when the filesystem is
> > mounted with the nologreplay option, because later it can be mounted RW
> > without that option, which causes log replay to happen and result in
> > either a failure to replay the log trees (leading to a mount failure), a
> > crash or some silent corruption.
> >
> > Reported-by: Darrick J. Wong <darrick.wong@oracle.com>
> > Signed-off-by: Filipe Manana <fdmanana@suse.com>
>
> Does it make sense to make the check a bit more specific and only return
> EROFS when NOLOGREPLAY and the log tree has non-null generation?

It would make sense to check whether there's actually a log tree as well.
Neither xfs nor ext4 (which is already in Linus' tree) does an
equivalent check, nor does the proposed fstests test case make sure a
journal/log exists.

Not against it, but this isn't a common use case either.

>
> In any case:
>
> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
>
> > ---
> >  fs/btrfs/ioctl.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> > index 494f0f10d70e..01808934d21f 100644
> > --- a/fs/btrfs/ioctl.c
> > +++ b/fs/btrfs/ioctl.c
> > @@ -501,6 +501,16 @@ static noinline int btrfs_ioctl_fitrim(struct file *file, void __user *arg)
> >       if (!capable(CAP_SYS_ADMIN))
> >               return -EPERM;
> >
> > +     /*
> > +      * If the fs is mounted with nologreplay, which requires it to be
> > +      * mounted in RO mode as well, we can not allow discard on free space
> > +      * inside block groups, because log trees refer to extents that are not
> > +      * pinned in a block group's free space cache (pinning the extents is
> > +      * precisely the first phase of replaying a log tree).
> > +      */
> > +     if (btrfs_test_opt(fs_info, NOLOGREPLAY))
> > +             return -EROFS;
> > +
> >       rcu_read_lock();
> >       list_for_each_entry_rcu(device, &fs_info->fs_devices->devices,
> >                               dev_list) {
> >


* Re: [PATCH] Btrfs: do not allow trimming when a fs is mounted with the nologreplay option
  2019-03-26 12:35   ` Filipe Manana
@ 2019-03-26 12:39     ` Nikolay Borisov
  2019-03-28 15:54       ` David Sterba
  0 siblings, 1 reply; 7+ messages in thread
From: Nikolay Borisov @ 2019-03-26 12:39 UTC (permalink / raw)
  To: Filipe Manana; +Cc: linux-btrfs, Darrick J. Wong



On 26.03.19 г. 14:35 ч., Filipe Manana wrote:
> On Tue, Mar 26, 2019 at 12:17 PM Nikolay Borisov <nborisov@suse.com> wrote:
>>
>>
>>
>> On 26.03.19 г. 12:49 ч., fdmanana@kernel.org wrote:
>>> From: Filipe Manana <fdmanana@suse.com>
>>>
>>> When a filesystem is mounted with the nologreplay mount option, which
>>> requires it to be mounted in RO mode as well, we can not allow discard on
>>> free space inside block groups, because log trees refer to extents that
>>> are not pinned in a block group's free space cache (pinning the extents is
>>> precisely the first phase of replaying a log tree).
>>>
>>> So do not allow the fitrim ioctl to do anything when the filesystem is
>>> mounted with the nologreplay option, because later it can be mounted RW
>>> without that option, which causes log replay to happen and result in
>>> either a failure to replay the log trees (leading to a mount failure), a
>>> crash or some silent corruption.
>>>
>>> Reported-by: Darrick J. Wong <darrick.wong@oracle.com>
>>> Signed-off-by: Filipe Manana <fdmanana@suse.com>
>>
>> Does it make sense to make the check a bit more specific and only return
>> EROFS when NOLOGREPLAY and the log tree has non-null generation?
> 
> It would make sense checking if there's actually a log tree as well.
> Neither the xfs nor ext4 (which is already in Linus' tree) do such
> equivalent checks, nor the proposed fstests test case makes sure a
> journal/log exists.
> 
> Not against it, but this isn't a common use case either.

I think of this as a sort of "optimisation" where if we don't have a log
tree then we can allow trim. Though this is much simpler, so I'm fine with
it as well.
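
To make the two variants concrete, here is a minimal userspace sketch
of the decision being discussed. It is only a model, not kernel code:
struct fs_state and both function names are hypothetical, and the
log_root field merely stands in for "a log tree exists" (0 meaning no
log tree).

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the mount state the check looks at. */
struct fs_state {
	bool nologreplay;	/* mounted with -o nologreplay */
	uint64_t log_root;	/* assumed: 0 means there is no log tree */
};

/* The check as posted: refuse trim whenever nologreplay is set. */
static int fitrim_check_simple(const struct fs_state *fs)
{
	return fs->nologreplay ? -EROFS : 0;
}

/* The narrower variant: refuse only if a log tree is actually present. */
static int fitrim_check_specific(const struct fs_state *fs)
{
	if (fs->nologreplay && fs->log_root != 0)
		return -EROFS;
	return 0;
}
```

The two only differ for a nologreplay mount without a log tree, where
the narrower variant would let the trim proceed.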


> 
>>
>> In any case:
>>
>> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
>>
>>> ---
>>>  fs/btrfs/ioctl.c | 10 ++++++++++
>>>  1 file changed, 10 insertions(+)
>>>
>>> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
>>> index 494f0f10d70e..01808934d21f 100644
>>> --- a/fs/btrfs/ioctl.c
>>> +++ b/fs/btrfs/ioctl.c
>>> @@ -501,6 +501,16 @@ static noinline int btrfs_ioctl_fitrim(struct file *file, void __user *arg)
>>>       if (!capable(CAP_SYS_ADMIN))
>>>               return -EPERM;
>>>
>>> +     /*
>>> +      * If the fs is mounted with nologreplay, which requires it to be
>>> +      * mounted in RO mode as well, we can not allow discard on free space
>>> +      * inside block groups, because log trees refer to extents that are not
>>> +      * pinned in a block group's free space cache (pinning the extents is
>>> +      * precisely the first phase of replaying a log tree).
>>> +      */
>>> +     if (btrfs_test_opt(fs_info, NOLOGREPLAY))
>>> +             return -EROFS;
>>> +
>>>       rcu_read_lock();
>>>       list_for_each_entry_rcu(device, &fs_info->fs_devices->devices,
>>>                               dev_list) {
>>>
> 


* Re: [PATCH] Btrfs: do not allow trimming when a fs is mounted with the nologreplay option
  2019-03-26 12:17 ` Nikolay Borisov
  2019-03-26 12:35   ` Filipe Manana
@ 2019-03-26 13:40   ` Qu Wenruo
  2019-03-26 13:48     ` David Sterba
  1 sibling, 1 reply; 7+ messages in thread
From: Qu Wenruo @ 2019-03-26 13:40 UTC (permalink / raw)
  To: Nikolay Borisov, fdmanana, linux-btrfs



On 2019/3/26 下午8:17, Nikolay Borisov wrote:
>
>
> On 26.03.19 г. 12:49 ч., fdmanana@kernel.org wrote:
>> From: Filipe Manana <fdmanana@suse.com>
>>
>> When a filesystem is mounted with the nologreplay mount option, which
>> requires it to be mounted in RO mode as well, we can not allow discard on
>> free space inside block groups, because log trees refer to extents that
>> are not pinned in a block group's free space cache (pinning the extents is
>> precisely the first phase of replaying a log tree).
>>
>> So do not allow the fitrim ioctl to do anything when the filesystem is
>> mounted with the nologreplay option, because later it can be mounted RW
>> without that option, which causes log replay to happen and result in
>> either a failure to replay the log trees (leading to a mount failure), a
>> crash or some silent corruption.
>>
>> Reported-by: Darrick J. Wong <darrick.wong@oracle.com>
>> Signed-off-by: Filipe Manana <fdmanana@suse.com>
>
> Does it make sense to make the check a bit more specific and only return
> EROFS when NOLOGREPLAY and the log tree has non-null generation?

To me fstrim is a WRITE operation, so why is it allowed even on an RO mount?

Thanks,
Qu

>
> In any case:
>
> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
>
>> ---
>>  fs/btrfs/ioctl.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
>> index 494f0f10d70e..01808934d21f 100644
>> --- a/fs/btrfs/ioctl.c
>> +++ b/fs/btrfs/ioctl.c
>> @@ -501,6 +501,16 @@ static noinline int btrfs_ioctl_fitrim(struct file *file, void __user *arg)
>>  	if (!capable(CAP_SYS_ADMIN))
>>  		return -EPERM;
>>
>> +	/*
>> +	 * If the fs is mounted with nologreplay, which requires it to be
>> +	 * mounted in RO mode as well, we can not allow discard on free space
>> +	 * inside block groups, because log trees refer to extents that are not
>> +	 * pinned in a block group's free space cache (pinning the extents is
>> +	 * precisely the first phase of replaying a log tree).
>> +	 */
>> +	if (btrfs_test_opt(fs_info, NOLOGREPLAY))
>> +		return -EROFS;
>> +
>>  	rcu_read_lock();
>>  	list_for_each_entry_rcu(device, &fs_info->fs_devices->devices,
>>  				dev_list) {
>>


* Re: [PATCH] Btrfs: do not allow trimming when a fs is mounted with the nologreplay option
  2019-03-26 13:40   ` Qu Wenruo
@ 2019-03-26 13:48     ` David Sterba
  0 siblings, 0 replies; 7+ messages in thread
From: David Sterba @ 2019-03-26 13:48 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Nikolay Borisov, fdmanana, linux-btrfs

On Tue, Mar 26, 2019 at 09:40:08PM +0800, Qu Wenruo wrote:
> 
> 
> On 2019/3/26 下午8:17, Nikolay Borisov wrote:
> >
> >
> > On 26.03.19 г. 12:49 ч., fdmanana@kernel.org wrote:
> >> From: Filipe Manana <fdmanana@suse.com>
> >>
> >> When a filesystem is mounted with the nologreplay mount option, which
> >> requires it to be mounted in RO mode as well, we can not allow discard on
> >> free space inside block groups, because log trees refer to extents that
> >> are not pinned in a block group's free space cache (pinning the extents is
> >> precisely the first phase of replaying a log tree).
> >>
> >> So do not allow the fitrim ioctl to do anything when the filesystem is
> >> mounted with the nologreplay option, because later it can be mounted RW
> >> without that option, which causes log replay to happen and result in
> >> either a failure to replay the log trees (leading to a mount failure), a
> >> crash or some silent corruption.
> >>
> >> Reported-by: Darrick J. Wong <darrick.wong@oracle.com>
> >> Signed-off-by: Filipe Manana <fdmanana@suse.com>
> >
> > Does it make sense to make the check a bit more specific and only return
> > EROFS when NOLOGREPLAY and the log tree has non-null generation?
> 
> To me fstrim is a WRITE operation, why it is allowed even in RO mount?

It's a write to the block device, not to the filesystem.
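
For reference, a minimal userspace sketch of how a trim request reaches
the filesystem through the FITRIM ioctl. The helper name trim_whole_fs
is hypothetical and not part of the patch; FITRIM and struct
fstrim_range come from <linux/fs.h>. Note that the mount point is
opened read-only. With the patch applied, a btrfs filesystem mounted
with nologreplay would be expected to fail this call with EROFS.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>	/* FITRIM, struct fstrim_range */

/*
 * Hypothetical helper: ask the filesystem mounted at 'mnt' to discard
 * all of its free space. Returns 0 and stores the number of bytes
 * trimmed in *trimmed, or a negative errno value on failure.
 */
static int trim_whole_fs(const char *mnt, uint64_t *trimmed)
{
	struct fstrim_range range = {
		.start = 0,
		.len = UINT64_MAX,	/* cover the whole filesystem */
		.minlen = 0,
	};
	int fd, ret = 0;

	fd = open(mnt, O_RDONLY);
	if (fd < 0)
		return -errno;

	if (ioctl(fd, FITRIM, &range) < 0)
		ret = -errno;		/* e.g. -EROFS under nologreplay */
	else
		*trimmed = range.len;	/* kernel reports bytes trimmed */

	close(fd);
	return ret;
}
```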


* Re: [PATCH] Btrfs: do not allow trimming when a fs is mounted with the nologreplay option
  2019-03-26 12:39     ` Nikolay Borisov
@ 2019-03-28 15:54       ` David Sterba
  0 siblings, 0 replies; 7+ messages in thread
From: David Sterba @ 2019-03-28 15:54 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: Filipe Manana, linux-btrfs, Darrick J. Wong

On Tue, Mar 26, 2019 at 02:39:45PM +0200, Nikolay Borisov wrote:
> On 26.03.19 г. 14:35 ч., Filipe Manana wrote:
> > On Tue, Mar 26, 2019 at 12:17 PM Nikolay Borisov <nborisov@suse.com> wrote:
> >> On 26.03.19 г. 12:49 ч., fdmanana@kernel.org wrote:
> >>> From: Filipe Manana <fdmanana@suse.com>
> >>>
> >>> When a filesystem is mounted with the nologreplay mount option, which
> >>> requires it to be mounted in RO mode as well, we can not allow discard on
> >>> free space inside block groups, because log trees refer to extents that
> >>> are not pinned in a block group's free space cache (pinning the extents is
> >>> precisely the first phase of replaying a log tree).
> >>>
> >>> So do not allow the fitrim ioctl to do anything when the filesystem is
> >>> mounted with the nologreplay option, because later it can be mounted RW
> >>> without that option, which causes log replay to happen and result in
> >>> either a failure to replay the log trees (leading to a mount failure), a
> >>> crash or some silent corruption.
> >>>
> >>> Reported-by: Darrick J. Wong <darrick.wong@oracle.com>
> >>> Signed-off-by: Filipe Manana <fdmanana@suse.com>
> >>
> >> Does it make sense to make the check a bit more specific and only return
> >> EROFS when NOLOGREPLAY and the log tree has non-null generation?
> > 
> > It would make sense checking if there's actually a log tree as well.
> > Neither the xfs nor ext4 (which is already in Linus' tree) do such
> > equivalent checks, nor the proposed fstests test case makes sure a
> > journal/log exists.
> > 
> > Not against it, but this isn't a common use case either.
> 
> I think of this as a sort of "optimisation" where if we don't have a tree
> then we can allow trim. Though this is much simpler so I'm fine with it
> as well.

Agreed, the simple solution sounds ok to me, trim is not a critical
operation so we don't need to try harder to make it work even with the
mount option.



