All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/3] Btrfs: fix fsync not persisting changed attributes of a directory
@ 2019-05-15 15:02 fdmanana
  2019-05-16  7:03 ` Nikolay Borisov
  2019-05-16 14:48 ` [PATCH v2 " fdmanana
  0 siblings, 2 replies; 4+ messages in thread
From: fdmanana @ 2019-05-15 15:02 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

While logging an inode we follow its ancestors and for each one we mark
it as logged in the current transaction, even if we have not logged it.
As a consequence if we change an attribute of an ancestor, such as the
UID or GID for example, and then explicitly fsync it, we end up not
logging the inode at all despite returning success to user space, which
results in the attribute being lost if a power failure happens after
the fsync.

Sample reproducer:

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt

  $ mkdir /mnt/dir
  $ chown 6007:6007 /mnt/dir

  $ sync

  $ chown 9003:9003 /mnt/dir
  $ touch /mnt/dir/file
  $ xfs_io -c fsync /mnt/dir/file

  # fsync our directory after fsync'ing the new file, should persist the
  # new values for the uid and gid.
  $ xfs_io -c fsync /mnt/dir

  <power failure>

  $ mount /dev/sdb /mnt
  $ stat -c %u:%g /mnt/dir
  6007:6007

    --> should be 9003:9003, the uid and gid were not persisted, despite
        the explicit fsync on the directory prior to the power failure

Fix this by not updating the logged_trans field of ancestor inodes when
logging an inode, since we have not logged them. Let only future calls to
btrfs_log_inode() to mark inodes as logged.

This could be triggered by my recent fsync fuzz tester for fstests, for
which an fstests patch exists titled "fstests: generic, fsync fuzz tester
with fsstress".

Fixes: 12fcfd22fe5b ("Btrfs: tree logging unlink/rename fixes")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/tree-log.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 87e3e4e37606..7d13533a9620 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -5465,7 +5465,6 @@ static noinline int check_parent_dirs_for_sync(struct btrfs_trans_handle *trans,
 {
 	int ret = 0;
 	struct dentry *old_parent = NULL;
-	struct btrfs_inode *orig_inode = inode;
 
 	/*
 	 * for regular files, if its inode is already on disk, we don't
@@ -5485,16 +5484,6 @@ static noinline int check_parent_dirs_for_sync(struct btrfs_trans_handle *trans,
 	}
 
 	while (1) {
-		/*
-		 * If we are logging a directory then we start with our inode,
-		 * not our parent's inode, so we need to skip setting the
-		 * logged_trans so that further down in the log code we don't
-		 * think this inode has already been logged.
-		 */
-		if (inode != orig_inode)
-			inode->logged_trans = trans->transid;
-		smp_mb();
-
 		if (btrfs_must_commit_transaction(trans, inode)) {
 			ret = 1;
 			break;
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH 1/3] Btrfs: fix fsync not persisting changed attributes of a directory
  2019-05-15 15:02 [PATCH 1/3] Btrfs: fix fsync not persisting changed attributes of a directory fdmanana
@ 2019-05-16  7:03 ` Nikolay Borisov
  2019-05-16 14:48 ` [PATCH v2 " fdmanana
  1 sibling, 0 replies; 4+ messages in thread
From: Nikolay Borisov @ 2019-05-16  7:03 UTC (permalink / raw)
  To: fdmanana, linux-btrfs



On 15.05.19 г. 18:02 ч., fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> While logging an inode we follow its ancestors and for each one we mark
> it as logged in the current transaction, even if we have not logged it.
> As a consequence if we change an attribute of an ancestor, such as the
> UID or GID for example, and then explicitly fsync it, we end up not
> logging the inode at all despite returning success to user space, which
> results in the attribute being lost if a power failure happens after
> the fsync.
> 
> Sample reproducer:
> 
>   $ mkfs.btrfs -f /dev/sdb
>   $ mount /dev/sdb /mnt
> 
>   $ mkdir /mnt/dir
>   $ chown 6007:6007 /mnt/dir
> 
>   $ sync
> 
>   $ chown 9003:9003 /mnt/dir
>   $ touch /mnt/dir/file
>   $ xfs_io -c fsync /mnt/dir/file
> 
>   # fsync our directory after fsync'ing the new file, should persist the
>   # new values for the uid and gid.
>   $ xfs_io -c fsync /mnt/dir
> 
>   <power failure>
> 
>   $ mount /dev/sdb /mnt
>   $ stat -c %u:%g /mnt/dir
>   6007:6007
> 
>     --> should be 9003:9003, the uid and gid were not persisted, despite
>         the explicit fsync on the directory prior to the power failure
> 
> Fix this by not updating the logged_trans field of ancestor inodes when
> logging an inode, since we have not logged them. Let only future calls to
> btrfs_log_inode() to mark inodes as logged.
> 
> This could be triggered by my recent fsync fuzz tester for fstests, for
> which an fstests patch exists titled "fstests: generic, fsync fuzz tester
> with fsstress".
> 
> Fixes: 12fcfd22fe5b ("Btrfs: tree logging unlink/rename fixes")
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
> ---
>  fs/btrfs/tree-log.c | 11 -----------
>  1 file changed, 11 deletions(-)
> 
> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
> index 87e3e4e37606..7d13533a9620 100644
> --- a/fs/btrfs/tree-log.c
> +++ b/fs/btrfs/tree-log.c
> @@ -5465,7 +5465,6 @@ static noinline int check_parent_dirs_for_sync(struct btrfs_trans_handle *trans,
>  {
>  	int ret = 0;
>  	struct dentry *old_parent = NULL;
> -	struct btrfs_inode *orig_inode = inode;
>  
>  	/*
>  	 * for regular files, if its inode is already on disk, we don't
> @@ -5485,16 +5484,6 @@ static noinline int check_parent_dirs_for_sync(struct btrfs_trans_handle *trans,
>  	}
>  
>  	while (1) {
> -		/*
> -		 * If we are logging a directory then we start with our inode,
> -		 * not our parent's inode, so we need to skip setting the
> -		 * logged_trans so that further down in the log code we don't
> -		 * think this inode has already been logged.
> -		 */
> -		if (inode != orig_inode)
> -			inode->logged_trans = trans->transid;
> -		smp_mb();
> -

By removing this memory barrier don't you also obsolete the one in
btrfs_record_unlink_dir? Both of these were introduced in 12fcfd22fe5b
("Btrfs: tree logging unlink/rename fixes") and despite they are missing
explicit comments about the expected pairing one can only assume they
both pair against each other.


>  		if (btrfs_must_commit_transaction(trans, inode)) {
>  			ret = 1;
>  			break;
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v2 1/3] Btrfs: fix fsync not persisting changed attributes of a directory
  2019-05-15 15:02 [PATCH 1/3] Btrfs: fix fsync not persisting changed attributes of a directory fdmanana
  2019-05-16  7:03 ` Nikolay Borisov
@ 2019-05-16 14:48 ` fdmanana
  2019-05-28 17:02   ` David Sterba
  1 sibling, 1 reply; 4+ messages in thread
From: fdmanana @ 2019-05-16 14:48 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

While logging an inode we follow its ancestors and for each one we mark
it as logged in the current transaction, even if we have not logged it.
As a consequence if we change an attribute of an ancestor, such as the
UID or GID for example, and then explicitly fsync it, we end up not
logging the inode at all despite returning success to user space, which
results in the attribute being lost if a power failure happens after
the fsync.

Sample reproducer:

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt

  $ mkdir /mnt/dir
  $ chown 6007:6007 /mnt/dir

  $ sync

  $ chown 9003:9003 /mnt/dir
  $ touch /mnt/dir/file
  $ xfs_io -c fsync /mnt/dir/file

  # fsync our directory after fsync'ing the new file, should persist the
  # new values for the uid and gid.
  $ xfs_io -c fsync /mnt/dir

  <power failure>

  $ mount /dev/sdb /mnt
  $ stat -c %u:%g /mnt/dir
  6007:6007

    --> should be 9003:9003, the uid and gid were not persisted, despite
        the explicit fsync on the directory prior to the power failure

Fix this by not updating the logged_trans field of ancestor inodes when
logging an inode, since we have not logged them. Let only future calls to
btrfs_log_inode() to mark inodes as logged.

This could be triggered by my recent fsync fuzz tester for fstests, for
which an fstests patch exists titled "fstests: generic, fsync fuzz tester
with fsstress".

Fixes: 12fcfd22fe5b ("Btrfs: tree logging unlink/rename fixes")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---

V2: Removed no longer needed smp_mb() at btrfs_record_unlink_dir().

 fs/btrfs/tree-log.c | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 87e3e4e37606..64b74f0a48d8 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -5465,7 +5465,6 @@ static noinline int check_parent_dirs_for_sync(struct btrfs_trans_handle *trans,
 {
 	int ret = 0;
 	struct dentry *old_parent = NULL;
-	struct btrfs_inode *orig_inode = inode;
 
 	/*
 	 * for regular files, if its inode is already on disk, we don't
@@ -5485,16 +5484,6 @@ static noinline int check_parent_dirs_for_sync(struct btrfs_trans_handle *trans,
 	}
 
 	while (1) {
-		/*
-		 * If we are logging a directory then we start with our inode,
-		 * not our parent's inode, so we need to skip setting the
-		 * logged_trans so that further down in the log code we don't
-		 * think this inode has already been logged.
-		 */
-		if (inode != orig_inode)
-			inode->logged_trans = trans->transid;
-		smp_mb();
-
 		if (btrfs_must_commit_transaction(trans, inode)) {
 			ret = 1;
 			break;
@@ -6371,7 +6360,6 @@ void btrfs_record_unlink_dir(struct btrfs_trans_handle *trans,
 	 * if this directory was already logged any new
 	 * names for this file/dir will get recorded
 	 */
-	smp_mb();
 	if (dir->logged_trans == trans->transid)
 		return;
 
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v2 1/3] Btrfs: fix fsync not persisting changed attributes of a directory
  2019-05-16 14:48 ` [PATCH v2 " fdmanana
@ 2019-05-28 17:02   ` David Sterba
  0 siblings, 0 replies; 4+ messages in thread
From: David Sterba @ 2019-05-28 17:02 UTC (permalink / raw)
  To: fdmanana; +Cc: linux-btrfs

On Thu, May 16, 2019 at 03:48:55PM +0100, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> While logging an inode we follow its ancestors and for each one we mark
> it as logged in the current transaction, even if we have not logged it.
> As a consequence if we change an attribute of an ancestor, such as the
> UID or GID for example, and then explicitly fsync it, we end up not
> logging the inode at all despite returning success to user space, which
> results in the attribute being lost if a power failure happens after
> the fsync.
> 
> Sample reproducer:
> 
>   $ mkfs.btrfs -f /dev/sdb
>   $ mount /dev/sdb /mnt
> 
>   $ mkdir /mnt/dir
>   $ chown 6007:6007 /mnt/dir
> 
>   $ sync
> 
>   $ chown 9003:9003 /mnt/dir
>   $ touch /mnt/dir/file
>   $ xfs_io -c fsync /mnt/dir/file
> 
>   # fsync our directory after fsync'ing the new file, should persist the
>   # new values for the uid and gid.
>   $ xfs_io -c fsync /mnt/dir
> 
>   <power failure>
> 
>   $ mount /dev/sdb /mnt
>   $ stat -c %u:%g /mnt/dir
>   6007:6007
> 
>     --> should be 9003:9003, the uid and gid were not persisted, despite
>         the explicit fsync on the directory prior to the power failure
> 
> Fix this by not updating the logged_trans field of ancestor inodes when
> logging an inode, since we have not logged them. Let only future calls to
> btrfs_log_inode() to mark inodes as logged.
> 
> This could be triggered by my recent fsync fuzz tester for fstests, for
> which an fstests patch exists titled "fstests: generic, fsync fuzz tester
> with fsstress".
> 
> Fixes: 12fcfd22fe5b ("Btrfs: tree logging unlink/rename fixes")
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
> ---

Added to 5.2-rc queue, thanks.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-05-28 17:01 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-15 15:02 [PATCH 1/3] Btrfs: fix fsync not persisting changed attributes of a directory fdmanana
2019-05-16  7:03 ` Nikolay Borisov
2019-05-16 14:48 ` [PATCH v2 " fdmanana
2019-05-28 17:02   ` David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.