All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] btrfs: fsync changes, a bug fix and a couple improvements
@ 2021-07-27 10:24 fdmanana
  2021-07-27 10:24 ` [PATCH 1/3] btrfs: fix lost inode on log replay after mix of fsync, rename and inode eviction fdmanana
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: fdmanana @ 2021-07-27 10:24 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

The first patch in the series fixes a bug where a directory fsync, following
by inode eviction, followed by renaming a file and syncing the log results
in losing a file if a power failure happens and the log is replayed.

The remaining two changes are independent, and are about avoiding unnecessary
work during operations that need to check or modify the log (renames, adding
a hard link, unlinks) and making renames hold log commits for a shorter
period.

Filipe Manana (3):
  btrfs: fix lost inode on log replay after mix of fsync, rename and inode eviction
  btrfs: eliminate some false positives when checking if inode was logged
  btrfs: do not pin logs too early during renames

 fs/btrfs/inode.c    | 48 +++++++++++++++++++++++++++++++++++++++------
 fs/btrfs/tree-log.c | 29 ++++++++++++++++-----------
 2 files changed, 60 insertions(+), 17 deletions(-)

-- 
2.28.0


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/3] btrfs: fix lost inode on log replay after mix of fsync, rename and inode eviction
  2021-07-27 10:24 [PATCH 0/3] btrfs: fsync changes, a bug fix and a couple improvements fdmanana
@ 2021-07-27 10:24 ` fdmanana
  2021-07-27 10:24 ` [PATCH 2/3] btrfs: eliminate some false positives when checking if inode was logged fdmanana
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: fdmanana @ 2021-07-27 10:24 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

When checking if we need to log the new name of a renamed inode, we are
checking if the inode and its parent inode have been logged before, and if
not we don't log the new name. The check however is buggy, as it directly
compares the logged_trans field of the inodes versus the ID of the current
transaction. The problem is that logged_trans is a transient field, only
stored in memory and never persisted in the inode item, so if an inode
was logged before, evicted and reloaded, its logged_trans field is set to
a value of 0, meaning the check will return false and the new name of the
renamed inode is not logged. If the old parent directory was previously
fsynced and we deleted the logged directory entries corresponding to the
old name, we end up with a log that when replayed will delete the renamed
inode.

The following example triggers the problem:

  $ mkfs.btrfs -f /dev/sdc
  $ mount /dev/sdc /mnt

  $ mkdir /mnt/A
  $ mkdir /mnt/B
  $ echo -n "hello world" > /mnt/A/foo

  $ sync

  # Add some new file to A and fsync directory A.
  $ touch /mnt/A/bar
  $ xfs_io -c "fsync" /mnt/A

  # Now trigger inode eviction. We are only interested in triggering
  # eviction for the inode of directory A.
  $ echo 2 > /proc/sys/vm/drop_caches

  # Move foo from directory A to directory B.
  # This deletes the directory entries for foo in A from the log, and
  # does not add the new name for foo in directory B to the log, because
  # logged_trans of A is 0, which is less than the current transaction ID.
  $ mv /mnt/A/foo /mnt/B/foo

  # Now make an fsync to anything except A, B or any file inside them,
  # like for example create a file at the root directory and fsync this
  # new file. This syncs the log that contains all the changes done by
  # previous rename operation.
  $ touch /mnt/baz
  $ xfs_io -c "fsync" /mnt/baz

  <power fail>

  # Mount the filesystem and replay the log.
  $ mount /dev/sdc /mnt

  # Check the filesystem content.
  $ ls -1R /mnt
  /mnt/:
  A
  B
  baz

  /mnt/A:
  bar

  /mnt/B:
  $

  # File foo is gone, it's neither in A/ nor in B/.

Fix this by using the inode_logged() helper at btrfs_log_new_name(), which
safely checks if an inode was logged before in the current transaction.

A test case for fstests will follow soon.

CC: stable@vger.kernel.org # 4.14+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/tree-log.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index c2ebe700d6e2..8dde5c08a48f 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -6536,8 +6536,8 @@ void btrfs_log_new_name(struct btrfs_trans_handle *trans,
 	 * if this inode hasn't been logged and directory we're renaming it
 	 * from hasn't been logged, we don't need to log it
 	 */
-	if (inode->logged_trans < trans->transid &&
-	    (!old_dir || old_dir->logged_trans < trans->transid))
+	if (!inode_logged(trans, inode) &&
+	    (!old_dir || !inode_logged(trans, old_dir)))
 		return;
 
 	/*
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/3] btrfs: eliminate some false positives when checking if inode was logged
  2021-07-27 10:24 [PATCH 0/3] btrfs: fsync changes, a bug fix and a couple improvements fdmanana
  2021-07-27 10:24 ` [PATCH 1/3] btrfs: fix lost inode on log replay after mix of fsync, rename and inode eviction fdmanana
@ 2021-07-27 10:24 ` fdmanana
  2021-07-27 10:24 ` [PATCH 3/3] btrfs: do not pin logs too early during renames fdmanana
  2021-07-28 12:41 ` [PATCH 0/3] btrfs: fsync changes, a bug fix and a couple improvements David Sterba
  3 siblings, 0 replies; 8+ messages in thread
From: fdmanana @ 2021-07-27 10:24 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

When checking if an inode was previously logged in the current transaction
through the helper inode_logged(), we can return some false positives that
can be easily eliminated. These correspond to the cases where an inode has
a ->logged_trans value that is not zero and its value is smaller then the
ID of the current transaction. This means we know exactly that the inode
was never logged before in the current transaction, so we can return false
and avoid the callers to do extra work:

1) Having btrfs_del_dir_entries_in_log() and btrfs_del_inode_ref_in_log()
   unnecessarily join a log transaction and do deletion searches in a log
   tree that will not find anything. This just adds unnecessary contention
   on extent buffer locks;

2) Having btrfs_log_new_name() unnecessarily log an inode when it is not
   needed. If the inode was not logged before, we don't need to log it in
   LOG_INODE_EXISTS mode.

So just make sure that any false positive only happens when ->logged_trans
has a value of 0.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/tree-log.c | 25 ++++++++++++++++---------
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 8dde5c08a48f..fc98b7a7a8e6 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -3421,14 +3421,10 @@ int btrfs_free_log_root_tree(struct btrfs_trans_handle *trans,
 }
 
 /*
- * Check if an inode was logged in the current transaction. We can't always rely
- * on an inode's logged_trans value, because it's an in-memory only field and
- * therefore not persisted. This means that its value is lost if the inode gets
- * evicted and loaded again from disk (in which case it has a value of 0, and
- * certainly it is smaller then any possible transaction ID), when that happens
- * the full_sync flag is set in the inode's runtime flags, so on that case we
- * assume eviction happened and ignore the logged_trans value, assuming the
- * worst case, that the inode was logged before in the current transaction.
+ * Check if an inode was logged in the current transaction. This may often
+ * return some false positives, because logged_trans is an in memory only field,
+ * not persisted anywhere. This is meant to be used in contexts where a false
+ * positive has no functional consequences.
  */
 static bool inode_logged(struct btrfs_trans_handle *trans,
 			 struct btrfs_inode *inode)
@@ -3436,7 +3432,18 @@ static bool inode_logged(struct btrfs_trans_handle *trans,
 	if (inode->logged_trans == trans->transid)
 		return true;
 
-	if (inode->last_trans == trans->transid &&
+	/*
+	 * The inode's logged_trans is always 0 when we load it (because it is
+	 * not persisted in the inode item or elsewhere). So if it is 0, the
+	 * inode was last modified in the current transaction and has the
+	 * full_sync flag set, then the inode may have been logged before in
+	 * the current transaction, then evicted and loaded again in the current
+	 * transaction - or may have never been logged in the current transaction,
+	 * but since we can not be sure, we have to assume it was, otherwise our
+	 * callers can leave an inconsistent log.
+	 */
+	if (inode->logged_trans == 0 &&
+	    inode->last_trans == trans->transid &&
 	    test_bit(BTRFS_INODE_NEEDS_FULL_SYNC, &inode->runtime_flags) &&
 	    !test_bit(BTRFS_FS_LOG_RECOVERING, &trans->fs_info->flags))
 		return true;
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/3] btrfs: do not pin logs too early during renames
  2021-07-27 10:24 [PATCH 0/3] btrfs: fsync changes, a bug fix and a couple improvements fdmanana
  2021-07-27 10:24 ` [PATCH 1/3] btrfs: fix lost inode on log replay after mix of fsync, rename and inode eviction fdmanana
  2021-07-27 10:24 ` [PATCH 2/3] btrfs: eliminate some false positives when checking if inode was logged fdmanana
@ 2021-07-27 10:24 ` fdmanana
  2021-07-28 12:41 ` [PATCH 0/3] btrfs: fsync changes, a bug fix and a couple improvements David Sterba
  3 siblings, 0 replies; 8+ messages in thread
From: fdmanana @ 2021-07-27 10:24 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

During renames we pin the logs of the roots a bit too early, before the
calls to btrfs_insert_inode_ref(). We can pin the logs after those calls,
since those will not change anything in a log tree.

In a scenario where we have multiple and diverse filesystem operations
running in parallel, those calls can take a significant amount of time,
due to lock contention on extent buffers, and delay log commits from other
tasks for longer than necessary.

So just pin logs after calls to btrfs_insert_inode_ref() and right before
the first operation that can update a log tree.

The following script that uses dbench was used for testing:

  $ cat dbench-test.sh
  #!/bin/bash

  DEV=/dev/nvme0n1
  MNT=/mnt/nvme0n1
  MOUNT_OPTIONS="-o ssd"
  MKFS_OPTIONS="-m single -d single"

  echo "performance" | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

  umount $DEV &> /dev/null
  mkfs.btrfs -f $MKFS_OPTIONS $DEV
  mount $MOUNT_OPTIONS $DEV $MNT

  dbench -D $MNT -t 120 16

  umount $MNT

The tests were run on a machine with 12 cores, 64G of RAN, a NVME device
and using a non-debug kernel config (Debian's default config).

The results compare a branch without this patch and without the previous
patch in the series, that has the subject:

 "btrfs: eliminate some false positives when checking if inode was logged"

Versus the same branch with these two patches applied.

dbench with 8 clients, results before:

 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX    4391359     0.009   249.745
 Close        3225882     0.001     3.243
 Rename        185953     0.065   240.643
 Unlink        886669     0.049   249.906
 Deltree          112     2.455   217.433
 Mkdir             56     0.002     0.004
 Qpathinfo    3980281     0.004     3.109
 Qfileinfo     697579     0.001     0.187
 Qfsinfo       729780     0.002     2.424
 Sfileinfo     357764     0.004     1.415
 Find         1538861     0.016     4.863
 WriteX       2189666     0.010     3.327
 ReadX        6883443     0.002     0.729
 LockX          14298     0.002     0.073
 UnlockX        14298     0.001     0.042
 Flush         307777     2.447   303.663

Throughput 1149.6 MB/sec  8 clients  8 procs  max_latency=303.666 ms

dbench with 8 clients, results after:

 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX    4269920     0.009   213.532
 Close        3136653     0.001     0.690
 Rename        180805     0.082   213.858
 Unlink        862189     0.050   172.893
 Deltree          112     2.998   218.328
 Mkdir             56     0.002     0.003
 Qpathinfo    3870158     0.004     5.072
 Qfileinfo     678375     0.001     0.194
 Qfsinfo       709604     0.002     0.485
 Sfileinfo     347850     0.004     1.304
 Find         1496310     0.017     5.504
 WriteX       2129613     0.010     2.882
 ReadX        6693066     0.002     1.517
 LockX          13902     0.002     0.075
 UnlockX        13902     0.001     0.055
 Flush         299276     2.511   220.189

Throughput 1187.33 MB/sec  8 clients  8 procs  max_latency=220.194 ms

+3.2% throughput, -31.8% max latency

dbench with 16 clients, results before:

 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX    5978334     0.028   156.507
 Close        4391598     0.001     1.345
 Rename        253136     0.241   155.057
 Unlink       1207220     0.182   257.344
 Deltree          160     6.123    36.277
 Mkdir             80     0.003     0.005
 Qpathinfo    5418817     0.012     6.867
 Qfileinfo     949929     0.001     0.941
 Qfsinfo       993560     0.002     1.386
 Sfileinfo     486904     0.004     2.829
 Find         2095088     0.059     8.164
 WriteX       2982319     0.017     9.029
 ReadX        9371484     0.002     4.052
 LockX          19470     0.002     0.461
 UnlockX        19470     0.001     0.990
 Flush         418936     2.740   347.902

Throughput 1495.31 MB/sec  16 clients  16 procs  max_latency=347.909 ms

dbench with 16 clients, results after:

 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX    5711833     0.029   131.240
 Close        4195897     0.001     1.732
 Rename        241849     0.204   147.831
 Unlink       1153341     0.184   231.322
 Deltree          160     6.086    30.198
 Mkdir             80     0.003     0.021
 Qpathinfo    5177011     0.012     7.150
 Qfileinfo     907768     0.001     0.793
 Qfsinfo       949205     0.002     1.431
 Sfileinfo     465317     0.004     2.454
 Find         2001541     0.058     7.819
 WriteX       2850661     0.017     9.110
 ReadX        8952289     0.002     3.991
 LockX          18596     0.002     0.655
 UnlockX        18596     0.001     0.179
 Flush         400342     2.879   293.607

Throughput 1565.73 MB/sec  16 clients  16 procs  max_latency=293.611 ms

+4.6% throughput, -16.9% max latency

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/inode.c | 48 ++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 42 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 6089c5e7763c..59b35e12bb81 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9271,8 +9271,6 @@ static int btrfs_rename_exchange(struct inode *old_dir,
 		/* force full log commit if subvolume involved. */
 		btrfs_set_log_full_commit(trans);
 	} else {
-		btrfs_pin_log_trans(root);
-		root_log_pinned = true;
 		ret = btrfs_insert_inode_ref(trans, dest,
 					     new_dentry->d_name.name,
 					     new_dentry->d_name.len,
@@ -9289,8 +9287,6 @@ static int btrfs_rename_exchange(struct inode *old_dir,
 		/* force full log commit if subvolume involved. */
 		btrfs_set_log_full_commit(trans);
 	} else {
-		btrfs_pin_log_trans(dest);
-		dest_log_pinned = true;
 		ret = btrfs_insert_inode_ref(trans, root,
 					     old_dentry->d_name.name,
 					     old_dentry->d_name.len,
@@ -9321,6 +9317,29 @@ static int btrfs_rename_exchange(struct inode *old_dir,
 				BTRFS_I(new_inode), 1);
 	}
 
+	/*
+	 * Now pin the logs of the roots. We do it to ensure that no other task
+	 * can sync the logs while we are in progress with the rename, because
+	 * that could result in an inconsistency in case any of the inodes that
+	 * are part of this rename operation were logged before.
+	 *
+	 * We pin the logs even if at this precise moment none of the inodes was
+	 * logged before. This is because right after we checked for that, some
+	 * other task fsyncing some other inode not involved with this rename
+	 * operation could log that one of our inodes exists.
+	 *
+	 * We don't need to pin the logs before the above calls to
+	 * btrfs_insert_inode_ref(), since those don't ever need to change a log.
+	 */
+	if (old_ino != BTRFS_FIRST_FREE_OBJECTID) {
+		btrfs_pin_log_trans(root);
+		root_log_pinned = true;
+	}
+	if (new_ino != BTRFS_FIRST_FREE_OBJECTID) {
+		btrfs_pin_log_trans(dest);
+		dest_log_pinned = true;
+	}
+
 	/* src is a subvolume */
 	if (old_ino == BTRFS_FIRST_FREE_OBJECTID) {
 		ret = btrfs_unlink_subvol(trans, old_dir, old_dentry);
@@ -9573,8 +9592,6 @@ static int btrfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 		/* force full log commit if subvolume involved. */
 		btrfs_set_log_full_commit(trans);
 	} else {
-		btrfs_pin_log_trans(root);
-		log_pinned = true;
 		ret = btrfs_insert_inode_ref(trans, dest,
 					     new_dentry->d_name.name,
 					     new_dentry->d_name.len,
@@ -9598,6 +9615,25 @@ static int btrfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	if (unlikely(old_ino == BTRFS_FIRST_FREE_OBJECTID)) {
 		ret = btrfs_unlink_subvol(trans, old_dir, old_dentry);
 	} else {
+		/*
+		 * Now pin the log. We do it to ensure that no other task can
+		 * sync the log while we are in progress with the rename, as
+		 * that could result in an inconsistency in case any of the
+		 * inodes that are part of this rename operation were logged
+		 * before.
+		 *
+		 * We pin the log even if at this precise moment none of the
+		 * inodes was logged before. This is because right after we
+		 * checked for that, some other task fsyncing some other inode
+		 * not involved with this rename operation could log that one of
+		 * our inodes exists.
+		 *
+		 * We don't need to pin the logs before the above call to
+		 * btrfs_insert_inode_ref(), since that does not need to change
+		 * a log.
+		 */
+		btrfs_pin_log_trans(root);
+		log_pinned = true;
 		ret = __btrfs_unlink_inode(trans, root, BTRFS_I(old_dir),
 					BTRFS_I(d_inode(old_dentry)),
 					old_dentry->d_name.name,
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/3] btrfs: fsync changes, a bug fix and a couple improvements
  2021-07-27 10:24 [PATCH 0/3] btrfs: fsync changes, a bug fix and a couple improvements fdmanana
                   ` (2 preceding siblings ...)
  2021-07-27 10:24 ` [PATCH 3/3] btrfs: do not pin logs too early during renames fdmanana
@ 2021-07-28 12:41 ` David Sterba
  3 siblings, 0 replies; 8+ messages in thread
From: David Sterba @ 2021-07-28 12:41 UTC (permalink / raw)
  To: fdmanana; +Cc: linux-btrfs

On Tue, Jul 27, 2021 at 11:24:42AM +0100, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> The first patch in the series fixes a bug where a directory fsync, following
> by inode eviction, followed by renaming a file and syncing the log results
> in losing a file if a power failure happens and the log is replayed.
> 
> The remaining two changes are independent, and are about avoiding unnecessary
> work during operations that need to check or modify the log (renames, adding
> a hard link, unlinks) and making renames hold log commits for a shorter
> period.

Added to misc-next, thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 3/3] btrfs: do not pin logs too early during renames
  2021-07-27 10:24 ` [PATCH 3/3] btrfs: do not pin logs too early during renames fdmanana
@ 2021-07-29  9:27 ` Dan Carpenter
  -1 siblings, 0 replies; 8+ messages in thread
From: kernel test robot @ 2021-07-28 22:04 UTC (permalink / raw)
  To: kbuild

[-- Attachment #1: Type: text/plain, Size: 21494 bytes --]

CC: kbuild-all(a)lists.01.org
In-Reply-To: <019dba0981817da70e0820550267b368e3ce7389.1627379796.git.fdmanana@suse.com>
References: <019dba0981817da70e0820550267b368e3ce7389.1627379796.git.fdmanana@suse.com>
TO: fdmanana(a)kernel.org

Hi,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on kdave/for-next]
[also build test WARNING on v5.14-rc3 next-20210727]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/fdmanana-kernel-org/btrfs-fsync-changes-a-bug-fix-and-a-couple-improvements/20210727-182628
base:   https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
:::::: branch date: 2 days ago
:::::: commit date: 2 days ago
config: arm64-randconfig-m031-20210728 (attached as .config)
compiler: aarch64-linux-gcc (GCC) 10.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

New smatch warnings:
fs/btrfs/inode.c:9338 btrfs_rename_exchange() warn: variable dereferenced before check 'new_inode' (see line 9225)

Old smatch warnings:
fs/btrfs/inode.c:285 insert_inline_extent() error: we previously assumed 'compressed_pages' could be null (see line 254)

vim +/new_inode +9338 fs/btrfs/inode.c

39279cc3d2704c Chris Mason     2007-06-12  9117  
cdd1fedf8261cd Dan Fuhry       2016-03-17  9118  static int btrfs_rename_exchange(struct inode *old_dir,
cdd1fedf8261cd Dan Fuhry       2016-03-17  9119  			      struct dentry *old_dentry,
cdd1fedf8261cd Dan Fuhry       2016-03-17  9120  			      struct inode *new_dir,
cdd1fedf8261cd Dan Fuhry       2016-03-17  9121  			      struct dentry *new_dentry)
cdd1fedf8261cd Dan Fuhry       2016-03-17  9122  {
0b246afa62b0cf Jeff Mahoney    2016-06-22  9123  	struct btrfs_fs_info *fs_info = btrfs_sb(old_dir->i_sb);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9124  	struct btrfs_trans_handle *trans;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9125  	struct btrfs_root *root = BTRFS_I(old_dir)->root;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9126  	struct btrfs_root *dest = BTRFS_I(new_dir)->root;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9127  	struct inode *new_inode = new_dentry->d_inode;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9128  	struct inode *old_inode = old_dentry->d_inode;
95582b00838837 Deepa Dinamani  2018-05-08  9129  	struct timespec64 ctime = current_time(old_inode);
4a0cc7ca6c40b6 Nikolay Borisov 2017-01-10  9130  	u64 old_ino = btrfs_ino(BTRFS_I(old_inode));
4a0cc7ca6c40b6 Nikolay Borisov 2017-01-10  9131  	u64 new_ino = btrfs_ino(BTRFS_I(new_inode));
cdd1fedf8261cd Dan Fuhry       2016-03-17  9132  	u64 old_idx = 0;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9133  	u64 new_idx = 0;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9134  	int ret;
75b463d2b47aef Filipe Manana   2020-08-11  9135  	int ret2;
86e8aa0e772cab Filipe Manana   2016-05-05  9136  	bool root_log_pinned = false;
86e8aa0e772cab Filipe Manana   2016-05-05  9137  	bool dest_log_pinned = false;
dc09ef3562726c Josef Bacik     2021-05-19  9138  	bool need_abort = false;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9139  
cdd1fedf8261cd Dan Fuhry       2016-03-17  9140  	/* we only allow rename subvolume link between subvolumes */
cdd1fedf8261cd Dan Fuhry       2016-03-17  9141  	if (old_ino != BTRFS_FIRST_FREE_OBJECTID && root != dest)
cdd1fedf8261cd Dan Fuhry       2016-03-17  9142  		return -EXDEV;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9143  
cdd1fedf8261cd Dan Fuhry       2016-03-17  9144  	/* close the race window with snapshot create/destroy ioctl */
943eb3bf25f4a7 Josef Bacik     2019-11-19  9145  	if (old_ino == BTRFS_FIRST_FREE_OBJECTID ||
943eb3bf25f4a7 Josef Bacik     2019-11-19  9146  	    new_ino == BTRFS_FIRST_FREE_OBJECTID)
0b246afa62b0cf Jeff Mahoney    2016-06-22  9147  		down_read(&fs_info->subvol_sem);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9148  
cdd1fedf8261cd Dan Fuhry       2016-03-17  9149  	/*
cdd1fedf8261cd Dan Fuhry       2016-03-17  9150  	 * We want to reserve the absolute worst case amount of items.  So if
cdd1fedf8261cd Dan Fuhry       2016-03-17  9151  	 * both inodes are subvols and we need to unlink them then that would
cdd1fedf8261cd Dan Fuhry       2016-03-17  9152  	 * require 4 item modifications, but if they are both normal inodes it
cdd1fedf8261cd Dan Fuhry       2016-03-17  9153  	 * would require 5 item modifications, so we'll assume their normal
cdd1fedf8261cd Dan Fuhry       2016-03-17  9154  	 * inodes.  So 5 * 2 is 10, plus 2 for the new links, so 12 total items
cdd1fedf8261cd Dan Fuhry       2016-03-17  9155  	 * should cover the worst case number of items we'll modify.
cdd1fedf8261cd Dan Fuhry       2016-03-17  9156  	 */
cdd1fedf8261cd Dan Fuhry       2016-03-17  9157  	trans = btrfs_start_transaction(root, 12);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9158  	if (IS_ERR(trans)) {
cdd1fedf8261cd Dan Fuhry       2016-03-17  9159  		ret = PTR_ERR(trans);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9160  		goto out_notrans;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9161  	}
cdd1fedf8261cd Dan Fuhry       2016-03-17  9162  
00aa8e87c9dc63 Josef Bacik     2021-03-12  9163  	if (dest != root) {
00aa8e87c9dc63 Josef Bacik     2021-03-12  9164  		ret = btrfs_record_root_in_trans(trans, dest);
00aa8e87c9dc63 Josef Bacik     2021-03-12  9165  		if (ret)
00aa8e87c9dc63 Josef Bacik     2021-03-12  9166  			goto out_fail;
00aa8e87c9dc63 Josef Bacik     2021-03-12  9167  	}
3e1740993e4311 Josef Bacik     2019-11-15  9168  
cdd1fedf8261cd Dan Fuhry       2016-03-17  9169  	/*
cdd1fedf8261cd Dan Fuhry       2016-03-17  9170  	 * We need to find a free sequence number both in the source and
cdd1fedf8261cd Dan Fuhry       2016-03-17  9171  	 * in the destination directory for the exchange.
cdd1fedf8261cd Dan Fuhry       2016-03-17  9172  	 */
877574e2548bbf Nikolay Borisov 2017-02-20  9173  	ret = btrfs_set_inode_index(BTRFS_I(new_dir), &old_idx);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9174  	if (ret)
cdd1fedf8261cd Dan Fuhry       2016-03-17  9175  		goto out_fail;
877574e2548bbf Nikolay Borisov 2017-02-20  9176  	ret = btrfs_set_inode_index(BTRFS_I(old_dir), &new_idx);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9177  	if (ret)
cdd1fedf8261cd Dan Fuhry       2016-03-17  9178  		goto out_fail;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9179  
cdd1fedf8261cd Dan Fuhry       2016-03-17  9180  	BTRFS_I(old_inode)->dir_index = 0ULL;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9181  	BTRFS_I(new_inode)->dir_index = 0ULL;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9182  
cdd1fedf8261cd Dan Fuhry       2016-03-17  9183  	/* Reference for the source. */
cdd1fedf8261cd Dan Fuhry       2016-03-17  9184  	if (old_ino == BTRFS_FIRST_FREE_OBJECTID) {
cdd1fedf8261cd Dan Fuhry       2016-03-17  9185  		/* force full log commit if subvolume involved. */
907877664e2d85 David Sterba    2019-03-20  9186  		btrfs_set_log_full_commit(trans);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9187  	} else {
cdd1fedf8261cd Dan Fuhry       2016-03-17  9188  		ret = btrfs_insert_inode_ref(trans, dest,
cdd1fedf8261cd Dan Fuhry       2016-03-17  9189  					     new_dentry->d_name.name,
cdd1fedf8261cd Dan Fuhry       2016-03-17  9190  					     new_dentry->d_name.len,
cdd1fedf8261cd Dan Fuhry       2016-03-17  9191  					     old_ino,
f85b7379cd76ad David Sterba    2017-01-20  9192  					     btrfs_ino(BTRFS_I(new_dir)),
f85b7379cd76ad David Sterba    2017-01-20  9193  					     old_idx);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9194  		if (ret)
cdd1fedf8261cd Dan Fuhry       2016-03-17  9195  			goto out_fail;
dc09ef3562726c Josef Bacik     2021-05-19  9196  		need_abort = true;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9197  	}
cdd1fedf8261cd Dan Fuhry       2016-03-17  9198  
cdd1fedf8261cd Dan Fuhry       2016-03-17  9199  	/* And now for the dest. */
cdd1fedf8261cd Dan Fuhry       2016-03-17  9200  	if (new_ino == BTRFS_FIRST_FREE_OBJECTID) {
cdd1fedf8261cd Dan Fuhry       2016-03-17  9201  		/* force full log commit if subvolume involved. */
907877664e2d85 David Sterba    2019-03-20  9202  		btrfs_set_log_full_commit(trans);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9203  	} else {
cdd1fedf8261cd Dan Fuhry       2016-03-17  9204  		ret = btrfs_insert_inode_ref(trans, root,
cdd1fedf8261cd Dan Fuhry       2016-03-17  9205  					     old_dentry->d_name.name,
cdd1fedf8261cd Dan Fuhry       2016-03-17  9206  					     old_dentry->d_name.len,
cdd1fedf8261cd Dan Fuhry       2016-03-17  9207  					     new_ino,
f85b7379cd76ad David Sterba    2017-01-20  9208  					     btrfs_ino(BTRFS_I(old_dir)),
f85b7379cd76ad David Sterba    2017-01-20  9209  					     new_idx);
dc09ef3562726c Josef Bacik     2021-05-19  9210  		if (ret) {
dc09ef3562726c Josef Bacik     2021-05-19  9211  			if (need_abort)
dc09ef3562726c Josef Bacik     2021-05-19  9212  				btrfs_abort_transaction(trans, ret);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9213  			goto out_fail;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9214  		}
dc09ef3562726c Josef Bacik     2021-05-19  9215  	}
cdd1fedf8261cd Dan Fuhry       2016-03-17  9216  
cdd1fedf8261cd Dan Fuhry       2016-03-17  9217  	/* Update inode version and ctime/mtime. */
cdd1fedf8261cd Dan Fuhry       2016-03-17  9218  	inode_inc_iversion(old_dir);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9219  	inode_inc_iversion(new_dir);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9220  	inode_inc_iversion(old_inode);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9221  	inode_inc_iversion(new_inode);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9222  	old_dir->i_ctime = old_dir->i_mtime = ctime;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9223  	new_dir->i_ctime = new_dir->i_mtime = ctime;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9224  	old_inode->i_ctime = ctime;
cdd1fedf8261cd Dan Fuhry       2016-03-17 @9225  	new_inode->i_ctime = ctime;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9226  
cdd1fedf8261cd Dan Fuhry       2016-03-17  9227  	if (old_dentry->d_parent != new_dentry->d_parent) {
f85b7379cd76ad David Sterba    2017-01-20  9228  		btrfs_record_unlink_dir(trans, BTRFS_I(old_dir),
f85b7379cd76ad David Sterba    2017-01-20  9229  				BTRFS_I(old_inode), 1);
f85b7379cd76ad David Sterba    2017-01-20  9230  		btrfs_record_unlink_dir(trans, BTRFS_I(new_dir),
f85b7379cd76ad David Sterba    2017-01-20  9231  				BTRFS_I(new_inode), 1);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9232  	}
cdd1fedf8261cd Dan Fuhry       2016-03-17  9233  
c8758bb7bea761 Filipe Manana   2021-07-27  9234  	/*
c8758bb7bea761 Filipe Manana   2021-07-27  9235  	 * Now pin the logs of the roots. We do it to ensure that no other task
c8758bb7bea761 Filipe Manana   2021-07-27  9236  	 * can sync the logs while we are in progress with the rename, because
c8758bb7bea761 Filipe Manana   2021-07-27  9237  	 * that could result in an inconsistency in case any of the inodes that
c8758bb7bea761 Filipe Manana   2021-07-27  9238  	 * are part of this rename operation were logged before.
c8758bb7bea761 Filipe Manana   2021-07-27  9239  	 *
c8758bb7bea761 Filipe Manana   2021-07-27  9240  	 * We pin the logs even if at this precise moment none of the inodes was
c8758bb7bea761 Filipe Manana   2021-07-27  9241  	 * logged before. This is because right after we checked for that, some
c8758bb7bea761 Filipe Manana   2021-07-27  9242  	 * other task fsyncing some other inode not involved with this rename
c8758bb7bea761 Filipe Manana   2021-07-27  9243  	 * operation could log that one of our inodes exists.
c8758bb7bea761 Filipe Manana   2021-07-27  9244  	 *
c8758bb7bea761 Filipe Manana   2021-07-27  9245  	 * We don't need to pin the logs before the above calls to
c8758bb7bea761 Filipe Manana   2021-07-27  9246  	 * btrfs_insert_inode_ref(), since those don't ever need to change a log.
c8758bb7bea761 Filipe Manana   2021-07-27  9247  	 */
c8758bb7bea761 Filipe Manana   2021-07-27  9248  	if (old_ino != BTRFS_FIRST_FREE_OBJECTID) {
c8758bb7bea761 Filipe Manana   2021-07-27  9249  		btrfs_pin_log_trans(root);
c8758bb7bea761 Filipe Manana   2021-07-27  9250  		root_log_pinned = true;
c8758bb7bea761 Filipe Manana   2021-07-27  9251  	}
c8758bb7bea761 Filipe Manana   2021-07-27  9252  	if (new_ino != BTRFS_FIRST_FREE_OBJECTID) {
c8758bb7bea761 Filipe Manana   2021-07-27  9253  		btrfs_pin_log_trans(dest);
c8758bb7bea761 Filipe Manana   2021-07-27  9254  		dest_log_pinned = true;
c8758bb7bea761 Filipe Manana   2021-07-27  9255  	}
c8758bb7bea761 Filipe Manana   2021-07-27  9256  
cdd1fedf8261cd Dan Fuhry       2016-03-17  9257  	/* src is a subvolume */
cdd1fedf8261cd Dan Fuhry       2016-03-17  9258  	if (old_ino == BTRFS_FIRST_FREE_OBJECTID) {
045d3967b6920b Josef Bacik     2019-12-18  9259  		ret = btrfs_unlink_subvol(trans, old_dir, old_dentry);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9260  	} else { /* src is an inode */
4ec5934e43cabd Nikolay Borisov 2017-01-18  9261  		ret = __btrfs_unlink_inode(trans, root, BTRFS_I(old_dir),
4ec5934e43cabd Nikolay Borisov 2017-01-18  9262  					   BTRFS_I(old_dentry->d_inode),
cdd1fedf8261cd Dan Fuhry       2016-03-17  9263  					   old_dentry->d_name.name,
cdd1fedf8261cd Dan Fuhry       2016-03-17  9264  					   old_dentry->d_name.len);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9265  		if (!ret)
9a56fcd15a9c6b Nikolay Borisov 2020-11-02  9266  			ret = btrfs_update_inode(trans, root, BTRFS_I(old_inode));
cdd1fedf8261cd Dan Fuhry       2016-03-17  9267  	}
cdd1fedf8261cd Dan Fuhry       2016-03-17  9268  	if (ret) {
66642832f06a43 Jeff Mahoney    2016-06-10  9269  		btrfs_abort_transaction(trans, ret);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9270  		goto out_fail;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9271  	}
cdd1fedf8261cd Dan Fuhry       2016-03-17  9272  
cdd1fedf8261cd Dan Fuhry       2016-03-17  9273  	/* dest is a subvolume */
cdd1fedf8261cd Dan Fuhry       2016-03-17  9274  	if (new_ino == BTRFS_FIRST_FREE_OBJECTID) {
045d3967b6920b Josef Bacik     2019-12-18  9275  		ret = btrfs_unlink_subvol(trans, new_dir, new_dentry);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9276  	} else { /* dest is an inode */
4ec5934e43cabd Nikolay Borisov 2017-01-18  9277  		ret = __btrfs_unlink_inode(trans, dest, BTRFS_I(new_dir),
4ec5934e43cabd Nikolay Borisov 2017-01-18  9278  					   BTRFS_I(new_dentry->d_inode),
cdd1fedf8261cd Dan Fuhry       2016-03-17  9279  					   new_dentry->d_name.name,
cdd1fedf8261cd Dan Fuhry       2016-03-17  9280  					   new_dentry->d_name.len);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9281  		if (!ret)
9a56fcd15a9c6b Nikolay Borisov 2020-11-02  9282  			ret = btrfs_update_inode(trans, dest, BTRFS_I(new_inode));
cdd1fedf8261cd Dan Fuhry       2016-03-17  9283  	}
cdd1fedf8261cd Dan Fuhry       2016-03-17  9284  	if (ret) {
66642832f06a43 Jeff Mahoney    2016-06-10  9285  		btrfs_abort_transaction(trans, ret);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9286  		goto out_fail;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9287  	}
cdd1fedf8261cd Dan Fuhry       2016-03-17  9288  
db0a669fb00241 Nikolay Borisov 2017-02-20  9289  	ret = btrfs_add_link(trans, BTRFS_I(new_dir), BTRFS_I(old_inode),
cdd1fedf8261cd Dan Fuhry       2016-03-17  9290  			     new_dentry->d_name.name,
cdd1fedf8261cd Dan Fuhry       2016-03-17  9291  			     new_dentry->d_name.len, 0, old_idx);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9292  	if (ret) {
66642832f06a43 Jeff Mahoney    2016-06-10  9293  		btrfs_abort_transaction(trans, ret);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9294  		goto out_fail;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9295  	}
cdd1fedf8261cd Dan Fuhry       2016-03-17  9296  
db0a669fb00241 Nikolay Borisov 2017-02-20  9297  	ret = btrfs_add_link(trans, BTRFS_I(old_dir), BTRFS_I(new_inode),
cdd1fedf8261cd Dan Fuhry       2016-03-17  9298  			     old_dentry->d_name.name,
cdd1fedf8261cd Dan Fuhry       2016-03-17  9299  			     old_dentry->d_name.len, 0, new_idx);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9300  	if (ret) {
66642832f06a43 Jeff Mahoney    2016-06-10  9301  		btrfs_abort_transaction(trans, ret);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9302  		goto out_fail;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9303  	}
cdd1fedf8261cd Dan Fuhry       2016-03-17  9304  
cdd1fedf8261cd Dan Fuhry       2016-03-17  9305  	if (old_inode->i_nlink == 1)
cdd1fedf8261cd Dan Fuhry       2016-03-17  9306  		BTRFS_I(old_inode)->dir_index = old_idx;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9307  	if (new_inode->i_nlink == 1)
cdd1fedf8261cd Dan Fuhry       2016-03-17  9308  		BTRFS_I(new_inode)->dir_index = new_idx;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9309  
86e8aa0e772cab Filipe Manana   2016-05-05  9310  	if (root_log_pinned) {
75b463d2b47aef Filipe Manana   2020-08-11  9311  		btrfs_log_new_name(trans, BTRFS_I(old_inode), BTRFS_I(old_dir),
75b463d2b47aef Filipe Manana   2020-08-11  9312  				   new_dentry->d_parent);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9313  		btrfs_end_log_trans(root);
86e8aa0e772cab Filipe Manana   2016-05-05  9314  		root_log_pinned = false;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9315  	}
86e8aa0e772cab Filipe Manana   2016-05-05  9316  	if (dest_log_pinned) {
75b463d2b47aef Filipe Manana   2020-08-11  9317  		btrfs_log_new_name(trans, BTRFS_I(new_inode), BTRFS_I(new_dir),
75b463d2b47aef Filipe Manana   2020-08-11  9318  				   old_dentry->d_parent);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9319  		btrfs_end_log_trans(dest);
86e8aa0e772cab Filipe Manana   2016-05-05  9320  		dest_log_pinned = false;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9321  	}
cdd1fedf8261cd Dan Fuhry       2016-03-17  9322  out_fail:
86e8aa0e772cab Filipe Manana   2016-05-05  9323  	/*
86e8aa0e772cab Filipe Manana   2016-05-05  9324  	 * If we have pinned a log and an error happened, we unpin tasks
86e8aa0e772cab Filipe Manana   2016-05-05  9325  	 * trying to sync the log and force them to fallback to a transaction
86e8aa0e772cab Filipe Manana   2016-05-05  9326  	 * commit if the log currently contains any of the inodes involved in
86e8aa0e772cab Filipe Manana   2016-05-05  9327  	 * this rename operation (to ensure we do not persist a log with an
86e8aa0e772cab Filipe Manana   2016-05-05  9328  	 * inconsistent state for any of these inodes or leading to any
86e8aa0e772cab Filipe Manana   2016-05-05  9329  	 * inconsistencies when replayed). If the transaction was aborted, the
86e8aa0e772cab Filipe Manana   2016-05-05  9330  	 * abortion reason is propagated to userspace when attempting to commit
86e8aa0e772cab Filipe Manana   2016-05-05  9331  	 * the transaction. If the log does not contain any of these inodes, we
86e8aa0e772cab Filipe Manana   2016-05-05  9332  	 * allow the tasks to sync it.
86e8aa0e772cab Filipe Manana   2016-05-05  9333  	 */
86e8aa0e772cab Filipe Manana   2016-05-05  9334  	if (ret && (root_log_pinned || dest_log_pinned)) {
0f8939b8ac8623 Nikolay Borisov 2017-01-18  9335  		if (btrfs_inode_in_log(BTRFS_I(old_dir), fs_info->generation) ||
0f8939b8ac8623 Nikolay Borisov 2017-01-18  9336  		    btrfs_inode_in_log(BTRFS_I(new_dir), fs_info->generation) ||
0f8939b8ac8623 Nikolay Borisov 2017-01-18  9337  		    btrfs_inode_in_log(BTRFS_I(old_inode), fs_info->generation) ||
86e8aa0e772cab Filipe Manana   2016-05-05 @9338  		    (new_inode &&
0f8939b8ac8623 Nikolay Borisov 2017-01-18  9339  		     btrfs_inode_in_log(BTRFS_I(new_inode), fs_info->generation)))
907877664e2d85 David Sterba    2019-03-20  9340  			btrfs_set_log_full_commit(trans);
86e8aa0e772cab Filipe Manana   2016-05-05  9341  
86e8aa0e772cab Filipe Manana   2016-05-05  9342  		if (root_log_pinned) {
86e8aa0e772cab Filipe Manana   2016-05-05  9343  			btrfs_end_log_trans(root);
86e8aa0e772cab Filipe Manana   2016-05-05  9344  			root_log_pinned = false;
86e8aa0e772cab Filipe Manana   2016-05-05  9345  		}
86e8aa0e772cab Filipe Manana   2016-05-05  9346  		if (dest_log_pinned) {
86e8aa0e772cab Filipe Manana   2016-05-05  9347  			btrfs_end_log_trans(dest);
86e8aa0e772cab Filipe Manana   2016-05-05  9348  			dest_log_pinned = false;
86e8aa0e772cab Filipe Manana   2016-05-05  9349  		}
86e8aa0e772cab Filipe Manana   2016-05-05  9350  	}
c5b4a50b74018b Filipe Manana   2018-06-11  9351  	ret2 = btrfs_end_transaction(trans);
c5b4a50b74018b Filipe Manana   2018-06-11  9352  	ret = ret ? ret : ret2;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9353  out_notrans:
943eb3bf25f4a7 Josef Bacik     2019-11-19  9354  	if (new_ino == BTRFS_FIRST_FREE_OBJECTID ||
943eb3bf25f4a7 Josef Bacik     2019-11-19  9355  	    old_ino == BTRFS_FIRST_FREE_OBJECTID)
0b246afa62b0cf Jeff Mahoney    2016-06-22  9356  		up_read(&fs_info->subvol_sem);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9357  
cdd1fedf8261cd Dan Fuhry       2016-03-17  9358  	return ret;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9359  }
cdd1fedf8261cd Dan Fuhry       2016-03-17  9360  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 37662 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 3/3] btrfs: do not pin logs too early during renames
@ 2021-07-29  9:27 ` Dan Carpenter
  0 siblings, 0 replies; 8+ messages in thread
From: Dan Carpenter @ 2021-07-29  9:27 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 6208 bytes --]

[ I don't know why Smatch is suddenly complaining about this code when
  it's 5 years old.  But we may as well remove the NULL check. - dan]

Hi,

url:    https://github.com/0day-ci/linux/commits/fdmanana-kernel-org/btrfs-fsync-changes-a-bug-fix-and-a-couple-improvements/20210727-182628
base:   https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
config: arm64-randconfig-m031-20210728 (attached as .config)
compiler: aarch64-linux-gcc (GCC) 10.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

New smatch warnings:
fs/btrfs/inode.c:9338 btrfs_rename_exchange() warn: variable dereferenced before check 'new_inode' (see line 9225)

Old smatch warnings:
fs/btrfs/inode.c:285 insert_inline_extent() error: we previously assumed 'compressed_pages' could be null (see line 254)

vim +/new_inode +9338 fs/btrfs/inode.c

cdd1fedf8261cd Dan Fuhry       2016-03-17  9118  static int btrfs_rename_exchange(struct inode *old_dir,
cdd1fedf8261cd Dan Fuhry       2016-03-17  9119  			      struct dentry *old_dentry,
cdd1fedf8261cd Dan Fuhry       2016-03-17  9120  			      struct inode *new_dir,
cdd1fedf8261cd Dan Fuhry       2016-03-17  9121  			      struct dentry *new_dentry)
cdd1fedf8261cd Dan Fuhry       2016-03-17  9122  {
0b246afa62b0cf Jeff Mahoney    2016-06-22  9123  	struct btrfs_fs_info *fs_info = btrfs_sb(old_dir->i_sb);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9124  	struct btrfs_trans_handle *trans;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9125  	struct btrfs_root *root = BTRFS_I(old_dir)->root;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9126  	struct btrfs_root *dest = BTRFS_I(new_dir)->root;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9127  	struct inode *new_inode = new_dentry->d_inode;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9128  	struct inode *old_inode = old_dentry->d_inode;
95582b00838837 Deepa Dinamani  2018-05-08  9129  	struct timespec64 ctime = current_time(old_inode);
4a0cc7ca6c40b6 Nikolay Borisov 2017-01-10  9130  	u64 old_ino = btrfs_ino(BTRFS_I(old_inode));
4a0cc7ca6c40b6 Nikolay Borisov 2017-01-10  9131  	u64 new_ino = btrfs_ino(BTRFS_I(new_inode));
                                                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
"new_inode" can't be NULL.  Or more properly we are toasted if it is.

cdd1fedf8261cd Dan Fuhry       2016-03-17  9132  	u64 old_idx = 0;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9133  	u64 new_idx = 0;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9134  	int ret;

[ snip ]

cdd1fedf8261cd Dan Fuhry       2016-03-17  9322  out_fail:
86e8aa0e772cab Filipe Manana   2016-05-05  9323  	/*
86e8aa0e772cab Filipe Manana   2016-05-05  9324  	 * If we have pinned a log and an error happened, we unpin tasks
86e8aa0e772cab Filipe Manana   2016-05-05  9325  	 * trying to sync the log and force them to fallback to a transaction
86e8aa0e772cab Filipe Manana   2016-05-05  9326  	 * commit if the log currently contains any of the inodes involved in
86e8aa0e772cab Filipe Manana   2016-05-05  9327  	 * this rename operation (to ensure we do not persist a log with an
86e8aa0e772cab Filipe Manana   2016-05-05  9328  	 * inconsistent state for any of these inodes or leading to any
86e8aa0e772cab Filipe Manana   2016-05-05  9329  	 * inconsistencies when replayed). If the transaction was aborted, the
86e8aa0e772cab Filipe Manana   2016-05-05  9330  	 * abortion reason is propagated to userspace when attempting to commit
86e8aa0e772cab Filipe Manana   2016-05-05  9331  	 * the transaction. If the log does not contain any of these inodes, we
86e8aa0e772cab Filipe Manana   2016-05-05  9332  	 * allow the tasks to sync it.
86e8aa0e772cab Filipe Manana   2016-05-05  9333  	 */
86e8aa0e772cab Filipe Manana   2016-05-05  9334  	if (ret && (root_log_pinned || dest_log_pinned)) {
0f8939b8ac8623 Nikolay Borisov 2017-01-18  9335  		if (btrfs_inode_in_log(BTRFS_I(old_dir), fs_info->generation) ||
0f8939b8ac8623 Nikolay Borisov 2017-01-18  9336  		    btrfs_inode_in_log(BTRFS_I(new_dir), fs_info->generation) ||
0f8939b8ac8623 Nikolay Borisov 2017-01-18  9337  		    btrfs_inode_in_log(BTRFS_I(old_inode), fs_info->generation) ||
86e8aa0e772cab Filipe Manana   2016-05-05 @9338  		    (new_inode &&
                                                                     ^^^^^^^^^
This check is not required and only upsets the static checker.

0f8939b8ac8623 Nikolay Borisov 2017-01-18  9339  		     btrfs_inode_in_log(BTRFS_I(new_inode), fs_info->generation)))
907877664e2d85 David Sterba    2019-03-20  9340  			btrfs_set_log_full_commit(trans);
86e8aa0e772cab Filipe Manana   2016-05-05  9341  
86e8aa0e772cab Filipe Manana   2016-05-05  9342  		if (root_log_pinned) {
86e8aa0e772cab Filipe Manana   2016-05-05  9343  			btrfs_end_log_trans(root);
86e8aa0e772cab Filipe Manana   2016-05-05  9344  			root_log_pinned = false;
86e8aa0e772cab Filipe Manana   2016-05-05  9345  		}
86e8aa0e772cab Filipe Manana   2016-05-05  9346  		if (dest_log_pinned) {
86e8aa0e772cab Filipe Manana   2016-05-05  9347  			btrfs_end_log_trans(dest);
86e8aa0e772cab Filipe Manana   2016-05-05  9348  			dest_log_pinned = false;
86e8aa0e772cab Filipe Manana   2016-05-05  9349  		}
86e8aa0e772cab Filipe Manana   2016-05-05  9350  	}
c5b4a50b74018b Filipe Manana   2018-06-11  9351  	ret2 = btrfs_end_transaction(trans);
c5b4a50b74018b Filipe Manana   2018-06-11  9352  	ret = ret ? ret : ret2;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9353  out_notrans:
943eb3bf25f4a7 Josef Bacik     2019-11-19  9354  	if (new_ino == BTRFS_FIRST_FREE_OBJECTID ||
943eb3bf25f4a7 Josef Bacik     2019-11-19  9355  	    old_ino == BTRFS_FIRST_FREE_OBJECTID)
0b246afa62b0cf Jeff Mahoney    2016-06-22  9356  		up_read(&fs_info->subvol_sem);
cdd1fedf8261cd Dan Fuhry       2016-03-17  9357  
cdd1fedf8261cd Dan Fuhry       2016-03-17  9358  	return ret;
cdd1fedf8261cd Dan Fuhry       2016-03-17  9359  }

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 3/3] btrfs: do not pin logs too early during renames
  2021-07-29  9:27 ` Dan Carpenter
  (?)
@ 2021-07-29 14:40 ` Filipe Manana
  -1 siblings, 0 replies; 8+ messages in thread
From: Filipe Manana @ 2021-07-29 14:40 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 7303 bytes --]

On Thu, Jul 29, 2021 at 10:27 AM Dan Carpenter <dan.carpenter@oracle.com> wrote:
>
> [ I don't know why Smatch is suddenly complaining about this code when
>   it's 5 years old.  But we may as well remove the NULL check. - dan]
>
> Hi,
>
> url:    https://github.com/0day-ci/linux/commits/fdmanana-kernel-org/btrfs-fsync-changes-a-bug-fix-and-a-couple-improvements/20210727-182628
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
> config: arm64-randconfig-m031-20210728 (attached as .config)
> compiler: aarch64-linux-gcc (GCC) 10.3.0
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
>
> New smatch warnings:
> fs/btrfs/inode.c:9338 btrfs_rename_exchange() warn: variable dereferenced before check 'new_inode' (see line 9225)
>
> Old smatch warnings:
> fs/btrfs/inode.c:285 insert_inline_extent() error: we previously assumed 'compressed_pages' could be null (see line 254)
>
> vim +/new_inode +9338 fs/btrfs/inode.c
>
> cdd1fedf8261cd Dan Fuhry       2016-03-17  9118  static int btrfs_rename_exchange(struct inode *old_dir,
> cdd1fedf8261cd Dan Fuhry       2016-03-17  9119                               struct dentry *old_dentry,
> cdd1fedf8261cd Dan Fuhry       2016-03-17  9120                               struct inode *new_dir,
> cdd1fedf8261cd Dan Fuhry       2016-03-17  9121                               struct dentry *new_dentry)
> cdd1fedf8261cd Dan Fuhry       2016-03-17  9122  {
> 0b246afa62b0cf Jeff Mahoney    2016-06-22  9123         struct btrfs_fs_info *fs_info = btrfs_sb(old_dir->i_sb);
> cdd1fedf8261cd Dan Fuhry       2016-03-17  9124         struct btrfs_trans_handle *trans;
> cdd1fedf8261cd Dan Fuhry       2016-03-17  9125         struct btrfs_root *root = BTRFS_I(old_dir)->root;
> cdd1fedf8261cd Dan Fuhry       2016-03-17  9126         struct btrfs_root *dest = BTRFS_I(new_dir)->root;
> cdd1fedf8261cd Dan Fuhry       2016-03-17  9127         struct inode *new_inode = new_dentry->d_inode;
> cdd1fedf8261cd Dan Fuhry       2016-03-17  9128         struct inode *old_inode = old_dentry->d_inode;
> 95582b00838837 Deepa Dinamani  2018-05-08  9129         struct timespec64 ctime = current_time(old_inode);
> 4a0cc7ca6c40b6 Nikolay Borisov 2017-01-10  9130         u64 old_ino = btrfs_ino(BTRFS_I(old_inode));
> 4a0cc7ca6c40b6 Nikolay Borisov 2017-01-10  9131         u64 new_ino = btrfs_ino(BTRFS_I(new_inode));
>                                                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> "new_inode" can't be NULL.  Or more properly we are toasted if it is.
>
> cdd1fedf8261cd Dan Fuhry       2016-03-17  9132         u64 old_idx = 0;
> cdd1fedf8261cd Dan Fuhry       2016-03-17  9133         u64 new_idx = 0;
> cdd1fedf8261cd Dan Fuhry       2016-03-17  9134         int ret;
>
> [ snip ]
>
> cdd1fedf8261cd Dan Fuhry       2016-03-17  9322  out_fail:
> 86e8aa0e772cab Filipe Manana   2016-05-05  9323         /*
> 86e8aa0e772cab Filipe Manana   2016-05-05  9324          * If we have pinned a log and an error happened, we unpin tasks
> 86e8aa0e772cab Filipe Manana   2016-05-05  9325          * trying to sync the log and force them to fallback to a transaction
> 86e8aa0e772cab Filipe Manana   2016-05-05  9326          * commit if the log currently contains any of the inodes involved in
> 86e8aa0e772cab Filipe Manana   2016-05-05  9327          * this rename operation (to ensure we do not persist a log with an
> 86e8aa0e772cab Filipe Manana   2016-05-05  9328          * inconsistent state for any of these inodes or leading to any
> 86e8aa0e772cab Filipe Manana   2016-05-05  9329          * inconsistencies when replayed). If the transaction was aborted, the
> 86e8aa0e772cab Filipe Manana   2016-05-05  9330          * abortion reason is propagated to userspace when attempting to commit
> 86e8aa0e772cab Filipe Manana   2016-05-05  9331          * the transaction. If the log does not contain any of these inodes, we
> 86e8aa0e772cab Filipe Manana   2016-05-05  9332          * allow the tasks to sync it.
> 86e8aa0e772cab Filipe Manana   2016-05-05  9333          */
> 86e8aa0e772cab Filipe Manana   2016-05-05  9334         if (ret && (root_log_pinned || dest_log_pinned)) {
> 0f8939b8ac8623 Nikolay Borisov 2017-01-18  9335                 if (btrfs_inode_in_log(BTRFS_I(old_dir), fs_info->generation) ||
> 0f8939b8ac8623 Nikolay Borisov 2017-01-18  9336                     btrfs_inode_in_log(BTRFS_I(new_dir), fs_info->generation) ||
> 0f8939b8ac8623 Nikolay Borisov 2017-01-18  9337                     btrfs_inode_in_log(BTRFS_I(old_inode), fs_info->generation) ||
> 86e8aa0e772cab Filipe Manana   2016-05-05 @9338                     (new_inode &&
>                                                                      ^^^^^^^^^
> This check is not required and only upsets the static checker.

Indeed, it's not needed as new_inode can never be NULL for a rename
exchange operation.
The code was likely copy-pasted from btrfs_rename(), used for regular
renames, where new_inode can be NULL.

I just sent a patch to remove the check:

https://lore.kernel.org/linux-btrfs/8dd8e8f3020a5bd13ae22a1f21b8328adc1f4636.1627568438.git.fdmanana(a)suse.com/

Thanks.

>
> 0f8939b8ac8623 Nikolay Borisov 2017-01-18  9339                      btrfs_inode_in_log(BTRFS_I(new_inode), fs_info->generation)))
> 907877664e2d85 David Sterba    2019-03-20  9340                         btrfs_set_log_full_commit(trans);
> 86e8aa0e772cab Filipe Manana   2016-05-05  9341
> 86e8aa0e772cab Filipe Manana   2016-05-05  9342                 if (root_log_pinned) {
> 86e8aa0e772cab Filipe Manana   2016-05-05  9343                         btrfs_end_log_trans(root);
> 86e8aa0e772cab Filipe Manana   2016-05-05  9344                         root_log_pinned = false;
> 86e8aa0e772cab Filipe Manana   2016-05-05  9345                 }
> 86e8aa0e772cab Filipe Manana   2016-05-05  9346                 if (dest_log_pinned) {
> 86e8aa0e772cab Filipe Manana   2016-05-05  9347                         btrfs_end_log_trans(dest);
> 86e8aa0e772cab Filipe Manana   2016-05-05  9348                         dest_log_pinned = false;
> 86e8aa0e772cab Filipe Manana   2016-05-05  9349                 }
> 86e8aa0e772cab Filipe Manana   2016-05-05  9350         }
> c5b4a50b74018b Filipe Manana   2018-06-11  9351         ret2 = btrfs_end_transaction(trans);
> c5b4a50b74018b Filipe Manana   2018-06-11  9352         ret = ret ? ret : ret2;
> cdd1fedf8261cd Dan Fuhry       2016-03-17  9353  out_notrans:
> 943eb3bf25f4a7 Josef Bacik     2019-11-19  9354         if (new_ino == BTRFS_FIRST_FREE_OBJECTID ||
> 943eb3bf25f4a7 Josef Bacik     2019-11-19  9355             old_ino == BTRFS_FIRST_FREE_OBJECTID)
> 0b246afa62b0cf Jeff Mahoney    2016-06-22  9356                 up_read(&fs_info->subvol_sem);
> cdd1fedf8261cd Dan Fuhry       2016-03-17  9357
> cdd1fedf8261cd Dan Fuhry       2016-03-17  9358         return ret;
> cdd1fedf8261cd Dan Fuhry       2016-03-17  9359  }
>
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-07-29 14:40 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-27 10:24 [PATCH 0/3] btrfs: fsync changes, a bug fix and a couple improvements fdmanana
2021-07-27 10:24 ` [PATCH 1/3] btrfs: fix lost inode on log replay after mix of fsync, rename and inode eviction fdmanana
2021-07-27 10:24 ` [PATCH 2/3] btrfs: eliminate some false positives when checking if inode was logged fdmanana
2021-07-27 10:24 ` [PATCH 3/3] btrfs: do not pin logs too early during renames fdmanana
2021-07-28 12:41 ` [PATCH 0/3] btrfs: fsync changes, a bug fix and a couple improvements David Sterba
2021-07-28 22:04 [PATCH 3/3] btrfs: do not pin logs too early during renames kernel test robot
2021-07-29  9:27 ` Dan Carpenter
2021-07-29 14:40 ` Filipe Manana

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.