All of lore.kernel.org
 help / color / mirror / Atom feed
* possible different donor file naming in e4defrag
@ 2014-08-15 20:12 TR Reardon
  2014-08-15 20:39 ` Darrick J. Wong
  0 siblings, 1 reply; 10+ messages in thread
From: TR Reardon @ 2014-08-15 20:12 UTC (permalink / raw)
  To: linux-ext4

Currently, e4defrag creates a donor file _in the same directory_ of
the file being defrag'ed.

The problem with this is that the containing directory times are
changed because of the transient presence of the donor file.  Any
thoughts as to moving the donor to a better location, mount-root, or
/lost+found?  Or, ugh, a configurable location, if non-privileged
defrag is important?

+Reardon

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: possible different donor file naming in e4defrag
  2014-08-15 20:12 possible different donor file naming in e4defrag TR Reardon
@ 2014-08-15 20:39 ` Darrick J. Wong
  2014-08-15 20:45   ` TR Reardon
  0 siblings, 1 reply; 10+ messages in thread
From: Darrick J. Wong @ 2014-08-15 20:39 UTC (permalink / raw)
  To: TR Reardon; +Cc: linux-ext4

On Fri, Aug 15, 2014 at 04:12:40PM -0400, TR Reardon wrote:
> Currently, e4defrag creates a donor file _in the same directory_ of
> the file being defrag'ed.
> 
> The problem with this is that the containing directory times are
> changed because of the transient presence of the donor file.  Any
> thoughts as to moving the donor to a better location, mount-root, or
> /lost+found?  Or, ugh, a configurable location, if non-privileged
> defrag is important?

How about we just reset the directory time?  (Or does utime() not work?)

--D
> 
> +Reardon
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: possible different donor file naming in e4defrag
  2014-08-15 20:39 ` Darrick J. Wong
@ 2014-08-15 20:45   ` TR Reardon
  2014-08-15 21:04     ` Andreas Dilger
  0 siblings, 1 reply; 10+ messages in thread
From: TR Reardon @ 2014-08-15 20:45 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4


> On Fri, Aug 15, 2014 at 04:12:40PM -0400, TR Reardon wrote:
>> Currently, e4defrag creates a donor file _in the same directory_ of
>> the file being defrag'ed.
>> 
>> The problem with this is that the containing directory times are
>> changed because of the transient presence of the donor file. Any
>> thoughts as to moving the donor to a better location, mount-root, or
>> /lost+found? Or, ugh, a configurable location, if non-privileged
>> defrag is important?
> 
> How about we just reset the directory time? (Or does utime() not work?)
> 
> --D

Er, yes, that's simpler.  And a SIGINT handler just to keep things tidy?

+Reardon

 		 	   		  --
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: possible different donor file naming in e4defrag
  2014-08-15 20:45   ` TR Reardon
@ 2014-08-15 21:04     ` Andreas Dilger
  2014-09-11 19:48       ` TR Reardon
  0 siblings, 1 reply; 10+ messages in thread
From: Andreas Dilger @ 2014-08-15 21:04 UTC (permalink / raw)
  To: TR Reardon; +Cc: Darrick J. Wong, linux-ext4

The reason the donor file is created in the same directory as the source is to try and keep the block allocation policy consistent with the original inode. 

You may not need a SIGINT handler, since the timestamp could be reset as soon as the file is created and unlinked. 

It may also be possible to use O_TMPFILE on newer kernels to create the donor file to avoid any races?

Cheers, Andreas

> On Aug 15, 2014, at 22:45, TR Reardon <thomas_reardon@hotmail.com> wrote:
> 
> 
>>> On Fri, Aug 15, 2014 at 04:12:40PM -0400, TR Reardon wrote:
>>> Currently, e4defrag creates a donor file _in the same directory_ of
>>> the file being defrag'ed.
>>> 
>>> The problem with this is that the containing directory times are
>>> changed because of the transient presence of the donor file. Any
>>> thoughts as to moving the donor to a better location, mount-root, or
>>> /lost+found? Or, ugh, a configurable location, if non-privileged
>>> defrag is important?
>> 
>> How about we just reset the directory time? (Or does utime() not work?)
>> 
>> --D
> 
> Er, yes, that's simpler.  And a SIGINT handler just to keep things tidy?
> 
> +Reardon
> 
>                         --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: possible different donor file naming in e4defrag
  2014-08-15 21:04     ` Andreas Dilger
@ 2014-09-11 19:48       ` TR Reardon
  2014-09-11 21:19         ` Andreas Dilger
  0 siblings, 1 reply; 10+ messages in thread
From: TR Reardon @ 2014-09-11 19:48 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Darrick J. Wong, linux-ext4

Picking this back up.  How would O_TMPFILE avoid races?  It definitely avoids the unwanted mtime/atime update, but then the existing "<filename>.defrag" pseudo-lock file would no longer be available.  How could you use O_TMPFILE and still avoid multiple defrag?  If this isn't possible, then resetting the parent times on unlink(tmpfile), as you suggest, is the simplest way out of this.

+Reardon

> From: adilger@dilger.ca
> Subject: Re: possible different donor file naming in e4defrag
> Date: Fri, 15 Aug 2014 23:04:21 +0200
> To: thomas_reardon@hotmail.com
>
> The reason the donor file is created in the same directory as the source is to try and keep the block allocation policy consistent with the original inode.
>
> You may not need a SIGINT handler, since the timestamp could be reset as soon as the file is created and unlinked.
>
> It may also be possible to use O_TMPFILE on newer kernels to create the donor file to avoid any races?
>
> Cheers, Andreas
>

 		 	   		  --
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: possible different donor file naming in e4defrag
  2014-09-11 19:48       ` TR Reardon
@ 2014-09-11 21:19         ` Andreas Dilger
  2014-09-11 22:49           ` TR Reardon
  0 siblings, 1 reply; 10+ messages in thread
From: Andreas Dilger @ 2014-09-11 21:19 UTC (permalink / raw)
  To: TR Reardon; +Cc: Darrick J. Wong, linux-ext4

[-- Attachment #1: Type: text/plain, Size: 1903 bytes --]

On Sep 11, 2014, at 1:48 PM, TR Reardon <thomas_reardon@hotmail.com> wrote:
> Picking this back up.  How would O_TMPFILE avoid races?  It definitely avoids the unwanted mtime/atime update, but then the existing "<filename>.defrag" pseudo-lock file would no longer be available.  How could you use O_TMPFILE and still avoid multiple defrag?  If this isn't possible, then resetting the parent times on unlink(tmpfile), as you suggest, is the simplest way out of this.

Looking at this the opposite way - what are the chances that there
will be concurrent defrags on the same file?  Basically no chance at
all.  So long as it doesn't explode (the kernel would need to protect
against this anyway to avoid malicious apps), the worst case is that
there will be some extra defragmentation done in a very rare case.

Conversely, creating a temporary filename and then resetting the
parent directory timestamp is extra work for every file defragmented,
and is racy because e4defrag may "reset" the time to before the temp
file was created, but clobber a legitimate timestamp update in the
directory from some other concurrent update.  That timestamp update
is always going to be racy, even if e4defrag tries to be careful.

Cheers, Andreas

>> From: adilger@dilger.ca
>> Subject: Re: possible different donor file naming in e4defrag
>> Date: Fri, 15 Aug 2014 23:04:21 +0200
>> To: thomas_reardon@hotmail.com
>> 
>> The reason the donor file is created in the same directory as the source is to try and keep the block allocation policy consistent with the original inode.
>> 
>> You may not need a SIGINT handler, since the timestamp could be reset as soon as the file is created and unlinked.
>> 
>> It may also be possible to use O_TMPFILE on newer kernels to create the donor file to avoid any races?
>> 
>> Cheers, Andreas
>> 
> 
> 		 	   		  


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP using GPGMail --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: possible different donor file naming in e4defrag
  2014-09-11 21:19         ` Andreas Dilger
@ 2014-09-11 22:49           ` TR Reardon
  2014-09-11 23:03             ` TR Reardon
  0 siblings, 1 reply; 10+ messages in thread
From: TR Reardon @ 2014-09-11 22:49 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Darrick J. Wong, linux-ext4

> On Sep 11, 2014, at 1:48 PM, TR Reardon <thomas_reardon@hotmail.com> wrote:
>> Picking this back up. How would O_TMPFILE avoid races? It definitely avoids the unwanted mtime/atime update, but then the existing "<filename>.defrag" pseudo-lock file would no longer be available. How could you use O_TMPFILE and still avoid multiple defrag? If this isn't possible, then resetting the parent times on unlink(tmpfile), as you suggest, is the simplest way out of this.
>
> Looking at this the opposite way - what are the chances that there
> will be concurrent defrags on the same file? Basically no chance at
> all. So long as it doesn't explode (the kernel would need to protect
> against this anyway to avoid malicious apps), the worst case is that
> there will be some extra defragmentation done in a very rare case.
>
> Conversely, creating a temporary filename and then resetting the
> parent directory timestamp is extra work for every file defragmented,
> and is racy because e4defrag may "reset" the time to before the temp
> file was created, but clobber a legitimate timestamp update in the
> directory from some other concurrent update. That timestamp update
> is always going to be racy, even if e4defrag tries to be careful.
>
> Cheers, Andreas


Thanks, well described.

So a simple attempt with O_TMPFILE first, then revert to original behavior, like below?


diff --git a/misc/e4defrag.c b/misc/e4defrag.c
index d0eac60..8001182 100644
--- a/misc/e4defrag.c
+++ b/misc/e4defrag.c
@@ -40,6 +40,7 @@
 #include <sys/stat.h>
 #include <sys/statfs.h>
 #include <sys/vfs.h>
+#include <libgen.h>
 
 /* A relatively new ioctl interface ... */
 #ifndef EXT4_IOC_MOVE_EXT
@@ -1526,31 +1527,36 @@ static int file_defrag(const char *file, const struct stat64 *buf,
 
 	/* Create donor inode */
 	memset(tmp_inode_name, 0, PATH_MAX + 8);
-	sprintf(tmp_inode_name, "%.*s.defrag",
-				(int)strnlen(file, PATH_MAX), file);
-	donor_fd = open64(tmp_inode_name, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR);
+	/* Try O_TMPFILE first, to avoid changing directory mtime */
+	sprintf(tmp_inode_name, "%.*s", (int)strnlen(file, PATH_MAX), file);
+	donor_fd = open64(dirname(tmp_inode_name), O_TMPFILE | O_WRONLY | O_EXCL, S_IRUSR | S_IWUSR );
 	if (donor_fd < 0) {
-		if (mode_flag & DETAIL) {
-			PRINT_FILE_NAME(file);
-			if (errno == EEXIST)
-				PRINT_ERR_MSG_WITH_ERRNO(
-				"File is being defraged by other program");
-			else
-				PRINT_ERR_MSG_WITH_ERRNO(NGMSG_FILE_OPEN);
+		sprintf(tmp_inode_name, "%.*s.defrag",
+					(int)strnlen(file, PATH_MAX), file);
+		donor_fd = open64(tmp_inode_name, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR);
+		if (donor_fd < 0) {
+			if (mode_flag & DETAIL) {
+				PRINT_FILE_NAME(file);
+				if (errno == EEXIST)
+					PRINT_ERR_MSG_WITH_ERRNO(
+					"File is being defraged by other program");
+				else
+					PRINT_ERR_MSG_WITH_ERRNO(NGMSG_FILE_OPEN);
+			}
+			goto out;
 		}
-		goto out;
-	}
 
-	/* Unlink donor inode */
-	ret = unlink(tmp_inode_name);
-	if (ret < 0) {
-		if (mode_flag & DETAIL) {
-			PRINT_FILE_NAME(file);
-			PRINT_ERR_MSG_WITH_ERRNO("Failed to unlink");
+		/* Unlink donor inode */
+		ret = unlink(tmp_inode_name);
+		if (ret < 0) {
+			if (mode_flag & DETAIL) {
+				PRINT_FILE_NAME(file);
+				PRINT_ERR_MSG_WITH_ERRNO("Failed to unlink");
+			}
+			goto out;
 		}
-		goto out;
 	}

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* RE: possible different donor file naming in e4defrag
  2014-09-11 22:49           ` TR Reardon
@ 2014-09-11 23:03             ` TR Reardon
  2014-09-12 19:41               ` Andreas Dilger
  0 siblings, 1 reply; 10+ messages in thread
From: TR Reardon @ 2014-09-11 23:03 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Darrick J. Wong, linux-ext4

[-- Attachment #1: Type: text/plain, Size: 4175 bytes --]

(attaching same, to fix whitespace...I suppose I should configure a proper email client sometime)

+Reardon


----------------------------------------
> From: thomas_reardon@hotmail.com
> To: adilger@dilger.ca
> CC: darrick.wong@oracle.com; linux-ext4@vger.kernel.org
> Subject: RE: possible different donor file naming in e4defrag
> Date: Thu, 11 Sep 2014 18:49:07 -0400
>
>> On Sep 11, 2014, at 1:48 PM, TR Reardon <thomas_reardon@hotmail.com> wrote:
>>> Picking this back up. How would O_TMPFILE avoid races? It definitely avoids the unwanted mtime/atime update, but then the existing "<filename>.defrag" pseudo-lock file would no longer be available. How could you use O_TMPFILE and still avoid multiple defrag? If this isn't possible, then resetting the parent times on unlink(tmpfile), as you suggest, is the simplest way out of this.
>>
>> Looking at this the opposite way - what are the chances that there
>> will be concurrent defrags on the same file? Basically no chance at
>> all. So long as it doesn't explode (the kernel would need to protect
>> against this anyway to avoid malicious apps), the worst case is that
>> there will be some extra defragmentation done in a very rare case.
>>
>> Conversely, creating a temporary filename and then resetting the
>> parent directory timestamp is extra work for every file defragmented,
>> and is racy because e4defrag may "reset" the time to before the temp
>> file was created, but clobber a legitimate timestamp update in the
>> directory from some other concurrent update. That timestamp update
>> is always going to be racy, even if e4defrag tries to be careful.
>>
>> Cheers, Andreas
>
>
> Thanks, well described.
>
> So a simple attempt with O_TMPFILE first, then revert to original behavior, like below?
>
>
> diff --git a/misc/e4defrag.c b/misc/e4defrag.c
> index d0eac60..8001182 100644
> --- a/misc/e4defrag.c
> +++ b/misc/e4defrag.c
> @@ -40,6 +40,7 @@
> #include <sys/stat.h>
> #include <sys/statfs.h>
> #include <sys/vfs.h>
> +#include <libgen.h>
>
> /* A relatively new ioctl interface ... */
> #ifndef EXT4_IOC_MOVE_EXT
> @@ -1526,31 +1527,36 @@ static int file_defrag(const char *file, const struct stat64 *buf,
>
> /* Create donor inode */
> memset(tmp_inode_name, 0, PATH_MAX + 8);
> - sprintf(tmp_inode_name, "%.*s.defrag",
> - (int)strnlen(file, PATH_MAX), file);
> - donor_fd = open64(tmp_inode_name, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR);
> + /* Try O_TMPFILE first, to avoid changing directory mtime */
> + sprintf(tmp_inode_name, "%.*s", (int)strnlen(file, PATH_MAX), file);
> + donor_fd = open64(dirname(tmp_inode_name), O_TMPFILE | O_WRONLY | O_EXCL, S_IRUSR | S_IWUSR );
> if (donor_fd < 0) {
> - if (mode_flag & DETAIL) {
> - PRINT_FILE_NAME(file);
> - if (errno == EEXIST)
> - PRINT_ERR_MSG_WITH_ERRNO(
> - "File is being defraged by other program");
> - else
> - PRINT_ERR_MSG_WITH_ERRNO(NGMSG_FILE_OPEN);
> + sprintf(tmp_inode_name, "%.*s.defrag",
> + (int)strnlen(file, PATH_MAX), file);
> + donor_fd = open64(tmp_inode_name, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR);
> + if (donor_fd < 0) {
> + if (mode_flag & DETAIL) {
> + PRINT_FILE_NAME(file);
> + if (errno == EEXIST)
> + PRINT_ERR_MSG_WITH_ERRNO(
> + "File is being defraged by other program");
> + else
> + PRINT_ERR_MSG_WITH_ERRNO(NGMSG_FILE_OPEN);
> + }
> + goto out;
> }
> - goto out;
> - }
>
> - /* Unlink donor inode */
> - ret = unlink(tmp_inode_name);
> - if (ret < 0) {
> - if (mode_flag & DETAIL) {
> - PRINT_FILE_NAME(file);
> - PRINT_ERR_MSG_WITH_ERRNO("Failed to unlink");
> + /* Unlink donor inode */
> + ret = unlink(tmp_inode_name);
> + if (ret < 0) {
> + if (mode_flag & DETAIL) {
> + PRINT_FILE_NAME(file);
> + PRINT_ERR_MSG_WITH_ERRNO("Failed to unlink");
> + }
> + goto out;
> }
> - goto out;
> }
> -
> +
> /* Allocate space for donor inode */
> orig_group_tmp = orig_group_head;
> do {
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
 		 	   		  

[-- Attachment #2: DEFRAG_O_TMPFILE --]
[-- Type: application/octet-stream, Size: 2047 bytes --]

diff --git a/misc/e4defrag.c b/misc/e4defrag.c
index d0eac60..8001182 100644
--- a/misc/e4defrag.c
+++ b/misc/e4defrag.c
@@ -40,6 +40,7 @@
 #include <sys/stat.h>
 #include <sys/statfs.h>
 #include <sys/vfs.h>
+#include <libgen.h>
 
 /* A relatively new ioctl interface ... */
 #ifndef EXT4_IOC_MOVE_EXT
@@ -1526,31 +1527,36 @@ static int file_defrag(const char *file, const struct stat64 *buf,
 
 	/* Create donor inode */
 	memset(tmp_inode_name, 0, PATH_MAX + 8);
-	sprintf(tmp_inode_name, "%.*s.defrag",
-				(int)strnlen(file, PATH_MAX), file);
-	donor_fd = open64(tmp_inode_name, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR);
+	/* Try O_TMPFILE first, to avoid changing directory mtime */
+	sprintf(tmp_inode_name, "%.*s", (int)strnlen(file, PATH_MAX), file);
+	donor_fd = open64(dirname(tmp_inode_name), O_TMPFILE | O_WRONLY | O_EXCL, S_IRUSR | S_IWUSR );
 	if (donor_fd < 0) {
-		if (mode_flag & DETAIL) {
-			PRINT_FILE_NAME(file);
-			if (errno == EEXIST)
-				PRINT_ERR_MSG_WITH_ERRNO(
-				"File is being defraged by other program");
-			else
-				PRINT_ERR_MSG_WITH_ERRNO(NGMSG_FILE_OPEN);
+		sprintf(tmp_inode_name, "%.*s.defrag",
+					(int)strnlen(file, PATH_MAX), file);
+		donor_fd = open64(tmp_inode_name, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR);
+		if (donor_fd < 0) {
+			if (mode_flag & DETAIL) {
+				PRINT_FILE_NAME(file);
+				if (errno == EEXIST)
+					PRINT_ERR_MSG_WITH_ERRNO(
+					"File is being defraged by other program");
+				else
+					PRINT_ERR_MSG_WITH_ERRNO(NGMSG_FILE_OPEN);
+			}
+			goto out;
 		}
-		goto out;
-	}
 
-	/* Unlink donor inode */
-	ret = unlink(tmp_inode_name);
-	if (ret < 0) {
-		if (mode_flag & DETAIL) {
-			PRINT_FILE_NAME(file);
-			PRINT_ERR_MSG_WITH_ERRNO("Failed to unlink");
+		/* Unlink donor inode */
+		ret = unlink(tmp_inode_name);
+		if (ret < 0) {
+			if (mode_flag & DETAIL) {
+				PRINT_FILE_NAME(file);
+				PRINT_ERR_MSG_WITH_ERRNO("Failed to unlink");
+			}
+			goto out;
 		}
-		goto out;
 	}
-
+	
 	/* Allocate space for donor inode */
 	orig_group_tmp = orig_group_head;
 	do {

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: possible different donor file naming in e4defrag
  2014-09-11 23:03             ` TR Reardon
@ 2014-09-12 19:41               ` Andreas Dilger
  2014-09-13 20:23                 ` TR Reardon
  0 siblings, 1 reply; 10+ messages in thread
From: Andreas Dilger @ 2014-09-12 19:41 UTC (permalink / raw)
  To: TR Reardon; +Cc: Darrick J. Wong, linux-ext4

[-- Attachment #1: Type: text/plain, Size: 6964 bytes --]

On Sep 11, 2014, at 5:03 PM, TR Reardon <thomas_reardon@hotmail.com> wrote:
> diff --git a/misc/e4defrag.c b/misc/e4defrag.c
> index d0eac60..8001182 100644
> --- a/misc/e4defrag.c
> +++ b/misc/e4defrag.c
> @@ -1526,31 +1527,36 @@ static int file_defrag(const char *file, const struct stat64 *buf,
>  
>  	/* Create donor inode */
>  	memset(tmp_inode_name, 0, PATH_MAX + 8);
> -	sprintf(tmp_inode_name, "%.*s.defrag",
> -				(int)strnlen(file, PATH_MAX), file);
> -	donor_fd = open64(tmp_inode_name, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR);
> +	/* Try O_TMPFILE first, to avoid changing directory mtime */
> +	sprintf(tmp_inode_name, "%.*s", (int)strnlen(file, PATH_MAX), file);
> +	donor_fd = open64(dirname(tmp_inode_name), O_TMPFILE | O_WRONLY | O_EXCL, S_IRUSR | S_IWUSR );

Lines need to be <= 80 columns.  Please run patch through checkpatch.pl.

Why is it opened O_WRONLY, but the permissions are S_IRUSR | S_IWUSR?
I agree it didn't make sense in the old code to have S_IRUSR either,
but I don't think this makes more sense.  If the file is write-only
(which is probably correct, unless e4defrag is doing some post-copy
checksum of the data) then S_IWUSR would be enough.

>  	if (donor_fd < 0) {
> -		if (mode_flag & DETAIL) {
> -			PRINT_FILE_NAME(file);
> -			if (errno == EEXIST)
> -				PRINT_ERR_MSG_WITH_ERRNO(
> -				"File is being defraged by other program");
> -			else
> -				PRINT_ERR_MSG_WITH_ERRNO(NGMSG_FILE_OPEN);
> +		sprintf(tmp_inode_name, "%.*s.defrag",
> +					(int)strnlen(file, PATH_MAX), file);
> +		donor_fd = open64(tmp_inode_name, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR);

Wrap at 80 columns.

This has the same issue with O_WRONLY and S_IRUSR, though it at least
matches the old code.

> +		if (donor_fd < 0) {
> +			if (mode_flag & DETAIL) {
> +				PRINT_FILE_NAME(file);
> +				if (errno == EEXIST)
> +					PRINT_ERR_MSG_WITH_ERRNO(
> +					"File is being defraged by other program");
> +				else
> +					PRINT_ERR_MSG_WITH_ERRNO(NGMSG_FILE_OPEN);
> +			}
> +			goto out;
>  		}
> -		goto out;
> -	}
>  
> -	/* Unlink donor inode */
> -	ret = unlink(tmp_inode_name);
> -	if (ret < 0) {
> -		if (mode_flag & DETAIL) {
> -			PRINT_FILE_NAME(file);
> -			PRINT_ERR_MSG_WITH_ERRNO("Failed to unlink");
> +		/* Unlink donor inode */
> +		ret = unlink(tmp_inode_name);
> +		if (ret < 0) {
> +			if (mode_flag & DETAIL) {
> +				PRINT_FILE_NAME(file);
> +				PRINT_ERR_MSG_WITH_ERRNO("Failed to unlink");
> +			}
> +			goto out;
>  		}

Shouldn't it reset the timestamp in this case?

Cheers, Andreas

> -		goto out;
>  	}
> -
> +	
>  	/* Allocate space for donor inode */
>  	orig_group_tmp = orig_group_head;
>  	do {


> ----------------------------------------
>> From: thomas_reardon@hotmail.com
>> To: adilger@dilger.ca
>> CC: darrick.wong@oracle.com; linux-ext4@vger.kernel.org
>> Subject: RE: possible different donor file naming in e4defrag
>> Date: Thu, 11 Sep 2014 18:49:07 -0400
>> 
>>> On Sep 11, 2014, at 1:48 PM, TR Reardon <thomas_reardon@hotmail.com> wrote:
>>>> Picking this back up. How would O_TMPFILE avoid races? It definitely avoids the unwanted mtime/atime update, but then the existing "<filename>.defrag" pseudo-lock file would no longer be available. How could you use O_TMPFILE and still avoid multiple defrag? If this isn't possible, then resetting the parent times on unlink(tmpfile), as you suggest, is the simplest way out of this.
>>> 
>>> Looking at this the opposite way - what are the chances that there
>>> will be concurrent defrags on the same file? Basically no chance at
>>> all. So long as it doesn't explode (the kernel would need to protect
>>> against this anyway to avoid malicious apps), the worst case is that
>>> there will be some extra defragmentation done in a very rare case.
>>> 
>>> Conversely, creating a temporary filename and then resetting the
>>> parent directory timestamp is extra work for every file defragmented,
>>> and is racy because e4defrag may "reset" the time to before the temp
>>> file was created, but clobber a legitimate timestamp update in the
>>> directory from some other concurrent update. That timestamp update
>>> is always going to be racy, even if e4defrag tries to be careful.
>>> 
>>> Cheers, Andreas
>> 
>> 
>> Thanks, well described.
>> 
>> So a simple attempt with O_TMPFILE first, then revert to original behavior, like below?
>> 
>> 
>> diff --git a/misc/e4defrag.c b/misc/e4defrag.c
>> index d0eac60..8001182 100644
>> --- a/misc/e4defrag.c
>> +++ b/misc/e4defrag.c
>> @@ -40,6 +40,7 @@
>> #include <sys/stat.h>
>> #include <sys/statfs.h>
>> #include <sys/vfs.h>
>> +#include <libgen.h>
>> 
>> /* A relatively new ioctl interface ... */
>> #ifndef EXT4_IOC_MOVE_EXT
>> @@ -1526,31 +1527,36 @@ static int file_defrag(const char *file, const struct stat64 *buf,
>> 
>> /* Create donor inode */
>> memset(tmp_inode_name, 0, PATH_MAX + 8);
>> - sprintf(tmp_inode_name, "%.*s.defrag",
>> - (int)strnlen(file, PATH_MAX), file);
>> - donor_fd = open64(tmp_inode_name, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR);
>> + /* Try O_TMPFILE first, to avoid changing directory mtime */
>> + sprintf(tmp_inode_name, "%.*s", (int)strnlen(file, PATH_MAX), file);
>> + donor_fd = open64(dirname(tmp_inode_name), O_TMPFILE | O_WRONLY | O_EXCL, S_IRUSR | S_IWUSR );
>> if (donor_fd < 0) {
>> - if (mode_flag & DETAIL) {
>> - PRINT_FILE_NAME(file);
>> - if (errno == EEXIST)
>> - PRINT_ERR_MSG_WITH_ERRNO(
>> - "File is being defraged by other program");
>> - else
>> - PRINT_ERR_MSG_WITH_ERRNO(NGMSG_FILE_OPEN);
>> + sprintf(tmp_inode_name, "%.*s.defrag",
>> + (int)strnlen(file, PATH_MAX), file);
>> + donor_fd = open64(tmp_inode_name, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR);
>> + if (donor_fd < 0) {
>> + if (mode_flag & DETAIL) {
>> + PRINT_FILE_NAME(file);
>> + if (errno == EEXIST)
>> + PRINT_ERR_MSG_WITH_ERRNO(
>> + "File is being defraged by other program");
>> + else
>> + PRINT_ERR_MSG_WITH_ERRNO(NGMSG_FILE_OPEN);
>> + }
>> + goto out;
>> }
>> - goto out;
>> - }
>> 
>> - /* Unlink donor inode */
>> - ret = unlink(tmp_inode_name);
>> - if (ret < 0) {
>> - if (mode_flag & DETAIL) {
>> - PRINT_FILE_NAME(file);
>> - PRINT_ERR_MSG_WITH_ERRNO("Failed to unlink");
>> + /* Unlink donor inode */
>> + ret = unlink(tmp_inode_name);
>> + if (ret < 0) {
>> + if (mode_flag & DETAIL) {
>> + PRINT_FILE_NAME(file);
>> + PRINT_ERR_MSG_WITH_ERRNO("Failed to unlink");
>> + }
>> + goto out;
>> }
>> - goto out;
>> }
>> -
>> +
>> /* Allocate space for donor inode */
>> orig_group_tmp = orig_group_head;
>> do {
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> 		 	   		  <DEFRAG_O_TMPFILE>


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP using GPGMail --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: possible different donor file naming in e4defrag
  2014-09-12 19:41               ` Andreas Dilger
@ 2014-09-13 20:23                 ` TR Reardon
  0 siblings, 0 replies; 10+ messages in thread
From: TR Reardon @ 2014-09-13 20:23 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Darrick J. Wong, linux-ext4

[-- Attachment #1: Type: text/plain, Size: 5928 bytes --]

> Subject: Re: possible different donor file naming in e4defrag
> From: adilger@dilger.ca
> Date: Fri, 12 Sep 2014 13:41:17 -0600
> CC: darrick.wong@oracle.com; linux-ext4@vger.kernel.org
> To: thomas_reardon@hotmail.com
>
> On Sep 11, 2014, at 5:03 PM, TR Reardon <thomas_reardon@hotmail.com> wrote:
>> diff --git a/misc/e4defrag.c b/misc/e4defrag.c
>> index d0eac60..8001182 100644
>> --- a/misc/e4defrag.c
>> +++ b/misc/e4defrag.c
>> @@ -1526,31 +1527,36 @@ static int file_defrag(const char *file, const struct stat64 *buf,
>>
>> /* Create donor inode */
>> memset(tmp_inode_name, 0, PATH_MAX + 8);
>> - sprintf(tmp_inode_name, "%.*s.defrag",
>> - (int)strnlen(file, PATH_MAX), file);
>> - donor_fd = open64(tmp_inode_name, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR);
>> + /* Try O_TMPFILE first, to avoid changing directory mtime */
>> + sprintf(tmp_inode_name, "%.*s", (int)strnlen(file, PATH_MAX), file);
>> + donor_fd = open64(dirname(tmp_inode_name), O_TMPFILE | O_WRONLY | O_EXCL, S_IRUSR | S_IWUSR );
>
> Lines need to be <= 80 columns. Please run patch through checkpatch.pl.
>
> Why is it opened O_WRONLY, but the permissions are S_IRUSR | S_IWUSR?
> I agree it didn't make sense in the old code to have S_IRUSR either,
> but I don't think this makes more sense. If the file is write-only
> (which is probably correct, unless e4defrag is doing some post-copy
> checksum of the data) then S_IWUSR would be enough.


I agree, wasn't sure if appropriate to change existing. I have changed in updated.


>> if (donor_fd < 0) {
>> - if (mode_flag & DETAIL) {
>> - PRINT_FILE_NAME(file);
>> - if (errno == EEXIST)
>> - PRINT_ERR_MSG_WITH_ERRNO(
>> - "File is being defraged by other program");
>> - else
>> - PRINT_ERR_MSG_WITH_ERRNO(NGMSG_FILE_OPEN);
>> + sprintf(tmp_inode_name, "%.*s.defrag",
>> + (int)strnlen(file, PATH_MAX), file);
>> + donor_fd = open64(tmp_inode_name, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR);
>
> Wrap at 80 columns.
>
> This has the same issue with O_WRONLY and S_IRUSR, though it at least
> matches the old code.
>
>> + if (donor_fd < 0) {
>> + if (mode_flag & DETAIL) {
>> + PRINT_FILE_NAME(file);
>> + if (errno == EEXIST)
>> + PRINT_ERR_MSG_WITH_ERRNO(
>> + "File is being defraged by other program");
>> + else
>> + PRINT_ERR_MSG_WITH_ERRNO(NGMSG_FILE_OPEN);
>> + }
>> + goto out;
>> }
>> - goto out;
>> - }
>>
>> - /* Unlink donor inode */
>> - ret = unlink(tmp_inode_name);
>> - if (ret < 0) {
>> - if (mode_flag & DETAIL) {
>> - PRINT_FILE_NAME(file);
>> - PRINT_ERR_MSG_WITH_ERRNO("Failed to unlink");
>> + /* Unlink donor inode */
>> + ret = unlink(tmp_inode_name);
>> + if (ret < 0) {
>> + if (mode_flag & DETAIL) {
>> + PRINT_FILE_NAME(file);
>> + PRINT_ERR_MSG_WITH_ERRNO("Failed to unlink");
>> + }
>> + goto out;
>> }
>
> Shouldn't it reset the timestamp in this case?
>
> Cheers, Andreas


Oh, I thought you were arguing that it should collapse to only the O_TMPFILE case to avoid unnecessary races. Updated handles it in both cases, but prefers O_TMPFILE.


(attached has proper whitespace)

--
Signed-off-by: TR Reardon <thomas_reardon@hotmail.com>
--
diff --git a/misc/e4defrag.c b/misc/e4defrag.c
index d0eac60..2facf44 100644
--- a/misc/e4defrag.c
+++ b/misc/e4defrag.c
@@ -40,6 +40,7 @@
 #include <sys/stat.h>
 #include <sys/statfs.h>
 #include <sys/vfs.h>
+#include <libgen.h>

 /* A relatively new ioctl interface ... */
 #ifndef EXT4_IOC_MOVE_EXT
@@ -1408,6 +1409,8 @@ static int file_defrag(const char *file, const struct stat64 *buf,
 int file_frags_start, file_frags_end;
 int orig_physical_cnt, donor_physical_cnt = 0;
 char tmp_inode_name[PATH_MAX + 8];
+ char *parent_name = NULL;
+ struct stat parent_stat;
 ext4_fsblk_t blk_count = 0;
 struct fiemap_extent_list *orig_list_physical = NULL;
 struct fiemap_extent_list *orig_list_logical = NULL;
@@ -1526,29 +1529,51 @@ static int file_defrag(const char *file, const struct stat64 *buf,

 /* Create donor inode */
 memset(tmp_inode_name, 0, PATH_MAX + 8);
- sprintf(tmp_inode_name, "%.*s.defrag",
- (int)strnlen(file, PATH_MAX), file);
- donor_fd = open64(tmp_inode_name, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR);
+ /* Try O_TMPFILE first, to avoid changing directory mtime */
+ sprintf(tmp_inode_name, "%.*s", (int)strnlen(file, PATH_MAX), file);
+ parent_name = dirname(tmp_inode_name);
+ donor_fd = open64(parent_name, O_WRONLY | O_TMPFILE | O_EXCL, S_IWUSR);
 if (donor_fd < 0) {
- if (mode_flag & DETAIL) {
- PRINT_FILE_NAME(file);
- if (errno == EEXIST)
+ /* Save parent timestamps for reset */
+ ret = stat(parent_name, &parent_stat);
+ if (ret < 0) {
+ if (mode_flag & DETAIL) {
+ PRINT_FILE_NAME(file);
 PRINT_ERR_MSG_WITH_ERRNO(
- "File is being defraged by other program");
- else
- PRINT_ERR_MSG_WITH_ERRNO(NGMSG_FILE_OPEN);
+ "Failed to stat() parent directory");
+ }
+ goto out;
 }
- goto out;
- }

- /* Unlink donor inode */
- ret = unlink(tmp_inode_name);
- if (ret < 0) {
- if (mode_flag & DETAIL) {
- PRINT_FILE_NAME(file);
- PRINT_ERR_MSG_WITH_ERRNO("Failed to unlink");
+ sprintf(tmp_inode_name, "%.*s.defrag",
+ (int)strnlen(file, PATH_MAX), file);
+ donor_fd = open64(tmp_inode_name, O_WRONLY | O_CREAT | O_EXCL,
+ S_IWUSR);
+ if (donor_fd < 0) {
+ if (mode_flag & DETAIL) {
+ PRINT_FILE_NAME(file);
+ if (errno == EEXIST)
+ PRINT_ERR_MSG_WITH_ERRNO(
+ "File is being defraged by other program");
+ else
+ PRINT_ERR_MSG_WITH_ERRNO(
+ NGMSG_FILE_OPEN);
+ }
+ goto out;
 }
- goto out;
+
+ /* Unlink donor inode */
+ ret = unlink(tmp_inode_name);
+ if (ret < 0) {
+ if (mode_flag & DETAIL) {
+ PRINT_FILE_NAME(file);
+ PRINT_ERR_MSG_WITH_ERRNO("Failed to unlink");
+ }
+ goto out;
+ }
+
+ /* Reset parent times */
+ utimensat(0, parent_name, &parent_stat.st_atim, 0);
 }

 /* Allocate space for donor inode */ 		 	   		  

[-- Attachment #2: DEFRAG_O_TMPFILE_v2 --]
[-- Type: application/octet-stream, Size: 2819 bytes --]

Signed-off-by: TR Reardon <thomas_reardon@hotmail.com>
--
diff --git a/misc/e4defrag.c b/misc/e4defrag.c
index d0eac60..2facf44 100644
--- a/misc/e4defrag.c
+++ b/misc/e4defrag.c
@@ -40,6 +40,7 @@
 #include <sys/stat.h>
 #include <sys/statfs.h>
 #include <sys/vfs.h>
+#include <libgen.h>
 
 /* A relatively new ioctl interface ... */
 #ifndef EXT4_IOC_MOVE_EXT
@@ -1408,6 +1409,8 @@ static int file_defrag(const char *file, const struct stat64 *buf,
 	int	file_frags_start, file_frags_end;
 	int	orig_physical_cnt, donor_physical_cnt = 0;
 	char	tmp_inode_name[PATH_MAX + 8];
+	char	*parent_name = NULL;
+	struct stat parent_stat;
 	ext4_fsblk_t			blk_count = 0;
 	struct fiemap_extent_list	*orig_list_physical = NULL;
 	struct fiemap_extent_list	*orig_list_logical = NULL;
@@ -1526,29 +1529,51 @@ static int file_defrag(const char *file, const struct stat64 *buf,
 
 	/* Create donor inode */
 	memset(tmp_inode_name, 0, PATH_MAX + 8);
-	sprintf(tmp_inode_name, "%.*s.defrag",
-				(int)strnlen(file, PATH_MAX), file);
-	donor_fd = open64(tmp_inode_name, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR);
+	/* Try O_TMPFILE first, to avoid changing directory mtime */
+	sprintf(tmp_inode_name, "%.*s", (int)strnlen(file, PATH_MAX), file);
+	parent_name = dirname(tmp_inode_name);
+	donor_fd = open64(parent_name, O_WRONLY | O_TMPFILE | O_EXCL, S_IWUSR);
 	if (donor_fd < 0) {
-		if (mode_flag & DETAIL) {
-			PRINT_FILE_NAME(file);
-			if (errno == EEXIST)
+		/* Save parent timestamps for reset */
+		ret = stat(parent_name, &parent_stat);
+		if (ret < 0) {
+			if (mode_flag & DETAIL) {
+				PRINT_FILE_NAME(file);
 				PRINT_ERR_MSG_WITH_ERRNO(
-				"File is being defraged by other program");
-			else
-				PRINT_ERR_MSG_WITH_ERRNO(NGMSG_FILE_OPEN);
+				"Failed to stat() parent directory");
+			}
+			goto out;
 		}
-		goto out;
-	}
 
-	/* Unlink donor inode */
-	ret = unlink(tmp_inode_name);
-	if (ret < 0) {
-		if (mode_flag & DETAIL) {
-			PRINT_FILE_NAME(file);
-			PRINT_ERR_MSG_WITH_ERRNO("Failed to unlink");
+		sprintf(tmp_inode_name, "%.*s.defrag",
+					(int)strnlen(file, PATH_MAX), file);
+		donor_fd = open64(tmp_inode_name, O_WRONLY | O_CREAT | O_EXCL,
+					S_IWUSR);
+		if (donor_fd < 0) {
+			if (mode_flag & DETAIL) {
+				PRINT_FILE_NAME(file);
+				if (errno == EEXIST)
+					PRINT_ERR_MSG_WITH_ERRNO(
+					"File is being defraged by other program");
+				else
+					PRINT_ERR_MSG_WITH_ERRNO(
+						NGMSG_FILE_OPEN);
+			}
+			goto out;
 		}
-		goto out;
+
+		/* Unlink donor inode */
+		ret = unlink(tmp_inode_name);
+		if (ret < 0) {
+			if (mode_flag & DETAIL) {
+				PRINT_FILE_NAME(file);
+				PRINT_ERR_MSG_WITH_ERRNO("Failed to unlink");
+			}
+			goto out;
+		}
+
+		/* Reset parent times */
+		utimensat(0, parent_name, &parent_stat.st_atim, 0);
 	}
 
 	/* Allocate space for donor inode */

^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2014-09-13 20:23 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-15 20:12 possible different donor file naming in e4defrag TR Reardon
2014-08-15 20:39 ` Darrick J. Wong
2014-08-15 20:45   ` TR Reardon
2014-08-15 21:04     ` Andreas Dilger
2014-09-11 19:48       ` TR Reardon
2014-09-11 21:19         ` Andreas Dilger
2014-09-11 22:49           ` TR Reardon
2014-09-11 23:03             ` TR Reardon
2014-09-12 19:41               ` Andreas Dilger
2014-09-13 20:23                 ` TR Reardon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.