linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: tmpfs inode leakage when opening file with O_TMP_FILE
       [not found] <CAHMF36F4JN44Y-yMnxw36A8cO0yVUQhAkvJDcj_gbWbsuUAA5A@mail.gmail.com>
@ 2019-02-14 23:44 ` Andrew Morton
  2019-02-15  0:26   ` Darrick J. Wong
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2019-02-14 23:44 UTC (permalink / raw)
  To: Matej Kupljen; +Cc: linux-kernel, linux-fsdevel

(cc linux-fsdevel)

On Mon, 11 Feb 2019 15:18:11 +0100 Matej Kupljen <matej.kupljen@gmail.com> wrote:

> Hi,
> 
> it seems that when opening file on file system that is mounted on
> tmpfs with the O_TMPFILE flag and using linkat call after that, it
> uses 2 inodes instead of 1.
> 
> This is simple test case:
> 
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <string.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <linux/limits.h>
> #include <errno.h>
> 
> #define TEST_STRING     "Testing\n"
> 
> #define TMP_PATH        "/tmp/ping/"
> #define TMP_FILE        "file.txt"
> 
> 
> int main(int argc, char* argv[])
> {
>         char path[PATH_MAX];
>         int fd;
>         int rc;
> 
>         fd = open(TMP_PATH, __O_TMPFILE | O_RDWR,
>                         S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP |
> S_IROTH | S_IWOTH);
> 
>         rc = write(fd, TEST_STRING, strlen(TEST_STRING));
> 
>         snprintf(path, PATH_MAX,  "/proc/self/fd/%d", fd);
>         linkat(AT_FDCWD, path, AT_FDCWD, TMP_PATH TMP_FILE, AT_SYMLINK_FOLLOW);
>         close(fd);
> 
>         return 0;
> }
> 
> I have checked indoes with "df -i" tool. The first inode is used when
> the call to open is executed and the second one when the call to
> linkat is executed.
> It is not decreased when close is executed.
> 
> I have also tested this on an ext4 mounted fs and there only one inode is used.
> 
> I tested this on:
> $ cat /etc/lsb-release
> DISTRIB_ID=Ubuntu
> DISTRIB_RELEASE=18.04
> DISTRIB_CODENAME=bionic
> DISTRIB_DESCRIPTION="Ubuntu 18.04.1 LTS"
> 
> $ uname -a
> Linux Orion 4.15.0-43-generic #46-Ubuntu SMP Thu Dec 6 14:45:28 UTC
> 2018 x86_64 x86_64 x86_64 GNU/Linux
> 
> If you need any more information, please let me know.
> 
> And please CC me when replying, I am not subscribed to the list.
> 
> Thanks and BR,
> Matej

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: tmpfs inode leakage when opening file with O_TMP_FILE
  2019-02-14 23:44 ` tmpfs inode leakage when opening file with O_TMP_FILE Andrew Morton
@ 2019-02-15  0:26   ` Darrick J. Wong
  2019-02-15 10:38     ` Hugh Dickins
  0 siblings, 1 reply; 5+ messages in thread
From: Darrick J. Wong @ 2019-02-15  0:26 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Matej Kupljen, linux-kernel, linux-fsdevel, linux-mm, hughd

[cc the shmem maintainer and the mm list]

On Thu, Feb 14, 2019 at 03:44:02PM -0800, Andrew Morton wrote:
> (cc linux-fsdevel)
> 
> On Mon, 11 Feb 2019 15:18:11 +0100 Matej Kupljen <matej.kupljen@gmail.com> wrote:
> 
> > Hi,
> > 
> > it seems that when opening file on file system that is mounted on
> > tmpfs with the O_TMPFILE flag and using linkat call after that, it
> > uses 2 inodes instead of 1.
> > 
> > This is simple test case:
> > 
> > #include <sys/types.h>
> > #include <sys/stat.h>
> > #include <fcntl.h>
> > #include <unistd.h>
> > #include <string.h>
> > #include <stdio.h>
> > #include <stdlib.h>
> > #include <linux/limits.h>
> > #include <errno.h>
> > 
> > #define TEST_STRING     "Testing\n"
> > 
> > #define TMP_PATH        "/tmp/ping/"
> > #define TMP_FILE        "file.txt"
> > 
> > 
> > int main(int argc, char* argv[])
> > {
> >         char path[PATH_MAX];
> >         int fd;
> >         int rc;
> > 
> >         fd = open(TMP_PATH, __O_TMPFILE | O_RDWR,
> >                         S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP |
> > S_IROTH | S_IWOTH);
> > 
> >         rc = write(fd, TEST_STRING, strlen(TEST_STRING));
> > 
> >         snprintf(path, PATH_MAX,  "/proc/self/fd/%d", fd);
> >         linkat(AT_FDCWD, path, AT_FDCWD, TMP_PATH TMP_FILE, AT_SYMLINK_FOLLOW);
> >         close(fd);
> > 
> >         return 0;
> > }
> > 
> > I have checked indoes with "df -i" tool. The first inode is used when
> > the call to open is executed and the second one when the call to
> > linkat is executed.
> > It is not decreased when close is executed.
> > 
> > I have also tested this on an ext4 mounted fs and there only one inode is used.
> > 
> > I tested this on:
> > $ cat /etc/lsb-release
> > DISTRIB_ID=Ubuntu
> > DISTRIB_RELEASE=18.04
> > DISTRIB_CODENAME=bionic
> > DISTRIB_DESCRIPTION="Ubuntu 18.04.1 LTS"
> > 
> > $ uname -a
> > Linux Orion 4.15.0-43-generic #46-Ubuntu SMP Thu Dec 6 14:45:28 UTC
> > 2018 x86_64 x86_64 x86_64 GNU/Linux

Heh, tmpfs and its weird behavior where each new link counts as a new
inode because "each new link needs a new dentry, pinning lowmem, and
tmpfs dentries cannot be pruned until they are unlinked."

It seems to have this behavior on 5.0-rc6 too:

$ /bin/df -i /tmp ; ./c ; /bin/df -i /tmp
Filesystem      Inodes IUsed   IFree IUse% Mounted on
tmp            1019110    17 1019093    1% /tmp
Filesystem      Inodes IUsed   IFree IUse% Mounted on
tmp            1019110    19 1019091    1% /tmp

Probably because shmem_tmpfile -> shmem_get_inode -> shmem_reserve_inode
which decrements ifree when we create the tmpfile, and then the
d_tmpfile decrements i_nlink to zero.  Now we have iused=1, nlink=0,
assuming iused=itotal-ifree like usual.

Then the linkat call does:

shmem_link -> shmem_reserve_inode

which decrements ifree again and increments i_nlink to 1.  Now we have
iused=2, nlink=1.

The program exits, which closes the file.  /tmp/ping/file.txt still
exists and we haven't evicted inodes yet, so nothing much happens.

But then I added in rm -rf /tmp/ping/file.txt to see what happens.
shmem_unlink contains this:

	if (inode->i_nlink > 1 && !S_ISDIR(inode->i_mode))
		shmem_free_inode(inode->i_sb);

So shmem_iunlink *doesnt* decrement ifree but does drop the nlink, so
our state is now iused=2, nlink=0.

Now we evict the inode, which decrements ifree, so iused=1 and the inode
goes away.  Oops, we just leaked an ifree.

I /think/ the proper fix is to change shmem_link to decrement ifree only
if the inode has nonzero nlink, e.g.

	/*
	 * No ordinary (disk based) filesystem counts links as inodes;
	 * but each new link needs a new dentry, pinning lowmem, and
	 * tmpfs dentries cannot be pruned until they are unlinked.  If
	 * we're linking an O_TMPFILE file into the tmpfs we can skip
	 * this because there's still only one link to the inode.
	 */
	if (inode->i_nlink > 0) {
		ret = shmem_reserve_inode(inode->i_sb);
		if (ret)
			goto out;
	}

Says me who was crawling around poking at O_TMPFILE behavior all morning.
Not sure if that's right; what happens to the old dentry?

--D

> > If you need any more information, please let me know.
> > 
> > And please CC me when replying, I am not subscribed to the list.
> > 
> > Thanks and BR,
> > Matej

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: tmpfs inode leakage when opening file with O_TMP_FILE
  2019-02-15  0:26   ` Darrick J. Wong
@ 2019-02-15 10:38     ` Hugh Dickins
  2019-02-19  4:23       ` Hugh Dickins
  0 siblings, 1 reply; 5+ messages in thread
From: Hugh Dickins @ 2019-02-15 10:38 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Andrew Morton, Matej Kupljen, linux-kernel, linux-fsdevel,
	linux-mm, hughd

On Thu, 14 Feb 2019, Darrick J. Wong wrote:
> [cc the shmem maintainer and the mm list]

Yup, thanks - Matej also did so the day after sending to linux-kernel.

> 
> On Thu, Feb 14, 2019 at 03:44:02PM -0800, Andrew Morton wrote:
> > (cc linux-fsdevel)

Okay, thanks, but a tmpfs peculiarity we think.

> > 
> > On Mon, 11 Feb 2019 15:18:11 +0100 Matej Kupljen <matej.kupljen@gmail.com> wrote:
> > 
> > > Hi,
> > > 
> > > it seems that when opening file on file system that is mounted on
> > > tmpfs with the O_TMPFILE flag and using linkat call after that, it
> > > uses 2 inodes instead of 1.
> > > 
> > > This is simple test case:
> > > 
> > > #include <sys/types.h>
> > > #include <sys/stat.h>
> > > #include <fcntl.h>
> > > #include <unistd.h>
> > > #include <string.h>
> > > #include <stdio.h>
> > > #include <stdlib.h>
> > > #include <linux/limits.h>
> > > #include <errno.h>
> > > 
> > > #define TEST_STRING     "Testing\n"
> > > 
> > > #define TMP_PATH        "/tmp/ping/"
> > > #define TMP_FILE        "file.txt"
> > > 
> > > 
> > > int main(int argc, char* argv[])
> > > {
> > >         char path[PATH_MAX];
> > >         int fd;
> > >         int rc;
> > > 
> > >         fd = open(TMP_PATH, __O_TMPFILE | O_RDWR,
> > >                         S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP |
> > > S_IROTH | S_IWOTH);
> > > 
> > >         rc = write(fd, TEST_STRING, strlen(TEST_STRING));
> > > 
> > >         snprintf(path, PATH_MAX,  "/proc/self/fd/%d", fd);
> > >         linkat(AT_FDCWD, path, AT_FDCWD, TMP_PATH TMP_FILE, AT_SYMLINK_FOLLOW);
> > >         close(fd);
> > > 
> > >         return 0;
> > > }
> > > 
> > > I have checked indoes with "df -i" tool. The first inode is used when
> > > the call to open is executed and the second one when the call to
> > > linkat is executed.
> > > It is not decreased when close is executed.
> > > 
> > > I have also tested this on an ext4 mounted fs and there only one inode is used.
> > > 
> > > I tested this on:
> > > $ cat /etc/lsb-release
> > > DISTRIB_ID=Ubuntu
> > > DISTRIB_RELEASE=18.04
> > > DISTRIB_CODENAME=bionic
> > > DISTRIB_DESCRIPTION="Ubuntu 18.04.1 LTS"
> > > 
> > > $ uname -a
> > > Linux Orion 4.15.0-43-generic #46-Ubuntu SMP Thu Dec 6 14:45:28 UTC
> > > 2018 x86_64 x86_64 x86_64 GNU/Linux
> 
> Heh, tmpfs and its weird behavior where each new link counts as a new
> inode because "each new link needs a new dentry, pinning lowmem, and
> tmpfs dentries cannot be pruned until they are unlinked."

That's very much a peculiarity of tmpfs, so agreed: it's what I expect
to be the cause, but I've not actually tracked it through and fixed yet.

> 
> It seems to have this behavior on 5.0-rc6 too:

Yes, it does.

> 
> $ /bin/df -i /tmp ; ./c ; /bin/df -i /tmp
> Filesystem      Inodes IUsed   IFree IUse% Mounted on
> tmp            1019110    17 1019093    1% /tmp
> Filesystem      Inodes IUsed   IFree IUse% Mounted on
> tmp            1019110    19 1019091    1% /tmp
> 
> Probably because shmem_tmpfile -> shmem_get_inode -> shmem_reserve_inode
> which decrements ifree when we create the tmpfile, and then the
> d_tmpfile decrements i_nlink to zero.  Now we have iused=1, nlink=0,
> assuming iused=itotal-ifree like usual.
> 
> Then the linkat call does:
> 
> shmem_link -> shmem_reserve_inode
> 
> which decrements ifree again and increments i_nlink to 1.  Now we have
> iused=2, nlink=1.
> 
> The program exits, which closes the file.  /tmp/ping/file.txt still
> exists and we haven't evicted inodes yet, so nothing much happens.
> 
> But then I added in rm -rf /tmp/ping/file.txt to see what happens.
> shmem_unlink contains this:
> 
> 	if (inode->i_nlink > 1 && !S_ISDIR(inode->i_mode))
> 		shmem_free_inode(inode->i_sb);
> 
> So shmem_iunlink *doesnt* decrement ifree but does drop the nlink, so
> our state is now iused=2, nlink=0.
> 
> Now we evict the inode, which decrements ifree, so iused=1 and the inode
> goes away.  Oops, we just leaked an ifree.
> 
> I /think/ the proper fix is to change shmem_link to decrement ifree only
> if the inode has nonzero nlink, e.g.
> 
> 	/*
> 	 * No ordinary (disk based) filesystem counts links as inodes;
> 	 * but each new link needs a new dentry, pinning lowmem, and
> 	 * tmpfs dentries cannot be pruned until they are unlinked.  If
> 	 * we're linking an O_TMPFILE file into the tmpfs we can skip
> 	 * this because there's still only one link to the inode.
> 	 */
> 	if (inode->i_nlink > 0) {
> 		ret = shmem_reserve_inode(inode->i_sb);
> 		if (ret)
> 			goto out;
> 	}
> 
> Says me who was crawling around poking at O_TMPFILE behavior all morning.

Thanks for the Cc on that patch: I thought at first that you were
coincidentally fixing up Matej's observation, but from its commit
message no.  That work is just a generic cleanup to suit XFS needs,
and won't change the tmpfs behavior one way or the other.

> Not sure if that's right; what happens to the old dentry?

I'm relieved to see your "/think/" above and "Not sure" there :)
Me too.  It is so easy to get these counting things wrong, especially
when distributed between the generic and the specific file system.

I'm not going to attempt a pronouncement until I've had time to
sink properly into it at the weekend, when I'll follow your guide
and work it through - thanks a lot for getting this far, Darrick.

Hugh

> 
> --D
> 
> > > If you need any more information, please let me know.
> > > 
> > > And please CC me when replying, I am not subscribed to the list.
> > > 
> > > Thanks and BR,
> > > Matej

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: tmpfs inode leakage when opening file with O_TMP_FILE
  2019-02-15 10:38     ` Hugh Dickins
@ 2019-02-19  4:23       ` Hugh Dickins
  2019-02-19  4:34         ` Darrick J. Wong
  0 siblings, 1 reply; 5+ messages in thread
From: Hugh Dickins @ 2019-02-19  4:23 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Hugh Dickins, Andrew Morton, Matej Kupljen, linux-kernel,
	linux-fsdevel, linux-mm

On Fri, 15 Feb 2019, Hugh Dickins wrote:
> On Thu, 14 Feb 2019, Darrick J. Wong wrote:
> > > On Mon, 11 Feb 2019 15:18:11 +0100 Matej Kupljen <matej.kupljen@gmail.com> wrote:
> > > > 
> > > > it seems that when opening file on file system that is mounted on
> > > > tmpfs with the O_TMPFILE flag and using linkat call after that, it
> > > > uses 2 inodes instead of 1.
...
> > 
> > Heh, tmpfs and its weird behavior where each new link counts as a new
> > inode because "each new link needs a new dentry, pinning lowmem, and
> > tmpfs dentries cannot be pruned until they are unlinked."
> 
> That's very much a peculiarity of tmpfs, so agreed: it's what I expect
> to be the cause, but I've not actually tracked it through and fixed yet.
...
> 
> > I /think/ the proper fix is to change shmem_link to decrement ifree only
> > if the inode has nonzero nlink, e.g.
> > 
> > 	/*
> > 	 * No ordinary (disk based) filesystem counts links as inodes;
> > 	 * but each new link needs a new dentry, pinning lowmem, and
> > 	 * tmpfs dentries cannot be pruned until they are unlinked.  If
> > 	 * we're linking an O_TMPFILE file into the tmpfs we can skip
> > 	 * this because there's still only one link to the inode.
> > 	 */
> > 	if (inode->i_nlink > 0) {
> > 		ret = shmem_reserve_inode(inode->i_sb);
> > 		if (ret)
> > 			goto out;
> > 	}
> > 
> > Says me who was crawling around poking at O_TMPFILE behavior all morning.
> > Not sure if that's right; what happens to the old dentry?

Not sure what you mean by "what happens to the old dentry?"
But certainly the accounting feels a bit like a shell game,
and my attempts to explain it have not satisfied even me.

The way I'm finding it helpful to think, is to imagine tmpfs'
count of inodes actually to be implemented as a count of dentries.
And the 1 for the last remaining goes away in the shmem_free_inode()
at the end of shmem_evict_inode().  Does that answer "what happens"?

Since applying the patch, I have verified (watching "dentry" and
"shmem_inode_cache" in /proc/slabinfo) that doing Matej's sequence
repeatedly does not leak any "df -i" nor dentries nor inodes.

> 
> I'm relieved to see your "/think/" above and "Not sure" there :)
> Me too.  It is so easy to get these counting things wrong, especially
> when distributed between the generic and the specific file system.
> 
> I'm not going to attempt a pronouncement until I've had time to
> sink properly into it at the weekend, when I'll follow your guide
> and work it through - thanks a lot for getting this far, Darrick.

I have now sunk into it, and sure that I agree with your patch,
filled out below (I happen to have changed "inode->i_nlink > 0" to
"inode->i_nlink" just out of some personal preference at the time).
One can argue that it's not technically quite the right place, but
it is the place where we can detect the condition without getting
into unnecessary further complications, and does the job well enough.

May I change "Suggested-by: Darrick J. Wong <darrick.wong@oracle.com>"
to your "Signed-off-by" before sending on to Andrew "From" you?

Thanks!
Hugh

[PATCH] tmpfs: fix link accounting when a tmpfile is linked in

tmpfs has a peculiarity of accounting hard links as if they were separate
inodes: so that when the number of inodes is limited, as it is by default,
a user cannot soak up an unlimited amount of unreclaimable dcache memory
just by repeatedly linking a file.

But when v3.11 added O_TMPFILE, and the ability to use linkat() on the fd,
we missed accommodating this new case in tmpfs: "df -i" shows that an
extra "inode" remains accounted after the file is unlinked and the fd
closed and the actual inode evicted.  If a user repeatedly links tmpfiles
into a tmpfs, the limit will be hit (ENOSPC) even after they are deleted.

Just skip the extra reservation from shmem_link() in this case: there's
a sense in which this first link of a tmpfile is then cheaper than a
hard link of another file, but the accounting works out, and there's
still good limiting, so no need to do anything more complicated.

Fixes: f4e0c30c191 ("allow the temp files created by open() to be linked to")
Reported-by: Matej Kupljen <matej.kupljen@gmail.com>
Suggested-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
---

 mm/shmem.c |   10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

--- 5.0-rc7/mm/shmem.c	2019-01-06 19:15:45.764805103 -0800
+++ linux/mm/shmem.c	2019-02-18 13:56:48.388032606 -0800
@@ -2854,10 +2854,14 @@ static int shmem_link(struct dentry *old
 	 * No ordinary (disk based) filesystem counts links as inodes;
 	 * but each new link needs a new dentry, pinning lowmem, and
 	 * tmpfs dentries cannot be pruned until they are unlinked.
+	 * But if an O_TMPFILE file is linked into the tmpfs, the
+	 * first link must skip that, to get the accounting right.
 	 */
-	ret = shmem_reserve_inode(inode->i_sb);
-	if (ret)
-		goto out;
+	if (inode->i_nlink) {
+		ret = shmem_reserve_inode(inode->i_sb);
+		if (ret)
+			goto out;
+	}
 
 	dir->i_size += BOGO_DIRENT_SIZE;
 	inode->i_ctime = dir->i_ctime = dir->i_mtime = current_time(inode);

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: tmpfs inode leakage when opening file with O_TMP_FILE
  2019-02-19  4:23       ` Hugh Dickins
@ 2019-02-19  4:34         ` Darrick J. Wong
  0 siblings, 0 replies; 5+ messages in thread
From: Darrick J. Wong @ 2019-02-19  4:34 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andrew Morton, Matej Kupljen, linux-kernel, linux-fsdevel, linux-mm

On Mon, Feb 18, 2019 at 08:23:20PM -0800, Hugh Dickins wrote:
> On Fri, 15 Feb 2019, Hugh Dickins wrote:
> > On Thu, 14 Feb 2019, Darrick J. Wong wrote:
> > > > On Mon, 11 Feb 2019 15:18:11 +0100 Matej Kupljen <matej.kupljen@gmail.com> wrote:
> > > > > 
> > > > > it seems that when opening file on file system that is mounted on
> > > > > tmpfs with the O_TMPFILE flag and using linkat call after that, it
> > > > > uses 2 inodes instead of 1.
> ...
> > > 
> > > Heh, tmpfs and its weird behavior where each new link counts as a new
> > > inode because "each new link needs a new dentry, pinning lowmem, and
> > > tmpfs dentries cannot be pruned until they are unlinked."
> > 
> > That's very much a peculiarity of tmpfs, so agreed: it's what I expect
> > to be the cause, but I've not actually tracked it through and fixed yet.
> ...
> > 
> > > I /think/ the proper fix is to change shmem_link to decrement ifree only
> > > if the inode has nonzero nlink, e.g.
> > > 
> > > 	/*
> > > 	 * No ordinary (disk based) filesystem counts links as inodes;
> > > 	 * but each new link needs a new dentry, pinning lowmem, and
> > > 	 * tmpfs dentries cannot be pruned until they are unlinked.  If
> > > 	 * we're linking an O_TMPFILE file into the tmpfs we can skip
> > > 	 * this because there's still only one link to the inode.
> > > 	 */
> > > 	if (inode->i_nlink > 0) {
> > > 		ret = shmem_reserve_inode(inode->i_sb);
> > > 		if (ret)
> > > 			goto out;
> > > 	}
> > > 
> > > Says me who was crawling around poking at O_TMPFILE behavior all morning.
> > > Not sure if that's right; what happens to the old dentry?
> 
> Not sure what you mean by "what happens to the old dentry?"
> But certainly the accounting feels a bit like a shell game,
> and my attempts to explain it have not satisfied even me.
> 
> The way I'm finding it helpful to think, is to imagine tmpfs'
> count of inodes actually to be implemented as a count of dentries.
> And the 1 for the last remaining goes away in the shmem_free_inode()
> at the end of shmem_evict_inode().  Does that answer "what happens"?
> 
> Since applying the patch, I have verified (watching "dentry" and
> "shmem_inode_cache" in /proc/slabinfo) that doing Matej's sequence
> repeatedly does not leak any "df -i" nor dentries nor inodes.
> 
> > 
> > I'm relieved to see your "/think/" above and "Not sure" there :)
> > Me too.  It is so easy to get these counting things wrong, especially
> > when distributed between the generic and the specific file system.
> > 
> > I'm not going to attempt a pronouncement until I've had time to
> > sink properly into it at the weekend, when I'll follow your guide
> > and work it through - thanks a lot for getting this far, Darrick.
> 
> I have now sunk into it, and sure that I agree with your patch,
> filled out below (I happen to have changed "inode->i_nlink > 0" to
> "inode->i_nlink" just out of some personal preference at the time).
> One can argue that it's not technically quite the right place, but
> it is the place where we can detect the condition without getting
> into unnecessary further complications, and does the job well enough.
> 
> May I change "Suggested-by: Darrick J. Wong <darrick.wong@oracle.com>"
> to your "Signed-off-by" before sending on to Andrew "From" you?

That's fine with me!

> Thanks!
> Hugh
> 
> [PATCH] tmpfs: fix link accounting when a tmpfile is linked in
> 
> tmpfs has a peculiarity of accounting hard links as if they were separate
> inodes: so that when the number of inodes is limited, as it is by default,
> a user cannot soak up an unlimited amount of unreclaimable dcache memory
> just by repeatedly linking a file.
> 
> But when v3.11 added O_TMPFILE, and the ability to use linkat() on the fd,
> we missed accommodating this new case in tmpfs: "df -i" shows that an
> extra "inode" remains accounted after the file is unlinked and the fd
> closed and the actual inode evicted.  If a user repeatedly links tmpfiles
> into a tmpfs, the limit will be hit (ENOSPC) even after they are deleted.
> 
> Just skip the extra reservation from shmem_link() in this case: there's
> a sense in which this first link of a tmpfile is then cheaper than a
> hard link of another file, but the accounting works out, and there's
> still good limiting, so no need to do anything more complicated.
> 
> Fixes: f4e0c30c191 ("allow the temp files created by open() to be linked to")
> Reported-by: Matej Kupljen <matej.kupljen@gmail.com>
> Suggested-by: Darrick J. Wong <darrick.wong@oracle.com>

Or if you prefer:

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> Signed-off-by: Hugh Dickins <hughd@google.com>
> ---
> 
>  mm/shmem.c |   10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> --- 5.0-rc7/mm/shmem.c	2019-01-06 19:15:45.764805103 -0800
> +++ linux/mm/shmem.c	2019-02-18 13:56:48.388032606 -0800
> @@ -2854,10 +2854,14 @@ static int shmem_link(struct dentry *old
>  	 * No ordinary (disk based) filesystem counts links as inodes;
>  	 * but each new link needs a new dentry, pinning lowmem, and
>  	 * tmpfs dentries cannot be pruned until they are unlinked.
> +	 * But if an O_TMPFILE file is linked into the tmpfs, the
> +	 * first link must skip that, to get the accounting right.
>  	 */
> -	ret = shmem_reserve_inode(inode->i_sb);
> -	if (ret)
> -		goto out;
> +	if (inode->i_nlink) {
> +		ret = shmem_reserve_inode(inode->i_sb);
> +		if (ret)
> +			goto out;
> +	}
>  
>  	dir->i_size += BOGO_DIRENT_SIZE;
>  	inode->i_ctime = dir->i_ctime = dir->i_mtime = current_time(inode);

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-02-19  4:34 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CAHMF36F4JN44Y-yMnxw36A8cO0yVUQhAkvJDcj_gbWbsuUAA5A@mail.gmail.com>
2019-02-14 23:44 ` tmpfs inode leakage when opening file with O_TMP_FILE Andrew Morton
2019-02-15  0:26   ` Darrick J. Wong
2019-02-15 10:38     ` Hugh Dickins
2019-02-19  4:23       ` Hugh Dickins
2019-02-19  4:34         ` Darrick J. Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).