All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] fs/dcache: dentries should free after files unlinked or directories removed
@ 2017-08-07  9:54 Wangkai
  2017-08-25  7:06 ` Wangkai (Kevin,C)
  0 siblings, 1 reply; 14+ messages in thread
From: Wangkai @ 2017-08-07  9:54 UTC (permalink / raw)
  To: linux-fsdevel, viro; +Cc: wangkai86, renjinyong

sometimes, on my server, there were lots of files creating and removing, and I
found that memory usage were high, and I checked, almost all the memory was 
occupied by slab recliamable, the slab struct was dentry, and I checked the 
code of dcache, and found that when a file was deleted the dentry never free, 
unless a memory recliam was triggerd.

I made this patch to mark the dentry as a remove state after file unlinked or
directory removed, and when the dentry’s reference count dec to zero and 
free it, and it worked well on my server base on kernel 4.4.


Signed-off-by: Wangkai <wangkai86@huawei.com>
---
 fs/dcache.c            | 12 +++++++++++-
 fs/namei.c             |  6 ++++++
 include/linux/dcache.h |  2 +-
 3 files changed, 18 insertions(+), 2 deletions(-)
 mode change 100644 => 100755 fs/dcache.c
 mode change 100644 => 100755 fs/namei.c
 mode change 100644 => 100755 include/linux/dcache.h

diff --git a/fs/dcache.c b/fs/dcache.c
old mode 100644
new mode 100755
index f901413..6828463
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -722,7 +722,8 @@ static inline bool fast_dput(struct dentry *dentry)
 	 */
 	smp_rmb();
 	d_flags = ACCESS_ONCE(dentry->d_flags);
-	d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED;
+	d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED
+		| DCACHE_FILE_REMOVED;
 
 	/* Nothing to do? Dropping the reference was all we needed? */
 	if (d_flags == (DCACHE_REFERENCED | DCACHE_LRU_LIST) && !d_unhashed(dentry))
@@ -816,6 +817,15 @@ void dput(struct dentry *dentry)
 	dentry_lru_add(dentry);
 
 	dentry->d_lockref.count--;
+
+	/*
+	 * if file has been declare as removed and reference count is zero
+	 * then we can free the dentry rather than leave it stay in dcache
+	 */
+	if (unlikely(dentry->d_flags & DCACHE_FILE_REMOVED)) {
+		if (dentry->d_lockref.count == 0) 
+			goto kill_it;
+	}
 	spin_unlock(&dentry->d_lock);
 	return;
 
diff --git a/fs/namei.c b/fs/namei.c
old mode 100644
new mode 100755
index ddb6a7c..0ec1478
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3918,6 +3918,9 @@ static long do_rmdir(int dfd, const char __user *pathname)
 		goto exit3;
 	error = vfs_rmdir(path.dentry->d_inode, dentry);
 exit3:
+	/* after remove the dir set dentry remove flag */
+	if (!error)
+		dentry->d_flags |= DCACHE_FILE_REMOVED;
 	dput(dentry);
 exit2:
 	inode_unlock(path.dentry->d_inode);
@@ -4042,6 +4045,9 @@ static long do_unlinkat(int dfd, const char __user *pathname)
 			goto exit2;
 		error = vfs_unlink(path.dentry->d_inode, dentry, &delegated_inode);
 exit2:
+		/* after unlink file set dentry remove flag */
+		if (!error)
+			dentry->d_flags |= DCACHE_FILE_REMOVED;
 		dput(dentry);
 	}
 	inode_unlock(path.dentry->d_inode);
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
old mode 100644
new mode 100755
index aae1cdb..2d65bd6
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -215,7 +215,7 @@ struct dentry_operations {
 #define DCACHE_FALLTHRU			0x01000000 /* Fall through to lower layer */
 #define DCACHE_ENCRYPTED_WITH_KEY	0x02000000 /* dir is encrypted with a valid key */
 #define DCACHE_OP_REAL			0x04000000
-
+#define DCACHE_FILE_REMOVED		0x08000000 /* file or dir has been unlinked or removed */
 #define DCACHE_PAR_LOOKUP		0x10000000 /* being looked up (with parent locked shared) */
 #define DCACHE_DENTRY_CURSOR		0x20000000
 
-- 
2.8.0.GIT

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* RE: [PATCH] fs/dcache: dentries should free after files unlinked or directories removed
  2017-08-07  9:54 [PATCH] fs/dcache: dentries should free after files unlinked or directories removed Wangkai
@ 2017-08-25  7:06 ` Wangkai (Kevin,C)
  2017-08-25 14:47   ` Jan Kara
  0 siblings, 1 reply; 14+ messages in thread
From: Wangkai (Kevin,C) @ 2017-08-25  7:06 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: viro, Waiman Long, Jan Kara, Renjinyong (Renjinyong,
	Business Support Dept), Wangkai (Kevin,C)



> -----Original Message-----
> From: Wangkai (Kevin,C)
> Sent: Monday, August 07, 2017 5:55 PM
> To: linux-fsdevel@vger.kernel.org; viro@zeniv.linux.org.uk
> Cc: Wangkai (Kevin,C); Renjinyong (Renjinyong, Business Support Dept)
> Subject: [PATCH] fs/dcache: dentries should free after files unlinked or
> directories removed
> 
> sometimes, on my server, there were lots of files creating and removing, and I
> found that memory usage were high, and I checked, almost all the memory was
> occupied by slab recliamable, the slab struct was dentry, and I checked the
> code of dcache, and found that when a file was deleted the dentry never free,
> unless a memory recliam was triggerd.
> 
> I made this patch to mark the dentry as a remove state after file unlinked or
> directory removed, and when the dentry’s reference count dec to zero and
> free it, and it worked well on my server base on kernel 4.4.
> 
> 
> Signed-off-by: Wangkai <wangkai86@huawei.com>
> ---
>  fs/dcache.c            | 12 +++++++++++-
>  fs/namei.c             |  6 ++++++
>  include/linux/dcache.h |  2 +-
>  3 files changed, 18 insertions(+), 2 deletions(-)
>  mode change 100644 => 100755 fs/dcache.c
>  mode change 100644 => 100755 fs/namei.c
>  mode change 100644 => 100755 include/linux/dcache.h
> 
> diff --git a/fs/dcache.c b/fs/dcache.c
> old mode 100644
> new mode 100755
> index f901413..6828463
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -722,7 +722,8 @@ static inline bool fast_dput(struct dentry *dentry)
>  	 */
>  	smp_rmb();
>  	d_flags = ACCESS_ONCE(dentry->d_flags);
> -	d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST |
> DCACHE_DISCONNECTED;
> +	d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST |
> DCACHE_DISCONNECTED
> +		| DCACHE_FILE_REMOVED;
> 
>  	/* Nothing to do? Dropping the reference was all we needed? */
>  	if (d_flags == (DCACHE_REFERENCED | DCACHE_LRU_LIST)
> && !d_unhashed(dentry))
> @@ -816,6 +817,15 @@ void dput(struct dentry *dentry)
>  	dentry_lru_add(dentry);
> 
>  	dentry->d_lockref.count--;
> +
> +	/*
> +	 * if file has been declare as removed and reference count is zero
> +	 * then we can free the dentry rather than leave it stay in dcache
> +	 */
> +	if (unlikely(dentry->d_flags & DCACHE_FILE_REMOVED)) {
> +		if (dentry->d_lockref.count == 0)
> +			goto kill_it;
> +	}
>  	spin_unlock(&dentry->d_lock);
>  	return;
> 
> diff --git a/fs/namei.c b/fs/namei.c
> old mode 100644
> new mode 100755
> index ddb6a7c..0ec1478
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -3918,6 +3918,9 @@ static long do_rmdir(int dfd, const char __user
> *pathname)
>  		goto exit3;
>  	error = vfs_rmdir(path.dentry->d_inode, dentry);
>  exit3:


> +	/* after remove the dir set dentry remove flag */
> +	if (!error)
> +		dentry->d_flags |= DCACHE_FILE_REMOVED;
>  	dput(dentry);
>  exit2:
>  	inode_unlock(path.dentry->d_inode);
> @@ -4042,6 +4045,9 @@ static long do_unlinkat(int dfd, const char __user
> *pathname)
>  			goto exit2;
>  		error = vfs_unlink(path.dentry->d_inode, dentry, &delegated_inode);
>  exit2:
> +		/* after unlink file set dentry remove flag */
> +		if (!error)
> +			dentry->d_flags |= DCACHE_FILE_REMOVED;
>  		dput(dentry);
>  	}
>  	inode_unlock(path.dentry->d_inode);
> diff --git a/include/linux/dcache.h b/include/linux/dcache.h
> old mode 100644
> new mode 100755
> index aae1cdb..2d65bd6
> --- a/include/linux/dcache.h
> +++ b/include/linux/dcache.h
> @@ -215,7 +215,7 @@ struct dentry_operations {
>  #define DCACHE_FALLTHRU			0x01000000 /* Fall through to lower
> layer */
>  #define DCACHE_ENCRYPTED_WITH_KEY	0x02000000 /* dir is encrypted
> with a valid key */
>  #define DCACHE_OP_REAL			0x04000000
> -
> +#define DCACHE_FILE_REMOVED		0x08000000 /* file or dir has been
> unlinked or removed */
>  #define DCACHE_PAR_LOOKUP		0x10000000 /* being looked up (with
> parent locked shared) */
>  #define DCACHE_DENTRY_CURSOR		0x20000000
> 
> --
> 2.8.0.GIT

Hi, all
The negative dentries keep growing and waste a lots of kernel memory, this problem has been
Occurred on my server, and I looked for internet, and many people had met the same problem.
Recently, I discussed with Longman, we have two different patches to solve this problem,
In my patch, remove the negative dentries with the files unlinked, I checked 15 years ago,
Viro and linus had talked about this, that unlink was only worth doing.
  ref:   http://yarchive.net/comp/linux/negative_dentries.html
in Longman patch, limit the negative dentries number
maybe we can discuss.

Thanks,
Kevin


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] fs/dcache: dentries should free after files unlinked or directories removed
  2017-08-25  7:06 ` Wangkai (Kevin,C)
@ 2017-08-25 14:47   ` Jan Kara
  2017-08-25 15:10     ` Colin Walters
  2017-08-26  6:56     ` Wangkai (Kevin,C)
  0 siblings, 2 replies; 14+ messages in thread
From: Jan Kara @ 2017-08-25 14:47 UTC (permalink / raw)
  To: Wangkai (Kevin,C)
  Cc: linux-fsdevel, viro, Waiman Long, Jan Kara,
	Renjinyong (Renjinyong, Business Support Dept),
	Linus Torvalds

On Fri 25-08-17 07:06:50, Wangkai (Kevin,C) wrote:
> > -----Original Message-----
> > From: Wangkai (Kevin,C)
> > Sent: Monday, August 07, 2017 5:55 PM
> > To: linux-fsdevel@vger.kernel.org; viro@zeniv.linux.org.uk
> > Cc: Wangkai (Kevin,C); Renjinyong (Renjinyong, Business Support Dept)
> > Subject: [PATCH] fs/dcache: dentries should free after files unlinked or
> > directories removed
> > 
> > sometimes, on my server, there were lots of files creating and removing, and I
> > found that memory usage were high, and I checked, almost all the memory was
> > occupied by slab recliamable, the slab struct was dentry, and I checked the
> > code of dcache, and found that when a file was deleted the dentry never free,
> > unless a memory recliam was triggerd.
> > 
> > I made this patch to mark the dentry as a remove state after file unlinked or
> > directory removed, and when the dentry’s reference count dec to zero and
> > free it, and it worked well on my server base on kernel 4.4.
> > 
> > 
> > Signed-off-by: Wangkai <wangkai86@huawei.com>

...

> Hi, all
> The negative dentries keep growing and waste a lots of kernel memory, this problem has been
> Occurred on my server, and I looked for internet, and many people had met the same problem.
> Recently, I discussed with Longman, we have two different patches to solve this problem,
> In my patch, remove the negative dentries with the files unlinked, I checked 15 years ago,
> Viro and linus had talked about this, that unlink was only worth doing.
>   ref:   http://yarchive.net/comp/linux/negative_dentries.html
> in Longman patch, limit the negative dentries number
> maybe we can discuss.

So both you and Waiman complain about negative dentries consuming space
(and I agree, they do) but neither of you has explained why it is a
problem. If memory is ever needed, negative dentries are very easy to
reclaim. So to some extent this is like complaing that page cache consumes
your memory - which is again true but it is a deliberate decision and it
helps performance.

It is possible that some of these dentries are so rarely used that they are
indeed just a waste but then I'd like to see detailed analysis of which
negative dentries are these and how your reclaim heuristics improve the
situation. But I haven't seen any performance numbers from either you or
Waiman. So please gather some performance numbers justifying your change
so that we have something to talk about...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] fs/dcache: dentries should free after files unlinked or directories removed
  2017-08-25 14:47   ` Jan Kara
@ 2017-08-25 15:10     ` Colin Walters
  2017-08-25 16:43       ` Linus Torvalds
  2017-08-26  6:56     ` Wangkai (Kevin,C)
  1 sibling, 1 reply; 14+ messages in thread
From: Colin Walters @ 2017-08-25 15:10 UTC (permalink / raw)
  To: Jan Kara, Wangkai (Kevin,C)
  Cc: linux-fsdevel, viro, Waiman Long, Renjinyong (Renjinyong,
	Business Support Dept),
	Linus Torvalds

On Fri, Aug 25, 2017, at 10:47 AM, Jan Kara wrote:
> 
> It is possible that some of these dentries are so rarely used that they are
> indeed just a waste 

In some cases - think containers, or ostree-style root filesystem snapshots,
if we do an `rm -rf /path/to/container-root`, userspace knows for a fact
that nothing will reference those paths again - all of the processes that could
have been killed.  There's no point to having negative dentries for them.

Maybe something like unlinkat (dfd, path, AT_UNLINKAT_DONTNEED), like
madvise (MAV_DONTNEED) ?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] fs/dcache: dentries should free after files unlinked or directories removed
  2017-08-25 15:10     ` Colin Walters
@ 2017-08-25 16:43       ` Linus Torvalds
  0 siblings, 0 replies; 14+ messages in thread
From: Linus Torvalds @ 2017-08-25 16:43 UTC (permalink / raw)
  To: Colin Walters
  Cc: Jan Kara, Wangkai (Kevin,C),
	linux-fsdevel, Al Viro, Waiman Long, Renjinyong (Renjinyong,
	Business Support Dept)

On Fri, Aug 25, 2017 at 8:10 AM, Colin Walters <walters@verbum.org> wrote:
> On Fri, Aug 25, 2017, at 10:47 AM, Jan Kara wrote:
>>
>> It is possible that some of these dentries are so rarely used that they are
>> indeed just a waste
>
> In some cases - think containers, or ostree-style root filesystem snapshots,
> if we do an `rm -rf /path/to/container-root`, userspace knows for a fact
> that nothing will reference those paths again - all of the processes that could
> have been killed.  There's no point to having negative dentries for them.
>
> Maybe something like unlinkat (dfd, path, AT_UNLINKAT_DONTNEED), like
> madvise (MAV_DONTNEED) ?

No, I think the right fix is to just prune the child dentries when
doing an rmdir - and I thought we did that already.

IOW, when doing "rmdir()" on a dentry, we should not prune *that*
dentry, but we should prune all the dentries (recursively) under it.

.. goers to look ..

Yeah, look at vfs_rmdir(): it does that

    shrink_dcache_parent(dentry);

already. So when you do a "rm -rf something", you should already have
*no* extra negative dentries - there should be exactly one negative
dentry remaining (the dentry that used to be the directory itself).

So I really don't see what the problem is.  If you have lots of
dentries left over after a "rm -rf /path/to/container-root", something
else is wrong.

The only time we have negative dentries is:

 (a) lookups that didn't match. These are CRITICALLY IMPORTANT. They
are very very common.

   Why? Think of all the PATH-like behavior that Unix traditionally
has: not just PATH itself, but look at what processes do with
'strace'. They end up searching for things like translation files for
error messages, for shared libraries, for a number of things using
path-like things, and it's actually really important that they do
*not* keep trying to call down to the filesystem to look for a path
that doesn't exist. That's often the most expensive filesystem
operation there is - because in a big directory (and again - think
about PATH - it often traverses some of the biggest directories
around), many filesystems end up walking the whole directory before
they say "oops, I didn't find that file".

 (b) individually removed files.

  These aren't as important, and we could possibly shrink things, but
they really shouldn't matter.

  How often do you remove a ton of files without removing the
directory they are in? Not often.

 (c) renames etc. These tend to be even less of an issue.

So I really think that you shouldn't have that many negative dentries
to begin with under normal load, but even if you do, they should be
really easy to prune like Jan says.

So send out a real load with real numbers. None of this touchy-feely
thing that seems to be wrong. Ok?

Because maybe we have a bug, and that shrink_dcache_parent() thing doesn't work.

That would be interesting and relevant and a bug, so definitely worth fixing.

(Side note: the shrink_dcache_parent() thing is actually done before
the filesystem rmdir() is called, which means that you can use a
"rmdir()" as a MAV_DONTNEED on the dentry tree below that directory.
Just make sure the directory isn't empty, so that it doesn't actually
get deleted)

               Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH] fs/dcache: dentries should free after files unlinked or directories removed
  2017-08-25 14:47   ` Jan Kara
  2017-08-25 15:10     ` Colin Walters
@ 2017-08-26  6:56     ` Wangkai (Kevin,C)
  2017-08-26 16:18       ` Linus Torvalds
  1 sibling, 1 reply; 14+ messages in thread
From: Wangkai (Kevin,C) @ 2017-08-26  6:56 UTC (permalink / raw)
  To: Jan Kara, Linus Torvalds, viro, Waiman Long, walters
  Cc: linux-fsdevel, Renjinyong (Renjinyong, Business Support Dept)

> So both you and Waiman complain about negative dentries consuming space
> (and I agree, they do) but neither of you has explained why it is a problem. If
> memory is ever needed, negative dentries are very easy to reclaim. So to some
> extent this is like complaing that page cache consumes your memory - which is
> again true but it is a deliberate decision and it helps performance.
> 
> It is possible that some of these dentries are so rarely used that they are
> indeed just a waste but then I'd like to see detailed analysis of which negative
> dentries are these and how your reclaim heuristics improve the situation. But I
> haven't seen any performance numbers from either you or Waiman. So please
> gather some performance numbers justifying your change so that we have
> something to talk about...
> 

I am sorry, so let me to explain my problem:
on my linux machine kernel 4.4, there was one program "foo" doing some backup data
and create backup files one by one, which named like foo.1.bak foo.2.bak ... foo.n.bak
sequenced. e.g. create foo.2.bak, then delete foo.1.bak, create foo.3.bak then delete
foo.2.bak , and one day I found the memory usage was high(total memory was 8G):
/proc/meminfo:
SReclaimable:    7634344 kB

After check the slabinfo:
dentry            40001682 40001682    192   21    1 : tunables

and the dentry-state: 
40001058        39990056        45      0       0       0

The foo has been created about 40 millions of files, and left one file remained, all other
Was deleted.

At this time, I have one kernel module allocing pages with GFP_ATOMIC flag, some times
Allocation failed due to reclaim not very fast(maybe I will modify the parameter of memory 
Reclaim ...)

Another perf issue about keeping large number dentries was, the dentry lookup slowdown
The test:
I tried to create, close, and remove different files when different number dentries present:
(On x86_64 kernel 4.4, 12 cpus Intel(R) Xeon(R) CPU E5645 @ 2.40GHz)
When dentries count was 18800
    open   1000 files, avg once time 10321.671 ns
    close  1000 files, avg once time 455.299 ns
    unlink 1000 files, avg once time 5179.519 ns
when dentries count was 40001058
    open   1000 files, avg once time 13483.361 ns
    close  1000 files, avg once time 455.067 ns
    unlink 1000 files, avg once time 7645.616 ns

actually, I can modify the program "foo", and change the backup file's name to be back 
around like foo.1.bak, foo.2.bak ... foo.100.bak -> foo.1.bak

but I am worried, if there are programs create,delete many temporary files and unique, 
the negative dentries will keep growing.

Thanks, 
Kevin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] fs/dcache: dentries should free after files unlinked or directories removed
  2017-08-26  6:56     ` Wangkai (Kevin,C)
@ 2017-08-26 16:18       ` Linus Torvalds
  2017-08-27 15:05         ` Waiman Long
  2017-08-28  6:31         ` Wangkai (Kevin,C)
  0 siblings, 2 replies; 14+ messages in thread
From: Linus Torvalds @ 2017-08-26 16:18 UTC (permalink / raw)
  To: Wangkai (Kevin,C)
  Cc: Jan Kara, viro, Waiman Long, walters, linux-fsdevel,
	Renjinyong (Renjinyong, Business Support Dept)

On Fri, Aug 25, 2017 at 11:56 PM, Wangkai (Kevin,C)
<wangkai86@huawei.com> wrote:
>
> but I am worried, if there are programs create,delete many temporary files and unique,
> the negative dentries will keep growing.

The thing is, this has nothing to do with unlink.

The *easiest* way to generate negative dentries is in fact to never
create any files at all: just look up millions of non-existent names.

IOW, just something like this

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main()
    {
        int i;
        for (i = 0; i < 10000000; i++) {
                char name[20];
                struct stat st;

                snprintf(name, sizeof(name), "n:%d", i);
                stat(name, &st);
        }
        return 0;
    }

is a much easier and faster way to create negative dentries.

And yes, it's entirely possible that we could/should have some way to
balance negative dentries against positive ones, but on the whole this
has not really come up as a huge problem.

For example, your module that does a lot of GFP_ATOMIC allocations -
if it wasn't for dentries, it would have been something else.
GFP_ATOMIC *will* fail after a while, because it just can't replenish
the free memory. That's fundamental. That's what GFP_ATOMIC _means_.
It's very much meant for "occasional critical allocations", and if you
do just GFP_ATOMIC, you will fail.

                 Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] fs/dcache: dentries should free after files unlinked or directories removed
  2017-08-26 16:18       ` Linus Torvalds
@ 2017-08-27 15:05         ` Waiman Long
  2017-08-31  7:53           ` Jan Kara
  2017-08-28  6:31         ` Wangkai (Kevin,C)
  1 sibling, 1 reply; 14+ messages in thread
From: Waiman Long @ 2017-08-27 15:05 UTC (permalink / raw)
  To: Linus Torvalds, Wangkai (Kevin,C)
  Cc: Jan Kara, viro, walters, linux-fsdevel, Renjinyong (Renjinyong,
	Business Support Dept)

On 08/26/2017 12:18 PM, Linus Torvalds wrote:
> On Fri, Aug 25, 2017 at 11:56 PM, Wangkai (Kevin,C)
> <wangkai86@huawei.com> wrote:
>> but I am worried, if there are programs create,delete many temporary files and unique,
>> the negative dentries will keep growing.
> The thing is, this has nothing to do with unlink.
>
> The *easiest* way to generate negative dentries is in fact to never
> create any files at all: just look up millions of non-existent names.
>
> IOW, just something like this
>
>     #include <stdio.h>
>     #include <sys/types.h>
>     #include <sys/stat.h>
>     #include <unistd.h>
>
>     int main()
>     {
>         int i;
>         for (i = 0; i < 10000000; i++) {
>                 char name[20];
>                 struct stat st;
>
>                 snprintf(name, sizeof(name), "n:%d", i);
>                 stat(name, &st);
>         }
>         return 0;
>     }
>
> is a much easier and faster way to create negative dentries.
>
> And yes, it's entirely possible that we could/should have some way to
> balance negative dentries against positive ones, but on the whole this
> has not really come up as a huge problem.

It is certainly true that the current scheme of unlimited negative
dentry creation is not a problem under most cases. However, there are
scenarios where it can be a problem.

A customer discovered the negative dentry issue because of a bug in
their application code. They fixed their code to solve the problem.
However, they wondered if this could be used as one vector of a DoS
attack on a Linux system by having a rogue program generate massive
number of negative dentries continuously. It is the thought of this
malicious use of the negative dentry behavior that prompted me to create
and send out a patch to limit the number of negative dentries allowable
in a system.

Besides, Kevin had shown that keeping the dentry cache from growing too
big was good for file lookup performance too.

Cheers,
Longman
> For example, your module that does a lot of GFP_ATOMIC allocations -
> if it wasn't for dentries, it would have been something else.
> GFP_ATOMIC *will* fail after a while, because it just can't replenish
> the free memory. That's fundamental. That's what GFP_ATOMIC _means_.
> It's very much meant for "occasional critical allocations", and if you
> do just GFP_ATOMIC, you will fail.
>
>                  Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH] fs/dcache: dentries should free after files unlinked or directories removed
  2017-08-26 16:18       ` Linus Torvalds
  2017-08-27 15:05         ` Waiman Long
@ 2017-08-28  6:31         ` Wangkai (Kevin,C)
  1 sibling, 0 replies; 14+ messages in thread
From: Wangkai (Kevin,C) @ 2017-08-28  6:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jan Kara, viro, Waiman Long, walters, linux-fsdevel,
	Renjinyong (Renjinyong, Business Support Dept)



> -----Original Message-----
> From: linus971@gmail.com [mailto:linus971@gmail.com] On Behalf Of Linus
> Torvalds
> Sent: Sunday, August 27, 2017 12:19 AM
> To: Wangkai (Kevin,C)
> Cc: Jan Kara; viro@zeniv.linux.org.uk; Waiman Long; walters@verbum.org;
> linux-fsdevel@vger.kernel.org; Renjinyong (Renjinyong, Business Support Dept)
> Subject: Re: [PATCH] fs/dcache: dentries should free after files unlinked or
> directories removed
> 
> On Fri, Aug 25, 2017 at 11:56 PM, Wangkai (Kevin,C)
> <wangkai86@huawei.com> wrote:
> >
> > but I am worried, if there are programs create,delete many temporary files
> and unique,
> > the negative dentries will keep growing.
> 
> The thing is, this has nothing to do with unlink.
> 
> The *easiest* way to generate negative dentries is in fact to never
> create any files at all: just look up millions of non-existent names.
> 
> IOW, just something like this
> 
>     #include <stdio.h>
>     #include <sys/types.h>
>     #include <sys/stat.h>
>     #include <unistd.h>
> 
>     int main()
>     {
>         int i;
>         for (i = 0; i < 10000000; i++) {
>                 char name[20];
>                 struct stat st;
> 
>                 snprintf(name, sizeof(name), "n:%d", i);
>                 stat(name, &st);
>         }
>         return 0;
>     }
> 
> is a much easier and faster way to create negative dentries.
>
> And yes, it's entirely possible that we could/should have some way to
> balance negative dentries against positive ones, but on the whole this
> has not really come up as a huge problem.
> 
> For example, your module that does a lot of GFP_ATOMIC allocations -
> if it wasn't for dentries, it would have been something else.
> GFP_ATOMIC *will* fail after a while, because it just can't replenish
> the free memory. That's fundamental. That's what GFP_ATOMIC _means_.
> It's very much meant for "occasional critical allocations", and if you
> do just GFP_ATOMIC, you will fail.
> 
>                  Linus

(1)  yes, negative dentries can come up in two ways:
 i)  look up different files 
 ii)  create and delete different files (I have met)

(2)  the module allocation with flag GFP_ATOMIC was called from irq, 
which do some special packets receiving, called dev_alloc_skb, 
in this function the memory allocation with flag GFP_ATOMIC.

Regards,
Kevin 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] fs/dcache: dentries should free after files unlinked or directories removed
  2017-08-27 15:05         ` Waiman Long
@ 2017-08-31  7:53           ` Jan Kara
  2017-08-31 16:27             ` Waiman Long
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2017-08-31  7:53 UTC (permalink / raw)
  To: Waiman Long
  Cc: Linus Torvalds, Wangkai (Kevin,C),
	Jan Kara, viro, walters, linux-fsdevel, Renjinyong (Renjinyong,
	Business Support Dept)

On Sun 27-08-17 11:05:34, Waiman Long wrote:
> On 08/26/2017 12:18 PM, Linus Torvalds wrote:
> > On Fri, Aug 25, 2017 at 11:56 PM, Wangkai (Kevin,C)
> > <wangkai86@huawei.com> wrote:
> >> but I am worried, if there are programs create,delete many temporary files and unique,
> >> the negative dentries will keep growing.
> > The thing is, this has nothing to do with unlink.
> >
> > The *easiest* way to generate negative dentries is in fact to never
> > create any files at all: just look up millions of non-existent names.
> >
> > IOW, just something like this
> >
> >     #include <stdio.h>
> >     #include <sys/types.h>
> >     #include <sys/stat.h>
> >     #include <unistd.h>
> >
> >     int main()
> >     {
> >         int i;
> >         for (i = 0; i < 10000000; i++) {
> >                 char name[20];
> >                 struct stat st;
> >
> >                 snprintf(name, sizeof(name), "n:%d", i);
> >                 stat(name, &st);
> >         }
> >         return 0;
> >     }
> >
> > is a much easier and faster way to create negative dentries.
> >
> > And yes, it's entirely possible that we could/should have some way to
> > balance negative dentries against positive ones, but on the whole this
> > has not really come up as a huge problem.
> 
> It is certainly true that the current scheme of unlimited negative
> dentry creation is not a problem under most cases. However, there are
> scenarios where it can be a problem.
> 
> A customer discovered the negative dentry issue because of a bug in
> their application code. They fixed their code to solve the problem.
> However, they wondered if this could be used as one vector of a DoS
> attack on a Linux system by having a rogue program generate massive
> number of negative dentries continuously. It is the thought of this
> malicious use of the negative dentry behavior that prompted me to create
> and send out a patch to limit the number of negative dentries allowable
> in a system.

Well, and how is this fundamentally different from a user consuming
resources by other means (positive dentries, inode cache, page cache, anon
memory etc.)? Sure you can force slab reclaim to work hard but you have
many other ways how a local user can do that. So if you can demonstrate
that it is too easy to DoS a system in some way, we can talk about
mitigating the attack. But just the ability of making the system busy does
not seem serious to me.

> Besides, Kevin had shown that keeping the dentry cache from growing too
> big was good for file lookup performance too.

Well, that rather speaks for better data structure for dentry lookup (e.g.
growing hash tables) rather than for limiting negative dentries? I can
imagine there are workloads which would benefit from that as well?

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] fs/dcache: dentries should free after files unlinked or directories removed
  2017-08-31  7:53           ` Jan Kara
@ 2017-08-31 16:27             ` Waiman Long
  2017-09-04 12:59               ` Jan Kara
  0 siblings, 1 reply; 14+ messages in thread
From: Waiman Long @ 2017-08-31 16:27 UTC (permalink / raw)
  To: Jan Kara
  Cc: Linus Torvalds, Wangkai (Kevin,C),
	viro, walters, linux-fsdevel, Renjinyong (Renjinyong,
	Business Support Dept)

On 08/31/2017 03:53 AM, Jan Kara wrote:
> On Sun 27-08-17 11:05:34, Waiman Long wrote:
>>
>> It is certainly true that the current scheme of unlimited negative
>> dentry creation is not a problem under most cases. However, there are
>> scenarios where it can be a problem.
>>
>> A customer discovered the negative dentry issue because of a bug in
>> their application code. They fixed their code to solve the problem.
>> However, they wondered if this could be used as one vector of a DoS
>> attack on a Linux system by having a rogue program generate massive
>> number of negative dentries continuously. It is the thought of this
>> malicious use of the negative dentry behavior that prompted me to create
>> and send out a patch to limit the number of negative dentries allowable
>> in a system.
> Well, and how is this fundamentally different from a user consuming
> resources by other means (positive dentries, inode cache, page cache, anon
> memory etc.)? Sure you can force slab reclaim to work hard but you have
> many other ways how a local user can do that. So if you can demonstrate
> that it is too easy to DoS a system in some way, we can talk about
> mitigating the attack. But just the ability of making the system busy does
> not seem serious to me.

Positive dentries are limited by the total number of files in the file
system. Negative dentries, on the other hand, have no such limit. There
are ways to limit other resource usages such as limiting memory usage of
a user by memory cgroup, filesystem quota for amount of disk space or
number of files created or owned, etc. However, I am not aware of any
control mechanism that can limit the number negative dentries generated
by a given user. That makes negative dentries somewhat different from
the other resource types that you are talking about.

>> Besides, Kevin had shown that keeping the dentry cache from growing too
>> big was good for file lookup performance too.
> Well, that rather speaks for better data structure for dentry lookup (e.g.
> growing hash tables) rather than for limiting negative dentries? I can
> imagine there are workloads which would benefit from that as well?

Current dentry lookup is through a hash table. The lookup performance
will depend on the number of hashed slots as well as the number of
entries queued in each slot. So in general, lookup performance
deteriorates the more entries you put into a given slot. That is true no
matter how many slots you have allocated.

Cheers,
Longman

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] fs/dcache: dentries should free after files unlinked or directories removed
  2017-08-31 16:27             ` Waiman Long
@ 2017-09-04 12:59               ` Jan Kara
  2017-09-04 15:48                 ` Linus Torvalds
  2017-09-04 21:45                 ` Waiman Long
  0 siblings, 2 replies; 14+ messages in thread
From: Jan Kara @ 2017-09-04 12:59 UTC (permalink / raw)
  To: Waiman Long
  Cc: Jan Kara, Linus Torvalds, Wangkai (Kevin,C),
	viro, walters, linux-fsdevel, Renjinyong (Renjinyong,
	Business Support Dept)

On Thu 31-08-17 12:27:27, Waiman Long wrote:
> On 08/31/2017 03:53 AM, Jan Kara wrote:
> > On Sun 27-08-17 11:05:34, Waiman Long wrote:
> >>
> >> It is certainly true that the current scheme of unlimited negative
> >> dentry creation is not a problem under most cases. However, there are
> >> scenarios where it can be a problem.
> >>
> >> A customer discovered the negative dentry issue because of a bug in
> >> their application code. They fixed their code to solve the problem.
> >> However, they wondered if this could be used as one vector of a DoS
> >> attack on a Linux system by having a rogue program generate massive
> >> number of negative dentries continuously. It is the thought of this
> >> malicious use of the negative dentry behavior that prompted me to create
> >> and send out a patch to limit the number of negative dentries allowable
> >> in a system.
> > Well, and how is this fundamentally different from a user consuming
> > resources by other means (positive dentries, inode cache, page cache, anon
> > memory etc.)? Sure you can force slab reclaim to work hard but you have
> > many other ways how a local user can do that. So if you can demonstrate
> > that it is too easy to DoS a system in some way, we can talk about
> > mitigating the attack. But just the ability of making the system busy does
> > not seem serious to me.
> 
> Positive dentries are limited by the total number of files in the file
> system. Negative dentries, on the other hand, have no such limit. There
> are ways to limit other resource usages such as limiting memory usage of
> a user by memory cgroup, filesystem quota for amount of disk space or
> number of files created or owned, etc. However, I am not aware of any
> control mechanism that can limit the number negative dentries generated
> by a given user. That makes negative dentries somewhat different from
> the other resource types that you are talking about.

So I agree they are somewhat different but not fundamentally different -
e.g. total number of files in the file system can be easily so high that
dentries + inodes cannot fit into RAM and thus you are in a very similar
situation as with negative dentries. That's actually one of the reasons why
people were trying to bend memcgs to account slab cache as well. But it
didn't end anywhere AFAIK.

The reason why I'm objecting is that the limit on the number of negative
dentries is another tuning knob, it is for very specific cases, and most of
sysadmins will have no clue how to set it properly (even I wouldn't have a
good idea).

> >> Besides, Kevin had shown that keeping the dentry cache from growing too
> >> big was good for file lookup performance too.
> > Well, that rather speaks for better data structure for dentry lookup (e.g.
> > growing hash tables) rather than for limiting negative dentries? I can
> > imagine there are workloads which would benefit from that as well?
> 
> Current dentry lookup is through a hash table. The lookup performance
> will depend on the number of hashed slots as well as the number of
> entries queued in each slot. So in general, lookup performance
> deteriorates the more entries you put into a given slot. That is true no
> matter how many slots you have allocated.

Agreed, but with rhashtables the number of slots grows dynamically with the
number of entries...

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] fs/dcache: dentries should free after files unlinked or directories removed
  2017-09-04 12:59               ` Jan Kara
@ 2017-09-04 15:48                 ` Linus Torvalds
  2017-09-04 21:45                 ` Waiman Long
  1 sibling, 0 replies; 14+ messages in thread
From: Linus Torvalds @ 2017-09-04 15:48 UTC (permalink / raw)
  To: Jan Kara
  Cc: Waiman Long, Wangkai (Kevin,C),
	viro, walters, linux-fsdevel, Renjinyong (Renjinyong,
	Business Support Dept)

On Mon, Sep 4, 2017 at 5:59 AM, Jan Kara <jack@suse.cz> wrote:
>
> The reason why I'm objecting is that the limit on the number of negative
> dentries is another tuning knob, it is for very specific cases, and most of
> sysadmins will have no clue how to set it properly (even I wouldn't have a
> good idea).

We could make some cut at just doing it without any explicit tuning.

In particular, we could probably have some fairly trivial logic to
count negative dentries vs positive ones, and start pruning negative
dentries if we've crossed some threshold and we have more of them than
of the positive kind. That's a fairly obvious "somebody is doing
something bad" measure which wouldn't need a ton of tuning.

That said, I'm not convinced it's a huge deal and worth it - I'm just
saying that I don't think it's necessarily crazy or something I'd NAK
automatically.

                  Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] fs/dcache: dentries should free after files unlinked or directories removed
  2017-09-04 12:59               ` Jan Kara
  2017-09-04 15:48                 ` Linus Torvalds
@ 2017-09-04 21:45                 ` Waiman Long
  1 sibling, 0 replies; 14+ messages in thread
From: Waiman Long @ 2017-09-04 21:45 UTC (permalink / raw)
  To: Jan Kara
  Cc: Linus Torvalds, Wangkai (Kevin,C),
	viro, walters, linux-fsdevel, Renjinyong (Renjinyong,
	Business Support Dept)

On 09/04/2017 08:59 AM, Jan Kara wrote:
>
> So I agree they are somewhat different but not fundamentally different -
> e.g. total number of files in the file system can be easily so high that
> dentries + inodes cannot fit into RAM and thus you are in a very similar
> situation as with negative dentries. That's actually one of the reasons why
> people were trying to bend memcgs to account slab cache as well. But it
> didn't end anywhere AFAIK.
>
> The reason why I'm objecting is that the limit on the number of negative
> dentries is another tuning knob, it is for very specific cases, and most of
> sysadmins will have no clue how to set it properly (even I wouldn't have a
> good idea).

Thanks for letting me know which part of the patch you are objecting to.
As suggested by Linus, I can easily change the patch to do some kind of
auto-tuning depending on the positive-negative dentries ratio without
needing a user configurable kernel command line option. I added that
option to make the patch more flexible, but I do agree that most people
will likely leave it at the default value without ever using it.

>> Current dentry lookup is through a hash table. The lookup performance
>> will depend on the number of hashed slots as well as the number of
>> entries queued in each slot. So in general, lookup performance
>> deteriorates the more entries you put into a given slot. That is true no
>> matter how many slots you have allocated.
> Agreed, but with rhashtables the number of slots grows dynamically with the
> number of entries...

Currently, alloc_large_system_hash() is scaling the number of hash slots
according to the system memory size. That is adequate in most cases.
Using rhashtable will add a little bit of overhead into the hash index
computation, so we will probably see a little bit of slowdown with small
number of dentries and a bit of speed-up with large number of dentries.
That may not be a trade-off we would like to take.

Cheers,
Longman

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-09-04 21:45 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-07  9:54 [PATCH] fs/dcache: dentries should free after files unlinked or directories removed Wangkai
2017-08-25  7:06 ` Wangkai (Kevin,C)
2017-08-25 14:47   ` Jan Kara
2017-08-25 15:10     ` Colin Walters
2017-08-25 16:43       ` Linus Torvalds
2017-08-26  6:56     ` Wangkai (Kevin,C)
2017-08-26 16:18       ` Linus Torvalds
2017-08-27 15:05         ` Waiman Long
2017-08-31  7:53           ` Jan Kara
2017-08-31 16:27             ` Waiman Long
2017-09-04 12:59               ` Jan Kara
2017-09-04 15:48                 ` Linus Torvalds
2017-09-04 21:45                 ` Waiman Long
2017-08-28  6:31         ` Wangkai (Kevin,C)

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.