From: Andi Kleen <andi@firstfloor.org>
To: Pasha Tatashin <pasha.tatashin@oracle.com>
Cc: Andi Kleen <andi@firstfloor.org>,
	linux-mm@kvack.org, sparclinux@vger.kernel.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v2 1/3] sparc64: NG4 memset 32 bits overflow
Date: Wed, 1 Mar 2017 15:10:25 -0800
Message-ID: <20170301231025.GJ26852@two.firstfloor.org>
In-Reply-To: <1e7db21b-808d-1f47-e78c-7d55c543ae39@oracle.com>

> For example, I am pretty sure that scale value in most places should
> be changed from a literal value (inode scale = 14, dentry scale = 13,
> etc.) to (PAGE_SHIFT + value): inode scale would become (PAGE_SHIFT +
> 2), dentry scale would become (PAGE_SHIFT + 1), etc. This is because
> we want 1/4 inodes and 1/2 dentries per every page in the system.

This is still far too much for a large system. The algorithm
simply was not designed for TB systems.
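
To put rough numbers on this, here is a minimal sketch of how the table
size follows from the scale value. It assumes a simplified model (one
bucket per 2**scale bytes of memory, 8-byte buckets) and ignores
rounding, limits and the user override; the function names are
hypothetical. Under these assumptions a 1 TB system ends up with a 1 GB
dentry table and a 512 MB inode table.

```python
# Simplified model: one hash bucket per 2**scale bytes of memory.
# Rounding, limits and the user override are ignored.
# BUCKET_BYTES = 8 is an assumption (one pointer-sized list head).
BUCKET_BYTES = 8
GB = 1 << 30
TB = 1 << 40


def hash_entries(mem_bytes, scale):
    """Number of buckets when there is one bucket per 2**scale bytes."""
    return mem_bytes >> scale


def table_bytes(mem_bytes, scale, bucket_bytes=BUCKET_BYTES):
    """Memory consumed by the bucket array itself."""
    return hash_entries(mem_bytes, scale) * bucket_bytes


for mem in (100 * GB, 1 * TB, 16 * TB):
    dentry_mb = table_bytes(mem, 13) >> 20  # dentry scale = 13
    inode_mb = table_bytes(mem, 14) >> 20   # inode scale = 14
    print(f"{mem // GB} GB RAM: dentry {dentry_mb} MB, inode {inode_mb} MB")
```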

It's unlikely to have anywhere near that many small files active, as it's
better to use the memory for something that is actually useful.

Also, even a few hops in the open hash table are normally not a problem
for dentry/inode; it is not as if file lookups are that critical.

For networking the picture may be different, but I suspect GBs worth of
hash tables are still overkill there (Dave et al. may have stronger opinions on this).

I think an upper size limit (with user override which already exists) is fine,
but if you really don't want to do it then scale the factor down 
very aggressively for larger sizes, so that we don't end up with more
than a few tens of MB.
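
A minimal sketch of the upper-limit idea, under the same simplified
sizing model (one bucket per 2**scale bytes, 8-byte buckets). The
function name and the 32 MB default are hypothetical, chosen only to
land in the "few tens of MB" range:

```python
def capped_entries(mem_bytes, scale, bucket_bytes=8, cap_bytes=32 << 20):
    """Buckets for one-per-2**scale-bytes sizing, clamped to cap_bytes.

    bucket_bytes and cap_bytes are illustrative assumptions, not values
    taken from any existing code.
    """
    entries = mem_bytes >> scale
    max_entries = cap_bytes // bucket_bytes
    return min(entries, max_entries)


# A small system is unaffected; a 1 TB system is clamped to the cap.
small = capped_entries(1 << 30, 13)   # 1 GB of RAM, dentry scale
large = capped_entries(1 << 40, 13)   # 1 TB of RAM, dentry scale
print(small * 8 >> 20, "MB vs", large * 8 >> 20, "MB")
```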

> This is basically a bug, and would not change the theory, but I am
> sure that changing scales without at least some theoretical backup

One dentry per page would only make sense if the files are zero sized.
If the file has even one byte then it already needs more than 1 page just to
cache the contents (even ignoring inodes and other caches).

With larger files that need multiple pages it makes even less sense.

So clearly the one-dentry-per-page theory is nonsense if the files are actually
used.

There is the "make find / + stat fast" case (where only the entries 
and inodes are cached). But even there it is unlikely that the TB system
has a much larger file system with more files than the 100GB system, so
once a reasonable plateau is reached I don't see why you would want
to exceed that.

Also the reason to make hash tables big is to minimize collisions,
but we have fairly good hash functions and a few hops in the worst case
are likely not a problem for an already expensive file access
or open.
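
As a rough illustration of "a few hops": for a chained table with a
uniformly distributing hash, the textbook expectation for a successful
lookup is 1 + (n - 1) / (2m) nodes visited, with n items in m buckets.
The sketch below assumes that model; the item and bucket counts are made
up for the example.

```python
def load_factor(n_items, n_buckets):
    """Average chain length: items per bucket."""
    return n_items / n_buckets


def avg_hops_hit(n_items, n_buckets):
    """Expected nodes visited for a successful lookup with chaining,
    assuming a uniform hash: 1 + (n - 1) / (2 * m)."""
    return 1 + (n_items - 1) / (2 * n_buckets)


# Hypothetical numbers: a table capped at 2**22 buckets (32 MB with
# 8-byte buckets) holding 2**24 cached entries.
buckets = 1 << 22
items = 1 << 24
print(load_factor(items, buckets), avg_hops_hit(items, buckets))
```

With these made-up numbers the load factor is 4 and a hit visits about
three nodes on average.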

BTW the other option would be to switch all the large system hashes to a
rhashtable and do the resizing only when it is actually needed. 
But that would be more work than just adding a reasonable upper limit.
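
For readers unfamiliar with the idea, here is a toy sketch of resizing
on demand: the table starts small and doubles its bucket array only when
the load factor exceeds a threshold. This is not the kernel's rhashtable
API and does not model its concurrency handling; the class and its
parameters are hypothetical.

```python
class GrowingTable:
    """Chained hash table that doubles its bucket array on demand."""

    def __init__(self, initial_buckets=16, max_load=1.0):
        # initial_buckets must be a power of two for the mask below.
        self.buckets = [[] for _ in range(initial_buckets)]
        self.count = 0
        self.max_load = max_load

    @staticmethod
    def _index(key, n_buckets):
        return hash(key) & (n_buckets - 1)

    def insert(self, key, value):
        chain = self.buckets[self._index(key, len(self.buckets))]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)   # update in place
                return
        chain.append((key, value))
        self.count += 1
        if self.count > self.max_load * len(self.buckets):
            self._grow()

    def lookup(self, key):
        for k, v in self.buckets[self._index(key, len(self.buckets))]:
            if k == key:
                return v
        return None

    def _grow(self):
        # Rehash every entry into a bucket array twice as large.
        new_n = 2 * len(self.buckets)
        new_buckets = [[] for _ in range(new_n)]
        for chain in self.buckets:
            for k, v in chain:
                new_buckets[self._index(k, new_n)].append((k, v))
        self.buckets = new_buckets


table = GrowingTable(initial_buckets=4)
for i in range(100):
    table.insert(i, i * i)
print(len(table.buckets), table.lookup(42))
```

The point of the design is that memory for buckets is only spent once
the number of entries actually justifies it.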

-Andi


Thread overview: 13+ messages
2017-03-01  0:14 [PATCH v2 0/3] Zeroing hash tables in allocator Pavel Tatashin
2017-03-01  0:14 ` [PATCH v2 1/3] sparc64: NG4 memset 32 bits overflow Pavel Tatashin
2017-03-01  0:24   ` Andi Kleen
2017-03-01 14:51     ` Pasha Tatashin
2017-03-01 15:19       ` Andi Kleen
2017-03-01 16:34         ` Pasha Tatashin
2017-03-01 17:31           ` Andi Kleen
2017-03-01 21:20             ` Pasha Tatashin
2017-03-01 23:10               ` Andi Kleen [this message]
2017-03-02 19:15                 ` Pasha Tatashin
2017-03-02  0:12               ` Matthew Wilcox
2017-03-01  0:14 ` [PATCH v2 2/3] mm: Zeroing hash tables in allocator Pavel Tatashin
2017-03-01  0:14 ` [PATCH v2 3/3] mm: Updated callers to use HASH_ZERO flag Pavel Tatashin
