linux-kernel.vger.kernel.org archive mirror
From: Hugh Dickins <hughd@google.com>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Jani Nikula <jani.nikula@linux.intel.com>,
	Joonas Lahtinen <joonas.lahtinen@linux.intel.com>,
	Rodrigo Vivi <rodrigo.vivi@intel.com>,
	Chris Wilson <chris@chris-wilson.co.uk>,
	David Howells <dhowells@redhat.com>,
	Christoph Hellwig <hch@lst.de>, David Airlie <airlied@linux.ie>,
	Daniel Vetter <daniel@ffwll.ch>,
	intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Hugh Dickins <hughd@google.com>
Subject: Re: [PATCHv2 2/3] i915: convert to new mount API
Date: Tue, 6 Aug 2019 00:50:10 -0700 (PDT)
Message-ID: <alpine.LSU.2.11.1908060007190.1941@eggly.anvils>
In-Reply-To: <20190805182834.GI1131@ZenIV.linux.org.uk>

On Mon, 5 Aug 2019, Al Viro wrote:
> On Mon, Aug 05, 2019 at 07:12:55PM +0100, Al Viro wrote:
> > On Tue, Aug 06, 2019 at 01:03:06AM +0900, Sergey Senozhatsky wrote:
> > > tmpfs does not set ->remount_fs() anymore and its users need
> > > to be converted to new mount API.
> > 
> > Could you explain why the devil do you bother with remount at all?
> > Why not pass the right options when mounting the damn thing?
> 
> ... and while we are at it, I really wonder what's going on with
> that gemfs thing - among the other things, this is the only
> user of shmem_file_setup_with_mnt().  Sure, you want your own
> options, but that brings another question - is there any reason
> for having the huge=... per-superblock rather than per-file?

Yes: we want a default for how files of that superblock are to
allocate their pages, without people having to fcntl or advise
each of their files.

Setting aside the weirder options (within_size, advise) and emergency/
testing override (shmem_huge), we want files on an ordinary default
tmpfs (huge=never) to be allocated with small pages (so users with
access to that filesystem will not consume, and will not waste time
and space on consuming, the more valuable huge pages); but files on a
huge=always tmpfs to be allocated with huge pages whenever possible.
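
As a rough illustration of that per-superblock default, the sketch
below mounts a tmpfs instance with a chosen huge= option via mount(2).
The helper name mount_tmpfs_huge() and its 64-byte option buffer are
assumptions made for this example, not anything from the patch under
discussion.  It needs CAP_SYS_ADMIN, and the huge= option is only
accepted on kernels built with transparent hugepage support for tmpfs.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/mount.h>

/*
 * Hypothetical helper: mount a tmpfs at `target` with huge=<policy>,
 * where policy is one of "never", "always", "within_size", "advise".
 * Returns 0 on success, -1 with errno set on failure.
 */
static int mount_tmpfs_huge(const char *target, const char *policy)
{
	char opts[64];
	int n;

	assert(target != NULL && policy != NULL);

	/* Build the mount data string, e.g. "huge=always". */
	n = snprintf(opts, sizeof(opts), "huge=%s", policy);
	if (n < 0 || (size_t)n >= sizeof(opts)) {
		errno = EINVAL;	/* policy string does not fit */
		return -1;
	}

	/* Every file created on this superblock inherits the policy. */
	return mount("tmpfs", target, "tmpfs", 0, opts);
}
```

With such a mount in place, files created under the mount point pick up
the huge= setting without any per-file fcntl or madvise call.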

Or am I missing your point?  Yes, hugeness can certainly be decided
differently per-file, or even per-extent of file.  That is already
made possible through "judicious" use of madvise MADV_HUGEPAGE and
MADV_NOHUGEPAGE on mmaps of the file, carried over from anon THP.
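
A minimal sketch of that madvise route, assuming a shmem-backed file
obtained from memfd_create(2) rather than a file on a mounted tmpfs.
The helper names make_shmem_fd() and map_with_huge_hint() are invented
for this example.  The madvise() result is handed back to the caller
instead of being treated as fatal, since kernels without THP support
reject MADV_HUGEPAGE with EINVAL.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a shmem-backed file of `len` bytes. */
static int make_shmem_fd(size_t len)
{
	int fd = memfd_create("huge-hint-demo", 0);

	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)len) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Hypothetical helper: map `len` bytes of `fd` shared, then pass a
 * hugepage hint for the whole range.  Returns the mapping (or
 * MAP_FAILED) and stores the madvise() return value in *advice_rc.
 */
static void *map_with_huge_hint(int fd, size_t len, int want_huge,
				int *advice_rc)
{
	void *p;

	assert(advice_rc != NULL);

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		return MAP_FAILED;

	/* The hint applies to this virtual range, not to the file itself. */
	*advice_rc = madvise(p, len,
			     want_huge ? MADV_HUGEPAGE : MADV_NOHUGEPAGE);
	return p;
}
```

Note that the hint is attached to the mapping: a second mapping of the
same file starts without it, which is part of why a per-file or
per-superblock setting is more convenient.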

Though personally I'm averse to managing "f"objects through
"m"interfaces, which can get ridiculous (notably, MADV_HUGEPAGE works
on the virtual address of a mapping, but the huge-or-not alignment of
that mapping must have been decided previously).  In Google we do use
fcntls F_HUGEPAGE and F_NOHUGEPAGE to override on a per-file basis -
one day I'll get to upstreaming those.
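
The combination of a superblock default with a per-file override can be
modelled in a few lines of user-space C.  This is a toy model only: the
types and names below are invented for illustration, are not the
kernel's structures, and do not describe how F_HUGEPAGE and
F_NOHUGEPAGE are actually implemented.  In particular, mapping an
override straight to "always" or "never" is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative names only; not the kernel's SHMEM_HUGE_* values. */
enum huge_policy { HUGE_NEVER, HUGE_ALWAYS, HUGE_WITHIN_SIZE, HUGE_ADVISE };
enum file_override { OVERRIDE_NONE, OVERRIDE_HUGE, OVERRIDE_NOHUGE };

struct toy_sb {
	enum huge_policy huge;		/* default set at mount time */
};

struct toy_file {
	const struct toy_sb *sb;	/* superblock the file lives on */
	enum file_override override;	/* optional per-file setting */
};

/* A per-file override wins; otherwise fall back to the superblock. */
static enum huge_policy effective_policy(const struct toy_file *f)
{
	assert(f != NULL && f->sb != NULL);

	switch (f->override) {
	case OVERRIDE_HUGE:
		return HUGE_ALWAYS;	/* assumed meaning of the override */
	case OVERRIDE_NOHUGE:
		return HUGE_NEVER;	/* assumed meaning of the override */
	default:
		return f->sb->huge;
	}
}
```

A file with no override on a huge=always superblock resolves to
HUGE_ALWAYS, while the same file with OVERRIDE_NOHUGE resolves to
HUGE_NEVER.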

Hugh

> 
> After all, the readers of ->huge in mm/shmem.c are
> mm/shmem.c:582:     (shmem_huge == SHMEM_HUGE_FORCE || sbinfo->huge) &&
> 	is_huge_enabled(), sbinfo is an explicit argument
> 
> mm/shmem.c:1799:        switch (sbinfo->huge) {
> 	shmem_getpage_gfp(), sbinfo comes from inode
> 
> mm/shmem.c:2113:                if (SHMEM_SB(sb)->huge == SHMEM_HUGE_NEVER)
> 	shmem_get_unmapped_area(), sb comes from file
> 
> mm/shmem.c:3531:        if (sbinfo->huge)
> mm/shmem.c:3532:                seq_printf(seq, ",huge=%s", shmem_format_huge(sbinfo->huge));
> 	->show_options()
> mm/shmem.c:3880:        switch (sbinfo->huge) {
> 	shmem_huge_enabled(), sbinfo comes from an inode
> 
> And the only caller of is_huge_enabled() is shmem_getattr(), with sbinfo
> picked from inode.
> 
> So is there any reason why the hugepage policy can't be per-file, with
> the current being overridable default?

