linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHv2 0/3] convert i915 to new mount API
@ 2019-08-05 16:03 Sergey Senozhatsky
  2019-08-05 16:03 ` [PATCHv2 1/3] fs: export put_filesystem() Sergey Senozhatsky
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Sergey Senozhatsky @ 2019-08-05 16:03 UTC (permalink / raw)
  To: Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Chris Wilson,
	Alexander Viro, David Howells, Christoph Hellwig
  Cc: David Airlie, Daniel Vetter, intel-gfx, dri-devel, linux-kernel,
	linux-fsdevel, Sergey Senozhatsky

	Hello,

Convert i915 to a new mount API and fix i915_gemfs_init() kernel Oops.

It also appears that we need to EXPORTs put_filesystem(), so i915
can properly put filesystem after it is done with kern_mount().

v2:
- export put_filesystem() [Chris]
- always put_filesystem() in i915_gemfs_init() [Chris]
- improve i915_gemfs_init() error message [Chris]

Sergey Senozhatsky (3):
  fs: export put_filesystem()
  i915: convert to new mount API
  i915: do not leak module ref counter

 drivers/gpu/drm/i915/gem/i915_gemfs.c | 33 +++++++++++++++++++--------
 fs/filesystems.c                      |  1 +
 2 files changed, 25 insertions(+), 9 deletions(-)

-- 
2.22.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCHv2 1/3] fs: export put_filesystem()
  2019-08-05 16:03 [PATCHv2 0/3] convert i915 to new mount API Sergey Senozhatsky
@ 2019-08-05 16:03 ` Sergey Senozhatsky
  2019-08-05 16:03 ` [PATCHv2 2/3] i915: convert to new mount API Sergey Senozhatsky
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 16+ messages in thread
From: Sergey Senozhatsky @ 2019-08-05 16:03 UTC (permalink / raw)
  To: Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Chris Wilson,
	Alexander Viro, David Howells, Christoph Hellwig
  Cc: David Airlie, Daniel Vetter, intel-gfx, dri-devel, linux-kernel,
	linux-fsdevel, Sergey Senozhatsky

Modules can use get_fs_type(), which is exported, but are unable
to put_filesystem(). Export it and let modules to also decrement
corresponding file systems' reference counters.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 fs/filesystems.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/filesystems.c b/fs/filesystems.c
index 9135646e41ac..02669839b584 100644
--- a/fs/filesystems.c
+++ b/fs/filesystems.c
@@ -45,6 +45,7 @@ void put_filesystem(struct file_system_type *fs)
 {
 	module_put(fs->owner);
 }
+EXPORT_SYMBOL(put_filesystem);
 
 static struct file_system_type **find_filesystem(const char *name, unsigned len)
 {
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCHv2 2/3] i915: convert to new mount API
  2019-08-05 16:03 [PATCHv2 0/3] convert i915 to new mount API Sergey Senozhatsky
  2019-08-05 16:03 ` [PATCHv2 1/3] fs: export put_filesystem() Sergey Senozhatsky
@ 2019-08-05 16:03 ` Sergey Senozhatsky
  2019-08-05 18:12   ` Al Viro
  2019-08-05 16:03 ` [PATCHv2 3/3] i915: do not leak module ref counter Sergey Senozhatsky
  2019-08-05 19:34 ` [PATCHv2 0/3] convert i915 to new mount API Sedat Dilek
  3 siblings, 1 reply; 16+ messages in thread
From: Sergey Senozhatsky @ 2019-08-05 16:03 UTC (permalink / raw)
  To: Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Chris Wilson,
	Alexander Viro, David Howells, Christoph Hellwig
  Cc: David Airlie, Daniel Vetter, intel-gfx, dri-devel, linux-kernel,
	linux-fsdevel, Sergey Senozhatsky

tmpfs does not set ->remount_fs() anymore and its users need
to be converted to new mount API.

 BUG: kernel NULL pointer dereference, address: 0000000000000000
 PF: supervisor instruction fetch in kernel mode
 PF: error_code(0x0010) - not-present page
 RIP: 0010:0x0
 Code: Bad RIP value.
 Call Trace:
  i915_gemfs_init+0x6e/0xa0 [i915]
  i915_gem_init_early+0x76/0x90 [i915]
  i915_driver_probe+0x30a/0x1640 [i915]
  ? kernfs_activate+0x5a/0x80
  ? kernfs_add_one+0xdd/0x130
  pci_device_probe+0x9e/0x110
  really_probe+0xce/0x230
  driver_probe_device+0x4b/0xc0
  device_driver_attach+0x4e/0x60
  __driver_attach+0x47/0xb0
  ? device_driver_attach+0x60/0x60
  bus_for_each_dev+0x61/0x90
  bus_add_driver+0x167/0x1b0
  driver_register+0x67/0xaa

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gem/i915_gemfs.c | 32 +++++++++++++++++++--------
 1 file changed, 23 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gemfs.c b/drivers/gpu/drm/i915/gem/i915_gemfs.c
index 099f3397aada..feedc9242072 100644
--- a/drivers/gpu/drm/i915/gem/i915_gemfs.c
+++ b/drivers/gpu/drm/i915/gem/i915_gemfs.c
@@ -7,14 +7,17 @@
 #include <linux/fs.h>
 #include <linux/mount.h>
 #include <linux/pagemap.h>
+#include <linux/fs_context.h>
 
 #include "i915_drv.h"
 #include "i915_gemfs.h"
 
 int i915_gemfs_init(struct drm_i915_private *i915)
 {
+	struct fs_context *fc = NULL;
 	struct file_system_type *type;
 	struct vfsmount *gemfs;
+	bool ok = true;
 
 	type = get_fs_type("tmpfs");
 	if (!type)
@@ -36,18 +39,29 @@ int i915_gemfs_init(struct drm_i915_private *i915)
 		struct super_block *sb = gemfs->mnt_sb;
 		/* FIXME: Disabled until we get W/A for read BW issue. */
 		char options[] = "huge=never";
-		int flags = 0;
-		int err;
-
-		err = sb->s_op->remount_fs(sb, &flags, options);
-		if (err) {
-			kern_unmount(gemfs);
-			return err;
-		}
+
+		ok = false;
+		fc = fs_context_for_reconfigure(sb->s_root, 0, 0);
+		if (IS_ERR(fc))
+			goto out;
+
+		if (!fc->ops->parse_monolithic ||
+				fc->ops->parse_monolithic(fc, options))
+			goto out;
+
+		if (fc->ops->reconfigure && !fc->ops->reconfigure(fc))
+			ok = true;
 	}
 
+out:
+	if (!ok)
+		dev_err(i915->drm.dev,
+			"Unable to reconfigure %s. %s\n",
+			"shmemfs for preferred allocation strategy",
+			"Continuing, but performance may suffer");
+	if (!IS_ERR_OR_NULL(fc))
+		put_fs_context(fc);
 	i915->mm.gemfs = gemfs;
-
 	return 0;
 }
 
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCHv2 3/3] i915: do not leak module ref counter
  2019-08-05 16:03 [PATCHv2 0/3] convert i915 to new mount API Sergey Senozhatsky
  2019-08-05 16:03 ` [PATCHv2 1/3] fs: export put_filesystem() Sergey Senozhatsky
  2019-08-05 16:03 ` [PATCHv2 2/3] i915: convert to new mount API Sergey Senozhatsky
@ 2019-08-05 16:03 ` Sergey Senozhatsky
  2019-08-05 19:34 ` [PATCHv2 0/3] convert i915 to new mount API Sedat Dilek
  3 siblings, 0 replies; 16+ messages in thread
From: Sergey Senozhatsky @ 2019-08-05 16:03 UTC (permalink / raw)
  To: Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Chris Wilson,
	Alexander Viro, David Howells, Christoph Hellwig
  Cc: David Airlie, Daniel Vetter, intel-gfx, dri-devel, linux-kernel,
	linux-fsdevel, Sergey Senozhatsky

put_filesystem() must follow get_fs_type().

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 drivers/gpu/drm/i915/gem/i915_gemfs.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gemfs.c b/drivers/gpu/drm/i915/gem/i915_gemfs.c
index feedc9242072..93ac365ce9ce 100644
--- a/drivers/gpu/drm/i915/gem/i915_gemfs.c
+++ b/drivers/gpu/drm/i915/gem/i915_gemfs.c
@@ -24,6 +24,7 @@ int i915_gemfs_init(struct drm_i915_private *i915)
 		return -ENODEV;
 
 	gemfs = kern_mount(type);
+	put_filesystem(type);
 	if (IS_ERR(gemfs))
 		return PTR_ERR(gemfs);
 
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCHv2 2/3] i915: convert to new mount API
  2019-08-05 16:03 ` [PATCHv2 2/3] i915: convert to new mount API Sergey Senozhatsky
@ 2019-08-05 18:12   ` Al Viro
  2019-08-05 18:28     ` Al Viro
                       ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Al Viro @ 2019-08-05 18:12 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Chris Wilson,
	David Howells, Christoph Hellwig, David Airlie, Daniel Vetter,
	intel-gfx, dri-devel, linux-kernel, linux-fsdevel

On Tue, Aug 06, 2019 at 01:03:06AM +0900, Sergey Senozhatsky wrote:
> tmpfs does not set ->remount_fs() anymore and its users need
> to be converted to new mount API.

Could you explain why the devil do you bother with remount at all?
Why not pass the right options when mounting the damn thing?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCHv2 2/3] i915: convert to new mount API
  2019-08-05 18:12   ` Al Viro
@ 2019-08-05 18:28     ` Al Viro
  2019-08-06  7:50       ` Hugh Dickins
  2019-08-05 23:33     ` Al Viro
  2019-08-06  1:20     ` Sergey Senozhatsky
  2 siblings, 1 reply; 16+ messages in thread
From: Al Viro @ 2019-08-05 18:28 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Chris Wilson,
	David Howells, Christoph Hellwig, David Airlie, Daniel Vetter,
	intel-gfx, dri-devel, linux-kernel, linux-fsdevel, Hugh Dickins

On Mon, Aug 05, 2019 at 07:12:55PM +0100, Al Viro wrote:
> On Tue, Aug 06, 2019 at 01:03:06AM +0900, Sergey Senozhatsky wrote:
> > tmpfs does not set ->remount_fs() anymore and its users need
> > to be converted to new mount API.
> 
> Could you explain why the devil do you bother with remount at all?
> Why not pass the right options when mounting the damn thing?

... and while we are at it, I really wonder what's going on with
that gemfs thing - among the other things, this is the only
user of shmem_file_setup_with_mnt().  Sure, you want your own
options, but that brings another question - is there any reason
for having the huge=... per-superblock rather than per-file?

After all, the readers of ->huge in mm/shmem.c are
mm/shmem.c:582:     (shmem_huge == SHMEM_HUGE_FORCE || sbinfo->huge) &&
	is_huge_enabled(), sbinfo is an explicit argument

mm/shmem.c:1799:        switch (sbinfo->huge) {
	shmem_getpage_gfp(), sbinfo comes from inode

mm/shmem.c:2113:                if (SHMEM_SB(sb)->huge == SHMEM_HUGE_NEVER)
	shmem_get_unmapped_area(), sb comes from file

mm/shmem.c:3531:        if (sbinfo->huge)
mm/shmem.c:3532:                seq_printf(seq, ",huge=%s", shmem_format_huge(sbinfo->huge));
	->show_options()
mm/shmem.c:3880:        switch (sbinfo->huge) {
	shmem_huge_enabled(), sbinfo comes from an inode

And the only caller of is_huge_enabled() is shmem_getattr(), with sbinfo
picked from inode.

So is there any reason why the hugepage policy can't be per-file, with
the current being overridable default?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCHv2 0/3] convert i915 to new mount API
  2019-08-05 16:03 [PATCHv2 0/3] convert i915 to new mount API Sergey Senozhatsky
                   ` (2 preceding siblings ...)
  2019-08-05 16:03 ` [PATCHv2 3/3] i915: do not leak module ref counter Sergey Senozhatsky
@ 2019-08-05 19:34 ` Sedat Dilek
  3 siblings, 0 replies; 16+ messages in thread
From: Sedat Dilek @ 2019-08-05 19:34 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Chris Wilson,
	Alexander Viro, David Howells, Christoph Hellwig, David Airlie,
	Daniel Vetter, intel-gfx, dri-devel, linux-kernel, linux-fsdevel

On Mon, Aug 5, 2019 at 6:05 PM Sergey Senozhatsky
<sergey.senozhatsky@gmail.com> wrote:
>
>         Hello,
>
> Convert i915 to a new mount API and fix i915_gemfs_init() kernel Oops.
>
> It also appears that we need to EXPORTs put_filesystem(), so i915
> can properly put filesystem after it is done with kern_mount().
>
> v2:
> - export put_filesystem() [Chris]
> - always put_filesystem() in i915_gemfs_init() [Chris]
> - improve i915_gemfs_init() error message [Chris]
>
> Sergey Senozhatsky (3):
>   fs: export put_filesystem()
>   i915: convert to new mount API
>   i915: do not leak module ref counter
>

Fee free to add:

Reported-by: Sedat Dilek <sedat.dilek@gmail.com>

[1] https://lore.kernel.org/lkml/CA+icZUXh068m8UFeHDXCKDi0YfL2Z=WoONy7J7DJLqAT1CZ+rQ@mail.gmail.com/

>  drivers/gpu/drm/i915/gem/i915_gemfs.c | 33 +++++++++++++++++++--------
>  fs/filesystems.c                      |  1 +
>  2 files changed, 25 insertions(+), 9 deletions(-)
>
> --
> 2.22.0
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCHv2 2/3] i915: convert to new mount API
  2019-08-05 18:12   ` Al Viro
  2019-08-05 18:28     ` Al Viro
@ 2019-08-05 23:33     ` Al Viro
  2019-08-06  1:20     ` Sergey Senozhatsky
  2 siblings, 0 replies; 16+ messages in thread
From: Al Viro @ 2019-08-05 23:33 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Chris Wilson,
	David Howells, Christoph Hellwig, David Airlie, Daniel Vetter,
	intel-gfx, dri-devel, linux-kernel, linux-fsdevel

On Mon, Aug 05, 2019 at 07:12:55PM +0100, Al Viro wrote:
> On Tue, Aug 06, 2019 at 01:03:06AM +0900, Sergey Senozhatsky wrote:
> > tmpfs does not set ->remount_fs() anymore and its users need
> > to be converted to new mount API.
> 
> Could you explain why the devil do you bother with remount at all?
> Why not pass the right options when mounting the damn thing?

Incidentally, the only remaining modular user of get_fs_type() is the
same i915_gemfs.c.  And I wonder if we should aim for unexporting
the damn thing instead of exporting put_filesystem()...

Note that users in tomoyo and apparmor are bogus - they are in the
instances of ill-defined method that needs to be split and moved,
with the lookups (fs type included) replaced with callers passing
the values they look up and will end up using.

IOW, outside of core VFS we have very few legitimate users, and the
one in kernel/trace might be better off as vfs_submount_by_name().

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCHv2 2/3] i915: convert to new mount API
  2019-08-05 18:12   ` Al Viro
  2019-08-05 18:28     ` Al Viro
  2019-08-05 23:33     ` Al Viro
@ 2019-08-06  1:20     ` Sergey Senozhatsky
  2 siblings, 0 replies; 16+ messages in thread
From: Sergey Senozhatsky @ 2019-08-06  1:20 UTC (permalink / raw)
  To: Al Viro
  Cc: Sergey Senozhatsky, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
	Chris Wilson, David Howells, Christoph Hellwig, David Airlie,
	Daniel Vetter, intel-gfx, dri-devel, linux-kernel, linux-fsdevel

On (08/05/19 19:12), Al Viro wrote:
[..]
> On Tue, Aug 06, 2019 at 01:03:06AM +0900, Sergey Senozhatsky wrote:
> > tmpfs does not set ->remount_fs() anymore and its users need
> > to be converted to new mount API.
> 
> Could you explain why the devil do you bother with remount at all?

I would redirect this question to i915 developers. As far as I know
i915 performance suffers with huge pages enabled.

> Why not pass the right options when mounting the damn thing?

vfs_kern_mount()? It still requires struct file_system_type, which
we need to get and put.

	-ss

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCHv2 2/3] i915: convert to new mount API
  2019-08-05 18:28     ` Al Viro
@ 2019-08-06  7:50       ` Hugh Dickins
  2019-08-07  6:30         ` Christoph Hellwig
  2019-08-08  1:21         ` Al Viro
  0 siblings, 2 replies; 16+ messages in thread
From: Hugh Dickins @ 2019-08-06  7:50 UTC (permalink / raw)
  To: Al Viro
  Cc: Sergey Senozhatsky, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
	Chris Wilson, David Howells, Christoph Hellwig, David Airlie,
	Daniel Vetter, intel-gfx, dri-devel, linux-kernel, linux-fsdevel,
	Hugh Dickins

On Mon, 5 Aug 2019, Al Viro wrote:
> On Mon, Aug 05, 2019 at 07:12:55PM +0100, Al Viro wrote:
> > On Tue, Aug 06, 2019 at 01:03:06AM +0900, Sergey Senozhatsky wrote:
> > > tmpfs does not set ->remount_fs() anymore and its users need
> > > to be converted to new mount API.
> > 
> > Could you explain why the devil do you bother with remount at all?
> > Why not pass the right options when mounting the damn thing?
> 
> ... and while we are at it, I really wonder what's going on with
> that gemfs thing - among the other things, this is the only
> user of shmem_file_setup_with_mnt().  Sure, you want your own
> options, but that brings another question - is there any reason
> for having the huge=... per-superblock rather than per-file?

Yes: we want a default for how files of that superblock are to
allocate their pages, without people having to fcntl or advise
each of their files.

Setting aside the weirder options (within_size, advise) and emergency/
testing override (shmem_huge), we want files on an ordinary default
tmpfs (huge=never) to be allocated with small pages (so users with
access to that filesystem will not consume, and will not waste time
and space on consuming, the more valuable huge pages); but files on a
huge=always tmpfs to be allocated with huge pages whenever possible.

Or am I missing your point?  Yes, hugeness can certainly be decided
differently per-file, or even per-extent of file.  That is already
made possible through "judicious" use of madvise MADV_HUGEPAGE and
MADV_NOHUGEPAGE on mmaps of the file, carried over from anon THP.

Though personally I'm averse to managing "f"objects through
"m"interfaces, which can get ridiculous (notably, MADV_HUGEPAGE works
on the virtual address of a mapping, but the huge-or-not alignment of
that mapping must have been decided previously).  In Google we do use
fcntls F_HUGEPAGE and F_NOHUGEPAGE to override on a per-file basis -
one day I'll get to upstreaming those.

Hugh

> 
> After all, the readers of ->huge in mm/shmem.c are
> mm/shmem.c:582:     (shmem_huge == SHMEM_HUGE_FORCE || sbinfo->huge) &&
> 	is_huge_enabled(), sbinfo is an explicit argument
> 
> mm/shmem.c:1799:        switch (sbinfo->huge) {
> 	shmem_getpage_gfp(), sbinfo comes from inode
> 
> mm/shmem.c:2113:                if (SHMEM_SB(sb)->huge == SHMEM_HUGE_NEVER)
> 	shmem_get_unmapped_area(), sb comes from file
> 
> mm/shmem.c:3531:        if (sbinfo->huge)
> mm/shmem.c:3532:                seq_printf(seq, ",huge=%s", shmem_format_huge(sbinfo->huge));
> 	->show_options()
> mm/shmem.c:3880:        switch (sbinfo->huge) {
> 	shmem_huge_enabled(), sbinfo comes from an inode
> 
> And the only caller of is_huge_enabled() is shmem_getattr(), with sbinfo
> picked from inode.
> 
> So is there any reason why the hugepage policy can't be per-file, with
> the current being overridable default?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCHv2 2/3] i915: convert to new mount API
  2019-08-06  7:50       ` Hugh Dickins
@ 2019-08-07  6:30         ` Christoph Hellwig
  2019-08-08  1:23           ` Al Viro
  2019-08-08  1:21         ` Al Viro
  1 sibling, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2019-08-07  6:30 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Al Viro, Sergey Senozhatsky, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, Chris Wilson, David Howells, Christoph Hellwig,
	David Airlie, Daniel Vetter, intel-gfx, dri-devel, linux-kernel,
	linux-fsdevel

On Tue, Aug 06, 2019 at 12:50:10AM -0700, Hugh Dickins wrote:
> Though personally I'm averse to managing "f"objects through
> "m"interfaces, which can get ridiculous (notably, MADV_HUGEPAGE works
> on the virtual address of a mapping, but the huge-or-not alignment of
> that mapping must have been decided previously).  In Google we do use
> fcntls F_HUGEPAGE and F_NOHUGEPAGE to override on a per-file basis -
> one day I'll get to upstreaming those.

Such an interface seems very useful, although the two fcntls seem a bit
odd.

But I think the point here is that the i915 has its own somewhat odd
instance of tmpfs.  If we could pass the equivalent of the huge=*
options to shmem_file_setup all that garbage (including the
shmem_file_setup_with_mnt function) could go away.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCHv2 2/3] i915: convert to new mount API
  2019-08-06  7:50       ` Hugh Dickins
  2019-08-07  6:30         ` Christoph Hellwig
@ 2019-08-08  1:21         ` Al Viro
  1 sibling, 0 replies; 16+ messages in thread
From: Al Viro @ 2019-08-08  1:21 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Sergey Senozhatsky, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
	Chris Wilson, David Howells, Christoph Hellwig, David Airlie,
	Daniel Vetter, intel-gfx, dri-devel, linux-kernel, linux-fsdevel

On Tue, Aug 06, 2019 at 12:50:10AM -0700, Hugh Dickins wrote:

> that mapping must have been decided previously).  In Google we do use
> fcntls F_HUGEPAGE and F_NOHUGEPAGE to override on a per-file basis -
> one day I'll get to upstreaming those.

That'd be nice - we could kill the i915 wierd private shmem instance,
along with some kludges in mm/shmem.c.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCHv2 2/3] i915: convert to new mount API
  2019-08-07  6:30         ` Christoph Hellwig
@ 2019-08-08  1:23           ` Al Viro
  2019-08-08 15:54             ` Hugh Dickins
  0 siblings, 1 reply; 16+ messages in thread
From: Al Viro @ 2019-08-08  1:23 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Hugh Dickins, Sergey Senozhatsky, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, Chris Wilson, David Howells, David Airlie,
	Daniel Vetter, intel-gfx, dri-devel, linux-kernel, linux-fsdevel

On Wed, Aug 07, 2019 at 08:30:02AM +0200, Christoph Hellwig wrote:
> On Tue, Aug 06, 2019 at 12:50:10AM -0700, Hugh Dickins wrote:
> > Though personally I'm averse to managing "f"objects through
> > "m"interfaces, which can get ridiculous (notably, MADV_HUGEPAGE works
> > on the virtual address of a mapping, but the huge-or-not alignment of
> > that mapping must have been decided previously).  In Google we do use
> > fcntls F_HUGEPAGE and F_NOHUGEPAGE to override on a per-file basis -
> > one day I'll get to upstreaming those.
> 
> Such an interface seems very useful, although the two fcntls seem a bit
> odd.
> 
> But I think the point here is that the i915 has its own somewhat odd
> instance of tmpfs.  If we could pass the equivalent of the huge=*
> options to shmem_file_setup all that garbage (including the
> shmem_file_setup_with_mnt function) could go away.

... or follow shmem_file_super() with whatever that fcntl maps to
internally.  I would really love to get rid of that i915 kludge.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCHv2 2/3] i915: convert to new mount API
  2019-08-08  1:23           ` Al Viro
@ 2019-08-08 15:54             ` Hugh Dickins
  2019-08-08 16:23               ` Chris Wilson
  0 siblings, 1 reply; 16+ messages in thread
From: Hugh Dickins @ 2019-08-08 15:54 UTC (permalink / raw)
  To: Al Viro
  Cc: Christoph Hellwig, Hugh Dickins, Sergey Senozhatsky, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Chris Wilson, David Howells,
	David Airlie, Daniel Vetter, intel-gfx, dri-devel, linux-kernel,
	linux-fsdevel

On Thu, 8 Aug 2019, Al Viro wrote:
> On Wed, Aug 07, 2019 at 08:30:02AM +0200, Christoph Hellwig wrote:
> > On Tue, Aug 06, 2019 at 12:50:10AM -0700, Hugh Dickins wrote:
> > > Though personally I'm averse to managing "f"objects through
> > > "m"interfaces, which can get ridiculous (notably, MADV_HUGEPAGE works
> > > on the virtual address of a mapping, but the huge-or-not alignment of
> > > that mapping must have been decided previously).  In Google we do use
> > > fcntls F_HUGEPAGE and F_NOHUGEPAGE to override on a per-file basis -
> > > one day I'll get to upstreaming those.
> > 
> > Such an interface seems very useful, although the two fcntls seem a bit
> > odd.
> > 
> > But I think the point here is that the i915 has its own somewhat odd
> > instance of tmpfs.  If we could pass the equivalent of the huge=*
> > options to shmem_file_setup all that garbage (including the
> > shmem_file_setup_with_mnt function) could go away.
> 
> ... or follow shmem_file_super() with whatever that fcntl maps to
> internally.  I would really love to get rid of that i915 kludge.

As to the immediate problem of i915_gemfs using remount_fs on linux-next,
IIUC, all that is necessary at the moment is the deletions patch below
(but I'd prefer that to come from the i915 folks).  Since gemfs has no
need to change the huge option from its default to its default.

As to the future of when they get back to wanting huge pages in gemfs,
yes, that can probably best be arranged by using the internals of an
fcntl F_HUGEPAGE on those objects that would benefit from it.

Though my intention there was that the "huge=never" default ought
to continue to refuse to give huge pages, even when asked by fcntl.
So a little hackery may still be required, to allow the i915_gemfs
internal mount to get huge pages when a user mount would not.

As to whether shmem_file_setup_with_mnt() needs to live: I've given
that no thought, but accept that shm_mnt is such a ragbag of different
usages, that i915 is right to prefer their own separate gemfs mount.

Hugh

--- mmotm/drivers/gpu/drm/i915/gem/i915_gemfs.c	2019-07-21 19:40:16.573703780 -0700
+++ linux/drivers/gpu/drm/i915/gem/i915_gemfs.c	2019-08-08 07:19:23.967689058 -0700
@@ -24,28 +24,6 @@ int i915_gemfs_init(struct drm_i915_priv
 	if (IS_ERR(gemfs))
 		return PTR_ERR(gemfs);
 
-	/*
-	 * Enable huge-pages for objects that are at least HPAGE_PMD_SIZE, most
-	 * likely 2M. Note that within_size may overallocate huge-pages, if say
-	 * we allocate an object of size 2M + 4K, we may get 2M + 2M, but under
-	 * memory pressure shmem should split any huge-pages which can be
-	 * shrunk.
-	 */
-
-	if (has_transparent_hugepage()) {
-		struct super_block *sb = gemfs->mnt_sb;
-		/* FIXME: Disabled until we get W/A for read BW issue. */
-		char options[] = "huge=never";
-		int flags = 0;
-		int err;
-
-		err = sb->s_op->remount_fs(sb, &flags, options);
-		if (err) {
-			kern_unmount(gemfs);
-			return err;
-		}
-	}
-
 	i915->mm.gemfs = gemfs;
 
 	return 0;

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCHv2 2/3] i915: convert to new mount API
  2019-08-08 15:54             ` Hugh Dickins
@ 2019-08-08 16:23               ` Chris Wilson
  2019-08-08 17:03                 ` Matthew Auld
  0 siblings, 1 reply; 16+ messages in thread
From: Chris Wilson @ 2019-08-08 16:23 UTC (permalink / raw)
  To: Al Viro, Hugh Dickins
  Cc: Christoph Hellwig, Hugh Dickins, Sergey Senozhatsky, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, David Howells, David Airlie,
	Daniel Vetter, intel-gfx, dri-devel, linux-kernel, linux-fsdevel,
	Matthew Auld

Quoting Hugh Dickins (2019-08-08 16:54:16)
> On Thu, 8 Aug 2019, Al Viro wrote:
> > On Wed, Aug 07, 2019 at 08:30:02AM +0200, Christoph Hellwig wrote:
> > > On Tue, Aug 06, 2019 at 12:50:10AM -0700, Hugh Dickins wrote:
> > > > Though personally I'm averse to managing "f"objects through
> > > > "m"interfaces, which can get ridiculous (notably, MADV_HUGEPAGE works
> > > > on the virtual address of a mapping, but the huge-or-not alignment of
> > > > that mapping must have been decided previously).  In Google we do use
> > > > fcntls F_HUGEPAGE and F_NOHUGEPAGE to override on a per-file basis -
> > > > one day I'll get to upstreaming those.
> > > 
> > > Such an interface seems very useful, although the two fcntls seem a bit
> > > odd.
> > > 
> > > But I think the point here is that the i915 has its own somewhat odd
> > > instance of tmpfs.  If we could pass the equivalent of the huge=*
> > > options to shmem_file_setup all that garbage (including the
> > > shmem_file_setup_with_mnt function) could go away.
> > 
> > ... or follow shmem_file_super() with whatever that fcntl maps to
> > internally.  I would really love to get rid of that i915 kludge.
> 
> As to the immediate problem of i915_gemfs using remount_fs on linux-next,
> IIUC, all that is necessary at the moment is the deletions patch below
> (but I'd prefer that to come from the i915 folks).  Since gemfs has no
> need to change the huge option from its default to its default.
> 
> As to the future of when they get back to wanting huge pages in gemfs,
> yes, that can probably best be arranged by using the internals of an
> fcntl F_HUGEPAGE on those objects that would benefit from it.
> 
> Though my intention there was that the "huge=never" default ought
> to continue to refuse to give huge pages, even when asked by fcntl.
> So a little hackery may still be required, to allow the i915_gemfs
> internal mount to get huge pages when a user mount would not.
> 
> As to whether shmem_file_setup_with_mnt() needs to live: I've given
> that no thought, but accept that shm_mnt is such a ragbag of different
> usages, that i915 is right to prefer their own separate gemfs mount.
> 
> Hugh
> 
> --- mmotm/drivers/gpu/drm/i915/gem/i915_gemfs.c 2019-07-21 19:40:16.573703780 -0700
> +++ linux/drivers/gpu/drm/i915/gem/i915_gemfs.c 2019-08-08 07:19:23.967689058 -0700
> @@ -24,28 +24,6 @@ int i915_gemfs_init(struct drm_i915_priv
>         if (IS_ERR(gemfs))
>                 return PTR_ERR(gemfs);
>  
> -       /*
> -        * Enable huge-pages for objects that are at least HPAGE_PMD_SIZE, most
> -        * likely 2M. Note that within_size may overallocate huge-pages, if say
> -        * we allocate an object of size 2M + 4K, we may get 2M + 2M, but under
> -        * memory pressure shmem should split any huge-pages which can be
> -        * shrunk.
> -        */
> -
> -       if (has_transparent_hugepage()) {
> -               struct super_block *sb = gemfs->mnt_sb;
> -               /* FIXME: Disabled until we get W/A for read BW issue. */
> -               char options[] = "huge=never";
> -               int flags = 0;
> -               int err;
> -
> -               err = sb->s_op->remount_fs(sb, &flags, options);
> -               if (err) {
> -                       kern_unmount(gemfs);
> -                       return err;
> -               }
> -       }

That's perfectly fine; we should probably leave a hint as to why gemfs
exists and include the suggestion of looking at per-file hugepage
controls.

Matthew, how does this affect your current plans? If at all?
-Chris

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCHv2 2/3] i915: convert to new mount API
  2019-08-08 16:23               ` Chris Wilson
@ 2019-08-08 17:03                 ` Matthew Auld
  0 siblings, 0 replies; 16+ messages in thread
From: Matthew Auld @ 2019-08-08 17:03 UTC (permalink / raw)
  To: Chris Wilson, Al Viro, Hugh Dickins
  Cc: Christoph Hellwig, Sergey Senozhatsky, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, David Howells, David Airlie,
	Daniel Vetter, intel-gfx, dri-devel, linux-kernel, linux-fsdevel

On 08/08/2019 17:23, Chris Wilson wrote:
> Quoting Hugh Dickins (2019-08-08 16:54:16)
>> On Thu, 8 Aug 2019, Al Viro wrote:
>>> On Wed, Aug 07, 2019 at 08:30:02AM +0200, Christoph Hellwig wrote:
>>>> On Tue, Aug 06, 2019 at 12:50:10AM -0700, Hugh Dickins wrote:
>>>>> Though personally I'm averse to managing "f"objects through
>>>>> "m"interfaces, which can get ridiculous (notably, MADV_HUGEPAGE works
>>>>> on the virtual address of a mapping, but the huge-or-not alignment of
>>>>> that mapping must have been decided previously).  In Google we do use
>>>>> fcntls F_HUGEPAGE and F_NOHUGEPAGE to override on a per-file basis -
>>>>> one day I'll get to upstreaming those.
>>>>
>>>> Such an interface seems very useful, although the two fcntls seem a bit
>>>> odd.
>>>>
>>>> But I think the point here is that the i915 has its own somewhat odd
>>>> instance of tmpfs.  If we could pass the equivalent of the huge=*
>>>> options to shmem_file_setup all that garbage (including the
>>>> shmem_file_setup_with_mnt function) could go away.
>>>
>>> ... or follow shmem_file_super() with whatever that fcntl maps to
>>> internally.  I would really love to get rid of that i915 kludge.
>>
>> As to the immediate problem of i915_gemfs using remount_fs on linux-next,
>> IIUC, all that is necessary at the moment is the deletions patch below
>> (but I'd prefer that to come from the i915 folks).  Since gemfs has no
>> need to change the huge option from its default to its default.
>>
>> As to the future of when they get back to wanting huge pages in gemfs,
>> yes, that can probably best be arranged by using the internals of an
>> fcntl F_HUGEPAGE on those objects that would benefit from it.
>>
>> Though my intention there was that the "huge=never" default ought
>> to continue to refuse to give huge pages, even when asked by fcntl.
>> So a little hackery may still be required, to allow the i915_gemfs
>> internal mount to get huge pages when a user mount would not.
>>
>> As to whether shmem_file_setup_with_mnt() needs to live: I've given
>> that no thought, but accept that shm_mnt is such a ragbag of different
>> usages, that i915 is right to prefer their own separate gemfs mount.
>>
>> Hugh
>>
>> --- mmotm/drivers/gpu/drm/i915/gem/i915_gemfs.c 2019-07-21 19:40:16.573703780 -0700
>> +++ linux/drivers/gpu/drm/i915/gem/i915_gemfs.c 2019-08-08 07:19:23.967689058 -0700
>> @@ -24,28 +24,6 @@ int i915_gemfs_init(struct drm_i915_priv
>>          if (IS_ERR(gemfs))
>>                  return PTR_ERR(gemfs);
>>   
>> -       /*
>> -        * Enable huge-pages for objects that are at least HPAGE_PMD_SIZE, most
>> -        * likely 2M. Note that within_size may overallocate huge-pages, if say
>> -        * we allocate an object of size 2M + 4K, we may get 2M + 2M, but under
>> -        * memory pressure shmem should split any huge-pages which can be
>> -        * shrunk.
>> -        */
>> -
>> -       if (has_transparent_hugepage()) {
>> -               struct super_block *sb = gemfs->mnt_sb;
>> -               /* FIXME: Disabled until we get W/A for read BW issue. */
>> -               char options[] = "huge=never";
>> -               int flags = 0;
>> -               int err;
>> -
>> -               err = sb->s_op->remount_fs(sb, &flags, options);
>> -               if (err) {
>> -                       kern_unmount(gemfs);
>> -                       return err;
>> -               }
>> -       }
> 
> That's perfectly fine; we should probably leave a hint as to why gemfs
> exists and include the suggestion of looking at per-file hugepage
> controls.
> 
> Matthew, how does this affect your current plans? If at all?
Fine with me.

> -Chris
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2019-08-08 17:03 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-05 16:03 [PATCHv2 0/3] convert i915 to new mount API Sergey Senozhatsky
2019-08-05 16:03 ` [PATCHv2 1/3] fs: export put_filesystem() Sergey Senozhatsky
2019-08-05 16:03 ` [PATCHv2 2/3] i915: convert to new mount API Sergey Senozhatsky
2019-08-05 18:12   ` Al Viro
2019-08-05 18:28     ` Al Viro
2019-08-06  7:50       ` Hugh Dickins
2019-08-07  6:30         ` Christoph Hellwig
2019-08-08  1:23           ` Al Viro
2019-08-08 15:54             ` Hugh Dickins
2019-08-08 16:23               ` Chris Wilson
2019-08-08 17:03                 ` Matthew Auld
2019-08-08  1:21         ` Al Viro
2019-08-05 23:33     ` Al Viro
2019-08-06  1:20     ` Sergey Senozhatsky
2019-08-05 16:03 ` [PATCHv2 3/3] i915: do not leak module ref counter Sergey Senozhatsky
2019-08-05 19:34 ` [PATCHv2 0/3] convert i915 to new mount API Sedat Dilek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).