($INBOX_DIR/description missing)
 help / color / Atom feed
* Re: kernel BUG at mm/hugetlb.c:LINE!
       [not found]           ` <78313ae9-8596-9cbe-f648-3152660be9b3@oracle.com>
@ 2020-05-22 10:05             ` Miklos Szeredi
  2020-05-28  0:01               ` Mike Kravetz
  0 siblings, 1 reply; 10+ messages in thread
From: Miklos Szeredi @ 2020-05-22 10:05 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: Colin Walters, syzbot, Andrew Morton, linux-fsdevel,
	linux-kernel, linux-mm, Miklos Szeredi, syzkaller-bugs, Al Viro,
	linux-unionfs

On Wed, May 20, 2020 at 10:27:15AM -0700, Mike Kravetz wrote:

> I am fairly confident it is all about checking limits and alignment.  The
> filesystem knows if it can/should align to base or huge page size. DAX has
> some interesting additional restrictions, and several 'traditional' filesystems
> check if they are 'on DAX'.


Okay, I haven't looked at DAX vs. overlay.  I'm sure it's going to come up at
some point, if it hasn't already.

> 
> In a previous e-mail, you suggested hugetlb_get_unmapped_area could do the
> length adjustment in hugetlb_get_unmapped_area (generic and arch specific).
> I agree, although there may be the need to add length overflow checks in
> these routines (after round up) as this is done in core code now.  However,
> this can be done as a separate cleanup patch.
> 
> In any case, we need to get the core mmap code to call filesystem specific
> get_unmapped_area if on a union/overlay.  The patch I suggested does this
> by simply calling real_file to determine if there is a filesystem specific
> get_unmapped_area.  The other approach would be to provide an overlayfs
> get_unmapped_area that calls the underlying filesystem get_unmapped_area.

That latter is what's done for all other stacked operations in overlayfs.

Untested patch below.

Thanks,
Miklos

---
 fs/overlayfs/file.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)

--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -757,6 +757,17 @@ static loff_t ovl_remap_file_range(struc
 			    remap_flags, op);
 }
 
+static unsigned long ovl_get_unmapped_area(struct file *file,
+				unsigned long uaddr, unsigned long len,
+				unsigned long pgoff, unsigned long flags)
+{
+	struct file *realfile = file->private_data;
+
+	return (realfile->f_op->get_unmapped_area ?:
+		current->mm->get_unmapped_area)(realfile,
+						uaddr, len, pgoff, flags);
+}
+
 const struct file_operations ovl_file_operations = {
 	.open		= ovl_open,
 	.release	= ovl_release,
@@ -774,6 +785,7 @@ const struct file_operations ovl_file_op
 
 	.copy_file_range	= ovl_copy_file_range,
 	.remap_file_range	= ovl_remap_file_range,
+	.get_unmapped_area	= ovl_get_unmapped_area,
 };
 
 int __init ovl_aio_request_cache_init(void)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kernel BUG at mm/hugetlb.c:LINE!
  2020-05-22 10:05             ` kernel BUG at mm/hugetlb.c:LINE! Miklos Szeredi
@ 2020-05-28  0:01               ` Mike Kravetz
  2020-05-28  8:37                 ` [PATCH v2] ovl: provide real_file() and overlayfs get_unmapped_area() kbuild test robot
  0 siblings, 1 reply; 10+ messages in thread
From: Mike Kravetz @ 2020-05-28  0:01 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Colin Walters, syzbot, Andrew Morton, linux-fsdevel,
	linux-kernel, linux-mm, Miklos Szeredi, syzkaller-bugs, Al Viro,
	linux-unionfs

On 5/22/20 3:05 AM, Miklos Szeredi wrote:
> On Wed, May 20, 2020 at 10:27:15AM -0700, Mike Kravetz wrote:
> 
>> I am fairly confident it is all about checking limits and alignment.  The
>> filesystem knows if it can/should align to base or huge page size. DAX has
>> some interesting additional restrictions, and several 'traditional' filesystems
>> check if they are 'on DAX'.
> 
> 
> Okay, I haven't looked at DAX vs. overlay.  I'm sure it's going to come up at
> some point, if it hasn't already.
> 
>>
>> In a previous e-mail, you suggested hugetlb_get_unmapped_area could do the
>> length adjustment in hugetlb_get_unmapped_area (generic and arch specific).
>> I agree, although there may be the need to add length overflow checks in
>> these routines (after round up) as this is done in core code now.  However,
>> this can be done as a separate cleanup patch.
>>
>> In any case, we need to get the core mmap code to call filesystem specific
>> get_unmapped_area if on a union/overlay.  The patch I suggested does this
>> by simply calling real_file to determine if there is a filesystem specific
>> get_unmapped_area.  The other approach would be to provide an overlayfs
>> get_unmapped_area that calls the underlying filesystem get_unmapped_area.
> 
> That latter is what's done for all other stacked operations in overlayfs.
> 
> Untested patch below.
> 

Thanks Miklos!

We still need the 'real_file()' routine for is_file_hugepages.  So combining
these, I propose the following patch.  It addresses the known issue as well
as potential issues with is_file_hugepages returning incorrect information.
I don't really like a separate header file for real_file, but I can not
think of any good place to put it.

Let me know what you think,

From 33f6bbd19406108b61a4113b1ec8e4e6888cd482 Mon Sep 17 00:00:00 2001
From: Mike Kravetz <mike.kravetz@oracle.com>
Date: Wed, 27 May 2020 16:58:58 -0700
Subject: [PATCH v2] ovl: provide real_file() and overlayfs get_unmapped_area()

If a file is on a union/overlay, then the 'struct file *' will have
overlayfs file operations.  The routine is_file_hugepages() compares
f->f_op to hugetlbfs_file_operations to determine if it is a hugetlbfs
file.  If a hugetlbfs file is on a union/overlay, this comparison is
false and is_file_hugepages() incorrectly indicates the underlying
file is not hugetlbfs.  One result of this is a BUG as shown in [1].

mmap uses is_file_hugepages() because hugetlbfs files have different
alignment restrictions.  In addition, mmap code would like to use the
filesystem specific get_unmapped_area() routine if one is defined.

To address this issue,
- Add a new routine real_file() which will return the underlying file.
- Update is_file_hugepages to get the real file.
- Add get_unmapped_area f_op to oerrlayfs to call underlying routine.

[1] https://lore.kernel.org/linux-mm/000000000000b4684e05a2968ca6@google.com/

Reported-by: syzbot+d6ec23007e951dadf3de@syzkaller.appspotmail.com
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 fs/overlayfs/file.c       | 13 +++++++++++++
 include/linux/hugetlb.h   |  3 +++
 include/linux/overlayfs.h | 27 +++++++++++++++++++++++++++
 3 files changed, 43 insertions(+)
 create mode 100644 include/linux/overlayfs.h

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 87c362f65448..cc020e1c72d5 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -12,6 +12,7 @@
 #include <linux/splice.h>
 #include <linux/mm.h>
 #include <linux/fs.h>
+#include <linux/overlayfs.h>
 #include "overlayfs.h"
 
 struct ovl_aio_req {
@@ -757,6 +758,17 @@ static loff_t ovl_remap_file_range(struct file *file_in, loff_t pos_in,
 			    remap_flags, op);
 }
 
+static unsigned long ovl_get_unmapped_area(struct file *file,
+				unsigned long uaddr, unsigned long len,
+				unsigned long pgoff, unsigned long flags)
+{
+	struct file *realfile = real_file(file);
+
+	return (realfile->f_op->get_unmapped_area ?:
+		current->mm->get_unmapped_area)(realfile,
+						uaddr, len, pgoff, flags);
+}
+
 const struct file_operations ovl_file_operations = {
 	.open		= ovl_open,
 	.release	= ovl_release,
@@ -774,6 +786,7 @@ const struct file_operations ovl_file_operations = {
 
 	.copy_file_range	= ovl_copy_file_range,
 	.remap_file_range	= ovl_remap_file_range,
+	.get_unmapped_area	= ovl_get_unmapped_area,
 };
 
 int __init ovl_aio_request_cache_init(void)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 43a1cef8f0f1..fb22c0a7474a 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -9,6 +9,7 @@
 #include <linux/cgroup.h>
 #include <linux/list.h>
 #include <linux/kref.h>
+#include <linux/overlayfs.h>
 #include <asm/pgtable.h>
 
 struct ctl_table;
@@ -437,6 +438,8 @@ struct file *hugetlb_file_setup(const char *name, size_t size, vm_flags_t acct,
 
 static inline bool is_file_hugepages(struct file *file)
 {
+	file = real_file(file);
+
 	if (file->f_op == &hugetlbfs_file_operations)
 		return true;
 
diff --git a/include/linux/overlayfs.h b/include/linux/overlayfs.h
new file mode 100644
index 000000000000..eecdfda0286f
--- /dev/null
+++ b/include/linux/overlayfs.h
@@ -0,0 +1,27 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_OVERLAYFS_H
+#define _LINUX_OVERLAYFS_H
+
+#include <linux/fs.h>
+
+extern const struct file_operations ovl_file_operations;
+
+#ifdef CONFIG_OVERLAY_FS
+/*
+ * If file is on a union/overlay, then return the underlying real file.
+ * Otherwise return the file itself.
+ */
+static inline struct file *real_file(struct file *file)
+{
+	while (unlikely(file->f_op == &ovl_file_operations))
+		file = file->private_data;
+	return file;
+}
+#else
+static inline struct file *real_file(struct file *file)
+{
+	return file;
+}
+#endif
+
+#endif /* _LINUX_OVERLAYFS_H */
-- 
2.25.4


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] ovl: provide real_file() and overlayfs get_unmapped_area()
  2020-05-28  0:01               ` Mike Kravetz
@ 2020-05-28  8:37                 ` kbuild test robot
  2020-05-28 21:01                   ` Mike Kravetz
  0 siblings, 1 reply; 10+ messages in thread
From: kbuild test robot @ 2020-05-28  8:37 UTC (permalink / raw)
  To: Mike Kravetz, Miklos Szeredi
  Cc: kbuild-all, Colin Walters, syzbot, Andrew Morton,
	Linux Memory Management List, linux-fsdevel, linux-kernel,
	syzkaller-bugs, Al Viro, linux-unionfs


[-- Attachment #1: Type: text/plain, Size: 2140 bytes --]

Hi Mike,

I love your patch! Yet something to improve:

[auto build test ERROR on miklos-vfs/overlayfs-next]
[also build test ERROR on linus/master v5.7-rc7]
[cannot apply to linux/master next-20200526]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Mike-Kravetz/ovl-provide-real_file-and-overlayfs-get_unmapped_area/20200528-080533
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git overlayfs-next
config: h8300-randconfig-r036-20200528 (attached as .config)
compiler: h8300-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=h8300 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>

All error/warnings (new ones prefixed by >>, old ones prefixed by <<):

fs/overlayfs/file.c: In function 'ovl_get_unmapped_area':
>> fs/overlayfs/file.c:768:14: error: 'struct mm_struct' has no member named 'get_unmapped_area'
768 |   current->mm->get_unmapped_area)(realfile,
|              ^~
>> fs/overlayfs/file.c:770:1: warning: control reaches end of non-void function [-Wreturn-type]
770 | }
| ^

vim +768 fs/overlayfs/file.c

   760	
   761	static unsigned long ovl_get_unmapped_area(struct file *file,
   762					unsigned long uaddr, unsigned long len,
   763					unsigned long pgoff, unsigned long flags)
   764	{
   765		struct file *realfile = real_file(file);
   766	
   767		return (realfile->f_op->get_unmapped_area ?:
 > 768			current->mm->get_unmapped_area)(realfile,
   769							uaddr, len, pgoff, flags);
 > 770	}
   771	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 31095 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] ovl: provide real_file() and overlayfs get_unmapped_area()
  2020-05-28  8:37                 ` [PATCH v2] ovl: provide real_file() and overlayfs get_unmapped_area() kbuild test robot
@ 2020-05-28 21:01                   ` Mike Kravetz
  2020-06-04  9:16                     ` Miklos Szeredi
  0 siblings, 1 reply; 10+ messages in thread
From: Mike Kravetz @ 2020-05-28 21:01 UTC (permalink / raw)
  To: kbuild test robot, Miklos Szeredi
  Cc: kbuild-all, Colin Walters, syzbot, Andrew Morton,
	Linux Memory Management List, linux-fsdevel, linux-kernel,
	syzkaller-bugs, Al Viro, linux-unionfs

On 5/28/20 1:37 AM, kbuild test robot wrote:
> Hi Mike,
> 
> I love your patch! Yet something to improve:
> 
> [auto build test ERROR on miklos-vfs/overlayfs-next]
> [also build test ERROR on linus/master v5.7-rc7]
> [cannot apply to linux/master next-20200526]
> [if your patch is applied to the wrong git tree, please drop us a note to help
> improve the system. BTW, we also suggest to use '--base' option to specify the
> base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
> 
> url:    https://github.com/0day-ci/linux/commits/Mike-Kravetz/ovl-provide-real_file-and-overlayfs-get_unmapped_area/20200528-080533
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git overlayfs-next
> config: h8300-randconfig-r036-20200528 (attached as .config)
> compiler: h8300-linux-gcc (GCC) 9.3.0
> reproduce (this is a W=1 build):
>         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # save the attached .config to linux build tree
>         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=h8300 
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kbuild test robot <lkp@intel.com>
> 
> All error/warnings (new ones prefixed by >>, old ones prefixed by <<):
> 
> fs/overlayfs/file.c: In function 'ovl_get_unmapped_area':
>>> fs/overlayfs/file.c:768:14: error: 'struct mm_struct' has no member named 'get_unmapped_area'
> 768 |   current->mm->get_unmapped_area)(realfile,
> |              ^~
>>> fs/overlayfs/file.c:770:1: warning: control reaches end of non-void function [-Wreturn-type]
> 770 | }
> | ^
> 
> vim +768 fs/overlayfs/file.c
> 
>    760	
>    761	static unsigned long ovl_get_unmapped_area(struct file *file,
>    762					unsigned long uaddr, unsigned long len,
>    763					unsigned long pgoff, unsigned long flags)
>    764	{
>    765		struct file *realfile = real_file(file);
>    766	
>    767		return (realfile->f_op->get_unmapped_area ?:
>  > 768			current->mm->get_unmapped_area)(realfile,
>    769							uaddr, len, pgoff, flags);
>  > 770	}
>    771	
> 
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
> 

Well yuck!  get_unmapped_area is not part of mm_struct if !CONFIG_MMU.

Miklos, would adding '#ifdef CONFIG_MMU' around the overlayfs code be too
ugly for you?  Another option is to use real_file() in the mmap code as
done in [1].

Sorry for all the questions.  However, I want to make sure you are happy
with any overlayfs changes.

[1] https://lore.kernel.org/linux-mm/04a00e3b-539c-236f-e43b-0024ef94b7cb@oracle.com/
-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] ovl: provide real_file() and overlayfs get_unmapped_area()
  2020-05-28 21:01                   ` Mike Kravetz
@ 2020-06-04  9:16                     ` Miklos Szeredi
  2020-06-11  0:13                       ` Mike Kravetz
  0 siblings, 1 reply; 10+ messages in thread
From: Miklos Szeredi @ 2020-06-04  9:16 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: kbuild test robot, kbuild-all, Colin Walters, syzbot,
	Andrew Morton, Linux Memory Management List, linux-fsdevel,
	linux-kernel, syzkaller-bugs, Al Viro, overlayfs

On Thu, May 28, 2020 at 11:01 PM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 5/28/20 1:37 AM, kbuild test robot wrote:
> > Hi Mike,
> >
> > I love your patch! Yet something to improve:
> >
> > [auto build test ERROR on miklos-vfs/overlayfs-next]
> > [also build test ERROR on linus/master v5.7-rc7]
> > [cannot apply to linux/master next-20200526]
> > [if your patch is applied to the wrong git tree, please drop us a note to help
> > improve the system. BTW, we also suggest to use '--base' option to specify the
> > base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
> >
> > url:    https://github.com/0day-ci/linux/commits/Mike-Kravetz/ovl-provide-real_file-and-overlayfs-get_unmapped_area/20200528-080533
> > base:   https://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git overlayfs-next
> > config: h8300-randconfig-r036-20200528 (attached as .config)
> > compiler: h8300-linux-gcc (GCC) 9.3.0
> > reproduce (this is a W=1 build):
> >         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> >         chmod +x ~/bin/make.cross
> >         # save the attached .config to linux build tree
> >         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=h8300
> >
> > If you fix the issue, kindly add following tag as appropriate
> > Reported-by: kbuild test robot <lkp@intel.com>
> >
> > All error/warnings (new ones prefixed by >>, old ones prefixed by <<):
> >
> > fs/overlayfs/file.c: In function 'ovl_get_unmapped_area':
> >>> fs/overlayfs/file.c:768:14: error: 'struct mm_struct' has no member named 'get_unmapped_area'
> > 768 |   current->mm->get_unmapped_area)(realfile,
> > |              ^~
> >>> fs/overlayfs/file.c:770:1: warning: control reaches end of non-void function [-Wreturn-type]
> > 770 | }
> > | ^
> >
> > vim +768 fs/overlayfs/file.c
> >
> >    760
> >    761        static unsigned long ovl_get_unmapped_area(struct file *file,
> >    762                                        unsigned long uaddr, unsigned long len,
> >    763                                        unsigned long pgoff, unsigned long flags)
> >    764        {
> >    765                struct file *realfile = real_file(file);
> >    766
> >    767                return (realfile->f_op->get_unmapped_area ?:
> >  > 768                        current->mm->get_unmapped_area)(realfile,
> >    769                                                        uaddr, len, pgoff, flags);
> >  > 770        }
> >    771
> >
> > ---
> > 0-DAY CI Kernel Test Service, Intel Corporation
> > https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
> >
>
> Well yuck!  get_unmapped_area is not part of mm_struct if !CONFIG_MMU.
>
> Miklos, would adding '#ifdef CONFIG_MMU' around the overlayfs code be too
> ugly for you?  Another option is to use real_file() in the mmap code as
> done in [1].

I think the proper fix is to add an inline helper
(call_get_unmapped_area()?) in linux/mm.h, and make that work properly
for the NOMMU case as well.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] ovl: provide real_file() and overlayfs get_unmapped_area()
  2020-06-04  9:16                     ` Miklos Szeredi
@ 2020-06-11  0:13                       ` Mike Kravetz
  2020-06-11  0:37                         ` Al Viro
  0 siblings, 1 reply; 10+ messages in thread
From: Mike Kravetz @ 2020-06-11  0:13 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: kbuild test robot, kbuild-all, Colin Walters, syzbot,
	Andrew Morton, Linux Memory Management List, linux-fsdevel,
	linux-kernel, syzkaller-bugs, Al Viro, overlayfs

On 6/4/20 2:16 AM, Miklos Szeredi wrote:
> On Thu, May 28, 2020 at 11:01 PM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>>
>> Well yuck!  get_unmapped_area is not part of mm_struct if !CONFIG_MMU.
>>
>> Miklos, would adding '#ifdef CONFIG_MMU' around the overlayfs code be too
>> ugly for you?  Another option is to use real_file() in the mmap code as
>> done in [1].
> 
> I think the proper fix is to add an inline helper
> (call_get_unmapped_area()?) in linux/mm.h, and make that work properly
> for the NOMMU case as well.
> 

I'm ignoring the above issue for now.  There may be a more fundamental issue
that I do not know how to solve without adding another memeber to
file_operations.  Why?

In order to go from file -> realfile, one needs to do something like the
code you provided.

	while (file->f_op == &ovl_file_operations)
		file = file->private_data;
	return file;

The problem is that this needs to be called from core kernel code
(is_file_hugepages in mmap).  ovl_file_operations is obviously only defined
in the overlayfs code.  Since overlayfs can be built as a module, I do not
know of a way to reference ovl_file_operations which will only become available
when/if overlayfs is loaded.  Is there a way to do that?

If there is no other way to do this, then I think we need to add another
member to file_operations as is done in the following patch.  I hope there
is another way, because adding another file_op seems like overkill.

From 6d22c93284263b5ddcc2199adc1bcceb233f1499 Mon Sep 17 00:00:00 2001
From: Mike Kravetz <mike.kravetz@oracle.com>
Date: Wed, 10 Jun 2020 16:23:46 -0700
Subject: [PATCH v3] ovl: add f_real file_operation and supporting code for
 overlayfs

If a file is on a union/overlay, then the 'struct file *' will have
overlayfs file operations.  The routine is_file_hugepages() compares
f->f_op to hugetlbfs_file_operations to determine if it is a hugetlbfs
file.  If a hugetlbfs file is on a union/overlay, this comparison is
false and is_file_hugepages() incorrectly indicates the underlying
file is not hugetlbfs.  One result of this is a BUG as shown in [1].

mmap uses is_file_hugepages() to identify hugetlbfs file and potentially
round up length because hugetlbfs files have different alignment
restrictions.  This is defined/expected behavior.  In addition, mmap
code would like to use the filesystem specific get_unmapped_area()
routine if one is defined.

To address this issue,
- Add a new file operation f_real while will return the underlying file.
  Only overlayfs provides a function for this operation.
- Add a new routine real_file() which can be used by core code get an
  underlying file.
- Update is_file_hugepages to get the real file.
- Add get_unmapped_area f_op to oerrlayfs to call underlying routine.

[1] https://lore.kernel.org/linux-mm/000000000000b4684e05a2968ca6@google.com/

Reported-by: syzbot+d6ec23007e951dadf3de@syzkaller.appspotmail.com
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 fs/overlayfs/file.c     | 23 +++++++++++++++++++++++
 include/linux/fs.h      |  8 ++++++++
 include/linux/hugetlb.h |  2 ++
 3 files changed, 33 insertions(+)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 87c362f65448..eb870fc3912f 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -757,6 +757,27 @@ static loff_t ovl_remap_file_range(struct file *file_in, loff_t pos_in,
 			    remap_flags, op);
 }
 
+#ifdef CONFIG_MMU
+static unsigned long ovl_get_unmapped_area(struct file *file,
+				unsigned long uaddr, unsigned long len,
+				unsigned long pgoff, unsigned long flags)
+{
+	file = real_file(file);
+
+	return (file->f_op->get_unmapped_area ?:
+		current->mm->get_unmapped_area)(file, uaddr, len, pgoff, flags);
+}
+#else
+#define ovl_get_unmapped_area NULL
+#endif
+
+static struct file *ovl_f_real(struct file *file)
+{
+	while (file->f_op == &ovl_file_operations)
+		file = file->private_data;
+	return file;
+}
+
 const struct file_operations ovl_file_operations = {
 	.open		= ovl_open,
 	.release	= ovl_release,
@@ -771,9 +792,11 @@ const struct file_operations ovl_file_operations = {
 	.compat_ioctl	= ovl_compat_ioctl,
 	.splice_read    = ovl_splice_read,
 	.splice_write   = ovl_splice_write,
+	.f_real		= ovl_f_real,
 
 	.copy_file_range	= ovl_copy_file_range,
 	.remap_file_range	= ovl_remap_file_range,
+	.get_unmapped_area	= ovl_get_unmapped_area,
 };
 
 int __init ovl_aio_request_cache_init(void)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 45cc10cdf6dd..59a969549aec 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1863,6 +1863,7 @@ struct file_operations {
 				   struct file *file_out, loff_t pos_out,
 				   loff_t len, unsigned int remap_flags);
 	int (*fadvise)(struct file *, loff_t, loff_t, int);
+	struct file * (*f_real)(struct file *file);
 } __randomize_layout;
 
 struct inode_operations {
@@ -1895,6 +1896,13 @@ struct inode_operations {
 	int (*set_acl)(struct inode *, struct posix_acl *, int);
 } ____cacheline_aligned;
 
+static inline struct file *real_file(struct file *file)
+{
+	if (unlikely(file->f_op->f_real))
+		return file->f_op->f_real(file);
+	return file;
+}
+
 static inline ssize_t call_read_iter(struct file *file, struct kiocb *kio,
 				     struct iov_iter *iter)
 {
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 43a1cef8f0f1..528b07145414 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -437,6 +437,8 @@ struct file *hugetlb_file_setup(const char *name, size_t size, vm_flags_t acct,
 
 static inline bool is_file_hugepages(struct file *file)
 {
+	file = real_file(file);
+
 	if (file->f_op == &hugetlbfs_file_operations)
 		return true;
 
-- 
2.25.4


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] ovl: provide real_file() and overlayfs get_unmapped_area()
  2020-06-11  0:13                       ` Mike Kravetz
@ 2020-06-11  0:37                         ` Al Viro
  2020-06-11  1:36                           ` Matthew Wilcox
  0 siblings, 1 reply; 10+ messages in thread
From: Al Viro @ 2020-06-11  0:37 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: Miklos Szeredi, kbuild test robot, kbuild-all, Colin Walters,
	syzbot, Andrew Morton, Linux Memory Management List,
	linux-fsdevel, linux-kernel, syzkaller-bugs, overlayfs

On Wed, Jun 10, 2020 at 05:13:52PM -0700, Mike Kravetz wrote:

> To address this issue,
> - Add a new file operation f_real while will return the underlying file.
>   Only overlayfs provides a function for this operation.
> - Add a new routine real_file() which can be used by core code get an
>   underlying file.
> - Update is_file_hugepages to get the real file.

Egads...  So to find out whether it's a hugetlb you would
	* check if a method is NULL
	* if not, call it
	* ... and check if the method table of the result is hugetlbfs one?

Here's a radical suggestion: FMODE_HUGEPAGES.  Just have it set by
->open() and let is_file_hugepages() check it.  In ->f_mode.  And
make the bloody hugetlbfs_file_operations static, while we are at it.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] ovl: provide real_file() and overlayfs get_unmapped_area()
  2020-06-11  0:37                         ` Al Viro
@ 2020-06-11  1:36                           ` Matthew Wilcox
  2020-06-11  2:17                             ` Al Viro
  0 siblings, 1 reply; 10+ messages in thread
From: Matthew Wilcox @ 2020-06-11  1:36 UTC (permalink / raw)
  To: Al Viro
  Cc: Mike Kravetz, Miklos Szeredi, kbuild test robot, kbuild-all,
	Colin Walters, syzbot, Andrew Morton,
	Linux Memory Management List, linux-fsdevel, linux-kernel,
	syzkaller-bugs, overlayfs

On Thu, Jun 11, 2020 at 01:37:26AM +0100, Al Viro wrote:
> On Wed, Jun 10, 2020 at 05:13:52PM -0700, Mike Kravetz wrote:
> 
> > To address this issue,
> > - Add a new file operation f_real while will return the underlying file.
> >   Only overlayfs provides a function for this operation.
> > - Add a new routine real_file() which can be used by core code get an
> >   underlying file.
> > - Update is_file_hugepages to get the real file.
> 
> Egads...  So to find out whether it's a hugetlb you would
> 	* check if a method is NULL
> 	* if not, call it
> 	* ... and check if the method table of the result is hugetlbfs one?
> 
> Here's a radical suggestion: FMODE_HUGEPAGES.  Just have it set by
> ->open() and let is_file_hugepages() check it.  In ->f_mode.  And
> make the bloody hugetlbfs_file_operations static, while we are at it.

ITYM FMODE_OVL_UPPER.  To quote Mike:

>         while (file->f_op == &ovl_file_operations)
>                 file = file->private_data;
>         return file;

which would be transformed into:

	while (file->f_mode & FMODE_OVL_UPPER)
		file = file->private_data;
	return file;

Or are you proposing that overlayfs copy FMODE_HUGEPAGES from the
underlying fs to the overlaying fs?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] ovl: provide real_file() and overlayfs get_unmapped_area()
  2020-06-11  1:36                           ` Matthew Wilcox
@ 2020-06-11  2:17                             ` Al Viro
  2020-06-11  2:31                               ` Mike Kravetz
  0 siblings, 1 reply; 10+ messages in thread
From: Al Viro @ 2020-06-11  2:17 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Mike Kravetz, Miklos Szeredi, kbuild test robot, kbuild-all,
	Colin Walters, syzbot, Andrew Morton,
	Linux Memory Management List, linux-fsdevel, linux-kernel,
	syzkaller-bugs, overlayfs

On Wed, Jun 10, 2020 at 06:36:16PM -0700, Matthew Wilcox wrote:

> 	while (file->f_mode & FMODE_OVL_UPPER)
> 		file = file->private_data;
> 	return file;
> 
> Or are you proposing that overlayfs copy FMODE_HUGEPAGES from the
> underlying fs to the overlaying fs?

The latter - that way nobody outside of overlayfs needs to know what
its ->private_data points to, for one thing.  And it's cheaper that
way, obviously.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] ovl: provide real_file() and overlayfs get_unmapped_area()
  2020-06-11  2:17                             ` Al Viro
@ 2020-06-11  2:31                               ` Mike Kravetz
  0 siblings, 0 replies; 10+ messages in thread
From: Mike Kravetz @ 2020-06-11  2:31 UTC (permalink / raw)
  To: Al Viro, Matthew Wilcox
  Cc: Miklos Szeredi, kbuild test robot, kbuild-all, Colin Walters,
	syzbot, Andrew Morton, Linux Memory Management List,
	linux-fsdevel, linux-kernel, syzkaller-bugs, overlayfs

On 6/10/20 7:17 PM, Al Viro wrote:
> On Wed, Jun 10, 2020 at 06:36:16PM -0700, Matthew Wilcox wrote:
> 
>> 	while (file->f_mode & FMODE_OVL_UPPER)
>> 		file = file->private_data;
>> 	return file;
>>
>> Or are you proposing that overlayfs copy FMODE_HUGEPAGES from the
>> underlying fs to the overlaying fs?
> 
> The latter - that way nobody outside of overlayfs needs to know what
> its ->private_data points to, for one thing.  And it's cheaper that
> way, obviously.
> 

Thanks Al and Matthew!

I knew adding a file op for this was overkill and was looking for other
suggestions.
-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, back to index

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <000000000000b4684e05a2968ca6@google.com>
     [not found] ` <aa7812b8-60ae-8578-40db-e71ad766b4d3@oracle.com>
     [not found]   ` <CAJfpegtVca6H1JPW00OF-7sCwpomMCo=A2qr5K=9uGKEGjEp3w@mail.gmail.com>
     [not found]     ` <bb232cfa-5965-42d0-88cf-46d13f7ebda3@www.fastmail.com>
     [not found]       ` <9a56a79a-88ed-9ff4-115e-ec169cba5c0b@oracle.com>
     [not found]         ` <CAJfpegsNVB12MQ-Jgbb-f=+i3g0Xy52miT3TmUAYL951HVQS_w@mail.gmail.com>
     [not found]           ` <78313ae9-8596-9cbe-f648-3152660be9b3@oracle.com>
2020-05-22 10:05             ` kernel BUG at mm/hugetlb.c:LINE! Miklos Szeredi
2020-05-28  0:01               ` Mike Kravetz
2020-05-28  8:37                 ` [PATCH v2] ovl: provide real_file() and overlayfs get_unmapped_area() kbuild test robot
2020-05-28 21:01                   ` Mike Kravetz
2020-06-04  9:16                     ` Miklos Szeredi
2020-06-11  0:13                       ` Mike Kravetz
2020-06-11  0:37                         ` Al Viro
2020-06-11  1:36                           ` Matthew Wilcox
2020-06-11  2:17                             ` Al Viro
2020-06-11  2:31                               ` Mike Kravetz

($INBOX_DIR/description missing)

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-unionfs/0 linux-unionfs/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-unionfs linux-unionfs/ https://lore.kernel.org/linux-unionfs \
		linux-unionfs@vger.kernel.org
	public-inbox-index linux-unionfs

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-unionfs


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git