linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] kernel: Expose SYS_kcmp by default
@ 2021-02-05 16:37 Chris Wilson
  2021-02-05 17:02 ` Lucas Stach
                   ` (6 more replies)
  0 siblings, 7 replies; 28+ messages in thread
From: Chris Wilson @ 2021-02-05 16:37 UTC (permalink / raw)
  To: linux-kernel
  Cc: dri-devel, intel-gfx, Chris Wilson, Kees Cook, Andy Lutomirski,
	Will Drewry, Andrew Morton, Dave Airlie, Daniel Vetter,
	Lucas Stach

Userspace has discovered the functionality offered by SYS_kcmp and has
started to depend upon it. In particular, Mesa uses SYS_kcmp for
os_same_file_description() in order to identify when two fd (e.g. device
or dmabuf) point to the same struct file. Since they depend on it for
core functionality, lift SYS_kcmp out of the non-default
CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Lucas Stach <l.stach@pengutronix.de>
---
 init/Kconfig                                  | 11 +++++++++++
 kernel/Makefile                               |  2 +-
 tools/testing/selftests/seccomp/seccomp_bpf.c |  2 +-
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index b77c60f8b963..f62fca13ac5b 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1194,6 +1194,7 @@ endif # NAMESPACES
 config CHECKPOINT_RESTORE
 	bool "Checkpoint/restore support"
 	select PROC_CHILDREN
+	select KCMP
 	default n
 	help
 	  Enables additional kernel features in a sake of checkpoint/restore.
@@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS
 config ARCH_HAS_MEMBARRIER_SYNC_CORE
 	bool
 
+config KCMP
+	bool "Enable kcmp() system call" if EXPERT
+	default y
+	help
+	  Enable the file descriptor comparison system call. It provides
+	  user-space with the ability to compare two fd to see if they
+	  point to the same file, and check other attributes.
+
+	  If unsure, say Y.
+
 config RSEQ
 	bool "Enable rseq() system call" if EXPERT
 	default y
diff --git a/kernel/Makefile b/kernel/Makefile
index aa7368c7eabf..320f1f3941b7 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -51,7 +51,7 @@ obj-y += livepatch/
 obj-y += dma/
 obj-y += entry/
 
-obj-$(CONFIG_CHECKPOINT_RESTORE) += kcmp.o
+obj-$(CONFIG_KCMP) += kcmp.o
 obj-$(CONFIG_FREEZER) += freezer.o
 obj-$(CONFIG_PROFILING) += profile.o
 obj-$(CONFIG_STACKTRACE) += stacktrace.o
diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 26c72f2b61b1..1b6c7d33c4ff 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -315,7 +315,7 @@ TEST(kcmp)
 	ret = __filecmp(getpid(), getpid(), 1, 1);
 	EXPECT_EQ(ret, 0);
 	if (ret != 0 && errno == ENOSYS)
-		SKIP(return, "Kernel does not support kcmp() (missing CONFIG_CHECKPOINT_RESTORE?)");
+		SKIP(return, "Kernel does not support kcmp() (missing CONFIG_KCMP?)");
 }
 
 TEST(mode_strict_support)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH] kernel: Expose SYS_kcmp by default
  2021-02-05 16:37 [PATCH] kernel: Expose SYS_kcmp by default Chris Wilson
@ 2021-02-05 17:02 ` Lucas Stach
  2021-02-05 18:37 ` Kees Cook
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 28+ messages in thread
From: Lucas Stach @ 2021-02-05 17:02 UTC (permalink / raw)
  To: Chris Wilson, linux-kernel
  Cc: dri-devel, intel-gfx, Kees Cook, Andy Lutomirski, Will Drewry,
	Andrew Morton, Dave Airlie, Daniel Vetter

Am Freitag, dem 05.02.2021 um 16:37 +0000 schrieb Chris Wilson:
> Userspace has discovered the functionality offered by SYS_kcmp and has
> started to depend upon it. In particular, Mesa uses SYS_kcmp for
> os_same_file_description() in order to identify when two fd (e.g. device
> or dmabuf) point to the same struct file. Since they depend on it for
> core functionality, lift SYS_kcmp out of the non-default
> CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Will Drewry <wad@chromium.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Dave Airlie <airlied@gmail.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Lucas Stach <l.stach@pengutronix.de>
> ---
>  init/Kconfig                                  | 11 +++++++++++
>  kernel/Makefile                               |  2 +-
>  tools/testing/selftests/seccomp/seccomp_bpf.c |  2 +-
>  3 files changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/init/Kconfig b/init/Kconfig
> index b77c60f8b963..f62fca13ac5b 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1194,6 +1194,7 @@ endif # NAMESPACES
>  config CHECKPOINT_RESTORE
>  	bool "Checkpoint/restore support"
>  	select PROC_CHILDREN
> +	select KCMP
>  	default n
>  	help
>  	  Enables additional kernel features in a sake of checkpoint/restore.
> @@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS
>  config ARCH_HAS_MEMBARRIER_SYNC_CORE
>  	bool
>  
> 
> 
> 
> +config KCMP
> +	bool "Enable kcmp() system call" if EXPERT
> +	default y
> +	help
> +	  Enable the file descriptor comparison system call. It provides
> +	  user-space with the ability to compare two fd to see if they
> +	  point to the same file, and check other attributes.

This description undersells the abilities of kcmp, while fd compare is
the only thing used by the graphics stack, kcmp can compare a handful
of other system resources, see man 2 kcmp. I think the helptext should
at least try to cover this fact somewhat.

Regards,
Lucas

> +
> +	  If unsure, say Y.
> +
>  config RSEQ
>  	bool "Enable rseq() system call" if EXPERT
>  	default y
> diff --git a/kernel/Makefile b/kernel/Makefile
> index aa7368c7eabf..320f1f3941b7 100644
> --- a/kernel/Makefile
> +++ b/kernel/Makefile
> @@ -51,7 +51,7 @@ obj-y += livepatch/
>  obj-y += dma/
>  obj-y += entry/
>  
> 
> 
> 
> -obj-$(CONFIG_CHECKPOINT_RESTORE) += kcmp.o
> +obj-$(CONFIG_KCMP) += kcmp.o
>  obj-$(CONFIG_FREEZER) += freezer.o
>  obj-$(CONFIG_PROFILING) += profile.o
>  obj-$(CONFIG_STACKTRACE) += stacktrace.o
> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index 26c72f2b61b1..1b6c7d33c4ff 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -315,7 +315,7 @@ TEST(kcmp)
>  	ret = __filecmp(getpid(), getpid(), 1, 1);
>  	EXPECT_EQ(ret, 0);
>  	if (ret != 0 && errno == ENOSYS)
> -		SKIP(return, "Kernel does not support kcmp() (missing CONFIG_CHECKPOINT_RESTORE?)");
> +		SKIP(return, "Kernel does not support kcmp() (missing CONFIG_KCMP?)");
>  }
>  
> 
> 
> 
>  TEST(mode_strict_support)



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] kernel: Expose SYS_kcmp by default
  2021-02-05 16:37 [PATCH] kernel: Expose SYS_kcmp by default Chris Wilson
  2021-02-05 17:02 ` Lucas Stach
@ 2021-02-05 18:37 ` Kees Cook
  2021-02-05 20:53   ` Daniel Vetter
  2021-02-05 19:25 ` kernel test robot
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Kees Cook @ 2021-02-05 18:37 UTC (permalink / raw)
  To: Chris Wilson
  Cc: linux-kernel, dri-devel, intel-gfx, Andy Lutomirski, Will Drewry,
	Andrew Morton, Dave Airlie, Daniel Vetter, Lucas Stach, jannh

On Fri, Feb 05, 2021 at 04:37:52PM +0000, Chris Wilson wrote:
> Userspace has discovered the functionality offered by SYS_kcmp and has
> started to depend upon it. In particular, Mesa uses SYS_kcmp for
> os_same_file_description() in order to identify when two fd (e.g. device
> or dmabuf) point to the same struct file. Since they depend on it for
> core functionality, lift SYS_kcmp out of the non-default
> CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Will Drewry <wad@chromium.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Dave Airlie <airlied@gmail.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Lucas Stach <l.stach@pengutronix.de>
> ---
>  init/Kconfig                                  | 11 +++++++++++
>  kernel/Makefile                               |  2 +-
>  tools/testing/selftests/seccomp/seccomp_bpf.c |  2 +-
>  3 files changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/init/Kconfig b/init/Kconfig
> index b77c60f8b963..f62fca13ac5b 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1194,6 +1194,7 @@ endif # NAMESPACES
>  config CHECKPOINT_RESTORE
>  	bool "Checkpoint/restore support"
>  	select PROC_CHILDREN
> +	select KCMP
>  	default n
>  	help
>  	  Enables additional kernel features in a sake of checkpoint/restore.
> @@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS
>  config ARCH_HAS_MEMBARRIER_SYNC_CORE
>  	bool
>  
> +config KCMP
> +	bool "Enable kcmp() system call" if EXPERT
> +	default y

I would expect this to be not default-y, especially if
CHECKPOINT_RESTORE does a "select" on it.

This is a really powerful syscall, but it is bounded by ptrace access
controls, and uses pointer address obfuscation, so it may be okay to
expose this. As it is, at least Ubuntu already has
CONFIG_CHECKPOINT_RESTORE, so really, there's probably not much
difference on exposure.

So, if you drop the "default y", I'm fine with this.

-Kees

> +	help
> +	  Enable the file descriptor comparison system call. It provides
> +	  user-space with the ability to compare two fd to see if they
> +	  point to the same file, and check other attributes.
> +
> +	  If unsure, say Y.
> +
>  config RSEQ
>  	bool "Enable rseq() system call" if EXPERT
>  	default y
> diff --git a/kernel/Makefile b/kernel/Makefile
> index aa7368c7eabf..320f1f3941b7 100644
> --- a/kernel/Makefile
> +++ b/kernel/Makefile
> @@ -51,7 +51,7 @@ obj-y += livepatch/
>  obj-y += dma/
>  obj-y += entry/
>  
> -obj-$(CONFIG_CHECKPOINT_RESTORE) += kcmp.o
> +obj-$(CONFIG_KCMP) += kcmp.o
>  obj-$(CONFIG_FREEZER) += freezer.o
>  obj-$(CONFIG_PROFILING) += profile.o
>  obj-$(CONFIG_STACKTRACE) += stacktrace.o
> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index 26c72f2b61b1..1b6c7d33c4ff 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -315,7 +315,7 @@ TEST(kcmp)
>  	ret = __filecmp(getpid(), getpid(), 1, 1);
>  	EXPECT_EQ(ret, 0);
>  	if (ret != 0 && errno == ENOSYS)
> -		SKIP(return, "Kernel does not support kcmp() (missing CONFIG_CHECKPOINT_RESTORE?)");
> +		SKIP(return, "Kernel does not support kcmp() (missing CONFIG_KCMP?)");
>  }
>  
>  TEST(mode_strict_support)
> -- 
> 2.20.1
> 

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] kernel: Expose SYS_kcmp by default
  2021-02-05 16:37 [PATCH] kernel: Expose SYS_kcmp by default Chris Wilson
  2021-02-05 17:02 ` Lucas Stach
  2021-02-05 18:37 ` Kees Cook
@ 2021-02-05 19:25 ` kernel test robot
  2021-02-05 19:46 ` kernel test robot
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 28+ messages in thread
From: kernel test robot @ 2021-02-05 19:25 UTC (permalink / raw)
  To: Chris Wilson, linux-kernel
  Cc: kbuild-all, clang-built-linux, Will Drewry, Kees Cook, intel-gfx,
	dri-devel, Andy Lutomirski, Andrew Morton,
	Linux Memory Management List, Chris Wilson

[-- Attachment #1: Type: text/plain, Size: 11511 bytes --]

Hi Chris,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linux/master]
[also build test ERROR on kees/for-next/seccomp kees/for-next/pstore linus/master v5.11-rc6 next-20210125]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Chris-Wilson/kernel-Expose-SYS_kcmp-by-default/20210206-004006
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 2ab38c17aac10bf55ab3efde4c4db3893d8691d2
config: powerpc-randconfig-r023-20210205 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c9439ca36342fb6013187d0a69aef92736951476)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install powerpc cross compiling tool for clang build
        # apt-get install binutils-powerpc-linux-gnu
        # https://github.com/0day-ci/linux/commit/f7694e48ae81aac5a226e74421dbda1dcdc3ca92
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Chris-Wilson/kernel-Expose-SYS_kcmp-by-default/20210206-004006
        git checkout f7694e48ae81aac5a226e74421dbda1dcdc3ca92
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All error/warnings (new ones prefixed by >>):

   In file included from arch/powerpc/include/asm/hardirq.h:6:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/powerpc/include/asm/io.h:619:
   arch/powerpc/include/asm/io-defs.h:45:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(insw, (unsigned long p, void *b, unsigned long c),
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   arch/powerpc/include/asm/io.h:616:3: note: expanded from macro 'DEF_PCI_AC_NORET'
                   __do_##name al;                                 \
                   ^~~~~~~~~~~~~~
   <scratch space>:190:1: note: expanded from here
   __do_insw
   ^
   arch/powerpc/include/asm/io.h:557:56: note: expanded from macro '__do_insw'
   #define __do_insw(p, b, n)      readsw((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
                                          ~~~~~~~~~~~~~~~~~~~~~^
   In file included from kernel/kcmp.c:3:
   In file included from include/linux/syscalls.h:84:
   In file included from include/trace/syscall.h:7:
   In file included from include/linux/trace_events.h:9:
   In file included from include/linux/hardirq.h:10:
   In file included from arch/powerpc/include/asm/hardirq.h:6:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/powerpc/include/asm/io.h:619:
   arch/powerpc/include/asm/io-defs.h:47:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(insl, (unsigned long p, void *b, unsigned long c),
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   arch/powerpc/include/asm/io.h:616:3: note: expanded from macro 'DEF_PCI_AC_NORET'
                   __do_##name al;                                 \
                   ^~~~~~~~~~~~~~
   <scratch space>:192:1: note: expanded from here
   __do_insl
   ^
   arch/powerpc/include/asm/io.h:558:56: note: expanded from macro '__do_insl'
   #define __do_insl(p, b, n)      readsl((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
                                          ~~~~~~~~~~~~~~~~~~~~~^
   In file included from kernel/kcmp.c:3:
   In file included from include/linux/syscalls.h:84:
   In file included from include/trace/syscall.h:7:
   In file included from include/linux/trace_events.h:9:
   In file included from include/linux/hardirq.h:10:
   In file included from arch/powerpc/include/asm/hardirq.h:6:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/powerpc/include/asm/io.h:619:
   arch/powerpc/include/asm/io-defs.h:49:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(outsb, (unsigned long p, const void *b, unsigned long c),
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   arch/powerpc/include/asm/io.h:616:3: note: expanded from macro 'DEF_PCI_AC_NORET'
                   __do_##name al;                                 \
                   ^~~~~~~~~~~~~~
   <scratch space>:194:1: note: expanded from here
   __do_outsb
   ^
   arch/powerpc/include/asm/io.h:559:58: note: expanded from macro '__do_outsb'
   #define __do_outsb(p, b, n)     writesb((PCI_IO_ADDR)_IO_BASE+(p),(b),(n))
                                           ~~~~~~~~~~~~~~~~~~~~~^
   In file included from kernel/kcmp.c:3:
   In file included from include/linux/syscalls.h:84:
   In file included from include/trace/syscall.h:7:
   In file included from include/linux/trace_events.h:9:
   In file included from include/linux/hardirq.h:10:
   In file included from arch/powerpc/include/asm/hardirq.h:6:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/powerpc/include/asm/io.h:619:
   arch/powerpc/include/asm/io-defs.h:51:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(outsw, (unsigned long p, const void *b, unsigned long c),
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   arch/powerpc/include/asm/io.h:616:3: note: expanded from macro 'DEF_PCI_AC_NORET'
                   __do_##name al;                                 \
                   ^~~~~~~~~~~~~~
   <scratch space>:196:1: note: expanded from here
   __do_outsw
   ^
   arch/powerpc/include/asm/io.h:560:58: note: expanded from macro '__do_outsw'
   #define __do_outsw(p, b, n)     writesw((PCI_IO_ADDR)_IO_BASE+(p),(b),(n))
                                           ~~~~~~~~~~~~~~~~~~~~~^
   In file included from kernel/kcmp.c:3:
   In file included from include/linux/syscalls.h:84:
   In file included from include/trace/syscall.h:7:
   In file included from include/linux/trace_events.h:9:
   In file included from include/linux/hardirq.h:10:
   In file included from arch/powerpc/include/asm/hardirq.h:6:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/powerpc/include/asm/io.h:619:
   arch/powerpc/include/asm/io-defs.h:53:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(outsl, (unsigned long p, const void *b, unsigned long c),
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   arch/powerpc/include/asm/io.h:616:3: note: expanded from macro 'DEF_PCI_AC_NORET'
                   __do_##name al;                                 \
                   ^~~~~~~~~~~~~~
   <scratch space>:198:1: note: expanded from here
   __do_outsl
   ^
   arch/powerpc/include/asm/io.h:561:58: note: expanded from macro '__do_outsl'
   #define __do_outsl(p, b, n)     writesl((PCI_IO_ADDR)_IO_BASE+(p),(b),(n))
                                           ~~~~~~~~~~~~~~~~~~~~~^
>> kernel/kcmp.c:117:13: error: implicit declaration of function 'get_epoll_tfile_raw_ptr' [-Werror,-Wimplicit-function-declaration]
           filp_tgt = get_epoll_tfile_raw_ptr(filp_epoll, slot.tfd, slot.toff);
                      ^
   kernel/kcmp.c:117:13: note: did you mean 'get_file_raw_ptr'?
   kernel/kcmp.c:62:1: note: 'get_file_raw_ptr' declared here
   get_file_raw_ptr(struct task_struct *task, unsigned int idx)
   ^
>> kernel/kcmp.c:117:11: warning: incompatible integer to pointer conversion assigning to 'struct file *' from 'int' [-Wint-conversion]
           filp_tgt = get_epoll_tfile_raw_ptr(filp_epoll, slot.tfd, slot.toff);
                    ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   13 warnings and 1 error generated.


vim +/get_epoll_tfile_raw_ptr +117 kernel/kcmp.c

d97b46a64674a2 Cyrill Gorcunov   2012-05-31   96  
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12   97  #ifdef CONFIG_EPOLL
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12   98  static int kcmp_epoll_target(struct task_struct *task1,
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12   99  			     struct task_struct *task2,
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  100  			     unsigned long idx1,
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  101  			     struct kcmp_epoll_slot __user *uslot)
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  102  {
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  103  	struct file *filp, *filp_epoll, *filp_tgt;
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  104  	struct kcmp_epoll_slot slot;
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  105  
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  106  	if (copy_from_user(&slot, uslot, sizeof(slot)))
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  107  		return -EFAULT;
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  108  
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  109  	filp = get_file_raw_ptr(task1, idx1);
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  110  	if (!filp)
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  111  		return -EBADF;
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  112  
f43c283a89a7dc Eric W. Biederman 2020-11-20  113  	filp_epoll = fget_task(task2, slot.efd);
f43c283a89a7dc Eric W. Biederman 2020-11-20  114  	if (!filp_epoll)
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  115  		return -EBADF;
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  116  
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12 @117  	filp_tgt = get_epoll_tfile_raw_ptr(filp_epoll, slot.tfd, slot.toff);
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  118  	fput(filp_epoll);
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  119  
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  120  	if (IS_ERR(filp_tgt))
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  121  		return PTR_ERR(filp_tgt);
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  122  
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  123  	return kcmp_ptr(filp, filp_tgt, KCMP_FILE);
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  124  }
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  125  #else
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  126  static int kcmp_epoll_target(struct task_struct *task1,
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  127  			     struct task_struct *task2,
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  128  			     unsigned long idx1,
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  129  			     struct kcmp_epoll_slot __user *uslot)
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  130  {
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  131  	return -EOPNOTSUPP;
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  132  }
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  133  #endif
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  134  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 26996 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] kernel: Expose SYS_kcmp by default
  2021-02-05 16:37 [PATCH] kernel: Expose SYS_kcmp by default Chris Wilson
                   ` (2 preceding siblings ...)
  2021-02-05 19:25 ` kernel test robot
@ 2021-02-05 19:46 ` kernel test robot
  2021-02-05 21:06 ` [PATCH v2] " Chris Wilson
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 28+ messages in thread
From: kernel test robot @ 2021-02-05 19:46 UTC (permalink / raw)
  To: Chris Wilson, linux-kernel
  Cc: kbuild-all, Will Drewry, Kees Cook, intel-gfx, dri-devel,
	Andy Lutomirski, Andrew Morton, Linux Memory Management List,
	Chris Wilson

[-- Attachment #1: Type: text/plain, Size: 5069 bytes --]

Hi Chris,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linux/master]
[also build test ERROR on kees/for-next/seccomp kees/for-next/pstore linus/master v5.11-rc6 next-20210125]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Chris-Wilson/kernel-Expose-SYS_kcmp-by-default/20210206-004006
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 2ab38c17aac10bf55ab3efde4c4db3893d8691d2
config: i386-randconfig-s002-20210205 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.3-215-g0fb77bb6-dirty
        # https://github.com/0day-ci/linux/commit/f7694e48ae81aac5a226e74421dbda1dcdc3ca92
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Chris-Wilson/kernel-Expose-SYS_kcmp-by-default/20210206-004006
        git checkout f7694e48ae81aac5a226e74421dbda1dcdc3ca92
        # save the attached .config to linux build tree
        make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=i386 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All error/warnings (new ones prefixed by >>):

   kernel/kcmp.c: In function 'kcmp_epoll_target':
>> kernel/kcmp.c:117:13: error: implicit declaration of function 'get_epoll_tfile_raw_ptr'; did you mean 'get_file_raw_ptr'? [-Werror=implicit-function-declaration]
     117 |  filp_tgt = get_epoll_tfile_raw_ptr(filp_epoll, slot.tfd, slot.toff);
         |             ^~~~~~~~~~~~~~~~~~~~~~~
         |             get_file_raw_ptr
>> kernel/kcmp.c:117:11: warning: assignment to 'struct file *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
     117 |  filp_tgt = get_epoll_tfile_raw_ptr(filp_epoll, slot.tfd, slot.toff);
         |           ^
   cc1: some warnings being treated as errors


vim +117 kernel/kcmp.c

d97b46a64674a2 Cyrill Gorcunov   2012-05-31   96  
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12   97  #ifdef CONFIG_EPOLL
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12   98  static int kcmp_epoll_target(struct task_struct *task1,
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12   99  			     struct task_struct *task2,
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  100  			     unsigned long idx1,
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  101  			     struct kcmp_epoll_slot __user *uslot)
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  102  {
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  103  	struct file *filp, *filp_epoll, *filp_tgt;
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  104  	struct kcmp_epoll_slot slot;
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  105  
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  106  	if (copy_from_user(&slot, uslot, sizeof(slot)))
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  107  		return -EFAULT;
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  108  
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  109  	filp = get_file_raw_ptr(task1, idx1);
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  110  	if (!filp)
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  111  		return -EBADF;
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  112  
f43c283a89a7dc Eric W. Biederman 2020-11-20  113  	filp_epoll = fget_task(task2, slot.efd);
f43c283a89a7dc Eric W. Biederman 2020-11-20  114  	if (!filp_epoll)
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  115  		return -EBADF;
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  116  
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12 @117  	filp_tgt = get_epoll_tfile_raw_ptr(filp_epoll, slot.tfd, slot.toff);
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  118  	fput(filp_epoll);
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  119  
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  120  	if (IS_ERR(filp_tgt))
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  121  		return PTR_ERR(filp_tgt);
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  122  
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  123  	return kcmp_ptr(filp, filp_tgt, KCMP_FILE);
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  124  }
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  125  #else
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  126  static int kcmp_epoll_target(struct task_struct *task1,
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  127  			     struct task_struct *task2,
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  128  			     unsigned long idx1,
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  129  			     struct kcmp_epoll_slot __user *uslot)
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  130  {
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  131  	return -EOPNOTSUPP;
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  132  }
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  133  #endif
0791e3644e5ef2 Cyrill Gorcunov   2017-07-12  134  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 31465 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] kernel: Expose SYS_kcmp by default
  2021-02-05 18:37 ` Kees Cook
@ 2021-02-05 20:53   ` Daniel Vetter
  2021-02-08 11:49     ` Michel Dänzer
  0 siblings, 1 reply; 28+ messages in thread
From: Daniel Vetter @ 2021-02-05 20:53 UTC (permalink / raw)
  To: Kees Cook, airlied
  Cc: Chris Wilson, Linux Kernel Mailing List, dri-devel, intel-gfx,
	Andy Lutomirski, Will Drewry, Andrew Morton, Lucas Stach,
	Jann Horn

On Fri, Feb 5, 2021 at 7:37 PM Kees Cook <keescook@chromium.org> wrote:
>
> On Fri, Feb 05, 2021 at 04:37:52PM +0000, Chris Wilson wrote:
> > Userspace has discovered the functionality offered by SYS_kcmp and has
> > started to depend upon it. In particular, Mesa uses SYS_kcmp for
> > os_same_file_description() in order to identify when two fd (e.g. device
> > or dmabuf) point to the same struct file. Since they depend on it for
> > core functionality, lift SYS_kcmp out of the non-default
> > CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Andy Lutomirski <luto@amacapital.net>
> > Cc: Will Drewry <wad@chromium.org>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Dave Airlie <airlied@gmail.com>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > Cc: Lucas Stach <l.stach@pengutronix.de>
> > ---
> >  init/Kconfig                                  | 11 +++++++++++
> >  kernel/Makefile                               |  2 +-
> >  tools/testing/selftests/seccomp/seccomp_bpf.c |  2 +-
> >  3 files changed, 13 insertions(+), 2 deletions(-)
> >
> > diff --git a/init/Kconfig b/init/Kconfig
> > index b77c60f8b963..f62fca13ac5b 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -1194,6 +1194,7 @@ endif # NAMESPACES
> >  config CHECKPOINT_RESTORE
> >       bool "Checkpoint/restore support"
> >       select PROC_CHILDREN
> > +     select KCMP
> >       default n
> >       help
> >         Enables additional kernel features in a sake of checkpoint/restore.
> > @@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS
> >  config ARCH_HAS_MEMBARRIER_SYNC_CORE
> >       bool
> >
> > +config KCMP
> > +     bool "Enable kcmp() system call" if EXPERT
> > +     default y
>
> I would expect this to be not default-y, especially if
> CHECKPOINT_RESTORE does a "select" on it.
>
> This is a really powerful syscall, but it is bounded by ptrace access
> controls, and uses pointer address obfuscation, so it may be okay to
> expose this. As it is, at least Ubuntu already has
> CONFIG_CHECKPOINT_RESTORE, so really, there's probably not much
> difference on exposure.
>
> So, if you drop the "default y", I'm fine with this.

It was maybe stupid, but our userspace started relying on fd
comaprison through sys_kcomp. So for better or worse, if you want to
run the mesa3d gl/vk stacks, you need this. Was maybe not the brighest
ideas, but since enough distros had this enabled by defaults, it
wasn't really discovered, and now we're shipping this everywhere.

Ofc we can leave the default n, but the select if CONFIG_DRM is
unfortunately needed I think. For that part:

Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Also adding Dave Airlie for his take.
-Daniel

>
> -Kees
>
> > +     help
> > +       Enable the file descriptor comparison system call. It provides
> > +       user-space with the ability to compare two fd to see if they
> > +       point to the same file, and check other attributes.
> > +
> > +       If unsure, say Y.
> > +
> >  config RSEQ
> >       bool "Enable rseq() system call" if EXPERT
> >       default y
> > diff --git a/kernel/Makefile b/kernel/Makefile
> > index aa7368c7eabf..320f1f3941b7 100644
> > --- a/kernel/Makefile
> > +++ b/kernel/Makefile
> > @@ -51,7 +51,7 @@ obj-y += livepatch/
> >  obj-y += dma/
> >  obj-y += entry/
> >
> > -obj-$(CONFIG_CHECKPOINT_RESTORE) += kcmp.o
> > +obj-$(CONFIG_KCMP) += kcmp.o
> >  obj-$(CONFIG_FREEZER) += freezer.o
> >  obj-$(CONFIG_PROFILING) += profile.o
> >  obj-$(CONFIG_STACKTRACE) += stacktrace.o
> > diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > index 26c72f2b61b1..1b6c7d33c4ff 100644
> > --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> > +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > @@ -315,7 +315,7 @@ TEST(kcmp)
> >       ret = __filecmp(getpid(), getpid(), 1, 1);
> >       EXPECT_EQ(ret, 0);
> >       if (ret != 0 && errno == ENOSYS)
> > -             SKIP(return, "Kernel does not support kcmp() (missing CONFIG_CHECKPOINT_RESTORE?)");
> > +             SKIP(return, "Kernel does not support kcmp() (missing CONFIG_KCMP?)");
> >  }
> >
> >  TEST(mode_strict_support)
> > --
> > 2.20.1
> >
>
> --
> Kees Cook



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v2] kernel: Expose SYS_kcmp by default
  2021-02-05 16:37 [PATCH] kernel: Expose SYS_kcmp by default Chris Wilson
                   ` (3 preceding siblings ...)
  2021-02-05 19:46 ` kernel test robot
@ 2021-02-05 21:06 ` Chris Wilson
  2021-02-05 21:16   ` Chris Wilson
  2021-02-05 21:47   ` Rasmus Villemoes
  2021-02-05 22:00 ` [PATCH v3] kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE Chris Wilson
  2021-02-13 17:40 ` [PATCH] kernel: Expose SYS_kcmp by default Pavel Machek
  6 siblings, 2 replies; 28+ messages in thread
From: Chris Wilson @ 2021-02-05 21:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: dri-devel, intel-gfx, Chris Wilson, Kees Cook, Andy Lutomirski,
	Will Drewry, Andrew Morton, Dave Airlie, Daniel Vetter,
	Lucas Stach, Daniel Vetter

Userspace has discovered the functionality offered by SYS_kcmp and has
started to depend upon it. In particular, Mesa uses SYS_kcmp for
os_same_file_description() in order to identify when two fd (e.g. device
or dmabuf) point to the same struct file. Since they depend on it for
core functionality, lift SYS_kcmp out of the non-default
CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.

Note that some distributions such as Ubuntu are already enabling
CHECKPOINT_RESTORE in their configs and so, by extension, SYS_kcmp.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/3046
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> # DRM depends on SYS_kcmp

---
v2:
  - Default n.
  - Borrrow help message from man kcmp.
  - Export get_epoll_tfile_raw_ptr() for CONFIG_KCMP
---
 fs/eventpoll.c                                |  4 ++--
 include/linux/eventpoll.h                     |  2 +-
 init/Kconfig                                  | 12 ++++++++++++
 kernel/Makefile                               |  2 +-
 tools/testing/selftests/seccomp/seccomp_bpf.c |  2 +-
 5 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index a829af074eb5..3196474cbe24 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -979,7 +979,7 @@ static struct epitem *ep_find(struct eventpoll *ep, struct file *file, int fd)
 	return epir;
 }
 
-#ifdef CONFIG_CHECKPOINT_RESTORE
+#ifdef CONFIG_KCMP
 static struct epitem *ep_find_tfd(struct eventpoll *ep, int tfd, unsigned long toff)
 {
 	struct rb_node *rbp;
@@ -1021,7 +1021,7 @@ struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd,
 
 	return file_raw;
 }
-#endif /* CONFIG_CHECKPOINT_RESTORE */
+#endif /* CONFIG_KCMP */
 
 /**
  * Adds a new entry to the tail of the list in a lockless way, i.e.
diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
index 0350393465d4..593322c946e6 100644
--- a/include/linux/eventpoll.h
+++ b/include/linux/eventpoll.h
@@ -18,7 +18,7 @@ struct file;
 
 #ifdef CONFIG_EPOLL
 
-#ifdef CONFIG_CHECKPOINT_RESTORE
+#ifdef CONFIG_KCMP
 struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd, unsigned long toff);
 #endif
 
diff --git a/init/Kconfig b/init/Kconfig
index b77c60f8b963..1b75141bc18b 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1194,6 +1194,7 @@ endif # NAMESPACES
 config CHECKPOINT_RESTORE
 	bool "Checkpoint/restore support"
 	select PROC_CHILDREN
+	select KCMP
 	default n
 	help
 	  Enables additional kernel features in a sake of checkpoint/restore.
@@ -1737,6 +1738,17 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS
 config ARCH_HAS_MEMBARRIER_SYNC_CORE
 	bool
 
+config KCMP
+	bool "Enable kcmp() system call" if EXPERT
+	default n
+	help
+	  Enable the kernel resource comparison system call. It provides
+	  user-space with the ability to compare two processes to see if they
+	  share a common resource, such as a file descriptor or even virtual
+	  memory space.
+
+	  If unsure, say N.
+
 config RSEQ
 	bool "Enable rseq() system call" if EXPERT
 	default y
diff --git a/kernel/Makefile b/kernel/Makefile
index aa7368c7eabf..320f1f3941b7 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -51,7 +51,7 @@ obj-y += livepatch/
 obj-y += dma/
 obj-y += entry/
 
-obj-$(CONFIG_CHECKPOINT_RESTORE) += kcmp.o
+obj-$(CONFIG_KCMP) += kcmp.o
 obj-$(CONFIG_FREEZER) += freezer.o
 obj-$(CONFIG_PROFILING) += profile.o
 obj-$(CONFIG_STACKTRACE) += stacktrace.o
diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 26c72f2b61b1..1b6c7d33c4ff 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -315,7 +315,7 @@ TEST(kcmp)
 	ret = __filecmp(getpid(), getpid(), 1, 1);
 	EXPECT_EQ(ret, 0);
 	if (ret != 0 && errno == ENOSYS)
-		SKIP(return, "Kernel does not support kcmp() (missing CONFIG_CHECKPOINT_RESTORE?)");
+		SKIP(return, "Kernel does not support kcmp() (missing CONFIG_KCMP?)");
 }
 
 TEST(mode_strict_support)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] kernel: Expose SYS_kcmp by default
  2021-02-05 21:06 ` [PATCH v2] " Chris Wilson
@ 2021-02-05 21:16   ` Chris Wilson
  2021-02-05 21:20     ` Kees Cook
  2021-02-05 21:47   ` Rasmus Villemoes
  1 sibling, 1 reply; 28+ messages in thread
From: Chris Wilson @ 2021-02-05 21:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: dri-devel, intel-gfx, Kees Cook, Andy Lutomirski, Will Drewry,
	Andrew Morton, Dave Airlie, Daniel Vetter, Lucas Stach,
	Daniel Vetter

The subject should of course be changed, as it is no longer being
enabled by default.

Something like

kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE

Quoting Chris Wilson (2021-02-05 21:06:10)
> Userspace has discovered the functionality offered by SYS_kcmp and has
> started to depend upon it. In particular, Mesa uses SYS_kcmp for
> os_same_file_description() in order to identify when two fd (e.g. device
> or dmabuf) point to the same struct file. Since they depend on it for
> core functionality, lift SYS_kcmp out of the non-default
> CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.
> 
> Note that some distributions such as Ubuntu are already enabling
> CHECKPOINT_RESTORE in their configs and so, by extension, SYS_kcmp.
> 
> References: https://gitlab.freedesktop.org/drm/intel/-/issues/3046
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Will Drewry <wad@chromium.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Dave Airlie <airlied@gmail.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Lucas Stach <l.stach@pengutronix.de>
> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> # DRM depends on SYS_kcmp
> 
> ---
> v2:
>   - Default n.
>   - Borrrow help message from man kcmp.
>   - Export get_epoll_tfile_raw_ptr() for CONFIG_KCMP
> ---
>  fs/eventpoll.c                                |  4 ++--
>  include/linux/eventpoll.h                     |  2 +-
>  init/Kconfig                                  | 12 ++++++++++++
>  kernel/Makefile                               |  2 +-
>  tools/testing/selftests/seccomp/seccomp_bpf.c |  2 +-
>  5 files changed, 17 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> index a829af074eb5..3196474cbe24 100644
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -979,7 +979,7 @@ static struct epitem *ep_find(struct eventpoll *ep, struct file *file, int fd)
>         return epir;
>  }
>  
> -#ifdef CONFIG_CHECKPOINT_RESTORE
> +#ifdef CONFIG_KCMP
>  static struct epitem *ep_find_tfd(struct eventpoll *ep, int tfd, unsigned long toff)
>  {
>         struct rb_node *rbp;
> @@ -1021,7 +1021,7 @@ struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd,
>  
>         return file_raw;
>  }
> -#endif /* CONFIG_CHECKPOINT_RESTORE */
> +#endif /* CONFIG_KCMP */
>  
>  /**
>   * Adds a new entry to the tail of the list in a lockless way, i.e.
> diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
> index 0350393465d4..593322c946e6 100644
> --- a/include/linux/eventpoll.h
> +++ b/include/linux/eventpoll.h
> @@ -18,7 +18,7 @@ struct file;
>  
>  #ifdef CONFIG_EPOLL
>  
> -#ifdef CONFIG_CHECKPOINT_RESTORE
> +#ifdef CONFIG_KCMP
>  struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd, unsigned long toff);
>  #endif
>  
> diff --git a/init/Kconfig b/init/Kconfig
> index b77c60f8b963..1b75141bc18b 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1194,6 +1194,7 @@ endif # NAMESPACES
>  config CHECKPOINT_RESTORE
>         bool "Checkpoint/restore support"
>         select PROC_CHILDREN
> +       select KCMP
>         default n
>         help
>           Enables additional kernel features in a sake of checkpoint/restore.
> @@ -1737,6 +1738,17 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS
>  config ARCH_HAS_MEMBARRIER_SYNC_CORE
>         bool
>  
> +config KCMP
> +       bool "Enable kcmp() system call" if EXPERT
> +       default n
> +       help
> +         Enable the kernel resource comparison system call. It provides
> +         user-space with the ability to compare two processes to see if they
> +         share a common resource, such as a file descriptor or even virtual
> +         memory space.
> +
> +         If unsure, say N.
> +
>  config RSEQ
>         bool "Enable rseq() system call" if EXPERT
>         default y
> diff --git a/kernel/Makefile b/kernel/Makefile
> index aa7368c7eabf..320f1f3941b7 100644
> --- a/kernel/Makefile
> +++ b/kernel/Makefile
> @@ -51,7 +51,7 @@ obj-y += livepatch/
>  obj-y += dma/
>  obj-y += entry/
>  
> -obj-$(CONFIG_CHECKPOINT_RESTORE) += kcmp.o
> +obj-$(CONFIG_KCMP) += kcmp.o
>  obj-$(CONFIG_FREEZER) += freezer.o
>  obj-$(CONFIG_PROFILING) += profile.o
>  obj-$(CONFIG_STACKTRACE) += stacktrace.o
> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index 26c72f2b61b1..1b6c7d33c4ff 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -315,7 +315,7 @@ TEST(kcmp)
>         ret = __filecmp(getpid(), getpid(), 1, 1);
>         EXPECT_EQ(ret, 0);
>         if (ret != 0 && errno == ENOSYS)
> -               SKIP(return, "Kernel does not support kcmp() (missing CONFIG_CHECKPOINT_RESTORE?)");
> +               SKIP(return, "Kernel does not support kcmp() (missing CONFIG_KCMP?)");
>  }
>  
>  TEST(mode_strict_support)
> -- 
> 2.20.1
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] kernel: Expose SYS_kcmp by default
  2021-02-05 21:16   ` Chris Wilson
@ 2021-02-05 21:20     ` Kees Cook
  2021-02-05 21:28       ` Chris Wilson
  0 siblings, 1 reply; 28+ messages in thread
From: Kees Cook @ 2021-02-05 21:20 UTC (permalink / raw)
  To: Chris Wilson
  Cc: linux-kernel, dri-devel, intel-gfx, Andy Lutomirski, Will Drewry,
	Andrew Morton, Dave Airlie, Daniel Vetter, Lucas Stach,
	Daniel Vetter

On Fri, Feb 05, 2021 at 09:16:01PM +0000, Chris Wilson wrote:
> The subject should of course be changed, as it is no longer being
> enabled by default.

"default n" is redundant. I thought Daniel said CONFIG_DRM needed to
"select" it too, though? Otherwise, yeah, this looks good. Was the
export due to the 0-day bot failure reports?

-Kees

> 
> Something like
> 
> kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE
> 
> Quoting Chris Wilson (2021-02-05 21:06:10)
> > Userspace has discovered the functionality offered by SYS_kcmp and has
> > started to depend upon it. In particular, Mesa uses SYS_kcmp for
> > os_same_file_description() in order to identify when two fd (e.g. device
> > or dmabuf) point to the same struct file. Since they depend on it for
> > core functionality, lift SYS_kcmp out of the non-default
> > CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.
> > 
> > Note that some distributions such as Ubuntu are already enabling
> > CHECKPOINT_RESTORE in their configs and so, by extension, SYS_kcmp.
> > 
> > References: https://gitlab.freedesktop.org/drm/intel/-/issues/3046
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Andy Lutomirski <luto@amacapital.net>
> > Cc: Will Drewry <wad@chromium.org>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Dave Airlie <airlied@gmail.com>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > Cc: Lucas Stach <l.stach@pengutronix.de>
> > Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> # DRM depends on SYS_kcmp
> > 
> > ---
> > v2:
> >   - Default n.
> >   - Borrrow help message from man kcmp.
> >   - Export get_epoll_tfile_raw_ptr() for CONFIG_KCMP
> > ---
> >  fs/eventpoll.c                                |  4 ++--
> >  include/linux/eventpoll.h                     |  2 +-
> >  init/Kconfig                                  | 12 ++++++++++++
> >  kernel/Makefile                               |  2 +-
> >  tools/testing/selftests/seccomp/seccomp_bpf.c |  2 +-
> >  5 files changed, 17 insertions(+), 5 deletions(-)
> > 
> > diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> > index a829af074eb5..3196474cbe24 100644
> > --- a/fs/eventpoll.c
> > +++ b/fs/eventpoll.c
> > @@ -979,7 +979,7 @@ static struct epitem *ep_find(struct eventpoll *ep, struct file *file, int fd)
> >         return epir;
> >  }
> >  
> > -#ifdef CONFIG_CHECKPOINT_RESTORE
> > +#ifdef CONFIG_KCMP
> >  static struct epitem *ep_find_tfd(struct eventpoll *ep, int tfd, unsigned long toff)
> >  {
> >         struct rb_node *rbp;
> > @@ -1021,7 +1021,7 @@ struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd,
> >  
> >         return file_raw;
> >  }
> > -#endif /* CONFIG_CHECKPOINT_RESTORE */
> > +#endif /* CONFIG_KCMP */
> >  
> >  /**
> >   * Adds a new entry to the tail of the list in a lockless way, i.e.
> > diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
> > index 0350393465d4..593322c946e6 100644
> > --- a/include/linux/eventpoll.h
> > +++ b/include/linux/eventpoll.h
> > @@ -18,7 +18,7 @@ struct file;
> >  
> >  #ifdef CONFIG_EPOLL
> >  
> > -#ifdef CONFIG_CHECKPOINT_RESTORE
> > +#ifdef CONFIG_KCMP
> >  struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd, unsigned long toff);
> >  #endif
> >  
> > diff --git a/init/Kconfig b/init/Kconfig
> > index b77c60f8b963..1b75141bc18b 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -1194,6 +1194,7 @@ endif # NAMESPACES
> >  config CHECKPOINT_RESTORE
> >         bool "Checkpoint/restore support"
> >         select PROC_CHILDREN
> > +       select KCMP
> >         default n
> >         help
> >           Enables additional kernel features in a sake of checkpoint/restore.
> > @@ -1737,6 +1738,17 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS
> >  config ARCH_HAS_MEMBARRIER_SYNC_CORE
> >         bool
> >  
> > +config KCMP
> > +       bool "Enable kcmp() system call" if EXPERT
> > +       default n
> > +       help
> > +         Enable the kernel resource comparison system call. It provides
> > +         user-space with the ability to compare two processes to see if they
> > +         share a common resource, such as a file descriptor or even virtual
> > +         memory space.
> > +
> > +         If unsure, say N.
> > +
> >  config RSEQ
> >         bool "Enable rseq() system call" if EXPERT
> >         default y
> > diff --git a/kernel/Makefile b/kernel/Makefile
> > index aa7368c7eabf..320f1f3941b7 100644
> > --- a/kernel/Makefile
> > +++ b/kernel/Makefile
> > @@ -51,7 +51,7 @@ obj-y += livepatch/
> >  obj-y += dma/
> >  obj-y += entry/
> >  
> > -obj-$(CONFIG_CHECKPOINT_RESTORE) += kcmp.o
> > +obj-$(CONFIG_KCMP) += kcmp.o
> >  obj-$(CONFIG_FREEZER) += freezer.o
> >  obj-$(CONFIG_PROFILING) += profile.o
> >  obj-$(CONFIG_STACKTRACE) += stacktrace.o
> > diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > index 26c72f2b61b1..1b6c7d33c4ff 100644
> > --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> > +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > @@ -315,7 +315,7 @@ TEST(kcmp)
> >         ret = __filecmp(getpid(), getpid(), 1, 1);
> >         EXPECT_EQ(ret, 0);
> >         if (ret != 0 && errno == ENOSYS)
> > -               SKIP(return, "Kernel does not support kcmp() (missing CONFIG_CHECKPOINT_RESTORE?)");
> > +               SKIP(return, "Kernel does not support kcmp() (missing CONFIG_KCMP?)");
> >  }
> >  
> >  TEST(mode_strict_support)
> > -- 
> > 2.20.1
> >

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] kernel: Expose SYS_kcmp by default
  2021-02-05 21:20     ` Kees Cook
@ 2021-02-05 21:28       ` Chris Wilson
  2021-02-05 21:48         ` Daniel Vetter
  0 siblings, 1 reply; 28+ messages in thread
From: Chris Wilson @ 2021-02-05 21:28 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-kernel, dri-devel, intel-gfx, Andy Lutomirski, Will Drewry,
	Andrew Morton, Dave Airlie, Daniel Vetter, Lucas Stach,
	Daniel Vetter

Quoting Kees Cook (2021-02-05 21:20:33)
> On Fri, Feb 05, 2021 at 09:16:01PM +0000, Chris Wilson wrote:
> > The subject should of course be changed, as it is no longer being
> > enabled by default.
> 
> "default n" is redundant.

I thought being explicit would be preferred. There are a few other
default n, so at least it's not the odd-one-out!

> I thought Daniel said CONFIG_DRM needed to
> "select" it too, though?

Yes. We will need to select it for any DRM driver so that the Vulkan/GL
stacks can rely on having SYS_kcmp. That deserves to be handled and
explain within drm/Kconfig, and as they are already shipping with calls
to SYS_kcmp we may have to ask for a stable backport.

> Otherwise, yeah, this looks good. Was the
> export due to the 0-day bot failure reports?

Yes.
-Chris

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] kernel: Expose SYS_kcmp by default
  2021-02-05 21:06 ` [PATCH v2] " Chris Wilson
  2021-02-05 21:16   ` Chris Wilson
@ 2021-02-05 21:47   ` Rasmus Villemoes
  1 sibling, 0 replies; 28+ messages in thread
From: Rasmus Villemoes @ 2021-02-05 21:47 UTC (permalink / raw)
  To: Chris Wilson, linux-kernel
  Cc: dri-devel, intel-gfx, Kees Cook, Andy Lutomirski, Will Drewry,
	Andrew Morton, Dave Airlie, Daniel Vetter, Lucas Stach,
	Daniel Vetter, Cyrill Gorcunov

On 05/02/2021 22.06, Chris Wilson wrote:
> Userspace has discovered the functionality offered by SYS_kcmp and has
> started to depend upon it. In particular, Mesa uses SYS_kcmp for
> os_same_file_description() in order to identify when two fd (e.g. device
> or dmabuf) point to the same struct file. Since they depend on it for
> core functionality, lift SYS_kcmp out of the non-default
> CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.
> 
> Note that some distributions such as Ubuntu are already enabling
> CHECKPOINT_RESTORE in their configs and so, by extension, SYS_kcmp.
> 

Looks a lot like
https://lore.kernel.org/lkml/20200710075632.14661-1-linux@rasmusvillemoes.dk/.
So FWIW, ack from me.

cc += Cyrill.

> References: https://gitlab.freedesktop.org/drm/intel/-/issues/3046
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Will Drewry <wad@chromium.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Dave Airlie <airlied@gmail.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Lucas Stach <l.stach@pengutronix.de>
> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> # DRM depends on SYS_kcmp
> 
> ---
> v2:
>   - Default n.
>   - Borrrow help message from man kcmp.
>   - Export get_epoll_tfile_raw_ptr() for CONFIG_KCMP
> ---
>  fs/eventpoll.c                                |  4 ++--
>  include/linux/eventpoll.h                     |  2 +-
>  init/Kconfig                                  | 12 ++++++++++++
>  kernel/Makefile                               |  2 +-
>  tools/testing/selftests/seccomp/seccomp_bpf.c |  2 +-
>  5 files changed, 17 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> index a829af074eb5..3196474cbe24 100644
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -979,7 +979,7 @@ static struct epitem *ep_find(struct eventpoll *ep, struct file *file, int fd)
>  	return epir;
>  }
>  
> -#ifdef CONFIG_CHECKPOINT_RESTORE
> +#ifdef CONFIG_KCMP
>  static struct epitem *ep_find_tfd(struct eventpoll *ep, int tfd, unsigned long toff)
>  {
>  	struct rb_node *rbp;
> @@ -1021,7 +1021,7 @@ struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd,
>  
>  	return file_raw;
>  }
> -#endif /* CONFIG_CHECKPOINT_RESTORE */
> +#endif /* CONFIG_KCMP */
>  
>  /**
>   * Adds a new entry to the tail of the list in a lockless way, i.e.
> diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
> index 0350393465d4..593322c946e6 100644
> --- a/include/linux/eventpoll.h
> +++ b/include/linux/eventpoll.h
> @@ -18,7 +18,7 @@ struct file;
>  
>  #ifdef CONFIG_EPOLL
>  
> -#ifdef CONFIG_CHECKPOINT_RESTORE
> +#ifdef CONFIG_KCMP
>  struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd, unsigned long toff);
>  #endif
>  
> diff --git a/init/Kconfig b/init/Kconfig
> index b77c60f8b963..1b75141bc18b 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1194,6 +1194,7 @@ endif # NAMESPACES
>  config CHECKPOINT_RESTORE
>  	bool "Checkpoint/restore support"
>  	select PROC_CHILDREN
> +	select KCMP
>  	default n
>  	help
>  	  Enables additional kernel features in a sake of checkpoint/restore.
> @@ -1737,6 +1738,17 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS
>  config ARCH_HAS_MEMBARRIER_SYNC_CORE
>  	bool
>  
> +config KCMP
> +	bool "Enable kcmp() system call" if EXPERT
> +	default n
> +	help
> +	  Enable the kernel resource comparison system call. It provides
> +	  user-space with the ability to compare two processes to see if they
> +	  share a common resource, such as a file descriptor or even virtual
> +	  memory space.
> +
> +	  If unsure, say N.
> +
>  config RSEQ
>  	bool "Enable rseq() system call" if EXPERT
>  	default y
> diff --git a/kernel/Makefile b/kernel/Makefile
> index aa7368c7eabf..320f1f3941b7 100644
> --- a/kernel/Makefile
> +++ b/kernel/Makefile
> @@ -51,7 +51,7 @@ obj-y += livepatch/
>  obj-y += dma/
>  obj-y += entry/
>  
> -obj-$(CONFIG_CHECKPOINT_RESTORE) += kcmp.o
> +obj-$(CONFIG_KCMP) += kcmp.o
>  obj-$(CONFIG_FREEZER) += freezer.o
>  obj-$(CONFIG_PROFILING) += profile.o
>  obj-$(CONFIG_STACKTRACE) += stacktrace.o
> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index 26c72f2b61b1..1b6c7d33c4ff 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -315,7 +315,7 @@ TEST(kcmp)
>  	ret = __filecmp(getpid(), getpid(), 1, 1);
>  	EXPECT_EQ(ret, 0);
>  	if (ret != 0 && errno == ENOSYS)
> -		SKIP(return, "Kernel does not support kcmp() (missing CONFIG_CHECKPOINT_RESTORE?)");
> +		SKIP(return, "Kernel does not support kcmp() (missing CONFIG_KCMP?)");
>  }
>  
>  TEST(mode_strict_support)
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] kernel: Expose SYS_kcmp by default
  2021-02-05 21:28       ` Chris Wilson
@ 2021-02-05 21:48         ` Daniel Vetter
  0 siblings, 0 replies; 28+ messages in thread
From: Daniel Vetter @ 2021-02-05 21:48 UTC (permalink / raw)
  To: Chris Wilson
  Cc: Kees Cook, Linux Kernel Mailing List, dri-devel, intel-gfx,
	Andy Lutomirski, Will Drewry, Andrew Morton, Dave Airlie,
	Lucas Stach

On Fri, Feb 5, 2021 at 10:28 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Quoting Kees Cook (2021-02-05 21:20:33)
> > On Fri, Feb 05, 2021 at 09:16:01PM +0000, Chris Wilson wrote:
> > > The subject should of course be changed, as it is no longer being
> > > enabled by default.
> >
> > "default n" is redundant.
>
> I thought being explicit would be preferred. There are a few other
> default n, so at least it's not the odd-one-out!
>
> > I thought Daniel said CONFIG_DRM needed to
> > "select" it too, though?
>
> Yes. We will need to select it for any DRM driver so that the Vulkan/GL
> stacks can rely on having SYS_kcmp. That deserves to be handled and
> explain within drm/Kconfig, and as they are already shipping with calls
> to SYS_kcmp we may have to ask for a stable backport.

Oh I dreamed and thought it's part of this patch already. So v3 with
matching subject to enabled it for drm?
-Daniel

>
> > Otherwise, yeah, this looks good. Was the
> > export due to the 0-day bot failure reports?
>
> Yes.
> -Chris



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v3] kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE
  2021-02-05 16:37 [PATCH] kernel: Expose SYS_kcmp by default Chris Wilson
                   ` (4 preceding siblings ...)
  2021-02-05 21:06 ` [PATCH v2] " Chris Wilson
@ 2021-02-05 22:00 ` Chris Wilson
  2021-02-06 12:47   ` Cyrill Gorcunov
                     ` (3 more replies)
  2021-02-13 17:40 ` [PATCH] kernel: Expose SYS_kcmp by default Pavel Machek
  6 siblings, 4 replies; 28+ messages in thread
From: Chris Wilson @ 2021-02-05 22:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: dri-devel, intel-gfx, Chris Wilson, Kees Cook, Andy Lutomirski,
	Will Drewry, Andrew Morton, Dave Airlie, Daniel Vetter,
	Lucas Stach, Rasmus Villemoes, Cyrill Gorcunov, stable,
	Daniel Vetter

Userspace has discovered the functionality offered by SYS_kcmp and has
started to depend upon it. In particular, Mesa uses SYS_kcmp for
os_same_file_description() in order to identify when two fd (e.g. device
or dmabuf) point to the same struct file. Since they depend on it for
core functionality, lift SYS_kcmp out of the non-default
CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.

Rasmus Villemoes also pointed out that systemd uses SYS_kcmp to
deduplicate the per-service file descriptor store.

Note that some distributions such as Ubuntu are already enabling
CHECKPOINT_RESTORE in their configs and so, by extension, SYS_kcmp.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/3046
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: stable@vger.kernel.org
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> # DRM depends on kcmp
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> # systemd uses kcmp

---
v2:
  - Default n.
  - Borrrow help message from man kcmp.
  - Export get_epoll_tfile_raw_ptr() for CONFIG_KCMP
v3:
  - Select KCMP for CONFIG_DRM
---
 drivers/gpu/drm/Kconfig                       |  3 +++
 fs/eventpoll.c                                |  4 ++--
 include/linux/eventpoll.h                     |  2 +-
 init/Kconfig                                  | 11 +++++++++++
 kernel/Makefile                               |  2 +-
 tools/testing/selftests/seccomp/seccomp_bpf.c |  2 +-
 6 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 0973f408d75f..af6c6d214d91 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -15,6 +15,9 @@ menuconfig DRM
 	select I2C_ALGOBIT
 	select DMA_SHARED_BUFFER
 	select SYNC_FILE
+# gallium uses SYS_kcmp for os_same_file_description() to de-duplicate
+# device and dmabuf fd. Let's make sure that is available for our userspace.
+	select KCMP
 	help
 	  Kernel-level support for the Direct Rendering Infrastructure (DRI)
 	  introduced in XFree86 4.0. If you say Y here, you need to select
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index a829af074eb5..3196474cbe24 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -979,7 +979,7 @@ static struct epitem *ep_find(struct eventpoll *ep, struct file *file, int fd)
 	return epir;
 }
 
-#ifdef CONFIG_CHECKPOINT_RESTORE
+#ifdef CONFIG_KCMP
 static struct epitem *ep_find_tfd(struct eventpoll *ep, int tfd, unsigned long toff)
 {
 	struct rb_node *rbp;
@@ -1021,7 +1021,7 @@ struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd,
 
 	return file_raw;
 }
-#endif /* CONFIG_CHECKPOINT_RESTORE */
+#endif /* CONFIG_KCMP */
 
 /**
  * Adds a new entry to the tail of the list in a lockless way, i.e.
diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
index 0350393465d4..593322c946e6 100644
--- a/include/linux/eventpoll.h
+++ b/include/linux/eventpoll.h
@@ -18,7 +18,7 @@ struct file;
 
 #ifdef CONFIG_EPOLL
 
-#ifdef CONFIG_CHECKPOINT_RESTORE
+#ifdef CONFIG_KCMP
 struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd, unsigned long toff);
 #endif
 
diff --git a/init/Kconfig b/init/Kconfig
index b77c60f8b963..9cc7436b2f73 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1194,6 +1194,7 @@ endif # NAMESPACES
 config CHECKPOINT_RESTORE
 	bool "Checkpoint/restore support"
 	select PROC_CHILDREN
+	select KCMP
 	default n
 	help
 	  Enables additional kernel features in a sake of checkpoint/restore.
@@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS
 config ARCH_HAS_MEMBARRIER_SYNC_CORE
 	bool
 
+config KCMP
+	bool "Enable kcmp() system call" if EXPERT
+	help
+	  Enable the kernel resource comparison system call. It provides
+	  user-space with the ability to compare two processes to see if they
+	  share a common resource, such as a file descriptor or even virtual
+	  memory space.
+
+	  If unsure, say N.
+
 config RSEQ
 	bool "Enable rseq() system call" if EXPERT
 	default y
diff --git a/kernel/Makefile b/kernel/Makefile
index aa7368c7eabf..320f1f3941b7 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -51,7 +51,7 @@ obj-y += livepatch/
 obj-y += dma/
 obj-y += entry/
 
-obj-$(CONFIG_CHECKPOINT_RESTORE) += kcmp.o
+obj-$(CONFIG_KCMP) += kcmp.o
 obj-$(CONFIG_FREEZER) += freezer.o
 obj-$(CONFIG_PROFILING) += profile.o
 obj-$(CONFIG_STACKTRACE) += stacktrace.o
diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 26c72f2b61b1..1b6c7d33c4ff 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -315,7 +315,7 @@ TEST(kcmp)
 	ret = __filecmp(getpid(), getpid(), 1, 1);
 	EXPECT_EQ(ret, 0);
 	if (ret != 0 && errno == ENOSYS)
-		SKIP(return, "Kernel does not support kcmp() (missing CONFIG_CHECKPOINT_RESTORE?)");
+		SKIP(return, "Kernel does not support kcmp() (missing CONFIG_KCMP?)");
 }
 
 TEST(mode_strict_support)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v3] kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE
  2021-02-05 22:00 ` [PATCH v3] kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE Chris Wilson
@ 2021-02-06 12:47   ` Cyrill Gorcunov
  2021-02-08 22:12   ` Kees Cook
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 28+ messages in thread
From: Cyrill Gorcunov @ 2021-02-06 12:47 UTC (permalink / raw)
  To: Chris Wilson
  Cc: linux-kernel, dri-devel, intel-gfx, Kees Cook, Andy Lutomirski,
	Will Drewry, Andrew Morton, Dave Airlie, Daniel Vetter,
	Lucas Stach, Rasmus Villemoes, stable, Daniel Vetter

On Fri, Feb 05, 2021 at 10:00:12PM +0000, Chris Wilson wrote:
> Userspace has discovered the functionality offered by SYS_kcmp and has
> started to depend upon it. In particular, Mesa uses SYS_kcmp for
> os_same_file_description() in order to identify when two fd (e.g. device
> or dmabuf) point to the same struct file. Since they depend on it for
> core functionality, lift SYS_kcmp out of the non-default
> CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.
> 
...
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] kernel: Expose SYS_kcmp by default
  2021-02-05 20:53   ` Daniel Vetter
@ 2021-02-08 11:49     ` Michel Dänzer
  2021-02-08 13:11       ` Michel Dänzer
  2021-02-08 13:34       ` Daniel Vetter
  0 siblings, 2 replies; 28+ messages in thread
From: Michel Dänzer @ 2021-02-08 11:49 UTC (permalink / raw)
  To: Daniel Vetter, Kees Cook, airlied
  Cc: Will Drewry, Jann Horn, intel-gfx, Linux Kernel Mailing List,
	dri-devel, Andy Lutomirski, Andrew Morton, Chris Wilson

On 2021-02-05 9:53 p.m., Daniel Vetter wrote:
> On Fri, Feb 5, 2021 at 7:37 PM Kees Cook <keescook@chromium.org> wrote:
>>
>> On Fri, Feb 05, 2021 at 04:37:52PM +0000, Chris Wilson wrote:
>>> Userspace has discovered the functionality offered by SYS_kcmp and has
>>> started to depend upon it. In particular, Mesa uses SYS_kcmp for
>>> os_same_file_description() in order to identify when two fd (e.g. device
>>> or dmabuf) point to the same struct file. Since they depend on it for
>>> core functionality, lift SYS_kcmp out of the non-default
>>> CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Kees Cook <keescook@chromium.org>
>>> Cc: Andy Lutomirski <luto@amacapital.net>
>>> Cc: Will Drewry <wad@chromium.org>
>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>> Cc: Dave Airlie <airlied@gmail.com>
>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>> Cc: Lucas Stach <l.stach@pengutronix.de>
>>> ---
>>>   init/Kconfig                                  | 11 +++++++++++
>>>   kernel/Makefile                               |  2 +-
>>>   tools/testing/selftests/seccomp/seccomp_bpf.c |  2 +-
>>>   3 files changed, 13 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/init/Kconfig b/init/Kconfig
>>> index b77c60f8b963..f62fca13ac5b 100644
>>> --- a/init/Kconfig
>>> +++ b/init/Kconfig
>>> @@ -1194,6 +1194,7 @@ endif # NAMESPACES
>>>   config CHECKPOINT_RESTORE
>>>        bool "Checkpoint/restore support"
>>>        select PROC_CHILDREN
>>> +     select KCMP
>>>        default n
>>>        help
>>>          Enables additional kernel features in a sake of checkpoint/restore.
>>> @@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS
>>>   config ARCH_HAS_MEMBARRIER_SYNC_CORE
>>>        bool
>>>
>>> +config KCMP
>>> +     bool "Enable kcmp() system call" if EXPERT
>>> +     default y
>>
>> I would expect this to be not default-y, especially if
>> CHECKPOINT_RESTORE does a "select" on it.
>>
>> This is a really powerful syscall, but it is bounded by ptrace access
>> controls, and uses pointer address obfuscation, so it may be okay to
>> expose this. As it is, at least Ubuntu already has
>> CONFIG_CHECKPOINT_RESTORE, so really, there's probably not much
>> difference on exposure.
>>
>> So, if you drop the "default y", I'm fine with this.
> 
> It was maybe stupid, but our userspace started relying on fd
> comaprison through sys_kcomp. So for better or worse, if you want to
> run the mesa3d gl/vk stacks, you need this.

That's overstating things somewhat. The vast majority of applications 
will work fine regardless (as they did before Mesa started using this 
functionality). Only some special ones will run into issues, because the 
user-space drivers incorrectly assume two file descriptors reference 
different descriptions.


> Was maybe not the brighest ideas, but since enough distros had this
> enabled by defaults,

Right, that (and the above) is why I considered it fair game to use. 
What should I have done instead? (TBH I was surprised that this 
functionality isn't generally available)

> it wasn't really discovered, and now we're
> shipping this everywhere.

You're making it sound like this snuck in secretly somehow, which is not 
true of course.


> Ofc we can leave the default n, but the select if CONFIG_DRM is
> unfortunately needed I think.

Per above, not sure this is really true.


-- 
Earthling Michel Dänzer               |               https://redhat.com
Libre software enthusiast             |             Mesa and X developer

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] kernel: Expose SYS_kcmp by default
  2021-02-08 11:49     ` Michel Dänzer
@ 2021-02-08 13:11       ` Michel Dänzer
  2021-02-08 13:34       ` Daniel Vetter
  1 sibling, 0 replies; 28+ messages in thread
From: Michel Dänzer @ 2021-02-08 13:11 UTC (permalink / raw)
  To: Daniel Vetter, Kees Cook, airlied
  Cc: Will Drewry, Jann Horn, intel-gfx, Linux Kernel Mailing List,
	dri-devel, Andy Lutomirski, Andrew Morton, Chris Wilson

On 2021-02-08 12:49 p.m., Michel Dänzer wrote:
> On 2021-02-05 9:53 p.m., Daniel Vetter wrote:
>> On Fri, Feb 5, 2021 at 7:37 PM Kees Cook <keescook@chromium.org> wrote:
>>>
>>> On Fri, Feb 05, 2021 at 04:37:52PM +0000, Chris Wilson wrote:
>>>> Userspace has discovered the functionality offered by SYS_kcmp and has
>>>> started to depend upon it. In particular, Mesa uses SYS_kcmp for
>>>> os_same_file_description() in order to identify when two fd (e.g. 
>>>> device
>>>> or dmabuf) point to the same struct file. Since they depend on it for
>>>> core functionality, lift SYS_kcmp out of the non-default
>>>> CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.
>>>>
>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>> Cc: Kees Cook <keescook@chromium.org>
>>>> Cc: Andy Lutomirski <luto@amacapital.net>
>>>> Cc: Will Drewry <wad@chromium.org>
>>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>>> Cc: Dave Airlie <airlied@gmail.com>
>>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>>> Cc: Lucas Stach <l.stach@pengutronix.de>
>>>> ---
>>>>   init/Kconfig                                  | 11 +++++++++++
>>>>   kernel/Makefile                               |  2 +-
>>>>   tools/testing/selftests/seccomp/seccomp_bpf.c |  2 +-
>>>>   3 files changed, 13 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/init/Kconfig b/init/Kconfig
>>>> index b77c60f8b963..f62fca13ac5b 100644
>>>> --- a/init/Kconfig
>>>> +++ b/init/Kconfig
>>>> @@ -1194,6 +1194,7 @@ endif # NAMESPACES
>>>>   config CHECKPOINT_RESTORE
>>>>        bool "Checkpoint/restore support"
>>>>        select PROC_CHILDREN
>>>> +     select KCMP
>>>>        default n
>>>>        help
>>>>          Enables additional kernel features in a sake of 
>>>> checkpoint/restore.
>>>> @@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS
>>>>   config ARCH_HAS_MEMBARRIER_SYNC_CORE
>>>>        bool
>>>>
>>>> +config KCMP
>>>> +     bool "Enable kcmp() system call" if EXPERT
>>>> +     default y
>>>
>>> I would expect this to be not default-y, especially if
>>> CHECKPOINT_RESTORE does a "select" on it.
>>>
>>> This is a really powerful syscall, but it is bounded by ptrace access
>>> controls, and uses pointer address obfuscation, so it may be okay to
>>> expose this. As it is, at least Ubuntu already has
>>> CONFIG_CHECKPOINT_RESTORE, so really, there's probably not much
>>> difference on exposure.
>>>
>>> So, if you drop the "default y", I'm fine with this.
>>
>> It was maybe stupid, but our userspace started relying on fd
>> comaprison through sys_kcomp. So for better or worse, if you want to
>> run the mesa3d gl/vk stacks, you need this.
> 
> That's overstating things somewhat. The vast majority of applications 
> will work fine regardless (as they did before Mesa started using this 
> functionality). Only some special ones will run into issues, because the 
> user-space drivers incorrectly assume two file descriptors reference 
> different descriptions.
> 
> 
>> Was maybe not the brighest ideas, but since enough distros had this
>> enabled by defaults,
> 
> Right, that (and the above) is why I considered it fair game to use. 
> What should I have done instead? (TBH I was surprised that this 
> functionality isn't generally available)

In that spirit, an alternative might be to make KCMP_FILE available 
unconditionally, and the rest of SYS_kcmp only with CHECKPOINT_RESTORE 
as before. (Or maybe other parts of SYS_kcmp are generally useful as well?)


-- 
Earthling Michel Dänzer               |               https://redhat.com
Libre software enthusiast             |             Mesa and X developer

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] kernel: Expose SYS_kcmp by default
  2021-02-08 11:49     ` Michel Dänzer
  2021-02-08 13:11       ` Michel Dänzer
@ 2021-02-08 13:34       ` Daniel Vetter
  2021-02-08 13:49         ` Michel Dänzer
  1 sibling, 1 reply; 28+ messages in thread
From: Daniel Vetter @ 2021-02-08 13:34 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: Kees Cook, airlied, Will Drewry, Jann Horn, intel-gfx,
	Linux Kernel Mailing List, dri-devel, Andy Lutomirski,
	Andrew Morton, Chris Wilson

On Mon, Feb 8, 2021 at 12:49 PM Michel Dänzer <michel@daenzer.net> wrote:
>
> On 2021-02-05 9:53 p.m., Daniel Vetter wrote:
> > On Fri, Feb 5, 2021 at 7:37 PM Kees Cook <keescook@chromium.org> wrote:
> >>
> >> On Fri, Feb 05, 2021 at 04:37:52PM +0000, Chris Wilson wrote:
> >>> Userspace has discovered the functionality offered by SYS_kcmp and has
> >>> started to depend upon it. In particular, Mesa uses SYS_kcmp for
> >>> os_same_file_description() in order to identify when two fd (e.g. device
> >>> or dmabuf) point to the same struct file. Since they depend on it for
> >>> core functionality, lift SYS_kcmp out of the non-default
> >>> CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.
> >>>
> >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>> Cc: Kees Cook <keescook@chromium.org>
> >>> Cc: Andy Lutomirski <luto@amacapital.net>
> >>> Cc: Will Drewry <wad@chromium.org>
> >>> Cc: Andrew Morton <akpm@linux-foundation.org>
> >>> Cc: Dave Airlie <airlied@gmail.com>
> >>> Cc: Daniel Vetter <daniel@ffwll.ch>
> >>> Cc: Lucas Stach <l.stach@pengutronix.de>
> >>> ---
> >>>   init/Kconfig                                  | 11 +++++++++++
> >>>   kernel/Makefile                               |  2 +-
> >>>   tools/testing/selftests/seccomp/seccomp_bpf.c |  2 +-
> >>>   3 files changed, 13 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/init/Kconfig b/init/Kconfig
> >>> index b77c60f8b963..f62fca13ac5b 100644
> >>> --- a/init/Kconfig
> >>> +++ b/init/Kconfig
> >>> @@ -1194,6 +1194,7 @@ endif # NAMESPACES
> >>>   config CHECKPOINT_RESTORE
> >>>        bool "Checkpoint/restore support"
> >>>        select PROC_CHILDREN
> >>> +     select KCMP
> >>>        default n
> >>>        help
> >>>          Enables additional kernel features in a sake of checkpoint/restore.
> >>> @@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS
> >>>   config ARCH_HAS_MEMBARRIER_SYNC_CORE
> >>>        bool
> >>>
> >>> +config KCMP
> >>> +     bool "Enable kcmp() system call" if EXPERT
> >>> +     default y
> >>
> >> I would expect this to be not default-y, especially if
> >> CHECKPOINT_RESTORE does a "select" on it.
> >>
> >> This is a really powerful syscall, but it is bounded by ptrace access
> >> controls, and uses pointer address obfuscation, so it may be okay to
> >> expose this. As it is, at least Ubuntu already has
> >> CONFIG_CHECKPOINT_RESTORE, so really, there's probably not much
> >> difference on exposure.
> >>
> >> So, if you drop the "default y", I'm fine with this.
> >
> > It was maybe stupid, but our userspace started relying on fd
> > comaprison through sys_kcomp. So for better or worse, if you want to
> > run the mesa3d gl/vk stacks, you need this.
>
> That's overstating things somewhat. The vast majority of applications
> will work fine regardless (as they did before Mesa started using this
> functionality). Only some special ones will run into issues, because the
> user-space drivers incorrectly assume two file descriptors reference
> different descriptions.
>
>
> > Was maybe not the brighest ideas, but since enough distros had this
> > enabled by defaults,
>
> Right, that (and the above) is why I considered it fair game to use.
> What should I have done instead? (TBH I was surprised that this
> functionality isn't generally available)

Yeah that one is fine, but I thought we've discussed (irc or
something) more uses for de-duping dma-buf and stuff like that. But
quick grep says that hasn't landed yet, so I got a bit confused (or
just dreamt). Looking at this again I'm kinda surprised the drmfd
de-duping blows up on normal linux distros, but I guess it can all
happen.

> > it wasn't really discovered, and now we're
> > shipping this everywhere.
>
> You're making it sound like this snuck in secretly somehow, which is not
> true of course.
>
>
> > Ofc we can leave the default n, but the select if CONFIG_DRM is
> > unfortunately needed I think.
>
> Per above, not sure this is really true.

We seem to be going boom on linux distros now, maybe userspace got
more creative in abusing stuff? The entire thing is small enough that
imo we don't really have to care, e.g. we also unconditionally select
dma-buf, despite that on most systems there's only 1 gpu, and you're
never going to end up with a buffer sharing case that needs any of
that code (aside from the "here's an fd" part).

But I guess we can limit to just KCMP_FILE like you suggest in another
reply. Just feels a bit like overkill.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] kernel: Expose SYS_kcmp by default
  2021-02-08 13:34       ` Daniel Vetter
@ 2021-02-08 13:49         ` Michel Dänzer
  0 siblings, 0 replies; 28+ messages in thread
From: Michel Dänzer @ 2021-02-08 13:49 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Will Drewry, Kees Cook, Jann Horn, intel-gfx,
	Linux Kernel Mailing List, dri-devel, Andy Lutomirski,
	Andrew Morton, Chris Wilson

On 2021-02-08 2:34 p.m., Daniel Vetter wrote:
> On Mon, Feb 8, 2021 at 12:49 PM Michel Dänzer <michel@daenzer.net> wrote:
>>
>> On 2021-02-05 9:53 p.m., Daniel Vetter wrote:
>>> On Fri, Feb 5, 2021 at 7:37 PM Kees Cook <keescook@chromium.org> wrote:
>>>>
>>>> On Fri, Feb 05, 2021 at 04:37:52PM +0000, Chris Wilson wrote:
>>>>> Userspace has discovered the functionality offered by SYS_kcmp and has
>>>>> started to depend upon it. In particular, Mesa uses SYS_kcmp for
>>>>> os_same_file_description() in order to identify when two fd (e.g. device
>>>>> or dmabuf) point to the same struct file. Since they depend on it for
>>>>> core functionality, lift SYS_kcmp out of the non-default
>>>>> CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.
>>>>>
>>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> Cc: Kees Cook <keescook@chromium.org>
>>>>> Cc: Andy Lutomirski <luto@amacapital.net>
>>>>> Cc: Will Drewry <wad@chromium.org>
>>>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>>>> Cc: Dave Airlie <airlied@gmail.com>
>>>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>>>> Cc: Lucas Stach <l.stach@pengutronix.de>
>>>>> ---
>>>>>    init/Kconfig                                  | 11 +++++++++++
>>>>>    kernel/Makefile                               |  2 +-
>>>>>    tools/testing/selftests/seccomp/seccomp_bpf.c |  2 +-
>>>>>    3 files changed, 13 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/init/Kconfig b/init/Kconfig
>>>>> index b77c60f8b963..f62fca13ac5b 100644
>>>>> --- a/init/Kconfig
>>>>> +++ b/init/Kconfig
>>>>> @@ -1194,6 +1194,7 @@ endif # NAMESPACES
>>>>>    config CHECKPOINT_RESTORE
>>>>>         bool "Checkpoint/restore support"
>>>>>         select PROC_CHILDREN
>>>>> +     select KCMP
>>>>>         default n
>>>>>         help
>>>>>           Enables additional kernel features in a sake of checkpoint/restore.
>>>>> @@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS
>>>>>    config ARCH_HAS_MEMBARRIER_SYNC_CORE
>>>>>         bool
>>>>>
>>>>> +config KCMP
>>>>> +     bool "Enable kcmp() system call" if EXPERT
>>>>> +     default y
>>>>
>>>> I would expect this to be not default-y, especially if
>>>> CHECKPOINT_RESTORE does a "select" on it.
>>>>
>>>> This is a really powerful syscall, but it is bounded by ptrace access
>>>> controls, and uses pointer address obfuscation, so it may be okay to
>>>> expose this. As it is, at least Ubuntu already has
>>>> CONFIG_CHECKPOINT_RESTORE, so really, there's probably not much
>>>> difference on exposure.
>>>>
>>>> So, if you drop the "default y", I'm fine with this.
>>>
>>> It was maybe stupid, but our userspace started relying on fd
>>> comaprison through sys_kcomp. So for better or worse, if you want to
>>> run the mesa3d gl/vk stacks, you need this.
>>
>> That's overstating things somewhat. The vast majority of applications
>> will work fine regardless (as they did before Mesa started using this
>> functionality). Only some special ones will run into issues, because the
>> user-space drivers incorrectly assume two file descriptors reference
>> different descriptions.
>>
>>
>>> Was maybe not the brighest ideas, but since enough distros had this
>>> enabled by defaults,
>>
>> Right, that (and the above) is why I considered it fair game to use.
>> What should I have done instead? (TBH I was surprised that this
>> functionality isn't generally available)
> 
> Yeah that one is fine, but I thought we've discussed (irc or
> something) more uses for de-duping dma-buf and stuff like that. But
> quick grep says that hasn't landed yet, so I got a bit confused (or
> just dreamt). Looking at this again I'm kinda surprised the drmfd
> de-duping blows up on normal linux distros, but I guess it can all
> happen.

One example: GEM handle name-spaces are per file description. If 
user-space incorrectly assumes two DRM fds are independent, when they 
actually reference the same file description, closing a GEM handle with 
one file descriptor will make it unusable with the other file descriptor 
as well.


>>> Ofc we can leave the default n, but the select if CONFIG_DRM is
>>> unfortunately needed I think.
>>
>> Per above, not sure this is really true.
> 
> We seem to be going boom on linux distros now, maybe userspace got
> more creative in abusing stuff?

I don't know what you're referring to. I've only seen maybe two or three 
reports from people who didn't enable CHECKPOINT_RESTORE in their 
self-built kernels.


> The entire thing is small enough that imo we don't really have to care,
> e.g. we also unconditionally select dma-buf, despite that on most
> systems there's only 1 gpu, and you're never going to end up with a
> buffer sharing case that needs any of that code (aside from the
> "here's an fd" part).
> 
> But I guess we can limit to just KCMP_FILE like you suggest in another
> reply. Just feels a bit like overkill.

Making KCMP_FILE gated by DRM makes as little sense to me as by 
CHECKPOINT_RESTORE.


-- 
Earthling Michel Dänzer               |               https://redhat.com
Libre software enthusiast             |             Mesa and X developer

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3] kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE
  2021-02-05 22:00 ` [PATCH v3] kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE Chris Wilson
  2021-02-06 12:47   ` Cyrill Gorcunov
@ 2021-02-08 22:12   ` Kees Cook
  2021-02-16  9:02     ` Daniel Vetter
  2021-02-12 12:57   ` Emil Velikov
  2021-02-15  8:56   ` Thomas Zimmermann
  3 siblings, 1 reply; 28+ messages in thread
From: Kees Cook @ 2021-02-08 22:12 UTC (permalink / raw)
  To: Chris Wilson
  Cc: linux-kernel, dri-devel, intel-gfx, Andy Lutomirski, Will Drewry,
	Andrew Morton, Dave Airlie, Daniel Vetter, Lucas Stach,
	Rasmus Villemoes, Cyrill Gorcunov, stable, Daniel Vetter

On Fri, Feb 05, 2021 at 10:00:12PM +0000, Chris Wilson wrote:
> Userspace has discovered the functionality offered by SYS_kcmp and has
> started to depend upon it. In particular, Mesa uses SYS_kcmp for
> os_same_file_description() in order to identify when two fd (e.g. device
> or dmabuf) point to the same struct file. Since they depend on it for
> core functionality, lift SYS_kcmp out of the non-default
> CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.
> 
> Rasmus Villemoes also pointed out that systemd uses SYS_kcmp to
> deduplicate the per-service file descriptor store.
> 
> Note that some distributions such as Ubuntu are already enabling
> CHECKPOINT_RESTORE in their configs and so, by extension, SYS_kcmp.
> 
> References: https://gitlab.freedesktop.org/drm/intel/-/issues/3046
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Thanks!

Reviewed-by: Kees Cook <keescook@chromium.org>

-Kees

> Cc: Kees Cook <keescook@chromium.org>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Will Drewry <wad@chromium.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Dave Airlie <airlied@gmail.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Lucas Stach <l.stach@pengutronix.de>
> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> Cc: Cyrill Gorcunov <gorcunov@gmail.com>
> Cc: stable@vger.kernel.org
> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> # DRM depends on kcmp
> Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> # systemd uses kcmp
> 
> ---
> v2:
>   - Default n.
>   - Borrrow help message from man kcmp.
>   - Export get_epoll_tfile_raw_ptr() for CONFIG_KCMP
> v3:
>   - Select KCMP for CONFIG_DRM
> ---
>  drivers/gpu/drm/Kconfig                       |  3 +++
>  fs/eventpoll.c                                |  4 ++--
>  include/linux/eventpoll.h                     |  2 +-
>  init/Kconfig                                  | 11 +++++++++++
>  kernel/Makefile                               |  2 +-
>  tools/testing/selftests/seccomp/seccomp_bpf.c |  2 +-
>  6 files changed, 19 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 0973f408d75f..af6c6d214d91 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -15,6 +15,9 @@ menuconfig DRM
>  	select I2C_ALGOBIT
>  	select DMA_SHARED_BUFFER
>  	select SYNC_FILE
> +# gallium uses SYS_kcmp for os_same_file_description() to de-duplicate
> +# device and dmabuf fd. Let's make sure that is available for our userspace.
> +	select KCMP
>  	help
>  	  Kernel-level support for the Direct Rendering Infrastructure (DRI)
>  	  introduced in XFree86 4.0. If you say Y here, you need to select
> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> index a829af074eb5..3196474cbe24 100644
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -979,7 +979,7 @@ static struct epitem *ep_find(struct eventpoll *ep, struct file *file, int fd)
>  	return epir;
>  }
>  
> -#ifdef CONFIG_CHECKPOINT_RESTORE
> +#ifdef CONFIG_KCMP
>  static struct epitem *ep_find_tfd(struct eventpoll *ep, int tfd, unsigned long toff)
>  {
>  	struct rb_node *rbp;
> @@ -1021,7 +1021,7 @@ struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd,
>  
>  	return file_raw;
>  }
> -#endif /* CONFIG_CHECKPOINT_RESTORE */
> +#endif /* CONFIG_KCMP */
>  
>  /**
>   * Adds a new entry to the tail of the list in a lockless way, i.e.
> diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
> index 0350393465d4..593322c946e6 100644
> --- a/include/linux/eventpoll.h
> +++ b/include/linux/eventpoll.h
> @@ -18,7 +18,7 @@ struct file;
>  
>  #ifdef CONFIG_EPOLL
>  
> -#ifdef CONFIG_CHECKPOINT_RESTORE
> +#ifdef CONFIG_KCMP
>  struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd, unsigned long toff);
>  #endif
>  
> diff --git a/init/Kconfig b/init/Kconfig
> index b77c60f8b963..9cc7436b2f73 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1194,6 +1194,7 @@ endif # NAMESPACES
>  config CHECKPOINT_RESTORE
>  	bool "Checkpoint/restore support"
>  	select PROC_CHILDREN
> +	select KCMP
>  	default n
>  	help
>  	  Enables additional kernel features in a sake of checkpoint/restore.
> @@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS
>  config ARCH_HAS_MEMBARRIER_SYNC_CORE
>  	bool
>  
> +config KCMP
> +	bool "Enable kcmp() system call" if EXPERT
> +	help
> +	  Enable the kernel resource comparison system call. It provides
> +	  user-space with the ability to compare two processes to see if they
> +	  share a common resource, such as a file descriptor or even virtual
> +	  memory space.
> +
> +	  If unsure, say N.
> +
>  config RSEQ
>  	bool "Enable rseq() system call" if EXPERT
>  	default y
> diff --git a/kernel/Makefile b/kernel/Makefile
> index aa7368c7eabf..320f1f3941b7 100644
> --- a/kernel/Makefile
> +++ b/kernel/Makefile
> @@ -51,7 +51,7 @@ obj-y += livepatch/
>  obj-y += dma/
>  obj-y += entry/
>  
> -obj-$(CONFIG_CHECKPOINT_RESTORE) += kcmp.o
> +obj-$(CONFIG_KCMP) += kcmp.o
>  obj-$(CONFIG_FREEZER) += freezer.o
>  obj-$(CONFIG_PROFILING) += profile.o
>  obj-$(CONFIG_STACKTRACE) += stacktrace.o
> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index 26c72f2b61b1..1b6c7d33c4ff 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -315,7 +315,7 @@ TEST(kcmp)
>  	ret = __filecmp(getpid(), getpid(), 1, 1);
>  	EXPECT_EQ(ret, 0);
>  	if (ret != 0 && errno == ENOSYS)
> -		SKIP(return, "Kernel does not support kcmp() (missing CONFIG_CHECKPOINT_RESTORE?)");
> +		SKIP(return, "Kernel does not support kcmp() (missing CONFIG_KCMP?)");
>  }
>  
>  TEST(mode_strict_support)
> -- 
> 2.20.1
> 

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3] kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE
  2021-02-05 22:00 ` [PATCH v3] kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE Chris Wilson
  2021-02-06 12:47   ` Cyrill Gorcunov
  2021-02-08 22:12   ` Kees Cook
@ 2021-02-12 12:57   ` Emil Velikov
  2021-02-12 13:14     ` Simon Ser
  2021-02-12 14:01     ` Michel Dänzer
  2021-02-15  8:56   ` Thomas Zimmermann
  3 siblings, 2 replies; 28+ messages in thread
From: Emil Velikov @ 2021-02-12 12:57 UTC (permalink / raw)
  To: Chris Wilson
  Cc: Linux-Kernel@Vger. Kernel. Org, Will Drewry, Kees Cook,
	Daniel Vetter, Intel Graphics Development, Rasmus Villemoes,
	ML dri-devel, Andy Lutomirski, Cyrill Gorcunov, # 3.13+,
	Andrew Morton

On Fri, 5 Feb 2021 at 22:01, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Userspace has discovered the functionality offered by SYS_kcmp and has
> started to depend upon it. In particular, Mesa uses SYS_kcmp for
> os_same_file_description() in order to identify when two fd (e.g. device
> or dmabuf)

As you rightfully point out, SYS_kcmp is a bit of a two edged sword.
While you mention the CONFIG issue, there is also a portability aspect
(mesa runs on more than just linux) and as well as sandbox filtering
of the extra syscall.

Last time I looked, the latter was still an issue and mesa was using
SYS_kcmp to compare device node fds.
A far shorter and more portable solution is possible, so let me
prepare a Mesa patch.

-Emil

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3] kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE
  2021-02-12 12:57   ` Emil Velikov
@ 2021-02-12 13:14     ` Simon Ser
  2021-02-12 14:07       ` Emil Velikov
  2021-02-12 14:01     ` Michel Dänzer
  1 sibling, 1 reply; 28+ messages in thread
From: Simon Ser @ 2021-02-12 13:14 UTC (permalink / raw)
  To: Emil Velikov
  Cc: Chris Wilson, Will Drewry, Kees Cook, Daniel Vetter,
	Intel Graphics Development, Rasmus Villemoes,
	Linux-Kernel@Vger. Kernel. Org, ML dri-devel, Andy Lutomirski,
	Cyrill Gorcunov, # 3.13+,
	Andrew Morton

On Friday, February 12th, 2021 at 1:57 PM, Emil Velikov <emil.l.velikov@gmail.com> wrote:

> On Fri, 5 Feb 2021 at 22:01, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> >
> > Userspace has discovered the functionality offered by SYS_kcmp and has
> > started to depend upon it. In particular, Mesa uses SYS_kcmp for
> > os_same_file_description() in order to identify when two fd (e.g. device
> > or dmabuf)
>
> As you rightfully point out, SYS_kcmp is a bit of a two edged sword.
> While you mention the CONFIG issue, there is also a portability aspect
> (mesa runs on more than just linux) and as well as sandbox filtering
> of the extra syscall.
>
> Last time I looked, the latter was still an issue and mesa was using
> SYS_kcmp to compare device node fds.
> A far shorter and more portable solution is possible, so let me
> prepare a Mesa patch.

Comparing two DMA-BUFs can be done with their inode number, I think.

Comparing two device FDs is more subtle, because of GEM handle
ref'counting. You sometimes really want to check whether two FDs are
backed by the same file *description*. See [1] for details.

[1]: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/110

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3] kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE
  2021-02-12 12:57   ` Emil Velikov
  2021-02-12 13:14     ` Simon Ser
@ 2021-02-12 14:01     ` Michel Dänzer
  2021-02-12 14:09       ` Emil Velikov
  1 sibling, 1 reply; 28+ messages in thread
From: Michel Dänzer @ 2021-02-12 14:01 UTC (permalink / raw)
  To: Emil Velikov, Chris Wilson
  Cc: Will Drewry, Kees Cook, Daniel Vetter,
	Intel Graphics Development, Rasmus Villemoes,
	Linux-Kernel@Vger. Kernel. Org, ML dri-devel, Andy Lutomirski,
	Cyrill Gorcunov, # 3.13+,
	Andrew Morton

On 2021-02-12 1:57 p.m., Emil Velikov wrote:
> On Fri, 5 Feb 2021 at 22:01, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>>
>> Userspace has discovered the functionality offered by SYS_kcmp and has
>> started to depend upon it. In particular, Mesa uses SYS_kcmp for
>> os_same_file_description() in order to identify when two fd (e.g. device
>> or dmabuf)
> 
> As you rightfully point out, SYS_kcmp is a bit of a two edged sword.
> While you mention the CONFIG issue, there is also a portability aspect
> (mesa runs on more than just linux) and as well as sandbox filtering
> of the extra syscall.
> 
> Last time I looked, the latter was still an issue and mesa was using
> SYS_kcmp to compare device node fds.
> A far shorter and more portable solution is possible, so let me
> prepare a Mesa patch.

Make sure to read my comments on 
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6881 first. :)


-- 
Earthling Michel Dänzer               |               https://redhat.com
Libre software enthusiast             |             Mesa and X developer

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3] kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE
  2021-02-12 13:14     ` Simon Ser
@ 2021-02-12 14:07       ` Emil Velikov
  0 siblings, 0 replies; 28+ messages in thread
From: Emil Velikov @ 2021-02-12 14:07 UTC (permalink / raw)
  To: Simon Ser
  Cc: Chris Wilson, Will Drewry, Kees Cook, Daniel Vetter,
	Intel Graphics Development, Rasmus Villemoes,
	Linux-Kernel@Vger. Kernel. Org, ML dri-devel, Andy Lutomirski,
	Cyrill Gorcunov, # 3.13+,
	Andrew Morton

On Fri, 12 Feb 2021 at 13:14, Simon Ser <contact@emersion.fr> wrote:
>
> On Friday, February 12th, 2021 at 1:57 PM, Emil Velikov <emil.l.velikov@gmail.com> wrote:
>
> > On Fri, 5 Feb 2021 at 22:01, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > >
> > > Userspace has discovered the functionality offered by SYS_kcmp and has
> > > started to depend upon it. In particular, Mesa uses SYS_kcmp for
> > > os_same_file_description() in order to identify when two fd (e.g. device
> > > or dmabuf)
> >
> > As you rightfully point out, SYS_kcmp is a bit of a two edged sword.
> > While you mention the CONFIG issue, there is also a portability aspect
> > (mesa runs on more than just linux) and as well as sandbox filtering
> > of the extra syscall.
> >
> > Last time I looked, the latter was still an issue and mesa was using
> > SYS_kcmp to compare device node fds.
> > A far shorter and more portable solution is possible, so let me
> > prepare a Mesa patch.
>
> Comparing two DMA-BUFs can be done with their inode number, I think.
>
> Comparing two device FDs is more subtle, because of GEM handle
> ref'counting. You sometimes really want to check whether two FDs are
> backed by the same file *description*. See [1] for details.
>
Thanks for the correction and the reference.
Seems like I've short circuited file description table vs file descriptor.

Emil

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3] kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE
  2021-02-12 14:01     ` Michel Dänzer
@ 2021-02-12 14:09       ` Emil Velikov
  0 siblings, 0 replies; 28+ messages in thread
From: Emil Velikov @ 2021-02-12 14:09 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: Chris Wilson, Will Drewry, Kees Cook, Daniel Vetter,
	Intel Graphics Development, Rasmus Villemoes,
	Linux-Kernel@Vger. Kernel. Org, ML dri-devel, Andy Lutomirski,
	Cyrill Gorcunov, # 3.13+,
	Andrew Morton

On Fri, 12 Feb 2021 at 14:01, Michel Dänzer <michel@daenzer.net> wrote:
>
> On 2021-02-12 1:57 p.m., Emil Velikov wrote:
> > On Fri, 5 Feb 2021 at 22:01, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> >>
> >> Userspace has discovered the functionality offered by SYS_kcmp and has
> >> started to depend upon it. In particular, Mesa uses SYS_kcmp for
> >> os_same_file_description() in order to identify when two fd (e.g. device
> >> or dmabuf)
> >
> > As you rightfully point out, SYS_kcmp is a bit of a two edged sword.
> > While you mention the CONFIG issue, there is also a portability aspect
> > (mesa runs on more than just linux) and as well as sandbox filtering
> > of the extra syscall.
> >
> > Last time I looked, the latter was still an issue and mesa was using
> > SYS_kcmp to compare device node fds.
> > A far shorter and more portable solution is possible, so let me
> > prepare a Mesa patch.
>
> Make sure to read my comments on
> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6881 first. :)
>
Much appreciated. I might have been "slightly" off - pardon for the noise o/

-Emil

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] kernel: Expose SYS_kcmp by default
  2021-02-05 16:37 [PATCH] kernel: Expose SYS_kcmp by default Chris Wilson
                   ` (5 preceding siblings ...)
  2021-02-05 22:00 ` [PATCH v3] kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE Chris Wilson
@ 2021-02-13 17:40 ` Pavel Machek
  2021-02-15 10:23   ` Lucas Stach
  6 siblings, 1 reply; 28+ messages in thread
From: Pavel Machek @ 2021-02-13 17:40 UTC (permalink / raw)
  To: Chris Wilson
  Cc: linux-kernel, dri-devel, intel-gfx, Kees Cook, Andy Lutomirski,
	Will Drewry, Andrew Morton, Dave Airlie, Daniel Vetter,
	Lucas Stach

[-- Attachment #1: Type: text/plain, Size: 630 bytes --]

Hi!

> Userspace has discovered the functionality offered by SYS_kcmp and has
> started to depend upon it. In particular, Mesa uses SYS_kcmp for
> os_same_file_description() in order to identify when two fd (e.g. device
> or dmabuf) point to the same struct file. Since they depend on it for
> core functionality, lift SYS_kcmp out of the non-default
> CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.

Is it good idea to enable everything because Mesa uses it for file
descriptors?

This is really interesting syscall...

Best regards,
								Pavel

-- 
http://www.livejournal.com/~pavelmachek

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3] kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE
  2021-02-05 22:00 ` [PATCH v3] kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE Chris Wilson
                     ` (2 preceding siblings ...)
  2021-02-12 12:57   ` Emil Velikov
@ 2021-02-15  8:56   ` Thomas Zimmermann
  3 siblings, 0 replies; 28+ messages in thread
From: Thomas Zimmermann @ 2021-02-15  8:56 UTC (permalink / raw)
  To: Chris Wilson, linux-kernel
  Cc: Will Drewry, Kees Cook, Daniel Vetter, intel-gfx,
	Rasmus Villemoes, dri-devel, Andy Lutomirski, Cyrill Gorcunov,
	stable, Andrew Morton


[-- Attachment #1.1: Type: text/plain, Size: 6191 bytes --]

Hi

Am 05.02.21 um 23:00 schrieb Chris Wilson:
> Userspace has discovered the functionality offered by SYS_kcmp and has
> started to depend upon it. In particular, Mesa uses SYS_kcmp for
> os_same_file_description() in order to identify when two fd (e.g. device
> or dmabuf) point to the same struct file. Since they depend on it for
> core functionality, lift SYS_kcmp out of the non-default
> CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.
> 
> Rasmus Villemoes also pointed out that systemd uses SYS_kcmp to
> deduplicate the per-service file descriptor store.

This helps a lot with transactional programming in userspace system 
code. So FWIW

Acked-by: Thomas Zimmermann <tzimmermann@suse.de>

> 
> Note that some distributions such as Ubuntu are already enabling
> CHECKPOINT_RESTORE in their configs and so, by extension, SYS_kcmp.
> 
> References: https://gitlab.freedesktop.org/drm/intel/-/issues/3046
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Will Drewry <wad@chromium.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Dave Airlie <airlied@gmail.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Lucas Stach <l.stach@pengutronix.de>
> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> Cc: Cyrill Gorcunov <gorcunov@gmail.com>
> Cc: stable@vger.kernel.org
> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> # DRM depends on kcmp
> Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> # systemd uses kcmp
> 
> ---
> v2:
>    - Default n.
>    - Borrrow help message from man kcmp.
>    - Export get_epoll_tfile_raw_ptr() for CONFIG_KCMP
> v3:
>    - Select KCMP for CONFIG_DRM
> ---
>   drivers/gpu/drm/Kconfig                       |  3 +++
>   fs/eventpoll.c                                |  4 ++--
>   include/linux/eventpoll.h                     |  2 +-
>   init/Kconfig                                  | 11 +++++++++++
>   kernel/Makefile                               |  2 +-
>   tools/testing/selftests/seccomp/seccomp_bpf.c |  2 +-
>   6 files changed, 19 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 0973f408d75f..af6c6d214d91 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -15,6 +15,9 @@ menuconfig DRM
>   	select I2C_ALGOBIT
>   	select DMA_SHARED_BUFFER
>   	select SYNC_FILE
> +# gallium uses SYS_kcmp for os_same_file_description() to de-duplicate
> +# device and dmabuf fd. Let's make sure that is available for our userspace.
> +	select KCMP
>   	help
>   	  Kernel-level support for the Direct Rendering Infrastructure (DRI)
>   	  introduced in XFree86 4.0. If you say Y here, you need to select
> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> index a829af074eb5..3196474cbe24 100644
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -979,7 +979,7 @@ static struct epitem *ep_find(struct eventpoll *ep, struct file *file, int fd)
>   	return epir;
>   }
>   
> -#ifdef CONFIG_CHECKPOINT_RESTORE
> +#ifdef CONFIG_KCMP
>   static struct epitem *ep_find_tfd(struct eventpoll *ep, int tfd, unsigned long toff)
>   {
>   	struct rb_node *rbp;
> @@ -1021,7 +1021,7 @@ struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd,
>   
>   	return file_raw;
>   }
> -#endif /* CONFIG_CHECKPOINT_RESTORE */
> +#endif /* CONFIG_KCMP */
>   
>   /**
>    * Adds a new entry to the tail of the list in a lockless way, i.e.
> diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
> index 0350393465d4..593322c946e6 100644
> --- a/include/linux/eventpoll.h
> +++ b/include/linux/eventpoll.h
> @@ -18,7 +18,7 @@ struct file;
>   
>   #ifdef CONFIG_EPOLL
>   
> -#ifdef CONFIG_CHECKPOINT_RESTORE
> +#ifdef CONFIG_KCMP
>   struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd, unsigned long toff);
>   #endif
>   
> diff --git a/init/Kconfig b/init/Kconfig
> index b77c60f8b963..9cc7436b2f73 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1194,6 +1194,7 @@ endif # NAMESPACES
>   config CHECKPOINT_RESTORE
>   	bool "Checkpoint/restore support"
>   	select PROC_CHILDREN
> +	select KCMP
>   	default n
>   	help
>   	  Enables additional kernel features in a sake of checkpoint/restore.
> @@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS
>   config ARCH_HAS_MEMBARRIER_SYNC_CORE
>   	bool
>   
> +config KCMP
> +	bool "Enable kcmp() system call" if EXPERT
> +	help
> +	  Enable the kernel resource comparison system call. It provides
> +	  user-space with the ability to compare two processes to see if they
> +	  share a common resource, such as a file descriptor or even virtual
> +	  memory space.
> +
> +	  If unsure, say N.
> +
>   config RSEQ
>   	bool "Enable rseq() system call" if EXPERT
>   	default y
> diff --git a/kernel/Makefile b/kernel/Makefile
> index aa7368c7eabf..320f1f3941b7 100644
> --- a/kernel/Makefile
> +++ b/kernel/Makefile
> @@ -51,7 +51,7 @@ obj-y += livepatch/
>   obj-y += dma/
>   obj-y += entry/
>   
> -obj-$(CONFIG_CHECKPOINT_RESTORE) += kcmp.o
> +obj-$(CONFIG_KCMP) += kcmp.o
>   obj-$(CONFIG_FREEZER) += freezer.o
>   obj-$(CONFIG_PROFILING) += profile.o
>   obj-$(CONFIG_STACKTRACE) += stacktrace.o
> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index 26c72f2b61b1..1b6c7d33c4ff 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -315,7 +315,7 @@ TEST(kcmp)
>   	ret = __filecmp(getpid(), getpid(), 1, 1);
>   	EXPECT_EQ(ret, 0);
>   	if (ret != 0 && errno == ENOSYS)
> -		SKIP(return, "Kernel does not support kcmp() (missing CONFIG_CHECKPOINT_RESTORE?)");
> +		SKIP(return, "Kernel does not support kcmp() (missing CONFIG_KCMP?)");
>   }
>   
>   TEST(mode_strict_support)
> 

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] kernel: Expose SYS_kcmp by default
  2021-02-13 17:40 ` [PATCH] kernel: Expose SYS_kcmp by default Pavel Machek
@ 2021-02-15 10:23   ` Lucas Stach
  0 siblings, 0 replies; 28+ messages in thread
From: Lucas Stach @ 2021-02-15 10:23 UTC (permalink / raw)
  To: Pavel Machek, Chris Wilson
  Cc: linux-kernel, dri-devel, intel-gfx, Kees Cook, Andy Lutomirski,
	Will Drewry, Andrew Morton, Dave Airlie, Daniel Vetter

Am Samstag, dem 13.02.2021 um 18:40 +0100 schrieb Pavel Machek:
> Hi!
> 
> > Userspace has discovered the functionality offered by SYS_kcmp and has
> > started to depend upon it. In particular, Mesa uses SYS_kcmp for
> > os_same_file_description() in order to identify when two fd (e.g. device
> > or dmabuf) point to the same struct file. Since they depend on it for
> > core functionality, lift SYS_kcmp out of the non-default
> > CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.
> 
> Is it good idea to enable everything because Mesa uses it for file
> descriptors?
> 
> This is really interesting syscall...

As Debian/Ubuntu and Fedora are already shipping with
CONFIG_CHECKPOINT_RESTORE=y in their kernel configs, I don't really see
the need to add further restrictions here. Or this discussion should
have happened a while ago...

Regards,
Lucas


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3] kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE
  2021-02-08 22:12   ` Kees Cook
@ 2021-02-16  9:02     ` Daniel Vetter
  0 siblings, 0 replies; 28+ messages in thread
From: Daniel Vetter @ 2021-02-16  9:02 UTC (permalink / raw)
  To: Kees Cook
  Cc: Chris Wilson, linux-kernel, dri-devel, intel-gfx,
	Andy Lutomirski, Will Drewry, Andrew Morton, Dave Airlie,
	Daniel Vetter, Lucas Stach, Rasmus Villemoes, Cyrill Gorcunov,
	stable, Daniel Vetter

On Mon, Feb 08, 2021 at 02:12:00PM -0800, Kees Cook wrote:
> On Fri, Feb 05, 2021 at 10:00:12PM +0000, Chris Wilson wrote:
> > Userspace has discovered the functionality offered by SYS_kcmp and has
> > started to depend upon it. In particular, Mesa uses SYS_kcmp for
> > os_same_file_description() in order to identify when two fd (e.g. device
> > or dmabuf) point to the same struct file. Since they depend on it for
> > core functionality, lift SYS_kcmp out of the non-default
> > CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.
> > 
> > Rasmus Villemoes also pointed out that systemd uses SYS_kcmp to
> > deduplicate the per-service file descriptor store.
> > 
> > Note that some distributions such as Ubuntu are already enabling
> > CHECKPOINT_RESTORE in their configs and so, by extension, SYS_kcmp.
> > 
> > References: https://gitlab.freedesktop.org/drm/intel/-/issues/3046
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Thanks!
> 
> Reviewed-by: Kees Cook <keescook@chromium.org>

Thanks for reviews&patch, I stuffed it into a topic branch and plan to
send it to Linus later this week.

Cheers, Daniel

> 
> -Kees
> 
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Andy Lutomirski <luto@amacapital.net>
> > Cc: Will Drewry <wad@chromium.org>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Dave Airlie <airlied@gmail.com>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > Cc: Lucas Stach <l.stach@pengutronix.de>
> > Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> > Cc: Cyrill Gorcunov <gorcunov@gmail.com>
> > Cc: stable@vger.kernel.org
> > Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> # DRM depends on kcmp
> > Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> # systemd uses kcmp
> > 
> > ---
> > v2:
> >   - Default n.
> >   - Borrrow help message from man kcmp.
> >   - Export get_epoll_tfile_raw_ptr() for CONFIG_KCMP
> > v3:
> >   - Select KCMP for CONFIG_DRM
> > ---
> >  drivers/gpu/drm/Kconfig                       |  3 +++
> >  fs/eventpoll.c                                |  4 ++--
> >  include/linux/eventpoll.h                     |  2 +-
> >  init/Kconfig                                  | 11 +++++++++++
> >  kernel/Makefile                               |  2 +-
> >  tools/testing/selftests/seccomp/seccomp_bpf.c |  2 +-
> >  6 files changed, 19 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> > index 0973f408d75f..af6c6d214d91 100644
> > --- a/drivers/gpu/drm/Kconfig
> > +++ b/drivers/gpu/drm/Kconfig
> > @@ -15,6 +15,9 @@ menuconfig DRM
> >  	select I2C_ALGOBIT
> >  	select DMA_SHARED_BUFFER
> >  	select SYNC_FILE
> > +# gallium uses SYS_kcmp for os_same_file_description() to de-duplicate
> > +# device and dmabuf fd. Let's make sure that is available for our userspace.
> > +	select KCMP
> >  	help
> >  	  Kernel-level support for the Direct Rendering Infrastructure (DRI)
> >  	  introduced in XFree86 4.0. If you say Y here, you need to select
> > diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> > index a829af074eb5..3196474cbe24 100644
> > --- a/fs/eventpoll.c
> > +++ b/fs/eventpoll.c
> > @@ -979,7 +979,7 @@ static struct epitem *ep_find(struct eventpoll *ep, struct file *file, int fd)
> >  	return epir;
> >  }
> >  
> > -#ifdef CONFIG_CHECKPOINT_RESTORE
> > +#ifdef CONFIG_KCMP
> >  static struct epitem *ep_find_tfd(struct eventpoll *ep, int tfd, unsigned long toff)
> >  {
> >  	struct rb_node *rbp;
> > @@ -1021,7 +1021,7 @@ struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd,
> >  
> >  	return file_raw;
> >  }
> > -#endif /* CONFIG_CHECKPOINT_RESTORE */
> > +#endif /* CONFIG_KCMP */
> >  
> >  /**
> >   * Adds a new entry to the tail of the list in a lockless way, i.e.
> > diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
> > index 0350393465d4..593322c946e6 100644
> > --- a/include/linux/eventpoll.h
> > +++ b/include/linux/eventpoll.h
> > @@ -18,7 +18,7 @@ struct file;
> >  
> >  #ifdef CONFIG_EPOLL
> >  
> > -#ifdef CONFIG_CHECKPOINT_RESTORE
> > +#ifdef CONFIG_KCMP
> >  struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd, unsigned long toff);
> >  #endif
> >  
> > diff --git a/init/Kconfig b/init/Kconfig
> > index b77c60f8b963..9cc7436b2f73 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -1194,6 +1194,7 @@ endif # NAMESPACES
> >  config CHECKPOINT_RESTORE
> >  	bool "Checkpoint/restore support"
> >  	select PROC_CHILDREN
> > +	select KCMP
> >  	default n
> >  	help
> >  	  Enables additional kernel features in a sake of checkpoint/restore.
> > @@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS
> >  config ARCH_HAS_MEMBARRIER_SYNC_CORE
> >  	bool
> >  
> > +config KCMP
> > +	bool "Enable kcmp() system call" if EXPERT
> > +	help
> > +	  Enable the kernel resource comparison system call. It provides
> > +	  user-space with the ability to compare two processes to see if they
> > +	  share a common resource, such as a file descriptor or even virtual
> > +	  memory space.
> > +
> > +	  If unsure, say N.
> > +
> >  config RSEQ
> >  	bool "Enable rseq() system call" if EXPERT
> >  	default y
> > diff --git a/kernel/Makefile b/kernel/Makefile
> > index aa7368c7eabf..320f1f3941b7 100644
> > --- a/kernel/Makefile
> > +++ b/kernel/Makefile
> > @@ -51,7 +51,7 @@ obj-y += livepatch/
> >  obj-y += dma/
> >  obj-y += entry/
> >  
> > -obj-$(CONFIG_CHECKPOINT_RESTORE) += kcmp.o
> > +obj-$(CONFIG_KCMP) += kcmp.o
> >  obj-$(CONFIG_FREEZER) += freezer.o
> >  obj-$(CONFIG_PROFILING) += profile.o
> >  obj-$(CONFIG_STACKTRACE) += stacktrace.o
> > diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > index 26c72f2b61b1..1b6c7d33c4ff 100644
> > --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> > +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > @@ -315,7 +315,7 @@ TEST(kcmp)
> >  	ret = __filecmp(getpid(), getpid(), 1, 1);
> >  	EXPECT_EQ(ret, 0);
> >  	if (ret != 0 && errno == ENOSYS)
> > -		SKIP(return, "Kernel does not support kcmp() (missing CONFIG_CHECKPOINT_RESTORE?)");
> > +		SKIP(return, "Kernel does not support kcmp() (missing CONFIG_KCMP?)");
> >  }
> >  
> >  TEST(mode_strict_support)
> > -- 
> > 2.20.1
> > 
> 
> -- 
> Kees Cook

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2021-02-16  9:05 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-05 16:37 [PATCH] kernel: Expose SYS_kcmp by default Chris Wilson
2021-02-05 17:02 ` Lucas Stach
2021-02-05 18:37 ` Kees Cook
2021-02-05 20:53   ` Daniel Vetter
2021-02-08 11:49     ` Michel Dänzer
2021-02-08 13:11       ` Michel Dänzer
2021-02-08 13:34       ` Daniel Vetter
2021-02-08 13:49         ` Michel Dänzer
2021-02-05 19:25 ` kernel test robot
2021-02-05 19:46 ` kernel test robot
2021-02-05 21:06 ` [PATCH v2] " Chris Wilson
2021-02-05 21:16   ` Chris Wilson
2021-02-05 21:20     ` Kees Cook
2021-02-05 21:28       ` Chris Wilson
2021-02-05 21:48         ` Daniel Vetter
2021-02-05 21:47   ` Rasmus Villemoes
2021-02-05 22:00 ` [PATCH v3] kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE Chris Wilson
2021-02-06 12:47   ` Cyrill Gorcunov
2021-02-08 22:12   ` Kees Cook
2021-02-16  9:02     ` Daniel Vetter
2021-02-12 12:57   ` Emil Velikov
2021-02-12 13:14     ` Simon Ser
2021-02-12 14:07       ` Emil Velikov
2021-02-12 14:01     ` Michel Dänzer
2021-02-12 14:09       ` Emil Velikov
2021-02-15  8:56   ` Thomas Zimmermann
2021-02-13 17:40 ` [PATCH] kernel: Expose SYS_kcmp by default Pavel Machek
2021-02-15 10:23   ` Lucas Stach

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).