LKML Archive on lore.kernel.org
* [PATCH] syscalls: define and explain goal to not call syscalls in the kernel
@ 2018-03-25 16:25 Dominik Brodowski
  2018-03-30 15:35 ` Jonathan Corbet
From: Dominik Brodowski @ 2018-03-25 16:25 UTC
  To: linux-kernel
  Cc: linux-doc, Jonathan Corbet, viro, x86, torvalds, mingo, tglx, luto

The syscall entry points to the kernel defined by SYSCALL_DEFINEx()
and COMPAT_SYSCALL_DEFINEx() should only be called from userspace
through kernel entry points, but not from the kernel itself. This
will allow cleanups and optimizations to the entry paths *and* to
the parts of the kernel code which currently need to pretend to be
userspace in order to make use of syscalls.

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>

---

As there have been multiple inquiries on the rationale of my patchsets
removing in-kernel calls to sys_xyzzy(), here is an updated patch 01/NN
which I will push upstream for v4.17-rc1. I will also include a reference
to this mail (and therefore to the explanation below) in all related
patches of the series. Any improvements, hints, suggestions, spelling
fixes, and/or objections?

Thanks,
	Dominik


diff --git a/Documentation/process/adding-syscalls.rst b/Documentation/process/adding-syscalls.rst
index 8cc25a06f353..556613744556 100644
--- a/Documentation/process/adding-syscalls.rst
+++ b/Documentation/process/adding-syscalls.rst
@@ -487,6 +487,38 @@ patchset, for the convenience of reviewers.
 The man page should be cc'ed to linux-man@vger.kernel.org
 For more details, see https://www.kernel.org/doc/man-pages/patches.html
 
+
+Do not call System Calls in the Kernel
+--------------------------------------
+
+System calls are, as stated above, interaction points between userspace and
+the kernel.  Therefore, system call functions such as ``sys_xyzzy()`` or
+``compat_sys_xyzzy()`` should only be called from userspace via the syscall
+table, and not from elsewhere in the kernel.  If the syscall functionality is
+needed within the kernel, needs to be shared between an old and a new
+syscall, or needs to be shared between a syscall and its compatibility
+variant, it should be implemented by means of a "helper" function (such as
+``kern_xyzzy()``).  This kernel function may then be called from the
+syscall stub (``sys_xyzzy()``), the compatibility syscall stub
+(``compat_sys_xyzzy()``), and/or other kernel code.
+
+At least on 64-bit x86, it will be a hard requirement from v4.17 onwards to not
+call system call functions in the kernel.  That architecture uses a different
+calling convention for system calls, where ``struct pt_regs`` is decoded
+on-the-fly in a syscall wrapper which then hands processing over to the actual
+syscall function.  This means that only those parameters which are actually
+needed for a specific syscall are passed on during syscall entry, instead of
+filling in six CPU registers with arbitrary user space content all the time
+(which may cause serious trouble down the call chain).
+
+Moreover, rules on how data may be accessed may differ between kernel data and
+user data.  This is another reason why calling ``sys_xyzzy()`` is generally a
+bad idea.
+
+Exceptions to this rule are only allowed in architecture-specific overrides,
+architecture-specific compatibility wrappers, or other code in arch/.
+
+
 References and Sources
 ----------------------
 
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index a78186d826d7..0526286a0314 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -941,4 +941,11 @@ asmlinkage long sys_pkey_free(int pkey);
 asmlinkage long sys_statx(int dfd, const char __user *path, unsigned flags,
 			  unsigned mask, struct statx __user *buffer);
 
+
+/*
+ * Kernel code should not call syscalls (i.e., sys_xyzzy()) directly.
+ * Instead, use one of the functions which work equivalently, such as
+ * the ksys_xyzzy() functions prototyped below.
+ */
+
 #endif
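
As an illustration of the calling convention described in the documentation
hunk above (not part of the patch itself), the following is a minimal
userspace sketch of a wrapper that decodes a register snapshot and passes on
only the arguments the syscall needs. The names struct pt_regs_model,
kern_xyzzy() and sys_xyzzy_wrapper() are hypothetical stand-ins, and the
helper body is a placeholder. The real struct pt_regs and the real wrapper
macros differ in detail.

```c
#include <stdint.h>

/*
 * Hypothetical, simplified model of the six x86-64 syscall argument
 * registers. The real struct pt_regs contains many more fields.
 */
struct pt_regs_model {
	uint64_t di, si, dx, r10, r8, r9;
};

/*
 * Helper with ordinary, typed parameters. Other kernel code would call
 * this function, never the syscall entry point. Placeholder body.
 */
static long kern_xyzzy(int fd, unsigned int flags)
{
	return (long)fd + (long)flags;
}

/*
 * Wrapper in the style described above: it receives only the register
 * snapshot and decodes the two arguments this syscall uses. The other
 * four registers are never read, so stale user space values in them
 * cannot leak into the call chain.
 */
static long sys_xyzzy_wrapper(const struct pt_regs_model *regs)
{
	return kern_xyzzy((int)regs->di, (unsigned int)regs->si);
}
```

In this model, an in-kernel caller that wanted the syscall entry point would
have to build a fake register snapshot first, whereas calling kern_xyzzy()
directly needs no such pretence.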


* Re: [PATCH] syscalls: define and explain goal to not call syscalls in the kernel
  2018-03-25 16:25 [PATCH] syscalls: define and explain goal to not call syscalls in the kernel Dominik Brodowski
@ 2018-03-30 15:35 ` Jonathan Corbet
  2018-03-30 18:31   ` Dominik Brodowski
From: Jonathan Corbet @ 2018-03-30 15:35 UTC
  To: Dominik Brodowski
  Cc: linux-kernel, linux-doc, viro, x86, torvalds, mingo, tglx, luto

On Sun, 25 Mar 2018 18:25:27 +0200
Dominik Brodowski <linux@dominikbrodowski.net> wrote:

> As there have been multiple inquiries on the rationale of my patchsets
> removing in-kernel calls to sys_xyzzy(), here is an updated patch 01/NN
> which I will push upstream for v4.17-rc1. I will also include a reference
> to this mail (and therefore to the explanation below) in all related
> patches of the series. Any improvements, hints, suggestions, spelling
> fixes, and/or objections?

I have no objections to the text, but I do wonder about the placement.
The "adding syscalls" document isn't about *invoking* them; I suspect that
few people will see it there.  The coding-style document isn't quite right
either, but I wonder if it might not be a better place in the short term?

What we may really need is an "assorted rules" document that sits near
coding style; we can put stuff like this text, "volatile considered
harmful", and so on there.

Thanks,

jon


* Re: [PATCH] syscalls: define and explain goal to not call syscalls in the kernel
  2018-03-30 15:35 ` Jonathan Corbet
@ 2018-03-30 18:31   ` Dominik Brodowski
From: Dominik Brodowski @ 2018-03-30 18:31 UTC
  To: Jonathan Corbet
  Cc: linux-kernel, linux-doc, viro, x86, torvalds, mingo, tglx, luto

Jon,

On Fri, Mar 30, 2018 at 09:35:18AM -0600, Jonathan Corbet wrote:
> On Sun, 25 Mar 2018 18:25:27 +0200
> Dominik Brodowski <linux@dominikbrodowski.net> wrote:
> 
> > As there have been multiple inquiries on the rationale of my patchsets
> > removing in-kernel calls to sys_xyzzy(), here is an updated patch 01/NN
> > which I will push upstream for v4.17-rc1. I will also include a reference
> > to this mail (and therefore to the explanation below) in all related
> > patches of the series. Any improvements, hints, suggestions, spelling
> > fixes, and/or objections?
> 
> I have no objections to the text, but I do wonder about the placement.
> The "adding syscalls" document isn't about *invoking* them; I suspect that
> few people will see it there.  The coding-style document isn't quite right
> either, but I wonder if it might not be a better place in the short term?

Well, most of the existing instances where syscalls were called in the
kernel were common codepaths for old and new syscalls or native and compat
syscalls, and syscall multiplexers like sys_ipc() which got replaced or
superseded by many new syscalls. That's what led me to
Documentation/process/adding-syscalls.rst . I'm happy to move this text to
Documentation/process/coding-style.rst (as new section 21?), or even to
Documentation/process/do-not-call-syscalls.rst . Just let me know what you
prefer me to push upstream.

Thanks,
	Dominik
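
The "common codepath for native and compat syscalls" case mentioned above can
be sketched roughly as follows. This is a userspace illustration under stated
assumptions, not code from the patch series: kern_xyzzy(), sys_xyzzy() and
compat_sys_xyzzy() are hypothetical, the split 64-bit offset is just one
plausible reason a compat variant might exist, and the helper body is a
placeholder.

```c
#include <stdint.h>

#define XYZZY_EINVAL 22	/* stand-in for EINVAL, to stay self-contained */

/*
 * Shared helper: holds the actual logic and takes plain typed arguments.
 * Placeholder behaviour: reject an unknown mode, else return the offset
 * within a 4096-byte page.
 */
static long kern_xyzzy(uint64_t offset, int whence)
{
	if (whence < 0 || whence > 2)
		return -XYZZY_EINVAL;
	return (long)(offset % 4096);
}

/* Native entry point: the 64-bit offset arrives in one argument. */
static long sys_xyzzy(uint64_t offset, int whence)
{
	return kern_xyzzy(offset, whence);
}

/*
 * Compat entry point: a 32-bit caller passes the offset as two halves.
 * It reassembles the value and calls the helper, not sys_xyzzy().
 */
static long compat_sys_xyzzy(uint32_t off_lo, uint32_t off_hi, int whence)
{
	uint64_t offset = ((uint64_t)off_hi << 32) | off_lo;

	return kern_xyzzy(offset, whence);
}
```

Both entry points stay thin stubs, so neither needs to call the other and any
other kernel user can call the helper directly.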
