* Formal description of system call interface
@ 2016-11-06 22:39 Dmitry Vyukov
From: Dmitry Vyukov @ 2016-11-06 22:39 UTC (permalink / raw)
  To: linux-api-u79uwXL29TY76Z2rM5mHXA, LKML
  Cc: mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w, Thomas Gleixner,
	Sasha Levin, Mathieu Desnoyers, scientist-b10kYP2dOMg,
	Steven Rostedt, Arnd Bergmann, carlos-H+wXaHxf7aLQT0dZR+AlfA,
	syzkaller, Kostya Serebryany, Mike Frysinger, Dave Jones,
	Tavis Ormandy


These are notes from the discussion we had at Linux Plumbers this week
regarding providing a formal description of system calls (user API).

The idea came up in the context of syzkaller, a syscall fuzzer, which
has descriptions for 1000+ syscalls, mostly concentrating on the types
of arguments and return values. However, a small group of people can't
write descriptions for all syscalls, can't keep them up-to-date, and
doesn't have the necessary domain expertise to write correct
descriptions in some cases.

We identified a surprisingly large number of potential users for such
descriptions:
 - fuzzers (syzkaller, trinity, iknowthis)
 - strace/syscall tracepoints (capturing indirect arguments and
   printing human-readable info)
 - generation of entry points for C libraries (glibc, liblinux
   (raw syscalls), Go runtime, clang/gcc sanitizers)
 - valgrind/sanitizers checking of input/output values of syscalls
 - seccomp filters (minijail, libseccomp) need to know interfaces
   to generate wrappers
 - safety certification (requires syscall specifications)
 - man pages (could provide the actual syscall interface rather than
   the glibc wrapper interface; it was noted that possible errno values
   are an important part here)
 - generation of syscall argument validation in kernel (fast version
   is enabled all the time, extended is optional)

It's worth noting that a number of these users already have some
descriptions that suffer from the same problems of being
incomplete/outdated. See also the linux-api mailing list description,
which lists an overlapping set of cases:

We discussed several implementation approaches:
 - Extracting the interface from kernel code, either by parsing
   sources or by using DWARF. However, the current source doesn't have
   enough info: fds are specified as int, while we need to know the
   exact fd type (e.g. fd_epoll_t); it is not possible to extract the
   flag set for 'int flags'; and we don't know what a 'char*' points to.
 - Making the formal description the master copy and generating
   kernel code from it (structs, flags, syscall entry points).
   This is quite pervasive, but otherwise should work.
 - Doing what syzkaller currently does: providing the description
   on the side. Verifying that the description and the implementation
   match is an important piece here. We can do dynamic checking in
   syscall entry points (print warnings on anything that does not match
   the descriptions), or static checking (but again, kernel code doesn't
   have enough info for checking).

We decided to pursue the last option as the least pervasive for now.
Several locations for the descriptions were proposed: with source code,
include/uapi, Documentation.
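
As a rough illustration of the "description on the side plus dynamic
checking" idea (this was not presented at the discussion; it is a toy
userspace model in Python, and the names DESCRIPTIONS, FdTable and
check_call are made up for this sketch), a check at syscall entry could
compare each argument against the described type and emit warnings:

```python
# Hypothetical side description: the expected type of each argument.
# "fd" means any open descriptor; "fd_<kind>" means a specific kind,
# which is exactly the information a plain C 'int' does not carry.
DESCRIPTIONS = {
    "epoll_ctl": ["fd_epoll", "int", "fd", "ptr"],
    "read": ["fd", "ptr", "len"],
}


class FdTable:
    """Toy model that remembers what kind of object each fd refers to."""

    def __init__(self):
        self.kinds = {}
        self.next_fd = 3  # 0-2 are conventionally stdin/stdout/stderr

    def create(self, kind):
        fd = self.next_fd
        self.next_fd += 1
        self.kinds[fd] = kind
        return fd


def check_call(name, args, fds):
    """Return a list of warnings for arguments that do not match."""
    desc = DESCRIPTIONS.get(name)
    if desc is None:
        return [f"{name}: no description"]
    if len(args) != len(desc):
        return [f"{name}: expected {len(desc)} args, got {len(args)}"]
    warnings = []
    for i, (typ, val) in enumerate(zip(desc, args)):
        if typ == "fd" and val not in fds.kinds:
            warnings.append(f"{name}: arg {i} is not an open fd ({val})")
        elif typ.startswith("fd_") and fds.kinds.get(val) != typ:
            warnings.append(
                f"{name}: arg {i} expects {typ}, got {fds.kinds.get(val)}"
            )
    return warnings


if __name__ == "__main__":
    fds = FdTable()
    ep = fds.create("fd_epoll")
    sock = fds.create("fd_sock")
    # Arguments swapped on purpose: the socket is passed as the epoll fd.
    for w in check_call("epoll_ctl", [sock, 1, ep, 0], fds):
        print("WARNING:", w)
```

A real implementation would live in the kernel's syscall entry path and
would take the expected types from the descriptions rather than from a
hand-written table; the sketch only shows the shape of the check.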

Action points:
 - polish DSL for description (must be extensible)
 - write a parser for DSL
 - provide definition for mm syscalls (mm is reasonably simple
   and self-contained)
 - see if we can do validation of mm arguments

It was acknowledged that whatever we do now will probably change
significantly and evolve over time as we better understand what we
need and what works.

For reference, the current syzkaller descriptions are in txt files here:
The most generic syscalls are here:
Specific subsystems are described in separate files, e.g.:
The descriptions should be self-explanatory, but just in case there
is also a semi-formal DSL specification here:

Taking the opportunity, if you see that something is missing/wrong
in the descriptions of the subsystem you care about, or if it is not
described at all, fixes are welcome.

