From: Cyril Hrubis <chrubis@suse.cz>
To: Spencer Baugh <sbaugh@catern.com>
Cc: linux-api@vger.kernel.org, linux-kernel@vger.kernel.org,
	marcin@juszkiewicz.com.pl, torvalds@linux-foundation.org,
	arnd@arndb.de
Subject: Re: Explicitly defining the userspace API
Date: Thu, 21 Apr 2022 11:57:16 +0200
Message-ID: <YmEqfFdYN0Rml6V2@yuki>
In-Reply-To: <874k2nhgtg.fsf@catern.com>

Hi!
> Linux guarantees the stability of its userspace API, but the API
> itself is only informally described, primarily with English prose.  I
> want to add an explicit, authoritative machine-readable definition of
> the Linux userspace API.

My background is in kernel testing; I have maintained the Linux Test
Project for more than a decade now. Over the years we have created many
"unit tests" for kernel syscalls that watch over the syscall API and
make sure that we get the right results for both valid and invalid
inputs. These tests can also be considered a form of documentation. The
same goes for some of the selftests that have been added to the kernel
repo in recent years. In a sense these are the most detailed
descriptions of the interfaces we have.
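
A minimal sketch of what such a test checks, written in Python rather
than LTP's C purely for brevity (os.read() is a thin wrapper over
read(2)). The function names are made up for this example and are not
LTP API:

```python
import errno
import os


def check_read_valid():
    """read(2) on a pipe returns exactly the bytes that were written."""
    rfd, wfd = os.pipe()
    try:
        os.write(wfd, b"hello")
        return os.read(rfd, 16)
    finally:
        os.close(rfd)
        os.close(wfd)


def check_read_bad_fd():
    """read(2) on a closed descriptor is expected to fail with EBADF."""
    rfd, wfd = os.pipe()
    os.close(wfd)
    os.close(rfd)  # rfd is now a stale descriptor number
    try:
        os.read(rfd, 16)
    except OSError as e:
        return e.errno
    return None  # the call unexpectedly succeeded


if __name__ == "__main__":
    assert check_read_valid() == b"hello"
    assert check_read_bad_fd() == errno.EBADF
    print("PASS")
```

The two checks double as documentation: one records what a valid call
returns, the other records which errno an invalid descriptor produces.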

The main problem is that the kernel-userspace boundary is large: we have
thousands of tests, and I'm pretty sure that we don't cover even half of
it.

Also, some of the interfaces are too complex to be described in any
formal system, mostly the modern stuff such as io_uring or BPF. I have
had a hard time even understanding how to use these, and I doubt I would
be able to build a formal system to describe them. This is especially
true for io_uring, which is mostly syscall-less: we talk to the kernel
through shared buffers and atomic data updates.
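
To illustrate why a per-syscall description does not capture this style
of interface, here is a toy single-producer/single-consumer ring in
Python. It is only a model of the general head/tail idea, not the
io_uring ABI: the class and method names are invented, and plain Python
integers stand in for the atomic loads and stores that real shared
memory would need.

```python
class ToyRing:
    """Toy single-producer/single-consumer ring; NOT the io_uring ABI."""

    def __init__(self, entries):
        # A power-of-two size lets us wrap indices with a bit mask.
        assert entries > 0 and entries & (entries - 1) == 0
        self.mask = entries - 1
        self.slots = [None] * entries
        self.head = 0  # advanced only by the consumer
        self.tail = 0  # advanced only by the producer

    def submit(self, request):
        """Producer side: publish a request without any call into the consumer."""
        if self.tail - self.head > self.mask:
            return False  # ring is full
        self.slots[self.tail & self.mask] = request
        self.tail += 1  # in real shared memory this would be an atomic store
        return True

    def consume(self):
        """Consumer side: pick up the oldest pending request, if any."""
        if self.head == self.tail:
            return None  # ring is empty
        request = self.slots[self.head & self.mask]
        self.head += 1  # in real shared memory this would be an atomic store
        return request
```

Nothing here has arguments or return values in the syscall sense; the
contract lives in who may update which index and in what order, which is
much harder to write down as a type signature.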

> As background, in a conventional libc like glibc, read(2) calls the
> Linux system call read, passing arguments in an architecture-specific
> way according to the specific details of read.
> 
> The details of these syscalls are at best documented in manpages, and
> often defined only by the implementation.  Anyone else who wants to
> work with a syscall, in any way, needs to duplicate all those details.
> 
> So the most basic definition of the API would just represent the
> information already present in SYSCALL_DEFINE macros: the C types of
> arguments and return values.  More usefully, it would describe the
> formats of those arguments and return values: that the first argument
> to read is a file descriptor rather than an arbitrary integer, and
> what flags are valid in the flags argument of openat, and that open
> returns a file descriptor.  A step beyond that would be describing, in
> some limited way, the effects of syscalls; for example, that read
> writes into the passed buffer the number of bytes that it returned.

Having this would be awesome; it is just one step from actually
generating automated tests for the syscalls. However, my estimate is
that even if you started to work on this now it would take a decade to
get somewhere, but maybe I'm too pessimistic.
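
A rough sketch of how a description could drive test generation, in
Python. The Arg format and its "kind" tags are hypothetical and are not
an existing kernel or LTP format; only the C types are taken from the
SYSCALL_DEFINE3 entry for read. Only the "fd" kind is handled here.

```python
import errno
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Arg:
    name: str
    ctype: str  # C type as it appears in SYSCALL_DEFINE
    kind: str   # hypothetical format tag: "fd", "buffer", "size"


# Hypothetical machine-readable description of read(2).
READ = [
    Arg("fd", "unsigned int", "fd"),
    Arg("buf", "char __user *", "buffer"),
    Arg("count", "size_t", "size"),
]


def invalid_cases(args):
    """Yield (argument name, expected errno) for each kind we know how to break."""
    for arg in args:
        if arg.kind == "fd":
            yield arg.name, errno.EBADF


def run_read_with_bad(arg_name):
    """Call read(2) with one deliberately invalid argument; return the errno."""
    if arg_name != "fd":
        raise NotImplementedError(arg_name)
    rfd, wfd = os.pipe()
    os.close(wfd)
    os.close(rfd)  # rfd is now a stale descriptor number
    try:
        os.read(rfd, 1)
    except OSError as e:
        return e.errno
    return 0  # the call unexpectedly succeeded


if __name__ == "__main__":
    for name, expected in invalid_cases(READ):
        assert run_read_with_bad(name) == expected
    print("PASS")
```

The generator only knows that a file descriptor can be stale; every
further kind (buffers, sizes, flags) would need its own rule, which is
where most of the work would be.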

Still, fingers crossed.

-- 
Cyril Hrubis
chrubis@suse.cz

Thread overview: 7+ messages
2022-04-20 16:15 Explicitly defining the userspace API Spencer Baugh
2022-04-20 17:14 ` Greg KH
2022-05-06 16:59   ` Spencer Baugh
2022-04-20 17:18 ` Jann Horn
2022-04-21 11:33   ` Arnd Bergmann
2022-04-20 17:52 ` Marcin Juszkiewicz
2022-04-21  9:57 ` Cyril Hrubis [this message]
