All of lore.kernel.org
 help / color / mirror / Atom feed
From: Geert Uytterhoeven <geert@linux-m68k.org>
To: Arnd Bergmann <arnd@arndb.de>
Cc: "Jens Axboe" <axboe@kernel.dk>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	"Alexander Viro" <viro@zeniv.linux.org.uk>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	"y2038 Mailman List" <y2038@lists.linaro.org>,
	"Mauro Carvalho Chehab" <mchehab+samsung@kernel.org>,
	"Jonathan Neuschäfer" <j.neuschaefer@gmx.net>,
	"Masahiro Yamada" <yamada.masahiro@socionext.com>,
	"Vladimir Oltean" <olteanv@gmail.com>,
	"Kent Overstreet" <kent.overstreet@gmail.com>,
	"open list:DOCUMENTATION" <linux-doc@vger.kernel.org>
Subject: Re: [PATCH 24/24] Documentation: document ioctl interfaces better
Date: Thu, 12 Dec 2019 09:16:41 +0100	[thread overview]
Message-ID: <CAMuHMdW=eG9WWbeL0Od6dVzE7ydpBvVybvkn+cGFtyBR77sP6A@mail.gmail.com> (raw)
In-Reply-To: <20191211204306.1207817-25-arnd@arndb.de>

Hi Arnd,

On Wed, Dec 11, 2019 at 9:53 PM Arnd Bergmann <arnd@arndb.de> wrote:
> Documentation/process/botching-up-ioctls.rst was orignally
> written as a blog post for DRM driver writers, so it it misses
> some points while going into a lot of detail on others.
>
> Try to provide a replacement that addresses typical issues
> across a wider range of subsystems, and follows the style of
> the core-api documentation better.
>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Thanks for your patch!

> --- /dev/null
> +++ b/Documentation/core-api/ioctl.rst
> @@ -0,0 +1,250 @@
> +======================
> +ioctl based interfaces
> +======================
> +
> +:c:func:`ioctl` is the most common way for applications to interface
> +with device drivers. It is flexible and easily extended by adding new
> +commands and can be passed through character devices, block devices as
> +well as sockets and other special file descriptors.
> +
> +However, it is also very easy to get ioctl command definitions wrong,
> +and hard to fix them later without breaking existing applications,
> +so this documentation tries to help developers get it right.
> +
> +Command number definitions
> +==========================
> +
> +The command number, or request number, is the second argument passed to
> +the ioctl system call. While this can be any 32-bit number that uniquely
> +identifies an action for a particular driver, there are a number of
> +conventions around defining them.

Interesting. I never realized the action is 32-bit in the kernel, but
unsigned long in userspace...

> +
> +``include/uapi/asm-generic/ioctl.h`` provides four macros for defining
> +ioctl commands that follow modern conventions: ``_IOC``, ``_IOR``,
> +``_IOW``, and ``_IORW``. These should be used for all new commands,
> +with the correct parameters:
> +
> +_IO/_IOR/_IOW/_IOWR

This says _IO....

> +   The macro name determines whether the argument is used for passing
> +   data into kernel (_IOW), from the kernel (_IOR), both (_IOWR) or is

into the kernel
, or is

> +   not a pointer (_IOC). It is possible but not recommended to pass an
> +   integer value instead of a pointer with _IOC.

...which is not explained here, but _IOC is?

> +
> +type
> +   An 8-bit number, often a character literal, specific to a subsystem
> +   or driver, and listed in :doc:`../ioctl/ioctl-number`
> +
> +nr
> +  An 8-bit number identifying the specific command, unique for a give
> +  value of 'type'
> +
> +size
> +  The name of the data type pointed to by the argument, the command
> +  number encodes the ``sizeof(size)`` value in a 13-bit or 14-bit integer,
> +  leading to a limit of 8191 bytes for the maximum size of the argument.
> +  Note: do not pass sizeof(type) type into _IOR/IOW, as that will lead
> +  to encoding sizeof(sizeof(type)), i.e. sizeof(size_t).

Looks like "size" could be renamed, to make this more obvious?

> +Timestamps
> +==========
> +
> +Traditionally, timestamps and timeout values are passed as ``struct
> +timespec`` or ``struct timeval``, but these are problematic because of
> +incompatible definitions of these structures in user space after the
> +move to 64-bit time_t.
> +
> +The __kernel_timespec type can be used instead to be embedded in other
> +data structures when separate second/nanosecond values are desired,
> +or passed to user space directly. This is still not ideal though,
> +as the structure matches neither the kernel's timespec64 nor the user
> +space timespec exactly. The get_timespec64() and put_timespec64() helper
> +functions canbe used to ensure that the layout remains compatible with

can be

> +user space and the padding is treated correctly.
> +
> +As it is cheap to convert seconds to nanoseconds, but the opposite
> +requires an expensive 64-bit division, a simple __u64 nanosecond value
> +can be simpler and more efficient.
> +
> +Timeout values and timestamps should ideally use CLOCK_MONOTONIC time,
> +as returned by ``ktime_get_ns()`` or ``ktime_get_ts64()``.  Unlike
> +CLOCK_REALTIME, this makes the timestamps immune from jumping backwards
> +or forwards due to leap second adjustments and clock_settime() calls.
> +
> +``ktime_get_real_ns()`` can be used for CLOCK_REALTIME timestamps that
> +may be required for timestamps that need to be persistent across a reboot

Drop "may be required for timestamps that"?

> +or between multiple machines.

> +Structure layout
> +----------------
> +
> +Compatible data structures have the same layout on all architectures,
> +avoiding all problematic members:
> +
> +* ``long`` and ``unsigned long`` are the size of a register, so
> +  they can be either 32 bit or 64 bit wide and cannot be used in portable

32-bit or 64-bit (for consistency with the rest of the document)

> +  data structures. Fixed-length replacements are ``__s32``, ``__u32``,
> +  ``__s64`` and ``__u64``.
> +
> +* Pointers have the same problem, in addition to requiring the
> +  use of ``compat_ptr()``. The best workaround is to use ``__u64``
> +  in place of pointers, which requires a cast to ``uintptr_t`` in user
> +  space, and the use of ``u64_to_user_ptr()`` in the kernel to convert
> +  it back into a user pointer.
> +
> +* On the x86-32 (i386) architecture, the alignment of 64-bit variables
> +  is only 32 bit, but they are naturally aligned on most other

32-bit

> +  architectures including x86-64. This means a structure like
> +
> +  ::
> +
> +    struct foo {
> +        __u32 a;
> +        __u64 b;
> +        __u32 c;
> +    };
> +
> +  has four bytes of padding between a and b on x86-64, plus another four
> +  bytes of padding at the end, but no padding on i386, and it needs a
> +  compat_ioctl conversion handler to translate between the two formats.
> +
> +  To avoid this problem, all structures should have their members
> +  naturally aligned, or explicit reserved fields added in place of the
> +  implicit padding.
> +
> +* On ARM OABI user space, 16-bit member variables have 32-bit
> +  alignment, making them incompatible with modern EABI kernels.
> +  Conversely, on the m68k architecture, all struct members have at most
> +  16-bit alignment. These rarely cause problems as neither ARM-OABI nor

"have at most 16-bit alignment" sounds a bit weird to me, as a member
may have a greater alignment.
"struct members are not guaranteed to have an alignment greater than 16-bit"?

> +  m68k are supported by any compat mode, but for consistency, it is best
> +  to completely avoid 16-bit member variables.
> +
> +
> +* Bitfields and enums generally work as one would expect them to,
> +  but some properties of them are implementation-defined, so it is better
> +  to avoid them completely in ioctl interfaces.
> +
> +* ``char`` members can be either signed or unsigned, depending on
> +  the architecture, so the __u8 and __s8 types should be used for 8-bit
> +  integer values, though char arrays are clearer for fixed-length strings.
> +
> +Information leaks
> +=================
> +
> +Uninitialized data must not be copied back to user space, as this can
> +cause an information leak, which can be used to defeat kernel address
> +space layout randomization (KASLR), helping in an attack.
> +
> +As explained for the compat mode, it is best to not avoid any padding in

best to avoid any implicit padding?

> +data structures, but if there is already padding in existing structures,
> +the kernel driver must be careful to zero out the padding using
> +``memset()`` or similar before copying it to user space.
> +
> +Subsystem abstractions
> +======================
> +
> +While some device drivers implement their own ioctl function, most
> +subsystems implement the same command for multiple drivers.  Ideally the
> +subsystem has an .ioctl() handler that copies the arguments from and
> +to user space, passing them into subsystem specific callback functions
> +through normal kernel pointers.
> +
> +This helps in various ways:
> +
> +* Applications written for one driver are more likely to work for
> +  another one in the same subsystem if there are no subtle differences
> +  in the user space ABI.
> +
> +* The complexity of user space access and data structure layout at done

is done

> +  in one place, reducing the potential for implementation bugs.
> +
> +* It is more likely to be reviewed by experienced developers
> +  that can spot problems in the interface when the ioctl is shared
> +  between multiple drivers than when it is only used in a single driver.
> +
> +Alternatives to ioctl
> +=====================
> +
> +There are many cases in which ioctl is not the best solution for a
> +problem. Alternatives include

:

> +
> +* System calls are a better choice for a system-wide feature that
> +  is not tied to a physical device or constrained by the file system
> +  permissions of a character device node
> +
> +* netlink is the preferred way of configuring any network related
> +  objects through sockets.
> +
> +* debugfs is used for ad-hoc interfaces for debugging functionality
> +  that does not need to be exposed as a stable interface to applications.
> +
> +* sysfs is a good way to expose the state of an in-kernel object
> +  that is not tied to a file descriptor.
> +
> +* configfs can be used for more complex configuration than sysfs
> +
> +* A custom file system can provide extra flexibility with a simple
> +  user interface but add a lot of complexity in the implementation.

adds ... to

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

  parent reply	other threads:[~2019-12-12  8:16 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-11 20:42 [PATCH 00/24] block, scsi: final compat_ioctl cleanup Arnd Bergmann
2019-12-11 20:42 ` [Xen-devel] " Arnd Bergmann
2019-12-11 20:42 ` Arnd Bergmann
2019-12-11 20:42 ` [PATCH 01/24] compat: ARM64: always include asm-generic/compat.h Arnd Bergmann
2019-12-11 20:42   ` Arnd Bergmann
2019-12-12 10:16   ` Will Deacon
2019-12-12 10:16     ` Will Deacon
2019-12-12 10:41     ` Arnd Bergmann
2019-12-12 10:41       ` Arnd Bergmann
2019-12-11 20:42 ` [PATCH 02/24] compat: scsi: sg: fix v3 compat read/write interface Arnd Bergmann
2019-12-12 16:25   ` Christoph Hellwig
2019-12-12 17:34     ` Arnd Bergmann
2019-12-11 20:42 ` [PATCH 03/24] compat_ioctl: block: handle BLKREPORTZONE/BLKRESETZONE Arnd Bergmann
2019-12-12  1:20   ` Damien Le Moal
2019-12-11 20:42 ` [PATCH 04/24] compat_ioctl: block: handle BLKGETZONESZ/BLKGETNRZONES Arnd Bergmann
2019-12-12  1:20   ` Damien Le Moal
2019-12-11 20:42 ` [PATCH 05/24] compat_ioctl: block: handle add zone open, close and finish ioctl Arnd Bergmann
2019-12-12  1:20   ` Damien Le Moal
2019-12-11 20:42 ` [PATCH 06/24] compat_ioctl: block: handle Persistent Reservations Arnd Bergmann
2019-12-11 20:42 ` [PATCH 07/24] compaT_ioctl: ubd, aoe: use blkdev_compat_ptr_ioctl Arnd Bergmann
2019-12-11 20:42   ` Arnd Bergmann
2019-12-11 20:42 ` [PATCH 08/24] compat_ioctl: move CDROM_SEND_PACKET handling into scsi Arnd Bergmann
2019-12-11 20:42 ` [PATCH 09/24] compat_ioctl: move CDROMREADADIO to cdrom.c Arnd Bergmann
2019-12-11 20:42 ` [PATCH 10/24] compat_ioctl: cdrom: handle CDROM_LAST_WRITTEN Arnd Bergmann
2019-12-17 15:20   ` [Y2038] " Ben Hutchings
2019-12-17 21:38     ` Arnd Bergmann
2019-12-11 20:42 ` [PATCH 11/24] compat_ioctl: block: handle cdrom compat ioctl in non-cdrom drivers Arnd Bergmann
2019-12-11 20:42   ` [Xen-devel] " Arnd Bergmann
2019-12-11 20:42 ` [PATCH 12/24] compat_ioctl: add scsi_compat_ioctl Arnd Bergmann
2019-12-11 20:42 ` [PATCH 13/24] compat_ioctl: bsg: add handler Arnd Bergmann
2019-12-11 20:42 ` [PATCH 14/24] compat_ioctl: ide: floppy: " Arnd Bergmann
2019-12-11 20:42 ` [PATCH 15/24] compat_ioctl: scsi: move ioctl handling into drivers Arnd Bergmann
2019-12-11 23:05   ` Michael S. Tsirkin
2019-12-12  0:28     ` Paolo Bonzini
2019-12-12  0:28     ` Paolo Bonzini
2019-12-12  9:17       ` Arnd Bergmann
2019-12-12  9:17       ` Arnd Bergmann
2019-12-12 10:27       ` Michael S. Tsirkin
2019-12-12 10:27       ` Michael S. Tsirkin
2019-12-12 16:27       ` Christoph Hellwig
2019-12-12 16:35         ` Jens Axboe
2019-12-12 16:35         ` Jens Axboe
2019-12-12 16:27       ` Christoph Hellwig
2019-12-11 23:05   ` Michael S. Tsirkin
2019-12-12 10:25   ` Michael S. Tsirkin
2019-12-12 10:25     ` Michael S. Tsirkin
2019-12-11 20:42 ` Arnd Bergmann
2019-12-11 20:42 ` [PATCH 16/24] compat_ioctl: move sys_compat_ioctl() to ioctl.c Arnd Bergmann
2019-12-11 20:42 ` [PATCH 17/24] compat_ioctl: simplify the implementation Arnd Bergmann
2019-12-11 20:42 ` [PATCH 18/24] compat_ioctl: move cdrom commands into cdrom.c Arnd Bergmann
2019-12-11 20:42 ` [PATCH 19/24] compat_ioctl: scsi: handle HDIO commands from drivers Arnd Bergmann
2019-12-11 20:42 ` [PATCH 20/24] compat_ioctl: move HDIO ioctl handling into drivers/ide Arnd Bergmann
2019-12-12 16:29   ` Christoph Hellwig
2019-12-12 17:21     ` Arnd Bergmann
2019-12-11 20:42 ` [PATCH 21/24] compat_ioctl: block: move blkdev_compat_ioctl() into ioctl.c Arnd Bergmann
2019-12-12 16:30   ` Christoph Hellwig
2019-12-12 17:24     ` Arnd Bergmann
2019-12-11 20:42 ` [PATCH 22/24] compat_ioctl: block: simplify compat_blkpg_ioctl() Arnd Bergmann
2019-12-11 20:42 ` [PATCH 23/24] compat_ioctl: simplify up block/ioctl.c Arnd Bergmann
2019-12-11 20:42 ` [PATCH 24/24] Documentation: document ioctl interfaces better Arnd Bergmann
2019-12-11 21:05   ` Jonathan Corbet
2019-12-12 10:56     ` Arnd Bergmann
2019-12-12  8:16   ` Geert Uytterhoeven [this message]
2019-12-12 11:04     ` Arnd Bergmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAMuHMdW=eG9WWbeL0Od6dVzE7ydpBvVybvkn+cGFtyBR77sP6A@mail.gmail.com' \
    --to=geert@linux-m68k.org \
    --cc=arnd@arndb.de \
    --cc=axboe@kernel.dk \
    --cc=corbet@lwn.net \
    --cc=j.neuschaefer@gmx.net \
    --cc=jejb@linux.ibm.com \
    --cc=kent.overstreet@gmail.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=mchehab+samsung@kernel.org \
    --cc=olteanv@gmail.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=y2038@lists.linaro.org \
    --cc=yamada.masahiro@socionext.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.