Linux-Block Archive on
From: David Howells <>
To: Andy Lutomirski <>
	Linus Torvalds <>,
	Greg Kroah-Hartman <>,
	Casey Schaufler <>,
	Stephen Smalley <>,
	Nicolas Dichtel <>,, Christian Brauner <>,, USB list <>,
	linux-block <>,
	LSM List <>,
	Linux FS Devel <>,
	Linux API <>,
	LKML <>
Subject: Re: [RFC PATCH 04/14] pipe: Add O_NOTIFICATION_PIPE [ver #2]
Date: Thu, 07 Nov 2019 18:48:37 +0000
Message-ID: <> (raw)
In-Reply-To: <>

Andy Lutomirski <> wrote:

> > Add an O_NOTIFICATION_PIPE flag that can be passed to pipe2() to indicate
> > that the pipe being created is going to be used for notifications.  This
> > suppresses the use of splice(), vmsplice(), tee() and sendfile() on the
> > pipe as calling iov_iter_revert() on a pipe when a kernel notification
> > message has been inserted into the middle of a multi-buffer splice will be
> > messy.
> How messy?

Well, iov_iter_revert() on a pipe iterator simply walks backwards along the
ring discarding the last N contiguous slots (where N is normally the number of
slots that were filled by whatever operation is being reverted).
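
To make that concrete, here is a toy userspace model of the "walk backwards
and discard N slots" behaviour.  It is only a sketch: struct toy_ring,
toy_ring_push() and toy_ring_revert() are names made up for illustration and
are not the kernel's pipe or iov_iter code.  The model assumes the last N
slots all belong to the operation being reverted.

```c
#include <assert.h>

/* Toy model only; this is NOT the kernel's struct pipe_inode_info. */
#define TOY_RING_SLOTS 8

enum toy_kind { TOY_EMPTY, TOY_DATA, TOY_NOTIFY };

struct toy_ring {
	enum toy_kind slot[TOY_RING_SLOTS];
	unsigned int head;	/* next slot to fill (free-running counter) */
	unsigned int tail;	/* next slot to read (free-running counter) */
};

/* Number of occupied slots. */
static unsigned int toy_ring_used(const struct toy_ring *r)
{
	return r->head - r->tail;
}

/* Append one slot; returns 0 on success or -1 if the ring is full. */
static int toy_ring_push(struct toy_ring *r, enum toy_kind kind)
{
	if (toy_ring_used(r) >= TOY_RING_SLOTS)
		return -1;
	r->slot[r->head % TOY_RING_SLOTS] = kind;
	r->head++;
	return 0;
}

/*
 * Naive revert: discard the last n slots, on the assumption that they are
 * contiguous and were all filled by the caller.
 */
static void toy_ring_revert(struct toy_ring *r, unsigned int n)
{
	assert(n <= toy_ring_used(r));
	while (n--) {
		r->head--;
		r->slot[r->head % TOY_RING_SLOTS] = TOY_EMPTY;
	}
}
```

With three data slots pushed, toy_ring_revert(&r, 2) leaves one occupied slot
and nothing else needs fixing up, which is why the simple form is attractive.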

However, unless the code that transfers stuff into the pipe takes the spinlock
and disables softirqs for the duration of its ring filling, what were N
contiguous slots may now have kernel notifications interspersed - even if it
has been holding the pipe mutex.

So, now what do you do?  You have to free up just the buffers relevant to the
iterator and then you can either compact down the ring to free up the space or
you can leave null slots and let the read side clean them up, thereby
reducing the capacity of the pipe temporarily.
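
The two choices can be sketched in a toy userspace model.  Everything here
(struct mixed_ring, mix_revert_leave_holes(), mix_revert_compact()) is
hypothetical and for illustration only.  The model keeps the tail fixed at
slot 0 to avoid wrap-around, and assumes that the last N data slots are the
caller's because the caller held the pipe mutex.  In the kernel either variant
would additionally have to run under the spinlock.

```c
#include <assert.h>

/* Toy model only; tail is fixed at slot 0, so there is no wrap-around. */
#define MIX_SLOTS 8

enum mix_kind { MIX_EMPTY, MIX_DATA, MIX_NOTIFY };

struct mixed_ring {
	enum mix_kind slot[MIX_SLOTS];
	unsigned int head;	/* index one past the last occupied slot */
};

/* Free the last n data slots, skipping over interleaved notifications. */
static unsigned int mix_free_last_data(struct mixed_ring *r, unsigned int n)
{
	unsigned int freed = 0;
	unsigned int i = r->head;

	while (i > 0 && freed < n) {
		i--;
		if (r->slot[i] == MIX_DATA) {
			r->slot[i] = MIX_EMPTY;
			freed++;
		}
	}
	return freed;
}

/*
 * Option 1: leave null slots for the read side to clean up.  Only holes at
 * the very end can be reclaimed straight away; the rest cost capacity.
 */
static unsigned int mix_revert_leave_holes(struct mixed_ring *r, unsigned int n)
{
	unsigned int freed = mix_free_last_data(r, n);

	while (r->head > 0 && r->slot[r->head - 1] == MIX_EMPTY)
		r->head--;
	return freed;
}

/* Option 2: compact the ring so the surviving slots are contiguous again. */
static unsigned int mix_revert_compact(struct mixed_ring *r, unsigned int n)
{
	unsigned int freed = mix_free_last_data(r, n);
	unsigned int in, out = 0;

	for (in = 0; in < r->head; in++)
		if (r->slot[in] != MIX_EMPTY)
			r->slot[out++] = r->slot[in];
	for (in = out; in < r->head; in++)
		r->slot[in] = MIX_EMPTY;
	r->head = out;
	return freed;
}
```

Starting from slots { data, data, notify, data, notify } and reverting two
data slots, the first variant leaves { data, hole, notify, hole, notify } with
the head unchanged, whereas the second leaves { data, notify, notify } with
the head moved back by two.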

Either way, iov_iter_revert() gets more complex and has to hold the spinlock.

And if you don't take the spinlock whilst you're reverting, more notifications
can come in to make your life more interesting.

There's also a problem with splicing out from a notification pipe: the
messages are scribed onto preallocated buffers, so splicing them out would
require those buffers to carry refcounts and, in any case, they are of limited
quantity.

> And is there some way to make it impossible for this to happen?

Yes.  That's what I'm doing by declaring the pipe to be unspliceable up front.

> Adding a new flag to pipe2() to avoid messy kernel code seems
> like a poor tradeoff.

By far the easiest place to check whether a pipe can be spliced to is in
get_pipe_info().  That's checking the file anyway.  After that, you can't make
the check until the pipe is locked.
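
As a rough illustration of why an up-front check is cheap, here is a toy
userspace model.  struct toy_file, struct toy_pipe and toy_get_pipe_info()
are invented for this sketch and do not reflect the real get_pipe_info()
signature or the kernel's data structures.  The assumption is that the
"notification" property is fixed when the pipe is created and never changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel's file and pipe objects. */
struct toy_pipe {
	bool notification;	/* fixed at creation time in this model */
};

struct toy_file {
	bool is_pipe;
	struct toy_pipe *pipe;
};

/*
 * Return the pipe behind a file, or NULL if the file is not a pipe or if
 * the caller wants to splice and the pipe was created for notifications.
 * No lock is needed because the flag cannot change after creation.
 */
static struct toy_pipe *toy_get_pipe_info(const struct toy_file *file,
					  bool for_splice)
{
	if (!file->is_pipe)
		return NULL;
	if (for_splice && file->pipe->notification)
		return NULL;
	return file->pipe;
}
```

An ordinary pipe is returned for both kinds of caller, whereas a notification
pipe is returned for plain lookups and refused for splicing before any lock is
taken.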

Furthermore, if it's not done upfront, the change to the pipe might happen
during a splicing operation that's residing in pipe_wait()... which drops the
pipe mutex.



Thread overview: 22+ messages
2019-11-07 13:35 [RFC PATCH 00/14] pipe: Keyrings, Block and USB notifications " David Howells
2019-11-07 13:35 ` [RFC PATCH 01/14] uapi: General notification queue definitions " David Howells
2019-11-07 13:35 ` [RFC PATCH 02/14] security: Add hooks to rule on setting a watch " David Howells
2019-11-07 13:35 ` [RFC PATCH 03/14] security: Add a hook for the point of notification insertion " David Howells
2019-11-07 13:35 ` [RFC PATCH 04/14] pipe: Add O_NOTIFICATION_PIPE " David Howells
2019-11-07 18:16   ` Andy Lutomirski
2019-11-07 18:48   ` David Howells [this message]
2019-11-08  5:06     ` Andy Lutomirski
2019-11-08  6:42     ` David Howells
2019-11-07 13:36 ` [RFC PATCH 05/14] pipe: Add general notification queue support " David Howells
2019-11-07 13:36 ` [RFC PATCH 06/14] keys: Add a notification facility " David Howells
2019-11-07 13:36 ` [RFC PATCH 07/14] Add sample notification program " David Howells
2019-11-07 13:36 ` [RFC PATCH 08/14] pipe: Allow buffers to be marked read-whole-or-error for notifications " David Howells
2019-11-07 18:15   ` Andy Lutomirski
2019-11-07 18:23   ` David Howells
2019-11-07 13:36 ` [RFC PATCH 09/14] pipe: Add notification lossage handling " David Howells
2019-11-07 13:36 ` [RFC PATCH 10/14] Add a general, global device notification watch list " David Howells
2019-11-07 13:37 ` [RFC PATCH 11/14] block: Add block layer notifications " David Howells
2019-11-07 13:37 ` [RFC PATCH 12/14] usb: Add USB subsystem " David Howells
2019-11-07 13:37 ` [RFC PATCH 13/14] selinux: Implement the watch_key security hook " David Howells
2019-11-07 13:37 ` [RFC PATCH 14/14] smack: Implement the watch_key and post_notification hooks " David Howells
2019-11-07 17:16 ` [RFC PATCH 05/14] pipe: Add general notification queue support " David Howells
