From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andy Lutomirski
Date: Thu, 7 Nov 2019 21:06:53 -0800
Subject: Re: [RFC PATCH 04/14] pipe: Add O_NOTIFICATION_PIPE [ver #2]
To: David Howells
Cc: Andy Lutomirski, Linus Torvalds, Greg Kroah-Hartman, Casey Schaufler,
 Stephen Smalley, Nicolas Dichtel, raven@themaw.net, Christian Brauner,
 keyrings@vger.kernel.org, USB list, linux-block, LSM List, Linux FS Devel,
 Linux API, LKML
References: <157313371694.29677.15388731274912671071.stgit@warthog.procyon.org.uk>
 <157313375678.29677.15875689548927466028.stgit@warthog.procyon.org.uk>
 <6964.1573152517@warthog.procyon.org.uk>
In-Reply-To: <6964.1573152517@warthog.procyon.org.uk>
X-Mailing-List: linux-block@vger.kernel.org

On Thu, Nov 7, 2019 at 10:48 AM David Howells wrote:
>
> Andy Lutomirski wrote:
>
> > > Add an O_NOTIFICATION_PIPE flag that can be passed to pipe2() to indicate
> > > that the pipe being created is going to be used for notifications.  This
> > > suppresses the use of splice(), vmsplice(), tee() and sendfile() on the
> > > pipe as calling iov_iter_revert() on a pipe when a kernel notification
> > > message has been inserted into the middle of a multi-buffer splice will be
> > > messy.
> >
> > How messy?
>
> Well, iov_iter_revert() on a pipe iterator simply walks backwards along the
> ring discarding the last N contiguous slots (where N is normally the number of
> slots that were filled by whatever operation is being reverted).
>
> However, unless the code that transfers stuff into the pipe takes the spinlock
> and disables softirqs for the duration of its ring filling, what were
> N contiguous slots may now have kernel notifications interspersed - even if it
> has been holding the pipe mutex.
>
> So, now what do you do?  You have to free up just the buffers relevant to the
> iterator and then you can either compact down the ring to free up the space or
> you can leave null slots and let the read side clean them up, thereby
> reducing the capacity of the pipe temporarily.
>
> Either way, iov_iter_revert() gets more complex and has to hold the spinlock.

I feel like I'm missing something fundamental here.  I can open a
normal pipe from userspace (with pipe() or pipe2()), and I can have
two threads.  One thread writes to the pipe with write().  The other
thread writes with splice().  Everything works fine.  What's special
about notifications?
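
For illustration only, the revert problem described in the quoted text can be
shown with a toy model.  This is a sketch, not kernel code: the ring is a plain
Python list, each slot carries a made-up owner tag, and the names
naive_revert(), selective_revert() and interleaved_ring() are hypothetical.
Locking is ignored entirely.  The sketch assumes a notification has landed
between the slots of a three-buffer splice, which is the situation the quoted
text says can arise.

```python
def interleaved_ring():
    """Build a toy ring in which a notification sits inside a 3-slot splice.

    Each slot is an (owner, payload) tuple.  The owner tag is an assumption
    made for this illustration only.
    """
    return [
        ("write", "old data"),
        ("splice", "buf0"),
        ("splice", "buf1"),
        ("notify", "event"),   # inserted while the splice was filling slots
        ("splice", "buf2"),
    ]


def naive_revert(ring, n):
    """Discard the last n slots, assuming they are contiguous and all belong
    to the operation being reverted."""
    dropped = []
    for _ in range(n):
        dropped.append(ring.pop())
    return dropped


def selective_revert(ring, owner, n):
    """Walk backwards, discard up to n slots tagged with `owner`, skip
    everything else, and compact the ring as slots are removed."""
    dropped = []
    i = len(ring) - 1
    while i >= 0 and len(dropped) < n:
        if ring[i][0] == owner:
            dropped.append(ring.pop(i))
        i -= 1
    return dropped


if __name__ == "__main__":
    ring = interleaved_ring()
    print("naive dropped:    ", naive_revert(ring, 3))
    print("naive ring:       ", ring)

    ring = interleaved_ring()
    print("selective dropped:", selective_revert(ring, "splice", 3))
    print("selective ring:   ", ring)
```

In this model the naive revert throws away the notification and leaves buf0
behind, while the selective revert removes only the splice slots and keeps the
notification.  The selective variant corresponds to the "compact down the ring"
option in the quoted text; the alternative of leaving null slots for the reader
to clean up is not modelled here.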