linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andy Lutomirski <luto@amacapital.net>
To: Hugh Dickins <hughd@google.com>
Cc: David Herrmann <dh.herrmann@gmail.com>,
	Tony Battersby <tonyb@cybernetics.com>,
	Al Viro <viro@zeniv.linux.org.uk>, Jan Kara <jack@suse.cz>,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	Ryan Lortie <desrt@desrt.ca>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm <linux-mm@kvack.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>, Tejun Heo <tj@kernel.org>,
	Greg Kroah-Hartman <greg@kroah.com>,
	John Stultz <john.stultz@linaro.org>,
	Kristian Hogsberg <krh@bitplanet.net>,
	Lennart Poettering <lennart@poettering.net>,
	Daniel Mack <zonque@gmail.com>, Kay Sievers <kay@vrfy.org>
Subject: Re: [PATCH v2 2/3] shm: add memfd_create() syscall
Date: Mon, 2 Jun 2014 10:50:55 -0700	[thread overview]
Message-ID: <CALCETrUp+3h7aW=FUoB2fng0gNF_Fi3K-WYPR8bNXoXR5968PA@mail.gmail.com> (raw)
In-Reply-To: <alpine.LSU.2.11.1406020331100.1259@eggly.anvils>

On Mon, Jun 2, 2014 at 3:59 AM, Hugh Dickins <hughd@google.com> wrote:
> On Fri, 23 May 2014, David Herrmann wrote:
>> On Tue, May 20, 2014 at 4:20 AM, Hugh Dickins <hughd@google.com> wrote:
>> >
>> > What is a front-FD?
>>
>> With 'front-FD' I refer to things like dma-buf: They allocate a
>> file-descriptor which is just a wrapper around a kernel-internal FD.
>> For instance, DRM-gem buffers exported as dma-buf. fops on the dma-buf
>> are forwarded to the shmem-fd of the given gem-object, but any access
>> to the inode of the dma-buf fd is a no-op as the dma-buf fd uses
>> anon-inode, not the shmem-inode.
>>
>> A previous revision of memfd used something like that, but that was
>> inherently racy.
>
> Thanks for explaining: then I guess you can leave "front-FD" out of the
> description next time around, in case there are others like me who are
> more mystified than enlightened by it.
>
>> > But this does highlight how the "size" arg to memfd_create() is
>> > perhaps redundant.  Why give a size there, when size can be changed
>> > afterwards?  I expect your answer is that many callers want to choose
>> > the size at the beginning, and would prefer to avoid the extra call.
>> > I'm not sure if that's a good enough reason for a redundant argument.
>>
>> At one point in time we might be required to support atomic-sealing.
>> So a memfd_create() call takes the initial seals as upper 32bits in
>> "flags" and sets them before returning the object. If these seals
>> contain SEAL_GROW/SHRINK, we must pass the size during setup (think
>> CLOEXEC with fork()).
>
> That does sound like over-design to me.  You stop short of passing
> in an optional buffer of the data it's to contain, good.
>
> I think it would be a clearer interface without the size, but really
> that's an issue for the linux-api people you'll be Cc'ing next time.

I agree that the interface is more orthogonal without size, but I
suspect that every single user of memfd_create will follow up with an
immediate ftruncate for fallocate.  That being said, maybe it's better
to leave size out so that users have to think about whether to use
ftruncate or fallocate.

>
> You say "think CLOEXEC with fork()": you have thought about this, I
> have not, please spell out for me what the atomic size guards against.
> Do you want an fd that's not shared across fork?
>
>>
>> Note that we spent a lot of time discussing whether such
>> atomic-sealing is necessary and no-one came up with a real race so
>> far. Therefore, I didn't include that. But especially if we add new
>> seals (like SHMEM_SEAL_OPEN, which I still think is not needed and
>> just hides real problems), we might at one point be required to
>> support that. That's also the reason why "flags" is 64bits.
>>
>> One might argue that we can just add memfd_create2() once that
>> happens, but I didn't see any harm in including "size" and making them
>> 64bit.
>
> I've not noticed another system call with 64-bit flags, it does seem
> over the top to me: the familiar ones all use int.  But again,
> a matter for linux-api not for me.

I think that 64-bit flags are barely better than just having two flags
arguments: 64-bit syscall arguments take up two slots on 32-bit
architectures, so they don't save any space.  (They save a tiny amount
of time on 64-bit architectures.)

>
> Hugh



-- 
Andy Lutomirski
AMA Capital Management, LLC

  reply	other threads:[~2014-06-02 17:51 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-15 18:38 [PATCH v2 0/3] File Sealing & memfd_create() David Herrmann
2014-04-15 18:38 ` [PATCH v2 1/3] shm: add sealing API David Herrmann
2014-05-20  2:16   ` Hugh Dickins
2014-05-23 16:37     ` David Herrmann
2014-06-02 10:30       ` Hugh Dickins
2014-04-15 18:38 ` [PATCH v2 2/3] shm: add memfd_create() syscall David Herrmann
2014-05-20  2:20   ` Hugh Dickins
2014-05-23 16:57     ` David Herrmann
2014-06-02 10:59       ` Hugh Dickins
2014-06-02 17:50         ` Andy Lutomirski [this message]
2014-06-13 10:42         ` David Herrmann
2014-05-21 10:50   ` Konstantin Khlebnikov
2014-04-15 18:38 ` [PATCH v2 3/3] selftests: add memfd_create() + sealing tests David Herrmann
2014-05-20  2:22   ` Hugh Dickins
2014-05-23 17:06     ` David Herrmann
2014-05-14  5:09 ` [PATCH v2 0/3] File Sealing & memfd_create() Hugh Dickins
2014-05-14 16:15   ` Tony Battersby
2014-05-14 22:35     ` Hugh Dickins
2014-05-19 11:44       ` David Herrmann
2014-05-19 16:09         ` Jan Kara
2014-05-19 22:11           ` Hugh Dickins
2014-05-26 11:44             ` David Herrmann
2014-05-31  4:44               ` Hugh Dickins
2014-06-02  4:42         ` Minchan Kim
2014-06-02  9:14           ` Jan Kara
2014-06-02 16:04             ` David Herrmann
2014-06-03  8:31               ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CALCETrUp+3h7aW=FUoB2fng0gNF_Fi3K-WYPR8bNXoXR5968PA@mail.gmail.com' \
    --to=luto@amacapital.net \
    --cc=akpm@linux-foundation.org \
    --cc=desrt@desrt.ca \
    --cc=dh.herrmann@gmail.com \
    --cc=greg@kroah.com \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=jack@suse.cz \
    --cc=john.stultz@linaro.org \
    --cc=kay@vrfy.org \
    --cc=krh@bitplanet.net \
    --cc=lennart@poettering.net \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mtk.manpages@gmail.com \
    --cc=tj@kernel.org \
    --cc=tonyb@cybernetics.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=zonque@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).