* [RFC] shmgetfd idea
@ 2014-01-28  1:37 John Stultz
  2014-01-28  1:53 ` Kay Sievers
                   ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: John Stultz @ 2014-01-28  1:37 UTC (permalink / raw)
  To: linux-mm
  Cc: Greg KH, Kay Sievers, Android Kernel Team, Andrew Morton,
	Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel,
	Michel Lespinasse, Johannes Weiner, H. Peter Anvin, Neil Brown,
	Andrea Arcangeli, Takahiro Akashi, Minchan Kim,
	Lennart Poettering

In working with ashmem and looking briefly at kdbus' memfd ideas,
there's a commonality that both basically act as a method to provide
applications with unlinked tmpfs/shmem fds.

In the Android case, it's important to have this interface to atomically
provide these unlinked tmpfs fds, because they'd like to avoid having
tmpfs mounts that are writable by applications (since that creates a
potential DoS on the system by applications writing random files that
persist after the process has been killed). It also provides better
life-cycle management for resources: since the fds never have named
links in the filesystem, their resources are automatically cleaned up
when the last process with the fd dies, and there are no potential races
between create and unlink with processes being terminated, which avoids
the need for cleanup management.

I won't speak for the kdbus use, but my understanding is memfds address
similar needs along with being something to connect with other features.


So one idea was maybe we need a new interface. Something like:

int shmgetfd(char* name, size_t size, int shmflg);


This would be very similar to shmget, but would return a file
descriptor which could be mapped and passed to other processes to map.
It is very similar to the in-kernel shmem_file_setup() interface.
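
To make the motivation concrete, here is a minimal userspace sketch of
what a caller has to do without such a call, assuming a writable
directory (ideally on tmpfs) is available. The helper name
shmgetfd_emulated and its dir parameter are hypothetical and are not
part of the proposal; the point is only that create and unlink are two
separate steps, which is the window an atomic interface would close.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical userspace emulation, NOT the proposed syscall.
 * `dir` is assumed to be a writable directory, e.g. a tmpfs mount.
 * Returns an fd for an unlinked file of `size` bytes, or -1 on error.
 */
int shmgetfd_emulated(const char *dir, size_t size)
{
	char path[4096];
	int n, fd;

	n = snprintf(path, sizeof(path), "%s/shmfd-XXXXXX", dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	fd = mkstemp(path);		/* step 1: create a named file */
	if (fd < 0)
		return -1;

	/* A process killed at this point leaves the named file behind. */

	if (unlink(path) < 0 ||		/* step 2: drop the name */
	    ftruncate(fd, (off_t)size) < 0) {
		close(fd);
		return -1;
	}
	return fd;	/* can now be mmap()ed or passed to another process */
}
```

The returned fd has no name in the filesystem, so its pages are freed
once the last reference goes away.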

(Thanks to Akashi-san for initially pointing out the similarity to shmget.)

Of course, shmgetfd on its own wouldn't address the quota issue right
away, but it would be fairly easy to have a limit for the total number of
bytes a process could generate, or some other limiting mechanism.


Probably the bigger drawback here is that both ashmem and memfd tack
on additional operations that can be performed on the fds.

In ashmem's case it allows for changing the segment's name, and
unpinning regions which can then be lazily discarded by the kernel.

For memfd, the extra feature is sealing, which prevents modification of
the file when it's shared.

In ashmem's case, both vma-naming and volatile ranges are trying to
address how the needed features would be generically applied to tmpfs
fds (and potentially to wider uses) - so together with something like
shmgetfd they would provide all the functionality needed. I am not aware
of any current plans for memfd's sealing to be similarly worked into a
generic concept - the code hasn't even been submitted, so this is too
early - but in any case, it's important to note that none of these plans
for generic functionality have been merged or even received with much
interest, so I do understand how a proposal for a new interface that
only solves half of the needed infrastructure may not be particularly
welcome.

So while I do understand the difficulty of trying to create more generic
interfaces rather than just creating a new chardev/ioctl interface to a
more limited subset of functionality, I do think it's worth exploring if
we can find a way to share infrastructure at some level (even if it's
just due diligence to prove whether the more limited scope chardev/ioctl
interfaces are widely agreed to be better).

Anyway, I just wanted to submit this sketched out idea as food for
thought to see if there was any objection or interest (I've got a draft
patch I'll send out once I get a chance to test it). So let me know if
you have any feedback or comments.

thanks
-john

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [RFC] shmgetfd idea
  2014-01-28  1:37 [RFC] shmgetfd idea John Stultz
@ 2014-01-28  1:53 ` Kay Sievers
  2014-01-28 19:47   ` John Stultz
  2014-01-28  3:52 ` H. Peter Anvin
  2014-01-30  8:46 ` Christoph Hellwig
  2 siblings, 1 reply; 25+ messages in thread
From: Kay Sievers @ 2014-01-28  1:53 UTC (permalink / raw)
  To: John Stultz
  Cc: linux-mm, Greg KH, Android Kernel Team, Andrew Morton,
	Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel,
	Michel Lespinasse, Johannes Weiner, H. Peter Anvin, Neil Brown,
	Andrea Arcangeli, Takahiro Akashi, Minchan Kim,
	Lennart Poettering

On Tue, Jan 28, 2014 at 2:37 AM, John Stultz <john.stultz@linaro.org> wrote:
> In working with ashmem and looking briefly at kdbus' memfd ideas,
> there's a commonality that both basically act as a method to provide
> applications with unlinked tmpfs/shmem fds.
>
> In the Android case, it's important to have this interface to atomically
> provide these unlinked tmpfs fds, because they'd like to avoid having
> tmpfs mounts that are writable by applications (since that creates a
> potential DoS on the system by applications writing random files that
> persist after the process has been killed). It also provides better
> life-cycle management for resources: since the fds never have named
> links in the filesystem, their resources are automatically cleaned up
> when the last process with the fd dies, and there are no potential races
> between create and unlink with processes being terminated, which avoids
> the need for cleanup management.
>
> I won't speak for the kdbus use, but my understanding is memfds address
> similar needs along with being something to connect with other features.
>
>
> So one idea was maybe we need a new interface. Something like:
>
> int shmgetfd(char* name, size_t size, int shmflg);
>
>
> This would be very similar to shmget, but would return a file
> descriptor which could be mapped and passed to other processes to map.
> It is very similar to the in-kernel shmem_file_setup() interface.
>
> (Thanks to Akashi-san for initially pointing out the similarity to shmget.)
>
> Of course, shmgetfd on its own wouldn't address the quota issue right
> away, but it would be fairly easy to have a limit for the total number of
> bytes a process could generate, or some other limiting mechanism.
>
>
> Probably the bigger drawback here is that both ashmem and memfd tack
> on additional operations that can be performed on the fds.
>
> In ashmem's case it allows for changing the segment's name, and
> unpinning regions which can then be lazily discarded by the kernel.
>
> For memfd, the extra feature is sealing, which prevents modification of
> the file when it's shared.
>
> In ashmem's case, both vma-naming and volatile ranges are trying to
> address how the needed features would be generically applied to tmpfs
> fds (and potentially to wider uses) - so together with something like
> shmgetfd they would provide all the functionality needed. I am not aware
> of any current plans for memfd's sealing to be similarly worked into a
> generic concept - the code hasn't even been submitted, so this is too
> early - but in any case, it's important to note that none of these plans
> for generic functionality have been merged or even received with much
> interest, so I do understand how a proposal for a new interface that
> only solves half of the needed infrastructure may not be particularly
> welcome.
>
> So while I do understand the difficulty of trying to create more generic
> interfaces rather than just creating a new chardev/ioctl interface to a
> more limited subset of functionality, I do think it's worth exploring if
> we can find a way to share infrastructure at some level (even if it's
> just due diligence to prove whether the more limited scope chardev/ioctl
> interfaces are widely agreed to be better).
>
> Anyway, I just wanted to submit this sketched out idea as food for
> thought to see if there was any objection or interest (I've got a draft
> patch I'll send out once I get a chance to test it). So let me know if
> you have any feedback or comments.

The reason "kdbus-memfd" exists is primarily the sealing.

We need a way to pass possibly large areas of memory from one process
to another, without requiring any trust relation between the two
processes; there cannot be an assumption about trusted vs. untrusted
or creator vs. consumer; all variations must be able to mix in all
combinations, and still be safe.
A sender of the message must be sure that the receiver cannot alter
the message, the same way the receiver must be sure that the sender
cannot alter the message content it just sent.

It would be nice if we can generalize the whole memfd logic, but the
shmem allocation facility alone, without the sealing function, cannot
replace kdbus-memfd.
We would need secure sealing right from the start for the kdbus use
case; other than that, there are no specific requirements from the
kdbus side.

Kay


* Re: [RFC] shmgetfd idea
  2014-01-28  1:37 [RFC] shmgetfd idea John Stultz
  2014-01-28  1:53 ` Kay Sievers
@ 2014-01-28  3:52 ` H. Peter Anvin
  2014-01-28 19:56   ` John Stultz
  2014-01-30  8:46 ` Christoph Hellwig
  2 siblings, 1 reply; 25+ messages in thread
From: H. Peter Anvin @ 2014-01-28  3:52 UTC (permalink / raw)
  To: John Stultz, linux-mm
  Cc: Greg KH, Kay Sievers, Android Kernel Team, Andrew Morton,
	Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel,
	Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli,
	Takahiro Akashi, Minchan Kim, Lennart Poettering

On 01/27/2014 05:37 PM, John Stultz wrote:
> 
> In the Android case, it's important to have this interface to atomically
> provide these unlinked tmpfs fds, because they'd like to avoid having
> tmpfs mounts that are writable by applications (since that creates a
> potential DoS on the system by applications writing random files that
> persist after the process has been killed). It also provides better
> life-cycle management for resources: since the fds never have named
> links in the filesystem, their resources are automatically cleaned up
> when the last process with the fd dies, and there are no potential races
> between create and unlink with processes being terminated, which avoids
> the need for cleanup management.
> 

What about if tmpfs could be restricted to only O_TMPFILE open()s?
This pretty much amounts to an option to prevent tmpfs from creating
new directory entries.
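
As a rough sketch of what this looks like from the application side,
the snippet below opens an unnamed file with O_TMPFILE (available since
Linux 3.11 on filesystems that support it, tmpfs among them). The
helper name open_unnamed is hypothetical, and the O_TMPFILE-only mount
restriction suggested above is not shown; that would be a separate
tmpfs option which is assumed here, not an existing one.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* exposes O_TMPFILE in <fcntl.h> on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an unnamed file of `size` bytes inside the
 * filesystem that holds `dir` (e.g. a tmpfs such as /dev/shm).
 * No directory entry is created, so there is no create/unlink race.
 * O_EXCL additionally prevents the file from being given a name later
 * via linkat(2).
 * Returns an fd, or -1 with errno set (for example when the filesystem
 * or kernel lacks O_TMPFILE support).
 */
int open_unnamed(const char *dir, off_t size)
{
	int fd = open(dir, O_TMPFILE | O_EXCL | O_RDWR | O_CLOEXEC, 0600);

	if (fd < 0)
		return -1;
	if (ftruncate(fd, size) < 0) {
		int saved = errno;	/* close() may overwrite errno */

		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}
```

Note that the caller still needs write permission on the directory it
names, which is why the mount-level restriction matters for the use
case described in the quoted text.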

	-hpa


* Re: [RFC] shmgetfd idea
  2014-01-28  1:53 ` Kay Sievers
@ 2014-01-28 19:47   ` John Stultz
  0 siblings, 0 replies; 25+ messages in thread
From: John Stultz @ 2014-01-28 19:47 UTC (permalink / raw)
  To: Kay Sievers
  Cc: linux-mm, Greg KH, Android Kernel Team, Andrew Morton,
	Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel,
	Michel Lespinasse, Johannes Weiner, H. Peter Anvin, Neil Brown,
	Andrea Arcangeli, Takahiro Akashi, Minchan Kim,
	Lennart Poettering

On 01/27/2014 05:53 PM, Kay Sievers wrote:
> On Tue, Jan 28, 2014 at 2:37 AM, John Stultz <john.stultz@linaro.org> wrote:
>> Anyway, I just wanted to submit this sketched out idea as food for
>> thought to see if there was any objection or interest (I've got a draft
>> patch I'll send out once I get a chance to test it). So let me know if
>> you have any feedback or comments.
> The reason "kdbus-memfd" exists is primarily the sealing.
[snip]
> It would be nice if we can generalize the whole memfd logic, but the
> shmem allocation facility alone, without the sealing function, cannot
> replace kdbus-memfd.

Yes. Quite understood. And I too hope to discuss how the sealing feature
could be generalized when the code is submitted for review. I just
figured I'd start here, so when that time comes we have a sketch for
what the rest of the parts that would be needed are.


> We would need secure sealing right from the start for the kdbus use
> case; other than that, there are no specific requirements from the
> kdbus side.

Thanks for the clarifications!

-john


* Re: [RFC] shmgetfd idea
  2014-01-28  3:52 ` H. Peter Anvin
@ 2014-01-28 19:56   ` John Stultz
  2014-01-28 20:37     ` H. Peter Anvin
  0 siblings, 1 reply; 25+ messages in thread
From: John Stultz @ 2014-01-28 19:56 UTC (permalink / raw)
  To: H. Peter Anvin, linux-mm
  Cc: Greg KH, Kay Sievers, Android Kernel Team, Andrew Morton,
	Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel,
	Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli,
	Takahiro Akashi, Minchan Kim, Lennart Poettering

On 01/27/2014 07:52 PM, H. Peter Anvin wrote:
> On 01/27/2014 05:37 PM, John Stultz wrote:
>> In the Android case, it's important to have this interface to atomically
>> provide these unlinked tmpfs fds, because they'd like to avoid having
>> tmpfs mounts that are writable by applications (since that creates a
>> potential DoS on the system by applications writing random files that
>> persist after the process has been killed). It also provides better
>> life-cycle management for resources: since the fds never have named
>> links in the filesystem, their resources are automatically cleaned up
>> when the last process with the fd dies, and there are no potential races
>> between create and unlink with processes being terminated, which avoids
>> the need for cleanup management.
>>
> What about if tmpfs could be restricted to only O_TMPFILE open()s?
>  This pretty much amounts to an option to prevent tmpfs from creating
> new directory entries.

Thanks for reminding me about O_TMPFILE.. I have it on my list to look
into how it could be used.

As for the O_TMPFILE only tmpfs option, it seems maybe a little clunky
to me, but possible. If others think this would be preferred over a new
syscall, I'll dig in deeper.

thanks
-john


* Re: [RFC] shmgetfd idea
  2014-01-28 19:56   ` John Stultz
@ 2014-01-28 20:37     ` H. Peter Anvin
  2014-01-28 20:58       ` John Stultz
  0 siblings, 1 reply; 25+ messages in thread
From: H. Peter Anvin @ 2014-01-28 20:37 UTC (permalink / raw)
  To: John Stultz, linux-mm
  Cc: Greg KH, Kay Sievers, Android Kernel Team, Andrew Morton,
	Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel,
	Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli,
	Takahiro Akashi, Minchan Kim, Lennart Poettering

On 01/28/2014 11:56 AM, John Stultz wrote:
> 
> Thanks for reminding me about O_TMPFILE.. I have it on my list to look
> into how it could be used.
> 
> As for the O_TMPFILE only tmpfs option, it seems maybe a little clunky
> to me, but possible. If others think this would be preferred over a new
> syscall, I'll dig in deeper.
> 

What is clunky about it?  It reuses an existing interface and still
points to the specific tmpfs instance that should be populated.

	-hpa



* Re: [RFC] shmgetfd idea
  2014-01-28 20:37     ` H. Peter Anvin
@ 2014-01-28 20:58       ` John Stultz
  2014-01-28 21:01         ` Kay Sievers
  0 siblings, 1 reply; 25+ messages in thread
From: John Stultz @ 2014-01-28 20:58 UTC (permalink / raw)
  To: H. Peter Anvin, linux-mm
  Cc: Greg KH, Kay Sievers, Android Kernel Team, Andrew Morton,
	Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel,
	Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli,
	Takahiro Akashi, Minchan Kim, Lennart Poettering

On 01/28/2014 12:37 PM, H. Peter Anvin wrote:
> On 01/28/2014 11:56 AM, John Stultz wrote:
>> Thanks for reminding me about O_TMPFILE.. I have it on my list to look
>> into how it could be used.
>>
>> As for the O_TMPFILE only tmpfs option, it seems maybe a little clunky
>> to me, but possible. If others think this would be preferred over a new
>> syscall, I'll dig in deeper.
>>
> What is clunky about it?  It reuses an existing interface and still
> points to the specific tmpfs instance that should be populated.

It would require a new mount point convention that userland would have to
standardize.  To me (and admittedly it's a taste thing), a new
O_TMPFILE-only tmpfs mount point seems to be a bigger interface
change from an application writer's perspective than a new syscall.

But maybe I'm misunderstanding your suggestion?

thanks
-john



* Re: [RFC] shmgetfd idea
  2014-01-28 20:58       ` John Stultz
@ 2014-01-28 21:01         ` Kay Sievers
  2014-01-28 21:05           ` John Stultz
  0 siblings, 1 reply; 25+ messages in thread
From: Kay Sievers @ 2014-01-28 21:01 UTC (permalink / raw)
  To: John Stultz
  Cc: H. Peter Anvin, linux-mm, Greg KH, Android Kernel Team,
	Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen,
	Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown,
	Andrea Arcangeli, Takahiro Akashi, Minchan Kim,
	Lennart Poettering

On Tue, Jan 28, 2014 at 9:58 PM, John Stultz <john.stultz@linaro.org> wrote:
> On 01/28/2014 12:37 PM, H. Peter Anvin wrote:
>> On 01/28/2014 11:56 AM, John Stultz wrote:
>>> Thanks for reminding me about O_TMPFILE.. I have it on my list to look
>>> into how it could be used.
>>>
>>> As for the O_TMPFILE only tmpfs option, it seems maybe a little clunky
>>> to me, but possible. If others think this would be preferred over a new
>>> syscall, I'll dig in deeper.
>>>
>> What is clunky about it?  It reuses an existing interface and still
>> points to the specific tmpfs instance that should be populated.
>
> It would require a new mount point convention that userland would have to
> standardize.  To me (and admittedly it's a taste thing), a new
> O_TMPFILE-only tmpfs mount point seems to be a bigger interface
> change from an application writer's perspective than a new syscall.
>
> But maybe I'm misunderstanding your suggestion?

General purpose Linux has /dev/shm/ for that already, which will not
go away anytime soon..

Kay


* Re: [RFC] shmgetfd idea
  2014-01-28 21:01         ` Kay Sievers
@ 2014-01-28 21:05           ` John Stultz
  2014-01-28 21:10             ` H. Peter Anvin
  2014-01-28 21:28             ` Kay Sievers
  0 siblings, 2 replies; 25+ messages in thread
From: John Stultz @ 2014-01-28 21:05 UTC (permalink / raw)
  To: Kay Sievers
  Cc: H. Peter Anvin, linux-mm, Greg KH, Android Kernel Team,
	Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen,
	Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown,
	Andrea Arcangeli, Takahiro Akashi, Minchan Kim,
	Lennart Poettering

On 01/28/2014 01:01 PM, Kay Sievers wrote:
> On Tue, Jan 28, 2014 at 9:58 PM, John Stultz <john.stultz@linaro.org> wrote:
>> On 01/28/2014 12:37 PM, H. Peter Anvin wrote:
>>> On 01/28/2014 11:56 AM, John Stultz wrote:
>>>> Thanks for reminding me about O_TMPFILE.. I have it on my list to look
>>>> into how it could be used.
>>>>
>>>> As for the O_TMPFILE only tmpfs option, it seems maybe a little clunky
>>>> to me, but possible. If others think this would be preferred over a new
>>>> syscall, I'll dig in deeper.
>>>>
>>> What is clunky about it?  It reuses an existing interface and still
>>> points to the specific tmpfs instance that should be populated.
>> It would require a new mount point convention that userland would have to
>> standardize.  To me (and admittedly it's a taste thing), a new
>> O_TMPFILE-only tmpfs mount point seems to be a bigger interface
>> change from an application writer's perspective than a new syscall.
>>
>> But maybe I'm misunderstanding your suggestion?
> General purpose Linux has /dev/shm/ for that already, which will not
> go away anytime soon..

Right, though making /dev/shm/ O_TMPFILE only would likely break things, no?

thanks
-john


* Re: [RFC] shmgetfd idea
  2014-01-28 21:05           ` John Stultz
@ 2014-01-28 21:10             ` H. Peter Anvin
  2014-01-28 21:54               ` John Stultz
  2014-01-28 21:28             ` Kay Sievers
  1 sibling, 1 reply; 25+ messages in thread
From: H. Peter Anvin @ 2014-01-28 21:10 UTC (permalink / raw)
  To: John Stultz, Kay Sievers
  Cc: linux-mm, Greg KH, Android Kernel Team, Andrew Morton,
	Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel,
	Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli,
	Takahiro Akashi, Minchan Kim, Lennart Poettering

On 01/28/2014 01:05 PM, John Stultz wrote:
>> General purpose Linux has /dev/shm/ for that already, which will not
>> go away anytime soon..
> 
> Right, though making /dev/shm/ O_TMPFILE only would likely break things, no?

If it isn't, then you already have a writable tmpfs, which is what you
said you didn't want.

	-hpa



* Re: [RFC] shmgetfd idea
  2014-01-28 21:05           ` John Stultz
  2014-01-28 21:10             ` H. Peter Anvin
@ 2014-01-28 21:28             ` Kay Sievers
  1 sibling, 0 replies; 25+ messages in thread
From: Kay Sievers @ 2014-01-28 21:28 UTC (permalink / raw)
  To: John Stultz
  Cc: H. Peter Anvin, linux-mm, Greg KH, Android Kernel Team,
	Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen,
	Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown,
	Andrea Arcangeli, Takahiro Akashi, Minchan Kim,
	Lennart Poettering

On Tue, Jan 28, 2014 at 10:05 PM, John Stultz <john.stultz@linaro.org> wrote:
> On 01/28/2014 01:01 PM, Kay Sievers wrote:
>> On Tue, Jan 28, 2014 at 9:58 PM, John Stultz <john.stultz@linaro.org> wrote:
>>> On 01/28/2014 12:37 PM, H. Peter Anvin wrote:
>>>> On 01/28/2014 11:56 AM, John Stultz wrote:
>>>>> Thanks for reminding me about O_TMPFILE.. I have it on my list to look
>>>>> into how it could be used.
>>>>>
>>>>> As for the O_TMPFILE only tmpfs option, it seems maybe a little clunky
>>>>> to me, but possible. If others think this would be preferred over a new
>>>>> syscall, I'll dig in deeper.
>>>>>
>>>> What is clunky about it?  It reuses an existing interface and still
>>>> points to the specific tmpfs instance that should be populated.
>>> It would require a new mount point convention that userland would have to
>>> standardize.  To me (and admittedly it's a taste thing), a new
>>> O_TMPFILE-only tmpfs mount point seems to be a bigger interface
>>> change from an application writer's perspective than a new syscall.
>>>
>>> But maybe I'm misunderstanding your suggestion?
>> General purpose Linux has /dev/shm/ for that already, which will not
>> go away anytime soon..
>
> Right, though making /dev/shm/ O_TMPFILE only would likely break things, no?

Right, general purpose Linux could not mount with that option without
expecting major breakage, see: man shm_overview. But a custom OS could
just define that, I guess.

The current /dev/shm/ semantics and the shm apis in general are a kind
of a broken idea from the very beginning.

Kay


* Re: [RFC] shmgetfd idea
  2014-01-28 21:10             ` H. Peter Anvin
@ 2014-01-28 21:54               ` John Stultz
  2014-01-28 22:14                 ` Kay Sievers
  0 siblings, 1 reply; 25+ messages in thread
From: John Stultz @ 2014-01-28 21:54 UTC (permalink / raw)
  To: H. Peter Anvin, Kay Sievers
  Cc: linux-mm, Greg KH, Android Kernel Team, Andrew Morton,
	Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel,
	Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli,
	Takahiro Akashi, Minchan Kim, Lennart Poettering

On 01/28/2014 01:10 PM, H. Peter Anvin wrote:
> On 01/28/2014 01:05 PM, John Stultz wrote:
>>> General purpose Linux has /dev/shm/ for that already, which will not
>>> go away anytime soon..
>> Right, though making /dev/shm/ O_TMPFILE only would likely break things, no?
> If it isn't, then you already have a writable tmpfs, which is what you
> said you didn't want.

Well, rather than finding a solution exclusively for Android, I'm trying
to find an approach that would work more generically.

While classic Linux systems do have writable /dev/shm/, which we *have*
to preserve, it seems to me that classic Linux systems may some day want
to deal with the issues with writable tmpfs that Android has
intentionally avoided.

For examples of grumblings on these issues see:
https://bugzilla.redhat.com/show_bug.cgi?id=693253 (and its dup)

Requiring a binary on/off flag for /dev/shm makes it so you have to
choose if you are a classic or new-style (android-like) system. By
avoiding re-using existing convention via providing a new syscall (or
alternatively with your approach, a new yet to be standardized mount
point convention), it would allow best practices to be updated, and
allow for a slow deprecation of the writable /dev/shm, possibly by
limiting permissions to /dev/shm to only legacy applications, etc.

But yes, alternatively classic systems may be able to get around the
issues via tmpfs quotas and convincing applications to use O_TMPFILE
there. But to me this seems less ideal than the Android approach, where
the lifecycle of the tmpfs fds is more limited and clear.

And my main point being: Android's ashmem and kdbus' memfds are
both utilizing these semantics (though maybe they aren't as
important/intentional for kdbus?), so it seems like some generic method
(which would work in both environments) would be generally useful.

Again, I really do appreciate your feedback here, and I don't mean to be
panning your idea (I'm quite willing to look further into it if others
think it's the right way)! I just want to explain my point of view and
motivations a bit better.

thanks!
-john




* Re: [RFC] shmgetfd idea
  2014-01-28 21:54               ` John Stultz
@ 2014-01-28 22:14                 ` Kay Sievers
  2014-01-28 23:02                   ` H. Peter Anvin
  2014-01-28 23:14                   ` John Stultz
  0 siblings, 2 replies; 25+ messages in thread
From: Kay Sievers @ 2014-01-28 22:14 UTC (permalink / raw)
  To: John Stultz
  Cc: H. Peter Anvin, linux-mm, Greg KH, Android Kernel Team,
	Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen,
	Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown,
	Andrea Arcangeli, Takahiro Akashi, Minchan Kim,
	Lennart Poettering

On Tue, Jan 28, 2014 at 10:54 PM, John Stultz <john.stultz@linaro.org> wrote:
> On 01/28/2014 01:10 PM, H. Peter Anvin wrote:
>> On 01/28/2014 01:05 PM, John Stultz wrote:
>>>> General purpose Linux has /dev/shm/ for that already, which will not
>>>> go away anytime soon..
>>> Right, though making /dev/shm/ O_TMPFILE only would likely break things, no?
>> If it isn't, then you already have a writable tmpfs, which is what you
>> said you didn't want.
>
> Well, rather than finding a solution exclusively for Android, I'm trying
> to find an approach that would work more generically.
>
> While classic Linux systems do have writable /dev/shm/, which we *have*
> to preserve, it seems to me that classic Linux systems may some day want
> to deal with the issues with writable tmpfs that Android has
> intentionally avoided.
>
> For examples of grumblings on these issues see:
> https://bugzilla.redhat.com/show_bug.cgi?id=693253 (and its dup)
>
> Requiring a binary on/off flag for /dev/shm makes it so you have to
> choose if you are a classic or new-style (android-like) system. By
> avoiding re-using existing convention via providing a new syscall (or
> alternatively with your approach, a new yet to be standardized mount
> point convention), it would allow best practices to be updated, and
> allow for a slow deprecation of the writable /dev/shm, possibly by
> limiting permissions to /dev/shm to only legacy applications, etc.
>
> But yes, alternatively classic systems may be able to get around the
> issues via tmpfs quotas and convincing applications to use O_TMPFILE
> there. But to me this seems less ideal than the Android approach, where
> the lifecycle of the tmpfs fds is more limited and clear.

Tmpfs supports no quota, it's all a huge hole and unsafe in that
regard on every system today. But ashmem and kdbus, as they are today,
are not better.

> And my main point being: Android's ashmem and kdbus' memfds are
> both utilizing these semantics (though maybe they aren't as
> important/intentional for kdbus?),

We need a way to securely identify an fd that is a memfd in the kernel
and in userspace, and we need to be able to seal it. The rest does not
really matter, we could use O_TMPFILE if we need to, but it still
lacks all the other features.

> so it seems like some generic method
> (which would work in both environments) would be generally useful.

Sure, would be nice. There are people from the Wayland and X camps who
asked for secure semantics and sharing of shmfds too.

> Again, I really do appreciate your feedback here, and I don't mean to be
> panning your idea (I'm quite willing to look further into it if others
> think it's the right way)! I just want to explain my point of view and
> motivations a bit better.

I think the most convincing option right now is a new memfd() syscall
or a character device.

We would need more than a create syscall for the sealing/unsealing;
I'm not sure whether fcntl() could be (mis-)used/extended for the
sealing interface.

A new character device with ioctls, replacing the current ashmem and
the kdbus memfd part, could also work. It has the advantage that it
would just be an optional device driver and not a primary API with
all the promises, and it would provide us with all we need. Only the
creation part, with the ioctl struct definitions involved, is not
really pretty.
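
For reference, the "create syscall plus fcntl() for sealing" shape can be
sketched with the calls that later shipped in Linux 3.17, memfd_create(2)
and the F_ADD_SEALS fcntl. That is not the interface proposed in this
mail, and it has no unseal operation; the helper name
make_sealed_memfd() is made up for illustration.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Create an unlinked, named shmem fd holding msg, then seal it so that
 * no holder of the fd can shrink, grow, or rewrite the contents.
 * Returns the fd, or -1 on error. (Hypothetical helper.)
 */
static int make_sealed_memfd(const char *name, const char *msg)
{
    size_t len = strlen(msg);
    int fd = memfd_create(name, MFD_CLOEXEC | MFD_ALLOW_SEALING);

    if (fd < 0)
        return -1;

    /* Fill the file first; once F_SEAL_WRITE is set, writes are refused. */
    if (write(fd, msg, len) != (ssize_t)len ||
        fcntl(fd, F_ADD_SEALS,
              F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE | F_SEAL_SEAL) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Once the helper returns, write() and ftruncate() on the fd fail, and any
process that receives the fd can check the seal mask with F_GET_SEALS
before trusting the contents.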

Kay

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] shmgetfd idea
  2014-01-28 22:14                 ` Kay Sievers
@ 2014-01-28 23:02                   ` H. Peter Anvin
  2014-01-28 23:14                     ` Kay Sievers
  2014-01-28 23:14                   ` John Stultz
  1 sibling, 1 reply; 25+ messages in thread
From: H. Peter Anvin @ 2014-01-28 23:02 UTC (permalink / raw)
  To: Kay Sievers, John Stultz
  Cc: linux-mm, Greg KH, Android Kernel Team, Andrew Morton,
	Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel,
	Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli,
	Takahiro Akashi, Minchan Kim, Lennart Poettering

On 01/28/2014 02:14 PM, Kay Sievers wrote:
>>
>> But yes, alternatively classic systems may be able to get around the
>> issues via tmpfs quotas and convincing applications to use O_TMPFILE
>> there. But to me this seems less ideal then the Android approach, where
>> the lifecycle of the tmpfs fds more limited and clear.
> 
> Tmpfs supports no quota, it's all a huge hole and unsafe in that
> regard on every system today. But ashmem and kdbus, as they are today,
> are not better.
> 

We can fix that aspect in tmpfs.  Creating new file objects outside of
filesystems really doesn't make things any better, since our toolbox
around this stuff largely revolves around filesystems.

	-hpa


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] shmgetfd idea
  2014-01-28 22:14                 ` Kay Sievers
  2014-01-28 23:02                   ` H. Peter Anvin
@ 2014-01-28 23:14                   ` John Stultz
  1 sibling, 0 replies; 25+ messages in thread
From: John Stultz @ 2014-01-28 23:14 UTC (permalink / raw)
  To: Kay Sievers
  Cc: H. Peter Anvin, linux-mm, Greg KH, Android Kernel Team,
	Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen,
	Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown,
	Andrea Arcangeli, Takahiro Akashi, Minchan Kim,
	Lennart Poettering

On 01/28/2014 02:14 PM, Kay Sievers wrote:
> On Tue, Jan 28, 2014 at 10:54 PM, John Stultz <john.stultz@linaro.org> wrote:
>> But yes, alternatively classic systems may be able to get around the
>> issues via tmpfs quotas and convincing applications to use O_TMPFILE
>> there. But to me this seems less ideal then the Android approach, where
>> the lifecycle of the tmpfs fds more limited and clear.
> Tmpfs supports no quota, it's all a huge hole and unsafe in that
> regard on every system today. But ashmem and kdbus, as they are today,
> are not better.

While it's true ashmem and kdbus currently have no limitation on the
amount of memory an application can consume via the unlinked tmpfs fds,
they both do have the benefit that those unlinked files are cleaned up
when the last user dies (or is killed).

While adding quota to these approaches would improve things, tmpfs quota
alone on writable tmpfs mounts only confines the DoS to the user (i.e.,
one bad application could fill up the user's tmpfs and quit; other
applications would then either fail to work or need some sort of logic
to figure out which tmpfs files could safely be cleaned up).

Other than this minor point, I think I'm in agreement with the other
points in your mail.

thanks
-john





^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] shmgetfd idea
  2014-01-28 23:02                   ` H. Peter Anvin
@ 2014-01-28 23:14                     ` Kay Sievers
  2014-01-28 23:19                       ` H. Peter Anvin
  0 siblings, 1 reply; 25+ messages in thread
From: Kay Sievers @ 2014-01-28 23:14 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: John Stultz, linux-mm, Greg KH, Android Kernel Team,
	Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen,
	Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown,
	Andrea Arcangeli, Takahiro Akashi, Minchan Kim,
	Lennart Poettering

On Wed, Jan 29, 2014 at 12:02 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 01/28/2014 02:14 PM, Kay Sievers wrote:
>>>
>>> But yes, alternatively classic systems may be able to get around the
>>> issues via tmpfs quotas and convincing applications to use O_TMPFILE
>>> there. But to me this seems less ideal then the Android approach, where
>>> the lifecycle of the tmpfs fds more limited and clear.
>>
>> Tmpfs supports no quota, it's all a huge hole and unsafe in that
>> regard on every system today. But ashmem and kdbus, as they are today,
>> are not better.
>
> We can fix that aspect in tmpfs.  Creating new file objcts outside of
> filesystems really doesn't make things any better, since our toolbox
> around this stuff largely revolves around filesystems.

Sure, it should be fixed, no doubt. Even outside this context, it's
something that we should have.

Back to the topic: let's say we require a tmpfs mount to get to an
unlinked shmemfd, which sounds acceptable if we can solve the other
features in a nice way.

What would be the interface for additional functionality like
sealing/unsealing that thing, so that no operation can destroy its
content as long as there is more than a single owner? Would that be a
new syscall, or fcntl() with specific shmemfd options?

We also need to make sure that the inode does not show up in
/proc/$PID/fd/, so that nothing can create a new file for it which we
don't catch with the "single owner" logic. Or could we determine the
"single owner" state from the inode itself?

Kay


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] shmgetfd idea
  2014-01-28 23:14                     ` Kay Sievers
@ 2014-01-28 23:19                       ` H. Peter Anvin
  2014-01-29  0:14                         ` Kay Sievers
  0 siblings, 1 reply; 25+ messages in thread
From: H. Peter Anvin @ 2014-01-28 23:19 UTC (permalink / raw)
  To: Kay Sievers
  Cc: John Stultz, linux-mm, Greg KH, Android Kernel Team,
	Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen,
	Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown,
	Andrea Arcangeli, Takahiro Akashi, Minchan Kim,
	Lennart Poettering

On 01/28/2014 03:14 PM, Kay Sievers wrote:
> 
> What would be the interface for additional functionality like
> sealing/unsealing that thing, that no operation can destruct its
> content as long as there is more than a single owner? That would be a
> new syscall or fcntl() with specific shmemfd options?
> 
> We also need to solve the problem that the inode does not show up in
> /proc/$PID/fd/, so that nothing can create a new file for it which we
> don't catch with the "single owner" logic. Or we could determine the
> "single owner" state from the inode itself?
> 

If the "single owner" is determined by the file structure (e.g. via an
fcntl as opposed to an ioctl), then presumably we would simply deny an
attempt to open the inode and create a new file structure for it.

On Linux, /proc/$PID/fd is an open as opposed to a dup (as much as I
personally don't like those semantics, they are well set in stone at
this point) so it satisfies your requirements.
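
A small sketch of the open-versus-dup distinction, assuming /proc is
mounted and /tmp is writable (the helper name is hypothetical): dup()
shares the file offset with the original descriptor, while opening
/proc/self/fd/N yields a new file structure with its own offset.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Returns 1 if reopening an fd through /proc/self/fd gives a new file
 * structure (independent offset) while dup() shares the original one,
 * 0 if not, and -1 on error. (Hypothetical helper.)
 */
static int proc_reopen_is_new_file(void)
{
    char tmpl[] = "/tmp/procfd-demo-XXXXXX";
    char path[64];
    int result = -1;

    int fd = mkstemp(tmpl);
    if (fd < 0)
        return -1;
    unlink(tmpl);                       /* file now lives only through fd */

    if (write(fd, "abcdef", 6) != 6) {  /* leaves fd's offset at 6 */
        close(fd);
        return -1;
    }

    int dupfd = dup(fd);
    snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
    int reopened = open(path, O_RDONLY);

    if (dupfd >= 0 && reopened >= 0) {
        off_t dup_off = lseek(dupfd, 0, SEEK_CUR);    /* shared offset */
        off_t new_off = lseek(reopened, 0, SEEK_CUR); /* fresh offset */
        result = (dup_off == 6 && new_off == 0);
    }

    if (dupfd >= 0)
        close(dupfd);
    if (reopened >= 0)
        close(reopened);
    close(fd);
    return result;
}
```

Because the /proc path goes through a real open, a check placed in the
open path would see it, which is the property relied on above.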

	-hpa



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] shmgetfd idea
  2014-01-28 23:19                       ` H. Peter Anvin
@ 2014-01-29  0:14                         ` Kay Sievers
  2014-01-29  0:20                           ` H. Peter Anvin
  0 siblings, 1 reply; 25+ messages in thread
From: Kay Sievers @ 2014-01-29  0:14 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: John Stultz, linux-mm, Greg KH, Android Kernel Team,
	Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen,
	Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown,
	Andrea Arcangeli, Takahiro Akashi, Minchan Kim,
	Lennart Poettering

On Wed, Jan 29, 2014 at 12:19 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 01/28/2014 03:14 PM, Kay Sievers wrote:
>>
>> What would be the interface for additional functionality like
>> sealing/unsealing that thing, that no operation can destruct its
>> content as long as there is more than a single owner? That would be a
>> new syscall or fcntl() with specific shmemfd options?
>>
>> We also need to solve the problem that the inode does not show up in
>> /proc/$PID/fd/, so that nothing can create a new file for it which we
>> don't catch with the "single owner" logic. Or we could determine the
>> "single owner" state from the inode itself?
>>
>
> If the "single owner" is determined by the file structure (e.g. via a
> fcntl as opposed to a ioctl), then presumably we would simply deny an
> attempt to open the inode and create a new file structure for it.
>
> On Linux, /proc/$PID/fd is an open as opposed to a dup (as much as I
> personally don't like those semantics, they are well set in stone at
> this point) so it satisfies your requirements.

If all of that could be made to work, for the kdbus case we would be
fine with requiring *any* tmpfs mount, creating a new memfd from there
with O_TMPFILE, and using new fcntl() definitions to
protect/seal/unseal and identify that fd.

For the more restricted cases like Android that tmpfs mount could get
a mount option to not allow the creation of any non-unlinked file, I
guess.
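
A minimal sketch of the O_TMPFILE route, assuming the given directory is
on a tmpfs mount (for example /dev/shm) and a kernel with O_TMPFILE
support (3.11 or later). The helper names are hypothetical, and the
sealing/identification fcntl()s discussed here are not shown because
they do not exist in this form.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Create an unlinked file of the given size in dir, which is assumed
 * to be a directory on a tmpfs mount. Returns the fd, or -1 on error.
 */
static int tmpfs_unlinked_fd(const char *dir, size_t size)
{
    /* O_TMPFILE takes the directory, not a file name. */
    int fd = open(dir, O_TMPFILE | O_RDWR | O_CLOEXEC, 0600);

    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Map the whole object shared, so every process holding the fd sees the same pages. */
static void *map_shared(int fd, size_t size)
{
    return mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}
```

The file never has a name in the mount, so it is freed when the last fd
and mapping go away; the fd can be passed to another process over a
Unix socket and mapped there with the same map_shared() call.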

Kay


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] shmgetfd idea
  2014-01-29  0:14                         ` Kay Sievers
@ 2014-01-29  0:20                           ` H. Peter Anvin
  2014-01-29  0:49                             ` Kay Sievers
  0 siblings, 1 reply; 25+ messages in thread
From: H. Peter Anvin @ 2014-01-29  0:20 UTC (permalink / raw)
  To: Kay Sievers
  Cc: John Stultz, linux-mm, Greg KH, Android Kernel Team,
	Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen,
	Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown,
	Andrea Arcangeli, Takahiro Akashi, Minchan Kim,
	Lennart Poettering

On 01/28/2014 04:14 PM, Kay Sievers wrote:
>>
>> If the "single owner" is determined by the file structure (e.g. via a
>> fcntl as opposed to a ioctl), then presumably we would simply deny an
>> attempt to open the inode and create a new file structure for it.
>>
>> On Linux, /proc/$PID/fd is an open as opposed to a dup (as much as I
>> personally don't like those semantics, they are well set in stone at
>> this point) so it satisfies your requirements.
> 
> If that all could be made working, for the kdbus case we would be fine
> with requiring *any* tmpfs mount, create a new memfd from there with
> O_TMPFILE, and use new fcntl() definitios to protect/seal/unseal and
> identify that fd.
> 
> For the more restricted cases like Android that tmpfs mount could get
> a mount option to not allow the creation of any non-unlinked file, I
> guess.
> 

Right, that would be the idea.

	-hpa



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] shmgetfd idea
  2014-01-29  0:20                           ` H. Peter Anvin
@ 2014-01-29  0:49                             ` Kay Sievers
  0 siblings, 0 replies; 25+ messages in thread
From: Kay Sievers @ 2014-01-29  0:49 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: John Stultz, linux-mm, Greg KH, Android Kernel Team,
	Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen,
	Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown,
	Andrea Arcangeli, Takahiro Akashi, Minchan Kim,
	Lennart Poettering

On Wed, Jan 29, 2014 at 1:20 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 01/28/2014 04:14 PM, Kay Sievers wrote:
>>>
>>> If the "single owner" is determined by the file structure (e.g. via a
>>> fcntl as opposed to a ioctl), then presumably we would simply deny an
>>> attempt to open the inode and create a new file structure for it.
>>>
>>> On Linux, /proc/$PID/fd is an open as opposed to a dup (as much as I
>>> personally don't like those semantics, they are well set in stone at
>>> this point) so it satisfies your requirements.
>>
>> If that all could be made working, for the kdbus case we would be fine
>> with requiring *any* tmpfs mount, create a new memfd from there with
>> O_TMPFILE, and use new fcntl() definitios to protect/seal/unseal and
>> identify that fd.
>>
>> For the more restricted cases like Android that tmpfs mount could get
>> a mount option to not allow the creation of any non-unlinked file, I
>> guess.
>>
>
> Right, that would be the idea.

I like your idea. Sounds worth trying, if you think we can make the
protection/sealing work without too many ugly workarounds.

With the filesystem as a "domain" / the root for all the unlinked
shmem files, we could even mount a separate tmpfs for every logged-in
user, and put the quota on the user that way.
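
As a rough sketch of the per-user mount idea, assuming the caller has
CAP_SYS_ADMIN and the mount point already exists. The helper names and
the choice of options are illustrative only; tmpfs's size= option caps
the whole mount, which acts as the per-user limit when each user gets
a separate mount.

```c
#include <stdio.h>
#include <sys/mount.h>
#include <sys/types.h>

/*
 * Build tmpfs mount options: a size cap plus ownership by the user.
 * Returns the snprintf() result. (Hypothetical helper.)
 */
static int user_tmpfs_opts(char *buf, size_t len, uid_t uid, size_t max_bytes)
{
    return snprintf(buf, len, "size=%zu,uid=%u,mode=0700",
                    max_bytes, (unsigned)uid);
}

/* Mount a private, size-limited tmpfs for one user at target. */
static int mount_user_tmpfs(const char *target, uid_t uid, size_t max_bytes)
{
    char opts[96];
    int n = user_tmpfs_opts(opts, sizeof(opts), uid, max_bytes);

    if (n < 0 || (size_t)n >= sizeof(opts))
        return -1;
    return mount("tmpfs", target, "tmpfs", MS_NOSUID | MS_NODEV, opts);
}
```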

It will still not solve the /dev/shm/ or /tmp quota problem, but it
would at least not get bigger with every new shmem user we invent. :)

Kay


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] shmgetfd idea
  2014-01-28  1:37 [RFC] shmgetfd idea John Stultz
  2014-01-28  1:53 ` Kay Sievers
  2014-01-28  3:52 ` H. Peter Anvin
@ 2014-01-30  8:46 ` Christoph Hellwig
  2014-01-30 16:02   ` Kay Sievers
  2 siblings, 1 reply; 25+ messages in thread
From: Christoph Hellwig @ 2014-01-30  8:46 UTC (permalink / raw)
  To: John Stultz
  Cc: linux-mm, Greg KH, Kay Sievers, Android Kernel Team,
	Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen,
	Rik van Riel, Michel Lespinasse, Johannes Weiner, H. Peter Anvin,
	Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim,
	Lennart Poettering

On Mon, Jan 27, 2014 at 05:37:04PM -0800, John Stultz wrote:
> In working with ashmem and looking briefly at kdbus' memfd ideas,
> there's a commonality that both basically act as a method to provide
> applications with unlinked tmpfs/shmem fds.

Just use O_TMPFILE on a tmpfs file and you're done.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] shmgetfd idea
  2014-01-30  8:46 ` Christoph Hellwig
@ 2014-01-30 16:02   ` Kay Sievers
  2014-01-30 21:42     ` John Stultz
  2014-02-03 15:03     ` Christoph Hellwig
  0 siblings, 2 replies; 25+ messages in thread
From: Kay Sievers @ 2014-01-30 16:02 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: John Stultz, linux-mm, Greg KH, Android Kernel Team,
	Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen,
	Rik van Riel, Michel Lespinasse, Johannes Weiner, H. Peter Anvin,
	Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim,
	Lennart Poettering

On Thu, Jan 30, 2014 at 9:46 AM, Christoph Hellwig <hch@infradead.org> wrote:
> On Mon, Jan 27, 2014 at 05:37:04PM -0800, John Stultz wrote:
>> In working with ashmem and looking briefly at kdbus' memfd ideas,
>> there's a commonality that both basically act as a method to provide
>> applications with unlinked tmpfs/shmem fds.
>
> Just use O_TMPFILE on a tmpfs file and you're done.

Ashmem and kdbus can name the deleted files, which is useful for
debugging and tools to show the associated name for the file
descriptor. They also show up in /proc/$PID/maps/ and possibly in
/proc/$PID/fd/.

O_TMPFILE always creates files with just the name "/". Unless that is
changed we wouldn't want to switch over to O_TMPFILE, because we would
lose that nice feature.
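
To see what name the kernel reports for such a file, one can read the
/proc/self/fd symlink. A sketch, assuming /proc is mounted and dir is on
a filesystem that supports O_TMPFILE; the helper name is hypothetical,
and the exact string returned depends on the kernel version.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Create an O_TMPFILE file in dir and copy the name the kernel shows
 * for it in /proc/self/fd into buf (NUL-terminated).
 * Returns the name length, or -1 on error. (Hypothetical helper.)
 */
static ssize_t tmpfile_display_name(const char *dir, char *buf, size_t buflen)
{
    char link[64];
    ssize_t n;
    int fd;

    if (buflen == 0)
        return -1;

    fd = open(dir, O_TMPFILE | O_RDWR | O_CLOEXEC, 0600);
    if (fd < 0)
        return -1;

    snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);
    n = readlink(link, buf, buflen - 1);   /* readlink does not terminate */
    if (n >= 0)
        buf[n] = '\0';

    close(fd);
    return n;
}
```

The caller cannot choose any part of that string, since O_TMPFILE only
takes the directory; that is the gap compared with the names ashmem and
kdbus attach to their files.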

Is there a way to "fix" O_TMPFILE to accept the name of the file to
be created, instead of insisting on taking only the leading directory
as the argument?

Kay


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] shmgetfd idea
  2014-01-30 16:02   ` Kay Sievers
@ 2014-01-30 21:42     ` John Stultz
  2014-01-31  0:01       ` Kay Sievers
  2014-02-03 15:03     ` Christoph Hellwig
  1 sibling, 1 reply; 25+ messages in thread
From: John Stultz @ 2014-01-30 21:42 UTC (permalink / raw)
  To: Kay Sievers, Christoph Hellwig
  Cc: linux-mm, Greg KH, Android Kernel Team, Andrew Morton,
	Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel,
	Michel Lespinasse, Johannes Weiner, H. Peter Anvin, Neil Brown,
	Andrea Arcangeli, Takahiro Akashi, Minchan Kim,
	Lennart Poettering

On 01/30/2014 08:02 AM, Kay Sievers wrote:
> On Thu, Jan 30, 2014 at 9:46 AM, Christoph Hellwig <hch@infradead.org> wrote:
>> On Mon, Jan 27, 2014 at 05:37:04PM -0800, John Stultz wrote:
>>> In working with ashmem and looking briefly at kdbus' memfd ideas,
>>> there's a commonality that both basically act as a method to provide
>>> applications with unlinked tmpfs/shmem fds.
>> Just use O_TMPFILE on a tmpfs file and you're done.
> Ashmem and kdbus can name the deleted files, which is useful for
> debugging and tools to show the associated name for the file
> descriptor. They also show up in /proc/$PID/maps/ and possibly in
> /proc/$PID/fd/.
>
> O_TMPFILE always creates files with just the name "/". Unless that is
> changed we wouldn't want switch over to O_TMPFILE, because we would
> lose that nice feature.

Not sure, but would Colin's vma-naming patch (or something like it) help
address this?
https://lkml.org/lkml/2013/10/30/518

thanks
-john


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] shmgetfd idea
  2014-01-30 21:42     ` John Stultz
@ 2014-01-31  0:01       ` Kay Sievers
  0 siblings, 0 replies; 25+ messages in thread
From: Kay Sievers @ 2014-01-31  0:01 UTC (permalink / raw)
  To: John Stultz
  Cc: Christoph Hellwig, linux-mm, Greg KH, Android Kernel Team,
	Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen,
	Rik van Riel, Michel Lespinasse, Johannes Weiner, H. Peter Anvin,
	Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim,
	Lennart Poettering

On Thu, Jan 30, 2014 at 10:42 PM, John Stultz <john.stultz@linaro.org> wrote:
> On 01/30/2014 08:02 AM, Kay Sievers wrote:
>> On Thu, Jan 30, 2014 at 9:46 AM, Christoph Hellwig <hch@infradead.org> wrote:
>>> On Mon, Jan 27, 2014 at 05:37:04PM -0800, John Stultz wrote:
>>>> In working with ashmem and looking briefly at kdbus' memfd ideas,
>>>> there's a commonality that both basically act as a method to provide
>>>> applications with unlinked tmpfs/shmem fds.
>>> Just use O_TMPFILE on a tmpfs file and you're done.
>> Ashmem and kdbus can name the deleted files, which is useful for
>> debugging and tools to show the associated name for the file
>> descriptor. They also show up in /proc/$PID/maps/ and possibly in
>> /proc/$PID/fd/.
>>
>> O_TMPFILE always creates files with just the name "/". Unless that is
>> changed we wouldn't want switch over to O_TMPFILE, because we would
>> lose that nice feature.
>
> Not sure, but would Colin's vma-naming patch (or something like it) help
> address this?
> https://lkml.org/lkml/2013/10/30/518

Hmm, I don't think so. That seems to be about anonymous memory only,
but shmem files are not anonymous.

We really just want actual file names, and so does ashmem, the same
way shmem_file_setup() accepts a name for the unlinked file it creates.

Kay


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] shmgetfd idea
  2014-01-30 16:02   ` Kay Sievers
  2014-01-30 21:42     ` John Stultz
@ 2014-02-03 15:03     ` Christoph Hellwig
  1 sibling, 0 replies; 25+ messages in thread
From: Christoph Hellwig @ 2014-02-03 15:03 UTC (permalink / raw)
  To: Kay Sievers
  Cc: Christoph Hellwig, John Stultz, linux-mm, Greg KH,
	Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins,
	Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner,
	H. Peter Anvin, Neil Brown, Andrea Arcangeli, Takahiro Akashi,
	Minchan Kim, Lennart Poettering, Al Viro, Michael Kerrisk

On Thu, Jan 30, 2014 at 05:02:40PM +0100, Kay Sievers wrote:
> Ashmem and kdbus can name the deleted files, which is useful for
> debugging and tools to show the associated name for the file
> descriptor. They also show up in /proc/$PID/maps/ and possibly in
> /proc/$PID/fd/.
> 
> O_TMPFILE always creates files with just the name "/". Unless that is
> changed we wouldn't want switch over to O_TMPFILE, because we would
> lose that nice feature.
> 
> Is there are way to "fix" O_TMPFILE to accept the name of the file to
> be created, instead of insisting to take only the leading directory as
> the argument?

As far as the VFS is concerned this should be fairly easily doable,
we'd just have to switch O_TMPFILE to the same lookup-parent-first
algorithm used for O_CREAT.  The filesystems shouldn't really care
at all as the name will never be stored on disk.

In fact, such a full-path O_TMPFILE would be so much nicer than the
current one, since its arguments are closer to those of a normal
O_CREAT open, that I would document it as the default one, even if the
old semantics still had to be supported for backwards compatibility.


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2014-02-03 15:03 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-28  1:37 [RFC] shmgetfd idea John Stultz
2014-01-28  1:53 ` Kay Sievers
2014-01-28 19:47   ` John Stultz
2014-01-28  3:52 ` H. Peter Anvin
2014-01-28 19:56   ` John Stultz
2014-01-28 20:37     ` H. Peter Anvin
2014-01-28 20:58       ` John Stultz
2014-01-28 21:01         ` Kay Sievers
2014-01-28 21:05           ` John Stultz
2014-01-28 21:10             ` H. Peter Anvin
2014-01-28 21:54               ` John Stultz
2014-01-28 22:14                 ` Kay Sievers
2014-01-28 23:02                   ` H. Peter Anvin
2014-01-28 23:14                     ` Kay Sievers
2014-01-28 23:19                       ` H. Peter Anvin
2014-01-29  0:14                         ` Kay Sievers
2014-01-29  0:20                           ` H. Peter Anvin
2014-01-29  0:49                             ` Kay Sievers
2014-01-28 23:14                   ` John Stultz
2014-01-28 21:28             ` Kay Sievers
2014-01-30  8:46 ` Christoph Hellwig
2014-01-30 16:02   ` Kay Sievers
2014-01-30 21:42     ` John Stultz
2014-01-31  0:01       ` Kay Sievers
2014-02-03 15:03     ` Christoph Hellwig
