* Feature request, "create on mount" to create mount point directory on mount, implied remove on unmount
@ 2015-07-03 12:01 jon
  2015-07-04 20:56 ` Valdis.Kletnieks
  0 siblings, 1 reply; 12+ messages in thread
From: jon @ 2015-07-03 12:01 UTC (permalink / raw)
  To: coreutils; +Cc: linux-kernel

Hi, could I make a hugely naive user request :-)

Would it be possible to add a new mount option to everything?

New mount option 'com' = "create on mount" (implied remove on unmount).


Example fstab entry
/mounts/amountpoint	LABEL=notalwayshere	ext4,com


# ls /mounts
# mount /mounts/amountpoint
# ls /mounts
amountpoint
# umount /mounts/amountpoint
# ls /mounts
# 


The idea is to create a mount point directory (one level only) if it does
not exist when an FS is mounted; umount would remove it when the FS is
unmounted (assuming it was empty), otherwise generate a warning.

As the 'com' flag would need to be carried with the mount, I assume the
logic would have to be handled in the mount() and umount() calls themselves?

I can see issues if the mount point directory is read only or similar,
but I am sure most cases could be handled with just a warning.
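
For illustration only, the requested semantics can be approximated in user space. The sketch below is not the kernel-side option being asked for; the function names and the do_mount/do_umount callables are hypothetical stand-ins for whatever actually performs the mount, so the create-and-remove logic can be read on its own.

```python
import os


def mount_with_com(target, do_mount):
    """Create `target` (one level only) if missing, then call do_mount(target).

    Returns True if this call created the directory. `do_mount` is a
    hypothetical callable standing in for the real mount step.
    """
    created = False
    try:
        os.mkdir(target)      # one level only: the parent must already exist
        created = True
    except FileExistsError:
        pass                  # pre-existing mount point: leave it alone
    try:
        do_mount(target)
    except Exception:
        if created:
            os.rmdir(target)  # failed mount: do not leave a stray directory
        raise
    return created


def umount_with_com(target, do_umount):
    """Call do_umount(target), then try to remove the mount point directory.

    Returns None on success, or a warning string if rmdir failed.
    """
    do_umount(target)
    try:
        os.rmdir(target)
    except OSError as exc:    # not empty, read-only parent, and so on
        return f"warning: could not remove {target}: {exc.strerror}"
    return None
```

With a real system, do_mount could be something like a function that runs "mount <target>" via subprocess so the fstab entry is used; that wiring is left out here because it needs root.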

Many thanks,
Jon




* Re: Feature request, "create on mount" to create mount point directory on mount, implied remove on unmount
  2015-07-03 12:01 Feature request, "create on mount" to create mount point directory on mount, implied remove on unmount jon
@ 2015-07-04 20:56 ` Valdis.Kletnieks
  2015-07-04 22:48   ` jon
  0 siblings, 1 reply; 12+ messages in thread
From: Valdis.Kletnieks @ 2015-07-04 20:56 UTC (permalink / raw)
  To: jon; +Cc: coreutils, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 600 bytes --]

On Fri, 03 Jul 2015 13:01:59 +0100, jon said:
> Hi, could I make a hugely naive user request :-)
>
> Would it be possible to add a new mount option to everything?
>
> New mount option 'com' = "create on mount" (implied remove on unmount).
>
>
> Example fstab entry
> /mounts/amountpoint	LABEL=notalwayshere	ext4,com

I'll bite.  What system administration problem does this solve?

In particular, automount has been around in one form or another *at least*
since SunOS3.2 in the mid 80's, and I have seen it work with huge user maps
(10k+ users). How did it cope for 30+ years without this feature?


[-- Attachment #2: Type: application/pgp-signature, Size: 848 bytes --]


* Re: Feature request, "create on mount" to create mount point directory on mount, implied remove on unmount
  2015-07-04 20:56 ` Valdis.Kletnieks
@ 2015-07-04 22:48   ` jon
  2015-07-05 14:29     ` Al Viro
  2015-07-15 14:38     ` Karel Zak
  0 siblings, 2 replies; 12+ messages in thread
From: jon @ 2015-07-04 22:48 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: coreutils, linux-kernel

On Sat, 2015-07-04 at 16:56 -0400, Valdis.Kletnieks@vt.edu wrote:
> On Fri, 03 Jul 2015 13:01:59 +0100, jon said:
> > Hi, could I make a hugely naive user request :-)
> >
> > Would it be possible to add a new mount option to everything?
> >
> > New mount option 'com' = "create on mount" (implied remove on unmount).
> >
> >
> > Example fstab entry
> > /mounts/amountpoint	LABEL=notalwayshere	ext4,com
> 
> I'll bite.
Thank you :-)

>   What system administration problem does this solve?
Most auto mounters are based on a system event being processed by a
user space tool. I am talking about a simple optional bit of code built
into the mount() and umount() calls.  So it could be an fstab entry, or typed
into a shell or any other source rather than an event processed through
user space tools in response to USB or a DVD read or similar. I am not
talking about an automount.

It solves these problems:
1) It solves the problem of processes writing data into the mount point
when not mounted (as does, I accept, a user space automounter, but as I
explained the usage scenario differs).

2) It would be useful for embedded devices, installers etc.  I do quite
a bit of work in the embedded space, sometimes running kernel+shell+user
process only, sometimes no udev, no systemd, not even full fat init.  

3) installers or similar could use such an option for mounting install
data. By specifying the flag user space processes can infer that the FS
is successfully mounted by the presence of the mount point without the
need to explicitly code against an event system or parse log files. 

4) Users can use it to get slightly improved mount behaviour, and it could
hopefully also be used as a flag to indicate that "oh so clever user
space managers" should stay away from entries using it in fstab.

> In particular, automount has been around in one form or another *at least*
> since SunOS3.2 in the mid 80's, and I have seen it work with huge user maps
> (10k+ users). 
Yes, but like I say automount is normally based on an event. I am simply
talking about a flag/switch that can be used for optional implied
mkdir,rmdir around calls to mount() umount() - nothing more, nothing
less !

Such a feature would mitigate the justification for some of the less
sensible behaviours of systemd or similar user space event processors.

By adding it as an option it would not break other behaviours where it
was not explicitly enabled.

To be completely clear, I am not after a kernel based auto mounter -
just a kernel based mount point creator/remover, it is not quite the
same thing !

>How did it cope for 30+ years without this feature?
By people saying "ahh bugger" when a mount fails and some process craps
out files all over the mount point directory I expect ......

<partial rant>
Or maybe by the new "improved" systemd way of failing to go multi user
when a device referenced in fstab is offline, fucking things up for
anyone without true remote administration - the price you pay for using
cheap PC hardware on a small scale, you know, like Linux cheerleaders
always claimed was its unique selling point ......
As a bonus I would hope that systemd would take such a mount option to
mean that an FS specified in fstab would be optional, but that would
just be a bonus and is not my justification for such a feature.
</rant>

I can probably think of other reasons, but off the top of my head it
just seems a useful behaviour to have as an option IMHO.

I know my suggestion is not as fashionable as hanging user space code
from the kernel events, but I personally would rather just have the
option natively in the kernel rather than an option for some 'pre' or
'post' mount() umount() event that some user space process needs to
handle.

Thanks,
Jon




* Re: Feature request, "create on mount" to create mount point directory on mount, implied remove on unmount
  2015-07-04 22:48   ` jon
@ 2015-07-05 14:29     ` Al Viro
  2015-07-05 15:46       ` jon
  2015-07-15 14:38     ` Karel Zak
  1 sibling, 1 reply; 12+ messages in thread
From: Al Viro @ 2015-07-05 14:29 UTC (permalink / raw)
  To: jon; +Cc: Valdis.Kletnieks, coreutils, linux-kernel

On Sat, Jul 04, 2015 at 11:48:28PM +0100, jon wrote:

> Yes, but like I say automount is normally based on an event. I am simply
> talking about a flag/switch that can be used for optional implied
> mkdir,rmdir around calls to mount() umount() - nothing more, nothing
> less !

umount(2) is not the only way for a mount to be detached from a mountpoint.
There's exit(2) as well - when the last process in a namespace exits, it
gets dissolved.  What should happen upon those?  Even more interesting question
is what should happen if you do such mount, then clone a process into a new
namespace and have it exit.  Should _that_ rmdir the hell out of that
mountpoint (presumably detaching everything mounted on it in all namespaces)?

What should happen when a process in a new namespace decides to unmount that
thing, because they don't want it visible?  Should that take out the instance
in the parent namespace?


* Re: Feature request, "create on mount" to create mount point directory on mount, implied remove on unmount
  2015-07-05 14:29     ` Al Viro
@ 2015-07-05 15:46       ` jon
  2015-07-05 17:39         ` Al Viro
  0 siblings, 1 reply; 12+ messages in thread
From: jon @ 2015-07-05 15:46 UTC (permalink / raw)
  To: Al Viro; +Cc: Valdis.Kletnieks, coreutils, linux-kernel

On Sun, 2015-07-05 at 15:29 +0100, Al Viro wrote:
> On Sat, Jul 04, 2015 at 11:48:28PM +0100, jon wrote:
> 
> > Yes, but like I say automount is normally based on an event. I am simply
> > talking about a flag/switch that can be used for optional implied
> > mkdir,rmdir around calls to mount() umount() - nothing more, nothing
> > less !
> 
> umount(2) is not the only way for a mount to be detached from a mountpoint.
> There's exit(2) as well - when the last process in a namespace exits, it
> gets dissolved.  What should happen upon those?  Even more interesting question
> is what should happen if you do such mount, then clone a process into a new
> namespace and have it exit.  Should _that_ rmdir the hell out of that
> mountpoint (presumably detaching everything mounted on it in all namespaces)?
> 

I should have titled it "Feature request from a simple minded user"

I have not the slightest idea what you are talking about.  

When I learnt *nix it did not have "name spaces" in reference to process
tables.  I understand the theory of VM a bit, the model in my mind each
"machine", be that one kernel on a true processor or a VM instance has
"a process table" and "a file descriptor table" etc - anything more is
beyond my current level of knowledge.

Containers for example are something I don't understand in two ways. I
don't truly understand the theory, I also don't understand why in a world
of true VM someone would want to make something as complex as linux
even more complex using containers for what seems little or no benefit.





* Re: Feature request, "create on mount" to create mount point directory on mount, implied remove on unmount
  2015-07-05 15:46       ` jon
@ 2015-07-05 17:39         ` Al Viro
  2015-07-05 23:35           ` jon
  0 siblings, 1 reply; 12+ messages in thread
From: Al Viro @ 2015-07-05 17:39 UTC (permalink / raw)
  To: jon; +Cc: Valdis.Kletnieks, coreutils, linux-kernel

On Sun, Jul 05, 2015 at 04:46:50PM +0100, jon wrote:

> I should have titled it "Feature request from a simple minded user"
> 
> I have not the slightest idea what you are talking about.  
> 
> When I learnt *nix it did not have "name spaces" in reference to process
> tables.  I understand the theory of VM a bit, the model in my mind each
> "machine", be that one kernel on a true processor or a VM instance has
> "a process table" and "a file descriptor table" etc - anything more is
> beyond my current level of knowledge.

File descriptor table isn't something system-wide - it belongs to a process...

Containers are basically glorified process groups.

Anyway, the underlying model hasn't changed much since _way_ back; each
thread of execution is a virtual machine of its own, with actual CPUs
switched between those.  Each of them has memory, ports (== file descriptors)
and traps (== signal handlers).  The main primitives are
	clone() (== rfork() in other branches; plain fork() is just the most
common case) - create a copy of the virtual machine, in the state identical
to that of caller with the exception of different return values given to
child and parent.
	exit() - terminate the virtual machine
	execve() - load a new program
Parts of those virtual machines can be shared - e.g. you can have descriptor
table not just identical to that of parent at the time of clone(), but
actually shared with it, so e.g. open() in child makes the resulting descriptor
visible to parent as well.  Or you can have memory (address space) shared,
so that something like mmap() in parent would affect the memory mappings of
child, etc.  Which components are to be shared and which - copied is selected
by clone() argument.
	unshare() allows to switch to using a private copy of chosen components
- e.g. you might say "from now on, I want my file descriptor table to be
private".  In e.g. Plan 9 that's expressed via rfork() as well.

Less obvious components include current directory and root.  Normally, these
are not shared; chdir() done in child won't affect the parent and vice versa.
You could ask them to be shared, though - for multithreaded program it could
be convenient.
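
A small sketch of the default (unshared) case, in Python for brevity; the function name is made up for illustration. A fork()ed child changes its own current directory and the parent's stays where it was.

```python
import os


def parent_cwd_after_child_chdir(new_dir="/"):
    """Return the parent's cwd before and after a forked child calls chdir()."""
    before = os.getcwd()
    pid = os.fork()
    if pid == 0:
        try:
            os.chdir(new_dir)  # affects only the child's current directory
        finally:
            os._exit(0)        # leave the child without running parent code
    os.waitpid(pid, 0)
    return before, os.getcwd()
```

Both returned values should be equal, since plain fork() does not ask for the current directory to be shared.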

Different processes might see different parts of the mount tree since v7 had
introduced chroot(2).  Namespaces simply allow to have a *forest* - different
groups of processes seeing different mount trees in that forest.  The same
filesystem may be mounted in many places, and the same directory might be
a mountpoint in an instance visible to one process and not a mountpoint
in an instance visible to another (or a mountpoint with something entirely
different mounted in an instance visible to somebody else).

Mount tree is yet another component; the difference is that normally it *is*
shared on clone(), rather than being copied.  I.e. mount() done by child
affects the mount tree visible to parent.   But you still can ask for
a new private copy of mount tree via clone() or unshare().  When the
last process sharing that mount tree exits, it gets dissolved, same as
every file descriptor in a descriptor table gets closed when the last
thread sharing that descriptor table exits (or asks for unshared copy of
descriptor table, e.g. as a side effect of execve()).  Just as with
file descriptors, where close() does not necessarily close the opened file
the descriptor is connected to (that happens only when all descriptors
connected to the given opened file are closed), umount() does not necessarily
shut the filesystem down; that happens only if it's not mounted elsewhere.
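
The descriptor half of that analogy can be sketched with a pipe, in Python for brevity (the function name is made up for illustration). Closing one of two descriptors leaves the opened file alive; only closing the last one really closes it, which the reader observes as end-of-file.

```python
import os


def close_one_of_two_descriptors():
    """Show that close() only closes the opened file with the last descriptor."""
    r, w = os.pipe()
    w2 = os.dup(w)           # second descriptor for the same opened file
    os.close(w)              # the write end stays open through w2
    os.write(w2, b"still open")
    first = os.read(r, 64)   # data written after the first close() arrives
    os.close(w2)             # last descriptor gone: write end really closes
    second = os.read(r, 64)  # empty bytes object signals end-of-file
    os.close(r)
    return first, second
```

The umount() side behaves analogously according to the text above, but demonstrating it needs root and a spare filesystem, so it is not attempted here.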

With something like Plan 9 that would be pretty much all you need for
isolating process groups into separate environments - just give each
the set of filesystems they should be seeing and be done with that.
We, unfortunately, can't drop certain FPOS APIs (starting with sockets,
with their "network interfaces are magical sets of named objects, names
are not expressed as pathnames, access control and visibility completely
ad-hoc, ditto for listing and renaming" shite), so we get more
state components ;-/  Which leads to e.g. "network namespace" and similar
complications; that crap should've been dealt with in _filesystem_ namespace,
but Occam Razor be damned, we need to support every misdesigned interface
that got there, no matter how many entities it breeds and how convoluted
the result becomes...  In principle, though, it's still the same model -
only with more components to be possibly shared.


* Re: Feature request, "create on mount" to create mount point directory on mount, implied remove on unmount
  2015-07-05 17:39         ` Al Viro
@ 2015-07-05 23:35           ` jon
  2015-07-06  1:08             ` Al Viro
  0 siblings, 1 reply; 12+ messages in thread
From: jon @ 2015-07-05 23:35 UTC (permalink / raw)
  To: Al Viro; +Cc: Valdis.Kletnieks, coreutils, linux-kernel

On Sun, 2015-07-05 at 18:39 +0100, Al Viro wrote:
> On Sun, Jul 05, 2015 at 04:46:50PM +0100, jon wrote:
> 
> > I should have titled it "Feature request from a simple minded user"
> > 
> > I have not the slightest idea what you are talking about.  
> > 
> > When I learnt *nix it did not have "name spaces" in reference to process
> > tables.  I understand the theory of VM a bit, the model in my mind each
> > "machine", be that one kernel on a true processor or a VM instance has
> > "a process table" and "a file descriptor table" etc - anything more is
> > beyond my current level of knowledge.
> 
> File descriptor table isn't something system-wide - it belongs to a process...
Ok, true... I guess it is not DOS or CP/M ;-)

> 
> Containers are basically glorified process groups.
> 
> Anyway, the underlying model hasn't changed much since _way_ back; each
> thread of execution is a virtual machine of its own, with actual CPUs
> switched between those.
Ok, not sure I quite follow. What do you mean virtual machine  ? 
My understanding was that a true VM has a hypervisor and I thought also
required some extra processor instructions to basically do an "outer"
context switch (and some memory fiddling to fake up unique address
spaces) while the operating system's own scheduler within the VM is
doing the "inner" context switch (IE push/pop all on Intel style CPU).
Not all architectures have any VM capability. 
Are you talking about kernels on Intel with SMP enabled only ? 

>   Each of them has memory, ports (== file descriptors)
> and traps (== signal handlers).  The main primitives are
> 	clone() (== rfork() in other branches; plain fork() is just the most
> common case) - create a copy of the virtual machine, in the state identical
> to that of caller with the exception of different return values given to
> child and parent.
> 	exit() - terminate the virtual machine
> 	execve() - load a new program
Ok, I think I follow that.

> Parts of those virtual machines can be shared - e.g. you can have descriptor
> table not just identical to that of parent at the time of clone(), but
> actually shared with it, so e.g. open() in child makes the resulting descriptor
> visible to parent as well.
Ok, I follow you. I often don't need anything more complex than fork(),
when I thread I use pthreads so have not dug around into what is
actually happening at the kernel level.  I was not aware that the parent
could see file descriptors created by the child, is this always true or
only true if the parent and child are explicitly a shared memory
process?

>   Or you can have memory (address space) shared,
> so that something like mmap() in parent would affect the memory mappings of
> child, etc.  Which components are to be shared and which - copied is selected
> by clone() argument.
OK.
I have used that to create parent child processes with shared memory,
but I did cut&paste the initial code from a googled example rather than
apply any true skill ;-)

> 	unshare() allows to switch to using a private copy of chosen components
> - e.g. you might say "from now on, I want my file descriptor table to be
> private".  In e.g. Plan 9 that's expressed via rfork() as well.
unshare() is new to me but I see the logic. 


> Less obvious components include current directory and root.  Normally, these
> are not shared; chdir() done in child won't affect the parent and vice versa.
> You could ask them to be shared, though - for multithreaded program it could
> be convenient.
OK.

> 
> Different processes might see different parts of the mount tree since v7 had
> introduced chroot(2).  Namespaces simply allow to have a *forest* - different
> groups of processes seeing different mount trees in that forest.  The same
> filesystem may be mounted in many places, and the same directory might be
> a mountpoint in an instance visible to one process and not a mountpoint
> in an instance visible to another (or a mountpoint with something entirely
> different mounted in an instance visible to somebody else).
Ok, I follow that. I have used chroot but only very sparingly, I have
never used a machine (to my knowledge) with the same file system mounted
onto multiple mount points so I had not considered that.

> Mount tree is yet another component; the difference is that normally it *is*
> shared on clone(), rather than being copied.  I.e. mount() done by child
> affects the mount tree visible to parent.   But you still can ask for
> a new private copy of mount tree via clone() or unshare().  When the
> last process sharing that mount tree exits, it gets dissolved, same as
> every file descriptor in a descriptor table gets closed when the last
> thread sharing that descriptor table exits (or asks for unshared copy of
> descriptor table, e.g. as a side effect of execve()).  Just as with
> file descriptors, where close() does not necessarily close the opened file
> the descriptor is connected to (that happens only when all descriptors
> connected to the given opened file are closed), umount() does not necessarily
> shut the filesystem down; that happens only if it's not mounted elsewhere.
Ok, I follow that :-) But logically it must be done with two functions
or handlers or something, so I would assume that my proposed "remove
mount directory" would simply hang off whatever call truly discards the
file system from the kernel.

I thought code for my feature might need to generate a warning if the
mount point has files in it (IE rmdir fails on unmount) or if the mount
point exists in some read only part of the directory tree. I figured a
few lines of code and couple of kernel warning would be enough. I get
from your explanation that things are a little more complex than maybe I
thought.

> With something like Plan 9 that would be pretty much all you need for
> isolating process groups into separate environments - just give each
> the set of filesystems they should be seeing and be done with that.
> We, unfortunately, can't drop certain FPOS APIs (starting with sockets,
> with their "network interfaces are magical sets of named objects, names
> are not expressed as pathnames, access control and visibility completely
> ad-hoc, ditto for listing and renaming" shite), so we get more
> state components ;-/  Which leads to e.g. "network namespace" and similar
> complications; that crap should've been dealt with in _filesystem_ namespace,
> but Occam Razor be damned, we need to support every misdesigned interface
> that got there, no matter how many entities it breeds and how convoluted
> the result becomes...  In principle, though, it's still the same model -
> only with more components to be possibly shared.
OK, thanks for the explanation. I have never looked at plan 9, I put it
in the same camp as Hurd - something that is interesting in theory but
that I will probably never live to see running on anything that I
use ;-)





* Re: Feature request, "create on mount" to create mount point directory on mount, implied remove on unmount
  2015-07-05 23:35           ` jon
@ 2015-07-06  1:08             ` Al Viro
  2015-07-06  2:34               ` jon
  0 siblings, 1 reply; 12+ messages in thread
From: Al Viro @ 2015-07-06  1:08 UTC (permalink / raw)
  To: jon; +Cc: Valdis.Kletnieks, coreutils, linux-kernel

On Mon, Jul 06, 2015 at 12:35:48AM +0100, jon wrote:
> > Anyway, the underlying model hasn't changed much since _way_ back; each
> > thread of execution is a virtual machine of its own, with actual CPUs
> > switched between those.
> Ok, not sure I quite follow. What do you mean virtual machine  ? 
> My understanding was that a true VM has a hypervisor and I thought also
> required some extra processor instructions to basically do an "outer"
> context switch (and some memory fiddling to fake up unique address
> spaces) while the operating system's own scheduler within the VM is
> doing the "inner" context switch (IE push/pop all on Intel style CPU).
> Not all architectures have any VM capability. 
> Are you talking about kernels on Intel with SMP enabled only ? 

Anything timesharing, starting with 7094 running CTSS.  Hypervisors allow one
to virtualize privileged mode parts of processor; it's a different beast.

Each process sees CPU and memory of its own; what the kernel does to give them
such an illusion depends upon the system (up to and including full write of
registers and user memory to disk, followed by reading that of the next
process back from disk - remember where the name "swap" had originally come
from?), but no matter how you do that, you give process a virtual CPU of
its own and multiplex the actual processor(s) between those.

From the process' point of view system call is just a weird instruction that
has rather convoluted side effects.  The fact that it actually triggers
a trap on the underlying hardware CPU, switches to kernel mode, marshals
the arguments, arranges execution environment for C code, executes it, etc.
is immaterial - as far as userland code is concerned, the kernel is a black
box.  For all it cares, there might be another CPU sitting there, with
entirely different instruction set and something running on it.  With
"system call" insn on your CPU raising a signal to attract attention of
the privileged one and stopping itself until the privileged one tells it to
resume.

It's considerably older than hypervisors (and both are much older than
x86).

> > Parts of those virtual machines can be shared - e.g. you can have descriptor
> > table not just identical to that of parent at the time of clone(), but
> > actually shared with it, so e.g. open() in child makes the resulting descriptor
> > visible to parent as well.
> Ok, I follow you. I often dont need anything more complex than fork(),
> when I thread I use pthreads so have not dug around into what is
> actually happening at the kernel level.  I was not aware that the parent
> could see file descriptors created by the child, is this always true or
> only true if the parent and child are explicitly a shared memory
> process.

It is true if and only if clone(2) gets CLONE_FILES in its arguments.
Sharing address space is controlled by CLONE_VM and these can be used
independently; pthreads set both at the same time, but you can have shared
descriptor table without shared memory and vice versa.  Most of the time
you use shared descriptor tables, you want shared memory as well, but
it's not universally true.
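
A minimal sketch of that difference, in Python for brevity. It assumes Python threads are pthreads underneath (as on Linux), so they share the descriptor table, while os.fork() hands the child a copy. The function names are made up for illustration.

```python
import os
import threading


def fd_opened_in_thread_is_usable():
    """A pipe opened by another thread is usable here: the table is shared."""
    fds = []
    worker = threading.Thread(target=lambda: fds.extend(os.pipe()))
    worker.start()
    worker.join()
    r, w = fds
    try:
        os.write(w, b"x")
        return os.read(r, 1) == b"x"
    finally:
        os.close(r)
        os.close(w)


def fd_opened_in_child_is_visible():
    """A descriptor opened by a fork()ed child does not appear in the parent."""
    report_r, report_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            fd = os.open("/dev/null", os.O_RDONLY)  # lands in the child's copy
            os.write(report_w, str(fd).encode())    # tell the parent the number
        finally:
            os._exit(0)
    os.close(report_w)
    child_fd = int(os.read(report_r, 16))
    os.waitpid(pid, 0)
    os.close(report_r)
    try:
        os.fstat(child_fd)  # same number, but the parent never opened it
        return True
    except OSError:
        return False
```

The first function should return True and the second False. Getting a shared table without shared memory needs a direct clone(2) call with CLONE_FILES and without CLONE_VM, which is outside what this sketch covers.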

> Ok, I follow that :-) But logically it must be done with two functions
> or handlers or something, so I would assume that my proposed "remove
> mount directory" would simply hang off whatever call truly discards the
> file system from the kernel.

Er...  _Which_ mount directory would you have removed (and what's to
guarantee that all filesystems it had been mounted on are still alive
when the last mount goes away)?


* Re: Feature request, "create on mount" to create mount point directory on mount, implied remove on unmount
  2015-07-06  1:08             ` Al Viro
@ 2015-07-06  2:34               ` jon
  2015-07-06  3:07                 ` Al Viro
  2015-07-06  5:40                 ` Valdis.Kletnieks
  0 siblings, 2 replies; 12+ messages in thread
From: jon @ 2015-07-06  2:34 UTC (permalink / raw)
  To: Al Viro; +Cc: Valdis.Kletnieks, coreutils, linux-kernel

On Mon, 2015-07-06 at 02:08 +0100, Al Viro wrote:
> On Mon, Jul 06, 2015 at 12:35:48AM +0100, jon wrote:
> > > Anyway, the underlying model hasn't changed much since _way_ back; each
> > > thread of execution is a virtual machine of its own, with actual CPUs
> > > switched between those.
> > Ok, not sure I quite follow. What do you mean virtual machine  ? 

> Anything timesharing, starting with 7094 running CTSS.  Hypervisors allow one
> to virtualize privileged mode parts of processor; it's a different beast.
This was my point.  To me "virtual machine" is a modern term that
describes something running with a hypervisor.
My confusion is you are retrospectively applying it to time sharing.
I managed to find the article that first taught me about schedulers, the
1979 Byte article "Introduction to Multiprogramming"

https://archive.org/details/byte-magazine-1979-09

Searching the pdf for the term "virtual" gives one result, not in
that article.

I remember "virtual memory" and even "virtual addressing" but I think
the term "virtual machine" is modern, maybe someone else knows, google
did not help me much trying to prove it one way or the other.

> It's considerably older than hypervisors (and both are much older than
> x86).
Yes it is, but it was not called "virtual machine" at the time in anything I personally read.

> > I was not aware that the parent
> > could see file descriptors created by the child, is this always true or
> > only true if the parent and child are explicitly a shared memory
> > process.
> 
> It is true if and only if clone(2) gets CLONE_FILES in its arguments.
> Sharing address space is controlled by CLONE_VM and these can be used
> independently; pthreads set both at the same time, but you can have shared
> descriptor table without shared memory and vice versa.  Most of the time
> you use shared descriptor tables, you want shared memory as well, but
> it's not universally true.
I mainly use fork(), file descriptors are copied (forward) but memory
not shared.


> > Ok, I follow that :-) But logically it must be done with two functions
> > or handlers or something, so I would assume that my proposed "remove
> > mount directory" would simply hang off whatever call truly discards the
> > file system from the kernel.
> 
> Er...  _Which_ mount directory would you have removed
The one that was passed as "target" in the call ? I assume the kernel
stores that ?
int mount(const char *source, const char *target,
 

>  (and what's to
> guarantee that all filesystems it had been mounted on are still alive
> when the last mount goes away)?
The same rules that would be in play if it was cross mounted in some
other way, or am I being dumb here?

I assume Linux will not let me unmount a mount point from lower in the
directory tree. I've not tried in living memory so let's give it a
go ....

root@jonspc:/# mkdir mounttest
root@jonspc:/# cd mounttest/
root@jonspc:/mounttest# mkdir firstmount
root@jonspc:/mounttest# mount /dev/sdb1 /mounttest/firstmount
root@jonspc:/mounttest# cd firstmount/
root@jonspc:/mounttest/firstmount# mkdir secondmount
root@jonspc:/mounttest/firstmount# mount /dev/sdb1 /mounttest/firstmount/secondmount/
root@jonspc:/mounttest/firstmount# cd ..
root@jonspc:/mounttest# umount /mounttest/firstmount
umount: /mounttest/firstmount: device is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))

It fails, as I expected.
Also rmdir would fail if the mount point was anything other than
empty, and mkdir would fail if the mount point was already created.
I don't see how the kernel doing a pre mount mkdir and a post mount
rmdir would differ in outcome from user space performing the same
operations, regardless of what convoluted configuration was in use.
Like I said, I don't use containers or Xen, so can you show me a
(preferably simple) scenario that my proposal breaks?

I assume an entry in a table in the kernel is the source of the above
"device is busy" message, is this not also true if the name spaces
differ but the same file system is mounted in multiple places?

I would expect umount <device> to unmount all mounted references to
that device and umount <mountpoint> to remove just that mountpoint, or am
I misremembering....





* Re: Feature request, "create on mount" to create mount point directory on mount, implied remove on unmount
  2015-07-06  2:34               ` jon
@ 2015-07-06  3:07                 ` Al Viro
  2015-07-06  5:40                 ` Valdis.Kletnieks
  1 sibling, 0 replies; 12+ messages in thread
From: Al Viro @ 2015-07-06  3:07 UTC (permalink / raw)
  To: jon; +Cc: Valdis.Kletnieks, coreutils, linux-kernel

On Mon, Jul 06, 2015 at 03:34:59AM +0100, jon wrote:
> > It is true if and only if clone(2) gets CLONE_FILES in its arguments.
> > Sharing address space is controlled by CLONE_VM and these can be used
> > independently; pthreads set both at the same time, but you can have shared
> > descriptor table without shared memory and vice versa.  Most of the time
> > you use shared descriptor tables, you want shared memory as well, but
> > it's not universally true.
> I mainly use fork(), file descriptors are copied (forward) but memory
> not shared.

fork() doesn't pass either.  Both the address space and descriptor table
are copied.
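
A small sketch of the "both are copied" behaviour, assuming a shell such as bash or dash that implements a ( ... ) subshell with fork(). The function name is made up for illustration. The child changes a variable and closes fd 3, and neither change is visible in the parent.

```shell
# Demonstrate that a forked child gets copies of memory and of the
# descriptor table: changes made in the subshell do not reach the parent.
fork_copies_demo() {
    tmp=$(mktemp) || return 1
    exec 3>"$tmp"          # open fd 3 in the parent
    var=parent
    (                      # subshell: the shell forks here
        var=child          # modifies only the child's copy of memory
        exec 3>&-          # closes only the child's copy of fd 3
    )
    echo "var=$var" >&3    # fd 3 is still open in the parent
    exec 3>&-
    cat "$tmp"
    rm -f "$tmp"
}
```

Running fork_copies_demo prints var=parent. With CLONE_FILES the close in the child would have affected the parent as well, which is the distinction being drawn above.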

> > > Ok, I follow that :-) But logically it must be done with two functions
> > > or handlers or something, so I would assume that my proposed "remove
> > > mount directory" would simply hang off whatever call truly discards the
> > > file system from the kernel.
> > 
> > Er...  _Which_ mount directory would you have removed

> The one that was passed as "target" in the call ? I assume the kernel
> stores that ?

Which time?  You can mount the same fs many times, at many places, unmounting
them whenever you like...

mount -t ramfs none /mnt
mkdir /mnt/a
mount /dev/sda1 /mnt/a
mkdir /tmp/b
mount /dev/sda1 /tmp/b
umount /mnt/a
umount /mnt

and you've got sda1 active through all of that, with the original mountpoint
not busy anymore (moreover, the filesystem it used to be on already shut down).
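
To see the "which time?" problem on a live system, one can list every place a given source is attached. The sketch below is an illustration only: the function name is made up, and it assumes mountinfo-format input as described in proc(5), where field 5 is the mount point and the source follows the "-" separator.

```shell
# List every mount point at which a given source device is attached.
# Reads mountinfo-format text on stdin (see proc(5)).
mountpoints_of() {
    awk -v src="$1" '{
        # optional fields start at column 7 and end at a lone "-"
        for (i = 7; i <= NF; i++) if ($i == "-") break
        # after the separator: filesystem type, then mount source
        if ($(i + 2) == src) print $5
    }'
}
```

Typical use would be mountpoints_of /dev/sda1 < /proc/self/mountinfo. Any number of lines may come back, and none of them is "the" mount directory of the filesystem.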

What's more, there's mount --bind (attach a subtree to new location) and
mount --move (move whatever's mounted at <this> place to <that> place).

Basically, you have a (system-wide) set of active filesystems.  Mount trees
consist of subtrees in that forest (normally - entire trees) pasted together.
The same subtree (or smaller subtrees) might be seen any number of times at
any places.

You can say e.g.

mount -t xfs /dev/sda1 /mnt
mount --bind /mnt/a /usr
mount --bind /mnt/b /var
umount /mnt

and you'll get an active fs from sda1, with two subtrees (rooted at a and b
resp.) attached at /usr and /var.  By the end of that, the entire tree isn't
attached anywhere.
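
The subtree roots are visible from user space too. A rough sketch, again assuming mountinfo-format input per proc(5), where field 4 is the root of the attached subtree within its filesystem and field 5 is where it is attached; the function name is hypothetical.

```shell
# Print "root-within-fs -> mount point" for each attachment of a source.
# Reads mountinfo-format text on stdin (see proc(5)).
subtrees_of() {
    awk -v src="$1" '{
        # skip the optional fields up to the lone "-" separator
        for (i = 7; i <= NF; i++) if ($i == "-") break
        if ($(i + 2) == src) print $4 " -> " $5
    }'
}
```

After the four commands above, subtrees_of /dev/sda1 < /proc/self/mountinfo would be expected to show /a attached at /usr and /b attached at /var, with no line whose root is / because the whole tree is no longer attached anywhere.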

Seriously, say man mount and play with what's described there.  The model is
fairly simple, really...

As an aside, it's a bleeding shame that even as late as '79 *all* filesystems
on a box had to be of the same type; that's pretty much _the_ place where
Unix history went wrong - mount(2) had remained an afterthought (albeit a very
early one) all the way until v7.  Hell, as late as in v6 mounting something
on /usr and opening /usr/.. gave you /usr, not /.  It was kludged up in
iget(9), of all things - mount table basically had been "when doing iget()
of this inumber on this device, use root directory inumber on that device
instead".  Consistent handling of .. had appeared only in v7.  It was very
much _not_ a first-class operation.

As far as I know, the real pressure to support heterogeneous filesystem mix
had been created only by the need to support network filesystems.  Sure,
as soon as it had appeared in what was to become v8 (circa 82 or so?),
a filesystem to get rid of ptrace(2) (aka procfs) had appeared.  But it really
had been too late by then - to have netfs, you really need to have some kind
of networking API (if nothing else, to be able to implement userland servers).
And having _that_ happen before the "filesystem as a first-class object"
had pretty much doomed us to really shitty APIs.

Pity it hadn't happened in opposite order - very good reasons to do something
like e.g. procfs had all been there.  Take a look at v7 /usr/src/cmd/ps.c
someday...  And as soon as mount as the first-class operation would've been
there, a _lot_ of API design would've gone a different way...  Fucking shame
it hadn't happened in v7 - after that it had been too damn late ;-/


* Re: Feature request, "create on mount" to create mount point directory on mount, implied remove on unmount
  2015-07-06  2:34               ` jon
  2015-07-06  3:07                 ` Al Viro
@ 2015-07-06  5:40                 ` Valdis.Kletnieks
  1 sibling, 0 replies; 12+ messages in thread
From: Valdis.Kletnieks @ 2015-07-06  5:40 UTC (permalink / raw)
  To: jon; +Cc: Al Viro, coreutils, linux-kernel


On Mon, 06 Jul 2015 03:34:59 +0100, jon said:

> I remember "virtual memory" and even "virtual addressing" but I think
> the term "virtual machine" is modern, maybe someone else knows, google
> did not help me much trying to prove it one way or the other.

Hardly.  IBM was working with virtual machines as far back as the
IBM S360/67 in the late 60s (and even a highly modified /40), and released
VM/370 in 1972.  They used the term 'Virtual Machine' as early as 1966:

R. J. Adair, R. U. Bayles, L. W. Comeau, and R. J. Creasy, A Virtual Machine
System for the 360/40, IBM Corporation, Cambridge Scientific Center Report No.
320-2007 (May 1966)

https://en.wikipedia.org/wiki/VM_%28operating_system%29

(And that's just what I'm familiar with from being a VM jock from 1982 to 2000,
I'm sure there's earlier references...)



* Re: Feature request, "create on mount" to create mount point directory on mount, implied remove on unmount
  2015-07-04 22:48   ` jon
  2015-07-05 14:29     ` Al Viro
@ 2015-07-15 14:38     ` Karel Zak
  1 sibling, 0 replies; 12+ messages in thread
From: Karel Zak @ 2015-07-15 14:38 UTC (permalink / raw)
  To: jon; +Cc: Valdis.Kletnieks, coreutils, linux-kernel

On Sat, Jul 04, 2015 at 11:48:28PM +0100, jon wrote:
> It solves these problems:
> 1) It solves the problem of processes writing data into the mount point
> when not mounted (as does, I accept a user space automounter, but as I
> explained the usage scenario differs). 
> 
> 2) It would be useful for embedded devices, installers etc.  I do quite
> a bit of work in the embedded space, sometimes running kernel+shell+user
> process only, sometimes no udev, no systemd, not even full fat init.  
> 
> 3) installers or similar could use such an option for mounting install
> data. By specifying the flag user space processes can infer that the FS
> is successfully mounted by the presence of the mount point without the
> need to explicitly code against an event system or parse log files. 
> 
> 4) Users can use it to have a slightly improved new mount behaviour and
> also hopefully be used as a flag to indicate that "oh so clever user
> space managers" should stay away from entries using it in fstab.

 man mount (since util-linux v2.23, May 2013):

   x-mount.mkdir[=mode]    Allow to make a target directory (mountpoint).  

 It's userspace mount(8) option and you can use it in your fstab.
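
A sketch of what such an fstab entry might look like, reusing the label and path from the original request. The column layout follows fstab(5); the "defaults" option and the trailing "0 0" fields are assumptions added for illustration.

```
# /etc/fstab
# <source>            <mountpoint>          <type>  <options>               <dump> <pass>
LABEL=notalwayshere   /mounts/amountpoint   ext4    defaults,x-mount.mkdir  0      0
```

The same option can be given on the command line as mount -o x-mount.mkdir LABEL=notalwayshere /mounts/amountpoint. Going by the man page text quoted above, the option covers creating the directory only; nothing there suggests it is removed again on umount.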

    Karel

-- 
 Karel Zak  <kzak@redhat.com>
 http://karelzak.blogspot.com


Thread overview: 12+ messages
2015-07-03 12:01 Feature request, "create on mount" to create mount point directory on mount, implied remove on unmount jon
2015-07-04 20:56 ` Valdis.Kletnieks
2015-07-04 22:48   ` jon
2015-07-05 14:29     ` Al Viro
2015-07-05 15:46       ` jon
2015-07-05 17:39         ` Al Viro
2015-07-05 23:35           ` jon
2015-07-06  1:08             ` Al Viro
2015-07-06  2:34               ` jon
2015-07-06  3:07                 ` Al Viro
2015-07-06  5:40                 ` Valdis.Kletnieks
2015-07-15 14:38     ` Karel Zak
