* v4 clientid uniquifiers in containers/namespaces
@ 2022-02-05 15:03 Benjamin Coddington
  2022-02-05 18:24 ` Trond Myklebust
  2022-02-08  1:59 ` NeilBrown
  0 siblings, 2 replies; 23+ messages in thread
From: Benjamin Coddington @ 2022-02-05 15:03 UTC
To: Linux NFS Mailing List

Hi all,

Is anyone using a udev(-like) implementation with
NETLINK_LISTEN_ALL_NSID? It looks like that is at least
necessary to allow the init namespaced udev to receive
notifications on /sys/fs/nfs/net/nfs_client/identifier, which
would be a pre-req to automatically uniquify in containers.

I'm interested since it will inform whether I need to send
patches to systemd's udev, and potentially open the can of
worms over there. Yet it's not yet clear to me how an init
namespaced udev process can write to a netns sysfs path.

Another option might be to create yet another daemon/tool that
would listen specifically for these notifications. Ugh.

Ben
* Re: v4 clientid uniquifiers in containers/namespaces
  2022-02-05 15:03 v4 clientid uniquifiers in containers/namespaces Benjamin Coddington
@ 2022-02-05 18:24 ` Trond Myklebust
  2022-02-05 19:50   ` Benjamin Coddington
  2022-02-08  1:59 ` NeilBrown
  1 sibling, 1 reply; 23+ messages in thread
From: Trond Myklebust @ 2022-02-05 18:24 UTC
To: linux-nfs, bcodding

On Sat, 2022-02-05 at 10:03 -0500, Benjamin Coddington wrote:
> Hi all,
>
> Is anyone using a udev(-like) implementation with
> NETLINK_LISTEN_ALL_NSID? It looks like that is at least
> necessary to allow the init namespaced udev to receive
> notifications on /sys/fs/nfs/net/nfs_client/identifier, which
> would be a pre-req to automatically uniquify in containers.
>
> I'm interested since it will inform whether I need to send
> patches to systemd's udev, and potentially open the can of
> worms over there. Yet it's not yet clear to me how an init
> namespaced udev process can write to a netns sysfs path.
>
> Another option might be to create yet another daemon/tool that
> would listen specifically for these notifications. Ugh.
>
> Ben
>

I don't understand. Why do you need a new daemon/tool?

I have the following entry in /etc/udev/rules.d:

[trondmy@leira ~]$ cat /etc/udev/rules.d/50-nfs4.rules
ACTION=="add" KERNEL=="nfs_client" ATTR{identifier}=="(null)" PROGRAM="/usr/sbin/nfs4_uuid" ATTR{identifier}="%c"

...and a very simple script /usr/sbin/nfs4_uuid that reads as follows:

#!/bin/bash
#
if [ ! -f /etc/nfs4_uuid ]
then
        uuid="$(uuidgen -r)"
        echo -n ${uuid} > /etc/nfs4_uuid
else
        uuid="$(cat /etc/nfs4_uuid)"
fi
echo ${uuid}

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
* Re: v4 clientid uniquifiers in containers/namespaces
  2022-02-05 18:24 ` Trond Myklebust
@ 2022-02-05 19:50   ` Benjamin Coddington
  2022-02-07 14:05     ` Benjamin Coddington
  0 siblings, 1 reply; 23+ messages in thread
From: Benjamin Coddington @ 2022-02-05 19:50 UTC
To: Trond Myklebust; +Cc: linux-nfs

On 5 Feb 2022, at 13:24, Trond Myklebust wrote:

> On Sat, 2022-02-05 at 10:03 -0500, Benjamin Coddington wrote:
>> Hi all,
>>
>> Is anyone using a udev(-like) implementation with
>> NETLINK_LISTEN_ALL_NSID? It looks like that is at least
>> necessary to allow the init namespaced udev to receive
>> notifications on /sys/fs/nfs/net/nfs_client/identifier, which
>> would be a pre-req to automatically uniquify in containers.
>>
>> I'm interested since it will inform whether I need to send
>> patches to systemd's udev, and potentially open the can of
>> worms over there. Yet it's not yet clear to me how an init
>> namespaced udev process can write to a netns sysfs path.
>>
>> Another option might be to create yet another daemon/tool that
>> would listen specifically for these notifications. Ugh.
>>
>> Ben
>>
>
> I don't understand. Why do you need a new daemon/tool?
>
> I have the following entry in /etc/udev/rules.d:
>
> [trondmy@leira ~]$ cat /etc/udev/rules.d/50-nfs4.rules
> ACTION=="add" KERNEL=="nfs_client" ATTR{identifier}=="(null)" PROGRAM="/usr/sbin/nfs4_uuid" ATTR{identifier}="%c"
>
>
> ...and a very simple script /usr/sbin/nfs4_uuid that reads as follows:
>
> #!/bin/bash
> #
> if [ ! -f /etc/nfs4_uuid ]
> then
>         uuid="$(uuidgen -r)"
>         echo -n ${uuid} > /etc/nfs4_uuid
> else
>         uuid="$(cat /etc/nfs4_uuid)"
> fi
> echo ${uuid}

We're in the same place, but what I see is that when I create a new
network namespace with:

    ip netns add testnamespace

Everything in the kernel works up to the point where the userspace udevd
never gets a notification. I suspect that's because it hasn't used
NETLINK_LISTEN_ALL_NSID, so the kernel's skipping the notification in
do_one_broadcast().

If your udev is getting notified of new network namespaces and firing that
rule each time, something's different between our setups, and I'd like to
figure out what it might be.

Ben
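For readers following along: while the uevent is not reaching udevd, the
write that the rule quoted above performs can be done by hand. The following
is only a minimal sketch. The helper name set_nfs_identifier_if_unset is
hypothetical, and it assumes the attribute reads back as "(null)" while
unset, which is what the ATTR{identifier}=="(null)" match in the rule
implies.

```shell
#!/bin/sh
# Hypothetical helper (not part of nfs-utils or udev).
# Writes an identifier only while the attribute still reads "(null)",
# mirroring the ATTR{identifier}=="(null)" match in the udev rule.
#   $1  path to the identifier attribute; inside the target namespace this
#       would be /sys/fs/nfs/net/nfs_client/identifier
#   $2  identifier to write
set_nfs_identifier_if_unset() {
    attr="$1"
    id="$2"

    # Bail out if the attribute cannot be read at all.
    current="$(cat "$attr")" || return 1

    if [ "$current" = "(null)" ]; then
        printf '%s\n' "$id" > "$attr"
    fi
}
```

From a shell started with "ip netns exec testnamespace sh", this could be
called with the sysfs path above and the output of a script such as
nfs4_uuid. That relies on ip-netns presenting the namespace's own view of
/sys, which is worth verifying on the system in question.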
* Re: v4 clientid uniquifiers in containers/namespaces
  2022-02-05 19:50   ` Benjamin Coddington
@ 2022-02-07 14:05     ` Benjamin Coddington
  2022-02-07 15:49       ` Chuck Lever III
  0 siblings, 1 reply; 23+ messages in thread
From: Benjamin Coddington @ 2022-02-07 14:05 UTC
To: Trond Myklebust; +Cc: linux-nfs

On 5 Feb 2022, at 14:50, Benjamin Coddington wrote:

> On 5 Feb 2022, at 13:24, Trond Myklebust wrote:
>
>> On Sat, 2022-02-05 at 10:03 -0500, Benjamin Coddington wrote:
>>> Hi all,
>>>
>>> Is anyone using a udev(-like) implementation with
>>> NETLINK_LISTEN_ALL_NSID? It looks like that is at least
>>> necessary to allow the init namespaced udev to receive
>>> notifications on /sys/fs/nfs/net/nfs_client/identifier, which
>>> would be a pre-req to automatically uniquify in containers.
>>>
>>> I'm interested since it will inform whether I need to send
>>> patches to systemd's udev, and potentially open the can of
>>> worms over there. Yet it's not yet clear to me how an init
>>> namespaced udev process can write to a netns sysfs path.
>>>
>>> Another option might be to create yet another daemon/tool that
>>> would listen specifically for these notifications. Ugh.
>>>
>>> Ben
>>>
>>
>> I don't understand. Why do you need a new daemon/tool?

Because what we've got only works for the init namespace.

Udev won't get kobject notifications because it's not using
NETLINK_LISTEN_ALL_NSID.

We need to figure out if we want:

1) the init namespace udevd to handle all client_id uniquifiers
2) we expect network namespaces to run their own udevd
3) or both.

I think 2 violates "least surprise", and 3 might not be something anyone
ever wants. If they do, we can fix it at that point.

So to make 1 work, we can try to change udevd, or maybe just hacking about
with nfs_netns_object_child_ns_type will be sufficient.

Ben
* Re: v4 clientid uniquifiers in containers/namespaces
  2022-02-07 14:05 ` Benjamin Coddington
@ 2022-02-07 15:49   ` Chuck Lever III
  2022-02-07 19:38     ` Trond Myklebust
  0 siblings, 1 reply; 23+ messages in thread
From: Chuck Lever III @ 2022-02-07 15:49 UTC
To: Benjamin Coddington; +Cc: Trond Myklebust, Linux NFS Mailing List

> On Feb 7, 2022, at 9:05 AM, Benjamin Coddington <bcodding@redhat.com> wrote:
>
> On 5 Feb 2022, at 14:50, Benjamin Coddington wrote:
>
>> On 5 Feb 2022, at 13:24, Trond Myklebust wrote:
>>
>>> On Sat, 2022-02-05 at 10:03 -0500, Benjamin Coddington wrote:
>>>> Hi all,
>>>>
>>>> Is anyone using a udev(-like) implementation with
>>>> NETLINK_LISTEN_ALL_NSID? It looks like that is at least
>>>> necessary to allow the init namespaced udev to receive
>>>> notifications on /sys/fs/nfs/net/nfs_client/identifier, which
>>>> would be a pre-req to automatically uniquify in containers.
>>>>
>>>> I'm interested since it will inform whether I need to send
>>>> patches to systemd's udev, and potentially open the can of
>>>> worms over there. Yet it's not yet clear to me how an init
>>>> namespaced udev process can write to a netns sysfs path.
>>>>
>>>> Another option might be to create yet another daemon/tool that
>>>> would listen specifically for these notifications. Ugh.
>>>>
>>>> Ben
>>>>
>>>
>>> I don't understand. Why do you need a new daemon/tool?
>
> Because what we've got only works for the init namespace.
>
> Udev won't get kobject notifications because it's not using
> NETLINK_LISTEN_ALL_NSID.
>
> We need to figure out if we want:
>
> 1) the init namespace udevd to handle all client_id uniquifiers
> 2) we expect network namespaces to run their own udevd
> 3) or both.
>
> I think 2 violates "least surprise", and 3 might not be something
> anyone ever wants. If they do, we can fix it at that point.
>
> So to make 1 work, we can try to change udevd, or maybe just
> hacking about with nfs_netns_object_child_ns_type will be
> sufficient.

I agree that 1 seems like the preferred approach, though
I don't have a technical suggestion at this point.

Again, thank you for drilling into this.

--
Chuck Lever
* Re: v4 clientid uniquifiers in containers/namespaces
  2022-02-07 15:49 ` Chuck Lever III
@ 2022-02-07 19:38   ` Trond Myklebust
  2022-02-07 23:59     ` Chuck Lever III
  0 siblings, 1 reply; 23+ messages in thread
From: Trond Myklebust @ 2022-02-07 19:38 UTC
To: bcodding, chuck.lever; +Cc: linux-nfs

On Mon, 2022-02-07 at 15:49 +0000, Chuck Lever III wrote:
>
>
> > On Feb 7, 2022, at 9:05 AM, Benjamin Coddington
> > <bcodding@redhat.com> wrote:
> >
> > On 5 Feb 2022, at 14:50, Benjamin Coddington wrote:
> >
> > > On 5 Feb 2022, at 13:24, Trond Myklebust wrote:
> > >
> > > > On Sat, 2022-02-05 at 10:03 -0500, Benjamin Coddington wrote:
> > > > > Hi all,
> > > > >
> > > > > Is anyone using a udev(-like) implementation with
> > > > > NETLINK_LISTEN_ALL_NSID? It looks like that is at least
> > > > > necessary to allow the init namespaced udev to receive
> > > > > notifications on /sys/fs/nfs/net/nfs_client/identifier, which
> > > > > would be a pre-req to automatically uniquify in containers.
> > > > >
> > > > > I'm interested since it will inform whether I need to send
> > > > > patches to systemd's udev, and potentially open the can of
> > > > > worms over there. Yet it's not yet clear to me how an init
> > > > > namespaced udev process can write to a netns sysfs path.
> > > > >
> > > > > Another option might be to create yet another daemon/tool that
> > > > > would listen specifically for these notifications. Ugh.
> > > > >
> > > > > Ben
> > > > >
> > > >
> > > > I don't understand. Why do you need a new daemon/tool?
> >
> > Because what we've got only works for the init namespace.
> >
> > Udev won't get kobject notifications because it's not using
> > NETLINK_LISTEN_ALL_NSID.
> >
> > We need to figure out if we want:
> >
> > 1) the init namespace udevd to handle all client_id uniquifiers
> > 2) we expect network namespaces to run their own udevd
> > 3) or both.
> >
> > I think 2 violates "least surprise", and 3 might not be something
> > anyone ever wants. If they do, we can fix it at that point.
> >
> > So to make 1 work, we can try to change udevd, or maybe just
> > hacking about with nfs_netns_object_child_ns_type will be
> > sufficient.
>
> I agree that 1 seems like the preferred approach, though
> I don't have a technical suggestion at this point.
>

I strongly disagree. (1) requires the init namespace to have intimate
knowledge of container internals. Why do we need to make that a
requirement? That violates the expectation that containers are
stateless by default, and also the expectation that they operate
independently of the environment.

If you really do want external control over the uuid that is set, then
it should be pretty trivial to do so by using the standard container
tools for manipulating the namespace (e.g. to mount a file that is
under control of the parent as /etc/nfs4-uuid.conf or whatever).

However in most cases that I can think of, if the container is doing
its own NFS mounting, then it is going to have to be set up with its
own nfs-utils, etc, so there is no reason why we can't also require
udev.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
* Re: v4 clientid uniquifiers in containers/namespaces
  2022-02-07 19:38 ` Trond Myklebust
@ 2022-02-07 23:59   ` Chuck Lever III
  2022-02-08 11:32     ` Benjamin Coddington
  0 siblings, 1 reply; 23+ messages in thread
From: Chuck Lever III @ 2022-02-07 23:59 UTC
To: Trond Myklebust; +Cc: bcodding, Linux NFS Mailing List

> On Feb 7, 2022, at 2:38 PM, Trond Myklebust <trondmy@hammerspace.com> wrote:
>
> On Mon, 2022-02-07 at 15:49 +0000, Chuck Lever III wrote:
>>
>>
>>> On Feb 7, 2022, at 9:05 AM, Benjamin Coddington
>>> <bcodding@redhat.com> wrote:
>>>
>>> On 5 Feb 2022, at 14:50, Benjamin Coddington wrote:
>>>
>>>> On 5 Feb 2022, at 13:24, Trond Myklebust wrote:
>>>>
>>>>> On Sat, 2022-02-05 at 10:03 -0500, Benjamin Coddington wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> Is anyone using a udev(-like) implementation with
>>>>>> NETLINK_LISTEN_ALL_NSID? It looks like that is at least
>>>>>> necessary to allow the init namespaced udev to receive
>>>>>> notifications on /sys/fs/nfs/net/nfs_client/identifier, which
>>>>>> would be a pre-req to automatically uniquify in containers.
>>>>>>
>>>>>> I'm interested since it will inform whether I need to send
>>>>>> patches to systemd's udev, and potentially open the can of
>>>>>> worms over there. Yet it's not yet clear to me how an init
>>>>>> namespaced udev process can write to a netns sysfs path.
>>>>>>
>>>>>> Another option might be to create yet another daemon/tool that
>>>>>> would listen specifically for these notifications. Ugh.
>>>>>>
>>>>>> Ben
>>>>>>
>>>>>
>>>>> I don't understand. Why do you need a new daemon/tool?
>>>
>>> Because what we've got only works for the init namespace.
>>>
>>> Udev won't get kobject notifications because it's not using
>>> NETLINK_LISTEN_ALL_NSID.
>>>
>>> We need to figure out if we want:
>>>
>>> 1) the init namespace udevd to handle all client_id uniquifiers
>>> 2) we expect network namespaces to run their own udevd
>>> 3) or both.
>>>
>>> I think 2 violates "least surprise", and 3 might not be something
>>> anyone ever wants. If they do, we can fix it at that point.
>>>
>>> So to make 1 work, we can try to change udevd, or maybe just
>>> hacking about with nfs_netns_object_child_ns_type will be
>>> sufficient.
>>
>> I agree that 1 seems like the preferred approach, though
>> I don't have a technical suggestion at this point.
>>
>
> I strongly disagree. (1) requires the init namespace to have
> intimate knowledge of container internals. Why do we need to make
> that a requirement? That violates the expectation that containers
> are stateless by default, and also the expectation that they
> operate independently of the environment.
>
> If you really do want external control over the uuid that is set,
> then it should be pretty trivial to do so by using the standard
> container tools for manipulating the namespace (e.g. to mount a
> file that is under control of the parent as /etc/nfs4-uuid.conf
> or whatever).
>
> However in most cases that I can think of, if the container is
> doing its own NFS mounting, then it is going to have to be set up
> with its own nfs-utils, etc, so there is no reason why we can't
> also require udev.

What Ben described in 1. is more closely aligned with how I thought
containers work today. But it could be that 2. gives the ability
to migrate the guest container to another physical host and take
its nfs4_unique_id with it.

I don't have a strong preference between the two. I'm in favor of
doing whichever gets us to "done" faster.

--
Chuck Lever
* Re: v4 clientid uniquifiers in containers/namespaces
  2022-02-07 23:59 ` Chuck Lever III
@ 2022-02-08 11:32   ` Benjamin Coddington
  2022-02-08 13:45     ` Trond Myklebust
  2022-02-08 16:47     ` Trond Myklebust
  0 siblings, 2 replies; 23+ messages in thread
From: Benjamin Coddington @ 2022-02-08 11:32 UTC
To: Chuck Lever III; +Cc: Trond Myklebust, Linux NFS Mailing List

On 7 Feb 2022, at 18:59, Chuck Lever III wrote:

>> On Feb 7, 2022, at 2:38 PM, Trond Myklebust <trondmy@hammerspace.com>
>> wrote:
>>
>> On Mon, 2022-02-07 at 15:49 +0000, Chuck Lever III wrote:
>>>
>>>
>>>> On Feb 7, 2022, at 9:05 AM, Benjamin Coddington
>>>> <bcodding@redhat.com> wrote:
>>>>
>>>> On 5 Feb 2022, at 14:50, Benjamin Coddington wrote:
>>>>
>>>>> On 5 Feb 2022, at 13:24, Trond Myklebust wrote:
>>>>>
>>>>>> On Sat, 2022-02-05 at 10:03 -0500, Benjamin Coddington wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Is anyone using a udev(-like) implementation with
>>>>>>> NETLINK_LISTEN_ALL_NSID? It looks like that is at least
>>>>>>> necessary to allow the init namespaced udev to receive
>>>>>>> notifications on /sys/fs/nfs/net/nfs_client/identifier, which
>>>>>>> would be a pre-req to automatically uniquify in containers.
>>>>>>>
>>>>>>> I'm interested since it will inform whether I need to send
>>>>>>> patches to systemd's udev, and potentially open the can of
>>>>>>> worms over there. Yet it's not yet clear to me how an init
>>>>>>> namespaced udev process can write to a netns sysfs path.
>>>>>>>
>>>>>>> Another option might be to create yet another daemon/tool that
>>>>>>> would listen specifically for these notifications. Ugh.
>>>>>>>
>>>>>>> Ben
>>>>>>>
>>>>>>
>>>>>> I don't understand. Why do you need a new daemon/tool?
>>>>
>>>> Because what we've got only works for the init namespace.
>>>>
>>>> Udev won't get kobject notifications because it's not using
>>>> NETLINK_LISTEN_ALL_NSID.
>>>>
>>>> We need to figure out if we want:
>>>>
>>>> 1) the init namespace udevd to handle all client_id uniquifiers
>>>> 2) we expect network namespaces to run their own udevd
>>>> 3) or both.
>>>>
>>>> I think 2 violates "least surprise", and 3 might not be something
>>>> anyone ever wants. If they do, we can fix it at that point.
>>>>
>>>> So to make 1 work, we can try to change udevd, or maybe just
>>>> hacking about with nfs_netns_object_child_ns_type will be
>>>> sufficient.
>>>
>>> I agree that 1 seems like the preferred approach, though
>>> I don't have a technical suggestion at this point.
>>>
>>
>> I strongly disagree. (1) requires the init namespace to have
>> intimate knowledge of container internals.

Not really, we're just distinguishing NFS clients in containers from NFS
clients on the host. That doesn't require intimate knowledge, only a
mechanism to create a unique value per-container.

>> Why do we need to make that a requirement? That violates the
>> expectation that containers are stateless by default, and also the
>> expectation that they operate independently of the environment.

I'm not familiar with the expectation that containers are stateless by
default, or that they operate independently of the environment.

>> If you really do want external control over the uuid that is set,
>> then it should be pretty trivial to do so by using the standard
>> container tools for manipulating the namespace (e.g. to mount a
>> file that is under control of the parent as /etc/nfs4-uuid.conf
>> or whatever).

We're not looking for external control, just automation. The NFS community
has decided that udev is the way to go here, so as long as we can get the
notifications to /some/ udev process, I feel confident we can make all of
this transparent.

The less we have to teach all the container tooling folks, the better for
us.

>> However in most cases that I can think of, if the container is
>> doing its own NFS mounting, then it is going to have to be set up
>> with its own nfs-utils, etc, so there is no reason why we can't
>> also require udev.

I'm not as confident about this as you are. Network namespaces are pretty
useful on their own to create independent network configurations or to
isolate hardware interfaces. We've had a few surprising cases of customers
using them in creative ways.

There's a bit of a chicken and egg problem with 2, though. If the nfs
module is loaded, the kernel notification gets sent as soon as you create
the namespace. It's not going to wait for you to move or exec udev into
that network namespace, and the notification is lost.

Can't we just uniquify the namespaced NFS client ourselves, while still
exposing /sys/fs/nfs/net/nfs_client/identifier within the namespace? That
way if someone wants to run udev or use their own method of persistent id
it's available to them within the container so they can. Then we can move
forward because the problem of distinguishing clients between the host and
netns is automagically solved.

Where we are today is the host NFS client is uniquified, and all the netns
clients are distinguished from the host, but not each other.

Ben
* Re: v4 clientid uniquifiers in containers/namespaces
  2022-02-08 11:32 ` Benjamin Coddington
@ 2022-02-08 13:45   ` Trond Myklebust
  2022-02-08 14:29     ` Benjamin Coddington
  2022-02-08 16:47   ` Trond Myklebust
  1 sibling, 1 reply; 23+ messages in thread
From: Trond Myklebust @ 2022-02-08 13:45 UTC
To: bcodding, chuck.lever; +Cc: linux-nfs

On Tue, 2022-02-08 at 06:32 -0500, Benjamin Coddington wrote:
> On 7 Feb 2022, at 18:59, Chuck Lever III wrote:
>
> > > On Feb 7, 2022, at 2:38 PM, Trond Myklebust
> > > <trondmy@hammerspace.com> wrote:
> > >
> > > On Mon, 2022-02-07 at 15:49 +0000, Chuck Lever III wrote:
> > > >
> > > >
> > > > > On Feb 7, 2022, at 9:05 AM, Benjamin Coddington
> > > > > <bcodding@redhat.com> wrote:
> > > > >
> > > > > On 5 Feb 2022, at 14:50, Benjamin Coddington wrote:
> > > > >
> > > > > > On 5 Feb 2022, at 13:24, Trond Myklebust wrote:
> > > > > >
> > > > > > > On Sat, 2022-02-05 at 10:03 -0500, Benjamin Coddington
> > > > > > > wrote:
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > Is anyone using a udev(-like) implementation with
> > > > > > > > NETLINK_LISTEN_ALL_NSID? It looks like that is at least
> > > > > > > > necessary to allow the init namespaced udev to receive
> > > > > > > > notifications on /sys/fs/nfs/net/nfs_client/identifier, which
> > > > > > > > would be a pre-req to automatically uniquify in containers.
> > > > > > > >
> > > > > > > > I'm interested since it will inform whether I need to send
> > > > > > > > patches to systemd's udev, and potentially open the can of
> > > > > > > > worms over there. Yet it's not yet clear to me how an init
> > > > > > > > namespaced udev process can write to a netns sysfs path.
> > > > > > > >
> > > > > > > > Another option might be to create yet another daemon/tool that
> > > > > > > > would listen specifically for these notifications. Ugh.
> > > > > > > >
> > > > > > > > Ben
> > > > > > > >
> > > > > > >
> > > > > > > I don't understand. Why do you need a new daemon/tool?
> > > > >
> > > > > Because what we've got only works for the init namespace.
> > > > >
> > > > > Udev won't get kobject notifications because it's not using
> > > > > NETLINK_LISTEN_ALL_NSID.
> > > > >
> > > > > We need to figure out if we want:
> > > > >
> > > > > 1) the init namespace udevd to handle all client_id uniquifiers
> > > > > 2) we expect network namespaces to run their own udevd
> > > > > 3) or both.
> > > > >
> > > > > I think 2 violates "least surprise", and 3 might not be something
> > > > > anyone ever wants. If they do, we can fix it at that point.
> > > > >
> > > > > So to make 1 work, we can try to change udevd, or maybe just
> > > > > hacking about with nfs_netns_object_child_ns_type will be
> > > > > sufficient.
> > > >
> > > > I agree that 1 seems like the preferred approach, though
> > > > I don't have a technical suggestion at this point.
> > > >
> > >
> > > I strongly disagree. (1) requires the init namespace to have
> > > intimate knowledge of container internals.
>
> Not really, we're just distinguishing NFS clients in containers from
> NFS clients on the host. That doesn't require intimate knowledge,
> only a mechanism to create a unique value per-container.
>
> > > Why do we need to make that a requirement? That violates the
> > > expectation that containers are stateless by default, and also the
> > > expectation that they operate independently of the environment.
>
> I'm not familiar with the expectation that containers are stateless
> by default, or that they operate independently of the environment.
>

Put differently: do you expect QEMU/KVM and VMware ESX to have to know
a priori that a VM is going to use NFSv4, and force them to have to
modify the VM state accordingly? No, of course not. So why do you think
this is a good idea for containers?

This is exactly the problem with the keyring upcall mechanism, and why
it is completely useless on a modern system. It relies on the top level
knowing what the containers are doing and how they are configured.

Imagine if you want to nest containers (yes, people do that - just
Google "nested docker containers"). Your top level process would have
to know not just how the first level of containers is configured
(network details, user mappings, ...), but also details about how the
child containers, that it is not directly managing, are configured.
It's just not practical.

> > > If you really do want external control over the uuid that is set,
> > > then it should be pretty trivial to do so by using the standard
> > > container tools for manipulating the namespace (e.g. to mount a
> > > file that is under control of the parent as /etc/nfs4-uuid.conf
> > > or whatever).
>
> We're not looking for external control, just automation. The NFS
> community has decided that udev is the way to go here, so as long as
> we can get the notifications to /some/ udev process, I feel confident
> we can make all of this transparent.
>
> The less we have to teach all the container tooling folks, the better
> for us.
>

Agreed. I'm saying that the udev case also allows for top level control
if you think you need it.

> > > However in most cases that I can think of, if the container is
> > > doing its own NFS mounting, then it is going to have to be set up
> > > with its own nfs-utils, etc, so there is no reason why we can't
> > > also require udev.
>
> I'm not as confident about this as you are. Network namespaces are
> pretty useful on their own to create independent network
> configurations or to isolate hardware interfaces. We've had a few
> surprising cases of customers using them in creative ways.
>
> There's a bit of a chicken and egg problem with 2, though. If the nfs
> module is loaded, the kernel notification gets sent as soon as you
> create the namespace. It's not going to wait for you to move or exec
> udev into that network namespace, and the notification is lost.
>
> Can't we just uniquify the namespaced NFS client ourselves, while
> still exposing /sys/fs/nfs/net/nfs_client/identifier within the
> namespace? That way if someone wants to run udev or use their own
> method of persistent id it's available to them within the container
> so they can. Then we can move forward because the problem of
> distinguishing clients between the host and netns is automagically
> solved.

That could be done.

>
> Where we are today is the host NFS client is uniquified, and all the
> netns clients are distinguished from the host, but not each other.
>
> Ben
>

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
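To illustrate the "top level control" point in the message above: the
nfs4_uuid helper shown at the start of the thread could prefer a file that
the parent makes visible inside the container, and fall back to a locally
generated value otherwise. This is only a sketch under stated assumptions.
The function name nfs4_uuid_with_override, its two-argument layout, and the
fallback to /proc/sys/kernel/random/uuid when uuidgen is missing are all
hypothetical and not something proposed in the thread.

```shell
#!/bin/sh
# Hypothetical variant of the nfs4_uuid helper quoted earlier in the thread.
#   $1  override file supplied by the parent, e.g. a file bind-mounted into
#       the container as /etc/nfs4-uuid.conf
#   $2  persistent state file owned by the container, e.g. /etc/nfs4_uuid
nfs4_uuid_with_override() {
    override="$1"
    state="$2"

    if [ -s "$override" ]; then
        # The parent supplied an identifier: use it verbatim.
        uuid="$(cat "$override")"
    elif [ -s "$state" ]; then
        # Reuse the identifier generated on an earlier run.
        uuid="$(cat "$state")"
    else
        # First run: generate a random UUID and persist it.
        if command -v uuidgen >/dev/null 2>&1; then
            uuid="$(uuidgen -r)"
        else
            uuid="$(cat /proc/sys/kernel/random/uuid)"
        fi
        printf '%s' "$uuid" > "$state"
    fi

    # udev's PROGRAM= captures this line as %c.
    printf '%s\n' "$uuid"
}
```

With this shape the container needs no knowledge of who supplied the value,
and the parent needs no knowledge of the container beyond where to place
one file.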
* Re: v4 clientid uniquifiers in containers/namespaces 2022-02-08 13:45 ` Trond Myklebust @ 2022-02-08 14:29 ` Benjamin Coddington 2022-02-08 14:42 ` Chuck Lever III 0 siblings, 1 reply; 23+ messages in thread From: Benjamin Coddington @ 2022-02-08 14:29 UTC (permalink / raw) To: Trond Myklebust; +Cc: chuck.lever, linux-nfs On 8 Feb 2022, at 8:45, Trond Myklebust wrote: > On Tue, 2022-02-08 at 06:32 -0500, Benjamin Coddington wrote: >> On 7 Feb 2022, at 18:59, Chuck Lever III wrote: >> >>>> On Feb 7, 2022, at 2:38 PM, Trond Myklebust >>>> <trondmy@hammerspace.com> >>>> wrote: >>>> >>>> On Mon, 2022-02-07 at 15:49 +0000, Chuck Lever III wrote: >>>>> >>>>> >>>>>> On Feb 7, 2022, at 9:05 AM, Benjamin Coddington >>>>>> <bcodding@redhat.com> wrote: >>>>>> >>>>>> On 5 Feb 2022, at 14:50, Benjamin Coddington wrote: >>>>>> >>>>>>> On 5 Feb 2022, at 13:24, Trond Myklebust wrote: >>>>>>> >>>>>>>> On Sat, 2022-02-05 at 10:03 -0500, Benjamin Coddington >>>>>>>> wrote: >>>>>>>>> Hi all, >>>>>>>>> >>>>>>>>> Is anyone using a udev(-like) implementation with >>>>>>>>> NETLINK_LISTEN_ALL_NSID? >>>>>>>>> It looks like that is at least necessary to allow the >>>>>>>>> init >>>>>>>>> namespaced >>>>>>>>> udev >>>>>>>>> to receive notifications on >>>>>>>>> /sys/fs/nfs/net/nfs_client/identifier, >>>>>>>>> which >>>>>>>>> would be a pre-req to automatically uniquify in >>>>>>>>> containers. >>>>>>>>> >>>>>>>>> I'md interested since it will inform whether I need to >>>>>>>>> send >>>>>>>>> patches >>>>>>>>> to >>>>>>>>> systemd's udev, and potentially open the can of worms >>>>>>>>> over >>>>>>>>> there. >>>>>>>>> Yet its >>>>>>>>> not yet clear to me how an init namespaced udev process >>>>>>>>> can >>>>>>>>> write to >>>>>>>>> a netns >>>>>>>>> sysfs path. >>>>>>>>> >>>>>>>>> Another option might be to create yet another >>>>>>>>> daemon/tool >>>>>>>>> that would >>>>>>>>> listen >>>>>>>>> specifically for these notifications. Ugh. 
>>>>>>>>> >>>>>>>>> Ben >>>>>>>>> >>>>>>>> >>>>>>>> I don't understand. Why do you need a new daemon/tool? >>>>>> >>>>>> Because what we've got only works for the init namespace. >>>>>> >>>>>> Udev won't get kobject notifications because its not using >>>>>> NETLINK_LISTEN_ALL_NSIDs. >>>>>> >>>>>> We need to figure out if we want: >>>>>> >>>>>> 1) the init namespace udevd to handle all client_id >>>>>> uniquifiers >>>>>> 2) we expect network namespaces to run their own udevd >>>>>> 3) or both. >>>>>> >>>>>> I think 2 violates "least surprise", and 3 might not be >>>>>> something >>>>>> anyone >>>>>> ever wants. If they do, we can fix it at that point. >>>>>> >>>>>> So to make 1 work, we can try to change udevd, or maybe just >>>>>> hacking about >>>>>> with nfs_netns_object_child_ns_type will be sufficient. >>>>> >>>>> I agree that 1 seems like the preferred approach, though >>>>> I don't have a technical suggestion at this point. >>>>> >>>> >>>> I strongly disagree. (1) requires the init namespace to have >>>> intimate >>>> knowledge of container internals. >> >> Not really, we're just distinguishing NFS clients in containers from >> NFS >> clients on the host. That doesn't require intimate knowledge, only a >> mechanism to create a unique value per-container. >> >>>> Why do we need to make that a requirement? That violates the >>>> expectation >>>> that containers are stateless by default, and also the >>>> expectation >>>> that >>>> they operate independently of the environment. >> >> I'm not familiar with the expectation that containers are stateless >> by >> default, or that they operate independently of the environment. >> > > Put differently: do you expect QEMU/KVM and VMware ESX to have to know > a priori that a VM is going to use NFSv4, and force them to have to > modify the VM state accordingly? No, of course not. So why do you think > this is a good idea for containers? 
Well, I don't think /that's/ a good idea, no, but I don't think the comparison is valid. I wouldn't equate containers with VMs when it comes to configuration or state because VMs attempt to create a nearly isolated processing environment, while containers or namespaces are a complete mish-mash of objects, state, and paradigms. A lot of what happens in a particular set of namespaces can happen and affect objects in init too. The immediate example is the very problem we're trying to fix: NFS clients in netns can disrupt/reclaim state from the init namespace client. > This is exactly the problem with the keyring upcall mechanism, and why > it is completely useless on a modern system. It relies on the top level > knowing what the containers are doing and how they are configured. We're actually talking over this problem while working on TLS, and I agree that keyrings need changes to allow userspace callouts to be "routed", and that configuration must come from within the containers. And lacking a container taking responsibility for it, it is up to the host to do something sane. > Imagine if you want to nest containers (yes, people do that - just > Google "nested docker containers"). Your top level process would have > to know not just how the first level of containers is configured > (network details, user mappings, ...), but also details about how the > child containers, that it is not directly managing, are configured. > It's just not practical. Oh yeah, I know all about it. It's quite a mess, and every subsystem that has to account for all of this does it a little differently. >> Can't we just uniquify the namespaced NFS client ourselves, while >> still >> exposing /sys/fs/nfs/net/nfs_client/identifier within the namespace? >> That >> way if someone wants to run udev or use their own method of persistent >> id >> it's available to them within the container so they can. 
Then we can >> move >> forward because the problem of distinguishing clients between the >> host >> and >> netns is automagically solved. > > That could be done. OK, I'm eyeballing a SHA-1 of the init namespace uniquifier and peernet2id_alloc(new_net, init_net), but that means the NFS client would grow a dependency on CRYPTO and CRYPTO_SHA1. Hm. Ben
* Re: v4 clientid uniquifiers in containers/namespaces 2022-02-08 14:29 ` Benjamin Coddington @ 2022-02-08 14:42 ` Chuck Lever III 2022-02-08 15:23 ` Benjamin Coddington 0 siblings, 1 reply; 23+ messages in thread From: Chuck Lever III @ 2022-02-08 14:42 UTC (permalink / raw) To: Benjamin Coddington; +Cc: Trond Myklebust, Linux NFS Mailing List > On Feb 8, 2022, at 9:29 AM, Benjamin Coddington <bcodding@redhat.com> wrote: > > On 8 Feb 2022, at 8:45, Trond Myklebust wrote: > >>> Can't we just uniquify the namespaced NFS client ourselves, while >>> still >>> exposing /sys/fs/nfs/net/nfs_client/identifier within the namespace? >>> That >>> way if someone want to run udev or use their own method of persistent >>> id >>> its available to them within the container so they can. Then we can >>> move >>> forward because the problem of distinguishing clients between the >>> host >>> and >>> netns is automagically solved. >> >> That could be done. > > Ok, I'm eyeballing a sha1 of the init namespace uniquifier and > peernet2id_alloc(new_net, init_net).. but means the NFS client would grow a > dependency on CRYPTO and CRYPTO_SHA1. Or you could use siphash instead of SHA-1. I don't think we should be adding any more SHA-1 to the kernel -- it's deprecated for good reasons. -- Chuck Lever ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: v4 clientid uniquifiers in containers/namespaces 2022-02-08 14:42 ` Chuck Lever III @ 2022-02-08 15:23 ` Benjamin Coddington 2022-02-08 15:43 ` Trond Myklebust 0 siblings, 1 reply; 23+ messages in thread From: Benjamin Coddington @ 2022-02-08 15:23 UTC (permalink / raw) To: Chuck Lever III; +Cc: Trond Myklebust, Linux NFS Mailing List On 8 Feb 2022, at 9:42, Chuck Lever III wrote: >> On Feb 8, 2022, at 9:29 AM, Benjamin Coddington <bcodding@redhat.com> >> wrote: >> >> On 8 Feb 2022, at 8:45, Trond Myklebust wrote: >> >>>> Can't we just uniquify the namespaced NFS client ourselves, while >>>> still >>>> exposing /sys/fs/nfs/net/nfs_client/identifier within the >>>> namespace? >>>> That >>>> way if someone want to run udev or use their own method of >>>> persistent >>>> id >>>> its available to them within the container so they can. Then we >>>> can >>>> move >>>> forward because the problem of distinguishing clients between the >>>> host >>>> and >>>> netns is automagically solved. >>> >>> That could be done. >> >> Ok, I'm eyeballing a sha1 of the init namespace uniquifier and >> peernet2id_alloc(new_net, init_net).. but means the NFS client would >> grow a >> dependency on CRYPTO and CRYPTO_SHA1. > > Or you could use siphash instead of SHA-1. > > I don't think we should be adding any more SHA-1 to the kernel -- > it's deprecated for good reasons. Thanks! Siphash is nicer too. :) Ben ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: v4 clientid uniquifiers in containers/namespaces 2022-02-08 15:23 ` Benjamin Coddington @ 2022-02-08 15:43 ` Trond Myklebust 2022-02-08 15:47 ` Trond Myklebust 0 siblings, 1 reply; 23+ messages in thread From: Trond Myklebust @ 2022-02-08 15:43 UTC (permalink / raw) To: bcodding, chuck.lever; +Cc: linux-nfs On Tue, 2022-02-08 at 10:23 -0500, Benjamin Coddington wrote: > On 8 Feb 2022, at 9:42, Chuck Lever III wrote: > > > > On Feb 8, 2022, at 9:29 AM, Benjamin Coddington > > > <bcodding@redhat.com> > > > wrote: > > > > > > On 8 Feb 2022, at 8:45, Trond Myklebust wrote: > > > > > > > > Can't we just uniquify the namespaced NFS client ourselves, > > > > > while > > > > > still > > > > > exposing /sys/fs/nfs/net/nfs_client/identifier within the > > > > > namespace? > > > > > That > > > > > way if someone want to run udev or use their own method of > > > > > persistent > > > > > id > > > > > its available to them within the container so they can. Then > > > > > we > > > > > can > > > > > move > > > > > forward because the problem of distinguishing clients between > > > > > the > > > > > host > > > > > and > > > > > netns is automagically solved. > > > > > > > > That could be done. > > > > > > Ok, I'm eyeballing a sha1 of the init namespace uniquifier and > > > peernet2id_alloc(new_net, init_net).. but means the NFS client > > > would > > > grow a > > > dependency on CRYPTO and CRYPTO_SHA1. > > > > Or you could use siphash instead of SHA-1. > > > > I don't think we should be adding any more SHA-1 to the kernel -- > > it's deprecated for good reasons. > > Thanks! Siphash is nicer too. :) > > peernet2id_alloc() is not designed for this. It appears to use idr_alloc(), which means it will reuse values frequently. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: v4 clientid uniquifiers in containers/namespaces 2022-02-08 15:43 ` Trond Myklebust @ 2022-02-08 15:47 ` Trond Myklebust 2022-02-08 16:18 ` Benjamin Coddington 0 siblings, 1 reply; 23+ messages in thread From: Trond Myklebust @ 2022-02-08 15:47 UTC (permalink / raw) To: bcodding, chuck.lever; +Cc: linux-nfs On Tue, 2022-02-08 at 15:43 +0000, Trond Myklebust wrote: > On Tue, 2022-02-08 at 10:23 -0500, Benjamin Coddington wrote: > > On 8 Feb 2022, at 9:42, Chuck Lever III wrote: > > > > > > On Feb 8, 2022, at 9:29 AM, Benjamin Coddington > > > > <bcodding@redhat.com> > > > > wrote: > > > > > > > > On 8 Feb 2022, at 8:45, Trond Myklebust wrote: > > > > > > > > > > Can't we just uniquify the namespaced NFS client ourselves, > > > > > > while > > > > > > still > > > > > > exposing /sys/fs/nfs/net/nfs_client/identifier within the > > > > > > namespace? > > > > > > That > > > > > > way if someone want to run udev or use their own method of > > > > > > persistent > > > > > > id > > > > > > its available to them within the container so they can. > > > > > > Then > > > > > > we > > > > > > can > > > > > > move > > > > > > forward because the problem of distinguishing clients > > > > > > between > > > > > > the > > > > > > host > > > > > > and > > > > > > netns is automagically solved. > > > > > > > > > > That could be done. > > > > > > > > Ok, I'm eyeballing a sha1 of the init namespace uniquifier and > > > > peernet2id_alloc(new_net, init_net).. but means the NFS client > > > > would > > > > grow a > > > > dependency on CRYPTO and CRYPTO_SHA1. > > > > > > Or you could use siphash instead of SHA-1. > > > > > > I don't think we should be adding any more SHA-1 to the kernel -- > > > it's deprecated for good reasons. > > > > Thanks! Siphash is nicer too. :) > > > > > > peernet2id_alloc() is not designed for this. It appears to use > idr_alloc(), which means it will reuse values frequently. 
> Furthermore, that would introduce a dependency on the init namespace identifier being unique, which precludes its use for initialising said init namespace. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: v4 clientid uniquifiers in containers/namespaces 2022-02-08 15:47 ` Trond Myklebust @ 2022-02-08 16:18 ` Benjamin Coddington 0 siblings, 0 replies; 23+ messages in thread From: Benjamin Coddington @ 2022-02-08 16:18 UTC (permalink / raw) To: Trond Myklebust; +Cc: chuck.lever, linux-nfs On 8 Feb 2022, at 10:47, Trond Myklebust wrote: >> peernet2id_alloc() is not designed for this. It appears to use >> idr_alloc(), which means it will reuse values frequently. I did not think of that. > Furthermore, that would introduce a dependency on the init namespace > identifier being unique, which precludes its use for initialising said > init namespace. That's what the udev rule will fix! :) I think I was still on the deterministic bus, but it seems to make the most sense to simply use a random value as a default, then. And if a container wants to be the same client across restarts, it must run udev or write to sysfs itself. Ben
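For the "write to sysfs itself" case, something along these lines might be enough. It is only a sketch in the spirit of the nfs4_uuid script earlier in the thread: the function name and the idea of passing both paths as arguments are assumptions made for illustration, and it presumes it runs inside the container's network namespace before the first mount.

```shell
#!/bin/bash
# Hypothetical helper: set_nfs_identifier STATE_FILE SYSFS_ATTR
# Reuses the UUID kept in STATE_FILE, creating it on first use, and then
# writes it to SYSFS_ATTR. Both paths are parameters so the function can
# be tried against ordinary files before pointing it at sysfs.
set_nfs_identifier() {
    local state_file="$1" sysfs_attr="$2" uuid

    if [ -s "$state_file" ]; then
        # A value was persisted earlier: keep presenting the same client.
        uuid="$(cat "$state_file")"
    else
        # First use: generate a random UUID and persist it.
        uuid="$(uuidgen -r)" || return 1
        printf '%s' "$uuid" > "$state_file" || return 1
    fi

    # Replace whatever default the kernel picked for this namespace.
    printf '%s' "$uuid" > "$sysfs_attr"
}
```

In a container this would be called as something like `set_nfs_identifier /etc/nfs4_uuid /sys/fs/nfs/net/nfs_client/identifier`, where /etc/nfs4_uuid is simply the file name the earlier script happened to use.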
* Re: v4 clientid uniquifiers in containers/namespaces 2022-02-08 11:32 ` Benjamin Coddington 2022-02-08 13:45 ` Trond Myklebust @ 2022-02-08 16:47 ` Trond Myklebust 2022-02-08 17:45 ` Benjamin Coddington 1 sibling, 1 reply; 23+ messages in thread From: Trond Myklebust @ 2022-02-08 16:47 UTC (permalink / raw) To: bcodding, chuck.lever; +Cc: linux-nfs On Tue, 2022-02-08 at 06:32 -0500, Benjamin Coddington wrote: > > There's a bit of a chicken and egg problem with 2, though. If the > nfs > module is loaded, the kernel notification gets sent as soon as you > create > the namespace. Its not going to wait for you to move or exec udev > into > that > network namespace, and the notification is lost. Wait a minute... I missed this comment earlier, but it definitely points to a misunderstanding. The notification is _not_ sent by the act of loading a module. It is sent by the call to kobject_uevent() in nfs_netns_sysfs_setup(). That again is called as part of nfs_net_init() when the net namespace gets created. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: v4 clientid uniquifiers in containers/namespaces 2022-02-08 16:47 ` Trond Myklebust @ 2022-02-08 17:45 ` Benjamin Coddington 0 siblings, 0 replies; 23+ messages in thread From: Benjamin Coddington @ 2022-02-08 17:45 UTC (permalink / raw) To: Trond Myklebust; +Cc: chuck.lever, linux-nfs On 8 Feb 2022, at 11:47, Trond Myklebust wrote: > On Tue, 2022-02-08 at 06:32 -0500, Benjamin Coddington wrote: >> >> There's a bit of a chicken and egg problem with 2, though. If the >> nfs >> module is loaded, the kernel notification gets sent as soon as you >> create >> the namespace. It's not going to wait for you to move or exec udev >> into >> that >> network namespace, and the notification is lost. > > > Wait a minute... I missed this comment earlier, but it definitely > points to a misunderstanding. > > The notification is _not_ sent by the act of loading a module. It is > sent by the call to kobject_uevent() in nfs_netns_sysfs_setup(). That > again is called as part of nfs_net_init() when the net namespace gets > created. My communication was poor. The first notification is sent to udev when the nfs module is loaded. That is the initial creation of the sysfs, the notification in the init namespace. After that, if a network namespace is created and "the nfs module is [already] loaded", the notification is sent immediately. I think we both understand it, and our understanding matches how it works. Ben
* Re: v4 clientid uniquifiers in containers/namespaces 2022-02-05 15:03 v4 clientid uniquifiers in containers/namespaces Benjamin Coddington 2022-02-05 18:24 ` Trond Myklebust @ 2022-02-08 1:59 ` NeilBrown 2022-02-08 11:52 ` Benjamin Coddington 1 sibling, 1 reply; 23+ messages in thread From: NeilBrown @ 2022-02-08 1:59 UTC (permalink / raw) To: Benjamin Coddington; +Cc: Linux NFS Mailing List On Sun, 06 Feb 2022, Benjamin Coddington wrote: > Hi all, > > Is anyone using a udev(-like) implementation with NETLINK_LISTEN_ALL_NSID? > It looks like that is at least necessary to allow the init namespaced udev > to receive notifications on /sys/fs/nfs/net/nfs_client/identifier, which > would be a pre-req to automatically uniquify in containers. Could you walk me through the reasoning here - or point me to where it has been discussed. It seems to me that mount.nfs is the place to set nfs_client/identifier. It can be told (via /etc/nfs.conf or /etc/nfsmount.conf) how to generate and where to store the identifier. It can check the current value and update if needed. As long as the identifier is set before the first mount, there is no rush. Why does it need to be done in response to a uevent?? Thanks, NeilBrown ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: v4 clientid uniquifiers in containers/namespaces 2022-02-08 1:59 ` NeilBrown @ 2022-02-08 11:52 ` Benjamin Coddington 2022-02-08 20:56 ` NeilBrown 0 siblings, 1 reply; 23+ messages in thread From: Benjamin Coddington @ 2022-02-08 11:52 UTC (permalink / raw) To: NeilBrown; +Cc: Linux NFS Mailing List On 7 Feb 2022, at 20:59, NeilBrown wrote: > On Sun, 06 Feb 2022, Benjamin Coddington wrote: >> Hi all, >> >> Is anyone using a udev(-like) implementation with NETLINK_LISTEN_ALL_NSID? >> It looks like that is at least necessary to allow the init namespaced udev >> to receive notifications on /sys/fs/nfs/net/nfs_client/identifier, which >> would be a pre-req to automatically uniquify in containers. > > Could you walk me through the reasoning here - or point me to where it > has been discussed. https://lore.kernel.org/linux-nfs/20210414181040.7108-1-steved@redhat.com/ > It seems to me that mount.nfs is the place to set nfs_client/identifier. > It can be told (via /etc/nfs.conf or /etc/nfsmount.conf) how to generate > and where to store the identifier. It can check the current value and > update if needed. As long as the identifier is set before the first > mount, there is no rush. > > Why does it need to be done in response to a uevent?? I think the assertion was that it was the only sensible way, and it does seem to be better than exposing yet another knob when all that's needed is a way to distinguish and persist NFS clients when network namespaces can come and go at any time, and there can be a lot of them. Ben ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: v4 clientid uniquifiers in containers/namespaces 2022-02-08 11:52 ` Benjamin Coddington @ 2022-02-08 20:56 ` NeilBrown 2022-02-08 23:34 ` Trond Myklebust 0 siblings, 1 reply; 23+ messages in thread From: NeilBrown @ 2022-02-08 20:56 UTC (permalink / raw) To: Benjamin Coddington; +Cc: Linux NFS Mailing List On Tue, 08 Feb 2022, Benjamin Coddington wrote: > On 7 Feb 2022, at 20:59, NeilBrown wrote: > > > On Sun, 06 Feb 2022, Benjamin Coddington wrote: > >> Hi all, > >> > >> Is anyone using a udev(-like) implementation with NETLINK_LISTEN_ALL_NSID? > >> It looks like that is at least necessary to allow the init namespaced udev > >> to receive notifications on /sys/fs/nfs/net/nfs_client/identifier, which > >> would be a pre-req to automatically uniquify in containers. > > > > Could you walk me through the reasoning here - or point me to where it > > has been discussed. > > https://lore.kernel.org/linux-nfs/20210414181040.7108-1-steved@redhat.com/ Thanks. I did remember that discussion though it was helpful to refresh my memory, and to be sure there is nothing else. > > > It seems to me that mount.nfs is the place to set nfs_client/identifier. > > It can be told (via /etc/nfs.conf or /etc/nfsmount.conf) how to generate > > and where to store the identifier. It can check the current value and > > update if needed. As long as the identifier is set before the first > > mount, there is no rush. > > > > Why does it need to be done in response to a uevent?? > > I think the assertion was that it was the only sensible way, and it does > seem to be better than exposing yet another knob when all that's needed is a > way to distinguish and persist NFS clients when network namespaces can come > and go at any time, and there can be a lot of them. "assertion" is an apt word. There wasn't a whole lot of reasoned argument, mostly just assertions. The best argument was that "nfs.conf is not namespace aware", which is only somewhat true. 
Using "ip netns exec" will make non-namespace-aware tools work correctly in namespaces provided their config files are in /etc/netns/NAME - they get bind-mounted over the files in /etc. And of course /etc/nfs.conf can be MADE namespace aware. There is also a reasonable argument that auto-editing /etc/nfs.conf risks collision with an admin, but that is why we have /etc/nfs.conf.d. For me, the weakest part of Steve's case was that he presented it as "setting module parameters via nfs.conf" rather than "configuring client identity via nfs.conf". A number of the early negative responses were focused on the distraction of a module parameter being involved. The weakness of the alternative, of course, is the fact that using the udev mechanism requires running udevd in each network namespace, which is an unnecessary burden. So I still STRONGLY think that the identity should be set by mount.nfs reading (and writing) some file in /etc or /etc/netns/NAME, and I weakly think that the file should be in /etc/nfs.conf.d/ so that the reading is automagic. Thanks, NeilBrown
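To make the /etc/netns/NAME convention concrete, here is a rough sketch of seeding a per-namespace identity file. The helper name and the file name nfs4_uuid are assumptions (the latter borrowed from the script earlier in the thread); nothing here is an existing nfs-utils interface. The optional second argument exists only so the function can be exercised outside /etc.

```shell
#!/bin/bash
# Hypothetical helper: seed_netns_uuid NETNS_NAME [ETC_ROOT]
# Creates ETC_ROOT/netns/NETNS_NAME/nfs4_uuid if it does not exist yet and
# prints the stored value. ETC_ROOT defaults to /etc.
seed_netns_uuid() {
    local name="$1" etc_root="${2:-/etc}"
    local dir="$etc_root/netns/$name"
    local file="$dir/nfs4_uuid"

    mkdir -p "$dir" || return 1
    if [ ! -s "$file" ]; then
        # No identity stored for this namespace yet: generate one.
        uuidgen -r | tr -d '\n' > "$file" || return 1
    fi
    cat "$file"
}
```

A tool started with `ip netns exec NAME ...` would then see that file at /etc/nfs4_uuid, since the per-namespace files are bind-mounted over their counterparts in /etc. I believe the target has to exist in /etc already for the bind mount to succeed, so treat that part as something to verify.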
* Re: v4 clientid uniquifiers in containers/namespaces 2022-02-08 20:56 ` NeilBrown @ 2022-02-08 23:34 ` Trond Myklebust 2022-02-09 1:24 ` NeilBrown 2022-02-15 16:57 ` Benjamin Coddington 0 siblings, 2 replies; 23+ messages in thread From: Trond Myklebust @ 2022-02-08 23:34 UTC (permalink / raw) To: neilb, bcodding; +Cc: linux-nfs On Wed, 2022-02-09 at 07:56 +1100, NeilBrown wrote: > On Tue, 08 Feb 2022, Benjamin Coddington wrote: > > On 7 Feb 2022, at 20:59, NeilBrown wrote: > > > > > On Sun, 06 Feb 2022, Benjamin Coddington wrote: > > > > Hi all, > > > > > > > > Is anyone using a udev(-like) implementation with > > > > NETLINK_LISTEN_ALL_NSID? > > > > It looks like that is at least necessary to allow the init > > > > namespaced udev > > > > to receive notifications on > > > > /sys/fs/nfs/net/nfs_client/identifier, which > > > > would be a pre-req to automatically uniquify in containers. > > > > > > Could you walk me through the reasoning here - or point me to > > > where it > > > has been discussed. > > > > https://lore.kernel.org/linux-nfs/20210414181040.7108-1-steved@redhat.com/ > > Thanks. I did remember that discussion though it was helpful to > refresh > my memory, and to be sure there is nothing else. > > > > > > It seems to me that mount.nfs is the place to set > > > nfs_client/identifier. > > > It can be told (via /etc/nfs.conf or /etc/nfsmount.conf) how to > > > generate > > > and where to store the identifier. It can check the current > > > value and > > > update if needed. As long as the identifier is set before the > > > first > > > mount, there is no rush. > > > > > > Why does it need to be done in response to a uevent?? > > > > I think the assertion was that it was the only sensible way, and it > > does > > seem to be better than exposing yet another knob when all that's > > needed is a > > way to distinguish and persist NFS clients when network namespaces > > can come > > and go at any time, and there can be a lot of them. 
> > "assertion" is an apt word. There wasn't a whole lot of reasoned > argument, mostly just assertions. > > The best argument was that "nfs.conf is not namespace aware", which > is > only somewhat true. Using "ip netns exec" will make > non-namespace-aware tools work correctly in namespaces providing > their > config files are in /etc/netns/NAME - they get bind-mounted over the > files in /etc. > And of course /etc/nfs.conf can be MADE namespace aware. > > There is also a reasonable argument that auto-editing /etc/nfs.conf > risks collision with an admin, but that is why we have > /etc/nfs.conf.d > > For me, the weakest part of Steve's case was that he presented it > as > "setting module parameters via nfs.conf" rather than "configuring > client > identity via nfs.conf". A number of the early negative responses > were > focused on the distraction of a module parameter being involved. > > The weakness for the alternative, of course, is the fact that using > the > udev mechanism requires running udevd in each network namespace, > which > is an unnecessary burden. > > So I still STRONGLY think that the identity should be set by > mount.nfs > reading (and writing) some file in /etc or /etc/netns/NAME, and I > weakly think that the file should be in /etc/nfs.conf.d/ so that the > reading is automagic. > No. It's not a per-mount setting, so it has no business being in the mount protocol. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com
* Re: v4 clientid uniquifiers in containers/namespaces 2022-02-08 23:34 ` Trond Myklebust @ 2022-02-09 1:24 ` NeilBrown 2022-02-15 16:57 ` Benjamin Coddington 1 sibling, 0 replies; 23+ messages in thread From: NeilBrown @ 2022-02-09 1:24 UTC (permalink / raw) To: Trond Myklebust; +Cc: bcodding, linux-nfs On Wed, 09 Feb 2022, Trond Myklebust wrote: > On Wed, 2022-02-09 at 07:56 +1100, NeilBrown wrote: > > > > So I still STRONGLY think that the identity should be set by > > mount.nfs > > reading (and writing) some file in /etc or /etc/netns/NAME, and I > > weakly think that the file should be in /etc/nfs.conf.d/ so that the > > reading is automagic. > > > > No. It's not a per-mount setting, so it has no business being in the > mount protocol. I agree that it is not different for different mounts, but every mount needs it, and without any mounts it is not needed. Much like statd really, which is started by mount.nfs when it is determined that it is needed, but not running. NeilBrown
* Re: v4 clientid uniquifiers in containers/namespaces 2022-02-08 23:34 ` Trond Myklebust 2022-02-09 1:24 ` NeilBrown @ 2022-02-15 16:57 ` Benjamin Coddington 1 sibling, 0 replies; 23+ messages in thread From: Benjamin Coddington @ 2022-02-15 16:57 UTC (permalink / raw) To: Trond Myklebust; +Cc: neilb, linux-nfs On 8 Feb 2022, at 18:34, Trond Myklebust wrote: > On Wed, 2022-02-09 at 07:56 +1100, NeilBrown wrote: >> So I still STRONGLY think that the identity should be set by >> mount.nfs >> reading (and writing) some file in /etc or /etc/netns/NAME, and I >> weakly think that the file should be in /etc/nfs.conf.d/ so that the >> reading is automagic. >> > > No. It's not a per-mount setting, so it has no business being in the > mount protocol. Trond, We still have the issue that udev handling the event to set the uniquifier for the init namespace races with the first SETCLIENTID/EXCHANGE_ID. Now that network namespaces uniquify by default, would you prefer we try to solve this with the userspace tools setting the module parameter instead of depending on udev for the init namespace? Alternatively, we could grow another module parameter: nfs4_unique_id_timeout:int Seconds to wait for a uniquifier A non-zero default also gives network namespaces the chance to set a persistent value that differs from the random value the kernel generated. Ben
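As a rough illustration of the "userspace tools setting the module parameter" option, a tool could emit a modprobe fragment from a stored UUID, so that the value is in place when the module loads rather than arriving later via udev. This is a sketch only: the helper name and file locations are hypothetical, and it assumes the parameter lives on the nfs module, as the kernel's nfs.nfs4_unique_id documentation suggests.

```shell
#!/bin/bash
# Hypothetical helper: write_nfs_modprobe_conf UUID_FILE CONF_FILE
# Emits an "options" line carrying the stored uniquifier. Where the two
# files live is left to the caller.
write_nfs_modprobe_conf() {
    local uuid_file="$1" conf_file="$2" uuid

    # Refuse to write an empty uniquifier.
    [ -s "$uuid_file" ] || return 1
    uuid="$(cat "$uuid_file")"

    # Assumption: nfs4_unique_id is a parameter of the nfs module.
    printf 'options nfs nfs4_unique_id=%s\n' "$uuid" > "$conf_file"
}
```

For example, `write_nfs_modprobe_conf /etc/nfs4_uuid /etc/modprobe.d/nfs4-id.conf`, with both paths being placeholders. This would not help if nfs is built into the kernel or already loaded when the fragment is written.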