All of lore.kernel.org
 help / color / mirror / Atom feed
From: "J. Bruce Fields" <bfields@fieldses.org>
To: Richard Weinberger <richard@nod.at>
Cc: linux-nfs@vger.kernel.org, luis.turcitu@appsbroker.com,
	chris.chilvers@appsbroker.com, david.young@appsbroker.com,
	david <david@sigma-star.at>,
	david oberhollenzer <david.oberhollenzer@sigma-star.at>
Subject: Re: Improving NFS re-export
Date: Thu, 9 Dec 2021 16:41:39 -0500	[thread overview]
Message-ID: <20211209214139.GA23483@fieldses.org> (raw)
In-Reply-To: <1576494286.153679.1639083948872.JavaMail.zimbra@nod.at>

On Thu, Dec 09, 2021 at 10:05:48PM +0100, Richard Weinberger wrote:
> Hello NFS list,
> 
> I'd like to improve the NFS re-export feature, especially wrt. crossmounts.
> Currently a NFS client will face EIO when crossing a mount point on the re-exporting server.
> This was discussed here[0]. While in that discussion the assumption was that check_export()
> in fs/nfsd/export.c emits EIO I did further experiments and realized that EIO actually
> comes from the NFS client side of the re-exporting server.
> 
> nfs_encode_fh() in fs/nfs/export.c checks for IS_AUTOMOUNT(inode), if this is the case
> it refuses to create a new file handle.
> So while accessing /files/disk2 directly on the re-exporting server triggers an automount,
> accessing via nfsd the export function of the client side gives up.
> 
> AFAIU the suggested proxy-only-mode[1] will not address this problem, right?

That's how I was thinking of addressing the problem, actually.  I
haven't figured out how to make that proxy-only mode work, though.

> One workaround is manually adding an export for each volume on the re-exporting server.
> This kinda works but is tedious and error prone.
> 
> I have a crazy idea how to automate this:
> Since nfs_encode_fh() in the NFS client side of the re-exporting server can detect
> crossing mounts, we could install a new export on the sever side as soon the
> IS_AUTOMOUNT(inode) case arises. We could even use the same fsid.
> What do you think?

Something like that might work.

I'm not sure what you mean by the same fsid.  I think you'd need to make
up a new fsid each time you encounter a new filesystem.  And you'd also
want to persist it on disk if you want this to keep working across
reboots of the proxy.

I think you could patch rpc.mountd to do that.

> Another obstacle is file handle wrapping.
> When re-exporting, the NFS client side adds inode and file information to each file handle,
> the server side also adds information. In my test setup this enlarges a 16 bytes file handle
> to 40 bytes.
> The proxy-only-mode won't help us either here.

Part of my motivation for a proxy-only mode was to remove that wrapping.

Since you're dedicating the host to reexporting one single backend
server, in theory you don't need any of the information in the wrapper.
When you (the proxy) get a filehandle from a client, you know which
server that filehandle originally came from, so you can go ask that
server for whatever you need to know about the filehandle (like an
fsid).

> Did you consider using the opaque file handle from the server as
> lookup key in a (persisted) data structure?

A little, but I don't think it works.

If you do this, you do need to require that you only export one server.
Otherwise there may be collisions (two different servers could return
filehandles that happen to have the same value).

The database would store every filehandle the client has ever seen.
That could be a lot.  It may also include filehandles for since-deleted
files.  The only way to prune such entries would be to try using them
and see if the server gives you STALE errors.

--b.

> That way at least the client side of the re-exporting server no longer has to enlarge
> the file handle with inode and file type information.
> If the re-exporting server re-exports just one server (proxy-only-mode) we could also
> skip adding the fsid to the handle.
> What do you think?
> 
> I'm looking forward to hear your comments.
> 
> Thanks,
> //richard
> 
> [0] https://marc.info/?l=linux-nfs&m=161670807413876&w=2
> [1] https://linux-nfs.org/wiki/index.php/NFS_proxy-only_mode

  reply	other threads:[~2021-12-09 21:41 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-09 21:05 Improving NFS re-export Richard Weinberger
2021-12-09 21:41 ` J. Bruce Fields [this message]
2021-12-09 22:03   ` Richard Weinberger
2021-12-21 14:30     ` Daire Byrne
2021-12-21 17:21       ` bfields
2021-12-21 21:39       ` Richard Weinberger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211209214139.GA23483@fieldses.org \
    --to=bfields@fieldses.org \
    --cc=chris.chilvers@appsbroker.com \
    --cc=david.oberhollenzer@sigma-star.at \
    --cc=david.young@appsbroker.com \
    --cc=david@sigma-star.at \
    --cc=linux-nfs@vger.kernel.org \
    --cc=luis.turcitu@appsbroker.com \
    --cc=richard@nod.at \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.