From: Richard Weinberger <richard@nod.at>
To: bfields <bfields@fieldses.org>
Cc: linux-nfs <linux-nfs@vger.kernel.org>,
	luis turcitu <luis.turcitu@appsbroker.com>,
	chris chilvers <chris.chilvers@appsbroker.com>,
	david young <david.young@appsbroker.com>,
	david <david@sigma-star.at>,
	david oberhollenzer <david.oberhollenzer@sigma-star.at>
Subject: Re: Improving NFS re-export
Date: Thu, 9 Dec 2021 23:03:24 +0100 (CET)
Message-ID: <763412597.153709.1639087404752.JavaMail.zimbra@nod.at>
In-Reply-To: <20211209214139.GA23483@fieldses.org>

----- Original Message -----
> On Thu, Dec 09, 2021 at 10:05:48PM +0100, Richard Weinberger wrote:
>> nfs_encode_fh() in fs/nfs/export.c checks for IS_AUTOMOUNT(inode); if this
>> is the case, it refuses to create a new file handle.
>> So while accessing /files/disk2 directly on the re-exporting server
>> triggers an automount, when it is accessed via nfsd the export function
>> of the client side gives up.
>> 
>> AFAIU the suggested proxy-only-mode[1] will not address this problem, right?
> 
> That's how I was thinking of addressing the problem, actually.  I
> haven't figured out how to make that proxy-only mode work, though.
> 
>> One workaround is manually adding an export for each volume on the
>> re-exporting server.
>> This kinda works but is tedious and error-prone.
>> 
>> I have a crazy idea how to automate this:
>> Since nfs_encode_fh() in the NFS client side of the re-exporting server
>> can detect crossing mounts, we could install a new export on the server
>> side as soon as the IS_AUTOMOUNT(inode) case arises. We could even use
>> the same fsid.
>> What do you think?
> 
> Something like that might work.
> 
> I'm not sure what you mean by the same fsid.  I think you'd need to make
> up a new fsid each time you encounter a new filesystem.  And you'd also
> want to persist it on disk if you want this to keep working across
> reboots of the proxy.

By same fsid I meant reusing the fsid from the backend server.
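
To make the alternative from the quoted reply concrete (make up a new
fsid for each newly seen filesystem and persist it across reboots), here
is a minimal sketch in Python. It is not how rpc.mountd works; the class
name FsidTable, the JSON state file, and the "next unused integer"
allocation policy are all assumptions made for illustration.

```python
import json
import os
import tempfile


class FsidTable:
    """Hypothetical table mapping a backend filesystem key to a proxy-local fsid."""

    def __init__(self, path):
        self.path = path
        self.table = {}
        # Reload earlier assignments so fsids stay stable across restarts.
        if os.path.exists(path):
            with open(path) as f:
                self.table = json.load(f)

    def fsid_for(self, backend_key):
        """Return the fsid for backend_key, allocating and persisting one if new."""
        if backend_key not in self.table:
            # Assumed policy: hand out the next unused positive integer.
            self.table[backend_key] = max(self.table.values(), default=0) + 1
            self._save()
        return self.table[backend_key]

    def _save(self):
        # Write to a temporary file and rename it into place, so a crash
        # cannot leave a half-written table behind.
        directory = os.path.dirname(self.path) or "."
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump(self.table, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)
```

Calling fsid_for() with the same key after reopening the state file
returns the same number, which is the property needed for file handles
to stay valid across a reboot of the proxy.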
 
> I think you could patch rpc.mountd to do that.

Okay, I need to dig into this.

>> Another obstacle is file handle wrapping.
>> When re-exporting, the NFS client side adds inode and file information
>> to each file handle; the server side also adds information. In my test
>> setup this enlarges a 16-byte file handle to 40 bytes.
>> The proxy-only-mode won't help us here either.
> 
> Part of my motivation for a proxy-only mode was to remove that wrapping.
> 
> Since you're dedicating the host to reexporting one single backend
> server, in theory you don't need any of the information in the wrapper.
> When you (the proxy) get a filehandle from a client, you know which
> server that filehandle originally came from, so you can go ask that
> server for whatever you need to know about the filehandle (like an
> fsid).

I see. That way we could get rid of file handle wrapping but lose the
NFS client inode cache on the re-exporting server, I think.
 
>> Did you consider using the opaque file handle from the server as
>> lookup key in a (persisted) data structure?
> 
> A little, but I don't think it works.
> 
> If you do this, you do need to require that you only export one server.
> Otherwise there may be collisions (two different servers could return
> filehandles that happen to have the same value).
> 
> The database would store every filehandle the client has ever seen.
> That could be a lot.  It may also include filehandles for since-deleted
> files.  The only way to prune such entries would be to try using them
> and see if the server gives you STALE errors.

True. I didn't think about the pruning case.
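
The pruning problem can be sketched as follows. This is a toy model in
Python, not kernel or nfs-utils code: HandleTable, StaleHandle and the
probe callback are hypothetical names, and probe stands in for "try the
filehandle against the backend server and see whether it answers STALE".

```python
class StaleHandle(Exception):
    """Stands in for the backend server answering with a STALE error."""


class HandleTable:
    """Hypothetical store keyed by the backend's opaque filehandle bytes.

    Keying by the raw handle alone is only safe with a single backend
    server; two servers could hand out identical bytes.
    """

    def __init__(self):
        self.entries = {}

    def remember(self, handle, info):
        # Every filehandle ever seen ends up here, so the table only grows.
        self.entries[bytes(handle)] = info

    def prune(self, probe):
        """Drop each handle for which probe(handle) raises StaleHandle.

        Returns the number of entries removed.
        """
        stale = []
        for handle in self.entries:
            try:
                probe(handle)
            except StaleHandle:
                stale.append(handle)
        # Delete after the loop to avoid mutating the dict while iterating.
        for handle in stale:
            del self.entries[handle]
        return len(stale)
```

Note that prune() calls probe once per stored handle, so in this model
the cost of cleaning up is one round trip to the backend for every
filehandle in the table.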

Thanks a lot for the prompt reply and your valuable input.
//richard
