linux-nfs.vger.kernel.org archive mirror
From: Amir Goldstein <amir73il@gmail.com>
To: Jeff Layton <jlayton@kernel.org>
Cc: "J. Bruce Fields" <bfields@fieldses.org>,
	Volker.Lendecke@sernet.de,
	samba-technical <samba-technical@lists.samba.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	Pavel Shilovskiy <pshilov@microsoft.com>
Subject: Re: Better interop for NFS/SMB file share mode/reservation
Date: Sun, 28 Apr 2019 09:45:50 -0400
Message-ID: <CAOQ4uxi6fQdp_RQKHp-i6Q-m-G1+384_DafF3QzYcUq4guLd6w@mail.gmail.com>
In-Reply-To: <8504a05f2b0462986b3a323aec83a5b97aae0a03.camel@kernel.org>

On Sun, Apr 28, 2019 at 8:09 AM Jeff Layton <jlayton@kernel.org> wrote:
>
> On Sat, 2019-04-27 at 16:16 -0400, Amir Goldstein wrote:
> > [adding back samba/nfs and fsdevel]
> >
>
> cc'ing Pavel too -- he did a bunch of work in this area a few years ago.
>
> > On Fri, Apr 26, 2019 at 6:22 PM Jeff Layton <jlayton@kernel.org> wrote:
> > > On Fri, 2019-04-26 at 10:50 -0400, J. Bruce Fields wrote:
> > > > On Fri, Apr 26, 2019 at 04:11:00PM +0200, Amir Goldstein wrote:
> > > > > On Fri, Apr 26, 2019, 4:00 PM J. Bruce Fields <bfields@fieldses.org> wrote:
> > > > >
> > > > > > On Fri, Apr 26, 2019 at 03:50:46PM +0200, Amir Goldstein wrote:
> > > > > > > On Fri, Feb 8, 2019, 5:03 PM Jeff Layton <jlayton@kernel.org> wrote:
> > > > > > > > Share/deny open semantics are pretty similar across NFS and SMB (by
> > > > > > > > design, really). If you intend to solve that use-case, what you really
> > > > > > > > want is whole-file, shared/exclusive locks that are set atomically with
> > > > > > > > the open call. O_EXLOCK and O_SHLOCK seem like a reasonable fit there.
> > > > > > > >
> > > > > > > > Then you could have SMB and NFS servers set these flags when opening
> > > > > > > > files, and deal with the occasional denial at open time. Other
> > > > > > > > applications won't be aware of them of course, but that's probably fine
> > > > > > > > for most use-cases where you want this sort of protocol interop.
> > > > > > >
> > > > > > > Sorry for posting off list. Airport emails...
> > > > > > > > I looked at implementing O_EXLOCK and O_SHLOCK and it looks doable.
> > > > > > >
> > > > > > > I was wondering if there is an inherent reason not to allow an exclusive
> > > > > > > lock on a file that is open read-only.
> > > > > > >
> > > > > > > Samba seems to need it and currently flock and ofd locks won't allow it.
> > > > > > > > Do you think it will be ok to allow it with O_EXLOCK?
> > > > > >
> > > > > > Somebody could deny everyone access to a shared resource that everyone
> > > > > > needs to make progress, like /etc/passwd or a shared library.
> > > > > >
> > > > > > Have you looked at Pavel Shilovsky's O_DENY patches?  He had the feature
> > > > > > off by default, with a mount option provided to turn it on.
> > > > > >
> > > > >
> > > > > > O_EXLOCK is advisory. It only acquires a flock or OFD lock atomically with
> > > > > > open.
> > > >
> > > > Whoops, got it.
> > > >
> > > > Is that really adequate for open share locks, though?
> > > >
> > > > I assumed that Windows apps depend on the assumption that they're
> > > > mandatory.  So e.g. if you can get a DENY_READ open on a shared library
> > > > then you know you can update it without the risk of making someone else
> > > > crash.
> > > >
> > >
> > > I think this is (slightly) better than doing it internally like we do
> > > today and would give you coherent locking between NFS and SMB. Other
> > > applications wouldn't see them, but for a NAS-style deployment, that's
> > > probably ok.
> > >
> >
> > We can do a little bit better.
> > We can make sure that O_DENY_WRITE (named for convenience) fails
> > if file is currently open for write by anyone and similarly for O_DENY_READ.
> > But if we cannot deny future non-cooperative opens what's the point?....
> >
>
> As you said in another mail, the main interest here is in getting
> NFS+SMB semantics right. If the exported filesystem is _only_ available
> via NFS+SMB, then do we need to deny non-cooperative opens?
>

We do not.

> > > Any open by samba or nfsd would need to start setting O_SHLOCK, and deny
> > > mode opens would have to set O_EXLOCK. We would actually need 2 per
> > > inode though (one for read and one for write).
> > >
> >
> > ...the point is that O_DENY_NONE does not need to be implemented with
> > a new type of lock object (O_WR_SHLOCK). It's enough that it checks there
> > are no relevant exclusive locks, and then inode->i_writecount and
> > inode->i_readcount already provide enough context to cooperate with
> > O_DENY_WRITE and O_DENY_READ.
> >
>
> That would work, if the goal is to have deny modes affect all opens. We
> could also do this on the opt-in basis that I was suggesting with a new
> set of counters in struct file_lock_context.
>

Ok.

> > I need to see if incrementing inode->i_readcount on O_RDWR opens is
> > possible (right now it only counts O_RDONLY opens).
> >
> > > I think these should probably be in their own "namespace" too. They
> > > could use the same semantics as flock, but should sit on their own list
> > > in file_lock_context.
> > >
> >
> > I would much rather that they didn't. The reason is that new open flags
> > are a backward compat problem. The way I want to solve it is this API:
> >
> > // On new kernel this will acquire OFD F_WRLCK atomically...
> > fd = open(..., O_RDWR | O_EXLOCK);
> > // ...check if it did acquire OFD lock
> > fcntl(fd,  F_OFD_GETLK, ...);
> >
> > We'd need at least one new l_type F_EX_RDLCK and maybe also a new
> > semantic F_EX_RDWRLCK, although similar in conflicts to F_WRLCK it can be
> > acquired without FMODE_WRITE. Though I personally think we can do without
> > it if the only way to acquire F_WRLCK on a read-only file is via a new open flag.
> >
>
> I don't think that will work at all. Share/deny modes are entirely
> orthogonal to byte-range locks in both NFS and SMB. Consider:
>
> Two clients open a file with O_RDWR | O_SHARE_WRITE | O_SHARE_READ.
> One of them now wants to set byte-range write lock on the file. That
> should be allowed, but now it'll be denied, because the other client
> will effectively hold a whole-file readlock on it.
>

Got it. flock semantics (as Pavel chose) are a better fit.
The only thing they do not support natively is O_SHARE_WRITE | O_DENY_READ,
but that is easy to add.
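
To make the orthogonality concrete, here is a minimal sketch of the
share/deny conflict rule as I understand it from NFSv4 and SMB: two opens
conflict if the access bits of one intersect the deny bits of the other.
The SHARE_* names and struct share_mode are made up for illustration;
they are not an existing kernel or libc API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit names, for illustration only. Access and deny use the
 * same bit positions so they can be intersected directly. */
#define SHARE_ACCESS_READ  0x1u
#define SHARE_ACCESS_WRITE 0x2u
#define SHARE_DENY_READ    0x1u
#define SHARE_DENY_WRITE   0x2u

struct share_mode {
	unsigned access; /* what this opener wants to do */
	unsigned deny;   /* what this opener forbids others from doing */
};

/* True if the two opens cannot coexist on the same file. */
static bool share_conflict(struct share_mode a, struct share_mode b)
{
	/* Only the two read/write bits are meaningful in this sketch. */
	assert((a.access | a.deny | b.access | b.deny) <= 0x3u);
	return (a.access & b.deny) != 0 || (b.access & a.deny) != 0;
}
```

With this rule, two openers that both ask for write access and deny read
can coexist, which a single shared/exclusive flock cannot express. That
is the O_SHARE_WRITE | O_DENY_READ case above.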

> There is also the problem that read and write deny modes are orthogonal
> to one other, so you have to have a way to deal with them independently.
>
> I'd suggest an API like this:
>
> // open read/write and deny read/write
> fd = open(..., O_RDWR | O_DENY_READ | O_DENY_WRITE);
> // test for flags with F_GETFL
> flags = fcntl(fd, F_GETFL);
>
> That would also allow you to use F_SETFL to change those flags on an
> existing fd.
>

Nice. If only old kernels wouldn't hand back from F_GETFL whatever garbage
flags you piled on at open.
That's why I wanted a different way to check whether the lock was taken, and
thought of F_OFD_GETLK as a natural candidate.
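
For comparison, this is roughly what userspace can do today without any
new open flag: open, then take a whole-file OFD write lock, then probe
with F_OFD_GETLK. It is only a sketch. The helper names are made up, the
lock is not atomic with the open, and as far as I know F_OFD_GETLK only
reports locks held through other open file descriptions, so a probe on
the locking fd itself comes back F_UNLCK.

```c
#define _GNU_SOURCE /* for F_OFD_SETLK / F_OFD_GETLK */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open read/write, then lock the whole file.
 * Returns the fd, or -1 if the open or the lock failed. */
static int open_then_ofd_wrlock(const char *path)
{
	struct flock fl = {
		.l_type = F_WRLCK, .l_whence = SEEK_SET,
		.l_start = 0, .l_len = 0, /* l_len 0 means "to end of file" */
		.l_pid = 0,               /* must be 0 for OFD commands */
	};
	int fd = open(path, O_RDWR);

	if (fd < 0)
		return -1;
	if (fcntl(fd, F_OFD_SETLK, &fl) < 0) { /* window between open and lock */
		close(fd);
		return -1;
	}
	return fd;
}

/* Hypothetical helper: type of a lock that would block a whole-file write
 * lock requested through fd, F_UNLCK if there is none, -1 on error. */
static int probe_ofd_lock(int fd)
{
	struct flock fl = {
		.l_type = F_WRLCK, .l_whence = SEEK_SET,
		.l_start = 0, .l_len = 0, .l_pid = 0,
	};

	assert(fd >= 0);
	if (fcntl(fd, F_OFD_GETLK, &fl) < 0)
		return -1;
	return fl.l_type;
}
```

So a second open of the same file sees F_WRLCK from probe_ofd_lock(),
while the fd that holds the lock sees F_UNLCK.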

We can play this game:

// New kernel doesn't copy O_TEST to f_flags
#define O_DENY_READ (O_TEST | __O_DENY_READ)
fd = open(..., O_RDWR | O_DENY_READ);
flags = fcntl(fd, F_GETFL);
if ((flags & O_DENY_READ) && !(flags & O_TEST))
	/* deny-read mode is really in effect */

A bit ugly, but if it's wrapped in a library function
get_open_flags(), who cares...
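
A minimal sketch of the check such a wrapper could do on the F_GETFL
result. The bit values for O_TEST and __O_DENY_READ are placeholders,
neither flag exists in any kernel, and deny_read_in_effect() is a made-up
name. It assumes, as above, that an old kernel echoes back every flag it
was given while a new kernel drops O_TEST.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values; these flags are hypothetical. */
#define O_TEST        0x10000000
#define __O_DENY_READ 0x20000000
#define O_DENY_READ   (O_TEST | __O_DENY_READ)

static_assert((O_TEST & __O_DENY_READ) == 0, "flag bits must not overlap");

/* flags is what fcntl(fd, F_GETFL) returned for an fd that was opened
 * with O_DENY_READ. The deny mode is real only if the kernel kept
 * __O_DENY_READ and stripped the O_TEST canary. */
static bool deny_read_in_effect(int flags)
{
	return (flags & __O_DENY_READ) != 0 && (flags & O_TEST) == 0;
}
```

The caller would pass the fcntl(fd, F_GETFL) result straight in and fall
back to its internal share-mode bookkeeping when this returns false.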

> > > That said, we could also look at a vfs-level mount option that would
> > > make the kernel enforce these for any opener. That could also be useful,
> > > and shouldn't be too hard to implement. Maybe even make it a vfsmount-
> > > level option (like -o ro is).
> > >
> >
> > Yeh, I am humbly going to leave this struggle to someone else.
> > Not important enough IMO and completely independent effort to the
> > advisory atomic open&lock API.
>
> Having the kernel allow setting deny modes on any open call is a non-
> starter, for the reasons Bruce outlined earlier. This _must_ be
> restricted in some fashion or we'll be opening up a ginormous DoS
> mechanism.
>
> My proposal was to make this only be enforced by applications that
> explicitly opt-in by setting O_SH*/O_EX* flags. It wouldn't be too
> difficult to also allow them to be enforced on a per-fs basis via mount
> option or something. Maybe we could expand the meaning of '-o mand' ?
>
> How would you propose that we restrict this?
>

Our communication channel is broken.
I did not intend to propose any implicit locking.
If samba and nfsd can opt-in with O_SHARE flags, I do not
understand why a mount option is helpful for the cause of
samba/nfsd interop.

If someone else is interested in samba/local interop, then
yes, a mount option like the one suggested by Pavel could be a good option,
but it is an orthogonal effort IMO.


> > > If you're denied, what error should you get back when you try to open
> > > it? It should be something distinct. We may even want to add new error
> > > codes for this.
> >
> > IMO EBUSY does the job. It's distinct because open is not expected
> > to return EBUSY for regular files/dirs and when open is expected to
> > return EBUSY for blockdev it's for the exact same use case (i.e.
> > exclusive write open is acquired by userspace tools).
>
> That works for me.

From Pavel's v6 cover letter:
"Make nfs code return -EBUSY for share conflicts (was -EACCESS)."
;-)

>
> We should probably have a close look at the work that Pavel did several
> years ago too. It has almost certainly bitrotted by now, but it may
> serve as a starting point (and he may have valuable input here).

I looked at the patches. There's good stuff in there.
Once we agree on the specifications I can rip some code off ;-)

A lot of the work in Pavel's patches revolves around making the
mount option work and respecting O_DENYDELETE.
IMO, that is not a good use of upstreaming effort, because:
- NFS won't ask for deny delete
- IMO, Windows applications should be used to being denied
  a DENY_DELETE and falling back to SHARE_DELETE

So while implementing DENYDELETE may fall into a category of making
samba server behave more like Windows server, I don't think it falls into
the category of better samba/nfs interop.

It is something that we can add later if anyone really cares about.

Thanks,
Amir.
