linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Amir Goldstein <amir73il@gmail.com>
To: Jeff Layton <jlayton@kernel.org>
Cc: "J. Bruce Fields" <bfields@fieldses.org>,
	Volker.Lendecke@sernet.de, samba-technical@lists.samba.org,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Pavel Shilovsky <piastryyy@gmail.com>
Subject: Re: Better interop for NFS/SMB file share mode/reservation
Date: Fri, 8 Feb 2019 16:45:46 +0200	[thread overview]
Message-ID: <CAOQ4uxgewN=j3ju5MSowEvwhK1HqKG3n1hBRUQTi1W5asaO1dQ@mail.gmail.com> (raw)
In-Reply-To: <379106947f859bdf5db4c6f9c4ab8c44f7423c08.camel@kernel.org>

On Fri, Feb 8, 2019 at 3:10 PM Jeff Layton <jlayton@kernel.org> wrote:
>
> On Fri, 2019-02-08 at 13:20 +0200, Amir Goldstein wrote:
> > Hi Bruce,
> >
> > I have been following you discussion with Volker Lendecke
> > on the samba technical mailing list [1] and have had discussed
> > this issue with Volker myself as well.
> >
> > I decided to start this new thread to bring some kernel developers
> > in the loop and to propose an idea that takes a somewhat
> > different approach to the "interop" approaches I have seen
> > so far. "interop" in this context often means consistency of file
> > lock states between samba and nfs server, but I am referring
> > to the stronger sense of interop with local filesystem on the server.
> >
> > You pointed to Pavel Shilovsky's O_DENY* patches [2] as a possible
> > solution to interop of NFS Share Reservation and SMB Share Mode
> > with local filesystems.
> > Some of the complaints on this approach were (rightfully) concerned
> > about DoS and the prospect of plaguing Linux with Windows server
> > "files left open" issues.
> >
> > My idea comes from the observation that Windows server
> > administrators can release locked files that were left open by clients.
> > I suppose that an NFS server admin can do the same?
>
> The Linux kernel has no mechanism for this (aside from sending a SIGKILL
> to lockd, which makes it drop all locks). Solaris did have a tool for
> this at one point (and probably still does).
>
> It's a little less of a problem now than it used to be with NFS, given
> the move to NFSv4 (which has lease-based locking). If you have
> misbehaving clients, you just kick them out and their locks eventually
> go away. v3 locks can stick around in perpetuity however, so people have
> long wanted such a tool on Linux as well.
>

In a nut shell, I think my proposal is that samba will do something
similar and request leases from the kernel instead of trying to
enforce real mandatory locks.

> > That realization makes "share access" locks (a.k.a. MAND_LOCK)
> > not so very different from oplocks (leases/delegations).
> > As long as samba and nfsd cooperate nicely with MAND_LOCK
> > semantics, we don't really have to force local filesystems
> > to obay MAND_LOCK semantics. If the file servers take leases
> > on local filesystems, they will not get exclusive write access for
> > files already open for write on local filesytem and same for read.
> >
>
> I think this last statement isn't correct (if I'm parsing it correctly).
> If a file is already open for write, then you just don't get a lease
> when you try to request one. Ditto for write leases if it's already open
> for read.
>

I think you miss read what I miss wrote ;-)
As the title of this thread states, I am talking about the first case
of acquiring an exclusive or read shared access to file at open time.
It may be the fact that samba currently calls flock(LOCK_MAND)
that is the source for confusion.

Open failure is the expected behavior if file is already open for
write (or read) on local filesystem, so my suggestion is:
- Server opens the file and request a lease based of desired share mode
- If file server got the lease, client gets the file handle
- Otherwise, client gets an open failure

> > On local file access on the server that violates the share mode,
> > the file server acts as a grumpy washed out administrator that
> > automatically grants any lock revoke ticket after timeout.
> >
>
> Devil's advocate:
>
> Is this situation any better than just teaching the NFS/SMB servers to
> track these locks out of band? Both samba and most NFS servers respect
> share/deny mode locks, but only internally -- they aren't aware of the
> others'. We could (in principle) come up with a mechanism to track these
> that doesn't involve plumbing them into the kernel.
>

That would be a prerequisite to my suggested solution, as I wrote:
"As long as samba and nfsd cooperate nicely with LOCK_MAND..."
That means the two file servers cooperate on the share mode locks
and try to figure out if there are outstanding leases before opening
a file that will break those leases.

> That said, coherent locking is best done in the kernel, IMO...
>

Indeed...

> > This model may not fit use cases where "real" interop with
> > local filesystem is needed, but compared to the existing
> > solution (no interop at all) it is quite an improvement.
> >
> > Furthermore, short of SMB DENY_DELETE, we may not even
> > need to change any kernel APIs.
> > The addition of O_DENY* open flags can make programming
> > easier, but taking a lease on an open file is still safe enough
> > to implement share reservation (no?).
> >
> > Satisfying DENY_DELETE could be more tricky, but perhaps
> > the existing SILLYRENAME interface of==between knfsd and vfs
> > could be somehow utilized for this purpose?
> >
> > I though of bringing this up as a TOPIC for LSF/MM, but wanted
> > to consult with you first. I am sure that you or Jeff can do a better
> > job than me in enumerating the "interop" file lock issues that
> > could be discussed in filesystems track forum.
> >
> > Thoughts? Explanation why this idea is idiotic?
>
> I think it's not a single idea. There are really two different aspects
> to this given that we're really talking about two different types of
> locks in SMB. I think you have to consider solving these problems
> separately:
>
> 1) the ability to set a (typically whole-file) share/deny lock
> atomically when you open a file. This is necessary for coherent
> share/deny lock semantics. Note that these are only enforced open()
> time.
>
> 2) mandatory locking (forbidding reads and writes on a byte range when
> there is a conflicting lock set).
>

I was only trying to address the first problem (small steps...).

> The first could (probably) be solved with something like what Pavel
> proposed a few years ago...or maybe we just wire up O_EXLOCK and
> O_SHLOCK:
>
>     https://www.gnu.org/software/libc/manual/html_node/Open_002dtime-Flags.html
>

Nice. I wasn't aware of those BSD flags.

> This seems like a fine idea (in principle) but it needs someone to drive
> the work forward. You'll also likely be consuming a couple of O_* flags,
> which could be tough sell (unless you come up with another way to do
> it).
>

Once I know the obstacles to watch out from, I can drive this work.
Thing is, I am not convinced myself that any new O_ flags are needed.

How about this (for samba, knfsd is simpler):
- pfd = open(filename, O_PATH)
- flock(pfd, LOCK_MAND) (for file servers interop)
- vfs checks no conflicting LOCK_MAND locks (like patch you once posted)
- open(filename, O_RDWR) (and verify st_ino like samba does)
- Request lease (for local fs interop)
- check_conflicting_open() is changed to use inode_is_open_for_read()
  instead of checking d_count and i_count.
- we already have i_readcount, just need to remove ifdef CONFIG_IMA
- On lease break (from local fs), break client oplocks and invalidate
file handle on server

Thanks,
Amir.

  reply	other threads:[~2019-02-08 14:46 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-08 11:20 Better interop for NFS/SMB file share mode/reservation Amir Goldstein
2019-02-08 13:10 ` Jeff Layton
2019-02-08 14:45   ` Amir Goldstein [this message]
2019-02-08 15:50     ` J. Bruce Fields
2019-02-08 20:02       ` Amir Goldstein
2019-02-08 20:16         ` J. Bruce Fields
2019-02-08 20:31           ` Amir Goldstein
2019-02-14 20:51             ` J. Bruce Fields
2019-02-15  7:31               ` Amir Goldstein
2019-02-15 20:09                 ` J. Bruce Fields
2019-02-08 22:12         ` Jeremy Allison
2019-02-09  4:04           ` Amir Goldstein
2019-02-14 21:06             ` J. Bruce Fields
2019-03-05 21:47               ` J. Bruce Fields
2019-03-06  7:09                 ` Amir Goldstein
2019-03-06 15:17                   ` J. Bruce Fields
2019-03-06 15:37                     ` [NFS-Ganesha-Devel] " Frank Filz
2019-03-08 21:38                       ` 'J. Bruce Fields'
2019-03-08 21:53                         ` Frank Filz
2019-03-06 15:11                 ` J. Bruce Fields
2019-03-06 20:31                   ` Jeff Layton
2019-03-06 21:07                     ` Jeremy Allison
2019-03-06 21:25                       ` Ralph Böhme
2019-03-07 11:03                         ` Stefan Metzmacher
2019-03-07 16:47                           ` Simo
2019-04-25 18:11                           ` Amir Goldstein
2019-03-06 21:55                       ` Jeff Layton
2019-02-08 16:03     ` Jeff Layton
2019-02-08 16:28       ` Jeffrey Layton
     [not found]       ` <CAOQ4uxgQsRaEOxz1aYzP1_1fzRpQbOm2-wuzG=ABAphPB=7Mxg@mail.gmail.com>
     [not found]         ` <20190426140023.GB25827@fieldses.org>
     [not found]           ` <CAOQ4uxhuxoEsoBbvenJ8eLGstPc4AH-msrxDC-tBFRhvDxRSNg@mail.gmail.com>
     [not found]             ` <20190426145006.GD25827@fieldses.org>
     [not found]               ` <e69d149c80187b84833fec369ad8a51247871f26.camel@kernel.org>
2019-04-27 20:16                 ` Amir Goldstein
2019-04-28 12:09                   ` Jeff Layton
2019-04-28 13:45                     ` Amir Goldstein
2019-04-28 15:06                       ` Trond Myklebust
2019-04-28 22:00                         ` Amir Goldstein
2019-04-28 22:08                           ` Trond Myklebust
2019-04-28 22:33                             ` Amir Goldstein
2019-04-29  0:57                               ` Trond Myklebust
2019-04-29 11:42                                 ` Amir Goldstein
2019-04-29 13:10                                   ` Trond Myklebust
2019-04-29 20:29                                 ` Jeff Layton
2019-04-29 22:33                                   ` Pavel Shilovskiy
2019-04-30  0:31                                     ` Amir Goldstein
2019-04-30  8:12                                       ` Uri Simchoni
2019-04-30  9:22                                         ` Amir Goldstein
2019-02-11  5:31     ` ronnie sahlberg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOQ4uxgewN=j3ju5MSowEvwhK1HqKG3n1hBRUQTi1W5asaO1dQ@mail.gmail.com' \
    --to=amir73il@gmail.com \
    --cc=Volker.Lendecke@sernet.de \
    --cc=bfields@fieldses.org \
    --cc=jlayton@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=piastryyy@gmail.com \
    --cc=samba-technical@lists.samba.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).