All of lore.kernel.org
 help / color / mirror / Atom feed
* fanotify read returns with errno == EOPENSTALE
@ 2017-03-22 15:31 Marko Rauhamaa
  2017-03-22 18:01 ` Matthew Wilcox
  2017-03-22 18:20 ` Amir Goldstein
  0 siblings, 2 replies; 32+ messages in thread
From: Marko Rauhamaa @ 2017-03-22 15:31 UTC (permalink / raw)
  To: linux-fsdevel


An F-Secure product uses fanotify with OPEN_PERM. We ran into an
unexpected situation: a read(2) on the fanotify fd returned -1 with
errno == EOPENSTALE. The only place in the (development) kernel where I
can find EOPENSTALE is in nfs4file.c:nfs4_file_open().

Questions:

 * Should an fanotify client expect EOPENSTALE from read(2)?

 * According to <URL:
   https://github.com/torvalds/linux/blob/master/include/linux/errno.h>,
   EOPENSTALE "should never be seen by user programs." Is this a kernel
   bug?

 * The kernel in question is kernel-3.10.0-229.el7.x86_64 (RHEL 7.3). I
   will take it up with Red Hat if necessary. However, is this a known
   issue that has been fixed in a newer kernel version?


Marko

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
  2017-03-22 15:31 fanotify read returns with errno == EOPENSTALE Marko Rauhamaa
@ 2017-03-22 18:01 ` Matthew Wilcox
  2017-03-22 18:20 ` Amir Goldstein
  1 sibling, 0 replies; 32+ messages in thread
From: Matthew Wilcox @ 2017-03-22 18:01 UTC (permalink / raw)
  To: Marko Rauhamaa; +Cc: linux-fsdevel

On Wed, Mar 22, 2017 at 05:31:05PM +0200, Marko Rauhamaa wrote:
>  * The kernel in question is kernel-3.10.0-229.el7.x86_64 (RHEL 7.3). I
>    will take it up with Red Hat if necessary. However, is this a known
>    issue that has been fixed in a newer kernel version?

You should take it up with Red Hat first, unless you can repreoduce it
on a vanilla kernel.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
  2017-03-22 15:31 fanotify read returns with errno == EOPENSTALE Marko Rauhamaa
  2017-03-22 18:01 ` Matthew Wilcox
@ 2017-03-22 18:20 ` Amir Goldstein
  2017-03-22 19:17   ` Amir Goldstein
  2017-03-22 19:31   ` Al Viro
  1 sibling, 2 replies; 32+ messages in thread
From: Amir Goldstein @ 2017-03-22 18:20 UTC (permalink / raw)
  To: Marko Rauhamaa; +Cc: linux-fsdevel, Jan Kara, Al Viro, Jeff Layton

On Wed, Mar 22, 2017 at 11:31 AM, Marko Rauhamaa
<marko.rauhamaa@f-secure.com> wrote:
>
> An F-Secure product uses fanotify with OPEN_PERM. We ran into an
> unexpected situation: a read(2) on the fanotify fd returned -1 with
> errno == EOPENSTALE. The only place in the (development) kernel where I
> can find EOPENSTALE is in nfs4file.c:nfs4_file_open().
>

Hi Marko,

Is this issue reproducible?
Which file system is the notification coming from? NFS?

> Questions:
>
>  * Should an fanotify client expect EOPENSTALE from read(2)?
>

The fanotify event metadata->fd can be an error (e.g. EOPENSTALE),
so if you don't check that the fd you get is valid (>=0)
I suppose glibc might be setting errno if you pass
it a negative fd???

Anyway, over NFS it seems that something like this can happen.

>  * According to <URL:
>    https://github.com/torvalds/linux/blob/master/include/linux/errno.h>,
>    EOPENSTALE "should never be seen by user programs." Is this a kernel
>    bug?
>

Well, the behavior was changed in kernel 4.7 (and stable kernels) by
commit by Al Viro:
fac7d19 fix EOPENSTALE bug in do_last()

Since that commit userspace will be able to see this error in fanotify events.
I can't say for sure about other ways that this error could get to
user, but a quick
grep for users of dentry_open() that don't convert EOPENSTALE such as
fanotify finds:
autofs4, ecrypts, cachefiles, overlayfs (the error may be converted higher up)
and the XFS specific XFS_IOC_OPEN_BY_HANDLE ioctl.

I don't know if that's called a bug or out of date documentation.
RHEL must have picked up this bug fix so have the same behavior as upstream now.

>  * The kernel in question is kernel-3.10.0-229.el7.x86_64 (RHEL 7.3). I
>    will take it up with Red Hat if necessary. However, is this a known
>    issue that has been fixed in a newer kernel version?
>
>
> Marko

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
  2017-03-22 18:20 ` Amir Goldstein
@ 2017-03-22 19:17   ` Amir Goldstein
  2017-03-22 19:31   ` Al Viro
  1 sibling, 0 replies; 32+ messages in thread
From: Amir Goldstein @ 2017-03-22 19:17 UTC (permalink / raw)
  To: Marko Rauhamaa; +Cc: linux-fsdevel, Jan Kara, Al Viro, Jeff Layton

On Wed, Mar 22, 2017 at 2:20 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Wed, Mar 22, 2017 at 11:31 AM, Marko Rauhamaa
> <marko.rauhamaa@f-secure.com> wrote:
>>
>> An F-Secure product uses fanotify with OPEN_PERM. We ran into an
>> unexpected situation: a read(2) on the fanotify fd returned -1 with
>> errno == EOPENSTALE. The only place in the (development) kernel where I
>> can find EOPENSTALE is in nfs4file.c:nfs4_file_open().
>>
>
> Hi Marko,
>
> Is this issue reproducible?
> Which file system is the notification coming from? NFS?
>
>> Questions:
>>
>>  * Should an fanotify client expect EOPENSTALE from read(2)?
>>
>
> The fanotify event metadata->fd can be an error (e.g. EOPENSTALE),
> so if you don't check that the fd you get is valid (>=0)
> I suppose glibc might be setting errno if you pass
> it a negative fd???
>
> Anyway, over NFS it seems that something like this can happen.
>
>>  * According to <URL:
>>    https://github.com/torvalds/linux/blob/master/include/linux/errno.h>,
>>    EOPENSTALE "should never be seen by user programs." Is this a kernel
>>    bug?
>>
>
> Well, the behavior was changed in kernel 4.7 (and stable kernels) by
> commit by Al Viro:
> fac7d19 fix EOPENSTALE bug in do_last()
>
> Since that commit userspace will be able to see this error in fanotify events.
> I can't say for sure about other ways that this error could get to
> user, but a quick
> grep for users of dentry_open() that don't convert EOPENSTALE such as
> fanotify finds:
> autofs4, ecrypts, cachefiles, overlayfs (the error may be converted higher up)

from code audit of the modules above I think only APIs that *may* be leaking
EOPENSTALE to userspace (besides fanotify and xfs open by handle) are:
- autofs_dev_ioctl_openmount
- overlayfs readdir (when opening lower NFS dir)

Amir.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
  2017-03-22 18:20 ` Amir Goldstein
  2017-03-22 19:17   ` Amir Goldstein
@ 2017-03-22 19:31   ` Al Viro
  2017-03-22 19:39     ` Amir Goldstein
  1 sibling, 1 reply; 32+ messages in thread
From: Al Viro @ 2017-03-22 19:31 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Marko Rauhamaa, linux-fsdevel, Jan Kara, Jeff Layton

On Wed, Mar 22, 2017 at 02:20:15PM -0400, Amir Goldstein wrote:

> Well, the behavior was changed in kernel 4.7 (and stable kernels) by
> commit by Al Viro:
> fac7d19 fix EOPENSTALE bug in do_last()
> 
> Since that commit userspace will be able to see this error in fanotify events.

Unless *notify somehow uses do_last() directly, that commit should have
no effect on it (and it definitely has no effect on dentry_open() callers)...

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
  2017-03-22 19:31   ` Al Viro
@ 2017-03-22 19:39     ` Amir Goldstein
  2017-03-23  8:13       ` Marko Rauhamaa
  0 siblings, 1 reply; 32+ messages in thread
From: Amir Goldstein @ 2017-03-22 19:39 UTC (permalink / raw)
  To: Al Viro; +Cc: Marko Rauhamaa, linux-fsdevel, Jan Kara, Jeff Layton

On Wed, Mar 22, 2017 at 3:31 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Wed, Mar 22, 2017 at 02:20:15PM -0400, Amir Goldstein wrote:
>
>> Well, the behavior was changed in kernel 4.7 (and stable kernels) by
>> commit by Al Viro:
>> fac7d19 fix EOPENSTALE bug in do_last()
>>
>> Since that commit userspace will be able to see this error in fanotify events.
>
> Unless *notify somehow uses do_last() directly, that commit should have
> no effect on it (and it definitely has no effect on dentry_open() callers)...

Right. I'm being silly :/

Back to Redhat I guess...

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
  2017-03-22 19:39     ` Amir Goldstein
@ 2017-03-23  8:13       ` Marko Rauhamaa
  2017-03-23 11:46         ` Amir Goldstein
  0 siblings, 1 reply; 32+ messages in thread
From: Marko Rauhamaa @ 2017-03-23  8:13 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Al Viro, linux-fsdevel, Jan Kara, Jeff Layton

Amir Goldstein <amir73il@gmail.com>:

> On Wed, Mar 22, 2017 at 3:31 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>> On Wed, Mar 22, 2017 at 02:20:15PM -0400, Amir Goldstein wrote:
>>
>>> Well, the behavior was changed in kernel 4.7 (and stable kernels) by
>>> commit by Al Viro:
>>> fac7d19 fix EOPENSTALE bug in do_last()
>>>
>>> Since that commit userspace will be able to see this error in
>>> fanotify events.
>>
>> Unless *notify somehow uses do_last() directly, that commit should
>> have no effect on it (and it definitely has no effect on
>> dentry_open() callers)...
>
> Right. I'm being silly :/
>
> Back to Redhat I guess...

I will gladly take the issue to RedHat. However, the discussion so far
confuses me a bit. To confirm, is there a consensus here that EOPENSTALE
should never leak to userspace (through fanotify read anyway)?

If EOPENSTALE *is* a valid possible return from fanotify read, this is
my bug and not RedHat's. In that case, what is the correct recovery?

As for reproduction, I don't yet have one. At the moment, I just need an
authoritative user-space API clarification.


Marko

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
  2017-03-23  8:13       ` Marko Rauhamaa
@ 2017-03-23 11:46         ` Amir Goldstein
  2017-03-23 11:56           ` Jeff Layton
  0 siblings, 1 reply; 32+ messages in thread
From: Amir Goldstein @ 2017-03-23 11:46 UTC (permalink / raw)
  To: Marko Rauhamaa; +Cc: Al Viro, linux-fsdevel, Jan Kara, Jeff Layton

On Thu, Mar 23, 2017 at 4:13 AM, Marko Rauhamaa
<marko.rauhamaa@f-secure.com> wrote:
> Amir Goldstein <amir73il@gmail.com>:
>
>> On Wed, Mar 22, 2017 at 3:31 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>>> On Wed, Mar 22, 2017 at 02:20:15PM -0400, Amir Goldstein wrote:
>>>
>>>> Well, the behavior was changed in kernel 4.7 (and stable kernels) by
>>>> commit by Al Viro:
>>>> fac7d19 fix EOPENSTALE bug in do_last()
>>>>
>>>> Since that commit userspace will be able to see this error in
>>>> fanotify events.
>>>
>>> Unless *notify somehow uses do_last() directly, that commit should
>>> have no effect on it (and it definitely has no effect on
>>> dentry_open() callers)...
>>
>> Right. I'm being silly :/
>>
>> Back to Redhat I guess...
>
> I will gladly take the issue to RedHat. However, the discussion so far
> confuses me a bit. To confirm, is there a consensus here that EOPENSTALE
> should never leak to userspace (through fanotify read anyway)?
>
> If EOPENSTALE *is* a valid possible return from fanotify read, this is
> my bug and not RedHat's. In that case, what is the correct recovery?
>
> As for reproduction, I don't yet have one. At the moment, I just need an
> authoritative user-space API clarification.
>

There is a consensus that the commit I pointed to has nothing do to with
this alleged error leak.

It certainly looks like it wasn't the intention that this error will end up in
userspace, but I can't say that it is bad that it got there in this case.
The error is quite useful for you to understand what happened.

On a second look, I think that beyond pointing to an irrelevant commit
my analysis still looks correct correct. I think upstream kernel *can* deliver
this error on fanotify event and all those callers I mentioned that call
dentry_open() directly without checking EOPENSTALE later on.

IIUC, the only way to get out of a stale dentry situation is that some thread
will lookup that path and revalidate the dentry.

But if file has become stale between the time that the event was queued
and the time the event is read and no process has tried looking it up since
then the event read will have -EOPENSTALE for metadata->fd.

It's probably much harder to hit this in other cases I mentioned, but
seems quite plausible with fanotify because events are often read some
time after they happened.

With overlayfs readdir for example, lower dir will be revalidated on
open of overlay dir, so process would have to wait some time
before open and readdir to make this case likely.

I have been wrong once. Could be wrong again.
Anyone?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
  2017-03-23 11:46         ` Amir Goldstein
@ 2017-03-23 11:56           ` Jeff Layton
  2017-03-23 12:43             ` Marko Rauhamaa
  0 siblings, 1 reply; 32+ messages in thread
From: Jeff Layton @ 2017-03-23 11:56 UTC (permalink / raw)
  To: Amir Goldstein, Marko Rauhamaa
  Cc: Al Viro, linux-fsdevel, Jan Kara, Linux NFS Mailing List

On Thu, 2017-03-23 at 07:46 -0400, Amir Goldstein wrote:
> On Thu, Mar 23, 2017 at 4:13 AM, Marko Rauhamaa
> <marko.rauhamaa@f-secure.com> wrote:
> > Amir Goldstein <amir73il@gmail.com>:
> > 
> > > On Wed, Mar 22, 2017 at 3:31 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> > > > On Wed, Mar 22, 2017 at 02:20:15PM -0400, Amir Goldstein wrote:
> > > > 
> > > > > Well, the behavior was changed in kernel 4.7 (and stable kernels) by
> > > > > commit by Al Viro:
> > > > > fac7d19 fix EOPENSTALE bug in do_last()
> > > > > 
> > > > > Since that commit userspace will be able to see this error in
> > > > > fanotify events.
> > > > 
> > > > Unless *notify somehow uses do_last() directly, that commit should
> > > > have no effect on it (and it definitely has no effect on
> > > > dentry_open() callers)...
> > > 
> > > Right. I'm being silly :/
> > > 
> > > Back to Redhat I guess...
> > 
> > I will gladly take the issue to RedHat. However, the discussion so far
> > confuses me a bit. To confirm, is there a consensus here that EOPENSTALE
> > should never leak to userspace (through fanotify read anyway)?
> > 
> > If EOPENSTALE *is* a valid possible return from fanotify read, this is
> > my bug and not RedHat's. In that case, what is the correct recovery?
> > 
> > As for reproduction, I don't yet have one. At the moment, I just need an
> > authoritative user-space API clarification.
> > 
> 
> There is a consensus that the commit I pointed to has nothing do to with
> this alleged error leak.
> 
> It certainly looks like it wasn't the intention that this error will end up in
> userspace, but I can't say that it is bad that it got there in this case.
> The error is quite useful for you to understand what happened.
> 
> On a second look, I think that beyond pointing to an irrelevant commit
> my analysis still looks correct correct. I think upstream kernel *can* deliver
> this error on fanotify event and all those callers I mentioned that call
> dentry_open() directly without checking EOPENSTALE later on.
> 
> IIUC, the only way to get out of a stale dentry situation is that some thread
> will lookup that path and revalidate the dentry.
> 
> But if file has become stale between the time that the event was queued
> and the time the event is read and no process has tried looking it up since
> then the event read will have -EOPENSTALE for metadata->fd.
> 
> It's probably much harder to hit this in other cases I mentioned, but
> seems quite plausible with fanotify because events are often read some
> time after they happened.
> 
> With overlayfs readdir for example, lower dir will be revalidated on
> open of overlay dir, so process would have to wait some time
> before open and readdir to make this case likely.
> 
> I have been wrong once. Could be wrong again.
> Anyone?

It was definitely not the intention to leak this error code to
userland. EOPENSTALE is not a POSIX sanctioned error code, so
applications generally don't know anything about it and will be
confused.

I haven't looked closely at this particular problem, but IIRC we
usually just translate EOPENSTALE to ESTALE, and that may be all that
needs to be done here. If this happened in the RHEL kernel, then please
do open a bug with Red Hat and we'll get it straightened out.

That said, you should take heed that all of the [fa|i|d]notify APIs do
not extend beyond the local machine when you use them on network
filesystems. If you're expecting to get notification of events that are
occurring on other clients, you're going to be disappointed here.
-- 
Jeff Layton <jlayton@poochiereds.net>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
  2017-03-23 11:56           ` Jeff Layton
@ 2017-03-23 12:43             ` Marko Rauhamaa
  2017-03-23 13:47                 ` Amir Goldstein
  0 siblings, 1 reply; 32+ messages in thread
From: Marko Rauhamaa @ 2017-03-23 12:43 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Amir Goldstein, Al Viro, linux-fsdevel, Jan Kara, Linux NFS Mailing List

Jeff Layton <jlayton@poochiereds.net>:

> It was definitely not the intention to leak this error code to
> userland. EOPENSTALE is not a POSIX sanctioned error code, so
> applications generally don't know anything about it and will be
> confused.

Got it. I will try to work on a reproduction and make a proper bug
report.

> I haven't looked closely at this particular problem, but IIRC we
> usually just translate EOPENSTALE to ESTALE, and that may be all that
> needs to be done here. If this happened in the RHEL kernel, then
> please do open a bug with Red Hat and we'll get it straightened out.

ESTALE has not been mentioned as a possible error code from an fanotify
read. Most importantly, since read fails, I suppose there is no recovery
but you must close the fanotify fd and call fanotify_init() again. Or
should I just ignore it and read on? If so, why bother returning the
error from the kernel in the first place?

> That said, you should take heed that all of the [fa|i|d]notify APIs do
> not extend beyond the local machine when you use them on network
> filesystems. If you're expecting to get notification of events that
> are occurring on other clients, you're going to be disappointed here.

That certainly is disappointing. However, there is a certain level of
coherency one would expect, namely:

 * An NFS4 client opening a file should be subject to an OPEN_PERM check
   on that client (if the client is monitoring the mount point).

 * An NFS4 client opening a file should be subject to an OPEN_PERM check
   on the server (if the server is monitoring the mount point).

 * An fanotify read should not fail mysteriously. Rather, a read on
   metadata->fd should be the one failing.


Marko

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
  2017-03-23 12:43             ` Marko Rauhamaa
  2017-03-23 13:47                 ` Amir Goldstein
@ 2017-03-23 13:47                 ` Amir Goldstein
  0 siblings, 0 replies; 32+ messages in thread
From: Amir Goldstein @ 2017-03-23 13:47 UTC (permalink / raw)
  To: Marko Rauhamaa
  Cc: Jeff Layton, Al Viro, linux-fsdevel, Jan Kara,
	Linux NFS Mailing List, linux-api

On Thu, Mar 23, 2017 at 8:43 AM, Marko Rauhamaa
<marko.rauhamaa@f-secure.com> wrote:
> Jeff Layton <jlayton@poochiereds.net>:
>
>> It was definitely not the intention to leak this error code to
>> userland. EOPENSTALE is not a POSIX sanctioned error code, so
>> applications generally don't know anything about it and will be
>> confused.
>
> Got it. I will try to work on a reproduction and make a proper bug
> report.
>

Try this:

- watch a single file for permissions events (so you will only have
one event in the queue)
- open file from client to generate single event (don't read event yet)
- remove file from server (to make it stale)
- read event (with stale file)

>> I haven't looked closely at this particular problem, but IIRC we
>> usually just translate EOPENSTALE to ESTALE, and that may be all that
>> needs to be done here. If this happened in the RHEL kernel, then
>> please do open a bug with Red Hat and we'll get it straightened out.
>
> ESTALE has not been mentioned as a possible error code from an fanotify
> read. Most importantly, since read fails, I suppose there is no recovery
> but you must close the fanotify fd and call fanotify_init() again. Or
> should I just ignore it and read on? If so, why bother returning the
> error from the kernel in the first place?
>

Oh my. I completely misread your report before.
I though you were trying to read from the event->fd.
Now I understand that  you mean read from fanotify fd.
That will definitely return the error, but  only in the special case
where open error
happened on the first event being read to the buffer.
If error happens after adding some events to the buffer, fanotify
process will not know
about this. Regular event will be silently dropped and permission event will be
denied.

>From fanotify(7) man page BUGS section:

       *  If  a  call  to read(2) processes multiple events from the
fanotify queue
          and an error occurs, the return value will be the total
length of the events
          successfully copied to the user-space buffer before the
error occurred.
          The return value will not be -1, and errno will not be set.
          Thus, the reading application has no way to detect the error.

There is  small subset of open errors that you could get for failure to create
an fd for the event, like  ENODEV/ENXIO and EMFILE.
You do NOT need to call fanotify_init() again, the next read will read
the next event.
(That information might be useful in man page)


>> That said, you should take heed that all of the [fa|i|d]notify APIs do
>> not extend beyond the local machine when you use them on network
>> filesystems. If you're expecting to get notification of events that
>> are occurring on other clients, you're going to be disappointed here.
>
> That certainly is disappointing. However, there is a certain level of
> coherency one would expect, namely:
>
>  * An NFS4 client opening a file should be subject to an OPEN_PERM check
>    on that client (if the client is monitoring the mount point).
>
>  * An NFS4 client opening a file should be subject to an OPEN_PERM check
>    on the server (if the server is monitoring the mount point).
>
>  * An fanotify read should not fail mysteriously. Rather, a read on
>    metadata->fd should be the one failing.
>

Depending on the error.
If you can't get an fd, how will the event be reported?
With my suggestion to pass filehandle on event, fd is redundant,
so event can be passed with the error on fd field.

Please try to create a reproducer for case where error is returned
and when it is not.

The fix seems trivial and I can post it once you have the test:
- return EAGAIN for read in case of a single event in queue without fd
  so apps getting the read error will have a good idea what to do
- in case of non single event, maybe copy event with error on event->fd
  to the buffer for specific errors that make sense to report (EMFILE)
  so a watcher checks the values of negative event->fd can maybe do
  something about it (e.g. provide a smaller buffer).

Amir.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
@ 2017-03-23 13:47                 ` Amir Goldstein
  0 siblings, 0 replies; 32+ messages in thread
From: Amir Goldstein @ 2017-03-23 13:47 UTC (permalink / raw)
  To: Marko Rauhamaa
  Cc: Jeff Layton, Al Viro, linux-fsdevel, Jan Kara,
	Linux NFS Mailing List, linux-api

On Thu, Mar 23, 2017 at 8:43 AM, Marko Rauhamaa
<marko.rauhamaa@f-secure.com> wrote:
> Jeff Layton <jlayton@poochiereds.net>:
>
>> It was definitely not the intention to leak this error code to
>> userland. EOPENSTALE is not a POSIX sanctioned error code, so
>> applications generally don't know anything about it and will be
>> confused.
>
> Got it. I will try to work on a reproduction and make a proper bug
> report.
>

Try this:

- watch a single file for permissions events (so you will only have
one event in the queue)
- open file from client to generate single event (don't read event yet)
- remove file from server (to make it stale)
- read event (with stale file)

>> I haven't looked closely at this particular problem, but IIRC we
>> usually just translate EOPENSTALE to ESTALE, and that may be all that
>> needs to be done here. If this happened in the RHEL kernel, then
>> please do open a bug with Red Hat and we'll get it straightened out.
>
> ESTALE has not been mentioned as a possible error code from an fanotify
> read. Most importantly, since read fails, I suppose there is no recovery
> but you must close the fanotify fd and call fanotify_init() again. Or
> should I just ignore it and read on? If so, why bother returning the
> error from the kernel in the first place?
>

Oh my. I completely misread your report before.
I though you were trying to read from the event->fd.
Now I understand that  you mean read from fanotify fd.
That will definitely return the error, but  only in the special case
where open error
happened on the first event being read to the buffer.
If error happens after adding some events to the buffer, fanotify
process will not know
about this. Regular event will be silently dropped and permission event will be
denied.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
@ 2017-03-23 13:47                 ` Amir Goldstein
  0 siblings, 0 replies; 32+ messages in thread
From: Amir Goldstein @ 2017-03-23 13:47 UTC (permalink / raw)
  To: Marko Rauhamaa
  Cc: Jeff Layton, Al Viro, linux-fsdevel, Jan Kara,
	Linux NFS Mailing List, linux-api-u79uwXL29TY76Z2rM5mHXA

On Thu, Mar 23, 2017 at 8:43 AM, Marko Rauhamaa
<marko.rauhamaa-xdZvthzU1HpWk0Htik3J/w@public.gmane.org> wrote:
> Jeff Layton <jlayton-vpEMnDpepFuMZCB2o+C8xQ@public.gmane.org>:
>
>> It was definitely not the intention to leak this error code to
>> userland. EOPENSTALE is not a POSIX sanctioned error code, so
>> applications generally don't know anything about it and will be
>> confused.
>
> Got it. I will try to work on a reproduction and make a proper bug
> report.
>

Try this:

- watch a single file for permissions events (so you will only have
one event in the queue)
- open file from client to generate single event (don't read event yet)
- remove file from server (to make it stale)
- read event (with stale file)

>> I haven't looked closely at this particular problem, but IIRC we
>> usually just translate EOPENSTALE to ESTALE, and that may be all that
>> needs to be done here. If this happened in the RHEL kernel, then
>> please do open a bug with Red Hat and we'll get it straightened out.
>
> ESTALE has not been mentioned as a possible error code from an fanotify
> read. Most importantly, since read fails, I suppose there is no recovery
> but you must close the fanotify fd and call fanotify_init() again. Or
> should I just ignore it and read on? If so, why bother returning the
> error from the kernel in the first place?
>

Oh my. I completely misread your report before.
I though you were trying to read from the event->fd.
Now I understand that  you mean read from fanotify fd.
That will definitely return the error, but  only in the special case
where open error
happened on the first event being read to the buffer.
If error happens after adding some events to the buffer, fanotify
process will not know
about this. Regular event will be silently dropped and permission event will be
denied.

>From fanotify(7) man page BUGS section:

       *  If  a  call  to read(2) processes multiple events from the
fanotify queue
          and an error occurs, the return value will be the total
length of the events
          successfully copied to the user-space buffer before the
error occurred.
          The return value will not be -1, and errno will not be set.
          Thus, the reading application has no way to detect the error.

There is  small subset of open errors that you could get for failure to create
an fd for the event, like  ENODEV/ENXIO and EMFILE.
You do NOT need to call fanotify_init() again, the next read will read
the next event.
(That information might be useful in man page)


>> That said, you should take heed that all of the [fa|i|d]notify APIs do
>> not extend beyond the local machine when you use them on network
>> filesystems. If you're expecting to get notification of events that
>> are occurring on other clients, you're going to be disappointed here.
>
> That certainly is disappointing. However, there is a certain level of
> coherency one would expect, namely:
>
>  * An NFS4 client opening a file should be subject to an OPEN_PERM check
>    on that client (if the client is monitoring the mount point).
>
>  * An NFS4 client opening a file should be subject to an OPEN_PERM check
>    on the server (if the server is monitoring the mount point).
>
>  * An fanotify read should not fail mysteriously. Rather, a read on
>    metadata->fd should be the one failing.
>

Depending on the error.
If you can't get an fd, how will the event be reported?
With my suggestion to pass filehandle on event, fd is redundant,
so event can be passed with the error on fd field.

Please try to create a reproducer for case where error is returned
and when it is not.

The fix seems trivial and I can post it once you have the test:
- return EAGAIN for read in case of a single event in queue without fd
  so apps getting the read error will have a good idea what to do
- in case of non single event, maybe copy event with error on event->fd
  to the buffer for specific errors that make sense to report (EMFILE)
  so a watcher checks the values of negative event->fd can maybe do
  something about it (e.g. provide a smaller buffer).

Amir.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
@ 2017-04-19 13:46                   ` Marko Rauhamaa
  0 siblings, 0 replies; 32+ messages in thread
From: Marko Rauhamaa @ 2017-04-19 13:46 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Jeff Layton, Al Viro, linux-fsdevel, Jan Kara,
	Linux NFS Mailing List, linux-api

Amir Goldstein <amir73il@gmail.com>:

> On Thu, Mar 23, 2017 at 8:43 AM, Marko Rauhamaa
> <marko.rauhamaa@f-secure.com> wrote:
>> Jeff Layton <jlayton@poochiereds.net>:
>>
>>> It was definitely not the intention to leak this error code to
>>> userland. EOPENSTALE is not a POSIX sanctioned error code, so
>>> applications generally don't know anything about it and will be
>>> confused.
>>
>> Got it. I will try to work on a reproduction and make a proper bug
>> report.
>
> Try this:
>
> - watch a single file for permissions events (so you will only have
> one event in the queue)
> - open file from client to generate single event (don't read event yet)
> - remove file from server (to make it stale)
> - read event (with stale file)

I did that and reproduced the problem on a recent development kernel.
Happens every time.

Just take the example program listed under "man fanotify" ("fantest")
and follow these steps:

    ==============================================================
    NFS Server    NFS Client(1)     NFS Client(2)
    ==============================================================
    # echo foo >/nfsshare/bar.txt
                  # cat /nfsshare/bar.txt
                  foo
                                    # ./fantest /nfsshare
                                    Press enter key to terminate.
                                    Listening for events.
    # rm -f /nfsshare/bar.txt
                  # cat /nfsshare/bar.txt
                                    read: Unknown error 518
                  cat: /nfsshare/bar.txt: Operation not permitted
    ==============================================================

where NFS Client (1) and (2) are two terminal sessions on a single NFS
Client machine.

So what do we conclude? Is this a kernel bug or works as designed?

> Oh my. I completely misread your report before. I though you were
> trying to read from the event->fd. Now I understand that you mean read
> from fanotify fd. That will definitely return the error, but only in
> the special case where open error happened on the first event being
> read to the buffer. If error happens after adding some events to the
> buffer, fanotify process will not know about this. Regular event will
> be silently dropped and permission event will be denied.
>
> [...]
>
> You do NOT need to call fanotify_init() again, the next read will read
> the next event.

It does appear that reading the fanotify fd again does the trick. 

However, the client gets an EPERM instead of ENOENT, which is a bit
weird.

> The fix seems trivial and I can post it once you have the test:
> - return EAGAIN for read in case of a single event in queue without fd
>   so apps getting the read error will have a good idea what to do
> - in case of non single event, maybe copy event with error on event->fd
>   to the buffer for specific errors that make sense to report (EMFILE)
>   so a watcher checks the values of negative event->fd can maybe do
>   something about it (e.g. provide a smaller buffer).

EAGAIN would be perfect for me since I'm using fanotify in a nonblocking
mode. It might be a bit surprising in the blocking case.


Marko

-- 
+358 44 990 4795
Skype: marko.rauhamaa_f-secure

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
@ 2017-04-19 13:46                   ` Marko Rauhamaa
  0 siblings, 0 replies; 32+ messages in thread
From: Marko Rauhamaa @ 2017-04-19 13:46 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Jeff Layton, Al Viro, linux-fsdevel, Jan Kara,
	Linux NFS Mailing List, linux-api-u79uwXL29TY76Z2rM5mHXA

Amir Goldstein <amir73il-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>:

> On Thu, Mar 23, 2017 at 8:43 AM, Marko Rauhamaa
> <marko.rauhamaa-xdZvthzU1HpWk0Htik3J/w@public.gmane.org> wrote:
>> Jeff Layton <jlayton-vpEMnDpepFuMZCB2o+C8xQ@public.gmane.org>:
>>
>>> It was definitely not the intention to leak this error code to
>>> userland. EOPENSTALE is not a POSIX sanctioned error code, so
>>> applications generally don't know anything about it and will be
>>> confused.
>>
>> Got it. I will try to work on a reproduction and make a proper bug
>> report.
>
> Try this:
>
> - watch a single file for permissions events (so you will only have
> one event in the queue)
> - open file from client to generate single event (don't read event yet)
> - remove file from server (to make it stale)
> - read event (with stale file)

I did that and reproduced the problem on a recent development kernel.
Happens every time.

Just take the example program listed under "man fanotify" ("fantest")
and follow these steps:

    ==============================================================
    NFS Server    NFS Client(1)     NFS Client(2)
    ==============================================================
    # echo foo >/nfsshare/bar.txt
                  # cat /nfsshare/bar.txt
                  foo
                                    # ./fantest /nfsshare
                                    Press enter key to terminate.
                                    Listening for events.
    # rm -f /nfsshare/bar.txt
                  # cat /nfsshare/bar.txt
                                    read: Unknown error 518
                  cat: /nfsshare/bar.txt: Operation not permitted
    ==============================================================

where NFS Client (1) and (2) are two terminal sessions on a single NFS
Client machine.

So what do we conclude? Is this a kernel bug or works as designed?

> Oh my. I completely misread your report before. I though you were
> trying to read from the event->fd. Now I understand that you mean read
> from fanotify fd. That will definitely return the error, but only in
> the special case where open error happened on the first event being
> read to the buffer. If error happens after adding some events to the
> buffer, fanotify process will not know about this. Regular event will
> be silently dropped and permission event will be denied.
>
> [...]
>
> You do NOT need to call fanotify_init() again, the next read will read
> the next event.

It does appear that reading the fanotify fd again does the trick. 

However, the client gets an EPERM instead of ENOENT, which is a bit
weird.

> The fix seems trivial and I can post it once you have the test:
> - return EAGAIN for read in case of a single event in queue without fd
>   so apps getting the read error will have a good idea what to do
> - in case of non single event, maybe copy event with error on event->fd
>   to the buffer for specific errors that make sense to report (EMFILE)
>   so a watcher checks the values of negative event->fd can maybe do
>   something about it (e.g. provide a smaller buffer).

EAGAIN would be perfect for me since I'm using fanotify in a nonblocking
mode. It might be a bit surprising in the blocking case.


Marko

-- 
+358 44 990 4795
Skype: marko.rauhamaa_f-secure
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
@ 2017-04-20 11:06                     ` Amir Goldstein
  0 siblings, 0 replies; 32+ messages in thread
From: Amir Goldstein @ 2017-04-20 11:06 UTC (permalink / raw)
  To: Marko Rauhamaa
  Cc: Jeff Layton, Al Viro, linux-fsdevel, Jan Kara,
	Linux NFS Mailing List, linux-api

On Wed, Apr 19, 2017 at 4:46 PM, Marko Rauhamaa
<marko.rauhamaa@f-secure.com> wrote:
> Amir Goldstein <amir73il@gmail.com>:
>
>> On Thu, Mar 23, 2017 at 8:43 AM, Marko Rauhamaa
>> <marko.rauhamaa@f-secure.com> wrote:
>>> Jeff Layton <jlayton@poochiereds.net>:
>>>
>>>> It was definitely not the intention to leak this error code to
>>>> userland. EOPENSTALE is not a POSIX sanctioned error code, so
>>>> applications generally don't know anything about it and will be
>>>> confused.
>>>
>>> Got it. I will try to work on a reproduction and make a proper bug
>>> report.
>>
>> Try this:
>>
>> - watch a single file for permissions events (so you will only have
>> one event in the queue)
>> - open file from client to generate single event (don't read event yet)
>> - remove file from server (to make it stale)
>> - read event (with stale file)
>
> I did that and reproduced the problem on a recent development kernel.
> Happens every time.
>
> Just take the example program listed under "man fanotify" ("fantest")
> and follow these steps:
>
>     ==============================================================
>     NFS Server    NFS Client(1)     NFS Client(2)
>     ==============================================================
>     # echo foo >/nfsshare/bar.txt
>                   # cat /nfsshare/bar.txt
>                   foo
>                                     # ./fantest /nfsshare
>                                     Press enter key to terminate.
>                                     Listening for events.
>     # rm -f /nfsshare/bar.txt
>                   # cat /nfsshare/bar.txt
>                                     read: Unknown error 518
>                   cat: /nfsshare/bar.txt: Operation not permitted
>     ==============================================================
>
> where NFS Client (1) and (2) are two terminal sessions on a single NFS
> Client machine.
>

Thanks for the reproducer.
I'll try it myself when I get to it.

> So what do we conclude? Is this a kernel bug or works as designed?
>

Exposing EOPENSTALE to userspace is definitely a kernel bug.


>> Oh my. I completely misread your report before. I though you were
>> trying to read from the event->fd. Now I understand that you mean read
>> from fanotify fd. That will definitely return the error, but only in
>> the special case where open error happened on the first event being
>> read to the buffer. If error happens after adding some events to the
>> buffer, fanotify process will not know about this. Regular event will
>> be silently dropped and permission event will be denied.
>>
>> [...]
>>
>> You do NOT need to call fanotify_init() again, the next read will read
>> the next event.
>
> It does appear that reading the fanotify fd again does the trick.
>
> However, the client gets an EPERM instead of ENOENT, which is a bit
> weird.
>

Why would the client get ENOENT? That EOPENSTALE event is already
consumed, the client reads the next event in the queue.

>> The fix seems trivial and I can post it once you have the test:
>> - return EAGAIN for read in case of a single event in queue without fd
>>   so apps getting the read error will have a good idea what to do
>> - in case of non single event, maybe copy event with error on event->fd
>>   to the buffer for specific errors that make sense to report (EMFILE)
>>   so a watcher checks the values of negative event->fd can maybe do
>>   something about it (e.g. provide a smaller buffer).
>
> EAGAIN would be perfect for me since I'm using fanotify in a nonblocking
> mode. It might be a bit surprising in the blocking case.
>
>

Can you please try this patch?
Can you please try it with blocking and non-blocking
Can you please try to add to reproducer the non empty queue case:
- Add another mark on another mount without PERM events in the mask
- Populate other mount with some files
- Before reading from nfsshare, read from other mount to fill the
   event queue, e.g.:
# cat /tmp/foo* /nfsshare/bar.txt /tmp/bar*

This should result (depending on number of files) with
>= 2 buffer reads - first with /tmp/foo* files access
last with /tmp/bar* files access


diff --git a/fs/notify/fanotify/fanotify_user.c
b/fs/notify/fanotify/fanotify_user.c
index 2b37f27..5b14890 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -295,6 +295,17 @@ static ssize_t fanotify_read(struct file *file,
char __user *buf,
                }

                ret = copy_event_to_user(group, kevent, buf);
+               if (unlikely(ret == -EOPENSTALE)) {
+                       /*
+                        * We cannot report events with stale fd so drop it.
+                        * Setting ret/mask to 0 will continue the event loop
+                        * and do the right thing if there are no more events
+                        * to read (i.e. return bytes read, -EAGAIN or wait).
+                        */
+                       kevent->mask = 0;
+                       ret = 0;
+               }
+
                /*
                 * Permission events get queued to wait for response.  Other
                 * events can be destroyed now.

---

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
@ 2017-04-20 11:06                     ` Amir Goldstein
  0 siblings, 0 replies; 32+ messages in thread
From: Amir Goldstein @ 2017-04-20 11:06 UTC (permalink / raw)
  To: Marko Rauhamaa
  Cc: Jeff Layton, Al Viro, linux-fsdevel, Jan Kara,
	Linux NFS Mailing List, linux-api-u79uwXL29TY76Z2rM5mHXA

On Wed, Apr 19, 2017 at 4:46 PM, Marko Rauhamaa
<marko.rauhamaa-xdZvthzU1HpWk0Htik3J/w@public.gmane.org> wrote:
> Amir Goldstein <amir73il-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>:
>
>> On Thu, Mar 23, 2017 at 8:43 AM, Marko Rauhamaa
>> <marko.rauhamaa-xdZvthzU1HpWk0Htik3J/w@public.gmane.org> wrote:
>>> Jeff Layton <jlayton-vpEMnDpepFuMZCB2o+C8xQ@public.gmane.org>:
>>>
>>>> It was definitely not the intention to leak this error code to
>>>> userland. EOPENSTALE is not a POSIX sanctioned error code, so
>>>> applications generally don't know anything about it and will be
>>>> confused.
>>>
>>> Got it. I will try to work on a reproduction and make a proper bug
>>> report.
>>
>> Try this:
>>
>> - watch a single file for permissions events (so you will only have
>> one event in the queue)
>> - open file from client to generate single event (don't read event yet)
>> - remove file from server (to make it stale)
>> - read event (with stale file)
>
> I did that and reproduced the problem on a recent development kernel.
> Happens every time.
>
> Just take the example program listed under "man fanotify" ("fantest")
> and follow these steps:
>
>     ==============================================================
>     NFS Server    NFS Client(1)     NFS Client(2)
>     ==============================================================
>     # echo foo >/nfsshare/bar.txt
>                   # cat /nfsshare/bar.txt
>                   foo
>                                     # ./fantest /nfsshare
>                                     Press enter key to terminate.
>                                     Listening for events.
>     # rm -f /nfsshare/bar.txt
>                   # cat /nfsshare/bar.txt
>                                     read: Unknown error 518
>                   cat: /nfsshare/bar.txt: Operation not permitted
>     ==============================================================
>
> where NFS Client (1) and (2) are two terminal sessions on a single NFS
> Client machine.
>

Thanks for the reproducer.
I'll try it myself when I get to it.

> So what do we conclude? Is this a kernel bug or works as designed?
>

Exposing EOPENSTALE to userspace is definitely a kernel bug.


>> Oh my. I completely misread your report before. I though you were
>> trying to read from the event->fd. Now I understand that you mean read
>> from fanotify fd. That will definitely return the error, but only in
>> the special case where open error happened on the first event being
>> read to the buffer. If error happens after adding some events to the
>> buffer, fanotify process will not know about this. Regular event will
>> be silently dropped and permission event will be denied.
>>
>> [...]
>>
>> You do NOT need to call fanotify_init() again, the next read will read
>> the next event.
>
> It does appear that reading the fanotify fd again does the trick.
>
> However, the client gets an EPERM instead of ENOENT, which is a bit
> weird.
>

Why would the client get ENOENT? That EOPENSTALE event is already
consumed, the client reads the next event in the queue.

>> The fix seems trivial and I can post it once you have the test:
>> - return EAGAIN for read in case of a single event in queue without fd
>>   so apps getting the read error will have a good idea what to do
>> - in case of non single event, maybe copy event with error on event->fd
>>   to the buffer for specific errors that make sense to report (EMFILE)
>>   so a watcher checks the values of negative event->fd can maybe do
>>   something about it (e.g. provide a smaller buffer).
>
> EAGAIN would be perfect for me since I'm using fanotify in a nonblocking
> mode. It might be a bit surprising in the blocking case.
>
>

Can you please try this patch?
Can you please try it with blocking and non-blocking
Can you please try to add to reproducer the non empty queue case:
- Add another mark on another mount without PERM events in the mask
- Populate other mount with some files
- Before reading from nfsshare, read from other mount to fill the
   event queue, e.g.:
# cat /tmp/foo* /nfsshare/bar.txt /tmp/bar*

This should result (depending on number of files) with
>= 2 buffer reads - first with /tmp/foo* files access
last with /tmp/bar* files access


diff --git a/fs/notify/fanotify/fanotify_user.c
b/fs/notify/fanotify/fanotify_user.c
index 2b37f27..5b14890 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -295,6 +295,17 @@ static ssize_t fanotify_read(struct file *file,
char __user *buf,
                }

                ret = copy_event_to_user(group, kevent, buf);
+               if (unlikely(ret == -EOPENSTALE)) {
+                       /*
+                        * We cannot report events with stale fd so drop it.
+                        * Setting ret/mask to 0 will continue the event loop
+                        * and do the right thing if there are no more events
+                        * to read (i.e. return bytes read, -EAGAIN or wait).
+                        */
+                       kevent->mask = 0;
+                       ret = 0;
+               }
+
                /*
                 * Permission events get queued to wait for response.  Other
                 * events can be destroyed now.

---

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
@ 2017-04-20 11:33                       ` Amir Goldstein
  0 siblings, 0 replies; 32+ messages in thread
From: Amir Goldstein @ 2017-04-20 11:33 UTC (permalink / raw)
  To: Marko Rauhamaa
  Cc: Jeff Layton, Al Viro, linux-fsdevel, Jan Kara,
	Linux NFS Mailing List, linux-api

On Thu, Apr 20, 2017 at 2:06 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Wed, Apr 19, 2017 at 4:46 PM, Marko Rauhamaa
> <marko.rauhamaa@f-secure.com> wrote:
>> Amir Goldstein <amir73il@gmail.com>:
>>
>>> On Thu, Mar 23, 2017 at 8:43 AM, Marko Rauhamaa
>>> <marko.rauhamaa@f-secure.com> wrote:
>>>> Jeff Layton <jlayton@poochiereds.net>:
>>>>
>>>>> It was definitely not the intention to leak this error code to
>>>>> userland. EOPENSTALE is not a POSIX sanctioned error code, so
>>>>> applications generally don't know anything about it and will be
>>>>> confused.
>>>>
>>>> Got it. I will try to work on a reproduction and make a proper bug
>>>> report.
>>>
>>> Try this:
>>>
>>> - watch a single file for permissions events (so you will only have
>>> one event in the queue)
>>> - open file from client to generate single event (don't read event yet)
>>> - remove file from server (to make it stale)
>>> - read event (with stale file)
>>
>> I did that and reproduced the problem on a recent development kernel.
>> Happens every time.
>>
>> Just take the example program listed under "man fanotify" ("fantest")
>> and follow these steps:
>>
>>     ==============================================================
>>     NFS Server    NFS Client(1)     NFS Client(2)
>>     ==============================================================
>>     # echo foo >/nfsshare/bar.txt
>>                   # cat /nfsshare/bar.txt
>>                   foo
>>                                     # ./fantest /nfsshare
>>                                     Press enter key to terminate.
>>                                     Listening for events.
>>     # rm -f /nfsshare/bar.txt
>>                   # cat /nfsshare/bar.txt
>>                                     read: Unknown error 518
>>                   cat: /nfsshare/bar.txt: Operation not permitted
>>     ==============================================================
>>
>> where NFS Client (1) and (2) are two terminal sessions on a single NFS
>> Client machine.
>>
>
> Thanks for the reproducer.
> I'll try it myself when I get to it.
>
>> So what do we conclude? Is this a kernel bug or works as designed?
>>
>
> Exposing EOPENSTALE to userspace is definitely a kernel bug.
>
>
>>> Oh my. I completely misread your report before. I though you were
>>> trying to read from the event->fd. Now I understand that you mean read
>>> from fanotify fd. That will definitely return the error, but only in
>>> the special case where open error happened on the first event being
>>> read to the buffer. If error happens after adding some events to the
>>> buffer, fanotify process will not know about this. Regular event will
>>> be silently dropped and permission event will be denied.
>>>
>>> [...]
>>>
>>> You do NOT need to call fanotify_init() again, the next read will read
>>> the next event.
>>
>> It does appear that reading the fanotify fd again does the trick.
>>
>> However, the client gets an EPERM instead of ENOENT, which is a bit
>> weird.
>>
>
> Why would the client get ENOENT? That EOPENSTALE event is already
> consumed, the client reads the next event in the queue.

Sorry, I keep confusing when you refer to read of file and read of fanotify fd
when kernel fails to get response from fanotify daemon it will deny access
to file. That's the default.

>
>>> The fix seems trivial and I can post it once you have the test:
>>> - return EAGAIN for read in case of a single event in queue without fd
>>>   so apps getting the read error will have a good idea what to do
>>> - in case of non single event, maybe copy event with error on event->fd
>>>   to the buffer for specific errors that make sense to report (EMFILE)
>>>   so a watcher checks the values of negative event->fd can maybe do
>>>   something about it (e.g. provide a smaller buffer).
>>
>> EAGAIN would be perfect for me since I'm using fanotify in a nonblocking
>> mode. It might be a bit surprising in the blocking case.
>>
>>
>
> Can you please try this patch?
> Can you please try it with blocking and non-blocking
> Can you please try to add to reproducer the non empty queue case:
> - Add another mark on another mount without PERM events in the mask
> - Populate other mount with some files
> - Before reading from nfsshare, read from other mount to fill the
>    event queue, e.g.:
> # cat /tmp/foo* /nfsshare/bar.txt /tmp/bar*
>
> This should result (depending on number of files) with
>>= 2 buffer reads - first with /tmp/foo* files access
> last with /tmp/bar* files access
>
>

Sorry I messed up the previous patch. please try this one:

diff --git a/fs/notify/fanotify/fanotify_user.c
b/fs/notify/fanotify/fanotify_user.c
index 2b37f27..7864354 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -295,6 +295,16 @@ static ssize_t fanotify_read(struct file *file,
char __user *buf,
                }

                ret = copy_event_to_user(group, kevent, buf);
+               if (unlikely(ret == -EOPENSTALE)) {
+                       /*
+                        * We cannot report events with stale fd so drop it.
+                        * Setting ret to 0 will continue the event loop and
+                        * do the right thing if there are no more events to
+                        * read (i.e. return bytes read, -EAGAIN or wait).
+                        */
+                       ret = 0;
+               }
+
                /*
                 * Permission events get queued to wait for response.  Other
                 * events can be destroyed now.
@@ -305,7 +315,7 @@ static ssize_t fanotify_read(struct file *file,
char __user *buf,
                                break;
                } else {
 #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
-                       if (ret < 0) {
+                       if (ret <= 0) {
                                FANOTIFY_PE(kevent)->response = FAN_DENY;
                                wake_up(&group->fanotify_data.access_waitq);
                                break;

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
@ 2017-04-20 11:33                       ` Amir Goldstein
  0 siblings, 0 replies; 32+ messages in thread
From: Amir Goldstein @ 2017-04-20 11:33 UTC (permalink / raw)
  To: Marko Rauhamaa
  Cc: Jeff Layton, Al Viro, linux-fsdevel, Jan Kara,
	Linux NFS Mailing List, linux-api-u79uwXL29TY76Z2rM5mHXA

On Thu, Apr 20, 2017 at 2:06 PM, Amir Goldstein <amir73il-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Wed, Apr 19, 2017 at 4:46 PM, Marko Rauhamaa
> <marko.rauhamaa-xdZvthzU1HpWk0Htik3J/w@public.gmane.org> wrote:
>> Amir Goldstein <amir73il-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>:
>>
>>> On Thu, Mar 23, 2017 at 8:43 AM, Marko Rauhamaa
>>> <marko.rauhamaa-xdZvthzU1HpWk0Htik3J/w@public.gmane.org> wrote:
>>>> Jeff Layton <jlayton-vpEMnDpepFuMZCB2o+C8xQ@public.gmane.org>:
>>>>
>>>>> It was definitely not the intention to leak this error code to
>>>>> userland. EOPENSTALE is not a POSIX sanctioned error code, so
>>>>> applications generally don't know anything about it and will be
>>>>> confused.
>>>>
>>>> Got it. I will try to work on a reproduction and make a proper bug
>>>> report.
>>>
>>> Try this:
>>>
>>> - watch a single file for permissions events (so you will only have
>>> one event in the queue)
>>> - open file from client to generate single event (don't read event yet)
>>> - remove file from server (to make it stale)
>>> - read event (with stale file)
>>
>> I did that and reproduced the problem on a recent development kernel.
>> Happens every time.
>>
>> Just take the example program listed under "man fanotify" ("fantest")
>> and follow these steps:
>>
>>     ==============================================================
>>     NFS Server    NFS Client(1)     NFS Client(2)
>>     ==============================================================
>>     # echo foo >/nfsshare/bar.txt
>>                   # cat /nfsshare/bar.txt
>>                   foo
>>                                     # ./fantest /nfsshare
>>                                     Press enter key to terminate.
>>                                     Listening for events.
>>     # rm -f /nfsshare/bar.txt
>>                   # cat /nfsshare/bar.txt
>>                                     read: Unknown error 518
>>                   cat: /nfsshare/bar.txt: Operation not permitted
>>     ==============================================================
>>
>> where NFS Client (1) and (2) are two terminal sessions on a single NFS
>> Client machine.
>>
>
> Thanks for the reproducer.
> I'll try it myself when I get to it.
>
>> So what do we conclude? Is this a kernel bug or works as designed?
>>
>
> Exposing EOPENSTALE to userspace is definitely a kernel bug.
>
>
>>> Oh my. I completely misread your report before. I though you were
>>> trying to read from the event->fd. Now I understand that you mean read
>>> from fanotify fd. That will definitely return the error, but only in
>>> the special case where open error happened on the first event being
>>> read to the buffer. If error happens after adding some events to the
>>> buffer, fanotify process will not know about this. Regular event will
>>> be silently dropped and permission event will be denied.
>>>
>>> [...]
>>>
>>> You do NOT need to call fanotify_init() again, the next read will read
>>> the next event.
>>
>> It does appear that reading the fanotify fd again does the trick.
>>
>> However, the client gets an EPERM instead of ENOENT, which is a bit
>> weird.
>>
>
> Why would the client get ENOENT? That EOPENSTALE event is already
> consumed, the client reads the next event in the queue.

Sorry, I keep confusing when you refer to read of file and read of fanotify fd
when kernel fails to get response from fanotify daemon it will deny access
to file. That's the default.

>
>>> The fix seems trivial and I can post it once you have the test:
>>> - return EAGAIN for read in case of a single event in queue without fd
>>>   so apps getting the read error will have a good idea what to do
>>> - in case of non single event, maybe copy event with error on event->fd
>>>   to the buffer for specific errors that make sense to report (EMFILE)
>>>   so a watcher checks the values of negative event->fd can maybe do
>>>   something about it (e.g. provide a smaller buffer).
>>
>> EAGAIN would be perfect for me since I'm using fanotify in a nonblocking
>> mode. It might be a bit surprising in the blocking case.
>>
>>
>
> Can you please try this patch?
> Can you please try it with blocking and non-blocking
> Can you please try to add to reproducer the non empty queue case:
> - Add another mark on another mount without PERM events in the mask
> - Populate other mount with some files
> - Before reading from nfsshare, read from other mount to fill the
>    event queue, e.g.:
> # cat /tmp/foo* /nfsshare/bar.txt /tmp/bar*
>
> This should result (depending on number of files) with
>>= 2 buffer reads - first with /tmp/foo* files access
> last with /tmp/bar* files access
>
>

Sorry I messed up the previous patch. please try this one:

diff --git a/fs/notify/fanotify/fanotify_user.c
b/fs/notify/fanotify/fanotify_user.c
index 2b37f27..7864354 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -295,6 +295,16 @@ static ssize_t fanotify_read(struct file *file,
char __user *buf,
                }

                ret = copy_event_to_user(group, kevent, buf);
+               if (unlikely(ret == -EOPENSTALE)) {
+                       /*
+                        * We cannot report events with stale fd so drop it.
+                        * Setting ret to 0 will continue the event loop and
+                        * do the right thing if there are no more events to
+                        * read (i.e. return bytes read, -EAGAIN or wait).
+                        */
+                       ret = 0;
+               }
+
                /*
                 * Permission events get queued to wait for response.  Other
                 * events can be destroyed now.
@@ -305,7 +315,7 @@ static ssize_t fanotify_read(struct file *file,
char __user *buf,
                                break;
                } else {
 #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
-                       if (ret < 0) {
+                       if (ret <= 0) {
                                FANOTIFY_PE(kevent)->response = FAN_DENY;
                                wake_up(&group->fanotify_data.access_waitq);
                                break;
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
@ 2017-04-20 12:43                         ` Marko Rauhamaa
  0 siblings, 0 replies; 32+ messages in thread
From: Marko Rauhamaa @ 2017-04-20 12:43 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Jeff Layton, Al Viro, linux-fsdevel, Jan Kara,
	Linux NFS Mailing List, linux-api

Amir Goldstein <amir73il@gmail.com>:

> Sorry I messed up the previous patch. please try this one:

I will try it.

> +                        * do the right thing if there are no more events to
> +                        * read (i.e. return bytes read, -EAGAIN or wait).

EAGAIN is the right thing to do when FAN_NONBLOCK has been specified.
Without FAN_NONBLOCK, EAGAIN is bound to confuse the application. That
could be documented, of course.

More importantly, does EAGAIN here still guarantee EPOLLET semantics of
epoll(7)? IOW, if I get EAGAIN, I shouldn't have to try read(2)ing the
fanotify fd again before calling epoll_wait(2).


Marko

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
@ 2017-04-20 12:43                         ` Marko Rauhamaa
  0 siblings, 0 replies; 32+ messages in thread
From: Marko Rauhamaa @ 2017-04-20 12:43 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Jeff Layton, Al Viro, linux-fsdevel, Jan Kara,
	Linux NFS Mailing List, linux-api-u79uwXL29TY76Z2rM5mHXA

Amir Goldstein <amir73il-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>:

> Sorry I messed up the previous patch. please try this one:

I will try it.

> +                        * do the right thing if there are no more events to
> +                        * read (i.e. return bytes read, -EAGAIN or wait).

EAGAIN is the right thing to do when FAN_NONBLOCK has been specified.
Without FAN_NONBLOCK, EAGAIN is bound to confuse the application. That
could be documented, of course.

More importantly, does EAGAIN here still guarantee EPOLLET semantics of
epoll(7)? IOW, if I get EAGAIN, I shouldn't have to try read(2)ing the
fanotify fd again before calling epoll_wait(2).


Marko

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
@ 2017-04-20 13:34                           ` Amir Goldstein
  0 siblings, 0 replies; 32+ messages in thread
From: Amir Goldstein @ 2017-04-20 13:34 UTC (permalink / raw)
  To: Marko Rauhamaa
  Cc: Jeff Layton, Al Viro, linux-fsdevel, Jan Kara,
	Linux NFS Mailing List, linux-api

On Thu, Apr 20, 2017 at 3:43 PM, Marko Rauhamaa
<marko.rauhamaa@f-secure.com> wrote:
> Amir Goldstein <amir73il@gmail.com>:
>
>> Sorry I messed up the previous patch. please try this one:
>
> I will try it.
>
>> +                        * do the right thing if there are no more events to
>> +                        * read (i.e. return bytes read, -EAGAIN or wait).
>
> EAGAIN is the right thing to do when FAN_NONBLOCK has been specified.
> Without FAN_NONBLOCK, EAGAIN is bound to confuse the application. That
> could be documented, of course.
>

My comment says "do the right thing ... -EAGAIN or wait", meaning depending
FAN_NONBLOCK. The same code that checks for FAN_NONBLOCK will
take care of that. My patch only takes care of dropping the stale event and
continue to next event. If there is no next event, code will "do the
right thing".

> More importantly, does EAGAIN here still guarantee EPOLLET semantics of
> epoll(7)? IOW, if I get EAGAIN, I shouldn't have to try read(2)ing the
> fanotify fd again before calling epoll_wait(2).
>

Yes, if you get EAGAIN it means there are no more events in the queue,
so shouldn't have to try read again.

Amir.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
@ 2017-04-20 13:34                           ` Amir Goldstein
  0 siblings, 0 replies; 32+ messages in thread
From: Amir Goldstein @ 2017-04-20 13:34 UTC (permalink / raw)
  To: Marko Rauhamaa
  Cc: Jeff Layton, Al Viro, linux-fsdevel, Jan Kara,
	Linux NFS Mailing List, linux-api-u79uwXL29TY76Z2rM5mHXA

On Thu, Apr 20, 2017 at 3:43 PM, Marko Rauhamaa
<marko.rauhamaa-xdZvthzU1HpWk0Htik3J/w@public.gmane.org> wrote:
> Amir Goldstein <amir73il-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>:
>
>> Sorry I messed up the previous patch. please try this one:
>
> I will try it.
>
>> +                        * do the right thing if there are no more events to
>> +                        * read (i.e. return bytes read, -EAGAIN or wait).
>
> EAGAIN is the right thing to do when FAN_NONBLOCK has been specified.
> Without FAN_NONBLOCK, EAGAIN is bound to confuse the application. That
> could be documented, of course.
>

My comment says "do the right thing ... -EAGAIN or wait", meaning depending
FAN_NONBLOCK. The same code that checks for FAN_NONBLOCK will
take care of that. My patch only takes care of dropping the stale event and
continue to next event. If there is no next event, code will "do the
right thing".

> More importantly, does EAGAIN here still guarantee EPOLLET semantics of
> epoll(7)? IOW, if I get EAGAIN, I shouldn't have to try read(2)ing the
> fanotify fd again before calling epoll_wait(2).
>

Yes, if you get EAGAIN it means there are no more events in the queue,
so shouldn't have to try read again.

Amir.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
  2017-04-20 11:33                       ` Amir Goldstein
  (?)
  (?)
@ 2017-04-20 14:20                       ` Jan Kara
  2017-04-20 15:06                           ` Amir Goldstein
  -1 siblings, 1 reply; 32+ messages in thread
From: Jan Kara @ 2017-04-20 14:20 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Marko Rauhamaa, Jeff Layton, Al Viro, linux-fsdevel, Jan Kara,
	Linux NFS Mailing List, linux-api

On Thu 20-04-17 14:33:04, Amir Goldstein wrote:
> 
> Sorry I messed up the previous patch. please try this one:
> 
> diff --git a/fs/notify/fanotify/fanotify_user.c
> b/fs/notify/fanotify/fanotify_user.c
> index 2b37f27..7864354 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -295,6 +295,16 @@ static ssize_t fanotify_read(struct file *file,
> char __user *buf,
>                 }
> 
>                 ret = copy_event_to_user(group, kevent, buf);
> +               if (unlikely(ret == -EOPENSTALE)) {
> +                       /*
> +                        * We cannot report events with stale fd so drop it.
> +                        * Setting ret to 0 will continue the event loop and
> +                        * do the right thing if there are no more events to
> +                        * read (i.e. return bytes read, -EAGAIN or wait).
> +                        */
> +                       ret = 0;
> +               }
> +
>                 /*
>                  * Permission events get queued to wait for response.  Other
>                  * events can be destroyed now.
> @@ -305,7 +315,7 @@ static ssize_t fanotify_read(struct file *file,
> char __user *buf,
>                                 break;
>                 } else {
>  #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
> -                       if (ret < 0) {
> +                       if (ret <= 0) {
>                                 FANOTIFY_PE(kevent)->response = FAN_DENY;
>                                 wake_up(&group->fanotify_data.access_waitq);
>                                 break;

I don't think you want to break out of the reading loop when ret == 0 and
the code might be more readable as:

               if (!(kevent->mask & FAN_ALL_PERM_EVENTS)) {
                        fsnotify_destroy_event(group, kevent);
               } else {
#ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
                        if (ret <= 0) {
                                FANOTIFY_PE(kevent)->response = FAN_DENY;
                                wake_up(&group->fanotify_data.access_waitq);
                        } else {
                                spin_lock(&group->notification_lock);
                                list_add_tail(&kevent->list,
                                              &group->fanotify_data.access_list);
                                spin_unlock(&group->notification_lock);
                        }
#endif
                }
                if (ret < 0)
                        break;

Hmm?

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
@ 2017-04-20 15:06                           ` Amir Goldstein
  0 siblings, 0 replies; 32+ messages in thread
From: Amir Goldstein @ 2017-04-20 15:06 UTC (permalink / raw)
  To: Jan Kara
  Cc: Marko Rauhamaa, Jeff Layton, Al Viro, linux-fsdevel,
	Linux NFS Mailing List, linux-api

On Thu, Apr 20, 2017 at 5:20 PM, Jan Kara <jack@suse.cz> wrote:
> On Thu 20-04-17 14:33:04, Amir Goldstein wrote:
>>
>> Sorry I messed up the previous patch. please try this one:
>>
>> diff --git a/fs/notify/fanotify/fanotify_user.c
>> b/fs/notify/fanotify/fanotify_user.c
>> index 2b37f27..7864354 100644
>> --- a/fs/notify/fanotify/fanotify_user.c
>> +++ b/fs/notify/fanotify/fanotify_user.c
>> @@ -295,6 +295,16 @@ static ssize_t fanotify_read(struct file *file,
>> char __user *buf,
>>                 }
>>
>>                 ret = copy_event_to_user(group, kevent, buf);
>> +               if (unlikely(ret == -EOPENSTALE)) {
>> +                       /*
>> +                        * We cannot report events with stale fd so drop it.
>> +                        * Setting ret to 0 will continue the event loop and
>> +                        * do the right thing if there are no more events to
>> +                        * read (i.e. return bytes read, -EAGAIN or wait).
>> +                        */
>> +                       ret = 0;
>> +               }
>> +
>>                 /*
>>                  * Permission events get queued to wait for response.  Other
>>                  * events can be destroyed now.
>> @@ -305,7 +315,7 @@ static ssize_t fanotify_read(struct file *file,
>> char __user *buf,
>>                                 break;
>>                 } else {
>>  #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
>> -                       if (ret < 0) {
>> +                       if (ret <= 0) {
>>                                 FANOTIFY_PE(kevent)->response = FAN_DENY;
>>                                 wake_up(&group->fanotify_data.access_waitq);
>>                                 break;
>
> I don't think you want to break out of the reading loop when ret == 0 and
> the code might be more readable as:
>
>                if (!(kevent->mask & FAN_ALL_PERM_EVENTS)) {
>                         fsnotify_destroy_event(group, kevent);
>                } else {
> #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
>                         if (ret <= 0) {
>                                 FANOTIFY_PE(kevent)->response = FAN_DENY;
>                                 wake_up(&group->fanotify_data.access_waitq);
>                         } else {
>                                 spin_lock(&group->notification_lock);
>                                 list_add_tail(&kevent->list,
>                                               &group->fanotify_data.access_list);
>                                 spin_unlock(&group->notification_lock);
>                         }
> #endif
>                 }
>                 if (ret < 0)
>                         break;
>
> Hmm?
>

Right, I missed that.
Thanks.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
@ 2017-04-20 15:06                           ` Amir Goldstein
  0 siblings, 0 replies; 32+ messages in thread
From: Amir Goldstein @ 2017-04-20 15:06 UTC (permalink / raw)
  To: Jan Kara
  Cc: Marko Rauhamaa, Jeff Layton, Al Viro, linux-fsdevel,
	Linux NFS Mailing List, linux-api-u79uwXL29TY76Z2rM5mHXA

On Thu, Apr 20, 2017 at 5:20 PM, Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org> wrote:
> On Thu 20-04-17 14:33:04, Amir Goldstein wrote:
>>
>> Sorry I messed up the previous patch. please try this one:
>>
>> diff --git a/fs/notify/fanotify/fanotify_user.c
>> b/fs/notify/fanotify/fanotify_user.c
>> index 2b37f27..7864354 100644
>> --- a/fs/notify/fanotify/fanotify_user.c
>> +++ b/fs/notify/fanotify/fanotify_user.c
>> @@ -295,6 +295,16 @@ static ssize_t fanotify_read(struct file *file,
>> char __user *buf,
>>                 }
>>
>>                 ret = copy_event_to_user(group, kevent, buf);
>> +               if (unlikely(ret == -EOPENSTALE)) {
>> +                       /*
>> +                        * We cannot report events with stale fd so drop it.
>> +                        * Setting ret to 0 will continue the event loop and
>> +                        * do the right thing if there are no more events to
>> +                        * read (i.e. return bytes read, -EAGAIN or wait).
>> +                        */
>> +                       ret = 0;
>> +               }
>> +
>>                 /*
>>                  * Permission events get queued to wait for response.  Other
>>                  * events can be destroyed now.
>> @@ -305,7 +315,7 @@ static ssize_t fanotify_read(struct file *file,
>> char __user *buf,
>>                                 break;
>>                 } else {
>>  #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
>> -                       if (ret < 0) {
>> +                       if (ret <= 0) {
>>                                 FANOTIFY_PE(kevent)->response = FAN_DENY;
>>                                 wake_up(&group->fanotify_data.access_waitq);
>>                                 break;
>
> I don't think you want to break out of the reading loop when ret == 0 and
> the code might be more readable as:
>
>                if (!(kevent->mask & FAN_ALL_PERM_EVENTS)) {
>                         fsnotify_destroy_event(group, kevent);
>                } else {
> #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
>                         if (ret <= 0) {
>                                 FANOTIFY_PE(kevent)->response = FAN_DENY;
>                                 wake_up(&group->fanotify_data.access_waitq);
>                         } else {
>                                 spin_lock(&group->notification_lock);
>                                 list_add_tail(&kevent->list,
>                                               &group->fanotify_data.access_list);
>                                 spin_unlock(&group->notification_lock);
>                         }
> #endif
>                 }
>                 if (ret < 0)
>                         break;
>
> Hmm?
>

Right, I missed that.
Thanks.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
@ 2017-04-21 13:13                             ` Marko Rauhamaa
  0 siblings, 0 replies; 32+ messages in thread
From: Marko Rauhamaa @ 2017-04-21 13:13 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Jeff Layton, Al Viro, linux-fsdevel, Jan Kara,
	Linux NFS Mailing List, linux-api

Amir Goldstein <amir73il@gmail.com>:

> On Thu, Apr 20, 2017 at 3:43 PM, Marko Rauhamaa
> <marko.rauhamaa@f-secure.com> wrote:
>> Amir Goldstein <amir73il@gmail.com>:
>>
>>> Sorry I messed up the previous patch. please try this one:
>>
>> I will try it.

Tried it. Superficially, it seems to be working, but...

>> More importantly, does EAGAIN here still guarantee EPOLLET semantics
>> of epoll(7)? IOW, if I get EAGAIN, I shouldn't have to try read(2)ing
>> the fanotify fd again before calling epoll_wait(2).
>
> Yes, if you get EAGAIN it means there are no more events in the queue,
> so shouldn't have to try read again.

I ran into a system hang with our real product that suggests there might
be a problem left. It would be explained if your patch generated an
EAGAIN while there were other events waiting in the queue.

I will have to investigate further.


Marko

-- 
+358 44 990 4795
Skype: marko.rauhamaa_f-secure

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
@ 2017-04-21 13:13                             ` Marko Rauhamaa
  0 siblings, 0 replies; 32+ messages in thread
From: Marko Rauhamaa @ 2017-04-21 13:13 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Jeff Layton, Al Viro, linux-fsdevel, Jan Kara,
	Linux NFS Mailing List, linux-api-u79uwXL29TY76Z2rM5mHXA

Amir Goldstein <amir73il-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>:

> On Thu, Apr 20, 2017 at 3:43 PM, Marko Rauhamaa
> <marko.rauhamaa-xdZvthzU1HpWk0Htik3J/w@public.gmane.org> wrote:
>> Amir Goldstein <amir73il-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>:
>>
>>> Sorry I messed up the previous patch. please try this one:
>>
>> I will try it.

Tried it. Superficially, it seems to be working, but...

>> More importantly, does EAGAIN here still guarantee EPOLLET semantics
>> of epoll(7)? IOW, if I get EAGAIN, I shouldn't have to try read(2)ing
>> the fanotify fd again before calling epoll_wait(2).
>
> Yes, if you get EAGAIN it means there are no more events in the queue,
> so shouldn't have to try read again.

I ran into a system hang with our real product that suggests there might
be a problem left. It would be explained if your patch generated an
EAGAIN while there were other events waiting in the queue.

I will have to investigate further.


Marko

-- 
+358 44 990 4795
Skype: marko.rauhamaa_f-secure

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
@ 2017-04-22  7:22                             ` Amir Goldstein
  0 siblings, 0 replies; 32+ messages in thread
From: Amir Goldstein @ 2017-04-22  7:22 UTC (permalink / raw)
  To: Jan Kara
  Cc: Marko Rauhamaa, Jeff Layton, Al Viro, linux-fsdevel,
	Linux NFS Mailing List, linux-api

On Thu, Apr 20, 2017 at 6:06 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Thu, Apr 20, 2017 at 5:20 PM, Jan Kara <jack@suse.cz> wrote:
>> On Thu 20-04-17 14:33:04, Amir Goldstein wrote:
>>>
>>> Sorry I messed up the previous patch. please try this one:
>>>
>>> diff --git a/fs/notify/fanotify/fanotify_user.c
>>> b/fs/notify/fanotify/fanotify_user.c
>>> index 2b37f27..7864354 100644
>>> --- a/fs/notify/fanotify/fanotify_user.c
>>> +++ b/fs/notify/fanotify/fanotify_user.c
>>> @@ -295,6 +295,16 @@ static ssize_t fanotify_read(struct file *file,
>>> char __user *buf,
>>>                 }
>>>
>>>                 ret = copy_event_to_user(group, kevent, buf);
>>> +               if (unlikely(ret == -EOPENSTALE)) {
>>> +                       /*
>>> +                        * We cannot report events with stale fd so drop it.
>>> +                        * Setting ret to 0 will continue the event loop and
>>> +                        * do the right thing if there are no more events to
>>> +                        * read (i.e. return bytes read, -EAGAIN or wait).
>>> +                        */
>>> +                       ret = 0;
>>> +               }
>>> +
>>>                 /*
>>>                  * Permission events get queued to wait for response.  Other
>>>                  * events can be destroyed now.
>>> @@ -305,7 +315,7 @@ static ssize_t fanotify_read(struct file *file,
>>> char __user *buf,
>>>                                 break;
>>>                 } else {
>>>  #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
>>> -                       if (ret < 0) {
>>> +                       if (ret <= 0) {
>>>                                 FANOTIFY_PE(kevent)->response = FAN_DENY;
>>>                                 wake_up(&group->fanotify_data.access_waitq);
>>>                                 break;
>>
>> I don't think you want to break out of the reading loop when ret == 0 and
>> the code might be more readable as:
>>
>>                if (!(kevent->mask & FAN_ALL_PERM_EVENTS)) {
>>                         fsnotify_destroy_event(group, kevent);
>>                } else {
>> #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
>>                         if (ret <= 0) {
>>                                 FANOTIFY_PE(kevent)->response = FAN_DENY;
>>                                 wake_up(&group->fanotify_data.access_waitq);
>>                         } else {
>>                                 spin_lock(&group->notification_lock);
>>                                 list_add_tail(&kevent->list,
>>                                               &group->fanotify_data.access_list);
>>                                 spin_unlock(&group->notification_lock);
>>                         }
>> #endif
>>                 }
>>                 if (ret < 0)
>>                         break;
>>
>> Hmm?
>>
>

On Fri, Apr 21, 2017 at 5:27 PM, Marko Rauhamaa
<marko.rauhamaa@f-secure.com> wrote:
> Amir Goldstein <amir73il@gmail.com>:
>
>> Did you notice Jan's comments on my patch? I had a bug that broke out
>> of the loop. Without his corrections read will return even if there
>> are more events in the queue.
>
> Yes, I now tried Jan's fix, and it did the trick.
>
> It will now take months or years before distros have a proper fix. In
> the interim, I will absorb EOPENSTALE and schedule a reread in that
> situation.
>
> Thank you both for your attention.
>
>

Marko,

Were you able to verify that both blocking and non-blocking mode work correctly?
May I add your Tested-by and Reported-by tags?

Thanks!
Amir.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
@ 2017-04-22  7:22                             ` Amir Goldstein
  0 siblings, 0 replies; 32+ messages in thread
From: Amir Goldstein @ 2017-04-22  7:22 UTC (permalink / raw)
  To: Jan Kara
  Cc: Marko Rauhamaa, Jeff Layton, Al Viro, linux-fsdevel,
	Linux NFS Mailing List, linux-api-u79uwXL29TY76Z2rM5mHXA

On Thu, Apr 20, 2017 at 6:06 PM, Amir Goldstein <amir73il-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Thu, Apr 20, 2017 at 5:20 PM, Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org> wrote:
>> On Thu 20-04-17 14:33:04, Amir Goldstein wrote:
>>>
>>> Sorry I messed up the previous patch. please try this one:
>>>
>>> diff --git a/fs/notify/fanotify/fanotify_user.c
>>> b/fs/notify/fanotify/fanotify_user.c
>>> index 2b37f27..7864354 100644
>>> --- a/fs/notify/fanotify/fanotify_user.c
>>> +++ b/fs/notify/fanotify/fanotify_user.c
>>> @@ -295,6 +295,16 @@ static ssize_t fanotify_read(struct file *file,
>>> char __user *buf,
>>>                 }
>>>
>>>                 ret = copy_event_to_user(group, kevent, buf);
>>> +               if (unlikely(ret == -EOPENSTALE)) {
>>> +                       /*
>>> +                        * We cannot report events with stale fd so drop it.
>>> +                        * Setting ret to 0 will continue the event loop and
>>> +                        * do the right thing if there are no more events to
>>> +                        * read (i.e. return bytes read, -EAGAIN or wait).
>>> +                        */
>>> +                       ret = 0;
>>> +               }
>>> +
>>>                 /*
>>>                  * Permission events get queued to wait for response.  Other
>>>                  * events can be destroyed now.
>>> @@ -305,7 +315,7 @@ static ssize_t fanotify_read(struct file *file,
>>> char __user *buf,
>>>                                 break;
>>>                 } else {
>>>  #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
>>> -                       if (ret < 0) {
>>> +                       if (ret <= 0) {
>>>                                 FANOTIFY_PE(kevent)->response = FAN_DENY;
>>>                                 wake_up(&group->fanotify_data.access_waitq);
>>>                                 break;
>>
>> I don't think you want to break out of the reading loop when ret == 0 and
>> the code might be more readable as:
>>
>>                if (!(kevent->mask & FAN_ALL_PERM_EVENTS)) {
>>                         fsnotify_destroy_event(group, kevent);
>>                } else {
>> #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
>>                         if (ret <= 0) {
>>                                 FANOTIFY_PE(kevent)->response = FAN_DENY;
>>                                 wake_up(&group->fanotify_data.access_waitq);
>>                         } else {
>>                                 spin_lock(&group->notification_lock);
>>                                 list_add_tail(&kevent->list,
>>                                               &group->fanotify_data.access_list);
>>                                 spin_unlock(&group->notification_lock);
>>                         }
>> #endif
>>                 }
>>                 if (ret < 0)
>>                         break;
>>
>> Hmm?
>>
>

On Fri, Apr 21, 2017 at 5:27 PM, Marko Rauhamaa
<marko.rauhamaa-xdZvthzU1HpWk0Htik3J/w@public.gmane.org> wrote:
> Amir Goldstein <amir73il-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>:
>
>> Did you notice Jan's comments on my patch? I had a bug that broke out
>> of the loop. Without his corrections read will return even if there
>> are more events in the queue.
>
> Yes, I now tried Jan's fix, and it did the trick.
>
> It will now take months or years before distros have a proper fix. In
> the interim, I will absorb EOPENSTALE and schedule a reread in that
> situation.
>
> Thank you both for your attention.
>
>

Marko,

Were you able to verify that both blocking and non-blocking mode work correctly?
May I add your Tested-by and Reported-by tags?

Thanks!
Amir.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
@ 2017-04-24  7:40                               ` Marko Rauhamaa
  0 siblings, 0 replies; 32+ messages in thread
From: Marko Rauhamaa @ 2017-04-24  7:40 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Jan Kara, Jeff Layton, Al Viro, linux-fsdevel,
	Linux NFS Mailing List, linux-api

Amir Goldstein <amir73il@gmail.com>:

> Were you able to verify that both blocking and non-blocking mode work
> correctly? May I add your Tested-by and Reported-by tags?

I have only verified the nonblocking case in my tests (which are by no
means extensive). The reported issue can no longer be reproduced with
the patched kernel, and the regular OPEN_PERM function is still
operational.

Feel free to add me to the tags.


Marko

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: fanotify read returns with errno == EOPENSTALE
@ 2017-04-24  7:40                               ` Marko Rauhamaa
  0 siblings, 0 replies; 32+ messages in thread
From: Marko Rauhamaa @ 2017-04-24  7:40 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Jan Kara, Jeff Layton, Al Viro, linux-fsdevel,
	Linux NFS Mailing List, linux-api-u79uwXL29TY76Z2rM5mHXA

Amir Goldstein <amir73il-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>:

> Were you able to verify that both blocking and non-blocking mode work
> correctly? May I add your Tested-by and Reported-by tags?

I have only verified the nonblocking case in my tests (which are by no
means extensive). The reported issue can no longer be reproduced with
the patched kernel, and the regular OPEN_PERM function is still
operational.

Feel free to add me to the tags.


Marko

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2017-04-24  7:40 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-22 15:31 fanotify read returns with errno == EOPENSTALE Marko Rauhamaa
2017-03-22 18:01 ` Matthew Wilcox
2017-03-22 18:20 ` Amir Goldstein
2017-03-22 19:17   ` Amir Goldstein
2017-03-22 19:31   ` Al Viro
2017-03-22 19:39     ` Amir Goldstein
2017-03-23  8:13       ` Marko Rauhamaa
2017-03-23 11:46         ` Amir Goldstein
2017-03-23 11:56           ` Jeff Layton
2017-03-23 12:43             ` Marko Rauhamaa
2017-03-23 13:47               ` Amir Goldstein
2017-03-23 13:47                 ` Amir Goldstein
2017-03-23 13:47                 ` Amir Goldstein
2017-04-19 13:46                 ` Marko Rauhamaa
2017-04-19 13:46                   ` Marko Rauhamaa
2017-04-20 11:06                   ` Amir Goldstein
2017-04-20 11:06                     ` Amir Goldstein
2017-04-20 11:33                     ` Amir Goldstein
2017-04-20 11:33                       ` Amir Goldstein
2017-04-20 12:43                       ` Marko Rauhamaa
2017-04-20 12:43                         ` Marko Rauhamaa
2017-04-20 13:34                         ` Amir Goldstein
2017-04-20 13:34                           ` Amir Goldstein
2017-04-21 13:13                           ` Marko Rauhamaa
2017-04-21 13:13                             ` Marko Rauhamaa
2017-04-20 14:20                       ` Jan Kara
2017-04-20 15:06                         ` Amir Goldstein
2017-04-20 15:06                           ` Amir Goldstein
2017-04-22  7:22                           ` Amir Goldstein
2017-04-22  7:22                             ` Amir Goldstein
2017-04-24  7:40                             ` Marko Rauhamaa
2017-04-24  7:40                               ` Marko Rauhamaa

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.