From: "Quentin.BOUGET@cea.fr" <Quentin.BOUGET@cea.fr>
To: Amir Goldstein <amir73il@gmail.com>,
	Dominique Martinet <asmadeus@codewreck.org>
Cc: Jan Kara <jack@suse.cz>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-api@vger.kernel.org" <linux-api@vger.kernel.org>,
	"robinhood-devel@lists.sf.net" <robinhood-devel@lists.sf.net>
Subject: Re: robinhood, fanotify name info events and lustre changelog
Date: Mon, 1 Jun 2020 19:46:15 +0000
Message-ID: <1591040775412.28640@cea.fr>
In-Reply-To: <CAOQ4uxiE9R4gRGwQQETvWK7SLm4J60SvfrSAOZxYJdRHquAwtA@mail.gmail.com>

> > > > I am guessing the most interesting bits for this discussion should be found
> > > > here:
> > > > https://github.com/cea-hpc/robinhood/blob/v4/include/robinhood/fsevent.h
> > > >
> > > 
> > > That is a very well documented API and a valuable resource for me.

Thank you!

> > > Notes for API choices that are aligned with current fanotify plans:
> > > - The combination of parent fid + object fid without name is never expected
> > > 
> > > Notes for API choices that are NOT aligned with current fanotify plans:
> > > - LINK/UNLINK events carry the linked/unlinked object fid
> > > - XATTR events for inode (not namespace) do not carry parent fid/name
> > > 
> > > This doesn't mean that fanotify -> rbh_fsevent translation is not going to
> > > be possible.
> > > 
> > > With fanotify FAN_CREATE event, for example, the parent fid + name
> > > information should be used by the rbh adapter code to call
> > > name_to_handle_at(2) and get the created object's file handle.
> > > 
> > > The reason we made this API choice is because fanotify events should
> > > not be perceived as a sequence of changes that can be followed to
> > > describe the current state of the filesystem.
> > > fanotify events should be perceived as a "poll" on the namespace.
> > > Whenever notified of a change, application should read the current state
> > > for the filesystem. fanotify events provide "just enough" information, so
> > > reading the current state of the filesystem is not too expensive.

I am a little worried about objects that would move around constantly and thus
"evade" name_to_handle_at(). A bad actor could try to hide a setuid binary this
way... Of course, they could also copy and delete the file repeatedly, in which
case having the fid becomes useless, but that seems harder to do and would
likely take longer than a simple rename.
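
For reference, the lookup under discussion could be sketched as below. This is
only a minimal sketch, assuming the adapter already holds a directory fd for
the parent (for instance obtained from the parent fid); resolve_child_handle()
is a hypothetical helper name, not part of robinhood or fanotify. The rename
race sits between the event being generated and the name_to_handle_at() call.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>

/*
 * Hypothetical helper: resolve the file handle of `name` inside the
 * directory open at `parent_fd`.
 *
 * Returns a malloc'ed handle (the caller frees it), or NULL with errno set.
 * If the entry was renamed or replaced since the event was generated, this
 * fails with ENOENT or silently returns the handle of another object.
 */
struct file_handle *resolve_child_handle(int parent_fd, const char *name)
{
    struct file_handle *fh;
    int mount_id;
    int saved_errno;

    assert(name != NULL);

    fh = malloc(sizeof(*fh) + MAX_HANDLE_SZ);
    if (fh == NULL)
        return NULL;
    fh->handle_bytes = MAX_HANDLE_SZ;

    /* flags == 0: a trailing symlink is not followed */
    if (name_to_handle_at(parent_fd, name, fh, &mount_id, 0) == -1) {
        saved_errno = errno;
        free(fh);
        errno = saved_errno;
        return NULL;
    }
    return fh;
}
```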

> > > When fanotify event FAN_MODIFY reports a change of file size,
> > > along with parent fid + name, that do not match the parent/name robinhood
> > > knows about (i.e. because the event is received out of order with rename),
> > > you may use that information to create rbh_fsevent_ns_xattr event to
> > > update the path or you may wait for the FAN_MOVE_SELF event that
> > > should arrive later.
> > > Up to you.

This makes me wonder: if I receive such a FAN_MODIFY event, and a different
object is moved to parent_fid + name before I query the FS, how can I know which
file the event was originally meant for?
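
This does not answer the question, but the situation can at least be detected
when the adapter has previously recorded a handle for that parent/name pair:
the handle resolved now can be compared with the recorded one. A minimal
sketch under that assumption; handle_equal() is a hypothetical helper name.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: true if both handles designate the same object.
 * Handles are opaque, so equality means same type, same length and
 * byte-identical payload.
 */
bool handle_equal(const struct file_handle *a, const struct file_handle *b)
{
    assert(a != NULL && b != NULL);

    return a->handle_type == b->handle_type
        && a->handle_bytes == b->handle_bytes
        && memcmp(a->f_handle, b->f_handle, a->handle_bytes) == 0;
}
```

A mismatch only says that the entry at parent_fid + name is no longer the
recorded object; it does not say which object the event was about.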

> > > > So, to be fair, full paths _are_ computed solely from information in the
> > > > changelog records, even though it requires a bit of processing on the side.
> > > > No additional query to the filesystem for that.
> > > 
> > > As I wrote, the fact that robinhood trusts the information in changelog
> > > records doesn't mean that information needs to arrive from the kernel.
> > > The adapter code should use information provided by fanotify events
> > > then use open_by_handle_at(2) for directory fid to find its current
> > > path in the filesystem then feed that information to a robinhood change
> > > record.
> > 
> > I can agree with that - it's not because for lustre we made the decision
> > to be able to run without querying the filesystem at all that it has to
> > hold true for all types of inputs.

I agree as well. The issue I mention above is a special case. In general, I am
fine with the "just enough information" approach.
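
The adapter-side lookup described in the quoted text (open_by_handle_at(2) on
the directory fid, then finding its current path) could look roughly like the
sketch below. It assumes a Linux system with /proc mounted, a process holding
CAP_DAC_READ_SEARCH (required by open_by_handle_at()), and a `mount_fd` open
on the filesystem the handle belongs to. fd_to_path() and dir_handle_to_path()
are hypothetical helper names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Write the current path of `fd` into `buf`; 0 on success, -1 on error. */
int fd_to_path(int fd, char *buf, size_t size)
{
    char link[64];
    ssize_t len;

    assert(buf != NULL && size > 0);

    snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);
    len = readlink(link, buf, size);
    if (len < 0)
        return -1;
    if ((size_t)len >= size) {
        /* No room left for the terminator: the path may be truncated */
        errno = ENAMETOOLONG;
        return -1;
    }
    buf[len] = '\0';
    return 0;
}

/* Resolve a directory handle to the path it has right now. */
int dir_handle_to_path(int mount_fd, struct file_handle *fh,
                       char *buf, size_t size)
{
    int fd, rc, saved_errno;

    fd = open_by_handle_at(mount_fd, fh, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return -1;

    rc = fd_to_path(fd, buf, size);
    saved_errno = errno;
    close(fd);
    errno = saved_errno;
    return rc;
}
```

The path is only valid at the time of the readlink(); a concurrent rename can
still invalidate it before it is stored.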

> > > May I ask, what is the reason for embarking on the project to decouple
> > > robinhood v4 API from Lustre changelog API?

There is an impedance mismatch between what Lustre emits and what robinhood
needs for its updates: even with Lustre's changelog, we still need to query
the filesystem to get additional information. I could have extended Lustre's
structures, but then I would have depended on them too much for my taste. It
just seemed cleaner to have a clear separation between the two.

> Looking at robinhood (especially v4), it seems like it could fit
> very well into the vacuum in Linux and act as "fsnotifyd".
> Unprivileged applications and services could register to event streams
> and get fed from db, so applications not running will not lose events.
> Events delivered to unprivileged applications need to be filtered by
> subtree for those applications, something that fanotify does not do and
> will not likely do, and filtered by the access permissions of the
> application to the path of the reported object.

The plan is to use a dedicated message queue for the streaming part (such as
Kafka or RabbitMQ); robinhood would only really deal with serializing events
into a standard communication format (the current target is YAML) and dumping
them into the message queue.

From there, it's definitely possible to write a program that will filter
events and route them to unprivileged applications... But it is unlikely I will
write it myself. =)

Cheers,
Quentin

