From: "Quentin.BOUGET@cea.fr" <Quentin.BOUGET@cea.fr>
To: Amir Goldstein <amir73il@gmail.com>,
	Dominique Martinet <asmadeus@codewreck.org>
Cc: Jan Kara <jack@suse.cz>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-api@vger.kernel.org" <linux-api@vger.kernel.org>,
	"robinhood-devel@lists.sf.net" <robinhood-devel@lists.sf.net>
Subject: Re: robinhood, fanotify name info events and lustre changelog
Date: Mon, 1 Jun 2020 19:46:15 +0000
Message-ID: <1591040775412.28640@cea.fr>
In-Reply-To: <CAOQ4uxiE9R4gRGwQQETvWK7SLm4J60SvfrSAOZxYJdRHquAwtA@mail.gmail.com>

> > > > I am guessing the most interesting bits for this discussion should be found
> > > > here:
> > > > https://github.com/cea-hpc/robinhood/blob/v4/include/robinhood/fsevent.h
> > > >
> > > 
> > > That is a very well documented API and a valuable resource for me.

Thank you!

> > > Notes for API choices that are aligned with current fanotify plans:
> > > - The combination of parent fid + object fid without name is never expected
> > > 
> > > Notes for API choices that are NOT aligned with current fanotify plans:
> > > - LINK/UNLINK events carry the linked/unlinked object fid
> > > - XATTR events for inode (not namespace) do not carry parent fid/name
> > > 
> > > This doesn't mean that fanotify -> rbh_fsevent translation is not going to
> > > be possible.
> > > 
> > > With fanotify FAN_CREATE event, for example, the parent fid + name
> > > information should be used by the rbh adapter code to call
> > > name_to_handle_at(2) and get the created object's file handle.
> > > 
> > > The reason we made this API choice is because fanotify events should
> > > not be perceived as a sequence of changes that can be followed to
> > > describe the current state of the filesystem.
> > > fanotify events should be perceived as a "poll" on the namespace.
> > > Whenever notified of a change, application should read the current state
> > > for the filesystem. fanotify events provide "just enough" information, so
> > > reading the current state of the filesystem is not too expensive.

I am a little worried about objects that move around constantly and thus
"evade" name_to_handle_at(). A bad actor could try to hide a setuid binary this
way... Of course they could also just copy/delete the file repeatedly, in which
case having the fid becomes useless, but that seems harder to do and would
likely take more time than a simple rename.
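
For reference, the lookup described in the quoted text (parent fid + name in,
object handle out) could be sketched as follows. This is only an illustration:
lookup_handle() is a made-up helper name, and it assumes the adapter already
holds an open file descriptor on the parent directory (for instance one
obtained from the parent fid with open_by_handle_at(2)). Only
name_to_handle_at(2) itself is a real interface here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>

/*
 * Hypothetical helper: resolve `name` relative to the open directory
 * `parent_fd` into a file handle.
 *
 * Returns a malloc()ed handle the caller must free(), or NULL with errno set.
 * With flags == 0, a trailing symlink is not followed, so the handle
 * designates the directory entry's own object.
 */
struct file_handle *
lookup_handle(int parent_fd, const char *name, int *mount_id)
{
    struct file_handle *handle;
    int save_errno;

    assert(name != NULL && mount_id != NULL);

    handle = malloc(sizeof(*handle) + MAX_HANDLE_SZ);
    if (handle == NULL)
        return NULL;
    handle->handle_bytes = MAX_HANDLE_SZ;

    if (name_to_handle_at(parent_fd, name, handle, mount_id, 0) == -1) {
        save_errno = errno;
        free(handle);
        errno = save_errno;
        return NULL;
    }
    return handle;
}
```

Nothing prevents the entry from being renamed between the event and this call,
which is exactly the window I am worried about.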

> > > When fanotify event FAN_MODIFY reports a change of file size,
> > > along with parent fid + name, that do not match the parent/name robinhood
> > > knows about (i.e. because the event is received out of order with rename),
> > > you may use that information to create rbh_fsevent_ns_xattr event to
> > > update the path or you may wait for the FAN_MOVE_SELF event that
> > > should arrive later.
> > > Up to you.

This is making me think: if I receive such a FAN_MODIFY event, and another
object is moved to parent_fid + name before I query the FS, how can I know which
file the event was originally meant for?
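
One partial mitigation would be to at least detect that the name now resolves
to a different object. A minimal sketch, assuming the adapter keeps the handle
it last recorded for a given parent_fid + name (an assumption, nothing in this
thread says it does); handles_equal() and handle_dup() are hypothetical
helpers:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: true if both handles designate the same object.
 * Only meaningful for handles obtained from the same filesystem.
 */
bool
handles_equal(const struct file_handle *a, const struct file_handle *b)
{
    assert(a != NULL && b != NULL);

    return a->handle_type == b->handle_type
        && a->handle_bytes == b->handle_bytes
        && memcmp(a->f_handle, b->f_handle, a->handle_bytes) == 0;
}

/*
 * Hypothetical helper: copy a handle so it can be recorded and compared
 * against a later lookup. Returns NULL if the allocation fails.
 */
struct file_handle *
handle_dup(const struct file_handle *handle)
{
    size_t size = sizeof(*handle) + handle->handle_bytes;
    struct file_handle *copy = malloc(size);

    if (copy != NULL)
        memcpy(copy, handle, size);
    return copy;
}
```

This only says that the object behind the name changed, not which object the
event was originally about.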

> > > > So, to be fair, full paths _are_ computed solely from information in the
> > > > changelog records, even though it requires a bit of processing on the side.
> > > > No additional query to the filesystem for that.
> > > 
> > > As I wrote, the fact that robinhood trusts the information in changelog
> > > records doesn't mean that information needs to arrive from the kernel.
> > > The adapter code should use information provided by fanotify events,
> > > then use open_by_handle_at(2) on the directory fid to find its current
> > > path in the filesystem, then feed that information to a robinhood change
> > > record.
> > 
> > I can agree with that - the fact that we decided, for lustre, to be able
> > to run without querying the filesystem at all does not mean it has to
> > hold true for all types of input.

I agree as well. The issue I mention above is a special case. In general, I am
fine with the "just enough information" approach.
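
For the record, the directory fid to path step suggested in the quoted text
could look roughly like this. It is a sketch under assumptions: fd_to_path()
and dir_handle_to_path() are hypothetical names, mount_fd is assumed to be any
open descriptor on the filesystem the handle came from, and the path is read
back through the /proc/self/fd symlink, which is one common approach rather
than something prescribed in this thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: store the current path of the open descriptor `fd`
 * in `buf` by reading the /proc/self/fd symlink.
 *
 * Returns the path length, or -1 with errno set. A path that does not fit
 * (terminator included) is reported as ENAMETOOLONG instead of truncated.
 */
ssize_t
fd_to_path(int fd, char *buf, size_t size)
{
    char link[64];
    ssize_t len;

    assert(buf != NULL);

    snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);
    len = readlink(link, buf, size);
    if (len == -1)
        return -1;
    if ((size_t)len >= size) {
        errno = ENAMETOOLONG;
        return -1;
    }
    buf[len] = '\0';
    return len;
}

/*
 * Hypothetical helper: resolve a directory handle to its current path.
 * open_by_handle_at(2) requires the CAP_DAC_READ_SEARCH capability.
 */
ssize_t
dir_handle_to_path(int mount_fd, struct file_handle *handle,
                   char *buf, size_t size)
{
    ssize_t len;
    int save_errno;
    int fd;

    fd = open_by_handle_at(mount_fd, handle, O_RDONLY | O_DIRECTORY);
    if (fd == -1)
        return -1;

    len = fd_to_path(fd, buf, size);
    save_errno = errno;
    close(fd);
    errno = save_errno;
    return len;
}
```

Note that the kernel appends " (deleted)" to the link target when the
directory has been removed in the meantime, so the result still needs to be
sanity-checked before it is fed to a change record.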

> > > May I ask, what is the reason for embarking on the project to decouple
> > > robinhood v4 API from Lustre changelog API?

There is an impedance mismatch between what Lustre emits and what robinhood
needs for its updates: even with Lustre's changelog, we still need to query
the filesystem to get additional information. I could have extended Lustre's
structures, but then I would have depended on them too much for my taste. It
just seemed cleaner to have a clear separation between the two.

> Looking at robinhood (especially v4), it seems like it could fit
> very well into the vacuum in Linux and act as "fsnotifyd".
> Unprivileged applications and services could register to event streams
> and get fed from the db, so applications that are not running will not lose
> events.
> Events delivered to unprivileged applications need to be filtered by
> subtree for those applications (something that fanotify does not do and
> likely will not do) and by the access permissions of the application
> to the path of the reported object.

The plan is to use a dedicated message queue (such as Kafka or RabbitMQ) for
the streaming part. Robinhood would only deal with serializing events into a
standard communication format (the current target is YAML) and pushing them
into the message queue.
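
To illustrate the serialization step, here is a minimal sketch that renders one
event as a YAML document. struct demo_event and event_to_yaml() are
hypothetical and deliberately much simpler than robinhood's actual
struct rbh_fsevent; the keys in the output are placeholders, not robinhood's
format. Publishing to Kafka or RabbitMQ is left out entirely.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified event; NOT robinhood's struct rbh_fsevent. */
struct demo_event {
    const char *type;          /* plain identifier, e.g. "link", "unlink" */
    const unsigned char *id;   /* opaque object id (e.g. file handle bytes) */
    size_t id_len;
    const char *name;          /* entry name, assumed to be valid UTF-8 */
};

/* Bounded output buffer; `overflow` is set once something did not fit. */
struct outbuf {
    char *data;
    size_t size;
    size_t len;
    int overflow;
};

static void
out_printf(struct outbuf *out, const char *fmt, ...)
{
    va_list args;
    int n;

    if (out->overflow)
        return;

    va_start(args, fmt);
    n = vsnprintf(out->data + out->len, out->size - out->len, fmt, args);
    va_end(args);

    if (n < 0 || (size_t)n >= out->size - out->len)
        out->overflow = 1;
    else
        out->len += (size_t)n;
}

/*
 * Render `event` as one YAML document in `buf`.
 * The id is hex-encoded; the name is emitted as a double-quoted scalar with
 * quotes, backslashes and ASCII control characters escaped.
 * Returns 0 on success, -1 if `buf` is too small.
 */
int
event_to_yaml(const struct demo_event *event, char *buf, size_t size)
{
    struct outbuf out = { buf, size, 0, 0 };
    const unsigned char *c;
    size_t i;

    assert(event != NULL && buf != NULL);

    out_printf(&out, "---\ntype: %s\nid: \"", event->type);
    for (i = 0; i < event->id_len; i++)
        out_printf(&out, "%02x", (unsigned int)event->id[i]);

    out_printf(&out, "\"\nname: \"");
    for (c = (const unsigned char *)event->name; *c != '\0'; c++) {
        if (*c == '"' || *c == '\\')
            out_printf(&out, "\\%c", *c);
        else if (*c < 0x20 || *c == 0x7f)
            out_printf(&out, "\\x%02x", (unsigned int)*c);
        else
            out_printf(&out, "%c", *c);
    }
    out_printf(&out, "\"\n");

    return out.overflow ? -1 : 0;
}
```

Hex-encoding the id keeps the document printable whatever bytes the filesystem
puts in its handles; names that are not valid UTF-8 would need a different
encoding, which this sketch does not attempt.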

From there, it's definitely possible to write a program that will filter
events and route them to unprivileged applications... But it is unlikely I will
write it myself. =)
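
That said, the subtree part of such a filter is small. A sketch, assuming
events carry normalized absolute paths (no "..", no duplicate slashes);
path_in_subtree() is a hypothetical name, and the access-permission check
mentioned in the quoted text is not covered:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: true if `path` is `subtree` itself or lies below it.
 * Both arguments are assumed to be normalized absolute paths.
 */
bool
path_in_subtree(const char *path, const char *subtree)
{
    size_t len;

    assert(path != NULL && path[0] == '/');
    assert(subtree != NULL && subtree[0] == '/');

    /* Ignore trailing slashes on the subtree, so "/a/" behaves like "/a". */
    len = strlen(subtree);
    while (len > 1 && subtree[len - 1] == '/')
        len--;

    if (strncmp(path, subtree, len) != 0)
        return false;

    /* The root directory contains every absolute path. */
    if (len == 1)
        return true;

    /* Require a component boundary so "/ab" is not matched by "/a". */
    return path[len] == '\0' || path[len] == '/';
}
```

A router would call this once per subscriber and per event before forwarding
anything.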

Cheers,
Quentin

