From: "Quentin.BOUGET@cea.fr" <Quentin.BOUGET@cea.fr>
To: Dominique Martinet <asmadeus@codewreck.org>,
	Amir Goldstein <amir73il@gmail.com>
Cc: Jan Kara <jack@suse.cz>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-api@vger.kernel.org" <linux-api@vger.kernel.org>,
	"robinhood-devel@lists.sf.net" <robinhood-devel@lists.sf.net>
Subject: Re: robinhood, fanotify name info events and lustre changelog
Date: Fri, 29 May 2020 18:41:49 +0000
Message-ID: <1590777699518.49838@cea.fr>
In-Reply-To: <20200528125651.GA12279@nautica>

Hi,

Developer of robinhood v4 here,

> > > [1] https://github.com/cea-hpc/robinhood/

The sources for version 4 live in a separate branch:
https://github.com/cea-hpc/robinhood/tree/v4

Any feedback is welcome =)

I am guessing the most interesting bits for this discussion should be found
here:
https://github.com/cea-hpc/robinhood/blob/v4/include/robinhood/fsevent.h

I am not sure it will matter for the rest of the conversation, but just in case:

    RobinHood v4 has a notion of a "namespace" xattr (like an xattr, but for
    a dentry rather than an inode). It is used to store things that are only
    really tied to the namespace (like the path of an entry). I don't think this
    is really relevant here, so you can probably ignore it.

    Also, RobinHood uses file handles to uniquely identify filesystem entries,
    and this is what is stored in a `struct rbh_id`.

> > I couldn't find the documentation for Lustre Changelog format, because
> > the name of the feature is not very Google friendly.

Yes, this is really unfortunate. For the record, user documentation for Lustre
lives at: http://doc.lustre.org/lustre_manual.xhtml

Chapter 12.1 deals with "Lustre Changelogs" (not much more there than
what Dominique already wrote).

> > There is one critical difference between a changelog and fanotify events.
> > fanotify events are delivered a-synchronically and may be delivered out
> > of order, so application must not rely on path information to update
> > internal records without using fstatat(2) to check the actual state of the
> > object in the filesystem.

> lustre changelogs are asynchronous but the order is guaranteed so we
> might rely on that for robinhood v4,

Yes, we do, at least to a certain extent: we expect changelog records
for a single filesystem entry to be emitted in the order they happened on the
FS. I have not really given much thought to how things would work in general
if that wasn't true, but I know this is an issue for things that deal with the
namespace: https://jira.whamcloud.com/browse/LU-12574

> but full path is not computed from
> information in the changelogs. Instead the design plan is to have a
> process scrub the database for files that got updated since the last
> path update and fix paths with fstatat, so I think it might work ; but
> that unfortunately hasn't been implemented yet.

Not exactly (I am not sure it really matters, so I'll try to be brief).

The idea to keep paths in sync with what's in the filesystem is to "tag"
entries as we update their name (i.e. after a rename). Then a separate
process comes in, queries for entries that have that "tag", and updates
their path by concatenating their parent's path (if the parents themselves
are not "tagged") with the entries' own, up-to-date name. After that, if
the entry was a directory, its children are "tagged". I simplified a bit, but
that's the idea.

So, to be fair, full paths _are_ computed solely from information in the
changelog records, even though it requires a bit of processing on the side.
No additional query to the filesystem for that.
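
The tagging scheme described above can be sketched roughly as follows. This is
only an illustration, written in Python rather than RobinHood's C, and it
assumes a plain in-memory dict in place of the real database backend. The names
`Entry`, `apply_rename` and `fix_paths` are hypothetical and do not come from
the RobinHood sources.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Entry:
    parent_id: Optional[str]  # None for the filesystem root
    name: str
    path: str
    is_dir: bool = False
    tagged: bool = False      # True when `path` may be out of date


def apply_rename(db: Dict[str, Entry], entry_id: str,
                 new_parent_id: str, new_name: str) -> None:
    """Process a rename record: update the name and tag the entry."""
    entry = db[entry_id]
    entry.parent_id = new_parent_id
    entry.name = new_name
    entry.tagged = True


def fix_paths(db: Dict[str, Entry]) -> int:
    """Separate pass: rebuild the paths of tagged entries.

    Only the database is consulted, never the filesystem.
    Returns the number of paths that were rewritten.
    """
    updated = 0
    while True:
        tagged_ids = [eid for eid, e in db.items() if e.tagged]
        progress = False
        for eid in tagged_ids:
            entry = db[eid]
            parent = db.get(entry.parent_id)
            # Skip entries whose parent is unknown or still tagged:
            # the parent's path cannot be trusted yet.
            if parent is None or parent.tagged:
                continue
            entry.path = parent.path.rstrip("/") + "/" + entry.name
            entry.tagged = False
            updated += 1
            progress = True
            if entry.is_dir:
                # The directory moved, so its children's paths are stale.
                for child in db.values():
                    if child.parent_id == eid:
                        child.tagged = True
        if not progress:
            # Nothing left to do, or only entries that cannot be resolved.
            return updated


if __name__ == "__main__":
    db = {
        "root": Entry(None, "", "/", is_dir=True),
        "a": Entry("root", "a", "/a", is_dir=True),
        "f": Entry("a", "f", "/a/f"),
    }
    apply_rename(db, "a", "root", "z")
    fix_paths(db)
    for entry in db.values():
        print(entry.path)
```

In this sketch each pass only touches entries whose parent is already clean, so
a renamed directory is fixed first and its descendants follow in later passes.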

Cheers,
Quentin
