From: Dominique Martinet <asmadeus@codewreck.org>
To: Amir Goldstein <amir73il@gmail.com>
Cc: "Quentin.BOUGET@cea.fr" <Quentin.BOUGET@cea.fr>,
	Jan Kara <jack@suse.cz>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-api@vger.kernel.org" <linux-api@vger.kernel.org>,
	"robinhood-devel@lists.sf.net" <robinhood-devel@lists.sf.net>
Subject: Re: robinhood, fanotify name info events and lustre changelog
Date: Sat, 30 May 2020 15:39:08 +0200
Message-ID: <20200530133908.GA5969@nautica>
In-Reply-To: <CAOQ4uxgpugScXRLT6jJAAZf_ET+DpmEWoqkSdqCAMEwp+Kezhw@mail.gmail.com>

Answering what I can until Quentin chimes back in.

Amir Goldstein wrote on Sat, May 30, 2020:
> Nice. thanks for explaining that.
> I suppose you need to store the calculated path attribute for things like
> index queries on the database?

Either for querying a subtree or simply for printing the path (rbh-find
prints the path by default, like find does).

> > So, to be fair, full paths _are_ computed solely from information in the
> > changelog records, even though it requires a bit of processing on the side.
> > No additional query to the filesystem for that.
> 
> As I wrote, the fact that robinhood trusts the information in changelog
> records doesn't mean that information needs to arrive from the kernel.
> The adapter code should use information provided by fanotify events
> then use open_by_handle_at(2) for directory fid to find its current
> path in the filesystem then feed that information to a robinhood change
> record.

I can agree with that - just because we decided, for lustre, to be able
to run without querying the filesystem at all doesn't mean that has to
hold true for all types of input.
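
To illustrate the "full paths computed solely from change records" side
of this, here is a minimal sketch. It assumes a hypothetical, simplified
record layout (a kind, an entry id, a parent id and a name); this is
neither the Lustre changelog format nor robinhood's actual schema, and
the function names are made up for the example. It does not show the
open_by_handle_at(2) approach, which would ask the filesystem for the
directory's current path instead of keeping this index.

```python
from typing import Dict, Optional, Tuple

# Hypothetical in-memory index: entry id -> (parent id, name).
# The root entry has no parent and an empty name.
Index = Dict[str, Tuple[Optional[str], str]]


def apply_record(index: Index, kind: str, fid: str,
                 parent: Optional[str] = None, name: str = "") -> None:
    """Update the index from one simplified change record."""
    if kind in ("create", "rename"):
        # A rename only rewrites this entry's own parent link and name.
        index[fid] = (parent, name)
    elif kind == "unlink":
        index.pop(fid, None)
    else:
        raise ValueError(f"unknown record kind: {kind}")


def full_path(index: Index, fid: str) -> str:
    """Walk parent links up to the root and join the names."""
    parts = []
    seen = set()
    current: Optional[str] = fid
    while current is not None:
        if current in seen:
            raise ValueError("cycle in parent links")
        seen.add(current)
        parent, name = index[current]
        parts.append(name)
        current = parent
    # Drop the root's empty name, then order from root to leaf.
    return "/" + "/".join(reversed([p for p in parts if p]))
```

Because only parent links are stored, renaming a directory changes the
computed path of everything below it without touching those entries and
without any query to the filesystem.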

> I would be happy to work with you on a POC for adapting fanotify
> test code with robinhood v4, but before I invest time on that, I would
> need to know there is a good chance that people are going to test and
> use robinhood with Linux vfs.
>
> Do you have actual users requesting to use robinhood with non-Lustre
> fs?

I would run it at home, but that isn't much :D
As I wrote previously, we have users for large nfs shares outside of
lustre, but I honestly don't think there will be much use for local
filesystems, at least in the short term.

Filesystem indexers like tracker[1] or similar would definitely get much
more use out of that. Objectively, I wouldn't suggest you spend time on
robinhood for this: local filesystems are rarely large enough to warrant
using something like robinhood, and without something like fanotify we
wouldn't be efficient for a local disk with hundreds of millions of
files anyway, because of the prohibitive rescan cost. So it's maybe a
bit of a chicken-and-egg problem, I don't know, but if you want many
users to test different configurations I wouldn't recommend robinhood.
(OTOH, we run CI tests, so we would be happy to add that to the tests
once it's available on vanilla kernels; but that's still not real
users.)

[1] https://wiki.gnome.org/Projects/Tracker


> May I ask, what is the reason for embarking on the project to decouple
> robinhood v4 API from Lustre changelog API?
> Is it because you had other fsevent producers in mind?

I had been planning to add at least some recursive inotify watch on a
subfolder (like watchman does) since before these new fanotify events
came up, so I might be partly to blame for that.

There are also advantages for lustre though; the point is to be able to
ingest changelogs directly with some daemon (it's only at proof of
concept level for v4 at this point), but also to split the load by
involving multiple lustre clients.
So you would get a pool of lustre clients to read changelogs, a pool of
lustre clients to stat files as required to enrich the fsevents (file
size etc), and a pool of servers to read fsevents and commit changes to
the database (this part is still at the design level afaik).
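
As a rough illustration of that read / enrich / commit split, here is a
single-process sketch. It is only an analogy: the real design spreads
these stages over pools of separate machines, while this collapses them
into threads in one process. The event layout (a dict with a "path"
key), the function names and the dict standing in for the database are
all assumptions made for the example, not robinhood's API.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterable, Optional


def enrich(event: dict) -> dict:
    """Stat the path named in the event and attach its size."""
    enriched = dict(event)
    try:
        enriched["size"] = os.stat(event["path"]).st_size
    except OSError:
        # The entry vanished (or is unreadable) before we looked at it.
        enriched["size"] = None
    return enriched


def run_pipeline(events: Iterable[dict],
                 workers: int = 4) -> Dict[str, Optional[int]]:
    """Read events, enrich them in a worker pool, commit to a dict."""
    database: Dict[str, Optional[int]] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so commits follow
        # the order in which the events were read.
        for enriched in pool.map(enrich, events):
            database[enriched["path"]] = enriched["size"]
    return database
```

The enrich step is the only one that touches the filesystem, which is
why it is the natural stage to scale out separately from reading and
committing.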


Hope this all makes sense,
-- 
Dominique

