linux-kernel.vger.kernel.org archive mirror
From: Beata Michalska <b.michalska@samsung.com>
To: Greg KH <greg@kroah.com>
Cc: Jan Kara <jack@suse.cz>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-api@vger.kernel.org, tytso@mit.edu,
	adilger.kernel@dilger.ca, hughd@google.com, lczerner@redhat.com,
	hch@infradead.org, linux-ext4@vger.kernel.org,
	linux-mm@kvack.org, kyungmin.park@samsung.com,
	kmpark@infradead.org
Subject: Re: [RFC v2 1/4] fs: Add generic file system event notifications
Date: Wed, 29 Apr 2015 13:10:34 +0200
Message-ID: <5540BC2A.8010504@samsung.com>
In-Reply-To: <20150429091303.GA4090@kroah.com>

On 04/29/2015 11:13 AM, Greg KH wrote:
> On Wed, Apr 29, 2015 at 09:42:59AM +0200, Jan Kara wrote:
>> On Wed 29-04-15 09:03:08, Beata Michalska wrote:
>>> On 04/28/2015 07:39 PM, Greg KH wrote:
>>>> On Tue, Apr 28, 2015 at 04:46:46PM +0200, Beata Michalska wrote:
>>>>> On 04/28/2015 04:09 PM, Greg KH wrote:
>>>>>> On Tue, Apr 28, 2015 at 03:56:53PM +0200, Jan Kara wrote:
>>>>>>> On Mon 27-04-15 17:37:11, Greg KH wrote:
>>>>>>>> On Mon, Apr 27, 2015 at 05:08:27PM +0200, Beata Michalska wrote:
>>>>>>>>> On 04/27/2015 04:24 PM, Greg KH wrote:
>>>>>>>>>> On Mon, Apr 27, 2015 at 01:51:41PM +0200, Beata Michalska wrote:
>>>>>>>>>>> Introduce configurable generic interface for file
>>>>>>>>>>> system-wide event notifications, to provide file
>>>>>>>>>>> systems with a common way of reporting any potential
>>>>>>>>>>> issues as they emerge.
>>>>>>>>>>>
>>>>>>>>>>> The notifications are to be issued through generic
>>>>>>>>>>> netlink interface by newly introduced multicast group.
>>>>>>>>>>>
>>>>>>>>>>> Threshold notifications have been included, allowing
>>>>>>>>>>> triggering an event whenever the amount of free space drops
>>>>>>>>>>> below a certain level - or levels to be more precise as two
>>>>>>>>>>> of them are being supported: the lower and the upper range.
>>>>>>>>>>> The notifications work both ways: once the threshold level
>>>>>>>>>>> has been reached, an event shall be generated whenever
>>>>>>>>>>> the number of available blocks goes up again re-activating
>>>>>>>>>>> the threshold.
>>>>>>>>>>>
>>>>>>>>>>> The interface has been exposed through a vfs. Once mounted,
>>>>>>>>>>> it serves as an entry point for the set-up where one can
>>>>>>>>>>> register for particular file system events.
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Beata Michalska <b.michalska@samsung.com>
>>>>>>>>>>> ---
>>>>>>>>>>>  Documentation/filesystems/events.txt |  231 ++++++++++
>>>>>>>>>>>  fs/Makefile                          |    1 +
>>>>>>>>>>>  fs/events/Makefile                   |    6 +
>>>>>>>>>>>  fs/events/fs_event.c                 |  770 ++++++++++++++++++++++++++++++++++
>>>>>>>>>>>  fs/events/fs_event.h                 |   25 ++
>>>>>>>>>>>  fs/events/fs_event_netlink.c         |   99 +++++
>>>>>>>>>>>  fs/namespace.c                       |    1 +
>>>>>>>>>>>  include/linux/fs.h                   |    6 +-
>>>>>>>>>>>  include/linux/fs_event.h             |   58 +++
>>>>>>>>>>>  include/uapi/linux/fs_event.h        |   54 +++
>>>>>>>>>>>  include/uapi/linux/genetlink.h       |    1 +
>>>>>>>>>>>  net/netlink/genetlink.c              |    7 +-
>>>>>>>>>>>  12 files changed, 1257 insertions(+), 2 deletions(-)
>>>>>>>>>>>  create mode 100644 Documentation/filesystems/events.txt
>>>>>>>>>>>  create mode 100644 fs/events/Makefile
>>>>>>>>>>>  create mode 100644 fs/events/fs_event.c
>>>>>>>>>>>  create mode 100644 fs/events/fs_event.h
>>>>>>>>>>>  create mode 100644 fs/events/fs_event_netlink.c
>>>>>>>>>>>  create mode 100644 include/linux/fs_event.h
>>>>>>>>>>>  create mode 100644 include/uapi/linux/fs_event.h
>>>>>>>>>>
>>>>>>>>>> Any reason why you just don't do uevents for the block devices today,
>>>>>>>>>> and not create a new type of netlink message and userspace tool required
>>>>>>>>>> to read these?
>>>>>>>>>
>>>>>>>>> The idea here is to have support for filesystems with no backing device as well.
>>>>>>>>> Parsing the message with libnl is really simple and requires few lines of code
>>>>>>>>> (sample application has been presented in the initial version of this RFC)
>>>>>>>>
>>>>>>>> I'm not saying it's not "simple" to parse, just that now you are doing
>>>>>>>> something that requires a different tool.  If you have a block device,
>>>>>>>> you should be able to emit uevents for it, you don't need a backing
>>>>>>>> device, we handle virtual filesystems in /sys/block/ just fine :)
>>>>>>>>
>>>>>>>> People already have tools that listen to libudev for system monitoring
>>>>>>>> and management, why require them to hook up to yet-another-library?  And
>>>>>>>> what is going to provide the ability for multiple userspace tools to
>>>>>>>> listen to these netlink messages in case you have more than one program
>>>>>>>> that wants to watch for these things (i.e. multiple desktop filesystem
>>>>>>>> monitoring tools, system-health checkers, etc.)?
>>>>>>>   As much as I understand your concerns I'm not convinced uevent interface
>>>>>>> is a good fit. There are filesystems that don't have underlying block
>>>>>>> device - think of e.g. tmpfs or filesystems working directly on top of
>>>>>>> flash devices.  These still want to send notifications to userspace (one of
>>>>>>> the primary motivations for this interface was so that tmpfs can notify about
>>>>>>> something). And creating some fake nodes in /sys/block for tmpfs and
>>>>>>> similar filesystems seems like doing more harm than good to me...
>>>>>>
>>>>>> If these are "fake" block devices, what's going to be present in the
>>>>>> block major/minor fields of the netlink message?  For some reason I
>>>>>> thought it was a required field, and because of that, I thought we had a
>>>>>> "real" filesystem somewhere to refer to, otherwise how would userspace
>>>>>> know what filesystem was creating these events?
>>>>>>
>>>>>> What am I missing here?
>>>>>>
>>>>>> confused,
>>>>>>
>>>>>> greg k-h
>>>>>>
>>>>>
>>>>> For those 'fake' block devs, upon mount, get_anon_bdev will assign
>>>>> the major:minor numbers. Userspace might get those through stat.
>>>>
>>>> How can userspace do the mapping backwards from this "anonymous"
>>>> major:minor number for these types of filesystems in such a way that
>>>> they can "know" how to report the block device that is causing the
>>>> event?
>>>>
>>>> thanks,
>>>>
>>>> greg k-h
>>>>
>>>
>>> It needs to be done internally by the app but is doable.
>>> The app knows what it is watching, so it can maintain the mappings.
>>> So prior to activating the notifications it can call 'stat' on the mount point.
>>> Stat struct gives the 'st_dev' which is the device id. Same will be reported
>>> within the message payload (through major:minor numbers). So having this,
>>> the app is able to get any other information it needs. 
>>> Note that the events refer to the file system as a whole and they may not
>>> necessarily have anything to do with the actual block device. 
> 
> How are you going to show an event for a filesystem that is made up of
> multiple block devices?

AFAIK, such filesystems are in a similar situation, with anonymous
major:minor numbers - at least btrfs does so. I'm not sure we can
identify the actual block device here, so in this case such events
serve merely as a hint for userspace. At this point a user might
decide to run some scanning tools. We might extend the scope of the
info being sent, though I would consider this a nice-to-have rather than
a requirement for this initial version of the notifications. Filesystems
might also decide to send their own custom messages, so it is
possible for filesystems like btrfs to send more detailed information
using the new genetlink multicast group.
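
For reference, the stat-based mapping described earlier in the thread
could be sketched roughly as below. This is only an illustration, not
code from the patch set: mount_dev_id() is a hypothetical helper name,
and it assumes the app calls it on each mount point it intends to watch
before enabling notifications, storing the result for later comparison
with the major:minor numbers carried in the event payload.

```c
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of the RFC): fetch the device id of the
 * filesystem mounted at 'path'. For filesystems without a backing block
 * device (e.g. tmpfs) this is the anonymous major:minor pair assigned at
 * mount time.
 *
 * Returns 0 on success, -1 if stat() fails (errno is left set by stat()).
 */
int mount_dev_id(const char *path, unsigned int *maj, unsigned int *mnr)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;

	/* st_dev identifies the filesystem containing 'path'. */
	*maj = major(st.st_dev);
	*mnr = minor(st.st_dev);
	return 0;
}
```

An app would call e.g. mount_dev_id("/tmp", &maj, &mnr) once per watched
mount point and keep the (major, minor) -> mount point pairs in its own
table.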


> 
>>   Or you can use /proc/self/mountinfo for the mapping. There you can see
>> device numbers, real device names if applicable and mountpoints. This has
>> the advantage that it works even if filesystem mountpoints change.
> 
> Ok, then that brings up my next question, how does this handle
> namespaces?  What namespace is the event being sent in?  block devices
> aren't namespaced, but the mount points are, is that going to cause
> problems?
> 

The path should get resolved properly (as from root level), though I must
admit I'm not sure whether there will be issues when it comes to network
namespaces. I'll double check it. Any hints are more than welcome, though :)
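
As for the /proc/self/mountinfo mapping Jan suggested, a rough sketch of
the lookup is below. The function names are hypothetical and nothing
here comes from the patch set; the field layout follows proc(5). Note
that /proc/self/mountinfo only lists the mounts visible in the reading
process's mount namespace, and that several mounts (e.g. bind mounts)
can share one device id, in which case this sketch simply reports the
first match.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: check one mountinfo line against major:minor.
 * Line layout (see proc(5)):
 *   mount-id parent-id major:minor root mount-point options
 *   [optional fields...] - fstype source super-options
 * Spaces inside paths are escaped as \040, so %s is enough here.
 * Returns 1 and fills 'mnt'/'fstype' on a match, 0 otherwise.
 */
int mountinfo_match(const char *line, unsigned int maj, unsigned int mnr,
		    char *mnt, size_t mnt_len, char *fstype, size_t fs_len)
{
	unsigned int lmaj, lmnr;
	char point[4096], type[256];
	const char *sep;

	if (sscanf(line, "%*d %*d %u:%u %*s %4095s", &lmaj, &lmnr, point) != 3)
		return 0;
	if (lmaj != maj || lmnr != mnr)
		return 0;

	/* The filesystem type is the first field after the " - " separator. */
	sep = strstr(line, " - ");
	if (!sep || sscanf(sep + 3, "%255s", type) != 1)
		return 0;

	snprintf(mnt, mnt_len, "%s", point);
	snprintf(fstype, fs_len, "%s", type);
	return 1;
}

/*
 * Hypothetical helper: scan /proc/self/mountinfo for the first mount
 * with the given device id. Returns 1 if found, 0 if not, -1 on error.
 */
int find_mount_by_dev(unsigned int maj, unsigned int mnr,
		      char *mnt, size_t mnt_len, char *fstype, size_t fs_len)
{
	FILE *f = fopen("/proc/self/mountinfo", "r");
	char line[8192];
	int found = 0;

	if (!f)
		return -1;
	while (!found && fgets(line, sizeof(line), f))
		found = mountinfo_match(line, maj, mnr, mnt, mnt_len,
					fstype, fs_len);
	fclose(f);
	return found;
}
```

Compared with keeping a table built from stat, this lookup is done at
the time the event arrives, so it still works after mount points have
changed.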

> thanks,
> 
> greg k-h
> 

BR
Beata

