All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH RFC 05/15] fsnotify: Support event submission through ring buffer
@ 2021-04-26 21:57 kernel test robot
  0 siblings, 0 replies; 7+ messages in thread
From: kernel test robot @ 2021-04-26 21:57 UTC (permalink / raw)
  To: kbuild

[-- Attachment #1: Type: text/plain, Size: 13149 bytes --]

CC: kbuild-all(a)lists.01.org
In-Reply-To: <20210426184201.4177978-6-krisman@collabora.com>
References: <20210426184201.4177978-6-krisman@collabora.com>
TO: Gabriel Krisman Bertazi <krisman@collabora.com>

Hi Gabriel,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on pcmoore-audit/next]
[also build test WARNING on ext4/dev linus/master v5.12]
[cannot apply to ext3/fsnotify next-20210426]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210427-024627
base:   https://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit.git next
:::::: branch date: 3 hours ago
:::::: commit date: 3 hours ago
config: arc-randconfig-m031-20210426 (attached as .config)
compiler: arceb-elf-gcc (GCC) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

New smatch warnings:
fs/notify/ring.c:24 fsnotify_ring_peek_first_event() warn: should 'group->ring_buffer.nr_pages << 13' be a 64 bit type?
fs/notify/ring.c:57 fsnotify_ring_buffer_consume_event() warn: should 'group->ring_buffer.nr_pages << 13' be a 64 bit type?
fs/notify/ring.c:74 fsnotify_ring_alloc_event_slot() warn: should 'group->ring_buffer.nr_pages << 13' be a 64 bit type?
fs/notify/ring.c:139 fsnotify_ring_commit_slot() warn: should 'group->ring_buffer.nr_pages << 13' be a 64 bit type?

Old smatch warnings:
arch/arc/include/asm/thread_info.h:65 current_thread_info() error: uninitialized symbol 'sp'.

vim +24 fs/notify/ring.c

0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   21  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   22  struct fsnotify_event *fsnotify_ring_peek_first_event(struct fsnotify_group *group)
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   23  {
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  @24  	u64 ring_size = group->ring_buffer.nr_pages << PAGE_SHIFT;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   25  	struct fsnotify_event *fsn;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   26  	char *kaddr;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   27  	u64 tail;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   28  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   29  	assert_spin_locked(&group->notification_lock);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   30  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   31  again:
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   32  	tail = group->ring_buffer.tail;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   33  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   34  	if ((PAGE_SIZE - (tail & (PAGE_SIZE-1))) < sizeof(struct fsnotify_event)) {
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   35  		group->ring_buffer.tail = NEXT_PAGE(tail, ring_size);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   36  		goto again;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   37  	}
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   38  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   39  	kaddr = kmap_atomic(group->ring_buffer.pages[tail / PAGE_SIZE]);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   40  	if (!kaddr)
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   41  		return NULL;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   42  	fsn = (struct fsnotify_event *) (kaddr + (tail & (PAGE_SIZE-1)));
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   43  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   44  	if (fsn->slot_len == INVALID_RING_SLOT) {
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   45  		group->ring_buffer.tail = NEXT_PAGE(tail, ring_size);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   46  		kunmap_atomic(kaddr);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   47  		goto again;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   48  	}
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   49  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   50  	/* will be unmapped when entry is consumed. */
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   51  	return fsn;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   52  }
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   53  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   54  void fsnotify_ring_buffer_consume_event(struct fsnotify_group *group,
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   55  					struct fsnotify_event *event)
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   56  {
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  @57  	u64 ring_size = group->ring_buffer.nr_pages << PAGE_SHIFT;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   58  	u64 new_tail = NEXT_SLOT(group->ring_buffer.tail, event->slot_len, ring_size);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   59  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   60  	kunmap_atomic(event);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   61  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   62  	pr_debug("%s: group=%p tail=%llx->%llx ring_size=%llu\n", __func__,
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   63  		 group, group->ring_buffer.tail, new_tail, ring_size);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   64  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   65  	WRITE_ONCE(group->ring_buffer.tail, new_tail);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   66  }
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   67  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   68  struct fsnotify_event *fsnotify_ring_alloc_event_slot(struct fsnotify_group *group,
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   69  						      size_t size)
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   70  	__acquires(&group->notification_lock)
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   71  {
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   72  	struct fsnotify_event *fsn;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   73  	u64 head, tail;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  @74  	u64 ring_size = group->ring_buffer.nr_pages << PAGE_SHIFT;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   75  	u64 new_head;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   76  	void *kaddr;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   77  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   78  	if (WARN_ON(!(group->flags & FSN_SUBMISSION_RING_BUFFER) || size > PAGE_SIZE))
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   79  		return ERR_PTR(-EINVAL);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   80  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   81  	pr_debug("%s: start group=%p ring_size=%llu, requested=%lu\n", __func__, group,
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   82  		 ring_size, size);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   83  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   84  	spin_lock(&group->notification_lock);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   85  again:
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   86  	head = group->ring_buffer.head;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   87  	tail = group->ring_buffer.tail;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   88  	new_head = NEXT_SLOT(head, size, ring_size);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   89  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   90  	/* head would catch up to tail, corrupting an entry. */
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   91  	if ((head < tail && new_head > tail) || (head > new_head && new_head > tail)) {
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   92  		fsn = ERR_PTR(-ENOMEM);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   93  		goto err;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   94  	}
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   95  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   96  	/*
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   97  	 * Not event a skip message fits in the page. We can detect the
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   98  	 * lack of space. Move on to the next page.
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   99  	 */
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  100  	if ((PAGE_SIZE - (head & (PAGE_SIZE-1))) < sizeof(struct fsnotify_event)) {
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  101  		/* Start again on next page */
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  102  		group->ring_buffer.head = NEXT_PAGE(head, ring_size);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  103  		goto again;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  104  	}
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  105  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  106  	kaddr = kmap_atomic(group->ring_buffer.pages[head / PAGE_SIZE]);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  107  	if (!kaddr) {
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  108  		fsn = ERR_PTR(-EFAULT);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  109  		goto err;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  110  	}
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  111  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  112  	fsn = (struct fsnotify_event *) (kaddr + (head & (PAGE_SIZE-1)));
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  113  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  114  	if ((head >> PAGE_SHIFT) != (new_head >> PAGE_SHIFT)) {
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  115  		/*
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  116  		 * No room in the current page.  Add a fake entry
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  117  		 * consuming the end the page to avoid splitting event
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  118  		 * structure.
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  119  		 */
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  120  		fsn->slot_len = INVALID_RING_SLOT;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  121  		kunmap_atomic(kaddr);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  122  		/* Start again on the next page */
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  123  		group->ring_buffer.head = NEXT_PAGE(head, ring_size);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  124  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  125  		goto again;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  126  	}
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  127  	fsn->slot_len = size;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  128  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  129  	return fsn;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  130  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  131  err:
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  132  	spin_unlock(&group->notification_lock);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  133  	return fsn;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  134  }
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  135  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  136  void fsnotify_ring_commit_slot(struct fsnotify_group *group, struct fsnotify_event *fsn)
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  137  	__releases(&group->notification_lock)
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  138  {
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26 @139  	u64 ring_size = group->ring_buffer.nr_pages << PAGE_SHIFT;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  140  	u64 head = group->ring_buffer.head;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  141  	u64 new_head = NEXT_SLOT(head, fsn->slot_len, ring_size);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  142  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  143  	pr_debug("%s: group=%p head=%llx->%llx ring_size=%llu\n", __func__,
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  144  		 group, head, new_head, ring_size);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  145  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  146  	kunmap_atomic(fsn);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  147  	group->ring_buffer.head = new_head;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  148  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  149  	spin_unlock(&group->notification_lock);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  150  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  151  	wake_up(&group->notification_waitq);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  152  	kill_fasync(&group->fsn_fa, SIGIO, POLL_IN);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  153  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 24565 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC 05/15] fsnotify: Support event submission through ring buffer
  2021-04-27  5:39   ` Amir Goldstein
@ 2021-04-29 18:33     ` Gabriel Krisman Bertazi
  0 siblings, 0 replies; 7+ messages in thread
From: Gabriel Krisman Bertazi @ 2021-04-29 18:33 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Theodore Tso, Darrick J. Wong, Dave Chinner, Jan Kara,
	David Howells, Khazhismel Kumykov, linux-fsdevel, Ext4, kernel

Amir Goldstein <amir73il@gmail.com> writes:

> On Mon, Apr 26, 2021 at 9:42 PM Gabriel Krisman Bertazi
> <krisman@collabora.com> wrote:
>>
>> In order to support file system health/error reporting over fanotify,
>> fsnotify needs to expose a submission path that doesn't allow sleeping.
>> The only problem I identified with the current submission path is the
>> need to dynamically allocate memory for the event queue.
>>
>> This patch avoids the problem by introducing a new mode in fsnotify,
>> where a ring buffer is used to submit events for a group.  Each group
>> has its own ring buffer, and error notifications are submitted
>> exclusively through it.
>>
>> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
>> ---
>>  fs/notify/Makefile               |   2 +-
>>  fs/notify/group.c                |  12 +-
>>  fs/notify/notification.c         |  10 ++
>>  fs/notify/ring.c                 | 199 +++++++++++++++++++++++++++++++
>>  include/linux/fsnotify_backend.h |  37 +++++-
>>  5 files changed, 255 insertions(+), 5 deletions(-)
>>  create mode 100644 fs/notify/ring.c
>>
>> diff --git a/fs/notify/Makefile b/fs/notify/Makefile
>> index 63a4b8828df4..61dae1e90f2d 100644
>> --- a/fs/notify/Makefile
>> +++ b/fs/notify/Makefile
>> @@ -1,6 +1,6 @@
>>  # SPDX-License-Identifier: GPL-2.0
>>  obj-$(CONFIG_FSNOTIFY)         += fsnotify.o notification.o group.o mark.o \
>> -                                  fdinfo.o
>> +                                  fdinfo.o ring.o
>>
>>  obj-y                  += dnotify/
>>  obj-y                  += inotify/
>> diff --git a/fs/notify/group.c b/fs/notify/group.c
>> index 08acb1afc0c2..b99b3de36696 100644
>> --- a/fs/notify/group.c
>> +++ b/fs/notify/group.c
>> @@ -81,7 +81,10 @@ void fsnotify_destroy_group(struct fsnotify_group *group)
>>          * notification against this group. So clearing the notification queue
>>          * of all events is reliable now.
>>          */
>> -       fsnotify_flush_notify(group);
>> +       if (group->flags & FSN_SUBMISSION_RING_BUFFER)
>> +               fsnotify_free_ring_buffer(group);
>> +       else
>> +               fsnotify_flush_notify(group);
>>
>>         /*
>>          * Destroy overflow event (we cannot use fsnotify_destroy_event() as
>> @@ -136,6 +139,13 @@ static struct fsnotify_group *__fsnotify_alloc_group(
>>         group->ops = ops;
>>         group->flags = flags;
>>
>> +       if (group->flags & FSN_SUBMISSION_RING_BUFFER) {
>> +               if (fsnotify_create_ring_buffer(group)) {
>> +                       kfree(group);
>> +                       return ERR_PTR(-ENOMEM);
>> +               }
>> +       }
>> +
>>         return group;
>>  }
>>
>> diff --git a/fs/notify/notification.c b/fs/notify/notification.c
>> index 75d79d6d3ef0..32f97e7b7a80 100644
>> --- a/fs/notify/notification.c
>> +++ b/fs/notify/notification.c
>> @@ -51,6 +51,10 @@ EXPORT_SYMBOL_GPL(fsnotify_get_cookie);
>>  bool fsnotify_notify_queue_is_empty(struct fsnotify_group *group)
>>  {
>>         assert_spin_locked(&group->notification_lock);
>> +
>> +       if (group->flags & FSN_SUBMISSION_RING_BUFFER)
>> +               return fsnotify_ring_notify_queue_is_empty(group);
>> +
>>         return list_empty(&group->notification_list) ? true : false;
>>  }
>>
>> @@ -132,6 +136,9 @@ void fsnotify_remove_queued_event(struct fsnotify_group *group,
>>                                   struct fsnotify_event *event)
>>  {
>>         assert_spin_locked(&group->notification_lock);
>> +
>> +       if (group->flags & FSN_SUBMISSION_RING_BUFFER)
>> +               return;
>>         /*
>>          * We need to init list head for the case of overflow event so that
>>          * check in fsnotify_add_event() works
>> @@ -166,6 +173,9 @@ struct fsnotify_event *fsnotify_peek_first_event(struct fsnotify_group *group)
>>  {
>>         assert_spin_locked(&group->notification_lock);
>>
>> +       if (group->flags & FSN_SUBMISSION_RING_BUFFER)
>> +               return fsnotify_ring_peek_first_event(group);
>> +
>>         return list_first_entry(&group->notification_list,
>>                                 struct fsnotify_event, list);
>>  }
>> diff --git a/fs/notify/ring.c b/fs/notify/ring.c
>> new file mode 100644
>> index 000000000000..75e8af1f8d80
>> --- /dev/null
>> +++ b/fs/notify/ring.c
>> @@ -0,0 +1,199 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +#include <linux/types.h>
>> +#include <linux/fsnotify.h>
>> +#include <linux/memcontrol.h>
>> +
>> +#define INVALID_RING_SLOT -1
>> +
>> +#define FSNOTIFY_RING_PAGES 16
>> +
>> +#define NEXT_SLOT(cur, len, ring_size) ((cur + len) & (ring_size-1))
>> +#define NEXT_PAGE(cur, ring_size) (round_up(cur, PAGE_SIZE) & (ring_size-1))
>> +
>> +bool fsnotify_ring_notify_queue_is_empty(struct fsnotify_group *group)
>> +{
>> +       assert_spin_locked(&group->notification_lock);
>> +
>> +       if (group->ring_buffer.tail == group->ring_buffer.head)
>> +               return true;
>> +       return false;
>> +}
>> +
>> +struct fsnotify_event *fsnotify_ring_peek_first_event(struct fsnotify_group *group)
>> +{
>> +       u64 ring_size = group->ring_buffer.nr_pages << PAGE_SHIFT;
>> +       struct fsnotify_event *fsn;
>> +       char *kaddr;
>> +       u64 tail;
>> +
>> +       assert_spin_locked(&group->notification_lock);
>> +
>> +again:
>> +       tail = group->ring_buffer.tail;
>> +
>> +       if ((PAGE_SIZE - (tail & (PAGE_SIZE-1))) < sizeof(struct fsnotify_event)) {
>> +               group->ring_buffer.tail = NEXT_PAGE(tail, ring_size);
>> +               goto again;
>> +       }
>> +
>> +       kaddr = kmap_atomic(group->ring_buffer.pages[tail / PAGE_SIZE]);
>> +       if (!kaddr)
>> +               return NULL;
>> +       fsn = (struct fsnotify_event *) (kaddr + (tail & (PAGE_SIZE-1)));
>> +
>> +       if (fsn->slot_len == INVALID_RING_SLOT) {
>> +               group->ring_buffer.tail = NEXT_PAGE(tail, ring_size);
>> +               kunmap_atomic(kaddr);
>> +               goto again;
>> +       }
>> +
>> +       /* will be unmapped when entry is consumed. */
>> +       return fsn;
>> +}
>> +
>> +void fsnotify_ring_buffer_consume_event(struct fsnotify_group *group,
>> +                                       struct fsnotify_event *event)
>> +{
>> +       u64 ring_size = group->ring_buffer.nr_pages << PAGE_SHIFT;
>> +       u64 new_tail = NEXT_SLOT(group->ring_buffer.tail, event->slot_len, ring_size);
>> +
>> +       kunmap_atomic(event);
>> +
>> +       pr_debug("%s: group=%p tail=%llx->%llx ring_size=%llu\n", __func__,
>> +                group, group->ring_buffer.tail, new_tail, ring_size);
>> +
>> +       WRITE_ONCE(group->ring_buffer.tail, new_tail);
>> +}
>> +
>> +struct fsnotify_event *fsnotify_ring_alloc_event_slot(struct fsnotify_group *group,
>> +                                                     size_t size)
>> +       __acquires(&group->notification_lock)
>> +{
>> +       struct fsnotify_event *fsn;
>> +       u64 head, tail;
>> +       u64 ring_size = group->ring_buffer.nr_pages << PAGE_SHIFT;
>> +       u64 new_head;
>> +       void *kaddr;
>> +
>> +       if (WARN_ON(!(group->flags & FSN_SUBMISSION_RING_BUFFER) || size > PAGE_SIZE))
>> +               return ERR_PTR(-EINVAL);
>> +
>> +       pr_debug("%s: start group=%p ring_size=%llu, requested=%lu\n", __func__, group,
>> +                ring_size, size);
>> +
>> +       spin_lock(&group->notification_lock);
>> +again:
>> +       head = group->ring_buffer.head;
>> +       tail = group->ring_buffer.tail;
>> +       new_head = NEXT_SLOT(head, size, ring_size);
>> +
>> +       /* head would catch up to tail, corrupting an entry. */
>> +       if ((head < tail && new_head > tail) || (head > new_head && new_head > tail)) {
>> +               fsn = ERR_PTR(-ENOMEM);
>> +               goto err;
>> +       }
>> +
>> +       /*
>> +        * Not event a skip message fits in the page. We can detect the
>> +        * lack of space. Move on to the next page.
>> +        */
>> +       if ((PAGE_SIZE - (head & (PAGE_SIZE-1))) < sizeof(struct fsnotify_event)) {
>> +               /* Start again on next page */
>> +               group->ring_buffer.head = NEXT_PAGE(head, ring_size);
>> +               goto again;
>> +       }
>> +
>> +       kaddr = kmap_atomic(group->ring_buffer.pages[head / PAGE_SIZE]);
>> +       if (!kaddr) {
>> +               fsn = ERR_PTR(-EFAULT);
>> +               goto err;
>> +       }
>> +
>> +       fsn = (struct fsnotify_event *) (kaddr + (head & (PAGE_SIZE-1)));
>> +
>> +       if ((head >> PAGE_SHIFT) != (new_head >> PAGE_SHIFT)) {
>> +               /*
>> +                * No room in the current page.  Add a fake entry
>> +                * consuming the end the page to avoid splitting event
>> +                * structure.
>> +                */
>> +               fsn->slot_len = INVALID_RING_SLOT;
>> +               kunmap_atomic(kaddr);
>> +               /* Start again on the next page */
>> +               group->ring_buffer.head = NEXT_PAGE(head, ring_size);
>> +
>> +               goto again;
>> +       }
>> +       fsn->slot_len = size;
>> +
>> +       return fsn;
>> +
>> +err:
>> +       spin_unlock(&group->notification_lock);
>> +       return fsn;
>> +}
>> +
>> +void fsnotify_ring_commit_slot(struct fsnotify_group *group, struct fsnotify_event *fsn)
>> +       __releases(&group->notification_lock)
>> +{
>> +       u64 ring_size = group->ring_buffer.nr_pages << PAGE_SHIFT;
>> +       u64 head = group->ring_buffer.head;
>> +       u64 new_head = NEXT_SLOT(head, fsn->slot_len, ring_size);
>> +
>> +       pr_debug("%s: group=%p head=%llx->%llx ring_size=%llu\n", __func__,
>> +                group, head, new_head, ring_size);
>> +
>> +       kunmap_atomic(fsn);
>> +       group->ring_buffer.head = new_head;
>> +
>> +       spin_unlock(&group->notification_lock);
>> +
>> +       wake_up(&group->notification_waitq);
>> +       kill_fasync(&group->fsn_fa, SIGIO, POLL_IN);
>> +
>> +}
>> +
>> +void fsnotify_free_ring_buffer(struct fsnotify_group *group)
>> +{
>> +       int i;
>> +
>> +       for (i = 0; i < group->ring_buffer.nr_pages; i++)
>> +               __free_page(group->ring_buffer.pages[i]);
>> +       kfree(group->ring_buffer.pages);
>> +       group->ring_buffer.nr_pages = 0;
>> +}
>> +
>> +int fsnotify_create_ring_buffer(struct fsnotify_group *group)
>> +{
>> +       int nr_pages = FSNOTIFY_RING_PAGES;
>> +       int i;
>> +
>> +       pr_debug("%s: group=%p pages=%d\n", __func__, group, nr_pages);
>> +
>> +       group->ring_buffer.pages = kmalloc_array(nr_pages, sizeof(struct pages *),
>> +                                                GFP_KERNEL);
>> +       if (!group->ring_buffer.pages)
>> +               return -ENOMEM;
>> +
>> +       group->ring_buffer.head = 0;
>> +       group->ring_buffer.tail = 0;
>> +
>> +       for (i = 0; i < nr_pages; i++) {
>> +               group->ring_buffer.pages[i] = alloc_pages(GFP_KERNEL, 1);
>> +               if (!group->ring_buffer.pages)
>> +                       goto err_dealloc;
>> +       }
>> +
>> +       group->ring_buffer.nr_pages = nr_pages;
>> +
>> +       return 0;
>> +
>> +err_dealloc:
>> +       for (--i; i >= 0; i--)
>> +               __free_page(group->ring_buffer.pages[i]);
>> +       kfree(group->ring_buffer.pages);
>> +       group->ring_buffer.nr_pages = 0;
>> +       return -ENOMEM;
>> +}
>> +
>> +
>
> Nothing in this file is fsnotify specific.
> Is there no kernel lib implementation for this already?
> If there isn't (I'd be very surprised) please put this in lib/ and post it
> for wider review including self tests.

About the implementation, the only generic code I could find is
include/linux/circ_buf.h, but it doesn't really do much or fit well
here.  For instance, it doesn't deal well with non-contiguous pages.

There are other smarter implementations around the kernel like
Documentation/trace/ring-buffer-design.txt, that would be a better
candidate to be a generic ring buffer in lib/, but I admit to haven't
checked them well enough to see if they would solve the problem for
fsnotify, which has a very simple ring buffer anyway.

>> diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
>> index 190c6a402e98..a1a4dd69e5ed 100644
>> --- a/include/linux/fsnotify_backend.h
>> +++ b/include/linux/fsnotify_backend.h
>> @@ -74,6 +74,8 @@
>>  #define ALL_FSNOTIFY_PERM_EVENTS (FS_OPEN_PERM | FS_ACCESS_PERM | \
>>                                   FS_OPEN_EXEC_PERM)
>>
>> +#define FSN_SUBMISSION_RING_BUFFER     0x00000080
>
> FSNOTIFY_GROUP_FLAG_RING_BUFFER please (or FSN_GROUP_ if you must)
> and please define this above struct fsnotify_group, even right above the flags
> field like FSNOTIFY_CONN_FLAG_HAS_FSID
>
> *IF* we go this way :)
>
> Thanks,
> Amir.

-- 
Gabriel Krisman Bertazi

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC 05/15] fsnotify: Support event submission through ring buffer
  2021-04-26 18:41 ` [PATCH RFC 05/15] fsnotify: Support event submission through ring buffer Gabriel Krisman Bertazi
  2021-04-26 22:00   ` kernel test robot
  2021-04-26 22:43   ` kernel test robot
@ 2021-04-27  5:39   ` Amir Goldstein
  2021-04-29 18:33     ` Gabriel Krisman Bertazi
  2 siblings, 1 reply; 7+ messages in thread
From: Amir Goldstein @ 2021-04-27  5:39 UTC (permalink / raw)
  To: Gabriel Krisman Bertazi
  Cc: Theodore Tso, Darrick J. Wong, Dave Chinner, Jan Kara,
	David Howells, Khazhismel Kumykov, linux-fsdevel, Ext4, kernel

On Mon, Apr 26, 2021 at 9:42 PM Gabriel Krisman Bertazi
<krisman@collabora.com> wrote:
>
> In order to support file system health/error reporting over fanotify,
> fsnotify needs to expose a submission path that doesn't allow sleeping.
> The only problem I identified with the current submission path is the
> need to dynamically allocate memory for the event queue.
>
> This patch avoids the problem by introducing a new mode in fsnotify,
> where a ring buffer is used to submit events for a group.  Each group
> has its own ring buffer, and error notifications are submitted
> exclusively through it.
>
> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
> ---
>  fs/notify/Makefile               |   2 +-
>  fs/notify/group.c                |  12 +-
>  fs/notify/notification.c         |  10 ++
>  fs/notify/ring.c                 | 199 +++++++++++++++++++++++++++++++
>  include/linux/fsnotify_backend.h |  37 +++++-
>  5 files changed, 255 insertions(+), 5 deletions(-)
>  create mode 100644 fs/notify/ring.c
>
> diff --git a/fs/notify/Makefile b/fs/notify/Makefile
> index 63a4b8828df4..61dae1e90f2d 100644
> --- a/fs/notify/Makefile
> +++ b/fs/notify/Makefile
> @@ -1,6 +1,6 @@
>  # SPDX-License-Identifier: GPL-2.0
>  obj-$(CONFIG_FSNOTIFY)         += fsnotify.o notification.o group.o mark.o \
> -                                  fdinfo.o
> +                                  fdinfo.o ring.o
>
>  obj-y                  += dnotify/
>  obj-y                  += inotify/
> diff --git a/fs/notify/group.c b/fs/notify/group.c
> index 08acb1afc0c2..b99b3de36696 100644
> --- a/fs/notify/group.c
> +++ b/fs/notify/group.c
> @@ -81,7 +81,10 @@ void fsnotify_destroy_group(struct fsnotify_group *group)
>          * notification against this group. So clearing the notification queue
>          * of all events is reliable now.
>          */
> -       fsnotify_flush_notify(group);
> +       if (group->flags & FSN_SUBMISSION_RING_BUFFER)
> +               fsnotify_free_ring_buffer(group);
> +       else
> +               fsnotify_flush_notify(group);
>
>         /*
>          * Destroy overflow event (we cannot use fsnotify_destroy_event() as
> @@ -136,6 +139,13 @@ static struct fsnotify_group *__fsnotify_alloc_group(
>         group->ops = ops;
>         group->flags = flags;
>
> +       if (group->flags & FSN_SUBMISSION_RING_BUFFER) {
> +               if (fsnotify_create_ring_buffer(group)) {
> +                       kfree(group);
> +                       return ERR_PTR(-ENOMEM);
> +               }
> +       }
> +
>         return group;
>  }
>
> diff --git a/fs/notify/notification.c b/fs/notify/notification.c
> index 75d79d6d3ef0..32f97e7b7a80 100644
> --- a/fs/notify/notification.c
> +++ b/fs/notify/notification.c
> @@ -51,6 +51,10 @@ EXPORT_SYMBOL_GPL(fsnotify_get_cookie);
>  bool fsnotify_notify_queue_is_empty(struct fsnotify_group *group)
>  {
>         assert_spin_locked(&group->notification_lock);
> +
> +       if (group->flags & FSN_SUBMISSION_RING_BUFFER)
> +               return fsnotify_ring_notify_queue_is_empty(group);
> +
>         return list_empty(&group->notification_list) ? true : false;
>  }
>
> @@ -132,6 +136,9 @@ void fsnotify_remove_queued_event(struct fsnotify_group *group,
>                                   struct fsnotify_event *event)
>  {
>         assert_spin_locked(&group->notification_lock);
> +
> +       if (group->flags & FSN_SUBMISSION_RING_BUFFER)
> +               return;
>         /*
>          * We need to init list head for the case of overflow event so that
>          * check in fsnotify_add_event() works
> @@ -166,6 +173,9 @@ struct fsnotify_event *fsnotify_peek_first_event(struct fsnotify_group *group)
>  {
>         assert_spin_locked(&group->notification_lock);
>
> +       if (group->flags & FSN_SUBMISSION_RING_BUFFER)
> +               return fsnotify_ring_peek_first_event(group);
> +
>         return list_first_entry(&group->notification_list,
>                                 struct fsnotify_event, list);
>  }
> diff --git a/fs/notify/ring.c b/fs/notify/ring.c
> new file mode 100644
> index 000000000000..75e8af1f8d80
> --- /dev/null
> +++ b/fs/notify/ring.c
> @@ -0,0 +1,199 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <linux/types.h>
> +#include <linux/fsnotify.h>
> +#include <linux/memcontrol.h>
> +
> +#define INVALID_RING_SLOT -1
> +
> +#define FSNOTIFY_RING_PAGES 16
> +
> +#define NEXT_SLOT(cur, len, ring_size) ((cur + len) & (ring_size-1))
> +#define NEXT_PAGE(cur, ring_size) (round_up(cur, PAGE_SIZE) & (ring_size-1))
> +
> +bool fsnotify_ring_notify_queue_is_empty(struct fsnotify_group *group)
> +{
> +       assert_spin_locked(&group->notification_lock);
> +
> +       if (group->ring_buffer.tail == group->ring_buffer.head)
> +               return true;
> +       return false;
> +}
> +
> +struct fsnotify_event *fsnotify_ring_peek_first_event(struct fsnotify_group *group)
> +{
> +       u64 ring_size = group->ring_buffer.nr_pages << PAGE_SHIFT;
> +       struct fsnotify_event *fsn;
> +       char *kaddr;
> +       u64 tail;
> +
> +       assert_spin_locked(&group->notification_lock);
> +
> +again:
> +       tail = group->ring_buffer.tail;
> +
> +       if ((PAGE_SIZE - (tail & (PAGE_SIZE-1))) < sizeof(struct fsnotify_event)) {
> +               group->ring_buffer.tail = NEXT_PAGE(tail, ring_size);
> +               goto again;
> +       }
> +
> +       kaddr = kmap_atomic(group->ring_buffer.pages[tail / PAGE_SIZE]);
> +       if (!kaddr)
> +               return NULL;
> +       fsn = (struct fsnotify_event *) (kaddr + (tail & (PAGE_SIZE-1)));
> +
> +       if (fsn->slot_len == INVALID_RING_SLOT) {
> +               group->ring_buffer.tail = NEXT_PAGE(tail, ring_size);
> +               kunmap_atomic(kaddr);
> +               goto again;
> +       }
> +
> +       /* will be unmapped when entry is consumed. */
> +       return fsn;
> +}
> +
> +void fsnotify_ring_buffer_consume_event(struct fsnotify_group *group,
> +                                       struct fsnotify_event *event)
> +{
> +       u64 ring_size = group->ring_buffer.nr_pages << PAGE_SHIFT;
> +       u64 new_tail = NEXT_SLOT(group->ring_buffer.tail, event->slot_len, ring_size);
> +
> +       kunmap_atomic(event);
> +
> +       pr_debug("%s: group=%p tail=%llx->%llx ring_size=%llu\n", __func__,
> +                group, group->ring_buffer.tail, new_tail, ring_size);
> +
> +       WRITE_ONCE(group->ring_buffer.tail, new_tail);
> +}
> +
> +struct fsnotify_event *fsnotify_ring_alloc_event_slot(struct fsnotify_group *group,
> +                                                     size_t size)
> +       __acquires(&group->notification_lock)
> +{
> +       struct fsnotify_event *fsn;
> +       u64 head, tail;
> +       u64 ring_size = group->ring_buffer.nr_pages << PAGE_SHIFT;
> +       u64 new_head;
> +       void *kaddr;
> +
> +       if (WARN_ON(!(group->flags & FSN_SUBMISSION_RING_BUFFER) || size > PAGE_SIZE))
> +               return ERR_PTR(-EINVAL);
> +
> +       pr_debug("%s: start group=%p ring_size=%llu, requested=%lu\n", __func__, group,
> +                ring_size, size);
> +
> +       spin_lock(&group->notification_lock);
> +again:
> +       head = group->ring_buffer.head;
> +       tail = group->ring_buffer.tail;
> +       new_head = NEXT_SLOT(head, size, ring_size);
> +
> +       /* head would catch up to tail, corrupting an entry. */
> +       if ((head < tail && new_head > tail) || (head > new_head && new_head > tail)) {
> +               fsn = ERR_PTR(-ENOMEM);
> +               goto err;
> +       }
> +
> +       /*
> +        * Not event a skip message fits in the page. We can detect the
> +        * lack of space. Move on to the next page.
> +        */
> +       if ((PAGE_SIZE - (head & (PAGE_SIZE-1))) < sizeof(struct fsnotify_event)) {
> +               /* Start again on next page */
> +               group->ring_buffer.head = NEXT_PAGE(head, ring_size);
> +               goto again;
> +       }
> +
> +       kaddr = kmap_atomic(group->ring_buffer.pages[head / PAGE_SIZE]);
> +       if (!kaddr) {
> +               fsn = ERR_PTR(-EFAULT);
> +               goto err;
> +       }
> +
> +       fsn = (struct fsnotify_event *) (kaddr + (head & (PAGE_SIZE-1)));
> +
> +       if ((head >> PAGE_SHIFT) != (new_head >> PAGE_SHIFT)) {
> +               /*
> +                * No room in the current page.  Add a fake entry
> +                * consuming the end the page to avoid splitting event
> +                * structure.
> +                */
> +               fsn->slot_len = INVALID_RING_SLOT;
> +               kunmap_atomic(kaddr);
> +               /* Start again on the next page */
> +               group->ring_buffer.head = NEXT_PAGE(head, ring_size);
> +
> +               goto again;
> +       }
> +       fsn->slot_len = size;
> +
> +       return fsn;
> +
> +err:
> +       spin_unlock(&group->notification_lock);
> +       return fsn;
> +}
> +
> +void fsnotify_ring_commit_slot(struct fsnotify_group *group, struct fsnotify_event *fsn)
> +       __releases(&group->notification_lock)
> +{
> +       u64 ring_size = group->ring_buffer.nr_pages << PAGE_SHIFT;
> +       u64 head = group->ring_buffer.head;
> +       u64 new_head = NEXT_SLOT(head, fsn->slot_len, ring_size);
> +
> +       pr_debug("%s: group=%p head=%llx->%llx ring_size=%llu\n", __func__,
> +                group, head, new_head, ring_size);
> +
> +       kunmap_atomic(fsn);
> +       group->ring_buffer.head = new_head;
> +
> +       spin_unlock(&group->notification_lock);
> +
> +       wake_up(&group->notification_waitq);
> +       kill_fasync(&group->fsn_fa, SIGIO, POLL_IN);
> +
> +}
> +
> +void fsnotify_free_ring_buffer(struct fsnotify_group *group)
> +{
> +       int i;
> +
> +       for (i = 0; i < group->ring_buffer.nr_pages; i++)
> +               __free_page(group->ring_buffer.pages[i]);
> +       kfree(group->ring_buffer.pages);
> +       group->ring_buffer.nr_pages = 0;
> +}
> +
> +int fsnotify_create_ring_buffer(struct fsnotify_group *group)
> +{
> +       int nr_pages = FSNOTIFY_RING_PAGES;
> +       int i;
> +
> +       pr_debug("%s: group=%p pages=%d\n", __func__, group, nr_pages);
> +
> +       group->ring_buffer.pages = kmalloc_array(nr_pages, sizeof(struct pages *),
> +                                                GFP_KERNEL);
> +       if (!group->ring_buffer.pages)
> +               return -ENOMEM;
> +
> +       group->ring_buffer.head = 0;
> +       group->ring_buffer.tail = 0;
> +
> +       for (i = 0; i < nr_pages; i++) {
> +               group->ring_buffer.pages[i] = alloc_pages(GFP_KERNEL, 1);
> +               if (!group->ring_buffer.pages)
> +                       goto err_dealloc;
> +       }
> +
> +       group->ring_buffer.nr_pages = nr_pages;
> +
> +       return 0;
> +
> +err_dealloc:
> +       for (--i; i >= 0; i--)
> +               __free_page(group->ring_buffer.pages[i]);
> +       kfree(group->ring_buffer.pages);
> +       group->ring_buffer.nr_pages = 0;
> +       return -ENOMEM;
> +}
> +
> +

Nothing in this file is fsnotify specific.
Is there no kernel lib implementation for this already?
If there isn't (I'd be very surprised) please put this in lib/ and post it
for wider review including self tests.

> diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
> index 190c6a402e98..a1a4dd69e5ed 100644
> --- a/include/linux/fsnotify_backend.h
> +++ b/include/linux/fsnotify_backend.h
> @@ -74,6 +74,8 @@
>  #define ALL_FSNOTIFY_PERM_EVENTS (FS_OPEN_PERM | FS_ACCESS_PERM | \
>                                   FS_OPEN_EXEC_PERM)
>
> +#define FSN_SUBMISSION_RING_BUFFER     0x00000080

FSNOTIFY_GROUP_FLAG_RING_BUFFER please (or FSN_GROUP_ if you must)
and please define this above struct fsnotify_group, even right above the flags
field like FSNOTIFY_CONN_FLAG_HAS_FSID

*IF* we go this way :)

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC 05/15] fsnotify: Support event submission through ring buffer
  2021-04-26 18:41 ` [PATCH RFC 05/15] fsnotify: Support event submission through ring buffer Gabriel Krisman Bertazi
  2021-04-26 22:00   ` kernel test robot
@ 2021-04-26 22:43   ` kernel test robot
  2021-04-27  5:39   ` Amir Goldstein
  2 siblings, 0 replies; 7+ messages in thread
From: kernel test robot @ 2021-04-26 22:43 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 4824 bytes --]

Hi Gabriel,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on pcmoore-audit/next]
[also build test WARNING on ext4/dev linus/master v5.12]
[cannot apply to ext3/fsnotify next-20210426]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210427-024627
base:   https://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit.git next
config: arm-randconfig-r022-20210426 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project d941863de2becb3d8d2e00676fc7125974934c7f)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install arm cross compiling tool for clang build
        # apt-get install binutils-arm-linux-gnueabi
        # https://github.com/0day-ci/linux/commit/0b550d36bb2ec4613ad64b68b18898a72fd5af50
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210427-024627
        git checkout 0b550d36bb2ec4613ad64b68b18898a72fd5af50
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 ARCH=arm 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> fs/notify/ring.c:82:15: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
                    ring_size, size);
                               ^~~~
   include/linux/printk.h:430:38: note: expanded from macro 'pr_debug'
           no_printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__)
                                       ~~~     ^~~~~~~~~~~
   include/linux/printk.h:140:17: note: expanded from macro 'no_printk'
                   printk(fmt, ##__VA_ARGS__);             \
                          ~~~    ^~~~~~~~~~~
   1 warning generated.


vim +82 fs/notify/ring.c

    67	
    68	struct fsnotify_event *fsnotify_ring_alloc_event_slot(struct fsnotify_group *group,
    69							      size_t size)
    70		__acquires(&group->notification_lock)
    71	{
    72		struct fsnotify_event *fsn;
    73		u64 head, tail;
    74		u64 ring_size = group->ring_buffer.nr_pages << PAGE_SHIFT;
    75		u64 new_head;
    76		void *kaddr;
    77	
    78		if (WARN_ON(!(group->flags & FSN_SUBMISSION_RING_BUFFER) || size > PAGE_SIZE))
    79			return ERR_PTR(-EINVAL);
    80	
    81		pr_debug("%s: start group=%p ring_size=%llu, requested=%lu\n", __func__, group,
  > 82			 ring_size, size);
    83	
    84		spin_lock(&group->notification_lock);
    85	again:
    86		head = group->ring_buffer.head;
    87		tail = group->ring_buffer.tail;
    88		new_head = NEXT_SLOT(head, size, ring_size);
    89	
    90		/* head would catch up to tail, corrupting an entry. */
    91		if ((head < tail && new_head > tail) || (head > new_head && new_head > tail)) {
    92			fsn = ERR_PTR(-ENOMEM);
    93			goto err;
    94		}
    95	
    96		/*
    97		 * Not event a skip message fits in the page. We can detect the
    98		 * lack of space. Move on to the next page.
    99		 */
   100		if ((PAGE_SIZE - (head & (PAGE_SIZE-1))) < sizeof(struct fsnotify_event)) {
   101			/* Start again on next page */
   102			group->ring_buffer.head = NEXT_PAGE(head, ring_size);
   103			goto again;
   104		}
   105	
   106		kaddr = kmap_atomic(group->ring_buffer.pages[head / PAGE_SIZE]);
   107		if (!kaddr) {
   108			fsn = ERR_PTR(-EFAULT);
   109			goto err;
   110		}
   111	
   112		fsn = (struct fsnotify_event *) (kaddr + (head & (PAGE_SIZE-1)));
   113	
   114		if ((head >> PAGE_SHIFT) != (new_head >> PAGE_SHIFT)) {
   115			/*
   116			 * No room in the current page.  Add a fake entry
   117			 * consuming the end the page to avoid splitting event
   118			 * structure.
   119			 */
   120			fsn->slot_len = INVALID_RING_SLOT;
   121			kunmap_atomic(kaddr);
   122			/* Start again on the next page */
   123			group->ring_buffer.head = NEXT_PAGE(head, ring_size);
   124	
   125			goto again;
   126		}
   127		fsn->slot_len = size;
   128	
   129		return fsn;
   130	
   131	err:
   132		spin_unlock(&group->notification_lock);
   133		return fsn;
   134	}
   135	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 33558 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC 05/15] fsnotify: Support event submission through ring buffer
  2021-04-26 18:41 ` [PATCH RFC 05/15] fsnotify: Support event submission through ring buffer Gabriel Krisman Bertazi
@ 2021-04-26 22:00   ` kernel test robot
  2021-04-26 22:43   ` kernel test robot
  2021-04-27  5:39   ` Amir Goldstein
  2 siblings, 0 replies; 7+ messages in thread
From: kernel test robot @ 2021-04-26 22:00 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 3775 bytes --]

Hi Gabriel,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on pcmoore-audit/next]
[also build test WARNING on ext4/dev linus/master v5.12]
[cannot apply to ext3/fsnotify next-20210426]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210427-024627
base:   https://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit.git next
config: microblaze-randconfig-r006-20210426 (attached as .config)
compiler: microblaze-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/0b550d36bb2ec4613ad64b68b18898a72fd5af50
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210427-024627
        git checkout 0b550d36bb2ec4613ad64b68b18898a72fd5af50
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross W=1 ARCH=microblaze 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   In file included from include/linux/kernel.h:16,
                    from include/linux/radix-tree.h:12,
                    from include/linux/idr.h:15,
                    from include/linux/fsnotify_backend.h:13,
                    from include/linux/fsnotify.h:15,
                    from fs/notify/ring.c:3:
   fs/notify/ring.c: In function 'fsnotify_ring_alloc_event_slot':
>> include/linux/kern_levels.h:5:18: warning: format '%lu' expects argument of type 'long unsigned int', but argument 5 has type 'size_t' {aka 'unsigned int'} [-Wformat=]
       5 | #define KERN_SOH "\001"  /* ASCII Start Of Header */
         |                  ^~~~~~
   include/linux/printk.h:140:10: note: in definition of macro 'no_printk'
     140 |   printk(fmt, ##__VA_ARGS__);  \
         |          ^~~
   include/linux/kern_levels.h:15:20: note: in expansion of macro 'KERN_SOH'
      15 | #define KERN_DEBUG KERN_SOH "7" /* debug-level messages */
         |                    ^~~~~~~~
   include/linux/printk.h:430:12: note: in expansion of macro 'KERN_DEBUG'
     430 |  no_printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__)
         |            ^~~~~~~~~~
   fs/notify/ring.c:81:2: note: in expansion of macro 'pr_debug'
      81 |  pr_debug("%s: start group=%p ring_size=%llu, requested=%lu\n", __func__, group,
         |  ^~~~~~~~
   fs/notify/ring.c:81:59: note: format string is defined here
      81 |  pr_debug("%s: start group=%p ring_size=%llu, requested=%lu\n", __func__, group,
         |                                                         ~~^
         |                                                           |
         |                                                           long unsigned int
         |                                                         %u


vim +5 include/linux/kern_levels.h

314ba3520e513a Joe Perches 2012-07-30  4  
04d2c8c83d0e3a Joe Perches 2012-07-30 @5  #define KERN_SOH	"\001"		/* ASCII Start Of Header */
04d2c8c83d0e3a Joe Perches 2012-07-30  6  #define KERN_SOH_ASCII	'\001'
04d2c8c83d0e3a Joe Perches 2012-07-30  7  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 35433 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC 05/15] fsnotify: Support event submission through ring buffer
@ 2021-04-26 21:38 kernel test robot
  0 siblings, 0 replies; 7+ messages in thread
From: kernel test robot @ 2021-04-26 21:38 UTC (permalink / raw)
  To: kbuild

[-- Attachment #1: Type: text/plain, Size: 7969 bytes --]

CC: kbuild-all(a)lists.01.org
In-Reply-To: <20210426184201.4177978-6-krisman@collabora.com>
References: <20210426184201.4177978-6-krisman@collabora.com>
TO: Gabriel Krisman Bertazi <krisman@collabora.com>

Hi Gabriel,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on pcmoore-audit/next]
[also build test WARNING on ext4/dev linus/master v5.12]
[cannot apply to ext3/fsnotify next-20210426]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210427-024627
base:   https://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit.git next
:::::: branch date: 3 hours ago
:::::: commit date: 3 hours ago
config: microblaze-randconfig-s031-20210426 (attached as .config)
compiler: microblaze-linux-gcc (GCC) 9.3.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # apt-get install sparse
        # sparse version: v0.6.3-341-g8af24329-dirty
        # https://github.com/0day-ci/linux/commit/0b550d36bb2ec4613ad64b68b18898a72fd5af50
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210427-024627
        git checkout 0b550d36bb2ec4613ad64b68b18898a72fd5af50
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' W=1 ARCH=microblaze 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>


sparse warnings: (new ones prefixed by >>)
>> fs/notify/ring.c:68:23: sparse: sparse: context imbalance in 'fsnotify_ring_alloc_event_slot' - wrong count at exit

vim +/fsnotify_ring_alloc_event_slot +68 fs/notify/ring.c

0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   67  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  @68  struct fsnotify_event *fsnotify_ring_alloc_event_slot(struct fsnotify_group *group,
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   69  						      size_t size)
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   70  	__acquires(&group->notification_lock)
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   71  {
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   72  	struct fsnotify_event *fsn;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   73  	u64 head, tail;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   74  	u64 ring_size = group->ring_buffer.nr_pages << PAGE_SHIFT;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   75  	u64 new_head;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   76  	void *kaddr;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   77  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   78  	if (WARN_ON(!(group->flags & FSN_SUBMISSION_RING_BUFFER) || size > PAGE_SIZE))
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   79  		return ERR_PTR(-EINVAL);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   80  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   81  	pr_debug("%s: start group=%p ring_size=%llu, requested=%lu\n", __func__, group,
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   82  		 ring_size, size);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   83  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   84  	spin_lock(&group->notification_lock);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   85  again:
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   86  	head = group->ring_buffer.head;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   87  	tail = group->ring_buffer.tail;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   88  	new_head = NEXT_SLOT(head, size, ring_size);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   89  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   90  	/* head would catch up to tail, corrupting an entry. */
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   91  	if ((head < tail && new_head > tail) || (head > new_head && new_head > tail)) {
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   92  		fsn = ERR_PTR(-ENOMEM);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   93  		goto err;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   94  	}
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   95  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   96  	/*
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   97  	 * Not event a skip message fits in the page. We can detect the
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   98  	 * lack of space. Move on to the next page.
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26   99  	 */
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  100  	if ((PAGE_SIZE - (head & (PAGE_SIZE-1))) < sizeof(struct fsnotify_event)) {
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  101  		/* Start again on next page */
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  102  		group->ring_buffer.head = NEXT_PAGE(head, ring_size);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  103  		goto again;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  104  	}
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  105  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  106  	kaddr = kmap_atomic(group->ring_buffer.pages[head / PAGE_SIZE]);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  107  	if (!kaddr) {
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  108  		fsn = ERR_PTR(-EFAULT);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  109  		goto err;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  110  	}
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  111  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  112  	fsn = (struct fsnotify_event *) (kaddr + (head & (PAGE_SIZE-1)));
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  113  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  114  	if ((head >> PAGE_SHIFT) != (new_head >> PAGE_SHIFT)) {
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  115  		/*
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  116  		 * No room in the current page.  Add a fake entry
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  117  		 * consuming the end the page to avoid splitting event
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  118  		 * structure.
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  119  		 */
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  120  		fsn->slot_len = INVALID_RING_SLOT;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  121  		kunmap_atomic(kaddr);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  122  		/* Start again on the next page */
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  123  		group->ring_buffer.head = NEXT_PAGE(head, ring_size);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  124  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  125  		goto again;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  126  	}
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  127  	fsn->slot_len = size;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  128  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  129  	return fsn;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  130  
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  131  err:
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  132  	spin_unlock(&group->notification_lock);
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  133  	return fsn;
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  134  }
0b550d36bb2ec4 Gabriel Krisman Bertazi 2021-04-26  135  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 24139 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH RFC 05/15] fsnotify: Support event submission through ring buffer
  2021-04-26 18:41 [PATCH RFC 00/15] File system wide monitoring Gabriel Krisman Bertazi
@ 2021-04-26 18:41 ` Gabriel Krisman Bertazi
  2021-04-26 22:00   ` kernel test robot
                     ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Gabriel Krisman Bertazi @ 2021-04-26 18:41 UTC (permalink / raw)
  To: amir73il, tytso, djwong
  Cc: david, jack, dhowells, khazhy, linux-fsdevel, linux-ext4,
	Gabriel Krisman Bertazi, kernel

In order to support file system health/error reporting over fanotify,
fsnotify needs to expose a submission path that doesn't allow sleeping.
The only problem I identified with the current submission path is the
need to dynamically allocate memory for the event queue.

This patch avoids the problem by introducing a new mode in fsnotify,
where a ring buffer is used to submit events for a group.  Each group
has its own ring buffer, and error notifications are submitted
exclusively through it.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
---
 fs/notify/Makefile               |   2 +-
 fs/notify/group.c                |  12 +-
 fs/notify/notification.c         |  10 ++
 fs/notify/ring.c                 | 199 +++++++++++++++++++++++++++++++
 include/linux/fsnotify_backend.h |  37 +++++-
 5 files changed, 255 insertions(+), 5 deletions(-)
 create mode 100644 fs/notify/ring.c

diff --git a/fs/notify/Makefile b/fs/notify/Makefile
index 63a4b8828df4..61dae1e90f2d 100644
--- a/fs/notify/Makefile
+++ b/fs/notify/Makefile
@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_FSNOTIFY)		+= fsnotify.o notification.o group.o mark.o \
-				   fdinfo.o
+				   fdinfo.o ring.o
 
 obj-y			+= dnotify/
 obj-y			+= inotify/
diff --git a/fs/notify/group.c b/fs/notify/group.c
index 08acb1afc0c2..b99b3de36696 100644
--- a/fs/notify/group.c
+++ b/fs/notify/group.c
@@ -81,7 +81,10 @@ void fsnotify_destroy_group(struct fsnotify_group *group)
 	 * notification against this group. So clearing the notification queue
 	 * of all events is reliable now.
 	 */
-	fsnotify_flush_notify(group);
+	if (group->flags & FSN_SUBMISSION_RING_BUFFER)
+		fsnotify_free_ring_buffer(group);
+	else
+		fsnotify_flush_notify(group);
 
 	/*
 	 * Destroy overflow event (we cannot use fsnotify_destroy_event() as
@@ -136,6 +139,13 @@ static struct fsnotify_group *__fsnotify_alloc_group(
 	group->ops = ops;
 	group->flags = flags;
 
+	if (group->flags & FSN_SUBMISSION_RING_BUFFER) {
+		if (fsnotify_create_ring_buffer(group)) {
+			kfree(group);
+			return ERR_PTR(-ENOMEM);
+		}
+	}
+
 	return group;
 }
 
diff --git a/fs/notify/notification.c b/fs/notify/notification.c
index 75d79d6d3ef0..32f97e7b7a80 100644
--- a/fs/notify/notification.c
+++ b/fs/notify/notification.c
@@ -51,6 +51,10 @@ EXPORT_SYMBOL_GPL(fsnotify_get_cookie);
 bool fsnotify_notify_queue_is_empty(struct fsnotify_group *group)
 {
 	assert_spin_locked(&group->notification_lock);
+
+	if (group->flags & FSN_SUBMISSION_RING_BUFFER)
+		return fsnotify_ring_notify_queue_is_empty(group);
+
 	return list_empty(&group->notification_list) ? true : false;
 }
 
@@ -132,6 +136,9 @@ void fsnotify_remove_queued_event(struct fsnotify_group *group,
 				  struct fsnotify_event *event)
 {
 	assert_spin_locked(&group->notification_lock);
+
+	if (group->flags & FSN_SUBMISSION_RING_BUFFER)
+		return;
 	/*
 	 * We need to init list head for the case of overflow event so that
 	 * check in fsnotify_add_event() works
@@ -166,6 +173,9 @@ struct fsnotify_event *fsnotify_peek_first_event(struct fsnotify_group *group)
 {
 	assert_spin_locked(&group->notification_lock);
 
+	if (group->flags & FSN_SUBMISSION_RING_BUFFER)
+		return fsnotify_ring_peek_first_event(group);
+
 	return list_first_entry(&group->notification_list,
 				struct fsnotify_event, list);
 }
diff --git a/fs/notify/ring.c b/fs/notify/ring.c
new file mode 100644
index 000000000000..75e8af1f8d80
--- /dev/null
+++ b/fs/notify/ring.c
@@ -0,0 +1,199 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/types.h>
+#include <linux/fsnotify.h>
+#include <linux/memcontrol.h>
+
+#define INVALID_RING_SLOT -1
+
+#define FSNOTIFY_RING_PAGES 16
+
+#define NEXT_SLOT(cur, len, ring_size) ((cur + len) & (ring_size-1))
+#define NEXT_PAGE(cur, ring_size) (round_up(cur, PAGE_SIZE) & (ring_size-1))
+
+bool fsnotify_ring_notify_queue_is_empty(struct fsnotify_group *group)
+{
+	assert_spin_locked(&group->notification_lock);
+
+	if (group->ring_buffer.tail == group->ring_buffer.head)
+		return true;
+	return false;
+}
+
+struct fsnotify_event *fsnotify_ring_peek_first_event(struct fsnotify_group *group)
+{
+	u64 ring_size = group->ring_buffer.nr_pages << PAGE_SHIFT;
+	struct fsnotify_event *fsn;
+	char *kaddr;
+	u64 tail;
+
+	assert_spin_locked(&group->notification_lock);
+
+again:
+	tail = group->ring_buffer.tail;
+
+	if ((PAGE_SIZE - (tail & (PAGE_SIZE-1))) < sizeof(struct fsnotify_event)) {
+		group->ring_buffer.tail = NEXT_PAGE(tail, ring_size);
+		goto again;
+	}
+
+	kaddr = kmap_atomic(group->ring_buffer.pages[tail / PAGE_SIZE]);
+	if (!kaddr)
+		return NULL;
+	fsn = (struct fsnotify_event *) (kaddr + (tail & (PAGE_SIZE-1)));
+
+	if (fsn->slot_len == INVALID_RING_SLOT) {
+		group->ring_buffer.tail = NEXT_PAGE(tail, ring_size);
+		kunmap_atomic(kaddr);
+		goto again;
+	}
+
+	/* will be unmapped when entry is consumed. */
+	return fsn;
+}
+
+void fsnotify_ring_buffer_consume_event(struct fsnotify_group *group,
+					struct fsnotify_event *event)
+{
+	u64 ring_size = group->ring_buffer.nr_pages << PAGE_SHIFT;
+	u64 new_tail = NEXT_SLOT(group->ring_buffer.tail, event->slot_len, ring_size);
+
+	kunmap_atomic(event);
+
+	pr_debug("%s: group=%p tail=%llx->%llx ring_size=%llu\n", __func__,
+		 group, group->ring_buffer.tail, new_tail, ring_size);
+
+	WRITE_ONCE(group->ring_buffer.tail, new_tail);
+}
+
+struct fsnotify_event *fsnotify_ring_alloc_event_slot(struct fsnotify_group *group,
+						      size_t size)
+	__acquires(&group->notification_lock)
+{
+	struct fsnotify_event *fsn;
+	u64 head, tail;
+	u64 ring_size = group->ring_buffer.nr_pages << PAGE_SHIFT;
+	u64 new_head;
+	void *kaddr;
+
+	if (WARN_ON(!(group->flags & FSN_SUBMISSION_RING_BUFFER) || size > PAGE_SIZE))
+		return ERR_PTR(-EINVAL);
+
+	pr_debug("%s: start group=%p ring_size=%llu, requested=%lu\n", __func__, group,
+		 ring_size, size);
+
+	spin_lock(&group->notification_lock);
+again:
+	head = group->ring_buffer.head;
+	tail = group->ring_buffer.tail;
+	new_head = NEXT_SLOT(head, size, ring_size);
+
+	/* head would catch up to tail, corrupting an entry. */
+	if ((head < tail && new_head > tail) || (head > new_head && new_head > tail)) {
+		fsn = ERR_PTR(-ENOMEM);
+		goto err;
+	}
+
+	/*
+	 * Not event a skip message fits in the page. We can detect the
+	 * lack of space. Move on to the next page.
+	 */
+	if ((PAGE_SIZE - (head & (PAGE_SIZE-1))) < sizeof(struct fsnotify_event)) {
+		/* Start again on next page */
+		group->ring_buffer.head = NEXT_PAGE(head, ring_size);
+		goto again;
+	}
+
+	kaddr = kmap_atomic(group->ring_buffer.pages[head / PAGE_SIZE]);
+	if (!kaddr) {
+		fsn = ERR_PTR(-EFAULT);
+		goto err;
+	}
+
+	fsn = (struct fsnotify_event *) (kaddr + (head & (PAGE_SIZE-1)));
+
+	if ((head >> PAGE_SHIFT) != (new_head >> PAGE_SHIFT)) {
+		/*
+		 * No room in the current page.  Add a fake entry
+		 * consuming the end the page to avoid splitting event
+		 * structure.
+		 */
+		fsn->slot_len = INVALID_RING_SLOT;
+		kunmap_atomic(kaddr);
+		/* Start again on the next page */
+		group->ring_buffer.head = NEXT_PAGE(head, ring_size);
+
+		goto again;
+	}
+	fsn->slot_len = size;
+
+	return fsn;
+
+err:
+	spin_unlock(&group->notification_lock);
+	return fsn;
+}
+
+void fsnotify_ring_commit_slot(struct fsnotify_group *group, struct fsnotify_event *fsn)
+	__releases(&group->notification_lock)
+{
+	u64 ring_size = group->ring_buffer.nr_pages << PAGE_SHIFT;
+	u64 head = group->ring_buffer.head;
+	u64 new_head = NEXT_SLOT(head, fsn->slot_len, ring_size);
+
+	pr_debug("%s: group=%p head=%llx->%llx ring_size=%llu\n", __func__,
+		 group, head, new_head, ring_size);
+
+	kunmap_atomic(fsn);
+	group->ring_buffer.head = new_head;
+
+	spin_unlock(&group->notification_lock);
+
+	wake_up(&group->notification_waitq);
+	kill_fasync(&group->fsn_fa, SIGIO, POLL_IN);
+
+}
+
+void fsnotify_free_ring_buffer(struct fsnotify_group *group)
+{
+	int i;
+
+	for (i = 0; i < group->ring_buffer.nr_pages; i++)
+		__free_page(group->ring_buffer.pages[i]);
+	kfree(group->ring_buffer.pages);
+	group->ring_buffer.nr_pages = 0;
+}
+
+int fsnotify_create_ring_buffer(struct fsnotify_group *group)
+{
+	int nr_pages = FSNOTIFY_RING_PAGES;
+	int i;
+
+	pr_debug("%s: group=%p pages=%d\n", __func__, group, nr_pages);
+
+	group->ring_buffer.pages = kmalloc_array(nr_pages, sizeof(struct pages *),
+						 GFP_KERNEL);
+	if (!group->ring_buffer.pages)
+		return -ENOMEM;
+
+	group->ring_buffer.head = 0;
+	group->ring_buffer.tail = 0;
+
+	for (i = 0; i < nr_pages; i++) {
+		group->ring_buffer.pages[i] = alloc_pages(GFP_KERNEL, 1);
+		if (!group->ring_buffer.pages)
+			goto err_dealloc;
+	}
+
+	group->ring_buffer.nr_pages = nr_pages;
+
+	return 0;
+
+err_dealloc:
+	for (--i; i >= 0; i--)
+		__free_page(group->ring_buffer.pages[i]);
+	kfree(group->ring_buffer.pages);
+	group->ring_buffer.nr_pages = 0;
+	return -ENOMEM;
+}
+
+
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 190c6a402e98..a1a4dd69e5ed 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -74,6 +74,8 @@
 #define ALL_FSNOTIFY_PERM_EVENTS (FS_OPEN_PERM | FS_ACCESS_PERM | \
 				  FS_OPEN_EXEC_PERM)
 
+#define FSN_SUBMISSION_RING_BUFFER	0x00000080
+
 /*
  * This is a list of all events that may get sent to a parent that is watching
  * with flag FS_EVENT_ON_CHILD based on fs event on a child of that directory.
@@ -166,7 +168,11 @@ struct fsnotify_ops {
  * listener this structure is where you need to be adding fields.
  */
 struct fsnotify_event {
-	struct list_head list;
+	union {
+		struct list_head list;
+		int slot_len;
+	};
+
 	unsigned long objectid;	/* identifier for queue merges */
 };
 
@@ -191,7 +197,21 @@ struct fsnotify_group {
 
 	/* needed to send notification to userspace */
 	spinlock_t notification_lock;		/* protect the notification_list */
-	struct list_head notification_list;	/* list of event_holder this group needs to send to userspace */
+
+	union {
+		/*
+		 * list of event_holder this group needs to send to
+		 * userspace.  Either a linked list (default), or a ring
+		 * buffer(FSN_SUBMISSION_RING_BUFFER).
+		 */
+		struct list_head notification_list;
+		struct {
+			struct page **pages;
+			int nr_pages;
+			u64 head;
+			u64 tail;
+		} ring_buffer;
+	};
 	wait_queue_head_t notification_waitq;	/* read() on the notification file blocks on this waitq */
 	unsigned int q_len;			/* events on the queue */
 	unsigned int max_events;		/* maximum events allowed on the list */
@@ -492,6 +512,16 @@ extern int fsnotify_add_event(struct fsnotify_group *group,
 			      struct fsnotify_event *event,
 			      int (*merge)(struct list_head *,
 					   struct fsnotify_event *));
+
+extern int fsnotify_create_ring_buffer(struct fsnotify_group *group);
+extern void fsnotify_free_ring_buffer(struct fsnotify_group *group);
+extern struct fsnotify_event *fsnotify_ring_alloc_event_slot(struct fsnotify_group *group,
+							     size_t size);
+extern void fsnotify_ring_buffer_consume_event(struct fsnotify_group *group,
+					       struct fsnotify_event *event);
+extern bool fsnotify_ring_notify_queue_is_empty(struct fsnotify_group *group);
+struct fsnotify_event *fsnotify_ring_peek_first_event(struct fsnotify_group *group);
+extern void fsnotify_ring_commit_slot(struct fsnotify_group *group, struct fsnotify_event *fsn);
 /* Queue overflow event to a notification group */
 static inline void fsnotify_queue_overflow(struct fsnotify_group *group)
 {
@@ -583,7 +613,8 @@ static inline void fsnotify_init_event(struct fsnotify_group *group,
 				       struct fsnotify_event *event,
 				       unsigned long objectid)
 {
-	INIT_LIST_HEAD(&event->list);
+	if (!(group->flags & FSN_SUBMISSION_RING_BUFFER))
+		INIT_LIST_HEAD(&event->list);
 	event->objectid = objectid;
 }
 
-- 
2.31.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-04-29 18:33 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-26 21:57 [PATCH RFC 05/15] fsnotify: Support event submission through ring buffer kernel test robot
  -- strict thread matches above, loose matches on Subject: below --
2021-04-26 21:38 kernel test robot
2021-04-26 18:41 [PATCH RFC 00/15] File system wide monitoring Gabriel Krisman Bertazi
2021-04-26 18:41 ` [PATCH RFC 05/15] fsnotify: Support event submission through ring buffer Gabriel Krisman Bertazi
2021-04-26 22:00   ` kernel test robot
2021-04-26 22:43   ` kernel test robot
2021-04-27  5:39   ` Amir Goldstein
2021-04-29 18:33     ` Gabriel Krisman Bertazi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.