linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: David Howells <dhowells@redhat.com>
To: Greg KH <gregkh@linuxfoundation.org>
Cc: dhowells@redhat.com, viro@zeniv.linux.org.uk, raven@themaw.net,
	linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org,
	linux-block@vger.kernel.org, keyrings@vger.kernel.org,
	linux-security-module@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/7] General notification queue with user mmap()'able ring buffer
Date: Fri, 31 May 2019 15:55:08 +0100	[thread overview]
Message-ID: <3763.1559314508@warthog.procyon.org.uk> (raw)
In-Reply-To: <20190529231112.GB3164@kroah.com>

Greg KH <gregkh@linuxfoundation.org> wrote:

> So, if that's all that needs to be fixed, can you use the same
> buffer/code if that patch is merged?

I really don't know.  The perf code is complex, partially in hardware drivers
and is tricky to understand - though a chunk of that is the "aux" buffer part;
PeterZ used words like "special" and "magic" and the comments in the code talk
about the hardware writing into the buffer.

__perf_output_begin() does not appear to be SMP safe.  It uses local_cmpxchg()
and local_add() which on x86 lack the LOCK prefix.

stracing the perf command on my test machine, it calls perf_event_open(2) four
times and mmap's each fd it gets back.  I'm guessing that each one maps a
separate buffer for each CPU.

So to use watch_queue based on perf's buffering, you would have to have a
(2^N)+1 pages-sized buffer for each CPU.  So that would be a minimum of 64K of
unswappable memory for my desktop machine, say).  Multiply that by each
process that wants to listen for events...

What I'm aiming for is something that has a single buffer used by all CPUs for
each instance of /dev/watch_queue opened and I'd also like to avoid having to
allocate the metadata page and the aux buffer to save space.  This is locked
memory and cannot be swapped.

Also, perf has to leave a gap in the ring because it uses CIRC_SPACE(), though
that's a minor detail that I guess can't be fixed now.

I'm also slightly concerned that __perf_output_begin() doesn't check if
rb->user->tail has got ahead of rb->user->head or that it's lagging too far
behind.  I doubt it's a serious problem for the kernel since it won't write
outside of the buffer, but userspace might screw up.  I think the worst that
will happen is that userspace will get confused.

One thing I would like is to waive the 2^N size requirement.  I understand
*why* we do that, but I wonder how expensive DIV instructions are for
relatively small divisors.

David

  parent reply	other threads:[~2019-05-31 14:55 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-28 16:01 [RFC][PATCH 0/7] Mount, FS, Block and Keyrings notifications David Howells
2019-05-28 16:01 ` [PATCH 1/7] General notification queue with user mmap()'able ring buffer David Howells
2019-05-28 16:26   ` Greg KH
2019-05-28 17:30   ` David Howells
2019-05-28 23:12     ` Greg KH
2019-05-29 16:06     ` David Howells
2019-05-29 17:46       ` Jann Horn
2019-05-29 21:02       ` David Howells
2019-05-31 11:14         ` Peter Zijlstra
2019-05-31 12:02         ` David Howells
2019-05-31 13:26           ` Peter Zijlstra
2019-05-31 14:20           ` David Howells
2019-05-31 16:44             ` Peter Zijlstra
2019-05-31 17:12             ` David Howells
2019-06-17 16:24               ` Peter Zijlstra
2019-05-29 23:09       ` Greg KH
2019-05-29 23:11       ` Greg KH
2019-05-30  9:50         ` Andrea Parri
2019-05-31  8:35           ` Peter Zijlstra
2019-05-31  8:47       ` Peter Zijlstra
2019-05-31 12:42       ` David Howells
2019-05-31 14:55       ` David Howells [this message]
2019-05-28 19:14   ` Jann Horn
2019-05-28 22:28   ` David Howells
2019-05-28 23:16     ` Jann Horn
2019-05-28 16:02 ` [PATCH 2/7] keys: Add a notification facility David Howells
2019-05-28 16:02 ` [PATCH 3/7] vfs: Add a mount-notification facility David Howells
2019-05-28 20:06   ` Jann Horn
2019-05-28 23:04   ` David Howells
2019-05-28 23:23     ` Jann Horn
2019-05-29 11:16     ` David Howells
2019-05-28 23:08   ` David Howells
2019-05-29 10:55   ` David Howells
2019-05-29 11:00   ` David Howells
2019-05-29 15:53     ` Casey Schaufler
2019-05-29 16:12       ` Jann Horn
2019-05-29 17:04         ` Casey Schaufler
2019-06-03 16:30         ` David Howells
2019-05-29 17:13       ` Andy Lutomirski
2019-05-29 17:46         ` Casey Schaufler
2019-05-29 18:11           ` Jann Horn
2019-05-29 19:28             ` Casey Schaufler
2019-05-29 19:47               ` Jann Horn
2019-05-29 20:50                 ` Casey Schaufler
2019-05-29 23:12           ` Andy Lutomirski
2019-05-29 23:56             ` Casey Schaufler
2019-05-28 16:02 ` [PATCH 4/7] vfs: Add superblock notifications David Howells
2019-05-28 20:27   ` Jann Horn
2019-05-29 12:58   ` David Howells
2019-05-29 14:16     ` Jann Horn
2019-05-28 16:02 ` [PATCH 5/7] fsinfo: Export superblock notification counter David Howells
2019-05-28 16:02 ` [PATCH 6/7] block: Add block layer notifications David Howells
2019-05-28 20:37   ` Jann Horn
2019-05-28 16:02 ` [PATCH 7/7] Add sample notification program David Howells
2019-05-28 23:58 ` [RFC][PATCH 0/7] Mount, FS, Block and Keyrings notifications Greg KH
2019-05-29  6:33 ` Amir Goldstein
2019-05-29 14:25   ` Jan Kara
2019-05-29 15:10     ` Greg KH
2019-05-29 15:53     ` Amir Goldstein
2019-05-30 11:00       ` Jan Kara
2019-06-04 12:33     ` David Howells
2019-05-29  6:45 ` David Howells
2019-05-29  7:40   ` Amir Goldstein
2019-05-29  9:09 ` David Howells
2019-05-29 15:41   ` Casey Schaufler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3763.1559314508@warthog.procyon.org.uk \
    --to=dhowells@redhat.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=keyrings@vger.kernel.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=raven@themaw.net \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).