linux-kernel.vger.kernel.org archive mirror
From: Beau Belgrave <beaub@linux.microsoft.com>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: rostedt@goodmis.org, mhiramat@kernel.org,
	dcook@linux.microsoft.com, alanau@linux.microsoft.com,
	linux-trace-devel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 0/2] tracing/user_events: Remote write ABI
Date: Fri, 28 Oct 2022 15:17:28 -0700
Message-ID: <20221028221728.GA162@W11-BEAU-MD.localdomain>
In-Reply-To: <96d9f066-2f39-78e6-9be7-f9c69235615e@efficios.com>

On Fri, Oct 28, 2022 at 05:50:04PM -0400, Mathieu Desnoyers wrote:
> On 2022-10-27 18:40, Beau Belgrave wrote:
> > As part of the discussions for user_events aligned with user space
> > tracers, it was determined that user programs should register a 32-bit
> > value to set or clear a bit when an event becomes enabled. Currently a
> > shared page is being used that requires mmap().
> > 
> > In this new model during the event registration from user programs 2 new
> > values are specified. The first is the address to update when the event
> > is either enabled or disabled. The second is the bit to set/clear to
> > reflect the event being enabled. This allows for a local 32-bit value in
> > user programs to support both kernel and user tracers. As an example,
> > setting bit 31 for kernel tracers when the event becomes enabled allows
> > for user tracers to use the other bits for ref counts or other flags.
> > The kernel side updates the bit atomically; user programs must also
> > update these values atomically.
> 
> Nice!
> 
> > 
> > User-provided addresses must be aligned on a 32-bit boundary. This
> > allows for single-page checking and prevents odd behaviors such as a
> > 32-bit value straddling 2 pages instead of a single page.
> > 
> > When page faults are encountered, they are handled asynchronously via
> > a workqueue. If the page faults back in, the write update is attempted
> > again. If the page cannot be faulted in, we log and wait until the next
> > time the event is enabled/disabled. This prevents possible infinite
> > loops resulting from bad user processes unmapping or changing
> > protection values after registering the address.
> 
> I'll have a close look at this workqueue page fault scheme, probably next
> week.
> 

Excellent.

> > 
> > NOTE:
> > User programs that wish to have the enable bit shared across forks
> > either need to use a MAP_SHARED allocated address or register a new
> > address and file descriptor. If MAP_SHARED cannot be used or new
> > registrations cannot be done, then it's allowable to use MAP_PRIVATE
> > as long as the forked children never update the page themselves. Once
> > the page has been updated, the page from the parent will be copied over
> > to the child. This new copy-on-write page will not receive updates from
> > the kernel until another registration has been performed with this new
> > address.
> 
> This seems rather odd. I would expect that if a parent process registers
> some instrumentation using private mappings for enabled state through the
> user events ioctl, and then forks, the child process would seamlessly be
> traced by the user events ABI while being able to also change the enabled
> state from the userspace tracer libraries (which would trigger COW).
> Requiring the child to re-register to user events is rather odd.
> 

It's the COW that is the problem, see below.

> What is preventing us from tracing the child without re-registration in this
> scenario ?
> 

Largely, knowing when the COW occurs on a specific page. We don't create
the mappings, so I'm unsure whether we can easily ask to be notified when
a COW happens. If we could, that would solve this. I'm glad you are
thinking about this. The note here was exactly to trigger this
discussion :)
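
To make the fork behavior concrete, here is a minimal user-space sketch,
not code from the patches. It assumes an anonymous mapping holding one
32-bit enable word, with bit 31 used for the kernel tracer as in the
cover letter's example. The helper name child_view_after_fork() is
hypothetical, and the parent's atomic OR is only a stand-in for the
kernel's remote write; the real kernel path is not modelled here.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define KERNEL_ENABLE_BIT 31u /* bit 31, as in the cover letter's example */

/*
 * Hypothetical helper: map one enable word with map_flags (MAP_SHARED or
 * MAP_PRIVATE), fork, and store in *child_view what the child reads after
 * the parent has set KERNEL_ENABLE_BIT. Returns 0 on success, -1 on error.
 */
static int child_view_after_fork(int map_flags, uint32_t *child_view)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	int to_child[2], to_parent[2];
	_Atomic uint32_t *word;
	uint32_t seen = 0;
	char token = 0;
	int ret = -1;
	pid_t pid;

	word = mmap(NULL, len, PROT_READ | PROT_WRITE,
		    map_flags | MAP_ANONYMOUS, -1, 0);
	if (word == MAP_FAILED || pipe(to_child) || pipe(to_parent))
		return -1;

	/* The ABI wants a 32-bit aligned address; a fresh mapping has one. */
	assert(((uintptr_t)word & (sizeof(uint32_t) - 1)) == 0);
	atomic_store(word, 0);

	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child: a user tracer takes a reference in the low bits. */
		atomic_fetch_add(word, 1);
		if (write(to_parent[1], &token, 1) != 1 ||
		    read(to_child[0], &token, 1) != 1)
			_exit(1);
		seen = atomic_load(word);
		if (write(to_parent[1], &seen, sizeof(seen)) != sizeof(seen))
			_exit(1);
		_exit(0);
	}

	/* Parent: wait for the child's write, then stand in for the kernel. */
	if (read(to_parent[0], &token, 1) == 1) {
		atomic_fetch_or(word, 1u << KERNEL_ENABLE_BIT);
		if (write(to_child[1], &token, 1) == 1 &&
		    read(to_parent[0], &seen, sizeof(seen)) == sizeof(seen))
			ret = 0;
	}
	waitpid(pid, NULL, 0);

	close(to_child[0]);
	close(to_child[1]);
	close(to_parent[0]);
	close(to_parent[1]);
	munmap((void *)word, len);
	*child_view = seen;
	return ret;
}
```

Calling it with MAP_SHARED should give the child a word with both its
own reference and bit 31 set. With MAP_PRIVATE the child's write gives
it a private copy, so it only ever sees its own reference. That
divergence is what the note in the cover letter is about.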

I believe this is the same situation as with a futex. I'll take another
look at that code to see if they've come up with anything regarding this.

Any ideas?

Thanks,
-Beau
