intel-gfx.lists.freedesktop.org archive mirror
From: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
To: Chris Wilson <chris.p.wilson@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH 4/4] drm/i915/perf: Map OA buffer to user space for gen12 performance query
Date: Fri, 24 Jul 2020 11:47:37 -0700
Message-ID: <20200724184737.GC28353@orsosgc001.amr.corp.intel.com>
In-Reply-To: <159560845126.2889.3198879925052513730@build.alporthouse.com>

On Fri, Jul 24, 2020 at 05:34:11PM +0100, Chris Wilson wrote:
>Quoting Umesh Nerlige Ramappa (2020-07-24 17:29:56)
>> On Fri, Jul 24, 2020 at 01:42:33PM +0100, Chris Wilson wrote:
>> >Quoting Umesh Nerlige Ramappa (2020-07-24 01:19:01)
>> >> From: Piotr Maciejewski <piotr.maciejewski@intel.com>
>> >>
>> >> i915 used to support time based sampling mode which is good for overall
>> >> system monitoring, but is not enough for query mode used to measure a
>> >> single draw call or dispatch. Gen9-Gen11 are using current i915 perf
>> >> implementation for query, but Gen12+ requires a new approach for query
>> >> based on triggered reports within oa buffer.
>> >>
>> >> Triggering reports into the OA buffer is achieved by writing into a
>> >> trigger register. Optionally an unused counter/register is set with a
>> >> marker value such that a triggered report can be identified in the OA
>> >> buffer. Reports are usually triggered at the start and end of work that
>> >> is measured.
>> >>
>> >> Since OA buffer is large and queries can be frequent, an efficient way
>> >> to look for triggered reports is required. By knowing the current head
>> >> and tail offsets into the OA buffer, it is easier to determine the
>> >> locality of the reports of interest.
>> >>
>> >> Current perf OA interface does not expose head/tail information to the
>> >> user and it filters out invalid reports before sending data to user.
>> >> Also considering limited size of user buffer used during a query,
>> >> creating a 1:1 copy of the OA buffer at the user space added undesired
>> >> complexity.
>> >>
>> >> The solution was to map the OA buffer to user space provided
>> >>
>> >> (1) that it is accessed from a privileged user.
>> >> (2) OA report filtering is not used.
>> >>
>> >> These 2 conditions would satisfy the safety criteria that the current
>> >> perf interface addresses.
>> >>
>> >> To enable the query:
>> >> - Add an ioctl to expose head and tail to the user
>> >> - Add an ioctl to return size and offset of the OA buffer
>> >> - Map the OA buffer to the user space
>> >>
>> >> v2:
>> >> - Improve commit message (Chris)
>> >> - Do not mmap based on gem object filp. Instead, use perf_fd and support
>> >>   mmap syscall (Chris)
>> >> - Pass non-zero offset in mmap to enforce the right object is
>> >>   mapped (Chris)
>> >> - Do not expose gpu_address (Chris)
>> >> - Verify start and length of vma for page alignment (Lionel)
>> >> - Move SQNTL config out (Lionel)
>> >>
>> >> v3: (Chris)
>> >> - Omit redundant checks
>> >> - Return VM_FAULT_SIGBUS if old stream is closed
>> >> - Maintain reference counts to stream in vm_open and vm_close
>> >> - Use switch to identify object to be mapped
>> >>
>> >> v4: Call kref_put on closing perf fd (Chris)
>> >> v5:
>> >> - Strip access to OA buffer from unprivileged child of a privileged
>> >>   parent. Use VM_DONTCOPY
>> >> - Enforce MAP_PRIVATE by checking for VM_MAYSHARE
>> >>
>> >> Signed-off-by: Piotr Maciejewski <piotr.maciejewski@intel.com>
>> >> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>> >> ---
>> >> @@ -3314,12 +3427,113 @@ static int i915_perf_release(struct inode *inode, struct file *file)
>> >>         i915_perf_destroy_locked(stream);
>> >>         mutex_unlock(&perf->lock);
>> >>
>> >> +       unmap_mapping_range(file->f_mapping, 0, OA_BUFFER_SIZE, 1);
>> >
>> >You can just use unmap_mapping_range(file->f_mapping, 0, -1, 1);
>> >It scales with the number of vma present, so no worries, be conservative.
>> >(Otherwise, you need s/0/OA_BUFFER_OFFSET/.)
>> >
>> >> +
>> >>         /* Release the reference the perf stream kept on the driver. */
>> >>         drm_dev_put(&perf->i915->drm);
>> >>
>> >>         return 0;
>> >>  }
>> >>
>> >> +static void vm_open_oa(struct vm_area_struct *vma)
>> >> +{
>> >> +       struct i915_perf_stream *stream = vma->vm_private_data;
>> >> +
>> >> +       GEM_BUG_ON(!stream);
>> >> +       perf_stream_get(stream);
>> >> +}
>> >> +
>> >> +static void vm_close_oa(struct vm_area_struct *vma)
>> >> +{
>> >> +       struct i915_perf_stream *stream = vma->vm_private_data;
>> >> +
>> >> +       GEM_BUG_ON(!stream);
>> >> +       perf_stream_put(stream);
>> >> +}
>> >> +
>> >> +static vm_fault_t vm_fault_oa(struct vm_fault *vmf)
>> >> +{
>> >> +       struct vm_area_struct *vma = vmf->vma;
>> >> +       struct i915_perf_stream *stream = vma->vm_private_data;
>> >> +       struct i915_perf *perf = stream->perf;
>> >> +       struct drm_i915_gem_object *obj = stream->oa_buffer.vma->obj;
>> >> +       int err;
>> >> +       bool closed;
>> >
>> >So vm_area_struct has a reference to the stream, that looks good now.
>> >But there's no reference held to the vma itself.
>>
>> How do I get a reference to the vma?
>
>That would be i915_vma_get(), but you don't need to if we control the
>order correctly, as then neither the PTE nor the ongoing faulthandler
>last longer than the i915_perf_stream.

I see that do_mmap()->mmap_region() takes a reference to the file:

vma->vm_file = get_file(file);

In our case this is perf_fd. do_munmap() does the corresponding fput().

So i915_perf_release(), and with it unmap_mapping_range(), is never
called unless the user calls both munmap() and close(perf_fd) (or the
process terminates).

Is that enough to take care of this ordering?

This also explains why I cannot get a VM_FAULT_SIGBUS with the IGTs.
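
The same lifetime rule can be seen from user space with an ordinary file.
The sketch below is only an illustration under assumptions: it uses a
temporary file and MAP_SHARED rather than perf_fd, and map_then_close() is
a made-up helper name, not part of i915 or the IGTs. It maps one page,
closes the fd, and then still touches the mapping, which works because the
vma holds its own reference to the file until munmap().

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map one page of a temporary file, close the fd,
 * and check that the mapping is still usable afterwards.
 * Returns 0 on success, -1 on any error.
 */
int map_then_close(void)
{
	char path[] = "/tmp/mapdemoXXXXXX";
	long page = sysconf(_SC_PAGESIZE);
	int fd, ret = -1;
	char *p;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);	/* the file now lives on only via fd and mappings */

	if (ftruncate(fd, page) < 0) {
		close(fd);
		return -1;
	}

	p = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);	/* drops the fd's reference, not the mapping's */
	if (p == MAP_FAILED)
		return -1;

	/* The fd is gone, but the mapping's file reference keeps it valid. */
	strcpy(p, "still mapped");
	if (strcmp(p, "still mapped") == 0)
		ret = 0;

	munmap(p, page);	/* last reference dropped; release can run now */
	return ret;
}
```

With perf_fd the equivalent of the final munmap() is what would let the
release path run, which matches not seeing a SIGBUS while the mapping is
still in place.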

>
>> >> +       mutex_lock(&perf->lock);
>> >> +       closed = READ_ONCE(stream->closed);
>> >> +       mutex_unlock(&perf->lock);
>> >
>> >We do WRITE_ONCE(stream->closed, true) then invalidate all the mappings,
>> >so that part looks good. The invalidate is serialised with the
>> >vm_fault_oa, so we can just use a plain READ_ONCE(stream->closed) here
>> >and not worry about the perf->lock.
>>
>> will do
>> >
>> >However... I think it should close&invalidate before releasing
>> >stream->oa_buffer.
>>
>> will do
>> >
>> >And the read here of stream->oa_buffer should be after checking
>> >stream->closed.
>>
>> I don't understand. I am checking for closed before remap_io_sg.
>
>It's the
>
>struct drm_i915_gem_object *obj = stream->oa_buffer.vma->obj;
>
>that's before the stream->closed check. That's dereferencing vma, but vma
>will be set to NULL in i915_perf_destroy.

I will not use stream->oa_buffer.vma->obj in vm_fault_oa based on your 
earlier comments, so this should be taken care of.
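
For reference, the ordering you describe (check closed first, only then
touch the buffer pointer) can be modelled in plain C as below. This is a
single-threaded sketch with made-up names (model_stream, model_fault(),
model_destroy()); it is not the driver code and does not model the
serialisation between the invalidate and the fault handler.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the driver structures; all names are illustrative only. */
struct model_buffer {
	int pages;
};

struct model_stream {
	atomic_bool closed;
	struct model_buffer *buf;	/* set to NULL on destroy */
};

enum model_fault_result { MODEL_FAULT_OK, MODEL_FAULT_SIGBUS };

/* Fault path: test closed before dereferencing the buffer pointer. */
enum model_fault_result model_fault(struct model_stream *s, int *pages)
{
	if (atomic_load(&s->closed))
		return MODEL_FAULT_SIGBUS;

	*pages = s->buf->pages;	/* only reached while the stream is open */
	return MODEL_FAULT_OK;
}

/* Destroy path: mark closed first, then drop the buffer. */
void model_destroy(struct model_stream *s)
{
	atomic_store(&s->closed, true);
	/* the driver would invalidate the user mappings at this point */
	s->buf = NULL;
}
```

If the dereference came before the closed check, a fault after
model_destroy() would read through a NULL pointer instead of returning
the SIGBUS result.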

Thanks,
Umesh

>-Chris
