From: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
To: Chris Wilson <chris.p.wilson@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH 4/4] drm/i915/perf: Map OA buffer to user space for gen12 performance query
Date: Fri, 24 Jul 2020 11:47:37 -0700 [thread overview]
Message-ID: <20200724184737.GC28353@orsosgc001.amr.corp.intel.com> (raw)
In-Reply-To: <159560845126.2889.3198879925052513730@build.alporthouse.com>
On Fri, Jul 24, 2020 at 05:34:11PM +0100, Chris Wilson wrote:
>Quoting Umesh Nerlige Ramappa (2020-07-24 17:29:56)
>> On Fri, Jul 24, 2020 at 01:42:33PM +0100, Chris Wilson wrote:
>> >Quoting Umesh Nerlige Ramappa (2020-07-24 01:19:01)
>> >> From: Piotr Maciejewski <piotr.maciejewski@intel.com>
>> >>
>> >> i915 used to support time based sampling mode which is good for overall
>> >> system monitoring, but is not enough for query mode used to measure a
>> >> single draw call or dispatch. Gen9-Gen11 are using current i915 perf
>> >> implementation for query, but Gen12+ requires a new approach for query
>> >> based on triggered reports within oa buffer.
>> >>
>> >> Triggering reports into the OA buffer is achieved by writing into a
>> >> a trigger register. Optionally an unused counter/register is set with a
>> >> marker value such that a triggered report can be identified in the OA
>> >> buffer. Reports are usually triggered at the start and end of work that
>> >> is measured.
>> >>
>> >> Since OA buffer is large and queries can be frequent, an efficient way
>> >> to look for triggered reports is required. By knowing the current head
>> >> and tail offsets into the OA buffer, it is easier to determine the
>> >> locality of the reports of interest.
>> >>
>> >> Current perf OA interface does not expose head/tail information to the
>> >> user and it filters out invalid reports before sending data to user.
>> >> Also considering limited size of user buffer used during a query,
>> >> creating a 1:1 copy of the OA buffer at the user space added undesired
>> >> complexity.
>> >>
>> >> The solution was to map the OA buffer to user space provided
>> >>
>> >> (1) that it is accessed from a privileged user.
>> >> (2) OA report filtering is not used.
>> >>
>> >> These 2 conditions would satisfy the safety criteria that the current
>> >> perf interface addresses.
>> >>
>> >> To enable the query:
>> >> - Add an ioctl to expose head and tail to the user
>> >> - Add an ioctl to return size and offset of the OA buffer
>> >> - Map the OA buffer to the user space
>> >>
>> >> v2:
>> >> - Improve commit message (Chris)
>> >> - Do not mmap based on gem object filp. Instead, use perf_fd and support
>> >> mmap syscall (Chris)
>> >> - Pass non-zero offset in mmap to enforce the right object is
>> >> mapped (Chris)
>> >> - Do not expose gpu_address (Chris)
>> >> - Verify start and length of vma for page alignment (Lionel)
>> >> - Move SQNTL config out (Lionel)
>> >>
>> >> v3: (Chris)
>> >> - Omit redundant checks
>> >> - Return VM_FAULT_SIGBUS is old stream is closed
>> >> - Maintain reference counts to stream in vm_open and vm_close
>> >> - Use switch to identify object to be mapped
>> >>
>> >> v4: Call kref_put on closing perf fd (Chris)
>> >> v5:
>> >> - Strip access to OA buffer from unprivileged child of a privileged
>> >> parent. Use VM_DONTCOPY
>> >> - Enforce MAP_PRIVATE by checking for VM_MAYSHARE
>> >>
>> >> Signed-off-by: Piotr Maciejewski <piotr.maciejewski@intel.com>
>> >> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>> >> ---
>> >> @@ -3314,12 +3427,113 @@ static int i915_perf_release(struct inode *inode, struct file *file)
>> >> i915_perf_destroy_locked(stream);
>> >> mutex_unlock(&perf->lock);
>> >>
>> >> + unmap_mapping_range(file->f_mapping, 0, OA_BUFFER_SIZE, 1);
>> >
>> >You can just used unmap_mapping_range(file->f_mapping, 0, -1, 1);
>> >It scales with the number of vma present, so no worries, be conservative.
>> >(Otherwise, you need s/0/OA_BUFFER_OFFSET/.)
>> >
>> >> +
>> >> /* Release the reference the perf stream kept on the driver. */
>> >> drm_dev_put(&perf->i915->drm);
>> >>
>> >> return 0;
>> >> }
>> >>
>> >> +static void vm_open_oa(struct vm_area_struct *vma)
>> >> +{
>> >> + struct i915_perf_stream *stream = vma->vm_private_data;
>> >> +
>> >> + GEM_BUG_ON(!stream);
>> >> + perf_stream_get(stream);
>> >> +}
>> >> +
>> >> +static void vm_close_oa(struct vm_area_struct *vma)
>> >> +{
>> >> + struct i915_perf_stream *stream = vma->vm_private_data;
>> >> +
>> >> + GEM_BUG_ON(!stream);
>> >> + perf_stream_put(stream);
>> >> +}
>> >> +
>> >> +static vm_fault_t vm_fault_oa(struct vm_fault *vmf)
>> >> +{
>> >> + struct vm_area_struct *vma = vmf->vma;
>> >> + struct i915_perf_stream *stream = vma->vm_private_data;
>> >> + struct i915_perf *perf = stream->perf;
>> >> + struct drm_i915_gem_object *obj = stream->oa_buffer.vma->obj;
>> >> + int err;
>> >> + bool closed;
>> >
>> >So vm_area_struct has a reference to the stream, that looks good now.
>> >But there's no reference held to the vma itself.
>>
>> How do I get a reference to the vma.
>
>That would be i915_vma_get(), but you don't need to if we control the
>order correctly, as then neither the PTE nor the ongoing faulthandler
>last longer than the i915_perf_stream
I see that the do_mmap()->mmap_region() takes a reference to file
vma->vm_file = get_file(file);
In our case this is perf_fd. do_munmap does a corresponding fput.
so unmap_mapping_range() is never called unless both unmap() and
close(perf_fd) are called by user (or process terminates).
Is that good to take care of this ordering?
This also explains why I cannot get a VM_FAULT_SIGBUS with the IGTs.
>
>> >> + mutex_lock(&perf->lock);
>> >> + closed = READ_ONCE(stream->closed);
>> >> + mutex_unlock(&perf->lock);
>> >
>> >We do WRITE_ONCE(stream->closed, true) then invalidate all the mappings,
>> >so that part looks good. The invalidate is serialised with the
>> >vm_fault_oa, so we can just use a plain READ_ONCE(stream->closed) here
>> >and not worry about the perf->lock.
>>
>> will do
>> >
>> >However... I think it should close&invalidate before releasing
>> >stream->oa_buffer.
>>
>> will do
>> >
>> >And the read here of stream->oa_buffer should be after checking
>> >stream->closed.
>>
>> I don't understand. I am checking for closed before remap_io_sg.
>
>It's the
>
>struct drm_i915_gem_object *obj = stream->oa_buffer.vma->obj;
>
>that's before the stream->closed check. That's dereferencing vma, but vma
>will be set to NULL in i915_perf_destroy.
I will not use stream->oa_buffer.vma->obj in vm_fault_oa based on your
earlier comments, so this should be taken care of.
Thanks,
Umesh
>-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2020-07-24 18:47 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-24 0:18 [Intel-gfx] [PATCH 0/4] Allow privileged user to map the OA buffer Umesh Nerlige Ramappa
2020-07-24 0:18 ` [Intel-gfx] [PATCH 1/4] drm/i915/perf: Ensure observation logic is not clock gated Umesh Nerlige Ramappa
2020-07-24 0:18 ` [Intel-gfx] [PATCH 2/4] drm/i915/perf: Whitelist OA report trigger registers Umesh Nerlige Ramappa
2020-07-24 9:26 ` Chris Wilson
2020-07-24 10:07 ` Lionel Landwerlin
2020-07-24 10:19 ` Chris Wilson
2020-07-24 10:23 ` Chris Wilson
2020-07-24 10:33 ` Lionel Landwerlin
2020-07-24 11:27 ` Chris Wilson
2020-07-24 10:31 ` Lionel Landwerlin
2020-07-24 0:19 ` [Intel-gfx] [PATCH 3/4] drm/i915/perf: Whitelist OA counter and buffer registers Umesh Nerlige Ramappa
2020-07-24 8:55 ` Chris Wilson
2020-07-27 19:34 ` Umesh Nerlige Ramappa
2020-07-24 0:19 ` [Intel-gfx] [PATCH 4/4] drm/i915/perf: Map OA buffer to user space for gen12 performance query Umesh Nerlige Ramappa
2020-07-24 12:42 ` Chris Wilson
2020-07-24 16:29 ` Umesh Nerlige Ramappa
2020-07-24 16:34 ` Chris Wilson
2020-07-24 18:47 ` Umesh Nerlige Ramappa [this message]
2020-07-24 18:55 ` Chris Wilson
2020-07-24 19:35 ` Umesh Nerlige Ramappa
2020-07-24 19:46 ` Chris Wilson
2020-07-24 22:05 ` Umesh Nerlige Ramappa
2020-07-24 1:23 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for Allow privileged user to map the OA buffer (rev5) Patchwork
2020-07-24 1:44 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
-- strict thread matches above, loose matches on Subject: below --
2020-08-20 18:01 [Intel-gfx] [PATCH 0/4] Allow privileged user to map the OA buffer Umesh Nerlige Ramappa
2020-08-20 18:02 ` [Intel-gfx] [PATCH 4/4] drm/i915/perf: Map OA buffer to user space for gen12 performance query Umesh Nerlige Ramappa
2020-08-04 17:11 [Intel-gfx] [PATCH 0/4] Allow privileged user to map the OA buffer Umesh Nerlige Ramappa
2020-08-04 17:11 ` [Intel-gfx] [PATCH 4/4] drm/i915/perf: Map OA buffer to user space for gen12 performance query Umesh Nerlige Ramappa
2020-07-31 23:58 [Intel-gfx] [PATCH 0/4] Allow privileged user to map the OA buffer Umesh Nerlige Ramappa
2020-07-31 23:58 ` [Intel-gfx] [PATCH 4/4] drm/i915/perf: Map OA buffer to user space for gen12 performance query Umesh Nerlige Ramappa
2020-07-31 14:46 [Intel-gfx] [PATCH 0/4] Allow privileged user to map the OA buffer Umesh Nerlige Ramappa
2020-07-31 14:46 ` [Intel-gfx] [PATCH 4/4] drm/i915/perf: Map OA buffer to user space for gen12 performance query Umesh Nerlige Ramappa
2020-07-31 14:55 ` Chris Wilson
2020-07-31 6:07 [Intel-gfx] [PATCH 0/4] Allow privileged user to map the OA buffer Umesh Nerlige Ramappa
2020-07-31 6:07 ` [Intel-gfx] [PATCH 4/4] drm/i915/perf: Map OA buffer to user space for gen12 performance query Umesh Nerlige Ramappa
2020-07-31 9:35 ` Chris Wilson
2020-07-30 23:02 [Intel-gfx] [PATCH 0/4] Allow privileged user to map the OA buffer Umesh Nerlige Ramappa
2020-07-30 23:02 ` [Intel-gfx] [PATCH 4/4] drm/i915/perf: Map OA buffer to user space for gen12 performance query Umesh Nerlige Ramappa
2020-07-22 5:55 [Intel-gfx] [PATCH 0/4] Allow privileged user to map the OA buffer Umesh Nerlige Ramappa
2020-07-22 5:55 ` [Intel-gfx] [PATCH 4/4] drm/i915/perf: Map OA buffer to user space for gen12 performance query Umesh Nerlige Ramappa
2020-07-23 9:31 ` Lionel Landwerlin
2020-07-23 23:48 ` Umesh Nerlige Ramappa
2020-07-21 2:00 [Intel-gfx] [PATCH 0/4] Allow privileged user to map the OA buffer Umesh Nerlige Ramappa
2020-07-21 2:00 ` [Intel-gfx] [PATCH 4/4] drm/i915/perf: Map OA buffer to user space for gen12 performance query Umesh Nerlige Ramappa
2020-07-21 2:17 ` Umesh Nerlige Ramappa
2020-07-21 8:52 ` Chris Wilson
2020-07-18 0:04 [Intel-gfx] [PATCH 0/4] Allow privileged user to map the OA buffer Umesh Nerlige Ramappa
2020-07-18 0:04 ` [Intel-gfx] [PATCH 4/4] drm/i915/perf: Map OA buffer to user space for gen12 performance query Umesh Nerlige Ramappa
2020-07-18 11:44 ` kernel test robot
2020-07-20 12:21 ` Chris Wilson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200724184737.GC28353@orsosgc001.amr.corp.intel.com \
--to=umesh.nerlige.ramappa@intel.com \
--cc=chris.p.wilson@intel.com \
--cc=intel-gfx@lists.freedesktop.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).