All of lore.kernel.org
 help / color / mirror / Atom feed
From: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
To: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH 1/3] drm/i915/perf: rework aging tail workaround
Date: Mon, 23 Mar 2020 20:15:34 -0700	[thread overview]
Message-ID: <20200324031534.GA44281@orsosgc001.amr.corp.intel.com> (raw)
In-Reply-To: <87zhc9p9vx.wl-ashutosh.dixit@intel.com>

On Sat, Mar 21, 2020 at 04:26:42PM -0700, Dixit, Ashutosh wrote:
>On Thu, 19 Mar 2020 15:52:01 -0700, Umesh Nerlige Ramappa wrote:
>>
>> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>>
>> We're about to introduce an options to open the perf stream, giving
>> the user ability to configure how often it wants the kernel to poll
>> the OA registers for available data.
>>
>> Right now the workaround against the OA tail pointer race condition
>> requires at least twice the internal kernel polling timer to make any
>> data available.
>>
>> This changes introduce checks on the OA data written into the circular
>> buffer to make as much data as possible available on the first
>> iteration of the polling timer.
>
>/snip/
>
>> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
>> index 3222f6cd8255..c1429d3acaf9 100644
>> --- a/drivers/gpu/drm/i915/i915_perf.c
>> +++ b/drivers/gpu/drm/i915/i915_perf.c
>> @@ -223,26 +223,17 @@
>>   *
>>   * Although this can be observed explicitly while copying reports to userspace
>>   * by checking for a zeroed report-id field in tail reports, we want to account
>> - * for this earlier, as part of the oa_buffer_check to avoid lots of redundant
>> - * read() attempts.
>> - *
>> - * In effect we define a tail pointer for reading that lags the real tail
>> - * pointer by at least %OA_TAIL_MARGIN_NSEC nanoseconds, which gives enough
>> - * time for the corresponding reports to become visible to the CPU.
>> - *
>> - * To manage this we actually track two tail pointers:
>> - *  1) An 'aging' tail with an associated timestamp that is tracked until we
>> - *     can trust the corresponding data is visible to the CPU; at which point
>> - *     it is considered 'aged'.
>> - *  2) An 'aged' tail that can be used for read()ing.
>> - *
>> - * The two separate pointers let us decouple read()s from tail pointer aging.
>> - *
>> - * The tail pointers are checked and updated at a limited rate within a hrtimer
>> - * callback (the same callback that is used for delivering EPOLLIN events)
>> - *
>> - * Initially the tails are marked invalid with %INVALID_TAIL_PTR which
>> - * indicates that an updated tail pointer is needed.
>> + * for this earlier, as part of the oa_buffer_check_unlocked to avoid lots of
>> + * redundant read() attempts.
>> + *
>> + * We workaround this issue in oa_buffer_check_unlocked() by reading the reports
>> + * in the OA buffer, starting from the tail reported by the HW until we find 2
>> + * consecutive reports with their first 2 dwords of not at 0. Those dwords are
>
>until we find a report with its first 2 dwords not 0 meaning its previous
>report is completely in memory and ready to be read.
>
>> + * also set to 0 once read and the whole buffer is cleared upon OA buffer
>> + * initialization. The first dword is the reason for this report while the
>> + * second is the timestamp, making the chances of having those 2 fields at 0
>> + * fairly unlikely. A more detailed explanation is available in
>> + * oa_buffer_check_unlocked().
>
>> @@ -477,16 +468,6 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>>	 */
>>	spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
>>
>>	hw_tail = stream->perf->ops.oa_hw_tail_read(stream);
>>
>>	hw_tail &= ~(report_size - 1);
>>
>> @@ -496,64 +477,64 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>>
>>	now = ktime_get_mono_fast_ns();
>>
>> +	if (hw_tail == stream->oa_buffer.aging_tail &&
>> +	   (now - stream->oa_buffer.aging_timestamp) > OA_TAIL_MARGIN_NSEC) {
>> +		/* If the HW tail hasn't move since the last check and the HW
>> +		 * tail has been aging for long enough, declare it the new
>> +		 * tail.
>> +		 */
>> +		stream->oa_buffer.tail = stream->oa_buffer.aging_tail;
>> +	} else {
>> +		u32 head, tail;
>>
>> +		/* NB: The head we observe here might effectively be a little
>> +		 * out of date. If a read() is in progress, the head could be
>> +		 * anywhere between this head and stream->oa_buffer.tail.
>> +		 */
>> +		head = stream->oa_buffer.head - gtt_offset;
>>
>> +		hw_tail -= gtt_offset;
>> +		tail = hw_tail;
>>
>> +		/* Walk the stream backward until we find a report with dword 0
>> +		 * & 1 not at 0. Since the circular buffer pointers progress by
>> +		 * increments of 64 bytes and that reports can be up to 256
>> +		 * bytes long, we can't tell whether a report has fully landed
>> +		 * in memory before the first 2 dwords of the following report
>> +		 * have effectively landed.
>> +		 *
>> +		 * This is assuming that the writes of the OA unit land in
>> +		 * memory in the order they were written to.
>> +		 * If not : (╯°□°)╯︵ ┻━┻
>>		 */
>> -		if (hw_tail >= gtt_offset &&
>> -		    hw_tail < (gtt_offset + OA_BUFFER_SIZE)) {
>> -			stream->oa_buffer.tails[!aged_idx].offset =
>> -				aging_tail = hw_tail;
>> -			stream->oa_buffer.aging_timestamp = now;
>> -		} else {
>> -			drm_err(&stream->perf->i915->drm,
>> -				"Ignoring spurious out of range OA buffer tail pointer = %x\n",
>> -				hw_tail);
>> +		while (OA_TAKEN(tail, head) >= report_size) {
>> +			u32 previous_tail = (tail - report_size) & (OA_BUFFER_SIZE - 1);
>> +			u32 *report32 = (void *)(stream->oa_buffer.vaddr + previous_tail);
>
>Sorry, this is wrong. This should just be:
>
>			tail = (tail - report_size) & (OA_BUFFER_SIZE - 1);
>			report32 = (void *)(stream->oa_buffer.vaddr + tail);
>
>Otherwise when we break out of the loop below tail is still set one
>report_size ahead. previous_tail is not needed. (In the previous version of
>the patch this used to work out correctly).

oh, that's right. tail should point to the first report that has 
non-zero dwords. everything before that should be considered landed.  

Thanks,
Umesh

>
>> +
>> +			/* Head of the report indicated by the HW tail register has
>> +			 * indeed landed into memory.
>> +			 */
>> +			if (report32[0] != 0 || report32[1] != 0)
>> +				break;
>> +
>> +			tail = previous_tail;
>>		}
>> +
>> +		if (((tail - hw_tail) & (OA_BUFFER_SIZE - 1)) > report_size &&
>
>nit: OA_TAKEN(hw_tail, tail) > report_size?
>
>> +		    __ratelimit(&stream->perf->tail_pointer_race))
>> +			DRM_NOTE("unlanded report(s) head=0x%x "
>> +				 "tail=0x%x hw_tail=0x%x\n",
>> +				 head, tail, hw_tail);
>> +
>> +		stream->oa_buffer.tail = gtt_offset + tail;
>> +		stream->oa_buffer.aging_tail = gtt_offset + hw_tail;
>> +		stream->oa_buffer.aging_timestamp = now;
>>	}
>>
>>	spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags);
>>
>> -	return aged_tail == INVALID_TAIL_PTR ?
>> -		false : OA_TAKEN(aged_tail, head) >= report_size;
>> +	return OA_TAKEN(stream->oa_buffer.tail - gtt_offset,
>> +			stream->oa_buffer.head - gtt_offset) >= report_size;
>>  }
>
>> @@ -303,6 +292,12 @@ struct i915_perf_stream {
>>		 * OA buffer data to userspace.
>>		 */
>>		u32 head;
>> +
>> +		/**
>> +		 * @tail: The last tail verified tail that can be read by
>
>The last verified tail
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  parent reply	other threads:[~2020-03-24  3:15 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-19 22:52 [Intel-gfx] [PATCH 0/3] drm/i915/perf: add OA interrupt support Umesh Nerlige Ramappa
2020-03-19 22:52 ` [Intel-gfx] [PATCH 1/3] drm/i915/perf: rework aging tail workaround Umesh Nerlige Ramappa
2020-03-21 23:26   ` Dixit, Ashutosh
2020-03-22  4:44     ` Dixit, Ashutosh
2020-03-22 19:47       ` Dixit, Ashutosh
2020-03-24  3:17       ` Umesh Nerlige Ramappa
2020-03-24  3:15     ` Umesh Nerlige Ramappa [this message]
2020-03-19 22:52 ` [Intel-gfx] [PATCH 2/3] drm/i915/perf: move pollin setup to non hw specific code Umesh Nerlige Ramappa
2020-03-19 22:52 ` [Intel-gfx] [PATCH 3/3] drm/i915/perf: add new open param to configure polling of OA buffer Umesh Nerlige Ramappa
2020-03-21  7:16   ` Dixit, Ashutosh
2020-03-19 23:40 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/perf: add OA interrupt support (rev7) Patchwork
2020-03-19 23:43 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2020-03-20  0:04 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2020-03-24 18:54 [Intel-gfx] [PATCH 0/3] drm/i915/perf: add OA interrupt support Umesh Nerlige Ramappa
2020-03-24 18:54 ` [Intel-gfx] [PATCH 1/3] drm/i915/perf: rework aging tail workaround Umesh Nerlige Ramappa
2020-03-24 21:25   ` Dixit, Ashutosh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200324031534.GA44281@orsosgc001.amr.corp.intel.com \
    --to=umesh.nerlige.ramappa@intel.com \
    --cc=ashutosh.dixit@intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.