All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tomas Elf <tomas.elf@intel.com>
To: Daniel Vetter <daniel@ffwll.ch>
Cc: Intel-GFX@Lists.FreeDesktop.Org
Subject: Re: [PATCH 3/8] drm/i915: Cope with request list state change during error state capture
Date: Fri, 09 Oct 2015 12:45:07 +0100	[thread overview]
Message-ID: <5617A8C3.1030700@intel.com> (raw)
In-Reply-To: <20151009082825.GH26718@phenom.ffwll.local>

On 09/10/2015 09:28, Daniel Vetter wrote:
> On Thu, Oct 08, 2015 at 07:31:35PM +0100, Tomas Elf wrote:
>> Since we're not synchronizing the ring request list during error state capture
>> the request list state might change between the time the corresponding error
>> request list was allocated and dimensioned to the time when the ring request
>> list is actually captured into the error state. If this happens, throw a
>> WARNING and do early exit and be aware that the captured error state might not
>> be fully reliable.
>
> Please don't throw a WARNING since this is expected to occasionally
> happen. DRM_DEBUG_DRIVER is enough imo.
> -Daniel

I still don't see how it could happen without leading to reads of 
unallocated memory. The error state request list has been allocated to a 
certain size equal to num_requests and this loop seems to assume that 
the error state request list maintains the same size as the driver 
request list, which is not the case - leading to crashes, which is how I 
happened to notice it.

I can obviously remove the warning but are you saying we shouldn't even 
take action if it happens? Such as early exit?

Thanks,
Tomas

>
>>
>> Signed-off-by: Tomas Elf <tomas.elf@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_gpu_error.c | 13 +++++++++++++
>>   1 file changed, 13 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
>> index 32c1799..cc75ca4 100644
>> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
>> @@ -1071,6 +1071,19 @@ static void i915_gem_record_rings(struct drm_device *dev,
>>   		list_for_each_entry_safe(request, tmpreq, &ring->request_list, list) {
>>   			struct drm_i915_error_request *erq;
>>
>> +			if (WARN_ON(!request || count >= error->ring[i].num_requests)) {
>> +				/*
>> +				 * If the ring request list was changed in
>> +				 * between the point where the error request
>> +				 * list was created and dimensioned and this
>> +				 * point then just update the num_requests
>> +				 * field to reflect this.
>> +				 */
>> +				error->ring[i].num_requests =
>> +					min(count, error->ring[i].num_requests);
>> +				break;
>> +			}
>> +
>>   			erq = &error->ring[i].requests[count++];
>>   			erq->seqno = request->seqno;
>>   			erq->jiffies = request->emitted_jiffies;
>> --
>> 1.9.1
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2015-10-09 11:45 UTC|newest]

Thread overview: 68+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-08 18:31 [PATCH 0/8] Stability improvements to error state capture Tomas Elf
2015-10-08 18:31 ` [PATCH 1/8] drm/i915: Early exit from semaphore_waits_for for execlist mode Tomas Elf
2015-10-08 18:31 ` [PATCH 2/8] drm/i915: Migrate to safe iterators in error state capture Tomas Elf
2015-10-09  7:49   ` Chris Wilson
2015-10-09 11:38     ` Tomas Elf
2015-10-09  8:27   ` Daniel Vetter
2015-10-09 11:40     ` Tomas Elf
2015-10-13 11:37       ` Daniel Vetter
2015-10-13 11:47         ` Chris Wilson
2015-10-08 18:31 ` [PATCH 3/8] drm/i915: Cope with request list state change during " Tomas Elf
2015-10-09  7:48   ` Chris Wilson
2015-10-09 11:25     ` Tomas Elf
2015-10-13 11:39       ` Daniel Vetter
2015-10-14 11:46         ` Tomas Elf
2015-10-14 12:45           ` Daniel Vetter
2015-10-09  8:28   ` Daniel Vetter
2015-10-09 11:45     ` Tomas Elf [this message]
2015-10-13 11:40       ` Daniel Vetter
2015-10-08 18:31 ` [PATCH 4/8] drm/i915: NULL checking when capturing buffer objects " Tomas Elf
2015-10-09  7:49   ` Chris Wilson
2015-10-09 11:34     ` Tomas Elf
2015-10-09  8:32   ` Daniel Vetter
2015-10-09  8:47     ` Chris Wilson
2015-10-09 11:52       ` Tomas Elf
2015-10-09 11:45     ` Tomas Elf
2015-10-08 18:31 ` [PATCH 5/8] drm/i915: vma NULL pointer check Tomas Elf
2015-10-09  7:48   ` Chris Wilson
2015-10-09 11:30     ` Tomas Elf
2015-10-09 11:59       ` Chris Wilson
2015-10-13 11:43         ` Daniel Vetter
2015-10-09  8:33   ` Daniel Vetter
2015-10-09 11:46     ` Tomas Elf
2015-10-08 18:31 ` [PATCH 6/8] drm/i915: Use safe list iterators Tomas Elf
2015-10-09  7:41   ` Chris Wilson
2015-10-09 10:27     ` Tomas Elf
2015-10-09 10:38       ` Chris Wilson
2015-10-09 12:00         ` Tomas Elf
2015-10-08 18:31 ` [PATCH 7/8] drm/i915: Grab execlist spinlock to avoid post-reset concurrency issues Tomas Elf
2015-10-09  7:45   ` Chris Wilson
2015-10-09 10:28     ` Tomas Elf
2015-10-09  8:38   ` Daniel Vetter
2015-10-09  8:45     ` Chris Wilson
2015-10-13 11:46       ` Daniel Vetter
2015-10-13 11:45         ` Chris Wilson
2015-10-13 13:46           ` Daniel Vetter
2015-10-13 14:00             ` Chris Wilson
2015-10-19 15:32   ` [PATCH v2 " Tomas Elf
2015-10-22 16:49     ` Dave Gordon
2015-10-22 17:35       ` Daniel Vetter
2015-10-23  8:42     ` Tvrtko Ursulin
2015-10-23  8:59       ` Daniel Vetter
2015-10-23 11:02         ` Tomas Elf
2015-10-23 12:49           ` Dave Gordon
2015-10-23 13:08     ` [PATCH v3 " Tomas Elf
2015-10-23 14:53       ` Daniel, Thomas
2015-10-23 17:02     ` [PATCH] drm/i915: Update to post-reset execlist queue clean-up Tomas Elf
2015-12-01 11:46       ` Tvrtko Ursulin
2015-12-11 14:14         ` Dave Gordon
2015-12-11 16:40           ` Daniel Vetter
2015-12-14 10:21           ` Mika Kuoppala
2015-10-08 18:31 ` [PATCH 8/8] drm/i915: NULL check of unpin_work Tomas Elf
2015-10-09  7:46   ` Chris Wilson
2015-10-09  8:39     ` Daniel Vetter
2015-10-09 11:50       ` Tomas Elf
2015-10-09 10:30     ` Tomas Elf
2015-10-09 10:44       ` Chris Wilson
2015-10-09 12:06         ` Tomas Elf
2015-10-13 11:51           ` Daniel Vetter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5617A8C3.1030700@intel.com \
    --to=tomas.elf@intel.com \
    --cc=Intel-GFX@Lists.FreeDesktop.Org \
    --cc=daniel@ffwll.ch \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.