From: "Christian König" <christian.koenig@amd.com>
To: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Cc: Ken.Xue@amd.com, "Deucher, Alexander" <alexander.deucher@amd.com>,
	amd-gfx list <amd-gfx@lists.freedesktop.org>,
	Daniel Vetter <daniel.vetter@ffwll.ch>,
	thomas.hellstrom@linux.intel.com,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>
Subject: Re: [Bug][5.18-rc0] Between commits ed4643521e6a and 34af78c4e616, appears warning "WARNING: CPU: 31 PID: 51848 at drivers/dma-buf/dma-fence-array.c:191 dma_fence_array_create+0x101/0x120" and some games stopped working.
Date: Fri, 8 Apr 2022 16:27:19 +0200
Message-ID: <eef04fc4-741d-606c-c2c6-f054e4e3fffd@amd.com>
In-Reply-To: <CABXGCsPi68Lyvg+6UjTK2aJm6PVBs83YJuP6x68mcrzAQgpuZg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1208 bytes --]

On 08.04.22 at 14:24, Mikhail Gavrilov wrote:
> On Fri, 8 Apr 2022 at 16:13, Christian König <christian.koenig@amd.com> wrote:
>
>> I owe you a beer.
>>
>> I still don't know what happens here, but that makes at least a bit more
>> sense than a patch which only changes comments :)
>>
>> Looks like we are missing something here. Can I send you a patch to try
>> something later today?
> Yes, please feel free to send me a patch for testing.
>

Please test the attached patch; it just reintroduces the lock without
doing much else.

Also, does your branch contain the following patch?

commit d18b8eadd83e3d8d63a45f9479478640dbcfca02
Author: Christian König <christian.koenig@amd.com>
Date:   Wed Feb 23 14:35:31 2022 +0100

     drm/amdgpu: install ctx entities with cmpxchg

     Since we removed the context lock we need to make sure that not two
     threads are trying to install an entity at the same time.

     Signed-off-by: Christian König <christian.koenig@amd.com>
     Fixes: 461fa7b0ac565e ("drm/amdgpu: remove ctx->lock")
     Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
     Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Thanks,
Christian.
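
Note on the commit quoted above: the idea of installing an entity with a compare-and-exchange instead of holding a lock can be sketched in userspace C11 as follows. This is only an illustration under stated assumptions, not the amdgpu code. The names `sketch_entity`, `sketch_ctx` and `sketch_get_entity` are hypothetical, and `atomic_compare_exchange_strong()` from `<stdatomic.h>` stands in for the kernel's `cmpxchg()`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the amdgpu structures. */
struct sketch_entity {
	int ring;
};

struct sketch_ctx {
	_Atomic(struct sketch_entity *) slot;	/* NULL until installed */
};

/*
 * Return the entity for this context, creating it on first use.
 * If two threads race, exactly one allocation is published; the
 * loser frees its own copy and uses the winner's.
 */
static struct sketch_entity *sketch_get_entity(struct sketch_ctx *c, int ring)
{
	struct sketch_entity *cur = atomic_load(&c->slot);
	struct sketch_entity *expected = NULL;
	struct sketch_entity *fresh;

	if (cur)
		return cur;

	fresh = malloc(sizeof(*fresh));
	if (!fresh)
		return NULL;
	fresh->ring = ring;

	/* Publish only if the slot is still empty. */
	if (atomic_compare_exchange_strong(&c->slot, &expected, fresh))
		return fresh;

	/* Lost the race: 'expected' now holds the installed entity. */
	free(fresh);
	assert(expected != NULL);
	return expected;
}
```

A second call on the same context returns the entity installed by the first call, regardless of the ring argument passed the second time.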

[-- Attachment #2: 0001-drm-amdgpu-partial-revert-remove-ctx-lock.patch --]
[-- Type: text/x-patch, Size: 2754 bytes --]

From e2e39cb1a4a1c7c0e3ff2e4e0188394b0eda0ba6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig@amd.com>
Date: Fri, 8 Apr 2022 16:22:55 +0200
Subject: [PATCH] drm/amdgpu: partial revert "remove ctx->lock"
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This partially reverts commit 461fa7b0ac565ef25c1da0ced31005dd437883a7.

We are missing some interdependencies here.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  | 4 ++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h | 1 +
 3 files changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 8de283997769..5471b93f6808 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -128,6 +128,8 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, union drm_amdgpu_cs
 		goto free_chunk;
 	}
 
+	mutex_lock(&p->ctx->lock);
+
 	/* skip guilty context job */
 	if (atomic_read(&p->ctx->guilty) == 1) {
 		ret = -ECANCELED;
@@ -688,6 +690,7 @@ static void amdgpu_cs_parser_fini(struct amdgpu_cs_parser *parser, int error,
 	dma_fence_put(parser->fence);
 
 	if (parser->ctx) {
+		mutex_unlock(&parser->ctx->lock);
 		amdgpu_ctx_put(parser->ctx);
 	}
 	if (parser->bo_list)
@@ -1332,6 +1335,7 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
 		goto out;
 
 	r = amdgpu_cs_submit(&parser, cs);
+
 out:
 	amdgpu_cs_parser_fini(&parser, r, reserved_buffers);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index 5981c7d9bd48..8f0e6d93bb9c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -237,6 +237,7 @@ static int amdgpu_ctx_init(struct amdgpu_device *adev,
 
 	kref_init(&ctx->refcount);
 	spin_lock_init(&ctx->ring_lock);
+	mutex_init(&ctx->lock);
 
 	ctx->reset_counter = atomic_read(&adev->gpu_reset_counter);
 	ctx->reset_counter_query = ctx->reset_counter;
@@ -357,6 +358,7 @@ static void amdgpu_ctx_fini(struct kref *ref)
 		drm_dev_exit(idx);
 	}
 
+	mutex_destroy(&ctx->lock);
 	kfree(ctx);
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
index d0cbfcea90f7..142f2f87d44c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
@@ -49,6 +49,7 @@ struct amdgpu_ctx {
 	bool				preamble_presented;
 	int32_t				init_priority;
 	int32_t				override_priority;
+	struct mutex			lock;
 	atomic_t			guilty;
 	unsigned long			ras_counter_ce;
 	unsigned long			ras_counter_ue;
-- 
2.25.1
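
Note on the patch above: its locking shape is that the mutex is taken in the parser init path only after the context lookup has succeeded, and released in the fini path only when a context is attached. A minimal userspace sketch of that pairing, using pthreads in place of the kernel mutex API, might look like this. The names `lk_ctx`, `lk_parser`, `lk_parser_init` and `lk_parser_fini` are hypothetical and the real functions do far more work.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the amdgpu structures. */
struct lk_ctx {
	pthread_mutex_t lock;
};

struct lk_parser {
	struct lk_ctx *ctx;	/* non-NULL only while ctx->lock is held */
};

/* Take the lock right after the context lookup has succeeded. */
static int lk_parser_init(struct lk_parser *p, struct lk_ctx *found)
{
	p->ctx = found;
	if (!p->ctx)
		return -EINVAL;	/* no context, so no lock is taken */

	pthread_mutex_lock(&p->ctx->lock);
	return 0;
}

/* Release the lock only if init got far enough to take it. */
static void lk_parser_fini(struct lk_parser *p)
{
	if (p->ctx) {
		pthread_mutex_unlock(&p->ctx->lock);
		p->ctx = NULL;
	}
}
```

Because the unlock is keyed on the same pointer that guards the lock, calling the fini function after a failed lookup is safe and never unlocks a mutex that was not taken.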

