linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/3] Venus - Handle race conditions in concurrency
@ 2020-09-10 12:44 Mansur Alisha Shaik
  2020-09-10 12:44 ` [PATCH v2 1/3] venus: core: handle race condititon for core ops Mansur Alisha Shaik
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Mansur Alisha Shaik @ 2020-09-10 12:44 UTC (permalink / raw)
  To: linux-media, stanimir.varbanov
  Cc: linux-kernel, linux-arm-msm, vgarodia, Mansur Alisha Shaik

This patchset handles race conditions seen in concurrency
use cases such as multiple YouTube browser tabs (approximately
50 or more), graphics_Stress, WiFi ON/OFF, Bluetooth ON/OFF,
and reboot, all running in parallel.

Mansur Alisha Shaik (3):
  venus: core: handle race condititon for core ops
  venus: core: cancel pending work items in workqueue
  venus: handle use after free for iommu_map/iommu_unmap

 drivers/media/platform/qcom/venus/core.c     |  4 ++++
 drivers/media/platform/qcom/venus/firmware.c | 17 +++++++++++++----
 drivers/media/platform/qcom/venus/hfi.c      |  5 ++++-
 3 files changed, 21 insertions(+), 5 deletions(-)

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member 
of Code Aurora Forum, hosted by The Linux Foundation



* [PATCH v2 1/3] venus: core: handle race condititon for core ops
  2020-09-10 12:44 [PATCH v2 0/3] Venus - Handle race conditions in concurrency Mansur Alisha Shaik
@ 2020-09-10 12:44 ` Mansur Alisha Shaik
  2020-09-11 10:10   ` Stanimir Varbanov
  2020-09-10 12:44 ` [PATCH v2 2/3] venus: core: cancel pending work items in workqueue Mansur Alisha Shaik
  2020-09-10 12:44 ` [PATCH v2 3/3] venus: handle use after free for iommu_map/iommu_unmap Mansur Alisha Shaik
  2 siblings, 1 reply; 8+ messages in thread
From: Mansur Alisha Shaik @ 2020-09-10 12:44 UTC (permalink / raw)
  To: linux-media, stanimir.varbanov
  Cc: linux-kernel, linux-arm-msm, vgarodia, Mansur Alisha Shaik

For core ops we have only write protection and no read
protection. Because of this, in multithreaded and concurrent
use, one CPU core can read core->ops without waiting, which
causes a NULL pointer dereference crash.

One such scenario is shown below: core->ops becomes NULL on
one CPU core while another CPU core is calling
core->ops->session_init().

CPU: core-7:
Call trace:
 hfi_session_init+0x180/0x1dc [venus_core]
 vdec_queue_setup+0x9c/0x364 [venus_dec]
 vb2_core_reqbufs+0x1e4/0x368 [videobuf2_common]
 vb2_reqbufs+0x4c/0x64 [videobuf2_v4l2]
 v4l2_m2m_reqbufs+0x50/0x84 [v4l2_mem2mem]
 v4l2_m2m_ioctl_reqbufs+0x2c/0x38 [v4l2_mem2mem]
 v4l_reqbufs+0x4c/0x5c
__video_do_ioctl+0x2b0/0x39c

CPU: core-0:
Call trace:
 venus_shutdown+0x98/0xfc [venus_core]
 venus_sys_error_handler+0x64/0x148 [venus_core]
 process_one_work+0x210/0x3d0
 worker_thread+0x248/0x3f4
 kthread+0x11c/0x12c

Signed-off-by: Mansur Alisha Shaik <mansur@codeaurora.org>
Acked-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
---
Changes in V2:
- Addressed review comments by Stan by validating on top of
  https://lore.kernel.org/patchwork/project/lkml/list/?series=455962

 drivers/media/platform/qcom/venus/hfi.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/media/platform/qcom/venus/hfi.c b/drivers/media/platform/qcom/venus/hfi.c
index a59022a..3137071 100644
--- a/drivers/media/platform/qcom/venus/hfi.c
+++ b/drivers/media/platform/qcom/venus/hfi.c
@@ -195,7 +195,7 @@ EXPORT_SYMBOL_GPL(hfi_session_create);
 int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
 {
 	struct venus_core *core = inst->core;
-	const struct hfi_ops *ops = core->ops;
+	const struct hfi_ops *ops;
 	int ret;
 
 	if (inst->state != INST_UNINIT)
@@ -204,10 +204,13 @@ int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
 	inst->hfi_codec = to_codec_type(pixfmt);
 	reinit_completion(&inst->done);
 
+	mutex_lock(&core->lock);
+	ops = core->ops;
 	ret = ops->session_init(inst, inst->session_type, inst->hfi_codec);
 	if (ret)
 		return ret;
 
+	mutex_unlock(&core->lock);
 	ret = wait_session_msg(inst);
 	if (ret)
 		return ret;



* [PATCH v2 2/3] venus: core: cancel pending work items in workqueue
  2020-09-10 12:44 [PATCH v2 0/3] Venus - Handle race conditions in concurrency Mansur Alisha Shaik
  2020-09-10 12:44 ` [PATCH v2 1/3] venus: core: handle race condititon for core ops Mansur Alisha Shaik
@ 2020-09-10 12:44 ` Mansur Alisha Shaik
  2020-09-11 10:22   ` Stanimir Varbanov
  2020-09-10 12:44 ` [PATCH v2 3/3] venus: handle use after free for iommu_map/iommu_unmap Mansur Alisha Shaik
  2 siblings, 1 reply; 8+ messages in thread
From: Mansur Alisha Shaik @ 2020-09-10 12:44 UTC (permalink / raw)
  To: linux-media, stanimir.varbanov
  Cc: linux-kernel, linux-arm-msm, vgarodia, Mansur Alisha Shaik

In concurrency use cases and the reboot scenario we observe a
race condition that leads to a NULL pointer dereference crash.
The shutdown path and the system recovery path both destroy
the same mutex, hence the crash.

This is handled by adding mutex protection and cancelling the
delayed work items in the workqueue.

Below are the call traces for the crash:
Call trace:
 venus_remove+0xdc/0xec [venus_core]
 venus_core_shutdown+0x1c/0x34 [venus_core]
 platform_drv_shutdown+0x28/0x34
 device_shutdown+0x154/0x1fc
 kernel_restart_prepare+0x40/0x4c
 kernel_restart+0x1c/0x64

Call trace:
 mutex_lock+0x34/0x60
 venus_hfi_destroy+0x28/0x98 [venus_core]
 hfi_destroy+0x1c/0x28 [venus_core]
 venus_sys_error_handler+0x60/0x14c [venus_core]
 process_one_work+0x210/0x3d0
 worker_thread+0x248/0x3f4
 kthread+0x11c/0x12c
 ret_from_fork+0x10/0x18

Signed-off-by: Mansur Alisha Shaik <mansur@codeaurora.org>
---
 drivers/media/platform/qcom/venus/core.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/media/platform/qcom/venus/core.c b/drivers/media/platform/qcom/venus/core.c
index c5af428..69aa199 100644
--- a/drivers/media/platform/qcom/venus/core.c
+++ b/drivers/media/platform/qcom/venus/core.c
@@ -323,6 +323,8 @@ static int venus_remove(struct platform_device *pdev)
 	struct device *dev = core->dev;
 	int ret;
 
+	cancel_delayed_work_sync(&core->work);
+
 	ret = pm_runtime_get_sync(dev);
 	WARN_ON(ret < 0);
 
@@ -340,7 +342,9 @@ static int venus_remove(struct platform_device *pdev)
 	if (pm_ops->core_put)
 		pm_ops->core_put(dev);
 
+	mutex_lock(&core->lock);
 	hfi_destroy(core);
+	mutex_unlock(&core->lock);
 
 	icc_put(core->video_path);
 	icc_put(core->cpucfg_path);
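
The ordering argument behind this patch (stop the pending work before tearing down what it uses) can be illustrated with a small userspace sketch. It is only a model under stated assumptions: the names model_core, model_recovery_work, model_core_start and model_core_remove are hypothetical, and pthread_join() merely stands in for the role cancel_delayed_work_sync() plays in venus_remove(). Unlike the kernel call, pthread_join() always waits for the worker to finish rather than cancelling work that has not started.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names come from the driver. */
struct model_core {
	pthread_mutex_t lock;
	pthread_t worker;
	bool worker_started;
	int recoveries;		/* state the worker updates under the lock */
};

/* Stands in for the recovery work item: it takes the core lock. */
static void *model_recovery_work(void *arg)
{
	struct model_core *core = arg;

	pthread_mutex_lock(&core->lock);
	core->recoveries++;
	pthread_mutex_unlock(&core->lock);
	return NULL;
}

/* Initialise the lock, then start the worker that will use it. */
static int model_core_start(struct model_core *core)
{
	int ret;

	core->recoveries = 0;
	core->worker_started = false;

	ret = pthread_mutex_init(&core->lock, NULL);
	if (ret)
		return ret;

	ret = pthread_create(&core->worker, NULL, model_recovery_work, core);
	if (ret) {
		pthread_mutex_destroy(&core->lock);
		return ret;
	}
	core->worker_started = true;
	return 0;
}

/*
 * Teardown order matters: wait for the worker first, and only then
 * destroy the mutex it uses. Destroying the mutex first would let the
 * worker lock an already destroyed mutex.
 */
static int model_core_remove(struct model_core *core)
{
	if (core->worker_started) {
		pthread_join(core->worker, NULL);
		core->worker_started = false;
	}
	return pthread_mutex_destroy(&core->lock);
}
```

Calling model_core_start() followed by model_core_remove() on the same structure exercises the safe order; swapping the join and the destroy inside model_core_remove() would reintroduce the kind of race described in the commit message.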



* [PATCH v2 3/3] venus: handle use after free for iommu_map/iommu_unmap
  2020-09-10 12:44 [PATCH v2 0/3] Venus - Handle race conditions in concurrency Mansur Alisha Shaik
  2020-09-10 12:44 ` [PATCH v2 1/3] venus: core: handle race condititon for core ops Mansur Alisha Shaik
  2020-09-10 12:44 ` [PATCH v2 2/3] venus: core: cancel pending work items in workqueue Mansur Alisha Shaik
@ 2020-09-10 12:44 ` Mansur Alisha Shaik
  2 siblings, 0 replies; 8+ messages in thread
From: Mansur Alisha Shaik @ 2020-09-10 12:44 UTC (permalink / raw)
  To: linux-media, stanimir.varbanov
  Cc: linux-kernel, linux-arm-msm, vgarodia, Mansur Alisha Shaik

In concurrency use cases and the reboot scenario we see multiple
crashes related to iommu_map/iommu_unmap of core->fw.iommu_domain.

In the first case we see an "Unable to handle kernel NULL pointer
dereference at virtual address 0000000000000008" crash. This happens
because core->fw.iommu_domain is freed in venus_firmware_deinit() while
venus_boot() tries to map it during venus_sys_error_handler().

Call trace:
 __iommu_map+0x4c/0x348
 iommu_map+0x5c/0x70
 venus_boot+0x184/0x230 [venus_core]
 venus_sys_error_handler+0xa0/0x14c [venus_core]
 process_one_work+0x210/0x3d0
 worker_thread+0x248/0x3f4
 kthread+0x11c/0x12c
 ret_from_fork+0x10/0x18

In the second case we see an "Unable to handle kernel paging request
at virtual address 006b6b6b6b6b6b9b" crash. This happens because an
iommu domain that has already been unmapped is unmapped again.

Call trace:
 venus_remove+0xf8/0x108 [venus_core]
 venus_core_shutdown+0x1c/0x34 [venus_core]
 platform_drv_shutdown+0x28/0x34
 device_shutdown+0x154/0x1fc
 kernel_restart_prepare+0x40/0x4c
 kernel_restart+0x1c/0x64
 __arm64_sys_reboot+0x190/0x238
 el0_svc_common+0xa4/0x154
 el0_svc_compat_handler+0x2c/0x38
 el0_svc_compat+0x8/0x10


Signed-off-by: Mansur Alisha Shaik <mansur@codeaurora.org>
---
Changes in V2:
- Addressed review comments by stan
- Elaborated commit message

 drivers/media/platform/qcom/venus/firmware.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/media/platform/qcom/venus/firmware.c b/drivers/media/platform/qcom/venus/firmware.c
index 8801a6a..c427e88 100644
--- a/drivers/media/platform/qcom/venus/firmware.c
+++ b/drivers/media/platform/qcom/venus/firmware.c
@@ -171,9 +171,14 @@ static int venus_shutdown_no_tz(struct venus_core *core)
 
 	iommu = core->fw.iommu_domain;
 
-	unmapped = iommu_unmap(iommu, VENUS_FW_START_ADDR, mapped);
-	if (unmapped != mapped)
-		dev_err(dev, "failed to unmap firmware\n");
+	if (core->fw.mapped_mem_size && iommu) {
+		unmapped = iommu_unmap(iommu, VENUS_FW_START_ADDR, mapped);
+
+		if (unmapped != mapped)
+			dev_err(dev, "failed to unmap firmware\n");
+		else
+			core->fw.mapped_mem_size = 0;
+	}
 
 	return 0;
 }
@@ -288,7 +293,11 @@ void venus_firmware_deinit(struct venus_core *core)
 	iommu = core->fw.iommu_domain;
 
 	iommu_detach_device(iommu, core->fw.dev);
-	iommu_domain_free(iommu);
+
+	if (core->fw.iommu_domain) {
+		iommu_domain_free(iommu);
+		core->fw.iommu_domain = NULL;
+	}
 
 	platform_device_unregister(to_platform_device(core->fw.dev));
 }
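
The guards added in this patch follow a common "make teardown idempotent" pattern: remember whether there is anything to undo, and clear that record once it is undone. A minimal userspace sketch of the idea is shown below. All names (model_fw, model_fw_init, model_fw_unmap, model_fw_deinit) are hypothetical, malloc()/free() stand in for the domain allocation and iommu_domain_free(), and no real IOMMU calls are made. Note that such checks alone only make repeated calls harmless; they do not serialise two paths that run at the same time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of the firmware state; not the driver's structures. */
struct model_fw {
	void *domain;		/* stands in for core->fw.iommu_domain */
	size_t mapped_size;	/* stands in for core->fw.mapped_mem_size */
};

/* Allocate the pretend domain and record how much is mapped. */
static int model_fw_init(struct model_fw *fw, size_t size)
{
	fw->domain = malloc(64);
	if (!fw->domain)
		return -1;
	fw->mapped_size = size;
	return 0;
}

/* Unmap only if something is mapped; returns the number of bytes unmapped. */
static size_t model_fw_unmap(struct model_fw *fw)
{
	size_t unmapped;

	if (!fw->domain || !fw->mapped_size)
		return 0;

	unmapped = fw->mapped_size;
	fw->mapped_size = 0;	/* a second call becomes a no-op */
	return unmapped;
}

/* Free the domain once and clear the pointer so later users can test it. */
static void model_fw_deinit(struct model_fw *fw)
{
	if (!fw->domain)
		return;

	free(fw->domain);
	fw->domain = NULL;
}
```

With this shape, calling model_fw_unmap() or model_fw_deinit() twice in a row is safe, which mirrors the intent of checking mapped_mem_size and iommu_domain before acting and resetting them afterwards.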



* Re: [PATCH v2 1/3] venus: core: handle race condititon for core ops
  2020-09-10 12:44 ` [PATCH v2 1/3] venus: core: handle race condititon for core ops Mansur Alisha Shaik
@ 2020-09-11 10:10   ` Stanimir Varbanov
  2020-09-17  1:51     ` mansur
  0 siblings, 1 reply; 8+ messages in thread
From: Stanimir Varbanov @ 2020-09-11 10:10 UTC (permalink / raw)
  To: Mansur Alisha Shaik, linux-media; +Cc: linux-kernel, linux-arm-msm, vgarodia



On 9/10/20 3:44 PM, Mansur Alisha Shaik wrote:
> For core ops we have only write protection and no read
> protection. Because of this, in multithreaded and concurrent
> use, one CPU core can read core->ops without waiting, which
> causes a NULL pointer dereference crash.
> 
> One such scenario is shown below: core->ops becomes NULL on
> one CPU core while another CPU core is calling
> core->ops->session_init().
> 
> CPU: core-7:
> Call trace:
>  hfi_session_init+0x180/0x1dc [venus_core]
>  vdec_queue_setup+0x9c/0x364 [venus_dec]
>  vb2_core_reqbufs+0x1e4/0x368 [videobuf2_common]
>  vb2_reqbufs+0x4c/0x64 [videobuf2_v4l2]
>  v4l2_m2m_reqbufs+0x50/0x84 [v4l2_mem2mem]
>  v4l2_m2m_ioctl_reqbufs+0x2c/0x38 [v4l2_mem2mem]
>  v4l_reqbufs+0x4c/0x5c
> __video_do_ioctl+0x2b0/0x39c
> 
> CPU: core-0:
> Call trace:
>  venus_shutdown+0x98/0xfc [venus_core]
>  venus_sys_error_handler+0x64/0x148 [venus_core]
>  process_one_work+0x210/0x3d0
>  worker_thread+0x248/0x3f4
>  kthread+0x11c/0x12c
> 
> Signed-off-by: Mansur Alisha Shaik <mansur@codeaurora.org>
> Acked-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
> ---
> Changes in V2:
> - Addressed review comments by Stan by validating on top of
>   https://lore.kernel.org/patchwork/project/lkml/list/?series=455962
> 
>  drivers/media/platform/qcom/venus/hfi.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/media/platform/qcom/venus/hfi.c b/drivers/media/platform/qcom/venus/hfi.c
> index a59022a..3137071 100644
> --- a/drivers/media/platform/qcom/venus/hfi.c
> +++ b/drivers/media/platform/qcom/venus/hfi.c
> @@ -195,7 +195,7 @@ EXPORT_SYMBOL_GPL(hfi_session_create);
>  int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
>  {
>  	struct venus_core *core = inst->core;
> -	const struct hfi_ops *ops = core->ops;
> +	const struct hfi_ops *ops;
>  	int ret;
>  

If we are in system error recovery, session_init cannot succeed,
so we should exit early in the function.

I'd suggest making it:

	/*
	 * If core shutdown is in progress or we are in system error
	 * recovery, return an error.
	 */
	mutex_lock(&core->lock);
	if (!core->ops || core->sys_error) {
		mutex_unlock(&core->lock);
		return -EIO;
	}
	mutex_unlock(&core->lock);
		
>  	if (inst->state != INST_UNINIT)
> @@ -204,10 +204,13 @@ int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
>  	inst->hfi_codec = to_codec_type(pixfmt);
>  	reinit_completion(&inst->done);
>  
> +	mutex_lock(&core->lock);
> +	ops = core->ops;
>  	ret = ops->session_init(inst, inst->session_type, inst->hfi_codec);
>  	if (ret)
>  		return ret;
>  
> +	mutex_unlock(&core->lock);
>  	ret = wait_session_msg(inst);
>  	if (ret)
>  		return ret;
> 

-- 
regards,
Stan
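
As a rough userspace illustration of the early-exit check suggested above, the sketch below models the pattern with a pthread mutex. The names model_ops, model_core, model_session_ok and model_session_init are hypothetical stand-ins rather than the driver's code; only the "lock, check, unlock, return -EIO" shape is taken from the suggestion. The point it tries to show is that every return path drops the lock before leaving the function.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's ops table and core structure. */
struct model_ops {
	int (*session_init)(int codec);
};

struct model_core {
	pthread_mutex_t lock;
	const struct model_ops *ops;	/* NULL while shutdown is in progress */
	bool sys_error;			/* true during system error recovery */
};

/* Trivial callback used to exercise the success path. */
static int model_session_ok(int codec)
{
	(void)codec;
	return 0;
}

/*
 * Check the core state under the lock and bail out early with -EIO.
 * The ops pointer is read while the lock is held, and the lock is
 * released on both the error path and the normal path.
 */
static int model_session_init(struct model_core *core, int codec)
{
	const struct model_ops *ops;

	assert(core != NULL);

	pthread_mutex_lock(&core->lock);
	if (!core->ops || core->sys_error) {
		pthread_mutex_unlock(&core->lock);
		return -EIO;
	}
	ops = core->ops;
	pthread_mutex_unlock(&core->lock);

	return ops->session_init(codec);
}
```

In this model the call succeeds only while ops is set and sys_error is false; clearing ops or setting sys_error makes it return -EIO without touching the ops table. The check narrows the window rather than closing it completely, since the state can still change after the lock is released.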


* Re: [PATCH v2 2/3] venus: core: cancel pending work items in workqueue
  2020-09-10 12:44 ` [PATCH v2 2/3] venus: core: cancel pending work items in workqueue Mansur Alisha Shaik
@ 2020-09-11 10:22   ` Stanimir Varbanov
  2020-09-17  1:53     ` mansur
  0 siblings, 1 reply; 8+ messages in thread
From: Stanimir Varbanov @ 2020-09-11 10:22 UTC (permalink / raw)
  To: Mansur Alisha Shaik, linux-media; +Cc: linux-kernel, linux-arm-msm, vgarodia

Hi,

On 9/10/20 3:44 PM, Mansur Alisha Shaik wrote:
> In concurrency use cases and the reboot scenario we observe a
> race condition that leads to a NULL pointer dereference crash.
> The shutdown path and the system recovery path both destroy
> the same mutex, hence the crash.
> 
> This is handled by adding mutex protection and cancelling the
> delayed work items in the workqueue.
> 
> Below are the call traces for the crash:
> Call trace:
>  venus_remove+0xdc/0xec [venus_core]
>  venus_core_shutdown+0x1c/0x34 [venus_core]
>  platform_drv_shutdown+0x28/0x34
>  device_shutdown+0x154/0x1fc
>  kernel_restart_prepare+0x40/0x4c
>  kernel_restart+0x1c/0x64
> 
> Call trace:
>  mutex_lock+0x34/0x60
>  venus_hfi_destroy+0x28/0x98 [venus_core]
>  hfi_destroy+0x1c/0x28 [venus_core]

I queued up [1] and after it this cannot happen anymore because
hfi_destroy() is not called by venus_sys_error_handler().

So I guess this patch is not needed anymore.

[1] https://www.spinics.net/lists/linux-arm-msm/msg70092.html

>  venus_sys_error_handler+0x60/0x14c [venus_core]
>  process_one_work+0x210/0x3d0
>  worker_thread+0x248/0x3f4
>  kthread+0x11c/0x12c
>  ret_from_fork+0x10/0x18
> 
> Signed-off-by: Mansur Alisha Shaik <mansur@codeaurora.org>
> ---
>  drivers/media/platform/qcom/venus/core.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/media/platform/qcom/venus/core.c b/drivers/media/platform/qcom/venus/core.c
> index c5af428..69aa199 100644
> --- a/drivers/media/platform/qcom/venus/core.c
> +++ b/drivers/media/platform/qcom/venus/core.c
> @@ -323,6 +323,8 @@ static int venus_remove(struct platform_device *pdev)
>  	struct device *dev = core->dev;
>  	int ret;
>  
> +	cancel_delayed_work_sync(&core->work);
> +
>  	ret = pm_runtime_get_sync(dev);
>  	WARN_ON(ret < 0);
>  
> @@ -340,7 +342,9 @@ static int venus_remove(struct platform_device *pdev)
>  	if (pm_ops->core_put)
>  		pm_ops->core_put(dev);
>  
> +	mutex_lock(&core->lock);
>  	hfi_destroy(core);
> +	mutex_unlock(&core->lock);
>  
>  	icc_put(core->video_path);
>  	icc_put(core->cpucfg_path);
> 

-- 
regards,
Stan


* Re: [PATCH v2 1/3] venus: core: handle race condititon for core ops
  2020-09-11 10:10   ` Stanimir Varbanov
@ 2020-09-17  1:51     ` mansur
  0 siblings, 0 replies; 8+ messages in thread
From: mansur @ 2020-09-17  1:51 UTC (permalink / raw)
  To: Stanimir Varbanov; +Cc: linux-media, linux-kernel, linux-arm-msm, vgarodia

On 2020-09-11 15:40, Stanimir Varbanov wrote:
> On 9/10/20 3:44 PM, Mansur Alisha Shaik wrote:
>> For core ops we have only write protection and no read
>> protection. Because of this, in multithreaded and concurrent
>> use, one CPU core can read core->ops without waiting, which
>> causes a NULL pointer dereference crash.
>> 
>> One such scenario is shown below: core->ops becomes NULL on
>> one CPU core while another CPU core is calling
>> core->ops->session_init().
>> 
>> CPU: core-7:
>> Call trace:
>>  hfi_session_init+0x180/0x1dc [venus_core]
>>  vdec_queue_setup+0x9c/0x364 [venus_dec]
>>  vb2_core_reqbufs+0x1e4/0x368 [videobuf2_common]
>>  vb2_reqbufs+0x4c/0x64 [videobuf2_v4l2]
>>  v4l2_m2m_reqbufs+0x50/0x84 [v4l2_mem2mem]
>>  v4l2_m2m_ioctl_reqbufs+0x2c/0x38 [v4l2_mem2mem]
>>  v4l_reqbufs+0x4c/0x5c
>> __video_do_ioctl+0x2b0/0x39c
>> 
>> CPU: core-0:
>> Call trace:
>>  venus_shutdown+0x98/0xfc [venus_core]
>>  venus_sys_error_handler+0x64/0x148 [venus_core]
>>  process_one_work+0x210/0x3d0
>>  worker_thread+0x248/0x3f4
>>  kthread+0x11c/0x12c
>> 
>> Signed-off-by: Mansur Alisha Shaik <mansur@codeaurora.org>
>> Acked-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
>> ---
>> Changes in V2:
>> - Addressed review comments by Stan by validating on top of
>>   https://lore.kernel.org/patchwork/project/lkml/list/?series=455962
>> 
>>  drivers/media/platform/qcom/venus/hfi.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/media/platform/qcom/venus/hfi.c b/drivers/media/platform/qcom/venus/hfi.c
>> index a59022a..3137071 100644
>> --- a/drivers/media/platform/qcom/venus/hfi.c
>> +++ b/drivers/media/platform/qcom/venus/hfi.c
>> @@ -195,7 +195,7 @@ EXPORT_SYMBOL_GPL(hfi_session_create);
>>  int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
>>  {
>>  	struct venus_core *core = inst->core;
>> -	const struct hfi_ops *ops = core->ops;
>> +	const struct hfi_ops *ops;
>>  	int ret;
>> 
> 
> If we are in system error recovery, session_init cannot succeed,
> so we should exit early in the function.
> 
> I'd suggest making it:
> 
> 	/*
> 	 * If core shutdown is in progress or we are in system error
> 	 * recovery, return an error.
> 	 */
> 	mutex_lock(&core->lock);
> 	if (!core->ops || core->sys_error) {
> 		mutex_unlock(&core->lock);
> 		return -EIO;
> 	}
> 	mutex_unlock(&core->lock);
> 
I tried the above suggestion and ran the failing scenario; I didn't see
any issue. Posted a new version:
https://lore.kernel.org/patchwork/project/lkml/list/?series=463091
>>  	if (inst->state != INST_UNINIT)
>> @@ -204,10 +204,13 @@ int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
>>  	inst->hfi_codec = to_codec_type(pixfmt);
>>  	reinit_completion(&inst->done);
>> 
>> +	mutex_lock(&core->lock);
>> +	ops = core->ops;
>>  	ret = ops->session_init(inst, inst->session_type, inst->hfi_codec);
>>  	if (ret)
>>  		return ret;
>> 
>> +	mutex_unlock(&core->lock);
>>  	ret = wait_session_msg(inst);
>>  	if (ret)
>>  		return ret;
>> 


* Re: [PATCH v2 2/3] venus: core: cancel pending work items in workqueue
  2020-09-11 10:22   ` Stanimir Varbanov
@ 2020-09-17  1:53     ` mansur
  0 siblings, 0 replies; 8+ messages in thread
From: mansur @ 2020-09-17  1:53 UTC (permalink / raw)
  To: Stanimir Varbanov; +Cc: linux-media, linux-kernel, linux-arm-msm, vgarodia

On 2020-09-11 15:52, Stanimir Varbanov wrote:
> Hi,
> 
> On 9/10/20 3:44 PM, Mansur Alisha Shaik wrote:
>> In concurrency use cases and the reboot scenario we observe a
>> race condition that leads to a NULL pointer dereference crash.
>> The shutdown path and the system recovery path both destroy
>> the same mutex, hence the crash.
>> 
>> This is handled by adding mutex protection and cancelling the
>> delayed work items in the workqueue.
>> 
>> Below are the call traces for the crash:
>> Call trace:
>>  venus_remove+0xdc/0xec [venus_core]
>>  venus_core_shutdown+0x1c/0x34 [venus_core]
>>  platform_drv_shutdown+0x28/0x34
>>  device_shutdown+0x154/0x1fc
>>  kernel_restart_prepare+0x40/0x4c
>>  kernel_restart+0x1c/0x64
>> 
>> Call trace:
>>  mutex_lock+0x34/0x60
>>  venus_hfi_destroy+0x28/0x98 [venus_core]
>>  hfi_destroy+0x1c/0x28 [venus_core]
> 
> I queued up [1] and after it this cannot happen anymore because
> hfi_destroy() is not called by venus_sys_error_handler().
> 
> So I guess this patch is not needed anymore.
> 
> [1] https://www.spinics.net/lists/linux-arm-msm/msg70092.html
> 
Yes, this patch is not needed any more. I rebased and posted a new version:
https://lore.kernel.org/patchwork/project/lkml/list/?series=463091

>>  venus_sys_error_handler+0x60/0x14c [venus_core]
>>  process_one_work+0x210/0x3d0
>>  worker_thread+0x248/0x3f4
>>  kthread+0x11c/0x12c
>>  ret_from_fork+0x10/0x18
>> 
>> Signed-off-by: Mansur Alisha Shaik <mansur@codeaurora.org>
>> ---
>>  drivers/media/platform/qcom/venus/core.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>> 
>> diff --git a/drivers/media/platform/qcom/venus/core.c b/drivers/media/platform/qcom/venus/core.c
>> index c5af428..69aa199 100644
>> --- a/drivers/media/platform/qcom/venus/core.c
>> +++ b/drivers/media/platform/qcom/venus/core.c
>> @@ -323,6 +323,8 @@ static int venus_remove(struct platform_device *pdev)
>>  	struct device *dev = core->dev;
>>  	int ret;
>> 
>> +	cancel_delayed_work_sync(&core->work);
>> +
>>  	ret = pm_runtime_get_sync(dev);
>>  	WARN_ON(ret < 0);
>> 
>> @@ -340,7 +342,9 @@ static int venus_remove(struct platform_device *pdev)
>>  	if (pm_ops->core_put)
>>  		pm_ops->core_put(dev);
>> 
>> +	mutex_lock(&core->lock);
>>  	hfi_destroy(core);
>> +	mutex_unlock(&core->lock);
>> 
>>  	icc_put(core->video_path);
>>  	icc_put(core->cpucfg_path);
>> 

