linux-kernel.vger.kernel.org archive mirror
* [RESEND  0/3] Venus - Handle race conditions in concurrency
@ 2020-08-07  6:24 Mansur Alisha Shaik
  2020-08-07  6:24 ` [RESEND 1/3] venus: core: handle race condititon for core ops Mansur Alisha Shaik
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Mansur Alisha Shaik @ 2020-08-07  6:24 UTC (permalink / raw)
  To: linux-media, stanimir.varbanov
  Cc: linux-kernel, linux-arm-msm, vgarodia, Mansur Alisha Shaik

This patchset handles race conditions seen in concurrency
use cases such as multiple YouTube browser tabs (approximately
50 or more), graphics_Stress, WiFi ON/OFF, Bluetooth ON/OFF,
and reboot, all running in parallel.

---
Resending the fixes with a fuller description of the issue
and with typos corrected.

Mansur Alisha Shaik (3):
  venus: core: handle race condititon for core ops
  venus: core: cancel pending work items in workqueue
  venus: handle use after free for iommu_map/iommu_unmap

 drivers/media/platform/qcom/venus/core.c     |  6 +++++-
 drivers/media/platform/qcom/venus/firmware.c | 17 +++++++++++++----
 drivers/media/platform/qcom/venus/hfi.c      |  5 ++++-
 3 files changed, 22 insertions(+), 6 deletions(-)

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member 
of Code Aurora Forum, hosted by The Linux Foundation


* [RESEND  1/3] venus: core: handle race condititon for core ops
  2020-08-07  6:24 [RESEND 0/3] Venus - Handle race conditions in concurrency Mansur Alisha Shaik
@ 2020-08-07  6:24 ` Mansur Alisha Shaik
  2020-08-10  9:50   ` Stanimir Varbanov
  2020-09-10 10:43   ` Stanimir Varbanov
  2020-08-07  6:24 ` [RESEND 2/3] venus: core: cancel pending work items in workqueue Mansur Alisha Shaik
  2020-08-07  6:24 ` [RESEND 3/3] venus: handle use after free for iommu_map/iommu_unmap Mansur Alisha Shaik
  2 siblings, 2 replies; 10+ messages in thread
From: Mansur Alisha Shaik @ 2020-08-07  6:24 UTC (permalink / raw)
  To: linux-media, stanimir.varbanov
  Cc: linux-kernel, linux-arm-msm, vgarodia, Mansur Alisha Shaik

Core ops currently have write protection only; there is no
read protection. As a result, in multithreaded and concurrent
use cases one CPU core can read core->ops without waiting,
which causes a NULL pointer dereference crash.

One such scenario is shown below: core->ops becomes NULL on
one CPU core while another core is calling
core->ops->session_init().

CPU: core-7:
Call trace:
 hfi_session_init+0x180/0x1dc [venus_core]
 vdec_queue_setup+0x9c/0x364 [venus_dec]
 vb2_core_reqbufs+0x1e4/0x368 [videobuf2_common]
 vb2_reqbufs+0x4c/0x64 [videobuf2_v4l2]
 v4l2_m2m_reqbufs+0x50/0x84 [v4l2_mem2mem]
 v4l2_m2m_ioctl_reqbufs+0x2c/0x38 [v4l2_mem2mem]
 v4l_reqbufs+0x4c/0x5c
__video_do_ioctl+0x2b0/0x39c

CPU: core-0:
Call trace:
 venus_shutdown+0x98/0xfc [venus_core]
 venus_sys_error_handler+0x64/0x148 [venus_core]
 process_one_work+0x210/0x3d0
 worker_thread+0x248/0x3f4
 kthread+0x11c/0x12c

Signed-off-by: Mansur Alisha Shaik <mansur@codeaurora.org>
---
 drivers/media/platform/qcom/venus/core.c | 2 +-
 drivers/media/platform/qcom/venus/hfi.c  | 5 ++++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/media/platform/qcom/venus/core.c b/drivers/media/platform/qcom/venus/core.c
index 203c653..fe99c83 100644
--- a/drivers/media/platform/qcom/venus/core.c
+++ b/drivers/media/platform/qcom/venus/core.c
@@ -64,8 +64,8 @@ static void venus_sys_error_handler(struct work_struct *work)
 	pm_runtime_get_sync(core->dev);
 
 	hfi_core_deinit(core, true);
-	hfi_destroy(core);
 	mutex_lock(&core->lock);
+	hfi_destroy(core);
 	venus_shutdown(core);
 
 	pm_runtime_put_sync(core->dev);
diff --git a/drivers/media/platform/qcom/venus/hfi.c b/drivers/media/platform/qcom/venus/hfi.c
index a211eb9..2eeb31f 100644
--- a/drivers/media/platform/qcom/venus/hfi.c
+++ b/drivers/media/platform/qcom/venus/hfi.c
@@ -195,7 +195,7 @@ EXPORT_SYMBOL_GPL(hfi_session_create);
 int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
 {
 	struct venus_core *core = inst->core;
-	const struct hfi_ops *ops = core->ops;
+	const struct hfi_ops *ops;
 	int ret;
 
 	if (inst->state != INST_UNINIT)
@@ -204,10 +204,13 @@ int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
 	inst->hfi_codec = to_codec_type(pixfmt);
 	reinit_completion(&inst->done);
 
+	mutex_lock(&core->lock);
+	ops = core->ops;
 	ret = ops->session_init(inst, inst->session_type, inst->hfi_codec);
 	if (ret)
 		return ret;
 
+	mutex_unlock(&core->lock);
 	ret = wait_session_msg(inst);
 	if (ret)
 		return ret;
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member 
of Code Aurora Forum, hosted by The Linux Foundation
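
One detail in the hfi.c hunk above may be worth double-checking: as posted, the early `return ret` after `ops->session_init()` appears to leave `core->lock` held, because `mutex_unlock()` is reached only on success. The following is a minimal userspace sketch of a single-exit variant, not driver code. It assumes a pthread mutex in place of the kernel mutex; `fake_core`, `fake_ops`, `session_init_locked()`, the NULL check, and the `-ENODEV` return value are all hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct hfi_ops and struct venus_core. */
struct fake_ops {
	int (*session_init)(int codec);
};

struct fake_core {
	pthread_mutex_t lock;
	const struct fake_ops *ops;	/* a teardown path may set this to NULL */
};

/*
 * Read core->ops and call through it with the lock held, and release
 * the lock on every exit path, including the error paths.
 */
static int session_init_locked(struct fake_core *core, int codec)
{
	const struct fake_ops *ops;
	int ret;

	assert(core != NULL);

	pthread_mutex_lock(&core->lock);

	ops = core->ops;
	if (!ops) {
		ret = -ENODEV;	/* assumed error code: ops already torn down */
		goto unlock;
	}

	ret = ops->session_init(codec);

unlock:
	pthread_mutex_unlock(&core->lock);
	return ret;
}

/* Two trivial callbacks used to exercise the success and failure paths. */
static int init_ok(int codec)   { (void)codec; return 0; }
static int init_fail(int codec) { (void)codec; return -EINVAL; }

static const struct fake_ops ok_ops   = { .session_init = init_ok };
static const struct fake_ops fail_ops = { .session_init = init_fail };
```

With this shape, a failing `session_init` callback still leaves the lock free for the next caller, which a `pthread_mutex_trylock()` after the call can confirm.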


* [RESEND  2/3] venus: core: cancel pending work items in workqueue
  2020-08-07  6:24 [RESEND 0/3] Venus - Handle race conditions in concurrency Mansur Alisha Shaik
  2020-08-07  6:24 ` [RESEND 1/3] venus: core: handle race condititon for core ops Mansur Alisha Shaik
@ 2020-08-07  6:24 ` Mansur Alisha Shaik
  2020-08-07  6:24 ` [RESEND 3/3] venus: handle use after free for iommu_map/iommu_unmap Mansur Alisha Shaik
  2 siblings, 0 replies; 10+ messages in thread
From: Mansur Alisha Shaik @ 2020-08-07  6:24 UTC (permalink / raw)
  To: linux-media, stanimir.varbanov
  Cc: linux-kernel, linux-arm-msm, vgarodia, Mansur Alisha Shaik

In concurrency use cases combined with reboot, we observe
a race condition that ends in a NULL pointer dereference
crash. The shutdown path and the system recovery path both
destroy the same mutex, which causes the crash.

This is handled by adding mutex protection and by cancelling
pending delayed work items in the workqueue.

Below are the call traces for the crash:
Call trace:
 venus_remove+0xdc/0xec [venus_core]
 venus_core_shutdown+0x1c/0x34 [venus_core]
 platform_drv_shutdown+0x28/0x34
 device_shutdown+0x154/0x1fc
 kernel_restart_prepare+0x40/0x4c
 kernel_restart+0x1c/0x64

Call trace:
 mutex_lock+0x34/0x60
 venus_hfi_destroy+0x28/0x98 [venus_core]
 hfi_destroy+0x1c/0x28 [venus_core]
 venus_sys_error_handler+0x60/0x14c [venus_core]
 process_one_work+0x210/0x3d0
 worker_thread+0x248/0x3f4
 kthread+0x11c/0x12c
 ret_from_fork+0x10/0x18

Signed-off-by: Mansur Alisha Shaik <mansur@codeaurora.org>
---
 drivers/media/platform/qcom/venus/core.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/media/platform/qcom/venus/core.c b/drivers/media/platform/qcom/venus/core.c
index fe99c83..41a293e 100644
--- a/drivers/media/platform/qcom/venus/core.c
+++ b/drivers/media/platform/qcom/venus/core.c
@@ -312,6 +312,8 @@ static int venus_remove(struct platform_device *pdev)
 	struct device *dev = core->dev;
 	int ret;
 
+	cancel_delayed_work_sync(&core->work);
+
 	ret = pm_runtime_get_sync(dev);
 	WARN_ON(ret < 0);
 
@@ -329,7 +331,9 @@ static int venus_remove(struct platform_device *pdev)
 	if (pm_ops->core_put)
 		pm_ops->core_put(dev);
 
+	mutex_lock(&core->lock);
 	hfi_destroy(core);
+	mutex_unlock(&core->lock);
 
 	icc_put(core->video_path);
 	icc_put(core->cpucfg_path);
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member 
of Code Aurora Forum, hosted by The Linux Foundation
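
The ordering idea in this patch (stop the recovery work first, then tear down what it uses) can be illustrated in userspace. The sketch below is an analogy only: a pthread worker stands in for the delayed work item, and setting a stop flag followed by `pthread_join()` plays the role that `cancel_delayed_work_sync()` plays in the patch. `fake_core`, `recovery_worker()`, `core_start()`, and `core_remove()` are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver core; not the real struct venus_core. */
struct fake_core {
	pthread_mutex_t lock;
	pthread_t worker;
	atomic_bool stop;	/* set by teardown to ask the worker to exit */
	int recoveries;		/* protected by lock */
	bool destroyed;
};

/* Plays the role of the recovery work item: it takes core->lock repeatedly. */
static void *recovery_worker(void *arg)
{
	struct fake_core *core = arg;

	while (!atomic_load(&core->stop)) {
		pthread_mutex_lock(&core->lock);
		core->recoveries++;
		pthread_mutex_unlock(&core->lock);
		sched_yield();
	}
	return NULL;
}

static int core_start(struct fake_core *core)
{
	assert(core != NULL);

	pthread_mutex_init(&core->lock, NULL);
	atomic_init(&core->stop, false);
	core->recoveries = 0;
	core->destroyed = false;
	return pthread_create(&core->worker, NULL, recovery_worker, core);
}

/*
 * Teardown order: first make sure the worker can no longer run,
 * and only then destroy the lock it uses.
 */
static void core_remove(struct fake_core *core)
{
	atomic_store(&core->stop, true);
	pthread_join(core->worker, NULL);	/* worker has fully exited here */

	pthread_mutex_destroy(&core->lock);
	core->destroyed = true;
}
```

Reversing the two steps in `core_remove()` would let the worker lock a destroyed mutex, which is the userspace counterpart of the second call trace above.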


* [RESEND  3/3] venus: handle use after free for iommu_map/iommu_unmap
  2020-08-07  6:24 [RESEND 0/3] Venus - Handle race conditions in concurrency Mansur Alisha Shaik
  2020-08-07  6:24 ` [RESEND 1/3] venus: core: handle race condititon for core ops Mansur Alisha Shaik
  2020-08-07  6:24 ` [RESEND 2/3] venus: core: cancel pending work items in workqueue Mansur Alisha Shaik
@ 2020-08-07  6:24 ` Mansur Alisha Shaik
  2020-08-10  9:53   ` Stanimir Varbanov
  2020-08-25 13:13   ` Stanimir Varbanov
  2 siblings, 2 replies; 10+ messages in thread
From: Mansur Alisha Shaik @ 2020-08-07  6:24 UTC (permalink / raw)
  To: linux-media, stanimir.varbanov
  Cc: linux-kernel, linux-arm-msm, vgarodia, Mansur Alisha Shaik

In concurrency use cases combined with reboot, the driver tries to
map fw.iommu_domain after it has already been unmapped during
shutdown. This causes a NULL pointer dereference crash.

This is handled by adding the necessary checks before
unmapping and freeing the domain.

Call trace:
 __iommu_map+0x4c/0x348
 iommu_map+0x5c/0x70
 venus_boot+0x184/0x230 [venus_core]
 venus_sys_error_handler+0xa0/0x14c [venus_core]
 process_one_work+0x210/0x3d0
 worker_thread+0x248/0x3f4
 kthread+0x11c/0x12c
 ret_from_fork+0x10/0x18

Signed-off-by: Mansur Alisha Shaik <mansur@codeaurora.org>
---
 drivers/media/platform/qcom/venus/firmware.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/media/platform/qcom/venus/firmware.c b/drivers/media/platform/qcom/venus/firmware.c
index 8801a6a..c427e88 100644
--- a/drivers/media/platform/qcom/venus/firmware.c
+++ b/drivers/media/platform/qcom/venus/firmware.c
@@ -171,9 +171,14 @@ static int venus_shutdown_no_tz(struct venus_core *core)
 
 	iommu = core->fw.iommu_domain;
 
-	unmapped = iommu_unmap(iommu, VENUS_FW_START_ADDR, mapped);
-	if (unmapped != mapped)
-		dev_err(dev, "failed to unmap firmware\n");
+	if (core->fw.mapped_mem_size && iommu) {
+		unmapped = iommu_unmap(iommu, VENUS_FW_START_ADDR, mapped);
+
+		if (unmapped != mapped)
+			dev_err(dev, "failed to unmap firmware\n");
+		else
+			core->fw.mapped_mem_size = 0;
+	}
 
 	return 0;
 }
@@ -288,7 +293,11 @@ void venus_firmware_deinit(struct venus_core *core)
 	iommu = core->fw.iommu_domain;
 
 	iommu_detach_device(iommu, core->fw.dev);
-	iommu_domain_free(iommu);
+
+	if (core->fw.iommu_domain) {
+		iommu_domain_free(iommu);
+		core->fw.iommu_domain = NULL;
+	}
 
 	platform_device_unregister(to_platform_device(core->fw.dev));
 }
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member 
of Code Aurora Forum, hosted by The Linux Foundation


* Re: [RESEND 1/3] venus: core: handle race condititon for core ops
  2020-08-07  6:24 ` [RESEND 1/3] venus: core: handle race condititon for core ops Mansur Alisha Shaik
@ 2020-08-10  9:50   ` Stanimir Varbanov
  2020-08-21 10:59     ` Stanimir Varbanov
  2020-09-10 10:43   ` Stanimir Varbanov
  1 sibling, 1 reply; 10+ messages in thread
From: Stanimir Varbanov @ 2020-08-10  9:50 UTC (permalink / raw)
  To: Mansur Alisha Shaik, linux-media; +Cc: linux-kernel, linux-arm-msm, vgarodia

Hi Mansur,

Thanks for the patches!

On 8/7/20 9:24 AM, Mansur Alisha Shaik wrote:
> Core ops currently have write protection only; there is no
> read protection. As a result, in multithreaded and concurrent
> use cases one CPU core can read core->ops without waiting,
> which causes a NULL pointer dereference crash.
> 
> One such scenario is shown below: core->ops becomes NULL on
> one CPU core while another core is calling
> core->ops->session_init().
> 
> CPU: core-7:
> Call trace:
>  hfi_session_init+0x180/0x1dc [venus_core]
>  vdec_queue_setup+0x9c/0x364 [venus_dec]
>  vb2_core_reqbufs+0x1e4/0x368 [videobuf2_common]
>  vb2_reqbufs+0x4c/0x64 [videobuf2_v4l2]
>  v4l2_m2m_reqbufs+0x50/0x84 [v4l2_mem2mem]
>  v4l2_m2m_ioctl_reqbufs+0x2c/0x38 [v4l2_mem2mem]
>  v4l_reqbufs+0x4c/0x5c
> __video_do_ioctl+0x2b0/0x39c
> 
> CPU: core-0:
> Call trace:
>  venus_shutdown+0x98/0xfc [venus_core]
>  venus_sys_error_handler+0x64/0x148 [venus_core]
>  process_one_work+0x210/0x3d0
>  worker_thread+0x248/0x3f4
>  kthread+0x11c/0x12c
> 
> Signed-off-by: Mansur Alisha Shaik <mansur@codeaurora.org>
> ---
>  drivers/media/platform/qcom/venus/core.c | 2 +-
>  drivers/media/platform/qcom/venus/hfi.c  | 5 ++++-
>  2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/media/platform/qcom/venus/core.c b/drivers/media/platform/qcom/venus/core.c
> index 203c653..fe99c83 100644
> --- a/drivers/media/platform/qcom/venus/core.c
> +++ b/drivers/media/platform/qcom/venus/core.c
> @@ -64,8 +64,8 @@ static void venus_sys_error_handler(struct work_struct *work)
>  	pm_runtime_get_sync(core->dev);
>  
>  	hfi_core_deinit(core, true);
> -	hfi_destroy(core);
>  	mutex_lock(&core->lock);
> +	hfi_destroy(core);

As my recovery fixes [1] touch this part also, could you please apply
them on top of yours and re-test?

Otherwise this patch looks good to me.

[1] https://www.spinics.net/lists/linux-arm-msm/msg70092.html

>  	venus_shutdown(core);
>  
>  	pm_runtime_put_sync(core->dev);
> diff --git a/drivers/media/platform/qcom/venus/hfi.c b/drivers/media/platform/qcom/venus/hfi.c
> index a211eb9..2eeb31f 100644
> --- a/drivers/media/platform/qcom/venus/hfi.c
> +++ b/drivers/media/platform/qcom/venus/hfi.c
> @@ -195,7 +195,7 @@ EXPORT_SYMBOL_GPL(hfi_session_create);
>  int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
>  {
>  	struct venus_core *core = inst->core;
> -	const struct hfi_ops *ops = core->ops;
> +	const struct hfi_ops *ops;
>  	int ret;
>  
>  	if (inst->state != INST_UNINIT)
> @@ -204,10 +204,13 @@ int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
>  	inst->hfi_codec = to_codec_type(pixfmt);
>  	reinit_completion(&inst->done);
>  
> +	mutex_lock(&core->lock);
> +	ops = core->ops;
>  	ret = ops->session_init(inst, inst->session_type, inst->hfi_codec);
>  	if (ret)
>  		return ret;
>  
> +	mutex_unlock(&core->lock);
>  	ret = wait_session_msg(inst);
>  	if (ret)
>  		return ret;
> 

-- 
regards,
Stan

* Re: [RESEND 3/3] venus: handle use after free for iommu_map/iommu_unmap
  2020-08-07  6:24 ` [RESEND 3/3] venus: handle use after free for iommu_map/iommu_unmap Mansur Alisha Shaik
@ 2020-08-10  9:53   ` Stanimir Varbanov
  2020-08-25 13:13   ` Stanimir Varbanov
  1 sibling, 0 replies; 10+ messages in thread
From: Stanimir Varbanov @ 2020-08-10  9:53 UTC (permalink / raw)
  To: Mansur Alisha Shaik, linux-media; +Cc: linux-kernel, linux-arm-msm, vgarodia

Hi,

On 8/7/20 9:24 AM, Mansur Alisha Shaik wrote:
> In concurrency use cases combined with reboot, the driver tries to
> map fw.iommu_domain after it has already been unmapped during

I guess you want to say "to unmap iommu domain which is already unmapped" ?

> shutdown. This causes a NULL pointer dereference crash.
> 
> This is handled by adding the necessary checks before
> unmapping and freeing the domain.
> 
> Call trace:
>  __iommu_map+0x4c/0x348
>  iommu_map+0x5c/0x70
>  venus_boot+0x184/0x230 [venus_core]
>  venus_sys_error_handler+0xa0/0x14c [venus_core]
>  process_one_work+0x210/0x3d0
>  worker_thread+0x248/0x3f4
>  kthread+0x11c/0x12c
>  ret_from_fork+0x10/0x18
> 
> Signed-off-by: Mansur Alisha Shaik <mansur@codeaurora.org>
> ---
>  drivers/media/platform/qcom/venus/firmware.c | 17 +++++++++++++----
>  1 file changed, 13 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/media/platform/qcom/venus/firmware.c b/drivers/media/platform/qcom/venus/firmware.c
> index 8801a6a..c427e88 100644
> --- a/drivers/media/platform/qcom/venus/firmware.c
> +++ b/drivers/media/platform/qcom/venus/firmware.c
> @@ -171,9 +171,14 @@ static int venus_shutdown_no_tz(struct venus_core *core)
>  
>  	iommu = core->fw.iommu_domain;
>  
> -	unmapped = iommu_unmap(iommu, VENUS_FW_START_ADDR, mapped);
> -	if (unmapped != mapped)
> -		dev_err(dev, "failed to unmap firmware\n");
> +	if (core->fw.mapped_mem_size && iommu) {
> +		unmapped = iommu_unmap(iommu, VENUS_FW_START_ADDR, mapped);
> +
> +		if (unmapped != mapped)
> +			dev_err(dev, "failed to unmap firmware\n");
> +		else
> +			core->fw.mapped_mem_size = 0;
> +	}
>  
>  	return 0;
>  }
> @@ -288,7 +293,11 @@ void venus_firmware_deinit(struct venus_core *core)
>  	iommu = core->fw.iommu_domain;
>  
>  	iommu_detach_device(iommu, core->fw.dev);
> -	iommu_domain_free(iommu);
> +
> +	if (core->fw.iommu_domain) {
> +		iommu_domain_free(iommu);
> +		core->fw.iommu_domain = NULL;
> +	}
>  
>  	platform_device_unregister(to_platform_device(core->fw.dev));
>  }
> 

-- 
regards,
Stan

* Re: [RESEND 1/3] venus: core: handle race condititon for core ops
  2020-08-10  9:50   ` Stanimir Varbanov
@ 2020-08-21 10:59     ` Stanimir Varbanov
  2020-08-25  1:43       ` mansur
  0 siblings, 1 reply; 10+ messages in thread
From: Stanimir Varbanov @ 2020-08-21 10:59 UTC (permalink / raw)
  To: Mansur Alisha Shaik, linux-media; +Cc: linux-kernel, linux-arm-msm, vgarodia

Hi Mansur,

On 8/10/20 12:50 PM, Stanimir Varbanov wrote:
> Hi Mansur,
> 
> Thanks for the patches!
> 
> On 8/7/20 9:24 AM, Mansur Alisha Shaik wrote:
>> Core ops currently have write protection only; there is no
>> read protection. As a result, in multithreaded and concurrent
>> use cases one CPU core can read core->ops without waiting,
>> which causes a NULL pointer dereference crash.
>>
>> One such scenario is shown below: core->ops becomes NULL on
>> one CPU core while another core is calling
>> core->ops->session_init().
>>
>> CPU: core-7:
>> Call trace:
>>  hfi_session_init+0x180/0x1dc [venus_core]
>>  vdec_queue_setup+0x9c/0x364 [venus_dec]
>>  vb2_core_reqbufs+0x1e4/0x368 [videobuf2_common]
>>  vb2_reqbufs+0x4c/0x64 [videobuf2_v4l2]
>>  v4l2_m2m_reqbufs+0x50/0x84 [v4l2_mem2mem]
>>  v4l2_m2m_ioctl_reqbufs+0x2c/0x38 [v4l2_mem2mem]
>>  v4l_reqbufs+0x4c/0x5c
>> __video_do_ioctl+0x2b0/0x39c
>>
>> CPU: core-0:
>> Call trace:
>>  venus_shutdown+0x98/0xfc [venus_core]
>>  venus_sys_error_handler+0x64/0x148 [venus_core]
>>  process_one_work+0x210/0x3d0
>>  worker_thread+0x248/0x3f4
>>  kthread+0x11c/0x12c
>>
>> Signed-off-by: Mansur Alisha Shaik <mansur@codeaurora.org>
>> ---
>>  drivers/media/platform/qcom/venus/core.c | 2 +-
>>  drivers/media/platform/qcom/venus/hfi.c  | 5 ++++-
>>  2 files changed, 5 insertions(+), 2 deletions(-)

See below comment, otherwise:

Acked-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>

>>
>> diff --git a/drivers/media/platform/qcom/venus/core.c b/drivers/media/platform/qcom/venus/core.c
>> index 203c653..fe99c83 100644
>> --- a/drivers/media/platform/qcom/venus/core.c
>> +++ b/drivers/media/platform/qcom/venus/core.c
>> @@ -64,8 +64,8 @@ static void venus_sys_error_handler(struct work_struct *work)
>>  	pm_runtime_get_sync(core->dev);
>>  
>>  	hfi_core_deinit(core, true);
>> -	hfi_destroy(core);
>>  	mutex_lock(&core->lock);
>> +	hfi_destroy(core);
> 
> As my recovery fixes [1] touch this part also, could you please apply
> them on top of yours and re-test?

I'll drop the above chunk from the patch because it is already taken into
account in my recovery fixes series, and queue up the patch for v5.10.

> 
> Otherwise this patch looks good to me.
> 
> [1] https://www.spinics.net/lists/linux-arm-msm/msg70092.html
> 
>>  	venus_shutdown(core);
>>  
>>  	pm_runtime_put_sync(core->dev);
>> diff --git a/drivers/media/platform/qcom/venus/hfi.c b/drivers/media/platform/qcom/venus/hfi.c
>> index a211eb9..2eeb31f 100644
>> --- a/drivers/media/platform/qcom/venus/hfi.c
>> +++ b/drivers/media/platform/qcom/venus/hfi.c
>> @@ -195,7 +195,7 @@ EXPORT_SYMBOL_GPL(hfi_session_create);
>>  int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
>>  {
>>  	struct venus_core *core = inst->core;
>> -	const struct hfi_ops *ops = core->ops;
>> +	const struct hfi_ops *ops;
>>  	int ret;
>>  
>>  	if (inst->state != INST_UNINIT)
>> @@ -204,10 +204,13 @@ int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
>>  	inst->hfi_codec = to_codec_type(pixfmt);
>>  	reinit_completion(&inst->done);
>>  
>> +	mutex_lock(&core->lock);
>> +	ops = core->ops;
>>  	ret = ops->session_init(inst, inst->session_type, inst->hfi_codec);
>>  	if (ret)
>>  		return ret;
>>  
>> +	mutex_unlock(&core->lock);
>>  	ret = wait_session_msg(inst);
>>  	if (ret)
>>  		return ret;
>>
> 

-- 
regards,
Stan

* Re: [RESEND 1/3] venus: core: handle race condititon for core ops
  2020-08-21 10:59     ` Stanimir Varbanov
@ 2020-08-25  1:43       ` mansur
  0 siblings, 0 replies; 10+ messages in thread
From: mansur @ 2020-08-25  1:43 UTC (permalink / raw)
  To: Stanimir Varbanov; +Cc: linux-media, linux-kernel, linux-arm-msm, vgarodia

On 2020-08-21 16:29, Stanimir Varbanov wrote:
> Hi Mansur,
> 
> On 8/10/20 12:50 PM, Stanimir Varbanov wrote:
>> Hi Mansur,
>> 
>> Thanks for the patches!
>> 
>> On 8/7/20 9:24 AM, Mansur Alisha Shaik wrote:
>>> Core ops currently have write protection only; there is no
>>> read protection. As a result, in multithreaded and concurrent
>>> use cases one CPU core can read core->ops without waiting,
>>> which causes a NULL pointer dereference crash.
>>> 
>>> One such scenario is shown below: core->ops becomes NULL on
>>> one CPU core while another core is calling
>>> core->ops->session_init().
>>> 
>>> CPU: core-7:
>>> Call trace:
>>>  hfi_session_init+0x180/0x1dc [venus_core]
>>>  vdec_queue_setup+0x9c/0x364 [venus_dec]
>>>  vb2_core_reqbufs+0x1e4/0x368 [videobuf2_common]
>>>  vb2_reqbufs+0x4c/0x64 [videobuf2_v4l2]
>>>  v4l2_m2m_reqbufs+0x50/0x84 [v4l2_mem2mem]
>>>  v4l2_m2m_ioctl_reqbufs+0x2c/0x38 [v4l2_mem2mem]
>>>  v4l_reqbufs+0x4c/0x5c
>>> __video_do_ioctl+0x2b0/0x39c
>>> 
>>> CPU: core-0:
>>> Call trace:
>>>  venus_shutdown+0x98/0xfc [venus_core]
>>>  venus_sys_error_handler+0x64/0x148 [venus_core]
>>>  process_one_work+0x210/0x3d0
>>>  worker_thread+0x248/0x3f4
>>>  kthread+0x11c/0x12c
>>> 
>>> Signed-off-by: Mansur Alisha Shaik <mansur@codeaurora.org>
>>> ---
>>>  drivers/media/platform/qcom/venus/core.c | 2 +-
>>>  drivers/media/platform/qcom/venus/hfi.c  | 5 ++++-
>>>  2 files changed, 5 insertions(+), 2 deletions(-)
> 
> See below comment, otherwise:
> 
> Acked-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
> 
>>> 
>>> diff --git a/drivers/media/platform/qcom/venus/core.c b/drivers/media/platform/qcom/venus/core.c
>>> index 203c653..fe99c83 100644
>>> --- a/drivers/media/platform/qcom/venus/core.c
>>> +++ b/drivers/media/platform/qcom/venus/core.c
>>> @@ -64,8 +64,8 @@ static void venus_sys_error_handler(struct work_struct *work)
>>>  	pm_runtime_get_sync(core->dev);
>>> 
>>>  	hfi_core_deinit(core, true);
>>> -	hfi_destroy(core);
>>>  	mutex_lock(&core->lock);
>>> +	hfi_destroy(core);
>> 
>> As my recovery fixes [1] touch this part also, could you please apply
>> them on top of yours and re-test?
> 
> I'll drop the above chunk from the patch because it is already taken into
> account in my recovery fixes series, and queue up the patch for v5.10.
> 
Yes, you can drop it. I have validated these patches on top of your
recovery patch series. I will push v2 with a dependency on the
"venus - recovery from firmware crash" series
(https://lore.kernel.org/patchwork/project/lkml/list/?series=455962).

>> 
>> Otherwise this patch looks good to me.
>> 
>> [1] https://www.spinics.net/lists/linux-arm-msm/msg70092.html
>> 
>>>  	venus_shutdown(core);
>>> 
>>>  	pm_runtime_put_sync(core->dev);
>>> diff --git a/drivers/media/platform/qcom/venus/hfi.c b/drivers/media/platform/qcom/venus/hfi.c
>>> index a211eb9..2eeb31f 100644
>>> --- a/drivers/media/platform/qcom/venus/hfi.c
>>> +++ b/drivers/media/platform/qcom/venus/hfi.c
>>> @@ -195,7 +195,7 @@ EXPORT_SYMBOL_GPL(hfi_session_create);
>>>  int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
>>>  {
>>>  	struct venus_core *core = inst->core;
>>> -	const struct hfi_ops *ops = core->ops;
>>> +	const struct hfi_ops *ops;
>>>  	int ret;
>>> 
>>>  	if (inst->state != INST_UNINIT)
>>> @@ -204,10 +204,13 @@ int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
>>>  	inst->hfi_codec = to_codec_type(pixfmt);
>>>  	reinit_completion(&inst->done);
>>> 
>>> +	mutex_lock(&core->lock);
>>> +	ops = core->ops;
>>>  	ret = ops->session_init(inst, inst->session_type, inst->hfi_codec);
>>>  	if (ret)
>>>  		return ret;
>>> 
>>> +	mutex_unlock(&core->lock);
>>>  	ret = wait_session_msg(inst);
>>>  	if (ret)
>>>  		return ret;
>>> 
>> 

* Re: [RESEND 3/3] venus: handle use after free for iommu_map/iommu_unmap
  2020-08-07  6:24 ` [RESEND 3/3] venus: handle use after free for iommu_map/iommu_unmap Mansur Alisha Shaik
  2020-08-10  9:53   ` Stanimir Varbanov
@ 2020-08-25 13:13   ` Stanimir Varbanov
  1 sibling, 0 replies; 10+ messages in thread
From: Stanimir Varbanov @ 2020-08-25 13:13 UTC (permalink / raw)
  To: Mansur Alisha Shaik, linux-media, stanimir.varbanov
  Cc: linux-kernel, linux-arm-msm, vgarodia



On 8/7/20 9:24 AM, Mansur Alisha Shaik wrote:
> In concurrency use cases combined with reboot, the driver tries to
> map fw.iommu_domain after it has already been unmapped during
> shutdown. This causes a NULL pointer dereference crash.
> 
> This is handled by adding the necessary checks before
> unmapping and freeing the domain.
> 
> Call trace:
>  __iommu_map+0x4c/0x348
>  iommu_map+0x5c/0x70
>  venus_boot+0x184/0x230 [venus_core]
>  venus_sys_error_handler+0xa0/0x14c [venus_core]
>  process_one_work+0x210/0x3d0
>  worker_thread+0x248/0x3f4
>  kthread+0x11c/0x12c
>  ret_from_fork+0x10/0x18
> 
> Signed-off-by: Mansur Alisha Shaik <mansur@codeaurora.org>
> ---
>  drivers/media/platform/qcom/venus/firmware.c | 17 +++++++++++++----
>  1 file changed, 13 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/media/platform/qcom/venus/firmware.c b/drivers/media/platform/qcom/venus/firmware.c
> index 8801a6a..c427e88 100644
> --- a/drivers/media/platform/qcom/venus/firmware.c
> +++ b/drivers/media/platform/qcom/venus/firmware.c
> @@ -171,9 +171,14 @@ static int venus_shutdown_no_tz(struct venus_core *core)
>  
>  	iommu = core->fw.iommu_domain;
>  
> -	unmapped = iommu_unmap(iommu, VENUS_FW_START_ADDR, mapped);
> -	if (unmapped != mapped)
> -		dev_err(dev, "failed to unmap firmware\n");
> +	if (core->fw.mapped_mem_size && iommu) {
> +		unmapped = iommu_unmap(iommu, VENUS_FW_START_ADDR, mapped);
> +
> +		if (unmapped != mapped)
> +			dev_err(dev, "failed to unmap firmware\n");
> +		else
> +			core->fw.mapped_mem_size = 0;
> +	}
>  
>  	return 0;
>  }
> @@ -288,7 +293,11 @@ void venus_firmware_deinit(struct venus_core *core)
>  	iommu = core->fw.iommu_domain;
>  
>  	iommu_detach_device(iommu, core->fw.dev);
> -	iommu_domain_free(iommu);
> +
> +	if (core->fw.iommu_domain) {

why not just ?

	if (iommu)

> +		iommu_domain_free(iommu);
> +		core->fw.iommu_domain = NULL;
> +	}
>  
>  	platform_device_unregister(to_platform_device(core->fw.dev));
>  }
> 

-- 
regards,
Stan
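
The guard pattern discussed in this reply (test the local pointer, free once, then clear the stored pointer) can be sketched in plain C. This is a userspace illustration under stated assumptions, not the driver code: `fake_domain`, `fake_fw`, `fw_unmap()`, and `fw_deinit()` are hypothetical, and `free()` stands in for `iommu_domain_free()`. The sketch also moves the detach step inside the guard, which is an assumption of this example and not something the thread settles; in the posted hunk `iommu_detach_device()` still runs before the check.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct iommu_domain and the fw part of struct venus_core. */
struct fake_domain {
	int attached;
};

struct fake_fw {
	struct fake_domain *domain;	/* NULL once freed */
	size_t mapped_size;		/* 0 once unmapped */
};

/* Unmap only if something is mapped; safe to call repeatedly. */
static int fw_unmap(struct fake_fw *fw)
{
	assert(fw != NULL);

	if (!fw->domain || !fw->mapped_size)
		return 0;	/* nothing to do */

	fw->mapped_size = 0;
	return 1;		/* one unmap performed */
}

/* Free the domain at most once, testing the local copy of the pointer. */
static int fw_deinit(struct fake_fw *fw)
{
	struct fake_domain *domain = fw->domain;

	if (!domain)
		return 0;	/* already freed by an earlier caller */

	domain->attached = 0;	/* detach only a domain that still exists */
	free(domain);
	fw->domain = NULL;	/* later callers see "already freed" */
	return 1;
}
```

Calling either helper a second time is a no-op, which is the property the shutdown and recovery paths need when they can both reach the same teardown code.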

* Re: [RESEND 1/3] venus: core: handle race condititon for core ops
  2020-08-07  6:24 ` [RESEND 1/3] venus: core: handle race condititon for core ops Mansur Alisha Shaik
  2020-08-10  9:50   ` Stanimir Varbanov
@ 2020-09-10 10:43   ` Stanimir Varbanov
  1 sibling, 0 replies; 10+ messages in thread
From: Stanimir Varbanov @ 2020-09-10 10:43 UTC (permalink / raw)
  To: Mansur Alisha Shaik, linux-media, stanimir.varbanov
  Cc: linux-kernel, linux-arm-msm, vgarodia

Hi Mansur,

On 8/7/20 9:24 AM, Mansur Alisha Shaik wrote:
> Core ops currently have write protection only; there is no
> read protection. As a result, in multithreaded and concurrent
> use cases one CPU core can read core->ops without waiting,
> which causes a NULL pointer dereference crash.
> 
> One such scenario is shown below: core->ops becomes NULL on
> one CPU core while another core is calling
> core->ops->session_init().
> 
> CPU: core-7:
> Call trace:
>  hfi_session_init+0x180/0x1dc [venus_core]

I thought more about this issue. I think we have to return an error from
hfi_session_init() when the driver is in the system error handler. In fact,
all userspace ioctls must fail with an error while we are in the recovery
state. What do you think?

>  vdec_queue_setup+0x9c/0x364 [venus_dec]
>  vb2_core_reqbufs+0x1e4/0x368 [videobuf2_common]
>  vb2_reqbufs+0x4c/0x64 [videobuf2_v4l2]
>  v4l2_m2m_reqbufs+0x50/0x84 [v4l2_mem2mem]
>  v4l2_m2m_ioctl_reqbufs+0x2c/0x38 [v4l2_mem2mem]
>  v4l_reqbufs+0x4c/0x5c
> __video_do_ioctl+0x2b0/0x39c
> 
> CPU: core-0:
> Call trace:
>  venus_shutdown+0x98/0xfc [venus_core]
>  venus_sys_error_handler+0x64/0x148 [venus_core]
>  process_one_work+0x210/0x3d0
>  worker_thread+0x248/0x3f4
>  kthread+0x11c/0x12c
> 
> Signed-off-by: Mansur Alisha Shaik <mansur@codeaurora.org>
> ---
>  drivers/media/platform/qcom/venus/core.c | 2 +-
>  drivers/media/platform/qcom/venus/hfi.c  | 5 ++++-
>  2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/media/platform/qcom/venus/core.c b/drivers/media/platform/qcom/venus/core.c
> index 203c653..fe99c83 100644
> --- a/drivers/media/platform/qcom/venus/core.c
> +++ b/drivers/media/platform/qcom/venus/core.c
> @@ -64,8 +64,8 @@ static void venus_sys_error_handler(struct work_struct *work)
>  	pm_runtime_get_sync(core->dev);
>  
>  	hfi_core_deinit(core, true);
> -	hfi_destroy(core);
>  	mutex_lock(&core->lock);
> +	hfi_destroy(core);
>  	venus_shutdown(core);
>  
>  	pm_runtime_put_sync(core->dev);
> diff --git a/drivers/media/platform/qcom/venus/hfi.c b/drivers/media/platform/qcom/venus/hfi.c
> index a211eb9..2eeb31f 100644
> --- a/drivers/media/platform/qcom/venus/hfi.c
> +++ b/drivers/media/platform/qcom/venus/hfi.c
> @@ -195,7 +195,7 @@ EXPORT_SYMBOL_GPL(hfi_session_create);
>  int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
>  {
>  	struct venus_core *core = inst->core;
> -	const struct hfi_ops *ops = core->ops;
> +	const struct hfi_ops *ops;
>  	int ret;
>  
>  	if (inst->state != INST_UNINIT)
> @@ -204,10 +204,13 @@ int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
>  	inst->hfi_codec = to_codec_type(pixfmt);
>  	reinit_completion(&inst->done);
>  
> +	mutex_lock(&core->lock);
> +	ops = core->ops;
>  	ret = ops->session_init(inst, inst->session_type, inst->hfi_codec);
>  	if (ret)
>  		return ret;
>  
> +	mutex_unlock(&core->lock);
>  	ret = wait_session_msg(inst);
>  	if (ret)
>  		return ret;
> 

-- 
regards,
Stan
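
The suggestion in this reply, failing session setup while recovery is in progress, could look roughly like the userspace sketch below. Everything in it is an assumption made for illustration: the `in_recovery` flag, the `recovery_begin()` and `recovery_end()` helpers, and the choice of `-EBUSY` are hypothetical and are not taken from the driver or from this thread.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver core. */
struct fake_core {
	pthread_mutex_t lock;
	bool in_recovery;	/* assumed flag, set by the error handler */
	int sessions;		/* number of sessions set up so far */
};

/* The error handler marks the start and end of recovery under the lock. */
static void recovery_begin(struct fake_core *core)
{
	pthread_mutex_lock(&core->lock);
	core->in_recovery = true;
	pthread_mutex_unlock(&core->lock);
}

static void recovery_end(struct fake_core *core)
{
	pthread_mutex_lock(&core->lock);
	core->in_recovery = false;
	pthread_mutex_unlock(&core->lock);
}

/* An ioctl-style entry point: refuse new work while recovery is running. */
static int session_init(struct fake_core *core)
{
	int ret = 0;

	assert(core != NULL);

	pthread_mutex_lock(&core->lock);
	if (core->in_recovery)
		ret = -EBUSY;	/* arbitrary error code chosen for this sketch */
	else
		core->sessions++;
	pthread_mutex_unlock(&core->lock);

	return ret;
}
```

Because the flag is read and written under the same lock, a caller either sees a fully usable core or gets an error back, and never touches state that the handler is in the middle of rebuilding.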
