All of lore.kernel.org
 help / color / mirror / Atom feed
* Possible case of Race in kobject_get_path()
@ 2022-03-30 10:11 Mukesh Ojha
  2022-03-30 10:16 ` Greg KH
  0 siblings, 1 reply; 9+ messages in thread
From: Mukesh Ojha @ 2022-03-30 10:11 UTC (permalink / raw)
  To: Rafael J. Wysocki, gregkh, linux-remoteproc

Hi All,

We are facing one issue where one driver (p1) is trying to register its 
device from driver probe
and from another path (p2) dev_set_name(new name) done from driver probe 
of the added device whose
new name length can be more than earlier done in (p1 path) which could 
result in redzone overwritten issue.

Can we get your suggestion here ? is this case of a race here ?

             p1                                                   p2

  device_register()
     kobject_get_path()
             =>  get_kobj_path_length
             (length is calculated from this path)

                                                  dev_set_name()

             => fill_kobj_path
               (path is copied here)


Thanks,
-Mukesh


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Possible case of Race in kobject_get_path()
  2022-03-30 10:11 Possible case of Race in kobject_get_path() Mukesh Ojha
@ 2022-03-30 10:16 ` Greg KH
       [not found]   ` <148f4e9b-0584-ccc1-1443-30d41190dfd3@quicinc.com>
  0 siblings, 1 reply; 9+ messages in thread
From: Greg KH @ 2022-03-30 10:16 UTC (permalink / raw)
  To: Mukesh Ojha; +Cc: Rafael J. Wysocki, linux-remoteproc

On Wed, Mar 30, 2022 at 03:41:04PM +0530, Mukesh Ojha wrote:
> Hi All,
> 
> We are facing one issue where one driver (p1) is trying to register its
> device from driver probe
> and from another path (p2) dev_set_name(new name) done from driver probe of
> the added device whose
> new name length can be more than earlier done in (p1 path) which could
> result in redzone overwritten issue.

I do not understand, what specific driver is doing this so that we can
see an example of this problem?

> Can we get your suggestion here ? is this case of a race here ?
> 
>             p1                                                   p2
> 
>  device_register()
>     kobject_get_path()
>             =>  get_kobj_path_length
>             (length is calculated from this path)
> 
>                                                  dev_set_name()
> 
>             => fill_kobj_path
>               (path is copied here)
> 

I can not understand this example, any specific code you can point me
at?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Possible case of Race in kobject_get_path()
       [not found]   ` <148f4e9b-0584-ccc1-1443-30d41190dfd3@quicinc.com>
@ 2022-03-30 11:01     ` Greg KH
  2022-03-30 11:14       ` Mukesh Ojha
  0 siblings, 1 reply; 9+ messages in thread
From: Greg KH @ 2022-03-30 11:01 UTC (permalink / raw)
  To: Mukesh Ojha; +Cc: Rafael J. Wysocki, linux-kernel

On Wed, Mar 30, 2022 at 04:14:11PM +0530, Mukesh Ojha wrote:
> 
> On 3/30/2022 3:46 PM, Greg KH wrote:
> > On Wed, Mar 30, 2022 at 03:41:04PM +0530, Mukesh Ojha wrote:
> > > Hi All,
> > > 
> > > We are facing one issue where one driver (p1) is trying to register its
> > > device from driver probe
> > > and from another path (p2) dev_set_name(new name) done from driver probe of
> > > the added device whose
> > > new name length can be more than earlier done in (p1 path) which could
> > > result in redzone overwritten issue.
> > I do not understand, what specific driver is doing this so that we can
> > see an example of this problem?
> 
> >>
> 
> trying to paste some logs of the race.
> 
> [ 14.235820][ T503] BUG kmalloc-128 (Tainted: G S O ): Left Redzone
> overwritten

What kernel version is this?

> [ 14.244189][ T503]
> -----------------------------------------------------------------------------
> [ 14.255241][ T503] INFO: 0xffffff87caae0f7d-0xffffff87caae0f7f
> @offset=3965. First byte 0x2f instead of 0xcc
> [ 14.265176][ T503] INFO: Allocated in kobject_get_path+0x58/0x100 age=1
> cpu=0 pid=503
> [ 14.273111][ T503] kobject_get_path+0x58/0x100
> [ 14.277747][ T503] kobject_uevent_env+0x120/0x744
> [ 14.282639][ T503] device_add+0x98c/0x1028
> [ 14.286925][ T503] *device_register+0x24/0x38*
> [ 14.291395][ T503] slim_alloc_device+0xdc/0x108 [slimbus]
> [ 14.296992][ T503] slim_register_controller+0x210/0x2ac [slimbus]
> [ 14.303291][ T503] qcom_slim_ngd_probe+0x3c/0x348 [slim_qcom_ngd_ctrl]
> [ 14.310007][ T503] platform_drv_probe+0x60/0x180
> [ 14.314812][ T503] really_probe+0x208/0xb64
> [ 14.319184][ T503] driver_probe_device+0x130/0x254
> [ 14.324159][ T503] __device_attach_driver+0x1e8/0x324
> [ 14.329399][ T503] __device_attach+0x170/0x364
> [ 14.334035][ T503] bus_probe_device+0x4c/0x164
> [ 14.338671][ T503] device_add+0xa60/0x1028
> [ 14.342957][ T503] platform_device_add+0x260/0x2c8
> [ 14.347941][ T503] qcom_slim_ngd_ctrl_probe+0x71c/0x804
> [slim_qcom_ngd_ctrl]
> [ 14.355177][ T503] INFO: Freed in kobject_uevent_env+0x210/0x744 age=1
> cpu=4 pid=518
> [ 14.363018][ T503] do_init_module+0xac/0x494
> [ 14.367475][ T503] load_module+0x35c0/0x3e54
> [ 14.371930][ T503] __se_sys_finit_module+0x188/0x1c8
> [ 14.377086][ T503] __arm64_sys_finit_module+0x20/0x30
> [ 14.382328][ T503] el0_svc_common+0xdc/0x294
> [ 14.386786][ T503] el0_svc+0x38/0x9c
> [ 14.390552][ T503] el0_sync_handler+0x8c/0xf0
> 
> 
> [ 14.490994*][ T503] Redzone ffffff87caae0f7**0*: cc cc cc cc cc cc cc cc cc
> cc cc cc cc 2f 64 65 .............*/de*
> [ 14.501185][ T503] Object ffffff87caae0f80: 76 69 63 65 73 2f 70 6c 61 74
> 66 6f 72 6d 2f 73 vices/platform/s
> [ 14.511376][ T503] Object ffffff87caae0f90: 6f 63 2f 33 33 34 30 30 30 30
> 2e 73 6c 69 6d 2f oc/3340000.slim/
> [ 14.521566][ T503] Object ffffff87caae0fa0: 71 63 6f 6d 2c 73 6c 69 6d 2d
> 6e 67 64 2e 31 2f qcom,slim-ngd.1/
> [ 14.531757][ T503] Object ffffff87caae0fb0: 62 74 66 6d 73 6c 69 6d 5f 73
> 6c 61 76 65 00 00 *btfmslim_slave*..
> 
> 
> 
> 499 static int btfm_slim_probe(struct slim_device *slim)

I do not see that function in Linus's tree right now.  Where does it
come from?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Possible case of Race in kobject_get_path()
  2022-03-30 11:01     ` Greg KH
@ 2022-03-30 11:14       ` Mukesh Ojha
  2022-03-30 11:19         ` Greg KH
  0 siblings, 1 reply; 9+ messages in thread
From: Mukesh Ojha @ 2022-03-30 11:14 UTC (permalink / raw)
  To: Greg KH; +Cc: Rafael J. Wysocki, linux-kernel


On 3/30/2022 4:31 PM, Greg KH wrote:
> On Wed, Mar 30, 2022 at 04:14:11PM +0530, Mukesh Ojha wrote:
>> On 3/30/2022 3:46 PM, Greg KH wrote:
>>> On Wed, Mar 30, 2022 at 03:41:04PM +0530, Mukesh Ojha wrote:
>>>> Hi All,
>>>>
>>>> We are facing one issue where one driver (p1) is trying to register its
>>>> device from driver probe
>>>> and from another path (p2) dev_set_name(new name) done from driver probe of
>>>> the added device whose
>>>> new name length can be more than earlier done in (p1 path) which could
>>>> result in redzone overwritten issue.
>>> I do not understand, what specific driver is doing this so that we can
>>> see an example of this problem?
>> trying to paste some logs of the race.
>>
>> [ 14.235820][ T503] BUG kmalloc-128 (Tainted: G S O ): Left Redzone
>> overwritten
> What kernel version is this?
5.10

>
>> [ 14.244189][ T503]
>> -----------------------------------------------------------------------------
>> [ 14.255241][ T503] INFO: 0xffffff87caae0f7d-0xffffff87caae0f7f
>> @offset=3965. First byte 0x2f instead of 0xcc
>> [ 14.265176][ T503] INFO: Allocated in kobject_get_path+0x58/0x100 age=1
>> cpu=0 pid=503
>> [ 14.273111][ T503] kobject_get_path+0x58/0x100
>> [ 14.277747][ T503] kobject_uevent_env+0x120/0x744
>> [ 14.282639][ T503] device_add+0x98c/0x1028
>> [ 14.286925][ T503] *device_register+0x24/0x38*
>> [ 14.291395][ T503] slim_alloc_device+0xdc/0x108 [slimbus]
>> [ 14.296992][ T503] slim_register_controller+0x210/0x2ac [slimbus]
>> [ 14.303291][ T503] qcom_slim_ngd_probe+0x3c/0x348 [slim_qcom_ngd_ctrl]
>> [ 14.310007][ T503] platform_drv_probe+0x60/0x180
>> [ 14.314812][ T503] really_probe+0x208/0xb64
>> [ 14.319184][ T503] driver_probe_device+0x130/0x254
>> [ 14.324159][ T503] __device_attach_driver+0x1e8/0x324
>> [ 14.329399][ T503] __device_attach+0x170/0x364
>> [ 14.334035][ T503] bus_probe_device+0x4c/0x164
>> [ 14.338671][ T503] device_add+0xa60/0x1028
>> [ 14.342957][ T503] platform_device_add+0x260/0x2c8
>> [ 14.347941][ T503] qcom_slim_ngd_ctrl_probe+0x71c/0x804
>> [slim_qcom_ngd_ctrl]
>> [ 14.355177][ T503] INFO: Freed in kobject_uevent_env+0x210/0x744 age=1
>> cpu=4 pid=518
>> [ 14.363018][ T503] do_init_module+0xac/0x494
>> [ 14.367475][ T503] load_module+0x35c0/0x3e54
>> [ 14.371930][ T503] __se_sys_finit_module+0x188/0x1c8
>> [ 14.377086][ T503] __arm64_sys_finit_module+0x20/0x30
>> [ 14.382328][ T503] el0_svc_common+0xdc/0x294
>> [ 14.386786][ T503] el0_svc+0x38/0x9c
>> [ 14.390552][ T503] el0_sync_handler+0x8c/0xf0
>>
>>
>> [ 14.490994*][ T503] Redzone ffffff87caae0f7**0*: cc cc cc cc cc cc cc cc cc
>> cc cc cc cc 2f 64 65 .............*/de*
>> [ 14.501185][ T503] Object ffffff87caae0f80: 76 69 63 65 73 2f 70 6c 61 74
>> 66 6f 72 6d 2f 73 vices/platform/s
>> [ 14.511376][ T503] Object ffffff87caae0f90: 6f 63 2f 33 33 34 30 30 30 30
>> 2e 73 6c 69 6d 2f oc/3340000.slim/
>> [ 14.521566][ T503] Object ffffff87caae0fa0: 71 63 6f 6d 2c 73 6c 69 6d 2d
>> 6e 67 64 2e 31 2f qcom,slim-ngd.1/
>> [ 14.531757][ T503] Object ffffff87caae0fb0: 62 74 66 6d 73 6c 69 6d 5f 73
>> 6c 61 76 65 00 00 *btfmslim_slave*..
>>
>>
>>
>> 499 static int btfm_slim_probe(struct slim_device *slim)
> I do not see that function in Linus's tree right now.  Where does it
> come from?
Don't you think, it can be caused by any kernel module.

>
> thanks,
>
> greg k-h

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Possible case of Race in kobject_get_path()
  2022-03-30 11:14       ` Mukesh Ojha
@ 2022-03-30 11:19         ` Greg KH
  2022-03-30 11:23           ` Mukesh Ojha
  0 siblings, 1 reply; 9+ messages in thread
From: Greg KH @ 2022-03-30 11:19 UTC (permalink / raw)
  To: Mukesh Ojha; +Cc: Rafael J. Wysocki, linux-kernel

On Wed, Mar 30, 2022 at 04:44:26PM +0530, Mukesh Ojha wrote:
> 
> On 3/30/2022 4:31 PM, Greg KH wrote:
> > On Wed, Mar 30, 2022 at 04:14:11PM +0530, Mukesh Ojha wrote:
> > > On 3/30/2022 3:46 PM, Greg KH wrote:
> > > > On Wed, Mar 30, 2022 at 03:41:04PM +0530, Mukesh Ojha wrote:
> > > > > Hi All,
> > > > > 
> > > > > We are facing one issue where one driver (p1) is trying to register its
> > > > > device from driver probe
> > > > > and from another path (p2) dev_set_name(new name) done from driver probe of
> > > > > the added device whose
> > > > > new name length can be more than earlier done in (p1 path) which could
> > > > > result in redzone overwritten issue.
> > > > I do not understand, what specific driver is doing this so that we can
> > > > see an example of this problem?
> > > trying to paste some logs of the race.
> > > 
> > > [ 14.235820][ T503] BUG kmalloc-128 (Tainted: G S O ): Left Redzone
> > > overwritten
> > What kernel version is this?
> 5.10

That is very old, is this an issue in 5.17?

> > > [ 14.244189][ T503]
> > > -----------------------------------------------------------------------------
> > > [ 14.255241][ T503] INFO: 0xffffff87caae0f7d-0xffffff87caae0f7f
> > > @offset=3965. First byte 0x2f instead of 0xcc
> > > [ 14.265176][ T503] INFO: Allocated in kobject_get_path+0x58/0x100 age=1
> > > cpu=0 pid=503
> > > [ 14.273111][ T503] kobject_get_path+0x58/0x100
> > > [ 14.277747][ T503] kobject_uevent_env+0x120/0x744
> > > [ 14.282639][ T503] device_add+0x98c/0x1028
> > > [ 14.286925][ T503] *device_register+0x24/0x38*
> > > [ 14.291395][ T503] slim_alloc_device+0xdc/0x108 [slimbus]
> > > [ 14.296992][ T503] slim_register_controller+0x210/0x2ac [slimbus]
> > > [ 14.303291][ T503] qcom_slim_ngd_probe+0x3c/0x348 [slim_qcom_ngd_ctrl]
> > > [ 14.310007][ T503] platform_drv_probe+0x60/0x180
> > > [ 14.314812][ T503] really_probe+0x208/0xb64
> > > [ 14.319184][ T503] driver_probe_device+0x130/0x254
> > > [ 14.324159][ T503] __device_attach_driver+0x1e8/0x324
> > > [ 14.329399][ T503] __device_attach+0x170/0x364
> > > [ 14.334035][ T503] bus_probe_device+0x4c/0x164
> > > [ 14.338671][ T503] device_add+0xa60/0x1028
> > > [ 14.342957][ T503] platform_device_add+0x260/0x2c8
> > > [ 14.347941][ T503] qcom_slim_ngd_ctrl_probe+0x71c/0x804
> > > [slim_qcom_ngd_ctrl]
> > > [ 14.355177][ T503] INFO: Freed in kobject_uevent_env+0x210/0x744 age=1
> > > cpu=4 pid=518
> > > [ 14.363018][ T503] do_init_module+0xac/0x494
> > > [ 14.367475][ T503] load_module+0x35c0/0x3e54
> > > [ 14.371930][ T503] __se_sys_finit_module+0x188/0x1c8
> > > [ 14.377086][ T503] __arm64_sys_finit_module+0x20/0x30
> > > [ 14.382328][ T503] el0_svc_common+0xdc/0x294
> > > [ 14.386786][ T503] el0_svc+0x38/0x9c
> > > [ 14.390552][ T503] el0_sync_handler+0x8c/0xf0
> > > 
> > > 
> > > [ 14.490994*][ T503] Redzone ffffff87caae0f7**0*: cc cc cc cc cc cc cc cc cc
> > > cc cc cc cc 2f 64 65 .............*/de*
> > > [ 14.501185][ T503] Object ffffff87caae0f80: 76 69 63 65 73 2f 70 6c 61 74
> > > 66 6f 72 6d 2f 73 vices/platform/s
> > > [ 14.511376][ T503] Object ffffff87caae0f90: 6f 63 2f 33 33 34 30 30 30 30
> > > 2e 73 6c 69 6d 2f oc/3340000.slim/
> > > [ 14.521566][ T503] Object ffffff87caae0fa0: 71 63 6f 6d 2c 73 6c 69 6d 2d
> > > 6e 67 64 2e 31 2f qcom,slim-ngd.1/
> > > [ 14.531757][ T503] Object ffffff87caae0fb0: 62 74 66 6d 73 6c 69 6d 5f 73
> > > 6c 61 76 65 00 00 *btfmslim_slave*..
> > > 
> > > 
> > > 
> > > 499 static int btfm_slim_probe(struct slim_device *slim)
> > I do not see that function in Linus's tree right now.  Where does it
> > come from?
> Don't you think, it can be caused by any kernel module.

Kernel code do not protect themselves against other kernel code doing
abusive things.

Perhaps fix your out-of-tree code to not do this if no in-kernel code is
doing this?  Nothing we can do at all about out-of-tree code that we
have never seen.

Best of luck!

greg k-h

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Possible case of Race in kobject_get_path()
  2022-03-30 11:19         ` Greg KH
@ 2022-03-30 11:23           ` Mukesh Ojha
  2022-03-30 11:31             ` Greg KH
  0 siblings, 1 reply; 9+ messages in thread
From: Mukesh Ojha @ 2022-03-30 11:23 UTC (permalink / raw)
  To: Greg KH; +Cc: Rafael J. Wysocki, linux-kernel


On 3/30/2022 4:49 PM, Greg KH wrote:
> On Wed, Mar 30, 2022 at 04:44:26PM +0530, Mukesh Ojha wrote:
>> On 3/30/2022 4:31 PM, Greg KH wrote:
>>> On Wed, Mar 30, 2022 at 04:14:11PM +0530, Mukesh Ojha wrote:
>>>> On 3/30/2022 3:46 PM, Greg KH wrote:
>>>>> On Wed, Mar 30, 2022 at 03:41:04PM +0530, Mukesh Ojha wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> We are facing one issue where one driver (p1) is trying to register its
>>>>>> device from driver probe
>>>>>> and from another path (p2) dev_set_name(new name) done from driver probe of
>>>>>> the added device whose
>>>>>> new name length can be more than earlier done in (p1 path) which could
>>>>>> result in redzone overwritten issue.
>>>>> I do not understand, what specific driver is doing this so that we can
>>>>> see an example of this problem?
>>>> trying to paste some logs of the race.
>>>>
>>>> [ 14.235820][ T503] BUG kmalloc-128 (Tainted: G S O ): Left Redzone
>>>> overwritten
>>> What kernel version is this?
>> 5.10
> That is very old, is this an issue in 5.17?
>
>>>> [ 14.244189][ T503]
>>>> -----------------------------------------------------------------------------
>>>> [ 14.255241][ T503] INFO: 0xffffff87caae0f7d-0xffffff87caae0f7f
>>>> @offset=3965. First byte 0x2f instead of 0xcc
>>>> [ 14.265176][ T503] INFO: Allocated in kobject_get_path+0x58/0x100 age=1
>>>> cpu=0 pid=503
>>>> [ 14.273111][ T503] kobject_get_path+0x58/0x100
>>>> [ 14.277747][ T503] kobject_uevent_env+0x120/0x744
>>>> [ 14.282639][ T503] device_add+0x98c/0x1028
>>>> [ 14.286925][ T503] *device_register+0x24/0x38*
>>>> [ 14.291395][ T503] slim_alloc_device+0xdc/0x108 [slimbus]
>>>> [ 14.296992][ T503] slim_register_controller+0x210/0x2ac [slimbus]
>>>> [ 14.303291][ T503] qcom_slim_ngd_probe+0x3c/0x348 [slim_qcom_ngd_ctrl]
>>>> [ 14.310007][ T503] platform_drv_probe+0x60/0x180
>>>> [ 14.314812][ T503] really_probe+0x208/0xb64
>>>> [ 14.319184][ T503] driver_probe_device+0x130/0x254
>>>> [ 14.324159][ T503] __device_attach_driver+0x1e8/0x324
>>>> [ 14.329399][ T503] __device_attach+0x170/0x364
>>>> [ 14.334035][ T503] bus_probe_device+0x4c/0x164
>>>> [ 14.338671][ T503] device_add+0xa60/0x1028
>>>> [ 14.342957][ T503] platform_device_add+0x260/0x2c8
>>>> [ 14.347941][ T503] qcom_slim_ngd_ctrl_probe+0x71c/0x804
>>>> [slim_qcom_ngd_ctrl]
>>>> [ 14.355177][ T503] INFO: Freed in kobject_uevent_env+0x210/0x744 age=1
>>>> cpu=4 pid=518
>>>> [ 14.363018][ T503] do_init_module+0xac/0x494
>>>> [ 14.367475][ T503] load_module+0x35c0/0x3e54
>>>> [ 14.371930][ T503] __se_sys_finit_module+0x188/0x1c8
>>>> [ 14.377086][ T503] __arm64_sys_finit_module+0x20/0x30
>>>> [ 14.382328][ T503] el0_svc_common+0xdc/0x294
>>>> [ 14.386786][ T503] el0_svc+0x38/0x9c
>>>> [ 14.390552][ T503] el0_sync_handler+0x8c/0xf0
>>>>
>>>>
>>>> [ 14.490994*][ T503] Redzone ffffff87caae0f7**0*: cc cc cc cc cc cc cc cc cc
>>>> cc cc cc cc 2f 64 65 .............*/de*
>>>> [ 14.501185][ T503] Object ffffff87caae0f80: 76 69 63 65 73 2f 70 6c 61 74
>>>> 66 6f 72 6d 2f 73 vices/platform/s
>>>> [ 14.511376][ T503] Object ffffff87caae0f90: 6f 63 2f 33 33 34 30 30 30 30
>>>> 2e 73 6c 69 6d 2f oc/3340000.slim/
>>>> [ 14.521566][ T503] Object ffffff87caae0fa0: 71 63 6f 6d 2c 73 6c 69 6d 2d
>>>> 6e 67 64 2e 31 2f qcom,slim-ngd.1/
>>>> [ 14.531757][ T503] Object ffffff87caae0fb0: 62 74 66 6d 73 6c 69 6d 5f 73
>>>> 6c 61 76 65 00 00 *btfmslim_slave*..
>>>>
>>>>
>>>>
>>>> 499 static int btfm_slim_probe(struct slim_device *slim)
>>> I do not see that function in Linus's tree right now.  Where does it
>>> come from?
>> Don't you think, it can be caused by any kernel module.
> Kernel code do not protect themselves against other kernel code doing
> abusive things.
>
> Perhaps fix your out-of-tree code to not do this if no in-kernel code is
> doing this?  Nothing we can do at all about out-of-tree code that we
> have never seen.


Thanks for the comment, Greg.
Why do we make dev_set_name() make it available to module apart from the 
core ?

-Mukesh

>
> Best of luck!
>
> greg k-h

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Possible case of Race in kobject_get_path()
  2022-03-30 11:23           ` Mukesh Ojha
@ 2022-03-30 11:31             ` Greg KH
  2022-03-30 12:55               ` Mukesh Ojha
  0 siblings, 1 reply; 9+ messages in thread
From: Greg KH @ 2022-03-30 11:31 UTC (permalink / raw)
  To: Mukesh Ojha; +Cc: Rafael J. Wysocki, linux-kernel

On Wed, Mar 30, 2022 at 04:53:41PM +0530, Mukesh Ojha wrote:
> 
> On 3/30/2022 4:49 PM, Greg KH wrote:
> > On Wed, Mar 30, 2022 at 04:44:26PM +0530, Mukesh Ojha wrote:
> > > On 3/30/2022 4:31 PM, Greg KH wrote:
> > > > On Wed, Mar 30, 2022 at 04:14:11PM +0530, Mukesh Ojha wrote:
> > > > > On 3/30/2022 3:46 PM, Greg KH wrote:
> > > > > > On Wed, Mar 30, 2022 at 03:41:04PM +0530, Mukesh Ojha wrote:
> > > > > > > Hi All,
> > > > > > > 
> > > > > > > We are facing one issue where one driver (p1) is trying to register its
> > > > > > > device from driver probe
> > > > > > > and from another path (p2) dev_set_name(new name) done from driver probe of
> > > > > > > the added device whose
> > > > > > > new name length can be more than earlier done in (p1 path) which could
> > > > > > > result in redzone overwritten issue.
> > > > > > I do not understand, what specific driver is doing this so that we can
> > > > > > see an example of this problem?
> > > > > trying to paste some logs of the race.
> > > > > 
> > > > > [ 14.235820][ T503] BUG kmalloc-128 (Tainted: G S O ): Left Redzone
> > > > > overwritten
> > > > What kernel version is this?
> > > 5.10
> > That is very old, is this an issue in 5.17?
> > 
> > > > > [ 14.244189][ T503]
> > > > > -----------------------------------------------------------------------------
> > > > > [ 14.255241][ T503] INFO: 0xffffff87caae0f7d-0xffffff87caae0f7f
> > > > > @offset=3965. First byte 0x2f instead of 0xcc
> > > > > [ 14.265176][ T503] INFO: Allocated in kobject_get_path+0x58/0x100 age=1
> > > > > cpu=0 pid=503
> > > > > [ 14.273111][ T503] kobject_get_path+0x58/0x100
> > > > > [ 14.277747][ T503] kobject_uevent_env+0x120/0x744
> > > > > [ 14.282639][ T503] device_add+0x98c/0x1028
> > > > > [ 14.286925][ T503] *device_register+0x24/0x38*
> > > > > [ 14.291395][ T503] slim_alloc_device+0xdc/0x108 [slimbus]
> > > > > [ 14.296992][ T503] slim_register_controller+0x210/0x2ac [slimbus]
> > > > > [ 14.303291][ T503] qcom_slim_ngd_probe+0x3c/0x348 [slim_qcom_ngd_ctrl]
> > > > > [ 14.310007][ T503] platform_drv_probe+0x60/0x180
> > > > > [ 14.314812][ T503] really_probe+0x208/0xb64
> > > > > [ 14.319184][ T503] driver_probe_device+0x130/0x254
> > > > > [ 14.324159][ T503] __device_attach_driver+0x1e8/0x324
> > > > > [ 14.329399][ T503] __device_attach+0x170/0x364
> > > > > [ 14.334035][ T503] bus_probe_device+0x4c/0x164
> > > > > [ 14.338671][ T503] device_add+0xa60/0x1028
> > > > > [ 14.342957][ T503] platform_device_add+0x260/0x2c8
> > > > > [ 14.347941][ T503] qcom_slim_ngd_ctrl_probe+0x71c/0x804
> > > > > [slim_qcom_ngd_ctrl]
> > > > > [ 14.355177][ T503] INFO: Freed in kobject_uevent_env+0x210/0x744 age=1
> > > > > cpu=4 pid=518
> > > > > [ 14.363018][ T503] do_init_module+0xac/0x494
> > > > > [ 14.367475][ T503] load_module+0x35c0/0x3e54
> > > > > [ 14.371930][ T503] __se_sys_finit_module+0x188/0x1c8
> > > > > [ 14.377086][ T503] __arm64_sys_finit_module+0x20/0x30
> > > > > [ 14.382328][ T503] el0_svc_common+0xdc/0x294
> > > > > [ 14.386786][ T503] el0_svc+0x38/0x9c
> > > > > [ 14.390552][ T503] el0_sync_handler+0x8c/0xf0
> > > > > 
> > > > > 
> > > > > [ 14.490994*][ T503] Redzone ffffff87caae0f7**0*: cc cc cc cc cc cc cc cc cc
> > > > > cc cc cc cc 2f 64 65 .............*/de*
> > > > > [ 14.501185][ T503] Object ffffff87caae0f80: 76 69 63 65 73 2f 70 6c 61 74
> > > > > 66 6f 72 6d 2f 73 vices/platform/s
> > > > > [ 14.511376][ T503] Object ffffff87caae0f90: 6f 63 2f 33 33 34 30 30 30 30
> > > > > 2e 73 6c 69 6d 2f oc/3340000.slim/
> > > > > [ 14.521566][ T503] Object ffffff87caae0fa0: 71 63 6f 6d 2c 73 6c 69 6d 2d
> > > > > 6e 67 64 2e 31 2f qcom,slim-ngd.1/
> > > > > [ 14.531757][ T503] Object ffffff87caae0fb0: 62 74 66 6d 73 6c 69 6d 5f 73
> > > > > 6c 61 76 65 00 00 *btfmslim_slave*..
> > > > > 
> > > > > 
> > > > > 
> > > > > 499 static int btfm_slim_probe(struct slim_device *slim)
> > > > I do not see that function in Linus's tree right now.  Where does it
> > > > come from?
> > > Don't you think, it can be caused by any kernel module.
> > Kernel code do not protect themselves against other kernel code doing
> > abusive things.
> > 
> > Perhaps fix your out-of-tree code to not do this if no in-kernel code is
> > doing this?  Nothing we can do at all about out-of-tree code that we
> > have never seen.
> 
> 
> Thanks for the comment, Greg.
> Why do we make dev_set_name() make it available to module apart from the
> core ?

Because busses can be in a module, just like you should be doing :)

Please submit your code upstream for review.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Possible case of Race in kobject_get_path()
  2022-03-30 11:31             ` Greg KH
@ 2022-03-30 12:55               ` Mukesh Ojha
  2022-03-30 14:22                 ` Greg KH
  0 siblings, 1 reply; 9+ messages in thread
From: Mukesh Ojha @ 2022-03-30 12:55 UTC (permalink / raw)
  To: Greg KH; +Cc: Rafael J. Wysocki, linux-kernel


On 3/30/2022 5:01 PM, Greg KH wrote:
> On Wed, Mar 30, 2022 at 04:53:41PM +0530, Mukesh Ojha wrote:
>> On 3/30/2022 4:49 PM, Greg KH wrote:
>>> On Wed, Mar 30, 2022 at 04:44:26PM +0530, Mukesh Ojha wrote:
>>>> On 3/30/2022 4:31 PM, Greg KH wrote:
>>>>> On Wed, Mar 30, 2022 at 04:14:11PM +0530, Mukesh Ojha wrote:
>>>>>> On 3/30/2022 3:46 PM, Greg KH wrote:
>>>>>>> On Wed, Mar 30, 2022 at 03:41:04PM +0530, Mukesh Ojha wrote:
>>>>>>>> Hi All,
>>>>>>>>
>>>>>>>> We are facing one issue where one driver (p1) is trying to register its
>>>>>>>> device from driver probe
>>>>>>>> and from another path (p2) dev_set_name(new name) done from driver probe of
>>>>>>>> the added device whose
>>>>>>>> new name length can be more than earlier done in (p1 path) which could
>>>>>>>> result in redzone overwritten issue.
>>>>>>> I do not understand, what specific driver is doing this so that we can
>>>>>>> see an example of this problem?
>>>>>> trying to paste some logs of the race.
>>>>>>
>>>>>> [ 14.235820][ T503] BUG kmalloc-128 (Tainted: G S O ): Left Redzone
>>>>>> overwritten
>>>>> What kernel version is this?
>>>> 5.10
>>> That is very old, is this an issue in 5.17?
>>>
>>>>>> [ 14.244189][ T503]
>>>>>> -----------------------------------------------------------------------------
>>>>>> [ 14.255241][ T503] INFO: 0xffffff87caae0f7d-0xffffff87caae0f7f
>>>>>> @offset=3965. First byte 0x2f instead of 0xcc
>>>>>> [ 14.265176][ T503] INFO: Allocated in kobject_get_path+0x58/0x100 age=1
>>>>>> cpu=0 pid=503
>>>>>> [ 14.273111][ T503] kobject_get_path+0x58/0x100
>>>>>> [ 14.277747][ T503] kobject_uevent_env+0x120/0x744
>>>>>> [ 14.282639][ T503] device_add+0x98c/0x1028
>>>>>> [ 14.286925][ T503] *device_register+0x24/0x38*
>>>>>> [ 14.291395][ T503] slim_alloc_device+0xdc/0x108 [slimbus]
>>>>>> [ 14.296992][ T503] slim_register_controller+0x210/0x2ac [slimbus]
>>>>>> [ 14.303291][ T503] qcom_slim_ngd_probe+0x3c/0x348 [slim_qcom_ngd_ctrl]
>>>>>> [ 14.310007][ T503] platform_drv_probe+0x60/0x180
>>>>>> [ 14.314812][ T503] really_probe+0x208/0xb64
>>>>>> [ 14.319184][ T503] driver_probe_device+0x130/0x254
>>>>>> [ 14.324159][ T503] __device_attach_driver+0x1e8/0x324
>>>>>> [ 14.329399][ T503] __device_attach+0x170/0x364
>>>>>> [ 14.334035][ T503] bus_probe_device+0x4c/0x164
>>>>>> [ 14.338671][ T503] device_add+0xa60/0x1028
>>>>>> [ 14.342957][ T503] platform_device_add+0x260/0x2c8
>>>>>> [ 14.347941][ T503] qcom_slim_ngd_ctrl_probe+0x71c/0x804
>>>>>> [slim_qcom_ngd_ctrl]
>>>>>> [ 14.355177][ T503] INFO: Freed in kobject_uevent_env+0x210/0x744 age=1
>>>>>> cpu=4 pid=518
>>>>>> [ 14.363018][ T503] do_init_module+0xac/0x494
>>>>>> [ 14.367475][ T503] load_module+0x35c0/0x3e54
>>>>>> [ 14.371930][ T503] __se_sys_finit_module+0x188/0x1c8
>>>>>> [ 14.377086][ T503] __arm64_sys_finit_module+0x20/0x30
>>>>>> [ 14.382328][ T503] el0_svc_common+0xdc/0x294
>>>>>> [ 14.386786][ T503] el0_svc+0x38/0x9c
>>>>>> [ 14.390552][ T503] el0_sync_handler+0x8c/0xf0
>>>>>>
>>>>>>
>>>>>> [ 14.490994*][ T503] Redzone ffffff87caae0f7**0*: cc cc cc cc cc cc cc cc cc
>>>>>> cc cc cc cc 2f 64 65 .............*/de*
>>>>>> [ 14.501185][ T503] Object ffffff87caae0f80: 76 69 63 65 73 2f 70 6c 61 74
>>>>>> 66 6f 72 6d 2f 73 vices/platform/s
>>>>>> [ 14.511376][ T503] Object ffffff87caae0f90: 6f 63 2f 33 33 34 30 30 30 30
>>>>>> 2e 73 6c 69 6d 2f oc/3340000.slim/
>>>>>> [ 14.521566][ T503] Object ffffff87caae0fa0: 71 63 6f 6d 2c 73 6c 69 6d 2d
>>>>>> 6e 67 64 2e 31 2f qcom,slim-ngd.1/
>>>>>> [ 14.531757][ T503] Object ffffff87caae0fb0: 62 74 66 6d 73 6c 69 6d 5f 73
>>>>>> 6c 61 76 65 00 00 *btfmslim_slave*..
>>>>>>
>>>>>>
>>>>>>
>>>>>> 499 static int btfm_slim_probe(struct slim_device *slim)
>>>>> I do not see that function in Linus's tree right now.  Where does it
>>>>> come from?
>>>> Don't you think, it can be caused by any kernel module.
>>> Kernel code do not protect themselves against other kernel code doing
>>> abusive things.
>>>
>>> Perhaps fix your out-of-tree code to not do this if no in-kernel code is
>>> doing this?  Nothing we can do at all about out-of-tree code that we
>>> have never seen.
>>
>> Thanks for the comment, Greg.
>> Why do we make dev_set_name() make it available to module apart from the
>> core ?
> Because busses can be in a module, just like you should be doing :)
>
> Please submit your code upstream for review.

I still did not get why one should not do device name change if 
dev_set_name() is available for anyone to do name change while 
kobject_get_path() is running?
This is possible race, if we know this is the limitation/issue and would 
be difficult to address , that is fine and should be documented.

Thanks,
Mukesh

>
> thanks,
>
> greg k-h

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Possible case of Race in kobject_get_path()
  2022-03-30 12:55               ` Mukesh Ojha
@ 2022-03-30 14:22                 ` Greg KH
  0 siblings, 0 replies; 9+ messages in thread
From: Greg KH @ 2022-03-30 14:22 UTC (permalink / raw)
  To: Mukesh Ojha; +Cc: Rafael J. Wysocki, linux-kernel

On Wed, Mar 30, 2022 at 06:25:29PM +0530, Mukesh Ojha wrote:
> 
> On 3/30/2022 5:01 PM, Greg KH wrote:
> > On Wed, Mar 30, 2022 at 04:53:41PM +0530, Mukesh Ojha wrote:
> > > On 3/30/2022 4:49 PM, Greg KH wrote:
> > > > On Wed, Mar 30, 2022 at 04:44:26PM +0530, Mukesh Ojha wrote:
> > > > > On 3/30/2022 4:31 PM, Greg KH wrote:
> > > > > > On Wed, Mar 30, 2022 at 04:14:11PM +0530, Mukesh Ojha wrote:
> > > > > > > On 3/30/2022 3:46 PM, Greg KH wrote:
> > > > > > > > On Wed, Mar 30, 2022 at 03:41:04PM +0530, Mukesh Ojha wrote:
> > > > > > > > > Hi All,
> > > > > > > > > 
> > > > > > > > > We are facing one issue where one driver (p1) is trying to register its
> > > > > > > > > device from driver probe
> > > > > > > > > and from another path (p2) dev_set_name(new name) done from driver probe of
> > > > > > > > > the added device whose
> > > > > > > > > new name length can be more than earlier done in (p1 path) which could
> > > > > > > > > result in redzone overwritten issue.
> > > > > > > > I do not understand, what specific driver is doing this so that we can
> > > > > > > > see an example of this problem?
> > > > > > > trying to paste some logs of the race.
> > > > > > > 
> > > > > > > [ 14.235820][ T503] BUG kmalloc-128 (Tainted: G S O ): Left Redzone
> > > > > > > overwritten
> > > > > > What kernel version is this?
> > > > > 5.10
> > > > That is very old, is this an issue in 5.17?
> > > > 
> > > > > > > [ 14.244189][ T503]
> > > > > > > -----------------------------------------------------------------------------
> > > > > > > [ 14.255241][ T503] INFO: 0xffffff87caae0f7d-0xffffff87caae0f7f
> > > > > > > @offset=3965. First byte 0x2f instead of 0xcc
> > > > > > > [ 14.265176][ T503] INFO: Allocated in kobject_get_path+0x58/0x100 age=1
> > > > > > > cpu=0 pid=503
> > > > > > > [ 14.273111][ T503] kobject_get_path+0x58/0x100
> > > > > > > [ 14.277747][ T503] kobject_uevent_env+0x120/0x744
> > > > > > > [ 14.282639][ T503] device_add+0x98c/0x1028
> > > > > > > [ 14.286925][ T503] *device_register+0x24/0x38*
> > > > > > > [ 14.291395][ T503] slim_alloc_device+0xdc/0x108 [slimbus]
> > > > > > > [ 14.296992][ T503] slim_register_controller+0x210/0x2ac [slimbus]
> > > > > > > [ 14.303291][ T503] qcom_slim_ngd_probe+0x3c/0x348 [slim_qcom_ngd_ctrl]
> > > > > > > [ 14.310007][ T503] platform_drv_probe+0x60/0x180
> > > > > > > [ 14.314812][ T503] really_probe+0x208/0xb64
> > > > > > > [ 14.319184][ T503] driver_probe_device+0x130/0x254
> > > > > > > [ 14.324159][ T503] __device_attach_driver+0x1e8/0x324
> > > > > > > [ 14.329399][ T503] __device_attach+0x170/0x364
> > > > > > > [ 14.334035][ T503] bus_probe_device+0x4c/0x164
> > > > > > > [ 14.338671][ T503] device_add+0xa60/0x1028
> > > > > > > [ 14.342957][ T503] platform_device_add+0x260/0x2c8
> > > > > > > [ 14.347941][ T503] qcom_slim_ngd_ctrl_probe+0x71c/0x804
> > > > > > > [slim_qcom_ngd_ctrl]
> > > > > > > [ 14.355177][ T503] INFO: Freed in kobject_uevent_env+0x210/0x744 age=1
> > > > > > > cpu=4 pid=518
> > > > > > > [ 14.363018][ T503] do_init_module+0xac/0x494
> > > > > > > [ 14.367475][ T503] load_module+0x35c0/0x3e54
> > > > > > > [ 14.371930][ T503] __se_sys_finit_module+0x188/0x1c8
> > > > > > > [ 14.377086][ T503] __arm64_sys_finit_module+0x20/0x30
> > > > > > > [ 14.382328][ T503] el0_svc_common+0xdc/0x294
> > > > > > > [ 14.386786][ T503] el0_svc+0x38/0x9c
> > > > > > > [ 14.390552][ T503] el0_sync_handler+0x8c/0xf0
> > > > > > > 
> > > > > > > 
> > > > > > > [ 14.490994*][ T503] Redzone ffffff87caae0f7**0*: cc cc cc cc cc cc cc cc cc
> > > > > > > cc cc cc cc 2f 64 65 .............*/de*
> > > > > > > [ 14.501185][ T503] Object ffffff87caae0f80: 76 69 63 65 73 2f 70 6c 61 74
> > > > > > > 66 6f 72 6d 2f 73 vices/platform/s
> > > > > > > [ 14.511376][ T503] Object ffffff87caae0f90: 6f 63 2f 33 33 34 30 30 30 30
> > > > > > > 2e 73 6c 69 6d 2f oc/3340000.slim/
> > > > > > > [ 14.521566][ T503] Object ffffff87caae0fa0: 71 63 6f 6d 2c 73 6c 69 6d 2d
> > > > > > > 6e 67 64 2e 31 2f qcom,slim-ngd.1/
> > > > > > > [ 14.531757][ T503] Object ffffff87caae0fb0: 62 74 66 6d 73 6c 69 6d 5f 73
> > > > > > > 6c 61 76 65 00 00 *btfmslim_slave*..
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 499 static int btfm_slim_probe(struct slim_device *slim)
> > > > > > I do not see that function in Linus's tree right now.  Where does it
> > > > > > come from?
> > > > > Don't you think, it can be caused by any kernel module.
> > > > Kernel code do not protect themselves against other kernel code doing
> > > > abusive things.
> > > > 
> > > > Perhaps fix your out-of-tree code to not do this if no in-kernel code is
> > > > doing this?  Nothing we can do at all about out-of-tree code that we
> > > > have never seen.
> > > 
> > > Thanks for the comment, Greg.
> > > Why do we make dev_set_name() make it available to module apart from the
> > > core ?
> > Because busses can be in a module, just like you should be doing :)
> > 
> > Please submit your code upstream for review.
> 
> I still did not get why one should not do device name change if
> dev_set_name() is available for anyone to do name change while
> kobject_get_path() is running?

I do not see where in the kernel today this is happening.  What code
changes the name of a device like this?

> This is possible race, if we know this is the limitation/issue and would be
> difficult to address , that is fine and should be documented.

Again, why are you doing this?  What needs to change the device name in
this way and what is it racing with?

Without code examples, speculating about all of this is quite difficult
to discuss.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-03-30 14:23 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-30 10:11 Possible case of Race in kobject_get_path() Mukesh Ojha
2022-03-30 10:16 ` Greg KH
     [not found]   ` <148f4e9b-0584-ccc1-1443-30d41190dfd3@quicinc.com>
2022-03-30 11:01     ` Greg KH
2022-03-30 11:14       ` Mukesh Ojha
2022-03-30 11:19         ` Greg KH
2022-03-30 11:23           ` Mukesh Ojha
2022-03-30 11:31             ` Greg KH
2022-03-30 12:55               ` Mukesh Ojha
2022-03-30 14:22                 ` Greg KH

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.