Re: [RFC PATCH v2 1/2] fpga: add a module owner field to fpga_manager and fpga_manager_ops

From: Marco Pagani <marpagan@redhat.com>
To: Xu Yilun <yilun.xu@linux.intel.com>
Cc: Moritz Fischer <mdf@kernel.org>, Wu Hao <hao.wu@intel.com>,
	Xu Yilun <yilun.xu@intel.com>, Tom Rix <trix@redhat.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	linux-kernel@vger.kernel.org, linux-fpga@vger.kernel.org
Subject: Re: [RFC PATCH v2 1/2] fpga: add a module owner field to fpga_manager and fpga_manager_ops
Date: Sat, 16 Dec 2023 10:35:58 +0100	[thread overview]
Message-ID: <bd9b5122-788c-42dd-bbd4-c5a8323adfde@redhat.com> (raw)
In-Reply-To: <ZXSHVguFadvfaUcO@yilunxu-OptiPlex-7050>

On 2023-12-09 16:27, Xu Yilun wrote:
>>>>> I feel the scope of the protection is unclear to me in this patch. What
>>>>> data should be protected from concurrent access by this mutex? From the
>>>>> code seems the racing of mgr dev should be protected but apparently it
>>>>> doesn't have to.
>>>>
>>>> The mutex is there to ensure the lifetime of the manager device and
>>>> its context (struct fpga_manager) if fpga_mgr_get() happens to run
>>>> concurrently with the removal of the low-level module.
>>>>
>>>>>
>>>>> And with this mutex, the get/put/unregister() for one mgr should be
>>>>> exclusive with another mgr, but that also seems not necessary.
>>>>>
>>>>
>>>> I decided to use a static mutex because I thought putting the mutex
>>>> in the manager's context would be unsafe since its life would be tied
>>>> to the manager's life. For instance, consider the following sequence:
>>>>
>>>> - A removes the low-level control module, and delete_module progresses
>>>> up to the point when it calls the module's exit function, which in turn
>>>> calls fpga_mgr_unregister().
>>>>
>>>> - fpga_mgr_unregister() takes the mutex but gets descheduled before
>>>> completing the unregistering of the manager device.
>>>>
>>>> - Meanwhile, B wants to get the manager (it is still there) and calls
>>>> fpga_mgr_get(), which tries to take the mutex but gets suspended since
>>>> it is held by A.
>>>>
>>>> - A resumes and fpga_mgr_unregister() releases the manager device and
>>>
>>> The lifecycle of the manager device is not entirely decided by
>>> fpga_mgr_unregister(), this func just puts/decreases the device
>>> refcount.
>>
>> Right, but here I'm considering the case where no one has previously
>> taken the manager device. In that specific case, the refcount will be
> 
> I don't think this is valid, anyone should firstly get the manager
> device via get_device() then try to access its context.
> 
>> decremented to zero, and the device (with its context) will be released.
> 
> If no one has taken the manager device, the device & its context are
> safe to be released.
> 
>>
>> However, you got me thinking about how the mutex is causing the problem
>> in this case.
>>
>>>
>>> Remember fpga_mgr_get() gets the device via
>>> class_find_device()->get_device(). I assume if the valid device pointer
>>> could be returned by class_find_device(), it would never be released by
>>> other nice players. So I have no worry about the per manager mutex.
>>>
>>>> its context, including the mutex on which B is suspended.
>>>>
>>>> We could mitigate this specific case using mutex_trylock(). However,
>>>> there will still be other problematic cases, like if fpga_mgr_get()
>>>> gets suspended right before taking the mutex and then delete_module
>>>> proceeds up to when fpga_mgr_unregister() frees the manager device
>>>> and its context.
>>>>
>>>>> I think the mgr->owner & mgr->ops should be protected from concurrent
>>>>> access of delete_module & fpga_mgr_get/put(), so how about:
>>>>>
>>>>> struct fpga_manager_ops {
>>>>> 	struct module *owner;
>>>>> 	...
>>>>> };
>>>>>
>>>>> struct fpga_manager {
>>>>> 	...
>>>>> 	struct mutex mops_lock;
>>>>> 	const struct fpga_manager_ops *mops;
>>>>> 	...
>>>>> };
>>>>>
>>>>>
>>>>> static struct fpga_manager *__fpga_mgr_get(struct device *dev)
>>>>> {
>>>>> 	struct fpga_manager *mgr;
>>>>>
>>>>> 	mgr = to_fpga_manager(dev);
>>>>>
>>>>> 	mutex_lock(&mgr->mops_lock);
>>>>>
>>>>> 	if (!mgr->mops || !try_module_get(mgr->mops->owner))
>>>>> 		mgr = ERR_PTR(-ENODEV);
>>>>>
>>>>> 	mutex_unlock(&mgr->mops_lock);
>>>>> 		
>>>>> 	return mgr;
>>>>> }
>>>>>
>>>>> void fpga_mgr_unregister(struct fpga_manager *mgr)
>>>>> {
>>>>> 	fpga_mgr_fpga_remove(mgr);	
>>>>>
>>>>> 	mutex_lock(&mgr->ops_lock);
>>>>> 	mgr->mops = NULL;
>>>>> 	mutex_unlock(&mgr->ops_lock);
>>>>>
>>>>> 	device_unregister(&mgr->dev);	
>>>>> }
>>>>>
>>>>> Not actually tested.
>>>>>
>>>>
>>>> I think protecting the only the ops is not enough for the same reasons.
>>>> If fpga_mgr_get() gets suspended right after class_find_device(),and
>>>> meanwhile the low-level module is removed, it resumes with a reference
>>>> to a manager device that no longer exists.
>>>>
>>>> In a certain sense, however, using a mutex seems like a mitigation
>>>> that does not solve the problem at its root. I honestly still think
>>>> that taking the module's refcount right when registering the manager
>>>> is the only way that is both safe and robust against code changes.
>>>
>>> I would nak either. As mentioned above, that makes rmmod vendor module
>>> impossible. Introducing another user interface to release the module's
>>> refcount is also a bad idea. Who decides to take the ref, who releases
>>> it. A user has no knowledge of what is happening inside and should not
>>> enforce.
>>>
>>>> However, my proposal was rejected.
>>>>
>>>> So, if you prefer, I can drop the mutex entirely in the next version,
>>>> and we leave the responsibility of keeping all kernel pieces to the
>>>
>>> No, please try to fix it. Could you please reconsider my proposal?
>>>
>>
>> I have considered it and thought about it. However, I don't think we need a
>> mutex to protect mgr->mops. This is because the low-level module's refcount has
>> already been decremented when fpga_mgr_unregister() is called by the module's
>> exit function. So, when we get to the point of executing fpga_mgr_unregister(),
>> any concurrent call to try_module_get() will fail (if no one has taken the
> 
> Are you still taking care of your previous finding [1]? It says:
> 
>   To be clear, you should only use try_module_get() *iff* you are 100% sure
>   the module already does exist ...
> 
> IIUC, if you do nothing on fpga_mgr_unregister(), the low-level module may
> just disappear and any copy of the owner pointer became invalid. Then
> try_module_get() would not fail but panic.
> 
> [1] 557aafac1153 ("kernel/module: add documentation for try_module_get()")
> 

You are right. I'll follow your proposal for locking. I would have preferred a
solution that kept the module locked until it was safe to remove it, but I
cannot come up with anything reasonable at the moment.

> 
>> module before) without the need to check mops first.
>>
>> If we assume (as you pointed out) that class_find_device() can be safely
>> executed concurrently with device_unregister() (returning either a valid dev
>> pointer or null) and, consequently, the manager context is guaranteed to exist
>> after that point. Then, we should be good without the mutex if we call
>> try_module_get() on a copy of the owner pointer stored in the manager context.
>>
>>>> user. It will still be an improvement over taking the low-level
>>>> module's refcount through the parent device's driver pointer.
>>>>

Thanks,
Marco