linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] drivers: nfc: nfcmrvl: fix double free bug in nfcmrvl_nci_unregister_dev()
@ 2022-04-10  8:31 Duoming Zhou
  2022-04-10  9:27 ` Krzysztof Kozlowski
  0 siblings, 1 reply; 3+ messages in thread
From: Duoming Zhou @ 2022-04-10  8:31 UTC (permalink / raw)
  To: krzk, linux-kernel
  Cc: netdev, akpm, davem, gregkh, alexander.deucher, broonie, Duoming Zhou

There is a potential double bug in nfcmrvl usb driver between
unregister and resume operation.

The race that cause that double free bug can be shown as below:

   (FREE)                   |      (USE)
                            | nfcmrvl_resume
                            |  nfcmrvl_submit_bulk_urb
                            |   nfcmrvl_bulk_complete
                            |    nfcmrvl_nci_recv_frame
                            |     nfcmrvl_fw_dnld_recv_frame
                            |      queue_work
                            |       fw_dnld_rx_work
                            |        fw_dnld_over
                            |         release_firmware
                            |          kfree(fw); //(1)
nfcmrvl_disconnect          |
 nfcmrvl_nci_unregister_dev |
  nfcmrvl_fw_dnld_abort     |
   fw_dnld_over             |         ...
    if (priv->fw_dnld.fw)   |
    release_firmware        |
     kfree(fw); //(2)       |
     ...                    |         fw = NULL;

When nfcmrvl usb driver is resuming, we detach the device.
The release_firmware() will deallocate firmware in position (1),
but firmware will be deallocated again in position (2), which
leads to double free.

This patch reorders nfcmrvl_fw_dnld_deinit() before nfcmrvl_fw_dnld_abort()
in order to prevent double free bug. Because destroy_workqueue() will
not return until all work is finished.

Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
---
 drivers/nfc/nfcmrvl/main.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/nfc/nfcmrvl/main.c b/drivers/nfc/nfcmrvl/main.c
index 2fcf545012b..d8342271f50 100644
--- a/drivers/nfc/nfcmrvl/main.c
+++ b/drivers/nfc/nfcmrvl/main.c
@@ -183,11 +183,10 @@ void nfcmrvl_nci_unregister_dev(struct nfcmrvl_private *priv)
 {
 	struct nci_dev *ndev = priv->ndev;
 
+	nfcmrvl_fw_dnld_deinit(priv);
 	if (priv->ndev->nfc_dev->fw_download_in_progress)
 		nfcmrvl_fw_dnld_abort(priv);
 
-	nfcmrvl_fw_dnld_deinit(priv);
-
 	if (gpio_is_valid(priv->config.reset_n_io))
 		gpio_free(priv->config.reset_n_io);
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] drivers: nfc: nfcmrvl: fix double free bug in nfcmrvl_nci_unregister_dev()
  2022-04-10  8:31 [PATCH] drivers: nfc: nfcmrvl: fix double free bug in nfcmrvl_nci_unregister_dev() Duoming Zhou
@ 2022-04-10  9:27 ` Krzysztof Kozlowski
  2022-04-10 12:33   ` duoming
  0 siblings, 1 reply; 3+ messages in thread
From: Krzysztof Kozlowski @ 2022-04-10  9:27 UTC (permalink / raw)
  To: Duoming Zhou, linux-kernel
  Cc: netdev, akpm, davem, gregkh, alexander.deucher, broonie

On 10/04/2022 10:31, Duoming Zhou wrote:
> There is a potential double bug in nfcmrvl usb driver between
> unregister and resume operation.
> 

Thank you for your patch. There is something to discuss/improve.

> The race that cause that double free bug can be shown as below:

Your patch solves the most visible race, but because of lack of locking,
I believe race still might exist:

   (FREE)                   |      (USE)
                            | nfcmrvl_resume
                            |  nfcmrvl_submit_bulk_urb
                            |   nfcmrvl_bulk_complete
                            |    nfcmrvl_nci_recv_frame
                            |     nfcmrvl_fw_dnld_recv_frame
                            |      queue_work
                            |       fw_dnld_rx_work
nfcmrvl_disconnect          |
 nfcmrvl_nci_unregister_dev |
  nfcmrvl_fw_dnld_deinit    |
   wait for the workqueue to finish |
                            |        fw_dnld_over
                            |         release_firmware
                            |          kfree(fw);
                            |          no synchronization //(1)
  if (fw_download_in_progress)
    - no synchronization, so CPU sees old value
  nfcmrvl_fw_dnld_abort     |
   fw_dnld_over             |         ...
    if (priv->fw_dnld.fw)   |
    release_firmware        |
     kfree(fw); //(2)       |
     ...                    |         fw = NULL;

The kfree() from (2) would still free old value. Even if fw=NULL happens
earlier, it is not propagated back to the other CPU, unless there are
some implicit barriers due to workqueue?

Is it safe then to rely on such implicit barriers from workqueue?

> 
>    (FREE)                   |      (USE)
>                             | nfcmrvl_resume
>                             |  nfcmrvl_submit_bulk_urb
>                             |   nfcmrvl_bulk_complete
>                             |    nfcmrvl_nci_recv_frame
>                             |     nfcmrvl_fw_dnld_recv_frame
>                             |      queue_work
>                             |       fw_dnld_rx_work
>                             |        fw_dnld_over
>                             |         release_firmware
>                             |          kfree(fw); //(1)
> nfcmrvl_disconnect          |
>  nfcmrvl_nci_unregister_dev |
>   nfcmrvl_fw_dnld_abort     |
>    fw_dnld_over             |         ...
>     if (priv->fw_dnld.fw)   |
>     release_firmware        |
>      kfree(fw); //(2)       |
>      ...                    |         fw = NULL;
> 

Best regards,
Krzysztof

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Re: [PATCH] drivers: nfc: nfcmrvl: fix double free bug in nfcmrvl_nci_unregister_dev()
  2022-04-10  9:27 ` Krzysztof Kozlowski
@ 2022-04-10 12:33   ` duoming
  0 siblings, 0 replies; 3+ messages in thread
From: duoming @ 2022-04-10 12:33 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: linux-kernel, netdev, akpm, davem, gregkh, alexander.deucher, broonie

Hello,

On Sun, 10 Apr 2022 11:27:09 +0200 Krzysztof Kozlowski wrote:

> > There is a potential double bug in nfcmrvl usb driver between
> > unregister and resume operation.
> > 
> 
> Thank you for your patch. There is something to discuss/improve.
> 
> > The race that cause that double free bug can be shown as below:
> 
> Your patch solves the most visible race, but because of lack of locking,
> I believe race still might exist:
> 
>    (FREE)                   |      (USE)
>                             | nfcmrvl_resume
>                             |  nfcmrvl_submit_bulk_urb
>                             |   nfcmrvl_bulk_complete
>                             |    nfcmrvl_nci_recv_frame
>                             |     nfcmrvl_fw_dnld_recv_frame
>                             |      queue_work
>                             |       fw_dnld_rx_work
> nfcmrvl_disconnect          |
>  nfcmrvl_nci_unregister_dev |
>   nfcmrvl_fw_dnld_deinit    |
>    wait for the workqueue to finish |
>                             |        fw_dnld_over
>                             |         release_firmware
>                             |          kfree(fw);
>                             |          no synchronization //(1)
>   if (fw_download_in_progress)
>     - no synchronization, so CPU sees old value
>   nfcmrvl_fw_dnld_abort     |
>    fw_dnld_over             |         ...
>     if (priv->fw_dnld.fw)   |
>     release_firmware        |
>      kfree(fw); //(2)       |
>      ...                    |         fw = NULL;
> 
> The kfree() from (2) would still free old value. Even if fw=NULL happens
> earlier, it is not propagated back to the other CPU, unless there are
> some implicit barriers due to workqueue?
> 
> Is it safe then to rely on such implicit barriers from workqueue?

I think it is safe to rely on such barriers from destroy_workqueue(). 
The destroy_workqueue will wait all work items to finish, the code
behind it will not execute until all work items are finished.
The progress is shown below:

destroy_workqueue()-->drain_workqueue()-->flush_workqueue()-->wait_for_completion().

The function drain_workqueue() will wait until the workqueue becomes empty.
It sets wq->flags to __WQ_DRAINING, this could ensure the new coming work items
will not be queued into the draining workqueue. Because __queue_work will check
the __WQ_DRAINING flag. 

Then drain_workqueue() calls flush_workqueue() to ensure that any scheduled work
has run to completion. Because wait_for_completion() is called in flush_workqueue(),
it will block current thread and wait work items to completion. 

Best regards,
Duoming Zhou.

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-04-10 12:33 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-10  8:31 [PATCH] drivers: nfc: nfcmrvl: fix double free bug in nfcmrvl_nci_unregister_dev() Duoming Zhou
2022-04-10  9:27 ` Krzysztof Kozlowski
2022-04-10 12:33   ` duoming

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).