* [PATCH net v5 0/2] Replace improper checks and fix bugs in nfc subsystem @ 2022-04-29 1:14 Duoming Zhou 2022-04-29 1:14 ` [PATCH net v5 1/2] nfc: replace improper check device_is_registered() in netlink related functions Duoming Zhou 2022-04-29 1:14 ` [PATCH net v5 2/2] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs Duoming Zhou 0 siblings, 2 replies; 12+ messages in thread From: Duoming Zhou @ 2022-04-29 1:14 UTC (permalink / raw) To: linux-kernel, kuba, krzysztof.kozlowski, gregkh Cc: davem, edumazet, pabeni, alexander.deucher, akpm, broonie, netdev, linma, Duoming Zhou The first patch is used to replace improper checks in netlink related functions of nfc core, the second patch is used to fix bugs in nfcmrvl driver. Duoming Zhou (2): nfc: replace improper check device_is_registered() in netlink related functions nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs drivers/nfc/nfcmrvl/main.c | 2 +- include/net/nfc/nfc.h | 1 + net/nfc/core.c | 26 ++++++++++++++------------ 3 files changed, 16 insertions(+), 13 deletions(-) -- 2.17.1 ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH net v5 1/2] nfc: replace improper check device_is_registered() in netlink related functions 2022-04-29 1:14 [PATCH net v5 0/2] Replace improper checks and fix bugs in nfc subsystem Duoming Zhou @ 2022-04-29 1:14 ` Duoming Zhou 2022-04-29 7:03 ` Greg KH 2022-04-29 7:19 ` Krzysztof Kozlowski 2022-04-29 1:14 ` [PATCH net v5 2/2] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs Duoming Zhou 1 sibling, 2 replies; 12+ messages in thread From: Duoming Zhou @ 2022-04-29 1:14 UTC (permalink / raw) To: linux-kernel, kuba, krzysztof.kozlowski, gregkh Cc: davem, edumazet, pabeni, alexander.deucher, akpm, broonie, netdev, linma, Duoming Zhou The device_is_registered() in nfc core is used to check whether nfc device is registered in netlink related functions such as nfc_fw_download(), nfc_dev_up() and so on. Although device_is_registered() is protected by device_lock, there is still a race condition between device_del() and device_is_registered(). The root cause is that kobject_del() in device_del() is not protected by device_lock. (cleanup task) | (netlink task) | nfc_unregister_device | nfc_fw_download device_del | device_lock ... | if (!device_is_registered)//(1) kobject_del//(2) | ... ... | device_unlock The device_is_registered() returns the value of state_in_sysfs and the state_in_sysfs is set to zero in kobject_del(). If we pass check in position (1), then set zero in position (2). As a result, the check in position (1) is useless. This patch uses bool variable instead of device_is_registered() to judge whether the nfc device is registered, which is well synchronized. Fixes: 3e256b8f8dfa ("NFC: add nfc subsystem core") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> --- Changes in v5: - Replace device_is_registered() to bool variable. include/net/nfc/nfc.h | 1 + net/nfc/core.c | 26 ++++++++++++++------------ 2 files changed, 15 insertions(+), 12 deletions(-) diff --git a/include/net/nfc/nfc.h b/include/net/nfc/nfc.h index 5dee575fbe8..7bb4ccb1830 100644 --- a/include/net/nfc/nfc.h +++ b/include/net/nfc/nfc.h @@ -167,6 +167,7 @@ struct nfc_dev { int n_targets; int targets_generation; struct device dev; + bool dev_register; bool dev_up; bool fw_download_in_progress; u8 rf_mode; diff --git a/net/nfc/core.c b/net/nfc/core.c index dc7a2404efd..52147da2286 100644 --- a/net/nfc/core.c +++ b/net/nfc/core.c @@ -38,7 +38,7 @@ int nfc_fw_download(struct nfc_dev *dev, const char *firmware_name) device_lock(&dev->dev); - if (!device_is_registered(&dev->dev)) { + if (!dev->dev_register) { rc = -ENODEV; goto error; } @@ -94,7 +94,7 @@ int nfc_dev_up(struct nfc_dev *dev) device_lock(&dev->dev); - if (!device_is_registered(&dev->dev)) { + if (!dev->dev_register) { rc = -ENODEV; goto error; } @@ -142,7 +142,7 @@ int nfc_dev_down(struct nfc_dev *dev) device_lock(&dev->dev); - if (!device_is_registered(&dev->dev)) { + if (!dev->dev_register) { rc = -ENODEV; goto error; } @@ -207,7 +207,7 @@ int nfc_start_poll(struct nfc_dev *dev, u32 im_protocols, u32 tm_protocols) device_lock(&dev->dev); - if (!device_is_registered(&dev->dev)) { + if (!dev->dev_register) { rc = -ENODEV; goto error; } @@ -246,7 +246,7 @@ int nfc_stop_poll(struct nfc_dev *dev) device_lock(&dev->dev); - if (!device_is_registered(&dev->dev)) { + if (!dev->dev_register) { rc = -ENODEV; goto error; } @@ -291,7 +291,7 @@ int nfc_dep_link_up(struct nfc_dev *dev, int target_index, u8 comm_mode) device_lock(&dev->dev); - if (!device_is_registered(&dev->dev)) { + if (!dev->dev_register) { rc = -ENODEV; goto error; } @@ -335,7 +335,7 @@ int nfc_dep_link_down(struct nfc_dev *dev) device_lock(&dev->dev); - if (!device_is_registered(&dev->dev)) { + if (!dev->dev_register) { rc = -ENODEV; goto error; } @@ -401,7 +401,7 @@ int nfc_activate_target(struct nfc_dev *dev, u32 target_idx, u32 protocol) device_lock(&dev->dev); - if (!device_is_registered(&dev->dev)) { + if (!dev->dev_register) { rc = -ENODEV; goto error; } @@ -448,7 +448,7 @@ int nfc_deactivate_target(struct nfc_dev *dev, u32 target_idx, u8 mode) device_lock(&dev->dev); - if (!device_is_registered(&dev->dev)) { + if (!dev->dev_register) { rc = -ENODEV; goto error; } @@ -495,7 +495,7 @@ int nfc_data_exchange(struct nfc_dev *dev, u32 target_idx, struct sk_buff *skb, device_lock(&dev->dev); - if (!device_is_registered(&dev->dev)) { + if (!dev->dev_register) { rc = -ENODEV; kfree_skb(skb); goto error; @@ -552,7 +552,7 @@ int nfc_enable_se(struct nfc_dev *dev, u32 se_idx) device_lock(&dev->dev); - if (!device_is_registered(&dev->dev)) { + if (!dev->dev_register) { rc = -ENODEV; goto error; } @@ -601,7 +601,7 @@ int nfc_disable_se(struct nfc_dev *dev, u32 se_idx) device_lock(&dev->dev); - if (!device_is_registered(&dev->dev)) { + if (!dev->dev_register) { rc = -ENODEV; goto error; } @@ -1134,6 +1134,7 @@ int nfc_register_device(struct nfc_dev *dev) dev->rfkill = NULL; } } + dev->dev_register = true; device_unlock(&dev->dev); rc = nfc_genl_device_added(dev); @@ -1166,6 +1167,7 @@ void nfc_unregister_device(struct nfc_dev *dev) rfkill_unregister(dev->rfkill); rfkill_destroy(dev->rfkill); } + dev->dev_register = false; device_unlock(&dev->dev); if (dev->ops->check_presence) { -- 2.17.1 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH net v5 1/2] nfc: replace improper check device_is_registered() in netlink related functions 2022-04-29 1:14 ` [PATCH net v5 1/2] nfc: replace improper check device_is_registered() in netlink related functions Duoming Zhou @ 2022-04-29 7:03 ` Greg KH 2022-04-29 7:19 ` Krzysztof Kozlowski 1 sibling, 0 replies; 12+ messages in thread From: Greg KH @ 2022-04-29 7:03 UTC (permalink / raw) To: Duoming Zhou Cc: linux-kernel, kuba, krzysztof.kozlowski, davem, edumazet, pabeni, alexander.deucher, akpm, broonie, netdev, linma On Fri, Apr 29, 2022 at 09:14:32AM +0800, Duoming Zhou wrote: > The device_is_registered() in nfc core is used to check whether > nfc device is registered in netlink related functions such as > nfc_fw_download(), nfc_dev_up() and so on. Although device_is_registered() > is protected by device_lock, there is still a race condition between > device_del() and device_is_registered(). The root cause is that > kobject_del() in device_del() is not protected by device_lock. > > (cleanup task) | (netlink task) > | > nfc_unregister_device | nfc_fw_download > device_del | device_lock > ... | if (!device_is_registered)//(1) > kobject_del//(2) | ... > ... | device_unlock > > The device_is_registered() returns the value of state_in_sysfs and > the state_in_sysfs is set to zero in kobject_del(). If we pass check in > position (1), then set zero in position (2). As a result, the check > in position (1) is useless. > > This patch uses bool variable instead of device_is_registered() to judge > whether the nfc device is registered, which is well synchronized. > > Fixes: 3e256b8f8dfa ("NFC: add nfc subsystem core") > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> > --- > Changes in v5: > - Replace device_is_registered() to bool variable. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v5 1/2] nfc: replace improper check device_is_registered() in netlink related functions 2022-04-29 1:14 ` [PATCH net v5 1/2] nfc: replace improper check device_is_registered() in netlink related functions Duoming Zhou 2022-04-29 7:03 ` Greg KH @ 2022-04-29 7:19 ` Krzysztof Kozlowski 2022-04-29 9:18 ` duoming 1 sibling, 1 reply; 12+ messages in thread From: Krzysztof Kozlowski @ 2022-04-29 7:19 UTC (permalink / raw) To: Duoming Zhou, linux-kernel, kuba, gregkh Cc: davem, edumazet, pabeni, alexander.deucher, akpm, broonie, netdev, linma On 29/04/2022 03:14, Duoming Zhou wrote: > The device_is_registered() in nfc core is used to check whether > nfc device is registered in netlink related functions such as > nfc_fw_download(), nfc_dev_up() and so on. Although device_is_registered() > is protected by device_lock, there is still a race condition between > device_del() and device_is_registered(). The root cause is that > kobject_del() in device_del() is not protected by device_lock. > > (cleanup task) | (netlink task) > | > nfc_unregister_device | nfc_fw_download > device_del | device_lock > ... | if (!device_is_registered)//(1) > kobject_del//(2) | ... > ... | device_unlock > > The device_is_registered() returns the value of state_in_sysfs and > the state_in_sysfs is set to zero in kobject_del(). If we pass check in > position (1), then set zero in position (2). As a result, the check > in position (1) is useless. > > This patch uses bool variable instead of device_is_registered() to judge > whether the nfc device is registered, which is well synchronized. > > Fixes: 3e256b8f8dfa ("NFC: add nfc subsystem core") > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> > --- > Changes in v5: > - Replace device_is_registered() to bool variable. > > include/net/nfc/nfc.h | 1 + > net/nfc/core.c | 26 ++++++++++++++------------ > 2 files changed, 15 insertions(+), 12 deletions(-) > > diff --git a/include/net/nfc/nfc.h b/include/net/nfc/nfc.h > index 5dee575fbe8..7bb4ccb1830 100644 > --- a/include/net/nfc/nfc.h > +++ b/include/net/nfc/nfc.h > @@ -167,6 +167,7 @@ struct nfc_dev { > int n_targets; > int targets_generation; > struct device dev; > + bool dev_register; > bool dev_up; > bool fw_download_in_progress; > u8 rf_mode; > diff --git a/net/nfc/core.c b/net/nfc/core.c > index dc7a2404efd..52147da2286 100644 > --- a/net/nfc/core.c > +++ b/net/nfc/core.c > @@ -38,7 +38,7 @@ int nfc_fw_download(struct nfc_dev *dev, const char *firmware_name) > > device_lock(&dev->dev); > > - if (!device_is_registered(&dev->dev)) { > + if (!dev->dev_register) { > rc = -ENODEV; > goto error; > } > @@ -94,7 +94,7 @@ int nfc_dev_up(struct nfc_dev *dev) > > device_lock(&dev->dev); > > - if (!device_is_registered(&dev->dev)) { > + if (!dev->dev_register) { > rc = -ENODEV; > goto error; > } > @@ -142,7 +142,7 @@ int nfc_dev_down(struct nfc_dev *dev) > > device_lock(&dev->dev); > > - if (!device_is_registered(&dev->dev)) { > + if (!dev->dev_register) { > rc = -ENODEV; > goto error; > } > @@ -207,7 +207,7 @@ int nfc_start_poll(struct nfc_dev *dev, u32 im_protocols, u32 tm_protocols) > > device_lock(&dev->dev); > > - if (!device_is_registered(&dev->dev)) { > + if (!dev->dev_register) { > rc = -ENODEV; > goto error; > } > @@ -246,7 +246,7 @@ int nfc_stop_poll(struct nfc_dev *dev) > > device_lock(&dev->dev); > > - if (!device_is_registered(&dev->dev)) { > + if (!dev->dev_register) { > rc = -ENODEV; > goto error; > } > @@ -291,7 +291,7 @@ int nfc_dep_link_up(struct nfc_dev *dev, int target_index, u8 comm_mode) > > device_lock(&dev->dev); > > - if (!device_is_registered(&dev->dev)) { > + if (!dev->dev_register) { > rc = -ENODEV; > goto error; > } > @@ -335,7 +335,7 @@ int nfc_dep_link_down(struct nfc_dev *dev) > > device_lock(&dev->dev); > > - if (!device_is_registered(&dev->dev)) { > + if (!dev->dev_register) { > rc = -ENODEV; > goto error; > } > @@ -401,7 +401,7 @@ int nfc_activate_target(struct nfc_dev *dev, u32 target_idx, u32 protocol) > > device_lock(&dev->dev); > > - if (!device_is_registered(&dev->dev)) { > + if (!dev->dev_register) { > rc = -ENODEV; > goto error; > } > @@ -448,7 +448,7 @@ int nfc_deactivate_target(struct nfc_dev *dev, u32 target_idx, u8 mode) > > device_lock(&dev->dev); > > - if (!device_is_registered(&dev->dev)) { > + if (!dev->dev_register) { > rc = -ENODEV; > goto error; > } > @@ -495,7 +495,7 @@ int nfc_data_exchange(struct nfc_dev *dev, u32 target_idx, struct sk_buff *skb, > > device_lock(&dev->dev); > > - if (!device_is_registered(&dev->dev)) { > + if (!dev->dev_register) { > rc = -ENODEV; > kfree_skb(skb); > goto error; > @@ -552,7 +552,7 @@ int nfc_enable_se(struct nfc_dev *dev, u32 se_idx) > > device_lock(&dev->dev); > > - if (!device_is_registered(&dev->dev)) { > + if (!dev->dev_register) { > rc = -ENODEV; > goto error; > } > @@ -601,7 +601,7 @@ int nfc_disable_se(struct nfc_dev *dev, u32 se_idx) > > device_lock(&dev->dev); > > - if (!device_is_registered(&dev->dev)) { > + if (!dev->dev_register) { > rc = -ENODEV; > goto error; > } > @@ -1134,6 +1134,7 @@ int nfc_register_device(struct nfc_dev *dev) > dev->rfkill = NULL; > } > } > + dev->dev_register = true; > device_unlock(&dev->dev); > > rc = nfc_genl_device_added(dev); > @@ -1166,6 +1167,7 @@ void nfc_unregister_device(struct nfc_dev *dev) > rfkill_unregister(dev->rfkill); > rfkill_destroy(dev->rfkill); > } > + dev->dev_register = false; We already have flag for it - dev->shutting_down. Currently it is used only in if device implements check_presence but I think it can be easily moved to common path. Having multiple fields for similar, but slightly different cases, is getting us closer and closer to spaghetti code. Best regards, Krzysztof ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Re: [PATCH net v5 1/2] nfc: replace improper check device_is_registered() in netlink related functions 2022-04-29 7:19 ` Krzysztof Kozlowski @ 2022-04-29 9:18 ` duoming 0 siblings, 0 replies; 12+ messages in thread From: duoming @ 2022-04-29 9:18 UTC (permalink / raw) To: Krzysztof Kozlowski Cc: linux-kernel, kuba, gregkh, davem, edumazet, pabeni, alexander.deucher, akpm, broonie, netdev, linma Hello, On Fri, 29 Apr 2022 09:19:36 +0200 Krzysztof wrote: > > The device_is_registered() in nfc core is used to check whether > > nfc device is registered in netlink related functions such as > > nfc_fw_download(), nfc_dev_up() and so on. Although device_is_registered() > > is protected by device_lock, there is still a race condition between > > device_del() and device_is_registered(). The root cause is that > > kobject_del() in device_del() is not protected by device_lock. > > > > (cleanup task) | (netlink task) > > | > > nfc_unregister_device | nfc_fw_download > > device_del | device_lock > > ... | if (!device_is_registered)//(1) > > kobject_del//(2) | ... > > ... | device_unlock > > > > The device_is_registered() returns the value of state_in_sysfs and > > the state_in_sysfs is set to zero in kobject_del(). If we pass check in > > position (1), then set zero in position (2). As a result, the check > > in position (1) is useless. > > > > This patch uses bool variable instead of device_is_registered() to judge > > whether the nfc device is registered, which is well synchronized. > > > > Fixes: 3e256b8f8dfa ("NFC: add nfc subsystem core") > > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> > > --- > > Changes in v5: > > - Replace device_is_registered() to bool variable. > > > > include/net/nfc/nfc.h | 1 + > > net/nfc/core.c | 26 ++++++++++++++------------ > > 2 files changed, 15 insertions(+), 12 deletions(-) > > > > diff --git a/include/net/nfc/nfc.h b/include/net/nfc/nfc.h > > index 5dee575fbe8..7bb4ccb1830 100644 > > --- a/include/net/nfc/nfc.h > > +++ b/include/net/nfc/nfc.h > > @@ -167,6 +167,7 @@ struct nfc_dev { > > int n_targets; > > int targets_generation; > > struct device dev; > > + bool dev_register; > > bool dev_up; > > bool fw_download_in_progress; > > u8 rf_mode; > > diff --git a/net/nfc/core.c b/net/nfc/core.c > > index dc7a2404efd..52147da2286 100644 > > --- a/net/nfc/core.c > > +++ b/net/nfc/core.c > > @@ -38,7 +38,7 @@ int nfc_fw_download(struct nfc_dev *dev, const char *firmware_name) > > > > device_lock(&dev->dev); > > > > - if (!device_is_registered(&dev->dev)) { > > + if (!dev->dev_register) { > > rc = -ENODEV; > > goto error; > > } > > @@ -94,7 +94,7 @@ int nfc_dev_up(struct nfc_dev *dev) > > > > device_lock(&dev->dev); > > > > - if (!device_is_registered(&dev->dev)) { > > + if (!dev->dev_register) { > > rc = -ENODEV; > > goto error; > > } > > @@ -142,7 +142,7 @@ int nfc_dev_down(struct nfc_dev *dev) > > > > device_lock(&dev->dev); > > > > - if (!device_is_registered(&dev->dev)) { > > + if (!dev->dev_register) { > > rc = -ENODEV; > > goto error; > > } > > @@ -207,7 +207,7 @@ int nfc_start_poll(struct nfc_dev *dev, u32 im_protocols, u32 tm_protocols) > > > > device_lock(&dev->dev); > > > > - if (!device_is_registered(&dev->dev)) { > > + if (!dev->dev_register) { > > rc = -ENODEV; > > goto error; > > } > > @@ -246,7 +246,7 @@ int nfc_stop_poll(struct nfc_dev *dev) > > > > device_lock(&dev->dev); > > > > - if (!device_is_registered(&dev->dev)) { > > + if (!dev->dev_register) { > > rc = -ENODEV; > > goto error; > > } > > @@ -291,7 +291,7 @@ int nfc_dep_link_up(struct nfc_dev *dev, int target_index, u8 comm_mode) > > > > device_lock(&dev->dev); > > > > - if (!device_is_registered(&dev->dev)) { > > + if (!dev->dev_register) { > > rc = -ENODEV; > > goto error; > > } > > @@ -335,7 +335,7 @@ int nfc_dep_link_down(struct nfc_dev *dev) > > > > device_lock(&dev->dev); > > > > - if (!device_is_registered(&dev->dev)) { > > + if (!dev->dev_register) { > > rc = -ENODEV; > > goto error; > > } > > @@ -401,7 +401,7 @@ int nfc_activate_target(struct nfc_dev *dev, u32 target_idx, u32 protocol) > > > > device_lock(&dev->dev); > > > > - if (!device_is_registered(&dev->dev)) { > > + if (!dev->dev_register) { > > rc = -ENODEV; > > goto error; > > } > > @@ -448,7 +448,7 @@ int nfc_deactivate_target(struct nfc_dev *dev, u32 target_idx, u8 mode) > > > > device_lock(&dev->dev); > > > > - if (!device_is_registered(&dev->dev)) { > > + if (!dev->dev_register) { > > rc = -ENODEV; > > goto error; > > } > > @@ -495,7 +495,7 @@ int nfc_data_exchange(struct nfc_dev *dev, u32 target_idx, struct sk_buff *skb, > > > > device_lock(&dev->dev); > > > > - if (!device_is_registered(&dev->dev)) { > > + if (!dev->dev_register) { > > rc = -ENODEV; > > kfree_skb(skb); > > goto error; > > @@ -552,7 +552,7 @@ int nfc_enable_se(struct nfc_dev *dev, u32 se_idx) > > > > device_lock(&dev->dev); > > > > - if (!device_is_registered(&dev->dev)) { > > + if (!dev->dev_register) { > > rc = -ENODEV; > > goto error; > > } > > @@ -601,7 +601,7 @@ int nfc_disable_se(struct nfc_dev *dev, u32 se_idx) > > > > device_lock(&dev->dev); > > > > - if (!device_is_registered(&dev->dev)) { > > + if (!dev->dev_register) { > > rc = -ENODEV; > > goto error; > > } > > @@ -1134,6 +1134,7 @@ int nfc_register_device(struct nfc_dev *dev) > > dev->rfkill = NULL; > > } > > } > > + dev->dev_register = true; > > device_unlock(&dev->dev); > > > > rc = nfc_genl_device_added(dev); > > @@ -1166,6 +1167,7 @@ void nfc_unregister_device(struct nfc_dev *dev) > > rfkill_unregister(dev->rfkill); > > rfkill_destroy(dev->rfkill); > > } > > + dev->dev_register = false; > > We already have flag for it - dev->shutting_down. Currently it is used > only in if device implements check_presence but I think it can be easily > moved to common path. > > Having multiple fields for similar, but slightly different cases, is > getting us closer and closer to spaghetti code. Thanks a lot for your suggestion, I will move dev->shutting_down to common path. Best regards, Duoming Zhou ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH net v5 2/2] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs 2022-04-29 1:14 [PATCH net v5 0/2] Replace improper checks and fix bugs in nfc subsystem Duoming Zhou 2022-04-29 1:14 ` [PATCH net v5 1/2] nfc: replace improper check device_is_registered() in netlink related functions Duoming Zhou @ 2022-04-29 1:14 ` Duoming Zhou 2022-04-29 7:27 ` Krzysztof Kozlowski 1 sibling, 1 reply; 12+ messages in thread From: Duoming Zhou @ 2022-04-29 1:14 UTC (permalink / raw) To: linux-kernel, kuba, krzysztof.kozlowski, gregkh Cc: davem, edumazet, pabeni, alexander.deucher, akpm, broonie, netdev, linma, Duoming Zhou There are destructive operations such as nfcmrvl_fw_dnld_abort and gpio_free in nfcmrvl_nci_unregister_dev. The resources such as firmware, gpio and so on could be destructed while the upper layer functions such as nfcmrvl_fw_dnld_start and nfcmrvl_nci_recv_frame is executing, which leads to double-free, use-after-free and null-ptr-deref bugs. There are three situations that could lead to double-free bugs. The first situation is shown below: (Thread 1) | (Thread 2) nfcmrvl_fw_dnld_start | ... | nfcmrvl_nci_unregister_dev release_firmware() | nfcmrvl_fw_dnld_abort kfree(fw) //(1) | fw_dnld_over | release_firmware ... | kfree(fw) //(2) | ... The second situation is shown below: (Thread 1) | (Thread 2) nfcmrvl_fw_dnld_start | ... | mod_timer | (wait a time) | fw_dnld_timeout | nfcmrvl_nci_unregister_dev fw_dnld_over | nfcmrvl_fw_dnld_abort release_firmware | fw_dnld_over kfree(fw) //(1) | release_firmware ... | kfree(fw) //(2) The third situation is shown below: (Thread 1) | (Thread 2) nfcmrvl_nci_recv_frame | if(..->fw_download_in_progress)| nfcmrvl_fw_dnld_recv_frame | queue_work | | fw_dnld_rx_work | nfcmrvl_nci_unregister_dev fw_dnld_over | nfcmrvl_fw_dnld_abort release_firmware | fw_dnld_over kfree(fw) //(1) | release_firmware | kfree(fw) //(2) The firmware struct is deallocated in position (1) and deallocated in position (2) again. The crash trace triggered by POC is like below: [ 122.640457] BUG: KASAN: double-free or invalid-free in fw_dnld_over+0x28/0xf0 [ 122.640457] Call Trace: [ 122.640457] <TASK> [ 122.640457] kfree+0xb0/0x330 [ 122.640457] fw_dnld_over+0x28/0xf0 [ 122.640457] nfcmrvl_nci_unregister_dev+0x61/0x70 [ 122.640457] nci_uart_tty_close+0x87/0xd0 [ 122.640457] tty_ldisc_kill+0x3e/0x80 [ 122.640457] tty_ldisc_hangup+0x1b2/0x2c0 [ 122.640457] __tty_hangup.part.0+0x316/0x520 [ 122.640457] tty_release+0x200/0x670 [ 122.640457] __fput+0x110/0x410 [ 122.640457] task_work_run+0x86/0xd0 [ 122.640457] exit_to_user_mode_prepare+0x1aa/0x1b0 [ 122.640457] syscall_exit_to_user_mode+0x19/0x50 [ 122.640457] do_syscall_64+0x48/0x90 [ 122.640457] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 122.640457] RIP: 0033:0x7f68433f6beb What's more, there are also use-after-free and null-ptr-deref bugs in nfcmrvl_fw_dnld_start. If we deallocate firmware struct, gpio or set null to the members of priv->fw_dnld in nfcmrvl_nci_unregister_dev, then, we dereference firmware, gpio or the members of priv->fw_dnld in nfcmrvl_fw_dnld_start, the UAF or NPD bugs will happen. This patch reorders destructive operations after nci_unregister_device and uses bool variable in nfc_dev to synchronize between cleanup routine and firmware download routine. The process is shown below. (Thread 1) | (Thread 2) nfcmrvl_nci_unregister_dev | nci_unregister_device | nfc_unregister_device | nfc_fw_download device_lock() | ... | dev->dev_register = false; | ... device_unlock() | ... | device_lock() (destructive operations) | if(!dev->dev_register) | goto error; | error: | device_unlock() If the device is detaching, the download function will goto error. If the download function is executing, nci_unregister_device will wait until download function is finished. Fixes: 3194c6870158 ("NFC: nfcmrvl: add firmware download support") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> --- Changes in v5: - Use bool variable added in patch 1/2 to synchronize. drivers/nfc/nfcmrvl/main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/nfc/nfcmrvl/main.c b/drivers/nfc/nfcmrvl/main.c index 2fcf545012b..1a5284de434 100644 --- a/drivers/nfc/nfcmrvl/main.c +++ b/drivers/nfc/nfcmrvl/main.c @@ -183,6 +183,7 @@ void nfcmrvl_nci_unregister_dev(struct nfcmrvl_private *priv) { struct nci_dev *ndev = priv->ndev; + nci_unregister_device(ndev); if (priv->ndev->nfc_dev->fw_download_in_progress) nfcmrvl_fw_dnld_abort(priv); @@ -191,7 +192,6 @@ void nfcmrvl_nci_unregister_dev(struct nfcmrvl_private *priv) if (gpio_is_valid(priv->config.reset_n_io)) gpio_free(priv->config.reset_n_io); - nci_unregister_device(ndev); nci_free_device(ndev); kfree(priv); } -- 2.17.1 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH net v5 2/2] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs 2022-04-29 1:14 ` [PATCH net v5 2/2] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs Duoming Zhou @ 2022-04-29 7:27 ` Krzysztof Kozlowski 2022-04-29 9:13 ` duoming 0 siblings, 1 reply; 12+ messages in thread From: Krzysztof Kozlowski @ 2022-04-29 7:27 UTC (permalink / raw) To: Duoming Zhou, linux-kernel, kuba, gregkh Cc: davem, edumazet, pabeni, alexander.deucher, akpm, broonie, netdev, linma On 29/04/2022 03:14, Duoming Zhou wrote: > There are destructive operations such as nfcmrvl_fw_dnld_abort and > gpio_free in nfcmrvl_nci_unregister_dev. The resources such as firmware, > gpio and so on could be destructed while the upper layer functions such as > nfcmrvl_fw_dnld_start and nfcmrvl_nci_recv_frame is executing, which leads > to double-free, use-after-free and null-ptr-deref bugs. > > There are three situations that could lead to double-free bugs. > > The first situation is shown below: > > (Thread 1) | (Thread 2) > nfcmrvl_fw_dnld_start | > ... | nfcmrvl_nci_unregister_dev > release_firmware() | nfcmrvl_fw_dnld_abort > kfree(fw) //(1) | fw_dnld_over > | release_firmware > ... | kfree(fw) //(2) > | ... > > The second situation is shown below: > > (Thread 1) | (Thread 2) > nfcmrvl_fw_dnld_start | > ... | > mod_timer | > (wait a time) | > fw_dnld_timeout | nfcmrvl_nci_unregister_dev > fw_dnld_over | nfcmrvl_fw_dnld_abort > release_firmware | fw_dnld_over > kfree(fw) //(1) | release_firmware > ... | kfree(fw) //(2) How exactly the case here is being prevented? If nfcmrvl_nci_unregister_dev() happens slightly earlier, before fw_dnld_timeout() on the left side (T1), the T1 will still hit it, won't it? > > The third situation is shown below: > > (Thread 1) | (Thread 2) > nfcmrvl_nci_recv_frame | > if(..->fw_download_in_progress)| > nfcmrvl_fw_dnld_recv_frame | > queue_work | > | > fw_dnld_rx_work | nfcmrvl_nci_unregister_dev > fw_dnld_over | nfcmrvl_fw_dnld_abort > release_firmware | fw_dnld_over > kfree(fw) //(1) | release_firmware > | kfree(fw) //(2) > > The firmware struct is deallocated in position (1) and deallocated > in position (2) again. > > The crash trace triggered by POC is like below: > > [ 122.640457] BUG: KASAN: double-free or invalid-free in fw_dnld_over+0x28/0xf0 > [ 122.640457] Call Trace: > [ 122.640457] <TASK> > [ 122.640457] kfree+0xb0/0x330 > [ 122.640457] fw_dnld_over+0x28/0xf0 > [ 122.640457] nfcmrvl_nci_unregister_dev+0x61/0x70 > [ 122.640457] nci_uart_tty_close+0x87/0xd0 > [ 122.640457] tty_ldisc_kill+0x3e/0x80 > [ 122.640457] tty_ldisc_hangup+0x1b2/0x2c0 > [ 122.640457] __tty_hangup.part.0+0x316/0x520 > [ 122.640457] tty_release+0x200/0x670 > [ 122.640457] __fput+0x110/0x410 > [ 122.640457] task_work_run+0x86/0xd0 > [ 122.640457] exit_to_user_mode_prepare+0x1aa/0x1b0 > [ 122.640457] syscall_exit_to_user_mode+0x19/0x50 > [ 122.640457] do_syscall_64+0x48/0x90 > [ 122.640457] entry_SYSCALL_64_after_hwframe+0x44/0xae > [ 122.640457] RIP: 0033:0x7f68433f6beb Please always strip unrelated parts of oops, so minimum the timing. The addresses could be removed as well. > > What's more, there are also use-after-free and null-ptr-deref bugs > in nfcmrvl_fw_dnld_start. If we deallocate firmware struct, gpio or > set null to the members of priv->fw_dnld in nfcmrvl_nci_unregister_dev, > then, we dereference firmware, gpio or the members of priv->fw_dnld in > nfcmrvl_fw_dnld_start, the UAF or NPD bugs will happen. > > This patch reorders destructive operations after nci_unregister_device > and uses bool variable in nfc_dev to synchronize between cleanup routine > and firmware download routine. The process is shown below. > > (Thread 1) | (Thread 2) > nfcmrvl_nci_unregister_dev | > nci_unregister_device | > nfc_unregister_device | nfc_fw_download > device_lock() | > ... | > dev->dev_register = false; | ... > device_unlock() | > ... | device_lock() > (destructive operations) | if(!dev->dev_register) > | goto error; > | error: > | device_unlock() How T2 calls appeared here? They were not present in any of your previous process-examples... > > If the device is detaching, the download function will goto error. > If the download function is executing, nci_unregister_device will > wait until download function is finished. > > Fixes: 3194c6870158 ("NFC: nfcmrvl: add firmware download support") > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> > --- > Changes in v5: > - Use bool variable added in patch 1/2 to synchronize. > > drivers/nfc/nfcmrvl/main.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/nfc/nfcmrvl/main.c b/drivers/nfc/nfcmrvl/main.c > index 2fcf545012b..1a5284de434 100644 > --- a/drivers/nfc/nfcmrvl/main.c > +++ b/drivers/nfc/nfcmrvl/main.c > @@ -183,6 +183,7 @@ void nfcmrvl_nci_unregister_dev(struct nfcmrvl_private *priv) > { > struct nci_dev *ndev = priv->ndev; > > + nci_unregister_device(ndev); > if (priv->ndev->nfc_dev->fw_download_in_progress) > nfcmrvl_fw_dnld_abort(priv); > > @@ -191,7 +192,6 @@ void nfcmrvl_nci_unregister_dev(struct nfcmrvl_private *priv) > if (gpio_is_valid(priv->config.reset_n_io)) > gpio_free(priv->config.reset_n_io); > > - nci_unregister_device(ndev); > nci_free_device(ndev); > kfree(priv); > } Best regards, Krzysztof ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Re: [PATCH net v5 2/2] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs 2022-04-29 7:27 ` Krzysztof Kozlowski @ 2022-04-29 9:13 ` duoming 2022-05-02 6:34 ` Krzysztof Kozlowski 0 siblings, 1 reply; 12+ messages in thread From: duoming @ 2022-04-29 9:13 UTC (permalink / raw) To: Krzysztof Kozlowski Cc: linux-kernel, kuba, gregkh, davem, edumazet, pabeni, alexander.deucher, akpm, broonie, netdev, linma Hello, On Fri, 29 Apr 2022 09:27:48 +0200 Krzysztof wrote: > > There are destructive operations such as nfcmrvl_fw_dnld_abort and > > gpio_free in nfcmrvl_nci_unregister_dev. The resources such as firmware, > > gpio and so on could be destructed while the upper layer functions such as > > nfcmrvl_fw_dnld_start and nfcmrvl_nci_recv_frame is executing, which leads > > to double-free, use-after-free and null-ptr-deref bugs. > > > > There are three situations that could lead to double-free bugs. > > > > The first situation is shown below: > > > > (Thread 1) | (Thread 2) > > nfcmrvl_fw_dnld_start | > > ... | nfcmrvl_nci_unregister_dev > > release_firmware() | nfcmrvl_fw_dnld_abort > > kfree(fw) //(1) | fw_dnld_over > > | release_firmware > > ... | kfree(fw) //(2) > > | ... > > > > The second situation is shown below: > > > > (Thread 1) | (Thread 2) > > nfcmrvl_fw_dnld_start | > > ... | > > mod_timer | > > (wait a time) | > > fw_dnld_timeout | nfcmrvl_nci_unregister_dev > > fw_dnld_over | nfcmrvl_fw_dnld_abort > > release_firmware | fw_dnld_over > > kfree(fw) //(1) | release_firmware > > ... | kfree(fw) //(2) > > How exactly the case here is being prevented? > > If nfcmrvl_nci_unregister_dev() happens slightly earlier, before > fw_dnld_timeout() on the left side (T1), the T1 will still hit it, won't it? I think it could be prevented. We use nci_unregister_device() to synchronize, if the firmware download routine is running, the cleanup routine will wait it to finish. The flag "fw_download_in_progress" will be set to false, if the the firmware download routine is finished. Although the timer handler fw_dnld_timeout() could be running, nfcmrvl_nci_unregister_dev() will check the flag "fw_download_in_progress" which is already set to false and nfcmrvl_fw_dnld_abort() in nfcmrvl_nci_unregister_dev() will not execute. > > > > The third situation is shown below: > > > > (Thread 1) | (Thread 2) > > nfcmrvl_nci_recv_frame | > > if(..->fw_download_in_progress)| > > nfcmrvl_fw_dnld_recv_frame | > > queue_work | > > | > > fw_dnld_rx_work | nfcmrvl_nci_unregister_dev > > fw_dnld_over | nfcmrvl_fw_dnld_abort > > release_firmware | fw_dnld_over > > kfree(fw) //(1) | release_firmware > > | kfree(fw) //(2) > > > > The firmware struct is deallocated in position (1) and deallocated > > in position (2) again. > > > > The crash trace triggered by POC is like below: > > > > [ 122.640457] BUG: KASAN: double-free or invalid-free in fw_dnld_over+0x28/0xf0 > > [ 122.640457] Call Trace: > > [ 122.640457] <TASK> > > [ 122.640457] kfree+0xb0/0x330 > > [ 122.640457] fw_dnld_over+0x28/0xf0 > > [ 122.640457] nfcmrvl_nci_unregister_dev+0x61/0x70 > > [ 122.640457] nci_uart_tty_close+0x87/0xd0 > > [ 122.640457] tty_ldisc_kill+0x3e/0x80 > > [ 122.640457] tty_ldisc_hangup+0x1b2/0x2c0 > > [ 122.640457] __tty_hangup.part.0+0x316/0x520 > > [ 122.640457] tty_release+0x200/0x670 > > [ 122.640457] __fput+0x110/0x410 > > [ 122.640457] task_work_run+0x86/0xd0 > > [ 122.640457] exit_to_user_mode_prepare+0x1aa/0x1b0 > > [ 122.640457] syscall_exit_to_user_mode+0x19/0x50 > > [ 122.640457] do_syscall_64+0x48/0x90 > > [ 122.640457] entry_SYSCALL_64_after_hwframe+0x44/0xae > > [ 122.640457] RIP: 0033:0x7f68433f6beb > > Please always strip unrelated parts of oops, so minimum the timing. The > addresses could be removed as well. Thanks, I will strip unrelated parts of oops. > > > > What's more, there are also use-after-free and null-ptr-deref bugs > > in nfcmrvl_fw_dnld_start. If we deallocate firmware struct, gpio or > > set null to the members of priv->fw_dnld in nfcmrvl_nci_unregister_dev, > > then, we dereference firmware, gpio or the members of priv->fw_dnld in > > nfcmrvl_fw_dnld_start, the UAF or NPD bugs will happen. > > > > This patch reorders destructive operations after nci_unregister_device > > and uses bool variable in nfc_dev to synchronize between cleanup routine > > and firmware download routine. The process is shown below. > > > > (Thread 1) | (Thread 2) > > nfcmrvl_nci_unregister_dev | > > nci_unregister_device | > > nfc_unregister_device | nfc_fw_download > > device_lock() | > > ... | > > dev->dev_register = false; | ... > > device_unlock() | > > ... | device_lock() > > (destructive operations) | if(!dev->dev_register) > > | goto error; > > | error: > > | device_unlock() > > How T2 calls appeared here? They were not present in any of your > previous process-examples... The firmware download process is shown below: nfc_genl_fw_download()->nfc_fw_download()->nci_fw_download()->nfcmrvl_nci_fw_download() ->nfcmrvl_fw_dnld_start() So T2 calls is a part of firmware download routine in nfc core layer. We use bool variable and device_lock() to synchronize between cleanup routine and firmware download routine in nfc core layer. The above picture is attempt to show this synchronized process. > > > > If the device is detaching, the download function will goto error. > > If the download function is executing, nci_unregister_device will > > wait until download function is finished. > > > > Fixes: 3194c6870158 ("NFC: nfcmrvl: add firmware download support") > > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> > > --- > > Changes in v5: > > - Use bool variable added in patch 1/2 to synchronize. > > > > drivers/nfc/nfcmrvl/main.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/nfc/nfcmrvl/main.c b/drivers/nfc/nfcmrvl/main.c > > index 2fcf545012b..1a5284de434 100644 > > --- a/drivers/nfc/nfcmrvl/main.c > > +++ b/drivers/nfc/nfcmrvl/main.c > > @@ -183,6 +183,7 @@ void nfcmrvl_nci_unregister_dev(struct nfcmrvl_private *priv) > > { > > struct nci_dev *ndev = priv->ndev; > > > > + nci_unregister_device(ndev); > > if (priv->ndev->nfc_dev->fw_download_in_progress) > > nfcmrvl_fw_dnld_abort(priv); > > > > @@ -191,7 +192,6 @@ void nfcmrvl_nci_unregister_dev(struct nfcmrvl_private *priv) > > if (gpio_is_valid(priv->config.reset_n_io)) > > gpio_free(priv->config.reset_n_io); > > > > - nci_unregister_device(ndev); > > nci_free_device(ndev); > > kfree(priv); > > } Best regards, Duoming Zhou ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v5 2/2] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs 2022-04-29 9:13 ` duoming @ 2022-05-02 6:34 ` Krzysztof Kozlowski 2022-05-02 7:25 ` duoming 0 siblings, 1 reply; 12+ messages in thread From: Krzysztof Kozlowski @ 2022-05-02 6:34 UTC (permalink / raw) To: duoming Cc: linux-kernel, kuba, gregkh, davem, edumazet, pabeni, alexander.deucher, akpm, broonie, netdev, linma On 29/04/2022 11:13, duoming@zju.edu.cn wrote: > Hello, > > On Fri, 29 Apr 2022 09:27:48 +0200 Krzysztof wrote: > >>> There are destructive operations such as nfcmrvl_fw_dnld_abort and >>> gpio_free in nfcmrvl_nci_unregister_dev. The resources such as firmware, >>> gpio and so on could be destructed while the upper layer functions such as >>> nfcmrvl_fw_dnld_start and nfcmrvl_nci_recv_frame is executing, which leads >>> to double-free, use-after-free and null-ptr-deref bugs. >>> >>> There are three situations that could lead to double-free bugs. >>> >>> The first situation is shown below: >>> >>> (Thread 1) | (Thread 2) >>> nfcmrvl_fw_dnld_start | >>> ... | nfcmrvl_nci_unregister_dev >>> release_firmware() | nfcmrvl_fw_dnld_abort >>> kfree(fw) //(1) | fw_dnld_over >>> | release_firmware >>> ... | kfree(fw) //(2) >>> | ... >>> >>> The second situation is shown below: >>> >>> (Thread 1) | (Thread 2) >>> nfcmrvl_fw_dnld_start | >>> ... | >>> mod_timer | >>> (wait a time) | >>> fw_dnld_timeout | nfcmrvl_nci_unregister_dev >>> fw_dnld_over | nfcmrvl_fw_dnld_abort >>> release_firmware | fw_dnld_over >>> kfree(fw) //(1) | release_firmware >>> ... | kfree(fw) //(2) >> >> How exactly the case here is being prevented? >> >> If nfcmrvl_nci_unregister_dev() happens slightly earlier, before >> fw_dnld_timeout() on the left side (T1), the T1 will still hit it, won't it? > > I think it could be prevented. We use nci_unregister_device() to synchronize, if the > firmware download routine is running, the cleanup routine will wait it to finish. > The flag "fw_download_in_progress" will be set to false, if the the firmware download > routine is finished. fw_download_in_progress is not synchronized in nfcmrvl_nci_unregister_dev(), so even if fw_dnld_timeout() set it to true, the nfcmrvl_nci_unregister_dev() happening concurrently will not see updated fw_download_in_progress. > > Although the timer handler fw_dnld_timeout() could be running, nfcmrvl_nci_unregister_dev() > will check the flag "fw_download_in_progress" which is already set to false and nfcmrvl_fw_dnld_abort() > in nfcmrvl_nci_unregister_dev() will not execute. I am sorry, but you cannot move code around hoping it will by itself solve synchronization issues. Best regards, Krzysztof ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Re: [PATCH net v5 2/2] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs 2022-05-02 6:34 ` Krzysztof Kozlowski @ 2022-05-02 7:25 ` duoming 2022-05-04 6:32 ` Krzysztof Kozlowski 0 siblings, 1 reply; 12+ messages in thread From: duoming @ 2022-05-02 7:25 UTC (permalink / raw) To: Krzysztof Kozlowski Cc: linux-kernel, kuba, gregkh, davem, edumazet, pabeni, alexander.deucher, akpm, broonie, netdev, linma > -----原始邮件----- > 发件人: "Krzysztof Kozlowski" <krzysztof.kozlowski@linaro.org> > 发送时间: 2022-05-02 14:34:07 (星期一) > 收件人: duoming@zju.edu.cn > 抄送: linux-kernel@vger.kernel.org, kuba@kernel.org, gregkh@linuxfoundation.org, davem@davemloft.net, edumazet@google.com, pabeni@redhat.com, alexander.deucher@amd.com, akpm@linux-foundation.org, broonie@kernel.org, netdev@vger.kernel.org, linma@zju.edu.cn > 主题: Re: [PATCH net v5 2/2] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs > > On 29/04/2022 11:13, duoming@zju.edu.cn wrote: > > Hello, > > > > On Fri, 29 Apr 2022 09:27:48 +0200 Krzysztof wrote: > > > >>> There are destructive operations such as nfcmrvl_fw_dnld_abort and > >>> gpio_free in nfcmrvl_nci_unregister_dev. The resources such as firmware, > >>> gpio and so on could be destructed while the upper layer functions such as > >>> nfcmrvl_fw_dnld_start and nfcmrvl_nci_recv_frame is executing, which leads > >>> to double-free, use-after-free and null-ptr-deref bugs. > >>> > >>> There are three situations that could lead to double-free bugs. > >>> > >>> The first situation is shown below: > >>> > >>> (Thread 1) | (Thread 2) > >>> nfcmrvl_fw_dnld_start | > >>> ... | nfcmrvl_nci_unregister_dev > >>> release_firmware() | nfcmrvl_fw_dnld_abort > >>> kfree(fw) //(1) | fw_dnld_over > >>> | release_firmware > >>> ... | kfree(fw) //(2) > >>> | ... > >>> > >>> The second situation is shown below: > >>> > >>> (Thread 1) | (Thread 2) > >>> nfcmrvl_fw_dnld_start | > >>> ... | > >>> mod_timer | > >>> (wait a time) | > >>> fw_dnld_timeout | nfcmrvl_nci_unregister_dev > >>> fw_dnld_over | nfcmrvl_fw_dnld_abort > >>> release_firmware | fw_dnld_over > >>> kfree(fw) //(1) | release_firmware > >>> ... | kfree(fw) //(2) > >> > >> How exactly the case here is being prevented? > >> > >> If nfcmrvl_nci_unregister_dev() happens slightly earlier, before > >> fw_dnld_timeout() on the left side (T1), the T1 will still hit it, won't it? > > > > I think it could be prevented. We use nci_unregister_device() to synchronize, if the > > firmware download routine is running, the cleanup routine will wait it to finish. > > The flag "fw_download_in_progress" will be set to false, if the the firmware download > > routine is finished. > > fw_download_in_progress is not synchronized in > nfcmrvl_nci_unregister_dev(), so even if fw_dnld_timeout() set it to > true, the nfcmrvl_nci_unregister_dev() happening concurrently will not > see updated fw_download_in_progress. The fw_download_in_progress is set to false in nfc_fw_download(). The nfc_fw_download() is synchronized with nfc_unregister_device(). If nfc_fw_download() is running, nfc_unregister_device() will wait nfc_fw_download() to finish. So the nfcmrvl_nci_unregister_dev() could see the updated fw_download_in_progress. The process is shown below: (Thread 1) | (Thread 2) nfcmrvl_nci_unregister_dev | nfc_fw_download nci_unregister_device | ... | device_lock() ... | dev->fw_download_in_progress = false; //(1) | device_unlock() nfc_unregister_device | if (priv->ndev->nfc_dev->fw_download_in_progress) //(2) | nfcmrvl_fw_dnld_abort(priv); //not execute | We set fw_download_in_progress to false in position (1) and the check in position (2) will fail, the nfcmrvl_fw_dnld_abort() in nfcmrvl_nci_unregister_dev() will not execute. So the double-free bugs could be prevented. > > Although the timer handler fw_dnld_timeout() could be running, nfcmrvl_nci_unregister_dev() > > will check the flag "fw_download_in_progress" which is already set to false and nfcmrvl_fw_dnld_abort() > > in nfcmrvl_nci_unregister_dev() will not execute. > > I am sorry, but you cannot move code around hoping it will by itself > solve synchronization issues. I think this solution sove synchronization issues. If you still have any questions welcome to ask me. Best regards, Duoming Zhou ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v5 2/2] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs 2022-05-02 7:25 ` duoming @ 2022-05-04 6:32 ` Krzysztof Kozlowski 2022-05-04 9:07 ` duoming 0 siblings, 1 reply; 12+ messages in thread From: Krzysztof Kozlowski @ 2022-05-04 6:32 UTC (permalink / raw) To: duoming Cc: linux-kernel, kuba, gregkh, davem, edumazet, pabeni, alexander.deucher, akpm, broonie, netdev, linma On 02/05/2022 09:25, duoming@zju.edu.cn wrote: > > > >> -----原始邮件----- >> 发件人: "Krzysztof Kozlowski" <krzysztof.kozlowski@linaro.org> >> 发送时间: 2022-05-02 14:34:07 (星期一) >> 收件人: duoming@zju.edu.cn >> 抄送: linux-kernel@vger.kernel.org, kuba@kernel.org, gregkh@linuxfoundation.org, davem@davemloft.net, edumazet@google.com, pabeni@redhat.com, alexander.deucher@amd.com, akpm@linux-foundation.org, broonie@kernel.org, netdev@vger.kernel.org, linma@zju.edu.cn >> 主题: Re: [PATCH net v5 2/2] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs >> >> On 29/04/2022 11:13, duoming@zju.edu.cn wrote: >>> Hello, >>> >>> On Fri, 29 Apr 2022 09:27:48 +0200 Krzysztof wrote: >>> >>>>> There are destructive operations such as nfcmrvl_fw_dnld_abort and >>>>> gpio_free in nfcmrvl_nci_unregister_dev. The resources such as firmware, >>>>> gpio and so on could be destructed while the upper layer functions such as >>>>> nfcmrvl_fw_dnld_start and nfcmrvl_nci_recv_frame is executing, which leads >>>>> to double-free, use-after-free and null-ptr-deref bugs. >>>>> >>>>> There are three situations that could lead to double-free bugs. >>>>> >>>>> The first situation is shown below: >>>>> >>>>> (Thread 1) | (Thread 2) >>>>> nfcmrvl_fw_dnld_start | >>>>> ... | nfcmrvl_nci_unregister_dev >>>>> release_firmware() | nfcmrvl_fw_dnld_abort >>>>> kfree(fw) //(1) | fw_dnld_over >>>>> | release_firmware >>>>> ... | kfree(fw) //(2) >>>>> | ... >>>>> >>>>> The second situation is shown below: >>>>> >>>>> (Thread 1) | (Thread 2) >>>>> nfcmrvl_fw_dnld_start | >>>>> ... | >>>>> mod_timer | >>>>> (wait a time) | >>>>> fw_dnld_timeout | nfcmrvl_nci_unregister_dev >>>>> fw_dnld_over | nfcmrvl_fw_dnld_abort >>>>> release_firmware | fw_dnld_over >>>>> kfree(fw) //(1) | release_firmware >>>>> ... | kfree(fw) //(2) >>>> >>>> How exactly the case here is being prevented? >>>> >>>> If nfcmrvl_nci_unregister_dev() happens slightly earlier, before >>>> fw_dnld_timeout() on the left side (T1), the T1 will still hit it, won't it? >>> >>> I think it could be prevented. We use nci_unregister_device() to synchronize, if the >>> firmware download routine is running, the cleanup routine will wait it to finish. >>> The flag "fw_download_in_progress" will be set to false, if the the firmware download >>> routine is finished. >> >> fw_download_in_progress is not synchronized in >> nfcmrvl_nci_unregister_dev(), so even if fw_dnld_timeout() set it to >> true, the nfcmrvl_nci_unregister_dev() happening concurrently will not >> see updated fw_download_in_progress. > > The fw_download_in_progress is set to false in nfc_fw_download(). The nfc_fw_download() is > synchronized with nfc_unregister_device(). No, it is not. There is no synchronization primitive in nfc_unregister_device(), at least explicitly. > If nfc_fw_download() is running, nfc_unregister_device() > will wait nfc_fw_download() to finish. So the nfcmrvl_nci_unregister_dev() could see the updated > fw_download_in_progress. The process is shown below: > > (Thread 1) | (Thread 2) > nfcmrvl_nci_unregister_dev | nfc_fw_download > nci_unregister_device | ... > | device_lock() > ... | dev->fw_download_in_progress = false; //(1) > | device_unlock() > nfc_unregister_device | > if (priv->ndev->nfc_dev->fw_download_in_progress) //(2) | > nfcmrvl_fw_dnld_abort(priv); //not execute | > > We set fw_download_in_progress to false in position (1) and the check in position (2) will fail, > the nfcmrvl_fw_dnld_abort() in nfcmrvl_nci_unregister_dev() will not execute. So the double-free > bugs could be prevented. You just repeated the same not answering the question. The fw_download_in_progress at point (2) can be still true, on that CPU. I explain it third time so let me rephrase it - the fw_download_in_progress can be reordered by compiler or CPU to: T1 | T2 nfcmrvl_nci_unregister_dev() nci_unregister_device() var = fw_download_in_progress; (true) | nfc_fw_download | device_lock | dev->fw_download = false; | device_unlock if (var) | nfcmrvl_fw_dnld_abort(priv); | Every write barrier must be paired with read barrier. Every lock on one access to variable, must be paired with same lock on other access to variable . > >>> Although the timer handler fw_dnld_timeout() could be running, nfcmrvl_nci_unregister_dev() >>> will check the flag "fw_download_in_progress" which is already set to false and nfcmrvl_fw_dnld_abort() >>> in nfcmrvl_nci_unregister_dev() will not execute. >> >> I am sorry, but you cannot move code around hoping it will by itself >> solve synchronization issues. > > I think this solution sove synchronization issues. If you still have any questions welcome to ask me. No, you still do not get the pint. You cannot move code around because this itself does not solve missing synchronization primitives and related issues. Best regards, Krzysztof ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v5 2/2] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs 2022-05-04 6:32 ` Krzysztof Kozlowski @ 2022-05-04 9:07 ` duoming 0 siblings, 0 replies; 12+ messages in thread From: duoming @ 2022-05-04 9:07 UTC (permalink / raw) To: Krzysztof Kozlowski Cc: linux-kernel, kuba, gregkh, davem, edumazet, pabeni, alexander.deucher, akpm, broonie, netdev, linma Hello, On Wed, 4 May 2022 08:32:15 +0200 Krzysztof wrote: > >>>>> There are destructive operations such as nfcmrvl_fw_dnld_abort and > >>>>> gpio_free in nfcmrvl_nci_unregister_dev. The resources such as firmware, > >>>>> gpio and so on could be destructed while the upper layer functions such as > >>>>> nfcmrvl_fw_dnld_start and nfcmrvl_nci_recv_frame is executing, which leads > >>>>> to double-free, use-after-free and null-ptr-deref bugs. > >>>>> > >>>>> There are three situations that could lead to double-free bugs. > >>>>> > >>>>> The first situation is shown below: > >>>>> > >>>>> (Thread 1) | (Thread 2) > >>>>> nfcmrvl_fw_dnld_start | > >>>>> ... | nfcmrvl_nci_unregister_dev > >>>>> release_firmware() | nfcmrvl_fw_dnld_abort > >>>>> kfree(fw) //(1) | fw_dnld_over > >>>>> | release_firmware > >>>>> ... | kfree(fw) //(2) > >>>>> | ... > >>>>> > >>>>> The second situation is shown below: > >>>>> > >>>>> (Thread 1) | (Thread 2) > >>>>> nfcmrvl_fw_dnld_start | > >>>>> ... | > >>>>> mod_timer | > >>>>> (wait a time) | > >>>>> fw_dnld_timeout | nfcmrvl_nci_unregister_dev > >>>>> fw_dnld_over | nfcmrvl_fw_dnld_abort > >>>>> release_firmware | fw_dnld_over > >>>>> kfree(fw) //(1) | release_firmware > >>>>> ... | kfree(fw) //(2) > >>>> > >>>> How exactly the case here is being prevented? > >>>> > >>>> If nfcmrvl_nci_unregister_dev() happens slightly earlier, before > >>>> fw_dnld_timeout() on the left side (T1), the T1 will still hit it, won't it? > >>> > >>> I think it could be prevented. We use nci_unregister_device() to synchronize, if the > >>> firmware download routine is running, the cleanup routine will wait it to finish. > >>> The flag "fw_download_in_progress" will be set to false, if the the firmware download > >>> routine is finished. > >> > >> fw_download_in_progress is not synchronized in > >> nfcmrvl_nci_unregister_dev(), so even if fw_dnld_timeout() set it to > >> true, the nfcmrvl_nci_unregister_dev() happening concurrently will not > >> see updated fw_download_in_progress. > > > > The fw_download_in_progress is set to false in nfc_fw_download(). The nfc_fw_download() is > > synchronized with nfc_unregister_device(). > > No, it is not. There is no synchronization primitive in > nfc_unregister_device(), at least explicitly. There is device_lock() in nfc_fw_download() which could synchronize with nfc_unregister_device(). The code in nfc_unregister_device() is shown below: nfc_unregister_device() ... device_lock(&dev->dev); ... dev->shutting_down = true; device_unlock(&dev->dev); > > If nfc_fw_download() is running, nfc_unregister_device() > > will wait nfc_fw_download() to finish. So the nfcmrvl_nci_unregister_dev() could see the updated > > fw_download_in_progress. The process is shown below: > > > > (Thread 1) | (Thread 2) > > nfcmrvl_nci_unregister_dev | nfc_fw_download > > nci_unregister_device | ... > > | device_lock() > > ... | dev->fw_download_in_progress = false; //(1) > > | device_unlock() > > nfc_unregister_device | > > if (priv->ndev->nfc_dev->fw_download_in_progress) //(2) | > > nfcmrvl_fw_dnld_abort(priv); //not execute | > > > > We set fw_download_in_progress to false in position (1) and the check in position (2) will fail, > > the nfcmrvl_fw_dnld_abort() in nfcmrvl_nci_unregister_dev() will not execute. So the double-free > > bugs could be prevented. > > You just repeated the same not answering the question. The > fw_download_in_progress at point (2) can be still true, on that CPU. I > explain it third time so let me rephrase it - the > fw_download_in_progress can be reordered by compiler or CPU to: > > T1 | T2 > nfcmrvl_nci_unregister_dev() > nci_unregister_device() > var = fw_download_in_progress; (true) > | nfc_fw_download > | device_lock > | dev->fw_download = false; > | device_unlock > if (var) | > nfcmrvl_fw_dnld_abort(priv); | > > Every write barrier must be paired with read barrier. Every lock on one > access to variable, must be paired with same lock on other access to > variable . There is paired write barrier and read barrier "device_lock(&dev->dev)" between nfc_unregister_device() and nfc_fw_download(). The shutting_down flag of nfc_dev protected by device_lock() is set to true in position(2) and check in position(1). (Thread 1) | (Thread 2) nfcmrvl_nci_unregister_dev | nfc_fw_download nci_unregister_device | ... | device_lock(&dev->dev) ... | if (dev->shutting_down) //(1) | goto error; | ... | dev->fw_download_in_progress = false; //(3) | device_unlock(&dev->dev) nfc_unregister_device | device_lock(&dev->dev); | dev->shutting_down = true; //(2) | device_unlock(&dev->dev); | if (priv->ndev->nfc_dev->fw_download_in_progress)//(4)| nfcmrvl_fw_dnld_abort(priv); | If nfc_fw_download() is running, nfc_unregister_device() will wait nfc_fw_download() to finish, and the nfc_dev->fw_download_in_progress is set to false in position (3). The check in position(4) is false, because we have set nfc_dev->fw_download_in_progress to false in position(3). If we set shutting_down to true in position(2) before check in position(1) the nfc_fw_download() will goto error. > >>> Although the timer handler fw_dnld_timeout() could be running, nfcmrvl_nci_unregister_dev() > >>> will check the flag "fw_download_in_progress" which is already set to false and nfcmrvl_fw_dnld_abort() > >>> in nfcmrvl_nci_unregister_dev() will not execute. > >> > >> I am sorry, but you cannot move code around hoping it will by itself > >> solve synchronization issues. > > > > I think this solution sove synchronization issues. If you still have any questions welcome to ask me. > > No, you still do not get the pint. You cannot move code around because > this itself does not solve missing synchronization primitives and > related issues. I think this solution could solve synchronization issues. If you still have any questions welcome to ask me. Best regards, Duoming Zhou ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2022-05-04 9:08 UTC | newest] Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-04-29 1:14 [PATCH net v5 0/2] Replace improper checks and fix bugs in nfc subsystem Duoming Zhou 2022-04-29 1:14 ` [PATCH net v5 1/2] nfc: replace improper check device_is_registered() in netlink related functions Duoming Zhou 2022-04-29 7:03 ` Greg KH 2022-04-29 7:19 ` Krzysztof Kozlowski 2022-04-29 9:18 ` duoming 2022-04-29 1:14 ` [PATCH net v5 2/2] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs Duoming Zhou 2022-04-29 7:27 ` Krzysztof Kozlowski 2022-04-29 9:13 ` duoming 2022-05-02 6:34 ` Krzysztof Kozlowski 2022-05-02 7:25 ` duoming 2022-05-04 6:32 ` Krzysztof Kozlowski 2022-05-04 9:07 ` duoming
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.