All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net 000/117] net: avoid to remove module when its debugfs is being used
@ 2020-10-08 15:48 Taehee Yoo
  2020-10-08 15:59   ` David Laight
  0 siblings, 1 reply; 38+ messages in thread
From: Taehee Yoo @ 2020-10-08 15:48 UTC (permalink / raw)
  To: davem, kuba, netdev
  Cc: ap420073, linux-wireless, wil6210, brcm80211-dev-list, b43-dev,
	linux-bluetooth

When debugfs file is opened, its module should not be removed until
it's closed.
Because debugfs internally uses the module's data.
So, it could access freed memory.

In order to avoid panic, it just sets .owner to THIS_MODULE.
So that all modules will be held when its debugfs file is opened.


Test commands:
cat <<EOF > open.c
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
        int fd = open(argv[1], O_RDONLY);

        if(fd < 0) {
                printf("failed to open\n");
                return 1;
        }

        usleep(3000000);

        close(fd);
        return 0;
}
EOF
gcc -o open open.c
modprobe netdevsim
echo 1 > /sys/bus/netdevsim/new_device
./open /sys/kernel/debug/netdevsim/netdevsim1/take_snapshot &
modprobe -rv netdevsim

Splat looks like:
[  115.765838][  T713] BUG: unable to handle page fault for address: fffffbfff8077fa0
[  115.768324][  T713] #PF: supervisor read access in kernel mode
[  115.770196][  T713] #PF: error_code(0x0000) - not-present page
[  115.772056][  T713] PGD 1237ee067 P4D 1237ee067 PUD 123642067 PMD 117501067 PTE 0
[  115.774455][  T713] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[  115.776429][  T713] CPU: 1 PID: 713 Comm: open Not tainted 5.9.0-rc8+ #756
[  115.778641][  T713] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[  115.781726][  T713] RIP: 0010:full_proxy_release+0xca/0x290
[  115.783522][  T713] Code: c1 ea 03 80 3c 02 00 0f 85 60 01 00 00 49 8d bc 24 80 00 00 00 4c 8b 73 28 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 71 01 00 00 49 8b 84 24 80 00 00 00 48 85 c0 0f
[  115.789724][  T713] RSP: 0018:ffff8880bad97e38 EFLAGS: 00010a06
[  115.791635][  T713] RAX: dffffc0000000000 RBX: ffff88810805de00 RCX: ffff88810805de28
[  115.794153][  T713] RDX: 1ffffffff8077fa0 RSI: ffff88810805de00 RDI: ffffffffc03bfd00
[  115.796682][  T713] RBP: ffff8880a60abd20 R08: ffff8880a60abd20 R09: ffff8880bb86e920
[  115.799188][  T713] R10: ffff8880bad97e78 R11: ffffed1016e5ab9d R12: ffffffffc03bfc80
[  115.801703][  T713] R13: ffff88810805de28 R14: ffff88810d0ac100 R15: ffff8880bb86e920
[  115.804204][  T713] FS:  00007fb71e5fb4c0(0000) GS:ffff888118e00000(0000) knlGS:0000000000000000
[  115.807018][  T713] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  115.809077][  T713] CR2: fffffbfff8077fa0 CR3: 00000000b8784005 CR4: 00000000003706e0
[  115.811588][  T713] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  115.814089][  T713] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  115.816573][  T713] Call Trace:
[  115.817620][  T713]  __fput+0x1ff/0x820
[  115.818886][  T713]  ? trace_hardirqs_on+0x45/0x190
[  115.820459][  T713]  task_work_run+0xd3/0x170
[  115.821876][  T713]  exit_to_user_mode_prepare+0x15b/0x160
[  115.823637][  T713]  syscall_exit_to_user_mode+0x44/0x270
[  115.825364][  T713]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  115.827857][  T713] RIP: 0033:0x7fb71e1019e4
[ ... ]


Taehee Yoo (117):
  mac80211: set .owner to THIS_MODULE in debugfs_netdev.c
  mac80211: set rcname_ops.owner to THIS_MODULE
  mac80211: set minstrel_ht_stat_fops.owner to THIS_MODULE
  mac80211: set minstrel_ht_stat_csv_fops.owner to THIS_MODULE
  mac80211: set KEY_OPS.owner to THIS_MODULE
  mac80211: set KEY_OPS_W.owner to THIS_MODULE
  mac80211: set KEY_CONF_OPS.owner to THIS_MODULE
  mac80211: set STA_OPS.owner to THIS_MODULE
  mac80211: set STA_OPS_RW.owner to THIS_MODULE
  mac80211: set DEBUGFS_READONLY_FILE_OPS.owner to THIS_MODULE
  mac80211: set aqm_ops.owner to THIS_MODULE
  mac80211: debugfs: set airtime_flags_ops.owner to THIS_MODULE
  mac80211: set aql_txq_limit_ops.owner to THIS_MODULE
  mac80211: set force_tx_status_ops.owner to THIS_MODULE
  mac80211: set reset_ops.owner to THIS_MODULE
  mac80211: set DEBUGFS_DEVSTATS_FILE.owner to THIS_MODULE
  mac80211/cfg80211: set DEBUGFS_READONLY_FILE.owner to THIS_MODULE
  cfg80211: set ht40allow_map_ops.owner to THIS_MODULE
  net: hsr: set hsr_fops.owner to THIS_MODULE
  batman-adv: set batadv_log_fops.owner to THIS_MODULE
  6lowpan: iphc: set lowpan_ctx_pfx_fops.owner to THIS_MODULE
  netdevsim: set nsim_dev_health_break_fops.owner to THIS_MODULE
  netdevsim: set nsim_udp_tunnels_info_reset_fops.owner to THIS_MODULE
  netdevsim: set nsim_dev_take_snapshot_fops.owner to THIS_MODULE
  netdevsim: set nsim_dev_trap_fa_cookie_fops.owner to THIS_MODULE
  ieee802154: set test_int_fops.owner to THIS_MODULE
  i2400m: set i2400m_rx_stats_fops.owner to THIS_MODULE
  i2400m: set i2400m_tx_stats_fops.owner to THIS_MODULE
  dpaa2-eth: set dpaa2_dbg_cpu_ops.owner to THIS_MODULE
  dpaa2-eth: set dpaa2_dbg_fq_ops.owner to THIS_MODULE
  dpaa2-eth: set dpaa2_dbg_ch_ops.owner to THIS_MODULE
  wl1271: set DEBUGFS_READONLY_FILE.owner to THIS_MODULE
  wl1271: set DEBUGFS_FWSTATS_FILE.owner to THIS_MODULE
  wlcore: set DEBUGFS_FWSTATS_FILE_ARRAY.owner to THIS_MODULE
  wl12xx: set DEBUGFS_FWSTATS_FILE_ARRAY.owner to THIS_MODULE
  wl12xx: set DEBUGFS_READONLY_FILE.owner to THIS_MODULE
  wl12xx: set tx_queue_len_ops.owner to THIS_MODULE
  wl1251: set tx_queue_status_ops.owner to THIS_MODULE
  iwlwifi: set rs_sta_dbgfs_scale_table_ops.owner to THIS_MODULE
  iwlwifi: set rs_sta_dbgfs_stats_table_ops.owner to THIS_MODULE
  iwlwifi: set DEBUGFS_READ_FILE_OPS.owner to THIS_MODULE
  iwlwifi: set DEBUGFS_WRITE_FILE_OPS.owner to THIS_MODULE
  iwlwifi: set DEBUGFS_READ_WRITE_FILE_OPS.owner to THIS_MODULE
  iwlwifi: set rs_sta_dbgfs_scale_table_ops.owner to THIS_MODULE
  iwlwifi: set rs_sta_dbgfs_stats_table_ops.owner to THIS_MODULE
  iwlwifi: mvm: set rs_sta_dbgfs_drv_tx_stats_ops.owner to THIS_MODULE
  iwlwifi: mvm: set .owner to THIS_MODULE in debugfs.h
  iwlwifi: mvm: set iwl_dbgfs_mem_ops.owner to THIS_MODULE
  iwlwifi: runtime: set _FWRT_DEBUGFS_READ_FILE_OPS.owner to THIS_MODULE
  iwlwifi: runtime: set _FWRT_DEBUGFS_READ_WRITE_FILE_OPS.owner to
    THIS_MODULE
  iwlwifi: runtime: set _FWRT_DEBUGFS_WRITE_FILE_OPS.owner to
    THIS_MODULE
  iwlwifi: set DEBUGFS_READ_FILE_OPS.owner to THIS_MODULE
  iwlwifi: set DEBUGFS_WRITE_FILE_OPS.owner to THIS_MODULE
  iwlwifi: set DEBUGFS_READ_WRITE_FILE_OPS.owner to THIS_MODULE
  iwlwifi: set rs_sta_dbgfs_stats_table_ops.owner to THIS_MODULE
  iwlwifi: set rs_sta_dbgfs_scale_table_ops.owner to THIS_MODULE
  iwlwifi: set rs_sta_dbgfs_stats_table_ops.owner to THIS_MODULE
  iwlwifi: set rs_sta_dbgfs_rate_scale_data_ops.owner to THIS_MODULE
  iwlagn: set rs_sta_dbgfs_rate_scale_data_ops.owner to THIS_MODULE
  iwlagn: set DEBUGFS_READ_FILE_OPS.owner to THIS_MODULE
  iwlagn: set DEBUGFS_WRITE_FILE_OPS.owner to THIS_MODULE
  iwlagn: set DEBUGFS_READ_WRITE_FILE_OPS.owner to THIS_MODULE
  rtlwifi: set file_ops_common.owner to THIS_MODULE
  ath11k: set fops_extd_tx_stats.owner to THIS_MODULE
  ath11k: set fops_extd_rx_stats.owner to THIS_MODULE
  ath11k: set fops_pktlog_filter.owner to THIS_MODULE
  ath11k: set fops_simulate_radar.owner to THIS_MODULE
  ath10k: set fops_pktlog_filter.owner to THIS_MODULE
  ath10k: set fops_quiet_period.owner to THIS_MODULE
  ath10k: set fops_btcoex.owner to THIS_MODULE
  ath10k: set fops_enable_extd_tx_stats.owner to THIS_MODULE
  ath10k: set fops_peer_stats.owner to THIS_MODULE
  wcn36xx: set fops_wcn36xx_bmps.owner to THIS_MODULE
  wcn36xx: set fops_wcn36xx_dump.owner to THIS_MODULE
  wireless: set fops_ioblob.owner to THIS_MODULE
  wil6210: set fops_rxon.owner to THIS_MODULE
  wil6210: set fops_rbufcap.owner to THIS_MODULE
  wil6210: set fops_back.owner to THIS_MODULE
  wil6210: set fops_pmccfg.owner to THIS_MODULE
  wil6210: set fops_pmcdata.owner to THIS_MODULE
  wil6210: set fops_pmcring.owner to THIS_MODULE
  wil6210: set fops_txmgmt.owner to THIS_MODULE
  wil6210: set fops_wmi.owner to THIS_MODULE
  wil6210: set fops_recovery.owner to THIS_MODULE
  wil6210: set fops_tx_latency.owner to THIS_MODULE
  wil6210: set fops_link_stats.owner to THIS_MODULE
  wil6210: set fops_link_stats_global.owner to THIS_MODULE
  wil6210: set fops_led_cfg.owner to THIS_MODULE
  wil6210: set fops_led_blink_time.owner to THIS_MODULE
  wil6210: set fops_fw_capabilities.owner to THIS_MODULE
  wil6210: set fops_fw_version.owner to THIS_MODULE
  wil6210: set fops_suspend_stats.owner to THIS_MODULE
  wil6210: set fops_compressed_rx_status.owner to THIS_MODULE
  cw1200: set fops_wsm_dumps.owner to THIS_MODULE
  brcmfmac: set bus_reset_fops.owner to THIS_MODULE
  b43legacy: set B43legacy_DEBUGFS_FOPS.owner to THIS_MODULE
  b43: set B43_DEBUGFS_FOPS.owner to THIS_MODULE
  wireless: mwifiex: set .owner to THIS_MODULE in debugfs.c
  net: mt7601u: set fops_ampdu_stat.owner to THIS_MODULE
  net: mt7601u: set fops_eeprom_param.owner to THIS_MODULE
  mt76: mt7615: set fops_ampdu_stat.owner to THIS_MODULE
  mt76: mt7603: set fops_ampdu_stat.owner to THIS_MODULE
  mt76: set fops_ampdu_stat.owner to THIS_MODULE
  mt76: set fops_tx_stats.owner to THIS_MODULE
  mt76: mt7915: set fops_sta_stats.owner to THIS_MODULE
  Bluetooth: set dut_mode_fops.owner to THIS_MODULE
  Bluetooth: set vendor_diag_fops.owner to THIS_MODULE
  Bluetooth: set force_bredr_smp_fops.owner to THIS_MODULE
  Bluetooth: set test_smp_fops.owner to THIS_MODULE
  Bluetooth: set use_debug_keys_fops.owner to THIS_MODULE
  Bluetooth: set sc_only_mode_fops.owner to THIS_MODULE
  Bluetooth: set DEFINE_QUIRK_ATTRIBUTE.owner to THIS_MODULE
  Bluetooth: set ssp_debug_mode_fops.owner to THIS_MODULE
  Bluetooth: set force_static_address_fops.owner to THIS_MODULE
  Bluetooth: set force_no_mitm_fops.owner to THIS_MODULE
  Bluetooth: 6LoWPAN: set lowpan_control_fops.owner to THIS_MODULE
  Bluetooth: set test_ecdh_fops.owner to THIS_MODULE

 .../freescale/dpaa2/dpaa2-eth-debugfs.c       |  3 +++
 drivers/net/ieee802154/ca8210.c               |  3 ++-
 drivers/net/netdevsim/dev.c                   |  2 ++
 drivers/net/netdevsim/health.c                |  1 +
 drivers/net/netdevsim/udp_tunnels.c           |  1 +
 drivers/net/wimax/i2400m/debugfs.c            |  2 ++
 drivers/net/wireless/ath/ath10k/debug.c       | 15 ++++++++++-----
 drivers/net/wireless/ath/ath11k/debug.c       | 10 +++++++---
 drivers/net/wireless/ath/wcn36xx/debug.c      |  2 ++
 drivers/net/wireless/ath/wil6210/debugfs.c    | 19 +++++++++++++++++++
 drivers/net/wireless/broadcom/b43/debugfs.c   |  1 +
 .../net/wireless/broadcom/b43legacy/debugfs.c |  1 +
 .../broadcom/brcm80211/brcmfmac/core.c        |  1 +
 drivers/net/wireless/intel/iwlegacy/3945-rs.c |  1 +
 drivers/net/wireless/intel/iwlegacy/4965-rs.c |  3 +++
 drivers/net/wireless/intel/iwlegacy/debug.c   |  3 +++
 .../net/wireless/intel/iwlwifi/dvm/debugfs.c  |  3 +++
 drivers/net/wireless/intel/iwlwifi/dvm/rs.c   |  3 +++
 .../net/wireless/intel/iwlwifi/fw/debugfs.c   |  3 +++
 .../net/wireless/intel/iwlwifi/mvm/debugfs.c  |  1 +
 .../net/wireless/intel/iwlwifi/mvm/debugfs.h  |  3 +++
 drivers/net/wireless/intel/iwlwifi/mvm/rs.c   |  3 +++
 .../net/wireless/intel/iwlwifi/pcie/trans.c   |  3 +++
 .../net/wireless/marvell/mwifiex/debugfs.c    |  3 +++
 .../wireless/mediatek/mt76/mt7603/debugfs.c   |  1 +
 .../wireless/mediatek/mt76/mt7615/debugfs.c   |  1 +
 .../wireless/mediatek/mt76/mt76x02_debugfs.c  |  2 ++
 .../wireless/mediatek/mt76/mt7915/debugfs.c   |  2 ++
 .../net/wireless/mediatek/mt7601u/debugfs.c   |  2 ++
 drivers/net/wireless/realtek/rtlwifi/debug.c  |  1 +
 drivers/net/wireless/st/cw1200/debug.c        |  1 +
 drivers/net/wireless/ti/wl1251/debugfs.c      |  4 ++++
 drivers/net/wireless/ti/wlcore/debugfs.h      |  3 +++
 net/6lowpan/debugfs.c                         |  1 +
 net/batman-adv/log.c                          |  1 +
 net/bluetooth/6lowpan.c                       |  1 +
 net/bluetooth/hci_core.c                      |  2 ++
 net/bluetooth/hci_debugfs.c                   |  6 ++++++
 net/bluetooth/selftest.c                      |  1 +
 net/bluetooth/smp.c                           |  2 ++
 net/hsr/hsr_debugfs.c                         |  1 +
 net/mac80211/debugfs.c                        |  7 +++++++
 net/mac80211/debugfs_key.c                    |  3 +++
 net/mac80211/debugfs_netdev.c                 |  1 +
 net/mac80211/debugfs_sta.c                    |  2 ++
 net/mac80211/rate.c                           |  1 +
 net/mac80211/rc80211_minstrel_ht_debugfs.c    |  2 ++
 net/wireless/debugfs.c                        |  2 ++
 48 files changed, 131 insertions(+), 9 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH net 000/117] net: avoid to remove module when its debugfs is being used
  2020-10-08 15:48 [PATCH net 000/117] net: avoid to remove module when its debugfs is being used Taehee Yoo
@ 2020-10-08 15:59   ` David Laight
  0 siblings, 0 replies; 38+ messages in thread
From: David Laight @ 2020-10-08 15:59 UTC (permalink / raw)
  To: 'Taehee Yoo', davem, kuba, netdev
  Cc: linux-wireless, wil6210, brcm80211-dev-list, b43-dev, linux-bluetooth

From: Taehee Yoo
> Sent: 08 October 2020 16:49
> 
> When debugfs file is opened, its module should not be removed until
> it's closed.
> Because debugfs internally uses the module's data.
> So, it could access freed memory.
> 
> In order to avoid panic, it just sets .owner to THIS_MODULE.
> So that all modules will be held when its debugfs file is opened.

Can't you fix it in common code?

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH net 000/117] net: avoid to remove module when its debugfs is being used
@ 2020-10-08 15:59   ` David Laight
  0 siblings, 0 replies; 38+ messages in thread
From: David Laight @ 2020-10-08 15:59 UTC (permalink / raw)
  To: 'Taehee Yoo', davem, kuba, netdev
  Cc: linux-wireless, wil6210, brcm80211-dev-list, b43-dev, linux-bluetooth

From: Taehee Yoo
> Sent: 08 October 2020 16:49
> 
> When debugfs file is opened, its module should not be removed until
> it's closed.
> Because debugfs internally uses the module's data.
> So, it could access freed memory.
> 
> In order to avoid panic, it just sets .owner to THIS_MODULE.
> So that all modules will be held when its debugfs file is opened.

Can't you fix it in common code?

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net 000/117] net: avoid to remove module when its debugfs is being used
  2020-10-08 15:59   ` David Laight
@ 2020-10-08 16:14     ` Johannes Berg
  -1 siblings, 0 replies; 38+ messages in thread
From: Johannes Berg @ 2020-10-08 16:14 UTC (permalink / raw)
  To: David Laight, 'Taehee Yoo', davem, kuba, netdev, Nicolai Stange
  Cc: linux-wireless, wil6210, brcm80211-dev-list, b43-dev, linux-bluetooth

On Thu, 2020-10-08 at 15:59 +0000, David Laight wrote:
> From: Taehee Yoo
> > Sent: 08 October 2020 16:49
> > 
> > When debugfs file is opened, its module should not be removed until
> > it's closed.
> > Because debugfs internally uses the module's data.
> > So, it could access freed memory.
> > 
> > In order to avoid panic, it just sets .owner to THIS_MODULE.
> > So that all modules will be held when its debugfs file is opened.
> 
> Can't you fix it in common code?

Yeah I was just wondering that too - weren't the proxy_fops even already
intended to fix this?

The modules _should_ be removing the debugfs files, and then the
proxy_fops should kick in, no?

So where's the issue?

johannes


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH net 000/117] net: avoid to remove module when its debugfs is being used
@ 2020-10-08 16:14     ` Johannes Berg
  0 siblings, 0 replies; 38+ messages in thread
From: Johannes Berg @ 2020-10-08 16:14 UTC (permalink / raw)
  To: David Laight, 'Taehee Yoo', davem, kuba, netdev, Nicolai Stange
  Cc: linux-wireless, wil6210, brcm80211-dev-list, b43-dev, linux-bluetooth

On Thu, 2020-10-08 at 15:59 +0000, David Laight wrote:
> From: Taehee Yoo
> > Sent: 08 October 2020 16:49
> > 
> > When debugfs file is opened, its module should not be removed until
> > it's closed.
> > Because debugfs internally uses the module's data.
> > So, it could access freed memory.
> > 
> > In order to avoid panic, it just sets .owner to THIS_MODULE.
> > So that all modules will be held when its debugfs file is opened.
> 
> Can't you fix it in common code?

Yeah I was just wondering that too - weren't the proxy_fops even already
intended to fix this?

The modules _should_ be removing the debugfs files, and then the
proxy_fops should kick in, no?

So where's the issue?

johannes

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net 000/117] net: avoid to remove module when its debugfs is being used
  2020-10-08 16:14     ` Johannes Berg
@ 2020-10-08 16:37       ` Taehee Yoo
  -1 siblings, 0 replies; 38+ messages in thread
From: Taehee Yoo @ 2020-10-08 16:37 UTC (permalink / raw)
  To: Johannes Berg
  Cc: David Laight, davem, kuba, netdev, Nicolai Stange,
	linux-wireless, wil6210, brcm80211-dev-list, b43-dev,
	linux-bluetooth

On Fri, 9 Oct 2020 at 01:14, Johannes Berg <johannes@sipsolutions.net> wrote:
On Thu, 2020-10-08 at 15:59 +0000, David Laight wrote:

Hi Johannes and David,
Thank you for the review!

> From: Taehee Yoo
> > Sent: 08 October 2020 16:49
> >
> > When debugfs file is opened, its module should not be removed until
> > it's closed.
> > Because debugfs internally uses the module's data.
> > So, it could access freed memory.
> >
> > In order to avoid panic, it just sets .owner to THIS_MODULE.
> > So that all modules will be held when its debugfs file is opened.
>
> Can't you fix it in common code?

> Yeah I was just wondering that too - weren't the proxy_fops even already
> intended to fix this?

I didn't try to fix this issue in the common code(debugfs).
Because I thought It's a typical pattern of panic and THIS_MODULE
can fix it clearly.
So I couldn't think there is a root reason in the common code.

> The modules _should_ be removing the debugfs files, and then the
> proxy_fops should kick in, no?

If I understand your mention correctly,
you mean that when the module is being removed, the opened file
should be closed automatically by debugfs filesystem.
Is that right?

> So where's the issue?

> johannes

Thanks a lot!
Taehee

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH net 000/117] net: avoid to remove module when its debugfs is being used
@ 2020-10-08 16:37       ` Taehee Yoo
  0 siblings, 0 replies; 38+ messages in thread
From: Taehee Yoo @ 2020-10-08 16:37 UTC (permalink / raw)
  To: Johannes Berg
  Cc: David Laight, davem, kuba, netdev, Nicolai Stange,
	linux-wireless, wil6210, brcm80211-dev-list, b43-dev,
	linux-bluetooth

On Fri, 9 Oct 2020 at 01:14, Johannes Berg <johannes@sipsolutions.net> wrote:
On Thu, 2020-10-08 at 15:59 +0000, David Laight wrote:

Hi Johannes and David,
Thank you for the review!

> From: Taehee Yoo
> > Sent: 08 October 2020 16:49
> >
> > When debugfs file is opened, its module should not be removed until
> > it's closed.
> > Because debugfs internally uses the module's data.
> > So, it could access freed memory.
> >
> > In order to avoid panic, it just sets .owner to THIS_MODULE.
> > So that all modules will be held when its debugfs file is opened.
>
> Can't you fix it in common code?

> Yeah I was just wondering that too - weren't the proxy_fops even already
> intended to fix this?

I didn't try to fix this issue in the common code(debugfs).
Because I thought It's a typical pattern of panic and THIS_MODULE
can fix it clearly.
So I couldn't think there is a root reason in the common code.

> The modules _should_ be removing the debugfs files, and then the
> proxy_fops should kick in, no?

If I understand your mention correctly,
you mean that when the module is being removed, the opened file
should be closed automatically by debugfs filesystem.
Is that right?

> So where's the issue?

> johannes

Thanks a lot!
Taehee

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net 000/117] net: avoid to remove module when its debugfs is being used
  2020-10-08 16:14     ` Johannes Berg
@ 2020-10-09  5:09       ` Nicolai Stange
  -1 siblings, 0 replies; 38+ messages in thread
From: Nicolai Stange @ 2020-10-09  5:09 UTC (permalink / raw)
  To: Johannes Berg
  Cc: David Laight, 'Taehee Yoo',
	davem, kuba, netdev, Nicolai Stange, linux-wireless, wil6210,
	brcm80211-dev-list, b43-dev, linux-bluetooth

Johannes Berg <johannes@sipsolutions.net> writes:

> On Thu, 2020-10-08 at 15:59 +0000, David Laight wrote:
>> From: Taehee Yoo
>> > Sent: 08 October 2020 16:49
>> > 
>> > When debugfs file is opened, its module should not be removed until
>> > it's closed.
>> > Because debugfs internally uses the module's data.
>> > So, it could access freed memory.
>> > 
>> > In order to avoid panic, it just sets .owner to THIS_MODULE.
>> > So that all modules will be held when its debugfs file is opened.
>> 
>> Can't you fix it in common code?

Probably not: it's the call to ->release() that's faulting in the Oops
quoted in the cover letter and that one can't be protected by the
core debugfs code, unfortunately.

There's a comment in full_proxy_release(), which reads as

	/*
	 * We must not protect this against removal races here: the
	 * original releaser should be called unconditionally in order
	 * not to leak any resources. Releasers must not assume that
	 * ->i_private is still being meaningful here.
	 */

> Yeah I was just wondering that too - weren't the proxy_fops even already
> intended to fix this?

No, as far as file_operations are concerned, the proxy fops's intent was
only to ensure that the memory the file_operations' ->owner resides in
is still valid so that try_module_get() won't splat at file open
(c.f. [1]).

You're right that the default "full" proxy fops do prevent all
file_operations but ->release() from getting invoked on removed files,
but the motivation had not been to protect the file_operations
themselves, but accesses to any stale data associated with removed files
([2]).


> The modules _should_ be removing the debugfs files, and then the
> proxy_fops should kick in, no?

No, as said, not for ->release(). I haven't looked into the inidividual
patches here, but setting ->owner indeed sounds like the right thing to
do.

But you're right that modules should be removing any left debugfs files
at exit.

Thanks,

Nicolai


[1] 9fd4dcece43a ("debugfs: prevent access to possibly dead
                   file_operations at file open")
[2] 49d200deaa68 ("debugfs: prevent access to removed files' private data")

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg), GF: Felix Imendörffer

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH net 000/117] net: avoid to remove module when its debugfs is being used
@ 2020-10-09  5:09       ` Nicolai Stange
  0 siblings, 0 replies; 38+ messages in thread
From: Nicolai Stange @ 2020-10-09  5:09 UTC (permalink / raw)
  To: Johannes Berg
  Cc: David Laight, 'Taehee Yoo',
	davem, kuba, netdev, Nicolai Stange, linux-wireless, wil6210,
	brcm80211-dev-list, b43-dev, linux-bluetooth

Johannes Berg <johannes@sipsolutions.net> writes:

> On Thu, 2020-10-08 at 15:59 +0000, David Laight wrote:
>> From: Taehee Yoo
>> > Sent: 08 October 2020 16:49
>> > 
>> > When debugfs file is opened, its module should not be removed until
>> > it's closed.
>> > Because debugfs internally uses the module's data.
>> > So, it could access freed memory.
>> > 
>> > In order to avoid panic, it just sets .owner to THIS_MODULE.
>> > So that all modules will be held when its debugfs file is opened.
>> 
>> Can't you fix it in common code?

Probably not: it's the call to ->release() that's faulting in the Oops
quoted in the cover letter and that one can't be protected by the
core debugfs code, unfortunately.

There's a comment in full_proxy_release(), which reads as

	/*
	 * We must not protect this against removal races here: the
	 * original releaser should be called unconditionally in order
	 * not to leak any resources. Releasers must not assume that
	 * ->i_private is still being meaningful here.
	 */

> Yeah I was just wondering that too - weren't the proxy_fops even already
> intended to fix this?

No, as far as file_operations are concerned, the proxy fops's intent was
only to ensure that the memory the file_operations' ->owner resides in
is still valid so that try_module_get() won't splat at file open
(c.f. [1]).

You're right that the default "full" proxy fops do prevent all
file_operations but ->release() from getting invoked on removed files,
but the motivation had not been to protect the file_operations
themselves, but accesses to any stale data associated with removed files
([2]).


> The modules _should_ be removing the debugfs files, and then the
> proxy_fops should kick in, no?

No, as said, not for ->release(). I haven't looked into the inidividual
patches here, but setting ->owner indeed sounds like the right thing to
do.

But you're right that modules should be removing any left debugfs files
at exit.

Thanks,

Nicolai


[1] 9fd4dcece43a ("debugfs: prevent access to possibly dead
                   file_operations at file open")
[2] 49d200deaa68 ("debugfs: prevent access to removed files' private data")

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 N?rnberg, Germany
(HRB 36809, AG N?rnberg), GF: Felix Imend?rffer

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net 000/117] net: avoid to remove module when its debugfs is being used
  2020-10-08 16:37       ` Taehee Yoo
@ 2020-10-09  5:38         ` Nicolai Stange
  -1 siblings, 0 replies; 38+ messages in thread
From: Nicolai Stange @ 2020-10-09  5:38 UTC (permalink / raw)
  To: Taehee Yoo
  Cc: Johannes Berg, David Laight, davem, kuba, netdev, Nicolai Stange,
	linux-wireless, wil6210, brcm80211-dev-list, b43-dev,
	linux-bluetooth

Taehee Yoo <ap420073@gmail.com> writes:

> On Fri, 9 Oct 2020 at 01:14, Johannes Berg <johannes@sipsolutions.net> wrote:
> On Thu, 2020-10-08 at 15:59 +0000, David Laight wrote:
>
>> From: Taehee Yoo
>> > Sent: 08 October 2020 16:49
>> >
>> > When debugfs file is opened, its module should not be removed until
>> > it's closed.
>> > Because debugfs internally uses the module's data.
>> > So, it could access freed memory.

Yes, the file_operations' ->release() to be more specific -- that's not
covered by debugfs' proxy fops.


>> > In order to avoid panic, it just sets .owner to THIS_MODULE.
>> > So that all modules will be held when its debugfs file is opened.
>>
>> Can't you fix it in common code?
>
>> Yeah I was just wondering that too - weren't the proxy_fops even already
>> intended to fix this?
>
> I didn't try to fix this issue in the common code(debugfs).
> Because I thought It's a typical pattern of panic and THIS_MODULE
> can fix it clearly.
> So I couldn't think there is a root reason in the common code.

That's correct, ->owner should get set properly, c.f. my other mail in
this thread.


Thanks,

Nicolai

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg), GF: Felix Imendörffer

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH net 000/117] net: avoid to remove module when its debugfs is being used
@ 2020-10-09  5:38         ` Nicolai Stange
  0 siblings, 0 replies; 38+ messages in thread
From: Nicolai Stange @ 2020-10-09  5:38 UTC (permalink / raw)
  To: Taehee Yoo
  Cc: Johannes Berg, David Laight, davem, kuba, netdev, Nicolai Stange,
	linux-wireless, wil6210, brcm80211-dev-list, b43-dev,
	linux-bluetooth

Taehee Yoo <ap420073@gmail.com> writes:

> On Fri, 9 Oct 2020 at 01:14, Johannes Berg <johannes@sipsolutions.net> wrote:
> On Thu, 2020-10-08 at 15:59 +0000, David Laight wrote:
>
>> From: Taehee Yoo
>> > Sent: 08 October 2020 16:49
>> >
>> > When debugfs file is opened, its module should not be removed until
>> > it's closed.
>> > Because debugfs internally uses the module's data.
>> > So, it could access freed memory.

Yes, the file_operations' ->release() to be more specific -- that's not
covered by debugfs' proxy fops.


>> > In order to avoid panic, it just sets .owner to THIS_MODULE.
>> > So that all modules will be held when its debugfs file is opened.
>>
>> Can't you fix it in common code?
>
>> Yeah I was just wondering that too - weren't the proxy_fops even already
>> intended to fix this?
>
> I didn't try to fix this issue in the common code(debugfs).
> Because I thought It's a typical pattern of panic and THIS_MODULE
> can fix it clearly.
> So I couldn't think there is a root reason in the common code.

That's correct, ->owner should get set properly, c.f. my other mail in
this thread.


Thanks,

Nicolai

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 N?rnberg, Germany
(HRB 36809, AG N?rnberg), GF: Felix Imend?rffer

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net 000/117] net: avoid to remove module when its debugfs is being used
  2020-10-09  5:09       ` Nicolai Stange
@ 2020-10-09  7:45         ` Johannes Berg
  -1 siblings, 0 replies; 38+ messages in thread
From: Johannes Berg @ 2020-10-09  7:45 UTC (permalink / raw)
  To: Nicolai Stange
  Cc: David Laight, 'Taehee Yoo',
	davem, kuba, netdev, linux-wireless, wil6210, brcm80211-dev-list,
	b43-dev, linux-bluetooth

On Fri, 2020-10-09 at 07:09 +0200, Nicolai Stange wrote:
> Johannes Berg <johannes@sipsolutions.net> writes:
> 
> > On Thu, 2020-10-08 at 15:59 +0000, David Laight wrote:
> > > From: Taehee Yoo
> > > > Sent: 08 October 2020 16:49
> > > > 
> > > > When debugfs file is opened, its module should not be removed until
> > > > it's closed.
> > > > Because debugfs internally uses the module's data.
> > > > So, it could access freed memory.
> > > > 
> > > > In order to avoid panic, it just sets .owner to THIS_MODULE.
> > > > So that all modules will be held when its debugfs file is opened.
> > > 
> > > Can't you fix it in common code?
> 
> Probably not: it's the call to ->release() that's faulting in the Oops
> quoted in the cover letter and that one can't be protected by the
> core debugfs code, unfortunately.
> 
> There's a comment in full_proxy_release(), which reads as
> 
> 	/*
> 	 * We must not protect this against removal races here: the
> 	 * original releaser should be called unconditionally in order
> 	 * not to leak any resources. Releasers must not assume that
> 	 * ->i_private is still being meaningful here.
> 	 */

Yeah, found that too now :-)

> > Yeah I was just wondering that too - weren't the proxy_fops even already
> > intended to fix this?
> 
> No, as far as file_operations are concerned, the proxy fops's intent was
> only to ensure that the memory the file_operations' ->owner resides in
> is still valid so that try_module_get() won't splat at file open
> (c.f. [1]).

Right.

> You're right that the default "full" proxy fops do prevent all
> file_operations but ->release() from getting invoked on removed files,
> but the motivation had not been to protect the file_operations
> themselves, but accesses to any stale data associated with removed files
> ([2]).

:)

I actually got this to work in a crazy way, I'll send something out but
I'm sure it's a better idea to add the .owner everywhere, but please
let's do it in fewer than hundreds of patches :-)

johannes


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH net 000/117] net: avoid to remove module when its debugfs is being used
@ 2020-10-09  7:45         ` Johannes Berg
  0 siblings, 0 replies; 38+ messages in thread
From: Johannes Berg @ 2020-10-09  7:45 UTC (permalink / raw)
  To: Nicolai Stange
  Cc: David Laight, 'Taehee Yoo',
	davem, kuba, netdev, linux-wireless, wil6210, brcm80211-dev-list,
	b43-dev, linux-bluetooth

On Fri, 2020-10-09 at 07:09 +0200, Nicolai Stange wrote:
> Johannes Berg <johannes@sipsolutions.net> writes:
> 
> > On Thu, 2020-10-08 at 15:59 +0000, David Laight wrote:
> > > From: Taehee Yoo
> > > > Sent: 08 October 2020 16:49
> > > > 
> > > > When debugfs file is opened, its module should not be removed until
> > > > it's closed.
> > > > Because debugfs internally uses the module's data.
> > > > So, it could access freed memory.
> > > > 
> > > > In order to avoid panic, it just sets .owner to THIS_MODULE.
> > > > So that all modules will be held when its debugfs file is opened.
> > > 
> > > Can't you fix it in common code?
> 
> Probably not: it's the call to ->release() that's faulting in the Oops
> quoted in the cover letter and that one can't be protected by the
> core debugfs code, unfortunately.
> 
> There's a comment in full_proxy_release(), which reads as
> 
> 	/*
> 	 * We must not protect this against removal races here: the
> 	 * original releaser should be called unconditionally in order
> 	 * not to leak any resources. Releasers must not assume that
> 	 * ->i_private is still being meaningful here.
> 	 */

Yeah, found that too now :-)

> > Yeah I was just wondering that too - weren't the proxy_fops even already
> > intended to fix this?
> 
> No, as far as file_operations are concerned, the proxy fops's intent was
> only to ensure that the memory the file_operations' ->owner resides in
> is still valid so that try_module_get() won't splat at file open
> (c.f. [1]).

Right.

> You're right that the default "full" proxy fops do prevent all
> file_operations but ->release() from getting invoked on removed files,
> but the motivation had not been to protect the file_operations
> themselves, but accesses to any stale data associated with removed files
> ([2]).

:)

I actually got this to work in a crazy way, I'll send something out but
I'm sure it's a better idea to add the .owner everywhere, but please
let's do it in fewer than hundreds of patches :-)

johannes

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [CRAZY-RFF] debugfs: track open files and release on remove
  2020-10-09  5:09       ` Nicolai Stange
  (?)
  (?)
@ 2020-10-09  7:53       ` Johannes Berg
  2020-10-09  8:03         ` Greg KH
  -1 siblings, 1 reply; 38+ messages in thread
From: Johannes Berg @ 2020-10-09  7:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: nstange, ap420073, David.Laight, netdev, linux-wireless, gregkh, rafael

[RFF = Request For Flaming]

THIS IS PROBABLY COMPLETELY CRAZY.

Currently, if a module is unloaded while debugfs files are being
kept open, things crash since the ->release() method is called
only at the actual release, despite the proxy_fops, and then the
code is no longer around.

The correct way to fix this is to annotate all the debugfs fops
with .owner= THIS_MODULE, but as a learning exercise and to see
how much hate I can possibly receive, I figured I'd try to work
around this in debugfs itself.

First, if the fops have a release method and no owner, we track
all the open files - currently using a very simple array data
structure for it linked into struct debugfs_fsdata. This required
changing the API of debugfs_file_get() and debugfs_file_put() to
pass the struct file * to it.

Then, once we know which files are still open at remove time, we
can call all their release functions, and avoid calling them from
the proxy_fops. Of course still call them from the proxy_fops if
the file is still around, and clean up, so the release isn't all
deferred to the end, but done as soon as possible.

This introduces a potential locking issue, we could have a fops
struct that takes a lock in its release that the same module is
also taking around debugfs_remove(). To flag these issues using
lockdep, take inode_lock_shared() in full_proxy_release(), see
the comment there. If this triggers for some fops, add the owner
as it should have been in the first place.

Over adding the owner everywhere this has two small advantages:
 1) you can unload the module while a debugfs file is open;
 2) no need to change hundreds of places in the kernel :-)

(No, I won't sign off on this unless somebody _really_ wants it.)
---
 drivers/base/regmap/regmap-debugfs.c |   8 +-
 drivers/infiniband/hw/hfi1/debugfs.c |  10 +-
 drivers/infiniband/hw/hfi1/fault.c   |   8 +-
 fs/debugfs/file.c                    | 168 +++++++++++++++++++++------
 fs/debugfs/inode.c                   |  11 ++
 fs/debugfs/internal.h                |   7 ++
 include/linux/debugfs.h              |   4 +-
 7 files changed, 165 insertions(+), 51 deletions(-)

diff --git a/drivers/base/regmap/regmap-debugfs.c b/drivers/base/regmap/regmap-debugfs.c
index b6d63ef16b44..1415fe9ba4c9 100644
--- a/drivers/base/regmap/regmap-debugfs.c
+++ b/drivers/base/regmap/regmap-debugfs.c
@@ -470,7 +470,7 @@ static ssize_t regmap_cache_only_write_file(struct file *file,
 	if (err)
 		return count;
 
-	err = debugfs_file_get(file->f_path.dentry);
+	err = debugfs_file_get(file);
 	if (err)
 		return err;
 
@@ -486,7 +486,7 @@ static ssize_t regmap_cache_only_write_file(struct file *file,
 	map->cache_only = new_val;
 
 	map->unlock(map->lock_arg);
-	debugfs_file_put(file->f_path.dentry);
+	debugfs_file_put(file);
 
 	if (require_sync) {
 		err = regcache_sync(map);
@@ -517,7 +517,7 @@ static ssize_t regmap_cache_bypass_write_file(struct file *file,
 	if (err)
 		return count;
 
-	err = debugfs_file_get(file->f_path.dentry);
+	err = debugfs_file_get(file);
 	if (err)
 		return err;
 
@@ -532,7 +532,7 @@ static ssize_t regmap_cache_bypass_write_file(struct file *file,
 	map->cache_bypass = new_val;
 
 	map->unlock(map->lock_arg);
-	debugfs_file_put(file->f_path.dentry);
+	debugfs_file_put(file);
 
 	return count;
 }
diff --git a/drivers/infiniband/hw/hfi1/debugfs.c b/drivers/infiniband/hw/hfi1/debugfs.c
index 2ced236e1553..81f38da1dee0 100644
--- a/drivers/infiniband/hw/hfi1/debugfs.c
+++ b/drivers/infiniband/hw/hfi1/debugfs.c
@@ -68,27 +68,25 @@ static struct dentry *hfi1_dbg_root;
 ssize_t hfi1_seq_read(struct file *file, char __user *buf, size_t size,
 		      loff_t *ppos)
 {
-	struct dentry *d = file->f_path.dentry;
 	ssize_t r;
 
-	r = debugfs_file_get(d);
+	r = debugfs_file_get(file);
 	if (unlikely(r))
 		return r;
 	r = seq_read(file, buf, size, ppos);
-	debugfs_file_put(d);
+	debugfs_file_put(file);
 	return r;
 }
 
 loff_t hfi1_seq_lseek(struct file *file, loff_t offset, int whence)
 {
-	struct dentry *d = file->f_path.dentry;
 	loff_t r;
 
-	r = debugfs_file_get(d);
+	r = debugfs_file_get(file);
 	if (unlikely(r))
 		return r;
 	r = seq_lseek(file, offset, whence);
-	debugfs_file_put(d);
+	debugfs_file_put(file);
 	return r;
 }
 
diff --git a/drivers/infiniband/hw/hfi1/fault.c b/drivers/infiniband/hw/hfi1/fault.c
index 0dfbcfb048ca..19b6e41458b6 100644
--- a/drivers/infiniband/hw/hfi1/fault.c
+++ b/drivers/infiniband/hw/hfi1/fault.c
@@ -146,7 +146,7 @@ static ssize_t fault_opcodes_write(struct file *file, const char __user *buf,
 		goto free_data;
 	}
 
-	ret = debugfs_file_get(file->f_path.dentry);
+	ret = debugfs_file_get(file);
 	if (unlikely(ret))
 		goto free_data;
 	ptr = data;
@@ -196,7 +196,7 @@ static ssize_t fault_opcodes_write(struct file *file, const char __user *buf,
 	}
 	ret = len;
 
-	debugfs_file_put(file->f_path.dentry);
+	debugfs_file_put(file);
 free_data:
 	kfree(data);
 	return ret;
@@ -215,7 +215,7 @@ static ssize_t fault_opcodes_read(struct file *file, char __user *buf,
 	data = kcalloc(datalen, sizeof(*data), GFP_KERNEL);
 	if (!data)
 		return -ENOMEM;
-	ret = debugfs_file_get(file->f_path.dentry);
+	ret = debugfs_file_get(file);
 	if (unlikely(ret))
 		goto free_data;
 	bit = find_first_bit(fault->opcodes, bitsize);
@@ -231,7 +231,7 @@ static ssize_t fault_opcodes_read(struct file *file, char __user *buf,
 					 bit);
 		bit = find_next_bit(fault->opcodes, bitsize, zero);
 	}
-	debugfs_file_put(file->f_path.dentry);
+	debugfs_file_put(file);
 	data[size - 1] = '\n';
 	data[size] = '\0';
 	ret = simple_read_from_buffer(buf, len, pos, data, size);
diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c
index a768a09430c3..fa758e774568 100644
--- a/fs/debugfs/file.c
+++ b/fs/debugfs/file.c
@@ -66,7 +66,7 @@ EXPORT_SYMBOL_GPL(debugfs_real_fops);
 
 /**
  * debugfs_file_get - mark the beginning of file data access
- * @dentry: the dentry object whose data is being accessed.
+ * @file: the file object whose data is being accessed.
  *
  * Up to a matching call to debugfs_file_put(), any successive call
  * into the file removing functions debugfs_remove() and
@@ -79,8 +79,9 @@ EXPORT_SYMBOL_GPL(debugfs_real_fops);
  * it is not safe to access any of its data. If, on the other hand,
  * it is allowed to access the file data, zero is returned.
  */
-int debugfs_file_get(struct dentry *dentry)
+int debugfs_file_get(struct file *file)
 {
+	struct dentry *dentry = F_DENTRY(file);
 	struct debugfs_fsdata *fsd;
 	void *d_fsd;
 
@@ -94,9 +95,25 @@ int debugfs_file_get(struct dentry *dentry)
 
 		fsd->real_fops = (void *)((unsigned long)d_fsd &
 					~DEBUGFS_FSDATA_IS_REAL_FOPS_BIT);
+
+		if (fsd->real_fops->release && !fsd->real_fops->owner) {
+			fsd->files = kzalloc(struct_size(fsd->files, files, 4),
+					     GFP_KERNEL);
+			if (!fsd->files) {
+				kfree(fsd);
+				return -ENOMEM;
+			}
+			fsd->files->n_alloc = 4;
+			fsd->files->n_used = 0;
+			mutex_init(&fsd->files_lock);
+		} else {
+			fsd->files = NULL;
+		}
+
 		refcount_set(&fsd->active_users, 1);
 		init_completion(&fsd->active_users_drained);
 		if (cmpxchg(&dentry->d_fsdata, d_fsd, fsd) != d_fsd) {
+			kfree(fsd->files);
 			kfree(fsd);
 			fsd = READ_ONCE(dentry->d_fsdata);
 		}
@@ -116,21 +133,60 @@ int debugfs_file_get(struct dentry *dentry)
 	if (!refcount_inc_not_zero(&fsd->active_users))
 		return -EIO;
 
+	if (fsd->files) {
+		struct debugfs_files_release *files;
+		bool found = false;
+		unsigned int i;
+
+		mutex_lock(&fsd->files_lock);
+		/* re-read under mutex */
+		files = READ_ONCE(fsd->files);
+		for (i = 0; i < files->n_used; i++) {
+			if (files->files[i] == file) {
+				found = true;
+				break;
+			}
+		}
+
+		if (!found) {
+			if (files->n_used >= files->n_alloc) {
+				struct debugfs_files_release *new;
+
+				files->n_alloc += 4;
+				new = krealloc(files,
+					       struct_size(new, files,
+							   files->n_alloc),
+					       GFP_KERNEL);
+				if (!new) {
+					files->n_alloc -= 4;
+					mutex_unlock(&fsd->files_lock);
+					refcount_dec(&fsd->active_users);
+					return -ENOMEM;
+				}
+				files = new;
+				fsd->files = files;
+			}
+			files->files[files->n_used++] = file;
+		}
+		mutex_unlock(&fsd->files_lock);
+	}
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(debugfs_file_get);
 
 /**
  * debugfs_file_put - mark the end of file data access
- * @dentry: the dentry object formerly passed to
- *          debugfs_file_get().
+ * @file: the file object formerly passed to
+ *        debugfs_file_get().
  *
  * Allow any ongoing concurrent call into debugfs_remove() or
  * debugfs_remove_recursive() blocked by a former call to
  * debugfs_file_get() to proceed and return to its caller.
  */
-void debugfs_file_put(struct dentry *dentry)
+void debugfs_file_put(struct file *file)
 {
+	struct dentry *dentry = F_DENTRY(file);
 	struct debugfs_fsdata *fsd = READ_ONCE(dentry->d_fsdata);
 
 	if (refcount_dec_and_test(&fsd->active_users))
@@ -166,7 +222,7 @@ static int open_proxy_open(struct inode *inode, struct file *filp)
 	const struct file_operations *real_fops = NULL;
 	int r;
 
-	r = debugfs_file_get(dentry);
+	r = debugfs_file_get(filp);
 	if (r)
 		return r == -EIO ? -ENOENT : r;
 
@@ -195,7 +251,7 @@ static int open_proxy_open(struct inode *inode, struct file *filp)
 		r = real_fops->open(inode, filp);
 
 out:
-	debugfs_file_put(dentry);
+	debugfs_file_put(filp);
 	return r;
 }
 
@@ -209,16 +265,15 @@ const struct file_operations debugfs_open_proxy_file_operations = {
 #define FULL_PROXY_FUNC(name, ret_type, filp, proto, args)		\
 static ret_type full_proxy_ ## name(proto)				\
 {									\
-	struct dentry *dentry = F_DENTRY(filp);			\
 	const struct file_operations *real_fops;			\
 	ret_type r;							\
 									\
-	r = debugfs_file_get(dentry);					\
+	r = debugfs_file_get(filp);					\
 	if (unlikely(r))						\
 		return r;						\
 	real_fops = debugfs_real_fops(filp);				\
 	r = real_fops->name(args);					\
-	debugfs_file_put(dentry);					\
+	debugfs_file_put(filp);						\
 	return r;							\
 }
 
@@ -243,16 +298,15 @@ FULL_PROXY_FUNC(unlocked_ioctl, long, filp,
 static __poll_t full_proxy_poll(struct file *filp,
 				struct poll_table_struct *wait)
 {
-	struct dentry *dentry = F_DENTRY(filp);
 	__poll_t r = 0;
 	const struct file_operations *real_fops;
 
-	if (debugfs_file_get(dentry))
+	if (debugfs_file_get(filp))
 		return EPOLLHUP;
 
 	real_fops = debugfs_real_fops(filp);
 	r = real_fops->poll(filp, wait);
-	debugfs_file_put(dentry);
+	debugfs_file_put(filp);
 	return r;
 }
 
@@ -261,16 +315,65 @@ static int full_proxy_release(struct inode *inode, struct file *filp)
 	const struct dentry *dentry = F_DENTRY(filp);
 	const struct file_operations *real_fops = debugfs_real_fops(filp);
 	const struct file_operations *proxy_fops = filp->f_op;
+	struct debugfs_fsdata *fsd = dentry->d_fsdata;
 	int r = 0;
 
 	/*
-	 * We must not protect this against removal races here: the
-	 * original releaser should be called unconditionally in order
-	 * not to leak any resources. Releasers must not assume that
-	 * ->i_private is still being meaningful here.
+	 * If debugfs_file_get() fails here, the original releaser
+	 * was already called at debugfs_remove() time, so don't do
+	 * it again. This allows even removing a module while files
+	 * are open.
 	 */
-	if (real_fops->release)
+	if (!fsd->files) {
+		if (real_fops->release)
+			r = real_fops->release(inode, filp);
+	} else if (debugfs_file_get(filp) == 0) {
+		struct debugfs_files_release *files;
+
+		unsigned int i, old_n_used;
+
+		/*
+		 * XXX This is a hack - is this even allowed? XXX
+		 *
+		 * Take the inode lock here because we also hold it
+		 * during the other possible call to ->release during
+		 * debugfs_remove() over in __debugfs_file_removed().
+		 *
+		 * Taking it here also means that lockdep will *always*
+		 * record a chain from the inode lock to any locks the
+		 * debugfs user might take during release, and then if
+		 * they also hold the same locks during debugfs_remove()
+		 * it will complain.
+		 *
+		 * When it complains, we can then fix it by adding
+		 *
+		 *	.owner = THIS_MODULE
+		 *
+		 * to the relevant fops, in which case we won't get to
+		 * this path here, but to the above code block that's
+		 * only calling it directly.
+		 */
+		inode_lock_shared(inode);
 		r = real_fops->release(inode, filp);
+		inode_unlock_shared(inode);
+
+		mutex_lock(&fsd->files_lock);
+		/* re-read under mutex */
+		files = READ_ONCE(fsd->files);
+		old_n_used = files->n_used;
+		for (i = 0; i < files->n_used; i++) {
+			if (files->files[i] == filp) {
+				files->n_used--;
+				files->files[i] = files->files[files->n_used];
+				files->files[files->n_used] = NULL;
+				break;
+			}
+		}
+		WARN_ON(i >= old_n_used);
+		mutex_unlock(&fsd->files_lock);
+
+		debugfs_file_put(filp);
+	}
 
 	replace_fops(filp, d_inode(dentry)->i_fop);
 	kfree(proxy_fops);
@@ -301,7 +404,7 @@ static int full_proxy_open(struct inode *inode, struct file *filp)
 	struct file_operations *proxy_fops = NULL;
 	int r;
 
-	r = debugfs_file_get(dentry);
+	r = debugfs_file_get(filp);
 	if (r)
 		return r == -EIO ? -ENOENT : r;
 
@@ -351,7 +454,7 @@ static int full_proxy_open(struct inode *inode, struct file *filp)
 	kfree(proxy_fops);
 	fops_put(real_fops);
 out:
-	debugfs_file_put(dentry);
+	debugfs_file_put(filp);
 	return r;
 }
 
@@ -362,14 +465,13 @@ const struct file_operations debugfs_full_proxy_file_operations = {
 ssize_t debugfs_attr_read(struct file *file, char __user *buf,
 			size_t len, loff_t *ppos)
 {
-	struct dentry *dentry = F_DENTRY(file);
 	ssize_t ret;
 
-	ret = debugfs_file_get(dentry);
+	ret = debugfs_file_get(file);
 	if (unlikely(ret))
 		return ret;
 	ret = simple_attr_read(file, buf, len, ppos);
-	debugfs_file_put(dentry);
+	debugfs_file_put(file);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(debugfs_attr_read);
@@ -377,14 +479,13 @@ EXPORT_SYMBOL_GPL(debugfs_attr_read);
 ssize_t debugfs_attr_write(struct file *file, const char __user *buf,
 			 size_t len, loff_t *ppos)
 {
-	struct dentry *dentry = F_DENTRY(file);
 	ssize_t ret;
 
-	ret = debugfs_file_get(dentry);
+	ret = debugfs_file_get(file);
 	if (unlikely(ret))
 		return ret;
 	ret = simple_attr_write(file, buf, len, ppos);
-	debugfs_file_put(dentry);
+	debugfs_file_put(file);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(debugfs_attr_write);
@@ -776,13 +877,12 @@ ssize_t debugfs_read_file_bool(struct file *file, char __user *user_buf,
 	char buf[3];
 	bool val;
 	int r;
-	struct dentry *dentry = F_DENTRY(file);
 
-	r = debugfs_file_get(dentry);
+	r = debugfs_file_get(file);
 	if (unlikely(r))
 		return r;
 	val = *(bool *)file->private_data;
-	debugfs_file_put(dentry);
+	debugfs_file_put(file);
 
 	if (val)
 		buf[0] = 'Y';
@@ -800,15 +900,14 @@ ssize_t debugfs_write_file_bool(struct file *file, const char __user *user_buf,
 	bool bv;
 	int r;
 	bool *val = file->private_data;
-	struct dentry *dentry = F_DENTRY(file);
 
 	r = kstrtobool_from_user(user_buf, count, &bv);
 	if (!r) {
-		r = debugfs_file_get(dentry);
+		r = debugfs_file_get(file);
 		if (unlikely(r))
 			return r;
 		*val = bv;
-		debugfs_file_put(dentry);
+		debugfs_file_put(file);
 	}
 
 	return count;
@@ -869,15 +968,14 @@ static ssize_t read_file_blob(struct file *file, char __user *user_buf,
 			      size_t count, loff_t *ppos)
 {
 	struct debugfs_blob_wrapper *blob = file->private_data;
-	struct dentry *dentry = F_DENTRY(file);
 	ssize_t r;
 
-	r = debugfs_file_get(dentry);
+	r = debugfs_file_get(file);
 	if (unlikely(r))
 		return r;
 	r = simple_read_from_buffer(user_buf, count, ppos, blob->data,
 				blob->size);
-	debugfs_file_put(dentry);
+	debugfs_file_put(file);
 	return r;
 }
 
diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index 2fcf66473436..f34202c1d7ee 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -694,6 +694,17 @@ static void __debugfs_file_removed(struct dentry *dentry)
 		return;
 	if (!refcount_dec_and_test(&fsd->active_users))
 		wait_for_completion(&fsd->active_users_drained);
+
+	if (fsd->files) {
+		unsigned int i;
+
+		for (i = 0; i < fsd->files->n_used; i++)
+			fsd->real_fops->release(d_inode(dentry),
+						fsd->files->files[i]);
+
+		kfree(fsd->files);
+		fsd->files = NULL;
+	}
 }
 
 static void remove_one(struct dentry *victim)
diff --git a/fs/debugfs/internal.h b/fs/debugfs/internal.h
index 92af8ae31313..d898ef8108e6 100644
--- a/fs/debugfs/internal.h
+++ b/fs/debugfs/internal.h
@@ -15,10 +15,17 @@ extern const struct file_operations debugfs_noop_file_operations;
 extern const struct file_operations debugfs_open_proxy_file_operations;
 extern const struct file_operations debugfs_full_proxy_file_operations;
 
+struct debugfs_files_release {
+	unsigned int n_used, n_alloc;
+	struct file *files[];
+};
+
 struct debugfs_fsdata {
 	const struct file_operations *real_fops;
 	refcount_t active_users;
 	struct completion active_users_drained;
+	struct debugfs_files_release *files;
+	struct mutex files_lock;
 };
 
 /*
diff --git a/include/linux/debugfs.h b/include/linux/debugfs.h
index 851dd1f9a8a5..ef32f7be19bc 100644
--- a/include/linux/debugfs.h
+++ b/include/linux/debugfs.h
@@ -93,8 +93,8 @@ void debugfs_remove(struct dentry *dentry);
 
 const struct file_operations *debugfs_real_fops(const struct file *filp);
 
-int debugfs_file_get(struct dentry *dentry);
-void debugfs_file_put(struct dentry *dentry);
+int debugfs_file_get(struct file *file);
+void debugfs_file_put(struct file *file);
 
 ssize_t debugfs_attr_read(struct file *file, char __user *buf,
 			size_t len, loff_t *ppos);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [CRAZY-RFF] debugfs: track open files and release on remove
  2020-10-09  7:53       ` [CRAZY-RFF] debugfs: track open files and release on remove Johannes Berg
@ 2020-10-09  8:03         ` Greg KH
  2020-10-09  8:06           ` Johannes Berg
  0 siblings, 1 reply; 38+ messages in thread
From: Greg KH @ 2020-10-09  8:03 UTC (permalink / raw)
  To: Johannes Berg
  Cc: linux-kernel, nstange, ap420073, David.Laight, netdev,
	linux-wireless, rafael

On Fri, Oct 09, 2020 at 09:53:06AM +0200, Johannes Berg wrote:
> [RFF = Request For Flaming]
> 
> THIS IS PROBABLY COMPLETELY CRAZY.
> 
> Currently, if a module is unloaded while debugfs files are being
> kept open, things crash since the ->release() method is called
> only at the actual release, despite the proxy_fops, and then the
> code is no longer around.
> 
> The correct way to fix this is to annotate all the debugfs fops
> with .owner= THIS_MODULE, but as a learning exercise and to see
> how much hate I can possibly receive, I figured I'd try to work
> around this in debugfs itself.

For lots of debugfs files, .owner should already be set, if you use the
DEFINE_SIMPLE_ATTRIBUTE() or DEFINE_DEBUGFS_ATTRIBUTE() macros.

But yes, not all.

> First, if the fops have a release method and no owner, we track
> all the open files - currently using a very simple array data
> structure for it linked into struct debugfs_fsdata. This required
> changing the API of debugfs_file_get() and debugfs_file_put() to
> pass the struct file * to it.
> 
> Then, once we know which files are still open at remove time, we
> can call all their release functions, and avoid calling them from
> the proxy_fops. Of course still call them from the proxy_fops if
> the file is still around, and clean up, so the release isn't all
> deferred to the end, but done as soon as possible.
> 
> This introduces a potential locking issue, we could have a fops
> struct that takes a lock in its release that the same module is
> also taking around debugfs_remove(). To flag these issues using
> lockdep, take inode_lock_shared() in full_proxy_release(), see
> the comment there. If this triggers for some fops, add the owner
> as it should have been in the first place.
> 
> Over adding the owner everywhere this has two small advantages:
>  1) you can unload the module while a debugfs file is open;
>  2) no need to change hundreds of places in the kernel :-)

I thought the proxy-ops stuff was supposed to fix this issue already.
Why isn't it, what is broken in them that causes this to still crash?

And of course, removing kernel modules is never a guaranteed operation,
nor is it anything that ever happens automatically, so is this really an
issue?  :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [CRAZY-RFF] debugfs: track open files and release on remove
  2020-10-09  8:03         ` Greg KH
@ 2020-10-09  8:06           ` Johannes Berg
  2020-10-09  8:16             ` Greg KH
  0 siblings, 1 reply; 38+ messages in thread
From: Johannes Berg @ 2020-10-09  8:06 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-kernel, nstange, ap420073, David.Laight, netdev,
	linux-wireless, rafael

On Fri, 2020-10-09 at 10:03 +0200, Greg KH wrote:

> For lots of debugfs files, .owner should already be set, if you use the
> DEFINE_SIMPLE_ATTRIBUTE() or DEFINE_DEBUGFS_ATTRIBUTE() macros.
> 
> But yes, not all.

Right.

You didn't see the original thread:

https://lore.kernel.org/netdev/20201008155048.17679-1-ap420073@gmail.com/

> I thought the proxy-ops stuff was supposed to fix this issue already.
> Why isn't it, what is broken in them that causes this to still crash?

Well exactly what I described - the proxy_fops *release* doesn't get
proxied, since we don't have any knowledge of the open files (without
this patch) when the proxy_fops are redirected to nothing when a file is
removed.

Nicolai also discussed it a bit here:

https://lore.kernel.org/netdev/87v9fkgf4i.fsf@suse.de/

> And of course, removing kernel modules is never a guaranteed operation,
> nor is it anything that ever happens automatically, so is this really an
> issue?  :)

:)

We used to say the proxy_fops weren't needed and it wasn't an issue, and
then still implemented it. Dunno. I'm not really too concerned about it
myself, only root can hold the files open and remove modules ...

johannes


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [CRAZY-RFF] debugfs: track open files and release on remove
  2020-10-09  8:06           ` Johannes Berg
@ 2020-10-09  8:16             ` Greg KH
  2020-10-09  8:19               ` Johannes Berg
  0 siblings, 1 reply; 38+ messages in thread
From: Greg KH @ 2020-10-09  8:16 UTC (permalink / raw)
  To: Johannes Berg
  Cc: linux-kernel, nstange, ap420073, David.Laight, netdev,
	linux-wireless, rafael

On Fri, Oct 09, 2020 at 10:06:14AM +0200, Johannes Berg wrote:
> We used to say the proxy_fops weren't needed and it wasn't an issue, and
> then still implemented it. Dunno. I'm not really too concerned about it
> myself, only root can hold the files open and remove modules ...

proxy_fops were needed because devices can be removed from the system at
any time, causing their debugfs files to want to also be removed.  It
wasn't because of unloading kernel code.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [CRAZY-RFF] debugfs: track open files and release on remove
  2020-10-09  8:16             ` Greg KH
@ 2020-10-09  8:19               ` Johannes Berg
  2020-10-09  8:34                 ` David Laight
  2020-10-09  8:47                 ` Greg KH
  0 siblings, 2 replies; 38+ messages in thread
From: Johannes Berg @ 2020-10-09  8:19 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-kernel, nstange, ap420073, David.Laight, netdev,
	linux-wireless, rafael

On Fri, 2020-10-09 at 10:16 +0200, Greg KH wrote:
> On Fri, Oct 09, 2020 at 10:06:14AM +0200, Johannes Berg wrote:
> > We used to say the proxy_fops weren't needed and it wasn't an issue, and
> > then still implemented it. Dunno. I'm not really too concerned about it
> > myself, only root can hold the files open and remove modules ...
> 
> proxy_fops were needed because devices can be removed from the system at
> any time, causing their debugfs files to want to also be removed.  It
> wasn't because of unloading kernel code.

Indeed, that's true. Still, we lived with it for years.

Anyway, like I said, I really just did this more to see that it _could_
be done, not to suggest that it _should_ :-)

I think adding the .owner everywhere would be good, and perhaps we can
somehow put a check somewhere like

	WARN_ON(is_module_address((unsigned long)fops) && !fops->owner);

to prevent the issue in the future?

johannes


^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [CRAZY-RFF] debugfs: track open files and release on remove
  2020-10-09  8:19               ` Johannes Berg
@ 2020-10-09  8:34                 ` David Laight
  2020-10-09  8:44                   ` Johannes Berg
  2020-10-09  8:47                 ` Greg KH
  1 sibling, 1 reply; 38+ messages in thread
From: David Laight @ 2020-10-09  8:34 UTC (permalink / raw)
  To: 'Johannes Berg', Greg KH
  Cc: linux-kernel, nstange, ap420073, netdev, linux-wireless, rafael

From: Johannes Berg
> Sent: 09 October 2020 09:19
> 
> On Fri, 2020-10-09 at 10:16 +0200, Greg KH wrote:
> > On Fri, Oct 09, 2020 at 10:06:14AM +0200, Johannes Berg wrote:
> > > We used to say the proxy_fops weren't needed and it wasn't an issue, and
> > > then still implemented it. Dunno. I'm not really too concerned about it
> > > myself, only root can hold the files open and remove modules ...
> >
> > proxy_fops were needed because devices can be removed from the system at
> > any time, causing their debugfs files to want to also be removed.  It
> > wasn't because of unloading kernel code.
> 
> Indeed, that's true. Still, we lived with it for years.
> 
> Anyway, like I said, I really just did this more to see that it _could_
> be done, not to suggest that it _should_ :-)
> 
> I think adding the .owner everywhere would be good, and perhaps we can
> somehow put a check somewhere like
> 
> 	WARN_ON(is_module_address((unsigned long)fops) && !fops->owner);
> 
> to prevent the issue in the future?

Does it ever make any sense to set .owner to anything other than
THIS_MODULE?

If not the code that saves the 'struct file_operations' address
ought to be able to save the associated module.

I was also wondering if this affects normal opens?
They should hold a reference on the module to stop it being unloaded.
Does that rely on .owner being set?

For debugfs surely it is possible to determine and save THIS_MODULE
when he nodes are registers and do a try_module_get() in the open?

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [CRAZY-RFF] debugfs: track open files and release on remove
  2020-10-09  8:34                 ` David Laight
@ 2020-10-09  8:44                   ` Johannes Berg
  2020-10-09  9:00                     ` David Laight
  0 siblings, 1 reply; 38+ messages in thread
From: Johannes Berg @ 2020-10-09  8:44 UTC (permalink / raw)
  To: David Laight, Greg KH
  Cc: linux-kernel, nstange, ap420073, netdev, linux-wireless, rafael

On Fri, 2020-10-09 at 08:34 +0000, David Laight wrote:
> 
> > I think adding the .owner everywhere would be good, and perhaps we can
> > somehow put a check somewhere like
> > 
> > 	WARN_ON(is_module_address((unsigned long)fops) && !fops->owner);
> > 
> > to prevent the issue in the future?
> 
> Does it ever make any sense to set .owner to anything other than
> THIS_MODULE?

No. But I believe THIS_MODULE is NULL for built-in code, so we can't
just WARN_ON(!fops->owner).

> If not the code that saves the 'struct file_operations' address
> ought to be able to save the associated module.

No, it's const.

> I was also wondering if this affects normal opens?
> They should hold a reference on the module to stop it being unloaded.
> Does that rely on .owner being set?

Yes.

> For debugfs surely it is possible to determine and save THIS_MODULE
> when he nodes are registers and do a try_module_get() in the open?

I don't really see where to save it?

johannes


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [CRAZY-RFF] debugfs: track open files and release on remove
  2020-10-09  8:19               ` Johannes Berg
  2020-10-09  8:34                 ` David Laight
@ 2020-10-09  8:47                 ` Greg KH
  2020-10-09  8:48                   ` Johannes Berg
  1 sibling, 1 reply; 38+ messages in thread
From: Greg KH @ 2020-10-09  8:47 UTC (permalink / raw)
  To: Johannes Berg
  Cc: linux-kernel, nstange, ap420073, David.Laight, netdev,
	linux-wireless, rafael

On Fri, Oct 09, 2020 at 10:19:02AM +0200, Johannes Berg wrote:
> On Fri, 2020-10-09 at 10:16 +0200, Greg KH wrote:
> > On Fri, Oct 09, 2020 at 10:06:14AM +0200, Johannes Berg wrote:
> > > We used to say the proxy_fops weren't needed and it wasn't an issue, and
> > > then still implemented it. Dunno. I'm not really too concerned about it
> > > myself, only root can hold the files open and remove modules ...
> > 
> > proxy_fops were needed because devices can be removed from the system at
> > any time, causing their debugfs files to want to also be removed.  It
> > wasn't because of unloading kernel code.
> 
> Indeed, that's true. Still, we lived with it for years.

Because no one wanted to fix the code, not because it was correct :)

> Anyway, like I said, I really just did this more to see that it _could_
> be done, not to suggest that it _should_ :-)

Agreed.

> I think adding the .owner everywhere would be good, and perhaps we can
> somehow put a check somewhere like
> 
> 	WARN_ON(is_module_address((unsigned long)fops) && !fops->owner);
> 
> to prevent the issue in the future?

That will fail for all of the debugfs_create_* operations, as there is
only one set of file operations for all of the different files created
with these calls.

Which, now that I remember it, is why we went down the proxy "solution"
in the first place :(

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [CRAZY-RFF] debugfs: track open files and release on remove
  2020-10-09  8:47                 ` Greg KH
@ 2020-10-09  8:48                   ` Johannes Berg
  2020-10-10  9:38                     ` Greg KH
  0 siblings, 1 reply; 38+ messages in thread
From: Johannes Berg @ 2020-10-09  8:48 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-kernel, nstange, ap420073, David.Laight, netdev,
	linux-wireless, rafael

On Fri, 2020-10-09 at 10:47 +0200, Greg KH wrote:

> > I think adding the .owner everywhere would be good, and perhaps we can
> > somehow put a check somewhere like
> > 
> > 	WARN_ON(is_module_address((unsigned long)fops) && !fops->owner);
> > 
> > to prevent the issue in the future?
> 
> That will fail for all of the debugfs_create_* operations, as there is
> only one set of file operations for all of the different files created
> with these calls.

Why would it fail? Those have their fops in the core debugfs code, which
might have a .owner assigned but is probably built-in anyway?

> Which, now that I remember it, is why we went down the proxy "solution"
> in the first place :(

Not sure I understand. That was related more to (arbitrary) files having
to be disappeared rather than anything else?

johannes


^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [CRAZY-RFF] debugfs: track open files and release on remove
  2020-10-09  8:44                   ` Johannes Berg
@ 2020-10-09  9:00                     ` David Laight
  0 siblings, 0 replies; 38+ messages in thread
From: David Laight @ 2020-10-09  9:00 UTC (permalink / raw)
  To: 'Johannes Berg', Greg KH
  Cc: linux-kernel, nstange, ap420073, netdev, linux-wireless, rafael

From: Johannes Berg
> Sent: 09 October 2020 09:45
> 
> On Fri, 2020-10-09 at 08:34 +0000, David Laight wrote:
> >
...
> > Does it ever make any sense to set .owner to anything other than
> > THIS_MODULE?
> 
> No. But I believe THIS_MODULE is NULL for built-in code, so we can't
> just WARN_ON(!fops->owner).
...
> > I was also wondering if this affects normal opens?
> > They should hold a reference on the module to stop it being unloaded.
> > Does that rely on .owner being set?
> 
> Yes.

Sound like the module load code should be verifying it then.

Looking at one of my modules (which does set .owner).
Perhaps cdev_init() could be a #define that picks up THIS_MODULE.
This could then be checked against the one in fops or saved
in the 'struct cdev'.

I presume debugfs (which I've not used) has some similar calls.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net 000/117] net: avoid to remove module when its debugfs is being used
  2020-10-09  5:38         ` Nicolai Stange
@ 2020-10-09 10:07           ` Taehee Yoo
  -1 siblings, 0 replies; 38+ messages in thread
From: Taehee Yoo @ 2020-10-09 10:07 UTC (permalink / raw)
  To: Nicolai Stange
  Cc: Johannes Berg, David Laight, davem, kuba, netdev, linux-wireless,
	wil6210, b43-dev, linux-bluetooth

On Fri, 9 Oct 2020 at 14:39, Nicolai Stange <nstange@suse.de> wrote:
>

Hi Nicolai,
Thank you for the review!

> Taehee Yoo <ap420073@gmail.com> writes:
>
> > On Fri, 9 Oct 2020 at 01:14, Johannes Berg <johannes@sipsolutions.net> wrote:
> > On Thu, 2020-10-08 at 15:59 +0000, David Laight wrote:
> >
> >> From: Taehee Yoo
> >> > Sent: 08 October 2020 16:49
> >> >
> >> > When debugfs file is opened, its module should not be removed until
> >> > it's closed.
> >> > Because debugfs internally uses the module's data.
> >> > So, it could access freed memory.
>
> Yes, the file_operations' ->release() to be more specific -- that's not
> covered by debugfs' proxy fops.
>
>
> >> > In order to avoid panic, it just sets .owner to THIS_MODULE.
> >> > So that all modules will be held when its debugfs file is opened.
> >>
> >> Can't you fix it in common code?
> >
> >> Yeah I was just wondering that too - weren't the proxy_fops even already
> >> intended to fix this?
> >
> > I didn't try to fix this issue in the common code(debugfs).
> > Because I thought It's a typical pattern of panic and THIS_MODULE
> > can fix it clearly.
> > So I couldn't think there is a root reason in the common code.
>
> That's correct, ->owner should get set properly, c.f. my other mail in
> this thread.
>

Thanks a lot for verifying it!
Taehee

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH net 000/117] net: avoid to remove module when its debugfs is being used
@ 2020-10-09 10:07           ` Taehee Yoo
  0 siblings, 0 replies; 38+ messages in thread
From: Taehee Yoo @ 2020-10-09 10:07 UTC (permalink / raw)
  To: Nicolai Stange
  Cc: Johannes Berg, David Laight, davem, kuba, netdev, linux-wireless,
	wil6210, b43-dev, linux-bluetooth

On Fri, 9 Oct 2020 at 14:39, Nicolai Stange <nstange@suse.de> wrote:
>

Hi Nicolai,
Thank you for the review!

> Taehee Yoo <ap420073@gmail.com> writes:
>
> > On Fri, 9 Oct 2020 at 01:14, Johannes Berg <johannes@sipsolutions.net> wrote:
> > On Thu, 2020-10-08 at 15:59 +0000, David Laight wrote:
> >
> >> From: Taehee Yoo
> >> > Sent: 08 October 2020 16:49
> >> >
> >> > When debugfs file is opened, its module should not be removed until
> >> > it's closed.
> >> > Because debugfs internally uses the module's data.
> >> > So, it could access freed memory.
>
> Yes, the file_operations' ->release() to be more specific -- that's not
> covered by debugfs' proxy fops.
>
>
> >> > In order to avoid panic, it just sets .owner to THIS_MODULE.
> >> > So that all modules will be held when its debugfs file is opened.
> >>
> >> Can't you fix it in common code?
> >
> >> Yeah I was just wondering that too - weren't the proxy_fops even already
> >> intended to fix this?
> >
> > I didn't try to fix this issue in the common code(debugfs).
> > Because I thought It's a typical pattern of panic and THIS_MODULE
> > can fix it clearly.
> > So I couldn't think there is a root reason in the common code.
>
> That's correct, ->owner should get set properly, c.f. my other mail in
> this thread.
>

Thanks a lot for verifying it!
Taehee

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net 000/117] net: avoid to remove module when its debugfs is being used
  2020-10-09  7:45         ` Johannes Berg
@ 2020-10-09 10:15           ` Taehee Yoo
  -1 siblings, 0 replies; 38+ messages in thread
From: Taehee Yoo @ 2020-10-09 10:15 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Nicolai Stange, David Laight, davem, kuba, netdev,
	linux-wireless, wil6210, brcm80211-dev-list, b43-dev,
	linux-bluetooth

On Fri, 9 Oct 2020 at 16:45, Johannes Berg <johannes@sipsolutions.net> wrote:
>

Hi Johannes,
Thank you for the review!

> On Fri, 2020-10-09 at 07:09 +0200, Nicolai Stange wrote:
> > Johannes Berg <johannes@sipsolutions.net> writes:
> >
> > > On Thu, 2020-10-08 at 15:59 +0000, David Laight wrote:
> > > > From: Taehee Yoo
> > > > > Sent: 08 October 2020 16:49
> > > > >
> > > > > When debugfs file is opened, its module should not be removed until
> > > > > it's closed.
> > > > > Because debugfs internally uses the module's data.
> > > > > So, it could access freed memory.
> > > > >
> > > > > In order to avoid panic, it just sets .owner to THIS_MODULE.
> > > > > So that all modules will be held when its debugfs file is opened.
> > > >
> > > > Can't you fix it in common code?
> >
> > Probably not: it's the call to ->release() that's faulting in the Oops
> > quoted in the cover letter and that one can't be protected by the
> > core debugfs code, unfortunately.
> >
> > There's a comment in full_proxy_release(), which reads as
> >
> >       /*
> >        * We must not protect this against removal races here: the
> >        * original releaser should be called unconditionally in order
> >        * not to leak any resources. Releasers must not assume that
> >        * ->i_private is still being meaningful here.
> >        */
>
> Yeah, found that too now :-)
>
> > > Yeah I was just wondering that too - weren't the proxy_fops even already
> > > intended to fix this?
> >
> > No, as far as file_operations are concerned, the proxy fops's intent was
> > only to ensure that the memory the file_operations' ->owner resides in
> > is still valid so that try_module_get() won't splat at file open
> > (c.f. [1]).
>
> Right.
>
> > You're right that the default "full" proxy fops do prevent all
> > file_operations but ->release() from getting invoked on removed files,
> > but the motivation had not been to protect the file_operations
> > themselves, but accesses to any stale data associated with removed files
> > ([2]).
>
> :)
>
> I actually got this to work in a crazy way, I'll send something out but
> I'm sure it's a better idea to add the .owner everywhere, but please
> let's do it in fewer than hundreds of patches :-)
>
Okay, as you mentioned earlier in 001/117 patch thread,
I will squash patches into per-driver/subsystem then send them as v2.

Thanks a lot!
Taehee

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH net 000/117] net: avoid to remove module when its debugfs is being used
@ 2020-10-09 10:15           ` Taehee Yoo
  0 siblings, 0 replies; 38+ messages in thread
From: Taehee Yoo @ 2020-10-09 10:15 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Nicolai Stange, David Laight, davem, kuba, netdev,
	linux-wireless, wil6210, brcm80211-dev-list, b43-dev,
	linux-bluetooth

On Fri, 9 Oct 2020 at 16:45, Johannes Berg <johannes@sipsolutions.net> wrote:
>

Hi Johannes,
Thank you for the review!

> On Fri, 2020-10-09 at 07:09 +0200, Nicolai Stange wrote:
> > Johannes Berg <johannes@sipsolutions.net> writes:
> >
> > > On Thu, 2020-10-08 at 15:59 +0000, David Laight wrote:
> > > > From: Taehee Yoo
> > > > > Sent: 08 October 2020 16:49
> > > > >
> > > > > When debugfs file is opened, its module should not be removed until
> > > > > it's closed.
> > > > > Because debugfs internally uses the module's data.
> > > > > So, it could access freed memory.
> > > > >
> > > > > In order to avoid panic, it just sets .owner to THIS_MODULE.
> > > > > So that all modules will be held when its debugfs file is opened.
> > > >
> > > > Can't you fix it in common code?
> >
> > Probably not: it's the call to ->release() that's faulting in the Oops
> > quoted in the cover letter and that one can't be protected by the
> > core debugfs code, unfortunately.
> >
> > There's a comment in full_proxy_release(), which reads as
> >
> >       /*
> >        * We must not protect this against removal races here: the
> >        * original releaser should be called unconditionally in order
> >        * not to leak any resources. Releasers must not assume that
> >        * ->i_private is still being meaningful here.
> >        */
>
> Yeah, found that too now :-)
>
> > > Yeah I was just wondering that too - weren't the proxy_fops even already
> > > intended to fix this?
> >
> > No, as far as file_operations are concerned, the proxy fops's intent was
> > only to ensure that the memory the file_operations' ->owner resides in
> > is still valid so that try_module_get() won't splat at file open
> > (c.f. [1]).
>
> Right.
>
> > You're right that the default "full" proxy fops do prevent all
> > file_operations but ->release() from getting invoked on removed files,
> > but the motivation had not been to protect the file_operations
> > themselves, but accesses to any stale data associated with removed files
> > ([2]).
>
> :)
>
> I actually got this to work in a crazy way, I'll send something out but
> I'm sure it's a better idea to add the .owner everywhere, but please
> let's do it in fewer than hundreds of patches :-)
>
Okay, as you mentioned earlier in 001/117 patch thread,
I will squash patches into per-driver/subsystem then send them as v2.

Thanks a lot!
Taehee

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net 000/117] net: avoid to remove module when its debugfs is being used
  2020-10-09 10:15           ` Taehee Yoo
@ 2020-10-09 10:21             ` Johannes Berg
  -1 siblings, 0 replies; 38+ messages in thread
From: Johannes Berg @ 2020-10-09 10:21 UTC (permalink / raw)
  To: Taehee Yoo
  Cc: Nicolai Stange, David Laight, davem, kuba, netdev,
	linux-wireless, wil6210, brcm80211-dev-list, b43-dev,
	linux-bluetooth

On Fri, 2020-10-09 at 19:15 +0900, Taehee Yoo wrote:
> 
> Okay, as you mentioned earlier in 001/117 patch thread,
> I will squash patches into per-driver/subsystem then send them as v2.

Give me a bit. I think I figured out a less intrusive way that at least
means we don't have to do it if the fops doesn't have ->release(), which
is the vast majority.

johannes


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH net 000/117] net: avoid to remove module when its debugfs is being used
@ 2020-10-09 10:21             ` Johannes Berg
  0 siblings, 0 replies; 38+ messages in thread
From: Johannes Berg @ 2020-10-09 10:21 UTC (permalink / raw)
  To: Taehee Yoo
  Cc: Nicolai Stange, David Laight, davem, kuba, netdev,
	linux-wireless, wil6210, brcm80211-dev-list, b43-dev,
	linux-bluetooth

On Fri, 2020-10-09 at 19:15 +0900, Taehee Yoo wrote:
> 
> Okay, as you mentioned earlier in 001/117 patch thread,
> I will squash patches into per-driver/subsystem then send them as v2.

Give me a bit. I think I figured out a less intrusive way that at least
means we don't have to do it if the fops doesn't have ->release(), which
is the vast majority.

johannes

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC] debugfs: protect against rmmod while files are open
  2020-10-09 10:21             ` Johannes Berg
  (?)
@ 2020-10-09 10:41             ` Johannes Berg
  2020-10-09 10:48               ` Johannes Berg
  -1 siblings, 1 reply; 38+ messages in thread
From: Johannes Berg @ 2020-10-09 10:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: nstange, ap420073, David.Laight, netdev, linux-wireless, gregkh,
	rafael, Johannes Berg

From: Johannes Berg <johannes.berg@intel.com>

Currently, things will crash (or at least UAF) in release() when
a module owning a debugfs file, but that didn't set the fops.owner,
is removed while the offending debugfs file is open.

Since we have the proxy_fops, we can break that down into two
different cases:

If the fops doesn't have a release method, we don't even need
to keep a reference to the real_fops, we can just fops_put()
them already in debugfs remove, and a later full_proxy_release()
won't call anything anyway - this just crashed/UAFed because it
used real_fops, not because there was actually a (now invalid)
release() method.

If, on the other hand, the fops do have a release method then
WARN and prevent adding this debugfs file if it doesn't also
have an owner and the release method is in a module. In theory,
the fops and the release method could be in different modules,
while this is something we don't really need to consider it is
in fact handled as well because we make a copy of the release()
pointer and call through that, releasing the fops when the file
is removed from debugfs.

Surely this warning will find a few places that should have an
owner, but at least then we don't have to add one everywhere.

Reported-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---
 fs/debugfs/file.c     | 24 +++++++++++++++++++-----
 fs/debugfs/inode.c    |  9 +++++++++
 fs/debugfs/internal.h |  1 +
 3 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c
index ae49a55bda00..addacefc356e 100644
--- a/fs/debugfs/file.c
+++ b/fs/debugfs/file.c
@@ -94,6 +94,7 @@ int debugfs_file_get(struct dentry *dentry)
 
 		fsd->real_fops = (void *)((unsigned long)d_fsd &
 					~DEBUGFS_FSDATA_IS_REAL_FOPS_BIT);
+		fsd->fop_release = fsd->real_fops->release;
 		refcount_set(&fsd->active_users, 1);
 		init_completion(&fsd->active_users_drained);
 		if (cmpxchg(&dentry->d_fsdata, d_fsd, fsd) != d_fsd) {
@@ -258,8 +259,8 @@ static __poll_t full_proxy_poll(struct file *filp,
 
 static int full_proxy_release(struct inode *inode, struct file *filp)
 {
-	const struct dentry *dentry = F_DENTRY(filp);
-	const struct file_operations *real_fops = debugfs_real_fops(filp);
+	struct dentry *dentry = F_DENTRY(filp);
+	struct debugfs_fsdata *fsd = dentry->d_fsdata;
 	const struct file_operations *proxy_fops = filp->f_op;
 	int r = 0;
 
@@ -268,13 +269,26 @@ static int full_proxy_release(struct inode *inode, struct file *filp)
 	 * original releaser should be called unconditionally in order
 	 * not to leak any resources. Releasers must not assume that
 	 * ->i_private is still being meaningful here.
+	 *
+	 * Note, however, that we don't reference real_fops (unless we
+	 * can guarantee it's still around). We made a copy of release()
+	 * before, in case it was NULL we then will not call anything and
+	 * don't need to use real_fops at all. This allows us to allow
+	 * module unloading of modules exposing debugfs files if they
+	 * don't have release() methods.
 	 */
-	if (real_fops->release)
-		r = real_fops->release(inode, filp);
+	if (fsd->fop_release)
+		r = fsd->fop_release(inode, filp);
 
 	replace_fops(filp, d_inode(dentry)->i_fop);
 	kfree((void *)proxy_fops);
-	fops_put(real_fops);
+
+	/* fops_put() only if not already gone */
+	if (refcount_inc_not_zero(&fsd->active_users)) {
+		fops_put(fsd->real_fops);
+		debugfs_file_put(dentry);
+	}
+
 	return r;
 }
 
diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index b7f2e971ecbc..25fd95f79c3b 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -377,6 +377,13 @@ static struct dentry *__debugfs_create_file(const char *name, umode_t mode,
 	struct dentry *dentry;
 	struct inode *inode;
 
+	if (WARN(real_fops->release &&
+		 is_module_text_address((unsigned long)real_fops->release) &&
+		 !real_fops->owner,
+		 "%ps is in a module but %ps doesn't have an owner",
+		 real_fops->release, real_fops))
+		return ERR_PTR(-EINVAL);
+
 	if (!(mode & S_IFMT))
 		mode |= S_IFREG;
 	BUG_ON(!S_ISREG(mode));
@@ -672,6 +679,8 @@ static void __debugfs_file_removed(struct dentry *dentry)
 		return;
 	if (!refcount_dec_and_test(&fsd->active_users))
 		wait_for_completion(&fsd->active_users_drained);
+	fops_put(fsd->real_fops);
+	fsd->real_fops = NULL;
 }
 
 static void remove_one(struct dentry *victim)
diff --git a/fs/debugfs/internal.h b/fs/debugfs/internal.h
index 034e6973cead..160a77abcfab 100644
--- a/fs/debugfs/internal.h
+++ b/fs/debugfs/internal.h
@@ -17,6 +17,7 @@ extern const struct file_operations debugfs_full_proxy_file_operations;
 
 struct debugfs_fsdata {
 	const struct file_operations *real_fops;
+	int (*fop_release)(struct inode *, struct file *);
 	refcount_t active_users;
 	struct completion active_users_drained;
 };
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [RFC] debugfs: protect against rmmod while files are open
  2020-10-09 10:41             ` [RFC] debugfs: protect against rmmod while files are open Johannes Berg
@ 2020-10-09 10:48               ` Johannes Berg
  2020-10-09 10:56                 ` David Laight
  0 siblings, 1 reply; 38+ messages in thread
From: Johannes Berg @ 2020-10-09 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: nstange, ap420073, David.Laight, netdev, linux-wireless, gregkh, rafael

On Fri, 2020-10-09 at 12:41 +0200, Johannes Berg wrote:

> If the fops doesn't have a release method, we don't even need
> to keep a reference to the real_fops, we can just fops_put()
> them already in debugfs remove, and a later full_proxy_release()
> won't call anything anyway - this just crashed/UAFed because it
> used real_fops, not because there was actually a (now invalid)
> release() method.

I actually implemented something a bit better than what I described - we
never need a reference to the real_fops for the release method alone,
and that means if the release method is in the kernel image, rather than
a module, it can still be called.

That together should reduce the ~117 places you changed in the large
patchset to around a handful.

johannes


^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [RFC] debugfs: protect against rmmod while files are open
  2020-10-09 10:48               ` Johannes Berg
@ 2020-10-09 10:56                 ` David Laight
  2020-10-09 10:56                   ` Johannes Berg
  2020-10-09 11:15                   ` gregkh
  0 siblings, 2 replies; 38+ messages in thread
From: David Laight @ 2020-10-09 10:56 UTC (permalink / raw)
  To: 'Johannes Berg', linux-kernel
  Cc: nstange, ap420073, netdev, linux-wireless, gregkh, rafael

From: Johannes Berg
> Sent: 09 October 2020 11:48
> 
> On Fri, 2020-10-09 at 12:41 +0200, Johannes Berg wrote:
> 
> > If the fops doesn't have a release method, we don't even need
> > to keep a reference to the real_fops, we can just fops_put()
> > them already in debugfs remove, and a later full_proxy_release()
> > won't call anything anyway - this just crashed/UAFed because it
> > used real_fops, not because there was actually a (now invalid)
> > release() method.
> 
> I actually implemented something a bit better than what I described - we
> never need a reference to the real_fops for the release method alone,
> and that means if the release method is in the kernel image, rather than
> a module, it can still be called.
> 
> That together should reduce the ~117 places you changed in the large
> patchset to around a handful.

Is there an equivalent problem for normal cdev opens
in any modules?

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] debugfs: protect against rmmod while files are open
  2020-10-09 10:56                 ` David Laight
@ 2020-10-09 10:56                   ` Johannes Berg
  2020-10-09 11:15                   ` gregkh
  1 sibling, 0 replies; 38+ messages in thread
From: Johannes Berg @ 2020-10-09 10:56 UTC (permalink / raw)
  To: David Laight, linux-kernel
  Cc: nstange, ap420073, netdev, linux-wireless, gregkh, rafael

On Fri, 2020-10-09 at 10:56 +0000, David Laight wrote:
> From: Johannes Berg
> > Sent: 09 October 2020 11:48
> > 
> > On Fri, 2020-10-09 at 12:41 +0200, Johannes Berg wrote:
> > 
> > > If the fops doesn't have a release method, we don't even need
> > > to keep a reference to the real_fops, we can just fops_put()
> > > them already in debugfs remove, and a later full_proxy_release()
> > > won't call anything anyway - this just crashed/UAFed because it
> > > used real_fops, not because there was actually a (now invalid)
> > > release() method.
> > 
> > I actually implemented something a bit better than what I described - we
> > never need a reference to the real_fops for the release method alone,
> > and that means if the release method is in the kernel image, rather than
> > a module, it can still be called.
> > 
> > That together should reduce the ~117 places you changed in the large
> > patchset to around a handful.
> 
> Is there an equivalent problem for normal cdev opens
> in any modules?

I guess so, but since there's no proxy_fops infrastructure and no
revoke(), you can't really do anything else other than adding .owner
properly, afaict.

johannes


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] debugfs: protect against rmmod while files are open
  2020-10-09 10:56                 ` David Laight
  2020-10-09 10:56                   ` Johannes Berg
@ 2020-10-09 11:15                   ` gregkh
  1 sibling, 0 replies; 38+ messages in thread
From: gregkh @ 2020-10-09 11:15 UTC (permalink / raw)
  To: David Laight
  Cc: 'Johannes Berg',
	linux-kernel, nstange, ap420073, netdev, linux-wireless, rafael

On Fri, Oct 09, 2020 at 10:56:16AM +0000, David Laight wrote:
> From: Johannes Berg
> > Sent: 09 October 2020 11:48
> > 
> > On Fri, 2020-10-09 at 12:41 +0200, Johannes Berg wrote:
> > 
> > > If the fops doesn't have a release method, we don't even need
> > > to keep a reference to the real_fops, we can just fops_put()
> > > them already in debugfs remove, and a later full_proxy_release()
> > > won't call anything anyway - this just crashed/UAFed because it
> > > used real_fops, not because there was actually a (now invalid)
> > > release() method.
> > 
> > I actually implemented something a bit better than what I described - we
> > never need a reference to the real_fops for the release method alone,
> > and that means if the release method is in the kernel image, rather than
> > a module, it can still be called.
> > 
> > That together should reduce the ~117 places you changed in the large
> > patchset to around a handful.
> 
> Is there an equivalent problem for normal cdev opens
> in any modules?

What does cdev have to do with debugfs?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net 000/117] net: avoid to remove module when its debugfs is being used
  2020-10-09 10:21             ` Johannes Berg
@ 2020-10-09 15:33               ` Steve deRosier
  -1 siblings, 0 replies; 38+ messages in thread
From: Steve deRosier @ 2020-10-09 15:33 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Taehee Yoo, Nicolai Stange, David Laight, davem, kuba, netdev,
	linux-wireless, wil6210, brcm80211-dev-list, b43-dev,
	linux-bluetooth

On Fri, Oct 9, 2020 at 3:22 AM Johannes Berg <johannes@sipsolutions.net> wrote:
>
> On Fri, 2020-10-09 at 19:15 +0900, Taehee Yoo wrote:
> >
> > Okay, as you mentioned earlier in 001/117 patch thread,
> > I will squash patches into per-driver/subsystem then send them as v2.
>
> Give me a bit. I think I figured out a less intrusive way that at least
> means we don't have to do it if the fops doesn't have ->release(), which
> is the vast majority.
>

While I'm all for a patch that fixes something at a single level
instead of touching 100s of files, let me ask a loosely related, but
more basic, question: Should `->owner` be set properly in each driver?
 Or the flip of that, should we be considering that it isn't a
semantic error? I don't know the answer myself, I just thought to ask
the question.

IMHO, if true that `->owner` should be set for "correctness", and even
if we fix the debugfs problem elsewhere, perhaps this series (squashed
of course) should be merged.

- Steve

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH net 000/117] net: avoid to remove module when its debugfs is being used
@ 2020-10-09 15:33               ` Steve deRosier
  0 siblings, 0 replies; 38+ messages in thread
From: Steve deRosier @ 2020-10-09 15:33 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Taehee Yoo, Nicolai Stange, David Laight, davem, kuba, netdev,
	linux-wireless, wil6210, brcm80211-dev-list, b43-dev,
	linux-bluetooth

On Fri, Oct 9, 2020 at 3:22 AM Johannes Berg <johannes@sipsolutions.net> wrote:
>
> On Fri, 2020-10-09 at 19:15 +0900, Taehee Yoo wrote:
> >
> > Okay, as you mentioned earlier in 001/117 patch thread,
> > I will squash patches into per-driver/subsystem then send them as v2.
>
> Give me a bit. I think I figured out a less intrusive way that at least
> means we don't have to do it if the fops doesn't have ->release(), which
> is the vast majority.
>

While I'm all for a patch that fixes something at a single level
instead of touching 100s of files, let me ask a loosely related, but
more basic, question: Should `->owner` be set properly in each driver?
 Or the flip of that, should we be considering that it isn't a
semantic error? I don't know the answer myself, I just thought to ask
the question.

IMHO, if true that `->owner` should be set for "correctness", and even
if we fix the debugfs problem elsewhere, perhaps this series (squashed
of course) should be merged.

- Steve

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [CRAZY-RFF] debugfs: track open files and release on remove
  2020-10-09  8:48                   ` Johannes Berg
@ 2020-10-10  9:38                     ` Greg KH
  2020-10-10 10:47                       ` Johannes Berg
  0 siblings, 1 reply; 38+ messages in thread
From: Greg KH @ 2020-10-10  9:38 UTC (permalink / raw)
  To: Johannes Berg
  Cc: linux-kernel, nstange, ap420073, David.Laight, netdev,
	linux-wireless, rafael

On Fri, Oct 09, 2020 at 10:48:09AM +0200, Johannes Berg wrote:
> On Fri, 2020-10-09 at 10:47 +0200, Greg KH wrote:
> 
> > > I think adding the .owner everywhere would be good, and perhaps we can
> > > somehow put a check somewhere like
> > > 
> > > 	WARN_ON(is_module_address((unsigned long)fops) && !fops->owner);
> > > 
> > > to prevent the issue in the future?
> > 
> > That will fail for all of the debugfs_create_* operations, as there is
> > only one set of file operations for all of the different files created
> > with these calls.
> 
> Why would it fail? Those have their fops in the core debugfs code, which
> might have a .owner assigned but is probably built-in anyway?

Bad choice of terms, it would "fail" in that this type of check would
never actually work because the debugfs code is built into the kernel,
and there is no module owner for it.  But the value it is referencing is
an address in a module.

> > Which, now that I remember it, is why we went down the proxy "solution"
> > in the first place :(
> 
> Not sure I understand. That was related more to (arbitrary) files having
> to be disappeared rather than anything else?

Isn't this the same issue?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [CRAZY-RFF] debugfs: track open files and release on remove
  2020-10-10  9:38                     ` Greg KH
@ 2020-10-10 10:47                       ` Johannes Berg
  0 siblings, 0 replies; 38+ messages in thread
From: Johannes Berg @ 2020-10-10 10:47 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-kernel, nstange, ap420073, David.Laight, netdev,
	linux-wireless, rafael

On Sat, 2020-10-10 at 11:38 +0200, Greg KH wrote:
> On Fri, Oct 09, 2020 at 10:48:09AM +0200, Johannes Berg wrote:
> > On Fri, 2020-10-09 at 10:47 +0200, Greg KH wrote:
> > 
> > > > I think adding the .owner everywhere would be good, and perhaps we can
> > > > somehow put a check somewhere like
> > > > 
> > > > 	WARN_ON(is_module_address((unsigned long)fops) && !fops->owner);
> > > > 
> > > > to prevent the issue in the future?
> > > 
> > > That will fail for all of the debugfs_create_* operations, as there is
> > > only one set of file operations for all of the different files created
> > > with these calls.
> > 
> > Why would it fail? Those have their fops in the core debugfs code, which
> > might have a .owner assigned but is probably built-in anyway?
> 
> Bad choice of terms, it would "fail" in that this type of check would
> never actually work because the debugfs code is built into the kernel,
> and there is no module owner for it.  But the value it is referencing is
> an address in a module.

Ahh.

Yes and no. I mean, yes, the check wouldn't really work.

But OTOH, this is exactly what the proxy_fops protects against.

The _only_ thing that proxy_fops *doesn't* proxy is the ->release()
method.

If you have a debugfs file that's say debugfs_create_u32(), then the
code is all built into the kernel, and - if ->release() even exists, I
didn't check now - it would surely not dereference the pointer you gave
to debugfs_create_u32(). So as long as the file is debugfs_remove()d
before the pointer becomes invalid, there's no issue.

The check I'm proposing (and actually wrote in my separate RFC patch
that didn't seem quite as crazy) would basically protect the ->release()
method only, if needed. Everything else is handled by proxy_fops.

> > > Which, now that I remember it, is why we went down the proxy "solution"
> > > in the first place :(
> > 
> > Not sure I understand. That was related more to (arbitrary) files having
> > to be disappeared rather than anything else?
> 
> Isn't this the same issue?

Well, not exactly? The difference is that proxy_fops basically protects
the *value*, read/write/etc., but not ->release(). So it protects more
against bus unbind or the like, where the *device* disappears, rather
than the *code* disappearing.

Now, you still need to be careful that ->release() doesn't actually
access anything related to the device, of course. As long as we don't
have a general revoke() at least.

I guess in that sense this crazy patch actually makes things *better*
than the RFC patch because it *does* call the ->release() during
debugfs_remove() and therefore allows even ->release() to access data of
the device or other data structures that are being removed; whereas the
RFC patch I also sent doesn't protect that, it just protects the code
itself.

johannes


^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2020-10-10 22:53 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-08 15:48 [PATCH net 000/117] net: avoid to remove module when its debugfs is being used Taehee Yoo
2020-10-08 15:59 ` David Laight
2020-10-08 15:59   ` David Laight
2020-10-08 16:14   ` Johannes Berg
2020-10-08 16:14     ` Johannes Berg
2020-10-08 16:37     ` Taehee Yoo
2020-10-08 16:37       ` Taehee Yoo
2020-10-09  5:38       ` Nicolai Stange
2020-10-09  5:38         ` Nicolai Stange
2020-10-09 10:07         ` Taehee Yoo
2020-10-09 10:07           ` Taehee Yoo
2020-10-09  5:09     ` Nicolai Stange
2020-10-09  5:09       ` Nicolai Stange
2020-10-09  7:45       ` Johannes Berg
2020-10-09  7:45         ` Johannes Berg
2020-10-09 10:15         ` Taehee Yoo
2020-10-09 10:15           ` Taehee Yoo
2020-10-09 10:21           ` Johannes Berg
2020-10-09 10:21             ` Johannes Berg
2020-10-09 10:41             ` [RFC] debugfs: protect against rmmod while files are open Johannes Berg
2020-10-09 10:48               ` Johannes Berg
2020-10-09 10:56                 ` David Laight
2020-10-09 10:56                   ` Johannes Berg
2020-10-09 11:15                   ` gregkh
2020-10-09 15:33             ` [PATCH net 000/117] net: avoid to remove module when its debugfs is being used Steve deRosier
2020-10-09 15:33               ` Steve deRosier
2020-10-09  7:53       ` [CRAZY-RFF] debugfs: track open files and release on remove Johannes Berg
2020-10-09  8:03         ` Greg KH
2020-10-09  8:06           ` Johannes Berg
2020-10-09  8:16             ` Greg KH
2020-10-09  8:19               ` Johannes Berg
2020-10-09  8:34                 ` David Laight
2020-10-09  8:44                   ` Johannes Berg
2020-10-09  9:00                     ` David Laight
2020-10-09  8:47                 ` Greg KH
2020-10-09  8:48                   ` Johannes Berg
2020-10-10  9:38                     ` Greg KH
2020-10-10 10:47                       ` Johannes Berg

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.