From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org,
Daniel Lezcano <daniel.lezcano@linaro.org>,
Ido Schimmel <idosch@nvidia.com>,
Vadim Pasternak <vadimp@nvidia.com>,
Jakub Kicinski <kuba@kernel.org>
Subject: [PATCH 5.4 58/69] mlxsw: thermal: Fix out-of-bounds memory accesses
Date: Mon, 18 Oct 2021 15:24:56 +0200 [thread overview]
Message-ID: <20211018132331.395914338@linuxfoundation.org> (raw)
In-Reply-To: <20211018132329.453964125@linuxfoundation.org>
From: Ido Schimmel <idosch@nvidia.com>
commit 332fdf951df8b870e3da86b122ae304e2aabe88c upstream.
Currently, mlxsw allows cooling states to be set above the maximum
cooling state supported by the driver:
# cat /sys/class/thermal/thermal_zone2/cdev0/type
mlxsw_fan
# cat /sys/class/thermal/thermal_zone2/cdev0/max_state
10
# echo 18 > /sys/class/thermal/thermal_zone2/cdev0/cur_state
# echo $?
0
This results in out-of-bounds memory accesses when thermal state
transition statistics are enabled (CONFIG_THERMAL_STATISTICS=y), as the
transition table is accessed with a too large index (state) [1].
According to the thermal maintainer, it is the responsibility of the
driver to reject such operations [2].
Therefore, return an error when the state to be set exceeds the maximum
cooling state supported by the driver.
To avoid dead code, as suggested by the thermal maintainer [3],
partially revert commit a421ce088ac8 ("mlxsw: core: Extend cooling
device with cooling levels") that tried to interpret these invalid
cooling states (above the maximum) in a special way. The cooling levels
array is not removed in order to prevent the fans going below 20% PWM,
which would cause them to get stuck at 0% PWM.
[1]
BUG: KASAN: slab-out-of-bounds in thermal_cooling_device_stats_update+0x271/0x290
Read of size 4 at addr ffff8881052f7bf8 by task kworker/0:0/5
CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.15.0-rc3-custom-45935-gce1adf704b14 #122
Hardware name: Mellanox Technologies Ltd. "MSN2410-CB2FO"/"SA000874", BIOS 4.6.5 03/08/2016
Workqueue: events_freezable_power_ thermal_zone_device_check
Call Trace:
dump_stack_lvl+0x8b/0xb3
print_address_description.constprop.0+0x1f/0x140
kasan_report.cold+0x7f/0x11b
thermal_cooling_device_stats_update+0x271/0x290
__thermal_cdev_update+0x15e/0x4e0
thermal_cdev_update+0x9f/0xe0
step_wise_throttle+0x770/0xee0
thermal_zone_device_update+0x3f6/0xdf0
process_one_work+0xa42/0x1770
worker_thread+0x62f/0x13e0
kthread+0x3ee/0x4e0
ret_from_fork+0x1f/0x30
Allocated by task 1:
kasan_save_stack+0x1b/0x40
__kasan_kmalloc+0x7c/0x90
thermal_cooling_device_setup_sysfs+0x153/0x2c0
__thermal_cooling_device_register.part.0+0x25b/0x9c0
thermal_cooling_device_register+0xb3/0x100
mlxsw_thermal_init+0x5c5/0x7e0
__mlxsw_core_bus_device_register+0xcb3/0x19c0
mlxsw_core_bus_device_register+0x56/0xb0
mlxsw_pci_probe+0x54f/0x710
local_pci_probe+0xc6/0x170
pci_device_probe+0x2b2/0x4d0
really_probe+0x293/0xd10
__driver_probe_device+0x2af/0x440
driver_probe_device+0x51/0x1e0
__driver_attach+0x21b/0x530
bus_for_each_dev+0x14c/0x1d0
bus_add_driver+0x3ac/0x650
driver_register+0x241/0x3d0
mlxsw_sp_module_init+0xa2/0x174
do_one_initcall+0xee/0x5f0
kernel_init_freeable+0x45a/0x4de
kernel_init+0x1f/0x210
ret_from_fork+0x1f/0x30
The buggy address belongs to the object at ffff8881052f7800
which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 1016 bytes inside of
1024-byte region [ffff8881052f7800, ffff8881052f7c00)
The buggy address belongs to the page:
page:0000000052355272 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1052f0
head:0000000052355272 order:3 compound_mapcount:0 compound_pincount:0
flags: 0x200000000010200(slab|head|node=0|zone=2)
raw: 0200000000010200 ffffea0005034800 0000000300000003 ffff888100041dc0
raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff8881052f7a80: 00 00 00 00 00 00 04 fc fc fc fc fc fc fc fc fc
ffff8881052f7b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff8881052f7b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
^
ffff8881052f7c00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff8881052f7c80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[2] https://lore.kernel.org/linux-pm/9aca37cb-1629-5c67-1895-1fdc45c0244e@linaro.org/
[3] https://lore.kernel.org/linux-pm/af9857f2-578e-de3a-e62b-6baff7e69fd4@linaro.org/
CC: Daniel Lezcano <daniel.lezcano@linaro.org>
Fixes: a50c1e35650b ("mlxsw: core: Implement thermal zone")
Fixes: a421ce088ac8 ("mlxsw: core: Extend cooling device with cooling levels")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/20211012174955.472928-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/net/ethernet/mellanox/mlxsw/core_thermal.c | 52 ++-------------------
1 file changed, 5 insertions(+), 47 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
@@ -25,16 +25,8 @@
#define MLXSW_THERMAL_ZONE_MAX_NAME 16
#define MLXSW_THERMAL_TEMP_SCORE_MAX GENMASK(31, 0)
#define MLXSW_THERMAL_MAX_STATE 10
+#define MLXSW_THERMAL_MIN_STATE 2
#define MLXSW_THERMAL_MAX_DUTY 255
-/* Minimum and maximum fan allowed speed in percent: from 20% to 100%. Values
- * MLXSW_THERMAL_MAX_STATE + x, where x is between 2 and 10 are used for
- * setting fan speed dynamic minimum. For example, if value is set to 14 (40%)
- * cooling levels vector will be set to 4, 4, 4, 4, 4, 5, 6, 7, 8, 9, 10 to
- * introduce PWM speed in percent: 40, 40, 40, 40, 40, 50, 60. 70, 80, 90, 100.
- */
-#define MLXSW_THERMAL_SPEED_MIN (MLXSW_THERMAL_MAX_STATE + 2)
-#define MLXSW_THERMAL_SPEED_MAX (MLXSW_THERMAL_MAX_STATE * 2)
-#define MLXSW_THERMAL_SPEED_MIN_LEVEL 2 /* 20% */
/* External cooling devices, allowed for binding to mlxsw thermal zones. */
static char * const mlxsw_thermal_external_allowed_cdev[] = {
@@ -703,49 +695,16 @@ static int mlxsw_thermal_set_cur_state(s
struct mlxsw_thermal *thermal = cdev->devdata;
struct device *dev = thermal->bus_info->dev;
char mfsc_pl[MLXSW_REG_MFSC_LEN];
- unsigned long cur_state, i;
int idx;
- u8 duty;
int err;
+ if (state > MLXSW_THERMAL_MAX_STATE)
+ return -EINVAL;
+
idx = mlxsw_get_cooling_device_idx(thermal, cdev);
if (idx < 0)
return idx;
- /* Verify if this request is for changing allowed fan dynamical
- * minimum. If it is - update cooling levels accordingly and update
- * state, if current state is below the newly requested minimum state.
- * For example, if current state is 5, and minimal state is to be
- * changed from 4 to 6, thermal->cooling_levels[0 to 5] will be changed
- * all from 4 to 6. And state 5 (thermal->cooling_levels[4]) should be
- * overwritten.
- */
- if (state >= MLXSW_THERMAL_SPEED_MIN &&
- state <= MLXSW_THERMAL_SPEED_MAX) {
- state -= MLXSW_THERMAL_MAX_STATE;
- for (i = 0; i <= MLXSW_THERMAL_MAX_STATE; i++)
- thermal->cooling_levels[i] = max(state, i);
-
- mlxsw_reg_mfsc_pack(mfsc_pl, idx, 0);
- err = mlxsw_reg_query(thermal->core, MLXSW_REG(mfsc), mfsc_pl);
- if (err)
- return err;
-
- duty = mlxsw_reg_mfsc_pwm_duty_cycle_get(mfsc_pl);
- cur_state = mlxsw_duty_to_state(duty);
-
- /* If current fan state is lower than requested dynamical
- * minimum, increase fan speed up to dynamical minimum.
- */
- if (state < cur_state)
- return 0;
-
- state = cur_state;
- }
-
- if (state > MLXSW_THERMAL_MAX_STATE)
- return -EINVAL;
-
/* Normalize the state to the valid speed range. */
state = thermal->cooling_levels[state];
mlxsw_reg_mfsc_pack(mfsc_pl, idx, mlxsw_state_to_duty(state));
@@ -1040,8 +999,7 @@ int mlxsw_thermal_init(struct mlxsw_core
/* Initialize cooling levels per PWM state. */
for (i = 0; i < MLXSW_THERMAL_MAX_STATE; i++)
- thermal->cooling_levels[i] = max(MLXSW_THERMAL_SPEED_MIN_LEVEL,
- i);
+ thermal->cooling_levels[i] = max(MLXSW_THERMAL_MIN_STATE, i);
thermal->polling_delay = bus_info->low_frequency ?
MLXSW_THERMAL_SLOW_POLL_INT :
next prev parent reply other threads:[~2021-10-18 13:38 UTC|newest]
Thread overview: 73+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-10-18 13:23 [PATCH 5.4 00/69] 5.4.155-rc1 review Greg Kroah-Hartman
2021-10-18 13:23 ` [PATCH 5.4 01/69] ovl: simplify file splice Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 02/69] ALSA: usb-audio: Add quirk for VF0770 Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 03/69] ALSA: seq: Fix a potential UAF by wrong private_free call order Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 04/69] ALSA: hda/realtek: Complete partial device name to avoid ambiguity Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 05/69] ALSA: hda/realtek: Add quirk for Clevo X170KM-G Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 06/69] ALSA: hda/realtek - ALC236 headset MIC recording issue Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 07/69] ALSA: hda/realtek: Fix the mic type detection issue for ASUS G551JW Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 08/69] nds32/ftrace: Fix Error: invalid operands (*UND* and *UND* sections) for `^ Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 09/69] s390: fix strrchr() implementation Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 10/69] csky: dont let sigreturn play with priveleged bits of status register Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 11/69] csky: Fixup regs.sr broken in ptrace Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 12/69] btrfs: unlock newly allocated extent buffer after error Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 13/69] btrfs: deal with errors when replaying dir entry during log replay Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 14/69] btrfs: deal with errors when adding inode reference " Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 15/69] btrfs: check for error when looking up inode during dir entry replay Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 16/69] watchdog: orion: use 0 for unset heartbeat Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 17/69] x86/resctrl: Free the ctrlval arrays when domain_setup_mon_state() fails Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 18/69] mei: me: add Ice Lake-N device id Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 19/69] xhci: guard accesses to ep_state in xhci_endpoint_reset() Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 20/69] xhci: Fix command ring pointer corruption while aborting a command Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 21/69] xhci: Enable trust tx length quirk for Fresco FL11 USB controller Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 22/69] cb710: avoid NULL pointer subtraction Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 23/69] efi/cper: use stack buffer for error record decoding Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 24/69] efi: Change down_interruptible() in virt_efi_reset_system() to down_trylock() Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 25/69] usb: musb: dsps: Fix the probe error path Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 26/69] Input: xpad - add support for another USB ID of Nacon GC-100 Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 27/69] USB: serial: qcserial: add EM9191 QDL support Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 28/69] USB: serial: option: add Quectel EC200S-CN module support Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 29/69] USB: serial: option: add Telit LE910Cx composition 0x1204 Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 30/69] USB: serial: option: add prod. id for Quectel EG91 Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 31/69] virtio: write back F_VERSION_1 before validate Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 32/69] EDAC/armada-xp: Fix output of uncorrectable error counter Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 33/69] nvmem: Fix shift-out-of-bound (UBSAN) with byte size cells Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 34/69] KVM: PPC: Book3S HV: Fix stack handling in idle_kvm_start_guest() Greg Kroah-Hartman
2021-10-19 10:34 ` Michael Ellerman
2021-10-18 13:24 ` [PATCH 5.4 35/69] KVM: PPC: Book3S HV: Make idle_kvm_start_guest() return 0 if it went to guest Greg Kroah-Hartman
2021-10-19 10:54 ` Michael Ellerman
2021-10-18 13:24 ` [PATCH 5.4 36/69] x86/Kconfig: Do not enable AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT automatically Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 37/69] powerpc/xive: Discard disabled interrupts in get_irqchip_state() Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 38/69] iio: adc: aspeed: set driver data when adc probe Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 39/69] iio: adc128s052: Fix the error handling path of adc128_probe() Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 40/69] iio: mtk-auxadc: fix case IIO_CHAN_INFO_PROCESSED Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 41/69] iio: light: opt3001: Fixed timeout error when 0 lux Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 42/69] iio: ssp_sensors: add more range checking in ssp_parse_dataframe() Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 43/69] iio: ssp_sensors: fix error code in ssp_print_mcu_debug() Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 44/69] iio: dac: ti-dac5571: fix an error code in probe() Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 45/69] sctp: account stream padding length for reconf chunk Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 46/69] gpio: pca953x: Improve bias setting Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 47/69] net: arc: select CRC32 Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 48/69] net: korina: " Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 49/69] net/mlx5e: Mutually exclude RX-FCS and RX-port-timestamp Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 50/69] net: stmmac: fix get_hw_feature() on old hardware Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 51/69] net: encx24j600: check error in devm_regmap_init_encx24j600 Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 52/69] ethernet: s2io: fix setting mac address during resume Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 53/69] nfc: fix error handling of nfc_proto_register() Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 54/69] NFC: digital: fix possible memory leak in digital_tg_listen_mdaa() Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 55/69] NFC: digital: fix possible memory leak in digital_in_send_sdd_req() Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 56/69] pata_legacy: fix a couple uninitialized variable bugs Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 57/69] ata: ahci_platform: fix null-ptr-deref in ahci_platform_enable_regulators() Greg Kroah-Hartman
2021-10-18 13:24 ` Greg Kroah-Hartman [this message]
2021-10-18 13:24 ` [PATCH 5.4 59/69] platform/mellanox: mlxreg-io: Fix argument base in kstrtou32() call Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 60/69] drm/panel: olimex-lcd-olinuxino: select CRC32 Greg Kroah-Hartman
2021-10-18 13:24 ` [PATCH 5.4 61/69] drm/msm: Fix null pointer dereference on pointer edp Greg Kroah-Hartman
2021-10-18 13:25 ` [PATCH 5.4 62/69] drm/msm/mdp5: fix cursor-related warnings Greg Kroah-Hartman
2021-10-18 13:25 ` [PATCH 5.4 63/69] drm/msm/dsi: Fix an error code in msm_dsi_modeset_init() Greg Kroah-Hartman
2021-10-18 13:25 ` [PATCH 5.4 64/69] drm/msm/dsi: fix off by one in dsi_bus_clk_enable error handling Greg Kroah-Hartman
2021-10-18 13:25 ` [PATCH 5.4 65/69] acpi/arm64: fix next_platform_timer() section mismatch error Greg Kroah-Hartman
2021-10-18 13:25 ` [PATCH 5.4 66/69] mqprio: Correct stats in mqprio_dump_class_stats() Greg Kroah-Hartman
2021-10-18 13:25 ` [PATCH 5.4 67/69] qed: Fix missing error code in qed_slowpath_start() Greg Kroah-Hartman
2021-10-18 13:25 ` [PATCH 5.4 68/69] r8152: select CRC32 and CRYPTO/CRYPTO_HASH/CRYPTO_SHA256 Greg Kroah-Hartman
2021-10-18 13:25 ` [PATCH 5.4 69/69] ionic: dont remove netdev->dev_addr when syncing uc list Greg Kroah-Hartman
2021-10-18 14:11 ` [PATCH 5.4 00/69] 5.4.155-rc1 review Naresh Kamboju
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20211018132331.395914338@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=daniel.lezcano@linaro.org \
--cc=idosch@nvidia.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=stable@vger.kernel.org \
--cc=vadimp@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).