* [PATCH] bnxt_en: use new module_firmware_crashed()
From: Vasundhara Volam @ 2020-05-16 5:49 UTC
To: jeyu; +Cc: davem, netdev, Vasundhara Volam, Michael Chan, Luis Chamberlain
This makes use of the new module_firmware_crashed() to annotate
when the firmware of a device driver crashes. When firmware crashes,
devices can become unresponsive, and recovery sometimes requires a
driver unload / reload and, in the worst case, a reboot.
Using a taint flag lets us clearly annotate when this happens.
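As a rough illustration of the taint-flag idea described above, here is a minimal user-space sketch. It is not the kernel helper: module_firmware_crashed() is defined in the patchset referenced below and is not shown in this thread. Every name here (`demo_firmware_crashed`, `demo_taint_mask`, `DEMO_TAINT_FIRMWARE_CRASH`, `demo_reset_notify`) is hypothetical, and the sketch only assumes that a taint is a sticky bit that is set on a fatal event and never cleared afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a kernel taint bit; not the kernel's real flag. */
#define DEMO_TAINT_FIRMWARE_CRASH (1UL << 0)

/* Sticky mask: bits are only ever set in this sketch, never cleared. */
static unsigned long demo_taint_mask;

/* Hypothetical analogue of module_firmware_crashed(): record the crash. */
void demo_firmware_crashed(const char *driver)
{
	assert(driver != NULL);
	demo_taint_mask |= DEMO_TAINT_FIRMWARE_CRASH;
	fprintf(stderr, "%s: firmware crashed, tainting\n", driver);
}

/* Report whether a firmware crash has been recorded so far. */
bool demo_firmware_taint_set(void)
{
	return (demo_taint_mask & DEMO_TAINT_FIRMWARE_CRASH) != 0;
}

/* Mirrors the shape of the fatal / non-fatal branch touched by the patch. */
void demo_reset_notify(const char *driver, bool fatal)
{
	if (fatal)
		demo_firmware_crashed(driver);
}
```

Because the bit is sticky, a later non-fatal event leaves the record in place, which is what makes such a flag useful when reading a bug report after the fact.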
Cc: Michael Chan <michael.chan@broadcom.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
---
Please append to the patchset:
("[PATCH v2 00/15] net: taint when the device driver firmware crashes")
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index f86b621..b208404 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -2009,6 +2009,7 @@ static int bnxt_async_event_process(struct bnxt *bp,
 		if (!bp->fw_reset_max_dsecs)
 			bp->fw_reset_max_dsecs = BNXT_DFLT_FW_RST_MAX_DSECS;
 		if (EVENT_DATA1_RESET_NOTIFY_FATAL(data1)) {
+			module_firmware_crashed();
 			netdev_warn(bp->dev, "Firmware fatal reset event received\n");
 			set_bit(BNXT_STATE_FW_FATAL_COND, &bp->state);
 		} else {
@@ -10183,6 +10184,7 @@ static void bnxt_force_fw_reset(struct bnxt *bp)
 void bnxt_fw_exception(struct bnxt *bp)
 {
+	module_firmware_crashed();
 	netdev_warn(bp->dev, "Detected firmware fatal condition, initiating reset\n");
 	set_bit(BNXT_STATE_FW_FATAL_COND, &bp->state);
 	bnxt_rtnl_lock_sp(bp);
--
1.8.3.1
* Re: [PATCH] bnxt_en: use new module_firmware_crashed()
From: kbuild test robot @ 2020-05-16 8:01 UTC
To: Vasundhara Volam, jeyu
Cc: kbuild-all, davem, netdev, Vasundhara Volam, Michael Chan,
Luis Chamberlain
Hi Vasundhara,
Thank you for the patch! However, there is something to improve:
[auto build test ERROR on sparc-next/master]
[also build test ERROR on linus/master v5.7-rc5 next-20200515]
[If your patch is applied to the wrong git tree, please drop us a note to help
improve the system. We also suggest using the '--base' option to specify the
base tree in git format-patch; please see https://stackoverflow.com/a/37406982]
url: https://github.com/0day-ci/linux/commits/Vasundhara-Volam/bnxt_en-use-new-module_firmware_crashed/20200516-135339
base: https://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next.git master
config: s390-allyesconfig (attached as .config)
compiler: s390-linux-gcc (GCC) 9.3.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day GCC_VERSION=9.3.0 make.cross ARCH=s390
If you fix the issue, kindly add the following tag as appropriate:
Reported-by: kbuild test robot <lkp@intel.com>
All errors (new ones prefixed by >>, old ones prefixed by <<):
drivers/net/ethernet/broadcom/bnxt/bnxt.c: In function 'bnxt_async_event_process':
>> drivers/net/ethernet/broadcom/bnxt/bnxt.c:2012:4: error: implicit declaration of function 'module_firmware_crashed' [-Werror=implicit-function-declaration]
    2012 |    module_firmware_crashed();
         |    ^~~~~~~~~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
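The error means the compiler reached the call with no declaration of module_firmware_crashed() in scope, which is what one would expect if the tree under test does not contain the patchset that introduces the helper. One common C pattern for keeping callers buildable in either case is to pair the declaration with a no-op inline stub. The sketch below only illustrates that pattern: `DEMO_HAVE_FW_CRASH_HELPER`, `demo_fw_crash_helper` and `demo_fatal_event` are hypothetical names, and nothing here is taken from the kernel headers.

```c
#include <assert.h>
#include <stdio.h>

/*
 * DEMO_HAVE_FW_CRASH_HELPER is a hypothetical feature macro. When it is
 * undefined, a no-op inline stub keeps callers compiling.
 */
#ifdef DEMO_HAVE_FW_CRASH_HELPER
void demo_fw_crash_helper(void);	/* defined elsewhere when available */
#else
static inline void demo_fw_crash_helper(void)
{
	/* Helper not available in this build: do nothing. */
}
#endif

/* The caller uses the helper unconditionally; returns 1 once handled. */
int demo_fatal_event(const char *dev)
{
	assert(dev != NULL);
	demo_fw_crash_helper();
	printf("%s: Firmware fatal reset event received\n", dev);
	return 1;
}
```

With the macro undefined, the call compiles to nothing, so the caller needs no conditional code of its own.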
vim +/module_firmware_crashed +2012 drivers/net/ethernet/broadcom/bnxt/bnxt.c
  1938
  1939	#define BNXT_GET_EVENT_PORT(data) \
  1940		((data) & \
  1941		 ASYNC_EVENT_CMPL_PORT_CONN_NOT_ALLOWED_EVENT_DATA1_PORT_ID_MASK)
  1942
  1943	static int bnxt_async_event_process(struct bnxt *bp,
  1944					    struct hwrm_async_event_cmpl *cmpl)
  1945	{
  1946		u16 event_id = le16_to_cpu(cmpl->event_id);
  1947
  1948		/* TODO CHIMP_FW: Define event id's for link change, error etc */
  1949		switch (event_id) {
  1950		case ASYNC_EVENT_CMPL_EVENT_ID_LINK_SPEED_CFG_CHANGE: {
  1951			u32 data1 = le32_to_cpu(cmpl->event_data1);
  1952			struct bnxt_link_info *link_info = &bp->link_info;
  1953
  1954			if (BNXT_VF(bp))
  1955				goto async_event_process_exit;
  1956
  1957			/* print unsupported speed warning in forced speed mode only */
  1958			if (!(link_info->autoneg & BNXT_AUTONEG_SPEED) &&
  1959			    (data1 & 0x20000)) {
  1960				u16 fw_speed = link_info->force_link_speed;
  1961				u32 speed = bnxt_fw_to_ethtool_speed(fw_speed);
  1962
  1963				if (speed != SPEED_UNKNOWN)
  1964					netdev_warn(bp->dev, "Link speed %d no longer supported\n",
  1965						    speed);
  1966			}
  1967			set_bit(BNXT_LINK_SPEED_CHNG_SP_EVENT, &bp->sp_event);
  1968		}
  1969		/* fall through */
  1970		case ASYNC_EVENT_CMPL_EVENT_ID_LINK_SPEED_CHANGE:
  1971		case ASYNC_EVENT_CMPL_EVENT_ID_PORT_PHY_CFG_CHANGE:
  1972			set_bit(BNXT_LINK_CFG_CHANGE_SP_EVENT, &bp->sp_event);
  1973			/* fall through */
  1974		case ASYNC_EVENT_CMPL_EVENT_ID_LINK_STATUS_CHANGE:
  1975			set_bit(BNXT_LINK_CHNG_SP_EVENT, &bp->sp_event);
  1976			break;
  1977		case ASYNC_EVENT_CMPL_EVENT_ID_PF_DRVR_UNLOAD:
  1978			set_bit(BNXT_HWRM_PF_UNLOAD_SP_EVENT, &bp->sp_event);
  1979			break;
  1980		case ASYNC_EVENT_CMPL_EVENT_ID_PORT_CONN_NOT_ALLOWED: {
  1981			u32 data1 = le32_to_cpu(cmpl->event_data1);
  1982			u16 port_id = BNXT_GET_EVENT_PORT(data1);
  1983
  1984			if (BNXT_VF(bp))
  1985				break;
  1986
  1987			if (bp->pf.port_id != port_id)
  1988				break;
  1989
  1990			set_bit(BNXT_HWRM_PORT_MODULE_SP_EVENT, &bp->sp_event);
  1991			break;
  1992		}
  1993		case ASYNC_EVENT_CMPL_EVENT_ID_VF_CFG_CHANGE:
  1994			if (BNXT_PF(bp))
  1995				goto async_event_process_exit;
  1996			set_bit(BNXT_RESET_TASK_SILENT_SP_EVENT, &bp->sp_event);
  1997			break;
  1998		case ASYNC_EVENT_CMPL_EVENT_ID_RESET_NOTIFY: {
  1999			u32 data1 = le32_to_cpu(cmpl->event_data1);
  2000
  2001			if (!bp->fw_health)
  2002				goto async_event_process_exit;
  2003
  2004			bp->fw_reset_timestamp = jiffies;
  2005			bp->fw_reset_min_dsecs = cmpl->timestamp_lo;
  2006			if (!bp->fw_reset_min_dsecs)
  2007				bp->fw_reset_min_dsecs = BNXT_DFLT_FW_RST_MIN_DSECS;
  2008			bp->fw_reset_max_dsecs = le16_to_cpu(cmpl->timestamp_hi);
  2009			if (!bp->fw_reset_max_dsecs)
  2010				bp->fw_reset_max_dsecs = BNXT_DFLT_FW_RST_MAX_DSECS;
  2011			if (EVENT_DATA1_RESET_NOTIFY_FATAL(data1)) {
> 2012				module_firmware_crashed();
  2013				netdev_warn(bp->dev, "Firmware fatal reset event received\n");
  2014				set_bit(BNXT_STATE_FW_FATAL_COND, &bp->state);
  2015			} else {
  2016				netdev_warn(bp->dev, "Firmware non-fatal reset event received, max wait time %d msec\n",
  2017					    bp->fw_reset_max_dsecs * 100);
  2018			}
  2019			set_bit(BNXT_FW_RESET_NOTIFY_SP_EVENT, &bp->sp_event);
  2020			break;
  2021		}
  2022		case ASYNC_EVENT_CMPL_EVENT_ID_ERROR_RECOVERY: {
  2023			struct bnxt_fw_health *fw_health = bp->fw_health;
  2024			u32 data1 = le32_to_cpu(cmpl->event_data1);
  2025
  2026			if (!fw_health)
  2027				goto async_event_process_exit;
  2028
  2029			fw_health->enabled = EVENT_DATA1_RECOVERY_ENABLED(data1);
  2030			fw_health->master = EVENT_DATA1_RECOVERY_MASTER_FUNC(data1);
  2031			if (!fw_health->enabled)
  2032				break;
  2033
  2034			if (netif_msg_drv(bp))
  2035				netdev_info(bp->dev, "Error recovery info: error recovery[%d], master[%d], reset count[0x%x], health status: 0x%x\n",
  2036					    fw_health->enabled, fw_health->master,
  2037					    bnxt_fw_health_readl(bp,
  2038								 BNXT_FW_RESET_CNT_REG),
  2039					    bnxt_fw_health_readl(bp,
  2040								 BNXT_FW_HEALTH_REG));
  2041			fw_health->tmr_multiplier =
  2042				DIV_ROUND_UP(fw_health->polling_dsecs * HZ,
  2043					     bp->current_interval * 10);
  2044			fw_health->tmr_counter = fw_health->tmr_multiplier;
  2045			fw_health->last_fw_heartbeat =
  2046				bnxt_fw_health_readl(bp, BNXT_FW_HEARTBEAT_REG);
  2047			fw_health->last_fw_reset_cnt =
  2048				bnxt_fw_health_readl(bp, BNXT_FW_RESET_CNT_REG);
  2049			goto async_event_process_exit;
  2050		}
  2051		default:
  2052			goto async_event_process_exit;
  2053		}
  2054		bnxt_queue_sp_work(bp);
  2055	async_event_process_exit:
  2056		bnxt_ulp_async_events(bp, cmpl);
  2057		return 0;
  2058	}
  2059
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 59287 bytes --]