From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.5 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_PASS,USER_AGENT_MUTT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CB835C282C0 for ; Sun, 27 Jan 2019 07:51:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 85DED214C6 for ; Sun, 27 Jan 2019 07:51:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726597AbfA0HvN (ORCPT ); Sun, 27 Jan 2019 02:51:13 -0500 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:46616 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726280AbfA0HvM (ORCPT ); Sun, 27 Jan 2019 02:51:12 -0500 Received: from pps.filterd (m0098394.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id x0R7mrbT145167 for ; Sun, 27 Jan 2019 02:51:11 -0500 Received: from e06smtp02.uk.ibm.com (e06smtp02.uk.ibm.com [195.75.94.98]) by mx0a-001b2d01.pphosted.com with ESMTP id 2q95q247sj-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Sun, 27 Jan 2019 02:51:11 -0500 Received: from localhost by e06smtp02.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Sun, 27 Jan 2019 07:51:09 -0000 Received: from b06cxnps3074.portsmouth.uk.ibm.com (9.149.109.194) by e06smtp02.uk.ibm.com (192.168.101.132) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Sun, 27 Jan 2019 07:51:06 -0000 Received: from b06wcsmtp001.portsmouth.uk.ibm.com (b06wcsmtp001.portsmouth.uk.ibm.com [9.149.105.160]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id x0R7p5bd35979468 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Sun, 27 Jan 2019 07:51:05 GMT Received: from b06wcsmtp001.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 5E0AFA4062; Sun, 27 Jan 2019 07:51:05 +0000 (GMT) Received: from b06wcsmtp001.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id E373FA405B; Sun, 27 Jan 2019 07:51:04 +0000 (GMT) Received: from rapoport-lnx (unknown [9.148.8.103]) by b06wcsmtp001.portsmouth.uk.ibm.com (Postfix) with ESMTPS; Sun, 27 Jan 2019 07:51:04 +0000 (GMT) Date: Sun, 27 Jan 2019 09:51:03 +0200 From: Mike Rapoport To: Oded Gabbay Cc: gregkh@linuxfoundation.org, linux-kernel@vger.kernel.org, ogabbay@habana.ai Subject: Re: [PATCH 10/15] habanalabs: add device reset support References: <20190123000057.31477-1-oded.gabbay@gmail.com> <20190123000057.31477-11-oded.gabbay@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190123000057.31477-11-oded.gabbay@gmail.com> User-Agent: Mutt/1.5.24 (2015-08-30) X-TM-AS-GCONF: 00 x-cbid: 19012707-0008-0000-0000-000002B68CB9 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 19012707-0009-0000-0000-00002222C839 Message-Id: <20190127075059.GA28461@rapoport-lnx> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:,, definitions=2019-01-27_05:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1901270066 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jan 23, 2019 at 02:00:52AM +0200, Oded Gabbay wrote: > This patch adds support for doing various on-the-fly reset of Goya. > > The driver supports two types of resets: > 1. soft-reset > 2. hard-reset > > Soft-reset is done when the device detects a timeout of a command > submission that was given to the device. The soft-reset process only resets > the engines that are relevant for the submission of compute jobs, i.e. the > DMA channels, the TPCs and the MME. The purpose is to bring the device as > fast as possible to a working state. > > Hard-reset is done in several cases: > 1. After soft-reset is done but the device is not responding > 2. When fatal errors occur inside the device, e.g. ECC error > 3. When the driver is removed > > Hard-reset performs a reset of the entire chip except for the PCI > controller and the PLLs. It is a much longer process then soft-reset but it > helps to recover the device without the need to reboot the Host. > > After hard-reset, the driver will restore the max power attribute and in > case of manual power management, the frequencies that were set. > > This patch also adds two entries to the sysfs, which allows the root user > to initiate a soft or hard reset. > > Signed-off-by: Oded Gabbay > --- > drivers/misc/habanalabs/command_buffer.c | 11 +- > drivers/misc/habanalabs/device.c | 308 +++++++++++++++++++++- > drivers/misc/habanalabs/goya/goya.c | 201 ++++++++++++++ > drivers/misc/habanalabs/goya/goya_hwmgr.c | 18 +- > drivers/misc/habanalabs/habanalabs.h | 35 +++ > drivers/misc/habanalabs/habanalabs_drv.c | 9 +- > drivers/misc/habanalabs/hwmon.c | 4 +- > drivers/misc/habanalabs/irq.c | 31 +++ > drivers/misc/habanalabs/sysfs.c | 120 ++++++++- > 9 files changed, 712 insertions(+), 25 deletions(-) > > diff --git a/drivers/misc/habanalabs/command_buffer.c b/drivers/misc/habanalabs/command_buffer.c > index 535ed6cc5bda..700c6da01188 100644 > --- a/drivers/misc/habanalabs/command_buffer.c > +++ b/drivers/misc/habanalabs/command_buffer.c > @@ -81,9 +81,10 @@ int hl_cb_create(struct hl_device *hdev, struct hl_cb_mgr *mgr, > bool alloc_new_cb = true; > int rc; > > - if (hdev->disabled) { > + if ((hdev->disabled) || ((atomic_read(&hdev->in_reset)) && > + (ctx_id != HL_KERNEL_ASID_ID))) { > dev_warn_ratelimited(hdev->dev, > - "Device is disabled !!! Can't create new CBs\n"); > + "Device is disabled or in reset !!! Can't create new CBs\n"); > rc = -EBUSY; > goto out_err; > } > @@ -187,6 +188,12 @@ int hl_cb_ioctl(struct hl_fpriv *hpriv, void *data) > u64 handle; > int rc; > > + if (hdev->hard_reset_pending) { > + dev_crit_ratelimited(hdev->dev, > + "Device HARD reset pending !!! Please close FD\n"); > + return -ENODEV; > + } Probably this check should be done at the top-level ioctl()? And, what will happen if the devices performs hard reset, but the used keeps the file descriptor open? > + > switch (args->in.op) { > case HL_CB_OP_CREATE: > rc = hl_cb_create(hdev, &hpriv->cb_mgr, args->in.cb_size, > diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c > index ff7b610f18c4..00fde57ce823 100644 > --- a/drivers/misc/habanalabs/device.c > +++ b/drivers/misc/habanalabs/device.c > @@ -188,6 +188,7 @@ static int device_early_init(struct hl_device *hdev) > > mutex_init(&hdev->device_open); > mutex_init(&hdev->send_cpu_message_lock); > + atomic_set(&hdev->in_reset, 0); > atomic_set(&hdev->fd_open_cnt, 0); > > return 0; > @@ -238,6 +239,27 @@ static void set_freq_to_low_job(struct work_struct *work) > usecs_to_jiffies(HL_PLL_LOW_JOB_FREQ_USEC)); > } > > +static void hl_device_heartbeat(struct work_struct *work) > +{ > + struct hl_device *hdev = container_of(work, struct hl_device, > + work_heartbeat.work); > + > + if ((hdev->disabled) || (atomic_read(&hdev->in_reset))) > + goto reschedule; > + > + if (!hdev->asic_funcs->send_heartbeat(hdev)) > + goto reschedule; AFAIU, asic_funcs->send_heartbeat() it set once at init time. The work should not be scheduled it it's NULL, I suppose. > + > + dev_err(hdev->dev, "Device heartbeat failed !!!\n"); > + hl_device_reset(hdev, true, false); > + > + return; > + > +reschedule: > + schedule_delayed_work(&hdev->work_heartbeat, > + usecs_to_jiffies(HL_HEARTBEAT_PER_USEC)); > +} > + > /** > * device_late_init - do late stuff initialization for the habanalabs device > * > @@ -273,6 +295,12 @@ static int device_late_init(struct hl_device *hdev) > schedule_delayed_work(&hdev->work_freq, > usecs_to_jiffies(HL_PLL_LOW_JOB_FREQ_USEC)); > > + if (hdev->heartbeat) { > + INIT_DELAYED_WORK(&hdev->work_heartbeat, hl_device_heartbeat); > + schedule_delayed_work(&hdev->work_heartbeat, > + usecs_to_jiffies(HL_HEARTBEAT_PER_USEC)); > + } > + > hdev->late_init_done = true; > > return 0; > @@ -290,6 +318,8 @@ static void device_late_fini(struct hl_device *hdev) > return; > > cancel_delayed_work_sync(&hdev->work_freq); > + if (hdev->heartbeat) > + cancel_delayed_work_sync(&hdev->work_heartbeat); > > if (hdev->asic_funcs->late_fini) > hdev->asic_funcs->late_fini(hdev); > @@ -397,6 +427,254 @@ int hl_device_resume(struct hl_device *hdev) > return 0; > } > > +static void hl_device_hard_reset_pending(struct work_struct *work) > +{ > + struct hl_device_reset_work *device_reset_work = > + container_of(work, struct hl_device_reset_work, reset_work); > + struct hl_device *hdev = device_reset_work->hdev; > + u16 pending_cnt = HL_PENDING_RESET_PER_SEC; > + struct task_struct *task = NULL; > + > + /* Flush all processes that are inside hl_open */ > + mutex_lock(&hdev->device_open); > + > + while ((atomic_read(&hdev->fd_open_cnt)) && (pending_cnt)) { > + > + pending_cnt--; > + > + dev_info(hdev->dev, > + "Can't HARD reset, waiting for user to close FD\n"); > + ssleep(1); > + } > + > + if (atomic_read(&hdev->fd_open_cnt)) { > + task = get_pid_task(hdev->user_ctx->hpriv->taskpid, > + PIDTYPE_PID); > + if (task) { > + dev_info(hdev->dev, "Killing user processes\n"); > + send_sig(SIGKILL, task, 1); Shouldn't the user get a chance for cleanup? > + msleep(100); > + > + put_task_struct(task); > + } > + } > + > + mutex_unlock(&hdev->device_open); > + > + hl_device_reset(hdev, true, true); > + > + kfree(device_reset_work); > +} > + [ ... ] > diff --git a/drivers/misc/habanalabs/goya/goya_hwmgr.c b/drivers/misc/habanalabs/goya/goya_hwmgr.c > index 866d1774b2e4..9482dbb2e03a 100644 > --- a/drivers/misc/habanalabs/goya/goya_hwmgr.c > +++ b/drivers/misc/habanalabs/goya/goya_hwmgr.c > @@ -38,7 +38,7 @@ static ssize_t mme_clk_show(struct device *dev, struct device_attribute *attr, > struct hl_device *hdev = dev_get_drvdata(dev); > long value; > > - if (hdev->disabled) > + if ((hdev->disabled) || (atomic_read(&hdev->in_reset))) > return -ENODEV; > > value = hl_get_frequency(hdev, MME_PLL, false); > @@ -57,7 +57,7 @@ static ssize_t mme_clk_store(struct device *dev, struct device_attribute *attr, > int rc; > long value; > > - if (hdev->disabled) { > + if ((hdev->disabled) || (atomic_read(&hdev->in_reset))) { There are quite a few of those, maybe split this check to a helper function? > count = -ENODEV; > goto fail; > } > @@ -87,7 +87,7 @@ static ssize_t tpc_clk_show(struct device *dev, struct device_attribute *attr, > struct hl_device *hdev = dev_get_drvdata(dev); > long value; > > - if (hdev->disabled) > + if ((hdev->disabled) || (atomic_read(&hdev->in_reset))) > return -ENODEV; > > value = hl_get_frequency(hdev, TPC_PLL, false); > @@ -106,7 +106,7 @@ static ssize_t tpc_clk_store(struct device *dev, struct device_attribute *attr, > int rc; > long value; > > - if (hdev->disabled) { > + if ((hdev->disabled) || (atomic_read(&hdev->in_reset))) { > count = -ENODEV; > goto fail; > } -- Sincerely yours, Mike.