From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5D053C76196 for ; Fri, 31 Mar 2023 21:18:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230033AbjCaVSq (ORCPT ); Fri, 31 Mar 2023 17:18:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44808 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229529AbjCaVSp (ORCPT ); Fri, 31 Mar 2023 17:18:45 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9280A20C2D for ; Fri, 31 Mar 2023 14:18:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680297524; x=1711833524; h=message-id:date:mime-version:subject:to:cc:references: from:in-reply-to:content-transfer-encoding; bh=0IOZL0fby0KY0WYAWgWUs3I2+2m/X50su7vVCKxY0QY=; b=QFs+4T2Gnof4h02IqYP4jsuhfNKyVk39e4D1doLvym1/pbGCy2gogdBY ky29KZ6/wSxSpBJOOcfDkrWU6reVNIr9r5+uu6cBbIxx8OaRJ2L1noN9l NQdXXlPCDg1jmhdeyxR/QJU9QLPy55sTr2VNtRUAWLrykmfRP/7a1M+N4 1xIgwlbYyrEGrlJCSaueZNdhaaXlF6lKcLcYMFib6bTheJEaMjUND5PPR DCzOKLkHCzCz+7CreJQAo5n2zW22XuFkjglqO4JdQ2JaQGbFWKZDuGta/ MTQIAHByTnUGWlSFcouEHg+HrZj+kxwqWscbpo1dzXEPKsPHYiS/yIuYN A==; X-IronPort-AV: E=McAfee;i="6600,9927,10666"; a="325542768" X-IronPort-AV: E=Sophos;i="5.98,307,1673942400"; d="scan'208";a="325542768" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 Mar 2023 14:18:43 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10666"; a="809169047" X-IronPort-AV: E=Sophos;i="5.98,307,1673942400"; d="scan'208";a="809169047" Received: from djiang5-mobl3.amr.corp.intel.com (HELO [10.212.71.212]) ([10.212.71.212]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 Mar 2023 14:18:43 -0700 Message-ID: <238216f9-6dbf-c3a0-3ba5-066ea9579080@intel.com> Date: Fri, 31 Mar 2023 14:18:42 -0700 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0 Thunderbird/102.9.0 Subject: Re: [PATCH v5 02/12] cxl/memdev: Add support for the Clear Poison mailbox command Content-Language: en-US To: Alison Schofield Cc: Dan Williams , Ira Weiny , Vishal Verma , Ben Widawsky , linux-cxl@vger.kernel.org References: <548e2a175a2f20cdc886297430102ee851d30f26.1679892337.git.alison.schofield@intel.com> From: Dave Jiang In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org On 3/31/23 12:55 PM, Alison Schofield wrote: > On Fri, Mar 31, 2023 at 11:40:01AM -0700, Dave Jiang wrote: >> >> >> On 3/26/23 10:03 PM, alison.schofield@intel.com wrote: >>> From: Alison Schofield >>> >>> CXL devices optionally support the CLEAR POISON mailbox command. Add >>> memdev driver support for clearing poison. >>> >>> Per the CXL Specification (3.0 8.2.9.8.4.3), after receiving a valid >>> clear poison request, the device removes the address from the device's >>> Poison List and writes 0 (zero) for 64 bytes starting at address. If >>> the device cannot clear poison from the address, it returns a permanent >>> media error and -ENXIO is returned to the user. >>> >>> Additionally, and per the spec also, it is not an error to clear poison >>> of an address that is not poisoned. In this case, the device does not >>> overwrite the address and the device does not return an error. >>> >>> If the address is not contained in the device's dpa resource, or is >>> not 64 byte aligned, return -EINVAL without issuing the mbox command. >>> >>> Poison clearing is intended for debug only and will be exposed to >>> userspace through debugfs. Restrict compilation to CONFIG_DEBUG_FS. >>> >>> Implementation note: Although the CXL specification defines the clear >>> command to accept 64 bytes of 'write-data' to be used when clearing >>> the poisoned address, this implementation always uses 0 (zeros) for >>> the write-data. >>> >>> Signed-off-by: Alison Schofield >>> --- >>> drivers/cxl/core/memdev.c | 43 +++++++++++++++++++++++++++++++++++++++ >>> drivers/cxl/cxlmem.h | 7 +++++++ >>> 2 files changed, 50 insertions(+) >>> >>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c >>> index 3b3ac2868848..0e39c3c3fb09 100644 >>> --- a/drivers/cxl/core/memdev.c >>> +++ b/drivers/cxl/core/memdev.c >>> @@ -268,6 +268,49 @@ int cxl_inject_poison(struct device *dev, u64 dpa) >>> } >>> EXPORT_SYMBOL_NS_GPL(cxl_inject_poison, CXL); >>> +int cxl_clear_poison(struct device *dev, u64 dpa) >>> +{ >>> + struct cxl_memdev *cxlmd = to_cxl_memdev(dev); >>> + struct cxl_mbox_clear_poison clear; >>> + struct cxl_mbox_cmd mbox_cmd; >>> + int rc; >>> + >>> + if (!IS_ENABLED(CONFIG_DEBUG_FS)) >>> + return 0; >>> + >>> + down_read(&cxl_dpa_rwsem); >>> + rc = cxl_validate_poison_dpa(cxlmd, dpa); >>> + if (rc) >>> + goto out; >>> + >>> + /* >>> + * In CXL 3.0 Spec 8.2.9.8.4.3, the Clear Poison mailbox command >>> + * is defined to accept 64 bytes of 'write-data', along with the >>> + * address to clear. The device writes the data into the address >>> + * atomically, while clearing poison if the location is marked as >>> + * being poisoned. >>> + * >>> + * Always use '0' for the write-data. >>> + */ >>> + clear = (struct cxl_mbox_clear_poison) { >>> + .address = cpu_to_le64(dpa) >>> + }; >> >> The write_data[] should be 0s in order to clear the poison right? Since >> 'clear' is allocated on the stack, if it's not initialized then it would be >> random garbage in the data. You could just init all 'clear' members when you >> declare the variable at top if you like. > > Declaring like this initializes any unspecified fields to zero. > This is the same initialization used across all the mbox_cmd setups > here and in core/mbox.c. I thought you need to do: clear = (struct cxl_mbox_clear_poison) { .address = cpu_to_le64(dpa), .write_data = { 0 }, }; I didn't think it would initialize the other members to 0 if you omit them? But my simple C code test seems to indicate otherwise. So sorry about the noise. > > Am I using that construct incorrectly? > >> >> DJ >> >>> + >>> + mbox_cmd = (struct cxl_mbox_cmd) { >>> + .opcode = CXL_MBOX_OP_CLEAR_POISON, >>> + .size_in = sizeof(clear), >>> + .payload_in = &clear, >>> + }; >>> + >>> + rc = cxl_internal_send_cmd(cxlmd->cxlds, &mbox_cmd); >>> + >>> +out: >>> + up_read(&cxl_dpa_rwsem); >>> + >>> + return rc; >>> +} >>> +EXPORT_SYMBOL_NS_GPL(cxl_clear_poison, CXL); >>> + >>> static struct attribute *cxl_memdev_attributes[] = { >>> &dev_attr_serial.attr, >>> &dev_attr_firmware_version.attr, >>> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h >>> index 527efef2d700..1d8677ab2306 100644 >>> --- a/drivers/cxl/cxlmem.h >>> +++ b/drivers/cxl/cxlmem.h >>> @@ -607,6 +607,12 @@ struct cxl_mbox_inject_poison { >>> __le64 address; >>> }; >>> +/* Clear Poison CXL 3.0 Spec 8.2.9.8.4.3 */ >>> +struct cxl_mbox_clear_poison { >>> + __le64 address; >>> + u8 write_data[CXL_POISON_LEN_MULT]; >>> +} __packed; >>> + >>> /** >>> * struct cxl_mem_command - Driver representation of a memory device command >>> * @info: Command information as it exists for the UAPI >>> @@ -684,6 +690,7 @@ ssize_t cxl_trigger_poison_list(struct device *dev, >>> struct device_attribute *attr, const char *buf, >>> size_t len); >>> int cxl_inject_poison(struct device *dev, u64 dpa); >>> +int cxl_clear_poison(struct device *dev, u64 dpa); >>> #ifdef CONFIG_CXL_SUSPEND >>> void cxl_mem_active_inc(void);