From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 27 Jul 2023 15:50:06 +0530
From: J Dhanasekar
To: "Venkatesh, Supreeth"
Message-Id: <18996dce62b.69ab145b1143324.1441707262676750389@velankanigroup.com>
References: <07621845-19a4-0568-be0e-f556ba40b813@amd.com>
 <255d7c9a-ce17-bbe1-7312-990d0221cf36@amd.com>
 <65515592-8f77-1c8f-731c-165fb833344b@amd.com>
 <71a122a9-07a9-06a8-ee1a-dd108db63df3@amd.com>
 <18977ff7cd7.59a883fc562150.7689391317426675156@velankanigroup.com>
 <18987ffeff9.35c4bda1801937.8894247920197462243@velankanigroup.com>
 <1898d2b2d0e.4ac15546926284.5918723584994850422@velankanigroup.com>
Subject: RE: [RFC] BMC RAS Feature
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_Part_3582874_321889682.1690453206575"
List-Id: Development list for OpenBMC
Cc: Lei Yu, Michael Shen, openbmc, dhruvaraj S, Brad Bishop, Ed Tanous, "Dhandapani, Abinaya"

------=_Part_3582874_321889682.1690453206575
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit

Hi Supreeth,

Thanks for the info.

-dhanasekar

---- On Tue, 25 Jul 2023 19:32:59 +0530 Venkatesh, Supreeth wrote ---

[AMD Official Use Only - General]

Hi Dhanasekar,

The algorithms and steps for implementing these functionalities (SOL, PostCode, ...) will be the same.

Thanks,
Supreeth Venkatesh
System Manageability Architect | AMD
Server Software

From: J Dhanasekar
Sent: Tuesday, July 25, 2023 8:09 AM
To: Venkatesh, Supreeth
Cc: Lei Yu; Michael Shen; openbmc; dhruvaraj S; Brad Bishop; Ed Tanous; Dhandapani, Abinaya
Subject: RE: [RFC] BMC RAS Feature

Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.

Hi Supreeth,

I am working on SP5 servers too. SP5 servers have an Aspeed 2600 chip with the BMC off the board, whereas EthanolX/DaytonaX have a 2500 with the BMC on the board.
Will the algorithms or steps for implementing functionalities (SOL, PostCode, PSU, ...) remain the same?

Thanks,
Dhanasekar

---- On Mon, 24 Jul 2023 19:44:52 +0530 Venkatesh, Supreeth wrote ---

Hi Dhanasekar,

The DaytonaX and EthanolX platforms were only OpenBMC PoCs with limited functionality.
We are in the process of upstreaming new AMD CRBs with OpenBMC, which has all the functionality you mention below.
The public instance of the staging/intermediary repository before upstream is here:
https://github.com/AMDESE/OpenBMC

Thanks,
Supreeth Venkatesh

From: J Dhanasekar
Sent: Monday, July 24, 2023 8:04 AM
To: Venkatesh, Supreeth
Cc: Lei Yu; Zane Shelley; Michael Shen; openbmc; dhruvaraj S; Brad Bishop; Ed Tanous; Dhandapani, Abinaya
Subject: RE: [RFC] BMC RAS Feature

Hi Supreeth,

Thanks for the info. We had hoped that DaytonaX would be upstreamed. Unfortunately, it is not available.
We need to enable the SOL, POST code, and PSU features on Daytona. Will we get support for enabling these features, or is there any reference implementation available for AMD boards?

Thanks,
Dhanasekar

---- On Fri, 21 Jul 2023 19:33:41 +0530 Venkatesh, Supreeth wrote ---

Hi Dhanasekar,

It is supported for the EPYC Genoa family and beyond at this time.
Daytona uses the EPYC Milan family, for which there is no support.

Thanks,
Supreeth Venkatesh

From: J Dhanasekar
Sent: Friday, July 21, 2023 5:30 AM
To: Venkatesh, Supreeth
Cc: Zane Shelley; Lei Yu; Michael Shen; openbmc; dhruvaraj S; Brad Bishop; Ed Tanous; Dhandapani, Abinaya
Subject: Re: [RFC] BMC RAS Feature

Hi Supreeth Venkatesh,

Does this RAS feature work on the Daytona platform? I have been working on OpenBMC development for the DaytonaX platform. If this RAS feature works on the Daytona platform, I will include it in my project.

Please provide your suggestions.

Thanks,
Dhanasekar

---- On Mon, 03 Apr 2023 22:06:24 +0530 Supreeth Venkatesh wrote ---

On 3/23/23 13:57, Zane Shelley wrote:
> On 2023-03-22 19:07, Supreeth Venkatesh wrote:
>> On 3/22/23 02:10, Lei Yu wrote:
>>>>> On Tue, 21 Mar 2023 at 20:38, Supreeth Venkatesh wrote:
>>>>>
>>>>>     On 3/21/23 05:40, Patrick Williams wrote:
>>>>>     > On Tue, Mar 21, 2023 at 12:14:45AM -0500, Supreeth Venkatesh wrote:
>>>>>     >
>>>>>     >> #### Alternatives Considered
>>>>>     >>
>>>>>     >> In-band mechanisms using System Management Mode (SMM) exist.
>>>>>     >>
>>>>>     >> However, the out-of-band method to gather RAS data is processor specific.
>>>>>     >>
>>>>>     > How does this compare with existing implementations in
>>>>>     > phosphor-debug-collector?
>>>>>     Thanks for your feedback. See below.
>>>>>     > I believe there was some attempt to extend
>>>>>     > P-D-C previously to handle Intel's crashdump behavior.
>>>>>     Intel's crashdump interface uses com.intel.crashdump.
>>>>>     We have implemented com.amd.crashdump based on that reference.
>>>>>     However, can this be made generic?
>>>>>
>>>>>     PoC below:
>>>>>
>>>>>     busctl tree com.amd.crashdump
>>>>>
>>>>>     └─/com
>>>>>       └─/com/amd
>>>>>         └─/com/amd/crashdump
>>>>>           ├─/com/amd/crashdump/0
>>>>>           ├─/com/amd/crashdump/1
>>>>>           ├─/com/amd/crashdump/2
>>>>>           ├─/com/amd/crashdump/3
>>>>>           ├─/com/amd/crashdump/4
>>>>>           ├─/com/amd/crashdump/5
>>>>>           ├─/com/amd/crashdump/6
>>>>>           ├─/com/amd/crashdump/7
>>>>>           ├─/com/amd/crashdump/8
>>>>>           └─/com/amd/crashdump/9
>>>>>
>>>>>     > The repository currently handles IBM's processors, I think, or
>>>>>     > maybe that is covered by openpower-debug-collector.
>>>>>     >
>>>>>     > In any case, I think you should look at the existing D-Bus interfaces
>>>>>     > (and associated Redfish implementation) of these repositories and
>>>>>     > determine if you can use those approaches (or document why not).
>>>>>     I could not find an existing D-Bus interface for RAS in
>>>>>     xyz/openbmc_project/.
>>>>>     It would be helpful if you could point me to it.
>>>>>
>>>>> There is an interface for the dumps generated from the host, which can
>>>>> be used for these kinds of dumps:
>>>>> https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Dump/Entry/System.interface.yaml
>>>>>
>>>>> The fault log also provides similar dumps:
>>>>> https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Dump/Entry/FaultLog.interface.yaml
>>>>>
>>>> Thanks, Dhruvaraj. The interface looks useful for the purpose. However,
>>>> the current bmcweb implementation references
>>>> https://github.com/openbmc/bmcweb/blob/master/redfish-core/lib/log_services.hpp
>>>>
>>>> [com.intel.crashdump]
>>>> constexpr char const* crashdumpPath = "/com/intel/crashdump";
>>>>
>>>> constexpr char const* crashdumpInterface = "com.intel.crashdump";
>>>> constexpr char const* crashdumpObject = "com.intel.crashdump";
>>>>
>>>> https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Dump/Entry/System.interface.yaml
>>>>
>>>> or
>>>> https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Dump/Entry/FaultLog.interface.yaml
>>>>
>>>> Is it exercised in Redfish log services?
>>> In our practice, a plugin `tools/dreport.d/plugins.d/acddump` is added
>>> to copy the crashdump JSON file to the dump tarball.
>>> The crashdump tool (Intel or AMD) could trigger a dump after the
>>> crashdump is completed, and then we could get a dump entry containing
>>> the crashdump.
>> Thanks, Lei Yu, for your input. We are using Redfish to retrieve the
>> CPER binary file, which can then be passed through a plugin/script for
>> detailed analysis.
>> In any case, irrespective of which D-Bus interface we use, we need a
>> repository which will gather data from the AMD processor via APML as per
>> AMD design.
>> APML spec: https://www.amd.com/system/files/TechDocs/57019-A0-PUB_3.00.zip
>> Can someone please help create a bmc-ras or amd-debug-collector
>> repository, as there are instances of the openpower-debug-collector
>> repository used for OpenPOWER systems?
>>>
>>> --
>>> BRs,
>>> Lei YU
> I am interested in possibly standardizing some of this. IBM POWER has
> several related components. openpower-hw-diags is a service that will
> listen for the hardware interrupts via a GPIO pin. When an error is
> detected, it will use openpower-libhei to query hardware registers to
> determine what happened. Based on that information, openpower-hw-diags
> will generate a PEL, which is an extended log in phosphor-logging that
> is used to tell service what to replace if necessary. Afterward,
> openpower-hw-diags will initiate openpower-debug-collector, which
> gathers a significant amount of data from the hardware for additional
> debug when necessary. I wrote openpower-libhei to be fairly agnostic. It
> uses data files (currently XML, but moving to JSON) to define register
> addresses and rules for isolation. openpower-hw-diags is fairly POWER
> specific, but I can see some parts can be made generic. Dhruv would have
> to help with openpower-debug-collector.
Thank you. Let's collaborate on standardizing some aspects of it.
>
> Regarding creation of a new repository, I think we'll need to have some
> more collaboration to determine the scope before creating it. It
> certainly sounds like we are doing similar things, but we need to
> determine if enough can be abstracted to make it worth our time.
I have put in a request here:
https://github.com/openbmc/technical-oversight-forum/issues/24
Please chime in.

------=_Part_3582874_321889682.1690453206575
Content-Type: multipart/related; boundary="----=_Part_3582875_650366529.1690453206594"

------=_Part_3582875_650366529.1690453206594
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
------=_Part_3582875_650366529.1690453206594--
------=_Part_3582874_321889682.1690453206575--