Re: [PATCH V2 02/11] cxl/mem: Implement Get Event Records command

From: Dan Williams <dan.j.williams@intel.com>
To: Ira Weiny <ira.weiny@intel.com>, Dan Williams <dan.j.williams@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Alison Schofield <alison.schofield@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	"Ben Widawsky" <bwidawsk@kernel.org>,
	Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	"Dave Jiang" <dave.jiang@intel.com>,
	<linux-kernel@vger.kernel.org>, <linux-cxl@vger.kernel.org>
Subject: Re: [PATCH V2 02/11] cxl/mem: Implement Get Event Records command
Date: Sat, 3 Dec 2022 13:33:43 -0800	[thread overview]
Message-ID: <638bc0b71b5b_18fe029412@dwillia2-mobl3.amr.corp.intel.com.notmuch> (raw)
In-Reply-To: <Y4pyYvbaSF5DpFGG@iweiny-desk3>

Ira Weiny wrote:
> On Thu, Dec 01, 2022 at 05:39:12PM -0800, Dan Williams wrote:
> > ira.weiny@ wrote:
> > > From: Ira Weiny <ira.weiny@intel.com>
> > > 
> 
> [snip]
> 
> > >  
> > > +#define CREATE_TRACE_POINTS
> > > +#include <trace/events/cxl.h>
> > > +
> > >  #include "core.h"
> > >  
> > >  static bool cxl_raw_allow_all;
> > > @@ -48,6 +51,7 @@ static struct cxl_mem_command cxl_mem_commands[CXL_MEM_COMMAND_ID_MAX] = {
> > >  	CXL_CMD(RAW, CXL_VARIABLE_PAYLOAD, CXL_VARIABLE_PAYLOAD, 0),
> > >  #endif
> > >  	CXL_CMD(GET_SUPPORTED_LOGS, 0, CXL_VARIABLE_PAYLOAD, CXL_CMD_FLAG_FORCE_ENABLE),
> > > +	CXL_CMD(GET_EVENT_RECORD, 1, CXL_VARIABLE_PAYLOAD, 0),
> > >  	CXL_CMD(GET_FW_INFO, 0, 0x50, 0),
> > >  	CXL_CMD(GET_PARTITION_INFO, 0, 0x20, 0),
> > >  	CXL_CMD(GET_LSA, 0x8, CXL_VARIABLE_PAYLOAD, 0),
> > 
> > Similar to this patch:
> > 
> > https://lore.kernel.org/linux-cxl/166993221008.1995348.11651567302609703175.stgit@dwillia2-xfh.jf.intel.com/
> > 
> > CXL_MEM_COMMAND_ID_GET_EVENT_RECORD, should be added to the "always
> > kernel" / cxlds->exclusive_cmds mask.
> 
> Done for all the commands.  I'll rebase as well before sending this out.
> 
> > 
> > > @@ -704,6 +708,106 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
> > >  }
> > >  EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
> > >  
> > > +static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> > > +				    enum cxl_event_log_type type)
> > > +{
> > > +	struct cxl_get_event_payload *payload;
> > > +	u16 nr_rec;
> > > +
> > > +	mutex_lock(&cxlds->event_buf_lock);
> > > +
> > > +	payload = cxlds->event_buf;
> > > +
> > > +	do {
> > > +		u8 log_type = type;
> > > +		int rc;
> > > +
> > > +		rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_GET_EVENT_RECORD,
> > > +				       &log_type, sizeof(log_type),
> > > +				       payload, cxlds->payload_size);
> > > +		if (rc) {
> > > +			dev_err(cxlds->dev, "Event log '%s': Failed to query event records : %d",
> > > +				cxl_event_log_type_str(type), rc);
> > > +			goto unlock_buffer;
> > > +		}
> > > +
> > > +		nr_rec = le16_to_cpu(payload->record_count);
> > > +		if (trace_cxl_generic_event_enabled()) {
> > 
> > This feels like a premature micro-optimization as none of this code is
> > fast path and it is dwarfed by the cost of executing the mailbox
> > command. I started with trying to reduce the 80 column collision
> > pressure, but then stepped back even further and asked, why?
> 
> Because Steven told me to.  :-(  I should have been smarter than that.

You did the right thing, I failed to jump in sooner on this set.

> 
> > 
> > > +			int i;
> > > +
> > > +			for (i = 0; i < nr_rec; i++)
> > > +				trace_cxl_generic_event(dev_name(cxlds->dev),
> > > +							type,
> > > +							&payload->records[i]);
> > 
> > As far as I can tell trace_cxl_generic_event() always expects a
> > device-name as its first argument. So why not enforce that with
> > type-safety?  I.e. I think trace_cxl_generic_event() should take a
> > "struct device *", not a string unless it is really the case that any
> > old string will do as the first argument to the trace event. Otherwise
> > the trace point can do "__string(dev_name, dev_name(dev))", and mandate
> > that callers pass devices.
> 
> From a trace point view 'any old string' will do.  There was nothing else the
> trace needed from struct device so I skipped it.

I'd prefer more fine-grained type safety wherever possible.

> 
> [snip]
> 
> > > +
> > > +/**
> > > + * cxl_mem_get_event_records - Get Event Records from the device
> > > + * @cxlds: The device data for the operation
> > > + *
> > > + * Retrieve all event records available on the device and report them as trace
> > > + * events.
> > > + *
> > > + * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
> > > + */
> > > +void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
> > > +{
> > > +	u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> > > +
> > > +	dev_dbg(cxlds->dev, "Reading event logs: %x\n", status);
> > > +
> > > +	if (!cxlds->event_buf) {
> > > +		cxlds->event_buf = alloc_event_buf(cxlds);
> > > +		if (WARN_ON_ONCE(!cxlds->event_buf))
> > > +			return;
> > > +	}
> > 
> > What's the point of having an event_buf_lock if event_buf is reallocated
> > every call?
> 
> This is only called on start up.

cxl_mem_get_event_records() is called all the time. The place to
allocate buffers attached to 'struct cxl_dev_state' at start up is
cxl_dev_state_create(), or sometime after cxl_enumerate_cmds() if you
want to wait and see if the device supports events and CXL _OSC says the
driver can drive events.

> > Just allocate event_buf once at the beginning of time during init if the
> > device supports event log retrieval, and fail the driver load if that
> > allocation fails. No runtime WARN() for memory allocation.
> 
> It was.  I'll make that more clear in the next series.
> 
> > 
> > I notice this patch does not clear events, I trust that comes later in
> > the series, but I think it belongs here to make this patch a complete
> > standalone thought.
> 
> Squashed.  But it does make for a large patch.  Which I'm not a fan of for
> review.  Lucky that now we have a lot of review on the parts.
> 
> > 
> > > +	if (status & CXLDEV_EVENT_STATUS_INFO)
> > > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_INFO);
> > > +	if (status & CXLDEV_EVENT_STATUS_WARN)
> > > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);
> > > +	if (status & CXLDEV_EVENT_STATUS_FAIL)
> > > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
> > > +	if (status & CXLDEV_EVENT_STATUS_FATAL)
> > > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FATAL);
> > 
> > This retrieval order should be flipped. If there is a FATAL pending I
> > expect a monitor wants that first and would be happy to parse the INFO
> > later. I would go so far as to say that if the INFO logger is looping
> > and a FATAL comes in the driver should get that out first before going
> > back for more INFO logs. That would mean executing Clear Events and
> > looping through the logs by priority until all the status bits fall
> > silent inside cxl_mem_get_records_log().
> 
> I'll flip them.  And determine if this is really what we want to do for the
> irq.
> 
> The issue with the irq handling calling a single function which checks all
> status is that we may end up with some odd interrupts doing nothing depending
> on racing etc.

If an event handler wakes and reads 0-status bits because another
handler did it then return IRQ_HANDLED. You'll have this problem whether
you have a central function or not, because there's only one status
register for multiple sources.