linux-arm-msm.vger.kernel.org archive mirror
From: Mukesh Ojha <quic_mojha@quicinc.com>
To: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>,
	<agross@kernel.org>, <andersson@kernel.org>,
	<konrad.dybcio@linaro.org>, <corbet@lwn.net>,
	<keescook@chromium.org>, <tony.luck@intel.com>,
	<gpiccoli@igalia.com>, <catalin.marinas@arm.com>,
	<will@kernel.org>, <krzysztof.kozlowski+dt@linaro.org>,
	<robh+dt@kernel.org>, <linus.walleij@linaro.org>,
	<linux-gpio@vger.kernel.org>, <srinivas.kandagatla@linaro.org>
Cc: <linux-arm-msm@vger.kernel.org>,
	<linux-remoteproc@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <linux-hardening@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>,
	<linux-doc@vger.kernel.org>
Subject: Re: [PATCH v3 04/18] soc: qcom: Add Qualcomm minidump kernel driver
Date: Fri, 5 May 2023 11:04:35 +0530
Message-ID: <04ead29c-7fd1-df0d-f313-2fc0edfe9010@quicinc.com>
In-Reply-To: <575a422d-6224-06b7-628c-8487b47882e9@linaro.org>



On 5/4/2023 8:51 PM, Krzysztof Kozlowski wrote:
> On 04/05/2023 14:38, Mukesh Ojha wrote:
>>
>>
>> On 5/4/2023 5:06 PM, Krzysztof Kozlowski wrote:
>>> On 03/05/2023 19:02, Mukesh Ojha wrote:
>>>> Minidump is a best-effort mechanism to collect useful, predefined
>>>> data for first-level debugging on end-user devices running on
>>>> Qualcomm SoCs. It is built on the premise that a System on Chip (SoC),
>>>> or a subsystem of the SoC, crashes due to a range of hardware and
>>>> software bugs. Hence, the ability to collect accurate data is only
>>>> best-effort. The data collected could be invalid or corrupted,
>>>> data collection itself could fail, and so on.
>>>>
>>>> Qualcomm devices in engineering mode provide a mechanism for
>>>> generating full system ramdumps for post-mortem debugging. In some
>>>> cases, however, it is not feasible to capture the entire content of RAM.
>>>> The minidump mechanism provides the means for selecting which regions
>>>> should be included in the ramdump. The solution supports extracting the
>>>> ramdump/minidump either over USB or by storing it to an attached
>>>> storage device.
>>>>
>>>> The core of the minidump feature is part of Qualcomm's boot firmware code.
>>>> It initializes shared memory (SMEM), which is a part of DDR, and
>>>> allocates a small section of it to the minidump table, also called the
>>>> global table of contents (G-ToC). Each subsystem (APSS, ADSP, ...) has
>>>> its own table of segments to be included in the minidump, all
>>>> referenced from a descriptor in SMEM (G-ToC). Each segment/region has
>>>> some details like name, physical address, and size, and it
>>>> could be scattered anywhere in the DDR.
>>>>
>>>> The minidump kernel driver adds the capability to add Linux regions to be
>>>> dumped as part of ramdump collection. It provides appropriate symbols
>>>> to check its enablement and register client regions.
>>>>
>>>> To simplify post-mortem debugging, it creates and maintains an ELF
>>>> header as the first region, which gets updated upon registration
>>>> of a new region.
>>>>
>>>> Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
>>>> ---
>>>>    drivers/soc/qcom/Kconfig         |  14 +
>>>>    drivers/soc/qcom/Makefile        |   1 +
>>>>    drivers/soc/qcom/qcom_minidump.c | 581 +++++++++++++++++++++++++++++++++++++++
>>>>    drivers/soc/qcom/smem.c          |   8 +
>>>>    include/soc/qcom/qcom_minidump.h |  61 +++-
>>>>    5 files changed, 663 insertions(+), 2 deletions(-)
>>>>    create mode 100644 drivers/soc/qcom/qcom_minidump.c
>>>>
>>>> diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
>>>> index a491718..15c931e 100644
>>>> --- a/drivers/soc/qcom/Kconfig
>>>> +++ b/drivers/soc/qcom/Kconfig
>>>> @@ -279,4 +279,18 @@ config QCOM_INLINE_CRYPTO_ENGINE
>>>>    	tristate
>>>>    	select QCOM_SCM
>>>>    
>>>> +config QCOM_MINIDUMP
>>>> +	tristate "QCOM Minidump Support"
>>>> +	depends on ARCH_QCOM || COMPILE_TEST
>>>> +	select QCOM_SMEM
>>>> +	help
>>>> +	  Enablement of the core minidump feature is controlled from the boot
>>>> +	  firmware side, and this config allows Linux to query and manage the
>>>> +	  APPS minidump table.
>>>> +
>>>> +	  Client drivers can register their internal data structures and debug
>>>> +	  messages as part of the minidump region, and when the SoC crashes,
>>>> +	  these selected regions will be dumped instead of the entire DDR.
>>>> +	  This saves a significant amount of time and/or storage space.
>>>> +
>>>>    endmenu
>>>> diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
>>>> index 0f43a88..1ebe081 100644
>>>> --- a/drivers/soc/qcom/Makefile
>>>> +++ b/drivers/soc/qcom/Makefile
>>>> @@ -33,3 +33,4 @@ obj-$(CONFIG_QCOM_RPMPD) += rpmpd.o
>>>>    obj-$(CONFIG_QCOM_KRYO_L2_ACCESSORS) +=	kryo-l2-accessors.o
>>>>    obj-$(CONFIG_QCOM_ICC_BWMON)	+= icc-bwmon.o
>>>>    obj-$(CONFIG_QCOM_INLINE_CRYPTO_ENGINE)	+= ice.o
>>>> +obj-$(CONFIG_QCOM_MINIDUMP) += qcom_minidump.o
>>>> diff --git a/drivers/soc/qcom/qcom_minidump.c b/drivers/soc/qcom/qcom_minidump.c
>>>> new file mode 100644
>>>> index 0000000..d107a86
>>>> --- /dev/null
>>>> +++ b/drivers/soc/qcom/qcom_minidump.c
>>>> @@ -0,0 +1,581 @@
>>>> +// SPDX-License-Identifier: GPL-2.0-only
>>>> +
>>>> +/*
>>>> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved.
>>>> + */
>>>> +
>>>> +#include <linux/elf.h>
>>>> +#include <linux/err.h>
>>>> +#include <linux/errno.h>
>>>> +#include <linux/export.h>
>>>> +#include <linux/init.h>
>>>> +#include <linux/io.h>
>>>> +#include <linux/kernel.h>
>>>> +#include <linux/module.h>
>>>> +#include <linux/platform_device.h>
>>>> +#include <linux/string.h>
>>>> +#include <linux/soc/qcom/smem.h>
>>>> +#include <soc/qcom/qcom_minidump.h>
>>>> +
>>>> +/**
>>>> + * struct minidump_elfhdr - Minidump table elf header
>>>> + * @ehdr: Elf main header
>>>> + * @shdr: Section header
>>>> + * @phdr: Program header
>>>> + * @elf_offset: Section offset in elf
>>>> + * @strtable_idx: String table current index position
>>>> + */
>>>> +struct minidump_elfhdr {
>>>> +	struct elfhdr		*ehdr;
>>>> +	struct elf_shdr		*shdr;
>>>> +	struct elf_phdr		*phdr;
>>>> +	size_t			elf_offset;
>>>> +	size_t			strtable_idx;
>>>> +};
>>>> +
>>>> +/**
>>>> + * struct minidump - Minidump driver private data
>>>> + * @md_gbl_toc	: Global TOC pointer
>>>> + * @md_apss_toc	: Application Subsystem TOC pointer
>>>> + * @md_regions	: High level OS region base pointer
>>>> + * @elf		: Minidump elf header
>>>> + * @dev		: Minidump device
>>>> + */
>>>> +struct minidump {
>>>> +	struct minidump_global_toc	*md_gbl_toc;
>>>> +	struct minidump_subsystem	*md_apss_toc;
>>>> +	struct minidump_region		*md_regions;
>>>> +	struct minidump_elfhdr		elf;
>>>> +	struct device			*dev;
>>>> +};
>>>> +
>>>> +/*
>>>> + * On some older Qualcomm devices, boot firmware statically allocates 300
>>>> + * as the total number of supported regions (including all co-processors) in
>>>> + * the minidump table, out of which Linux was using 201. In future, this
>>>> + * limitation from boot firmware might get removed by allocating the regions
>>>> + * dynamically. So, to stay compatible with older devices, keep the current
>>>> + * limit for Linux at 201.
>>>> + */
>>>> +#define MAX_NUM_ENTRIES	  201
>>>> +#define MAX_STRTBL_SIZE	  (MAX_NUM_ENTRIES * MAX_REGION_NAME_LENGTH)
>>>> +
>>>> +static struct minidump *__md;
>>>
>>> No, no file scope or global scope statics.
>>
>> Sorry, this is done as per the recommendation given here [1], and it
>> matches both the drivers/firmware/qcom_scm.c and drivers/soc/qcom/smem.c
>> implementations.
>>
>> [1]
>> https://lore.kernel.org/lkml/f74dfcde-e59b-a9b3-9bbc-a8de644f6740@linaro.org/
> 
> That's not true. You had the static already in v2, before Srini commented.
> 
> Look:
> https://lore.kernel.org/lkml/1679491817-2498-5-git-send-email-quic_mojha@quicinc.com/
> 
> +static struct minidump minidump;
> +static DEFINE_MUTEX(minidump_lock);
> 
> We do not talk about the names.

I apologize for this.

> 
> 
>>>> +
>>>> +	if (size < sizeof(*mdgtoc) || !mdgtoc->status) {
>>>> +		ret = -EINVAL;
>>>> +		dev_err(&pdev->dev, "minidump table is not initialized: %d\n", ret);
>>>> +		return ret;
>>>> +	}
>>>> +
>>>> +	mutex_lock(&minidump_lock);
>>>> +	md->dev = &pdev->dev;
>>>> +	md->md_gbl_toc = mdgtoc;
>>>
>>> What are you protecting here? It's not possible to have concurrent
>>> access to md, is it?
>>
>> Check qcom_apss_minidump_region_{register/unregister}; it is possible
>> that these APIs get called in parallel with this probe.
> 
> Wait, you say that something can modify local variable md before it is
> assigned to __md? How?

No.

>>
>> I agree, I made a mistake in not protecting __md in the {register} API,
>> though I did protect it in the unregister API in this patch; I have fixed
>> this in a later patch.
> 
> No, you are protecting random things. Nothing will concurrently modify
> md and &pdev->dev in this moment. mdgtoc is allocated above, so it also
> cannot be modified.
> 
> Otherwise show me the hypothetical scenario.

You are correct, it should just protect the assignment.
__md = md;
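
A minimal userspace sketch of that narrower locking, assuming pthreads as a
stand-in for the kernel mutex API. The names md_state, md_init, md_publish,
and md_lookup are hypothetical and not taken from the patch; the sketch only
illustrates that the local object needs no lock until its pointer is
published:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's private data. */
struct md_state {
	int region_count;
};

static pthread_mutex_t md_lock = PTHREAD_MUTEX_INITIALIZER;
static struct md_state *md_global;	/* shared pointer, guarded by md_lock */

/* Initialise the local object without the lock: nobody else can see it yet. */
static void md_init(struct md_state *md)
{
	md->region_count = 0;
}

/* Only the publication of the pointer needs the lock. */
static void md_publish(struct md_state *md)
{
	pthread_mutex_lock(&md_lock);
	md_global = md;
	pthread_mutex_unlock(&md_lock);
}

/* Readers take the same lock before looking at the shared pointer. */
static struct md_state *md_lookup(void)
{
	struct md_state *md;

	pthread_mutex_lock(&md_lock);
	md = md_global;
	pthread_mutex_unlock(&md_lock);
	return md;
}
```

A caller would run md_init() on its local object first and then md_publish()
it; anything that reads the shared pointer goes through md_lookup().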

Thanks
> 
> 
>>
>>>
>>>> +	ret = qcom_minidump_init_apss_subsystem(md);
>>>> +	if (ret) {
>>>> +		dev_err(&pdev->dev, "apss minidump initialization failed: %d\n", ret);
>>>> +		goto unlock;
>>>> +	}
>>>> +
>>>> +	__md = md;
>>>
>>> No. This is a platform device, so it can have multiple instances.
>>
>> It can have only one instance that is created from SMEM driver probe.
> 
> Anyone can instantiate more of them.... how did you solve it?
> 
> 
>>
>>>
>>>> +	/* First entry would be ELF header */
>>>> +	ret = qcom_apss_minidump_add_elf_header();
>>>> +	if (ret) {
>>>> +		dev_err(&pdev->dev, "Failed to add elf header: %d\n", ret);
>>>> +		memset(md->md_apss_toc, 0, sizeof(struct minidump_subsystem));
>>>> +		__md = NULL;
>>>> +	}
>>>> +
>>>> +unlock:
>>>> +	mutex_unlock(&minidump_lock);
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +static int qcom_minidump_remove(struct platform_device *pdev)
>>>> +{
>>>> +	memset(__md->md_apss_toc, 0, sizeof(struct minidump_subsystem));
>>>> +	__md = NULL;
>>>
>>> Don't use __ in variable names. Drop it everywhere.
>>
>> As I said above, this is being followed in other drivers, so I followed
>> it here as per the recommendation.
>>
>> Let @srini come back on this.
> 
> Which part of coding style recommends __ for driver code?

Will fix this.

> 
>>
>>>
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static struct platform_driver qcom_minidump_driver = {
>>>> +	.probe = qcom_minidump_probe,
>>>> +	.remove = qcom_minidump_remove,
>>>> +	.driver  = {
>>>> +		.name = "qcom-minidump",
>>>> +	},
>>>> +};
>>>> +
>>>> +module_platform_driver(qcom_minidump_driver);
>>>> +
>>>> +MODULE_DESCRIPTION("Qualcomm APSS minidump driver");
>>>> +MODULE_LICENSE("GPL v2");
>>>> +MODULE_ALIAS("platform:qcom-minidump");
>>>> diff --git a/drivers/soc/qcom/smem.c b/drivers/soc/qcom/smem.c
>>>> index 6be7ea9..d459656 100644
>>>> --- a/drivers/soc/qcom/smem.c
>>>> +++ b/drivers/soc/qcom/smem.c
>>>> @@ -279,6 +279,7 @@ struct qcom_smem {
>>>>    
>>>>    	u32 item_count;
>>>>    	struct platform_device *socinfo;
>>>> +	struct platform_device *minidump;
>>>>    	struct smem_ptable *ptable;
>>>>    	struct smem_partition global_partition;
>>>>    	struct smem_partition partitions[SMEM_HOST_COUNT];
>>>> @@ -1151,12 +1152,19 @@ static int qcom_smem_probe(struct platform_device *pdev)
>>>>    	if (IS_ERR(smem->socinfo))
>>>>    		dev_dbg(&pdev->dev, "failed to register socinfo device\n");
>>>>    
>>>> +	smem->minidump = platform_device_register_data(&pdev->dev, "qcom-minidump",
>>>> +						      PLATFORM_DEVID_NONE, NULL,
>>>> +						      0);
>>>> +	if (IS_ERR(smem->minidump))
>>>> +		dev_dbg(&pdev->dev, "failed to register minidump device\n");
>>>> +
>>>>    	return 0;
>>>>    }
>>>>    
>>>>    static int qcom_smem_remove(struct platform_device *pdev)
>>>>    {
>>>>    	platform_device_unregister(__smem->socinfo);
>>>> +	platform_device_unregister(__smem->minidump);
>>>
>>> Wrong order. You registered first socinfo, right?
>>
>> Any order is fine here, they are not dependent.
>> But, will fix this.
> 
> No, the order is always reversed from allocation. It does not matter if
> they are dependent or not.

Ok
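
To illustrate the reverse-order rule, here is a small userspace sketch, not
kernel code. child_register(), child_unregister(), parent_probe(),
parent_remove(), and the teardown log are hypothetical stand-ins for the
platform_device_register_data()/platform_device_unregister() pairing in
smem.c:

```c
#include <assert.h>
#include <string.h>

/* Records the order in which children are torn down. */
static char teardown_log[64];
static int registered;	/* number of children currently registered */

static void child_register(void)
{
	registered++;
}

static void child_unregister(const char *name)
{
	strcat(teardown_log, name);
	strcat(teardown_log, ";");
	registered--;
}

/* Registration order: socinfo first, then minidump. */
static void parent_probe(void)
{
	child_register();	/* socinfo */
	child_register();	/* minidump */
}

/* Teardown mirrors probe: the last child registered goes first. */
static void parent_remove(void)
{
	child_unregister("minidump");
	child_unregister("socinfo");
}
```

Keeping remove as the mirror image of probe means the rule holds even if a
dependency between the two children is introduced later.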

> 
> Best regards,
> Krzysztof
> 

-- Mukesh

Thread overview: 62+ messages
2023-05-03 17:02 [PATCH v3 00/18] Add basic Minidump kernel driver support Mukesh Ojha
2023-05-03 17:02 ` [PATCH v3 01/18] remoteproc: qcom: Expand MD_* as MINIDUMP_* Mukesh Ojha
2023-05-04 12:35   ` Krzysztof Kozlowski
2023-05-03 17:02 ` [PATCH v3 02/18] remoteproc: qcom: Move minidump specific data to qcom_minidump.h Mukesh Ojha
2023-05-04 11:38   ` Krzysztof Kozlowski
2023-05-04 11:58     ` Mukesh Ojha
2023-05-04 12:03       ` Krzysztof Kozlowski
2023-05-04 12:26         ` Mukesh Ojha
2023-05-04 12:36           ` Krzysztof Kozlowski
2023-05-04 12:57             ` Mukesh Ojha
2023-05-04 15:16               ` Krzysztof Kozlowski
2023-05-03 17:02 ` [PATCH v3 03/18] docs: qcom: Add qualcomm minidump guide Mukesh Ojha
2023-05-08 10:01   ` Bagas Sanjaya
2023-05-25 16:00     ` Mukesh Ojha
2023-05-13 18:46   ` Randy Dunlap
2023-05-25 15:59     ` Mukesh Ojha
2023-05-03 17:02 ` [PATCH v3 04/18] soc: qcom: Add Qualcomm minidump kernel driver Mukesh Ojha
2023-05-04 11:36   ` Krzysztof Kozlowski
2023-05-04 12:38     ` Mukesh Ojha
2023-05-04 15:21       ` Krzysztof Kozlowski
2023-05-04 16:34         ` Krzysztof Kozlowski
2023-05-08  7:10           ` Mukesh Ojha
2023-05-09  7:11             ` Krzysztof Kozlowski
2023-05-28 11:29               ` Mukesh Ojha
2023-05-14  4:16             ` Trilok Soni
2023-05-05  5:34         ` Mukesh Ojha [this message]
2023-06-02 10:43     ` Mukesh Ojha
2023-05-03 17:02 ` [PATCH v3 05/18] soc: qcom: minidump: Add pending region registration support Mukesh Ojha
2023-05-03 17:02 ` [PATCH v3 06/18] soc: qcom: minidump: Add update region support Mukesh Ojha
2023-05-04 11:40   ` Krzysztof Kozlowski
2023-05-03 17:02 ` [PATCH v3 07/18] arm64: defconfig: Enable Qualcomm minidump driver Mukesh Ojha
2023-05-04 11:23   ` Krzysztof Kozlowski
2023-05-04 11:45     ` Mukesh Ojha
2023-05-04 12:32       ` Krzysztof Kozlowski
2023-05-04 14:43         ` Mukesh Ojha
2023-05-04 15:24           ` Krzysztof Kozlowski
2023-05-03 17:02 ` [PATCH v3 08/18] remoterproc: qcom: refactor to leverage exported minidump symbol Mukesh Ojha
2023-05-03 17:02 ` [PATCH v3 09/18] soc: qcom: Add qcom's pstore minidump driver support Mukesh Ojha
2023-05-04 15:35   ` Krzysztof Kozlowski
2023-05-09 16:06   ` Luca Stefani
2023-05-16 20:48     ` Kees Cook
2023-05-03 17:02 ` [PATCH v3 10/18] dt-bindings: reserved-memory: Add qcom,ramoops-minidump binding Mukesh Ojha
2023-05-04 11:22   ` Krzysztof Kozlowski
2023-05-03 17:02 ` [PATCH v3 11/18] arm64: dts: qcom: sm8450: Add Qualcomm ramoops minidump node Mukesh Ojha
2023-05-04  7:14   ` Konrad Dybcio
2023-05-04 11:26   ` Krzysztof Kozlowski
2023-05-03 17:02 ` [PATCH v3 12/18] soc: qcom: Register pstore frontend region with minidump Mukesh Ojha
2023-05-09 15:45   ` Luca Stefani
2023-05-16 20:50   ` Kees Cook
2023-05-03 17:02 ` [PATCH v3 13/18] arm64: defconfig: Enable Qualcomm pstore minidump client driver Mukesh Ojha
2023-05-04 11:23   ` Krzysztof Kozlowski
2023-05-03 17:02 ` [PATCH v3 14/18] firmware: qcom_scm: provide a read-modify-write function Mukesh Ojha
2023-05-18 18:48   ` Trilok Soni
2023-05-03 17:02 ` [PATCH v3 15/18] pinctrl: qcom: Use qcom_scm_io_update_field() Mukesh Ojha
2023-05-03 17:02 ` [PATCH v3 16/18] firmware: scm: Modify only the download bits in TCSR register Mukesh Ojha
2023-05-03 17:02 ` [PATCH v3 17/18] firmware: qcom_scm: Refactor code to support multiple download mode Mukesh Ojha
2023-05-03 17:02 ` [PATCH v3 18/18] firmware: qcom_scm: Add multiple download mode support Mukesh Ojha
2023-05-04 11:26 ` [PATCH v3 00/18] Add basic Minidump kernel driver support Krzysztof Kozlowski
2023-07-15 22:13 ` (subset) " Bjorn Andersson
2023-07-17  1:15   ` Mathieu Poirier
2023-07-17  8:02     ` Krzysztof Kozlowski
2023-07-17 16:21     ` Bjorn Andersson
