linux-mtd.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 00/11] pstore: mtd: support crash log to block and mtd device
@ 2020-03-25  8:54 WeiXiong Liao
  2020-03-25  8:54 ` [PATCH v3 01/11] pstore/zone: a common layer to manage storage as zones WeiXiong Liao
                   ` (11 more replies)
  0 siblings, 12 replies; 17+ messages in thread
From: WeiXiong Liao @ 2020-03-25  8:54 UTC (permalink / raw)
  To: Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, Mauro Carvalho Chehab, Rob Herring,
	David S. Miller, Greg Kroah-Hartman, Jonathan Cameron
  Cc: WeiXiong Liao, linux-mtd, linux-kernel, linux-doc

Why do we need to log to block (mtd) device?
1. Most embedded intelligent equipment have no persistent ram, which
   increases costs. We perfer to cheaper solutions, like block devices.
2. Do not any equipment have battery, which means that it lost all data
   on general ram if power failure. Pstore has little to do for these
   equipments.

Why do we need mtdpstore instead of mtdoops?
1. repetitive jobs between pstore and mtdoops
   Both of pstore and mtdoops do the same jobs that store panic/oops log.
2. do what a driver should do
   To me, a driver should provide methods instead of policies. What MTD
   should do is to provide read/write/erase operations, geting rid of codes
   about chunk management, kmsg dumper and configuration.
3. enhanced feature
   Not only store log, but also show it as files.
   Not only log, but also trigger time and trigger count.
   Not only panic/oops log, but also log recorder for pmsg, console and
   ftrace in the future.

Before upstream submission, pstore/blk is tested on arch ARM and x84_64,
block device and mtd device, built as modules and in kernel. Here are the
details:

	https://github.com/gmpy/articles/blob/master/pstore/Test-Pstore-Block.md

[PATCH v3]:
1. patch 1~10: a lot of improvements according to Kees Cook <keescook@chromium.org>
               including rename module, typo, reorder, rewrite document, bugs
			   and so on.
2. patch 11: rename funtions of pstore/blk and update document.
[PATCH v2]:
1. fix syntax error in documents. Thank Randy Dunlap <rdunlap@infradead.org>
2. replace pr_* with dev_* for mtdpstore.
   Thank Vignesh Raghavendra <vigneshr@ti.com>
3. improve mtdpstore. Thank Miquel Raynal <mraynal@kernel.org>
[PATCH v1]:
1. fix errors and warnings reported by kbuild test robot.

WeiXiong Liao (11):
  pstore/zone: a common layer to manage storage as zones
  pstore/blk: new support logger for block devices
  pstore/blk: respect for driver to pick pstore front-ends
  pstore/blk: pstore/zone: support pmsg recorder
  pstore/blk: blkoops: support console recorder
  pstore/blk: blkoops: support ftrace recorder
  Documentation: create document for pstore/blk
  pstore/zone: skip broken zone for MTD device
  pstore/blk: a way to get user configure about pstore front-ends.
  pstore/zone: pstore/blk: support non-block devices
  mtd: new support oops logger based on pstore/blk

 Documentation/admin-guide/pstore-block.rst |  237 +++++
 MAINTAINERS                                |    1 +
 drivers/mtd/Kconfig                        |   10 +
 drivers/mtd/Makefile                       |    1 +
 drivers/mtd/mtdpstore.c                    |  564 +++++++++++
 fs/pstore/Kconfig                          |  107 ++
 fs/pstore/Makefile                         |    4 +
 fs/pstore/platform.c                       |    3 +-
 fs/pstore/pstore_blk.c                     |  480 +++++++++
 fs/pstore/pstore_zone.c                    | 1511 ++++++++++++++++++++++++++++
 include/linux/pstore.h                     |    1 +
 include/linux/pstore_blk.h                 |   77 ++
 include/linux/pstore_zone.h                |   60 ++
 13 files changed, 3055 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/admin-guide/pstore-block.rst
 create mode 100644 drivers/mtd/mtdpstore.c
 create mode 100644 fs/pstore/pstore_blk.c
 create mode 100644 fs/pstore/pstore_zone.c
 create mode 100644 include/linux/pstore_blk.h
 create mode 100644 include/linux/pstore_zone.h

-- 
1.9.1


______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v3 01/11] pstore/zone: a common layer to manage storage as zones
  2020-03-25  8:54 [PATCH v3 00/11] pstore: mtd: support crash log to block and mtd device WeiXiong Liao
@ 2020-03-25  8:54 ` WeiXiong Liao
  2020-04-14  1:42   ` WeiXiong Liao
  2020-03-25  8:54 ` [PATCH v3 02/11] pstore/blk: new support logger for block devices WeiXiong Liao
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 17+ messages in thread
From: WeiXiong Liao @ 2020-03-25  8:54 UTC (permalink / raw)
  To: Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, Mauro Carvalho Chehab, Rob Herring,
	David S. Miller, Greg Kroah-Hartman, Jonathan Cameron
  Cc: WeiXiong Liao, linux-mtd, linux-kernel, linux-doc

This is a pre patch to support pstore/blk.

Pstore/zone provides the zones management of storage. It's designed
to work for pstore/blk and pstore/ram in the future.

Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com>
---
 fs/pstore/Kconfig           |   7 +
 fs/pstore/Makefile          |   2 +
 fs/pstore/platform.c        |   3 +-
 fs/pstore/pstore_zone.c     | 989 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/pstore.h      |   1 +
 include/linux/pstore_zone.h |  44 ++
 6 files changed, 1045 insertions(+), 1 deletion(-)
 create mode 100644 fs/pstore/pstore_zone.c
 create mode 100644 include/linux/pstore_zone.h

diff --git a/fs/pstore/Kconfig b/fs/pstore/Kconfig
index 8f0369aad22a..5ad4ab68ac4f 100644
--- a/fs/pstore/Kconfig
+++ b/fs/pstore/Kconfig
@@ -153,3 +153,10 @@ config PSTORE_RAM
 	  "ramoops.ko".
 
 	  For more information, see Documentation/admin-guide/ramoops.rst.
+
+config PSTORE_ZONE
+	tristate
+	depends on PSTORE
+	help
+	  The common layer for pstore/blk (and pstore/ram in the future)
+	  to manager storage as zones.
diff --git a/fs/pstore/Makefile b/fs/pstore/Makefile
index 967b5891f325..94f3631c80ce 100644
--- a/fs/pstore/Makefile
+++ b/fs/pstore/Makefile
@@ -12,3 +12,5 @@ pstore-$(CONFIG_PSTORE_PMSG)	+= pmsg.o
 
 ramoops-objs += ram.o ram_core.o
 obj-$(CONFIG_PSTORE_RAM)	+= ramoops.o
+
+obj-$(CONFIG_PSTORE_ZONE)	+= pstore_zone.o
diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index d896457e7c11..80b4728c702f 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -129,7 +129,7 @@ enum pstore_type_id pstore_name_to_type(const char *name)
 }
 EXPORT_SYMBOL_GPL(pstore_name_to_type);
 
-static const char *get_reason_str(enum kmsg_dump_reason reason)
+const char *get_reason_str(enum kmsg_dump_reason reason)
 {
 	switch (reason) {
 	case KMSG_DUMP_PANIC:
@@ -148,6 +148,7 @@ static const char *get_reason_str(enum kmsg_dump_reason reason)
 		return "Unknown";
 	}
 }
+EXPORT_SYMBOL_GPL(get_reason_str);
 
 /*
  * Should pstore_dump() wait for a concurrent pstore_dump()? If
diff --git a/fs/pstore/pstore_zone.c b/fs/pstore/pstore_zone.c
new file mode 100644
index 000000000000..62c834072498
--- /dev/null
+++ b/fs/pstore/pstore_zone.c
@@ -0,0 +1,989 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define MODNAME "pstore-zone"
+#define pr_fmt(fmt) MODNAME ": " fmt
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/pstore.h>
+#include <linux/mount.h>
+#include <linux/printk.h>
+#include <linux/fs.h>
+#include <linux/pstore_zone.h>
+#include <linux/kdev_t.h>
+#include <linux/device.h>
+#include <linux/namei.h>
+#include <linux/fcntl.h>
+#include <linux/uio.h>
+#include <linux/writeback.h>
+
+/**
+ * struct psz_head - header of zone to flush to storage
+ *
+ * @sig: signature to indicate header (PSZ_SIG xor PSZONE-type value)
+ * @datalen: length of data in @data
+ * @data: zone data.
+ */
+struct psz_buffer {
+#define PSZ_SIG (0x43474244) /* DBGC */
+	uint32_t sig;
+	atomic_t datalen;
+	uint8_t data[];
+};
+
+/**
+ * struct psz_oops_header - sub header of oops zones to flush to storage
+ *
+ * @magic: magic num for oops header
+ * @time: oops/panic trigger time
+ * @compressed: whether conpressed
+ * @counter: oops/panic counter
+ * @reason: identify oops or panic
+ * @data: pointer to log data
+ *
+ * It's a sub-header of oops zone, trailing after &psz_buffer.
+ */
+struct psz_oops_header {
+#define OOPS_HEADER_MAGIC 0x4dfc3ae5 /* Just a ramdom number */
+	uint32_t magic;
+	struct timespec64 time;
+	bool compressed;
+	uint32_t counter;
+	enum kmsg_dump_reason reason;
+	uint8_t data[];
+};
+
+/**
+ * struct psz_zone - zone information
+ *
+ * @off: zone offset of storage
+ * @type: front-end type for this zone
+ * @name: front-end name for this zone
+ * @buffer: pointer to data buffer managed by this zone
+ * @oldbuf: pointer to old data buffer.
+ * @buffer_size: bytes in @buffer->data
+ * @should_recover: whether this zone should recover from storage
+ * @dirty: whether the data in @buffer dirty
+ *
+ * zone structure in memory.
+ */
+struct psz_zone {
+	loff_t off;
+	const char *name;
+	enum pstore_type_id type;
+
+	struct psz_buffer *buffer;
+	struct psz_buffer *oldbuf;
+	size_t buffer_size;
+	bool should_recover;
+	atomic_t dirty;
+};
+
+/**
+ * struct psz_context - all about running state of pstore/zone
+ *
+ * @opszs: oops/panic storage zones
+ * @oops_max_cnt: max count of @opszs
+ * @oops_read_cnt: counter to read oops zone
+ * @oops_write_cnt: counter to write
+ * @oops_counter: counter to oops
+ * @panic_counter: counter to panic
+ * @recovered: whether finish recovering data from storage
+ * @on_panic: whether occur panic
+ * @psz_info_lock: lock to @psz_info
+ * @psz_info: information from back-end
+ * @pstore: structure for pstore
+ */
+struct psz_context {
+	struct psz_zone **opszs;
+	unsigned int oops_max_cnt;
+	unsigned int oops_read_cnt;
+	unsigned int oops_write_cnt;
+	/*
+	 * the counter should be recovered when recover.
+	 * It records the oops/panic times after burning rather than booting.
+	 */
+	unsigned int oops_counter;
+	unsigned int panic_counter;
+	atomic_t recovered;
+	atomic_t on_panic;
+
+	/*
+	 * psz_info_lock just protects "psz_info" during calls to
+	 * psz_register/psz_unregister
+	 */
+	spinlock_t psz_info_lock;
+	struct psz_info *psz_info;
+	struct pstore_info pstore;
+};
+static struct psz_context psz_cxt;
+
+/**
+ * enum psz_flush_mode - flush mode for psz_zone_write()
+ *
+ * @FLUSH_NONE: do not flush to storage but update data on memory
+ * @FLUSH_PART: just flush part of data including meta data to storage
+ * @FLUSH_META: just flush meta data of zone to storage
+ * @FLUSH_ALL: flush all of zone
+ */
+enum psz_flush_mode {
+	FLUSH_NONE = 0,
+	FLUSH_PART,
+	FLUSH_META,
+	FLUSH_ALL,
+};
+
+static inline int buffer_datalen(struct psz_zone *zone)
+{
+	return atomic_read(&zone->buffer->datalen);
+}
+
+static inline bool is_on_panic(void)
+{
+	struct psz_context *cxt = &psz_cxt;
+
+	return atomic_read(&cxt->on_panic);
+}
+
+static ssize_t psz_zone_read(struct psz_zone *zone, char *buf,
+		size_t len, unsigned long off)
+{
+	if (!buf || !zone->buffer)
+		return -EINVAL;
+	if (off > zone->buffer_size)
+		return -EINVAL;
+	len = min_t(size_t, len, zone->buffer_size - off);
+	memcpy(buf, zone->buffer->data + off, len);
+	return len;
+}
+
+static int psz_zone_write(struct psz_zone *zone,
+		enum psz_flush_mode flush_mode, const char *buf,
+		size_t len, unsigned long off)
+{
+	struct psz_info *info = psz_cxt.psz_info;
+	ssize_t wcnt = 0;
+	ssize_t (*writeop)(const char *buf, size_t bytes, loff_t pos);
+	size_t wlen;
+
+	if (off > zone->buffer_size)
+		return -EINVAL;
+
+	wlen = min_t(size_t, len, zone->buffer_size - off);
+	if (buf && wlen) {
+		memcpy(zone->buffer->data + off, buf, wlen);
+		atomic_set(&zone->buffer->datalen, wlen + off);
+	}
+
+	/* avoid to damage old records */
+	if (!is_on_panic() && !atomic_read(&psz_cxt.recovered))
+		goto dirty;
+
+	writeop = is_on_panic() ? info->panic_write : info->write;
+	if (!writeop)
+		goto dirty;
+
+	switch (flush_mode) {
+	case FLUSH_NONE:
+		if (unlikely(buf && wlen))
+			goto dirty;
+		return 0;
+	case FLUSH_PART:
+		wcnt = writeop((const char *)zone->buffer->data + off, wlen,
+				zone->off + sizeof(*zone->buffer) + off);
+		if (wcnt != wlen)
+			goto dirty;
+		fallthrough;
+	case FLUSH_META:
+		wlen = sizeof(struct psz_buffer);
+		wcnt = writeop((const char *)zone->buffer, wlen, zone->off);
+		if (wcnt != wlen)
+			goto dirty;
+		break;
+	case FLUSH_ALL:
+		wlen = zone->buffer_size + sizeof(*zone->buffer);
+		wcnt = writeop((const char *)zone->buffer, wlen, zone->off);
+		if (wcnt != wlen)
+			goto dirty;
+		break;
+	}
+
+	return 0;
+dirty:
+	atomic_set(&zone->dirty, true);
+	return -EBUSY;
+}
+
+static int psz_flush_dirty_zone(struct psz_zone *zone)
+{
+	int ret;
+
+	if (!zone)
+		return -EINVAL;
+
+	if (!atomic_read(&zone->dirty))
+		return 0;
+
+	if (!atomic_read(&psz_cxt.recovered))
+		return -EBUSY;
+
+	ret = psz_zone_write(zone, FLUSH_ALL, NULL, 0, 0);
+	if (!ret)
+		atomic_set(&zone->dirty, false);
+	return ret;
+}
+
+static int psz_flush_dirty_zones(struct psz_zone **zones, unsigned int cnt)
+{
+	int i, ret;
+	struct psz_zone *zone;
+
+	if (!zones)
+		return -EINVAL;
+
+	for (i = 0; i < cnt; i++) {
+		zone = zones[i];
+		if (!zone)
+			return -EINVAL;
+		ret = psz_flush_dirty_zone(zone);
+		if (ret)
+			return ret;
+	}
+	return 0;
+}
+
+static int psz_move_zone(struct psz_zone *old, struct psz_zone *new)
+{
+	const char *data = (const char *)old->buffer->data;
+	int ret;
+
+	ret = psz_zone_write(new, FLUSH_ALL, data, buffer_datalen(old), 0);
+	if (ret) {
+		atomic_set(&new->buffer->datalen, 0);
+		atomic_set(&new->dirty, false);
+		return ret;
+	}
+	atomic_set(&old->buffer->datalen, 0);
+	return 0;
+}
+
+static int psz_recover_oops_data(struct psz_context *cxt)
+{
+	struct psz_info *info = cxt->psz_info;
+	struct psz_zone *zone = NULL;
+	struct psz_buffer *buf;
+	unsigned long i;
+	ssize_t rcnt;
+
+	if (!info->read)
+		return -EINVAL;
+
+	for (i = 0; i < cxt->oops_max_cnt; i++) {
+		zone = cxt->opszs[i];
+		if (unlikely(!zone))
+			return -EINVAL;
+		if (atomic_read(&zone->dirty)) {
+			unsigned int wcnt = cxt->oops_write_cnt;
+			struct psz_zone *new = cxt->opszs[wcnt];
+			int ret;
+
+			ret = psz_move_zone(zone, new);
+			if (ret) {
+				pr_err("move zone from %lu to %d failed\n",
+						i, wcnt);
+				return ret;
+			}
+			cxt->oops_write_cnt = (wcnt + 1) % cxt->oops_max_cnt;
+		}
+		if (!zone->should_recover)
+			continue;
+		buf = zone->buffer;
+		rcnt = info->read((char *)buf, zone->buffer_size + sizeof(*buf),
+				zone->off);
+		if (rcnt != zone->buffer_size + sizeof(*buf))
+			return (int)rcnt < 0 ? (int)rcnt : -EIO;
+	}
+	return 0;
+}
+
+static int psz_recover_oops_meta(struct psz_context *cxt)
+{
+	struct psz_info *info = cxt->psz_info;
+	struct psz_zone *zone;
+	size_t rcnt, len;
+	struct psz_buffer *buf;
+	struct psz_oops_header *hdr;
+	struct timespec64 time = {0};
+	unsigned long i;
+	/*
+	 * Recover may on panic, we can't allocate any memory by kmalloc.
+	 * So, we use local array instead.
+	 */
+	char buffer_header[sizeof(*buf) + sizeof(*hdr)] = {0};
+
+	if (!info->read)
+		return -EINVAL;
+
+	len = sizeof(*buf) + sizeof(*hdr);
+	buf = (struct psz_buffer *)buffer_header;
+	for (i = 0; i < cxt->oops_max_cnt; i++) {
+		zone = cxt->opszs[i];
+		if (unlikely(!zone))
+			return -EINVAL;
+
+		rcnt = info->read((char *)buf, len, zone->off);
+		if (rcnt != len) {
+			pr_err("read %s with id %lu failed\n", zone->name, i);
+			return (int)rcnt < 0 ? (int)rcnt : -EIO;
+		}
+
+		if (buf->sig != zone->buffer->sig) {
+			pr_debug("no valid data in oops zone %lu\n", i);
+			continue;
+		}
+
+		if (zone->buffer_size < atomic_read(&buf->datalen)) {
+			pr_info("found overtop zone: %s: id %lu, off %lld, size %zu\n",
+					zone->name, i, zone->off,
+					zone->buffer_size);
+			continue;
+		}
+
+		hdr = (struct psz_oops_header *)buf->data;
+		if (hdr->magic != OOPS_HEADER_MAGIC) {
+			pr_info("found invalid zone: %s: id %lu, off %lld, size %zu\n",
+					zone->name, i, zone->off,
+					zone->buffer_size);
+			continue;
+		}
+
+		/*
+		 * we get the newest zone, and the next one must be the oldest
+		 * or unused zone, because we do write one by one like a circle.
+		 */
+		if (hdr->time.tv_sec >= time.tv_sec) {
+			time.tv_sec = hdr->time.tv_sec;
+			cxt->oops_write_cnt = (i + 1) % cxt->oops_max_cnt;
+		}
+
+		if (hdr->reason == KMSG_DUMP_OOPS)
+			cxt->oops_counter =
+				max(cxt->oops_counter, hdr->counter);
+		else
+			cxt->panic_counter =
+				max(cxt->panic_counter, hdr->counter);
+
+		if (!atomic_read(&buf->datalen)) {
+			pr_debug("found erased zone: %s: id %lu, off %lld, size %zu, datalen %d\n",
+					zone->name, i, zone->off,
+					zone->buffer_size,
+					atomic_read(&buf->datalen));
+			continue;
+		}
+
+		if (!is_on_panic())
+			zone->should_recover = true;
+		pr_debug("found nice zone: %s: id %lu, off %lld, size %zu, datalen %d\n",
+				zone->name, i, zone->off,
+				zone->buffer_size, atomic_read(&buf->datalen));
+	}
+
+	return 0;
+}
+
+static int psz_recover_oops(struct psz_context *cxt)
+{
+	int ret;
+
+	if (!cxt->opszs)
+		return 0;
+
+	ret = psz_recover_oops_meta(cxt);
+	if (ret)
+		goto recover_fail;
+
+	ret = psz_recover_oops_data(cxt);
+	if (ret)
+		goto recover_fail;
+
+	return 0;
+recover_fail:
+	pr_debug("recover oops failed\n");
+	return ret;
+}
+
+/**
+ * psz_recovery() - recover data from storage
+ * @cxt: the context of pstore/zone
+ *
+ * recovery means reading data back from storage after rebooting
+ *
+ * Return: 0 on success, others on failure.
+ */
+static inline int psz_recovery(struct psz_context *cxt)
+{
+	int ret = -EBUSY;
+
+	if (atomic_read(&cxt->recovered))
+		return 0;
+
+	ret = psz_recover_oops(cxt);
+	if (ret)
+		goto recover_fail;
+
+	pr_debug("recover end!\n");
+	atomic_set(&cxt->recovered, 1);
+	return 0;
+
+recover_fail:
+	pr_err("recover failed\n");
+	return ret;
+}
+
+static int psz_pstore_open(struct pstore_info *psi)
+{
+	struct psz_context *cxt = psi->data;
+
+	cxt->oops_read_cnt = 0;
+	return 0;
+}
+
+static inline bool psz_ok(struct psz_zone *zone)
+{
+	if (zone && zone->buffer && buffer_datalen(zone))
+		return true;
+	return false;
+}
+
+static inline int psz_oops_erase(struct psz_context *cxt,
+		struct psz_zone *zone, struct pstore_record *record)
+{
+	struct psz_buffer *buffer = zone->buffer;
+	struct psz_oops_header *hdr =
+		(struct psz_oops_header *)buffer->data;
+
+	if (unlikely(!psz_ok(zone)))
+		return 0;
+	/* this zone is already updated, no need to erase */
+	if (record->count != hdr->counter)
+		return 0;
+
+	atomic_set(&zone->buffer->datalen, 0);
+	return psz_zone_write(zone, FLUSH_META, NULL, 0, 0);
+}
+
+static int psz_pstore_erase(struct pstore_record *record)
+{
+	struct psz_context *cxt = record->psi->data;
+
+	switch (record->type) {
+	case PSTORE_TYPE_DMESG:
+		if (record->id >= cxt->oops_max_cnt)
+			return -EINVAL;
+		return psz_oops_erase(cxt, cxt->opszs[record->id], record);
+	default:
+		return -EINVAL;
+	}
+}
+
+static void psz_write_kmsg_hdr(struct psz_zone *zone,
+		struct pstore_record *record)
+{
+	struct psz_context *cxt = record->psi->data;
+	struct psz_buffer *buffer = zone->buffer;
+	struct psz_oops_header *hdr =
+		(struct psz_oops_header *)buffer->data;
+
+	hdr->magic = OOPS_HEADER_MAGIC;
+	hdr->compressed = record->compressed;
+	hdr->time.tv_sec = record->time.tv_sec;
+	hdr->time.tv_nsec = record->time.tv_nsec;
+	hdr->reason = record->reason;
+	if (hdr->reason == KMSG_DUMP_OOPS)
+		hdr->counter = ++cxt->oops_counter;
+	else
+		hdr->counter = ++cxt->panic_counter;
+}
+
+static inline int notrace psz_oops_write_record(struct psz_context *cxt,
+		struct pstore_record *record)
+{
+	size_t size, hlen;
+	struct psz_zone *zone;
+	unsigned int zonenum;
+
+	zonenum = cxt->oops_write_cnt;
+	zone = cxt->opszs[zonenum];
+	if (unlikely(!zone))
+		return -ENOSPC;
+	cxt->oops_write_cnt = (zonenum + 1) % cxt->oops_max_cnt;
+
+	pr_debug("write %s to zone id %d\n", zone->name, zonenum);
+	psz_write_kmsg_hdr(zone, record);
+	hlen = sizeof(struct psz_oops_header);
+	size = min_t(size_t, record->size, zone->buffer_size - hlen);
+	return psz_zone_write(zone, FLUSH_ALL, record->buf, size, hlen);
+}
+
+static int notrace psz_oops_write(struct psz_context *cxt,
+		struct pstore_record *record)
+{
+	int ret;
+	struct psz_info *info = cxt->psz_info;
+
+	/*
+	 * Out of the various oops/panic dump types, pstore/zone is currently
+	 * designed to only store crash logs, rather than storing general
+	 * kernel logs.
+	 */
+	if (record->reason != KMSG_DUMP_OOPS &&
+			record->reason != KMSG_DUMP_PANIC)
+		return -EINVAL;
+
+	/* Skip Oopes when configured to do so. */
+	if (record->reason == KMSG_DUMP_OOPS && !info->dump_oops)
+		return -EINVAL;
+
+	/*
+	 * Explicitly only take the first part of any new crash.
+	 * If our buffer is larger than kmsg_bytes, this can never happen,
+	 * and if our buffer is smaller than kmsg_bytes, we don't want the
+	 * report split across multiple records.
+	 */
+	if (record->part != 1)
+		return -ENOSPC;
+
+	if (!cxt->opszs)
+		return -ENOSPC;
+
+	ret = psz_oops_write_record(cxt, record);
+	if (!ret) {
+		pr_debug("try to flush other dirty oops zones\n");
+		psz_flush_dirty_zones(cxt->opszs, cxt->oops_max_cnt);
+	}
+
+	/* alway return 0 as we had handled it on buffer */
+	return 0;
+}
+
+static int notrace psz_pstore_write(struct pstore_record *record)
+{
+	struct psz_context *cxt = record->psi->data;
+
+	if (record->type == PSTORE_TYPE_DMESG &&
+			record->reason == KMSG_DUMP_PANIC)
+		atomic_set(&cxt->on_panic, 1);
+
+	switch (record->type) {
+	case PSTORE_TYPE_DMESG:
+		return psz_oops_write(cxt, record);
+	default:
+		return -EINVAL;
+	}
+}
+
+static struct psz_zone *psz_read_next_zone(struct psz_context *cxt)
+{
+	struct psz_zone *zone = NULL;
+
+	while (cxt->oops_read_cnt < cxt->oops_max_cnt) {
+		zone = cxt->opszs[cxt->oops_read_cnt++];
+		if (psz_ok(zone))
+			return zone;
+	}
+
+	return NULL;
+}
+
+static int psz_read_oops_hdr(struct psz_zone *zone,
+		struct pstore_record *record)
+{
+	struct psz_buffer *buffer = zone->buffer;
+	struct psz_oops_header *hdr =
+		(struct psz_oops_header *)buffer->data;
+
+	if (hdr->magic != OOPS_HEADER_MAGIC)
+		return -EINVAL;
+	record->compressed = hdr->compressed;
+	record->time.tv_sec = hdr->time.tv_sec;
+	record->time.tv_nsec = hdr->time.tv_nsec;
+	record->reason = hdr->reason;
+	record->count = hdr->counter;
+	return 0;
+}
+
+static ssize_t psz_oops_read(struct psz_zone *zone,
+		struct pstore_record *record)
+{
+	ssize_t size, hlen = 0;
+
+	size = buffer_datalen(zone);
+	/* Clear and skip this oops record if it has no valid header */
+	if (psz_read_oops_hdr(zone, record)) {
+		atomic_set(&zone->buffer->datalen, 0);
+		atomic_set(&zone->dirty, 0);
+		return -ENOMSG;
+	}
+	size -= sizeof(struct psz_oops_header);
+
+	if (!record->compressed) {
+		char *buf = kasprintf(GFP_KERNEL, "%s: Total %d times\n",
+				get_reason_str(record->reason), record->count);
+		hlen = strlen(buf);
+		record->buf = krealloc(buf, hlen + size, GFP_KERNEL);
+		if (!record->buf) {
+			kfree(buf);
+			return -ENOMEM;
+		}
+	} else {
+		record->buf = kmalloc(size, GFP_KERNEL);
+		if (!record->buf)
+			return -ENOMEM;
+	}
+
+	size = psz_zone_read(zone, record->buf + hlen, size,
+			sizeof(struct psz_oops_header) < 0);
+	if (unlikely(size < 0)) {
+		kfree(record->buf);
+		return -ENOMSG;
+	}
+
+	return size + hlen;
+}
+
+static ssize_t psz_pstore_read(struct pstore_record *record)
+{
+	struct psz_context *cxt = record->psi->data;
+	ssize_t (*readop)(struct psz_zone *zone,
+			struct pstore_record *record);
+	struct psz_zone *zone;
+	ssize_t ret;
+
+	/* before read, we must recover from storage */
+	ret = psz_recovery(cxt);
+	if (ret)
+		return ret;
+
+next_zone:
+	zone = psz_read_next_zone(cxt);
+	if (!zone)
+		return 0;
+
+	record->type = zone->type;
+	switch (record->type) {
+	case PSTORE_TYPE_DMESG:
+		readop = psz_oops_read;
+		record->id = cxt->oops_read_cnt - 1;
+		break;
+	default:
+		goto next_zone;
+	}
+
+	ret = readop(zone, record);
+	if (ret == -ENOMSG)
+		goto next_zone;
+	return ret;
+}
+
+static struct psz_context psz_cxt = {
+	.psz_info_lock = __SPIN_LOCK_UNLOCKED(psz_cxt.psz_info_lock),
+	.recovered = ATOMIC_INIT(0),
+	.on_panic = ATOMIC_INIT(0),
+	.pstore = {
+		.owner = THIS_MODULE,
+		.name = MODNAME,
+		.open = psz_pstore_open,
+		.read = psz_pstore_read,
+		.write = psz_pstore_write,
+		.erase = psz_pstore_erase,
+	},
+};
+
+static struct psz_zone *psz_init_zone(enum pstore_type_id type,
+		loff_t *off, size_t size)
+{
+	struct psz_info *info = psz_cxt.psz_info;
+	struct psz_zone *zone;
+	const char *name = pstore_type_to_name(type);
+
+	if (!size)
+		return NULL;
+
+	if (*off + size > info->total_size) {
+		pr_err("no room for %s (0x%zx@0x%llx over 0x%lx)\n",
+			name, size, *off, info->total_size);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	zone = kzalloc(sizeof(struct psz_zone), GFP_KERNEL);
+	if (!zone)
+		return ERR_PTR(-ENOMEM);
+
+	zone->buffer = kmalloc(size, GFP_KERNEL);
+	if (!zone->buffer) {
+		kfree(zone);
+		return ERR_PTR(-ENOMEM);
+	}
+	memset(zone->buffer, 0xFF, size);
+	zone->off = *off;
+	zone->name = name;
+	zone->type = type;
+	zone->buffer_size = size - sizeof(struct psz_buffer);
+	zone->buffer->sig = type ^ PSZ_SIG;
+	atomic_set(&zone->dirty, 0);
+	atomic_set(&zone->buffer->datalen, 0);
+
+	*off += size;
+
+	pr_debug("pszone %s: off 0x%llx, %zu header, %zu data\n", zone->name,
+			zone->off, sizeof(*zone->buffer), zone->buffer_size);
+	return zone;
+}
+
+static struct psz_zone **psz_init_zones(enum pstore_type_id type,
+	loff_t *off, size_t total_size, ssize_t record_size,
+	unsigned int *cnt)
+{
+	struct psz_info *info = psz_cxt.psz_info;
+	struct psz_zone **zones, *zone;
+	const char *name = pstore_type_to_name(type);
+	int c, i;
+
+	if (!total_size || !record_size)
+		return NULL;
+
+	if (*off + total_size > info->total_size) {
+		pr_err("no room for zones %s (0x%zx@0x%llx over 0x%lx)\n",
+			name, total_size, *off, info->total_size);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	c = total_size / record_size;
+	zones = kcalloc(c, sizeof(*zones), GFP_KERNEL);
+	if (!zones) {
+		pr_err("allocate for zones %s failed\n", name);
+		return ERR_PTR(-ENOMEM);
+	}
+	memset(zones, 0, c * sizeof(*zones));
+
+	for (i = 0; i < c; i++) {
+		zone = psz_init_zone(type, off, record_size);
+		if (!zone || IS_ERR(zone)) {
+			pr_err("initialize zones %s failed\n", name);
+			while (--i >= 0) {
+				kfree(zones[i]->buffer);
+				kfree(zones[i]);
+			}
+			kfree(zones);
+			return (void *)zone;
+		}
+		zones[i] = zone;
+	}
+
+	*cnt = c;
+	return zones;
+}
+
+static void psz_free_zone(struct psz_zone **pszone)
+{
+	struct psz_zone *zone = *pszone;
+
+	if (!zone)
+		return;
+
+	kfree(zone->buffer);
+	kfree(zone);
+	*pszone = NULL;
+}
+
+static void psz_free_zones(struct psz_zone ***pszones, unsigned int *cnt)
+{
+	struct psz_zone **zones = *pszones;
+
+	if (!zones)
+		return;
+
+	while (*cnt > 0) {
+		psz_free_zone(&zones[*cnt]);
+		(*cnt)--;
+	}
+	kfree(zones);
+	*pszones = NULL;
+}
+
+static void psz_free_all_zones(struct psz_context *cxt)
+{
+	if (cxt->opszs)
+		psz_free_zones(&cxt->opszs, &cxt->oops_max_cnt);
+}
+
+static int psz_alloc_zones(struct psz_context *cxt)
+{
+	struct psz_info *info = cxt->psz_info;
+	loff_t off = 0;
+	int err;
+	size_t size;
+
+	size = info->total_size;
+	cxt->opszs = psz_init_zones(PSTORE_TYPE_DMESG, &off, size,
+			info->oops_size, &cxt->oops_max_cnt);
+	if (IS_ERR(cxt->opszs)) {
+		err = PTR_ERR(cxt->opszs);
+		goto fail_out;
+	}
+
+	return 0;
+fail_out:
+	return err;
+}
+
+/**
+ * psz_register() - register to pstore/zone
+ *
+ * @info: back-end driver information. See &struct psz_info.
+ *
+ * Only one back-end at one time.
+ *
+ * Return: 0 on success, others on failure.
+ */
+int psz_register(struct psz_info *info)
+{
+	int err = -EINVAL;
+	struct psz_context *cxt = &psz_cxt;
+	struct module *owner = info->owner;
+
+	if (!info->total_size) {
+		pr_warn("the total size must be non-zero\n");
+		return -EINVAL;
+	}
+
+	if (!info->oops_size) {
+		pr_warn("at least one of the records be non-zero\n");
+		return -EINVAL;
+	}
+
+	if (!info->name || !info->name[0])
+		return -EINVAL;
+
+	if (info->total_size < 4096) {
+		pr_err("total size must be greater than 4096 bytes\n");
+		return -EINVAL;
+	}
+
+#define check_size(name, size) {					\
+		if (info->name > 0 && info->name < (size)) {		\
+			pr_err(#name " must be over %d\n", (size));	\
+			return -EINVAL;					\
+		}							\
+		if (info->name & (size - 1)) {				\
+			pr_err(#name " must be a multiple of %d\n",	\
+					(size));			\
+			return -EINVAL;					\
+		}							\
+	}
+
+	check_size(total_size, 4096);
+	check_size(oops_size, SECTOR_SIZE);
+
+#undef check_size
+
+	/*
+	 * the @read and @write must be applied.
+	 * if no @read, pstore may mount failed.
+	 * if no @write, pstore do not support to remove record file.
+	 */
+	if (!info->read || !info->write) {
+		pr_err("no valid general read/write interface\n");
+		return -EINVAL;
+	}
+
+	spin_lock(&cxt->psz_info_lock);
+	if (cxt->psz_info) {
+		pr_warn("'%s' already loaded: ignoring '%s'\n",
+				cxt->psz_info->name, info->name);
+		spin_unlock(&cxt->psz_info_lock);
+		return -EBUSY;
+	}
+	cxt->psz_info = info;
+	spin_unlock(&cxt->psz_info_lock);
+
+	if (owner && !try_module_get(owner)) {
+		err = -EBUSY;
+		goto fail_out;
+	}
+
+	pr_debug("register %s with properties:\n", info->name);
+	pr_debug("\ttotal size : %ld Bytes\n", info->total_size);
+	pr_debug("\toops size : %ld Bytes\n", info->oops_size);
+
+	err = psz_alloc_zones(cxt);
+	if (err) {
+		pr_err("alloc zones failed\n");
+		goto put_module;
+	}
+
+	if (info->oops_size) {
+		cxt->pstore.bufsize = cxt->opszs[0]->buffer_size -
+			sizeof(struct psz_oops_header);
+		cxt->pstore.buf = kzalloc(cxt->pstore.bufsize, GFP_KERNEL);
+		if (!cxt->pstore.buf) {
+			err = -ENOMEM;
+			goto free_all_zones;
+		}
+	}
+	cxt->pstore.data = cxt;
+	if (info->oops_size)
+		cxt->pstore.flags = PSTORE_FLAGS_DMESG;
+
+	pr_info("Registered %s as pszone backend for%s%s\n", info->name,
+			cxt->opszs && cxt->psz_info->dump_oops ? " Oops" : "",
+			cxt->opszs && cxt->psz_info->panic_write ? " Panic" : "");
+
+	err = pstore_register(&cxt->pstore);
+	if (err) {
+		pr_err("registering with pstore failed\n");
+		goto free_pstore_buf;
+	}
+
+	module_put(owner);
+	return 0;
+
+free_pstore_buf:
+	kfree(cxt->pstore.buf);
+free_all_zones:
+	psz_free_all_zones(cxt);
+put_module:
+	module_put(owner);
+fail_out:
+	spin_lock(&psz_cxt.psz_info_lock);
+	psz_cxt.psz_info = NULL;
+	spin_unlock(&psz_cxt.psz_info_lock);
+	return err;
+}
+EXPORT_SYMBOL_GPL(psz_register);
+
+/**
+ * psz_unregister() - unregister to pstore/zone
+ *
+ * @info: back-end driver information. See &struct psz_info.
+ */
+void psz_unregister(struct psz_info *info)
+{
+	struct psz_context *cxt = &psz_cxt;
+
+	pstore_unregister(&cxt->pstore);
+	kfree(cxt->pstore.buf);
+	cxt->pstore.bufsize = 0;
+
+	spin_lock(&cxt->psz_info_lock);
+	cxt->psz_info = NULL;
+	spin_unlock(&cxt->psz_info_lock);
+
+	psz_free_all_zones(cxt);
+}
+EXPORT_SYMBOL_GPL(psz_unregister);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("WeiXiong Liao <liaoweixiong@allwinnertech.com>");
+MODULE_DESCRIPTION("Storage Manager for Pstore/blk and Pstore/ram");
diff --git a/include/linux/pstore.h b/include/linux/pstore.h
index e779441e6d26..859583a7104d 100644
--- a/include/linux/pstore.h
+++ b/include/linux/pstore.h
@@ -45,6 +45,7 @@ enum pstore_type_id {
 
 const char *pstore_type_to_name(enum pstore_type_id type);
 enum pstore_type_id pstore_name_to_type(const char *name);
+const char *get_reason_str(enum kmsg_dump_reason reason);
 
 struct pstore_info;
 /**
diff --git a/include/linux/pstore_zone.h b/include/linux/pstore_zone.h
new file mode 100644
index 000000000000..afc10a67a777
--- /dev/null
+++ b/include/linux/pstore_zone.h
@@ -0,0 +1,44 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __PSTORE_ZONE_H_
+#define __PSTORE_ZONE_H_
+
+#include <linux/types.h>
+
+typedef ssize_t (*psz_read_op)(char *, size_t, loff_t);
+typedef ssize_t (*psz_write_op)(const char *, size_t, loff_t);
+/**
+ * struct psz_info - pstore/zone back-end driver structure
+ *
+ * @owner:	Module which is responsible for this back-end driver.
+ * @name:	Name of the back-end driver.
+ * @total_size: The total size in bytes pstore/zone can use. It must be greater
+ *		than 4096 and be multiple of 4096.
+ * @oops_size:	The size of oops/panic zone. Zero means disabled, otherwise,
+ *		it must be multiple of SECTOR_SIZE(512 Bytes).
+ * @dump_oops:	Whether to dump oops log.
+ * @read:	The general read operation. Both of the function parameters
+ *		@size and @offset are relative value to storage.
+ *		On success, the number of bytes should be returned, others
+ *		means error.
+ * @write:	The same as @read.
+ * @panic_write:The write operation only used for panic case. It's optional
+ *		if you do not care panic log. The parameters and return value
+ *		are the same as @read.
+ */
+struct psz_info {
+	struct module *owner;
+	const char *name;
+
+	unsigned long total_size;
+	unsigned long oops_size;
+	int dump_oops;
+	psz_read_op read;
+	psz_write_op write;
+	psz_write_op panic_write;
+};
+
+extern int psz_register(struct psz_info *info);
+extern void psz_unregister(struct psz_info *info);
+
+#endif
-- 
1.9.1


______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 02/11] pstore/blk: new support logger for block devices
  2020-03-25  8:54 [PATCH v3 00/11] pstore: mtd: support crash log to block and mtd device WeiXiong Liao
  2020-03-25  8:54 ` [PATCH v3 01/11] pstore/zone: a common layer to manage storage as zones WeiXiong Liao
@ 2020-03-25  8:54 ` WeiXiong Liao
  2020-05-08  6:07   ` Kees Cook
  2020-03-25  8:54 ` [PATCH v3 03/11] pstore/blk: respect for driver to pick pstore front-ends WeiXiong Liao
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 17+ messages in thread
From: WeiXiong Liao @ 2020-03-25  8:54 UTC (permalink / raw)
  To: Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, Mauro Carvalho Chehab, Rob Herring,
	David S. Miller, Greg Kroah-Hartman, Jonathan Cameron
  Cc: WeiXiong Liao, linux-mtd, linux-kernel, linux-doc

pstore/blk is similar to pstore/ram, but dump log to block device
rather than persistent ram.

Why do we need pstore/blk?
1. Most embedded intelligent equipment have no persistent ram, which
increases costs. We perfer to cheaper solutions, like block devices.
2. Do not any equipment have battery, which means that it lost all data
on general ram if power failure. Pstore has little to do for these
equipments.

Pstore/blk provides efficient configuration method. It divides all
configurations into 2 parts, configurations for user and
configurations for driver.

Configurations for user detemine how pstore/blk work, such as
dump_oops and oops_size. They can be set by Kconfig and module
parameter, but module parameter has priority over Kconfig.

Configurations for driver are all about block device, such as
total_size of device and read/write operations. They should be provided
by device drivers, calling psblk_register_blkdev().

If block device do not support panic write, @panic_write can be NULL.

Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com>
---
 fs/pstore/Kconfig          |  62 +++++++
 fs/pstore/Makefile         |   2 +
 fs/pstore/pstore_blk.c     | 425 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/pstore_blk.h |  27 +++
 4 files changed, 516 insertions(+)
 create mode 100644 fs/pstore/pstore_blk.c
 create mode 100644 include/linux/pstore_blk.h

diff --git a/fs/pstore/Kconfig b/fs/pstore/Kconfig
index 5ad4ab68ac4f..590af61019c2 100644
--- a/fs/pstore/Kconfig
+++ b/fs/pstore/Kconfig
@@ -160,3 +160,65 @@ config PSTORE_ZONE
 	help
 	  The common layer for pstore/blk (and pstore/ram in the future)
 	  to manager storage as zones.
+
+config PSTORE_BLK
+	tristate "Log panic/oops to a block device"
+	depends on PSTORE
+	depends on BLOCK
+	select PSTORE_ZONE
+	default n
+	help
+	  This enables panic and oops message to be logged to a block dev
+	  where it can be read back at some later point.
+
+	  If unsure, say N.
+
+config PSTORE_BLK_BLKDEV
+	string "block device identifier"
+	depends on PSTORE_BLK
+	default ""
+	help
+	  Which block device should be used for pstore/blk.
+
+	  It accept the following variants:
+	  1) <hex_major><hex_minor> device number in hexadecimal represents
+	     itself no leading 0x, for example b302.
+	  2) /dev/<disk_name> represents the device number of disk
+	  3) /dev/<disk_name><decimal> represents the device number
+	     of partition - device number of disk plus the partition number
+	  4) /dev/<disk_name>p<decimal> - same as the above, this form is
+	     used when disk name of partitioned disk ends with a digit.
+	  5) PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF representing the
+	     unique id of a partition if the partition table provides it.
+	     The UUID may be either an EFI/GPT UUID, or refer to an MSDOS
+	     partition using the format SSSSSSSS-PP, where SSSSSSSS is a zero-
+	     filled hex representation of the 32-bit "NT disk signature", and PP
+	     is a zero-filled hex representation of the 1-based partition number.
+	  6) PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in relation
+	     to a partition with a known unique id.
+	  7) <major>:<minor> major and minor number of the device separated by
+	     a colon.
+
+	  NOTE that, both Kconfig and module parameters can configure
+	  pstore/blk, but module parameters have priority over Kconfig.
+
+config PSTORE_BLK_OOPS_SIZE
+	int "Size in Kbytes of oops/panic log to store"
+	depends on PSTORE_BLK
+	default 64
+	help
+	  This just sets size of oops/panic log (oops_size) for pstore/blk.
+	  The size is in KB and must be a multiple of 4.
+
+	  NOTE that, both Kconfig and module parameters can configure
+	  pstore/blk, but module parameters have priority over Kconfig.
+
+config PSTORE_BLK_DUMP_OOPS
+	bool "dump oops"
+	depends on PSTORE_BLK
+	default y
+	help
+	  Whether pstore/blk dumps oops or not.
+
+	  NOTE that, both Kconfig and module parameters can configure
+	  pstore/blk, but module parameters have priority over Kconfig.
diff --git a/fs/pstore/Makefile b/fs/pstore/Makefile
index 94f3631c80ce..0e1c5faf7a0b 100644
--- a/fs/pstore/Makefile
+++ b/fs/pstore/Makefile
@@ -14,3 +14,5 @@ ramoops-objs += ram.o ram_core.o
 obj-$(CONFIG_PSTORE_RAM)	+= ramoops.o
 
 obj-$(CONFIG_PSTORE_ZONE)	+= pstore_zone.o
+
+obj-$(CONFIG_PSTORE_BLK)	+= pstore_blk.o
diff --git a/fs/pstore/pstore_blk.c b/fs/pstore/pstore_blk.c
new file mode 100644
index 000000000000..2fbdd4563e5c
--- /dev/null
+++ b/fs/pstore/pstore_blk.c
@@ -0,0 +1,425 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define MODNAME "pstore-blk"
+#define pr_fmt(fmt) MODNAME ": " fmt
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/blkdev.h>
+#include <linux/string.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/platform_device.h>
+#include <linux/pstore_blk.h>
+#include <linux/mount.h>
+#include <linux/uio.h>
+
+static long oops_size = CONFIG_PSTORE_BLK_OOPS_SIZE;
+module_param(oops_size, long, 0400);
+MODULE_PARM_DESC(oops_size, "oops size in kbytes");
+
+static int dump_oops = CONFIG_PSTORE_BLK_DUMP_OOPS;
+module_param(dump_oops, int, 0400);
+MODULE_PARM_DESC(total_size, "whether dump oops");
+
+/*
+ * blkdev - The block device to use.
+ *
+ * Most of the time, it is a partition of block device.
+ *
+ * blkdev accepts the following variants:
+ * 1) <hex_major><hex_minor> device number in hexadecimal represents itself
+ *    no leading 0x, for example b302.
+ * 2) /dev/<disk_name> represents the device number of disk
+ * 3) /dev/<disk_name><decimal> represents the device number
+ *    of partition - device number of disk plus the partition number
+ * 4) /dev/<disk_name>p<decimal> - same as the above, that form is
+ *    used when disk name of partitioned disk ends on a digit.
+ * 5) PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF representing the
+ *    unique id of a partition if the partition table provides it.
+ *    The UUID may be either an EFI/GPT UUID, or refer to an MSDOS
+ *    partition using the format SSSSSSSS-PP, where SSSSSSSS is a zero-
+ *    filled hex representation of the 32-bit "NT disk signature", and PP
+ *    is a zero-filled hex representation of the 1-based partition number.
+ * 6) PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in relation to
+ *    a partition with a known unique id.
+ * 7) <major>:<minor> major and minor number of the device separated by
+ *    a colon.
+ */
+static char blkdev[80] = CONFIG_PSTORE_BLK_BLKDEV;
+module_param_string(blkdev, blkdev, 80, 0400);
+MODULE_PARM_DESC(blkdev, "the block device for general read/write");
+
+static DEFINE_MUTEX(psz_lock);
+static struct block_device *psblk_bdev;
+static struct psz_info *psz_info;
+static psblk_panic_write_op blkdev_panic_write;
+static struct bdev_info {
+	dev_t devt;
+	sector_t nr_sects;
+	sector_t start_sect;
+} g_bdev_info;
+
+/**
+ * struct psblk_device - back-end pstore/blk driver structure.
+ *
+ * @total_size: The total size in bytes pstore/blk can use. It must be greater
+ *		than 4096 and be multiple of 4096.
+ * @read:	The general read operation. Both of the function parameters
+ *		@size and @offset are relative value to bock device (not the
+ *		whole disk).
+ *		On success, the number of bytes should be returned, others
+ *		means error.
+ * @write:	The same as @read.
+ * @panic_write:The write operation only used for panic case. It's optional
+ *		if you do not care panic log. The parameters and return value
+ *		are the same as @read.
+ */
+struct psblk_device {
+	unsigned long total_size;
+	psz_read_op read;
+	psz_write_op write;
+	psz_write_op panic_write;
+};
+
+static int psblk_register_do(struct psblk_device *dev)
+{
+	int ret;
+
+	if (!dev || !dev->total_size || !dev->read || !dev->write)
+		return -EINVAL;
+
+	mutex_lock(&psz_lock);
+
+	/* someone already registered before */
+	if (psz_info) {
+		mutex_unlock(&psz_lock);
+		return -EBUSY;
+	}
+	psz_info = kzalloc(sizeof(struct psz_info), GFP_KERNEL);
+	if (!psz_info) {
+		mutex_unlock(&psz_lock);
+		return -ENOMEM;
+	}
+
+#define verify_size(name, alignsize) {					\
+		long _##name_ = (name);					\
+		_##name_ = _##name_ <= 0 ? 0 : (_##name_ * 1024);	\
+		if (_##name_ & ((alignsize) - 1)) {			\
+			pr_info(#name " must align to %d\n",		\
+					(alignsize));			\
+			_##name_ = ALIGN(name, (alignsize));		\
+		}							\
+		name = _##name_ / 1024;					\
+		psz_info->name = _##name_;				\
+	}
+
+	verify_size(oops_size, 4096);
+#undef verify_size
+	dump_oops = dump_oops <= 0 ? 0 : 1;
+
+	psz_info->total_size = dev->total_size;
+	psz_info->dump_oops = dump_oops;
+	psz_info->read = dev->read;
+	psz_info->write = dev->write;
+	psz_info->panic_write = dev->panic_write;
+	psz_info->name = MODNAME;
+	psz_info->owner = THIS_MODULE;
+
+	ret = psz_register(psz_info);
+	if (ret) {
+		kfree(psz_info);
+		psz_info = NULL;
+	}
+	mutex_unlock(&psz_lock);
+	return ret;
+}
+
+static void psblk_unregister_do(struct psblk_device *dev)
+{
+	mutex_lock(&psz_lock);
+	if (psz_info && psz_info->read == dev->read) {
+		psz_unregister(psz_info);
+		kfree(psz_info);
+		psz_info = NULL;
+	}
+	mutex_unlock(&psz_lock);
+}
+
+/**
+ * psblk_get_bdev() - open block device
+ * @holder: exclusive holder identifier
+ *
+ * Return: pointer to block device on success and others on error.
+ *
+ * On success, the returned block_device has reference count of one.
+ */
+static struct block_device *psblk_get_bdev(void *holder)
+{
+	struct block_device *bdev = ERR_PTR(-ENODEV);
+	fmode_t mode = FMODE_READ | FMODE_WRITE;
+
+	if (!blkdev[0])
+		return ERR_PTR(-ENODEV);
+
+	mutex_lock(&psz_lock);
+	if (psz_info)
+		goto out;
+	if (holder)
+		mode |= FMODE_EXCL;
+	bdev = blkdev_get_by_path(blkdev, mode, holder);
+	if (IS_ERR(bdev)) {
+		dev_t devt;
+
+		devt = name_to_dev_t(blkdev);
+		if (devt == 0) {
+			bdev = ERR_PTR(-ENODEV);
+			goto out;
+		}
+		bdev = blkdev_get_by_dev(devt, mode, holder);
+	}
+out:
+	mutex_unlock(&psz_lock);
+	return bdev;
+}
+
+static void psblk_put_bdev(struct block_device *bdev, void *holder)
+{
+	fmode_t mode = FMODE_READ | FMODE_WRITE;
+
+	if (!bdev)
+		return;
+
+	mutex_lock(&psz_lock);
+	if (holder)
+		mode |= FMODE_EXCL;
+	blkdev_put(bdev, mode);
+	mutex_unlock(&psz_lock);
+}
+
+static ssize_t psblk_generic_blk_read(char *buf, size_t bytes, loff_t pos)
+{
+	struct block_device *bdev = psblk_bdev;
+	struct file file;
+	struct kiocb kiocb;
+	struct iov_iter iter;
+	struct kvec iov = {.iov_base = buf, .iov_len = bytes};
+
+	if (!bdev)
+		return -ENODEV;
+
+	memset(&file, 0, sizeof(struct file));
+	file.f_mapping = bdev->bd_inode->i_mapping;
+	file.f_flags = O_DSYNC | __O_SYNC | O_NOATIME;
+	file.f_inode = bdev->bd_inode;
+	file_ra_state_init(&file.f_ra, file.f_mapping);
+
+	init_sync_kiocb(&kiocb, &file);
+	kiocb.ki_pos = pos;
+	iov_iter_kvec(&iter, READ, &iov, 1, bytes);
+
+	return generic_file_read_iter(&kiocb, &iter);
+}
+
+static ssize_t psblk_generic_blk_write(const char *buf, size_t bytes,
+		loff_t pos)
+{
+	struct block_device *bdev = psblk_bdev;
+	struct iov_iter iter;
+	struct kiocb kiocb;
+	struct file file;
+	ssize_t ret;
+	struct kvec iov = {.iov_base = (void *)buf, .iov_len = bytes};
+
+	if (!bdev)
+		return -ENODEV;
+
+	/* Console/Ftrace recorder may handle buffer until flush dirty zones */
+	if (in_interrupt() || irqs_disabled())
+		return -EBUSY;
+
+	memset(&file, 0, sizeof(struct file));
+	file.f_mapping = bdev->bd_inode->i_mapping;
+	file.f_flags = O_DSYNC | __O_SYNC | O_NOATIME;
+	file.f_inode = bdev->bd_inode;
+
+	init_sync_kiocb(&kiocb, &file);
+	kiocb.ki_pos = pos;
+	iov_iter_kvec(&iter, WRITE, &iov, 1, bytes);
+
+	inode_lock(bdev->bd_inode);
+	ret = generic_write_checks(&kiocb, &iter);
+	if (ret > 0)
+		ret = generic_perform_write(&file, &iter, pos);
+	inode_unlock(bdev->bd_inode);
+
+	if (likely(ret > 0)) {
+		const struct file_operations f_op = {.fsync = blkdev_fsync};
+
+		file.f_op = &f_op;
+		kiocb.ki_pos += ret;
+		ret = generic_write_sync(&kiocb, ret);
+	}
+	return ret;
+}
+
+static inline unsigned long psblk_bdev_size(struct block_device *bdev)
+{
+	return (unsigned long)part_nr_sects_read(bdev->bd_part) << SECTOR_SHIFT;
+}
+
+static ssize_t psblk_blk_panic_write(const char *buf, size_t size,
+		loff_t off)
+{
+	int ret;
+
+	if (!blkdev_panic_write)
+		return -EOPNOTSUPP;
+
+	/* size and off must align to SECTOR_SIZE for block device */
+	ret = blkdev_panic_write(buf, off >> SECTOR_SHIFT,
+			size >> SECTOR_SHIFT);
+	return ret ? -EIO : size;
+}
+
+static struct bdev_info *psblk_get_bdev_info(void)
+{
+	struct bdev_info *info = &g_bdev_info;
+	struct block_device *bdev;
+
+	if (info->devt)
+		return info;
+
+	bdev = psblk_get_bdev(NULL);
+	if (IS_ERR(bdev))
+		return ERR_CAST(bdev);
+
+	info->devt = bdev->bd_dev;
+	info->nr_sects = part_nr_sects_read(bdev->bd_part);
+	info->start_sect = get_start_sect(bdev);
+
+	if (!psblk_bdev_size(bdev)) {
+		pr_err("no enough space to '%s'\n", blkdev);
+		info = ERR_PTR(-ENOSPC);
+	}
+
+	psblk_put_bdev(bdev, NULL);
+	return info;
+}
+
+/**
+ * psblk_register_blkdev() - register block device to pstore/blk
+ *
+ * @major: the major device number of registering device
+ * @panic_write: the interface for panic case.
+ *
+ * Only the matching major to @blkdev can register.
+ *
+ * If block device do not support panic write, @panic_write can be NULL.
+ *
+ * Return:
+ * * 0		- OK
+ * * Others	- something error.
+ */
+int psblk_register_blkdev(unsigned int major, psblk_panic_write_op panic_write)
+{
+	struct block_device *bdev;
+	struct psblk_device dev = {0};
+	struct bdev_info *binfo;
+	int ret = -ENODEV;
+	void *holder = blkdev;
+
+	binfo = psblk_get_bdev_info();
+	if (IS_ERR(binfo))
+		return PTR_ERR(binfo);
+
+	/* only allow driver matching the @blkdev */
+	if (!binfo->devt || MAJOR(binfo->devt) != major) {
+		pr_debug("invalid major %u (expect %u)\n",
+				major, MAJOR(binfo->devt));
+		return -ENODEV;
+	}
+
+	/* hold bdev exclusively */
+	bdev = psblk_get_bdev(holder);
+	if (IS_ERR(bdev)) {
+		pr_err("failed to open '%s'!\n", blkdev);
+		return PTR_ERR(bdev);
+	}
+
+	/* psblk_bdev must be assigned before register to pstore/blk */
+	psblk_bdev = bdev;
+	blkdev_panic_write = panic_write;
+
+	dev.total_size = psblk_bdev_size(bdev);
+	dev.panic_write = panic_write ? psblk_blk_panic_write : NULL;
+	dev.read = psblk_generic_blk_read;
+	dev.write = psblk_generic_blk_write;
+
+	ret = psblk_register_do(&dev);
+	if (ret)
+		goto err_put_bdev;
+
+	pr_info("using '%s'\n", blkdev);
+	return 0;
+
+err_put_bdev:
+	psblk_bdev = NULL;
+	blkdev_panic_write = NULL;
+	psblk_put_bdev(bdev, holder);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(psblk_register_blkdev);
+
+/**
+ * psblk_unregister_blkdev() - unregister block device from pstore/blk
+ *
+ * @major: the major device number of device
+ */
+void psblk_unregister_blkdev(unsigned int major)
+{
+	struct psblk_device dev = {.read = psblk_generic_blk_read};
+	void *holder = blkdev;
+
+	if (psblk_bdev && MAJOR(psblk_bdev->bd_dev) == major) {
+		psblk_unregister_do(&dev);
+		psblk_put_bdev(psblk_bdev, holder);
+		blkdev_panic_write = NULL;
+		psblk_bdev = NULL;
+	}
+}
+EXPORT_SYMBOL_GPL(psblk_unregister_blkdev);
+
+/**
+ * psblk_blkdev_info() - get information of @blkdev
+ *
+ * @devt: the block device num of @blkdev
+ * @nr_sects: the sector count of @blkdev
+ * @start_sect: the start sector of @blkdev
+ *
+ * Block driver needs the follow information for @panic_write.
+ *
+ * Return: 0 on success, others on failure.
+ */
+int psblk_blkdev_info(dev_t *devt, sector_t *nr_sects, sector_t *start_sect)
+{
+	struct bdev_info *binfo;
+
+	binfo = psblk_get_bdev_info();
+	if (IS_ERR(binfo))
+		return PTR_ERR(binfo);
+
+	if (devt)
+		*devt = binfo->devt;
+	if (nr_sects)
+		*nr_sects = binfo->nr_sects;
+	if (start_sect)
+		*start_sect = binfo->start_sect;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(psblk_blkdev_info);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("WeiXiong Liao <liaoweixiong@allwinnertech.com>");
+MODULE_DESCRIPTION("Block device Oops/Panic logger");
diff --git a/include/linux/pstore_blk.h b/include/linux/pstore_blk.h
new file mode 100644
index 000000000000..5ff465e3953e
--- /dev/null
+++ b/include/linux/pstore_blk.h
@@ -0,0 +1,27 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __PSTORE_BLK_H_
+#define __PSTORE_BLK_H_
+
+#include <linux/types.h>
+#include <linux/pstore_zone.h>
+
+/**
+ * typedef psblk_panic_write_op - panic write operation to block device
+ *
+ * @buf: the data to write
+ * @start_sect: start sector to block device
+ * @sects: sectors count on buf
+ *
+ * Return: On success, zero should be returned. Others mean error.
+ *
+ * Panic write to block device must be aligned to SECTOR_SIZE.
+ */
+typedef int (*psblk_panic_write_op)(const char *buf, sector_t start_sect,
+		sector_t sects);
+
+int  psblk_register_blkdev(unsigned int major, psblk_panic_write_op panic_write);
+void psblk_unregister_blkdev(unsigned int major);
+int  psblk_blkdev_info(dev_t *devt, sector_t *nr_sects, sector_t *start_sect);
+
+#endif
-- 
1.9.1


______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 03/11] pstore/blk: respect for driver to pick pstore front-ends
  2020-03-25  8:54 [PATCH v3 00/11] pstore: mtd: support crash log to block and mtd device WeiXiong Liao
  2020-03-25  8:54 ` [PATCH v3 01/11] pstore/zone: a common layer to manage storage as zones WeiXiong Liao
  2020-03-25  8:54 ` [PATCH v3 02/11] pstore/blk: new support logger for block devices WeiXiong Liao
@ 2020-03-25  8:54 ` WeiXiong Liao
  2020-03-25  8:54 ` [PATCH v3 04/11] pstore/blk: pstore/zone: support pmsg recorder WeiXiong Liao
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: WeiXiong Liao @ 2020-03-25  8:54 UTC (permalink / raw)
  To: Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, Mauro Carvalho Chehab, Rob Herring,
	David S. Miller, Greg Kroah-Hartman, Jonathan Cameron
  Cc: WeiXiong Liao, linux-mtd, linux-kernel, linux-doc

For most devices, not all psotre front-ends are supported. Pstore/blk
should provide a way for drivers to list the supported front-ends.

Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com>
---
 fs/pstore/pstore_blk.c     | 18 ++++++++++++++----
 include/linux/pstore_blk.h |  4 +++-
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/fs/pstore/pstore_blk.c b/fs/pstore/pstore_blk.c
index 2fbdd4563e5c..f3ce7bbd9077 100644
--- a/fs/pstore/pstore_blk.c
+++ b/fs/pstore/pstore_blk.c
@@ -65,6 +65,9 @@
  *
  * @total_size: The total size in bytes pstore/blk can use. It must be greater
  *		than 4096 and be multiple of 4096.
+ * @flags:	Refer to macro starting with PSTORE_FLAGS defined in
+ *		linux/pstore.h. It means what front-ends this device support.
+ *		Zero means all recorders for compatible.
  * @read:	The general read operation. Both of the function parameters
  *		@size and @offset are relative value to bock device (not the
  *		whole disk).
@@ -77,6 +80,7 @@
  */
 struct psblk_device {
 	unsigned long total_size;
+	unsigned int flags;
 	psz_read_op read;
 	psz_write_op write;
 	psz_write_op panic_write;
@@ -102,8 +106,11 @@ static int psblk_register_do(struct psblk_device *dev)
 		return -ENOMEM;
 	}
 
-#define verify_size(name, alignsize) {					\
-		long _##name_ = (name);					\
+	/* zero means all recorders for compatible */
+	if (!dev->flags)
+		dev->flags = UINT_MAX;
+#define verify_size(name, alignsize, enable) {				\
+		long _##name_ = (enable) ? (name) : 0;			\
 		_##name_ = _##name_ <= 0 ? 0 : (_##name_ * 1024);	\
 		if (_##name_ & ((alignsize) - 1)) {			\
 			pr_info(#name " must align to %d\n",		\
@@ -114,7 +121,7 @@ static int psblk_register_do(struct psblk_device *dev)
 		psz_info->name = _##name_;				\
 	}
 
-	verify_size(oops_size, 4096);
+	verify_size(oops_size, 4096, dev->flags & PSTORE_FLAGS_DMESG);
 #undef verify_size
 	dump_oops = dump_oops <= 0 ? 0 : 1;
 
@@ -311,6 +318,7 @@ static struct bdev_info *psblk_get_bdev_info(void)
  * psblk_register_blkdev() - register block device to pstore/blk
  *
  * @major: the major device number of registering device
+ * @flags: refer to macro starting with PSTORE_FLAGS defined in linux/pstore.h
  * @panic_write: the interface for panic case.
  *
  * Only the matching major to @blkdev can register.
@@ -321,7 +329,8 @@ static struct bdev_info *psblk_get_bdev_info(void)
  * * 0		- OK
  * * Others	- something error.
  */
-int psblk_register_blkdev(unsigned int major, psblk_panic_write_op panic_write)
+int psblk_register_blkdev(unsigned int major, unsigned int flags,
+		psblk_panic_write_op panic_write)
 {
 	struct block_device *bdev;
 	struct psblk_device dev = {0};
@@ -352,6 +361,7 @@ int psblk_register_blkdev(unsigned int major, psblk_panic_write_op panic_write)
 	blkdev_panic_write = panic_write;
 
 	dev.total_size = psblk_bdev_size(bdev);
+	dev.flags = flags;
 	dev.panic_write = panic_write ? psblk_blk_panic_write : NULL;
 	dev.read = psblk_generic_blk_read;
 	dev.write = psblk_generic_blk_write;
diff --git a/include/linux/pstore_blk.h b/include/linux/pstore_blk.h
index 5ff465e3953e..d8f609e60288 100644
--- a/include/linux/pstore_blk.h
+++ b/include/linux/pstore_blk.h
@@ -4,6 +4,7 @@
 #define __PSTORE_BLK_H_
 
 #include <linux/types.h>
+#include <linux/pstore.h>
 #include <linux/pstore_zone.h>
 
 /**
@@ -20,7 +21,8 @@
 typedef int (*psblk_panic_write_op)(const char *buf, sector_t start_sect,
 		sector_t sects);
 
-int  psblk_register_blkdev(unsigned int major, psblk_panic_write_op panic_write);
+int  psblk_register_blkdev(unsigned int major, unsigned int flags,
+		psblk_panic_write_op panic_write);
 void psblk_unregister_blkdev(unsigned int major);
 int  psblk_blkdev_info(dev_t *devt, sector_t *nr_sects, sector_t *start_sect);
 
-- 
1.9.1


______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 04/11] pstore/blk: pstore/zone: support pmsg recorder
  2020-03-25  8:54 [PATCH v3 00/11] pstore: mtd: support crash log to block and mtd device WeiXiong Liao
                   ` (2 preceding siblings ...)
  2020-03-25  8:54 ` [PATCH v3 03/11] pstore/blk: respect for driver to pick pstore front-ends WeiXiong Liao
@ 2020-03-25  8:54 ` WeiXiong Liao
  2020-03-25  8:55 ` [PATCH v3 05/11] pstore/blk: blkoops: support console recorder WeiXiong Liao
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: WeiXiong Liao @ 2020-03-25  8:54 UTC (permalink / raw)
  To: Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, Mauro Carvalho Chehab, Rob Herring,
	David S. Miller, Greg Kroah-Hartman, Jonathan Cameron
  Cc: WeiXiong Liao, linux-mtd, linux-kernel, linux-doc

pmsg support recorder for userspace. To enable pmsg, just make pmsg_size
be greater than 0 and a multiple of 4096.

Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com>
---
 fs/pstore/Kconfig           |  12 ++
 fs/pstore/pstore_blk.c      |   9 ++
 fs/pstore/pstore_zone.c     | 273 +++++++++++++++++++++++++++++++++++++++++---
 include/linux/pstore_zone.h |   2 +
 4 files changed, 283 insertions(+), 13 deletions(-)

diff --git a/fs/pstore/Kconfig b/fs/pstore/Kconfig
index 590af61019c2..8cead860dcfc 100644
--- a/fs/pstore/Kconfig
+++ b/fs/pstore/Kconfig
@@ -213,6 +213,18 @@ config PSTORE_BLK_OOPS_SIZE
 	  NOTE that, both Kconfig and module parameters can configure
 	  pstore/blk, but module parameters have priority over Kconfig.
 
+config PSTORE_BLK_PMSG_SIZE
+	int "Size in Kbytes of pmsg to store"
+	depends on PSTORE_BLK
+	depends on PSTORE_PMSG
+	default 64
+	help
+	  This just sets size of pmsg (pmsg_size) for pstore/blk. The size is
+	  in KB and must be a multiple of 4.
+
+	  NOTE that, both Kconfig and module parameters can configure
+	  pstore/blk, but module parameters have priority over Kconfig.
+
 config PSTORE_BLK_DUMP_OOPS
 	bool "dump oops"
 	depends on PSTORE_BLK
diff --git a/fs/pstore/pstore_blk.c b/fs/pstore/pstore_blk.c
index f3ce7bbd9077..85cd9f2335be 100644
--- a/fs/pstore/pstore_blk.c
+++ b/fs/pstore/pstore_blk.c
@@ -18,6 +18,14 @@
 module_param(oops_size, long, 0400);
 MODULE_PARM_DESC(oops_size, "oops size in kbytes");
 
+#if IS_ENABLED(CONFIG_PSTORE_PMSG)
+static long pmsg_size = CONFIG_PSTORE_BLK_PMSG_SIZE;
+#else
+static long pmsg_size = -1;
+#endif
+module_param(pmsg_size, long, 0400);
+MODULE_PARM_DESC(pmsg_size, "pmsg size in kbytes");
+
 static int dump_oops = CONFIG_PSTORE_BLK_DUMP_OOPS;
 module_param(dump_oops, int, 0400);
 MODULE_PARM_DESC(total_size, "whether dump oops");
@@ -122,6 +130,7 @@ static int psblk_register_do(struct psblk_device *dev)
 	}
 
 	verify_size(oops_size, 4096, dev->flags & PSTORE_FLAGS_DMESG);
+	verify_size(pmsg_size, 4096, dev->flags & PSTORE_FLAGS_PMSG);
 #undef verify_size
 	dump_oops = dump_oops <= 0 ? 0 : 1;
 
diff --git a/fs/pstore/pstore_zone.c b/fs/pstore/pstore_zone.c
index 62c834072498..444bce7f9ac3 100644
--- a/fs/pstore/pstore_zone.c
+++ b/fs/pstore/pstore_zone.c
@@ -23,12 +23,14 @@
  *
  * @sig: signature to indicate header (PSZ_SIG xor PSZONE-type value)
  * @datalen: length of data in @data
+ * @start: offset into @data where the beginning of the stored bytes begin
  * @data: zone data.
  */
 struct psz_buffer {
 #define PSZ_SIG (0x43474244) /* DBGC */
 	uint32_t sig;
 	atomic_t datalen;
+	atomic_t start;
 	uint8_t data[];
 };
 
@@ -84,9 +86,11 @@ struct psz_zone {
  * struct psz_context - all about running state of pstore/zone
  *
  * @opszs: oops/panic storage zones
+ * @ppsz: pmsg storage zone
  * @oops_max_cnt: max count of @opszs
  * @oops_read_cnt: counter to read oops zone
  * @oops_write_cnt: counter to write
+ * @pmsg_read_cnt: counter to read pmsg zone
  * @oops_counter: counter to oops
  * @panic_counter: counter to panic
  * @recovered: whether finish recovering data from storage
@@ -97,9 +101,11 @@ struct psz_zone {
  */
 struct psz_context {
 	struct psz_zone **opszs;
+	struct psz_zone *ppsz;
 	unsigned int oops_max_cnt;
 	unsigned int oops_read_cnt;
 	unsigned int oops_write_cnt;
+	unsigned int pmsg_read_cnt;
 	/*
 	 * the counter should be recovered when recover.
 	 * It records the oops/panic times after burning rather than booting.
@@ -139,6 +145,11 @@ static inline int buffer_datalen(struct psz_zone *zone)
 	return atomic_read(&zone->buffer->datalen);
 }
 
+static inline int buffer_start(struct psz_zone *zone)
+{
+	return atomic_read(&zone->buffer->start);
+}
+
 static inline bool is_on_panic(void)
 {
 	struct psz_context *cxt = &psz_cxt;
@@ -146,10 +157,10 @@ static inline bool is_on_panic(void)
 	return atomic_read(&cxt->on_panic);
 }
 
-static ssize_t psz_zone_read(struct psz_zone *zone, char *buf,
+static ssize_t psz_zone_read_buffer(struct psz_zone *zone, char *buf,
 		size_t len, unsigned long off)
 {
-	if (!buf || !zone->buffer)
+	if (!buf || !zone || !zone->buffer)
 		return -EINVAL;
 	if (off > zone->buffer_size)
 		return -EINVAL;
@@ -158,6 +169,18 @@ static ssize_t psz_zone_read(struct psz_zone *zone, char *buf,
 	return len;
 }
 
+static int psz_zone_read_oldbuf(struct psz_zone *zone, char *buf,
+		size_t len, unsigned long off)
+{
+	if (!buf || !zone || !zone->oldbuf)
+		return -EINVAL;
+	if (off > zone->buffer_size)
+		return -EINVAL;
+	len = min_t(size_t, len, zone->buffer_size - off);
+	memcpy(buf, zone->oldbuf->data + off, len);
+	return 0;
+}
+
 static int psz_zone_write(struct psz_zone *zone,
 		enum psz_flush_mode flush_mode, const char *buf,
 		size_t len, unsigned long off)
@@ -413,6 +436,93 @@ static int psz_recover_oops(struct psz_context *cxt)
 	return ret;
 }
 
+static int psz_recover_zone(struct psz_context *cxt, struct psz_zone *zone)
+{
+	struct psz_info *info = cxt->psz_info;
+	struct psz_buffer *oldbuf, tmpbuf;
+	int ret = 0;
+	char *buf;
+	ssize_t rcnt, len, start, off;
+
+	if (!zone || zone->oldbuf)
+		return 0;
+
+	if (is_on_panic()) {
+		/* save data as much as possible */
+		psz_flush_dirty_zone(zone);
+		return 0;
+	}
+
+	if (unlikely(!info->read))
+		return -EINVAL;
+
+	len = sizeof(struct psz_buffer);
+	rcnt = info->read((char *)&tmpbuf, len, zone->off);
+	if (rcnt != len) {
+		pr_debug("read zone %s failed\n", zone->name);
+		return (int)rcnt < 0 ? (int)rcnt : -EIO;
+	}
+
+	if (tmpbuf.sig != zone->buffer->sig) {
+		pr_debug("no valid data in zone %s\n", zone->name);
+		return 0;
+	}
+
+	if (zone->buffer_size < atomic_read(&tmpbuf.datalen) ||
+		zone->buffer_size < atomic_read(&tmpbuf.start)) {
+		pr_info("found overtop zone: %s: off %lld, size %zu\n",
+				zone->name, zone->off, zone->buffer_size);
+		/* just keep going */
+		return 0;
+	}
+
+	if (!atomic_read(&tmpbuf.datalen)) {
+		pr_debug("found erased zone: %s: off %lld, size %zu, datalen %d\n",
+				zone->name, zone->off, zone->buffer_size,
+				atomic_read(&tmpbuf.datalen));
+		return 0;
+	}
+
+	pr_debug("found nice zone: %s: off %lld, size %zu, datalen %d\n",
+			zone->name, zone->off, zone->buffer_size,
+			atomic_read(&tmpbuf.datalen));
+
+	len = atomic_read(&tmpbuf.datalen) + sizeof(*oldbuf);
+	oldbuf = kzalloc(len, GFP_KERNEL);
+	if (!oldbuf)
+		return -ENOMEM;
+
+	memcpy(oldbuf, &tmpbuf, sizeof(*oldbuf));
+	buf = (char *)oldbuf + sizeof(*oldbuf);
+	len = atomic_read(&oldbuf->datalen);
+	start = atomic_read(&oldbuf->start);
+	off = zone->off + sizeof(*oldbuf);
+
+	/* get part of data */
+	rcnt = info->read(buf, len - start, off + start);
+	if (rcnt != len - start) {
+		pr_err("read zone %s failed\n", zone->name);
+		ret = (int)rcnt < 0 ? (int)rcnt : -EIO;
+		goto free_oldbuf;
+	}
+
+	/* get the rest of data */
+	rcnt = info->read(buf + len - start, start, off);
+	if (rcnt != start) {
+		pr_err("read zone %s failed\n", zone->name);
+		ret = (int)rcnt < 0 ? (int)rcnt : -EIO;
+		goto free_oldbuf;
+	}
+
+	zone->oldbuf = oldbuf;
+	psz_flush_dirty_zone(zone);
+	return 0;
+
+free_oldbuf:
+	kfree(oldbuf);
+	return ret;
+}
+
 /**
  * psz_recovery() - recover data from storage
  * @cxt: the context of pstore/zone
@@ -432,6 +542,10 @@ static inline int psz_recovery(struct psz_context *cxt)
 	if (ret)
 		goto recover_fail;
 
+	ret = psz_recover_zone(cxt, cxt->ppsz);
+	if (ret)
+		goto recover_fail;
+
 	pr_debug("recover end!\n");
 	atomic_set(&cxt->recovered, 1);
 	return 0;
@@ -446,9 +560,17 @@ static int psz_pstore_open(struct pstore_info *psi)
 	struct psz_context *cxt = psi->data;
 
 	cxt->oops_read_cnt = 0;
+	cxt->pmsg_read_cnt = 0;
 	return 0;
 }
 
+static inline bool psz_old_ok(struct psz_zone *zone)
+{
+	if (zone && zone->oldbuf && atomic_read(&zone->oldbuf->datalen))
+		return true;
+	return false;
+}
+
 static inline bool psz_ok(struct psz_zone *zone)
 {
 	if (zone && zone->buffer && buffer_datalen(zone))
@@ -473,6 +595,25 @@ static inline int psz_oops_erase(struct psz_context *cxt,
 	return psz_zone_write(zone, FLUSH_META, NULL, 0, 0);
 }
 
+static inline int psz_record_erase(struct psz_context *cxt,
+		struct psz_zone *zone)
+{
+	if (unlikely(!psz_old_ok(zone)))
+		return 0;
+
+	kfree(zone->oldbuf);
+	zone->oldbuf = NULL;
+	/*
+	 * if there are new data in zone buffer, that means the old data
+	 * are already invalid. It is no need to flush 0 (erase) to
+	 * block device.
+	 */
+	if (!buffer_datalen(zone))
+		return psz_zone_write(zone, FLUSH_META, NULL, 0, 0);
+	psz_flush_dirty_zone(zone);
+	return 0;
+}
+
 static int psz_pstore_erase(struct pstore_record *record)
 {
 	struct psz_context *cxt = record->psi->data;
@@ -482,6 +623,8 @@ static int psz_pstore_erase(struct pstore_record *record)
 		if (record->id >= cxt->oops_max_cnt)
 			return -EINVAL;
 		return psz_oops_erase(cxt, cxt->opszs[record->id], record);
+	case PSTORE_TYPE_PMSG:
+		return psz_record_erase(cxt, cxt->ppsz);
 	default:
 		return -EINVAL;
 	}
@@ -502,8 +645,10 @@ static void psz_write_kmsg_hdr(struct psz_zone *zone,
 	hdr->reason = record->reason;
 	if (hdr->reason == KMSG_DUMP_OOPS)
 		hdr->counter = ++cxt->oops_counter;
-	else
+	else if (hdr->reason == KMSG_DUMP_PANIC)
 		hdr->counter = ++cxt->panic_counter;
+	else
+		hdr->counter = 0;
 }
 
 static inline int notrace psz_oops_write_record(struct psz_context *cxt,
@@ -567,6 +712,53 @@ static int notrace psz_oops_write(struct psz_context *cxt,
 	return 0;
 }
 
+static int notrace psz_record_write(struct psz_zone *zone,
+		struct pstore_record *record)
+{
+	size_t start, rem;
+	int cnt = record->size;
+	bool is_full_data = false;
+	char *buf = record->buf;
+
+	if (!zone || !record)
+		return -ENOSPC;
+
+	if (atomic_read(&zone->buffer->datalen) >= zone->buffer_size)
+		is_full_data = true;
+
+	if (unlikely(cnt > zone->buffer_size)) {
+		buf += cnt - zone->buffer_size;
+		cnt = zone->buffer_size;
+	}
+
+	start = buffer_start(zone);
+	rem = zone->buffer_size - start;
+	if (unlikely(rem < cnt)) {
+		psz_zone_write(zone, FLUSH_PART, buf, rem, start);
+		buf += rem;
+		cnt -= rem;
+		start = 0;
+		is_full_data = true;
+	}
+
+	atomic_set(&zone->buffer->start, cnt + start);
+	psz_zone_write(zone, FLUSH_PART, buf, cnt, start);
+
+	/**
+	 * psz_zone_write will set datalen as start + cnt.
+	 * It work if actual data length lesser than buffer size.
+	 * If data length greater than buffer size, pmsg will rewrite to
+	 * beginning of zone, which make buffer->datalen wrongly.
+	 * So we should reset datalen as buffer size once actual data length
+	 * greater than buffer size.
+	 */
+	if (is_full_data) {
+		atomic_set(&zone->buffer->datalen, zone->buffer_size);
+		psz_zone_write(zone, FLUSH_META, NULL, 0, 0);
+	}
+	return 0;
+}
+
 static int notrace psz_pstore_write(struct pstore_record *record)
 {
 	struct psz_context *cxt = record->psi->data;
@@ -578,6 +770,8 @@ static int notrace psz_pstore_write(struct pstore_record *record)
 	switch (record->type) {
 	case PSTORE_TYPE_DMESG:
 		return psz_oops_write(cxt, record);
+	case PSTORE_TYPE_PMSG:
+		return psz_record_write(cxt->ppsz, record);
 	default:
 		return -EINVAL;
 	}
@@ -593,6 +787,13 @@ static struct psz_zone *psz_read_next_zone(struct psz_context *cxt)
 			return zone;
 	}
 
+	if (cxt->pmsg_read_cnt == 0) {
+		cxt->pmsg_read_cnt++;
+		zone = cxt->ppsz;
+		if (psz_old_ok(zone))
+			return zone;
+	}
+
 	return NULL;
 }
 
@@ -642,7 +843,7 @@ static ssize_t psz_oops_read(struct psz_zone *zone,
 			return -ENOMEM;
 	}
 
-	size = psz_zone_read(zone, record->buf + hlen, size,
+	size = psz_zone_read_buffer(zone, record->buf + hlen, size,
 			sizeof(struct psz_oops_header) < 0);
 	if (unlikely(size < 0)) {
 		kfree(record->buf);
@@ -652,6 +853,32 @@ static ssize_t psz_oops_read(struct psz_zone *zone,
 	return size + hlen;
 }
 
+static ssize_t psz_record_read(struct psz_zone *zone,
+		struct pstore_record *record)
+{
+	size_t len;
+	struct psz_buffer *buf;
+
+	if (!zone || !record)
+		return -ENOSPC;
+
+	buf = (struct psz_buffer *)zone->oldbuf;
+	if (!buf)
+		return -ENOMSG;
+
+	len = atomic_read(&buf->datalen);
+	record->buf = kmalloc(len, GFP_KERNEL);
+	if (!record->buf)
+		return -ENOMEM;
+
+	if (unlikely(psz_zone_read_oldbuf(zone, record->buf, len, 0))) {
+		kfree(record->buf);
+		return -ENOMSG;
+	}
+
+	return len;
+}
+
 static ssize_t psz_pstore_read(struct pstore_record *record)
 {
 	struct psz_context *cxt = record->psi->data;
@@ -676,6 +903,9 @@ static ssize_t psz_pstore_read(struct pstore_record *record)
 		readop = psz_oops_read;
 		record->id = cxt->oops_read_cnt - 1;
 		break;
+	case PSTORE_TYPE_PMSG:
+		readop = psz_record_read;
+		break;
 	default:
 		goto next_zone;
 	}
@@ -731,8 +961,10 @@ static struct psz_zone *psz_init_zone(enum pstore_type_id type,
 	zone->type = type;
 	zone->buffer_size = size - sizeof(struct psz_buffer);
 	zone->buffer->sig = type ^ PSZ_SIG;
+	zone->oldbuf = NULL;
 	atomic_set(&zone->dirty, 0);
 	atomic_set(&zone->buffer->datalen, 0);
+	atomic_set(&zone->buffer->start, 0);
 
 	*off += size;
 
@@ -816,6 +1048,8 @@ static void psz_free_all_zones(struct psz_context *cxt)
 {
 	if (cxt->opszs)
 		psz_free_zones(&cxt->opszs, &cxt->oops_max_cnt);
+	if (cxt->ppsz)
+		psz_free_zone(&cxt->ppsz);
 }
 
 static int psz_alloc_zones(struct psz_context *cxt)
@@ -823,18 +1057,26 @@ static int psz_alloc_zones(struct psz_context *cxt)
 	struct psz_info *info = cxt->psz_info;
 	loff_t off = 0;
 	int err;
-	size_t size;
+	size_t off_size = 0;
+
+	off_size += info->pmsg_size;
+	cxt->ppsz = psz_init_zone(PSTORE_TYPE_PMSG, &off, info->pmsg_size);
+	if (IS_ERR(cxt->ppsz)) {
+		err = PTR_ERR(cxt->ppsz);
+		goto free_out;
+	}
 
-	size = info->total_size;
-	cxt->opszs = psz_init_zones(PSTORE_TYPE_DMESG, &off, size,
+	cxt->opszs = psz_init_zones(PSTORE_TYPE_DMESG, &off,
+			info->total_size - off_size,
 			info->oops_size, &cxt->oops_max_cnt);
 	if (IS_ERR(cxt->opszs)) {
 		err = PTR_ERR(cxt->opszs);
-		goto fail_out;
+		goto free_out;
 	}
 
 	return 0;
-fail_out:
+free_out:
+	psz_free_all_zones(cxt);
 	return err;
 }
 
@@ -858,7 +1100,7 @@ int psz_register(struct psz_info *info)
 		return -EINVAL;
 	}
 
-	if (!info->oops_size) {
+	if (!info->oops_size && !info->pmsg_size) {
 		pr_warn("at least one of the records be non-zero\n");
 		return -EINVAL;
 	}
@@ -885,6 +1127,7 @@ int psz_register(struct psz_info *info)
 
 	check_size(total_size, 4096);
 	check_size(oops_size, SECTOR_SIZE);
+	check_size(pmsg_size, SECTOR_SIZE);
 
 #undef check_size
 
@@ -916,6 +1159,7 @@ int psz_register(struct psz_info *info)
 	pr_debug("register %s with properties:\n", info->name);
 	pr_debug("\ttotal size : %ld Bytes\n", info->total_size);
 	pr_debug("\toops size : %ld Bytes\n", info->oops_size);
+	pr_debug("\tpmsg size : %ld Bytes\n", info->pmsg_size);
 
 	err = psz_alloc_zones(cxt);
 	if (err) {
@@ -934,11 +1178,14 @@ int psz_register(struct psz_info *info)
 	}
 	cxt->pstore.data = cxt;
 	if (info->oops_size)
-		cxt->pstore.flags = PSTORE_FLAGS_DMESG;
+		cxt->pstore.flags |= PSTORE_FLAGS_DMESG;
+	if (info->pmsg_size)
+		cxt->pstore.flags |= PSTORE_FLAGS_PMSG;
 
-	pr_info("Registered %s as pszone backend for%s%s\n", info->name,
+	pr_info("Registered %s as pszone backend for%s%s%s\n", info->name,
 			cxt->opszs && cxt->psz_info->dump_oops ? " Oops" : "",
-			cxt->opszs && cxt->psz_info->panic_write ? " Panic" : "");
+			cxt->opszs && cxt->psz_info->panic_write ? " Panic" : "",
+			cxt->ppsz ? " Pmsg" : "");
 
 	err = pstore_register(&cxt->pstore);
 	if (err) {
diff --git a/include/linux/pstore_zone.h b/include/linux/pstore_zone.h
index afc10a67a777..85e159d8f935 100644
--- a/include/linux/pstore_zone.h
+++ b/include/linux/pstore_zone.h
@@ -16,6 +16,7 @@
  *		than 4096 and be multiple of 4096.
  * @oops_size:	The size of oops/panic zone. Zero means disabled, otherwise,
  *		it must be multiple of SECTOR_SIZE(512 Bytes).
+ * @pmsg_size:	The size of pmsg zone which is the same as @oops_size.
  * @dump_oops:	Whether to dump oops log.
  * @read:	The general read operation. Both of the function parameters
  *		@size and @offset are relative value to storage.
@@ -32,6 +33,7 @@ struct psz_info {
 
 	unsigned long total_size;
 	unsigned long oops_size;
+	unsigned long pmsg_size;
 	int dump_oops;
 	psz_read_op read;
 	psz_write_op write;
-- 
1.9.1


______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 05/11] pstore/blk: blkoops: support console recorder
  2020-03-25  8:54 [PATCH v3 00/11] pstore: mtd: support crash log to block and mtd device WeiXiong Liao
                   ` (3 preceding siblings ...)
  2020-03-25  8:54 ` [PATCH v3 04/11] pstore/blk: pstore/zone: support pmsg recorder WeiXiong Liao
@ 2020-03-25  8:55 ` WeiXiong Liao
  2020-04-14  2:13   ` WeiXiong Liao
  2020-03-25  8:55 ` [PATCH v3 06/11] pstore/blk: blkoops: support ftrace recorder WeiXiong Liao
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 17+ messages in thread
From: WeiXiong Liao @ 2020-03-25  8:55 UTC (permalink / raw)
  To: Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, Mauro Carvalho Chehab, Rob Herring,
	David S. Miller, Greg Kroah-Hartman, Jonathan Cameron
  Cc: WeiXiong Liao, linux-mtd, linux-kernel, linux-doc

Support recorder for console. To enable console recorder, just make
console_size be greater than 0 and a multiple of 4096.

Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com>
---
 fs/pstore/Kconfig           | 12 ++++++++
 fs/pstore/pstore_blk.c      | 12 +++++++-
 fs/pstore/pstore_zone.c     | 70 +++++++++++++++++++++++++++++++++++++++++----
 include/linux/pstore_zone.h |  4 ++-
 4 files changed, 91 insertions(+), 7 deletions(-)

diff --git a/fs/pstore/Kconfig b/fs/pstore/Kconfig
index 8cead860dcfc..bf90de48ad3c 100644
--- a/fs/pstore/Kconfig
+++ b/fs/pstore/Kconfig
@@ -225,6 +225,18 @@ config PSTORE_BLK_PMSG_SIZE
 	  NOTE that, both Kconfig and module parameters can configure
 	  pstore/blk, but module parameters have priority over Kconfig.
 
+config PSTORE_BLK_CONSOLE_SIZE
+	int "Size in Kbytes of console to store"
+	depends on PSTORE_BLK
+	depends on PSTORE_CONSOLE
+	default 64
+	help
+	  This just sets size of console (console_size) for pstore/blk. The
+	  size is in KB and must be a multiple of 4.
+
+	  NOTE that, both Kconfig and module parameters can configure
+	  pstore/blk, but module parameters have priority over Kconfig.
+
 config PSTORE_BLK_DUMP_OOPS
 	bool "dump oops"
 	depends on PSTORE_BLK
diff --git a/fs/pstore/pstore_blk.c b/fs/pstore/pstore_blk.c
index 85cd9f2335be..2b513acaa18f 100644
--- a/fs/pstore/pstore_blk.c
+++ b/fs/pstore/pstore_blk.c
@@ -26,6 +26,14 @@
 module_param(pmsg_size, long, 0400);
 MODULE_PARM_DESC(pmsg_size, "pmsg size in kbytes");
 
+#if IS_ENABLED(CONFIG_PSTORE_CONSOLE)
+static long console_size = CONFIG_PSTORE_BLK_CONSOLE_SIZE;
+#else
+static long console_size = -1;
+#endif
+module_param(console_size, long, 0400);
+MODULE_PARM_DESC(console_size, "console size in kbytes");
+
 static int dump_oops = CONFIG_PSTORE_BLK_DUMP_OOPS;
 module_param(dump_oops, int, 0400);
 MODULE_PARM_DESC(total_size, "whether dump oops");
@@ -81,7 +89,8 @@
  *		whole disk).
  *		On success, the number of bytes should be returned, others
  *		means error.
- * @write:	The same as @read.
+ * @write:	The same as @read, but the following error number:
+ *		-EBUSY means try to write again later.
  * @panic_write:The write operation only used for panic case. It's optional
  *		if you do not care panic log. The parameters and return value
  *		are the same as @read.
@@ -131,6 +140,7 @@ static int psblk_register_do(struct psblk_device *dev)
 
 	verify_size(oops_size, 4096, dev->flags & PSTORE_FLAGS_DMESG);
 	verify_size(pmsg_size, 4096, dev->flags & PSTORE_FLAGS_PMSG);
+	verify_size(console_size, 4096, dev->flags & PSTORE_FLAGS_CONSOLE);
 #undef verify_size
 	dump_oops = dump_oops <= 0 ? 0 : 1;
 
diff --git a/fs/pstore/pstore_zone.c b/fs/pstore/pstore_zone.c
index 444bce7f9ac3..e1e84505b046 100644
--- a/fs/pstore/pstore_zone.c
+++ b/fs/pstore/pstore_zone.c
@@ -87,10 +87,12 @@ struct psz_zone {
  *
  * @opszs: oops/panic storage zones
  * @ppsz: pmsg storage zone
+ * @cpsz: console storage zone
  * @oops_max_cnt: max count of @opszs
  * @oops_read_cnt: counter to read oops zone
  * @oops_write_cnt: counter to write
  * @pmsg_read_cnt: counter to read pmsg zone
+ * @console_read_cnt: counter to read console zone
  * @oops_counter: counter to oops
  * @panic_counter: counter to panic
  * @recovered: whether finish recovering data from storage
@@ -102,10 +104,12 @@ struct psz_zone {
 struct psz_context {
 	struct psz_zone **opszs;
 	struct psz_zone *ppsz;
+	struct psz_zone *cpsz;
 	unsigned int oops_max_cnt;
 	unsigned int oops_read_cnt;
 	unsigned int oops_write_cnt;
 	unsigned int pmsg_read_cnt;
+	unsigned int console_read_cnt;
 	/*
 	 * the counter should be recovered when recover.
 	 * It records the oops/panic times after burning rather than booting.
@@ -125,6 +129,9 @@ struct psz_context {
 };
 static struct psz_context psz_cxt;
 
+static void psz_flush_all_dirty_zones(struct work_struct *);
+static DECLARE_WORK(psz_cleaner, psz_flush_all_dirty_zones);
+
 /**
  * enum psz_flush_mode - flush mode for psz_zone_write()
  *
@@ -235,6 +242,9 @@ static int psz_zone_write(struct psz_zone *zone,
 	return 0;
 dirty:
 	atomic_set(&zone->dirty, true);
+	/* flush dirty zones nicely */
+	if (wcnt == -EBUSY && !is_on_panic())
+		schedule_work(&psz_cleaner);
 	return -EBUSY;
 }
 
@@ -291,6 +301,15 @@ static int psz_move_zone(struct psz_zone *old, struct psz_zone *new)
 	return 0;
 }
 
+static void psz_flush_all_dirty_zones(struct work_struct *work)
+{
+	struct psz_context *cxt = &psz_cxt;
+
+	psz_flush_dirty_zone(cxt->ppsz);
+	psz_flush_dirty_zone(cxt->cpsz);
+	psz_flush_dirty_zones(cxt->opszs, cxt->oops_max_cnt);
+}
+
 static int psz_recover_oops_data(struct psz_context *cxt)
 {
 	struct psz_info *info = cxt->psz_info;
@@ -546,6 +565,10 @@ static inline int psz_recovery(struct psz_context *cxt)
 	if (ret)
 		goto recover_fail;
 
+	ret = psz_recover_zone(cxt, cxt->cpsz);
+	if (ret)
+		goto recover_fail;
+
 	pr_debug("recover end!\n");
 	atomic_set(&cxt->recovered, 1);
 	return 0;
@@ -561,6 +584,7 @@ static int psz_pstore_open(struct pstore_info *psi)
 
 	cxt->oops_read_cnt = 0;
 	cxt->pmsg_read_cnt = 0;
+	cxt->console_read_cnt = 0;
 	return 0;
 }
 
@@ -625,8 +649,9 @@ static int psz_pstore_erase(struct pstore_record *record)
 		return psz_oops_erase(cxt, cxt->opszs[record->id], record);
 	case PSTORE_TYPE_PMSG:
 		return psz_record_erase(cxt, cxt->ppsz);
-	default:
-		return -EINVAL;
+	case PSTORE_TYPE_CONSOLE:
+		return psz_record_erase(cxt, cxt->cpsz);
+	default: return -EINVAL;
 	}
 }
 
@@ -767,9 +792,18 @@ static int notrace psz_pstore_write(struct pstore_record *record)
 			record->reason == KMSG_DUMP_PANIC)
 		atomic_set(&cxt->on_panic, 1);
 
+	/*
+	 * if on panic, do not write except panic records
+	 * Fix case that panic_write prints log which wakes up console recorder.
+	 */
+	if (is_on_panic() && record->type != PSTORE_TYPE_DMESG)
+		return -EBUSY;
+
 	switch (record->type) {
 	case PSTORE_TYPE_DMESG:
 		return psz_oops_write(cxt, record);
+	case PSTORE_TYPE_CONSOLE:
+		return psz_record_write(cxt->cpsz, record);
 	case PSTORE_TYPE_PMSG:
 		return psz_record_write(cxt->ppsz, record);
 	default:
@@ -794,6 +828,13 @@ static struct psz_zone *psz_read_next_zone(struct psz_context *cxt)
 			return zone;
 	}
 
+	if (cxt->console_read_cnt == 0) {
+		cxt->console_read_cnt++;
+		zone = cxt->cpsz;
+		if (psz_old_ok(zone))
+			return zone;
+	}
+
 	return NULL;
 }
 
@@ -903,6 +944,8 @@ static ssize_t psz_pstore_read(struct pstore_record *record)
 		readop = psz_oops_read;
 		record->id = cxt->oops_read_cnt - 1;
 		break;
+	case PSTORE_TYPE_CONSOLE:
+		fallthrough;
 	case PSTORE_TYPE_PMSG:
 		readop = psz_record_read;
 		break;
@@ -1050,6 +1093,8 @@ static void psz_free_all_zones(struct psz_context *cxt)
 		psz_free_zones(&cxt->opszs, &cxt->oops_max_cnt);
 	if (cxt->ppsz)
 		psz_free_zone(&cxt->ppsz);
+	if (cxt->cpsz)
+		psz_free_zone(&cxt->cpsz);
 }
 
 static int psz_alloc_zones(struct psz_context *cxt)
@@ -1066,6 +1111,14 @@ static int psz_alloc_zones(struct psz_context *cxt)
 		goto free_out;
 	}
 
+	off_size += info->console_size;
+	cxt->cpsz = psz_init_zone(PSTORE_TYPE_CONSOLE, &off,
+			info->console_size);
+	if (IS_ERR(cxt->cpsz)) {
+		err = PTR_ERR(cxt->cpsz);
+		goto free_out;
+	}
+
 	cxt->opszs = psz_init_zones(PSTORE_TYPE_DMESG, &off,
 			info->total_size - off_size,
 			info->oops_size, &cxt->oops_max_cnt);
@@ -1100,7 +1153,7 @@ int psz_register(struct psz_info *info)
 		return -EINVAL;
 	}
 
-	if (!info->oops_size && !info->pmsg_size) {
+	if (!info->oops_size && !info->pmsg_size && !info->console_size) {
 		pr_warn("at least one of the records be non-zero\n");
 		return -EINVAL;
 	}
@@ -1128,6 +1181,7 @@ int psz_register(struct psz_info *info)
 	check_size(total_size, 4096);
 	check_size(oops_size, SECTOR_SIZE);
 	check_size(pmsg_size, SECTOR_SIZE);
+	check_size(console_size, SECTOR_SIZE);
 
 #undef check_size
 
@@ -1160,6 +1214,7 @@ int psz_register(struct psz_info *info)
 	pr_debug("\ttotal size : %ld Bytes\n", info->total_size);
 	pr_debug("\toops size : %ld Bytes\n", info->oops_size);
 	pr_debug("\tpmsg size : %ld Bytes\n", info->pmsg_size);
+	pr_debug("\tconsole size : %ld Bytes\n", info->console_size);
 
 	err = psz_alloc_zones(cxt);
 	if (err) {
@@ -1181,11 +1236,14 @@ int psz_register(struct psz_info *info)
 		cxt->pstore.flags |= PSTORE_FLAGS_DMESG;
 	if (info->pmsg_size)
 		cxt->pstore.flags |= PSTORE_FLAGS_PMSG;
+	if (info->console_size)
+		cxt->pstore.flags |= PSTORE_FLAGS_CONSOLE;
 
-	pr_info("Registered %s as pszone backend for%s%s%s\n", info->name,
+	pr_info("Registered %s as pszone backend for%s%s%s%s\n", info->name,
 			cxt->opszs && cxt->psz_info->dump_oops ? " Oops" : "",
 			cxt->opszs && cxt->psz_info->panic_write ? " Panic" : "",
-			cxt->ppsz ? " Pmsg" : "");
+			cxt->ppsz ? " Pmsg" : "",
+			cxt->cpsz ? " Console" : "");
 
 	err = pstore_register(&cxt->pstore);
 	if (err) {
@@ -1219,6 +1277,8 @@ void psz_unregister(struct psz_info *info)
 {
 	struct psz_context *cxt = &psz_cxt;
 
+	flush_work(&psz_cleaner);
+
 	pstore_unregister(&cxt->pstore);
 	kfree(cxt->pstore.buf);
 	cxt->pstore.bufsize = 0;
diff --git a/include/linux/pstore_zone.h b/include/linux/pstore_zone.h
index 85e159d8f935..8a1838633010 100644
--- a/include/linux/pstore_zone.h
+++ b/include/linux/pstore_zone.h
@@ -17,12 +17,13 @@
  * @oops_size:	The size of oops/panic zone. Zero means disabled, otherwise,
  *		it must be multiple of SECTOR_SIZE(512 Bytes).
  * @pmsg_size:	The size of pmsg zone which is the same as @oops_size.
+ * @console_size:The size of console zone which is the same as @oops_size.
  * @dump_oops:	Whether to dump oops log.
  * @read:	The general read operation. Both of the function parameters
  *		@size and @offset are relative value to storage.
  *		On success, the number of bytes should be returned, others
  *		means error.
- * @write:	The same as @read.
+ * @write:	The same as @read, but -EBUSY means try to write again later.
  * @panic_write:The write operation only used for panic case. It's optional
  *		if you do not care panic log. The parameters and return value
  *		are the same as @read.
@@ -34,6 +35,7 @@ struct psz_info {
 	unsigned long total_size;
 	unsigned long oops_size;
 	unsigned long pmsg_size;
+	unsigned long console_size;
 	int dump_oops;
 	psz_read_op read;
 	psz_write_op write;
-- 
1.9.1


______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 06/11] pstore/blk: blkoops: support ftrace recorder
  2020-03-25  8:54 [PATCH v3 00/11] pstore: mtd: support crash log to block and mtd device WeiXiong Liao
                   ` (4 preceding siblings ...)
  2020-03-25  8:55 ` [PATCH v3 05/11] pstore/blk: blkoops: support console recorder WeiXiong Liao
@ 2020-03-25  8:55 ` WeiXiong Liao
  2020-03-25  8:55 ` [PATCH v3 07/11] Documentation: create document for pstore/blk WeiXiong Liao
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: WeiXiong Liao @ 2020-03-25  8:55 UTC (permalink / raw)
  To: Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, Mauro Carvalho Chehab, Rob Herring,
	David S. Miller, Greg Kroah-Hartman, Jonathan Cameron
  Cc: WeiXiong Liao, linux-mtd, linux-kernel, linux-doc

Support recorder for ftrace. To enable ftrace recorder, just make
ftrace_size be greater than 0 and a multiple of 4096.

Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com>
---
 fs/pstore/Kconfig           |  12 ++++
 fs/pstore/pstore_blk.c      |   9 +++
 fs/pstore/pstore_zone.c     | 172 +++++++++++++++++++++++++++++++++++++++++++-
 include/linux/pstore_zone.h |   2 +
 4 files changed, 193 insertions(+), 2 deletions(-)

diff --git a/fs/pstore/Kconfig b/fs/pstore/Kconfig
index bf90de48ad3c..55ce726be795 100644
--- a/fs/pstore/Kconfig
+++ b/fs/pstore/Kconfig
@@ -237,6 +237,18 @@ config PSTORE_BLK_CONSOLE_SIZE
 	  NOTE that, both Kconfig and module parameters can configure
 	  pstore/blk, but module parameters have priority over Kconfig.
 
+config PSTORE_BLK_FTRACE_SIZE
+	int "Size in Kbytes of ftarce to store"
+	depends on PSTORE_BLK
+	depends on PSTORE_FTRACE
+	default 64
+	help
+	  This just sets size of ftrace (ftrace_size) for pstore/blk. The
+	  size is in KB and must be a multiple of 4.
+
+	  NOTE that, both Kconfig and module parameters can configure
+	  pstore/blk, but module parameters have priority over Kconfig.
+
 config PSTORE_BLK_DUMP_OOPS
 	bool "dump oops"
 	depends on PSTORE_BLK
diff --git a/fs/pstore/pstore_blk.c b/fs/pstore/pstore_blk.c
index 2b513acaa18f..560e7b3f0945 100644
--- a/fs/pstore/pstore_blk.c
+++ b/fs/pstore/pstore_blk.c
@@ -34,6 +34,14 @@
 module_param(console_size, long, 0400);
 MODULE_PARM_DESC(console_size, "console size in kbytes");
 
+#if IS_ENABLED(CONFIG_PSTORE_FTRACE)
+static long ftrace_size = CONFIG_PSTORE_BLK_FTRACE_SIZE;
+#else
+static long ftrace_size = -1;
+#endif
+module_param(ftrace_size, long, 0400);
+MODULE_PARM_DESC(ftrace_size, "ftrace size in kbytes");
+
 static int dump_oops = CONFIG_PSTORE_BLK_DUMP_OOPS;
 module_param(dump_oops, int, 0400);
 MODULE_PARM_DESC(total_size, "whether dump oops");
@@ -141,6 +149,7 @@ static int psblk_register_do(struct psblk_device *dev)
 	verify_size(oops_size, 4096, dev->flags & PSTORE_FLAGS_DMESG);
 	verify_size(pmsg_size, 4096, dev->flags & PSTORE_FLAGS_PMSG);
 	verify_size(console_size, 4096, dev->flags & PSTORE_FLAGS_CONSOLE);
+	verify_size(ftrace_size, 4096, dev->flags & PSTORE_FLAGS_FTRACE);
 #undef verify_size
 	dump_oops = dump_oops <= 0 ? 0 : 1;
 
diff --git a/fs/pstore/pstore_zone.c b/fs/pstore/pstore_zone.c
index e1e84505b046..1da4fc760d31 100644
--- a/fs/pstore/pstore_zone.c
+++ b/fs/pstore/pstore_zone.c
@@ -88,11 +88,14 @@ struct psz_zone {
  * @opszs: oops/panic storage zones
  * @ppsz: pmsg storage zone
  * @cpsz: console storage zone
+ * @fpszs: ftrace storage zones
  * @oops_max_cnt: max count of @opszs
  * @oops_read_cnt: counter to read oops zone
  * @oops_write_cnt: counter to write
  * @pmsg_read_cnt: counter to read pmsg zone
  * @console_read_cnt: counter to read console zone
+ * @ftrace_max_cnt: max count of @fpszs
+ * @ftrace_read_cnt: counter to read ftrace zone
  * @oops_counter: counter to oops
  * @panic_counter: counter to panic
  * @recovered: whether finish recovering data from storage
@@ -105,11 +108,14 @@ struct psz_context {
 	struct psz_zone **opszs;
 	struct psz_zone *ppsz;
 	struct psz_zone *cpsz;
+	struct psz_zone **fpszs;
 	unsigned int oops_max_cnt;
 	unsigned int oops_read_cnt;
 	unsigned int oops_write_cnt;
 	unsigned int pmsg_read_cnt;
 	unsigned int console_read_cnt;
+	unsigned int ftrace_max_cnt;
+	unsigned int ftrace_read_cnt;
 	/*
 	 * the counter should be recovered when recover.
 	 * It records the oops/panic times after burning rather than booting.
@@ -308,6 +314,7 @@ static void psz_flush_all_dirty_zones(struct work_struct *work)
 	psz_flush_dirty_zone(cxt->ppsz);
 	psz_flush_dirty_zone(cxt->cpsz);
 	psz_flush_dirty_zones(cxt->opszs, cxt->oops_max_cnt);
+	psz_flush_dirty_zones(cxt->fpszs, cxt->ftrace_max_cnt);
 }
 
 static int psz_recover_oops_data(struct psz_context *cxt)
@@ -542,6 +549,31 @@ static int psz_recover_zone(struct psz_context *cxt, struct psz_zone *zone)
 	return ret;
 }
 
+static int psz_recover_zones(struct psz_context *cxt,
+		struct psz_zone **zones, unsigned int cnt)
+{
+	int ret;
+	unsigned int i;
+	struct psz_zone *zone;
+
+	if (!zones)
+		return 0;
+
+	for (i = 0; i < cnt; i++) {
+		zone = zones[i];
+		if (unlikely(!zone))
+			continue;
+		ret = psz_recover_zone(cxt, zone);
+		if (ret)
+			goto recover_fail;
+	}
+
+	return 0;
+recover_fail:
+	pr_debug("recover %s[%u] failed\n", zone->name, i);
+	return ret;
+}
+
 /**
  * psz_recovery() - recover data from storage
  * @cxt: the context of pstore/zone
@@ -569,6 +601,10 @@ static inline int psz_recovery(struct psz_context *cxt)
 	if (ret)
 		goto recover_fail;
 
+	ret = psz_recover_zones(cxt, cxt->fpszs, cxt->ftrace_max_cnt);
+	if (ret)
+		goto recover_fail;
+
 	pr_debug("recover end!\n");
 	atomic_set(&cxt->recovered, 1);
 	return 0;
@@ -585,6 +621,7 @@ static int psz_pstore_open(struct pstore_info *psi)
 	cxt->oops_read_cnt = 0;
 	cxt->pmsg_read_cnt = 0;
 	cxt->console_read_cnt = 0;
+	cxt->ftrace_read_cnt = 0;
 	return 0;
 }
 
@@ -651,6 +688,10 @@ static int psz_pstore_erase(struct pstore_record *record)
 		return psz_record_erase(cxt, cxt->ppsz);
 	case PSTORE_TYPE_CONSOLE:
 		return psz_record_erase(cxt, cxt->cpsz);
+	case PSTORE_TYPE_FTRACE:
+		if (record->id >= cxt->ftrace_max_cnt)
+			return -EINVAL;
+		return psz_record_erase(cxt, cxt->fpszs[record->id]);
 	default: return -EINVAL;
 	}
 }
@@ -806,6 +847,13 @@ static int notrace psz_pstore_write(struct pstore_record *record)
 		return psz_record_write(cxt->cpsz, record);
 	case PSTORE_TYPE_PMSG:
 		return psz_record_write(cxt->ppsz, record);
+	case PSTORE_TYPE_FTRACE: {
+		int zonenum = smp_processor_id();
+
+		if (!cxt->fpszs)
+			return -ENOSPC;
+		return psz_record_write(cxt->fpszs[zonenum], record);
+	}
 	default:
 		return -EINVAL;
 	}
@@ -821,6 +869,14 @@ static struct psz_zone *psz_read_next_zone(struct psz_context *cxt)
 			return zone;
 	}
 
+	if (cxt->ftrace_read_cnt < cxt->ftrace_max_cnt)
+		/*
+		 * No need psz_old_ok(). Let psz_ftrace_read() do so for
+		 * combination. psz_ftrace_read() should traverse over
+		 * all zones in case of some zone without data.
+		 */
+		return cxt->fpszs[cxt->ftrace_read_cnt++];
+
 	if (cxt->pmsg_read_cnt == 0) {
 		cxt->pmsg_read_cnt++;
 		zone = cxt->ppsz;
@@ -894,6 +950,98 @@ static ssize_t psz_oops_read(struct psz_zone *zone,
 	return size + hlen;
 }
 
+static int psz_ftrace_combine(char *src1_buf, size_t src1_size,
+		char *src2_buf, size_t src2_size,
+		char **dest_buf, size_t *dest_size)
+{
+	size_t src1_off, src2_off, total;
+	size_t src1_idx = 0, src2_idx = 0, merged_idx = 0;
+	void *merged_buf;
+	struct pstore_ftrace_record *mrec, *s1rec, *s2rec;
+	size_t record_size = sizeof(struct pstore_ftrace_record);
+
+	src1_off = src1_size % record_size;
+	src1_size -= src1_off;
+
+	src2_off = src2_size % record_size;
+	src2_size -= src2_off;
+
+	total = src1_size + src2_size;
+	merged_buf = kmalloc(total, GFP_KERNEL);
+	if (!merged_buf)
+		return -ENOMEM;
+
+	s1rec = (struct pstore_ftrace_record *)(src1_buf + src1_off);
+	s2rec = (struct pstore_ftrace_record *)(src2_buf + src2_off);
+	mrec = (struct pstore_ftrace_record *)(merged_buf);
+
+	while (src1_size > 0 && src2_size > 0) {
+		u64 s1_ts, s2_ts;
+
+		s1_ts = pstore_ftrace_read_timestamp(&s1rec[src1_idx]);
+		s2_ts = pstore_ftrace_read_timestamp(&s2rec[src2_idx]);
+		if (s1_ts < s2_ts) {
+			mrec[merged_idx++] = s1rec[src1_idx++];
+			src1_size -= record_size;
+		} else {
+			mrec[merged_idx++] = s2rec[src2_idx++];
+			src2_size -= record_size;
+		}
+	}
+
+	while (src1_size > 0) {
+		mrec[merged_idx++] = s1rec[src1_idx++];
+		src1_size -= record_size;
+	}
+
+	while (src2_size > 0) {
+		mrec[merged_idx++] = s2rec[src2_idx++];
+		src2_size -= record_size;
+	}
+
+	*dest_buf = merged_buf;
+	*dest_size = total;
+	return 0;
+}
+
+/* try to combine all ftrace zones */
+static ssize_t psz_ftrace_read(struct psz_zone *zone,
+		struct pstore_record *record)
+{
+	struct psz_context *cxt = record->psi->data;
+	struct psz_buffer *buf;
+	char *dest;
+	size_t dest_size;
+	int ret;
+
+	if (!zone || !record)
+		return -ENOSPC;
+
+	if (!psz_old_ok(zone))
+		goto out;
+
+	buf = (struct psz_buffer *)zone->oldbuf;
+	if (!buf)
+		return -ENOMSG;
+
+	ret = psz_ftrace_combine(record->buf, record->size,
+			(char *)buf->data, atomic_read(&buf->datalen),
+			&dest, &dest_size);
+	if (unlikely(ret))
+		return ret;
+
+	kfree(record->buf);
+	record->buf = dest;
+	record->size = dest_size;
+
+out:
+	if (cxt->ftrace_read_cnt < cxt->ftrace_max_cnt)
+		/* then, read next ftrace zone */
+		return -ENOMSG;
+	record->id = 0;
+	return record->size ? record->size : -ENOMSG;
+}
+
 static ssize_t psz_record_read(struct psz_zone *zone,
 		struct pstore_record *record)
 {
@@ -944,6 +1092,9 @@ static ssize_t psz_pstore_read(struct pstore_record *record)
 		readop = psz_oops_read;
 		record->id = cxt->oops_read_cnt - 1;
 		break;
+	case PSTORE_TYPE_FTRACE:
+		readop = psz_ftrace_read;
+		break;
 	case PSTORE_TYPE_CONSOLE:
 		fallthrough;
 	case PSTORE_TYPE_PMSG:
@@ -1095,6 +1246,8 @@ static void psz_free_all_zones(struct psz_context *cxt)
 		psz_free_zone(&cxt->ppsz);
 	if (cxt->cpsz)
 		psz_free_zone(&cxt->cpsz);
+	if (cxt->fpszs)
+		psz_free_zones(&cxt->fpszs, &cxt->ftrace_max_cnt);
 }
 
 static int psz_alloc_zones(struct psz_context *cxt)
@@ -1119,6 +1272,16 @@ static int psz_alloc_zones(struct psz_context *cxt)
 		goto free_out;
 	}
 
+	off_size += info->ftrace_size;
+	cxt->fpszs = psz_init_zones(PSTORE_TYPE_FTRACE, &off,
+			info->ftrace_size,
+			info->ftrace_size / nr_cpu_ids,
+			&cxt->ftrace_max_cnt);
+	if (IS_ERR(cxt->fpszs)) {
+		err = PTR_ERR(cxt->fpszs);
+		goto free_out;
+	}
+
 	cxt->opszs = psz_init_zones(PSTORE_TYPE_DMESG, &off,
 			info->total_size - off_size,
 			info->oops_size, &cxt->oops_max_cnt);
@@ -1182,6 +1345,7 @@ int psz_register(struct psz_info *info)
 	check_size(oops_size, SECTOR_SIZE);
 	check_size(pmsg_size, SECTOR_SIZE);
 	check_size(console_size, SECTOR_SIZE);
+	check_size(ftrace_size, SECTOR_SIZE);
 
 #undef check_size
 
@@ -1215,6 +1379,7 @@ int psz_register(struct psz_info *info)
 	pr_debug("\toops size : %ld Bytes\n", info->oops_size);
 	pr_debug("\tpmsg size : %ld Bytes\n", info->pmsg_size);
 	pr_debug("\tconsole size : %ld Bytes\n", info->console_size);
+	pr_debug("\tftrace size : %ld Bytes\n", info->ftrace_size);
 
 	err = psz_alloc_zones(cxt);
 	if (err) {
@@ -1238,12 +1403,15 @@ int psz_register(struct psz_info *info)
 		cxt->pstore.flags |= PSTORE_FLAGS_PMSG;
 	if (info->console_size)
 		cxt->pstore.flags |= PSTORE_FLAGS_CONSOLE;
+	if (info->ftrace_size)
+		cxt->pstore.flags |= PSTORE_FLAGS_FTRACE;
 
-	pr_info("Registered %s as pszone backend for%s%s%s%s\n", info->name,
+	pr_info("Registered %s as pszone backend for%s%s%s%s%s\n", info->name,
 			cxt->opszs && cxt->psz_info->dump_oops ? " Oops" : "",
 			cxt->opszs && cxt->psz_info->panic_write ? " Panic" : "",
 			cxt->ppsz ? " Pmsg" : "",
-			cxt->cpsz ? " Console" : "");
+			cxt->cpsz ? " Console" : "",
+			cxt->fpszs ? " Ftrace" : "");
 
 	err = pstore_register(&cxt->pstore);
 	if (err) {
diff --git a/include/linux/pstore_zone.h b/include/linux/pstore_zone.h
index 8a1838633010..a138e8b7dc20 100644
--- a/include/linux/pstore_zone.h
+++ b/include/linux/pstore_zone.h
@@ -18,6 +18,7 @@
  *		it must be multiple of SECTOR_SIZE(512 Bytes).
  * @pmsg_size:	The size of pmsg zone which is the same as @oops_size.
  * @console_size:The size of console zone which is the same as @oops_size.
+ * @ftrace_size:The size of ftrace zone which is the same as @oops_size.
  * @dump_oops:	Whether to dump oops log.
  * @read:	The general read operation. Both of the function parameters
  *		@size and @offset are relative value to storage.
@@ -36,6 +37,7 @@ struct psz_info {
 	unsigned long oops_size;
 	unsigned long pmsg_size;
 	unsigned long console_size;
+	unsigned long ftrace_size;
 	int dump_oops;
 	psz_read_op read;
 	psz_write_op write;
-- 
1.9.1


______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 07/11] Documentation: create document for pstore/blk
  2020-03-25  8:54 [PATCH v3 00/11] pstore: mtd: support crash log to block and mtd device WeiXiong Liao
                   ` (5 preceding siblings ...)
  2020-03-25  8:55 ` [PATCH v3 06/11] pstore/blk: blkoops: support ftrace recorder WeiXiong Liao
@ 2020-03-25  8:55 ` WeiXiong Liao
  2020-03-25  8:55 ` [PATCH v3 08/11] pstore/zone: skip broken zone for MTD device WeiXiong Liao
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: WeiXiong Liao @ 2020-03-25  8:55 UTC (permalink / raw)
  To: Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, Mauro Carvalho Chehab, Rob Herring,
	David S. Miller, Greg Kroah-Hartman, Jonathan Cameron
  Cc: WeiXiong Liao, linux-mtd, linux-kernel, linux-doc

Pstore/blk is a new back-end of pstore to dump oops/pmsg log to
block device.

The document, at Documentation/admin-guide/pstore-block.rst, tells us
how to use pstore/blk.

Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com>
---
 Documentation/admin-guide/pstore-block.rst | 223 +++++++++++++++++++++++++++++
 MAINTAINERS                                |   1 +
 fs/pstore/Kconfig                          |   2 +
 3 files changed, 226 insertions(+)
 create mode 100644 Documentation/admin-guide/pstore-block.rst

diff --git a/Documentation/admin-guide/pstore-block.rst b/Documentation/admin-guide/pstore-block.rst
new file mode 100644
index 000000000000..a96415eaaf24
--- /dev/null
+++ b/Documentation/admin-guide/pstore-block.rst
@@ -0,0 +1,223 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Pstore block oops/panic logger
+==============================
+
+Introduction
+------------
+
+Pstore block (pstore/blk) is an oops/panic logger that writes its logs to a
+block device before the system crashes. You can get these log files by
+mounting pstore filesystem like::
+
+    mount -t pstore pstore /sys/fs/pstore
+
+
+Pstore block concepts
+---------------------
+
+Pstore/blk provides efficient configuration method for pstore/blk, which
+divides all configurations into two parts, configurations for user and
+configurations for driver.
+
+Configurations for user determine how pstore/blk works, such as pmsg_size,
+oops_size and so on. All of them support both Kconfig and module parameters,
+but module parameters have priority over Kconfig.
+
+Configurations for driver are all about block device, such as total_size
+of block device and read/write operations.
+
+Configurations for user
+-----------------------
+
+All of these configurations support both Kconfig and module parameters, but
+module parameters have priority over Kconfig.
+
+Here is an example for module parameters::
+
+        pstore_blk.blkdev=179:7 pstore_blk.oops_size=64
+
+The detail of each configurations may be of interest to you.
+
+blkdev
+~~~~~~
+
+The block device to use. Most of the time, it is a partition of block device.
+It's required for pstore/blk.
+
+It accepts the following variants:
+
+1. <hex_major><hex_minor> device number in hexadecimal represents itself; no
+   leading 0x, for example b302.
+#. /dev/<disk_name> represents the device number of disk
+#. /dev/<disk_name><decimal> represents the device number of partition - device
+   number of disk plus the partition number
+#. /dev/<disk_name>p<decimal> - same as the above; this form is used when disk
+   name of partitioned disk ends with a digit.
+#. PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF represents the unique id of
+   a partition if the partition table provides it. The UUID may be either an
+   EFI/GPT UUID, or refer to an MSDOS partition using the format SSSSSSSS-PP,
+   where SSSSSSSS is a zero-filled hex representation of the 32-bit
+   "NT disk signature", and PP is a zero-filled hex representation of the
+   1-based partition number.
+#. PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in relation to a
+   partition with a known unique id.
+#. <major>:<minor> major and minor number of the device separated by a colon.
+
+oops_size
+~~~~~~~~~
+
+The chunk size in KB for oops/panic front-end. It **MUST** be a multiple of 4.
+It's optional if you do not care oops/panic log.
+
+There are multiple chunks for oops/panic front-end depending on the remaining
+space except other pstore front-ends.
+
+Pstore/blk will log to oops/panic chunks one by one, and always overwrite the
+oldest chunk if there is no more free chunk.
+
+pmsg_size
+~~~~~~~~~
+
+The chunk size in KB for pmsg front-end. It **MUST** be a multiple of 4.
+It's optional if you do not care pmsg log.
+
+Unlike oops/panic front-end, there is only one chunk for pmsg front-end.
+
+Pmsg is a user space accessible pstore object. Writes to */dev/pmsg0* are
+appended to the chunk. On reboot the contents are available in
+*/sys/fs/pstore/pmsg-pstore-blk-0*.
+
+console_size
+~~~~~~~~~~~~
+
+The chunk size in KB for console front-end.  It **MUST** be a multiple of 4.
+It's optional if you do not care console log.
+
+Similar to pmsg front-end, there is only one chunk for console front-end.
+
+All log of console will be appended to the chunk. On reboot the contents are
+available in */sys/fs/pstore/console-pstore-blk-0*.
+
+ftrace_size
+~~~~~~~~~~~
+
+The chunk size in KB for ftrace front-end. It **MUST** be a multiple of 4.
+It's optional if you do not care console log.
+
+Similar to oops front-end, there are multiple chunks for ftrace front-end
+depending on the count of cpu processors. Each chunk size is equal to
+ftrace_size / processors_count.
+
+All log of ftrace will be appended to the chunk. On reboot the contents are
+combined and available in */sys/fs/pstore/ftrace-pstore-blk-0*.
+
+Persistent function tracing might be useful for debugging software or hardware
+related hangs. Here is an example of usage::
+
+ # mount -t pstore pstore /sys/fs/pstore
+ # mount -t debugfs debugfs /sys/kernel/debug/
+ # echo 1 > /sys/kernel/debug/pstore/record_ftrace
+ # reboot -f
+ [...]
+ # mount -t pstore pstore /sys/fs/pstore
+ # tail /sys/fs/pstore/ftrace-pstore-blk-0
+ CPU:0 ts:5914676 c0063828  c0063b94  call_cpuidle <- cpu_startup_entry+0x1b8/0x1e0
+ CPU:0 ts:5914678 c039ecdc  c006385c  cpuidle_enter_state <- call_cpuidle+0x44/0x48
+ CPU:0 ts:5914680 c039e9a0  c039ecf0  cpuidle_enter_freeze <- cpuidle_enter_state+0x304/0x314
+ CPU:0 ts:5914681 c0063870  c039ea30  sched_idle_set_state <- cpuidle_enter_state+0x44/0x314
+ CPU:1 ts:5916720 c0160f59  c015ee04  kernfs_unmap_bin_file <- __kernfs_remove+0x140/0x204
+ CPU:1 ts:5916721 c05ca625  c015ee0c  __mutex_lock_slowpath <- __kernfs_remove+0x148/0x204
+ CPU:1 ts:5916723 c05c813d  c05ca630  yield_to <- __mutex_lock_slowpath+0x314/0x358
+ CPU:1 ts:5916724 c05ca2d1  c05ca638  __ww_mutex_lock <- __mutex_lock_slowpath+0x31c/0x358
+
+dump_oops
+~~~~~~~~~
+
+Dumping both oopses and panics can be done by setting 1 (not zero) in the
+``dump_oops`` member while setting 0 in that variable dumps only the panics.
+
+Configurations for driver
+-------------------------
+
+Only a block device driver cares about these configurations. A block device
+driver uses ``psblk_register_blkdev`` to register to pstore/blk.
+
+.. kernel-doc:: fs/pstore/pstore_blk.c
+   :identifiers: psblk_register_blkdev
+
+Compression and header
+----------------------
+
+Block device is large enough for uncompressed oops data. Actually we do not
+recommend data compression because pstore/blk will insert some information into
+the first line of oops/panic data. For example::
+
+        Panic: Total 16 times
+
+It means that it's OOPS|Panic for the 16th time since the first booting.
+Sometimes the number of occurrences of oops|panic since the first booting is
+important to judge whether the system is stable.
+
+The following line is inserted by pstore filesystem. For example::
+
+        Oops#2 Part1
+
+It means that it's OOPS for the 2nd time on the last boot.
+
+Reading the data
+----------------
+
+The dump data can be read from the pstore filesystem. The format for these
+files is ``dmesg-pstore-blk-[N]`` for oops/panic front-end,
+``pmsg-pstore-blk-0`` for pmsg front-end and so on.  The timestamp of the
+dump file records the trigger time. To delete a stored record from block
+device, simply unlink the respective pstore file.
+
+Attentions in panic read/write APIs
+-----------------------------------
+
+If on panic, the kernel is not going to run for much longer, the tasks will not
+be scheduled and most kernel resources will be out of service. It
+looks like a single-threaded program running on a single-core computer.
+
+The following points require special attention for panic read/write APIs:
+
+1. Can **NOT** allocate any memory.
+   If you need memory, just allocate while the block driver is initializing
+   rather than waiting until the panic.
+#. Must be polled, **NOT** interrupt driven.
+   No task schedule any more. The block driver should delay to ensure the write
+   succeeds, but NOT sleep.
+#. Can **NOT** take any lock.
+   There is no other task, nor any shared resource; you are safe to break all
+   locks.
+#. Just use CPU to transfer.
+   Do not use DMA to transfer unless you are sure that DMA will not keep lock.
+#. Control registers directly.
+   Please control registers directly rather than use Linux kernel resources.
+   Do I/O map while initializing rather than wait until a panic occurs.
+#. Reset your block device and controller if necessary.
+   If you are not sure of the state of your block device and controller when
+   a panic occurs, you are safe to stop and reset them.
+
+Pstore/blk supports psblk_blkdev_info(), which is defined in
+*linux/pstore_blk.h*, to get information of using block device, such as the
+device number, sector count and start sector of the whole disk.
+
+pstore block internals
+----------------------
+
+For developer reference, here are all the important structures and APIs:
+
+.. kernel-doc:: fs/pstore/pstore_zone.c
+   :internal:
+
+.. kernel-doc:: include/linux/pstore_zone.h
+   :internal:
+
+.. kernel-doc:: fs/pstore/pstore_blk.c
+   :export:
+
+.. kernel-doc:: include/linux/pstore_blk.h
+   :internal:
diff --git a/MAINTAINERS b/MAINTAINERS
index cc0a4a8ae06a..f553323b68c9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13380,6 +13380,7 @@ F:	include/linux/pstore*
 F:	drivers/firmware/efi/efi-pstore.c
 F:	drivers/acpi/apei/erst.c
 F:	Documentation/admin-guide/ramoops.rst
+F:	Documentation/admin-guide/pstore-block.rst
 F:	Documentation/devicetree/bindings/reserved-memory/ramoops.txt
 K:	\b(pstore|ramoops)
 
diff --git a/fs/pstore/Kconfig b/fs/pstore/Kconfig
index 55ce726be795..699f1925948f 100644
--- a/fs/pstore/Kconfig
+++ b/fs/pstore/Kconfig
@@ -171,6 +171,8 @@ config PSTORE_BLK
 	  This enables panic and oops message to be logged to a block dev
 	  where it can be read back at some later point.
 
+	  For more information, see Documentation/admin-guide/pstore-block.rst.
+
 	  If unsure, say N.
 
 config PSTORE_BLK_BLKDEV
-- 
1.9.1


______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 08/11] pstore/zone: skip broken zone for MTD device
  2020-03-25  8:54 [PATCH v3 00/11] pstore: mtd: support crash log to block and mtd device WeiXiong Liao
                   ` (6 preceding siblings ...)
  2020-03-25  8:55 ` [PATCH v3 07/11] Documentation: create document for pstore/blk WeiXiong Liao
@ 2020-03-25  8:55 ` WeiXiong Liao
  2020-03-25  8:55 ` [PATCH v3 09/11] pstore/blk: a way to get user configure about pstore front-ends WeiXiong Liao
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: WeiXiong Liao @ 2020-03-25  8:55 UTC (permalink / raw)
  To: Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, Mauro Carvalho Chehab, Rob Herring,
	David S. Miller, Greg Kroah-Hartman, Jonathan Cameron
  Cc: WeiXiong Liao, linux-mtd, linux-kernel, linux-doc

It's one of a series of patches to adapt to MTD device.

It's different with block device, the MTD device will make new broken
blocks (or bad block) while running. It's necessary for pstore/blk to
skip the broken block.

The MTD driver should return -ENOMSG if meets a bad block, which tells
pstore/zone to skip and try next one.

Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com>
---
 fs/pstore/pstore_blk.c      | 10 +++++--
 fs/pstore/pstore_zone.c     | 65 ++++++++++++++++++++++++++++++++++++---------
 include/linux/pstore_blk.h  |  3 ++-
 include/linux/pstore_zone.h | 12 ++++++---
 4 files changed, 71 insertions(+), 19 deletions(-)

diff --git a/fs/pstore/pstore_blk.c b/fs/pstore/pstore_blk.c
index 560e7b3f0945..d6949146135b 100644
--- a/fs/pstore/pstore_blk.c
+++ b/fs/pstore/pstore_blk.c
@@ -99,9 +99,12 @@
  *		means error.
  * @write:	The same as @read, but the following error number:
  *		-EBUSY means try to write again later.
+ *		-ENOMSG means to try next zone.
  * @panic_write:The write operation only used for panic case. It's optional
- *		if you do not care panic log. The parameters and return value
- *		are the same as @read.
+ *		if you do not care panic log. The parameters are relative
+ *		value to storage.
+ *		On success, the number of bytes should be returned, others
+ *		excluding -ENOMSG mean error. -ENOMSG means to try next zone.
  */
 struct psblk_device {
 	unsigned long total_size;
@@ -314,6 +317,9 @@ static ssize_t psblk_blk_panic_write(const char *buf, size_t size,
 	/* size and off must align to SECTOR_SIZE for block device */
 	ret = blkdev_panic_write(buf, off >> SECTOR_SHIFT,
 			size >> SECTOR_SHIFT);
+	/* try next zone */
+	if (ret == -ENOMSG)
+		return ret;
 	return ret ? -EIO : size;
 }
 
diff --git a/fs/pstore/pstore_zone.c b/fs/pstore/pstore_zone.c
index 1da4fc760d31..84ff53c1aaae 100644
--- a/fs/pstore/pstore_zone.c
+++ b/fs/pstore/pstore_zone.c
@@ -247,6 +247,9 @@ static int psz_zone_write(struct psz_zone *zone,
 
 	return 0;
 dirty:
+	/* no need to mark dirty if going to try next zone */
+	if (wcnt == -ENOMSG)
+		return -ENOMSG;
 	atomic_set(&zone->dirty, true);
 	/* flush dirty zones nicely */
 	if (wcnt == -EBUSY && !is_on_panic())
@@ -382,7 +385,11 @@ static int psz_recover_oops_meta(struct psz_context *cxt)
 			return -EINVAL;
 
 		rcnt = info->read((char *)buf, len, zone->off);
-		if (rcnt != len) {
+		if (rcnt == -ENOMSG) {
+			pr_debug("%s with id %lu may be broken, skip\n",
+					zone->name, i);
+			continue;
+		} else if (rcnt != len) {
 			pr_err("read %s with id %lu failed\n", zone->name, i);
 			return (int)rcnt < 0 ? (int)rcnt : -EIO;
 		}
@@ -717,24 +724,58 @@ static void psz_write_kmsg_hdr(struct psz_zone *zone,
 		hdr->counter = 0;
 }
 
+/*
+ * In case zone is broken, which may occur to MTD device, we try each zones,
+ * start at cxt->oops_write_cnt.
+ */
 static inline int notrace psz_oops_write_record(struct psz_context *cxt,
 		struct pstore_record *record)
 {
+	int ret = -EBUSY;
 	size_t size, hlen;
 	struct psz_zone *zone;
-	unsigned int zonenum;
+	unsigned int i;
 
-	zonenum = cxt->oops_write_cnt;
-	zone = cxt->opszs[zonenum];
-	if (unlikely(!zone))
-		return -ENOSPC;
-	cxt->oops_write_cnt = (zonenum + 1) % cxt->oops_max_cnt;
+	for (i = 0; i < cxt->oops_max_cnt; i++) {
+		unsigned int zonenum, len;
+
+		zonenum = (cxt->oops_write_cnt + i) % cxt->oops_max_cnt;
+		zone = cxt->opszs[zonenum];
+		if (unlikely(!zone))
+			return -ENOSPC;
 
-	pr_debug("write %s to zone id %d\n", zone->name, zonenum);
-	psz_write_kmsg_hdr(zone, record);
-	hlen = sizeof(struct psz_oops_header);
-	size = min_t(size_t, record->size, zone->buffer_size - hlen);
-	return psz_zone_write(zone, FLUSH_ALL, record->buf, size, hlen);
+		/* avoid destorying old data, allocate a new one */
+		len = zone->buffer_size + sizeof(*zone->buffer);
+		zone->oldbuf = zone->buffer;
+		zone->buffer = kzalloc(len, GFP_KERNEL);
+		if (!zone->buffer) {
+			zone->buffer = zone->oldbuf;
+			return -ENOMEM;
+		}
+		zone->buffer->sig = zone->oldbuf->sig;
+
+		pr_debug("write %s to zone id %d\n", zone->name, zonenum);
+		psz_write_kmsg_hdr(zone, record);
+		hlen = sizeof(struct psz_oops_header);
+		size = min_t(size_t, record->size, zone->buffer_size - hlen);
+		ret = psz_zone_write(zone, FLUSH_ALL, record->buf, size, hlen);
+		if (likely(!ret || ret != -ENOMSG)) {
+			cxt->oops_write_cnt = zonenum + 1;
+			cxt->oops_write_cnt %= cxt->oops_max_cnt;
+			/* no need to try next zone, free last zone buffer */
+			kfree(zone->oldbuf);
+			zone->oldbuf = NULL;
+			return ret;
+		}
+
+		pr_debug("zone %u may be broken, try next dmesg zone\n",
+				zonenum);
+		kfree(zone->buffer);
+		zone->buffer = zone->oldbuf;
+		zone->oldbuf = NULL;
+	}
+
+	return -EBUSY;
 }
 
 static int notrace psz_oops_write(struct psz_context *cxt,
diff --git a/include/linux/pstore_blk.h b/include/linux/pstore_blk.h
index d8f609e60288..828b0763d477 100644
--- a/include/linux/pstore_blk.h
+++ b/include/linux/pstore_blk.h
@@ -14,7 +14,8 @@
  * @start_sect: start sector to block device
  * @sects: sectors count on buf
  *
- * Return: On success, zero should be returned. Others mean error.
+ * Return: On success, zero should be returned. Others excluding -ENOMSG
+ * mean error. -ENOMSG means to try next zone.
  *
  * Panic write to block device must be aligned to SECTOR_SIZE.
  */
diff --git a/include/linux/pstore_zone.h b/include/linux/pstore_zone.h
index a138e8b7dc20..2bfa79bff97b 100644
--- a/include/linux/pstore_zone.h
+++ b/include/linux/pstore_zone.h
@@ -23,11 +23,15 @@
  * @read:	The general read operation. Both of the function parameters
  *		@size and @offset are relative value to storage.
  *		On success, the number of bytes should be returned, others
- *		means error.
- * @write:	The same as @read, but -EBUSY means try to write again later.
+ *		mean error.
+ * @write:	The same as @read, but the following error number:
+ *		-EBUSY means try to write again later.
+ *		-ENOMSG means to try next zone.
  * @panic_write:The write operation only used for panic case. It's optional
- *		if you do not care panic log. The parameters and return value
- *		are the same as @read.
+ *		if you do not care panic log. The parameters are relative
+ *		value to storage.
+ *		On success, the number of bytes should be returned, others
+ *		excluding -ENOMSG mean error. -ENOMSG means to try next zone.
  */
 struct psz_info {
 	struct module *owner;
-- 
1.9.1


______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 09/11] pstore/blk: a way to get user configure about pstore front-ends.
  2020-03-25  8:54 [PATCH v3 00/11] pstore: mtd: support crash log to block and mtd device WeiXiong Liao
                   ` (7 preceding siblings ...)
  2020-03-25  8:55 ` [PATCH v3 08/11] pstore/zone: skip broken zone for MTD device WeiXiong Liao
@ 2020-03-25  8:55 ` WeiXiong Liao
  2020-03-25  8:55 ` [PATCH v3 10/11] pstore/zone: pstore/blk: support non-block devices WeiXiong Liao
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: WeiXiong Liao @ 2020-03-25  8:55 UTC (permalink / raw)
  To: Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, Mauro Carvalho Chehab, Rob Herring,
	David S. Miller, Greg Kroah-Hartman, Jonathan Cameron
  Cc: WeiXiong Liao, linux-mtd, linux-kernel, linux-doc

It's one of a series of patches to adapt to MTD device.

It's different to block driver, MTD driver should check zone size
and get identifier to MTD device. To make it, psblk_usr_info() is
created.

Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com>
---
 fs/pstore/pstore_blk.c     | 37 ++++++++++++++++++++++++++++++-------
 include/linux/pstore_blk.h | 10 ++++++++++
 2 files changed, 40 insertions(+), 7 deletions(-)

diff --git a/fs/pstore/pstore_blk.c b/fs/pstore/pstore_blk.c
index d6949146135b..061609b66d8a 100644
--- a/fs/pstore/pstore_blk.c
+++ b/fs/pstore/pstore_blk.c
@@ -84,6 +84,17 @@
 	sector_t start_sect;
 } g_bdev_info;
 
+#define check_size(name, alignsize) ({				\
+	long _##name_ = (name);					\
+	_##name_ = _##name_ <= 0 ? 0 : (_##name_ * 1024);	\
+	if (_##name_ & ((alignsize) - 1)) {			\
+		pr_info(#name " must align to %d\n",		\
+				(alignsize));			\
+		_##name_ = ALIGN(name, (alignsize));		\
+	}							\
+	_##name_;						\
+})
+
 /**
  * struct psblk_device - back-end pstore/blk driver structure.
  *
@@ -138,13 +149,11 @@ static int psblk_register_do(struct psblk_device *dev)
 	if (!dev->flags)
 		dev->flags = UINT_MAX;
 #define verify_size(name, alignsize, enable) {				\
-		long _##name_ = (enable) ? (name) : 0;			\
-		_##name_ = _##name_ <= 0 ? 0 : (_##name_ * 1024);	\
-		if (_##name_ & ((alignsize) - 1)) {			\
-			pr_info(#name " must align to %d\n",		\
-					(alignsize));			\
-			_##name_ = ALIGN(name, (alignsize));		\
-		}							\
+		long _##name_;						\
+		if (enable)						\
+			_##name_ = check_size(name, alignsize);		\
+		else							\
+			_##name_ = 0;					\
 		name = _##name_ / 1024;					\
 		psz_info->name = _##name_;				\
 	}
@@ -464,6 +473,20 @@ int psblk_blkdev_info(dev_t *devt, sector_t *nr_sects, sector_t *start_sect)
 }
 EXPORT_SYMBOL_GPL(psblk_blkdev_info);
 
+/* get information of pstore/blk */
+int psblk_usr_info(struct psblk_info *info)
+{
+	strncpy(info->device, blkdev, 80);
+	info->dump_oops = dump_oops <= 0 ? 0 : 1;
+	info->oops_size = check_size(oops_size, 4096);
+	info->pmsg_size = check_size(pmsg_size, 4096);
+	info->ftrace_size = check_size(ftrace_size, 4096);
+	info->console_size = check_size(console_size, 4096);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(psblk_usr_info);
+
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("WeiXiong Liao <liaoweixiong@allwinnertech.com>");
 MODULE_DESCRIPTION("Block device Oops/Panic logger");
diff --git a/include/linux/pstore_blk.h b/include/linux/pstore_blk.h
index 828b0763d477..d2ea1733b51a 100644
--- a/include/linux/pstore_blk.h
+++ b/include/linux/pstore_blk.h
@@ -27,4 +27,14 @@ int  psblk_register_blkdev(unsigned int major, unsigned int flags,
 void psblk_unregister_blkdev(unsigned int major);
 int  psblk_blkdev_info(dev_t *devt, sector_t *nr_sects, sector_t *start_sect);
 
+struct psblk_info {
+	int dump_oops;
+	char device[80];
+	unsigned long oops_size;
+	unsigned long pmsg_size;
+	unsigned long console_size;
+	unsigned long ftrace_size;
+};
+int psblk_usr_info(struct psblk_info *info);
+
 #endif
-- 
1.9.1


______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 10/11] pstore/zone: pstore/blk: support non-block devices
  2020-03-25  8:54 [PATCH v3 00/11] pstore: mtd: support crash log to block and mtd device WeiXiong Liao
                   ` (8 preceding siblings ...)
  2020-03-25  8:55 ` [PATCH v3 09/11] pstore/blk: a way to get user configure about pstore front-ends WeiXiong Liao
@ 2020-03-25  8:55 ` WeiXiong Liao
  2020-03-25  8:55 ` [PATCH v3 11/11] mtd: new support oops logger based on pstore/blk WeiXiong Liao
  2020-05-07 19:45 ` [PATCH v3 00/11] pstore: mtd: support crash log to block and mtd device Kees Cook
  11 siblings, 0 replies; 17+ messages in thread
From: WeiXiong Liao @ 2020-03-25  8:55 UTC (permalink / raw)
  To: Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, Mauro Carvalho Chehab, Rob Herring,
	David S. Miller, Greg Kroah-Hartman, Jonathan Cameron
  Cc: WeiXiong Liao, linux-mtd, linux-kernel, linux-doc

To support non-block device like MTD device. A non-block driver calls
psblk_register_device() to register to pstore/blk.

In addition, pstore/zone should also fix to non-block devices like
MTD device. MTD device is not block device and an erase must be done
before write. To adapt to non block device like MTD device,
pstore/zone must call the special erase job rather than set the
@datalen of zone to 0, which works on block device but not MTD device.
If no this patch, pstore on MTD disables to remove log file.

Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com>
---
 Documentation/admin-guide/pstore-block.rst | 17 +++++++---
 fs/pstore/pstore_blk.c                     | 52 ++++++++++++------------------
 fs/pstore/pstore_zone.c                    |  8 ++++-
 include/linux/pstore_blk.h                 | 37 +++++++++++++++++++++
 include/linux/pstore_zone.h                |  6 ++++
 5 files changed, 83 insertions(+), 37 deletions(-)

diff --git a/Documentation/admin-guide/pstore-block.rst b/Documentation/admin-guide/pstore-block.rst
index a96415eaaf24..a31155b36e62 100644
--- a/Documentation/admin-guide/pstore-block.rst
+++ b/Documentation/admin-guide/pstore-block.rst
@@ -7,8 +7,8 @@ Introduction
 ------------
 
 Pstore block (pstore/blk) is an oops/panic logger that writes its logs to a
-block device before the system crashes. You can get these log files by
-mounting pstore filesystem like::
+block device and non-block device before the system crashes. You can get
+these log files by mounting pstore filesystem like::
 
     mount -t pstore pstore /sys/fs/pstore
 
@@ -24,8 +24,8 @@ Configurations for user determine how pstore/blk works, such as pmsg_size,
 oops_size and so on. All of them support both Kconfig and module parameters,
 but module parameters have priority over Kconfig.
 
-Configurations for driver are all about block device, such as total_size
-of block device and read/write operations.
+Configurations for driver are all about block device and non-block device,
+such as total_size of block device and read/write operations.
 
 Configurations for user
 -----------------------
@@ -146,6 +146,15 @@ driver uses ``psblk_register_blkdev`` to register to pstore/blk.
 .. kernel-doc:: fs/pstore/pstore_blk.c
    :identifiers: psblk_register_blkdev
 
+A non-block device driver uses ``psblk_register_device`` with
+``struct psblk_device`` to register to pstore/blk.
+
+.. kernel-doc:: fs/pstore/pstore_blk.c
+   :identifiers: psblk_register_device
+
+.. kernel-doc:: include/linux/pstore_blk.h
+   :identifiers: psblk_device
+
 Compression and header
 ----------------------
 
diff --git a/fs/pstore/pstore_blk.c b/fs/pstore/pstore_blk.c
index 061609b66d8a..5c8006376677 100644
--- a/fs/pstore/pstore_blk.c
+++ b/fs/pstore/pstore_blk.c
@@ -96,36 +96,15 @@
 })
 
 /**
- * struct psblk_device - back-end pstore/blk driver structure.
+ * psblk_register_device() - register non-block device to pstore/blk
  *
- * @total_size: The total size in bytes pstore/blk can use. It must be greater
- *		than 4096 and be multiple of 4096.
- * @flags:	Refer to macro starting with PSTORE_FLAGS defined in
- *		linux/pstore.h. It means what front-ends this device support.
- *		Zero means all recorders for compatible.
- * @read:	The general read operation. Both of the function parameters
- *		@size and @offset are relative value to bock device (not the
- *		whole disk).
- *		On success, the number of bytes should be returned, others
- *		means error.
- * @write:	The same as @read, but the following error number:
- *		-EBUSY means try to write again later.
- *		-ENOMSG means to try next zone.
- * @panic_write:The write operation only used for panic case. It's optional
- *		if you do not care panic log. The parameters are relative
- *		value to storage.
- *		On success, the number of bytes should be returned, others
- *		excluding -ENOMSG mean error. -ENOMSG means to try next zone.
+ * @dev: non-block device information
+ *
+ * Return:
+ * * 0		- OK
+ * * Others	- something error.
  */
-struct psblk_device {
-	unsigned long total_size;
-	unsigned int flags;
-	psz_read_op read;
-	psz_write_op write;
-	psz_write_op panic_write;
-};
-
-static int psblk_register_do(struct psblk_device *dev)
+int psblk_register_device(struct psblk_device *dev)
 {
 	int ret;
 
@@ -169,6 +148,7 @@ static int psblk_register_do(struct psblk_device *dev)
 	psz_info->dump_oops = dump_oops;
 	psz_info->read = dev->read;
 	psz_info->write = dev->write;
+	psz_info->erase = dev->erase;
 	psz_info->panic_write = dev->panic_write;
 	psz_info->name = MODNAME;
 	psz_info->owner = THIS_MODULE;
@@ -181,8 +161,14 @@ static int psblk_register_do(struct psblk_device *dev)
 	mutex_unlock(&psz_lock);
 	return ret;
 }
+EXPORT_SYMBOL_GPL(psblk_register_device);
 
-static void psblk_unregister_do(struct psblk_device *dev)
+/**
+ * psblk_unregister_blkdev() - unregister block device from pstore/blk
+ *
+ * @dev: non-block device information
+ */
+void psblk_unregister_device(struct psblk_device *dev)
 {
 	mutex_lock(&psz_lock);
 	if (psz_info && psz_info->read == dev->read) {
@@ -192,6 +178,7 @@ static void psblk_unregister_do(struct psblk_device *dev)
 	}
 	mutex_unlock(&psz_lock);
 }
+EXPORT_SYMBOL_GPL(psblk_unregister_device);
 
 /**
  * psblk_get_bdev() - open block device
@@ -405,11 +392,12 @@ int psblk_register_blkdev(unsigned int major, unsigned int flags,
 
 	dev.total_size = psblk_bdev_size(bdev);
 	dev.flags = flags;
-	dev.panic_write = panic_write ? psblk_blk_panic_write : NULL;
 	dev.read = psblk_generic_blk_read;
 	dev.write = psblk_generic_blk_write;
+	dev.erase = NULL;
+	dev.panic_write = panic_write ? psblk_blk_panic_write : NULL;
 
-	ret = psblk_register_do(&dev);
+	ret = psblk_register_device(&dev);
 	if (ret)
 		goto err_put_bdev;
 
@@ -435,7 +423,7 @@ void psblk_unregister_blkdev(unsigned int major)
 	void *holder = blkdev;
 
 	if (psblk_bdev && MAJOR(psblk_bdev->bd_dev) == major) {
-		psblk_unregister_do(&dev);
+		psblk_unregister_device(&dev);
 		psblk_put_bdev(psblk_bdev, holder);
 		blkdev_panic_write = NULL;
 		psblk_bdev = NULL;
diff --git a/fs/pstore/pstore_zone.c b/fs/pstore/pstore_zone.c
index 84ff53c1aaae..e51bc794f908 100644
--- a/fs/pstore/pstore_zone.c
+++ b/fs/pstore/pstore_zone.c
@@ -652,15 +652,21 @@ static inline int psz_oops_erase(struct psz_context *cxt,
 	struct psz_buffer *buffer = zone->buffer;
 	struct psz_oops_header *hdr =
 		(struct psz_oops_header *)buffer->data;
+	size_t size;
 
 	if (unlikely(!psz_ok(zone)))
 		return 0;
+
 	/* this zone is already updated, no need to erase */
 	if (record->count != hdr->counter)
 		return 0;
 
+	size = buffer_datalen(zone) + sizeof(*zone->buffer);
 	atomic_set(&zone->buffer->datalen, 0);
-	return psz_zone_write(zone, FLUSH_META, NULL, 0, 0);
+	if (cxt->psz_info->erase)
+		return cxt->psz_info->erase(size, zone->off);
+	else
+		return psz_zone_write(zone, FLUSH_META, NULL, 0, 0);
 }
 
 static inline int psz_record_erase(struct psz_context *cxt,
diff --git a/include/linux/pstore_blk.h b/include/linux/pstore_blk.h
index d2ea1733b51a..e733f28aef2c 100644
--- a/include/linux/pstore_blk.h
+++ b/include/linux/pstore_blk.h
@@ -8,6 +8,41 @@
 #include <linux/pstore_zone.h>
 
 /**
+ * struct psblk_device - back-end pstore/blk driver structure.
+ *
+ * @total_size: The total size in bytes pstore/blk can use. It must be greater
+ *		than 4096 and be multiple of 4096.
+ * @flags:	Refer to macro starting with PSTORE_FLAGS defined in
+ *		linux/pstore.h. It means what front-ends this device support.
+ *		Zero means all recorders for compatible.
+ * @read:	The general read operation. Both of the function parameters
+ *		@size and @offset are relative value to bock device (not the
+ *		whole disk).
+ *		On success, the number of bytes should be returned, others
+ *		means error.
+ * @write:	The same as @read, but the following error number:
+ *		-EBUSY means try to write again later.
+ *		-ENOMSG means to try next zone.
+ * @erase:	The general erase operation for device with special removing
+ *		job. Both of the function parameters @size and @offset are
+ *		relative value to storage.
+ *		Return 0 on success and others on failure.
+ * @panic_write:The write operation only used for panic case. It's optional
+ *		if you do not care panic log. The parameters are relative
+ *		value to storage.
+ *		On success, the number of bytes should be returned, others
+ *		excluding -ENOMSG mean error. -ENOMSG means to try next zone.
+ */
+struct psblk_device {
+	unsigned long total_size;
+	unsigned int flags;
+	psz_read_op read;
+	psz_write_op write;
+	psz_erase_op erase;
+	psz_write_op panic_write;
+};
+
+/**
  * typedef psblk_panic_write_op - panic write operation to block device
  *
  * @buf: the data to write
@@ -22,6 +57,8 @@
 typedef int (*psblk_panic_write_op)(const char *buf, sector_t start_sect,
 		sector_t sects);
 
+int psblk_register_device(struct psblk_device *dev);
+void psblk_unregister_device(struct psblk_device *dev);
 int  psblk_register_blkdev(unsigned int major, unsigned int flags,
 		psblk_panic_write_op panic_write);
 void psblk_unregister_blkdev(unsigned int major);
diff --git a/include/linux/pstore_zone.h b/include/linux/pstore_zone.h
index 2bfa79bff97b..c2db8e5c6542 100644
--- a/include/linux/pstore_zone.h
+++ b/include/linux/pstore_zone.h
@@ -7,6 +7,7 @@
 
 typedef ssize_t (*psz_read_op)(char *, size_t, loff_t);
 typedef ssize_t (*psz_write_op)(const char *, size_t, loff_t);
+typedef ssize_t (*psz_erase_op)(size_t, loff_t);
 /**
  * struct psz_info - pstore/zone back-end driver structure
  *
@@ -27,6 +28,10 @@
  * @write:	The same as @read, but the following error number:
  *		-EBUSY means try to write again later.
  *		-ENOMSG means to try next zone.
+ * @erase:	The general erase operation for device with special removing
+ *		job. Both of the function parameters @size and @offset are
+ *		relative value to storage.
+ *		Return 0 on success and others on failure.
  * @panic_write:The write operation only used for panic case. It's optional
  *		if you do not care panic log. The parameters are relative
  *		value to storage.
@@ -45,6 +50,7 @@ struct psz_info {
 	int dump_oops;
 	psz_read_op read;
 	psz_write_op write;
+	psz_erase_op erase;
 	psz_write_op panic_write;
 };
 
-- 
1.9.1


______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 11/11] mtd: new support oops logger based on pstore/blk
  2020-03-25  8:54 [PATCH v3 00/11] pstore: mtd: support crash log to block and mtd device WeiXiong Liao
                   ` (9 preceding siblings ...)
  2020-03-25  8:55 ` [PATCH v3 10/11] pstore/zone: pstore/blk: support non-block devices WeiXiong Liao
@ 2020-03-25  8:55 ` WeiXiong Liao
  2020-05-07 19:45 ` [PATCH v3 00/11] pstore: mtd: support crash log to block and mtd device Kees Cook
  11 siblings, 0 replies; 17+ messages in thread
From: WeiXiong Liao @ 2020-03-25  8:55 UTC (permalink / raw)
  To: Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, Mauro Carvalho Chehab, Rob Herring,
	David S. Miller, Greg Kroah-Hartman, Jonathan Cameron
  Cc: WeiXiong Liao, linux-mtd, linux-kernel, linux-doc

It's the last one of a series of patches to adapt to MTD device.

The mtdpstore is similar to mtdoops but more powerful. It bases on
pstore/blk, aims to store panic and oops logs to a flash partition,
where it can be read back as files after mounting pstore filesystem.

The pstore/blk is designed for block device at the very beginning,
but now, compatible to non-block device like MTD device.

To make mtdpstore work, "blkdev" of pstore/blk should be set as MTD
device name or MTD device number. Please refer to document at:
    Documentation/admin-guide/pstore-block.rst

Why do we need mtdpstore?
1. repetitive jobs between pstore and mtdoops
   Both of pstore and mtdoops do the same jobs that store panic/oops log.
   They have much similar logic that register to kmsg dumper and store
   log to several chunks one by one.
2. do what a driver should do
   To me, a driver should provide methods instead of policies. What MTD
   should do is to provide read/write/erase operations, geting rid of codes
   about chunk management, kmsg dumper and configuration.
3. enhanced feature
   Not only store log, but also show it as files.
   Not only log, but also trigger time and trigger count.
   Not only panic/oops log, but also log recorder for pmsg, console and
   ftrace in the future.

Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com>
---
 Documentation/admin-guide/pstore-block.rst |   9 +-
 drivers/mtd/Kconfig                        |  10 +
 drivers/mtd/Makefile                       |   1 +
 drivers/mtd/mtdpstore.c                    | 564 +++++++++++++++++++++++++++++
 4 files changed, 582 insertions(+), 2 deletions(-)
 create mode 100644 drivers/mtd/mtdpstore.c

diff --git a/Documentation/admin-guide/pstore-block.rst b/Documentation/admin-guide/pstore-block.rst
index a31155b36e62..464efb0db932 100644
--- a/Documentation/admin-guide/pstore-block.rst
+++ b/Documentation/admin-guide/pstore-block.rst
@@ -43,9 +43,9 @@ blkdev
 ~~~~~~
 
 The block device to use. Most of the time, it is a partition of block device.
-It's required for pstore/blk.
+It's required for pstore/blk. It is also used for MTD device.
 
-It accepts the following variants:
+It accepts the following variants for block device:
 
 1. <hex_major><hex_minor> device number in hexadecimal represents itself; no
    leading 0x, for example b302.
@@ -64,6 +64,11 @@ It accepts the following variants:
    partition with a known unique id.
 #. <major>:<minor> major and minor number of the device separated by a colon.
 
+It accepts the following variants for MTD device:
+
+1. <device name> MTD device name. "pstore" is recommended.
+#. <device number> MTD device number.
+
 oops_size
 ~~~~~~~~~
 
diff --git a/drivers/mtd/Kconfig b/drivers/mtd/Kconfig
index 42d401ea60ee..6ddab796216d 100644
--- a/drivers/mtd/Kconfig
+++ b/drivers/mtd/Kconfig
@@ -170,6 +170,16 @@ config MTD_OOPS
 	  buffer in a flash partition where it can be read back at some
 	  later point.
 
+config MTD_PSTORE
+	tristate "Log panic/oops to an MTD buffer based on pstore"
+	depends on PSTORE_BLK
+	help
+	  This enables panic and oops messages to be logged to a circular
+	  buffer in a flash partition where it can be read back as files after
+	  mounting pstore filesystem.
+
+	  If unsure, say N.
+
 config MTD_SWAP
 	tristate "Swap on MTD device support"
 	depends on MTD && SWAP
diff --git a/drivers/mtd/Makefile b/drivers/mtd/Makefile
index 56cc60ccc477..593d0593a038 100644
--- a/drivers/mtd/Makefile
+++ b/drivers/mtd/Makefile
@@ -20,6 +20,7 @@ obj-$(CONFIG_RFD_FTL)		+= rfd_ftl.o
 obj-$(CONFIG_SSFDC)		+= ssfdc.o
 obj-$(CONFIG_SM_FTL)		+= sm_ftl.o
 obj-$(CONFIG_MTD_OOPS)		+= mtdoops.o
+obj-$(CONFIG_MTD_PSTORE)	+= mtdpstore.o
 obj-$(CONFIG_MTD_SWAP)		+= mtdswap.o
 
 nftl-objs		:= nftlcore.o nftlmount.o
diff --git a/drivers/mtd/mtdpstore.c b/drivers/mtd/mtdpstore.c
new file mode 100644
index 000000000000..e6fdaed6d1d3
--- /dev/null
+++ b/drivers/mtd/mtdpstore.c
@@ -0,0 +1,564 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define dev_fmt(fmt) "mtdoops-pstore: " fmt
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/pstore_blk.h>
+#include <linux/mtd/mtd.h>
+#include <linux/bitops.h>
+
+static struct mtdpstore_context {
+	int index;
+	struct psblk_info info;
+	struct psblk_device dev;
+	struct mtd_info *mtd;
+	unsigned long *rmmap;		/* removed bit map */
+	unsigned long *usedmap;		/* used bit map */
+	/*
+	 * used for panic write
+	 * As there are no block_isbad for panic case, we should keep this
+	 * status before panic to ensure panic_write not failed.
+	 */
+	unsigned long *badmap;		/* bad block bit map */
+} oops_cxt;
+
+static int mtdpstore_block_isbad(struct mtdpstore_context *cxt, loff_t off)
+{
+	int ret;
+	struct mtd_info *mtd = cxt->mtd;
+	u64 blknum = div_u64(off, mtd->erasesize);
+
+	if (test_bit(blknum, cxt->badmap))
+		return true;
+	ret = mtd_block_isbad(mtd, off);
+	if (ret < 0) {
+		dev_err(&mtd->dev, "mtd_block_isbad failed, aborting\n");
+		return ret;
+	} else if (ret > 0) {
+		set_bit(blknum, cxt->badmap);
+		return true;
+	}
+	return false;
+}
+
+static inline int mtdpstore_panic_block_isbad(struct mtdpstore_context *cxt,
+		loff_t off)
+{
+	struct mtd_info *mtd = cxt->mtd;
+	u64 blknum = div_u64(off, mtd->erasesize);
+
+	return test_bit(blknum, cxt->badmap);
+}
+
+static inline void mtdpstore_mark_used(struct mtdpstore_context *cxt,
+		loff_t off)
+{
+	struct mtd_info *mtd = cxt->mtd;
+	u64 zonenum = div_u64(off, cxt->info.oops_size);
+
+	dev_dbg(&mtd->dev, "mark zone %llu used\n", zonenum);
+	set_bit(zonenum, cxt->usedmap);
+}
+
+static inline void mtdpstore_mark_unused(struct mtdpstore_context *cxt,
+		loff_t off)
+{
+	struct mtd_info *mtd = cxt->mtd;
+	u64 zonenum = div_u64(off, cxt->info.oops_size);
+
+	dev_dbg(&mtd->dev, "mark zone %llu unused\n", zonenum);
+	clear_bit(zonenum, cxt->usedmap);
+}
+
+static inline void mtdpstore_block_mark_unused(struct mtdpstore_context *cxt,
+		loff_t off)
+{
+	struct mtd_info *mtd = cxt->mtd;
+	u64 zonenum = div_u64(off, cxt->info.oops_size);
+	u32 zonecnt = cxt->mtd->erasesize / cxt->info.oops_size;
+
+	while (zonecnt > 0) {
+		dev_dbg(&mtd->dev, "mark zone %llu unused\n", zonenum);
+		clear_bit(zonenum, cxt->usedmap);
+		zonenum++;
+		zonecnt--;
+	}
+}
+
+static inline int mtdpstore_is_used(struct mtdpstore_context *cxt, loff_t off)
+{
+	u64 zonenum = div_u64(off, cxt->info.oops_size);
+	u64 blknum = div_u64(off, cxt->mtd->erasesize);
+
+	if (test_bit(blknum, cxt->badmap))
+		return true;
+	return test_bit(zonenum, cxt->usedmap);
+}
+
+static int mtdpstore_block_is_used(struct mtdpstore_context *cxt,
+		loff_t off)
+{
+	u64 zonenum = div_u64(off, cxt->info.oops_size);
+	u32 zonecnt = cxt->mtd->erasesize / cxt->info.oops_size;
+
+	while (zonecnt > 0) {
+		if (test_bit(zonenum, cxt->usedmap))
+			return true;
+		zonenum++;
+		zonecnt--;
+	}
+	return false;
+}
+
+static int mtdpstore_is_empty(struct mtdpstore_context *cxt, char *buf,
+		size_t size)
+{
+	struct mtd_info *mtd = cxt->mtd;
+	size_t sz;
+	int i;
+
+	sz = min_t(uint32_t, size, mtd->writesize / 4);
+	for (i = 0; i < sz; i++) {
+		if (buf[i] != (char)0xFF)
+			return false;
+	}
+	return true;
+}
+
+static void mtdpstore_mark_removed(struct mtdpstore_context *cxt, loff_t off)
+{
+	struct mtd_info *mtd = cxt->mtd;
+	u64 zonenum = div_u64(off, cxt->info.oops_size);
+
+	dev_dbg(&mtd->dev, "mark zone %llu removed\n", zonenum);
+	set_bit(zonenum, cxt->rmmap);
+}
+
+static void mtdpstore_block_clear_removed(struct mtdpstore_context *cxt,
+		loff_t off)
+{
+	u64 zonenum = div_u64(off, cxt->info.oops_size);
+	u32 zonecnt = cxt->mtd->erasesize / cxt->info.oops_size;
+
+	while (zonecnt > 0) {
+		clear_bit(zonenum, cxt->rmmap);
+		zonenum++;
+		zonecnt--;
+	}
+}
+
+static int mtdpstore_block_is_removed(struct mtdpstore_context *cxt,
+		loff_t off)
+{
+	u64 zonenum = div_u64(off, cxt->info.oops_size);
+	u32 zonecnt = cxt->mtd->erasesize / cxt->info.oops_size;
+
+	while (zonecnt > 0) {
+		if (test_bit(zonenum, cxt->rmmap))
+			return true;
+		zonenum++;
+		zonecnt--;
+	}
+	return false;
+}
+
+static int mtdpstore_erase_do(struct mtdpstore_context *cxt, loff_t off)
+{
+	struct mtd_info *mtd = cxt->mtd;
+	struct erase_info erase;
+	int ret;
+
+	dev_dbg(&mtd->dev, "try to erase off 0x%llx\n", off);
+	erase.len = cxt->mtd->erasesize;
+	erase.addr = off;
+	ret = mtd_erase(cxt->mtd, &erase);
+	if (!ret)
+		mtdpstore_block_clear_removed(cxt, off);
+	else
+		dev_err(&mtd->dev, "erase of region [0x%llx, 0x%llx] on \"%s\" failed\n",
+		       (unsigned long long)erase.addr,
+		       (unsigned long long)erase.len, cxt->info.device);
+	return ret;
+}
+
+/*
+ * called while removing file
+ *
+ * Avoiding over erasing, do erase block only when the whole block is unused.
+ * If the block contains valid log, do erase lazily on flush_removed() when
+ * unregister.
+ */
+static ssize_t mtdpstore_erase(size_t size, loff_t off)
+{
+	struct mtdpstore_context *cxt = &oops_cxt;
+
+	if (mtdpstore_block_isbad(cxt, off))
+		return -EIO;
+
+	mtdpstore_mark_unused(cxt, off);
+
+	/* If the block still has valid data, mtdpstore do erase lazily */
+	if (likely(mtdpstore_block_is_used(cxt, off))) {
+		mtdpstore_mark_removed(cxt, off);
+		return 0;
+	}
+
+	/* all zones are unused, erase it */
+	off = ALIGN_DOWN(off, cxt->mtd->erasesize);
+	return mtdpstore_erase_do(cxt, off);
+}
+
+/*
+ * What is security for mtdpstore?
+ * As there is no erase for panic case, we should ensure at least one zone
+ * is writable. Otherwise, panic write will fail.
+ * If zone is used, write operation will return -ENOMSG, which means that
+ * pstore/blk will try one by one until gets an empty zone. So, it is not
+ * needed to ensure the next zone is empty, but at least one.
+ */
+static int mtdpstore_security(struct mtdpstore_context *cxt, loff_t off)
+{
+	int ret = 0, i;
+	struct mtd_info *mtd = cxt->mtd;
+	u32 zonenum = (u32)div_u64(off, cxt->info.oops_size);
+	u32 zonecnt = (u32)div_u64(cxt->mtd->size, cxt->info.oops_size);
+	u32 blkcnt = (u32)div_u64(cxt->mtd->size, cxt->mtd->erasesize);
+	u32 erasesize = cxt->mtd->erasesize;
+
+	for (i = 0; i < zonecnt; i++) {
+		u32 num = (zonenum + i) % zonecnt;
+
+		/* found empty zone */
+		if (!test_bit(num, cxt->usedmap))
+			return 0;
+	}
+
+	/* If there is no any empty zone, we have no way but to do erase */
+	off = ALIGN_DOWN(off, erasesize);
+	while (blkcnt--) {
+		div64_u64_rem(off + erasesize, cxt->mtd->size, (u64 *)&off);
+
+		if (mtdpstore_block_isbad(cxt, off))
+			continue;
+
+		ret = mtdpstore_erase_do(cxt, off);
+		if (!ret) {
+			mtdpstore_block_mark_unused(cxt, off);
+			break;
+		}
+	}
+
+	if (ret)
+		dev_err(&mtd->dev, "all blocks bad!\n");
+	dev_dbg(&mtd->dev, "end security\n");
+	return ret;
+}
+
+static ssize_t mtdpstore_write(const char *buf, size_t size, loff_t off)
+{
+	struct mtdpstore_context *cxt = &oops_cxt;
+	struct mtd_info *mtd = cxt->mtd;
+	size_t retlen;
+	int ret;
+
+	if (mtdpstore_block_isbad(cxt, off))
+		return -ENOMSG;
+
+	/* zone is used, please try next one */
+	if (mtdpstore_is_used(cxt, off))
+		return -ENOMSG;
+
+	dev_dbg(&mtd->dev, "try to write off 0x%llx size %zu\n", off, size);
+	ret = mtd_write(cxt->mtd, off, size, &retlen, (u_char *)buf);
+	if (ret < 0 || retlen != size) {
+		dev_err(&mtd->dev, "write failure at %lld (%zu of %zu written), err %d\n",
+				off, retlen, size, ret);
+		return -EIO;
+	}
+	mtdpstore_mark_used(cxt, off);
+
+	mtdpstore_security(cxt, off);
+	return retlen;
+}
+
+static inline bool mtdpstore_is_io_error(int ret)
+{
+	return ret < 0 && !mtd_is_bitflip(ret) && !mtd_is_eccerr(ret);
+}
+
+/*
+ * All zones will be read as pstore/blk will read zone one by one when do
+ * recover.
+ */
+static ssize_t mtdpstore_read(char *buf, size_t size, loff_t off)
+{
+	struct mtdpstore_context *cxt = &oops_cxt;
+	struct mtd_info *mtd = cxt->mtd;
+	size_t retlen, done;
+	int ret;
+
+	if (mtdpstore_block_isbad(cxt, off))
+		return -ENOMSG;
+
+	dev_dbg(&mtd->dev, "try to read off 0x%llx size %zu\n", off, size);
+	for (done = 0, retlen = 0; done < size; done += retlen) {
+		retlen = 0;
+
+		ret = mtd_read(cxt->mtd, off + done, size - done, &retlen,
+				(u_char *)buf + done);
+		if (mtdpstore_is_io_error(ret)) {
+			dev_err(&mtd->dev, "read failure at %lld (%zu of %zu read), err %d\n",
+					off + done, retlen, size - done, ret);
+			/* the zone may be broken, try next one */
+			return -ENOMSG;
+		}
+
+		/*
+		 * ECC error. The impact on log data is so small. Maybe we can
+		 * still read it and try to understand. So mtdpstore just hands
+		 * over what it gets and user can judge whether the data is
+		 * valid or not.
+		 */
+		if (mtd_is_eccerr(ret)) {
+			dev_err(&mtd->dev, "ecc error at %lld (%zu of %zu read), err %d\n",
+					off + done, retlen, size - done, ret);
+			/* driver may not set retlen when ecc error */
+			retlen = retlen == 0 ? size - done : retlen;
+		}
+	}
+
+	if (mtdpstore_is_empty(cxt, buf, size))
+		mtdpstore_mark_unused(cxt, off);
+	else
+		mtdpstore_mark_used(cxt, off);
+
+	mtdpstore_security(cxt, off);
+	return retlen;
+}
+
+static ssize_t mtdpstore_panic_write(const char *buf, size_t size, loff_t off)
+{
+	struct mtdpstore_context *cxt = &oops_cxt;
+	struct mtd_info *mtd = cxt->mtd;
+	size_t retlen;
+	int ret;
+
+	if (mtdpstore_panic_block_isbad(cxt, off))
+		return -ENOMSG;
+
+	/* zone is used, please try next one */
+	if (mtdpstore_is_used(cxt, off))
+		return -ENOMSG;
+
+	ret = mtd_panic_write(cxt->mtd, off, size, &retlen, (u_char *)buf);
+	if (ret < 0 || size != retlen) {
+		dev_err(&mtd->dev, "panic write failure at %lld (%zu of %zu read), err %d\n",
+				off, retlen, size, ret);
+		return -EIO;
+	}
+	mtdpstore_mark_used(cxt, off);
+
+	return retlen;
+}
+
+static void mtdpstore_notify_add(struct mtd_info *mtd)
+{
+	int ret;
+	struct mtdpstore_context *cxt = &oops_cxt;
+	struct psblk_info *info = &cxt->info;
+	unsigned long longcnt;
+
+	if (!strcmp(mtd->name, info->device))
+		cxt->index = mtd->index;
+
+	if (mtd->index != cxt->index || cxt->index < 0)
+		return;
+
+	dev_dbg(&mtd->dev, "found matching MTD device %s\n", mtd->name);
+
+	if (mtd->size < info->oops_size * 2) {
+		dev_err(&mtd->dev, "MTD partition %d not big enough\n",
+				mtd->index);
+		return;
+	}
+	/*
+	 * oops_size must be aligned to 4096 Bytes, which is limited by
+	 * psblk. The default value of oops_size is 64KB. If oops_size
+	 * is larger than erasesize, some errors will occur since mtdpsotre
+	 * is designed on it.
+	 */
+	if (mtd->erasesize < info->oops_size) {
+		dev_err(&mtd->dev, "eraseblock size of MTD partition %d too small\n",
+				mtd->index);
+		return;
+	}
+	if (unlikely(info->oops_size % mtd->writesize)) {
+		dev_err(&mtd->dev, "record size %lu KB must align to write size %d KB\n",
+				info->oops_size / 1024,
+				mtd->writesize / 1024);
+		return;
+	}
+
+	longcnt = BITS_TO_LONGS(div_u64(mtd->size, info->oops_size));
+	cxt->rmmap = kcalloc(longcnt, sizeof(long), GFP_KERNEL);
+	cxt->usedmap = kcalloc(longcnt, sizeof(long), GFP_KERNEL);
+
+	longcnt = BITS_TO_LONGS(div_u64(mtd->size, mtd->erasesize));
+	cxt->badmap = kcalloc(longcnt, sizeof(long), GFP_KERNEL);
+
+	cxt->dev.total_size = mtd->size;
+	/* just support dmesg right now */
+	cxt->dev.flags = PSTORE_FLAGS_DMESG;
+	cxt->dev.read = mtdpstore_read;
+	cxt->dev.write = mtdpstore_write;
+	cxt->dev.erase = mtdpstore_erase;
+	cxt->dev.panic_write = mtdpstore_panic_write;
+
+	ret = psblk_register_device(&cxt->dev);
+	if (ret) {
+		dev_err(&mtd->dev, "mtd%d register to psblk failed\n",
+				mtd->index);
+		return;
+	}
+	cxt->mtd = mtd;
+	dev_info(&mtd->dev, "Attached to MTD device %d\n", mtd->index);
+}
+
+static int mtdpstore_flush_removed_do(struct mtdpstore_context *cxt,
+		loff_t off, size_t size)
+{
+	struct mtd_info *mtd = cxt->mtd;
+	u_char *buf;
+	int ret;
+	size_t retlen;
+	struct erase_info erase;
+
+	buf = kmalloc(mtd->erasesize, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	/* 1st. read to cache */
+	ret = mtd_read(mtd, off, mtd->erasesize, &retlen, buf);
+	if (mtdpstore_is_io_error(ret))
+		goto free;
+
+	/* 2nd. erase block */
+	erase.len = mtd->erasesize;
+	erase.addr = off;
+	ret = mtd_erase(mtd, &erase);
+	if (ret)
+		goto free;
+
+	/* 3rd. write back */
+	while (size) {
+		unsigned int zonesize = cxt->info.oops_size;
+
+		/* there is valid data on block, write back */
+		if (mtdpstore_is_used(cxt, off)) {
+			ret = mtd_write(mtd, off, zonesize, &retlen, buf);
+			if (ret)
+				dev_err(&mtd->dev, "write failure at %lld (%zu of %u written), err %d\n",
+						off, retlen, zonesize, ret);
+		}
+
+		off += zonesize;
+		size -= min_t(unsigned int, zonesize, size);
+	}
+
+free:
+	kfree(buf);
+	return ret;
+}
+
+/*
+ * What does mtdpstore_flush_removed() do?
+ * When user remove any log file on pstore filesystem, mtdpstore should do
+ * something to ensure log file removed. If the whole block is no longer used,
+ * it's nice to erase the block. However if the block still contains valid log,
+ * what mtdpstore can do is to erase and write the valid log back.
+ */
+static int mtdpstore_flush_removed(struct mtdpstore_context *cxt)
+{
+	struct mtd_info *mtd = cxt->mtd;
+	int ret;
+	loff_t off;
+	u32 blkcnt = (u32)div_u64(mtd->size, mtd->erasesize);
+
+	for (off = 0; blkcnt > 0; blkcnt--, off += mtd->erasesize) {
+		ret = mtdpstore_block_isbad(cxt, off);
+		if (ret)
+			continue;
+
+		ret = mtdpstore_block_is_removed(cxt, off);
+		if (!ret)
+			continue;
+
+		ret = mtdpstore_flush_removed_do(cxt, off, mtd->erasesize);
+		if (ret)
+			return ret;
+	}
+	return 0;
+}
+
+static void mtdpstore_notify_remove(struct mtd_info *mtd)
+{
+	struct mtdpstore_context *cxt = &oops_cxt;
+
+	if (mtd->index != cxt->index || cxt->index < 0)
+		return;
+
+	mtdpstore_flush_removed(cxt);
+
+	psblk_unregister_device(&cxt->dev);
+	kfree(cxt->badmap);
+	kfree(cxt->usedmap);
+	kfree(cxt->rmmap);
+	cxt->mtd = NULL;
+	cxt->index = -1;
+}
+
+static struct mtd_notifier mtdpstore_notifier = {
+	.add	= mtdpstore_notify_add,
+	.remove	= mtdpstore_notify_remove,
+};
+
+static int __init mtdpstore_init(void)
+{
+	int ret;
+	struct mtdpstore_context *cxt = &oops_cxt;
+	struct mtd_info *mtd = cxt->mtd;
+	struct psblk_info *info = &cxt->info;
+
+	ret = psblk_usr_info(info);
+	if (unlikely(ret))
+		return ret;
+
+	if (strlen(info->device) == 0) {
+		dev_err(&mtd->dev, "mtd device must be supplied\n");
+		return -EINVAL;
+	}
+	if (!info->oops_size) {
+		dev_err(&mtd->dev, "no recorder enabled\n");
+		return -EINVAL;
+	}
+
+	/* Setup the MTD device to use */
+	ret = kstrtoint((char *)info->device, 0, &cxt->index);
+	if (ret)
+		cxt->index = -1;
+
+	register_mtd_user(&mtdpstore_notifier);
+	return 0;
+}
+module_init(mtdpstore_init);
+
+static void __exit mtdpstore_exit(void)
+{
+	unregister_mtd_user(&mtdpstore_notifier);
+}
+module_exit(mtdpstore_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("WeiXiong Liao <liaoweixiong@allwinnertech.com>");
+MODULE_DESCRIPTION("MTD Oops/Panic console logger/driver");
-- 
1.9.1


______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 01/11] pstore/zone: a common layer to manage storage as zones
  2020-03-25  8:54 ` [PATCH v3 01/11] pstore/zone: a common layer to manage storage as zones WeiXiong Liao
@ 2020-04-14  1:42   ` WeiXiong Liao
  0 siblings, 0 replies; 17+ messages in thread
From: WeiXiong Liao @ 2020-04-14  1:42 UTC (permalink / raw)
  To: Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, Mauro Carvalho Chehab, Rob Herring,
	David S. Miller, Greg Kroah-Hartman, Jonathan Cameron
  Cc: linux-mtd, linux-kernel, linux-doc

hi Kees Cook,

It's nearly three weeks since these patches submitted. Is there anything
to change?

On 2020/3/25 PM 4:54, WeiXiong Liao wrote:
> This is a pre patch to support pstore/blk.
> 
> Pstore/zone provides the zones management of storage. It's designed
> to work for pstore/blk and pstore/ram in the future.
> 
> Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com>
> ---
>  fs/pstore/Kconfig           |   7 +
>  fs/pstore/Makefile          |   2 +
>  fs/pstore/platform.c        |   3 +-
>  fs/pstore/pstore_zone.c     | 989 ++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/pstore.h      |   1 +
>  include/linux/pstore_zone.h |  44 ++
>  6 files changed, 1045 insertions(+), 1 deletion(-)
>  create mode 100644 fs/pstore/pstore_zone.c
>  create mode 100644 include/linux/pstore_zone.h
> 
> diff --git a/fs/pstore/Kconfig b/fs/pstore/Kconfig
> index 8f0369aad22a..5ad4ab68ac4f 100644
> --- a/fs/pstore/Kconfig
> +++ b/fs/pstore/Kconfig
> @@ -153,3 +153,10 @@ config PSTORE_RAM
>  	  "ramoops.ko".
>  
>  	  For more information, see Documentation/admin-guide/ramoops.rst.
> +
> +config PSTORE_ZONE
> +	tristate
> +	depends on PSTORE
> +	help
> +	  The common layer for pstore/blk (and pstore/ram in the future)
> +	  to manager storage as zones.
> diff --git a/fs/pstore/Makefile b/fs/pstore/Makefile
> index 967b5891f325..94f3631c80ce 100644
> --- a/fs/pstore/Makefile
> +++ b/fs/pstore/Makefile
> @@ -12,3 +12,5 @@ pstore-$(CONFIG_PSTORE_PMSG)	+= pmsg.o
>  
>  ramoops-objs += ram.o ram_core.o
>  obj-$(CONFIG_PSTORE_RAM)	+= ramoops.o
> +
> +obj-$(CONFIG_PSTORE_ZONE)	+= pstore_zone.o
> diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
> index d896457e7c11..80b4728c702f 100644
> --- a/fs/pstore/platform.c
> +++ b/fs/pstore/platform.c
> @@ -129,7 +129,7 @@ enum pstore_type_id pstore_name_to_type(const char *name)
>  }
>  EXPORT_SYMBOL_GPL(pstore_name_to_type);
>  
> -static const char *get_reason_str(enum kmsg_dump_reason reason)
> +const char *get_reason_str(enum kmsg_dump_reason reason)
>  {
>  	switch (reason) {
>  	case KMSG_DUMP_PANIC:
> @@ -148,6 +148,7 @@ static const char *get_reason_str(enum kmsg_dump_reason reason)
>  		return "Unknown";
>  	}
>  }
> +EXPORT_SYMBOL_GPL(get_reason_str);
>  
>  /*
>   * Should pstore_dump() wait for a concurrent pstore_dump()? If
> diff --git a/fs/pstore/pstore_zone.c b/fs/pstore/pstore_zone.c
> new file mode 100644
> index 000000000000..62c834072498
> --- /dev/null
> +++ b/fs/pstore/pstore_zone.c
> @@ -0,0 +1,989 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#define MODNAME "pstore-zone"
> +#define pr_fmt(fmt) MODNAME ": " fmt
> +
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/slab.h>
> +#include <linux/pstore.h>
> +#include <linux/mount.h>
> +#include <linux/printk.h>
> +#include <linux/fs.h>
> +#include <linux/pstore_zone.h>
> +#include <linux/kdev_t.h>
> +#include <linux/device.h>
> +#include <linux/namei.h>
> +#include <linux/fcntl.h>
> +#include <linux/uio.h>
> +#include <linux/writeback.h>
> +
> +/**
> + * struct psz_head - header of zone to flush to storage
> + *
> + * @sig: signature to indicate header (PSZ_SIG xor PSZONE-type value)
> + * @datalen: length of data in @data
> + * @data: zone data.
> + */
> +struct psz_buffer {
> +#define PSZ_SIG (0x43474244) /* DBGC */
> +	uint32_t sig;
> +	atomic_t datalen;
> +	uint8_t data[];
> +};
> +
> +/**
> + * struct psz_oops_header - sub header of oops zones to flush to storage
> + *
> + * @magic: magic num for oops header
> + * @time: oops/panic trigger time
> + * @compressed: whether conpressed
> + * @counter: oops/panic counter
> + * @reason: identify oops or panic
> + * @data: pointer to log data
> + *
> + * It's a sub-header of oops zone, trailing after &psz_buffer.
> + */
> +struct psz_oops_header {
> +#define OOPS_HEADER_MAGIC 0x4dfc3ae5 /* Just a ramdom number */
> +	uint32_t magic;
> +	struct timespec64 time;
> +	bool compressed;
> +	uint32_t counter;
> +	enum kmsg_dump_reason reason;
> +	uint8_t data[];
> +};
> +
> +/**
> + * struct psz_zone - zone information
> + *
> + * @off: zone offset of storage
> + * @type: front-end type for this zone
> + * @name: front-end name for this zone
> + * @buffer: pointer to data buffer managed by this zone
> + * @oldbuf: pointer to old data buffer.
> + * @buffer_size: bytes in @buffer->data
> + * @should_recover: whether this zone should recover from storage
> + * @dirty: whether the data in @buffer dirty
> + *
> + * zone structure in memory.
> + */
> +struct psz_zone {
> +	loff_t off;
> +	const char *name;
> +	enum pstore_type_id type;
> +
> +	struct psz_buffer *buffer;
> +	struct psz_buffer *oldbuf;
> +	size_t buffer_size;
> +	bool should_recover;
> +	atomic_t dirty;
> +};
> +
> +/**
> + * struct psz_context - all about running state of pstore/zone
> + *
> + * @opszs: oops/panic storage zones
> + * @oops_max_cnt: max count of @opszs
> + * @oops_read_cnt: counter to read oops zone
> + * @oops_write_cnt: counter to write
> + * @oops_counter: counter to oops
> + * @panic_counter: counter to panic
> + * @recovered: whether finish recovering data from storage
> + * @on_panic: whether occur panic
> + * @psz_info_lock: lock to @psz_info
> + * @psz_info: information from back-end
> + * @pstore: structure for pstore
> + */
> +struct psz_context {
> +	struct psz_zone **opszs;
> +	unsigned int oops_max_cnt;
> +	unsigned int oops_read_cnt;
> +	unsigned int oops_write_cnt;
> +	/*
> +	 * the counter should be recovered when recover.
> +	 * It records the oops/panic times after burning rather than booting.
> +	 */
> +	unsigned int oops_counter;
> +	unsigned int panic_counter;
> +	atomic_t recovered;
> +	atomic_t on_panic;
> +
> +	/*
> +	 * psz_info_lock just protects "psz_info" during calls to
> +	 * psz_register/psz_unregister
> +	 */
> +	spinlock_t psz_info_lock;
> +	struct psz_info *psz_info;
> +	struct pstore_info pstore;
> +};
> +static struct psz_context psz_cxt;
> +
> +/**
> + * enum psz_flush_mode - flush mode for psz_zone_write()
> + *
> + * @FLUSH_NONE: do not flush to storage but update data on memory
> + * @FLUSH_PART: just flush part of data including meta data to storage
> + * @FLUSH_META: just flush meta data of zone to storage
> + * @FLUSH_ALL: flush all of zone
> + */
> +enum psz_flush_mode {
> +	FLUSH_NONE = 0,
> +	FLUSH_PART,
> +	FLUSH_META,
> +	FLUSH_ALL,
> +};
> +
> +static inline int buffer_datalen(struct psz_zone *zone)
> +{
> +	return atomic_read(&zone->buffer->datalen);
> +}
> +
> +static inline bool is_on_panic(void)
> +{
> +	struct psz_context *cxt = &psz_cxt;
> +
> +	return atomic_read(&cxt->on_panic);
> +}
> +
> +static ssize_t psz_zone_read(struct psz_zone *zone, char *buf,
> +		size_t len, unsigned long off)
> +{
> +	if (!buf || !zone->buffer)
> +		return -EINVAL;
> +	if (off > zone->buffer_size)
> +		return -EINVAL;
> +	len = min_t(size_t, len, zone->buffer_size - off);
> +	memcpy(buf, zone->buffer->data + off, len);
> +	return len;
> +}
> +
> +static int psz_zone_write(struct psz_zone *zone,
> +		enum psz_flush_mode flush_mode, const char *buf,
> +		size_t len, unsigned long off)
> +{
> +	struct psz_info *info = psz_cxt.psz_info;
> +	ssize_t wcnt = 0;
> +	ssize_t (*writeop)(const char *buf, size_t bytes, loff_t pos);
> +	size_t wlen;
> +
> +	if (off > zone->buffer_size)
> +		return -EINVAL;
> +
> +	wlen = min_t(size_t, len, zone->buffer_size - off);
> +	if (buf && wlen) {
> +		memcpy(zone->buffer->data + off, buf, wlen);
> +		atomic_set(&zone->buffer->datalen, wlen + off);
> +	}
> +
> +	/* avoid to damage old records */
> +	if (!is_on_panic() && !atomic_read(&psz_cxt.recovered))
> +		goto dirty;
> +
> +	writeop = is_on_panic() ? info->panic_write : info->write;
> +	if (!writeop)
> +		goto dirty;
> +
> +	switch (flush_mode) {
> +	case FLUSH_NONE:
> +		if (unlikely(buf && wlen))
> +			goto dirty;
> +		return 0;
> +	case FLUSH_PART:
> +		wcnt = writeop((const char *)zone->buffer->data + off, wlen,
> +				zone->off + sizeof(*zone->buffer) + off);
> +		if (wcnt != wlen)
> +			goto dirty;
> +		fallthrough;
> +	case FLUSH_META:
> +		wlen = sizeof(struct psz_buffer);
> +		wcnt = writeop((const char *)zone->buffer, wlen, zone->off);
> +		if (wcnt != wlen)
> +			goto dirty;
> +		break;
> +	case FLUSH_ALL:
> +		wlen = zone->buffer_size + sizeof(*zone->buffer);
> +		wcnt = writeop((const char *)zone->buffer, wlen, zone->off);
> +		if (wcnt != wlen)
> +			goto dirty;
> +		break;
> +	}
> +
> +	return 0;
> +dirty:
> +	atomic_set(&zone->dirty, true);
> +	return -EBUSY;
> +}
> +
> +static int psz_flush_dirty_zone(struct psz_zone *zone)
> +{
> +	int ret;
> +
> +	if (!zone)
> +		return -EINVAL;
> +
> +	if (!atomic_read(&zone->dirty))
> +		return 0;
> +
> +	if (!atomic_read(&psz_cxt.recovered))
> +		return -EBUSY;
> +
> +	ret = psz_zone_write(zone, FLUSH_ALL, NULL, 0, 0);
> +	if (!ret)
> +		atomic_set(&zone->dirty, false);
> +	return ret;
> +}

To avoid multi writers call flush_dirty_zone(), I modified codes by
using atomic_xchg()

	static int psz_flush_dirty_zone(struct psz_zone *zone)

	{

	        int ret;

	

	        if (unlikely(!zone))

	                return -EINVAL;

	

	        if (unlikely(!atomic_read(&psz_cxt.recovered)))

	                return -EBUSY;

	

	       if (!atomic_xchg(&zone->dirty, false))

	                return 0;

	

	        ret = psz_zone_write(zone, FLUSH_ALL, NULL, 0, 0);

	        if (ret)

	                atomic_set(&zone->dirty, true);

	        return ret;

	}

> +
> +static int psz_flush_dirty_zones(struct psz_zone **zones, unsigned int cnt)
> +{
> +	int i, ret;
> +	struct psz_zone *zone;
> +
> +	if (!zones)
> +		return -EINVAL;
> +
> +	for (i = 0; i < cnt; i++) {
> +		zone = zones[i];
> +		if (!zone)
> +			return -EINVAL;
> +		ret = psz_flush_dirty_zone(zone);
> +		if (ret)
> +			return ret;
> +	}
> +	return 0;
> +}
> +
> +static int psz_move_zone(struct psz_zone *old, struct psz_zone *new)
> +{
> +	const char *data = (const char *)old->buffer->data;
> +	int ret;
> +
> +	ret = psz_zone_write(new, FLUSH_ALL, data, buffer_datalen(old), 0);
> +	if (ret) {
> +		atomic_set(&new->buffer->datalen, 0);
> +		atomic_set(&new->dirty, false);
> +		return ret;
> +	}
> +	atomic_set(&old->buffer->datalen, 0);
> +	return 0;
> +}
> +
> +static int psz_recover_oops_data(struct psz_context *cxt)
> +{
> +	struct psz_info *info = cxt->psz_info;
> +	struct psz_zone *zone = NULL;
> +	struct psz_buffer *buf;
> +	unsigned long i;
> +	ssize_t rcnt;
> +
> +	if (!info->read)
> +		return -EINVAL;
> +
> +	for (i = 0; i < cxt->oops_max_cnt; i++) {
> +		zone = cxt->opszs[i];
> +		if (unlikely(!zone))
> +			return -EINVAL;
> +		if (atomic_read(&zone->dirty)) {
> +			unsigned int wcnt = cxt->oops_write_cnt;
> +			struct psz_zone *new = cxt->opszs[wcnt];
> +			int ret;
> +
> +			ret = psz_move_zone(zone, new);
> +			if (ret) {
> +				pr_err("move zone from %lu to %d failed\n",
> +						i, wcnt);
> +				return ret;
> +			}
> +			cxt->oops_write_cnt = (wcnt + 1) % cxt->oops_max_cnt;
> +		}
> +		if (!zone->should_recover)
> +			continue;
> +		buf = zone->buffer;
> +		rcnt = info->read((char *)buf, zone->buffer_size + sizeof(*buf),
> +				zone->off);
> +		if (rcnt != zone->buffer_size + sizeof(*buf))
> +			return (int)rcnt < 0 ? (int)rcnt : -EIO;
> +	}
> +	return 0;
> +}
> +
> +static int psz_recover_oops_meta(struct psz_context *cxt)
> +{
> +	struct psz_info *info = cxt->psz_info;
> +	struct psz_zone *zone;
> +	size_t rcnt, len;
> +	struct psz_buffer *buf;
> +	struct psz_oops_header *hdr;
> +	struct timespec64 time = {0};
> +	unsigned long i;
> +	/*
> +	 * Recover may on panic, we can't allocate any memory by kmalloc.
> +	 * So, we use local array instead.
> +	 */
> +	char buffer_header[sizeof(*buf) + sizeof(*hdr)] = {0};
> +
> +	if (!info->read)
> +		return -EINVAL;
> +
> +	len = sizeof(*buf) + sizeof(*hdr);
> +	buf = (struct psz_buffer *)buffer_header;
> +	for (i = 0; i < cxt->oops_max_cnt; i++) {
> +		zone = cxt->opszs[i];
> +		if (unlikely(!zone))
> +			return -EINVAL;
> +
> +		rcnt = info->read((char *)buf, len, zone->off);
> +		if (rcnt != len) {
> +			pr_err("read %s with id %lu failed\n", zone->name, i);
> +			return (int)rcnt < 0 ? (int)rcnt : -EIO;
> +		}
> +
> +		if (buf->sig != zone->buffer->sig) {
> +			pr_debug("no valid data in oops zone %lu\n", i);
> +			continue;
> +		}
> +
> +		if (zone->buffer_size < atomic_read(&buf->datalen)) {
> +			pr_info("found overtop zone: %s: id %lu, off %lld, size %zu\n",
> +					zone->name, i, zone->off,
> +					zone->buffer_size);
> +			continue;
> +		}
> +
> +		hdr = (struct psz_oops_header *)buf->data;
> +		if (hdr->magic != OOPS_HEADER_MAGIC) {
> +			pr_info("found invalid zone: %s: id %lu, off %lld, size %zu\n",
> +					zone->name, i, zone->off,
> +					zone->buffer_size);
> +			continue;
> +		}
> +
> +		/*
> +		 * we get the newest zone, and the next one must be the oldest
> +		 * or unused zone, because we do write one by one like a circle.
> +		 */
> +		if (hdr->time.tv_sec >= time.tv_sec) {
> +			time.tv_sec = hdr->time.tv_sec;
> +			cxt->oops_write_cnt = (i + 1) % cxt->oops_max_cnt;
> +		}
> +
> +		if (hdr->reason == KMSG_DUMP_OOPS)
> +			cxt->oops_counter =
> +				max(cxt->oops_counter, hdr->counter);
> +		else
> +			cxt->panic_counter =
> +				max(cxt->panic_counter, hdr->counter);
> +
> +		if (!atomic_read(&buf->datalen)) {
> +			pr_debug("found erased zone: %s: id %lu, off %lld, size %zu, datalen %d\n",
> +					zone->name, i, zone->off,
> +					zone->buffer_size,
> +					atomic_read(&buf->datalen));
> +			continue;
> +		}
> +
> +		if (!is_on_panic())
> +			zone->should_recover = true;
> +		pr_debug("found nice zone: %s: id %lu, off %lld, size %zu, datalen %d\n",
> +				zone->name, i, zone->off,
> +				zone->buffer_size, atomic_read(&buf->datalen));
> +	}
> +
> +	return 0;
> +}
> +
> +static int psz_recover_oops(struct psz_context *cxt)
> +{
> +	int ret;
> +
> +	if (!cxt->opszs)
> +		return 0;
> +
> +	ret = psz_recover_oops_meta(cxt);
> +	if (ret)
> +		goto recover_fail;
> +
> +	ret = psz_recover_oops_data(cxt);
> +	if (ret)
> +		goto recover_fail;
> +
> +	return 0;
> +recover_fail:
> +	pr_debug("recover oops failed\n");
> +	return ret;
> +}
> +
> +/**
> + * psz_recovery() - recover data from storage
> + * @cxt: the context of pstore/zone
> + *
> + * recovery means reading data back from storage after rebooting
> + *
> + * Return: 0 on success, others on failure.
> + */
> +static inline int psz_recovery(struct psz_context *cxt)
> +{
> +	int ret = -EBUSY;
> +
> +	if (atomic_read(&cxt->recovered))
> +		return 0;
> +
> +	ret = psz_recover_oops(cxt);
> +	if (ret)
> +		goto recover_fail;
> +
> +	pr_debug("recover end!\n");
> +	atomic_set(&cxt->recovered, 1);
> +	return 0;
> +
> +recover_fail:
> +	pr_err("recover failed\n");
> +	return ret;
> +}
> +
> +static int psz_pstore_open(struct pstore_info *psi)
> +{
> +	struct psz_context *cxt = psi->data;
> +
> +	cxt->oops_read_cnt = 0;
> +	return 0;
> +}
> +
> +static inline bool psz_ok(struct psz_zone *zone)
> +{
> +	if (zone && zone->buffer && buffer_datalen(zone))
> +		return true;
> +	return false;
> +}
> +
> +static inline int psz_oops_erase(struct psz_context *cxt,
> +		struct psz_zone *zone, struct pstore_record *record)
> +{
> +	struct psz_buffer *buffer = zone->buffer;
> +	struct psz_oops_header *hdr =
> +		(struct psz_oops_header *)buffer->data;
> +
> +	if (unlikely(!psz_ok(zone)))
> +		return 0;
> +	/* this zone is already updated, no need to erase */
> +	if (record->count != hdr->counter)
> +		return 0;
> +
> +	atomic_set(&zone->buffer->datalen, 0);
> +	return psz_zone_write(zone, FLUSH_META, NULL, 0, 0);
> +}
> +
> +static int psz_pstore_erase(struct pstore_record *record)
> +{
> +	struct psz_context *cxt = record->psi->data;
> +
> +	switch (record->type) {
> +	case PSTORE_TYPE_DMESG:
> +		if (record->id >= cxt->oops_max_cnt)
> +			return -EINVAL;
> +		return psz_oops_erase(cxt, cxt->opszs[record->id], record);
> +	default:
> +		return -EINVAL;
> +	}
> +}
> +
> +static void psz_write_kmsg_hdr(struct psz_zone *zone,
> +		struct pstore_record *record)
> +{
> +	struct psz_context *cxt = record->psi->data;
> +	struct psz_buffer *buffer = zone->buffer;
> +	struct psz_oops_header *hdr =
> +		(struct psz_oops_header *)buffer->data;
> +
> +	hdr->magic = OOPS_HEADER_MAGIC;
> +	hdr->compressed = record->compressed;
> +	hdr->time.tv_sec = record->time.tv_sec;
> +	hdr->time.tv_nsec = record->time.tv_nsec;
> +	hdr->reason = record->reason;
> +	if (hdr->reason == KMSG_DUMP_OOPS)
> +		hdr->counter = ++cxt->oops_counter;
> +	else
> +		hdr->counter = ++cxt->panic_counter;
> +}
> +
> +static inline int notrace psz_oops_write_record(struct psz_context *cxt,
> +		struct pstore_record *record)
> +{
> +	size_t size, hlen;
> +	struct psz_zone *zone;
> +	unsigned int zonenum;
> +
> +	zonenum = cxt->oops_write_cnt;
> +	zone = cxt->opszs[zonenum];
> +	if (unlikely(!zone))
> +		return -ENOSPC;
> +	cxt->oops_write_cnt = (zonenum + 1) % cxt->oops_max_cnt;
> +
> +	pr_debug("write %s to zone id %d\n", zone->name, zonenum);
> +	psz_write_kmsg_hdr(zone, record);
> +	hlen = sizeof(struct psz_oops_header);
> +	size = min_t(size_t, record->size, zone->buffer_size - hlen);
> +	return psz_zone_write(zone, FLUSH_ALL, record->buf, size, hlen);
> +}
> +
> +static int notrace psz_oops_write(struct psz_context *cxt,
> +		struct pstore_record *record)
> +{
> +	int ret;
> +	struct psz_info *info = cxt->psz_info;
> +
> +	/*
> +	 * Out of the various oops/panic dump types, pstore/zone is currently
> +	 * designed to only store crash logs, rather than storing general
> +	 * kernel logs.
> +	 */
> +	if (record->reason != KMSG_DUMP_OOPS &&
> +			record->reason != KMSG_DUMP_PANIC)
> +		return -EINVAL;
> +
> +	/* Skip Oopes when configured to do so. */
> +	if (record->reason == KMSG_DUMP_OOPS && !info->dump_oops)
> +		return -EINVAL;
> +
> +	/*
> +	 * Explicitly only take the first part of any new crash.
> +	 * If our buffer is larger than kmsg_bytes, this can never happen,
> +	 * and if our buffer is smaller than kmsg_bytes, we don't want the
> +	 * report split across multiple records.
> +	 */
> +	if (record->part != 1)
> +		return -ENOSPC;
> +
> +	if (!cxt->opszs)
> +		return -ENOSPC;
> +
> +	ret = psz_oops_write_record(cxt, record);
> +	if (!ret) {
> +		pr_debug("try to flush other dirty oops zones\n");
> +		psz_flush_dirty_zones(cxt->opszs, cxt->oops_max_cnt);
> +	}
> +
> +	/* alway return 0 as we had handled it on buffer */
> +	return 0;
> +}
> +
> +static int notrace psz_pstore_write(struct pstore_record *record)
> +{
> +	struct psz_context *cxt = record->psi->data;
> +
> +	if (record->type == PSTORE_TYPE_DMESG &&
> +			record->reason == KMSG_DUMP_PANIC)
> +		atomic_set(&cxt->on_panic, 1);
> +
> +	switch (record->type) {
> +	case PSTORE_TYPE_DMESG:
> +		return psz_oops_write(cxt, record);
> +	default:
> +		return -EINVAL;
> +	}
> +}
> +
> +static struct psz_zone *psz_read_next_zone(struct psz_context *cxt)
> +{
> +	struct psz_zone *zone = NULL;
> +
> +	while (cxt->oops_read_cnt < cxt->oops_max_cnt) {
> +		zone = cxt->opszs[cxt->oops_read_cnt++];
> +		if (psz_ok(zone))
> +			return zone;
> +	}
> +
> +	return NULL;
> +}
> +
> +static int psz_read_oops_hdr(struct psz_zone *zone,
> +		struct pstore_record *record)
> +{
> +	struct psz_buffer *buffer = zone->buffer;
> +	struct psz_oops_header *hdr =
> +		(struct psz_oops_header *)buffer->data;
> +
> +	if (hdr->magic != OOPS_HEADER_MAGIC)
> +		return -EINVAL;
> +	record->compressed = hdr->compressed;
> +	record->time.tv_sec = hdr->time.tv_sec;
> +	record->time.tv_nsec = hdr->time.tv_nsec;
> +	record->reason = hdr->reason;
> +	record->count = hdr->counter;
> +	return 0;
> +}
> +
> +static ssize_t psz_oops_read(struct psz_zone *zone,
> +		struct pstore_record *record)
> +{
> +	ssize_t size, hlen = 0;
> +
> +	size = buffer_datalen(zone);
> +	/* Clear and skip this oops record if it has no valid header */
> +	if (psz_read_oops_hdr(zone, record)) {
> +		atomic_set(&zone->buffer->datalen, 0);
> +		atomic_set(&zone->dirty, 0);
> +		return -ENOMSG;
> +	}
> +	size -= sizeof(struct psz_oops_header);
> +
> +	if (!record->compressed) {
> +		char *buf = kasprintf(GFP_KERNEL, "%s: Total %d times\n",
> +				get_reason_str(record->reason), record->count);
> +		hlen = strlen(buf);
> +		record->buf = krealloc(buf, hlen + size, GFP_KERNEL);
> +		if (!record->buf) {
> +			kfree(buf);
> +			return -ENOMEM;
> +		}
> +	} else {
> +		record->buf = kmalloc(size, GFP_KERNEL);
> +		if (!record->buf)
> +			return -ENOMEM;
> +	}
> +
> +	size = psz_zone_read(zone, record->buf + hlen, size,
> +			sizeof(struct psz_oops_header) < 0);
> +	if (unlikely(size < 0)) {
> +		kfree(record->buf);
> +		return -ENOMSG;
> +	}
> +
> +	return size + hlen;
> +}
> +
> +static ssize_t psz_pstore_read(struct pstore_record *record)
> +{
> +	struct psz_context *cxt = record->psi->data;
> +	ssize_t (*readop)(struct psz_zone *zone,
> +			struct pstore_record *record);
> +	struct psz_zone *zone;
> +	ssize_t ret;
> +
> +	/* before read, we must recover from storage */
> +	ret = psz_recovery(cxt);
> +	if (ret)
> +		return ret;
> +
> +next_zone:
> +	zone = psz_read_next_zone(cxt);
> +	if (!zone)
> +		return 0;
> +
> +	record->type = zone->type;
> +	switch (record->type) {
> +	case PSTORE_TYPE_DMESG:
> +		readop = psz_oops_read;
> +		record->id = cxt->oops_read_cnt - 1;
> +		break;
> +	default:
> +		goto next_zone;
> +	}
> +
> +	ret = readop(zone, record);
> +	if (ret == -ENOMSG)
> +		goto next_zone;
> +	return ret;
> +}
> +
> +static struct psz_context psz_cxt = {
> +	.psz_info_lock = __SPIN_LOCK_UNLOCKED(psz_cxt.psz_info_lock),
> +	.recovered = ATOMIC_INIT(0),
> +	.on_panic = ATOMIC_INIT(0),
> +	.pstore = {
> +		.owner = THIS_MODULE,
> +		.name = MODNAME,
> +		.open = psz_pstore_open,
> +		.read = psz_pstore_read,
> +		.write = psz_pstore_write,
> +		.erase = psz_pstore_erase,
> +	},
> +};
> +
> +static struct psz_zone *psz_init_zone(enum pstore_type_id type,
> +		loff_t *off, size_t size)
> +{
> +	struct psz_info *info = psz_cxt.psz_info;
> +	struct psz_zone *zone;
> +	const char *name = pstore_type_to_name(type);
> +
> +	if (!size)
> +		return NULL;
> +
> +	if (*off + size > info->total_size) {
> +		pr_err("no room for %s (0x%zx@0x%llx over 0x%lx)\n",
> +			name, size, *off, info->total_size);
> +		return ERR_PTR(-ENOMEM);
> +	}
> +
> +	zone = kzalloc(sizeof(struct psz_zone), GFP_KERNEL);
> +	if (!zone)
> +		return ERR_PTR(-ENOMEM);
> +
> +	zone->buffer = kmalloc(size, GFP_KERNEL);
> +	if (!zone->buffer) {
> +		kfree(zone);
> +		return ERR_PTR(-ENOMEM);
> +	}
> +	memset(zone->buffer, 0xFF, size);
> +	zone->off = *off;
> +	zone->name = name;
> +	zone->type = type;
> +	zone->buffer_size = size - sizeof(struct psz_buffer);
> +	zone->buffer->sig = type ^ PSZ_SIG;
> +	atomic_set(&zone->dirty, 0);
> +	atomic_set(&zone->buffer->datalen, 0);
> +
> +	*off += size;
> +
> +	pr_debug("pszone %s: off 0x%llx, %zu header, %zu data\n", zone->name,
> +			zone->off, sizeof(*zone->buffer), zone->buffer_size);
> +	return zone;
> +}
> +
> +static struct psz_zone **psz_init_zones(enum pstore_type_id type,
> +	loff_t *off, size_t total_size, ssize_t record_size,
> +	unsigned int *cnt)
> +{
> +	struct psz_info *info = psz_cxt.psz_info;
> +	struct psz_zone **zones, *zone;
> +	const char *name = pstore_type_to_name(type);
> +	int c, i;
> +
> +	if (!total_size || !record_size)
> +		return NULL;
> +
> +	if (*off + total_size > info->total_size) {
> +		pr_err("no room for zones %s (0x%zx@0x%llx over 0x%lx)\n",
> +			name, total_size, *off, info->total_size);
> +		return ERR_PTR(-ENOMEM);
> +	}
> +
> +	c = total_size / record_size;
> +	zones = kcalloc(c, sizeof(*zones), GFP_KERNEL);
> +	if (!zones) {
> +		pr_err("allocate for zones %s failed\n", name);
> +		return ERR_PTR(-ENOMEM);
> +	}
> +	memset(zones, 0, c * sizeof(*zones));
> +
> +	for (i = 0; i < c; i++) {
> +		zone = psz_init_zone(type, off, record_size);
> +		if (!zone || IS_ERR(zone)) {
> +			pr_err("initialize zones %s failed\n", name);
> +			while (--i >= 0) {
> +				kfree(zones[i]->buffer);
> +				kfree(zones[i]);
> +			}
> +			kfree(zones);
> +			return (void *)zone;
> +		}
> +		zones[i] = zone;
> +	}
> +
> +	*cnt = c;
> +	return zones;
> +}
> +
> +static void psz_free_zone(struct psz_zone **pszone)
> +{
> +	struct psz_zone *zone = *pszone;
> +
> +	if (!zone)
> +		return;
> +
> +	kfree(zone->buffer);
> +	kfree(zone);
> +	*pszone = NULL;
> +}
> +
> +static void psz_free_zones(struct psz_zone ***pszones, unsigned int *cnt)
> +{
> +	struct psz_zone **zones = *pszones;
> +
> +	if (!zones)
> +		return;
> +
> +	while (*cnt > 0) {
> +		psz_free_zone(&zones[*cnt]);
> +		(*cnt)--;
> +	}
> +	kfree(zones);
> +	*pszones = NULL;
> +}
> +
> +static void psz_free_all_zones(struct psz_context *cxt)
> +{
> +	if (cxt->opszs)
> +		psz_free_zones(&cxt->opszs, &cxt->oops_max_cnt);
> +}
> +
> +static int psz_alloc_zones(struct psz_context *cxt)
> +{
> +	struct psz_info *info = cxt->psz_info;
> +	loff_t off = 0;
> +	int err;
> +	size_t size;
> +
> +	size = info->total_size;
> +	cxt->opszs = psz_init_zones(PSTORE_TYPE_DMESG, &off, size,
> +			info->oops_size, &cxt->oops_max_cnt);
> +	if (IS_ERR(cxt->opszs)) {
> +		err = PTR_ERR(cxt->opszs);
> +		goto fail_out;
> +	}
> +
> +	return 0;
> +fail_out:
> +	return err;
> +}
> +
> +/**
> + * psz_register() - register to pstore/zone
> + *
> + * @info: back-end driver information. See &struct psz_info.
> + *
> + * Only one back-end at one time.
> + *
> + * Return: 0 on success, others on failure.
> + */
> +int psz_register(struct psz_info *info)
> +{
> +	int err = -EINVAL;
> +	struct psz_context *cxt = &psz_cxt;
> +	struct module *owner = info->owner;
> +
> +	if (!info->total_size) {
> +		pr_warn("the total size must be non-zero\n");
> +		return -EINVAL;
> +	}
> +
> +	if (!info->oops_size) {
> +		pr_warn("at least one of the records be non-zero\n");
> +		return -EINVAL;
> +	}
> +
> +	if (!info->name || !info->name[0])
> +		return -EINVAL;
> +
> +	if (info->total_size < 4096) {
> +		pr_err("total size must be greater than 4096 bytes\n");
> +		return -EINVAL;
> +	}
> +
> +#define check_size(name, size) {					\
> +		if (info->name > 0 && info->name < (size)) {		\
> +			pr_err(#name " must be over %d\n", (size));	\
> +			return -EINVAL;					\
> +		}							\
> +		if (info->name & (size - 1)) {				\
> +			pr_err(#name " must be a multiple of %d\n",	\
> +					(size));			\
> +			return -EINVAL;					\
> +		}							\
> +	}
> +
> +	check_size(total_size, 4096);
> +	check_size(oops_size, SECTOR_SIZE);
> +
> +#undef check_size
> +
> +	/*
> +	 * the @read and @write must be applied.
> +	 * if no @read, pstore may mount failed.
> +	 * if no @write, pstore do not support to remove record file.
> +	 */
> +	if (!info->read || !info->write) {
> +		pr_err("no valid general read/write interface\n");
> +		return -EINVAL;
> +	}
> +
> +	spin_lock(&cxt->psz_info_lock);
> +	if (cxt->psz_info) {
> +		pr_warn("'%s' already loaded: ignoring '%s'\n",
> +				cxt->psz_info->name, info->name);
> +		spin_unlock(&cxt->psz_info_lock);
> +		return -EBUSY;
> +	}
> +	cxt->psz_info = info;
> +	spin_unlock(&cxt->psz_info_lock);
> +
> +	if (owner && !try_module_get(owner)) {
> +		err = -EBUSY;
> +		goto fail_out;
> +	}
> +
> +	pr_debug("register %s with properties:\n", info->name);
> +	pr_debug("\ttotal size : %ld Bytes\n", info->total_size);
> +	pr_debug("\toops size : %ld Bytes\n", info->oops_size);
> +
> +	err = psz_alloc_zones(cxt);
> +	if (err) {
> +		pr_err("alloc zones failed\n");
> +		goto put_module;
> +	}
> +
> +	if (info->oops_size) {
> +		cxt->pstore.bufsize = cxt->opszs[0]->buffer_size -
> +			sizeof(struct psz_oops_header);
> +		cxt->pstore.buf = kzalloc(cxt->pstore.bufsize, GFP_KERNEL);
> +		if (!cxt->pstore.buf) {
> +			err = -ENOMEM;
> +			goto free_all_zones;
> +		}
> +	}
> +	cxt->pstore.data = cxt;
> +	if (info->oops_size)
> +		cxt->pstore.flags = PSTORE_FLAGS_DMESG;
> +
> +	pr_info("Registered %s as pszone backend for%s%s\n", info->name,
> +			cxt->opszs && cxt->psz_info->dump_oops ? " Oops" : "",
> +			cxt->opszs && cxt->psz_info->panic_write ? " Panic" : "");
> +
> +	err = pstore_register(&cxt->pstore);
> +	if (err) {
> +		pr_err("registering with pstore failed\n");
> +		goto free_pstore_buf;
> +	}
> +
> +	module_put(owner);
> +	return 0;
> +
> +free_pstore_buf:
> +	kfree(cxt->pstore.buf);
> +free_all_zones:
> +	psz_free_all_zones(cxt);
> +put_module:
> +	module_put(owner);
> +fail_out:
> +	spin_lock(&psz_cxt.psz_info_lock);
> +	psz_cxt.psz_info = NULL;
> +	spin_unlock(&psz_cxt.psz_info_lock);
> +	return err;
> +}
> +EXPORT_SYMBOL_GPL(psz_register);
> +
> +/**
> + * psz_unregister() - unregister to pstore/zone
> + *
> + * @info: back-end driver information. See &struct psz_info.
> + */
> +void psz_unregister(struct psz_info *info)
> +{
> +	struct psz_context *cxt = &psz_cxt;
> +
> +	pstore_unregister(&cxt->pstore);
> +	kfree(cxt->pstore.buf);
> +	cxt->pstore.bufsize = 0;
> +
> +	spin_lock(&cxt->psz_info_lock);
> +	cxt->psz_info = NULL;
> +	spin_unlock(&cxt->psz_info_lock);
> +
> +	psz_free_all_zones(cxt);
> +}
> +EXPORT_SYMBOL_GPL(psz_unregister);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("WeiXiong Liao <liaoweixiong@allwinnertech.com>");
> +MODULE_DESCRIPTION("Storage Manager for Pstore/blk and Pstore/ram");
> diff --git a/include/linux/pstore.h b/include/linux/pstore.h
> index e779441e6d26..859583a7104d 100644
> --- a/include/linux/pstore.h
> +++ b/include/linux/pstore.h
> @@ -45,6 +45,7 @@ enum pstore_type_id {
>  
>  const char *pstore_type_to_name(enum pstore_type_id type);
>  enum pstore_type_id pstore_name_to_type(const char *name);
> +const char *get_reason_str(enum kmsg_dump_reason reason);
>  
>  struct pstore_info;
>  /**
> diff --git a/include/linux/pstore_zone.h b/include/linux/pstore_zone.h
> new file mode 100644
> index 000000000000..afc10a67a777
> --- /dev/null
> +++ b/include/linux/pstore_zone.h
> @@ -0,0 +1,44 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef __PSTORE_ZONE_H_
> +#define __PSTORE_ZONE_H_
> +
> +#include <linux/types.h>
> +
> +typedef ssize_t (*psz_read_op)(char *, size_t, loff_t);
> +typedef ssize_t (*psz_write_op)(const char *, size_t, loff_t);
> +/**
> + * struct psz_info - pstore/zone back-end driver structure
> + *
> + * @owner:	Module which is responsible for this back-end driver.
> + * @name:	Name of the back-end driver.
> + * @total_size: The total size in bytes pstore/zone can use. It must be greater
> + *		than 4096 and be multiple of 4096.
> + * @oops_size:	The size of oops/panic zone. Zero means disabled, otherwise,
> + *		it must be multiple of SECTOR_SIZE(512 Bytes).
> + * @dump_oops:	Whether to dump oops log.
> + * @read:	The general read operation. Both of the function parameters
> + *		@size and @offset are relative value to storage.
> + *		On success, the number of bytes should be returned, others
> + *		means error.
> + * @write:	The same as @read.
> + * @panic_write:The write operation only used for panic case. It's optional
> + *		if you do not care panic log. The parameters and return value
> + *		are the same as @read.
> + */
> +struct psz_info {
> +	struct module *owner;
> +	const char *name;
> +
> +	unsigned long total_size;
> +	unsigned long oops_size;
> +	int dump_oops;
> +	psz_read_op read;
> +	psz_write_op write;
> +	psz_write_op panic_write;
> +};
> +
> +extern int psz_register(struct psz_info *info);
> +extern void psz_unregister(struct psz_info *info);
> +
> +#endif
> 

-- 
WeiXiong Liao

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 05/11] pstore/blk: blkoops: support console recorder
  2020-03-25  8:55 ` [PATCH v3 05/11] pstore/blk: blkoops: support console recorder WeiXiong Liao
@ 2020-04-14  2:13   ` WeiXiong Liao
  0 siblings, 0 replies; 17+ messages in thread
From: WeiXiong Liao @ 2020-04-14  2:13 UTC (permalink / raw)
  To: Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, Mauro Carvalho Chehab, Rob Herring,
	David S. Miller, Greg Kroah-Hartman, Jonathan Cameron
  Cc: linux-mtd, linux-kernel, linux-doc

Hi Kees Cook,

A bad subject on patch 5 and patch 6 that "blkoops" should be "pstore/zone".

I will fix it in the next version.

On 2020/3/25 PM 4:55, WeiXiong Liao wrote:
> Support recorder for console. To enable console recorder, just make
> console_size be greater than 0 and a multiple of 4096.
> 
> Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com>
> ---
>  fs/pstore/Kconfig           | 12 ++++++++
>  fs/pstore/pstore_blk.c      | 12 +++++++-
>  fs/pstore/pstore_zone.c     | 70 +++++++++++++++++++++++++++++++++++++++++----
>  include/linux/pstore_zone.h |  4 ++-
>  4 files changed, 91 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/pstore/Kconfig b/fs/pstore/Kconfig
> index 8cead860dcfc..bf90de48ad3c 100644
> --- a/fs/pstore/Kconfig
> +++ b/fs/pstore/Kconfig
> @@ -225,6 +225,18 @@ config PSTORE_BLK_PMSG_SIZE
>  	  NOTE that, both Kconfig and module parameters can configure
>  	  pstore/blk, but module parameters have priority over Kconfig.
>  
> +config PSTORE_BLK_CONSOLE_SIZE
> +	int "Size in Kbytes of console to store"
> +	depends on PSTORE_BLK
> +	depends on PSTORE_CONSOLE
> +	default 64
> +	help
> +	  This just sets size of console (console_size) for pstore/blk. The
> +	  size is in KB and must be a multiple of 4.
> +
> +	  NOTE that, both Kconfig and module parameters can configure
> +	  pstore/blk, but module parameters have priority over Kconfig.
> +
>  config PSTORE_BLK_DUMP_OOPS
>  	bool "dump oops"
>  	depends on PSTORE_BLK
> diff --git a/fs/pstore/pstore_blk.c b/fs/pstore/pstore_blk.c
> index 85cd9f2335be..2b513acaa18f 100644
> --- a/fs/pstore/pstore_blk.c
> +++ b/fs/pstore/pstore_blk.c
> @@ -26,6 +26,14 @@
>  module_param(pmsg_size, long, 0400);
>  MODULE_PARM_DESC(pmsg_size, "pmsg size in kbytes");
>  
> +#if IS_ENABLED(CONFIG_PSTORE_CONSOLE)
> +static long console_size = CONFIG_PSTORE_BLK_CONSOLE_SIZE;
> +#else
> +static long console_size = -1;
> +#endif
> +module_param(console_size, long, 0400);
> +MODULE_PARM_DESC(console_size, "console size in kbytes");
> +
>  static int dump_oops = CONFIG_PSTORE_BLK_DUMP_OOPS;
>  module_param(dump_oops, int, 0400);
>  MODULE_PARM_DESC(total_size, "whether dump oops");
> @@ -81,7 +89,8 @@
>   *		whole disk).
>   *		On success, the number of bytes should be returned, others
>   *		means error.
> - * @write:	The same as @read.
> + * @write:	The same as @read, but the following error number:
> + *		-EBUSY means try to write again later.
>   * @panic_write:The write operation only used for panic case. It's optional
>   *		if you do not care panic log. The parameters and return value
>   *		are the same as @read.
> @@ -131,6 +140,7 @@ static int psblk_register_do(struct psblk_device *dev)
>  
>  	verify_size(oops_size, 4096, dev->flags & PSTORE_FLAGS_DMESG);
>  	verify_size(pmsg_size, 4096, dev->flags & PSTORE_FLAGS_PMSG);
> +	verify_size(console_size, 4096, dev->flags & PSTORE_FLAGS_CONSOLE);
>  #undef verify_size
>  	dump_oops = dump_oops <= 0 ? 0 : 1;
>  
> diff --git a/fs/pstore/pstore_zone.c b/fs/pstore/pstore_zone.c
> index 444bce7f9ac3..e1e84505b046 100644
> --- a/fs/pstore/pstore_zone.c
> +++ b/fs/pstore/pstore_zone.c
> @@ -87,10 +87,12 @@ struct psz_zone {
>   *
>   * @opszs: oops/panic storage zones
>   * @ppsz: pmsg storage zone
> + * @cpsz: console storage zone
>   * @oops_max_cnt: max count of @opszs
>   * @oops_read_cnt: counter to read oops zone
>   * @oops_write_cnt: counter to write
>   * @pmsg_read_cnt: counter to read pmsg zone
> + * @console_read_cnt: counter to read console zone
>   * @oops_counter: counter to oops
>   * @panic_counter: counter to panic
>   * @recovered: whether finish recovering data from storage
> @@ -102,10 +104,12 @@ struct psz_zone {
>  struct psz_context {
>  	struct psz_zone **opszs;
>  	struct psz_zone *ppsz;
> +	struct psz_zone *cpsz;
>  	unsigned int oops_max_cnt;
>  	unsigned int oops_read_cnt;
>  	unsigned int oops_write_cnt;
>  	unsigned int pmsg_read_cnt;
> +	unsigned int console_read_cnt;
>  	/*
>  	 * the counter should be recovered when recover.
>  	 * It records the oops/panic times after burning rather than booting.
> @@ -125,6 +129,9 @@ struct psz_context {
>  };
>  static struct psz_context psz_cxt;
>  
> +static void psz_flush_all_dirty_zones(struct work_struct *> +static DECLARE_WORK(psz_cleaner, psz_flush_all_dirty_zones);

I think it's better to use delayed work.

	static DECLARE_DELAYED_WORK(psz_cleaner, psz_flush_all_dirty_zones);

> +
>  /**
>   * enum psz_flush_mode - flush mode for psz_zone_write()
>   *
> @@ -235,6 +242,9 @@ static int psz_zone_write(struct psz_zone *zone,
>  	return 0;
>  dirty:
>  	atomic_set(&zone->dirty, true);
> +	/* flush dirty zones nicely */
> +	if (wcnt == -EBUSY && !is_on_panic())
> +		schedule_work(&psz_cleaner);

Change to:
	
	schedule_delayed_work(&psz_cleaner, msecs_to_jiffies(500));

delay for 500ms to merge more log of console and reduce calling times.

>  	return -EBUSY;
>  }
>  
> @@ -291,6 +301,15 @@ static int psz_move_zone(struct psz_zone *old, struct psz_zone *new)
>  	return 0;
>  }
>  
> +static void psz_flush_all_dirty_zones(struct work_struct *work)
> +{
> +	struct psz_context *cxt = &psz_cxt;
> +
> +	psz_flush_dirty_zone(cxt->ppsz);
> +	psz_flush_dirty_zone(cxt->cpsz);
> +	psz_flush_dirty_zones(cxt->opszs, cxt->oops_max_cnt);
> +}

If flush dirty failed, I think it should try again later.

	int ret = 0;
	
	ret |= psz_flush_dirty_zone(cxt->ppsz);
	ret |= psz_flush_dirty_zone(cxt->cpsz);
	ret |= psz_flush_dirty_zones(cxt->opszs, cxt->oops_max_cnt);
	if (ret)
		schedule_delayed_work(&psz_cleaner, msecs_to_jiffies(1000));

I will fix it in the next version.

> +
>  static int psz_recover_oops_data(struct psz_context *cxt)
>  {
>  	struct psz_info *info = cxt->psz_info;
> @@ -546,6 +565,10 @@ static inline int psz_recovery(struct psz_context *cxt)
>  	if (ret)
>  		goto recover_fail;
>  
> +	ret = psz_recover_zone(cxt, cxt->cpsz);
> +	if (ret)
> +		goto recover_fail;
> +
>  	pr_debug("recover end!\n");
>  	atomic_set(&cxt->recovered, 1);
>  	return 0;
> @@ -561,6 +584,7 @@ static int psz_pstore_open(struct pstore_info *psi)
>  
>  	cxt->oops_read_cnt = 0;
>  	cxt->pmsg_read_cnt = 0;
> +	cxt->console_read_cnt = 0;
>  	return 0;
>  }
>  
> @@ -625,8 +649,9 @@ static int psz_pstore_erase(struct pstore_record *record)
>  		return psz_oops_erase(cxt, cxt->opszs[record->id], record);
>  	case PSTORE_TYPE_PMSG:
>  		return psz_record_erase(cxt, cxt->ppsz);
> -	default:
> -		return -EINVAL;
> +	case PSTORE_TYPE_CONSOLE:
> +		return psz_record_erase(cxt, cxt->cpsz);
> +	default: return -EINVAL;
>  	}
>  }
>  
> @@ -767,9 +792,18 @@ static int notrace psz_pstore_write(struct pstore_record *record)
>  			record->reason == KMSG_DUMP_PANIC)
>  		atomic_set(&cxt->on_panic, 1);
>  
> +	/*
> +	 * if on panic, do not write except panic records
> +	 * Fix case that panic_write prints log which wakes up console recorder.
> +	 */
> +	if (is_on_panic() && record->type != PSTORE_TYPE_DMESG)
> +		return -EBUSY;
> +
>  	switch (record->type) {
>  	case PSTORE_TYPE_DMESG:
>  		return psz_oops_write(cxt, record);
> +	case PSTORE_TYPE_CONSOLE:
> +		return psz_record_write(cxt->cpsz, record);
>  	case PSTORE_TYPE_PMSG:
>  		return psz_record_write(cxt->ppsz, record);
>  	default:
> @@ -794,6 +828,13 @@ static struct psz_zone *psz_read_next_zone(struct psz_context *cxt)
>  			return zone;
>  	}
>  
> +	if (cxt->console_read_cnt == 0) {
> +		cxt->console_read_cnt++;
> +		zone = cxt->cpsz;
> +		if (psz_old_ok(zone))
> +			return zone;
> +	}
> +
>  	return NULL;
>  }
>  
> @@ -903,6 +944,8 @@ static ssize_t psz_pstore_read(struct pstore_record *record)
>  		readop = psz_oops_read;
>  		record->id = cxt->oops_read_cnt - 1;
>  		break;
> +	case PSTORE_TYPE_CONSOLE:
> +		fallthrough;
>  	case PSTORE_TYPE_PMSG:
>  		readop = psz_record_read;
>  		break;
> @@ -1050,6 +1093,8 @@ static void psz_free_all_zones(struct psz_context *cxt)
>  		psz_free_zones(&cxt->opszs, &cxt->oops_max_cnt);
>  	if (cxt->ppsz)
>  		psz_free_zone(&cxt->ppsz);
> +	if (cxt->cpsz)
> +		psz_free_zone(&cxt->cpsz);
>  }
>  
>  static int psz_alloc_zones(struct psz_context *cxt)
> @@ -1066,6 +1111,14 @@ static int psz_alloc_zones(struct psz_context *cxt)
>  		goto free_out;
>  	}
>  
> +	off_size += info->console_size;
> +	cxt->cpsz = psz_init_zone(PSTORE_TYPE_CONSOLE, &off,
> +			info->console_size);
> +	if (IS_ERR(cxt->cpsz)) {
> +		err = PTR_ERR(cxt->cpsz);
> +		goto free_out;
> +	}
> +
>  	cxt->opszs = psz_init_zones(PSTORE_TYPE_DMESG, &off,
>  			info->total_size - off_size,
>  			info->oops_size, &cxt->oops_max_cnt);
> @@ -1100,7 +1153,7 @@ int psz_register(struct psz_info *info)
>  		return -EINVAL;
>  	}
>  
> -	if (!info->oops_size && !info->pmsg_size) {
> +	if (!info->oops_size && !info->pmsg_size && !info->console_size) {
>  		pr_warn("at least one of the records be non-zero\n");
>  		return -EINVAL;
>  	}
> @@ -1128,6 +1181,7 @@ int psz_register(struct psz_info *info)
>  	check_size(total_size, 4096);
>  	check_size(oops_size, SECTOR_SIZE);
>  	check_size(pmsg_size, SECTOR_SIZE);
> +	check_size(console_size, SECTOR_SIZE);
>  
>  #undef check_size
>  
> @@ -1160,6 +1214,7 @@ int psz_register(struct psz_info *info)
>  	pr_debug("\ttotal size : %ld Bytes\n", info->total_size);
>  	pr_debug("\toops size : %ld Bytes\n", info->oops_size);
>  	pr_debug("\tpmsg size : %ld Bytes\n", info->pmsg_size);
> +	pr_debug("\tconsole size : %ld Bytes\n", info->console_size);
>  
>  	err = psz_alloc_zones(cxt);
>  	if (err) {
> @@ -1181,11 +1236,14 @@ int psz_register(struct psz_info *info)
>  		cxt->pstore.flags |= PSTORE_FLAGS_DMESG;
>  	if (info->pmsg_size)
>  		cxt->pstore.flags |= PSTORE_FLAGS_PMSG;
> +	if (info->console_size)
> +		cxt->pstore.flags |= PSTORE_FLAGS_CONSOLE;
>  
> -	pr_info("Registered %s as pszone backend for%s%s%s\n", info->name,
> +	pr_info("Registered %s as pszone backend for%s%s%s%s\n", info->name,
>  			cxt->opszs && cxt->psz_info->dump_oops ? " Oops" : "",
>  			cxt->opszs && cxt->psz_info->panic_write ? " Panic" : "",
> -			cxt->ppsz ? " Pmsg" : "");
> +			cxt->ppsz ? " Pmsg" : "",
> +			cxt->cpsz ? " Console" : "");
>  
>  	err = pstore_register(&cxt->pstore);
>  	if (err) {
> @@ -1219,6 +1277,8 @@ void psz_unregister(struct psz_info *info)
>  {
>  	struct psz_context *cxt = &psz_cxt;
>  
> +	flush_work(&psz_cleaner);

I think it should try to flush dirty zones before unregister.

	psz_flush_all_dirty_zones(NULL);
	flush_delayed_work(&psz_cleaner);

> +
>  	pstore_unregister(&cxt->pstore);
>  	kfree(cxt->pstore.buf);
>  	cxt->pstore.bufsize = 0;
> diff --git a/include/linux/pstore_zone.h b/include/linux/pstore_zone.h
> index 85e159d8f935..8a1838633010 100644
> --- a/include/linux/pstore_zone.h
> +++ b/include/linux/pstore_zone.h
> @@ -17,12 +17,13 @@
>   * @oops_size:	The size of oops/panic zone. Zero means disabled, otherwise,
>   *		it must be multiple of SECTOR_SIZE(512 Bytes).
>   * @pmsg_size:	The size of pmsg zone which is the same as @oops_size.
> + * @console_size:The size of console zone which is the same as @oops_size.
>   * @dump_oops:	Whether to dump oops log.
>   * @read:	The general read operation. Both of the function parameters
>   *		@size and @offset are relative value to storage.
>   *		On success, the number of bytes should be returned, others
>   *		means error.
> - * @write:	The same as @read.
> + * @write:	The same as @read, but -EBUSY means try to write again later.
>   * @panic_write:The write operation only used for panic case. It's optional
>   *		if you do not care panic log. The parameters and return value
>   *		are the same as @read.
> @@ -34,6 +35,7 @@ struct psz_info {
>  	unsigned long total_size;
>  	unsigned long oops_size;
>  	unsigned long pmsg_size;
> +	unsigned long console_size;
>  	int dump_oops;
>  	psz_read_op read;
>  	psz_write_op write;
> 

-- 
WeiXiong Liao

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 00/11] pstore: mtd: support crash log to block and mtd device
  2020-03-25  8:54 [PATCH v3 00/11] pstore: mtd: support crash log to block and mtd device WeiXiong Liao
                   ` (10 preceding siblings ...)
  2020-03-25  8:55 ` [PATCH v3 11/11] mtd: new support oops logger based on pstore/blk WeiXiong Liao
@ 2020-05-07 19:45 ` Kees Cook
  11 siblings, 0 replies; 17+ messages in thread
From: Kees Cook @ 2020-05-07 19:45 UTC (permalink / raw)
  To: WeiXiong Liao
  Cc: Rob Herring, Tony Luck, Vignesh Raghavendra, Jonathan Corbet,
	Richard Weinberger, Anton Vorontsov, linux-doc,
	Greg Kroah-Hartman, linux-kernel, Colin Cross, linux-mtd,
	Jonathan Cameron, Miquel Raynal, Mauro Carvalho Chehab,
	David S. Miller

On Wed, Mar 25, 2020 at 04:54:55PM +0800, WeiXiong Liao wrote:
> [PATCH v3]:
> 1. patch 1~10: a lot of improvements according to Kees Cook <keescook@chromium.org>
>                including rename module, typo, reorder, rewrite document, bugs
> 			   and so on.
> 2. patch 11: rename funtions of pstore/blk and update document.

This is looking good. I'm going to update it for the recent max_reason
series, and make some tweaks here and there and send out a v4 tomorrow
to see what you think. Thanks for your patience!

-- 
Kees Cook

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 02/11] pstore/blk: new support logger for block devices
  2020-03-25  8:54 ` [PATCH v3 02/11] pstore/blk: new support logger for block devices WeiXiong Liao
@ 2020-05-08  6:07   ` Kees Cook
  2020-05-08 12:46     ` WeiXiong Liao
  0 siblings, 1 reply; 17+ messages in thread
From: Kees Cook @ 2020-05-08  6:07 UTC (permalink / raw)
  To: WeiXiong Liao
  Cc: Rob Herring, Tony Luck, Vignesh Raghavendra, Jonathan Corbet,
	Richard Weinberger, Anton Vorontsov, linux-doc,
	Greg Kroah-Hartman, linux-kernel, Colin Cross, linux-mtd,
	Jonathan Cameron, Miquel Raynal, Mauro Carvalho Chehab,
	David S. Miller

On Wed, Mar 25, 2020 at 04:54:57PM +0800, WeiXiong Liao wrote:
> pstore/blk is similar to pstore/ram, but dump log to block device
> rather than persistent ram.
> [...]
> +int psblk_register_blkdev(unsigned int major, psblk_panic_write_op panic_write)
> +{
> +	struct block_device *bdev;
> +	struct psblk_device dev = {0};
> +	struct bdev_info *binfo;
> +	int ret = -ENODEV;
> +	void *holder = blkdev;
> +
> +	binfo = psblk_get_bdev_info();
> +	if (IS_ERR(binfo))
> +		return PTR_ERR(binfo);
> +
> +	/* only allow driver matching the @blkdev */
> +	if (!binfo->devt || MAJOR(binfo->devt) != major) {
> +		pr_debug("invalid major %u (expect %u)\n",
> +				major, MAJOR(binfo->devt));
> +		return -ENODEV;
> +	}
> +
> +	/* hold bdev exclusively */
> +	bdev = psblk_get_bdev(holder);
> +	if (IS_ERR(bdev)) {
> +		pr_err("failed to open '%s'!\n", blkdev);
> +		return PTR_ERR(bdev);
> +	}
> +
> +	/* psblk_bdev must be assigned before register to pstore/blk */
> +	psblk_bdev = bdev;
> +	blkdev_panic_write = panic_write;
> +
> +	dev.total_size = psblk_bdev_size(bdev);
> +	dev.panic_write = panic_write ? psblk_blk_panic_write : NULL;
> +	dev.read = psblk_generic_blk_read;
> +	dev.write = psblk_generic_blk_write;
> +
> +	ret = psblk_register_do(&dev);
> +	if (ret)
> +		goto err_put_bdev;
> +
> +	pr_info("using '%s'\n", blkdev);
> +	return 0;
> +
> +err_put_bdev:
> +	psblk_bdev = NULL;
> +	blkdev_panic_write = NULL;
> +	psblk_put_bdev(bdev, holder);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(psblk_register_blkdev);

I've gotten this series refactored on top of current pstore, and I've
been making various bikeshed changes to names, etc, and as I went to go
start testing, I realized that nothing actually uses
psblk_register_blkdev().

It seems like it should be possible to just start using this on any
block device of the user's choosing. I assume the idea is to allow for
drivers to register panic_write handlers, but even without that, it'd be
nice to just be able to test this with something like /dev/loop0.

What's your thinking on how this would happen? It seems like if
pstore/blk uses pstore/zone, and mtdpstore uses pstore/blk, there should
be a blkoops that uses pstore/blk too? I guess I need to learn a bit
more about how block device probing works so pstore/blk can notice
devices as they're brought online, etc.

-- 
Kees Cook

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 02/11] pstore/blk: new support logger for block devices
  2020-05-08  6:07   ` Kees Cook
@ 2020-05-08 12:46     ` WeiXiong Liao
  0 siblings, 0 replies; 17+ messages in thread
From: WeiXiong Liao @ 2020-05-08 12:46 UTC (permalink / raw)
  To: Kees Cook
  Cc: Rob Herring, Tony Luck, Vignesh Raghavendra, Jonathan Corbet,
	Richard Weinberger, Anton Vorontsov, linux-doc,
	Greg Kroah-Hartman, linux-kernel, Colin Cross, linux-mtd,
	Jonathan Cameron, Miquel Raynal, Mauro Carvalho Chehab,
	David S. Miller

hi Kees Cook,

On 2020/5/8 PM 2:07, Kees Cook wrote:
> On Wed, Mar 25, 2020 at 04:54:57PM +0800, WeiXiong Liao wrote:
>> pstore/blk is similar to pstore/ram, but dump log to block device
>> rather than persistent ram.
>> [...]
>> +int psblk_register_blkdev(unsigned int major, psblk_panic_write_op panic_write)
>> +{
>> +	struct block_device *bdev;
>> +	struct psblk_device dev = {0};
>> +	struct bdev_info *binfo;
>> +	int ret = -ENODEV;
>> +	void *holder = blkdev;
>> +
>> +	binfo = psblk_get_bdev_info();
>> +	if (IS_ERR(binfo))
>> +		return PTR_ERR(binfo);
>> +
>> +	/* only allow driver matching the @blkdev */
>> +	if (!binfo->devt || MAJOR(binfo->devt) != major) {
>> +		pr_debug("invalid major %u (expect %u)\n",
>> +				major, MAJOR(binfo->devt));
>> +		return -ENODEV;
>> +	}
>> +
>> +	/* hold bdev exclusively */
>> +	bdev = psblk_get_bdev(holder);
>> +	if (IS_ERR(bdev)) {
>> +		pr_err("failed to open '%s'!\n", blkdev);
>> +		return PTR_ERR(bdev);
>> +	}
>> +
>> +	/* psblk_bdev must be assigned before register to pstore/blk */
>> +	psblk_bdev = bdev;
>> +	blkdev_panic_write = panic_write;
>> +
>> +	dev.total_size = psblk_bdev_size(bdev);
>> +	dev.panic_write = panic_write ? psblk_blk_panic_write : NULL;
>> +	dev.read = psblk_generic_blk_read;
>> +	dev.write = psblk_generic_blk_write;
>> +
>> +	ret = psblk_register_do(&dev);
>> +	if (ret)
>> +		goto err_put_bdev;
>> +
>> +	pr_info("using '%s'\n", blkdev);
>> +	return 0;
>> +
>> +err_put_bdev:
>> +	psblk_bdev = NULL;
>> +	blkdev_panic_write = NULL;
>> +	psblk_put_bdev(bdev, holder);
>> +	return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(psblk_register_blkdev);
> 
> I've gotten this series refactored on top of current pstore, and I've
> been making various bikeshed changes to names, etc, and as I went to go
> start testing, I realized that nothing actually uses
> psblk_register_blkdev().
> 
> It seems like it should be possible to just start using this on any
> block device of the user's choosing. I assume the idea is to allow for
> drivers to register panic_write handlers, but even without that, it'd be
> nice to just be able to test this with something like /dev/loop0.
> 

Yes. psblk_register_blkdev() is there for block drivers to register
panic_write()
handlers. The panic_wrire() is used only when panic occurs. Not only the
panic
log, but also all data on dirty zones. I implement the panic_write() of mmc
and nand on the platform of Allwinner, but I think it is not ready to
submit to
community.

All other front-ends but dmesg for panic are available since pstore/blk
provides
the general write/read through IO stack. That's why /dev/loop0 seemed to
works well.

> What's your thinking on how this would happen? It seems like if
> pstore/blk uses pstore/zone, and mtdpstore uses pstore/blk, there should
> be a blkoops that uses pstore/blk too?  I guess I need to learn a bit> more about how block device probing works so pstore/blk can notice
> devices as they're brought online, etc.
> 

pstore/blk provides all  user options and register function for device
drivers.
The mtdpstore is the implementation case using pstore/blk. How about
'mmc_pstore' for mmc and *_pstore for others block device?

I guess I need to learn more about how pstore/blk can notice devices too.
I think pstore/blk can be better if block device can do like mtd device that
not only notifies but also provides generic panic_write().

-- 
WeiXiong Liao

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2020-05-08 12:47 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-25  8:54 [PATCH v3 00/11] pstore: mtd: support crash log to block and mtd device WeiXiong Liao
2020-03-25  8:54 ` [PATCH v3 01/11] pstore/zone: a common layer to manage storage as zones WeiXiong Liao
2020-04-14  1:42   ` WeiXiong Liao
2020-03-25  8:54 ` [PATCH v3 02/11] pstore/blk: new support logger for block devices WeiXiong Liao
2020-05-08  6:07   ` Kees Cook
2020-05-08 12:46     ` WeiXiong Liao
2020-03-25  8:54 ` [PATCH v3 03/11] pstore/blk: respect for driver to pick pstore front-ends WeiXiong Liao
2020-03-25  8:54 ` [PATCH v3 04/11] pstore/blk: pstore/zone: support pmsg recorder WeiXiong Liao
2020-03-25  8:55 ` [PATCH v3 05/11] pstore/blk: blkoops: support console recorder WeiXiong Liao
2020-04-14  2:13   ` WeiXiong Liao
2020-03-25  8:55 ` [PATCH v3 06/11] pstore/blk: blkoops: support ftrace recorder WeiXiong Liao
2020-03-25  8:55 ` [PATCH v3 07/11] Documentation: create document for pstore/blk WeiXiong Liao
2020-03-25  8:55 ` [PATCH v3 08/11] pstore/zone: skip broken zone for MTD device WeiXiong Liao
2020-03-25  8:55 ` [PATCH v3 09/11] pstore/blk: a way to get user configure about pstore front-ends WeiXiong Liao
2020-03-25  8:55 ` [PATCH v3 10/11] pstore/zone: pstore/blk: support non-block devices WeiXiong Liao
2020-03-25  8:55 ` [PATCH v3 11/11] mtd: new support oops logger based on pstore/blk WeiXiong Liao
2020-05-07 19:45 ` [PATCH v3 00/11] pstore: mtd: support crash log to block and mtd device Kees Cook

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).