linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC v9 0/5] pstore/block: new support logger for block devices
@ 2019-02-19 11:52 liaoweixiong
  2019-02-19 11:52 ` [RFC v9 1/5] pstore/blk: " liaoweixiong
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: liaoweixiong @ 2019-02-19 11:52 UTC (permalink / raw)
  To: Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Rob Herring, Mark Rutland, liaoweixiong,
	Mauro Carvalho Chehab, David S. Miller, Greg Kroah-Hartman,
	Nicolas Ferre, Arnd Bergmann
  Cc: linux-doc, linux-kernel, devicetree

Why should we need pstore_block?
1. Most embedded intelligent equipment have no persistent ram, which
increases costs. We perfer to cheaper solutions, like block devices.
In fast, there is already a sample for block device logger in driver
MTD (drivers/mtd/mtdoops.c).
2. Do not any equipment have battery, which means that it lost all data
on general ram if power failure. Pstore has little to do for these
equipments.

[PATCH v9]
On patch 1:
1. rename part_path/part_size, members of blkz_info, to blkdev/total_size
2. if total_size is zero, get size from @blkdev
3. support multiple variants for @blkdev, such as partuuid, major with minor,
   and /dev/xxxx. See details on Documentation.
4. get size from block device
5. add depends on CONFIG_BLOCK
On patch 2:
1. update document
On patch 3:
1. update codes for new blkzone. Blkoops support insmod without total_size.
   for example: "insmod ./blkoops.ko blkdev=93:6" (major:minor).
2. use late_initcalls rather than module_init, to avoid block device not ready.
3. support for block driver to add panic apis to blkoops. By this, block
   driver can do the least work that just provides panic operations.
On patch 5:
1. update document

[PATCH v8]
On patch 2:
1. move DT to /bindings/pstore
2. Delete details for kernel.

[PATCH v7]
On patch 1:
1. Fix line over 80 characters.
On patch 2:
1. Insert a separate patch for DT bindings.

[PATCH v6]
On patch 1:
1. Fix according to email from Kees Cook, including spelling mistakes,
   explicit overflow test, none of the zeroing etc.
2. Do not recover data but metadata of dmesg when panic.
3. No need to take recovery when do erase.
4. Do not use "blkoops" for blkzone any more because "blkoops" is used for
   other module now. (rename blkbuf to blkoops)
On patch 2:
1. Rename blkbuf to blkoops.
2. Add Kconfig/device tree/module parameters settings for blkoops.
3. Add document for device tree.
On patch 3:
1. Blkoops support pmsg.
2. Fix description for new version patch.
On patch 4:
1. Fix description for new version patch.

[PATCH v5]
On patch 1:
1. rename pstore/rom to pstore/blk
2. Do not allocate any memory in the write path of panic. So, use local
array instead in function romz_recover_dmesg_meta.
3. Add C header file "linux/fs.h" to fix implicit declaration of function
   'filp_open','kernel_read'...
On patch 3:
1. If panic, do not recover pmsg but flush if it is dirty.
2. Fix erase pmsg failed.
On patch 4:
1. Create a document for pstore/blk

[PATCH v4]
On patch 1:
1. Fix always true condition '(--i >= 0) => (0-u32max >= 0)' in function
   romz_init_zones by defining variable i to 'int' rahter than
   'unsigned int'.
2. To make codes more easily to read, we use macro READ_NEXT_ZONE for
   return value of romz_dmesg_read if it need to read next zone.
   Moveover, we assign READ_NEXT_ZONE -1024 rather than 0.
3. Add 'FLUSH_META' to 'enum romz_flush_mode' and rename 'NOT_FLUSH' to
   'FLUSH_NONE'
4. Function romz_zone_write work badly with FLUSH_PART mode as badly
   address and offset to write.
On patch 3:
NEW SUPPORT psmg for pstore_rom.

[PATCH v3]
On patch 1:
Fix build as module error for undefined 'vfs_read' and 'vfs_write'
Both of 'vfs_read' and 'vfs_write' haven't be exproted yet, so we use
'kernel_read' and 'kernel_write' instead.

[PATCH v2]
On patch 1:
Fix build as module error for redefinition of 'romz_unregister' and
'romz_register'

[PATCH v1]
On patch 1:
Core codes of pstore_rom, which works well on allwinner(sunxi) platform.
On patch 2:
A sample for pstore_rom, using general ram rather than block device.

liaoweixiong (5):
  pstore/blk: new support logger for block devices
  dt-bindings: pstore-block: new support for blkoops
  pstore/blk: add blkoops for pstore_blk
  pstore/blk: support pmsg for pstore block
  Documentation: pstore/blk: create document for pstore_blk

 Documentation/admin-guide/pstore-block.rst         |  233 ++++
 .../devicetree/bindings/pstore/blkoops.txt         |   53 +
 MAINTAINERS                                        |    4 +-
 fs/pstore/Kconfig                                  |  147 +++
 fs/pstore/Makefile                                 |    5 +
 fs/pstore/blkoops.c                                |  265 +++++
 fs/pstore/blkzone.c                                | 1223 ++++++++++++++++++++
 include/linux/pstore_blk.h                         |   87 ++
 8 files changed, 2016 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/admin-guide/pstore-block.rst
 create mode 100644 Documentation/devicetree/bindings/pstore/blkoops.txt
 create mode 100644 fs/pstore/blkoops.c
 create mode 100644 fs/pstore/blkzone.c
 create mode 100644 include/linux/pstore_blk.h

-- 
1.9.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [RFC v9 1/5] pstore/blk: new support logger for block devices
  2019-02-19 11:52 [RFC v9 0/5] pstore/block: new support logger for block devices liaoweixiong
@ 2019-02-19 11:52 ` liaoweixiong
  2019-02-19 11:52 ` [RFC v9 2/5] dt-bindings: pstore-block: new support for blkoops liaoweixiong
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: liaoweixiong @ 2019-02-19 11:52 UTC (permalink / raw)
  To: Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Rob Herring, Mark Rutland, liaoweixiong,
	Mauro Carvalho Chehab, David S. Miller, Greg Kroah-Hartman,
	Nicolas Ferre, Arnd Bergmann
  Cc: linux-doc, linux-kernel, devicetree

pstore_blk is similar to pstore_ram, but dump log to block devices
rather than persistent ram.

Why should we need pstore_blk?
1. Most embedded intelligent equipment have no persistent ram, which
increases costs. We perfer to cheaper solutions, like block devices.
In fact, there is already a sample for block device logger in driver
MTD (drivers/mtd/mtdoops.c).
2. Do not any equipment have battery, which means that it lost all data
on general ram if power failure. Pstore has little to do for these
equipments.

pstore_blk can only dump Oops/Panic log to block devices. It only
supports dmesg now. To make pstore_blk work, the block driver should
provide the block device and the read/write apis when on panic.

pstore_blk begins at 'blkz_register', by witch block device can register
a block device to pstore_blk. Then pstore_blk divide and manage the
block device as zones, which is similar to pstore_ram.

Recommend that, block driver register pstore_blk after block device is
ready.

pstore_blk works well on allwinner(sunxi) platform.

Signed-off-by: liaoweixiong <liaoweixiong@allwinnertech.com>
---
 fs/pstore/Kconfig          |    8 +
 fs/pstore/Makefile         |    3 +
 fs/pstore/blkzone.c        | 1013 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/pstore_blk.h |   80 ++++
 4 files changed, 1104 insertions(+)
 create mode 100644 fs/pstore/blkzone.c
 create mode 100644 include/linux/pstore_blk.h

diff --git a/fs/pstore/Kconfig b/fs/pstore/Kconfig
index 8b3ba27..defcb75 100644
--- a/fs/pstore/Kconfig
+++ b/fs/pstore/Kconfig
@@ -152,3 +152,11 @@ config PSTORE_RAM
 	  "ramoops.ko".
 
 	  For more information, see Documentation/admin-guide/ramoops.rst.
+
+config PSTORE_BLK
+	tristate "Log panic/oops to a block device"
+	depends on PSTORE
+	depends on BLOCK
+	help
+	  This enables panic and oops message to be logged to a block dev
+	  where it can be read back at some later point.
diff --git a/fs/pstore/Makefile b/fs/pstore/Makefile
index 967b589..0ee2fc8 100644
--- a/fs/pstore/Makefile
+++ b/fs/pstore/Makefile
@@ -12,3 +12,6 @@ pstore-$(CONFIG_PSTORE_PMSG)	+= pmsg.o
 
 ramoops-objs += ram.o ram_core.o
 obj-$(CONFIG_PSTORE_RAM)	+= ramoops.o
+
+obj-$(CONFIG_PSTORE_BLK) += pstore_blk.o
+pstore_blk-y += blkzone.o
diff --git a/fs/pstore/blkzone.c b/fs/pstore/blkzone.c
new file mode 100644
index 0000000..83dd181
--- /dev/null
+++ b/fs/pstore/blkzone.c
@@ -0,0 +1,1013 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * blkzone.c: Block device Oops/Panic logger
+ *
+ * Copyright (C) 2019 liaoweixiong <liaoweixiong@gallwinnertech.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#define MODNAME "pstore-blk"
+#define pr_fmt(fmt) MODNAME ": " fmt
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/blkdev.h>
+#include <linux/pstore.h>
+#include <linux/mount.h>
+#include <linux/printk.h>
+#include <linux/fs.h>
+#include <linux/pstore_blk.h>
+#include <linux/kdev_t.h>
+#include <linux/device.h>
+#include <linux/namei.h>
+#include <linux/fcntl.h>
+
+
+#define PSTORE_BLKDEV "/dev/pstore-blk"
+
+/**
+ * struct blkz_head - head of zone to flush to storage
+ *
+ * @sig: signature to indicate header (BLK_SIG xor BLKZONE-type value)
+ * @datalen: length of data in @data
+ * @data: zone data.
+ */
+struct blkz_buffer {
+#define BLK_SIG (0x43474244) /* DBGC */
+	uint32_t sig;
+	atomic_t datalen;
+	uint8_t data[];
+};
+
+/**
+ * struct blkz_dmesg_header: dmesg information
+ *
+ * @magic: magic num for dmesg header
+ * @time: trigger time
+ * @compressed: whether conpressed
+ * @count: oops/panic counter
+ * @reason: identify oops or panic
+ */
+struct blkz_dmesg_header {
+#define DMESG_HEADER_MAGIC 0x4dfc3ae5
+	uint32_t magic;
+	struct timespec64 time;
+	bool compressed;
+	uint32_t counter;
+	enum kmsg_dump_reason reason;
+	uint8_t data[0];
+};
+
+/**
+ * struct blkz_zone - zone information
+ * @off:
+ *	zone offset of block device
+ * @type:
+ *	frontent type for this zone
+ * @name:
+ *	frontent name for this zone
+ * @buffer:
+ *	pointer to data buffer managed by this zone
+ * @buffer_size:
+ *	bytes in @buffer->data
+ * @should_recover:
+ *	should recover from storage
+ * @dirty:
+ *	mark whether the data in @buffer are dirty (not flush to storage yet)
+ */
+struct blkz_zone {
+	unsigned long off;
+	const char *name;
+	enum pstore_type_id type;
+
+	struct blkz_buffer *buffer;
+	size_t buffer_size;
+	bool should_recover;
+	atomic_t dirty;
+};
+
+struct blkz_context {
+	struct blkz_zone **dbzs;	/* dmesg block zones */
+	unsigned int dmesg_max_cnt;
+	unsigned int dmesg_read_cnt;
+	unsigned int dmesg_write_cnt;
+	/*
+	 * the counter should be recovered when do recovery
+	 * It records the oops/panic times after burning rather than booting.
+	 */
+	unsigned int oops_counter;
+	unsigned int panic_counter;
+	atomic_t recovery;
+	atomic_t on_panic;
+
+	/*
+	 * bzinfo_lock just protects "bzinfo" during calls to
+	 * blkz_register/blkz_unregister
+	 */
+	spinlock_t bzinfo_lock;
+	struct blkz_info *bzinfo;
+	struct pstore_info pstore;
+};
+static struct blkz_context blkz_cxt;
+
+enum blkz_flush_mode {
+	FLUSH_NONE = 0,
+	FLUSH_PART,
+	FLUSH_META,
+	FLUSH_ALL,
+};
+
+static inline int buffer_datalen(struct blkz_zone *zone)
+{
+	return atomic_read(&zone->buffer->datalen);
+}
+
+static inline bool is_on_panic(void)
+{
+	struct blkz_context *cxt = &blkz_cxt;
+
+	return atomic_read(&cxt->on_panic);
+}
+
+static int blkz_zone_read(struct blkz_zone *zone, char *buf,
+		size_t len, unsigned long off)
+{
+	if (!buf || !zone->buffer)
+		return -EINVAL;
+	if (off > zone->buffer_size)
+		return -EINVAL;
+	len = min_t(size_t, len, zone->buffer_size - off);
+	memcpy(buf, zone->buffer->data + off, len);
+	return 0;
+}
+
+static int blkz_zone_write(struct blkz_zone *zone,
+		enum blkz_flush_mode flush_mode, const char *buf,
+		size_t len, unsigned long off)
+{
+	struct blkz_info *info = blkz_cxt.bzinfo;
+	ssize_t wcnt;
+	ssize_t (*writeop)(const char *buf, size_t bytes, loff_t pos);
+	size_t wlen;
+
+	if (off > zone->buffer_size)
+		return -EINVAL;
+	wlen = min_t(size_t, len, zone->buffer_size - off);
+	if (flush_mode != FLUSH_META && flush_mode != FLUSH_NONE) {
+		if (buf && zone->buffer)
+			memcpy(zone->buffer->data + off, buf, wlen);
+		atomic_set(&zone->buffer->datalen, wlen + off);
+	}
+
+	writeop = is_on_panic() ? info->panic_write : info->write;
+	if (!writeop)
+		return -EINVAL;
+
+	switch (flush_mode) {
+	case FLUSH_NONE:
+		return 0;
+	case FLUSH_PART:
+		wcnt = writeop((const char *)zone->buffer->data + off, wlen,
+				zone->off + sizeof(*zone->buffer) + off);
+		if (wcnt != wlen)
+			goto set_dirty;
+	case FLUSH_META:
+		wlen = sizeof(struct blkz_buffer);
+		wcnt = writeop((const char *)zone->buffer, wlen, zone->off);
+		if (wcnt != wlen)
+			goto set_dirty;
+		break;
+	case FLUSH_ALL:
+		wlen = buffer_datalen(zone) + sizeof(*zone->buffer);
+		wcnt = writeop((const char *)zone->buffer, wlen, zone->off);
+		if (wcnt != wlen)
+			goto set_dirty;
+		break;
+	}
+
+	return 0;
+set_dirty:
+	pr_err("write failed with %zd returned, set dirty\n", wcnt);
+	atomic_set(&zone->dirty, true);
+	return -EBUSY;
+}
+
+/*
+ * blkz_move_zone: move data from a old zone to a new zone
+ *
+ * @old: the old zone
+ * @new: the new zone
+ *
+ * NOTE:
+ *	Call blkz_zone_write to copy and flush data. If it failed, we
+ *	should reset new->dirty, because the new zone not really dirty.
+ */
+static int blkz_move_zone(struct blkz_zone *old, struct blkz_zone *new)
+{
+	const char *data = (const char *)old->buffer->data;
+	int ret;
+
+	ret = blkz_zone_write(new, FLUSH_ALL, data, buffer_datalen(old), 0);
+	if (ret) {
+		atomic_set(&new->buffer->datalen, 0);
+		atomic_set(&new->dirty, false);
+		return ret;
+	}
+	atomic_set(&old->buffer->datalen, 0);
+	return 0;
+}
+
+static int blkz_recover_dmesg_data(struct blkz_context *cxt)
+{
+	struct blkz_info *info = cxt->bzinfo;
+	struct blkz_zone *zone = NULL;
+	struct blkz_buffer *buf;
+	unsigned long i;
+	ssize_t (*readop)(char *buf, size_t bytes, loff_t pos);
+	ssize_t rcnt;
+
+	readop = is_on_panic() ? info->panic_read : info->read;
+	if (!readop)
+		return -EINVAL;
+
+	for (i = 0; i < cxt->dmesg_max_cnt; i++) {
+		zone = cxt->dbzs[i];
+		if (unlikely(!zone))
+			return -EINVAL;
+		if (atomic_read(&zone->dirty)) {
+			unsigned int wcnt = cxt->dmesg_write_cnt;
+			struct blkz_zone *new = cxt->dbzs[wcnt];
+			int ret;
+
+			ret = blkz_move_zone(zone, new);
+			if (ret) {
+				pr_err("move zone from %lu to %d failed\n",
+						i, wcnt);
+				return ret;
+			}
+			cxt->dmesg_write_cnt = (wcnt + 1) % cxt->dmesg_max_cnt;
+		}
+		if (!zone->should_recover)
+			continue;
+		buf = zone->buffer;
+		rcnt = readop((char *)buf, zone->buffer_size + sizeof(*buf),
+				zone->off);
+		if (rcnt != zone->buffer_size + sizeof(*buf))
+			return (int)rcnt < 0 ? (int)rcnt : -EIO;
+	}
+	return 0;
+}
+
+/**
+ * blkz_recover_dmesg_meta: recover metadata of dmesg
+ *
+ * Recover metadata as follow:
+ * @cxt->dmesg_write_cnt
+ * @cxt->oops_counter
+ * @cxt->panic_counter
+ */
+static int blkz_recover_dmesg_meta(struct blkz_context *cxt)
+{
+	struct blkz_info *info = cxt->bzinfo;
+	struct blkz_zone *zone;
+	size_t rcnt, len;
+	struct blkz_buffer *buf;
+	struct blkz_dmesg_header *hdr;
+	ssize_t (*readop)(char *buf, size_t bytes, loff_t pos);
+	struct timespec64 time = {0};
+	unsigned long i;
+	/*
+	 * Recover may on panic, we can't allocate any memory by kmalloc.
+	 * So, we use local array instead.
+	 */
+	char buffer_header[sizeof(*buf) + sizeof(*hdr)] = {0};
+
+	readop = is_on_panic() ? info->panic_read : info->read;
+	if (!readop)
+		return -EINVAL;
+
+	len = sizeof(*buf) + sizeof(*hdr);
+	buf = (struct blkz_buffer *)buffer_header;
+	for (i = 0; i < cxt->dmesg_max_cnt; i++) {
+		zone = cxt->dbzs[i];
+		if (unlikely(!zone))
+			return -EINVAL;
+
+		rcnt = readop((char *)buf, len, zone->off);
+		if (rcnt != len)
+			return (int)rcnt < 0 ? (int)rcnt : -EIO;
+
+		/*
+		 * If sig NOT match, it means this zone never used before,
+		 * because we write one by one, and we never modify sig even
+		 * when erase. So, we do not need to check next one.
+		 */
+		if (buf->sig != zone->buffer->sig) {
+			cxt->dmesg_write_cnt = i;
+			pr_debug("no valid data in dmesg zone %lu\n", i);
+			break;
+		}
+
+		if (zone->buffer_size < atomic_read(&buf->datalen)) {
+			pr_info("found overtop zone: %s: id %lu, off %lu, size %zu\n",
+					zone->name, i, zone->off,
+					zone->buffer_size);
+			continue;
+		}
+
+		hdr = (struct blkz_dmesg_header *)buf->data;
+		if (hdr->magic != DMESG_HEADER_MAGIC) {
+			pr_info("found invalid zone: %s: id %lu, off %lu, size %zu\n",
+					zone->name, i, zone->off,
+					zone->buffer_size);
+			continue;
+		}
+
+		/*
+		 * we get the newest zone, and the next one must be the oldest
+		 * or unused zone, because we do write one by one like a circle.
+		 */
+		if (hdr->time.tv_sec >= time.tv_sec) {
+			time.tv_sec = hdr->time.tv_sec;
+			cxt->dmesg_write_cnt = (i + 1) % cxt->dmesg_max_cnt;
+		}
+
+		if (hdr->reason == KMSG_DUMP_OOPS)
+			cxt->oops_counter =
+				max(cxt->oops_counter, hdr->counter);
+		else
+			cxt->panic_counter =
+				max(cxt->panic_counter, hdr->counter);
+
+		if (!atomic_read(&buf->datalen)) {
+			pr_debug("found erased zone: %s: id %ld, off %lu, size %zu, datalen %d\n",
+					zone->name, i, zone->off,
+					zone->buffer_size,
+					atomic_read(&buf->datalen));
+			continue;
+		}
+
+		if (!is_on_panic())
+			zone->should_recover = true;
+		pr_debug("found nice zone: %s: id %ld, off %lu, size %zu, datalen %d\n",
+				zone->name, i, zone->off,
+				zone->buffer_size, atomic_read(&buf->datalen));
+	}
+
+	return 0;
+}
+
+static int blkz_recover_dmesg(struct blkz_context *cxt)
+{
+	int ret;
+
+	if (!cxt->dbzs)
+		return 0;
+
+	ret = blkz_recover_dmesg_meta(cxt);
+	if (ret)
+		goto recover_fail;
+
+	ret = blkz_recover_dmesg_data(cxt);
+	if (ret)
+		goto recover_fail;
+
+	return 0;
+recover_fail:
+	pr_debug("recover dmesg failed\n");
+	return ret;
+}
+
+static inline int blkz_recovery(struct blkz_context *cxt)
+{
+	int ret = -EBUSY;
+
+	if (atomic_read(&cxt->recovery))
+		return 0;
+
+	ret = blkz_recover_dmesg(cxt);
+	if (ret)
+		goto recover_fail;
+
+	atomic_set(&cxt->recovery, 1);
+	pr_debug("recover end!\n");
+	return 0;
+
+recover_fail:
+	pr_debug("recovery failed, handle buffer\n");
+	return ret;
+}
+
+static int blkz_pstore_open(struct pstore_info *psi)
+{
+	struct blkz_context *cxt = psi->data;
+
+	cxt->dmesg_read_cnt = 0;
+	return 0;
+}
+
+static inline bool blkz_ok(struct blkz_zone *zone)
+{
+	if (!zone || !zone->buffer || !buffer_datalen(zone))
+		return false;
+	return true;
+}
+
+static int blkz_pstore_erase(struct pstore_record *record)
+{
+	struct blkz_context *cxt = record->psi->data;
+	struct blkz_zone *zone = NULL;
+
+	if (record->type == PSTORE_TYPE_DMESG)
+		zone = cxt->dbzs[record->id];
+	if (!blkz_ok(zone))
+		return 0;
+
+	atomic_set(&zone->buffer->datalen, 0);
+	return blkz_zone_write(zone, FLUSH_META, NULL, 0, 0);
+}
+
+static void blkz_write_kmsg_hdr(struct blkz_zone *zone,
+		struct pstore_record *record)
+{
+	struct blkz_context *cxt = record->psi->data;
+	struct blkz_buffer *buffer = zone->buffer;
+	struct blkz_dmesg_header *hdr =
+		(struct blkz_dmesg_header *)buffer->data;
+
+	hdr->magic = DMESG_HEADER_MAGIC;
+	hdr->compressed = record->compressed;
+	hdr->time.tv_sec = record->time.tv_sec;
+	hdr->time.tv_nsec = record->time.tv_nsec;
+	hdr->reason = record->reason;
+	if (hdr->reason == KMSG_DUMP_OOPS)
+		hdr->counter = ++cxt->oops_counter;
+	else
+		hdr->counter = ++cxt->panic_counter;
+}
+
+static int notrace blkz_dmesg_write(struct blkz_context *cxt,
+		struct pstore_record *record)
+{
+	struct blkz_info *info = cxt->bzinfo;
+	struct blkz_zone *zone;
+	size_t size, hlen;
+
+	/*
+	 * Out of the various dmesg dump types, pstore/blk is currently designed
+	 * to only store crash logs, rather than storing general kernel logs.
+	 */
+	if (record->reason != KMSG_DUMP_OOPS &&
+			record->reason != KMSG_DUMP_PANIC)
+		return -EINVAL;
+
+	/* Skip Oopes when configured to do so. */
+	if (record->reason == KMSG_DUMP_OOPS && !info->dump_oops)
+		return -EINVAL;
+
+	/*
+	 * Explicitly only take the first part of any new crash.
+	 * If our buffer is larger than kmsg_bytes, this can never happen,
+	 * and if our buffer is smaller than kmsg_bytes, we don't want the
+	 * report split across multiple records.
+	 */
+	if (record->part != 1)
+		return -ENOSPC;
+
+	if (!cxt->dbzs)
+		return -ENOSPC;
+
+	zone = cxt->dbzs[cxt->dmesg_write_cnt];
+	if (!zone)
+		return -ENOSPC;
+
+	blkz_write_kmsg_hdr(zone, record);
+	hlen = sizeof(struct blkz_dmesg_header);
+	size = record->size;
+	if (size + hlen > zone->buffer_size)
+		size = zone->buffer_size - hlen;
+	blkz_zone_write(zone, FLUSH_ALL, record->buf, size, hlen);
+
+	pr_debug("write %s to zone id %d\n", zone->name, cxt->dmesg_write_cnt);
+	cxt->dmesg_write_cnt = (cxt->dmesg_write_cnt + 1) % cxt->dmesg_max_cnt;
+	return 0;
+}
+
+static int notrace blkz_pstore_write(struct pstore_record *record)
+{
+	struct blkz_context *cxt = record->psi->data;
+
+	if (record->type == PSTORE_TYPE_DMESG &&
+			record->reason == KMSG_DUMP_PANIC)
+		atomic_set(&cxt->on_panic, 1);
+
+	/*
+	 * before write, we must recover from storage.
+	 * if recover failed, handle buffer
+	 */
+	blkz_recovery(cxt);
+
+	switch (record->type) {
+	case PSTORE_TYPE_DMESG:
+		return blkz_dmesg_write(cxt, record);
+	default:
+		return -EINVAL;
+	}
+}
+
+#define READ_NEXT_ZONE ((ssize_t)(-1024))
+static struct blkz_zone *blkz_read_next_zone(struct blkz_context *cxt)
+{
+	struct blkz_zone *zone = NULL;
+
+	while (cxt->dmesg_read_cnt < cxt->dmesg_max_cnt) {
+		zone = cxt->dbzs[cxt->dmesg_read_cnt++];
+		if (blkz_ok(zone))
+			return zone;
+	}
+
+	return NULL;
+}
+
+static int blkz_read_dmesg_hdr(struct blkz_zone *zone,
+		struct pstore_record *record)
+{
+	struct blkz_buffer *buffer = zone->buffer;
+	struct blkz_dmesg_header *hdr =
+		(struct blkz_dmesg_header *)buffer->data;
+
+	if (hdr->magic != DMESG_HEADER_MAGIC)
+		return -EINVAL;
+	record->compressed = hdr->compressed;
+	record->time.tv_sec = hdr->time.tv_sec;
+	record->time.tv_nsec = hdr->time.tv_nsec;
+	record->reason = hdr->reason;
+	record->count = hdr->counter;
+	return 0;
+}
+
+static ssize_t blkz_dmesg_read(struct blkz_zone *zone,
+		struct pstore_record *record)
+{
+	size_t size, hlen = 0;
+
+	size = buffer_datalen(zone);
+	/* Clear and skip this DMESG record if it has no valid header */
+	if (blkz_read_dmesg_hdr(zone, record)) {
+		atomic_set(&zone->buffer->datalen, 0);
+		atomic_set(&zone->dirty, 0);
+		return READ_NEXT_ZONE;
+	}
+	size -= sizeof(struct blkz_dmesg_header);
+
+	if (!record->compressed) {
+		char *buf = kasprintf(GFP_KERNEL,
+				"%s: Total %d times\n",
+				record->reason == KMSG_DUMP_OOPS ? "Oops" :
+				"Panic", record->count);
+		hlen = strlen(buf);
+		record->buf = krealloc(buf, hlen + size, GFP_KERNEL);
+		if (!record->buf) {
+			kfree(buf);
+			return -ENOMEM;
+		}
+	} else {
+		record->buf = kmalloc(size, GFP_KERNEL);
+		if (!record->buf)
+			return -ENOMEM;
+	}
+
+	if (unlikely(blkz_zone_read(zone, record->buf + hlen, size,
+				sizeof(struct blkz_dmesg_header)) < 0)) {
+		kfree(record->buf);
+		return READ_NEXT_ZONE;
+	}
+
+	return size + hlen;
+}
+
+static ssize_t blkz_pstore_read(struct pstore_record *record)
+{
+	struct blkz_context *cxt = record->psi->data;
+	ssize_t (*blkz_read)(struct blkz_zone *zone,
+			struct pstore_record *record);
+	struct blkz_zone *zone;
+	ssize_t ret;
+
+	/*
+	 * before read, we must recover from storage.
+	 * if recover failed, handle buffer
+	 */
+	blkz_recovery(cxt);
+
+next_zone:
+	zone = blkz_read_next_zone(cxt);
+	if (!zone)
+		return 0;
+
+	record->type = zone->type;
+	switch (record->type) {
+	case PSTORE_TYPE_DMESG:
+		blkz_read = blkz_dmesg_read;
+		record->id = cxt->dmesg_read_cnt - 1;
+		break;
+	default:
+		goto next_zone;
+	}
+
+	ret = blkz_read(zone, record);
+	if (ret == READ_NEXT_ZONE)
+		goto next_zone;
+	return ret;
+}
+
+static struct blkz_context blkz_cxt = {
+	.bzinfo_lock = __SPIN_LOCK_UNLOCKED(blkz_cxt.bzinfo_lock),
+	.recovery = ATOMIC_INIT(0),
+	.on_panic = ATOMIC_INIT(0),
+	.pstore = {
+		.owner = THIS_MODULE,
+		.name = MODNAME,
+		.open = blkz_pstore_open,
+		.read = blkz_pstore_read,
+		.write = blkz_pstore_write,
+		.erase = blkz_pstore_erase,
+	},
+};
+
+static long long blkz_blkdev_size(const char *path)
+{
+	long long size;
+	struct file *filp;
+	struct inode *inode;
+	struct hd_struct *part;
+
+	filp = filp_open(path, O_RDONLY, 0);
+	if (IS_ERR(filp))
+		return PTR_ERR(filp);
+	inode = filp->f_inode;
+	if (!S_ISBLK(inode->i_mode))
+		return -ENOTBLK;
+	part = inode->i_bdev->bd_part;
+	size = (long long)part_nr_sects_read(part) * SECTOR_SIZE;
+	filp_close(filp, NULL);
+
+	return size;
+}
+
+/**
+ * blkz_create_dev: create block device to PSTORE_BLKDEV
+ *
+ * It uses name_to_dev_t to get dev_t, so it accpet the following variants:
+ *	1) <hex_major><hex_minor> device number in hexadecimal represents itself
+ *	   no leading 0x, for example b302.
+ *	2) /dev/<disk_name> represents the device number of disk
+ *	3) /dev/<disk_name><decimal> represents the device number
+ *	   of partition - device number of disk plus the partition number
+ *	4) /dev/<disk_name>p<decimal> - same as the above, that form is
+ *	   used when disk name of partitioned disk ends on a digit.
+ *	5) PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF representing the
+ *	   unique id of a partition if the partition table provides it.
+ *	   The UUID may be either an EFI/GPT UUID, or refer to an MSDOS
+ *	   partition using the format SSSSSSSS-PP, where SSSSSSSS is a zero-
+ *	   filled hex representation of the 32-bit "NT disk signature", and PP
+ *	   is a zero-filled hex representation of the 1-based partition number.
+ *	6) PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in relation to
+ *	   a partition with a known unique id.
+ *	7) <major>:<minor> major and minor number of the device separated by
+ *	   a colon.
+ */
+static int blkz_create_dev(const char *bdev)
+{
+	int err;
+	dev_t devt;
+	struct path path;
+	struct dentry *dentry;
+
+	if (!bdev)
+		return -EINVAL;
+
+	devt = name_to_dev_t(bdev);
+	if (!devt) {
+		pr_err("not found dev_t from %s\n", bdev);
+		return -ENODEV;
+	}
+
+	dentry = kern_path_create(AT_FDCWD, PSTORE_BLKDEV, &path, 0);
+	if (IS_ERR(dentry))
+		return PTR_ERR(dentry);
+	err = vfs_mknod(path.dentry->d_inode, dentry, S_IFBLK | 0600, devt);
+	if (err < 0)
+		pr_err("failed to create %s: %d\n", PSTORE_BLKDEV, err);
+	done_path_create(&path, dentry);
+	return err;
+}
+
+static ssize_t blkz_default_general_read(char *buf, size_t bytes, loff_t pos)
+{
+	struct blkz_context *cxt = &blkz_cxt;
+	struct file *filp;
+	ssize_t ret;
+
+	if (!cxt->bzinfo->blkdev)
+		return -ENODEV;
+
+	filp = filp_open(PSTORE_BLKDEV, O_RDONLY, 0);
+	if (filp == ERR_PTR(-ENOENT) && !blkz_create_dev(cxt->bzinfo->blkdev))
+		filp = filp_open(PSTORE_BLKDEV, O_RDONLY, 0);
+	if (IS_ERR(filp)) {
+		pr_debug("open %s failed, maybe unready\n", PSTORE_BLKDEV);
+		return -EACCES;
+	}
+	ret = kernel_read(filp, buf, bytes, &pos);
+	filp_close(filp, NULL);
+
+	return ret;
+}
+
+static ssize_t blkz_default_general_write(const char *buf, size_t bytes,
+		loff_t pos)
+{
+	struct blkz_context *cxt = &blkz_cxt;
+	struct file *filp;
+	ssize_t ret;
+
+	if (!cxt->bzinfo->blkdev)
+		return -ENODEV;
+
+	filp = filp_open(PSTORE_BLKDEV, O_WRONLY, 0);
+	if (filp == ERR_PTR(-ENOENT) && !blkz_create_dev(cxt->bzinfo->blkdev))
+		filp = filp_open(PSTORE_BLKDEV, O_WRONLY, 0);
+	if (IS_ERR(filp)) {
+		pr_debug("open %s failed, maybe unready\n", PSTORE_BLKDEV);
+		return -EACCES;
+	}
+	ret = kernel_write(filp, buf, bytes, &pos);
+	vfs_fsync(filp, 0);
+	filp_close(filp, NULL);
+
+	return ret;
+}
+
+static struct blkz_zone *blkz_init_zone(enum pstore_type_id type,
+		unsigned long *off, size_t size)
+{
+	struct blkz_info *info = blkz_cxt.bzinfo;
+	struct blkz_zone *zone;
+	const char *name = pstore_type_to_name(type);
+
+	if (!size)
+		return NULL;
+
+	if (*off + size > info->total_size) {
+		pr_err("no room for %s (0x%zx@0x%lx over 0x%lx)\n",
+			name, size, *off, info->total_size);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	zone = kzalloc(sizeof(struct blkz_zone), GFP_KERNEL);
+	if (!zone)
+		return ERR_PTR(-ENOMEM);
+
+	/*
+	 * NOTE: allocate buffer for blk zones for two reasons:
+	 * 1. It can temporarily hold the data before
+	 *    blkz_default_general_read/write are useable.
+	 * 2. It makes pstore usable even if no persistent storage. Most
+	 *    events of pstore except panic are suitable!!
+	 */
+	zone->buffer = kmalloc(size, GFP_KERNEL);
+	if (!zone->buffer) {
+		kfree(zone);
+		return ERR_PTR(-ENOMEM);
+	}
+	memset(zone->buffer, 0xFF, size);
+	zone->off = *off;
+	zone->name = name;
+	zone->type = type;
+	zone->buffer_size = size - sizeof(struct blkz_buffer);
+	zone->buffer->sig = type ^ BLK_SIG;
+	atomic_set(&zone->dirty, 0);
+	atomic_set(&zone->buffer->datalen, 0);
+
+	*off += size;
+
+	pr_debug("blkzone %s: off 0x%lx, %zu header, %zu data\n", zone->name,
+			zone->off, sizeof(*zone->buffer), zone->buffer_size);
+	return zone;
+}
+
+static struct blkz_zone **blkz_init_zones(enum pstore_type_id type,
+	unsigned long *off, size_t total_size, ssize_t record_size,
+	unsigned int *cnt)
+{
+	struct blkz_info *info = blkz_cxt.bzinfo;
+	struct blkz_zone **zones, *zone;
+	const char *name = pstore_type_to_name(type);
+	int c, i;
+
+	if (!total_size || !record_size)
+		return NULL;
+
+	if (*off + total_size > info->total_size) {
+		pr_err("no room for zones %s (0x%zx@0x%lx over 0x%lx)\n",
+			name, total_size, *off, info->total_size);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	c = total_size / record_size;
+	zones = kcalloc(c, sizeof(*zones), GFP_KERNEL);
+	if (!zones) {
+		pr_err("allocate for zones %s failed\n", name);
+		return ERR_PTR(-ENOMEM);
+	}
+	memset(zones, 0, c * sizeof(*zones));
+
+	for (i = 0; i < c; i++) {
+		zone = blkz_init_zone(type, off, record_size);
+		if (!zone || IS_ERR(zone)) {
+			pr_err("initialize zones %s failed\n", name);
+			while (--i >= 0)
+				kfree(zones[i]);
+			kfree(zones);
+			return (void *)zone;
+		}
+		zones[i] = zone;
+	}
+
+	*cnt = c;
+	return zones;
+}
+
+static void blkz_free_zone(struct blkz_zone **blkzone)
+{
+	struct blkz_zone *zone = *blkzone;
+
+	if (!zone)
+		return;
+
+	kfree(zone->buffer);
+	kfree(zone);
+	*blkzone = NULL;
+}
+
+static void blkz_free_zones(struct blkz_zone ***blkzones, unsigned int *cnt)
+{
+	struct blkz_zone **zones = *blkzones;
+
+	while (*cnt > 0) {
+		blkz_free_zone(&zones[*cnt]);
+		(*cnt)--;
+	}
+	kfree(zones);
+	*blkzones = NULL;
+}
+
+static int blkz_cut_zones(struct blkz_context *cxt)
+{
+	struct blkz_info *info = cxt->bzinfo;
+	unsigned long off = 0;
+	int err;
+	size_t size;
+
+	size = info->total_size;
+	cxt->dbzs = blkz_init_zones(PSTORE_TYPE_DMESG, &off, size,
+			info->dmesg_size, &cxt->dmesg_max_cnt);
+	if (IS_ERR(cxt->dbzs)) {
+		err = PTR_ERR(cxt->dbzs);
+		goto fail_out;
+	}
+
+	return 0;
+fail_out:
+	return err;
+}
+
+int blkz_register(struct blkz_info *info)
+{
+	int err = -EINVAL;
+	struct blkz_context *cxt = &blkz_cxt;
+	struct module *owner = info->owner;
+
+	if (info->blkdev && !blkz_create_dev(info->blkdev)) {
+		long long size;
+
+		size = blkz_blkdev_size(PSTORE_BLKDEV);
+		if (size > 0 && (!info->total_size || info->total_size > size)) {
+			info->total_size = (unsigned long)size;
+			pr_info("total size %ld from block device %s\n",
+					info->total_size, info->blkdev);
+		}
+	}
+
+	if (!info->total_size || !info->dmesg_size) {
+		pr_warn("The total size and the dmesg size must be non-zero\n");
+		return -EINVAL;
+	}
+
+	if (info->total_size < 4096) {
+		pr_err("total size must be over 4096 bytes\n");
+		return -EINVAL;
+	}
+
+#define check_size(name, size) {					\
+		if (info->name & (size - 1)) {				\
+			pr_err(#name " must be a multiple of %d\n",	\
+					(size));			\
+			return -EINVAL;					\
+		}							\
+	}
+
+	check_size(total_size, 4096);
+	check_size(dmesg_size, SECTOR_SIZE);
+
+#undef check_size
+
+	if (!info->read)
+		info->read = blkz_default_general_read;
+	if (!info->write)
+		info->write = blkz_default_general_write;
+
+	if (owner && !try_module_get(owner))
+		return -EINVAL;
+
+	spin_lock(&cxt->bzinfo_lock);
+	if (cxt->bzinfo) {
+		pr_warn("blk '%s' already loaded: ignoring '%s'\n",
+				cxt->bzinfo->name, info->name);
+		spin_unlock(&cxt->bzinfo_lock);
+		return -EBUSY;
+	}
+	cxt->bzinfo = info;
+	spin_unlock(&cxt->bzinfo_lock);
+
+	if (blkz_cut_zones(cxt)) {
+		pr_err("cut zones fialed\n");
+		goto fail_out;
+	}
+
+	cxt->pstore.bufsize = cxt->dbzs[0]->buffer_size -
+			sizeof(struct blkz_dmesg_header);
+	cxt->pstore.buf = kzalloc(cxt->pstore.bufsize, GFP_KERNEL);
+	if (!cxt->pstore.buf) {
+		pr_err("cannot allocate pstore crash dump buffer\n");
+		err = -ENOMEM;
+		goto fail_out;
+	}
+	cxt->pstore.data = cxt;
+	cxt->pstore.flags = PSTORE_FLAGS_DMESG;
+
+	pr_info("Registered %s as blkzone backend for %s%s\n", info->name,
+			cxt->dbzs && cxt->bzinfo->dump_oops ? "Oops " : "",
+			cxt->dbzs && cxt->bzinfo->panic_write ? "Panic " : "");
+
+	err = pstore_register(&cxt->pstore);
+	if (err) {
+		pr_err("registering with pstore failed\n");
+		goto free_pstore_buf;
+	}
+
+	module_put(owner);
+	return 0;
+
+free_pstore_buf:
+	kfree(cxt->pstore.buf);
+fail_out:
+	spin_lock(&blkz_cxt.bzinfo_lock);
+	blkz_cxt.bzinfo = NULL;
+	spin_unlock(&blkz_cxt.bzinfo_lock);
+	return err;
+}
+EXPORT_SYMBOL_GPL(blkz_register);
+
+void blkz_unregister(struct blkz_info *info)
+{
+	struct blkz_context *cxt = &blkz_cxt;
+
+	pstore_unregister(&cxt->pstore);
+	kfree(cxt->pstore.buf);
+	cxt->pstore.bufsize = 0;
+
+	spin_lock(&cxt->bzinfo_lock);
+	blkz_cxt.bzinfo = NULL;
+	spin_unlock(&cxt->bzinfo_lock);
+
+	blkz_free_zones(&cxt->dbzs, &cxt->dmesg_max_cnt);
+
+}
+EXPORT_SYMBOL_GPL(blkz_unregister);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("liaoweixiong <liaoweixiong@allwinnertech.com>");
+MODULE_DESCRIPTION("Block device Oops/Panic logger");
diff --git a/include/linux/pstore_blk.h b/include/linux/pstore_blk.h
new file mode 100644
index 0000000..4f239f0
--- /dev/null
+++ b/include/linux/pstore_blk.h
@@ -0,0 +1,80 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __PSTORE_BLK_H_
+#define __PSTORE_BLK_H_
+
+#include <linux/types.h>
+#include <linux/blkdev.h>
+
+#ifndef SECTOR_SIZE
+#define SECTOR_SIZE 512
+#endif
+
+/**
+ * struct blkz_info - backend blkzone driver structure
+ *
+ * @owner:
+ *	module which is responsible for this backend driver
+ * @name:
+ *	name of the backend driver
+ * @blkdev:
+ *	The block device to use. Most of the time, it is a partition of block
+ *	device. It's ok to keep it as NULL if you passing @read and @write
+ *	in blkz_info as @blkdev is used by blkz_default_general_read/write.
+ *	If both of @blkdev, @read and @write are NULL, no block device is
+ *	effective and the data will be saved in ddr buffer.
+ *	It accept the following variants:
+ *	1) <hex_major><hex_minor> device number in hexadecimal represents itself
+ *	   no leading 0x, for example b302.
+ *	2) /dev/<disk_name> represents the device number of disk
+ *	3) /dev/<disk_name><decimal> represents the device number
+ *	   of partition - device number of disk plus the partition number
+ *	4) /dev/<disk_name>p<decimal> - same as the above, that form is
+ *	   used when disk name of partitioned disk ends on a digit.
+ *	5) PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF representing the
+ *	   unique id of a partition if the partition table provides it.
+ *	   The UUID may be either an EFI/GPT UUID, or refer to an MSDOS
+ *	   partition using the format SSSSSSSS-PP, where SSSSSSSS is a zero-
+ *	   filled hex representation of the 32-bit "NT disk signature", and PP
+ *	   is a zero-filled hex representation of the 1-based partition number.
+ *	6) PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in relation to
+ *	   a partition with a known unique id.
+ *	7) <major>:<minor> major and minor number of the device separated by
+ *	   a colon.
+ * @total_size:
+ *	the total size in bytes pstore/blk can use. It must be less than or
+ *	equal to size of block device if @blkdev valid. If @total_size is zero
+ *	with @blkdev, @total_size will be set to equal to size of @blkdev.
+ * @dmesg_size:
+ *	the size of each zones for dmesg (oops & panic).
+ * @dump_oops:
+ *	dump oops and panic log or only panic.
+ * @read:
+ *	the general (not panic) read operation. If NULL, pstore/blk
+ *	replaced as blkz_default_general_read. See also @blkdev
+ * @write:
+ *	the general (not panic) write operation. If NULL, pstore/blk
+ *	replaced as blkz_default_general_write. See also @blkdev
+ * @panic_read:
+ *	the read operation only used for panic.
+ * @panic_write:
+ *	the write operation only used for panic.
+ */
+struct blkz_info {
+	struct module *owner;
+	const char *name;
+
+	const char *blkdev;
+	unsigned long total_size;
+	unsigned long dmesg_size;
+	int dump_oops;
+	ssize_t (*read)(char *buf, size_t bytes, loff_t pos);
+	ssize_t (*write)(const char *buf, size_t bytes, loff_t pos);
+	ssize_t (*panic_read)(char *buf, size_t bytes, loff_t pos);
+	ssize_t (*panic_write)(const char *buf, size_t bytes, loff_t pos);
+};
+
+extern int blkz_register(struct blkz_info *info);
+extern void blkz_unregister(struct blkz_info *info);
+
+#endif
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC v9 2/5] dt-bindings: pstore-block: new support for blkoops
  2019-02-19 11:52 [RFC v9 0/5] pstore/block: new support logger for block devices liaoweixiong
  2019-02-19 11:52 ` [RFC v9 1/5] pstore/blk: " liaoweixiong
@ 2019-02-19 11:52 ` liaoweixiong
  2019-02-22 15:36   ` Rob Herring
  2019-02-19 11:52 ` [RFC v9 3/5] pstore/blk: add blkoops for pstore_blk liaoweixiong
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: liaoweixiong @ 2019-02-19 11:52 UTC (permalink / raw)
  To: Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Rob Herring, Mark Rutland, liaoweixiong,
	Mauro Carvalho Chehab, David S. Miller, Greg Kroah-Hartman,
	Nicolas Ferre, Arnd Bergmann
  Cc: linux-doc, linux-kernel, devicetree

Create DT binding document for blkoops.

Signed-off-by: liaoweixiong <liaoweixiong@allwinnertech.com>
---
 .../devicetree/bindings/pstore/blkoops.txt         | 53 ++++++++++++++++++++++
 MAINTAINERS                                        |  1 +
 2 files changed, 54 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pstore/blkoops.txt

diff --git a/Documentation/devicetree/bindings/pstore/blkoops.txt b/Documentation/devicetree/bindings/pstore/blkoops.txt
new file mode 100644
index 0000000..5462915
--- /dev/null
+++ b/Documentation/devicetree/bindings/pstore/blkoops.txt
@@ -0,0 +1,53 @@
+Blkoops oops logger
+===================
+
+Blkoops provides a block partition for oops, excluding panics now, so they can
+be recovered after a reboot.
+
+Any space of block device will be used for a circular buffer of oops records.
+These records have a configurable size, with a size of 0 indicating that they
+should be disabled.
+
+At least one of "block-device" and "total_size" must be set.
+
+At least one of "dmesg-size" or "pmsg-size" must be set non-zero.
+
+Required properties:
+
+- compatible: must be "blkoops".
+
+Optional properties:
+
+- block-device: The block device to use. Most of the time, it is a partition of
+		device. If block-device is NULL, no block device is effective
+		and the data will be lost after rebooting.
+		It accept the following variants:
+		1) <hex_major><hex_minor> device number in hexadecimal
+		   represents itself no leading 0x, for example b302.
+		2) /dev/<disk_name> represents the device number of disk
+		3) /dev/<disk_name><decimal> represents the device number of
+		   partition - device number of disk plus the partition number
+		4) /dev/<disk_name>p<decimal> - same as the above, that form is
+		   used when disk name of partitioned disk ends on a digit.
+		5) PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF representing
+		   the unique id of a partition if the partition table provides
+		   it. The UUID may be either an EFI/GPT UUID, or refer to an
+		   MSDOS partition using the format SSSSSSSS-PP, where SSSSSSSS
+		   is a zero-filled hex representation of the 32-bit
+		   "NT disk signature", and PP is a zero-filled hex
+		   representation of the 1-based partition number.
+		6) PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in
+		   relation to a partition with a known unique id.
+		7) <major>:<minor> major and minor number of the device
+		   separated by a colon.
+
+- total-size: The total size in kbytes pstore/blk can use. It must be a multiple
+	      of 4. It must be less than or equal to size of block-device. If
+	      total-size is zero with block-devce valid, it will be set to equal
+	      to size of block-device.
+
+- dmesg-size: maximum size in kbytes of each dump done on oops, which must be a
+	      multiple of 4.
+
+- pmsg-size: maximum size in kbytes for userspace messages, which must be a
+	     multiple of 4.
diff --git a/MAINTAINERS b/MAINTAINERS
index 51029a4..f49dd37 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12318,6 +12318,7 @@ F:	drivers/firmware/efi/efi-pstore.c
 F:	drivers/acpi/apei/erst.c
 F:	Documentation/admin-guide/ramoops.rst
 F:	Documentation/devicetree/bindings/reserved-memory/ramoops.txt
+F:	Documentation/devicetree/bindings/pstore-block/
 K:	\b(pstore|ramoops)
 
 PTP HARDWARE CLOCK SUPPORT
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC v9 3/5] pstore/blk: add blkoops for pstore_blk
  2019-02-19 11:52 [RFC v9 0/5] pstore/block: new support logger for block devices liaoweixiong
  2019-02-19 11:52 ` [RFC v9 1/5] pstore/blk: " liaoweixiong
  2019-02-19 11:52 ` [RFC v9 2/5] dt-bindings: pstore-block: new support for blkoops liaoweixiong
@ 2019-02-19 11:52 ` liaoweixiong
  2019-02-19 11:52 ` [RFC v9 4/5] pstore/blk: support pmsg for pstore block liaoweixiong
  2019-02-19 11:52 ` [RFC v9 5/5] Documentation: pstore/blk: create document for pstore_blk liaoweixiong
  4 siblings, 0 replies; 10+ messages in thread
From: liaoweixiong @ 2019-02-19 11:52 UTC (permalink / raw)
  To: Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Rob Herring, Mark Rutland, liaoweixiong,
	Mauro Carvalho Chehab, David S. Miller, Greg Kroah-Hartman,
	Nicolas Ferre, Arnd Bergmann
  Cc: linux-doc, linux-kernel, devicetree

blkoops is a sample for pstore/blk. It can only record oops, excluding
panics as no read/write apis for panic registered. It support settings
on Kconfg/device tree/module parameters. It can record oops log even
power failure if "PSTORE_BLKOOPS_BLKDEV" on Kconfig or "block-device"
on dts or "blkdev" on module parameter is valid.
Otherwise, it can only record data to ram buffer, which will be dropped
when reboot.

Signed-off-by: liaoweixiong <liaoweixiong@allwinnertech.com>
---
 MAINTAINERS                |   2 +-
 fs/pstore/Kconfig          | 114 ++++++++++++++++++++
 fs/pstore/Makefile         |   2 +
 fs/pstore/blkoops.c        | 254 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/pstore_blk.h |  14 ++-
 5 files changed, 381 insertions(+), 5 deletions(-)
 create mode 100644 fs/pstore/blkoops.c

diff --git a/MAINTAINERS b/MAINTAINERS
index f49dd37..44647a8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12319,7 +12319,7 @@ F:	drivers/acpi/apei/erst.c
 F:	Documentation/admin-guide/ramoops.rst
 F:	Documentation/devicetree/bindings/reserved-memory/ramoops.txt
 F:	Documentation/devicetree/bindings/pstore-block/
-K:	\b(pstore|ramoops)
+K:	\b(pstore|ramoops|blkoops)
 
 PTP HARDWARE CLOCK SUPPORT
 M:	Richard Cochran <richardcochran@gmail.com>
diff --git a/fs/pstore/Kconfig b/fs/pstore/Kconfig
index defcb75..a3c3f34 100644
--- a/fs/pstore/Kconfig
+++ b/fs/pstore/Kconfig
@@ -160,3 +160,117 @@ config PSTORE_BLK
 	help
 	  This enables panic and oops message to be logged to a block dev
 	  where it can be read back at some later point.
+
+config PSTORE_BLKOOPS
+	tristate "pstore block with oops logger"
+	depends on PSTORE_BLK
+	help
+	  This is a sample for pstore block with oops logger.
+
+	  It CANNOT record panic log as no read/write apis for panic registered.
+
+	  It CAN record oops log even power failure if
+	  "PSTORE_BLKOOPS_BLKDEV" on Kconfig or "block-device" on dts or
+	  "blkdev" on module parameter is valid.
+
+	  Otherwise, it can only record data to ram buffer, which will be
+	  dropped when reboot.
+
+	  NOTE that, there are three ways to set parameters of blkoops and
+	  prioritize according to configuration flexibility. That is
+	  Kconfig < device tree < module parameters. It means that the value can
+	  be overwritten by higher priority settings.
+	  1. Kconfig
+	     It	just sets a default value.
+	  2. device tree
+	     It is set on device tree, which will overwrites value from Kconfig,
+	     but can also be overwritten by module parameters.
+	  3. module parameters
+	     It is the first priority. Take care of that blkoops will take lower
+	     priority settings if higher priority one do not set.
+
+config PSTORE_BLKOOPS_DMESG_SIZE
+	int "dmesg size in kbytes for blkoops"
+	depends on PSTORE_BLKOOPS
+	default 64
+	help
+	  This just sets size of dmesg (dmesg_size) for pstore/blk. The value
+	  must be a multiple of 4096.
+
+	  NOTE that, there are three ways to set parameters of blkoops and
+	  prioritize according to configuration flexibility. That is
+	  Kconfig < device tree < module parameters. It means that the value can
+	  be overwritten by higher priority settings.
+	  1. Kconfig
+	     It	just sets a default value.
+	  2. device tree
+	     It is set on device tree, which will overwrites value from Kconfig,
+	     but can also be overwritten by module parameters.
+	  3. module parameters
+	     It is the first priority. Take care of that blkoops will take lower
+	     priority settings if higher priority one do not set.
+
+config PSTORE_BLKOOPS_TOTAL_SIZE
+	int "total size in kbytes for blkoops"
+	depends on PSTORE_BLKOOPS
+	default 1024
+	help
+	  The total size in kbytes pstore/blk can use. It must be less than or
+	  equal to size of block device if @blkdev valid. If @total_size is zero
+	  with @blkdev, @total_size will be set to equal to size of @blkdev.
+	  The value must be a multiple of 4096.
+
+	  NOTE that, there are three ways to set parameters of blkoops and
+	  prioritize according to configuration flexibility. That is
+	  Kconfig < device tree < module parameters. It means that the value can
+	  be overwritten by higher priority settings.
+	  1. Kconfig
+	     It	just sets a default value.
+	  2. device tree
+	     It is set on device tree, which will overwrites value from Kconfig,
+	     but can also be overwritten by module parameters.
+	  3. module parameters
+	     It is the first priority. Take care of that blkoops will take lower
+	     priority settings if higher priority one do not set.
+
+config PSTORE_BLKOOPS_BLKDEV
+	string "block device for blkoops"
+	depends on PSTORE_BLKOOPS
+	default ""
+	help
+	  This just sets block device (blkdev) for pstore/blk. Pstore/blk
+	  will record data to this block device to avoid losing data due to
+	  power failure. So, If it is not set, pstore/blk will drop all data
+	  when reboot.
+
+	  It accpet the following variants:
+	  1) <hex_major><hex_minor> device number in hexadecimal represents
+	     itself no leading 0x, for example b302.
+	  2) /dev/<disk_name> represents the device number of disk
+	  3) /dev/<disk_name><decimal> represents the device number
+	     of partition - device number of disk plus the partition number
+	  4) /dev/<disk_name>p<decimal> - same as the above, that form is
+	     used when disk name of partitioned disk ends on a digit.
+	  5) PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF representing the
+	     unique id of a partition if the partition table provides it.
+	     The UUID may be either an EFI/GPT UUID, or refer to an MSDOS
+	     partition using the format SSSSSSSS-PP, where SSSSSSSS is a zero-
+	     filled hex representation of the 32-bit "NT disk signature", and PP
+	     is a zero-filled hex representation of the 1-based partition number.
+	  6) PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in relation
+	     to a partition with a known unique id.
+	  7) <major>:<minor> major and minor number of the device separated by
+	     a colon.
+
+	  NOTE that, there are three ways to set parameters of blkoops and
+	  prioritize according to configuration flexibility. That is
+	  Kconfig < device tree < module parameters. It means that the value can
+	  be overwritten by higher priority settings.
+	  1. Kconfig
+	     It	just sets a default value.
+	  2. device tree
+	     It is set on device tree, which will overwrites value from Kconfig,
+	     but can also be overwritten by module parameters.
+	  3. module parameters
+	     It is the first priority. Take care of that blkoops will take lower
+	     priority settings if higher priority one do not set.
diff --git a/fs/pstore/Makefile b/fs/pstore/Makefile
index 0ee2fc8..24b3d48 100644
--- a/fs/pstore/Makefile
+++ b/fs/pstore/Makefile
@@ -15,3 +15,5 @@ obj-$(CONFIG_PSTORE_RAM)	+= ramoops.o
 
 obj-$(CONFIG_PSTORE_BLK) += pstore_blk.o
 pstore_blk-y += blkzone.o
+
+obj-$(CONFIG_PSTORE_BLKOOPS) += blkoops.o
diff --git a/fs/pstore/blkoops.c b/fs/pstore/blkoops.c
new file mode 100644
index 0000000..3885584
--- /dev/null
+++ b/fs/pstore/blkoops.c
@@ -0,0 +1,254 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * blkoops.c: Block device Oops logger
+ *
+ * Copyright (C) 2019 liaoweixiong <liaoweixiong@gallwinnertech.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ */
+#define MODNAME "blkoops"
+#define pr_fmt(fmt) MODNAME ": " fmt
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/string.h>
+#include <linux/platform_device.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/pstore_blk.h>
+
+static long dmesg_size = -1;
+module_param(dmesg_size, long, 0400);
+MODULE_PARM_DESC(dmesg_size, "demsg size in kbytes");
+
+static long total_size = -1;
+module_param(total_size, long, 0400);
+MODULE_PARM_DESC(total_size, "total size in kbytes");
+
+#define BLKDEV_INVALID "INVALID"
+static char blkdev[80] = {BLKDEV_INVALID};
+module_param_string(blkdev, blkdev, 80, 0400);
+MODULE_PARM_DESC(blkdev, "the block device for general read/write");
+
+struct blkz_info blkz_info = {
+	.owner = THIS_MODULE,
+	.name = "blkoops",
+	.dump_oops = true,
+};
+
+struct blkoops_info {
+	unsigned long dmesg_size;
+	unsigned long total_size;
+	const char *blkdev;
+};
+struct blkoops_info blkoops_info = {
+	.dmesg_size = CONFIG_PSTORE_BLKOOPS_DMESG_SIZE * 1024,
+	.total_size = CONFIG_PSTORE_BLKOOPS_TOTAL_SIZE * 1024,
+	.blkdev = CONFIG_PSTORE_BLKOOPS_BLKDEV,
+};
+
+static struct platform_device *dummy;
+
+/**
+ * Block driver use this function to add panic read/write apis to blkoops.
+ * By this, block driver can do the least work that just provides panic ops.
+ */
+int blkoops_add_panic_ops(blkz_read_op panic_read, blkz_write_op panic_write)
+{
+	struct blkz_info *info = &blkz_info;
+
+	if (info->panic_read || info->panic_write)
+		return -EBUSY;
+
+	info->panic_read = panic_read;
+	info->panic_write = panic_write;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(blkoops_add_panic_ops);
+
+static int blkoops_parse_dt_size(struct device_node *np,
+		const char *propname, u32 *value)
+{
+	u32 val32 = 0;
+	int ret;
+
+	ret = of_property_read_u32(np, propname, &val32);
+	if (ret < 0) {
+		if (ret != -EINVAL)
+			pr_err("failed to parse property %s: %d\n",
+				propname, ret);
+		return ret;
+	}
+
+	if (val32 * 1024 > INT_MAX) {
+		pr_err("%s %u > INT_MAX\n", propname, val32);
+		return -EOVERFLOW;
+	}
+
+	*value = val32 * 1024;
+	return 0;
+}
+
+static int __init blkoops_parse_dt(struct blkoops_info *info,
+		struct device_node *np)
+{
+	int ret;
+	u32 value;
+
+	pr_info("using device tree\n");
+
+	ret = of_property_read_string(np, "block-device",
+			&info->blkdev);
+	if (ret < 0 && ret != -EINVAL) {
+		pr_err("failed to parse block-device: %d\n", ret);
+		return ret;
+	}
+
+#define parse_size(name, field) {					\
+		ret = blkoops_parse_dt_size(np, name, &value);		\
+		if (ret < 0 && ret != -EINVAL)				\
+			return ret;					\
+		else if (ret == 0)					\
+			field = value;					\
+	}
+
+	parse_size("total-size", info->total_size);
+	parse_size("dmesg-size", info->dmesg_size);
+
+#undef parse_size
+	return 0;
+}
+
+static int blkoops_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct device_node *of_node = dev_of_node(dev);
+	struct blkoops_info *info = dev->platform_data;
+
+	if (of_node && !info) {
+		int err;
+
+		info = &blkoops_info;
+		err = blkoops_parse_dt(info, of_node);
+		if (err)
+			return err;
+	}
+
+	if (!strcmp(info->blkdev, BLKDEV_INVALID) ||
+			strlen(info->blkdev) == 0) {
+		pr_info("no block device, use ram buffer only\n");
+	} else {
+		pr_debug("block device: %s\n", info->blkdev);
+		blkz_info.blkdev = info->blkdev;
+	}
+
+#define check_size(name, size) {					\
+		if (info->name & (size - 1)) {				\
+			pr_err(#name " must be a multiple of %d\n",	\
+					(size));			\
+			return -EINVAL;					\
+		}							\
+		blkz_info.name = info->name;				\
+	}
+
+	check_size(total_size, 4096);
+	check_size(dmesg_size, 4096);
+
+#undef check_size
+
+	/*
+	 * Update the module parameter variables as well so they are visible
+	 * through /sys/module/blkoops/parameters/
+	 */
+	dmesg_size = blkz_info.dmesg_size;
+	total_size = blkz_info.total_size;
+	if (blkz_info.blkdev)
+		strncpy(blkdev, blkz_info.blkdev, 80 - 1);
+	else
+		blkdev[0] = '\0';
+	return blkz_register(&blkz_info);
+}
+
+static int blkoops_remove(struct platform_device *pdev)
+{
+	blkz_unregister(&blkz_info);
+	return 0;
+}
+
+static const struct of_device_id dt_match[] = {
+	{ .compatible = MODNAME},
+	{}
+};
+
+static struct platform_driver blkoops_driver = {
+	.probe		= blkoops_probe,
+	.remove		= blkoops_remove,
+	.driver		= {
+		.name		= MODNAME,
+		.of_match_table	= dt_match,
+	},
+};
+
+void blkoops_register_dummy(void)
+{
+	struct blkoops_info *info = &blkoops_info;
+	/*
+	 * Prepare a dummy platform data structure to carry the module
+	 * parameters. If mem_size or blkdev isn't set, then there are
+	 * no module parameters, and we can skip this.
+	 */
+	if (total_size < 0 && !strcmp(blkdev, BLKDEV_INVALID))
+		return;
+
+	pr_info("using module parameters\n");
+
+	if (total_size >= 0)
+		info->total_size = (unsigned long)total_size * 1024;
+	if (strcmp(blkdev, BLKDEV_INVALID))
+		info->blkdev = (const char *)blkdev;
+	if (dmesg_size >= 0)
+		info->dmesg_size = (unsigned long)dmesg_size * 1024;
+
+	dummy = platform_device_register_data(NULL, MODNAME, -1, info,
+			sizeof(*info));
+	if (IS_ERR(dummy)) {
+		pr_err("could not create platform device: %ld\n",
+			PTR_ERR(dummy));
+		dummy = NULL;
+	}
+}
+
+static int __init blkoops_init(void)
+{
+	int ret;
+
+	blkoops_register_dummy();
+	ret = platform_driver_register(&blkoops_driver);
+	if (ret != 0) {
+		platform_device_unregister(dummy);
+		dummy = NULL;
+	}
+	return ret;
+}
+late_initcall(blkoops_init);
+
+static void __exit blkoops_exit(void)
+{
+	platform_driver_unregister(&blkoops_driver);
+	platform_device_unregister(dummy);
+	dummy = NULL;
+}
+module_exit(blkoops_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("liaoweixiong <liaoweixiong@allwinnertech.com>");
+MODULE_DESCRIPTION("Sample for Pstore BLK with Oops logger");
diff --git a/include/linux/pstore_blk.h b/include/linux/pstore_blk.h
index 4f239f0..2d2ff97 100644
--- a/include/linux/pstore_blk.h
+++ b/include/linux/pstore_blk.h
@@ -60,6 +60,8 @@
  * @panic_write:
  *	the write operation only used for panic.
  */
+typedef ssize_t (*blkz_read_op)(char *, size_t, loff_t);
+typedef ssize_t (*blkz_write_op)(const char *, size_t, loff_t);
 struct blkz_info {
 	struct module *owner;
 	const char *name;
@@ -68,13 +70,17 @@ struct blkz_info {
 	unsigned long total_size;
 	unsigned long dmesg_size;
 	int dump_oops;
-	ssize_t (*read)(char *buf, size_t bytes, loff_t pos);
-	ssize_t (*write)(const char *buf, size_t bytes, loff_t pos);
-	ssize_t (*panic_read)(char *buf, size_t bytes, loff_t pos);
-	ssize_t (*panic_write)(const char *buf, size_t bytes, loff_t pos);
+	blkz_read_op read;
+	blkz_write_op write;
+	blkz_read_op panic_read;
+	blkz_write_op panic_write;
 };
 
 extern int blkz_register(struct blkz_info *info);
 extern void blkz_unregister(struct blkz_info *info);
 
+#if IS_ENABLED(CONFIG_PSTORE_BLKOOPS)
+extern int blkoops_add_panic_ops(blkz_read_op, blkz_write_op);
+#endif
+
 #endif
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC v9 4/5] pstore/blk: support pmsg for pstore block
  2019-02-19 11:52 [RFC v9 0/5] pstore/block: new support logger for block devices liaoweixiong
                   ` (2 preceding siblings ...)
  2019-02-19 11:52 ` [RFC v9 3/5] pstore/blk: add blkoops for pstore_blk liaoweixiong
@ 2019-02-19 11:52 ` liaoweixiong
  2019-02-19 11:52 ` [RFC v9 5/5] Documentation: pstore/blk: create document for pstore_blk liaoweixiong
  4 siblings, 0 replies; 10+ messages in thread
From: liaoweixiong @ 2019-02-19 11:52 UTC (permalink / raw)
  To: Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Rob Herring, Mark Rutland, liaoweixiong,
	Mauro Carvalho Chehab, David S. Miller, Greg Kroah-Hartman,
	Nicolas Ferre, Arnd Bergmann
  Cc: linux-doc, linux-kernel, devicetree

To enable pmsg, just set pmsg_size when block device register blkzone.

Signed-off-by: liaoweixiong <liaoweixiong@allwinnertech.com>
---
 fs/pstore/Kconfig          |  21 ++++
 fs/pstore/blkoops.c        |  11 ++
 fs/pstore/blkzone.c        | 254 +++++++++++++++++++++++++++++++++++++++++----
 include/linux/pstore_blk.h |   1 +
 4 files changed, 265 insertions(+), 22 deletions(-)

diff --git a/fs/pstore/Kconfig b/fs/pstore/Kconfig
index a3c3f34..50d196e 100644
--- a/fs/pstore/Kconfig
+++ b/fs/pstore/Kconfig
@@ -210,6 +210,27 @@ config PSTORE_BLKOOPS_DMESG_SIZE
 	     It is the first priority. Take care of that blkoops will take lower
 	     priority settings if higher priority one do not set.
 
+config PSTORE_BLKOOPS_PMSG_SIZE
+	int "pmsg size in kbytes for blkoops"
+	depends on PSTORE_BLKOOPS
+	default 64
+	help
+	  This just sets size of pmsg (pmsg_size) for pstore/blk. The value must
+	  be a multiple of 4096. Pmsg work only if "blkdev" is set.
+
+	  NOTE that, there are three ways to set parameters of blkoops and
+	  prioritize according to configuration flexibility. That is
+	  Kconfig < device tree < module parameters. It means that the value can
+	  be overwritten by higher priority settings.
+	  1. Kconfig
+	     It	just sets a default value.
+	  2. device tree
+	     It is set on device tree, which will overwrites value from Kconfig,
+	     but can also be overwritten by module parameters.
+	  3. module parameters
+	     It is the first priority. Take care of that blkoops will take lower
+	     priority settings if higher priority one do not set.
+
 config PSTORE_BLKOOPS_TOTAL_SIZE
 	int "total size in kbytes for blkoops"
 	depends on PSTORE_BLKOOPS
diff --git a/fs/pstore/blkoops.c b/fs/pstore/blkoops.c
index 3885584..5363b7f 100644
--- a/fs/pstore/blkoops.c
+++ b/fs/pstore/blkoops.c
@@ -30,6 +30,10 @@
 module_param(dmesg_size, long, 0400);
 MODULE_PARM_DESC(dmesg_size, "demsg size in kbytes");
 
+static long pmsg_size = -1;
+module_param(pmsg_size, long, 0400);
+MODULE_PARM_DESC(pmsg_size, "pmsg size in kbytes");
+
 static long total_size = -1;
 module_param(total_size, long, 0400);
 MODULE_PARM_DESC(total_size, "total size in kbytes");
@@ -47,11 +51,13 @@ struct blkz_info blkz_info = {
 
 struct blkoops_info {
 	unsigned long dmesg_size;
+	unsigned long pmsg_size;
 	unsigned long total_size;
 	const char *blkdev;
 };
 struct blkoops_info blkoops_info = {
 	.dmesg_size = CONFIG_PSTORE_BLKOOPS_DMESG_SIZE * 1024,
+	.pmsg_size = CONFIG_PSTORE_BLKOOPS_PMSG_SIZE * 1024,
 	.total_size = CONFIG_PSTORE_BLKOOPS_TOTAL_SIZE * 1024,
 	.blkdev = CONFIG_PSTORE_BLKOOPS_BLKDEV,
 };
@@ -123,6 +129,7 @@ static int __init blkoops_parse_dt(struct blkoops_info *info,
 
 	parse_size("total-size", info->total_size);
 	parse_size("dmesg-size", info->dmesg_size);
+	parse_size("pmsg-size", info->pmsg_size);
 
 #undef parse_size
 	return 0;
@@ -162,6 +169,7 @@ static int blkoops_probe(struct platform_device *pdev)
 
 	check_size(total_size, 4096);
 	check_size(dmesg_size, 4096);
+	check_size(pmsg_size, 4096);
 
 #undef check_size
 
@@ -170,6 +178,7 @@ static int blkoops_probe(struct platform_device *pdev)
 	 * through /sys/module/blkoops/parameters/
 	 */
 	dmesg_size = blkz_info.dmesg_size;
+	pmsg_size = blkz_info.pmsg_size;
 	total_size = blkz_info.total_size;
 	if (blkz_info.blkdev)
 		strncpy(blkdev, blkz_info.blkdev, 80 - 1);
@@ -217,6 +226,8 @@ void blkoops_register_dummy(void)
 		info->blkdev = (const char *)blkdev;
 	if (dmesg_size >= 0)
 		info->dmesg_size = (unsigned long)dmesg_size * 1024;
+	if (pmsg_size >= 0)
+		info->pmsg_size = (unsigned long)pmsg_size * 1024;
 
 	dummy = platform_device_register_data(NULL, MODNAME, -1, info,
 			sizeof(*info));
diff --git a/fs/pstore/blkzone.c b/fs/pstore/blkzone.c
index 83dd181..7a0129c 100644
--- a/fs/pstore/blkzone.c
+++ b/fs/pstore/blkzone.c
@@ -41,12 +41,14 @@
  *
  * @sig: signature to indicate header (BLK_SIG xor BLKZONE-type value)
  * @datalen: length of data in @data
+ * @start: offset into @data where the beginning of the stored bytes begin
  * @data: zone data.
  */
 struct blkz_buffer {
 #define BLK_SIG (0x43474244) /* DBGC */
 	uint32_t sig;
 	atomic_t datalen;
+	atomic_t start;
 	uint8_t data[];
 };
 
@@ -79,6 +81,9 @@ struct blkz_dmesg_header {
  *	frontent name for this zone
  * @buffer:
  *	pointer to data buffer managed by this zone
+ * @oldbuf:
+ *	pointer to old data buffer. It is used for single zone such as pmsg,
+ *	saving the old buffer.
  * @buffer_size:
  *	bytes in @buffer->data
  * @should_recover:
@@ -92,6 +97,7 @@ struct blkz_zone {
 	enum pstore_type_id type;
 
 	struct blkz_buffer *buffer;
+	struct blkz_buffer *oldbuf;
 	size_t buffer_size;
 	bool should_recover;
 	atomic_t dirty;
@@ -99,8 +105,10 @@ struct blkz_zone {
 
 struct blkz_context {
 	struct blkz_zone **dbzs;	/* dmesg block zones */
+	struct blkz_zone *pbz;		/* Pmsg block zone */
 	unsigned int dmesg_max_cnt;
 	unsigned int dmesg_read_cnt;
+	unsigned int pmsg_read_cnt;
 	unsigned int dmesg_write_cnt;
 	/*
 	 * the counter should be recovered when do recovery
@@ -133,6 +141,11 @@ static inline int buffer_datalen(struct blkz_zone *zone)
 	return atomic_read(&zone->buffer->datalen);
 }
 
+static inline int buffer_start(struct blkz_zone *zone)
+{
+	return atomic_read(&zone->buffer->start);
+}
+
 static inline bool is_on_panic(void)
 {
 	struct blkz_context *cxt = &blkz_cxt;
@@ -389,6 +402,72 @@ static int blkz_recover_dmesg(struct blkz_context *cxt)
 	return ret;
 }
 
+static int blkz_recover_pmsg(struct blkz_context *cxt)
+{
+	struct blkz_info *info = cxt->bzinfo;
+	struct blkz_buffer *oldbuf;
+	struct blkz_zone *zone = NULL;
+	ssize_t (*readop)(char *buf, size_t bytes, loff_t pos);
+	int ret = 0;
+	ssize_t rcnt, len;
+
+	zone = cxt->pbz;
+	if (!zone || zone->oldbuf)
+		return 0;
+
+	if (is_on_panic())
+		goto out;
+
+	readop = info->read;
+	if (unlikely(!readop))
+		return -EINVAL;
+
+	len = zone->buffer_size + sizeof(*oldbuf);
+	oldbuf = kzalloc(len, GFP_KERNEL);
+	if (!oldbuf)
+		return -ENOMEM;
+
+	rcnt = readop((char *)oldbuf, len, zone->off);
+	if (rcnt != len) {
+		pr_debug("recover pmsg failed\n");
+		ret = (int)rcnt < 0 ? (int)rcnt : -EIO;
+		goto free_oldbuf;
+	}
+
+	if (oldbuf->sig != zone->buffer->sig) {
+		pr_debug("no valid data in zone %s\n", zone->name);
+		goto free_oldbuf;
+	}
+
+	if (zone->buffer_size < atomic_read(&oldbuf->datalen) ||
+		zone->buffer_size < atomic_read(&oldbuf->start)) {
+		pr_info("found overtop zone: %s: off %lu, size %zu\n",
+				zone->name, zone->off, zone->buffer_size);
+		goto free_oldbuf;
+	}
+
+	if (!atomic_read(&oldbuf->datalen)) {
+		pr_debug("found erased zone: %s: id 0, off %lu, size %zu, datalen %d\n",
+				zone->name, zone->off, zone->buffer_size,
+				atomic_read(&oldbuf->datalen));
+		kfree(oldbuf);
+		goto out;
+	}
+
+	pr_debug("found nice zone: %s: id 0, off %lu, size %zu, datalen %d\n",
+			zone->name, zone->off, zone->buffer_size,
+			atomic_read(&oldbuf->datalen));
+	zone->oldbuf = oldbuf;
+out:
+	if (atomic_read(&zone->dirty))
+		blkz_zone_write(zone, FLUSH_ALL, NULL, buffer_datalen(zone), 0);
+	return 0;
+
+free_oldbuf:
+	kfree(oldbuf);
+	return ret;
+}
+
 static inline int blkz_recovery(struct blkz_context *cxt)
 {
 	int ret = -EBUSY;
@@ -400,6 +479,10 @@ static inline int blkz_recovery(struct blkz_context *cxt)
 	if (ret)
 		goto recover_fail;
 
+	ret = blkz_recover_pmsg(cxt);
+	if (ret)
+		goto recover_fail;
+
 	atomic_set(&cxt->recovery, 1);
 	pr_debug("recover end!\n");
 	return 0;
@@ -417,11 +500,18 @@ static int blkz_pstore_open(struct pstore_info *psi)
 	return 0;
 }
 
+static inline bool blkz_old_ok(struct blkz_zone *zone)
+{
+	if (zone && zone->oldbuf && atomic_read(&zone->oldbuf->datalen))
+		return true;
+	return false;
+}
+
 static inline bool blkz_ok(struct blkz_zone *zone)
 {
-	if (!zone || !zone->buffer || !buffer_datalen(zone))
-		return false;
-	return true;
+	if (zone && zone->buffer && buffer_datalen(zone))
+		return true;
+	return false;
 }
 
 static int blkz_pstore_erase(struct pstore_record *record)
@@ -429,13 +519,29 @@ static int blkz_pstore_erase(struct pstore_record *record)
 	struct blkz_context *cxt = record->psi->data;
 	struct blkz_zone *zone = NULL;
 
-	if (record->type == PSTORE_TYPE_DMESG)
+	if (record->type == PSTORE_TYPE_DMESG) {
 		zone = cxt->dbzs[record->id];
-	if (!blkz_ok(zone))
-		return 0;
+		if (unlikely(!blkz_ok(zone)))
+			return 0;
 
-	atomic_set(&zone->buffer->datalen, 0);
-	return blkz_zone_write(zone, FLUSH_META, NULL, 0, 0);
+		atomic_set(&zone->buffer->datalen, 0);
+		return blkz_zone_write(zone, FLUSH_META, NULL, 0, 0);
+	} else if (record->type == PSTORE_TYPE_PMSG) {
+		zone = cxt->pbz;
+		if (unlikely(!blkz_old_ok(zone)))
+			return 0;
+
+		kfree(zone->oldbuf);
+		zone->oldbuf = NULL;
+		/**
+		 * if there is new data in zone buffer, there is no need to
+		 * flush 0 (erase) to block device
+		 */
+		if (buffer_datalen(zone))
+			return 0;
+		return blkz_zone_write(zone, FLUSH_META, NULL, 0, 0);
+	}
+	return -EINVAL;
 }
 
 static void blkz_write_kmsg_hdr(struct blkz_zone *zone,
@@ -453,8 +559,10 @@ static void blkz_write_kmsg_hdr(struct blkz_zone *zone,
 	hdr->reason = record->reason;
 	if (hdr->reason == KMSG_DUMP_OOPS)
 		hdr->counter = ++cxt->oops_counter;
-	else
+	else if (hdr->reason == KMSG_DUMP_PANIC)
 		hdr->counter = ++cxt->panic_counter;
+	else
+		hdr->counter = 0;
 }
 
 static int notrace blkz_dmesg_write(struct blkz_context *cxt,
@@ -504,6 +612,55 @@ static int notrace blkz_dmesg_write(struct blkz_context *cxt,
 	return 0;
 }
 
+static int notrace blkz_pmsg_write(struct blkz_context *cxt,
+		struct pstore_record *record)
+{
+	struct blkz_zone *zone;
+	size_t start, rem;
+	int cnt = record->size;
+	bool is_full_data = false;
+	char *buf = record->buf;
+
+	zone = cxt->pbz;
+	if (!zone)
+		return -ENOSPC;
+
+	if (atomic_read(&zone->buffer->datalen) >= zone->buffer_size)
+		is_full_data = true;
+
+	if (unlikely(cnt > zone->buffer_size)) {
+		buf += cnt - zone->buffer_size;
+		cnt = zone->buffer_size;
+	}
+
+	start = buffer_start(zone);
+	rem = zone->buffer_size - start;
+	if (unlikely(rem < cnt)) {
+		blkz_zone_write(zone, FLUSH_PART, buf, rem, start);
+		buf += rem;
+		cnt -= rem;
+		start = 0;
+		is_full_data = true;
+	}
+
+	atomic_set(&zone->buffer->start, cnt + start);
+	blkz_zone_write(zone, FLUSH_PART, buf, cnt, start);
+
+	/**
+	 * blkz_zone_write will set datalen as start + cnt.
+	 * It work if actual data length lesser than buffer size.
+	 * If data length greater than buffer size, pmsg will rewrite to
+	 * beginning of zone, which make buffer->datalen wrongly.
+	 * So we should reset datalen as buffer size once actual data length
+	 * greater than buffer size.
+	 */
+	if (is_full_data) {
+		atomic_set(&zone->buffer->datalen, zone->buffer_size);
+		blkz_zone_write(zone, FLUSH_META, NULL, 0, 0);
+	}
+	return 0;
+}
+
 static int notrace blkz_pstore_write(struct pstore_record *record)
 {
 	struct blkz_context *cxt = record->psi->data;
@@ -521,6 +678,8 @@ static int notrace blkz_pstore_write(struct pstore_record *record)
 	switch (record->type) {
 	case PSTORE_TYPE_DMESG:
 		return blkz_dmesg_write(cxt, record);
+	case PSTORE_TYPE_PMSG:
+		return blkz_pmsg_write(cxt, record);
 	default:
 		return -EINVAL;
 	}
@@ -537,6 +696,13 @@ static struct blkz_zone *blkz_read_next_zone(struct blkz_context *cxt)
 			return zone;
 	}
 
+	if (cxt->pmsg_read_cnt == 0) {
+		cxt->pmsg_read_cnt++;
+		zone = cxt->pbz;
+		if (blkz_old_ok(zone))
+			return zone;
+	}
+
 	return NULL;
 }
 
@@ -575,7 +741,8 @@ static ssize_t blkz_dmesg_read(struct blkz_zone *zone,
 		char *buf = kasprintf(GFP_KERNEL,
 				"%s: Total %d times\n",
 				record->reason == KMSG_DUMP_OOPS ? "Oops" :
-				"Panic", record->count);
+				record->reason == KMSG_DUMP_PANIC ? "Panic" :
+				"Unknown", record->count);
 		hlen = strlen(buf);
 		record->buf = krealloc(buf, hlen + size, GFP_KERNEL);
 		if (!record->buf) {
@@ -597,6 +764,29 @@ static ssize_t blkz_dmesg_read(struct blkz_zone *zone,
 	return size + hlen;
 }
 
+static ssize_t blkz_pmsg_read(struct blkz_zone *zone,
+		struct pstore_record *record)
+{
+	size_t size, start;
+	struct blkz_buffer *buf;
+
+	buf = (struct blkz_buffer *)zone->oldbuf;
+	if (!buf)
+		return READ_NEXT_ZONE;
+
+	size = atomic_read(&buf->datalen);
+	start = atomic_read(&buf->start);
+
+	record->buf = kmalloc(size, GFP_KERNEL);
+	if (!record->buf)
+		return -ENOMEM;
+
+	memcpy(record->buf, buf->data + start, size - start);
+	memcpy(record->buf + size - start, buf->data, start);
+
+	return size;
+}
+
 static ssize_t blkz_pstore_read(struct pstore_record *record)
 {
 	struct blkz_context *cxt = record->psi->data;
@@ -622,6 +812,9 @@ static ssize_t blkz_pstore_read(struct pstore_record *record)
 		blkz_read = blkz_dmesg_read;
 		record->id = cxt->dmesg_read_cnt - 1;
 		break;
+	case PSTORE_TYPE_PMSG:
+		blkz_read = blkz_pmsg_read;
+		break;
 	default:
 		goto next_zone;
 	}
@@ -798,8 +991,10 @@ static struct blkz_zone *blkz_init_zone(enum pstore_type_id type,
 	zone->type = type;
 	zone->buffer_size = size - sizeof(struct blkz_buffer);
 	zone->buffer->sig = type ^ BLK_SIG;
+	zone->oldbuf = NULL;
 	atomic_set(&zone->dirty, 0);
 	atomic_set(&zone->buffer->datalen, 0);
+	atomic_set(&zone->buffer->start, 0);
 
 	*off += size;
 
@@ -881,7 +1076,7 @@ static int blkz_cut_zones(struct blkz_context *cxt)
 	int err;
 	size_t size;
 
-	size = info->total_size;
+	size = info->total_size - info->pmsg_size;
 	cxt->dbzs = blkz_init_zones(PSTORE_TYPE_DMESG, &off, size,
 			info->dmesg_size, &cxt->dmesg_max_cnt);
 	if (IS_ERR(cxt->dbzs)) {
@@ -889,7 +1084,16 @@ static int blkz_cut_zones(struct blkz_context *cxt)
 		goto fail_out;
 	}
 
+	size = info->pmsg_size;
+	cxt->pbz = blkz_init_zone(PSTORE_TYPE_PMSG, &off, size);
+	if (IS_ERR(cxt->pbz)) {
+		err = PTR_ERR(cxt->pbz);
+		goto free_dmesg_zones;
+	}
+
 	return 0;
+free_dmesg_zones:
+	blkz_free_zones(&cxt->dbzs, &cxt->dmesg_max_cnt);
 fail_out:
 	return err;
 }
@@ -911,7 +1115,7 @@ int blkz_register(struct blkz_info *info)
 		}
 	}
 
-	if (!info->total_size || !info->dmesg_size) {
+	if (!info->total_size || (!info->dmesg_size && !info->pmsg_size)) {
 		pr_warn("The total size and the dmesg size must be non-zero\n");
 		return -EINVAL;
 	}
@@ -931,6 +1135,7 @@ int blkz_register(struct blkz_info *info)
 
 	check_size(total_size, 4096);
 	check_size(dmesg_size, SECTOR_SIZE);
+	check_size(pmsg_size, SECTOR_SIZE);
 
 #undef check_size
 
@@ -957,20 +1162,25 @@ int blkz_register(struct blkz_info *info)
 		goto fail_out;
 	}
 
-	cxt->pstore.bufsize = cxt->dbzs[0]->buffer_size -
+	if (info->dmesg_size) {
+		cxt->pstore.bufsize = cxt->dbzs[0]->buffer_size -
 			sizeof(struct blkz_dmesg_header);
-	cxt->pstore.buf = kzalloc(cxt->pstore.bufsize, GFP_KERNEL);
-	if (!cxt->pstore.buf) {
-		pr_err("cannot allocate pstore crash dump buffer\n");
-		err = -ENOMEM;
-		goto fail_out;
+		cxt->pstore.buf = kzalloc(cxt->pstore.bufsize, GFP_KERNEL);
+		if (!cxt->pstore.buf) {
+			err = -ENOMEM;
+			goto fail_out;
+		}
 	}
 	cxt->pstore.data = cxt;
-	cxt->pstore.flags = PSTORE_FLAGS_DMESG;
+	if (info->dmesg_size)
+		cxt->pstore.flags |= PSTORE_FLAGS_DMESG;
+	if (info->pmsg_size)
+		cxt->pstore.flags |= PSTORE_FLAGS_PMSG;
 
-	pr_info("Registered %s as blkzone backend for %s%s\n", info->name,
+	pr_info("Registered %s as blkzone backend for %s%s%s\n", info->name,
 			cxt->dbzs && cxt->bzinfo->dump_oops ? "Oops " : "",
-			cxt->dbzs && cxt->bzinfo->panic_write ? "Panic " : "");
+			cxt->dbzs && cxt->bzinfo->panic_write ? "Panic " : "",
+			cxt->pbz ? "Pmsg" : "");
 
 	err = pstore_register(&cxt->pstore);
 	if (err) {
@@ -1004,7 +1214,7 @@ void blkz_unregister(struct blkz_info *info)
 	spin_unlock(&cxt->bzinfo_lock);
 
 	blkz_free_zones(&cxt->dbzs, &cxt->dmesg_max_cnt);
-
+	blkz_free_zone(&cxt->pbz);
 }
 EXPORT_SYMBOL_GPL(blkz_unregister);
 
diff --git a/include/linux/pstore_blk.h b/include/linux/pstore_blk.h
index 2d2ff97..9f2b9a9 100644
--- a/include/linux/pstore_blk.h
+++ b/include/linux/pstore_blk.h
@@ -69,6 +69,7 @@ struct blkz_info {
 	const char *blkdev;
 	unsigned long total_size;
 	unsigned long dmesg_size;
+	unsigned long pmsg_size;
 	int dump_oops;
 	blkz_read_op read;
 	blkz_write_op write;
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC v9 5/5] Documentation: pstore/blk: create document for pstore_blk
  2019-02-19 11:52 [RFC v9 0/5] pstore/block: new support logger for block devices liaoweixiong
                   ` (3 preceding siblings ...)
  2019-02-19 11:52 ` [RFC v9 4/5] pstore/blk: support pmsg for pstore block liaoweixiong
@ 2019-02-19 11:52 ` liaoweixiong
  2019-02-28  5:15   ` Randy Dunlap
  4 siblings, 1 reply; 10+ messages in thread
From: liaoweixiong @ 2019-02-19 11:52 UTC (permalink / raw)
  To: Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Rob Herring, Mark Rutland, liaoweixiong,
	Mauro Carvalho Chehab, David S. Miller, Greg Kroah-Hartman,
	Nicolas Ferre, Arnd Bergmann
  Cc: linux-doc, linux-kernel, devicetree

The document, at Documentation/admin-guide/pstore-block.rst,
tells user how to use pstore_blk and the attentions about panic
read/write

Signed-off-by: liaoweixiong <liaoweixiong@allwinnertech.com>
---
 Documentation/admin-guide/pstore-block.rst | 233 +++++++++++++++++++++++++++++
 MAINTAINERS                                |   1 +
 fs/pstore/Kconfig                          |   4 +
 3 files changed, 238 insertions(+)
 create mode 100644 Documentation/admin-guide/pstore-block.rst

diff --git a/Documentation/admin-guide/pstore-block.rst b/Documentation/admin-guide/pstore-block.rst
new file mode 100644
index 0000000..a828274
--- /dev/null
+++ b/Documentation/admin-guide/pstore-block.rst
@@ -0,0 +1,233 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Pstore block oops/panic logger
+==============================
+
+Introduction
+------------
+
+Pstore block (pstore_blk) is an oops/panic logger that write its logs to block
+device before the system crashes. Pstore_blk needs block device driver
+registering a partition path of the block device, like /dev/mmcblk0p7 for mmc
+driver, and read/write APIs for this partition when on panic.
+
+Pstore block concepts
+---------------------
+
+Pstore block begins at function ``blkz_register``, by which block driver
+registers to pstore_blk. Note that, block driver should register to pstore_blk
+after block device has registered. Block driver transfers a structure
+``blkz_info`` which is defined in *linux/pstore_blk.h*.
+
+The following key members of ``struct blkz_info`` may be of interest to you.
+
+blkdev
+~~~~~~
+
+The block device to use. Most of the time, it is a partition of block device.
+It's ok to keep it as NULL if you passing ``read`` and ``write`` in blkz_info as
+``blkdev`` is used by blkz_default_general_read/write. If both of ``blkdev``,
+``read`` and ``write`` are NULL, no block device is effective and the data will
+be saved in ddr buffer.
+
+It accept the following variants:
+
+1. <hex_major><hex_minor> device number in hexadecimal represents itself no
+   leading 0x, for example b302.
+#. /dev/<disk_name> represents the device number of disk
+#. /dev/<disk_name><decimal> represents the device number of partition - device
+   number of disk plus the partition number
+#. /dev/<disk_name>p<decimal> - same as the above, that form is used when disk
+   name of partitioned disk ends on a digit.
+#. PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF representing the unique id of
+   a partition if the partition table provides it. The UUID may be either an
+   EFI/GPT UUID, or refer to an MSDOS partition using the format SSSSSSSS-PP,
+   where SSSSSSSS is a zero-filled hex representation of the 32-bit
+   "NT disk signature", and PP is a zero-filled hex representation of the
+   1-based partition number.
+#. PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in relation to a
+   partition with a known unique id.
+#. <major>:<minor> major and minor number of the device separated by a colon.
+
+See more on section **read/write**.
+
+total_size
+~~~~~~~~~~
+
+The total size in bytes of block device used for pstore_blk. It **MUST** be less
+than or equal to size of block device if ``blkdev`` valid. It **MUST** be a
+multiple of 4096. If ``total_size`` is zero with ``blkdev``, ``total_size`` will be
+set to equal to size of ``blkdev``.
+
+The block device area is divided into many chunks, and each event writes a chunk
+of information.
+
+dmesg_size
+~~~~~~~~~~
+
+The chunk size in bytes for dmesg(oops/panic). It **MUST** be a multiple of
+SECTOR_SIZE (Most of the time, the SECTOR_SIZE is 512). If you don't need dmesg,
+you are safely to set it to 0.
+
+NOTE that, the remaining space, except ``pmsg_size`` and others, belongs to
+dmesg. It means that there are multiple chunks for dmesg.
+
+Psotre_blk will log to dmesg chunks one by one, and always overwrite the oldest
+chunk if no free chunk.
+
+pmsg_size
+~~~~~~~~~
+
+The chunk size in bytes for pmsg. It **MUST** be a multiple of SECTOR_SIZE (Most
+of the time, the SECTOR_SIZE is 512). If you don't need pmsg, you are safely to
+set it to 0.
+
+There is only one chunk for pmsg.
+
+Pmsg is a user space accessible pstore object. Writes to */dev/pmsg0* are
+appended to the chunk. On reboot the contents are available in
+/sys/fs/pstore/pmsg-pstore-blk-0.
+
+dump_oops
+~~~~~~~~~
+
+Dumping both oopses and panics can be done by setting 1 in the ``dump_oops``
+member while setting 0 in that variable dumps only the panics.
+
+read/write
+~~~~~~~~~~
+
+They are general ``read/write`` APIs. It is safely and recommended to ignore it,
+but set ``blkdev``.
+
+These general APIs are used all the time expect panic. The ``read`` API is
+usually used to recover data from block device, and the ``write`` API is usually
+to flush new data and erase to block device.
+
+Pstore_blk will temporarily hold all new data before block device is ready. If
+you ignore both of ``read/write`` and ``blkdev``, the old data will be lost.
+
+NOTE that, the general APIs must check whether the block device is ready if
+self-defined.
+
+panic_read/panic_write
+~~~~~~~~~~~~~~~~~~~~~~
+
+They are ``read/write`` APIs for panic. They are likely to general
+``read/write`` but will be used only when on panic.
+
+The attentions for panic read/write see section
+**Attentions in panic read/write APIs**.
+
+Register to pstore block
+------------------------
+
+Block device driver call ``blkz_register`` to register to Psotre_blk.
+For example:
+
+.. code-block:: c
+
+ #include <linux/pstore_blk.h>
+ [...]
+
+ static ssize_t XXXX_panic_read(char *buf, size bytes, loff_t pos)
+ {
+    [...]
+ }
+
+ static ssize_t XXXX_panic_write(const char *buf, size_t bytes, loff_t pos)
+ {
+        [...]
+ }
+
+ struct blkz_info XXXX_info = {
+        .onwer = THIS_MODULE,
+        .name = <...>,
+        .dmesg_size = <...>,
+        .pmsg_size = <...>,
+        .dump_oops = true,
+        .panic_read = XXXX_panic_read,
+        .panic_write = XXXX_panic_write,
+ };
+
+ static int __init XXXX_init(void)
+ {
+        [... get block device information ...]
+        XXXX_info.blkdev = <...>;
+        XXXX_info.total_size = <...>;
+
+        [...]
+        return blkz_register(&XXXX_info);
+ }
+
+There are multiple ways by which you can get block device information.
+
+A. Use the module parameters and kernel cmdline.
+B. Use Device Tree bindings.
+C. Use Kconfig.
+D. Use Driver Feature.
+   For example, traverse all MTD device by ``register_mtd_user``, and get the
+   matching name MTD partition.
+
+NOTE that, all of above are done by block driver rather then pstore_blk. You can
+get sample on blkoops.
+
+The attentions for panic read/write see section
+**Attentions in panic read/write APIs**.
+
+Compression and header
+----------------------
+
+Block device is large enough, it is not necessary to compress dmesg data.
+Actually, we recommend not compress. Because pstore_blk will insert some
+information into the first line of dmesg data if no compression.
+For example::
+
+        Panic: Total 16 times
+
+It means that it's the 16th times panic log since burning.
+Sometimes, the oops|panic counter since burning is very important for embedded
+device to judge whether the system is stable.
+
+The follow line is insert by pstore filesystem.
+For example::
+
+        Oops#2 Part1
+
+It means that it's the 2nd times oops log on last booting.
+
+Reading the data
+----------------
+
+The dump data can be read from the pstore filesystem. The format for these
+files is ``dmesg-pstore-blk-[N]`` for dmesg(oops|panic) and
+``pmsg-pstore-blk-0`` for pmsg, where N is the record number. To delete a stored
+record from block device, simply unlink the respective pstore file. The
+timestamp of the dump file records the trigger time.
+
+Attentions in panic read/write APIs
+-----------------------------------
+
+If on panic, the kernel is not going to be running for much longer. The tasks
+will not be scheduled and the most kernel resources will be out of service. It
+looks like a single-threaded program running on a single-core computer.
+
+The following points need special attention for panic read/write APIs:
+
+1. Can **NOT** allocate any memory.
+   If you need memory, just allocate while the block driver is initialing rather
+   than waiting until the panic.
+#. Must be polled, **NOT** interrupt driven.
+   No task schedule any more. The block driver should delay to ensure the write
+   succeeds, but NOT sleep.
+#. Can **NOT** take any lock.
+   There is no other task, no any share resource, you are safely to break all
+   locks.
+#. Just use cpu to transfer.
+   Do not use DMA to transfer unless you are sure that DMA will not keep lock.
+#. Operate register directly.
+   Try not to use linux kernel resources. Do io map while initialing rather than
+   waiting until the panic.
+#. Reset your block device and controller if necessary.
+   If you are not sure the state of you block device and controller when panic,
+   you are safely to stop and reset them.
diff --git a/MAINTAINERS b/MAINTAINERS
index 44647a8..4dd95d3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12317,6 +12317,7 @@ F:	include/linux/pstore*
 F:	drivers/firmware/efi/efi-pstore.c
 F:	drivers/acpi/apei/erst.c
 F:	Documentation/admin-guide/ramoops.rst
+F:	Documentation/admin-guide/pstore-block.rst
 F:	Documentation/devicetree/bindings/reserved-memory/ramoops.txt
 F:	Documentation/devicetree/bindings/pstore-block/
 K:	\b(pstore|ramoops|blkoops)
diff --git a/fs/pstore/Kconfig b/fs/pstore/Kconfig
index 50d196e..0247832 100644
--- a/fs/pstore/Kconfig
+++ b/fs/pstore/Kconfig
@@ -161,6 +161,10 @@ config PSTORE_BLK
 	  This enables panic and oops message to be logged to a block dev
 	  where it can be read back at some later point.
 
+	  For more information, see Documentation/admin-guide/pstore-block.rst.
+
+	  If unsure, say N.
+
 config PSTORE_BLKOOPS
 	tristate "pstore block with oops logger"
 	depends on PSTORE_BLK
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [RFC v9 2/5] dt-bindings: pstore-block: new support for blkoops
  2019-02-19 11:52 ` [RFC v9 2/5] dt-bindings: pstore-block: new support for blkoops liaoweixiong
@ 2019-02-22 15:36   ` Rob Herring
  2019-02-25 14:20     ` liaoweixiong
  0 siblings, 1 reply; 10+ messages in thread
From: Rob Herring @ 2019-02-22 15:36 UTC (permalink / raw)
  To: liaoweixiong
  Cc: Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Mark Rutland, Mauro Carvalho Chehab,
	David S. Miller, Greg Kroah-Hartman, Nicolas Ferre,
	Arnd Bergmann, linux-doc, linux-kernel, devicetree,
	boot-architecture

+boot-architecture list

On Tue, Feb 19, 2019 at 07:52:47PM +0800, liaoweixiong wrote:
> Create DT binding document for blkoops.
> 
> Signed-off-by: liaoweixiong <liaoweixiong@allwinnertech.com>
> ---
>  .../devicetree/bindings/pstore/blkoops.txt         | 53 ++++++++++++++++++++++
>  MAINTAINERS                                        |  1 +
>  2 files changed, 54 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/pstore/blkoops.txt
> 
> diff --git a/Documentation/devicetree/bindings/pstore/blkoops.txt b/Documentation/devicetree/bindings/pstore/blkoops.txt
> new file mode 100644
> index 0000000..5462915
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/pstore/blkoops.txt
> @@ -0,0 +1,53 @@
> +Blkoops oops logger
> +===================
> +
> +Blkoops provides a block partition for oops, excluding panics now, so they can
> +be recovered after a reboot.
> +
> +Any space of block device will be used for a circular buffer of oops records.
> +These records have a configurable size, with a size of 0 indicating that they
> +should be disabled.
> +
> +At least one of "block-device" and "total_size" must be set.
> +
> +At least one of "dmesg-size" or "pmsg-size" must be set non-zero.
> +
> +Required properties:
> +
> +- compatible: must be "blkoops".
> +
> +Optional properties:
> +
> +- block-device: The block device to use. Most of the time, it is a partition of
> +		device. If block-device is NULL, no block device is effective
> +		and the data will be lost after rebooting.
> +		It accept the following variants:
> +		1) <hex_major><hex_minor> device number in hexadecimal
> +		   represents itself no leading 0x, for example b302.
> +		2) /dev/<disk_name> represents the device number of disk
> +		3) /dev/<disk_name><decimal> represents the device number of
> +		   partition - device number of disk plus the partition number
> +		4) /dev/<disk_name>p<decimal> - same as the above, that form is
> +		   used when disk name of partitioned disk ends on a digit.
> +		5) PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF representing
> +		   the unique id of a partition if the partition table provides
> +		   it. The UUID may be either an EFI/GPT UUID, or refer to an
> +		   MSDOS partition using the format SSSSSSSS-PP, where SSSSSSSS
> +		   is a zero-filled hex representation of the 32-bit
> +		   "NT disk signature", and PP is a zero-filled hex
> +		   representation of the 1-based partition number.
> +		6) PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in
> +		   relation to a partition with a known unique id.
> +		7) <major>:<minor> major and minor number of the device
> +		   separated by a colon.

No.

I didn't suggest to go look at PARTUUID to copy it into the binding, but 
rather to point out that the kernel can already mount by UUID. 
Specifying the UUID in DT is also not what I suggested. My suggestion is 
to define a known UUID so that the kernel (and bootloaders, userspace, 
the world) can just know the UUID. Just like the EFI system partition. 
Now this means you have to get it defined in the UEFI specification 
(or maybe EBBR[1]). If you want help with how to do that, the 
boot-architecture list is a good place to start.

major/minor numbers are a Linux thing, so they don't go in DT.
/dev/* is Linux thing, so it doesn't go in DT.

You can always define all these parameters as kernel command line 
options and avoid DT. That would also make this work on *all* systems, 
not just DT based systems. (Though I still believe that the partition 
should be discoverable.)

Rob

[1] https://github.com/ARM-software/ebbr



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC v9 2/5] dt-bindings: pstore-block: new support for blkoops
  2019-02-22 15:36   ` Rob Herring
@ 2019-02-25 14:20     ` liaoweixiong
  0 siblings, 0 replies; 10+ messages in thread
From: liaoweixiong @ 2019-02-25 14:20 UTC (permalink / raw)
  To: Rob Herring
  Cc: Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Mark Rutland, Mauro Carvalho Chehab,
	David S. Miller, Greg Kroah-Hartman, Nicolas Ferre,
	Arnd Bergmann, linux-doc, linux-kernel, devicetree,
	boot-architecture


On 2019-02-22 23:36, Rob Herring wrote:
> +boot-architecture list
> 
> On Tue, Feb 19, 2019 at 07:52:47PM +0800, liaoweixiong wrote:
>> Create DT binding document for blkoops.
>>
>> Signed-off-by: liaoweixiong <liaoweixiong@allwinnertech.com>
>> ---
>>  .../devicetree/bindings/pstore/blkoops.txt         | 53 ++++++++++++++++++++++
>>  MAINTAINERS                                        |  1 +
>>  2 files changed, 54 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/pstore/blkoops.txt
>>
>> diff --git a/Documentation/devicetree/bindings/pstore/blkoops.txt b/Documentation/devicetree/bindings/pstore/blkoops.txt
>> new file mode 100644
>> index 0000000..5462915
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/pstore/blkoops.txt
>> @@ -0,0 +1,53 @@
>> +Blkoops oops logger
>> +===================
>> +
>> +Blkoops provides a block partition for oops, excluding panics now, so they can
>> +be recovered after a reboot.
>> +
>> +Any space of block device will be used for a circular buffer of oops records.
>> +These records have a configurable size, with a size of 0 indicating that they
>> +should be disabled.
>> +
>> +At least one of "block-device" and "total_size" must be set.
>> +
>> +At least one of "dmesg-size" or "pmsg-size" must be set non-zero.
>> +
>> +Required properties:
>> +
>> +- compatible: must be "blkoops".
>> +
>> +Optional properties:
>> +
>> +- block-device: The block device to use. Most of the time, it is a partition of
>> +		device. If block-device is NULL, no block device is effective
>> +		and the data will be lost after rebooting.
>> +		It accept the following variants:
>> +		1) <hex_major><hex_minor> device number in hexadecimal
>> +		   represents itself no leading 0x, for example b302.
>> +		2) /dev/<disk_name> represents the device number of disk
>> +		3) /dev/<disk_name><decimal> represents the device number of
>> +		   partition - device number of disk plus the partition number
>> +		4) /dev/<disk_name>p<decimal> - same as the above, that form is
>> +		   used when disk name of partitioned disk ends on a digit.
>> +		5) PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF representing
>> +		   the unique id of a partition if the partition table provides
>> +		   it. The UUID may be either an EFI/GPT UUID, or refer to an
>> +		   MSDOS partition using the format SSSSSSSS-PP, where SSSSSSSS
>> +		   is a zero-filled hex representation of the 32-bit
>> +		   "NT disk signature", and PP is a zero-filled hex
>> +		   representation of the 1-based partition number.
>> +		6) PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in
>> +		   relation to a partition with a known unique id.
>> +		7) <major>:<minor> major and minor number of the device
>> +		   separated by a colon.
> 
> No.
> 
> I didn't suggest to go look at PARTUUID to copy it into the binding, but 
> rather to point out that the kernel can already mount by UUID. 
> Specifying the UUID in DT is also not what I suggested. My suggestion is 
> to define a known UUID so that the kernel (and bootloaders, userspace, 
> the world) can just know the UUID. Just like the EFI system partition. 
> Now this means you have to get it defined in the UEFI specification 
> (or maybe EBBR[1]). If you want help with how to do that, the 
> boot-architecture list is a good place to start.
> 

Thanks for your suggestion. I don't know whether it is a good idea to
define a known UUID for pstore/blk in the UEFI specification. This
property is only used for pstore/blk to know which block device it can
use. It only works on linux. I think more thorough and rigorous
consideration is needed.
Besides that, mbr partition table has no partition UUID, how can it to
be compatible with mbr?

> major/minor numbers are a Linux thing, so they don't go in DT.
> /dev/* is Linux thing, so it doesn't go in DT.
> 
> You can always define all these parameters as kernel command line 
> options and avoid DT. That would also make this work on *all* systems, 
> not just DT based systems. (Though I still believe that the partition 
> should be discoverable.)
> 

The pstore/blk has already support command line. It now has 3
configuration methods, they are kconfig, DT and module parameters. I
will cancel DT support on next version until we discuss a viable
approach by this mail and than i will submit other patches to implement
DT. Is this ok?

> Rob
> 
> [1] https://github.com/ARM-software/ebbr
> 

-- 
liaoweixiong

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC v9 5/5] Documentation: pstore/blk: create document for pstore_blk
  2019-02-19 11:52 ` [RFC v9 5/5] Documentation: pstore/blk: create document for pstore_blk liaoweixiong
@ 2019-02-28  5:15   ` Randy Dunlap
  2019-02-28  6:40     ` liaoweixiong
  0 siblings, 1 reply; 10+ messages in thread
From: Randy Dunlap @ 2019-02-28  5:15 UTC (permalink / raw)
  To: liaoweixiong, Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Rob Herring, Mark Rutland,
	Mauro Carvalho Chehab, David S. Miller, Greg Kroah-Hartman,
	Nicolas Ferre, Arnd Bergmann
  Cc: linux-doc, linux-kernel, devicetree

On 2/19/19 3:52 AM, liaoweixiong wrote:
> The document, at Documentation/admin-guide/pstore-block.rst,
> tells user how to use pstore_blk and the attentions about panic
> read/write
> 
> Signed-off-by: liaoweixiong <liaoweixiong@allwinnertech.com>
> ---
>  Documentation/admin-guide/pstore-block.rst | 233 +++++++++++++++++++++++++++++
>  MAINTAINERS                                |   1 +
>  fs/pstore/Kconfig                          |   4 +
>  3 files changed, 238 insertions(+)
>  create mode 100644 Documentation/admin-guide/pstore-block.rst
> 
> diff --git a/Documentation/admin-guide/pstore-block.rst b/Documentation/admin-guide/pstore-block.rst
> new file mode 100644
> index 0000000..a828274
> --- /dev/null
> +++ b/Documentation/admin-guide/pstore-block.rst
> @@ -0,0 +1,233 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +Pstore block oops/panic logger
> +==============================
> +
> +Introduction
> +------------
> +
> +Pstore block (pstore_blk) is an oops/panic logger that write its logs to block

                                                                         to a block

> +device before the system crashes. Pstore_blk needs block device driver

                                                needs the block

> +registering a partition path of the block device, like /dev/mmcblk0p7 for mmc

   to register                                                           for MMC

> +driver, and read/write APIs for this partition when on panic.
> +
> +Pstore block concepts
> +---------------------
> +
> +Pstore block begins at function ``blkz_register``, by which block driver

                                                      by which a block driver

> +registers to pstore_blk. Note that, block driver should register to pstore_blk

                            Note that the block driver should

> +after block device has registered. Block driver transfers a structure

                                      The block driver

> +``blkz_info`` which is defined in *linux/pstore_blk.h*.
> +
> +The following key members of ``struct blkz_info`` may be of interest to you.
> +
> +blkdev
> +~~~~~~
> +
> +The block device to use. Most of the time, it is a partition of block device.
> +It's ok to keep it as NULL if you passing ``read`` and ``write`` in blkz_info as

                              if you are passing

> +``blkdev`` is used by blkz_default_general_read/write. If both of ``blkdev``,
> +``read`` and ``write`` are NULL, no block device is effective and the data will
> +be saved in ddr buffer.

what is ddr buffer?

> +
> +It accept the following variants:
> +
> +1. <hex_major><hex_minor> device number in hexadecimal represents itself no

                                                                     itself; no

> +   leading 0x, for example b302.
> +#. /dev/<disk_name> represents the device number of disk
> +#. /dev/<disk_name><decimal> represents the device number of partition - device
> +   number of disk plus the partition number
> +#. /dev/<disk_name>p<decimal> - same as the above, that form is used when disk

                                               above; this form

> +   name of partitioned disk ends on a digit.

                               ends with a digit.

> +#. PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF representing the unique id of
> +   a partition if the partition table provides it. The UUID may be either an
> +   EFI/GPT UUID, or refer to an MSDOS partition using the format SSSSSSSS-PP,
> +   where SSSSSSSS is a zero-filled hex representation of the 32-bit
> +   "NT disk signature", and PP is a zero-filled hex representation of the
> +   1-based partition number.
> +#. PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in relation to a
> +   partition with a known unique id.
> +#. <major>:<minor> major and minor number of the device separated by a colon.
> +
> +See more on section **read/write**.

            in section

> +
> +total_size
> +~~~~~~~~~~
> +
> +The total size in bytes of block device used for pstore_blk. It **MUST** be less
> +than or equal to size of block device if ``blkdev`` valid. It **MUST** be a
> +multiple of 4096. If ``total_size`` is zero with ``blkdev``, ``total_size`` will be
> +set to equal to size of ``blkdev``.
> +
> +The block device area is divided into many chunks, and each event writes a chunk
> +of information.
> +
> +dmesg_size
> +~~~~~~~~~~
> +
> +The chunk size in bytes for dmesg(oops/panic). It **MUST** be a multiple of
> +SECTOR_SIZE (Most of the time, the SECTOR_SIZE is 512). If you don't need dmesg,
> +you are safely to set it to 0.

   you can safely

> +
> +NOTE that, the remaining space, except ``pmsg_size`` and others, belongs to
> +dmesg. It means that there are multiple chunks for dmesg.
> +
> +Psotre_blk will log to dmesg chunks one by one, and always overwrite the oldest

   Pstore_blk

> +chunk if no free chunk.
> +
> +pmsg_size
> +~~~~~~~~~
> +
> +The chunk size in bytes for pmsg. It **MUST** be a multiple of SECTOR_SIZE (Most
> +of the time, the SECTOR_SIZE is 512). If you don't need pmsg, you are safely to

                                                                 you can safely {drop "to"}

> +set it to 0.
> +
> +There is only one chunk for pmsg.
> +
> +Pmsg is a user space accessible pstore object. Writes to */dev/pmsg0* are
> +appended to the chunk. On reboot the contents are available in
> +/sys/fs/pstore/pmsg-pstore-blk-0.
> +
> +dump_oops
> +~~~~~~~~~
> +
> +Dumping both oopses and panics can be done by setting 1 in the ``dump_oops``
> +member while setting 0 in that variable dumps only the panics.
> +
> +read/write
> +~~~~~~~~~~
> +
> +They are general ``read/write`` APIs. It is safely and recommended to ignore it,

                                         It is safe and recommended

> +but set ``blkdev``.
> +
> +These general APIs are used all the time expect panic. The ``read`` API is
> +usually used to recover data from block device, and the ``write`` API is usually
> +to flush new data and erase to block device.
> +
> +Pstore_blk will temporarily hold all new data before block device is ready. If
> +you ignore both of ``read/write`` and ``blkdev``, the old data will be lost.
> +
> +NOTE that, the general APIs must check whether the block device is ready if

   NOTE that the general

> +self-defined.
> +
> +panic_read/panic_write
> +~~~~~~~~~~~~~~~~~~~~~~
> +
> +They are ``read/write`` APIs for panic. They are likely to general

                                           They are like the general


> +``read/write`` but will be used only when on panic.
> +
> +The attentions for panic read/write see section
> +**Attentions in panic read/write APIs**.
> +
> +Register to pstore block
> +------------------------
> +
> +Block device driver call ``blkz_register`` to register to Psotre_blk.

                                                             Pstore_blk.

> +For example:
> +
> +.. code-block:: c
> +
> + #include <linux/pstore_blk.h>
> + [...]
> +
> + static ssize_t XXXX_panic_read(char *buf, size bytes, loff_t pos)
> + {
> +    [...]
> + }
> +
> + static ssize_t XXXX_panic_write(const char *buf, size_t bytes, loff_t pos)
> + {
> +        [...]
> + }
> +
> + struct blkz_info XXXX_info = {
> +        .onwer = THIS_MODULE,
> +        .name = <...>,
> +        .dmesg_size = <...>,
> +        .pmsg_size = <...>,
> +        .dump_oops = true,
> +        .panic_read = XXXX_panic_read,
> +        .panic_write = XXXX_panic_write,
> + };
> +
> + static int __init XXXX_init(void)
> + {
> +        [... get block device information ...]
> +        XXXX_info.blkdev = <...>;
> +        XXXX_info.total_size = <...>;
> +
> +        [...]
> +        return blkz_register(&XXXX_info);
> + }
> +
> +There are multiple ways by which you can get block device information.
> +
> +A. Use the module parameters and kernel cmdline.
> +B. Use Device Tree bindings.
> +C. Use Kconfig.
> +D. Use Driver Feature.
> +   For example, traverse all MTD device by ``register_mtd_user``, and get the

                                    devices

> +   matching name MTD partition.
> +
> +NOTE that, all of above are done by block driver rather then pstore_blk. You can

   NOTE that all of the above are done by the block driver

> +get sample on blkoops.
> +
> +The attentions for panic read/write see section
> +**Attentions in panic read/write APIs**.
> +
> +Compression and header
> +----------------------
> +
> +Block device is large enough, it is not necessary to compress dmesg data.
> +Actually, we recommend not compress. Because pstore_blk will insert some

                              compressing because

> +information into the first line of dmesg data if no compression.
> +For example::
> +
> +        Panic: Total 16 times
> +
> +It means that it's the 16th times panic log since burning.

what is "burning"?

> +Sometimes, the oops|panic counter since burning is very important for embedded
> +device to judge whether the system is stable.
> +
> +The follow line is insert by pstore filesystem.

       following line is inserted

> +For example::
> +
> +        Oops#2 Part1
> +
> +It means that it's the 2nd times oops log on last booting.
> +
> +Reading the data
> +----------------
> +
> +The dump data can be read from the pstore filesystem. The format for these
> +files is ``dmesg-pstore-blk-[N]`` for dmesg(oops|panic) and
> +``pmsg-pstore-blk-0`` for pmsg, where N is the record number. To delete a stored
> +record from block device, simply unlink the respective pstore file. The
> +timestamp of the dump file records the trigger time.
> +
> +Attentions in panic read/write APIs
> +-----------------------------------
> +
> +If on panic, the kernel is not going to be running for much longer. The tasks
> +will not be scheduled and the most kernel resources will be out of service. It
> +looks like a single-threaded program running on a single-core computer.
> +
> +The following points need special attention for panic read/write APIs:
> +
> +1. Can **NOT** allocate any memory.
> +   If you need memory, just allocate while the block driver is initialing rather

                                                                  initializing

> +   than waiting until the panic.
> +#. Must be polled, **NOT** interrupt driven.
> +   No task schedule any more. The block driver should delay to ensure the write
> +   succeeds, but NOT sleep.
> +#. Can **NOT** take any lock.
> +   There is no other task, no any share resource, you are safely to break all

                              nor any shared resource; you are safe to break all

> +   locks.
> +#. Just use cpu to transfer.

               CPU

> +   Do not use DMA to transfer unless you are sure that DMA will not keep lock.
> +#. Operate register directly.
> +   Try not to use linux kernel resources. Do io map while initialing rather than

                     Linux                      I/O          initializing

> +   waiting until the panic.
> +#. Reset your block device and controller if necessary.
> +   If you are not sure the state of you block device and controller when panic,
> +   you are safely to stop and reset them.

      you are safe to



cheers.
-- 
~Randy

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC v9 5/5] Documentation: pstore/blk: create document for pstore_blk
  2019-02-28  5:15   ` Randy Dunlap
@ 2019-02-28  6:40     ` liaoweixiong
  0 siblings, 0 replies; 10+ messages in thread
From: liaoweixiong @ 2019-02-28  6:40 UTC (permalink / raw)
  To: Randy Dunlap, Kees Cook, Anton Vorontsov, Colin Cross, Tony Luck,
	Jonathan Corbet, Rob Herring, Mark Rutland,
	Mauro Carvalho Chehab, David S. Miller, Greg Kroah-Hartman,
	Nicolas Ferre, Arnd Bergmann
  Cc: linux-doc, linux-kernel, devicetree

Thank you for your correction. I will update the patch in the 12th version.

On 2019/02/28 13:15, Randy Dunlap wrote:
> On 2/19/19 3:52 AM, liaoweixiong wrote:
>> The document, at Documentation/admin-guide/pstore-block.rst,
>> tells user how to use pstore_blk and the attentions about panic
>> read/write
>>
>> Signed-off-by: liaoweixiong <liaoweixiong@allwinnertech.com>
>> ---
>>  Documentation/admin-guide/pstore-block.rst | 233 +++++++++++++++++++++++++++++
>>  MAINTAINERS                                |   1 +
>>  fs/pstore/Kconfig                          |   4 +
>>  3 files changed, 238 insertions(+)
>>  create mode 100644 Documentation/admin-guide/pstore-block.rst
>>
>> diff --git a/Documentation/admin-guide/pstore-block.rst b/Documentation/admin-guide/pstore-block.rst
>> new file mode 100644
>> index 0000000..a828274
>> --- /dev/null
>> +++ b/Documentation/admin-guide/pstore-block.rst
>> @@ -0,0 +1,233 @@
>> +.. SPDX-License-Identifier: GPL-2.0
>> +
>> +Pstore block oops/panic logger
>> +==============================
>> +
>> +Introduction
>> +------------
>> +
>> +Pstore block (pstore_blk) is an oops/panic logger that write its logs to block
> 
>                                                                          to a block
> 
>> +device before the system crashes. Pstore_blk needs block device driver
> 
>                                                 needs the block
> 
>> +registering a partition path of the block device, like /dev/mmcblk0p7 for mmc
> 
>    to register                                                           for MMC
> 
>> +driver, and read/write APIs for this partition when on panic.
>> +
>> +Pstore block concepts
>> +---------------------
>> +
>> +Pstore block begins at function ``blkz_register``, by which block driver
> 
>                                                       by which a block driver
> 
>> +registers to pstore_blk. Note that, block driver should register to pstore_blk
> 
>                             Note that the block driver should
> 
>> +after block device has registered. Block driver transfers a structure
> 
>                                       The block driver
> 
>> +``blkz_info`` which is defined in *linux/pstore_blk.h*.
>> +
>> +The following key members of ``struct blkz_info`` may be of interest to you.
>> +
>> +blkdev
>> +~~~~~~
>> +
>> +The block device to use. Most of the time, it is a partition of block device.
>> +It's ok to keep it as NULL if you passing ``read`` and ``write`` in blkz_info as
> 
>                               if you are passing
> 
>> +``blkdev`` is used by blkz_default_general_read/write. If both of ``blkdev``,
>> +``read`` and ``write`` are NULL, no block device is effective and the data will
>> +be saved in ddr buffer.
> 
> what is ddr buffer?
> 

It is a buffer allocated from RAM. I modify it as follow:
If both of ``blkdev``, ``read`` and ``write`` are NULL, no block device
is effective and the data will only be saved in RAM.

>> +
>> +It accept the following variants:
>> +
>> +1. <hex_major><hex_minor> device number in hexadecimal represents itself no
> 
>                                                                      itself; no
> 
>> +   leading 0x, for example b302.
>> +#. /dev/<disk_name> represents the device number of disk
>> +#. /dev/<disk_name><decimal> represents the device number of partition - device
>> +   number of disk plus the partition number
>> +#. /dev/<disk_name>p<decimal> - same as the above, that form is used when disk
> 
>                                                above; this form
> 
>> +   name of partitioned disk ends on a digit.
> 
>                                ends with a digit.
> 
>> +#. PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF representing the unique id of
>> +   a partition if the partition table provides it. The UUID may be either an
>> +   EFI/GPT UUID, or refer to an MSDOS partition using the format SSSSSSSS-PP,
>> +   where SSSSSSSS is a zero-filled hex representation of the 32-bit
>> +   "NT disk signature", and PP is a zero-filled hex representation of the
>> +   1-based partition number.
>> +#. PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in relation to a
>> +   partition with a known unique id.
>> +#. <major>:<minor> major and minor number of the device separated by a colon.
>> +
>> +See more on section **read/write**.
> 
>             in section
> 
>> +
>> +total_size
>> +~~~~~~~~~~
>> +
>> +The total size in bytes of block device used for pstore_blk. It **MUST** be less
>> +than or equal to size of block device if ``blkdev`` valid. It **MUST** be a
>> +multiple of 4096. If ``total_size`` is zero with ``blkdev``, ``total_size`` will be
>> +set to equal to size of ``blkdev``.
>> +
>> +The block device area is divided into many chunks, and each event writes a chunk
>> +of information.
>> +
>> +dmesg_size
>> +~~~~~~~~~~
>> +
>> +The chunk size in bytes for dmesg(oops/panic). It **MUST** be a multiple of
>> +SECTOR_SIZE (Most of the time, the SECTOR_SIZE is 512). If you don't need dmesg,
>> +you are safely to set it to 0.
> 
>    you can safely
> 
>> +
>> +NOTE that, the remaining space, except ``pmsg_size`` and others, belongs to
>> +dmesg. It means that there are multiple chunks for dmesg.
>> +
>> +Psotre_blk will log to dmesg chunks one by one, and always overwrite the oldest
> 
>    Pstore_blk
> 
>> +chunk if no free chunk.
>> +
>> +pmsg_size
>> +~~~~~~~~~
>> +
>> +The chunk size in bytes for pmsg. It **MUST** be a multiple of SECTOR_SIZE (Most
>> +of the time, the SECTOR_SIZE is 512). If you don't need pmsg, you are safely to
> 
>                                                                  you can safely {drop "to"}
> 
>> +set it to 0.
>> +
>> +There is only one chunk for pmsg.
>> +
>> +Pmsg is a user space accessible pstore object. Writes to */dev/pmsg0* are
>> +appended to the chunk. On reboot the contents are available in
>> +/sys/fs/pstore/pmsg-pstore-blk-0.
>> +
>> +dump_oops
>> +~~~~~~~~~
>> +
>> +Dumping both oopses and panics can be done by setting 1 in the ``dump_oops``
>> +member while setting 0 in that variable dumps only the panics.
>> +
>> +read/write
>> +~~~~~~~~~~
>> +
>> +They are general ``read/write`` APIs. It is safely and recommended to ignore it,
> 
>                                          It is safe and recommended
> 
>> +but set ``blkdev``.
>> +
>> +These general APIs are used all the time expect panic. The ``read`` API is
>> +usually used to recover data from block device, and the ``write`` API is usually
>> +to flush new data and erase to block device.
>> +
>> +Pstore_blk will temporarily hold all new data before block device is ready. If
>> +you ignore both of ``read/write`` and ``blkdev``, the old data will be lost.
>> +
>> +NOTE that, the general APIs must check whether the block device is ready if
> 
>    NOTE that the general
> 
>> +self-defined.
>> +
>> +panic_read/panic_write
>> +~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +They are ``read/write`` APIs for panic. They are likely to general
> 
>                                            They are like the general
> 
> 
>> +``read/write`` but will be used only when on panic.
>> +
>> +The attentions for panic read/write see section
>> +**Attentions in panic read/write APIs**.
>> +
>> +Register to pstore block
>> +------------------------
>> +
>> +Block device driver call ``blkz_register`` to register to Psotre_blk.
> 
>                                                              Pstore_blk.
> 
>> +For example:
>> +
>> +.. code-block:: c
>> +
>> + #include <linux/pstore_blk.h>
>> + [...]
>> +
>> + static ssize_t XXXX_panic_read(char *buf, size bytes, loff_t pos)
>> + {
>> +    [...]
>> + }
>> +
>> + static ssize_t XXXX_panic_write(const char *buf, size_t bytes, loff_t pos)
>> + {
>> +        [...]
>> + }
>> +
>> + struct blkz_info XXXX_info = {
>> +        .onwer = THIS_MODULE,
>> +        .name = <...>,
>> +        .dmesg_size = <...>,
>> +        .pmsg_size = <...>,
>> +        .dump_oops = true,
>> +        .panic_read = XXXX_panic_read,
>> +        .panic_write = XXXX_panic_write,
>> + };
>> +
>> + static int __init XXXX_init(void)
>> + {
>> +        [... get block device information ...]
>> +        XXXX_info.blkdev = <...>;
>> +        XXXX_info.total_size = <...>;
>> +
>> +        [...]
>> +        return blkz_register(&XXXX_info);
>> + }
>> +
>> +There are multiple ways by which you can get block device information.
>> +
>> +A. Use the module parameters and kernel cmdline.
>> +B. Use Device Tree bindings.
>> +C. Use Kconfig.
>> +D. Use Driver Feature.
>> +   For example, traverse all MTD device by ``register_mtd_user``, and get the
> 
>                                     devices
> 
>> +   matching name MTD partition.
>> +
>> +NOTE that, all of above are done by block driver rather then pstore_blk. You can
> 
>    NOTE that all of the above are done by the block driver
> 
>> +get sample on blkoops.
>> +
>> +The attentions for panic read/write see section
>> +**Attentions in panic read/write APIs**.
>> +
>> +Compression and header
>> +----------------------
>> +
>> +Block device is large enough, it is not necessary to compress dmesg data.
>> +Actually, we recommend not compress. Because pstore_blk will insert some
> 
>                               compressing because
> 
>> +information into the first line of dmesg data if no compression.
>> +For example::
>> +
>> +        Panic: Total 16 times
>> +
>> +It means that it's the 16th times panic log since burning.
> 
> what is "burning"?
> 

It is something about embedded device installing system. It was my
negligence not to consider the compatibility of concepts. I modify it as
follow:
It means that it's the 16th times panic log since the first  booting.

>> +Sometimes, the oops|panic counter since burning is very important for embedded
>> +device to judge whether the system is stable.
>> +
>> +The follow line is insert by pstore filesystem.
> 
>        following line is inserted
> 
>> +For example::
>> +
>> +        Oops#2 Part1
>> +
>> +It means that it's the 2nd times oops log on last booting.
>> +
>> +Reading the data
>> +----------------
>> +
>> +The dump data can be read from the pstore filesystem. The format for these
>> +files is ``dmesg-pstore-blk-[N]`` for dmesg(oops|panic) and
>> +``pmsg-pstore-blk-0`` for pmsg, where N is the record number. To delete a stored
>> +record from block device, simply unlink the respective pstore file. The
>> +timestamp of the dump file records the trigger time.
>> +
>> +Attentions in panic read/write APIs
>> +-----------------------------------
>> +
>> +If on panic, the kernel is not going to be running for much longer. The tasks
>> +will not be scheduled and the most kernel resources will be out of service. It
>> +looks like a single-threaded program running on a single-core computer.
>> +
>> +The following points need special attention for panic read/write APIs:
>> +
>> +1. Can **NOT** allocate any memory.
>> +   If you need memory, just allocate while the block driver is initialing rather
> 
>                                                                   initializing
> 
>> +   than waiting until the panic.
>> +#. Must be polled, **NOT** interrupt driven.
>> +   No task schedule any more. The block driver should delay to ensure the write
>> +   succeeds, but NOT sleep.
>> +#. Can **NOT** take any lock.
>> +   There is no other task, no any share resource, you are safely to break all
> 
>                               nor any shared resource; you are safe to break all
> 
>> +   locks.
>> +#. Just use cpu to transfer.
> 
>                CPU
> 
>> +   Do not use DMA to transfer unless you are sure that DMA will not keep lock.
>> +#. Operate register directly.
>> +   Try not to use linux kernel resources. Do io map while initialing rather than
> 
>                      Linux                      I/O          initializing
> 
>> +   waiting until the panic.
>> +#. Reset your block device and controller if necessary.
>> +   If you are not sure the state of you block device and controller when panic,
>> +   you are safely to stop and reset them.
> 
>       you are safe to
> 
> 
> 
> cheers.
> 

-- 
liaoweixiong

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-02-28  6:40 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-19 11:52 [RFC v9 0/5] pstore/block: new support logger for block devices liaoweixiong
2019-02-19 11:52 ` [RFC v9 1/5] pstore/blk: " liaoweixiong
2019-02-19 11:52 ` [RFC v9 2/5] dt-bindings: pstore-block: new support for blkoops liaoweixiong
2019-02-22 15:36   ` Rob Herring
2019-02-25 14:20     ` liaoweixiong
2019-02-19 11:52 ` [RFC v9 3/5] pstore/blk: add blkoops for pstore_blk liaoweixiong
2019-02-19 11:52 ` [RFC v9 4/5] pstore/blk: support pmsg for pstore block liaoweixiong
2019-02-19 11:52 ` [RFC v9 5/5] Documentation: pstore/blk: create document for pstore_blk liaoweixiong
2019-02-28  5:15   ` Randy Dunlap
2019-02-28  6:40     ` liaoweixiong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).