* [PATCH next v3 0/8] printk: reimplement LOG_CONT handling
@ 2020-08-31  1:10 ` John Ogness
  0 siblings, 0 replies; 34+ messages in thread
From: John Ogness @ 2020-08-31  1:10 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner,
	Peter Zijlstra, Andrea Parri, Paul McKenney, kexec, linux-kernel

Hello,

Here is v3 for the second series to rework the printk subsystem.
(The v2 is here [0].) This series implements a new ringbuffer
feature that allows the last record to be extended. Petr Mladek
provided the initial proof of concept [1] for this.

Using the record extension feature, LOG_CONT is re-implemented
in a way that exactly preserves its behavior, but avoids the
need for an extra buffer. In particular, it avoids the need for
any synchronization that such a buffer requires.
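
As a rough illustration (a sketch only, based on the sample writer
code in patch 6's documentation, not the exact printk.c change;
@rb, @text, @text_len and @caller_id stand in for the surrounding
printk context), a LOG_CONT writer first tries to extend the last
record and only reserves a new one if that fails:

	struct prb_reserved_entry e;
	struct printk_record r;

	prb_rec_init_wr(&r, text_len, 0);
	if (prb_reserve_in_last(&e, &rb, &r, caller_id)) {
		/* same caller, record still open: append the fragment */
		memcpy(&r.text_buf[r.info->text_len], text, text_len);
		r.info->text_len += text_len;
		prb_commit(&e); /* commit, but leave extensible */
	} else if (prb_reserve(&e, &rb, &r)) {
		memcpy(&r.text_buf[0], text, text_len);
		r.info->text_len = text_len;
		prb_commit(&e);
	}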

This series deviates from the agreements [2] made at the meeting
during LPC2019 in Lisbon. The test results of the v1 series,
which implemented LOG_CONT as agreed upon, showed that the
effects on existing userspace tools using /dev/kmsg (journalctl,
dmesg) were not acceptable [3].

The main difference from v2 is the implementation of the new
descriptor finalization. For v3 the implementation closely
follows the example [4] from Petr Mladek.

Patch 6 introduces *four* new memory barrier pairs. Two of them
are insignificant additions (data_realloc:A/desc_read:D and
data_realloc:A/data_push_tail:B) because they are alternate path
memory barriers that exactly match the purpose and context of
the two existing memory barrier pairs they provide an alternate
path for. The other two new memory barrier pairs are significant
additions:

desc_reopen_last:A / _prb_commit:B - When reopening a descriptor,
    ensure the commit flag is removed before fully trusting the
    descriptor data.

_prb_commit:B / desc_reserve:D - When committing a descriptor,
    ensure the commit flag is set before checking the head ID
    to see if the finalize flag should be set.
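
Schematically, the significant pairing looks as follows (a
simplified sketch of the ordering only; the real code is in
patch 6, and desc_make_final() is the helper it introduces):

	/* committer, condensed from _prb_commit() */
	atomic_long_cmpxchg(&d->state_var, prev_state_val,
			    id | DESC_COMMIT_MASK); /* LMM(_prb_commit:B) */
	if (id != atomic_long_read(&desc_ring->head_id))
		desc_make_final(desc_ring, id); /* newer record exists */

	/* reserver, condensed from desc_reserve() */
	atomic_long_try_cmpxchg(&desc_ring->head_id, &head_id,
				id); /* LMM(desc_reserve:D) */
	desc_make_final(desc_ring, head_id); /* no-op unless committed */

Since both cmpxchg operations imply full barriers, at least one
side must observe the other: either the committer sees the new
head ID and finalizes its own descriptor, or the reserver sees
the commit flag and finalizes the previous descriptor.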

Patch 8 assumes the gdb script series [5] for the new printk
ringbuffer has been applied.

The test module used to test the ringbuffer is available
here [6].

The series is based on next-20200828.

The list of changes since v2:

printk_ringbuffer
=================

- prb_commit(): finalize self if no longer the head

- prb_reserve(): clear @info fields on success

- prb_reserve(): do not finalize the -1 placeholder descriptor

- desc_make_final(): renamed from desc_finalize()

- desc_make_final(): remove loop, change to single shot attempt

- prb_reserve_in_last(): renamed from prb_reserve_last()

- prb_reserve_in_last(): add new fail goto target

- prb_reserve_in_last(): fix logic for calculating
  @text_buf_size and add size check

- desc_reopen_last(): add extra caller ID check before reopening

- desc_reopen_last(): change cmpxchg() to full memory barrier

- get_desc_state(): remove unneeded @is_final argument

- documentation: update finalization, sample code, and memory
  barrier list

printk.c
========

- set @text_len and @dict_len as required by prb_reserve() change

John Ogness

[0] https://lkml.kernel.org/r/20200824103538.31446-1-john.ogness@linutronix.de
[1] https://lkml.kernel.org/r/20200812163908.GH12903@alley
[2] https://lkml.kernel.org/r/87k1acz5rx.fsf@linutronix.de
[3] https://lkml.kernel.org/r/20200811160551.GC12903@alley
[4] https://lkml.kernel.org/r/20200827151710.GB4928@alley
[5] https://lkml.kernel.org/r/CAHk-=wj_b6Bh=d-Wwh0xYqoQBhHkYeExhszkpxdRA6GjTvkRiQ@mail.gmail.com
[6] https://lkml.kernel.org/r/20200814212525.6118-1-john.ogness@linutronix.de
[7] https://github.com/Linutronix/prb-test.git

John Ogness (8):
  printk: ringbuffer: rename DESC_COMMITTED_MASK flag
  printk: ringbuffer: change representation of reusable
  printk: ringbuffer: relocate get_data()
  printk: ringbuffer: add BLK_DATALESS() macro
  printk: ringbuffer: clear initial reserved fields
  printk: ringbuffer: add finalization/extension support
  printk: reimplement log_cont using record extension
  scripts/gdb: support printk finalized records

 Documentation/admin-guide/kdump/gdbmacros.txt |  10 +-
 kernel/printk/printk.c                        | 105 +--
 kernel/printk/printk_ringbuffer.c             | 604 +++++++++++++++---
 kernel/printk/printk_ringbuffer.h             |  12 +-
 scripts/gdb/linux/dmesg.py                    |  10 +-
 5 files changed, 558 insertions(+), 183 deletions(-)

-- 
2.20.1


* [PATCH next v3 1/8] printk: ringbuffer: rename DESC_COMMITTED_MASK flag
  2020-08-31  1:10 ` John Ogness
@ 2020-08-31  1:10   ` John Ogness
  -1 siblings, 0 replies; 34+ messages in thread
From: John Ogness @ 2020-08-31  1:10 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner,
	Peter Zijlstra, Andrea Parri, Paul McKenney, kexec, linux-kernel

Upcoming ringbuffer support for continuation lines will allow
records with DESC_COMMITTED_MASK set to be reopened. As a result,
the flag will no longer describe the final committed state. Rename
it to DESC_COMMIT_MASK as a preparation step.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
---
 kernel/printk/printk_ringbuffer.c | 8 ++++----
 kernel/printk/printk_ringbuffer.h | 6 +++---
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
index 0659b50872b5..76248c82d557 100644
--- a/kernel/printk/printk_ringbuffer.c
+++ b/kernel/printk/printk_ringbuffer.c
@@ -361,7 +361,7 @@ static enum desc_state get_desc_state(unsigned long id,
 	if (state_val & DESC_REUSE_MASK)
 		return desc_reusable;
 
-	if (state_val & DESC_COMMITTED_MASK)
+	if (state_val & DESC_COMMIT_MASK)
 		return desc_committed;
 
 	return desc_reserved;
@@ -462,7 +462,7 @@ static enum desc_state desc_read(struct prb_desc_ring *desc_ring,
 static void desc_make_reusable(struct prb_desc_ring *desc_ring,
 			       unsigned long id)
 {
-	unsigned long val_committed = id | DESC_COMMITTED_MASK;
+	unsigned long val_committed = id | DESC_COMMIT_MASK;
 	unsigned long val_reusable = val_committed | DESC_REUSE_MASK;
 	struct prb_desc *desc = to_desc(desc_ring, id);
 	atomic_long_t *state_var = &desc->state_var;
@@ -899,7 +899,7 @@ static bool desc_reserve(struct printk_ringbuffer *rb, unsigned long *id_out)
 	 */
 	prev_state_val = atomic_long_read(&desc->state_var); /* LMM(desc_reserve:E) */
 	if (prev_state_val &&
-	    prev_state_val != (id_prev_wrap | DESC_COMMITTED_MASK | DESC_REUSE_MASK)) {
+	    prev_state_val != (id_prev_wrap | DESC_COMMIT_MASK | DESC_REUSE_MASK)) {
 		WARN_ON_ONCE(1);
 		return false;
 	}
@@ -1184,7 +1184,7 @@ void prb_commit(struct prb_reserved_entry *e)
 	 * this. This pairs with desc_read:B.
 	 */
 	if (!atomic_long_try_cmpxchg(&d->state_var, &prev_state_val,
-				     e->id | DESC_COMMITTED_MASK)) { /* LMM(prb_commit:B) */
+				     e->id | DESC_COMMIT_MASK)) { /* LMM(prb_commit:B) */
 		WARN_ON_ONCE(1);
 	}
 
diff --git a/kernel/printk/printk_ringbuffer.h b/kernel/printk/printk_ringbuffer.h
index e6302da041f9..dcda5e9b4676 100644
--- a/kernel/printk/printk_ringbuffer.h
+++ b/kernel/printk/printk_ringbuffer.h
@@ -115,9 +115,9 @@ struct prb_reserved_entry {
 #define _DATA_SIZE(sz_bits)		(1UL << (sz_bits))
 #define _DESCS_COUNT(ct_bits)		(1U << (ct_bits))
 #define DESC_SV_BITS			(sizeof(unsigned long) * 8)
-#define DESC_COMMITTED_MASK		(1UL << (DESC_SV_BITS - 1))
+#define DESC_COMMIT_MASK		(1UL << (DESC_SV_BITS - 1))
 #define DESC_REUSE_MASK			(1UL << (DESC_SV_BITS - 2))
-#define DESC_FLAGS_MASK			(DESC_COMMITTED_MASK | DESC_REUSE_MASK)
+#define DESC_FLAGS_MASK			(DESC_COMMIT_MASK | DESC_REUSE_MASK)
 #define DESC_ID_MASK			(~DESC_FLAGS_MASK)
 #define DESC_ID(sv)			((sv) & DESC_ID_MASK)
 #define FAILED_LPOS			0x1
@@ -213,7 +213,7 @@ struct prb_reserved_entry {
  */
 #define BLK0_LPOS(sz_bits)	(-(_DATA_SIZE(sz_bits)))
 #define DESC0_ID(ct_bits)	DESC_ID(-(_DESCS_COUNT(ct_bits) + 1))
-#define DESC0_SV(ct_bits)	(DESC_COMMITTED_MASK | DESC_REUSE_MASK | DESC0_ID(ct_bits))
+#define DESC0_SV(ct_bits)	(DESC_COMMIT_MASK | DESC_REUSE_MASK | DESC0_ID(ct_bits))
 
 /*
  * Define a ringbuffer with an external text data buffer. The same as
-- 
2.20.1


* [PATCH next v3 2/8] printk: ringbuffer: change representation of reusable
  2020-08-31  1:10 ` John Ogness
@ 2020-08-31  1:10   ` John Ogness
  -1 siblings, 0 replies; 34+ messages in thread
From: John Ogness @ 2020-08-31  1:10 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner,
	Peter Zijlstra, Andrea Parri, Paul McKenney, kexec, linux-kernel

The reusable queried state is represented by the combined flags:

    DESC_COMMIT_MASK | DESC_REUSE_MASK

There is no reason for the DESC_COMMIT_MASK to be part of that
representation. In particular, it will only add confusion when
more state flags become available.

Change the representation of the reusable queried state to just
the DESC_REUSE_MASK flag.
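
For reference, get_desc_state() (see patch 1) checks the flags in
this order, which is why the DESC_COMMIT_MASK bit is redundant in
the stored reusable value:

	if (state_val & DESC_REUSE_MASK)
		return desc_reusable;	/* DESC_REUSE_MASK alone suffices */
	if (state_val & DESC_COMMIT_MASK)
		return desc_committed;
	return desc_reserved;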

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
---
 kernel/printk/printk_ringbuffer.c | 4 ++--
 kernel/printk/printk_ringbuffer.h | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
index 76248c82d557..d339ff7647da 100644
--- a/kernel/printk/printk_ringbuffer.c
+++ b/kernel/printk/printk_ringbuffer.c
@@ -463,7 +463,7 @@ static void desc_make_reusable(struct prb_desc_ring *desc_ring,
 			       unsigned long id)
 {
 	unsigned long val_committed = id | DESC_COMMIT_MASK;
-	unsigned long val_reusable = val_committed | DESC_REUSE_MASK;
+	unsigned long val_reusable = id | DESC_REUSE_MASK;
 	struct prb_desc *desc = to_desc(desc_ring, id);
 	atomic_long_t *state_var = &desc->state_var;
 
@@ -899,7 +899,7 @@ static bool desc_reserve(struct printk_ringbuffer *rb, unsigned long *id_out)
 	 */
 	prev_state_val = atomic_long_read(&desc->state_var); /* LMM(desc_reserve:E) */
 	if (prev_state_val &&
-	    prev_state_val != (id_prev_wrap | DESC_COMMIT_MASK | DESC_REUSE_MASK)) {
+	    get_desc_state(id_prev_wrap, prev_state_val) != desc_reusable) {
 		WARN_ON_ONCE(1);
 		return false;
 	}
diff --git a/kernel/printk/printk_ringbuffer.h b/kernel/printk/printk_ringbuffer.h
index dcda5e9b4676..96ef997d7bd6 100644
--- a/kernel/printk/printk_ringbuffer.h
+++ b/kernel/printk/printk_ringbuffer.h
@@ -213,7 +213,7 @@ struct prb_reserved_entry {
  */
 #define BLK0_LPOS(sz_bits)	(-(_DATA_SIZE(sz_bits)))
 #define DESC0_ID(ct_bits)	DESC_ID(-(_DESCS_COUNT(ct_bits) + 1))
-#define DESC0_SV(ct_bits)	(DESC_COMMIT_MASK | DESC_REUSE_MASK | DESC0_ID(ct_bits))
+#define DESC0_SV(ct_bits)	(DESC_REUSE_MASK | DESC0_ID(ct_bits))
 
 /*
  * Define a ringbuffer with an external text data buffer. The same as
-- 
2.20.1


* [PATCH next v3 3/8] printk: ringbuffer: relocate get_data()
  2020-08-31  1:10 ` John Ogness
@ 2020-08-31  1:10   ` John Ogness
  -1 siblings, 0 replies; 34+ messages in thread
From: John Ogness @ 2020-08-31  1:10 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner,
	Peter Zijlstra, Andrea Parri, Paul McKenney, kexec, linux-kernel

Move the internal get_data() function as-is above prb_reserve() so
that a later change can make use of the static function.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
---
 kernel/printk/printk_ringbuffer.c | 116 +++++++++++++++---------------
 1 file changed, 58 insertions(+), 58 deletions(-)

diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
index d339ff7647da..86af38c2cf77 100644
--- a/kernel/printk/printk_ringbuffer.c
+++ b/kernel/printk/printk_ringbuffer.c
@@ -1038,6 +1038,64 @@ static unsigned int space_used(struct prb_data_ring *data_ring,
 		DATA_SIZE(data_ring) - DATA_INDEX(data_ring, blk_lpos->begin));
 }
 
+/*
+ * Given @blk_lpos, return a pointer to the writer data from the data block
+ * and calculate the size of the data part. A NULL pointer is returned if
+ * @blk_lpos specifies values that could never be legal.
+ *
+ * This function (used by readers) performs strict validation on the lpos
+ * values to possibly detect bugs in the writer code. A WARN_ON_ONCE() is
+ * triggered if an internal error is detected.
+ */
+static const char *get_data(struct prb_data_ring *data_ring,
+			    struct prb_data_blk_lpos *blk_lpos,
+			    unsigned int *data_size)
+{
+	struct prb_data_block *db;
+
+	/* Data-less data block description. */
+	if (LPOS_DATALESS(blk_lpos->begin) && LPOS_DATALESS(blk_lpos->next)) {
+		if (blk_lpos->begin == NO_LPOS && blk_lpos->next == NO_LPOS) {
+			*data_size = 0;
+			return "";
+		}
+		return NULL;
+	}
+
+	/* Regular data block: @begin less than @next and in same wrap. */
+	if (DATA_WRAPS(data_ring, blk_lpos->begin) == DATA_WRAPS(data_ring, blk_lpos->next) &&
+	    blk_lpos->begin < blk_lpos->next) {
+		db = to_block(data_ring, blk_lpos->begin);
+		*data_size = blk_lpos->next - blk_lpos->begin;
+
+	/* Wrapping data block: @begin is one wrap behind @next. */
+	} else if (DATA_WRAPS(data_ring, blk_lpos->begin + DATA_SIZE(data_ring)) ==
+		   DATA_WRAPS(data_ring, blk_lpos->next)) {
+		db = to_block(data_ring, 0);
+		*data_size = DATA_INDEX(data_ring, blk_lpos->next);
+
+	/* Illegal block description. */
+	} else {
+		WARN_ON_ONCE(1);
+		return NULL;
+	}
+
+	/* A valid data block will always be aligned to the ID size. */
+	if (WARN_ON_ONCE(blk_lpos->begin != ALIGN(blk_lpos->begin, sizeof(db->id))) ||
+	    WARN_ON_ONCE(blk_lpos->next != ALIGN(blk_lpos->next, sizeof(db->id)))) {
+		return NULL;
+	}
+
+	/* A valid data block will always have at least an ID. */
+	if (WARN_ON_ONCE(*data_size < sizeof(db->id)))
+		return NULL;
+
+	/* Subtract block ID space from size to reflect data size. */
+	*data_size -= sizeof(db->id);
+
+	return &db->data[0];
+}
+
 /**
  * prb_reserve() - Reserve space in the ringbuffer.
  *
@@ -1192,64 +1250,6 @@ void prb_commit(struct prb_reserved_entry *e)
 	local_irq_restore(e->irqflags);
 }
 
-/*
- * Given @blk_lpos, return a pointer to the writer data from the data block
- * and calculate the size of the data part. A NULL pointer is returned if
- * @blk_lpos specifies values that could never be legal.
- *
- * This function (used by readers) performs strict validation on the lpos
- * values to possibly detect bugs in the writer code. A WARN_ON_ONCE() is
- * triggered if an internal error is detected.
- */
-static const char *get_data(struct prb_data_ring *data_ring,
-			    struct prb_data_blk_lpos *blk_lpos,
-			    unsigned int *data_size)
-{
-	struct prb_data_block *db;
-
-	/* Data-less data block description. */
-	if (LPOS_DATALESS(blk_lpos->begin) && LPOS_DATALESS(blk_lpos->next)) {
-		if (blk_lpos->begin == NO_LPOS && blk_lpos->next == NO_LPOS) {
-			*data_size = 0;
-			return "";
-		}
-		return NULL;
-	}
-
-	/* Regular data block: @begin less than @next and in same wrap. */
-	if (DATA_WRAPS(data_ring, blk_lpos->begin) == DATA_WRAPS(data_ring, blk_lpos->next) &&
-	    blk_lpos->begin < blk_lpos->next) {
-		db = to_block(data_ring, blk_lpos->begin);
-		*data_size = blk_lpos->next - blk_lpos->begin;
-
-	/* Wrapping data block: @begin is one wrap behind @next. */
-	} else if (DATA_WRAPS(data_ring, blk_lpos->begin + DATA_SIZE(data_ring)) ==
-		   DATA_WRAPS(data_ring, blk_lpos->next)) {
-		db = to_block(data_ring, 0);
-		*data_size = DATA_INDEX(data_ring, blk_lpos->next);
-
-	/* Illegal block description. */
-	} else {
-		WARN_ON_ONCE(1);
-		return NULL;
-	}
-
-	/* A valid data block will always be aligned to the ID size. */
-	if (WARN_ON_ONCE(blk_lpos->begin != ALIGN(blk_lpos->begin, sizeof(db->id))) ||
-	    WARN_ON_ONCE(blk_lpos->next != ALIGN(blk_lpos->next, sizeof(db->id)))) {
-		return NULL;
-	}
-
-	/* A valid data block will always have at least an ID. */
-	if (WARN_ON_ONCE(*data_size < sizeof(db->id)))
-		return NULL;
-
-	/* Subtract block ID space from size to reflect data size. */
-	*data_size -= sizeof(db->id);
-
-	return &db->data[0];
-}
-
 /*
  * Count the number of lines in provided text. All text has at least 1 line
  * (even if @text_size is 0). Each '\n' processed is counted as an additional
-- 
2.20.1


* [PATCH next v3 4/8] printk: ringbuffer: add BLK_DATALESS() macro
  2020-08-31  1:10 ` John Ogness
@ 2020-08-31  1:10   ` John Ogness
  -1 siblings, 0 replies; 34+ messages in thread
From: John Ogness @ 2020-08-31  1:10 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner,
	Peter Zijlstra, Andrea Parri, Paul McKenney, kexec, linux-kernel

Rather than continually needing to explicitly check @begin and @next
to identify a dataless block, introduce and use a BLK_DATALESS()
macro.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
---
 kernel/printk/printk_ringbuffer.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
index 86af38c2cf77..d66718e74aae 100644
--- a/kernel/printk/printk_ringbuffer.c
+++ b/kernel/printk/printk_ringbuffer.c
@@ -266,6 +266,8 @@
 
 /* Determine if a logical position refers to a data-less block. */
 #define LPOS_DATALESS(lpos)		((lpos) & 1UL)
+#define BLK_DATALESS(blk)		(LPOS_DATALESS((blk)->begin) && \
+					 LPOS_DATALESS((blk)->next))
 
 /* Get the logical position at index 0 of the current wrap. */
 #define DATA_THIS_WRAP_START_LPOS(data_ring, lpos) \
@@ -1021,7 +1023,7 @@ static unsigned int space_used(struct prb_data_ring *data_ring,
 			       struct prb_data_blk_lpos *blk_lpos)
 {
 	/* Data-less blocks take no space. */
-	if (LPOS_DATALESS(blk_lpos->begin))
+	if (BLK_DATALESS(blk_lpos))
 		return 0;
 
 	if (DATA_WRAPS(data_ring, blk_lpos->begin) == DATA_WRAPS(data_ring, blk_lpos->next)) {
@@ -1054,7 +1056,7 @@ static const char *get_data(struct prb_data_ring *data_ring,
 	struct prb_data_block *db;
 
 	/* Data-less data block description. */
-	if (LPOS_DATALESS(blk_lpos->begin) && LPOS_DATALESS(blk_lpos->next)) {
+	if (BLK_DATALESS(blk_lpos)) {
 		if (blk_lpos->begin == NO_LPOS && blk_lpos->next == NO_LPOS) {
 			*data_size = 0;
 			return "";
-- 
2.20.1


* [PATCH next v3 5/8] printk: ringbuffer: clear initial reserved fields
  2020-08-31  1:10 ` John Ogness
@ 2020-08-31  1:10   ` John Ogness
  -1 siblings, 0 replies; 34+ messages in thread
From: John Ogness @ 2020-08-31  1:10 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner,
	Peter Zijlstra, Andrea Parri, Paul McKenney, kexec, linux-kernel

prb_reserve() currently sets some meta data fields and leaves
others uninitialized (or rather, still containing the values from
the previous wrap). Simplify the API by always clearing out all
the fields; only the sequence number is filled in. The caller is
now responsible for filling in the rest of the meta data fields,
in particular for correctly setting the text and dict lengths.
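
Schematically, the writer-side contract after this change (a
condensed sketch following the log_store() hunk below):

	if (prb_reserve(&e, &rb, &r)) {
		memcpy(&r.text_buf[0], text, text_len);
		r.info->text_len = text_len; /* caller must set this now */
		if (r.dict_buf) {
			memcpy(&r.dict_buf[0], dict, dict_len);
			r.info->dict_len = dict_len;
		}
		/* ... fill the remaining info fields ... */
		prb_commit(&e);
	}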

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 kernel/printk/printk.c            |  7 ++++++-
 kernel/printk/printk_ringbuffer.c | 29 +++++++++++++++++++----------
 2 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index ad8d1dfe5fbe..7e7d596c8878 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -520,8 +520,11 @@ static int log_store(u32 caller_id, int facility, int level,
 	memcpy(&r.text_buf[0], text, text_len);
 	if (trunc_msg_len)
 		memcpy(&r.text_buf[text_len], trunc_msg, trunc_msg_len);
-	if (r.dict_buf)
+	r.info->text_len = text_len + trunc_msg_len;
+	if (r.dict_buf) {
 		memcpy(&r.dict_buf[0], dict, dict_len);
+		r.info->dict_len = dict_len;
+	}
 	r.info->facility = facility;
 	r.info->level = level & 7;
 	r.info->flags = flags & 0x1f;
@@ -1078,9 +1081,11 @@ static unsigned int __init add_to_rb(struct printk_ringbuffer *rb,
 		return 0;
 
 	memcpy(&dest_r.text_buf[0], &r->text_buf[0], dest_r.text_buf_size);
+	dest_r.info->text_len = r->info->text_len;
 	if (dest_r.dict_buf) {
 		memcpy(&dest_r.dict_buf[0], &r->dict_buf[0],
 		       dest_r.dict_buf_size);
+		dest_r.info->dict_len = r->info->dict_len;
 	}
 	dest_r.info->facility = r->info->facility;
 	dest_r.info->level = r->info->level;
diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
index d66718e74aae..da54d4fadf96 100644
--- a/kernel/printk/printk_ringbuffer.c
+++ b/kernel/printk/printk_ringbuffer.c
@@ -146,10 +146,13 @@
  *
  *	if (prb_reserve(&e, &test_rb, &r)) {
  *		snprintf(r.text_buf, r.text_buf_size, "%s", textstr);
+ *		r.info->text_len = strlen(textstr);
  *
  *		// dictionary allocation may have failed
- *		if (r.dict_buf)
+ *		if (r.dict_buf) {
  *			snprintf(r.dict_buf, r.dict_buf_size, "%s", dictstr);
+ *			r.info->dict_len = strlen(dictstr);
+ *		}
  *
  *		r.info->ts_nsec = local_clock();
  *
@@ -1125,9 +1128,9 @@ static const char *get_data(struct prb_data_ring *data_ring,
  * @dict_buf_size is set to 0. Writers must check this before writing to
  * dictionary space.
  *
- * @info->text_len and @info->dict_len will already be set to @text_buf_size
- * and @dict_buf_size, respectively. If dictionary space reservation fails,
- * @info->dict_len is set to 0.
+ * Important: @info->text_len and @info->dict_len need to be set correctly by
+ *            the writer in order for data to be readable and/or extended.
+ *            Their values are initialized to 0.
  */
 bool prb_reserve(struct prb_reserved_entry *e, struct printk_ringbuffer *rb,
 		 struct printk_record *r)
@@ -1159,6 +1162,18 @@ bool prb_reserve(struct prb_reserved_entry *e, struct printk_ringbuffer *rb,
 
 	d = to_desc(desc_ring, id);
 
+	/*
+	 * Clear all @info fields except for @seq, which is used to determine
+	 * the new sequence number. The writer must fill in new values.
+	 */
+	d->info.ts_nsec = 0;
+	d->info.text_len = 0;
+	d->info.dict_len = 0;
+	d->info.facility = 0;
+	d->info.flags = 0;
+	d->info.level = 0;
+	d->info.caller_id = 0;
+
 	/*
 	 * Set the @e fields here so that prb_commit() can be used if
 	 * text data allocation fails.
@@ -1186,8 +1201,6 @@ bool prb_reserve(struct prb_reserved_entry *e, struct printk_ringbuffer *rb,
 				 &d->text_blk_lpos, id);
 	/* If text data allocation fails, a data-less record is committed. */
 	if (r->text_buf_size && !r->text_buf) {
-		d->info.text_len = 0;
-		d->info.dict_len = 0;
 		prb_commit(e);
 		/* prb_commit() re-enabled interrupts. */
 		goto fail;
@@ -1204,10 +1217,6 @@ bool prb_reserve(struct prb_reserved_entry *e, struct printk_ringbuffer *rb,
 
 	r->info = &d->info;
 
-	/* Set default values for the sizes. */
-	d->info.text_len = r->text_buf_size;
-	d->info.dict_len = r->dict_buf_size;
-
 	/* Record full text space used by record. */
 	e->text_space = space_used(&rb->text_data_ring, &d->text_blk_lpos);
 
-- 
2.20.1


* [PATCH next v3 6/8] printk: ringbuffer: add finalization/extension support
  2020-08-31  1:10 ` John Ogness
@ 2020-08-31  1:10   ` John Ogness
  -1 siblings, 0 replies; 34+ messages in thread
From: John Ogness @ 2020-08-31  1:10 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner,
	Peter Zijlstra, Andrea Parri, Paul McKenney, kexec, linux-kernel

Add support for extending the newest data block. For this, introduce
a new finalization state flag (DESC_FINAL_MASK) that denotes when a
descriptor may not be extended, i.e. is finalized.

The DESC_COMMIT_MASK is still set when the record data is in a
consistent state, i.e. the writer is no longer modifying the record.
However, the record remains in the desc_reserved queried state until
it is finalized, in which case it transitions to the desc_committed
queried state.

Until a record is finalized, a writer can reopen that record to
append new data. Reopening a record means clearing the
DESC_COMMIT_MASK flag.

A writer can explicitly finalize a record if there is no intention
of extending it. Also, records are automatically finalized when a
new record is reserved. This relieves writers of needing to
explicitly finalize while also making such records available to
readers sooner. (Readers can only traverse finalized records.)

Three new memory barrier pairs are introduced. Two of them are not
significant because they are alternate path memory barriers that
exactly correspond to existing memory barriers.

But the third (_prb_commit:B / desc_reserve:D) is new and
guarantees that descriptors will always be finalized: either a
descriptor setting DESC_COMMIT_MASK sees that there is a newer
descriptor and finalizes itself, or a new descriptor being
reserved sees that the previous descriptor has DESC_COMMIT_MASK
set and finalizes that descriptor.
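
To illustrate the first two finalization scenarios (a minimal
sketch using the writer API of this patch; record filling and
error handling omitted):

	/* 1) explicit: commit and finalize in one step */
	if (prb_reserve(&e, &rb, &r)) {
		/* ... fill record ... */
		prb_final_commit(&e); /* committed + finalized */
	}

	/* 2) implicit: leave the record extensible ... */
	if (prb_reserve(&e, &rb, &r)) {
		/* ... fill record ... */
		prb_commit(&e); /* committed, not yet finalized */
	}
	/* ... until the next reservation finalizes it */
	if (prb_reserve(&e2, &rb, &r2)) {
		/* ... */
		prb_final_commit(&e2);
	}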

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 kernel/printk/printk_ringbuffer.c | 467 ++++++++++++++++++++++++++++--
 kernel/printk/printk_ringbuffer.h |   8 +-
 2 files changed, 443 insertions(+), 32 deletions(-)

diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
index da54d4fadf96..0731d5e2dddd 100644
--- a/kernel/printk/printk_ringbuffer.c
+++ b/kernel/printk/printk_ringbuffer.c
@@ -49,14 +49,16 @@
  * Descriptors have three states:
  *
  *   reserved
- *     A writer is modifying the record.
+ *     A writer is modifying the record. Internally represented as either "0"
+ *     or "DESC_COMMIT_MASK".
  *
  *   committed
  *     The record and all its data are complete and available for reading.
+ *     Internally represented as "DESC_COMMIT_MASK | DESC_FINAL_MASK".
  *
  *   reusable
  *     The record exists, but its text and/or dictionary data may no longer
- *     be available.
+ *     be available. Internally represented as "DESC_REUSE_MASK".
  *
  * Querying the @state_var of a record requires providing the ID of the
  * descriptor to query. This can yield a possible fourth (pseudo) state:
@@ -79,6 +81,25 @@
  * committed or reusable queried state. This makes it possible that a valid
  * sequence number of the tail is always available.
  *
+ * Descriptor Finalization
+ * ~~~~~~~~~~~~~~~~~~~~~~~
+ * When a writer calls the commit function prb_commit(), the record may still
+ * continue to be in the reserved queried state. In order for that record to
+ * enter into the committed queried state, that record also must be finalized.
+ * A record can be finalized by three different scenarios:
+ *
+ *   1) A writer can finalize its record immediately by calling
+ *      prb_final_commit() instead of prb_commit().
+ *
+ *   2) When a new record is reserved and the previous record has been
+ *      committed via prb_commit(), that previous record is finalized.
+ *
+ *   3) When a record is committed via prb_commit() and a newer record
+ *      already exists, the record being committed is finalized.
+ *
+ * Until a record is finalized (represented by "DESC_FINAL_MASK"), a writer
+ * may "reopen" that record and extend it with more data.
+ *
  * Data Rings
  * ~~~~~~~~~~
  * The two data rings (text and dictionary) function identically. They exist
@@ -156,9 +177,38 @@
  *
  *		r.info->ts_nsec = local_clock();
  *
+ *		prb_final_commit(&e);
+ *	}
+ *
+ * Note that additional writer functions are available to extend a record
+ * after it has been committed but not yet finalized. This can be done as
+ * long as no new records have been reserved and the caller is the same.
+ *
+ * Sample writer code (record extending)::
+ *
+ *		// alternate rest of previous example
+ *		r.info->ts_nsec = local_clock();
+ *		r.info->text_len = strlen(textstr);
+ *		r.info->caller_id = printk_caller_id();
+ *
+ *		// commit the record (but do not finalize yet)
  *		prb_commit(&e);
  *	}
  *
+ *	...
+ *
+ *	// specify additional 5 bytes text space to extend
+ *	prb_rec_init_wr(&r, 5, 0);
+ *
+ *	if (prb_reserve_in_last(&e, &test_rb, &r, printk_caller_id())) {
+ *		snprintf(&r.text_buf[r.info->text_len],
+ *			 r.text_buf_size - r.info->text_len, "hello");
+ *
+ *		r.info->text_len += 5;
+ *
+ *		prb_final_commit(&e);
+ *	}
+ *
  * Sample reader code::
  *
  *	struct printk_info info;
@@ -236,14 +286,20 @@
  *   desc_reserve:F / desc_read:D
  *     set new descriptor id and reserved (state), then allow writer changes
  *
- *   data_alloc:A / desc_read:D
+ *   data_alloc:A (or data_realloc:A) / desc_read:D
  *     set old descriptor reusable (state), then modify new data block area
  *
- *   data_alloc:A / data_push_tail:B
+ *   data_alloc:A (or data_realloc:A) / data_push_tail:B
  *     push data tail (lpos), then modify new data block area
  *
- *   prb_commit:B / desc_read:B
- *     store writer changes, then set new descriptor committed (state)
+ *   _prb_commit:B / desc_read:B
+ *     store writer changes, then set new descriptor commit flag (state)
+ *
+ *   desc_reopen_last:A / _prb_commit:B
+ *     store removed descriptor commit flag (state), then read descriptor data
+ *
+ *   _prb_commit:B / desc_reserve:D
+ *     set new descriptor commit flag (state), then check descriptor head (id)
  *
  *   data_push_tail:D / data_push_tail:A
  *     set descriptor reusable (state), then push data tail (lpos)
@@ -366,8 +422,10 @@ static enum desc_state get_desc_state(unsigned long id,
 	if (state_val & DESC_REUSE_MASK)
 		return desc_reusable;
 
-	if (state_val & DESC_COMMIT_MASK)
+	if ((state_val & (DESC_COMMIT_MASK | DESC_FINAL_MASK)) ==
+	    (DESC_COMMIT_MASK | DESC_FINAL_MASK)) {
 		return desc_committed;
+	}
 
 	return desc_reserved;
 }
@@ -394,16 +452,16 @@ static enum desc_state desc_read(struct prb_desc_ring *desc_ring,
 	/*
 	 * Guarantee the state is loaded before copying the descriptor
 	 * content. This avoids copying obsolete descriptor content that might
-	 * not apply to the descriptor state. This pairs with prb_commit:B.
+	 * not apply to the descriptor state. This pairs with _prb_commit:B.
 	 *
 	 * Memory barrier involvement:
 	 *
-	 * If desc_read:A reads from prb_commit:B, then desc_read:C reads
-	 * from prb_commit:A.
+	 * If desc_read:A reads from _prb_commit:B, then desc_read:C reads
+	 * from _prb_commit:A.
 	 *
 	 * Relies on:
 	 *
-	 * WMB from prb_commit:A to prb_commit:B
+	 * WMB from _prb_commit:A to _prb_commit:B
 	 *    matching
 	 * RMB from desc_read:A to desc_read:C
 	 */
@@ -434,7 +492,8 @@ static enum desc_state desc_read(struct prb_desc_ring *desc_ring,
 	 *
 	 * 2. Guarantee the record data is loaded before re-checking the
 	 *    state. This avoids reading an obsolete descriptor state that may
-	 *    not apply to the copied data. This pairs with data_alloc:A.
+	 *    not apply to the copied data. This pairs with data_alloc:A and
+	 *    data_realloc:A.
 	 *
 	 *    Memory barrier involvement:
 	 *
@@ -467,7 +526,7 @@ static enum desc_state desc_read(struct prb_desc_ring *desc_ring,
 static void desc_make_reusable(struct prb_desc_ring *desc_ring,
 			       unsigned long id)
 {
-	unsigned long val_committed = id | DESC_COMMIT_MASK;
+	unsigned long val_committed = id | DESC_COMMIT_MASK | DESC_FINAL_MASK;
 	unsigned long val_reusable = id | DESC_REUSE_MASK;
 	struct prb_desc *desc = to_desc(desc_ring, id);
 	atomic_long_t *state_var = &desc->state_var;
@@ -613,7 +672,7 @@ static bool data_push_tail(struct printk_ringbuffer *rb,
 			 *    data_make_reusable() may be due to a newly
 			 *    recycled data area causing the tail lpos to
 			 *    have been previously pushed. This pairs with
-			 *    data_alloc:A.
+			 *    data_alloc:A and data_realloc:A.
 			 *
 			 *    Memory barrier involvement:
 			 *
@@ -892,6 +951,10 @@ static bool desc_reserve(struct printk_ringbuffer *rb, unsigned long *id_out)
 		 *    another CPU may have pushed the tail ID. This pairs
 		 *    with desc_push_tail:C and this also pairs with
 		 *    prb_first_seq:C.
+		 *
+		 * 5. Guarantee the head ID is stored before trying to
+		 *    finalize the previous descriptor. This pairs with
+		 *    _prb_commit:B.
 		 */
 	} while (!atomic_long_try_cmpxchg(&desc_ring->head_id, &head_id,
 					  id)); /* LMM(desc_reserve:D) */
@@ -1021,6 +1084,83 @@ static char *data_alloc(struct printk_ringbuffer *rb,
 	return &blk->data[0];
 }
 
+/*
+ * Try to resize an existing data block associated with the descriptor
+ * specified by @id. If the resized data block becomes wrapped, the old
+ * data is copied to the new data block.
+ *
+ * Fail if this is not the last allocated data block or if there is not
+ * enough space or it is not possible to make enough space.
+ *
+ * Return a pointer to the beginning of the entire data buffer or NULL on
+ * failure.
+ */
+static char *data_realloc(struct printk_ringbuffer *rb,
+			  struct prb_data_ring *data_ring, unsigned int size,
+			  struct prb_data_blk_lpos *blk_lpos, unsigned long id)
+{
+	struct prb_data_block *blk;
+	unsigned long head_lpos;
+	unsigned long next_lpos;
+	bool wrapped;
+
+	/* Reallocation only works if @blk_lpos is the newest data block. */
+	head_lpos = atomic_long_read(&data_ring->head_lpos);
+	if (head_lpos != blk_lpos->next)
+		return NULL;
+
+	/* Keep track if @blk_lpos was a wrapping data block. */
+	wrapped = (DATA_WRAPS(data_ring, blk_lpos->begin) != DATA_WRAPS(data_ring, blk_lpos->next));
+
+	size = to_blk_size(size);
+
+	next_lpos = get_next_lpos(data_ring, blk_lpos->begin, size);
+
+	/* If the data block does not increase, there is nothing to do. */
+	if (next_lpos == head_lpos) {
+		blk = to_block(data_ring, blk_lpos->begin);
+		return &blk->data[0];
+	}
+
+	if (!data_push_tail(rb, data_ring, next_lpos - DATA_SIZE(data_ring)))
+		return NULL;
+
+	/* The memory barrier involvement is the same as data_alloc:A. */
+	if (!atomic_long_try_cmpxchg(&data_ring->head_lpos, &head_lpos,
+				     next_lpos)) { /* LMM(data_realloc:A) */
+		return NULL;
+	}
+
+	blk = to_block(data_ring, blk_lpos->begin);
+
+	if (DATA_WRAPS(data_ring, blk_lpos->begin) != DATA_WRAPS(data_ring, next_lpos)) {
+		struct prb_data_block *old_blk = blk;
+
+		/* Wrapping data blocks store their data at the beginning. */
+		blk = to_block(data_ring, 0);
+
+		/*
+		 * Store the ID on the wrapped block for consistency.
+		 * The printk_ringbuffer does not actually use it.
+		 */
+		blk->id = id;
+
+		if (!wrapped) {
+			/*
+			 * Since the allocated space is now in the newly
+			 * created wrapping data block, copy the content
+			 * from the old data block.
+			 */
+			memcpy(&blk->data[0], &old_blk->data[0],
+			       (blk_lpos->next - blk_lpos->begin) - sizeof(blk->id));
+		}
+	}
+
+	blk_lpos->next = next_lpos;
+
+	return &blk->data[0];
+}
+
 /* Return the number of bytes used by a data block. */
 static unsigned int space_used(struct prb_data_ring *data_ring,
 			       struct prb_data_blk_lpos *blk_lpos)
@@ -1101,6 +1241,203 @@ static const char *get_data(struct prb_data_ring *data_ring,
 	return &db->data[0];
 }
 
+/*
+ * Attempt to remove the commit flag so that the record can be modified by a
+ * writer again. This is only possible if the descriptor is not yet finalized.
+ *
+ * Note that on success, the queried state does not change. A non-finalized
+ * record (even with the commit flag set) is in the reserved queried state.
+ */
+static struct prb_desc *desc_reopen_last(struct prb_desc_ring *desc_ring,
+					 u32 caller_id, unsigned long *id_out)
+{
+	unsigned long prev_state_val;
+	enum desc_state d_state;
+	struct prb_desc desc;
+	struct prb_desc *d;
+	unsigned long id;
+
+	id = atomic_long_read(&desc_ring->head_id);
+
+	/*
+	 * To minimize unnecessary reopening of a descriptor, first check that
+	 * the descriptor is in the correct state and has a matching caller ID.
+	 */
+	d_state = desc_read(desc_ring, id, &desc);
+	if (d_state != desc_reserved ||
+	    !(atomic_long_read(&desc.state_var) & DESC_COMMIT_MASK) ||
+	    desc.info.caller_id != caller_id) {
+		return NULL;
+	}
+
+	d = to_desc(desc_ring, id);
+
+	prev_state_val = id | DESC_COMMIT_MASK;
+
+	/*
+	 * Guarantee the commit flag is removed from the state before
+	 * reading any record data. A full memory barrier is needed
+	 * because @state_var is modified for the read. This pairs with
+	 * _prb_commit:B.
+	 *
+	 * Memory barrier involvement:
+	 *
+	 * If desc_reopen_last:A reads from _prb_commit:B, then
+	 * prb_reserve_in_last:A reads from _prb_commit:A.
+	 *
+	 * Relies on:
+	 *
+	 * WMB from _prb_commit:A to _prb_commit:B
+	 *    matching
+	 * MB from desc_reopen_last:A to prb_reserve_in_last:A
+	 */
+	if (!atomic_long_try_cmpxchg(&d->state_var, &prev_state_val,
+				     id | 0)) { /* LMM(desc_reopen_last:A) */
+		return NULL;
+	}
+
+	*id_out = id;
+	return d;
+}
+
+/**
+ * prb_reserve_in_last() - Re-reserve and extend the space in the ringbuffer
+ *                         used by the newest record.
+ *
+ * @e:         The entry structure to setup.
+ * @rb:        The ringbuffer to re-reserve and extend data in.
+ * @r:         The record structure to allocate buffers for.
+ * @caller_id: The caller ID of the caller (reserving writer).
+ *
+ * This is the public function available to writers to re-reserve and extend
+ * data.
+ *
+ * The writer specifies the text size to extend (not the new total size) by
+ * setting the @text_buf_size field of @r. Dictionaries cannot be extended so
+ * @dict_buf_size of @r should be set to 0. To ensure proper initialization of
+ * @r, prb_rec_init_wr() should be used.
+ *
+ * This function will fail if @caller_id does not match the caller ID of the
+ * newest record. In that case the caller must reserve new data using
+ * prb_reserve().
+ *
+ * Context: Any context. Disables local interrupts on success.
+ * Return: true if text data could be extended, otherwise false.
+ *
+ * On success:
+ *
+ *   - @r->text_buf points to the beginning of the entire text buffer.
+ *
+ *   - @r->text_buf_size is set to the new total size of the buffer.
+ *
+ *   - @r->dict_buf and @r->dict_buf_size are cleared because extending
+ *     the dict buffer is not supported.
+ *
+ *   - @r->info is not touched so that @r->info->text_len could be used
+ *     to append the text.
+ *
+ *   - prb_record_text_space() can be used on @e to query the new
+ *     actually used space.
+ *
+ * Important: All @r->info fields will already be set with the current values
+ *            for the record. I.e. @r->info->text_len will be less than
+ *            @text_buf_size and @r->info->dict_len may be set, even though
+ *            @dict_buf_size is 0. Writers can use @r->info->text_len to know
+ *            where concatenation begins and writers should update
+ *            @r->info->text_len after concatenating.
+ */
+bool prb_reserve_in_last(struct prb_reserved_entry *e, struct printk_ringbuffer *rb,
+			 struct printk_record *r, u32 caller_id)
+{
+	unsigned int data_size;
+	struct prb_desc *d;
+	unsigned long id;
+
+	local_irq_save(e->irqflags);
+
+	/* Transition the newest descriptor back to the reserved state. */
+	d = desc_reopen_last(&rb->desc_ring, caller_id, &id);
+	if (!d) {
+		local_irq_restore(e->irqflags);
+		goto fail_reopen;
+	}
+
+	/* Now the writer has exclusive access: LMM(prb_reserve_in_last:A) */
+
+	/*
+	 * Set the @e fields here so that prb_commit() can be used if
+	 * anything fails from now on.
+	 */
+	e->rb = rb;
+	e->id = id;
+
+	/*
+	 * desc_reopen_last() checked the caller_id, but there was no
+	 * exclusive access at that point. The descriptor may have
+	 * changed since then.
+	 */
+	if (caller_id != d->info.caller_id)
+		goto fail;
+
+	if (BLK_DATALESS(&d->text_blk_lpos)) {
+		r->text_buf = data_alloc(rb, &rb->text_data_ring, r->text_buf_size,
+					 &d->text_blk_lpos, id);
+	} else {
+		if (!get_data(&rb->text_data_ring, &d->text_blk_lpos, &data_size))
+			goto fail;
+
+		/*
+		 * Increase the buffer size to include the original size. If
+		 * the meta data (@text_len) is not sane, use the full data
+		 * block size.
+		 */
+		if (WARN_ON_ONCE(d->info.text_len > data_size)) {
+			pr_warn_once("wrong data size (%u, expecting >=%hu) for data\n",
+				     data_size, d->info.text_len);
+			d->info.text_len = data_size;
+		}
+		r->text_buf_size += d->info.text_len;
+
+		if (!data_check_size(&rb->text_data_ring, r->text_buf_size))
+			goto fail;
+
+		r->text_buf = data_realloc(rb, &rb->text_data_ring, r->text_buf_size,
+					   &d->text_blk_lpos, id);
+	}
+	if (r->text_buf_size && !r->text_buf)
+		goto fail;
+
+	/* Although dictionary data may be in use, it cannot be extended. */
+	r->dict_buf = NULL;
+	r->dict_buf_size = 0;
+
+	r->info = &d->info;
+
+	e->text_space = space_used(&rb->text_data_ring, &d->text_blk_lpos);
+
+	return true;
+fail:
+	prb_commit(e);
+	/* prb_commit() re-enabled interrupts. */
+fail_reopen:
+	/* Make it clear to the caller that the re-reserve failed. */
+	memset(r, 0, sizeof(*r));
+	return false;
+}
+
+/*
+ * Attempt to finalize a specified descriptor. If this fails, the descriptor
+ * is either already final or it will finalize itself when the writer commits.
+ */
+static void desc_make_final(struct prb_desc_ring *desc_ring, unsigned long id)
+{
+	unsigned long prev_state_val = id | DESC_COMMIT_MASK;
+	struct prb_desc *d = to_desc(desc_ring, id);
+
+	atomic_long_cmpxchg_relaxed(&d->state_var, prev_state_val,
+			prev_state_val | DESC_FINAL_MASK); /* LMM(desc_make_final:A) */
+}
+
 /**
  * prb_reserve() - Reserve space in the ringbuffer.
  *
@@ -1197,6 +1534,15 @@ bool prb_reserve(struct prb_reserved_entry *e, struct printk_ringbuffer *rb,
 	else
 		d->info.seq += DESCS_COUNT(desc_ring);
 
+	/*
+	 * New data is about to be reserved. Once that happens, previous
+	 * descriptors are no longer able to be extended. Finalize the
+	 * previous descriptor now so that it can be made available to
+	 * readers. (For seq==0 there is no previous descriptor.)
+	 */
+	if (d->info.seq > 0)
+		desc_make_final(desc_ring, DESC_ID(id - 1));
+
 	r->text_buf = data_alloc(rb, &rb->text_data_ring, r->text_buf_size,
 				 &d->text_blk_lpos, id);
 	/* If text data allocation fails, a data-less record is committed. */
@@ -1227,33 +1573,41 @@ bool prb_reserve(struct prb_reserved_entry *e, struct printk_ringbuffer *rb,
 	return false;
 }
 
-/**
- * prb_commit() - Commit (previously reserved) data to the ringbuffer.
- *
- * @e: The entry containing the reserved data information.
- *
- * This is the public function available to writers to commit data.
- *
- * Context: Any context. Enables local interrupts.
- */
-void prb_commit(struct prb_reserved_entry *e)
+/* Commit the data (possibly finalizing it) and restore interrupts. */
+static void _prb_commit(struct prb_reserved_entry *e, unsigned long final_mask)
 {
 	struct prb_desc_ring *desc_ring = &e->rb->desc_ring;
 	struct prb_desc *d = to_desc(desc_ring, e->id);
 	unsigned long prev_state_val = e->id | 0;
 
-	/* Now the writer has finished all writing: LMM(prb_commit:A) */
+	/* Now the writer has finished all writing: LMM(_prb_commit:A) */
 
 	/*
 	 * Set the descriptor as committed. See "ABA Issues" about why
 	 * cmpxchg() instead of set() is used.
 	 *
-	 * Guarantee all record data is stored before the descriptor state
-	 * is stored as committed. A write memory barrier is sufficient for
-	 * this. This pairs with desc_read:B.
+	 * 1. Guarantee all record data is stored before the descriptor state
+	 *    is stored as committed. A write memory barrier is sufficient
+	 *    for this. This pairs with desc_read:B and desc_reopen_last:A.
+	 *
+	 * 2. Guarantee the commit flag is stored before re-checking the
+	 *    head ID in order to possibly finalize this descriptor. This
+	 *    pairs with desc_reserve:D.
+	 *
+	 *    Memory barrier involvement:
+	 *
+	 *    If prb_commit:A reads from desc_reserve:D, then
+	 *    desc_make_final:A reads from _prb_commit:B.
+	 *
+	 *    Relies on:
+	 *
+	 *    MB from _prb_commit:B to prb_commit:A
+	 *       matching
+	 *    MB from desc_reserve:D to desc_make_final:A
 	 */
 	if (!atomic_long_try_cmpxchg(&d->state_var, &prev_state_val,
-				     e->id | DESC_COMMIT_MASK)) { /* LMM(prb_commit:B) */
+				     e->id | DESC_COMMIT_MASK |
+					     final_mask)) { /* LMM(_prb_commit:B) */
 		WARN_ON_ONCE(1);
 	}
 
@@ -1261,6 +1615,59 @@ void prb_commit(struct prb_reserved_entry *e)
 	local_irq_restore(e->irqflags);
 }
 
+/**
+ * prb_commit() - Commit (previously reserved) data to the ringbuffer.
+ *
+ * @e: The entry containing the reserved data information.
+ *
+ * This is the public function available to writers to commit data.
+ *
+ * Note that the data is not available to readers until it is finalized.
+ * Finalizing happens automatically when space for the next record is
+ * reserved.
+ *
+ * See prb_final_commit() for a version of this function that finalizes
+ * immediately.
+ *
+ * Context: Any context. Enables local interrupts.
+ */
+void prb_commit(struct prb_reserved_entry *e)
+{
+	struct prb_desc_ring *desc_ring = &e->rb->desc_ring;
+	unsigned long head_id;
+
+	_prb_commit(e, 0);
+
+	/*
+	 * If this descriptor is no longer the head (i.e. a new record has
+	 * been allocated), extending the data for this record is no longer
+	 * allowed and therefore it must be finalized.
+	 */
+	head_id = atomic_long_read(&desc_ring->head_id); /* LMM(prb_commit:A) */
+	if (head_id != e->id)
+		desc_make_final(desc_ring, e->id);
+}
+
+/**
+ * prb_final_commit() - Commit and finalize (previously reserved) data to
+ *                      the ringbuffer.
+ *
+ * @e: The entry containing the reserved data information.
+ *
+ * This is the public function available to writers to commit+finalize data.
+ *
+ * By finalizing, the data is made immediately available to readers.
+ *
+ * This function should only be used if there is no intention of extending
+ * this data using prb_reserve_in_last().
+ *
+ * Context: Any context. Enables local interrupts.
+ */
+void prb_final_commit(struct prb_reserved_entry *e)
+{
+	_prb_commit(e, DESC_FINAL_MASK);
+}
+
 /*
  * Count the number of lines in provided text. All text has at least 1 line
  * (even if @text_size is 0). Each '\n' processed is counted as an additional
@@ -1312,7 +1719,7 @@ static bool copy_data(struct prb_data_ring *data_ring,
 	 * because of the trailing alignment padding.
 	 */
 	if (WARN_ON_ONCE(data_size < (unsigned int)len)) {
-		pr_warn_once("wrong data size (%u, expecting %hu) for data: %.*s\n",
+		pr_warn_once("wrong data size (%u, expecting >=%hu) for data: %.*s\n",
 			     data_size, len, data_size, data);
 		return false;
 	}
diff --git a/kernel/printk/printk_ringbuffer.h b/kernel/printk/printk_ringbuffer.h
index 96ef997d7bd6..caa6fb40dafb 100644
--- a/kernel/printk/printk_ringbuffer.h
+++ b/kernel/printk/printk_ringbuffer.h
@@ -116,8 +116,9 @@ struct prb_reserved_entry {
 #define _DESCS_COUNT(ct_bits)		(1U << (ct_bits))
 #define DESC_SV_BITS			(sizeof(unsigned long) * 8)
 #define DESC_COMMIT_MASK		(1UL << (DESC_SV_BITS - 1))
-#define DESC_REUSE_MASK			(1UL << (DESC_SV_BITS - 2))
-#define DESC_FLAGS_MASK			(DESC_COMMIT_MASK | DESC_REUSE_MASK)
+#define DESC_FINAL_MASK			(1UL << (DESC_SV_BITS - 2))
+#define DESC_REUSE_MASK			(1UL << (DESC_SV_BITS - 3))
+#define DESC_FLAGS_MASK			(DESC_COMMIT_MASK | DESC_FINAL_MASK | DESC_REUSE_MASK)
 #define DESC_ID_MASK			(~DESC_FLAGS_MASK)
 #define DESC_ID(sv)			((sv) & DESC_ID_MASK)
 #define FAILED_LPOS			0x1
@@ -318,7 +319,10 @@ static inline void prb_rec_init_wr(struct printk_record *r,
 
 bool prb_reserve(struct prb_reserved_entry *e, struct printk_ringbuffer *rb,
 		 struct printk_record *r);
+bool prb_reserve_in_last(struct prb_reserved_entry *e, struct printk_ringbuffer *rb,
+			 struct printk_record *r, u32 caller_id);
 void prb_commit(struct prb_reserved_entry *e);
+void prb_final_commit(struct prb_reserved_entry *e);
 
 void prb_init(struct printk_ringbuffer *rb,
 	      char *text_buf, unsigned int text_buf_size,
-- 
2.20.1
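
For reference, a minimal sketch of the extension flow documented in the
prb_reserve_in_last() kernel-doc above, adapted from the "Sample writer
code (record extending)" in the ringbuffer documentation (the @test_rb
ringbuffer and the 5-byte "hello" payload are illustrative only):

	struct prb_reserved_entry e;
	struct printk_record r;

	/* specify additional 5 bytes text space to extend */
	prb_rec_init_wr(&r, 5, 0);

	if (prb_reserve_in_last(&e, &test_rb, &r, printk_caller_id())) {
		/* @r.info->text_len marks where concatenation begins */
		snprintf(&r.text_buf[r.info->text_len],
			 r.text_buf_size - r.info->text_len, "hello");

		r.info->text_len += 5;

		prb_final_commit(&e);
	}

If the re-reserve fails (caller ID mismatch or the record is already
finalized), the writer must fall back to reserving a new record with
prb_reserve().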


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH next v3 7/8] printk: reimplement log_cont using record extension
  2020-08-31  1:10 ` John Ogness
@ 2020-08-31  1:10   ` John Ogness
  -1 siblings, 0 replies; 34+ messages in thread
From: John Ogness @ 2020-08-31  1:10 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner,
	Peter Zijlstra, Andrea Parri, Paul McKenney, kexec, linux-kernel

Use the record extending feature of the ringbuffer to implement
continuous messages. This preserves the existing continuous message
behavior.
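
In sketch form (using the names from the log_output() hunk below;
error handling trimmed), a continuation write now appends directly to
the newest record and only finalizes it once the line is complete:

	prb_rec_init_wr(&r, text_len, 0);
	if (prb_reserve_in_last(&e, prb, &r, caller_id)) {
		memcpy(&r.text_buf[r.info->text_len], text, text_len);
		r.info->text_len += text_len;
		if (lflags & LOG_NEWLINE) {
			r.info->flags |= LOG_NEWLINE;
			prb_final_commit(&e);	/* line complete: publish */
		} else {
			prb_commit(&e);		/* leave open for extension */
		}
	}

If the re-reserve fails, the message falls through to log_store() and
is stored as a new record.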

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 kernel/printk/printk.c | 98 +++++++++---------------------------------
 1 file changed, 20 insertions(+), 78 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 7e7d596c8878..d0b2bea1fd81 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -535,7 +535,10 @@ static int log_store(u32 caller_id, int facility, int level,
 	r.info->caller_id = caller_id;
 
 	/* insert message */
-	prb_commit(&e);
+	if ((flags & LOG_CONT) || !(flags & LOG_NEWLINE))
+		prb_commit(&e);
+	else
+		prb_final_commit(&e);
 
 	return (text_len + trunc_msg_len);
 }
@@ -1093,7 +1096,7 @@ static unsigned int __init add_to_rb(struct printk_ringbuffer *rb,
 	dest_r.info->ts_nsec = r->info->ts_nsec;
 	dest_r.info->caller_id = r->info->caller_id;
 
-	prb_commit(&e);
+	prb_final_commit(&e);
 
 	return prb_record_text_space(&e);
 }
@@ -1893,87 +1896,26 @@ static inline u32 printk_caller_id(void)
 		0x80000000 + raw_smp_processor_id();
 }
 
-/*
- * Continuation lines are buffered, and not committed to the record buffer
- * until the line is complete, or a race forces it. The line fragments
- * though, are printed immediately to the consoles to ensure everything has
- * reached the console in case of a kernel crash.
- */
-static struct cont {
-	char buf[LOG_LINE_MAX];
-	size_t len;			/* length == 0 means unused buffer */
-	u32 caller_id;			/* printk_caller_id() of first print */
-	u64 ts_nsec;			/* time of first print */
-	u8 level;			/* log level of first message */
-	u8 facility;			/* log facility of first message */
-	enum log_flags flags;		/* prefix, newline flags */
-} cont;
-
-static void cont_flush(void)
-{
-	if (cont.len == 0)
-		return;
-
-	log_store(cont.caller_id, cont.facility, cont.level, cont.flags,
-		  cont.ts_nsec, NULL, 0, cont.buf, cont.len);
-	cont.len = 0;
-}
-
-static bool cont_add(u32 caller_id, int facility, int level,
-		     enum log_flags flags, const char *text, size_t len)
-{
-	/* If the line gets too long, split it up in separate records. */
-	if (cont.len + len > sizeof(cont.buf)) {
-		cont_flush();
-		return false;
-	}
-
-	if (!cont.len) {
-		cont.facility = facility;
-		cont.level = level;
-		cont.caller_id = caller_id;
-		cont.ts_nsec = local_clock();
-		cont.flags = flags;
-	}
-
-	memcpy(cont.buf + cont.len, text, len);
-	cont.len += len;
-
-	// The original flags come from the first line,
-	// but later continuations can add a newline.
-	if (flags & LOG_NEWLINE) {
-		cont.flags |= LOG_NEWLINE;
-		cont_flush();
-	}
-
-	return true;
-}
-
 static size_t log_output(int facility, int level, enum log_flags lflags, const char *dict, size_t dictlen, char *text, size_t text_len)
 {
 	const u32 caller_id = printk_caller_id();
 
-	/*
-	 * If an earlier line was buffered, and we're a continuation
-	 * write from the same context, try to add it to the buffer.
-	 */
-	if (cont.len) {
-		if (cont.caller_id == caller_id && (lflags & LOG_CONT)) {
-			if (cont_add(caller_id, facility, level, lflags, text, text_len))
-				return text_len;
-		}
-		/* Otherwise, make sure it's flushed */
-		cont_flush();
-	}
-
-	/* Skip empty continuation lines that couldn't be added - they just flush */
-	if (!text_len && (lflags & LOG_CONT))
-		return 0;
-
-	/* If it doesn't end in a newline, try to buffer the current line */
-	if (!(lflags & LOG_NEWLINE)) {
-		if (cont_add(caller_id, facility, level, lflags, text, text_len))
+	if (lflags & LOG_CONT) {
+		struct prb_reserved_entry e;
+		struct printk_record r;
+
+		prb_rec_init_wr(&r, text_len, 0);
+		if (prb_reserve_in_last(&e, prb, &r, caller_id)) {
+			memcpy(&r.text_buf[r.info->text_len], text, text_len);
+			r.info->text_len += text_len;
+			if (lflags & LOG_NEWLINE) {
+				r.info->flags |= LOG_NEWLINE;
+				prb_final_commit(&e);
+			} else {
+				prb_commit(&e);
+			}
 			return text_len;
+		}
 	}
 
 	/* Store it in the record log */
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH next v3 8/8] scripts/gdb: support printk finalized records
  2020-08-31  1:10 ` John Ogness
@ 2020-08-31  1:10   ` John Ogness
  -1 siblings, 0 replies; 34+ messages in thread
From: John Ogness @ 2020-08-31  1:10 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner,
	Peter Zijlstra, Andrea Parri, Paul McKenney, kexec, linux-kernel

With commit ("printk: ringbuffer: add finalization/extension support")
a new state bit for finalized records was added. This not only changed
the bit representation of committed records, but also reduced the
number of bits available for record IDs.

Update the gdb scripts to correctly interpret the state variable.
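
For illustration, the bit layout the scripts now decode (masks as
defined in kernel/printk/printk_ringbuffer.h; the helper below is a
hypothetical sketch, not part of this patch):

	#define DESC_SV_BITS		(sizeof(unsigned long) * 8)
	#define DESC_COMMIT_MASK	(1UL << (DESC_SV_BITS - 1))
	#define DESC_FINAL_MASK		(1UL << (DESC_SV_BITS - 2))
	#define DESC_REUSE_MASK		(1UL << (DESC_SV_BITS - 3))

	/*
	 * A record is dumped once its commit flag is set, even if the
	 * final flag is not (commit+!final records are displayed).
	 */
	static bool record_dumpable(unsigned long state_val)
	{
		return (state_val & DESC_COMMIT_MASK) == DESC_COMMIT_MASK;
	}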

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 Documentation/admin-guide/kdump/gdbmacros.txt | 10 +++++++---
 scripts/gdb/linux/dmesg.py                    | 10 ++++++----
 2 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/Documentation/admin-guide/kdump/gdbmacros.txt b/Documentation/admin-guide/kdump/gdbmacros.txt
index 7adece30237e..bcb78368b381 100644
--- a/Documentation/admin-guide/kdump/gdbmacros.txt
+++ b/Documentation/admin-guide/kdump/gdbmacros.txt
@@ -295,8 +295,11 @@ document dump_record
 end
 
 define dmesg
-	set var $desc_committed = 1UL << ((sizeof(long) * 8) - 1)
-	set var $flags_mask = 3UL << ((sizeof(long) * 8) - 2)
+	# definitions from kernel/printk/printk_ringbuffer.h
+	set var $desc_commit = 1UL << ((sizeof(long) * 8) - 1)
+	set var $desc_final = 1UL << ((sizeof(long) * 8) - 2)
+	set var $desc_reuse = 1UL << ((sizeof(long) * 8) - 3)
+	set var $flags_mask = $desc_commit | $desc_final | $desc_reuse
 	set var $id_mask = ~$flags_mask
 
 	set var $desc_count = 1U << prb->desc_ring.count_bits
@@ -309,7 +312,8 @@ define dmesg
 		set var $desc = &prb->desc_ring.descs[$id % $desc_count]
 
 		# skip non-committed record
-		if (($desc->state_var.counter & $flags_mask) == $desc_committed)
+		# (note that commit+!final records will be displayed)
+		if (($desc->state_var.counter & $desc_commit) == $desc_commit)
 			dump_record $desc $prev_flags
 			set var $prev_flags = $desc->info.flags
 		end
diff --git a/scripts/gdb/linux/dmesg.py b/scripts/gdb/linux/dmesg.py
index 6c6022012ea8..367523c5c270 100644
--- a/scripts/gdb/linux/dmesg.py
+++ b/scripts/gdb/linux/dmesg.py
@@ -79,9 +79,10 @@ class LxDmesg(gdb.Command):
 
         # definitions from kernel/printk/printk_ringbuffer.h
         desc_sv_bits = utils.get_long_type().sizeof * 8
-        desc_committed_mask = 1 << (desc_sv_bits - 1)
-        desc_reuse_mask = 1 << (desc_sv_bits - 2)
-        desc_flags_mask = desc_committed_mask | desc_reuse_mask
+        desc_commit_mask = 1 << (desc_sv_bits - 1)
+        desc_final_mask = 1 << (desc_sv_bits - 2)
+        desc_reuse_mask = 1 << (desc_sv_bits - 3)
+        desc_flags_mask = desc_commit_mask | desc_final_mask | desc_reuse_mask
         desc_id_mask = ~desc_flags_mask
 
         # read in tail and head descriptor ids
@@ -96,8 +97,9 @@ class LxDmesg(gdb.Command):
             desc_off = desc_sz * ind
 
             # skip non-committed record
+            # (note that commit+!final records will be displayed)
             state = utils.read_u64(descs, desc_off + sv_off + counter_off) & desc_flags_mask
-            if state != desc_committed_mask:
+            if state & desc_commit_mask != desc_commit_mask:
                 if did == head_id:
                     break
                 did = (did + 1) & desc_id_mask
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH next v3 6/8] printk: ringbuffer: add finalization/extension support
  2020-08-31  1:10   ` John Ogness
@ 2020-08-31 12:54     ` John Ogness
  -1 siblings, 0 replies; 34+ messages in thread
From: John Ogness @ 2020-08-31 12:54 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner,
	Peter Zijlstra, Andrea Parri, Paul McKenney, kexec, linux-kernel

This critical piece was missing from patch 6...


From 0b745d507f0c38e6d1612ed9468aa52845ca025b Mon Sep 17 00:00:00 2001
From: John Ogness <john.ogness@linutronix.de>
Date: Mon, 31 Aug 2020 14:45:40 +0206
Subject: [PATCH] printk: ringbuffer: allow reading consistent descriptors

desc_read() will fail to read if a descriptor is in the desc_reserved
queried state because such data would be inconsistent. However, since
commit ("printk: ringbuffer: add finalization/extension support") the
desc_reserved state can have the DESC_COMMIT_MASK flag set, in which
case it _is_ consistent. And indeed, desc_reopen_last() expects a
successful read in this case.

Allow desc_read() to read desc_reserved descriptors if the
DESC_COMMIT_MASK flag is set.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reported-by: Andy Lavr <andy.lavr@gmail.com>
Fixes: ("printk: ringbuffer: add finalization/extension support")
---
 kernel/printk/printk_ringbuffer.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
index 0731d5e2dddd..6ba7d3fc96f1 100644
--- a/kernel/printk/printk_ringbuffer.c
+++ b/kernel/printk/printk_ringbuffer.c
@@ -446,8 +446,10 @@ static enum desc_state desc_read(struct prb_desc_ring *desc_ring,
 	/* Check the descriptor state. */
 	state_val = atomic_long_read(state_var); /* LMM(desc_read:A) */
 	d_state = get_desc_state(id, state_val);
-	if (d_state != desc_committed && d_state != desc_reusable)
+	if (d_state == desc_miss ||
+	    (d_state == desc_reserved && !(state_val & DESC_COMMIT_MASK))) {
 		return d_state;
+	}
 
 	/*
 	 * Guarantee the state is loaded before copying the descriptor
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH next v3 5/8] printk: ringbuffer: clear initial reserved fields
  2020-08-31  1:10   ` John Ogness
@ 2020-09-01 14:33     ` Petr Mladek
  -1 siblings, 0 replies; 34+ messages in thread
From: Petr Mladek @ 2020-09-01 14:33 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner,
	Peter Zijlstra, Andrea Parri, Paul McKenney, kexec, linux-kernel

On Mon 2020-08-31 03:16:55, John Ogness wrote:
> prb_reserve() will set some meta data values and leave others
> uninitialized (or rather, containing the values of the previous
> wrap). Simplify the API by always clearing out all the fields.
> Only the sequence number is filled in. The caller is now
> responsible for filling in the rest of the meta data fields.
> In particular, for correctly filling in text and dict lengths.
> 
> Signed-off-by: John Ogness <john.ogness@linutronix.de>
>
> ---
>  kernel/printk/printk.c            |  7 ++++++-
>  kernel/printk/printk_ringbuffer.c | 29 +++++++++++++++++++----------
>  2 files changed, 25 insertions(+), 11 deletions(-)
> 
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index ad8d1dfe5fbe..7e7d596c8878 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -520,8 +520,11 @@ static int log_store(u32 caller_id, int facility, int level,
>  	memcpy(&r.text_buf[0], text, text_len);
>  	if (trunc_msg_len)
>  		memcpy(&r.text_buf[text_len], trunc_msg, trunc_msg_len);
> -	if (r.dict_buf)
> +	r.info->text_len = text_len + trunc_msg_len;
> +	if (r.dict_buf) {
>  		memcpy(&r.dict_buf[0], dict, dict_len);
> +		r.info->dict_len = dict_len;
> +	}
>  	r.info->facility = facility;
>  	r.info->level = level & 7;
>  	r.info->flags = flags & 0x1f;
> @@ -1078,9 +1081,11 @@ static unsigned int __init add_to_rb(struct printk_ringbuffer *rb,
>  		return 0;
>  
>  	memcpy(&dest_r.text_buf[0], &r->text_buf[0], dest_r.text_buf_size);
> +	dest_r.info->text_len = r->info->text_len;

It looks a bit inconsistent that memcpy() copies text_buf_size bytes
while the valid data are defined by text_len.

I would prefer to be on the safe side and copy only text_len.

I could imagine some later optimization that makes buf_size bigger
by alignment or something like that. The reason might be preparation
for continuation lines or so. Then the size of the original buffer and
the destination one might differ.

It is unrelated to this patch. I would solve it in a preparatory
or follow-up patch.
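
I mean something like this (untested sketch, reusing the variables
from the hunk above):

	/* Copy only the valid data instead of the whole buffer. */
	memcpy(&dest_r.text_buf[0], &r->text_buf[0], r->info->text_len);
	dest_r.info->text_len = r->info->text_len;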


>  	if (dest_r.dict_buf) {

>  		memcpy(&dest_r.dict_buf[0], &r->dict_buf[0],
>  		       dest_r.dict_buf_size);
> +		dest_r.info->dict_len = r->info->dict_len;

Same here.

>  	}
>  	dest_r.info->facility = r->info->facility;
>  	dest_r.info->level = r->info->level;
> diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
> index d66718e74aae..da54d4fadf96 100644
> --- a/kernel/printk/printk_ringbuffer.c
> +++ b/kernel/printk/printk_ringbuffer.c
> @@ -1159,6 +1162,18 @@ bool prb_reserve(struct prb_reserved_entry *e, struct printk_ringbuffer *rb,
>  
>  	d = to_desc(desc_ring, id);
>  
> +	/*
> +	 * Clear all @info fields except for @seq, which is used to determine
> +	 * the new sequence number. The writer must fill in new values.
> +	 */
> +	d->info.ts_nsec = 0;
> +	d->info.text_len = 0;
> +	d->info.dict_len = 0;
> +	d->info.facility = 0;
> +	d->info.flags = 0;
> +	d->info.level = 0;
> +	d->info.caller_id = 0;

This will be easy to miss when a new field is added in the future.

I would prefer to store @seq in a temporary variable and clear
the entire structure by memset(). The new sequence number is
calculated a few lines below anyway.
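
I.e. something like this (untested sketch; @seq is the only field
that must survive):

	u64 seq = d->info.seq;

	memset(&d->info, 0, sizeof(d->info));
	d->info.seq = seq;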


Otherwise, it looks good to me.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 34+ messages in thread

* state names: vas: Re: [PATCH next v3 6/8] printk: ringbuffer: add finalization/extension support
  2020-08-31  1:10   ` John Ogness
@ 2020-09-02 10:52     ` Petr Mladek
  -1 siblings, 0 replies; 34+ messages in thread
From: Petr Mladek @ 2020-09-02 10:52 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner,
	Peter Zijlstra, Andrea Parri, Paul McKenney, kexec, linux-kernel

On Mon 2020-08-31 03:16:56, John Ogness wrote:
> Add support for extending the newest data block. For this, introduce
> a new finalization state flag (DESC_FINAL_MASK) that denotes when a
> descriptor may not be extended, i.e. is finalized.
 
> --- a/kernel/printk/printk_ringbuffer.c
> +++ b/kernel/printk/printk_ringbuffer.c
> @@ -49,14 +49,16 @@
>   * Descriptors have three states:
>   *
>   *   reserved
> - *     A writer is modifying the record.
> + *     A writer is modifying the record. Internally represented as either "0"
> + *     or "DESC_COMMIT_MASK".

We should explain the difference between the two values. It might be
enough to add something like:

    See the "Descriptor Finalization" section for more details.

> @@ -79,6 +81,25 @@
>   * committed or reusable queried state. This makes it possible that a valid
>   * sequence number of the tail is always available.
>   *
> + * Descriptor Finalization
> + * ~~~~~~~~~~~~~~~~~~~~~~~
> + * When a writer calls the commit function prb_commit(), the record may still
> + * continue to be in the reserved queried state. In order for that record to
> + * enter into the committed queried state, that record also must be finalized.
> + * A record can be finalized by three different scenarios:
> + *
> + *   1) A writer can finalize its record immediately by calling
> + *      prb_final_commit() instead of prb_commit().
> + *
> + *   2) When a new record is reserved and the previous record has been
> + *      committed via prb_commit(), that previous record is finalized.
> + *
> + *   3) When a record is committed via prb_commit() and a newer record
> + *      already exists, the record being committed is finalized.
> + *
> + * Until a record is finalized (represented by "DESC_FINAL_MASK"), a writer
> + * may "reopen" that record and extend it with more data.
> + *
>   * Data Rings
>   * ~~~~~~~~~~
>   * The two data rings (text and dictionary) function identically. They exist

[...]

> +/*
> + * Attempt to remove the commit flag so that the record can be modified by a
> + * writer again. This is only possible if the descriptor is not yet finalized.
> + *
> + * Note that on success, the queried state did not change. A non-finalized
> + * record (even with the commit flag set) is in the reserved queried state.
> + */
> +static struct prb_desc *desc_reopen_last(struct prb_desc_ring *desc_ring,
> +					 u32 caller_id, unsigned long *id_out)
> +{
> +	unsigned long prev_state_val;
> +	enum desc_state d_state;
> +	struct prb_desc desc;
> +	struct prb_desc *d;
> +	unsigned long id;
> +
> +	id = atomic_long_read(&desc_ring->head_id);
> +
> +	/*
> +	 * To minimize unnecessarily reopening a descriptor, first check the
> +	 * descriptor is in the correct state and has a matching caller ID.
> +	 */
> +	d_state = desc_read(desc_ring, id, &desc);
> +	if (d_state != desc_reserved ||
> +	    !(atomic_long_read(&desc.state_var) & DESC_COMMIT_MASK) ||

This looks like a hack. And a similar extra check of the bit is needed
in desc_read() as well, see
https://lore.kernel.org/r/878sdvq8kd.fsf@jogness.linutronix.de

I have actually been getting less and less happy with the inconsistency
between the names of the bits and the states.

Sigh, you will hate me because this would mean a bigger change.
IMHO, it would be much cleaner and help with long-term maintainability
if we made the following two changes:

First, define 5 desc_states, something like:

enum desc_state {
	desc_miss = -1,		/* ID mismatch */
	desc_modified =	 0x0,	/* reserved, being modified by writer */
	desc_committed = 0x1,	/* committed by writer, could get reopened */
	desc_finalized = 0x2,	/* committed, can no longer get modified */
	desc_reusable =	 0x3,	/* free, not yet used by any writer */
};

Second, only 4 variants of the 3 state bits are currently used.
It means that two bits are enough and their values can map exactly
to the above states:

I mean to do something like:

#define DESC_SV_BITS		(sizeof(unsigned long) * 8)
#define DESC_SV(desc_state)	((unsigned long)desc_state << (DESC_SV_BITS - 2))
#define DESC_ST(state_val)	((unsigned long)state_val >> (DESC_SV_BITS - 2))


Then we could have:

static enum desc_state get_desc_state(unsigned long id,
				      unsigned long state_val)
{
	if (id != DESC_ID(state_val))
		return desc_miss;

	return DESC_ST(state_val);
}

and use in the code:

unsigned long val_committed = id | DESC_SV(desc_committed);


or do

#define DESC_SV(id, desc_state)	(id | (unsigned long)desc_state << (DESC_SV_BITS - 2))

and then use DESC_SV(id, DESC_COMMITTED).
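
Either way, the flag/id masks would follow from the same shift
(untested sketch, using the one-argument DESC_SV() variant):

#define DESC_FLAGS_MASK		DESC_SV(0x3)
#define DESC_ID_MASK		(~DESC_FLAGS_MASK)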


I am sorry that I did not come up with this earlier. I know how
painful it is to rework bigger patchsets. But it affects the format
of the ring buffer, so we should do it early.

Best Regards,
Petr

PS: I am still in the middle of the review. It looks good so far.
I wanted to send this early and separately because it is a bigger change.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH next v3 6/8] printk: ringbuffer: add finalization/extension support
  2020-08-31  1:10   ` John Ogness
@ 2020-09-02 10:58     ` Petr Mladek
  -1 siblings, 0 replies; 34+ messages in thread
From: Petr Mladek @ 2020-09-02 10:58 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner,
	Peter Zijlstra, Andrea Parri, Paul McKenney, kexec, linux-kernel

On Mon 2020-08-31 03:16:56, John Ogness wrote:
> Add support for extending the newest data block. For this, introduce
> a new finalization state flag (DESC_FINAL_MASK) that denotes when a
> descriptor may not be extended, i.e. is finalized.
> 
> --- a/kernel/printk/printk_ringbuffer.c
> +++ b/kernel/printk/printk_ringbuffer.c
> +/*
> + * Try to resize an existing data block associated with the descriptor
> + * specified by @id. If the resized datablock should become wrapped, it
> + * copies the old data to the new data block.
> + *
> + * Fail if this is not the last allocated data block or if there is not
> + * enough space or it is not possible to make enough space.
> + *
> + * Return a pointer to the beginning of the entire data buffer or NULL on
> + * failure.
> + */
> +static char *data_realloc(struct printk_ringbuffer *rb,
> +			  struct prb_data_ring *data_ring, unsigned int size,
> +			  struct prb_data_blk_lpos *blk_lpos, unsigned long id)
> +{
> +	struct prb_data_block *blk;
> +	unsigned long head_lpos;
> +	unsigned long next_lpos;
> +	bool wrapped;
> +
> +	/* Reallocation only works if @blk_lpos is the newest data block. */
> +	head_lpos = atomic_long_read(&data_ring->head_lpos);
> +	if (head_lpos != blk_lpos->next)
> +		return NULL;
> +
> +	/* Keep track if @blk_lpos was a wrapping data block. */
> +	wrapped = (DATA_WRAPS(data_ring, blk_lpos->begin) != DATA_WRAPS(data_ring, blk_lpos->next));
> +
> +	size = to_blk_size(size);
> +
> +	next_lpos = get_next_lpos(data_ring, blk_lpos->begin, size);
> +
> +	/* If the data block does not increase, there is nothing to do. */
> +	if (next_lpos == head_lpos) {
> +		blk = to_block(data_ring, blk_lpos->begin);
> +		return &blk->data[0];
> +	}

We might get here even when the data block is shrunk, but the code
below is not fully ready for that.

> +	if (!data_push_tail(rb, data_ring, next_lpos - DATA_SIZE(data_ring)))
> +		return NULL;
> +
> +	/* The memory barrier involvement is the same as data_alloc:A. */
> +	if (!atomic_long_try_cmpxchg(&data_ring->head_lpos, &head_lpos,
> +				     next_lpos)) { /* LMM(data_realloc:A) */
> +		return NULL;
> +	}
> +
> +	blk = to_block(data_ring, blk_lpos->begin);
> +
> +	if (DATA_WRAPS(data_ring, blk_lpos->begin) != DATA_WRAPS(data_ring, next_lpos)) {
> +		struct prb_data_block *old_blk = blk;
> +
> +		/* Wrapping data blocks store their data at the beginning. */
> +		blk = to_block(data_ring, 0);
> +
> +		/*
> +		 * Store the ID on the wrapped block for consistency.
> +		 * The printk_ringbuffer does not actually use it.
> +		 */
> +		blk->id = id;
> +
> +		if (!wrapped) {
> +			/*
> +			 * Since the allocated space is now in the newly
> +			 * created wrapping data block, copy the content
> +			 * from the old data block.
> +			 */
> +			memcpy(&blk->data[0], &old_blk->data[0],
> +			       (blk_lpos->next - blk_lpos->begin) - sizeof(blk->id));

It took me quite some time to check whether this code is correct or not.

First, I wondered whether the size was correctly calculated. It is,
because the original block was not wrapped, so
lpos->next - lpos->begin defines the real data buffer size.

Second, I wondered whether the target block might be smaller
than the original (the above check allows shrinking). It cannot
be smaller here, because then the new block would not be wrapped
either.

Sigh, it is a bit tricky. And there is a 3rd possibility that
is not handled. The original block might be wrapped while the
new, shrunken one is no longer wrapped. Then we would need to
copy the data the other way.

I know that this function is not currently used for shrinking.
But I would prefer to be on the safe side. Either make
the copying generic, e.g. by calculating the real data size
using the code from get_data(). Or simply refuse shrinking
completely in the above check.
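
Refusing it could be as simple as (untested sketch; the unsigned
subtraction of the logical positions makes a wrap harmless):

	/* Do not allow shrinking the existing data block. */
	if (next_lpos - blk_lpos->begin < blk_lpos->next - blk_lpos->begin)
		return NULL;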

Best Regards,
Petr

> +		}
> +	}
> +
> +	blk_lpos->next = next_lpos;
> +
> +	return &blk->data[0];
> +}
> +
>  /* Return the number of bytes used by a data block. */
>  static unsigned int space_used(struct prb_data_ring *data_ring,
>  			       struct prb_data_blk_lpos *blk_lpos)

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: state names: vas: Re: [PATCH next v3 6/8] printk: ringbuffer: add finalization/extension support
  2020-09-02 10:52     ` Petr Mladek
@ 2020-09-02 11:20       ` John Ogness
  -1 siblings, 0 replies; 34+ messages in thread
From: John Ogness @ 2020-09-02 11:20 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner,
	Peter Zijlstra, Andrea Parri, Paul McKenney, kexec, linux-kernel

On 2020-09-02, Petr Mladek <pmladek@suse.com> wrote:
>> +static struct prb_desc *desc_reopen_last(struct prb_desc_ring *desc_ring,
>> +					 u32 caller_id, unsigned long *id_out)
>> +{
>> +	unsigned long prev_state_val;
>> +	enum desc_state d_state;
>> +	struct prb_desc desc;
>> +	struct prb_desc *d;
>> +	unsigned long id;
>> +
>> +	id = atomic_long_read(&desc_ring->head_id);
>> +
>> +	/*
>> +	 * To minimize unnecessarily reopening a descriptor, first check the
>> +	 * descriptor is in the correct state and has a matching caller ID.
>> +	 */
>> +	d_state = desc_read(desc_ring, id, &desc);
>> +	if (d_state != desc_reserved ||
>> +	    !(atomic_long_read(&desc.state_var) & DESC_COMMIT_MASK) ||
>
> This looks like a hack. And similar extra check of the bit is needed
> also in desc_read(), see
> https://lore.kernel.org/r/878sdvq8kd.fsf@jogness.linutronix.de

Agreed.

> I have actually been getting less and less happy with the inconsistency
> between the names of the bits and the states.
>
> ...
>
> First, define 5 desc_states, something like:
>
> enum desc_state {
> 	desc_miss = -1,		/* ID mismatch */
> 	desc_modified =	 0x0,	/* reserved, being modified by writer */

I prefer the "desc_reserved" name. It may or may not have been modified yet.

> 	desc_committed = 0x1,	/* committed by writer, could get reopened */
> 	desc_finalized = 0x2,	/* committed, can no longer get modified */
> 	desc_reusable =	 0x3,	/* free, not yet used by any writer */
> };
>
> Second, only 4 variants of the 3 state bits are currently used.
> It means that two bits are enough and they might use exactly
> the above names:
>
> I mean to do something like:
>
> #define DESC_SV_BITS		(sizeof(unsigned long) * 8)
> #define DESC_SV(desc_state)	((unsigned long)desc_state << (DESC_SV_BITS - 2))
> #define DESC_ST(state_val)	((unsigned long)state_val >> (DESC_SV_BITS - 2))

This makes sense and will get us back the bit we lost because of
finalization.

> I am sorry that I did not come up with this earlier. I know how
> painful it is to rework bigger patchsets. But it affects the format
> of the ring buffer, so we should do it early.

Agreed.

I am wondering if VMCOREINFO should include a DESC_FLAGS_MASK so that
crash tools could at least successfully iterate the IDs, even if they
didn't know what all the flag values mean (in case more bits are
added later).
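
E.g. something like this in log_buf_vmcoreinfo_setup() (untested
sketch; it assumes DESC_FLAGS_MASK is made visible to printk.c):

	VMCOREINFO_NUMBER(DESC_FLAGS_MASK);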

> PS: I am still middle of review. It looks good so far. I wanted to
> send this early and separately because it is a bigger change.

Thanks for the heads up.

John Ogness

^ permalink raw reply	[flat|nested] 34+ messages in thread

* misc: was: [PATCH next v3 6/8] printk: ringbuffer: add finalization/extension support
  2020-08-31  1:10   ` John Ogness
@ 2020-09-02 12:21     ` Petr Mladek
  -1 siblings, 0 replies; 34+ messages in thread
From: Petr Mladek @ 2020-09-02 12:21 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner,
	Peter Zijlstra, Andrea Parri, Paul McKenney, kexec, linux-kernel

On Mon 2020-08-31 03:16:56, John Ogness wrote:
> Add support for extending the newest data block. For this, introduce
> a new finalization state flag (DESC_FINAL_MASK) that denotes when a
> descriptor may not be extended, i.e. is finalized.
> 
> Three new memory barrier pairs are introduced. Two of them are not
> significant because they are alternate path memory barriers that
> exactly correspond to existing memory barriers.
>
> But the third (_prb_commit:B / desc_reserve:D) is new and guarantees
> that descriptors will always be finalized, either because a
> descriptor setting DESC_COMMIT_MASK sees that there is a newer
> descriptor and so finalizes itself or because a new descriptor being
> reserved sees that the previous descriptor has DESC_COMMIT_MASK set
> and finalizes that descriptor.

Just for the record: all the barriers look good to me.

There are always full barriers when we change DESC_COMMIT_MASK.
It guarantees consistency of the descriptor and data rings.

The only relaxed modification is adding DESC_FINAL_MASK. It is OK
because it can be done only after DESC_COMMIT_MASK has been set
using a full barrier. It means that all changes to the rings are
already visible by that point.
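
I.e. the two transitions look roughly like this (simplified sketch,
not the exact code):

	/* setting DESC_COMMIT_MASK: full barrier via try_cmpxchg() */
	atomic_long_try_cmpxchg(&d->state_var, &prev_state_val,
				id | DESC_COMMIT_MASK);

	/* adding DESC_FINAL_MASK: relaxed is enough */
	atomic_long_cmpxchg_relaxed(&d->state_var,
				    id | DESC_COMMIT_MASK,
				    id | DESC_COMMIT_MASK | DESC_FINAL_MASK);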


> --- a/kernel/printk/printk_ringbuffer.c
> +++ b/kernel/printk/printk_ringbuffer.c
> +
> +/**
> + * prb_reserve_in_last() - Re-reserve and extend the space in the ringbuffer
> + *                         used by the newest record.
> + *
> + * @e:         The entry structure to setup.
> + * @rb:        The ringbuffer to re-reserve and extend data in.
> + * @r:         The record structure to allocate buffers for.
> + * @caller_id: The caller ID of the caller (reserving writer).
> + *
> + * This is the public function available to writers to re-reserve and extend
> + * data.
> + *
> + * The writer specifies the text size to extend (not the new total size) by
> + * setting the @text_buf_size field of @r. Dictionaries cannot be extended so

It might be better to say that extending dictionaries is not
supported.

> + * @dict_buf_size of @r should be set to 0. To ensure proper initialization of
> + * @r, prb_rec_init_wr() should be used.
> + *
> + * This function will fail if @caller_id does not match the caller ID of the
> + * newest record. In that case the caller must reserve new data using
> + * prb_reserve().
> + *
> + * Context: Any context. Disables local interrupts on success.
> + * Return: true if text data could be extended, otherwise false.
> + *
> + * On success:
> + *
> + *   - @r->text_buf points to the beginning of the entire text buffer.
> + *
> + *   - @r->text_buf_len is set to the new total size of the buffer.

s/buf_len/buf_size/

> + *   - @r->dict_buf and @r->dict_buf_len are cleared because extending
> + *     the dict buffer is not supported.

Same here.

> + *
> + *   - @r->info is not touched so that @r->info->text_len could be used
> + *     to append the text.
> + *
> + *   - prb_record_text_space() can be used on @e to query the new
> + *     actually used space.
> + *
> + * Important: All @r->info fields will already be set with the current values
> + *            for the record. I.e. @r->info->text_len will be less than
> + *            @text_buf_size and @r->info->dict_len may be set, even though
> + *            @dict_buf_size is 0. Writers can use @r->info->text_len to know
> + *            where concatenation begins and writers should update
> + *            @r->info->text_len after concatenating.
> + */
> +bool prb_reserve_in_last(struct prb_reserved_entry *e, struct printk_ringbuffer *rb,
> +			 struct printk_record *r, u32 caller_id)
> +{
> +	unsigned int data_size;
> +	struct prb_desc *d;
> +	unsigned long id;
> +
> +	local_irq_save(e->irqflags);
> +
> +	/* Transition the newest descriptor back to the reserved state. */
> +	d = desc_reopen_last(&rb->desc_ring, caller_id, &id);
> +	if (!d) {
> +		local_irq_restore(e->irqflags);
> +		goto fail_reopen;
> +	}
> +
> +	/* Now the writer has exclusive access: LMM(prb_reserve_in_last:A) */
> +
> +	/*
> +	 * Set the @e fields here so that prb_commit() can be used if
> +	 * anything fails from now on.
> +	 */
> +	e->rb = rb;
> +	e->id = id;
> +
> +	/*
> +	 * desc_reopen_last() checked the caller_id, but there was no
> +	 * exclusive access at that point. The descriptor may have
> +	 * changed since then.
> +	 */
> +	if (caller_id != d->info.caller_id)
> +		goto fail;
> +
> +	if (BLK_DATALESS(&d->text_blk_lpos)) {

We do not clear d->info when reopening the descriptor. I would at
least add here a consistency check similar to the one below:

		if (WARN_ON_ONCE(d->info.text_len)) {
			pr_warn_once("wrong data size (%u, expecting 0) for data\n",
				     d->info.text_len);
			d->info.text_len = 0;
		}

> +		r->text_buf = data_alloc(rb, &rb->text_data_ring, r->text_buf_size,
> +					 &d->text_blk_lpos, id);
> +	} else {
> +		if (!get_data(&rb->text_data_ring, &d->text_blk_lpos, &data_size))
> +			goto fail;
> +
> +		/*
> +		 * Increase the buffer size to include the original size. If
> +		 * the meta data (@text_len) is not sane, use the full data
> +		 * block size.
> +		 */
> +		if (WARN_ON_ONCE(d->info.text_len > data_size)) {
> +			pr_warn_once("wrong data size (%u, expecting >=%hu) for data\n",
> +				     data_size, d->info.text_len);
> +			d->info.text_len = data_size;
> +		}
> +		r->text_buf_size += d->info.text_len;
> +
> +		if (!data_check_size(&rb->text_data_ring, r->text_buf_size))
> +			goto fail;
> +
> +		r->text_buf = data_realloc(rb, &rb->text_data_ring, r->text_buf_size,
> +					   &d->text_blk_lpos, id);
> +	}
> +	if (r->text_buf_size && !r->text_buf)
> +		goto fail;

[...]

> @@ -1261,6 +1615,59 @@ void prb_commit(struct prb_reserved_entry *e)
>  	local_irq_restore(e->irqflags);
>  }
>  
> +/**
> + * prb_commit() - Commit (previously reserved) data to the ringbuffer.
> + *
> + * @e: The entry containing the reserved data information.
> + *
> + * This is the public function available to writers to commit data.
> + *
> + * Note that the data is not yet available to readers until it is finalized.
> + * Finalizing happens automatically when space for the next record is
> + * reserved.
> + *
> + * See prb_final_commit() for a version of this function that finalizes
> + * immediately.
> + *
> + * Context: Any context. Enables local interrupts.
> + */
> +void prb_commit(struct prb_reserved_entry *e)
> +{
> +	struct prb_desc_ring *desc_ring = &e->rb->desc_ring;
> +	unsigned long head_id;
> +
> +	_prb_commit(e, 0);
> +
> +	/*
> +	 * If this descriptor is no longer the head (i.e. a new record has
> +	 * been allocated), extending the data for this record is no longer
> +	 * allowed and therefore it must be finalized.
> +	 */
> +	head_id = atomic_long_read(&desc_ring->head_id); /* LMM(prb_commit:A) */
> +	if (head_id != e->id)
> +		desc_make_final(desc_ring, e->id);

Nice trick! I wonder why I suggested doing it the complicated
way via desc_state.

> +}
> +

I finished the review of this patch.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: state names: vas: Re: [PATCH next v3 6/8] printk: ringbuffer: add finalization/extension support
  2020-09-02 11:20       ` John Ogness
@ 2020-09-02 12:39         ` Petr Mladek
  -1 siblings, 0 replies; 34+ messages in thread
From: Petr Mladek @ 2020-09-02 12:39 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner,
	Peter Zijlstra, Andrea Parri, Paul McKenney, kexec, linux-kernel

On Wed 2020-09-02 13:26:14, John Ogness wrote:
> On 2020-09-02, Petr Mladek <pmladek@suse.com> wrote:
> >> +static struct prb_desc *desc_reopen_last(struct prb_desc_ring *desc_ring,
> >> +					 u32 caller_id, unsigned long *id_out)
> >> +{
> >> +	unsigned long prev_state_val;
> >> +	enum desc_state d_state;
> >> +	struct prb_desc desc;
> >> +	struct prb_desc *d;
> >> +	unsigned long id;
> >> +
> >> +	id = atomic_long_read(&desc_ring->head_id);
> >> +
> >> +	/*
> >> +	 * To minimize unnecessarily reopening a descriptor, first check the
> >> +	 * descriptor is in the correct state and has a matching caller ID.
> >> +	 */
> >> +	d_state = desc_read(desc_ring, id, &desc);
> >> +	if (d_state != desc_reserved ||
> >> +	    !(atomic_long_read(&desc.state_var) & DESC_COMMIT_MASK) ||
> >
> > First, define 5 desc_states, something like:
> >
> > enum desc_state {
> > 	desc_miss = -1,		/* ID mismatch */
> > 	desc_modified =	 0x0,	/* reserved, being modified by writer */
> 
> I prefer the "desc_reserved" name. It may or may not have been modified yet.

Yeah, "desc_reserved" sounds better. I probably just wanted to free my
fantasy from the current code ;-)


> > 	desc_committed = 0x1,	/* committed by writer, could get reopened */
> > 	desc_finalized = 0x2,	/* committed, can no longer get modified */
> > 	desc_reusable =	 0x3,	/* free, not yet used by any writer */
> > };
> >
> > Second, only 4 of the 8 possible values of the 3 state bits are
> > currently used. That means two bits are enough, and they could map
> > exactly to the names above:
> >
> > I mean something like:
> >
> > #define DESC_SV_BITS		(sizeof(unsigned long) * 8)
> > #define DESC_SV(desc_state)	((unsigned long)desc_state << (DESC_SV_BITS - 2))
> > #define DESC_ST(state_val)	((unsigned long)state_val >> (DESC_SV_BITS - 2))
> 
> This makes sense and will get us back the bit we lost because of
> finalization.

Yup. Which is good especially on 32-bit architectures.
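
A standalone userspace sketch of the proposed packing, to make the bit
layout concrete (the DESC_FLAGS_MASK/DESC_ID_MASK names are
illustrative, not taken from the patch):

	#include <stdio.h>

	enum desc_state {
		desc_reserved	= 0x0,	/* reserved, in use by writer */
		desc_committed	= 0x1,	/* committed, could get reopened */
		desc_finalized	= 0x2,	/* committed, can no longer get modified */
		desc_reusable	= 0x3,	/* free, not yet used by any writer */
	};

	#define DESC_SV_BITS	(sizeof(unsigned long) * 8)
	#define DESC_SV(st)	((unsigned long)(st) << (DESC_SV_BITS - 2))
	#define DESC_ST(sv)	((unsigned long)(sv) >> (DESC_SV_BITS - 2))
	#define DESC_FLAGS_MASK	DESC_SV(0x3)
	#define DESC_ID_MASK	(~DESC_FLAGS_MASK)

	int main(void)
	{
		unsigned long id = 42;

		/* pack: state in the top two bits, id in the rest */
		unsigned long sv = DESC_SV(desc_finalized) | id;

		/* unpack: prints "state=2 id=42" */
		printf("state=%lu id=%lu\n", DESC_ST(sv), sv & DESC_ID_MASK);
		return 0;
	}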

> I am wondering if VMCOREINFO should include a DESC_FLAGS_MASK so that
> crash tools could at least successfully iterate the ID's, even if they
> didn't know what all the flag values mean (in the case that more bits
> are added later).

Good point. I am just not sure whether they should try to read all ids
or refuse to read anything when a new bit is added.

Well, I really hope that we will not need new states anytime soon.
It would take a really strong reason.

I personally can't think of any use case. pr_cont() was special
because it was on the writer side. All other steps of the printk
rework are on the reader side.

I believe that we are getting close with all the ring buffer code.
And I have a good feeling about it.

Best Regards,
Petr


* Re: [PATCH next v3 7/8] printk: reimplement log_cont using record extension
  2020-08-31  1:10   ` John Ogness
@ 2020-09-02 13:38     ` Petr Mladek
  -1 siblings, 0 replies; 34+ messages in thread
From: Petr Mladek @ 2020-09-02 13:38 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Linus Torvalds, Greg Kroah-Hartman, Thomas Gleixner,
	Peter Zijlstra, Andrea Parri, Paul McKenney, kexec, linux-kernel

On Mon 2020-08-31 03:16:57, John Ogness wrote:
> Use the record extending feature of the ringbuffer to implement
> continuous messages. This preserves the existing continuous message
> behavior.
> 
> Signed-off-by: John Ogness <john.ogness@linutronix.de>
> ---
>  kernel/printk/printk.c | 98 +++++++++---------------------------------
>  1 file changed, 20 insertions(+), 78 deletions(-)
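
For context, the behavior being preserved is the usual continuation
pattern, where pr_cont() from the same task appends to the previous
message:

	pr_info("checking feature X...");
	/* ... work by the same task ... */
	pr_cont(" ok\n");

With this patch the continuation is appended to the still-open record
via the ringbuffer's record-extension support, rather than staged in a
separate cont buffer.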

It looks simple from the printk() side ;-)

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr



Thread overview: 34+ messages
2020-08-31  1:10 [PATCH next v3 0/8] printk: reimplement LOG_CONT handling John Ogness
2020-08-31  1:10 ` [PATCH next v3 1/8] printk: ringbuffer: rename DESC_COMMITTED_MASK flag John Ogness
2020-08-31  1:10 ` [PATCH next v3 2/8] printk: ringbuffer: change representation of reusable John Ogness
2020-08-31  1:10 ` [PATCH next v3 3/8] printk: ringbuffer: relocate get_data() John Ogness
2020-08-31  1:10 ` [PATCH next v3 4/8] printk: ringbuffer: add BLK_DATALESS() macro John Ogness
2020-08-31  1:10 ` [PATCH next v3 5/8] printk: ringbuffer: clear initial reserved fields John Ogness
2020-09-01 14:33   ` Petr Mladek
2020-08-31  1:10 ` [PATCH next v3 6/8] printk: ringbuffer: add finalization/extension support John Ogness
2020-08-31 12:54   ` John Ogness
2020-09-02 10:52   ` state names: vas: " Petr Mladek
2020-09-02 11:20     ` John Ogness
2020-09-02 12:39       ` Petr Mladek
2020-09-02 10:58   ` Petr Mladek
2020-09-02 12:21   ` misc: was: " Petr Mladek
2020-08-31  1:10 ` [PATCH next v3 7/8] printk: reimplement log_cont using record extension John Ogness
2020-09-02 13:38   ` Petr Mladek
2020-08-31  1:10 ` [PATCH next v3 8/8] scripts/gdb: support printk finalized records John Ogness
