linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC][PATCH 2/2] write callback: Check if existing entry is erasable
@ 2012-07-03 23:35 Seiji Aguchi
  2012-07-03 23:58 ` Luck, Tony
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Seiji Aguchi @ 2012-07-03 23:35 UTC (permalink / raw)
  To: linux-kernel, Luck, Tony (tony.luck@intel.com),
	mikew, Matthew Garrett (mjg@redhat.com),
	dzickus
  Cc: dle-develop, Satoru Moriya

This patch introduces a following rule checking if an existing entry in NVRAM are erasable to avoid missing 
a critical message ,such as panic, before an user check it.

    - In case where an existing entry is panic or emergency
      -  It is not erasable because if panic/emergency event is lost, we have no way to detect 
         the root cause. We shouldn't overwrite them for any reason.

    - In case where an existing entry is oops/shutdown/halt/poweroff
      -  It is erasable if an error ,panic, emergency or oops, happen in new event.
           - Oops is erasable because system may still be running and its message may be saved 
             into /var/log/messages.
           - Also, normal event ,shutdown/halt/poweroff, is erasable because error message is 
             more important than normal message.

    - In case where an existing entry is unknown event
      -  It is erasable because any event supported by pstore outweighs unsupported events.

Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
---
 drivers/firmware/efivars.c |   48 +++++++++++++++++++++++++++++++++++++++++++-
 fs/pstore/platform.c       |    4 +-
 include/linux/pstore.h     |    5 ++++
 3 files changed, 54 insertions(+), 3 deletions(-)

diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c index 4929254..54d9bc6 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -685,6 +685,45 @@ static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
 	return 0;
 }
 
+
+static bool can_erase_entry(struct efivar_entry *entry, enum kmsg_dump_reason
+			   new_reason)
+{
+	int prev_reason = 0;
+	const char *prev_why;
+	bool is_erasable = 0;
+
+	/* Get a reason of previous message */
+	while (1) {
+		prev_why =  pstore_get_reason_str(prev_reason);
+		if (!strncmp(entry->var.Data, prev_why, strlen(prev_why)))
+			break;
+		prev_reason++;
+	}
+
+	/* check if exisiting message is erasable */
+
+	switch (prev_reason) {
+	case KMSG_DUMP_PANIC:
+	case KMSG_DUMP_EMERG:
+		/* Never erase panic or emergency message */
+		break;
+	case KMSG_DUMP_OOPS:
+	case KMSG_DUMP_RESTART:
+	case KMSG_DUMP_HALT:
+	case KMSG_DUMP_POWEROFF:
+		/* Can erase if new one is error message */
+		if (new_reason <= KMSG_DUMP_EMERG)
+			is_erasable = 1;
+		break;
+	default:
+		/* Can erase unknown message */
+		is_erasable = 1;
+	}
+
+	return is_erasable;
+}
+
 static int efi_pstore_write(enum pstore_type_id type,
 		enum kmsg_dump_reason reason, u64 *id,
 		unsigned int part, size_t size, struct pstore_info *psi) @@ -706,7 +745,7 @@ static int efi_pstore_write(enum pstore_type_id type,
 		efi_name[i] = stub_name[i];
 
 	/*
-	 * Clean up any entries with the same name
+	 * Search for erasable entry
 	 */
 
 	list_for_each_entry(entry, &efivars->list, list) { @@ -721,6 +760,13 @@ static int efi_pstore_write(enum pstore_type_id type,
 		if (entry->var.VariableName[utf16_strlen(efi_name)] == 0)
 			continue;
 
+		if (!can_erase_entry(entry, reason)) {
+			/* return without writing new entry */
+			spin_unlock(&efivars->lock);
+			*id = part;
+			return ret;
+		}
+
 		/* found */
 		found = entry;
 		efivars->ops->set_variable(entry->var.VariableName,
diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c index 03ce7a9..32715eb 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -68,7 +68,7 @@ void pstore_set_kmsg_bytes(int bytes)
 /* Tag each group of saved records with a sequence number */
 static int	oopscount;
 
-static const char *get_reason_str(enum kmsg_dump_reason reason)
+const char *pstore_get_reason_str(enum kmsg_dump_reason reason)
 {
 	switch (reason) {
 	case KMSG_DUMP_PANIC:
@@ -104,7 +104,7 @@ static void pstore_dump(struct kmsg_dumper *dumper,
 	int		is_locked = 0;
 	int		ret;
 
-	why = get_reason_str(reason);
+	why = pstore_get_reason_str(reason);
 
 	if (in_nmi()) {
 		is_locked = spin_trylock(&psinfo->buf_lock); diff --git a/include/linux/pstore.h b/include/linux/pstore.h index e1461e1..e9347e9 100644
--- a/include/linux/pstore.h
+++ b/include/linux/pstore.h
@@ -54,12 +54,17 @@ struct pstore_info {
 
 #ifdef CONFIG_PSTORE
 extern int pstore_register(struct pstore_info *);
+extern const char *pstore_get_reason_str(enum kmsg_dump_reason reason);
 #else
 static inline int
 pstore_register(struct pstore_info *psi)  {
 	return -ENODEV;
 }
+static const char *pstore_get_reason_str(enum kmsg_dump_reason reason) 
+{
+	return NULL;
+}
 #endif
 
 #endif /*_LINUX_PSTORE_H*/
--
1.7.1



^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [RFC][PATCH 2/2] write callback: Check if existing entry is erasable
  2012-07-03 23:35 [RFC][PATCH 2/2] write callback: Check if existing entry is erasable Seiji Aguchi
@ 2012-07-03 23:58 ` Luck, Tony
  2012-07-04  1:22   ` Seiji Aguchi
  2012-07-04  1:02 ` Mike Waychison
  2012-07-05 13:32 ` Don Zickus
  2 siblings, 1 reply; 12+ messages in thread
From: Luck, Tony @ 2012-07-03 23:58 UTC (permalink / raw)
  To: Seiji Aguchi, linux-kernel, mikew,
	Matthew Garrett (mjg@redhat.com),
	dzickus
  Cc: dle-develop, Satoru Moriya

Perhaps update the comment in include/linux/kmsg_dump.h to mention that
<= KMSG_DUMP_EMERG may overwrite less important logs (since it already
mentions the importance of order and significance of the OOPS level).

-Tony

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC][PATCH 2/2] write callback: Check if existing entry is erasable
  2012-07-03 23:35 [RFC][PATCH 2/2] write callback: Check if existing entry is erasable Seiji Aguchi
  2012-07-03 23:58 ` Luck, Tony
@ 2012-07-04  1:02 ` Mike Waychison
  2012-07-04  1:54   ` Seiji Aguchi
  2012-07-05 13:32 ` Don Zickus
  2 siblings, 1 reply; 12+ messages in thread
From: Mike Waychison @ 2012-07-04  1:02 UTC (permalink / raw)
  To: Seiji Aguchi
  Cc: linux-kernel, Luck, Tony (tony.luck@intel.com),
	Matthew Garrett (mjg@redhat.com),
	dzickus, dle-develop, Satoru Moriya

On 07/03/2012 04:35 PM, Seiji Aguchi wrote:
> This patch introduces a following rule checking if an existing entry in NVRAM are erasable to avoid missing
> a critical message ,such as panic, before an user check it.
>
>      - In case where an existing entry is panic or emergency
>        -  It is not erasable because if panic/emergency event is lost, we have no way to detect
>           the root cause. We shouldn't overwrite them for any reason.
>
>      - In case where an existing entry is oops/shutdown/halt/poweroff
>        -  It is erasable if an error ,panic, emergency or oops, happen in new event.
>             - Oops is erasable because system may still be running and its message may be saved
>               into /var/log/messages.
>             - Also, normal event ,shutdown/halt/poweroff, is erasable because error message is
>               more important than normal message.
>
>      - In case where an existing entry is unknown event
>        -  It is erasable because any event supported by pstore outweighs unsupported events.
>
> Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
> ---
>   drivers/firmware/efivars.c |   48 +++++++++++++++++++++++++++++++++++++++++++-
>   fs/pstore/platform.c       |    4 +-
>   include/linux/pstore.h     |    5 ++++
>   3 files changed, 54 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c index 4929254..54d9bc6 100644
> --- a/drivers/firmware/efivars.c
> +++ b/drivers/firmware/efivars.c
> @@ -685,6 +685,45 @@ static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
>   	return 0;
>   }
>
> +
> +static bool can_erase_entry(struct efivar_entry *entry, enum kmsg_dump_reason
> +			   new_reason)
> +{
> +	int prev_reason = 0;
> +	const char *prev_why;
> +	bool is_erasable = 0;
> +
> +	/* Get a reason of previous message */
> +	while (1) {
> +		prev_why =  pstore_get_reason_str(prev_reason);
> +		if (!strncmp(entry->var.Data, prev_why, strlen(prev_why)))
> +			break;
> +		prev_reason++;
> +	}

This loop should probably be hardened to cope with bad data.  The patch 
description and code below show intention to overwrite garbage data, but 
this loop will never exit on garbage data.

> +
> +	/* check if exisiting message is erasable */

existing

> +
> +	switch (prev_reason) {
> +	case KMSG_DUMP_PANIC:
> +	case KMSG_DUMP_EMERG:
> +		/* Never erase panic or emergency message */
> +		break;
> +	case KMSG_DUMP_OOPS:
> +	case KMSG_DUMP_RESTART:
> +	case KMSG_DUMP_HALT:
> +	case KMSG_DUMP_POWEROFF:
> +		/* Can erase if new one is error message */
> +		if (new_reason <= KMSG_DUMP_EMERG)
> +			is_erasable = 1;
> +		break;
> +	default:
> +		/* Can erase unknown message */
> +		is_erasable = 1;

It is probably safer to actually complain here at compile time if a 
reason is missing.  prev_reason is an enum though, so if all cases are 
accounted for and no default is specified, gcc should complain if a new 
KMSG_DUMP enum value is added to the codebase without considering this 
logic here.

> +	}
> +
> +	return is_erasable;
> +}
> +
>   static int efi_pstore_write(enum pstore_type_id type,
>   		enum kmsg_dump_reason reason, u64 *id,
>   		unsigned int part, size_t size, struct pstore_info *psi) @@ -706,7 +745,7 @@ static int efi_pstore_write(enum pstore_type_id type,
>   		efi_name[i] = stub_name[i];
>
>   	/*
> -	 * Clean up any entries with the same name
> +	 * Search for erasable entry
>   	 */
>
>   	list_for_each_entry(entry, &efivars->list, list) { @@ -721,6 +760,13 @@ static int efi_pstore_write(enum pstore_type_id type,
>   		if (entry->var.VariableName[utf16_strlen(efi_name)] == 0)
>   			continue;
>
> +		if (!can_erase_entry(entry, reason)) {
> +			/* return without writing new entry */
> +			spin_unlock(&efivars->lock);
> +			*id = part;
> +			return ret;
> +		}

I'm not sure how comfortable I am with the code duplication here.  Was 
using a flag not workable?

> +
>   		/* found */
>   		found = entry;
>   		efivars->ops->set_variable(entry->var.VariableName,
> diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c index 03ce7a9..32715eb 100644
> --- a/fs/pstore/platform.c
> +++ b/fs/pstore/platform.c
> @@ -68,7 +68,7 @@ void pstore_set_kmsg_bytes(int bytes)
>   /* Tag each group of saved records with a sequence number */
>   static int	oopscount;
>
> -static const char *get_reason_str(enum kmsg_dump_reason reason)
> +const char *pstore_get_reason_str(enum kmsg_dump_reason reason)
>   {
>   	switch (reason) {
>   	case KMSG_DUMP_PANIC:
> @@ -104,7 +104,7 @@ static void pstore_dump(struct kmsg_dumper *dumper,
>   	int		is_locked = 0;
>   	int		ret;
>
> -	why = get_reason_str(reason);
> +	why = pstore_get_reason_str(reason);
>
>   	if (in_nmi()) {
>   		is_locked = spin_trylock(&psinfo->buf_lock); diff --git a/include/linux/pstore.h b/include/linux/pstore.h index e1461e1..e9347e9 100644
> --- a/include/linux/pstore.h
> +++ b/include/linux/pstore.h
> @@ -54,12 +54,17 @@ struct pstore_info {
>
>   #ifdef CONFIG_PSTORE
>   extern int pstore_register(struct pstore_info *);
> +extern const char *pstore_get_reason_str(enum kmsg_dump_reason reason);
>   #else
>   static inline int
>   pstore_register(struct pstore_info *psi)  {
>   	return -ENODEV;
>   }
> +static const char *pstore_get_reason_str(enum kmsg_dump_reason reason)
> +{
> +	return NULL;
> +}
>   #endif
>
>   #endif /*_LINUX_PSTORE_H*/
> --
> 1.7.1
>
>



^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [RFC][PATCH 2/2] write callback: Check if existing entry is erasable
  2012-07-03 23:58 ` Luck, Tony
@ 2012-07-04  1:22   ` Seiji Aguchi
  0 siblings, 0 replies; 12+ messages in thread
From: Seiji Aguchi @ 2012-07-04  1:22 UTC (permalink / raw)
  To: Luck, Tony, linux-kernel, mikew, Matthew Garrett (mjg@redhat.com),
	dzickus
  Cc: dle-develop, Satoru Moriya


> Perhaps update the comment in include/linux/kmsg_dump.h to mention that <= KMSG_DUMP_EMERG may overwrite less important
> logs (since it already mentions the importance of order and significance of the OOPS level).

I will update my patch by adding some comment to linux/kmsg_dump.h

Thanks, 

Seiji

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [RFC][PATCH 2/2] write callback: Check if existing entry is erasable
  2012-07-04  1:02 ` Mike Waychison
@ 2012-07-04  1:54   ` Seiji Aguchi
  0 siblings, 0 replies; 12+ messages in thread
From: Seiji Aguchi @ 2012-07-04  1:54 UTC (permalink / raw)
  To: Mike Waychison
  Cc: linux-kernel, Luck, Tony (tony.luck@intel.com),
	Matthew Garrett (mjg@redhat.com),
	dzickus, dle-develop, Satoru Moriya

Thank you for reviewing my patch.

> > +	while (1) {
> > +		prev_why =  pstore_get_reason_str(prev_reason);
> > +		if (!strncmp(entry->var.Data, prev_why, strlen(prev_why)))
> > +			break;
> > +		prev_reason++;
> > +	}
> 
> This loop should probably be hardened to cope with bad data.  The patch description and code below show intention to overwrite
> garbage data, but this loop will never exit on garbage data.

OK. I will remove infinite loop above.

> > +	switch (prev_reason) {
> > +	case KMSG_DUMP_PANIC:
> > +	case KMSG_DUMP_EMERG:
> > +		/* Never erase panic or emergency message */
> > +		break;
> > +	case KMSG_DUMP_OOPS:
> > +	case KMSG_DUMP_RESTART:
> > +	case KMSG_DUMP_HALT:
> > +	case KMSG_DUMP_POWEROFF:
> > +		/* Can erase if new one is error message */
> > +		if (new_reason <= KMSG_DUMP_EMERG)
> > +			is_erasable = 1;
> > +		break;
> > +	default:
> > +		/* Can erase unknown message */
> > +		is_erasable = 1;
> 
> It is probably safer to actually complain here at compile time if a reason is missing.  prev_reason is an enum though, so if all cases are
> accounted for and no default is specified, gcc should complain if a new KMSG_DUMP enum value is added to the codebase without
> considering this logic here.

It makes sense.
I will remove "default" and add all cases of "KMSG_DUMP_*". 
In latest linus-tree, KMSG_DUMP_UNDEF is added.
I will check how we should treat it.


> > +		if (!can_erase_entry(entry, reason)) {
> > +			/* return without writing new entry */
> > +			spin_unlock(&efivars->lock);
> > +			*id = part;
> > +			return ret;
> > +		}
> 
> I'm not sure how comfortable I am with the code duplication here.  Was using a flag not workable?

If efi_pstore doesn't write new entry, it should return with non-zero value.
I will update my patch by changing to "return -EEXIST " or so.

Also, I will fix typo you pointed out.

Thanks,

Seiji

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC][PATCH 2/2] write callback: Check if existing entry is erasable
  2012-07-03 23:35 [RFC][PATCH 2/2] write callback: Check if existing entry is erasable Seiji Aguchi
  2012-07-03 23:58 ` Luck, Tony
  2012-07-04  1:02 ` Mike Waychison
@ 2012-07-05 13:32 ` Don Zickus
  2012-07-05 20:03   ` Luck, Tony
  2012-07-05 20:05   ` Seiji Aguchi
  2 siblings, 2 replies; 12+ messages in thread
From: Don Zickus @ 2012-07-05 13:32 UTC (permalink / raw)
  To: Seiji Aguchi
  Cc: linux-kernel, Luck, Tony (tony.luck@intel.com),
	mikew, Matthew Garrett (mjg@redhat.com),
	dle-develop, Satoru Moriya

On Tue, Jul 03, 2012 at 11:35:47PM +0000, Seiji Aguchi wrote:
> This patch introduces a following rule checking if an existing entry in NVRAM are erasable to avoid missing 
> a critical message ,such as panic, before an user check it.

I am not a big fan of this policy.

> 
>     - In case where an existing entry is panic or emergency
>       -  It is not erasable because if panic/emergency event is lost, we have no way to detect 
>          the root cause. We shouldn't overwrite them for any reason.

Ok.

> 
>     - In case where an existing entry is oops/shutdown/halt/poweroff
>       -  It is erasable if an error ,panic, emergency or oops, happen in new event.
>            - Oops is erasable because system may still be running and its message may be saved 
>              into /var/log/messages.

Then why save it to begin with?

>            - Also, normal event ,shutdown/halt/poweroff, is erasable because error message is 
>              more important than normal message.

Then why happens if you are using these messages to debug a problem on
your system?  It would seem one could be misled unless you set a flag
stating that a log was overwritten somewhere.

> 
>     - In case where an existing entry is unknown event
>       -  It is erasable because any event supported by pstore outweighs unsupported events.

Again, how does an 'unknown' event get recorded?  Shouldn't all events be
known?

I would rather see no records overwritten and just make sure there is
enough space for a dozen or so records to buffer multiple panics before
userspace can run.

Implementing policy like this in the kernel seems like it would be a
constant battle between everyone's view point of what is important and not
important.

I would rather take the viewpoint, if it is important to log it in a space
limited NVRAM, then it is important enough not to overwrite until
userspace explicitly asks it to be deleted.  Otherwise why log it, if it
is not important?

Cheers,
Don

> 
> Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
> ---
>  drivers/firmware/efivars.c |   48 +++++++++++++++++++++++++++++++++++++++++++-
>  fs/pstore/platform.c       |    4 +-
>  include/linux/pstore.h     |    5 ++++
>  3 files changed, 54 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c index 4929254..54d9bc6 100644
> --- a/drivers/firmware/efivars.c
> +++ b/drivers/firmware/efivars.c
> @@ -685,6 +685,45 @@ static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
>  	return 0;
>  }
>  
> +
> +static bool can_erase_entry(struct efivar_entry *entry, enum kmsg_dump_reason
> +			   new_reason)
> +{
> +	int prev_reason = 0;
> +	const char *prev_why;
> +	bool is_erasable = 0;
> +
> +	/* Get a reason of previous message */
> +	while (1) {
> +		prev_why =  pstore_get_reason_str(prev_reason);
> +		if (!strncmp(entry->var.Data, prev_why, strlen(prev_why)))
> +			break;
> +		prev_reason++;
> +	}
> +
> +	/* check if exisiting message is erasable */
> +
> +	switch (prev_reason) {
> +	case KMSG_DUMP_PANIC:
> +	case KMSG_DUMP_EMERG:
> +		/* Never erase panic or emergency message */
> +		break;
> +	case KMSG_DUMP_OOPS:
> +	case KMSG_DUMP_RESTART:
> +	case KMSG_DUMP_HALT:
> +	case KMSG_DUMP_POWEROFF:
> +		/* Can erase if new one is error message */
> +		if (new_reason <= KMSG_DUMP_EMERG)
> +			is_erasable = 1;
> +		break;
> +	default:
> +		/* Can erase unknown message */
> +		is_erasable = 1;
> +	}
> +
> +	return is_erasable;
> +}
> +
>  static int efi_pstore_write(enum pstore_type_id type,
>  		enum kmsg_dump_reason reason, u64 *id,
>  		unsigned int part, size_t size, struct pstore_info *psi) @@ -706,7 +745,7 @@ static int efi_pstore_write(enum pstore_type_id type,
>  		efi_name[i] = stub_name[i];
>  
>  	/*
> -	 * Clean up any entries with the same name
> +	 * Search for erasable entry
>  	 */
>  
>  	list_for_each_entry(entry, &efivars->list, list) { @@ -721,6 +760,13 @@ static int efi_pstore_write(enum pstore_type_id type,
>  		if (entry->var.VariableName[utf16_strlen(efi_name)] == 0)
>  			continue;
>  
> +		if (!can_erase_entry(entry, reason)) {
> +			/* return without writing new entry */
> +			spin_unlock(&efivars->lock);
> +			*id = part;
> +			return ret;
> +		}
> +
>  		/* found */
>  		found = entry;
>  		efivars->ops->set_variable(entry->var.VariableName,
> diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c index 03ce7a9..32715eb 100644
> --- a/fs/pstore/platform.c
> +++ b/fs/pstore/platform.c
> @@ -68,7 +68,7 @@ void pstore_set_kmsg_bytes(int bytes)
>  /* Tag each group of saved records with a sequence number */
>  static int	oopscount;
>  
> -static const char *get_reason_str(enum kmsg_dump_reason reason)
> +const char *pstore_get_reason_str(enum kmsg_dump_reason reason)
>  {
>  	switch (reason) {
>  	case KMSG_DUMP_PANIC:
> @@ -104,7 +104,7 @@ static void pstore_dump(struct kmsg_dumper *dumper,
>  	int		is_locked = 0;
>  	int		ret;
>  
> -	why = get_reason_str(reason);
> +	why = pstore_get_reason_str(reason);
>  
>  	if (in_nmi()) {
>  		is_locked = spin_trylock(&psinfo->buf_lock); diff --git a/include/linux/pstore.h b/include/linux/pstore.h index e1461e1..e9347e9 100644
> --- a/include/linux/pstore.h
> +++ b/include/linux/pstore.h
> @@ -54,12 +54,17 @@ struct pstore_info {
>  
>  #ifdef CONFIG_PSTORE
>  extern int pstore_register(struct pstore_info *);
> +extern const char *pstore_get_reason_str(enum kmsg_dump_reason reason);
>  #else
>  static inline int
>  pstore_register(struct pstore_info *psi)  {
>  	return -ENODEV;
>  }
> +static const char *pstore_get_reason_str(enum kmsg_dump_reason reason) 
> +{
> +	return NULL;
> +}
>  #endif
>  
>  #endif /*_LINUX_PSTORE_H*/
> --
> 1.7.1
> 
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [RFC][PATCH 2/2] write callback: Check if existing entry is erasable
  2012-07-05 13:32 ` Don Zickus
@ 2012-07-05 20:03   ` Luck, Tony
  2012-07-05 20:05   ` Seiji Aguchi
  1 sibling, 0 replies; 12+ messages in thread
From: Luck, Tony @ 2012-07-05 20:03 UTC (permalink / raw)
  To: Don Zickus, Seiji Aguchi
  Cc: linux-kernel, mikew, Matthew Garrett (mjg@redhat.com),
	dle-develop, Satoru Moriya

>>     - In case where an existing entry is oops/shutdown/halt/poweroff
>>       -  It is erasable if an error ,panic, emergency or oops, happen in new event.
>>            - Oops is erasable because system may still be running and its message may be saved 
>>              into /var/log/messages.
>
> Then why save it to begin with?

It would be useful in the case where an OOPS is followed quickly by a system hang.

-Tony
 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [RFC][PATCH 2/2] write callback: Check if existing entry is erasable
  2012-07-05 13:32 ` Don Zickus
  2012-07-05 20:03   ` Luck, Tony
@ 2012-07-05 20:05   ` Seiji Aguchi
  2012-07-05 20:24     ` Don Zickus
  1 sibling, 1 reply; 12+ messages in thread
From: Seiji Aguchi @ 2012-07-05 20:05 UTC (permalink / raw)
  To: Don Zickus
  Cc: linux-kernel, Luck, Tony (tony.luck@intel.com),
	mikew, Matthew Garrett (mjg@redhat.com),
	dle-develop, Satoru Moriya

Don,

Thank you for giving me your comments.
Let me explain what I'm thinking now.

> I would rather see no records overwritten and just make sure there is enough space for a dozen or so records to buffer multiple
> panics before userspace can run.
> 
> Implementing policy like this in the kernel seems like it would be a constant battle between everyone's view point of what is
> important and not important.
> 
> I would rather take the viewpoint, if it is important to log it in a space limited NVRAM, then it is important enough not to overwrite
> until userspace explicitly asks it to be deleted.  Otherwise why log it, if it is not important?
> 

If the simple policy above is workable, it is easy. 
But we have to discuss whether it is useful in each specific use case.
       When I posted a patch introducing kernel parameter ,efi_pstore_overwrite,
       I thought same thing above. But I changed my mind while considering  Tony's comment....

When an user can read kmsg via /dev/pstore and erase old entries,  we don't need to care.
(Hopefully, some user space apps will be developed near future.)

Problem here is at very final stage and early stage which an user can't see /dev/pstore.

1) At very final stage  (system is panicking/rebooting.)

1-1) Kernel panics while system is rebooting(or oopsing)

When kernel panics while system is rebooting, panic message should be logged rather than skipping it.

Even though reboot message is overwritten by panic one, we will probably save both final part of 
reboot message and panic message as follows.

Example of kmsg in NVRAM
	<snip>
	Panic#1							 <- header supplied by pstore
	<6>kvm: exiting hardware virtualization
	<5>sd 0:0:0:0: [sda] Synchronizing SCSI cache
	 <0>Restarting system.						<- reboot message
	<0>BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:0]
	<0> Kernel panic - not syncing: softlockup: hung tasks		<- panic message
	<0>Pid: 0, comm: swapper/0 Not tainted 3.3.8 #4 Call Trace:
  	<0><IRQ>  [<ffffffff8136bdd5>] panic+0xb8/0x1c4
  	<0>[<ffffffff81071f37>] watchdog_timer_fn+0x139/0x15d
  	<0>[<ffffffff81071dfe>] ? __touch_watchdog+0x1f/0x1f
	<snip>

1-2) Double panic 
In this case, 1s panic message should not be overwritten to detect root cause of system failure.

1-3) ) Kernel reboots while system is panicking
Never happens because kmsg_dump in panic case is serialized via smp_send_stop()

2)  At very early stage  (system is booting up.)
2-1)Previous event is panic, and then panic happens again at boot time.
Previous panic should not be overwritten.

2-2)Previous event is reboot, and then panic happens at boot time
This depends on situation.
Some customer would like to have previous reboot message.
Others may want to get latter panic message.

So, in my current patch, I just decided a policy which error message is prioritized higher than normal message.

In the most case, an user can read/erase entries in NVRAM and get all messages.
I think it is understandable setting a policy in preparation for rare situation.

Seiji

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC][PATCH 2/2] write callback: Check if existing entry is erasable
  2012-07-05 20:05   ` Seiji Aguchi
@ 2012-07-05 20:24     ` Don Zickus
  2012-07-05 20:58       ` Seiji Aguchi
  0 siblings, 1 reply; 12+ messages in thread
From: Don Zickus @ 2012-07-05 20:24 UTC (permalink / raw)
  To: Seiji Aguchi
  Cc: linux-kernel, Luck, Tony (tony.luck@intel.com),
	mikew, Matthew Garrett (mjg@redhat.com),
	dle-develop, Satoru Moriya

On Thu, Jul 05, 2012 at 08:05:06PM +0000, Seiji Aguchi wrote:
> Don,
> 
> Thank you for giving me your comments.
> Let me explain what I'm thinking now.
> 
> > I would rather see no records overwritten and just make sure there is enough space for a dozen or so records to buffer multiple
> > panics before userspace can run.
> > 
> > Implementing policy like this in the kernel seems like it would be a constant battle between everyone's view point of what is
> > important and not important.
> > 
> > I would rather take the viewpoint, if it is important to log it in a space limited NVRAM, then it is important enough not to overwrite
> > until userspace explicitly asks it to be deleted.  Otherwise why log it, if it is not important?
> > 
> 
> If the simple policy above is workable, it is easy. 
> But we have to discuss whether it is useful in each specific use case.
>        When I posted a patch introducing kernel parameter ,efi_pstore_overwrite,
>        I thought same thing above. But I changed my mind while considering  Tony's comment....
> 
> When an user can read kmsg via /dev/pstore and erase old entries,  we don't need to care.
> (Hopefully, some user space apps will be developed near future.)
> 
> Problem here is at very final stage and early stage which an user can't see /dev/pstore.
> 
> 1) At very final stage  (system is panicking/rebooting.)
> 
> 1-1) Kernel panics while system is rebooting(or oopsing)
> 
> When kernel panics while system is rebooting, panic message should be logged rather than skipping it.
> 
> Even though reboot message is overwritten by panic one, we will probably save both final part of 
> reboot message and panic message as follows.
> 
> Example of kmsg in NVRAM
> 	<snip>
> 	Panic#1							 <- header supplied by pstore
> 	<6>kvm: exiting hardware virtualization
> 	<5>sd 0:0:0:0: [sda] Synchronizing SCSI cache
> 	 <0>Restarting system.						<- reboot message
> 	<0>BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:0]
> 	<0> Kernel panic - not syncing: softlockup: hung tasks		<- panic message
> 	<0>Pid: 0, comm: swapper/0 Not tainted 3.3.8 #4 Call Trace:
>   	<0><IRQ>  [<ffffffff8136bdd5>] panic+0xb8/0x1c4
>   	<0>[<ffffffff81071f37>] watchdog_timer_fn+0x139/0x15d
>   	<0>[<ffffffff81071dfe>] ? __touch_watchdog+0x1f/0x1f
> 	<snip>
> 
> 1-2) Double panic 
> In this case, 1s panic message should not be overwritten to detect root cause of system failure.
> 
> 1-3) ) Kernel reboots while system is panicking
> Never happens because kmsg_dump in panic case is serialized via smp_send_stop()
> 
> 2)  At very early stage  (system is booting up.)
> 2-1)Previous event is panic, and then panic happens again at boot time.
> Previous panic should not be overwritten.
> 
> 2-2)Previous event is reboot, and then panic happens at boot time
> This depends on situation.
> Some customer would like to have previous reboot message.
> Others may want to get latter panic message.
> 
> So, in my current patch, I just decided a policy which error message is prioritized higher than normal message.

I understand the above scenario and was the one of was thinking of when I
replied.  My counter argument is if the NVRAM isn't big enough to hold
more than two panics, then the logs are too big.

This stuff should be designed to easily accomodate multiple logs (like
say 6 or so), then the above situation doesn't matter.

I just feel this is adding complexity to something that shouldn't need it.

Cheers,
Don

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [RFC][PATCH 2/2] write callback: Check if existing entry is erasable
  2012-07-05 20:24     ` Don Zickus
@ 2012-07-05 20:58       ` Seiji Aguchi
  2012-07-05 23:12         ` Luck, Tony
  0 siblings, 1 reply; 12+ messages in thread
From: Seiji Aguchi @ 2012-07-05 20:58 UTC (permalink / raw)
  To: Don Zickus, Matthew Garrett (mjg@redhat.com)
  Cc: linux-kernel, Luck, Tony (tony.luck@intel.com),
	mikew, dle-develop, Satoru Moriya

> I understand the above scenario and was the one of was thinking of when I replied.  My counter argument is if the NVRAM isn't big
> enough to hold more than two panics, then the logs are too big.
> 
> This stuff should be designed to easily accomodate multiple logs (like say 6 or so), then the above situation doesn't matter.
> 
> I just feel this is adding complexity to something that shouldn't need it.
> 

I see. I understand your argument right  now.

My patch is on assumption that efi_pstore can handle just one log.
But if it can handle multiple logs, we can solve my concern simply.

Default value of pstore.kmsg_byte is 10KB.
I think efi_pstore can log 60KB(= 10KB X 6)...

Original code is developed by Matthew and he is EFI expert.
So, I have to ask Matthew why current write callback simply erases old entries.

Matthew,

Do you think efI_pstore should handle just one log?  Or do you think latest message is most important?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [RFC][PATCH 2/2] write callback: Check if existing entry is erasable
  2012-07-05 20:58       ` Seiji Aguchi
@ 2012-07-05 23:12         ` Luck, Tony
  2012-07-06  0:24           ` Seiji Aguchi
  0 siblings, 1 reply; 12+ messages in thread
From: Luck, Tony @ 2012-07-05 23:12 UTC (permalink / raw)
  To: Seiji Aguchi, Don Zickus, Matthew Garrett (mjg@redhat.com)
  Cc: linux-kernel, mikew, dle-develop, Satoru Moriya

> Default value of pstore.kmsg_byte is 10KB.
> I think efi_pstore can log 60KB(= 10KB X 6)...

10K is almost certainly more than we need for 99.9% of problems ... I set the
default there to test out that pstore would correctly break a dump into more
than one back-end ERST record (about 7K) and never changed it back. So
don't treat 10K with any magic reverence. It's easy to argue that a smaller
number is good enough.

There are certainly less over-write worries if you can handle a few
(4, 5, 6) simultaneously logged errors of sufficient size to be useful
(must capture all of the panic strings, backtrace and register dump
plus "enough" lines before the panic to see any obvious issues).

If you only get to store two errors, then perhaps one non-over writable
panic type entry, and one other "most recent" type entry?

With just one, like current EFI, then there are certainly hard choices
that might not be the best for certain pathological situations.

-Tony



^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [RFC][PATCH 2/2] write callback: Check if existing entry is erasable
  2012-07-05 23:12         ` Luck, Tony
@ 2012-07-06  0:24           ` Seiji Aguchi
  0 siblings, 0 replies; 12+ messages in thread
From: Seiji Aguchi @ 2012-07-06  0:24 UTC (permalink / raw)
  To: Luck, Tony, Don Zickus, Matthew Garrett (mjg@redhat.com)
  Cc: linux-kernel, mikew, dle-develop, Satoru Moriya

> 10K is almost certainly more than we need for 99.9% of problems ... I set the default there to test out that pstore would correctly
> break a dump into more than one back-end ERST record (about 7K) and never changed it back. So don't treat 10K with any magic
> reverence. It's easy to argue that a smaller number is good enough.

OK.

> There are certainly less over-write worries if you can handle a few (4, 5, 6) simultaneously logged errors of sufficient size to be useful
> (must capture all of the panic strings, backtrace and register dump plus "enough" lines before the panic to see any obvious issues).
>

To handle multiple logs, I will probably introduce a new kernel parameter, like efi_pstore_max_log_num.
Users can calculate  overall consumption of NVRAM for kmsg logging with it.
 
> If you only get to store two errors, then perhaps one non-over writable panic type entry, and one other "most recent" type entry?
> 
> With just one, like current EFI, then there are certainly hard choices that might not be the best for certain pathological situations.

I will consider a policy for multiple logging from now.
(In case where oops happens multiple times and kernel hangs, oldest oops may be informative.... I can't decide the policy at this time.)

Seiji



^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2012-07-06  0:24 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-03 23:35 [RFC][PATCH 2/2] write callback: Check if existing entry is erasable Seiji Aguchi
2012-07-03 23:58 ` Luck, Tony
2012-07-04  1:22   ` Seiji Aguchi
2012-07-04  1:02 ` Mike Waychison
2012-07-04  1:54   ` Seiji Aguchi
2012-07-05 13:32 ` Don Zickus
2012-07-05 20:03   ` Luck, Tony
2012-07-05 20:05   ` Seiji Aguchi
2012-07-05 20:24     ` Don Zickus
2012-07-05 20:58       ` Seiji Aguchi
2012-07-05 23:12         ` Luck, Tony
2012-07-06  0:24           ` Seiji Aguchi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).