All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/2] livepatch: Clean up of reliable stacktrace warnings
@ 2019-04-30  9:10 Petr Mladek
  2019-04-30  9:10 ` [PATCH v2 1/2] livepatch: Remove duplicate warning about missing reliable stacktrace support Petr Mladek
  2019-04-30  9:10 ` [PATCH v2 2/2] livepatch: Use static buffer for debugging messages under rq lock Petr Mladek
  0 siblings, 2 replies; 20+ messages in thread
From: Petr Mladek @ 2019-04-30  9:10 UTC (permalink / raw)
  To: Jiri Kosina, Josh Poimboeuf, Miroslav Benes
  Cc: Joe Lawrence, Kamalesh Babulal, live-patching, linux-kernel, Petr Mladek

This is the remaining piece from the original patchset[1]. It is
a simplification and split of the 3rd patch.

[1] https://lkml.kernel.org/r/20190424085550.29612-1-pmladek@suse.com


Petr Mladek (2):
  livepatch: Remove duplicate warning about missing reliable stacktrace
    support
  livepatch: Use static buffer for debugging messages under rq lock

 kernel/livepatch/transition.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

-- 
2.16.4


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 1/2] livepatch: Remove duplicate warning about missing reliable stacktrace support
  2019-04-30  9:10 [PATCH v2 0/2] livepatch: Clean up of reliable stacktrace warnings Petr Mladek
@ 2019-04-30  9:10 ` Petr Mladek
  2019-04-30 11:30   ` Miroslav Benes
                     ` (2 more replies)
  2019-04-30  9:10 ` [PATCH v2 2/2] livepatch: Use static buffer for debugging messages under rq lock Petr Mladek
  1 sibling, 3 replies; 20+ messages in thread
From: Petr Mladek @ 2019-04-30  9:10 UTC (permalink / raw)
  To: Jiri Kosina, Josh Poimboeuf, Miroslav Benes
  Cc: Joe Lawrence, Kamalesh Babulal, live-patching, linux-kernel, Petr Mladek

WARN_ON_ONCE() could not be called safely under rq lock because
of console deadlock issues. Fortunately, there is another check
for the reliable stacktrace support in klp_enable_patch().

Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 kernel/livepatch/transition.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index 9c89ae8b337a..8e0274075e75 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -263,8 +263,15 @@ static int klp_check_stack(struct task_struct *task, char *err_buf)
 	trace.nr_entries = 0;
 	trace.max_entries = MAX_STACK_ENTRIES;
 	trace.entries = entries;
+
 	ret = save_stack_trace_tsk_reliable(task, &trace);
-	WARN_ON_ONCE(ret == -ENOSYS);
+	/*
+	 * pr_warn() under task rq lock might cause a deadlock.
+	 * Fortunately, missing reliable stacktrace support has
+	 * already been handled when the livepatch was enabled.
+	 */
+	if (ret == -ENOSYS)
+		return ret;
 	if (ret) {
 		snprintf(err_buf, STACK_ERR_BUF_SIZE,
 			 "%s: %s:%d has an unreliable stack\n",
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 2/2] livepatch: Use static buffer for debugging messages under rq lock
  2019-04-30  9:10 [PATCH v2 0/2] livepatch: Clean up of reliable stacktrace warnings Petr Mladek
  2019-04-30  9:10 ` [PATCH v2 1/2] livepatch: Remove duplicate warning about missing reliable stacktrace support Petr Mladek
@ 2019-04-30  9:10 ` Petr Mladek
  2019-04-30 11:32   ` Miroslav Benes
                     ` (2 more replies)
  1 sibling, 3 replies; 20+ messages in thread
From: Petr Mladek @ 2019-04-30  9:10 UTC (permalink / raw)
  To: Jiri Kosina, Josh Poimboeuf, Miroslav Benes
  Cc: Joe Lawrence, Kamalesh Babulal, live-patching, linux-kernel, Petr Mladek

klp_try_switch_task() is called under klp_mutex. The buffer for
debugging messages might be static.

Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 kernel/livepatch/transition.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index 8e0274075e75..d66086fd6d75 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -304,11 +304,11 @@ static int klp_check_stack(struct task_struct *task, char *err_buf)
  */
 static bool klp_try_switch_task(struct task_struct *task)
 {
+	static char err_buf[STACK_ERR_BUF_SIZE];
 	struct rq *rq;
 	struct rq_flags flags;
 	int ret;
 	bool success = false;
-	char err_buf[STACK_ERR_BUF_SIZE];
 
 	err_buf[0] = '\0';
 
@@ -351,7 +351,6 @@ static bool klp_try_switch_task(struct task_struct *task)
 		pr_debug("%s", err_buf);
 
 	return success;
-
 }
 
 /*
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/2] livepatch: Remove duplicate warning about missing reliable stacktrace support
  2019-04-30  9:10 ` [PATCH v2 1/2] livepatch: Remove duplicate warning about missing reliable stacktrace support Petr Mladek
@ 2019-04-30 11:30   ` Miroslav Benes
  2019-04-30 11:56   ` Kamalesh Babulal
  2019-05-07  0:40   ` Josh Poimboeuf
  2 siblings, 0 replies; 20+ messages in thread
From: Miroslav Benes @ 2019-04-30 11:30 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Jiri Kosina, Josh Poimboeuf, Joe Lawrence, Kamalesh Babulal,
	live-patching, linux-kernel

On Tue, 30 Apr 2019, Petr Mladek wrote:

> WARN_ON_ONCE() could not be called safely under rq lock because
> of console deadlock issues. Fortunately, there is another check
> for the reliable stacktrace support in klp_enable_patch().
> 
> Signed-off-by: Petr Mladek <pmladek@suse.com>

Acked-by: Miroslav Benes <mbenes@suse.cz> with a nit below

> ---
>  kernel/livepatch/transition.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> index 9c89ae8b337a..8e0274075e75 100644
> --- a/kernel/livepatch/transition.c
> +++ b/kernel/livepatch/transition.c
> @@ -263,8 +263,15 @@ static int klp_check_stack(struct task_struct *task, char *err_buf)
>  	trace.nr_entries = 0;
>  	trace.max_entries = MAX_STACK_ENTRIES;
>  	trace.entries = entries;
> +

Unnecessary new line?

>  	ret = save_stack_trace_tsk_reliable(task, &trace);
> -	WARN_ON_ONCE(ret == -ENOSYS);
> +	/*
> +	 * pr_warn() under task rq lock might cause a deadlock.
> +	 * Fortunately, missing reliable stacktrace support has
> +	 * already been handled when the livepatch was enabled.
> +	 */
> +	if (ret == -ENOSYS)
> +		return ret;
>  	if (ret) {
>  		snprintf(err_buf, STACK_ERR_BUF_SIZE,
>  			 "%s: %s:%d has an unreliable stack\n",

Miroslav

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 2/2] livepatch: Use static buffer for debugging messages under rq lock
  2019-04-30  9:10 ` [PATCH v2 2/2] livepatch: Use static buffer for debugging messages under rq lock Petr Mladek
@ 2019-04-30 11:32   ` Miroslav Benes
  2019-04-30 12:30   ` Kamalesh Babulal
  2019-05-07  0:43   ` Josh Poimboeuf
  2 siblings, 0 replies; 20+ messages in thread
From: Miroslav Benes @ 2019-04-30 11:32 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Jiri Kosina, Josh Poimboeuf, Joe Lawrence, Kamalesh Babulal,
	live-patching, linux-kernel

On Tue, 30 Apr 2019, Petr Mladek wrote:

> klp_try_switch_task() is called under klp_mutex. The buffer for
> debugging messages might be static.
> 
> Signed-off-by: Petr Mladek <pmladek@suse.com>

Acked-by: Miroslav Benes <mbenes@suse.cz>

Miroslav

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/2] livepatch: Remove duplicate warning about missing reliable stacktrace support
  2019-04-30  9:10 ` [PATCH v2 1/2] livepatch: Remove duplicate warning about missing reliable stacktrace support Petr Mladek
  2019-04-30 11:30   ` Miroslav Benes
@ 2019-04-30 11:56   ` Kamalesh Babulal
  2019-05-07  0:40   ` Josh Poimboeuf
  2 siblings, 0 replies; 20+ messages in thread
From: Kamalesh Babulal @ 2019-04-30 11:56 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Jiri Kosina, Josh Poimboeuf, Miroslav Benes, Joe Lawrence,
	live-patching, linux-kernel

On Tue, Apr 30, 2019 at 11:10:48AM +0200, Petr Mladek wrote:
> WARN_ON_ONCE() could not be called safely under rq lock because
> of console deadlock issues. Fortunately, there is another check
> for the reliable stacktrace support in klp_enable_patch().
> 
> Signed-off-by: Petr Mladek <pmladek@suse.com>

Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 2/2] livepatch: Use static buffer for debugging messages under rq lock
  2019-04-30  9:10 ` [PATCH v2 2/2] livepatch: Use static buffer for debugging messages under rq lock Petr Mladek
  2019-04-30 11:32   ` Miroslav Benes
@ 2019-04-30 12:30   ` Kamalesh Babulal
  2019-05-07  0:43   ` Josh Poimboeuf
  2 siblings, 0 replies; 20+ messages in thread
From: Kamalesh Babulal @ 2019-04-30 12:30 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Jiri Kosina, Josh Poimboeuf, Miroslav Benes, Joe Lawrence,
	live-patching, linux-kernel

On Tue, Apr 30, 2019 at 11:10:49AM +0200, Petr Mladek wrote:
> klp_try_switch_task() is called under klp_mutex. The buffer for
> debugging messages might be static.
> 
> Signed-off-by: Petr Mladek <pmladek@suse.com>

Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/2] livepatch: Remove duplicate warning about missing reliable stacktrace support
  2019-04-30  9:10 ` [PATCH v2 1/2] livepatch: Remove duplicate warning about missing reliable stacktrace support Petr Mladek
  2019-04-30 11:30   ` Miroslav Benes
  2019-04-30 11:56   ` Kamalesh Babulal
@ 2019-05-07  0:40   ` Josh Poimboeuf
  2019-05-07  1:43     ` Josh Poimboeuf
  2019-05-07  8:58     ` Miroslav Benes
  2 siblings, 2 replies; 20+ messages in thread
From: Josh Poimboeuf @ 2019-05-07  0:40 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Jiri Kosina, Miroslav Benes, Joe Lawrence, Kamalesh Babulal,
	live-patching, linux-kernel

On Tue, Apr 30, 2019 at 11:10:48AM +0200, Petr Mladek wrote:
> WARN_ON_ONCE() could not be called safely under rq lock because
> of console deadlock issues. Fortunately, there is another check
> for the reliable stacktrace support in klp_enable_patch().
> 
> Signed-off-by: Petr Mladek <pmladek@suse.com>
> ---
>  kernel/livepatch/transition.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> index 9c89ae8b337a..8e0274075e75 100644
> --- a/kernel/livepatch/transition.c
> +++ b/kernel/livepatch/transition.c
> @@ -263,8 +263,15 @@ static int klp_check_stack(struct task_struct *task, char *err_buf)
>  	trace.nr_entries = 0;
>  	trace.max_entries = MAX_STACK_ENTRIES;
>  	trace.entries = entries;
> +
>  	ret = save_stack_trace_tsk_reliable(task, &trace);
> -	WARN_ON_ONCE(ret == -ENOSYS);
> +	/*
> +	 * pr_warn() under task rq lock might cause a deadlock.
> +	 * Fortunately, missing reliable stacktrace support has
> +	 * already been handled when the livepatch was enabled.
> +	 */
> +	if (ret == -ENOSYS)
> +		return ret;

I find the comment to be a bit wordy and confusing (and vague).

Also this check is effectively the same as the klp_have_reliable_stack()
check which is done in kernel/livepatch/core.c.  So I think it would be
clearer and more consistent if the same check is done here:

	if (!klp_have_reliable_stack())
		return -ENOSYS;

	ret = save_stack_trace_tsk_reliable(task, &trace);

	[ no need to check ret for ENOSYS here ]

Then, IMO, no comment is needed.

-- 
Josh

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 2/2] livepatch: Use static buffer for debugging messages under rq lock
  2019-04-30  9:10 ` [PATCH v2 2/2] livepatch: Use static buffer for debugging messages under rq lock Petr Mladek
  2019-04-30 11:32   ` Miroslav Benes
  2019-04-30 12:30   ` Kamalesh Babulal
@ 2019-05-07  0:43   ` Josh Poimboeuf
  2019-05-07 11:50     ` Petr Mladek
  2 siblings, 1 reply; 20+ messages in thread
From: Josh Poimboeuf @ 2019-05-07  0:43 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Jiri Kosina, Miroslav Benes, Joe Lawrence, Kamalesh Babulal,
	live-patching, linux-kernel

On Tue, Apr 30, 2019 at 11:10:49AM +0200, Petr Mladek wrote:
> klp_try_switch_task() is called under klp_mutex. The buffer for
> debugging messages might be static.

The patch description is missing a "why" (presumably to reduce stack
usage).

> 
> Signed-off-by: Petr Mladek <pmladek@suse.com>
> ---
>  kernel/livepatch/transition.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> index 8e0274075e75..d66086fd6d75 100644
> --- a/kernel/livepatch/transition.c
> +++ b/kernel/livepatch/transition.c
> @@ -304,11 +304,11 @@ static int klp_check_stack(struct task_struct *task, char *err_buf)
>   */
>  static bool klp_try_switch_task(struct task_struct *task)
>  {
> +	static char err_buf[STACK_ERR_BUF_SIZE];
>  	struct rq *rq;
>  	struct rq_flags flags;
>  	int ret;
>  	bool success = false;
> -	char err_buf[STACK_ERR_BUF_SIZE];
>  
>  	err_buf[0] = '\0';
>  
> @@ -351,7 +351,6 @@ static bool klp_try_switch_task(struct task_struct *task)
>  		pr_debug("%s", err_buf);
>  
>  	return success;
> -
>  }
>  
>  /*
> -- 
> 2.16.4
> 

-- 
Josh

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/2] livepatch: Remove duplicate warning about missing reliable stacktrace support
  2019-05-07  0:40   ` Josh Poimboeuf
@ 2019-05-07  1:43     ` Josh Poimboeuf
  2019-05-07 11:29       ` Petr Mladek
  2019-05-07  8:58     ` Miroslav Benes
  1 sibling, 1 reply; 20+ messages in thread
From: Josh Poimboeuf @ 2019-05-07  1:43 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Jiri Kosina, Miroslav Benes, Joe Lawrence, Kamalesh Babulal,
	live-patching, linux-kernel

On Mon, May 06, 2019 at 07:40:32PM -0500, Josh Poimboeuf wrote:
> On Tue, Apr 30, 2019 at 11:10:48AM +0200, Petr Mladek wrote:
> > WARN_ON_ONCE() could not be called safely under rq lock because
> > of console deadlock issues. Fortunately, there is another check
> > for the reliable stacktrace support in klp_enable_patch().
> > 
> > Signed-off-by: Petr Mladek <pmladek@suse.com>
> > ---
> >  kernel/livepatch/transition.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> > index 9c89ae8b337a..8e0274075e75 100644
> > --- a/kernel/livepatch/transition.c
> > +++ b/kernel/livepatch/transition.c
> > @@ -263,8 +263,15 @@ static int klp_check_stack(struct task_struct *task, char *err_buf)
> >  	trace.nr_entries = 0;
> >  	trace.max_entries = MAX_STACK_ENTRIES;
> >  	trace.entries = entries;
> > +
> >  	ret = save_stack_trace_tsk_reliable(task, &trace);
> > -	WARN_ON_ONCE(ret == -ENOSYS);
> > +	/*
> > +	 * pr_warn() under task rq lock might cause a deadlock.
> > +	 * Fortunately, missing reliable stacktrace support has
> > +	 * already been handled when the livepatch was enabled.
> > +	 */
> > +	if (ret == -ENOSYS)
> > +		return ret;
> 
> I find the comment to be a bit wordy and confusing (and vague).
> 
> Also this check is effectively the same as the klp_have_reliable_stack()
> check which is done in kernel/livepatch/core.c.  So I think it would be
> clearer and more consistent if the same check is done here:
> 
> 	if (!klp_have_reliable_stack())
> 		return -ENOSYS;
> 
> 	ret = save_stack_trace_tsk_reliable(task, &trace);
> 
> 	[ no need to check ret for ENOSYS here ]
> 
> Then, IMO, no comment is needed.

BTW, if you agree with this approach then we can leave the
WARN_ON_ONCE() in save_stack_trace_tsk_reliable() after all.

-- 
Josh

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/2] livepatch: Remove duplicate warning about missing reliable stacktrace support
  2019-05-07  0:40   ` Josh Poimboeuf
  2019-05-07  1:43     ` Josh Poimboeuf
@ 2019-05-07  8:58     ` Miroslav Benes
  1 sibling, 0 replies; 20+ messages in thread
From: Miroslav Benes @ 2019-05-07  8:58 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Petr Mladek, Jiri Kosina, Joe Lawrence, Kamalesh Babulal,
	live-patching, linux-kernel

On Mon, 6 May 2019, Josh Poimboeuf wrote:

> On Tue, Apr 30, 2019 at 11:10:48AM +0200, Petr Mladek wrote:
> > WARN_ON_ONCE() could not be called safely under rq lock because
> > of console deadlock issues. Fortunately, there is another check
> > for the reliable stacktrace support in klp_enable_patch().
> > 
> > Signed-off-by: Petr Mladek <pmladek@suse.com>
> > ---
> >  kernel/livepatch/transition.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> > index 9c89ae8b337a..8e0274075e75 100644
> > --- a/kernel/livepatch/transition.c
> > +++ b/kernel/livepatch/transition.c
> > @@ -263,8 +263,15 @@ static int klp_check_stack(struct task_struct *task, char *err_buf)
> >  	trace.nr_entries = 0;
> >  	trace.max_entries = MAX_STACK_ENTRIES;
> >  	trace.entries = entries;
> > +
> >  	ret = save_stack_trace_tsk_reliable(task, &trace);
> > -	WARN_ON_ONCE(ret == -ENOSYS);
> > +	/*
> > +	 * pr_warn() under task rq lock might cause a deadlock.
> > +	 * Fortunately, missing reliable stacktrace support has
> > +	 * already been handled when the livepatch was enabled.
> > +	 */
> > +	if (ret == -ENOSYS)
> > +		return ret;
> 
> I find the comment to be a bit wordy and confusing (and vague).
> 
> Also this check is effectively the same as the klp_have_reliable_stack()
> check which is done in kernel/livepatch/core.c.  So I think it would be
> clearer and more consistent if the same check is done here:
> 
> 	if (!klp_have_reliable_stack())
> 		return -ENOSYS;

We removed it in 1d98a69e5cef ("livepatch: Remove reliable stacktrace 
check in klp_try_switch_task()") and I do think it does not belong here. 
We can check for the facility right at the beginning in klp_enable_patch() 
and it is not necessary to do it every time klp_check_stack() is called.

But it is nothing I could not live with, so if you decide it is better, I 
will not object.

Miroslav

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/2] livepatch: Remove duplicate warning about missing reliable stacktrace support
  2019-05-07  1:43     ` Josh Poimboeuf
@ 2019-05-07 11:29       ` Petr Mladek
  2019-05-07 12:03         ` Josh Poimboeuf
  0 siblings, 1 reply; 20+ messages in thread
From: Petr Mladek @ 2019-05-07 11:29 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Jiri Kosina, Miroslav Benes, Joe Lawrence, Kamalesh Babulal,
	live-patching, linux-kernel

On Mon 2019-05-06 20:43:32, Josh Poimboeuf wrote:
> On Mon, May 06, 2019 at 07:40:32PM -0500, Josh Poimboeuf wrote:
> > On Tue, Apr 30, 2019 at 11:10:48AM +0200, Petr Mladek wrote:
> > > WARN_ON_ONCE() could not be called safely under rq lock because
> > > of console deadlock issues. Fortunately, there is another check
> > > for the reliable stacktrace support in klp_enable_patch().
> > > 
> > > Signed-off-by: Petr Mladek <pmladek@suse.com>
> > > ---
> > >  kernel/livepatch/transition.c | 9 ++++++++-
> > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> > > index 9c89ae8b337a..8e0274075e75 100644
> > > --- a/kernel/livepatch/transition.c
> > > +++ b/kernel/livepatch/transition.c
> > > @@ -263,8 +263,15 @@ static int klp_check_stack(struct task_struct *task, char *err_buf)
> > >  	trace.nr_entries = 0;
> > >  	trace.max_entries = MAX_STACK_ENTRIES;
> > >  	trace.entries = entries;
> > > +
> > >  	ret = save_stack_trace_tsk_reliable(task, &trace);
> > > -	WARN_ON_ONCE(ret == -ENOSYS);
> > > +	/*
> > > +	 * pr_warn() under task rq lock might cause a deadlock.
> > > +	 * Fortunately, missing reliable stacktrace support has
> > > +	 * already been handled when the livepatch was enabled.
> > > +	 */
> > > +	if (ret == -ENOSYS)
> > > +		return ret;
> > 
> > I find the comment to be a bit wordy and confusing (and vague).

Then please provide a better one. I have no idea what might make
you happy and am not interested into an endless disputing.

> > Also this check is effectively the same as the klp_have_reliable_stack()
> > check which is done in kernel/livepatch/core.c.  So I think it would be
> > clearer and more consistent if the same check is done here:
> > 
> > 	if (!klp_have_reliable_stack())
> > 		return -ENOSYS;

Huh, it smells with over engineering to me.

> > 	ret = save_stack_trace_tsk_reliable(task, &trace);
> > 
> > 	[ no need to check ret for ENOSYS here ]
> > 
> > Then, IMO, no comment is needed.
> 
> BTW, if you agree with this approach then we can leave the
> WARN_ON_ONCE() in save_stack_trace_tsk_reliable() after all.

I really like the removal of the WARN_ON_ONCE(). I consider
it an old fashioned way used when people are lazy to handle
errors. It might make sense when the backtrace helps to locate
the context but the context is well known here. Finally,
WARN() should be used with care. It might cause reboot
with panic_on_warn.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 2/2] livepatch: Use static buffer for debugging messages under rq lock
  2019-05-07  0:43   ` Josh Poimboeuf
@ 2019-05-07 11:50     ` Petr Mladek
  2019-05-07 13:19       ` Josh Poimboeuf
  0 siblings, 1 reply; 20+ messages in thread
From: Petr Mladek @ 2019-05-07 11:50 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Jiri Kosina, Miroslav Benes, Joe Lawrence, Kamalesh Babulal,
	live-patching, linux-kernel

On Mon 2019-05-06 19:43:19, Josh Poimboeuf wrote:
> On Tue, Apr 30, 2019 at 11:10:49AM +0200, Petr Mladek wrote:
> > klp_try_switch_task() is called under klp_mutex. The buffer for
> > debugging messages might be static.
> 
> The patch description is missing a "why" (presumably to reduce stack
> usage).

Exactly. I thought that it was obvious. But I am infected by printk
code where line buffers are 1k and nobody wants them on the stack.

128bytes in klp_try_switch_task() context are acceptable but
it is still rather big buffer.

OK, what about the following commit message?

"klp_try_switch_task() is called under klp_mutex. The buffer for
debugging messages might be static to reduce stack usage."

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/2] livepatch: Remove duplicate warning about missing reliable stacktrace support
  2019-05-07 11:29       ` Petr Mladek
@ 2019-05-07 12:03         ` Josh Poimboeuf
  2019-05-07 14:28           ` Petr Mladek
  0 siblings, 1 reply; 20+ messages in thread
From: Josh Poimboeuf @ 2019-05-07 12:03 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Jiri Kosina, Miroslav Benes, Joe Lawrence, Kamalesh Babulal,
	live-patching, linux-kernel

On Tue, May 07, 2019 at 01:29:50PM +0200, Petr Mladek wrote:
> On Mon 2019-05-06 20:43:32, Josh Poimboeuf wrote:
> > On Mon, May 06, 2019 at 07:40:32PM -0500, Josh Poimboeuf wrote:
> > > On Tue, Apr 30, 2019 at 11:10:48AM +0200, Petr Mladek wrote:
> > > > WARN_ON_ONCE() could not be called safely under rq lock because
> > > > of console deadlock issues. Fortunately, there is another check
> > > > for the reliable stacktrace support in klp_enable_patch().
> > > > 
> > > > Signed-off-by: Petr Mladek <pmladek@suse.com>
> > > > ---
> > > >  kernel/livepatch/transition.c | 9 ++++++++-
> > > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> > > > index 9c89ae8b337a..8e0274075e75 100644
> > > > --- a/kernel/livepatch/transition.c
> > > > +++ b/kernel/livepatch/transition.c
> > > > @@ -263,8 +263,15 @@ static int klp_check_stack(struct task_struct *task, char *err_buf)
> > > >  	trace.nr_entries = 0;
> > > >  	trace.max_entries = MAX_STACK_ENTRIES;
> > > >  	trace.entries = entries;
> > > > +
> > > >  	ret = save_stack_trace_tsk_reliable(task, &trace);
> > > > -	WARN_ON_ONCE(ret == -ENOSYS);
> > > > +	/*
> > > > +	 * pr_warn() under task rq lock might cause a deadlock.
> > > > +	 * Fortunately, missing reliable stacktrace support has
> > > > +	 * already been handled when the livepatch was enabled.
> > > > +	 */
> > > > +	if (ret == -ENOSYS)
> > > > +		return ret;
> > > 
> > > I find the comment to be a bit wordy and confusing (and vague).
> 
> Then please provide a better one. I have no idea what might make
> you happy and am not interested into an endless disputing.

Something like this would be clearer:

	if (ret == -ENOSYS) {
		/*
		 * This arch doesn't support reliable stack tracing.  No
		 * need to print a warning; that has already been done
		 * by klp_enable_patch().
		 */
		return ret;
	}

But my next point was that changing the code would be even better than
fixing the comment.

> > > Also this check is effectively the same as the klp_have_reliable_stack()
> > > check which is done in kernel/livepatch/core.c.  So I think it would be
> > > clearer and more consistent if the same check is done here:
> > > 
> > > 	if (!klp_have_reliable_stack())
> > > 		return -ENOSYS;
> 
> Huh, it smells with over engineering to me.

How so?  It makes the code more readable and the generated code should
be much better because it becomes a build-time check.

But I think Miroslav's suggestion to revert 1d98a69e5cef would be even
better.

> > > 	ret = save_stack_trace_tsk_reliable(task, &trace);
> > > 
> > > 	[ no need to check ret for ENOSYS here ]
> > > 
> > > Then, IMO, no comment is needed.
> > 
> > BTW, if you agree with this approach then we can leave the
> > WARN_ON_ONCE() in save_stack_trace_tsk_reliable() after all.
> 
> I really like the removal of the WARN_ON_ONCE(). I consider
> it an old fashioned way used when people are lazy to handle
> errors. It might make sense when the backtrace helps to locate
> the context but the context is well known here. Finally,
> WARN() should be used with care. It might cause reboot
> with panic_on_warn.

The warning makes the function consistent with the other weak functions
in stacktrace.c and clarifies that it should never be called unless an
arch has misconfigured something.  And if we aren't even checking the
specific ENOSYS error as I proposed then this warning would make the
error more obvious.

-- 
Josh

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 2/2] livepatch: Use static buffer for debugging messages under rq lock
  2019-05-07 11:50     ` Petr Mladek
@ 2019-05-07 13:19       ` Josh Poimboeuf
  0 siblings, 0 replies; 20+ messages in thread
From: Josh Poimboeuf @ 2019-05-07 13:19 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Jiri Kosina, Miroslav Benes, Joe Lawrence, Kamalesh Babulal,
	live-patching, linux-kernel

On Tue, May 07, 2019 at 01:50:29PM +0200, Petr Mladek wrote:
> On Mon 2019-05-06 19:43:19, Josh Poimboeuf wrote:
> > On Tue, Apr 30, 2019 at 11:10:49AM +0200, Petr Mladek wrote:
> > > klp_try_switch_task() is called under klp_mutex. The buffer for
> > > debugging messages might be static.
> > 
> > The patch description is missing a "why" (presumably to reduce stack
> > usage).
> 
> Exactly. I thought that it was obvious. But I am infected by printk
> code where line buffers are 1k and nobody wants them on the stack.
> 
> 128bytes in klp_try_switch_task() context are acceptable but
> it is still rather big buffer.
> 
> OK, what about the following commit message?
> 
> "klp_try_switch_task() is called under klp_mutex. The buffer for
> debugging messages might be static to reduce stack usage."

It's better to use imperative language.  It would also be good to
reverse the order of the wording by starting with the problem.

Something like:

The err_buf array uses 128 bytes of stack space.  Move it off the stack
by making it static.  It's safe to use a shared buffer because
klp_try_switch_task() is called under klp_mutex.

-- 
Josh

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/2] livepatch: Remove duplicate warning about missing reliable stacktrace support
  2019-05-07 12:03         ` Josh Poimboeuf
@ 2019-05-07 14:28           ` Petr Mladek
  2019-05-07 21:24             ` Josh Poimboeuf
  0 siblings, 1 reply; 20+ messages in thread
From: Petr Mladek @ 2019-05-07 14:28 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Jiri Kosina, Miroslav Benes, Joe Lawrence, Kamalesh Babulal,
	live-patching, linux-kernel

On Tue 2019-05-07 07:03:50, Josh Poimboeuf wrote:
> On Tue, May 07, 2019 at 01:29:50PM +0200, Petr Mladek wrote:
> > On Mon 2019-05-06 20:43:32, Josh Poimboeuf wrote:
> > > On Mon, May 06, 2019 at 07:40:32PM -0500, Josh Poimboeuf wrote:
> > > > On Tue, Apr 30, 2019 at 11:10:48AM +0200, Petr Mladek wrote:
> > > > > --- a/kernel/livepatch/transition.c
> > > > > +++ b/kernel/livepatch/transition.c
> > > > > @@ -263,8 +263,15 @@ static int klp_check_stack(struct task_struct *task, char *err_buf)
> > > > >  	trace.nr_entries = 0;
> > > > >  	trace.max_entries = MAX_STACK_ENTRIES;
> > > > >  	trace.entries = entries;
> > > > > +
> > > > >  	ret = save_stack_trace_tsk_reliable(task, &trace);
> > > > > -	WARN_ON_ONCE(ret == -ENOSYS);
> > > > > +	/*
> > > > > +	 * pr_warn() under task rq lock might cause a deadlock.
> > > > > +	 * Fortunately, missing reliable stacktrace support has
> > > > > +	 * already been handled when the livepatch was enabled.
> > > > > +	 */
> > > > > +	if (ret == -ENOSYS)
> > > > > +		return ret;
> > > > 
> > > > I find the comment to be a bit wordy and confusing (and vague).
> > 
> > Then please provide a better one. I have no idea what might make
> > you happy and am not interested into an endless disputing.
>
> Something like this would be clearer:
> 
> 	if (ret == -ENOSYS) {
> 		/*
> 		 * This arch doesn't support reliable stack tracing.  No
> 		 * need to print a warning; that has already been done
> 		 * by klp_enable_patch().
> 		 */
> 		return ret;
> 	}

I do not mind.

> > > > Also this check is effectively the same as the klp_have_reliable_stack()
> > > > check which is done in kernel/livepatch/core.c.  So I think it would be
> > > > clearer and more consistent if the same check is done here:
> > > > 
> > > > 	if (!klp_have_reliable_stack())
> > > > 		return -ENOSYS;
> > 
> > Huh, it smells with over engineering to me.
> 
> How so?  It makes the code more readable and the generated code should
> be much better because it becomes a build-time check.

save_stack_trace_tsk_reliable() returns various error codes.
We catch a specific one because otherwise the message below
might be misleading.

I do not see why we should prevent this error by calling
a custom hack: klp_have_reliable_stack()?

Regarding reliability. If anyone changes semantic of
save_stack_trace_tsk_reliable() error codes, they would likely
check if all users (one at the moment) handle it correctly.

On the other hand, the dependency between the -ENOSYS
return value and klp_have_reliable_stack() is far from
obvious.

If we want to discuss and fix this to the death. We should change
the return value from -ENOSYS to -EOPNOTSUPP. The reason
is the same as in the commit 375bfca3459db1c5596
("livepatch: core: Return EOPNOTSUPP instead of ENOSYS").

Note that EOPNOTSUPP is the same errno as ENOTSUP, see
man 3 errno.


> But I think Miroslav's suggestion to revert 1d98a69e5cef would be even
> better.

AFAIK, Miroslav wanted to point out that your opinion was inconsistent.


> > > > 	ret = save_stack_trace_tsk_reliable(task, &trace);
> > > > 
> > > > 	[ no need to check ret for ENOSYS here ]
> > > > 
> > > > Then, IMO, no comment is needed.
> > > 
> > > BTW, if you agree with this approach then we can leave the
> > > WARN_ON_ONCE() in save_stack_trace_tsk_reliable() after all.
> > 
> > I really like the removal of the WARN_ON_ONCE(). I consider
> > it an old fashioned way used when people are lazy to handle
> > errors. It might make sense when the backtrace helps to locate
> > the context but the context is well known here. Finally,
> > WARN() should be used with care. It might cause reboot
> > with panic_on_warn.
> 
> The warning makes the function consistent with the other weak functions
> in stacktrace.c and clarifies that it should never be called unless an
> arch has misconfigured something.  And if we aren't even checking the
> specific ENOSYS error as I proposed then this warning would make the
> error more obvious.

I consider both WARN() and error value as superfluous. I like the
error value because it allows users to handle the situation as
they need it.

Best Regards,
Petr

PS: This is my last mail in the thread this week. I will eventually
return to it with a clear head next week. It is all nitpicking from
my POV and I have better things to do.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/2] livepatch: Remove duplicate warning about missing reliable stacktrace support
  2019-05-07 14:28           ` Petr Mladek
@ 2019-05-07 21:24             ` Josh Poimboeuf
  2019-05-09  7:24               ` Miroslav Benes
  2019-05-13 10:43               ` Petr Mladek
  0 siblings, 2 replies; 20+ messages in thread
From: Josh Poimboeuf @ 2019-05-07 21:24 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Jiri Kosina, Miroslav Benes, Joe Lawrence, Kamalesh Babulal,
	live-patching, linux-kernel

On Tue, May 07, 2019 at 04:28:47PM +0200, Petr Mladek wrote:
> > > > > Also this check is effectively the same as the klp_have_reliable_stack()
> > > > > check which is done in kernel/livepatch/core.c.  So I think it would be
> > > > > clearer and more consistent if the same check is done here:
> > > > > 
> > > > > 	if (!klp_have_reliable_stack())
> > > > > 		return -ENOSYS;
> > > 
> > > Huh, it smells with over engineering to me.
> > 
> > How so?  It makes the code more readable and the generated code should
> > be much better because it becomes a build-time check.
> 
> save_stack_trace_tsk_reliable() returns various error codes.
> We catch a specific one because otherwise the message below
> might be misleading.
> 
> I do not see why we should prevent this error by calling
> a custom hack: klp_have_reliable_stack()?

I wouldn't call it a hack.  It's a simple build-time check.

> Regarding reliability. If anyone changes semantic of
> save_stack_trace_tsk_reliable() error codes, they would likely
> check if all users (one at the moment) handle it correctly.
> 
> On the other hand, the dependency between the -ENOSYS
> return value and klp_have_reliable_stack() is far from
> obvious.

I don't follow your point.

> If we want to discuss and fix this to the death. We should change
> the return value from -ENOSYS to -EOPNOTSUPP. The reason
> is the same as in the commit 375bfca3459db1c5596
> ("livepatch: core: Return EOPNOTSUPP instead of ENOSYS").
> 
> Note that EOPNOTSUPP is the same errno as ENOTSUP, see
> man 3 errno.

Sure, but that's a separate issue.

> > But I think Miroslav's suggestion to revert 1d98a69e5cef would be even
> > better.
> 
> AFAIK, Miroslav wanted to point out that your opinion was inconsistent.

How is my opinion inconsistent?

Is there something wrong with the original approach, which was reverted
with

  1d98a69e5cef ("livepatch: Remove reliable stacktrace check in klp_try_switch_task()")

?

As I mentioned, that has some advantages:

- The generated code is simpler/faster because it uses a build-time
  check.

- The code is more readable IMO.  Especially if the check is done higher
  up the call stack by reverting 1d98a69e5cef.  That way the arch
  support short-circuiting is done in the logical place, before doing
  any more unnecessary work.  It's faster, but also, more importantly,
  it's less surprising.

- It's also more consistent with other code, since the intent of this
  check is the same as the klp_have_reliable_stack() use in
  klp_enabled_patch().

If you disagree with those, please explain why.

> > > > > 	ret = save_stack_trace_tsk_reliable(task, &trace);
> > > > > 
> > > > > 	[ no need to check ret for ENOSYS here ]
> > > > > 
> > > > > Then, IMO, no comment is needed.
> > > > 
> > > > BTW, if you agree with this approach then we can leave the
> > > > WARN_ON_ONCE() in save_stack_trace_tsk_reliable() after all.
> > > 
> > > I really like the removal of the WARN_ON_ONCE(). I consider
> > > it an old fashioned way used when people are lazy to handle
> > > errors. It might make sense when the backtrace helps to locate
> > > the context but the context is well known here. Finally,
> > > WARN() should be used with care. It might cause reboot
> > > with panic_on_warn.
> > 
> > The warning makes the function consistent with the other weak functions
> > in stacktrace.c and clarifies that it should never be called unless an
> > arch has misconfigured something.  And if we aren't even checking the
> > specific ENOSYS error as I proposed then this warning would make the
> > error more obvious.
> 
> I consider both WARN() and error value as superfluous. I like the
> error value because it allows users to handle the situation as
> they need it.

... but if there are no users of the weak function then the point is
moot.

> Best Regards,
> Petr
> 
> PS: This is my last mail in the thread this week. I will eventually
> return to it with a clear head next week. It is all nitpicking from
> my POV and I have better things to do.

I don't think it's helpful to characterize it as nitpicking.  The
details of the code are important.

-- 
Josh

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/2] livepatch: Remove duplicate warning about missing reliable stacktrace support
  2019-05-07 21:24             ` Josh Poimboeuf
@ 2019-05-09  7:24               ` Miroslav Benes
  2019-05-13 10:43               ` Petr Mladek
  1 sibling, 0 replies; 20+ messages in thread
From: Miroslav Benes @ 2019-05-09  7:24 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Petr Mladek, Jiri Kosina, Joe Lawrence, Kamalesh Babulal,
	live-patching, linux-kernel


> > > But I think Miroslav's suggestion to revert 1d98a69e5cef would be even
> > > better.
> > 
> > AFAIK, Miroslav wanted to point out that your opinion was inconsistent.
> 
> How is my opinion inconsistent?
> 
> Is there something wrong with the original approach, which was reverted
> with
> 
>   1d98a69e5cef ("livepatch: Remove reliable stacktrace check in klp_try_switch_task()")
> 
> ?
> 
> As I mentioned, that has some advantages:
> 
> - The generated code is simpler/faster because it uses a build-time
>   check.
> 
> - The code is more readable IMO.  Especially if the check is done higher
>   up the call stack by reverting 1d98a69e5cef.  That way the arch
>   support short-circuiting is done in the logical place, before doing
>   any more unnecessary work.  It's faster, but also, more importantly,
>   it's less surprising.

Correct. I forgot we removed return from klp_enable_patch() if 
klp_have_reliable_stack() errors out and we only warn now. So reverting 
1d98a69e5cef definitely makes sense.

My...

"We removed it in 1d98a69e5cef ("livepatch: Remove reliable stacktrace 
check in klp_try_switch_task()") and I do think it does not belong here. 
We can check for the facility right at the beginning in klp_enable_patch() 
and it is not necessary to do it every time klp_check_stack() is called." 

...from the other email is rubbish then.

Miroslav

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/2] livepatch: Remove duplicate warning about missing reliable stacktrace support
  2019-05-07 21:24             ` Josh Poimboeuf
  2019-05-09  7:24               ` Miroslav Benes
@ 2019-05-13 10:43               ` Petr Mladek
  2019-05-13 16:46                 ` Josh Poimboeuf
  1 sibling, 1 reply; 20+ messages in thread
From: Petr Mladek @ 2019-05-13 10:43 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Jiri Kosina, Miroslav Benes, Joe Lawrence, Kamalesh Babulal,
	live-patching, linux-kernel

On Tue 2019-05-07 16:24:25, Josh Poimboeuf wrote:
> On Tue, May 07, 2019 at 04:28:47PM +0200, Petr Mladek wrote:
> > > > > > Also this check is effectively the same as the klp_have_reliable_stack()
> > > > > > check which is done in kernel/livepatch/core.c.  So I think it would be
> > > > > > clearer and more consistent if the same check is done here:
> > > > > > 
> > > > > > 	if (!klp_have_reliable_stack())
> > > > > > 		return -ENOSYS;
> > > > 
> > > > Huh, it smells with over engineering to me.
> > > 
> > > How so?  It makes the code more readable and the generated code should
> > > be much better because it becomes a build-time check.
> > 
> > save_stack_trace_tsk_reliable() returns various error codes.
> > We catch a specific one because otherwise the message below
> > might be misleading.
> > 
> > I do not see why we should prevent this error by calling
> > a custom hack: klp_have_reliable_stack()?
> 
> I wouldn't call it a hack.  It's a simple build-time check.
> 
> > Regarding reliability. If anyone changes semantic of
> > save_stack_trace_tsk_reliable() error codes, they would likely
> > check if all users (one at the moment) handle it correctly.
> > 
> > On the other hand, the dependency between the -ENOSYS
> > return value and klp_have_reliable_stack() is far from
> > obvious.
> 
> I don't follow your point.

We implement klp_have_reliable_stack() in livepatch subsystem.
It checks config options that defines behavior of the
stacktrace subsystem.

We use the above livepatch-specific function to warn about
that a function from stacktrace subsustem will not work.
You even suggest to use it to ignore result from
the stacktrace subsystem.

OK, using klp_have_reliable_stack() on both locations
would keep it consistent.

My point is that the check itself is not reliable because
it is "hard" to maintain.

Instead, I suggest to remove klp_have_reliable_stack() and use
the following in klp_enable_patch().


	if (stack_trace_save_tsk_reliable(current, NULL, 0) == -ENOSYS) {
		pr_warn("This architecture doesn't have support for the livepatch consistency model.\n");
		pr_warn("The livepatch transition may never complete.\n");
	}

Also I suggest to remove the check in klp_check_stack() completely.
We will always print that the stack is not reliable but only when
the debug message is enabled. It is slightly misleading
message for -ENOSYS. But I doubt that it could cause much
troubles in reality. This situation should be really rare
and easy to debug.


> > If we want to discuss and fix this to the death. We should change
> > the return value from -ENOSYS to -EOPNOTSUPP. The reason
> > is the same as in the commit 375bfca3459db1c5596
> > ("livepatch: core: Return EOPNOTSUPP instead of ENOSYS").
> > 
> > Note that EOPNOTSUPP is the same errno as ENOTSUP, see
> > man 3 errno.
> 
> Sure, but that's a separate issue.

I just wanted to show that we might spend even more time with
making this code briliant.


> > > But I think Miroslav's suggestion to revert 1d98a69e5cef would be even
> > > better.
> > 
> > AFAIK, Miroslav wanted to point out that your opinion was inconsistent.
> 
> How is my opinion inconsistent?

1. You have already acked the removal of WARN_ON() in
   klp_have_reliable_stack(),
   see https://lkml.kernel.org/r/20190424155154.h62wc3nt7ib2phdy@treble

   You even suggested it, see
   https://lkml.kernel.org/r/20190418135858.n3lzq2oxkj52m6bi@treble

   But you suggest to put it back now.


2. You suggested to remove the warning when klp_check_stack() because
   it was superfluous. There was a discussion about keeping
   the check for -ENOSYS there and you did not react, see
   https://lkml.kernel.org/r/20190424155532.3uyxyxwm4c5dqsf5@treble

   You even acked the commit 1d98a69e5cef3aeb68bcef ("livepatch:
   Remove reliable stacktrace check in klp_try_switch_task()").

   And you suddenly want to revert it.


> Is there something wrong with the original approach, which was reverted
> with
> 
>   1d98a69e5cef ("livepatch: Remove reliable stacktrace check in klp_try_switch_task()")
> 
> ?
> 
> As I mentioned, that has some advantages:
> 
> - The generated code is simpler/faster because it uses a build-time
>   check.
> 
> - The code is more readable IMO.  Especially if the check is done higher
>   up the call stack by reverting 1d98a69e5cef.  That way the arch
>   support short-circuiting is done in the logical place, before doing
>   any more unnecessary work.  It's faster, but also, more importantly,
>   it's less surprising.
> 
> - It's also more consistent with other code, since the intent of this
>   check is the same as the klp_have_reliable_stack() use in
>   klp_enabled_patch().
> 
> If you disagree with those, please explain why.

As I said. I think that it is less reliable because we check config
options of an unrelated subsystem.

Also I think that it is overengineered.
save_stack_trace_tsk_reliable() is able to tell when it failed.
This particular failure is superfast. IMHO, it is not worth such
an optimization.

In fact, it is a compile time check as well because the inline
from include/linux/stacktrace.h


> > PS: This is my last mail in the thread this week. I will eventually
> > return to it with a clear head next week. It is all nitpicking from
> > my POV and I have better things to do.
> 
> I don't think it's helpful to characterize it as nitpicking.  The
> details of the code are important.

1. You had issues with almost all my printk() messages, comments,
   and commit messages. I know that my English is not perfect.
   And that you might want to highlight another information.
   But was is it really that bad?

2. This entire patchset is about adding and removing messages
   and checks. We have 3rd version and you are still not happy.
   Not to say that you suggest something else each time.


Frankly, I do mind too much about this code path, which and how
many warnings are printed. I am not aware about any complains.
IMHO, only people adding support for new architecture
might go this way.

I just wanted to switch the error to warning because Thomas
Gleixner wondered about unused code on s390 when refactoring
the stacktrace code.

I really did not expect that I would spend hours/days on this.
I do not think that it is worth it.

I consider most of the requests as nitpicking because
they requests extra work just because it looks like
a good idea.

My take is that we should accept changes when they improve
the situation and go in the right direction. Further clean up
might be done later by anyone who does not have anything
more important on the plate or gets annoyed enough.

Yes, you have many great ideas. But I am not a trained
monkey. And I do not know how to stop this when it is
looking endless.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/2] livepatch: Remove duplicate warning about missing reliable stacktrace support
  2019-05-13 10:43               ` Petr Mladek
@ 2019-05-13 16:46                 ` Josh Poimboeuf
  0 siblings, 0 replies; 20+ messages in thread
From: Josh Poimboeuf @ 2019-05-13 16:46 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Jiri Kosina, Miroslav Benes, Joe Lawrence, Kamalesh Babulal,
	live-patching, linux-kernel

On Mon, May 13, 2019 at 12:43:18PM +0200, Petr Mladek wrote:
> On Tue 2019-05-07 16:24:25, Josh Poimboeuf wrote:
> > On Tue, May 07, 2019 at 04:28:47PM +0200, Petr Mladek wrote:
> > > > > > > Also this check is effectively the same as the klp_have_reliable_stack()
> > > > > > > check which is done in kernel/livepatch/core.c.  So I think it would be
> > > > > > > clearer and more consistent if the same check is done here:
> > > > > > > 
> > > > > > > 	if (!klp_have_reliable_stack())
> > > > > > > 		return -ENOSYS;
> > > > > 
> > > > > Huh, it smells with over engineering to me.
> > > > 
> > > > How so?  It makes the code more readable and the generated code should
> > > > be much better because it becomes a build-time check.
> > > 
> > > save_stack_trace_tsk_reliable() returns various error codes.
> > > We catch a specific one because otherwise the message below
> > > might be misleading.
> > > 
> > > I do not see why we should prevent this error by calling
> > > a custom hack: klp_have_reliable_stack()?
> > 
> > I wouldn't call it a hack.  It's a simple build-time check.
> > 
> > > Regarding reliability. If anyone changes semantic of
> > > save_stack_trace_tsk_reliable() error codes, they would likely
> > > check if all users (one at the moment) handle it correctly.
> > > 
> > > On the other hand, the dependency between the -ENOSYS
> > > return value and klp_have_reliable_stack() is far from
> > > obvious.
> > 
> > I don't follow your point.
> 
> We implement klp_have_reliable_stack() in livepatch subsystem.
> It checks config options that defines behavior of the
> stacktrace subsystem.
> 
> We use the above livepatch-specific function to warn about
> that a function from stacktrace subsustem will not work.
> You even suggest to use it to ignore result from
> the stacktrace subsystem.
> 
> OK, using klp_have_reliable_stack() on both locations
> would keep it consistent.
> 
> My point is that the check itself is not reliable because
> it is "hard" to maintain.

I don't see how it would be "hard" to maintain.  If an arch implements
reliable stack tracing, but forgets to set HAVE_RELIABLE_STACKTRACE then
they will notice the warning immediately when they try to livepatch.

> Instead, I suggest to remove klp_have_reliable_stack() and use
> the following in klp_enable_patch().
> 
> 
> 	if (stack_trace_save_tsk_reliable(current, NULL, 0) == -ENOSYS) {
> 		pr_warn("This architecture doesn't have support for the livepatch consistency model.\n");
> 		pr_warn("The livepatch transition may never complete.\n");
> 	}

It seems confusing to call that function where it isn't needed, since
klp_enable_patch() isn't doing stack tracing yet at that location.

Calling klp_have_reliable_stack() is more readable, and is exactly the
check we want to make.

> Also I suggest to remove the check in klp_check_stack() completely.
> We will always print that the stack is not reliable but only when
> the debug message is enabled. It is slightly misleading
> message for -ENOSYS. But I doubt that it could cause much
> troubles in reality. This situation should be really rare
> and easy to debug.

If you want to remove the specific check in klp_check_stack(), that's
fine with me.  I slightly prefer just reverting Kamalesh's commit, but I
don't have a strong feeling either way.

> > > If we want to discuss and fix this to the death. We should change
> > > the return value from -ENOSYS to -EOPNOTSUPP. The reason
> > > is the same as in the commit 375bfca3459db1c5596
> > > ("livepatch: core: Return EOPNOTSUPP instead of ENOSYS").
> > > 
> > > Note that EOPNOTSUPP is the same errno as ENOTSUP, see
> > > man 3 errno.
> > 
> > Sure, but that's a separate issue.
> 
> I just wanted to show that we might spend even more time with
> making this code briliant.
> 
> 
> > > > But I think Miroslav's suggestion to revert 1d98a69e5cef would be even
> > > > better.
> > > 
> > > AFAIK, Miroslav wanted to point out that your opinion was inconsistent.
> > 
> > How is my opinion inconsistent?

I don't guarantee that I will always be consistent with my past self.
My thinking may evolve (or devolve?) over time.  And there's nothing
wrong with that.  But none of the below is actually inconsistent.

> 1. You have already acked the removal of WARN_ON() in
>    klp_have_reliable_stack(),
>    see https://lkml.kernel.org/r/20190424155154.h62wc3nt7ib2phdy@treble
> 
>    You even suggested it, see
>    https://lkml.kernel.org/r/20190418135858.n3lzq2oxkj52m6bi@treble
> 
>    But you suggest to put it back now.

Maybe I wasn't clear.  If we're not planning on calling the weak version
of the function, then the warning makes sense.

> 2. You suggested to remove the warning when klp_check_stack() because
>    it was superfluous. There was a discussion about keeping
>    the check for -ENOSYS there and you did not react, see
>    https://lkml.kernel.org/r/20190424155532.3uyxyxwm4c5dqsf5@treble

But then (I thought) Miroslav suggested reverting 1d98a69e5cef, which is
a better idea.

>    You even acked the commit 1d98a69e5cef3aeb68bcef ("livepatch:
>    Remove reliable stacktrace check in klp_try_switch_task()").
> 
>    And you suddenly want to revert it.

The circumstances have changed since that commit: now we are going back
to allowing non-reliable arches to load livepatches.

> > Is there something wrong with the original approach, which was reverted
> > with
> > 
> >   1d98a69e5cef ("livepatch: Remove reliable stacktrace check in klp_try_switch_task()")
> > 
> > ?
> > 
> > As I mentioned, that has some advantages:
> > 
> > - The generated code is simpler/faster because it uses a build-time
> >   check.
> > 
> > - The code is more readable IMO.  Especially if the check is done higher
> >   up the call stack by reverting 1d98a69e5cef.  That way the arch
> >   support short-circuiting is done in the logical place, before doing
> >   any more unnecessary work.  It's faster, but also, more importantly,
> >   it's less surprising.
> > 
> > - It's also more consistent with other code, since the intent of this
> >   check is the same as the klp_have_reliable_stack() use in
> >   klp_enabled_patch().
> > 
> > If you disagree with those, please explain why.
> 
> As I said. I think that it is less reliable because we check config
> options of an unrelated subsystem.

There's no harm in checking the config option of another subsystem.
Livepatch very much relies on stacktrace.

> Also I think that it is overengineered.
> save_stack_trace_tsk_reliable() is able to tell when it failed.
> This particular failure is superfast. IMHO, it is not worth such
> an optimization.

Sure, performance isn't a concern, but code simplicity, readability, and
consistency certainly are.

> In fact, it is a compile time check as well because the inline
> from include/linux/stacktrace.h
> 
> 
> > > PS: This is my last mail in the thread this week. I will eventually
> > > return to it with a clear head next week. It is all nitpicking from
> > > my POV and I have better things to do.
> > 
> > I don't think it's helpful to characterize it as nitpicking.  The
> > details of the code are important.
> 
> 1. You had issues with almost all my printk() messages, comments,
>    and commit messages. I know that my English is not perfect.
>    And that you might want to highlight another information.
>    But was is it really that bad?

It's just code review.

> 2. This entire patchset is about adding and removing messages
>    and checks. We have 3rd version and you are still not happy.
>    Not to say that you suggest something else each time.

That's how review works.  If the code improves with each iteration then
how is that a problem?

> Frankly, I do mind too much about this code path, which and how
> many warnings are printed. I am not aware about any complains.
> IMHO, only people adding support for new architecture
> might go this way.
> 
> I just wanted to switch the error to warning because Thomas
> Gleixner wondered about unused code on s390 when refactoring
> the stacktrace code.
> 
> I really did not expect that I would spend hours/days on this.
> I do not think that it is worth it.
> 
> I consider most of the requests as nitpicking because
> they requests extra work just because it looks like
> a good idea.
> 
> My take is that we should accept changes when they improve
> the situation and go in the right direction. Further clean up
> might be done later by anyone who does not have anything
> more important on the plate or gets annoyed enough.
> 
> Yes, you have many great ideas. But I am not a trained
> monkey. And I do not know how to stop this when it is
> looking endless.

I'm just trying to help improve the code.  That's how open source works.
And it's my responsibility as a maintainer.  If you're not open to
review, don't bother posting the code in the first place.

-- 
Josh

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2019-05-13 16:46 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-30  9:10 [PATCH v2 0/2] livepatch: Clean up of reliable stacktrace warnings Petr Mladek
2019-04-30  9:10 ` [PATCH v2 1/2] livepatch: Remove duplicate warning about missing reliable stacktrace support Petr Mladek
2019-04-30 11:30   ` Miroslav Benes
2019-04-30 11:56   ` Kamalesh Babulal
2019-05-07  0:40   ` Josh Poimboeuf
2019-05-07  1:43     ` Josh Poimboeuf
2019-05-07 11:29       ` Petr Mladek
2019-05-07 12:03         ` Josh Poimboeuf
2019-05-07 14:28           ` Petr Mladek
2019-05-07 21:24             ` Josh Poimboeuf
2019-05-09  7:24               ` Miroslav Benes
2019-05-13 10:43               ` Petr Mladek
2019-05-13 16:46                 ` Josh Poimboeuf
2019-05-07  8:58     ` Miroslav Benes
2019-04-30  9:10 ` [PATCH v2 2/2] livepatch: Use static buffer for debugging messages under rq lock Petr Mladek
2019-04-30 11:32   ` Miroslav Benes
2019-04-30 12:30   ` Kamalesh Babulal
2019-05-07  0:43   ` Josh Poimboeuf
2019-05-07 11:50     ` Petr Mladek
2019-05-07 13:19       ` Josh Poimboeuf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.