All of lore.kernel.org
 help / color / mirror / Atom feed
* [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
@ 2019-03-16  2:09 Tetsuo Handa
  2019-03-16  9:11 ` Dmitry Vyukov
  0 siblings, 1 reply; 29+ messages in thread
From: Tetsuo Handa @ 2019-03-16  2:09 UTC (permalink / raw)
  To: Petr Mladek, Sergey Senozhatsky, Sergey Senozhatsky, Dmitry Vyukov
  Cc: LKML, syzkaller

Hello.

I found a corrupted report at https://syzkaller.appspot.com/text?tag=CrashLog&x=17c6b82b200000 .

The panic() was caused by by WARN_ON() from generic_make_request_checks(), but there
was no printk() messages from WARN_ON(). Moreover, there was no printk() messages
for (at least) nearly one minute despite syzbot's testing likely generates constant
printk() messages. Unfortunately, since the report was truncated due to size limit,
we can't check when was the last time prink() succeeded writing messages to consoles.

----------
14:27:02 executing program 1:
(...snipped...)
14:28:00 executing program 3:
(...snipped...)
[  974.068065][T22281] Kernel panic - not syncing: panic_on_warn set ...
[  974.085454][T22281] CPU: 1 PID: 22281 Comm: syz-executor.1 Not tainted 5.0.0+ #23
[  974.093121][T22281] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[  974.103193][T22281] Call Trace:
[  974.106513][T22281]  dump_stack+0x172/0x1f0
[  974.110893][T22281]  ? generic_make_request_checks+0x1be0/0x2160
[  974.117235][T22281]  panic+0x2cb/0x65c
[  974.121434][T22281]  ? __warn_printk+0xf3/0xf3
[  974.126064][T22281]  ? generic_make_request_checks+0x1ca3/0x2160
[  974.132257][T22281]  ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
[  974.138521][T22281]  ? __warn.cold+0x5/0x45
[  974.142877][T22281]  ? __warn+0xe8/0x1d0
[  974.146978][T22281]  ? generic_make_request_checks+0x1ca3/0x2160
[  974.153171][T22281]  __warn.cold+0x20/0x45
[  974.157483][T22281]  ? vprintk_emit+0x1ce/0x6d0
[  974.162202][T22281]  ? generic_make_request_checks+0x1ca3/0x2160
[  974.168403][T22281]  report_bug+0x263/0x2b0
[  974.172847][T22281]  do_error_trap+0x11b/0x200
[  974.177646][T22281]  do_invalid_op+0x37/0x50
[  974.182329][T22281]  ? generic_make_request_checks+0x1ca3/0x2160
[  974.188951][T22281]  invalid_op+0x14/0x20
[  974.193142][T22281] RIP: 0010:generic_make_request_checks+0x1ca3/0x2160
----------

Is it possible that the reason there was no printk() messages is that
someone kept printing messages to console from userspace such that
console_trylock() from printk() was not able to succeed for (at least)
nearly one minute?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-03-16  2:09 [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..." Tetsuo Handa
@ 2019-03-16  9:11 ` Dmitry Vyukov
  2019-03-16 10:18   ` Tetsuo Handa
  0 siblings, 1 reply; 29+ messages in thread
From: Dmitry Vyukov @ 2019-03-16  9:11 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Petr Mladek, Sergey Senozhatsky, Sergey Senozhatsky, LKML, syzkaller

On Sat, Mar 16, 2019 at 3:09 AM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> Hello.
>
> I found a corrupted report at https://syzkaller.appspot.com/text?tag=CrashLog&x=17c6b82b200000 .
>
> The panic() was caused by by WARN_ON() from generic_make_request_checks(), but there
> was no printk() messages from WARN_ON(). Moreover, there was no printk() messages
> for (at least) nearly one minute despite syzbot's testing likely generates constant
> printk() messages. Unfortunately, since the report was truncated due to size limit,
> we can't check when was the last time prink() succeeded writing messages to consoles.
>
> ----------
> 14:27:02 executing program 1:
> (...snipped...)
> 14:28:00 executing program 3:
> (...snipped...)
> [  974.068065][T22281] Kernel panic - not syncing: panic_on_warn set ...
> [  974.085454][T22281] CPU: 1 PID: 22281 Comm: syz-executor.1 Not tainted 5.0.0+ #23
> [  974.093121][T22281] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> [  974.103193][T22281] Call Trace:
> [  974.106513][T22281]  dump_stack+0x172/0x1f0
> [  974.110893][T22281]  ? generic_make_request_checks+0x1be0/0x2160
> [  974.117235][T22281]  panic+0x2cb/0x65c
> [  974.121434][T22281]  ? __warn_printk+0xf3/0xf3
> [  974.126064][T22281]  ? generic_make_request_checks+0x1ca3/0x2160
> [  974.132257][T22281]  ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
> [  974.138521][T22281]  ? __warn.cold+0x5/0x45
> [  974.142877][T22281]  ? __warn+0xe8/0x1d0
> [  974.146978][T22281]  ? generic_make_request_checks+0x1ca3/0x2160
> [  974.153171][T22281]  __warn.cold+0x20/0x45
> [  974.157483][T22281]  ? vprintk_emit+0x1ce/0x6d0
> [  974.162202][T22281]  ? generic_make_request_checks+0x1ca3/0x2160
> [  974.168403][T22281]  report_bug+0x263/0x2b0
> [  974.172847][T22281]  do_error_trap+0x11b/0x200
> [  974.177646][T22281]  do_invalid_op+0x37/0x50
> [  974.182329][T22281]  ? generic_make_request_checks+0x1ca3/0x2160
> [  974.188951][T22281]  invalid_op+0x14/0x20
> [  974.193142][T22281] RIP: 0010:generic_make_request_checks+0x1ca3/0x2160
> ----------
>
> Is it possible that the reason there was no printk() messages is that
> someone kept printing messages to console from userspace such that
> console_trylock() from printk() was not able to succeed for (at least)
> nearly one minute?

Such reports showed up always with low frequency. For all that I
looked at we also always had a normal  non-truncated version, so I was
never too worried.

If something would write from userspace, that would show up in the
output, or not?

Perhaps syzkaller somehow manages to lower console output level? I
figured out that we should restrict it from doing
syslog(SYSLOG_ACTION_CONSOLE_OFF). And I also restricted its access o
/dev/console. But maybe there is something else? It _should_ not be
able to write to random sysctl's.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-03-16  9:11 ` Dmitry Vyukov
@ 2019-03-16 10:18   ` Tetsuo Handa
  2019-03-16 14:14     ` Sergey Senozhatsky
  0 siblings, 1 reply; 29+ messages in thread
From: Tetsuo Handa @ 2019-03-16 10:18 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Petr Mladek, Sergey Senozhatsky, Sergey Senozhatsky, LKML, syzkaller

On 2019/03/16 18:11, Dmitry Vyukov wrote:
> Such reports showed up always with low frequency. For all that I
> looked at we also always had a normal  non-truncated version, so I was
> never too worried.
> 
> If something would write from userspace, that would show up in the
> output, or not?
> 
> Perhaps syzkaller somehow manages to lower console output level?

panic() calls console_verbose() before calling

  pr_emerg("Kernel panic - not syncing: %s\n", buf);

line. Then, someone might have changed console_loglevel enough to
suppress printk() output.

> I figured out that we should restrict it from doing
> syslog(SYSLOG_ACTION_CONSOLE_OFF). And I also restricted its access o
> /dev/console. But maybe there is something else? It _should_ not be
> able to write to random sysctl's.

Maybe try running with "ignore_loglevel" kernel command line option added?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-03-16 10:18   ` Tetsuo Handa
@ 2019-03-16 14:14     ` Sergey Senozhatsky
  2019-03-16 14:16       ` Dmitry Vyukov
  0 siblings, 1 reply; 29+ messages in thread
From: Sergey Senozhatsky @ 2019-03-16 14:14 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Dmitry Vyukov, Petr Mladek, Sergey Senozhatsky,
	Sergey Senozhatsky, LKML, syzkaller

On (03/16/19 19:18), Tetsuo Handa wrote:
> On 2019/03/16 18:11, Dmitry Vyukov wrote:
> > Such reports showed up always with low frequency. For all that I
> > looked at we also always had a normal  non-truncated version, so I was
> > never too worried.
> > 
> > If something would write from userspace, that would show up in the
> > output, or not?
> > 
> > Perhaps syzkaller somehow manages to lower console output level?
> 
> panic() calls console_verbose() before calling
> 
>   pr_emerg("Kernel panic - not syncing: %s\n", buf);
> 
> line. Then, someone might have changed console_loglevel enough to
> suppress printk() output.

Hmm... sysctl, may be?

> > I figured out that we should restrict it from doing
> > syslog(SYSLOG_ACTION_CONSOLE_OFF). And I also restricted its access o
> > /dev/console. But maybe there is something else? It _should_ not be
> > able to write to random sysctl's.
> 
> Maybe try running with "ignore_loglevel" kernel command line option added?

Right, that's something I would expect 0-day and syzkaller to do.

	-ss

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-03-16 14:14     ` Sergey Senozhatsky
@ 2019-03-16 14:16       ` Dmitry Vyukov
  2019-03-16 14:40         ` Sergey Senozhatsky
  2019-03-16 14:53         ` Tetsuo Handa
  0 siblings, 2 replies; 29+ messages in thread
From: Dmitry Vyukov @ 2019-03-16 14:16 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Tetsuo Handa, Petr Mladek, Sergey Senozhatsky, LKML, syzkaller

On Sat, Mar 16, 2019 at 3:14 PM Sergey Senozhatsky
<sergey.senozhatsky@gmail.com> wrote:
>
> On (03/16/19 19:18), Tetsuo Handa wrote:
> > On 2019/03/16 18:11, Dmitry Vyukov wrote:
> > > Such reports showed up always with low frequency. For all that I
> > > looked at we also always had a normal  non-truncated version, so I was
> > > never too worried.
> > >
> > > If something would write from userspace, that would show up in the
> > > output, or not?
> > >
> > > Perhaps syzkaller somehow manages to lower console output level?
> >
> > panic() calls console_verbose() before calling
> >
> >   pr_emerg("Kernel panic - not syncing: %s\n", buf);
> >
> > line. Then, someone might have changed console_loglevel enough to
> > suppress printk() output.
>
> Hmm... sysctl, may be?
>
> > > I figured out that we should restrict it from doing
> > > syslog(SYSLOG_ACTION_CONSOLE_OFF). And I also restricted its access o
> > > /dev/console. But maybe there is something else? It _should_ not be
> > > able to write to random sysctl's.
> >
> > Maybe try running with "ignore_loglevel" kernel command line option added?
>
> Right, that's something I would expect 0-day and syzkaller to do.

to double-check: enabling this won't lead to verbose/debug level of logging?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-03-16 14:16       ` Dmitry Vyukov
@ 2019-03-16 14:40         ` Sergey Senozhatsky
  2019-03-16 14:53         ` Tetsuo Handa
  1 sibling, 0 replies; 29+ messages in thread
From: Sergey Senozhatsky @ 2019-03-16 14:40 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Tetsuo Handa, Petr Mladek,
	Sergey Senozhatsky, LKML, syzkaller, Steven Rostedt

On (03/16/19 15:16), Dmitry Vyukov wrote:
> > Hmm... sysctl, may be?
> >
> > > > I figured out that we should restrict it from doing
> > > > syslog(SYSLOG_ACTION_CONSOLE_OFF). And I also restricted its access o
> > > > /dev/console. But maybe there is something else? It _should_ not be
> > > > able to write to random sysctl's.
> > >
> > > Maybe try running with "ignore_loglevel" kernel command line option added?
> >
> > Right, that's something I would expect 0-day and syzkaller to do.
> 
> to double-check: enabling this won't lead to verbose/debug level of logging?

My bad, Saturday night.

You are right, this will make it very verbose. May be we can set
ignore_loglevel when we set loglevel to CONSOLE_LOGLEVEL_MOTORMOUTH.


Technically, we can have something like this

	CPU0						CPU1

	foo()						panic
	loglevel_save = console_loglevel
	....						console_verbose()
	console_loglevel = ABC
	....
	console_loglevel = loglevel_save		smp_send_stop()
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	killed
							console_flush_on_panic()
							// console is
							// NOT verbose


So we either can set ignore_loglevel in console_verbose(). Or call
call console_verbose() one more time, once we stopped other CPUs
and before we console_flush_on_panic().

---

diff --git a/kernel/panic.c b/kernel/panic.c
index 0ae0d7332f12..11784d37f625 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -263,6 +263,7 @@ void panic(const char *fmt, ...)
 	if (_crash_kexec_post_notifiers)
 		__crash_kexec(NULL);
 
+	console_verbose();
 #ifdef CONFIG_VT
 	unblank_screen();
 #endif

---

But I think that setting ignore_loglevel in console_verbose() is more
reliable.

	-ss

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-03-16 14:16       ` Dmitry Vyukov
  2019-03-16 14:40         ` Sergey Senozhatsky
@ 2019-03-16 14:53         ` Tetsuo Handa
  2019-03-16 14:57           ` Dmitry Vyukov
  1 sibling, 1 reply; 29+ messages in thread
From: Tetsuo Handa @ 2019-03-16 14:53 UTC (permalink / raw)
  To: Dmitry Vyukov, Sergey Senozhatsky
  Cc: Petr Mladek, Sergey Senozhatsky, LKML, syzkaller

On 2019/03/16 23:16, Dmitry Vyukov wrote:
>>> Maybe try running with "ignore_loglevel" kernel command line option added?
>>
>> Right, that's something I would expect 0-day and syzkaller to do.
> 
> to double-check: enabling this won't lead to verbose/debug level of logging?
> 

I don't know what "verbose/debug level" means. But enabling this option causes
KERN_{EMERG,ALERT,CRIT,ERR,WARNING,NOTICE,INFO,DEBUG} be printed equally.

  "ignore loglevel setting (prints all kernel messages to the console)"

static bool suppress_message_printing(int level)
{
	return (level >= console_loglevel && !ignore_loglevel);
}


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-03-16 14:53         ` Tetsuo Handa
@ 2019-03-16 14:57           ` Dmitry Vyukov
  2019-03-16 15:02             ` Tetsuo Handa
  0 siblings, 1 reply; 29+ messages in thread
From: Dmitry Vyukov @ 2019-03-16 14:57 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Petr Mladek, Sergey Senozhatsky, LKML, syzkaller

On Sat, Mar 16, 2019 at 3:53 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2019/03/16 23:16, Dmitry Vyukov wrote:
> >>> Maybe try running with "ignore_loglevel" kernel command line option added?
> >>
> >> Right, that's something I would expect 0-day and syzkaller to do.
> >
> > to double-check: enabling this won't lead to verbose/debug level of logging?
> >
>
> I don't know what "verbose/debug level" means. But enabling this option causes
> KERN_{EMERG,ALERT,CRIT,ERR,WARNING,NOTICE,INFO,DEBUG} be printed equally.
>
>   "ignore loglevel setting (prints all kernel messages to the console)"
>
> static bool suppress_message_printing(int level)
> {
>         return (level >= console_loglevel && !ignore_loglevel);
> }

Then I don't think it's suitable: there will be too much output.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-03-16 14:57           ` Dmitry Vyukov
@ 2019-03-16 15:02             ` Tetsuo Handa
  2019-03-16 15:10               ` Dmitry Vyukov
  0 siblings, 1 reply; 29+ messages in thread
From: Tetsuo Handa @ 2019-03-16 15:02 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Petr Mladek, Sergey Senozhatsky, LKML, syzkaller

On 2019/03/16 23:57, Dmitry Vyukov wrote:
> On Sat, Mar 16, 2019 at 3:53 PM Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
>>
>> On 2019/03/16 23:16, Dmitry Vyukov wrote:
>>>>> Maybe try running with "ignore_loglevel" kernel command line option added?
>>>>
>>>> Right, that's something I would expect 0-day and syzkaller to do.
>>>
>>> to double-check: enabling this won't lead to verbose/debug level of logging?
>>>
>>
>> I don't know what "verbose/debug level" means. But enabling this option causes
>> KERN_{EMERG,ALERT,CRIT,ERR,WARNING,NOTICE,INFO,DEBUG} be printed equally.
>>
>>   "ignore loglevel setting (prints all kernel messages to the console)"
>>
>> static bool suppress_message_printing(int level)
>> {
>>         return (level >= console_loglevel && !ignore_loglevel);
>> }
> 
> Then I don't think it's suitable: there will be too much output.
> 

Then, we need to find what test is changing console_loglevel.
Maybe add debug BUG_ON() in linux-next.git using CONFIG_DEBUG_AID_FOR_SYZBOT ?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-03-16 15:02             ` Tetsuo Handa
@ 2019-03-16 15:10               ` Dmitry Vyukov
  2019-03-18  5:27                 ` Tetsuo Handa
  0 siblings, 1 reply; 29+ messages in thread
From: Dmitry Vyukov @ 2019-03-16 15:10 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Petr Mladek, Sergey Senozhatsky, LKML, syzkaller

On Sat, Mar 16, 2019 at 4:02 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2019/03/16 23:57, Dmitry Vyukov wrote:
> > On Sat, Mar 16, 2019 at 3:53 PM Tetsuo Handa
> > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> >>
> >> On 2019/03/16 23:16, Dmitry Vyukov wrote:
> >>>>> Maybe try running with "ignore_loglevel" kernel command line option added?
> >>>>
> >>>> Right, that's something I would expect 0-day and syzkaller to do.
> >>>
> >>> to double-check: enabling this won't lead to verbose/debug level of logging?
> >>>
> >>
> >> I don't know what "verbose/debug level" means. But enabling this option causes
> >> KERN_{EMERG,ALERT,CRIT,ERR,WARNING,NOTICE,INFO,DEBUG} be printed equally.
> >>
> >>   "ignore loglevel setting (prints all kernel messages to the console)"
> >>
> >> static bool suppress_message_printing(int level)
> >> {
> >>         return (level >= console_loglevel && !ignore_loglevel);
> >> }
> >
> > Then I don't think it's suitable: there will be too much output.
> >
>
> Then, we need to find what test is changing console_loglevel.
> Maybe add debug BUG_ON() in linux-next.git using CONFIG_DEBUG_AID_FOR_SYZBOT ?


Is there a single place to catch this? I could run syzkaller locally
first with the check.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-03-16 15:10               ` Dmitry Vyukov
@ 2019-03-18  5:27                 ` Tetsuo Handa
  2019-03-18 12:07                   ` Dmitry Vyukov
  0 siblings, 1 reply; 29+ messages in thread
From: Tetsuo Handa @ 2019-03-18  5:27 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Petr Mladek, Sergey Senozhatsky, LKML, syzkaller

Dmitry Vyukov wrote:
> > Then, we need to find what test is changing console_loglevel.
> > Maybe add debug BUG_ON() in linux-next.git using CONFIG_DEBUG_AID_FOR_SYZBOT ?
> 
> Is there a single place to catch this? I could run syzkaller locally
> first with the check.
> 

There is no such place. But not so many places change permanently.
For x86, you can test with below patch applied.

---
 drivers/tty/sysrq.c    |  1 +
 kernel/printk/printk.c |  3 +++
 kernel/sysctl.c        | 23 ++++++++++++++++++++++-
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
index fa0ce7d..ad73520 100644
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -91,6 +91,7 @@ static void sysrq_handle_loglevel(int key)
 	i = key - '0';
 	console_loglevel = CONSOLE_LOGLEVEL_DEFAULT;
 	pr_info("Loglevel set to %d\n", i);
+	WARN_ONCE(i < CONSOLE_LOGLEVEL_DEFAULT, "Reducing console_loglevel to %d", i);
 	console_loglevel = i;
 }
 static struct sysrq_key_op sysrq_loglevel_op = {
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 02ca827..70d1f8c 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1533,11 +1533,13 @@ int do_syslog(int type, char __user *buf, int len, int source)
 	case SYSLOG_ACTION_CONSOLE_OFF:
 		if (saved_console_loglevel == LOGLEVEL_DEFAULT)
 			saved_console_loglevel = console_loglevel;
+		WARN_ONCE(minimum_console_loglevel < CONSOLE_LOGLEVEL_DEFAULT, "Reducing console_loglevel to %d", minimum_console_loglevel);
 		console_loglevel = minimum_console_loglevel;
 		break;
 	/* Enable logging to console */
 	case SYSLOG_ACTION_CONSOLE_ON:
 		if (saved_console_loglevel != LOGLEVEL_DEFAULT) {
+			WARN_ONCE(saved_console_loglevel < CONSOLE_LOGLEVEL_DEFAULT, "Reducing console_loglevel to %d", saved_console_loglevel);
 			console_loglevel = saved_console_loglevel;
 			saved_console_loglevel = LOGLEVEL_DEFAULT;
 		}
@@ -1548,6 +1550,7 @@ int do_syslog(int type, char __user *buf, int len, int source)
 			return -EINVAL;
 		if (len < minimum_console_loglevel)
 			len = minimum_console_loglevel;
+		WARN_ONCE(len < CONSOLE_LOGLEVEL_DEFAULT, "Reducing console_loglevel to %d", len);
 		console_loglevel = len;
 		/* Implicitly re-enable logging to console */
 		saved_console_loglevel = LOGLEVEL_DEFAULT;
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index b3df3ab..2170421 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -273,6 +273,27 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
 int sysctl_legacy_va_layout;
 #endif
 
+static int proc_dointvec_loglevel(struct ctl_table *table, int write,
+				  void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	if (write && buffer && *lenp) {
+		size_t len = *lenp;
+		char *kbuf, *p;
+
+		if (len > PAGE_SIZE - 1)
+			len = PAGE_SIZE - 1;
+		p = kbuf = memdup_user_nul(buffer, len);
+		if (IS_ERR(kbuf))
+			return PTR_ERR(kbuf);
+		while (*p && (*p < '0' || *p > '9'))
+			p++;
+		len = *p ? strtoul(p, &p, 10) : CONSOLE_LOGLEVEL_DEFAULT;
+		WARN_ONCE(len < CONSOLE_LOGLEVEL_DEFAULT, "Reducing console_loglevel to %d", (int) len);
+		kfree(kbuf);
+	}
+	return proc_dointvec(table, write, buffer, lenp, ppos);
+}
+
 /* The default sysctl tables: */
 
 static struct ctl_table sysctl_base_table[] = {
@@ -839,7 +860,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
 		.data		= &console_loglevel,
 		.maxlen		= 4*sizeof(int),
 		.mode		= 0644,
-		.proc_handler	= proc_dointvec,
+		.proc_handler	= proc_dointvec_loglevel,
 	},
 	{
 		.procname	= "printk_ratelimit",
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-03-18  5:27                 ` Tetsuo Handa
@ 2019-03-18 12:07                   ` Dmitry Vyukov
  2019-03-18 12:32                     ` Dmitry Vyukov
  0 siblings, 1 reply; 29+ messages in thread
From: Dmitry Vyukov @ 2019-03-18 12:07 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Petr Mladek, Sergey Senozhatsky, LKML, syzkaller

On Mon, Mar 18, 2019 at 6:27 AM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> Dmitry Vyukov wrote:
> > > Then, we need to find what test is changing console_loglevel.
> > > Maybe add debug BUG_ON() in linux-next.git using CONFIG_DEBUG_AID_FOR_SYZBOT ?
> >
> > Is there a single place to catch this? I could run syzkaller locally
> > first with the check.
> >
>
> There is no such place. But not so many places change permanently.
> For x86, you can test with below patch applied.
>
> ---
>  drivers/tty/sysrq.c    |  1 +
>  kernel/printk/printk.c |  3 +++
>  kernel/sysctl.c        | 23 ++++++++++++++++++++++-
>  3 files changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
> index fa0ce7d..ad73520 100644
> --- a/drivers/tty/sysrq.c
> +++ b/drivers/tty/sysrq.c
> @@ -91,6 +91,7 @@ static void sysrq_handle_loglevel(int key)
>         i = key - '0';
>         console_loglevel = CONSOLE_LOGLEVEL_DEFAULT;
>         pr_info("Loglevel set to %d\n", i);
> +       WARN_ONCE(i < CONSOLE_LOGLEVEL_DEFAULT, "Reducing console_loglevel to %d", i);
>         console_loglevel = i;
>  }
>  static struct sysrq_key_op sysrq_loglevel_op = {
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 02ca827..70d1f8c 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -1533,11 +1533,13 @@ int do_syslog(int type, char __user *buf, int len, int source)
>         case SYSLOG_ACTION_CONSOLE_OFF:
>                 if (saved_console_loglevel == LOGLEVEL_DEFAULT)
>                         saved_console_loglevel = console_loglevel;
> +               WARN_ONCE(minimum_console_loglevel < CONSOLE_LOGLEVEL_DEFAULT, "Reducing console_loglevel to %d", minimum_console_loglevel);
>                 console_loglevel = minimum_console_loglevel;
>                 break;
>         /* Enable logging to console */
>         case SYSLOG_ACTION_CONSOLE_ON:
>                 if (saved_console_loglevel != LOGLEVEL_DEFAULT) {
> +                       WARN_ONCE(saved_console_loglevel < CONSOLE_LOGLEVEL_DEFAULT, "Reducing console_loglevel to %d", saved_console_loglevel);
>                         console_loglevel = saved_console_loglevel;
>                         saved_console_loglevel = LOGLEVEL_DEFAULT;
>                 }
> @@ -1548,6 +1550,7 @@ int do_syslog(int type, char __user *buf, int len, int source)
>                         return -EINVAL;
>                 if (len < minimum_console_loglevel)
>                         len = minimum_console_loglevel;
> +               WARN_ONCE(len < CONSOLE_LOGLEVEL_DEFAULT, "Reducing console_loglevel to %d", len);
>                 console_loglevel = len;
>                 /* Implicitly re-enable logging to console */
>                 saved_console_loglevel = LOGLEVEL_DEFAULT;
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index b3df3ab..2170421 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -273,6 +273,27 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
>  int sysctl_legacy_va_layout;
>  #endif
>
> +static int proc_dointvec_loglevel(struct ctl_table *table, int write,
> +                                 void __user *buffer, size_t *lenp, loff_t *ppos)
> +{
> +       if (write && buffer && *lenp) {
> +               size_t len = *lenp;
> +               char *kbuf, *p;
> +
> +               if (len > PAGE_SIZE - 1)
> +                       len = PAGE_SIZE - 1;
> +               p = kbuf = memdup_user_nul(buffer, len);
> +               if (IS_ERR(kbuf))
> +                       return PTR_ERR(kbuf);
> +               while (*p && (*p < '0' || *p > '9'))
> +                       p++;
> +               len = *p ? strtoul(p, &p, 10) : CONSOLE_LOGLEVEL_DEFAULT;
> +               WARN_ONCE(len < CONSOLE_LOGLEVEL_DEFAULT, "Reducing console_loglevel to %d", (int) len);
> +               kfree(kbuf);
> +       }
> +       return proc_dointvec(table, write, buffer, lenp, ppos);
> +}
> +
>  /* The default sysctl tables: */
>
>  static struct ctl_table sysctl_base_table[] = {
> @@ -839,7 +860,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
>                 .data           = &console_loglevel,
>                 .maxlen         = 4*sizeof(int),
>                 .mode           = 0644,
> -               .proc_handler   = proc_dointvec,
> +               .proc_handler   = proc_dointvec_loglevel,
>         },
>         {
>                 .procname       = "printk_ratelimit",
> --
> 1.8.3.1


Thanks.
Running this for 10 mins I only got a bunch of these, so at least none
of these WARNs does not trigger immediately. But I will continue
running it.

WARNING in schedule_bh
kernel BUG at ./include/linux/mm.h:LINE!
WARNING in vkms_vblank_simulate
general protection fault in xfrmi_decode_session
INFO: task hung in corrupted
kernel BUG at drivers/android/binder_alloc.c:LINE!
BUG: unable to handle kernel NULL pointer dereference in
drm_atomic_helper_commit_modeset_disables
INFO: task hung in __rq_qos_throttle
WARNING in sk_stream_kill_queues

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-03-18 12:07                   ` Dmitry Vyukov
@ 2019-03-18 12:32                     ` Dmitry Vyukov
  2019-03-18 12:39                       ` Dmitry Vyukov
  2019-03-18 12:50                       ` Sergey Senozhatsky
  0 siblings, 2 replies; 29+ messages in thread
From: Dmitry Vyukov @ 2019-03-18 12:32 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Petr Mladek, Sergey Senozhatsky, LKML, syzkaller

On Mon, Mar 18, 2019 at 1:07 PM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Mon, Mar 18, 2019 at 6:27 AM Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
> >
> > Dmitry Vyukov wrote:
> > > > Then, we need to find what test is changing console_loglevel.
> > > > Maybe add debug BUG_ON() in linux-next.git using CONFIG_DEBUG_AID_FOR_SYZBOT ?
> > >
> > > Is there a single place to catch this? I could run syzkaller locally
> > > first with the check.
> > >
> >
> > There is no such place. But not so many places change permanently.
> > For x86, you can test with below patch applied.
> >
> > ---
> >  drivers/tty/sysrq.c    |  1 +
> >  kernel/printk/printk.c |  3 +++
> >  kernel/sysctl.c        | 23 ++++++++++++++++++++++-
> >  3 files changed, 26 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
> > index fa0ce7d..ad73520 100644
> > --- a/drivers/tty/sysrq.c
> > +++ b/drivers/tty/sysrq.c
> > @@ -91,6 +91,7 @@ static void sysrq_handle_loglevel(int key)
> >         i = key - '0';
> >         console_loglevel = CONSOLE_LOGLEVEL_DEFAULT;
> >         pr_info("Loglevel set to %d\n", i);
> > +       WARN_ONCE(i < CONSOLE_LOGLEVEL_DEFAULT, "Reducing console_loglevel to %d", i);
> >         console_loglevel = i;
> >  }
> >  static struct sysrq_key_op sysrq_loglevel_op = {
> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > index 02ca827..70d1f8c 100644
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -1533,11 +1533,13 @@ int do_syslog(int type, char __user *buf, int len, int source)
> >         case SYSLOG_ACTION_CONSOLE_OFF:
> >                 if (saved_console_loglevel == LOGLEVEL_DEFAULT)
> >                         saved_console_loglevel = console_loglevel;
> > +               WARN_ONCE(minimum_console_loglevel < CONSOLE_LOGLEVEL_DEFAULT, "Reducing console_loglevel to %d", minimum_console_loglevel);
> >                 console_loglevel = minimum_console_loglevel;
> >                 break;
> >         /* Enable logging to console */
> >         case SYSLOG_ACTION_CONSOLE_ON:
> >                 if (saved_console_loglevel != LOGLEVEL_DEFAULT) {
> > +                       WARN_ONCE(saved_console_loglevel < CONSOLE_LOGLEVEL_DEFAULT, "Reducing console_loglevel to %d", saved_console_loglevel);
> >                         console_loglevel = saved_console_loglevel;
> >                         saved_console_loglevel = LOGLEVEL_DEFAULT;
> >                 }
> > @@ -1548,6 +1550,7 @@ int do_syslog(int type, char __user *buf, int len, int source)
> >                         return -EINVAL;
> >                 if (len < minimum_console_loglevel)
> >                         len = minimum_console_loglevel;
> > +               WARN_ONCE(len < CONSOLE_LOGLEVEL_DEFAULT, "Reducing console_loglevel to %d", len);
> >                 console_loglevel = len;
> >                 /* Implicitly re-enable logging to console */
> >                 saved_console_loglevel = LOGLEVEL_DEFAULT;
> > diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> > index b3df3ab..2170421 100644
> > --- a/kernel/sysctl.c
> > +++ b/kernel/sysctl.c
> > @@ -273,6 +273,27 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
> >  int sysctl_legacy_va_layout;
> >  #endif
> >
> > +static int proc_dointvec_loglevel(struct ctl_table *table, int write,
> > +                                 void __user *buffer, size_t *lenp, loff_t *ppos)
> > +{
> > +       if (write && buffer && *lenp) {
> > +               size_t len = *lenp;
> > +               char *kbuf, *p;
> > +
> > +               if (len > PAGE_SIZE - 1)
> > +                       len = PAGE_SIZE - 1;
> > +               p = kbuf = memdup_user_nul(buffer, len);
> > +               if (IS_ERR(kbuf))
> > +                       return PTR_ERR(kbuf);
> > +               while (*p && (*p < '0' || *p > '9'))
> > +                       p++;
> > +               len = *p ? strtoul(p, &p, 10) : CONSOLE_LOGLEVEL_DEFAULT;
> > +               WARN_ONCE(len < CONSOLE_LOGLEVEL_DEFAULT, "Reducing console_loglevel to %d", (int) len);
> > +               kfree(kbuf);
> > +       }
> > +       return proc_dointvec(table, write, buffer, lenp, ppos);
> > +}
> > +
> >  /* The default sysctl tables: */
> >
> >  static struct ctl_table sysctl_base_table[] = {
> > @@ -839,7 +860,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
> >                 .data           = &console_loglevel,
> >                 .maxlen         = 4*sizeof(int),
> >                 .mode           = 0644,
> > -               .proc_handler   = proc_dointvec,
> > +               .proc_handler   = proc_dointvec_loglevel,
> >         },
> >         {
> >                 .procname       = "printk_ratelimit",
> > --
> > 1.8.3.1
>
>
> Thanks.
> Running this for 10 mins I only got a bunch of these, so at least none
> of these WARNs does not trigger immediately. But I will continue
> running it.
>
> WARNING in schedule_bh
> kernel BUG at ./include/linux/mm.h:LINE!
> WARNING in vkms_vblank_simulate
> general protection fault in xfrmi_decode_session
> INFO: task hung in corrupted
> kernel BUG at drivers/android/binder_alloc.c:LINE!
> BUG: unable to handle kernel NULL pointer dereference in
> drm_atomic_helper_commit_modeset_disables
> INFO: task hung in __rq_qos_throttle
> WARNING in sk_stream_kill_queues


Wait, but isn't SYSLOG_ACTION_CONSOLE_LEVEL what we are looking for?
syzkaller knows about the syslog syscall:
https://github.com/google/syzkaller/blob/13026d10f09f0e801c342e6c009ff580d49b894b/sys/linux/sys.txt#L322
and even though it does not know SYSLOG_ACTION_CONSOLE_LEVEL const, it
can guess that number.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-03-18 12:32                     ` Dmitry Vyukov
@ 2019-03-18 12:39                       ` Dmitry Vyukov
  2019-03-19  0:41                         ` Tetsuo Handa
  2019-03-18 12:50                       ` Sergey Senozhatsky
  1 sibling, 1 reply; 29+ messages in thread
From: Dmitry Vyukov @ 2019-03-18 12:39 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Petr Mladek, Sergey Senozhatsky, LKML, syzkaller

On Mon, Mar 18, 2019 at 1:32 PM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Mon, Mar 18, 2019 at 1:07 PM Dmitry Vyukov <dvyukov@google.com> wrote:
> >
> > On Mon, Mar 18, 2019 at 6:27 AM Tetsuo Handa
> > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> > >
> > > Dmitry Vyukov wrote:
> > > > > Then, we need to find what test is changing console_loglevel.
> > > > > Maybe add debug BUG_ON() in linux-next.git using CONFIG_DEBUG_AID_FOR_SYZBOT ?
> > > >
> > > > Is there a single place to catch this? I could run syzkaller locally
> > > > first with the check.
> > > >
> > >
> > > There is no such place. But not so many places change permanently.
> > > For x86, you can test with below patch applied.
> > >
> > > ---
> > >  drivers/tty/sysrq.c    |  1 +
> > >  kernel/printk/printk.c |  3 +++
> > >  kernel/sysctl.c        | 23 ++++++++++++++++++++++-
> > >  3 files changed, 26 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
> > > index fa0ce7d..ad73520 100644
> > > --- a/drivers/tty/sysrq.c
> > > +++ b/drivers/tty/sysrq.c
> > > @@ -91,6 +91,7 @@ static void sysrq_handle_loglevel(int key)
> > >         i = key - '0';
> > >         console_loglevel = CONSOLE_LOGLEVEL_DEFAULT;
> > >         pr_info("Loglevel set to %d\n", i);
> > > +       WARN_ONCE(i < CONSOLE_LOGLEVEL_DEFAULT, "Reducing console_loglevel to %d", i);
> > >         console_loglevel = i;
> > >  }
> > >  static struct sysrq_key_op sysrq_loglevel_op = {
> > > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > > index 02ca827..70d1f8c 100644
> > > --- a/kernel/printk/printk.c
> > > +++ b/kernel/printk/printk.c
> > > @@ -1533,11 +1533,13 @@ int do_syslog(int type, char __user *buf, int len, int source)
> > >         case SYSLOG_ACTION_CONSOLE_OFF:
> > >                 if (saved_console_loglevel == LOGLEVEL_DEFAULT)
> > >                         saved_console_loglevel = console_loglevel;
> > > +               WARN_ONCE(minimum_console_loglevel < CONSOLE_LOGLEVEL_DEFAULT, "Reducing console_loglevel to %d", minimum_console_loglevel);
> > >                 console_loglevel = minimum_console_loglevel;
> > >                 break;
> > >         /* Enable logging to console */
> > >         case SYSLOG_ACTION_CONSOLE_ON:
> > >                 if (saved_console_loglevel != LOGLEVEL_DEFAULT) {
> > > +                       WARN_ONCE(saved_console_loglevel < CONSOLE_LOGLEVEL_DEFAULT, "Reducing console_loglevel to %d", saved_console_loglevel);
> > >                         console_loglevel = saved_console_loglevel;
> > >                         saved_console_loglevel = LOGLEVEL_DEFAULT;
> > >                 }
> > > @@ -1548,6 +1550,7 @@ int do_syslog(int type, char __user *buf, int len, int source)
> > >                         return -EINVAL;
> > >                 if (len < minimum_console_loglevel)
> > >                         len = minimum_console_loglevel;
> > > +               WARN_ONCE(len < CONSOLE_LOGLEVEL_DEFAULT, "Reducing console_loglevel to %d", len);
> > >                 console_loglevel = len;
> > >                 /* Implicitly re-enable logging to console */
> > >                 saved_console_loglevel = LOGLEVEL_DEFAULT;
> > > diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> > > index b3df3ab..2170421 100644
> > > --- a/kernel/sysctl.c
> > > +++ b/kernel/sysctl.c
> > > @@ -273,6 +273,27 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
> > >  int sysctl_legacy_va_layout;
> > >  #endif
> > >
> > > +static int proc_dointvec_loglevel(struct ctl_table *table, int write,
> > > +                                 void __user *buffer, size_t *lenp, loff_t *ppos)
> > > +{
> > > +       if (write && buffer && *lenp) {
> > > +               size_t len = *lenp;
> > > +               char *kbuf, *p;
> > > +
> > > +               if (len > PAGE_SIZE - 1)
> > > +                       len = PAGE_SIZE - 1;
> > > +               p = kbuf = memdup_user_nul(buffer, len);
> > > +               if (IS_ERR(kbuf))
> > > +                       return PTR_ERR(kbuf);
> > > +               while (*p && (*p < '0' || *p > '9'))
> > > +                       p++;
> > > +               len = *p ? strtoul(p, &p, 10) : CONSOLE_LOGLEVEL_DEFAULT;
> > > +               WARN_ONCE(len < CONSOLE_LOGLEVEL_DEFAULT, "Reducing console_loglevel to %d", (int) len);
> > > +               kfree(kbuf);
> > > +       }
> > > +       return proc_dointvec(table, write, buffer, lenp, ppos);
> > > +}
> > > +
> > >  /* The default sysctl tables: */
> > >
> > >  static struct ctl_table sysctl_base_table[] = {
> > > @@ -839,7 +860,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
> > >                 .data           = &console_loglevel,
> > >                 .maxlen         = 4*sizeof(int),
> > >                 .mode           = 0644,
> > > -               .proc_handler   = proc_dointvec,
> > > +               .proc_handler   = proc_dointvec_loglevel,
> > >         },
> > >         {
> > >                 .procname       = "printk_ratelimit",
> > > --
> > > 1.8.3.1
> >
> >
> > Thanks.
> > Running this for 10 mins I only got a bunch of these, so at least none
> > of these WARNs does not trigger immediately. But I will continue
> > running it.
> >
> > WARNING in schedule_bh
> > kernel BUG at ./include/linux/mm.h:LINE!
> > WARNING in vkms_vblank_simulate
> > general protection fault in xfrmi_decode_session
> > INFO: task hung in corrupted
> > kernel BUG at drivers/android/binder_alloc.c:LINE!
> > BUG: unable to handle kernel NULL pointer dereference in
> > drm_atomic_helper_commit_modeset_disables
> > INFO: task hung in __rq_qos_throttle
> > WARNING in sk_stream_kill_queues
>
>
> Wait, but isn't SYSLOG_ACTION_CONSOLE_LEVEL what we are looking for?
> syzkaller knows about the syslog syscall:
> https://github.com/google/syzkaller/blob/13026d10f09f0e801c342e6c009ff580d49b894b/sys/linux/sys.txt#L322
> and even though it does not know SYSLOG_ACTION_CONSOLE_LEVEL const, it
> can guess that number.

I've restricted fuzzer from invoking SYSLOG_ACTION_CONSOLE_LEVEL.
Let's see if it helps.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-03-18 12:32                     ` Dmitry Vyukov
  2019-03-18 12:39                       ` Dmitry Vyukov
@ 2019-03-18 12:50                       ` Sergey Senozhatsky
  2019-03-18 13:42                         ` Dmitry Vyukov
  1 sibling, 1 reply; 29+ messages in thread
From: Sergey Senozhatsky @ 2019-03-18 12:50 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Tetsuo Handa, Sergey Senozhatsky, Petr Mladek,
	Sergey Senozhatsky, LKML, syzkaller

On (03/18/19 13:32), Dmitry Vyukov wrote:
> Wait, but isn't SYSLOG_ACTION_CONSOLE_LEVEL what we are looking for?
> syzkaller knows about the syslog syscall:

There is also SYSLOG_ACTION_CONSOLE_OFF, which disables logging.

	-ss

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-03-18 12:50                       ` Sergey Senozhatsky
@ 2019-03-18 13:42                         ` Dmitry Vyukov
  2019-03-18 14:09                           ` Sergey Senozhatsky
  0 siblings, 1 reply; 29+ messages in thread
From: Dmitry Vyukov @ 2019-03-18 13:42 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Tetsuo Handa, Petr Mladek, Sergey Senozhatsky, LKML, syzkaller

On Mon, Mar 18, 2019 at 1:50 PM Sergey Senozhatsky
<sergey.senozhatsky@gmail.com> wrote:
>
> On (03/18/19 13:32), Dmitry Vyukov wrote:
> > Wait, but isn't SYSLOG_ACTION_CONSOLE_LEVEL what we are looking for?
> > syzkaller knows about the syslog syscall:
>
> There is also SYSLOG_ACTION_CONSOLE_OFF, which disables logging.

That one I somehow figured earlier, here is the exact check:
https://github.com/google/syzkaller/blob/61f9c92f3a465025d2b9d91f429aaec968aeaa2a/sys/linux/init.go#L152-L155

But thanks for pointing it out.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-03-18 13:42                         ` Dmitry Vyukov
@ 2019-03-18 14:09                           ` Sergey Senozhatsky
  2019-03-19  8:10                             ` Dmitry Vyukov
  0 siblings, 1 reply; 29+ messages in thread
From: Sergey Senozhatsky @ 2019-03-18 14:09 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Tetsuo Handa, Petr Mladek,
	Sergey Senozhatsky, LKML, syzkaller

On (03/18/19 14:42), Dmitry Vyukov wrote:
> > There is also SYSLOG_ACTION_CONSOLE_OFF, which disables logging.
> 
> That one I somehow figured earlier, here is the exact check:
> https://github.com/google/syzkaller/blob/61f9c92f3a465025d2b9d91f429aaec968aeaa2a/sys/linux/init.go#L152-L155

OK, cool.
You probably would also want to add SYSLOG_ACTION_CLEAR to the list.

	-ss

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-03-18 12:39                       ` Dmitry Vyukov
@ 2019-03-19  0:41                         ` Tetsuo Handa
  0 siblings, 0 replies; 29+ messages in thread
From: Tetsuo Handa @ 2019-03-19  0:41 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Petr Mladek, Sergey Senozhatsky, LKML, syzkaller

Dmitry Vyukov wrote:
> On Mon, Mar 18, 2019 at 1:32 PM Dmitry Vyukov <dvyukov@google.com> wrote:
> > Wait, but isn't SYSLOG_ACTION_CONSOLE_LEVEL what we are looking for?
> > syzkaller knows about the syslog syscall:
> > https://github.com/google/syzkaller/blob/13026d10f09f0e801c342e6c009ff580d49b894b/sys/linux/sys.txt#L322
> > and even though it does not know SYSLOG_ACTION_CONSOLE_LEVEL const, it
> > can guess that number.
> 
> I've restricted fuzzer from invoking SYSLOG_ACTION_CONSOLE_LEVEL.
> Let's see if it helps.
> 

Oh, coverage page says that

        /* Set level of messages printed to console */
        case SYSLOG_ACTION_CONSOLE_LEVEL:
                if (len < 1 || len > 8) /*covered*/
                        return -EINVAL;
                if (len < minimum_console_loglevel) /*covered*/
                        len = minimum_console_loglevel;
                console_loglevel = len;
                /* Implicitly re-enable logging to console */
                saved_console_loglevel = LOGLEVEL_DEFAULT;
                break;

fuzzer was indeed testing SYSLOG_ACTION_CONSOLE_LEVEL case. ;-)

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-03-18 14:09                           ` Sergey Senozhatsky
@ 2019-03-19  8:10                             ` Dmitry Vyukov
  2019-03-19 12:35                               ` Sergey Senozhatsky
  0 siblings, 1 reply; 29+ messages in thread
From: Dmitry Vyukov @ 2019-03-19  8:10 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Tetsuo Handa, Petr Mladek, Sergey Senozhatsky, LKML, syzkaller

On Mon, Mar 18, 2019 at 3:09 PM Sergey Senozhatsky
<sergey.senozhatsky@gmail.com> wrote:
>
> On (03/18/19 14:42), Dmitry Vyukov wrote:
> > > There is also SYSLOG_ACTION_CONSOLE_OFF, which disables logging.
> >
> > That one I somehow figured earlier, here is the exact check:
> > https://github.com/google/syzkaller/blob/61f9c92f3a465025d2b9d91f429aaec968aeaa2a/sys/linux/init.go#L152-L155
>
> OK, cool.
> You probably would also want to add SYSLOG_ACTION_CLEAR to the list.

It's not hard to add, but I would like to understand why exactly.
syzkaller reads out console output from qemu, GCE VM, android phone
all the time. So if something was already written and we read it out
right away, it seems that clearing won't affect things. Does
SYSLOG_ACTION_CLEAR affect only reading output via syslog syscall,
/dev/something?

In some setups we read out output with "dmesg -w" (should stream
output as well) in absence of a better solution. Will this be affected
by SYSLOG_ACTION_CLEAR?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-03-19  8:10                             ` Dmitry Vyukov
@ 2019-03-19 12:35                               ` Sergey Senozhatsky
  2019-03-19 13:35                                 ` Dmitry Vyukov
  0 siblings, 1 reply; 29+ messages in thread
From: Sergey Senozhatsky @ 2019-03-19 12:35 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Tetsuo Handa, Petr Mladek,
	Sergey Senozhatsky, LKML, syzkaller

On (03/19/19 09:10), Dmitry Vyukov wrote:

[..]

> In some setups we read out output with "dmesg -w" (should stream
> output as well) in absence of a better solution. Will this be affected
> by SYSLOG_ACTION_CLEAR?

There is a small chance that syslog/devkmsg can be affected

	syslog_clear()
	{
		clear_seq = log_next_seq;
		clear_idx = log_next_idx;
	}

	syslog_print_all()
	{
		...
		seq = clear_seq;
		idx = clear_idx;
		while (seq < log_next_seq) {
		...
			len += msg_print_text(msg, true, time, NULL, 0);
		}
		...
	}

devkmsg also may be affected

	syslog_clear()
	{
		clear_seq = log_next_seq;
		clear_idx = log_next_idx;
	}

	devkmsg_llseek()
	{
	...
		case SEEK_DATA:
			user->idx = clear_idx;
			user->seq = clear_seq;
			break;
	...
	}

user->idx/user->seq then are used in devkmsg poll()/read(). syslog is
more likely to be affected, I think.

	-ss

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-03-19 12:35                               ` Sergey Senozhatsky
@ 2019-03-19 13:35                                 ` Dmitry Vyukov
  2019-05-08 10:31                                   ` Tetsuo Handa
  0 siblings, 1 reply; 29+ messages in thread
From: Dmitry Vyukov @ 2019-03-19 13:35 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Tetsuo Handa, Petr Mladek, Sergey Senozhatsky, LKML, syzkaller

On Tue, Mar 19, 2019 at 1:35 PM Sergey Senozhatsky
<sergey.senozhatsky@gmail.com> wrote:
>
> On (03/19/19 09:10), Dmitry Vyukov wrote:
>
> [..]
>
> > In some setups we read out output with "dmesg -w" (should stream
> > output as well) in absence of a better solution. Will this be affected
> > by SYSLOG_ACTION_CLEAR?
>
> There is a small chance that syslog/devkmsg can be affected
>
>         syslog_clear()
>         {
>                 clear_seq = log_next_seq;
>                 clear_idx = log_next_idx;
>         }
>
>         syslog_print_all()
>         {
>                 ...
>                 seq = clear_seq;
>                 idx = clear_idx;
>                 while (seq < log_next_seq) {
>                 ...
>                         len += msg_print_text(msg, true, time, NULL, 0);
>                 }
>                 ...
>         }
>
> devkmsg also may be affected
>
>         syslog_clear()
>         {
>                 clear_seq = log_next_seq;
>                 clear_idx = log_next_idx;
>         }
>
>         devkmsg_llseek()
>         {
>         ...
>                 case SEEK_DATA:
>                         user->idx = clear_idx;
>                         user->seq = clear_seq;
>                         break;
>         ...
>         }
>
> user->idx/user->seq then are used in devkmsg poll()/read(). syslog is
> more likely to be affected, I think.

SYSLOG_ACTION_CLEAR is disabled:
https://github.com/google/syzkaller/commit/28c04d39e052ee2960eebc16a3f7fba88920cd52

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-03-19 13:35                                 ` Dmitry Vyukov
@ 2019-05-08 10:31                                   ` Tetsuo Handa
  2019-05-09  9:58                                     ` Sergey Senozhatsky
  2019-05-10 14:12                                     ` Petr Mladek
  0 siblings, 2 replies; 29+ messages in thread
From: Tetsuo Handa @ 2019-05-08 10:31 UTC (permalink / raw)
  To: Dmitry Vyukov, Sergey Senozhatsky
  Cc: Petr Mladek, Sergey Senozhatsky, LKML, syzkaller

Hello.

We are again getting corrupted reports where message from WARN() is missing.
For example, https://syzkaller.appspot.com/text?tag=CrashLog&x=1720cac8a00000 was
titled as "WARNING in cgroup_exit" because the
"WARNING: CPU: 0 PID: 7870 at kernel/cgroup/cgroup.c:6008 cgroup_exit+0x51a/0x5d0"
line is there but https://syzkaller.appspot.com/text?tag=CrashLog&x=1670a602a00000
was titled as "corrupted report (2)" because the
"WARNING: CPU: 0 PID: 10223 at kernel/cgroup/cgroup.c:6008 cgroup_exit+0x51a/0x5d0"
line is missing. Also, it is unlikely that there was no printk() for a few minutes.
Thus, I suspect something is again suppressing console output.
Since this problem is happening in 5.1.0-next-20190507, do we want to try below one?

 kernel/printk/printk.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index e1e8250..f0b9463 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -3338,3 +3338,23 @@ void kmsg_dump_rewind(struct kmsg_dumper *dumper)
 EXPORT_SYMBOL_GPL(kmsg_dump_rewind);
 
 #endif
+
+#ifdef CONFIG_DEBUG_AID_FOR_SYZBOT
+static int initial_loglevel;
+static void check_loglevel(struct timer_list *timer)
+{
+	if (console_loglevel < initial_loglevel)
+		panic("Console loglevel changed (%d->%d)!", initial_loglevel,
+		      console_loglevel);
+	mod_timer(timer, jiffies + HZ);
+}
+static int __init loglevelcheck_init(void)
+{
+	static DEFINE_TIMER(timer, check_loglevel);
+
+	initial_loglevel = console_loglevel;
+	mod_timer(&timer, jiffies + HZ);
+	return 0;
+}
+late_initcall(loglevelcheck_init);
+#endif





By the way, recently we are hitting false positives caused by "WARNING:"
string from not WARN() messages but plain printk() messages (e.g.

  https://syzkaller.appspot.com/bug?id=31bdef63e48688854fde93e6edf390922b70f8a4
  https://syzkaller.appspot.com/bug?id=faae4720a75cadb8cd0dbda5c4d3542228d37340

) and we need to avoid emitting "WARNING:" string from plain printk() messages
during fuzzing testing. I guess we want to add something like
CONFIG_DEBUG_AID_FOR_SYZBOT to all kernels in order to mask such string...

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-05-08 10:31                                   ` Tetsuo Handa
@ 2019-05-09  9:58                                     ` Sergey Senozhatsky
  2019-05-09 10:18                                       ` Sergey Senozhatsky
  2019-05-09 10:26                                       ` Tetsuo Handa
  2019-05-10 14:12                                     ` Petr Mladek
  1 sibling, 2 replies; 29+ messages in thread
From: Sergey Senozhatsky @ 2019-05-09  9:58 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Dmitry Vyukov, Sergey Senozhatsky, Petr Mladek,
	Sergey Senozhatsky, LKML, syzkaller

On (05/08/19 19:31), Tetsuo Handa wrote:
[..]
> We are again getting corrupted reports where message from WARN() is missing.
> For example, https://syzkaller.appspot.com/text?tag=CrashLog&x=1720cac8a00000 was
> titled as "WARNING in cgroup_exit" because the
> "WARNING: CPU: 0 PID: 7870 at kernel/cgroup/cgroup.c:6008 cgroup_exit+0x51a/0x5d0"
> line is there but https://syzkaller.appspot.com/text?tag=CrashLog&x=1670a602a00000
> was titled as "corrupted report (2)" because the
> "WARNING: CPU: 0 PID: 10223 at kernel/cgroup/cgroup.c:6008 cgroup_exit+0x51a/0x5d0"
> line is missing. Also, it is unlikely that there was no printk() for a few minutes.
> Thus, I suspect something is again suppressing console output.

Hmm... That's interesting...

[..]
> +#ifdef CONFIG_DEBUG_AID_FOR_SYZBOT
> +static int initial_loglevel;
> +static void check_loglevel(struct timer_list *timer)
> +{
> +	if (console_loglevel < initial_loglevel)
> +		panic("Console loglevel changed (%d->%d)!", initial_loglevel,
> +		      console_loglevel);
> +	mod_timer(timer, jiffies + HZ);
> +}
> +static int __init loglevelcheck_init(void)
> +{
> +	static DEFINE_TIMER(timer, check_loglevel);
> +
> +	initial_loglevel = console_loglevel;
> +	mod_timer(&timer, jiffies + HZ);
> +	return 0;
> +}
> +late_initcall(loglevelcheck_init);
> +#endif

I suppose this patch is for internal testing at Google only. I don't
think we can consider upstreaming it.

> By the way, recently we are hitting false positives caused by "WARNING:"
> string from not WARN() messages but plain printk() messages (e.g.
> 
>   https://syzkaller.appspot.com/bug?id=31bdef63e48688854fde93e6edf390922b70f8a4
>   https://syzkaller.appspot.com/bug?id=faae4720a75cadb8cd0dbda5c4d3542228d37340
> 
> ) and we need to avoid emitting "WARNING:" string from plain printk() messages
> during fuzzing testing. I guess we want to add something like
> CONFIG_DEBUG_AID_FOR_SYZBOT to all kernels in order to mask such string...

I thought that we have MSG_FORMAT_SYSLOG exactly for things like these,
so you can look at actual message level <%d> and then decide if it's a
warning or a false alarm.

These are pr_info() level messages, but the text contains "WARNING: "

[..]
pvr2_trace(PVR2_TRACE_ERROR_LEGS,
	   "WARNING: Detected a wedged cx25840 chip; the device will not work.");
[..]

I would suggest to fix pvrusb2-i2c-core.c. I don't think we really
want to $text =~ s/WARNING//g in printk.

	-ss

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-05-09  9:58                                     ` Sergey Senozhatsky
@ 2019-05-09 10:18                                       ` Sergey Senozhatsky
  2019-05-09 10:40                                         ` Tetsuo Handa
  2019-05-09 10:26                                       ` Tetsuo Handa
  1 sibling, 1 reply; 29+ messages in thread
From: Sergey Senozhatsky @ 2019-05-09 10:18 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Dmitry Vyukov, Sergey Senozhatsky, Petr Mladek, LKML, syzkaller,
	Sergey Senozhatsky

On (05/09/19 18:58), Sergey Senozhatsky wrote:
> On (05/08/19 19:31), Tetsuo Handa wrote:
> [..]
> > We are again getting corrupted reports where message from WARN() is missing.
> > For example, https://syzkaller.appspot.com/text?tag=CrashLog&x=1720cac8a00000 was
> > titled as "WARNING in cgroup_exit" because the
> > "WARNING: CPU: 0 PID: 7870 at kernel/cgroup/cgroup.c:6008 cgroup_exit+0x51a/0x5d0"
> > line is there but https://syzkaller.appspot.com/text?tag=CrashLog&x=1670a602a00000
> > was titled as "corrupted report (2)" because the
> > "WARNING: CPU: 0 PID: 10223 at kernel/cgroup/cgroup.c:6008 cgroup_exit+0x51a/0x5d0"
> > line is missing. Also, it is unlikely that there was no printk() for a few minutes.
> > Thus, I suspect something is again suppressing console output.
> 
> Hmm... That's interesting...

What are these lines right before the kernel panic output?
TTY writes (user space logging)?


03:54:05 executing program 5:
syz_mount_image$xfs(&(0x7f0000000000)='xfs\x00', &(0x7f0000000040)='./file0\x00', 0x0, 0x0, 0x0, 0xe003, &(0x7f00000005c0)={[{@nolargeio='nolargeio'}]})

03:54:06 executing program 2:
syz_emit_ethernet(0x66, &(0x7f0000000080)={@local, @random="029cce98941b", [], {@ipv6={0x86dd, {0x0, 0x6, 'v`Q', 0x30, 0x3a, 0xffffffffffffffff, @remote={0xfe, 0x80, [0x29c, 0x0, 0x700, 0x5], 0xffffffffffffffff}, @mcast2={0xff, 0x2, [0x0, 0xfffffffffffff000]}, {[], @icmpv6=@dest_unreach={0xffffff86, 0x0, 0x0, 0x0, [0x7], {0x0, 0x6, "c5961e", 0x0, 0x0, 0x0, @mcast1={0xff, 0x1, [0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x3, 0x3, 0x0, 0x0, 0x8906], 0x8200}, @mcast2}}}}}}}, 0x0)

[ 2396.035331][T10223] Kernel panic - not syncing: panic_on_warn set ...
[ 2396.042217][T10223] CPU: 0 PID: 10223 Comm: syz-executor.1 Not tainted 5.1.0-next-20190507 #2


Hmm... Dunno... Don't really have any explanations yet...

Can you add

	if (oops_in_progress)
		return IRQ_HANDLED;

to you console driver's IRQ handler (xmit TX/RX path)? Just to
check if this will change anything...

	-ss

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-05-09  9:58                                     ` Sergey Senozhatsky
  2019-05-09 10:18                                       ` Sergey Senozhatsky
@ 2019-05-09 10:26                                       ` Tetsuo Handa
  2019-05-09 10:36                                         ` Sergey Senozhatsky
  1 sibling, 1 reply; 29+ messages in thread
From: Tetsuo Handa @ 2019-05-09 10:26 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Dmitry Vyukov, Sergey Senozhatsky, Petr Mladek, LKML, syzkaller

On 2019/05/09 18:58, Sergey Senozhatsky wrote:
>> +#ifdef CONFIG_DEBUG_AID_FOR_SYZBOT
>> +static int initial_loglevel;
>> +static void check_loglevel(struct timer_list *timer)
>> +{
>> +	if (console_loglevel < initial_loglevel)
>> +		panic("Console loglevel changed (%d->%d)!", initial_loglevel,
>> +		      console_loglevel);
>> +	mod_timer(timer, jiffies + HZ);
>> +}
>> +static int __init loglevelcheck_init(void)
>> +{
>> +	static DEFINE_TIMER(timer, check_loglevel);
>> +
>> +	initial_loglevel = console_loglevel;
>> +	mod_timer(&timer, jiffies + HZ);
>> +	return 0;
>> +}
>> +late_initcall(loglevelcheck_init);
>> +#endif
> 
> I suppose this patch is for internal testing at Google only. I don't
> think we can consider upstreaming it.

Right. There is CONFIG_DEBUG_AID_FOR_SYZBOT option for testing at linux-next.git only.



> 
>> By the way, recently we are hitting false positives caused by "WARNING:"
>> string from not WARN() messages but plain printk() messages (e.g.
>>
>>   https://syzkaller.appspot.com/bug?id=31bdef63e48688854fde93e6edf390922b70f8a4
>>   https://syzkaller.appspot.com/bug?id=faae4720a75cadb8cd0dbda5c4d3542228d37340
>>
>> ) and we need to avoid emitting "WARNING:" string from plain printk() messages
>> during fuzzing testing. I guess we want to add something like
>> CONFIG_DEBUG_AID_FOR_SYZBOT to all kernels in order to mask such string...
> 
> I thought that we have MSG_FORMAT_SYSLOG exactly for things like these,
> so you can look at actual message level <%d> and then decide if it's a
> warning or a false alarm.

Since syzbot needs to use console output, message level is not available.

> 
> These are pr_info() level messages, but the text contains "WARNING: "
> 
> [..]
> pvr2_trace(PVR2_TRACE_ERROR_LEGS,
> 	   "WARNING: Detected a wedged cx25840 chip; the device will not work.");
> [..]
> 
> I would suggest to fix pvrusb2-i2c-core.c. I don't think we really
> want to $text =~ s/WARNING//g in printk.

Of course, I don't want to try $text =~ s/WARNING//g in printk().
What I meant is guard the callers like
https://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git/commit/?h=next-tomoyo&id=5c6b31e31adc31bd12636b196d3311f845dcc9d8
using a kernel config option which will be acceptable for upstreaming, for
we need to apply to all kernels which syzbot is testing.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-05-09 10:26                                       ` Tetsuo Handa
@ 2019-05-09 10:36                                         ` Sergey Senozhatsky
  0 siblings, 0 replies; 29+ messages in thread
From: Sergey Senozhatsky @ 2019-05-09 10:36 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Dmitry Vyukov, Sergey Senozhatsky,
	Petr Mladek, LKML, syzkaller

On (05/09/19 19:26), Tetsuo Handa wrote:
> >> By the way, recently we are hitting false positives caused by "WARNING:"
> >> string from not WARN() messages but plain printk() messages (e.g.
> >>
> >>   https://syzkaller.appspot.com/bug?id=31bdef63e48688854fde93e6edf390922b70f8a4
> >>   https://syzkaller.appspot.com/bug?id=faae4720a75cadb8cd0dbda5c4d3542228d37340
> >>
> >> ) and we need to avoid emitting "WARNING:" string from plain printk() messages
> >> during fuzzing testing. I guess we want to add something like
> >> CONFIG_DEBUG_AID_FOR_SYZBOT to all kernels in order to mask such string...
> > 
> > I thought that we have MSG_FORMAT_SYSLOG exactly for things like these,
> > so you can look at actual message level <%d> and then decide if it's a
> > warning or a false alarm.
> 
> Since syzbot needs to use console output, message level is not available.

Sorry, not following. MSG_FORMAT_SYSLOG should add message level
to console output. Booting the kernel with console_msg_format=syslog
will change msg_print_text() behaviour, and you'll get on your serial
console messages in

	<print_syslog>[print_time][print_caller] message text\n

format.

Does this work for you?

	-ss

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-05-09 10:18                                       ` Sergey Senozhatsky
@ 2019-05-09 10:40                                         ` Tetsuo Handa
  0 siblings, 0 replies; 29+ messages in thread
From: Tetsuo Handa @ 2019-05-09 10:40 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Dmitry Vyukov, Sergey Senozhatsky, Petr Mladek, LKML, syzkaller

On 2019/05/09 19:18, Sergey Senozhatsky wrote:
> What are these lines right before the kernel panic output?
> TTY writes (user space logging)?

I guess that the user space is calling printf() on stdout
which will be readable as console output.

> 
> 
> 03:54:05 executing program 5:
> syz_mount_image$xfs(&(0x7f0000000000)='xfs\x00', &(0x7f0000000040)='./file0\x00', 0x0, 0x0, 0x0, 0xe003, &(0x7f00000005c0)={[{@nolargeio='nolargeio'}]})
> 
> 03:54:06 executing program 2:
> syz_emit_ethernet(0x66, &(0x7f0000000080)={@local, @random="029cce98941b", [], {@ipv6={0x86dd, {0x0, 0x6, 'v`Q', 0x30, 0x3a, 0xffffffffffffffff, @remote={0xfe, 0x80, [0x29c, 0x0, 0x700, 0x5], 0xffffffffffffffff}, @mcast2={0xff, 0x2, [0x0, 0xfffffffffffff000]}, {[], @icmpv6=@dest_unreach={0xffffff86, 0x0, 0x0, 0x0, [0x7], {0x0, 0x6, "c5961e", 0x0, 0x0, 0x0, @mcast1={0xff, 0x1, [0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x3, 0x3, 0x0, 0x0, 0x8906], 0x8200}, @mcast2}}}}}}}, 0x0)
> 
> [ 2396.035331][T10223] Kernel panic - not syncing: panic_on_warn set ...
> [ 2396.042217][T10223] CPU: 0 PID: 10223 Comm: syz-executor.1 Not tainted 5.1.0-next-20190507 #2
> 
> 
> Hmm... Dunno... Don't really have any explanations yet...
> 
> Can you add
> 
> 	if (oops_in_progress)
> 		return IRQ_HANDLED;
> 
> to you console driver's IRQ handler (xmit TX/RX path)? Just to
> check if this will change anything...

The first userspace timestamp available is

  03:51:19 executing program 2:
  syz_emit_ethernet(0x66, &(0x7f0000000080)={@local, @random="029cce98941b", [], {@ipv6={0x86dd, {0x0, 0x6, 'v`Q', 0x30, 0x3a, 0xffffffffffffffff, @remote={0xfe, 0x80, [0x29c, 0x0, 0x700, 0x5], 0xffffffffffffffff}, @mcast2={0xff, 0x2, [0x0, 0xfffffffffffff000]}, {[], @icmpv6=@dest_unreach={0xffffff86, 0x0, 0x0, 0x0, [0x7], {0x0, 0x6, "c5961e", 0x0, 0x0, 0xf5ffffff00000000, @mcast1={0xff, 0x1, [0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x3, 0x3], 0x8200}, @mcast2}}}}}}}, 0x0)

which means that there was no printk() message for nearly 3 minutes.
Can't be oops_in_progress for such long period.

Since printk() timestamp says that the uptime is nearly 40 minutes, I think
that something changed console loglevel (in a way console_verbose() from
panic() can restore) many minutes ago from the crash...


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-05-08 10:31                                   ` Tetsuo Handa
  2019-05-09  9:58                                     ` Sergey Senozhatsky
@ 2019-05-10 14:12                                     ` Petr Mladek
  2019-05-10 14:53                                       ` Tetsuo Handa
  1 sibling, 1 reply; 29+ messages in thread
From: Petr Mladek @ 2019-05-10 14:12 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Dmitry Vyukov, Sergey Senozhatsky, Sergey Senozhatsky, LKML, syzkaller

On Wed 2019-05-08 19:31:06, Tetsuo Handa wrote:
> Hello.
> 
> We are again getting corrupted reports where message from WARN() is missing.
> For example, https://syzkaller.appspot.com/text?tag=CrashLog&x=1720cac8a00000 was
> titled as "WARNING in cgroup_exit" because the
> "WARNING: CPU: 0 PID: 7870 at kernel/cgroup/cgroup.c:6008 cgroup_exit+0x51a/0x5d0"
> line is there but https://syzkaller.appspot.com/text?tag=CrashLog&x=1670a602a00000
> was titled as "corrupted report (2)" because the
> "WARNING: CPU: 0 PID: 10223 at kernel/cgroup/cgroup.c:6008 cgroup_exit+0x51a/0x5d0"
> line is missing. Also, it is unlikely that there was no printk() for a few minutes.
> Thus, I suspect something is again suppressing console output.

It is just a wild speculation. It is highly unlikely that the
console_loglevel was manipulated to hide messages. There are
very few locations where console_loglevel is manipulated.

Anyway, the new "panic_print" feature from Feng Tang might help here.
It is in mm tree, see
https://marc.info/?l=linux-mm-commits&m=155614613719648&w=2


> Since this problem is happening in 5.1.0-next-20190507, do we want to try below one?
> 
>  kernel/printk/printk.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index e1e8250..f0b9463 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -3338,3 +3338,23 @@ void kmsg_dump_rewind(struct kmsg_dumper *dumper)
>  EXPORT_SYMBOL_GPL(kmsg_dump_rewind);
>  
>  #endif
> +
> +#ifdef CONFIG_DEBUG_AID_FOR_SYZBOT
> +static int initial_loglevel;
> +static void check_loglevel(struct timer_list *timer)
> +{
> +	if (console_loglevel < initial_loglevel)
> +		panic("Console loglevel changed (%d->%d)!", initial_loglevel,
> +		      console_loglevel);

I am not sure why panic() is needed. I would personally start with
pr_emerg(). Anyway, I somehow doubt that this is the reason.

> +	mod_timer(timer, jiffies + HZ);
> +}
> +static int __init loglevelcheck_init(void)
> +{
> +	static DEFINE_TIMER(timer, check_loglevel);
> +
> +	initial_loglevel = console_loglevel;
> +	mod_timer(&timer, jiffies + HZ);
> +	return 0;
> +}
> +late_initcall(loglevelcheck_init);
> +#endif
> 
> 
> 
> 
> 
> By the way, recently we are hitting false positives caused by "WARNING:"
> string from not WARN() messages but plain printk() messages (e.g.
> 
>   https://syzkaller.appspot.com/bug?id=31bdef63e48688854fde93e6edf390922b70f8a4
>   https://syzkaller.appspot.com/bug?id=faae4720a75cadb8cd0dbda5c4d3542228d37340
> 
> ) and we need to avoid emitting "WARNING:" string from plain printk() messages
> during fuzzing testing. I guess we want to add something like
> CONFIG_DEBUG_AID_FOR_SYZBOT to all kernels in order to mask such string...

I wonder who catches the string "WARNING" and how the system is
killed.

panic_on_warn should cause that WARN() macro calls panic(). Simple
printk() should not cause this.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..."
  2019-05-10 14:12                                     ` Petr Mladek
@ 2019-05-10 14:53                                       ` Tetsuo Handa
  0 siblings, 0 replies; 29+ messages in thread
From: Tetsuo Handa @ 2019-05-10 14:53 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Dmitry Vyukov, Sergey Senozhatsky, Sergey Senozhatsky, LKML, syzkaller

On 2019/05/10 23:12, Petr Mladek wrote:
> On Wed 2019-05-08 19:31:06, Tetsuo Handa wrote:
>> Hello.
>>
>> We are again getting corrupted reports where message from WARN() is missing.
>> For example, https://syzkaller.appspot.com/text?tag=CrashLog&x=1720cac8a00000 was
>> titled as "WARNING in cgroup_exit" because the
>> "WARNING: CPU: 0 PID: 7870 at kernel/cgroup/cgroup.c:6008 cgroup_exit+0x51a/0x5d0"
>> line is there but https://syzkaller.appspot.com/text?tag=CrashLog&x=1670a602a00000
>> was titled as "corrupted report (2)" because the
>> "WARNING: CPU: 0 PID: 10223 at kernel/cgroup/cgroup.c:6008 cgroup_exit+0x51a/0x5d0"
>> line is missing. Also, it is unlikely that there was no printk() for a few minutes.
>> Thus, I suspect something is again suppressing console output.
> 
> It is just a wild speculation. It is highly unlikely that the
> console_loglevel was manipulated to hide messages. There are
> very few locations where console_loglevel is manipulated.

But since messages from panic() are printed, messages are suppressed in a way
panic() can restore the setting.

> 
> Anyway, the new "panic_print" feature from Feng Tang might help here.
> It is in mm tree, see
> https://marc.info/?l=linux-mm-commits&m=155614613719648&w=2
> 
> 
>> Since this problem is happening in 5.1.0-next-20190507, do we want to try below one?
>>
>>  kernel/printk/printk.c | 20 ++++++++++++++++++++
>>  1 file changed, 20 insertions(+)
>>
>> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
>> index e1e8250..f0b9463 100644
>> --- a/kernel/printk/printk.c
>> +++ b/kernel/printk/printk.c
>> @@ -3338,3 +3338,23 @@ void kmsg_dump_rewind(struct kmsg_dumper *dumper)
>>  EXPORT_SYMBOL_GPL(kmsg_dump_rewind);
>>  
>>  #endif
>> +
>> +#ifdef CONFIG_DEBUG_AID_FOR_SYZBOT
>> +static int initial_loglevel;
>> +static void check_loglevel(struct timer_list *timer)
>> +{
>> +	if (console_loglevel < initial_loglevel)
>> +		panic("Console loglevel changed (%d->%d)!", initial_loglevel,
>> +		      console_loglevel);
> 
> I am not sure why panic() is needed. I would personally start with
> pr_emerg(). Anyway, I somehow doubt that this is the reason.

Since the testcase is printed prior to execution, this panic()
tries to catch it as soon as possible if some testcase is actually
changing loglevel. As I confirmed that nobody changes console loglevel
before starting fuzzing tests, I'd like to try this patch in linux-next.





>> By the way, recently we are hitting false positives caused by "WARNING:"
>> string from not WARN() messages but plain printk() messages (e.g.
>>
>>   https://syzkaller.appspot.com/bug?id=31bdef63e48688854fde93e6edf390922b70f8a4
>>   https://syzkaller.appspot.com/bug?id=faae4720a75cadb8cd0dbda5c4d3542228d37340
>>
>> ) and we need to avoid emitting "WARNING:" string from plain printk() messages
>> during fuzzing testing. I guess we want to add something like
>> CONFIG_DEBUG_AID_FOR_SYZBOT to all kernels in order to mask such string...
> 
> I wonder who catches the string "WARNING" and how the system is
> killed.

syzkaller catches it and treats as if a crash occurred.

> 
> panic_on_warn should cause that WARN() macro calls panic(). Simple
> printk() should not cause this.

Since we can't tell whether a WARNING: string reports a problem enough to
treat as a crash, we need to mask WARNING: string like 
https://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git/commit/?h=next-tomoyo&id=5c6b31e31adc31bd12636b196d3311f845dcc9d8 .

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2019-05-10 14:53 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-16  2:09 [syzbot? printk?] no WARN_ON() messages printed before "Kernel panic - not syncing: panic_on_warn set ..." Tetsuo Handa
2019-03-16  9:11 ` Dmitry Vyukov
2019-03-16 10:18   ` Tetsuo Handa
2019-03-16 14:14     ` Sergey Senozhatsky
2019-03-16 14:16       ` Dmitry Vyukov
2019-03-16 14:40         ` Sergey Senozhatsky
2019-03-16 14:53         ` Tetsuo Handa
2019-03-16 14:57           ` Dmitry Vyukov
2019-03-16 15:02             ` Tetsuo Handa
2019-03-16 15:10               ` Dmitry Vyukov
2019-03-18  5:27                 ` Tetsuo Handa
2019-03-18 12:07                   ` Dmitry Vyukov
2019-03-18 12:32                     ` Dmitry Vyukov
2019-03-18 12:39                       ` Dmitry Vyukov
2019-03-19  0:41                         ` Tetsuo Handa
2019-03-18 12:50                       ` Sergey Senozhatsky
2019-03-18 13:42                         ` Dmitry Vyukov
2019-03-18 14:09                           ` Sergey Senozhatsky
2019-03-19  8:10                             ` Dmitry Vyukov
2019-03-19 12:35                               ` Sergey Senozhatsky
2019-03-19 13:35                                 ` Dmitry Vyukov
2019-05-08 10:31                                   ` Tetsuo Handa
2019-05-09  9:58                                     ` Sergey Senozhatsky
2019-05-09 10:18                                       ` Sergey Senozhatsky
2019-05-09 10:40                                         ` Tetsuo Handa
2019-05-09 10:26                                       ` Tetsuo Handa
2019-05-09 10:36                                         ` Sergey Senozhatsky
2019-05-10 14:12                                     ` Petr Mladek
2019-05-10 14:53                                       ` Tetsuo Handa

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.