* Re: printk feature for syzbot?
       [not found] ` <CACT4Y+boyw_Qy=y-iTnsKZrtTgF0Hk3nHN_xtqUdX4etgiYDQw@mail.gmail.com>
@ 2018-04-24  1:33   ` Sergey Senozhatsky
  2018-04-24 14:40     ` Steven Rostedt
  2018-04-26 10:06     ` Petr Mladek
  0 siblings, 2 replies; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-04-24  1:33 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Tetsuo Handa, Sergey Senozhatsky, syzkaller, Petr Mladek,
	Steven Rostedt, Fengguang Wu, linux-kernel

Let me Cc Petr, Steven and Fengguang on this

On (04/23/18 15:40), Dmitry Vyukov wrote:
> On Mon, Apr 23, 2018 at 3:33 PM, Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
> > Hello, Sergey.
> >
> > Recently I'm fixing bugs reported by syzbot ( https://syzkaller.appspot.com/ ).
> >
> > Since syzbot frequently makes printk() flooding (e.g. memory allocation fault
> > injection), it is always difficult to distinguish which line is from which event.
> >
> > I wish printk() can prefix context identifier.
> > If I recall correctly, you are using some extra output for debugging, aren't you?
> 
> +syzkaller mailing list for history
> 
> Hi Tetsuo, Sergey,
> 
> Something like TID prefix would be useful. Potentially it would allow
> us to untangle multiple intermixed crash reports.

Hello,

Yes, Tetsuo, we use a bunch of "printk prefix" extensions at Samsung.
For instance, we prefix printk messages with the CPU number: messages
sometimes mix up, we also see partial pr_cont flushes, and so on.
Grep-ping serial logs by CPU number is quite powerful.

Upstreaming those printk prefixes can be a bit challenging, but maybe
it's not all that bad. I personally think that syzbot, and build-test
bots in general [like 0day], are helpful indeed, and I don't see why life
should be any more complex for syzbot/0day guys. If printk prefixes can
help - then we probably should consider such an extension.

The main argument from the upstream is that tweaking struct printk_log
breaks user space (tools like crash, and so on). But I guess we can do
something about it. E.g. put a PRINTK_CONTEXT_TRACKING_PREFIX kconfig
option somewhere in "Kernel hacking"->"printk and dmesg options" and
make it available only for DEBUG kernels, or something similar.

Petr, Steven, Fengguang, what do you think? Do you have any objections?
Ideas?

	-ss


* Re: printk feature for syzbot?
  2018-04-24  1:33   ` printk feature for syzbot? Sergey Senozhatsky
@ 2018-04-24 14:40     ` Steven Rostedt
  2018-04-26 10:06     ` Petr Mladek
  1 sibling, 0 replies; 68+ messages in thread
From: Steven Rostedt @ 2018-04-24 14:40 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Dmitry Vyukov, Tetsuo Handa, Sergey Senozhatsky, syzkaller,
	Petr Mladek, Fengguang Wu, linux-kernel

On Tue, 24 Apr 2018 10:33:36 +0900
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:

> Petr, Steven, Fengguang, what do you think? Do you have any objections?
> Ideas?

If it can be turned off by a config option, I'm fine with it.

-- Steve


* Re: printk feature for syzbot?
  2018-04-24  1:33   ` printk feature for syzbot? Sergey Senozhatsky
  2018-04-24 14:40     ` Steven Rostedt
@ 2018-04-26 10:06     ` Petr Mladek
  2018-05-10  4:22       ` Sergey Senozhatsky
  1 sibling, 1 reply; 68+ messages in thread
From: Petr Mladek @ 2018-04-26 10:06 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Dmitry Vyukov, Tetsuo Handa, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, linux-kernel

On Tue 2018-04-24 10:33:36, Sergey Senozhatsky wrote:
> Yes, Tetsuo, we use a bunch of "printk prefix" extensions at Samsung.
> For instance, we prefix printk messages with the CPU number: messages
> sometimes mix up, we also see partial pr_cont flushes, and so on.
> Grep-ping serial logs by CPU number is quite powerful.
> 
> Upstreaming those printk prefixes can be a bit challenging, but maybe
> it's not all that bad. I personally think that syzbot, and build-test
> bots in general [like 0day], are helpful indeed, and I don't see why life
> should be any more complex for syzbot/0day guys. If printk prefixes can
> help - then we probably should consider such an extension.
> 
> The main argument from the upstream is that tweaking struct printk_log
> breaks user space (tools like crash, and so on). But I guess we can do
> something about it. E.g. put a PRINTK_CONTEXT_TRACKING_PREFIX kconfig
> option somewhere in "Kernel hacking"->"printk and dmesg options" and
> make available only for DEBUG kernels, or something similar.

> Petr, Steven, Fengguang, what do you think? Do you have any objections?
> Ideas?

I wonder if we could create some mechanism that would make it easier
to extend struct printk_log in the future.

I know only about crash tool implementation. It uses information provided
by log_buf_vmcoreinfo_setup(). The size of the structure is already
public. Therefore crash should be able to find all existing information
even if we increase the size of the structure.

log_buf_vmcoreinfo_setup() even allows us to announce newly added
structure items. We could probably extend it to also export
the offsets of the new optional elements.

I am not sure about other tools. But I think that it should be
doable.

Best Regards,
Petr
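
For illustration only (not part of the original message): a minimal
user-space sketch of the idea above, assuming the notes take the
"SIZE(...)" / "OFFSET(...)" text form. struct demo_log, its caller_id
field and demo_export_layout() are hypothetical names, not the kernel's
struct printk_log or log_buf_vmcoreinfo_setup().

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record header, loosely modeled on a log record. */
struct demo_log {
	uint64_t ts_nsec;   /* timestamp */
	uint16_t len;       /* length of the whole record */
	uint16_t text_len;  /* length of the message text */
	uint32_t caller_id; /* hypothetical new optional field */
};

/*
 * Write the structure size and field offsets as text notes.
 * Returns the snprintf() result (bytes that were or would be written).
 */
static int demo_export_layout(char *buf, size_t size)
{
	return snprintf(buf, size,
			"SIZE(demo_log)=%zu\n"
			"OFFSET(demo_log.ts_nsec)=%zu\n"
			"OFFSET(demo_log.len)=%zu\n"
			"OFFSET(demo_log.text_len)=%zu\n"
			"OFFSET(demo_log.caller_id)=%zu\n",
			sizeof(struct demo_log),
			offsetof(struct demo_log, ts_nsec),
			offsetof(struct demo_log, len),
			offsetof(struct demo_log, text_len),
			offsetof(struct demo_log, caller_id));
}
```

A dump tool that reads such notes can locate the fields it knows about
by offset and skip any it does not recognize, so appending a field does
not have to break older readers.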


* Re: printk feature for syzbot?
  2018-04-26 10:06     ` Petr Mladek
@ 2018-05-10  4:22       ` Sergey Senozhatsky
  2018-05-10 11:30         ` Petr Mladek
  2018-05-10 14:50         ` Tetsuo Handa
  0 siblings, 2 replies; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-05-10  4:22 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Dmitry Vyukov, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, Fengguang Wu,
	linux-kernel

On (04/26/18 12:06), Petr Mladek wrote:
> 
> > Petr, Steven, Fengguang, what do you think? Do you have any objections?
> > Ideas?
> 
> I wonder if we could create some mechanism that would help to extend
> struct printk_log easier in the future.

Hm, interesting idea.

> I know only about crash tool implementation. It uses information provided
> by log_buf_vmcoreinfo_setup(). The size of the structure is already
> public. Therefore crash should be able to find all existing information
> even if we increase the size of the structure.
> 
> log_buf_vmcoreinfo_setup() even allows to inform about newly added
> structure items. We could probably extend it to inform also about
> the offset of the new optional elements.

I vaguely remember that the last time Thomas Gleixner modified
printk_log you managed to find a case that broke crash tool.
... Or maybe I'm mistaken.

> I am not sure about other tools. But I think that it should be
> doable.

Good. So there are no objections, so far.

Tetsuo, Dmitry, care to send a patch?

	-ss


* Re: printk feature for syzbot?
  2018-05-10  4:22       ` Sergey Senozhatsky
@ 2018-05-10 11:30         ` Petr Mladek
  2018-05-10 12:11           ` Sergey Senozhatsky
  2018-05-10 14:50         ` Tetsuo Handa
  1 sibling, 1 reply; 68+ messages in thread
From: Petr Mladek @ 2018-05-10 11:30 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Dmitry Vyukov, Tetsuo Handa, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, linux-kernel

On Thu 2018-05-10 13:22:06, Sergey Senozhatsky wrote:
> On (04/26/18 12:06), Petr Mladek wrote:
> > 
> > > Petr, Steven, Fengguang, what do you think? Do you have any objections?
> > > Ideas?
> > 
> > I wonder if we could create some mechanism that would help to extend
> > struct printk_log easier in the future.
> 
> Hm, interesting idea.
> 
> > I know only about crash tool implementation. It uses information provided
> > by log_buf_vmcoreinfo_setup(). The size of the structure is already
> > public. Therefore crash should be able to find all existing information
> > even if we increase the size of the structure.
> > 
> > log_buf_vmcoreinfo_setup() even allows to inform about newly added
> > structure items. We could probably extend it to inform also about
> > the offset of the new optional elements.
> 
> I vaguely remember that the last time Thomas Gleixner modified
> printk_log you managed to find a case that broke crash tool.
> ... Or maybe I'm mistaken.

I guess that you are talking about the patchset adding the possibility
to use different time-stamps[1]. It changed the semantics of the
timestamp. All the tools needed an update to show the timestamp
correctly.

The patchset was rejected by Linus because it would break some
userspace tools, e.g. systemd, that depend on the format and semantics
provided by /dev/kmsg[2].

In other words, we must not change the /dev/kmsg format. But it should
be acceptable to change/extend the internal format and eventually
extend the format used on consoles.

Anyway, we need to be careful and test makedumpfile and crash tools
and eventually provide patches for them.

References:
[1] https://lkml.kernel.org/r/20160419085613.GJ6862@pathway.suse.cz
[2] https://lkml.kernel.org/r/CA+55aFzLH9crdMtUFkD-PtNGuxu_fsG5GH2ACni69ug9iM=09g@mail.gmail.com

Best Regards,
Petr


* Re: printk feature for syzbot?
  2018-05-10 11:30         ` Petr Mladek
@ 2018-05-10 12:11           ` Sergey Senozhatsky
  2018-05-10 14:22             ` Steven Rostedt
  0 siblings, 1 reply; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-05-10 12:11 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Dmitry Vyukov, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, Fengguang Wu,
	linux-kernel

On (05/10/18 13:30), Petr Mladek wrote:
[..]
> I guess that you are talking about the patchset adding the possibility
> to use different time-stamps[1]. It changed the semantics of the
> timestamp. All the tools needed an update to show the timestamp
> correctly.
> 
> The patchset was rejected by Linus because it would break some
> userspace tools, e.g. systemd, that depend on the format and semantics
> provided by /dev/kmsg[2].

Right, but I think I was talking about this email
 https://lkml.kernel.org/r/20171123124648.s4oigunxjfzvhtqh@pathway.suse.cz

But yeah, it's not really related to the extension of struct printk_log,
so I think we should be fine.

> In other words, we must not change the /dev/kmsg format. But it should
> be acceptable to change/extend the internal format and eventually
> extend the format used on consoles.

Sure.

> Anyway, we need to be careful and test makedumpfile and crash tools
> and eventually provide patches for them.

Agreed. I'd prefer it to be hidden somewhere under kernel hacking config,
so only syzkaller folks would enable it. I think Steven also mentioned
a config option.

> References:
> [1] https://lkml.kernel.org/r/20160419085613.GJ6862@pathway.suse.cz
> [2] https://lkml.kernel.org/r/CA+55aFzLH9crdMtUFkD-PtNGuxu_fsG5GH2ACni69ug9iM=09g@mail.gmail.com

Thanks.

	-ss


* Re: printk feature for syzbot?
  2018-05-10 12:11           ` Sergey Senozhatsky
@ 2018-05-10 14:22             ` Steven Rostedt
  0 siblings, 0 replies; 68+ messages in thread
From: Steven Rostedt @ 2018-05-10 14:22 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Dmitry Vyukov, Tetsuo Handa, Sergey Senozhatsky,
	syzkaller, Fengguang Wu, linux-kernel

On Thu, 10 May 2018 21:11:22 +0900
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:

> > The patchset was rejected by Linus because it would break some
> > userspace tools, e.g. systemd, that depend on the format and semantics
> > provided by /dev/kmsg[2].
> 
> Right, but I think I was talking about this email
>  https://lkml.kernel.org/r/20171123124648.s4oigunxjfzvhtqh@pathway.suse.cz
> 
> But yeah, it's not really related to the extension of struct printk_log,
> so I think we should be fine.

Note, crash is "special". It depends on internals of the kernel to keep
working as its purpose is to debug kernel crashes. I'm constantly
breaking it with ftrace. Which reminds me, I need to see if it works
with the latest kernel, and send patches if it doesn't.

-- Steve


* Re: printk feature for syzbot?
  2018-05-10  4:22       ` Sergey Senozhatsky
  2018-05-10 11:30         ` Petr Mladek
@ 2018-05-10 14:50         ` Tetsuo Handa
  2018-05-11  1:45           ` Sergey Senozhatsky
  1 sibling, 1 reply; 68+ messages in thread
From: Tetsuo Handa @ 2018-05-10 14:50 UTC (permalink / raw)
  To: sergey.senozhatsky.work, pmladek
  Cc: dvyukov, sergey.senozhatsky, syzkaller, rostedt, fengguang.wu,
	linux-kernel

Sergey Senozhatsky wrote:
> On (04/26/18 12:06), Petr Mladek wrote:
> > 
> > > Petr, Steven, Fengguang, what do you think? Do you have any objections?
> > > Ideas?
> > 
> > I wonder if we could create some mechanism that would help to extend
> > struct printk_log easier in the future.
> 
> Hm, interesting idea.
> 
> > I know only about crash tool implementation. It uses information provided
> > by log_buf_vmcoreinfo_setup(). The size of the structure is already
> > public. Therefore crash should be able to find all existing information
> > even if we increase the size of the structure.
> > 
> > log_buf_vmcoreinfo_setup() even allows to inform about newly added
> > structure items. We could probably extend it to inform also about
> > the offset of the new optional elements.
> 
> I vaguely remember that the last time Thomas Gleixner modified
> printk_log you managed to find a case that broke crash tool.
> ... Or maybe I'm mistaken.
> 
> > I am not sure about other tools. But I think that it should be
> > doable.
> 
> Good. So there are no objections, so far.
> 
> Tetsuo, Dmitry, care to send a patch?
> 
> 	-ss
> 

What I meant is nothing but something like below (i.e. inject context ID before
string to print)

  -sprintf(printk_buf + offset, "[ %s] %s", stamp, string_to_print);
  +cpu = smp_processor_id();
  +if (in_nmi())
  +  sprintf(printk_buf + offset, "[ %s](N%u) %s", stamp, cpu, string_to_print);
  +else if (in_irq())
  +  sprintf(printk_buf + offset, "[ %s](I%u) %s", stamp, cpu, string_to_print);
  +else if (in_serving_softirq())
  +  sprintf(printk_buf + offset, "[ %s](S%u) %s", stamp, cpu, string_to_print);
  +else
  +  sprintf(printk_buf + offset, "[ %s](%u) %s", stamp, current->pid, string_to_print);

without touching any struct.
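
For illustration only (not part of the original message): a
self-contained user-space sketch of the prefix selection shown in the
pseudo-diff above. enum demo_ctx and demo_format_line() are hypothetical
stand-ins; in the kernel the context would come from in_nmi(), in_irq()
and in_serving_softirq(), and the ids from smp_processor_id() and
current->pid.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's context predicates. */
enum demo_ctx { DEMO_CTX_TASK, DEMO_CTX_SOFTIRQ, DEMO_CTX_IRQ, DEMO_CTX_NMI };

/*
 * Format one line with a context identifier injected after the stamp:
 * (N<cpu>), (I<cpu>), (S<cpu>) for NMI/hardirq/softirq, (<pid>) for tasks.
 */
static int demo_format_line(char *buf, size_t size, const char *stamp,
			    enum demo_ctx ctx, unsigned int cpu,
			    unsigned int pid, const char *text)
{
	switch (ctx) {
	case DEMO_CTX_NMI:
		return snprintf(buf, size, "[ %s](N%u) %s", stamp, cpu, text);
	case DEMO_CTX_IRQ:
		return snprintf(buf, size, "[ %s](I%u) %s", stamp, cpu, text);
	case DEMO_CTX_SOFTIRQ:
		return snprintf(buf, size, "[ %s](S%u) %s", stamp, cpu, text);
	default:
		return snprintf(buf, size, "[ %s](%u) %s", stamp, pid, text);
	}
}
```

The sketch uses snprintf() rather than sprintf() so that a long message
is truncated instead of overrunning the buffer.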


* Re: printk feature for syzbot?
  2018-05-10 14:50         ` Tetsuo Handa
@ 2018-05-11  1:45           ` Sergey Senozhatsky
       [not found]             ` <201805110238.w4B2cIGH079602@www262.sakura.ne.jp>
  0 siblings, 1 reply; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-05-11  1:45 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: sergey.senozhatsky.work, pmladek, dvyukov, sergey.senozhatsky,
	syzkaller, rostedt, fengguang.wu, linux-kernel

On (05/10/18 23:50), Tetsuo Handa wrote:
> What I meant is nothing but something like below (i.e. inject context ID before
> string to print)
> 
>   -sprintf(printk_buf + offset, "[ %s] %s", stamp, string_to_print);
>   +cpu = smp_processor_id();
>   +if (in_nmi())
>   +  sprintf(printk_buf + offset, "[ %s](N%u) %s", stamp, cpu, string_to_print);
>   +else if (in_irq())
>   +  sprintf(printk_buf + offset, "[ %s](I%u) %s", stamp, cpu, string_to_print);
>   +else if (in_serving_softirq())
>   +  sprintf(printk_buf + offset, "[ %s](S%u) %s", stamp, cpu, string_to_print);
>   +else
>   +  sprintf(printk_buf + offset, "[ %s](%u) %s", stamp, current->pid, string_to_print);
> 
> without touching any struct.

So you basically want to have one more con_msg_format_flags? Do
you want to track the context which prints out a message or the
context which "generated" the message? A CPU/task that stores
a logbuf entry - vprintk_emit() - is not always the same as the
CPU/task that prints it to consoles - console_unlock().

	-ss


* Re: printk feature for syzbot?
       [not found]             ` <201805110238.w4B2cIGH079602@www262.sakura.ne.jp>
@ 2018-05-11  6:21               ` Sergey Senozhatsky
  2018-05-11  9:17                 ` Dmitry Vyukov
  2018-05-11 11:02                 ` [PATCH] printk: fix possible reuse of va_list variable Tetsuo Handa
  0 siblings, 2 replies; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-05-11  6:21 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, pmladek, dvyukov, sergey.senozhatsky,
	syzkaller, rostedt, fengguang.wu, linux-kernel

On (05/11/18 11:38), Tetsuo Handa wrote:
> > 
> > So you basically want to have one more con_msg_format_flags? Do
> > you want to track the context which prints out a message or the
> > context which "generated" the message? A CPU/task that stores
> > a logbuf entry - vprintk_emit() - is not always the same as the
> > CPU/task that prints it to consoles - console_unlock().
> > 
> 
> Well, below is the (partial) patch.

Hi,

Tetsuo, I will take a look a bit later, but at a glance, there are several
ways to achieve what you are trying to do. The first one is the way you
did it - add an additional buffer and make that context tracking info part of
the message body. Another one would be to extend struct printk_log and add
pid/cpu/flag there, which you then can convert into text in msg_print_text().
So far we talked about extending printk_log. Yet another one could be - add
vsprintf specifiers that would add pid/cpu/flag to the vsprintf-ed message.
You then can re-define pr_fmt, for instance, in the code you want to track
pr_fmt "%zZ" fmt, or somehow force printk to add that "%zZ" to every
message.

> By the way, when I tried to make similar change for printk_safe_log_store(),
> I noticed that printk_safe_log_store() is not safe because it is reusing
> the va_list variable after "goto again;". We need to use va_copy(), or
> we will get crash like an example shown below.

Oh, right. Can you send a patch?

	-ss


* Re: printk feature for syzbot?
  2018-05-11  6:21               ` Sergey Senozhatsky
@ 2018-05-11  9:17                 ` Dmitry Vyukov
  2018-05-11  9:50                   ` Sergey Senozhatsky
  2018-05-11 11:02                 ` [PATCH] printk: fix possible reuse of va_list variable Tetsuo Handa
  1 sibling, 1 reply; 68+ messages in thread
From: Dmitry Vyukov @ 2018-05-11  9:17 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Tetsuo Handa, Petr Mladek, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, LKML

On Fri, May 11, 2018 at 8:21 AM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
> On (05/11/18 11:38), Tetsuo Handa wrote:
>> >
>> > So you basically want to have one more con_msg_format_flags? Do
>> > you want to track the context which prints out a message or the
>> > context which "generated" the message? A CPU/task that stores
>> > a logbuf entry - vprintk_emit() - is not always the same as the
>> > CPU/task that prints it to consoles - console_unlock().
>> >
>>
>> Well, below is the (partial) patch.
>
> Hi,
>
> Tetsuo, I will take a look a bit later, but at a glance, there are several
> ways to achieve what you are trying to do. The first one is the way you
> did it - add an additional buffer and make that context tracking info part of
> the message body. Another one would be to extend struct printk_log and add
> pid/cpu/flag there, which you then can convert into text in msg_print_text().
> So far we talked about extending printk_log. Yet another one could be - add
> vsprintf specifiers that would add pid/cpu/flag to the vsprintf-ed message.
> You then can re-define pr_fmt, for instance, in the code you want to track
> pr_fmt "%zZ" fmt, or somehow force printk to add that "%zZ" to every
> message.


From syzbot's perspective, yes, we can set any necessary additional
configs, add cmdline arguments, etc.

Manually changing format strings won't work -- bugs are all over the place.

From what I see, it seems that interrupts can be nested:

https://syzkaller.appspot.com/bug?id=72eddef9cedcf81486adb9dd3e789f0d77505ba5
https://syzkaller.appspot.com/bug?id=66fcf61c65f8aa50bbb862eb2fde27c08909a4ff

Will this in_nmi()/in_irq()/in_serving_softirq()/else be enough to
untangle output printed by such nested interrupts? For the first link
it seems that they both are the same type of interrupt --
apic_timer_interrupt.


* Re: printk feature for syzbot?
  2018-05-11  9:17                 ` Dmitry Vyukov
@ 2018-05-11  9:50                   ` Sergey Senozhatsky
  2018-05-11 11:58                     ` [PATCH] printk: inject caller information into the body of message Tetsuo Handa
  2018-05-11 13:37                     ` printk feature for syzbot? Steven Rostedt
  0 siblings, 2 replies; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-05-11  9:50 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Tetsuo Handa, Petr Mladek,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, Fengguang Wu,
	LKML

On (05/11/18 11:17), Dmitry Vyukov wrote:
> 
> From what I see, it seems that interrupts can be nested:

Hm, I thought that in general IRQ handlers run with local IRQs
disabled on CPU. So, generally, IRQs don't nest. Was I wrong?
NMIs can nest, that's true; but I thought that at least IRQs
don't.

> https://syzkaller.appspot.com/bug?id=72eddef9cedcf81486adb9dd3e789f0d77505ba5
> https://syzkaller.appspot.com/bug?id=66fcf61c65f8aa50bbb862eb2fde27c08909a4ff
> 
> Will this in_nmi()/in_irq()/in_serving_softirq()/else be enough to
> untangle output printed by such nested interrupts?

Well, hm. __irq_enter() does preempt_count_add(HARDIRQ_OFFSET) and
__irq_exit() does preempt_count_sub(HARDIRQ_OFFSET). So, technically,
you can store

	preempt_count() & HARDIRQ_MASK
	preempt_count() & SOFTIRQ_MASK
	preempt_count() & NMI_MASK

in that extended context tracking. The numbers will not tell you
the IRQ line number, for instance, but at least you'll be able to
distinguish different hard/soft IRQs, NMIs. Just an idea, I didn't
check it, maybe it won't work at all.

Ideally, the serial log should be like this

	i:1 ... foo()
	i:1 ... bar()
	i:2 ... foo()  // __irq_enter()
	i:2 ... bar()
	i:2 ... buz()  // __irq_exit()
	i:1 ... buz()

but I may be completely wrong.

Petr and Steven probably will have better ideas.

	-ss
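
For illustration only (not part of the original message): a user-space
sketch of pulling the three fields out of a preempt_count()-style word.
The DEMO_* shifts and masks are assumptions that mirror the usual layout
(preempt bits 0-7, softirq 8-15, hardirq 16-19, NMI bit 20); the real
values live in include/linux/preempt.h and may differ between kernel
versions. demo_ctx_tag() is a hypothetical helper.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumed bit layout: softirq bits 8-15, hardirq bits 16-19, NMI bit 20. */
#define DEMO_SOFTIRQ_SHIFT 8
#define DEMO_HARDIRQ_SHIFT 16
#define DEMO_NMI_SHIFT     20
#define DEMO_SOFTIRQ_MASK  (0xffu << DEMO_SOFTIRQ_SHIFT)
#define DEMO_HARDIRQ_MASK  (0x0fu << DEMO_HARDIRQ_SHIFT)
#define DEMO_NMI_MASK      (0x01u << DEMO_NMI_SHIFT)

/* Render the raw NMI/hardirq/softirq fields of `count` as "n:X i:Y s:Z". */
static int demo_ctx_tag(char *buf, size_t size, unsigned int count)
{
	unsigned int nmi = (count & DEMO_NMI_MASK) >> DEMO_NMI_SHIFT;
	unsigned int hard = (count & DEMO_HARDIRQ_MASK) >> DEMO_HARDIRQ_SHIFT;
	unsigned int soft = (count & DEMO_SOFTIRQ_MASK) >> DEMO_SOFTIRQ_SHIFT;

	return snprintf(buf, size, "n:%u i:%u s:%u", nmi, hard, soft);
}
```

With this layout the hardirq field rises by one per nested __irq_enter(),
which is what would yield the "i:1" / "i:2" tags in the log above. The
softirq field is only a raw value here, since disabling bottom halves
also changes it.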


* [PATCH] printk: fix possible reuse of va_list variable
  2018-05-11  6:21               ` Sergey Senozhatsky
  2018-05-11  9:17                 ` Dmitry Vyukov
@ 2018-05-11 11:02                 ` Tetsuo Handa
  2018-05-11 11:27                   ` Sergey Senozhatsky
  2018-05-17 11:57                   ` Petr Mladek
  1 sibling, 2 replies; 68+ messages in thread
From: Tetsuo Handa @ 2018-05-11 11:02 UTC (permalink / raw)
  To: sergey.senozhatsky.work
  Cc: pmladek, dvyukov, sergey.senozhatsky, syzkaller, rostedt,
	fengguang.wu, linux-kernel, peterz

From 766cf72b5fdc00d1cf5a8ca2c6b23ebb75e2b4d4 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Fri, 11 May 2018 19:54:19 +0900
Subject: [PATCH] printk: fix possible reuse of va_list variable

I noticed that there is a possibility that printk_safe_log_store() causes
kernel oops because "args" parameter is passed to vsnprintf() again when
atomic_cmpxchg() detected that we raced. Fix this by using va_copy().

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: 42a0bb3f71383b45 ("printk/nmi: generic solution for safe printk in NMI")
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/printk/printk_safe.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index 3e3c200..449d67e 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -82,6 +82,7 @@ static __printf(2, 0) int printk_safe_log_store(struct printk_safe_seq_buf *s,
 {
 	int add;
 	size_t len;
+	va_list ap;
 
 again:
 	len = atomic_read(&s->len);
@@ -100,7 +101,9 @@ static __printf(2, 0) int printk_safe_log_store(struct printk_safe_seq_buf *s,
 	if (!len)
 		smp_rmb();
 
-	add = vscnprintf(s->buffer + len, sizeof(s->buffer) - len, fmt, args);
+	va_copy(ap, args);
+	add = vscnprintf(s->buffer + len, sizeof(s->buffer) - len, fmt, ap);
+	va_end(ap);
 	if (!add)
 		return 0;
 
-- 
1.8.3.1
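
For illustration only (not part of the patch): a self-contained
user-space sketch of the same pattern. A va_list that has been consumed
by one v*printf() call must not be passed to another, so each pass of a
retry loop works on its own va_copy(). demo_vformat_retry() and
demo_format_retry() are hypothetical names, and the fixed attempt count
stands in for the "goto again" retry after a lost atomic_cmpxchg() race.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format the same arguments `attempts` times into buf.
 * Returns the vsnprintf() result of the last pass.
 */
static int demo_vformat_retry(char *buf, size_t size, int attempts,
			      const char *fmt, va_list args)
{
	int len = 0;

	while (attempts-- > 0) {
		va_list ap;

		va_copy(ap, args);	/* fresh copy for every pass */
		len = vsnprintf(buf, size, fmt, ap);
		va_end(ap);		/* each va_copy() needs a va_end() */
	}
	return len;
}

/* Variadic front end, analogous to a printk()-style wrapper. */
static int demo_format_retry(char *buf, size_t size, int attempts,
			     const char *fmt, ...)
{
	va_list args;
	int len;

	va_start(args, fmt);
	len = demo_vformat_retry(buf, size, attempts, fmt, args);
	va_end(args);
	return len;
}
```

Passing `args` itself to vsnprintf() inside the loop would leave it in an
indeterminate state after the first pass, which is the bug the patch
fixes.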


* Re: [PATCH] printk: fix possible reuse of va_list variable
  2018-05-11 11:02                 ` [PATCH] printk: fix possible reuse of va_list variable Tetsuo Handa
@ 2018-05-11 11:27                   ` Sergey Senozhatsky
  2018-05-17 11:57                   ` Petr Mladek
  1 sibling, 0 replies; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-05-11 11:27 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: sergey.senozhatsky.work, pmladek, dvyukov, sergey.senozhatsky,
	syzkaller, rostedt, fengguang.wu, linux-kernel, peterz

On (05/11/18 20:02), Tetsuo Handa wrote:
> I noticed that there is a possibility that printk_safe_log_store() causes
> kernel oops because "args" parameter is passed to vsnprintf() again when
> atomic_cmpxchg() detected that we raced. Fix this by using va_copy().
> 
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Fixes: 42a0bb3f71383b45 ("printk/nmi: generic solution for safe printk in NMI")
> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Cc: Petr Mladek <pmladek@suse.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>

Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

	-ss


* [PATCH] printk: inject caller information into the body of message
  2018-05-11  9:50                   ` Sergey Senozhatsky
@ 2018-05-11 11:58                     ` Tetsuo Handa
  2018-05-17 11:21                       ` Sergey Senozhatsky
  2018-05-11 13:37                     ` printk feature for syzbot? Steven Rostedt
  1 sibling, 1 reply; 68+ messages in thread
From: Tetsuo Handa @ 2018-05-11 11:58 UTC (permalink / raw)
  To: sergey.senozhatsky.work, dvyukov
  Cc: pmladek, sergey.senozhatsky, syzkaller, rostedt, fengguang.wu,
	linux-kernel, torvalds, akpm

From b7b0e56e06db1107f781b4cb5178fbdc99240901 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Fri, 11 May 2018 20:45:31 +0900
Subject: [PATCH] printk: inject caller information into the body of message

Since syzbot frequently makes printk() flooding (e.g. memory allocation
fault injection), it is always difficult to distinguish which line is from
which event.

This patch tries to help group concurrent printk() lines without
touching any struct, so that we don't break userspace tools (e.g. crash)
which depend on in-kernel data structures.

If printk() is called from process context, "(T%u)" (where %u is
current->pid) is injected. If printk() is called from interrupt context,
"(C%u)" (where %u is raw_smp_processor_id()) is injected.



Example 1: SysRq-h from keyboard operation.
----------
[   57.688156] (C3) sysrq: SysRq : HELP : loglevel(0-9) reboot(b) crash(c) show-all-locks(d) terminate-all-tasks(e) memory-full-oom-kill(f) kill-all-tasks(i) thaw-filesystems(j) sak(k) show-backtrace-all-active-cpus(l) show-memory-usage(m) nice-all-RT-tasks(n) poweroff(o) show-registers(p) show-all-timers(q) unraw(r) sync(s) show-task-states(t) unmount(u) show-blocked-tasks(w)
----------

Example 2: SysRq-h from /proc/sysrq-trigger interface.
----------
[   64.592273] (T2768) sysrq: SysRq : HELP : loglevel(0-9) reboot(b) crash(c) show-all-locks(d) terminate-all-tasks(e) memory-full-oom-kill(f) kill-all-tasks(i) thaw-filesystems(j) sak(k) show-backtrace-all-active-cpus(l) show-memory-usage(m) nice-all-RT-tasks(n) poweroff(o) show-registers(p) show-all-timers(q) unraw(r) sync(s) show-task-states(t) unmount(u) show-blocked-tasks(w)
----------

Example 3: SysRq-f from keyboard operation.
----------
[   70.792068] (C3) sysrq: SysRq : Manual OOM execution
[   70.797444] (T245) kworker/0:2 invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null), order=-1, oom_score_adj=0
[   70.807690] (T245) kworker/0:2 cpuset=/ mems_allowed=0
[   70.812738] (T245) CPU: 0 PID: 245 Comm: kworker/0:2 Kdump: loaded Not tainted 4.17.0-rc4+ #396
[   70.819886] (T245) Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[   70.828924] (T245) Workqueue: events moom_callback
[   70.830554] (T245) Call Trace:
[   70.831518] (T245)  dump_stack+0x5e/0x8b
[   70.832754] (T245)  dump_header+0x6f/0x454
[   70.834045] (T245)  ? _raw_spin_unlock_irqrestore+0x42/0x60
[   70.835764] (T245)  oom_kill_process+0x223/0x690
[   70.837257] (T245)  ? out_of_memory+0x2c2/0x530
[   70.838669] (T245)  out_of_memory+0x120/0x530
[   70.840016] (T245)  ? out_of_memory+0x1f7/0x530
[   70.841433] (T245)  moom_callback+0x68/0x90
[   70.842735] (T245)  process_one_work+0x19f/0x370
[   70.844162] (T245)  ? process_one_work+0x13c/0x370
[   70.845681] (T245)  worker_thread+0x45/0x3e0
[   70.846985] (T245)  kthread+0xf6/0x130
[   70.848127] (T245)  ? process_one_work+0x370/0x370
[   70.849587] (T245)  ? kthread_create_on_node+0x40/0x40
[   70.851146] (T245)  ret_from_fork+0x24/0x30
[   70.855706] (T245) Mem-Info:
[   70.856674] (T245) active_anon:11880 inactive_anon:2122 isolated_anon:0
[   70.856674]  active_file:11521 inactive_file:18259 isolated_file:0
[   70.856674]  unevictable:0 dirty:4 writeback:0 unstable:0
[   70.856674]  slab_reclaimable:7216 slab_unreclaimable:14300
[   70.856674]  mapped:11981 shmem:2198 pagetables:1743 bounce:0
[   70.856674]  free:853866 free_pcp:566 free_cma:0
[   70.868764] (T245) Node 0 active_anon:47520kB inactive_anon:8488kB active_file:46084kB inactive_file:73036kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:47924kB dirty:16kB writeback:0kB shmem:8792kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 8192kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[   70.880127] (T245) Node 0 DMA free:15872kB min:284kB low:352kB high:420kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15904kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[   70.889522] (T245) lowmem_reserve[]: 0 2683 3633 3633
[   70.891285] (T245) Node 0 DMA32 free:2746612kB min:49696kB low:62120kB high:74544kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129216kB managed:2748008kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:340kB local_pcp:116kB free_cma:0kB
[   70.899460] (T245) lowmem_reserve[]: 0 0 950 950
[   70.901212] (T245) Node 0 Normal free:652980kB min:17596kB low:21992kB high:26388kB active_anon:47520kB inactive_anon:8488kB active_file:46084kB inactive_file:73036kB unevictable:0kB writepending:16kB present:1048576kB managed:972972kB mlocked:0kB kernel_stack:3664kB pagetables:6972kB bounce:0kB free_pcp:1920kB local_pcp:644kB free_cma:0kB
[   70.909865] (T245) lowmem_reserve[]: 0 0 0 0
[   70.911646] (T245) Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15872kB
[   70.915231] (T245) Node 0 DMA32: 3*4kB (UM) 3*8kB (UM) 1*16kB (M) 2*32kB (M) 2*64kB (M) 4*128kB (UM) 4*256kB (M) 5*512kB (UM) 2*1024kB (M) 4*2048kB (UM) 667*4096kB (M) = 2746612kB
[   70.920274] (T245) Node 0 Normal: 243*4kB (UM) 51*8kB (UM) 19*16kB (UM) 5*32kB (M) 4*64kB (UME) 1*128kB (M) 2*256kB (ME) 2*512kB (UE) 6*1024kB (UE) 6*2048kB (UME) 154*4096kB (M) = 652980kB
[   70.925188] (T245) Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[   70.927860] (T245) Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[   70.930431] (T245) 31978 total pagecache pages
[   70.932166] (T245) 0 pages in swap cache
[   70.933803] (T245) Swap cache stats: add 0, delete 0, find 0/0
[   70.935841] (T245) Free swap  = 0kB
[   70.937315] (T245) Total swap = 0kB
[   70.938957] (T245) 1048445 pages RAM
[   70.940486] (T245) 0 pages HighMem/MovableOnly
[   70.942158] (T245) 114224 pages reserved
[   70.943767] (T245) 0 pages hwpoisoned
[   70.945267] (T245) Out of memory: Kill process 2474 (tuned) score 6 or sacrifice child
[   70.947863] (T245) Killed process 2474 (tuned) total-vm:573828kB, anon-rss:13072kB, file-rss:10716kB, shmem-rss:0kB
----------



This patch does not distinguish in_nmi()/in_irq()/in_serving_softirq(),
because I guess that it is not too difficult to tell them apart as long
as we can pick up messages from the same CPU based on the "(C%u)" part.
We could switch to "(C%u%c)" (where %c is the type of interrupt context)
if needed.
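
As a rough illustration of how a log reader could use the "(T%u)" and
"(C%u)" parts, here is a minimal userspace sketch. parse_caller_info()
and struct caller_info are hypothetical names made up for this example,
not part of the patch. It assumes an optional "[timestamp]" in front of
the prefix, and it also accepts the lowercase 't'/'c' that
printk_safe_log_store() emits in this patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type, not part of the patch. */
struct caller_info {
	char kind;		/* 'T'/'t' = task (pid), 'C'/'c' = CPU */
	unsigned int id;
};

/*
 * Return 1 and fill *out when the line carries a "(T<n>)" or "(C<n>)"
 * prefix, optionally preceded by a "[timestamp]"; return 0 otherwise.
 */
static int parse_caller_info(const char *line, struct caller_info *out)
{
	const char *p = line;
	char *end;
	unsigned long id;

	assert(line != NULL && out != NULL);

	/* Skip "[   70.930431] " if printk.time is enabled. */
	if (*p == '[') {
		p = strchr(p, ']');
		if (!p)
			return 0;
		p++;
		while (*p == ' ')
			p++;
	}

	if (p[0] != '(')
		return 0;
	if (p[1] != 'T' && p[1] != 'C' && p[1] != 't' && p[1] != 'c')
		return 0;
	if (!isdigit((unsigned char)p[2]))
		return 0;

	id = strtoul(p + 2, &end, 10);
	if (*end != ')')
		return 0;

	out->kind = p[1];
	out->id = (unsigned int)id;
	return 1;
}
```

Lines without the prefix (for example the second and later lines of a
multi-line message) are reported as "no caller info", so a filter built
on this would have to decide how to attribute them.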

In the long term, we might want to touch in-kernel data structures so that
userspace tools can do better processing. But for now, I think that this
patch can help a lot.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
---
 Documentation/admin-guide/kernel-parameters.txt |  6 +++
 kernel/printk/internal.h                        |  1 +
 kernel/printk/printk.c                          | 49 +++++++++++++++++++++++--
 kernel/printk/printk_safe.c                     | 22 ++++++++++-
 lib/Kconfig.debug                               | 13 +++++++
 5 files changed, 87 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 11fc28e..10e716e 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3288,6 +3288,12 @@
 			Format: <bool>  (1/Y/y=enable, 0/N/n=disable)
 			default: disabled
 
+	printk.caller_info=
+			Show which task (if in process context) or CPU (if not
+			in process context) generated each message.
+			Useful for environments where printk() floods.
+			Format: <bool>  (1/Y/y=enable, 0/N/n=disable)
+
 	printk.devkmsg={on,off,ratelimit}
 			Control writing to /dev/kmsg.
 			on - unlimited logging to /dev/kmsg from userspace
diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index 2a7d040..0b30457 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -23,6 +23,7 @@
 #define PRINTK_NMI_CONTEXT_MASK		 0x80000000
 
 extern raw_spinlock_t logbuf_lock;
+extern bool printk_caller_info;
 
 __printf(1, 0) int vprintk_default(const char *fmt, va_list args);
 __printf(1, 0) int vprintk_deferred(const char *fmt, va_list args);
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 2f4af21..9040a16 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1733,6 +1733,37 @@ static inline void printk_delay(void)
 	}
 }
 
+bool printk_caller_info = IS_ENABLED(CONFIG_PRINTK_CALLER_INFO);
+module_param_named(caller_info, printk_caller_info, bool, 0644);
+
+static char *printk_inject_caller_info(const char *text, size_t *text_len)
+{
+	static char buf[LOG_LINE_MAX + 128];
+	int len;
+	unsigned int v;
+	char c;
+
+	if (!printk_caller_info)
+		return (char *) text;
+
+	if (in_task()) {
+		v = current->pid;
+		c = 'T';
+	} else {
+		/* Use raw version not to generate warning messages. */
+		v = raw_smp_processor_id();
+		c = 'C';
+	}
+	len = snprintf(buf, sizeof(buf), "(%c%u) ", c, v);
+	/* This should not happen though... */
+	if (unlikely(len + *text_len >= sizeof(buf)))
+		return (char *) text;
+	memmove(buf + len, text, *text_len);
+	*text_len += len;
+	/* "buf" remains valid because it is protected by "logbuf_lock". */
+	return buf;
+}
+
 /*
  * Continuation lines are buffered, and not committed to the record buffer
  * until the line is complete, or a race forces it. The line fragments
@@ -1763,10 +1794,19 @@ static bool cont_add(int facility, int level, enum log_flags flags, const char *
 {
 	/*
 	 * If ext consoles are present, flush and skip in-kernel
-	 * continuation.  See nr_ext_console_drivers definition.  Also, if
-	 * the line gets too long, split it up in separate records.
+	 * continuation. See nr_ext_console_drivers definition.
 	 */
-	if (nr_ext_console_drivers || cont.len + len > sizeof(cont.buf)) {
+	if (nr_ext_console_drivers) {
+		cont_flush();
+		return false;
+	}
+
+	/* Inject before memcpy() in order to avoid overflow. */
+	if (!cont.len)
+		text = printk_inject_caller_info(text, &len);
+
+	/* If the line gets too long, split it up in separate records. */
+	if (cont.len + len > sizeof(cont.buf)) {
 		cont_flush();
 		return false;
 	}
@@ -1820,6 +1860,9 @@ static size_t log_output(int facility, int level, enum log_flags lflags, const c
 			return text_len;
 	}
 
+	/* Inject caller info. */
+	text = printk_inject_caller_info(text, &text_len);
+
 	/* Store it in the record log */
 	return log_store(facility, level, lflags, 0, dict, dictlen, text, text_len);
 }
diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index 449d67e..02d080a 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -22,6 +22,7 @@
 #include <linux/cpumask.h>
 #include <linux/irq_work.h>
 #include <linux/printk.h>
+#include <linux/sched.h>
 
 #include "internal.h"
 
@@ -83,6 +84,17 @@ static __printf(2, 0) int printk_safe_log_store(struct printk_safe_seq_buf *s,
 	int add;
 	size_t len;
 	va_list ap;
+	unsigned int v;
+	char c;
+
+	if (in_task()) {
+		v = current->pid;
+		c = 't';
+	} else {
+		/* Use raw version not to generate warning messages. */
+		v = raw_smp_processor_id();
+		c = 'c';
+	}
 
 again:
 	len = atomic_read(&s->len);
@@ -102,7 +114,15 @@ static __printf(2, 0) int printk_safe_log_store(struct printk_safe_seq_buf *s,
 		smp_rmb();
 
 	va_copy(ap, args);
-	add = vscnprintf(s->buffer + len, sizeof(s->buffer) - len, fmt, ap);
+	if (printk_caller_info) {
+		struct va_format vaf = { .fmt = fmt, .va = &ap };
+
+		add = scnprintf(s->buffer + len, sizeof(s->buffer) - len,
+				"(%c%u) %pV", c, v, &vaf);
+	} else {
+		add = vscnprintf(s->buffer + len, sizeof(s->buffer) - len,
+				 fmt, ap);
+	}
 	va_end(ap);
 	if (!add)
 		return 0;
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index c40c7b7..9e8ea4e 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -15,6 +15,19 @@ config PRINTK_TIME
 	  The behavior is also controlled by the kernel command line
 	  parameter printk.time=1. See Documentation/admin-guide/kernel-parameters.rst
 
+config PRINTK_CALLER_INFO
+	bool "Show caller information on printks"
+	depends on PRINTK
+	help
+	  Selecting this option causes thread id (if in process context) or CPU
+	  id (if not in process context) of the printk() messages to be added
+	  to the output of the syslog() system call and at the console.
+	  Useful for environments where multiple threads constantly call
+	  printk() (e.g. fault injection fuzzing tests).
+
+	  The behavior is also controlled by printk.caller_info= kernel command
+	  line parameter or /sys/module/printk/parameters/caller_info file.
+
 config CONSOLE_LOGLEVEL_DEFAULT
 	int "Default console loglevel (1-15)"
 	range 1 15
-- 
1.8.3.1


* Re: printk feature for syzbot?
  2018-05-11  9:50                   ` Sergey Senozhatsky
  2018-05-11 11:58                     ` [PATCH] printk: inject caller information into the body of message Tetsuo Handa
@ 2018-05-11 13:37                     ` Steven Rostedt
  2018-05-15  5:20                       ` Sergey Senozhatsky
  1 sibling, 1 reply; 68+ messages in thread
From: Steven Rostedt @ 2018-05-11 13:37 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Dmitry Vyukov, Tetsuo Handa, Petr Mladek, Sergey Senozhatsky,
	syzkaller, Fengguang Wu, LKML

On Fri, 11 May 2018 18:50:04 +0900
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:

> On (05/11/18 11:17), Dmitry Vyukov wrote:
> > 
> > From what I see, it seems that interrupts can be nested:  
> 
> Hm, I thought that in general IRQ handlers run with local IRQs
> disabled on CPU. So, generally, IRQs don't nest. Was I wrong?
> NMIs can nest, that's true; but I thought that at least IRQs
> don't.

We normally don't run nested interrupts, although as the comment in
preempt.h says:

 * The hardirq count could in theory be the same as the number of
 * interrupts in the system, but we run all interrupt handlers with
 * interrupts disabled, so we cannot have nesting interrupts. Though
 * there are a few palaeontologic drivers which reenable interrupts in
 * the handler, so we need more than one bit here.

And no, NMI handlers do not nest. Yes, we deal with nested NMIs, but in
those cases, we just set a bit as a latch, and return, and when the
first NMI is complete, it checks that bit and if it is set, it executes
another NMI handler.
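
The latch idea can be shown with a small single-threaded simulation.
This is only a sketch of the pattern described above, not the real NMI
entry code; struct nmi_sim and the nmi_sim_*() helpers are hypothetical
names, and a "nested NMI" is simulated by calling the entry function
again from inside the handler body.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simulation state, not a kernel structure. */
struct nmi_sim {
	bool in_progress;	/* a handler is currently running */
	bool latched;		/* another NMI arrived meanwhile */
	int handled;		/* number of handler passes executed */
	void (*body)(struct nmi_sim *s);
};

static void nmi_sim_enter(struct nmi_sim *s)
{
	assert(s != NULL);

	/* Nested arrival: only set the latch and return. */
	if (s->in_progress) {
		s->latched = true;
		return;
	}

	s->in_progress = true;
	do {
		s->latched = false;
		s->handled++;
		if (s->body)
			s->body(s);
		/* Run another pass if an NMI was latched meanwhile. */
	} while (s->latched);
	s->in_progress = false;
}

/* Example body: pretend a second NMI arrives during the first pass. */
static void nmi_sim_body_nested_once(struct nmi_sim *s)
{
	if (s->handled == 1)
		nmi_sim_enter(s);
}
```

With nmi_sim_body_nested_once() as the body, one call to
nmi_sim_enter() runs the handler twice, but the two passes never
overlap.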

> 
> > https://syzkaller.appspot.com/bug?id=72eddef9cedcf81486adb9dd3e789f0d77505ba5
> > https://syzkaller.appspot.com/bug?id=66fcf61c65f8aa50bbb862eb2fde27c08909a4ff
> > 
> > Will this in_nmi()/in_irq()/in_serving_softirq()/else be enough to
> > untangle output printed by such nested interrupts?  
> 
> Well, hm. __irq_enter() does preempt_count_add(HARDIRQ_OFFSET) and
> __irq_exit() does preempt_count_sub(HARDIRQ_OFFSET). So, technically,
> you can store
> 
> 	preempt_count() & HARDIRQ_MASK
> 	preempt_count() & SOFTIRQ_MASK
> 	preempt_count() & NMI_MASK
> 
> in that extended context tracking. The numbers will not tell you
> the IRQ line number, for instance, but at least you'll be able to
> distinguish different hard/soft IRQs, NMIs. Just an idea, I didn't
> check it, may be it won't work at all.
> 
> Ideally, the serial log should be like this
> 
> 	i:1 ... foo()
> 	i:1 ... bar()
> 	i:2 ... foo()  // __irq_enter()
> 	i:2 ... bar()
> 	i:2 ... buz()  // __irq_exit()
> 	i:1 ... buz()
> 
> but I may be completely wrong.
> 
> Petr and Steven probably will have better ideas.

I handle nesting of different contexts in the ftrace ring buffer using
the preempt count. See trace_recursive_lock/unlock() in
kernel/trace/ring_buffer.c.
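
For illustration only, the masks quoted above can be turned into
per-context counters as in the userspace sketch below. The SK_*
constants copy the bit layout described in include/linux/preempt.h
around v4.17 (8 preempt bits, 8 softirq bits, 4 hardirq bits, 1 NMI
bit) and are assumptions of this example; decode_preempt_count() and
format_ctx() are hypothetical helpers, and this is not the ring buffer
code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed field widths, mirroring preempt.h of that era. */
#define SK_PREEMPT_BITS		8
#define SK_SOFTIRQ_BITS		8
#define SK_HARDIRQ_BITS		4
#define SK_NMI_BITS		1

#define SK_SOFTIRQ_SHIFT	SK_PREEMPT_BITS
#define SK_HARDIRQ_SHIFT	(SK_SOFTIRQ_SHIFT + SK_SOFTIRQ_BITS)
#define SK_NMI_SHIFT		(SK_HARDIRQ_SHIFT + SK_HARDIRQ_BITS)

#define SK_MASK(bits, shift)	(((1UL << (bits)) - 1) << (shift))
#define SK_SOFTIRQ_MASK		SK_MASK(SK_SOFTIRQ_BITS, SK_SOFTIRQ_SHIFT)
#define SK_HARDIRQ_MASK		SK_MASK(SK_HARDIRQ_BITS, SK_HARDIRQ_SHIFT)
#define SK_NMI_MASK		SK_MASK(SK_NMI_BITS, SK_NMI_SHIFT)

/* Raw field values extracted from a preempt count. */
struct ctx_depth {
	unsigned long hardirq;
	unsigned long softirq;
	unsigned long nmi;
};

static struct ctx_depth decode_preempt_count(unsigned long pc)
{
	struct ctx_depth d;

	/* Mask each field, then shift it down to a small integer. */
	d.hardirq = (pc & SK_HARDIRQ_MASK) >> SK_HARDIRQ_SHIFT;
	d.softirq = (pc & SK_SOFTIRQ_MASK) >> SK_SOFTIRQ_SHIFT;
	d.nmi = (pc & SK_NMI_MASK) >> SK_NMI_SHIFT;
	return d;
}

/* Format as "hardirq/softirq/nmi"; returns what snprintf() returns. */
static int format_ctx(char *buf, size_t len, unsigned long pc)
{
	struct ctx_depth d = decode_preempt_count(pc);

	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%lu/%lu/%lu", d.hardirq, d.softirq, d.nmi);
}
```

Shifting the fields down keeps the printed numbers small, which matters
if they end up in a per-line prefix.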

-- Steve


* Re: printk feature for syzbot?
  2018-05-11 13:37                     ` printk feature for syzbot? Steven Rostedt
@ 2018-05-15  5:20                       ` Sergey Senozhatsky
  2018-05-15 14:39                         ` Steven Rostedt
  0 siblings, 1 reply; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-05-15  5:20 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Sergey Senozhatsky, Dmitry Vyukov, Tetsuo Handa, Petr Mladek,
	Sergey Senozhatsky, syzkaller, Fengguang Wu, LKML

Hello,

On (05/11/18 09:37), Steven Rostedt wrote:
> > On (05/11/18 11:17), Dmitry Vyukov wrote:
> > > 
> > > From what I see, it seems that interrupts can be nested:  
> > 
> > Hm, I thought that in general IRQ handlers run with local IRQs
> > disabled on CPU. So, generally, IRQs don't nest. Was I wrong?
> > NMIs can nest, that's true; but I thought that at least IRQs
> > don't.
> 
> We normally don't run nested interrupts, although as the comment in
> preempt.h says:
> 
>  * The hardirq count could in theory be the same as the number of
>  * interrupts in the system, but we run all interrupt handlers with
>  * interrupts disabled, so we cannot have nesting interrupts. Though
>  * there are a few palaeontologic drivers which reenable interrupts in
>  * the handler, so we need more than one bit here.
> 
> And no, NMI handlers do not nest. Yes, we deal with nested NMIs, but in
> those cases, we just set a bit as a latch, and return, and when the
> first NMI is complete, it checks that bit and if it is set, it executes
> another NMI handler.

Good to know!
I thought that NMI can nest in some weird cases, like a breakpoint from
NMI. This must be super tricky, given that nested NMI will corrupt the
stack of the previous NMI, etc. Anyway.

> > Well, hm. __irq_enter() does preempt_count_add(HARDIRQ_OFFSET) and
> > __irq_exit() does preempt_count_sub(HARDIRQ_OFFSET). So, technically,
> > you can store
> > 
> > 	preempt_count() & HARDIRQ_MASK
> > 	preempt_count() & SOFTIRQ_MASK
> > 	preempt_count() & NMI_MASK
> >
[..]
> I handle nesting of different contexts in the ftrace ring buffer using
> the preempt count. See trace_recursive_lock/unlock() in
> kernel/trace/ring_buffer.c.

Thanks. So you are also checking the preempt_count().

	-ss


* Re: printk feature for syzbot?
  2018-05-15  5:20                       ` Sergey Senozhatsky
@ 2018-05-15 14:39                         ` Steven Rostedt
  0 siblings, 0 replies; 68+ messages in thread
From: Steven Rostedt @ 2018-05-15 14:39 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Dmitry Vyukov, Tetsuo Handa, Petr Mladek, Sergey Senozhatsky,
	syzkaller, Fengguang Wu, LKML

On Tue, 15 May 2018 14:20:42 +0900
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:

> > And no, NMI handlers do not nest. Yes, we deal with nested NMIs, but in
> > those cases, we just set a bit as a latch, and return, and when the
> > first NMI is complete, it checks that bit and if it is set, it executes
> > another NMI handler.  
> 
> Good to know!
> I thought that NMI can nest in some weird cases, like a breakpoint from
> NMI. This must be super tricky, given that nested NMI will corrupt the
> stack of the previous NMI, etc. Anyway.

Well, they do kinda nest, but we work hard not to let them do anything
when they do. You can read all about it here:

https://lwn.net/Articles/484932/

> 
> > > Well, hm. __irq_enter() does preempt_count_add(HARDIRQ_OFFSET) and
> > > __irq_exit() does preempt_count_sub(HARDIRQ_OFFSET). So, technically,
> > > you can store
> > > 
> > > 	preempt_count() & HARDIRQ_MASK
> > > 	preempt_count() & SOFTIRQ_MASK
> > > 	preempt_count() & NMI_MASK
> > >  
> [..]
> > I handle nesting of different contexts in the ftrace ring buffer using
> > the preempt count. See trace_recursive_lock/unlock() in
> > kernel/trace/ring_buffer.c.  
> 
> Thanks. So you are also checking the preempt_count().
>

Yes I am.

-- Steve


* Re: [PATCH] printk: inject caller information into the body of message
  2018-05-11 11:58                     ` [PATCH] printk: inject caller information into the body of message Tetsuo Handa
@ 2018-05-17 11:21                       ` Sergey Senozhatsky
  2018-05-17 11:52                         ` Sergey Senozhatsky
  2018-05-18 12:15                         ` Petr Mladek
  0 siblings, 2 replies; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-05-17 11:21 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: sergey.senozhatsky.work, dvyukov, pmladek, sergey.senozhatsky,
	syzkaller, rostedt, fengguang.wu, linux-kernel, torvalds, akpm

On (05/11/18 20:58), Tetsuo Handa wrote:
[..]
> -	if (nr_ext_console_drivers || cont.len + len > sizeof(cont.buf)) {
> +	if (nr_ext_console_drivers) {
> +		cont_flush();
> +		return false;
> +	}
> +
> +	/* Inject before memcpy() in order to avoid overflow. */
> +	if (!cont.len)
> +		text = printk_inject_caller_info(text, &len);
> +
> +	/* If the line gets too long, split it up in separate records. */
> +	if (cont.len + len > sizeof(cont.buf)) {
>  		cont_flush();
>  		return false;
>  	}
> @@ -1820,6 +1860,9 @@ static size_t log_output(int facility, int level, enum log_flags lflags, const c
>  			return text_len;
>  	}
>  
> +	/* Inject caller info. */
> +	text = printk_inject_caller_info(text, &text_len);
> +
>  	/* Store it in the record log */
>  	return log_store(facility, level, lflags, 0, dict, dictlen, text, text_len);
>  }

[..]

I think this is slightly intrusive. I understand that you want to avoid
modifying struct printk_log, so let's see if we have any other
options.

Dunno...
For instance, can we store context tracking info as extended record
data? We have that dict/dict_len thing. So maybe we can store tracking
info there? Extended records will appear on the serial console /* if
console supports extended data */ or can be read in via devkmsg_read().
Any other options?

>  #include "internal.h"
>  
> @@ -83,6 +84,17 @@ static __printf(2, 0) int printk_safe_log_store(struct printk_safe_seq_buf *s,
[..]
>  	len = atomic_read(&s->len);
> @@ -102,7 +114,15 @@ static __printf(2, 0) int printk_safe_log_store(struct printk_safe_seq_buf *s,
>  		smp_rmb();
>  
>  	va_copy(ap, args);
> -	add = vscnprintf(s->buffer + len, sizeof(s->buffer) - len, fmt, ap);
> +	if (printk_caller_info) {
> +		struct va_format vaf = { .fmt = fmt, .va = &ap };
> +
> +		add = scnprintf(s->buffer + len, sizeof(s->buffer) - len,
> +				"(%c%u) %pV", c, v, &vaf);
> +	} else {
> +		add = vscnprintf(s->buffer + len, sizeof(s->buffer) - len,
> +				 fmt, ap);
> +	}

A bit of a silly question - do we want to modify printk_safe at this
point? With this implementation printk_safe entries will have two context
info-s attached: one from original printk_safe_log_store and another one
from printk_safe_flush->log_store. I suspect that adding context info in
printk_safe_log_store is, probably, not really needed. We flush printk_safe
from irq work on the CPU that issued unsafe printk, so part of the context
info will be valid if you append context info only in printk log_store - at
least the correct smp_processor_id.

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-05-17 11:21                       ` Sergey Senozhatsky
@ 2018-05-17 11:52                         ` Sergey Senozhatsky
  2018-05-18 12:15                         ` Petr Mladek
  1 sibling, 0 replies; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-05-17 11:52 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Tetsuo Handa, dvyukov, pmladek, sergey.senozhatsky, syzkaller,
	rostedt, fengguang.wu, linux-kernel, torvalds, akpm

On (05/17/18 20:21), Sergey Senozhatsky wrote:
> Dunno...
> For instance, can we store context tracking info as a extended record
> data? We have that dict/dict_len thing. So may we can store tracking
> info there? Extended records will appear on the serial console /* if
> console supports extended data */ or can be read in via devkmsg_read().

Those extended records are already there for exactly the same
reason - people want to attach a special context to printk() entries.
See dev_vprintk_emit() and create_syslog_header(). So we can add more
key/value data to that context. Sounds kinda-sorta reasonable.

So, for example, this output
cat /dev/kmsg

6,577,3156036,-;snd_hda_codec_generic hdaudioC1D0: autoconfig for Generic: line_outs=0 (0x0/0x0/0x0/0x0/0x0) type:line
 SUBSYSTEM=hdaudio
 DEVICE=+hdaudio:hdaudioC1D
6,578,3156807,-;snd_hda_codec_generic hdaudioC1D0:    speaker_outs=0 (0x0/0x0/0x0/0x0/0x0)
 SUBSYSTEM=hdaudio
 DEVICE=+hdaudio:hdaudioC1D

Becomes this:
6,566,3033752,-;snd_hda_codec_realtek hdaudioC0D0:      Front Mic=0x19
 3/207: 0/0/0
 SUBSYSTEM=hdaudio
 DEVICE=+hdaudio:hdaudioC0D
6,567,3033754,-;snd_hda_codec_realtek hdaudioC0D0:      Rear Mic=0x18
 3/207: 0/0/0
 SUBSYSTEM=hdaudio
 DEVICE=+hdaudio:hdaudioC0D


"3/207: 0/0/0" is smp_processor_id/task_pid_nr and then masked
out bits of preempt count: hard irq, soft irq, nmi.

We definitely can change the format, etc. This is just a very quick and
dirty PoC.
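
To show what a reader of /dev/kmsg would have to do with such a record
line, here is a minimal userspace sketch that parses the PoC format
"cpu/pid: hardirq/softirq/nmi". parse_log_origin() and struct
log_origin are hypothetical names for this example. Note that the PoC
prints the masked but unshifted preempt count bits, so the last three
numbers are raw masked values rather than nesting depths.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical holder for one parsed origin line. */
struct log_origin {
	int cpu;
	int pid;
	unsigned long hardirq;	/* preempt_count() & HARDIRQ_MASK */
	unsigned long softirq;	/* preempt_count() & SOFTIRQ_MASK */
	unsigned long nmi;	/* preempt_count() & NMI_MASK */
};

/*
 * Parse a continuation line such as " 3/207: 0/0/0".
 * Returns 1 on success, 0 if the line is some other key/value entry.
 */
static int parse_log_origin(const char *line, struct log_origin *o)
{
	int n;

	assert(line != NULL && o != NULL);

	/* The leading space in the format skips the continuation indent. */
	n = sscanf(line, " %d/%d: %lu/%lu/%lu",
		   &o->cpu, &o->pid, &o->hardirq, &o->softirq, &o->nmi);
	return n == 5;
}
```

Other dict entries such as " SUBSYSTEM=hdaudio" simply fail to match
and can be handled by the existing key/value logic of the reader.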

Something as follows?
/* just to demonstrate the idea */

---

 kernel/printk/printk.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 2f4af216bd6e..4a82d52a343d 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -580,16 +580,33 @@ static u32 truncate_msg(u16 *text_len, u16 *trunc_msg_len,
 	return msg_used_size(*text_len + *trunc_msg_len, 0, pad_len);
 }
 
+static size_t add_log_origin(char *buf, size_t buf_len)
+{
+	return snprintf(buf,
+			buf_len,
+			"%d/%d: %lu/%lu/%lu",
+			raw_smp_processor_id(),
+			task_pid_nr(current),
+			preempt_count() & HARDIRQ_MASK,
+			preempt_count() & SOFTIRQ_MASK,
+			preempt_count() & NMI_MASK);
+}
+
 /* insert record into the buffer, discard old ones, update heads */
 static int log_store(int facility, int level,
 		     enum log_flags flags, u64 ts_nsec,
 		     const char *dict, u16 dict_len,
 		     const char *text, u16 text_len)
 {
+	static char log_origin[64];
+	static size_t log_origin_len;
 	struct printk_log *msg;
 	u32 size, pad_len;
 	u16 trunc_msg_len = 0;
 
+	log_origin_len = add_log_origin(log_origin, sizeof(log_origin));
+	dict_len += log_origin_len;
+
 	/* number of '\0' padding bytes to next message */
 	size = msg_used_size(text_len, dict_len, &pad_len);
 
@@ -620,7 +637,10 @@ static int log_store(int facility, int level,
 		memcpy(log_text(msg) + text_len, trunc_msg, trunc_msg_len);
 		msg->text_len += trunc_msg_len;
 	}
-	memcpy(log_dict(msg), dict, dict_len);
+	memcpy(log_dict(msg), log_origin, log_origin_len);
+	memcpy(log_dict(msg) + log_origin_len + 1,
+		dict,
+		dict_len - log_origin_len);
 	msg->dict_len = dict_len;
 	msg->facility = facility;
 	msg->level = level & 7;
 


* Re: [PATCH] printk: fix possible reuse of va_list variable
  2018-05-11 11:02                 ` [PATCH] printk: fix possible reuse of va_list variable Tetsuo Handa
  2018-05-11 11:27                   ` Sergey Senozhatsky
@ 2018-05-17 11:57                   ` Petr Mladek
  1 sibling, 0 replies; 68+ messages in thread
From: Petr Mladek @ 2018-05-17 11:57 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: sergey.senozhatsky.work, dvyukov, sergey.senozhatsky, syzkaller,
	rostedt, fengguang.wu, linux-kernel, peterz

On Fri 2018-05-11 20:02:31, Tetsuo Handa wrote:
> From 766cf72b5fdc00d1cf5a8ca2c6b23ebb75e2b4d4 Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Date: Fri, 11 May 2018 19:54:19 +0900
> Subject: [PATCH] printk: fix possible reuse of va_list variable
> 
> I noticed that there is a possibility that printk_safe_log_store() causes
> kernel oops because "args" parameter is passed to vsnprintf() again when
> atomic_cmpxchg() detected that we raced. Fix this by using va_copy().

Great catch!

Reviewed-by: Petr Mladek <pmladek@suse.com>

I have tagged it for stable and pushed into printk.git,
branch for-4.18, see
https://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk.git/commit/?h=for-4.18&id=988a35f8da1dec5a8cd2788054d1e717be61bf25
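
For readers unfamiliar with the problem, the pattern behind the fix can
be sketched in plain userspace C. This is not the kernel code:
format_with_retry() and sketch_log() are hypothetical helpers, and the
"retries" argument merely stands in for the atomic_cmpxchg() race that
makes printk_safe_log_store() format the message again. The point is
that a va_list is indeterminate after it has been consumed by
vsnprintf(), so every pass works on a fresh va_copy().

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format the same arguments (retries + 1) times into buf.
 * Each pass uses its own copy of args, so args itself is never consumed.
 */
static int format_with_retry(char *buf, size_t size, int retries,
			     const char *fmt, va_list args)
{
	int n = 0;
	int attempt;

	assert(buf != NULL && size > 0);

	for (attempt = 0; attempt <= retries; attempt++) {
		va_list ap;

		va_copy(ap, args);
		n = vsnprintf(buf, size, fmt, ap);
		va_end(ap);
	}
	return n;
}

/* Variadic front end, analogous to a printk()-style wrapper. */
static int sketch_log(char *buf, size_t size, int retries,
		      const char *fmt, ...)
{
	va_list args;
	int n;

	va_start(args, fmt);
	n = format_with_retry(buf, size, retries, fmt, args);
	va_end(args);
	return n;
}
```

Passing "args" directly to vsnprintf() inside the loop would reuse an
already consumed va_list on the second pass, which is the kind of reuse
the patch description warns about.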

Best Regards,
Petr


* Re: [PATCH] printk: inject caller information into the body of message
  2018-05-17 11:21                       ` Sergey Senozhatsky
  2018-05-17 11:52                         ` Sergey Senozhatsky
@ 2018-05-18 12:15                         ` Petr Mladek
  2018-05-18 12:25                           ` Dmitry Vyukov
                                             ` (2 more replies)
  1 sibling, 3 replies; 68+ messages in thread
From: Petr Mladek @ 2018-05-18 12:15 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, dvyukov, sergey.senozhatsky, syzkaller,
	rostedt, fengguang.wu, linux-kernel, torvalds, akpm

On Thu 2018-05-17 20:21:35, Sergey Senozhatsky wrote:
> On (05/11/18 20:58), Tetsuo Handa wrote:
> > @@ -1820,6 +1860,9 @@ static size_t log_output(int facility, int level, enum log_flags lflags, const c
> >  			return text_len;
> >  	}
> >  
> > +	/* Inject caller info. */
> > +	text = printk_inject_caller_info(text, &text_len);
> > +
> >  	/* Store it in the record log */
> >  	return log_store(facility, level, lflags, 0, dict, dictlen, text, text_len);
> >  }
> 
> [..]
> 
> I think this is slightly intrusive. I understand that you want to avoid
> struct printk_log modification, let's try to see if we have any other
> options.

I agree with Sergey that it is intrusive. We should keep the
information separate from the original string and format it
according to the selected output format (syslog, /dev/kmsg,
console), like we do with the other metadata (e.g. timestamp,
loglevel, dict).


> Dunno...
> For instance, can we store context tracking info as a extended record
> data? We have that dict/dict_len thing. So may we can store tracking
> info there? Extended records will appear on the serial console /* if
> console supports extended data */ or can be read in via devkmsg_read().
> Any other options?

This sounds interesting. Well, we would need to handle different dict
items in different ways. I still wonder if we really need these "hacks".

Another option would be to store the metadata into a separate table
indexed by log_seq number. But it still looks unnecessarily complicated.

IMHO, we could change struct printk_log if we provide related
patches for crashdump and crash utilities.


Important:

First, we should ask what we expect from this feature. Different
information might be needed in different situations. In general,
people might want to know:

  + CPUid even in task context
  + exact interrupt context (soft, hard, NMI)
  + whether preemption or interrupts are enabled

It still looks bearable. But what if people want more,
e.g. context switch counts, task state, pending signals,
mem usage, cgroup stuff.

Is this information useful for all messages or only
selected ones?

Is it acceptable when message prefix is longer than, let's
say 40 characters?

Is the extended output worth having even on slow consoles?


In other words, I wonder if you wanted a similar feature in many
situations in the past and could provide more use cases.


Note:

The proposed patch enabled the extra info with a config option
=> you need to rebuild the kernel => you could just modify
the problematic message. We could just add some printk_ helpers
to make it easier.

Alternatively, I wonder if it might be enough to add a tracepoint
into printk() and get the extra info via
/sys/kernel/debug/tracing/events/. We would need to prevent
recursion when trace buffer is flushed by printk() but...

Best Regards,
Petr


* Re: [PATCH] printk: inject caller information into the body of message
  2018-05-18 12:15                         ` Petr Mladek
@ 2018-05-18 12:25                           ` Dmitry Vyukov
  2018-05-18 12:54                             ` Petr Mladek
  2018-05-23 10:19                           ` Tetsuo Handa
  2018-05-24  2:14                           ` Sergey Senozhatsky
  2 siblings, 1 reply; 68+ messages in thread
From: Dmitry Vyukov @ 2018-05-18 12:25 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Tetsuo Handa, Sergey Senozhatsky, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, LKML, Linus Torvalds,
	Andrew Morton

On Fri, May 18, 2018 at 2:15 PM, Petr Mladek <pmladek@suse.com> wrote:
> On Thu 2018-05-17 20:21:35, Sergey Senozhatsky wrote:
>> On (05/11/18 20:58), Tetsuo Handa wrote:
>> > @@ -1820,6 +1860,9 @@ static size_t log_output(int facility, int level, enum log_flags lflags, const c
>> >                     return text_len;
>> >     }
>> >
>> > +   /* Inject caller info. */
>> > +   text = printk_inject_caller_info(text, &text_len);
>> > +
>> >     /* Store it in the record log */
>> >     return log_store(facility, level, lflags, 0, dict, dictlen, text, text_len);
>> >  }
>>
>> [..]
>>
>> I think this is slightly intrusive. I understand that you want to avoid
>> struct printk_log modification, let's try to see if we have any other
>> options.
>
> I agree with Sergey that it is intrusive. We should keep the
> information separate from the original string and format it
> according to the selected output format (syslog, /dev/kmsg,
> console) like we do it with the other metadata, e.g. timestamp,
> loglevel, dict).
>
>
>> Dunno...
>> For instance, can we store context tracking info as a extended record
>> data? We have that dict/dict_len thing. So may we can store tracking
>> info there? Extended records will appear on the serial console /* if
>> console supports extended data */ or can be read in via devkmsg_read().

What consoles do support it?
We are interested at least in qemu console, GCE console and Android
phone consoles. But it would be a pity if this can't be used on various
development boards too.


>> Any other options?
>
> This sounds interesting. Well, we would need to handle different dict
> items different ways. I still wonder if we really need these "hacks".
>
> Another option would be to store the metadata into a separate table
> indexed by log_seq number. But it still look unnecessarily complicated.
>
> IMHO, we could change struct printk_log if we provide related
> patches for crashdump and crash utilities.
>
>
> Important:
>
> First, we should ask what we expect from this feature. Different
> information might be needed in different situations. In general,
> people might want to know:
>
>   + CPUid even in task context
>   + exact interrupt context (soft, hard, NMI)
>   + whether preemption or interrupts are enabled
>
> It still looks bearable. But what if people want more,
> e.g. context switch counts, task state, pending signals,
> mem usage, cgroup stuff.
>
> Is this information useful for all messages or only
> selected ones?
>
> Is it acceptable when message prefix is longer than, let's
> say 40 characters?
>
> Is the extended output worth having even on slow consoles?
>
>
> By other words, I wonder if you wanted similar feature in many
> situations in the past and could provide more use cases.
>
>
> Note:
>
> The proposed patch enabled the extra info with a config option
> => you need to rebuild the kernel => you could just modify
> the problematic message. We could just add some printk_ helpers
> to make it easier.
>
> Alternatively, I wonder if it might be enough to add a tracepoint
> into printk() and get the extra info via
> /sys/kernel/debug/tracing/events/. We would need to prevent
> recursion when trace buffer is flushed by printk() but...
>
> Best Regards,
> Petr


* Re: [PATCH] printk: inject caller information into the body of message
  2018-05-18 12:25                           ` Dmitry Vyukov
@ 2018-05-18 12:54                             ` Petr Mladek
  2018-05-18 13:08                               ` Dmitry Vyukov
  0 siblings, 1 reply; 68+ messages in thread
From: Petr Mladek @ 2018-05-18 12:54 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Tetsuo Handa, Sergey Senozhatsky, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, LKML, Linus Torvalds,
	Andrew Morton

On Fri 2018-05-18 14:25:57, Dmitry Vyukov wrote:
> > On Thu 2018-05-17 20:21:35, Sergey Senozhatsky wrote:
> >> Dunno...
> >> For instance, can we store context tracking info as a extended record
> >> data? We have that dict/dict_len thing. So may we can store tracking
> >> info there? Extended records will appear on the serial console /* if
> >> console supports extended data */ or can be read in via devkmsg_read().
> 
> What consoles do support it?
> We are interested at least in qemu console, GCE console and Android
> phone consoles. But it would be pity if this can't be used on various
> development boards too.

Only the netconsole is able to show the extended (dict)
information at the moment. Search for CON_EXTENDED flag.

Best Regards,
Petr


* Re: [PATCH] printk: inject caller information into the body of message
  2018-05-18 12:54                             ` Petr Mladek
@ 2018-05-18 13:08                               ` Dmitry Vyukov
  2018-05-24  2:21                                 ` Sergey Senozhatsky
  0 siblings, 1 reply; 68+ messages in thread
From: Dmitry Vyukov @ 2018-05-18 13:08 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Tetsuo Handa, Sergey Senozhatsky, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, LKML, Linus Torvalds,
	Andrew Morton

On Fri, May 18, 2018 at 2:54 PM, Petr Mladek <pmladek@suse.com> wrote:
> On Fri 2018-05-18 14:25:57, Dmitry Vyukov wrote:
>> > On Thu 2018-05-17 20:21:35, Sergey Senozhatsky wrote:
>> >> Dunno...
>> >> For instance, can we store context tracking info as an extended record
>> >> data? We have that dict/dict_len thing. So maybe we can store tracking
>> >> info there? Extended records will appear on the serial console /* if
>> >> console supports extended data */ or can be read in via devkmsg_read().
>>
>> What consoles do support it?
>> We are interested at least in qemu console, GCE console and Android
>> phone consoles. But it would be pity if this can't be used on various
>> development boards too.
>
> Only the netconsole is able to show the extended (dict)
> information at the moment. Search for CON_EXTENDED flag.

Then we won't be able to use it. And we can't pipe from devkmsg_read
in user-space, because we need this to work when kernel is broken in
various ways...


* Re: [PATCH] printk: inject caller information into the body of message
  2018-05-18 12:15                         ` Petr Mladek
  2018-05-18 12:25                           ` Dmitry Vyukov
@ 2018-05-23 10:19                           ` Tetsuo Handa
  2018-05-24  2:14                           ` Sergey Senozhatsky
  2 siblings, 0 replies; 68+ messages in thread
From: Tetsuo Handa @ 2018-05-23 10:19 UTC (permalink / raw)
  To: pmladek
  Cc: sergey.senozhatsky.work, dvyukov, sergey.senozhatsky, syzkaller,
	rostedt, fengguang.wu, linux-kernel, torvalds, akpm

Sergey Senozhatsky wrote:
> On (05/17/18 20:21), Sergey Senozhatsky wrote:
> > Dunno...
> > For instance, can we store context tracking info as an extended record
> > data? We have that dict/dict_len thing. So maybe we can store tracking
> > info there? Extended records will appear on the serial console /* if
> > console supports extended data */ or can be read in via devkmsg_read().
> 
> Those extended records are already there for exactly the same
> reason - people want to attach a special context to printk() entries.
> See dev_vprintk_emit() and create_syslog_header(). So we can add more
> key/value data to that context. Sounds kinda-sorta reasonable.

Well, the context which I want is not special. It is a common context (like the
timestamp, which is controlled via /sys/module/printk/parameters/time ) for
distinguishing/correlating concurrently printed messages.



Petr Mladek wrote:
> First, we should ask what we expect from this feature. Different
> information might be needed in different situations. In general,
> people might want to know:
> 
>   + CPUid even in task context

I don't think the CPU id in task context is common context. A task can sleep
and migrate to a different CPU, so the CPU id is special context which would
help only in specific cases.

>   + exact interrupt context (soft, hard, NMI)

I don't know whether it is worth printing. But if it is useful, printing
the type of interrupt context as a single character (%c) would be sufficient
for the context which I want.

>   + whether preemption or interrupts are enabled

I don't think preemption state is common context. It is special context
which would be explicitly printed by e.g. stall detection messages.

> 
> It still looks bearable. But what if people want more,
> e.g. context switch counts, task state, pending signals,
> mem usage, cgroup stuff.
> 

I don't think context switch counts, task state and pending signals are
common context. They are special context which would be explicitly printed
by e.g. thread dump messages.

But if people want special context like listed above, we can consider
specifying by bitmask (e.g. /proc/sys/kernel/sysrq ) or by string (e.g.
/proc/sys/kernel/core_pattern ).

> Is this information useful for all messages or only
> selected ones?

I think the context which I want is useful for all messages. Thus,
my patch controls it via /sys/module/printk/parameters/caller_info
as with /sys/module/printk/parameters/time .

> 
> Is it acceptable when message prefix is longer than, let's
> say 40 characters?

The context which I want won't become so long.

> 
> Is the extended output worth having even on slow consoles?

Netconsole is unique in that the number of characters to transmit and the
delay are not proportional. As long as a message fits within an Ethernet
packet (nearly 1500 bytes, which is longer than LOG_LINE_MAX for a printk()
operation), the delay for printing one character and for printing multiple
characters would be almost the same. Therefore, reducing the frequency of
printk() operations by having an API for buffered printk() (e.g.
https://groups.google.com/forum/#!topic/linux.kernel/OnoXED88nQM
and https://patchwork.kernel.org/patch/9927385/ ) would help.

But for other consoles, always printing all extended records might
become a pain. Thus, I prefer that the context which I want and the
contexts which other people might want are treated separately.
By the way, having an API for buffered printk() would also help avoid the

  pr_info("printk: continuation disabled due to ext consoles, expect more fragments in /dev/kmsg\n");

case anyway...


* Re: [PATCH] printk: inject caller information into the body of message
  2018-05-18 12:15                         ` Petr Mladek
  2018-05-18 12:25                           ` Dmitry Vyukov
  2018-05-23 10:19                           ` Tetsuo Handa
@ 2018-05-24  2:14                           ` Sergey Senozhatsky
  2018-05-26  6:36                             ` Dmitry Vyukov
  2 siblings, 1 reply; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-05-24  2:14 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Tetsuo Handa, Sergey Senozhatsky, dvyukov, sergey.senozhatsky,
	syzkaller, rostedt, fengguang.wu, linux-kernel, torvalds, akpm

On (05/18/18 14:15), Petr Mladek wrote:
> > Dunno...
> > For instance, can we store context tracking info as an extended record
> > data? We have that dict/dict_len thing. So maybe we can store tracking
> > info there? Extended records will appear on the serial console /* if
> > console supports extended data */ or can be read in via devkmsg_read().
> > Any other options?
> 
> This sounds interesting. Well, we would need to handle different dict
> items different ways. I still wonder if we really need these "hacks".

Well, it doesn't look like a complete hack. Extended records are there
exactly for the "this printk line came from context (device, subsystem) ABC"
type of thing. Those entries are multi-key/value already, so we can just
add one more key/value pair. E.g. appending CONTEXT to the already existing
SUBSYSTEM/DEVICE lines:

6,575,3130042,-;snd_hda_codec_generic hdaudioC1D0:    dig-out=0x4/0x5
 CONTEXT=4/99 PREEMPT=0/0/0/0
 SUBSYSTEM=hdaudio
 DEVICE=+hdaudio:hdaudioC1D

But I'm not pushing for this particular solution. It just looked
reasonable and very "cheap", as we don't break anything.
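
To illustrate what a userspace reader would have to do with such a record,
here is a minimal sketch that parses the /dev/kmsg-style text shown above:
a "prio,seq,timestamp,flags;message" line followed by continuation lines
that start with a space and carry KEY=value. The names struct kv,
parse_header() and parse_dict() are made up for this example, and CONTEXT
is only the key proposed in this mail, not something the kernel emits.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one key/value pair of an extended record. */
struct kv {
	char key[32];
	char val[96];
};

/* Parse the "prio,seq,timestamp," part of the record; 0 on success. */
static int parse_header(const char *rec, unsigned *prio,
			unsigned long long *seq, unsigned long long *ts_usec)
{
	return sscanf(rec, "%u,%llu,%llu,", prio, seq, ts_usec) == 3 ? 0 : -1;
}

/*
 * Parse the continuation lines that follow the message line.  Each one
 * starts with a single space and has the form " KEY=value".  Returns the
 * number of pairs stored in out[] (at most max).
 */
static int parse_dict(const char *rec, struct kv *out, int max)
{
	int n = 0;
	const char *line = strchr(rec, '\n');	/* end of the message line */

	while (line && n < max) {
		line++;				/* step over '\n' */
		if (*line != ' ')
			break;			/* not a continuation line */
		/* %31[^=] and %95[^\n] keep the copies inside the buffers */
		if (sscanf(line, " %31[^=]=%95[^\n]",
			   out[n].key, out[n].val) == 2)
			n++;
		line = strchr(line, '\n');
	}
	return n;
}
```

A reader that only cares about the proposed context would call parse_dict()
and look for the CONTEXT key, ignoring SUBSYSTEM and DEVICE.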

> 
> IMHO, we could change struct printk_log if we provide related
> patches for crashdump and crash utilities.

Yep.

> First, we should ask what we expect from this feature.

Yeah. Can't really comment on this, it's up to Tetsuo and Dmitry to
decide. So far I've seen slightly different requirements/expectations.

> Different information might be needed in different situations.
> In general, people might want to know:
> 
>   + CPUid even in task context
>   + exact interrupt context (soft, hard, NMI)

Agreed.

>   + whether preemption or interrupts are enabled

preemption and irqs are already disabled this far in printk() internals.

> It still looks bearable. But what if people want more,
> e.g. context switch counts, task state, pending signals,
> mem usage, cgroup stuff.

Right. Extended records [dicts] can be up to 8k each, so I'd say
that we can have as many key/value pairs as we want to.

> Is this information useful for all messages or only
> selected ones?

No idea :)

> Is it acceptable when message prefix is longer than, let's
> say 40 characters?

If we talk about embedding this info into normal message payload
then, yes, we better keep it as small as possible. Because we are
limited by LOG_LINE_MAX + PREFIX_MAX chars [~1024 bytes, if I recall
correctly], the more we steal for context info the less we have for
the message.

> Is the extended output worth having even on slow consoles?

My expectation was that syzkaller is mostly executed in a qemu environment.
But if someone wants to run it on a device with a slow console, then
it might be painful.

> By other words, I wonder if you wanted similar feature in many
> situations in the past and could provide more use cases.

Sorry, can you explain a bit more?

> Note:
> 
> The proposed patch enabled the extra info with a config option
> => you need to rebuild the kernel => you could just modify
> the problematic message. We could just add some printk_ helpers
> to make it easier.

Yes. As far as I know, syzkaller folks are completely fine with
the .config based solution and can rebuild the kernel as many times
as needed. Modifying the kernel code, on the other hand, is not an
option.

> Alternatively, I wonder if it might be enough to add a tracepoint
> into printk() and get the extra info via
> /sys/kernel/debug/tracing/events/.

Sounds good to me.

> We would need to prevent recursion when trace buffer is flushed by
> printk() but...

Agreed.

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-05-18 13:08                               ` Dmitry Vyukov
@ 2018-05-24  2:21                                 ` Sergey Senozhatsky
  0 siblings, 0 replies; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-05-24  2:21 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Petr Mladek, Tetsuo Handa, Sergey Senozhatsky,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, Fengguang Wu,
	LKML, Linus Torvalds, Andrew Morton

On (05/18/18 15:08), Dmitry Vyukov wrote:
[..]
> >> What consoles do support it?
> >> We are interested at least in qemu console, GCE console and Android
> >> phone consoles. But it would be pity if this can't be used on various
> >> development boards too.
> >
> > Only the netconsole is able to show the extended (dict)
> > information at the moment. Search for CON_EXTENDED flag.
> 
> Then we won't be able to use it. And we can't pipe from devkmsg_read
> in user-space, because we need this to work when kernel is broken in
> various ways...

Hmm. Well, basically, any console that has CON_EXTENDED bit set; which
is, probably, only netconsole at this point. Do you use slow serial
consoles?

OK, seems like extended printk records won't make you happy after all :)

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-05-24  2:14                           ` Sergey Senozhatsky
@ 2018-05-26  6:36                             ` Dmitry Vyukov
  2018-06-20  5:44                               ` Dmitry Vyukov
  0 siblings, 1 reply; 68+ messages in thread
From: Dmitry Vyukov @ 2018-05-26  6:36 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Tetsuo Handa, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, LKML, Linus Torvalds,
	Andrew Morton

On Thu, May 24, 2018 at 4:14 AM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
>> First, we should ask what we expect from this feature.
>
> Yeah. Can't really comment on this, it's up to Tetsuo and Dmitry to
> decide. So far I've seen slightly different requirements/expectations.

The root problem is that it's not possible to make sense of kernel
output if a message takes more than one line (or is output non-atomically
with several printk's), because output from several
tasks/interrupts/etc gets intermixed. For example, it's not generally possible to
recover a crash stack trace, because one gets a random mix of frames.
Humans usually, but not always, can restore most of the sense. So the
goal is to make this ought-to-be-simple task actually simple, without
requiring human intelligence and time each time.

Prefixing each line with task/cpu/interrupt context should do the
trick as it will be possible to split kernel output into multiple
independent streams and analyze them independently.
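
A minimal sketch of that splitting step, assuming every line starts with a
single bracketed context token such as "[T1728] " or "[C0] " (a made-up
format, no such prefix exists yet) and that printk timestamps, which also
use brackets, are disabled or already stripped. split_prefix() and
print_stream() are hypothetical helper names.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy the context token of a line such as "[T1728] Call Trace:" into
 * ctx (without the brackets) and return a pointer to the message body.
 * Returns NULL when the line carries no usable prefix.
 */
static const char *split_prefix(const char *line, char *ctx, size_t ctx_len)
{
	const char *end;
	size_t n;

	if (line[0] != '[')
		return NULL;
	end = strchr(line, ']');
	if (!end)
		return NULL;
	n = (size_t)(end - line - 1);
	if (n == 0 || n >= ctx_len)
		return NULL;
	memcpy(ctx, line + 1, n);
	ctx[n] = '\0';
	end++;
	if (*end == ' ')
		end++;			/* skip the separator after ']' */
	return end;
}

/*
 * Write to out only those lines of the newline-separated log that belong
 * to the context `want`.  Returns the number of lines written.
 */
static int print_stream(const char *log, const char *want, FILE *out)
{
	char line[1024], ctx[32];
	int matched = 0;

	while (*log) {
		size_t n = strcspn(log, "\n");
		size_t copy = n < sizeof(line) - 1 ? n : sizeof(line) - 1;
		const char *body;

		memcpy(line, log, copy);
		line[copy] = '\0';
		body = split_prefix(line, ctx, sizeof(ctx));
		if (body && strcmp(ctx, want) == 0) {
			fprintf(out, "%s\n", body);
			matched++;
		}
		log += n;
		if (*log == '\n')
			log++;
	}
	return matched;
}
```

Running print_stream() once per context seen in the log yields one
uninterleaved report per task or CPU, which is what a crash parser needs.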

In our context (syzbot testing) we can enable an additional config
and adapt the parser to understand an additional line prefix. But I don't
know how prefixing lines fits into the larger picture. Does it make
sense to think through a potential extension story for this format? E.g.
the user specifies a set of extension records that are dumped before each
line, and can then parse them unambiguously? I guess some
consoles/interfaces will never be extended to provide access to the
extension records, so it can make sense to make them accessible in
text format too (optionally).


* Re: [PATCH] printk: inject caller information into the body of message
  2018-05-26  6:36                             ` Dmitry Vyukov
@ 2018-06-20  5:44                               ` Dmitry Vyukov
  2018-06-20  8:31                                 ` Sergey Senozhatsky
  0 siblings, 1 reply; 68+ messages in thread
From: Dmitry Vyukov @ 2018-06-20  5:44 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Tetsuo Handa, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, LKML, Linus Torvalds,
	Andrew Morton

On Sat, May 26, 2018 at 8:36 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Thu, May 24, 2018 at 4:14 AM, Sergey Senozhatsky
> <sergey.senozhatsky.work@gmail.com> wrote:
>>> First, we should ask what we expect from this feature.
>>
>> Yeah. Can't really comment on this, it's up to Tetsuo and Dmitry to
>> decide. So far I've seen slightly different requirements/expectations.
>
> The root problem is that it's not possible to make sense of kernel
> output if a message takes more than one line (or is output non-atomically
> with several printk's), because output from several
> tasks/interrupts/etc gets intermixed. For example, it's not generally possible to
> recover a crash stack trace, because one gets a random mix of frames.
> Humans usually, but not always, can restore most of the sense. So the
> goal is to make this ought-to-be-simple task actually simple, without
> requiring human intelligence and time each time.
>
> Prefixing each line with task/cpu/interrupt context should do the
> trick as it will be possible to split kernel output into multiple
> independent streams and analyze them independently.
>
> In our context (syzbot testing) we can enable an additional config
> and adapt the parser to understand an additional line prefix. But I don't
> know how prefixing lines fits into the larger picture. Does it make
> sense to think through a potential extension story for this format? E.g.
> the user specifies a set of extension records that are dumped before each
> line, and can then parse them unambiguously? I guess some
> consoles/interfaces will never be extended to provide access to the
> extension records, so it can make sense to make them accessible in
> text format too (optionally).


up

We continue to get messes like this, and each instance needs to be
checked by a human.


BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
sysfs: cannot create duplicate filename '/class/ieee80211/!'
PGD 1cae7e067 P4D 1cae7e067 PUD 1b4da6067 PMD 0
Oops: 0010 [#1] SMP KASAN
CPU: 1 PID: 1728 Comm: syz-executor4 Not tainted 4.17.0+ #84
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
CPU: 0 PID: 1738 Comm: syz-executor7 Not tainted 4.17.0+ #84
RIP: 0010:          (null)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Code:
Call Trace:
Bad RIP value.
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1b9/0x294 lib/dump_stack.c:113
RSP: 0018:ffff88018cd3f590 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff880192f05800 RCX: 1ffffffff10eeea9
 sysfs_warn_dup.cold.3+0x1c/0x2b fs/sysfs/dir.c:30
RDX: ffff88018cd3fab0 RSI: ffff8801c927a480 RDI: ffff88018c77c040
 sysfs_do_create_link_sd.isra.2+0x116/0x130 fs/sysfs/symlink.c:50
RBP: ffff88018cd3f700 R08: 0000000000000001 R09: 0000000000000000
 sysfs_do_create_link fs/sysfs/symlink.c:79 [inline]
 sysfs_create_link+0x65/0xc0 fs/sysfs/symlink.c:91
R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff100319a7eb7
R13: ffff88018cd3fab0 R14: ffff880192f05812 R15: ffff880192f05c58
 device_add_class_symlinks drivers/base/core.c:1632 [inline]
 device_add+0x5c9/0x16f0 drivers/base/core.c:1834
FS:  00007f4a8e71f700(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 0000000191e1b000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 wiphy_register+0x182e/0x24e0 net/wireless/core.c:813
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ieee80211_register_hw+0x13cd/0x35d0 net/mac80211/main.c:1050
 sock_poll+0x1d1/0x710 net/socket.c:1168
 mac80211_hwsim_new_radio+0x1da2/0x33b0 drivers/net/wireless/mac80211_hwsim.c:2772
 vfs_poll+0x77/0x2a0 fs/select.c:40
 do_pollfd fs/select.c:848 [inline]
 do_poll fs/select.c:896 [inline]
 do_sys_poll+0x6fd/0x1100 fs/select.c:990
 hwsim_new_radio_nl+0x7b8/0xa60 drivers/net/wireless/mac80211_hwsim.c:3247
 genl_family_rcv_msg+0x889/0x1120 net/netlink/genetlink.c:599
 genl_rcv_msg+0xc6/0x170 net/netlink/genetlink.c:624
 netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2448
 __do_sys_poll fs/select.c:1048 [inline]
 __se_sys_poll fs/select.c:1036 [inline]
 __x64_sys_poll+0x189/0x510 fs/select.c:1036
 genl_rcv+0x28/0x40 net/netlink/genetlink.c:635
 netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
 netlink_unicast+0x58b/0x740 net/netlink/af_netlink.c:1336
 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290
 netlink_sendmsg+0x9f0/0xfa0 net/netlink/af_netlink.c:1901
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x455b29
 sock_sendmsg_nosec net/socket.c:645 [inline]
 sock_sendmsg+0xd5/0x120 net/socket.c:655
Code:
 ___sys_sendmsg+0x805/0x940 net/socket.c:2161
1d
ba
fb
ff
c3
66
2e
0f
1f
 __sys_sendmsg+0x115/0x270 net/socket.c:2199
84
00
00
00
 __do_sys_sendmsg net/socket.c:2208 [inline]
 __se_sys_sendmsg net/socket.c:2206 [inline]
 __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2206
00
 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290
00 66
90
48
89
f8 48
89
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
f7
RIP: 0033:0x455b29
48
Code:
89
1d
d6
ba fb
48
ff
89
c3
ca
66
4d
2e
89
0f
c2
1f
4d
84
89
00
c8
00
4c
00
8b
00
4c
00
24 08
66
0f
90
05 <48>
48
3d
89
01
f8
f0
48
ff ff
89
0f 83
f7
eb
48
b9 fb
89
ff
d6
c3
48 89
66
ca 4d
2e
89
0f
c2
1f
4d
84
89
00
c8
00
4c
00
8b
00
4c
RSP: 002b:00007f4a8e71ec68 EFLAGS: 00000246
24
 ORIG_RAX: 0000000000000007
08
RAX: ffffffffffffffda RBX: 00007f4a8e71f6d4 RCX: 0000000000455b29
0f
RDX: 0000000000000004 RSI: 0000000000000005 RDI: 0000000020000000
05
RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
<48> 3d
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
01
R13: 00000000004c06c7 R14: 00000000004d0030 R15: 0000000000000000
f0
Modules linked in:
ff
Dumping ftrace buffer:
ff
   (ftrace buffer empty)
0f
CR2: 0000000000000000
83
---[ end trace 69744e61e26ed6a4 ]---
eb b9 fb ff c3 66 2e
RIP: 0010:          (null)
0f 1f 84 00 00 00
Code:
00
RSP: 002b:00007f4e4fdedc68 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
Bad RIP value.
RAX: ffffffffffffffda RBX: 00007f4e4fdee6d4 RCX: 0000000000455b29
RDX: 0000000000000000 RSI: 0000000020000080 RDI: 0000000000000014
RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
RSP: 0018:ffff88018cd3f590 EFLAGS: 00010246
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 00000000004c0ee7 R14: 00000000004d0d80 R15: 0000000000000000
RAX: 0000000000000000 RBX: ffff880192f05800 RCX: 1ffffffff10eeea9
RDX: ffff88018cd3fab0 RSI: ffff8801c927a480 RDI: ffff88018c77c040
RBP: ffff88018cd3f700 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff100319a7eb7
R13: ffff88018cd3fab0 R14: ffff880192f05812 R15: ffff880192f05c58
FS:  00007f4a8e71f700(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
netlink: 8 bytes leftover after parsing attributes in process `syz-executor2'.
CR2: ffffffffffffffd6 CR3: 0000000191e1b000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20  5:44                               ` Dmitry Vyukov
@ 2018-06-20  8:31                                 ` Sergey Senozhatsky
  2018-06-20  8:45                                   ` Dmitry Vyukov
  0 siblings, 1 reply; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-06-20  8:31 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Petr Mladek, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, Fengguang Wu,
	LKML, Linus Torvalds, Andrew Morton

On (06/20/18 07:44), Dmitry Vyukov wrote:
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> sysfs: cannot create duplicate filename '/class/ieee80211/!'
> PGD 1cae7e067 P4D 1cae7e067 PUD 1b4da6067 PMD 0
> Oops: 0010 [#1] SMP KASAN
> CPU: 1 PID: 1728 Comm: syz-executor4 Not tainted 4.17.0+ #84
> Hardware name: Google Google Compute Engine/Google Compute Engine,
> BIOS Google 01/01/2011
> CPU: 0 PID: 1738 Comm: syz-executor7 Not tainted 4.17.0+ #84
> RIP: 0010:          (null)
> Hardware name: Google Google Compute Engine/Google Compute Engine,
> BIOS Google 01/01/2011
> Code:
> Call Trace:
> Bad RIP value.
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
> RSP: 0018:ffff88018cd3f590 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: ffff880192f05800 RCX: 1ffffffff10eeea9
>  sysfs_warn_dup.cold.3+0x1c/0x2b fs/sysfs/dir.c:30
> RDX: ffff88018cd3fab0 RSI: ffff8801c927a480 RDI: ffff88018c77c040
>  sysfs_do_create_link_sd.isra.2+0x116/0x130 fs/sysfs/symlink.c:50
> RBP: ffff88018cd3f700 R08: 0000000000000001 R09: 0000000000000000
>  sysfs_do_create_link fs/sysfs/symlink.c:79 [inline]
>  sysfs_create_link+0x65/0xc0 fs/sysfs/symlink.c:91
> R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff100319a7eb7
> R13: ffff88018cd3fab0 R14: ffff880192f05812 R15: ffff880192f05c58
>  device_add_class_symlinks drivers/base/core.c:1632 [inline]
>  device_add+0x5c9/0x16f0 drivers/base/core.c:1834
> FS:  00007f4a8e71f700(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffffffffd6 CR3: 0000000191e1b000 CR4: 00000000001406e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>  wiphy_register+0x182e/0x24e0 net/wireless/core.c:813
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  ieee80211_register_hw+0x13cd/0x35d0 net/mac80211/main.c:1050
>  sock_poll+0x1d1/0x710 net/socket.c:1168
>  mac80211_hwsim_new_radio+0x1da2/0x33b0
> drivers/net/wireless/mac80211_hwsim.c:2772
>  vfs_poll+0x77/0x2a0 fs/select.c:40
>  do_pollfd fs/select.c:848 [inline]
>  do_poll fs/select.c:896 [inline]
>  do_sys_poll+0x6fd/0x1100 fs/select.c:990
>  hwsim_new_radio_nl+0x7b8/0xa60 drivers/net/wireless/mac80211_hwsim.c:3247
>  genl_family_rcv_msg+0x889/0x1120 net/netlink/genetlink.c:599
>  genl_rcv_msg+0xc6/0x170 net/netlink/genetlink.c:624
>  netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2448
>  __do_sys_poll fs/select.c:1048 [inline]
>  __se_sys_poll fs/select.c:1036 [inline]
>  __x64_sys_poll+0x189/0x510 fs/select.c:1036
>  genl_rcv+0x28/0x40 net/netlink/genetlink.c:635
>  netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
>  netlink_unicast+0x58b/0x740 net/netlink/af_netlink.c:1336
>  do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290
>  netlink_sendmsg+0x9f0/0xfa0 net/netlink/af_netlink.c:1901
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x455b29
>  sock_sendmsg_nosec net/socket.c:645 [inline]
>  sock_sendmsg+0xd5/0x120 net/socket.c:655
> Code:
>  ___sys_sendmsg+0x805/0x940 net/socket.c:2161
> 1d
> ba
> fb
> ff
> c3
> 66
> 2e
> 0f
> 1f
>  __sys_sendmsg+0x115/0x270 net/socket.c:2199
> 84
> 00
> 00
> 00
>  __do_sys_sendmsg net/socket.c:2208 [inline]
>  __se_sys_sendmsg net/socket.c:2206 [inline]
>  __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2206
> 00
>  do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290
> 00 66
> 90
> 48
> 89
> f8 48
> 89

Meh, pr_cont() output... I forgot about it. So I have a very simple
patch [probably buggy]. One thing we can be sure of is that it does
not handle pr_cont() interleaving properly - it logs the context which
has stored the message, while in the case of pr_cont() that is not always
correct, since we can have a preliminary pr_cont() flush. It also doesn't
handle printk_safe stuff. Tetsuo's patch, probably, handled all those
cases. Hmm.

The patch below is less intrusive but also less complete / less universal.
Maybe it's enough for you, maybe it's not. I'm wondering if this patch will
make any difference on your side to begin with. Note, I'm not pushing for
this particular message format, we can change it the way you want.

===

Subject: [PATCH] printk: log message origin context info

---
 kernel/printk/printk.c | 31 ++++++++++++++++++++++++++++++-
 lib/Kconfig.debug      |  8 ++++++++
 2 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 247808333ba4..304a02b0c432 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -580,16 +580,38 @@ static u32 truncate_msg(u16 *text_len, u16 *trunc_msg_len,
 	return msg_used_size(*text_len + *trunc_msg_len, 0, pad_len);
 }
 
+static size_t log_message_origin(char *buf, size_t buf_len)
+{
+	size_t ret = 0;
+
+#ifdef CONFIG_PRINTK_LOG_MESSAGE_ORIGIN
+	ret = snprintf(buf,
+			buf_len,
+			"[%d/%d preempt:%lu/%lu/%lu] ",
+			raw_smp_processor_id(),
+			task_pid_nr(current),
+			in_nmi(),
+			in_irq(),
+			in_serving_softirq());
+#endif
+	return ret;
+}
+
 /* insert record into the buffer, discard old ones, update heads */
 static int log_store(int facility, int level,
 		     enum log_flags flags, u64 ts_nsec,
 		     const char *dict, u16 dict_len,
 		     const char *text, u16 text_len)
 {
+	static char log_origin[64];
+	static size_t log_origin_len;
 	struct printk_log *msg;
 	u32 size, pad_len;
 	u16 trunc_msg_len = 0;
 
+	log_origin_len = log_message_origin(log_origin, sizeof(log_origin));
+	text_len += log_origin_len;
+
 	/* number of '\0' padding bytes to next message */
 	size = msg_used_size(text_len, dict_len, &pad_len);
 
@@ -614,7 +636,14 @@ static int log_store(int facility, int level,
 
 	/* fill message */
 	msg = (struct printk_log *)(log_buf + log_next_idx);
-	memcpy(log_text(msg), text, text_len);
+	if (log_origin_len) {
+		memcpy(log_text(msg), log_origin, log_origin_len);
+		memcpy(log_text(msg) + log_origin_len,
+			text,
+			text_len - log_origin_len);
+	} else {
+		memcpy(log_text(msg), text, text_len);
+	}
 	msg->text_len = text_len;
 	if (trunc_msg_len) {
 		memcpy(log_text(msg) + text_len, trunc_msg, trunc_msg_len);
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 8838d1158d19..57220642a00b 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -15,6 +15,14 @@ config PRINTK_TIME
 	  The behavior is also controlled by the kernel command line
 	  parameter printk.time=1. See Documentation/admin-guide/kernel-parameters.rst
 
+config PRINTK_LOG_MESSAGE_ORIGIN
+	bool "Store printk() message origin context info"
+	depends on PRINTK
+	help
+	  Selecting this option causes extra information - CPU, task pid,
+	  preemption mask - to be added to every message. This can be
+	  helpful when interleaved printk() lines cause too much confusion.
+
 config CONSOLE_LOGLEVEL_DEFAULT
 	int "Default console loglevel (1-15)"
 	range 1 15
-- 
2.17.1
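
For experimenting with the message format outside the kernel, the
prefix-then-text copy done by the patch above can be modelled in userspace
roughly as follows. This is only a sketch: store_with_origin() is a made-up
name, the cpu/pid/nmi/irq/softirq values are passed in by the caller
instead of being read from the running context, and the bounds handling is
an assumption of this model rather than what the patch does.

```c
#include <stdio.h>
#include <string.h>

/*
 * Userspace model: write "[cpu/pid preempt:nmi/irq/softirq] " followed by
 * text into dst.  Like a log record, the result is length-tracked and not
 * NUL-terminated.  Returns the stored length; the text (never the prefix)
 * is truncated when dst is too small, and 0 is returned if even the
 * prefix does not fit.
 */
static size_t store_with_origin(char *dst, size_t dst_len,
				int cpu, int pid,
				unsigned long nmi, unsigned long irq,
				unsigned long softirq,
				const char *text, size_t text_len)
{
	int plen = snprintf(dst, dst_len, "[%d/%d preempt:%lu/%lu/%lu] ",
			    cpu, pid, nmi, irq, softirq);

	/* snprintf() reports the untruncated length, so check it. */
	if (plen < 0 || (size_t)plen >= dst_len)
		return 0;
	if (text_len > dst_len - (size_t)plen)
		text_len = dst_len - (size_t)plen;
	memcpy(dst + plen, text, text_len);
	return (size_t)plen + text_len;
}
```

With this format a line from CPU 1, pid 1728 in plain task context starts
with "[1/1728 preempt:0/0/0] ", i.e. 23 bytes of every record go to the
prefix.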



* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20  8:31                                 ` Sergey Senozhatsky
@ 2018-06-20  8:45                                   ` Dmitry Vyukov
  2018-06-20  9:06                                     ` Sergey Senozhatsky
  0 siblings, 1 reply; 68+ messages in thread
From: Dmitry Vyukov @ 2018-06-20  8:45 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Tetsuo Handa, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, LKML, Linus Torvalds,
	Andrew Morton

On Wed, Jun 20, 2018 at 10:31 AM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
> On (06/20/18 07:44), Dmitry Vyukov wrote:
>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
>> sysfs: cannot create duplicate filename '/class/ieee80211/!'
>> PGD 1cae7e067 P4D 1cae7e067 PUD 1b4da6067 PMD 0
>> Oops: 0010 [#1] SMP KASAN
>> CPU: 1 PID: 1728 Comm: syz-executor4 Not tainted 4.17.0+ #84
>> Hardware name: Google Google Compute Engine/Google Compute Engine,
>> BIOS Google 01/01/2011
>> CPU: 0 PID: 1738 Comm: syz-executor7 Not tainted 4.17.0+ #84
>> RIP: 0010:          (null)
>> Hardware name: Google Google Compute Engine/Google Compute Engine,
>> BIOS Google 01/01/2011
>> Code:
>> Call Trace:
>> Bad RIP value.
>>  __dump_stack lib/dump_stack.c:77 [inline]
>>  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
>> RSP: 0018:ffff88018cd3f590 EFLAGS: 00010246
>> RAX: 0000000000000000 RBX: ffff880192f05800 RCX: 1ffffffff10eeea9
>>  sysfs_warn_dup.cold.3+0x1c/0x2b fs/sysfs/dir.c:30
>> RDX: ffff88018cd3fab0 RSI: ffff8801c927a480 RDI: ffff88018c77c040
>>  sysfs_do_create_link_sd.isra.2+0x116/0x130 fs/sysfs/symlink.c:50
>> RBP: ffff88018cd3f700 R08: 0000000000000001 R09: 0000000000000000
>>  sysfs_do_create_link fs/sysfs/symlink.c:79 [inline]
>>  sysfs_create_link+0x65/0xc0 fs/sysfs/symlink.c:91
>> R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff100319a7eb7
>> R13: ffff88018cd3fab0 R14: ffff880192f05812 R15: ffff880192f05c58
>>  device_add_class_symlinks drivers/base/core.c:1632 [inline]
>>  device_add+0x5c9/0x16f0 drivers/base/core.c:1834
>> FS:  00007f4a8e71f700(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: ffffffffffffffd6 CR3: 0000000191e1b000 CR4: 00000000001406e0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>  wiphy_register+0x182e/0x24e0 net/wireless/core.c:813
>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> Call Trace:
>>  ieee80211_register_hw+0x13cd/0x35d0 net/mac80211/main.c:1050
>>  sock_poll+0x1d1/0x710 net/socket.c:1168
>>  mac80211_hwsim_new_radio+0x1da2/0x33b0
>> drivers/net/wireless/mac80211_hwsim.c:2772
>>  vfs_poll+0x77/0x2a0 fs/select.c:40
>>  do_pollfd fs/select.c:848 [inline]
>>  do_poll fs/select.c:896 [inline]
>>  do_sys_poll+0x6fd/0x1100 fs/select.c:990
>>  hwsim_new_radio_nl+0x7b8/0xa60 drivers/net/wireless/mac80211_hwsim.c:3247
>>  genl_family_rcv_msg+0x889/0x1120 net/netlink/genetlink.c:599
>>  genl_rcv_msg+0xc6/0x170 net/netlink/genetlink.c:624
>>  netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2448
>>  __do_sys_poll fs/select.c:1048 [inline]
>>  __se_sys_poll fs/select.c:1036 [inline]
>>  __x64_sys_poll+0x189/0x510 fs/select.c:1036
>>  genl_rcv+0x28/0x40 net/netlink/genetlink.c:635
>>  netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
>>  netlink_unicast+0x58b/0x740 net/netlink/af_netlink.c:1336
>>  do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290
>>  netlink_sendmsg+0x9f0/0xfa0 net/netlink/af_netlink.c:1901
>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>> RIP: 0033:0x455b29
>>  sock_sendmsg_nosec net/socket.c:645 [inline]
>>  sock_sendmsg+0xd5/0x120 net/socket.c:655
>> Code:
>>  ___sys_sendmsg+0x805/0x940 net/socket.c:2161
>> 1d
>> ba
>> fb
>> ff
>> c3
>> 66
>> 2e
>> 0f
>> 1f
>>  __sys_sendmsg+0x115/0x270 net/socket.c:2199
>> 84
>> 00
>> 00
>> 00
>>  __do_sys_sendmsg net/socket.c:2208 [inline]
>>  __se_sys_sendmsg net/socket.c:2206 [inline]
>>  __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2206
>> 00
>>  do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290
>> 00 66
>> 90
>> 48
>> 89
>> f8 48
>> 89
>
> Meh, pr_cont() output... I forgot about it. So I have a very simple
> patch [probably buggy]. One thing we can be sure of is that it does
> not handle pr_cont() interleaving properly - it logs the context which
> has stored the messages, while in case of pr_cont() it is not always
> correct since we can have a preliminary pr_cont() flush. It also doesn't
> handle printk_safe stuff. Tetsuo's patch, probably, handled all those
> cases. Hmm.
>
> The patch below is less intrusive but also less complete / less universal.
> Maybe it's enough for you, maybe it's not. Wondering if this patch will
> make any difference on your side to begin with. Note, I'm not pushing for
> this particular message format, we can change it the way you want.

Hi Sergey,

What are the visible differences between this patch and Tetsuo's
patch? The only thing that will matter for syzkaller parsing in the
end is the resulting text format as it appears on console. But you say
"I'm not pushing for this particular message format", so what exactly
do you want me to provide feedback on?
I guess we need to handle pr_cont properly whatever approach we take.

Re format: for us it would be much more convenient if the context were a
single token that can be used as is, say "T<pid>" for task context,
"I<cpu>" for interrupts, "N<cpu>" for NMIs, etc. The alternative is to
split the prefix into tokens, parse them, look at a set of flags, pick
the highest-priority flag that is set, and then, depending on that flag,
choose either the task ID or the CPU ID.
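
A minimal userspace sketch of the single-token format, not taken from
either patch in this thread. It assumes the angle brackets are literal,
treats hard and soft interrupts alike as "I", and uses a hypothetical
struct caller_ctx in place of the kernel's in_nmi(), in_irq(),
raw_smp_processor_id() and task_pid_nr(current):

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the kernel's context state. This struct and
 * its field names are assumptions made for this sketch only.
 */
struct caller_ctx {
	bool in_nmi;
	bool in_irq;	/* hard or soft interrupt (an assumption) */
	int cpu;
	int pid;
};

/*
 * Format the context as one token: N<cpu>, I<cpu> or T<pid>.
 * The highest-priority context wins (NMI, then interrupt, then task),
 * so the consumer never has to inspect flags.
 * Returns what snprintf() returns.
 */
static int format_caller_token(char *buf, size_t len,
			       const struct caller_ctx *ctx)
{
	if (ctx->in_nmi)
		return snprintf(buf, len, "N<%d>", ctx->cpu);
	if (ctx->in_irq)
		return snprintf(buf, len, "I<%d>", ctx->cpu);
	return snprintf(buf, len, "T<%d>", ctx->pid);
}
```

With this shape the priority decision is made once, on the producing
side, and a log parser only has to compare tokens for equality.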

> ===
>
> Subject: [PATCH] printk: log message origin context info
>
> ---
>  kernel/printk/printk.c | 31 ++++++++++++++++++++++++++++++-
>  lib/Kconfig.debug      |  8 ++++++++
>  2 files changed, 38 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 247808333ba4..304a02b0c432 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -580,16 +580,38 @@ static u32 truncate_msg(u16 *text_len, u16 *trunc_msg_len,
>         return msg_used_size(*text_len + *trunc_msg_len, 0, pad_len);
>  }
>
> +static size_t log_message_origin(char *buf, size_t buf_len)
> +{
> +       size_t ret = 0;
> +
> +#ifdef CONFIG_PRINTK_LOG_MESSAGE_ORIGIN
> +       ret = snprintf(buf,
> +                       buf_len,
> +                       "[%d/%d preempt:%lu/%lu/%lu] ",
> +                       raw_smp_processor_id(),
> +                       task_pid_nr(current),
> +                       in_nmi(),
> +                       in_irq(),
> +                       in_serving_softirq());
> +#endif
> +       return ret;
> +}
> +
>  /* insert record into the buffer, discard old ones, update heads */
>  static int log_store(int facility, int level,
>                      enum log_flags flags, u64 ts_nsec,
>                      const char *dict, u16 dict_len,
>                      const char *text, u16 text_len)
>  {
> +       static char log_origin[64];
> +       static size_t log_origin_len;
>         struct printk_log *msg;
>         u32 size, pad_len;
>         u16 trunc_msg_len = 0;
>
> +       log_origin_len = log_message_origin(log_origin, sizeof(log_origin));
> +       text_len += log_origin_len;
> +
>         /* number of '\0' padding bytes to next message */
>         size = msg_used_size(text_len, dict_len, &pad_len);
>
> @@ -614,7 +636,14 @@ static int log_store(int facility, int level,
>
>         /* fill message */
>         msg = (struct printk_log *)(log_buf + log_next_idx);
> -       memcpy(log_text(msg), text, text_len);
> +       if (log_origin_len) {
> +               memcpy(log_text(msg), log_origin, log_origin_len);
> +               memcpy(log_text(msg) + log_origin_len,
> +                       text,
> +                       text_len - log_origin_len);
> +       } else {
> +               memcpy(log_text(msg), text, text_len);
> +       }
>         msg->text_len = text_len;
>         if (trunc_msg_len) {
>                 memcpy(log_text(msg) + text_len, trunc_msg, trunc_msg_len);
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 8838d1158d19..57220642a00b 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -15,6 +15,14 @@ config PRINTK_TIME
>           The behavior is also controlled by the kernel command line
>           parameter printk.time=1. See Documentation/admin-guide/kernel-parameters.rst
>
> +config PRINTK_LOG_MESSAGE_ORIGIN
> +       bool "Store printk() message origin context info"
> +       depends on PRINTK
> +       help
> +         Selecting this option causes extra information - CPU, task pid,
> +         preemption mask - to be added to every message. This can be
> +         helpful when interleaved printk() lines cause too much confusion.
> +
>  config CONSOLE_LOGLEVEL_DEFAULT
>         int "Default console loglevel (1-15)"
>         range 1 15
> --
> 2.17.1
>

* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20  8:45                                   ` Dmitry Vyukov
@ 2018-06-20  9:06                                     ` Sergey Senozhatsky
  2018-06-20  9:18                                       ` Sergey Senozhatsky
  2018-06-20  9:30                                       ` Dmitry Vyukov
  0 siblings, 2 replies; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-06-20  9:06 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Petr Mladek, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, Fengguang Wu,
	LKML, Linus Torvalds, Andrew Morton

Hi Dmitry,

On (06/20/18 10:45), Dmitry Vyukov wrote:
> Hi Sergey,
> 
> What are the visible differences between this patch and Tetsuo's
> patch?

I guess none, and looking at your requirements below I tend to agree
that Tetsuo's approach is probably what you need at the end of the day.

> The only thing that will matter for syzkaller parsing in the
> end is the resulting text format as it appears on console. But you say
> "I'm not pushing for this particular message format", so what exactly
> do you want me to provide feedback on?
> I guess we need to handle pr_cont properly whatever approach we take.

Mostly, I was wondering whether:
a) you need pr_cont() handling
b) you need printk_safe() handling

The reasons I left those things behind:

a) pr_cont() is officially hated. It was never supposed to be used
   on SMP systems. So I wasn't sure whether we need all that effort
   and tricky code to handle pr_cont(), given that syzkaller is
   probably the only user of that functionality.

b) printk_safe output is quite uncommon. And we flush the per-CPU buffer
   from the same CPU that caused the printk_safe output [except for the
   panic() flush], so logging the info available to log_store()
   seemed enough. IOW, once again, I was a bit unsure whether we want
   to add more complex code to already complex code with just one
   potential user.

To summarize, I was just wondering where the waterline is: can a small
patch make you happy, or do you need a big one?

> Re format, for us it would be much more convenient if the context is a
> single token that can be used as is, say "T<pid>" for task context,
> "I<cpu>" for interrupts, "N<cpu>" for nmi's

Got it.

	-ss

* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20  9:06                                     ` Sergey Senozhatsky
@ 2018-06-20  9:18                                       ` Sergey Senozhatsky
  2018-06-20  9:31                                         ` Dmitry Vyukov
  2018-06-20  9:30                                       ` Dmitry Vyukov
  1 sibling, 1 reply; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-06-20  9:18 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Dmitry Vyukov, Petr Mladek, Tetsuo Handa, Sergey Senozhatsky,
	syzkaller, Steven Rostedt, Fengguang Wu, LKML, Linus Torvalds,
	Andrew Morton

On (06/20/18 18:06), Sergey Senozhatsky wrote:
> 
> b) printk_safe output is quite uncommon. And we flush per-CPU buffer
>    from the same CPU which has caused printk_safe output [except for
>    panic() flush] therefore logging the info available to log_store()
>    seemed enough. IOW, once again, was a bit unsure if we want to add
>    some complex code to already complex code, with just one potential
>    user.

BTW, pr_cont() handling is not so simple when we are in printk_safe()
context. Unlike vprintk_emit() [normal printk], we don't use any
dedicated pr_cont() buffer in printk_safe. So, at a glance, I suspect
that injecting context info at every printk_safe_log_store() call for
a `for (...) pr_cont()' loop is going to produce something like this:
	I<10> 23 I<10> 43 I<10> 47 ....

	// Hmm, maybe the line will end up having two prefixes. One
	// from printk_safe_log_store, the other from normal printk
	// log_store().

While the same `for (...) pr_cont()' called from normal printk() context
will produce
	I<10> 32 43 47 ....

It could be that I'm wrong.
Tetsuo, have you tested pr_cont() from printk_safe() context?
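
A small userspace model of the difference described above. It is only a
sketch: emit_unbuffered() and emit_buffered() are hypothetical names,
and the model assumes the only difference between the two paths is
whether fragments are merged before the prefix is injected.

```c
#include <stdio.h>
#include <string.h>

/* Append a followed by b to the NUL-terminated string in out. */
static void append(char *out, size_t len, const char *a, const char *b)
{
	size_t used = strlen(out);

	if (used < len)
		snprintf(out + used, len - used, "%s%s", a, b);
}

/*
 * Model of a path with no continuation buffer: every fragment is
 * stored separately, so every fragment gets its own prefix.
 */
static void emit_unbuffered(char *out, size_t len, const char *prefix,
			    const char *const *frags, size_t n)
{
	size_t i;

	out[0] = '\0';
	for (i = 0; i < n; i++)
		append(out, len, prefix, frags[i]);
}

/*
 * Model of a path with a continuation buffer: fragments are merged
 * first and the prefix is injected once, when the line is stored.
 */
static void emit_buffered(char *out, size_t len, const char *prefix,
			  const char *const *frags, size_t n)
{
	char cont[128] = "";
	size_t i;

	for (i = 0; i < n; i++)
		append(cont, sizeof(cont), "", frags[i]);
	out[0] = '\0';
	append(out, len, prefix, cont);
}
```

Feeding both helpers the prefix "I<10> " and the fragments "23 ", "43 "
and "47" gives one prefix per fragment in the first case and a single
prefix in the second.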

	-ss

* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20  9:06                                     ` Sergey Senozhatsky
  2018-06-20  9:18                                       ` Sergey Senozhatsky
@ 2018-06-20  9:30                                       ` Dmitry Vyukov
  2018-06-20 11:19                                         ` Sergey Senozhatsky
  2018-06-20 11:37                                         ` Fengguang Wu
  1 sibling, 2 replies; 68+ messages in thread
From: Dmitry Vyukov @ 2018-06-20  9:30 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Tetsuo Handa, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, LKML, Linus Torvalds,
	Andrew Morton

On Wed, Jun 20, 2018 at 11:06 AM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
> Hi Dmitry,
>
> On (06/20/18 10:45), Dmitry Vyukov wrote:
>> Hi Sergey,
>>
>> What are the visible differences between this patch and Tetsuo's
>> patch?
>
> I guess none, and looking at your requirements below I tend to agree
> that Tetsuo's approach is probably what you need at the end of the day.
>
>> The only thing that will matter for syzkaller parsing in the
>> end is the resulting text format as it appears on console. But you say
>> "I'm not pushing for this particular message format", so what exactly
>> do you want me to provide feedback on?
>> I guess we need to handle pr_cont properly whatever approach we take.
>
> Mostly, was wondering about if:
> a) you need pr_cont() handling
> b) you need printk_safe() handling
>
> The reasons I left those things behind:
>
> a) pr_cont() is officially hated. It was never supposed to be used
>    on SMP systems. So I wasn't sure if we need all that effort and
>    add tricky code to handle pr_cont(). Given that syzkaller is
>    probably the only user of that functionality.

Well, if I put my syzkaller hat on, then I don't care what exactly
happens in the kernel; the only thing I care about is well-formed output
on the console that can be parsed unambiguously in all cases.
From this point of view I guess pr_cont is actually syzkaller's worst
enemy. If pr_cont is officially hated, and it causes corrupted crash
reports, then we can resolve it by just getting rid of more pr_cont's.
So potentially we do not need any support for pr_cont in this patch.
However, we also need to be practical and if there are tons of
pr_cont's then we need some intermediate support of them, just because
we won't be able to get rid of all of them overnight.

But even if we attach context to pr_cont, it still causes problems for
crash parsing, because today we see:

BUG: unable to handle
... 10 lines ...
kernel
... 10 lines ...
paging request
... 10 lines ...
at ADDR

Which is not too friendly for parsing regardless of contexts.
So I am leaning towards getting rid of pr_cont's as the solution to
the problem.

Looking at current uses of pr_cont:
https://elixir.bootlin.com/linux/latest/ident/pr_cont
It does not look too bad. arch/ except for x86 and exotic drivers
won't cause problems for syzbot today, so we can live with these uses
for now.



> b) printk_safe output is quite uncommon. And we flush per-CPU buffer
>    from the same CPU which has caused printk_safe output [except for
>    panic() flush] therefore logging the info available to log_store()
>    seemed enough. IOW, once again, was a bit unsure if we want to add
>    some complex code to already complex code, with just one potential
>    user.


I can't fully answer this because I don't understand what the
implications are for the actual output.
You can use this as a litmus test: can you write a simple script that
will parse such output and make sense out of it?
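
As an illustration of that litmus test, here is a hedged sketch of the
parsing side. It assumes every console line starts with a token of the
form T<pid>, I<cpu> or N<cpu> followed by one space, which is not a
format anyone in this thread has committed to. parse_caller_token() is
a hypothetical helper.

```c
#include <ctype.h>
#include <stdlib.h>

/*
 * Parse a leading context token of the assumed form K<id>, where K is
 * one of 'T', 'I' or 'N', followed by a single space.
 * On success, store the kind and the id and return a pointer to the
 * message text after the token. Return NULL if the line has no
 * well-formed token.
 */
static const char *parse_caller_token(const char *line, char *kind,
				      unsigned long *id)
{
	unsigned long value;
	char *end;

	if (line[0] != 'T' && line[0] != 'I' && line[0] != 'N')
		return NULL;
	if (line[1] != '<' || !isdigit((unsigned char)line[2]))
		return NULL;

	value = strtoul(line + 2, &end, 10);
	if (end[0] != '>' || end[1] != ' ')
		return NULL;

	*kind = line[0];
	*id = value;
	return end + 2;
}
```

A demultiplexer could then group lines by (kind, id) and treat any line
for which the helper returns NULL as unattributed, for example a
pr_cont() fragment that was flushed without a prefix.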

Well, it's not just one user. It affects every single user of the
Linux kernel out there. Just take a look at these, they are complete
nonsense: it's not that syzkaller can't make sense of them, it's that
nobody can:

https://gist.githubusercontent.com/dvyukov/1528e86e5139f2fd1bf9902398d48298/raw/3b42148554eefed210f1e626d5befd50405c5487/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/6e08ac521f3e19534970ed97aeee1603/raw/0f0bb361902de94e7ee331ac500a3ceebf812c22/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/6e9db2313e48773ad1cd861da8020008/raw/d5b7c023fc8a38c72b1cf8bb1da85fb1c31cea5f/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/3d1bda4c690414ac027de1da45759751/raw/2c68980eabf4f6be24060e807a75f2d3570b5a42/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/9b8831e9ac73ffafa111a33ad40c5667/raw/f4097fbea8f89b25a282a6ef7e648145e10ae4b7/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/d78a3187a1b4e004820e92efcb16f9e0/raw/5530bcbf009c3fba3c581b2d24c523c673c6ef12/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/da1e42436af9ad2afc7de49f2d503510/raw/7dd4cbcc651c5b87122f066a3c689999ae8c4121/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/4571b94bd8cbd78d759412c560fa395c/raw/964c73fc993fc8a9000571e0b7618000584f3638/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/b6deac5faa958ae3733413b34dd5feed/raw/c4da219e284f7fc55da8c3c3af623a87f31bf653/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/2f54c6a2e45347ea76d9c5ce3c0ff091/raw/45f4873898ec8e0d9aa16b9c5c63a85410fd05e0/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/96cb39e29124dbbe2a65a91ec7a5639e/raw/aa8f7b2b1dfa5b8bb8cf93d8a821ca9938e8fc54/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/424da8282d5b28f8be10eab595d37444/raw/acc2fb1ececc1ea9a8215213f7e37e08b524c096/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/b07f37720c632d6d56ae67d95e5599b3/raw/8624ba47d6eb4e7d4d58e3ae1242ebe6cc46d361/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/bc24a7b92289ec04587fb29fc1085045/raw/3136e9262ee2233b5ab369a4a82e83953fc2d8a2/gistfile1.txt

* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20  9:18                                       ` Sergey Senozhatsky
@ 2018-06-20  9:31                                         ` Dmitry Vyukov
  2018-06-20 11:07                                           ` Sergey Senozhatsky
  0 siblings, 1 reply; 68+ messages in thread
From: Dmitry Vyukov @ 2018-06-20  9:31 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Tetsuo Handa, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, LKML, Linus Torvalds,
	Andrew Morton

On Wed, Jun 20, 2018 at 11:18 AM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
> On (06/20/18 18:06), Sergey Senozhatsky wrote:
>>
>> b) printk_safe output is quite uncommon. And we flush per-CPU buffer
>>    from the same CPU which has caused printk_safe output [except for
>>    panic() flush] therefore logging the info available to log_store()
>>    seemed enough. IOW, once again, was a bit unsure if we want to add
>>    some complex code to already complex code, with just one potential
>>    user.
>
> BTW, pr_cont() handling is not so simple when we are in printk_safe()
> context. Unlike vprintk_emit() [normal printk], we don't use any
> dedicated pr_cont() buffer in printk_safe. So, at a glance, I suspect
> that injecting context info at every printk_safe_log_store() call for
> `for (...) pr_cont()' loop is going to produce something like this:
>         I<10> 23 I<10> 43 I<10> 47 ....
>
>         // Hmm, maybe the line will end up having two prefixes. One
>         // from printk_safe_log_store, the other from normal printk
>         // log_store().
>
> While the same `for (...) pr_cont()' called from normal printk() context
> will produce
>         I<10> 32 43 47 ....
>
> It could be that I'm wrong.
> Tetsuo, have you tested pr_cont() from printk_safe() context?


So this is another reason to get rid of pr_cont entirely, right?

* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20  9:31                                         ` Dmitry Vyukov
@ 2018-06-20 11:07                                           ` Sergey Senozhatsky
  2018-06-20 11:32                                             ` Dmitry Vyukov
  0 siblings, 1 reply; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-06-20 11:07 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Petr Mladek, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, Fengguang Wu,
	LKML, Linus Torvalds, Andrew Morton

On (06/20/18 11:31), Dmitry Vyukov wrote:
> > BTW, pr_cont() handling is not so simple when we are in printk_safe()
> > context. Unlike vprintk_emit() [normal printk], we don't use any
> > dedicated pr_cont() buffer in printk_safe. So, at a glance, I suspect
> > that injecting context info at every printk_safe_log_store() call for
> > `for (...) pr_cont()' loop is going to produce something like this:
> >         I<10> 23 I<10> 43 I<10> 47 ....
> >
> >         // Hmm, maybe the line will end up having two prefixes. One
> >         // from printk_safe_log_store, the other from normal printk
> >         // log_store().
> >
> > While the same `for (...) pr_cont()' called from normal printk() context
> > will produce
> >         I<10> 32 43 47 ....
> >
> > It could be that I'm wrong.
> > Tetsuo, have you tested pr_cont() from printk_safe() context?
> 
> 
> So this is another reason to get rid of pr_cont entirely, right?

Getting rid of pr_cont() from important output would be totally cool.
Quoting Linus:

    Only acceptable use of continuations is basically boot-time testing,
    when you do things like

     printk("Testing feature XYZ..");
     this_may_blow_up_because_of_hw_bugs();
     printk(KERN_CONT " ... ok\n");


I can recall at least 4 attempts to introduce a new pr_cont(), or some
concept with similar functionality to pr_cont() but SMP-safe. We
brought the first one - per-CPU pr_cont() buffers - to KS several years ago
but Linus didn't like it. Then there was a buffered printk() mode patch from
Tetsuo, then a solution from Steven, then I had my second try with a
sort-of-pr_cont() replacement.

So, if we could get rid of pr_cont() from the most important parts
(instruction dumps, etc) then I would just vote to leave pr_cont()
alone and avoid any handling of it in printk context tracking. Simply
because we wouldn't care about pr_cont(). This also could simplify
Tetsuo's patch significantly.

	-ss

* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20  9:30                                       ` Dmitry Vyukov
@ 2018-06-20 11:19                                         ` Sergey Senozhatsky
  2018-06-20 11:25                                           ` Dmitry Vyukov
  2018-06-20 11:37                                         ` Fengguang Wu
  1 sibling, 1 reply; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-06-20 11:19 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Petr Mladek, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, Fengguang Wu,
	LKML, Linus Torvalds, Andrew Morton

On (06/20/18 11:30), Dmitry Vyukov wrote:
> 
> https://gist.githubusercontent.com/dvyukov/1528e86e5139f2fd1bf9902398d48298/raw/3b42148554eefed210f1e626d5befd50405c5487/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/6e08ac521f3e19534970ed97aeee1603/raw/0f0bb361902de94e7ee331ac500a3ceebf812c22/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/6e9db2313e48773ad1cd861da8020008/raw/d5b7c023fc8a38c72b1cf8bb1da85fb1c31cea5f/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/3d1bda4c690414ac027de1da45759751/raw/2c68980eabf4f6be24060e807a75f2d3570b5a42/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/9b8831e9ac73ffafa111a33ad40c5667/raw/f4097fbea8f89b25a282a6ef7e648145e10ae4b7/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/d78a3187a1b4e004820e92efcb16f9e0/raw/5530bcbf009c3fba3c581b2d24c523c673c6ef12/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/da1e42436af9ad2afc7de49f2d503510/raw/7dd4cbcc651c5b87122f066a3c689999ae8c4121/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/4571b94bd8cbd78d759412c560fa395c/raw/964c73fc993fc8a9000571e0b7618000584f3638/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/b6deac5faa958ae3733413b34dd5feed/raw/c4da219e284f7fc55da8c3c3af623a87f31bf653/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/2f54c6a2e45347ea76d9c5ce3c0ff091/raw/45f4873898ec8e0d9aa16b9c5c63a85410fd05e0/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/96cb39e29124dbbe2a65a91ec7a5639e/raw/aa8f7b2b1dfa5b8bb8cf93d8a821ca9938e8fc54/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/424da8282d5b28f8be10eab595d37444/raw/acc2fb1ececc1ea9a8215213f7e37e08b524c096/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/b07f37720c632d6d56ae67d95e5599b3/raw/8624ba47d6eb4e7d4d58e3ae1242ebe6cc46d361/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/bc24a7b92289ec04587fb29fc1085045/raw/3136e9262ee2233b5ab369a4a82e83953fc2d8a2/gistfile1.txt


Just a small remark

I randomly picked some links, and at least in several reports I saw:

** 4495 printk messages dropped ** [   50.830930]  [<ffffffff8123ab47>] do_raw_write_lock+0xc7/0x1d0
** 3816 printk messages dropped ** [   50.839887]  [<ffffffff814fb353>] SyS_read+0xd3/0x1c0
** 3497 printk messages dropped ** [   50.848107]  [<ffffffff81003044>] ? lockdep_sys_exit_thunk+0x12/0x14
** 4057 printk messages dropped ** [   50.857615] 	run_ksoftirqd+0x20/0x60
** 2855 printk messages dropped ** [   50.864318]  [<ffffffff814fb353>] SyS_read+0xd3/0x1c0
** 3490 printk messages dropped ** [   50.872518]  [<ffffffff815bee10>] ? fsnotify+0xe40/0xe40
** 3600 printk messages dropped ** [   50.880974] 	SyS_fcntl+0x5be/0xc70

This will not get any better if we have printk context tracking. The
problem here is that we lose messages: your console is significantly slower
than your CPUs. So while one CPU is doing its best printing pending logbuf
messages to a slow console, the rest of CPUs don't hesitate to append new
messages (printk -> log_store). Since logbuf is limited in size - we wrap
around and this results in lost messages.
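
The wrap-around can be modelled in a few lines of userspace C. This is
a sketch under simplifying assumptions: fixed-size records, a single
reader, no locking, and hypothetical names (struct logbuf,
sketch_log_store(), sketch_console_read()) that only loosely mirror the
kernel.

```c
#define LOG_SLOTS 4	/* deliberately tiny, hypothetical buffer size */

struct logbuf {
	unsigned long first_seq;	/* oldest record still stored */
	unsigned long next_seq;		/* sequence of the next record */
	int slot[LOG_SLOTS];		/* payload: just a message number */
};

/* Producer: never waits for the console, overwrites the oldest record. */
static void sketch_log_store(struct logbuf *lb, int msg)
{
	if (lb->next_seq - lb->first_seq == LOG_SLOTS)
		lb->first_seq++;	/* buffer full: drop the oldest */
	lb->slot[lb->next_seq % LOG_SLOTS] = msg;
	lb->next_seq++;
}

/*
 * Consumer (the slow console). If the producers have overwritten
 * records the console has not printed yet, report how many were lost
 * in *dropped and skip ahead to the oldest record that still exists.
 * Returns 1 and fills *msg if a record was read, 0 if caught up.
 */
static int sketch_console_read(const struct logbuf *lb,
			       unsigned long *console_seq,
			       int *msg, unsigned long *dropped)
{
	*dropped = 0;
	if (*console_seq < lb->first_seq) {
		*dropped = lb->first_seq - *console_seq;
		*console_seq = lb->first_seq;
	}
	if (*console_seq == lb->next_seq)
		return 0;

	*msg = lb->slot[*console_seq % LOG_SLOTS];
	(*console_seq)++;
	return 1;
}
```

Storing ten messages before the first read leaves only the last four
in the buffer, so the reader sees six dropped messages and resumes at
message 6. A context prefix on each line does nothing about that gap.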

	-ss

* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20 11:19                                         ` Sergey Senozhatsky
@ 2018-06-20 11:25                                           ` Dmitry Vyukov
  0 siblings, 0 replies; 68+ messages in thread
From: Dmitry Vyukov @ 2018-06-20 11:25 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Tetsuo Handa, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, LKML, Linus Torvalds,
	Andrew Morton

On Wed, Jun 20, 2018 at 1:19 PM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
> On (06/20/18 11:30), Dmitry Vyukov wrote:
>>
>> https://gist.githubusercontent.com/dvyukov/1528e86e5139f2fd1bf9902398d48298/raw/3b42148554eefed210f1e626d5befd50405c5487/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/6e08ac521f3e19534970ed97aeee1603/raw/0f0bb361902de94e7ee331ac500a3ceebf812c22/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/6e9db2313e48773ad1cd861da8020008/raw/d5b7c023fc8a38c72b1cf8bb1da85fb1c31cea5f/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/3d1bda4c690414ac027de1da45759751/raw/2c68980eabf4f6be24060e807a75f2d3570b5a42/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/9b8831e9ac73ffafa111a33ad40c5667/raw/f4097fbea8f89b25a282a6ef7e648145e10ae4b7/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/d78a3187a1b4e004820e92efcb16f9e0/raw/5530bcbf009c3fba3c581b2d24c523c673c6ef12/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/da1e42436af9ad2afc7de49f2d503510/raw/7dd4cbcc651c5b87122f066a3c689999ae8c4121/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/4571b94bd8cbd78d759412c560fa395c/raw/964c73fc993fc8a9000571e0b7618000584f3638/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/b6deac5faa958ae3733413b34dd5feed/raw/c4da219e284f7fc55da8c3c3af623a87f31bf653/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/2f54c6a2e45347ea76d9c5ce3c0ff091/raw/45f4873898ec8e0d9aa16b9c5c63a85410fd05e0/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/96cb39e29124dbbe2a65a91ec7a5639e/raw/aa8f7b2b1dfa5b8bb8cf93d8a821ca9938e8fc54/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/424da8282d5b28f8be10eab595d37444/raw/acc2fb1ececc1ea9a8215213f7e37e08b524c096/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/b07f37720c632d6d56ae67d95e5599b3/raw/8624ba47d6eb4e7d4d58e3ae1242ebe6cc46d361/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/bc24a7b92289ec04587fb29fc1085045/raw/3136e9262ee2233b5ab369a4a82e83953fc2d8a2/gistfile1.txt
>
>
> Just a small remark
>
> I randomly picked some links, and at least in several reports I saw:
>
> ** 4495 printk messages dropped ** [   50.830930]  [<ffffffff8123ab47>] do_raw_write_lock+0xc7/0x1d0
> ** 3816 printk messages dropped ** [   50.839887]  [<ffffffff814fb353>] SyS_read+0xd3/0x1c0
> ** 3497 printk messages dropped ** [   50.848107]  [<ffffffff81003044>] ? lockdep_sys_exit_thunk+0x12/0x14
> ** 4057 printk messages dropped ** [   50.857615]       run_ksoftirqd+0x20/0x60
> ** 2855 printk messages dropped ** [   50.864318]  [<ffffffff814fb353>] SyS_read+0xd3/0x1c0
> ** 3490 printk messages dropped ** [   50.872518]  [<ffffffff815bee10>] ? fsnotify+0xe40/0xe40
> ** 3600 printk messages dropped ** [   50.880974]       SyS_fcntl+0x5be/0xc70
>
> This will not get any better if we have printk context tracking. The
> problem here is that we lose messages: your console is significantly slower
> than your CPUs. So while one CPU is doing its best printing pending logbuf
> messages to a slow console, the rest of CPUs don't hesitate to append new
> messages (printk -> log_store). Since logbuf is limited in size - we wrap
> around and this results in lost messages.

Yes, I realize there are multiple problems combined here.

* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20 11:07                                           ` Sergey Senozhatsky
@ 2018-06-20 11:32                                             ` Dmitry Vyukov
  2018-06-20 13:06                                               ` Sergey Senozhatsky
  2018-06-21  8:29                                               ` Sergey Senozhatsky
  0 siblings, 2 replies; 68+ messages in thread
From: Dmitry Vyukov @ 2018-06-20 11:32 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Tetsuo Handa, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, LKML, Linus Torvalds,
	Andrew Morton

On Wed, Jun 20, 2018 at 1:07 PM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
> On (06/20/18 11:31), Dmitry Vyukov wrote:
>> > BTW, pr_cont() handling is not so simple when we are in printk_safe()
>> > context. Unlike vprintk_emit() [normal printk], we don't use any
>> > dedicated pr_cont() buffer in printk_safe. So, at a glance, I suspect
>> > that injecting context info at every printk_safe_log_store() call for
>> > `for (...) pr_cont()' loop is going to produce something like this:
>> >         I<10> 23 I<10> 43 I<10> 47 ....
>> >
>> >         // Hmm, maybe the line will end up having two prefixes. One
>> >         // from printk_safe_log_store, the other from normal printk
>> >         // log_store().
>> >
>> > While the same `for (...) pr_cont()' called from normal printk() context
>> > will produce
>> >         I<10> 32 43 47 ....
>> >
>> > It could be that I'm wrong.
>> > Tetsuo, have you tested pr_cont() from printk_safe() context?
>>
>>
>> So this is another reason to get rid of pr_cont entirely, right?
>
> Getting rid of pr_cont() from important output would be totally cool.
> Quoting Linus:
>
>     Only acceptable use of continuations is basically boot-time testing,
>     when you do things like
>
>      printk("Testing feature XYZ..");
>      this_may_blow_up_because_of_hw_bugs();
>      printk(KERN_CONT " ... ok\n");
>
>
> I can recall at least 4 attempts when people tried to introduce new pr_cont()
> or some concept with similar functionality to pr_cont(), but SMP safe. We
> brought the first one - per-CPU pr_cont() buffers - to KS several years ago
> but Linus didn't like it. Then there was a buffered printk() mode patch from
> Tetsuo, then a solution from Steven, then I had my second try with a
> sort-of-pr_cont() replacement.
>
> So, if we could get rid of pr_cont() from the most important parts
> (instruction dumps, etc) then I would just vote to leave pr_cont()
> alone and avoid any handling of it in printk context tracking. Simply
> because we wouldn't care about pr_cont(). This also could simplify
> Tetsuo's patch significantly.

Sounds good to me.

* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20  9:30                                       ` Dmitry Vyukov
  2018-06-20 11:19                                         ` Sergey Senozhatsky
@ 2018-06-20 11:37                                         ` Fengguang Wu
  2018-06-20 12:31                                           ` Dmitry Vyukov
  1 sibling, 1 reply; 68+ messages in thread
From: Fengguang Wu @ 2018-06-20 11:37 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Petr Mladek, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, LKML,
	Linus Torvalds, Andrew Morton

On Wed, Jun 20, 2018 at 11:30:05AM +0200, Dmitry Vyukov wrote:
>On Wed, Jun 20, 2018 at 11:06 AM, Sergey Senozhatsky
><sergey.senozhatsky.work@gmail.com> wrote:
>> Hi Dmitry,
>>
>> On (06/20/18 10:45), Dmitry Vyukov wrote:
>>> Hi Sergey,
>>>
>>> What are the visible differences between this patch and Tetsuo's
>>> patch?
>>
>> I guess none, and looking at your requirements below I tend to agree
>> that Tetsuo's approach is probably what you need at the end of the day.
>>
>>> The only thing that will matter for syzkaller parsing in the
>>> end is the resulting text format as it appears on console. But you say
>>> "I'm not pushing for this particular message format", so what exactly
>>> do you want me to provide feedback on?
>>> I guess we need to handle pr_cont properly whatever approach we take.
>>
>> Mostly, was wondering about if:
>> a) you need pr_cont() handling
>> b) you need printk_safe() handling
>>
>> The reasons I left those things behind:
>>
>> a) pr_cont() is officially hated. It was never supposed to be used
>>    on SMP systems. So I wasn't sure if we need all that effort and
>>    add tricky code to handle pr_cont(). Given that syzkaller is
>>    probably the only user of that functionality.
>
>Well, if I put my syzkaller hat on, then I don't care what exactly
>happens in the kernel, the only thing I care is well-formed output on
>console that can be parsed unambiguously in all cases.

+1 for 0day kernel testing.

I admit that goal may never be 100% achievable -- at least some serial
console logs can sometimes become messy. So we'll have to write dmesg
parsing code in defensive ways.

But some unnecessary pr_cont() broken-up messages can obviously be
avoided. For example,

arch/x86/mm/fault.c:

	printk(KERN_ALERT "BUG: unable to handle kernel ");
	if (address < PAGE_SIZE)
		printk(KERN_CONT "NULL pointer dereference");
	else
		printk(KERN_CONT "paging request");

I've actually proposed to remove the above KERN_CONT, unfortunately the
patch was silently ignored.

>From this point of view I guess pr_cont is actually syzkaller's worst
>enemy. If pr_cont is officially hated, and it causes corrupted crash
>reports, then we can resolve it by just getting rid of more pr_cont's.
>So potentially we do not need any support for pr_cont in this patch.
>However, we also need to be practical and if there are tons of
>pr_cont's then we need some intermediate support of them, just because
>we won't be able to get rid of all of them overnight.
>
>But even if we attach context to pr_cont, it still causes problems for
>crash parsing, because today we see:
>
>BUG: unable to handle
>... 10 lines ...
>kernel
>... 10 lines ...
>paging request
>... 10 lines ...
>at ADDR
>
>Which is not too friendly for parsing regardless of contexts.

We met exactly the same issue and ended up with special handling in
https://github.com/intel/lkp-tests/blob/master/lib/dmesg.rb:

       /(BUG: unable to handle kernel)/,
       /(BUG: unable to handle kernel) NULL pointer dereference/,
       /(BUG: unable to handle kernel) paging request/,
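
As a rough illustration of the same special-casing outside Ruby, here is a
minimal C sketch. It assumes the parser sees one console line at a time; the
enum and the function name classify_oops_line() are made up for this example
and are not part of lkp-tests or the kernel.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of one console line (names are illustrative). */
enum oops_kind {
	OOPS_NONE,		/* no "BUG: unable to handle kernel" header */
	OOPS_TRUNCATED,		/* header present, reason missing or interleaved */
	OOPS_NULL_DEREF,	/* "... NULL pointer dereference" */
	OOPS_PAGING_REQ,	/* "... paging request" */
};

/* Return nonzero if s begins with prefix. */
static int starts_with(const char *s, const char *prefix)
{
	return strncmp(s, prefix, strlen(prefix)) == 0;
}

static enum oops_kind classify_oops_line(const char *line)
{
	static const char hdr[] = "BUG: unable to handle kernel";
	const char *p = strstr(line, hdr);

	if (!p)
		return OOPS_NONE;
	p += sizeof(hdr) - 1;	/* skip past the header text */

	/* The reason must follow directly; anything else means a broken line. */
	if (starts_with(p, " NULL pointer dereference"))
		return OOPS_NULL_DEREF;
	if (starts_with(p, " paging request"))
		return OOPS_PAGING_REQ;
	return OOPS_TRUNCATED;
}
```

A caller would treat OOPS_TRUNCATED as "header seen, reason lost", which is
the defensive case that the first of the three patterns covers.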

>So I am leaning towards getting rid of pr_cont's as the solution to
>the problem.

+1 for reducing unnecessary pr_cont() uses.

Thanks,
Fengguang


* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20 11:37                                         ` Fengguang Wu
@ 2018-06-20 12:31                                           ` Dmitry Vyukov
  2018-06-20 12:41                                             ` Fengguang Wu
  0 siblings, 1 reply; 68+ messages in thread
From: Dmitry Vyukov @ 2018-06-20 12:31 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Sergey Senozhatsky, Petr Mladek, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, LKML,
	Linus Torvalds, Andrew Morton

On Wed, Jun 20, 2018 at 1:37 PM, Fengguang Wu <fengguang.wu@intel.com> wrote:
> On Wed, Jun 20, 2018 at 11:30:05AM +0200, Dmitry Vyukov wrote:
>>
>> On Wed, Jun 20, 2018 at 11:06 AM, Sergey Senozhatsky
>> <sergey.senozhatsky.work@gmail.com> wrote:
>>>
>>> Hi Dmitry,
>>>
>>> On (06/20/18 10:45), Dmitry Vyukov wrote:
>>>>
>>>> Hi Sergey,
>>>>
>>>> What are the visible differences between this patch and Tetsuo's
>>>> patch?
>>>
>>>
>>> I guess none, and looking at your requirements below I tend to agree
>>> that Tetsuo's approach is probably what you need at the end of the day.
>>>
>>>> The only thing that will matter for syzkaller parsing in the
>>>> end is the resulting text format as it appears on console. But you say
>>>> "I'm not pushing for this particular message format", so what exactly
>>>> do you want me to provide feedback on?
>>>> I guess we need to handle pr_cont properly whatever approach we take.
>>>
>>>
>>> Mostly, was wondering about if:
>>> a) you need pr_cont() handling
>>> b) you need printk_safe() handling
>>>
>>> The reasons I left those things behind:
>>>
>>> a) pr_cont() is officially hated. It was never supposed to be used
>>>    on SMP systems. So I wasn't sure if we need all that effort and
>>>    add tricky code to handle pr_cont(). Given that syzkaller is
>>>    probably the only user of that functionality.
>>
>>
>> Well, if I put my syzkaller hat on, then I don't care what exactly
>> happens in the kernel, the only thing I care is well-formed output on
>> console that can be parsed unambiguously in all cases.
>
>
> +1 for 0day kernel testing.
>
> I admit that goal may never be 100% achievable -- at least some serial
> console logs can sometimes become messy. So we'll have to write dmesg
> parsing code in defensive ways.
>
> But some unnecessary pr_cont() broken-up messages can obviously be
> avoided. For example,
>
> arch/x86/mm/fault.c:
>
>         printk(KERN_ALERT "BUG: unable to handle kernel ");
>         if (address < PAGE_SIZE)
>                 printk(KERN_CONT "NULL pointer dereference");
>         else
>                 printk(KERN_CONT "paging request");
>
> I've actually proposed to remove the above KERN_CONT, unfortunately the
> patch was silently ignored.


I've just cooked this change too, but do you mind reviving your patch?

It actually makes the code even shorter, which is nice:

--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -671,13 +671,9 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code,
                        printk(smep_warning, from_kuid(&init_user_ns, current_uid()));
        }

-       printk(KERN_ALERT "BUG: unable to handle kernel ");
-       if (address < PAGE_SIZE)
-               printk(KERN_CONT "NULL pointer dereference");
-       else
-               printk(KERN_CONT "paging request");
-
-       printk(KERN_CONT " at %px\n", (void *) address);
+       printk(KERN_ALERT "BUG: unable to handle kernel %s at %px\n",
+               (address < PAGE_SIZE ? "NULL pointer dereference" :
+               "paging request"), (void *) address);

        dump_pagetable(address);
 }


* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20 12:31                                           ` Dmitry Vyukov
@ 2018-06-20 12:41                                             ` Fengguang Wu
  2018-06-20 12:45                                               ` Dmitry Vyukov
  0 siblings, 1 reply; 68+ messages in thread
From: Fengguang Wu @ 2018-06-20 12:41 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Petr Mladek, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, LKML,
	Linus Torvalds, Andrew Morton

On Wed, Jun 20, 2018 at 02:31:51PM +0200, Dmitry Vyukov wrote:
>On Wed, Jun 20, 2018 at 1:37 PM, Fengguang Wu <fengguang.wu@intel.com> wrote:
>> On Wed, Jun 20, 2018 at 11:30:05AM +0200, Dmitry Vyukov wrote:
>>>
>>> On Wed, Jun 20, 2018 at 11:06 AM, Sergey Senozhatsky
>>> <sergey.senozhatsky.work@gmail.com> wrote:
>>>>
>>>> Hi Dmitry,
>>>>
>>>> On (06/20/18 10:45), Dmitry Vyukov wrote:
>>>>>
>>>>> Hi Sergey,
>>>>>
>>>>> What are the visible differences between this patch and Tetsuo's
>>>>> patch?
>>>>
>>>>
>>>> I guess none, and looking at your requirements below I tend to agree
>>>> that Tetsuo's approach is probably what you need at the end of the day.
>>>>
>>>>> The only thing that will matter for syzkaller parsing in the
>>>>> end is the resulting text format as it appears on console. But you say
>>>>> "I'm not pushing for this particular message format", so what exactly
>>>>> do you want me to provide feedback on?
>>>>> I guess we need to handle pr_cont properly whatever approach we take.
>>>>
>>>>
>>>> Mostly, was wondering about if:
>>>> a) you need pr_cont() handling
>>>> b) you need printk_safe() handling
>>>>
>>>> The reasons I left those things behind:
>>>>
>>>> a) pr_cont() is officially hated. It was never supposed to be used
>>>>    on SMP systems. So I wasn't sure if we need all that effort and
>>>>    add tricky code to handle pr_cont(). Given that syzkaller is
>>>>    probably the only user of that functionality.
>>>
>>>
>>> Well, if I put my syzkaller hat on, then I don't care what exactly
>>> happens in the kernel, the only thing I care is well-formed output on
>>> console that can be parsed unambiguously in all cases.
>>
>>
>> +1 for 0day kernel testing.
>>
>> I admit that goal may never be 100% achievable -- at least some serial
>> console logs can sometimes become messy. So we'll have to write dmesg
>> parsing code in defensive ways.
>>
>> But some unnecessary pr_cont() broken-up messages can obviously be
>> avoided. For example,
>>
>> arch/x86/mm/fault.c:
>>
>>         printk(KERN_ALERT "BUG: unable to handle kernel ");
>>         if (address < PAGE_SIZE)
>>                 printk(KERN_CONT "NULL pointer dereference");
>>         else
>>                 printk(KERN_CONT "paging request");
>>
>> I've actually proposed to remove the above KERN_CONT, unfortunately the
>> patch was silently ignored.
>
>
>I've just cooked this change too, but do you mind reviving your patch?

Yes, sure. My version is dumber, since I'm not sure if it's OK to
do string formatting at this critical point. Let's see what others
think about the 2 approaches. I'm fine as long as our problem is fixed. :)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 9a84a0d08727..c7b068c6b010 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -671,11 +671,10 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code,
                        printk(smep_warning, from_kuid(&init_user_ns, current_uid()));
        }

-       printk(KERN_ALERT "BUG: unable to handle kernel ");
        if (address < PAGE_SIZE)
-               printk(KERN_CONT "NULL pointer dereference");
+               printk(KERN_ALERT "BUG: unable to handle kernel NULL pointer dereference");
        else
-               printk(KERN_CONT "paging request");
+               printk(KERN_ALERT "BUG: unable to handle kernel paging request");

        printk(KERN_CONT " at %px\n", (void *) address);

>It actually makes the code even shorter, which is nice:
>
>--- a/arch/x86/mm/fault.c
>+++ b/arch/x86/mm/fault.c
>@@ -671,13 +671,9 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code,
>                        printk(smep_warning, from_kuid(&init_user_ns, current_uid()));
>        }
>
>-       printk(KERN_ALERT "BUG: unable to handle kernel ");
>-       if (address < PAGE_SIZE)
>-               printk(KERN_CONT "NULL pointer dereference");
>-       else
>-               printk(KERN_CONT "paging request");
>-
>-       printk(KERN_CONT " at %px\n", (void *) address);
>+       printk(KERN_ALERT "BUG: unable to handle kernel %s at %px\n",
>+               (address < PAGE_SIZE ? "NULL pointer dereference" :
>+               "paging request"), (void *) address);
>
>        dump_pagetable(address);
> }
>


* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20 12:41                                             ` Fengguang Wu
@ 2018-06-20 12:45                                               ` Dmitry Vyukov
  2018-06-20 12:48                                                 ` Fengguang Wu
  0 siblings, 1 reply; 68+ messages in thread
From: Dmitry Vyukov @ 2018-06-20 12:45 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Sergey Senozhatsky, Petr Mladek, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, LKML,
	Linus Torvalds, Andrew Morton

On Wed, Jun 20, 2018 at 2:41 PM, Fengguang Wu <fengguang.wu@intel.com> wrote:
> On Wed, Jun 20, 2018 at 02:31:51PM +0200, Dmitry Vyukov wrote:
>>
>> On Wed, Jun 20, 2018 at 1:37 PM, Fengguang Wu <fengguang.wu@intel.com>
>> wrote:
>>>
>>> On Wed, Jun 20, 2018 at 11:30:05AM +0200, Dmitry Vyukov wrote:
>>>>
>>>>
>>>> On Wed, Jun 20, 2018 at 11:06 AM, Sergey Senozhatsky
>>>> <sergey.senozhatsky.work@gmail.com> wrote:
>>>>>
>>>>>
>>>>> Hi Dmitry,
>>>>>
>>>>> On (06/20/18 10:45), Dmitry Vyukov wrote:
>>>>>>
>>>>>>
>>>>>> Hi Sergey,
>>>>>>
>>>>>> What are the visible differences between this patch and Tetsuo's
>>>>>> patch?
>>>>>
>>>>>
>>>>>
>>>>> I guess none, and looking at your requirements below I tend to agree
>>>>> that Tetsuo's approach is probably what you need at the end of the day.
>>>>>
>>>>>> The only thing that will matter for syzkaller parsing in the
>>>>>> end is the resulting text format as it appears on console. But you say
>>>>>> "I'm not pushing for this particular message format", so what exactly
>>>>>> do you want me to provide feedback on?
>>>>>> I guess we need to handle pr_cont properly whatever approach we take.
>>>>>
>>>>>
>>>>>
>>>>> Mostly, was wondering about if:
>>>>> a) you need pr_cont() handling
>>>>> b) you need printk_safe() handling
>>>>>
>>>>> The reasons I left those things behind:
>>>>>
>>>>> a) pr_cont() is officially hated. It was never supposed to be used
>>>>>    on SMP systems. So I wasn't sure if we need all that effort and
>>>>>    add tricky code to handle pr_cont(). Given that syzkaller is
>>>>>    probably the only user of that functionality.
>>>>
>>>>
>>>>
>>>> Well, if I put my syzkaller hat on, then I don't care what exactly
>>>> happens in the kernel, the only thing I care is well-formed output on
>>>> console that can be parsed unambiguously in all cases.
>>>
>>>
>>>
>>> +1 for 0day kernel testing.
>>>
>>> I admit that goal may never be 100% achievable -- at least some serial
>>> console logs can sometimes become messy. So we'll have to write dmesg
>>> parsing code in defensive ways.
>>>
>>> But some unnecessary pr_cont() broken-up messages can obviously be
>>> avoided. For example,
>>>
>>> arch/x86/mm/fault.c:
>>>
>>>         printk(KERN_ALERT "BUG: unable to handle kernel ");
>>>         if (address < PAGE_SIZE)
>>>                 printk(KERN_CONT "NULL pointer dereference");
>>>         else
>>>                 printk(KERN_CONT "paging request");
>>>
>>> I've actually proposed to remove the above KERN_CONT, unfortunately the
>>> patch was silently ignored.
>>
>>
>>
>> I've just cooked this change too, but do you mind reviving your patch?
>
>
> Yes, sure. My version is more dumb. Since I'm not sure if it's OK to
> do string formatting at this critical point. Let's see how others
> think about the 2 approaches. I'm fine as long as our problem is fixed. :)

It already does string formatting for address. And I think we also
need to get rid of KERN_CONT for address while we are here.


> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index 9a84a0d08727..c7b068c6b010 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -671,11 +671,10 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code,
>                        printk(smep_warning, from_kuid(&init_user_ns, current_uid()));
>        }
>
> -       printk(KERN_ALERT "BUG: unable to handle kernel ");
>        if (address < PAGE_SIZE)
> -               printk(KERN_CONT "NULL pointer dereference");
> +               printk(KERN_ALERT "BUG: unable to handle kernel NULL pointer dereference");
>        else
> -               printk(KERN_CONT "paging request");
> +               printk(KERN_ALERT "BUG: unable to handle kernel paging request");
>
>
>        printk(KERN_CONT " at %px\n", (void *) address);
>
>> It actually makes the code even shorter, which is nice:
>>
>> --- a/arch/x86/mm/fault.c
>> +++ b/arch/x86/mm/fault.c
>> @@ -671,13 +671,9 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code,
>>                        printk(smep_warning, from_kuid(&init_user_ns, current_uid()));
>>        }
>>
>> -       printk(KERN_ALERT "BUG: unable to handle kernel ");
>> -       if (address < PAGE_SIZE)
>> -               printk(KERN_CONT "NULL pointer dereference");
>> -       else
>> -               printk(KERN_CONT "paging request");
>> -
>> -       printk(KERN_CONT " at %px\n", (void *) address);
>> +       printk(KERN_ALERT "BUG: unable to handle kernel %s at %px\n",
>> +               (address < PAGE_SIZE ? "NULL pointer dereference" :
>> +               "paging request"), (void *) address);
>>
>>        dump_pagetable(address);
>> }
>>
>


* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20 12:45                                               ` Dmitry Vyukov
@ 2018-06-20 12:48                                                 ` Fengguang Wu
  0 siblings, 0 replies; 68+ messages in thread
From: Fengguang Wu @ 2018-06-20 12:48 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Petr Mladek, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, LKML,
	Linus Torvalds, Andrew Morton

On Wed, Jun 20, 2018 at 02:45:25PM +0200, Dmitry Vyukov wrote:
>On Wed, Jun 20, 2018 at 2:41 PM, Fengguang Wu <fengguang.wu@intel.com> wrote:
>> On Wed, Jun 20, 2018 at 02:31:51PM +0200, Dmitry Vyukov wrote:
>>>
>>> On Wed, Jun 20, 2018 at 1:37 PM, Fengguang Wu <fengguang.wu@intel.com>
>>> wrote:
>>>>
>>>> On Wed, Jun 20, 2018 at 11:30:05AM +0200, Dmitry Vyukov wrote:
>>>>>
>>>>>
>>>>> On Wed, Jun 20, 2018 at 11:06 AM, Sergey Senozhatsky
>>>>> <sergey.senozhatsky.work@gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>> Hi Dmitry,
>>>>>>
>>>>>> On (06/20/18 10:45), Dmitry Vyukov wrote:
>>>>>>>
>>>>>>>
>>>>>>> Hi Sergey,
>>>>>>>
>>>>>>> What are the visible differences between this patch and Tetsuo's
>>>>>>> patch?
>>>>>>
>>>>>>
>>>>>>
>>>>>> I guess none, and looking at your requirements below I tend to agree
>>>>>> that Tetsuo's approach is probably what you need at the end of the day.
>>>>>>
>>>>>>> The only thing that will matter for syzkaller parsing in the
>>>>>>> end is the resulting text format as it appears on console. But you say
>>>>>>> "I'm not pushing for this particular message format", so what exactly
>>>>>>> do you want me to provide feedback on?
>>>>>>> I guess we need to handle pr_cont properly whatever approach we take.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Mostly, was wondering about if:
>>>>>> a) you need pr_cont() handling
>>>>>> b) you need printk_safe() handling
>>>>>>
>>>>>> The reasons I left those things behind:
>>>>>>
>>>>>> a) pr_cont() is officially hated. It was never supposed to be used
>>>>>>    on SMP systems. So I wasn't sure if we need all that effort and
>>>>>>    add tricky code to handle pr_cont(). Given that syzkaller is
>>>>>>    probably the only user of that functionality.
>>>>>
>>>>>
>>>>>
>>>>> Well, if I put my syzkaller hat on, then I don't care what exactly
>>>>> happens in the kernel, the only thing I care is well-formed output on
>>>>> console that can be parsed unambiguously in all cases.
>>>>
>>>>
>>>>
>>>> +1 for 0day kernel testing.
>>>>
>>>> I admit that goal may never be 100% achievable -- at least some serial
>>>> console logs can sometimes become messy. So we'll have to write dmesg
>>>> parsing code in defensive ways.
>>>>
>>>> But some unnecessary pr_cont() broken-up messages can obviously be
>>>> avoided. For example,
>>>>
>>>> arch/x86/mm/fault.c:
>>>>
>>>>         printk(KERN_ALERT "BUG: unable to handle kernel ");
>>>>         if (address < PAGE_SIZE)
>>>>                 printk(KERN_CONT "NULL pointer dereference");
>>>>         else
>>>>                 printk(KERN_CONT "paging request");
>>>>
>>>> I've actually proposed to remove the above KERN_CONT, unfortunately the
>>>> patch was silently ignored.
>>>
>>>
>>>
>>> I've just cooked this change too, but do you mind reviving your patch?
>>
>>
>> Yes, sure. My version is more dumb. Since I'm not sure if it's OK to
>> do string formatting at this critical point. Let's see how others
>> think about the 2 approaches. I'm fine as long as our problem is fixed. :)
>
>It already does string formatting for address. And I think we also
>need to get rid of KERN_CONT for address while we are here.

Ah yes, sorry I overlooked the next KERN_CONT..

>
>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
>> index 9a84a0d08727..c7b068c6b010 100644
>> --- a/arch/x86/mm/fault.c
>> +++ b/arch/x86/mm/fault.c
>> @@ -671,11 +671,10 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code,
>>                        printk(smep_warning, from_kuid(&init_user_ns, current_uid()));
>>        }
>>
>> -       printk(KERN_ALERT "BUG: unable to handle kernel ");
>>        if (address < PAGE_SIZE)
>> -               printk(KERN_CONT "NULL pointer dereference");
>> +               printk(KERN_ALERT "BUG: unable to handle kernel NULL pointer dereference");
>>        else
>> -               printk(KERN_CONT "paging request");
>> +               printk(KERN_ALERT "BUG: unable to handle kernel paging request");
>>
>>
>>        printk(KERN_CONT " at %px\n", (void *) address);
>>
>>> It actually makes the code even shorter, which is nice:
>>>
>>> --- a/arch/x86/mm/fault.c
>>> +++ b/arch/x86/mm/fault.c
>>> @@ -671,13 +671,9 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code,
>>>                        printk(smep_warning, from_kuid(&init_user_ns, current_uid()));
>>>        }
>>>
>>> -       printk(KERN_ALERT "BUG: unable to handle kernel ");
>>> -       if (address < PAGE_SIZE)
>>> -               printk(KERN_CONT "NULL pointer dereference");
>>> -       else
>>> -               printk(KERN_CONT "paging request");
>>> -
>>> -       printk(KERN_CONT " at %px\n", (void *) address);
>>> +       printk(KERN_ALERT "BUG: unable to handle kernel %s at %px\n",
>>> +               (address < PAGE_SIZE ? "NULL pointer dereference" :
>>> +               "paging request"), (void *) address);
>>>
>>>        dump_pagetable(address);
>>> }
>>>
>>
>


* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20 11:32                                             ` Dmitry Vyukov
@ 2018-06-20 13:06                                               ` Sergey Senozhatsky
  2018-06-22 13:06                                                 ` Tetsuo Handa
  2018-09-10 11:20                                                 ` Alexander Potapenko
  2018-06-21  8:29                                               ` Sergey Senozhatsky
  1 sibling, 2 replies; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-06-20 13:06 UTC (permalink / raw)
  To: Dmitry Vyukov, Fengguang Wu
  Cc: Sergey Senozhatsky, Petr Mladek, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, LKML,
	Linus Torvalds, Andrew Morton

On (06/20/18 13:32), Dmitry Vyukov wrote:
> > So, if we could get rid of pr_cont() from the most important parts
> > (instruction dumps, etc) then I would just vote to leave pr_cont()
> > alone and avoid any handling of it in printk context tracking. Simply
> > because we wouldn't care about pr_cont(). This also could simplify
> > Tetsuo's patch significantly.
> 
> Sounds good to me.

Awesome. If you and Fengguang can combine forces and lead the
whole thing towards "we couldn't care less about pr_cont()", it
would be really huuuuuge. Go for it!

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20 11:32                                             ` Dmitry Vyukov
  2018-06-20 13:06                                               ` Sergey Senozhatsky
@ 2018-06-21  8:29                                               ` Sergey Senozhatsky
  1 sibling, 0 replies; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-06-21  8:29 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Petr Mladek, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, Fengguang Wu,
	LKML, Linus Torvalds, Andrew Morton

On (06/20/18 13:32), Dmitry Vyukov wrote:
> >>
> >> So this is another reason to get rid of pr_cont entirely, right?
> >
> > Getting rid of pr_cont() from important output would be totally cool.
> > Quoting Linus:
> >
> >     Only acceptable use of continuations is basically boot-time testing,
> >     when you do things like
> >
> >      printk("Testing feature XYZ..");
> >      this_may_blow_up_because_of_hw_bugs();
> >      printk(KERN_CONT " ... ok\n");
> >
> >
> > I can recall at least 4 attempts when people tried to introduce new pr_cont()
> > or some concept with similar functionality to pr_cont(), but SMP safe. We
> > brought the first one - per-CPU pr_cont() buffers - to KS several years ago
> > but Linus didn't like it. Then there was a buffered printk() mode patch from
> > Tetsuo, then a solution from Steven, then I had my second try with a
> > sort-of-pr_cont() replacement.
> >
> > So, if we could get rid of pr_cont() from the most important parts
> > (instruction dumps, etc) then I would just vote to leave pr_cont()
> > alone and avoid any handling of it in printk context tracking. Simply
> > because we wouldn't care about pr_cont(). This also could simplify
> > Tetsuo's patch significantly.
> 
> Sounds good to me.

Another thing about pr_cont() is that as long as pr_cont() does not race
with pr_cont() from another task or from IRQ, the task that is the owner
(see struct cont in printk.c) of the existing continuation line can migrate,
IOW we can have

	CPU0	CPU1	CPU2	CPU3

	task A
	pr_cont()
		task A
		pr_cont()
			task A
			pr_cont()
				task A
				pr_cont("\n") << flush

The line was printed from 4 CPUs, but appears as a single line
in the logbuf. Should we account CPU0 or CPU3 as the line origin?

That's another reason why I don't really want to handle pr_cont in
any special way in context tracking.

So, currently, context tracking looks like this:

---
	char mode = 'T';

	if (in_serving_softirq())
		mode = 'S';
	if (in_irq())
		mode = 'I';
	if (in_nmi())
		mode = 'N';

	ret = snprintf(buf, buf_len, "%c<%d>%c",
			mode,
			raw_smp_processor_id(),
			cont.len ? '+' : ' ');
---

I add a '+' symbol to continuation lines, which should simply hint
that tracking info for that particular line is not entirely trustworthy.

I also don't add any tracking info for printk_safe output. We get
tracking info for such lines from the printk_safe flush path
(irq work that happens on the same CPU that added printk_safe output).
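
The snippet above depends on kernel-only helpers. A user-space sketch of the
same prefix logic, assuming the context flags are passed in explicitly, might
look like this. struct exec_ctx and format_ctx_prefix() are hypothetical names
used only for illustration; they are not kernel APIs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for in_serving_softirq(), in_irq() and in_nmi(). */
struct exec_ctx {
	bool softirq;
	bool irq;
	bool nmi;
};

/*
 * Build the "<mode><cpu><marker>" prefix, e.g. "I<2> ". Later checks
 * override earlier ones, so NMI wins over hard IRQ, which wins over softirq.
 * Returns what snprintf() returns: the length the prefix needs.
 */
static int format_ctx_prefix(char *buf, size_t buf_len,
			     const struct exec_ctx *ctx, int cpu,
			     bool cont_pending)
{
	char mode = 'T';	/* plain task context */

	if (ctx->softirq)
		mode = 'S';
	if (ctx->irq)
		mode = 'I';
	if (ctx->nmi)
		mode = 'N';

	/* '+' marks a line built from continuation fragments. */
	return snprintf(buf, buf_len, "%c<%d>%c", mode, cpu,
			cont_pending ? '+' : ' ');
}
```

With softirq and irq both set on CPU 2 and no pending continuation, the
buffer holds "I<2> ", because the hard IRQ check runs after the softirq check.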

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20 13:06                                               ` Sergey Senozhatsky
@ 2018-06-22 13:06                                                 ` Tetsuo Handa
  2018-06-25  1:41                                                   ` Sergey Senozhatsky
  2018-09-10 11:20                                                 ` Alexander Potapenko
  1 sibling, 1 reply; 68+ messages in thread
From: Tetsuo Handa @ 2018-06-22 13:06 UTC (permalink / raw)
  To: Sergey Senozhatsky, Dmitry Vyukov, Fengguang Wu
  Cc: Sergey Senozhatsky, Petr Mladek, syzkaller, Steven Rostedt, LKML,
	Linus Torvalds, Andrew Morton

On 2018/06/20 22:06, Sergey Senozhatsky wrote:
> On (06/20/18 13:32), Dmitry Vyukov wrote:
>>> So, if we could get rid of pr_cont() from the most important parts
>>> (instruction dumps, etc) then I would just vote to leave pr_cont()
>>> alone and avoid any handling of it in printk context tracking. Simply
>>> because we wouldn't care about pr_cont(). This also could simplify
>>> Tetsuo's patch significantly.
>>
>> Sounds good to me.
> 
> Awesome. If you and Fengguang can combine forces and lead the
> whole thing towards "we couldn't care of pr_cont() less", it
> would be really huuuuuge. Go for it!

Can't we have a seq_printf()-like helper which flushes automatically upon seeing
'\n' or when the buffer is full? Printing memory information uses a lot of pr_cont(), even in
function names (e.g. http://lkml.kernel.org/r/20180622083949.GR10465@dhcp22.suse.cz ).
Since OOM killer code is serialized by oom_lock, we can use a static buffer for
OOM killer messages.
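
A user-space sketch of that idea, assuming a single caller at a time (which is
what oom_lock would guarantee for the OOM killer): text accumulates in a buffer
and is handed out in one piece on '\n' or when the buffer is full. struct
line_buf, line_buf_printf() and LINE_BUF_SIZE are hypothetical names, and
fputs() merely stands in for a single printk() call.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define LINE_BUF_SIZE 128	/* arbitrary size chosen for this sketch */

/* Hypothetical line accumulator; not an existing kernel structure. */
struct line_buf {
	char data[LINE_BUF_SIZE];
	size_t len;		/* bytes currently pending in data[] */
	FILE *out;		/* destination; stands in for the log buffer */
};

/* Emit everything accumulated so far as one complete string. */
static void line_buf_flush(struct line_buf *lb)
{
	if (lb->len == 0)
		return;
	lb->data[lb->len] = '\0';
	fputs(lb->data, lb->out);	/* the single "printk()" of this sketch */
	lb->len = 0;
}

/* Append formatted text, flushing on every '\n' or when the buffer fills. */
static void line_buf_printf(struct line_buf *lb, const char *fmt, ...)
{
	char tmp[LINE_BUF_SIZE];
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(tmp, sizeof(tmp), fmt, ap);
	va_end(ap);
	if (n < 0)
		return;
	if ((size_t)n >= sizeof(tmp))
		n = (int)sizeof(tmp) - 1;	/* output was truncated to fit tmp */

	for (int i = 0; i < n; i++) {
		lb->data[lb->len++] = tmp[i];
		/* Keep one byte free for the terminating NUL. */
		if (tmp[i] == '\n' || lb->len == sizeof(lb->data) - 1)
			line_buf_flush(lb);
	}
}
```

Fragments such as "Node 0 " and "DMA free:15kB" stay pending until a later
call supplies the '\n', so the destination only ever receives whole lines,
except when a line is longer than the buffer.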


* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-22 13:06                                                 ` Tetsuo Handa
@ 2018-06-25  1:41                                                   ` Sergey Senozhatsky
  2018-06-25  9:36                                                     ` Dmitry Vyukov
  0 siblings, 1 reply; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-06-25  1:41 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Dmitry Vyukov, Fengguang Wu,
	Sergey Senozhatsky, Petr Mladek, syzkaller, Steven Rostedt, LKML,
	Linus Torvalds, Andrew Morton

On (06/22/18 22:06), Tetsuo Handa wrote:
> >
> > Awesome. If you and Fengguang can combine forces and lead the
> > whole thing towards "we couldn't care of pr_cont() less", it
> > would be really huuuuuge. Go for it!
> 
> Can't we have seq_printf()-like one which flushes automatically upon seeing '\n'
> or buffer full? Printing memory information is using a lot of pr_cont(), even in
> function names (e.g. http://lkml.kernel.org/r/20180622083949.GR10465@dhcp22.suse.cz ).
> Since OOM killer code is serialized by oom_lock, we can use static buffer for
> OOM killer messages.

I'm not the right guy to answer this question. Sorry. We need to Cc MM
people on this.

Does OOM's pr_cont() usage cause too much disturbance to syzkaller? I thought
that OOM was slightly out of sight.

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-25  1:41                                                   ` Sergey Senozhatsky
@ 2018-06-25  9:36                                                     ` Dmitry Vyukov
  2018-06-27 10:29                                                       ` Tetsuo Handa
  0 siblings, 1 reply; 68+ messages in thread
From: Dmitry Vyukov @ 2018-06-25  9:36 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Tetsuo Handa, Sergey Senozhatsky, Fengguang Wu, Petr Mladek,
	syzkaller, Steven Rostedt, LKML, Linus Torvalds, Andrew Morton

On Mon, Jun 25, 2018 at 3:41 AM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
> On (06/22/18 22:06), Tetsuo Handa wrote:
>> >
>> > Awesome. If you and Fengguang can combine forces and lead the
>> > whole thing towards "we couldn't care of pr_cont() less", it
>> > would be really huuuuuge. Go for it!
>>
>> Can't we have seq_printf()-like one which flushes automatically upon seeing '\n'
>> or buffer full? Printing memory information is using a lot of pr_cont(), even in
>> function names (e.g. http://lkml.kernel.org/r/20180622083949.GR10465@dhcp22.suse.cz ).
>> Since OOM killer code is serialized by oom_lock, we can use static buffer for
>> OOM killer messages.
>
> I'm not the right guy to answer this question. Sorry. We need to Cc MM
> people on this.
>
> Does OOM's pr_cont() usage cause too much disturbance to syzkaller? I thought
> that OOM was slightly out of sight.

Hard to tell. Nothing specific comes to mind.
We do see lines like these:

BUG: unable to handle kernel [ 110.NUM] device gre0 entered promiscuous mode
BUG:--------[ cut here ]------------

and frequently we also have to look deep inside a crash message
to understand what it really means. It is hard to tell how much random
pr_cont's contribute to the problem. We now throw away anything that
looks corrupted in any way right away.
I guess the main requirement is that the crash report itself does not
use pr_cont and provided we have task/cpu context we can separate the
crash report lines from everything else (assuming that random
pr_cont's on other CPUs won't glue to the report lines).
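
To illustrate the separation step described above, here is a minimal
userspace sketch, not syzkaller's actual parser. It assumes every log line
starts with a context tag such as "[T1]"; that tag format, and the names
line_has_context() and extract_context(), are hypothetical and only serve
the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Nonzero if 'line' begins with 'tag', e.g. "[T1]" (hypothetical format). */
static int line_has_context(const char *line, const char *tag)
{
	return strncmp(line, tag, strlen(tag)) == 0;
}

/*
 * Copy into 'out' only those lines of 'log' that carry 'tag'.
 * Lines that would not fit in 'out' are skipped.
 * Returns the number of lines kept; 'out' is always NUL-terminated.
 */
static size_t extract_context(const char *log, const char *tag,
			      char *out, size_t out_size)
{
	size_t kept = 0, pos = 0;

	assert(log && tag && out && out_size > 0);
	while (*log) {
		const char *nl = strchr(log, '\n');
		/* Length of this line, including its '\n' if present. */
		size_t len = nl ? (size_t)(nl - log) + 1 : strlen(log);

		if (line_has_context(log, tag) && pos + len < out_size) {
			memcpy(out + pos, log, len);
			pos += len;
			kept++;
		}
		log += len;
	}
	out[pos] = '\0';
	return kept;
}
```

This only works if each line is emitted whole by a single context, which
is why the crash report itself must not depend on pr_cont().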

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-25  9:36                                                     ` Dmitry Vyukov
@ 2018-06-27 10:29                                                       ` Tetsuo Handa
  0 siblings, 0 replies; 68+ messages in thread
From: Tetsuo Handa @ 2018-06-27 10:29 UTC (permalink / raw)
  To: Dmitry Vyukov, Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Fengguang Wu, Petr Mladek, syzkaller,
	Steven Rostedt, LKML, Linus Torvalds, Andrew Morton

On 2018/06/25 18:36, Dmitry Vyukov wrote:
> On Mon, Jun 25, 2018 at 3:41 AM, Sergey Senozhatsky
> <sergey.senozhatsky.work@gmail.com> wrote:
>> On (06/22/18 22:06), Tetsuo Handa wrote:
>>>>
>>>> Awesome. If you and Fengguang can combine forces and lead the
>>>> whole thing towards "we couldn't care of pr_cont() less", it
>>>> would be really huuuuuge. Go for it!
>>>
>>> Can't we have seq_printf()-like one which flushes automatically upon seeing '\n'
>>> or buffer full? Printing memory information is using a lot of pr_cont(), even in
>>> function names (e.g. http://lkml.kernel.org/r/20180622083949.GR10465@dhcp22.suse.cz ).
>>> Since OOM killer code is serialized by oom_lock, we can use static buffer for
>>> OOM killer messages.
>>
>> I'm not the right guy to answer this question. Sorry. We need to Cc MM
>> people on this.
>>
>> Does OOM's pr_cont() usage cause too much disturbance to syzkaller? I thought
>> that OOM was slightly out of sight.
> 
> Hard to tell. Nothing specific comes to mind.
> We do see lines like these:
> 
> BUG: unable to handle kernel [ 110.NUM] device gre0 entered promiscuous mode
> BUG:--------[ cut here ]------------
> 
> and frequently it's also required to look deep inside of crash message
> to understand what they really mean. Hard to tell how random pr_cont's
> contribute to the problem. We now throw away everything that looks any
> corrupted right away.
> I guess the main requirement is that the crash report itself does not
> use pr_cont and provided we have task/cpu context we can separate the
> crash report lines from everything else (assuming that random
> pr_cont's on other CPUs won't glue to the report lines).
> 

PATCH 1/3 below is a sample implementation of a seq_printf()-like function
which flushes automatically upon seeing '\n' or when the buffer becomes
full. PATCH 2/3 is a straightforward user of that function. (Since it is so
simple, we could rewrite it using snprintf() before PATCH 1/3 is accepted.)
PATCH 3/3 is a complicated user of that function. (We could reduce pr_cont()
usage there before PATCH 1/3 is accepted.) Can we agree on PATCH 1/3?
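
For readers who want to try the idea outside the kernel, below is a rough
userspace sketch of the same "spool until '\n' or buffer full" behaviour.
It is not the code from PATCH 1/3: struct line_buffer, line_buffer_printf(),
line_buffer_flush() and the emit() callback (standing in for printk()) are
names made up for this illustration, and capture_emit() exists only so the
output can be inspected.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct printk_buffer. */
struct line_buffer {
	size_t size;			/* capacity of buf */
	size_t used;			/* bytes currently spooled, always < size */
	char *buf;
	void (*emit)(const char *text);	/* stand-in for printk("%s", text) */
};

/* Emit each complete line; with 'all' set, also emit a trailing partial line. */
static void line_buffer_flush(struct line_buffer *lb, int all)
{
	while (lb->used) {
		char *nl = memchr(lb->buf, '\n', lb->used);
		size_t len;
		char saved;

		if (nl)
			len = (size_t)(nl - lb->buf) + 1;
		else if (all)
			len = lb->used;
		else
			break;
		/* Temporarily terminate the chunk; len <= used < size. */
		saved = lb->buf[len];
		lb->buf[len] = '\0';
		lb->emit(lb->buf);
		lb->buf[len] = saved;
		lb->used -= len;
		memmove(lb->buf, lb->buf + len, lb->used);
	}
}

/* Spool formatted text; fall back to unbuffered output if it does not fit. */
static int line_buffer_printf(struct line_buffer *lb, const char *fmt, ...)
{
	va_list args;
	size_t room;
	char *big;
	int r;

	assert(lb->size > 0 && lb->used < lb->size);
	room = lb->size - lb->used;

	va_start(args, fmt);
	r = vsnprintf(lb->buf + lb->used, room, fmt, args);
	va_end(args);
	if (r < 0)
		return r;
	if ((size_t)r < room) {
		lb->used += (size_t)r;
		line_buffer_flush(lb, 0);
		return r;
	}

	/* Too long: flush what is spooled, then emit this text directly. */
	line_buffer_flush(lb, 1);
	big = malloc((size_t)r + 1);
	if (!big)
		return -1;
	va_start(args, fmt);
	vsnprintf(big, (size_t)r + 1, fmt, args);
	va_end(args);
	lb->emit(big);
	free(big);
	return r;
}

/* Demo sink: records what was emitted and how many times. */
static char captured[512];
static int emit_calls;

static void capture_emit(const char *text)
{
	strncat(captured, text, sizeof(captured) - strlen(captured) - 1);
	emit_calls++;
}
```

With a 32-byte buffer and capture_emit() as the sink, printing "INFO:",
then " a=1", then "\n" through line_buffer_printf() results in a single
emit() call carrying the whole line, which is the property that matters
once a per-line context prefix is added.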

From 485406f585e566dccdfb85a1afbae460b8756457 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Wed, 27 Jun 2018 16:29:14 +0900
Subject: [PATCH 1/3] printk: Introduce buffered_printk().

Linus suggested in "printk: what is going on with additional newlines?"
thread [1] that

  Making the buffer explicit is (a) cheaper and (b) better. Now you can
  put the buffer on the stack, you never have to worry about where you
  need to track context, and you have no buffering limits (ie you can
  buffer across any event).

  I definitely suspect that "single line" is often sufficient. I
  mean, that's all that KERN_CONT ever gave you anyway (and not reliably).

  And then a 80 character buffer really isn't any different from having
  a structure with a few pointers in it, which we do on the stack all
  the time.

Now, since syzbot is bothered by concurrent printk() messages (e.g.
memory allocation fault injection), we started thinking about adding a
prefix to each line of printk() output. This matches the suggestion that
buffering a single line will be sufficient if we add the caller's context
information for distinguishing concurrent printk() messages.

Thus, this patch introduces buffered_printk(), which spools printk() output
and automatically flushes it when '\n' is found or the buffer becomes full
(along with the related structure, macro and functions).

[1] http://lkml.kernel.org/r/CA+55aFx+5R-vFQfr7+Ok9Yrs2adQ2Ma4fz+S6nCyWHY_-2mrmw@mail.gmail.com

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 include/linux/printk.h | 28 ++++++++++++++++
 kernel/printk/printk.c | 86 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 114 insertions(+)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index 6d7e800..81bc12a 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -153,6 +153,23 @@ static inline void printk_nmi_enter(void) { }
 static inline void printk_nmi_exit(void) { }
 #endif /* PRINTK_NMI */
 
+struct printk_buffer {
+	unsigned short int size;
+	unsigned short int used;
+	char *buf;
+};
+
+#define DEFINE_PRINTK_BUFFER(name, size, buf)		\
+	struct printk_buffer name = { size, 0, buf }
+
+static inline void INIT_PRINTK_BUFFER(struct printk_buffer *ptr,
+				      unsigned short int size, char *buf)
+{
+	ptr->size = size;
+	ptr->used = 0;
+	ptr->buf = buf;
+}
+
 #ifdef CONFIG_PRINTK
 asmlinkage __printf(5, 0)
 int vprintk_emit(int facility, int level,
@@ -169,6 +186,9 @@ int printk_emit(int facility, int level,
 
 asmlinkage __printf(1, 2) __cold
 int printk(const char *fmt, ...);
+asmlinkage __printf(2, 3) __cold
+int buffered_printk(struct printk_buffer *ptr, const char *fmt, ...);
+void flush_buffered_printk(struct printk_buffer *ptr);
 
 /*
  * Special printk facility for scheduler/timekeeping use only, _DO_NOT_USE_ !
@@ -216,6 +236,14 @@ int printk(const char *s, ...)
 {
 	return 0;
 }
+static inline __printf(2, 3) __cold
+int buffered_printk(struct printk_buffer *ptr, const char *fmt, ...)
+{
+	return 0;
+}
+static inline void flush_buffered_printk(struct printk_buffer *ptr)
+{
+}
 static inline __printf(1, 2) __cold
 int printk_deferred(const char *s, ...)
 {
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 2478083..24566dc 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1985,6 +1985,92 @@ asmlinkage __visible int printk(const char *fmt, ...)
 }
 EXPORT_SYMBOL(printk);
 
+static void __flush_printk_buffer(struct printk_buffer *ptr, bool all)
+{
+	while (1) {
+		char *text = ptr->buf;
+		unsigned int text_len = ptr->used;
+		char *cp = memchr(text, '\n', text_len);
+		char c;
+
+		if (cp++)
+			text_len = cp - text;
+		else if (all)
+			cp = text + text_len;
+		else
+			break;
+		c = *cp;
+		*cp = '\0';
+		printk("%s", text);
+		ptr->used -= text_len;
+		if (!ptr->used)
+			break;
+		*cp = c;
+		memmove(text, text + text_len, ptr->used);
+	}
+}
+
+/*
+ * buffered_printk - Try to print multiple printk() calls as line oriented.
+ *
+ * This is a utility function for avoiding KERN_CONT and pr_cont() usage.
+ *
+ * Before:
+ *
+ *   pr_info("INFO:");
+ *   for (i = 0; i < 5; i++)
+ *     pr_cont(" %s=%s", name[i], value[i]);
+ *   pr_cont("\n");
+ *
+ * After:
+ *
+ *   char buffer[256];
+ *   DEFINE_PRINTK_BUFFER(buf, sizeof(buffer), buffer);
+ *   buffered_printk(&buf, KERN_INFO "INFO:");
+ *   for (i = 0; i < 5; i++)
+ *     buffered_printk(&buf, " %s=%s", name[i], value[i]);
+ *   buffered_printk(&buf, "\n");
+ *
+ * If the caller is not sure that the last buffered_printk() call ends with
+ * "\n", the caller can use flush_buffered_printk() in order to make sure that
+ * all data is passed to printk().
+ *
+ * If the buffer is not large enough to hold one line, buffered_printk() will
+ * fall back to regular printk() instead of truncating the data. But be careful
+ * with LOG_LINE_MAX limit anyway.
+ */
+asmlinkage __visible int buffered_printk(struct printk_buffer *ptr,
+					 const char *fmt, ...)
+{
+	va_list args;
+	int r;
+	const unsigned int pos = ptr->used;
+
+	/* Try to store to printk_buffer first. */
+	va_start(args, fmt);
+	r = vsnprintf(ptr->buf + pos, ptr->size - pos, fmt, args);
+	va_end(args);
+	/* If it succeeds, process printk_buffer up to last '\n' and return. */
+	if (r + pos < ptr->size) {
+		ptr->used += r;
+		__flush_printk_buffer(ptr, false);
+		return r;
+	}
+	/* Otherwise, flush printk_buffer and use unbuffered printk(). */
+	__flush_printk_buffer(ptr, true);
+	va_start(args, fmt);
+	r = vprintk_func(fmt, args);
+	va_end(args);
+	return r;
+}
+EXPORT_SYMBOL(buffered_printk);
+
+void flush_buffered_printk(struct printk_buffer *ptr)
+{
+	__flush_printk_buffer(ptr, true);
+}
+EXPORT_SYMBOL(flush_buffered_printk);
+
 #else /* CONFIG_PRINTK */
 
 #define LOG_LINE_MAX		0
-- 
1.8.3.1

From 8f38ec70c9c444673e6bf2e699781cd143442ac6 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Wed, 27 Jun 2018 16:30:18 +0900
Subject: [PATCH 2/3] x86: Use buffered_printk() in show_opcodes()

Since syzbot is confused by concurrent printk() messages,
this patch changes show_opcodes() to use buffered_printk().

Once we start adding a prefix to each line of printk() output,
syzbot will be able to handle concurrent printk() messages.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 arch/x86/kernel/dumpstack.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 666a284..c284dd0 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -97,22 +97,24 @@ void show_opcodes(u8 *rip, const char *loglvl)
 	u8 opcodes[OPCODE_BUFSIZE];
 	u8 *ip;
 	int i;
+	char tmpbuf[(2 + 6) + (3 * OPCODE_BUFSIZE + 2) + 2];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
 
-	printk("%sCode: ", loglvl);
+	buffered_printk(&buf, "%sCode: ", loglvl);
 
 	ip = (u8 *)rip - code_prologue;
 	if (probe_kernel_read(opcodes, ip, OPCODE_BUFSIZE)) {
-		pr_cont("Bad RIP value.\n");
+		buffered_printk(&buf, "Bad RIP value.\n");
 		return;
 	}
 
 	for (i = 0; i < OPCODE_BUFSIZE; i++, ip++) {
 		if (ip == rip)
-			pr_cont("<%02x> ", opcodes[i]);
+			buffered_printk(&buf, "<%02x> ", opcodes[i]);
 		else
-			pr_cont("%02x ", opcodes[i]);
+			buffered_printk(&buf, "%02x ", opcodes[i]);
 	}
-	pr_cont("\n");
+	buffered_printk(&buf, "\n");
 }
 
 void show_ip(struct pt_regs *regs, const char *loglvl)
-- 
1.8.3.1

From 910520d0f5f366c86f5e4a2d5d344ae16e375604 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Wed, 27 Jun 2018 16:31:17 +0900
Subject: [PATCH 3/3] lockdep: Replace KERN_CONT/pr_cont() with
 buffered_printk()

Since syzbot is confused by concurrent printk() messages,
this patch eliminates KERN_CONT/pr_cont() usage from
kernel/locking/lockdep.c functions.

Once we start adding a prefix to each line of printk() output,
syzbot will be able to handle concurrent printk() messages.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 kernel/locking/lockdep.c | 248 +++++++++++++++++++++++------------------------
 1 file changed, 123 insertions(+), 125 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 5fa4d31..b8d9aa6 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -499,36 +499,38 @@ void get_usage_chars(struct lock_class *class, char usage[LOCK_USAGE_CHARS])
 	usage[i] = '\0';
 }
 
-static void __print_lock_name(struct lock_class *class)
+static void __print_lock_name(struct printk_buffer *buf, struct lock_class *class,
+			      const char *header, const char *trailer)
 {
 	char str[KSYM_NAME_LEN];
 	const char *name;
 
+	buffered_printk(buf, "%s", header);
 	name = class->name;
 	if (!name) {
 		name = __get_key_name(class->key, str);
-		printk(KERN_CONT "%s", name);
+		buffered_printk(buf, "%s", name);
 	} else {
-		printk(KERN_CONT "%s", name);
+		buffered_printk(buf, "%s", name);
 		if (class->name_version > 1)
-			printk(KERN_CONT "#%d", class->name_version);
+			buffered_printk(buf, "#%d", class->name_version);
 		if (class->subclass)
-			printk(KERN_CONT "/%d", class->subclass);
+			buffered_printk(buf, "/%d", class->subclass);
 	}
+	buffered_printk(buf, "%s", trailer);
 }
 
-static void print_lock_name(struct lock_class *class)
+static void print_lock_name(struct printk_buffer *buf, struct lock_class *class, const char *trailer)
 {
 	char usage[LOCK_USAGE_CHARS];
 
 	get_usage_chars(class, usage);
 
-	printk(KERN_CONT " (");
-	__print_lock_name(class);
-	printk(KERN_CONT "){%s}", usage);
+	__print_lock_name(buf, class, " (", ")");
+	buffered_printk(buf, "{%s}%s", usage, trailer);
 }
 
-static void print_lockdep_cache(struct lockdep_map *lock)
+static void print_lockdep_cache(struct printk_buffer *buf, struct lockdep_map *lock, const char *trailer)
 {
 	const char *name;
 	char str[KSYM_NAME_LEN];
@@ -537,11 +539,13 @@ static void print_lockdep_cache(struct lockdep_map *lock)
 	if (!name)
 		name = __get_key_name(lock->key->subkeys, str);
 
-	printk(KERN_CONT "%s", name);
+	buffered_printk(buf, "%s%s", name, trailer);
 }
 
-static void print_lock(struct held_lock *hlock)
+static void print_lock(struct printk_buffer *buf, struct held_lock *hlock)
 {
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf2, sizeof(tmpbuf), tmpbuf);
 	/*
 	 * We can be called locklessly through debug_show_all_locks() so be
 	 * extra careful, the hlock might have been released and cleared.
@@ -551,19 +555,23 @@ static void print_lock(struct held_lock *hlock)
 	/* Don't re-read hlock->class_idx, can't use READ_ONCE() on bitfields: */
 	barrier();
 
+	if (!buf)
+		buf = &buf2;
 	if (!class_idx || (class_idx - 1) >= MAX_LOCKDEP_KEYS) {
-		printk(KERN_CONT "<RELEASED>\n");
+		buffered_printk(buf, "<RELEASED>\n");
 		return;
 	}
 
-	printk(KERN_CONT "%p", hlock->instance);
-	print_lock_name(lock_classes + class_idx - 1);
-	printk(KERN_CONT ", at: %pS\n", (void *)hlock->acquire_ip);
+	buffered_printk(buf, "%p", hlock->instance);
+	print_lock_name(buf, lock_classes + class_idx - 1, "");
+	buffered_printk(buf, ", at: %pS\n", (void *)hlock->acquire_ip);
 }
 
 static void lockdep_print_held_locks(struct task_struct *p)
 {
 	int i, depth = READ_ONCE(p->lockdep_depth);
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
 
 	if (!depth)
 		printk("no locks held by %s/%d.\n", p->comm, task_pid_nr(p));
@@ -577,8 +585,8 @@ static void lockdep_print_held_locks(struct task_struct *p)
 	if (p->state == TASK_RUNNING && p != current)
 		return;
 	for (i = 0; i < depth; i++) {
-		printk(" #%d: ", i);
-		print_lock(p->held_locks + i);
+		buffered_printk(&buf, " #%d: ", i);
+		print_lock(&buf, p->held_locks + i);
 	}
 }
 
@@ -812,10 +820,10 @@ static bool assign_lock_key(struct lockdep_map *lock)
 	if (verbose(class)) {
 		graph_unlock();
 
-		printk("\nnew class %px: %s", class->key, class->name);
 		if (class->name_version > 1)
-			printk(KERN_CONT "#%d", class->name_version);
-		printk(KERN_CONT "\n");
+			printk("\nnew class %px: %s#%d\n", class->key, class->name, class->name_version);
+		else
+			printk("\nnew class %px: %s\n", class->key, class->name);
 		dump_stack();
 
 		if (!graph_lock()) {
@@ -1089,11 +1097,13 @@ static inline int __bfs_backwards(struct lock_list *src_entry,
 static noinline int
 print_circular_bug_entry(struct lock_list *target, int depth)
 {
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
+
 	if (debug_locks_silent)
 		return 0;
-	printk("\n-> #%u", depth);
-	print_lock_name(target->class);
-	printk(KERN_CONT ":\n");
+	buffered_printk(&buf, "\n-> #%u", depth);
+	print_lock_name(&buf, target->class, ":\n");
 	print_stack_trace(&target->trace, 6);
 
 	return 0;
@@ -1107,6 +1117,8 @@ static inline int __bfs_backwards(struct lock_list *src_entry,
 	struct lock_class *source = hlock_class(src);
 	struct lock_class *target = hlock_class(tgt);
 	struct lock_class *parent = prt->class;
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
 
 	/*
 	 * A direct locking problem where unsafe_class lock is taken
@@ -1122,30 +1134,19 @@ static inline int __bfs_backwards(struct lock_list *src_entry,
 	 * from the safe_class lock to the unsafe_class lock.
 	 */
 	if (parent != source) {
-		printk("Chain exists of:\n  ");
-		__print_lock_name(source);
-		printk(KERN_CONT " --> ");
-		__print_lock_name(parent);
-		printk(KERN_CONT " --> ");
-		__print_lock_name(target);
-		printk(KERN_CONT "\n\n");
+		printk("Chain exists of:\n");
+		__print_lock_name(&buf, source, "  ", " --> ");
+		__print_lock_name(&buf, parent, "", " --> ");
+		__print_lock_name(&buf, target, "", "\n\n");
 	}
 
 	printk(" Possible unsafe locking scenario:\n\n");
 	printk("       CPU0                    CPU1\n");
 	printk("       ----                    ----\n");
-	printk("  lock(");
-	__print_lock_name(target);
-	printk(KERN_CONT ");\n");
-	printk("                               lock(");
-	__print_lock_name(parent);
-	printk(KERN_CONT ");\n");
-	printk("                               lock(");
-	__print_lock_name(target);
-	printk(KERN_CONT ");\n");
-	printk("  lock(");
-	__print_lock_name(source);
-	printk(KERN_CONT ");\n");
+	__print_lock_name(&buf, target, "  lock(", ");\n");
+	__print_lock_name(&buf, parent, "                               lock(", ");\n");
+	__print_lock_name(&buf, target, "                               lock(", ");\n");
+	__print_lock_name(&buf, source, "  lock(", ");\n");
 	printk("\n *** DEADLOCK ***\n\n");
 }
 
@@ -1170,11 +1171,11 @@ static inline int __bfs_backwards(struct lock_list *src_entry,
 	pr_warn("------------------------------------------------------\n");
 	pr_warn("%s/%d is trying to acquire lock:\n",
 		curr->comm, task_pid_nr(curr));
-	print_lock(check_src);
+	print_lock(NULL, check_src);
 
 	pr_warn("\nbut task is already holding lock:\n");
 
-	print_lock(check_tgt);
+	print_lock(NULL, check_tgt);
 	pr_warn("\nwhich lock already depends on the new lock.\n\n");
 	pr_warn("\nthe existing dependency chain (in reverse order) is:\n");
 
@@ -1394,18 +1395,19 @@ static inline int usage_match(struct lock_list *entry, void *bit)
 static void print_lock_class_header(struct lock_class *class, int depth)
 {
 	int bit;
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
 
-	printk("%*s->", depth, "");
-	print_lock_name(class);
-	printk(KERN_CONT " ops: %lu", class->ops);
-	printk(KERN_CONT " {\n");
+	buffered_printk(&buf, "%*s->", depth, "");
+	print_lock_name(&buf, class, "");
+	buffered_printk(&buf, " ops: %lu", class->ops);
+	buffered_printk(&buf, " {\n");
 
 	for (bit = 0; bit < LOCK_USAGE_STATES; bit++) {
 		if (class->usage_mask & (1 << bit)) {
 			int len = depth;
 
-			len += printk("%*s   %s", depth, "", usage_str[bit]);
-			len += printk(KERN_CONT " at:\n");
+			len += printk("%*s   %s at:\n", depth, "", usage_str[bit]);
 			print_stack_trace(class->usage_traces + bit, len);
 		}
 	}
@@ -1455,6 +1457,8 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 	struct lock_class *safe_class = safe_entry->class;
 	struct lock_class *unsafe_class = unsafe_entry->class;
 	struct lock_class *middle_class = prev_class;
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
 
 	if (middle_class == safe_class)
 		middle_class = next_class;
@@ -1473,32 +1477,21 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 	 * from the safe_class lock to the unsafe_class lock.
 	 */
 	if (middle_class != unsafe_class) {
-		printk("Chain exists of:\n  ");
-		__print_lock_name(safe_class);
-		printk(KERN_CONT " --> ");
-		__print_lock_name(middle_class);
-		printk(KERN_CONT " --> ");
-		__print_lock_name(unsafe_class);
-		printk(KERN_CONT "\n\n");
+		printk("Chain exists of:\n");
+		__print_lock_name(&buf, safe_class, "  ", " --> ");
+		__print_lock_name(&buf, middle_class, "", " --> ");
+		__print_lock_name(&buf, unsafe_class, "", "\n\n");
 	}
 
 	printk(" Possible interrupt unsafe locking scenario:\n\n");
 	printk("       CPU0                    CPU1\n");
 	printk("       ----                    ----\n");
-	printk("  lock(");
-	__print_lock_name(unsafe_class);
-	printk(KERN_CONT ");\n");
+	__print_lock_name(&buf, unsafe_class, "  lock(", ");\n");
 	printk("                               local_irq_disable();\n");
-	printk("                               lock(");
-	__print_lock_name(safe_class);
-	printk(KERN_CONT ");\n");
-	printk("                               lock(");
-	__print_lock_name(middle_class);
-	printk(KERN_CONT ");\n");
+	__print_lock_name(&buf, safe_class, "                               lock(", ");\n");
+	__print_lock_name(&buf, middle_class, "                               lock(", ");\n");
 	printk("  <Interrupt>\n");
-	printk("    lock(");
-	__print_lock_name(safe_class);
-	printk(KERN_CONT ");\n");
+	__print_lock_name(&buf, safe_class, "    lock(", ");\n");
 	printk("\n *** DEADLOCK ***\n\n");
 }
 
@@ -1514,6 +1507,9 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 			 enum lock_usage_bit bit2,
 			 const char *irqclass)
 {
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
+
 	if (!debug_locks_off_graph_unlock() || debug_locks_silent)
 		return 0;
 
@@ -1529,26 +1525,24 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 		curr->softirq_context, softirq_count() >> SOFTIRQ_SHIFT,
 		curr->hardirqs_enabled,
 		curr->softirqs_enabled);
-	print_lock(next);
+	print_lock(NULL, next);
 
 	pr_warn("\nand this task is already holding:\n");
-	print_lock(prev);
+	print_lock(NULL, prev);
 	pr_warn("which would create a new lock dependency:\n");
-	print_lock_name(hlock_class(prev));
-	pr_cont(" ->");
-	print_lock_name(hlock_class(next));
-	pr_cont("\n");
+	print_lock_name(&buf, hlock_class(prev), " ->");
+	print_lock_name(&buf, hlock_class(next), "\n");
 
 	pr_warn("\nbut this new dependency connects a %s-irq-safe lock:\n",
 		irqclass);
-	print_lock_name(backwards_entry->class);
-	pr_warn("\n... which became %s-irq-safe at:\n", irqclass);
+	print_lock_name(&buf, backwards_entry->class, "\n");
+	pr_warn("... which became %s-irq-safe at:\n", irqclass);
 
 	print_stack_trace(backwards_entry->class->usage_traces + bit1, 1);
 
 	pr_warn("\nto a %s-irq-unsafe lock:\n", irqclass);
-	print_lock_name(forwards_entry->class);
-	pr_warn("\n... which became %s-irq-unsafe at:\n", irqclass);
+	print_lock_name(&buf, forwards_entry->class, "\n");
+	pr_warn("... which became %s-irq-unsafe at:\n", irqclass);
 	pr_warn("...");
 
 	print_stack_trace(forwards_entry->class->usage_traces + bit2, 1);
@@ -1564,8 +1558,8 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 		return 0;
 	print_shortest_lock_dependencies(backwards_entry, prev_root);
 
-	pr_warn("\nthe dependencies between the lock to be acquired");
-	pr_warn(" and %s-irq-unsafe lock:\n", irqclass);
+	pr_warn("\nthe dependencies between the lock to be acquired and %s-irq-unsafe lock:\n",
+		irqclass);
 	if (!save_trace(&next_root->trace))
 		return 0;
 	print_shortest_lock_dependencies(forwards_entry, next_root);
@@ -1725,16 +1719,14 @@ static inline void inc_chains(void)
 {
 	struct lock_class *next = hlock_class(nxt);
 	struct lock_class *prev = hlock_class(prv);
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
 
 	printk(" Possible unsafe locking scenario:\n\n");
 	printk("       CPU0\n");
 	printk("       ----\n");
-	printk("  lock(");
-	__print_lock_name(prev);
-	printk(KERN_CONT ");\n");
-	printk("  lock(");
-	__print_lock_name(next);
-	printk(KERN_CONT ");\n");
+	__print_lock_name(&buf, prev, "  lock(", ");\n");
+	__print_lock_name(&buf, next, "  lock(", ");\n");
 	printk("\n *** DEADLOCK ***\n\n");
 	printk(" May be due to missing lock nesting notation\n\n");
 }
@@ -1753,9 +1745,9 @@ static inline void inc_chains(void)
 	pr_warn("--------------------------------------------\n");
 	pr_warn("%s/%d is trying to acquire lock:\n",
 		curr->comm, task_pid_nr(curr));
-	print_lock(next);
+	print_lock(NULL, next);
 	pr_warn("\nbut task is already holding lock:\n");
-	print_lock(prev);
+	print_lock(NULL, prev);
 
 	pr_warn("\nother info that might help us debug this:\n");
 	print_deadlock_scenario(next, prev);
@@ -2052,13 +2044,12 @@ static inline int get_first_held_lock(struct task_struct *curr,
 /*
  * Returns the next chain_key iteration
  */
-static u64 print_chain_key_iteration(int class_idx, u64 chain_key)
+static u64 print_chain_key_iteration(struct printk_buffer *buf, int class_idx, u64 chain_key)
 {
 	u64 new_chain_key = iterate_chain_key(chain_key, class_idx);
 
-	printk(" class_idx:%d -> chain_key:%016Lx",
-		class_idx,
-		(unsigned long long)new_chain_key);
+	buffered_printk(buf, " class_idx:%d -> chain_key:%016Lx",
+			class_idx, (unsigned long long)new_chain_key);
 	return new_chain_key;
 }
 
@@ -2069,17 +2060,19 @@ static u64 print_chain_key_iteration(int class_idx, u64 chain_key)
 	u64 chain_key = 0;
 	int depth = curr->lockdep_depth;
 	int i;
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
 
 	printk("depth: %u\n", depth + 1);
 	for (i = get_first_held_lock(curr, hlock_next); i < depth; i++) {
 		hlock = curr->held_locks + i;
-		chain_key = print_chain_key_iteration(hlock->class_idx, chain_key);
+		chain_key = print_chain_key_iteration(&buf, hlock->class_idx, chain_key);
 
-		print_lock(hlock);
+		print_lock(&buf, hlock);
 	}
 
-	print_chain_key_iteration(hlock_next->class_idx, chain_key);
-	print_lock(hlock_next);
+	print_chain_key_iteration(&buf, hlock_next->class_idx, chain_key);
+	print_lock(&buf, hlock_next);
 }
 
 static void print_chain_keys_chain(struct lock_chain *chain)
@@ -2087,14 +2080,15 @@ static void print_chain_keys_chain(struct lock_chain *chain)
 	int i;
 	u64 chain_key = 0;
 	int class_id;
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
 
 	printk("depth: %u\n", chain->depth);
 	for (i = 0; i < chain->depth; i++) {
 		class_id = chain_hlocks[chain->base + i];
-		chain_key = print_chain_key_iteration(class_id + 1, chain_key);
+		chain_key = print_chain_key_iteration(&buf, class_id + 1, chain_key);
 
-		print_lock_name(lock_classes + class_id);
-		printk("\n");
+		print_lock_name(&buf, lock_classes + class_id, "\n");
 	}
 }
 
@@ -2495,17 +2489,15 @@ static void check_chain_key(struct task_struct *curr)
 print_usage_bug_scenario(struct held_lock *lock)
 {
 	struct lock_class *class = hlock_class(lock);
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
 
 	printk(" Possible unsafe locking scenario:\n\n");
 	printk("       CPU0\n");
 	printk("       ----\n");
-	printk("  lock(");
-	__print_lock_name(class);
-	printk(KERN_CONT ");\n");
+	__print_lock_name(&buf, class, "  lock(", ");\n");
 	printk("  <Interrupt>\n");
-	printk("    lock(");
-	__print_lock_name(class);
-	printk(KERN_CONT ");\n");
+	__print_lock_name(&buf, class, "    lock(", ");\n");
 	printk("\n *** DEADLOCK ***\n\n");
 }
 
@@ -2531,7 +2523,7 @@ static void check_chain_key(struct task_struct *curr)
 		trace_softirq_context(curr), softirq_count() >> SOFTIRQ_SHIFT,
 		trace_hardirqs_enabled(curr),
 		trace_softirqs_enabled(curr));
-	print_lock(this);
+	print_lock(NULL, this);
 
 	pr_warn("{%s} state was registered at:\n", usage_str[prev_bit]);
 	print_stack_trace(hlock_class(this)->usage_traces + prev_bit, 1);
@@ -2577,6 +2569,8 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 	struct lock_list *entry = other;
 	struct lock_list *middle = NULL;
 	int depth;
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
 
 	if (!debug_locks_off_graph_unlock() || debug_locks_silent)
 		return 0;
@@ -2588,13 +2582,13 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 	pr_warn("--------------------------------------------------------\n");
 	pr_warn("%s/%d just changed the state of lock:\n",
 		curr->comm, task_pid_nr(curr));
-	print_lock(this);
+	print_lock(NULL, this);
 	if (forwards)
 		pr_warn("but this lock took another, %s-unsafe lock in the past:\n", irqclass);
 	else
 		pr_warn("but this lock was taken by another, %s-safe lock in the past:\n", irqclass);
-	print_lock_name(other->class);
-	pr_warn("\n\nand interrupts could create inverse lock ordering between them.\n\n");
+	print_lock_name(&buf, other->class, "\n\n");
+	pr_warn("and interrupts could create inverse lock ordering between them.\n\n");
 
 	pr_warn("\nother info that might help us debug this:\n");
 
@@ -3169,7 +3163,7 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 	 */
 	if (ret == 2) {
 		printk("\nmarked lock as {%s}:\n", usage_str[new_bit]);
-		print_lock(this);
+		print_lock(NULL, this);
 		print_irqtrace_events(curr);
 		dump_stack();
 	}
@@ -3264,7 +3258,7 @@ void lockdep_init_map(struct lockdep_map *lock, const char *name,
 	pr_warn("----------------------------------\n");
 
 	pr_warn("%s/%d is trying to lock:\n", curr->comm, task_pid_nr(curr));
-	print_lock(hlock);
+	print_lock(NULL, hlock);
 
 	pr_warn("\nbut this task is not holding:\n");
 	pr_warn("%s\n", hlock->nest_lock->name);
@@ -3326,10 +3320,10 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 	}
 	atomic_inc((atomic_t *)&class->ops);
 	if (very_verbose(class)) {
-		printk("\nacquire class [%px] %s", class->key, class->name);
 		if (class->name_version > 1)
-			printk(KERN_CONT "#%d", class->name_version);
-		printk(KERN_CONT "\n");
+			printk("\nacquire class [%px] %s#%d\n", class->key, class->name, class->name_version);
+		else
+			printk("\nacquire class [%px] %s\n", class->key, class->name);
 		dump_stack();
 	}
 
@@ -3465,6 +3459,9 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 print_unlock_imbalance_bug(struct task_struct *curr, struct lockdep_map *lock,
 			   unsigned long ip)
 {
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
+
 	if (!debug_locks_off())
 		return 0;
 	if (debug_locks_silent)
@@ -3475,10 +3472,9 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 	pr_warn("WARNING: bad unlock balance detected!\n");
 	print_kernel_ident();
 	pr_warn("-------------------------------------\n");
-	pr_warn("%s/%d is trying to release lock (",
-		curr->comm, task_pid_nr(curr));
-	print_lockdep_cache(lock);
-	pr_cont(") at:\n");
+	buffered_printk(&buf, KERN_WARNING "%s/%d is trying to release lock (",
+			curr->comm, task_pid_nr(curr));
+	print_lockdep_cache(&buf, lock, ") at:\n");
 	print_ip_sym(ip);
 	pr_warn("but there are no more locks to release!\n");
 	pr_warn("\nother info that might help us debug this:\n");
@@ -4026,6 +4022,9 @@ void lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie cookie)
 print_lock_contention_bug(struct task_struct *curr, struct lockdep_map *lock,
 			   unsigned long ip)
 {
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
+
 	if (!debug_locks_off())
 		return 0;
 	if (debug_locks_silent)
@@ -4036,10 +4035,9 @@ void lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie cookie)
 	pr_warn("WARNING: bad contention detected!\n");
 	print_kernel_ident();
 	pr_warn("---------------------------------\n");
-	pr_warn("%s/%d is trying to contend lock (",
-		curr->comm, task_pid_nr(curr));
-	print_lockdep_cache(lock);
-	pr_cont(") at:\n");
+	buffered_printk(&buf, KERN_WARNING "%s/%d is trying to contend lock (",
+			curr->comm, task_pid_nr(curr));
+	print_lockdep_cache(&buf, lock, ") at:\n");
 	print_ip_sym(ip);
 	pr_warn("but there are no locks held!\n");
 	pr_warn("\nother info that might help us debug this:\n");
@@ -4382,7 +4380,7 @@ void __init lockdep_info(void)
 	pr_warn("-------------------------\n");
 	pr_warn("%s/%d is freeing memory %px-%px, with a lock still held there!\n",
 		curr->comm, task_pid_nr(curr), mem_from, mem_to-1);
-	print_lock(hlock);
+	print_lock(NULL, hlock);
 	lockdep_print_held_locks(curr);
 
 	pr_warn("\nstack backtrace:\n");
-- 
1.8.3.1


* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20 13:06                                               ` Sergey Senozhatsky
  2018-06-22 13:06                                                 ` Tetsuo Handa
@ 2018-09-10 11:20                                                 ` Alexander Potapenko
  2018-09-12  6:53                                                   ` Sergey Senozhatsky
  1 sibling, 1 reply; 68+ messages in thread
From: Alexander Potapenko @ 2018-09-10 11:20 UTC (permalink / raw)
  To: Sergey Senozhatsky, Dmitriy Vyukov, penguin-kernel
  Cc: kbuild test robot, sergey.senozhatsky.work, pmladek, syzkaller,
	Steven Rostedt, LKML, Linus Torvalds, Andrew Morton

On Wed, Jun 20, 2018 at 3:06 PM Sergey Senozhatsky
<sergey.senozhatsky@gmail.com> wrote:
>
> On (06/20/18 13:32), Dmitry Vyukov wrote:
> > > So, if we could get rid of pr_cont() from the most important parts
> > > (instruction dumps, etc) then I would just vote to leave pr_cont()
> > > alone and avoid any handling of it in printk context tracking. Simply
> > > because we wouldn't care about pr_cont(). This also could simplify
> > > Tetsuo's patch significantly.
> >
> > Sounds good to me.
>
> Awesome. If you and Fengguang can combine forces and lead the
> whole thing towards "we couldn't care of pr_cont() less", it
> would be really huuuuuge. Go for it!

Sorry, folks, am I understanding right that pr_cont() and flushing the
buffer on "\n" are two separate problems that can be handled outside
Tetsuo's patchset, just assuming pr_cont() is unsupported?
Or should the pr_cont() cleanup be a prerequisite for that?

>         -ss
>



-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg


* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-10 11:20                                                 ` Alexander Potapenko
@ 2018-09-12  6:53                                                   ` Sergey Senozhatsky
  2018-09-12 16:05                                                     ` Steven Rostedt
  0 siblings, 1 reply; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-09-12  6:53 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Sergey Senozhatsky, Dmitriy Vyukov, penguin-kernel,
	kbuild test robot, sergey.senozhatsky.work, pmladek, syzkaller,
	Steven Rostedt, LKML, Linus Torvalds, Andrew Morton

On (09/10/18 13:20), Alexander Potapenko wrote:
> > Awesome. If you and Fengguang can combine forces and lead the
> > whole thing towards "we couldn't care of pr_cont() less", it
> > would be really huuuuuge. Go for it!
> 
> Sorry, folks, am I understanding right that pr_cont() and flushing the
> buffer on "\n" are two separate problems that can be handled outside
> Tetsuo's patchset, just assuming pr_cont() is unsupported?
> Or should the pr_cont() cleanup be a prerequisite for that?

Oh... Sorry. I'm quite overloaded at the moment and simply forgot about
this thread.

So what exactly is our problem with pr_cont -- it's not SMP friendly.
And this leads to various things, the most annoying of which is a
premature flush.

E.g. let me do a simple thing on my box:

ps aux | grep firefox
kill 2727

dmesg | tail
[  554.098341] Chrome_~dThread[2823]: segfault at 0 ip 00007f5df153a1f3 sp 00007f5ded47ab00 error 6 in libxul.so[7f5df1531000+4b01000]
[  554.098348] Code: e7 04 48 8d 15 a6 94 ae 03 48 89 10 c7 04 25 00 00 00 00 00 00 00 00 0f 0b 48 8b 05 57 d0 e7 04 48 8d 0d b0 94 ae 03 48 89 08 <c7> 04 25 00 00 00 00 00 00 00 00 0f 0b e8 4d f4 ff ff 48 8b 05 34
[  554.109418] Chrome_~dThread[3047]: segfault at 0 ip 00007f3d5bdba1f3 sp 00007f3d57cfab00 error 6
[  554.109421] Chrome_~dThread[3077]: segfault at 0 ip 00007fe773f661f3 sp 00007fe76fea6b00 error 6
[  554.109424]  in libxul.so[7f3d5bdb1000+4b01000]
[  554.109426]  in libxul.so[7fe773f5d000+4b01000]
[  554.109429] Code: e7 04 48 8d 15 a6 94 ae 03 48 89 10 c7 04 25 00 00 00 00 00 00 00 00 0f 0b 48 8b 05 57 d0 e7 04 48 8d 0d b0 94 ae 03 48 89 08 <c7> 04 25 00 00 00 00 00 00 00 00 0f 0b e8 4d f4 ff ff 48 8b 05 34


Even such a simple thing as "printk several lines per-crashed process"
is broken. Look at line #0 and lines #2-#5.

And this is the only problem we probably need to address. Overlapping
printk lines -- when several CPUs printk simultaneously, or the same CPU
printk-s from IRQ, etc. -- are here by design and it's not going to be
easy to change that (and maybe we shouldn't try).


Buffering multiple lines in a printk buffer does not look so simple, and
perhaps we should not try to do this either. Why:

- it's hard to decide what to do when the buffer overflows

    Switching to "normal printk" defeats the reason we do buffering in the
    first place, because "normal printk" permits overlapping. So buffering
    makes little sense if we are OK with switching to "normal printk".

- the more we buffer the more we can lose in case of panic.

    We can't flush_on_panic() printk buffers which were allocated on stack.

- flushing multiple lines should be more complex than just a simple
  printk loop

  while (1) {
     x = memchr(buf, '\n', sz);
     ...
     print("%s", buf);
     ...
  }

    Because a "printk() loop" permits lines to overlap. Hence buffering
    makes little sense, once again.
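
To make the last point a bit more concrete, here is a rough userspace
sketch of such a flush loop. All names in it (emit_fn, flush_lines,
count_bytes) are made up for illustration, and emit() merely stands in
for printk(). The point is only that N buffered lines turn into N
separate emit() calls, and another context can log between any two of
them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for printk(): receives one line per call. */
typedef void (*emit_fn)(const char *line, size_t len, void *ctx);

/*
 * Split buf[0..sz) at '\n' and hand each piece to emit().
 * Returns the number of emit() calls. Every call is a separate
 * "printk", so other output may land between any two of them.
 */
static int flush_lines(const char *buf, size_t sz, emit_fn emit, void *ctx)
{
	int calls = 0;

	assert(buf != NULL && emit != NULL);

	while (sz > 0) {
		const char *nl = memchr(buf, '\n', sz);
		/* Take the line including its '\n', or the unterminated tail. */
		size_t len = nl ? (size_t)(nl - buf) + 1 : sz;

		emit(buf, len, ctx);
		calls++;
		buf += len;
		sz -= len;
	}
	return calls;
}

/* Example sink: adds up the number of bytes it was handed. */
static void count_bytes(const char *line, size_t len, void *ctx)
{
	(void)line;
	*(size_t *)ctx += len;
}
```

Calling flush_lines() on a three-line buffer results in three emit()
calls, i.e. three independent windows for interleaving.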



So let's reduce the problem scope to "we want to have a replacement for
pr_cont()". And let's address pr_cont()'s "premature flush" issue only.


I scanned some of Linus' emails, and skimmed through previous discussions
on this topic. Let me quote Linus:

: 
: My preference as a user is actually to just have a dynamically
: re-sizable buffer (that's pretty much what I've done in *every* single
: user space project I've had in the last decade), but because some
: users might have atomicity issues I do suspect that we should just use
: a stack buffer.
: 
: And then perhaps say that the buffer size has to be capped at 80 characters.
: 
: Because if you're printing more than 80 characters and expecting it
: all to fit on a line, you're doing something else wrong anyway.
: 
: And hide it not as a explicit "char buffer[80]]" allocation, but as a
: "struct line_buffer" or similar, so that
: 
:  (a) people don't get the line size wrong
: 
:  (b) the buffering code can add a few fields for length etc in there too
: 
: Introduce a few helper functions for it:
: 
:  init_line_buffer(&buf);
:  print_line(&buf, fmt, args);
:  vprint_line(&buf, fmt, vararg);
:  finish_line(&buf);
: 



And this is, basically, what I have attached to this email. It's very
simple and very short. And I think this is what Linus wanted us to do.

- usage example

       DEFINE_PR_LINE(KERN_ERR, pl);

       pr_line(&pl, "Hello, %s!\n", "buffer");
       pr_line(&pl, "%s", "OK.\n");
       pr_line(&pl, "Goodbye, %s", "buffer");
       pr_line(&pl, "\n");

dmesg | tail

[   69.908542] Hello, buffer!
[   69.908544] OK.
[   69.908545] Goodbye, buffer


- pr_cont-like usage

       DEFINE_PR_LINE(KERN_ERR, pl);

       pr_line(&pl,"%d ", 1);
       pr_line(&pl,"%d ", 3);
       pr_line(&pl,"%d ", 5);
       pr_line(&pl,"%d ", 7);
       pr_line(&pl,"%d\n", 9);

dmesg | tail

[   69.908546] 1 3 5 7 9


- An explicit, aux buffer // output should be truncated

       char buf[16];
       DEFINE_PR_LINE_BUF(KERN_ERR, ps, buf, sizeof(buf));

       pr_line(&ps, "Test test test test test test test test test\n");
       pr_line(&ps, "\n");


dmesg | tail

[   69.908547] Test test test ** truncated **
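
To play with these semantics outside the kernel, a rough userspace
model might look like the following. This is only a sketch and not the
attached patch: struct line_buf, line_printf() and line_flush() are
made-up names, fputs() stands in for printk(), and an over-long line is
simply flushed early rather than getting a "** truncated **" marker.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define LINE_BUF_SZ 80	/* same cap as discussed above */

/* Hypothetical userspace model; not the kernel's struct pr_line. */
struct line_buf {
	char buf[LINE_BUF_SZ];
	int len;		/* bytes currently buffered, always < LINE_BUF_SZ */
	int flushed;		/* number of lines emitted so far */
	char last[LINE_BUF_SZ];	/* copy of the most recently emitted line */
};

/* Emit whatever is buffered as a single line and reset the buffer. */
static void line_flush(struct line_buf *lb)
{
	if (!lb->len)
		return;
	memcpy(lb->last, lb->buf, lb->len);
	lb->last[lb->len] = '\0';
	fputs(lb->last, stdout);	/* stand-in for a single printk() */
	lb->flushed++;
	lb->len = 0;
}

/*
 * Append a formatted fragment. The line is emitted once the buffer
 * ends in '\n', or early if the fragment did not fit (returns -1).
 */
static int line_printf(struct line_buf *lb, const char *fmt, ...)
{
	va_list ap;
	int room = LINE_BUF_SZ - lb->len;	/* >= 1, see len invariant */
	int n, truncated = 0;

	va_start(ap, fmt);
	n = vsnprintf(lb->buf + lb->len, room, fmt, ap);
	va_end(ap);

	if (n < 0)
		return -1;
	if (n >= room) {
		n = room - 1;	/* vsnprintf() kept only what fits */
		truncated = 1;
	}
	lb->len += n;

	if (truncated || (lb->len && lb->buf[lb->len - 1] == '\n'))
		line_flush(lb);
	return truncated ? -1 : 0;
}
```

With a zero-initialized struct line_buf, three calls with "%d ", "%d "
and "%d\n" produce exactly one emitted line, which mirrors the
pr_cont-like example above.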


Opinions? Will this work for us?

====

From 7fd8407e0081d8979f08dec48e88364d6210b4ab Mon Sep 17 00:00:00 2001
From: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Subject: [PATCH] printk: add pr_line buffering API

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 include/linux/printk.h | 63 ++++++++++++++++++++++++++++++++++++++++++
 kernel/printk/printk.c | 55 ++++++++++++++++++++++++++++++++++++
 2 files changed, 118 insertions(+)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index cf3eccfe1543..fc5f11c7579c 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -157,6 +157,15 @@ static inline void printk_nmi_direct_enter(void) { }
 static inline void printk_nmi_direct_exit(void) { }
 #endif /* PRINTK_NMI */
 
+#define PRINTK_PR_LINE_BUF_SZ	80
+
+struct pr_line {
+	char			*buffer;
+	int			size;
+	int			len;
+	char			*level;
+};
+
 #ifdef CONFIG_PRINTK
 asmlinkage __printf(5, 0)
 int vprintk_emit(int facility, int level,
@@ -209,6 +218,30 @@ extern asmlinkage void dump_stack(void) __cold;
 extern void printk_safe_init(void);
 extern void printk_safe_flush(void);
 extern void printk_safe_flush_on_panic(void);
+
+#define DEFINE_PR_LINE(lev, name)				\
+	char		__pr_line_buf[PRINTK_PR_LINE_BUF_SZ];	\
+	struct pr_line	name = {				\
+		.buffer = __pr_line_buf,			\
+		.size 	= PRINTK_PR_LINE_BUF_SZ,		\
+		.len 	= 0,					\
+		.level	= lev,					\
+	}
+
+#define DEFINE_PR_LINE_BUF(lev, name, buf, sz)			\
+	struct pr_line	name = {				\
+		.buffer = buf,					\
+		.size 	= (sz),					\
+		.len 	= 0,					\
+		.level	= lev,					\
+	}
+
+extern __printf(2, 3)
+int pr_line(struct pr_line *pl, const char *fmt, ...);
+extern __printf(2, 0)
+int vpr_line(struct pr_line *pl, const char *fmt, va_list args);
+extern void pr_line_flush(struct pr_line *pl);
+
 #else
 static inline __printf(1, 0)
 int vprintk(const char *s, va_list args)
@@ -284,6 +317,36 @@ static inline void printk_safe_flush(void)
 static inline void printk_safe_flush_on_panic(void)
 {
 }
+
+#define DEFINE_PR_LINE(lev, name)				\
+	struct pr_line	name = {				\
+		.buffer = NULL,					\
+		.size 	= 0,					\
+		.len 	= 0,					\
+		.level	= lev,					\
+	}
+
+#define DEFINE_PR_LINE_BUF(lev, name, buf, sz)			\
+	struct pr_line	name = {				\
+		.buffer = buf,					\
+		.size 	= 0,					\
+		.len 	= 0,					\
+		.level	= lev,					\
+	}
+
+static inline __printf(2, 3)
+int pr_line(struct pr_line *pl, const char *fmt, ...)
+{
+	return 0;
+}
+static inline __printf(2, 0)
+int vpr_line(struct pr_line *pl, const char *fmt, va_list args)
+{
+	return 0;
+}
+static inline void pr_line_flush(struct pr_line *pl)
+{
+}
 #endif
 
 extern int kptr_restrict;
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index fd6f8ed28e01..daeb41a57929 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2004,6 +2004,61 @@ asmlinkage __visible int printk(const char *fmt, ...)
 }
 EXPORT_SYMBOL(printk);
 
+#define PR_LINE_TRUNCATED_MSG "** truncated **\n"
+
+int vpr_line(struct pr_line *pl, const char *fmt, va_list args)
+{
+	int len;
+
+	if (unlikely(pl->size >= LOG_LINE_MAX))
+		pl->size = LOG_LINE_MAX - sizeof(PR_LINE_TRUNCATED_MSG);
+
+	if (fmt[0] == '\n') {
+		pr_line_flush(pl);
+		return 0;
+	}
+
+	if (pl->len >= pl->size)
+		return -1;
+
+	len = vsnprintf(pl->buffer + pl->len, pl->size - pl->len, fmt, args);
+	if (pl->len + len >= pl->size) {
+		pl->len = pl->size + 1;
+		return -1;
+	}
+
+	pl->len += len;
+	if (pl->len && pl->buffer[pl->len - 1] == '\n')
+		pr_line_flush(pl);
+	return 0;
+}
+EXPORT_SYMBOL(vpr_line);
+
+int pr_line(struct pr_line *pl, const char *fmt, ...)
+{
+	va_list ap;
+	int ret;
+
+	va_start(ap, fmt);
+	ret = vpr_line(pl, fmt, ap);
+	va_end(ap);
+	return ret;
+}
+EXPORT_SYMBOL(pr_line);
+
+void pr_line_flush(struct pr_line *pl)
+{
+	if (!pl->len)
+		return;
+
+	if (pl->len < pl->size)
+		printk("%s%.*s", pl->level, pl->len, pl->buffer);
+	else
+		printk("%s%.*s%s", pl->level, pl->len, pl->buffer,
+			PR_LINE_TRUNCATED_MSG);
+	pl->len = 0;
+}
+EXPORT_SYMBOL(pr_line_flush);
 #else /* CONFIG_PRINTK */
 
 #define LOG_LINE_MAX		0
-- 
2.19.0



* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-12  6:53                                                   ` Sergey Senozhatsky
@ 2018-09-12 16:05                                                     ` Steven Rostedt
  2018-09-13  7:12                                                       ` Sergey Senozhatsky
  0 siblings, 1 reply; 68+ messages in thread
From: Steven Rostedt @ 2018-09-12 16:05 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Alexander Potapenko, Sergey Senozhatsky, Dmitriy Vyukov,
	penguin-kernel, kbuild test robot, pmladek, syzkaller, LKML,
	Linus Torvalds, Andrew Morton

On Wed, 12 Sep 2018 15:53:07 +0900
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:

> I scanned some of Linus' emails, and skimmed through previous discussions
> on this topic. Let me quote Linus:
> 
> : 
> : My preference as a user is actually to just have a dynamically
> : re-sizable buffer (that's pretty much what I've done in *every* single
> : user space project I've had in the last decade), but because some
> : users might have atomicity issues I do suspect that we should just use
> : a stack buffer.
> : 
> : And then perhaps say that the buffer size has to be capped at 80 characters.
> : 
> : Because if you're printing more than 80 characters and expecting it
> : all to fit on a line, you're doing something else wrong anyway.
> : 
> : And hide it not as a explicit "char buffer[80]]" allocation, but as a
> : "struct line_buffer" or similar, so that
> : 
> :  (a) people don't get the line size wrong
> : 
> :  (b) the buffering code can add a few fields for length etc in there too
> : 
> : Introduce a few helper functions for it:
> : 
> :  init_line_buffer(&buf);
> :  print_line(&buf, fmt, args);
> :  vprint_line(&buf, fmt, vararg);
> :  finish_line(&buf);
> : 

This sounds like seq_buf to me.

> 
> 
> 
> And this is, basically, what I have attached to this email. It's very
> simple and very short. And I think this is what Linus wanted us to do.
> 
> - usage example
> 
>        DEFINE_PR_LINE(KERN_ERR, pl);
> 
>        pr_line(&pl, "Hello, %s!\n", "buffer");
>        pr_line(&pl, "%s", "OK.\n");
>        pr_line(&pl, "Goodbye, %s", "buffer");
>        pr_line(&pl, "\n");
> 
> dmesg | tail
> 
> [   69.908542] Hello, buffer!
> [   69.908544] OK.
> [   69.908545] Goodbye, buffer
> 
> 
> - pr_cont-like usage
> 
>        DEFINE_PR_LINE(KERN_ERR, pl);
> 
>        pr_line(&pl,"%d ", 1);
>        pr_line(&pl,"%d ", 3);
>        pr_line(&pl,"%d ", 5);
>        pr_line(&pl,"%d ", 7);
>        pr_line(&pl,"%d\n", 9);
> 
> dmesg | tail
> 
> [   69.908546] 1 3 5 7 9
> 
> 
> - An explicit, aux buffer // output should be truncated
> 
>        char buf[16];
>        DEFINE_PR_LINE_BUF(KERN_ERR, ps, buf, sizeof(buf));
> 
>        pr_line(&ps, "Test test test test test test test test test\n");
>        pr_line(&ps, "\n");
> 
> 
> dmesg | tail
> 
> [   69.908547] Test test test ** truncated **
> 
> 
> Opinions? Will this work for us?
> 
> ====
> 
> From 7fd8407e0081d8979f08dec48e88364d6210b4ab Mon Sep 17 00:00:00 2001
> From: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Subject: [PATCH] printk: add pr_line buffering API
> 
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> ---
>  include/linux/printk.h | 63 ++++++++++++++++++++++++++++++++++++++++++
>  kernel/printk/printk.c | 55 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 118 insertions(+)
> 
> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index cf3eccfe1543..fc5f11c7579c 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -157,6 +157,15 @@ static inline void printk_nmi_direct_enter(void) { }
>  static inline void printk_nmi_direct_exit(void) { }
>  #endif /* PRINTK_NMI */
>  
> +#define PRINTK_PR_LINE_BUF_SZ	80
> +
> +struct pr_line {
> +	char			*buffer;
> +	int			size;
> +	int			len;
> +	char			*level;
> +};

Can you look at implementing this with using a seq_buf?

-- Steve

> +
>  #ifdef CONFIG_PRINTK
>  asmlinkage __printf(5, 0)
>  int vprintk_emit(int facility, int level,
> @@ -209,6 +218,30 @@ extern asmlinkage void dump_stack(void) __cold;
>  extern void printk_safe_init(void);
>  extern void printk_safe_flush(void);
>  extern void printk_safe_flush_on_panic(void);
> +
> +#define DEFINE_PR_LINE(lev, name)				\
> +	char		__pr_line_buf[PRINTK_PR_LINE_BUF_SZ];	\
> +	struct pr_line	name = {				\
> +		.buffer = __pr_line_buf,			\
> +		.size 	= PRINTK_PR_LINE_BUF_SZ,		\
> +		.len 	= 0,					\
> +		.level	= lev,					\
> +	}
> +
> +#define DEFINE_PR_LINE_BUF(lev, name, buf, sz)			\
> +	struct pr_line	name = {				\
> +		.buffer = buf,					\
> +		.size 	= (sz),					\
> +		.len 	= 0,					\
> +		.level	= lev,					\
> +	}
> +
> +extern __printf(2, 3)
> +int pr_line(struct pr_line *pl, const char *fmt, ...);
> +extern __printf(2, 0)
> +int vpr_line(struct pr_line *pl, const char *fmt, va_list args);
> +extern void pr_line_flush(struct pr_line *pl);
> +
>  #else
>  static inline __printf(1, 0)
>  int vprintk(const char *s, va_list args)
> @@ -284,6 +317,36 @@ static inline void printk_safe_flush(void)
>  static inline void printk_safe_flush_on_panic(void)
>  {
>  }
> +
> +#define DEFINE_PR_LINE(lev, name)				\
> +	struct pr_line	name = {				\
> +		.buffer = NULL,					\
> +		.size 	= 0,					\
> +		.len 	= 0,					\
> +		.level	= lev,					\
> +	}
> +
> +#define DEFINE_PR_LINE_BUF(lev, name, buf, sz)			\
> +	struct pr_line	name = {				\
> +		.buffer = buf,					\
> +		.size 	= 0,					\
> +		.len 	= 0,					\
> +		.level	= lev,					\
> +	}
> +
> +static inline __printf(2, 3)
> +int pr_line(struct pr_line *pl, const char *fmt, ...)
> +{
> +	return 0;
> +}
> +static inline __printf(2, 0)
> +int vpr_line(struct pr_line *pl, const char *fmt, va_list args)
> +{
> +	return 0;
> +}
> +static inline void pr_line_flush(struct pr_line *pl)
> +{
> +}
>  #endif
>  
>  extern int kptr_restrict;
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index fd6f8ed28e01..daeb41a57929 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2004,6 +2004,61 @@ asmlinkage __visible int printk(const char *fmt, ...)
>  }
>  EXPORT_SYMBOL(printk);
>  
> +#define PR_LINE_TRUNCATED_MSG "** truncated **\n"
> +
> +int vpr_line(struct pr_line *pl, const char *fmt, va_list args)
> +{
> +	int len;
> +
> +	if (unlikely(pl->size >= LOG_LINE_MAX))
> +		pl->size = LOG_LINE_MAX - sizeof(PR_LINE_TRUNCATED_MSG);
> +
> +	if (fmt[0] == '\n') {
> +		pr_line_flush(pl);
> +		return 0;
> +	}
> +
> +	if (pl->len >= pl->size)
> +		return -1;
> +
> +	len = vsnprintf(pl->buffer + pl->len, pl->size - pl->len, fmt, args);
> +	if (pl->len + len >= pl->size) {
> +		pl->len = pl->size + 1;
> +		return -1;
> +	}
> +
> +	pl->len += len;
> +	if (pl->len && pl->buffer[pl->len - 1] == '\n')
> +		pr_line_flush(pl);
> +	return 0;
> +}
> +EXPORT_SYMBOL(vpr_line);
> +
> +int pr_line(struct pr_line *pl, const char *fmt, ...)
> +{
> +	va_list ap;
> +	int ret;
> +
> +	va_start(ap, fmt);
> +	ret = vpr_line(pl, fmt, ap);
> +	va_end(ap);
> +	return ret;
> +}
> +EXPORT_SYMBOL(pr_line);
> +
> +void pr_line_flush(struct pr_line *pl)
> +{
> +	if (!pl->len)
> +		return;
> +
> +	if (pl->len < pl->size)
> +		printk("%s%.*s", pl->level, pl->len, pl->buffer);
> +	else
> +		printk("%s%.*s%s", pl->level, pl->len, pl->buffer,
> +			PR_LINE_TRUNCATED_MSG);
> +	pl->len = 0;
> +}
> +EXPORT_SYMBOL(pr_line_flush);
>  #else /* CONFIG_PRINTK */
>  
>  #define LOG_LINE_MAX		0



* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-12 16:05                                                     ` Steven Rostedt
@ 2018-09-13  7:12                                                       ` Sergey Senozhatsky
  2018-09-13 12:26                                                         ` Petr Mladek
  2018-09-14  1:12                                                         ` Steven Rostedt
  0 siblings, 2 replies; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-09-13  7:12 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Sergey Senozhatsky, Alexander Potapenko, Sergey Senozhatsky,
	Dmitriy Vyukov, penguin-kernel, kbuild test robot, pmladek,
	syzkaller, LKML, Linus Torvalds, Andrew Morton

Hi, Steven

On (09/12/18 12:05), Steven Rostedt wrote:
> > : Introduce a few helper functions for it:
> > : 
> > :  init_line_buffer(&buf);
> > :  print_line(&buf, fmt, args);
> > :  vprint_line(&buf, fmt, vararg);
> > :  finish_line(&buf);
> > : 
> 
> This sounds like seq_buf to me.

Correct.

> > +struct pr_line {
> > +	char			*buffer;
> > +	int			size;
> > +	int			len;
> > +	char			*level;
> > +};
> 
> Can you look at implementing this with using a seq_buf?

Certainly, attached.

It doesn't seem to save us that much code, tho. It looks smaller just
because I dropped "truncated" print out and didn't include !CONFIG_PRINTK
noise this time around. And the OK thing about previous version was that
it didn't introduce any new dependencies to printk.

Making pr_line available via printk.h -- #include seq_buf.h in printk.h -- at
a glance looks like some fun. printk.h gets included very early, before
we have all the stuff that seq_buf.h wants. We can remove fs.h from
seq_buf.h and add a bunch of forward declarations for path and seq_file,
but all those BUG_ON/WARN_ON/etc. are another story (unless we want every
pr_line user to include seq_buf.h).

... maybe I can change API. But I sort of like that implicit buffer case:

	DEFINE_PR_LINE(KERN_ERR, pl);

	pr_line(&pl, "Hello, ");
	pr_line(&pl, "%s.\n", "Steven");

And, looking at potential users of pr_line, I'd say that we'd better
have DEFINE_PR_LINE_BUF, because some of them do print messages longer
than 80 chars.

===

From: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Subject: [PATCH] lib/seq_buf: add pr_line buffering API

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 include/linux/seq_buf.h | 35 +++++++++++++++++++++++++++++++
 lib/seq_buf.c           | 46 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 81 insertions(+)

diff --git a/include/linux/seq_buf.h b/include/linux/seq_buf.h
index aa5deb041c25..5e9a5ff9a440 100644
--- a/include/linux/seq_buf.h
+++ b/include/linux/seq_buf.h
@@ -23,6 +23,36 @@ struct seq_buf {
 	loff_t			readpos;
 };
 
+#define __SEQ_BUF_INITIALIZER(buf, length) {				\
+	.buffer			= (buf),				\
+	.size			= (length),				\
+	.len			= 0,					\
+	.readpos		= 0, }
+
+#ifdef CONFIG_PRINTK
+#define __PR_LINE_BUF_SZ	80
+#else
+#define __PR_LINE_BUF_SZ	0
+#endif
+
+struct pr_line {
+	struct seq_buf		sb;
+	char			*level;
+};
+
+#define DEFINE_PR_LINE(lev, name)					\
+	char		__line[__PR_LINE_BUF_SZ];			\
+	struct pr_line	name = {					\
+		.sb = __SEQ_BUF_INITIALIZER(__line, __PR_LINE_BUF_SZ),	\
+		.level	= lev,						\
+	}
+
+#define DEFINE_PR_LINE_BUF(lev, name, buf, sz)				\
+	struct pr_line	name = {					\
+		.sb = __SEQ_BUF_INITIALIZER(buf, (sz)),		\
+		.level	= lev,						\
+	}
+
 static inline void seq_buf_clear(struct seq_buf *s)
 {
 	s->len = 0;
@@ -131,4 +161,9 @@ extern int
 seq_buf_bprintf(struct seq_buf *s, const char *fmt, const u32 *binary);
 #endif
 
+extern __printf(2, 0)
+int vpr_line(struct pr_line *pl, const char *fmt, va_list args);
+extern __printf(2, 3)
+int pr_line(struct pr_line *pl, const char *fmt, ...);
+extern void pr_line_flush(struct pr_line *pl);
 #endif /* _LINUX_SEQ_BUF_H */
diff --git a/lib/seq_buf.c b/lib/seq_buf.c
index 11f2ae0f9099..29bc4f24b83e 100644
--- a/lib/seq_buf.c
+++ b/lib/seq_buf.c
@@ -324,3 +324,49 @@ int seq_buf_to_user(struct seq_buf *s, char __user *ubuf, int cnt)
 	s->readpos += cnt;
 	return cnt;
 }
+
+int vpr_line(struct pr_line *pl, const char *fmt, va_list args)
+{
+	struct seq_buf *s = &pl->sb;
+	int ret, len;
+
+	if (fmt[0] == '\n') {
+		pr_line_flush(pl);
+		return 0;
+	}
+
+	ret = seq_buf_vprintf(s, fmt, args);
+
+	len = seq_buf_used(s);
+	if (len && s->buffer[len - 1] == '\n')
+		pr_line_flush(pl);
+
+	return ret;
+}
+EXPORT_SYMBOL(vpr_line);
+
+int pr_line(struct pr_line *pl, const char *fmt, ...)
+{
+	va_list ap;
+	int ret;
+
+	va_start(ap, fmt);
+	ret = vpr_line(pl, fmt, ap);
+	va_end(ap);
+
+	return ret;
+}
+EXPORT_SYMBOL(pr_line);
+
+void pr_line_flush(struct pr_line *pl)
+{
+	struct seq_buf *s = &pl->sb;
+	int len = seq_buf_used(s);
+
+	if (!len)
+		return;
+
+	printk("%s%.*s", pl->level, len, s->buffer);
+	seq_buf_clear(s);
+}
+EXPORT_SYMBOL(pr_line_flush);
-- 
2.19.0



* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-13  7:12                                                       ` Sergey Senozhatsky
@ 2018-09-13 12:26                                                         ` Petr Mladek
  2018-09-13 14:28                                                           ` Sergey Senozhatsky
  2018-09-14  1:12                                                         ` Steven Rostedt
  1 sibling, 1 reply; 68+ messages in thread
From: Petr Mladek @ 2018-09-13 12:26 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Steven Rostedt, Alexander Potapenko, Sergey Senozhatsky,
	Dmitriy Vyukov, penguin-kernel, kbuild test robot, syzkaller,
	LKML, Linus Torvalds, Andrew Morton

On Thu 2018-09-13 16:12:54, Sergey Senozhatsky wrote:
> On (09/12/18 12:05), Steven Rostedt wrote:
> > > : Introduce a few helper functions for it:
> > > : 
> > > :  init_line_buffer(&buf);
> > > :  print_line(&buf, fmt, args);
> > > :  vprint_line(&buf, fmt, vararg);
> > > :  finish_line(&buf);
> > > : 
> > 
> --- a/lib/seq_buf.c
> +++ b/lib/seq_buf.c
> @@ -324,3 +324,49 @@ int seq_buf_to_user(struct seq_buf *s, char __user *ubuf, int cnt)
>  	s->readpos += cnt;
>  	return cnt;
>  }
> +
> +int vpr_line(struct pr_line *pl, const char *fmt, va_list args)
> +{
> +	struct seq_buf *s = &pl->sb;
> +	int ret, len;
> +
> +	if (fmt[0] == '\n') {
> +		pr_line_flush(pl);
> +		return 0;
> +	}

You would need to check if fmt[1] == '\0'. But then you would need
to be careful about a possible buffer overflow. I would personally
avoid this optimization.


> +	ret = seq_buf_vprintf(s, fmt, args);
> +
> +	len = seq_buf_used(s);
> +	if (len && s->buffer[len - 1] == '\n')
> +		pr_line_flush(pl);

This would mean that pr_line_flush() is not strictly needed.
Also it would encourage people to use this feature in a more
complicated way (for more lines). Do we really want this?


In general, I like this approach more than any attempts to handle
continuous lines transparently. The other attempts were much more
complicated or were not reliable.

Best Regards,
Petr


* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-13 12:26                                                         ` Petr Mladek
@ 2018-09-13 14:28                                                           ` Sergey Senozhatsky
  2018-09-14  1:22                                                             ` Steven Rostedt
  2018-09-14  6:57                                                             ` Sergey Senozhatsky
  0 siblings, 2 replies; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-09-13 14:28 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Alexander Potapenko,
	Sergey Senozhatsky, Dmitriy Vyukov, penguin-kernel,
	kbuild test robot, syzkaller, LKML, Linus Torvalds,
	Andrew Morton

On (09/13/18 14:26), Petr Mladek wrote:
> > +
> > +int vpr_line(struct pr_line *pl, const char *fmt, va_list args)
> > +{
> > +	struct seq_buf *s = &pl->sb;
> > +	int ret, len;
> > +
> > +	if (fmt[0] == '\n') {
> > +		pr_line_flush(pl);
> > +		return 0;
> > +	}
> 
> You would need to check if fmt[1] == '\0'. But then you would need
> to be careful about a possible buffer overflow. I would personally
> avoid this optimization.

Good call. It was a fast path for pr_cont("\n").
But it made me wonder, so I did some grepping:

arch/m68k/kernel/traps.c:                               pr_cont("\n       ");
arch/m68k/kernel/traps.c:                       pr_cont("\n       ");
kernel/trace/ftrace.c:          pr_cont("\n expected tramp: %lx\n", ip);

Lovely.
It will take us some time.
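
For reference, the stricter check Petr describes would be something
like the sketch below (userspace C; fmt_is_bare_newline() is a made-up
name, not an existing kernel helper). It treats only the exact string
"\n" as a flush request, so the formats from the grep above would fall
through to normal formatting instead of being silently dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: true only when fmt is exactly "\n".
 * fmt[1] is read only after fmt[0] was found to be '\n', so the
 * check never looks past the terminating NUL.
 */
static bool fmt_is_bare_newline(const char *fmt)
{
	assert(fmt != NULL);
	return fmt[0] == '\n' && fmt[1] == '\0';
}
```

Whether such a fast path is worth keeping at all is a separate
question; the sketch only shows that the check itself is cheap.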

> > +	ret = seq_buf_vprintf(s, fmt, args);
> > +
> > +	len = seq_buf_used(s);
> > +	if (len && s->buffer[len - 1] == '\n')
> > +		pr_line_flush(pl);
> 
> This would mean that pr_line_flush() is not strictly needed.
> Also it would encourage people to use this feature in a more
> complicated way (for more lines). Do we really want this?

Not that I see any problems with pr_line_flush(). But can drop it, sure.
pr_line() is a replacement for pr_cont() and as such it's not for multi-line
buffering.

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-13  7:12                                                       ` Sergey Senozhatsky
  2018-09-13 12:26                                                         ` Petr Mladek
@ 2018-09-14  1:12                                                         ` Steven Rostedt
  2018-09-14  1:55                                                           ` Sergey Senozhatsky
  1 sibling, 1 reply; 68+ messages in thread
From: Steven Rostedt @ 2018-09-14  1:12 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Alexander Potapenko, Sergey Senozhatsky, Dmitriy Vyukov,
	penguin-kernel, kbuild test robot, pmladek, syzkaller, LKML,
	Linus Torvalds, Andrew Morton

On Thu, 13 Sep 2018 16:12:54 +0900
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> ---
>  include/linux/seq_buf.h | 35 +++++++++++++++++++++++++++++++
>  lib/seq_buf.c           | 46 +++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 81 insertions(+)
> 
> diff --git a/include/linux/seq_buf.h b/include/linux/seq_buf.h
> index aa5deb041c25..5e9a5ff9a440 100644
> --- a/include/linux/seq_buf.h
> +++ b/include/linux/seq_buf.h
> @@ -23,6 +23,36 @@ struct seq_buf {
>  	loff_t			readpos;
>  };
>  
> +#define __SEQ_BUF_INITIALIZER(buf, length) {				\
> +	.buffer			= (buf),				\
> +	.size			= (length),				\
> +	.len			= 0,					\
> +	.readpos		= 0, }

Nit, but the end bracket '}' should be on its own line. Even when
part of a macro.

> +
> +#ifdef CONFIG_PRINTK
> +#define __PR_LINE_BUF_SZ	80
> +#else
> +#define __PR_LINE_BUF_SZ	0
> +#endif
> +
> +struct pr_line {
> +	struct seq_buf		sb;
> +	char			*level;
> +};
> +
> +#define DEFINE_PR_LINE(lev, name)					\
> +	char		__line[__PR_LINE_BUF_SZ];			\

To protect against name space collision could you use:

	char		__line_##name[__PR_LINE_BUF_SZ];
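
A small userspace sketch of why the pasted name helps. Everything here
is illustrative: struct line, DEFINE_LINE() and lines_are_distinct()
are made-up names, and the array is called line_store_##name because
identifiers starting with a double underscore are reserved outside the
kernel.

```c
#include <assert.h>

#define LINE_SZ 80

/* Hypothetical cut-down version of the line descriptor. */
struct line {
	char *buffer;
	int size;
	int len;
};

/*
 * The backing array is named after the variable via token pasting,
 * so two lines can be defined in one scope without a clash.
 */
#define DEFINE_LINE(name)					\
	char line_store_##name[LINE_SZ];			\
	struct line name = {					\
		.buffer	= line_store_##name,			\
		.size	= LINE_SZ,				\
		.len	= 0,					\
	}

/* Returns 1 when both lines own distinct backing arrays. */
static int lines_are_distinct(void)
{
	DEFINE_LINE(a);
	DEFINE_LINE(b);	/* a fixed array name would be redeclared here */

	assert(a.len == 0 && b.len == 0);
	return a.buffer != b.buffer &&
	       a.size == LINE_SZ && b.size == LINE_SZ;
}
```

With a fixed array name, the second DEFINE_LINE() in the same scope
would fail to compile with a redeclaration error.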

> +	struct pr_line	name = {					\
> +		.sb = __SEQ_BUF_INITIALIZER(__line, __PR_LINE_BUF_SZ),	\
> +		.level	= lev,						\
> +	}
> +
> +#define DEFINE_PR_LINE_BUF(lev, name, buf, sz)				\
> +	struct pr_line	name = {					\
> +		.sb = __SEQ_BUF_INITIALIZER(buf, (sz)),		\
> +		.level	= lev,						\
> +	}
> +
>  static inline void seq_buf_clear(struct seq_buf *s)
>  {
>  	s->len = 0;
> @@ -131,4 +161,9 @@ extern int
>  seq_buf_bprintf(struct seq_buf *s, const char *fmt, const u32 *binary);
>  #endif
>  
> +extern __printf(2, 0)
> +int vpr_line(struct pr_line *pl, const char *fmt, va_list args);
> +extern __printf(2, 3)
> +int pr_line(struct pr_line *pl, const char *fmt, ...);
> +extern void pr_line_flush(struct pr_line *pl);
>  #endif /* _LINUX_SEQ_BUF_H */
> diff --git a/lib/seq_buf.c b/lib/seq_buf.c
> index 11f2ae0f9099..29bc4f24b83e 100644
> --- a/lib/seq_buf.c
> +++ b/lib/seq_buf.c
> @@ -324,3 +324,49 @@ int seq_buf_to_user(struct seq_buf *s, char __user *ubuf, int cnt)
>  	s->readpos += cnt;
>  	return cnt;
>  }
> +
> +int vpr_line(struct pr_line *pl, const char *fmt, va_list args)
> +{
> +	struct seq_buf *s = &pl->sb;
> +	int ret, len;
> +
> +	if (fmt[0] == '\n') {
> +		pr_line_flush(pl);
> +		return 0;
> +	}
> +
> +	ret = seq_buf_vprintf(s, fmt, args);
> +
> +	len = seq_buf_used(s);
> +	if (len && s->buffer[len - 1] == '\n')
> +		pr_line_flush(pl);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(vpr_line);
> +
> +int pr_line(struct pr_line *pl, const char *fmt, ...)
> +{
> +	va_list ap;
> +	int ret;
> +
> +	va_start(ap, fmt);
> +	ret = vpr_line(pl, fmt, ap);
> +	va_end(ap);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(pr_line);
> +
> +void pr_line_flush(struct pr_line *pl)
> +{
> +	struct seq_buf *s = &pl->sb;
> +	int len = seq_buf_used(s);
> +
> +	if (!len)
> +		return;
> +
> +	printk("%s%.*s", pl->level, len, s->buffer);
> +	seq_buf_clear(s);
> +}
> +EXPORT_SYMBOL(pr_line_flush);

The rest looks fine to me.

Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

-- Steve


* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-13 14:28                                                           ` Sergey Senozhatsky
@ 2018-09-14  1:22                                                             ` Steven Rostedt
  2018-09-14  2:15                                                               ` Sergey Senozhatsky
  2018-09-14  6:57                                                             ` Sergey Senozhatsky
  1 sibling, 1 reply; 68+ messages in thread
From: Steven Rostedt @ 2018-09-14  1:22 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Sergey Senozhatsky, Alexander Potapenko,
	Dmitriy Vyukov, penguin-kernel, kbuild test robot, syzkaller,
	LKML, Linus Torvalds, Andrew Morton

On Thu, 13 Sep 2018 23:28:02 +0900
Sergey Senozhatsky <sergey.senozhatsky@gmail.com> wrote:

> Good call. It was a fast path for pr_cont("\n").
> But it made me wonder, so I did some grepping
> 

[..]

> kernel/trace/ftrace.c:          pr_cont("\n expected tramp: %lx\n", ip);

Note, looking at the history of that, I was just combining a lone "\n"
with the next string. The code before this print adds info to the line
depending on the input, so none of those prints end with a "\n". The
"expected tramp" part goes on the next line, but I'm fine if you want to
break this up. This print is very unlikely to run while other prints are
happening. It happens when modifying (serially) ftrace nops to calls or
back to nops.

Feel free to send a patch that breaks it up into:

	pr_cont("\n");
	pr_info(" expected tramp: %lx\n", ip);

-- Steve


* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-14  1:12                                                         ` Steven Rostedt
@ 2018-09-14  1:55                                                           ` Sergey Senozhatsky
  0 siblings, 0 replies; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-09-14  1:55 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Sergey Senozhatsky, Alexander Potapenko, Sergey Senozhatsky,
	Dmitriy Vyukov, penguin-kernel, kbuild test robot, pmladek,
	syzkaller, LKML, Linus Torvalds, Andrew Morton

On (09/13/18 21:12), Steven Rostedt wrote:
> >  
> > +#define __SEQ_BUF_INITIALIZER(buf, length) {				\
> > +	.buffer			= (buf),				\
> > +	.size			= (length),				\
> > +	.len			= 0,					\
> > +	.readpos		= 0, }
> 
> Nit, but the end bracket '}' should be on its own line, even when
> part of a macro.

No prob, will change.

I thought about putting it on its own line, but then checked
include/linux/wait.h - __WAITQUEUE_INITIALIZER and
__WAIT_QUEUE_HEAD_INITIALIZER.

> > +#define DEFINE_PR_LINE(lev, name)					\
> > +	char		__line[__PR_LINE_BUF_SZ];			\
> 
> To protect against name space collision could you use:
> 
> 	char		__line_##name[__PR_LINE_BUF_SZ];

Yes.

> The rest looks fine to me.
> 
> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Thanks.

Just, to make sure, we are OK with seq_buf dependency and want
anyone who wants to use pr_line to include linux/seq_buf.h?

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-14  1:22                                                             ` Steven Rostedt
@ 2018-09-14  2:15                                                               ` Sergey Senozhatsky
  0 siblings, 0 replies; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-09-14  2:15 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Sergey Senozhatsky, Petr Mladek, Sergey Senozhatsky,
	Alexander Potapenko, Dmitriy Vyukov, penguin-kernel,
	kbuild test robot, syzkaller, LKML, Linus Torvalds,
	Andrew Morton

On (09/13/18 21:22), Steven Rostedt wrote:
> > Good call. It was a fast path for pr_cont("\n").
> > But it made me wonder, so I did some grepping
> > 
> 
> [..]
> 
> > kernel/trace/ftrace.c:          pr_cont("\n expected tramp: %lx\n", ip);
> 
> Note, looking at the history of that, I was just combining a lone "\n"
> with the next string. The code before this print adds info to the line
> depending on the input, so none of those prints end with a "\n". The
> "expected tramp" part goes on the next line, but I'm fine if you want to
> break this up. This print is very unlikely to run while other prints are
> happening. It happens when modifying (serially) ftrace nops to calls or
> back to nops.
> 
> Feel free to send a patch that breaks it up into:
> 
> 	pr_cont("\n");
> 	pr_info(" expected tramp: %lx\n", ip);

I didn't mean to criticize anyone with my "Lovely" comment. Sorry if it
appeared to sound harsh.

I'm fine with the way it is, but we *probably* (up to you) will touch
this code once pr_line is available. As of now, the fewer pr_cont() calls
we make the better. This

	pr_cont("a");
	pr_cont("b");
	pr_cont("c\n");

in the worst case can be log_store-d as 3 log entries (2 preliminary
flushes). So, from this point of view, this

	pr_cont("ab");
	pr_cont("c\n");

is better, because it can be log_store-d as 2 log entries.
And with pr_line() we can log_store it in 1 log entry [but we will
use some extra stack space for that].
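
To make the entry counting above concrete, here is a minimal userspace
sketch of the line-buffering idea. It is not the kernel code: struct
line_buf, line_append(), emit_record() and records_emitted are hypothetical
names, and emit_record() merely stands in for whatever finally stores a
record in the log.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define LINE_BUF_SZ 80

/* Hypothetical userspace stand-in for a pr_line-style buffer. */
struct line_buf {
	char buf[LINE_BUF_SZ];
	size_t len;		/* bytes currently held in buf[] */
};

/* Number of records handed to the "log" so far (assumption: one per flush). */
static int records_emitted;
static char last_record[LINE_BUF_SZ];

/* Stand-in for storing one log record; remembers the last one for inspection. */
static void emit_record(const char *text, size_t len)
{
	memcpy(last_record, text, len);
	last_record[len] = '\0';
	records_emitted++;
	fwrite(text, 1, len, stdout);
}

/*
 * Append a formatted fragment. A record is emitted only when the buffered
 * text ends with '\n'. Returns 0 on success, -1 if the fragment did not fit.
 */
static int line_append(struct line_buf *lb, const char *fmt, ...)
{
	size_t room = sizeof(lb->buf) - lb->len;
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(lb->buf + lb->len, room, fmt, ap);
	va_end(ap);

	if (n < 0 || (size_t)n >= room) {
		lb->buf[lb->len] = '\0';	/* discard the truncated fragment */
		return -1;
	}
	lb->len += (size_t)n;

	if (lb->len && lb->buf[lb->len - 1] == '\n') {
		emit_record(lb->buf, lb->len);
		lb->len = 0;
	}
	return 0;
}
```

Feeding it "a", "b" and "c\n" through three line_append() calls emits a
single record containing "abc\n", at the cost of the 80-byte buffer living
wherever the caller put the struct (typically the stack).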

Overall, I counted around 100 cases of printk("\n...."), and around 20+ cases
of pr_cont("\n...") and probably around 10 or 15 printk(KERN_CONT "\n....")
cases. That's what I meant when I said that converting it to pr_line()
will take us some time. Especially given that some of the lockdep developers
have really warm feelings toward printk ;)

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-13 14:28                                                           ` Sergey Senozhatsky
  2018-09-14  1:22                                                             ` Steven Rostedt
@ 2018-09-14  6:57                                                             ` Sergey Senozhatsky
  2018-09-14 10:37                                                               ` Tetsuo Handa
  1 sibling, 1 reply; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-09-14  6:57 UTC (permalink / raw)
  To: Petr Mladek, Steven Rostedt
  Cc: Alexander Potapenko, Dmitriy Vyukov, penguin-kernel,
	kbuild test robot, syzkaller, LKML, Linus Torvalds,
	Andrew Morton, Sergey Senozhatsky, Sergey Senozhatsky

On (09/13/18 23:28), Sergey Senozhatsky wrote:
> Not that I see any problems with pr_line_flush(). But can drop it, sure.
> pr_line() is a replacement for pr_cont() and as such it's not for multi-line
> buffering.

OK, attached.
Let me know if anything needs to be improved (including broken English).
Will we keep it in the printk tree, or shall I send a formal patch to Andrew?

===

From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Subject: [PATCH] lib/seq_buf: add pr_line buffering API

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 include/linux/kern_levels.h |  3 ++
 include/linux/seq_buf.h     | 60 +++++++++++++++++++++++++++++++++++++
 lib/seq_buf.c               | 57 +++++++++++++++++++++++++++++++++++
 3 files changed, 120 insertions(+)

diff --git a/include/linux/kern_levels.h b/include/linux/kern_levels.h
index d237fe854ad9..9c281ac745b3 100644
--- a/include/linux/kern_levels.h
+++ b/include/linux/kern_levels.h
@@ -20,6 +20,9 @@
  * Annotation for a "continued" line of log printout (only done after a
  * line that had no enclosing \n). Only to be used by core/arch code
  * during early bootup (a continued line is not SMP-safe otherwise).
+ *
+ * Please consider pr_line()/vpr_line() functions for SMP-safe continued
+ * line printing.
  */
 #define KERN_CONT	KERN_SOH "c"
 
diff --git a/include/linux/seq_buf.h b/include/linux/seq_buf.h
index aa5deb041c25..b33aeea14803 100644
--- a/include/linux/seq_buf.h
+++ b/include/linux/seq_buf.h
@@ -23,6 +23,62 @@ struct seq_buf {
 	loff_t			readpos;
 };
 
+#define __SEQ_BUF_INITIALIZER(buf, length)			\
+{								\
+	.buffer			= (buf),			\
+	.size			= (length),			\
+	.len			= 0,				\
+	.readpos		= 0,				\
+}
+
+#ifdef CONFIG_PRINTK
+#define __PR_LINE_BUF_SZ	80
+#else
+#define __PR_LINE_BUF_SZ	0
+#endif
+
+/**
+ * pr_line - printk() line buffer structure
+ * @sb:	underlying seq buffer, which holds the data
+ * @level:	printk() log level (KERN_ERR, etc.)
+ */
+struct pr_line {
+	struct seq_buf		sb;
+	char			*level;
+};
+
+/**
+ * DEFINE_PR_LINE - define a new pr_line variable
+ * @lev:	printk() log level
+ * @name:	variable name
+ *
+ * Defines a new pr_line varialbe, which would use an implicit
+ * stack buffer of size __PR_LINE_BUF_SZ.
+ */
+#define DEFINE_PR_LINE(lev, name)				\
+	char		__line_##name[__PR_LINE_BUF_SZ];	\
+	struct pr_line	name = {				\
+		.sb	= __SEQ_BUF_INITIALIZER(__line_##name,	\
+					__PR_LINE_BUF_SZ),	\
+		.level	= lev,					\
+	}
+
+/**
+ * DEFINE_PR_LINE_BUF - define a new pr_line variable
+ * @lev:	printk() log level
+ * @name:	variable name
+ * @buf:	external buffer
+ * @sz:	external buffer size
+ *
+ * Defines a new pr_line variable, which would use an external
+ * buffer for printk line.
+ */
+#define DEFINE_PR_LINE_BUF(lev, name, buf, sz)			\
+	struct pr_line	name = {				\
+		.sb	= __SEQ_BUF_INITIALIZER(buf, (sz)),	\
+		.level	= lev,					\
+	}
+
 static inline void seq_buf_clear(struct seq_buf *s)
 {
 	s->len = 0;
@@ -131,4 +187,8 @@ extern int
 seq_buf_bprintf(struct seq_buf *s, const char *fmt, const u32 *binary);
 #endif
 
+extern __printf(2, 0)
+int vpr_line(struct pr_line *pl, const char *fmt, va_list args);
+extern __printf(2, 3)
+int pr_line(struct pr_line *pl, const char *fmt, ...);
 #endif /* _LINUX_SEQ_BUF_H */
diff --git a/lib/seq_buf.c b/lib/seq_buf.c
index 11f2ae0f9099..fada7623f168 100644
--- a/lib/seq_buf.c
+++ b/lib/seq_buf.c
@@ -324,3 +324,60 @@ int seq_buf_to_user(struct seq_buf *s, char __user *ubuf, int cnt)
 	s->readpos += cnt;
 	return cnt;
 }
+
+/**
+ * vpr_line - Append data to the printk() line buffer
+ * @pl: the pr_line descriptor
+ * @fmt: printf format string
+ * @args: va_list of arguments from a printf() type function
+ *
+ * Writes a vnprintf() format into the printk() pr_line buffer.
+ * Terminating new-line symbol flushes (prints) the buffer.
+ *
+ * Unlike pr_cont() and printk(KERN_CONT), this function is SMP-safe
+ * and shall be used for continued line printing.
+ *
+ * Returns zero on success, -1 on overflow.
+ */
+int vpr_line(struct pr_line *pl, const char *fmt, va_list args)
+{
+	struct seq_buf *s = &pl->sb;
+	int ret, len;
+
+	ret = seq_buf_vprintf(s, fmt, args);
+
+	len = seq_buf_used(s);
+	if (len && s->buffer[len - 1] == '\n') {
+		printk("%s%.*s", pl->level ? : KERN_DEFAULT, len, s->buffer);
+		seq_buf_clear(s);
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL(vpr_line);
+
+/**
+ * pr_line - Append data to the printk() line buffer
+ * @pl: the pr_line descriptor
+ * @fmt: printf format string
+ *
+ * Writes a printf() format into the printk() pr_line buffer.
+ * Terminating new-line symbol flushes (prints) the buffer.
+ *
+ * Unlike pr_cont() and printk(KERN_CONT), this function is SMP-safe
+ * and shall be used for continued line printing.
+ *
+ * Returns zero on success, -1 on overflow.
+ */
+int pr_line(struct pr_line *pl, const char *fmt, ...)
+{
+	va_list ap;
+	int ret;
+
+	va_start(ap, fmt);
+	ret = vpr_line(pl, fmt, ap);
+	va_end(ap);
+
+	return ret;
+}
+EXPORT_SYMBOL(pr_line);
-- 
2.19.0



* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-14  6:57                                                             ` Sergey Senozhatsky
@ 2018-09-14 10:37                                                               ` Tetsuo Handa
  2018-09-14 11:50                                                                 ` Sergey Senozhatsky
  0 siblings, 1 reply; 68+ messages in thread
From: Tetsuo Handa @ 2018-09-14 10:37 UTC (permalink / raw)
  To: Sergey Senozhatsky, Petr Mladek, Steven Rostedt
  Cc: Alexander Potapenko, Dmitriy Vyukov, kbuild test robot,
	syzkaller, LKML, Linus Torvalds, Andrew Morton,
	Sergey Senozhatsky

On 2018/09/14 15:57, Sergey Senozhatsky wrote:
> On (09/13/18 23:28), Sergey Senozhatsky wrote:
>> Not that I see any problems with pr_line_flush(). But can drop it, sure.
>> pr_line() is a replacement for pr_cont() and as such it's not for multi-line
>> buffering.
> 
> OK, attached.
> Let me know if anything needs to be improved (including broken English).
> Will we keep it in the printk tree, or shall I send a formal patch to Andrew?



> @@ -20,6 +20,9 @@
>   * Annotation for a "continued" line of log printout (only done after a
>   * line that had no enclosing \n). Only to be used by core/arch code
>   * during early bootup (a continued line is not SMP-safe otherwise).
> + *
> + * Please consider pr_line()/vpr_line() functions for SMP-safe continued
> + * line printing.

I think the advantage is not limited to SMP-safeness. Reducing the frequency of
calling printk() will reduce overhead. Also, latency for netconsole will be
reduced by sending a whole line in one printk().



> +/**
> + * DEFINE_PR_LINE - define a new pr_line variable
> + * @lev:	printk() log level
> + * @name:	variable name
> + *
> + * Defines a new pr_line varialbe, which would use an implicit

s/varialbe/variable/ .

> + * stack buffer of size __PR_LINE_BUF_SZ.
> + */
> +#define DEFINE_PR_LINE(lev, name)				\
> +	char		__line_##name[__PR_LINE_BUF_SZ];	\
> +	struct pr_line	name = {				\
> +		.sb	= __SEQ_BUF_INITIALIZER(__line_##name,	\
> +					__PR_LINE_BUF_SZ),	\
> +		.level	= lev,					\
> +	}

Do we want a note that

  static DEFINE_PR_LINE(lev, name);

won't make the "name" variable "static"?
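
A small userspace sketch of why that happens, assuming a hypothetical
DEFINE_PAIR() macro that, like DEFINE_PR_LINE(), expands to two
declarations. The storage-class specifier written in front of the macro
attaches only to the first declaration:

```c
/* Hypothetical two-declaration macro shaped like DEFINE_PR_LINE(). */
#define DEFINE_PAIR(name)		\
	int first_##name = 0;		\
	int name = 0

static int bump(void)
{
	/* Expands to: static int first_x = 0; int x = 0; */
	static DEFINE_PAIR(x);

	first_x++;	/* static storage: the value survives across calls */
	x++;		/* automatic storage: re-initialized on every call */

	return first_x * 10 + x;
}
```

Calling bump() twice returns 11 and then 21: only first_x keeps its value,
which mirrors the backing char array becoming static while "name" stays an
automatic variable.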



> +/**
> + * DEFINE_PR_LINE_BUF - define a new pr_line variable
> + * @lev:	printk() log level
> + * @name:	variable name
> + * @buf:	external buffer
> + * @sz:	external buffer size
> + *
> + * Defines a new pr_line variable, which would use an external
> + * buffer for printk line.
> + */
> +#define DEFINE_PR_LINE_BUF(lev, name, buf, sz)			\
> +	struct pr_line	name = {				\
> +		.sb	= __SEQ_BUF_INITIALIZER(buf, (sz)),	\
> +		.level	= lev,					\
> +	}
> +

I would use this one for the OOM killer. 80 bytes is too short.

  static char oom_print_buf[1024];
  DEFINE_PR_LINE_BUF(level, oom_print_buf);



> @@ -131,4 +187,8 @@ extern int
>  seq_buf_bprintf(struct seq_buf *s, const char *fmt, const u32 *binary);
>  #endif
>  
> +extern __printf(2, 0)
> +int vpr_line(struct pr_line *pl, const char *fmt, va_list args);
> +extern __printf(2, 3)
> +int pr_line(struct pr_line *pl, const char *fmt, ...);

Do we want to mark "asmlinkage" like printk() ?

> @@ -324,3 +324,60 @@ int seq_buf_to_user(struct seq_buf *s, char __user *ubuf, int cnt)
>  	s->readpos += cnt;
>  	return cnt;
>  }
> +
> +/**
> + * vpr_line - Append data to the printk() line buffer
> + * @pl: the pr_line descriptor

s/descriptor/structure/ ?

> + * @fmt: printf format string
> + * @args: va_list of arguments from a printf() type function
> + *
> + * Writes a vnprintf() format into the printk() pr_line buffer.

s/vnprintf/vprintf/ ?



* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-14 10:37                                                               ` Tetsuo Handa
@ 2018-09-14 11:50                                                                 ` Sergey Senozhatsky
  2018-09-14 12:03                                                                   ` Tetsuo Handa
  0 siblings, 1 reply; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-09-14 11:50 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Petr Mladek, Steven Rostedt,
	Alexander Potapenko, Dmitriy Vyukov, kbuild test robot,
	syzkaller, LKML, Linus Torvalds, Andrew Morton,
	Sergey Senozhatsky

On (09/14/18 19:37), Tetsuo Handa wrote:
> > @@ -20,6 +20,9 @@
> >   * Annotation for a "continued" line of log printout (only done after a
> >   * line that had no enclosing \n). Only to be used by core/arch code
> >   * during early bootup (a continued line is not SMP-safe otherwise).
> > + *
> > + * Please consider pr_line()/vpr_line() functions for SMP-safe continued
> > + * line printing.
> 
> I think the advantage is not limited to SMP-safeness. Reducing the frequency of
> calling printk() will reduce overhead. Also, latency for netconsole will be
> reduced by sending a whole line in one printk().

Hmm. These are very good points, indeed. But do we want to list all
advantages here? I just wanted to mention SMP-unsafe pr_cont/printk(KERN_CONT),
because I also mention pr_line in kern_levels.h.

> > + * Defines a new pr_line varialbe, which would use an implicit
> 
> s/varialbe/variable/ .

Thanks.

> > +#define DEFINE_PR_LINE(lev, name)				\
> > +	char		__line_##name[__PR_LINE_BUF_SZ];	\
> > +	struct pr_line	name = {				\
> > +		.sb	= __SEQ_BUF_INITIALIZER(__line_##name,	\
> > +					__PR_LINE_BUF_SZ),	\
> > +		.level	= lev,					\
> > +	}
> 
> Want a note that
> 
>   static DEFINE_PR_LINE(lev, name);
> 
> won't make "name" variable "static" ?

Interesting point. Any hint what the comment should look like?
Do we want to have static pr_line buffers?

> > +#define DEFINE_PR_LINE_BUF(lev, name, buf, sz)			\
> > +	struct pr_line	name = {				\
> > +		.sb	= __SEQ_BUF_INITIALIZER(buf, (sz)),	\
> > +		.level	= lev,					\
> > +	}
> > +
> 
> I would use this one for the OOM killer. 80 bytes is too short.

80 bytes is quite short for OOM, agreed.

>   static char oom_print_buf[1024];
>   DEFINE_PR_LINE_BUF(level, oom_print_buf);

Do I get it right that you suggest to drop the "size" param?
Do OOM people agree on 1024 bytes stack usage?


> > @@ -131,4 +187,8 @@ extern int
> >  seq_buf_bprintf(struct seq_buf *s, const char *fmt, const u32 *binary);
> >  #endif
> >  
> > +extern __printf(2, 0)
> > +int vpr_line(struct pr_line *pl, const char *fmt, va_list args);
> > +extern __printf(2, 3)
> > +int pr_line(struct pr_line *pl, const char *fmt, ...);
> 
> Do we want to mark "asmlinkage" like printk() ?

Dunno, do we? Does code written in assembly call pr_cont that often?
We are not turning pr_line() into syscall anyway.

> > @@ -324,3 +324,60 @@ int seq_buf_to_user(struct seq_buf *s, char __user *ubuf, int cnt)
> >  	s->readpos += cnt;
> >  	return cnt;
> >  }
> > +
> > +/**
> > + * vpr_line - Append data to the printk() line buffer
> > + * @pl: the pr_line descriptor
> 
> s/descriptor/structure/ ?

Yeah, I used the term "descriptor", just because it's used in seq_buf.c.
So, it's sort of common in seq_buf.
E.g.
   seq_buf_vprintf(), seq_buf_print_seq(), seq_buf_can_fit() and so on.

> > + * @fmt: printf format string
> > + * @args: va_list of arguments from a printf() type function
> > + *
> > + * Writes a vnprintf() format into the printk() pr_line buffer.
> 
> s/vnprintf/vprintf/ ?

Indeed.
We also need to fix a typo in seq_buf_vprintf() comment then.

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-14 11:50                                                                 ` Sergey Senozhatsky
@ 2018-09-14 12:03                                                                   ` Tetsuo Handa
  2018-09-14 12:22                                                                     ` Sergey Senozhatsky
  0 siblings, 1 reply; 68+ messages in thread
From: Tetsuo Handa @ 2018-09-14 12:03 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Petr Mladek, Steven Rostedt,
	Alexander Potapenko, Dmitriy Vyukov, kbuild test robot,
	syzkaller, LKML, Linus Torvalds, Andrew Morton

On 2018/09/14 20:50, Sergey Senozhatsky wrote:
>>> +#define DEFINE_PR_LINE_BUF(lev, name, buf, sz)			\
>>> +	struct pr_line	name = {				\
>>> +		.sb	= __SEQ_BUF_INITIALIZER(buf, (sz)),	\
>>> +		.level	= lev,					\
>>> +	}
>>> +
>>
>> I would use this one for the OOM killer. 80 bytes is too short.
> 
> 80 bytes is quite short for OOM, agreed.
> 
>>   static char oom_print_buf[1024];
>>   DEFINE_PR_LINE_BUF(level, oom_print_buf);
> 
> Do I get it right that you suggest to drop the "size" param?

No. I just forgot to add params. ;-)

> Do OOM people agree on 1024 bytes stack usage?

I won't allocate oom_print_buf on the stack. Since its usage is serialized
by oom_lock mutex, we don't need to allocate from stack. Since memory
allocation request might happen when stack is already tight, we should not
try to allocate much from stack.


* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-14 12:03                                                                   ` Tetsuo Handa
@ 2018-09-14 12:22                                                                     ` Sergey Senozhatsky
  2018-09-19 11:02                                                                       ` Tetsuo Handa
  0 siblings, 1 reply; 68+ messages in thread
From: Sergey Senozhatsky @ 2018-09-14 12:22 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Petr Mladek,
	Steven Rostedt, Alexander Potapenko, Dmitriy Vyukov,
	kbuild test robot, syzkaller, LKML, Linus Torvalds,
	Andrew Morton

On (09/14/18 21:03), Tetsuo Handa wrote:
> > 80 bytes is quite short for OOM, agreed.
> > 
> >>   static char oom_print_buf[1024];
> >>   DEFINE_PR_LINE_BUF(level, oom_print_buf);
> > 
> > Do I get it right that you suggest to drop the "size" param?
> 
> No. I just forgot to add params. ;-)
> 
> > Do OOM people agree on 1024 bytes stack usage?
> 
> I won't allocate oom_print_buf on the stack. Since its usage is serialized
> by oom_lock mutex, we don't need to allocate from stack. Since memory
> allocation request might happen when stack is already tight, we should not
> try to allocate much from stack.

... by "OOM people" I meant "MM people".
"MM people" is a subset of "OOM people".

OK, so I didn't notice the "static" part of the `oom_print_buf'.
I need some rest, I guess.

The "SMP-safe" comment becomes a bit tricky when pr_line is used with a
static buffer. Either we need to require synchronization - umm... and
document it - or to provide some means of synchronization in pr_line().
Let's think about what the pr_line API should do about it.

Any thoughts?

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-14 12:22                                                                     ` Sergey Senozhatsky
@ 2018-09-19 11:02                                                                       ` Tetsuo Handa
  2018-09-24  8:11                                                                         ` Tetsuo Handa
  0 siblings, 1 reply; 68+ messages in thread
From: Tetsuo Handa @ 2018-09-19 11:02 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Petr Mladek, Steven Rostedt,
	Alexander Potapenko, Dmitriy Vyukov, kbuild test robot,
	syzkaller, LKML, Linus Torvalds, Andrew Morton

On 2018/09/14 21:22, Sergey Senozhatsky wrote:
> The "SMP-safe" comment becomes a bit tricky when pr_line is used with a
> static buffer. Either we need to require synchronization - umm... and
> document it - or to provide some means of synchronization in pr_line().
> Let's think what pr_line API should do about it.
> 
> Any thoughts?
> 

I'm inclined to propose a simple approach, shown below, which is similar to
just having several "struct cont" instances for concurrent printk() users.
Linus commented that implicit context is bad; the approach below uses
explicit context.
Once almost all users are converted to it, we might be able to get rid of
KERN_CONT support.



From d5e0e422142ced2b7097040e96ba7c5528a460db Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Wed, 19 Sep 2018 14:39:07 +0900
Subject: [PATCH v2] printk: Add best-effort printk() buffering.

Sometimes we want to printk() a line without being disturbed by concurrent
printk() from interrupts and/or other threads. For example, interleaved
printk() output from multiple threads' dumps is hard to interpret.

Assuming that we will move towards adding a context identifier to each
line of printk() output (so that we can group multiple lines into one
block when parsing), this patch introduces functions that use fixed-size
statically allocated buffers to line-buffer printk() output on a best
effort basis (i.e. up to LOG_LINE_MAX bytes, up to 16 concurrent printk()
users).

If there happen to be more than 16 concurrent printk() users, the existing
printk() is used for users that failed to get a buffer. Of course, if
there were more than 16 concurrent printk() users, the printk() output
would flood the console and the system would already be unusable (e.g.
the RCU lockup or hung task watchdog would fire in such a situation). Thus,
I think that 16 buffers should be sufficient.

Five functions (get_printk_buffer(), buffered_vprintk(), buffered_printk(),
flush_printk_buffer() and put_printk_buffer()) are provided for printk()
buffering.

  get_printk_buffer() tries to assign a "struct printk_buffer".

  buffered_vprintk()/buffered_printk() try to use line-buffered printk()
  by holding an incomplete line in "struct printk_buffer".

  flush_printk_buffer() flushes the "struct printk_buffer".

  put_printk_buffer() flushes and releases the "struct printk_buffer".

put_printk_buffer() must match the corresponding get_printk_buffer(), just
as rcu_read_unlock() must match the corresponding rcu_read_lock().

These functions are safe to call from any context, because they merely
wrap printk()/vprintk() calls in order to minimize the possibility of
using "struct cont" by managing 16 buffers outside of the logbuf_lock
spinlock. Thus, any caller can be updated to use these functions.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 include/linux/printk.h |  28 +++++++++
 kernel/printk/printk.c | 160 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 188 insertions(+)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index cf3eccf..889491b 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -157,6 +157,7 @@ static inline void printk_nmi_direct_enter(void) { }
 static inline void printk_nmi_direct_exit(void) { }
 #endif /* PRINTK_NMI */
 
+struct printk_buffer;
 #ifdef CONFIG_PRINTK
 asmlinkage __printf(5, 0)
 int vprintk_emit(int facility, int level,
@@ -173,6 +174,13 @@ int printk_emit(int facility, int level,
 
 asmlinkage __printf(1, 2) __cold
 int printk(const char *fmt, ...);
+struct printk_buffer *get_printk_buffer(void);
+void flush_printk_buffer(struct printk_buffer *ptr);
+__printf(2, 3)
+int buffered_printk(struct printk_buffer *ptr, const char *fmt, ...);
+asmlinkage __printf(2, 0)
+int buffered_vprintk(struct printk_buffer *ptr, const char *fmt, va_list args);
+void put_printk_buffer(struct printk_buffer *ptr);
 
 /*
  * Special printk facility for scheduler/timekeeping use only, _DO_NOT_USE_ !
@@ -220,6 +228,26 @@ int printk(const char *s, ...)
 {
 	return 0;
 }
+static inline struct printk_buffer *get_printk_buffer(void)
+{
+	return NULL;
+}
+static inline __printf(2, 3)
+int buffered_printk(struct printk_buffer *ptr, const char *fmt, ...)
+{
+	return 0;
+}
+static inline __printf(2, 0)
+int buffered_vprintk(struct printk_buffer *ptr, const char *fmt, va_list args)
+{
+	return 0;
+}
+static inline void flush_printk_buffer(struct printk_buffer *ptr)
+{
+}
+static inline void put_printk_buffer(struct printk_buffer *ptr)
+{
+}
 static inline __printf(1, 2) __cold
 int printk_deferred(const char *s, ...)
 {
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 9bf5404..c9e9f5d 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1949,6 +1949,166 @@ asmlinkage int printk_emit(int facility, int level,
 }
 EXPORT_SYMBOL(printk_emit);
 
+struct printk_buffer {
+	unsigned short int used; /* Valid bytes in buf[]. */
+	char buf[LOG_LINE_MAX];
+	bool in_use;
+} __aligned(1024);
+#define MAX_PRINTK_BUFFERS 16
+static struct printk_buffer printk_buffers[MAX_PRINTK_BUFFERS];
+
+/**
+ * get_printk_buffer - Try to get printk_buffer.
+ *
+ * Returns pointer to "struct printk_buffer" on success, NULL otherwise.
+ *
+ * If this function returned "struct printk_buffer", the caller is responsible
+ * for passing it to put_printk_buffer() so that "struct printk_buffer" can be
+ * reused in the future.
+ *
+ * Even if this function returned NULL, the caller does not need to check for
+ * NULL, for passing NULL to buffered_printk() simply acts like normal printk()
+ * and passing NULL to flush_printk_buffer()/put_printk_buffer() is a no-op.
+ */
+struct printk_buffer *get_printk_buffer(void)
+{
+	unsigned short int i;
+
+	for (i = 0; i < MAX_PRINTK_BUFFERS; i++) {
+		struct printk_buffer *ptr = &printk_buffers[i];
+
+		if (ptr->in_use || cmpxchg(&ptr->in_use, false, true))
+			continue;
+		ptr->used = 0;
+		return ptr;
+	}
+	return NULL;
+}
+EXPORT_SYMBOL(get_printk_buffer);
+
+/**
+ * buffered_vprintk - Try to vprintk() in line buffered mode.
+ *
+ * @ptr:  Pointer to "struct printk_buffer". It can be NULL.
+ * @fmt:  printk() format string.
+ * @args: va_list structure.
+ *
+ * Returns the return value of vprintk().
+ *
+ * Try to store to @ptr first. If it fails, flush @ptr and then try to store to
+ * @ptr again. If it still fails, use unbuffered printing.
+ */
+int buffered_vprintk(struct printk_buffer *ptr, const char *fmt, va_list args)
+{
+	va_list tmp_args;
+	unsigned short int i;
+	int r;
+
+	if (!ptr)
+		goto unbuffered;
+	for (i = 0; i < 2; i++) {
+		unsigned int pos = ptr->used;
+		char *text = ptr->buf + pos;
+
+		va_copy(tmp_args, args);
+		r = vsnprintf(text, sizeof(ptr->buf) - pos, fmt, tmp_args);
+		va_end(tmp_args);
+		if (r + pos < sizeof(ptr->buf)) {
+			/*
+			 * Eliminate KERN_CONT at this point because we can
+			 * concatenate incomplete lines inside printk_buffer.
+			 */
+			if (r >= 2 && printk_get_level(text) == 'c') {
+				memmove(text, text + 2, r - 2);
+				ptr->used += r - 2;
+			} else {
+				ptr->used += r;
+			}
+			/* Flush already completed lines if any. */
+			while (1) {
+				char *cp = memchr(ptr->buf, '\n', ptr->used);
+
+				if (!cp)
+					break;
+				*cp = '\0';
+				printk("%s\n", ptr->buf);
+				i = cp - ptr->buf + 1;
+				ptr->used -= i;
+				memmove(ptr->buf, ptr->buf + i, ptr->used);
+			}
+			return r;
+		}
+		if (i)
+			break;
+		flush_printk_buffer(ptr);
+	}
+ unbuffered:
+	return vprintk(fmt, args);
+}
+
+/**
+ * buffered_printk - Try to printk() in line buffered mode.
+ *
+ * @ptr:  Pointer to "struct printk_buffer". It can be NULL.
+ * @fmt:  printk() format string, followed by arguments.
+ *
+ * Returns the return value of printk().
+ *
+ * Try to store to @ptr first. If it fails, flush @ptr and then try to store to
+ * @ptr again. If it still fails, use unbuffered printing.
+ */
+int buffered_printk(struct printk_buffer *ptr, const char *fmt, ...)
+{
+	va_list args;
+	int r;
+
+	va_start(args, fmt);
+	r = buffered_vprintk(ptr, fmt, args);
+	va_end(args);
+	return r;
+}
+EXPORT_SYMBOL(buffered_printk);
+
+/**
+ * flush_printk_buffer - Flush incomplete line in printk_buffer.
+ *
+ * @ptr: Pointer to "struct printk_buffer". It can be NULL.
+ *
+ * Returns nothing.
+ *
+ * Flush if @ptr contains partial data. But usually there is no need to call
+ * this function because @ptr is flushed by put_printk_buffer().
+ */
+void flush_printk_buffer(struct printk_buffer *ptr)
+{
+	if (!ptr || !ptr->used)
+		return;
+	/* buffered_vprintk() keeps 0 <= ptr->used < sizeof(ptr->buf) true. */
+	ptr->buf[ptr->used] = '\0';
+	printk("%s", ptr->buf);
+	ptr->used = 0;
+}
+EXPORT_SYMBOL(flush_printk_buffer);
+
+/**
+ * put_printk_buffer - Release printk_buffer.
+ *
+ * @ptr: Pointer to "struct printk_buffer". It can be NULL.
+ *
+ * Returns nothing.
+ *
+ * Flush and release @ptr.
+ */
+void put_printk_buffer(struct printk_buffer *ptr)
+{
+	if (!ptr)
+		return;
+	if (ptr->used)
+		flush_printk_buffer(ptr);
+	xchg(&ptr->in_use, false);
+}
+EXPORT_SYMBOL(put_printk_buffer);
+
 int vprintk_default(const char *fmt, va_list args)
 {
 	int r;
-- 
1.8.3.1



An example user of these functions, which would mitigate interleaved output like
https://syzkaller.appspot.com/text?tag=CrashReport&x=13368fda400000 , is shown below.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 89d2a2a..44bbb96 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4689,10 +4689,10 @@ unsigned long nr_free_pagecache_pages(void)
 	return nr_free_zone_pages(gfp_zone(GFP_HIGHUSER_MOVABLE));
 }
 
-static inline void show_node(struct zone *zone)
+static inline void show_node(struct printk_buffer *buf, struct zone *zone)
 {
 	if (IS_ENABLED(CONFIG_NUMA))
-		printk("Node %d ", zone_to_nid(zone));
+		buffered_printk(buf, "Node %d ", zone_to_nid(zone));
 }
 
 long si_mem_available(void)
@@ -4814,7 +4814,7 @@ static bool show_mem_node_skip(unsigned int flags, int nid, nodemask_t *nodemask
 
 #define K(x) ((x) << (PAGE_SHIFT-10))
 
-static void show_migration_types(unsigned char type)
+static void show_migration_types(struct printk_buffer *buf, unsigned char type)
 {
 	static const char types[MIGRATE_TYPES] = {
 		[MIGRATE_UNMOVABLE]	= 'U',
@@ -4838,7 +4838,7 @@ static void show_migration_types(unsigned char type)
 	}
 
 	*p = '\0';
-	printk(KERN_CONT "(%s) ", tmp);
+	buffered_printk(buf, "(%s) ", tmp);
 }
 
 /*
@@ -4856,6 +4856,7 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 	int cpu;
 	struct zone *zone;
 	pg_data_t *pgdat;
+	struct printk_buffer *buf = get_printk_buffer();
 
 	for_each_populated_zone(zone) {
 		if (show_mem_node_skip(filter, zone_to_nid(zone), nodemask))
@@ -4950,8 +4951,8 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 		for_each_online_cpu(cpu)
 			free_pcp += per_cpu_ptr(zone->pageset, cpu)->pcp.count;
 
-		show_node(zone);
-		printk(KERN_CONT
+		show_node(buf, zone);
+		buffered_printk(buf,
 			"%s"
 			" free:%lukB"
 			" min:%lukB"
@@ -4993,10 +4994,10 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 			K(free_pcp),
 			K(this_cpu_read(zone->pageset->pcp.count)),
 			K(zone_page_state(zone, NR_FREE_CMA_PAGES)));
-		printk("lowmem_reserve[]:");
+		buffered_printk(buf, "lowmem_reserve[]:");
 		for (i = 0; i < MAX_NR_ZONES; i++)
-			printk(KERN_CONT " %ld", zone->lowmem_reserve[i]);
-		printk(KERN_CONT "\n");
+			buffered_printk(buf, " %ld", zone->lowmem_reserve[i]);
+		buffered_printk(buf, "\n");
 	}
 
 	for_each_populated_zone(zone) {
@@ -5006,8 +5007,8 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 
 		if (show_mem_node_skip(filter, zone_to_nid(zone), nodemask))
 			continue;
-		show_node(zone);
-		printk(KERN_CONT "%s: ", zone->name);
+		show_node(buf, zone);
+		buffered_printk(buf, "%s: ", zone->name);
 
 		spin_lock_irqsave(&zone->lock, flags);
 		for (order = 0; order < MAX_ORDER; order++) {
@@ -5025,13 +5026,14 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 		}
 		spin_unlock_irqrestore(&zone->lock, flags);
 		for (order = 0; order < MAX_ORDER; order++) {
-			printk(KERN_CONT "%lu*%lukB ",
+			buffered_printk(buf, "%lu*%lukB ",
 			       nr[order], K(1UL) << order);
 			if (nr[order])
-				show_migration_types(types[order]);
+				show_migration_types(buf, types[order]);
 		}
-		printk(KERN_CONT "= %lukB\n", K(total));
+		buffered_printk(buf, "= %lukB\n", K(total));
 	}
+	put_printk_buffer(buf);
 
 	hugetlb_show_meminfo();
 

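For readers who want to try the line-buffering idea outside the kernel, here
is a minimal userspace sketch of what the conversion above relies on. It is
not the code from the patch: struct line_buf, lb_printf(), lb_flush(), the
emit() sink and the 64-byte buffer size are hypothetical names and values
chosen for illustration, and KERN_CONT stripping is left out.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace stand-in for "struct printk_buffer". */
struct line_buf {
	size_t used;		/* bytes currently held in buf */
	char buf[64];		/* assumed size, for illustration only */
};

/* Stand-in for printk(): appends to a capture array so output can be checked. */
static char sink[512];
static int sink_calls;

static void emit(const char *s)
{
	strncat(sink, s, sizeof(sink) - strlen(sink) - 1);
	sink_calls++;
}

/* Emit whatever partial line is pending. NULL is a no-op. */
void lb_flush(struct line_buf *b)
{
	if (!b || !b->used)
		return;
	/* lb_printf() keeps used < sizeof(buf), so this index is in bounds. */
	b->buf[b->used] = '\0';
	emit(b->buf);
	b->used = 0;
}

/* Append formatted text; emit every completed line as one unit. */
int lb_printf(struct line_buf *b, const char *fmt, ...)
{
	char tmp[256];	/* sketch simplification: fallback truncates here */
	va_list args;
	int r;

	for (int attempt = 0; b && attempt < 2; attempt++) {
		size_t room = sizeof(b->buf) - b->used;
		char *nl;

		va_start(args, fmt);
		r = vsnprintf(b->buf + b->used, room, fmt, args);
		va_end(args);
		if (r >= 0 && (size_t)r < room) {
			b->used += (size_t)r;
			/* Flush already completed lines, keep the remainder. */
			while ((nl = memchr(b->buf, '\n', b->used)) != NULL) {
				size_t len = (size_t)(nl - b->buf) + 1;

				memcpy(tmp, b->buf, len);
				tmp[len] = '\0';
				emit(tmp);
				b->used -= len;
				memmove(b->buf, b->buf + len, b->used);
			}
			return r;
		}
		if (attempt == 0)
			lb_flush(b);	/* make room, then retry once */
	}
	/* No buffer, or the text does not fit even when empty: unbuffered. */
	va_start(args, fmt);
	r = vsnprintf(tmp, sizeof(tmp), fmt, args);
	va_end(args);
	emit(tmp);
	return r;
}
```

With this sketch, a sequence such as lb_printf(&b, "Node %d ", 0) followed by
lb_printf(&b, "free:%dkB\n", 128) reaches the sink as a single call carrying
the whole line, which is the property that keeps concurrent writers from
interleaving fragments.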

* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-19 11:02                                                                       ` Tetsuo Handa
@ 2018-09-24  8:11                                                                         ` Tetsuo Handa
  0 siblings, 0 replies; 68+ messages in thread
From: Tetsuo Handa @ 2018-09-24  8:11 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Petr Mladek, Steven Rostedt,
	Alexander Potapenko, Dmitriy Vyukov, kbuild test robot,
	syzkaller, LKML, Linus Torvalds, Andrew Morton

On 2018/09/19 20:02, Tetsuo Handa wrote:
> On 2018/09/14 21:22, Sergey Senozhatsky wrote:
>> The "SMP-safe" comment becomes a bit tricky when pr_line is used with a
>> static buffer. Either we need to require synchronization - umm... and
>> document it - or to provide some means of synchronization in pr_line().
>> Let's think what pr_line API should do about it.
>>
>> Any thoughts?
>>
> 
> I'm inclined to propose the simple one shown below, similar to just having
> several "struct cont" for concurrent printk() users.
> Linus has commented that implicit context is bad, and the one below
> uses explicit context.
> Once almost all users are converted to the one below, we might be able
> to get rid of KERN_CONT support.

The reason for using statically preallocated global buffers is that I think
it is inconvenient for KERN_CONT users to calculate the necessary bytes
only to avoid message truncation. The pr_line might be passed deep into
the call chain, and adjusting the buffer size whenever the content's possible
max length changes is as painful as changing printk() to accept only
one "const char *" argument. Even if we guarantee that any context can
allocate the buffer from the kernel stack, we cannot guarantee that many
concurrent printk() callers won't trigger a lockup. Thus, I think that trying
to allocate from finite static buffers, with a fallback to unbuffered printk()
upon failure, is sufficient.
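
The "finite static buffers with a fallback" policy described above can be
sketched in userspace C roughly as follows. This is an illustration only,
assuming C11 atomics in place of the kernel's cmpxchg()/xchg(); struct
line_buffer, get_line_buffer(), put_line_buffer() and the pool size of 4
are hypothetical, not the API from the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BUFFERS 4	/* hypothetical pool size */

struct line_buffer {
	atomic_bool in_use;	/* claimed by exactly one caller at a time */
	size_t used;
	char buf[256];
};

/* Static storage: every slot starts zeroed, i.e. not in use. */
static struct line_buffer pool[MAX_BUFFERS];

/* Claim a free slot, or return NULL so the caller prints unbuffered. */
struct line_buffer *get_line_buffer(void)
{
	for (size_t i = 0; i < MAX_BUFFERS; i++) {
		bool expected = false;

		/* Succeeds only if in_use was false; sets it to true atomically. */
		if (atomic_compare_exchange_strong(&pool[i].in_use,
						   &expected, true)) {
			pool[i].used = 0;
			return &pool[i];
		}
	}
	return NULL;
}

/* Release a slot. Passing NULL is a no-op, mirroring the fallback path. */
void put_line_buffer(struct line_buffer *ptr)
{
	if (!ptr)
		return;
	atomic_store(&ptr->in_use, false);
}
```

A caller that gets NULL back simply prints unbuffered, so exhausting the
pool degrades output quality but never blocks and never allocates memory.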



By the way, the kbuild test robot told me that I forgot to drop the asmlinkage keyword.

 include/linux/printk.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index 889491b..3347442 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -178,7 +178,7 @@ asmlinkage __printf(1, 2) __cold
 void flush_printk_buffer(struct printk_buffer *ptr);
 __printf(2, 3)
 int buffered_printk(struct printk_buffer *ptr, const char *fmt, ...);
-asmlinkage __printf(2, 0)
+__printf(2, 0)
 int buffered_vprintk(struct printk_buffer *ptr, const char *fmt, va_list args);
 void put_printk_buffer(struct printk_buffer *ptr);
 




