LKML Archive on lore.kernel.org
* Re: printk feature for syzbot?
       [not found] ` <CACT4Y+boyw_Qy=y-iTnsKZrtTgF0Hk3nHN_xtqUdX4etgiYDQw@mail.gmail.com>
@ 2018-04-24  1:33   ` Sergey Senozhatsky
  2018-04-24 14:40     ` Steven Rostedt
  2018-04-26 10:06     ` Petr Mladek
  0 siblings, 2 replies; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-04-24  1:33 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Tetsuo Handa, Sergey Senozhatsky, syzkaller, Petr Mladek,
	Steven Rostedt, Fengguang Wu, linux-kernel

Let me Cc Petr, Steven and Fengguang on this

On (04/23/18 15:40), Dmitry Vyukov wrote:
> On Mon, Apr 23, 2018 at 3:33 PM, Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
> > Hello, Sergey.
> >
> > Recently I have been fixing bugs reported by syzbot ( https://syzkaller.appspot.com/ ).
> >
> > Since syzbot frequently floods printk() (e.g. with memory allocation fault
> > injection), it is always difficult to tell which line comes from which event.
> >
> > I wish printk() could prefix each line with a context identifier.
> > If I recall correctly, you are using some extra output for debugging, aren't you?
> 
> +syzkaller mailing list for history
> 
> Hi Tetsuo, Sergey,
> 
> Something like TID prefix would be useful. Potentially it would allow
> us to untangle multiple intermixed crash reports.

Hello,

Yes, Tetsuo, we use a bunch of "printk prefix" extensions at Samsung.
For instance, we prefix printk messages with the CPU number: messages
sometimes mix up, we also see partial pr_cont flushes, and so on.
Grepping serial logs by CPU number is quite powerful.

Upstreaming those printk prefixes can be a bit challenging, but maybe
it's not all that bad. I personally think that syzbot, and build-test
bots in general [like 0day], are helpful indeed, and I don't see why life
should be any more complex for syzbot/0day guys. If printk prefixes can
help - then we probably should consider such an extension.

The main argument from the upstream is that tweaking struct printk_log
breaks user space (tools like crash, and so on). But I guess we can do
something about it. E.g. put a PRINTK_CONTEXT_TRACKING_PREFIX kconfig
option somewhere in "Kernel hacking"->"printk and dmesg options" and
make it available only for DEBUG kernels, or something similar.

Petr, Steven, Fengguang, what do you think? Do you have any objections?
Ideas?

	-ss

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: printk feature for syzbot?
  2018-04-24  1:33   ` printk feature for syzbot? Sergey Senozhatsky
@ 2018-04-24 14:40     ` Steven Rostedt
  2018-04-26 10:06     ` Petr Mladek
  1 sibling, 0 replies; 94+ messages in thread
From: Steven Rostedt @ 2018-04-24 14:40 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Dmitry Vyukov, Tetsuo Handa, Sergey Senozhatsky, syzkaller,
	Petr Mladek, Fengguang Wu, linux-kernel

On Tue, 24 Apr 2018 10:33:36 +0900
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:

> Petr, Steven, Fengguang, what do you think? Do you have any objections?
> Ideas?

If it can be turned off by a config option, I'm fine with it.

-- Steve

* Re: printk feature for syzbot?
  2018-04-24  1:33   ` printk feature for syzbot? Sergey Senozhatsky
  2018-04-24 14:40     ` Steven Rostedt
@ 2018-04-26 10:06     ` Petr Mladek
  2018-05-10  4:22       ` Sergey Senozhatsky
  1 sibling, 1 reply; 94+ messages in thread
From: Petr Mladek @ 2018-04-26 10:06 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Dmitry Vyukov, Tetsuo Handa, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, linux-kernel

On Tue 2018-04-24 10:33:36, Sergey Senozhatsky wrote:
> Yes, Tetsuo, we use a bunch of "printk prefix" extensions at Samsung.
> For instance, we prefix printk messages with the CPU number: messages
> sometimes mix up, we also see partial pr_cont flushes, and so on.
> Grep-ping serial logs by CPU number is quite powerful.
> 
> Upstreaming those printk prefixes can be a bit challenging, but may
> be it's not all so bad. I personally think that syzbot, and build-test
> bots in general [like 0day], are helpful indeed, and I don't see why life
> should be any more complex for syzbot/0day guys. If printk prefixes can
> help - then we probably should consider such an extension.
> 
> The main argument from the upstream is that tweaking struct printk_log
> breaks user space (tools like crash, and so on). But I guess we can do
> something about it. E.g. put a PRINTK_CONTEXT_TRACKING_PREFIX kconfig
> option somewhere in "Kernel hacking"->"printk and dmesg options" and
> make available only for DEBUG kernels, or something similar.

> Petr, Steven, Fengguang, what do you think? Do you have any objections?
> Ideas?

I wonder if we could create some mechanism that would make it easier
to extend struct printk_log in the future.

I only know about the crash tool implementation. It uses information provided
by log_buf_vmcoreinfo_setup(). The size of the structure is already
public. Therefore crash should be able to find all existing information
even if we increase the size of the structure.

log_buf_vmcoreinfo_setup() even allows us to announce newly added
structure items. We could probably extend it to also announce
the offsets of the new optional elements.

I am not sure about other tools. But I think that it should be
doable.
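
A minimal userspace sketch of that idea, assuming a simplified stand-in
for struct printk_log: the struct demo_log, its caller_id field and
demo_describe_layout() are all hypothetical, and the "SIZE(...)=" and
"OFFSET(...)=" lines are only loosely modeled on what the vmcoreinfo
macros emit. The point is that a tool which reads sizes and offsets from
such a note, instead of hard-coding them, keeps finding the old fields
after a new one is appended.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for struct printk_log. The field set is simplified,
 * and caller_id is an invented optional field used only for illustration.
 */
struct demo_log {
	uint64_t ts_nsec;	/* timestamp in nanoseconds */
	uint16_t len;		/* length of the entire record */
	uint16_t text_len;	/* length of the text buffer */
	uint16_t dict_len;	/* length of the dictionary buffer */
	uint8_t facility;	/* syslog facility */
	uint8_t flags_level;	/* flags and level, packed into one byte here */
	uint32_t caller_id;	/* hypothetical new optional element */
};

/*
 * Write one "SIZE(...)=" line and several "OFFSET(...)=" lines into buf.
 * Returns the snprintf() result (the length the full text needs).
 */
int demo_describe_layout(char *buf, size_t size)
{
	return snprintf(buf, size,
			"SIZE(demo_log)=%zu\n"
			"OFFSET(demo_log.ts_nsec)=%zu\n"
			"OFFSET(demo_log.len)=%zu\n"
			"OFFSET(demo_log.text_len)=%zu\n"
			"OFFSET(demo_log.caller_id)=%zu\n",
			sizeof(struct demo_log),
			offsetof(struct demo_log, ts_nsec),
			offsetof(struct demo_log, len),
			offsetof(struct demo_log, text_len),
			offsetof(struct demo_log, caller_id));
}
```

Calling demo_describe_layout() with a 256-byte buffer and printing the
result shows the layout a dump tool would have to parse. An older tool
that ignores the caller_id line still locates ts_nsec, len and text_len
correctly.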

Best Regards,
Petr

* Re: printk feature for syzbot?
  2018-04-26 10:06     ` Petr Mladek
@ 2018-05-10  4:22       ` Sergey Senozhatsky
  2018-05-10 11:30         ` Petr Mladek
  2018-05-10 14:50         ` Tetsuo Handa
  0 siblings, 2 replies; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-05-10  4:22 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Dmitry Vyukov, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, Fengguang Wu,
	linux-kernel

On (04/26/18 12:06), Petr Mladek wrote:
> 
> > Petr, Steven, Fengguang, what do you think? Do you have any objections?
> > Ideas?
> 
> I wonder if we could create some mechanism that would help to extend
> struct printk_log easier in the future.

Hm, interesting idea.

> I know only about crash tool implementation. It uses information provided
> by log_buf_vmcoreinfo_setup(). The size of the structure is already
> public. Therefore crash should be able to find all existing information
> even if we increase the size of the structure.
> 
> log_buf_vmcoreinfo_setup() even allows to inform about newly added
> structure items. We could probably extend it to inform also about
> the offset of the new optional elements.

I vaguely remember that the last time Thomas Gleixner modified
printk_log you managed to find a case that broke crash tool.
... Or maybe I'm mistaken.

> I am not sure about other tools. But I think that it should be
> doable.

Good. So there are no objections, so far.

Tetsuo, Dmitry, care to send a patch?

	-ss

* Re: printk feature for syzbot?
  2018-05-10  4:22       ` Sergey Senozhatsky
@ 2018-05-10 11:30         ` Petr Mladek
  2018-05-10 12:11           ` Sergey Senozhatsky
  2018-05-10 14:50         ` Tetsuo Handa
  1 sibling, 1 reply; 94+ messages in thread
From: Petr Mladek @ 2018-05-10 11:30 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Dmitry Vyukov, Tetsuo Handa, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, linux-kernel

On Thu 2018-05-10 13:22:06, Sergey Senozhatsky wrote:
> On (04/26/18 12:06), Petr Mladek wrote:
> > 
> > > Petr, Steven, Fengguang, what do you think? Do you have any objections?
> > > Ideas?
> > 
> > I wonder if we could create some mechanism that would help to extend
> > struct printk_log easier in the future.
> 
> Hm, interesting idea.
> 
> > I know only about crash tool implementation. It uses information provided
> > by log_buf_vmcoreinfo_setup(). The size of the structure is already
> > public. Therefore crash should be able to find all existing information
> > even if we increase the size of the structure.
> > 
> > log_buf_vmcoreinfo_setup() even allows to inform about newly added
> > structure items. We could probably extend it to inform also about
> > the offset of the new optional elements.
> 
> I vaguely remember that the last time Thomas Gleixner modified
> printk_log you managed to find a case that broke crash tool.
> ... Or may be I'm mistaken.

I guess that you are talking about the patchset adding the possibility
to use different timestamps[1]. It changed the semantics of the
timestamp. All the tools needed an update to show the timestamp
correctly.

The patchset was rejected by Linus because it would break some
userspace tools, e.g. systemd, that depend on the format and semantics
provided by /dev/kmsg[2].

In other words, we must not change the /dev/kmsg format. But it should
be acceptable to change/extend the internal format and eventually
extend the format used on consoles.

Anyway, we need to be careful and test makedumpfile and crash tools
and eventually provide patches for them.

References:
[1] https://lkml.kernel.org/r/20160419085613.GJ6862@pathway.suse.cz
[2] https://lkml.kernel.org/r/CA+55aFzLH9crdMtUFkD-PtNGuxu_fsG5GH2ACni69ug9iM=09g@mail.gmail.com

Best Regards,
Petr

* Re: printk feature for syzbot?
  2018-05-10 11:30         ` Petr Mladek
@ 2018-05-10 12:11           ` Sergey Senozhatsky
  2018-05-10 14:22             ` Steven Rostedt
  0 siblings, 1 reply; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-05-10 12:11 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Dmitry Vyukov, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, Fengguang Wu,
	linux-kernel

On (05/10/18 13:30), Petr Mladek wrote:
[..]
> I guess that you are talking about the patchset adding possibility
> to use different time-stamps[1]. It changed the semantic of the
> timestamp. All the tools needed an update to show the timestamp
> correctly.
> 
> The patchset was rejected by Linus because it would broke some
> userspace tool, e.g. systemd, that depend on the format and semantic
> provided by /dev/kmsg[2].

Right, but I think I was talking about this email
 https://lkml.kernel.org/r/20171123124648.s4oigunxjfzvhtqh@pathway.suse.cz

But yeah, it's not really related to the extension of struct printk_log,
so I think we should be fine.

> By other words, we must not change /dev/kmsg format. But it should
> be acceptable to change/extend the internal format and eventually
> extend the format used on consoles.

Sure.

> Anyway, we need to be careful and test makedumpfile and crash tools
> and eventually provide patches for them.

Agreed. I'd prefer it to be hidden somewhere under kernel hacking config,
so only syzkaller folks would enable it. I think Steven also mentioned
a config option.

> Reference:
> [1] https://lkml.kernel.org/r/20160419085613.GJ6862@pathway.suse.cz
> [2] https://lkml.kernel.org/r/CA+55aFzLH9crdMtUFkD-PtNGuxu_fsG5GH2ACni69ug9iM=09g@mail.gmail.com

Thanks.

	-ss

* Re: printk feature for syzbot?
  2018-05-10 12:11           ` Sergey Senozhatsky
@ 2018-05-10 14:22             ` Steven Rostedt
  0 siblings, 0 replies; 94+ messages in thread
From: Steven Rostedt @ 2018-05-10 14:22 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Dmitry Vyukov, Tetsuo Handa, Sergey Senozhatsky,
	syzkaller, Fengguang Wu, linux-kernel

On Thu, 10 May 2018 21:11:22 +0900
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:

> > The patchset was rejected by Linus because it would broke some
> > userspace tool, e.g. systemd, that depend on the format and semantic
> > provided by /dev/kmsg[2].  
> 
> Right, but I think I was talking about this email
>  https://lkml.kernel.org/r/20171123124648.s4oigunxjfzvhtqh@pathway.suse.cz
> 
> But yeah, it's not really related to the extension of struct printk_log,
> so I think we should be fine.

Note, crash is "special". It depends on internals of the kernel to keep
working as its purpose is to debug kernel crashes. I'm constantly
breaking it with ftrace. Which reminds me, I need to see if it works
with the latest kernel, and send patches if it doesn't.

-- Steve

* Re: printk feature for syzbot?
  2018-05-10  4:22       ` Sergey Senozhatsky
  2018-05-10 11:30         ` Petr Mladek
@ 2018-05-10 14:50         ` Tetsuo Handa
  2018-05-11  1:45           ` Sergey Senozhatsky
  1 sibling, 1 reply; 94+ messages in thread
From: Tetsuo Handa @ 2018-05-10 14:50 UTC (permalink / raw)
  To: sergey.senozhatsky.work, pmladek
  Cc: dvyukov, sergey.senozhatsky, syzkaller, rostedt, fengguang.wu,
	linux-kernel

Sergey Senozhatsky wrote:
> On (04/26/18 12:06), Petr Mladek wrote:
> > 
> > > Petr, Steven, Fengguang, what do you think? Do you have any objections?
> > > Ideas?
> > 
> > I wonder if we could create some mechanism that would help to extend
> > struct printk_log easier in the future.
> 
> Hm, interesting idea.
> 
> > I know only about crash tool implementation. It uses information provided
> > by log_buf_vmcoreinfo_setup(). The size of the structure is already
> > public. Therefore crash should be able to find all existing information
> > even if we increase the size of the structure.
> > 
> > log_buf_vmcoreinfo_setup() even allows to inform about newly added
> > structure items. We could probably extend it to inform also about
> > the offset of the new optional elements.
> 
> I vaguely remember that the last time Thomas Gleixner modified
> printk_log you managed to find a case that broke crash tool.
> ... Or may be I'm mistaken.
> 
> > I am not sure about other tools. But I think that it should be
> > doable.
> 
> Good. So there are no objections, so far.
> 
> Tetsuo, Dmitry, care to send a patch?
> 
> 	-ss
> 

What I meant is simply something like the following (i.e. inject a context ID
before the string to print)

  -sprintf(printk_buf + offset, "[ %s] %s", stamp, string_to_print);
  +cpu = smp_processor_id();
  +if (in_nmi())
  +  sprintf(printk_buf + offset, "[ %s](N%u) %s", stamp, cpu, string_to_print);
  +else if (in_irq())
  +  sprintf(printk_buf + offset, "[ %s](I%u) %s", stamp, cpu, string_to_print);
  +else if (in_serving_softirq())
  +  sprintf(printk_buf + offset, "[ %s](S%u) %s", stamp, cpu, string_to_print);
  +else
  +  sprintf(printk_buf + offset, "[ %s](%u) %s", stamp, current->pid, string_to_print);

without touching any struct.
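
The same idea as a self-contained userspace sketch. Everything named
demo_* is hypothetical: enum demo_ctx stands in for the kernel's
in_nmi()/in_irq()/in_serving_softirq() checks, and the cpu and pid
arguments stand in for smp_processor_id() and current->pid, none of
which are available outside the kernel.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical replacement for the kernel's context predicates. */
enum demo_ctx {
	DEMO_CTX_TASK,		/* process context */
	DEMO_CTX_SOFTIRQ,	/* would be in_serving_softirq() */
	DEMO_CTX_HARDIRQ,	/* would be in_irq() */
	DEMO_CTX_NMI,		/* would be in_nmi() */
};

/*
 * Build "[ stamp](ID) text" in buf. In interrupt contexts the ID is a
 * letter (N, I or S) followed by the CPU number; in process context it
 * is the pid. Returns the snprintf() result.
 */
int demo_format_line(char *buf, size_t size, const char *stamp,
		     enum demo_ctx ctx, unsigned int cpu, unsigned int pid,
		     const char *text)
{
	switch (ctx) {
	case DEMO_CTX_NMI:
		return snprintf(buf, size, "[ %s](N%u) %s", stamp, cpu, text);
	case DEMO_CTX_HARDIRQ:
		return snprintf(buf, size, "[ %s](I%u) %s", stamp, cpu, text);
	case DEMO_CTX_SOFTIRQ:
		return snprintf(buf, size, "[ %s](S%u) %s", stamp, cpu, text);
	default:
		return snprintf(buf, size, "[ %s](%u) %s", stamp, pid, text);
	}
}
```

The order of the cases mirrors the if/else chain above: NMI is checked
first because an NMI can arrive while a hard or soft IRQ is being served.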

* Re: printk feature for syzbot?
  2018-05-10 14:50         ` Tetsuo Handa
@ 2018-05-11  1:45           ` Sergey Senozhatsky
       [not found]             ` <201805110238.w4B2cIGH079602@www262.sakura.ne.jp>
  0 siblings, 1 reply; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-05-11  1:45 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: sergey.senozhatsky.work, pmladek, dvyukov, sergey.senozhatsky,
	syzkaller, rostedt, fengguang.wu, linux-kernel

On (05/10/18 23:50), Tetsuo Handa wrote:
> What I meant is nothing but something like below (i.e. inject context ID before
> string to print)
> 
>   -sprintf(printk_buf + offset, "[ %s] %s", stamp, string_to_print);
>   +cpu = smp_processor_id()
>   +if (in_nmi())
>   +  sprintf(printk_buf + offset, "[ %s](N%u) %s", stamp, cpu, string_to_print);
>   +else if (in_irq())
>   +  sprintf(printk_buf + offset, "[ %s](I%u) %s", stamp, cpu, string_to_print);
>   +else if (in_serving_softirq())
>   +  sprintf(printk_buf + offset, "[ %s](S%u) %s", stamp, cpu, string_to_print);
>   +else
>   +  sprintf(printk_buf + offset, "[ %s](%u) %s", stamp, current->pid, string_to_print);
> 
> without touching any struct.

So you basically want to have one more con_msg_format_flags? Do
you want to track the context which prints out a message or the
context which "generated" the message? A CPU/task that stores
a logbuf entry - vprintk_emit() - is not always the same as the
CPU/task that prints it to consoles - console_unlock().

	-ss

* Re: printk feature for syzbot?
       [not found]             ` <201805110238.w4B2cIGH079602@www262.sakura.ne.jp>
@ 2018-05-11  6:21               ` Sergey Senozhatsky
  2018-05-11  9:17                 ` Dmitry Vyukov
  2018-05-11 11:02                 ` [PATCH] printk: fix possible reuse of va_list variable Tetsuo Handa
  0 siblings, 2 replies; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-05-11  6:21 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, pmladek, dvyukov, sergey.senozhatsky,
	syzkaller, rostedt, fengguang.wu, linux-kernel

On (05/11/18 11:38), Tetsuo Handa wrote:
> > 
> > So you basically want to have one more con_msg_format_flags? Do
> > you want to track a context which prints out a messages or the
> > context which "generated" the message? A CPU/task that stores
> > a logbuf entry - vprintk_emit() - is not always the same as the
> > CPU/task that prints it to consoles - console_unlock().
> > 
> 
> Well, below is the (partial) patch.

Hi,

Tetsuo, I will take a look a bit later, but at a glance, there are several
ways to achieve what you are trying to do. The first one is the way you
did it - add additional buffer and make that context tracking info part of
the message body. Another one would be to extend struct printk_log and add
pid/cpu/flag there, which you then can convert into text in msg_print_text().
So far we talked about extending printk_log. Yet another one could be - add
vsprintf specifiers that would add pid/cpu/flag to the vsprintf-ed message.
You then can re-define pr_fmt, for instance, in the code you want to track
pr_fmt "%zZ" fmt, or somehow force printk to add that "%zZ" to every
message.

> By the way, when I tried to make similar change for printk_safe_log_store(),
> I noticed that printk_safe_log_store() is not safe because it is reusing
> the va_list variable after "goto again;". We need to use va_copy(), or
> we will get crash like an example shown below.

Oh, right. Can you send a patch?

	-ss

* Re: printk feature for syzbot?
  2018-05-11  6:21               ` Sergey Senozhatsky
@ 2018-05-11  9:17                 ` Dmitry Vyukov
  2018-05-11  9:50                   ` Sergey Senozhatsky
  2018-05-11 11:02                 ` [PATCH] printk: fix possible reuse of va_list variable Tetsuo Handa
  1 sibling, 1 reply; 94+ messages in thread
From: Dmitry Vyukov @ 2018-05-11  9:17 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Tetsuo Handa, Petr Mladek, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, LKML

On Fri, May 11, 2018 at 8:21 AM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
> On (05/11/18 11:38), Tetsuo Handa wrote:
>> >
>> > So you basically want to have one more con_msg_format_flags? Do
>> > you want to track a context which prints out a messages or the
>> > context which "generated" the message? A CPU/task that stores
>> > a logbuf entry - vprintk_emit() - is not always the same as the
>> > CPU/task that prints it to consoles - console_unlock().
>> >
>>
>> Well, below is the (partial) patch.
>
> Hi,
>
> Tetsuo, I will take a look a bit later, but at glance, there are several
> ways to achieve what you are trying to do. The first one is the way you
> did it - add additional buffer and make that context tracking info part of
> the message body. Another one would be to extend struct printk_log and add
> pid/cpu/flag there, which you then can convert into text in msg_print_text().
> So far we talked about extending printk_log. Yet another one could be - add
> vsprintf specifiers that would add pid/cpu/flag to the vsprintf-ed message.
> You then can re-define pr_fmt, for instance, in the code you want to track
> pr_fmt "%zZ" fmt, or somehow force printk to add that "%zZ" to every
> message.


From syzbot's perspective, yes, we can set any necessary additional
configs, add cmdline arguments, etc.

Manually changing format strings won't work -- bugs are all over the place.

From what I see, it seems that interrupts can be nested:

https://syzkaller.appspot.com/bug?id=72eddef9cedcf81486adb9dd3e789f0d77505ba5
https://syzkaller.appspot.com/bug?id=66fcf61c65f8aa50bbb862eb2fde27c08909a4ff

Will this in_nmi()/in_irq()/in_serving_softirq()/else be enough to
untangle output printed by such nested interrupts? For the first link
it seems that they both are the same type of interrupt --
apic_timer_interrupt.

* Re: printk feature for syzbot?
  2018-05-11  9:17                 ` Dmitry Vyukov
@ 2018-05-11  9:50                   ` Sergey Senozhatsky
  2018-05-11 11:58                     ` [PATCH] printk: inject caller information into the body of message Tetsuo Handa
  2018-05-11 13:37                     ` printk feature for syzbot? Steven Rostedt
  0 siblings, 2 replies; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-05-11  9:50 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Tetsuo Handa, Petr Mladek,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, Fengguang Wu,
	LKML

On (05/11/18 11:17), Dmitry Vyukov wrote:
> 
> From what I see, it seems that interrupts can be nested:

Hm, I thought that in general IRQ handlers run with local IRQs
disabled on CPU. So, generally, IRQs don't nest. Was I wrong?
NMIs can nest, that's true; but I thought that at least IRQs
don't.

> https://syzkaller.appspot.com/bug?id=72eddef9cedcf81486adb9dd3e789f0d77505ba5
> https://syzkaller.appspot.com/bug?id=66fcf61c65f8aa50bbb862eb2fde27c08909a4ff
> 
> Will this in_nmi()/in_irq()/in_serving_softirq()/else be enough to
> untangle output printed by such nested interrupts?

Well, hm. __irq_enter() does preempt_count_add(HARDIRQ_OFFSET) and
__irq_exit() does preempt_count_sub(HARDIRQ_OFFSET). So, technically,
you can store

	preempt_count() & HARDIRQ_MASK
	preempt_count() & SOFTIRQ_MASK
	preempt_count() & NMI_MASK

in that extended context tracking. The numbers will not tell you
the IRQ line number, for instance, but at least you'll be able to
distinguish different hard/soft IRQs, NMIs. Just an idea, I didn't
check it, maybe it won't work at all.

Ideally, the serial log should be like this

	i:1 ... foo()
	i:1 ... bar()
	i:2 ... foo()  // __irq_enter()
	i:2 ... bar()
	i:2 ... buz()  // __irq_exit()
	i:1 ... buz()

but I may be completely wrong.
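
A rough userspace sketch of the decoding step, assuming the bit layout
that include/linux/preempt.h documented around that time (8 preempt
bits, 8 softirq bits, 4 hardirq bits, 1 NMI bit). The DEMO_* constants
and demo_* helpers are hypothetical names, and the widths should be
checked against the actual tree.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumed field widths; verify against include/linux/preempt.h. */
#define DEMO_PREEMPT_BITS	8
#define DEMO_SOFTIRQ_BITS	8
#define DEMO_HARDIRQ_BITS	4
#define DEMO_NMI_BITS		1

#define DEMO_SOFTIRQ_SHIFT	DEMO_PREEMPT_BITS
#define DEMO_HARDIRQ_SHIFT	(DEMO_SOFTIRQ_SHIFT + DEMO_SOFTIRQ_BITS)
#define DEMO_NMI_SHIFT		(DEMO_HARDIRQ_SHIFT + DEMO_HARDIRQ_BITS)

#define DEMO_MASK(bits, shift)	(((1UL << (bits)) - 1) << (shift))
#define DEMO_SOFTIRQ_MASK	DEMO_MASK(DEMO_SOFTIRQ_BITS, DEMO_SOFTIRQ_SHIFT)
#define DEMO_HARDIRQ_MASK	DEMO_MASK(DEMO_HARDIRQ_BITS, DEMO_HARDIRQ_SHIFT)
#define DEMO_NMI_MASK		DEMO_MASK(DEMO_NMI_BITS, DEMO_NMI_SHIFT)

#define DEMO_SOFTIRQ_OFFSET	(1UL << DEMO_SOFTIRQ_SHIFT)
#define DEMO_HARDIRQ_OFFSET	(1UL << DEMO_HARDIRQ_SHIFT)

/* Per-field nesting depth extracted from a preempt_count()-style value. */
struct demo_depth {
	unsigned int nmi;
	unsigned int hardirq;
	unsigned int softirq;	/* in the kernel this field also counts BH-disabled sections */
};

/* Split a raw counter value into its NMI, hardirq and softirq fields. */
struct demo_depth demo_decode(unsigned long count)
{
	struct demo_depth d;

	d.nmi = (unsigned int)((count & DEMO_NMI_MASK) >> DEMO_NMI_SHIFT);
	d.hardirq = (unsigned int)((count & DEMO_HARDIRQ_MASK) >> DEMO_HARDIRQ_SHIFT);
	d.softirq = (unsigned int)((count & DEMO_SOFTIRQ_MASK) >> DEMO_SOFTIRQ_SHIFT);
	return d;
}

/* Format an "i:<depth>" style tag like the serial log mock-up above. */
int demo_tag(char *buf, size_t size, unsigned long count)
{
	struct demo_depth d = demo_decode(count);

	return snprintf(buf, size, "n:%u i:%u s:%u", d.nmi, d.hardirq, d.softirq);
}
```

With these widths, a value of 2 * DEMO_HARDIRQ_OFFSET decodes to a
hardirq depth of 2, which is what a nested __irq_enter() would produce.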

Petr and Steven probably will have better ideas.

	-ss

* [PATCH] printk: fix possible reuse of va_list variable
  2018-05-11  6:21               ` Sergey Senozhatsky
  2018-05-11  9:17                 ` Dmitry Vyukov
@ 2018-05-11 11:02                 ` Tetsuo Handa
  2018-05-11 11:27                   ` Sergey Senozhatsky
  2018-05-17 11:57                   ` Petr Mladek
  1 sibling, 2 replies; 94+ messages in thread
From: Tetsuo Handa @ 2018-05-11 11:02 UTC (permalink / raw)
  To: sergey.senozhatsky.work
  Cc: pmladek, dvyukov, sergey.senozhatsky, syzkaller, rostedt,
	fengguang.wu, linux-kernel, peterz

From 766cf72b5fdc00d1cf5a8ca2c6b23ebb75e2b4d4 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Fri, 11 May 2018 19:54:19 +0900
Subject: [PATCH] printk: fix possible reuse of va_list variable

I noticed that printk_safe_log_store() can cause a kernel oops, because
the "args" parameter is passed to vscnprintf() again when
atomic_cmpxchg() detects that we raced. Fix this by using va_copy().

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: 42a0bb3f71383b45 ("printk/nmi: generic solution for safe printk in NMI")
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/printk/printk_safe.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index 3e3c200..449d67e 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -82,6 +82,7 @@ static __printf(2, 0) int printk_safe_log_store(struct printk_safe_seq_buf *s,
 {
 	int add;
 	size_t len;
+	va_list ap;
 
 again:
 	len = atomic_read(&s->len);
@@ -100,7 +101,9 @@ static __printf(2, 0) int printk_safe_log_store(struct printk_safe_seq_buf *s,
 	if (!len)
 		smp_rmb();
 
-	add = vscnprintf(s->buffer + len, sizeof(s->buffer) - len, fmt, args);
+	va_copy(ap, args);
+	add = vscnprintf(s->buffer + len, sizeof(s->buffer) - len, fmt, ap);
+	va_end(ap);
 	if (!add)
 		return 0;
 
-- 
1.8.3.1
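
The pattern the patch relies on can be shown outside the kernel with a
minimal sketch. The demo_* function names and the "tries" parameter are
hypothetical; the loop only imitates the "goto again" retry and uses
vsnprintf() from the C library in place of vscnprintf(). Each pass
formats from a fresh copy, so the caller's va_list is never consumed
twice.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format the same arguments "tries" times into buf, as a retry loop would.
 * Every pass works on its own va_copy() of args and releases it with
 * va_end(). Returns the result of the last vsnprintf() call.
 */
int demo_vformat_retry(char *buf, size_t size, int tries,
		       const char *fmt, va_list args)
{
	int n = 0;

	while (tries-- > 0) {
		va_list ap;

		va_copy(ap, args);
		n = vsnprintf(buf, size, fmt, ap);
		va_end(ap);
	}
	return n;
}

/* Variadic front end, analogous to a printk()-style wrapper. */
int demo_format_retry(char *buf, size_t size, int tries, const char *fmt, ...)
{
	va_list args;
	int n;

	va_start(args, fmt);
	n = demo_vformat_retry(buf, size, tries, fmt, args);
	va_end(args);
	return n;
}
```

Passing args itself to vsnprintf() on each pass would leave its value
indeterminate after the first call, which is the situation the patch
avoids.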

* Re: [PATCH] printk: fix possible reuse of va_list variable
  2018-05-11 11:02                 ` [PATCH] printk: fix possible reuse of va_list variable Tetsuo Handa
@ 2018-05-11 11:27                   ` Sergey Senozhatsky
  2018-05-17 11:57                   ` Petr Mladek
  1 sibling, 0 replies; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-05-11 11:27 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: sergey.senozhatsky.work, pmladek, dvyukov, sergey.senozhatsky,
	syzkaller, rostedt, fengguang.wu, linux-kernel, peterz

On (05/11/18 20:02), Tetsuo Handa wrote:
> I noticed that there is a possibility that printk_safe_log_store() causes
> kernel oops because "args" parameter is passed to vsnprintf() again when
> atomic_cmpxchg() detected that we raced. Fix this by using va_copy().
> 
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Fixes: 42a0bb3f71383b45 ("printk/nmi: generic solution for safe printk in NMI")
> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Cc: Petr Mladek <pmladek@suse.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>

Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

	-ss

* [PATCH] printk: inject caller information into the body of message
  2018-05-11  9:50                   ` Sergey Senozhatsky
@ 2018-05-11 11:58                     ` Tetsuo Handa
  2018-05-17 11:21                       ` Sergey Senozhatsky
  2018-05-11 13:37                     ` printk feature for syzbot? Steven Rostedt
  1 sibling, 1 reply; 94+ messages in thread
From: Tetsuo Handa @ 2018-05-11 11:58 UTC (permalink / raw)
  To: sergey.senozhatsky.work, dvyukov
  Cc: pmladek, sergey.senozhatsky, syzkaller, rostedt, fengguang.wu,
	linux-kernel, torvalds, akpm

From b7b0e56e06db1107f781b4cb5178fbdc99240901 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Fri, 11 May 2018 20:45:31 +0900
Subject: [PATCH] printk: inject caller information into the body of message

Since syzbot frequently floods printk() (e.g. with memory allocation
fault injection), it is always difficult to tell which line is from
which event.

This patch tries to help group concurrent printk() lines, without
touching any struct so that we don't break userspace tools (e.g. crash)
which depend on in-kernel data structures.

If printk() is called from process context, "(T%u)" (where %u is
current->pid) is injected. If printk() is called from interrupt context,
"(C%u)" (where %u is raw_smp_processor_id()) is injected.



Example 1: SysRq-h from keyboard operation.
----------
[   57.688156] (C3) sysrq: SysRq : HELP : loglevel(0-9) reboot(b) crash(c) show-all-locks(d) terminate-all-tasks(e) memory-full-oom-kill(f) kill-all-tasks(i) thaw-filesystems(j) sak(k) show-backtrace-all-active-cpus(l) show-memory-usage(m) nice-all-RT-tasks(n) poweroff(o) show-registers(p) show-all-timers(q) unraw(r) sync(s) show-task-states(t) unmount(u) show-blocked-tasks(w)
----------

Example 2: SysRq-h from /proc/sysrq-trigger interface.
----------
[   64.592273] (T2768) sysrq: SysRq : HELP : loglevel(0-9) reboot(b) crash(c) show-all-locks(d) terminate-all-tasks(e) memory-full-oom-kill(f) kill-all-tasks(i) thaw-filesystems(j) sak(k) show-backtrace-all-active-cpus(l) show-memory-usage(m) nice-all-RT-tasks(n) poweroff(o) show-registers(p) show-all-timers(q) unraw(r) sync(s) show-task-states(t) unmount(u) show-blocked-tasks(w)
----------

Example 3: SysRq-f from keyboard operation.
----------
[   70.792068] (C3) sysrq: SysRq : Manual OOM execution
[   70.797444] (T245) kworker/0:2 invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null), order=-1, oom_score_adj=0
[   70.807690] (T245) kworker/0:2 cpuset=/ mems_allowed=0
[   70.812738] (T245) CPU: 0 PID: 245 Comm: kworker/0:2 Kdump: loaded Not tainted 4.17.0-rc4+ #396
[   70.819886] (T245) Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[   70.828924] (T245) Workqueue: events moom_callback
[   70.830554] (T245) Call Trace:
[   70.831518] (T245)  dump_stack+0x5e/0x8b
[   70.832754] (T245)  dump_header+0x6f/0x454
[   70.834045] (T245)  ? _raw_spin_unlock_irqrestore+0x42/0x60
[   70.835764] (T245)  oom_kill_process+0x223/0x690
[   70.837257] (T245)  ? out_of_memory+0x2c2/0x530
[   70.838669] (T245)  out_of_memory+0x120/0x530
[   70.840016] (T245)  ? out_of_memory+0x1f7/0x530
[   70.841433] (T245)  moom_callback+0x68/0x90
[   70.842735] (T245)  process_one_work+0x19f/0x370
[   70.844162] (T245)  ? process_one_work+0x13c/0x370
[   70.845681] (T245)  worker_thread+0x45/0x3e0
[   70.846985] (T245)  kthread+0xf6/0x130
[   70.848127] (T245)  ? process_one_work+0x370/0x370
[   70.849587] (T245)  ? kthread_create_on_node+0x40/0x40
[   70.851146] (T245)  ret_from_fork+0x24/0x30
[   70.855706] (T245) Mem-Info:
[   70.856674] (T245) active_anon:11880 inactive_anon:2122 isolated_anon:0
[   70.856674]  active_file:11521 inactive_file:18259 isolated_file:0
[   70.856674]  unevictable:0 dirty:4 writeback:0 unstable:0
[   70.856674]  slab_reclaimable:7216 slab_unreclaimable:14300
[   70.856674]  mapped:11981 shmem:2198 pagetables:1743 bounce:0
[   70.856674]  free:853866 free_pcp:566 free_cma:0
[   70.868764] (T245) Node 0 active_anon:47520kB inactive_anon:8488kB active_file:46084kB inactive_file:73036kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:47924kB dirty:16kB writeback:0kB shmem:8792kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 8192kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[   70.880127] (T245) Node 0 DMA free:15872kB min:284kB low:352kB high:420kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15904kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[   70.889522] (T245) lowmem_reserve[]: 0 2683 3633 3633
[   70.891285] (T245) Node 0 DMA32 free:2746612kB min:49696kB low:62120kB high:74544kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129216kB managed:2748008kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:340kB local_pcp:116kB free_cma:0kB
[   70.899460] (T245) lowmem_reserve[]: 0 0 950 950
[   70.901212] (T245) Node 0 Normal free:652980kB min:17596kB low:21992kB high:26388kB active_anon:47520kB inactive_anon:8488kB active_file:46084kB inactive_file:73036kB unevictable:0kB writepending:16kB present:1048576kB managed:972972kB mlocked:0kB kernel_stack:3664kB pagetables:6972kB bounce:0kB free_pcp:1920kB local_pcp:644kB free_cma:0kB
[   70.909865] (T245) lowmem_reserve[]: 0 0 0 0
[   70.911646] (T245) Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15872kB
[   70.915231] (T245) Node 0 DMA32: 3*4kB (UM) 3*8kB (UM) 1*16kB (M) 2*32kB (M) 2*64kB (M) 4*128kB (UM) 4*256kB (M) 5*512kB (UM) 2*1024kB (M) 4*2048kB (UM) 667*4096kB (M) = 2746612kB
[   70.920274] (T245) Node 0 Normal: 243*4kB (UM) 51*8kB (UM) 19*16kB (UM) 5*32kB (M) 4*64kB (UME) 1*128kB (M) 2*256kB (ME) 2*512kB (UE) 6*1024kB (UE) 6*2048kB (UME) 154*4096kB (M) = 652980kB
[   70.925188] (T245) Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[   70.927860] (T245) Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[   70.930431] (T245) 31978 total pagecache pages
[   70.932166] (T245) 0 pages in swap cache
[   70.933803] (T245) Swap cache stats: add 0, delete 0, find 0/0
[   70.935841] (T245) Free swap  = 0kB
[   70.937315] (T245) Total swap = 0kB
[   70.938957] (T245) 1048445 pages RAM
[   70.940486] (T245) 0 pages HighMem/MovableOnly
[   70.942158] (T245) 114224 pages reserved
[   70.943767] (T245) 0 pages hwpoisoned
[   70.945267] (T245) Out of memory: Kill process 2474 (tuned) score 6 or sacrifice child
[   70.947863] (T245) Killed process 2474 (tuned) total-vm:573828kB, anon-rss:13072kB, file-rss:10716kB, shmem-rss:0kB
----------



This patch does not distinguish in_nmi()/in_irq()/in_serving_softirq(),
because I guess that it is not too difficult to tell them apart as long
as we can pick up messages from the same CPU based on the "(C%u)" part.
We could switch to "(C%u%c)" (where %c is the type of interrupt context)
if needed.

In the long term, we might want to touch in-kernel data structures so that
userspace tools can do better processing. But for now, I think that this
patch can help a lot.
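
For anyone post-processing such logs in userspace, the prefix could be
picked apart roughly as below. This is only a sketch and not part of the
patch: struct caller_info, skip_timestamp() and parse_caller_prefix() are
hypothetical names, and the code assumes the "(T%u) " / "(C%u) " format
shown in the sample output above.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper type, not part of the patch. */
struct caller_info {
	char kind;		/* 'T' = task context, 'C' = CPU (non-task context) */
	unsigned int id;	/* PID for 'T', CPU number for 'C' */
};

/* Skip a leading "[   70.930431] " console timestamp, if there is one. */
const char *skip_timestamp(const char *line)
{
	const char *end;

	if (line[0] != '[')
		return line;
	end = strchr(line, ']');
	if (!end || end[1] != ' ')
		return line;
	return end + 2;
}

/*
 * Parse a leading "(T245) " or "(C3) " prefix.
 * Returns the text after the prefix, or NULL if msg has no such prefix.
 */
const char *parse_caller_prefix(const char *msg, struct caller_info *out)
{
	const char *p = msg;
	unsigned int id = 0;
	char kind;

	if (p[0] != '(' || (p[1] != 'T' && p[1] != 'C'))
		return NULL;
	kind = p[1];
	p += 2;
	if (!isdigit((unsigned char)*p))
		return NULL;
	while (isdigit((unsigned char)*p))
		id = id * 10 + (unsigned int)(*p++ - '0');
	if (p[0] != ')' || p[1] != ' ')
		return NULL;
	out->kind = kind;
	out->id = id;
	return p + 2;
}
```

Grouping lines by (kind, id) after this step is what would allow
intermixed reports to be separated again.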

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
---
 Documentation/admin-guide/kernel-parameters.txt |  6 +++
 kernel/printk/internal.h                        |  1 +
 kernel/printk/printk.c                          | 49 +++++++++++++++++++++++--
 kernel/printk/printk_safe.c                     | 22 ++++++++++-
 lib/Kconfig.debug                               | 13 +++++++
 5 files changed, 87 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 11fc28e..10e716e 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3288,6 +3288,12 @@
 			Format: <bool>  (1/Y/y=enable, 0/N/n=disable)
 			default: disabled
 
+	printk.caller_info=
+			Show which task (if in process context) or CPU (if not
+			in process context) generated each message.
+			Useful for environments where printk() floods.
+			Format: <bool>  (1/Y/y=enable, 0/N/n=disable)
+
 	printk.devkmsg={on,off,ratelimit}
 			Control writing to /dev/kmsg.
 			on - unlimited logging to /dev/kmsg from userspace
diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index 2a7d040..0b30457 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -23,6 +23,7 @@
 #define PRINTK_NMI_CONTEXT_MASK		 0x80000000
 
 extern raw_spinlock_t logbuf_lock;
+extern bool printk_caller_info;
 
 __printf(1, 0) int vprintk_default(const char *fmt, va_list args);
 __printf(1, 0) int vprintk_deferred(const char *fmt, va_list args);
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 2f4af21..9040a16 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1733,6 +1733,37 @@ static inline void printk_delay(void)
 	}
 }
 
+bool printk_caller_info = IS_ENABLED(CONFIG_PRINTK_CALLER_INFO);
+module_param_named(caller_info, printk_caller_info, bool, 0644);
+
+static char *printk_inject_caller_info(const char *text, size_t *text_len)
+{
+	static char buf[LOG_LINE_MAX + 128];
+	int len;
+	unsigned int v;
+	char c;
+
+	if (!printk_caller_info)
+		return (char *) text;
+
+	if (in_task()) {
+		v = current->pid;
+		c = 'T';
+	} else {
+		/* Use raw version not to generate warning messages. */
+		v = raw_smp_processor_id();
+		c = 'C';
+	}
+	len = snprintf(buf, sizeof(buf), "(%c%u) ", c, v);
+	/* This should not happen though... */
+	if (unlikely(len + *text_len >= sizeof(buf)))
+		return (char *) text;
+	memmove(buf + len, text, *text_len);
+	*text_len += len;
+	/* "buf" remains valid because it is protected by "logbuf_lock". */
+	return buf;
+}
+
 /*
  * Continuation lines are buffered, and not committed to the record buffer
  * until the line is complete, or a race forces it. The line fragments
@@ -1763,10 +1794,19 @@ static bool cont_add(int facility, int level, enum log_flags flags, const char *
 {
 	/*
 	 * If ext consoles are present, flush and skip in-kernel
-	 * continuation.  See nr_ext_console_drivers definition.  Also, if
-	 * the line gets too long, split it up in separate records.
+	 * continuation. See nr_ext_console_drivers definition.
 	 */
-	if (nr_ext_console_drivers || cont.len + len > sizeof(cont.buf)) {
+	if (nr_ext_console_drivers) {
+		cont_flush();
+		return false;
+	}
+
+	/* Inject before memcpy() in order to avoid overflow. */
+	if (!cont.len)
+		text = printk_inject_caller_info(text, &len);
+
+	/* If the line gets too long, split it up in separate records. */
+	if (cont.len + len > sizeof(cont.buf)) {
 		cont_flush();
 		return false;
 	}
@@ -1820,6 +1860,9 @@ static size_t log_output(int facility, int level, enum log_flags lflags, const c
 			return text_len;
 	}
 
+	/* Inject caller info. */
+	text = printk_inject_caller_info(text, &text_len);
+
 	/* Store it in the record log */
 	return log_store(facility, level, lflags, 0, dict, dictlen, text, text_len);
 }
diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index 449d67e..02d080a 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -22,6 +22,7 @@
 #include <linux/cpumask.h>
 #include <linux/irq_work.h>
 #include <linux/printk.h>
+#include <linux/sched.h>
 
 #include "internal.h"
 
@@ -83,6 +84,17 @@ static __printf(2, 0) int printk_safe_log_store(struct printk_safe_seq_buf *s,
 	int add;
 	size_t len;
 	va_list ap;
+	unsigned int v;
+	char c;
+
+	if (in_task()) {
+		v = current->pid;
+		c = 't';
+	} else {
+		/* Use raw version not to generate warning messages. */
+		v = raw_smp_processor_id();
+		c = 'c';
+	}
 
 again:
 	len = atomic_read(&s->len);
@@ -102,7 +114,15 @@ static __printf(2, 0) int printk_safe_log_store(struct printk_safe_seq_buf *s,
 		smp_rmb();
 
 	va_copy(ap, args);
-	add = vscnprintf(s->buffer + len, sizeof(s->buffer) - len, fmt, ap);
+	if (printk_caller_info) {
+		struct va_format vaf = { .fmt = fmt, .va = &ap };
+
+		add = scnprintf(s->buffer + len, sizeof(s->buffer) - len,
+				"(%c%u) %pV", c, v, &vaf);
+	} else {
+		add = vscnprintf(s->buffer + len, sizeof(s->buffer) - len,
+				 fmt, ap);
+	}
 	va_end(ap);
 	if (!add)
 		return 0;
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index c40c7b7..9e8ea4e 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -15,6 +15,19 @@ config PRINTK_TIME
 	  The behavior is also controlled by the kernel command line
 	  parameter printk.time=1. See Documentation/admin-guide/kernel-parameters.rst
 
+config PRINTK_CALLER_INFO
+	bool "Show caller information on printks"
+	depends on PRINTK
+	help
+	  Selecting this option causes thread id (if in process context) or CPU
+	  id (if not in process context) of the printk() messages to be added
+	  to the output of the syslog() system call and at the console.
+	  Useful for environments where multiple threads constantly call
+	  printk() (e.g. fault injection fuzzing tests).
+
+	  The behavior is also controlled by printk.caller_info= kernel command
+	  line parameter or /sys/module/printk/parameters/caller_info file.
+
 config CONSOLE_LOGLEVEL_DEFAULT
 	int "Default console loglevel (1-15)"
 	range 1 15
-- 
1.8.3.1


* Re: printk feature for syzbot?
  2018-05-11  9:50                   ` Sergey Senozhatsky
  2018-05-11 11:58                     ` [PATCH] printk: inject caller information into the body of message Tetsuo Handa
@ 2018-05-11 13:37                     ` Steven Rostedt
  2018-05-15  5:20                       ` Sergey Senozhatsky
  1 sibling, 1 reply; 94+ messages in thread
From: Steven Rostedt @ 2018-05-11 13:37 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Dmitry Vyukov, Tetsuo Handa, Petr Mladek, Sergey Senozhatsky,
	syzkaller, Fengguang Wu, LKML

On Fri, 11 May 2018 18:50:04 +0900
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:

> On (05/11/18 11:17), Dmitry Vyukov wrote:
> > 
> > From what I see, it seems that interrupts can be nested:  
> 
> Hm, I thought that in general IRQ handlers run with local IRQs
> disabled on CPU. So, generally, IRQs don't nest. Was I wrong?
> NMIs can nest, that's true; but I thought that at least IRQs
> don't.

We normally don't run nested interrupts, although as the comment in
preempt.h says:

 * The hardirq count could in theory be the same as the number of
 * interrupts in the system, but we run all interrupt handlers with
 * interrupts disabled, so we cannot have nesting interrupts. Though
 * there are a few palaeontologic drivers which reenable interrupts in
 * the handler, so we need more than one bit here.

And no, NMI handlers do not nest. Yes, we deal with nested NMIs, but in
those cases, we just set a bit as a latch, and return, and when the
first NMI is complete, it checks that bit and if it is set, it executes
another NMI handler.
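
The latch idea can be modelled in a few lines of plain C. This is a toy
userspace model under stated assumptions, not the real entry code (which
is architecture-specific assembly): struct nmi_model, nmi_entry() and
body_interrupted_once() are made-up names for illustration.

```c
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
struct nmi_model {
	bool in_nmi;	/* a handler run is currently in progress */
	bool latched;	/* another NMI arrived while the handler was running */
	int handled;	/* number of complete handler runs */
};

/* Stand-in for the handler body; it may itself be hit by another NMI. */
typedef void (*nmi_body_fn)(struct nmi_model *m);

void nmi_entry(struct nmi_model *m, nmi_body_fn body)
{
	if (m->in_nmi) {
		/* Nested NMI: only set the latch and return immediately. */
		m->latched = true;
		return;
	}

	m->in_nmi = true;
	do {
		m->latched = false;
		if (body)
			body(m);
		m->handled++;
		/* If the latch was set meanwhile, run the handler once more. */
	} while (m->latched);
	m->in_nmi = false;
}

/* Example body: a second NMI arrives during the first handler run. */
void body_interrupted_once(struct nmi_model *m)
{
	if (m->handled == 0)
		nmi_entry(m, body_interrupted_once);
}
```

Calling nmi_entry() with body_interrupted_once on a zeroed model ends
with two handler runs, executed back to back rather than nested.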

> 
> > https://syzkaller.appspot.com/bug?id=72eddef9cedcf81486adb9dd3e789f0d77505ba5
> > https://syzkaller.appspot.com/bug?id=66fcf61c65f8aa50bbb862eb2fde27c08909a4ff
> > 
> > Will this in_nmi()/in_irq()/in_serving_softirq()/else be enough to
> > untangle output printed by such nested interrupts?  
> 
> Well, hm. __irq_enter() does preempt_count_add(HARDIRQ_OFFSET) and
> __irq_exit() does preempt_count_sub(HARDIRQ_OFFSET). So, technically,
> you can store
> 
> 	preempt_count() & HARDIRQ_MASK
> 	preempt_count() & SOFTIRQ_MASK
> 	preempt_count() & NMI_MASK
> 
> in that extended context tracking. The numbers will not tell you
> the IRQ line number, for instance, but at least you'll be able to
> distinguish different hard/soft IRQs, NMIs. Just an idea, I didn't
> check it, may be it won't work at all.
> 
> Ideally, the serial log should be like this
> 
> 	i:1 ... foo()
> 	i:1 ... bar()
> 	i:2 ... foo()  // __irq_enter()
> 	i:2 ... bar()
> 	i:2 ... buz()  // __irq_exit()
> 	i:1 ... buz()
> 
> but I may be completely wrong.
> 
> Petr and Steven probably will have better ideas.

I handle nesting of different contexts in the ftrace ring buffer using
the preempt count. See trace_recursive_lock/unlock() in
kernel/trace/ring_buffer.c.
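
As a rough userspace illustration of reading the context out of a raw
preempt count value: the bit widths below are an assumption copied from
the layout comment in include/linux/preempt.h of kernels from this
period, so check your own tree before relying on them. context_char()
and hardirq_depth() are hypothetical helpers, not kernel functions.

```c
/* Assumed field widths (see include/linux/preempt.h in your tree). */
#define PREEMPT_BITS	8
#define SOFTIRQ_BITS	8
#define HARDIRQ_BITS	4
#define NMI_BITS	1

#define SOFTIRQ_SHIFT	PREEMPT_BITS
#define HARDIRQ_SHIFT	(SOFTIRQ_SHIFT + SOFTIRQ_BITS)
#define NMI_SHIFT	(HARDIRQ_SHIFT + HARDIRQ_BITS)

#define FIELD_MASK(bits, shift)	(((1UL << (bits)) - 1) << (shift))
#define SOFTIRQ_MASK	FIELD_MASK(SOFTIRQ_BITS, SOFTIRQ_SHIFT)
#define HARDIRQ_MASK	FIELD_MASK(HARDIRQ_BITS, HARDIRQ_SHIFT)
#define NMI_MASK	FIELD_MASK(NMI_BITS, NMI_SHIFT)
#define SOFTIRQ_OFFSET	(1UL << SOFTIRQ_SHIFT)

/* Hypothetical helper: 'N' = NMI, 'I' = hard irq, 'S' = softirq, 'T' = task. */
char context_char(unsigned long pc)
{
	if (pc & NMI_MASK)
		return 'N';
	if (pc & HARDIRQ_MASK)
		return 'I';
	/*
	 * Only the lowest softirq bit means "serving a softirq"; the
	 * higher bits merely count bottom-half disable sections.
	 */
	if (pc & SOFTIRQ_OFFSET)
		return 'S';
	return 'T';
}

/* Hypothetical helper: hard irq nesting depth, as in the i:1 / i:2 idea. */
unsigned int hardirq_depth(unsigned long pc)
{
	return (unsigned int)((pc & HARDIRQ_MASK) >> HARDIRQ_SHIFT);
}
```

Shifting the masked field down is what turns the raw bits into a small
nesting number that could be printed next to each message.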

-- Steve


* Re: printk feature for syzbot?
  2018-05-11 13:37                     ` printk feature for syzbot? Steven Rostedt
@ 2018-05-15  5:20                       ` Sergey Senozhatsky
  2018-05-15 14:39                         ` Steven Rostedt
  0 siblings, 1 reply; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-05-15  5:20 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Sergey Senozhatsky, Dmitry Vyukov, Tetsuo Handa, Petr Mladek,
	Sergey Senozhatsky, syzkaller, Fengguang Wu, LKML

Hello,

On (05/11/18 09:37), Steven Rostedt wrote:
> > On (05/11/18 11:17), Dmitry Vyukov wrote:
> > > 
> > > From what I see, it seems that interrupts can be nested:  
> > 
> > Hm, I thought that in general IRQ handlers run with local IRQs
> > disabled on CPU. So, generally, IRQs don't nest. Was I wrong?
> > NMIs can nest, that's true; but I thought that at least IRQs
> > don't.
> 
> We normally don't run nested interrupts, although as the comment in
> preempt.h says:
> 
>  * The hardirq count could in theory be the same as the number of
>  * interrupts in the system, but we run all interrupt handlers with
>  * interrupts disabled, so we cannot have nesting interrupts. Though
>  * there are a few palaeontologic drivers which reenable interrupts in
>  * the handler, so we need more than one bit here.
> 
> And no, NMI handlers do not nest. Yes, we deal with nested NMIs, but in
> those cases, we just set a bit as a latch, and return, and when the
> first NMI is complete, it checks that bit and if it is set, it executes
> another NMI handler.

Good to know!
I thought that NMI can nest in some weird cases, like a breakpoint from
NMI. This must be super tricky, given that nested NMI will corrupt the
stack of the previous NMI, etc. Anyway.

> > Well, hm. __irq_enter() does preempt_count_add(HARDIRQ_OFFSET) and
> > __irq_exit() does preempt_count_sub(HARDIRQ_OFFSET). So, technically,
> > you can store
> > 
> > 	preempt_count() & HARDIRQ_MASK
> > 	preempt_count() & SOFTIRQ_MASK
> > 	preempt_count() & NMI_MASK
> >
[..]
> I handle nesting of different contexts in the ftrace ring buffer using
> the preempt count. See trace_recursive_lock/unlock() in
> kernel/trace/ring_buffer.c.

Thanks. So you are also checking the preempt_count().

	-ss


* Re: printk feature for syzbot?
  2018-05-15  5:20                       ` Sergey Senozhatsky
@ 2018-05-15 14:39                         ` Steven Rostedt
  0 siblings, 0 replies; 94+ messages in thread
From: Steven Rostedt @ 2018-05-15 14:39 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Dmitry Vyukov, Tetsuo Handa, Petr Mladek, Sergey Senozhatsky,
	syzkaller, Fengguang Wu, LKML

On Tue, 15 May 2018 14:20:42 +0900
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:

> > And no, NMI handlers do not nest. Yes, we deal with nested NMIs, but in
> > those cases, we just set a bit as a latch, and return, and when the
> > first NMI is complete, it checks that bit and if it is set, it executes
> > another NMI handler.  
> 
> Good to know!
> I thought that NMI can nest in some weird cases, like a breakpoint from
> NMI. This must be super tricky, given that nested NMI will corrupt the
> stack of the previous NMI, etc. Anyway.

Well, they do kinda nest, but we work hard not to let them do anything
when they do. You can read all about it here:

https://lwn.net/Articles/484932/

> 
> > > Well, hm. __irq_enter() does preempt_count_add(HARDIRQ_OFFSET) and
> > > __irq_exit() does preempt_count_sub(HARDIRQ_OFFSET). So, technically,
> > > you can store
> > > 
> > > 	preempt_count() & HARDIRQ_MASK
> > > 	preempt_count() & SOFTIRQ_MASK
> > > 	preempt_count() & NMI_MASK
> > >  
> [..]
> > I handle nesting of different contexts in the ftrace ring buffer using
> > the preempt count. See trace_recursive_lock/unlock() in
> > kernel/trace/ring_buffer.c.  
> 
> Thanks. So you are also checking the preempt_count().
>

Yes I am.

-- Steve


* Re: [PATCH] printk: inject caller information into the body of message
  2018-05-11 11:58                     ` [PATCH] printk: inject caller information into the body of message Tetsuo Handa
@ 2018-05-17 11:21                       ` Sergey Senozhatsky
  2018-05-17 11:52                         ` Sergey Senozhatsky
  2018-05-18 12:15                         ` Petr Mladek
  0 siblings, 2 replies; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-05-17 11:21 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: sergey.senozhatsky.work, dvyukov, pmladek, sergey.senozhatsky,
	syzkaller, rostedt, fengguang.wu, linux-kernel, torvalds, akpm

On (05/11/18 20:58), Tetsuo Handa wrote:
[..]
> -	if (nr_ext_console_drivers || cont.len + len > sizeof(cont.buf)) {
> +	if (nr_ext_console_drivers) {
> +		cont_flush();
> +		return false;
> +	}
> +
> +	/* Inject before memcpy() in order to avoid overflow. */
> +	if (!cont.len)
> +		text = printk_inject_caller_info(text, &len);
> +
> +	/* If the line gets too long, split it up in separate records. */
> +	if (cont.len + len > sizeof(cont.buf)) {
>  		cont_flush();
>  		return false;
>  	}
> @@ -1820,6 +1860,9 @@ static size_t log_output(int facility, int level, enum log_flags lflags, const c
>  			return text_len;
>  	}
>  
> +	/* Inject caller info. */
> +	text = printk_inject_caller_info(text, &text_len);
> +
>  	/* Store it in the record log */
>  	return log_store(facility, level, lflags, 0, dict, dictlen, text, text_len);
>  }

[..]

I think this is slightly intrusive. I understand that you want to avoid
struct printk_log modification, let's try to see if we have any other
options.

Dunno...
For instance, can we store context tracking info as extended record
data? We have that dict/dict_len thing. So maybe we can store tracking
info there? Extended records will appear on the serial console /* if
console supports extended data */ or can be read in via devkmsg_read().
Any other options?

>  #include "internal.h"
>  
> @@ -83,6 +84,17 @@ static __printf(2, 0) int printk_safe_log_store(struct printk_safe_seq_buf *s,
[..]
>  	len = atomic_read(&s->len);
> @@ -102,7 +114,15 @@ static __printf(2, 0) int printk_safe_log_store(struct printk_safe_seq_buf *s,
>  		smp_rmb();
>  
>  	va_copy(ap, args);
> -	add = vscnprintf(s->buffer + len, sizeof(s->buffer) - len, fmt, ap);
> +	if (printk_caller_info) {
> +		struct va_format vaf = { .fmt = fmt, .va = &ap };
> +
> +		add = scnprintf(s->buffer + len, sizeof(s->buffer) - len,
> +				"(%c%u) %pV", c, v, &vaf);
> +	} else {
> +		add = vscnprintf(s->buffer + len, sizeof(s->buffer) - len,
> +				 fmt, ap);
> +	}

A bit of a silly question - do we want to modify printk_safe at this
point? With this implementation printk_safe entries will have two context
info-s attached: one from original printk_safe_log_store and another one
from printk_safe_flush->log_store. I suspect that adding context info in
printk_safe_log_store is, probably, not really needed. We flush printk_safe
from irq work on the CPU that issued unsafe printk, so part of the context
info will be valid if you append context info only in printk log_store - at
least the correct smp_processor_id.

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-05-17 11:21                       ` Sergey Senozhatsky
@ 2018-05-17 11:52                         ` Sergey Senozhatsky
  2018-05-18 12:15                         ` Petr Mladek
  1 sibling, 0 replies; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-05-17 11:52 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Tetsuo Handa, dvyukov, pmladek, sergey.senozhatsky, syzkaller,
	rostedt, fengguang.wu, linux-kernel, torvalds, akpm

On (05/17/18 20:21), Sergey Senozhatsky wrote:
> Dunno...
> For instance, can we store context tracking info as extended record
> data? We have that dict/dict_len thing. So maybe we can store tracking
> info there? Extended records will appear on the serial console /* if
> console supports extended data */ or can be read in via devkmsg_read().

Those extended records are already there for exactly the same
reason - people want to attach a special context to printk() entries.
See dev_vprintk_emit() and create_syslog_header(). So we can add more
key/value data to that context. Sounds kinda-sorta reasonable.

So, for example, this output
cat /dev/kmsg

6,577,3156036,-;snd_hda_codec_generic hdaudioC1D0: autoconfig for Generic: line_outs=0 (0x0/0x0/0x0/0x0/0x0) type:line
 SUBSYSTEM=hdaudio
 DEVICE=+hdaudio:hdaudioC1D
6,578,3156807,-;snd_hda_codec_generic hdaudioC1D0:    speaker_outs=0 (0x0/0x0/0x0/0x0/0x0)
 SUBSYSTEM=hdaudio
 DEVICE=+hdaudio:hdaudioC1D

Becomes this:
6,566,3033752,-;snd_hda_codec_realtek hdaudioC0D0:      Front Mic=0x19
 3/207: 0/0/0
 SUBSYSTEM=hdaudio
 DEVICE=+hdaudio:hdaudioC0D
6,567,3033754,-;snd_hda_codec_realtek hdaudioC0D0:      Rear Mic=0x18
 3/207: 0/0/0
 SUBSYSTEM=hdaudio
 DEVICE=+hdaudio:hdaudioC0D


"3/207: 0/0/0" is smp_processor_id/task_pid_nr and then masked
out bits of preempt count: hard irq, soft irq, nmi.
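
A reader of /dev/kmsg could pull such a line apart along these lines.
This is a userspace sketch under the assumption that the PoC format
stays exactly "cpu/pid: hardirq/softirq/nmi"; struct log_origin and
parse_log_origin() are hypothetical names.

```c
#include <stdio.h>

/* Hypothetical holder for the fields of a " 3/207: 0/0/0" line. */
struct log_origin {
	int cpu;
	int pid;
	/*
	 * Raw masked preempt count bits, not nesting depths: any
	 * non-zero value only says that the context was active.
	 */
	unsigned long hardirq;
	unsigned long softirq;
	unsigned long nmi;
};

/* Returns 0 on success, -1 if the line is not in the expected format. */
int parse_log_origin(const char *line, struct log_origin *o)
{
	/* The leading blank matches the space that starts dict lines. */
	int n = sscanf(line, " %d/%d: %lu/%lu/%lu",
		       &o->cpu, &o->pid, &o->hardirq, &o->softirq, &o->nmi);

	return n == 5 ? 0 : -1;
}
```

Other continuation lines such as " SUBSYSTEM=hdaudio" simply fail to
match and can be handled as ordinary key/value pairs.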

We definitely can change the format, etc. This is just a very quick and
dirty PoC.

Something as follows?
/* just to demonstrate the idea */

---

 kernel/printk/printk.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 2f4af216bd6e..4a82d52a343d 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -580,16 +580,33 @@ static u32 truncate_msg(u16 *text_len, u16 *trunc_msg_len,
 	return msg_used_size(*text_len + *trunc_msg_len, 0, pad_len);
 }
 
+static size_t add_log_origin(char *buf, size_t buf_len)
+{
+	return snprintf(buf,
+			buf_len,
+			"%d/%d: %lu/%lu/%lu",
+			raw_smp_processor_id(),
+			task_pid_nr(current),
+			preempt_count() & HARDIRQ_MASK,
+			preempt_count() & SOFTIRQ_MASK,
+			preempt_count() & NMI_MASK);
+}
+
 /* insert record into the buffer, discard old ones, update heads */
 static int log_store(int facility, int level,
 		     enum log_flags flags, u64 ts_nsec,
 		     const char *dict, u16 dict_len,
 		     const char *text, u16 text_len)
 {
+	static char log_origin[64];
+	static size_t log_origin_len;
 	struct printk_log *msg;
 	u32 size, pad_len;
 	u16 trunc_msg_len = 0;
 
+	log_origin_len = add_log_origin(log_origin, sizeof(log_origin));
+	dict_len += log_origin_len;
+
 	/* number of '\0' padding bytes to next message */
 	size = msg_used_size(text_len, dict_len, &pad_len);
 
@@ -620,7 +637,10 @@ static int log_store(int facility, int level,
 		memcpy(log_text(msg) + text_len, trunc_msg, trunc_msg_len);
 		msg->text_len += trunc_msg_len;
 	}
-	memcpy(log_dict(msg), dict, dict_len);
+	memcpy(log_dict(msg), log_origin, log_origin_len);
+	memcpy(log_dict(msg) + log_origin_len + 1,
+		dict,
+		dict_len - log_origin_len);
 	msg->dict_len = dict_len;
 	msg->facility = facility;
 	msg->level = level & 7;
 


* Re: [PATCH] printk: fix possible reuse of va_list variable
  2018-05-11 11:02                 ` [PATCH] printk: fix possible reuse of va_list variable Tetsuo Handa
  2018-05-11 11:27                   ` Sergey Senozhatsky
@ 2018-05-17 11:57                   ` Petr Mladek
  1 sibling, 0 replies; 94+ messages in thread
From: Petr Mladek @ 2018-05-17 11:57 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: sergey.senozhatsky.work, dvyukov, sergey.senozhatsky, syzkaller,
	rostedt, fengguang.wu, linux-kernel, peterz

On Fri 2018-05-11 20:02:31, Tetsuo Handa wrote:
> From 766cf72b5fdc00d1cf5a8ca2c6b23ebb75e2b4d4 Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Date: Fri, 11 May 2018 19:54:19 +0900
> Subject: [PATCH] printk: fix possible reuse of va_list variable
> 
> I noticed that there is a possibility that printk_safe_log_store() causes
> kernel oops because "args" parameter is passed to vsnprintf() again when
> atomic_cmpxchg() detected that we raced. Fix this by using va_copy().

Great catch!

Reviewed-by: Petr Mladek <pmladek@suse.com>

I have tagged it for stable and pushed into printk.git,
branch for-4.18, see
https://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk.git/commit/?h=for-4.18&id=988a35f8da1dec5a8cd2788054d1e717be61bf25

Best Regards,
Petr


* Re: [PATCH] printk: inject caller information into the body of message
  2018-05-17 11:21                       ` Sergey Senozhatsky
  2018-05-17 11:52                         ` Sergey Senozhatsky
@ 2018-05-18 12:15                         ` Petr Mladek
  2018-05-18 12:25                           ` Dmitry Vyukov
                                             ` (2 more replies)
  1 sibling, 3 replies; 94+ messages in thread
From: Petr Mladek @ 2018-05-18 12:15 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, dvyukov, sergey.senozhatsky, syzkaller,
	rostedt, fengguang.wu, linux-kernel, torvalds, akpm

On Thu 2018-05-17 20:21:35, Sergey Senozhatsky wrote:
> On (05/11/18 20:58), Tetsuo Handa wrote:
> > @@ -1820,6 +1860,9 @@ static size_t log_output(int facility, int level, enum log_flags lflags, const c
> >  			return text_len;
> >  	}
> >  
> > +	/* Inject caller info. */
> > +	text = printk_inject_caller_info(text, &text_len);
> > +
> >  	/* Store it in the record log */
> >  	return log_store(facility, level, lflags, 0, dict, dictlen, text, text_len);
> >  }
> 
> [..]
> 
> I think this is slightly intrusive. I understand that you want to avoid
> struct printk_log modification, let's try to see if we have any other
> options.

I agree with Sergey that it is intrusive. We should keep the
information separate from the original string and format it
according to the selected output format (syslog, /dev/kmsg,
console), as we do with the other metadata (e.g. timestamp,
loglevel, dict).
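
To illustrate the point, a userspace sketch follows in which the caller
info lives next to the text and each output path renders it in its own
way. Everything here is hypothetical: struct demo_record, the two
formatters and the "CALLER=" key are made up for the example and are not
an existing kernel interface.

```c
#include <stdio.h>

/* Hypothetical record: metadata is stored apart from the message text. */
struct demo_record {
	unsigned long long ts_usec;	/* timestamp in microseconds */
	unsigned int seq;		/* sequence number */
	int prio;			/* first /dev/kmsg field */
	char ctx;			/* 'T' = task, 'C' = CPU */
	unsigned int id;		/* PID or CPU number */
	const char *text;		/* unmodified message text */
};

/* Console-like rendering: metadata becomes a human-readable prefix. */
int format_console(const struct demo_record *r, char *buf, size_t len)
{
	return snprintf(buf, len, "[%5llu.%06llu] (%c%u) %s",
			r->ts_usec / 1000000, r->ts_usec % 1000000,
			r->ctx, r->id, r->text);
}

/* /dev/kmsg-like rendering: text stays as is, metadata is a key=value line. */
int format_kmsg(const struct demo_record *r, char *buf, size_t len)
{
	return snprintf(buf, len, "%d,%u,%llu,-;%s\n CALLER=%c%u\n",
			r->prio, r->seq, r->ts_usec, r->text,
			r->ctx, r->id);
}
```

With the text left untouched, adding or dropping a metadata field only
changes the formatters and never the stored message.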


> Dunno...
> For instance, can we store context tracking info as extended record
> data? We have that dict/dict_len thing. So maybe we can store tracking
> info there? Extended records will appear on the serial console /* if
> console supports extended data */ or can be read in via devkmsg_read().
> Any other options?

This sounds interesting. Well, we would need to handle different dict
items in different ways. I still wonder if we really need these "hacks".

Another option would be to store the metadata in a separate table
indexed by log_seq number. But it still looks unnecessarily complicated.

IMHO, we could change struct printk_log if we provide related
patches for crashdump and crash utilities.


Important:

First, we should ask what we expect from this feature. Different
information might be needed in different situations. In general,
people might want to know:

  + CPUid even in task context
  + exact interrupt context (soft, hard, NMI)
  + whether preemption or interrupts are enabled

It still looks bearable. But what if people want more,
e.g. context switch counts, task state, pending signals,
mem usage, cgroup stuff.

Is this information useful for all messages or only
selected ones?

Is it acceptable when message prefix is longer than, let's
say 40 characters?

Is the extended output worth having even on slow consoles?


In other words, I wonder whether you have wanted a similar feature in
many situations in the past and could provide more use cases.


Note:

The proposed patch enabled the extra info with a config option
=> you need to rebuild the kernel => you could just modify
the problematic message. We could just add some printk_ helpers
to make it easier.

Alternatively, I wonder if it might be enough to add a tracepoint
into printk() and get the extra info via
/sys/kernel/debug/tracing/events/. We would need to prevent
recursion when trace buffer is flushed by printk() but...

Best Regards,
Petr


* Re: [PATCH] printk: inject caller information into the body of message
  2018-05-18 12:15                         ` Petr Mladek
@ 2018-05-18 12:25                           ` Dmitry Vyukov
  2018-05-18 12:54                             ` Petr Mladek
  2018-05-23 10:19                           ` Tetsuo Handa
  2018-05-24  2:14                           ` Sergey Senozhatsky
  2 siblings, 1 reply; 94+ messages in thread
From: Dmitry Vyukov @ 2018-05-18 12:25 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Tetsuo Handa, Sergey Senozhatsky, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, LKML, Linus Torvalds,
	Andrew Morton

On Fri, May 18, 2018 at 2:15 PM, Petr Mladek <pmladek@suse.com> wrote:
> On Thu 2018-05-17 20:21:35, Sergey Senozhatsky wrote:
>> On (05/11/18 20:58), Tetsuo Handa wrote:
>> > @@ -1820,6 +1860,9 @@ static size_t log_output(int facility, int level, enum log_flags lflags, const c
>> >                     return text_len;
>> >     }
>> >
>> > +   /* Inject caller info. */
>> > +   text = printk_inject_caller_info(text, &text_len);
>> > +
>> >     /* Store it in the record log */
>> >     return log_store(facility, level, lflags, 0, dict, dictlen, text, text_len);
>> >  }
>>
>> [..]
>>
>> I think this is slightly intrusive. I understand that you want to avoid
>> struct printk_log modification, let's try to see if we have any other
>> options.
>
> I agree with Sergey that it is intrusive. We should keep the
> information separate from the original string and format it
> according to the selected output format (syslog, /dev/kmsg,
> console), as we do with the other metadata (e.g. timestamp,
> loglevel, dict).
>
>
>> Dunno...
>> For instance, can we store context tracking info as extended record
>> data? We have that dict/dict_len thing. So maybe we can store tracking
>> info there? Extended records will appear on the serial console /* if
>> console supports extended data */ or can be read in via devkmsg_read().

Which consoles support it?
We are interested at least in the qemu console, GCE console and Android
phone consoles. But it would be a pity if this can't be used on various
development boards too.


>> Any other options?
>
> This sounds interesting. Well, we would need to handle different dict
> items in different ways. I still wonder if we really need these "hacks".
>
> Another option would be to store the metadata in a separate table
> indexed by log_seq number. But it still looks unnecessarily complicated.
>
> IMHO, we could change struct printk_log if we provide related
> patches for crashdump and crash utilities.
>
>
> Important:
>
> First, we should ask what we expect from this feature. Different
> information might be needed in different situations. In general,
> people might want to know:
>
>   + CPUid even in task context
>   + exact interrupt context (soft, hard, NMI)
>   + whether preemption or interrupts are enabled
>
> It still looks bearable. But what if people want more,
> e.g. context switch counts, task state, pending signals,
> mem usage, cgroup stuff.
>
> Is this information useful for all messages or only
> selected ones?
>
> Is it acceptable when message prefix is longer than, let's
> say 40 characters?
>
> Is the extended output worth having even on slow consoles?
>
>
> In other words, I wonder whether you have wanted a similar feature in
> many situations in the past and could provide more use cases.
>
>
> Note:
>
> The proposed patch enabled the extra info with a config option
> => you need to rebuild the kernel => you could just modify
> the problematic message. We could just add some printk_ helpers
> to make it easier.
>
> Alternatively, I wonder if it might be enough to add a tracepoint
> into printk() and get the extra info via
> /sys/kernel/debug/tracing/events/. We would need to prevent
> recursion when trace buffer is flushed by printk() but...
>
> Best Regards,
> Petr


* Re: [PATCH] printk: inject caller information into the body of message
  2018-05-18 12:25                           ` Dmitry Vyukov
@ 2018-05-18 12:54                             ` Petr Mladek
  2018-05-18 13:08                               ` Dmitry Vyukov
  0 siblings, 1 reply; 94+ messages in thread
From: Petr Mladek @ 2018-05-18 12:54 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Tetsuo Handa, Sergey Senozhatsky, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, LKML, Linus Torvalds,
	Andrew Morton

On Fri 2018-05-18 14:25:57, Dmitry Vyukov wrote:
> > On Thu 2018-05-17 20:21:35, Sergey Senozhatsky wrote:
> >> Dunno...
> >> For instance, can we store context tracking info as extended record
> >> data? We have that dict/dict_len thing. So maybe we can store tracking
> >> info there? Extended records will appear on the serial console /* if
> >> console supports extended data */ or can be read in via devkmsg_read().
> 
> > Which consoles support it?
> > We are interested at least in the qemu console, GCE console and Android
> > phone consoles. But it would be a pity if this can't be used on various
> > development boards too.

Only the netconsole is able to show the extended (dict)
information at the moment. Search for CON_EXTENDED flag.

Best Regards,
Petr


* Re: [PATCH] printk: inject caller information into the body of message
  2018-05-18 12:54                             ` Petr Mladek
@ 2018-05-18 13:08                               ` Dmitry Vyukov
  2018-05-24  2:21                                 ` Sergey Senozhatsky
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Vyukov @ 2018-05-18 13:08 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Tetsuo Handa, Sergey Senozhatsky, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, LKML, Linus Torvalds,
	Andrew Morton

On Fri, May 18, 2018 at 2:54 PM, Petr Mladek <pmladek@suse.com> wrote:
> On Fri 2018-05-18 14:25:57, Dmitry Vyukov wrote:
>> > On Thu 2018-05-17 20:21:35, Sergey Senozhatsky wrote:
>> >> Dunno...
>> >> For instance, can we store context tracking info as extended record
>> >> data? We have that dict/dict_len thing. So maybe we can store tracking
>> >> info there? Extended records will appear on the serial console /* if
>> >> console supports extended data */ or can be read in via devkmsg_read().
>>
>> What consoles do support it?
>> We are interested at least in qemu console, GCE console and Android
>> phone consoles. But it would be pity if this can't be used on various
>> development boards too.
>
> Only the netconsole is able to show the extended (dict)
> information at the moment. Search for CON_EXTENDED flag.

Then we won't be able to use it. And we can't pipe from devkmsg_read
in user-space, because we need this to work when the kernel is broken
in various ways...


* Re: [PATCH] printk: inject caller information into the body of message
  2018-05-18 12:15                         ` Petr Mladek
  2018-05-18 12:25                           ` Dmitry Vyukov
@ 2018-05-23 10:19                           ` Tetsuo Handa
  2018-05-24  2:14                           ` Sergey Senozhatsky
  2 siblings, 0 replies; 94+ messages in thread
From: Tetsuo Handa @ 2018-05-23 10:19 UTC (permalink / raw)
  To: pmladek
  Cc: sergey.senozhatsky.work, dvyukov, sergey.senozhatsky, syzkaller,
	rostedt, fengguang.wu, linux-kernel, torvalds, akpm

Sergey Senozhatsky wrote:
> On (05/17/18 20:21), Sergey Senozhatsky wrote:
> > Dunno...
> > For instance, can we store context tracking info as extended record
> > data? We have that dict/dict_len thing. So maybe we can store tracking
> > info there? Extended records will appear on the serial console /* if
> > console supports extended data */ or can be read in via devkmsg_read().
> 
> Those extended records are already there for exactly the same
> reason - people want to attach a special context to printk() entries.
> See dev_vprintk_emit() and create_syslog_header(). So we can add more
> key/value data to that context. Sounds kinda-sorta reasonable.

Well, the context I want is not special. It is common context (like the
timestamp, which is controlled via /sys/module/printk/parameters/time )
for distinguishing/correlating concurrently printed messages.



Petr Mladek wrote:
> First, we should ask what we expect from this feature. Different
> information might be needed in different situations. In general,
> people might want to know:
> 
>   + CPUid even in task context

I don't think the CPU id in task context is common context. A task can
sleep and be migrated to a different CPU. It is special context which
would help only in specific cases.

>   + exact interrupt context (soft, hard, NMI)

I don't know whether it is worth printing. But if it is useful, printing
the type of interrupt context as a single character (%c) would be
sufficient for the context which I want.

>   + whether preemption or interrupts are enabled

I don't think preemption state is common context. It is special context
which would be explicitly printed by e.g. stall detection messages.

> 
> It still looks bearable. But what if people want more,
> e.g. context switch counts, task state, pending signals,
> mem usage, cgroup stuff.
> 

I don't think context switch counts, task state and pending signals are
common context. They are special context which would be explicitly
printed by e.g. thread dump messages.

But if people want special context like that listed above, we can
consider selecting it by bitmask (e.g. /proc/sys/kernel/sysrq ) or by
string (e.g. /proc/sys/kernel/core_pattern ).
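
To illustrate the bitmask idea, here is a minimal userspace sketch. The
CTX_* bits, struct ctx_info and format_context() are hypothetical names
invented for this example; they are not part of the kernel or of any
posted patch, and the "[C1 P1728 T] " layout is only one possible format.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical selector bits; none of these names exist in the kernel. */
enum ctx_field {
	CTX_CPU  = 1u << 0,	/* CPU number */
	CTX_PID  = 1u << 1,	/* task pid */
	CTX_KIND = 1u << 2,	/* one char: task, softirq, hardirq or NMI */
};

struct ctx_info {
	int cpu;
	int pid;
	char kind;
};

/*
 * Write the fields selected by mask into buf, e.g. "[C1 P1728 T] ".
 * Returns the number of bytes written (without the NUL), or 0 if
 * nothing is selected or the result does not fit into buf.
 */
static size_t format_context(char *buf, size_t len, unsigned int mask,
			     const struct ctx_info *ci)
{
	char cpu[16] = "", pid[16] = "", kind[4] = "";
	char inner[40];
	int n;

	if (mask & CTX_CPU)
		snprintf(cpu, sizeof(cpu), "C%d ", ci->cpu);
	if (mask & CTX_PID)
		snprintf(pid, sizeof(pid), "P%d ", ci->pid);
	if (mask & CTX_KIND)
		snprintf(kind, sizeof(kind), "%c ", ci->kind);

	n = snprintf(inner, sizeof(inner), "%s%s%s", cpu, pid, kind);
	if (n <= 0)
		return 0;		/* no known field selected */
	inner[n - 1] = '\0';		/* drop the trailing separator */

	n = snprintf(buf, len, "[%s] ", inner);
	if (n < 0 || (size_t)n >= len)
		return 0;		/* would not fit */
	return (size_t)n;
}
```

With a mask of CTX_PID only, the same call would produce "[P1728] ", so
a parser that knows the mask can still split the prefix unambiguously.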

> Is this information useful for all messages or only
> selected ones?

I think the context which I want is useful for all messages. Thus,
my patch controls it via /sys/module/printk/parameters/caller_info
as with /sys/module/printk/parameters/time .

> 
> Is it acceptable when message prefix is longer than, let's
> say 40 characters?

The context which I want won't become so long.

> 
> Is the extended output worth having even on slow consoles?

Netconsole is unique in that the number of characters to transmit and
the delay are not proportional. As long as a message fits within the
Ethernet packet size (nearly 1500 bytes, which is longer than
LOG_LINE_MAX for a printk() operation), the delay for printing one
character and for printing many characters would be almost the same.
Therefore, reducing the frequency of printk() operations by having an
API for buffered printk() (e.g.
https://groups.google.com/forum/#!topic/linux.kernel/OnoXED88nQM
and https://patchwork.kernel.org/patch/9927385/ ) would help.

But for other consoles, always printing all extended records might
become a pain. Thus, I prefer that the context which I want and the
contexts which other people might want are treated separately.
By the way, having an API for buffered printk() will help avoid the

  pr_info("printk: continuation disabled due to ext consoles, expect more fragments in /dev/kmsg\n");

case anyway...
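
A rough userspace sketch of what such buffering could look like:
fragments are collected in a caller-owned buffer and handed over only as
complete lines, so a line is never split across several stores.
struct line_buf, line_buf_printf() and record_emit() are made-up names
for illustration and are not the API from the patches linked above;
emit() merely stands in for one printk() call.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define LINE_BUF_SIZE 256	/* arbitrary size for this sketch */

struct line_buf {
	char text[LINE_BUF_SIZE];	/* always NUL-terminated at text[used] */
	size_t used;
};

/* Stand-in for a single printk() call that stores one complete line. */
typedef void (*emit_fn)(const char *line);

/* Example sink: counts calls and remembers the last line it was given. */
static int emit_calls;
static char last_line[LINE_BUF_SIZE];

static void record_emit(const char *line)
{
	emit_calls++;
	snprintf(last_line, sizeof(last_line), "%s", line);
}

/* Hand whatever is buffered to emit() in one call and reset the buffer. */
static void line_buf_flush(struct line_buf *lb, emit_fn emit)
{
	if (lb->used) {
		emit(lb->text);
		lb->used = 0;
		lb->text[0] = '\0';
	}
}

/*
 * Append a fragment. Complete lines (ending in '\n') are emitted at
 * once; a trailing partial line stays buffered for the next call.
 */
static void line_buf_printf(struct line_buf *lb, emit_fn emit,
			    const char *fmt, ...)
{
	size_t room = sizeof(lb->text) - lb->used;
	va_list ap;
	char *nl;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(lb->text + lb->used, room, fmt, ap);
	va_end(ap);
	if (n < 0)
		return;
	lb->used = (size_t)n >= room ? sizeof(lb->text) - 1 : lb->used + n;

	while ((nl = memchr(lb->text, '\n', lb->used)) != NULL) {
		size_t line_len = (size_t)(nl - lb->text) + 1;
		char saved = lb->text[line_len];

		lb->text[line_len] = '\0';	/* terminate after the '\n' */
		emit(lb->text);
		lb->text[line_len] = saved;
		/* move the remainder, including its NUL, to the front */
		memmove(lb->text, lb->text + line_len,
			lb->used - line_len + 1);
		lb->used -= line_len;
	}

	/* A full buffer without a newline cannot grow: flush it as is. */
	if (lb->used == sizeof(lb->text) - 1)
		line_buf_flush(lb, emit);
}
```

Printing "Code: ", "1d " and "ba\n" through line_buf_printf() results in
a single emit() call with the whole line, which is the property that
matters for interleaving.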


* Re: [PATCH] printk: inject caller information into the body of message
  2018-05-18 12:15                         ` Petr Mladek
  2018-05-18 12:25                           ` Dmitry Vyukov
  2018-05-23 10:19                           ` Tetsuo Handa
@ 2018-05-24  2:14                           ` Sergey Senozhatsky
  2018-05-26  6:36                             ` Dmitry Vyukov
  2 siblings, 1 reply; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-05-24  2:14 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Tetsuo Handa, Sergey Senozhatsky, dvyukov, sergey.senozhatsky,
	syzkaller, rostedt, fengguang.wu, linux-kernel, torvalds, akpm

On (05/18/18 14:15), Petr Mladek wrote:
> > Dunno...
> > For instance, can we store context tracking info as extended record
> > data? We have that dict/dict_len thing. So maybe we can store tracking
> > info there? Extended records will appear on the serial console /* if
> > console supports extended data */ or can be read in via devkmsg_read().
> > Any other options?
> 
> This sounds interesting. Well, we would need to handle different dict
> items different ways. I still wonder if we really need these "hacks".

Well, it doesn't look like a complete hack. Extended records are there
exactly for the "this printk line came from context (device, subsystem) ABC"
type of thing. Those entries are multi-key/value already; we can just
add one more key/value pair. E.g. appending CONTEXT to the already
existing SUBSYSTEM/DEVICE lines:

6,575,3130042,-;snd_hda_codec_generic hdaudioC1D0:    dig-out=0x4/0x5
 CONTEXT=4/99 PREEMPT=0/0/0/0
 SUBSYSTEM=hdaudio
 DEVICE=+hdaudio:hdaudioC1D

But I'm not pushing for this particular solution. It just looked
reasonable and very "cheap", as we don't break anything.
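
For reference, a minimal userspace sketch of how a reader of /dev/kmsg
style records could pick one key out of the dictionary lines.
kmsg_dict_get() is a hypothetical helper, the CONTEXT key exists only in
the proposal above, one KEY=value pair per line is assumed, and escaping
of non-printable bytes is ignored.

```c
#include <stddef.h>
#include <string.h>

/*
 * Look up key in the dictionary part of one extended record. Dictionary
 * lines follow the message line and each starts with a single space.
 * Returns the value length, or -1 if the key is absent or out is too
 * small to hold the value.
 */
static int kmsg_dict_get(const char *record, const char *key,
			 char *out, size_t out_len)
{
	size_t key_len = strlen(key);
	const char *line = strchr(record, '\n');	/* end of message line */

	while (line) {
		const char *p = line + 1;	/* start of the next line */
		const char *end;

		if (*p != ' ')
			break;			/* not a dictionary line */
		p++;
		end = strchr(p, '\n');
		if (!end)
			end = p + strlen(p);

		if (strncmp(p, key, key_len) == 0 && p[key_len] == '=') {
			const char *val = p + key_len + 1;
			size_t val_len = (size_t)(end - val);

			if (val_len >= out_len)
				return -1;
			memcpy(out, val, val_len);
			out[val_len] = '\0';
			return (int)val_len;
		}
		line = (*end == '\n') ? end : NULL;
	}
	return -1;
}
```

A log parser could call this with "CONTEXT" on every record and fall
back to "no context" when the key is missing, so old and new kernels
would be handled by the same code.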

> 
> IMHO, we could change struct printk_log if we provide related
> patches for crashdump and crash utilities.

Yep.

> First, we should ask what we expect from this feature.

Yeah. Can't really comment on this, it's up to Tetsuo and Dmitry to
decide. So far I've seen slightly different requirements/expectations.

> Different information might be needed in different situations.
> In general, people might want to know:
> 
>   + CPUid even in task context
>   + exact interrupt context (soft, hard, NMI)

Agreed.

>   + whether preemption or interrupts are enabled

preemption and irqs are already disabled this far in printk() internals.

> It still looks bearable. But what if people want more,
> e.g. context switch counts, task state, pending signals,
> mem usage, cgroup stuff.

Right. Extended records [dicts] can be up to 8k each, so I'd say
that we can have as many key/value pairs as we want to.

> Is this information useful for all messages or only
> selected ones?

No idea :)

> Is it acceptable when message prefix is longer than, let's
> say 40 characters?

If we talk about embedding this info into normal message payload
then, yes, we better keep it as small as possible. Because we are
limited by LOG_LINE_MAX + PREFIX_MAX chars [~1024 bytes, if I recall
correctly], the more we steal for context info the less we have for
the message.
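
To put numbers on that budget, a small sketch. The two constants are
assumed to match kernel/printk/printk.c of that time (around v4.17) and
should be checked against the tree in use; the "[cpu/pid] " prefix and
both helper names are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumed values, believed to match kernel/printk/printk.c around v4.17. */
#define PREFIX_MAX	32
#define LOG_LINE_MAX	(1024 - PREFIX_MAX)

/* Length of a hypothetical "[cpu/pid] " context prefix. */
static size_t context_len(int cpu, int pid)
{
	int n = snprintf(NULL, 0, "[%d/%d] ", cpu, pid);

	return n < 0 ? 0 : (size_t)n;
}

/* Bytes left for the message text once the context is embedded in it. */
static size_t payload_left(size_t ctx_len)
{
	return ctx_len >= LOG_LINE_MAX ? 0 : LOG_LINE_MAX - ctx_len;
}
```

Under these assumptions a 9-byte prefix such as "[1/1728] " leaves 983
of 992 bytes for the text, and a 40-byte prefix leaves 952.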

> Is the extended output worth having even on slow consoles?

My expectation was that syzkaller is mostly executed in a qemu
environment. But if someone wants to run it on a device with a slow
console, then it might be painful.

> In other words, I wonder if you wanted a similar feature in many
> situations in the past and could provide more use cases.

Sorry, can you explain a bit more?

> Note:
> 
> The proposed patch enabled the extra info with a config option
> => you need to rebuild the kernel => you could just modify
> the problematic message. We could just add some printk_ helpers
> to make it easier.

Yes. As far as I know syzkaller folks are completely fine with
the .config based solution and can rebuild the kernel as many times
as needed; modifying the kernel code, on the other hand, is not an
option.

> Alternatively, I wonder if it might be enough to add a tracepoint
> into printk() and get the extra info via
> /sys/kernel/debug/tracing/events/.

Sounds good to me.

> We would need to prevent recursion when trace buffer is flushed by
> printk() but...

Agreed.

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-05-18 13:08                               ` Dmitry Vyukov
@ 2018-05-24  2:21                                 ` Sergey Senozhatsky
  0 siblings, 0 replies; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-05-24  2:21 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Petr Mladek, Tetsuo Handa, Sergey Senozhatsky,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, Fengguang Wu,
	LKML, Linus Torvalds, Andrew Morton

On (05/18/18 15:08), Dmitry Vyukov wrote:
[..]
> >> What consoles do support it?
> >> We are interested at least in qemu console, GCE console and Android
> >> phone consoles. But it would be pity if this can't be used on various
> >> development boards too.
> >
> > Only the netconsole is able to show the extended (dict)
> > information at the moment. Search for CON_EXTENDED flag.
> 
> Then we won't be able to use it. And we can't pipe from devkmsg_read
> in user-space, because we need this to work when the kernel is broken
> in various ways...

Hmm. Well, basically, any console that has CON_EXTENDED bit set; which
is, probably, only netconsole at this point. Do you use slow serial
consoles?

OK, seems like extended printk records won't make you happy after all :)

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-05-24  2:14                           ` Sergey Senozhatsky
@ 2018-05-26  6:36                             ` Dmitry Vyukov
  2018-06-20  5:44                               ` Dmitry Vyukov
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Vyukov @ 2018-05-26  6:36 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Tetsuo Handa, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, LKML, Linus Torvalds,
	Andrew Morton

On Thu, May 24, 2018 at 4:14 AM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
>> First, we should ask what we expect from this feature.
>
> Yeah. Can't really comment on this, it's up to Tetsuo and Dmitry to
> decide. So far I've seen slightly different requirements/expectations.

The root problem is that it's not possible to make sense out of kernel
output if a message takes more than 1 line (or is output non-atomically
with several printk's) because of intermixed output from several
tasks/interrupts/etc. For example, it's not generally possible to
recover a crash stack trace, because one gets a random mix of frames.
Humans usually, but not always, can restore most of the sense. So the
goal is to make this ought-to-be-simple task actually simple and not
requiring human intelligence and time each time.

Prefixing each line with task/cpu/interrupt context should do the
trick as it will be possible to split kernel output into multiple
independent streams and analyze them independently.

In our context (syzbot testing) we can enable an additional config,
and adapt the parser to understand an additional line prefix. But I
don't know how prefixing lines fits into the larger picture. Does it
make sense to think through a potential extension story for this format?
E.g. the user specifies a set of extension records that are dumped
before each line, and then can unambiguously parse them? I guess some
consoles/interfaces will never be extended to provide access to the
extension records, so it can make sense to make them accessible in
text format too (optionally).


* Re: [PATCH] printk: inject caller information into the body of message
  2018-05-26  6:36                             ` Dmitry Vyukov
@ 2018-06-20  5:44                               ` Dmitry Vyukov
  2018-06-20  8:31                                 ` Sergey Senozhatsky
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Vyukov @ 2018-06-20  5:44 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Tetsuo Handa, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, LKML, Linus Torvalds,
	Andrew Morton

On Sat, May 26, 2018 at 8:36 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Thu, May 24, 2018 at 4:14 AM, Sergey Senozhatsky
> <sergey.senozhatsky.work@gmail.com> wrote:
>>> First, we should ask what we expect from this feature.
>>
>> Yeah. Can't really comment on this, it's up to Tetsuo and Dmitry to
>> decide. So far I've seen slightly different requirements/expectations.
>
> The root problem is that it's not possible to make sense out of kernel
> output if a message takes more than 1 line (or is output non-atomically
> with several printk's) because of intermixed output from several
> tasks/interrupts/etc. For example, it's not generally possible to
> recover a crash stack trace, because one gets a random mix of frames.
> Humans usually, but not always, can restore most of the sense. So the
> goal is to make this ought-to-be-simple task actually simple and not
> requiring human intelligence and time each time.
>
> Prefixing each line with task/cpu/interrupt context should do the
> trick as it will be possible to split kernel output into multiple
> independent streams and analyze them independently.
>
> In our context (syzbot testing) we can enable an additional config,
> and adapt the parser to understand an additional line prefix. But I
> don't know how prefixing lines fits into the larger picture. Does it
> make sense to think through a potential extension story for this format?
> E.g. the user specifies a set of extension records that are dumped
> before each line, and then can unambiguously parse them? I guess some
> consoles/interfaces will never be extended to provide access to the
> extension records, so it can make sense to make them accessible in
> text format too (optionally).


up

We continue to get a mess like this, each instance of which needs to
be checked by a human.


BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
sysfs: cannot create duplicate filename '/class/ieee80211/!'
PGD 1cae7e067 P4D 1cae7e067 PUD 1b4da6067 PMD 0
Oops: 0010 [#1] SMP KASAN
CPU: 1 PID: 1728 Comm: syz-executor4 Not tainted 4.17.0+ #84
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
CPU: 0 PID: 1738 Comm: syz-executor7 Not tainted 4.17.0+ #84
RIP: 0010:          (null)
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
Code:
Call Trace:
Bad RIP value.
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1b9/0x294 lib/dump_stack.c:113
RSP: 0018:ffff88018cd3f590 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff880192f05800 RCX: 1ffffffff10eeea9
 sysfs_warn_dup.cold.3+0x1c/0x2b fs/sysfs/dir.c:30
RDX: ffff88018cd3fab0 RSI: ffff8801c927a480 RDI: ffff88018c77c040
 sysfs_do_create_link_sd.isra.2+0x116/0x130 fs/sysfs/symlink.c:50
RBP: ffff88018cd3f700 R08: 0000000000000001 R09: 0000000000000000
 sysfs_do_create_link fs/sysfs/symlink.c:79 [inline]
 sysfs_create_link+0x65/0xc0 fs/sysfs/symlink.c:91
R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff100319a7eb7
R13: ffff88018cd3fab0 R14: ffff880192f05812 R15: ffff880192f05c58
 device_add_class_symlinks drivers/base/core.c:1632 [inline]
 device_add+0x5c9/0x16f0 drivers/base/core.c:1834
FS:  00007f4a8e71f700(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 0000000191e1b000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 wiphy_register+0x182e/0x24e0 net/wireless/core.c:813
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ieee80211_register_hw+0x13cd/0x35d0 net/mac80211/main.c:1050
 sock_poll+0x1d1/0x710 net/socket.c:1168
 mac80211_hwsim_new_radio+0x1da2/0x33b0
drivers/net/wireless/mac80211_hwsim.c:2772
 vfs_poll+0x77/0x2a0 fs/select.c:40
 do_pollfd fs/select.c:848 [inline]
 do_poll fs/select.c:896 [inline]
 do_sys_poll+0x6fd/0x1100 fs/select.c:990
 hwsim_new_radio_nl+0x7b8/0xa60 drivers/net/wireless/mac80211_hwsim.c:3247
 genl_family_rcv_msg+0x889/0x1120 net/netlink/genetlink.c:599
 genl_rcv_msg+0xc6/0x170 net/netlink/genetlink.c:624
 netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2448
 __do_sys_poll fs/select.c:1048 [inline]
 __se_sys_poll fs/select.c:1036 [inline]
 __x64_sys_poll+0x189/0x510 fs/select.c:1036
 genl_rcv+0x28/0x40 net/netlink/genetlink.c:635
 netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
 netlink_unicast+0x58b/0x740 net/netlink/af_netlink.c:1336
 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290
 netlink_sendmsg+0x9f0/0xfa0 net/netlink/af_netlink.c:1901
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x455b29
 sock_sendmsg_nosec net/socket.c:645 [inline]
 sock_sendmsg+0xd5/0x120 net/socket.c:655
Code:
 ___sys_sendmsg+0x805/0x940 net/socket.c:2161
1d
ba
fb
ff
c3
66
2e
0f
1f
 __sys_sendmsg+0x115/0x270 net/socket.c:2199
84
00
00
00
 __do_sys_sendmsg net/socket.c:2208 [inline]
 __se_sys_sendmsg net/socket.c:2206 [inline]
 __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2206
00
 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290
00 66
90
48
89
f8 48
89
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
f7
RIP: 0033:0x455b29
48
Code:
89
1d
d6
ba fb
48
ff
89
c3
ca
66
4d
2e
89
0f
c2
1f
4d
84
89
00
c8
00
4c
00
8b
00
4c
00
24 08
66
0f
90
05 <48>
48
3d
89
01
f8
f0
48
ff ff
89
0f 83
f7
eb
48
b9 fb
89
ff
d6
c3
48 89
66
ca 4d
2e
89
0f
c2
1f
4d
84
89
00
c8
00
4c
00
8b
00
4c
RSP: 002b:00007f4a8e71ec68 EFLAGS: 00000246
24
 ORIG_RAX: 0000000000000007
08
RAX: ffffffffffffffda RBX: 00007f4a8e71f6d4 RCX: 0000000000455b29
0f
RDX: 0000000000000004 RSI: 0000000000000005 RDI: 0000000020000000
05
RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
<48> 3d
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
01
R13: 00000000004c06c7 R14: 00000000004d0030 R15: 0000000000000000
f0
Modules linked in:
ff
Dumping ftrace buffer:
ff
   (ftrace buffer empty)
0f
CR2: 0000000000000000
83
---[ end trace 69744e61e26ed6a4 ]---
eb b9 fb ff c3 66 2e
RIP: 0010:          (null)
0f 1f 84 00 00 00
Code:
00
RSP: 002b:00007f4e4fdedc68 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
Bad RIP value.
RAX: ffffffffffffffda RBX: 00007f4e4fdee6d4 RCX: 0000000000455b29
RDX: 0000000000000000 RSI: 0000000020000080 RDI: 0000000000000014
RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
RSP: 0018:ffff88018cd3f590 EFLAGS: 00010246
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 00000000004c0ee7 R14: 00000000004d0d80 R15: 0000000000000000
RAX: 0000000000000000 RBX: ffff880192f05800 RCX: 1ffffffff10eeea9
RDX: ffff88018cd3fab0 RSI: ffff8801c927a480 RDI: ffff88018c77c040
RBP: ffff88018cd3f700 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff100319a7eb7
R13: ffff88018cd3fab0 R14: ffff880192f05812 R15: ffff880192f05c58
FS:  00007f4a8e71f700(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
netlink: 8 bytes leftover after parsing attributes in process `syz-executor2'.
CR2: ffffffffffffffd6 CR3: 0000000191e1b000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20  5:44                               ` Dmitry Vyukov
@ 2018-06-20  8:31                                 ` Sergey Senozhatsky
  2018-06-20  8:45                                   ` Dmitry Vyukov
  0 siblings, 1 reply; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-06-20  8:31 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Petr Mladek, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, Fengguang Wu,
	LKML, Linus Torvalds, Andrew Morton

On (06/20/18 07:44), Dmitry Vyukov wrote:
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> sysfs: cannot create duplicate filename '/class/ieee80211/!'
> PGD 1cae7e067 P4D 1cae7e067 PUD 1b4da6067 PMD 0
> Oops: 0010 [#1] SMP KASAN
> CPU: 1 PID: 1728 Comm: syz-executor4 Not tainted 4.17.0+ #84
> Hardware name: Google Google Compute Engine/Google Compute Engine,
> BIOS Google 01/01/2011
> CPU: 0 PID: 1738 Comm: syz-executor7 Not tainted 4.17.0+ #84
> RIP: 0010:          (null)
> Hardware name: Google Google Compute Engine/Google Compute Engine,
> BIOS Google 01/01/2011
> Code:
> Call Trace:
> Bad RIP value.
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
> RSP: 0018:ffff88018cd3f590 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: ffff880192f05800 RCX: 1ffffffff10eeea9
>  sysfs_warn_dup.cold.3+0x1c/0x2b fs/sysfs/dir.c:30
> RDX: ffff88018cd3fab0 RSI: ffff8801c927a480 RDI: ffff88018c77c040
>  sysfs_do_create_link_sd.isra.2+0x116/0x130 fs/sysfs/symlink.c:50
> RBP: ffff88018cd3f700 R08: 0000000000000001 R09: 0000000000000000
>  sysfs_do_create_link fs/sysfs/symlink.c:79 [inline]
>  sysfs_create_link+0x65/0xc0 fs/sysfs/symlink.c:91
> R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff100319a7eb7
> R13: ffff88018cd3fab0 R14: ffff880192f05812 R15: ffff880192f05c58
>  device_add_class_symlinks drivers/base/core.c:1632 [inline]
>  device_add+0x5c9/0x16f0 drivers/base/core.c:1834
> FS:  00007f4a8e71f700(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffffffffd6 CR3: 0000000191e1b000 CR4: 00000000001406e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>  wiphy_register+0x182e/0x24e0 net/wireless/core.c:813
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  ieee80211_register_hw+0x13cd/0x35d0 net/mac80211/main.c:1050
>  sock_poll+0x1d1/0x710 net/socket.c:1168
>  mac80211_hwsim_new_radio+0x1da2/0x33b0
> drivers/net/wireless/mac80211_hwsim.c:2772
>  vfs_poll+0x77/0x2a0 fs/select.c:40
>  do_pollfd fs/select.c:848 [inline]
>  do_poll fs/select.c:896 [inline]
>  do_sys_poll+0x6fd/0x1100 fs/select.c:990
>  hwsim_new_radio_nl+0x7b8/0xa60 drivers/net/wireless/mac80211_hwsim.c:3247
>  genl_family_rcv_msg+0x889/0x1120 net/netlink/genetlink.c:599
>  genl_rcv_msg+0xc6/0x170 net/netlink/genetlink.c:624
>  netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2448
>  __do_sys_poll fs/select.c:1048 [inline]
>  __se_sys_poll fs/select.c:1036 [inline]
>  __x64_sys_poll+0x189/0x510 fs/select.c:1036
>  genl_rcv+0x28/0x40 net/netlink/genetlink.c:635
>  netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
>  netlink_unicast+0x58b/0x740 net/netlink/af_netlink.c:1336
>  do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290
>  netlink_sendmsg+0x9f0/0xfa0 net/netlink/af_netlink.c:1901
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x455b29
>  sock_sendmsg_nosec net/socket.c:645 [inline]
>  sock_sendmsg+0xd5/0x120 net/socket.c:655
> Code:
>  ___sys_sendmsg+0x805/0x940 net/socket.c:2161
> 1d
> ba
> fb
> ff
> c3
> 66
> 2e
> 0f
> 1f
>  __sys_sendmsg+0x115/0x270 net/socket.c:2199
> 84
> 00
> 00
> 00
>  __do_sys_sendmsg net/socket.c:2208 [inline]
>  __se_sys_sendmsg net/socket.c:2206 [inline]
>  __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2206
> 00
>  do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290
> 00 66
> 90
> 48
> 89
> f8 48
> 89

Meh, pr_cont() output... I forgot about it. So I have a very simple
patch [probably buggy]. One thing we can be sure of is that it does
not handle pr_cont() interleaving properly - it logs the context which
has stored the message, while in the case of pr_cont() that is not
always correct since we can have a preliminary pr_cont() flush. It also
doesn't handle printk_safe stuff. Tetsuo's patch, probably, handled all
those cases. Hmm.

The patch below is less intrusive but also less complete / less universal.
Maybe it's enough for you, maybe it's not. Wondering if this patch will
make any difference on your side to begin with. Note, I'm not pushing for
this particular message format, we can change it the way you want.

===

Subject: [PATCH] printk: log message origin context info

---
 kernel/printk/printk.c | 31 ++++++++++++++++++++++++++++++-
 lib/Kconfig.debug      |  8 ++++++++
 2 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 247808333ba4..304a02b0c432 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -580,16 +580,38 @@ static u32 truncate_msg(u16 *text_len, u16 *trunc_msg_len,
 	return msg_used_size(*text_len + *trunc_msg_len, 0, pad_len);
 }
 
+static size_t log_message_origin(char *buf, size_t buf_len)
+{
+	size_t ret = 0;
+
+#ifdef CONFIG_PRINTK_LOG_MESSAGE_ORIGIN
+	ret = snprintf(buf,
+			buf_len,
+			"[%d/%d preempt:%lu/%lu/%lu] ",
+			raw_smp_processor_id(),
+			task_pid_nr(current),
+			in_nmi(),
+			in_irq(),
+			in_serving_softirq());
+#endif
+	return ret;
+}
+
 /* insert record into the buffer, discard old ones, update heads */
 static int log_store(int facility, int level,
 		     enum log_flags flags, u64 ts_nsec,
 		     const char *dict, u16 dict_len,
 		     const char *text, u16 text_len)
 {
+	static char log_origin[64];
+	static size_t log_origin_len;
 	struct printk_log *msg;
 	u32 size, pad_len;
 	u16 trunc_msg_len = 0;
 
+	log_origin_len = log_message_origin(log_origin, sizeof(log_origin));
+	text_len += log_origin_len;
+
 	/* number of '\0' padding bytes to next message */
 	size = msg_used_size(text_len, dict_len, &pad_len);
 
@@ -614,7 +636,14 @@ static int log_store(int facility, int level,
 
 	/* fill message */
 	msg = (struct printk_log *)(log_buf + log_next_idx);
-	memcpy(log_text(msg), text, text_len);
+	if (log_origin_len) {
+		memcpy(log_text(msg), log_origin, log_origin_len);
+		memcpy(log_text(msg) + log_origin_len,
+			text,
+			text_len - log_origin_len);
+	} else {
+		memcpy(log_text(msg), text, text_len);
+	}
 	msg->text_len = text_len;
 	if (trunc_msg_len) {
 		memcpy(log_text(msg) + text_len, trunc_msg, trunc_msg_len);
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 8838d1158d19..57220642a00b 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -15,6 +15,14 @@ config PRINTK_TIME
 	  The behavior is also controlled by the kernel command line
 	  parameter printk.time=1. See Documentation/admin-guide/kernel-parameters.rst
 
+config PRINTK_LOG_MESSAGE_ORIGIN
+	bool "Store printk() message origin context info"
+	depends on PRINTK
+	help
+	  Selecting this option causes extra information - CPU, task pid,
+	  preemption mask - to be added to every message. This can be
+	  helpful when interleaved printk() lines cause too much confusion.
+
 config CONSOLE_LOGLEVEL_DEFAULT
 	int "Default console loglevel (1-15)"
 	range 1 15
-- 
2.17.1



* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20  8:31                                 ` Sergey Senozhatsky
@ 2018-06-20  8:45                                   ` Dmitry Vyukov
  2018-06-20  9:06                                     ` Sergey Senozhatsky
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Vyukov @ 2018-06-20  8:45 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Tetsuo Handa, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, LKML, Linus Torvalds,
	Andrew Morton

On Wed, Jun 20, 2018 at 10:31 AM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
> On (06/20/18 07:44), Dmitry Vyukov wrote:
>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
>> sysfs: cannot create duplicate filename '/class/ieee80211/!'
>> PGD 1cae7e067 P4D 1cae7e067 PUD 1b4da6067 PMD 0
>> Oops: 0010 [#1] SMP KASAN
>> CPU: 1 PID: 1728 Comm: syz-executor4 Not tainted 4.17.0+ #84
>> Hardware name: Google Google Compute Engine/Google Compute Engine,
>> BIOS Google 01/01/2011
>> CPU: 0 PID: 1738 Comm: syz-executor7 Not tainted 4.17.0+ #84
>> RIP: 0010:          (null)
>> Hardware name: Google Google Compute Engine/Google Compute Engine,
>> BIOS Google 01/01/2011
>> Code:
>> Call Trace:
>> Bad RIP value.
>>  __dump_stack lib/dump_stack.c:77 [inline]
>>  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
>> RSP: 0018:ffff88018cd3f590 EFLAGS: 00010246
>> RAX: 0000000000000000 RBX: ffff880192f05800 RCX: 1ffffffff10eeea9
>>  sysfs_warn_dup.cold.3+0x1c/0x2b fs/sysfs/dir.c:30
>> RDX: ffff88018cd3fab0 RSI: ffff8801c927a480 RDI: ffff88018c77c040
>>  sysfs_do_create_link_sd.isra.2+0x116/0x130 fs/sysfs/symlink.c:50
>> RBP: ffff88018cd3f700 R08: 0000000000000001 R09: 0000000000000000
>>  sysfs_do_create_link fs/sysfs/symlink.c:79 [inline]
>>  sysfs_create_link+0x65/0xc0 fs/sysfs/symlink.c:91
>> R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff100319a7eb7
>> R13: ffff88018cd3fab0 R14: ffff880192f05812 R15: ffff880192f05c58
>>  device_add_class_symlinks drivers/base/core.c:1632 [inline]
>>  device_add+0x5c9/0x16f0 drivers/base/core.c:1834
>> FS:  00007f4a8e71f700(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: ffffffffffffffd6 CR3: 0000000191e1b000 CR4: 00000000001406e0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>  wiphy_register+0x182e/0x24e0 net/wireless/core.c:813
>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> Call Trace:
>>  ieee80211_register_hw+0x13cd/0x35d0 net/mac80211/main.c:1050
>>  sock_poll+0x1d1/0x710 net/socket.c:1168
>>  mac80211_hwsim_new_radio+0x1da2/0x33b0
>> drivers/net/wireless/mac80211_hwsim.c:2772
>>  vfs_poll+0x77/0x2a0 fs/select.c:40
>>  do_pollfd fs/select.c:848 [inline]
>>  do_poll fs/select.c:896 [inline]
>>  do_sys_poll+0x6fd/0x1100 fs/select.c:990
>>  hwsim_new_radio_nl+0x7b8/0xa60 drivers/net/wireless/mac80211_hwsim.c:3247
>>  genl_family_rcv_msg+0x889/0x1120 net/netlink/genetlink.c:599
>>  genl_rcv_msg+0xc6/0x170 net/netlink/genetlink.c:624
>>  netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2448
>>  __do_sys_poll fs/select.c:1048 [inline]
>>  __se_sys_poll fs/select.c:1036 [inline]
>>  __x64_sys_poll+0x189/0x510 fs/select.c:1036
>>  genl_rcv+0x28/0x40 net/netlink/genetlink.c:635
>>  netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
>>  netlink_unicast+0x58b/0x740 net/netlink/af_netlink.c:1336
>>  do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290
>>  netlink_sendmsg+0x9f0/0xfa0 net/netlink/af_netlink.c:1901
>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>> RIP: 0033:0x455b29
>>  sock_sendmsg_nosec net/socket.c:645 [inline]
>>  sock_sendmsg+0xd5/0x120 net/socket.c:655
>> Code:
>>  ___sys_sendmsg+0x805/0x940 net/socket.c:2161
>> 1d
>> ba
>> fb
>> ff
>> c3
>> 66
>> 2e
>> 0f
>> 1f
>>  __sys_sendmsg+0x115/0x270 net/socket.c:2199
>> 84
>> 00
>> 00
>> 00
>>  __do_sys_sendmsg net/socket.c:2208 [inline]
>>  __se_sys_sendmsg net/socket.c:2206 [inline]
>>  __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2206
>> 00
>>  do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290
>> 00 66
>> 90
>> 48
>> 89
>> f8 48
>> 89
>
> Meh, pr_cont() output... I forgot about it. So I have a very simple
> patch [probably buggy]. One thing we can be sure of is that it does
> not handle pr_cont() interleaving properly - it logs the context which
> has stored the messages, while in case of pr_cont() it is not always
> correct since we can have a preliminary pr_cont() flush. It also doesn't
> handle printk_safe stuff. Tetsuo's patch, probably, handled all those
> cases. Hmm.
>
> The patch below is less intrusive but also less complete / less universal.
> Maybe it's enough for you, maybe it's not. Wondering if this patch will
> make any difference on your side to begin with. Note, I'm not pushing for
> this particular message format, we can change it the way you want.

Hi Sergey,

What are the visible differences between this patch and Tetsuo's
patch? The only thing that will matter for syzkaller parsing in the
end is the resulting text format as it appears on console. But you say
"I'm not pushing for this particular message format", so what exactly
do you want me to provide feedback on?
I guess we need to handle pr_cont properly whatever approach we take.

Re format, for us it would be much more convenient if the context is a
single token that can be used as is, say "T<pid>" for task context,
"I<cpu>" for interrupts, "N<cpu>" for nmi's, etc. Rather than: split
it all into tokens and parse, then look at a set of flags and choose
the highest priority set flag and then depending on the flag choose
either task id or cpu id.
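
To make the single-token idea concrete, here is a minimal user-space
sketch of how such a token could be built. The names (caller_ctx,
format_caller_token) are hypothetical and do not come from either
patch, and it assumes the angle brackets in "T<pid>" are placeholders
rather than literal characters.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical context kinds; letters follow the "T", "I", "N" suggestion. */
enum caller_ctx { CTX_TASK, CTX_IRQ, CTX_NMI };

/*
 * Format a single-token caller id such as "T1234" or "I3" into buf.
 * id is a pid for task context and a CPU number otherwise.
 * Returns the number of characters written (excluding the NUL),
 * or 0 if buf is too small to hold the whole token.
 */
size_t format_caller_token(char *buf, size_t len, enum caller_ctx ctx, int id)
{
	static const char letters[] = { 'T', 'I', 'N' };
	int n;

	if (len == 0)
		return 0;
	n = snprintf(buf, len, "%c%d", letters[ctx], id);
	if (n < 0 || (size_t)n >= len) {
		buf[0] = '\0';	/* never hand back a truncated token */
		return 0;
	}
	return (size_t)n;
}
```

A parser can then compare tokens as opaque strings without looking at
any flags.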

> ===
>
> Subject: [PATCH] printk: log message origin context info
>
> ---
>  kernel/printk/printk.c | 31 ++++++++++++++++++++++++++++++-
>  lib/Kconfig.debug      |  8 ++++++++
>  2 files changed, 38 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 247808333ba4..304a02b0c432 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -580,16 +580,38 @@ static u32 truncate_msg(u16 *text_len, u16 *trunc_msg_len,
>         return msg_used_size(*text_len + *trunc_msg_len, 0, pad_len);
>  }
>
> +static size_t log_message_origin(char *buf, size_t buf_len)
> +{
> +       size_t ret = 0;
> +
> +#ifdef CONFIG_PRINTK_LOG_MESSAGE_ORIGIN
> +       ret = snprintf(buf,
> +                       buf_len,
> +                       "[%d/%d preempt:%lu/%lu/%lu] ",
> +                       raw_smp_processor_id(),
> +                       task_pid_nr(current),
> +                       in_nmi(),
> +                       in_irq(),
> +                       in_serving_softirq());
> +#endif
> +       return ret;
> +}
> +
>  /* insert record into the buffer, discard old ones, update heads */
>  static int log_store(int facility, int level,
>                      enum log_flags flags, u64 ts_nsec,
>                      const char *dict, u16 dict_len,
>                      const char *text, u16 text_len)
>  {
> +       static char log_origin[64];
> +       static size_t log_origin_len;
>         struct printk_log *msg;
>         u32 size, pad_len;
>         u16 trunc_msg_len = 0;
>
> +       log_origin_len = log_message_origin(log_origin, sizeof(log_origin));
> +       text_len += log_origin_len;
> +
>         /* number of '\0' padding bytes to next message */
>         size = msg_used_size(text_len, dict_len, &pad_len);
>
> @@ -614,7 +636,14 @@ static int log_store(int facility, int level,
>
>         /* fill message */
>         msg = (struct printk_log *)(log_buf + log_next_idx);
> -       memcpy(log_text(msg), text, text_len);
> +       if (log_origin_len) {
> +               memcpy(log_text(msg), log_origin, log_origin_len);
> +               memcpy(log_text(msg) + log_origin_len,
> +                       text,
> +                       text_len - log_origin_len);
> +       } else {
> +               memcpy(log_text(msg), text, text_len);
> +       }
>         msg->text_len = text_len;
>         if (trunc_msg_len) {
>                 memcpy(log_text(msg) + text_len, trunc_msg, trunc_msg_len);
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 8838d1158d19..57220642a00b 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -15,6 +15,14 @@ config PRINTK_TIME
>           The behavior is also controlled by the kernel command line
>           parameter printk.time=1. See Documentation/admin-guide/kernel-parameters.rst
>
> +config PRINTK_LOG_MESSAGE_ORIGIN
> +       bool "Store printk() message origin context info"
> +       depends on PRINTK
> +       help
> +         Selecting this option causes extra information - CPU, task pid,
> +         preemption mask - to be added to every message. This can be
> +         helpful when interleaved printk() lines cause too much confusion.
> +
>  config CONSOLE_LOGLEVEL_DEFAULT
>         int "Default console loglevel (1-15)"
>         range 1 15
> --
> 2.17.1
>


* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20  8:45                                   ` Dmitry Vyukov
@ 2018-06-20  9:06                                     ` Sergey Senozhatsky
  2018-06-20  9:18                                       ` Sergey Senozhatsky
  2018-06-20  9:30                                       ` Dmitry Vyukov
  0 siblings, 2 replies; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-06-20  9:06 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Petr Mladek, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, Fengguang Wu,
	LKML, Linus Torvalds, Andrew Morton

Hi Dmitry,

On (06/20/18 10:45), Dmitry Vyukov wrote:
> Hi Sergey,
> 
> What are the visible differences between this patch and Tetsuo's
> patch?

I guess none, and looking at your requirements below I tend to agree
that Tetsuo's approach is probably what you need at the end of the day.

> The only thing that will matter for syzkaller parsing in the
> end is the resulting text format as it appears on console. But you say
> "I'm not pushing for this particular message format", so what exactly
> do you want me to provide feedback on?
> I guess we need to handle pr_cont properly whatever approach we take.

Mostly, I was wondering whether:
a) you need pr_cont() handling
b) you need printk_safe() handling

The reasons I left those things behind:

a) pr_cont() is officially hated. It was never supposed to be used
   on SMP systems. So I wasn't sure if we need all that effort and
   tricky code to handle pr_cont(), given that syzkaller is
   probably the only user of that functionality.

b) printk_safe output is quite uncommon. And we flush per-CPU buffer
   from the same CPU which has caused printk_safe output [except for
   panic() flush] therefore logging the info available to log_store()
   seemed enough. IOW, once again, was a bit unsure if we want to add
   some complex code to already complex code, with just one potential
   user.

To summarize, I was just wondering where is the waterline: can a small
patch make you happy, or do you need a big one.

> Re format, for us it would be much more convenient if the context is a
> single token that can be used as is, say "T<pid>" for task context,
> "I<cpu>" for interrupts, "N<cpu>" for nmi's

Got it.

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20  9:06                                     ` Sergey Senozhatsky
@ 2018-06-20  9:18                                       ` Sergey Senozhatsky
  2018-06-20  9:31                                         ` Dmitry Vyukov
  2018-06-20  9:30                                       ` Dmitry Vyukov
  1 sibling, 1 reply; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-06-20  9:18 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Dmitry Vyukov, Petr Mladek, Tetsuo Handa, Sergey Senozhatsky,
	syzkaller, Steven Rostedt, Fengguang Wu, LKML, Linus Torvalds,
	Andrew Morton

On (06/20/18 18:06), Sergey Senozhatsky wrote:
> 
> b) printk_safe output is quite uncommon. And we flush per-CPU buffer
>    from the same CPU which has caused printk_safe output [except for
>    panic() flush] therefore logging the info available to log_store()
>    seemed enough. IOW, once again, was a bit unsure if we want to add
>    some complex code to already complex code, with just one potential
>    user.

BTW, pr_cont() handling is not so simple when we are in printk_safe()
context. Unlike vprintk_emit() [normal printk], we don't use any
dedicated pr_cont() buffer in printk_safe. So, at a glance, I suspect
that injecting context info at every printk_safe_log_store() call for
`for (...) pr_cont()' loop is going to produce something like this:
	I<10> 23 I<10> 43 I<10> 47 ....

	// Hmm, maybe the line will end up having two prefixes: one
	// from printk_safe_log_store, the other from normal printk
	// log_store().

While the same `for (...) pr_cont()' called from normal printk() context
will produce
	I<10> 23 43 47 ....

It could be that I'm wrong.
Tetsuo, have you tested pr_cont() from printk_safe() context?
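
The difference between the two outputs can be modelled in user space.
This is only a sketch of the suspected behaviour, not of the real
printk_safe code: store_prefix_each() stands in for a path that injects
the prefix on every store, store_prefix_once() for a path that collects
continuation fragments first. All names here are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Append src to dst (capacity cap, NUL included) without overflowing. */
void append(char *dst, size_t cap, const char *src)
{
	size_t used = strlen(dst);

	if (used + 1 < cap)
		snprintf(dst + used, cap - used, "%s", src);
}

/* Model 1: the prefix is injected on every store, continuation or not. */
void store_prefix_each(char *out, size_t cap, const char *prefix,
		       const char *const *frags, size_t n)
{
	size_t i;

	out[0] = '\0';		/* cap must be at least 1 */
	for (i = 0; i < n; i++) {
		append(out, cap, prefix);
		append(out, cap, frags[i]);
	}
}

/* Model 2: fragments are collected first, the prefix is added once. */
void store_prefix_once(char *out, size_t cap, const char *prefix,
		       const char *const *frags, size_t n)
{
	size_t i;

	out[0] = '\0';		/* cap must be at least 1 */
	append(out, cap, prefix);
	for (i = 0; i < n; i++)
		append(out, cap, frags[i]);
}
```

Feeding the fragments "23 ", "43 ", "47" with the prefix "I<10> " into
the first model repeats the prefix before every number, while the
second model emits it once at the start of the line.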

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20  9:06                                     ` Sergey Senozhatsky
  2018-06-20  9:18                                       ` Sergey Senozhatsky
@ 2018-06-20  9:30                                       ` Dmitry Vyukov
  2018-06-20 11:19                                         ` Sergey Senozhatsky
  2018-06-20 11:37                                         ` Fengguang Wu
  1 sibling, 2 replies; 94+ messages in thread
From: Dmitry Vyukov @ 2018-06-20  9:30 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Tetsuo Handa, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, LKML, Linus Torvalds,
	Andrew Morton

On Wed, Jun 20, 2018 at 11:06 AM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
> Hi Dmitry,
>
> On (06/20/18 10:45), Dmitry Vyukov wrote:
>> Hi Sergey,
>>
>> What are the visible differences between this patch and Tetsuo's
>> patch?
>
> I guess none, and looking at your requirements below I tend to agree
> that Tetsuo's approach is probably what you need at the end of the day.
>
>> The only thing that will matter for syzkaller parsing in the
>> end is the resulting text format as it appears on console. But you say
>> "I'm not pushing for this particular message format", so what exactly
>> do you want me to provide feedback on?
>> I guess we need to handle pr_cont properly whatever approach we take.
>
> Mostly, I was wondering whether:
> a) you need pr_cont() handling
> b) you need printk_safe() handling
>
> The reasons I left those things behind:
>
> a) pr_cont() is officially hated. It was never supposed to be used
>    on SMP systems. So I wasn't sure if we need all that effort and
>    add tricky code to handle pr_cont(). Given that syzkaller is
>    probably the only user of that functionality.

Well, if I put my syzkaller hat on, then I don't care what exactly
happens in the kernel, the only thing I care about is well-formed output on
console that can be parsed unambiguously in all cases.
From this point of view I guess pr_cont is actually syzkaller's worst
enemy. If pr_cont is officially hated, and it causes corrupted crash
reports, then we can resolve it by just getting rid of more pr_cont's.
So potentially we do not need any support for pr_cont in this patch.
However, we also need to be practical and if there are tons of
pr_cont's then we need some intermediate support of them, just because
we won't be able to get rid of all of them overnight.

But even if we attach context to pr_cont, it still causes problems for
crash parsing, because today we see:

BUG: unable to handle
... 10 lines ...
kernel
... 10 lines ...
paging request
... 10 lines ...
at ADDR

Which is not too friendly for parsing regardless of contexts.
So I am leaning towards getting rid of pr_cont's as the solution to
the problem.

Looking at current uses of pr_cont:
https://elixir.bootlin.com/linux/latest/ident/pr_cont
It does not look too bad. arch/ except for x86 and exotic drivers
won't cause problems for syzbot today, so we can live with these uses
for now.



> b) printk_safe output is quite uncommon. And we flush per-CPU buffer
>    from the same CPU which has caused printk_safe output [except for
>    panic() flush] therefore logging the info available to log_store()
>    seemed enough. IOW, once again, was a bit unsure if we want to add
>    some complex code to already complex code, with just one potential
>    user.


I can't fully answer this because I don't understand what the
implications for the actual output are.
You can use this as litmus test: can you write a simple script that
will parse such output and make sense out of it?
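
As a rough illustration of that litmus test, the routine below pulls a
leading context token off a line, assuming the hypothetical format
"<letter><decimal id><space><message>" with the letter being T, I or N.
Neither the format nor the function name comes from an actual patch,
and a message that happens to start with something like "T1 " would
still be misread, so the real format would need to rule that out.

```c
#include <ctype.h>
#include <stdlib.h>
#include <stddef.h>

/*
 * Split "T1234 some text" into kind='T', id=1234, rest="some text".
 * Returns 1 if the line starts with a well-formed token, 0 otherwise;
 * the output parameters are only written on success.
 */
int parse_caller_token(const char *line, char *kind, long *id,
		       const char **rest)
{
	char *end;
	long value;

	if (line[0] != 'T' && line[0] != 'I' && line[0] != 'N')
		return 0;
	if (!isdigit((unsigned char)line[1]))
		return 0;
	value = strtol(line + 1, &end, 10);
	if (*end != ' ')
		return 0;
	*kind = line[0];
	*id = value;
	*rest = end + 1;
	return 1;
}
```

Lines that fail to parse (for example fragments left behind by
pr_cont) are exactly the ones a script cannot attribute to a context.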

Well, it's not for one user. It affects each and every single user of
the Linux kernel out there. Just take a look at these. They are complete
nonsense: it's not just that syzkaller can't make sense of them, nobody
can make sense of them:

https://gist.githubusercontent.com/dvyukov/1528e86e5139f2fd1bf9902398d48298/raw/3b42148554eefed210f1e626d5befd50405c5487/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/6e08ac521f3e19534970ed97aeee1603/raw/0f0bb361902de94e7ee331ac500a3ceebf812c22/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/6e9db2313e48773ad1cd861da8020008/raw/d5b7c023fc8a38c72b1cf8bb1da85fb1c31cea5f/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/3d1bda4c690414ac027de1da45759751/raw/2c68980eabf4f6be24060e807a75f2d3570b5a42/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/9b8831e9ac73ffafa111a33ad40c5667/raw/f4097fbea8f89b25a282a6ef7e648145e10ae4b7/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/d78a3187a1b4e004820e92efcb16f9e0/raw/5530bcbf009c3fba3c581b2d24c523c673c6ef12/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/da1e42436af9ad2afc7de49f2d503510/raw/7dd4cbcc651c5b87122f066a3c689999ae8c4121/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/4571b94bd8cbd78d759412c560fa395c/raw/964c73fc993fc8a9000571e0b7618000584f3638/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/b6deac5faa958ae3733413b34dd5feed/raw/c4da219e284f7fc55da8c3c3af623a87f31bf653/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/2f54c6a2e45347ea76d9c5ce3c0ff091/raw/45f4873898ec8e0d9aa16b9c5c63a85410fd05e0/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/96cb39e29124dbbe2a65a91ec7a5639e/raw/aa8f7b2b1dfa5b8bb8cf93d8a821ca9938e8fc54/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/424da8282d5b28f8be10eab595d37444/raw/acc2fb1ececc1ea9a8215213f7e37e08b524c096/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/b07f37720c632d6d56ae67d95e5599b3/raw/8624ba47d6eb4e7d4d58e3ae1242ebe6cc46d361/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/bc24a7b92289ec04587fb29fc1085045/raw/3136e9262ee2233b5ab369a4a82e83953fc2d8a2/gistfile1.txt


* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20  9:18                                       ` Sergey Senozhatsky
@ 2018-06-20  9:31                                         ` Dmitry Vyukov
  2018-06-20 11:07                                           ` Sergey Senozhatsky
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Vyukov @ 2018-06-20  9:31 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Tetsuo Handa, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, LKML, Linus Torvalds,
	Andrew Morton

On Wed, Jun 20, 2018 at 11:18 AM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
> On (06/20/18 18:06), Sergey Senozhatsky wrote:
>>
>> b) printk_safe output is quite uncommon. And we flush per-CPU buffer
>>    from the same CPU which has caused printk_safe output [except for
>>    panic() flush] therefore logging the info available to log_store()
>>    seemed enough. IOW, once again, was a bit unsure if we want to add
>>    some complex code to already complex code, with just one potential
>>    user.
>
> BTW, pr_cont() handling is not so simple when we are in printk_safe()
> context. Unlike vprintk_emit() [normal printk], we don't use any
> dedicated pr_cont() buffer in printk_safe. So, at a glance, I suspect
> that injecting context info at every printk_safe_log_store() call for
> `for (...) pr_cont()' loop is going to produce something like this:
>         I<10> 23 I<10> 43 I<10> 47 ....
>
>         // Hmm, maybe the line will end up having two prefixes: one
>         // from printk_safe_log_store, the other from normal printk
>         // log_store().
>
> While the same `for (...) pr_cont()' called from normal printk() context
> will produce
>         I<10> 32 43 47 ....
>
> It could be that I'm wrong.
> Tetsuo, have you tested pr_cont() from printk_safe() context?


So this is another reason to get rid of pr_cont entirely, right?


* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20  9:31                                         ` Dmitry Vyukov
@ 2018-06-20 11:07                                           ` Sergey Senozhatsky
  2018-06-20 11:32                                             ` Dmitry Vyukov
  0 siblings, 1 reply; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-06-20 11:07 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Petr Mladek, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, Fengguang Wu,
	LKML, Linus Torvalds, Andrew Morton

On (06/20/18 11:31), Dmitry Vyukov wrote:
> > BTW, pr_cont() handling is not so simple when we are in printk_safe()
> > context. Unlike vprintk_emit() [normal printk], we don't use any
> > dedicated pr_cont() buffer in printk_safe. So, at a glance, I suspect
> > that injecting context info at every printk_safe_log_store() call for
> > `for (...) pr_cont()' loop is going to produce something like this:
> >         I<10> 23 I<10> 43 I<10> 47 ....
> >
> >         // Hmm, maybe the line will end up having two prefixes: one
> >         // from printk_safe_log_store, the other from normal printk
> >         // log_store().
> >
> > While the same `for (...) pr_cont()' called from normal printk() context
> > will produce
> >         I<10> 32 43 47 ....
> >
> > It could be that I'm wrong.
> > Tetsuo, have you tested pr_cont() from printk_safe() context?
> 
> 
> So this is another reason to get rid of pr_cont entirely, right?

Getting rid of pr_cont() from important output would be totally cool.
Quoting Linus:

    Only acceptable use of continuations is basically boot-time testing,
    when you do things like

     printk("Testing feature XYZ..");
     this_may_blow_up_because_of_hw_bugs();
     printk(KERN_CONT " ... ok\n");


I can recall at least 4 attempts when people tried to introduce new pr_cont()
or some concept with similar functionality to pr_cont(), but SMP safe. We
brought the first one - per-CPU pr_cont() buffers - to KS several years ago
but Linus didn't like it. Then there was a buffered printk() mode patch from
Tetsuo, then a solution from Steven, then I had my second try with a
sort-of-pr_cont() replacement.

So, if we could get rid of pr_cont() from the most important parts
(instruction dumps, etc) then I would just vote to leave pr_cont()
alone and avoid any handling of it in printk context tracking. Simply
because we wouldn't care about pr_cont(). This also could simplify
Tetsuo's patch significantly.

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20  9:30                                       ` Dmitry Vyukov
@ 2018-06-20 11:19                                         ` Sergey Senozhatsky
  2018-06-20 11:25                                           ` Dmitry Vyukov
  2018-06-20 11:37                                         ` Fengguang Wu
  1 sibling, 1 reply; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-06-20 11:19 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Petr Mladek, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, Fengguang Wu,
	LKML, Linus Torvalds, Andrew Morton

On (06/20/18 11:30), Dmitry Vyukov wrote:
> 
> https://gist.githubusercontent.com/dvyukov/1528e86e5139f2fd1bf9902398d48298/raw/3b42148554eefed210f1e626d5befd50405c5487/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/6e08ac521f3e19534970ed97aeee1603/raw/0f0bb361902de94e7ee331ac500a3ceebf812c22/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/6e9db2313e48773ad1cd861da8020008/raw/d5b7c023fc8a38c72b1cf8bb1da85fb1c31cea5f/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/3d1bda4c690414ac027de1da45759751/raw/2c68980eabf4f6be24060e807a75f2d3570b5a42/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/9b8831e9ac73ffafa111a33ad40c5667/raw/f4097fbea8f89b25a282a6ef7e648145e10ae4b7/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/d78a3187a1b4e004820e92efcb16f9e0/raw/5530bcbf009c3fba3c581b2d24c523c673c6ef12/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/da1e42436af9ad2afc7de49f2d503510/raw/7dd4cbcc651c5b87122f066a3c689999ae8c4121/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/4571b94bd8cbd78d759412c560fa395c/raw/964c73fc993fc8a9000571e0b7618000584f3638/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/b6deac5faa958ae3733413b34dd5feed/raw/c4da219e284f7fc55da8c3c3af623a87f31bf653/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/2f54c6a2e45347ea76d9c5ce3c0ff091/raw/45f4873898ec8e0d9aa16b9c5c63a85410fd05e0/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/96cb39e29124dbbe2a65a91ec7a5639e/raw/aa8f7b2b1dfa5b8bb8cf93d8a821ca9938e8fc54/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/424da8282d5b28f8be10eab595d37444/raw/acc2fb1ececc1ea9a8215213f7e37e08b524c096/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/b07f37720c632d6d56ae67d95e5599b3/raw/8624ba47d6eb4e7d4d58e3ae1242ebe6cc46d361/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/bc24a7b92289ec04587fb29fc1085045/raw/3136e9262ee2233b5ab369a4a82e83953fc2d8a2/gistfile1.txt


Just a small remark

I randomly picked some links, and at least in several reports I saw:

** 4495 printk messages dropped ** [   50.830930]  [<ffffffff8123ab47>] do_raw_write_lock+0xc7/0x1d0
** 3816 printk messages dropped ** [   50.839887]  [<ffffffff814fb353>] SyS_read+0xd3/0x1c0
** 3497 printk messages dropped ** [   50.848107]  [<ffffffff81003044>] ? lockdep_sys_exit_thunk+0x12/0x14
** 4057 printk messages dropped ** [   50.857615] 	run_ksoftirqd+0x20/0x60
** 2855 printk messages dropped ** [   50.864318]  [<ffffffff814fb353>] SyS_read+0xd3/0x1c0
** 3490 printk messages dropped ** [   50.872518]  [<ffffffff815bee10>] ? fsnotify+0xe40/0xe40
** 3600 printk messages dropped ** [   50.880974] 	SyS_fcntl+0x5be/0xc70

This will not get any better if we have printk context tracking. The
problem here is that we lose messages: your console is significantly slower
than your CPUs. So while one CPU is doing its best printing pending logbuf
messages to a slow console, the rest of CPUs don't hesitate to append new
messages (printk -> log_store). Since logbuf is limited in size - we wrap
around and this results in lost messages.
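
A toy model of that wrap-around, tracking sequence numbers only. The
capacity and all names are made up for illustration; the real logbuf
is sized in bytes, not messages, and a real console driver does far
more than this reader.

```c
#include <stddef.h>

#define RING_CAP 4	/* hypothetical capacity, in messages */

struct ring {
	unsigned long first_seq;	/* oldest message still stored */
	unsigned long next_seq;		/* sequence number of the next store */
};

/* Store one message; overwrite the oldest one when the ring is full. */
void ring_store(struct ring *r)
{
	if (r->next_seq - r->first_seq == RING_CAP)
		r->first_seq++;
	r->next_seq++;
}

/*
 * Consume one message for the reader at *reader_seq.
 * Returns how many messages were overwritten before the reader
 * got to them; the reader skips ahead to the oldest survivor.
 */
unsigned long ring_read(const struct ring *r, unsigned long *reader_seq)
{
	unsigned long dropped = 0;

	if (*reader_seq < r->first_seq) {
		dropped = r->first_seq - *reader_seq;
		*reader_seq = r->first_seq;
	}
	if (*reader_seq < r->next_seq)
		(*reader_seq)++;
	return dropped;
}
```

With ten stores and no reads in between, the first read reports six
dropped messages: a prefix on each surviving line cannot bring those
back.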

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20 11:19                                         ` Sergey Senozhatsky
@ 2018-06-20 11:25                                           ` Dmitry Vyukov
  0 siblings, 0 replies; 94+ messages in thread
From: Dmitry Vyukov @ 2018-06-20 11:25 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Tetsuo Handa, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, LKML, Linus Torvalds,
	Andrew Morton

On Wed, Jun 20, 2018 at 1:19 PM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
> On (06/20/18 11:30), Dmitry Vyukov wrote:
>>
>> https://gist.githubusercontent.com/dvyukov/1528e86e5139f2fd1bf9902398d48298/raw/3b42148554eefed210f1e626d5befd50405c5487/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/6e08ac521f3e19534970ed97aeee1603/raw/0f0bb361902de94e7ee331ac500a3ceebf812c22/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/6e9db2313e48773ad1cd861da8020008/raw/d5b7c023fc8a38c72b1cf8bb1da85fb1c31cea5f/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/3d1bda4c690414ac027de1da45759751/raw/2c68980eabf4f6be24060e807a75f2d3570b5a42/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/9b8831e9ac73ffafa111a33ad40c5667/raw/f4097fbea8f89b25a282a6ef7e648145e10ae4b7/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/d78a3187a1b4e004820e92efcb16f9e0/raw/5530bcbf009c3fba3c581b2d24c523c673c6ef12/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/da1e42436af9ad2afc7de49f2d503510/raw/7dd4cbcc651c5b87122f066a3c689999ae8c4121/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/4571b94bd8cbd78d759412c560fa395c/raw/964c73fc993fc8a9000571e0b7618000584f3638/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/b6deac5faa958ae3733413b34dd5feed/raw/c4da219e284f7fc55da8c3c3af623a87f31bf653/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/2f54c6a2e45347ea76d9c5ce3c0ff091/raw/45f4873898ec8e0d9aa16b9c5c63a85410fd05e0/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/96cb39e29124dbbe2a65a91ec7a5639e/raw/aa8f7b2b1dfa5b8bb8cf93d8a821ca9938e8fc54/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/424da8282d5b28f8be10eab595d37444/raw/acc2fb1ececc1ea9a8215213f7e37e08b524c096/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/b07f37720c632d6d56ae67d95e5599b3/raw/8624ba47d6eb4e7d4d58e3ae1242ebe6cc46d361/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/bc24a7b92289ec04587fb29fc1085045/raw/3136e9262ee2233b5ab369a4a82e83953fc2d8a2/gistfile1.txt
>
>
> Just a small remark
>
> I randomly picked some links, and at least in several reports I saw:
>
> ** 4495 printk messages dropped ** [   50.830930]  [<ffffffff8123ab47>] do_raw_write_lock+0xc7/0x1d0
> ** 3816 printk messages dropped ** [   50.839887]  [<ffffffff814fb353>] SyS_read+0xd3/0x1c0
> ** 3497 printk messages dropped ** [   50.848107]  [<ffffffff81003044>] ? lockdep_sys_exit_thunk+0x12/0x14
> ** 4057 printk messages dropped ** [   50.857615]       run_ksoftirqd+0x20/0x60
> ** 2855 printk messages dropped ** [   50.864318]  [<ffffffff814fb353>] SyS_read+0xd3/0x1c0
> ** 3490 printk messages dropped ** [   50.872518]  [<ffffffff815bee10>] ? fsnotify+0xe40/0xe40
> ** 3600 printk messages dropped ** [   50.880974]       SyS_fcntl+0x5be/0xc70
>
> This will not get any better if we have printk context tracking. The
> problem here is that we lose messages: your console is significantly slower
> than your CPUs. So while one CPU is doing its best printing pending logbuf
> messages to a slow console, the rest of CPUs don't hesitate to append new
> messages (printk -> log_store). Since logbuf is limited in size - we wrap
> around and this results in lost messages.

Yes, I realize there are multiple problems combined here.


* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20 11:07                                           ` Sergey Senozhatsky
@ 2018-06-20 11:32                                             ` Dmitry Vyukov
  2018-06-20 13:06                                               ` Sergey Senozhatsky
  2018-06-21  8:29                                               ` Sergey Senozhatsky
  0 siblings, 2 replies; 94+ messages in thread
From: Dmitry Vyukov @ 2018-06-20 11:32 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Tetsuo Handa, Sergey Senozhatsky, syzkaller,
	Steven Rostedt, Fengguang Wu, LKML, Linus Torvalds,
	Andrew Morton

On Wed, Jun 20, 2018 at 1:07 PM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
> On (06/20/18 11:31), Dmitry Vyukov wrote:
>> > BTW, pr_cont() handling is not so simple when we are in printk_safe()
>> > context. Unlike vprintk_emit() [normal printk], we don't use any
>> > dedicated pr_cont() buffer in printk_safe. So, at a glance, I suspect
>> > that injecting context info at every printk_safe_log_store() call for
>> > `for (...) pr_cont()' loop is going to produce something like this:
>> >         I<10> 23 I<10> 43 I<10> 47 ....
>> >
>> >         // Hmm, maybe the line will end up having two prefixes: one
>> >         // from printk_safe_log_store, the other from normal printk
>> >         // log_store().
>> >
>> > While the same `for (...) pr_cont()' called from normal printk() context
>> > will produce
>> >         I<10> 32 43 47 ....
>> >
>> > It could be that I'm wrong.
>> > Tetsuo, have you tested pr_cont() from printk_safe() context?
>>
>>
>> So this is another reason to get rid of pr_cont entirely, right?
>
> Getting rid of pr_cont() from important output would be totally cool.
> Quoting Linus:
>
>     Only acceptable use of continuations is basically boot-time testing,
>     when you do things like
>
>      printk("Testing feature XYZ..");
>      this_may_blow_up_because_of_hw_bugs();
>      printk(KERN_CONT " ... ok\n");
>
>
> I can recall at least 4 attempts when people tried to introduce new pr_cont()
> or some concept with similar functionality to pr_cont(), but SMP safe. We
> brought the first one - per-CPU pr_cont() buffers - to KS several years ago
> but Linus didn't like it. Then there was a buffered printk() mode patch from
> Tetsuo, then a solution from Steven, then I had my second try with a
> sort-of-pr_cont() replacement.
>
> So, if we could get rid of pr_cont() from the most important parts
> (instruction dumps, etc) then I would just vote to leave pr_cont()
> alone and avoid any handling of it in printk context tracking. Simply
> because we wouldn't care about pr_cont(). This also could simplify
> Tetsuo's patch significantly.

Sounds good to me.


* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20  9:30                                       ` Dmitry Vyukov
  2018-06-20 11:19                                         ` Sergey Senozhatsky
@ 2018-06-20 11:37                                         ` Fengguang Wu
  2018-06-20 12:31                                           ` Dmitry Vyukov
  1 sibling, 1 reply; 94+ messages in thread
From: Fengguang Wu @ 2018-06-20 11:37 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Petr Mladek, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, LKML,
	Linus Torvalds, Andrew Morton

On Wed, Jun 20, 2018 at 11:30:05AM +0200, Dmitry Vyukov wrote:
>On Wed, Jun 20, 2018 at 11:06 AM, Sergey Senozhatsky
><sergey.senozhatsky.work@gmail.com> wrote:
>> Hi Dmitry,
>>
>> On (06/20/18 10:45), Dmitry Vyukov wrote:
>>> Hi Sergey,
>>>
>>> What are the visible differences between this patch and Tetsuo's
>>> patch?
>>
>> I guess none, and looking at your requirements below I tend to agree
>> that Tetsuo's approach is probably what you need at the end of the day.
>>
>>> The only thing that will matter for syzkaller parsing in the
>>> end is the resulting text format as it appears on console. But you say
>>> "I'm not pushing for this particular message format", so what exactly
>>> do you want me to provide feedback on?
>>> I guess we need to handle pr_cont properly whatever approach we take.
>>
>> Mostly, was wondering about if:
>> a) you need pr_cont() handling
>> b) you need printk_safe() handling
>>
>> The reasons I left those things behind:
>>
>> a) pr_cont() is officially hated. It was never supposed to be used
>>    on SMP systems. So I wasn't sure if we need all that effort and
>>    add tricky code to handle pr_cont(). Given that syzkaller is
>>    probably the only user of that functionality.
>
>Well, if I put my syzkaller hat on, then I don't care what exactly
>happens in the kernel, the only thing I care is well-formed output on
>console that can be parsed unambiguously in all cases.

+1 for 0day kernel testing.

I admit that goal may never be 100% achievable -- at least some serial
console logs can sometimes become messy. So we'll have to write dmesg
parsing code in defensive ways.

But some unnecessary pr_cont() broken-up messages can obviously be
avoided. For example,

arch/x86/mm/fault.c:

	printk(KERN_ALERT "BUG: unable to handle kernel ");
	if (address < PAGE_SIZE)
		printk(KERN_CONT "NULL pointer dereference");
	else
		printk(KERN_CONT "paging request");

I've actually proposed to remove the above KERN_CONT, unfortunately the
patch was silently ignored.

>From this point of view I guess pr_cont is actually syzkaller's worst
>enemy. If pr_cont is officially hated, and it causes corrupted crash
>reports, then we can resolve it by just getting rid of more pr_cont's.
>So potentially we do not need any support for pr_cont in this patch.
>However, we also need to be practical and if there are tons of
>pr_cont's then we need some intermediate support of them, just because
>we won't be able to get rid of all of them overnight.
>
>But even if we attach context to pr_cont, it still causes problems for
>crash parsing, because today we see:
>
>BUG: unable to handle
>... 10 lines ...
>kernel
>... 10 lines ...
>paging request
>... 10 lines ...
>at ADDR
>
>Which is not too friendly for parsing regardless of contexts.

We met exactly the same issue and ended up with special handling in
https://github.com/intel/lkp-tests/blob/master/lib/dmesg.rb:

       /(BUG: unable to handle kernel)/,
       /(BUG: unable to handle kernel) NULL pointer dereference/,
       /(BUG: unable to handle kernel) paging request/,
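
The same defensive matching can be sketched in userspace C. This is only an
illustration of the idea, not the lkp-tests code: classify_bug_line() and
enum bug_kind are hypothetical names, and the header-only case is an
assumption about how a parser might flag a report whose continuation was
lost or interleaved with other output.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical classification of a single console line. */
enum bug_kind {
	BUG_NONE,	/* the oops header is not present */
	BUG_NULL_DEREF,	/* header followed by "NULL pointer dereference" */
	BUG_PAGING,	/* header followed by "paging request" */
	BUG_TRUNCATED,	/* header only: the continuation went elsewhere */
};

static bool starts_with(const char *s, const char *prefix)
{
	return strncmp(s, prefix, strlen(prefix)) == 0;
}

static enum bug_kind classify_bug_line(const char *line)
{
	static const char header[] = "BUG: unable to handle kernel";
	const char *p = strstr(line, header);

	if (!p)
		return BUG_NONE;

	/* Skip the header and the single space that normally follows it. */
	p += sizeof(header) - 1;
	if (*p == ' ')
		p++;

	if (starts_with(p, "NULL pointer dereference"))
		return BUG_NULL_DEREF;
	if (starts_with(p, "paging request"))
		return BUG_PAGING;

	/* Anything else means the pr_cont() pieces did not stay together. */
	return BUG_TRUNCATED;
}
```

A caller could keep BUG_TRUNCATED reports under the generic title while
still recognizing them as oopses, which is roughly what the first of the
three patterns above achieves.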

>So I am leaning towards to getting rid of pr_cont's as the solution to
>the problem.

+1 for reducing unnecessary pr_cont() uses.

Thanks,
Fengguang

* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20 11:37                                         ` Fengguang Wu
@ 2018-06-20 12:31                                           ` Dmitry Vyukov
  2018-06-20 12:41                                             ` Fengguang Wu
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Vyukov @ 2018-06-20 12:31 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Sergey Senozhatsky, Petr Mladek, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, LKML,
	Linus Torvalds, Andrew Morton

On Wed, Jun 20, 2018 at 1:37 PM, Fengguang Wu <fengguang.wu@intel.com> wrote:
> On Wed, Jun 20, 2018 at 11:30:05AM +0200, Dmitry Vyukov wrote:
>>
>> On Wed, Jun 20, 2018 at 11:06 AM, Sergey Senozhatsky
>> <sergey.senozhatsky.work@gmail.com> wrote:
>>>
>>> Hi Dmitry,
>>>
>>> On (06/20/18 10:45), Dmitry Vyukov wrote:
>>>>
>>>> Hi Sergey,
>>>>
>>>> What are the visible differences between this patch and Tetsuo's
>>>> patch?
>>>
>>>
>>> I guess none, and looking at your requirements below I tend to agree
>>> that Tetsuo's approach is probably what you need at the end of the day.
>>>
>>>> The only thing that will matter for syzkaller parsing in the
>>>> end is the resulting text format as it appears on console. But you say
>>>> "I'm not pushing for this particular message format", so what exactly
>>>> do you want me to provide feedback on?
>>>> I guess we need to handle pr_cont properly whatever approach we take.
>>>
>>>
>>> Mostly, was wondering about if:
>>> a) you need pr_cont() handling
>>> b) you need printk_safe() handling
>>>
>>> The reasons I left those things behind:
>>>
>>> a) pr_cont() is officially hated. It was never supposed to be used
>>>    on SMP systems. So I wasn't sure if we need all that effort and
>>>    add tricky code to handle pr_cont(). Given that syzkaller is
>>>    probably the only user of that functionality.
>>
>>
>> Well, if I put my syzkaller hat on, then I don't care what exactly
>> happens in the kernel, the only thing I care is well-formed output on
>> console that can be parsed unambiguously in all cases.
>
>
> +1 for 0day kernel testing.
>
> I admit that goal may never be 100% achievable -- at least some serial
> console logs can sometimes become messy. So we'll have to write dmesg
> parsing code in defensive ways.
>
> But some unnecessary pr_cont() broken-up messages can obviously be
> avoided. For example,
>
> arch/x86/mm/fault.c:
>
>         printk(KERN_ALERT "BUG: unable to handle kernel ");
>         if (address < PAGE_SIZE)
>                 printk(KERN_CONT "NULL pointer dereference");
>         else
>                 printk(KERN_CONT "paging request");
>
> I've actually proposed to remove the above KERN_CONT, unfortunately the
> patch was silently ignored.


I've just cooked this change too, but do you mind reviving your patch?

It actually makes the code even shorter, which is nice:

--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -671,13 +671,9 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code,
                        printk(smep_warning, from_kuid(&init_user_ns, current_uid()));
        }

-       printk(KERN_ALERT "BUG: unable to handle kernel ");
-       if (address < PAGE_SIZE)
-               printk(KERN_CONT "NULL pointer dereference");
-       else
-               printk(KERN_CONT "paging request");
-
-       printk(KERN_CONT " at %px\n", (void *) address);
+       printk(KERN_ALERT "BUG: unable to handle kernel %s at %px\n",
+               (address < PAGE_SIZE ? "NULL pointer dereference" :
+               "paging request"), (void *) address);

        dump_pagetable(address);
 }

* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20 12:31                                           ` Dmitry Vyukov
@ 2018-06-20 12:41                                             ` Fengguang Wu
  2018-06-20 12:45                                               ` Dmitry Vyukov
  0 siblings, 1 reply; 94+ messages in thread
From: Fengguang Wu @ 2018-06-20 12:41 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Petr Mladek, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, LKML,
	Linus Torvalds, Andrew Morton

On Wed, Jun 20, 2018 at 02:31:51PM +0200, Dmitry Vyukov wrote:
>On Wed, Jun 20, 2018 at 1:37 PM, Fengguang Wu <fengguang.wu@intel.com> wrote:
>> On Wed, Jun 20, 2018 at 11:30:05AM +0200, Dmitry Vyukov wrote:
>>>
>>> On Wed, Jun 20, 2018 at 11:06 AM, Sergey Senozhatsky
>>> <sergey.senozhatsky.work@gmail.com> wrote:
>>>>
>>>> Hi Dmitry,
>>>>
>>>> On (06/20/18 10:45), Dmitry Vyukov wrote:
>>>>>
>>>>> Hi Sergey,
>>>>>
>>>>> What are the visible differences between this patch and Tetsuo's
>>>>> patch?
>>>>
>>>>
>>>> I guess none, and looking at your requirements below I tend to agree
>>>> that Tetsuo's approach is probably what you need at the end of the day.
>>>>
>>>>> The only thing that will matter for syzkaller parsing in the
>>>>> end is the resulting text format as it appears on console. But you say
>>>>> "I'm not pushing for this particular message format", so what exactly
>>>>> do you want me to provide feedback on?
>>>>> I guess we need to handle pr_cont properly whatever approach we take.
>>>>
>>>>
>>>> Mostly, was wondering about if:
>>>> a) you need pr_cont() handling
>>>> b) you need printk_safe() handling
>>>>
>>>> The reasons I left those things behind:
>>>>
>>>> a) pr_cont() is officially hated. It was never supposed to be used
>>>>    on SMP systems. So I wasn't sure if we need all that effort and
>>>>    add tricky code to handle pr_cont(). Given that syzkaller is
>>>>    probably the only user of that functionality.
>>>
>>>
>>> Well, if I put my syzkaller hat on, then I don't care what exactly
>>> happens in the kernel, the only thing I care is well-formed output on
>>> console that can be parsed unambiguously in all cases.
>>
>>
>> +1 for 0day kernel testing.
>>
>> I admit that goal may never be 100% achievable -- at least some serial
>> console logs can sometimes become messy. So we'll have to write dmesg
>> parsing code in defensive ways.
>>
>> But some unnecessary pr_cont() broken-up messages can obviously be
>> avoided. For example,
>>
>> arch/x86/mm/fault.c:
>>
>>         printk(KERN_ALERT "BUG: unable to handle kernel ");
>>         if (address < PAGE_SIZE)
>>                 printk(KERN_CONT "NULL pointer dereference");
>>         else
>>                 printk(KERN_CONT "paging request");
>>
>> I've actually proposed to remove the above KERN_CONT, unfortunately the
>> patch was silently ignored.
>
>
>I've just cooked this change too, but do you mind reviving your patch?

Yes, sure. My version is dumber, since I'm not sure whether it's OK to
do string formatting at this critical point. Let's see what others
think of the 2 approaches. I'm fine as long as our problem is fixed. :)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 9a84a0d08727..c7b068c6b010 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -671,11 +671,10 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code,
                        printk(smep_warning, from_kuid(&init_user_ns, current_uid()));
        }

-       printk(KERN_ALERT "BUG: unable to handle kernel ");
        if (address < PAGE_SIZE)
-               printk(KERN_CONT "NULL pointer dereference");
+               printk(KERN_ALERT "BUG: unable to handle kernel NULL pointer dereference");
        else
-               printk(KERN_CONT "paging request");
+               printk(KERN_ALERT "BUG: unable to handle kernel paging request");

        printk(KERN_CONT " at %px\n", (void *) address);

>It actually makes the code even shorter, which is nice:
>
>--- a/arch/x86/mm/fault.c
>+++ b/arch/x86/mm/fault.c
>@@ -671,13 +671,9 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code,
>                        printk(smep_warning, from_kuid(&init_user_ns, current_uid()));
>        }
>
>-       printk(KERN_ALERT "BUG: unable to handle kernel ");
>-       if (address < PAGE_SIZE)
>-               printk(KERN_CONT "NULL pointer dereference");
>-       else
>-               printk(KERN_CONT "paging request");
>-
>-       printk(KERN_CONT " at %px\n", (void *) address);
>+       printk(KERN_ALERT "BUG: unable to handle kernel %s at %px\n",
>+               (address < PAGE_SIZE ? "NULL pointer dereference" :
>+               "paging request"), (void *) address);
>
>        dump_pagetable(address);
> }
>

* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20 12:41                                             ` Fengguang Wu
@ 2018-06-20 12:45                                               ` Dmitry Vyukov
  2018-06-20 12:48                                                 ` Fengguang Wu
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Vyukov @ 2018-06-20 12:45 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Sergey Senozhatsky, Petr Mladek, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, LKML,
	Linus Torvalds, Andrew Morton

On Wed, Jun 20, 2018 at 2:41 PM, Fengguang Wu <fengguang.wu@intel.com> wrote:
> On Wed, Jun 20, 2018 at 02:31:51PM +0200, Dmitry Vyukov wrote:
>>
>> On Wed, Jun 20, 2018 at 1:37 PM, Fengguang Wu <fengguang.wu@intel.com>
>> wrote:
>>>
>>> On Wed, Jun 20, 2018 at 11:30:05AM +0200, Dmitry Vyukov wrote:
>>>>
>>>>
>>>> On Wed, Jun 20, 2018 at 11:06 AM, Sergey Senozhatsky
>>>> <sergey.senozhatsky.work@gmail.com> wrote:
>>>>>
>>>>>
>>>>> Hi Dmitry,
>>>>>
>>>>> On (06/20/18 10:45), Dmitry Vyukov wrote:
>>>>>>
>>>>>>
>>>>>> Hi Sergey,
>>>>>>
>>>>>> What are the visible differences between this patch and Tetsuo's
>>>>>> patch?
>>>>>
>>>>>
>>>>>
>>>>> I guess none, and looking at your requirements below I tend to agree
>>>>> that Tetsuo's approach is probably what you need at the end of the day.
>>>>>
>>>>>> The only thing that will matter for syzkaller parsing in the
>>>>>> end is the resulting text format as it appears on console. But you say
>>>>>> "I'm not pushing for this particular message format", so what exactly
>>>>>> do you want me to provide feedback on?
>>>>>> I guess we need to handle pr_cont properly whatever approach we take.
>>>>>
>>>>>
>>>>>
>>>>> Mostly, was wondering about if:
>>>>> a) you need pr_cont() handling
>>>>> b) you need printk_safe() handling
>>>>>
>>>>> The reasons I left those things behind:
>>>>>
>>>>> a) pr_cont() is officially hated. It was never supposed to be used
>>>>>    on SMP systems. So I wasn't sure if we need all that effort and
>>>>>    add tricky code to handle pr_cont(). Given that syzkaller is
>>>>>    probably the only user of that functionality.
>>>>
>>>>
>>>>
>>>> Well, if I put my syzkaller hat on, then I don't care what exactly
>>>> happens in the kernel, the only thing I care is well-formed output on
>>>> console that can be parsed unambiguously in all cases.
>>>
>>>
>>>
>>> +1 for 0day kernel testing.
>>>
>>> I admit that goal may never be 100% achievable -- at least some serial
>>> console logs can sometimes become messy. So we'll have to write dmesg
>>> parsing code in defensive ways.
>>>
>>> But some unnecessary pr_cont() broken-up messages can obviously be
>>> avoided. For example,
>>>
>>> arch/x86/mm/fault.c:
>>>
>>>         printk(KERN_ALERT "BUG: unable to handle kernel ");
>>>         if (address < PAGE_SIZE)
>>>                 printk(KERN_CONT "NULL pointer dereference");
>>>         else
>>>                 printk(KERN_CONT "paging request");
>>>
>>> I've actually proposed to remove the above KERN_CONT, unfortunately the
>>> patch was silently ignored.
>>
>>
>>
>> I've just cooked this change too, but do you mind reviving your patch?
>
>
> Yes, sure. My version is more dumb. Since I'm not sure if it's OK to
> do string formatting at this critical point. Let's see how others
> think about the 2 approaches. I'm fine as long as our problem is fixed. :)

It already does string formatting for address. And I think we also
need to get rid of KERN_CONT for address while we are here.


> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index 9a84a0d08727..c7b068c6b010 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -671,11 +671,10 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code,
>                        printk(smep_warning, from_kuid(&init_user_ns, current_uid()));
>        }
>
> -       printk(KERN_ALERT "BUG: unable to handle kernel ");
>        if (address < PAGE_SIZE)
> -               printk(KERN_CONT "NULL pointer dereference");
> +               printk(KERN_ALERT "BUG: unable to handle kernel NULL pointer dereference");
>        else
> -               printk(KERN_CONT "paging request");
> +               printk(KERN_ALERT "BUG: unable to handle kernel paging request");
>
>
>        printk(KERN_CONT " at %px\n", (void *) address);
>
>> It actually makes the code even shorter, which is nice:
>>
>> --- a/arch/x86/mm/fault.c
>> +++ b/arch/x86/mm/fault.c
>> @@ -671,13 +671,9 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code,
>>                        printk(smep_warning, from_kuid(&init_user_ns, current_uid()));
>>        }
>>
>> -       printk(KERN_ALERT "BUG: unable to handle kernel ");
>> -       if (address < PAGE_SIZE)
>> -               printk(KERN_CONT "NULL pointer dereference");
>> -       else
>> -               printk(KERN_CONT "paging request");
>> -
>> -       printk(KERN_CONT " at %px\n", (void *) address);
>> +       printk(KERN_ALERT "BUG: unable to handle kernel %s at %px\n",
>> +               (address < PAGE_SIZE ? "NULL pointer dereference" :
>> +               "paging request"), (void *) address);
>>
>>        dump_pagetable(address);
>> }
>>
>

* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20 12:45                                               ` Dmitry Vyukov
@ 2018-06-20 12:48                                                 ` Fengguang Wu
  0 siblings, 0 replies; 94+ messages in thread
From: Fengguang Wu @ 2018-06-20 12:48 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Petr Mladek, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, LKML,
	Linus Torvalds, Andrew Morton

On Wed, Jun 20, 2018 at 02:45:25PM +0200, Dmitry Vyukov wrote:
>On Wed, Jun 20, 2018 at 2:41 PM, Fengguang Wu <fengguang.wu@intel.com> wrote:
>> On Wed, Jun 20, 2018 at 02:31:51PM +0200, Dmitry Vyukov wrote:
>>>
>>> On Wed, Jun 20, 2018 at 1:37 PM, Fengguang Wu <fengguang.wu@intel.com>
>>> wrote:
>>>>
>>>> On Wed, Jun 20, 2018 at 11:30:05AM +0200, Dmitry Vyukov wrote:
>>>>>
>>>>>
>>>>> On Wed, Jun 20, 2018 at 11:06 AM, Sergey Senozhatsky
>>>>> <sergey.senozhatsky.work@gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>> Hi Dmitry,
>>>>>>
>>>>>> On (06/20/18 10:45), Dmitry Vyukov wrote:
>>>>>>>
>>>>>>>
>>>>>>> Hi Sergey,
>>>>>>>
>>>>>>> What are the visible differences between this patch and Tetsuo's
>>>>>>> patch?
>>>>>>
>>>>>>
>>>>>>
>>>>>> I guess none, and looking at your requirements below I tend to agree
>>>>>> that Tetsuo's approach is probably what you need at the end of the day.
>>>>>>
>>>>>>> The only thing that will matter for syzkaller parsing in the
>>>>>>> end is the resulting text format as it appears on console. But you say
>>>>>>> "I'm not pushing for this particular message format", so what exactly
>>>>>>> do you want me to provide feedback on?
>>>>>>> I guess we need to handle pr_cont properly whatever approach we take.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Mostly, was wondering about if:
>>>>>> a) you need pr_cont() handling
>>>>>> b) you need printk_safe() handling
>>>>>>
>>>>>> The reasons I left those things behind:
>>>>>>
>>>>>> a) pr_cont() is officially hated. It was never supposed to be used
>>>>>>    on SMP systems. So I wasn't sure if we need all that effort and
>>>>>>    add tricky code to handle pr_cont(). Given that syzkaller is
>>>>>>    probably the only user of that functionality.
>>>>>
>>>>>
>>>>>
>>>>> Well, if I put my syzkaller hat on, then I don't care what exactly
>>>>> happens in the kernel, the only thing I care is well-formed output on
>>>>> console that can be parsed unambiguously in all cases.
>>>>
>>>>
>>>>
>>>> +1 for 0day kernel testing.
>>>>
>>>> I admit that goal may never be 100% achievable -- at least some serial
>>>> console logs can sometimes become messy. So we'll have to write dmesg
>>>> parsing code in defensive ways.
>>>>
>>>> But some unnecessary pr_cont() broken-up messages can obviously be
>>>> avoided. For example,
>>>>
>>>> arch/x86/mm/fault.c:
>>>>
>>>>         printk(KERN_ALERT "BUG: unable to handle kernel ");
>>>>         if (address < PAGE_SIZE)
>>>>                 printk(KERN_CONT "NULL pointer dereference");
>>>>         else
>>>>                 printk(KERN_CONT "paging request");
>>>>
>>>> I've actually proposed to remove the above KERN_CONT, unfortunately the
>>>> patch was silently ignored.
>>>
>>>
>>>
>>> I've just cooked this change too, but do you mind reviving your patch?
>>
>>
>> Yes, sure. My version is more dumb. Since I'm not sure if it's OK to
>> do string formatting at this critical point. Let's see how others
>> think about the 2 approaches. I'm fine as long as our problem is fixed. :)
>
>It already does string formatting for address. And I think we also
>need to get rid of KERN_CONT for address while we are here.

Ah yes, sorry I overlooked the next KERN_CONT..

>
>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
>> index 9a84a0d08727..c7b068c6b010 100644
>> --- a/arch/x86/mm/fault.c
>> +++ b/arch/x86/mm/fault.c
>> @@ -671,11 +671,10 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code,
>>                        printk(smep_warning, from_kuid(&init_user_ns, current_uid()));
>>        }
>>
>> -       printk(KERN_ALERT "BUG: unable to handle kernel ");
>>        if (address < PAGE_SIZE)
>> -               printk(KERN_CONT "NULL pointer dereference");
>> +               printk(KERN_ALERT "BUG: unable to handle kernel NULL pointer dereference");
>>        else
>> -               printk(KERN_CONT "paging request");
>> +               printk(KERN_ALERT "BUG: unable to handle kernel paging request");
>>
>>
>>        printk(KERN_CONT " at %px\n", (void *) address);
>>
>>> It actually makes the code even shorter, which is nice:
>>>
>>> --- a/arch/x86/mm/fault.c
>>> +++ b/arch/x86/mm/fault.c
>>> @@ -671,13 +671,9 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code,
>>>                        printk(smep_warning, from_kuid(&init_user_ns, current_uid()));
>>>        }
>>>
>>> -       printk(KERN_ALERT "BUG: unable to handle kernel ");
>>> -       if (address < PAGE_SIZE)
>>> -               printk(KERN_CONT "NULL pointer dereference");
>>> -       else
>>> -               printk(KERN_CONT "paging request");
>>> -
>>> -       printk(KERN_CONT " at %px\n", (void *) address);
>>> +       printk(KERN_ALERT "BUG: unable to handle kernel %s at %px\n",
>>> +               (address < PAGE_SIZE ? "NULL pointer dereference" :
>>> +               "paging request"), (void *) address);
>>>
>>>        dump_pagetable(address);
>>> }
>>>
>>
>

* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20 11:32                                             ` Dmitry Vyukov
@ 2018-06-20 13:06                                               ` Sergey Senozhatsky
  2018-06-22 13:06                                                 ` Tetsuo Handa
  2018-09-10 11:20                                                 ` Alexander Potapenko
  2018-06-21  8:29                                               ` Sergey Senozhatsky
  1 sibling, 2 replies; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-06-20 13:06 UTC (permalink / raw)
  To: Dmitry Vyukov, Fengguang Wu
  Cc: Sergey Senozhatsky, Petr Mladek, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, LKML,
	Linus Torvalds, Andrew Morton

On (06/20/18 13:32), Dmitry Vyukov wrote:
> > So, if we could get rid of pr_cont() from the most important parts
> > (instruction dumps, etc) then I would just vote to leave pr_cont()
> > alone and avoid any handling of it in printk context tracking. Simply
> > because we wouldn't care about pr_cont(). This also could simplify
> > Tetsuo's patch significantly.
> 
> Sounds good to me.

Awesome. If you and Fengguang can combine forces and lead the
whole thing towards "we couldn't care of pr_cont() less", it
would be really huuuuuge. Go for it!

	-ss

* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20 11:32                                             ` Dmitry Vyukov
  2018-06-20 13:06                                               ` Sergey Senozhatsky
@ 2018-06-21  8:29                                               ` Sergey Senozhatsky
  1 sibling, 0 replies; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-06-21  8:29 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Sergey Senozhatsky, Petr Mladek, Tetsuo Handa,
	Sergey Senozhatsky, syzkaller, Steven Rostedt, Fengguang Wu,
	LKML, Linus Torvalds, Andrew Morton

On (06/20/18 13:32), Dmitry Vyukov wrote:
> >>
> >> So this is another reason to get rid of pr_cont entirely, right?
> >
> > Getting rid of pr_cont() from important output would be totally cool.
> > Quoting Linus:
> >
> >     Only acceptable use of continuations is basically boot-time testing,
> >     when you do things like
> >
> >      printk("Testing feature XYZ..");
> >      this_may_blow_up_because_of_hw_bugs();
> >      printk(KERN_CONT " ... ok\n");
> >
> >
> > I can recall at least 4 attempts when people tried to introduce new pr_cont()
> > or some concept with similar functionality to pr_cont(), but SMP safe. We
> > brought the first one - per-CPU pr_cont() buffers - to KS several years ago
> > but Linus didn't like it. Then there was a buffered printk() mode patch from
> > Tetsuo, then a solution from Steven, then I had my second try with a
> > sort-of-pr_cont() replacement.
> >
> > So, if we could get rid of pr_cont() from the most important parts
> > (instruction dumps, etc) then I would just vote to leave pr_cont()
> > alone and avoid any handling of it in printk context tracking. Simply
> > because we wouldn't care about pr_cont(). This also could simplify
> > Tetsuo's patch significantly.
> 
> Sounds good to me.

Another thing about pr_cont() is that as long as pr_cont() does not race
with pr_cont() from another task or from IRQ, the task that is the owner
(see struct cont in printk.c) of the existing continuation line can migrate,
IOW we can have

	CPU0	CPU1	CPU2	CPU3

	task A
	pr_cont()
		task A
		pr_cont()
			task A
			pr_cont()
				task A
				pr_cont("\n") << flush

The line was printed from 4 CPUs, but appears as a single line
in the logbuf. Should we account CPU0 or CPU3 as the line origin?

That's another reason why I don't really want to handle pr_cont in
any special way in context tracking.

So, currently, context tracking looks like this:

---
	char mode = 'T';

	if (in_serving_softirq())
		mode = 'S';
	if (in_irq())
		mode = 'I';
	if (in_nmi())
		mode = 'N';

	ret = snprintf(buf, buf_len, "%c<%d>%c",
			mode,
			raw_smp_processor_id(),
			cont.len ? '+' : ' ');
---

I add a '+' symbol to continuation lines, which should simply hint
that the tracking info for that particular line is not entirely trustworthy.

I also don't add any tracking info for printk_safe output. We get
tracking info for such lines from the printk_safe flush path
(irq work that happens on the same CPU that added printk_safe output).
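
For experimenting with the prefix format outside the kernel, the logic of
the snippet above can be sketched in userspace C. This is an illustration
only: struct ctx_state and format_ctx_prefix() are hypothetical names, and
the struct fields merely stand in for in_serving_softirq(), in_irq(),
in_nmi(), raw_smp_processor_id() and cont.len.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel state the snippet queries. */
struct ctx_state {
	bool softirq;		/* in_serving_softirq() */
	bool irq;		/* in_irq() */
	bool nmi;		/* in_nmi() */
	int cpu;		/* raw_smp_processor_id() */
	size_t cont_len;	/* cont.len: non-zero while a continuation is open */
};

/* Format "<mode><cpu><flag>" into buf; returns snprintf()'s result. */
static int format_ctx_prefix(char *buf, size_t buf_len,
			     const struct ctx_state *st)
{
	char mode = 'T';	/* plain task context by default */

	/* Later checks win: NMI overrides hard IRQ, which overrides softirq. */
	if (st->softirq)
		mode = 'S';
	if (st->irq)
		mode = 'I';
	if (st->nmi)
		mode = 'N';

	return snprintf(buf, buf_len, "%c<%d>%c",
			mode, st->cpu, st->cont_len ? '+' : ' ');
}
```

With this sketch, a hard IRQ on CPU 2 with no open continuation line is
formatted as "I<2> ", and the same context with a continuation line pending
as "I<2>+".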

	-ss

* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20 13:06                                               ` Sergey Senozhatsky
@ 2018-06-22 13:06                                                 ` Tetsuo Handa
  2018-06-25  1:41                                                   ` Sergey Senozhatsky
  2018-09-10 11:20                                                 ` Alexander Potapenko
  1 sibling, 1 reply; 94+ messages in thread
From: Tetsuo Handa @ 2018-06-22 13:06 UTC (permalink / raw)
  To: Sergey Senozhatsky, Dmitry Vyukov, Fengguang Wu
  Cc: Sergey Senozhatsky, Petr Mladek, syzkaller, Steven Rostedt, LKML,
	Linus Torvalds, Andrew Morton

On 2018/06/20 22:06, Sergey Senozhatsky wrote:
> On (06/20/18 13:32), Dmitry Vyukov wrote:
>>> So, if we could get rid of pr_cont() from the most important parts
>>> (instruction dumps, etc) then I would just vote to leave pr_cont()
>>> alone and avoid any handling of it in printk context tracking. Simply
>>> because we wouldn't care about pr_cont(). This also could simplify
>>> Tetsuo's patch significantly.
>>
>> Sounds good to me.
> 
> Awesome. If you and Fengguang can combine forces and lead the
> whole thing towards "we couldn't care of pr_cont() less", it
> would be really huuuuuge. Go for it!

Can't we have a seq_printf()-like helper which flushes automatically upon seeing
'\n' or when the buffer is full? Memory information is printed with a lot of
pr_cont() calls, even in function names (e.g. http://lkml.kernel.org/r/20180622083949.GR10465@dhcp22.suse.cz ).
Since the OOM killer code is serialized by oom_lock, we can use a static buffer
for OOM killer messages.
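
A minimal userspace sketch of such a helper, assuming that callers are
already serialized (as oom_lock would do for the OOM killer), so no locking
is shown. struct line_buf, line_buf_printf() and line_buf_flush() are
hypothetical names, and the FILE stream stands in for a single printk()
call per flushed chunk.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define LINE_BUF_SIZE 128

/* Hypothetical line buffer; a static instance would be enough when the
 * callers are serialized by an outer lock. */
struct line_buf {
	char data[LINE_BUF_SIZE];
	size_t len;
	FILE *out;	/* stands in for the printk() sink */
};

/* Emit whatever is buffered as one chunk. */
static void line_buf_flush(struct line_buf *lb)
{
	if (lb->len) {
		fwrite(lb->data, 1, lb->len, lb->out);
		lb->len = 0;
	}
}

/* Append formatted text; flush on every '\n' or when the buffer is full. */
static void line_buf_printf(struct line_buf *lb, const char *fmt, ...)
{
	char tmp[LINE_BUF_SIZE];
	va_list args;
	int n;

	va_start(args, fmt);
	n = vsnprintf(tmp, sizeof(tmp), fmt, args);
	va_end(args);

	if (n < 0)
		return;
	if ((size_t)n >= sizeof(tmp))
		n = (int)sizeof(tmp) - 1;	/* output was truncated */

	for (int i = 0; i < n; i++) {
		if (lb->len == sizeof(lb->data))
			line_buf_flush(lb);	/* buffer full */
		lb->data[lb->len++] = tmp[i];
		if (tmp[i] == '\n')
			line_buf_flush(lb);	/* complete line */
	}
}
```

Two calls such as line_buf_printf(&lb, "Node %d ", 0) followed by
line_buf_printf(&lb, "free:%dkB\n", 42) would reach the sink as a single
line, so nothing from another CPU can land between the two pieces.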

* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-22 13:06                                                 ` Tetsuo Handa
@ 2018-06-25  1:41                                                   ` Sergey Senozhatsky
  2018-06-25  9:36                                                     ` Dmitry Vyukov
  0 siblings, 1 reply; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-06-25  1:41 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Dmitry Vyukov, Fengguang Wu,
	Sergey Senozhatsky, Petr Mladek, syzkaller, Steven Rostedt, LKML,
	Linus Torvalds, Andrew Morton

On (06/22/18 22:06), Tetsuo Handa wrote:
> >
> > Awesome. If you and Fengguang can combine forces and lead the
> > whole thing towards "we couldn't care of pr_cont() less", it
> > would be really huuuuuge. Go for it!
> 
> Can't we have seq_printf()-like one which flushes automatically upon seeing '\n'
> or buffer full? Printing memory information is using a lot of pr_cont(), even in
> function names (e.g. http://lkml.kernel.org/r/20180622083949.GR10465@dhcp22.suse.cz ).
> Since OOM killer code is serialized by oom_lock, we can use static buffer for
> OOM killer messages.

I'm not the right guy to answer this question. Sorry. We need to Cc MM
people on this.

Does OOM's pr_cont() usage cause too much disturbance to syzkaller? I thought
that OOM was slightly out of sight.

	-ss

* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-25  1:41                                                   ` Sergey Senozhatsky
@ 2018-06-25  9:36                                                     ` Dmitry Vyukov
  2018-06-27 10:29                                                       ` Tetsuo Handa
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Vyukov @ 2018-06-25  9:36 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Tetsuo Handa, Sergey Senozhatsky, Fengguang Wu, Petr Mladek,
	syzkaller, Steven Rostedt, LKML, Linus Torvalds, Andrew Morton

On Mon, Jun 25, 2018 at 3:41 AM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
> On (06/22/18 22:06), Tetsuo Handa wrote:
>> >
>> > Awesome. If you and Fengguang can combine forces and lead the
>> > whole thing towards "we couldn't care of pr_cont() less", it
>> > would be really huuuuuge. Go for it!
>>
>> Can't we have seq_printf()-like one which flushes automatically upon seeing '\n'
>> or buffer full? Printing memory information is using a lot of pr_cont(), even in
>> function names (e.g. http://lkml.kernel.org/r/20180622083949.GR10465@dhcp22.suse.cz ).
>> Since OOM killer code is serialized by oom_lock, we can use static buffer for
>> OOM killer messages.
>
> I'm not the right guy to answer this question. Sorry. We need to Cc MM
> people on this.
>
> Does OOM's pr_cont() usage cause too much disturbance to syzkaller? I thought
> that OOM was slightly out of sight.

Hard to tell. Nothing specific comes to mind.
We do see lines like these:

BUG: unable to handle kernel [ 110.NUM] device gre0 entered promiscuous mode
BUG:--------[ cut here ]------------

and frequently one also has to look deep inside a crash message
to understand what it really means. Hard to tell how much random pr_cont's
contribute to the problem. We now throw away anything that looks even
slightly corrupted right away.
I guess the main requirement is that the crash report itself does not
use pr_cont and provided we have task/cpu context we can separate the
crash report lines from everything else (assuming that random
pr_cont's on other CPUs won't glue to the report lines).
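
To illustrate the kind of separation meant here, below is a rough userspace
sketch, not syzkaller code. It assumes a hypothetical "[<tag>] message"
prefix (for example "[T1234]" for a task or "[C0]" for a CPU); no prefix
format has been agreed on in this thread, and the helper names
line_has_context() and filter_by_context() are made up for the example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if 'line' starts with the hypothetical "[<tag>]" prefix,
 * 0 otherwise. The closing ']' is checked so that tag "T1234" does
 * not match a line tagged "[T12345]".
 */
static int line_has_context(const char *line, const char *tag)
{
	size_t n = strlen(tag);

	return line[0] == '[' && strncmp(line + 1, tag, n) == 0 &&
	       line[1 + n] == ']';
}

/*
 * Copy pointers to the lines that belong to context 'tag' into 'out'
 * (which must have room for 'count' entries) and return how many
 * lines were kept. The relative order of the lines is preserved.
 */
static size_t filter_by_context(const char *const *lines, size_t count,
				const char *tag, const char **out)
{
	size_t i, kept = 0;

	for (i = 0; i < count; i++)
		if (line_has_context(lines[i], tag))
			out[kept++] = lines[i];
	return kept;
}
```

This only works if every line of the report is emitted whole, which is why
pr_cont() inside the crash report itself is the main obstacle.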

* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-25  9:36                                                     ` Dmitry Vyukov
@ 2018-06-27 10:29                                                       ` Tetsuo Handa
  0 siblings, 0 replies; 94+ messages in thread
From: Tetsuo Handa @ 2018-06-27 10:29 UTC (permalink / raw)
  To: Dmitry Vyukov, Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Fengguang Wu, Petr Mladek, syzkaller,
	Steven Rostedt, LKML, Linus Torvalds, Andrew Morton

On 2018/06/25 18:36, Dmitry Vyukov wrote:
> On Mon, Jun 25, 2018 at 3:41 AM, Sergey Senozhatsky
> <sergey.senozhatsky.work@gmail.com> wrote:
>> On (06/22/18 22:06), Tetsuo Handa wrote:
>>>>
>>>> Awesome. If you and Fengguang can combine forces and lead the
>>>> whole thing towards "we couldn't care of pr_cont() less", it
>>>> would be really huuuuuge. Go for it!
>>>
>>> Can't we have seq_printf()-like one which flushes automatically upon seeing '\n'
>>> or buffer full? Printing memory information is using a lot of pr_cont(), even in
>>> function names (e.g. http://lkml.kernel.org/r/20180622083949.GR10465@dhcp22.suse.cz ).
>>> Since OOM killer code is serialized by oom_lock, we can use static buffer for
>>> OOM killer messages.
>>
>> I'm not the right guy to answer this question. Sorry. We need to Cc MM
>> people on this.
>>
>> Does OOM's pr_cont() usage cause too much disturbance to syzkaller? I thought
>> that OOM was slightly out of sight.
> 
> Hard to tell. Nothing specific comes to mind.
> We do see lines like these:
> 
> BUG: unable to handle kernel [ 110.NUM] device gre0 entered promiscuous mode
> BUG:--------[ cut here ]------------
> 
> and frequently we also have to look deep inside the crash message
> to understand what it really means. It is hard to tell how much random
> pr_cont's contribute to the problem. We now throw away everything that
> looks at all corrupted right away.
> I guess the main requirement is that the crash report itself does not
> use pr_cont. Provided we have task/cpu context, we can then separate
> the crash report lines from everything else (assuming that random
> pr_cont's on other CPUs won't glue to the report lines).
> 

PATCH 1/3 below is a sample implementation of a seq_printf()-like function
which flushes automatically upon seeing '\n' or when the buffer becomes
full. PATCH 2/3 is a straightforward user of that function. (Since it is so
simple, we could rewrite it using snprintf() before PATCH 1/3 is accepted.)
PATCH 3/3 is a more complicated user of that function. (We could reduce
pr_cont() usage there before PATCH 1/3 is accepted.) Can we agree on
PATCH 1/3?
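
For readers who want to try the flush-on-newline idea outside the kernel,
here is a minimal userspace sketch. It is an approximation under stated
assumptions, not the patch itself: struct line_buffer, emit(), flush_lines()
and buffered_print() are hypothetical names, emit() stands in for printk()
by appending to a global string, and the overflow path truncates at 255
bytes whereas the patch hands the format straight to printk().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-in for the proposed struct printk_buffer. */
struct line_buffer {
	size_t size;	/* capacity of buf in bytes */
	size_t used;	/* bytes currently spooled, always < size */
	char *buf;
};

/* Stand-in for printk(): append to a global sink and count the calls. */
static char sink[512];
static size_t sink_len;
static int emitted_count;

static void emit(const char *text, size_t len)
{
	if (len > sizeof(sink) - 1 - sink_len)
		len = sizeof(sink) - 1 - sink_len; /* sketch: drop overflow */
	memcpy(sink + sink_len, text, len);
	sink_len += len;
	sink[sink_len] = '\0';
	emitted_count++;
}

/* Emit each complete line; if 'all' is set, also emit the partial tail. */
static void flush_lines(struct line_buffer *lb, int all)
{
	while (lb->used) {
		char *nl = memchr(lb->buf, '\n', lb->used);
		size_t len;

		if (nl)
			len = (size_t)(nl - lb->buf) + 1;
		else if (all)
			len = lb->used;
		else
			break;
		emit(lb->buf, len);
		lb->used -= len;
		memmove(lb->buf, lb->buf + len, lb->used);
	}
}

static int buffered_print(struct line_buffer *lb, const char *fmt, ...)
{
	char direct[256];
	va_list args;
	int r;

	assert(lb->used < lb->size);
	/* Try to spool into the caller's buffer first. */
	va_start(args, fmt);
	r = vsnprintf(lb->buf + lb->used, lb->size - lb->used, fmt, args);
	va_end(args);
	if (r < 0)
		return r;
	if ((size_t)r < lb->size - lb->used) {
		lb->used += (size_t)r;
		flush_lines(lb, 0);
		return r;
	}
	/* Does not fit: flush the spooled text, then bypass the buffer. */
	flush_lines(lb, 1);
	va_start(args, fmt);
	r = vsnprintf(direct, sizeof(direct), fmt, args);
	va_end(args);
	if (r < 0)
		return r;
	emit(direct, (size_t)r < sizeof(direct) ? (size_t)r : sizeof(direct) - 1);
	return r;
}
```

With a 16-byte buffer, buffered_print(&lb, "a=%d", 1) followed by
buffered_print(&lb, " b=%d\n", 2) results in a single emit() of the whole
line, which is the property the patches below rely on.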

From 485406f585e566dccdfb85a1afbae460b8756457 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Wed, 27 Jun 2018 16:29:14 +0900
Subject: [PATCH 1/3] printk: Introduce buffered_printk().

Linus suggested in "printk: what is going on with additional newlines?"
thread [1] that

  Making the buffer explicit is (a) cheaper and (b) better. Now you can
  put the buffer on the stack, you never have to worry about where you
  need to track context, and you have no buffering limits (ie you can
  buffer across any event).

  I definitely suspect that "single line" is often sufficient. I
  mean, that's all that KERN_CONT ever gave you anyway (and not reliably).

  And then a 80 character buffer really isn't any different from having
  a structure with a few pointers in it, which we do on the stack all
  the time.

Now, since syzbot is bothered by concurrent printk() messages (e.g.
memory allocation fault injection), we started thinking about adding a
prefix to each line of printk() output. This matches the suggestion that
buffering a single line will be sufficient if we add the caller's context
information for distinguishing concurrent printk() messages.

Thus, this patch introduces buffered_printk(), which spools printk() output
and automatically flushes it when '\n' is found or the buffer becomes full
(together with the related structure, macro and functions).

[1] http://lkml.kernel.org/r/CA+55aFx+5R-vFQfr7+Ok9Yrs2adQ2Ma4fz+S6nCyWHY_-2mrmw@mail.gmail.com

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 include/linux/printk.h | 28 ++++++++++++++++
 kernel/printk/printk.c | 86 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 114 insertions(+)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index 6d7e800..81bc12a 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -153,6 +153,23 @@ static inline void printk_nmi_enter(void) { }
 static inline void printk_nmi_exit(void) { }
 #endif /* PRINTK_NMI */
 
+struct printk_buffer {
+	unsigned short int size;
+	unsigned short int used;
+	char *buf;
+};
+
+#define DEFINE_PRINTK_BUFFER(name, size, buf)		\
+	struct printk_buffer name = { size, 0, buf }
+
+static inline void INIT_PRINTK_BUFFER(struct printk_buffer *ptr,
+				      unsigned short int size, char *buf)
+{
+	ptr->size = size;
+	ptr->used = 0;
+	ptr->buf = buf;
+}
+
 #ifdef CONFIG_PRINTK
 asmlinkage __printf(5, 0)
 int vprintk_emit(int facility, int level,
@@ -169,6 +186,9 @@ int printk_emit(int facility, int level,
 
 asmlinkage __printf(1, 2) __cold
 int printk(const char *fmt, ...);
+asmlinkage __printf(2, 3) __cold
+int buffered_printk(struct printk_buffer *ptr, const char *fmt, ...);
+void flush_buffered_printk(struct printk_buffer *ptr);
 
 /*
  * Special printk facility for scheduler/timekeeping use only, _DO_NOT_USE_ !
@@ -216,6 +236,14 @@ int printk(const char *s, ...)
 {
 	return 0;
 }
+static inline __printf(2, 3) __cold
+int buffered_printk(struct printk_buffer *ptr, const char *fmt, ...)
+{
+	return 0;
+}
+static inline void flush_buffered_printk(struct printk_buffer *ptr)
+{
+}
 static inline __printf(1, 2) __cold
 int printk_deferred(const char *s, ...)
 {
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 2478083..24566dc 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1985,6 +1985,92 @@ asmlinkage __visible int printk(const char *fmt, ...)
 }
 EXPORT_SYMBOL(printk);
 
+static void __flush_printk_buffer(struct printk_buffer *ptr, bool all)
+{
+	while (1) {
+		char *text = ptr->buf;
+		unsigned int text_len = ptr->used;
+		char *cp = memchr(text, '\n', text_len);
+		char c;
+
+		if (cp++)
+			text_len = cp - text;
+		else if (all)
+			cp = text + text_len;
+		else
+			break;
+		c = *cp;
+		*cp = '\0';
+		printk("%s", text);
+		ptr->used -= text_len;
+		if (!ptr->used)
+			break;
+		*cp = c;
+		memmove(text, text + text_len, ptr->used);
+	}
+}
+
+/*
+ * buffered_printk - Try to print multiple printk() calls as line oriented.
+ *
+ * This is a utility function for avoiding KERN_CONT and pr_cont() usage.
+ *
+ * Before:
+ *
+ *   pr_info("INFO:");
+ *   for (i = 0; i < 5; i++)
+ *     pr_cont(" %s=%s", name[i], value[i]);
+ *   pr_cont("\n");
+ *
+ * After:
+ *
+ *   char buffer[256];
+ *   DEFINE_PRINTK_BUFFER(buf, sizeof(buffer), buffer);
+ *   buffered_printk(&buf, KERN_INFO "INFO:");
+ *   for (i = 0; i < 5; i++)
+ *     buffered_printk(&buf, " %s=%s", name[i], value[i]);
+ *   buffered_printk(&buf, "\n");
+ *
+ * If the caller is not sure that the last buffered_printk() call ends with
+ * "\n", the caller can use flush_buffered_printk() in order to make sure that
+ * all data is passed to printk().
+ *
+ * If the buffer is not large enough to hold one line, buffered_printk() will
+ * fall back to regular printk() instead of truncating the data. But be careful
+ * with LOG_LINE_MAX limit anyway.
+ */
+asmlinkage __visible int buffered_printk(struct printk_buffer *ptr,
+					 const char *fmt, ...)
+{
+	va_list args;
+	int r;
+	const unsigned int pos = ptr->used;
+
+	/* Try to store to printk_buffer first. */
+	va_start(args, fmt);
+	r = vsnprintf(ptr->buf + pos, ptr->size - pos, fmt, args);
+	va_end(args);
+	/* If it succeeds, process printk_buffer up to last '\n' and return. */
+	if (r + pos < ptr->size) {
+		ptr->used += r;
+		__flush_printk_buffer(ptr, false);
+		return r;
+	}
+	/* Otherwise, flush printk_buffer and use unbuffered printk(). */
+	__flush_printk_buffer(ptr, true);
+	va_start(args, fmt);
+	r = vprintk_func(fmt, args);
+	va_end(args);
+	return r;
+}
+EXPORT_SYMBOL(buffered_printk);
+
+void flush_buffered_printk(struct printk_buffer *ptr)
+{
+	__flush_printk_buffer(ptr, true);
+}
+EXPORT_SYMBOL(flush_buffered_printk);
+
 #else /* CONFIG_PRINTK */
 
 #define LOG_LINE_MAX		0
-- 
1.8.3.1

From 8f38ec70c9c444673e6bf2e699781cd143442ac6 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Wed, 27 Jun 2018 16:30:18 +0900
Subject: [PATCH 2/3] x86: Use buffered_printk() in show_opcodes()

Since syzbot is confused by concurrent printk() messages,
this patch changes show_opcodes() to use buffered_printk().

When we start adding a prefix to each line of printk() output,
syzbot will be able to handle concurrent printk() messages.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 arch/x86/kernel/dumpstack.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 666a284..c284dd0 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -97,22 +97,24 @@ void show_opcodes(u8 *rip, const char *loglvl)
 	u8 opcodes[OPCODE_BUFSIZE];
 	u8 *ip;
 	int i;
+	char tmpbuf[(2 + 6) + (3 * OPCODE_BUFSIZE + 2) + 2];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
 
-	printk("%sCode: ", loglvl);
+	buffered_printk(&buf, "%sCode: ", loglvl);
 
 	ip = (u8 *)rip - code_prologue;
 	if (probe_kernel_read(opcodes, ip, OPCODE_BUFSIZE)) {
-		pr_cont("Bad RIP value.\n");
+		buffered_printk(&buf, "Bad RIP value.\n");
 		return;
 	}
 
 	for (i = 0; i < OPCODE_BUFSIZE; i++, ip++) {
 		if (ip == rip)
-			pr_cont("<%02x> ", opcodes[i]);
+			buffered_printk(&buf, "<%02x> ", opcodes[i]);
 		else
-			pr_cont("%02x ", opcodes[i]);
+			buffered_printk(&buf, "%02x ", opcodes[i]);
 	}
-	pr_cont("\n");
+	buffered_printk(&buf, "\n");
 }
 
 void show_ip(struct pt_regs *regs, const char *loglvl)
-- 
1.8.3.1

From 910520d0f5f366c86f5e4a2d5d344ae16e375604 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Wed, 27 Jun 2018 16:31:17 +0900
Subject: [PATCH 3/3] lockdep: Replace KERN_CONT/pr_cont() with
 buffered_printk()

Since syzbot is confused by concurrent printk() messages,
this patch eliminates KERN_CONT/pr_cont() usage from
kernel/locking/lockdep.c functions.

When we start adding a prefix to each line of printk() output,
syzbot will be able to handle concurrent printk() messages.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 kernel/locking/lockdep.c | 248 +++++++++++++++++++++++------------------------
 1 file changed, 123 insertions(+), 125 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 5fa4d31..b8d9aa6 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -499,36 +499,38 @@ void get_usage_chars(struct lock_class *class, char usage[LOCK_USAGE_CHARS])
 	usage[i] = '\0';
 }
 
-static void __print_lock_name(struct lock_class *class)
+static void __print_lock_name(struct printk_buffer *buf, struct lock_class *class,
+			      const char *header, const char *trailer)
 {
 	char str[KSYM_NAME_LEN];
 	const char *name;
 
+	buffered_printk(buf, "%s", header);
 	name = class->name;
 	if (!name) {
 		name = __get_key_name(class->key, str);
-		printk(KERN_CONT "%s", name);
+		buffered_printk(buf, "%s", name);
 	} else {
-		printk(KERN_CONT "%s", name);
+		buffered_printk(buf, "%s", name);
 		if (class->name_version > 1)
-			printk(KERN_CONT "#%d", class->name_version);
+			buffered_printk(buf, "#%d", class->name_version);
 		if (class->subclass)
-			printk(KERN_CONT "/%d", class->subclass);
+			buffered_printk(buf, "/%d", class->subclass);
 	}
+	buffered_printk(buf, "%s", trailer);
 }
 
-static void print_lock_name(struct lock_class *class)
+static void print_lock_name(struct printk_buffer *buf, struct lock_class *class, const char *trailer)
 {
 	char usage[LOCK_USAGE_CHARS];
 
 	get_usage_chars(class, usage);
 
-	printk(KERN_CONT " (");
-	__print_lock_name(class);
-	printk(KERN_CONT "){%s}", usage);
+	__print_lock_name(buf, class, " (", ")");
+	buffered_printk(buf, "{%s}%s", usage, trailer);
 }
 
-static void print_lockdep_cache(struct lockdep_map *lock)
+static void print_lockdep_cache(struct printk_buffer *buf, struct lockdep_map *lock, const char *trailer)
 {
 	const char *name;
 	char str[KSYM_NAME_LEN];
@@ -537,11 +539,13 @@ static void print_lockdep_cache(struct lockdep_map *lock)
 	if (!name)
 		name = __get_key_name(lock->key->subkeys, str);
 
-	printk(KERN_CONT "%s", name);
+	buffered_printk(buf, "%s%s", name, trailer);
 }
 
-static void print_lock(struct held_lock *hlock)
+static void print_lock(struct printk_buffer *buf, struct held_lock *hlock)
 {
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf2, sizeof(tmpbuf), tmpbuf);
 	/*
 	 * We can be called locklessly through debug_show_all_locks() so be
 	 * extra careful, the hlock might have been released and cleared.
@@ -551,19 +555,23 @@ static void print_lock(struct held_lock *hlock)
 	/* Don't re-read hlock->class_idx, can't use READ_ONCE() on bitfields: */
 	barrier();
 
+	if (!buf)
+		buf = &buf2;
 	if (!class_idx || (class_idx - 1) >= MAX_LOCKDEP_KEYS) {
-		printk(KERN_CONT "<RELEASED>\n");
+		buffered_printk(buf, "<RELEASED>\n");
 		return;
 	}
 
-	printk(KERN_CONT "%p", hlock->instance);
-	print_lock_name(lock_classes + class_idx - 1);
-	printk(KERN_CONT ", at: %pS\n", (void *)hlock->acquire_ip);
+	buffered_printk(buf, "%p", hlock->instance);
+	print_lock_name(buf, lock_classes + class_idx - 1, "");
+	buffered_printk(buf, ", at: %pS\n", (void *)hlock->acquire_ip);
 }
 
 static void lockdep_print_held_locks(struct task_struct *p)
 {
 	int i, depth = READ_ONCE(p->lockdep_depth);
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
 
 	if (!depth)
 		printk("no locks held by %s/%d.\n", p->comm, task_pid_nr(p));
@@ -577,8 +585,8 @@ static void lockdep_print_held_locks(struct task_struct *p)
 	if (p->state == TASK_RUNNING && p != current)
 		return;
 	for (i = 0; i < depth; i++) {
-		printk(" #%d: ", i);
-		print_lock(p->held_locks + i);
+		buffered_printk(&buf, " #%d: ", i);
+		print_lock(&buf, p->held_locks + i);
 	}
 }
 
@@ -812,10 +820,10 @@ static bool assign_lock_key(struct lockdep_map *lock)
 	if (verbose(class)) {
 		graph_unlock();
 
-		printk("\nnew class %px: %s", class->key, class->name);
 		if (class->name_version > 1)
-			printk(KERN_CONT "#%d", class->name_version);
-		printk(KERN_CONT "\n");
+			printk("\nnew class %px: %s#%d\n", class->key, class->name, class->name_version);
+		else
+			printk("\nnew class %px: %s\n", class->key, class->name);
 		dump_stack();
 
 		if (!graph_lock()) {
@@ -1089,11 +1097,13 @@ static inline int __bfs_backwards(struct lock_list *src_entry,
 static noinline int
 print_circular_bug_entry(struct lock_list *target, int depth)
 {
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
+
 	if (debug_locks_silent)
 		return 0;
-	printk("\n-> #%u", depth);
-	print_lock_name(target->class);
-	printk(KERN_CONT ":\n");
+	buffered_printk(&buf, "\n-> #%u", depth);
+	print_lock_name(&buf, target->class, ":\n");
 	print_stack_trace(&target->trace, 6);
 
 	return 0;
@@ -1107,6 +1117,8 @@ static inline int __bfs_backwards(struct lock_list *src_entry,
 	struct lock_class *source = hlock_class(src);
 	struct lock_class *target = hlock_class(tgt);
 	struct lock_class *parent = prt->class;
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
 
 	/*
 	 * A direct locking problem where unsafe_class lock is taken
@@ -1122,30 +1134,19 @@ static inline int __bfs_backwards(struct lock_list *src_entry,
 	 * from the safe_class lock to the unsafe_class lock.
 	 */
 	if (parent != source) {
-		printk("Chain exists of:\n  ");
-		__print_lock_name(source);
-		printk(KERN_CONT " --> ");
-		__print_lock_name(parent);
-		printk(KERN_CONT " --> ");
-		__print_lock_name(target);
-		printk(KERN_CONT "\n\n");
+		printk("Chain exists of:\n");
+		__print_lock_name(&buf, source, "  ", " --> ");
+		__print_lock_name(&buf, parent, "", " --> ");
+		__print_lock_name(&buf, target, "", "\n\n");
 	}
 
 	printk(" Possible unsafe locking scenario:\n\n");
 	printk("       CPU0                    CPU1\n");
 	printk("       ----                    ----\n");
-	printk("  lock(");
-	__print_lock_name(target);
-	printk(KERN_CONT ");\n");
-	printk("                               lock(");
-	__print_lock_name(parent);
-	printk(KERN_CONT ");\n");
-	printk("                               lock(");
-	__print_lock_name(target);
-	printk(KERN_CONT ");\n");
-	printk("  lock(");
-	__print_lock_name(source);
-	printk(KERN_CONT ");\n");
+	__print_lock_name(&buf, target, "  lock(", ");\n");
+	__print_lock_name(&buf, parent, "                               lock(", ");\n");
+	__print_lock_name(&buf, target, "                               lock(", ");\n");
+	__print_lock_name(&buf, source, "  lock(", ");\n");
 	printk("\n *** DEADLOCK ***\n\n");
 }
 
@@ -1170,11 +1171,11 @@ static inline int __bfs_backwards(struct lock_list *src_entry,
 	pr_warn("------------------------------------------------------\n");
 	pr_warn("%s/%d is trying to acquire lock:\n",
 		curr->comm, task_pid_nr(curr));
-	print_lock(check_src);
+	print_lock(NULL, check_src);
 
 	pr_warn("\nbut task is already holding lock:\n");
 
-	print_lock(check_tgt);
+	print_lock(NULL, check_tgt);
 	pr_warn("\nwhich lock already depends on the new lock.\n\n");
 	pr_warn("\nthe existing dependency chain (in reverse order) is:\n");
 
@@ -1394,18 +1395,19 @@ static inline int usage_match(struct lock_list *entry, void *bit)
 static void print_lock_class_header(struct lock_class *class, int depth)
 {
 	int bit;
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
 
-	printk("%*s->", depth, "");
-	print_lock_name(class);
-	printk(KERN_CONT " ops: %lu", class->ops);
-	printk(KERN_CONT " {\n");
+	buffered_printk(&buf, "%*s->", depth, "");
+	print_lock_name(&buf, class, "");
+	buffered_printk(&buf, " ops: %lu", class->ops);
+	buffered_printk(&buf, " {\n");
 
 	for (bit = 0; bit < LOCK_USAGE_STATES; bit++) {
 		if (class->usage_mask & (1 << bit)) {
 			int len = depth;
 
-			len += printk("%*s   %s", depth, "", usage_str[bit]);
-			len += printk(KERN_CONT " at:\n");
+			len += printk("%*s   %s at:\n", depth, "", usage_str[bit]);
 			print_stack_trace(class->usage_traces + bit, len);
 		}
 	}
@@ -1455,6 +1457,8 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 	struct lock_class *safe_class = safe_entry->class;
 	struct lock_class *unsafe_class = unsafe_entry->class;
 	struct lock_class *middle_class = prev_class;
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
 
 	if (middle_class == safe_class)
 		middle_class = next_class;
@@ -1473,32 +1477,21 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 	 * from the safe_class lock to the unsafe_class lock.
 	 */
 	if (middle_class != unsafe_class) {
-		printk("Chain exists of:\n  ");
-		__print_lock_name(safe_class);
-		printk(KERN_CONT " --> ");
-		__print_lock_name(middle_class);
-		printk(KERN_CONT " --> ");
-		__print_lock_name(unsafe_class);
-		printk(KERN_CONT "\n\n");
+		printk("Chain exists of:\n");
+		__print_lock_name(&buf, safe_class, "  ", " --> ");
+		__print_lock_name(&buf, middle_class, "", " --> ");
+		__print_lock_name(&buf, unsafe_class, "", "\n\n");
 	}
 
 	printk(" Possible interrupt unsafe locking scenario:\n\n");
 	printk("       CPU0                    CPU1\n");
 	printk("       ----                    ----\n");
-	printk("  lock(");
-	__print_lock_name(unsafe_class);
-	printk(KERN_CONT ");\n");
+	__print_lock_name(&buf, unsafe_class, "  lock(", ");\n");
 	printk("                               local_irq_disable();\n");
-	printk("                               lock(");
-	__print_lock_name(safe_class);
-	printk(KERN_CONT ");\n");
-	printk("                               lock(");
-	__print_lock_name(middle_class);
-	printk(KERN_CONT ");\n");
+	__print_lock_name(&buf, safe_class, "                               lock(", ");\n");
+	__print_lock_name(&buf, middle_class, "                               lock(", ");\n");
 	printk("  <Interrupt>\n");
-	printk("    lock(");
-	__print_lock_name(safe_class);
-	printk(KERN_CONT ");\n");
+	__print_lock_name(&buf, safe_class, "    lock(", ");\n");
 	printk("\n *** DEADLOCK ***\n\n");
 }
 
@@ -1514,6 +1507,9 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 			 enum lock_usage_bit bit2,
 			 const char *irqclass)
 {
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
+
 	if (!debug_locks_off_graph_unlock() || debug_locks_silent)
 		return 0;
 
@@ -1529,26 +1525,24 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 		curr->softirq_context, softirq_count() >> SOFTIRQ_SHIFT,
 		curr->hardirqs_enabled,
 		curr->softirqs_enabled);
-	print_lock(next);
+	print_lock(NULL, next);
 
 	pr_warn("\nand this task is already holding:\n");
-	print_lock(prev);
+	print_lock(NULL, prev);
 	pr_warn("which would create a new lock dependency:\n");
-	print_lock_name(hlock_class(prev));
-	pr_cont(" ->");
-	print_lock_name(hlock_class(next));
-	pr_cont("\n");
+	print_lock_name(&buf, hlock_class(prev), " ->");
+	print_lock_name(&buf, hlock_class(next), "\n");
 
 	pr_warn("\nbut this new dependency connects a %s-irq-safe lock:\n",
 		irqclass);
-	print_lock_name(backwards_entry->class);
-	pr_warn("\n... which became %s-irq-safe at:\n", irqclass);
+	print_lock_name(&buf, backwards_entry->class, "\n");
+	pr_warn("... which became %s-irq-safe at:\n", irqclass);
 
 	print_stack_trace(backwards_entry->class->usage_traces + bit1, 1);
 
 	pr_warn("\nto a %s-irq-unsafe lock:\n", irqclass);
-	print_lock_name(forwards_entry->class);
-	pr_warn("\n... which became %s-irq-unsafe at:\n", irqclass);
+	print_lock_name(&buf, forwards_entry->class, "\n");
+	pr_warn("... which became %s-irq-unsafe at:\n", irqclass);
 	pr_warn("...");
 
 	print_stack_trace(forwards_entry->class->usage_traces + bit2, 1);
@@ -1564,8 +1558,8 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 		return 0;
 	print_shortest_lock_dependencies(backwards_entry, prev_root);
 
-	pr_warn("\nthe dependencies between the lock to be acquired");
-	pr_warn(" and %s-irq-unsafe lock:\n", irqclass);
+	pr_warn("\nthe dependencies between the lock to be acquired and %s-irq-unsafe lock:\n",
+		irqclass);
 	if (!save_trace(&next_root->trace))
 		return 0;
 	print_shortest_lock_dependencies(forwards_entry, next_root);
@@ -1725,16 +1719,14 @@ static inline void inc_chains(void)
 {
 	struct lock_class *next = hlock_class(nxt);
 	struct lock_class *prev = hlock_class(prv);
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
 
 	printk(" Possible unsafe locking scenario:\n\n");
 	printk("       CPU0\n");
 	printk("       ----\n");
-	printk("  lock(");
-	__print_lock_name(prev);
-	printk(KERN_CONT ");\n");
-	printk("  lock(");
-	__print_lock_name(next);
-	printk(KERN_CONT ");\n");
+	__print_lock_name(&buf, prev, "  lock(", ");\n");
+	__print_lock_name(&buf, next, "  lock(", ");\n");
 	printk("\n *** DEADLOCK ***\n\n");
 	printk(" May be due to missing lock nesting notation\n\n");
 }
@@ -1753,9 +1745,9 @@ static inline void inc_chains(void)
 	pr_warn("--------------------------------------------\n");
 	pr_warn("%s/%d is trying to acquire lock:\n",
 		curr->comm, task_pid_nr(curr));
-	print_lock(next);
+	print_lock(NULL, next);
 	pr_warn("\nbut task is already holding lock:\n");
-	print_lock(prev);
+	print_lock(NULL, prev);
 
 	pr_warn("\nother info that might help us debug this:\n");
 	print_deadlock_scenario(next, prev);
@@ -2052,13 +2044,12 @@ static inline int get_first_held_lock(struct task_struct *curr,
 /*
  * Returns the next chain_key iteration
  */
-static u64 print_chain_key_iteration(int class_idx, u64 chain_key)
+static u64 print_chain_key_iteration(struct printk_buffer *buf, int class_idx, u64 chain_key)
 {
 	u64 new_chain_key = iterate_chain_key(chain_key, class_idx);
 
-	printk(" class_idx:%d -> chain_key:%016Lx",
-		class_idx,
-		(unsigned long long)new_chain_key);
+	buffered_printk(buf, " class_idx:%d -> chain_key:%016Lx",
+			class_idx, (unsigned long long)new_chain_key);
 	return new_chain_key;
 }
 
@@ -2069,17 +2060,19 @@ static u64 print_chain_key_iteration(int class_idx, u64 chain_key)
 	u64 chain_key = 0;
 	int depth = curr->lockdep_depth;
 	int i;
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
 
 	printk("depth: %u\n", depth + 1);
 	for (i = get_first_held_lock(curr, hlock_next); i < depth; i++) {
 		hlock = curr->held_locks + i;
-		chain_key = print_chain_key_iteration(hlock->class_idx, chain_key);
+		chain_key = print_chain_key_iteration(&buf, hlock->class_idx, chain_key);
 
-		print_lock(hlock);
+		print_lock(&buf, hlock);
 	}
 
-	print_chain_key_iteration(hlock_next->class_idx, chain_key);
-	print_lock(hlock_next);
+	print_chain_key_iteration(&buf, hlock_next->class_idx, chain_key);
+	print_lock(&buf, hlock_next);
 }
 
 static void print_chain_keys_chain(struct lock_chain *chain)
@@ -2087,14 +2080,15 @@ static void print_chain_keys_chain(struct lock_chain *chain)
 	int i;
 	u64 chain_key = 0;
 	int class_id;
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
 
 	printk("depth: %u\n", chain->depth);
 	for (i = 0; i < chain->depth; i++) {
 		class_id = chain_hlocks[chain->base + i];
-		chain_key = print_chain_key_iteration(class_id + 1, chain_key);
+		chain_key = print_chain_key_iteration(&buf, class_id + 1, chain_key);
 
-		print_lock_name(lock_classes + class_id);
-		printk("\n");
+		print_lock_name(&buf, lock_classes + class_id, "\n");
 	}
 }
 
@@ -2495,17 +2489,15 @@ static void check_chain_key(struct task_struct *curr)
 print_usage_bug_scenario(struct held_lock *lock)
 {
 	struct lock_class *class = hlock_class(lock);
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
 
 	printk(" Possible unsafe locking scenario:\n\n");
 	printk("       CPU0\n");
 	printk("       ----\n");
-	printk("  lock(");
-	__print_lock_name(class);
-	printk(KERN_CONT ");\n");
+	__print_lock_name(&buf, class, "  lock(", ");\n");
 	printk("  <Interrupt>\n");
-	printk("    lock(");
-	__print_lock_name(class);
-	printk(KERN_CONT ");\n");
+	__print_lock_name(&buf, class, "    lock(", ");\n");
 	printk("\n *** DEADLOCK ***\n\n");
 }
 
@@ -2531,7 +2523,7 @@ static void check_chain_key(struct task_struct *curr)
 		trace_softirq_context(curr), softirq_count() >> SOFTIRQ_SHIFT,
 		trace_hardirqs_enabled(curr),
 		trace_softirqs_enabled(curr));
-	print_lock(this);
+	print_lock(NULL, this);
 
 	pr_warn("{%s} state was registered at:\n", usage_str[prev_bit]);
 	print_stack_trace(hlock_class(this)->usage_traces + prev_bit, 1);
@@ -2577,6 +2569,8 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 	struct lock_list *entry = other;
 	struct lock_list *middle = NULL;
 	int depth;
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
 
 	if (!debug_locks_off_graph_unlock() || debug_locks_silent)
 		return 0;
@@ -2588,13 +2582,13 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 	pr_warn("--------------------------------------------------------\n");
 	pr_warn("%s/%d just changed the state of lock:\n",
 		curr->comm, task_pid_nr(curr));
-	print_lock(this);
+	print_lock(NULL, this);
 	if (forwards)
 		pr_warn("but this lock took another, %s-unsafe lock in the past:\n", irqclass);
 	else
 		pr_warn("but this lock was taken by another, %s-safe lock in the past:\n", irqclass);
-	print_lock_name(other->class);
-	pr_warn("\n\nand interrupts could create inverse lock ordering between them.\n\n");
+	print_lock_name(&buf, other->class, "\n\n");
+	pr_warn("and interrupts could create inverse lock ordering between them.\n\n");
 
 	pr_warn("\nother info that might help us debug this:\n");
 
@@ -3169,7 +3163,7 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 	 */
 	if (ret == 2) {
 		printk("\nmarked lock as {%s}:\n", usage_str[new_bit]);
-		print_lock(this);
+		print_lock(NULL, this);
 		print_irqtrace_events(curr);
 		dump_stack();
 	}
@@ -3264,7 +3258,7 @@ void lockdep_init_map(struct lockdep_map *lock, const char *name,
 	pr_warn("----------------------------------\n");
 
 	pr_warn("%s/%d is trying to lock:\n", curr->comm, task_pid_nr(curr));
-	print_lock(hlock);
+	print_lock(NULL, hlock);
 
 	pr_warn("\nbut this task is not holding:\n");
 	pr_warn("%s\n", hlock->nest_lock->name);
@@ -3326,10 +3320,10 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 	}
 	atomic_inc((atomic_t *)&class->ops);
 	if (very_verbose(class)) {
-		printk("\nacquire class [%px] %s", class->key, class->name);
 		if (class->name_version > 1)
-			printk(KERN_CONT "#%d", class->name_version);
-		printk(KERN_CONT "\n");
+			printk("\nacquire class [%px] %s#%d\n", class->key, class->name, class->name_version);
+		else
+			printk("\nacquire class [%px] %s\n", class->key, class->name);
 		dump_stack();
 	}
 
@@ -3465,6 +3459,9 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 print_unlock_imbalance_bug(struct task_struct *curr, struct lockdep_map *lock,
 			   unsigned long ip)
 {
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
+
 	if (!debug_locks_off())
 		return 0;
 	if (debug_locks_silent)
@@ -3475,10 +3472,9 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 	pr_warn("WARNING: bad unlock balance detected!\n");
 	print_kernel_ident();
 	pr_warn("-------------------------------------\n");
-	pr_warn("%s/%d is trying to release lock (",
-		curr->comm, task_pid_nr(curr));
-	print_lockdep_cache(lock);
-	pr_cont(") at:\n");
+	buffered_printk(&buf, KERN_WARNING "%s/%d is trying to release lock (",
+			curr->comm, task_pid_nr(curr));
+	print_lockdep_cache(&buf, lock, ") at:\n");
 	print_ip_sym(ip);
 	pr_warn("but there are no more locks to release!\n");
 	pr_warn("\nother info that might help us debug this:\n");
@@ -4026,6 +4022,9 @@ void lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie cookie)
 print_lock_contention_bug(struct task_struct *curr, struct lockdep_map *lock,
 			   unsigned long ip)
 {
+	char tmpbuf[256];
+	DEFINE_PRINTK_BUFFER(buf, sizeof(tmpbuf), tmpbuf);
+
 	if (!debug_locks_off())
 		return 0;
 	if (debug_locks_silent)
@@ -4036,10 +4035,9 @@ void lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie cookie)
 	pr_warn("WARNING: bad contention detected!\n");
 	print_kernel_ident();
 	pr_warn("---------------------------------\n");
-	pr_warn("%s/%d is trying to contend lock (",
-		curr->comm, task_pid_nr(curr));
-	print_lockdep_cache(lock);
-	pr_cont(") at:\n");
+	buffered_printk(&buf, KERN_WARNING "%s/%d is trying to contend lock (",
+			curr->comm, task_pid_nr(curr));
+	print_lockdep_cache(&buf, lock, ") at:\n");
 	print_ip_sym(ip);
 	pr_warn("but there are no locks held!\n");
 	pr_warn("\nother info that might help us debug this:\n");
@@ -4382,7 +4380,7 @@ void __init lockdep_info(void)
 	pr_warn("-------------------------\n");
 	pr_warn("%s/%d is freeing memory %px-%px, with a lock still held there!\n",
 		curr->comm, task_pid_nr(curr), mem_from, mem_to-1);
-	print_lock(hlock);
+	print_lock(NULL, hlock);
 	lockdep_print_held_locks(curr);
 
 	pr_warn("\nstack backtrace:\n");
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] printk: inject caller information into the body of message
  2018-06-20 13:06                                               ` Sergey Senozhatsky
  2018-06-22 13:06                                                 ` Tetsuo Handa
@ 2018-09-10 11:20                                                 ` Alexander Potapenko
  2018-09-12  6:53                                                   ` Sergey Senozhatsky
  1 sibling, 1 reply; 94+ messages in thread
From: Alexander Potapenko @ 2018-09-10 11:20 UTC (permalink / raw)
  To: Sergey Senozhatsky, Dmitriy Vyukov, penguin-kernel
  Cc: kbuild test robot, sergey.senozhatsky.work, pmladek, syzkaller,
	Steven Rostedt, LKML, Linus Torvalds, Andrew Morton

On Wed, Jun 20, 2018 at 3:06 PM Sergey Senozhatsky
<sergey.senozhatsky@gmail.com> wrote:
>
> On (06/20/18 13:32), Dmitry Vyukov wrote:
> > > So, if we could get rid of pr_cont() from the most important parts
> > > (instruction dumps, etc) then I would just vote to leave pr_cont()
> > > alone and avoid any handling of it in printk context tracking. Simply
> > > because we wouldn't care about pr_cont(). This also could simplify
> > > Tetsuo's patch significantly.
> >
> > Sounds good to me.
>
> Awesome. If you and Fengguang can combine forces and lead the
> whole thing towards "we couldn't care of pr_cont() less", it
> would be really huuuuuge. Go for it!

Sorry, folks, am I understanding right that pr_cont() and flushing the
buffer on "\n" are two separate problems that can be handled outside
Tetsuo's patchset, just assuming pr_cont() is unsupported?
Or should the pr_cont() cleanup be a prerequisite for that?

>         -ss
>



-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-10 11:20                                                 ` Alexander Potapenko
@ 2018-09-12  6:53                                                   ` Sergey Senozhatsky
  2018-09-12 16:05                                                     ` Steven Rostedt
  0 siblings, 1 reply; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-09-12  6:53 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Sergey Senozhatsky, Dmitriy Vyukov, penguin-kernel,
	kbuild test robot, sergey.senozhatsky.work, pmladek, syzkaller,
	Steven Rostedt, LKML, Linus Torvalds, Andrew Morton

On (09/10/18 13:20), Alexander Potapenko wrote:
> > Awesome. If you and Fengguang can combine forces and lead the
> > whole thing towards "we couldn't care of pr_cont() less", it
> > would be really huuuuuge. Go for it!
> 
> Sorry, folks, am I understanding right that pr_cont() and flushing the
> buffer on "\n" are two separate problems that can be handled outside
> Tetsuo's patchset, just assuming pr_cont() is unsupported?
> Or should the pr_cont() cleanup be a prerequisite for that?

Oh... Sorry. I'm quite overloaded at the moment and simply forgot about
this thread.

So what exactly is our problem with pr_cont? It's not SMP friendly.
And this leads to various things, the most annoying of which is a
premature flush.

E.g. let me do a simple thing on my box:

ps aux | grep firefox
kill 2727

dmesg | tail
[  554.098341] Chrome_~dThread[2823]: segfault at 0 ip 00007f5df153a1f3 sp 00007f5ded47ab00 error 6 in libxul.so[7f5df1531000+4b01000]
[  554.098348] Code: e7 04 48 8d 15 a6 94 ae 03 48 89 10 c7 04 25 00 00 00 00 00 00 00 00 0f 0b 48 8b 05 57 d0 e7 04 48 8d 0d b0 94 ae 03 48 89 08 <c7> 04 25 00 00 00 00 00 00 00 00 0f 0b e8 4d f4 ff ff 48 8b 05 34
[  554.109418] Chrome_~dThread[3047]: segfault at 0 ip 00007f3d5bdba1f3 sp 00007f3d57cfab00 error 6
[  554.109421] Chrome_~dThread[3077]: segfault at 0 ip 00007fe773f661f3 sp 00007fe76fea6b00 error 6
[  554.109424]  in libxul.so[7f3d5bdb1000+4b01000]
[  554.109426]  in libxul.so[7fe773f5d000+4b01000]
[  554.109429] Code: e7 04 48 8d 15 a6 94 ae 03 48 89 10 c7 04 25 00 00 00 00 00 00 00 00 0f 0b 48 8b 05 57 d0 e7 04 48 8d 0d b0 94 ae 03 48 89 08 <c7> 04 25 00 00 00 00 00 00 00 00 0f 0b e8 4d f4 ff ff 48 8b 05 34


Even such a simple thing as "printk several lines per-crashed process"
is broken. Look at line #0 and lines #2-#5.

And this is the only problem we probably need to address. Overlapping
printk lines -- when several CPUs printk simultaneously, or same CPUs
printk-s from IRQ, etc -- are here by design and it's not going to be
easy to change that (and maybe we shouldn't try).


Buffering multiple lines in printk buffer does not look so simple and
perhaps we should not try to do this, as well. Why:

- it's hard to decide what to do when buffer overflows

    Switching to "normal printk" defeats the reason we do buffering in the
    first place, because "normal printk" permits overlapping. So buffering
    makes little sense if we are OK with switching to a "normal printk".

- the more we buffer the more we can lose in case of panic.

    We can't flush_on_panic() printk buffers which were allocated on stack.

- flushing multiple lines should be more complex than just a simple
  printk loop

  while (1) {
     x = memchr(buf, '\n', sz);
     ...
     print("%s", buf);
     ...
  }

    Because a "printk() loop" permits lines to overlap. Hence buffering
    makes little sense, once again.
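
For illustration, the naive flush loop above can be sketched in userspace.
This is only a sketch under assumptions: emit_line() and flush_lines() are
hypothetical names, and emit_line() stands in for one separate printk()
call. Every line leaves through its own call, which is exactly where output
from other contexts could slip in between.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one printk() call; counts how often it runs. */
static int emit_calls;

static void emit_line(const char *line, size_t len)
{
	emit_calls++;
	printf("%.*s\n", (int)len, line);
}

/*
 * Split buf[0..sz) on '\n' and emit every line separately.
 * Returns the number of emitted lines. A trailing fragment without
 * a '\n' is emitted as well.
 */
static int flush_lines(const char *buf, size_t sz)
{
	int lines = 0;

	assert(buf != NULL);
	while (sz) {
		const char *nl = memchr(buf, '\n', sz);
		size_t len = nl ? (size_t)(nl - buf) : sz;

		emit_line(buf, len);	/* others may print between two calls */
		lines++;

		if (!nl)
			break;
		buf = nl + 1;
		sz -= len + 1;
	}
	return lines;
}
```

A two-line buffer results in two independent emit_line() calls, so the
buffering buys no atomicity across lines.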



So let's reduce the problem scope to "we want to have a replacement for
pr_cont()". And let's address pr_cont()'s "premature flush" issue only.


I scanned some of Linus' emails, and skimmed through previous discussions
on this topic. Let me quote Linus:

: 
: My preference as a user is actually to just have a dynamically
: re-sizable buffer (that's pretty much what I've done in *every* single
: user space project I've had in the last decade), but because some
: users might have atomicity issues I do suspect that we should just use
: a stack buffer.
: 
: And then perhaps say that the buffer size has to be capped at 80 characters.
: 
: Because if you're printing more than 80 characters and expecting it
: all to fit on a line, you're doing something else wrong anyway.
: 
: And hide it not as a explicit "char buffer[80]]" allocation, but as a
: "struct line_buffer" or similar, so that
: 
:  (a) people don't get the line size wrong
: 
:  (b) the buffering code can add a few fields for length etc in there too
: 
: Introduce a few helper functions for it:
: 
:  init_line_buffer(&buf);
:  print_line(&buf, fmt, args);
:  vprint_line(&buf, fmt, vararg);
:  finish_line(&buf);
: 



And this is, basically, what I have attached to this email. It's very
simple and very short. And I think this is what Linus wanted us to do.

- usage example

       DEFINE_PR_LINE(KERN_ERR, pl);

       pr_line(&pl, "Hello, %s!\n", "buffer");
       pr_line(&pl, "%s", "OK.\n");
       pr_line(&pl, "Goodbye, %s", "buffer");
       pr_line(&pl, "\n");

dmesg | tail

[   69.908542] Hello, buffer!
[   69.908544] OK.
[   69.908545] Goodbye, buffer


- pr_cont-like usage

       DEFINE_PR_LINE(KERN_ERR, pl);

       pr_line(&pl,"%d ", 1);
       pr_line(&pl,"%d ", 3);
       pr_line(&pl,"%d ", 5);
       pr_line(&pl,"%d ", 7);
       pr_line(&pl,"%d\n", 9);

dmesg | tail

[   69.908546] 1 3 5 7 9
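
The accumulate-then-flush behaviour shown above can be modelled in
userspace. This is a sketch, not the attached patch: struct ul_line,
ul_print() and ul_flush() are hypothetical names, output goes to stdout
instead of printk(), and a fragment that does not fit is simply dropped
rather than marked as truncated.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define UL_BUF_SZ 80	/* same cap as PRINTK_PR_LINE_BUF_SZ in the patch */

/* Hypothetical userspace analogue of struct pr_line. */
struct ul_line {
	char buf[UL_BUF_SZ];
	int len;
	char last[UL_BUF_SZ];	/* copy of the most recently flushed line */
	int flushes;
};

/* Emit the buffered text as one unit, like a single printk() call. */
static void ul_flush(struct ul_line *l)
{
	if (!l->len)
		return;
	memcpy(l->last, l->buf, (size_t)l->len);
	l->last[l->len] = '\0';
	fputs(l->last, stdout);
	l->flushes++;
	l->len = 0;
}

/* Append formatted text; flush once the buffer ends with '\n'. */
static int ul_print(struct ul_line *l, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(l->len < UL_BUF_SZ);
	va_start(ap, fmt);
	n = vsnprintf(l->buf + l->len, sizeof(l->buf) - (size_t)l->len,
		      fmt, ap);
	va_end(ap);

	if (n < 0 || l->len + n >= UL_BUF_SZ)
		return -1;	/* does not fit; this sketch drops the fragment */

	l->len += n;
	if (l->len && l->buf[l->len - 1] == '\n')
		ul_flush(l);
	return 0;
}
```

Several ul_print() calls without a trailing newline only grow the buffer;
the whole line is emitted in one go when the newline finally arrives.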


- An explicit, aux buffer // output should be truncated

       char buf[16];
       DEFINE_PR_LINE_BUF(KERN_ERR, ps, buf, sizeof(buf));

       pr_line(&ps, "Test test test test test test test test test\n");
       pr_line(&ps, "\n");


dmesg | tail

[   69.908547] Test test test ** truncated **


Opinions? Will this work for us?

====

From 7fd8407e0081d8979f08dec48e88364d6210b4ab Mon Sep 17 00:00:00 2001
From: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Subject: [PATCH] printk: add pr_line buffering API

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 include/linux/printk.h | 63 ++++++++++++++++++++++++++++++++++++++++++
 kernel/printk/printk.c | 55 ++++++++++++++++++++++++++++++++++++
 2 files changed, 118 insertions(+)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index cf3eccfe1543..fc5f11c7579c 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -157,6 +157,15 @@ static inline void printk_nmi_direct_enter(void) { }
 static inline void printk_nmi_direct_exit(void) { }
 #endif /* PRINTK_NMI */
 
+#define PRINTK_PR_LINE_BUF_SZ	80
+
+struct pr_line {
+	char			*buffer;
+	int			size;
+	int			len;
+	char			*level;
+};
+
 #ifdef CONFIG_PRINTK
 asmlinkage __printf(5, 0)
 int vprintk_emit(int facility, int level,
@@ -209,6 +218,30 @@ extern asmlinkage void dump_stack(void) __cold;
 extern void printk_safe_init(void);
 extern void printk_safe_flush(void);
 extern void printk_safe_flush_on_panic(void);
+
+#define DEFINE_PR_LINE(lev, name)				\
+	char		__pr_line_buf[PRINTK_PR_LINE_BUF_SZ];	\
+	struct pr_line	name = {				\
+		.buffer = __pr_line_buf,			\
+		.size 	= PRINTK_PR_LINE_BUF_SZ,		\
+		.len 	= 0,					\
+		.level	= lev,					\
+	}
+
+#define DEFINE_PR_LINE_BUF(lev, name, buf, sz)			\
+	struct pr_line	name = {				\
+		.buffer = buf,					\
+		.size 	= (sz),					\
+		.len 	= 0,					\
+		.level	= lev,					\
+	}
+
+extern __printf(2, 3)
+int pr_line(struct pr_line *pl, const char *fmt, ...);
+extern __printf(2, 0)
+int vpr_line(struct pr_line *pl, const char *fmt, va_list args);
+extern void pr_line_flush(struct pr_line *pl);
+
 #else
 static inline __printf(1, 0)
 int vprintk(const char *s, va_list args)
@@ -284,6 +317,36 @@ static inline void printk_safe_flush(void)
 static inline void printk_safe_flush_on_panic(void)
 {
 }
+
+#define DEFINE_PR_LINE(lev, name)				\
+	struct pr_line	name = {				\
+		.buffer = NULL,					\
+		.size 	= 0,					\
+		.len 	= 0,					\
+		.level	= lev,					\
+	}
+
+#define DEFINE_PR_LINE_BUF(lev, name, buf, sz)			\
+	struct pr_line	name = {				\
+		.buffer = buf,					\
+		.size 	= 0,					\
+		.len 	= 0,					\
+		.level	= lev,					\
+	}
+
+static inline __printf(2, 3)
+int pr_line(struct pr_line *pl, const char *fmt, ...)
+{
+	return 0;
+}
+static inline __printf(2, 0)
+int vpr_line(struct pr_line *pl, const char *fmt, va_list args)
+{
+	return 0;
+}
+static inline void pr_line_flush(struct pr_line *pl)
+{
+}
 #endif
 
 extern int kptr_restrict;
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index fd6f8ed28e01..daeb41a57929 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2004,6 +2004,61 @@ asmlinkage __visible int printk(const char *fmt, ...)
 }
 EXPORT_SYMBOL(printk);
 
+#define PR_LINE_TRUNCATED_MSG "** truncated **\n"
+
+int vpr_line(struct pr_line *pl, const char *fmt, va_list args)
+{
+	int len;
+
+	if (unlikely(pl->size >= LOG_LINE_MAX))
+		pl->size = LOG_LINE_MAX - sizeof(PR_LINE_TRUNCATED_MSG);
+
+	if (fmt[0] == '\n') {
+		pr_line_flush(pl);
+		return 0;
+	}
+
+	if (pl->len >= pl->size)
+		return -1;
+
+	len = vsnprintf(pl->buffer + pl->len, pl->size - pl->len, fmt, args);
+	if (pl->len + len >= pl->size) {
+		pl->len = pl->size + 1;
+		return -1;
+	}
+
+	pl->len += len;
+	if (pl->len && pl->buffer[pl->len - 1] == '\n')
+		pr_line_flush(pl);
+	return 0;
+}
+EXPORT_SYMBOL(vpr_line);
+
+int pr_line(struct pr_line *pl, const char *fmt, ...)
+{
+	va_list ap;
+	int ret;
+
+	va_start(ap, fmt);
+	ret = vpr_line(pl, fmt, ap);
+	va_end(ap);
+	return ret;
+}
+EXPORT_SYMBOL(pr_line);
+
+void pr_line_flush(struct pr_line *pl)
+{
+	if (!pl->len)
+		return;
+
+	if (pl->len < pl->size)
+		printk("%s%.*s", pl->level, pl->len, pl->buffer);
+	else
+		printk("%s%.*s%s", pl->level, pl->len, pl->buffer,
+			PR_LINE_TRUNCATED_MSG);
+	pl->len = 0;
+}
+EXPORT_SYMBOL(pr_line_flush);
 #else /* CONFIG_PRINTK */
 
 #define LOG_LINE_MAX		0
-- 
2.19.0


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-12  6:53                                                   ` Sergey Senozhatsky
@ 2018-09-12 16:05                                                     ` Steven Rostedt
  2018-09-13  7:12                                                       ` Sergey Senozhatsky
  0 siblings, 1 reply; 94+ messages in thread
From: Steven Rostedt @ 2018-09-12 16:05 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Alexander Potapenko, Sergey Senozhatsky, Dmitriy Vyukov,
	penguin-kernel, kbuild test robot, pmladek, syzkaller, LKML,
	Linus Torvalds, Andrew Morton

On Wed, 12 Sep 2018 15:53:07 +0900
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:

> I scanned some of Linus' emails, and skimmed through previous discussions
> on this topic. Let me quote Linus:
> 
> : 
> : My preference as a user is actually to just have a dynamically
> : re-sizable buffer (that's pretty much what I've done in *every* single
> : user space project I've had in the last decade), but because some
> : users might have atomicity issues I do suspect that we should just use
> : a stack buffer.
> : 
> : And then perhaps say that the buffer size has to be capped at 80 characters.
> : 
> : Because if you're printing more than 80 characters and expecting it
> : all to fit on a line, you're doing something else wrong anyway.
> : 
> : And hide it not as a explicit "char buffer[80]]" allocation, but as a
> : "struct line_buffer" or similar, so that
> : 
> :  (a) people don't get the line size wrong
> : 
> :  (b) the buffering code can add a few fields for length etc in there too
> : 
> : Introduce a few helper functions for it:
> : 
> :  init_line_buffer(&buf);
> :  print_line(&buf, fmt, args);
> :  vprint_line(&buf, fmt, vararg);
> :  finish_line(&buf);
> : 

This sounds like seq_buf to me.

> 
> 
> 
> And this is, basically, what I have attached to this email. It's very
> simple and very short. And I think this is what Linus wanted us to do.
> 
> - usage example
> 
>        DEFINE_PR_LINE(KERN_ERR, pl);
> 
>        pr_line(&pl, "Hello, %s!\n", "buffer");
>        pr_line(&pl, "%s", "OK.\n");
>        pr_line(&pl, "Goodbye, %s", "buffer");
>        pr_line(&pl, "\n");
> 
> dmesg | tail
> 
> [   69.908542] Hello, buffer!
> [   69.908544] OK.
> [   69.908545] Goodbye, buffer
> 
> 
> - pr_cont-like usage
> 
>        DEFINE_PR_LINE(KERN_ERR, pl);
> 
>        pr_line(&pl,"%d ", 1);
>        pr_line(&pl,"%d ", 3);
>        pr_line(&pl,"%d ", 5);
>        pr_line(&pl,"%d ", 7);
>        pr_line(&pl,"%d\n", 9);
> 
> dmesg | tail
> 
> [   69.908546] 1 3 5 7 9
> 
> 
> - An explicit, aux buffer // output should be truncated
> 
>        char buf[16];
>        DEFINE_PR_LINE_BUF(KERN_ERR, ps, buf, sizeof(buf));
> 
>        pr_line(&ps, "Test test test test test test test test test\n");
>        pr_line(&ps, "\n");
> 
> 
> dmesg | tail
> 
> [   69.908547] Test test test ** truncated **
> 
> 
> Opinions? Will this work for us?
> 
> ====
> 
> From 7fd8407e0081d8979f08dec48e88364d6210b4ab Mon Sep 17 00:00:00 2001
> From: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Subject: [PATCH] printk: add pr_line buffering API
> 
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> ---
>  include/linux/printk.h | 63 ++++++++++++++++++++++++++++++++++++++++++
>  kernel/printk/printk.c | 55 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 118 insertions(+)
> 
> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index cf3eccfe1543..fc5f11c7579c 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -157,6 +157,15 @@ static inline void printk_nmi_direct_enter(void) { }
>  static inline void printk_nmi_direct_exit(void) { }
>  #endif /* PRINTK_NMI */
>  
> +#define PRINTK_PR_LINE_BUF_SZ	80
> +
> +struct pr_line {
> +	char			*buffer;
> +	int			size;
> +	int			len;
> +	char			*level;
> +};

Can you look at implementing this with using a seq_buf?

-- Steve

> +
>  #ifdef CONFIG_PRINTK
>  asmlinkage __printf(5, 0)
>  int vprintk_emit(int facility, int level,
> @@ -209,6 +218,30 @@ extern asmlinkage void dump_stack(void) __cold;
>  extern void printk_safe_init(void);
>  extern void printk_safe_flush(void);
>  extern void printk_safe_flush_on_panic(void);
> +
> +#define DEFINE_PR_LINE(lev, name)				\
> +	char		__pr_line_buf[PRINTK_PR_LINE_BUF_SZ];	\
> +	struct pr_line	name = {				\
> +		.buffer = __pr_line_buf,			\
> +		.size 	= PRINTK_PR_LINE_BUF_SZ,		\
> +		.len 	= 0,					\
> +		.level	= lev,					\
> +	}
> +
> +#define DEFINE_PR_LINE_BUF(lev, name, buf, sz)			\
> +	struct pr_line	name = {				\
> +		.buffer = buf,					\
> +		.size 	= (sz),					\
> +		.len 	= 0,					\
> +		.level	= lev,					\
> +	}
> +
> +extern __printf(2, 3)
> +int pr_line(struct pr_line *pl, const char *fmt, ...);
> +extern __printf(2, 0)
> +int vpr_line(struct pr_line *pl, const char *fmt, va_list args);
> +extern void pr_line_flush(struct pr_line *pl);
> +
>  #else
>  static inline __printf(1, 0)
>  int vprintk(const char *s, va_list args)
> @@ -284,6 +317,36 @@ static inline void printk_safe_flush(void)
>  static inline void printk_safe_flush_on_panic(void)
>  {
>  }
> +
> +#define DEFINE_PR_LINE(lev, name)				\
> +	struct pr_line	name = {				\
> +		.buffer = NULL,					\
> +		.size 	= 0,					\
> +		.len 	= 0,					\
> +		.level	= lev,					\
> +	}
> +
> +#define DEFINE_PR_LINE_BUF(lev, name, buf, sz)			\
> +	struct pr_line	name = {				\
> +		.buffer = buf,					\
> +		.size 	= 0,					\
> +		.len 	= 0,					\
> +		.level	= lev,					\
> +	}
> +
> +static inline __printf(2, 3)
> +int pr_line(struct pr_line *pl, const char *fmt, ...)
> +{
> +	return 0;
> +}
> +static inline __printf(2, 0)
> +int vpr_line(struct pr_line *pl, const char *fmt, va_list args)
> +{
> +	return 0;
> +}
> +static inline void pr_line_flush(struct pr_line *pl)
> +{
> +}
>  #endif
>  
>  extern int kptr_restrict;
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index fd6f8ed28e01..daeb41a57929 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2004,6 +2004,61 @@ asmlinkage __visible int printk(const char *fmt, ...)
>  }
>  EXPORT_SYMBOL(printk);
>  
> +#define PR_LINE_TRUNCATED_MSG "** truncated **\n"
> +
> +int vpr_line(struct pr_line *pl, const char *fmt, va_list args)
> +{
> +	int len;
> +
> +	if (unlikely(pl->size >= LOG_LINE_MAX))
> +		pl->size = LOG_LINE_MAX - sizeof(PR_LINE_TRUNCATED_MSG);
> +
> +	if (fmt[0] == '\n') {
> +		pr_line_flush(pl);
> +		return 0;
> +	}
> +
> +	if (pl->len >= pl->size)
> +		return -1;
> +
> +	len = vsnprintf(pl->buffer + pl->len, pl->size - pl->len, fmt, args);
> +	if (pl->len + len >= pl->size) {
> +		pl->len = pl->size + 1;
> +		return -1;
> +	}
> +
> +	pl->len += len;
> +	if (pl->len && pl->buffer[pl->len - 1] == '\n')
> +		pr_line_flush(pl);
> +	return 0;
> +}
> +EXPORT_SYMBOL(vpr_line);
> +
> +int pr_line(struct pr_line *pl, const char *fmt, ...)
> +{
> +	va_list ap;
> +	int ret;
> +
> +	va_start(ap, fmt);
> +	ret = vpr_line(pl, fmt, ap);
> +	va_end(ap);
> +	return ret;
> +}
> +EXPORT_SYMBOL(pr_line);
> +
> +void pr_line_flush(struct pr_line *pl)
> +{
> +	if (!pl->len)
> +		return;
> +
> +	if (pl->len < pl->size)
> +		printk("%s%.*s", pl->level, pl->len, pl->buffer);
> +	else
> +		printk("%s%.*s%s", pl->level, pl->len, pl->buffer,
> +			PR_LINE_TRUNCATED_MSG);
> +	pl->len = 0;
> +}
> +EXPORT_SYMBOL(pr_line_flush);
>  #else /* CONFIG_PRINTK */
>  
>  #define LOG_LINE_MAX		0


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-12 16:05                                                     ` Steven Rostedt
@ 2018-09-13  7:12                                                       ` Sergey Senozhatsky
  2018-09-13 12:26                                                         ` Petr Mladek
  2018-09-14  1:12                                                         ` Steven Rostedt
  0 siblings, 2 replies; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-09-13  7:12 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Sergey Senozhatsky, Alexander Potapenko, Sergey Senozhatsky,
	Dmitriy Vyukov, penguin-kernel, kbuild test robot, pmladek,
	syzkaller, LKML, Linus Torvalds, Andrew Morton

Hi, Steven

On (09/12/18 12:05), Steven Rostedt wrote:
> > : Introduce a few helper functions for it:
> > : 
> > :  init_line_buffer(&buf);
> > :  print_line(&buf, fmt, args);
> > :  vprint_line(&buf, fmt, vararg);
> > :  finish_line(&buf);
> > : 
> 
> This sounds like seq_buf to me.

Correct.

> > +struct pr_line {
> > +	char			*buffer;
> > +	int			size;
> > +	int			len;
> > +	char			*level;
> > +};
> 
> Can you look at implementing this with using a seq_buf?

Certainly, attached.

It doesn't seem to save us that much code, though. It looks smaller just
because I dropped the "truncated" printout and didn't include the
!CONFIG_PRINTK noise this time around. And the good thing about the previous
version was that it didn't introduce any new dependencies to printk.

Making pr_line available via printk.h (i.e. #include seq_buf.h in printk.h)
looks, at a glance, like some fun. printk.h gets included very early, before
we have all the stuff that seq_buf.h wants. We can remove fs.h from
seq_buf.h and add a bunch of forward declarations for path and seq_file,
but all those BUG_ON/WARN_ON/etc. are another story (unless we want every
pr_line user to include seq_buf.h).

... maybe I can change API. But I sort of like that implicit buffer case:

	DEFINE_PR_LINE(KERN_ERR, pl);

	pr_line(&pl, "Hello, ");
	pr_line(&pl, "%s.\n", "Steven");

And, looking at potential users of pr_line, I'd say that we better
have DEFINE_PR_LINE_BUF, because some of them do print messages longer
than 80 chars.

===

From: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Subject: [PATCH] lib/seq_buf: add pr_line buffering API

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 include/linux/seq_buf.h | 35 +++++++++++++++++++++++++++++++
 lib/seq_buf.c           | 46 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 81 insertions(+)

diff --git a/include/linux/seq_buf.h b/include/linux/seq_buf.h
index aa5deb041c25..5e9a5ff9a440 100644
--- a/include/linux/seq_buf.h
+++ b/include/linux/seq_buf.h
@@ -23,6 +23,36 @@ struct seq_buf {
 	loff_t			readpos;
 };
 
+#define __SEQ_BUF_INITIALIZER(buf, length) {				\
+	.buffer			= (buf),				\
+	.size			= (length),				\
+	.len			= 0,					\
+	.readpos		= 0, }
+
+#ifdef CONFIG_PRINTK
+#define __PR_LINE_BUF_SZ	80
+#else
+#define __PR_LINE_BUF_SZ	0
+#endif
+
+struct pr_line {
+	struct seq_buf		sb;
+	char			*level;
+};
+
+#define DEFINE_PR_LINE(lev, name)					\
+	char		__line[__PR_LINE_BUF_SZ];			\
+	struct pr_line	name = {					\
+		.sb = __SEQ_BUF_INITIALIZER(__line, __PR_LINE_BUF_SZ),	\
+		.level	= lev,						\
+	}
+
+#define DEFINE_PR_LINE_BUF(lev, name, buf, sz)				\
+	struct pr_line	name = {					\
+		.sb = __SEQ_BUF_INITIALIZER(buf, (sz)),		\
+		.level	= lev,						\
+	}
+
 static inline void seq_buf_clear(struct seq_buf *s)
 {
 	s->len = 0;
@@ -131,4 +161,9 @@ extern int
 seq_buf_bprintf(struct seq_buf *s, const char *fmt, const u32 *binary);
 #endif
 
+extern __printf(2, 0)
+int vpr_line(struct pr_line *pl, const char *fmt, va_list args);
+extern __printf(2, 3)
+int pr_line(struct pr_line *pl, const char *fmt, ...);
+extern void pr_line_flush(struct pr_line *pl);
 #endif /* _LINUX_SEQ_BUF_H */
diff --git a/lib/seq_buf.c b/lib/seq_buf.c
index 11f2ae0f9099..29bc4f24b83e 100644
--- a/lib/seq_buf.c
+++ b/lib/seq_buf.c
@@ -324,3 +324,49 @@ int seq_buf_to_user(struct seq_buf *s, char __user *ubuf, int cnt)
 	s->readpos += cnt;
 	return cnt;
 }
+
+int vpr_line(struct pr_line *pl, const char *fmt, va_list args)
+{
+	struct seq_buf *s = &pl->sb;
+	int ret, len;
+
+	if (fmt[0] == '\n') {
+		pr_line_flush(pl);
+		return 0;
+	}
+
+	ret = seq_buf_vprintf(s, fmt, args);
+
+	len = seq_buf_used(s);
+	if (len && s->buffer[len - 1] == '\n')
+		pr_line_flush(pl);
+
+	return ret;
+}
+EXPORT_SYMBOL(vpr_line);
+
+int pr_line(struct pr_line *pl, const char *fmt, ...)
+{
+	va_list ap;
+	int ret;
+
+	va_start(ap, fmt);
+	ret = vpr_line(pl, fmt, ap);
+	va_end(ap);
+
+	return ret;
+}
+EXPORT_SYMBOL(pr_line);
+
+void pr_line_flush(struct pr_line *pl)
+{
+	struct seq_buf *s = &pl->sb;
+	int len = seq_buf_used(s);
+
+	if (!len)
+		return;
+
+	printk("%s%.*s", pl->level, len, s->buffer);
+	seq_buf_clear(s);
+}
+EXPORT_SYMBOL(pr_line_flush);
-- 
2.19.0


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-13  7:12                                                       ` Sergey Senozhatsky
@ 2018-09-13 12:26                                                         ` Petr Mladek
  2018-09-13 14:28                                                           ` Sergey Senozhatsky
  2018-09-14  1:12                                                         ` Steven Rostedt
  1 sibling, 1 reply; 94+ messages in thread
From: Petr Mladek @ 2018-09-13 12:26 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Steven Rostedt, Alexander Potapenko, Sergey Senozhatsky,
	Dmitriy Vyukov, penguin-kernel, kbuild test robot, syzkaller,
	LKML, Linus Torvalds, Andrew Morton

On Thu 2018-09-13 16:12:54, Sergey Senozhatsky wrote:
> On (09/12/18 12:05), Steven Rostedt wrote:
> > > : Introduce a few helper functions for it:
> > > : 
> > > :  init_line_buffer(&buf);
> > > :  print_line(&buf, fmt, args);
> > > :  vprint_line(&buf, fmt, vararg);
> > > :  finish_line(&buf);
> > > : 
> > 
> --- a/lib/seq_buf.c
> +++ b/lib/seq_buf.c
> @@ -324,3 +324,49 @@ int seq_buf_to_user(struct seq_buf *s, char __user *ubuf, int cnt)
>  	s->readpos += cnt;
>  	return cnt;
>  }
> +
> +int vpr_line(struct pr_line *pl, const char *fmt, va_list args)
> +{
> +	struct seq_buf *s = &pl->sb;
> +	int ret, len;
> +
> +	if (fmt[0] == '\n') {
> +		pr_line_flush(pl);
> +		return 0;
> +	}

You would need to check if fmt[1] == '\0'. But then you would need
to be careful about a possible buffer overflow. I would personally
avoid this optimization.
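
For illustration only, the stricter check could look like the sketch
below. fmt_is_bare_newline() is a hypothetical helper, not part of the
patch. The assumption is that fmt is a NUL-terminated string, so fmt[1]
may be read once fmt[0] is known to be '\n'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true only when fmt is exactly "\n".
 * fmt[1] is read only after fmt[0] matched '\n', so the read
 * never goes past the terminating NUL.
 */
static bool fmt_is_bare_newline(const char *fmt)
{
	assert(fmt != NULL);
	return fmt[0] == '\n' && fmt[1] == '\0';
}
```

A format that merely starts with a newline and then carries more text is
rejected, so it would take the normal formatting path instead of being
silently swallowed by the fast path.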


> +	ret = seq_buf_vprintf(s, fmt, args);
> +
> +	len = seq_buf_used(s);
> +	if (len && s->buffer[len - 1] == '\n')
> +		pr_line_flush(pl);

This would mean that pr_line_flush() is no longer strictly needed.
It would also encourage people to use this feature in a more
complicated way (for multiple lines). Do we really want this?


In general, I like this approach more than any attempts to handle
continuous lines transparently. The other attempts were much more
complicated or were not reliable.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-13 12:26                                                         ` Petr Mladek
@ 2018-09-13 14:28                                                           ` Sergey Senozhatsky
  2018-09-14  1:22                                                             ` Steven Rostedt
  2018-09-14  6:57                                                             ` Sergey Senozhatsky
  0 siblings, 2 replies; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-09-13 14:28 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Alexander Potapenko,
	Sergey Senozhatsky, Dmitriy Vyukov, penguin-kernel,
	kbuild test robot, syzkaller, LKML, Linus Torvalds,
	Andrew Morton

On (09/13/18 14:26), Petr Mladek wrote:
> > +
> > +int vpr_line(struct pr_line *pl, const char *fmt, va_list args)
> > +{
> > +	struct seq_buf *s = &pl->sb;
> > +	int ret, len;
> > +
> > +	if (fmt[0] == '\n') {
> > +		pr_line_flush(pl);
> > +		return 0;
> > +	}
> 
> You would need to check if fmt[1] == '\0'. But then you would need
> to be careful about a possible buffer overflow. I would personally
> avoid this optimization.

Good call. It was a fast path for pr_cont("\n").
But it made me wonder, so I did some grepping:

arch/m68k/kernel/traps.c:                               pr_cont("\n       ");
arch/m68k/kernel/traps.c:                       pr_cont("\n       ");
kernel/trace/ftrace.c:          pr_cont("\n expected tramp: %lx\n", ip);

Lovely.
It will take us some time.

> > +	ret = seq_buf_vprintf(s, fmt, args);
> > +
> > +	len = seq_buf_used(s);
> > +	if (len && s->buffer[len - 1] == '\n')
> > +		pr_line_flush(pl);
> 
> This would mean that pr_line_flush() is no longer strictly needed.
> It would also encourage people to use this feature in a more
> complicated way (for multiple lines). Do we really want this?

Not that I see any problems with pr_line_flush(). But I can drop it, sure.
pr_line() is a replacement for pr_cont() and, as such, it's not meant for
multi-line buffering.

	-ss

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-13  7:12                                                       ` Sergey Senozhatsky
  2018-09-13 12:26                                                         ` Petr Mladek
@ 2018-09-14  1:12                                                         ` Steven Rostedt
  2018-09-14  1:55                                                           ` Sergey Senozhatsky
  1 sibling, 1 reply; 94+ messages in thread
From: Steven Rostedt @ 2018-09-14  1:12 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Alexander Potapenko, Sergey Senozhatsky, Dmitriy Vyukov,
	penguin-kernel, kbuild test robot, pmladek, syzkaller, LKML,
	Linus Torvalds, Andrew Morton

On Thu, 13 Sep 2018 16:12:54 +0900
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:

> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> ---
>  include/linux/seq_buf.h | 35 +++++++++++++++++++++++++++++++
>  lib/seq_buf.c           | 46 +++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 81 insertions(+)
> 
> diff --git a/include/linux/seq_buf.h b/include/linux/seq_buf.h
> index aa5deb041c25..5e9a5ff9a440 100644
> --- a/include/linux/seq_buf.h
> +++ b/include/linux/seq_buf.h
> @@ -23,6 +23,36 @@ struct seq_buf {
>  	loff_t			readpos;
>  };
>  
> +#define __SEQ_BUF_INITIALIZER(buf, length) {				\
> +	.buffer			= (buf),				\
> +	.size			= (length),				\
> +	.len			= 0,					\
> +	.readpos		= 0, }

Nit, but the end bracket '}' should be on its own line. Even when
part of a macro.

> +
> +#ifdef CONFIG_PRINTK
> +#define __PR_LINE_BUF_SZ	80
> +#else
> +#define __PR_LINE_BUF_SZ	0
> +#endif
> +
> +struct pr_line {
> +	struct seq_buf		sb;
> +	char			*level;
> +};
> +
> +#define DEFINE_PR_LINE(lev, name)					\
> +	char		__line[__PR_LINE_BUF_SZ];			\

To protect against name space collision could you use:

	char		__line_##name[__PR_LINE_BUF_SZ];
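
A cut-down userspace sketch of why the token pasting helps, under stated
assumptions: struct line, DEFINE_LINE() and LINE_BUF_SZ are hypothetical
stand-ins for struct pr_line, DEFINE_PR_LINE() and __PR_LINE_BUF_SZ. With
a fixed array name, two definitions in one scope would redeclare the same
array; with the pasted name each variable gets its own.

```c
#include <assert.h>

#define LINE_BUF_SZ 80

/* Hypothetical, cut-down analogue of struct pr_line. */
struct line {
	char *buffer;
	int size;
};

/* The backing array's name is derived from the variable name. */
#define DEFINE_LINE(name)					\
	char		__line_##name[LINE_BUF_SZ];		\
	struct line	name = {				\
		.buffer	= __line_##name,			\
		.size	= LINE_BUF_SZ,				\
	}

/* Two definitions in one scope compile because the arrays differ. */
static int two_lines_are_distinct(void)
{
	DEFINE_LINE(a);
	DEFINE_LINE(b);

	assert(a.size == LINE_BUF_SZ && b.size == LINE_BUF_SZ);
	return a.buffer != b.buffer;
}
```
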

> +	struct pr_line	name = {					\
> +		.sb = __SEQ_BUF_INITIALIZER(__line, __PR_LINE_BUF_SZ),	\
> +		.level	= lev,						\
> +	}
> +
> +#define DEFINE_PR_LINE_BUF(lev, name, buf, sz)				\
> +	struct pr_line	name = {					\
> +		.sb = __SEQ_BUF_INITIALIZER(buf, (sz)),		\
> +		.level	= lev,						\
> +	}
> +
>  static inline void seq_buf_clear(struct seq_buf *s)
>  {
>  	s->len = 0;
> @@ -131,4 +161,9 @@ extern int
>  seq_buf_bprintf(struct seq_buf *s, const char *fmt, const u32 *binary);
>  #endif
>  
> +extern __printf(2, 0)
> +int vpr_line(struct pr_line *pl, const char *fmt, va_list args);
> +extern __printf(2, 3)
> +int pr_line(struct pr_line *pl, const char *fmt, ...);
> +extern void pr_line_flush(struct pr_line *pl);
>  #endif /* _LINUX_SEQ_BUF_H */
> diff --git a/lib/seq_buf.c b/lib/seq_buf.c
> index 11f2ae0f9099..29bc4f24b83e 100644
> --- a/lib/seq_buf.c
> +++ b/lib/seq_buf.c
> @@ -324,3 +324,49 @@ int seq_buf_to_user(struct seq_buf *s, char __user *ubuf, int cnt)
>  	s->readpos += cnt;
>  	return cnt;
>  }
> +
> +int vpr_line(struct pr_line *pl, const char *fmt, va_list args)
> +{
> +	struct seq_buf *s = &pl->sb;
> +	int ret, len;
> +
> +	if (fmt[0] == '\n') {
> +		pr_line_flush(pl);
> +		return 0;
> +	}
> +
> +	ret = seq_buf_vprintf(s, fmt, args);
> +
> +	len = seq_buf_used(s);
> +	if (len && s->buffer[len - 1] == '\n')
> +		pr_line_flush(pl);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(vpr_line);
> +
> +int pr_line(struct pr_line *pl, const char *fmt, ...)
> +{
> +	va_list ap;
> +	int ret;
> +
> +	va_start(ap, fmt);
> +	ret = vpr_line(pl, fmt, ap);
> +	va_end(ap);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(pr_line);
> +
> +void pr_line_flush(struct pr_line *pl)
> +{
> +	struct seq_buf *s = &pl->sb;
> +	int len = seq_buf_used(s);
> +
> +	if (!len)
> +		return;
> +
> +	printk("%s%.*s", pl->level, len, s->buffer);
> +	seq_buf_clear(s);
> +}
> +EXPORT_SYMBOL(pr_line_flush);

The rest looks fine to me.

Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

-- Steve
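
A minimal userspace sketch of the name-space point raised in the review,
assuming a simplified stand-in for struct pr_line (the real one wraps
struct seq_buf) and placeholder level strings. The function name
two_lines_have_distinct_buffers() is hypothetical. Because the backing
array is named after the variable, two definitions can share one scope;
with a fixed array name such as "__line" the second definition would
fail to compile.

```c
#include <stddef.h>

#define PR_LINE_BUF_SZ	80	/* stand-in for __PR_LINE_BUF_SZ */

/* Simplified stand-in for the proposed struct pr_line. */
struct pr_line {
	char		*buffer;
	size_t		size;
	size_t		len;
	const char	*level;
};

/*
 * The array name is derived from the variable name, so every
 * DEFINE_PR_LINE() in a scope declares its own backing array.
 */
#define DEFINE_PR_LINE(lev, name)				\
	char		__line_##name[PR_LINE_BUF_SZ];		\
	struct pr_line	name = {				\
		.buffer	= __line_##name,			\
		.size	= PR_LINE_BUF_SZ,			\
		.len	= 0,					\
		.level	= lev,					\
	}

/* Returns 1 when the two pr_line variables use different buffers. */
static int two_lines_have_distinct_buffers(void)
{
	DEFINE_PR_LINE("<3>", err_line);	/* placeholder level string */
	DEFINE_PR_LINE("<6>", info_line);	/* placeholder level string */

	return err_line.buffer != info_line.buffer;
}
```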


* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-13 14:28                                                           ` Sergey Senozhatsky
@ 2018-09-14  1:22                                                             ` Steven Rostedt
  2018-09-14  2:15                                                               ` Sergey Senozhatsky
  2018-09-14  6:57                                                             ` Sergey Senozhatsky
  1 sibling, 1 reply; 94+ messages in thread
From: Steven Rostedt @ 2018-09-14  1:22 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Sergey Senozhatsky, Alexander Potapenko,
	Dmitriy Vyukov, penguin-kernel, kbuild test robot, syzkaller,
	LKML, Linus Torvalds, Andrew Morton

On Thu, 13 Sep 2018 23:28:02 +0900
Sergey Senozhatsky <sergey.senozhatsky@gmail.com> wrote:

> Good call. It was a fast path for pr_cont("\n").
> But it made me wonder, so I did some grepping
> 

[..]

> kernel/trace/ftrace.c:          pr_cont("\n expected tramp: %lx\n", ip);

Note, looking at the history of that, I was just combining a lone "\n"
with the next string. The code before this print adds info to the line
depending on the input, so none of those prints emit a "\n". The
"expected tramp" part is added to the next line, but I'm fine if you
want to break this up. This print is very unlikely to happen while
other prints are in progress. It happens when modifying (serially)
ftrace nops to calls or back to nops.

Feel free to send a patch that breaks it up into:

	pr_cont("\n");
	pr_info(" expected tramp: %lx\n", ip);

-- Steve


* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-14  1:12                                                         ` Steven Rostedt
@ 2018-09-14  1:55                                                           ` Sergey Senozhatsky
  0 siblings, 0 replies; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-09-14  1:55 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Sergey Senozhatsky, Alexander Potapenko, Sergey Senozhatsky,
	Dmitriy Vyukov, penguin-kernel, kbuild test robot, pmladek,
	syzkaller, LKML, Linus Torvalds, Andrew Morton

On (09/13/18 21:12), Steven Rostedt wrote:
> >  
> > +#define __SEQ_BUF_INITIALIZER(buf, length) {				\
> > +	.buffer			= (buf),				\
> > +	.size			= (length),				\
> > +	.len			= 0,					\
> > +	.readpos		= 0, }
> 
> Nit, but the end bracket '}' should be on its own line, even when
> part of a macro.

No prob, will change.

I thought about putting it on its own line, but then checked
include/linux/wait.h - __WAITQUEUE_INITIALIZER and
__WAIT_QUEUE_HEAD_INITIALIZER.

> > +#define DEFINE_PR_LINE(lev, name)					\
> > +	char		__line[__PR_LINE_BUF_SZ];			\
> 
> To protect against name space collision could you use:
> 
> 	char		__line_##name[__PR_LINE_BUF_SZ];

Yes.

> The rest looks fine to me.
> 
> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Thanks.

Just to make sure: are we OK with the seq_buf dependency, so that
anyone who wants to use pr_line has to include linux/seq_buf.h?

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-14  1:22                                                             ` Steven Rostedt
@ 2018-09-14  2:15                                                               ` Sergey Senozhatsky
  0 siblings, 0 replies; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-09-14  2:15 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Sergey Senozhatsky, Petr Mladek, Sergey Senozhatsky,
	Alexander Potapenko, Dmitriy Vyukov, penguin-kernel,
	kbuild test robot, syzkaller, LKML, Linus Torvalds,
	Andrew Morton

On (09/13/18 21:22), Steven Rostedt wrote:
> > Good call. It was a fast path for pr_cont("\n").
> > But it made me wonder, so I did some grepping
> > 
> 
> [..]
> 
> > kernel/trace/ftrace.c:          pr_cont("\n expected tramp: %lx\n", ip);
> 
> Note, looking at the history of that, I was just combining a lone "\n"
> with the next string. The code before this print adds info to the line
> depending on the input, so none of those prints emit a "\n". The
> "expected tramp" part is added to the next line, but I'm fine if you
> want to break this up. This print is very unlikely to happen while
> other prints are in progress. It happens when modifying (serially)
> ftrace nops to calls or back to nops.
> 
> Feel free to send a patch that breaks it up into:
> 
> 	pr_cont("\n");
> 	pr_info(" expected tramp: %lx\n", ip);

I didn't mean to criticize anyone with my "Lovely" comment. Sorry if it
sounded harsh.

I'm fine with the way it is, but we *probably* (up to you) will touch
this code once pr_line is available. As of now, the fewer pr_cont() calls
we make, the better. This

	pr_cont("a");
	pr_cont("b");
	pr_cont("c\n");

in the worst case can be log_store-d as 3 log entries (2 preliminary
flushes). So, from this point of view, this

	pr_cont("ab");
	pr_cont("c\n");

is better, because it can be log_store-d as 2 log entries.
And with pr_line() we can log_store it in 1 log entry [but we will
use some extra stack space for that].
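
A minimal userspace sketch of that last point, assuming a stand-in
emit_record() that counts one "log entry" per call in place of
printk()/log_store(). All names here (line_buf, line_printf,
emit_record, demo_one_record) are hypothetical, and the overflow
handling (drop the fragment that does not fit) is a simplification,
not what seq_buf does. Fragments accumulate in a small on-stack buffer
and are handed over once, when the buffered text ends with a newline.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

static int nr_records;		/* how many times the "log" was written */
static char last_record[128];	/* copy of the most recent record */

/* Stand-in for printk(): every call produces one log record. */
static void emit_record(const char *text, size_t len)
{
	if (len >= sizeof(last_record))
		len = sizeof(last_record) - 1;
	memcpy(last_record, text, len);
	last_record[len] = '\0';
	nr_records++;
}

struct line_buf {
	char	buf[80];
	size_t	len;
};

/*
 * Append formatted text; emit one record once the buffered text ends
 * with a newline. Returns 0 on success, -1 if the text did not fit.
 */
static int line_printf(struct line_buf *lb, const char *fmt, ...)
{
	size_t room = sizeof(lb->buf) - lb->len;
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(lb->buf + lb->len, room, fmt, ap);
	va_end(ap);

	if (n < 0 || (size_t)n >= room) {
		lb->buf[lb->len] = '\0';	/* drop the truncated fragment */
		return -1;
	}

	lb->len += n;
	if (lb->len && lb->buf[lb->len - 1] == '\n') {
		emit_record(lb->buf, lb->len);
		lb->len = 0;
	}
	return 0;
}

/* Three fragments in, one record out. Returns the record count. */
static int demo_one_record(void)
{
	struct line_buf lb = { .len = 0 };

	nr_records = 0;
	line_printf(&lb, "a");
	line_printf(&lb, "b");
	line_printf(&lb, "c\n");
	return nr_records;
}
```

The cost is the 80 bytes of stack per line buffer, which is the extra
stack space mentioned above.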

Overall, I counted around 100 cases of printk("\n...."), and around 20+ cases
of pr_cont("\n...") and probably around 10 or 15 printk(KERN_CONT "\n....")
cases. That's what I meant when I said that converting it to pr_line()
will take us some time. Especially given that some of the lockdep
developers have really warm feelings toward printk ;)

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-13 14:28                                                           ` Sergey Senozhatsky
  2018-09-14  1:22                                                             ` Steven Rostedt
@ 2018-09-14  6:57                                                             ` Sergey Senozhatsky
  2018-09-14 10:37                                                               ` Tetsuo Handa
  1 sibling, 1 reply; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-09-14  6:57 UTC (permalink / raw)
  To: Petr Mladek, Steven Rostedt
  Cc: Alexander Potapenko, Dmitriy Vyukov, penguin-kernel,
	kbuild test robot, syzkaller, LKML, Linus Torvalds,
	Andrew Morton, Sergey Senozhatsky, Sergey Senozhatsky

On (09/13/18 23:28), Sergey Senozhatsky wrote:
> Not that I see any problems with pr_line_flush(). But can drop it, sure.
> pr_line() is a replacement for pr_cont() and as such it's not for multi-line
> buffering.

OK, attached.
Let me know if anything needs to be improved (including broken English).
Will we keep it in the printk tree, or shall I send a formal patch to Andrew?

===

From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Subject: [PATCH] lib/seq_buf: add pr_line buffering API

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 include/linux/kern_levels.h |  3 ++
 include/linux/seq_buf.h     | 60 +++++++++++++++++++++++++++++++++++++
 lib/seq_buf.c               | 57 +++++++++++++++++++++++++++++++++++
 3 files changed, 120 insertions(+)

diff --git a/include/linux/kern_levels.h b/include/linux/kern_levels.h
index d237fe854ad9..9c281ac745b3 100644
--- a/include/linux/kern_levels.h
+++ b/include/linux/kern_levels.h
@@ -20,6 +20,9 @@
  * Annotation for a "continued" line of log printout (only done after a
  * line that had no enclosing \n). Only to be used by core/arch code
  * during early bootup (a continued line is not SMP-safe otherwise).
+ *
+ * Please consider pr_line()/vpr_line() functions for SMP-safe continued
+ * line printing.
  */
 #define KERN_CONT	KERN_SOH "c"
 
diff --git a/include/linux/seq_buf.h b/include/linux/seq_buf.h
index aa5deb041c25..b33aeea14803 100644
--- a/include/linux/seq_buf.h
+++ b/include/linux/seq_buf.h
@@ -23,6 +23,62 @@ struct seq_buf {
 	loff_t			readpos;
 };
 
+#define __SEQ_BUF_INITIALIZER(buf, length)			\
+{								\
+	.buffer			= (buf),			\
+	.size			= (length),			\
+	.len			= 0,				\
+	.readpos		= 0,				\
+}
+
+#ifdef CONFIG_PRINTK
+#define __PR_LINE_BUF_SZ	80
+#else
+#define __PR_LINE_BUF_SZ	0
+#endif
+
+/**
+ * pr_line - printk() line buffer structure
+ * @sb:	underlying seq buffer, which holds the data
+ * @level:	printk() log level (KERN_ERR, etc.)
+ */
+struct pr_line {
+	struct seq_buf		sb;
+	char			*level;
+};
+
+/**
+ * DEFINE_PR_LINE - define a new pr_line variable
+ * @lev:	printk() log level
+ * @name:	variable name
+ *
+ * Defines a new pr_line varialbe, which would use an implicit
+ * stack buffer of size __PR_LINE_BUF_SZ.
+ */
+#define DEFINE_PR_LINE(lev, name)				\
+	char		__line_##name[__PR_LINE_BUF_SZ];	\
+	struct pr_line	name = {				\
+		.sb	= __SEQ_BUF_INITIALIZER(__line_##name,	\
+					__PR_LINE_BUF_SZ),	\
+		.level	= lev,					\
+	}
+
+/**
+ * DEFINE_PR_LINE_BUF - define a new pr_line variable
+ * @lev:	printk() log level
+ * @name:	variable name
+ * @buf:	external buffer
+ * @sz:	external buffer size
+ *
+ * Defines a new pr_line variable, which would use an external
+ * buffer for printk line.
+ */
+#define DEFINE_PR_LINE_BUF(lev, name, buf, sz)			\
+	struct pr_line	name = {				\
+		.sb	= __SEQ_BUF_INITIALIZER(buf, (sz)),	\
+		.level	= lev,					\
+	}
+
 static inline void seq_buf_clear(struct seq_buf *s)
 {
 	s->len = 0;
@@ -131,4 +187,8 @@ extern int
 seq_buf_bprintf(struct seq_buf *s, const char *fmt, const u32 *binary);
 #endif
 
+extern __printf(2, 0)
+int vpr_line(struct pr_line *pl, const char *fmt, va_list args);
+extern __printf(2, 3)
+int pr_line(struct pr_line *pl, const char *fmt, ...);
 #endif /* _LINUX_SEQ_BUF_H */
diff --git a/lib/seq_buf.c b/lib/seq_buf.c
index 11f2ae0f9099..fada7623f168 100644
--- a/lib/seq_buf.c
+++ b/lib/seq_buf.c
@@ -324,3 +324,60 @@ int seq_buf_to_user(struct seq_buf *s, char __user *ubuf, int cnt)
 	s->readpos += cnt;
 	return cnt;
 }
+
+/**
+ * vpr_line - Append data to the printk() line buffer
+ * @pl: the pr_line descriptor
+ * @fmt: printf format string
+ * @args: va_list of arguments from a printf() type function
+ *
+ * Writes a vnprintf() format into the printk() pr_line buffer.
+ * Terminating new-line symbol flushes (prints) the buffer.
+ *
+ * Unlike pr_cont() and printk(KERN_CONT), this function is SMP-safe
+ * and shall be used for continued line printing.
+ *
+ * Returns zero on success, -1 on overflow.
+ */
+int vpr_line(struct pr_line *pl, const char *fmt, va_list args)
+{
+	struct seq_buf *s = &pl->sb;
+	int ret, len;
+
+	ret = seq_buf_vprintf(s, fmt, args);
+
+	len = seq_buf_used(s);
+	if (len && s->buffer[len - 1] == '\n') {
+		printk("%s%.*s", pl->level ? : KERN_DEFAULT, len, s->buffer);
+		seq_buf_clear(s);
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL(vpr_line);
+
+/**
+ * pr_line - Append data to the printk() line buffer
+ * @pl: the pr_line descriptor
+ * @fmt: printf format string
+ *
+ * Writes a printf() format into the printk() pr_line buffer.
+ * Terminating new-line symbol flushes (prints) the buffer.
+ *
+ * Unlike pr_cont() and printk(KERN_CONT), this function is SMP-safe
+ * and shall be used for continued line printing.
+ *
+ * Returns zero on success, -1 on overflow.
+ */
+int pr_line(struct pr_line *pl, const char *fmt, ...)
+{
+	va_list ap;
+	int ret;
+
+	va_start(ap, fmt);
+	ret = vpr_line(pl, fmt, ap);
+	va_end(ap);
+
+	return ret;
+}
+EXPORT_SYMBOL(pr_line);
-- 
2.19.0



* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-14  6:57                                                             ` Sergey Senozhatsky
@ 2018-09-14 10:37                                                               ` Tetsuo Handa
  2018-09-14 11:50                                                                 ` Sergey Senozhatsky
  0 siblings, 1 reply; 94+ messages in thread
From: Tetsuo Handa @ 2018-09-14 10:37 UTC (permalink / raw)
  To: Sergey Senozhatsky, Petr Mladek, Steven Rostedt
  Cc: Alexander Potapenko, Dmitriy Vyukov, kbuild test robot,
	syzkaller, LKML, Linus Torvalds, Andrew Morton,
	Sergey Senozhatsky

On 2018/09/14 15:57, Sergey Senozhatsky wrote:
> On (09/13/18 23:28), Sergey Senozhatsky wrote:
>> Not that I see any problems with pr_line_flush(). But can drop it, sure.
>> pr_line() is a replacement for pr_cont() and as such it's not for multi-line
>> buffering.
> 
> OK, attached.
> Let me know if anything needs to be improved (including broken English).
> Will we keep it in the printk tree, or shall I send a formal patch to Andrew?



> @@ -20,6 +20,9 @@
>   * Annotation for a "continued" line of log printout (only done after a
>   * line that had no enclosing \n). Only to be used by core/arch code
>   * during early bootup (a continued line is not SMP-safe otherwise).
> + *
> + * Please consider pr_line()/vpr_line() functions for SMP-safe continued
> + * line printing.

I think the advantage is not limited to SMP-safeness. Reducing the frequency of
calling printk() will reduce overhead. Also, latency for netconsole will be
reduced by sending a whole line in one printk().



> +/**
> + * DEFINE_PR_LINE - define a new pr_line variable
> + * @lev:	printk() log level
> + * @name:	variable name
> + *
> + * Defines a new pr_line varialbe, which would use an implicit

s/varialbe/variable/ .

> + * stack buffer of size __PR_LINE_BUF_SZ.
> + */
> +#define DEFINE_PR_LINE(lev, name)				\
> +	char		__line_##name[__PR_LINE_BUF_SZ];	\
> +	struct pr_line	name = {				\
> +		.sb	= __SEQ_BUF_INITIALIZER(__line_##name,	\
> +					__PR_LINE_BUF_SZ),	\
> +		.level	= lev,					\
> +	}

Do we want a note that

  static DEFINE_PR_LINE(lev, name);

won't make the "name" variable "static"?
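
A minimal userspace sketch of why that happens, assuming a simplified
stand-in for struct pr_line and a placeholder level string. The helper
bump() is hypothetical. The macro expands to two declarations, and the
"static" keyword binds only to the first one (the array), so the struct
keeps automatic storage and is rebuilt on every call.

```c
#include <stddef.h>

#define PR_LINE_BUF_SZ	80	/* stand-in for __PR_LINE_BUF_SZ */

/* Simplified stand-in for the proposed struct pr_line. */
struct pr_line {
	char		*buffer;
	size_t		size;
	size_t		len;
	const char	*level;
};

#define DEFINE_PR_LINE(lev, name)				\
	char		__line_##name[PR_LINE_BUF_SZ];		\
	struct pr_line	name = {				\
		.buffer	= __line_##name,			\
		.size	= PR_LINE_BUF_SZ,			\
		.len	= 0,					\
		.level	= lev,					\
	}

/*
 * "static DEFINE_PR_LINE(...)" expands to:
 *
 *	static char __line_pl[PR_LINE_BUF_SZ];	<- static storage
 *	struct pr_line pl = { ... };		<- still automatic
 */
static void bump(size_t *len_seen, int *byte_seen)
{
	static DEFINE_PR_LINE("<6>", pl);	/* placeholder level string */

	*len_seen = pl.len;		/* pl is rebuilt on every call */
	*byte_seen = __line_pl[0];	/* the array keeps its contents */

	pl.len++;
	__line_pl[0]++;
}
```

On the second call *len_seen is 0 again while *byte_seen is 1. At file
scope the same expansion would give the array internal linkage and
leave the pr_line variable with external linkage.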



> +/**
> + * DEFINE_PR_LINE_BUF - define a new pr_line variable
> + * @lev:	printk() log level
> + * @name:	variable name
> + * @buf:	external buffer
> + * @sz:	external buffer size
> + *
> + * Defines a new pr_line variable, which would use an external
> + * buffer for printk line.
> + */
> +#define DEFINE_PR_LINE_BUF(lev, name, buf, sz)			\
> +	struct pr_line	name = {				\
> +		.sb	= __SEQ_BUF_INITIALIZER(buf, (sz)),	\
> +		.level	= lev,					\
> +	}
> +

I would use this one for the OOM killer. 80 bytes is too short.

  static char oom_print_buf[1024];
  DEFINE_PR_LINE_BUF(level, oom_print_buf);



> @@ -131,4 +187,8 @@ extern int
>  seq_buf_bprintf(struct seq_buf *s, const char *fmt, const u32 *binary);
>  #endif
>  
> +extern __printf(2, 0)
> +int vpr_line(struct pr_line *pl, const char *fmt, va_list args);
> +extern __printf(2, 3)
> +int pr_line(struct pr_line *pl, const char *fmt, ...);

Do we want to mark "asmlinkage" like printk() ?

> @@ -324,3 +324,60 @@ int seq_buf_to_user(struct seq_buf *s, char __user *ubuf, int cnt)
>  	s->readpos += cnt;
>  	return cnt;
>  }
> +
> +/**
> + * vpr_line - Append data to the printk() line buffer
> + * @pl: the pr_line descriptor

s/descriptor/structure/ ?

> + * @fmt: printf format string
> + * @args: va_list of arguments from a printf() type function
> + *
> + * Writes a vnprintf() format into the printk() pr_line buffer.

s/vnprintf/vprintf/ ?



* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-14 10:37                                                               ` Tetsuo Handa
@ 2018-09-14 11:50                                                                 ` Sergey Senozhatsky
  2018-09-14 12:03                                                                   ` Tetsuo Handa
  0 siblings, 1 reply; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-09-14 11:50 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Petr Mladek, Steven Rostedt,
	Alexander Potapenko, Dmitriy Vyukov, kbuild test robot,
	syzkaller, LKML, Linus Torvalds, Andrew Morton,
	Sergey Senozhatsky

On (09/14/18 19:37), Tetsuo Handa wrote:
> > @@ -20,6 +20,9 @@
> >   * Annotation for a "continued" line of log printout (only done after a
> >   * line that had no enclosing \n). Only to be used by core/arch code
> >   * during early bootup (a continued line is not SMP-safe otherwise).
> > + *
> > + * Please consider pr_line()/vpr_line() functions for SMP-safe continued
> > + * line printing.
> 
> I think the advantage is not limited to SMP-safeness. Reducing the frequency of
> calling printk() will reduce overhead. Also, latency for netconsole will be
> reduced by sending a whole line in one printk().

Hmm. These are very good points, indeed. But do we want to list all
advantages here? I just wanted to mention SMP-unsafe pr_cont/printk(KERN_CONT),
because I also mention pr_line in kern_levels.h.

> > + * Defines a new pr_line varialbe, which would use an implicit
> 
> s/varialbe/variable/ .

Thanks.

> > +#define DEFINE_PR_LINE(lev, name)				\
> > +	char		__line_##name[__PR_LINE_BUF_SZ];	\
> > +	struct pr_line	name = {				\
> > +		.sb	= __SEQ_BUF_INITIALIZER(__line_##name,	\
> > +					__PR_LINE_BUF_SZ),	\
> > +		.level	= lev,					\
> > +	}
> 
> Want a note that
> 
>   static DEFINE_PR_LINE(lev, name);
> 
> won't make "name" variable "static" ?

Interesting point. Any hint what the comment should look like?
Do we want to have static pr_line buffers?

> > +#define DEFINE_PR_LINE_BUF(lev, name, buf, sz)			\
> > +	struct pr_line	name = {				\
> > +		.sb	= __SEQ_BUF_INITIALIZER(buf, (sz)),	\
> > +		.level	= lev,					\
> > +	}
> > +
> 
> I would use this one for the OOM killer. 80 bytes is too short.

80 bytes is quite short for OOM, agreed.

>   static char oom_print_buf[1024];
>   DEFINE_PR_LINE_BUF(level, oom_print_buf);

Do I get it right that you suggest to drop the "size" param?
Do OOM people agree on 1024 bytes stack usage?


> > @@ -131,4 +187,8 @@ extern int
> >  seq_buf_bprintf(struct seq_buf *s, const char *fmt, const u32 *binary);
> >  #endif
> >  
> > +extern __printf(2, 0)
> > +int vpr_line(struct pr_line *pl, const char *fmt, va_list args);
> > +extern __printf(2, 3)
> > +int pr_line(struct pr_line *pl, const char *fmt, ...);
> 
> Do we want to mark "asmlinkage" like printk() ?

Dunno, do we? Does code written in assembly call pr_cont that often?
We are not turning pr_line() into syscall anyway.

> > @@ -324,3 +324,60 @@ int seq_buf_to_user(struct seq_buf *s, char __user *ubuf, int cnt)
> >  	s->readpos += cnt;
> >  	return cnt;
> >  }
> > +
> > +/**
> > + * vpr_line - Append data to the printk() line buffer
> > + * @pl: the pr_line descriptor
> 
> s/descriptor/structure/ ?

Yeah, I used the term "descriptor", just because it's used in seq_buf.c.
So, it's sort of common in seq_buf.
E.g.
   seq_buf_vprintf(), seq_buf_print_seq(), seq_buf_can_fit() and so on.

> > + * @fmt: printf format string
> > + * @args: va_list of arguments from a printf() type function
> > + *
> > + * Writes a vnprintf() format into the printk() pr_line buffer.
> 
> s/vnprintf/vprintf/ ?

Indeed.
We also need to fix a typo in seq_buf_vprintf() comment then.

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-14 11:50                                                                 ` Sergey Senozhatsky
@ 2018-09-14 12:03                                                                   ` Tetsuo Handa
  2018-09-14 12:22                                                                     ` Sergey Senozhatsky
  0 siblings, 1 reply; 94+ messages in thread
From: Tetsuo Handa @ 2018-09-14 12:03 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Petr Mladek, Steven Rostedt,
	Alexander Potapenko, Dmitriy Vyukov, kbuild test robot,
	syzkaller, LKML, Linus Torvalds, Andrew Morton

On 2018/09/14 20:50, Sergey Senozhatsky wrote:
>>> +#define DEFINE_PR_LINE_BUF(lev, name, buf, sz)			\
>>> +	struct pr_line	name = {				\
>>> +		.sb	= __SEQ_BUF_INITIALIZER(buf, (sz)),	\
>>> +		.level	= lev,					\
>>> +	}
>>> +
>>
>> I would use this one for the OOM killer. 80 bytes is too short.
> 
> 80 bytes is quite short for OOM, agreed.
> 
>>   static char oom_print_buf[1024];
>>   DEFINE_PR_LINE_BUF(level, oom_print_buf);
> 
> Do I get it right that you suggest to drop the "size" param?

No. I just forgot to add params. ;-)

> Do OOM people agree on 1024 bytes stack usage?

I won't allocate oom_print_buf on the stack. Since its usage is serialized
by the oom_lock mutex, we don't need to allocate it from the stack. And since
a memory allocation request might happen when the stack is already tight, we
should not try to allocate much from the stack.
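
A minimal userspace sketch of the four-parameter form being discussed,
assuming a simplified stand-in for struct pr_line and a placeholder
level string. The variable name "pl" and the helper
oom_line_capacity() are hypothetical. Only the small descriptor lives
on the stack; the 1024-byte buffer has static storage, so callers are
assumed to hold whatever lock serializes its users.

```c
#include <stddef.h>

/* Simplified stand-in for the proposed struct pr_line. */
struct pr_line {
	char		*buffer;
	size_t		size;
	size_t		len;
	const char	*level;
};

#define DEFINE_PR_LINE_BUF(lev, name, buf, sz)			\
	struct pr_line	name = {				\
		.buffer	= (buf),				\
		.size	= (sz),					\
		.len	= 0,					\
		.level	= lev,					\
	}

/* Static storage: shared by every caller, so it needs serialization. */
static char oom_print_buf[1024];

/* Returns the line capacity if the descriptor points at the buffer. */
static size_t oom_line_capacity(void)
{
	/* The caller is assumed to hold the serializing lock here. */
	DEFINE_PR_LINE_BUF("<4>", pl, oom_print_buf, sizeof(oom_print_buf));

	return pl.buffer == oom_print_buf ? pl.size : 0;
}
```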


* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-14 12:03                                                                   ` Tetsuo Handa
@ 2018-09-14 12:22                                                                     ` Sergey Senozhatsky
  2018-09-19 11:02                                                                       ` Tetsuo Handa
  0 siblings, 1 reply; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-09-14 12:22 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Petr Mladek,
	Steven Rostedt, Alexander Potapenko, Dmitriy Vyukov,
	kbuild test robot, syzkaller, LKML, Linus Torvalds,
	Andrew Morton

On (09/14/18 21:03), Tetsuo Handa wrote:
> > 80 bytes is quite short for OOM, agreed.
> > 
> >>   static char oom_print_buf[1024];
> >>   DEFINE_PR_LINE_BUF(level, oom_print_buf);
> > 
> > Do I get it right that you suggest to drop the "size" param?
> 
> No. I just forgot to add params. ;-)
> 
> > Do OOM people agree on 1024 bytes stack usage?
> 
> I won't allocate oom_print_buf on the stack. Since its usage is serialized
> by the oom_lock mutex, we don't need to allocate it from the stack. And since
> a memory allocation request might happen when the stack is already tight, we
> should not try to allocate much from the stack.

... by "OOM people" I meant "MM people".
"MM people" is a subset of "OOM people".

OK, so I didn't notice the "static" part of the `oom_print_buf'.
I need some rest, I guess.

The "SMP-safe" comment becomes a bit tricky when pr_line is used with a
static buffer. Either we need to require synchronization - umm... and
document it - or to provide some means of synchronization in pr_line().
Let's think what pr_line API should do about it.

Any thoughts?

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-14 12:22                                                                     ` Sergey Senozhatsky
@ 2018-09-19 11:02                                                                       ` Tetsuo Handa
  2018-09-24  8:11                                                                         ` Tetsuo Handa
  2018-09-28  8:56                                                                         ` Sergey Senozhatsky
  0 siblings, 2 replies; 94+ messages in thread
From: Tetsuo Handa @ 2018-09-19 11:02 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Petr Mladek, Steven Rostedt,
	Alexander Potapenko, Dmitriy Vyukov, kbuild test robot,
	syzkaller, LKML, Linus Torvalds, Andrew Morton

On 2018/09/14 21:22, Sergey Senozhatsky wrote:
> The "SMP-safe" comment becomes a bit tricky when pr_line is used with a
> static buffer. Either we need to require synchronization - umm... and
> document it - or to provide some means of synchronization in pr_line().
> Let's think what pr_line API should do about it.
> 
> Any thoughts?
> 

I'm inclined to propose a simple approach, shown below, which is similar to
just having several "struct cont" instances for concurrent printk() users.
Linus has commented that implicit context is bad; the approach below uses
explicit context.
Once almost all users are converted to the approach below, we might be able
to get rid of KERN_CONT support.



From d5e0e422142ced2b7097040e96ba7c5528a460db Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Wed, 19 Sep 2018 14:39:07 +0900
Subject: [PATCH v2] printk: Add best-effort printk() buffering.

Sometimes we want to printk() a line without being disturbed by concurrent
printk() from interrupts and/or other threads. For example, when dumps from
multiple threads are mixed in the printk() output, the output is hard to
interpret.

Assuming that we are heading in a direction where a context identifier is
added to each line of printk() output (so that we can group multiple lines
into one block when parsing), this patch introduces functions that use
fixed-size statically allocated buffers to line-buffer printk() output on a
best-effort basis (i.e. up to LOG_LINE_MAX bytes, up to 16 concurrent
printk() users).

If there happen to be more than 16 concurrent printk() users, plain
printk() is used for the users who failed to get a buffer. Of course, if
there were more than 16 concurrent printk() users, the printk() output
would flood the console and the system would already be unusable (e.g. the
RCU lockup or hung task watchdog would fire in such a situation). Thus,
I think that 16 buffers should be sufficient.

Five functions (get_printk_buffer(), buffered_vprintk(), buffered_printk(),
flush_printk_buffer() and put_printk_buffer()) are provided for printk()
buffering.

  get_printk_buffer() tries to assign a "struct printk_buffer".

  buffered_vprintk()/buffered_printk() try to use line-buffered printk()
  by holding an incomplete line in "struct printk_buffer".

  flush_printk_buffer() flushes the "struct printk_buffer".

  put_printk_buffer() flushes and releases the "struct printk_buffer".

put_printk_buffer() must match the corresponding get_printk_buffer(), just
as rcu_read_unlock() must match the corresponding rcu_read_lock().

These functions are safe to call from any context, because they merely
wrap printk()/vprintk() calls in order to minimize the possibility of
using "struct cont" by managing 16 buffers outside of the logbuf_lock
spinlock. Thus, any caller can be updated to use these functions.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 include/linux/printk.h |  28 +++++++++
 kernel/printk/printk.c | 160 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 188 insertions(+)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index cf3eccf..889491b 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -157,6 +157,7 @@ static inline void printk_nmi_direct_enter(void) { }
 static inline void printk_nmi_direct_exit(void) { }
 #endif /* PRINTK_NMI */
 
+struct printk_buffer;
 #ifdef CONFIG_PRINTK
 asmlinkage __printf(5, 0)
 int vprintk_emit(int facility, int level,
@@ -173,6 +174,13 @@ int printk_emit(int facility, int level,
 
 asmlinkage __printf(1, 2) __cold
 int printk(const char *fmt, ...);
+struct printk_buffer *get_printk_buffer(void);
+void flush_printk_buffer(struct printk_buffer *ptr);
+__printf(2, 3)
+int buffered_printk(struct printk_buffer *ptr, const char *fmt, ...);
+asmlinkage __printf(2, 0)
+int buffered_vprintk(struct printk_buffer *ptr, const char *fmt, va_list args);
+void put_printk_buffer(struct printk_buffer *ptr);
 
 /*
  * Special printk facility for scheduler/timekeeping use only, _DO_NOT_USE_ !
@@ -220,6 +228,26 @@ int printk(const char *s, ...)
 {
 	return 0;
 }
+static inline struct printk_buffer *get_printk_buffer(void)
+{
+	return NULL;
+}
+static inline __printf(2, 3)
+int buffered_printk(struct printk_buffer *ptr, const char *fmt, ...)
+{
+	return 0;
+}
+static inline __printf(2, 0)
+int buffered_vprintk(struct printk_buffer *ptr, const char *fmt, va_list args)
+{
+	return 0;
+}
+static inline void flush_printk_buffer(struct printk_buffer *ptr)
+{
+}
+static inline void put_printk_buffer(struct printk_buffer *ptr)
+{
+}
 static inline __printf(1, 2) __cold
 int printk_deferred(const char *s, ...)
 {
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 9bf5404..c9e9f5d 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1949,6 +1949,166 @@ asmlinkage int printk_emit(int facility, int level,
 }
 EXPORT_SYMBOL(printk_emit);
 
+struct printk_buffer {
+	unsigned short int used; /* Valid bytes in buf[]. */
+	char buf[LOG_LINE_MAX];
+	bool in_use;
+} __aligned(1024);
+#define MAX_PRINTK_BUFFERS 16
+static struct printk_buffer printk_buffers[MAX_PRINTK_BUFFERS];
+
+/**
+ * get_printk_buffer - Try to get printk_buffer.
+ *
+ * Returns pointer to "struct printk_buffer" on success, NULL otherwise.
+ *
+ * If this function returned "struct printk_buffer", the caller is responsible
+ * for passing it to put_printk_buffer() so that "struct printk_buffer" can be
+ * reused in the future.
+ *
+ * Even if this function returned NULL, the caller does not need to check for
+ * NULL, for passing NULL to buffered_printk() simply acts like normal printk()
+ * and passing NULL to flush_printk_buffer()/put_printk_buffer() is a no-op.
+ */
+struct printk_buffer *get_printk_buffer(void)
+{
+	unsigned short int i;
+
+	for (i = 0; i < MAX_PRINTK_BUFFERS; i++) {
+		struct printk_buffer *ptr = &printk_buffers[i];
+
+		if (ptr->in_use || cmpxchg(&ptr->in_use, false, true))
+			continue;
+		ptr->used = 0;
+		return ptr;
+	}
+	return NULL;
+}
+EXPORT_SYMBOL(get_printk_buffer);
+
+/**
+ * buffered_vprintk - Try to vprintk() in line buffered mode.
+ *
+ * @ptr:  Pointer to "struct printk_buffer". It can be NULL.
+ * @fmt:  printk() format string.
+ * @args: va_list structure.
+ *
+ * Returns the return value of vprintk().
+ *
+ * Try to store to @ptr first. If it fails, flush @ptr and then try to store to
+ * @ptr again. If it still fails, use unbuffered printing.
+ */
+int buffered_vprintk(struct printk_buffer *ptr, const char *fmt, va_list args)
+{
+	va_list tmp_args;
+	unsigned short int i;
+	int r;
+
+	if (!ptr)
+		goto unbuffered;
+	for (i = 0; i < 2; i++) {
+		unsigned int pos = ptr->used;
+		char *text = ptr->buf + pos;
+
+		va_copy(tmp_args, args);
+		r = vsnprintf(text, sizeof(ptr->buf) - pos, fmt, tmp_args);
+		va_end(tmp_args);
+		if (r + pos < sizeof(ptr->buf)) {
+			/*
+			 * Eliminate KERN_CONT at this point because we can
+			 * concatenate incomplete lines inside printk_buffer.
+			 */
+			if (r >= 2 && printk_get_level(text) == 'c') {
+				memmove(text, text + 2, r - 2);
+				ptr->used += r - 2;
+			} else {
+				ptr->used += r;
+			}
+			/* Flush already completed lines if any. */
+			while (1) {
+				char *cp = memchr(ptr->buf, '\n', ptr->used);
+
+				if (!cp)
+					break;
+				*cp = '\0';
+				printk("%s\n", ptr->buf);
+				i = cp - ptr->buf + 1;
+				ptr->used -= i;
+				memmove(ptr->buf, ptr->buf + i, ptr->used);
+			}
+			return r;
+		}
+		if (i)
+			break;
+		flush_printk_buffer(ptr);
+	}
+ unbuffered:
+	return vprintk(fmt, args);
+}
+
+/**
+ * buffered_printk - Try to printk() in line buffered mode.
+ *
+ * @ptr:  Pointer to "struct printk_buffer". It can be NULL.
+ * @fmt:  printk() format string, followed by arguments.
+ *
+ * Returns the return value of printk().
+ *
+ * Try to store to @ptr first. If it fails, flush @ptr and then try to store to
+ * @ptr again. If it still fails, use unbuffered printing.
+ */
+int buffered_printk(struct printk_buffer *ptr, const char *fmt, ...)
+{
+	va_list args;
+	int r;
+
+	va_start(args, fmt);
+	r = buffered_vprintk(ptr, fmt, args);
+	va_end(args);
+	return r;
+}
+EXPORT_SYMBOL(buffered_printk);
+
+/**
+ * flush_printk_buffer - Flush incomplete line in printk_buffer.
+ *
+ * @ptr: Pointer to "struct printk_buffer". It can be NULL.
+ *
+ * Returns nothing.
+ *
+ * Flush if @ptr contains partial data. But usually there is no need to call
+ * this function because @ptr is flushed by put_printk_buffer().
+ */
+void flush_printk_buffer(struct printk_buffer *ptr)
+{
+	if (!ptr || !ptr->used)
+		return;
+	/* buffered_vprintk() keeps 0 <= ptr->used < sizeof(ptr->buf) true. */
+	ptr->buf[ptr->used] = '\0';
+	printk("%s", ptr->buf);
+	ptr->used = 0;
+}
+EXPORT_SYMBOL(flush_printk_buffer);
+
+/**
+ * put_printk_buffer - Release printk_buffer.
+ *
+ * @ptr: Pointer to "struct printk_buffer". It can be NULL.
+ *
+ * Returns nothing.
+ *
+ * Flush and release @ptr.
+ */
+void put_printk_buffer(struct printk_buffer *ptr)
+{
+	if (!ptr)
+		return;
+	if (ptr->used)
+		flush_printk_buffer(ptr);
+	xchg(&ptr->in_use, false);
+}
+EXPORT_SYMBOL(put_printk_buffer);
+
 int vprintk_default(const char *fmt, va_list args)
 {
 	int r;
-- 
1.8.3.1



An example user of these functions which would mitigate output like
https://syzkaller.appspot.com/text?tag=CrashReport&x=13368fda400000 is shown below.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 89d2a2a..44bbb96 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4689,10 +4689,10 @@ unsigned long nr_free_pagecache_pages(void)
 	return nr_free_zone_pages(gfp_zone(GFP_HIGHUSER_MOVABLE));
 }
 
-static inline void show_node(struct zone *zone)
+static inline void show_node(struct printk_buffer *buf, struct zone *zone)
 {
 	if (IS_ENABLED(CONFIG_NUMA))
-		printk("Node %d ", zone_to_nid(zone));
+		buffered_printk(buf, "Node %d ", zone_to_nid(zone));
 }
 
 long si_mem_available(void)
@@ -4814,7 +4814,7 @@ static bool show_mem_node_skip(unsigned int flags, int nid, nodemask_t *nodemask
 
 #define K(x) ((x) << (PAGE_SHIFT-10))
 
-static void show_migration_types(unsigned char type)
+static void show_migration_types(struct printk_buffer *buf, unsigned char type)
 {
 	static const char types[MIGRATE_TYPES] = {
 		[MIGRATE_UNMOVABLE]	= 'U',
@@ -4838,7 +4838,7 @@ static void show_migration_types(unsigned char type)
 	}
 
 	*p = '\0';
-	printk(KERN_CONT "(%s) ", tmp);
+	buffered_printk(buf, "(%s) ", tmp);
 }
 
 /*
@@ -4856,6 +4856,7 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 	int cpu;
 	struct zone *zone;
 	pg_data_t *pgdat;
+	struct printk_buffer *buf = get_printk_buffer();
 
 	for_each_populated_zone(zone) {
 		if (show_mem_node_skip(filter, zone_to_nid(zone), nodemask))
@@ -4950,8 +4951,8 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 		for_each_online_cpu(cpu)
 			free_pcp += per_cpu_ptr(zone->pageset, cpu)->pcp.count;
 
-		show_node(zone);
-		printk(KERN_CONT
+		show_node(buf, zone);
+		buffered_printk(buf,
 			"%s"
 			" free:%lukB"
 			" min:%lukB"
@@ -4993,10 +4994,10 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 			K(free_pcp),
 			K(this_cpu_read(zone->pageset->pcp.count)),
 			K(zone_page_state(zone, NR_FREE_CMA_PAGES)));
-		printk("lowmem_reserve[]:");
+		buffered_printk(buf, "lowmem_reserve[]:");
 		for (i = 0; i < MAX_NR_ZONES; i++)
-			printk(KERN_CONT " %ld", zone->lowmem_reserve[i]);
-		printk(KERN_CONT "\n");
+			buffered_printk(buf, " %ld", zone->lowmem_reserve[i]);
+		buffered_printk(buf, "\n");
 	}
 
 	for_each_populated_zone(zone) {
@@ -5006,8 +5007,8 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 
 		if (show_mem_node_skip(filter, zone_to_nid(zone), nodemask))
 			continue;
-		show_node(zone);
-		printk(KERN_CONT "%s: ", zone->name);
+		show_node(buf, zone);
+		buffered_printk(buf, "%s: ", zone->name);
 
 		spin_lock_irqsave(&zone->lock, flags);
 		for (order = 0; order < MAX_ORDER; order++) {
@@ -5025,13 +5026,14 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 		}
 		spin_unlock_irqrestore(&zone->lock, flags);
 		for (order = 0; order < MAX_ORDER; order++) {
-			printk(KERN_CONT "%lu*%lukB ",
+			buffered_printk(buf, "%lu*%lukB ",
 			       nr[order], K(1UL) << order);
 			if (nr[order])
-				show_migration_types(types[order]);
+				show_migration_types(buf, types[order]);
 		}
-		printk(KERN_CONT "= %lukB\n", K(total));
+		buffered_printk(buf, "= %lukB\n", K(total));
 	}
+	put_printk_buffer(buf);
 
 	hugetlb_show_meminfo();
 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-19 11:02                                                                       ` Tetsuo Handa
@ 2018-09-24  8:11                                                                         ` Tetsuo Handa
  2018-09-27 16:10                                                                           ` Tetsuo Handa
  2018-09-28  9:09                                                                           ` Sergey Senozhatsky
  2018-09-28  8:56                                                                         ` Sergey Senozhatsky
  1 sibling, 2 replies; 94+ messages in thread
From: Tetsuo Handa @ 2018-09-24  8:11 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Petr Mladek, Steven Rostedt,
	Alexander Potapenko, Dmitriy Vyukov, kbuild test robot,
	syzkaller, LKML, Linus Torvalds, Andrew Morton

On 2018/09/19 20:02, Tetsuo Handa wrote:
> On 2018/09/14 21:22, Sergey Senozhatsky wrote:
>> The "SMP-safe" comment becomes a bit tricky when pr_line is used with a
>> static buffer. Either we need to require synchronization - umm... and
>> document it - or to provide some means of synchronization in pr_line().
>> Let's think what pr_line API should do about it.
>>
>> Any thoughts?
>>
> 
> I'm inclined to propose a simple one shown below, similar to just having
> several "struct cont" for concurrent printk() users.
> What Linus has commented is that implicit context is bad, and below one
> uses explicit context.
> After almost all users are converted to use below one, we might be able
> to get rid of KERN_CONT support.

The reason for using statically preallocated global buffers is that I think
it is inconvenient for KERN_CONT users to calculate the necessary bytes just
to avoid message truncation. The pr_line might be passed deep into the
callchain, and adjusting the buffer size whenever the content's possible max
length changes is as painful as changing printk() to accept only one
"const char *" argument. Even if we guarantee that any context can allocate
a buffer from the kernel stack, we cannot guarantee that many concurrent
printk() calls won't trigger a lockup. Thus, I think that trying to allocate
from a finite set of static buffers, with a fallback to unbuffered printk()
upon failure, is sufficient.
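
To make the "finite static buffers with a fallback" idea concrete, here is a
minimal userspace sketch. It is not the kernel patch: it uses C11 atomics in
place of cmpxchg(), and the names line_buffer, MAX_LINE_BUFFERS,
get_line_buffer() and put_line_buffer() are hypothetical. A NULL return is the
signal for the caller to fall back to unbuffered output.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pool size; the patch under discussion uses 16. */
#define MAX_LINE_BUFFERS 4

struct line_buffer {
	atomic_bool in_use;	/* claimed by at most one caller at a time */
	size_t used;		/* bytes currently stored in buf */
	char buf[256];
};

/* Static storage, so every slot starts out zeroed, i.e. free. */
static struct line_buffer line_buffers[MAX_LINE_BUFFERS];

/*
 * Claim a free slot. Returns NULL when all slots are busy, in which case
 * the caller is expected to print unbuffered instead of waiting.
 */
static struct line_buffer *get_line_buffer(void)
{
	for (size_t i = 0; i < MAX_LINE_BUFFERS; i++) {
		struct line_buffer *ptr = &line_buffers[i];
		bool expected = false;

		/* Only one caller can flip in_use from false to true. */
		if (atomic_compare_exchange_strong(&ptr->in_use, &expected, true)) {
			ptr->used = 0;
			return ptr;
		}
	}
	return NULL;
}

/* Release a slot; passing NULL is a no-op, mirroring the fallback path. */
static void put_line_buffer(struct line_buffer *ptr)
{
	if (!ptr)
		return;
	assert(atomic_load(&ptr->in_use));	/* releasing a free slot is a bug */
	atomic_store(&ptr->in_use, false);
}
```

With this shape the caller never blocks and never sizes a buffer itself; the
cost is that, once the pool is exhausted, output degrades to ordinary
unbuffered printing.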



By the way, the kbuild test robot told me that I forgot to drop the asmlinkage keyword.

 include/linux/printk.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index 889491b..3347442 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -178,7 +178,7 @@ asmlinkage __printf(1, 2) __cold
 void flush_printk_buffer(struct printk_buffer *ptr);
 __printf(2, 3)
 int buffered_printk(struct printk_buffer *ptr, const char *fmt, ...);
-asmlinkage __printf(2, 0)
+__printf(2, 0)
 int buffered_vprintk(struct printk_buffer *ptr, const char *fmt, va_list args);
 void put_printk_buffer(struct printk_buffer *ptr);
 




* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-24  8:11                                                                         ` Tetsuo Handa
@ 2018-09-27 16:10                                                                           ` Tetsuo Handa
  2018-09-28  9:02                                                                             ` Sergey Senozhatsky
  2018-09-28  9:09                                                                           ` Sergey Senozhatsky
  1 sibling, 1 reply; 94+ messages in thread
From: Tetsuo Handa @ 2018-09-27 16:10 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Petr Mladek, Steven Rostedt,
	Alexander Potapenko, Dmitriy Vyukov, kbuild test robot,
	syzkaller, LKML, Linus Torvalds, Andrew Morton

On 2018/09/24 17:11, Tetsuo Handa wrote:
> On 2018/09/19 20:02, Tetsuo Handa wrote:
>> On 2018/09/14 21:22, Sergey Senozhatsky wrote:
>>> The "SMP-safe" comment becomes a bit tricky when pr_line is used with a
>>> static buffer. Either we need to require synchronization - umm... and
>>> document it - or to provide some means of synchronization in pr_line().
>>> Let's think what pr_line API should do about it.
>>>
>>> Any thoughts?
>>>
>>
>> I'm inclined to propose a simple one shown below, similar to just having
>> several "struct cont" for concurrent printk() users.
>> What Linus has commented is that implicit context is bad, and below one
>> uses explicit context.
>> After almost all users are converted to use below one, we might be able
>> to get rid of KERN_CONT support.
> 
> The reason of using statically preallocated global buffers is that I think
> that it is inconvenient for KERN_CONT users to calculate necessary bytes
> only for avoiding message truncation. The pr_line might be passed to deep
> into the callchain and adjusting buffer size whenever the content's possible
> max length changes is as much painful as changing printk() to accept only
> one "const char *" argument. Even if we guarantee that any context can
> allocate buffer from kernel stack, we cannot guarantee that many concurrent
> printk() won't trigger lockup. Thus, I think that trying to allocate from
> finite static buffers with a fallback to unbuffered printk() upon failure
> is sufficient.
> 

Hmm, what problem is blocking this topic?

I think that the SMP-safe comment is unnecessary for the line buffered printk()
API. What we want to do is to avoid mixing incomplete lines from concurrent
printk() callers (and then prefix the caller's information in order to help
group multiple lines), rather than to avoid stalls / crashes / lost messages
caused by concurrent printk() callers.

We could avoid crashes if there is no bug in printk() related code. But we can
never avoid stalls / lost messages as long as we flood printk(). Even if the
line buffered printk() API were SMP-safe, printk() might have to discard the
output. We need to try to avoid too much printk(), as if there were no concept
of SMP-safeness, regardless of whether pr_line() is used with a static buffer.

Therefore, I think that "Either we need to require synchronization - umm... and
document it - or to provide some means of synchronization in pr_line()." is a
pointless worry. It is only the existing printk() API which needs
synchronization. I think that the line buffered printk() API does not need to
talk about synchronization. Just saying "don't share
DEFINE_PR_LINE()/DEFINE_PR_LINE_BUF() variables" will be sufficient.



* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-19 11:02                                                                       ` Tetsuo Handa
  2018-09-24  8:11                                                                         ` Tetsuo Handa
@ 2018-09-28  8:56                                                                         ` Sergey Senozhatsky
  2018-09-28 11:21                                                                           ` Tetsuo Handa
  1 sibling, 1 reply; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-09-28  8:56 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Petr Mladek,
	Steven Rostedt, Alexander Potapenko, Dmitriy Vyukov,
	kbuild test robot, syzkaller, LKML, Linus Torvalds,
	Andrew Morton

On (09/19/18 20:02), Tetsuo Handa wrote:
> I'm inclined to propose a simple one shown below, similar to just having
> several "struct cont" for concurrent printk() users.

Tetsuo, thanks for the patch.

> What Linus has commented is that implicit context is bad, and below one
> uses explicit context.
> After almost all users are converted to use below one, we might be able
> to get rid of KERN_CONT support.

The good thing about cont buffer is that we flush it on panic. E.g.
core/arch early boot stage can do:

	pr_cont("going to call early_init_foo()...");
	early_init_foo();
	pr_cont("OK\n");

Should early_init_foo() panic the system, we will still have
"going to call early_init_foo()" on the serial console. This can
be addressed if you iterate printk_buffers[] in flush_on_panic().
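
Roughly, the idea is a panic-time walk over the buffer array that emits
whatever partial lines are still pending. A minimal userspace sketch, with
hypothetical names (pending_buffer, pending_store(), flush_pending_buffers())
and a FILE * standing in for the console; it ignores locking entirely, which a
real panic path could not:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical pool size for this sketch. */
#define NR_PENDING_BUFFERS 4

struct pending_buffer {
	bool in_use;
	size_t used;		/* bytes of partial line held in buf */
	char buf[128];
};

static struct pending_buffer pending_buffers[NR_PENDING_BUFFERS];

/* Store a partial line (no trailing newline yet) into slot idx. */
static void pending_store(size_t idx, const char *text)
{
	size_t len = strlen(text);

	assert(idx < NR_PENDING_BUFFERS);
	assert(len <= sizeof(pending_buffers[idx].buf));
	memcpy(pending_buffers[idx].buf, text, len);
	pending_buffers[idx].used = len;
	pending_buffers[idx].in_use = true;
}

/*
 * Emit every partial line that is still buffered, one per output line.
 * Returns the number of buffers that had something to flush.
 */
static int flush_pending_buffers(FILE *out)
{
	int flushed = 0;

	for (size_t i = 0; i < NR_PENDING_BUFFERS; i++) {
		struct pending_buffer *ptr = &pending_buffers[i];

		if (!ptr->in_use || !ptr->used)
			continue;
		fwrite(ptr->buf, 1, ptr->used, out);
		fputc('\n', out);	/* terminate the incomplete line */
		ptr->used = 0;
		flushed++;
	}
	return flushed;
}
```

Calling pending_store(1, "going to call early_init_foo()...") and then
flush_pending_buffers(stdout) prints that fragment even though the closing
"OK\n" never arrived.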

> +#define MAX_PRINTK_BUFFERS 16
> +static struct printk_buffer printk_buffers[MAX_PRINTK_BUFFERS];

Well, hmm, maybe. Now we can have a problem of MAX_PRINTK_BUFFERS being either
too small or too large. 16 buffers on a 4 CPU arm board will most probably just
waste some memory. At the same time we probably don't want to have NR_CPUS
buffers. The fallback to "regular printk" is still a bit troubling - technically
there may be cases when we don't fix anything.

So, overall, I'm not against your patch. There are some pros and cons,
however.

The pr_line() patch seems to be simpler [probably] and smaller [definitely].
The only problem, as you have mentioned, is that people may miscalculate
the size of the buffer, which won't crash us or anything; people can overshoot
even a LOG_LINE_MAX buffer. So probably I'm not completely sold on having
a fixed size printk_buffers[].

Maybe all we want in the end is to drop the explicit buffer API and have just
two options in pr_line:

 DEFINE_PR_LINE()	-- 80-bytes (or 256) pr_line // implicit buffer
 DEFINE_PR_LINE_HUGE()	-- 1024-bytes pr_line        // implicit buffer

So, no explicit buffers, just "a normal" pr_line or "a huge" pr_line.
And no "normal printk" fallback; buffered printk line stays buffered.

The 80-bytes limit can be lifted to, say, 256-bytes.

Tetsuo, do you still want to have a fixed size array of printk buffers?

What do others think?


BTW, Tetsuo, I have addressed your pr_line suggestions/corrections.
I couldn't send the patch or reply to emails because I was offline for
a week due to personal reasons; but I can send it now - it does not
have the DEFINE_PR_LINE_HUGE() macro. It is just the previous version with
the corrections which you have pointed out.

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-27 16:10                                                                           ` Tetsuo Handa
@ 2018-09-28  9:02                                                                             ` Sergey Senozhatsky
  0 siblings, 0 replies; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-09-28  9:02 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Petr Mladek,
	Steven Rostedt, Alexander Potapenko, Dmitriy Vyukov,
	kbuild test robot, syzkaller, LKML, Linus Torvalds,
	Andrew Morton

On (09/28/18 01:10), Tetsuo Handa wrote:
> 
> Therefore, I think that "Either we need to require synchronization - umm... and
> document it - or to provide some means of synchronization in pr_line()." is a
> pointless worry. It is only existing printk() API which needs synchronization. I
> think that line buffered printk() API does not need to talk about synchronization.
> Just saying "don't share DEFINE_PR_LINE()/DEFINE_PR_LINE_BUF() variables" will be
> sufficient.

Agreed. My conclusion at the end was that - "pr_line is going to do as much
as seq_buf does". So pr_line won't provide any additional synchronization
mechanisms.

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-24  8:11                                                                         ` Tetsuo Handa
  2018-09-27 16:10                                                                           ` Tetsuo Handa
@ 2018-09-28  9:09                                                                           ` Sergey Senozhatsky
  2018-09-28 11:01                                                                             ` Tetsuo Handa
  1 sibling, 1 reply; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-09-28  9:09 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Petr Mladek,
	Steven Rostedt, Alexander Potapenko, Dmitriy Vyukov,
	kbuild test robot, syzkaller, LKML, Linus Torvalds,
	Andrew Morton

On (09/24/18 17:11), Tetsuo Handa wrote:
> The reason of using statically preallocated global buffers is that I think
> that it is inconvenient for KERN_CONT users to calculate necessary bytes
> only for avoiding message truncation. The pr_line might be passed to deep
> into the callchain and adjusting buffer size whenever the content's possible
> max length changes is as much painful as changing printk() to accept only
> one "const char *" argument. Even if we guarantee that any context can
> allocate buffer from kernel stack, we cannot guarantee that many concurrent
> printk() won't trigger lockup. Thus, I think that trying to allocate from
> finite static buffers with a fallback to unbuffered printk() upon failure
> is sufficient.

Yes, this makes sense. At the same time we can keep pr_line buffer
in .bss

	static char buffer[1024];
	static DEFINE_PR_LINE_BUF(..., buffer);

just like you have already mentioned. But that's going to require a
case-by-case handling; so a big list of printk buffers is a simpler
option. Fallback, tho, can be painful. On a system with 1024 CPUs can
one have more than 16 concurrent cont printks? If the answer is yes,
then we are looking at the same broken cont output as before.

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-28  9:09                                                                           ` Sergey Senozhatsky
@ 2018-09-28 11:01                                                                             ` Tetsuo Handa
  2018-09-29 10:51                                                                               ` Sergey Senozhatsky
  0 siblings, 1 reply; 94+ messages in thread
From: Tetsuo Handa @ 2018-09-28 11:01 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Petr Mladek, Steven Rostedt,
	Alexander Potapenko, Dmitriy Vyukov, kbuild test robot,
	syzkaller, LKML, Linus Torvalds, Andrew Morton

On 2018/09/28 18:09, Sergey Senozhatsky wrote:
> On (09/24/18 17:11), Tetsuo Handa wrote:
>> The reason of using statically preallocated global buffers is that I think
>> that it is inconvenient for KERN_CONT users to calculate necessary bytes
>> only for avoiding message truncation. The pr_line might be passed to deep
>> into the callchain and adjusting buffer size whenever the content's possible
>> max length changes is as much painful as changing printk() to accept only
>> one "const char *" argument. Even if we guarantee that any context can
>> allocate buffer from kernel stack, we cannot guarantee that many concurrent
>> printk() won't trigger lockup. Thus, I think that trying to allocate from
>> finite static buffers with a fallback to unbuffered printk() upon failure
>> is sufficient.
> 
> Yes, this makes sense. At the same time we can keep pr_line buffer
> in .bss
> 
> 	static char buffer[1024];
> 	static DEFINE_PR_LINE_BUF(..., buffer);
> 
> just like you have already mentioned. But that's going to require a
> case-by-case handling; so a big list of printk buffers is a simpler
> option. Fallback, tho, can be painful. On a system with 1024 CPUs can
> one have more than 16 concurrent cont printks? If the answer is yes,
> then we are looking at the same broken cont output as before.

I'm OK with making "16" configurable (at kernel configuration time and/or
at kernel boot, like the log_buf_len= kernel command line parameter).

We could even allow each "struct task_struct" to have a corresponding
"struct printk_buffer". But if there are that many concurrent callers,
printk() would have already locked up the system to death. ;-)



* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-28  8:56                                                                         ` Sergey Senozhatsky
@ 2018-09-28 11:21                                                                           ` Tetsuo Handa
  2018-09-29 11:13                                                                             ` Sergey Senozhatsky
  0 siblings, 1 reply; 94+ messages in thread
From: Tetsuo Handa @ 2018-09-28 11:21 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Petr Mladek, Steven Rostedt,
	Alexander Potapenko, Dmitriy Vyukov, kbuild test robot,
	syzkaller, LKML, Linus Torvalds, Andrew Morton

On 2018/09/28 17:56, Sergey Senozhatsky wrote:
> The good thing about cont buffer is that we flush it on panic. E.g.
> core/arch early boot stage can do:
> 
> 	pr_cont("going to call early_init_foo()...");
> 	early_init_foo();
> 	pr_cont("OK\n");
> 

Is printing

  going to call early_init_foo()...OK

in one line so critically important? If caller information is prefixed,
we would no longer need to support KERN_CONT. That is, we could do

  printk("going to call early_init_foo()...\n");
  early_init_foo();
  printk("OK\n");

and get output like below.

  T0: going to call early_init_foo()...
  T0: OK

Even if "going to call early_init_foo()..." part became too long,

  T0: going to call
  T0: early_init_foo()...
  T0: OK

will not be so bad.
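
As a rough illustration of what such prefixed output involves, here is a
userspace sketch that splits a message at newlines and prefixes every
resulting line with "T<id>: ". The function name print_with_caller() and the
use of a plain integer as the caller ID are assumptions made for this example;
it does not reflect how printk() would actually obtain or store the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write msg to out, prefixing every line with "T<tid>: ".
 * A final line without a trailing newline is still emitted and terminated.
 * Returns the number of lines written.
 */
static int print_with_caller(FILE *out, unsigned int tid, const char *msg)
{
	int lines = 0;

	assert(msg != NULL);
	while (*msg) {
		/* Length of the current line, excluding the newline. */
		size_t len = strcspn(msg, "\n");

		fprintf(out, "T%u: %.*s\n", tid, (int)len, msg);
		lines++;
		msg += len;
		if (*msg == '\n')
			msg++;
	}
	return lines;
}
```

Because every line carries its own prefix, a reader can regroup lines per
caller with a simple grep even when output from several callers interleaves.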

> should early_init_foo() panic the system we will have
> "going to call early_init_foo()" on the serial console. This can
> be addressed if you'd iterate printk_buffers[] in flush_on_panic().

Yes, flush on panic() would also be possible.



> Tetsuo, do you still want to have a fixed size array of printk buffers?

For my intended users where printk() is used for reporting errors (e.g.
stack backtrace, GFP_ATOMIC memory allocation failure, lockdep splat),
being prepared for an already tight stack is preferable.

> 
> What do others think?

Yes, I want to hear from others.



* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-28 11:01                                                                             ` Tetsuo Handa
@ 2018-09-29 10:51                                                                               ` Sergey Senozhatsky
  2018-09-29 11:15                                                                                 ` Tetsuo Handa
  0 siblings, 1 reply; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-09-29 10:51 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Petr Mladek,
	Steven Rostedt, Alexander Potapenko, Dmitriy Vyukov,
	kbuild test robot, syzkaller, LKML, Linus Torvalds,
	Andrew Morton

On (09/28/18 20:01), Tetsuo Handa wrote:
> > Yes, this makes sense. At the same time we can keep pr_line buffer
> > in .bss
> > 
> > 	static char buffer[1024];
> > 	static DEFINE_PR_LINE_BUF(..., buffer);
> > 
> > just like you have already mentioned. But that's going to require a
> > case-by-case handling; so a big list of printk buffers is a simpler
> > option. Fallback, tho, can be painful. On a system with 1024 CPUs can
> > one have more than 16 concurrent cont printks? If the answer is yes,
> > then we are looking at the same broken cont output as before.
> 
> I'm OK with making "16" configurable (at kernel configuration and/or
> at kernel boot like log_buf_len= kernel command line parameter).

Do we really want this? Why doesn't .bss placement work for you?

	void oom(...)
	{
		static DEFINE_PR_LINE(KERN_ERR, pr);

		pr_line(&pr, ....);
		pr_line(&pr, "\n");
	}

the underlying buffer will be static; the pr_line will get re-init
(offset = 0) every time we call the function, which is OK. And we can
pass &pr to any function oom() invokes. What am I missing?

> We could even allow each "struct task_struct" to have corresponding
> "struct printk_buffer".

Tetsuo, realistically, we can't. Sorry. No one will let us have a printk
buffer on a per-task_struct basis. Even if, by some miracle, someone lets us
do this, a single per-task_struct buffer won't work. Because then
someone will discover that a very simple API

	buffered_printk(current->printk_buffer, "......");

does not work if buffered_printk() gets interrupted by an IRQ, etc. and
that new context also does

	buffered_printk(current->printk_buffer, "......");

So then we will have per-context per-task_struct printk buffers: for task,
for exceptions, for softirq, for hardirq, for NMI, etc. This is not worth
it.

Let's just have a very simple seq_buf based pr_line API. No config options,
no command line arguments - heap, bss or stack for buffer placement. Or even
simpler.

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-28 11:21                                                                           ` Tetsuo Handa
@ 2018-09-29 11:13                                                                             ` Sergey Senozhatsky
  2018-09-29 11:39                                                                               ` Tetsuo Handa
                                                                                                 ` (2 more replies)
  0 siblings, 3 replies; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-09-29 11:13 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Petr Mladek,
	Steven Rostedt, Alexander Potapenko, Dmitriy Vyukov,
	kbuild test robot, syzkaller, LKML, Linus Torvalds,
	Andrew Morton

On (09/28/18 20:21), Tetsuo Handa wrote:
> On 2018/09/28 17:56, Sergey Senozhatsky wrote:
> > The good thing about cont buffer is that we flush it on panic. E.g.
> > core/arch early boot stage can do:
> > 
> > 	pr_cont("going to call early_init_foo()...");
> > 	early_init_foo();
> > 	pr_cont("OK\n");
> > 
> 
> Is printing
> 
>   going to call early_init_foo()...OK
> 
> in one line so critically important?

Could be, if this is the last thing you are about to see on your
serial console. flush_on_panic() is not guaranteed to succeed.

... Hmm. But it seems that this has changed.

We used to flush "incomplete" cont lines (fragments) from console_unlock().

void console_unlock(void)
{
...
        /* flush buffered message fragment immediately to console */
        console_cont_flush(text, sizeof(text));
again:
        for (;;) {
	...
	}
...
}

Unless I'm missing something, we don't anymore, since
5c2992ee7fd8a29d04125dc0aa3522784c5fa5eb.
Now we print only log_buf entries. So we either wait for a \n to flush
a complete cont buffer, or for a race to prematurely flush the cont buffer.


> > Tetsuo, do you still want to have a fixed size array of printk buffers?
> 
> For my intended users where printk() is used for reporting errors (e.g.
> stack backtrace, GFP_ATOMIC memory allocation failure, lockdep splat),
> being prepared for already tight stack is preferable.

Agreed. A list of printk buffers has some interesting features and maybe
we will use it after all.
At the same time the functions you have mentioned can use static char
buffers for pr_line.

> > What do others think?
> 
> Yes, I want to hear from others.

Yep.

	-ss


* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-29 10:51                                                                               ` Sergey Senozhatsky
@ 2018-09-29 11:15                                                                                 ` Tetsuo Handa
  2018-10-01  2:37                                                                                   ` Sergey Senozhatsky
  0 siblings, 1 reply; 94+ messages in thread
From: Tetsuo Handa @ 2018-09-29 11:15 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Petr Mladek, Steven Rostedt,
	Alexander Potapenko, Dmitriy Vyukov, kbuild test robot,
	syzkaller, LKML, Linus Torvalds, Andrew Morton

On 2018/09/29 19:51, Sergey Senozhatsky wrote:
> On (09/28/18 20:01), Tetsuo Handa wrote:
>>> Yes, this makes sense. At the same time we can keep pr_line buffer
>>> in .bss
>>>
>>> 	static char buffer[1024];
>>> 	static DEFINE_PR_LINE_BUF(..., buffer);
>>>
>>> just like you have already mentioned. But that's going to require a
>>> case-by-case handling; so a big list of printk buffers is a simpler
>>> option. Fallback, tho, can be painful. On a system with 1024 CPUs can
>>> one have more than 16 concurrent cont printks? If the answer is yes,
>>> then we are looking at the same broken cont output as before.
>>
>> I'm OK with making "16" configurable (at kernel configuration and/or
>> at kernel boot like log_buf_len= kernel command line parameter).
> 
> Do we really want this? Why .bss placement doesn't work for you?
> 
> 	void oom(...)
> 	{
> 		static DEFINE_PR_LINE(KERN_ERR, pr);
> 
> 		pr_line(&pr, ....);
> 		pr_line(&pr, "\n");
> 	}
> 
> the underlying buffer will be static; the pr_line will get re-init
> (offset = 0) every time we call the function, which is OK. And we can
> pass &pr to any function oom() invokes. What am I missing?

Because there is no guarantee that memory information is dumped under the
oom_lock mutex. The oom_lock is held when calling out_of_memory(), and it
cannot be held when reporting GFP_ATOMIC memory allocation failures.

> 
>> We could even allow each "struct task_struct" to have corresponding
>> "struct printk_buffer".
> 
> Tetsuo, realistically, we can't. Sorry. No one will let us to have a printk
> buffer on per-task_struct basis. Even if someone will let us to do this,
> a miracle, a single per-task_struct buffer won't work. Because, then
> someone will discover that a very simple API
> 
> 	buffered_printk(current->printk_buffer, "......");
> 
> does not work if buffered_printk() gets interrupted by IRQ, etc. in case
> if that new context also does
> 
> 	buffered_printk(current->printk_buffer, "......");
> 
> So then we will have per-context per-task_struct printk buffer: for task,
> for exceptions, for softirq, for hardirq, for NMI, etc. This is not worth
> it.

The number of "struct task_struct" instances is volatile. But the number of
non "struct task_struct" contexts is finite and can be determined at boot (or
initialization) time.

My intention is to allocate a "struct printk_buffer" when a "struct task_struct"
is created (i.e. upon dup_task_struct()) and release it when the
"struct task_struct" is destroyed (i.e. upon free_task_struct()), and to
allocate "struct printk_buffer" for non "struct task_struct" contexts when a
CPU is onlined and release them when the CPU is offlined. Then, it will be
guaranteed that there are enough "struct printk_buffer" instances for any
caller.

> 
> Let's just have a very simple seq_buf based pr_line API. No config options,
> no command line arguments - heap, bss or stack for buffer placement. Or even
> simpler.

We cannot avoid "** %u printk messages dropped **\n" inside printk() upon out
of space. But I don't want line buffered printk() API to truncate upon out of
space for line buffered printk() API. I want buffered printk() API to flush
an incomplete line even if that results in output split across multiple lines.
Injecting caller information can mitigate "printed in multiple lines" case.


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-29 11:13                                                                             ` Sergey Senozhatsky
@ 2018-09-29 11:39                                                                               ` Tetsuo Handa
  2018-10-01  5:52                                                                               ` Sergey Senozhatsky
  2018-10-01 18:06                                                                               ` Steven Rostedt
  2 siblings, 0 replies; 94+ messages in thread
From: Tetsuo Handa @ 2018-09-29 11:39 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Petr Mladek, Steven Rostedt,
	Alexander Potapenko, Dmitriy Vyukov, kbuild test robot,
	syzkaller, LKML, Linus Torvalds, Andrew Morton

On 2018/09/29 20:13, Sergey Senozhatsky wrote:
> On (09/28/18 20:21), Tetsuo Handa wrote:
>> On 2018/09/28 17:56, Sergey Senozhatsky wrote:
>>> The good thing about cont buffer is that we flush it on panic. E.g.
>>> core/arch early boot stage can do:
>>>
>>> 	pr_cont("going to call early_init_foo()...");
>>> 	early_init_foo();
>>> 	pr_cont("OK\n");
>>>
>>
>> Is printing
>>
>>   going to call early_init_foo()...OK
>>
>> in one line so critically important?
> 
> Could be. If this is the last thing you are about to see on your
> serial console. panic_on_flush() is not guaranteed to succeed.

Doing

  printk("going to call early_init_foo()...\n");
  early_init_foo();
  printk("OK\n");

and getting

  T0: going to call early_init_foo()...

as the last line we see on our serial console will not be so bad.

Implicitly flushing incomplete lines might disturb automated processing.
But there are, after all, other factors which can disturb automated
processing (e.g. "** %u printk messages dropped **\n" in printk(),
UDP packets being lost when transmitting via netconsole). After all,
we cannot guarantee perfect automated processing. The last resort is
human eyes and brain.

Then, prefixing caller information helps even if incomplete line
is implicitly flushed by line buffered printk() API...

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-29 11:15                                                                                 ` Tetsuo Handa
@ 2018-10-01  2:37                                                                                   ` Sergey Senozhatsky
  2018-10-01  2:58                                                                                     ` Sergey Senozhatsky
  2018-10-01 11:21                                                                                     ` Tetsuo Handa
  0 siblings, 2 replies; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-10-01  2:37 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Petr Mladek,
	Steven Rostedt, Alexander Potapenko, Dmitriy Vyukov,
	kbuild test robot, syzkaller, LKML, Linus Torvalds,
	Andrew Morton

On (09/29/18 20:15), Tetsuo Handa wrote:
> 
> Because there is no guarantee that memory information is dumped under the
> oom_lock mutex. The oom_lock is held when calling out_of_memory(), and it
> cannot be held when reporting GFP_ATOMIC memory allocation failures.

IOW, static pr_line buffer needs additional synchronization for OOM. Correct?

If we are about to have a list of printk buffers then we probably can
define a list of NR_CPUS cont buffers. And we probably can reuse the
existing struct cont for buffered printk, having 2 different struct-s
for the same thing - struct cont and struct printk_buffer - is not very
cool.

> But I don't want line buffered printk() API to truncate upon out of
> space for line buffered printk() API.

All printk()-s are limited by LOG_LINE_MAX. Buffered printk() is not
special.

	-ss

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] printk: inject caller information into the body of message
  2018-10-01  2:37                                                                                   ` Sergey Senozhatsky
@ 2018-10-01  2:58                                                                                     ` Sergey Senozhatsky
  2018-10-01 11:21                                                                                     ` Tetsuo Handa
  1 sibling, 0 replies; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-10-01  2:58 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Petr Mladek, Steven Rostedt, Alexander Potapenko, Dmitriy Vyukov,
	kbuild test robot, syzkaller, LKML, Linus Torvalds,
	Andrew Morton, Sergey Senozhatsky

On (10/01/18 11:37), Sergey Senozhatsky wrote:
> If we are about to have a list of printk buffers then we probably can
> define a list of NR_CPUS cont buffers. And we probably can reuse the
> existing struct cont for buffered printk, having 2 different struct-s
> for the same thing - struct cont and struct printk_buffer - is not very
> cool.

And we also can re-use cont_add() / cont_flush() / etc.
Just pass a specific struct cont *cont to those functions.

> All printk()-s are limited by LOG_LINE_MAX. Buffered printk() is not
> special.

Correction: I was wrong about this.

Looking at cont handling, it seems that buffered printk is special after
all. We do let it go over LOG_LINE_MAX:

	if (nr_ext_console_drivers || cont.len + len > sizeof(cont.buf))
		cont_flush();

	-ss

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-29 11:13                                                                             ` Sergey Senozhatsky
  2018-09-29 11:39                                                                               ` Tetsuo Handa
@ 2018-10-01  5:52                                                                               ` Sergey Senozhatsky
  2018-10-01  8:37                                                                                 ` Sergey Senozhatsky
  2018-10-01 18:06                                                                               ` Steven Rostedt
  2 siblings, 1 reply; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-10-01  5:52 UTC (permalink / raw)
  To: Steven Rostedt, Petr Mladek, Tetsuo Handa
  Cc: Sergey Senozhatsky, Alexander Potapenko, Dmitriy Vyukov,
	kbuild test robot, syzkaller, LKML, Linus Torvalds,
	Andrew Morton, Sergey Senozhatsky

On (09/29/18 20:13), Sergey Senozhatsky wrote:
> We used to flush "incomplete" cont lines (fragments) from console_unlock().
> 
> void console_unlock(void)
> {
> ...
>         /* flush buffered message fragment immediately to console */
>         console_cont_flush(text, sizeof(text));
> again:
>         for (;;) {
> 	...
> 	}
> ...
> }
> 
> Unless I'm missing something, we don't anymore.
> Since 5c2992ee7fd8a29d04125dc0aa3522784c5fa5eb.
> Now we print only log_buf entries. So we either wait for a \n to flush
> a complete cont buffer, or for a race to preliminary flush cont buffer.

BTW, it just crossed my mind:

Previously, we would do console_cont_flush() for each pr_cont(),
so console_unlock() would print data:

	pr_cont();
	 console_lock();
	 console_unlock()
	  console_cont_flush(); // print cont fragment
	...
	pr_cont();
	 console_lock();
	 console_unlock()
	  console_cont_flush(); // print cont fragment

We don't console_cont_flush() anymore, so when we do pr_cont()
console_unlock() does nothing (unless we flushed the cont buffer):

	pr_cont();
	 console_lock();
	 console_unlock();      // noop
	...
	pr_cont();
	 console_lock();
	 console_unlock();      // noop
	...
	pr_cont();
	  cont_flush();
	    console_lock();
	    console_unlock();   // print data

console_lock()/console_unlock() makes sense only when we flush cont
buffer.

We also wake up klogd needlessly for pr_cont() output: an un-flushed cont
fragment is not stored in log_buf, so there is nothing to pull.

So we can console_lock()/console_unlock()/wake_up_klogd() only when we
know that we log_store()-ed a message.
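
To make the idea concrete, here is a minimal userspace model, not kernel
code. Every name in it (model_store(), model_emit(), console_kicks, the
64-byte cont_buf) is made up for illustration; only log_next_seq mirrors
the kernel counter. It assumes that a record is stored only when a line is
completed or the fragment buffer would overflow:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Userspace model only; these are hypothetical stand-ins, not kernel symbols. */
static unsigned long long log_next_seq;	/* bumped once per stored record */
static char cont_buf[64];		/* pending pr_cont()-style fragment */
static size_t cont_len;
static unsigned int console_kicks;	/* times we "took the console lock" */

/* Append text; a record is stored only when a line completes or overflows. */
static void model_store(const char *text)
{
	size_t len = strlen(text);

	if (len > sizeof(cont_buf))
		len = sizeof(cont_buf);	/* clip oversized input in this model */
	if (cont_len && cont_len + len > sizeof(cont_buf)) {
		log_next_seq++;		/* store the partial line as a record */
		cont_len = 0;
	}
	memcpy(cont_buf + cont_len, text, len);
	cont_len += len;
	assert(cont_len <= sizeof(cont_buf));
	if (cont_len && cont_buf[cont_len - 1] == '\n') {
		log_next_seq++;		/* complete line becomes a record */
		cont_len = 0;
	}
}

/* Returns true only when the console path was actually exercised. */
static bool model_emit(const char *text)
{
	unsigned long long seq_before = log_next_seq;

	model_store(text);
	if (seq_before == log_next_seq)
		return false;	/* nothing new stored: skip lock and wakeup */
	console_kicks++;
	return true;
}
```

In this model, model_emit("Hello ") leaves console_kicks untouched, and only
the following model_emit("world\n") reaches the console path.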

A quick-n-dirty patch (I can send a formal one) which compares log_next_seq
before and after vprintk_store(). log_next_seq is getting incremented each
time we log_store() a new log_buf message:

---

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 308497194bd4..53d9134f02a6 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1931,6 +1931,7 @@ asmlinkage int vprintk_emit(int facility, int level,
 	int printed_len;
 	bool in_sched = false;
 	unsigned long flags;
+	u64 curr_log_seq;
 
 	if (level == LOGLEVEL_SCHED) {
 		level = LOGLEVEL_DEFAULT;
@@ -1942,11 +1943,12 @@ asmlinkage int vprintk_emit(int facility, int level,
 
 	/* This stops the holder of console_sem just where we want him */
 	logbuf_lock_irqsave(flags);
+	curr_log_seq = log_next_seq;
 	printed_len = vprintk_store(facility, level, dict, dictlen, fmt, args);
 	logbuf_unlock_irqrestore(flags);
 
 	/* If called from the scheduler, we can not call up(). */
-	if (!in_sched) {
+	if (!in_sched && (curr_log_seq != log_next_seq)) {
 		/*
 		 * Disable preemption to avoid being preempted while holding
 		 * console_sem which would prevent anyone from printing to
@@ -1963,7 +1965,8 @@ asmlinkage int vprintk_emit(int facility, int level,
 		preempt_enable();
 	}
 
-	wake_up_klogd();
+	if (curr_log_seq != log_next_seq)
+		wake_up_klogd();
 	return printed_len;
 }
 EXPORT_SYMBOL(vprintk_emit);

---

	-ss

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] printk: inject caller information into the body of message
  2018-10-01  5:52                                                                               ` Sergey Senozhatsky
@ 2018-10-01  8:37                                                                                 ` Sergey Senozhatsky
  0 siblings, 0 replies; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-10-01  8:37 UTC (permalink / raw)
  To: Steven Rostedt, Petr Mladek, Tetsuo Handa
  Cc: Alexander Potapenko, Dmitriy Vyukov, kbuild test robot,
	syzkaller, LKML, Linus Torvalds, Andrew Morton,
	Sergey Senozhatsky, Sergey Senozhatsky

On (10/01/18 14:52), Sergey Senozhatsky wrote:
> On (09/29/18 20:13), Sergey Senozhatsky wrote:
> > We used to flush "incomplete" cont lines (fragments) from console_unlock().
> > 
> > void console_unlock(void)
> > {
> > ...
> >         /* flush buffered message fragment immediately to console */
> >         console_cont_flush(text, sizeof(text));
> > again:
> >         for (;;) {
> > 	...
> > 	}
> > ...
> > }
> > 
> > Unless I'm missing something, we don't anymore.
> > Since 5c2992ee7fd8a29d04125dc0aa3522784c5fa5eb.
> > Now we print only log_buf entries. So we either wait for a \n to flush
> > a complete cont buffer, or for a race to preliminary flush cont buffer.
> 
> BTW, it just crossed my mind:

One more thing.

Since we don't print cont fragments to the consoles anymore, do we still
need the "extended consoles disable kernel cont support" thing?

cont lines are proper log_buf entries now, there is nothing to reassemble
according to "consecutive continuation flags". Or am I wrong?

---

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 308497194bd4..e72cb793aff1 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -192,16 +192,7 @@ int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write,
 	return 0;
 }
 
-/*
- * Number of registered extended console drivers.
- *
- * If extended consoles are present, in-kernel cont reassembly is disabled
- * and each fragment is stored as a separate log entry with proper
- * continuation flag so that every emitted message has full metadata.  This
- * doesn't change the result for regular consoles or /proc/kmsg.  For
- * /dev/kmsg, as long as the reader concatenates messages according to
- * consecutive continuation flags, the end result should be the same too.
- */
+/* Number of registered extended console drivers. */
 static int nr_ext_console_drivers;
 
 /*
@@ -1806,12 +1797,8 @@ static void cont_flush(void)
 
 static bool cont_add(int facility, int level, enum log_flags flags, const char *text, size_t len)
 {
-	/*
-	 * If ext consoles are present, flush and skip in-kernel
-	 * continuation.  See nr_ext_console_drivers definition.  Also, if
-	 * the line gets too long, split it up in separate records.
-	 */
-	if (nr_ext_console_drivers || cont.len + len > sizeof(cont.buf)) {
+	/* If the line gets too long, split it up in separate records. */
+	if (cont.len + len > sizeof(cont.buf)) {
 		cont_flush();
 		return false;
 	}
@@ -2731,8 +2718,7 @@ void register_console(struct console *newcon)
 	}
 
 	if (newcon->flags & CON_EXTENDED)
-		if (!nr_ext_console_drivers++)
-			pr_info("printk: continuation disabled due to ext consoles, expect more fragments in /dev/kmsg\n");
+		nr_ext_console_drivers++;
 
 	if (newcon->flags & CON_PRINTBUFFER) {
 		/*
---

	-ss

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] printk: inject caller information into the body of message
  2018-10-01  2:37                                                                                   ` Sergey Senozhatsky
  2018-10-01  2:58                                                                                     ` Sergey Senozhatsky
@ 2018-10-01 11:21                                                                                     ` Tetsuo Handa
  2018-10-02  6:38                                                                                       ` Sergey Senozhatsky
  1 sibling, 1 reply; 94+ messages in thread
From: Tetsuo Handa @ 2018-10-01 11:21 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Petr Mladek, Steven Rostedt,
	Alexander Potapenko, Dmitriy Vyukov, kbuild test robot,
	syzkaller, LKML, Linus Torvalds, Andrew Morton

On 2018/10/01 11:37, Sergey Senozhatsky wrote:
> On (09/29/18 20:15), Tetsuo Handa wrote:
>>
>> Because there is no guarantee that memory information is dumped under the
>> oom_lock mutex. The oom_lock is held when calling out_of_memory(), and it
>> cannot be held when reporting GFP_ATOMIC memory allocation failures.
> 
> IOW, static pr_line buffer needs additional synchronization for OOM. Correct?

Yes (assuming that your OOM refer to both out_of_memory() and warn_alloc()).
And since warn_alloc() might be called from atomic/interrupt contexts, we can't
use locks for synchronization.

> 
> If we are about to have a list of printk buffers then we probably can
> define a list of NR_CPUS cont buffers. And we probably can reuse the
> existing struct cont for buffered printk, having 2 different struct-s
> for the same thing - struct cont and struct printk_buffer - is not very
> cool.

My plan is to remove "struct cont" after most of KERN_CONT users are
converted to use buffered_printk(). There will be 2 different struct-s
only during transition period.

By the way, only up to two threads (the active printer thread and a thread
which is marked as console_waiter) can stall inside printk(), doesn't it?
Then, can you imagine a situation where 1024 (NR_CPUS) threads are stalling
inside printk() waiting for flush? Such system is already dead. All callers
but the two should release printk_buffer as soon as their printk() added their
message to the log buffer.

Maybe "struct printk_buffer" after all becomes identical to "struct cont". But
I guess that even 16 printk_buffer-s is practically sufficient for 1024 CPUs
system, and allocating NR_CPUS printk_buffer-s will be too wasteful.

> 
>> But I don't want line buffered printk() API to truncate upon out of
>> space for line buffered printk() API.
> 
> All printk()-s are limited by LOG_LINE_MAX. Buffered printk() is not
> special.

I'm saying that I don't like discarding overflowed part because you are
using seq_buf_vprintf() which just marks "overflowed" rather than
"flush incomplete line" and "store the new data".

 DEFINE_PR_LINE(pr);

 pr_line(&pr, "1234567890123456789012345678901234567890123456789012345678901234567890");
 pr_line(&pr, "1234567890abcde\n");

will discard "1234567890abcde\n" part, won't it?
I think that getting

 1234567890123456789012345678901234567890123456789012345678901234567890\n
 1234567890abcde\n

is better than getting

 1234567890123456789012345678901234567890123456789012345678901234567890\n

because we can still understand such output by prefixing caller information.

Your DEFINE_PR_LINE() is limiting to far smaller than LOG_LINE_MAX.
Since your version has to worry about "buffer full" (i.e. hitting
seq_buf_set_overflow()) case, it might become a headache for API users.
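
The behaviour I am asking for can be sketched in userspace as below. This is
not the proposed kernel code: struct line_buf, line_buf_printf() and
line_buf_flush() are hypothetical names, stdout stands in for printk(), and
the buffer is deliberately tiny so that the overflow path is easy to hit:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define LINE_BUF_SIZE 32	/* tiny on purpose; stands in for LOG_LINE_MAX */

struct line_buf {
	size_t used;		/* valid bytes in buf[] */
	char buf[LINE_BUF_SIZE];
	unsigned int flushes;	/* how many times we emitted output */
};

/* Emit whatever is buffered, terminating an incomplete line with '\n'. */
static void line_buf_flush(struct line_buf *lb)
{
	if (!lb->used)
		return;
	fwrite(lb->buf, 1, lb->used, stdout);
	if (lb->buf[lb->used - 1] != '\n')
		fputc('\n', stdout);
	lb->used = 0;
	lb->flushes++;
}

/* Buffer one fragment; on lack of space, flush first instead of dropping. */
static int line_buf_printf(struct line_buf *lb, const char *fmt, ...)
{
	char tmp[LINE_BUF_SIZE];
	va_list args;
	size_t n;
	int len;

	va_start(args, fmt);
	len = vsnprintf(tmp, sizeof(tmp), fmt, args);
	va_end(args);
	if (len < 0)
		return len;
	/* A single fragment longer than tmp[] is clipped in this sketch. */
	n = (size_t)len < sizeof(tmp) ? (size_t)len : sizeof(tmp) - 1;

	if (lb->used + n > sizeof(lb->buf))
		line_buf_flush(lb);	/* no room: emit the partial line */
	memcpy(lb->buf + lb->used, tmp, n);
	lb->used += n;
	assert(lb->used <= sizeof(lb->buf));
	if (lb->used && lb->buf[lb->used - 1] == '\n')
		line_buf_flush(lb);	/* complete line: emit it */
	return (int)n;
}
```

With this sketch, a 20-byte fragment followed by "1234567890abcde\n" comes out
as two lines and nothing is discarded, which is the outcome I prefer over
marking the buffer as overflowed.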


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] printk: inject caller information into the body of message
  2018-09-29 11:13                                                                             ` Sergey Senozhatsky
  2018-09-29 11:39                                                                               ` Tetsuo Handa
  2018-10-01  5:52                                                                               ` Sergey Senozhatsky
@ 2018-10-01 18:06                                                                               ` Steven Rostedt
  2 siblings, 0 replies; 94+ messages in thread
From: Steven Rostedt @ 2018-10-01 18:06 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Tetsuo Handa, Sergey Senozhatsky, Petr Mladek,
	Alexander Potapenko, Dmitriy Vyukov, kbuild test robot,
	syzkaller, LKML, Linus Torvalds, Andrew Morton

On Sat, 29 Sep 2018 20:13:17 +0900
Sergey Senozhatsky <sergey.senozhatsky@gmail.com> wrote:

> On (09/28/18 20:21), Tetsuo Handa wrote:
> > On 2018/09/28 17:56, Sergey Senozhatsky wrote:  
> > > The good thing about cont buffer is that we flush it on panic. E.g.
> > > core/arch early boot stage can do:
> > > 
> > > 	pr_cont("going to call early_init_foo()...");
> > > 	early_init_foo();
> > > 	pr_cont("OK\n");
> > >   
> > 
> > Is printing
> > 
> >   going to call early_init_foo()...OK
> > 
> > in one line so critically important?  
> 

Yes. My testing infrastructure tests for this on boot up for the ftrace
self tests.

-- Steve

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] printk: inject caller information into the body of message
  2018-10-01 11:21                                                                                     ` Tetsuo Handa
@ 2018-10-02  6:38                                                                                       ` Sergey Senozhatsky
  2018-10-08 10:31                                                                                         ` Tetsuo Handa
  2018-10-08 15:43                                                                                         ` Petr Mladek
  0 siblings, 2 replies; 94+ messages in thread
From: Sergey Senozhatsky @ 2018-10-02  6:38 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Petr Mladek,
	Steven Rostedt, Alexander Potapenko, Dmitriy Vyukov,
	kbuild test robot, syzkaller, LKML, Linus Torvalds,
	Andrew Morton

On (10/01/18 20:21), Tetsuo Handa wrote:
> >> Because there is no guarantee that memory information is dumped under the
> >> oom_lock mutex. The oom_lock is held when calling out_of_memory(), and it
> >> cannot be held when reporting GFP_ATOMIC memory allocation failures.
> > 
> > IOW, static pr_line buffer needs additional synchronization for OOM. Correct?
> 
> Yes (assuming that your OOM refer to both out_of_memory() and warn_alloc()).

Yes, both out_of_memory() and warn_alloc().

> By the way, only up to two threads (the active printer thread and a thread
> which is marked as console_waiter) can stall inside printk(), doesn't it?

Correct.

> Then, can you imagine a situation where 1024 (NR_CPUS) threads are stalling
> inside printk() waiting for flush?

No, not really. Both console_sem owner and waiter should spin outside of
logbuf_lock, so other CPUs can flush/log_store() in the meantime.

> Maybe "struct printk_buffer" after all becomes identical to "struct cont". But
> I guess that even 16 printk_buffer-s is practically sufficient for 1024 CPUs
> system, and allocating NR_CPUS printk_buffer-s will be too wasteful.

NR_CPUS buffers is quite a lot, indeed. Maybe we can do something like
NR_CPUS/4 + 1, etc. A Kconfig option will be super hard to get right for
distributions. If the people who wrote the code couldn't agree on the correct
number of buffers and pushed the decision to the distributions, then it's a
good sign that distributions will have problems picking a good number as
well.

I'm not experienced enough, and need more opinions here.


I have sketched a very silly, quick-and-dirty implementation using
struct cont. It derives all the good features of the existing pr_cont.
I didn't spend enough time on this. It's just a sketch... which compiles
and that's it.

> I'm saying that I don't like discarding overflowed part because you are
> using seq_buf_vprintf() which just marks "overflowed" rather than
> "flush incomplete line" and "store the new data".
[..]
> Your DEFINE_PR_LINE() is limiting to far smaller than LOG_LINE_MAX.

Yes, you are right, and I was wrong about it (like I said in my email
elsewhere).

The existing cont support is "special", apparently. It does automatic
flushing and splits the cont line into separate records when the cont buffer
is about to overflow. So the following loop can pr_cont() more than 1024 bytes
in total:

	for (......)
		pr_cont(.....);
	pr_cont("\n");

Thus new API should have exactly the same characteristics/guarantees. Agreed.

	-ss

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] printk: inject caller information into the body of message
  2018-10-02  6:38                                                                                       ` Sergey Senozhatsky
@ 2018-10-08 10:31                                                                                         ` Tetsuo Handa
  2018-10-08 16:03                                                                                           ` Petr Mladek
  2018-10-08 15:43                                                                                         ` Petr Mladek
  1 sibling, 1 reply; 94+ messages in thread
From: Tetsuo Handa @ 2018-10-08 10:31 UTC (permalink / raw)
  To: Sergey Senozhatsky, Dmitriy Vyukov, Linus Torvalds
  Cc: Sergey Senozhatsky, Petr Mladek, Steven Rostedt,
	Alexander Potapenko, kbuild test robot, syzkaller, LKML,
	Andrew Morton

On 2018/10/02 15:38, Sergey Senozhatsky wrote:
> I have sketched a very silly, quick-and-dirty implementation using
> struct cont. It derives all the good features of the existing pr_cont.
> I didn't spend enough time on this. It's just a sketch... which compiles
> and that's it.

Sergey and I had an off-list discussion about the implementation above. But
I concluded that I should update my version rather than Sergey's sketch.

The origin ( https://groups.google.com/forum/#!topic/syzkaller/ttZehjXiHTU ) is
how to prefix caller information to each line of printk() messages so that syzbot
can group messages from each context and process them better.

We know that Linus has refused injecting extra data into message body
( https://lkml.kernel.org/r/CA+55aFynkjSL1NNZbx6m1iE2HjZagGK09rAr5-HaZ4Ep2eWKOg@mail.gmail.com )

  On 2017/09/18 0:35, Linus Torvalds wrote:
  > On Sat, Sep 16, 2017 at 11:26 PM, Sergey Senozhatsky
  > <sergey.senozhatsky@gmail.com> wrote:
  >>
  >> so... I think we don't have to update 'struct printk_log'. we can store
  >> that "extended data" at the beginning of every message, right after the
  >> prefix.
  > 
  > No, we really can't. That just means that all the tools would have to
  > be changed to get the normal messages without the extra crud. And
  > since it will have lost the difference, that's not even easy to do.
  > 
  > So this is exactly the wrong way around.
  > 
  > If people want to see the extra data, it really should be extra data
  > that you can get with a new interface from the kernel logs. Not a
  > "let's just a add it to all lines and make every line uglier and
  > harder to read.
  > 
  >               Linus

but we also know that syzbot cannot count on a new interface
( https://lkml.kernel.org/r/CACT4Y+aFO+yZ7ovkxJOJfz=JgsE3yr+ywLQ9kVUrOHYMBgfWdg@mail.gmail.com )

  On 2018/05/18 22:08, Dmitry Vyukov wrote:
  > On Fri, May 18, 2018 at 2:54 PM, Petr Mladek <pmladek@suse.com> wrote:
  >> On Fri 2018-05-18 14:25:57, Dmitry Vyukov wrote:
  >>>> On Thu 2018-05-17 20:21:35, Sergey Senozhatsky wrote:
  >>>>> Dunno...
  >>>>> For instance, can we store context tracking info as a extended record
  >>>>> data? We have that dict/dict_len thing. So may we can store tracking
  >>>>> info there? Extended records will appear on the serial console /* if
  >>>>> console supports extended data */ or can be read in via devkmsg_read().
  >>>
  >>> What consoles do support it?
  >>> We are interested at least in qemu console, GCE console and Android
  >>> phone consoles. But it would be pity if this can't be used on various
  >>> development boards too.
  >>
  >> Only the netconsole is able to show the extended (dict)
  >> information at the moment. Search for CON_EXTENDED flag.
  > 
  > Then we won't be able to use it. And we can't pipe from devkmsg_read
  > in user-space, because we need this to work when kernel is broken in
  > various ways...

and we have to allow normal consoles to inject caller information into message
body. Since syzbot can modify kernel configurations and kernel boot command
line options, if Linus permits, we can enable injecting caller information to
only syzbot environments.

Regarding a concern Linus mentioned
( https://lkml.kernel.org/r/CA+55aFwmwdY_mMqdEyFPpRhCKRyeqj=+aCqe5nN108v8ELFvPw@mail.gmail.com ),
we would be able to convert

     printk("Testing feature XYZ..");
     this_may_blow_up_because_of_hw_bugs();
     printk(KERN_CONT " ... ok\n");

to

     printk("Testing feature XYZ..\n");
     this_may_blow_up_because_of_hw_bugs();
     printk("... feature XYZ ok\n");

and eventually remove pr_cont/printk(KERN_CONT) support (i.e. printk() will always
emit '\n').



From df59a431b18888af3bdc9a90d03f1a9d63a12c3e Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Sun, 7 Oct 2018 10:20:38 +0900
Subject: [PATCH v3] printk: Add line-buffered printk() API.

Sometimes we want to print a whole line without being disturbed by
concurrent printk() from interrupts and/or other threads, because a printk()
which does not end with '\n' can be disturbed.

Mixed printk() output is hard to interpret. Assuming that we will move
toward allowing a context identifier to be prefixed to each line of
printk() output (so that we can group multiple lines into one block when
parsing), this patch introduces an API for line-buffered printk() output
(so that we can make sure that printk() ends with '\n').

Since functions introduced by this patch are merely wrapping
printk()/vprintk() calls in order to minimize possibility of using
"struct cont", it is safe to replace printk()/vprintk() with this API.

Details:

  A structure named "struct printk_buffer" is introduced for buffering
  up to LOG_LINE_MAX bytes of printk() output which did not end with '\n'.

  A caller is allowed to allocate/free "struct printk_buffer" using
  kzalloc()/kfree() if that caller is in a location where it is possible
  to do so.

  A macro named "DEFINE_PRINTK_BUFFER()" is defined for allocating
  "struct printk_buffer" from the stack memory or in the .bss section.

  But since sizeof("struct printk_buffer") is nearly 1KB, it might not be
  preferable to allocate "struct printk_buffer" from the stack memory.
  In that case, a caller can use best-effort buffering mode. Two functions
  get_printk_buffer() and put_printk_buffer() are provided for that mode.

  get_printk_buffer() tries to assign a "struct printk_buffer" from
  statically preallocated array. It returns NULL if all static
  "struct printk_buffer" are in use.

  put_printk_buffer() flushes and releases the "struct printk_buffer".
  put_printk_buffer() must be paired with the corresponding
  get_printk_buffer(), just as rcu_read_unlock() must be paired with the
  corresponding rcu_read_lock().

  Three functions buffered_vprintk(), buffered_printk() and
  flush_printk_buffer() are provided for using "struct printk_buffer".
  These are like vfprintf(), fprintf(), fflush() except that these receive
  "struct printk_buffer *" for the first argument.

  buffered_vprintk() and buffered_printk() behave like vprintk() and
  printk() respectively if "struct printk_buffer *" argument is NULL.
  flush_printk_buffer() and put_printk_buffer() become no-op if
  "struct printk_buffer *" argument is NULL. Therefore, the caller of
  get_printk_buffer() does not need to check for NULL.

How to use:

  (1) Allocate "struct printk_buffer" and zero-clear it.
      You can use one of kzalloc() or DEFINE_PRINTK_BUFFER() or
      get_printk_buffer().

  (2) Rewrite printk() calls in the following way. The "ptr" is
      "struct printk_buffer *" allocated in step (1).

      printk(fmt, ...)     => buffered_printk(ptr, fmt, ...)
      vprintk(fmt, args)   => buffered_vprintk(ptr, fmt, args)
      pr_emerg(fmt, ...)   => bpr_emerg(ptr, fmt, ...)
      pr_alert(fmt, ...)   => bpr_alert(ptr, fmt, ...)
      pr_crit(fmt, ...)    => bpr_crit(ptr, fmt, ...)
      pr_err(fmt, ...)     => bpr_err(ptr, fmt, ...)
      pr_warning(fmt, ...) => bpr_warning(ptr, fmt, ...)
      pr_warn(fmt, ...)    => bpr_warn(ptr, fmt, ...)
      pr_notice(fmt, ...)  => bpr_notice(ptr, fmt, ...)
      pr_info(fmt, ...)    => bpr_info(ptr, fmt, ...)
      pr_cont(fmt, ...)    => bpr_cont(ptr, fmt, ...)

  (3) Release "struct printk_buffer" by calling put_printk_buffer()
      only if it was allocated by get_printk_buffer().
      Release "struct printk_buffer" by calling kfree()
      only if it was allocated by kzalloc().

Note that since "struct printk_buffer" buffers only up to one line, there
is no need to rewrite if it is known that the "struct printk_buffer" is
empty and printk() ends with '\n'.

  Good example:

    printk("Hello ");    =>  DEFINE_PRINTK_BUFFER(buf);
    pr_cont("world.\n");     buffered_printk(&buf, "Hello ");
                             buffered_printk(&buf, "world.\n");

  Pointless example:

    printk("Hello\n");   => DEFINE_PRINTK_BUFFER(buf);
    printk("World.\n");     buffered_printk(&buf, "Hello\n");
                            buffered_printk(&buf, "World.\n");

Note that bpr_devel() and bpr_debug() are not defined. This is
because pr_devel()/pr_debug() should not be followed by pr_cont()
because pr_devel()/pr_debug() are conditionally enabled; output from
pr_devel()/pr_debug() should always end with '\n'.

The statically preallocated buffer has 16 "struct printk_buffer".
If the statically preallocated buffers run out, existing printk() will be
used for callers who failed to get a "struct printk_buffer".
Of course, in such a situation, the printk() output would flood the
console and the system would already be unusable (e.g. RCU lockup or
hung task watchdog would fire) even if "struct printk_buffer" were
dynamically allocated. Thus, I think that 16 should be sufficient.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 include/linux/printk.h |  72 ++++++++++++++++++++++
 kernel/printk/printk.c | 158 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 230 insertions(+)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index cf3eccf..912b770 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -173,6 +173,37 @@ int printk_emit(int facility, int level,
 
 asmlinkage __printf(1, 2) __cold
 int printk(const char *fmt, ...);
+/*
+ * A structure for line-buffering printk() output.
+ */
+struct printk_buffer {
+	unsigned short int used; /* Valid bytes in buf[]. */
+	char buf[1024 - 32]; /* This is LOG_LINE_MAX bytes. */
+	bool in_use; /* Unused if defined using DEFINE_PRINTK_BUFFER(). */
+};
+/*
+ * A macro for allowing "struct printk_buffer" on stack or in .bss section.
+ *
+ * You can use this macro for allocation on stack only when you are sure that
+ * that location is never tight about stack usage, for e.g. interrupt might
+ * consume some stack from that location. You can use this macro for allocation
+ * in .bss section only when you are sure that access to this variable is
+ * appropriately serialized, for concurrent access to this variable can lead to
+ * memory corruption.
+ *
+ * If you are not sure, you should use get_printk_buffer()/put_printk_buffer()
+ * instead. You don't need to check for get_printk_buffer() == NULL, for
+ * buffered_printk()/buffered_vprintk() will fallback to printk()/vprintk()
+ * in that case.
+ */
+#define DEFINE_PRINTK_BUFFER(name) struct printk_buffer name = { }
+struct printk_buffer *get_printk_buffer(void);
+void flush_printk_buffer(struct printk_buffer *ptr);
+__printf(2, 3)
+int buffered_printk(struct printk_buffer *ptr, const char *fmt, ...);
+__printf(2, 0)
+int buffered_vprintk(struct printk_buffer *ptr, const char *fmt, va_list args);
+void put_printk_buffer(struct printk_buffer *ptr);
 
 /*
  * Special printk facility for scheduler/timekeeping use only, _DO_NOT_USE_ !
@@ -220,6 +251,30 @@ int printk(const char *s, ...)
 {
 	return 0;
 }
+struct printk_buffer {
+	char dummy; /* Not used. */
+};
+#define DEFINE_PRINTK_BUFFER(name) struct printk_buffer name
+static inline struct printk_buffer *get_printk_buffer(void)
+{
+	return NULL;
+}
+static inline __printf(2, 3)
+int buffered_printk(struct printk_buffer *ptr, const char *fmt, ...)
+{
+	return 0;
+}
+static inline __printf(2, 0)
+int buffered_vprintk(struct printk_buffer *ptr, const char *fmt, va_list args)
+{
+	return 0;
+}
+static inline void flush_printk_buffer(struct printk_buffer *ptr)
+{
+}
+static inline void put_printk_buffer(struct printk_buffer *ptr)
+{
+}
 static inline __printf(1, 2) __cold
 int printk_deferred(const char *s, ...)
 {
@@ -300,19 +355,34 @@ static inline void printk_safe_flush_on_panic(void)
  */
 #define pr_emerg(fmt, ...) \
 	printk(KERN_EMERG pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_emerg(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_EMERG pr_fmt(fmt), ##__VA_ARGS__)
 #define pr_alert(fmt, ...) \
 	printk(KERN_ALERT pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_alert(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_ALERT pr_fmt(fmt), ##__VA_ARGS__)
 #define pr_crit(fmt, ...) \
 	printk(KERN_CRIT pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_crit(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_CRIT pr_fmt(fmt), ##__VA_ARGS__)
 #define pr_err(fmt, ...) \
 	printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_err(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
 #define pr_warning(fmt, ...) \
 	printk(KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_warning(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__)
 #define pr_warn pr_warning
+#define bpr_warn bpr_warning
 #define pr_notice(fmt, ...) \
 	printk(KERN_NOTICE pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_notice(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_NOTICE pr_fmt(fmt), ##__VA_ARGS__)
 #define pr_info(fmt, ...) \
 	printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_info(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
 /*
  * Like KERN_CONT, pr_cont() should only be used when continuing
  * a line with no newline ('\n') enclosed. Otherwise it defaults
@@ -320,6 +390,8 @@ static inline void printk_safe_flush_on_panic(void)
  */
 #define pr_cont(fmt, ...) \
 	printk(KERN_CONT fmt, ##__VA_ARGS__)
+#define bpr_cont(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_CONT fmt, ##__VA_ARGS__)
 
 /* pr_devel() should produce zero code unless DEBUG is defined */
 #ifdef DEBUG
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 9bf5404..6f564e6 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1949,6 +1949,164 @@ asmlinkage int printk_emit(int facility, int level,
 }
 EXPORT_SYMBOL(printk_emit);
 
+#define MAX_PRINTK_BUFFERS 16
+static struct printk_buffer printk_buffers[MAX_PRINTK_BUFFERS] __aligned(1024);
+
+/**
+ * get_printk_buffer - Try to get printk_buffer.
+ *
+ * Returns pointer to "struct printk_buffer" on success, NULL otherwise.
+ *
+ * If this function returned "struct printk_buffer", the caller is responsible
+ * for passing it to put_printk_buffer() so that "struct printk_buffer" can be
+ * reused in the future.
+ *
+ * Even if this function returned NULL, the caller does not need to check for
+ * NULL, for passing NULL to buffered_printk() simply acts like normal printk()
+ * and passing NULL to flush_printk_buffer()/put_printk_buffer() is a no-op.
+ */
+struct printk_buffer *get_printk_buffer(void)
+{
+	unsigned short int i;
+
+	for (i = 0; i < MAX_PRINTK_BUFFERS; i++) {
+		struct printk_buffer *ptr = &printk_buffers[i];
+
+		if (ptr->in_use || cmpxchg(&ptr->in_use, false, true))
+			continue;
+		ptr->used = 0;
+		return ptr;
+	}
+	return NULL;
+}
+EXPORT_SYMBOL(get_printk_buffer);
+
+/**
+ * buffered_vprintk - Try to vprintk() in line buffered mode.
+ *
+ * @ptr:  Pointer to "struct printk_buffer". It can be NULL.
+ * @fmt:  printk() format string.
+ * @args: va_list structure.
+ *
+ * Returns the return value of vprintk().
+ *
+ * Try to store to @ptr first. If it fails, flush @ptr and then try to store to
+ * @ptr again. If it still fails, use unbuffered printing.
+ */
+int buffered_vprintk(struct printk_buffer *ptr, const char *fmt, va_list args)
+{
+	va_list tmp_args;
+	unsigned short int i;
+	int r;
+
+	BUILD_BUG_ON(sizeof(ptr->buf) != LOG_LINE_MAX);
+	if (!ptr)
+		goto unbuffered;
+	for (i = 0; i < 2; i++) {
+		unsigned int pos = ptr->used;
+		char *text = ptr->buf + pos;
+
+		va_copy(tmp_args, args);
+		r = vsnprintf(text, sizeof(ptr->buf) - pos, fmt, tmp_args);
+		va_end(tmp_args);
+		if (r + pos < sizeof(ptr->buf)) {
+			/*
+			 * Eliminate KERN_CONT at this point because we can
+			 * concatenate incomplete lines inside printk_buffer.
+			 */
+			if (r >= 2 && printk_get_level(text) == 'c') {
+				memmove(text, text + 2, r - 2);
+				ptr->used += r - 2;
+			} else {
+				ptr->used += r;
+			}
+			/* Flush already completed lines if any. */
+			while (1) {
+				char *cp = memchr(ptr->buf, '\n', ptr->used);
+
+				if (!cp)
+					break;
+				*cp = '\0';
+				printk("%s\n", ptr->buf);
+				i = cp - ptr->buf + 1;
+				ptr->used -= i;
+				memmove(ptr->buf, ptr->buf + i, ptr->used);
+			}
+			return r;
+		}
+		if (i)
+			break;
+		flush_printk_buffer(ptr);
+	}
+ unbuffered:
+	return vprintk(fmt, args);
+}
+
+/**
+ * buffered_printk - Try to printk() in line buffered mode.
+ *
+ * @ptr:  Pointer to "struct printk_buffer". It can be NULL.
+ * @fmt:  printk() format string, followed by arguments.
+ *
+ * Returns the return value of printk().
+ *
+ * Try to store to @ptr first. If it fails, flush @ptr and then try to store to
+ * @ptr again. If it still fails, use unbuffered printing.
+ */
+int buffered_printk(struct printk_buffer *ptr, const char *fmt, ...)
+{
+	va_list args;
+	int r;
+
+	va_start(args, fmt);
+	r = buffered_vprintk(ptr, fmt, args);
+	va_end(args);
+	return r;
+}
+EXPORT_SYMBOL(buffered_printk);
+
+/**
+ * flush_printk_buffer - Flush incomplete line in printk_buffer.
+ *
+ * @ptr: Pointer to "struct printk_buffer". It can be NULL.
+ *
+ * Returns nothing.
+ *
+ * Flush if @ptr contains partial data. But usually there is no need to call
+ * this function because @ptr is flushed by put_printk_buffer().
+ */
+void flush_printk_buffer(struct printk_buffer *ptr)
+{
+	if (!ptr || !ptr->used)
+		return;
+	/* buffered_vprintk() keeps 0 <= ptr->used < sizeof(ptr->buf) true. */
+	ptr->buf[ptr->used] = '\0';
+	printk("%s", ptr->buf);
+	ptr->used = 0;
+}
+EXPORT_SYMBOL(flush_printk_buffer);
+
+/**
+ * put_printk_buffer - Release printk_buffer.
+ *
+ * @ptr: Pointer to "struct printk_buffer". It can be NULL.
+ *
+ * Returns nothing.
+ *
+ * Flush and release @ptr.
+ */
+void put_printk_buffer(struct printk_buffer *ptr)
+{
+	if (!ptr)
+		return;
+	if (ptr->used)
+		flush_printk_buffer(ptr);
+	/* Make sure ptr->in_use is cleared after setting ptr->used = 0.*/
+	wmb();
+	ptr->in_use = false;
+}
+EXPORT_SYMBOL(put_printk_buffer);
+
 int vprintk_default(const char *fmt, va_list args)
 {
 	int r;
-- 
1.8.3.1



* Re: [PATCH] printk: inject caller information into the body of message
  2018-10-02  6:38                                                                                       ` Sergey Senozhatsky
  2018-10-08 10:31                                                                                         ` Tetsuo Handa
@ 2018-10-08 15:43                                                                                         ` Petr Mladek
  1 sibling, 0 replies; 94+ messages in thread
From: Petr Mladek @ 2018-10-08 15:43 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Tetsuo Handa, Sergey Senozhatsky, Steven Rostedt,
	Alexander Potapenko, Dmitriy Vyukov, kbuild test robot,
	syzkaller, LKML, Linus Torvalds, Andrew Morton

On Tue 2018-10-02 15:38:51, Sergey Senozhatsky wrote:
> On (10/01/18 20:21), Tetsuo Handa wrote:
> > Maybe "struct printk_buffer" after all becomes identical to "struct cont". But
> > I guess that even 16 printk_buffer-s is practically sufficient for 1024 CPUs
> > system, and allocating NR_CPUS printk_buffer-s will be too wasteful.
> 
> NR_CPUS buffers is quite a lot, indeed. Maybe we can do something like
> NR_CPUS/4 + 1, etc. Kconfig option will be super hard to get right for
> distributions. If people who wrote the code didn't agree on the correct
> number of buffers and passed it to the distributions, then it's a good
> sign that distributions will have problems picking up the good number as
> well.

I am afraid that only some testing or real life experience might tell
us what number is good enough.

The good thing is that it could only be better than the current
state when we have only one cont buffer.

Also I would not be so much afraid of the per-cpu buffer. We already
use 16kB per-CPU for printk_safe and printk_nmi. One more kB should
not be that big a deal.

Best Regards,
Petr


* Re: [PATCH] printk: inject caller information into the body of message
  2018-10-08 10:31                                                                                         ` Tetsuo Handa
@ 2018-10-08 16:03                                                                                           ` Petr Mladek
  2018-10-08 20:48                                                                                             ` Tetsuo Handa
  0 siblings, 1 reply; 94+ messages in thread
From: Petr Mladek @ 2018-10-08 16:03 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Dmitriy Vyukov, Linus Torvalds,
	Sergey Senozhatsky, Steven Rostedt, Alexander Potapenko,
	kbuild test robot, syzkaller, LKML, Andrew Morton

On Mon 2018-10-08 19:31:58, Tetsuo Handa wrote:
> On 2018/10/02 15:38, Sergey Senozhatsky wrote:
> > I have sketched a very silly, quick-and-dirty implementation using
> > struct cont. It derives all the good features of the existing pr_cont.
> > I didn't spend enough time on this. It's just a sketch... which compiles
> > and that's it.
> 
> Sergey and I had off-list discussion about an implementation above. But
> I concluded that I should update my version than updating Sergey's sketch.
> 
> The origin ( https://groups.google.com/forum/#!topic/syzkaller/ttZehjXiHTU ) is
> how to prefix caller information to each line of printk() messages so that syzbot
> can group messages from each context and do the better processing.
> 
> We know that Linus has refused injecting extra data into message body
> ( https://lkml.kernel.org/r/CA+55aFynkjSL1NNZbx6m1iE2HjZagGK09rAr5-HaZ4Ep2eWKOg@mail.gmail.com )
> 
>   On 2017/09/18 0:35, Linus Torvalds wrote:
>   > On Sat, Sep 16, 2017 at 11:26 PM, Sergey Senozhatsky
>   > <sergey.senozhatsky@gmail.com> wrote:
>   >>
>   >> so... I think we don't have to update 'struct printk_log'. we can store
>   >> that "extended data" at the beginning of every message, right after the
>   >> prefix.
>   > 
>   > No, we really can't. That just means that all the tools would have to
>   > be changed to get the normal messages without the extra crud. And
>   > since it will have lost the difference, that's not even easy to do.
>   > 
>   > So this is exactly the wrong way around.
>   > 
>   > If people want to see the extra data, it really should be extra data
>   > that you can get with a new interface from the kernel logs. Not a
>   > "let's just a add it to all lines and make every line uglier and
>   > harder to read.
>   > 
>   >               Linus
> 
> but we also know that syzbot cannot count on a new interface
> ( https://lkml.kernel.org/r/CACT4Y+aFO+yZ7ovkxJOJfz=JgsE3yr+ywLQ9kVUrOHYMBgfWdg@mail.gmail.com )
> 
>   On 2018/05/18 22:08, Dmitry Vyukov wrote:
>   > On Fri, May 18, 2018 at 2:54 PM, Petr Mladek <pmladek@suse.com> wrote:
>   >> On Fri 2018-05-18 14:25:57, Dmitry Vyukov wrote:
>   >>>> On Thu 2018-05-17 20:21:35, Sergey Senozhatsky wrote:
>   >>>>> Dunno...
>   >>>>> For instance, can we store context tracking info as a extended record
>   >>>>> data? We have that dict/dict_len thing. So may we can store tracking
>   >>>>> info there? Extended records will appear on the serial console /* if
>   >>>>> console supports extended data */ or can be read in via devkmsg_read().
>   >>>
>   >>> What consoles do support it?
>   >>> We are interested at least in qemu console, GCE console and Android
>   >>> phone consoles. But it would be pity if this can't be used on various
>   >>> development boards too.
>   >>
>   >> Only the netconsole is able to show the extended (dict)
>   >> information at the moment. Search for CON_EXTENDED flag.
>   > 
>   > Then we won't be able to use it. And we can't pipe from devkmsg_read
>   > in user-space, because we need this to work when kernel is broken in
>   > various ways...
> 
> and we have to allow normal consoles to inject caller information into message
> body. Since syzbot can modify kernel configurations and kernel boot command
> line options, if Linus permits, we can enable injecting caller information to
> only syzbot environments.
> 
> Regarding a concern Linus mentioned
> ( https://lkml.kernel.org/r/CA+55aFwmwdY_mMqdEyFPpRhCKRyeqj=+aCqe5nN108v8ELFvPw@mail.gmail.com ),
> we would be able to convert
> 
>      printk("Testing feature XYZ..");
>      this_may_blow_up_because_of_hw_bugs();
>      printk(KERN_CONT " ... ok\n");
> 
> to
> 
>      printk("Testing feature XYZ..\n");
>      this_may_blow_up_because_of_hw_bugs();
>      printk("... feature XYZ ok\n");
> 
> and eventually remove pr_cont/printk(KERN_CONT) support (i.e. printk() will always
> emit '\n').
> 
> 
> 
> From df59a431b18888af3bdc9a90d03f1a9d63a12c3e Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Date: Sun, 7 Oct 2018 10:20:38 +0900
> Subject: [PATCH v3] printk: Add line-buffered printk() API.
> 
> Sometimes we want to print a whole line without being disturbed by
> concurrent printk() from interrupts and/or other threads, for printk()
> which does not end with '\n' can be disturbed.
> 
> Mixed printk() output makes it hard to interpret. Assuming that we will go
> to a direction that we allow prefixing context identifier to each line of
> printk() output (so that we can group multiple lines into one block when
> parsing), this patch introduces API for line-buffered printk() output
> (so that we can make sure that printk() ends with '\n').
> 
> Since functions introduced by this patch are merely wrapping
> printk()/vprintk() calls in order to minimize possibility of using
> "struct cont", it is safe to replace printk()/vprintk() with this API.
> 
> Details:
> 
>   A structure named "struct printk_buffer" is introduced for buffering
>   up to LOG_LINE_MAX bytes of printk() output which did not end with '\n'.
> 
>   A caller is allowed to allocate/free "struct printk_buffer" using
>   kzalloc()/kfree() if that caller is in a location where it is possible
>   to do so.
> 
>   A macro named "DEFINE_PRINTK_BUFFER()" is defined for allocating
>   "struct printk_buffer" from the stack memory or in the .bss section.
> 
>   But since sizeof("struct printk_buffer") is nearly 1KB, it might not be
>   preferable to allocate "struct printk_buffer" from the stack memory.
>   In that case, a caller can use best-effort buffering mode. Two functions
>   get_printk_buffer() and put_printk_buffer() are provided for that mode.
> 
>   get_printk_buffer() tries to assign a "struct printk_buffer" from
>   statically preallocated array. It returns NULL if all static
>   "struct printk_buffer" are in use.
> 
>   put_printk_buffer() flushes and releases the "struct printk_buffer".
>   put_printk_buffer() must match corresponding get_printk_buffer() as with
>   rcu_read_unlock() must match corresponding rcu_read_lock().

One problem with this API arises when it is used in more complicated
code and put_printk_buffer() is not called on some path, i.e. the
buffer leaks. We might run out of buffers easily.

A solution might be to store some information about the owner and
put the buffer also when a non-buffered printk is called from
the same context.

It might even make it easier to use. If we are able to guess the
buffer by the context, we do not need to pass it as an argument.

Well, I would like to avoid having the buffer connected with a CPU.
It would require disabling preemption in get_printk_buffer().
IMHO, it would be an unintuitive and even unwanted side effect.

Best Regards,
Petr


PS: I am sorry for the late reply. I was busy with some other
important stuff. I still have to think more about it and look
more deeply into the implementation.

In any case, we need to be careful about the design.
The API has to be easy and safe to use. Also the implementation
should not complicate the printk design too much.

It looks promising. Also there is a high chance that it would
be much more straightforward than the current code around
the cont buffer ;-)


* Re: [PATCH] printk: inject caller information into the body of message
  2018-10-08 16:03                                                                                           ` Petr Mladek
@ 2018-10-08 20:48                                                                                             ` Tetsuo Handa
  2018-10-09 14:52                                                                                               ` Petr Mladek
  0 siblings, 1 reply; 94+ messages in thread
From: Tetsuo Handa @ 2018-10-08 20:48 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Dmitriy Vyukov, Linus Torvalds,
	Sergey Senozhatsky, Steven Rostedt, Alexander Potapenko,
	kbuild test robot, syzkaller, LKML, Andrew Morton

On 2018/10/09 1:03, Petr Mladek wrote:
> On Mon 2018-10-08 19:31:58, Tetsuo Handa wrote:
>>   A structure named "struct printk_buffer" is introduced for buffering
>>   up to LOG_LINE_MAX bytes of printk() output which did not end with '\n'.
>>
>>   A caller is allowed to allocate/free "struct printk_buffer" using
>>   kzalloc()/kfree() if that caller is in a location where it is possible
>>   to do so.
>>
>>   A macro named "DEFINE_PRINTK_BUFFER()" is defined for allocating
>>   "struct printk_buffer" from the stack memory or in the .bss section.
>>
>>   But since sizeof("struct printk_buffer") is nearly 1KB, it might not be
>>   preferable to allocate "struct printk_buffer" from the stack memory.
>>   In that case, a caller can use best-effort buffering mode. Two functions
>>   get_printk_buffer() and put_printk_buffer() are provided for that mode.
>>
>>   get_printk_buffer() tries to assign a "struct printk_buffer" from
>>   statically preallocated array. It returns NULL if all static
>>   "struct printk_buffer" are in use.
>>
>>   put_printk_buffer() flushes and releases the "struct printk_buffer".
>>   put_printk_buffer() must match corresponding get_printk_buffer() as with
>>   rcu_read_unlock() must match corresponding rcu_read_lock().
> 
> One problem with this API is when it is used in more complicated code
> and put_printk_buffer() is not called in some path. I mean leaking.
> We might get out of buffers easily.

Then, as a debugging config option for statically preallocated buffers,
we could record how get_printk_buffer() was called, like lockdep records
where a lock was taken.
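
As a rough user-space illustration of that idea (not part of the patch):
each pool entry could remember where it was taken, so that a report can
list the current holders. Everything named here is hypothetical,
including struct tracked_buffer, get_buf(), put_buf() and
report_holders(); a kernel version would more likely save a stack trace
than __FILE__/__LINE__.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_BUFS 16

/* Hypothetical pool entry that remembers who took it. */
struct tracked_buffer {
	atomic_bool in_use;
	const char *file;	/* Where the buffer was taken (debug only). */
	int line;
};

static struct tracked_buffer pool[MAX_BUFS];

/* Claim a free entry and record the call site; NULL if all are taken. */
static struct tracked_buffer *get_buf_at(const char *file, int line)
{
	for (size_t i = 0; i < MAX_BUFS; i++) {
		bool expected = false;

		if (!atomic_compare_exchange_strong(&pool[i].in_use,
						    &expected, true))
			continue;
		pool[i].file = file;
		pool[i].line = line;
		return &pool[i];
	}
	return NULL;
}
#define get_buf() get_buf_at(__FILE__, __LINE__)

/* Release an entry; NULL is accepted, as in the patch. */
static void put_buf(struct tracked_buffer *b)
{
	if (!b)
		return;
	assert(atomic_load(&b->in_use));	/* Catches a double put. */
	atomic_store(&b->in_use, false);
}

/*
 * Print every entry that is still held and return how many there are.
 * The file/line fields are read without synchronization, which is
 * tolerable only because this is a debugging aid.
 */
static int report_holders(FILE *out)
{
	int n = 0;

	for (size_t i = 0; i < MAX_BUFS; i++) {
		if (!atomic_load(&pool[i].in_use))
			continue;
		fprintf(out, "buffer %zu taken at %s:%d\n",
			i, pool[i].file, pool[i].line);
		n++;
	}
	return n;
}
```

When get_buf() returns NULL, calling report_holders(stderr) would show
which call sites are sitting on the buffers, which is the kind of hint
needed to find a missing put.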

> 
> A solution might be to store some information about the owner and
> put the buffer also when a non-buffered printk is called from
> the same context.
> 
> It might even make it easier to use. If we are able to guess the
> buffer by the context, we do not need to pass it as an argument.

It would be nice if we can omit passing "struct printk_buffer" argument.
But that results in "implicit contexts" which Linus has rejected
( https://lkml.kernel.org/CA+55aFx+5R-vFQfr7+Ok9Yrs2adQ2Ma4fz+S6nCyWHY_-2mrmw@mail.gmail.com ).

> 
> Well, I would like to avoid having the buffer connected with CPU.
> It would require disabling preemption in get_printk_buffer().
> IMHO, it would be an unintuitive and even unwanted side effect.

A "struct printk_buffer" is connected with the context that called
get_printk_buffer(), not with a CPU. There is no need to disable preemption.

> 
> Best Regards,
> Petr
> 
> 
> PS: I am sorry for the late reply. I was busy with some other
> important stuff. I still have to think more about it and look
> more deeply into the implementation.

No problem. Thank you for replying.

> 
> In each case, we need to be careful about the design.
> The API has to be easy and safe to use. Also the implementation
> should not complicate the printk design too much.
> 
> It looks promising. Also there is a high chance that it would
> be much more straightforward than the current code around
> the cont buffer ;-)
> 

We could eventually remove the "struct cont" buffer.



* Re: [PATCH] printk: inject caller information into the body of message
  2018-10-08 20:48                                                                                             ` Tetsuo Handa
@ 2018-10-09 14:52                                                                                               ` Petr Mladek
  2018-10-09 21:19                                                                                                 ` Tetsuo Handa
  0 siblings, 1 reply; 94+ messages in thread
From: Petr Mladek @ 2018-10-09 14:52 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Dmitriy Vyukov, Linus Torvalds,
	Sergey Senozhatsky, Steven Rostedt, Alexander Potapenko,
	kbuild test robot, syzkaller, LKML, Andrew Morton

On Tue 2018-10-09 05:48:33, Tetsuo Handa wrote:
> On 2018/10/09 1:03, Petr Mladek wrote:
> > On Mon 2018-10-08 19:31:58, Tetsuo Handa wrote:
> >>   A structure named "struct printk_buffer" is introduced for buffering
> >>   up to LOG_LINE_MAX bytes of printk() output which did not end with '\n'.
> >>
> >>   A caller is allowed to allocate/free "struct printk_buffer" using
> >>   kzalloc()/kfree() if that caller is in a location where it is possible
> >>   to do so.
> >>
> >>   A macro named "DEFINE_PRINTK_BUFFER()" is defined for allocating
> >>   "struct printk_buffer" from the stack memory or in the .bss section.
> >>
> >>   But since sizeof("struct printk_buffer") is nearly 1KB, it might not be
> >>   preferable to allocate "struct printk_buffer" from the stack memory.
> >>   In that case, a caller can use best-effort buffering mode. Two functions
> >>   get_printk_buffer() and put_printk_buffer() are provided for that mode.
> >>
> >>   get_printk_buffer() tries to assign a "struct printk_buffer" from
> >>   statically preallocated array. It returns NULL if all static
> >>   "struct printk_buffer" are in use.
> >>
> >>   put_printk_buffer() flushes and releases the "struct printk_buffer".
> >>   put_printk_buffer() must match corresponding get_printk_buffer() as with
> >>   rcu_read_unlock() must match corresponding rcu_read_lock().
> > 
> > One problem with this API is when it is used in more complicated code
> > and put_printk_buffer() is not called in some path. I mean leaking.
> > We might get out of buffers easily.
> 
> Then, as a debugging config option for statically preallocated buffers,
> we could record how get_printk_buffer() was called, like lockdep records
> where a lock was taken.

Another solution might be to store some timestamp (jiffies?) into
struct printk_buffer when a new message is added. Then we could flush
stalled buffers in get_printk_buffer() with some warning.

Unfortunately, it might be unsafe to put the stalled buffers.
Well, it might be safe if there is lockless access. I wonder
if we could reuse the printk_safe code here.
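
To make the timestamp idea concrete, here is a rough user-space sketch.
It only detects and reports stalled buffers; it does not put them,
because of the safety concern above. The names struct stamped_buffer,
stamp() and count_stalled() are hypothetical, and the "now" argument
stands in for a tick counter such as jiffies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define NR_STAMPED 16

/* Hypothetical pool entry that remembers when it was last written. */
struct stamped_buffer {
	int in_use;
	unsigned long last_used;	/* Tick of the most recent append. */
};

static struct stamped_buffer stamped_pool[NR_STAMPED];

/* Record that new text was just added to @b. */
static void stamp(struct stamped_buffer *b, unsigned long now)
{
	assert(b->in_use);
	b->last_used = now;
}

/*
 * Warn about entries that were not touched for more than @threshold
 * ticks and return how many there are. The unsigned subtraction keeps
 * the comparison correct when the tick counter wraps around.
 */
static int count_stalled(unsigned long now, unsigned long threshold)
{
	int n = 0;

	for (size_t i = 0; i < NR_STAMPED; i++) {
		const struct stamped_buffer *b = &stamped_pool[i];

		if (!b->in_use || now - b->last_used <= threshold)
			continue;
		fprintf(stderr, "buffer %zu idle for %lu ticks\n",
			i, now - b->last_used);
		n++;
	}
	return n;
}
```

The sketch leaves the value of the threshold open, and picking a value
that is neither too eager nor too late is the hard part.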

Anyway, I would like to have a solution before we add the new
API into the kernel. We would need it sooner or later anyway.
And I would like to be sure that the API is sane.


> > A solution might be to store some information about the owner and
> > put the buffer also when a non-buffered printk is called from
> > the same context.
> > 
> > It might even make it easier to use. If we are able to guess the
> > buffer by the context, we do not need to pass it as an argument.
> 
> It would be nice if we can omit passing "struct printk_buffer" argument.
> But that results in "implicit contexts" which Linus has rejected
> ( https://lkml.kernel.org/CA+55aFx+5R-vFQfr7+Ok9Yrs2adQ2Ma4fz+S6nCyWHY_-2mrmw@mail.gmail.com ).

Yeah and the arguments for explicit context make sense when
I reread them again.

Best Regards,
Petr


* Re: [PATCH] printk: inject caller information into the body of message
  2018-10-09 14:52                                                                                               ` Petr Mladek
@ 2018-10-09 21:19                                                                                                 ` Tetsuo Handa
  2018-10-10 10:14                                                                                                   ` Tetsuo Handa
  0 siblings, 1 reply; 94+ messages in thread
From: Tetsuo Handa @ 2018-10-09 21:19 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Dmitriy Vyukov, Linus Torvalds,
	Sergey Senozhatsky, Steven Rostedt, Alexander Potapenko,
	kbuild test robot, syzkaller, LKML, Andrew Morton

On 2018/10/09 23:52, Petr Mladek wrote:
> On Tue 2018-10-09 05:48:33, Tetsuo Handa wrote:
>> On 2018/10/09 1:03, Petr Mladek wrote:
>>> On Mon 2018-10-08 19:31:58, Tetsuo Handa wrote:
>>>>   A structure named "struct printk_buffer" is introduced for buffering
>>>>   up to LOG_LINE_MAX bytes of printk() output which did not end with '\n'.
>>>>
>>>>   A caller is allowed to allocate/free "struct printk_buffer" using
>>>>   kzalloc()/kfree() if that caller is in a location where it is possible
>>>>   to do so.
>>>>
>>>>   A macro named "DEFINE_PRINTK_BUFFER()" is defined for allocating
>>>>   "struct printk_buffer" from the stack memory or in the .bss section.
>>>>
>>>>   But since sizeof("struct printk_buffer") is nearly 1KB, it might not be
>>>>   preferable to allocate "struct printk_buffer" from the stack memory.
>>>>   In that case, a caller can use best-effort buffering mode. Two functions
>>>>   get_printk_buffer() and put_printk_buffer() are provided for that mode.
>>>>
>>>>   get_printk_buffer() tries to assign a "struct printk_buffer" from
>>>>   statically preallocated array. It returns NULL if all static
>>>>   "struct printk_buffer" are in use.
>>>>
>>>>   put_printk_buffer() flushes and releases the "struct printk_buffer".
>>>>   put_printk_buffer() must match corresponding get_printk_buffer() as with
>>>>   rcu_read_unlock() must match corresponding rcu_read_lock().
>>>
>>> One problem with this API is when it is used in more complicated code
>>> and put_printk_buffer() is not called in some path. I mean leaking.
>>> We might get out of buffers easily.
>>
>> Then, as a debugging config option for statically preallocated buffers,
>> we could record how get_printk_buffer() was called, like lockdep records
>> where a lock was taken.
> 
> Another solution might be to store some timestamp (jiffies?) into
> struct printk_buffer when a new message is added. Then we could flush
> stalled buffers in get_printk_buffer() with some warning.

I don't think it will work. What should the threshold be? It is possible that
a thread spends a very long time (many seconds for e.g. SysRq-t) between
get_printk_buffer() and put_printk_buffer(). Therefore, the threshold would
have to be very, very long. As soon as we run out of statically preallocated
buffers, we need to fall back to unbuffered printk(), well before such a
threshold elapses.

> 
> Unfortunately, it might be unsafe to put the stalled buffers.
> Well, it might be safe if there is a lock less access. I wonder
> if we could reuse the printk_safe code here.
> 
> Anyway, I would like to have a solution before we add the new
> API into the kernel. We would need it sooner or later anyway.
> And I would like to be sure that the API is sane.

If we worry about get_printk_buffer() without a corresponding put_printk_buffer(),
we will also need to worry about a "struct printk_buffer" returned by
get_printk_buffer() being shared by multiple threads by mistake. We would have to
complicate buffered_printk() with cmpxchg() & retry logic, yet the output would
still end up mixed, just as it does with a simple fallback to unbuffered printk().

Do you think that adding cmpxchg() & retry logic to this API gives a better
result than a simple fallback? A good point of this API is that buffered_printk()
does not add a new locking dependency. Showing the backtrace (by enabling a debug
kernel config option for this API) should be sufficient.



* Re: [PATCH] printk: inject caller information into the body of message
  2018-10-09 21:19                                                                                                 ` Tetsuo Handa
@ 2018-10-10 10:14                                                                                                   ` Tetsuo Handa
  2018-10-11 10:20                                                                                                     ` Tetsuo Handa
  0 siblings, 1 reply; 94+ messages in thread
From: Tetsuo Handa @ 2018-10-10 10:14 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Dmitriy Vyukov, Linus Torvalds,
	Sergey Senozhatsky, Steven Rostedt, Alexander Potapenko,
	kbuild test robot, syzkaller, LKML, Andrew Morton

On 2018/10/10 6:19, Tetsuo Handa wrote:
> Do you think that adding cmpxchg() & retry logic to this API gives a better
> result than a simple fallback? A good point of this API is that buffered_printk()
> does not add a new locking dependency. Showing the backtrace (by enabling a debug
> kernel config option for this API) should be sufficient.
> 

This is an idea for reporting the out-of-buffers event. I would add a kernel config
option to control whether that event is reported. Maybe I should offload the
reporting to a workqueue context, because reporting from get_printk_buffer()
requires get_printk_buffer() callers to be printk()-safe, and those callers
might already be in vprintk_safe()/vprintk_nmi() context even if it is possible
to call printk().
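
The "report only once, from somewhere else" idea can be sketched in user space
as follows, assuming C11 atomics in place of the kernel's cmpxchg(). The names
maybe_schedule_report(), report_holdoff_expired() and reports_queued are
hypothetical; the counter merely stands in for queueing work, and the hold-off
reset is an assumption about how repeated reports could be rate-limited.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int report_scheduled; /* 0 = idle, 1 = report pending */
static int reports_queued;          /* stands in for queueing deferred work */

/*
 * Called on the out-of-buffers path. Only the caller that flips the flag
 * from 0 to 1 queues the report; everybody else returns immediately.
 */
static bool maybe_schedule_report(void)
{
	int expected = 0;

	if (!atomic_compare_exchange_strong(&report_scheduled, &expected, 1))
		return false;
	reports_queued++;
	return true;
}

/* Called when the (hypothetical) hold-off period after a report ends. */
static void report_holdoff_expired(void)
{
	/* The hold-off is only armed after a report was scheduled. */
	assert(atomic_load(&report_scheduled) == 1);
	atomic_store(&report_scheduled, 0);
}
```

The caller on the hot path does one atomic operation and never prints by
itself, so it does not have to be printk()-safe for the report to happen.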

 include/linux/printk.h |  71 +++++++++++++++++
 kernel/printk/printk.c | 201 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 272 insertions(+)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index cf3eccf..bcccf1f 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -173,6 +173,36 @@ int printk_emit(int facility, int level,
 
 asmlinkage __printf(1, 2) __cold
 int printk(const char *fmt, ...);
+/*
+ * A structure for line-buffering printk() output.
+ */
+struct printk_buffer {
+	unsigned short int used; /* Valid bytes in buf[]. */
+	char buf[1024 - 32]; /* This is LOG_LINE_MAX bytes. */
+};
+/*
+ * A macro for allowing "struct printk_buffer" on stack or in .bss section.
+ *
+ * Use this macro for on-stack allocation only when you are sure that stack
+ * usage is never tight at that location, because e.g. an interrupt might
+ * consume additional stack there. Use this macro for allocation in the .bss
+ * section only when you are sure that access to the variable is
+ * appropriately serialized, because concurrent access to it can lead to
+ * memory corruption.
+ *
+ * If you are not sure, you should use get_printk_buffer()/put_printk_buffer()
+ * instead. You don't need to check for get_printk_buffer() == NULL, for
+ * buffered_printk()/buffered_vprintk() will fall back to printk()/vprintk()
+ * in that case.
+ */
+#define DEFINE_PRINTK_BUFFER(name) struct printk_buffer name = { }
+struct printk_buffer *get_printk_buffer(void);
+void flush_printk_buffer(struct printk_buffer *ptr);
+__printf(2, 3)
+int buffered_printk(struct printk_buffer *ptr, const char *fmt, ...);
+__printf(2, 0)
+int buffered_vprintk(struct printk_buffer *ptr, const char *fmt, va_list args);
+void put_printk_buffer(struct printk_buffer *ptr);
 
 /*
  * Special printk facility for scheduler/timekeeping use only, _DO_NOT_USE_ !
@@ -220,6 +250,30 @@ int printk(const char *s, ...)
 {
 	return 0;
 }
+struct printk_buffer {
+	char dummy; /* Not used. */
+};
+#define DEFINE_PRINTK_BUFFER(name) struct printk_buffer name
+static inline struct printk_buffer *get_printk_buffer(void)
+{
+	return NULL;
+}
+static inline __printf(2, 3)
+int buffered_printk(struct printk_buffer *ptr, const char *fmt, ...)
+{
+	return 0;
+}
+static inline __printf(2, 0)
+int buffered_vprintk(struct printk_buffer *ptr, const char *fmt, va_list args)
+{
+	return 0;
+}
+static inline void flush_printk_buffer(struct printk_buffer *ptr)
+{
+}
+static inline void put_printk_buffer(struct printk_buffer *ptr)
+{
+}
 static inline __printf(1, 2) __cold
 int printk_deferred(const char *s, ...)
 {
@@ -300,19 +354,34 @@ static inline void printk_safe_flush_on_panic(void)
  */
 #define pr_emerg(fmt, ...) \
 	printk(KERN_EMERG pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_emerg(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_EMERG pr_fmt(fmt), ##__VA_ARGS__)
 #define pr_alert(fmt, ...) \
 	printk(KERN_ALERT pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_alert(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_ALERT pr_fmt(fmt), ##__VA_ARGS__)
 #define pr_crit(fmt, ...) \
 	printk(KERN_CRIT pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_crit(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_CRIT pr_fmt(fmt), ##__VA_ARGS__)
 #define pr_err(fmt, ...) \
 	printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_err(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
 #define pr_warning(fmt, ...) \
 	printk(KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_warning(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__)
 #define pr_warn pr_warning
+#define bpr_warn bpr_warning
 #define pr_notice(fmt, ...) \
 	printk(KERN_NOTICE pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_notice(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_NOTICE pr_fmt(fmt), ##__VA_ARGS__)
 #define pr_info(fmt, ...) \
 	printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_info(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
 /*
  * Like KERN_CONT, pr_cont() should only be used when continuing
  * a line with no newline ('\n') enclosed. Otherwise it defaults
@@ -320,6 +389,8 @@ static inline void printk_safe_flush_on_panic(void)
  */
 #define pr_cont(fmt, ...) \
 	printk(KERN_CONT fmt, ##__VA_ARGS__)
+#define bpr_cont(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_CONT fmt, ##__VA_ARGS__)
 
 /* pr_devel() should produce zero code unless DEBUG is defined */
 #ifdef DEBUG
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 9bf5404..453db95 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1949,6 +1949,207 @@ asmlinkage int printk_emit(int facility, int level,
 }
 EXPORT_SYMBOL(printk_emit);
 
+#define MAX_PRINTK_BUFFERS 16
+static struct printk_buffer printk_buffers[MAX_PRINTK_BUFFERS];
+static DECLARE_BITMAP(printk_buffers_in_use, MAX_PRINTK_BUFFERS);
+
+/**
+ * get_printk_buffer - Try to get printk_buffer.
+ *
+ * Returns pointer to "struct printk_buffer" on success, NULL otherwise.
+ *
+ * If this function returned "struct printk_buffer", the caller is responsible
+ * for passing it to put_printk_buffer() so that "struct printk_buffer" can be
+ * reused in the future.
+ *
+ * Even if this function returned NULL, the caller does not need to check for
+ * NULL, for passing NULL to buffered_printk() simply acts like normal printk()
+ * and passing NULL to flush_printk_buffer()/put_printk_buffer() is a no-op.
+ */
+struct printk_buffer *get_printk_buffer(void)
+{
+#ifdef CONFIG_STACKTRACE
+	static unsigned long trace_entries[MAX_PRINTK_BUFFERS][20];
+	static struct stack_trace trace[MAX_PRINTK_BUFFERS];
+	static unsigned long stamp[MAX_PRINTK_BUFFERS];
+	static int reported;
+#endif
+	long i;
+
+	for (i = 0; i < MAX_PRINTK_BUFFERS; i++) {
+		if (test_bit(i, printk_buffers_in_use) ||
+		    test_and_set_bit(i, printk_buffers_in_use))
+			continue;
+		printk_buffers[i].used = 0;
+#ifdef CONFIG_STACKTRACE
+		if (!reported) {
+			stamp[i] = jiffies;
+			trace[i].nr_entries = 0;
+			trace[i].entries = trace_entries[i];
+			trace[i].max_entries = 20;
+			trace[i].skip = 0;
+			save_stack_trace(&trace[i]);
+		}
+#endif
+		return &printk_buffers[i];
+	}
+#ifdef CONFIG_STACKTRACE
+	if (!cmpxchg(&reported, 0, 1)) {
+		/*
+		 * Report who is reserving the buffers, for it might be due to
+		 * missing put_printk_buffer() calls.
+		 *
+		 * Note that this report is racy.
+		 * Someone might be about to call put_printk_buffer().
+		 * Someone might be about to set stamp[i] to jiffies.
+		 * Someone might have just set trace[i].nr_entries to 0.
+		 * But it is not worth introducing a lock dependency.
+		 */
+		pr_info("printk: All buffers are in use. Falling back to unbuffered mode.\n");
+		for (i = 0; i < MAX_PRINTK_BUFFERS; i++) {
+			unsigned int j;
+
+			if (!test_bit(i, printk_buffers_in_use))
+				continue;
+			pr_info("buffer[%ld] was reserved %lu jiffies ago by\n",
+				i, jiffies - stamp[i]);
+			for (j = 0; j < trace[i].nr_entries; j++)
+				pr_info("  %pS\n", (void *)trace[i].entries[j]);
+		}
+	}
+#endif
+	return NULL;
+}
+EXPORT_SYMBOL(get_printk_buffer);
+
+/**
+ * buffered_vprintk - Try to vprintk() in line buffered mode.
+ *
+ * @ptr:  Pointer to "struct printk_buffer". It can be NULL.
+ * @fmt:  printk() format string.
+ * @args: va_list structure.
+ *
+ * Returns the return value of vprintk().
+ *
+ * Try to store to @ptr first. If it fails, flush @ptr and then try to store to
+ * @ptr again. If it still fails, use unbuffered printing.
+ */
+int buffered_vprintk(struct printk_buffer *ptr, const char *fmt, va_list args)
+{
+	va_list tmp_args;
+	unsigned short int i;
+	int r;
+
+	BUILD_BUG_ON(sizeof(ptr->buf) != LOG_LINE_MAX);
+	if (!ptr)
+		goto unbuffered;
+	for (i = 0; i < 2; i++) {
+		unsigned int pos = ptr->used;
+		char *text = ptr->buf + pos;
+
+		va_copy(tmp_args, args);
+		r = vsnprintf(text, sizeof(ptr->buf) - pos, fmt, tmp_args);
+		va_end(tmp_args);
+		if (r + pos < sizeof(ptr->buf)) {
+			/*
+			 * Eliminate KERN_CONT at this point because we can
+			 * concatenate incomplete lines inside printk_buffer.
+			 */
+			if (r >= 2 && printk_get_level(text) == 'c') {
+				memmove(text, text + 2, r - 2);
+				ptr->used += r - 2;
+			} else {
+				ptr->used += r;
+			}
+			/* Flush already completed lines if any. */
+			while (1) {
+				char *cp = memchr(ptr->buf, '\n', ptr->used);
+
+				if (!cp)
+					break;
+				*cp = '\0';
+				printk("%s\n", ptr->buf);
+				i = cp - ptr->buf + 1;
+				ptr->used -= i;
+				memmove(ptr->buf, ptr->buf + i, ptr->used);
+			}
+			return r;
+		}
+		if (i)
+			break;
+		flush_printk_buffer(ptr);
+	}
+ unbuffered:
+	return vprintk(fmt, args);
+}
+
+/**
+ * buffered_printk - Try to printk() in line buffered mode.
+ *
+ * @ptr:  Pointer to "struct printk_buffer". It can be NULL.
+ * @fmt:  printk() format string, followed by arguments.
+ *
+ * Returns the return value of printk().
+ *
+ * Try to store to @ptr first. If it fails, flush @ptr and then try to store to
+ * @ptr again. If it still fails, use unbuffered printing.
+ */
+int buffered_printk(struct printk_buffer *ptr, const char *fmt, ...)
+{
+	va_list args;
+	int r;
+
+	va_start(args, fmt);
+	r = buffered_vprintk(ptr, fmt, args);
+	va_end(args);
+	return r;
+}
+EXPORT_SYMBOL(buffered_printk);
+
+/**
+ * flush_printk_buffer - Flush incomplete line in printk_buffer.
+ *
+ * @ptr: Pointer to "struct printk_buffer". It can be NULL.
+ *
+ * Returns nothing.
+ *
+ * Flush if @ptr contains partial data. But usually there is no need to call
+ * this function because @ptr is flushed by put_printk_buffer().
+ */
+void flush_printk_buffer(struct printk_buffer *ptr)
+{
+	if (!ptr || !ptr->used)
+		return;
+	/* buffered_vprintk() keeps 0 <= ptr->used < sizeof(ptr->buf) true. */
+	ptr->buf[ptr->used] = '\0';
+	printk("%s", ptr->buf);
+	ptr->used = 0;
+}
+EXPORT_SYMBOL(flush_printk_buffer);
+
+/**
+ * put_printk_buffer - Release printk_buffer.
+ *
+ * @ptr: Pointer to "struct printk_buffer". It can be NULL.
+ *
+ * Returns nothing.
+ *
+ * Flush and release @ptr.
+ */
+void put_printk_buffer(struct printk_buffer *ptr)
+{
+	long i = ptr - printk_buffers;
+
+	if (!ptr || i < 0 || i >= MAX_PRINTK_BUFFERS)
+		return;
+	if (ptr->used)
+		flush_printk_buffer(ptr);
+	/* Make sure in_use flag is cleared after setting ptr->used = 0. */
+	wmb();
+	clear_bit(i, printk_buffers_in_use);
+}
+EXPORT_SYMBOL(put_printk_buffer);
+
 int vprintk_default(const char *fmt, va_list args)
 {
 	int r;
-- 
1.8.3.1

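For readers following the patch above, the core of buffered_vprintk() (append,
emit every completed line, keep the unfinished tail, flush and retry once when
the text does not fit) can be sketched in user space as below. This is a
simplified illustration, not the kernel code: struct line_buf, lb_printf(),
lb_flush(), emit() and the 64-byte buffer are hypothetical, emit() stands in
for printk(), and KERN_CONT handling is omitted.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

struct line_buf {
	size_t used;  /* valid bytes in buf[] */
	char buf[64]; /* deliberately small; the patch uses LOG_LINE_MAX */
};

static char sink[256];  /* stands in for the kernel log */
static int sink_writes; /* number of emit() calls so far */

static void emit(const char *s)
{
	strncat(sink, s, sizeof(sink) - strlen(sink) - 1);
	sink_writes++;
}

/* Emit whatever partial line is pending. */
static void lb_flush(struct line_buf *b)
{
	if (!b->used)
		return;
	assert(b->used < sizeof(b->buf));
	b->buf[b->used] = '\0';
	emit(b->buf);
	b->used = 0;
}

static int lb_printf(struct line_buf *b, const char *fmt, ...)
{
	char tmp[256];
	va_list ap;
	int r;

	for (int attempt = 0; attempt < 2; attempt++) {
		va_start(ap, fmt);
		r = vsnprintf(b->buf + b->used, sizeof(b->buf) - b->used,
			      fmt, ap);
		va_end(ap);
		if (r >= 0 && b->used + (size_t)r < sizeof(b->buf)) {
			char *nl;

			b->used += (size_t)r;
			/* Emit every completed line; keep the tail. */
			while ((nl = memchr(b->buf, '\n', b->used)) != NULL) {
				size_t len = (size_t)(nl - b->buf) + 1;

				memcpy(tmp, b->buf, len);
				tmp[len] = '\0';
				emit(tmp);
				b->used -= len;
				memmove(b->buf, b->buf + len, b->used);
			}
			return r;
		}
		if (attempt == 0)
			lb_flush(b); /* make room, then retry once */
	}
	/* Does not fit even into an empty buffer: print unbuffered. */
	va_start(ap, fmt);
	r = vsnprintf(tmp, sizeof(tmp), fmt, ap);
	va_end(ap);
	emit(tmp);
	return r;
}
```

With this sketch, lb_printf(&b, "Hello ") followed by
lb_printf(&b, "world.\n") results in a single emit() call carrying the whole
line, which is the effect the patch is after.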


* Re: [PATCH] printk: inject caller information into the body of message
  2018-10-10 10:14                                                                                                   ` Tetsuo Handa
@ 2018-10-11 10:20                                                                                                     ` Tetsuo Handa
  2018-10-11 13:47                                                                                                       ` Steven Rostedt
  0 siblings, 1 reply; 94+ messages in thread
From: Tetsuo Handa @ 2018-10-11 10:20 UTC (permalink / raw)
  To: Petr Mladek, Sergey Senozhatsky, Sergey Senozhatsky
  Cc: Dmitriy Vyukov, Linus Torvalds, Steven Rostedt,
	Alexander Potapenko, kbuild test robot, syzkaller, LKML,
	Andrew Morton

On 2018/10/10 19:14, Tetsuo Handa wrote:
> On 2018/10/10 6:19, Tetsuo Handa wrote:
>> Do you think that adding cmpxchg() & retry logic to this API gives a better
>> result than a simple fallback? A good point of this API is that buffered_printk()
>> does not add a new locking dependency. Showing the backtrace (by enabling a debug
>> kernel config option for this API) will be sufficient.
>>
> 
> This is an idea for reporting the out-of-buffers event. I would add a kernel config
> option to control whether that event is reported. Maybe I should offload the
> reporting to a workqueue context, because reporting from get_printk_buffer()
> requires get_printk_buffer() callers to be printk()-safe, and those callers
> might already be in vprintk_safe()/vprintk_nmi() context even if it is possible
> to call printk().
> 

I dropped DEFINE_PRINTK_BUFFER() in order to hide "struct printk_buffer", because
I added reporting functionality for running out of static "struct printk_buffer"
and made the number of buffers configurable via a kernel config option. I think we
can start evaluating with only static buffers. Then, we can consider whether we
need to expose "struct printk_buffer" in order to allow allocating it from kernel
stack memory. Maybe allocating from slab memory is sufficient if someone happens to
want to reserve several "struct printk_buffer" buffers for its lifetime.

Thus, here is v4.

From a65f018d563928c7b7e4a9bec1d1a564dd8b4635 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Thu, 11 Oct 2018 14:21:22 +0900
Subject: [PATCH v4] printk: Add line-buffered printk() API.

Sometimes we want to print a whole line without being disturbed by
concurrent printk() from interrupts and/or other threads, because printk()
output which does not end with '\n' can be interleaved with other output.

Mixed printk() output is hard to interpret. Assuming that we are heading
in a direction where a context identifier can be prefixed to each line of
printk() output (so that we can group multiple lines into one block when
parsing), this patch introduces an API for line-buffered printk() output
(so that we can make sure that each printk() ends with '\n').

Since the functions introduced by this patch merely wrap
printk()/vprintk() calls in order to minimize the possibility of using
"struct cont", it is safe to replace printk()/vprintk() with this API.

Details:

  A structure named "struct printk_buffer" is introduced for buffering
  up to LOG_LINE_MAX bytes of printk() output which did not end with '\n'.

  get_printk_buffer() tries to assign a "struct printk_buffer" from a
  statically preallocated array. get_printk_buffer() returns NULL if
  all "struct printk_buffer" are in use, but the caller does not need to
  check for NULL.

  put_printk_buffer() flushes and releases the "struct printk_buffer".
  Each put_printk_buffer() must match a corresponding get_printk_buffer(),
  just as rcu_read_unlock() must match the corresponding rcu_read_lock().

  Three functions buffered_vprintk(), buffered_printk() and
  flush_printk_buffer() are provided for using "struct printk_buffer".
  These are like vfprintf(), fprintf() and fflush() except that they take
  "struct printk_buffer *" as the first argument.

  buffered_vprintk() and buffered_printk() behave like vprintk() and
  printk() respectively if the "struct printk_buffer *" argument is NULL.
  flush_printk_buffer() and put_printk_buffer() become no-ops if the
  "struct printk_buffer *" argument is NULL. Therefore, the caller of
  get_printk_buffer() does not need to check for NULL.

How to configure this API:

  For those who want to reduce the memory footprint, this API is enabled
  only if the CONFIG_PRINTK_LINE_BUFFERED option is selected.

  For those who want to tune the number of statically preallocated
  buffers, the CONFIG_PRINTK_NUM_LINE_BUFFERS option is available. The
  default value is 16. Since "struct printk_buffer" makes a difference only
  when there are multiple threads concurrently calling printk() which does
  not end with '\n', and this API falls back to normal printk() when all
  CONFIG_PRINTK_NUM_LINE_BUFFERS buffers are in use, you won't need to
  specify a large number.

  But somebody might forget to call put_printk_buffer(). For those who
  want to know why all CONFIG_PRINTK_NUM_LINE_BUFFERS buffers are in use,
  the CONFIG_PRINTK_REPORT_OUT_OF_LINE_BUFFERS option is available.
  This option reports when/where get_printk_buffer() was called for each
  buffer whose put_printk_buffer() has not been called yet, at most once
  per minute.

How to use this API:

  (1) Call get_printk_buffer() and acquire "struct printk_buffer *".

  (2) Rewrite printk() calls in the following way. The "ptr" is
      "struct printk_buffer *" obtained in step (1).

      printk(fmt, ...)     => buffered_printk(ptr, fmt, ...)
      vprintk(fmt, args)   => buffered_vprintk(ptr, fmt, args)
      pr_emerg(fmt, ...)   => bpr_emerg(ptr, fmt, ...)
      pr_alert(fmt, ...)   => bpr_alert(ptr, fmt, ...)
      pr_crit(fmt, ...)    => bpr_crit(ptr, fmt, ...)
      pr_err(fmt, ...)     => bpr_err(ptr, fmt, ...)
      pr_warning(fmt, ...) => bpr_warning(ptr, fmt, ...)
      pr_warn(fmt, ...)    => bpr_warn(ptr, fmt, ...)
      pr_notice(fmt, ...)  => bpr_notice(ptr, fmt, ...)
      pr_info(fmt, ...)    => bpr_info(ptr, fmt, ...)
      pr_cont(fmt, ...)    => bpr_cont(ptr, fmt, ...)

  (3) Release "struct printk_buffer" by calling put_printk_buffer().

Note that since "struct printk_buffer" buffers only up to one line, there
is no need to rewrite if it is known that the "struct printk_buffer" is
empty and printk() ends with '\n'.

  Good example:

    printk("Hello ");    =>  buf = get_printk_buffer();
    pr_cont("world.\n");     buffered_printk(buf, "Hello ");
                             buffered_printk(buf, "world.\n");
                             put_printk_buffer(buf);

  Pointless example:

    printk("Hello\n");   =>  buf = get_printk_buffer();
    printk("World.\n");      buffered_printk(buf, "Hello\n");
                             buffered_printk(buf, "World.\n");
                             put_printk_buffer(buf);

Note that bpr_devel() and bpr_debug() are not defined. This is
because pr_devel()/pr_debug() should not be followed by pr_cont()
because pr_devel()/pr_debug() are conditionally enabled; output from
pr_devel()/pr_debug() should always end with '\n'.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 include/linux/printk.h |  41 +++++++++
 init/Kconfig           |  31 +++++++
 kernel/printk/printk.c | 239 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 311 insertions(+)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index cf3eccf..f93d9c8 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -286,6 +286,30 @@ static inline void printk_safe_flush_on_panic(void)
 }
 #endif
 
+struct printk_buffer;
+#if defined(CONFIG_PRINTK_LINE_BUFFERED)
+struct printk_buffer *get_printk_buffer(void);
+void flush_printk_buffer(struct printk_buffer *ptr);
+__printf(2, 3)
+int buffered_printk(struct printk_buffer *ptr, const char *fmt, ...);
+__printf(2, 0)
+int buffered_vprintk(struct printk_buffer *ptr, const char *fmt, va_list args);
+void put_printk_buffer(struct printk_buffer *ptr);
+#else
+static inline struct printk_buffer *get_printk_buffer(void)
+{
+	return NULL;
+}
+static inline void flush_printk_buffer(struct printk_buffer *ptr)
+{
+}
+#define buffered_printk(ptr, fmt, ...) printk(fmt, ##__VA_ARGS__)
+#define buffered_vprintk(ptr, fmt, args) vprintk(fmt, args)
+static inline void put_printk_buffer(struct printk_buffer *ptr)
+{
+}
+#endif
+
 extern int kptr_restrict;
 
 #ifndef pr_fmt
@@ -300,19 +324,34 @@ static inline void printk_safe_flush_on_panic(void)
  */
 #define pr_emerg(fmt, ...) \
 	printk(KERN_EMERG pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_emerg(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_EMERG pr_fmt(fmt), ##__VA_ARGS__)
 #define pr_alert(fmt, ...) \
 	printk(KERN_ALERT pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_alert(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_ALERT pr_fmt(fmt), ##__VA_ARGS__)
 #define pr_crit(fmt, ...) \
 	printk(KERN_CRIT pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_crit(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_CRIT pr_fmt(fmt), ##__VA_ARGS__)
 #define pr_err(fmt, ...) \
 	printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_err(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
 #define pr_warning(fmt, ...) \
 	printk(KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_warning(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__)
 #define pr_warn pr_warning
+#define bpr_warn bpr_warning
 #define pr_notice(fmt, ...) \
 	printk(KERN_NOTICE pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_notice(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_NOTICE pr_fmt(fmt), ##__VA_ARGS__)
 #define pr_info(fmt, ...) \
 	printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_info(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
 /*
  * Like KERN_CONT, pr_cont() should only be used when continuing
  * a line with no newline ('\n') enclosed. Otherwise it defaults
@@ -320,6 +359,8 @@ static inline void printk_safe_flush_on_panic(void)
  */
 #define pr_cont(fmt, ...) \
 	printk(KERN_CONT fmt, ##__VA_ARGS__)
+#define bpr_cont(ptr, fmt, ...) \
+	buffered_printk(ptr, KERN_CONT fmt, ##__VA_ARGS__)
 
 /* pr_devel() should produce zero code unless DEBUG is defined */
 #ifdef DEBUG
diff --git a/init/Kconfig b/init/Kconfig
index 1e234e2..1fb01de 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -604,6 +604,37 @@ config PRINTK_SAFE_LOG_BUF_SHIFT
 		     13 =>   8 KB for each CPU
 		     12 =>   4 KB for each CPU
 
+config PRINTK_LINE_BUFFERED
+	bool "Allow line buffered printk()"
+	default y
+	depends on PRINTK
+	help
+	  The line buffered printk() tries to buffer printk() output up to '\n'
+	  so that incomplete lines won't be mixed when there are multiple
+	  threads concurrently calling printk() which does not end with '\n'.
+
+config PRINTK_NUM_LINE_BUFFERS
+	int "Number of buffers for line buffered printk()"
+	range 1 4096
+	default 16
+	depends on PRINTK_LINE_BUFFERED
+	help
+	  Specify the number of statically preallocated "struct printk_buffer"
+	  for line buffered printk(). You don't need to specify a large number
+	  here because "struct printk_buffer" makes a difference only when there
+	  are multiple threads concurrently calling printk() which does not end
+	  with '\n', and line buffered printk() will fall back to normal
+	  printk() when all statically preallocated "struct printk_buffer"
+	  are in use.
+
+config PRINTK_REPORT_OUT_OF_LINE_BUFFERS
+	bool "Report out of buffers for line buffered printk()"
+	default n
+	depends on PRINTK_LINE_BUFFERED && STACKTRACE
+	help
+	  Select this if you want to know who is using statically preallocated
+	  "struct printk_buffer" when no "struct printk_buffer" is available.
+
 #
 # Architectures with an unreliable sched_clock() should select this:
 #
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 9bf5404..afc8bed 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1949,6 +1949,245 @@ asmlinkage int printk_emit(int facility, int level,
 }
 EXPORT_SYMBOL(printk_emit);
 
+#ifdef CONFIG_PRINTK_LINE_BUFFERED
+/*
+ * A structure for line-buffered printk() output.
+ */
+static struct printk_buffer {
+	unsigned short int used; /* Valid bytes in buf[]. */
+	char buf[LOG_LINE_MAX];
+} printk_buffers[CONFIG_PRINTK_NUM_LINE_BUFFERS] __aligned(1024);
+static DECLARE_BITMAP(printk_buffers_in_use, CONFIG_PRINTK_NUM_LINE_BUFFERS);
+
+
+#ifdef CONFIG_PRINTK_REPORT_OUT_OF_LINE_BUFFERS
+static struct {
+	unsigned long stamp;
+	struct stack_trace trace;
+	unsigned long entries[20];
+} printk_buffers_dump[CONFIG_PRINTK_NUM_LINE_BUFFERS];
+static int buffer_users_report_scheduled;
+
+static void reset_holdoff_flag(struct timer_list *timer)
+{
+	buffer_users_report_scheduled = 0;
+}
+static DEFINE_TIMER(buffer_users_holdoff_timer, reset_holdoff_flag);
+
+static void report_buffer_users(struct work_struct *work)
+{
+	long i;
+	unsigned int j;
+
+	/*
+	 * This report is racy. But it is not worth introducing a lock
+	 * dependency.
+	 */
+	pr_info("printk: All line buffers are in use.\n");
+	for (i = 0; i < CONFIG_PRINTK_NUM_LINE_BUFFERS; i++) {
+		if (!test_bit(i, printk_buffers_in_use))
+			continue;
+		pr_info("buffer[%ld] was reserved %lu jiffies ago by\n",
+			i, jiffies - printk_buffers_dump[i].stamp);
+		for (j = 0; j < printk_buffers_dump[i].trace.nr_entries; j++)
+			pr_info("  %pS\n", (void *)
+				printk_buffers_dump[i].entries[j]);
+		cond_resched();
+	}
+	/* Wait for at least 60 seconds before reporting again. */
+	mod_timer(&buffer_users_holdoff_timer, jiffies + 60 * HZ);
+}
+#endif
+
+/**
+ * get_printk_buffer - Try to get printk_buffer.
+ *
+ * Returns pointer to "struct printk_buffer" on success, NULL otherwise.
+ *
+ * If this function returned "struct printk_buffer", the caller is responsible
+ * for passing it to put_printk_buffer() so that "struct printk_buffer" can be
+ * reused in the future.
+ *
+ * Even if this function returned NULL, the caller does not need to check for
+ * NULL, for passing NULL to buffered_printk() simply acts like normal printk()
+ * and passing NULL to flush_printk_buffer()/put_printk_buffer() is a no-op.
+ */
+struct printk_buffer *get_printk_buffer(void)
+{
+#ifdef CONFIG_PRINTK_REPORT_OUT_OF_LINE_BUFFERS
+	static DECLARE_WORK(work, report_buffer_users);
+#endif
+	long i;
+
+	for (i = 0; i < CONFIG_PRINTK_NUM_LINE_BUFFERS; i++) {
+		if (test_bit(i, printk_buffers_in_use) ||
+		    test_and_set_bit(i, printk_buffers_in_use))
+			continue;
+		printk_buffers[i].used = 0;
+#ifdef CONFIG_PRINTK_REPORT_OUT_OF_LINE_BUFFERS
+		printk_buffers_dump[i].stamp = jiffies;
+		printk_buffers_dump[i].trace.nr_entries = 0;
+		printk_buffers_dump[i].trace.entries =
+			printk_buffers_dump[i].entries;
+		printk_buffers_dump[i].trace.max_entries = 20;
+		printk_buffers_dump[i].trace.skip = 0;
+		save_stack_trace(&printk_buffers_dump[i].trace);
+#endif
+		return &printk_buffers[i];
+	}
+#ifdef CONFIG_PRINTK_REPORT_OUT_OF_LINE_BUFFERS
+	/*
+	 * Oops, we ran out of "struct printk_buffer". Fall back to normal
+	 * printk(). You might notice it by partial lines being printed.
+	 *
+	 * If you think that it might be due to missing put_printk_buffer()
+	 * calls, you can enable CONFIG_PRINTK_REPORT_OUT_OF_LINE_BUFFERS.
+	 * Then, who is using the buffers will be reported (from workqueue
+	 * context because reporting CONFIG_PRINTK_NUM_LINE_BUFFERS entries
+	 * from atomic context might be too slow). If it does not look like
+	 * missing put_printk_buffer() calls, you might want to increase
+	 * CONFIG_PRINTK_NUM_LINE_BUFFERS.
+	 *
+	 * But if it turns out that allocating "struct printk_buffer" on stack
+	 * or in .bss section or from kzalloc() is more suitable than tuning
+	 * CONFIG_PRINTK_NUM_LINE_BUFFERS, we can update to do so.
+	 */
+	if (!in_nmi() && !cmpxchg(&buffer_users_report_scheduled, 0, 1))
+		queue_work(system_unbound_wq, &work);
+#endif
+	return NULL;
+}
+EXPORT_SYMBOL(get_printk_buffer);
+
+/**
+ * buffered_vprintk - Try to vprintk() in line buffered mode.
+ *
+ * @ptr:  Pointer to "struct printk_buffer". It can be NULL.
+ * @fmt:  printk() format string.
+ * @args: va_list structure.
+ *
+ * Returns the return value of vprintk().
+ *
+ * Try to store to @ptr first. If it fails, flush @ptr and then try to store to
+ * @ptr again. If it still fails, use unbuffered printing.
+ */
+int buffered_vprintk(struct printk_buffer *ptr, const char *fmt, va_list args)
+{
+	va_list tmp_args;
+	unsigned short int i;
+	int r;
+
+	if (!ptr)
+		goto unbuffered;
+	for (i = 0; i < 2; i++) {
+		unsigned int pos = ptr->used;
+		char *text = ptr->buf + pos;
+
+		va_copy(tmp_args, args);
+		r = vsnprintf(text, sizeof(ptr->buf) - pos, fmt, tmp_args);
+		va_end(tmp_args);
+		if (r + pos < sizeof(ptr->buf)) {
+			/*
+			 * Eliminate KERN_CONT at this point because we can
+			 * concatenate incomplete lines inside printk_buffer.
+			 */
+			if (r >= 2 && printk_get_level(text) == 'c') {
+				memmove(text, text + 2, r - 2);
+				ptr->used += r - 2;
+			} else {
+				ptr->used += r;
+			}
+			/* Flush already completed lines if any. */
+			while (1) {
+				char *cp = memchr(ptr->buf, '\n', ptr->used);
+
+				if (!cp)
+					break;
+				*cp = '\0';
+				printk("%s\n", ptr->buf);
+				i = cp - ptr->buf + 1;
+				ptr->used -= i;
+				memmove(ptr->buf, ptr->buf + i, ptr->used);
+			}
+			return r;
+		}
+		if (i)
+			break;
+		flush_printk_buffer(ptr);
+	}
+ unbuffered:
+	return vprintk(fmt, args);
+}
+
+/**
+ * buffered_printk - Try to printk() in line buffered mode.
+ *
+ * @ptr:  Pointer to "struct printk_buffer". It can be NULL.
+ * @fmt:  printk() format string, followed by arguments.
+ *
+ * Returns the return value of printk().
+ *
+ * Try to store to @ptr first. If it fails, flush @ptr and then try to store to
+ * @ptr again. If it still fails, use unbuffered printing.
+ */
+int buffered_printk(struct printk_buffer *ptr, const char *fmt, ...)
+{
+	va_list args;
+	int r;
+
+	va_start(args, fmt);
+	r = buffered_vprintk(ptr, fmt, args);
+	va_end(args);
+	return r;
+}
+EXPORT_SYMBOL(buffered_printk);
+
+/**
+ * flush_printk_buffer - Flush incomplete line in printk_buffer.
+ *
+ * @ptr: Pointer to "struct printk_buffer". It can be NULL.
+ *
+ * Returns nothing.
+ *
+ * Flush if @ptr contains partial data. But usually there is no need to call
+ * this function because @ptr is flushed by put_printk_buffer().
+ */
+void flush_printk_buffer(struct printk_buffer *ptr)
+{
+	if (!ptr || !ptr->used)
+		return;
+	/* buffered_vprintk() keeps 0 <= ptr->used < sizeof(ptr->buf) true. */
+	ptr->buf[ptr->used] = '\0';
+	printk("%s", ptr->buf);
+	ptr->used = 0;
+}
+EXPORT_SYMBOL(flush_printk_buffer);
+
+/**
+ * put_printk_buffer - Release printk_buffer.
+ *
+ * @ptr: Pointer to "struct printk_buffer". It can be NULL.
+ *
+ * Returns nothing.
+ *
+ * Flush and release @ptr.
+ */
+void put_printk_buffer(struct printk_buffer *ptr)
+{
+	long i = ptr - printk_buffers;
+
+	if (!ptr || i < 0 || i >= CONFIG_PRINTK_NUM_LINE_BUFFERS)
+		return;
+	if (ptr->used)
+		flush_printk_buffer(ptr);
+	/* Make sure in_use flag is cleared after setting ptr->used = 0. */
+	wmb();
+	clear_bit(i, printk_buffers_in_use);
+}
+EXPORT_SYMBOL(put_printk_buffer);
+
+#endif
+
 int vprintk_default(const char *fmt, va_list args)
 {
 	int r;
-- 
1.8.3.1

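As an illustration of the "Eliminate KERN_CONT" step in buffered_vprintk()
above: a log-level prefix is two bytes, an SOH character ('\001') followed by
the level character, and KERN_CONT uses 'c' as that character. A minimal
user-space sketch of dropping such a prefix in place could look like this.
strip_cont_prefix() and the SOH macro are hypothetical names for this example
only; the patch itself uses printk_get_level() for the check.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SOH '\001' /* first byte of every KERN_<LEVEL> prefix */

/*
 * If text[0..len) starts with the KERN_CONT marker ("\001" "c"), remove the
 * two marker bytes in place and return the shortened length. Otherwise
 * return len unchanged.
 */
static size_t strip_cont_prefix(char *text, size_t len)
{
	assert(text != NULL);
	if (len >= 2 && text[0] == SOH && text[1] == 'c') {
		memmove(text, text + 2, len - 2);
		len -= 2;
	}
	return len;
}
```

Once the marker is gone, the fragment can simply be appended to whatever is
already buffered, so the continuation never has to go through "struct cont".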


* Re: [PATCH] printk: inject caller information into the body of message
  2018-10-11 10:20                                                                                                     ` Tetsuo Handa
@ 2018-10-11 13:47                                                                                                       ` Steven Rostedt
  0 siblings, 0 replies; 94+ messages in thread
From: Steven Rostedt @ 2018-10-11 13:47 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Petr Mladek, Sergey Senozhatsky, Sergey Senozhatsky,
	Dmitriy Vyukov, Linus Torvalds, Alexander Potapenko,
	kbuild test robot, syzkaller, LKML, Andrew Morton

On Thu, 11 Oct 2018 19:20:34 +0900
Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> wrote:

> Thus, here is v4.
> 
> From a65f018d563928c7b7e4a9bec1d1a564dd8b4635 Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Date: Thu, 11 Oct 2018 14:21:22 +0900
> Subject: [PATCH v4] printk: Add line-buffered printk() API.

Hi Tetsuo,

Can you resend this as a separate patch starting a new thread? Having
v4 hidden deep in another thread makes it hard to find patches like
these searching mail archives.

If you want to reference back to this discussion, just add in the v4
change log:

Link: https://lore.kernel.org/lkml/201805112058.AAB05258.HJQFFOMFOVtOSL@I-love.SAKURA.ne.jp/

Thanks!

-- Steve



Thread overview: 94+ messages
     [not found] <201804232233.CIC65675.OJSOMFQOFFHVtL@I-love.SAKURA.ne.jp>
     [not found] ` <CACT4Y+boyw_Qy=y-iTnsKZrtTgF0Hk3nHN_xtqUdX4etgiYDQw@mail.gmail.com>
2018-04-24  1:33   ` printk feature for syzbot? Sergey Senozhatsky
2018-04-24 14:40     ` Steven Rostedt
2018-04-26 10:06     ` Petr Mladek
2018-05-10  4:22       ` Sergey Senozhatsky
2018-05-10 11:30         ` Petr Mladek
2018-05-10 12:11           ` Sergey Senozhatsky
2018-05-10 14:22             ` Steven Rostedt
2018-05-10 14:50         ` Tetsuo Handa
2018-05-11  1:45           ` Sergey Senozhatsky
     [not found]             ` <201805110238.w4B2cIGH079602@www262.sakura.ne.jp>
2018-05-11  6:21               ` Sergey Senozhatsky
2018-05-11  9:17                 ` Dmitry Vyukov
2018-05-11  9:50                   ` Sergey Senozhatsky
2018-05-11 11:58                     ` [PATCH] printk: inject caller information into the body of message Tetsuo Handa
2018-05-17 11:21                       ` Sergey Senozhatsky
2018-05-17 11:52                         ` Sergey Senozhatsky
2018-05-18 12:15                         ` Petr Mladek
2018-05-18 12:25                           ` Dmitry Vyukov
2018-05-18 12:54                             ` Petr Mladek
2018-05-18 13:08                               ` Dmitry Vyukov
2018-05-24  2:21                                 ` Sergey Senozhatsky
2018-05-23 10:19                           ` Tetsuo Handa
2018-05-24  2:14                           ` Sergey Senozhatsky
2018-05-26  6:36                             ` Dmitry Vyukov
2018-06-20  5:44                               ` Dmitry Vyukov
2018-06-20  8:31                                 ` Sergey Senozhatsky
2018-06-20  8:45                                   ` Dmitry Vyukov
2018-06-20  9:06                                     ` Sergey Senozhatsky
2018-06-20  9:18                                       ` Sergey Senozhatsky
2018-06-20  9:31                                         ` Dmitry Vyukov
2018-06-20 11:07                                           ` Sergey Senozhatsky
2018-06-20 11:32                                             ` Dmitry Vyukov
2018-06-20 13:06                                               ` Sergey Senozhatsky
2018-06-22 13:06                                                 ` Tetsuo Handa
2018-06-25  1:41                                                   ` Sergey Senozhatsky
2018-06-25  9:36                                                     ` Dmitry Vyukov
2018-06-27 10:29                                                       ` Tetsuo Handa
2018-09-10 11:20                                                 ` Alexander Potapenko
2018-09-12  6:53                                                   ` Sergey Senozhatsky
2018-09-12 16:05                                                     ` Steven Rostedt
2018-09-13  7:12                                                       ` Sergey Senozhatsky
2018-09-13 12:26                                                         ` Petr Mladek
2018-09-13 14:28                                                           ` Sergey Senozhatsky
2018-09-14  1:22                                                             ` Steven Rostedt
2018-09-14  2:15                                                               ` Sergey Senozhatsky
2018-09-14  6:57                                                             ` Sergey Senozhatsky
2018-09-14 10:37                                                               ` Tetsuo Handa
2018-09-14 11:50                                                                 ` Sergey Senozhatsky
2018-09-14 12:03                                                                   ` Tetsuo Handa
2018-09-14 12:22                                                                     ` Sergey Senozhatsky
2018-09-19 11:02                                                                       ` Tetsuo Handa
2018-09-24  8:11                                                                         ` Tetsuo Handa
2018-09-27 16:10                                                                           ` Tetsuo Handa
2018-09-28  9:02                                                                             ` Sergey Senozhatsky
2018-09-28  9:09                                                                           ` Sergey Senozhatsky
2018-09-28 11:01                                                                             ` Tetsuo Handa
2018-09-29 10:51                                                                               ` Sergey Senozhatsky
2018-09-29 11:15                                                                                 ` Tetsuo Handa
2018-10-01  2:37                                                                                   ` Sergey Senozhatsky
2018-10-01  2:58                                                                                     ` Sergey Senozhatsky
2018-10-01 11:21                                                                                     ` Tetsuo Handa
2018-10-02  6:38                                                                                       ` Sergey Senozhatsky
2018-10-08 10:31                                                                                         ` Tetsuo Handa
2018-10-08 16:03                                                                                           ` Petr Mladek
2018-10-08 20:48                                                                                             ` Tetsuo Handa
2018-10-09 14:52                                                                                               ` Petr Mladek
2018-10-09 21:19                                                                                                 ` Tetsuo Handa
2018-10-10 10:14                                                                                                   ` Tetsuo Handa
2018-10-11 10:20                                                                                                     ` Tetsuo Handa
2018-10-11 13:47                                                                                                       ` Steven Rostedt
2018-10-08 15:43                                                                                         ` Petr Mladek
2018-09-28  8:56                                                                         ` Sergey Senozhatsky
2018-09-28 11:21                                                                           ` Tetsuo Handa
2018-09-29 11:13                                                                             ` Sergey Senozhatsky
2018-09-29 11:39                                                                               ` Tetsuo Handa
2018-10-01  5:52                                                                               ` Sergey Senozhatsky
2018-10-01  8:37                                                                                 ` Sergey Senozhatsky
2018-10-01 18:06                                                                               ` Steven Rostedt
2018-09-14  1:12                                                         ` Steven Rostedt
2018-09-14  1:55                                                           ` Sergey Senozhatsky
2018-06-21  8:29                                               ` Sergey Senozhatsky
2018-06-20  9:30                                       ` Dmitry Vyukov
2018-06-20 11:19                                         ` Sergey Senozhatsky
2018-06-20 11:25                                           ` Dmitry Vyukov
2018-06-20 11:37                                         ` Fengguang Wu
2018-06-20 12:31                                           ` Dmitry Vyukov
2018-06-20 12:41                                             ` Fengguang Wu
2018-06-20 12:45                                               ` Dmitry Vyukov
2018-06-20 12:48                                                 ` Fengguang Wu
2018-05-11 13:37                     ` printk feature for syzbot? Steven Rostedt
2018-05-15  5:20                       ` Sergey Senozhatsky
2018-05-15 14:39                         ` Steven Rostedt
2018-05-11 11:02                 ` [PATCH] printk: fix possible reuse of va_list variable Tetsuo Handa
2018-05-11 11:27                   ` Sergey Senozhatsky
2018-05-17 11:57                   ` Petr Mladek
