* [RFC][PATCH] printk: increase devkmsg write() ratelimit @ 2018-12-18 9:18 Sergey Senozhatsky 2018-12-18 10:48 ` Peter Zijlstra 2018-12-18 14:02 ` Sergey Senozhatsky 0 siblings, 2 replies; 22+ messages in thread From: Sergey Senozhatsky @ 2018-12-18 9:18 UTC (permalink / raw) To: Petr Mladek, Steven Rostedt Cc: Borislav Petkov, Peter Zijlstra, Andrew Morton, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky Hello, RFC A painful subject: I just noticed that stock systemd (no systemd debugging enabled) on my x86 box write()-s during shutdown to devkmsg more than before, so old devkmsg ratelimits do not apply: $ sudo journalctl -n 40000 -f | grep "kernel: printk: systemd-shutdow" kernel: printk: systemd-shutdow: 35 output lines suppressed due to ratelimiting kernel: printk: systemd-shutdow: 31 output lines suppressed due to ratelimiting kernel: printk: systemd-shutdow: 35 output lines suppressed due to ratelimiting kernel: printk: systemd-shutdow: 36 output lines suppressed due to ratelimiting kernel: printk: systemd-shutdow: 36 output lines suppressed due to ratelimiting kernel: printk: systemd-shutdow: 36 output lines suppressed due to ratelimiting kernel: printk: systemd-shutdow: 36 output lines suppressed due to ratelimiting kernel: printk: systemd-shutdow: 35 output lines suppressed due to ratelimiting I know that there is a "kernel.printk_devkmsg" interface; do we expect every systemd-enabled distro to find that out and to tweak kernel.printk_devkmsg or shall we change the default devkmsg ratelimit instead? ===================================================================== Every time I reboot my systemd x86_64 box: kernel: printk: systemd-shutdow: 36 output lines suppressed due to ratelimiting Apparently, the default ratelimit - 10 messages max, 5 * HZ interval - is a bit too restrictive these days, can't help it; so increase it to "50 messages, same interval" (which is not awfully a lot). Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> --- kernel/printk/printk.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 84e54250840a..0ed54f90f208 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -998,7 +998,8 @@ static int devkmsg_open(struct inode *inode, struct file *file) if (!user) return -ENOMEM; - ratelimit_default_init(&user->rs); + ratelimit_state_init(&user->rs, DEFAULT_RATELIMIT_INTERVAL, + 5 * DEFAULT_RATELIMIT_BURST); ratelimit_set_flags(&user->rs, RATELIMIT_MSG_ON_RELEASE); mutex_init(&user->lock); -- 2.20.1 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [RFC][PATCH] printk: increase devkmsg write() ratelimit 2018-12-18 9:18 [RFC][PATCH] printk: increase devkmsg write() ratelimit Sergey Senozhatsky @ 2018-12-18 10:48 ` Peter Zijlstra 2018-12-18 11:17 ` Sergey Senozhatsky 2018-12-18 14:02 ` Sergey Senozhatsky 1 sibling, 1 reply; 22+ messages in thread From: Peter Zijlstra @ 2018-12-18 10:48 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Petr Mladek, Steven Rostedt, Borislav Petkov, Andrew Morton, linux-kernel, Sergey Senozhatsky On Tue, Dec 18, 2018 at 06:18:42PM +0900, Sergey Senozhatsky wrote: > Hello, > > RFC > > A painful subject: > > I just noticed that stock systemd (no systemd debugging enabled) on my > x86 box write()-s during shutdown to devkmsg more than before, so old > devkmsg ratelimits do not apply: > > $ sudo journalctl -n 40000 -f | grep "kernel: printk: systemd-shutdow" > kernel: printk: systemd-shutdow: 35 output lines suppressed due to ratelimiting > kernel: printk: systemd-shutdow: 31 output lines suppressed due to ratelimiting > kernel: printk: systemd-shutdow: 35 output lines suppressed due to ratelimiting > kernel: printk: systemd-shutdow: 36 output lines suppressed due to ratelimiting > kernel: printk: systemd-shutdow: 36 output lines suppressed due to ratelimiting > kernel: printk: systemd-shutdow: 36 output lines suppressed due to ratelimiting > kernel: printk: systemd-shutdow: 36 output lines suppressed due to ratelimiting > kernel: printk: systemd-shutdow: 35 output lines suppressed due to ratelimiting > > I know that there is a "kernel.printk_devkmsg" interface; do we > expect every systemd-enabled distro to find that out and to tweak > kernel.printk_devkmsg or shall we change the default devkmsg > ratelimit instead? How about we complain to systemd instead? ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC][PATCH] printk: increase devkmsg write() ratelimit 2018-12-18 10:48 ` Peter Zijlstra @ 2018-12-18 11:17 ` Sergey Senozhatsky 2018-12-18 11:29 ` Peter Zijlstra 2018-12-18 11:47 ` Borislav Petkov 0 siblings, 2 replies; 22+ messages in thread From: Sergey Senozhatsky @ 2018-12-18 11:17 UTC (permalink / raw) To: Peter Zijlstra Cc: Sergey Senozhatsky, Petr Mladek, Steven Rostedt, Borislav Petkov, Andrew Morton, linux-kernel, Sergey Senozhatsky On (12/18/18 11:48), Peter Zijlstra wrote: > > I know that there is a "kernel.printk_devkmsg" interface; do we > > expect every systemd-enabled distro to find that out and to tweak > > kernel.printk_devkmsg or shall we change the default devkmsg > > ratelimit instead? > > How about we complain to systemd instead? We certainly can. As far as I understand, they log shutdown events (including errors and warnings): what they kill, what they stop, what they umount, etc. The more partitions, services are running (I guess), the more things they need to umount, kill, stop; hence, the more messages. I kinda can imagine what they will answer ;) The below (and a bunch of other) messages are getting ratelimited. I'm not sure what will happen should any of those steps fail and print warning-s. My guess would be that we probably can ratelimit those warnings: ... systemd[1]: Unmounting /home... systemd[1]: Unmounting Temporary Directory (/tmp)... systemd[1]: Unmounted Temporary Directory (/tmp). systemd[1]: Stopped target Swap. systemd[1]: Unmounted /boot. systemd[1]: Stopped File System Check on /dev/disk/by-uuid/a0737dff-e797-44f0-aea7-d0df1107ff63. systemd[1]: Stopped File System Check on /dev/disk/by-uuid/5d773b72-e200-4d11-a219-176d62a16d8d. systemd[1]: Unmounted /home. systemd[1]: Stopped File System Check on /dev/disk/by-uuid/35319ddc-9b92-4ab0-aaa4-9922db636a5e. systemd[1]: Unmounted /media/edev. systemd[1]: Stopped File System Check on /dev/disk/by-uuid/da00daaf-5601-4531-912e-bd69103b379d. systemd[1]: Unmounted /media/dump. systemd[1]: Reached target Unmount All Filesystems. systemd[1]: Stopped File System Check on /dev/disk/by-uuid/b52da2df-161b-4c33-b700-277d95b9672f. systemd[1]: Removed slice system-systemd\x2dfsck.slice. systemd[1]: Stopped target Local File Systems (Pre). systemd[1]: Stopped Create Static Device Nodes in /dev. systemd[1]: Stopped Create System Users. systemd[1]: Stopped Remount Root and Kernel File Systems. systemd[1]: Reached target Shutdown. systemd[1]: Reached target Final Step. systemd[1]: Starting Reboot... -ss ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC][PATCH] printk: increase devkmsg write() ratelimit 2018-12-18 11:17 ` Sergey Senozhatsky @ 2018-12-18 11:29 ` Peter Zijlstra 2018-12-18 13:09 ` Sergey Senozhatsky 2018-12-18 11:47 ` Borislav Petkov 1 sibling, 1 reply; 22+ messages in thread From: Peter Zijlstra @ 2018-12-18 11:29 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Petr Mladek, Steven Rostedt, Borislav Petkov, Andrew Morton, linux-kernel, Sergey Senozhatsky On Tue, Dec 18, 2018 at 08:17:01PM +0900, Sergey Senozhatsky wrote: > On (12/18/18 11:48), Peter Zijlstra wrote: > > > I know that there is a "kernel.printk_devkmsg" interface; do we > > > expect every systemd-enabled distro to find that out and to tweak > > > kernel.printk_devkmsg or shall we change the default devkmsg > > > ratelimit instead? > > > > How about we complain to systemd instead? > > We certainly can. As far as I understand, they log shutdown events > (including errors and warnings): what they kill, what they stop, > what they umount, etc. The more partitions, services are running > (I guess), the more things they need to umount, kill, stop; hence, > the more messages. I kinda can imagine what they will answer ;) > > The below (and a bunch of other) messages are getting ratelimited. > I'm not sure what will happen should any of those steps fail and > print warning-s. My guess would be that we probably can ratelimit > those warnings: Why are they writing it to /dev/kmsg though? Shouldn't they be writing this to syslog directly? ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC][PATCH] printk: increase devkmsg write() ratelimit 2018-12-18 11:29 ` Peter Zijlstra @ 2018-12-18 13:09 ` Sergey Senozhatsky 0 siblings, 0 replies; 22+ messages in thread From: Sergey Senozhatsky @ 2018-12-18 13:09 UTC (permalink / raw) To: Peter Zijlstra Cc: Sergey Senozhatsky, Petr Mladek, Steven Rostedt, Borislav Petkov, Andrew Morton, linux-kernel, Sergey Senozhatsky On (12/18/18 12:29), Peter Zijlstra wrote: > > We certainly can. As far as I understand, they log shutdown events > > (including errors and warnings): what they kill, what they stop, > > what they umount, etc. The more partitions, services are running > > (I guess), the more things they need to umount, kill, stop; hence, > > the more messages. I kinda can imagine what they will answer ;) > > > > The below (and a bunch of other) messages are getting ratelimited. > > I'm not sure what will happen should any of those steps fail and > > print warning-s. My guess would be that we probably can ratelimit > > those warnings: > > Why are they writing it to /dev/kmsg though? Shouldn't they be writing > this to syslog directly? Can't really answer for systemd developers. To make those messages appear on the serial/net console? -ss ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC][PATCH] printk: increase devkmsg write() ratelimit 2018-12-18 11:17 ` Sergey Senozhatsky 2018-12-18 11:29 ` Peter Zijlstra @ 2018-12-18 11:47 ` Borislav Petkov 2018-12-18 13:07 ` Sergey Senozhatsky 1 sibling, 1 reply; 22+ messages in thread From: Borislav Petkov @ 2018-12-18 11:47 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Peter Zijlstra, Petr Mladek, Steven Rostedt, Andrew Morton, linux-kernel, Sergey Senozhatsky On Tue, Dec 18, 2018 at 08:17:01PM +0900, Sergey Senozhatsky wrote: > On (12/18/18 11:48), Peter Zijlstra wrote: > > > I know that there is a "kernel.printk_devkmsg" interface; do we > > > expect every systemd-enabled distro to find that out and to tweak > > > kernel.printk_devkmsg or shall we change the default devkmsg > > > ratelimit instead? > > > > How about we complain to systemd instead? > > We certainly can. As far as I understand, they log shutdown events > (including errors and warnings): what they kill, what they stop, > what they umount, etc. The more partitions, services are running > (I guess), the more things they need to umount, kill, stop; hence, > the more messages. I kinda can imagine what they will answer ;) > > The below (and a bunch of other) messages are getting ratelimited. > I'm not sure what will happen should any of those steps fail and > print warning-s. My guess would be that we probably can ratelimit > those warnings: > > ... > systemd[1]: Unmounting /home... > systemd[1]: Unmounting Temporary Directory (/tmp)... > systemd[1]: Unmounted Temporary Directory (/tmp). > systemd[1]: Stopped target Swap. > systemd[1]: Unmounted /boot. > systemd[1]: Stopped File System Check on /dev/disk/by-uuid/a0737dff-e797-44f0-aea7-d0df1107ff63. > systemd[1]: Stopped File System Check on /dev/disk/by-uuid/5d773b72-e200-4d11-a219-176d62a16d8d. > systemd[1]: Unmounted /home. > systemd[1]: Stopped File System Check on /dev/disk/by-uuid/35319ddc-9b92-4ab0-aaa4-9922db636a5e. > systemd[1]: Unmounted /media/edev. > systemd[1]: Stopped File System Check on /dev/disk/by-uuid/da00daaf-5601-4531-912e-bd69103b379d. > systemd[1]: Unmounted /media/dump. > systemd[1]: Reached target Unmount All Filesystems. > systemd[1]: Stopped File System Check on /dev/disk/by-uuid/b52da2df-161b-4c33-b700-277d95b9672f. > systemd[1]: Removed slice system-systemd\x2dfsck.slice. > systemd[1]: Stopped target Local File Systems (Pre). > systemd[1]: Stopped Create Static Device Nodes in /dev. > systemd[1]: Stopped Create System Users. > systemd[1]: Stopped Remount Root and Kernel File Systems. > systemd[1]: Reached target Shutdown. > systemd[1]: Reached target Final Step. > systemd[1]: Starting Reboot... And we wanna see all those "very useful" messages in dmesg because...? -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC][PATCH] printk: increase devkmsg write() ratelimit 2018-12-18 11:47 ` Borislav Petkov @ 2018-12-18 13:07 ` Sergey Senozhatsky 2018-12-18 14:26 ` Borislav Petkov 0 siblings, 1 reply; 22+ messages in thread From: Sergey Senozhatsky @ 2018-12-18 13:07 UTC (permalink / raw) To: Borislav Petkov Cc: Sergey Senozhatsky, Peter Zijlstra, Petr Mladek, Steven Rostedt, Andrew Morton, linux-kernel, Sergey Senozhatsky On (12/18/18 12:47), Borislav Petkov wrote: [..] > > systemd[1]: Unmounting /home... > > systemd[1]: Unmounting Temporary Directory (/tmp)... > > systemd[1]: Unmounted Temporary Directory (/tmp). > > systemd[1]: Stopped target Swap. > > systemd[1]: Unmounted /boot. > > systemd[1]: Stopped File System Check on /dev/disk/by-uuid/a0737dff-e797-44f0-aea7-d0df1107ff63. > > systemd[1]: Stopped File System Check on /dev/disk/by-uuid/5d773b72-e200-4d11-a219-176d62a16d8d. > > systemd[1]: Unmounted /home. > > systemd[1]: Stopped File System Check on /dev/disk/by-uuid/35319ddc-9b92-4ab0-aaa4-9922db636a5e. > > systemd[1]: Unmounted /media/edev. > > systemd[1]: Stopped File System Check on /dev/disk/by-uuid/da00daaf-5601-4531-912e-bd69103b379d. > > systemd[1]: Unmounted /media/dump. > > systemd[1]: Reached target Unmount All Filesystems. > > systemd[1]: Stopped File System Check on /dev/disk/by-uuid/b52da2df-161b-4c33-b700-277d95b9672f. > > systemd[1]: Removed slice system-systemd\x2dfsck.slice. > > systemd[1]: Stopped target Local File Systems (Pre). > > systemd[1]: Stopped Create Static Device Nodes in /dev. > > systemd[1]: Stopped Create System Users. > > systemd[1]: Stopped Remount Root and Kernel File Systems. > > systemd[1]: Reached target Shutdown. > > systemd[1]: Reached target Final Step. > > systemd[1]: Starting Reboot... > > And we wanna see all those "very useful" messages in dmesg because...? This is right before shutdown. dmesg is not really available anymore. I can only guess why do they write it to devkmsg - to make it appear on the serial/net console, perhaps. Dunno. -ss ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC][PATCH] printk: increase devkmsg write() ratelimit 2018-12-18 13:07 ` Sergey Senozhatsky @ 2018-12-18 14:26 ` Borislav Petkov 2018-12-18 14:55 ` Sergey Senozhatsky 0 siblings, 1 reply; 22+ messages in thread From: Borislav Petkov @ 2018-12-18 14:26 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Sergey Senozhatsky, Peter Zijlstra, Petr Mladek, Steven Rostedt, Andrew Morton, linux-kernel On Tue, Dec 18, 2018 at 10:07:50PM +0900, Sergey Senozhatsky wrote: > This is right before shutdown. dmesg is not really available anymore. > I can only guess why do they write it to devkmsg - to make it appear > on the serial/net console, perhaps. Dunno. But still, why do we want to see those messages and enlarge the ratelimit for that? Do they contain any particularly important information? -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC][PATCH] printk: increase devkmsg write() ratelimit 2018-12-18 14:26 ` Borislav Petkov @ 2018-12-18 14:55 ` Sergey Senozhatsky 2018-12-18 15:03 ` Borislav Petkov 0 siblings, 1 reply; 22+ messages in thread From: Sergey Senozhatsky @ 2018-12-18 14:55 UTC (permalink / raw) To: Borislav Petkov Cc: Sergey Senozhatsky, Sergey Senozhatsky, Peter Zijlstra, Petr Mladek, Steven Rostedt, Andrew Morton, linux-kernel On (12/18/18 15:26), Borislav Petkov wrote: > On Tue, Dec 18, 2018 at 10:07:50PM +0900, Sergey Senozhatsky wrote: > > This is right before shutdown. dmesg is not really available anymore. > > I can only guess why do they write it to devkmsg - to make it appear > > on the serial/net console, perhaps. Dunno. > > But still, why do we want to see those messages and enlarge the > ratelimit for that? Do they contain any particularly important > information? Good question. Theoretically, there can be some interesting stuff. Examples: https://bugzilla.redhat.com/show_bug.cgi?id=1515373#c6 or https://github.com/systemd/systemd/issues/5433 -ss ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC][PATCH] printk: increase devkmsg write() ratelimit 2018-12-18 14:55 ` Sergey Senozhatsky @ 2018-12-18 15:03 ` Borislav Petkov 2018-12-18 15:14 ` Sergey Senozhatsky 0 siblings, 1 reply; 22+ messages in thread From: Borislav Petkov @ 2018-12-18 15:03 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Sergey Senozhatsky, Peter Zijlstra, Petr Mladek, Steven Rostedt, Andrew Morton, linux-kernel On Tue, Dec 18, 2018 at 11:55:58PM +0900, Sergey Senozhatsky wrote: > Good question. Theoretically, there can be some interesting stuff. > Examples: https://bugzilla.redhat.com/show_bug.cgi?id=1515373#c6 > or https://github.com/systemd/systemd/issues/5433 Just like that last page says, people should use the printk.devkmsg= option when they wanna see all messages. Normally, we do add cmdline options to the kernel cmdline when debugging, like debug ignore_loglevel log_buf_len=16M ... etc and this is no different AFAICT. -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC][PATCH] printk: increase devkmsg write() ratelimit 2018-12-18 15:03 ` Borislav Petkov @ 2018-12-18 15:14 ` Sergey Senozhatsky 2018-12-18 15:24 ` Borislav Petkov 0 siblings, 1 reply; 22+ messages in thread From: Sergey Senozhatsky @ 2018-12-18 15:14 UTC (permalink / raw) To: Borislav Petkov Cc: Sergey Senozhatsky, Sergey Senozhatsky, Peter Zijlstra, Petr Mladek, Steven Rostedt, Andrew Morton, linux-kernel On (12/18/18 16:03), Borislav Petkov wrote: > On Tue, Dec 18, 2018 at 11:55:58PM +0900, Sergey Senozhatsky wrote: > > Good question. Theoretically, there can be some interesting stuff. > > Examples: https://bugzilla.redhat.com/show_bug.cgi?id=1515373#c6 > > or https://github.com/systemd/systemd/issues/5433 > > Just like that last page says, people should use the printk.devkmsg= > option when they wanna see all messages. Normally, we do add cmdline > options to the kernel cmdline when debugging, like > > debug ignore_loglevel log_buf_len=16M ... Right, but unlike log_buf_len, devkmsg is a bit close to a "binary" knob: either all messages or none; it's not like we can say "printk.devkmsg=100 messages per 5 * HZ max". -ss ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC][PATCH] printk: increase devkmsg write() ratelimit 2018-12-18 15:14 ` Sergey Senozhatsky @ 2018-12-18 15:24 ` Borislav Petkov 2018-12-18 16:52 ` Sergey Senozhatsky 0 siblings, 1 reply; 22+ messages in thread From: Borislav Petkov @ 2018-12-18 15:24 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Sergey Senozhatsky, Peter Zijlstra, Petr Mladek, Steven Rostedt, Andrew Morton, linux-kernel On Wed, Dec 19, 2018 at 12:14:55AM +0900, Sergey Senozhatsky wrote: > Right, but unlike log_buf_len, devkmsg is a bit close to a "binary" knob: > either all messages or none; ... which is perfectly fine for a debugging session. -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC][PATCH] printk: increase devkmsg write() ratelimit 2018-12-18 15:24 ` Borislav Petkov @ 2018-12-18 16:52 ` Sergey Senozhatsky 2018-12-18 17:21 ` Peter Zijlstra 2018-12-18 17:47 ` Borislav Petkov 0 siblings, 2 replies; 22+ messages in thread From: Sergey Senozhatsky @ 2018-12-18 16:52 UTC (permalink / raw) To: Borislav Petkov Cc: Sergey Senozhatsky, Sergey Senozhatsky, Peter Zijlstra, Petr Mladek, Steven Rostedt, Andrew Morton, linux-kernel On (12/18/18 16:24), Borislav Petkov wrote: > On Wed, Dec 19, 2018 at 12:14:55AM +0900, Sergey Senozhatsky wrote: > > Right, but unlike log_buf_len, devkmsg is a bit close to a "binary" knob: > > either all messages or none; > > ... which is perfectly fine for a debugging session. But devkmsg ratelimits systemd errors, so one does not even know that "some debugging is required". For instance from my x86 box: Unmounting /home... [..] home.mount: Mount process exited, code=exited status=32 Failed unmounting /home. I don't want to debug systemd, I want to know that something didn't work out. 10 messages max and 5 seconds interval looks a bit too strict. -ss ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC][PATCH] printk: increase devkmsg write() ratelimit 2018-12-18 16:52 ` Sergey Senozhatsky @ 2018-12-18 17:21 ` Peter Zijlstra 2018-12-18 17:37 ` Steven Rostedt 2018-12-18 17:47 ` Borislav Petkov 1 sibling, 1 reply; 22+ messages in thread From: Peter Zijlstra @ 2018-12-18 17:21 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Borislav Petkov, Sergey Senozhatsky, Petr Mladek, Steven Rostedt, Andrew Morton, linux-kernel On Wed, Dec 19, 2018 at 01:52:17AM +0900, Sergey Senozhatsky wrote: > On (12/18/18 16:24), Borislav Petkov wrote: > > On Wed, Dec 19, 2018 at 12:14:55AM +0900, Sergey Senozhatsky wrote: > > > Right, but unlike log_buf_len, devkmsg is a bit close to a "binary" knob: > > > either all messages or none; > > > > ... which is perfectly fine for a debugging session. > > But devkmsg ratelimits systemd errors, so one does not even know that > "some debugging is required". For instance from my x86 box: > > Unmounting /home... > [..] > home.mount: Mount process exited, code=exited status=32 > Failed unmounting /home. > > I don't want to debug systemd, I want to know that something didn't > work out. 10 messages max and 5 seconds interval looks a bit too strict. Again, complain to system-doofus for printing so much crap to somewhere it should not print to begin with. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC][PATCH] printk: increase devkmsg write() ratelimit 2018-12-18 17:21 ` Peter Zijlstra @ 2018-12-18 17:37 ` Steven Rostedt 2018-12-19 8:50 ` Petr Mladek 2018-12-20 11:35 ` Sergey Senozhatsky 0 siblings, 2 replies; 22+ messages in thread From: Steven Rostedt @ 2018-12-18 17:37 UTC (permalink / raw) To: Peter Zijlstra Cc: Sergey Senozhatsky, Borislav Petkov, Sergey Senozhatsky, Petr Mladek, Andrew Morton, linux-kernel On Tue, 18 Dec 2018 18:21:09 +0100 Peter Zijlstra <peterz@infradead.org> wrote: > On Wed, Dec 19, 2018 at 01:52:17AM +0900, Sergey Senozhatsky wrote: > > On (12/18/18 16:24), Borislav Petkov wrote: > > > On Wed, Dec 19, 2018 at 12:14:55AM +0900, Sergey Senozhatsky wrote: > > > > Right, but unlike log_buf_len, devkmsg is a bit close to a "binary" knob: > > > > either all messages or none; > > > > > > ... which is perfectly fine for a debugging session. > > > > But devkmsg ratelimits systemd errors, so one does not even know that > > "some debugging is required". For instance from my x86 box: > > > > Unmounting /home... > > [..] > > home.mount: Mount process exited, code=exited status=32 > > Failed unmounting /home. > > > > I don't want to debug systemd, I want to know that something didn't > > work out. 10 messages max and 5 seconds interval looks a bit too strict. > > Again, complain to system-doofus for printing so much crap to somewhere > it should not print to begin with. I've been saying that it would be good to make the kmsg be a separate buffer that just gets interleaved with the kernel buffer. Userspace processes should never be able to overwrite messages from the kernel. -- Steve ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC][PATCH] printk: increase devkmsg write() ratelimit 2018-12-18 17:37 ` Steven Rostedt @ 2018-12-19 8:50 ` Petr Mladek 2018-12-20 11:35 ` Sergey Senozhatsky 1 sibling, 0 replies; 22+ messages in thread From: Petr Mladek @ 2018-12-19 8:50 UTC (permalink / raw) To: Steven Rostedt Cc: Peter Zijlstra, Sergey Senozhatsky, Borislav Petkov, Sergey Senozhatsky, Andrew Morton, linux-kernel On Tue 2018-12-18 12:37:48, Steven Rostedt wrote: > On Tue, 18 Dec 2018 18:21:09 +0100 > Peter Zijlstra <peterz@infradead.org> wrote: > > > On Wed, Dec 19, 2018 at 01:52:17AM +0900, Sergey Senozhatsky wrote: > > > On (12/18/18 16:24), Borislav Petkov wrote: > > > > On Wed, Dec 19, 2018 at 12:14:55AM +0900, Sergey Senozhatsky wrote: > > > > > Right, but unlike log_buf_len, devkmsg is a bit close to a "binary" knob: > > > > > either all messages or none; > > > > > > > > ... which is perfectly fine for a debugging session. > > > > > > But devkmsg ratelimits systemd errors, so one does not even know that > > > "some debugging is required". For instance from my x86 box: > > > > > > Unmounting /home... > > > [..] > > > home.mount: Mount process exited, code=exited status=32 > > > Failed unmounting /home. > > > > > > I don't want to debug systemd, I want to know that something didn't > > > work out. 10 messages max and 5 seconds interval looks a bit too strict. > > > > Again, complain to system-doofus for printing so much crap to somewhere > > it should not print to begin with. > > I've been saying that it would be good to make the kmsg be a separate > buffer that just gets interleaved with the kernel buffer. Userspace > processes should never be able to overwrite messages from the kernel. There was a proposal for this few years ago, see https://lkml.kernel.org/r/1435920595-30879-1-git-send-email-m.niesluchow@samsung.com Honestly, I got scared. It proposed an interface for dynamically adding separate /dev/kmsg devices and related log buffers. This smells with huge amount of messages that might fight for console throughput and complicate printk code. Best Regards, Petr ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC][PATCH] printk: increase devkmsg write() ratelimit 2018-12-18 17:37 ` Steven Rostedt 2018-12-19 8:50 ` Petr Mladek @ 2018-12-20 11:35 ` Sergey Senozhatsky 2018-12-20 13:58 ` Steven Rostedt 1 sibling, 1 reply; 22+ messages in thread From: Sergey Senozhatsky @ 2018-12-20 11:35 UTC (permalink / raw) To: Steven Rostedt Cc: Peter Zijlstra, Sergey Senozhatsky, Borislav Petkov, Sergey Senozhatsky, Petr Mladek, Andrew Morton, linux-kernel On (12/18/18 12:37), Steven Rostedt wrote: > > > > Again, complain to system-doofus for printing so much crap to somewhere > > it should not print to begin with. > > I've been saying that it would be good to make the kmsg be a separate > buffer that just gets interleaved with the kernel buffer. Hmm, this is interesting. > Userspace processes should never be able to overwrite messages > from the kernel. I would agree. It's a bit funny that kernel people first take the patch in and then, when user-space begins to use the feature, start to object and ratelimit. By the way, systemd seems to be OK with alternative logging targets /etc/systemd/system.conf --- [Manager] #LogLevel=info LogTarget=syslog console journal --- Everything looks fine with read-only devkmsg (running the patched kernel). "journalctl -f -b" gives a nice system log: ... kernel: r8169 0000:04:00.0 enp4s0: renamed from eth0 kernel: snd_hda_codec_realtek hdaudioC0D0: Line=0x1a systemd-udevd[180]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable. systemd[1]: Started Flush Journal to Persistent Storage. kernel: input: HDA Intel PCH Line as /devices/pci0000:00/0000:00:1f.3/sound/card0/input7 ... Given how often systemd hits ratelimits (I googled some bug reports), I wonder if systemd people are still interested in /dev/kmsg logging at all. --- diff --git a/drivers/char/mem.c b/drivers/char/mem.c index 7b4e4de778e4..6d35115c5629 100644 --- a/drivers/char/mem.c +++ b/drivers/char/mem.c @@ -875,7 +875,7 @@ static const struct memdev { [8] = { "random", 0666, &random_fops, 0 }, [9] = { "urandom", 0666, &urandom_fops, 0 }, #ifdef CONFIG_PRINTK - [11] = { "kmsg", 0644, &kmsg_fops, 0 }, + [11] = { "kmsg", 0444, &kmsg_fops, 0 }, #endif }; diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 829fe6fb098a..48c4ccac9fce 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -770,7 +770,6 @@ static ssize_t msg_print_ext_body(char *buf, size_t size, struct devkmsg_user { u64 seq; u32 idx; - struct ratelimit_state rs; struct mutex lock; char buf[CONSOLE_EXT_LOG_MAX]; }; @@ -788,69 +787,6 @@ int devkmsg_emit(int facility, int level, const char *fmt, ...) return r; } -static ssize_t devkmsg_write(struct kiocb *iocb, struct iov_iter *from) -{ - char *buf, *line; - int level = default_message_loglevel; - int facility = 1; /* LOG_USER */ - struct file *file = iocb->ki_filp; - struct devkmsg_user *user = file->private_data; - size_t len = iov_iter_count(from); - ssize_t ret = len; - - if (!user || len > LOG_LINE_MAX) - return -EINVAL; - - /* Ignore when user logging is disabled. */ - if (devkmsg_log & DEVKMSG_LOG_MASK_OFF) - return len; - - /* Ratelimit when not explicitly enabled. */ - if (!(devkmsg_log & DEVKMSG_LOG_MASK_ON)) { - if (!___ratelimit(&user->rs, current->comm)) - return ret; - } - - buf = kmalloc(len+1, GFP_KERNEL); - if (buf == NULL) - return -ENOMEM; - - buf[len] = '\0'; - if (!copy_from_iter_full(buf, len, from)) { - kfree(buf); - return -EFAULT; - } - - /* - * Extract and skip the syslog prefix <[0-9]*>. Coming from userspace - * the decimal value represents 32bit, the lower 3 bit are the log - * level, the rest are the log facility. - * - * If no prefix or no userspace facility is specified, we - * enforce LOG_USER, to be able to reliably distinguish - * kernel-generated messages from userspace-injected ones. - */ - line = buf; - if (line[0] == '<') { - char *endp = NULL; - unsigned int u; - - u = simple_strtoul(line + 1, &endp, 10); - if (endp && endp[0] == '>') { - level = LOG_LEVEL(u); - if (LOG_FACILITY(u) != 0) - facility = LOG_FACILITY(u); - endp++; - len -= endp - line; - line = endp; - } - } - - devkmsg_emit(facility, level, "%s", line); - kfree(buf); - return ret; -} - static ssize_t devkmsg_read(struct file *file, char __user *buf, size_t count, loff_t *ppos) { @@ -998,9 +934,6 @@ static int devkmsg_open(struct inode *inode, struct file *file) if (!user) return -ENOMEM; - ratelimit_default_init(&user->rs); - ratelimit_set_flags(&user->rs, RATELIMIT_MSG_ON_RELEASE); - mutex_init(&user->lock); logbuf_lock_irq(); @@ -1019,8 +952,6 @@ static int devkmsg_release(struct inode *inode, struct file *file) if (!user) return 0; - ratelimit_state_exit(&user->rs); - mutex_destroy(&user->lock); kfree(user); return 0; @@ -1029,7 +960,6 @@ static int devkmsg_release(struct inode *inode, struct file *file) const struct file_operations kmsg_fops = { .open = devkmsg_open, .read = devkmsg_read, - .write_iter = devkmsg_write, .llseek = devkmsg_llseek, .poll = devkmsg_poll, .release = devkmsg_release, ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [RFC][PATCH] printk: increase devkmsg write() ratelimit 2018-12-20 11:35 ` Sergey Senozhatsky @ 2018-12-20 13:58 ` Steven Rostedt 2018-12-21 7:32 ` Sergey Senozhatsky 0 siblings, 1 reply; 22+ messages in thread From: Steven Rostedt @ 2018-12-20 13:58 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Peter Zijlstra, Sergey Senozhatsky, Borislav Petkov, Petr Mladek, Andrew Morton, linux-kernel On Thu, 20 Dec 2018 20:35:37 +0900 Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote: > On (12/18/18 12:37), Steven Rostedt wrote: > > > > > > Again, complain to system-doofus for printing so much crap to somewhere > > > it should not print to begin with. > > > > I've been saying that it would be good to make the kmsg be a separate > > buffer that just gets interleaved with the kernel buffer. > > Hmm, this is interesting. > > > Userspace processes should never be able to overwrite messages > > from the kernel. > > I would agree. > > It's a bit funny that kernel people first take the patch in and then, > when user-space begins to use the feature, start to object and ratelimit. Please note the "kernel people" to first take it in, were also systemd developers. And there was a bit of objections to what they wanted to have to what they got. It wasn't until later when systemd started abusing the buffer (taking it as its own, making the kernel use of it a second class citizen) that other kernel developers (that now need to deal with the fallout) started to object and ratelimit it. Honestly, it should have been a separate buffer to begin with, and I wish I pushed for that when it was first added. It's not too late. We can still make it a separate buffer. > > By the way, systemd seems to be OK with alternative logging targets > > /etc/systemd/system.conf > > --- > [Manager] > #LogLevel=info > LogTarget=syslog console journal When the system is up and running, it works. But I believe systemd can still only use ksmsg for initial boot up messages before it does the chroot. > --- > > Everything looks fine with read-only devkmsg (running the patched > kernel). "journalctl -f -b" gives a nice system log: > > ... > kernel: r8169 0000:04:00.0 enp4s0: renamed from eth0 > kernel: snd_hda_codec_realtek hdaudioC0D0: Line=0x1a > systemd-udevd[180]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable. > systemd[1]: Started Flush Journal to Persistent Storage. > kernel: input: HDA Intel PCH Line as /devices/pci0000:00/0000:00:1f.3/sound/card0/input7 > ... > Do you get systemd messages before chroot with this patch compared to what you get without it? -- Steve > > Given how often systemd hits ratelimits (I googled some bug reports), > I wonder if systemd people are still interested in /dev/kmsg logging > at all. > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC][PATCH] printk: increase devkmsg write() ratelimit 2018-12-20 13:58 ` Steven Rostedt @ 2018-12-21 7:32 ` Sergey Senozhatsky 0 siblings, 0 replies; 22+ messages in thread From: Sergey Senozhatsky @ 2018-12-21 7:32 UTC (permalink / raw) To: Steven Rostedt Cc: Sergey Senozhatsky, Peter Zijlstra, Sergey Senozhatsky, Borislav Petkov, Petr Mladek, Andrew Morton, linux-kernel On (12/20/18 08:58), Steven Rostedt wrote: > > > Userspace processes should never be able to overwrite messages > > > from the kernel. > > > > I would agree. > > > > It's a bit funny that kernel people first take the patch in and then, > > when user-space begins to use the feature, start to object and ratelimit. > > Please note the "kernel people" to first take it in, were also systemd > developers. Good point. > Honestly, it should have been a separate buffer to begin with, and I > wish I pushed for that when it was first added. It's not too late. We > can still make it a separate buffer. Ideally, yes, separating them will make sense. [..] > When the system is up and running, it works. But I believe systemd can > still only use ksmsg for initial boot up messages before it does the > chroot. [..] > Do you get systemd messages before chroot with this patch compared to > what you get without it? You are right. systemd cannot log anything really early - there should be /dev/kmsg char device so they can open() it and write() to it. IOW only after chr_dev_init() they can append messages to devkmsg. Which is still early enough, perhaps earlier than UART init. systemd can log to tty, but unlike logbuf messages, messages written to tty cannot be re-played: +-> chr_dev_init() create /dev/kmsg | persistent storage is not mounted yet | writes to /dev/kmsg work | ... | chroot | mount /dev/sdX +-> journald starts pulls logbuf (devkmsg) syslog now works "systemd[1]: Starting Flush Journal to Persistent Storage..." So without devkmsg some early systemd messages can be lost (between chr_dev_init() and exec(journald)), just like you said; or (best case) will go to serial log. Messages before chr_dev_init() will be lost because there is no /dev/kmsg; messages after exec(journald) do not necessarily depend on /dev/kmsg and can be logged via syslog/journal interface. -ss ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC][PATCH] printk: increase devkmsg write() ratelimit 2018-12-18 16:52 ` Sergey Senozhatsky 2018-12-18 17:21 ` Peter Zijlstra @ 2018-12-18 17:47 ` Borislav Petkov 2018-12-19 1:46 ` Sergey Senozhatsky 1 sibling, 1 reply; 22+ messages in thread From: Borislav Petkov @ 2018-12-18 17:47 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Sergey Senozhatsky, Peter Zijlstra, Petr Mladek, Steven Rostedt, Andrew Morton, linux-kernel On Wed, Dec 19, 2018 at 01:52:17AM +0900, Sergey Senozhatsky wrote: > But devkmsg ratelimits systemd errors, so one does not even know that > "some debugging is required". For instance from my x86 box: > > Unmounting /home... > [..] > home.mount: Mount process exited, code=exited status=32 > Failed unmounting /home. > > I don't want to debug systemd, I want to know that something didn't > work out. 10 messages max and 5 seconds interval looks a bit too strict. So how much is not strict? And what happens if you raise that ratelimiting level and the *one* line which is most important for debugging an issue, still doesn't get logged because all of a sudden, doofus is more talkative, overflows the new limit and we drop the important line? All I'm saying is, gradually raising the limit is the wrong approach - there will always be a case where something important doesn't get logged. What we need is a different solution, maybe what Rostedt proposes or so... -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC][PATCH] printk: increase devkmsg write() ratelimit 2018-12-18 17:47 ` Borislav Petkov @ 2018-12-19 1:46 ` Sergey Senozhatsky 0 siblings, 0 replies; 22+ messages in thread From: Sergey Senozhatsky @ 2018-12-19 1:46 UTC (permalink / raw) To: Borislav Petkov Cc: Sergey Senozhatsky, Sergey Senozhatsky, Peter Zijlstra, Petr Mladek, Steven Rostedt, Andrew Morton, linux-kernel On (12/18/18 18:47), Borislav Petkov wrote: > So how much is not strict? > > And what happens if you raise that ratelimiting level and the *one* line > which is most important for debugging an issue Like you said, for debugging devkmsg=off works just fine. I don't mind the ratelimiting and want to keep it; I just don't want errors to be ratelimited. > All I'm saying is, gradually raising the limit is the wrong approach > - there will always be a case where something important doesn't get > logged. Well, OK. > What we need is a different solution, maybe what Rostedt proposes or > so... Sure, a different approach and idea are welcome. This is RFC thread. -ss ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC][PATCH] printk: increase devkmsg write() ratelimit 2018-12-18 9:18 [RFC][PATCH] printk: increase devkmsg write() ratelimit Sergey Senozhatsky 2018-12-18 10:48 ` Peter Zijlstra @ 2018-12-18 14:02 ` Sergey Senozhatsky 1 sibling, 0 replies; 22+ messages in thread From: Sergey Senozhatsky @ 2018-12-18 14:02 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Petr Mladek, Steven Rostedt, Borislav Petkov, Peter Zijlstra, Andrew Morton, linux-kernel, Sergey Senozhatsky On (12/18/18 18:18), Sergey Senozhatsky wrote: > > I know that there is a "kernel.printk_devkmsg" interface; do we > expect every systemd-enabled distro to find that out and to tweak > kernel.printk_devkmsg My bad, this should be corrected. I thought that user space can adjust ratelimit, via file->private_data->rs. This does not appear to be the case. So the question is - is it time to change the default devkmsg ratelimit? -ss ^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2018-12-21 7:32 UTC | newest] Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-12-18 9:18 [RFC][PATCH] printk: increase devkmsg write() ratelimit Sergey Senozhatsky 2018-12-18 10:48 ` Peter Zijlstra 2018-12-18 11:17 ` Sergey Senozhatsky 2018-12-18 11:29 ` Peter Zijlstra 2018-12-18 13:09 ` Sergey Senozhatsky 2018-12-18 11:47 ` Borislav Petkov 2018-12-18 13:07 ` Sergey Senozhatsky 2018-12-18 14:26 ` Borislav Petkov 2018-12-18 14:55 ` Sergey Senozhatsky 2018-12-18 15:03 ` Borislav Petkov 2018-12-18 15:14 ` Sergey Senozhatsky 2018-12-18 15:24 ` Borislav Petkov 2018-12-18 16:52 ` Sergey Senozhatsky 2018-12-18 17:21 ` Peter Zijlstra 2018-12-18 17:37 ` Steven Rostedt 2018-12-19 8:50 ` Petr Mladek 2018-12-20 11:35 ` Sergey Senozhatsky 2018-12-20 13:58 ` Steven Rostedt 2018-12-21 7:32 ` Sergey Senozhatsky 2018-12-18 17:47 ` Borislav Petkov 2018-12-19 1:46 ` Sergey Senozhatsky 2018-12-18 14:02 ` Sergey Senozhatsky
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).