All of lore.kernel.org
 help / color / mirror / Atom feed
* 3.5-rc6 printk formatting problem during oom-kill.
@ 2012-07-09 18:03 Dave Jones
  2012-07-09 18:27 ` Kay Sievers
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Jones @ 2012-07-09 18:03 UTC (permalink / raw)
  To: Linux Kernel; +Cc: Kay Sievers, Linus Torvalds

I noticed that the format of the oom-killer output seems to have changed, and
now it spews stuff like..

[49461.758070] lowmem_reserve[]:
[49461.758071]  0
[49461.758071]  2643
[49461.758071]  3878
[49461.758072]  3878
[49461.758072] 
[49461.758072] Node 0 
[49461.758075] DMA32 free:50432kB min:45888kB low:57360kB high:68832kB active_anon:1980216kB inactive_anon:494260kB active_file:276kB inactive_file:1064kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2707204kB mlocked:0kB dirty:0kB writeback:82756kB mapped:0kB shmem:0kB slab_reclaimable:15392kB slab_unreclaimable:118172kB kernel_stack:12672kB pagetables:10744kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:9756 all_unreclaimable? yes
[49461.758076] lowmem_reserve[]:
[49461.758076]  0
[49461.758076]  0
[49461.758077]  1234
[49461.758077]  1234
[49461.758077] 
[49461.758078] Node 0 
[49461.758080] Normal free:21208kB min:21424kB low:26780kB high:32136kB active_anon:448068kB inactive_anon:446924kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):972kB isolated(file):0kB present:1264032kB mlocked:0kB dirty:0kB writeback:19580kB mapped:516kB shmem:5532kB slab_reclaimable:34960kB slab_unreclaimable:110764kB kernel_stack:54392kB pagetables:11596kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:16064 all_unreclaimable? yes
[49461.758080] lowmem_reserve[]:
[49461.758080]  0
[49461.758081]  0
[49461.758081]  0
[49461.758081]  0
[49461.758082] 
[49461.758082] Node 0 
[49461.758082] DMA: 
[49461.758085] 0*4kB 
[49461.758086] 0*8kB 
[49461.758086] 3*16kB 
[49461.758086] 1*32kB 
[49461.758087] 3*64kB 
[49461.758087] 3*128kB 
[49461.758087] 3*256kB 
[49461.758088] 2*512kB 
[49461.758088] 3*1024kB 
[49461.758089] 3*2048kB 
[49461.758089] 1*4096kB 
[49461.758089] = 15760kB
[49461.758090] Node 0 
[49461.758090] DMA32: 
[49461.758091] 166*4kB 
[49461.758091] 74*8kB 
[49461.758092] 162*16kB 
[49461.758092] 314*32kB 
[49461.758093] 337*64kB 
[49461.758093] 82*128kB 
[49461.758094] 8*256kB 
[49461.758094] 1*512kB 
[49461.758094] 1*1024kB 
[49461.758095] 0*2048kB 
[49461.758095] 0*4096kB 
[49461.758096] = 49544kB
[49461.758096] Node 0 
[49461.758096] Normal: 
[49461.758097] 847*4kB 
[49461.758097] 1834*8kB 
[49461.758098] 0*16kB 
[49461.758098] 5*32kB 
[49461.758098] 3*64kB 
[49461.758099] 2*128kB 
[49461.758099] 1*256kB 
[49461.758100] 0*512kB 
[49461.758100] 0*1024kB 
[49461.758100] 1*2048kB 
[49461.758101] 0*4096kB 
[49461.758101] = 20972kB


Does the oom-killer code need modifying, or the printk code ?
I know there's been some regressions in this area recently, but this is still
happening on the current tree (8c84bf4166a4698296342841a549bbee03860ac0)

	Dave


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 3.5-rc6 printk formatting problem during oom-kill.
  2012-07-09 18:03 3.5-rc6 printk formatting problem during oom-kill Dave Jones
@ 2012-07-09 18:27 ` Kay Sievers
  2012-07-09 18:48   ` Kay Sievers
  0 siblings, 1 reply; 19+ messages in thread
From: Kay Sievers @ 2012-07-09 18:27 UTC (permalink / raw)
  To: Dave Jones, Linux Kernel, Kay Sievers, Linus Torvalds,
	Greg Kroah-Hartmann

On Mon, Jul 9, 2012 at 8:03 PM, Dave Jones <davej@redhat.com> wrote:
> I noticed that the format of the oom-killer output seems to have changed, and
> now it spews stuff like..
>
> [49461.758070] lowmem_reserve[]:
> [49461.758071]  0
> [49461.758071]  2643
> [49461.758071]  3878
> [49461.758072]  3878
> [49461.758072]
> [49461.758072] Node 0

> Does the oom-killer code need modifying, or the printk code ?
> I know there's been some regressions in this area recently, but this is still
> happening on the current tree (8c84bf4166a4698296342841a549bbee03860ac0)

This likely fixes it:
  http://git.kernel.org/?p=linux/kernel/git/kay/patches.git;a=blob;f=kmsg-merge-cont.patch;hb=HEAD

Let me check if it does, and if I can reproduce it.

Thanks,
Kay

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 3.5-rc6 printk formatting problem during oom-kill.
  2012-07-09 18:27 ` Kay Sievers
@ 2012-07-09 18:48   ` Kay Sievers
  2012-07-09 20:31     ` Joe Perches
  2012-07-12  0:54     ` Dave Jones
  0 siblings, 2 replies; 19+ messages in thread
From: Kay Sievers @ 2012-07-09 18:48 UTC (permalink / raw)
  To: Dave Jones; +Cc: Linux Kernel, Linus Torvalds, Greg Kroah-Hartmann

On Mon, 2012-07-09 at 20:27 +0200, Kay Sievers wrote:
> On Mon, Jul 9, 2012 at 8:03 PM, Dave Jones <davej@redhat.com> wrote:
> > I noticed that the format of the oom-killer output seems to have changed, and
> > now it spews stuff like..
> >
> > [49461.758070] lowmem_reserve[]:
> > [49461.758071]  0
> > [49461.758071]  2643
> > [49461.758071]  3878
> > [49461.758072]  3878
> > [49461.758072]
> > [49461.758072] Node 0
> 
> > Does the oom-killer code need modifying, or the printk code ?
> > I know there's been some regressions in this area recently, but this is still
> > happening on the current tree (8c84bf4166a4698296342841a549bbee03860ac0)
> 
> This likely fixes it:
>   http://git.kernel.org/?p=linux/kernel/git/kay/patches.git;a=blob;f=kmsg-merge-cont.patch;hb=HEAD
> 
> Let me check if it does, and if I can reproduce it.

It looks fine here with the above mentioned patch:
[    0.000000] lowmem_reserve[]:
[    0.000000]  0
[    0.000000]  0
[    0.000000]  0
[    0.000000]  0
[    0.000000] 
[    0.000000] DMA: 
[    0.000000] 1*4kB 
[    0.000000] 0*8kB 
[    0.000000] 0*16kB 
[    0.000000] 1*32kB 
[    0.000000] 2*64kB 
[    0.000000] 1*128kB 
[    0.000000] 1*256kB 
[    0.000000] 0*512kB 
[    0.000000] 1*1024kB 
[    0.000000] 1*2048kB 
[    0.000000] 3*4096kB 
[    0.000000] = 15908kB

becomes:
[    0.000000] lowmem_reserve[]: 0 0 0 0
[    0.000000] DMA: 1*4kB 0*8kB 0*16kB 1*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15908kB

Thanks,
Kay



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 3.5-rc6 printk formatting problem during oom-kill.
  2012-07-09 18:48   ` Kay Sievers
@ 2012-07-09 20:31     ` Joe Perches
  2012-07-09 20:40       ` Greg Kroah-Hartmann
  2012-07-12  0:54     ` Dave Jones
  1 sibling, 1 reply; 19+ messages in thread
From: Joe Perches @ 2012-07-09 20:31 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Dave Jones, Linux Kernel, Linus Torvalds, Greg Kroah-Hartmann

On Mon, 2012-07-09 at 20:48 +0200, Kay Sievers wrote:
> On Mon, 2012-07-09 at 20:27 +0200, Kay Sievers wrote:
> > On Mon, Jul 9, 2012 at 8:03 PM, Dave Jones <davej@redhat.com> wrote:
> > > I noticed that the format of the oom-killer output seems to have changed, and
> > > now it spews stuff like..
> > >
> > > [49461.758070] lowmem_reserve[]:
> > > [49461.758071]  0
> > > [49461.758071]  2643
> > > [49461.758071]  3878
> > > [49461.758072]  3878
> > > [49461.758072]
> > > [49461.758072] Node 0
> > 
> > > Does the oom-killer code need modifying, or the printk code ?
> > > I know there's been some regressions in this area recently, but this is still
> > > happening on the current tree (8c84bf4166a4698296342841a549bbee03860ac0)
> > 
> > This likely fixes it:
> >   http://git.kernel.org/?p=linux/kernel/git/kay/patches.git;a=blob;f=kmsg-merge-cont.patch;hb=HEAD
> > 
> > Let me check if it does, and if I can reproduce it.
> 
> It looks fine here with the above mentioned patch:
> [    0.000000] lowmem_reserve[]:
> [    0.000000]  0
> [    0.000000]  0
> [    0.000000]  0
> [    0.000000]  0
> [    0.000000] 
> [    0.000000] DMA: 
> [    0.000000] 1*4kB 
> [    0.000000] 0*8kB 
> [    0.000000] 0*16kB 
> [    0.000000] 1*32kB 
> [    0.000000] 2*64kB 
> [    0.000000] 1*128kB 
> [    0.000000] 1*256kB 
> [    0.000000] 0*512kB 
> [    0.000000] 1*1024kB 
> [    0.000000] 1*2048kB 
> [    0.000000] 3*4096kB 
> [    0.000000] = 15908kB
> 
> becomes:
> [    0.000000] lowmem_reserve[]: 0 0 0 0
> [    0.000000] DMA: 1*4kB 0*8kB 0*16kB 1*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15908kB

Hi Kay.

That single patch doesn't apply cleanly to Linus'
8c84bf4166a4698296342841a549bbee03860ac0

What else is necessary?

Your tree seems to have a collection of random patches.

It might be useful to clone Linus' tree and produce a
branch with all the necessary printk patches in it so
someone else could pull it.

cheers, Joe


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 3.5-rc6 printk formatting problem during oom-kill.
  2012-07-09 20:31     ` Joe Perches
@ 2012-07-09 20:40       ` Greg Kroah-Hartmann
  2012-07-09 20:44         ` Joe Perches
  0 siblings, 1 reply; 19+ messages in thread
From: Greg Kroah-Hartmann @ 2012-07-09 20:40 UTC (permalink / raw)
  To: Joe Perches; +Cc: Kay Sievers, Dave Jones, Linux Kernel, Linus Torvalds

On Mon, Jul 09, 2012 at 01:31:56PM -0700, Joe Perches wrote:
> On Mon, 2012-07-09 at 20:48 +0200, Kay Sievers wrote:
> > On Mon, 2012-07-09 at 20:27 +0200, Kay Sievers wrote:
> > > On Mon, Jul 9, 2012 at 8:03 PM, Dave Jones <davej@redhat.com> wrote:
> > > > I noticed that the format of the oom-killer output seems to have changed, and
> > > > now it spews stuff like..
> > > >
> > > > [49461.758070] lowmem_reserve[]:
> > > > [49461.758071]  0
> > > > [49461.758071]  2643
> > > > [49461.758071]  3878
> > > > [49461.758072]  3878
> > > > [49461.758072]
> > > > [49461.758072] Node 0
> > > 
> > > > Does the oom-killer code need modifying, or the printk code ?
> > > > I know there's been some regressions in this area recently, but this is still
> > > > happening on the current tree (8c84bf4166a4698296342841a549bbee03860ac0)
> > > 
> > > This likely fixes it:
> > >   http://git.kernel.org/?p=linux/kernel/git/kay/patches.git;a=blob;f=kmsg-merge-cont.patch;hb=HEAD
> > > 
> > > Let me check if it does, and if I can reproduce it.
> > 
> > It looks fine here with the above mentioned patch:
> > [    0.000000] lowmem_reserve[]:
> > [    0.000000]  0
> > [    0.000000]  0
> > [    0.000000]  0
> > [    0.000000]  0
> > [    0.000000] 
> > [    0.000000] DMA: 
> > [    0.000000] 1*4kB 
> > [    0.000000] 0*8kB 
> > [    0.000000] 0*16kB 
> > [    0.000000] 1*32kB 
> > [    0.000000] 2*64kB 
> > [    0.000000] 1*128kB 
> > [    0.000000] 1*256kB 
> > [    0.000000] 0*512kB 
> > [    0.000000] 1*1024kB 
> > [    0.000000] 1*2048kB 
> > [    0.000000] 3*4096kB 
> > [    0.000000] = 15908kB
> > 
> > becomes:
> > [    0.000000] lowmem_reserve[]: 0 0 0 0
> > [    0.000000] DMA: 1*4kB 0*8kB 0*16kB 1*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15908kB
> 
> Hi Kay.
> 
> That single patch doesn't apply cleanly to Linus'
> 8c84bf4166a4698296342841a549bbee03860ac0
> 
> What else is necessary?
> 
> Your tree seems to have a collection of random patches.
> 
> It might be useful to clone Linus' tree and produce a
> branch with all the necessary printk patches in it so
> someone else could pull it.

They should all now be in my driver-core-next branch that will show up
in the next linux-next release, so having a separate tree isn't
necessary.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 3.5-rc6 printk formatting problem during oom-kill.
  2012-07-09 20:40       ` Greg Kroah-Hartmann
@ 2012-07-09 20:44         ` Joe Perches
  2012-07-09 20:48           ` Kay Sievers
  2012-07-09 20:56           ` Greg Kroah-Hartmann
  0 siblings, 2 replies; 19+ messages in thread
From: Joe Perches @ 2012-07-09 20:44 UTC (permalink / raw)
  To: Greg Kroah-Hartmann; +Cc: Kay Sievers, Dave Jones, Linux Kernel, Linus Torvalds

On Mon, 2012-07-09 at 13:40 -0700, Greg Kroah-Hartmann wrote:
> On Mon, Jul 09, 2012 at 01:31:56PM -0700, Joe Perches wrote:
> > On Mon, 2012-07-09 at 20:48 +0200, Kay Sievers wrote:
> > > On Mon, 2012-07-09 at 20:27 +0200, Kay Sievers wrote:
> > > > On Mon, Jul 9, 2012 at 8:03 PM, Dave Jones <davej@redhat.com> wrote:
> > > > > I noticed that the format of the oom-killer output seems to have changed, and
> > > > > now it spews stuff like..
> > > > >
> > > > > [49461.758070] lowmem_reserve[]:
> > > > > [49461.758071]  0
> > > > > [49461.758071]  2643
> > > > > [49461.758071]  3878
> > > > > [49461.758072]  3878
> > > > > [49461.758072]
> > > > > [49461.758072] Node 0
> > > > 
> > > > > Does the oom-killer code need modifying, or the printk code ?
> > > > > I know there's been some regressions in this area recently, but this is still
> > > > > happening on the current tree (8c84bf4166a4698296342841a549bbee03860ac0)
> > > > 
> > > > This likely fixes it:
> > > >   http://git.kernel.org/?p=linux/kernel/git/kay/patches.git;a=blob;f=kmsg-merge-cont.patch;hb=HEAD
> > > > 
> > > > Let me check if it does, and if I can reproduce it.
> > > 
> > > It looks fine here with the above mentioned patch:
> > > [    0.000000] lowmem_reserve[]:
> > > [    0.000000]  0
> > > [    0.000000]  0
> > > [    0.000000]  0
> > > [    0.000000]  0
> > > [    0.000000] 
> > > [    0.000000] DMA: 
> > > [    0.000000] 1*4kB 
> > > [    0.000000] 0*8kB 
> > > [    0.000000] 0*16kB 
> > > [    0.000000] 1*32kB 
> > > [    0.000000] 2*64kB 
> > > [    0.000000] 1*128kB 
> > > [    0.000000] 1*256kB 
> > > [    0.000000] 0*512kB 
> > > [    0.000000] 1*1024kB 
> > > [    0.000000] 1*2048kB 
> > > [    0.000000] 3*4096kB 
> > > [    0.000000] = 15908kB
> > > 
> > > becomes:
> > > [    0.000000] lowmem_reserve[]: 0 0 0 0
> > > [    0.000000] DMA: 1*4kB 0*8kB 0*16kB 1*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15908kB
> > 
> > Hi Kay.
> > 
> > That single patch doesn't apply cleanly to Linus'
> > 8c84bf4166a4698296342841a549bbee03860ac0
> > 
> > What else is necessary?
> > 
> > Your tree seems to have a collection of random patches.
> > 
> > It might be useful to clone Linus' tree and produce a
> > branch with all the necessary printk patches in it so
> > someone else could pull it.
> 
> They should all now be in my driver-core-next branch that will show up
> in the next linux-next release, so having a separate tree isn't
> necessary.

I don't think so.

There are real defects in the existing code.

These are patches that are necessary _now_.
not for a -next 3.6 future.



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 3.5-rc6 printk formatting problem during oom-kill.
  2012-07-09 20:44         ` Joe Perches
@ 2012-07-09 20:48           ` Kay Sievers
  2012-07-09 21:08             ` Joe Perches
  2012-07-09 20:56           ` Greg Kroah-Hartmann
  1 sibling, 1 reply; 19+ messages in thread
From: Kay Sievers @ 2012-07-09 20:48 UTC (permalink / raw)
  To: Joe Perches; +Cc: Greg Kroah-Hartmann, Dave Jones, Linux Kernel, Linus Torvalds

On Mon, Jul 9, 2012 at 10:44 PM, Joe Perches <joe@perches.com> wrote:

>> > That single patch doesn't apply cleanly to Linus'
>> > 8c84bf4166a4698296342841a549bbee03860ac0
>> >
>> > What else is necessary?
>> >
>> > Your tree seems to have a collection of random patches.
>> >
>> > It might be useful to clone Linus' tree and produce a
>> > branch with all the necessary printk patches in it so
>> > someone else could pull it.

I just worked on top of Greg's tree with the pending stuff for 3.5.

>> They should all now be in my driver-core-next branch that will show up
>> in the next linux-next release, so having a separate tree isn't
>> necessary.
>
> I don't think so.
>
> There are real defects in the existing code.
>
> These are patches that are necessary _now_.
> not for a -next 3.6 future.

Wrong conclusion. They are not in the 3.6 branch, but still go into -next:
  http://git.kernel.org/?p=linux/kernel/git/gregkh/driver-core.git;a=shortlog;h=refs/heads/driver-core-linus

Kay

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 3.5-rc6 printk formatting problem during oom-kill.
  2012-07-09 20:44         ` Joe Perches
  2012-07-09 20:48           ` Kay Sievers
@ 2012-07-09 20:56           ` Greg Kroah-Hartmann
  2012-07-11  0:11             ` Dave Jones
  1 sibling, 1 reply; 19+ messages in thread
From: Greg Kroah-Hartmann @ 2012-07-09 20:56 UTC (permalink / raw)
  To: Joe Perches; +Cc: Kay Sievers, Dave Jones, Linux Kernel, Linus Torvalds

On Mon, Jul 09, 2012 at 01:44:24PM -0700, Joe Perches wrote:
> On Mon, 2012-07-09 at 13:40 -0700, Greg Kroah-Hartmann wrote:
> > On Mon, Jul 09, 2012 at 01:31:56PM -0700, Joe Perches wrote:
> > > On Mon, 2012-07-09 at 20:48 +0200, Kay Sievers wrote:
> > > > On Mon, 2012-07-09 at 20:27 +0200, Kay Sievers wrote:
> > > > > On Mon, Jul 9, 2012 at 8:03 PM, Dave Jones <davej@redhat.com> wrote:
> > > > > > I noticed that the format of the oom-killer output seems to have changed, and
> > > > > > now it spews stuff like..
> > > > > >
> > > > > > [49461.758070] lowmem_reserve[]:
> > > > > > [49461.758071]  0
> > > > > > [49461.758071]  2643
> > > > > > [49461.758071]  3878
> > > > > > [49461.758072]  3878
> > > > > > [49461.758072]
> > > > > > [49461.758072] Node 0
> > > > > 
> > > > > > Does the oom-killer code need modifying, or the printk code ?
> > > > > > I know there's been some regressions in this area recently, but this is still
> > > > > > happening on the current tree (8c84bf4166a4698296342841a549bbee03860ac0)
> > > > > 
> > > > > This likely fixes it:
> > > > >   http://git.kernel.org/?p=linux/kernel/git/kay/patches.git;a=blob;f=kmsg-merge-cont.patch;hb=HEAD
> > > > > 
> > > > > Let me check if it does, and if I can reproduce it.
> > > > 
> > > > It looks fine here with the above mentioned patch:
> > > > [    0.000000] lowmem_reserve[]:
> > > > [    0.000000]  0
> > > > [    0.000000]  0
> > > > [    0.000000]  0
> > > > [    0.000000]  0
> > > > [    0.000000] 
> > > > [    0.000000] DMA: 
> > > > [    0.000000] 1*4kB 
> > > > [    0.000000] 0*8kB 
> > > > [    0.000000] 0*16kB 
> > > > [    0.000000] 1*32kB 
> > > > [    0.000000] 2*64kB 
> > > > [    0.000000] 1*128kB 
> > > > [    0.000000] 1*256kB 
> > > > [    0.000000] 0*512kB 
> > > > [    0.000000] 1*1024kB 
> > > > [    0.000000] 1*2048kB 
> > > > [    0.000000] 3*4096kB 
> > > > [    0.000000] = 15908kB
> > > > 
> > > > becomes:
> > > > [    0.000000] lowmem_reserve[]: 0 0 0 0
> > > > [    0.000000] DMA: 1*4kB 0*8kB 0*16kB 1*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15908kB
> > > 
> > > Hi Kay.
> > > 
> > > That single patch doesn't apply cleanly to Linus'
> > > 8c84bf4166a4698296342841a549bbee03860ac0
> > > 
> > > What else is necessary?
> > > 
> > > Your tree seems to have a collection of random patches.
> > > 
> > > It might be useful to clone Linus' tree and produce a
> > > branch with all the necessary printk patches in it so
> > > someone else could pull it.
> > 
> > They should all now be in my driver-core-next branch that will show up
> > in the next linux-next release, so having a separate tree isn't
> > necessary.
> 
> I don't think so.
> 
> There are real defects in the existing code.
> 
> These are patches that are necessary _now_.
> not for a -next 3.6 future.

Oops, sorry, I ment to type, "driver-core-linus" above.

greg k-h

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 3.5-rc6 printk formatting problem during oom-kill.
  2012-07-09 20:48           ` Kay Sievers
@ 2012-07-09 21:08             ` Joe Perches
  0 siblings, 0 replies; 19+ messages in thread
From: Joe Perches @ 2012-07-09 21:08 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Greg Kroah-Hartmann, Dave Jones, Linux Kernel, Linus Torvalds

On Mon, 2012-07-09 at 22:48 +0200, Kay Sievers wrote:
> On Mon, Jul 9, 2012 at 10:44 PM, Joe Perches <joe@perches.com> wrote:
> 
> >> > That single patch doesn't apply cleanly to Linus'
> >> > 8c84bf4166a4698296342841a549bbee03860ac0
> >> >
> >> > What else is necessary?
> >> >
> >> > Your tree seems to have a collection of random patches.
> >> >
> >> > It might be useful to clone Linus' tree and produce a
> >> > branch with all the necessary printk patches in it so
> >> > someone else could pull it.
> 
> I just worked on top of Greg's tree with the pending stuff for 3.5.

No worries Kay, as long as there's a tree with just printk
patches, it'll all work out.

> >> They should all now be in my driver-core-next branch that will show up
> >> in the next linux-next release, so having a separate tree isn't
> >> necessary.
> >
> > I don't think so.
> >
> > There are real defects in the existing code.
> >
> > These are patches that are necessary _now_.
> > not for a -next 3.6 future.
> 
> Wrong conclusion. They are not in the 3.6 branch, but still go into -next:
>   http://git.kernel.org/?p=linux/kernel/git/gregkh/driver-core.git;a=shortlog;h=refs/heads/driver-core-linus

That's the drivers-core-linus branch not the -next one.
Greg said -next.  Anyway, it's not a problem, Greg just
mistyped.  I do that too.  Sometimes too often.

cheers, Joe



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 3.5-rc6 printk formatting problem during oom-kill.
  2012-07-09 20:56           ` Greg Kroah-Hartmann
@ 2012-07-11  0:11             ` Dave Jones
  2012-07-11  0:15               ` Greg Kroah-Hartmann
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Jones @ 2012-07-11  0:11 UTC (permalink / raw)
  To: Greg Kroah-Hartmann
  Cc: Joe Perches, Kay Sievers, Linux Kernel, Linus Torvalds

On Mon, Jul 09, 2012 at 01:56:31PM -0700, Greg Kroah-Hartman wrote:
 > > > > That single patch doesn't apply cleanly to Linus'
 > > > > 8c84bf4166a4698296342841a549bbee03860ac0
 > > > > 
 > > > > What else is necessary?
 > > > > 
 > > > > Your tree seems to have a collection of random patches.
 > > > > 
 > > > > It might be useful to clone Linus' tree and produce a
 > > > > branch with all the necessary printk patches in it so
 > > > > someone else could pull it.
 > > > 
 > > > They should all now be in my driver-core-next branch that will show up
 > > > in the next linux-next release, so having a separate tree isn't
 > > > necessary.
 > > 
 > > I don't think so.
 > > 
 > > There are real defects in the existing code.
 > > 
 > > These are patches that are necessary _now_.
 > > not for a -next 3.6 future.
 > 
 > Oops, sorry, I ment to type, "driver-core-linus" above.

Can you get this to Linus asap please ? I just had an oops that scrolled
off the screen because the Code: line suffers the same problem.
Thankfully serial console caught most of it, but I had to hand-edit
a lot of crap just to make it readable again.

	Dave


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 3.5-rc6 printk formatting problem during oom-kill.
  2012-07-11  0:11             ` Dave Jones
@ 2012-07-11  0:15               ` Greg Kroah-Hartmann
  0 siblings, 0 replies; 19+ messages in thread
From: Greg Kroah-Hartmann @ 2012-07-11  0:15 UTC (permalink / raw)
  To: Dave Jones, Joe Perches, Kay Sievers, Linux Kernel, Linus Torvalds

On Tue, Jul 10, 2012 at 08:11:45PM -0400, Dave Jones wrote:
> On Mon, Jul 09, 2012 at 01:56:31PM -0700, Greg Kroah-Hartman wrote:
>  > > > > That single patch doesn't apply cleanly to Linus'
>  > > > > 8c84bf4166a4698296342841a549bbee03860ac0
>  > > > > 
>  > > > > What else is necessary?
>  > > > > 
>  > > > > Your tree seems to have a collection of random patches.
>  > > > > 
>  > > > > It might be useful to clone Linus' tree and produce a
>  > > > > branch with all the necessary printk patches in it so
>  > > > > someone else could pull it.
>  > > > 
>  > > > They should all now be in my driver-core-next branch that will show up
>  > > > in the next linux-next release, so having a separate tree isn't
>  > > > necessary.
>  > > 
>  > > I don't think so.
>  > > 
>  > > There are real defects in the existing code.
>  > > 
>  > > These are patches that are necessary _now_.
>  > > not for a -next 3.6 future.
>  > 
>  > Oops, sorry, I ment to type, "driver-core-linus" above.
> 
> Can you get this to Linus asap please ? I just had an oops that scrolled
> off the screen because the Code: line suffers the same problem.
> Thankfully serial console caught most of it, but I had to hand-edit
> a lot of crap just to make it readable again.

It will be sent tomorrow, I wanted to get it tested in linux-next at
least for one day.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 3.5-rc6 printk formatting problem during oom-kill.
  2012-07-09 18:48   ` Kay Sievers
  2012-07-09 20:31     ` Joe Perches
@ 2012-07-12  0:54     ` Dave Jones
  2012-07-12 13:52       ` Kay Sievers
  1 sibling, 1 reply; 19+ messages in thread
From: Dave Jones @ 2012-07-12  0:54 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Linux Kernel, Linus Torvalds, Greg Kroah-Hartmann

On Mon, Jul 09, 2012 at 08:48:51PM +0200, Kay Sievers wrote:
 > On Mon, 2012-07-09 at 20:27 +0200, Kay Sievers wrote:
 > > On Mon, Jul 9, 2012 at 8:03 PM, Dave Jones <davej@redhat.com> wrote:
 > > > I noticed that the format of the oom-killer output seems to have changed, and
 > > > now it spews stuff like..
 > > >
 > > > [49461.758070] lowmem_reserve[]:
 > > > [49461.758071]  0
 > > > [49461.758071]  2643
 > > > [49461.758071]  3878
 > > > [49461.758072]  3878
 > > > [49461.758072]
 > > > [49461.758072] Node 0
 > > 
 > > > Does the oom-killer code need modifying, or the printk code ?
 > > > I know there's been some regressions in this area recently, but this is still
 > > > happening on the current tree (8c84bf4166a4698296342841a549bbee03860ac0)
 > > 
 > > This likely fixes it:
 > >   http://git.kernel.org/?p=linux/kernel/git/kay/patches.git;a=blob;f=kmsg-merge-cont.patch;hb=HEAD
 > > 
 > > Let me check if it does, and if I can reproduce it.
 > 
 > It looks fine here with the above mentioned patch:

Now that that patch is in Linus tree, I've hit what's probably a different case.
Look at the modules list in this oops..

[10016.460020] BUG: soft lockup - CPU#1 stuck for 22s! [trinity-child1:24295]
[10016.470008]  rose<4>[10016.470008]  ip_set_bitmap_ipmac<4>[10016.470008]  nf_conntrack_h323<4>[10016.470008]  girbil_sir<4>[10016.470008]  ath9k_common<4>[10016.470008]  hdlcdrv<4>[10016.470008]  tun<4>[10016.470008]  spcp8x5<4>[10016.470008]  rc_streamzap<4>[10016.470008]  rc_medion_x10<4>[10016.470008]  gspca_mr97310a<4>[10016.470008]  hid_multitouch<4>[10016.470008]  fam15h_power<4>[10016.470008]  sym53c8xx<4>[10016.470008]  gunze<4>[10016.470008]  pata_ns87410<4>[10016.470008]  snd_ymfpci<4>[10016.470008]  michael_mic<4>[10016.470008]  blocklayoutdriver nfs_layout_nfsv41_files nfs fscache auth_rpcgss nfs_acl lockd ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables dm_mirror dm_region_hash dm_log btrfs zlib_deflate libcrc32c raid0 iTCO_wdt iTCO_vendor_support ppdev dcdbas coretemp kvm_intel kvm microcode snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device serio_raw snd_pcm lpc_ich mfd_core tg3 i2c_i801 pcspkr snd_timer i5000_edac edac_core snd i5k_amb soundcore snd_page_alloc parport_pc parport shpchp sunrpc firewire_ohci firewire_core crc_itu_t floppy nouveau ttm drm_kms_helper drm i2c_algo_bit i2c_core mxm_wmi video wmi [last unloaded: scsi_wait_scan]
[10016.470008] irq event stamp: 82066
[10016.470008] hardirqs last  enabled at (82065): [<ffffffff816b8cf0>] restore_args+0x0/0x30
[10016.470008] hardirqs last disabled at (82066): [<ffffffff816c236a>] apic_timer_interrupt+0x6a/0x80
[10016.470008] softirqs last  enabled at (82064): [<ffffffff8106ef8c>] __do_softirq+0x13c/0x3e0
[10016.470008] softirqs last disabled at (82055): [<ffffffff816c2d6c>] call_softirq+0x1c/0x30
[10016.470008] CPU 1 
[10016.470008] Modules linked in:<4>[10016.470008]  unix_diag<4>[10016.470008]  ip_set_bitmap_ipmac<4>[10016.470008]  nf_conntrack_h323<4>[10016.470008]  girbil_sir<4>[10016.470008]  ath9k_common<4>[10016.470008]  hdlcdrv<4>[10016.470008]  tun<4>[10016.470008]  spcp8x5<4>[10016.470008]  rc_streamzap<4>[10016.470008]  rc_medion_x10<4>[10016.470008]  gspca_mr97310a<4>[10016.470008]  hid_multitouch<4>[10016.470008]  fam15h_power<4>[10016.470008]  sym53c8xx<4>[10016.470008]  gunze<4>[10016.470008]  pata_ns87410<4>[10016.470008]  snd_ymfpci<4>[10016.470008]  michael_mic<4>[10016.470008]  blocklayoutdriver nfs_layout_nfsv41_files nfs fscache auth_rpcgss nfs_acl lockd ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables dm_mirror dm_region_hash dm_log btrfs zlib_deflate libcrc32c raid0 iTCO_wdt iTCO_vendor_support ppdev dcdbas coretemp kvm_intel kvm microcode snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device serio_raw snd_pcm lpc_ich mfd_core tg3 i2c_i801 pcspkr snd_timer i5000_edac edac_core snd i5k_amb soundcore snd_page_alloc parport_pc parport shpchp sunrpc firewire_ohci firewire_core crc_itu_t floppy nouveau ttm drm_kms_helper drm i2c_algo_bit i2c_core mxm_wmi video wmi [last unloaded: scsi_wait_scan]
[10016.470008] 
[10016.470008] Pid: 24295, comm: trinity-child1 Tainted: G         C   3.5.0-rc6+ #80 Dell Inc.                 Precision WorkStation 490    /0DT031
[10016.470008] RIP: 0010:[<ffffffff810ce800>]  [<ffffffff810ce800>] match_held_lock+0x190/0x190
[10016.470008] RSP: 0018:ffff88016bdf1c10  EFLAGS: 00000202
[10016.470008] RAX: 0000000000000001 RBX: ffffffff81021d43 RCX: 000000000000000f
[10016.470008] RDX: ffff88016bdf1fd8 RSI: ffffffff81c2fb60 RDI: ffffffff81c2fb60
[10016.470008] RBP: ffff88016bdf1c38 R08: 000000000000000a R09: 0000000000000000
[10016.470008] R10: 0000000000000001 R11: 0000000000000001 R12: ffffffff81021db9
[10016.470008] R13: ffff88016bdf1b88 R14: ffffffff81021d43 R15: ffff88016bdf1b78
[10016.470008] FS:  00007fbad33ad700(0000) GS:ffff880226800000(0000) knlGS:0000000000000000
[10016.470008] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[10016.470008] CR2: 0000000000000001 CR3: 00000001413a1000 CR4: 00000000000007e0
[10016.470008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[10016.470008] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[10016.470008] Process trinity-child1 (pid: 24295, threadinfo ffff88016bdf0000, task ffff88016a244900)
[10016.470008] Stack:
[10016.470008]  ffffffff8132ef67 ffff88016bdf1c48 000000000000229c 00000000000034b2
[10016.470008]  ffffffff81c6c7c0 ffff88016bdf1c98 ffffffff812b319c ffff88016bdf1c98
[10016.470008]  0000000000000246 ffff88015a81d428 ffff88016bdf1cf0 13722bca39882609
[10016.470008] Call Trace:
[10016.470008]  [<ffffffff8132ef67>] ? idr_find+0xd7/0x150
[10016.470008]  [<ffffffff812b319c>] sysvipc_find_ipc+0x5c/0x170
[10016.470008]  [<ffffffff812b3374>] sysvipc_proc_next+0x54/0xf0
[10016.470008]  [<ffffffff811e8cd9>] traverse+0xc9/0x240
[10016.470008]  [<ffffffff811e91e0>] ? seq_lseek+0x120/0x120
[10016.470008]  [<ffffffff811e954d>] seq_read+0x36d/0x420
[10016.470008]  [<ffffffff811e91e0>] ? seq_lseek+0x120/0x120
[10016.470008]  [<ffffffff811e91e0>] ? seq_lseek+0x120/0x120
[10016.470008]  [<ffffffff8122f916>] proc_reg_read+0x86/0xc0
[10016.470008]  [<ffffffff8122f890>] ? proc_reg_write+0xc0/0xc0
[10016.470008]  [<ffffffff811c4649>] do_loop_readv_writev+0x59/0x90
[10016.470008]  [<ffffffff811c4966>] do_readv_writev+0x1b6/0x1d0
[10016.470008]  [<ffffffff810d462d>] ? trace_hardirqs_on+0xd/0x10
[10016.470008]  [<ffffffff816b8350>] ? _raw_spin_unlock_irq+0x30/0x50
[10016.470008]  [<ffffffff816c1855>] ? sysret_check+0x22/0x5d
[10016.470008]  [<ffffffff811c4a1c>] vfs_readv+0x3c/0x50
[10016.470008]  [<ffffffff811c4c6a>] sys_preadv+0xba/0xc0
[10016.470008]  [<ffffffff816c1829>] system_call_fastpath+0x16/0x1b
[10016.470008] Code: 80 3d dc 14 c1 00 01 74 a0 be cb 02 00 00 48 c7 c7 dc 94 9d 81 e8 81 63 f9 ff c6 05 c2 14 c1 00 01 eb 86 0f 1f 84 00 00 00 00 00 <55> 48 89 e5 41 56 41 55 49 89 fd 41 54 65 4c 8b 24 25 c0 c8 00 

Also, I have no idea how the hell the 'Modules linked in:' line (9th line) ended up being printed /after/ the
module listing began (2nd line).

Sidenote: Who maintains sysvipc ?

	Dave


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 3.5-rc6 printk formatting problem during oom-kill.
  2012-07-12  0:54     ` Dave Jones
@ 2012-07-12 13:52       ` Kay Sievers
  2012-07-12 14:05         ` Dave Jones
  0 siblings, 1 reply; 19+ messages in thread
From: Kay Sievers @ 2012-07-12 13:52 UTC (permalink / raw)
  To: Dave Jones, Kay Sievers, Linux Kernel, Linus Torvalds,
	Greg Kroah-Hartmann

On Thu, Jul 12, 2012 at 2:54 AM, Dave Jones <davej@redhat.com> wrote:
> On Mon, Jul 09, 2012 at 08:48:51PM +0200, Kay Sievers wrote:

>  > It looks fine here with the above mentioned patch:
>
> Now that that patch is in Linus tree, I've hit what's probably a different case.
> Look at the modules list in this oops..
>
> [10016.460020] BUG: soft lockup - CPU#1 stuck for 22s! [trinity-child1:24295]
> [10016.470008]  rose<4>[10016.470008]  ip_set_bitmap_ipmac<4>[10016.470008]

> Also, I have no idea how the hell the 'Modules linked in:' line (9th line) ended up being printed /after/ the
> module listing began (2nd line).

It's the output of 'dmesg' you pasted, right?

I tried to force all sorts of racy print_modules() calls, and kept
your trinity tool from git for hours, it looks all fine here:

[   54.664301] NET: Registered protocol family 24
[   54.684721] Netfilter messages via NETLINK v0.30.
[   63.103847] ------------[ cut here ]------------
[   63.103857] WARNING: at kernel/futex.c:2452 sys_get_robust_list+0x28e/0x2a0()
[   63.103859] Hardware name: 4174NEG
[   63.103861] deprecated: get_robust_list will be deleted in 2013.
[   63.103862] Modules linked in: binfmt_misc nfnetlink pppoe pppox
ppp_generic slhc atm fuse arc4 i915 hid_generic i2c_algo_bit
drm_kms_helper drm snd_usb_audio snd_usbmidi_lib snd_rawmidi
snd_seq_device bnep bluetooth snd_hda_codec_hdmi
snd_hda_codec_conexant uvcvideo videobuf2_vmalloc videobuf2_memops
videobuf2_core videodev media coretemp kvm_intel kvm crc32c_intel
microcode i2c_i801 i2c_core iwlwifi snd_hda_intel snd_hda_codec
mac80211 snd_hwdep lpc_ich snd_pcm mfd_core sdhci_pci sdhci mmc_core
snd_page_alloc cfg80211 snd_timer thinkpad_acpi snd e1000e soundcore
rfkill video cdc_ncm wmi usbnet mii cdc_wdm cdc_acm
[   63.103921] Pid: 837, comm: trinity-child1 Not tainted
3.5.0-0.rc6.git2.1.fc18.x86_64 #1
[   63.103923] Call Trace:
[   63.103930]  [<ffffffff8106310f>] warn_slowpath_common+0x7f/0xc0
[   63.103934]  [<ffffffff81063206>] warn_slowpath_fmt+0x46/0x50
[   63.103938]  [<ffffffff810d253d>] ? trace_hardirqs_on_caller+0x10d/0x1a0
[   63.103941]  [<ffffffff810d6ffe>] sys_get_robust_list+0x28e/0x2a0
[   63.103946]  [<ffffffff816d7d29>] system_call_fastpath+0x16/0x1b
[   63.103948] ---[ end trace 796993e76a8397be ]---
[   63.126701] NET: Registered protocol family 23
[   63.397208] trinity-child3 (839): Using mlock ulimits for
SHM_HUGETLB is deprecated

Can you easily reproduce the issue you pasted?  If, could you give me
the /dev/kmsg output?

Thanks,
Kay

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 3.5-rc6 printk formatting problem during oom-kill.
  2012-07-12 13:52       ` Kay Sievers
@ 2012-07-12 14:05         ` Dave Jones
  2012-07-12 16:47           ` Kay Sievers
  2012-07-12 17:11           ` Linus Torvalds
  0 siblings, 2 replies; 19+ messages in thread
From: Dave Jones @ 2012-07-12 14:05 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Linux Kernel, Linus Torvalds, Greg Kroah-Hartmann

On Thu, Jul 12, 2012 at 03:52:17PM +0200, Kay Sievers wrote:
 > On Thu, Jul 12, 2012 at 2:54 AM, Dave Jones <davej@redhat.com> wrote:
 > > On Mon, Jul 09, 2012 at 08:48:51PM +0200, Kay Sievers wrote:
 > 
 > >  > It looks fine here with the above mentioned patch:
 > >
 > > Now that that patch is in Linus tree, I've hit what's probably a different case.
 > > Look at the modules list in this oops..
 > >
 > > [10016.460020] BUG: soft lockup - CPU#1 stuck for 22s! [trinity-child1:24295]
 > > [10016.470008]  rose<4>[10016.470008]  ip_set_bitmap_ipmac<4>[10016.470008]
 > 
 > > Also, I have no idea how the hell the 'Modules linked in:' line (9th line) ended up being printed /after/ the
 > > module listing began (2nd line).
 > 
 > It's the output of 'dmesg' you pasted, right?

Yes.

 > I tried to force all sorts of racy print_modules() calls, and kept
 > your trinity tool from git for hours, it looks all fine here:
 > 
 > Can you easily reproduce the issue you pasted?  If, could you give me
 > the /dev/kmsg output?

I've seen it a few times, always with the soft lockup trace.

You might be able to trigger it using scripts/load-all-modules.sh
from trinity.git. (Assuming you have a lot of modules built, I'm
still trying to track down which one seems to be responsible).

	Dave


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 3.5-rc6 printk formatting problem during oom-kill.
  2012-07-12 14:05         ` Dave Jones
@ 2012-07-12 16:47           ` Kay Sievers
  2012-07-12 17:02             ` Dave Jones
  2012-07-12 17:11           ` Linus Torvalds
  1 sibling, 1 reply; 19+ messages in thread
From: Kay Sievers @ 2012-07-12 16:47 UTC (permalink / raw)
  To: Dave Jones, Kay Sievers, Linux Kernel, Linus Torvalds,
	Greg Kroah-Hartmann

On Thu, Jul 12, 2012 at 4:05 PM, Dave Jones <davej@redhat.com> wrote:
> On Thu, Jul 12, 2012 at 03:52:17PM +0200, Kay Sievers wrote:
>  > On Thu, Jul 12, 2012 at 2:54 AM, Dave Jones <davej@redhat.com> wrote:
>  > > On Mon, Jul 09, 2012 at 08:48:51PM +0200, Kay Sievers wrote:
>  >
>  > >  > It looks fine here with the above mentioned patch:
>  > >
>  > > Now that that patch is in Linus tree, I've hit what's probably a different case.
>  > > Look at the modules list in this oops..
>  > >
>  > > [10016.460020] BUG: soft lockup - CPU#1 stuck for 22s! [trinity-child1:24295]
>  > > [10016.470008]  rose<4>[10016.470008]  ip_set_bitmap_ipmac<4>[10016.470008]
>  >
>  > > Also, I have no idea how the hell the 'Modules linked in:' line (9th line) ended up being printed /after/ the
>  > > module listing began (2nd line).

They do not belong together. The second line is just another call to
the same print_modules() done from:
    arch/x86/kernel/dumpstack_64.c :: show_regs()

While we already called print_modules() a few cycles earlier from:
   kernel/watchdog :: watchdog_timer_fn()

>  > I tried to force all sorts of racy print_modules() calls, and kept
>  > your trinity tool from git for hours, it looks all fine here:
>  >
>  > Can you easily reproduce the issue you pasted?  If, could you give me
>  > the /dev/kmsg output?
>
> I've seen it a few times, always with the soft lockup trace.
>
> You might be able to trigger it using scripts/load-all-modules.sh
> from trinity.git. (Assuming you have a lot of modules built, I'm
> still trying to track down which one seems to be responsible).

Hmm, it does not trigger your pattern. I tried adding an rmmod in that
loop, but that crashes after a few seconds. Some modules are just not
meant to be removed. :)

I forced the watchdog to trigger by setting the timeout to 1s, but all
looks still fine:

[   20.854010] BUG: soft lockup - CPU#0 stuck for 1000s! [trinity:247]
[   20.854010] Modules linked in: nfnetlink xfrm_user xfrm_algo pppoe
pppox ppp_generic slhc atm bluetooth rfkill microcode cirrus ttm
sr_mod cdrom pcspkr drm_kms_helper drm syscopyarea sysfillrect
sysimgblt floppy evdev ipv6
[   20.854010] irq event stamp: 980768
[   20.854010] hardirqs last  enabled at (980767):
[<ffffffff814a425d>] __slab_alloc.constprop.65+0x3c9/0x408
[   20.854010] hardirqs last disabled at (980768):
[<ffffffff814adfaa>] apic_timer_interrupt+0x6a/0x80
[   20.854010] softirqs last  enabled at (978462):
[<ffffffff8103db1f>] __do_softirq+0x11f/0x170
[   20.854010] softirqs last disabled at (978457):
[<ffffffff814ae82c>] call_softirq+0x1c/0x30
[   20.854010] CPU 0
[   20.854010] Modules linked in: nfnetlink xfrm_user xfrm_algo pppoe
pppox ppp_generic slhc atm bluetooth rfkill microcode cirrus ttm
sr_mod cdrom pcspkr drm_kms_helper drm syscopyarea sysfillrect
sysimgblt floppy evdev ipv6

Could it possibly be that we get some sort of corruption somewhere
else while running trinity and load modules?

Kay

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 3.5-rc6 printk formatting problem during oom-kill.
  2012-07-12 16:47           ` Kay Sievers
@ 2012-07-12 17:02             ` Dave Jones
  0 siblings, 0 replies; 19+ messages in thread
From: Dave Jones @ 2012-07-12 17:02 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Linux Kernel, Linus Torvalds, Greg Kroah-Hartmann

On Thu, Jul 12, 2012 at 06:47:57PM +0200, Kay Sievers wrote:
 
 > >  > > Also, I have no idea how the hell the 'Modules linked in:' line (9th line) ended up being printed /after/ the
 > >  > > module listing began (2nd line).
 > 
 > They do not belong together. The second line is just another call to
 > the same print_modules() done from:
 >     arch/x86/kernel/dumpstack_64.c :: show_regs()
 > 
 > While we already called print_modules() a few cycles earlier from:
 >    kernel/watchdog :: watchdog_timer_fn()

ah, derp. Good spotting.
 
 > Hmm, it does not trigger your pattern. I tried adding an rmmod in that
 > loop, but that crashes after a few seconds. Some modules are just not
 > meant to be removed. :)

yeah, that's bothersome. TTBOMK, we've never survived a 'load all modules/unload all modules' ever.
 
 > I forced the watchdog to trigger by setting the timeout to 1s, but all
 > looks still fine:
 > 
 > Could it possibly be that we get some sort of corruption somewhere
 > else while running trinity and load modules?

I suppose anything is possible, but it's surprisingly consistent.

	Dave

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 3.5-rc6 printk formatting problem during oom-kill.
  2012-07-12 14:05         ` Dave Jones
  2012-07-12 16:47           ` Kay Sievers
@ 2012-07-12 17:11           ` Linus Torvalds
  2012-07-12 18:25             ` Kay Sievers
  1 sibling, 1 reply; 19+ messages in thread
From: Linus Torvalds @ 2012-07-12 17:11 UTC (permalink / raw)
  To: Dave Jones, Kay Sievers, Linux Kernel, Linus Torvalds,
	Greg Kroah-Hartmann

On Thu, Jul 12, 2012 at 7:05 AM, Dave Jones <davej@redhat.com> wrote:
>
> I've seen it a few times, always with the soft lockup trace.

I bet it's because you have tons of modules, and the line ends up
being *really* long. And overflows LOG_LINE_MAX. I suspect something
odd happens.

There are tons of odd special cases for LOG_LINE_MAX, and I bet Kay
doesn't see it for the simple reason that he's not totally insane, and
hasn't loaded hundreds of modules.

Kay, I suspect the "continuation line" logic could easily have a rule like

  "If the old line is already > 80 characters, do a line break here
and add TAB to the beginning of the new line"

In fact, that could be really nice for things like stack dumps etc -
we wouldn't have to worry about line breaks and crap, if the printk
logic just makes "KERN_CONT" do a line break automatically if it
doesn't fit on the screen.

People who use KERN_CONT don't do it because they *need* things to be
on one line (it's not guaranteed anyway), they do it because they want
the output to be dense and readable. Doing auto-line-break would
actually *help*. And would mean that you never hit the odd
LOG_LINE_MAX cases just because somebody is printing lots of modules.

             Linus

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 3.5-rc6 printk formatting problem during oom-kill.
  2012-07-12 17:11           ` Linus Torvalds
@ 2012-07-12 18:25             ` Kay Sievers
  2012-07-12 22:28               ` Kay Sievers
  0 siblings, 1 reply; 19+ messages in thread
From: Kay Sievers @ 2012-07-12 18:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Dave Jones, Linux Kernel, Greg Kroah-Hartmann

On Thu, Jul 12, 2012 at 7:11 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Thu, Jul 12, 2012 at 7:05 AM, Dave Jones <davej@redhat.com> wrote:
>>
>> I've seen it a few times, always with the soft lockup trace.
>
> I bet it's because you have tons of modules, and the line ends up
> being *really* long. And overflows LOG_LINE_MAX. I suspect something
> odd happens.

Straight to the point. That's the issue, combined with the later
safety range checks, we produce somehow "ugly" output.

We already flush the line out of the buffer when we can not add stuff
anymore. The line length is then close to LOG_LINE_MAX, and we want to
add a prefix with the timestamp during output and we reach the limit
of LOG_LINE_MAX.

> There are tons of odd special cases for LOG_LINE_MAX, and I bet Kay
> doesn't see it for the simple reason that he's not totally insane, and
> hasn't loaded hundreds of modules.

Yeah, I just added a loop now that prints over-long continuation
lines, and I can see parts of the effect Dave sees.

> Kay, I suspect the "continuation line" logic could easily have a rule like
>
>   "If the old line is already > 80 characters, do a line break here
> and add TAB to the beginning of the new line"
>
> In fact, that could be really nice for things like stack dumps etc -
> we wouldn't have to worry about line breaks and crap, if the printk
> logic just makes "KERN_CONT" do a line break automatically if it
> doesn't fit on the screen.
>
> People who use KERN_CONT don't do it because they *need* things to be
> on one line (it's not guaranteed anyway), they do it because they want
> the output to be dense and readable. Doing auto-line-break would
> actually *help*. And would mean that you never hit the odd
> LOG_LINE_MAX cases just because somebody is printing lots of modules.

That sounds good. I'll look into that.

Kay

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 3.5-rc6 printk formatting problem during oom-kill.
  2012-07-12 18:25             ` Kay Sievers
@ 2012-07-12 22:28               ` Kay Sievers
  0 siblings, 0 replies; 19+ messages in thread
From: Kay Sievers @ 2012-07-12 22:28 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Dave Jones, Linux Kernel, Greg Kroah-Hartmann

On Thu, 2012-07-12 at 20:25 +0200, Kay Sievers wrote:
> On Thu, Jul 12, 2012 at 7:11 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> > On Thu, Jul 12, 2012 at 7:05 AM, Dave Jones <davej@redhat.com> wrote:
> >>
> >> I've seen it a few times, always with the soft lockup trace.
> >
> > I bet it's because you have tons of modules, and the line ends up
> > being *really* long. And overflows LOG_LINE_MAX. I suspect something
> > odd happens.
> 
> Straight to the point. That's the issue, combined with the later
> safety range checks, we produce somehow "ugly" output.
> 
> We already flush the line out of the buffer when we can not add stuff
> anymore. The line length is then close to LOG_LINE_MAX, and we want to
> add a prefix with the timestamp during output and we reach the limit
> of LOG_LINE_MAX.
> 
> > There are tons of odd special cases for LOG_LINE_MAX, and I bet Kay
> > doesn't see it for the simple reason that he's not totally insane, and
> > hasn't loaded hundreds of modules.
> 
> Yeah, I just added a loop now that prints over-long continuation
> lines, and I can see parts of the effect Dave sees.
> 
> > Kay, I suspect the "continuation line" logic could easily have a rule like
> >
> >   "If the old line is already > 80 characters, do a line break here
> > and add TAB to the beginning of the new line"

This seems to fix the issue for me, which Dave has posted. I was able to
partly reproduce it by printing continuation lines larger than 1000
bytes. The overlong lines did not leave enough room for the later syslog
and timestamp prefixing.

I'll look into the automatic split of lines later, it is not as trivial
as this fix.

While we are at the over-long log line topic, Harald just pointed me to
this unrelated mail, sent an hour ago. :)

  "The "Modules linked in:" lines say only <omitted>,
  because the lines become too long and cause netconsole to eat most of
  the message."

  https://lkml.org/lkml/2012/7/12/484


Thanks,
Kay


From: Kay Sievers <kay@vrfy.org>
Subject: kmsg - properly print over-long continuation lines

Reserve PREFIX_MAX bytes in the LOG_LINE_MAX line when buffering a
continuation line, to be able to properly prefix the LOG_LINE_MAX
line with the syslog prefix and timestamp when printing it.

Reported-By: Dave Jones <davej@redhat.com>
Signed-off-by: Kay Sievers <kay@vrfy.org>
---

 kernel/printk.c |   33 +++++++++++++++++++--------------
 1 file changed, 19 insertions(+), 14 deletions(-)

--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -235,7 +235,8 @@ static u32 log_next_idx;
 static u64 clear_seq;
 static u32 clear_idx;
 
-#define LOG_LINE_MAX 1024
+#define PREFIX_MAX		32
+#define LOG_LINE_MAX		1024 - PREFIX_MAX
 
 /* record buffer */
 #if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
@@ -876,7 +877,7 @@ static size_t msg_print_text(const struc
 
 		if (buf) {
 			if (print_prefix(msg, syslog, NULL) +
-			    text_len + 1>= size - len)
+			    text_len + 1 >= size - len)
 				break;
 
 			if (prefix)
@@ -907,7 +908,7 @@ static int syslog_print(char __user *buf
 	struct log *msg;
 	int len = 0;
 
-	text = kmalloc(LOG_LINE_MAX, GFP_KERNEL);
+	text = kmalloc(LOG_LINE_MAX + PREFIX_MAX, GFP_KERNEL);
 	if (!text)
 		return -ENOMEM;
 
@@ -930,7 +931,8 @@ static int syslog_print(char __user *buf
 
 		skip = syslog_partial;
 		msg = log_from_idx(syslog_idx);
-		n = msg_print_text(msg, syslog_prev, true, text, LOG_LINE_MAX);
+		n = msg_print_text(msg, syslog_prev, true, text,
+				   LOG_LINE_MAX + PREFIX_MAX);
 		if (n - syslog_partial <= size) {
 			/* message fits into buffer, move forward */
 			syslog_idx = log_next(syslog_idx);
@@ -969,7 +971,7 @@ static int syslog_print_all(char __user
 	char *text;
 	int len = 0;
 
-	text = kmalloc(LOG_LINE_MAX, GFP_KERNEL);
+	text = kmalloc(LOG_LINE_MAX + PREFIX_MAX, GFP_KERNEL);
 	if (!text)
 		return -ENOMEM;
 
@@ -1022,7 +1024,8 @@ static int syslog_print_all(char __user
 			struct log *msg = log_from_idx(idx);
 			int textlen;
 
-			textlen = msg_print_text(msg, prev, true, text, LOG_LINE_MAX);
+			textlen = msg_print_text(msg, prev, true, text,
+						 LOG_LINE_MAX + PREFIX_MAX);
 			if (textlen < 0) {
 				len = textlen;
 				break;
@@ -1367,15 +1370,15 @@ static struct cont {
 	bool flushed:1;			/* buffer sealed and committed */
 } cont;
 
-static void cont_flush(void)
+static void cont_flush(enum log_flags flags)
 {
 	if (cont.flushed)
 		return;
 	if (cont.len == 0)
 		return;
 
-	log_store(cont.facility, cont.level, LOG_NOCONS, cont.ts_nsec,
-		  NULL, 0, cont.buf, cont.len);
+	log_store(cont.facility, cont.level, LOG_NOCONS | flags,
+		  cont.ts_nsec, NULL, 0, cont.buf, cont.len);
 
 	cont.flushed = true;
 }
@@ -1386,7 +1389,8 @@ static bool cont_add(int facility, int l
 		return false;
 
 	if (cont.len + len > sizeof(cont.buf)) {
-		cont_flush();
+		/* the line gets too long, split it up in separate records */
+		cont_flush(LOG_CONT);
 		return false;
 	}
 
@@ -1522,7 +1526,7 @@ asmlinkage int vprintk_emit(int facility
 		 * or another task also prints continuation lines.
 		 */
 		if (cont.len && (lflags & LOG_PREFIX || cont.owner != current))
-			cont_flush();
+			cont_flush(0);
 
 		/* buffer line if possible, otherwise store it right away */
 		if (!cont_add(facility, level, text, text_len))
@@ -1540,7 +1544,7 @@ asmlinkage int vprintk_emit(int facility
 		if (cont.len && cont.owner == current) {
 			if (!(lflags & LOG_PREFIX))
 				stored = cont_add(facility, level, text, text_len);
-			cont_flush();
+			cont_flush(0);
 		}
 
 		if (!stored)
@@ -1633,7 +1637,8 @@ EXPORT_SYMBOL(printk);
 
 #else
 
-#define LOG_LINE_MAX 0
+#define LOG_LINE_MAX		0
+#define PREFIX_MAX		0
 static struct cont {
 	size_t len;
 	size_t cons;
@@ -1938,7 +1943,7 @@ static enum log_flags console_prev;
  */
 void console_unlock(void)
 {
-	static char text[LOG_LINE_MAX];
+	static char text[LOG_LINE_MAX + PREFIX_MAX];
 	static u64 seen_seq;
 	unsigned long flags;
 	bool wake_klogd = false;



^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2012-07-12 22:28 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-09 18:03 3.5-rc6 printk formatting problem during oom-kill Dave Jones
2012-07-09 18:27 ` Kay Sievers
2012-07-09 18:48   ` Kay Sievers
2012-07-09 20:31     ` Joe Perches
2012-07-09 20:40       ` Greg Kroah-Hartmann
2012-07-09 20:44         ` Joe Perches
2012-07-09 20:48           ` Kay Sievers
2012-07-09 21:08             ` Joe Perches
2012-07-09 20:56           ` Greg Kroah-Hartmann
2012-07-11  0:11             ` Dave Jones
2012-07-11  0:15               ` Greg Kroah-Hartmann
2012-07-12  0:54     ` Dave Jones
2012-07-12 13:52       ` Kay Sievers
2012-07-12 14:05         ` Dave Jones
2012-07-12 16:47           ` Kay Sievers
2012-07-12 17:02             ` Dave Jones
2012-07-12 17:11           ` Linus Torvalds
2012-07-12 18:25             ` Kay Sievers
2012-07-12 22:28               ` Kay Sievers

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.