Linux-Next Archive on lore.kernel.org
 help / color / Atom feed
* crisv32 runtime failure in -next due to 'page-flags: define behavior SL*B-related flags on compound pages'
@ 2015-09-17 16:29 Guenter Roeck
  2015-09-18 14:25 ` Kirill A. Shutemov
  0 siblings, 1 reply; 19+ messages in thread
From: Guenter Roeck @ 2015-09-17 16:29 UTC (permalink / raw)
  To: linux-next, linux-kernel
  Cc: Kirill A. Shutemov, Andrew Morton, Mikael Starvik,
	Jesper Nilsson, Minchan Kim, linux-cris-kernel

Hi,

my crisv32 qemu test fails with next-20150917 as follows.

NET: Registered protocol family 16
kernel BUG at mm/slab.c:1648!
Linux 4.3.0-rc1-next-20150917 #1 Wed Sep 16 23:56:59 PDT 2015
Oops: 0000

[ register dump follows ]

See http://server.roeck-us.net:8010/builders/qemu-crisv32-next/builds/83/steps/qemubuildcommand/logs/stdio
for a complete log.

Bisect points to 'page-flags: define behavior SL*B-related flags on compound pages'.

One of the bisect steps, at 'mm: support madvise(MADV_FREE)', causes a build failure,
but that did not seem to affect the overall bisect results. Just for completeness,
here is the reported build failure.

mm/vmscan.c: In function ‘shrink_page_list’:
mm/vmscan.c:1093:5: error: implicit declaration of function ‘__clear_page_locked’ [-Werror=implicit-function-declaration]
      __clear_page_locked(page);
      ^

Guenter

---
Bisect log:

# bad: [01c8787d7f7ea56c16d94cf7133022189be231ad] Add linux-next specific files for 20150917
# good: [6ff33f3902c3b1c5d0db6b1e2c70b6d76fba357f] Linux 4.3-rc1
git bisect start 'HEAD' 'v4.3-rc1'
# good: [5b986ec423c0e7615824a46b48e6d5cfa45ac82d] Merge remote-tracking branch 'battery/master'
git bisect good 5b986ec423c0e7615824a46b48e6d5cfa45ac82d
# good: [d4cb33477f815c8a0c8b1c28e986c79d89d37491] Merge remote-tracking branch 'staging/staging-next'
git bisect good d4cb33477f815c8a0c8b1c28e986c79d89d37491
# good: [b73c6555e0733dc811d6f7b533183c518d6c7bb2] Merge remote-tracking branch 'livepatching/for-next'
git bisect good b73c6555e0733dc811d6f7b533183c518d6c7bb2
# good: [5c74cd46dcbc65a682b2bdfc74d51feac91c0e0f] mm-introduce-vm_lockonfault-v9
git bisect good 5c74cd46dcbc65a682b2bdfc74d51feac91c0e0f
# bad: [b928350d7f112a2525516c444ebac84ed53e937e] mm: support madvise(MADV_FREE)
git bisect bad b928350d7f112a2525516c444ebac84ed53e937e
# good: [8129659afb03679ae08e6bc862c78bc571a10170] page-flags-introduce-page-flags-policies-wrt-compound-pages-fix
git bisect good 8129659afb03679ae08e6bc862c78bc571a10170
# bad: [f285037c71068de549f2432c9d275844575707de] page-flags: look at head page if the flag is encoded in page->mapping
git bisect bad f285037c71068de549f2432c9d275844575707de
# bad: [ceba84a8eac816c92d1f1789b1273f812ac9af80] page-flags: define PG_reserved behavior on compound pages
git bisect bad ceba84a8eac816c92d1f1789b1273f812ac9af80
# good: [40774838143f84a5c38ecee2f142bbd6b9a0687a] page-flags: define behavior of LRU-related flags on compound pages
git bisect good 40774838143f84a5c38ecee2f142bbd6b9a0687a
# bad: [fcd079cce0714accc45d35ffad59bf05ddb37893] page-flags: define behavior of Xen-related flags on compound pages
git bisect bad fcd079cce0714accc45d35ffad59bf05ddb37893
# bad: [2c42f049fc7aa8198514707bce436c7f9f2c318f] page-flags: define behavior SL*B-related flags on compound pages
git bisect bad 2c42f049fc7aa8198514707bce436c7f9f2c318f
# first bad commit: [2c42f049fc7aa8198514707bce436c7f9f2c318f] page-flags: define behavior SL*B-related flags on compound pages

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: crisv32 runtime failure in -next due to 'page-flags: define behavior SL*B-related flags on compound pages'
  2015-09-17 16:29 crisv32 runtime failure in -next due to 'page-flags: define behavior SL*B-related flags on compound pages' Guenter Roeck
@ 2015-09-18 14:25 ` Kirill A. Shutemov
  2015-09-18 14:53   ` Jesper Nilsson
  0 siblings, 1 reply; 19+ messages in thread
From: Kirill A. Shutemov @ 2015-09-18 14:25 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: linux-next, linux-kernel, Kirill A. Shutemov, Andrew Morton,
	Mikael Starvik, Jesper Nilsson, Minchan Kim, linux-cris-kernel

On Thu, Sep 17, 2015 at 09:29:27AM -0700, Guenter Roeck wrote:
> Hi,
> 
> my crisv32 qemu test fails with next-20150917 as follows.
> 
> NET: Registered protocol family 16
> kernel BUG at mm/slab.c:1648!
> Linux 4.3.0-rc1-next-20150917 #1 Wed Sep 16 23:56:59 PDT 2015
> Oops: 0000
> 
> [ register dump follows ]
> 
> See http://server.roeck-us.net:8010/builders/qemu-crisv32-next/builds/83/steps/qemubuildcommand/logs/stdio
> for a complete log.

Is there a chance to get proper backtrace?

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: crisv32 runtime failure in -next due to 'page-flags: define behavior SL*B-related flags on compound pages'
  2015-09-18 14:25 ` Kirill A. Shutemov
@ 2015-09-18 14:53   ` Jesper Nilsson
  2015-09-18 15:13     ` Guenter Roeck
  0 siblings, 1 reply; 19+ messages in thread
From: Jesper Nilsson @ 2015-09-18 14:53 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Guenter Roeck, linux-next, linux-kernel, Kirill A. Shutemov,
	Andrew Morton, Mikael Starvik, Jesper Nilsson, Minchan Kim,
	linux-cris-kernel

On Fri, Sep 18, 2015 at 05:25:07PM +0300, Kirill A. Shutemov wrote:
> On Thu, Sep 17, 2015 at 09:29:27AM -0700, Guenter Roeck wrote:
> > Hi,
> > 
> > my crisv32 qemu test fails with next-20150917 as follows.
> > 
> > NET: Registered protocol family 16
> > kernel BUG at mm/slab.c:1648!
> > Linux 4.3.0-rc1-next-20150917 #1 Wed Sep 16 23:56:59 PDT 2015
> > Oops: 0000
> > 
> > [ register dump follows ]
> > 
> > See http://server.roeck-us.net:8010/builders/qemu-crisv32-next/builds/83/steps/qemubuildcommand/logs/stdio
> > for a complete log.
> 
> Is there a chance to get proper backtrace?

Yes, it should be possible with CONFIG_KALLSYMS=y in the kconfig.

/^JN - Jesper Nilsson
-- 
               Jesper Nilsson -- jesper.nilsson@axis.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: crisv32 runtime failure in -next due to 'page-flags: define behavior SL*B-related flags on compound pages'
  2015-09-18 14:53   ` Jesper Nilsson
@ 2015-09-18 15:13     ` Guenter Roeck
  2015-09-21 15:34       ` Kirill A. Shutemov
  0 siblings, 1 reply; 19+ messages in thread
From: Guenter Roeck @ 2015-09-18 15:13 UTC (permalink / raw)
  To: Jesper Nilsson, Kirill A. Shutemov
  Cc: linux-next, linux-kernel, Kirill A. Shutemov, Andrew Morton,
	Mikael Starvik, Jesper Nilsson, Minchan Kim, linux-cris-kernel

On 09/18/2015 07:53 AM, Jesper Nilsson wrote:
> On Fri, Sep 18, 2015 at 05:25:07PM +0300, Kirill A. Shutemov wrote:
>> On Thu, Sep 17, 2015 at 09:29:27AM -0700, Guenter Roeck wrote:
>>> Hi,
>>>
>>> my crisv32 qemu test fails with next-20150917 as follows.
>>>
>>> NET: Registered protocol family 16
>>> kernel BUG at mm/slab.c:1648!
>>> Linux 4.3.0-rc1-next-20150917 #1 Wed Sep 16 23:56:59 PDT 2015
>>> Oops: 0000
>>>
>>> [ register dump follows ]
>>>
>>> See http://server.roeck-us.net:8010/builders/qemu-crisv32-next/builds/83/steps/qemubuildcommand/logs/stdio
>>> for a complete log.
>>
>> Is there a chance to get proper backtrace?
>
> Yes, it should be possible with CONFIG_KALLSYMS=y in the kconfig.
>

Good to know. I added it to my configuration.

Here it is:

kernel BUG at mm/slab.c:1648!
Linux 4.3.0-rc1-next-20150918 #1 Fri Sep 18 08:07:55 PDT 2015
Oops: 0000
CPU: 0
ERP: c0074c00 SRP: c0074c1e  CCS: 00009408 USP: 00000000 MOF: 00000000
  r0: 0000002a  r1: ffffe000   r2: c000f2ca  r3: c0074e40
  r4: c0033d00  r5: 0000000a   r6: 00000009  r7: 00000100
  r8: 0000000a  r9: c04a3254  r10: c1e97765 r11: c1fc1282
r12: c04a3240 r13: 00000000 oR10: c1e97765 acr: 00000001
  sp: c046458c
        Data MMU Cause: 00000000
Instruction MMU Cause: 00000000
Process ksoftirqd/0 (pid: 3, stackpage=c1feb3a0)

Stack from c1ff3e80:

c0004a0e c004ba7c
c04645e0 00000000
c04645d8 0000000a
c00059e8 00000001

        00000000
c046458c 00000000
c0341cfc c004ba7c
c0033d00 c0004b52
0000002a

ffffe000 c000f2ca
c0074e40 c0005898
c0074c1e c1e97765
c0074c1e c0033bd2

Call Trace: [<c0004a0e>] show_stack+0x0/0x9e
[<c004ba7c>] printk+0x0/0x2c
[<c00059e8>] show_registers+0x14a/0x1c2
[<c004ba7c>] printk+0x0/0x2c
[<c0033d00>] rcu_bh_qs+0x0/0x30
[<c0004b52>] die_if_kernel+0x7c/0x9e
[<c000f2ca>] __local_bh_enable_ip+0x0/0xb0
[<c0074e40>] kfree+0x0/0x82
[<c0005898>] do_BUG+0x2a/0x30
[<c0074c1e>] kmem_rcu_free+0x18/0x1e
[<c0074c1e>] kmem_rcu_free+0x18/0x1e
[<c0033bd2>] __rcu_process_callbacks+0x80/0xaa
[<c0033b52>] __rcu_process_callbacks+0x0/0xaa
[<c0033c14>] rcu_process_callbacks+0x18/0x2a
[<c000f15e>] __do_softirq+0xbc/0x1c2
[<c002008c>] kthread_should_stop+0x0/0x18
[<c00200a4>] kthread_should_park+0x0/0x18
[<c02d5f5e>] schedule+0x0/0x7c
[<c000f286>] run_ksoftirqd+0x22/0x32
[<c00228e8>] smpboot_thread_fn+0xe4/0x12c
[<c0022804>] smpboot_thread_fn+0x0/0x12c
[<c0028e88>] __init_waitqueue_head+0x0/0x8
[<c002048e>] kthread+0xb6/0xce
[<c00203d8>] kthread+0x0/0xce
[<c00055fa>] ret_from_kernel_thread+0xe/0x14

Code: 2a d6 6d da 87 d3 65 70 b0 05 3e e9 (7f) d2 65 e0 29 d6 84 e2 7e ba 6a c6
---[ end trace cc289b1d61e6a9a8 ]---

Kernel panic - not syncing: Aiee, killing interrupt handler!
CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.3.0-rc1-next-20150918 #1

Stack from c1ff3e28:

c1feb3a0 0000000b
c0341cfc c004ba7c
c0033d00 0000000a
c0177e2c c004b8fa

        c1feb3a0
0000000b c0341cfc
c004ba7c c0033d00
0000000a 00000009
c000dd54

c0342a14 00000000
00000061 00000000
c046458c 00000000
c0341cfc c004ba7c

Call Trace: [<c004ba7c>] printk+0x0/0x2c
[<c0033d00>] rcu_bh_qs+0x0/0x30
[<c0177e2c>] dump_stack+0x1c/0x22
[<c004b8fa>] panic+0x82/0x19c
[<c004ba7c>] printk+0x0/0x2c
[<c0033d00>] rcu_bh_qs+0x0/0x30
[<c000dd54>] do_exit+0x5e0/0x7da
[<c004ba7c>] printk+0x0/0x2c
[<c0033d00>] rcu_bh_qs+0x0/0x30
[<c004baa2>] printk+0x26/0x2c
[<c0004b6a>] die_if_kernel+0x94/0x9e
[<c0004b72>] die_if_kernel+0x9c/0x9e
[<c000f2ca>] __local_bh_enable_ip+0x0/0xb0
[<c0074e40>] kfree+0x0/0x82
[<c0005898>] do_BUG+0x2a/0x30
[<c0074c1e>] kmem_rcu_free+0x18/0x1e
[<c0074c1e>] kmem_rcu_free+0x18/0x1e
[<c0033bd2>] __rcu_process_callbacks+0x80/0xaa
[<c0033b52>] __rcu_process_callbacks+0x0/0xaa
[<c0033c14>] rcu_process_callbacks+0x18/0x2a
[<c000f15e>] __do_softirq+0xbc/0x1c2
[<c002008c>] kthread_should_stop+0x0/0x18
[<c00200a4>] kthread_should_park+0x0/0x18
[<c02d5f5e>] schedule+0x0/0x7c
[<c000f286>] run_ksoftirqd+0x22/0x32
[<c00228e8>] smpboot_thread_fn+0xe4/0x12c
[<c0022804>] smpboot_thread_fn+0x0/0x12c
[<c0028e88>] __init_waitqueue_head+0x0/0x8
[<c002048e>] kthread+0xb6/0xce
[<c00203d8>] kthread+0x0/0xce
[<c00055fa>] ret_from_kernel_thread+0xe/0x14
---[ end Kernel panic - not syncing: Aiee, killing interrupt handler!

Guenter

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: crisv32 runtime failure in -next due to 'page-flags: define behavior SL*B-related flags on compound pages'
  2015-09-18 15:13     ` Guenter Roeck
@ 2015-09-21 15:34       ` Kirill A. Shutemov
  2015-09-22  1:17         ` Guenter Roeck
  0 siblings, 1 reply; 19+ messages in thread
From: Kirill A. Shutemov @ 2015-09-21 15:34 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Jesper Nilsson, Kirill A. Shutemov, linux-next, linux-kernel,
	Kirill A. Shutemov, Andrew Morton, Mikael Starvik,
	Jesper Nilsson, Minchan Kim, linux-cris-kernel

Guenter Roeck wrote:
> On 09/18/2015 07:53 AM, Jesper Nilsson wrote:
> > On Fri, Sep 18, 2015 at 05:25:07PM +0300, Kirill A. Shutemov wrote:
> >> On Thu, Sep 17, 2015 at 09:29:27AM -0700, Guenter Roeck wrote:
> >>> Hi,
> >>>
> >>> my crisv32 qemu test fails with next-20150917 as follows.
> >>>
> >>> NET: Registered protocol family 16
> >>> kernel BUG at mm/slab.c:1648!
> >>> Linux 4.3.0-rc1-next-20150917 #1 Wed Sep 16 23:56:59 PDT 2015
> >>> Oops: 0000
> >>>
> >>> [ register dump follows ]
> >>>
> >>> See http://server.roeck-us.net:8010/builders/qemu-crisv32-next/builds/83/steps/qemubuildcommand/logs/stdio
> >>> for a complete log.
> >>
> >> Is there a chance to get proper backtrace?
> >
> > Yes, it should be possible with CONFIG_KALLSYMS=y in the kconfig.
> >
> 
> Good to know. I added it to my configuration.
> 
> Here it is:
> 
> kernel BUG at mm/slab.c:1648!

I still don't understand what's going on :(
Could you try with this instrumentation:

diff --git a/mm/slab.c b/mm/slab.c
index ce9c6531e6f7..10035d1a06d3 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1645,7 +1645,11 @@ static void kmem_freepages(struct kmem_cache *cachep, struct page *page)
                sub_zone_page_state(page_zone(page),
                                NR_SLAB_UNRECLAIMABLE, nr_freed);
 
-       BUG_ON(!PageSlab(page));
+       if (!PageSlab(page)) {
+               dump_page(page, "page");
+               dump_page(compound_head(page), "compound_head(page)");
+               BUG();
+       }
        __ClearPageSlabPfmemalloc(page);
        __ClearPageSlab(page);
        page_mapcount_reset(page);
-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: crisv32 runtime failure in -next due to 'page-flags: define behavior SL*B-related flags on compound pages'
  2015-09-21 15:34       ` Kirill A. Shutemov
@ 2015-09-22  1:17         ` Guenter Roeck
  2015-09-22 12:03           ` Kirill A. Shutemov
  0 siblings, 1 reply; 19+ messages in thread
From: Guenter Roeck @ 2015-09-22  1:17 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Jesper Nilsson, Kirill A. Shutemov, linux-next, linux-kernel,
	Andrew Morton, Mikael Starvik, Jesper Nilsson, Minchan Kim,
	linux-cris-kernel

On 09/21/2015 08:34 AM, Kirill A. Shutemov wrote:
> Guenter Roeck wrote:
>> On 09/18/2015 07:53 AM, Jesper Nilsson wrote:
>>> On Fri, Sep 18, 2015 at 05:25:07PM +0300, Kirill A. Shutemov wrote:
>>>> On Thu, Sep 17, 2015 at 09:29:27AM -0700, Guenter Roeck wrote:
>>>>> Hi,
>>>>>
>>>>> my crisv32 qemu test fails with next-20150917 as follows.
>>>>>
>>>>> NET: Registered protocol family 16
>>>>> kernel BUG at mm/slab.c:1648!
>>>>> Linux 4.3.0-rc1-next-20150917 #1 Wed Sep 16 23:56:59 PDT 2015
>>>>> Oops: 0000
>>>>>
>>>>> [ register dump follows ]
>>>>>
>>>>> See http://server.roeck-us.net:8010/builders/qemu-crisv32-next/builds/83/steps/qemubuildcommand/logs/stdio
>>>>> for a complete log.
>>>>
>>>> Is there a chance to get proper backtrace?
>>>
>>> Yes, it should be possible with CONFIG_KALLSYMS=y in the kconfig.
>>>
>>
>> Good to know. I added it to my configuration.
>>
>> Here it is:
>>
>> kernel BUG at mm/slab.c:1648!
>
> I still don't understand what's going on :(
> Could you try with this instrumentation:
>
> diff --git a/mm/slab.c b/mm/slab.c
> index ce9c6531e6f7..10035d1a06d3 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -1645,7 +1645,11 @@ static void kmem_freepages(struct kmem_cache *cachep, struct page *page)
>                  sub_zone_page_state(page_zone(page),
>                                  NR_SLAB_UNRECLAIMABLE, nr_freed);
>
> -       BUG_ON(!PageSlab(page));
> +       if (!PageSlab(page)) {
> +               dump_page(page, "page");
> +               dump_page(compound_head(page), "compound_head(page)");
> +               BUG();
> +       }
>          __ClearPageSlabPfmemalloc(page);
>          __ClearPageSlab(page);
>          page_mapcount_reset(page);
>

page:c04a5340 count:1 mapcount:1 mapping:c1f34080 index:0xc1f34060
flags: 0x80(slab)
page dumped because: page
page:c1f17a04 count:0 mapcount:1 mapping:00d13600 index:0xc0
flags: 0x0()
page dumped because: compound_head(page)

Does that help ?

Guenter

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: crisv32 runtime failure in -next due to 'page-flags: define behavior SL*B-related flags on compound pages'
  2015-09-22  1:17         ` Guenter Roeck
@ 2015-09-22 12:03           ` Kirill A. Shutemov
  2015-09-22 12:19             ` Mikael Starvik
  0 siblings, 1 reply; 19+ messages in thread
From: Kirill A. Shutemov @ 2015-09-22 12:03 UTC (permalink / raw)
  To: Guenter Roeck, Paul E. McKenney, Mikael Starvik, Jesper Nilsson,
	Hugh Dickins
  Cc: Kirill A. Shutemov, linux-next, linux-kernel, Andrew Morton,
	Minchan Kim, linux-cris-kernel

On Mon, Sep 21, 2015 at 06:17:34PM -0700, Guenter Roeck wrote:
> On 09/21/2015 08:34 AM, Kirill A. Shutemov wrote:
> >Guenter Roeck wrote:
> >>On 09/18/2015 07:53 AM, Jesper Nilsson wrote:
> >>>On Fri, Sep 18, 2015 at 05:25:07PM +0300, Kirill A. Shutemov wrote:
> >>>>On Thu, Sep 17, 2015 at 09:29:27AM -0700, Guenter Roeck wrote:
> >>>>>Hi,
> >>>>>
> >>>>>my crisv32 qemu test fails with next-20150917 as follows.
> >>>>>
> >>>>>NET: Registered protocol family 16
> >>>>>kernel BUG at mm/slab.c:1648!
> >>>>>Linux 4.3.0-rc1-next-20150917 #1 Wed Sep 16 23:56:59 PDT 2015
> >>>>>Oops: 0000
> >>>>>
> >>>>>[ register dump follows ]
> >>>>>
> >>>>>See http://server.roeck-us.net:8010/builders/qemu-crisv32-next/builds/83/steps/qemubuildcommand/logs/stdio
> >>>>>for a complete log.
> >>>>
> >>>>Is there a chance to get proper backtrace?
> >>>
> >>>Yes, it should be possible with CONFIG_KALLSYMS=y in the kconfig.
> >>>
> >>
> >>Good to know. I added it to my configuration.
> >>
> >>Here it is:
> >>
> >>kernel BUG at mm/slab.c:1648!
> >
> >I still don't understand what's going on :(
> >Could you try with this instrumentation:
> >
> >diff --git a/mm/slab.c b/mm/slab.c
> >index ce9c6531e6f7..10035d1a06d3 100644
> >--- a/mm/slab.c
> >+++ b/mm/slab.c
> >@@ -1645,7 +1645,11 @@ static void kmem_freepages(struct kmem_cache *cachep, struct page *page)
> >                 sub_zone_page_state(page_zone(page),
> >                                 NR_SLAB_UNRECLAIMABLE, nr_freed);
> >
> >-       BUG_ON(!PageSlab(page));
> >+       if (!PageSlab(page)) {
> >+               dump_page(page, "page");
> >+               dump_page(compound_head(page), "compound_head(page)");
> >+               BUG();
> >+       }
> >         __ClearPageSlabPfmemalloc(page);
> >         __ClearPageSlab(page);
> >         page_mapcount_reset(page);
> >
> 
> page:c04a5340 count:1 mapcount:1 mapping:c1f34080 index:0xc1f34060
> flags: 0x80(slab)
> page dumped because: page
> page:c1f17a04 count:0 mapcount:1 mapping:00d13600 index:0xc0
> flags: 0x0()
> page dumped because: compound_head(page)
> 
> Does that help ?

Kinda. It's false positive PageTail() due low bit set in
page->rcu_head.next.

It happens (at least) due broken alignment of 'rcu' field within
task_struct -- offsetof(struct task_struct, rcu): 773.

That's looks veery broken. I would guess compiler does something horribly
wrong. I hope it's not an ABI issue. :-/

Mikael? Jesper?

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: crisv32 runtime failure in -next due to 'page-flags: define behavior SL*B-related flags on compound pages'
  2015-09-22 12:03           ` Kirill A. Shutemov
@ 2015-09-22 12:19             ` Mikael Starvik
  2015-09-22 12:50               ` Hans-Peter Nilsson
  0 siblings, 1 reply; 19+ messages in thread
From: Mikael Starvik @ 2015-09-22 12:19 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Guenter Roeck, Paul E. McKenney, Mikael Starvik, Jesper Nilsson,
	Hugh Dickins, Kirill A. Shutemov, linux-next, linux-kernel,
	Andrew Morton, Minchan Kim, linux-cris-kernel,
	Hans-Peter Nilsson

For cris it is completely valid to do that. It has been an issue before. If you for some reason really require dword alignment there should be an align in the struct.

CC:ing the compiler guy for further comments.

Best regard
/Mikael



> 22 sep 2015 kl. 14:03 skrev Kirill A. Shutemov <kirill@shutemov.name>:
> 
>> On Mon, Sep 21, 2015 at 06:17:34PM -0700, Guenter Roeck wrote:
>>> On 09/21/2015 08:34 AM, Kirill A. Shutemov wrote:
>>> Guenter Roeck wrote:
>>>>> On 09/18/2015 07:53 AM, Jesper Nilsson wrote:
>>>>>> On Fri, Sep 18, 2015 at 05:25:07PM +0300, Kirill A. Shutemov wrote:
>>>>>>> On Thu, Sep 17, 2015 at 09:29:27AM -0700, Guenter Roeck wrote:
>>>>>>> Hi,
>>>>>>> 
>>>>>>> my crisv32 qemu test fails with next-20150917 as follows.
>>>>>>> 
>>>>>>> NET: Registered protocol family 16
>>>>>>> kernel BUG at mm/slab.c:1648!
>>>>>>> Linux 4.3.0-rc1-next-20150917 #1 Wed Sep 16 23:56:59 PDT 2015
>>>>>>> Oops: 0000
>>>>>>> 
>>>>>>> [ register dump follows ]
>>>>>>> 
>>>>>>> See http://server.roeck-us.net:8010/builders/qemu-crisv32-next/builds/83/steps/qemubuildcommand/logs/stdio
>>>>>>> for a complete log.
>>>>>> 
>>>>>> Is there a chance to get proper backtrace?
>>>>> 
>>>>> Yes, it should be possible with CONFIG_KALLSYMS=y in the kconfig.
>>>> 
>>>> Good to know. I added it to my configuration.
>>>> 
>>>> Here it is:
>>>> 
>>>> kernel BUG at mm/slab.c:1648!
>>> 
>>> I still don't understand what's going on :(
>>> Could you try with this instrumentation:
>>> 
>>> diff --git a/mm/slab.c b/mm/slab.c
>>> index ce9c6531e6f7..10035d1a06d3 100644
>>> --- a/mm/slab.c
>>> +++ b/mm/slab.c
>>> @@ -1645,7 +1645,11 @@ static void kmem_freepages(struct kmem_cache *cachep, struct page *page)
>>>                sub_zone_page_state(page_zone(page),
>>>                                NR_SLAB_UNRECLAIMABLE, nr_freed);
>>> 
>>> -       BUG_ON(!PageSlab(page));
>>> +       if (!PageSlab(page)) {
>>> +               dump_page(page, "page");
>>> +               dump_page(compound_head(page), "compound_head(page)");
>>> +               BUG();
>>> +       }
>>>        __ClearPageSlabPfmemalloc(page);
>>>        __ClearPageSlab(page);
>>>        page_mapcount_reset(page);
>> 
>> page:c04a5340 count:1 mapcount:1 mapping:c1f34080 index:0xc1f34060
>> flags: 0x80(slab)
>> page dumped because: page
>> page:c1f17a04 count:0 mapcount:1 mapping:00d13600 index:0xc0
>> flags: 0x0()
>> page dumped because: compound_head(page)
>> 
>> Does that help ?
> 
> Kinda. It's false positive PageTail() due low bit set in
> page->rcu_head.next.
> 
> It happens (at least) due broken alignment of 'rcu' field within
> task_struct -- offsetof(struct task_struct, rcu): 773.
> 
> That's looks veery broken. I would guess compiler does something horribly
> wrong. I hope it's not an ABI issue. :-/
> 
> Mikael? Jesper?
> 
> -- 
> Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: crisv32 runtime failure in -next due to 'page-flags: define behavior SL*B-related flags on compound pages'
  2015-09-22 12:19             ` Mikael Starvik
@ 2015-09-22 12:50               ` Hans-Peter Nilsson
  2015-09-22 13:27                 ` Kirill A. Shutemov
  0 siblings, 1 reply; 19+ messages in thread
From: Hans-Peter Nilsson @ 2015-09-22 12:50 UTC (permalink / raw)
  To: kirill; +Cc: starvik, linux, paulmck

> From: Mikael Starvik <mikael.starvik@axis.com>
> Date: Tue, 22 Sep 2015 14:19:38 +0200

> For cris it is completely valid to do that.

Correct, just as it's completely valid for any system to specify
an ABI that says that structures are laid out "packed" by
default.

> It has been an
> issue before. If you for some reason really require dword
> alignment there should be an align in the struct.

Yep.

> CC:ing the compiler guy for further comments.

I have no new information.

> > 22 sep 2015 kl. 14:03 skrev Kirill A. Shutemov <kirill@shutemov.name>:

> > Kinda. It's false positive PageTail() due low bit set in
> > page->rcu_head.next.
> > 
> > It happens (at least) due broken alignment of 'rcu' field within
> > task_struct -- offsetof(struct task_struct, rcu): 773.
> > 
> > That's looks veery broken. I would guess compiler does something horribly
> > wrong. I hope it's not an ABI issue. :-/

It is an ABI issue, but I'm sure you can cope.  If you need to
imply something you have to provide something.  If not, I'd say
the term "horrible" would fit hackish assumptions of the failing
code (and related code that works by happenstance).

That element (the struct) needs *explicit* padding or alignment
to the required multiplicity of bytes for anyone to portably be
able to imply something other than "byte alignment" for the
layout of it, as elements of an array, across systems.  Use
dummy elements or a compiler construct like __attribute__
((__aligned__ (...))) per kernel policy or taste.  I'd recommend
specifying the alignment, so TRT will happen for it when it in
turn is an element of an otherwise unpadded struct.

(I assume all applicable allocators provide "natural" alignment
of, say, sizeof (long)) or that'll be a separate issue.)

brgds, H-P

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: crisv32 runtime failure in -next due to 'page-flags: define behavior SL*B-related flags on compound pages'
  2015-09-22 12:50               ` Hans-Peter Nilsson
@ 2015-09-22 13:27                 ` Kirill A. Shutemov
  2015-09-22 13:57                   ` Hans-Peter Nilsson
  0 siblings, 1 reply; 19+ messages in thread
From: Kirill A. Shutemov @ 2015-09-22 13:27 UTC (permalink / raw)
  To: paulmck, Hans-Peter Nilsson
  Cc: starvik, linux, jespern, hughd, kirill.shutemov, linux-next,
	linux-kernel, akpm, minchan, linux-cris-kernel

On Tue, Sep 22, 2015 at 02:50:19PM +0200, Hans-Peter Nilsson wrote:
> > From: Mikael Starvik <mikael.starvik@axis.com>
> > Date: Tue, 22 Sep 2015 14:19:38 +0200
> 
> > For cris it is completely valid to do that.
> 
> Correct, just as it's completely valid for any system to specify
> an ABI that says that structures are laid out "packed" by
> default.
> 
> > It has been an
> > issue before. If you for some reason really require dword
> > alignment there should be an align in the struct.
> 
> Yep.
> 
> > CC:ing the compiler guy for further comments.
> 
> I have no new information.
> 
> > > 22 sep 2015 kl. 14:03 skrev Kirill A. Shutemov <kirill@shutemov.name>:
> 
> > > Kinda. It's false positive PageTail() due low bit set in
> > > page->rcu_head.next.
> > > 
> > > It happens (at least) due broken alignment of 'rcu' field within
> > > task_struct -- offsetof(struct task_struct, rcu): 773.
> > > 
> > > That's looks veery broken. I would guess compiler does something horribly
> > > wrong. I hope it's not an ABI issue. :-/
> 
> It is an ABI issue, but I'm sure you can cope.  If you need to
> imply something you have to provide something.  If not, I'd say
> the term "horrible" would fit hackish assumptions of the failing
> code (and related code that works by happenstance).
> 
> That element (the struct) needs *explicit* padding or alignment
> to the required multiplicity of bytes for anyone to portably be
> able to imply something other than "byte alignment" for the
> layout of it, as elements of an array, across systems.  Use
> dummy elements or a compiler construct like __attribute__
> ((__aligned__ (...))) per kernel policy or taste.  I'd recommend
> specifying the alignment, so TRT will happen for it when it in
> turn is an element of an otherwise unpadded struct.
> 
> (I assume all applicable allocators provide "natural" alignment
> of, say, sizeof (long)) or that'll be a separate issue.)

I see. I would say it's very risky ABI choice, but okay.

What was the reason behind? I don't understand it.
Is it free to make misaligned memory access on CRIS?
What about atomicity? How it works for misaligned accesses?

The patch below fixes issue for me.

I'm not sure if we want to ask for alignment to sizeof(long).
aligned(2) works too.

Paul, any thoughts?

diff --git a/include/linux/types.h b/include/linux/types.h
index c314989d9158..ead18bdcaf70 100644
--- a/include/linux/types.h
+++ b/include/linux/types.h
@@ -209,7 +209,7 @@ struct ustat {
 struct callback_head {
        struct callback_head *next;
        void (*func)(struct callback_head *head);
-};
+}  __attribute__((aligned(sizeof(long))));
 #define rcu_head callback_head
 
 typedef void (*rcu_callback_t)(struct rcu_head *head);
-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: crisv32 runtime failure in -next due to 'page-flags: define behavior SL*B-related flags on compound pages'
  2015-09-22 13:27                 ` Kirill A. Shutemov
@ 2015-09-22 13:57                   ` Hans-Peter Nilsson
  2015-09-22 15:18                     ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Hans-Peter Nilsson @ 2015-09-22 13:57 UTC (permalink / raw)
  To: kirill
  Cc: paulmck, starvik, linux, jespern, hughd, kirill.shutemov,
	linux-next, linux-kernel, akpm, minchan, linux-cris-kernel

> From: "Kirill A. Shutemov" <kirill@shutemov.name>
> Date: Tue, 22 Sep 2015 15:27:51 +0200

> On Tue, Sep 22, 2015 at 02:50:19PM +0200, Hans-Peter Nilsson wrote:
> > That element (the struct) needs *explicit* padding or alignment
> > to the required multiplicity of bytes for anyone to portably be
> > able to imply something other than "byte alignment" for the
> > layout of it, as elements of an array, across systems.  Use
> > dummy elements or a compiler construct like __attribute__
> > ((__aligned__ (...))) per kernel policy or taste.  I'd recommend
> > specifying the alignment, so TRT will happen for it when it in
> > turn is an element of an otherwise unpadded struct.

> I see. I would say it's very risky ABI choice, but okay.
> 
> What was the reason behind? I don't understand it.

It was made some 20+ years ago, some of the reason being (here's
the irony) compatibility with a toolchain for another
architecture, popular at the time, now forgotten.
Another reason (IIRC) was that it saves space. :)

> Is it free to make misaligned memory access on CRIS?

Within a cache-line for CRIS v32, it's free.

> What about atomicity? How it works for misaligned accesses?

Good spotting.  No system with page layouts fixed at size
multiples (all are, it'd be crazy to split pages as low as byte
boundaries) can support naturally-misaligned atomic accesses.

Therefore elements with access expecting atomicity, have be
decorated with alignment-inducing attributes, for portability,
e.g. to work for CRIS.  In userspace, I can at times get away
with calling a special function with a process-wide lock, in
those cases where the upstream project is unlikely to timely
understand e.g. that a naked "int" is not naturally aligned.

> The patch below fixes issue for me.

Thanks.

> I'm not sure if we want to ask for alignment to sizeof(long).
> aligned(2) works too.

I guess you hit the right spot, but I'd think people would be
more comfortable with aligning to sizeof (void *).

brgds, H-P

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: crisv32 runtime failure in -next due to 'page-flags: define behavior SL*B-related flags on compound pages'
  2015-09-22 13:57                   ` Hans-Peter Nilsson
@ 2015-09-22 15:18                     ` Paul E. McKenney
  2015-09-22 15:31                       ` Kirill A. Shutemov
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2015-09-22 15:18 UTC (permalink / raw)
  To: Hans-Peter Nilsson
  Cc: kirill, starvik, linux, jespern, hughd, kirill.shutemov,
	linux-next, linux-kernel, akpm, minchan, linux-cris-kernel

On Tue, Sep 22, 2015 at 03:57:06PM +0200, Hans-Peter Nilsson wrote:
> > From: "Kirill A. Shutemov" <kirill@shutemov.name>
> > Date: Tue, 22 Sep 2015 15:27:51 +0200
> 
> > On Tue, Sep 22, 2015 at 02:50:19PM +0200, Hans-Peter Nilsson wrote:
> > > That element (the struct) needs *explicit* padding or alignment
> > > to the required multiplicity of bytes for anyone to portably be
> > > able to imply something other than "byte alignment" for the
> > > layout of it, as elements of an array, across systems.  Use
> > > dummy elements or a compiler construct like __attribute__
> > > ((__aligned__ (...))) per kernel policy or taste.  I'd recommend
> > > specifying the alignment, so TRT will happen for it when it in
> > > turn is an element of an otherwise unpadded struct.
> 
> > I see. I would say it's very risky ABI choice, but okay.
> > 
> > What was the reason behind? I don't understand it.
> 
> It was made some 20+ years ago, some of the reason being (here's
> the irony) compatibility with a toolchain for another
> architecture, popular at the time, now forgotten.
> Another reason (IIRC) was that it saves space. :)
> 
> > Is it free to make misaligned memory access on CRIS?
> 
> Within a cache-line for CRIS v32, it's free.
> 
> > What about atomicity? How it works for misaligned accesses?
> 
> Good spotting.  No system with page layouts fixed at size
> multiples (all are, it'd be crazy to split pages as low as byte
> boundaries) can support naturally-misaligned atomic accesses.
> 
> Therefore elements with access expecting atomicity, have be
> decorated with alignment-inducing attributes, for portability,
> e.g. to work for CRIS.  In userspace, I can at times get away
> with calling a special function with a process-wide lock, in
> those cases where the upstream project is unlikely to timely
> understand e.g. that a naked "int" is not naturally aligned.
> 
> > The patch below fixes issue for me.
> 
> Thanks.
> 
> > I'm not sure if we want to ask for alignment to sizeof(long).
> > aligned(2) works too.
> 
> I guess you hit the right spot, but I'd think people would be
> more comfortable with aligning to sizeof (void *).

I would indeed prefer sizeof(void *).

						Thanx, Paul

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: crisv32 runtime failure in -next due to 'page-flags: define behavior SL*B-related flags on compound pages'
  2015-09-22 15:18                     ` Paul E. McKenney
@ 2015-09-22 15:31                       ` Kirill A. Shutemov
  2015-09-22 15:40                         ` Paul E. McKenney
  2015-09-22 16:16                         ` Hans-Peter Nilsson
  0 siblings, 2 replies; 19+ messages in thread
From: Kirill A. Shutemov @ 2015-09-22 15:31 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Hans-Peter Nilsson, starvik, linux, jespern, hughd,
	kirill.shutemov, linux-next, linux-kernel, akpm, minchan,
	linux-cris-kernel

On Tue, Sep 22, 2015 at 08:18:35AM -0700, Paul E. McKenney wrote:
> On Tue, Sep 22, 2015 at 03:57:06PM +0200, Hans-Peter Nilsson wrote:
> > I guess you hit the right spot, but I'd think people would be
> > more comfortable with aligning to sizeof (void *).
> 
> I would indeed prefer sizeof(void *).

Do you prefer to have the attribute set for whole structure or for ->next?
I think attribute on ->next is more appropriate from documentation POV.

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: crisv32 runtime failure in -next due to 'page-flags: define behavior SL*B-related flags on compound pages'
  2015-09-22 15:31                       ` Kirill A. Shutemov
@ 2015-09-22 15:40                         ` Paul E. McKenney
  2015-09-23 10:53                           ` Kirill A. Shutemov
  2015-09-22 16:16                         ` Hans-Peter Nilsson
  1 sibling, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2015-09-22 15:40 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Hans-Peter Nilsson, starvik, linux, jespern, hughd,
	kirill.shutemov, linux-next, linux-kernel, akpm, minchan,
	linux-cris-kernel

On Tue, Sep 22, 2015 at 06:31:04PM +0300, Kirill A. Shutemov wrote:
> On Tue, Sep 22, 2015 at 08:18:35AM -0700, Paul E. McKenney wrote:
> > On Tue, Sep 22, 2015 at 03:57:06PM +0200, Hans-Peter Nilsson wrote:
> > > I guess you hit the right spot, but I'd think people would be
> > > more comfortable with aligning to sizeof (void *).
> > 
> > I would indeed prefer sizeof(void *).
> 
> Do you prefer to have the attribute set for whole structure or for ->next?
> I think attribute on ->next is more appropriate from documentation POV.

Agreed, I do prefer ->next.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: crisv32 runtime failure in -next due to 'page-flags: define behavior SL*B-related flags on compound pages'
  2015-09-22 15:31                       ` Kirill A. Shutemov
  2015-09-22 15:40                         ` Paul E. McKenney
@ 2015-09-22 16:16                         ` Hans-Peter Nilsson
  2015-09-22 16:39                           ` Paul E. McKenney
  1 sibling, 1 reply; 19+ messages in thread
From: Hans-Peter Nilsson @ 2015-09-22 16:16 UTC (permalink / raw)
  To: kirill
  Cc: paulmck, starvik, linux, jespern, hughd, kirill.shutemov,
	linux-next, linux-kernel, akpm, minchan, linux-cris-kernel

> From: "Kirill A. Shutemov" <kirill@shutemov.name>
> Date: Tue, 22 Sep 2015 17:31:04 +0200

> On Tue, Sep 22, 2015 at 08:18:35AM -0700, Paul E. McKenney wrote:
> > On Tue, Sep 22, 2015 at 03:57:06PM +0200, Hans-Peter Nilsson wrote:
> > > I guess you hit the right spot, but I'd think people would be
> > > more comfortable with aligning to sizeof (void *).
> > 
> > I would indeed prefer sizeof(void *).
> 
> Do you prefer to have the attribute set for whole structure or for ->next?
> I think attribute on ->next is more appropriate from documentation POV.

Speaking of the documentation POV, I'd recommend adding an
explanatory comment.  Here's hoping this was obvious! ;)

brgds, H-P

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: crisv32 runtime failure in -next due to 'page-flags: define behavior SL*B-related flags on compound pages'
  2015-09-22 16:16                         ` Hans-Peter Nilsson
@ 2015-09-22 16:39                           ` Paul E. McKenney
  0 siblings, 0 replies; 19+ messages in thread
From: Paul E. McKenney @ 2015-09-22 16:39 UTC (permalink / raw)
  To: Hans-Peter Nilsson
  Cc: kirill, starvik, linux, jespern, hughd, kirill.shutemov,
	linux-next, linux-kernel, akpm, minchan, linux-cris-kernel

On Tue, Sep 22, 2015 at 06:16:18PM +0200, Hans-Peter Nilsson wrote:
> > From: "Kirill A. Shutemov" <kirill@shutemov.name>
> > Date: Tue, 22 Sep 2015 17:31:04 +0200
> 
> > On Tue, Sep 22, 2015 at 08:18:35AM -0700, Paul E. McKenney wrote:
> > > On Tue, Sep 22, 2015 at 03:57:06PM +0200, Hans-Peter Nilsson wrote:
> > > > I guess you hit the right spot, but I'd think people would be
> > > > more comfortable with aligning to sizeof (void *).
> > > 
> > > I would indeed prefer sizeof(void *).
> > 
> > Do you prefer to have the attribute set for whole structure or for ->next?
> > I think attribute on ->next is more appropriate from documentation POV.
> 
> Speaking of the documentation POV, I'd recommend adding an
> explanatory comment.  Here's hoping this was obvious! ;)

What Hans-Peter said!  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: crisv32 runtime failure in -next due to 'page-flags: define behavior SL*B-related flags on compound pages'
  2015-09-22 15:40                         ` Paul E. McKenney
@ 2015-09-23 10:53                           ` Kirill A. Shutemov
  2015-09-23 15:02                             ` Guenter Roeck
  0 siblings, 1 reply; 19+ messages in thread
From: Kirill A. Shutemov @ 2015-09-23 10:53 UTC (permalink / raw)
  To: Paul E. McKenney, akpm
  Cc: Hans-Peter Nilsson, starvik, linux, jespern, hughd,
	kirill.shutemov, linux-next, linux-kernel, minchan,
	linux-cris-kernel

On Tue, Sep 22, 2015 at 08:40:14AM -0700, Paul E. McKenney wrote:
> On Tue, Sep 22, 2015 at 06:31:04PM +0300, Kirill A. Shutemov wrote:
> > On Tue, Sep 22, 2015 at 08:18:35AM -0700, Paul E. McKenney wrote:
> > > On Tue, Sep 22, 2015 at 03:57:06PM +0200, Hans-Peter Nilsson wrote:
> > > > I guess you hit the right spot, but I'd think people would be
> > > > more comfortable with aligning to sizeof (void *).
> > > 
> > > I would indeed prefer sizeof(void *).
> > 
> > Do you prefer to have the attribute set for whole structure or for ->next?
> > I think attribute on ->next is more appropriate from documentation POV.

I retract this claim: we have requirement about pointee alignment, not
pointer alignment.

>From edbab9e89f5e4ad42e63d93ab05519e6a5f4d552 Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Date: Wed, 23 Sep 2015 13:39:28 +0300
Subject: [PATCH] rcu: force alignment on struct callback_head/rcu_head

This patch makes struct callback_head aligned to size of pointer. On
most architectures it happens naturally due ABI requirements, but some
architectures (like CRIS) have weird ABI and we need to ask it
explicitly.

The alignment is required to guarantee that bits 0 and 1 of @next will
be clear under normal conditions -- as long as we use call_rcu(),
call_rcu_bh(), call_rcu_sched(), or call_srcu() to queue callback.

This guarantee is important for few reasons:
 - future call_rcu_lazy() will make use of lower bits in the pointer;
 - the structure shares storage spacer in struct page with @compound_head,
   which encode PageTail() in bit 0. The guarantee is needed to avoid
   false-positive PageTail().

False postive PageTail() caused crash on crisv32[1]. It happend due
misaligned task_struct->rcu, which was byte-aligned.

[1] http://lkml.kernel.org/r/55FAEA67.9000102@roeck-us.net

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
---
 include/linux/types.h | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/include/linux/types.h b/include/linux/types.h
index c314989d9158..70d8500bddf1 100644
--- a/include/linux/types.h
+++ b/include/linux/types.h
@@ -205,11 +205,25 @@ struct ustat {
  * struct callback_head - callback structure for use with RCU and task_work
  * @next: next update requests in a list
  * @func: actual update function to call after the grace period.
+ *
+ * The struct is aligned to size of pointer. On most architectures it happens
+ * naturally due ABI requirements, but some architectures (like CRIS) have
+ * weird ABI and we need to ask it explicitly.
+ *
+ * The alignment is required to guarantee that bits 0 and 1 of @next will be
+ * clear under normal conditions -- as long as we use call_rcu(),
+ * call_rcu_bh(), call_rcu_sched(), or call_srcu() to queue callback.
+ *
+ * This guarantee is important for few reasons:
+ *  - future call_rcu_lazy() will make use of lower bits in the pointer;
+ *  - the structure shares storage spacer in struct page with @compound_head,
+ *    which encode PageTail() in bit 0. The guarantee is needed to avoid
+ *    false-positive PageTail().
  */
 struct callback_head {
 	struct callback_head *next;
 	void (*func)(struct callback_head *head);
-};
+} __attribute__((aligned(sizeof(void *))));
 #define rcu_head callback_head
 
 typedef void (*rcu_callback_t)(struct rcu_head *head);
-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: crisv32 runtime failure in -next due to 'page-flags: define behavior SL*B-related flags on compound pages'
  2015-09-23 10:53                           ` Kirill A. Shutemov
@ 2015-09-23 15:02                             ` Guenter Roeck
  2015-09-24  4:45                               ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Guenter Roeck @ 2015-09-23 15:02 UTC (permalink / raw)
  To: Kirill A. Shutemov, Paul E. McKenney, akpm
  Cc: Hans-Peter Nilsson, starvik, jespern, hughd, kirill.shutemov,
	linux-next, linux-kernel, minchan, linux-cris-kernel

On 09/23/2015 03:53 AM, Kirill A. Shutemov wrote:
> On Tue, Sep 22, 2015 at 08:40:14AM -0700, Paul E. McKenney wrote:
>> On Tue, Sep 22, 2015 at 06:31:04PM +0300, Kirill A. Shutemov wrote:
>>> On Tue, Sep 22, 2015 at 08:18:35AM -0700, Paul E. McKenney wrote:
>>>> On Tue, Sep 22, 2015 at 03:57:06PM +0200, Hans-Peter Nilsson wrote:
>>>>> I guess you hit the right spot, but I'd think people would be
>>>>> more comfortable with aligning to sizeof (void *).
>>>>
>>>> I would indeed prefer sizeof(void *).
>>>
>>> Do you prefer to have the attribute set for whole structure or for ->next?
>>> I think attribute on ->next is more appropriate from documentation POV.
>
> I retract this claim: we have requirement about pointee alignment, not
> pointer alignment.
>
>>From edbab9e89f5e4ad42e63d93ab05519e6a5f4d552 Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Date: Wed, 23 Sep 2015 13:39:28 +0300
> Subject: [PATCH] rcu: force alignment on struct callback_head/rcu_head
>
> This patch makes struct callback_head aligned to size of pointer. On
> most architectures it happens naturally due ABI requirements, but some
> architectures (like CRIS) have weird ABI and we need to ask it
> explicitly.
>
> The alignment is required to guarantee that bits 0 and 1 of @next will
> be clear under normal conditions -- as long as we use call_rcu(),
> call_rcu_bh(), call_rcu_sched(), or call_srcu() to queue callback.
>
> This guarantee is important for few reasons:
>   - future call_rcu_lazy() will make use of lower bits in the pointer;
>   - the structure shares storage spacer in struct page with @compound_head,
>     which encode PageTail() in bit 0. The guarantee is needed to avoid
>     false-positive PageTail().
>
> False postive PageTail() caused crash on crisv32[1]. It happend due
> misaligned task_struct->rcu, which was byte-aligned.
>
> [1] http://lkml.kernel.org/r/55FAEA67.9000102@roeck-us.net
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Reported-by: Guenter Roeck <linux@roeck-us.net>

Tested-by: Guenter Roeck <linux@roeck-us.net>

Hope the patch won't get lost since it was attached to an e-mail.
Can it be added to the branch introducing the problem ?

Thanks,
Guenter

> ---
>   include/linux/types.h | 16 +++++++++++++++-
>   1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/types.h b/include/linux/types.h
> index c314989d9158..70d8500bddf1 100644
> --- a/include/linux/types.h
> +++ b/include/linux/types.h
> @@ -205,11 +205,25 @@ struct ustat {
>    * struct callback_head - callback structure for use with RCU and task_work
>    * @next: next update requests in a list
>    * @func: actual update function to call after the grace period.
> + *
> + * The struct is aligned to size of pointer. On most architectures it happens
> + * naturally due ABI requirements, but some architectures (like CRIS) have
> + * weird ABI and we need to ask it explicitly.
> + *
> + * The alignment is required to guarantee that bits 0 and 1 of @next will be
> + * clear under normal conditions -- as long as we use call_rcu(),
> + * call_rcu_bh(), call_rcu_sched(), or call_srcu() to queue callback.
> + *
> + * This guarantee is important for few reasons:
> + *  - future call_rcu_lazy() will make use of lower bits in the pointer;
> + *  - the structure shares storage spacer in struct page with @compound_head,
> + *    which encode PageTail() in bit 0. The guarantee is needed to avoid
> + *    false-positive PageTail().
>    */
>   struct callback_head {
>   	struct callback_head *next;
>   	void (*func)(struct callback_head *head);
> -};
> +} __attribute__((aligned(sizeof(void *))));
>   #define rcu_head callback_head
>
>   typedef void (*rcu_callback_t)(struct rcu_head *head);
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: crisv32 runtime failure in -next due to 'page-flags: define behavior SL*B-related flags on compound pages'
  2015-09-23 15:02                             ` Guenter Roeck
@ 2015-09-24  4:45                               ` Paul E. McKenney
  0 siblings, 0 replies; 19+ messages in thread
From: Paul E. McKenney @ 2015-09-24  4:45 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Kirill A. Shutemov, akpm, Hans-Peter Nilsson, starvik, jespern,
	hughd, kirill.shutemov, linux-next, linux-kernel, minchan,
	linux-cris-kernel

On Wed, Sep 23, 2015 at 08:02:44AM -0700, Guenter Roeck wrote:
> On 09/23/2015 03:53 AM, Kirill A. Shutemov wrote:
> >On Tue, Sep 22, 2015 at 08:40:14AM -0700, Paul E. McKenney wrote:
> >>On Tue, Sep 22, 2015 at 06:31:04PM +0300, Kirill A. Shutemov wrote:
> >>>On Tue, Sep 22, 2015 at 08:18:35AM -0700, Paul E. McKenney wrote:
> >>>>On Tue, Sep 22, 2015 at 03:57:06PM +0200, Hans-Peter Nilsson wrote:
> >>>>>I guess you hit the right spot, but I'd think people would be
> >>>>>more comfortable with aligning to sizeof (void *).
> >>>>
> >>>>I would indeed prefer sizeof(void *).
> >>>
> >>>Do you prefer to have the attribute set for whole structure or for ->next?
> >>>I think attribute on ->next is more appropriate from documentation POV.
> >
> >I retract this claim: we have requirement about pointee alignment, not
> >pointer alignment.
> >
> >>From edbab9e89f5e4ad42e63d93ab05519e6a5f4d552 Mon Sep 17 00:00:00 2001
> >From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> >Date: Wed, 23 Sep 2015 13:39:28 +0300
> >Subject: [PATCH] rcu: force alignment on struct callback_head/rcu_head
> >
> >This patch makes struct callback_head aligned to size of pointer. On
> >most architectures it happens naturally due ABI requirements, but some
> >architectures (like CRIS) have weird ABI and we need to ask it
> >explicitly.
> >
> >The alignment is required to guarantee that bits 0 and 1 of @next will
> >be clear under normal conditions -- as long as we use call_rcu(),
> >call_rcu_bh(), call_rcu_sched(), or call_srcu() to queue callback.
> >
> >This guarantee is important for few reasons:
> >  - future call_rcu_lazy() will make use of lower bits in the pointer;
> >  - the structure shares storage spacer in struct page with @compound_head,
> >    which encode PageTail() in bit 0. The guarantee is needed to avoid
> >    false-positive PageTail().
> >
> >False postive PageTail() caused crash on crisv32[1]. It happend due
> >misaligned task_struct->rcu, which was byte-aligned.
> >
> >[1] http://lkml.kernel.org/r/55FAEA67.9000102@roeck-us.net
> >
> >Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> >Reported-by: Guenter Roeck <linux@roeck-us.net>
> 
> Tested-by: Guenter Roeck <linux@roeck-us.net>
> 
> Hope the patch won't get lost since it was attached to an e-mail.
> Can it be added to the branch introducing the problem ?

Andrew Morton picked it up.  No idea where the problem was introduced.

						Thanx, Paul

> Thanks,
> Guenter
> 
> >---
> >  include/linux/types.h | 16 +++++++++++++++-
> >  1 file changed, 15 insertions(+), 1 deletion(-)
> >
> >diff --git a/include/linux/types.h b/include/linux/types.h
> >index c314989d9158..70d8500bddf1 100644
> >--- a/include/linux/types.h
> >+++ b/include/linux/types.h
> >@@ -205,11 +205,25 @@ struct ustat {
> >   * struct callback_head - callback structure for use with RCU and task_work
> >   * @next: next update requests in a list
> >   * @func: actual update function to call after the grace period.
> >+ *
> >+ * The struct is aligned to size of pointer. On most architectures it happens
> >+ * naturally due ABI requirements, but some architectures (like CRIS) have
> >+ * weird ABI and we need to ask it explicitly.
> >+ *
> >+ * The alignment is required to guarantee that bits 0 and 1 of @next will be
> >+ * clear under normal conditions -- as long as we use call_rcu(),
> >+ * call_rcu_bh(), call_rcu_sched(), or call_srcu() to queue callback.
> >+ *
> >+ * This guarantee is important for few reasons:
> >+ *  - future call_rcu_lazy() will make use of lower bits in the pointer;
> >+ *  - the structure shares storage spacer in struct page with @compound_head,
> >+ *    which encode PageTail() in bit 0. The guarantee is needed to avoid
> >+ *    false-positive PageTail().
> >   */
> >  struct callback_head {
> >  	struct callback_head *next;
> >  	void (*func)(struct callback_head *head);
> >-};
> >+} __attribute__((aligned(sizeof(void *))));
> >  #define rcu_head callback_head
> >
> >  typedef void (*rcu_callback_t)(struct rcu_head *head);
> >
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, back to index

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-17 16:29 crisv32 runtime failure in -next due to 'page-flags: define behavior SL*B-related flags on compound pages' Guenter Roeck
2015-09-18 14:25 ` Kirill A. Shutemov
2015-09-18 14:53   ` Jesper Nilsson
2015-09-18 15:13     ` Guenter Roeck
2015-09-21 15:34       ` Kirill A. Shutemov
2015-09-22  1:17         ` Guenter Roeck
2015-09-22 12:03           ` Kirill A. Shutemov
2015-09-22 12:19             ` Mikael Starvik
2015-09-22 12:50               ` Hans-Peter Nilsson
2015-09-22 13:27                 ` Kirill A. Shutemov
2015-09-22 13:57                   ` Hans-Peter Nilsson
2015-09-22 15:18                     ` Paul E. McKenney
2015-09-22 15:31                       ` Kirill A. Shutemov
2015-09-22 15:40                         ` Paul E. McKenney
2015-09-23 10:53                           ` Kirill A. Shutemov
2015-09-23 15:02                             ` Guenter Roeck
2015-09-24  4:45                               ` Paul E. McKenney
2015-09-22 16:16                         ` Hans-Peter Nilsson
2015-09-22 16:39                           ` Paul E. McKenney

Linux-Next Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-next/0 linux-next/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-next linux-next/ https://lore.kernel.org/linux-next \
		linux-next@vger.kernel.org
	public-inbox-index linux-next

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-next


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git