linux-kernel.vger.kernel.org archive mirror
* shutdown panic in mm_release (really flush_tlb_others?)
@ 2004-01-07  2:57 Martin J. Bligh
  2004-01-07  5:30 ` Andrew Morton
  0 siblings, 1 reply; 4+ messages in thread
From: Martin J. Bligh @ 2004-01-07  2:57 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-mm mailing list

And the award for longest panic I've ever seen goes to ....
<drumroll> ....

(there were several of these in sequence).
Looks like it was trying to printk an error on shutdown ...
really maybe " [<c0115242>] flush_tlb_others+0x22/0xd0"

Probably the same panic I sent out the other day in a slight
disguise ... "BUG_ON(!cpus_equal(cpumask, tmp));" in flush_tlb_others

Unable to handle kernel NULL pointer dereference at virtual address 00000014
 printing eip:
c011dce4
*pde = 00003001
*pte = 00000000
Oops: 0000 [#22]
CPU:    0
EIP:    0060:[<c011dce4>]    Not tainted VLI
EFLAGS: 00010206
EIP is at mm_release+0x34/0x78
eax: 00000000   ebx: d54ab330   ecx: 4005f948   edx: 00000000
esi: d54ab330   edi: ef2f4724   ebp: 0000000b   esp: ef2f4630
ds: 007b   es: 007b   ss: 0068
Process halt (pid: 8846, threadinfo=ef2f4000 task=d54ab330)
Stack: 00000000 c0121805 d54ab330 00000000 ef2f4000 ef2f4724 ef2f4724 0000000b 
       c010af3c 0000000b 00000000 d54ab330 c011855a c026d31e ef2f4724 00000000 
       d54ab330 00000000 c01181d8 0000000b 00000000 c0380dc0 021f2961 00000000 
       00000001 00000000 c012682f c0380dc0 00030001 c00bf620 00000001 c01d1274 
Call Trace:
 [<c0121805>] do_exit+0xc1/0x330
 [<c010af3c>] do_divide_error+0x0/0xac
 [<c011855a>] do_page_fault+0x382/0x540
 [<c01181d8>] do_page_fault+0x0/0x540
 [<c012682f>] mod_timer+0x43/0x4c
 [<c01d1274>] poke_blanked_console+0x64/0x68
 [<c01d082c>] vt_console_print+0x2bc/0x2d0
 [<c011fba4>] __call_console_drivers+0x3c/0x4c
 [<c011fc0c>] _call_console_drivers+0x58/0x60
 [<c011fcf2>] call_console_drivers+0xde/0xe8
 [<c0260c7b>] error_code+0x2f/0x38
 [<c011dce4>] mm_release+0x34/0x78
 [<c0121805>] do_exit+0xc1/0x330
 [<c010af3c>] do_divide_error+0x0/0xac
 [<c011855a>] do_page_fault+0x382/0x540
 [<c01181d8>] do_page_fault+0x0/0x540
 [<c012682f>] mod_timer+0x43/0x4c
 [<c01d1274>] poke_blanked_console+0x64/0x68
 [<c01d082c>] vt_console_print+0x2bc/0x2d0
 [<c011fba4>] __call_console_drivers+0x3c/0x4c
 [<c011fc0c>] _call_console_drivers+0x58/0x60
 [<c011fcf2>] call_console_drivers+0xde/0xe8
 [<c0260c7b>] error_code+0x2f/0x38
 [<c011dce4>] mm_release+0x34/0x78
 [<c0121805>] do_exit+0xc1/0x330
 [<c010af3c>] do_divide_error+0x0/0xac
 [<c011855a>] do_page_fault+0x382/0x540
 [<c01181d8>] do_page_fault+0x0/0x540
 [<c012682f>] mod_timer+0x43/0x4c
 [<c01d1274>] poke_blanked_console+0x64/0x68
 [<c01d082c>] vt_console_print+0x2bc/0x2d0
 [<c011fba4>] __call_console_drivers+0x3c/0x4c
 [<c011fc0c>] _call_console_drivers+0x58/0x60
 [<c011fcf2>] call_console_drivers+0xde/0xe8
 [<c0260c7b>] error_code+0x2f/0x38
 [<c011dce4>] mm_release+0x34/0x78
 [<c0121805>] do_exit+0xc1/0x330
 [<c010af3c>] do_divide_error+0x0/0xac
 [<c011855a>] do_page_fault+0x382/0x540
 [<c01181d8>] do_page_fault+0x0/0x540
 [<c012682f>] mod_timer+0x43/0x4c
 [<c01d1274>] poke_blanked_console+0x64/0x68
 [<c01d082c>] vt_console_print+0x2bc/0x2d0
 [<c011fba4>] __call_console_drivers+0x3c/0x4c
 [<c011fc0c>] _call_console_drivers+0x58/0x60
 [<c011fcf2>] call_console_drivers+0xde/0xe8
 [<c0260c7b>] error_code+0x2f/0x38
 [<c011dce4>] mm_release+0x34/0x78
 [<c0121805>] do_exit+0xc1/0x330
 [<c010af3c>] do_divide_error+0x0/0xac
 [<c011855a>] do_page_fault+0x382/0x540
 [<c01181d8>] do_page_fault+0x0/0x540
 [<c012682f>] mod_timer+0x43/0x4c
 [<c01d1274>] poke_blanked_console+0x64/0x68
 [<c01d082c>] vt_console_print+0x2bc/0x2d0
 [<c011fba4>] __call_console_drivers+0x3c/0x4c
 [<c011fc0c>] _call_console_drivers+0x58/0x60
 [<c011fcf2>] call_console_drivers+0xde/0xe8
 [<c0260c7b>] error_code+0x2f/0x38
 [<c011dce4>] mm_release+0x34/0x78
 [<c0121805>] do_exit+0xc1/0x330
 [<c010af3c>] do_divide_error+0x0/0xac
 [<c011855a>] do_page_fault+0x382/0x540
 [<c01181d8>] do_page_fault+0x0/0x540
 [<c012682f>] mod_timer+0x43/0x4c
 [<c01d1274>] poke_blanked_console+0x64/0x68
 [<c01d082c>] vt_console_print+0x2bc/0x2d0
 [<c011fba4>] __call_console_drivers+0x3c/0x4c
 [<c011fc0c>] _call_console_drivers+0x58/0x60
 [<c011fcf2>] call_console_drivers+0xde/0xe8
 [<c0260c7b>] error_code+0x2f/0x38
 [<c011dce4>] mm_release+0x34/0x78
 [<c0121805>] do_exit+0xc1/0x330
 [<c010af3c>] do_divide_error+0x0/0xac
 [<c011855a>] do_page_fault+0x382/0x540
 [<c01181d8>] do_page_fault+0x0/0x540
 [<c012682f>] mod_timer+0x43/0x4c
 [<c01d1274>] poke_blanked_console+0x64/0x68
 [<c01d082c>] vt_console_print+0x2bc/0x2d0
 [<c011fba4>] __call_console_drivers+0x3c/0x4c
 [<c011fc0c>] _call_console_drivers+0x58/0x60
 [<c011fcf2>] call_console_drivers+0xde/0xe8
 [<c0260c7b>] error_code+0x2f/0x38
 [<c011dce4>] mm_release+0x34/0x78
 [<c0121805>] do_exit+0xc1/0x330
 [<c010af3c>] do_divide_error+0x0/0xac
 [<c011855a>] do_page_fault+0x382/0x540
 [<c01181d8>] do_page_fault+0x0/0x540
 [<c012682f>] mod_timer+0x43/0x4c
 [<c01d1274>] poke_blanked_console+0x64/0x68
 [<c01d082c>] vt_console_print+0x2bc/0x2d0
 [<c011fba4>] __call_console_drivers+0x3c/0x4c
 [<c011fc0c>] _call_console_drivers+0x58/0x60
 [<c011fcf2>] call_console_drivers+0xde/0xe8
 [<c0260c7b>] error_code+0x2f/0x38
 [<c011dce4>] mm_release+0x34/0x78
 [<c0121805>] do_exit+0xc1/0x330
 [<c010af3c>] do_divide_error+0x0/0xac
 [<c011855a>] do_page_fault+0x382/0x540
 [<c01181d8>] do_page_fault+0x0/0x540
 [<c012682f>] mod_timer+0x43/0x4c
 [<c01d1274>] poke_blanked_console+0x64/0x68
 [<c01d082c>] vt_console_print+0x2bc/0x2d0
 [<c011fba4>] __call_console_drivers+0x3c/0x4c
 [<c011fc0c>] _call_console_drivers+0x58/0x60
 [<c011fcf2>] call_console_drivers+0xde/0xe8
 [<c0260c7b>] error_code+0x2f/0x38
 [<c011dce4>] mm_release+0x34/0x78
 [<c0121805>] do_exit+0xc1/0x330
 [<c010af3c>] do_divide_error+0x0/0xac
 [<c011855a>] do_page_fault+0x382/0x540
 [<c01181d8>] do_page_fault+0x0/0x540
 [<c012682f>] mod_timer+0x43/0x4c
 [<c01d1274>] poke_blanked_console+0x64/0x68
 [<c01d082c>] vt_console_print+0x2bc/0x2d0
 [<c011fba4>] __call_console_drivers+0x3c/0x4c
 [<c011fc0c>] _call_console_drivers+0x58/0x60
 [<c011fcf2>] call_console_drivers+0xde/0xe8
 [<c0260c7b>] error_code+0x2f/0x38
 [<c011dce4>] mm_release+0x34/0x78
 [<c0121805>] do_exit+0xc1/0x330
 [<c010af3c>] do_divide_error+0x0/0xac
 [<c011855a>] do_page_fault+0x382/0x540
 [<c01181d8>] do_page_fault+0x0/0x540
 [<c012682f>] mod_timer+0x43/0x4c
 [<c01d1274>] poke_blanked_console+0x64/0x68
 [<c01d082c>] vt_console_print+0x2bc/0x2d0
 [<c011fba4>] __call_console_drivers+0x3c/0x4c
 [<c011fc0c>] _call_console_drivers+0x58/0x60
 [<c011fcf2>] call_console_drivers+0xde/0xe8
 [<c0260c7b>] error_code+0x2f/0x38
 [<c011dce4>] mm_release+0x34/0x78
 [<c0121805>] do_exit+0xc1/0x330
 [<c010af3c>] do_divide_error+0x0/0xac
 [<c011855a>] do_page_fault+0x382/0x540
 [<c01181d8>] do_page_fault+0x0/0x540
 [<c012682f>] mod_timer+0x43/0x4c
 [<c01d1274>] poke_blanked_console+0x64/0x68
 [<c01d082c>] vt_console_print+0x2bc/0x2d0
 [<c011fba4>] __call_console_drivers+0x3c/0x4c
 [<c011fc0c>] _call_console_drivers+0x58/0x60
 [<c011fcf2>] call_console_drivers+0xde/0xe8
 [<c0260c7b>] error_code+0x2f/0x38
 [<c011dce4>] mm_release+0x34/0x78
 [<c0121805>] do_exit+0xc1/0x330
 [<c010af3c>] do_divide_error+0x0/0xac
 [<c011855a>] do_page_fault+0x382/0x540
 [<c01181d8>] do_page_fault+0x0/0x540
 [<c012682f>] mod_timer+0x43/0x4c
 [<c01d1274>] poke_blanked_console+0x64/0x68
 [<c01d082c>] vt_console_print+0x2bc/0x2d0
 [<c011fba4>] __call_console_drivers+0x3c/0x4c
 [<c011fc0c>] _call_console_drivers+0x58/0x60
 [<c011fcf2>] call_console_drivers+0xde/0xe8
 [<c0260c7b>] error_code+0x2f/0x38
 [<c011dce4>] mm_release+0x34/0x78
 [<c0121805>] do_exit+0xc1/0x330
 [<c010af3c>] do_divide_error+0x0/0xac
 [<c011855a>] do_page_fault+0x382/0x540
 [<c01181d8>] do_page_fault+0x0/0x540
 [<c012682f>] mod_timer+0x43/0x4c
 [<c01d1274>] poke_blanked_console+0x64/0x68
 [<c01d082c>] vt_console_print+0x2bc/0x2d0
 [<c011fba4>] __call_console_drivers+0x3c/0x4c
 [<c011fc0c>] _call_console_drivers+0x58/0x60
 [<c011fcf2>] call_console_drivers+0xde/0xe8
 [<c0260c7b>] error_code+0x2f/0x38
 [<c011dce4>] mm_release+0x34/0x78
 [<c0121805>] do_exit+0xc1/0x330
 [<c010af3c>] do_divide_error+0x0/0xac
 [<c011855a>] do_page_fault+0x382/0x540
 [<c01181d8>] do_page_fault+0x0/0x540
 [<c012682f>] mod_timer+0x43/0x4c
 [<c01d1274>] poke_blanked_console+0x64/0x68
 [<c01d082c>] vt_console_print+0x2bc/0x2d0
 [<c011fba4>] __call_console_drivers+0x3c/0x4c
 [<c011fc0c>] _call_console_drivers+0x58/0x60
 [<c011fcf2>] call_console_drivers+0xde/0xe8
 [<c0260c7b>] error_code+0x2f/0x38
 [<c011dce4>] mm_release+0x34/0x78
 [<c0121805>] do_exit+0xc1/0x330
 [<c010af3c>] do_divide_error+0x0/0xac
 [<c011855a>] do_page_fault+0x382/0x540
 [<c01181d8>] do_page_fault+0x0/0x540
 [<c012682f>] mod_timer+0x43/0x4c
 [<c01d1274>] poke_blanked_console+0x64/0x68
 [<c01d082c>] vt_console_print+0x2bc/0x2d0
 [<c011fba4>] __call_console_drivers+0x3c/0x4c
 [<c011fc0c>] _call_console_drivers+0x58/0x60
 [<c011fcf2>] call_console_drivers+0xde/0xe8
 [<c0260c7b>] error_code+0x2f/0x38
 [<c011dce4>] mm_release+0x34/0x78
 [<c0121805>] do_exit+0xc1/0x330
 [<c010af3c>] do_divide_error+0x0/0xac
 [<c011855a>] do_page_fault+0x382/0x540
 [<c01181d8>] do_page_fault+0x0/0x540
 [<c012682f>] mod_timer+0x43/0x4c
 [<c01d1274>] poke_blanked_console+0x64/0x68
 [<c01d082c>] vt_console_print+0x2bc/0x2d0
 [<c011fba4>] __call_console_drivers+0x3c/0x4c
 [<c011fc0c>] _call_console_drivers+0x58/0x60
 [<c011fcf2>] call_console_drivers+0xde/0xe8
 [<c0260c7b>] error_code+0x2f/0x38
 [<c011dce4>] mm_release+0x34/0x78
 [<c0121805>] do_exit+0xc1/0x330
 [<c010af3c>] do_divide_error+0x0/0xac
 [<c011855a>] do_page_fault+0x382/0x540
 [<c01181d8>] do_page_fault+0x0/0x540
 [<c012682f>] mod_timer+0x43/0x4c
 [<c01d1274>] poke_blanked_console+0x64/0x68
 [<c01d082c>] vt_console_print+0x2bc/0x2d0
 [<c011fba4>] __call_console_drivers+0x3c/0x4c
 [<c011fc0c>] _call_console_drivers+0x58/0x60
 [<c011fcf2>] call_console_drivers+0xde/0xe8
 [<c0260c7b>] error_code+0x2f/0x38
 [<c011dce4>] mm_release+0x34/0x78
 [<c0121805>] do_exit+0xc1/0x330
 [<c010af3c>] do_divide_error+0x0/0xac
 [<c011855a>] do_page_fault+0x382/0x540
 [<c01181d8>] do_page_fault+0x0/0x540
 [<c012682f>] mod_timer+0x43/0x4c
 [<c01d1274>] poke_blanked_console+0x64/0x68
 [<c01d082c>] vt_console_print+0x2bc/0x2d0
 [<c011fba4>] __call_console_drivers+0x3c/0x4c
 [<c011fc0c>] _call_console_drivers+0x58/0x60
 [<c011fcf2>] call_console_drivers+0xde/0xe8
 [<c0260c7b>] error_code+0x2f/0x38
 [<c011dce4>] mm_release+0x34/0x78
 [<c0121805>] do_exit+0xc1/0x330
 [<c010af3c>] do_divide_error+0x0/0xac
 [<c011855a>] do_page_fault+0x382/0x540
 [<c01181d8>] do_page_fault+0x0/0x540
 [<c012682f>] mod_timer+0x43/0x4c
 [<c01d1274>] poke_blanked_console+0x64/0x68
 [<c01d082c>] vt_console_print+0x2bc/0x2d0
 [<c011fba4>] __call_console_drivers+0x3c/0x4c
 [<c011fc0c>] _call_console_drivers+0x58/0x60
 [<c011fcf2>] call_console_drivers+0xde/0xe8
 [<c0260c7b>] error_code+0x2f/0x38
 [<c010b138>] do_invalid_op+0x0/0x8c
 [<c011dce4>] mm_release+0x34/0x78
 [<c0121805>] do_exit+0xc1/0x330
 [<c010b138>] do_invalid_op+0x0/0x8c
 [<c010af3c>] do_divide_error+0x0/0xac
 [<c010b1ba>] do_invalid_op+0x82/0x8c
 [<c0115242>] flush_tlb_others+0x22/0xd0
 [<c014cbd2>] free_page_and_swap_cache+0x52/0x54
 [<c014172c>] zap_pte_range+0x228/0x2c4
 [<c014181d>] zap_pmd_range+0x55/0x70
 [<c0138a37>] free_hot_page+0x7/0x8
 [<c013d567>] __page_cache_release+0x87/0x8c
 [<c0260c7b>] error_code+0x2f/0x38
 [<c0115242>] flush_tlb_others+0x22/0xd0
 [<c0115366>] flush_tlb_mm+0x76/0x7c
 [<c0145adb>] exit_mmap+0x11f/0x1cc
 [<c011dc64>] mmput+0x50/0x70
 [<c01218fd>] do_exit+0x1b9/0x330
 [<c012bb3a>] sys_reboot+0x1f2/0x2f8
 [<c0119f28>] wake_up_state+0xc/0x10
 [<c0128a47>] kill_proc_info+0x37/0x4c
 [<c0128b40>] kill_something_info+0xe4/0xec
 [<c012a7dc>] sys_kill+0x54/0x5c
 [<c0150373>] filp_open+0x3b/0x5c
 [<c0150739>] sys_open+0x59/0x74
 [<c0260c7b>] error_code+0x2f/0x38
 [<c026020f>] syscall_call+0x7/0xb

Code: 7c 01 00 00 8e e0 8e e8 85 d2 74 11 c7 83 7c 01 00 00 00 00 00 00 89 d0 e8 ca e1 ff ff 8b 8b 84 01 00 00 85 c9 74 45 8b 44 24 0c <8b> 40 14 83 f8 01 7e 39 c7 83 84 01 00 00 00 00 00 00 b8 00 e0 



* Re: shutdown panic in mm_release (really flush_tlb_others?)
  2004-01-07  2:57 shutdown panic in mm_release (really flush_tlb_others?) Martin J. Bligh
@ 2004-01-07  5:30 ` Andrew Morton
  2004-01-07  6:13   ` Martin J. Bligh
  0 siblings, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2004-01-07  5:30 UTC (permalink / raw)
  To: Martin J. Bligh; +Cc: linux-kernel, linux-mm

"Martin J. Bligh" <mbligh@aracnet.com> wrote:
>
> And the award for longest panic I've ever seen goes to ....
>  <drumroll> ....
> 
>  (there were several of these in sequence).
>  Looks like it was trying to printk an error on shutdown ...
>  really maybe " [<c0115242>] flush_tlb_others+0x22/0xd0"
> 
>  Probably the same panic I sent out the other day in a slight
>  disguise ... "BUG_ON(!cpus_equal(cpumask, tmp));" in flush_tlb_others

Cute.  Didn't you have a patch for this?  Or a proposed solution which
you've been too lazy to type in?  ;)



* Re: shutdown panic in mm_release (really flush_tlb_others?)
  2004-01-07  5:30 ` Andrew Morton
@ 2004-01-07  6:13   ` Martin J. Bligh
  2004-01-08  1:04     ` Rusty Russell
  0 siblings, 1 reply; 4+ messages in thread
From: Martin J. Bligh @ 2004-01-07  6:13 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, linux-mm, Rusty Russell, Dipankar Sarma

>> And the award for longest panic I've ever seen goes to ....
>>  <drumroll> ....
>> 
>>  (there were several of these in sequence).
>>  Looks like it was trying to printk an error on shutdown ...
>>  really maybe " [<c0115242>] flush_tlb_others+0x22/0xd0"
>> 
>>  Probably the same panic I sent out the other day in a slight
>>  disguise ... "BUG_ON(!cpus_equal(cpumask, tmp));" in flush_tlb_others
> 
> Cute.  Didn't you have a patch for this?  Or a proposed solution which
> you've been too lazy to type in?  ;)

I did have a rough guess as to the problem, but not the solution:

"I presume we've got a race between taking CPUs offline and the 
tlbflush code ... flush_tlb_mm reads the value from mm->cpu_vm_mask,
and then presumably some other cpu changes cpu_online_map before it
gets to calling flush_tlb_others ... does that sound about right?"
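
A minimal user-space sketch of the race guessed at above, not kernel code. The names mask_t, mask_is_stale and take_offline are hypothetical stand-ins introduced for illustration; only the shape of the check (AND the caller's mask with the online map, then compare) is taken from the BUG_ON quoted in this thread.

```c
#include <stdbool.h>

/* Hypothetical stand-in for cpumask_t: one bit per CPU. */
typedef unsigned long mask_t;

/*
 * Same shape as the check quoted above:
 *     cpus_and(tmp, cpumask, cpu_online_map);
 *     BUG_ON(!cpus_equal(cpumask, tmp));
 * Returns true when the caller's mask names a CPU that is not online,
 * i.e. when the BUG_ON would fire.
 */
static bool mask_is_stale(mask_t cpumask, mask_t online_map)
{
	mask_t tmp = cpumask & online_map;

	return tmp != cpumask;
}

/* Clear one CPU's bit, as a CPU does to itself while shutting down. */
static mask_t take_offline(mask_t online_map, int cpu)
{
	return online_map & ~(1UL << cpu);
}
```

With CPUs 0-3 online and a snapshot naming CPUs 1 and 2, mask_is_stale() is false. If CPU 2 is taken offline after the snapshot was read but before the check runs, the same snapshot makes it true. Nothing in the sketch, and (per the guess above) nothing in the real path, orders those two steps.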

There doesn't seem to be anything locking cpu_online_map, AFAICS,
so presumably stop_this_cpu is futzing with it whilst we try to
shove tlb stuff out to other people. Looks like we're trying to stop 
sending IPIs to CPUs that aren't online (which is a Good Thing (tm)),
but we either end up calling send_IPI_mask_sequence (which already
checks it for us), or send_IPI_mask_bitmask. I suppose we could shift
the check into there ... it'll fix my problem for sure, but probably
just makes the race window smaller on mach_default:

diff -aurpN -X /home/fletch/.diff.exclude 2.6.1-rc2/arch/i386/kernel/smp.c 2.6.1-rc2-tlb_fix/arch/i386/kernel/smp.c
--- 2.6.1-rc2/arch/i386/kernel/smp.c	Tue Sep  2 09:55:42 2003
+++ 2.6.1-rc2-tlb_fix/arch/i386/kernel/smp.c	Tue Jan  6 22:10:44 2004
@@ -160,7 +160,7 @@ void send_IPI_self(int vector)
  */
 inline void send_IPI_mask_bitmask(cpumask_t cpumask, int vector)
 {
-	unsigned long mask = cpus_coerce(cpumask);
+	cpumask_t cpus_online;
 	unsigned long cfg;
 	unsigned long flags;
 
@@ -170,11 +170,12 @@ inline void send_IPI_mask_bitmask(cpumas
 	 * Wait for idle.
 	 */
 	apic_wait_icr_idle();
-		
+
 	/*
 	 * prepare target chip field
 	 */
-	cfg = __prepare_ICR2(mask);
+	cpus_and(cpus_online, cpumask, cpu_online_map);
+	cfg = __prepare_ICR2(cpus_coerce(cpus_online));
 	apic_write_around(APIC_ICR2, cfg);
 		
 	/*
@@ -356,7 +357,6 @@ static void flush_tlb_others(cpumask_t c
 	BUG_ON(cpus_empty(cpumask));
 
 	cpus_and(tmp, cpumask, cpu_online_map);
-	BUG_ON(!cpus_equal(cpumask, tmp));
 	BUG_ON(cpu_isset(smp_processor_id(), cpumask));
 	BUG_ON(!mm);
 

(Note: not tested. I will if you want me to, but I'm not enamoured
with the patch ;-))
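
Why moving the check to send time only narrows the window can be sketched in user space as follows. This is an illustration under assumptions, not the APIC code: mask_t, filter_online and stale_targets are hypothetical names, and the "send" is reduced to holding a destination mask.

```c
/* Hypothetical stand-in for cpumask_t: one bit per CPU. */
typedef unsigned long mask_t;

/* Step 1: drop CPUs that are offline at the moment of filtering. */
static mask_t filter_online(mask_t requested, mask_t online_map)
{
	return requested & online_map;
}

/*
 * Step 2 happens later: the destination is written out and the IPI sent.
 * Any CPU that left the online map between step 1 and step 2 is still
 * in dest. This returns those leftover bits.
 */
static mask_t stale_targets(mask_t dest, mask_t online_now)
{
	return dest & ~online_now;
}
```

If the online map is unchanged between the two steps, stale_targets() is empty. If a CPU goes offline in between, its bit survives in dest, which is the smaller but still open window mentioned above.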

Perhaps there's some better solution in the up-and-coming CPU hotplug
stuff that we could steal?


M.       


* Re: shutdown panic in mm_release (really flush_tlb_others?)
  2004-01-07  6:13   ` Martin J. Bligh
@ 2004-01-08  1:04     ` Rusty Russell
  0 siblings, 0 replies; 4+ messages in thread
From: Rusty Russell @ 2004-01-08  1:04 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: linux-kernel, linux-mm, Rusty Russell, Dipankar Sarma,
	Srivatsa Vaddagiri

In message <12800000.1073455988@[10.10.2.4]> you write:
> There doesn't seem to be anything locking cpu_online_map, AFAICS,
> so presumably stop_this_cpu is futzing with it whilst we try to
> shove tlb stuff out to other people.

These attempts to bring the CPUs down at shutdown are optimistic.  The
hotplug CPU code does a superset of this, and we (particularly Vatsa)
found all kinds of races in x86.

Here's part of the patch from the hotplug CPU code, which might
provide some inspiration.  It passes fairly serious stress testing.

Points:
1) skip_call_ipi is to effectively clear IPIs when a cpu first comes
   back up, which you don't care about.

2) cpu_active_map is a superset of cpu_online_map: when a CPU is in
   the process of going down, cpu_online() is false, but the
   cpu_active_map is only cleared once the CPU actually stops taking
   interrupts (ie. really dead).
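
Point 2 matters most for the wait loops. A rough user-space sketch of the two exit conditions, assuming a hypothetical mask_t in place of cpumask_t and hypothetical helper names; the real loops also issue mb() and re-read the pending mask on every pass, which is omitted here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for cpumask_t: one bit per CPU. */
typedef unsigned long mask_t;

/*
 * Unmasked condition: the waiter leaves only when every targeted CPU
 * has cleared its bit. A CPU that died before acknowledging keeps its
 * bit set, so the waiter would spin forever.
 */
static bool done_unmasked(mask_t pending)
{
	return pending == 0;
}

/*
 * Masked condition: pending bits are ANDed with the active map first,
 * so a CPU that has really stopped taking interrupts no longer holds
 * the waiter up.
 */
static bool done_masked(mask_t pending, mask_t active_map)
{
	return (pending & active_map) == 0;
}
```

For example, with CPU 2 still pending but already cleared from the active map, done_unmasked() stays false while done_masked() is true. While CPU 2 is still active, both stay false and the waiter keeps waiting, as it should.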

diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .6045-2.6.1-rc2-bk1-hotcpu-i386.pre/arch/i386/kernel/smp.c .6045-2.6.1-rc2-bk1-hotcpu-i386/arch/i386/kernel/smp.c
--- .6045-2.6.1-rc2-bk1-hotcpu-i386.pre/arch/i386/kernel/smp.c	2003-09-22 10:27:28.000000000 +1000
+++ .6045-2.6.1-rc2-bk1-hotcpu-i386/arch/i386/kernel/smp.c	2004-01-07 16:28:14.000000000 +1100
@@ -26,6 +26,8 @@
 #include <mach_ipi.h>
 #include <mach_apic.h>
 
+DECLARE_PER_CPU(int, skip_call_ipi);
+
 /*
  *	Some notes on x86 processor bugs affecting SMP operation:
  *
@@ -355,11 +357,15 @@ static void flush_tlb_others(cpumask_t c
 	 */
 	BUG_ON(cpus_empty(cpumask));
 
-	cpus_and(tmp, cpumask, cpu_online_map);
+	cpus_and(tmp, cpumask, cpu_callout_map);
 	BUG_ON(!cpus_equal(cpumask, tmp));
 	BUG_ON(cpu_isset(smp_processor_id(), cpumask));
 	BUG_ON(!mm);
 
+	cpus_and(cpumask, cpumask, cpu_active_map);
+	if (cpus_empty(cpumask))
+		return;
+
 	/*
 	 * i'm not happy about this global shared spinlock in the
 	 * MM hot path, but we'll see how contended it is.
@@ -387,9 +393,11 @@ static void flush_tlb_others(cpumask_t c
 	 */
 	send_IPI_mask(cpumask, INVALIDATE_TLB_VECTOR);
 
-	while (!cpus_empty(flush_cpumask))
-		/* nothing. lockup detection does not belong here */
+	do {
 		mb();
+		tmp = flush_cpumask;
+		cpus_and(tmp, tmp, cpu_active_map);
+	} while (!cpus_empty(tmp));
 
 	flush_mm = NULL;
 	flush_va = 0;
@@ -486,8 +494,8 @@ static spinlock_t call_lock = SPIN_LOCK_
 struct call_data_struct {
 	void (*func) (void *info);
 	void *info;
-	atomic_t started;
-	atomic_t finished;
+	cpumask_t not_started;
+	cpumask_t not_finished;
 	int wait;
 };
 
@@ -514,32 +522,44 @@ int smp_call_function (void (*func) (voi
  */
 {
 	struct call_data_struct data;
-	int cpus = num_online_cpus()-1;
+	cpumask_t mask;
+	int cpu;
 
-	if (!cpus)
-		return 0;
+	spin_lock(&call_lock);
 
+	cpu = smp_processor_id();
 	data.func = func;
 	data.info = info;
-	atomic_set(&data.started, 0);
+	data.not_started = cpu_active_map;
+	cpu_clear(cpu, data.not_started);
+	if (cpus_empty(data.not_started))
+		goto out_unlock;
+
 	data.wait = wait;
 	if (wait)
-		atomic_set(&data.finished, 0);
+		data.not_finished = data.not_started;
 
-	spin_lock(&call_lock);
 	call_data = &data;
 	mb();
 	
 	/* Send a message to all other CPUs and wait for them to respond */
-	send_IPI_allbutself(CALL_FUNCTION_VECTOR);
+	send_IPI_mask(data.not_started, CALL_FUNCTION_VECTOR);
 
 	/* Wait for response */
-	while (atomic_read(&data.started) != cpus)
-		barrier();
+	do {
+		mb();
+		mask = data.not_started;
+		cpus_and(mask, mask, cpu_active_map);
+	} while(!cpus_empty(mask));
 
 	if (wait)
-		while (atomic_read(&data.finished) != cpus)
-			barrier();
+		do {
+			mb();
+			mask = data.not_finished;
+			cpus_and(mask, mask, cpu_active_map);
+		} while(!cpus_empty(mask));
+
+out_unlock:
 	spin_unlock(&call_lock);
 
 	return 0;
@@ -551,6 +571,7 @@ static void stop_this_cpu (void * dummy)
 	 * Remove this CPU:
 	 */
 	cpu_clear(smp_processor_id(), cpu_online_map);
+	cpu_clear(smp_processor_id(), cpu_active_map);
 	local_irq_disable();
 	disable_local_APIC();
 	if (cpu_data[smp_processor_id()].hlt_works_ok)
@@ -583,17 +604,33 @@ asmlinkage void smp_reschedule_interrupt
 
 asmlinkage void smp_call_function_interrupt(void)
 {
-	void (*func) (void *info) = call_data->func;
-	void *info = call_data->info;
-	int wait = call_data->wait;
+	void (*func) (void *info);
+	void *info;
+	int wait;
+	int cpu = smp_processor_id();
 
 	ack_APIC_irq();
+
+#ifdef CONFIG_HOTPLUG_CPU
+	if (__get_cpu_var(skip_call_ipi)) {
+		printk ("Ignoring Queueed IPI \n");
+		__get_cpu_var(skip_call_ipi) = 0;
+		return;
+	}
+#endif
+
+	func = call_data->func;
+	info = call_data->info;
+	wait = call_data->wait;
+
 	/*
 	 * Notify initiating CPU that I've grabbed the data and am
 	 * about to execute the function
 	 */
-	mb();
-	atomic_inc(&call_data->started);
+	smp_mb__before_clear_bit();
+	cpu_clear(cpu, call_data->not_started);
+	smp_mb__after_clear_bit();
+
 	/*
 	 * At this point the info structure may be out of scope unless wait==1
 	 */
@@ -602,8 +638,9 @@ asmlinkage void smp_call_function_interr
 	irq_exit();
 
 	if (wait) {
-		mb();
-		atomic_inc(&call_data->finished);
+		smp_mb__before_clear_bit();
+		cpu_clear(cpu, call_data->not_finished);
+		smp_mb__after_clear_bit();
 	}
 }
 
--
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

