* 2.5.40-mm2
From: Andrew Morton @ 2002-10-06 18:47 UTC
To: lkml, linux-mm

url: http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.40/2.5.40-mm2/

- Peter Chubb's 64-bit sector_t patches have been included.  These are
  working fine and are a 2.6 must-have, IMO.

- Included Manfred's slab rework.  No problems observed there.

- The per-cpu hot-n-cold pages code continues to disappoint.  For some
  weird reason, the enormous lock contention which was observed in
  rmqueue and __free_pages_ok in 2.5.9 has vanished in 2.5.40 on the
  big ppc64 boxen.  So these patches fix something which isn't there
  any more.  Could be related to the hardware (which changed); we're
  still poking at it.

  One test which involves repeatedly writing and then truncating
  smallish files was sped up 60%, which indicates that the cache
  locality stuff is working correctly, but it's a bit artificial.

  Ingo said that his 2.4-based per-cpu-pages patch was beneficial to
  specweb, but nobody has tested these patches with specweb.  Hint.

- Started work on /proc/sys/vm/swappiness.  Setting it to 100% gives
  you current 2.5 behaviour.  Setting it to 0 feels pretty similar to
  2.4.19.

  I ran it for half a day; seems to work OK.  Although running a KDE
  desktop on dual 25" monitors in 96 megabytes is not a ton of fun.
  More things to be done on this.

  If anyone tests this code on a small machine, you really do need to
  set /proc/sys/vm/dirty_async_ratio to 15.  I'll be making this
  dynamic.

- Started work on a page reservation API to solve the problem of
  ENOMEM during radix-tree and pte_chain allocations.  It's untested
  and unused at present.

- Dropped the sard patch for now - it kept on getting stomped by the
  gendisk rework.


+discontig-setup-fix.patch

 A discontigmem compile fix

+remove-get_free_page.patch

 Remove get_free_page() from the kernel API.

+wli-libfs.patch

 Move some library functions from ramfs to libfs

+hugetlb-prefault.patch

 Factor out some hugetlb code - preparation for hugetlbfs

-misc.patch
-ioperm-fix.patch
-radix_tree_gang_lookup.patch
-truncate_inode_pages.patch
-proc_vmstat.patch
-kswapd-reclaim-stats.patch
-iowait.patch
-bd-sard.patch
-dio-bio-add-page.patch
-tcp-wakeups.patch
-swapoff-deadlock.patch
-dirty-and-uptodate.patch
-shmem_rename.patch
-dirent-size.patch
-tmpfs-trivia.patch
-per-zone-vm.patch
swsusp-feature.patch
-bio-get-nr-vecs.patch
-dio-nr-segs.patch
-remove-page-virtual.patch
-dirty-memory-clamp.patch
-mempool-wakeup-fix.patch
-remove-write_mapping_buffers.patch
-buffer_boundary-scheduling.patch
-ll_rw_block-cleanup.patch

 Merged

+dio-fine-alignment.patch

 Permit 512-byte-aligned direct IO against larger-than-512-byte
 blocksize filesystems.

+lbd1.patch
+lbd2.patch
+lbd3.patch
+lbd4.patch
+lbd5.patch
+lbd6.patch

 64-bit sector_t option.

+64-bit-sector_t.patch

 Make 64-bit sector_t's compulsory in config (accelerated testing)

+page-reservation.patch

 Page reservation API

+slab-split-01-rename.patch
+slab-split-02-SMP.patch
+slab-split-03-tail.patch
+slab-split-04-drain.patch
+slab-split-05-name.patch
+slab-split-06-mand-cpuarray.patch
+slab-split-07-inline.patch
+slab-split-08-reap.patch

 slab rework

+cpucache_init-fix.patch

 Fix the above

+large-queue-throttle.patch

 Fixed writer throttling for tiny machines which have large disk
 queues

+exit-page-referenced.patch

 Propagate the pte referenced bit into PG_referenced for pagecache
 pages during pagetable teardown

+swappiness.patch

 /proc/sys/vm/swappiness


linus.patch
  cset-1.663.1.1-to-1.752.txt.gz

discontig-setup-fix.patch
  discontigmem compile fix

discontig-no-contig_page_data.patch
  undefine contig_page_data for discontigmem

per-node-mem_map.patch
  ia32 NUMA: per-node ZONE_NORMAL

remove-get_free_page.patch
  remove get_free_page()

alloc_pages_node-cleanup.patch
  alloc_pages_node cleanup

free_area_init-cleanup.patch
  free_area_init_node cleanup

wli-libfs.patch
  Move dentry library functions from ramfs to libfs

hugetlb-prefault.patch
  hugetlbpages: factor out some code for hugetlbfs

ext3-dxdir.patch
  ext3 htree

spin-lock-check.patch
  spinlock/rwlock checking infrastructure

rd-cleanup.patch
  Cleanup and fix the ramdisk driver (doesn't work right yet)

write-deadlock.patch
  Fix the generic_file_write-from-same-mmapped-page deadlock

swsusp-feature.patch
  add shrink_all_memory() for swsusp

lseek-ext2_readdir.patch
  remove lock_kernel() from ext2_readdir()

dio-fine-alignment.patch
  Allow O_DIRECT to use 512-byte alignment

batched-slab-asap.patch
  batched slab shrinking

lbd1.patch
  64-bit sector_t 1/5

lbd2.patch
  64-bit sector_t 2/5

lbd3.patch
  64-bit sector_t 3/5

lbd4.patch
  64-bit sector_t 4/5

lbd5.patch
  64-bit sector_t 5/5

lbd6.patch
  64-bit sector_t 6/5

64-bit-sector_t.patch
  Hardwire CONFIG_LBD to "on"

akpm-deadline.patch
  deadline scheduler tweaks

rmqueue_bulk.patch
  bulk page allocator

free_pages_bulk.patch
  Bulk page freeing function

hot_cold_pages.patch
  Hot/Cold pages and zone->lock amortisation

readahead-cold-pages.patch
  Use cache-cold pages for pagecache reads.

pagevec-hot-cold-hint.patch
  hot/cold hints for truncate and page reclaim

page-reservation.patch
  Page reservation API

intel-user-copy.patch
  Faster copy_*_user for Intel ia32 CPUs

slab-split-01-rename.patch
  slab cleanup: rename static functions

slab-split-02-SMP.patch
  slab: enable the cpu arrays on uniprocessor

slab-split-03-tail.patch
  slab: reduced internal fragmentation

slab-split-04-drain.patch
  slab: take the spinlock in the drain function.

slab-split-05-name.patch
  slab: remove spaces from /proc identifiers

slab-split-06-mand-cpuarray.patch
  slab: cleanups and speedups

slab-split-07-inline.patch
  slab: uninline poisoning checks

slab-split-08-reap.patch
  slab: reap timers

cpucache_init-fix.patch
  cpucache_init fix

large-queue-throttle.patch
  Improve writer throttling for small machines

exit-page-referenced.patch
  Propagate pte referenced bit into pagecache during unmap

swappiness.patch
  swappiness control

read_barrier_depends.patch
  extended barrier primitives

rcu_ltimer.patch
  RCU core

dcache_rcu.patch
  Use RCU for dcache

* Re: 2.5.40-mm2
From: Dave Hansen @ 2002-10-06 20:47 UTC
To: Andrew Morton
Cc: lkml, linux-mm, Ingo Molnar

Andrew Morton wrote:
>   Ingo said that his 2.4-based per-cpu-pages patch was beneficial to
>   specweb, but nobody has tested these patches with specweb.  Hint.

cc'ing Ingo, because I think this might be related to the timer bh
removal.

2.5.40 doesn't last very long under Specweb.  It always dies out with
one of these oopses after a little while:

CPU:    3
EIP:    0060:[<801204a9>]    Not tainted
EFLAGS: 00010006
EIP is at run_timer_tasklet+0xcd/0x13c
eax: 00000000   ebx: 802657a8   ecx: e3c640a0   edx: 00000000
esi: e3c642c0   edi: 8039cae0   ebp: 00000246   esp: 8c3d9f20
ds: 0068   es: 0068   ss: 0068
Process swapper (pid: 0, threadinfo=8c3d8000 task=8c3dc760)
Stack: 8c093188 00000000 8c3d8000 00000001 8011d2e5 00000000 00000001 80399960
       fffffffe 00000060 8037e324 8037e324 8011cfea 80399960 0000000c 00000003
       00000000 00000000 00000046 801111dd 8c3d8000 80105334 00000000 80107a8a
Call Trace:
 [<8011d2e5>] tasklet_hi_action+0x85/0xe0
 [<8011cfea>] do_softirq+0x5a/0xac
 [<801111dd>] smp_apic_timer_interrupt+0x111/0x118
 [<80105334>] poll_idle+0x0/0x48
 [<80107a8a>] apic_timer_interrupt+0x1a/0x20
 [<80105334>] poll_idle+0x0/0x48
 [<8010535d>] poll_idle+0x29/0x48
 [<801053b3>] cpu_idle+0x37/0x48
 [<801183ad>] printk+0x125/0x140

Code: 89 50 04 89 02 c7 06 00 00 00 00 c7 46 04 00 00 00 00 c7 46

I'll get a properly decoded one later.  I think I just wrote over my
old vmlinux.

But, it looks to me like this is somewhere inside __run_timers() at
kernel/timer.c:329, which looks something like this:

                        list_del(&timer->entry);
                        timer->base = NULL;
#if CONFIG_SMP
                        base->running_timer = timer;
#endif

kgdb kills this machine when kjournald is starting up.  Time to try
kdb.  I _really_ hate this POS hardware.

--
Dave Hansen
haveblue@us.ibm.com

* Re: 2.5.40-mm2
From: Andrew Morton @ 2002-10-06 21:55 UTC
To: Dave Hansen
Cc: lkml, linux-mm, Ingo Molnar

Dave Hansen wrote:
>
> Andrew Morton wrote:
> >   Ingo said that his 2.4-based per-cpu-pages patch was beneficial to
> >   specweb, but nobody has tested these patches with specweb.  Hint.
>
> cc'ing Ingo, because I think this might be related to the timer bh
> removal.
>
> 2.5.40 doesn't last very long under Specweb.  It always dies out with
> one of these oopses after a little while:
>
> CPU:    3
> EIP:    0060:[<801204a9>]    Not tainted
> EFLAGS: 00010006
> EIP is at run_timer_tasklet+0xcd/0x13c

Well from a quick peek, there is some funny stuff happening
in the timer code.

- del_timer_sync() iterates across all CPUs, but does not
  actually _do_ anything for each CPU.  (I suspect this may be the
  source of your crash - del_timer_sync() is bust)

- the back-to-back preempt_disable()/preempt_enable() is
  unusual.  What's that for?

- __run_timers() is doing spin_unlock_irq() inside spin_lock_irqsave().
  That's probably not a bug in this context, but it's a wart.

This help?

--- 2.5.40/kernel/timer.c~timer-tricks	Sun Oct  6 14:50:39 2002
+++ 2.5.40-akpm/kernel/timer.c	Sun Oct  6 14:52:34 2002
@@ -265,20 +265,19 @@ repeat:
  */
 int del_timer_sync(timer_t *timer)
 {
-	tvec_base_t *base = tvec_bases;
 	int i, ret;
 
 	ret = del_timer(timer);
 
 	for (i = 0; i < NR_CPUS; i++) {
+		tvec_base_t *base;
+
 		if (!cpu_online(i))
 			continue;
+		base = tvec_bases + i;
 		if (base->running_timer == timer) {
-			while (base->running_timer == timer) {
+			while (base->running_timer == timer)
 				cpu_relax();
-				preempt_disable();
-				preempt_enable();
-			}
 			break;
 		}
 		base++;
@@ -359,9 +358,9 @@ repeat:
 #if CONFIG_SMP
 			base->running_timer = timer;
 #endif
-			spin_unlock_irq(&base->lock);
+			spin_unlock_irqrestore(&base->lock, flags);
 			fn(data);
-			spin_lock_irq(&base->lock);
+			spin_lock_irqsave(&base->lock, flags);
 			goto repeat;
 		}
 		++base->timer_jiffies;

.

* Re: 2.5.40-mm2
From: Andrew Morton @ 2002-10-06 22:07 UTC
To: Dave Hansen, lkml, linux-mm, Ingo Molnar

Andrew Morton wrote:
>
> ...
>  int del_timer_sync(timer_t *timer)
>  {
> -       tvec_base_t *base = tvec_bases;
>         int i, ret;
>
>         ret = del_timer(timer);
>
>         for (i = 0; i < NR_CPUS; i++) {
> +               tvec_base_t *base;
> +
>                 if (!cpu_online(i))
>                         continue;
> +               base = tvec_bases + i;
>                 if (base->running_timer == timer) {
> -                       while (base->running_timer == timer) {
> +                       while (base->running_timer == timer)
>                                 cpu_relax();
> -                               preempt_disable();
> -                               preempt_enable();
> -                       }
>                         break;
>                 }
>                 base++;

Oh, OK.  There's a base++ hidden at the end there :(

So the code as-is will work OK if all your online CPUs are adjacent,
starting at CPU0.  It is incorrect if you have gaps in your online
map.

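The gap problem described in this message can be reproduced in user space. The following is only a minimal sketch: `struct base_model`, `slot_seen_buggy()` and `slot_seen_fixed()` are hypothetical stand-ins for the kernel's per-CPU `tvec_base_t` array and the two loop shapes, not kernel code. It reports which per-CPU slot each loop would actually inspect when it reaches a given CPU.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical stand-in for one per-CPU tvec_base_t slot. */
struct base_model {
	int cpu;	/* CPU that owns this slot */
};

/*
 * Original loop shape: the pointer is advanced by a trailing base++,
 * which `continue` skips for offline CPUs.  Returns the owner of the
 * slot that is examined when the loop reaches CPU `target`.
 */
static int slot_seen_buggy(const struct base_model *bases,
			   const bool *online, int target)
{
	const struct base_model *base = bases;

	for (int i = 0; i < NR_CPUS; i++) {
		if (!online[i])
			continue;	/* base is NOT advanced here */
		if (i == target)
			return base->cpu;
		base++;
	}
	return -1;
}

/* Patched loop shape: the slot is derived from the CPU index itself. */
static int slot_seen_fixed(const struct base_model *bases,
			   const bool *online, int target)
{
	for (int i = 0; i < NR_CPUS; i++) {
		if (!online[i])
			continue;
		const struct base_model *base = bases + i;
		if (i == target)
			return base->cpu;
	}
	return -1;
}
```

With CPUs 0, 2 and 3 online and CPU 1 offline, the first loop inspects CPU 1's slot when it reaches CPU 2, while the second inspects CPU 2's slot. With a contiguous online map the two agree, which matches the observation that the original code only works when the online CPUs are adjacent starting at CPU0.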
* Re: 2.5.40-mm2
From: Andrew Morton @ 2002-10-06 22:11 UTC
To: Dave Hansen, lkml, linux-mm, Ingo Molnar

grr.  So that's what that "send" button does.

Updated patch:

--- 2.5.40/kernel/timer.c~timer-tricks	Sun Oct  6 15:08:02 2002
+++ 2.5.40-akpm/kernel/timer.c	Sun Oct  6 15:08:45 2002
@@ -265,23 +265,19 @@ repeat:
  */
 int del_timer_sync(timer_t *timer)
 {
-	tvec_base_t *base = tvec_bases;
 	int i, ret;
 
 	ret = del_timer(timer);
 
 	for (i = 0; i < NR_CPUS; i++) {
-		if (!cpu_online(i))
-			continue;
-		if (base->running_timer == timer) {
-			while (base->running_timer == timer) {
-				cpu_relax();
-				preempt_disable();
-				preempt_enable();
+		if (cpu_online(i)) {
+			tvec_base_t *base = tvec_bases + i;
+			if (base->running_timer == timer) {
+				while (base->running_timer == timer)
+					cpu_relax();
+				break;
 			}
-			break;
 		}
-		base++;
 	}
 	return ret;
 }
@@ -359,9 +355,9 @@ repeat:
 #if CONFIG_SMP
 			base->running_timer = timer;
 #endif
-			spin_unlock_irq(&base->lock);
+			spin_unlock_irqrestore(&base->lock, flags);
 			fn(data);
-			spin_lock_irq(&base->lock);
+			spin_lock_irqsave(&base->lock, flags);
 			goto repeat;
 		}
 		++base->timer_jiffies;

.

* Re: 2.5.40-mm2
From: Dave Hansen @ 2002-10-07 5:46 UTC
To: Andrew Morton
Cc: lkml, linux-mm, Ingo Molnar

Andrew Morton wrote:
> grr.  So that's what that "send" button does.
>
> Updated patch:

Still dies :(

Unable to handle kernel NULL pointer dereference at virtual address 00000004
 printing eip:
80120479
*pde = 64502001
*pte = 00000000
Oops: 0002

CPU:    2
EIP:    0060:[<80120479>]    Not tainted
EFLAGS: 00010012
EIP is at run_timer_tasklet+0xcd/0x13c
eax: 00000000   ebx: 80264a38   ecx: e4f46e20   edx: 00000000
esi: e4f47040   edi: 80399ac0   ebp: 00000282   esp: e474df64
ds: 0068   es: 0068   ss: 0068
Process httpd (pid: 2587, threadinfo=e474c000 task=f55d17e0)
Stack: 8c093148 00000000 e474c000 00000001 8011d295 00000000 00000001 80397960
       fffffffe 00000040 8037c1a4 8037c1a4 8011cf9a 80397960 00000008 00000002
       00000001 7ffff89c 00000046 801111dd 0813eff0 0813ef14 0813f064 80107a8a
Call Trace:
 [<8011d295>] tasklet_hi_action+0x85/0xe0
 [<8011cf9a>] do_softirq+0x5a/0xac
 [<801111dd>] smp_apic_timer_interrupt+0x111/0x118
 [<80107a8a>] apic_timer_interrupt+0x1a/0x20

Code: 89 50 04 89 02 c7 06 00 00 00 00 c7 46 04 00 00 00 00 c7 46

8012046a:       39 c6                   cmp    %eax,%esi
8012046c:       74 42  <the last if>    je     801204b0 <run_timer_tasklet+0x104>
8012046e:       8b 5e 0c                mov    0xc(%esi),%ebx
80120471:       8b 4e 10                mov    0x10(%esi),%ecx
80120474:       8b 56 04                mov    0x4(%esi),%edx
80120477:       8b 06                   mov    (%esi),%eax
80120479:       89 50 04  ------->      mov    %edx,0x4(%eax)
8012047c:       89 02                   mov    %eax,(%edx)
8012047e:       c7 06 00 00 00 00       movl   $0x0,(%esi)
80120484:       c7 46 04 00 00 00 00    movl   $0x0,0x4(%esi)
8012048b:       c7 46 14 00 00 00 00    movl   $0x0,0x14(%esi)
80120492:       89 77 08                mov    %esi,0x8(%edi)
80120495:       c6 07 01                movb   $0x1,(%edi)
80120498:       55                      push   %ebp
80120499:       9d                      popf

--
Dave Hansen
haveblue@us.ibm.com

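One way to read the disassembly above: the four instructions from 80120474 to 8012047c look like an inlined list_del(&timer->entry), and the faulting store writes through the entry's next pointer, which is zero in %eax. That would fit the reported fault address of 00000004 if prev sits at offset 4 of a 32-bit list head. The sketch below is a user-space model under that reading. `list_del_model()` mirrors the order of the stores in the dump, and `list_del_checked()` is a hypothetical guard added only to make the double-unlink case observable without crashing. Neither is the kernel's code.

```c
#include <stdbool.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/*
 * The same stores as the disassembly, in the same order.  The final
 * two assignments correspond to the movl $0x0 stores that follow.
 */
static void list_del_model(struct list_head *entry)
{
	struct list_head *prev = entry->prev;	/* mov 0x4(%esi),%edx */
	struct list_head *next = entry->next;	/* mov (%esi),%eax */

	next->prev = prev;	/* mov %edx,0x4(%eax): faults if next == NULL */
	prev->next = next;	/* mov %eax,(%edx) */
	entry->next = NULL;
	entry->prev = NULL;
}

/* Hypothetical guard: refuse to unlink an entry that is already unlinked. */
static bool list_del_checked(struct list_head *entry)
{
	if (entry->next == NULL || entry->prev == NULL)
		return false;
	list_del_model(entry);
	return true;
}
```

Under this model, calling the unguarded function twice on the same entry dereferences a NULL next on the second call. That is consistent with, though not proof of, a timer being run after its list entry had already been cleared.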
* Re: 2.5.40-mm2
From: Robert Love @ 2002-10-06 22:23 UTC
To: Andrew Morton
Cc: Dave Hansen, lkml, linux-mm, Ingo Molnar

On Sun, 2002-10-06 at 18:07, Andrew Morton wrote:

> > -                       while (base->running_timer == timer) {
> > +                       while (base->running_timer == timer)
> >                                 cpu_relax();
> > -                               preempt_disable();
> > -                               preempt_enable();

I am confused as to why Ingo would put these here.  He knows very well
what he is doing... surely he had a reason.

If he intended to force a preemption point here, then the lines need to
be reversed.  This assumes, of course, preemption is disabled here.  But
I do not think it is.

If he just wanted to check for preemption, we have a
preempt_check_resched() which does just that (I even think he wrote
it).  Note as long as interrupts are enabled this probably does not
achieve much anyhow.

So I do not know.  I find it odd the solution is to completely remove
it...

Btw, I think the solution to the crash is to add a check to
cpu_online().

	Robert Love

* Re: 2.5.40-mm2
From: Andrew Morton @ 2002-10-06 22:33 UTC
To: Robert Love
Cc: Dave Hansen, lkml, linux-mm, Ingo Molnar

Robert Love wrote:
>
> On Sun, 2002-10-06 at 18:07, Andrew Morton wrote:
>
> > > -                       while (base->running_timer == timer) {
> > > +                       while (base->running_timer == timer)
> > >                                 cpu_relax();
> > > -                               preempt_disable();
> > > -                               preempt_enable();
>
> I am confused as to why Ingo would put these here.  He knows very well
> what he is doing... surely he had a reason.
>
> If he intended to force a preemption point here, then the lines need to
> be reversed.  This assumes, of course, preemption is disabled here.  But
> I do not think it is.
>
> If he just wanted to check for preemption, we have a
> preempt_check_resched() which does just that (I even think he wrote
> it).  Note as long as interrupts are enabled this probably does not
> achieve much anyhow.
>

I think it's a way of doing "cond_resched() if cond_resched() is
a legal thing to do right now".

I'm sure David isn't using preempt though.

* Re: 2.5.40-mm2
From: Robert Love @ 2002-10-06 22:38 UTC
To: Andrew Morton
Cc: Dave Hansen, lkml, linux-mm, Ingo Molnar

On Sun, 2002-10-06 at 18:33, Andrew Morton wrote:

> I think it's a way of doing "cond_resched() if cond_resched() is
> a legal thing to do right now".
>
> I'm sure David isn't using preempt though.

If the system is preemptible, then the call can be replaced with
preempt_check_resched() which avoids the unneeded inc and dec.

But if the system is preemptible, it probably does not accomplish much
because we will already have preempted (e.g. the interrupt handler that
woke up a new task set need_resched and on return from interrupt we
serviced it).

If the system is not preemptible (non-zero preempt_count here) this
accomplishes nothing.

	Robert Love

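The comparison made in this exchange can be shown with a toy model. This is a simplified sketch, not the kernel's implementation: `struct cpu_model` and the `model_*` functions are hypothetical, and the model assumes only that preempt_enable() amounts to a counter decrement followed by a reschedule check, and that no reschedule happens while the count is non-zero.

```c
#include <stdbool.h>

/* Hypothetical model of the per-task preemption state. */
struct cpu_model {
	int preempt_count;	/* non-zero means preemption is disabled */
	bool need_resched;	/* set by an interrupt that woke a task */
	int count_writes;	/* how often preempt_count was modified */
	int reschedules;	/* how often we would have called schedule() */
};

/* Model of preempt_check_resched(): test only, no counter traffic. */
static void model_check_resched(struct cpu_model *c)
{
	if (c->preempt_count == 0 && c->need_resched) {
		c->reschedules++;
		c->need_resched = false;
	}
}

static void model_preempt_disable(struct cpu_model *c)
{
	c->preempt_count++;
	c->count_writes++;
}

static void model_preempt_enable(struct cpu_model *c)
{
	c->preempt_count--;
	c->count_writes++;
	model_check_resched(c);
}

/* The back-to-back pair that the patches in this thread remove or replace. */
static void model_disable_enable_pair(struct cpu_model *c)
{
	model_preempt_disable(c);
	model_preempt_enable(c);
}
```

In this model the pair and the bare check reschedule in exactly the same situations. The pair just pays for two extra counter updates, and with a non-zero count neither does anything, which is the "accomplishes nothing" case described above.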
* Re: 2.5.40-mm2
From: Ingo Molnar @ 2002-10-08 11:05 UTC
To: Dave Hansen
Cc: Andrew Morton, lkml, linux-mm

On Sun, 6 Oct 2002, Dave Hansen wrote:

> cc'ing Ingo, because I think this might be related to the timer bh
> removal.

could you try the attached patch against 2.5.41, does it help? It fixes
the bugs found so far plus makes del_timer_sync() a bit more robust by
re-checking timer pending-ness before exiting. There is one type of code
that might have relied on this kind of behavior of the old timer code.

	Ingo

--- linux/kernel/timer.c.orig	2002-10-08 12:39:46.000000000 +0200
+++ linux/kernel/timer.c	2002-10-08 12:49:50.000000000 +0200
@@ -266,29 +266,31 @@
 int del_timer_sync(timer_t *timer)
 {
 	tvec_base_t *base = tvec_bases;
-	int i, ret;
+	int i, ret = 0;
 
-	ret = del_timer(timer);
+del_again:
+	ret += del_timer(timer);
 
-	for (i = 0; i < NR_CPUS; i++) {
+	for (i = 0; i < NR_CPUS; i++, base++) {
 		if (!cpu_online(i))
 			continue;
 		if (base->running_timer == timer) {
 			while (base->running_timer == timer) {
 				cpu_relax();
-				preempt_disable();
-				preempt_enable();
+				preempt_check_resched();
 			}
 			break;
 		}
-		base++;
 	}
+	if (timer_pending(timer))
+		goto del_again;
+
 	return ret;
 }
 #endif
 
-static void cascade(tvec_base_t *base, tvec_t *tv)
+static int cascade(tvec_base_t *base, tvec_t *tv)
 {
 	/* cascade all the timers from tv up one level */
 	struct list_head *head, *curr, *next;
@@ -310,7 +312,8 @@
 		curr = next;
 	}
 	INIT_LIST_HEAD(head);
-	tv->index = (tv->index + 1) & TVN_MASK;
+
+	return tv->index = (tv->index + 1) & TVN_MASK;
 }
 
 /***
@@ -322,26 +325,18 @@
  */
 static inline void __run_timers(tvec_base_t *base)
 {
-	unsigned long flags;
-
-	spin_lock_irqsave(&base->lock, flags);
+	spin_lock_irq(&base->lock);
 	while ((long)(jiffies - base->timer_jiffies) >= 0) {
 		struct list_head *head, *curr;
 
 		/*
 		 * Cascade timers:
 		 */
-		if (!base->tv1.index) {
-			cascade(base, &base->tv2);
-			if (base->tv2.index == 1) {
-				cascade(base, &base->tv3);
-				if (base->tv3.index == 1) {
-					cascade(base, &base->tv4);
-					if (base->tv4.index == 1)
-						cascade(base, &base->tv5);
-				}
-			}
-		}
+		if (!base->tv1.index &&
+			(cascade(base, &base->tv2) == 1) &&
+				(cascade(base, &base->tv3) == 1) &&
+					cascade(base, &base->tv4) == 1)
+			cascade(base, &base->tv5);
 repeat:
 		head = base->tv1.vec + base->tv1.index;
 		curr = head->next;
@@ -370,7 +365,7 @@
 #if CONFIG_SMP
 	base->running_timer = NULL;
 #endif
-	spin_unlock_irqrestore(&base->lock, flags);
+	spin_unlock_irq(&base->lock);
 }
 
 /******************************************************************/

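The patch above folds the nested cascade ifs into a single short-circuit && chain by making cascade() return the new index. The following user-space sketch checks that the two shapes behave the same. `struct wheel`, `cascade_model()`, `tick_nested()` and `tick_chained()` are hypothetical names, the model tracks only the tv2..tv5 indices rather than any timer lists, and `TVN_MASK` is given an illustrative value of 63.

```c
#define TVN_MASK 63	/* illustrative value for this model only */

/* Hypothetical model: tv[0..3] stand for the tv2..tv5 indices. */
struct wheel {
	unsigned int tv[4];
	unsigned long cascades;	/* total number of cascade calls */
};

/* Advance one level and, like the patched cascade(), return its new index. */
static unsigned int cascade_model(struct wheel *w, int level)
{
	w->cascades++;
	w->tv[level] = (w->tv[level] + 1) & TVN_MASK;
	return w->tv[level];
}

/* Pre-patch shape, entered when tv1.index == 0. */
static void tick_nested(struct wheel *w)
{
	cascade_model(w, 0);
	if (w->tv[0] == 1) {
		cascade_model(w, 1);
		if (w->tv[1] == 1) {
			cascade_model(w, 2);
			if (w->tv[2] == 1)
				cascade_model(w, 3);
		}
	}
}

/* Post-patch shape: && stops at the first level that did not wrap to 1. */
static void tick_chained(struct wheel *w)
{
	if (cascade_model(w, 0) == 1 &&
	    cascade_model(w, 1) == 1 &&
	    cascade_model(w, 2) == 1)
		cascade_model(w, 3);
}
```

Because && evaluates left to right and stops at the first false operand, each deeper level is cascaded only when the level before it has just moved to index 1. That is the same condition the nested form tests, so the two functions leave identical state after any number of ticks.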
* Re: 2.5.40-mm2
From: Dave Hansen @ 2002-10-08 16:23 UTC
To: Ingo Molnar
Cc: Andrew Morton, lkml, linux-mm

Ingo Molnar wrote:
> On Sun, 6 Oct 2002, Dave Hansen wrote:
>
>>cc'ing Ingo, because I think this might be related to the timer bh
>>removal.
>
> could you try the attached patch against 2.5.41, does it help? It fixes
> the bugs found so far plus makes del_timer_sync() a bit more robust by
> re-checking timer pending-ness before exiting. There is one type of code
> that might have relied on this kind of behavior of the old timer code.

Well, I gave it a shot.  I haven't seen any more of the __run_timers
oopses yet, but I haven't been able to stay up for very long before it
freezes, so time will tell.  But, for now, I'm pretty sure that it is
quite a bit better.

--
Dave Hansen
haveblue@us.ibm.com

* Re: 2.5.40-mm2
From: Dave Hansen @ 2002-10-08 16:43 UTC
To: Ingo Molnar
Cc: Andrew Morton, lkml, linux-mm

Ingo Molnar wrote:
> On Sun, 6 Oct 2002, Dave Hansen wrote:
>
>>cc'ing Ingo, because I think this might be related to the timer bh
>>removal.
>
> could you try the attached patch against 2.5.41, does it help? It fixes
> the bugs found so far plus makes del_timer_sync() a bit more robust by
> re-checking timer pending-ness before exiting. There is one type of code
> that might have relied on this kind of behavior of the old timer code.

Hehe.  That'll teach me to be optimistic.  This is unprocessed, but
the EIP in tvec_bases should tell the whole story.  Something _nasty_
is going on.

addr2line on the run_timer_tasklet call: kernel/timer.c:359

This is with the patch that Ingo sent me about 6 hours ago.  Andrew,
should I still test the one that you sent me this morning?

CPU:    7
EIP:    0060:[<80382bd2>]    Not tainted
EFLAGS: 00010a02
EIP is at tvec_bases+0x7152/0x20400
eax: e4a2d9a0   ebx: 80382bd0   ecx: 80382bd8   edx: 80382fd0
esi: 80382bc8   edi: 80382b60   ebp: 00000001   esp: f4d71db0
ds: 0068   es: 0068   ss: 0068
Process httpd (pid: 2554, threadinfo=f4d70000 task=f4d727c0)
Stack: 8012038b 80382bd8 8b093288 00000000 f4d70000 8011d1e5 00000000 00000001
       8037b960 fffffffa 000000e0 80360264 80360264 8011ceea 8037b960 f4d70000
       00000001 00000001 e4bf0930 00000246 80257ed0 e4bf07c0 f4d71eec f4d71eb8
Call Trace:
 [<8012038b>] run_timer_tasklet+0xcf/0x118
 [<8011d1e5>] tasklet_hi_action+0x85/0xe0
 [<8011ceea>] do_softirq+0x5a/0xac
 [<80257ed0>] tcp_sendmsg+0x10b8/0x11f4
 [<80273726>] inet_sendmsg+0x42/0x48
 [<8023bf0e>] sock_sendmsg+0x72/0x94
 [<8023c220>] sock_readv_writev+0x94/0xa0
 [<8023c29b>] sock_writev+0x37/0x40
 [<8013f94a>] do_readv_writev+0x186/0x278
 [<8023c07c>] sock_write+0x0/0xb0
 [<8013f4d7>] vfs_read+0xb7/0x128
 [<8013fb02>] sys_writev+0x5a/0x6c
 [<801070b3>] syscall_call+0x7/0xb

Code: 38 80 d0 2b 38 80 d8 2b 38 80 00 00 00 00 e0 2b 38 80 e0 2b
 <0>Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing

--
Dave Hansen
haveblue@us.ibm.com

* Re: 2.5.40-mm2
From: Andrew Morton @ 2002-10-08 16:56 UTC
To: Dave Hansen
Cc: Ingo Molnar, lkml, linux-mm

Dave Hansen wrote:
>
> Ingo Molnar wrote:
> > On Sun, 6 Oct 2002, Dave Hansen wrote:
> >
> >>cc'ing Ingo, because I think this might be related to the timer bh
> >>removal.
> >
> > could you try the attached patch against 2.5.41, does it help? It fixes
> > the bugs found so far plus makes del_timer_sync() a bit more robust by
> > re-checking timer pending-ness before exiting. There is one type of code
> > that might have relied on this kind of behavior of the old timer code.
>
> Hehe.  That'll teach me to be optimistic.  This is unprocessed, but
> the EIP in tvec_bases should tell the whole story.  Something _nasty_
> is going on.
>
> addr2line on the run_timer_tasklet call: kernel/timer.c:359
> This is with the patch that Ingo sent me about 6 hours ago.  Andrew,
> should I still test the one that you sent me this morning?

No; I think Ingo covered everything there, and more.

> Dave Hansen
> haveblue@us.ibm.com

* Re: 2.5.40-mm2
From: Ingo Molnar @ 2002-10-09 8:12 UTC
To: Dave Hansen
Cc: Andrew Morton, lkml, linux-mm

On Tue, 8 Oct 2002, Dave Hansen wrote:

> Hehe.  That'll teach me to be optimistic.  This is unprocessed, but the
> EIP in tvec_bases should tell the whole story.  Something _nasty_ is
> going on.

could you try BK-curr with/without my latest patch? Linus and Vojtech
found and fixed a bug in the keyboard code that caused timer tasklet
oopses.

if it still keeps crashing then please add a printk like this:

	if (!fn)
		printk("Bad: NULL timer fn of timer %p (data %p).\n", timer, data);
	else
		fn(data);

it's the fn's NULL-ness that causes the crashes, right?

	Ingo

--- linux/kernel/timer.c.orig	2002-10-08 12:39:46.000000000 +0200
+++ linux/kernel/timer.c	2002-10-08 12:49:50.000000000 +0200
@@ -266,29 +266,31 @@
 int del_timer_sync(timer_t *timer)
 {
 	tvec_base_t *base = tvec_bases;
-	int i, ret;
+	int i, ret = 0;
 
-	ret = del_timer(timer);
+del_again:
+	ret += del_timer(timer);
 
-	for (i = 0; i < NR_CPUS; i++) {
+	for (i = 0; i < NR_CPUS; i++, base++) {
 		if (!cpu_online(i))
 			continue;
 		if (base->running_timer == timer) {
 			while (base->running_timer == timer) {
 				cpu_relax();
-				preempt_disable();
-				preempt_enable();
+				preempt_check_resched();
 			}
 			break;
 		}
-		base++;
 	}
+	if (timer_pending(timer))
+		goto del_again;
+
 	return ret;
 }
 #endif
 
-static void cascade(tvec_base_t *base, tvec_t *tv)
+static int cascade(tvec_base_t *base, tvec_t *tv)
 {
 	/* cascade all the timers from tv up one level */
 	struct list_head *head, *curr, *next;
@@ -310,7 +312,8 @@
 		curr = next;
 	}
 	INIT_LIST_HEAD(head);
-	tv->index = (tv->index + 1) & TVN_MASK;
+
+	return tv->index = (tv->index + 1) & TVN_MASK;
 }
 
 /***
@@ -322,26 +325,18 @@
  */
 static inline void __run_timers(tvec_base_t *base)
 {
-	unsigned long flags;
-
-	spin_lock_irqsave(&base->lock, flags);
+	spin_lock_irq(&base->lock);
 	while ((long)(jiffies - base->timer_jiffies) >= 0) {
 		struct list_head *head, *curr;
 
 		/*
 		 * Cascade timers:
 		 */
-		if (!base->tv1.index) {
-			cascade(base, &base->tv2);
-			if (base->tv2.index == 1) {
-				cascade(base, &base->tv3);
-				if (base->tv3.index == 1) {
-					cascade(base, &base->tv4);
-					if (base->tv4.index == 1)
-						cascade(base, &base->tv5);
-				}
-			}
-		}
+		if (!base->tv1.index &&
+			(cascade(base, &base->tv2) == 1) &&
+				(cascade(base, &base->tv3) == 1) &&
+					cascade(base, &base->tv4) == 1)
+			cascade(base, &base->tv5);
 repeat:
 		head = base->tv1.vec + base->tv1.index;
 		curr = head->next;
@@ -370,7 +365,7 @@
 #if CONFIG_SMP
 	base->running_timer = NULL;
 #endif
-	spin_unlock_irqrestore(&base->lock, flags);
+	spin_unlock_irq(&base->lock);
 }
 
 /******************************************************************/

* Re: 2.5.40-mm2
From: Badari Pulavarty @ 2002-10-07 17:45 UTC
To: Andrew Morton
Cc: lkml, linux-mm

Andrew,

I get the following compile errors while using 2.5.40-mm2.
Missing some exports?

- Badari

ld -m elf_i386 -e stext -T arch/i386/vmlinux.lds.s arch/i386/kernel/head.o arch/i386/kernel/init_task.o init/built-in.o --start-group arch/i386/kernel/built-in.o arch/i386/mm/built-in.o arch/i386/mach-generic/built-in.o kernel/built-in.o mm/built-in.o fs/built-in.o ipc/built-in.o security/built-in.o lib/lib.a arch/i386/lib/lib.a drivers/built-in.o sound/built-in.o arch/i386/pci/built-in.o net/built-in.o --end-group -o .tmp_vmlinux
drivers/built-in.o: In function `aic7xxx_biosparam':
drivers/built-in.o(.text+0xcfc71): undefined reference to `__udivdi3'
drivers/built-in.o(.text+0xcfca8): undefined reference to `__udivdi3'
drivers/built-in.o: In function `qla1280_proc_info':
drivers/built-in.o(.text+0xd0ca0): undefined reference to `get_free_page'
drivers/built-in.o: In function `qla1280_biosparam':
drivers/built-in.o(.text+0xd1daa): undefined reference to `__udivdi3'
drivers/built-in.o(.text+0xd1dce): undefined reference to `__udivdi3'
make: *** [.tmp_vmlinux] Error 1

* Re: 2.5.40-mm2
From: Jens Axboe @ 2002-10-07 17:55 UTC
To: Badari Pulavarty
Cc: Andrew Morton, lkml, linux-mm

On Mon, Oct 07 2002, Badari Pulavarty wrote:
> Andrew,
>
> I get following compile errors while using 2.5.40-mm2.
> Missing some exports ?
>
> - Badari
>
> ld -m elf_i386 -e stext -T arch/i386/vmlinux.lds.s arch/i386/kernel/head.o arch/i386/kernel/init_task.o init/built-in.o --start-group arch/i386/kernel/built-in.o arch/i386/mm/built-in.o arch/i386/mach-generic/built-in.o kernel/built-in.o mm/built-in.o fs/built-in.o ipc/built-in.o security/built-in.o lib/lib.a arch/i386/lib/lib.a drivers/built-in.o sound/built-in.o arch/i386/pci/built-in.o net/built-in.o --end-group -o .tmp_vmlinux
> drivers/built-in.o: In function `aic7xxx_biosparam':
> drivers/built-in.o(.text+0xcfc71): undefined reference to `__udivdi3'
> drivers/built-in.o(.text+0xcfca8): undefined reference to `__udivdi3'
> drivers/built-in.o: In function `qla1280_proc_info':
> drivers/built-in.o(.text+0xd0ca0): undefined reference to `get_free_page'
> drivers/built-in.o: In function `qla1280_biosparam':
> drivers/built-in.o(.text+0xd1daa): undefined reference to `__udivdi3'
> drivers/built-in.o(.text+0xd1dce): undefined reference to `__udivdi3'
> make: *** [.tmp_vmlinux] Error 1

someone is doing divisions on 64-bit ints, at least that's the
__udivdi3.

--
Jens Axboe

* Re: 2.5.40-mm2
From: Andrew Morton @ 2002-10-07 18:23 UTC
To: Badari Pulavarty
Cc: lkml, linux-mm

Badari Pulavarty wrote:
>
> ...
> drivers/built-in.o: In function `aic7xxx_biosparam':
> drivers/built-in.o(.text+0xcfc71): undefined reference to `__udivdi3'
> drivers/built-in.o(.text+0xcfca8): undefined reference to `__udivdi3'
> drivers/built-in.o: In function `qla1280_proc_info':
> drivers/built-in.o(.text+0xd0ca0): undefined reference to `get_free_page'
> drivers/built-in.o: In function `qla1280_biosparam':
> drivers/built-in.o(.text+0xd1daa): undefined reference to `__udivdi3'
> drivers/built-in.o(.text+0xd1dce): undefined reference to `__udivdi3'
> make: *** [.tmp_vmlinux] Error 1

For the __udivdi3 thing, the below patch should fix that up.

For the get_free_page thing I need a grep-for-dummies book.  Please
just go into qla1280_proc_info() and replace get_free_page() with
get_zeroed_page().   I need to do a second round on that patch.

--- 2.5.40/drivers/scsi/aic7xxx_old.c~lbd-fixes-1	Mon Oct  7 11:18:28 2002
+++ 2.5.40-akpm/drivers/scsi/aic7xxx_old.c	Mon Oct  7 11:19:18 2002
@@ -11735,13 +11735,13 @@ aic7xxx_biosparam(Disk *disk, struct blo
 
   heads = 64;
   sectors = 32;
-  cylinders = disk->capacity / (heads * sectors);
+  cylinders = sector_div(disk->capacity, heads * sectors);
 
   if ((p->flags & AHC_EXTEND_TRANS_A) && (cylinders > 1024))
   {
     heads = 255;
     sectors = 63;
-    cylinders = disk->capacity / (heads * sectors);
+    cylinders = sector_div(disk->capacity, heads * sectors);
   }
 
   geom[0] = heads;
--- 2.5.40/drivers/scsi/qla1280.c~lbd-fixes-1	Mon Oct  7 11:19:42 2002
+++ 2.5.40-akpm/drivers/scsi/qla1280.c	Mon Oct  7 11:20:06 2002
@@ -1705,11 +1705,11 @@ qla1280_biosparam(Disk * disk, struct bl
 
 	heads = 64;
 	sectors = 32;
-	cylinders = disk->capacity / (heads * sectors);
+	cylinders = sector_div(disk->capacity, heads * sectors);
 	if (cylinders > 1024) {
 		heads = 255;
 		sectors = 63;
-		cylinders = disk->capacity / (heads * sectors);
+		cylinders = sector_div(disk->capacity, heads * sectors);
 		/* if (cylinders > 1023)
 		   cylinders = 1023; */
 	}

.

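Some background on the link error: on 32-bit x86, a plain `/` applied to a 64-bit value makes gcc emit a call to the libgcc helper `__udivdi3`, which the kernel does not link against, so the kernel uses helper macros instead. The sketch below is a user-space model of how such a helper behaves. It assumes that sector_div() follows the do_div() convention, as it does in later mainline kernels: the dividend is replaced by the quotient and the remainder is returned. `sector_div_model()` and `cylinders_for()` are hypothetical names. If the same convention holds for the sector_div() in the lbd patches, then assigning its return value to `cylinders`, as the diff above does, would store the remainder and also overwrite `disk->capacity`, so that usage is worth double-checking.

```c
#include <stdint.h>

/*
 * Model of the do_div()-style convention assumed here: *n is divided
 * in place and the remainder is returned.  In user space the 64-bit
 * divide is fine because libgcc is available.
 */
static uint32_t sector_div_model(uint64_t *n, uint32_t base)
{
	uint32_t rem = (uint32_t)(*n % base);

	*n /= base;
	return rem;
}

/*
 * Compute the cylinder count on a local copy, so that the caller's
 * capacity value is left untouched and the quotient, not the
 * remainder, is what gets returned.
 */
static uint64_t cylinders_for(uint64_t capacity, uint32_t heads,
			      uint32_t sectors)
{
	uint64_t tmp = capacity;

	(void)sector_div_model(&tmp, heads * sectors);
	return tmp;
}
```

For a 2 GiB disk (4194304 sectors of 512 bytes) with 64 heads and 32 sectors per track, the copy-then-divide form yields 2048 cylinders, while the value returned by the helper itself is the remainder, 0 in that case.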
end of thread

Thread overview: 17+ messages

2002-10-06 18:47 2.5.40-mm2 Andrew Morton
2002-10-06 20:47 ` 2.5.40-mm2 Dave Hansen
2002-10-06 21:55   ` 2.5.40-mm2 Andrew Morton
2002-10-06 22:07     ` 2.5.40-mm2 Andrew Morton
2002-10-06 22:11       ` 2.5.40-mm2 Andrew Morton
2002-10-07  5:46         ` 2.5.40-mm2 Dave Hansen
2002-10-06 22:23       ` 2.5.40-mm2 Robert Love
2002-10-06 22:33         ` 2.5.40-mm2 Andrew Morton
2002-10-06 22:38           ` 2.5.40-mm2 Robert Love
2002-10-08 11:05   ` 2.5.40-mm2 Ingo Molnar
2002-10-08 16:23     ` 2.5.40-mm2 Dave Hansen
2002-10-08 16:43     ` 2.5.40-mm2 Dave Hansen
2002-10-08 16:56       ` 2.5.40-mm2 Andrew Morton
2002-10-09  8:12       ` 2.5.40-mm2 Ingo Molnar
2002-10-07 17:45 ` 2.5.40-mm2 Badari Pulavarty
2002-10-07 17:55   ` 2.5.40-mm2 Jens Axboe
2002-10-07 18:23   ` 2.5.40-mm2 Andrew Morton