* [BUG] 2.4 VM sucks. Again
From: Roy Sigurd Karlsbakk
Date: 2002-05-23 13:11
To: linux-kernel

hi all

I've been here complaining about the 2.4 VM before, and here I am, back again.

PROBLEM:
----------------------
Starting up 30 downloads from a custom HTTP server (or Tux, or Apache; the server doesn't matter), file size 3-6 GB, download speed ~4.5 Mbps per client. After some time the kernel either (a) goes boom (out of memory) if it has no swap, or (b) starts swapping out anything it can.

The custom HTTP server processes each have a static buffer of two megabytes, no malloc()s, and are written in fewer than 1000 lines of C.

Theory: the buffer fills up, as the clients can't read as fast as the kernel is reading from disk, and the server goes boom.

thanks for any help

roy
--
Roy Sigurd Karlsbakk, Datavaktmester
Computers are like air conditioners. They stop working when you open Windows.
* Re: [BUG] 2.4 VM sucks. Again
From: Martin J. Bligh
Date: 2002-05-23 14:54
To: Roy Sigurd Karlsbakk, linux-kernel

> PROBLEM:
> ----------------------
> Starting up 30 downloads from a custom HTTP server (or Tux - or Apache -
> doesn't matter), file size is 3-6GB, download speed = ~4.5Mbps. After some
> time the kernel (a) goes bOOM (out of memory) if not having any swap, or (b)
> goes gong swapping out anything it can.

How much RAM do you have, and what does /proc/meminfo and /proc/slabinfo say just before the explosion point?

M.
* Re: [BUG] 2.4 VM sucks. Again
From: Roy Sigurd Karlsbakk
Date: 2002-05-23 16:29
To: Martin J. Bligh, linux-kernel

> > Starting up 30 downloads from a custom HTTP server (or Tux - or Apache -
> > doesn't matter), file size is 3-6GB, download speed = ~4.5Mbps. After
> > some time the kernel (a) goes bOOM (out of memory) if not having any
> > swap, or (b) goes gong swapping out anything it can.
>
> How much RAM do you have, and what does /proc/meminfo
> and /proc/slabinfo say just before the explosion point?

I have 1 GB; with highmem not enabled, about 900 MB is usable. From what I can see, the kernel can't reclaim buffers fast enough. It looks better on -aa.

roy
* Re: [BUG] 2.4 VM sucks. Again
From: Martin J. Bligh
Date: 2002-05-23 16:46
To: Roy Sigurd Karlsbakk, linux-kernel

> I have 1 gig - highmem (not enabled) - 900 megs.
> for what I can see, kernel can't reclaim buffers fast enough.
> ut looks better on -aa.

Sounds like exactly the same problem we were having. There are two approaches to solving this: Andrea has a patch that tries to free the buffer_heads under memory pressure, and akpm has a patch that hacks them down as soon as you've finished with them (posted to the lse-tech mailing list). Both approaches seemed to work for me, but the performance of the fixes still has to be established.

I've seen over 1GB of buffer_heads ;-)

M.
* Re: [BUG] 2.4 VM sucks. Again
From: Roy Sigurd Karlsbakk
Date: 2002-05-24 10:04
To: Martin J. Bligh, linux-kernel

> Sounds like exactly the same problem we were having. There are two
> approaches to solving this - Andrea has a patch that tries to free them
> under memory pressure, akpm has a patch that hacks them down as soon
> as you've finished with them (posted to lse-tech mailing list). Both
> approaches seemed to work for me, but the performance of the fixes still
> has to be established.

Where can I find the akpm patch? Any plans to merge this into the main kernel, giving a choice (in config or /proc) to enable it?

> I've seen over 1Gb of buffer_heads ;-)

roy
* Re: [BUG] 2.4 VM sucks. Again
From: Martin J. Bligh
Date: 2002-05-24 14:35
To: Roy Sigurd Karlsbakk, linux-kernel

> Where can I find the akpm patch?

http://marc.theaimsgroup.com/?l=lse-tech&m=102083525007877&w=2

> Any plans to merge this into the main kernel, giving a choice
> (in config or /proc) to enable this?

I don't think Andrew is ready to submit this yet. Before anything gets merged back, it'd be very worthwhile testing the relative performance of both solutions. The more testers we have the better ;-)

Thanks,

M.
* Re: [BUG] 2.4 VM sucks. Again
From: Andrew Morton
Date: 2002-05-24 19:32
To: Martin J. Bligh; Cc: Roy Sigurd Karlsbakk, linux-kernel

"Martin J. Bligh" wrote:
> I don't think Andrew is ready to submit this yet ... before anything
> gets merged back, it'd be very worthwhile testing the relative
> performance of both solutions ... the more testers we have the
> better ;-)

Cripes no. It's pretty experimental. Andrea spotted a bug, too. The fixed version is below.

It's possible that keeping the number of buffers as low as possible will give improved performance over Andrea's approach, because it leaves more ZONE_NORMAL for other things. It's also possible that it'll give worse performance, because more get_block()s need to be done for file overwriting.
--- 2.4.19-pre8/include/linux/pagemap.h~nuke-buffers	Fri May 24 12:24:56 2002
+++ 2.4.19-pre8-akpm/include/linux/pagemap.h	Fri May 24 12:26:30 2002
@@ -89,13 +89,7 @@ extern void add_to_page_cache(struct pag
 extern void add_to_page_cache_locked(struct page * page, struct address_space *mapping, unsigned long index);
 extern int add_to_page_cache_unique(struct page * page, struct address_space *mapping, unsigned long index, struct page **hash);
 
-extern void ___wait_on_page(struct page *);
-
-static inline void wait_on_page(struct page * page)
-{
-	if (PageLocked(page))
-		___wait_on_page(page);
-}
+extern void wait_on_page(struct page *);
 
 extern struct page * grab_cache_page (struct address_space *, unsigned long);
 extern struct page * grab_cache_page_nowait (struct address_space *, unsigned long);
--- 2.4.19-pre8/mm/filemap.c~nuke-buffers	Fri May 24 12:24:56 2002
+++ 2.4.19-pre8-akpm/mm/filemap.c	Fri May 24 12:24:56 2002
@@ -608,7 +608,7 @@ int filemap_fdatawait(struct address_spa
 		page_cache_get(page);
 		spin_unlock(&pagecache_lock);
 
-		___wait_on_page(page);
+		wait_on_page(page);
 		if (PageError(page))
 			ret = -EIO;
 
@@ -805,33 +805,29 @@ static inline wait_queue_head_t *page_wa
 	return &wait[hash];
 }
 
-/*
- * Wait for a page to get unlocked.
+static void kill_buffers(struct page *page)
+{
+	if (!PageLocked(page))
+		BUG();
+	if (page->buffers)
+		try_to_release_page(page, GFP_NOIO);
+}
+
+/*
+ * Wait for a page to come unlocked.  Then try to ditch its buffer_heads.
  *
- * This must be called with the caller "holding" the page,
- * ie with increased "page->count" so that the page won't
- * go away during the wait..
+ * FIXME: Make the ditching dependent on CONFIG_MONSTER_BOX or something.
  */
-void ___wait_on_page(struct page *page)
+void wait_on_page(struct page *page)
 {
-	wait_queue_head_t *waitqueue = page_waitqueue(page);
-	struct task_struct *tsk = current;
-	DECLARE_WAITQUEUE(wait, tsk);
-
-	add_wait_queue(waitqueue, &wait);
-	do {
-		set_task_state(tsk, TASK_UNINTERRUPTIBLE);
-		if (!PageLocked(page))
-			break;
-		sync_page(page);
-		schedule();
-	} while (PageLocked(page));
-	__set_task_state(tsk, TASK_RUNNING);
-	remove_wait_queue(waitqueue, &wait);
+	lock_page(page);
+	kill_buffers(page);
+	unlock_page(page);
 }
+EXPORT_SYMBOL(wait_on_page);
 
 /*
- * Unlock the page and wake up sleepers in ___wait_on_page.
+ * Unlock the page and wake up sleepers in lock_page.
  */
 void unlock_page(struct page *page)
 {
@@ -1400,6 +1396,11 @@ found_page:
 
 	if (!Page_Uptodate(page))
 		goto page_not_up_to_date;
+	if (page->buffers) {
+		lock_page(page);
+		kill_buffers(page);
+		unlock_page(page);
+	}
 	generic_file_readahead(reada_ok, filp, inode, page);
 page_ok:
 	/* If users can be writing to this page using arbitrary
@@ -1457,6 +1458,7 @@ page_not_up_to_date:
 
 	/* Did somebody else fill it already? */
 	if (Page_Uptodate(page)) {
+		kill_buffers(page);
 		UnlockPage(page);
 		goto page_ok;
 	}
@@ -1948,6 +1950,11 @@ retry_find:
 	 */
 	if (!Page_Uptodate(page))
 		goto page_not_uptodate;
+	if (page->buffers) {
+		lock_page(page);
+		kill_buffers(page);
+		unlock_page(page);
+	}
 
 success:
 	/*
@@ -2006,6 +2013,7 @@ page_not_uptodate:
 
 	/* Did somebody else get it up-to-date? */
 	if (Page_Uptodate(page)) {
+		kill_buffers(page);
 		UnlockPage(page);
 		goto success;
 	}
@@ -2033,6 +2041,7 @@ page_not_uptodate:
 
 	/* Somebody else successfully read it in?
 	 */
 	if (Page_Uptodate(page)) {
+		kill_buffers(page);
 		UnlockPage(page);
 		goto success;
 	}
@@ -2850,6 +2859,7 @@ retry:
 		goto retry;
 	}
 	if (Page_Uptodate(page)) {
+		kill_buffers(page);
 		UnlockPage(page);
 		goto out;
 	}
--- 2.4.19-pre8/kernel/ksyms.c~nuke-buffers	Fri May 24 12:24:56 2002
+++ 2.4.19-pre8-akpm/kernel/ksyms.c	Fri May 24 12:24:56 2002
@@ -202,7 +202,6 @@ EXPORT_SYMBOL(ll_rw_block);
 EXPORT_SYMBOL(submit_bh);
 EXPORT_SYMBOL(unlock_buffer);
 EXPORT_SYMBOL(__wait_on_buffer);
-EXPORT_SYMBOL(___wait_on_page);
 EXPORT_SYMBOL(generic_direct_IO);
 EXPORT_SYMBOL(discard_bh_page);
 EXPORT_SYMBOL(block_write_full_page);
--- 2.4.19-pre8/mm/vmscan.c~nuke-buffers	Fri May 24 12:24:56 2002
+++ 2.4.19-pre8-akpm/mm/vmscan.c	Fri May 24 12:24:56 2002
@@ -365,8 +365,13 @@ static int shrink_cache(int nr_pages, zo
 		if (unlikely(!page_count(page)))
 			continue;
 
-		if (!memclass(page_zone(page), classzone))
+		if (!memclass(page_zone(page), classzone)) {
+			if (page->buffers && !TryLockPage(page)) {
+				try_to_release_page(page, GFP_NOIO);
+				unlock_page(page);
+			}
 			continue;
+		}
 
 		/* Racy check to avoid trylocking when not worthwhile */
 		if (!page->buffers && (page_count(page) != 1 || !page->mapping))
@@ -562,6 +567,11 @@ static int shrink_caches(zone_t * classz
 	nr_pages -= kmem_cache_reap(gfp_mask);
 	if (nr_pages <= 0)
 		return 0;
+	if ((gfp_mask & __GFP_WAIT) && (shrink_buffer_cache() > 16)) {
+		nr_pages -= kmem_cache_reap(gfp_mask);
+		if (nr_pages <= 0)
+			return 0;
+	}
 
 	nr_pages = chunk_size;
 	/* try to keep the active list 2/3 of the size of the cache */
--- 2.4.19-pre8/fs/buffer.c~nuke-buffers	Fri May 24 12:24:56 2002
+++ 2.4.19-pre8-akpm/fs/buffer.c	Fri May 24 12:26:28 2002
@@ -1500,6 +1500,10 @@ static int __block_write_full_page(struc
 	/* Stage 3: submit the IO */
 	do {
 		struct buffer_head *next = bh->b_this_page;
+		/*
+		 * Stick it on BUF_LOCKED so shrink_buffer_cache() can nail it.
+		 */
+		refile_buffer(bh);
 		submit_bh(WRITE, bh);
 		bh = next;
 	} while (bh != head);
@@ -2615,6 +2619,25 @@ static int sync_page_buffers(struct buff
 int try_to_free_buffers(struct page * page, unsigned int gfp_mask)
 {
 	struct buffer_head * tmp, * bh = page->buffers;
+	int was_uptodate = 1;
+
+	if (!PageLocked(page))
+		BUG();
+
+	if (!bh)
+		return 1;
+	/*
+	 * Quick check for freeable buffers before we go take three
+	 * global locks.
+	 */
+	if (!(gfp_mask & __GFP_IO)) {
+		tmp = bh;
+		do {
+			if (buffer_busy(tmp))
+				return 0;
+			tmp = tmp->b_this_page;
+		} while (tmp != bh);
+	}
 
 cleaned_buffers_try_again:
 	spin_lock(&lru_list_lock);
@@ -2637,7 +2660,8 @@ cleaned_buffers_try_again:
 		tmp = tmp->b_this_page;
 		if (p->b_dev == B_FREE)
 			BUG();
-
+		if (!buffer_uptodate(p))
+			was_uptodate = 0;
 		remove_inode_queue(p);
 		__remove_from_queues(p);
 		__put_unused_buffer_head(p);
@@ -2645,7 +2669,15 @@ cleaned_buffers_try_again:
 	spin_unlock(&unused_list_lock);
 
 	/* Wake up anyone waiting for buffer heads */
-	wake_up(&buffer_wait);
+	smp_mb();
+	if (waitqueue_active(&buffer_wait))
+		wake_up(&buffer_wait);
+
+	/*
+	 * Make sure we don't read buffers again when they are reattached
+	 */
+	if (was_uptodate)
+		SetPageUptodate(page);
 
 	/* And free the page */
 	page->buffers = NULL;
@@ -2674,6 +2706,62 @@ busy_buffer_page:
 }
 EXPORT_SYMBOL(try_to_free_buffers);
 
+/*
+ * Returns the number of pages which might have become freeable
+ */
+int shrink_buffer_cache(void)
+{
+	struct buffer_head *bh;
+	int nr_todo;
+	int nr_shrunk = 0;
+
+	/*
+	 * Move any clean unlocked buffers from BUF_LOCKED onto BUF_CLEAN
+	 */
+	spin_lock(&lru_list_lock);
+	for ( ; ; ) {
+		bh = lru_list[BUF_LOCKED];
+		if (!bh || buffer_locked(bh))
+			break;
+		__refile_buffer(bh);
+	}
+
+	/*
+	 * Now start liberating buffers
+	 */
+	nr_todo = nr_buffers_type[BUF_CLEAN];
+	while (nr_todo--) {
+		struct page *page;
+
+		bh = lru_list[BUF_CLEAN];
+		if (!bh)
+			break;
+
+		/*
+		 * Park the buffer on BUF_LOCKED so we don't revisit it on
+		 * this pass.
+		 */
+		__remove_from_lru_list(bh);
+		bh->b_list = BUF_LOCKED;
+		__insert_into_lru_list(bh, BUF_LOCKED);
+		page = bh->b_page;
+		if (TryLockPage(page))
+			continue;
+
+		page_cache_get(page);
+		spin_unlock(&lru_list_lock);
+		if (try_to_release_page(page, GFP_NOIO))
+			nr_shrunk++;
+		unlock_page(page);
+		page_cache_release(page);
+		spin_lock(&lru_list_lock);
+	}
+	spin_unlock(&lru_list_lock);
+//	printk("%s: liberated %d page's worth of buffer_heads\n",
+//		__FUNCTION__, nr_shrunk);
+	return (nr_shrunk * sizeof(struct buffer_head)) / PAGE_CACHE_SIZE;
+}
+
 /* ================== Debugging =================== */
 
 void show_buffers(void)
@@ -2988,6 +3076,7 @@ int kupdate(void *startup)
 #ifdef DEBUG
 		printk(KERN_DEBUG "kupdate() activated...\n");
 #endif
+		shrink_buffer_cache();
 		sync_old_buffers();
 		run_task_queue(&tq_disk);
 	}
--- 2.4.19-pre8/include/linux/fs.h~nuke-buffers	Fri May 24 12:24:56 2002
+++ 2.4.19-pre8-akpm/include/linux/fs.h	Fri May 24 12:24:56 2002
@@ -1116,6 +1116,7 @@ extern int FASTCALL(try_to_free_buffers(
 extern void refile_buffer(struct buffer_head * buf);
 extern void create_empty_buffers(struct page *, kdev_t, unsigned long);
 extern void end_buffer_io_sync(struct buffer_head *bh, int uptodate);
+extern int shrink_buffer_cache(void);
 /* reiserfs_writepage needs this */
 extern void set_buffer_async_io(struct buffer_head *bh) ;

-
* Re: [BUG] 2.4 VM sucks. Again
From: Roy Sigurd Karlsbakk
Date: 2002-05-30 10:29
To: Andrew Morton, Martin J. Bligh; Cc: linux-kernel

> Cripes no. It's pretty experimental. Andrea spotted a bug, too. Fixed
> version is below.

Works great! This should _definitely_ be merged into the main kernel after some testing. Without it, _all_ other kernels I've tested (2.4.lots) go OOM under the scenarios mentioned. This one simply does the job.

> It's possible that keeping the number of buffers as low as possible
> will give improved performance over Andrea's approach because it
> leaves more ZONE_NORMAL for other things. It's also possible that
> it'll give worse performance because more get_block's need to be
> done for file overwriting.

Andrea's patch merely pushed the problem forward. This one fixed it.

roy
* Re: [BUG] 2.4 VM sucks. Again
From: Andrew Morton
Date: 2002-05-30 19:28
To: Roy Sigurd Karlsbakk; Cc: Martin J. Bligh, linux-kernel

Roy Sigurd Karlsbakk wrote:
> Works great! This should _definetely_ be merged into the main kernel after
> som testing. Without it _all_ other kernels I've tested (2.4.lots) goes OOM
> under the mentioned scenarios. This one simply does the job.

I suspect nuke-buffers is simply always the right thing to do. It's what 2.5 is doing now (effectively). We'll see...

But in your case, you only have a couple of gigs of memory, iirc. You shouldn't be running into catastrophic buffer_head congestion. Something odd is happening.

If you can provide a really detailed set of steps which can be used by others to reproduce this, that would really help.

-
* Re: [BUG] 2.4 VM sucks. Again
From: Roy Sigurd Karlsbakk
Date: 2002-05-31 16:56
To: Andrew Morton; Cc: Martin J. Bligh, linux-kernel

> If you can provide a really detailed set of steps which can be
> used by others to reproduce this, that would really help.

What I do: start lots (10-50) of downloads from another client, each at a speed of 4.5 Mbps. The two machines are connected with gigabit Ethernet. The downloads are over HTTP, with Tux or other servers (I have tried several). If the clients read at full speed (e.g. only a few clients, or reading directly from localhost), the problem doesn't occur. However, when they read at a fixed rate, it seems like the server caches itself to death.

Detailed configuration:

- 4 IBM 40 GB disks in RAID-0, chunk size 1 MB
- 1 x Athlon 1 GHz
- 1 GB RAM, no highmem (900 MB usable)
- kernel 2.4.19pre7 + patch from Andrew Morton to ditch buffers early (thread: [BUG] 2.4 VM sucks. Again)
- gigabit Ethernet between test client and server

Anyone got a clue?

roy
* Re: [BUG] 2.4 VM sucks. Again
From: Andrea Arcangeli
Date: 2002-05-31 18:19
To: Roy Sigurd Karlsbakk; Cc: Andrew Morton, Martin J. Bligh, linux-kernel

On Fri, May 31, 2002 at 06:56:54PM +0200, Roy Sigurd Karlsbakk wrote:
> What I do: start lots (10-50) downloads, each with a speed of 4,5Mbps from
> another client. [...]
>
> Anyone got a clue?

Can you try to reproduce this with 2.4.19pre9aa2, just in case it's an OOM deadlock? If it deadlocks again, can you press SysRq+T, and SysRq+P many times, and send that info along with the System.map? (You may need a serial console to gather the data easily, if not even SysRq+I lets the box resurrect from the livelock. The System.map should possibly not go to l-k, because it's quite big.)

thanks!

Andrea
* Re: [BUG] 2.4 VM sucks. Again
From: Roy Sigurd Karlsbakk
Date: 2002-06-18 11:26
To: Andrew Morton, Martin J. Bligh; Cc: linux-kernel

> Cripes no. It's pretty experimental. Andrea spotted a bug, too. Fixed
> version is below.

Any more plans? The patch has been working great for some time now, and I'd really like to see this in the official tree.

Also, I guess this patch will eliminate any caching whatsoever, and is therefore not really a good thing for file or web servers?

roy
* Re: [BUG] 2.4 VM sucks. Again
From: Andrew Morton
Date: 2002-06-18 19:42
To: Roy Sigurd Karlsbakk; Cc: Martin J. Bligh, linux-kernel

Roy Sigurd Karlsbakk wrote:
> Any more plans?
> The patch has been working great for some time now, and I'd really like
> to see this in the official tree.

Roy, all we know is that "nuke-buffers stops your machine from locking up". But we don't know why your machine locks up in the first place. This just isn't sufficient grounds to apply it! We need to know exactly why your kernel is failing. We don't know what the bug is.

You have two gigabytes of RAM, yes? It's very weird that stripping buffers prevents a lockup on a machine with such a small highmem/lowmem ratio.

I'll have yet another shot at reproducing it. So, again, could you please tell me *exactly*, in great detail, what I need to do to reproduce this problem?

- memory size
- number of CPUs
- IO system
- kernel version, any applied patches, compiler version
- exact sequence of commands
- anything else you can think of

Have you been able to reproduce the failure on any other machine?

> Also - I guess this patch will eliminate any caching whatsoever, and
> therefore not really a good thing for file or web servers?

No, not at all. All the pagecache is still there; the patch just throws away the buffer_heads which are attached to those pagecache pages.

The 2.5 kernel does it tons better. Have you tried it?

-
* Re: [BUG] 2.4 VM sucks. Again
From: Roy Sigurd Karlsbakk
Date: 2002-06-19 11:26
To: Andrew Morton; Cc: Martin J. Bligh, linux-kernel

> Roy, all we know is that "nuke-buffers stops your machine from locking up".
> But we don't know why your machine locks up in the first place. This just
> isn't sufficient grounds to apply it! We need to know exactly why your
> kernel is failing. We don't know what the bug is.

The bug, as previously described, occurs when multiple (20+) clients download large files (3-6 GB each) at a speed of ~5 Mbps. The error does _not_ occur when a smaller number of clients download at speeds close to disk speed. All testing is being done over a gigabit Ethernet crossover.

> You have two gigabytes of RAM, yes? It's very weird that stripping buffers
> prevents a lockup on a machine with such a small highmem/lowmem ratio.

No. I have 1 GB; with highmem disabled that gives me ~900 MB.

> - memory size

1 GB, no highmem

> - number of CPUs

1 Athlon 1133 MHz, 256 kB cache

> - IO system

Standard 33 MHz/32-bit single-peer PCI motherboard (SiS based):
- on-board SiS IDE/ATA 100 controller
- Promise 20269 controller
- Realtek 100 Mbps NIC
- e1000 gigabit NIC
- 4 IBM 40 GB 120GXP drives, one on each IDE channel
- data partition on RAID-0 across all drives

> - kernel version, any applied patches, compiler version

Kernel 2.4.19-pre8 + Tux + akpm buffer patch. I have tried _many_ different kernels, and as I needed the 20269 support, I chose 2.4.19-pre. Tux is there because I did some testing with it; the problem is _not_ Tux-specific, as I've tried other server software (custom and standard) as well. gcc 2.95.3.

> - exact sequence of commands

Start the HTTP server software, then start 20+ downloads; each downloaded file is 3-6 GB. After some time, most processes are killed OOM.

> - anything else you can think of

I have not tried to give it coffee yet, although that might help. I'm usually pretty pissed off if I haven't got my morning coffee.

> Have you been able to reproduce the failure on any other machine?

Yes. I have set up one other machine with the exact same setup, and one with a slightly different setup, and reproduced it on both.

> No, not at all. All the pagecache is still there - the patch just
> throws away the buffer_heads which are attached to those pagecache
> pages.

Oh, that's good.

> The 2.5 kernel does it tons better. Have you tried it?

I haven't. I've tried to compile it a few times, but it has failed. And I don't want to run 2.5 on a production server. But if you ask me to test it, I will.

thanks for all the help

roy
* [2.4 BUFFERING BUG] (was [BUG] 2.4 VM sucks. Again)
From: Roy Sigurd Karlsbakk
Date: 2002-07-10 7:50
To: Andrew Morton, Martin J. Bligh; Cc: linux-kernel

hi

I've been using the patch below from Andrew for some weeks now, sometimes under quite heavy load, and find it quite stable. Just wanted to say...

roy

> Cripes no. It's pretty experimental. Andrea spotted a bug, too. Fixed
> version is below.
>
> It's possible that keeping the number of buffers as low as possible
> will give improved performance over Andrea's approach because it
> leaves more ZONE_NORMAL for other things. It's also possible that
> it'll give worse performance because more get_block's need to be
> done for file overwriting.
> [Andrew's nuke-buffers patch quoted in full; snipped]
> + */ > + refile_buffer(bh); > submit_bh(WRITE, bh); > bh = next; > } while (bh != head); > @@ -2615,6 +2619,25 @@ static int sync_page_buffers(struct buff > int try_to_free_buffers(struct page * page, unsigned int gfp_mask) > { > struct buffer_head * tmp, * bh = page->buffers; > + int was_uptodate = 1; > + > + if (!PageLocked(page)) > + BUG(); > + > + if (!bh) > + return 1; > + /* > + * Quick check for freeable buffers before we go take three > + * global locks. > + */ > + if (!(gfp_mask & __GFP_IO)) { > + tmp = bh; > + do { > + if (buffer_busy(tmp)) > + return 0; > + tmp = tmp->b_this_page; > + } while (tmp != bh); > + } > > cleaned_buffers_try_again: > spin_lock(&lru_list_lock); > @@ -2637,7 +2660,8 @@ cleaned_buffers_try_again: > tmp = tmp->b_this_page; > > if (p->b_dev == B_FREE) BUG(); > - > + if (!buffer_uptodate(p)) > + was_uptodate = 0; > remove_inode_queue(p); > __remove_from_queues(p); > __put_unused_buffer_head(p); > @@ -2645,7 +2669,15 @@ cleaned_buffers_try_again: > spin_unlock(&unused_list_lock); > > /* Wake up anyone waiting for buffer heads */ > - wake_up(&buffer_wait); > + smp_mb(); > + if (waitqueue_active(&buffer_wait)) > + wake_up(&buffer_wait); > + > + /* > + * Make sure we don't read buffers again when they are reattached > + */ > + if (was_uptodate) > + SetPageUptodate(page); > > /* And free the page */ > page->buffers = NULL; > @@ -2674,6 +2706,62 @@ busy_buffer_page: > } > EXPORT_SYMBOL(try_to_free_buffers); > > +/* > + * Returns the number of pages which might have become freeable > + */ > +int shrink_buffer_cache(void) > +{ > + struct buffer_head *bh; > + int nr_todo; > + int nr_shrunk = 0; > + > + /* > + * Move any clean unlocked buffers from BUF_LOCKED onto BUF_CLEAN > + */ > + spin_lock(&lru_list_lock); > + for ( ; ; ) { > + bh = lru_list[BUF_LOCKED]; > + if (!bh || buffer_locked(bh)) > + break; > + __refile_buffer(bh); > + } > + > + /* > + * Now start liberating buffers > + */ > + nr_todo = nr_buffers_type[BUF_CLEAN]; > + while 
(nr_todo--) { > + struct page *page; > + > + bh = lru_list[BUF_CLEAN]; > + if (!bh) > + break; > + > + /* > + * Park the buffer on BUF_LOCKED so we don't revisit it on > + * this pass. > + */ > + __remove_from_lru_list(bh); > + bh->b_list = BUF_LOCKED; > + __insert_into_lru_list(bh, BUF_LOCKED); > + page = bh->b_page; > + if (TryLockPage(page)) > + continue; > + > + page_cache_get(page); > + spin_unlock(&lru_list_lock); > + if (try_to_release_page(page, GFP_NOIO)) > + nr_shrunk++; > + unlock_page(page); > + page_cache_release(page); > + spin_lock(&lru_list_lock); > + } > + spin_unlock(&lru_list_lock); > +// printk("%s: liberated %d page's worth of buffer_heads\n", > +// __FUNCTION__, nr_shrunk); > + return (nr_shrunk * sizeof(struct buffer_head)) / PAGE_CACHE_SIZE; > +} > + > /* ================== Debugging =================== */ > > void show_buffers(void) > @@ -2988,6 +3076,7 @@ int kupdate(void *startup) > #ifdef DEBUG > printk(KERN_DEBUG "kupdate() activated...\n"); > #endif > + shrink_buffer_cache(); > sync_old_buffers(); > run_task_queue(&tq_disk); > } > --- 2.4.19-pre8/include/linux/fs.h~nuke-buffers Fri May 24 12:24:56 2002 > +++ 2.4.19-pre8-akpm/include/linux/fs.h Fri May 24 12:24:56 2002 > @@ -1116,6 +1116,7 @@ extern int FASTCALL(try_to_free_buffers( > extern void refile_buffer(struct buffer_head * buf); > extern void create_empty_buffers(struct page *, kdev_t, unsigned long); > extern void end_buffer_io_sync(struct buffer_head *bh, int uptodate); > +extern int shrink_buffer_cache(void); > > /* reiserfs_writepage needs this */ > extern void set_buffer_async_io(struct buffer_head *bh) ; > > > - -- Roy Sigurd Karlsbakk, Datavaktmester Computers are like air conditioners. They stop working when you open Windows. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.4 BUFFERING BUG] (was [BUG] 2.4 VM sucks. Again) 2002-07-10 7:50 ` [2.4 BUFFERING BUG] (was [BUG] 2.4 VM sucks. Again) Roy Sigurd Karlsbakk @ 2002-07-10 8:05 ` Andrew Morton 2002-07-10 8:14 ` Roy Sigurd Karlsbakk 0 siblings, 1 reply; 48+ messages in thread From: Andrew Morton @ 2002-07-10 8:05 UTC (permalink / raw) To: Roy Sigurd Karlsbakk; +Cc: Martin J. Bligh, linux-kernel Roy Sigurd Karlsbakk wrote: > > hi > > I've been using the patch below from Andrew for some weeks now, sometimes > under quite heavy load, and find it quite stable. > Wish we knew why. I've tried many times to reproduce the problem which you're seeing. With just two gigs of memory, buffer_heads really cannot explain anything. It's weird. We discussed this in Ottawa - I guess Andrea will add the toss-the-buffers code on the read side (basically the filemap.c stuff). That may be sufficient, but without an understanding of what is going on, it is hard to predict. - ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.4 BUFFERING BUG] (was [BUG] 2.4 VM sucks. Again) 2002-07-10 8:05 ` Andrew Morton @ 2002-07-10 8:14 ` Roy Sigurd Karlsbakk 0 siblings, 0 replies; 48+ messages in thread From: Roy Sigurd Karlsbakk @ 2002-07-10 8:14 UTC (permalink / raw) To: Andrew Morton; +Cc: Martin J. Bligh, linux-kernel On Wednesday 10 July 2002 10:05, Andrew Morton wrote: > Roy Sigurd Karlsbakk wrote: > > hi > > > > I've been using the patch below from Andrew for some weeks now, sometimes > > under quite heavy load, and find it quite stable. > > Wish we knew why. I've tried many times to reproduce the problem > which you're seeing. With just two gigs of memory, buffer_heads > really cannot explain anything. It's weird. well - firstly, I'm using _1_ gig of memory, minus highmem (= ~900 megs) secondly - I have reproduced it on two different installations, although on the same hardware - standard PC with SiS MB and an extra Promise controller, RAID-0 on 4 drives and chunksize 1MB. Given 30-50 processes each reading a 4gig file and sending it over HTTP, everything works fine _if_ and only _if_ the clients read at high speed. If, however, the clients read at normal streaming speed (4.3Mbps), buffers go bOOM. > We discussed this in Ottawa - I guess Andrea will add the toss-the-buffers > code on the read side (basically the filemap.c stuff). That may > be sufficient, but without an understanding of what is going on, > it is hard to predict. If there's _any_ more data I can give, or any more testing I can do, I'll do my very best to help roy -- Roy Sigurd Karlsbakk, Datavaktmester Computers are like air conditioners. They stop working when you open Windows. ^ permalink raw reply [flat|nested] 48+ messages in thread
* [BUG+FIX] 2.4 buggercache sucks 2002-05-24 19:32 ` Andrew Morton ` (2 preceding siblings ...) 2002-07-10 7:50 ` [2.4 BUFFERING BUG] (was [BUG] 2.4 VM sucks. Again) Roy Sigurd Karlsbakk @ 2002-08-28 9:28 ` Roy Sigurd Karlsbakk 2002-08-28 15:30 ` Martin J. Bligh 3 siblings, 1 reply; 48+ messages in thread From: Roy Sigurd Karlsbakk @ 2002-08-28 9:28 UTC (permalink / raw) To: Andrew Morton, Martin J. Bligh; +Cc: linux-kernel hi the patch below has now been tested out for quite some time. Will it be likely to see this into 2.4.20? roy On Friday 24 May 2002 21:32, Andrew Morton wrote: > "Martin J. Bligh" wrote: > > >> Sounds like exactly the same problem we were having. There are two > > >> approaches to solving this - Andrea has a patch that tries to free > > >> them under memory pressure, akpm has a patch that hacks them down as > > >> soon as you've finished with them (posted to lse-tech mailing list). > > >> Both approaches seemed to work for me, but the performance of the > > >> fixes still has to be established. > > > > > > Where can I find the akpm patch? > > > > http://marc.theaimsgroup.com/?l=lse-tech&m=102083525007877&w=2 > > > > > Any plans to merge this into the main kernel, giving a choice > > > (in config or /proc) to enable this? > > > > I don't think Andrew is ready to submit this yet ... before anything > > gets merged back, it'd be very worthwhile testing the relative > > performance of both solutions ... the more testers we have the > > better ;-) > > Cripes no. It's pretty experimental. Andrea spotted a bug, too. Fixed > version is below. > > It's possible that keeping the number of buffers as low as possible > will give improved performance over Andrea's approach because it > leaves more ZONE_NORMAL for other things. It's also possible that > it'll give worse performance because more get_block's need to be > done for file overwriting. 
> > > --- 2.4.19-pre8/include/linux/pagemap.h~nuke-buffers Fri May 24 12:24:56 > 2002 +++ 2.4.19-pre8-akpm/include/linux/pagemap.h Fri May 24 12:26:30 2002 > @@ -89,13 +89,7 @@ extern void add_to_page_cache(struct pag > extern void add_to_page_cache_locked(struct page * page, struct > address_space *mapping, unsigned long index); extern int > add_to_page_cache_unique(struct page * page, struct address_space *mapping, > unsigned long index, struct page **hash); > > -extern void ___wait_on_page(struct page *); > - > -static inline void wait_on_page(struct page * page) > -{ > - if (PageLocked(page)) > - ___wait_on_page(page); > -} > +extern void wait_on_page(struct page *); > > extern struct page * grab_cache_page (struct address_space *, unsigned > long); extern struct page * grab_cache_page_nowait (struct address_space *, > unsigned long); --- 2.4.19-pre8/mm/filemap.c~nuke-buffers Fri May 24 > 12:24:56 2002 +++ 2.4.19-pre8-akpm/mm/filemap.c Fri May 24 12:24:56 2002 > @@ -608,7 +608,7 @@ int filemap_fdatawait(struct address_spa > page_cache_get(page); > spin_unlock(&pagecache_lock); > > - ___wait_on_page(page); > + wait_on_page(page); > if (PageError(page)) > ret = -EIO; > > @@ -805,33 +805,29 @@ static inline wait_queue_head_t *page_wa > return &wait[hash]; > } > > -/* > - * Wait for a page to get unlocked. > +static void kill_buffers(struct page *page) > +{ > + if (!PageLocked(page)) > + BUG(); > + if (page->buffers) > + try_to_release_page(page, GFP_NOIO); > +} > + > +/* > + * Wait for a page to come unlocked. Then try to ditch its buffer_heads. > * > - * This must be called with the caller "holding" the page, > - * ie with increased "page->count" so that the page won't > - * go away during the wait.. > + * FIXME: Make the ditching dependent on CONFIG_MONSTER_BOX or something. 
> */ > -void ___wait_on_page(struct page *page) > +void wait_on_page(struct page *page) > { > - wait_queue_head_t *waitqueue = page_waitqueue(page); > - struct task_struct *tsk = current; > - DECLARE_WAITQUEUE(wait, tsk); > - > - add_wait_queue(waitqueue, &wait); > - do { > - set_task_state(tsk, TASK_UNINTERRUPTIBLE); > - if (!PageLocked(page)) > - break; > - sync_page(page); > - schedule(); > - } while (PageLocked(page)); > - __set_task_state(tsk, TASK_RUNNING); > - remove_wait_queue(waitqueue, &wait); > + lock_page(page); > + kill_buffers(page); > + unlock_page(page); > } > +EXPORT_SYMBOL(wait_on_page); > > /* > - * Unlock the page and wake up sleepers in ___wait_on_page. > + * Unlock the page and wake up sleepers in lock_page. > */ > void unlock_page(struct page *page) > { > @@ -1400,6 +1396,11 @@ found_page: > > if (!Page_Uptodate(page)) > goto page_not_up_to_date; > + if (page->buffers) { > + lock_page(page); > + kill_buffers(page); > + unlock_page(page); > + } > generic_file_readahead(reada_ok, filp, inode, page); > page_ok: > /* If users can be writing to this page using arbitrary > @@ -1457,6 +1458,7 @@ page_not_up_to_date: > > /* Did somebody else fill it already? */ > if (Page_Uptodate(page)) { > + kill_buffers(page); > UnlockPage(page); > goto page_ok; > } > @@ -1948,6 +1950,11 @@ retry_find: > */ > if (!Page_Uptodate(page)) > goto page_not_uptodate; > + if (page->buffers) { > + lock_page(page); > + kill_buffers(page); > + unlock_page(page); > + } > > success: > /* > @@ -2006,6 +2013,7 @@ page_not_uptodate: > > /* Did somebody else get it up-to-date? */ > if (Page_Uptodate(page)) { > + kill_buffers(page); > UnlockPage(page); > goto success; > } > @@ -2033,6 +2041,7 @@ page_not_uptodate: > > /* Somebody else successfully read it in? 
*/ > if (Page_Uptodate(page)) { > + kill_buffers(page); > UnlockPage(page); > goto success; > } > @@ -2850,6 +2859,7 @@ retry: > goto retry; > } > if (Page_Uptodate(page)) { > + kill_buffers(page); > UnlockPage(page); > goto out; > } > --- 2.4.19-pre8/kernel/ksyms.c~nuke-buffers Fri May 24 12:24:56 2002 > +++ 2.4.19-pre8-akpm/kernel/ksyms.c Fri May 24 12:24:56 2002 > @@ -202,7 +202,6 @@ EXPORT_SYMBOL(ll_rw_block); > EXPORT_SYMBOL(submit_bh); > EXPORT_SYMBOL(unlock_buffer); > EXPORT_SYMBOL(__wait_on_buffer); > -EXPORT_SYMBOL(___wait_on_page); > EXPORT_SYMBOL(generic_direct_IO); > EXPORT_SYMBOL(discard_bh_page); > EXPORT_SYMBOL(block_write_full_page); > --- 2.4.19-pre8/mm/vmscan.c~nuke-buffers Fri May 24 12:24:56 2002 > +++ 2.4.19-pre8-akpm/mm/vmscan.c Fri May 24 12:24:56 2002 > @@ -365,8 +365,13 @@ static int shrink_cache(int nr_pages, zo > if (unlikely(!page_count(page))) > continue; > > - if (!memclass(page_zone(page), classzone)) > + if (!memclass(page_zone(page), classzone)) { > + if (page->buffers && !TryLockPage(page)) { > + try_to_release_page(page, GFP_NOIO); > + unlock_page(page); > + } > continue; > + } > > /* Racy check to avoid trylocking when not worthwhile */ > if (!page->buffers && (page_count(page) != 1 || !page->mapping)) > @@ -562,6 +567,11 @@ static int shrink_caches(zone_t * classz > nr_pages -= kmem_cache_reap(gfp_mask); > if (nr_pages <= 0) > return 0; > + if ((gfp_mask & __GFP_WAIT) && (shrink_buffer_cache() > 16)) { > + nr_pages -= kmem_cache_reap(gfp_mask); > + if (nr_pages <= 0) > + return 0; > + } > > nr_pages = chunk_size; > /* try to keep the active list 2/3 of the size of the cache */ > --- 2.4.19-pre8/fs/buffer.c~nuke-buffers Fri May 24 12:24:56 2002 > +++ 2.4.19-pre8-akpm/fs/buffer.c Fri May 24 12:26:28 2002 > @@ -1500,6 +1500,10 @@ static int __block_write_full_page(struc > /* Stage 3: submit the IO */ > do { > struct buffer_head *next = bh->b_this_page; > + /* > + * Stick it on BUF_LOCKED so shrink_buffer_cache() can nail it. 
> + */ > + refile_buffer(bh); > submit_bh(WRITE, bh); > bh = next; > } while (bh != head); > @@ -2615,6 +2619,25 @@ static int sync_page_buffers(struct buff > int try_to_free_buffers(struct page * page, unsigned int gfp_mask) > { > struct buffer_head * tmp, * bh = page->buffers; > + int was_uptodate = 1; > + > + if (!PageLocked(page)) > + BUG(); > + > + if (!bh) > + return 1; > + /* > + * Quick check for freeable buffers before we go take three > + * global locks. > + */ > + if (!(gfp_mask & __GFP_IO)) { > + tmp = bh; > + do { > + if (buffer_busy(tmp)) > + return 0; > + tmp = tmp->b_this_page; > + } while (tmp != bh); > + } > > cleaned_buffers_try_again: > spin_lock(&lru_list_lock); > @@ -2637,7 +2660,8 @@ cleaned_buffers_try_again: > tmp = tmp->b_this_page; > > if (p->b_dev == B_FREE) BUG(); > - > + if (!buffer_uptodate(p)) > + was_uptodate = 0; > remove_inode_queue(p); > __remove_from_queues(p); > __put_unused_buffer_head(p); > @@ -2645,7 +2669,15 @@ cleaned_buffers_try_again: > spin_unlock(&unused_list_lock); > > /* Wake up anyone waiting for buffer heads */ > - wake_up(&buffer_wait); > + smp_mb(); > + if (waitqueue_active(&buffer_wait)) > + wake_up(&buffer_wait); > + > + /* > + * Make sure we don't read buffers again when they are reattached > + */ > + if (was_uptodate) > + SetPageUptodate(page); > > /* And free the page */ > page->buffers = NULL; > @@ -2674,6 +2706,62 @@ busy_buffer_page: > } > EXPORT_SYMBOL(try_to_free_buffers); > > +/* > + * Returns the number of pages which might have become freeable > + */ > +int shrink_buffer_cache(void) > +{ > + struct buffer_head *bh; > + int nr_todo; > + int nr_shrunk = 0; > + > + /* > + * Move any clean unlocked buffers from BUF_LOCKED onto BUF_CLEAN > + */ > + spin_lock(&lru_list_lock); > + for ( ; ; ) { > + bh = lru_list[BUF_LOCKED]; > + if (!bh || buffer_locked(bh)) > + break; > + __refile_buffer(bh); > + } > + > + /* > + * Now start liberating buffers > + */ > + nr_todo = nr_buffers_type[BUF_CLEAN]; > + while 
(nr_todo--) { > + struct page *page; > + > + bh = lru_list[BUF_CLEAN]; > + if (!bh) > + break; > + > + /* > + * Park the buffer on BUF_LOCKED so we don't revisit it on > + * this pass. > + */ > + __remove_from_lru_list(bh); > + bh->b_list = BUF_LOCKED; > + __insert_into_lru_list(bh, BUF_LOCKED); > + page = bh->b_page; > + if (TryLockPage(page)) > + continue; > + > + page_cache_get(page); > + spin_unlock(&lru_list_lock); > + if (try_to_release_page(page, GFP_NOIO)) > + nr_shrunk++; > + unlock_page(page); > + page_cache_release(page); > + spin_lock(&lru_list_lock); > + } > + spin_unlock(&lru_list_lock); > +// printk("%s: liberated %d page's worth of buffer_heads\n", > +// __FUNCTION__, nr_shrunk); > + return (nr_shrunk * sizeof(struct buffer_head)) / PAGE_CACHE_SIZE; > +} > + > /* ================== Debugging =================== */ > > void show_buffers(void) > @@ -2988,6 +3076,7 @@ int kupdate(void *startup) > #ifdef DEBUG > printk(KERN_DEBUG "kupdate() activated...\n"); > #endif > + shrink_buffer_cache(); > sync_old_buffers(); > run_task_queue(&tq_disk); > } > --- 2.4.19-pre8/include/linux/fs.h~nuke-buffers Fri May 24 12:24:56 2002 > +++ 2.4.19-pre8-akpm/include/linux/fs.h Fri May 24 12:24:56 2002 > @@ -1116,6 +1116,7 @@ extern int FASTCALL(try_to_free_buffers( > extern void refile_buffer(struct buffer_head * buf); > extern void create_empty_buffers(struct page *, kdev_t, unsigned long); > extern void end_buffer_io_sync(struct buffer_head *bh, int uptodate); > +extern int shrink_buffer_cache(void); > > /* reiserfs_writepage needs this */ > extern void set_buffer_async_io(struct buffer_head *bh) ; > > > - -- Roy Sigurd Karlsbakk, Datavaktmester ProntoTV AS - http://www.pronto.tv/ Tel: +47 9801 3356 Computers are like air conditioners. They stop working when you open Windows. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG+FIX] 2.4 buggercache sucks 2002-08-28 9:28 ` [BUG+FIX] 2.4 buggercache sucks Roy Sigurd Karlsbakk @ 2002-08-28 15:30 ` Martin J. Bligh 2002-08-29 8:00 ` Roy Sigurd Karlsbakk 0 siblings, 1 reply; 48+ messages in thread From: Martin J. Bligh @ 2002-08-28 15:30 UTC (permalink / raw) To: Roy Sigurd Karlsbakk, Andrew Morton; +Cc: linux-kernel Andrew had a new version that he just submitted to 2.5, but it may not backport easily. The agreement at OLS was to treat read and write separately - nuke them immediately for one side, and reclaim under mem pressure for the other. Half of Andrea's patch, and half of Andrew's. Unfortunately I can never remember which was which ;-) And I don't think anyone has rolled that together yet .... Summary: the code below probably isn't the desired solution. M. --On Wednesday, August 28, 2002 11:28 AM +0200 Roy Sigurd Karlsbakk <roy@karlsbakk.net> wrote: > hi > > the patch below has now been tested out for quite some time. > > Will it be likely to see this into 2.4.20? > > roy > > > On Friday 24 May 2002 21:32, Andrew Morton wrote: >> "Martin J. Bligh" wrote: >> > >> Sounds like exactly the same problem we were having. There are two >> > >> approaches to solving this - Andrea has a patch that tries to free >> > >> them under memory pressure, akpm has a patch that hacks them down as >> > >> soon as you've finished with them (posted to lse-tech mailing list). >> > >> Both approaches seemed to work for me, but the performance of the >> > >> fixes still has to be established. >> > > >> > > Where can I find the akpm patch? >> > >> > http://marc.theaimsgroup.com/?l=lse-tech&m=102083525007877&w=2 >> > >> > > Any plans to merge this into the main kernel, giving a choice >> > > (in config or /proc) to enable this? >> > >> > I don't think Andrew is ready to submit this yet ... before anything >> > gets merged back, it'd be very worthwhile testing the relative >> > performance of both solutions ... 
the more testers we have the >> > better ;-) >> >> Cripes no. It's pretty experimental. Andrea spotted a bug, too. Fixed >> version is below. >> >> It's possible that keeping the number of buffers as low as possible >> will give improved performance over Andrea's approach because it >> leaves more ZONE_NORMAL for other things. It's also possible that >> it'll give worse performance because more get_block's need to be >> done for file overwriting. >> ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG+FIX] 2.4 buggercache sucks 2002-08-28 15:30 ` Martin J. Bligh @ 2002-08-29 8:00 ` Roy Sigurd Karlsbakk 2002-08-29 13:42 ` Martin J. Bligh 0 siblings, 1 reply; 48+ messages in thread From: Roy Sigurd Karlsbakk @ 2002-08-29 8:00 UTC (permalink / raw) To: Martin J. Bligh, Andrew Morton; +Cc: linux-kernel > Summary: the code below probably isn't the desired solution. Very well - but where is the code to run then? I mean - this code solved _my_ problem. Without it the server OOMs within minutes of high load, as explained earlier. I'd rather like a clean fix in 2.4 than this, although it works. Any thoughts? roy -- Roy Sigurd Karlsbakk, Datavaktmester ProntoTV AS - http://www.pronto.tv/ Tel: +47 9801 3356 Computers are like air conditioners. They stop working when you open Windows. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG+FIX] 2.4 buggercache sucks 2002-08-29 8:00 ` Roy Sigurd Karlsbakk @ 2002-08-29 13:42 ` Martin J. Bligh 2002-08-30 9:21 ` Roy Sigurd Karlsbakk 0 siblings, 1 reply; 48+ messages in thread From: Martin J. Bligh @ 2002-08-29 13:42 UTC (permalink / raw) To: Roy Sigurd Karlsbakk, Andrew Morton; +Cc: linux-kernel >> Summary: the code below probably isn't the desired solution. > > Very well - but where is the code to run then? Not quite sure what you mean? > I mean - this code solved _my_ problem. Without it the server OOMs within > minutes of high load, as explained earlier. I'd rather like a clean fix in > 2.4 than this, although it works. I'm sure Andrew could explain this better than I - he wrote the code, I just whined about the problem. Basically he frees the buffer_head immediately after he's used it, which could at least in theory degrade performance a little if it could have been reused. Now, nobody's ever really benchmarked that, so a more conservative approach is likely to be taken, unless someone can prove it doesn't degrade performance much for people who don't need the fix. One of the cases people were running scared of was something doing continual overwrites of a file, I think something like: for (i=0;i<BIGNUMBER;i++) { lseek (0); write 4K of data; } Or something. Was your workload doing lots of reads, or lots of writes? Or both? M. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG+FIX] 2.4 buggercache sucks 2002-08-29 13:42 ` Martin J. Bligh @ 2002-08-30 9:21 ` Roy Sigurd Karlsbakk 2002-08-30 17:19 ` Martin J. Bligh 0 siblings, 1 reply; 48+ messages in thread From: Roy Sigurd Karlsbakk @ 2002-08-30 9:21 UTC (permalink / raw) To: Martin J. Bligh, Andrew Morton; +Cc: linux-kernel > > I mean - this code solved _my_ problem. Without it the server OOMs within > > minutes of high load, as explained earlier. I'd rather like a clean fix > > in 2.4 than this, although it works. > > I'm sure Andrew could explain this better than I - he wrote the > code, I just whined about the problem. Basically he frees the > buffer_head immediately after he's used it, which could at least > in theory degrade performance a little if it could have been reused. > Now, nobody's ever really benchmarked that, so a more conservative > approach is likely to be taken, unless someone can prove it doesn't > degrade performance much for people who don't need the fix. One > of the cases people were running scared of was something doing > continual overwrites of a file, I think something like: > > for (i=0;i<BIGNUMBER;i++) { > lseek (0); > write 4K of data; > } > > Or something. > > Was your workload doing lots of reads, or lots of writes? Or both? I was downloading large files @ ~ 4Mbps from 20-50 clients - filesize ~3GB the box has 1GB memory minus (no highmem) - so - 900 megs. After some time it starts swapping and it OOMs. Same happens with several userspace httpd's roy -- Roy Sigurd Karlsbakk, Datavaktmester ProntoTV AS - http://www.pronto.tv/ Tel: +47 9801 3356 Computers are like air conditioners. They stop working when you open Windows. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG+FIX] 2.4 buggercache sucks 2002-08-30 9:21 ` Roy Sigurd Karlsbakk @ 2002-08-30 17:19 ` Martin J. Bligh 2002-08-30 18:49 ` Andrew Morton 0 siblings, 1 reply; 48+ messages in thread From: Martin J. Bligh @ 2002-08-30 17:19 UTC (permalink / raw) To: Roy Sigurd Karlsbakk, Andrew Morton; +Cc: linux-kernel >> Was your workload doing lots of reads, or lots of writes? Or both? > > I was downloading large files @ ~ 4Mbps from 20-50 clients - filesize ~3GB > the box has 1GB memory minus (no highmem) - so - 900 megs. After some time it > starts swapping and it OOMs. Same happens with several userspace httpd's Mmmm .... not quite sure which way round to read that. Presumably the box that was the server fell over, and the clients are fine? So the workload that's causing problems is doing predominantly reads? If so, I suggest you tear down Andrew's patch to read side only, and submit that ... I get the feeling that would be acceptable, and would solve your problem. M. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG+FIX] 2.4 buggercache sucks 2002-08-30 17:19 ` Martin J. Bligh @ 2002-08-30 18:49 ` Andrew Morton 0 siblings, 0 replies; 48+ messages in thread From: Andrew Morton @ 2002-08-30 18:49 UTC (permalink / raw) To: Martin J. Bligh; +Cc: Roy Sigurd Karlsbakk, linux-kernel "Martin J. Bligh" wrote: > > >> Was your workload doing lots of reads, or lots of writes? Or both? > > > > I was downloading large files @ ~ 4Mbps from 20-50 clients - filesize ~3GB > > the box has 1GB memory minus (no highmem) - so - 900 megs. After some time it > > starts swapping and it OOMs. Same happens with several userspace httpd's > > Mmmm .... not quite sure which way round to read that. Presumably the box > that was the server fell over, and the clients are fine? So the workload that's > causing problems is doing predominantly reads? If so, I suggest you tear down > Andrew's patch to read side only, and submit that ... I get the feeling that would > be acceptable, and would solve your problem. But we still don't know what the problem _is_. It's very weird. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG] 2.4 VM sucks. Again 2002-05-23 16:29 ` Roy Sigurd Karlsbakk 2002-05-23 16:46 ` Martin J. Bligh @ 2002-05-24 15:11 ` [BUG] 2.4 VM sucks. Again Alan Cox 2002-05-24 15:53 ` Martin J. Bligh 2002-05-27 11:12 ` Roy Sigurd Karlsbakk 1 sibling, 2 replies; 48+ messages in thread From: Alan Cox @ 2002-05-24 15:11 UTC (permalink / raw) To: Roy Sigurd Karlsbakk; +Cc: Martin J. Bligh, linux-kernel > > How much RAM do you have, and what does /proc/meminfo > > and /proc/slabinfo say just before the explosion point? > > I have 1 gig - highmem (not enabled) - 900 megs. > for what I can see, kernel can't reclaim buffers fast enough. > ut looks better on -aa. > What sort of setup? I can't duplicate the problem here. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG] 2.4 VM sucks. Again 2002-05-24 15:11 ` [BUG] 2.4 VM sucks. Again Alan Cox @ 2002-05-24 15:53 ` Martin J. Bligh 2002-05-24 16:14 ` Alan Cox 2002-05-27 11:12 ` Roy Sigurd Karlsbakk 1 sibling, 1 reply; 48+ messages in thread From: Martin J. Bligh @ 2002-05-24 15:53 UTC (permalink / raw) To: Alan Cox, Roy Sigurd Karlsbakk; +Cc: linux-kernel >> > How much RAM do you have, and what does /proc/meminfo >> > and /proc/slabinfo say just before the explosion point? >> >> I have 1 gig - highmem (not enabled) - 900 megs. >> for what I can see, kernel can't reclaim buffers fast enough. >> ut looks better on -aa. >> > > What sort of setup. I can't duplicate the problem here ? I'm not sure exactly what Roy was doing, but we were taking a machine with 16Gb of RAM, and reading files into the page cache - I think we built up 8 million buffer_heads according to slabinfo ... on a P4 they're 128 bytes each, on a P3 96 bytes. M. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG] 2.4 VM sucks. Again 2002-05-24 15:53 ` Martin J. Bligh @ 2002-05-24 16:14 ` Alan Cox 2002-05-24 16:31 ` Martin J. Bligh 0 siblings, 1 reply; 48+ messages in thread From: Alan Cox @ 2002-05-24 16:14 UTC (permalink / raw) To: Martin J. Bligh; +Cc: Alan Cox, Roy Sigurd Karlsbakk, linux-kernel > > What sort of setup. I can't duplicate the problem here ? > > I'm not sure exactly what Roy was doing, but we were taking a machine > with 16Gb of RAM, and reading files into the page cache - I think we built up > 8 million buffer_heads according to slabinfo ... on a P4 they're 128 bytes each, > on a P3 96 bytes. The buffer heads one would make sense. I only test on realistic sized systems. Once you pass 4Gb there are so many problems its not worth using x86 in the long run ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG] 2.4 VM sucks. Again 2002-05-24 16:14 ` Alan Cox @ 2002-05-24 16:31 ` Martin J. Bligh 2002-05-24 17:30 ` Austin Gonyou 0 siblings, 1 reply; 48+ messages in thread From: Martin J. Bligh @ 2002-05-24 16:31 UTC (permalink / raw) To: Alan Cox; +Cc: Roy Sigurd Karlsbakk, linux-kernel >> I'm not sure exactly what Roy was doing, but we were taking a machine >> with 16Gb of RAM, and reading files into the page cache - I think we built up >> 8 million buffer_heads according to slabinfo ... on a P4 they're 128 bytes each, >> on a P3 96 bytes. > > The buffer heads one would make sense. I only test on realistic sized systems. Well, it'll still waste valuable memory there too, though you may not totally kill it. > Once you pass 4Gb there are so many problems its not worth using x86 in the > long run Nah, we just haven't fixed them yet ;-) M. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG] 2.4 VM sucks. Again 2002-05-24 16:31 ` Martin J. Bligh @ 2002-05-24 17:30 ` Austin Gonyou 2002-05-24 17:43 ` Martin J. Bligh 2002-05-27 9:24 ` [BUG] 2.4 VM sucks. Again Marco Colombo 0 siblings, 2 replies; 48+ messages in thread From: Austin Gonyou @ 2002-05-24 17:30 UTC (permalink / raw) To: Martin J. Bligh; +Cc: Alan Cox, Roy Sigurd Karlsbakk, linux-kernel On Fri, 2002-05-24 at 11:31, Martin J. Bligh wrote: > >> I'm not sure exactly what Roy was doing, but we were taking a machine > >> with 16Gb of RAM, and reading files into the page cache - I think we built up > >> 8 million buffer_heads according to slabinfo ... on a P4 they're 128 bytes each, > >> on a P3 96 bytes. > > > > The buffer heads one would make sense. I only test on realistic sized systems. > > Well, it'll still waste valuable memory there too, though you may not totally kill it. > > > Once you pass 4Gb there are so many problems its not worth using x86 in the > > long run > I assume that you mean by "not worth using x86" you're referring to say, degraded performance over other platforms? Well...if you talk price/performance, using x86 is perfect in those terms since you can buy more boxes and have a more fluid architecture, rather than building a monolithic system. Monolithic systems aren't always the best. Just look at Fermilab! > Nah, we just haven't fixed them yet ;-) > > M. > > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG] 2.4 VM sucks. Again 2002-05-24 17:30 ` Austin Gonyou @ 2002-05-24 17:43 ` Martin J. Bligh 2002-05-24 18:03 ` Austin Gonyou 2002-05-27 9:24 ` [BUG] 2.4 VM sucks. Again Marco Colombo 1 sibling, 1 reply; 48+ messages in thread From: Martin J. Bligh @ 2002-05-24 17:43 UTC (permalink / raw) To: Austin Gonyou; +Cc: Alan Cox, Roy Sigurd Karlsbakk, linux-kernel > I assume that you mean by "not worth using x86" you're referring to say, > degraded performance over other platforms? Well...if you talk > price/performance, using x86 is perfect in those terms since you can buy > more boxes and have a more fluid architecture, rather than building a > monolithic system. Monolithic systems aren't always the best. Just look > at Fermilab! Well, to be honest, with the current mainline kernel on >4Gb x86 machines, we're not talking about slow performance on the mainline kernel, we're talking about "falls flat on its face, in a jibbering heap" (if you actually stress the machine with real workloads). If we apply a bunch of patches, we can get the ostrich to just about fly (most of the time), but we're working towards good performance too ... it's not that far off. Of course, this means that we actually have to get these patches accepted for them to be of much use ;-). The -aa kernel works best in this area, on the workloads I've been looking at so far ... this area is very much "under active development" at the moment. M. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG] 2.4 VM sucks. Again 2002-05-24 17:43 ` Martin J. Bligh @ 2002-05-24 18:03 ` Austin Gonyou 2002-05-24 18:10 ` Martin J. Bligh 0 siblings, 1 reply; 48+ messages in thread From: Austin Gonyou @ 2002-05-24 18:03 UTC (permalink / raw) To: Martin J. Bligh; +Cc: Alan Cox, Roy Sigurd Karlsbakk, linux-kernel On Fri, 2002-05-24 at 12:43, Martin J. Bligh wrote: > > I assume that you mean by "not worth using x86" you're referring to say, > > degraded performance over other platforms? Well...if you talk > > price/performance, using x86 is perfect in those terms since you can buy > > more boxes and have a more fluid architecture, rather than building a > > monolithic system. Monolithic systems aren't always the best. Just look > > at Fermilab! > > Well, to be honest, with the current mainline kernel on >4Gb x86 machines, > we're not talking about slow performance on the mainline kernel, we're talking > about "falls flat on its face, in a jibbering heap" (if you actually stress the > machine with real workloads). If we apply a bunch of patches, we can get > the ostrich to just about fly (most of the time), but we're working towards good > performance too ... it's not that far off. Understood, I think that's everyone's goal in the end anyway. > Of course, this means that we actually have to get these patches accepted > for them to be of much use ;-). The -aa kernel works best in this area, on the > workloads I've been looking at so far ... this area is very much "under active > development" at the moment. > > M. Yes. After using a -aa series, recompiling Glibc with some optimizations, kind of re-purifying the system a few times, then applying some Oracle patches (to fix some Oracle bugs in our environment), voila! We can have a *very* fast Linux box on 4P or 8P with 4-8GB RAM with an uptime of >60 days. I've never had a box longer than that to prove otherwise, but it was stable from a *production* point of view. 
Also, adjusting the bdflush parms greatly increases stability I've found in this respect. On top of all of that though, using XFS with increased logbuffers and LVM or EVMS to do striping really improved performance with IO too. Problem is, my tests are *unofficial* but I plan to do something perhaps at OSDL and see what we can show in a max single-box config with real hardware, etc. Anyway, I digress. Austin ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG] 2.4 VM sucks. Again 2002-05-24 18:03 ` Austin Gonyou @ 2002-05-24 18:10 ` Martin J. Bligh 2002-05-24 18:29 ` 2.4 Kernel Perf discussion [Was Re: [BUG] 2.4 VM sucks. Again] Austin Gonyou 0 siblings, 1 reply; 48+ messages in thread From: Martin J. Bligh @ 2002-05-24 18:10 UTC (permalink / raw) To: Austin Gonyou; +Cc: Alan Cox, Roy Sigurd Karlsbakk, linux-kernel > Also, adjusting the bdflush parms greatly increases stability I've found > in this respect. What exactly did you do to them? Can you specify what you're set to at the moment (and anything you found along the way in tuning)? > Problem is, my tests are *unofficial* but I plan to do something perhaps > at OSDL and see what we can show in a max single-box config with real > hardware, etc. Great stuff, I'm very interested in knowing about any problems you find. We're doing very similar things here, anywhere from 8-32 procs, and 4-32Gb of RAM, both NUMA and SMP. Thanks, Martin. ^ permalink raw reply [flat|nested] 48+ messages in thread
* 2.4 Kernel Perf discussion [Was Re: [BUG] 2.4 VM sucks. Again] 2002-05-24 18:10 ` Martin J. Bligh @ 2002-05-24 18:29 ` Austin Gonyou 2002-05-24 19:01 ` Stephen Frost 0 siblings, 1 reply; 48+ messages in thread From: Austin Gonyou @ 2002-05-24 18:29 UTC (permalink / raw) To: linux-kernel On Fri, 2002-05-24 at 13:10, Martin J. Bligh wrote: > > Also, adjusting the bdflush parms greatly increases stability I've found > > in this respect. > > What exactly did you do to them? Can you specify what you're set to > at the moment (and anything you found along the way in tuning)? I actually changed the defaults of the bdflush parms before compiling. I don't have that info right now because I had to dismantle my system in a hurry; it was a try-buy from Dell at the time, and we weren't authorized to buy yet. At any rate, I found, at the time (2.4.17-pre5-aa2-xfs I think), that the defaults for bdflush when running dbench would just *destroy* the system. Changing the bdflush parms to be about 60% full, and flushing to 30%, while potentially wasteful, was indeed an improvement. IOzone benchmarks also show distinct improvements in this regard as well, but I never had such terrible kswapd/bdflush issues with that test as I did with dbench, to begin with. The test system was a Dell 6450 with 8GB RAM and P3 Xeon 700MHz 2MB cache procs. I expect far greater performance from the P4 Xeon 1.6GHz 1MB cache procs though. In that scenario, we will only be using 4GB RAM probably. That test will be internal to us and should start in the next couple weeks (I hope). I'll be charged with making the system testing as immaculate as possible so we have crisp information to use in our decision-making process as we move from Sun to x86. > > Problem is, my tests are *unofficial* but I plan to do something perhaps > > at OSDL and see what we can show in a max single-box config with real > > hardware, etc. > > Great stuff, I'm very interested in knowing about any problems you find. 
> We're doing very similar things here, anywhere from 8-32 procs, and > 4-32Gb of RAM, both NUMA and SMP. As soon as I can get time on their systems to do 4/8-way testing, I'll make my benches available. Should be good stuff. :) > Thanks, > > Martin. ^ permalink raw reply [flat|nested] 48+ messages in thread
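For reference, the bdflush tunables Austin describes are also adjustable at run time through /proc on 2.4, without recompiling. A sketch in the spirit of his "60% full, flushing to 30%" description; the exact field order varies between 2.4 releases, and the specific numbers below are illustrative guesses, not tested recommendations:

```shell
# Show the current tunables first: a single row of nine integers.
# On many 2.4 trees the order is roughly
#   nfract ndirty nrefill nref_dirt interval age_buffer
#   nfract_sync nfract_stop_bdflush unused
# but verify each field against Documentation/sysctl/vm.txt for
# your exact kernel before echoing anything.
cat /proc/sys/vm/bdflush

# Hypothetical tuning: start background flushing when ~60% of
# buffers are dirty, stop bdflush again around 30%. Every value
# here is an assumption for illustration.
echo "60 500 64 256 500 3000 80 30 0" > /proc/sys/vm/bdflush
```

Writing the row requires root, and on 2.4 you can also set it programmatically via sysctl(2).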
* Re: 2.4 Kernel Perf discussion [Was Re: [BUG] 2.4 VM sucks. Again] 2002-05-24 18:29 ` 2.4 Kernel Perf discussion [Was Re: [BUG] 2.4 VM sucks. Again] Austin Gonyou @ 2002-05-24 19:01 ` Stephen Frost 0 siblings, 0 replies; 48+ messages in thread From: Stephen Frost @ 2002-05-24 19:01 UTC (permalink / raw) To: Austin Gonyou; +Cc: linux-kernel [-- Attachment #1: Type: text/plain, Size: 644 bytes --] * Austin Gonyou (austin@digitalroadkill.net) wrote: > As soon as I can get time on their systems to do 4/8-way testing, I'll > make my benches available. Should be good stuff. :) I may be getting an opportunity in the next weeks/months to play with a 16-way SparcCenter 2000 w/ 85mhz procs and 3GB of ram. I realize this machine is rather pokey but I was wondering if it might be useful to help test the kernel with a large number of processors. So, if you or anyone else have some test you'd like me to run (assuming I get the machine all set up and running Linux) let me know and I'd be happy to try some things. Stephen [-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --] ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG] 2.4 VM sucks. Again 2002-05-24 17:30 ` Austin Gonyou 2002-05-24 17:43 ` Martin J. Bligh @ 2002-05-27 9:24 ` Marco Colombo 2002-05-27 22:24 ` Austin Gonyou 1 sibling, 1 reply; 48+ messages in thread From: Marco Colombo @ 2002-05-27 9:24 UTC (permalink / raw) To: Austin Gonyou; +Cc: linux-kernel On 24 May 2002, Austin Gonyou wrote: > On Fri, 2002-05-24 at 11:31, Martin J. Bligh wrote: > > >> I'm not sure exactly what Roy was doing, but we were taking a machine > > >> with 16Gb of RAM, and reading files into the page cache - I think we built up > > >> 8 million buffer_heads according to slabinfo ... on a P4 they're 128 bytes each, > > >> on a P3 96 bytes. > > > > > > The buffer heads one would make sense. I only test on realistic sized systems. > > > > Well, it'll still waste valuable memory there too, though you may not totally kill it. > > > > > Once you pass 4Gb there are so many problems its not worth using x86 in the > > > long run > > > I assume that you mean by "not worth using x86" you're referring to say, > degraded performance over other platforms? Well...if you talk > price/performance, using x86 is perfect in those terms since you can buy > more boxes and have a more fluid architecture, rather than building a > monolithic system. Monolithic systems aren't always the best. Just look > at Fermilab! Uh? There are many alpha-based clusters out there. Why do you think !x86 == monolithic? .TM. -- ____/ ____/ / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _____/ _____/ _/ Colombo@ESI.it ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG] 2.4 VM sucks. Again 2002-05-27 9:24 ` [BUG] 2.4 VM sucks. Again Marco Colombo @ 2002-05-27 22:24 ` Austin Gonyou 2002-05-27 23:08 ` Austin Gonyou 0 siblings, 1 reply; 48+ messages in thread From: Austin Gonyou @ 2002-05-27 22:24 UTC (permalink / raw) To: Marco Colombo; +Cc: linux-kernel I'm not referring to just *non* x86 arches in this case. Sorry about that. Any setup can be non-monolithic, but the measurement to decide if it is cost worthy is price/performance ratio. I'm not saying that "if it's not x86, it's monolithic", in the context of the discussion, it's really about large costly boxes, designed to be large, costly boxes. That, from this perspective, is monolithic. On Mon, 2002-05-27 at 04:24, Marco Colombo wrote: > On 24 May 2002, Austin Gonyou wrote: > > > On Fri, 2002-05-24 at 11:31, Martin J. Bligh wrote: > > > >> I'm not sure exactly what Roy was doing, but we were taking a machine > > > >> with 16Gb of RAM, and reading files into the page cache - I think we built up > > > >> 8 million buffer_heads according to slabinfo ... on a P4 they're 128 bytes each, > > > >> on a P3 96 bytes. > > > > > > > > The buffer heads one would make sense. I only test on realistic sized systems. > > > > > > Well, it'll still waste valuable memory there too, though you may not totally kill it. > > > > > > > Once you pass 4Gb there are so many problems its not worth using x86 in the > > > > long run > > > > > I assume that you mean by "not worth using x86" you're referring to say, > > degraded performance over other platforms? Well...if you talk > > price/performance, using x86 is perfect in those terms since you can buy > > more boxes and have a more fluid architecture, rather than building a > > monolithic system. Monolithic systems aren't always the best. Just look > > at Fermilab! > > Uh? There are many alpha-based clusters out there. Why do you think > !x86 == monolithic? > > .TM. 
> -- > ____/ ____/ / > / / / Marco Colombo > ___/ ___ / / Technical Manager > / / / ESI s.r.l. > _____/ _____/ _/ Colombo@ESI.it > ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG] 2.4 VM sucks. Again 2002-05-27 22:24 ` Austin Gonyou @ 2002-05-27 23:08 ` Austin Gonyou 0 siblings, 0 replies; 48+ messages in thread From: Austin Gonyou @ 2002-05-27 23:08 UTC (permalink / raw) To: Austin Gonyou; +Cc: Marco Colombo, linux-kernel Just to clarify, it was Sparc v. x86. (which is what I meant to state in my first sentence there. :) On Mon, 2002-05-27 at 17:24, Austin Gonyou wrote: > I'm not referring to just *non* x86 arches in this case. Sorry about > that. Any setup can be non-monolithic, but the measurement to decide if > it is cost worthy is price/performance ratio. > > I'm not saying that "if it's not x86, it's monolithic", in the context > of the discussion, it's really about large costly boxes, designed to be > large, costly boxes. That, from this perspective, is monolithic. > > > On Mon, 2002-05-27 at 04:24, Marco Colombo wrote: > > On 24 May 2002, Austin Gonyou wrote: > > > > > On Fri, 2002-05-24 at 11:31, Martin J. Bligh wrote: > > > > >> I'm not sure exactly what Roy was doing, but we were taking a machine > > > > >> with 16Gb of RAM, and reading files into the page cache - I think we built up > > > > >> 8 million buffer_heads according to slabinfo ... on a P4 they're 128 bytes each, > > > > >> on a P3 96 bytes. > > > > > > > > > > The buffer heads one would make sense. I only test on realistic sized systems. > > > > > > > > Well, it'll still waste valuable memory there too, though you may not totally kill it. > > > > > > > > > Once you pass 4Gb there are so many problems its not worth using x86 in the > > > > > long run > > > > > > > I assume that you mean by "not worth using x86" you're referring to say, > > > degraded performance over other platforms? Well...if you talk > > > price/performance, using x86 is perfect in those terms since you can buy > > > more boxes and have a more fluid architecture, rather than building a > > > monolithic system. Monolithic systems aren't always the best. Just look > > > at Fermilab! 
> > Uh? There are many alpha-based clusters out there. Why do you think > > !x86 == monolithic? > > > > .TM. > > -- > > ____/ ____/ / > > / / / Marco Colombo > > ___/ ___ / / Technical Manager > > / / / ESI s.r.l. > > _____/ _____/ _/ Colombo@ESI.it ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG] 2.4 VM sucks. Again 2002-05-24 15:11 ` [BUG] 2.4 VM sucks. Again Alan Cox 2002-05-24 15:53 ` Martin J. Bligh @ 2002-05-27 11:12 ` Roy Sigurd Karlsbakk 2002-05-27 14:31 ` Alan Cox 1 sibling, 1 reply; 48+ messages in thread From: Roy Sigurd Karlsbakk @ 2002-05-27 11:12 UTC (permalink / raw) To: Alan Cox; +Cc: Martin J. Bligh, linux-kernel > > > How much RAM do you have, and what does /proc/meminfo > > > and /proc/slabinfo say just before the explosion point? > > > > I have 1 gig - highmem (not enabled) - 900 megs. > > for what I can see, kernel can't reclaim buffers fast enough. > > ut looks better on -aa. > > What sort of setup? I can't duplicate the problem here. The setup is 2-4 drives in raid0, with chunk size 1MB. If I try to do ~50 simultaneous reads from disk, it's no problem as long as the data is being read from the NIC with the same speed as it's being read from disk. The server apps are running via inetd (testing), and have 2MB of buffer each (read 2MB from disk, write 2MB to NIC). The server crashes within minutes. The same problem occurs when using Tux. thanks roy -- Roy Sigurd Karlsbakk, Datavaktmester Computers are like air conditioners. They stop working when you open Windows. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG] 2.4 VM sucks. Again 2002-05-27 11:12 ` Roy Sigurd Karlsbakk @ 2002-05-27 14:31 ` Alan Cox 2002-05-27 13:43 ` Roy Sigurd Karlsbakk 0 siblings, 1 reply; 48+ messages in thread From: Alan Cox @ 2002-05-27 14:31 UTC (permalink / raw) To: Roy Sigurd Karlsbakk; +Cc: Martin J. Bligh, linux-kernel On Mon, 2002-05-27 at 12:12, Roy Sigurd Karlsbakk wrote: > If I try to do ~50 simultaneous reads from disk, it's no problem as long as > the data is being read from the NIC with the same speed as it's being read > from disk. The server apps are running via inetd (testing), and have 2MB of > buffer each (read 2MB from disk, write 2MB to NIC). > > The server crashes within minutes. The same problem occurs when using Tux > How much physical memory, and is your app using sendfile? ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG] 2.4 VM sucks. Again 2002-05-27 14:31 ` Alan Cox @ 2002-05-27 13:43 ` Roy Sigurd Karlsbakk 0 siblings, 0 replies; 48+ messages in thread From: Roy Sigurd Karlsbakk @ 2002-05-27 13:43 UTC (permalink / raw) To: Alan Cox; +Cc: Martin J. Bligh, linux-kernel On Monday 27 May 2002 16:31, you wrote: > On Mon, 2002-05-27 at 12:12, Roy Sigurd Karlsbakk wrote: > > If I try to do ~50 simultanous reads from disk, it's no problem as long > > as the data is being read from the nic with the same speed as it's being > > read from disk. The server apps are running via inetd (testing), and have > > 2MB of buffer each. (read 2MB from disk, write 2MB to NIC). > > > > The server chrashes within minutes. The same problem occurs when using > > Tux > > How much physical memory and is your app using sendfile ? I have 1gig with highmem disabled, ergo 900MB. My app is just doing read() write(), but as the problem occurs similarly with Tux (which uses sendfile()), it shouldn't really matter -- Roy Sigurd Karlsbakk, Datavaktmester Computers are like air conditioners. They stop working when you open Windows. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG] 2.4 VM sucks. Again 2002-05-23 13:11 [BUG] 2.4 VM sucks. Again Roy Sigurd Karlsbakk 2002-05-23 14:54 ` Martin J. Bligh @ 2002-05-23 16:03 ` Johannes Erdfelt 2002-05-23 16:33 ` Roy Sigurd Karlsbakk 2002-05-23 18:12 ` jlnance 2 siblings, 1 reply; 48+ messages in thread From: Johannes Erdfelt @ 2002-05-23 16:03 UTC (permalink / raw) To: Roy Sigurd Karlsbakk; +Cc: linux-kernel On Thu, May 23, 2002, Roy Sigurd Karlsbakk <roy@karlsbakk.net> wrote: > I've been here complaining about the 2.4 VM before, and here I am, back again. > > PROBLEM: > ---------------------- > Starting up 30 downloads from a custom HTTP server (or Tux - or Apache - > doesn't matter), file size is 3-6GB, download speed = ~4.5Mbps. After some > time the kernel (a) goes bOOM (out of memory) if not having any swap, or (b) > goes gong swapping out anything it can. > > The custom HTTP server processes each have a static buffer of two megabytes, > no malloc()s, and are written in < 1000 lines of C. > > Theory: The buffer fills up, as the clients can't read as fast as kernel is > reading from disk, and the server goes boom > > thanks for any help What kernel is this? JE ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG] 2.4 VM sucks. Again 2002-05-23 16:03 ` Johannes Erdfelt @ 2002-05-23 16:33 ` Roy Sigurd Karlsbakk 2002-05-23 22:50 ` Luigi Genoni 0 siblings, 1 reply; 48+ messages in thread From: Roy Sigurd Karlsbakk @ 2002-05-23 16:33 UTC (permalink / raw) To: Johannes Erdfelt; +Cc: linux-kernel > What kernel is this? Sorry, forgot to tell: it's 2.4.18-ac? and 2.4.19pre-several. I believe it's the same stuff I've seen on earlier kernels as well. -aa seems to solve or reduce the problem -- Roy Sigurd Karlsbakk, Datavaktmester Computers are like air conditioners. They stop working when you open Windows. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG] 2.4 VM sucks. Again 2002-05-23 16:33 ` Roy Sigurd Karlsbakk @ 2002-05-23 22:50 ` Luigi Genoni 2002-05-24 11:53 ` Roy Sigurd Karlsbakk 0 siblings, 1 reply; 48+ messages in thread From: Luigi Genoni @ 2002-05-23 22:50 UTC (permalink / raw) To: Roy Sigurd Karlsbakk; +Cc: Johannes Erdfelt, linux-kernel Have you tried the latest aa version? They are quite interesting. I am playing with 2.4.19-pre8aa3 right now... On Thu, 23 May 2002, Roy Sigurd Karlsbakk wrote: > > What kernel is this? > > Sorry, forgot to tell: > > it's 2.4.18-ac? and 2.4.19pre-several. I believe it's the same stuff I've > seen on earlier kernels as well. > > -aa seems to solve or reduce the problem > -- > Roy Sigurd Karlsbakk, Datavaktmester > > Computers are like air conditioners. > They stop working when you open Windows. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG] 2.4 VM sucks. Again 2002-05-23 22:50 ` Luigi Genoni @ 2002-05-24 11:53 ` Roy Sigurd Karlsbakk 0 siblings, 0 replies; 48+ messages in thread From: Roy Sigurd Karlsbakk @ 2002-05-24 11:53 UTC (permalink / raw) To: Luigi Genoni; +Cc: Johannes Erdfelt, linux-kernel On Friday 24 May 2002 00:50, Luigi Genoni wrote: > Have you tried latest aa versione? > They are quite interesting. > I am playing with 1.4.19-pre8aa3 right now... I just tried it. it's better, but not good enough. it still fucks up -- Roy Sigurd Karlsbakk, Datavaktmester Computers are like air conditioners. They stop working when you open Windows. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG] 2.4 VM sucks. Again 2002-05-23 13:11 [BUG] 2.4 VM sucks. Again Roy Sigurd Karlsbakk 2002-05-23 14:54 ` Martin J. Bligh 2002-05-23 16:03 ` Johannes Erdfelt @ 2002-05-23 18:12 ` jlnance 2002-05-24 10:36 ` Roy Sigurd Karlsbakk 2 siblings, 1 reply; 48+ messages in thread From: jlnance @ 2002-05-23 18:12 UTC (permalink / raw) To: roy, linux-kernel On Thu, May 23, 2002 at 03:11:24PM +0200, Roy Sigurd Karlsbakk wrote: > Starting up 30 downloads from a custom HTTP server (or Tux - or Apache - > doesn't matter), file size is 3-6GB, download speed = ~4.5Mbps. After some > time the kernel (a) goes bOOM (out of memory) if not having any swap, or (b) > goes gong swapping out anything it can. Does this work if the client and the server are on the same machine? It would make reproducing this a lot easier if it only required 1 machine. Thanks, Jim ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG] 2.4 VM sucks. Again 2002-05-23 18:12 ` jlnance @ 2002-05-24 10:36 ` Roy Sigurd Karlsbakk 2002-05-31 21:21 ` Andrea Arcangeli 0 siblings, 1 reply; 48+ messages in thread From: Roy Sigurd Karlsbakk @ 2002-05-24 10:36 UTC (permalink / raw) To: jlnance, linux-kernel On Thursday 23 May 2002 20:12, jlnance@intrex.net wrote: > On Thu, May 23, 2002 at 03:11:24PM +0200, Roy Sigurd Karlsbakk wrote: > > Starting up 30 downloads from a custom HTTP server (or Tux - or Apache - > > doesn't matter), file size is 3-6GB, download speed = ~4.5Mbps. After > > some time the kernel (a) goes bOOM (out of memory) if not having any > > swap, or (b) goes gong swapping out anything it can. > > Does this work if the client and the server are on the same machine? It > would make reproducing this a lot easier if it only required 1 machine. I guess it'd work fine with only one machine, as IMO, the problem must be the kernel not releasing buffers -- Roy Sigurd Karlsbakk, Datavaktmester Computers are like air conditioners. They stop working when you open Windows. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG] 2.4 VM sucks. Again 2002-05-24 10:36 ` Roy Sigurd Karlsbakk @ 2002-05-31 21:21 ` Andrea Arcangeli 2002-06-01 12:36 ` Roy Sigurd Karlsbakk 0 siblings, 1 reply; 48+ messages in thread From: Andrea Arcangeli @ 2002-05-31 21:21 UTC (permalink / raw) To: Roy Sigurd Karlsbakk; +Cc: jlnance, linux-kernel On Fri, May 24, 2002 at 12:36:32PM +0200, Roy Sigurd Karlsbakk wrote: > On Thursday 23 May 2002 20:12, jlnance@intrex.net wrote: > > On Thu, May 23, 2002 at 03:11:24PM +0200, Roy Sigurd Karlsbakk wrote: > > > Starting up 30 downloads from a custom HTTP server (or Tux - or Apache - > > > doesn't matter), file size is 3-6GB, download speed = ~4.5Mbps. After > > > some time the kernel (a) goes bOOM (out of memory) if not having any > > > swap, or (b) goes gong swapping out anything it can. > > > > Does this work if the client and the server are on the same machine? It > > would make reproducing this a lot easier if it only required 1 machine. > > I guess it'd work fine with only one machine, as IMO, the problem must be the > kernel not releasing buffers Too many variables. Also keep in mind that if you grow the socket buffers to hundreds of megabytes on a highmem machine, zone-normal will be exhausted too fast and you may run out of memory. 2.4.19pre9aa2 in such a case should at least return -ENOMEM and not deadlock. Andrea ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [BUG] 2.4 VM sucks. Again 2002-05-31 21:21 ` Andrea Arcangeli @ 2002-06-01 12:36 ` Roy Sigurd Karlsbakk 0 siblings, 0 replies; 48+ messages in thread From: Roy Sigurd Karlsbakk @ 2002-06-01 12:36 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: jlnance, linux-kernel > > I guess it'd work fine with only one machine, as IMO, the problem must be > > the kernel not releasing buffers > > too much variable. > > Also keep in mind if you grow the socket buffer to hundred mbyte on an > highmem machine the zone-normal will finish too fast and you may run out > of memory. 2.4.19pre9aa2 in such case should at least return -ENOMEM and > not deadlock. it's not a highmem machine. And. It's not user space processes using the memory -- Roy Sigurd Karlsbakk, Datavaktmester Computers are like air conditioners. They stop working when you open Windows. ^ permalink raw reply [flat|nested] 48+ messages in thread
end of thread, other threads:[~2002-08-30 18:47 UTC | newest] Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2002-05-23 13:11 [BUG] 2.4 VM sucks. Again Roy Sigurd Karlsbakk 2002-05-23 14:54 ` Martin J. Bligh 2002-05-23 16:29 ` Roy Sigurd Karlsbakk 2002-05-23 16:46 ` Martin J. Bligh 2002-05-24 10:04 ` Roy Sigurd Karlsbakk 2002-05-24 14:35 ` Martin J. Bligh 2002-05-24 19:32 ` Andrew Morton 2002-05-30 10:29 ` Roy Sigurd Karlsbakk 2002-05-30 19:28 ` Andrew Morton 2002-05-31 16:56 ` Roy Sigurd Karlsbakk 2002-05-31 18:19 ` Andrea Arcangeli 2002-06-18 11:26 ` Roy Sigurd Karlsbakk 2002-06-18 19:42 ` Andrew Morton 2002-06-19 11:26 ` Roy Sigurd Karlsbakk 2002-07-10 7:50 ` [2.4 BUFFERING BUG] (was [BUG] 2.4 VM sucks. Again) Roy Sigurd Karlsbakk 2002-07-10 8:05 ` Andrew Morton 2002-07-10 8:14 ` Roy Sigurd Karlsbakk 2002-08-28 9:28 ` [BUG+FIX] 2.4 buggercache sucks Roy Sigurd Karlsbakk 2002-08-28 15:30 ` Martin J. Bligh 2002-08-29 8:00 ` Roy Sigurd Karlsbakk 2002-08-29 13:42 ` Martin J. Bligh 2002-08-30 9:21 ` Roy Sigurd Karlsbakk 2002-08-30 17:19 ` Martin J. Bligh 2002-08-30 18:49 ` Andrew Morton 2002-05-24 15:11 ` [BUG] 2.4 VM sucks. Again Alan Cox 2002-05-24 15:53 ` Martin J. Bligh 2002-05-24 16:14 ` Alan Cox 2002-05-24 16:31 ` Martin J. Bligh 2002-05-24 17:30 ` Austin Gonyou 2002-05-24 17:43 ` Martin J. Bligh 2002-05-24 18:03 ` Austin Gonyou 2002-05-24 18:10 ` Martin J. Bligh 2002-05-24 18:29 ` 2.4 Kernel Perf discussion [Was Re: [BUG] 2.4 VM sucks. Again] Austin Gonyou 2002-05-24 19:01 ` Stephen Frost 2002-05-27 9:24 ` [BUG] 2.4 VM sucks. 
Again Marco Colombo 2002-05-27 22:24 ` Austin Gonyou 2002-05-27 23:08 ` Austin Gonyou 2002-05-27 11:12 ` Roy Sigurd Karlsbakk 2002-05-27 14:31 ` Alan Cox 2002-05-27 13:43 ` Roy Sigurd Karlsbakk 2002-05-23 16:03 ` Johannes Erdfelt 2002-05-23 16:33 ` Roy Sigurd Karlsbakk 2002-05-23 22:50 ` Luigi Genoni 2002-05-24 11:53 ` Roy Sigurd Karlsbakk 2002-05-23 18:12 ` jlnance 2002-05-24 10:36 ` Roy Sigurd Karlsbakk 2002-05-31 21:21 ` Andrea Arcangeli 2002-06-01 12:36 ` Roy Sigurd Karlsbakk