Date: Fri, 20 Jan 2006 04:17:27 -0800
From: Andrew Morton
To: Jens Axboe
Cc: davej@redhat.com, AChittenden@bluearc.com, linux-kernel@vger.kernel.org, lwoodman@redhat.com
Subject: Re: Out of Memory: Killed process 16498 (java).

Jens Axboe wrote:
>
> On Fri, Jan 20 2006, Andrew Morton wrote:
> > Jens Axboe wrote:
> > >
> > > On Thu, Jan 19 2006, Andrew Morton wrote:
> > > > Dave Jones wrote:
> > > > >
> > > > > On Thu, Jan 19, 2006 at 03:11:45PM -0000, Andy Chittenden wrote:
> > > > > >  DMA free:20kB min:24kB low:28kB high:36kB active:0kB inactive:0kB
> > > > > >  present:12740kB pages_scanned:4 all_unreclaimable? yes
> > > > >
> > > > > Note we only scanned 4 pages before we gave up.
> > > > > Larry Woodman came up with the patch below that clears all_unreclaimable
> > > > > in two places where we've made progress at freeing up some pages,
> > > > > which has helped OOM situations for some of our users.
> > > >
> > > > That won't help - there are exactly zero pages on ZONE_DMA's LRU.
> > > >
> > > > The problem appears to be that all of the DMA zone has been gobbled up by
> > > > the BIO layer.  It seems quite inappropriate that a modern 64-bit machine
> > > > is allocating tons of disk I/O pages from the teeny ZONE_DMA.  I'm
> > > > suspecting that someone has gone and set a queue's ->bounce_gfp to the
> > > > wrong thing.
> > > >
> > > > Jens, would you have time to investigate please?
> > >
> > > Certainly, I'll get this tested and fixed this afternoon.
> >
> > Wow ;)
> >
> > You may find it's an x86_64 glitch - setting max_[low_]pfn wrong down in
> > the bowels of the arch mm init code, something like that.
> >
> > I thought it might have been a regression which came in when we added
> > ZONE_DMA32, but the RH reporter's kernel is based on 2.6.14-, and he
> > didn't have ZONE_DMA32.
>
> Sorry, spoke too soon - I thought this was the 'bio/scsi leaks' report,
> which most likely is a scsi leak that also results in the bios not
> getting freed.
>
> This DMA32 zone shortage looks like a VM shortcoming, you're likely the
> better candidate to fix that :-)

It's not ZONE_DMA32.  It's the 12MB ZONE_DMA which is being exhausted on
this 4GB 64-bit machine.
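For context, the path under suspicion works roughly like this - a
simplified sketch of the 2.6-era bounce-limit logic, not the literal
kernel source:

/*
 * Simplified sketch, not the exact 2.6.x source.  A driver tells the
 * block layer the highest address its device can DMA to; if that limit
 * falls below the low-memory boundary, every bounce/copy page for the
 * queue is allocated with GFP_DMA, i.e. from the 12MB ZONE_DMA.
 */
void blk_queue_bounce_limit(request_queue_t *q, u64 dma_addr)
{
	unsigned long bounce_pfn = dma_addr >> PAGE_SHIFT;

	if (bounce_pfn < blk_max_low_pfn)
		q->bounce_gfp = GFP_NOIO | GFP_DMA;	/* ZONE_DMA only */
	else
		q->bounce_gfp = GFP_NOIO;		/* any zone is fine */

	q->bounce_pfn = bounce_pfn;
}

bio_copy_user() then allocates its copy pages with (roughly)
alloc_page(q->bounce_gfp | GFP_USER), so a bounce_pfn derived from a bad
max_low_pfn would make every SG_IO ioctl chew ZONE_DMA pages - which is
consistent with the trace below.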
Andy put a dump_stack() into the oom code and it pointed at:

Call Trace: {out_of_memory+48}
            {__alloc_pages+536}
            {bio_alloc_bioset+232}
            {bio_copy_user+218}
            {blk_rq_map_user+136}
            {sg_io+328}
            {scsi_cmd_ioctl+491}
            {:ide_core:generic_ide_ioctl+631}
            {:sd_mod:sd_ioctl+371}
            {schedule_timeout+158}
            {blkdev_ioctl+1365}
            {sys_sendto+251}
            {__pollwait+0}
            {block_ioctl+25}
            {do_ioctl+24}
            {vfs_ioctl+541}
            {sys_ioctl+89}
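The instrumentation behind that trace is a one-liner; a hypothetical
sketch against the 2.6.15-era mm/oom_kill.c (the exact hunk isn't shown
in the thread):

void out_of_memory(gfp_t gfp_mask, int order)
{
	/*
	 * Hypothetical debugging hunk: print the call chain of the
	 * allocation that pushed us into the OOM killer, before any
	 * victim selection runs.
	 */
	dump_stack();

	/* ... existing victim-selection and kill logic ... */
}

Reading the trace bottom-up, the live path is sys_ioctl -> blkdev_ioctl
-> sg_io -> blk_rq_map_user -> bio_copy_user -> __alloc_pages; entries
such as sys_sendto and __pollwait are most likely stale values left on
the stack by the x86_64 unwinder.  That is the SG_IO copy path
allocating with the queue's bounce_gfp, consistent with the ->bounce_gfp
theory above.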