From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755605AbZCEAOR (ORCPT ); Wed, 4 Mar 2009 19:14:17 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1751809AbZCEAN6 (ORCPT ); Wed, 4 Mar 2009 19:13:58 -0500
Received: from accolon.hansenpartnership.com ([76.243.235.52]:59057
	"EHLO accolon.hansenpartnership.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1751445AbZCEAN5 (ORCPT );
	Wed, 4 Mar 2009 19:13:57 -0500
Subject: Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of
	scsi-sgpool)objects
From: James Bottomley
To: Thomas Gleixner
Cc: Jan Engelhardt , Boaz Harrosh , linux-scsi@vger.kernel.org,
	Linux Kernel Mailing List , linux-ide
In-Reply-To: <1236207389.21486.19.camel@localhost.localdomain>
References: <49ACF8FE.2020904@panasas.com>
	<1236093718.3263.3.camel@localhost.localdomain>
	<1236097526.3263.17.camel@localhost.localdomain>
	<1236119195.24019.24.camel@localhost.localdomain>
	<1236207389.21486.19.camel@localhost.localdomain>
Content-Type: text/plain
Date: Wed, 04 Mar 2009 18:13:49 -0600
Message-Id: <1236212029.32072.4.camel@localhost.localdomain>
Mime-Version: 1.0
X-Mailer: Evolution 2.22.3.1 (2.22.3.1-1.fc9)
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2009-03-04 at 22:56 +0000, James Bottomley wrote:
> On Wed, 2009-03-04 at 22:45 +0100, Thomas Gleixner wrote:
> > On Wed, 4 Mar 2009, Thomas Gleixner wrote:
> >
> > Instrumented the code and the result of the failing request is
> > below. Looks like the function which sets up the request gets
> > nr_phys_segments wrong by one.
> >
> > If you need further trace data feel free to ask.
>
> OK, the mapping all checks out correctly ... there must be something
> wrong with the way we count before mapping.
>
> If you're tracing everything, could you add these static prints to the
> trace ... they'll trigger a lot, but capturing how they applied to the
> failing request might tell us why the count is wrong.

Debugging this on IRC, this is the point we reached:

ftrace debugging patch:

http://tglx.de/~tglx/dbg.patch

We're tracing both blk_recalc_rq_segments() and
blk_phys_contig_segment().

The results are here:

http://tglx.de/~tglx/t.txt.bz2

Although what they show is that we're missing the point where the
counting goes wrong (blk_recalc_rq_segments only goes up to 5 max).

James
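
For anyone following along, here is a toy userspace model of how a
segment count taken before mapping can end up one off from the count
the mapping walk produces. It is only a sketch and not kernel code:
struct chunk, count_segments() and estimate_segments() are
hypothetical names invented for illustration, and the boundary
shortcut in the estimate is an assumption chosen to show the effect,
not a diagnosis of the bug in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one physically addressed piece of a request. */
struct chunk {
	uint64_t addr;	/* physical start address */
	uint32_t len;	/* length in bytes */
};

/*
 * Reference walk: adjacent chunks share a segment only if they are
 * physically contiguous AND the merged length stays within max_seg_size.
 */
static unsigned int count_segments(const struct chunk *c, size_t n,
				   uint32_t max_seg_size)
{
	unsigned int segs = 0;
	uint64_t seg_len = 0;	/* bytes in the segment being built */
	uint64_t seg_end = 0;	/* address just past the previous chunk */
	size_t i;

	assert(max_seg_size > 0);

	for (i = 0; i < n; i++) {
		int contig = segs > 0 && c[i].addr == seg_end;

		if (contig && seg_len + c[i].len <= max_seg_size) {
			seg_len += c[i].len;	/* extend current segment */
		} else {
			segs++;			/* start a new segment */
			seg_len = c[i].len;
		}
		seg_end = c[i].addr + c[i].len;
	}
	return segs;
}

/*
 * Estimate made from two pieces counted separately. The boundary test
 * looks at contiguity only and (deliberately, for this illustration)
 * ignores max_seg_size, so it can merge where the walk above splits.
 */
static unsigned int estimate_segments(const struct chunk *a, size_t na,
				      const struct chunk *b, size_t nb,
				      uint32_t max_seg_size)
{
	unsigned int segs = count_segments(a, na, max_seg_size) +
			    count_segments(b, nb, max_seg_size);

	if (na > 0 && nb > 0 && a[na - 1].addr + a[na - 1].len == b[0].addr)
		segs--;		/* assume the boundary chunks merge */
	return segs;
}
```

With two contiguous 4096-byte chunks and a 4096-byte segment limit,
the walk over the whole list yields two segments while the estimate
yields one. Raising the limit to 8192 makes the two agree again, which
is the kind of boundary-only disagreement that per-call trace output
would be expected to expose.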