From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752483Ab0LLBkg (ORCPT ); Sat, 11 Dec 2010 20:40:36 -0500
Received: from thunk.org ([69.25.196.29]:33042 "EHLO thunker.thunk.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751323Ab0LLBkc (ORCPT ); Sat, 11 Dec 2010 20:40:32 -0500
Date: Sat, 11 Dec 2010 20:40:13 -0500
From: "Ted Ts'o"
To: Jon Nelson
Cc: Matt, Chris Mason, Andi Kleen, Mike Snitzer, Milan Broz,
	linux-btrfs, dm-devel, Linux Kernel, htd, htejun, linux-ext4
Subject: Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)
Message-ID: <20101212014013.GF3059@thunk.org>
Mail-Followup-To: Ted Ts'o, Jon Nelson, Matt, Chris Mason, Andi Kleen,
	Mike Snitzer, Milan Broz, linux-btrfs, dm-devel, Linux Kernel,
	htd, htejun, linux-ext4
References: <20101209201359.GG2921@thunk.org>
	<20101209231616.GA12515@basil.fritz.box>
	<1291945065-sup-1838@think>
	<20101210023852.GB3059@thunk.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.20 (2009-06-14)
X-SA-Exim-Mail-From: tytso@thunk.org
X-SA-Exim-Scanned: No (on thunker.thunk.org); SAEximRunCond expanded to false
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Dec 10, 2010 at 08:14:56PM -0600, Jon Nelson wrote:
> > Barring false negatives, bd2d0210cf22f2bd0cef72eb97cf94fc7d31d8cc
> > appears to be the culprit (according to git bisect).
> > I will test bd2d0210cf22f2bd0cef72eb97cf94fc7d31d8cc again, confirm
> > the behavior, and work backwards to try to reduce the possibility of
> > false negatives.
>
> A few additional notes, in no particular order:
>
> - For me, triggering the problem is fairly easy when encryption is involved.
> - I'm now 81 iterations into testing
> bd2d0210cf22f2bd0cef72eb97cf94fc7d31d8cc *without* encryption. Out of
> 81 iterations, I have 4 failures: #16, 40, 62, and 64.
>
> I will now try 1de3e3df917459422cb2aecac440febc8879d410 much more extensively.
>
> Is this useful information?

Yes, indeed. Is this in the virtualized environment or on real
hardware at this point? And how many CPUs do you have configured in
your virtualized environment, and how much memory? Is having a certain
number of CPUs critical for reproducing the problem? Is constricting
the amount of memory important?

It'll be a lot easier if I can reproduce it locally, which is why I'm
asking all of these questions.

Thanks,

					- Ted