From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752460Ab1IDWU0 (ORCPT ); Sun, 4 Sep 2011 18:20:26 -0400
Received: from acsinet15.oracle.com ([141.146.126.227]:52889 "EHLO
	acsinet15.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751072Ab1IDWUT (ORCPT ); Sun, 4 Sep 2011 18:20:19 -0400
To: "Andi Kleen"
Cc: "Martin K. Petersen" , djwong@us.ibm.com, "Greg Freemyer" ,
	"Andreas Dilger" , "Theodore Tso" , "Sunil Mushran" ,
	"Amir Goldstein" , "linux-kernel" , "Mingming Cao" ,
	"Joel Becker" , "linux-fsdevel" , linux-ext4@vger.kernel.org,
	"Coly Li"
Subject: Re: [PATCH v1 00/16] ext4: Add metadata checksumming
From: "Martin K. Petersen"
Organization: Oracle
References: <20110901003030.31048.99467.stgit@elm3c44.beaverton.ibm.com>
	<20110902182214.GC12086@tux1.beaverton.ibm.com>
	<6fdb58aed1dae8020900e65cbfb34b28.squirrel@www.firstfloor.org>
	<88f8907569619968aa7b4a8c669d5aba.squirrel@www.firstfloor.org>
Date: Sun, 04 Sep 2011 18:19:16 -0400
In-Reply-To: <88f8907569619968aa7b4a8c669d5aba.squirrel@www.firstfloor.org>
	(Andi Kleen's message of "Sun, 4 Sep 2011 19:44:55 +0200")
Message-ID:
User-Agent: Gnus/5.110017 (No Gnus v0.17) Emacs/23.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Source-IP: rtcsinet22.oracle.com [66.248.204.30]
X-CT-RefId: str=0001.0A090206.4E63F98E.0123,ss=1,re=0.000,fgs=0
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

>>>>> "Andi" == Andi Kleen writes:

Andi> Doesn't have any performance numbers.

It's been a while since I read them. I thought they had some compelling
numbers. Anyway, made a big difference in real life testing here. For
sustained I/O we're talking an order of magnitude.

Andi> You need to keep in mind that PCLMULQDQ uses FPU state, so any
Andi> speedup for the kernel must be large enough to amortize the cost
Andi> of saving the FPU state.

Yeah, my test cases were for bulk database I/O, not for writing a
handful of fs metadata blocks. Plus for the DB tests the CRC was
generated in userland.

I seem to recall Joel picking something other than the hw-accelerated
CRC32C for ocfs2 metadata and that didn't cause any problems.

That said, I do see a difference between IP checksum and CRC on normal
FS workloads with DIX enabled here.

Andi> Typically that only works out for quite large buffers, but kernel
Andi> buffers are relatively small.

*nod*

-- 
Martin K. Petersen	Oracle Linux Engineering
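
The amortization point quoted above (a fixed FPU save/restore cost that
only pays off on large buffers) can be put into a rough break-even
model. The sketch below is illustrative only: the function name
break_even_bytes, the linear cost model, and every number in it are
assumptions made for the example, not measurements from this thread or
from any kernel.

```python
def break_even_bytes(fixed_overhead_ns, sw_bytes_per_ns, hw_bytes_per_ns):
    """Buffer size above which the accelerated checksum path is faster.

    Assumed (simplified) cost model:
        t_sw(n) = n / sw_bytes_per_ns
        t_hw(n) = fixed_overhead_ns + n / hw_bytes_per_ns
    Returns None if the accelerated path is not faster per byte.
    """
    if hw_bytes_per_ns <= sw_bytes_per_ns:
        return None  # the fixed overhead can never be amortized
    per_byte_saving_ns = 1.0 / sw_bytes_per_ns - 1.0 / hw_bytes_per_ns
    return fixed_overhead_ns / per_byte_saving_ns


# Hypothetical figures, chosen only to show the shape of the trade-off.
FIXED_OVERHEAD_NS = 1000.0  # assumed cost of saving and restoring FPU state
SW_RATE = 1.0               # assumed software checksum rate, bytes per ns
HW_RATE = 10.0              # assumed PCLMULQDQ-based rate, bytes per ns

threshold = break_even_bytes(FIXED_OVERHEAD_NS, SW_RATE, HW_RATE)

for size in (512, 4096, 1 << 20):
    t_sw = size / SW_RATE
    t_hw = FIXED_OVERHEAD_NS + size / HW_RATE
    winner = "hw" if t_hw < t_sw else "sw"
    print(f"{size:>8} bytes: sw {t_sw:10.1f} ns, hw {t_hw:10.1f} ns -> {winner}")

print(f"break-even near {threshold:.0f} bytes under these assumptions")
```

With these made-up inputs a 512-byte buffer is cheaper on the software
path while 4 KiB and larger favor the accelerated one. Where the real
crossover sits depends on the actual save/restore cost and checksum
rates of the machine, which this model does not claim to know.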