From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S263765AbTJ0Xd4 (ORCPT );
	Mon, 27 Oct 2003 18:33:56 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S263766AbTJ0Xd4 (ORCPT );
	Mon, 27 Oct 2003 18:33:56 -0500
Received: from mtvcafw.SGI.COM ([192.48.171.6]:8850 "EHLO rj.sgi.com")
	by vger.kernel.org with ESMTP id S263765AbTJ0Xdz (ORCPT );
	Mon, 27 Oct 2003 18:33:55 -0500
Date: Mon, 27 Oct 2003 15:33:52 -0800 (PST)
From: John Hawkes
Message-Id: <200310272333.h9RNXqaP2542634@babylon.engr.sgi.com>
To: linux-kernel@vger.kernel.org
Subject: Re: [BENCHMARK] I/O regression after 2.6.0-test5
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

I ran AIM7 on a 64p Altix NUMA (Itanium 2 Madison CPUs, 1.5 GHz), with
an XFS filesystem on each of 99 fiberchannel disks, with two different
flavors of AIM7 workloads, on both 2.6.0-test5 and 2.6.0-test8.  One
workload shows that -test8 is half the throughput of -test5.  The other
workload shows that -test8 is 22% faster.

The first workload is the default AIM7 "multiuser/shared system"
workload.  2.6.0-test8 shows a peak throughput of 52% of -test5 -- it
drops in half.  The -test8 peak at 64p is roughly the same as at 8p and
4p.  This bottleneck cannot be attributed solely to I/O hardware, since
our SGI ProPack kernel (based upon 2.4.21) achieves a peak throughput at
64p that is 30% higher than 2.6.0-test5's peak at 64p.

The second workload takes the same "multiuser/shared system" workload,
but turns off the O_SYNC flag for the three sync_* subtests.  This
generally produces higher peak throughput values than when using the
default O_SYNC, and the peak throughput values scale significantly
better as CPU counts increase.  Here the -test8 kernel is 22% *faster*
than -test5 at 64p (and 8x faster for no-O_SYNC than with-O_SYNC).

John Hawkes
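
For readers unfamiliar with the flag that separates the two workloads, the following is a minimal sketch of what "with O_SYNC" versus "without O_SYNC" means at the system-call level. It is not AIM7's code and does not reproduce the sync_* subtests; the function names (timed_writes, compare), the block size, and the write count are all hypothetical choices for illustration. It assumes a POSIX system where Python exposes os.O_SYNC.

```python
import os
import tempfile
import time


def timed_writes(path, extra_flags, count=50, size=4096):
    """Write `count` blocks of `size` bytes to `path`; return elapsed seconds.

    `extra_flags` is OR-ed into the open flags, so passing os.O_SYNC makes
    every write() wait until the data has reached stable storage.
    (Hypothetical helper, not part of AIM7.)
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | extra_flags
    fd = os.open(path, flags, 0o600)
    block = b"x" * size
    start = time.perf_counter()
    try:
        for _ in range(count):
            os.write(fd, block)
    finally:
        os.close(fd)
    return time.perf_counter() - start


def compare(directory):
    """Run the same write loop with and without O_SYNC in `directory`."""
    sync_path = os.path.join(directory, "sync.dat")
    nosync_path = os.path.join(directory, "nosync.dat")
    t_sync = timed_writes(sync_path, os.O_SYNC)
    t_nosync = timed_writes(nosync_path, 0)
    return (t_sync, t_nosync,
            os.path.getsize(sync_path), os.path.getsize(nosync_path))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        t_sync, t_nosync, _, _ = compare(d)
        print(f"O_SYNC: {t_sync:.4f}s   no O_SYNC: {t_nosync:.4f}s")
```

Both runs write identical data; only the open flags differ. Timings from a toy loop like this depend heavily on the filesystem and storage underneath and say nothing about the 64p results reported above.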