Date: Fri, 13 Mar 2015 19:31:22 +0800
From: Fam Zheng
To: Jason Baron
Cc: linux-kernel@vger.kernel.org, Thomas Gleixner, Ingo Molnar,
 "H. Peter Anvin", x86@kernel.org, Alexander Viro, Andrew Morton,
 Kees Cook, Andy Lutomirski, David Herrmann, Alexei Starovoitov,
 Miklos Szeredi, David Drysdale, Oleg Nesterov, "David S. Miller",
 Vivek Goyal, Mike Frysinger, "Theodore Ts'o", Heiko Carstens,
 Rasmus Villemoes, Rashika Kheria, Hugh Dickins, Mathieu Desnoyers,
 Peter Zijlstra, linux-fsdevel@vger.kernel.org,
 linux-api@vger.kernel.org, Josh Triplett,
 "Michael Kerrisk (man-pages)", Paolo Bonzini, Omar Sandoval,
 Jonathan Corbet, shane.seymour@hp.com, dan.j.rosenberg@gmail.com
Subject: Re: [PATCH v4 0/9] epoll: Introduce new syscalls,
 epoll_ctl_batch and epoll_pwait1
Message-ID: <20150313113122.GA7427@ad.nay.redhat.com>
References: <1425952155-27603-1-git-send-email-famz@redhat.com>
 <5501AA6B.2020209@akamai.com>
In-Reply-To: <5501AA6B.2020209@akamai.com>

On Thu, 03/12 11:02, Jason Baron wrote:
> On 03/09/2015 09:49 PM, Fam Zheng wrote:
> >
> > Benchmark for epoll_pwait1
> > ==========================
> >
> > By running fio tests inside a VM with both the original and the
> > modified QEMU, we can compare their difference in performance.
> >
> > With a small VM setup [t1], the original QEMU (ppoll based) has a
> > 4k read latency overhead of around 37 us. In this setup, the main
> > loop polls 10~20 fds.
> >
> > With a slightly larger VM instance [t2] - attached a virtio-serial
> > device so that there are 80~90 fds in the main loop - the original
> > QEMU has a latency overhead of around 49 us. By adding more such
> > devices [t3], we can see the latency go even higher - 83 us with
> > ~200 fds.
> >
> > Now, modifying QEMU to use epoll_pwait1 and testing again, the
> > latency overheads are respectively 36 us, 37 us and 47 us for t1,
> > t2 and t3.
> >
>
> Hi,
>
> So it sounds like you are comparing the original qemu code (which
> was using ppoll) vs. using epoll with these new syscalls. Curious if
> you have numbers comparing the existing epoll (with, say, the
> timerfd in your epoll set), so we can see the improvement relative
> to epoll.

I did compare them, but the numbers are too close to show a
difference. The improvement from epoll_pwait1 doesn't really help the
hot path of guest IO, but it does affect timer precision, which
matters to the various device emulations in QEMU.

Although it's subtle and difficult to summarize here, the IO
throttling implementation in QEMU is one example that shows the
significance:

The throttling algorithm computes a delay for the next IO, which is
used to arm a timer that holds the request back a little. Because
timeouts are always rounded *up* to the effective granularity, the
1 ms resolution of epoll_pwait's timeout is too coarse and leads to
severe inaccuracy for sub-millisecond delays. With epoll_pwait1 we
can avoid the rounding-up.
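To make the rounding concrete, here is a rough sketch (untested; the
250 us delay is only an illustrative value, and the epoll_pwait1 call
assumes the timespec-based signature proposed in this series, which is
of course not a merged interface, so it is left in a comment):

    #define _GNU_SOURCE     /* for epoll_pwait() */
    #include <sys/epoll.h>
    #include <signal.h>
    #include <time.h>

    static int wait_with_throttle_delay(int epfd)
    {
        struct epoll_event events[16];
        sigset_t sigmask;
        int ms;

        sigemptyset(&sigmask);

        /* Suppose throttling computed a 250 us delay for the next
         * request (an illustrative number, not a measurement). */
        struct timespec delay = { .tv_sec = 0, .tv_nsec = 250000 };

        /* Today: the timeout is an int in milliseconds, so the delay
         * must be rounded up -- 250 us becomes a full 1 ms, 4x what
         * the throttling algorithm asked for. */
        ms = (delay.tv_nsec + 999999) / 1000000;        /* -> 1 */
        return epoll_pwait(epfd, events, 16, ms, &sigmask);

        /* Proposed (signature as assumed above): the timeout is a
         * struct timespec, so the 250 us passes through at nanosecond
         * granularity with no rounding:
         *
         *     return epoll_pwait1(epfd, 0, events, 16, &delay,
         *                         &sigmask);
         */
    }

Every timer armed through the main loop inherits that 1 ms floor,
which is where the throttling inaccuracy comes from.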
I think this capability could be pretty generally desired by other
applications, too.

Regarding the epoll_ctl_batch improvement, again, it does not visibly
move the numbers in the small workloads I managed to test. Of course,
if you have a specific application scenario in mind, I will try it. :)

Thanks,

Fam