From: Jens Axboe
Subject: [PATCHSET] block: IO polling improvements
Date: Tue, 1 Nov 2016 15:05:21 -0600
Message-ID: <1478034325-28232-1-git-send-email-axboe@fb.com>
X-Mailing-List: linux-kernel@vger.kernel.org

This builds on top of Christoph's simplified bdev O_DIRECT code, posted
earlier today [1].

This patchset adds support for a hybrid polling mode, where a poll cycle
can be split into an upfront sleep, then a busy poll. On the devices
where we care about IO polling, we generally have fairly deterministic
completion latencies. So this patchset pulls in the stats code from the
buffered writeback code, and uses that to poll a device more
intelligently.

Let's assume a device completes IO in 8 usecs. When we poll now, we busy
loop for 8 usec. If we know the rough completion time of the IO, it's a
lot more efficient to sleep for a bit, then poll for the last few usecs
instead.

This adds a sysfs file, /sys/block//queue/io_poll_delay, which makes the
polling behave as follows:

-1	Never enter hybrid sleep, always poll
0	Use half of the completion mean for this request type for the
	sleep delay
>0	Use this specific value as the sleep delay

We default to -1, which is the old behavior.
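
As a rough illustration of the three io_poll_delay modes, here is a
small userspace sketch in Python. It is not the kernel code from this
patchset: the function and parameter names are hypothetical, and it
assumes a positive setting is given in microseconds, which is not stated
above.

```python
def hybrid_sleep_delay_usec(io_poll_delay, mean_completion_usec):
    """Return how long to sleep before busy polling, in microseconds.

    Models the three io_poll_delay modes described above. The names are
    hypothetical and are not kernel identifiers.
    """
    if io_poll_delay < 0:
        # -1: classic polling, never enter the hybrid sleep
        return 0.0
    if io_poll_delay == 0:
        # 0: sleep for half of the mean completion time
        return mean_completion_usec / 2
    # >0: use the configured value directly (assumed to be in usec)
    return float(io_poll_delay)


def busy_poll_usec(io_poll_delay, mean_completion_usec):
    """Busy-poll time left if the IO completes exactly at the mean."""
    sleep = hybrid_sleep_delay_usec(io_poll_delay, mean_completion_usec)
    # If the sleep overshoots the completion time, no polling remains.
    return max(mean_completion_usec - sleep, 0.0)
```

For the 8 usec device in the example, mode 0 would sleep for 4 usec and
then busy poll for roughly the remaining 4 usec, while mode -1 would
busy poll for the full 8 usec.
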
Initial testing with '0' has been positive, retaining 93% of the
performance at roughly 50% CPU instead of the 100% with classic polling.

You can also find this code in my for-4.10/dio branch.

[1] https://marc.info/?l=linux-kernel&m=147793678521108&w=2