From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.0 required=3.0 tests=DKIMWL_WL_MED,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BADEBC04EB8 for ; Mon, 26 Nov 2018 16:46:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 7A1C920855 for ; Mon, 26 Nov 2018 16:46:26 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=kernel-dk.20150623.gappssmtp.com header.i=@kernel-dk.20150623.gappssmtp.com header.b="AIf98ERr" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7A1C920855 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-block-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726735AbeK0DlE (ORCPT ); Mon, 26 Nov 2018 22:41:04 -0500 Received: from mail-it1-f196.google.com ([209.85.166.196]:36153 "EHLO mail-it1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727140AbeK0DlD (ORCPT ); Mon, 26 Nov 2018 22:41:03 -0500 Received: by mail-it1-f196.google.com with SMTP id c9so28764088itj.1 for ; Mon, 26 Nov 2018 08:46:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=OIl8Cn3W6krW4SCjJSLwhD4y1eoA0bru+Cwg1YFlJhk=; b=AIf98ERr1HTugFAQffOuzWX9GOmsqg6bIZwtsiJPOgGmOhk/XaPUDGvfoiz6bZ73Ug H+hi6bKuwiGY0iOWWaK1d3VQ8za1oSx8o6Dhuv4f0z4CxBnoH/h9Nm5NIPc0fsk3SlKZ IJZAVnHwG+GE/+PBwNItBP0rMUgbIn0+3VOLlUKpHiVmEEcado4QeFaJv0b/hc5pZwTF S57y11Y8QYB+wODOjO9yOaUpLmC8vBfo5czeCfjTKikrjXa5UAKXcU+v3dB+TGtu4Jza EuK6g8wwytUjUr4DNUj1KZXk1FukhdCbM2sahTu0ZDXOMmeJKWdvhDt2/4AvYTFQSFRJ pbQg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=OIl8Cn3W6krW4SCjJSLwhD4y1eoA0bru+Cwg1YFlJhk=; b=sVfmJnStbMFF/0wiV9+PuDTTvxi1UlqBKgSng7TWnJkjCKtYno9iG+pgeYjPNWOBbb T+86wbyoUM+C1AZJf7BnXOhQIi7JjApk8jGDAjHFVuZdZGciliD50TZg9zWP8zA/Ntbb sJ0n453EUDI6Zlkjcljxf0EhN0YnVA0dClKF1YsOVE9vyUpnMfo/zlUlL1zHpuHpvmTp KJwphv5EFfiBNFgxKOYDk551NQJdsAyLl3Qo+KtjCj24kQyQSPUf/UCb/XoeSl+OXn8F xmA+mpGRATo5i/j92ilfVLDj6yyh9dIizAO8kFzOsaStk0VB+CajpRaydak1MHUF9S3o IsEQ== X-Gm-Message-State: AGRZ1gIJuJJYd0gwzNRSM8ny1pUHflVjXHgYunAVJKTRokbFVgz78izs gedN3rVDkdaL/NOOLbbicpA0n/IKI0g= X-Google-Smtp-Source: AFSGD/V2NAJJ8+ykLJhzffrkpsxziOEl9uC+I73XOI19DuejhXfYm6U5UyhrCE1g5Co1cwVE5qOUQw== X-Received: by 2002:a02:2702:: with SMTP id g2mr23484940jaa.83.1543250783563; Mon, 26 Nov 2018 08:46:23 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id t65-v6sm486801ita.9.2018.11.26.08.46.21 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 26 Nov 2018 08:46:22 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-aio@kvack.org, linux-fsdevel@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 18/20] aio: add submission side request cache Date: Mon, 26 Nov 2018 09:45:42 -0700 Message-Id: <20181126164544.5699-19-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181126164544.5699-1-axboe@kernel.dk> References: <20181126164544.5699-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org We have to add each submitted polled request to the io_context poll_submitted list, which means we have to grab the poll_lock. We already use the block plug to batch submissions if we're doing a batch of IO submissions, extend that to cover the poll requests internally as well. Signed-off-by: Jens Axboe --- fs/aio.c | 136 +++++++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 113 insertions(+), 23 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index db73c8af1a0a..74afd1ff7fc9 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -247,6 +247,15 @@ static const struct address_space_operations aio_ctx_aops; static const unsigned int iocb_page_shift = ilog2(PAGE_SIZE / sizeof(struct iocb)); +/* + * We rely on block level unplugs to flush pending requests, if we schedule + */ +#ifdef CONFIG_BLOCK +static const bool aio_use_state_req_list = true; +#else +static const bool aio_use_state_req_list = false; +#endif + static void aio_useriocb_free(struct kioctx *); static void aio_iopoll_reap_events(struct kioctx *); @@ -1724,6 +1733,21 @@ static void aio_complete_rw_poll(struct kiocb *kiocb, long res, long res2) } } +struct aio_submit_state { + struct kioctx *ctx; + + struct blk_plug plug; +#ifdef CONFIG_BLOCK + struct blk_plug_cb plug_cb; +#endif + + /* + * Polled iocbs that have been submitted, but not added to the ctx yet + */ + struct list_head req_list; + unsigned int req_count; +}; + static int aio_prep_rw(struct aio_kiocb *kiocb, const struct iocb *iocb) { struct kioctx *ctx = kiocb->ki_ctx; @@ -1832,13 +1856,28 @@ static inline void aio_rw_done(struct kiocb *req, ssize_t ret) } +/* + * Called either at the end of IO submission, or through a plug callback + * because we're going to schedule. Moves out local batch of requests to + * the ctx poll list, so they can be found for polling + reaping. + */ +static void aio_flush_state_reqs(struct kioctx *ctx, + struct aio_submit_state *state) +{ + spin_lock(&ctx->poll_lock); + list_splice_tail_init(&state->req_list, &ctx->poll_submitted); + spin_unlock(&ctx->poll_lock); + state->req_count = 0; +} + /* * After the iocb has been issued, it's safe to be found on the poll list. * Adding the kiocb to the list AFTER submission ensures that we don't * find it from a io_getevents() thread before the issuer is done accessing * the kiocb cookie. */ -static void aio_iopoll_iocb_issued(struct aio_kiocb *kiocb) +static void aio_iopoll_iocb_issued(struct aio_submit_state *state, + struct aio_kiocb *kiocb) { /* * For fast devices, IO may have already completed. If it has, add @@ -1848,12 +1887,21 @@ static void aio_iopoll_iocb_issued(struct aio_kiocb *kiocb) const int front_add = test_bit(IOCB_POLL_COMPLETED, &kiocb->ki_flags); struct kioctx *ctx = kiocb->ki_ctx; - spin_lock(&ctx->poll_lock); - if (front_add) - list_add(&kiocb->ki_list, &ctx->poll_submitted); - else - list_add_tail(&kiocb->ki_list, &ctx->poll_submitted); - spin_unlock(&ctx->poll_lock); + if (!state || !aio_use_state_req_list) { + spin_lock(&ctx->poll_lock); + if (front_add) + list_add(&kiocb->ki_list, &ctx->poll_submitted); + else + list_add_tail(&kiocb->ki_list, &ctx->poll_submitted); + spin_unlock(&ctx->poll_lock); + } else { + if (front_add) + list_add(&kiocb->ki_list, &state->req_list); + else + list_add_tail(&kiocb->ki_list, &state->req_list); + if (++state->req_count >= AIO_IOPOLL_BATCH) + aio_flush_state_reqs(ctx, state); + } } static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb, @@ -2149,7 +2197,8 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb) } static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, - struct iocb __user *user_iocb, bool compat) + struct iocb __user *user_iocb, + struct aio_submit_state *state, bool compat) { struct aio_kiocb *req; ssize_t ret; @@ -2253,7 +2302,7 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, ret = -EAGAIN; goto out_put_req; } - aio_iopoll_iocb_issued(req); + aio_iopoll_iocb_issued(state, req); } return 0; out_put_req: @@ -2267,7 +2316,7 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, } static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, - bool compat) + struct aio_submit_state *state, bool compat) { struct iocb iocb, *iocbp; @@ -2288,7 +2337,44 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, iocbp = &iocb; } - return __io_submit_one(ctx, iocbp, user_iocb, compat); + return __io_submit_one(ctx, iocbp, user_iocb, state, compat); +} + +#ifdef CONFIG_BLOCK +static void aio_state_unplug(struct blk_plug_cb *cb, bool from_schedule) +{ + struct aio_submit_state *state; + + state = container_of(cb, struct aio_submit_state, plug_cb); + if (!list_empty(&state->req_list)) + aio_flush_state_reqs(state->ctx, state); +} +#endif + +/* + * Batched submission is done, ensure local IO is flushed out. + */ +static void aio_submit_state_end(struct aio_submit_state *state) +{ + blk_finish_plug(&state->plug); + if (!list_empty(&state->req_list)) + aio_flush_state_reqs(state->ctx, state); +} + +/* + * Start submission side cache. + */ +static void aio_submit_state_start(struct aio_submit_state *state, + struct kioctx *ctx) +{ + state->ctx = ctx; + INIT_LIST_HEAD(&state->req_list); + state->req_count = 0; +#ifdef CONFIG_BLOCK + state->plug_cb.callback = aio_state_unplug; + blk_start_plug(&state->plug); + list_add(&state->plug_cb.list, &state->plug.cb_list); +#endif } /* @@ -2312,10 +2398,10 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr, struct iocb __user * __user *, iocbpp) { + struct aio_submit_state state, *statep = NULL; struct kioctx *ctx; long ret = 0; int i = 0; - struct blk_plug plug; if (unlikely(nr < 0)) return -EINVAL; @@ -2329,8 +2415,10 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr, if (nr > ctx->nr_events) nr = ctx->nr_events; - if (nr > AIO_PLUG_THRESHOLD) - blk_start_plug(&plug); + if (nr > AIO_PLUG_THRESHOLD) { + aio_submit_state_start(&state, ctx); + statep = &state; + } for (i = 0; i < nr; i++) { struct iocb __user *user_iocb; @@ -2339,12 +2427,12 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr, break; } - ret = io_submit_one(ctx, user_iocb, false); + ret = io_submit_one(ctx, user_iocb, statep, false); if (ret) break; } - if (nr > AIO_PLUG_THRESHOLD) - blk_finish_plug(&plug); + if (statep) + aio_submit_state_end(statep); percpu_ref_put(&ctx->users); return i ? i : ret; @@ -2354,10 +2442,10 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr, COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id, int, nr, compat_uptr_t __user *, iocbpp) { + struct aio_submit_state state, *statep = NULL; struct kioctx *ctx; long ret = 0; int i = 0; - struct blk_plug plug; if (unlikely(nr < 0)) return -EINVAL; @@ -2371,8 +2459,10 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id, if (nr > ctx->nr_events) nr = ctx->nr_events; - if (nr > AIO_PLUG_THRESHOLD) - blk_start_plug(&plug); + if (nr > AIO_PLUG_THRESHOLD) { + aio_submit_state_start(&state, ctx); + statep = &state; + } for (i = 0; i < nr; i++) { compat_uptr_t user_iocb; @@ -2381,12 +2471,12 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id, break; } - ret = io_submit_one(ctx, compat_ptr(user_iocb), true); + ret = io_submit_one(ctx, compat_ptr(user_iocb), statep, true); if (ret) break; } - if (nr > AIO_PLUG_THRESHOLD) - blk_finish_plug(&plug); + if (statep) + aio_submit_state_end(statep); percpu_ref_put(&ctx->users); return i ? i : ret; -- 2.17.1