From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Mon, 25 Mar 2019 09:41:29 -0700
From: Matthew Wilcox
To: "Darrick J. Wong"
Cc: Amir Goldstein, Dave Chinner, linux-xfs, Christoph Hellwig,
	linux-fsdevel
Subject: Re: [QUESTION] Long read latencies on mixed rw buffered IO
Message-ID: <20190325164129.GH10344@bombadil.infradead.org>
References: <20190325001044.GA23020@dastard>
 <20190325154731.GT1183@magnolia>
In-Reply-To: <20190325154731.GT1183@magnolia>
X-Mailing-List: linux-fsdevel@vger.kernel.org

On Mon, Mar 25, 2019 at 08:47:31AM -0700, Darrick J. Wong wrote:
> Hmmm.... so it looks like the rw_semaphore behavior has shifted over
> time, then?

Yes.

> I thought rwsem was supposed to queue read and write waiters in order,
> at least on x86?  Though I suppose that might not matter much since we
> can only run one writer at a time vs. waking up all the readers at
> once.  Now I'm wondering if there ever was a time when the readers all
> got batched to the front and starved the writers, but eh I haven't
> drank enough coffee to remember things like that.
:P

> (I wonder what would happen if rw_semaphore decided to wake up some
> number of the readers in the rwsem wait_list, not just the ones at the
> front...)

rwsems currently allow a limited amount of queue-jumping; if a semaphore
is currently not acquired (it's in transition between two owners), a
running process can acquire it.

I think it is a bug that we only wake readers at the front of the queue;
I think we would get better performance if we wake all readers.  ie here:

 	/*
-	 * Grant an infinite number of read locks to the readers at the front
-	 * of the queue. We know that woken will be at least 1 as we accounted
+	 * Grant an infinite number of read locks. We know that woken will
+	 * be at least 1 as we accounted
 	 * for above. Note we increment the 'active part' of the count by the
 	 * number of readers before waking any processes up.
 	 */
 	list_for_each_entry_safe(waiter, tmp, &sem->wait_list, list) {
 		struct task_struct *tsk;

-		if (waiter->type == RWSEM_WAITING_FOR_WRITE)
-			break;

Amir, it seems like you have a good test-case for trying this out ...
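
For context, the loop being quoted is the reader-wakeup scan in
__rwsem_mark_wake() (kernel/locking/rwsem-xadd.c as of v5.0).  Below is a
rough, untested sketch of what that loop might look like with the idea
applied; it is not a real patch.  Note that literally deleting the
RWSEM_WAITING_FOR_WRITE check would also dequeue queued writers as if they
were readers, so a wake-all-readers variant presumably wants to skip
writers (continue) rather than stop at the first one (break):

	/*
	 * Grant an infinite number of read locks.  We know that woken will
	 * be at least 1 as we accounted for above.  Note we increment the
	 * 'active part' of the count by the number of readers before waking
	 * any processes up.
	 */
	list_for_each_entry_safe(waiter, tmp, &sem->wait_list, list) {
		struct task_struct *tsk;

		/*
		 * Skip queued writers instead of stopping at the first one,
		 * so every reader currently on the wait list gets woken.
		 */
		if (waiter->type == RWSEM_WAITING_FOR_WRITE)
			continue;

		woken++;
		tsk = waiter->task;

		get_task_struct(tsk);
		list_del(&waiter->list);
		/*
		 * Publish the NULL task only after taking a reference, so a
		 * concurrently exiting reader cannot drop its last reference
		 * before we issue the wakeup.
		 */
		smp_store_release(&waiter->task, NULL);
		wake_q_add(wake_q, tsk);
	}

Whether this actually helps read latency, or merely shifts the starvation
onto writers, is exactly what a mixed read/write workload would show.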
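For anyone without that setup handy, here is a self-contained sketch of the
kind of workload under discussion -- not Amir's actual test case, just a
hypothetical mixed buffered read/write load (file name, size, thread counts
all made up) where reader latency spikes show up when readers queue behind
a buffered writer on the inode's rwsem:

/* mixedrw.c: NR_READERS threads time pread() while one thread issues
 * buffered pwrite() to the same file.  Build: gcc -O2 -pthread mixedrw.c */
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define FILE_SIZE	(64UL << 20)	/* 64MB test file */
#define BLK		4096
#define NR_READERS	4
#define NR_OPS		100000

static int fd;

static double now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

static off_t random_off(unsigned int *seed)
{
	return (off_t)(rand_r(seed) % (FILE_SIZE / BLK)) * BLK;
}

static void *writer(void *arg)
{
	char buf[BLK];
	unsigned int seed = 1;

	memset(buf, 'w', sizeof(buf));
	for (long i = 0; i < NR_OPS; i++)
		if (pwrite(fd, buf, sizeof(buf), random_off(&seed)) < 0)
			perror("pwrite");
	return NULL;
}

static void *reader(void *arg)
{
	char buf[BLK];
	unsigned int seed = (unsigned long)arg;
	double worst = 0.0;

	for (long i = 0; i < NR_OPS; i++) {
		double start = now_ms(), delta;

		if (pread(fd, buf, sizeof(buf), random_off(&seed)) < 0)
			perror("pread");
		delta = now_ms() - start;
		if (delta > worst)
			worst = delta;
	}
	printf("reader %lu: worst read latency %.1f ms\n",
	       (unsigned long)arg, worst);
	return NULL;
}

int main(void)
{
	pthread_t w, r[NR_READERS];

	fd = open("mixedrw.dat", O_RDWR | O_CREAT, 0644);
	if (fd < 0 || ftruncate(fd, FILE_SIZE) < 0) {
		perror("mixedrw.dat");
		return 1;
	}

	pthread_create(&w, NULL, writer, NULL);
	for (unsigned long i = 0; i < NR_READERS; i++)
		pthread_create(&r[i], NULL, reader, (void *)(i + 1));

	pthread_join(w, NULL);
	for (int i = 0; i < NR_READERS; i++)
		pthread_join(r[i], NULL);
	close(fd);
	return 0;
}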