Date: Wed, 29 Jun 2022 16:22:04 -0400
From: Taylor Blau
To: Abhradeep Chakraborty
Cc: Git, Kaartic Sivaraam, Derrick Stolee
Subject: Re: [PATCH v2 4/6] pack-bitmap: prepare to read lookup table
In-Reply-To: <20220628085950.19288-1-chakrabortyabhradeep79@gmail.com>
X-Mailing-List: git@vger.kernel.org

On Tue, Jun 28, 2022 at 02:29:50PM +0530, Abhradeep Chakraborty wrote:
> Taylor Blau wrote:
>
> > ...exactly my thoughts, too.
> > It's possible that it would be faster to key this search on the
> > object_id "oid" above, and then convert each of the entries in the
> > lookup table from a uint32_t into an object_id by calling
> > nth_bitmap_object_oid() repeatedly.
> >
> > I *think* that what Abhradeep wrote here is going to be faster more
> > often than not, since it makes more efficient use of the page cache
> > rather than switching between reads across different memory-mapped
> > regions at each point in the binary search.
> >
> > But of course that depends on a number of factors. Abhradeep: if
> > you're up for it, I think it would be worth trying it both ways and
> > seeing if one produces a meaningful speed-up or slow-down over the
> > other. Like I said: my guess is that what you have now will be
> > faster, but I don't have a clear sense that that is true without
> > trying it both ways ;-).
>
> Ok. Let me try both ways. In my opinion, my version does less
> searching and less computation, so I want to stick with it. But I'd
> also like to try the other one so that we can pick the best of the
> two.

Yeah, I agree with your general sense that the version as written is
going to be faster. We're comparing a smaller datatype (IOW, a 4-byte
integer that can be checked for equality in a single instruction,
instead of comparing two 20-byte OIDs), and likely flushing the cache
far less often.

But having two concrete implementations to compare will help us know
for a fact that our intuition is correct. I'll be curious to see what
you find here!

> > I think starting off with a small array and then letting it grow
> > according to alloc_nr() would be fine here, since it will grow more
> > and more each time, so the number of times we have to reallocate the
> > buffer will tail off over time.
>
> What should be the size of that array?

I think some small power of two would be a reasonable choice here.

> > If we were really concerned about it, we could treat the buffer as a
> > static pointer and reuse it over time (making sure to clear out the
> > portions of it we're going to reuse, or otherwise ensuring that we
> > don't read old data). But I doubt it matters much either way in
> > practice: the individual records are small (at just 4 bytes each)
> > and entry_count is often less than 1,000, so I think this probably
> > has a vanishingly small impact.
>
> Before submitting it to the mailing list, I did use the ALLOC_GROW
> macro. But my version was worse than yours: on every iteration I was
> reallocating the array to hold `size+1` positions. I later dropped
> that code because it seemed too expensive.

That shouldn't be the case. When you have a chance, take a look at the
alloc_nr macro, which shows how much memory we allocate at each step:

    #define alloc_nr(x) (((x)+16)*3/2)

Suppose we allocated 16 slots initially, so nr (the number of entries
stored in the list) is 0 and alloc (the number of entries allocated) is
16. Then when we try to add the 17th item, we'll pass 16 to alloc_nr,
which will allocate 48 slots. Then 96, then 168, and so on.

We only have to reallocate and copy the array when nr exceeds alloc,
which should happen fairly infrequently, and less and less often as the
array grows.

Thanks,
Taylor
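
For reference, here is a small illustrative sketch of the two search
strategies being compared above. It is not code from this series: the
lookup table is modeled as a bare, sorted array of 4-byte commit
positions held in host byte order (the real table lives in a big-endian
mmap'd region and stores more per entry), and pos_to_oid() is a
hypothetical stand-in for nth_bitmap_object_oid().

    #include <stdint.h>
    #include <string.h>

    #define OID_LEN 20 /* raw SHA-1 length; the comparison works the same for SHA-256 */

    /*
     * Strategy (a), roughly what the series does: the caller has already
     * resolved the commit's position in the pack index, so every probe of
     * the binary search compares two 4-byte integers and touches only the
     * lookup-table region.
     */
    int lookup_by_pos(const uint32_t *table, uint32_t nr, uint32_t pos)
    {
        uint32_t lo = 0, hi = nr;

        while (lo < hi) {
            uint32_t mi = lo + (hi - lo) / 2;

            if (table[mi] == pos)
                return (int)mi;
            if (table[mi] < pos)
                lo = mi + 1;
            else
                hi = mi;
        }
        return -1; /* no bitmap stored for this commit */
    }

    /*
     * Strategy (b), the alternative floated above: key the search on the
     * raw object ID and convert each probed entry back into an OID on the
     * fly. pos_to_oid() is a hypothetical stand-in for
     * nth_bitmap_object_oid(); the point is that every iteration now also
     * reads from the pack .idx mapping. The search is still valid because
     * pack-index positions are assigned in OID order, so the table is
     * sorted either way.
     */
    typedef void (*pos_to_oid_fn)(uint32_t pos, unsigned char out[OID_LEN]);

    int lookup_by_oid(const uint32_t *table, uint32_t nr,
                      const unsigned char oid[OID_LEN],
                      pos_to_oid_fn pos_to_oid)
    {
        uint32_t lo = 0, hi = nr;
        unsigned char probe[OID_LEN];

        while (lo < hi) {
            uint32_t mi = lo + (hi - lo) / 2;
            int cmp;

            pos_to_oid(table[mi], probe); /* extra read per probe */
            cmp = memcmp(oid, probe, OID_LEN);
            if (!cmp)
                return (int)mi;
            if (cmp > 0)
                lo = mi + 1;
            else
                hi = mi;
        }
        return -1;
    }

The page-cache argument is visible in the second version: every probe
of the binary search has to touch the pack .idx mapping in addition to
the lookup table itself.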
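
And a standalone sketch of the growth pattern described above. The
alloc_nr()/ALLOC_GROW() definitions here are simplified stand-ins that
mirror git's macros closely enough to show the behavior (the real
ALLOC_GROW uses REALLOC_ARRAY, which dies on allocation failure); the
uint32_t positions array and the 1,000-entry loop are just example
values.

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    static void *xrealloc(void *ptr, size_t size)
    {
        void *ret = realloc(ptr, size);
        if (!ret) {
            perror("realloc");
            exit(1);
        }
        return ret;
    }

    #define alloc_nr(x) (((x)+16)*3/2)

    /* Simplified stand-in for git's ALLOC_GROW(). */
    #define ALLOC_GROW(x, nr, alloc) \
        do { \
            if ((nr) > (alloc)) { \
                if (alloc_nr(alloc) < (nr)) \
                    (alloc) = (nr); \
                else \
                    (alloc) = alloc_nr(alloc); \
                (x) = xrealloc((x), (alloc) * sizeof(*(x))); \
            } \
        } while (0)

    int main(void)
    {
        size_t nr = 0, alloc = 16; /* 16 slots allocated up front */
        uint32_t *positions = xrealloc(NULL, alloc * sizeof(*positions));

        for (uint32_t i = 0; i < 1000; i++) {
            size_t before = alloc;

            /* The usual idiom: make room for one more entry, then append. */
            ALLOC_GROW(positions, nr + 1, alloc);
            if (alloc != before)
                printf("entry %zu: grew from %zu to %zu slots\n",
                       nr + 1, before, alloc);
            positions[nr++] = i;
        }
        /* Prints growth at entries 17 (16 -> 48), 49 (48 -> 96),
         * 97 (96 -> 168), 169 (168 -> 276), and so on. */
        free(positions);
        return 0;
    }

Growing geometrically like this keeps the number of reallocations
roughly logarithmic in the final size, whereas reallocating to hold
exactly `size+1` entries on every append re-copies the array on each
insertion.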