From: Anton Altaparmakov <aia21@cam.ac.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-sparse@vger.kernel.org, Christopher Li <sparse@chrisli.org>
Subject: Re: Bogus sparse warning?
Date: Tue, 13 Feb 2007 21:23:04 +0000 [thread overview]
Message-ID: <C9A59294-6927-4F50-9DEC-1A6CCA75A77D@cam.ac.uk> (raw)
In-Reply-To: <Pine.LNX.4.64.0702130758380.8424@woody.linux-foundation.org>
On 13 Feb 2007, at 16:10, Linus Torvalds wrote:
> On Tue, 13 Feb 2007, Anton Altaparmakov wrote:
>> Yes you are quite right in what it is flagging up. But I would
>> have assumed
>> that my current code is more efficient than alternatives. Here is
>> the code
>> from the first warning:
>>
>> /* Determine the runlist size. */
>> trl = rl + 1;
>> while (likely(trl->length))
>> trl++;
>> old_size = trl - runlist->rl + 1;
>>
>> Note "rl" and "trl" are "structs" of size 24 bytes (i.e. not a
>> power of 2).
>> What we are dealing with here is an array of those "structs" which is
>> terminated by a "struct" where its "->length" is zero. And I do
>> the above
>> loop to get the size, i.e. I look for the last element and then
>> subtract the
>> first element from the last element and add one to get the number
>> of elements
>> in the array.
>>
>> Is this not more efficient than say doing it using an index like so:
>>
>> for (old_size = 0; rl[old_size].length; old_size++)
>> ;
>> old_size++;
>>
>> My assumption has always been that using the rl[index] would
>> generate larger
>> and slower instructions because of the non-constant offset from
>> pointer base.
>
> I actually think that the second one is the much faster one
> (although they
> differ in where they start - your first code sequence starts at "rl
> +1",
> while the second one starts at "rl[0]".
Yes, sorry, that was just me being silly. I can just make it start
with "old_size = 1"...
> The compiler is quite possibly
> also able to avoid the "rl[index]" calculation in the loop, by
> rewriting
> it as
>
> for (old_size = 1; rl->length; old_size++)
> rl++;
>
> which you could obviously have done manually too (this assumes that
> you
> don't care about the original value of "rl" - use a temporary if
> you do.
Hm, good point! Yes, even I can see that that version would be
faster... I will switch the loops to doing this and see how many of
the warnings I can get rid of that way... Thanks for the suggestion!
>> Have I been going wrong all those years and in fact that is
>> better? Or is
>> there yet another way of doing the above more efficiently?
>
> It partly depends on how slow division is. The above "fastest" version
> uses two registers instead of one, and has two additions in the
> loop. It
> has two downsides: slightly higher register pressure and just the
> extra
> addition. However, on most archiectures the extra addition will be
> basically free (the "old_size" update doesn't depend on anything
> else than
> its previous iteration), and the register pressure will depend on
> how many
> registers you have.
>
> More importantly, it depends on whether the division is slow (which
> is a
> big "if" - the cost of a division like that can range from anything
> between 4 cycles to 50+ cycles) and how many iterations you
> normally go
> through. Is the common length small? Big?
Common should be very small. The "runlist" is basically an array of
extents describing where the (meta)data of an inode is located on
disk. So in an ideal world where there is no fragmentation on the
disk at all, all runlists would contain only two elements, one to
state "logical block 0 corresponds to physical block X and the number
of blocks in this extent is Y" and one to terminate the array which
would have a length of zero.
In reality there will be fragmentation so the number of extents will
increase.
As to how many there can be: A long time ago, the NTFS driver used to
crash when accessing some files. It turned out I was using kmalloc()
to allocate the array of extents and the array was so big for some
files that we tried to allocate more than kmalloc() allows (64kiB
IIRC or was it 128kiB?) so assuming the maximum is 64kiB (the
observations were on i386) that would mean that there are files out
there that have more than 2^16 / 24 = 2740 entries in the array...
Nowadays I allocate the array in multiples of PAGE_SIZE and if it is
less than or equal to a page we I kmalloc() and otherwise I use
vmalloc() which seems to work very well.
> So I don't think there is any final answer on this. On _many_
> architectures, division is fairly slow. On something like Core 2 or
> the
> Opteron, I think most small values are just 4 cycles to divide. On the
> other hand, the addition inside the loop will likely end up
> entirely free,
> since it would be a reg-reg ALU operation in a loop where there are
> likely
> cycles to schedule it.
Yes, that makes perfect sense. On modern pipelined CPUs it literally
should be totally free.
> Probably not a big deal in this case anyway. We added the warning
> when Al
> Viro found some really expensive stuff in the VM layer in some hot-
> paths.
> I doubt it matters for you.
It can be at least somewhat of a deal because this is the extent map
so it can be a very frequently traversed structure (e.g. in direct i/
o where we do not cache the on-disk location in buffers attached to
page cache pages...).
But in any case having sparse spew over 20 warnings is annoying
enough that I want to improve the code so the number of warnings at
least goes down. (-:
Thanks a lot for your advice!
Best regards,
Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer, http://www.linux-ntfs.org/
prev parent reply other threads:[~2007-02-13 21:23 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-02-12 10:05 Bogus sparse warning? Anton Altaparmakov
2007-02-12 18:28 ` Linus Torvalds
2007-02-12 19:14 ` Christopher Li
2007-02-12 19:53 ` Linus Torvalds
2007-02-13 8:30 ` Josh Triplett
2007-02-13 0:15 ` Anton Altaparmakov
2007-02-13 1:46 ` Christopher Li
2007-02-13 8:22 ` Josh Triplett
[not found] ` <20070213190400.GA9989@chrisli.org>
2007-02-13 23:01 ` Josh Triplett
2007-02-13 9:39 ` Anton Altaparmakov
2007-02-13 0:25 ` Anton Altaparmakov
2007-02-13 0:42 ` Linus Torvalds
2007-02-13 9:53 ` Anton Altaparmakov
2007-02-13 16:10 ` Linus Torvalds
2007-02-13 21:23 ` Anton Altaparmakov [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=C9A59294-6927-4F50-9DEC-1A6CCA75A77D@cam.ac.uk \
--to=aia21@cam.ac.uk \
--cc=linux-sparse@vger.kernel.org \
--cc=sparse@chrisli.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.