On Fri, Oct 30, 2015 at 2:23 PM, Linus Torvalds wrote:
> On Fri, Oct 30, 2015 at 2:02 PM, Al Viro wrote:
>>
>> Your variant has 1:64 ratio; obviously better than now, but we can actually
>> do 1:bits-per-cacheline quite easily.
>
> Ok, but in that case you end up needing a counter for each cacheline
> too (to count how many bits, in order to know when to say "cacheline
> is entirely full").

So here's a largely untested version of my "one bit per word" approach.
It seems to work, but looking at it, I'm unhappy with a few things:

 - using kmalloc() for the .full_fds_bits[] array is simple, but
   disgusting, since 99% of all programs just have a single word.

   I know I talked about just adding the allocation to the same one
   that allocates the bitmaps themselves, but I got lazy and didn't do
   it. Especially since that code seems to try fairly hard to make the
   allocations nice powers of two, according to the comments. That may
   actually matter from an allocation standpoint.

 - Maybe we could just use that "full_fds_bits_init" field for when a
   single word is sufficient, and avoid the kmalloc that way?

Anyway. This is a pretty simple patch, and I actually think that we
could just get rid of the "next_fd" logic entirely with this. That
would make this *patch* be more complicated, but it would make the
resulting *code* be simpler.

Hmm? Want to play with this? Eric, what does this do to your test-case?

              Linus
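
For readers following along, the "one bit per word" idea from the message
above can be sketched in user-space C roughly as follows. This is not the
patch itself and not kernel code: the struct name, the helper names
(map_set, map_clear, map_find_free) and the fixed 64-word size are all
assumptions made for illustration. A summary word keeps one bit per bitmap
word, set when that word is entirely full, so a search for a free slot can
skip up to 64 full words with a single lookup.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_BITS 64
#define NWORDS    64    /* assumed fixed size: 64 words * 64 bits = 4096 slots */

/* Hypothetical two-level bitmap; names are illustrative only. */
struct two_level_map {
    uint64_t bits[NWORDS];  /* one bit per slot */
    uint64_t full;          /* bit w is set iff bits[w] is all ones */
};

/* Index of the lowest zero bit in w, or WORD_BITS if w is all ones. */
static int first_zero(uint64_t w)
{
    for (int i = 0; i < WORD_BITS; i++)
        if (!(w & ((uint64_t)1 << i)))
            return i;
    return WORD_BITS;
}

/* Mark a slot as used and update the summary bit if its word filled up. */
static void map_set(struct two_level_map *m, int slot)
{
    assert(slot >= 0 && slot < NWORDS * WORD_BITS);
    int w = slot / WORD_BITS;
    m->bits[w] |= (uint64_t)1 << (slot % WORD_BITS);
    if (m->bits[w] == UINT64_MAX)
        m->full |= (uint64_t)1 << w;
}

/* Mark a slot as free; its word can no longer be full. */
static void map_clear(struct two_level_map *m, int slot)
{
    assert(slot >= 0 && slot < NWORDS * WORD_BITS);
    int w = slot / WORD_BITS;
    m->bits[w] &= ~((uint64_t)1 << (slot % WORD_BITS));
    m->full &= ~((uint64_t)1 << w);
}

/* Lowest free slot, or -1 if every slot is taken. */
static int map_find_free(const struct two_level_map *m)
{
    int w = first_zero(m->full);    /* first word that is not full */
    if (w >= NWORDS)
        return -1;
    return w * WORD_BITS + first_zero(m->bits[w]);
}
```

With a single summary word as above, no separate allocation is needed for
the second level, which is loosely the situation the "full_fds_bits_init"
remark is about; a larger table would need an array of summary words.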