* Bug in mempool::map?
@ 2016-12-20 15:04 Igor Fedotov
  2016-12-20 15:25 ` Sage Weil
  0 siblings, 1 reply; 13+ messages in thread
From: Igor Fedotov @ 2016-12-20 15:04 UTC (permalink / raw)
  To: Allen Samuels

Hi Allen,

It looks like mempools don't measure map allocations properly.

I extended unittest_mempool in the following way, but the corresponding
output is always 0 for both 'before' and 'after' values:


diff --git a/src/test/test_mempool.cc b/src/test/test_mempool.cc
index 4113c53..b38a356 100644
--- a/src/test/test_mempool.cc
+++ b/src/test/test_mempool.cc
@@ -232,9 +232,19 @@ TEST(mempool, set)
  TEST(mempool, map)
  {
    {
-    mempool::unittest_1::map<int,int> v;
-    v[1] = 2;
-    v[3] = 4;
+    size_t before = mempool::buffer_data::allocated_bytes();
+    mempool::unittest_1::map<int,int>* v = new mempool::unittest_1::map<int,int>;
+    (*v)[1] = 2;
+    (*v)[3] = 4;
+    size_t after = mempool::buffer_data::allocated_bytes();
+    cout << "before " << before << " after " << after << std::endl;
+    delete v;
+    before = after;
+    mempool::unittest_1::map<int64_t,int64_t> v2;
+    v2[1] = 2;
+    v2[3] = 4;
+    after = mempool::buffer_data::allocated_bytes();
+    cout << " before " << before << " after " << after << std::endl;
    }
    {
      mempool::unittest_2::map<int,obj> v;


Output:

[ RUN      ] mempool.map
before 0 after 0
  before 0 after 0
[       OK ] mempool.map (0 ms)

It also looks like we do not measure ref_map for the BlueStore Blob and
SharedBlob classes.

Any ideas?

Thanks,

Igor



* Re: Bug in mempool::map?
  2016-12-20 15:04 Bug in mempool::map? Igor Fedotov
@ 2016-12-20 15:25 ` Sage Weil
  2016-12-20 15:56   ` Igor Fedotov
  2016-12-20 17:08   ` Igor Fedotov
  0 siblings, 2 replies; 13+ messages in thread
From: Sage Weil @ 2016-12-20 15:25 UTC (permalink / raw)
  To: Igor Fedotov; +Cc: Allen Samuels

On Tue, 20 Dec 2016, Igor Fedotov wrote:
> Hi Allen,
> 
> It looks like mempools don't measure maps allocations properly.
> 
> I extended unittest_mempool in the following way but corresponding output is
> always 0 for both 'before' and 'after' values:
> 
> 
> diff --git a/src/test/test_mempool.cc b/src/test/test_mempool.cc
> index 4113c53..b38a356 100644
> --- a/src/test/test_mempool.cc
> +++ b/src/test/test_mempool.cc
> @@ -232,9 +232,19 @@ TEST(mempool, set)
>  TEST(mempool, map)
>  {
>    {
> -    mempool::unittest_1::map<int,int> v;
> -    v[1] = 2;
> -    v[3] = 4;
> +    size_t before = mempool::buffer_data::allocated_bytes();

I think it's just that you're measuring the buffer_data pool...

> +    mempool::unittest_1::map<int,int>* v = new mempool::unittest_1::map<int,int>;

but the map is in the unittest_1 pool?
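A minimal sketch of a measurement that should register the allocations,
assuming we point it at the pool the map actually lives in (unittest_1
exposes the same allocated_bytes() accessor the test already uses for
buffer_data):

  size_t before = mempool::unittest_1::allocated_bytes();
  mempool::unittest_1::map<int,int> m;
  m[1] = 2;
  m[3] = 4;
  size_t after = mempool::unittest_1::allocated_bytes();
  // 'after - before' should now reflect the map node allocations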

> +    (*v)[1] = 2;
> +    (*v)[3] = 4;
> +    size_t after = mempool::buffer_data::allocated_bytes();
> +    cout << "before " << before << " after " << after << std::endl;
> +    delete v;
> +    before = after;
> +    mempool::unittest_1::map<int64_t,int64_t> v2;
> +    v2[1] = 2;
> +    v2[3] = 4;
> +    after = mempool::buffer_data::allocated_bytes();
> +    cout << " before " << before << " after " << after << std::endl;
>    }
>    {
>      mempool::unittest_2::map<int,obj> v;
> 
> 
> Output:
> 
> [ RUN      ] mempool.map
> before 0 after 0
>  before 0 after 0
> [       OK ] mempool.map (0 ms)
> 
> It looks like we do not measure ref_map for BlueStore Blob and SharedBlob
> classes too.
> 
> Any ideas?
> 
> Thanks,
> 
> Igor
> 
> 
> 


* Re: Bug in mempool::map?
  2016-12-20 15:25 ` Sage Weil
@ 2016-12-20 15:56   ` Igor Fedotov
  2016-12-20 17:08   ` Igor Fedotov
  1 sibling, 0 replies; 13+ messages in thread
From: Igor Fedotov @ 2016-12-20 15:56 UTC (permalink / raw)
  To: Sage Weil; +Cc: Allen Samuels, ceph-devel

Yeah, indeed. Thanks!


On 20.12.2016 18:25, Sage Weil wrote:
> On Tue, 20 Dec 2016, Igor Fedotov wrote:
>> Hi Allen,
>>
>> It looks like mempools don't measure maps allocations properly.
>>
>> I extended unittest_mempool in the following way but corresponding output is
>> always 0 for both 'before' and 'after' values:
>>
>>
>> diff --git a/src/test/test_mempool.cc b/src/test/test_mempool.cc
>> index 4113c53..b38a356 100644
>> --- a/src/test/test_mempool.cc
>> +++ b/src/test/test_mempool.cc
>> @@ -232,9 +232,19 @@ TEST(mempool, set)
>>   TEST(mempool, map)
>>   {
>>     {
>> -    mempool::unittest_1::map<int,int> v;
>> -    v[1] = 2;
>> -    v[3] = 4;
>> +    size_t before = mempool::buffer_data::allocated_bytes();
> I think it's just that you're measuring the buffer_data pool...
>
>> +    mempool::unittest_1::map<int,int>* v = new mempool::unittest_1::map<int,int>;
> but the map is in the unittest_1 pool?
>
>> +    (*v)[1] = 2;
>> +    (*v)[3] = 4;
>> +    size_t after = mempool::buffer_data::allocated_bytes();
>> +    cout << "before " << before << " after " << after << std::endl;
>> +    delete v;
>> +    before = after;
>> +    mempool::unittest_1::map<int64_t,int64_t> v2;
>> +    v2[1] = 2;
>> +    v2[3] = 4;
>> +    after = mempool::buffer_data::allocated_bytes();
>> +    cout << " before " << before << " after " << after << std::endl;
>>     }
>>     {
>>       mempool::unittest_2::map<int,obj> v;
>>
>>
>> Output:
>>
>> [ RUN      ] mempool.map
>> before 0 after 0
>>   before 0 after 0
>> [       OK ] mempool.map (0 ms)
>>
>> It looks like we do not measure ref_map for BlueStore Blob and SharedBlob
>> classes too.
>>
>> Any ideas?
>>
>> Thanks,
>>
>> Igor
>>
>>
>>



* Re: Bug in mempool::map?
  2016-12-20 15:25 ` Sage Weil
  2016-12-20 15:56   ` Igor Fedotov
@ 2016-12-20 17:08   ` Igor Fedotov
  2016-12-20 17:27     ` Allen Samuels
                       ` (2 more replies)
  1 sibling, 3 replies; 13+ messages in thread
From: Igor Fedotov @ 2016-12-20 17:08 UTC (permalink / raw)
  To: Sage Weil; +Cc: Allen Samuels, ceph-devel

Some update on map<uint64_t, uint32_t> mem usage.

It looks like a single-entry map<uint64_t,uint32_t> takes 48 bytes, and 40
bytes for map<uint32_t,uint32_t>.

Hence 1024 trivial ref_maps for 1024 blobs take >48K!

These are my results taken from mempools, and they look pretty similar
to what's been said in the following article:

http://lemire.me/blog/2016/09/15/the-memory-usage-of-stl-containers-can-be-surprising/
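For what it's worth, the numbers line up with a mock of a typical
red-black tree node; the exact layout is an assumption (color flag, three
parent/child pointers, then the key/value pair), but the padding
arithmetic matches:

  #include <cstdint>
  #include <iostream>

  // Rough model of a std::map node: color flag, parent/left/right
  // pointers, then the key/value pair (padded to 8-byte alignment).
  struct node_u64_u32 {
    int      color;                  // 4 bytes + 4 bytes padding
    void    *parent, *left, *right;  // 3 x 8 bytes
    uint64_t key;                    // 8 bytes
    uint32_t value;                  // 4 bytes + 4 bytes tail padding
  };                                 // => 48 bytes

  struct node_u32_u32 {
    int      color;
    void    *parent, *left, *right;
    uint32_t key;                    // key and value share one 8-byte slot
    uint32_t value;
  };                                 // => 40 bytes

  int main() {
    std::cout << sizeof(node_u64_u32) << " "      // 48 on a typical 64-bit build
              << sizeof(node_u32_u32) << std::endl;  // 40
    return 0;
  }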


Sage, you mentioned during the standup that you're planning to do something
with ref maps, but I missed the details. Is that about their mem use, or
something else?


Thanks,

Igor



On 20.12.2016 18:25, Sage Weil wrote:
> On Tue, 20 Dec 2016, Igor Fedotov wrote:
>> Hi Allen,
>>
>> It looks like mempools don't measure maps allocations properly.
>>
>> I extended unittest_mempool in the following way but corresponding output is
>> always 0 for both 'before' and 'after' values:
>>
>>
>> diff --git a/src/test/test_mempool.cc b/src/test/test_mempool.cc
>> index 4113c53..b38a356 100644
>> --- a/src/test/test_mempool.cc
>> +++ b/src/test/test_mempool.cc
>> @@ -232,9 +232,19 @@ TEST(mempool, set)
>>   TEST(mempool, map)
>>   {
>>     {
>> -    mempool::unittest_1::map<int,int> v;
>> -    v[1] = 2;
>> -    v[3] = 4;
>> +    size_t before = mempool::buffer_data::allocated_bytes();
> I think it's just that you're measuring the buffer_data pool...
>
>> +    mempool::unittest_1::map<int,int>* v = new mempool::unittest_1::map<int,int>;
> but the map is in the unittest_1 pool?
>
>> +    (*v)[1] = 2;
>> +    (*v)[3] = 4;
>> +    size_t after = mempool::buffer_data::allocated_bytes();
>> +    cout << "before " << before << " after " << after << std::endl;
>> +    delete v;
>> +    before = after;
>> +    mempool::unittest_1::map<int64_t,int64_t> v2;
>> +    v2[1] = 2;
>> +    v2[3] = 4;
>> +    after = mempool::buffer_data::allocated_bytes();
>> +    cout << " before " << before << " after " << after << std::endl;
>>     }
>>     {
>>       mempool::unittest_2::map<int,obj> v;
>>
>>
>> Output:
>>
>> [ RUN      ] mempool.map
>> before 0 after 0
>>   before 0 after 0
>> [       OK ] mempool.map (0 ms)
>>
>> It looks like we do not measure ref_map for BlueStore Blob and SharedBlob
>> classes too.
>>
>> Any ideas?
>>
>> Thanks,
>>
>> Igor
>>
>>
>>



* RE: Bug in mempool::map?
  2016-12-20 17:08   ` Igor Fedotov
@ 2016-12-20 17:27     ` Allen Samuels
  2016-12-20 17:34       ` Igor Fedotov
  2016-12-20 17:37     ` Allen Samuels
  2016-12-20 18:26     ` Sage Weil
  2 siblings, 1 reply; 13+ messages in thread
From: Allen Samuels @ 2016-12-20 17:27 UTC (permalink / raw)
  To: Igor Fedotov, Sage Weil; +Cc: ceph-devel

Yes, Slab containers help here by amortizing the malloc overhead for small nodes.

Allen Samuels
SanDisk |a Western Digital brand
951 SanDisk Drive, Milpitas, CA 95035
T: +1 408 801 7030| M: +1 408 780 6416
allen.samuels@SanDisk.com

> -----Original Message-----
> From: Igor Fedotov [mailto:ifedotov@mirantis.com]
> Sent: Tuesday, December 20, 2016 9:08 AM
> To: Sage Weil <sage@newdream.net>
> Cc: Allen Samuels <Allen.Samuels@sandisk.com>; ceph-devel <ceph-
> devel@vger.kernel.org>
> Subject: Re: Bug in mempool::map?
> 
> Some update on map<uint64_t, uint32_t> mem usage.
> 
> It looks like single entry map takes 48 bytes. And 40 bytes for
> map<uint32_t,uint32_t>.
> 
> Hence 1024 trivial ref_maps for 1024 blobs takes >48K!
> 
> These are my results taken from mempools. And they look pretty similar
> to what's been said in the following article:
> 
> http://lemire.me/blog/2016/09/15/the-memory-usage-of-stl-containers-
> can-be-surprising/
> 
> 
> Sage, you mentioned that you're planning to do something with ref maps
> during the standup but I missed the details. Is that something about
> their mem use or anything else?
> 
> 
> Thanks,
> 
> Igor
> 
> 
> 
> On 20.12.2016 18:25, Sage Weil wrote:
> > On Tue, 20 Dec 2016, Igor Fedotov wrote:
> >> Hi Allen,
> >>
> >> It looks like mempools don't measure maps allocations properly.
> >>
> >> I extended unittest_mempool in the following way but corresponding
> output is
> >> always 0 for both 'before' and 'after' values:
> >>
> >>
> >> diff --git a/src/test/test_mempool.cc b/src/test/test_mempool.cc
> >> index 4113c53..b38a356 100644
> >> --- a/src/test/test_mempool.cc
> >> +++ b/src/test/test_mempool.cc
> >> @@ -232,9 +232,19 @@ TEST(mempool, set)
> >>   TEST(mempool, map)
> >>   {
> >>     {
> >> -    mempool::unittest_1::map<int,int> v;
> >> -    v[1] = 2;
> >> -    v[3] = 4;
> >> +    size_t before = mempool::buffer_data::allocated_bytes();
> > I think it's just that you're measuring the buffer_data pool...
> >
> >> +    mempool::unittest_1::map<int,int>* v = new
> mempool::unittest_1::map<int,int>;
> > but the map is in the unittest_1 pool?
> >
> >> +    (*v)[1] = 2;
> >> +    (*v)[3] = 4;
> >> +    size_t after = mempool::buffer_data::allocated_bytes();
> >> +    cout << "before " << before << " after " << after << std::endl;
> >> +    delete v;
> >> +    before = after;
> >> +    mempool::unittest_1::map<int64_t,int64_t> v2;
> >> +    v2[1] = 2;
> >> +    v2[3] = 4;
> >> +    after = mempool::buffer_data::allocated_bytes();
> >> +    cout << " before " << before << " after " << after << std::endl;
> >>     }
> >>     {
> >>       mempool::unittest_2::map<int,obj> v;
> >>
> >>
> >> Output:
> >>
> >> [ RUN      ] mempool.map
> >> before 0 after 0
> >>   before 0 after 0
> >> [       OK ] mempool.map (0 ms)
> >>
> >> It looks like we do not measure ref_map for BlueStore Blob and
> SharedBlob
> >> classes too.
> >>
> >> Any ideas?
> >>
> >> Thanks,
> >>
> >> Igor
> >>
> >>
> >>



* Re: Bug in mempool::map?
  2016-12-20 17:27     ` Allen Samuels
@ 2016-12-20 17:34       ` Igor Fedotov
  0 siblings, 0 replies; 13+ messages in thread
From: Igor Fedotov @ 2016-12-20 17:34 UTC (permalink / raw)
  To: Allen Samuels, Sage Weil; +Cc: ceph-devel

Yeah, but a std::map entry also has 3 pointers per item - hence 24 bytes of
overhead per 12 bytes (offset, len, ref_count) of useful payload...


On 20.12.2016 20:27, Allen Samuels wrote:
> Yes, Slab containers helps here by amortizing the malloc overhead for small nodes.
>
> Allen Samuels
> SanDisk |a Western Digital brand
> 951 SanDisk Drive, Milpitas, CA 95035
> T: +1 408 801 7030| M: +1 408 780 6416
> allen.samuels@SanDisk.com
>
>> -----Original Message-----
>> From: Igor Fedotov [mailto:ifedotov@mirantis.com]
>> Sent: Tuesday, December 20, 2016 9:08 AM
>> To: Sage Weil <sage@newdream.net>
>> Cc: Allen Samuels <Allen.Samuels@sandisk.com>; ceph-devel <ceph-
>> devel@vger.kernel.org>
>> Subject: Re: Bug in mempool::map?
>>
>> Some update on map<uint64_t, uint32_t> mem usage.
>>
>> It looks like single entry map takes 48 bytes. And 40 bytes for
>> map<uint32_t,uint32_t>.
>>
>> Hence 1024 trivial ref_maps for 1024 blobs takes >48K!
>>
>> These are my results taken from mempools. And they look pretty similar
>> to what's been said in the following article:
>>
>> http://lemire.me/blog/2016/09/15/the-memory-usage-of-stl-containers-
>> can-be-surprising/
>>
>>
>> Sage, you mentioned that you're planning to do something with ref maps
>> during the standup but I missed the details. Is that something about
>> their mem use or anything else?
>>
>>
>> Thanks,
>>
>> Igor
>>
>>
>>
>> On 20.12.2016 18:25, Sage Weil wrote:
>>> On Tue, 20 Dec 2016, Igor Fedotov wrote:
>>>> Hi Allen,
>>>>
>>>> It looks like mempools don't measure maps allocations properly.
>>>>
>>>> I extended unittest_mempool in the following way but corresponding
>> output is
>>>> always 0 for both 'before' and 'after' values:
>>>>
>>>>
>>>> diff --git a/src/test/test_mempool.cc b/src/test/test_mempool.cc
>>>> index 4113c53..b38a356 100644
>>>> --- a/src/test/test_mempool.cc
>>>> +++ b/src/test/test_mempool.cc
>>>> @@ -232,9 +232,19 @@ TEST(mempool, set)
>>>>    TEST(mempool, map)
>>>>    {
>>>>      {
>>>> -    mempool::unittest_1::map<int,int> v;
>>>> -    v[1] = 2;
>>>> -    v[3] = 4;
>>>> +    size_t before = mempool::buffer_data::allocated_bytes();
>>> I think it's just that you're measuring the buffer_data pool...
>>>
>>>> +    mempool::unittest_1::map<int,int>* v = new
>> mempool::unittest_1::map<int,int>;
>>> but the map is in the unittest_1 pool?
>>>
>>>> +    (*v)[1] = 2;
>>>> +    (*v)[3] = 4;
>>>> +    size_t after = mempool::buffer_data::allocated_bytes();
>>>> +    cout << "before " << before << " after " << after << std::endl;
>>>> +    delete v;
>>>> +    before = after;
>>>> +    mempool::unittest_1::map<int64_t,int64_t> v2;
>>>> +    v2[1] = 2;
>>>> +    v2[3] = 4;
>>>> +    after = mempool::buffer_data::allocated_bytes();
>>>> +    cout << " before " << before << " after " << after << std::endl;
>>>>      }
>>>>      {
>>>>        mempool::unittest_2::map<int,obj> v;
>>>>
>>>>
>>>> Output:
>>>>
>>>> [ RUN      ] mempool.map
>>>> before 0 after 0
>>>>    before 0 after 0
>>>> [       OK ] mempool.map (0 ms)
>>>>
>>>> It looks like we do not measure ref_map for BlueStore Blob and
>> SharedBlob
>>>> classes too.
>>>>
>>>> Any ideas?
>>>>
>>>> Thanks,
>>>>
>>>> Igor
>>>>
>>>>
>>>>



* RE: Bug in mempool::map?
  2016-12-20 17:08   ` Igor Fedotov
  2016-12-20 17:27     ` Allen Samuels
@ 2016-12-20 17:37     ` Allen Samuels
  2016-12-20 18:26     ` Sage Weil
  2 siblings, 0 replies; 13+ messages in thread
From: Allen Samuels @ 2016-12-20 17:37 UTC (permalink / raw)
  To: Igor Fedotov, Sage Weil; +Cc: ceph-devel

But it won't eliminate the RB-tree overhead: 3 pointers plus a flag, which is probably rounded up to 8 bytes => 32 bytes, on top of 16 bytes of data (again, rounded up).

Also, the mempool stats don't include malloc overhead (depending on the implementation there might be an 8- or 16-byte header for each allocation), so the true consumption could be substantially worse.

Allen Samuels
SanDisk |a Western Digital brand
951 SanDisk Drive, Milpitas, CA 95035
T: +1 408 801 7030| M: +1 408 780 6416
allen.samuels@SanDisk.com


> -----Original Message-----
> From: Allen Samuels
> Sent: Tuesday, December 20, 2016 9:28 AM
> To: 'Igor Fedotov' <ifedotov@mirantis.com>; Sage Weil
> <sage@newdream.net>
> Cc: ceph-devel <ceph-devel@vger.kernel.org>
> Subject: RE: Bug in mempool::map?
> 
> Yes, Slab containers helps here by amortizing the malloc overhead for small
> nodes.
> 
> Allen Samuels
> SanDisk |a Western Digital brand
> 951 SanDisk Drive, Milpitas, CA 95035
> T: +1 408 801 7030| M: +1 408 780 6416
> allen.samuels@SanDisk.com
> 
> > -----Original Message-----
> > From: Igor Fedotov [mailto:ifedotov@mirantis.com]
> > Sent: Tuesday, December 20, 2016 9:08 AM
> > To: Sage Weil <sage@newdream.net>
> > Cc: Allen Samuels <Allen.Samuels@sandisk.com>; ceph-devel <ceph-
> > devel@vger.kernel.org>
> > Subject: Re: Bug in mempool::map?
> >
> > Some update on map<uint64_t, uint32_t> mem usage.
> >
> > It looks like single entry map takes 48 bytes. And 40 bytes for
> > map<uint32_t,uint32_t>.
> >
> > Hence 1024 trivial ref_maps for 1024 blobs takes >48K!
> >
> > These are my results taken from mempools. And they look pretty similar
> > to what's been said in the following article:
> >
> > http://lemire.me/blog/2016/09/15/the-memory-usage-of-stl-containers-
> > can-be-surprising/
> >
> >
> > Sage, you mentioned that you're planning to do something with ref maps
> > during the standup but I missed the details. Is that something about
> > their mem use or anything else?
> >
> >
> > Thanks,
> >
> > Igor
> >
> >
> >
> > On 20.12.2016 18:25, Sage Weil wrote:
> > > On Tue, 20 Dec 2016, Igor Fedotov wrote:
> > >> Hi Allen,
> > >>
> > >> It looks like mempools don't measure maps allocations properly.
> > >>
> > >> I extended unittest_mempool in the following way but corresponding
> > output is
> > >> always 0 for both 'before' and 'after' values:
> > >>
> > >>
> > >> diff --git a/src/test/test_mempool.cc b/src/test/test_mempool.cc
> > >> index 4113c53..b38a356 100644
> > >> --- a/src/test/test_mempool.cc
> > >> +++ b/src/test/test_mempool.cc
> > >> @@ -232,9 +232,19 @@ TEST(mempool, set)
> > >>   TEST(mempool, map)
> > >>   {
> > >>     {
> > >> -    mempool::unittest_1::map<int,int> v;
> > >> -    v[1] = 2;
> > >> -    v[3] = 4;
> > >> +    size_t before = mempool::buffer_data::allocated_bytes();
> > > I think it's just that you're measuring the buffer_data pool...
> > >
> > >> +    mempool::unittest_1::map<int,int>* v = new
> > mempool::unittest_1::map<int,int>;
> > > but the map is in the unittest_1 pool?
> > >
> > >> +    (*v)[1] = 2;
> > >> +    (*v)[3] = 4;
> > >> +    size_t after = mempool::buffer_data::allocated_bytes();
> > >> +    cout << "before " << before << " after " << after << std::endl;
> > >> +    delete v;
> > >> +    before = after;
> > >> +    mempool::unittest_1::map<int64_t,int64_t> v2;
> > >> +    v2[1] = 2;
> > >> +    v2[3] = 4;
> > >> +    after = mempool::buffer_data::allocated_bytes();
> > >> +    cout << " before " << before << " after " << after << std::endl;
> > >>     }
> > >>     {
> > >>       mempool::unittest_2::map<int,obj> v;
> > >>
> > >>
> > >> Output:
> > >>
> > >> [ RUN      ] mempool.map
> > >> before 0 after 0
> > >>   before 0 after 0
> > >> [       OK ] mempool.map (0 ms)
> > >>
> > >> It looks like we do not measure ref_map for BlueStore Blob and
> > SharedBlob
> > >> classes too.
> > >>
> > >> Any ideas?
> > >>
> > >> Thanks,
> > >>
> > >> Igor
> > >>
> > >>
> > >>



* Re: Bug in mempool::map?
  2016-12-20 17:08   ` Igor Fedotov
  2016-12-20 17:27     ` Allen Samuels
  2016-12-20 17:37     ` Allen Samuels
@ 2016-12-20 18:26     ` Sage Weil
  2016-12-20 20:05       ` Igor Fedotov
  2 siblings, 1 reply; 13+ messages in thread
From: Sage Weil @ 2016-12-20 18:26 UTC (permalink / raw)
  To: Igor Fedotov; +Cc: Allen Samuels, ceph-devel

On Tue, 20 Dec 2016, Igor Fedotov wrote:
> Some update on map<uint64_t, uint32_t> mem usage.
> 
> It looks like single entry map takes 48 bytes. And 40 bytes for
> map<uint32_t,uint32_t>.
> 
> Hence 1024 trivial ref_maps for 1024 blobs takes >48K!
> 
> These are my results taken from mempools. And they look pretty similar to
> what's been said in the following article:
> 
> http://lemire.me/blog/2016/09/15/the-memory-usage-of-stl-containers-can-be-surprising/
> 
> 
> Sage, you mentioned that you're planning to do something with ref maps during
> the standup but I missed the details. Is that something about their mem use or
> anything else?

I mentioned btree_map<> and flat_map<> (new in boost).  Probably the thing 
to do here is to make extent_ref_map_t handle the common case of 1 (or 
maybe 2?) extents done inline, and when we go beyond that allocate another 
structure on the heap.  That other structure could be std::map<>, but I 
think one of the other choices would be better: one larger allocation and 
better performance in general for small maps.  This structure will 
only get big for very big blobs, which shouldn't be terribly common, I 
think.

sage


> 
> 
> Thanks,
> 
> Igor
> 
> 
> 
> On 20.12.2016 18:25, Sage Weil wrote:
> > On Tue, 20 Dec 2016, Igor Fedotov wrote:
> > > Hi Allen,
> > > 
> > > It looks like mempools don't measure maps allocations properly.
> > > 
> > > I extended unittest_mempool in the following way but corresponding output
> > > is
> > > always 0 for both 'before' and 'after' values:
> > > 
> > > 
> > > diff --git a/src/test/test_mempool.cc b/src/test/test_mempool.cc
> > > index 4113c53..b38a356 100644
> > > --- a/src/test/test_mempool.cc
> > > +++ b/src/test/test_mempool.cc
> > > @@ -232,9 +232,19 @@ TEST(mempool, set)
> > >   TEST(mempool, map)
> > >   {
> > >     {
> > > -    mempool::unittest_1::map<int,int> v;
> > > -    v[1] = 2;
> > > -    v[3] = 4;
> > > +    size_t before = mempool::buffer_data::allocated_bytes();
> > I think it's just that you're measuring the buffer_data pool...
> > 
> > > +    mempool::unittest_1::map<int,int>* v = new
> > > mempool::unittest_1::map<int,int>;
> > but the map is in the unittest_1 pool?
> > 
> > > +    (*v)[1] = 2;
> > > +    (*v)[3] = 4;
> > > +    size_t after = mempool::buffer_data::allocated_bytes();
> > > +    cout << "before " << before << " after " << after << std::endl;
> > > +    delete v;
> > > +    before = after;
> > > +    mempool::unittest_1::map<int64_t,int64_t> v2;
> > > +    v2[1] = 2;
> > > +    v2[3] = 4;
> > > +    after = mempool::buffer_data::allocated_bytes();
> > > +    cout << " before " << before << " after " << after << std::endl;
> > >     }
> > >     {
> > >       mempool::unittest_2::map<int,obj> v;
> > > 
> > > 
> > > Output:
> > > 
> > > [ RUN      ] mempool.map
> > > before 0 after 0
> > >   before 0 after 0
> > > [       OK ] mempool.map (0 ms)
> > > 
> > > It looks like we do not measure ref_map for BlueStore Blob and SharedBlob
> > > classes too.
> > > 
> > > Any ideas?
> > > 
> > > Thanks,
> > > 
> > > Igor
> > > 
> > > 
> > > 
> 
> 


* Re: Bug in mempool::map?
  2016-12-20 18:26     ` Sage Weil
@ 2016-12-20 20:05       ` Igor Fedotov
  2016-12-20 20:15         ` Sage Weil
  0 siblings, 1 reply; 13+ messages in thread
From: Igor Fedotov @ 2016-12-20 20:05 UTC (permalink / raw)
  To: Sage Weil; +Cc: Allen Samuels, ceph-devel

I think I have a better idea.

We can simply track the amount of bytes referenced in the blob. E.g. an
extent of length 100 increments the counter by 100 accordingly;
removing/punching an extent decrements it.

If we want to be able to deallocate a specific unused pextent within a
blob, as we currently do, then we just need to track that amount on a
per-pextent basis. Hence just a few ints per blob... and no need for map
lookups.

If that's OK I can start implementing it tomorrow.


Thanks,

Igor


The same idea can probably be applied to SharedBlob too.

On 12/20/2016 9:26 PM, Sage Weil wrote:
> On Tue, 20 Dec 2016, Igor Fedotov wrote:
>> Some update on map<uint64_t, uint32_t> mem usage.
>>
>> It looks like single entry map takes 48 bytes. And 40 bytes for
>> map<uint32_t,uint32_t>.
>>
>> Hence 1024 trivial ref_maps for 1024 blobs takes >48K!
>>
>> These are my results taken from mempools. And they look pretty similar to
>> what's been said in the following article:
>>
>> http://lemire.me/blog/2016/09/15/the-memory-usage-of-stl-containers-can-be-surprising/
>>
>>
>> Sage, you mentioned that you're planning to do something with ref maps during
>> the standup but I missed the details. Is that something about their mem use or
>> anything else?
> I mentioned btree_map<> and flat_map<> (new in boost).  Probably the thing
> to do here is to make extent_ref_map_t handle the common case of 1 (or
> maybe 2?) extents done inline, and when we go beyond that allocate another
> structure on the heap.  That other structure could be std::map<>, but I
> think one of the other choices would be better: one larger allocation and
> better performance in general for small maps.  This structure will
> only get big for very big blobs, which shouldn't be terribly common, I
> think.
>
> sage
>
>
>>
>> Thanks,
>>
>> Igor
>>
>>
>>
>> On 20.12.2016 18:25, Sage Weil wrote:
>>> On Tue, 20 Dec 2016, Igor Fedotov wrote:
>>>> Hi Allen,
>>>>
>>>> It looks like mempools don't measure maps allocations properly.
>>>>
>>>> I extended unittest_mempool in the following way but corresponding output
>>>> is
>>>> always 0 for both 'before' and 'after' values:
>>>>
>>>>
>>>> diff --git a/src/test/test_mempool.cc b/src/test/test_mempool.cc
>>>> index 4113c53..b38a356 100644
>>>> --- a/src/test/test_mempool.cc
>>>> +++ b/src/test/test_mempool.cc
>>>> @@ -232,9 +232,19 @@ TEST(mempool, set)
>>>>    TEST(mempool, map)
>>>>    {
>>>>      {
>>>> -    mempool::unittest_1::map<int,int> v;
>>>> -    v[1] = 2;
>>>> -    v[3] = 4;
>>>> +    size_t before = mempool::buffer_data::allocated_bytes();
>>> I think it's just that you're measuring the buffer_data pool...
>>>
>>>> +    mempool::unittest_1::map<int,int>* v = new
>>>> mempool::unittest_1::map<int,int>;
>>> but the map is in the unittest_1 pool?
>>>
>>>> +    (*v)[1] = 2;
>>>> +    (*v)[3] = 4;
>>>> +    size_t after = mempool::buffer_data::allocated_bytes();
>>>> +    cout << "before " << before << " after " << after << std::endl;
>>>> +    delete v;
>>>> +    before = after;
>>>> +    mempool::unittest_1::map<int64_t,int64_t> v2;
>>>> +    v2[1] = 2;
>>>> +    v2[3] = 4;
>>>> +    after = mempool::buffer_data::allocated_bytes();
>>>> +    cout << " before " << before << " after " << after << std::endl;
>>>>      }
>>>>      {
>>>>        mempool::unittest_2::map<int,obj> v;
>>>>
>>>>
>>>> Output:
>>>>
>>>> [ RUN      ] mempool.map
>>>> before 0 after 0
>>>>    before 0 after 0
>>>> [       OK ] mempool.map (0 ms)
>>>>
>>>> It looks like we do not measure ref_map for BlueStore Blob and SharedBlob
>>>> classes too.
>>>>
>>>> Any ideas?
>>>>
>>>> Thanks,
>>>>
>>>> Igor
>>>>
>>>>
>>>>
>>



* Re: Bug in mempool::map?
  2016-12-20 20:05       ` Igor Fedotov
@ 2016-12-20 20:15         ` Sage Weil
  2016-12-20 21:29           ` Igor Fedotov
  0 siblings, 1 reply; 13+ messages in thread
From: Sage Weil @ 2016-12-20 20:15 UTC (permalink / raw)
  To: Igor Fedotov; +Cc: Allen Samuels, ceph-devel

On Tue, 20 Dec 2016, Igor Fedotov wrote:
> I think I have a better idea.
> 
> We can simply track amount of bytes referenced in the blob. E.g. some extent
> has length 100 - this increments the counter by 100 accordingly.
> Removing/punching an extent  decrements the counter.
>
> If we want tp be able to deallocate specific unused pextent within a blob as
> we currently do then we just need to track that amount on per-pextent basis.
> Hence just a few ints per blob... And no need for map lookups.

For small blobs, a counter is probably sufficient since we're a single 
min_alloc_size unit anyway.  That's actually more like a vector of size 
(min_alloc_size count), with 1 being the degenerate case.  Is that what 
you're thinking?

It seems like we have a few common cases...

1) big blob, single ref (e.g., immutable large object, and then the entire 
thing is referenced, modulo the last block).

2) tiny blob, single alloc unit.  a counter is sufficient.

3) everything else in between...

We could probably get away with special-case efficient representations of 
just 1 and 2?

sage


> 
> If that's OK I can start implementing it tomorrow.
> 
> 
> Thanks,
> 
> Igor
> 
> 
> The same idea can be probably applied to SharedBlob too.
> 
> On 12/20/2016 9:26 PM, Sage Weil wrote:
> > On Tue, 20 Dec 2016, Igor Fedotov wrote:
> > > Some update on map<uint64_t, uint32_t> mem usage.
> > > 
> > > It looks like single entry map takes 48 bytes. And 40 bytes for
> > > map<uint32_t,uint32_t>.
> > > 
> > > Hence 1024 trivial ref_maps for 1024 blobs takes >48K!
> > > 
> > > These are my results taken from mempools. And they look pretty similar to
> > > what's been said in the following article:
> > > 
> > > http://lemire.me/blog/2016/09/15/the-memory-usage-of-stl-containers-can-be-surprising/
> > > 
> > > 
> > > Sage, you mentioned that you're planning to do something with ref maps
> > > during
> > > the standup but I missed the details. Is that something about their mem
> > > use or
> > > anything else?
> > I mentioned btree_map<> and flat_map<> (new in boost).  Probably the thing
> > to do here is to make extent_ref_map_t handle the common case of 1 (or
> > maybe 2?) extents done inline, and when we go beyond that allocate another
> > structure on the heap.  That other structure could be std::map<>, but I
> > think one of the other choices would be better: one larger allocation and
> > better performance in general for small maps.  This structure will
> > only get big for very big blobs, which shouldn't be terribly common, I
> > think.
> > 
> > sage
> > 
> > 
> > > 
> > > Thanks,
> > > 
> > > Igor
> > > 
> > > 
> > > 
> > > On 20.12.2016 18:25, Sage Weil wrote:
> > > > On Tue, 20 Dec 2016, Igor Fedotov wrote:
> > > > > Hi Allen,
> > > > > 
> > > > > It looks like mempools don't measure maps allocations properly.
> > > > > 
> > > > > I extended unittest_mempool in the following way but corresponding
> > > > > output
> > > > > is
> > > > > always 0 for both 'before' and 'after' values:
> > > > > 
> > > > > 
> > > > > diff --git a/src/test/test_mempool.cc b/src/test/test_mempool.cc
> > > > > index 4113c53..b38a356 100644
> > > > > --- a/src/test/test_mempool.cc
> > > > > +++ b/src/test/test_mempool.cc
> > > > > @@ -232,9 +232,19 @@ TEST(mempool, set)
> > > > >    TEST(mempool, map)
> > > > >    {
> > > > >      {
> > > > > -    mempool::unittest_1::map<int,int> v;
> > > > > -    v[1] = 2;
> > > > > -    v[3] = 4;
> > > > > +    size_t before = mempool::buffer_data::allocated_bytes();
> > > > I think it's just that you're measuring the buffer_data pool...
> > > > 
> > > > > +    mempool::unittest_1::map<int,int>* v = new
> > > > > mempool::unittest_1::map<int,int>;
> > > > but the map is in the unittest_1 pool?
> > > > 
> > > > > +    (*v)[1] = 2;
> > > > > +    (*v)[3] = 4;
> > > > > +    size_t after = mempool::buffer_data::allocated_bytes();
> > > > > +    cout << "before " << before << " after " << after << std::endl;
> > > > > +    delete v;
> > > > > +    before = after;
> > > > > +    mempool::unittest_1::map<int64_t,int64_t> v2;
> > > > > +    v2[1] = 2;
> > > > > +    v2[3] = 4;
> > > > > +    after = mempool::buffer_data::allocated_bytes();
> > > > > +    cout << " before " << before << " after " << after << std::endl;
> > > > >      }
> > > > >      {
> > > > >        mempool::unittest_2::map<int,obj> v;
> > > > > 
> > > > > 
> > > > > Output:
> > > > > 
> > > > > [ RUN      ] mempool.map
> > > > > before 0 after 0
> > > > >    before 0 after 0
> > > > > [       OK ] mempool.map (0 ms)
> > > > > 
> > > > > It looks like we do not measure ref_map for BlueStore Blob and
> > > > > SharedBlob
> > > > > classes too.
> > > > > 
> > > > > Any ideas?
> > > > > 
> > > > > Thanks,
> > > > > 
> > > > > Igor
> > > > > 
> > > > > 
> > > > > 
> > > 
> 
> 
> 


* Re: Bug in mempool::map?
  2016-12-20 20:15         ` Sage Weil
@ 2016-12-20 21:29           ` Igor Fedotov
  2016-12-21 21:43             ` Sage Weil
  0 siblings, 1 reply; 13+ messages in thread
From: Igor Fedotov @ 2016-12-20 21:29 UTC (permalink / raw)
  To: Sage Weil; +Cc: Allen Samuels, ceph-devel



On 12/20/2016 11:15 PM, Sage Weil wrote:
> On Tue, 20 Dec 2016, Igor Fedotov wrote:
>> I think I have a better idea.
>>
>> We can simply track amount of bytes referenced in the blob. E.g. some extent
>> has length 100 - this increments the counter by 100 accordingly.
>> Removing/punching an extent  decrements the counter.
>>
>> If we want tp be able to deallocate specific unused pextent within a blob as
>> we currently do then we just need to track that amount on per-pextent basis.
>> Hence just a few ints per blob... And no need for map lookups.
> For small blobs, a counter is probably sufficient since we're a single
> min_alloc_size unit anyway.  That's actually more like a vector of size
> (min_alloc_size count), with 1 being the degenerate case.  Is that what
> you're thinking?
Yes. And the key point is that we always count BYTES referenced per blob
(or rather per alloc_unit), not the number of references (extents) per
specific interval in the blob.
E.g.
extents
0~200 & 300~100
produce counter for AU(0) = 300
and
extents
0x1000~1, 0x1010~2, 0x1055~3
produce counter for AU(1) = 6

Hence we always store a VECTOR of counters, not a map.
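A rough sketch of get_ref/put_ref over such per-AU byte counters (the
names and the fixed au_size are illustrative assumptions, not the final
BlueStore types):

  #include <algorithm>
  #include <cstdint>
  #include <vector>

  struct au_ref_counter_t {
    uint32_t au_size;                    // allocation unit size, e.g. 0x1000
    std::vector<uint32_t> bytes_per_au;  // referenced bytes per alloc unit

    explicit au_ref_counter_t(uint32_t au) : au_size(au) {}

    void get_ref(uint32_t offset, uint32_t length) {
      while (length > 0) {
        uint32_t au = offset / au_size;
        uint32_t in_au = std::min(length, au_size - offset % au_size);
        if (au >= bytes_per_au.size())
          bytes_per_au.resize(au + 1, 0);
        bytes_per_au[au] += in_au;       // 0~200 & 300~100 -> AU(0) == 300
        offset += in_au;
        length -= in_au;
      }
    }

    // returns true if some alloc unit dropped to zero referenced bytes
    // and can therefore be deallocated
    bool put_ref(uint32_t offset, uint32_t length) {
      bool released = false;
      while (length > 0) {
        uint32_t au = offset / au_size;
        uint32_t in_au = std::min(length, au_size - offset % au_size);
        bytes_per_au[au] -= in_au;
        if (bytes_per_au[au] == 0)
          released = true;
        offset += in_au;
        length -= in_au;
      }
      return released;
    }
  };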

> It seems like we have a few common cases...
>
> 1) big blob, single ref (e.g., immutable large object, and then the entire
> thing is referenced, modulo the last block).
>
> 2) tiny blob, single alloc unit.  a counter is sufficient.
>
> 3) everything else in between...
>
> We could probably get away with special-case efficient representations of
> just 1 and 2?
I'd say 1) = the entire blob is referenced just once, and 2) = a single
alloc unit, partially referenced.

Then we need at minimum a bit flag for 1), 'int' for 2) and vector<int> 
for 3).

Possible implementations are
A.{
int standalone_counter;  // this covers 1 & 2, value greater than alloc unit means 1)
vector<int> per_au_counters; // and this one for 3 with the first counter stored in standalone_counter
}
vs.
B.{
//reuse some blob flag for 1)
vector<int> per_au_counters; //for 2) and 3)
}
vs.
C.{
vector<int> per_au_counters; //for all cases
}

A. Saves one allocation for 1 & 2, wastes 4 bytes for 1. Plus requires 
some simple tricks to use both standalone counter and vector.
B. Saves 1 allocation and 4 bytes for 1. Unified counter vector 
processing but some minor tricks with a flag.
C. Needs +1 allocation for 1 & 2, and wastes 4 bytes for 1. Unified
counter processing.

I'd rather prefer B. implementation.

> sage
>
>
>> If that's OK I can start implementing it tomorrow.
>>
>>
>> Thanks,
>>
>> Igor
>>
>>
>> The same idea can be probably applied to SharedBlob too.
>>
>> On 12/20/2016 9:26 PM, Sage Weil wrote:
>>> On Tue, 20 Dec 2016, Igor Fedotov wrote:
>>>> Some update on map<uint64_t, uint32_t> mem usage.
>>>>
>>>> It looks like single entry map takes 48 bytes. And 40 bytes for
>>>> map<uint32_t,uint32_t>.
>>>>
>>>> Hence 1024 trivial ref_maps for 1024 blobs takes >48K!
>>>>
>>>> These are my results taken from mempools. And they look pretty similar to
>>>> what's been said in the following article:
>>>>
>>>> http://lemire.me/blog/2016/09/15/the-memory-usage-of-stl-containers-can-be-surprising/
>>>>
>>>>
>>>> Sage, you mentioned that you're planning to do something with ref maps
>>>> during
>>>> the standup but I missed the details. Is that something about their mem
>>>> use or
>>>> anything else?
>>> I mentioned btree_map<> and flat_map<> (new in boost).  Probably the thing
>>> to do here is to make extent_ref_map_t handle the common case of 1 (or
>>> maybe 2?) extents done inline, and when we go beyond that allocate another
>>> structure on the heap.  That other structure could be std::map<>, but I
>>> think one of the other choices would be better: one larger allocation and
>>> better performance in general for small maps.  This structure will
>>> only get big for very big blobs, which shouldn't be terribly common, I
>>> think.
>>>
>>> sage
>>>
>>>
>>>> Thanks,
>>>>
>>>> Igor
>>>>
>>>>
>>>>
>>>> On 20.12.2016 18:25, Sage Weil wrote:
>>>>> On Tue, 20 Dec 2016, Igor Fedotov wrote:
>>>>>> Hi Allen,
>>>>>>
>>>>>> It looks like mempools don't measure maps allocations properly.
>>>>>>
>>>>>> I extended unittest_mempool in the following way but corresponding
>>>>>> output
>>>>>> is
>>>>>> always 0 for both 'before' and 'after' values:
>>>>>>
>>>>>>
>>>>>> diff --git a/src/test/test_mempool.cc b/src/test/test_mempool.cc
>>>>>> index 4113c53..b38a356 100644
>>>>>> --- a/src/test/test_mempool.cc
>>>>>> +++ b/src/test/test_mempool.cc
>>>>>> @@ -232,9 +232,19 @@ TEST(mempool, set)
>>>>>>     TEST(mempool, map)
>>>>>>     {
>>>>>>       {
>>>>>> -    mempool::unittest_1::map<int,int> v;
>>>>>> -    v[1] = 2;
>>>>>> -    v[3] = 4;
>>>>>> +    size_t before = mempool::buffer_data::allocated_bytes();
>>>>> I think it's just that you're measuring the buffer_data pool...
>>>>>
>>>>>> +    mempool::unittest_1::map<int,int>* v = new
>>>>>> mempool::unittest_1::map<int,int>;
>>>>> but the map is in the unittest_1 pool?
>>>>>
>>>>>> +    (*v)[1] = 2;
>>>>>> +    (*v)[3] = 4;
>>>>>> +    size_t after = mempool::buffer_data::allocated_bytes();
>>>>>> +    cout << "before " << before << " after " << after << std::endl;
>>>>>> +    delete v;
>>>>>> +    before = after;
>>>>>> +    mempool::unittest_1::map<int64_t,int64_t> v2;
>>>>>> +    v2[1] = 2;
>>>>>> +    v2[3] = 4;
>>>>>> +    after = mempool::buffer_data::allocated_bytes();
>>>>>> +    cout << " before " << before << " after " << after << std::endl;
>>>>>>       }
>>>>>>       {
>>>>>>         mempool::unittest_2::map<int,obj> v;
>>>>>>
>>>>>>
>>>>>> Output:
>>>>>>
>>>>>> [ RUN      ] mempool.map
>>>>>> before 0 after 0
>>>>>>     before 0 after 0
>>>>>> [       OK ] mempool.map (0 ms)
>>>>>>
>>>>>> It looks like we do not measure ref_map for BlueStore Blob and
>>>>>> SharedBlob
>>>>>> classes too.
>>>>>>
>>>>>> Any ideas?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Igor
>>>>>>
>>>>>>
>>>>>>
>>
>>



* Re: Bug in mempool::map?
  2016-12-20 21:29           ` Igor Fedotov
@ 2016-12-21 21:43             ` Sage Weil
  2016-12-21 22:55               ` Igor Fedotov
  0 siblings, 1 reply; 13+ messages in thread
From: Sage Weil @ 2016-12-21 21:43 UTC (permalink / raw)
  To: Igor Fedotov; +Cc: Allen Samuels, ceph-devel

On Wed, 21 Dec 2016, Igor Fedotov wrote:
> On 12/20/2016 11:15 PM, Sage Weil wrote:
> > It seems like we have a few common cases...
> > 
> > 1) big blob, single ref (e.g., immutable large object, and then the entire
> > thing is referenced, modulo the last block).
> > 
> > 2) tiny blob, single alloc unit.  a counter is sufficient.
> > 
> > 3) everything else in between...
> > 
> > We could probably get away with special-case efficient representations of
> > just 1 and 2?
> I'd say 1) = entire blob is referenced just once. And 2) = single alloc unit
> and partially referenced.
> 
> Then we need at minimum a bit flag for 1), 'int' for 2) and vector<int> for
> 3).
> 
> Possible implementations are
> A.{
> int standalone_counter;  //this covers 1 & 2, value greater than alloc unit
> means 1)
> vector<int> per_au_counters; // and this one for 3 with the first counter
> stored in standalone_counter
> }
> vs.
> B.{
> //reuse some blob flag for 1)
> vector<int> per_au_counters; //for 2) and 3)
> }
> vs.
> C.{
> vector<int> per_au_counters; //for all cases
> }
> 
> A. Saves one allocation for 1 & 2, wastes 4 bytes for 1. Plus requires some
> simple tricks to use both standalone counter and vector.
> B. Saves 1 allocation and 4 bytes for 1. Unified counter vector processing but
> some minor tricks with a flag.
> C. Needs +1 allocation for 1 & 2. And wastes 4 bytes for 1. Unified counter
> processing
> 
> I'd rather prefer B. implementation.

vector<int> is 24 bytes.  So if we're worried about memory, I think we 
want to have an alternative strategy.  Something like:

 uint8_t au_order;  ///< size of allocation unit (min_alloc_size may change)
 union {
   long *ref_by_au;  ///< dynamically allocated array of bytes per au
   long ref_in_single_au;  ///< refs if blob is a single au (au_order=0?)
 }

Something simple like that?

I suspect get_ref/put_ref will get quite a bit simpler and more efficient 
with per-au buckets.

Also, for compressed blobs, we can use the ref_in_single_au mode since we 
can't deallocate it piecewise anyway...
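To make the two modes concrete, here is a rough sketch of how that union
might be driven (field names follow the snippet above; the accessor and
exact semantics are assumptions, not a final design):

  struct blob_ref_t {
    uint8_t au_order;          // encodes the allocation unit size; 0 => single-au blob
    union {
      long *ref_by_au;         // au_order > 0: heap array of per-au byte counts
      long  ref_in_single_au;  // au_order == 0: one counter (also compressed blobs)
    };

    blob_ref_t() : au_order(0), ref_in_single_au(0) {}

    bool single_au() const { return au_order == 0; }

    long refs_for_au(unsigned au) const {
      return single_au() ? ref_in_single_au : ref_by_au[au];
    }
  };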

sage






* Re: Bug in mempool::map?
  2016-12-21 21:43             ` Sage Weil
@ 2016-12-21 22:55               ` Igor Fedotov
  0 siblings, 0 replies; 13+ messages in thread
From: Igor Fedotov @ 2016-12-21 22:55 UTC (permalink / raw)
  To: Sage Weil; +Cc: Allen Samuels, ceph-devel


On 12/22/2016 12:43 AM, Sage Weil wrote:
> On Wed, 21 Dec 2016, Igor Fedotov wrote:
>> On 12/20/2016 11:15 PM, Sage Weil wrote:
>>> It seems like we have a few common cases...
>>>
>>> 1) big blob, single ref (e.g., immutable large object, and then the entire
>>> thing is referenced, modulo the last block).
>>>
>>> 2) tiny blob, single alloc unit.  a counter is sufficient.
>>>
>>> 3) everything else in between...
>>>
>>> We could probably get away with special-case efficient representations of
>>> just 1 and 2?
>> I'd say 1) = entire blob is referenced just once. And 2) = single alloc unit
>> and partially referenced.
>>
>> Then we need at minimum a bit flag for 1), 'int' for 2) and vector<int> for
>> 3).
>>
>> Possible implementations are
>> A.{
>> int standalone_counter;  //this covers 1 & 2, value greater than alloc unit
>> means 1)
>> vector<int> per_au_counters; // and this one for 3 with the first counter
>> stored in standalone_counter
>> }
>> vs.
>> B.{
>> //reuse some blob flag for 1)
>> vector<int> per_au_counters; //for 2) and 3)
>> }
>> vs.
>> C.{
>> vector<int> per_au_counters; //for all cases
>> }
>>
>> A. Saves one allocation for 1 & 2, wastes 4 bytes for 1. Plus requires some
>> simple tricks to use both standalone counter and vector.
>> B. Saves 1 allocation and 4 bytes for 1. Unified counter vector processing but
>> some minor tricks with a flag.
>> C. Needs +1 allocation for 1 & 2. And wastes 4 bytes for 1. Unified counter
>> processing
>>
>> I'd rather prefer B. implementation.
> vector<int> is 24 bytes.  So if we're worried about memory, I think we
Yep, missed that...
> want to have an alternative strategy.  Something like:
>
>   uint8_t au_order;  ///< size of allocation unit (min_alloc_size may change)
Unless we pack the struct, this byte field adds +8 bytes to the struct size
on a 64-bit build. Another point is heap allocator granularity - most
probably it will align the allocated struct to 8-byte boundaries anyway.
Hence we can use a full-width value instead of au_order.
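The alignment point is easy to verify - both variants below come out at 16
bytes on a typical LP64 build, so the single byte buys nothing (field names
are illustrative only):

  #include <cstdint>
  #include <iostream>

  struct with_order { uint8_t  au_order; union { long *p; long v; }; };
  struct with_size  { uint32_t au_size;  union { long *p; long v; }; };

  int main() {
    // The 8-byte union forces 8-byte alignment of the struct, so the
    // single byte is padded out: both print 16 on a typical LP64 build.
    std::cout << sizeof(with_order) << " " << sizeof(with_size) << std::endl;
    return 0;
  }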

Another concern is min_alloc_size change in general - have we ever 
performed any robust BlueStore testing against this feature?
It looks pretty dangerous and error-prone...

>   union {
>     long *ref_by_au;  ///< dynamically allocated array of bytes per au
>     long ref_in_single_au;  ///< refs if blob is a single au (au_order=0?)
>   }
>
> Something simple like that?
>
> I suspect get_ref/put_ref will get quite a bit simpler and more efficient
> with per-au buckets.
yep.
> Also, for compressed blobs, we can use the ref_in_single_au mode since we
> can't deallocate it piecewise anyway...
+1

Also, please note that this change will probably impact our current Onode
encoding: spanning blobs need their ref counters to be persistent...

Thanks,
Igor

