* Bug in mempool::map?
@ 2016-12-20 15:04 Igor Fedotov
2016-12-20 15:25 ` Sage Weil
0 siblings, 1 reply; 13+ messages in thread
From: Igor Fedotov @ 2016-12-20 15:04 UTC (permalink / raw)
To: Allen Samuels
Hi Allen,
It looks like mempools don't measure map allocations properly.
I extended unittest_mempool in the following way, but the corresponding
output is always 0 for both the 'before' and 'after' values:
diff --git a/src/test/test_mempool.cc b/src/test/test_mempool.cc
index 4113c53..b38a356 100644
--- a/src/test/test_mempool.cc
+++ b/src/test/test_mempool.cc
@@ -232,9 +232,19 @@ TEST(mempool, set)
TEST(mempool, map)
{
{
- mempool::unittest_1::map<int,int> v;
- v[1] = 2;
- v[3] = 4;
+ size_t before = mempool::buffer_data::allocated_bytes();
+ mempool::unittest_1::map<int,int>* v = new mempool::unittest_1::map<int,int>;
+ (*v)[1] = 2;
+ (*v)[3] = 4;
+ size_t after = mempool::buffer_data::allocated_bytes();
+ cout << "before " << before << " after " << after << std::endl;
+ delete v;
+ before = after;
+ mempool::unittest_1::map<int64_t,int64_t> v2;
+ v2[1] = 2;
+ v2[3] = 4;
+ after = mempool::buffer_data::allocated_bytes();
+ cout << " before " << before << " after " << after << std::endl;
}
{
mempool::unittest_2::map<int,obj> v;
Output:
[ RUN ] mempool.map
before 0 after 0
before 0 after 0
[ OK ] mempool.map (0 ms)
It looks like we do not measure the ref_map for the BlueStore Blob and
SharedBlob classes either.
Any ideas?
Thanks,
Igor
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: Bug in mempool::map?
2016-12-20 15:04 Bug in mempool::map? Igor Fedotov
@ 2016-12-20 15:25 ` Sage Weil
2016-12-20 15:56 ` Igor Fedotov
2016-12-20 17:08 ` Igor Fedotov
0 siblings, 2 replies; 13+ messages in thread
From: Sage Weil @ 2016-12-20 15:25 UTC (permalink / raw)
To: Igor Fedotov; +Cc: Allen Samuels
On Tue, 20 Dec 2016, Igor Fedotov wrote:
> Hi Allen,
>
> It looks like mempools don't measure maps allocations properly.
>
> I extended unittest_mempool in the following way but corresponding output is
> always 0 for both 'before' and 'after' values:
>
>
> diff --git a/src/test/test_mempool.cc b/src/test/test_mempool.cc
> index 4113c53..b38a356 100644
> --- a/src/test/test_mempool.cc
> +++ b/src/test/test_mempool.cc
> @@ -232,9 +232,19 @@ TEST(mempool, set)
> TEST(mempool, map)
> {
> {
> - mempool::unittest_1::map<int,int> v;
> - v[1] = 2;
> - v[3] = 4;
> + size_t before = mempool::buffer_data::allocated_bytes();
I think it's just that you're measuring the buffer_data pool...
> + mempool::unittest_1::map<int,int>* v = new mempool::unittest_1::map<int,int>;
but the map is in the unittest_1 pool?
> + (*v)[1] = 2;
> + (*v)[3] = 4;
> + size_t after = mempool::buffer_data::allocated_bytes();
> + cout << "before " << before << " after " << after << std::endl;
> + delete v;
> + before = after;
> + mempool::unittest_1::map<int64_t,int64_t> v2;
> + v2[1] = 2;
> + v2[3] = 4;
> + after = mempool::buffer_data::allocated_bytes();
> + cout << " before " << before << " after " << after << std::endl;
> }
> {
> mempool::unittest_2::map<int,obj> v;
>
>
> Output:
>
> [ RUN ] mempool.map
> before 0 after 0
> before 0 after 0
> [ OK ] mempool.map (0 ms)
>
> It looks like we do not measure ref_map for BlueStore Blob and SharedBlob
> classes too.
>
> Any ideas?
>
> Thanks,
>
> Igor
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
* Re: Bug in mempool::map?
2016-12-20 15:25 ` Sage Weil
@ 2016-12-20 15:56 ` Igor Fedotov
2016-12-20 17:08 ` Igor Fedotov
1 sibling, 0 replies; 13+ messages in thread
From: Igor Fedotov @ 2016-12-20 15:56 UTC (permalink / raw)
To: Sage Weil; +Cc: Allen Samuels, ceph-devel
Yeah, indeed. Thanks!
On 20.12.2016 18:25, Sage Weil wrote:
> On Tue, 20 Dec 2016, Igor Fedotov wrote:
>> Hi Allen,
>>
>> It looks like mempools don't measure maps allocations properly.
>>
>> I extended unittest_mempool in the following way but corresponding output is
>> always 0 for both 'before' and 'after' values:
>>
>>
>> diff --git a/src/test/test_mempool.cc b/src/test/test_mempool.cc
>> index 4113c53..b38a356 100644
>> --- a/src/test/test_mempool.cc
>> +++ b/src/test/test_mempool.cc
>> @@ -232,9 +232,19 @@ TEST(mempool, set)
>> TEST(mempool, map)
>> {
>> {
>> - mempool::unittest_1::map<int,int> v;
>> - v[1] = 2;
>> - v[3] = 4;
>> + size_t before = mempool::buffer_data::allocated_bytes();
> I think it's just that you're measuring the buffer_data pool...
>
>> + mempool::unittest_1::map<int,int>* v = new mempool::unittest_1::map<int,int>;
> but the map is in the unittest_1 pool?
>
>> + (*v)[1] = 2;
>> + (*v)[3] = 4;
>> + size_t after = mempool::buffer_data::allocated_bytes();
>> + cout << "before " << before << " after " << after << std::endl;
>> + delete v;
>> + before = after;
>> + mempool::unittest_1::map<int64_t,int64_t> v2;
>> + v2[1] = 2;
>> + v2[3] = 4;
>> + after = mempool::buffer_data::allocated_bytes();
>> + cout << " before " << before << " after " << after << std::endl;
>> }
>> {
>> mempool::unittest_2::map<int,obj> v;
>>
>>
>> Output:
>>
>> [ RUN ] mempool.map
>> before 0 after 0
>> before 0 after 0
>> [ OK ] mempool.map (0 ms)
>>
>> It looks like we do not measure ref_map for BlueStore Blob and SharedBlob
>> classes too.
>>
>> Any ideas?
>>
>> Thanks,
>>
>> Igor
>>
>>
>>
* Re: Bug in mempool::map?
2016-12-20 15:25 ` Sage Weil
2016-12-20 15:56 ` Igor Fedotov
@ 2016-12-20 17:08 ` Igor Fedotov
2016-12-20 17:27 ` Allen Samuels
` (2 more replies)
1 sibling, 3 replies; 13+ messages in thread
From: Igor Fedotov @ 2016-12-20 17:08 UTC (permalink / raw)
To: Sage Weil; +Cc: Allen Samuels, ceph-devel
Some update on map<uint64_t, uint32_t> mem usage.
It looks like a single-entry map takes 48 bytes, and 40 bytes for
map<uint32_t,uint32_t>.
Hence 1024 trivial ref_maps for 1024 blobs take >48K!
These are my results taken from mempools, and they look pretty similar
to what's described in the following article:
http://lemire.me/blog/2016/09/15/the-memory-usage-of-stl-containers-can-be-surprising/
Sage, during the standup you mentioned that you're planning to do
something with ref maps, but I missed the details. Is that about their
memory use, or something else?
Thanks,
Igor
On 20.12.2016 18:25, Sage Weil wrote:
> On Tue, 20 Dec 2016, Igor Fedotov wrote:
>> Hi Allen,
>>
>> It looks like mempools don't measure maps allocations properly.
>>
>> I extended unittest_mempool in the following way but corresponding output is
>> always 0 for both 'before' and 'after' values:
>>
>>
>> diff --git a/src/test/test_mempool.cc b/src/test/test_mempool.cc
>> index 4113c53..b38a356 100644
>> --- a/src/test/test_mempool.cc
>> +++ b/src/test/test_mempool.cc
>> @@ -232,9 +232,19 @@ TEST(mempool, set)
>> TEST(mempool, map)
>> {
>> {
>> - mempool::unittest_1::map<int,int> v;
>> - v[1] = 2;
>> - v[3] = 4;
>> + size_t before = mempool::buffer_data::allocated_bytes();
> I think it's just that you're measuring the buffer_data pool...
>
>> + mempool::unittest_1::map<int,int>* v = new mempool::unittest_1::map<int,int>;
> but the map is in the unittest_1 pool?
>
>> + (*v)[1] = 2;
>> + (*v)[3] = 4;
>> + size_t after = mempool::buffer_data::allocated_bytes();
>> + cout << "before " << before << " after " << after << std::endl;
>> + delete v;
>> + before = after;
>> + mempool::unittest_1::map<int64_t,int64_t> v2;
>> + v2[1] = 2;
>> + v2[3] = 4;
>> + after = mempool::buffer_data::allocated_bytes();
>> + cout << " before " << before << " after " << after << std::endl;
>> }
>> {
>> mempool::unittest_2::map<int,obj> v;
>>
>>
>> Output:
>>
>> [ RUN ] mempool.map
>> before 0 after 0
>> before 0 after 0
>> [ OK ] mempool.map (0 ms)
>>
>> It looks like we do not measure ref_map for BlueStore Blob and SharedBlob
>> classes too.
>>
>> Any ideas?
>>
>> Thanks,
>>
>> Igor
>>
>>
>>
* RE: Bug in mempool::map?
2016-12-20 17:08 ` Igor Fedotov
@ 2016-12-20 17:27 ` Allen Samuels
2016-12-20 17:34 ` Igor Fedotov
2016-12-20 17:37 ` Allen Samuels
2016-12-20 18:26 ` Sage Weil
2 siblings, 1 reply; 13+ messages in thread
From: Allen Samuels @ 2016-12-20 17:27 UTC (permalink / raw)
To: Igor Fedotov, Sage Weil; +Cc: ceph-devel
Yes, Slab containers helps here by amortizing the malloc overhead for small nodes.
Allen Samuels
SanDisk | a Western Digital brand
951 SanDisk Drive, Milpitas, CA 95035
T: +1 408 801 7030 | M: +1 408 780 6416
allen.samuels@SanDisk.com
> -----Original Message-----
> From: Igor Fedotov [mailto:ifedotov@mirantis.com]
> Sent: Tuesday, December 20, 2016 9:08 AM
> To: Sage Weil <sage@newdream.net>
> Cc: Allen Samuels <Allen.Samuels@sandisk.com>; ceph-devel <ceph-
> devel@vger.kernel.org>
> Subject: Re: Bug in mempool::map?
>
> Some update on map<uint64_t, uint32_t> mem usage.
>
> It looks like single entry map takes 48 bytes. And 40 bytes for
> map<uint32_t,uint32_t>.
>
> Hence 1024 trivial ref_maps for 1024 blobs takes >48K!
>
> These are my results taken from mempools. And they look pretty similar
> to what's been said in the following article:
>
> http://lemire.me/blog/2016/09/15/the-memory-usage-of-stl-containers-
> can-be-surprising/
>
>
> Sage, you mentioned that you're planning to do something with ref maps
> during the standup but I missed the details. Is that something about
> their mem use or anything else?
>
>
> Thanks,
>
> Igor
>
>
>
> On 20.12.2016 18:25, Sage Weil wrote:
> > On Tue, 20 Dec 2016, Igor Fedotov wrote:
> >> Hi Allen,
> >>
> >> It looks like mempools don't measure maps allocations properly.
> >>
> >> I extended unittest_mempool in the following way but corresponding
> output is
> >> always 0 for both 'before' and 'after' values:
> >>
> >>
> >> diff --git a/src/test/test_mempool.cc b/src/test/test_mempool.cc
> >> index 4113c53..b38a356 100644
> >> --- a/src/test/test_mempool.cc
> >> +++ b/src/test/test_mempool.cc
> >> @@ -232,9 +232,19 @@ TEST(mempool, set)
> >> TEST(mempool, map)
> >> {
> >> {
> >> - mempool::unittest_1::map<int,int> v;
> >> - v[1] = 2;
> >> - v[3] = 4;
> >> + size_t before = mempool::buffer_data::allocated_bytes();
> > I think it's just that you're measuring the buffer_data pool...
> >
> >> + mempool::unittest_1::map<int,int>* v = new
> mempool::unittest_1::map<int,int>;
> > but the map is in the unittest_1 pool?
> >
> >> + (*v)[1] = 2;
> >> + (*v)[3] = 4;
> >> + size_t after = mempool::buffer_data::allocated_bytes();
> >> + cout << "before " << before << " after " << after << std::endl;
> >> + delete v;
> >> + before = after;
> >> + mempool::unittest_1::map<int64_t,int64_t> v2;
> >> + v2[1] = 2;
> >> + v2[3] = 4;
> >> + after = mempool::buffer_data::allocated_bytes();
> >> + cout << " before " << before << " after " << after << std::endl;
> >> }
> >> {
> >> mempool::unittest_2::map<int,obj> v;
> >>
> >>
> >> Output:
> >>
> >> [ RUN ] mempool.map
> >> before 0 after 0
> >> before 0 after 0
> >> [ OK ] mempool.map (0 ms)
> >>
> >> It looks like we do not measure ref_map for BlueStore Blob and
> SharedBlob
> >> classes too.
> >>
> >> Any ideas?
> >>
> >> Thanks,
> >>
> >> Igor
> >>
> >>
> >>
* Re: Bug in mempool::map?
2016-12-20 17:27 ` Allen Samuels
@ 2016-12-20 17:34 ` Igor Fedotov
0 siblings, 0 replies; 13+ messages in thread
From: Igor Fedotov @ 2016-12-20 17:34 UTC (permalink / raw)
To: Allen Samuels, Sage Weil; +Cc: ceph-devel
Yeah, but each std::map entry also carries 3 pointers per item - hence 24
bytes of overhead per 12 bytes (offset, len, ref_count) of useful payload...
On 20.12.2016 20:27, Allen Samuels wrote:
> Yes, Slab containers helps here by amortizing the malloc overhead for small nodes.
>
> Allen Samuels
> SanDisk |a Western Digital brand
> 951 SanDisk Drive, Milpitas, CA 95035
> T: +1 408 801 7030| M: +1 408 780 6416
> allen.samuels@SanDisk.com
>
>> -----Original Message-----
>> From: Igor Fedotov [mailto:ifedotov@mirantis.com]
>> Sent: Tuesday, December 20, 2016 9:08 AM
>> To: Sage Weil <sage@newdream.net>
>> Cc: Allen Samuels <Allen.Samuels@sandisk.com>; ceph-devel <ceph-
>> devel@vger.kernel.org>
>> Subject: Re: Bug in mempool::map?
>>
>> Some update on map<uint64_t, uint32_t> mem usage.
>>
>> It looks like single entry map takes 48 bytes. And 40 bytes for
>> map<uint32_t,uint32_t>.
>>
>> Hence 1024 trivial ref_maps for 1024 blobs takes >48K!
>>
>> These are my results taken from mempools. And they look pretty similar
>> to what's been said in the following article:
>>
>> http://lemire.me/blog/2016/09/15/the-memory-usage-of-stl-containers-
>> can-be-surprising/
>>
>>
>> Sage, you mentioned that you're planning to do something with ref maps
>> during the standup but I missed the details. Is that something about
>> their mem use or anything else?
>>
>>
>> Thanks,
>>
>> Igor
>>
>>
>>
>> On 20.12.2016 18:25, Sage Weil wrote:
>>> On Tue, 20 Dec 2016, Igor Fedotov wrote:
>>>> Hi Allen,
>>>>
>>>> It looks like mempools don't measure maps allocations properly.
>>>>
>>>> I extended unittest_mempool in the following way but corresponding
>> output is
>>>> always 0 for both 'before' and 'after' values:
>>>>
>>>>
>>>> diff --git a/src/test/test_mempool.cc b/src/test/test_mempool.cc
>>>> index 4113c53..b38a356 100644
>>>> --- a/src/test/test_mempool.cc
>>>> +++ b/src/test/test_mempool.cc
>>>> @@ -232,9 +232,19 @@ TEST(mempool, set)
>>>> TEST(mempool, map)
>>>> {
>>>> {
>>>> - mempool::unittest_1::map<int,int> v;
>>>> - v[1] = 2;
>>>> - v[3] = 4;
>>>> + size_t before = mempool::buffer_data::allocated_bytes();
>>> I think it's just that you're measuring the buffer_data pool...
>>>
>>>> + mempool::unittest_1::map<int,int>* v = new
>> mempool::unittest_1::map<int,int>;
>>> but the map is in the unittest_1 pool?
>>>
>>>> + (*v)[1] = 2;
>>>> + (*v)[3] = 4;
>>>> + size_t after = mempool::buffer_data::allocated_bytes();
>>>> + cout << "before " << before << " after " << after << std::endl;
>>>> + delete v;
>>>> + before = after;
>>>> + mempool::unittest_1::map<int64_t,int64_t> v2;
>>>> + v2[1] = 2;
>>>> + v2[3] = 4;
>>>> + after = mempool::buffer_data::allocated_bytes();
>>>> + cout << " before " << before << " after " << after << std::endl;
>>>> }
>>>> {
>>>> mempool::unittest_2::map<int,obj> v;
>>>>
>>>>
>>>> Output:
>>>>
>>>> [ RUN ] mempool.map
>>>> before 0 after 0
>>>> before 0 after 0
>>>> [ OK ] mempool.map (0 ms)
>>>>
>>>> It looks like we do not measure ref_map for BlueStore Blob and
>> SharedBlob
>>>> classes too.
>>>>
>>>> Any ideas?
>>>>
>>>> Thanks,
>>>>
>>>> Igor
>>>>
>>>>
>>>>
* RE: Bug in mempool::map?
2016-12-20 17:08 ` Igor Fedotov
2016-12-20 17:27 ` Allen Samuels
@ 2016-12-20 17:37 ` Allen Samuels
2016-12-20 18:26 ` Sage Weil
2 siblings, 0 replies; 13+ messages in thread
From: Allen Samuels @ 2016-12-20 17:37 UTC (permalink / raw)
To: Igor Fedotov, Sage Weil; +Cc: ceph-devel
But it won't eliminate the RB-tree overhead: 3 pointers plus a flag, which is probably rounded up to 8 bytes => 32 bytes, on top of 16 bytes of data (again, rounded up).
Also, the mempool stats don't include malloc overhead (depending on the implementation there may be an 8- or 16-byte header for each allocation), so the true consumption could be substantially worse.
Allen Samuels
SanDisk | a Western Digital brand
951 SanDisk Drive, Milpitas, CA 95035
T: +1 408 801 7030 | M: +1 408 780 6416
allen.samuels@SanDisk.com
> -----Original Message-----
> From: Allen Samuels
> Sent: Tuesday, December 20, 2016 9:28 AM
> To: 'Igor Fedotov' <ifedotov@mirantis.com>; Sage Weil
> <sage@newdream.net>
> Cc: ceph-devel <ceph-devel@vger.kernel.org>
> Subject: RE: Bug in mempool::map?
>
> Yes, Slab containers helps here by amortizing the malloc overhead for small
> nodes.
>
> Allen Samuels
> SanDisk |a Western Digital brand
> 951 SanDisk Drive, Milpitas, CA 95035
> T: +1 408 801 7030| M: +1 408 780 6416
> allen.samuels@SanDisk.com
>
> > -----Original Message-----
> > From: Igor Fedotov [mailto:ifedotov@mirantis.com]
> > Sent: Tuesday, December 20, 2016 9:08 AM
> > To: Sage Weil <sage@newdream.net>
> > Cc: Allen Samuels <Allen.Samuels@sandisk.com>; ceph-devel <ceph-
> > devel@vger.kernel.org>
> > Subject: Re: Bug in mempool::map?
> >
> > Some update on map<uint64_t, uint32_t> mem usage.
> >
> > It looks like single entry map takes 48 bytes. And 40 bytes for
> > map<uint32_t,uint32_t>.
> >
> > Hence 1024 trivial ref_maps for 1024 blobs takes >48K!
> >
> > These are my results taken from mempools. And they look pretty similar
> > to what's been said in the following article:
> >
> > http://lemire.me/blog/2016/09/15/the-memory-usage-of-stl-containers-
> > can-be-surprising/
> >
> >
> > Sage, you mentioned that you're planning to do something with ref maps
> > during the standup but I missed the details. Is that something about
> > their mem use or anything else?
> >
> >
> > Thanks,
> >
> > Igor
> >
> >
> >
> > On 20.12.2016 18:25, Sage Weil wrote:
> > > On Tue, 20 Dec 2016, Igor Fedotov wrote:
> > >> Hi Allen,
> > >>
> > >> It looks like mempools don't measure maps allocations properly.
> > >>
> > >> I extended unittest_mempool in the following way but corresponding
> > output is
> > >> always 0 for both 'before' and 'after' values:
> > >>
> > >>
> > >> diff --git a/src/test/test_mempool.cc b/src/test/test_mempool.cc
> > >> index 4113c53..b38a356 100644
> > >> --- a/src/test/test_mempool.cc
> > >> +++ b/src/test/test_mempool.cc
> > >> @@ -232,9 +232,19 @@ TEST(mempool, set)
> > >> TEST(mempool, map)
> > >> {
> > >> {
> > >> - mempool::unittest_1::map<int,int> v;
> > >> - v[1] = 2;
> > >> - v[3] = 4;
> > >> + size_t before = mempool::buffer_data::allocated_bytes();
> > > I think it's just that you're measuring the buffer_data pool...
> > >
> > >> + mempool::unittest_1::map<int,int>* v = new
> > mempool::unittest_1::map<int,int>;
> > > but the map is in the unittest_1 pool?
> > >
> > >> + (*v)[1] = 2;
> > >> + (*v)[3] = 4;
> > >> + size_t after = mempool::buffer_data::allocated_bytes();
> > >> + cout << "before " << before << " after " << after << std::endl;
> > >> + delete v;
> > >> + before = after;
> > >> + mempool::unittest_1::map<int64_t,int64_t> v2;
> > >> + v2[1] = 2;
> > >> + v2[3] = 4;
> > >> + after = mempool::buffer_data::allocated_bytes();
> > >> + cout << " before " << before << " after " << after << std::endl;
> > >> }
> > >> {
> > >> mempool::unittest_2::map<int,obj> v;
> > >>
> > >>
> > >> Output:
> > >>
> > >> [ RUN ] mempool.map
> > >> before 0 after 0
> > >> before 0 after 0
> > >> [ OK ] mempool.map (0 ms)
> > >>
> > >> It looks like we do not measure ref_map for BlueStore Blob and
> > SharedBlob
> > >> classes too.
> > >>
> > >> Any ideas?
> > >>
> > >> Thanks,
> > >>
> > >> Igor
> > >>
> > >>
> > >>
* Re: Bug in mempool::map?
2016-12-20 17:08 ` Igor Fedotov
2016-12-20 17:27 ` Allen Samuels
2016-12-20 17:37 ` Allen Samuels
@ 2016-12-20 18:26 ` Sage Weil
2016-12-20 20:05 ` Igor Fedotov
2 siblings, 1 reply; 13+ messages in thread
From: Sage Weil @ 2016-12-20 18:26 UTC (permalink / raw)
To: Igor Fedotov; +Cc: Allen Samuels, ceph-devel
On Tue, 20 Dec 2016, Igor Fedotov wrote:
> Some update on map<uint64_t, uint32_t> mem usage.
>
> It looks like single entry map takes 48 bytes. And 40 bytes for
> map<uint32_t,uint32_t>.
>
> Hence 1024 trivial ref_maps for 1024 blobs takes >48K!
>
> These are my results taken from mempools. And they look pretty similar to
> what's been said in the following article:
>
> http://lemire.me/blog/2016/09/15/the-memory-usage-of-stl-containers-can-be-surprising/
>
>
> Sage, you mentioned that you're planning to do something with ref maps during
> the standup but I missed the details. Is that something about their mem use or
> anything else?
I mentioned btree_map<> and flat_map<> (new in boost). Probably the thing
to do here is to make extent_ref_map_t handle the common case of 1 (or
maybe 2?) extents inline, and when we go beyond that, allocate another
structure on the heap. That other structure could be std::map<>, but I
think one of the other choices would be better: one larger allocation, and
better performance in general for small maps. This structure will
only get big for very big blobs, which shouldn't be terribly common, I
think.
sage
>
>
> Thanks,
>
> Igor
>
>
>
> On 20.12.2016 18:25, Sage Weil wrote:
> > On Tue, 20 Dec 2016, Igor Fedotov wrote:
> > > Hi Allen,
> > >
> > > It looks like mempools don't measure maps allocations properly.
> > >
> > > I extended unittest_mempool in the following way but corresponding output
> > > is
> > > always 0 for both 'before' and 'after' values:
> > >
> > >
> > > diff --git a/src/test/test_mempool.cc b/src/test/test_mempool.cc
> > > index 4113c53..b38a356 100644
> > > --- a/src/test/test_mempool.cc
> > > +++ b/src/test/test_mempool.cc
> > > @@ -232,9 +232,19 @@ TEST(mempool, set)
> > > TEST(mempool, map)
> > > {
> > > {
> > > - mempool::unittest_1::map<int,int> v;
> > > - v[1] = 2;
> > > - v[3] = 4;
> > > + size_t before = mempool::buffer_data::allocated_bytes();
> > I think it's just that you're measuring the buffer_data pool...
> >
> > > + mempool::unittest_1::map<int,int>* v = new
> > > mempool::unittest_1::map<int,int>;
> > but the map is in the unittest_1 pool?
> >
> > > + (*v)[1] = 2;
> > > + (*v)[3] = 4;
> > > + size_t after = mempool::buffer_data::allocated_bytes();
> > > + cout << "before " << before << " after " << after << std::endl;
> > > + delete v;
> > > + before = after;
> > > + mempool::unittest_1::map<int64_t,int64_t> v2;
> > > + v2[1] = 2;
> > > + v2[3] = 4;
> > > + after = mempool::buffer_data::allocated_bytes();
> > > + cout << " before " << before << " after " << after << std::endl;
> > > }
> > > {
> > > mempool::unittest_2::map<int,obj> v;
> > >
> > >
> > > Output:
> > >
> > > [ RUN ] mempool.map
> > > before 0 after 0
> > > before 0 after 0
> > > [ OK ] mempool.map (0 ms)
> > >
> > > It looks like we do not measure ref_map for BlueStore Blob and SharedBlob
> > > classes too.
> > >
> > > Any ideas?
> > >
> > > Thanks,
> > >
> > > Igor
> > >
> > >
> > >
>
>
* Re: Bug in mempool::map?
2016-12-20 18:26 ` Sage Weil
@ 2016-12-20 20:05 ` Igor Fedotov
2016-12-20 20:15 ` Sage Weil
0 siblings, 1 reply; 13+ messages in thread
From: Igor Fedotov @ 2016-12-20 20:05 UTC (permalink / raw)
To: Sage Weil; +Cc: Allen Samuels, ceph-devel
I think I have a better idea.
We can simply track the number of bytes referenced in the blob. E.g. if
some extent has length 100, this increments the counter by 100
accordingly. Removing/punching an extent decrements the counter.
If we want to be able to deallocate a specific unused pextent within a
blob, as we currently do, then we just need to track that amount on a
per-pextent basis. Hence just a few ints per blob... and no need for map
lookups.
If that's OK I can start implementing it tomorrow.
Thanks,
Igor
The same idea can probably be applied to SharedBlob too.
On 12/20/2016 9:26 PM, Sage Weil wrote:
> On Tue, 20 Dec 2016, Igor Fedotov wrote:
>> Some update on map<uint64_t, uint32_t> mem usage.
>>
>> It looks like single entry map takes 48 bytes. And 40 bytes for
>> map<uint32_t,uint32_t>.
>>
>> Hence 1024 trivial ref_maps for 1024 blobs takes >48K!
>>
>> These are my results taken from mempools. And they look pretty similar to
>> what's been said in the following article:
>>
>> http://lemire.me/blog/2016/09/15/the-memory-usage-of-stl-containers-can-be-surprising/
>>
>>
>> Sage, you mentioned that you're planning to do something with ref maps during
>> the standup but I missed the details. Is that something about their mem use or
>> anything else?
> I mentioned btree_map<> and flat_map<> (new in boost). Probably the thing
> to do here is to make extent_ref_map_t handle the common case of 1 (or
> maybe 2?) extents done inline, and when we go beyond that allocate another
> structure on the heap. That other structure could be std::map<>, but I
> think one of the other choices would be better: one larger allocation and
> better performance in general for small maps. This structure will
> only get big for very big blobs, which shouldn't be terribly common, I
> think.
>
> sage
>
>
>>
>> Thanks,
>>
>> Igor
>>
>>
>>
>> On 20.12.2016 18:25, Sage Weil wrote:
>>> On Tue, 20 Dec 2016, Igor Fedotov wrote:
>>>> Hi Allen,
>>>>
>>>> It looks like mempools don't measure maps allocations properly.
>>>>
>>>> I extended unittest_mempool in the following way but corresponding output
>>>> is
>>>> always 0 for both 'before' and 'after' values:
>>>>
>>>>
>>>> diff --git a/src/test/test_mempool.cc b/src/test/test_mempool.cc
>>>> index 4113c53..b38a356 100644
>>>> --- a/src/test/test_mempool.cc
>>>> +++ b/src/test/test_mempool.cc
>>>> @@ -232,9 +232,19 @@ TEST(mempool, set)
>>>> TEST(mempool, map)
>>>> {
>>>> {
>>>> - mempool::unittest_1::map<int,int> v;
>>>> - v[1] = 2;
>>>> - v[3] = 4;
>>>> + size_t before = mempool::buffer_data::allocated_bytes();
>>> I think it's just that you're measuring the buffer_data pool...
>>>
>>>> + mempool::unittest_1::map<int,int>* v = new
>>>> mempool::unittest_1::map<int,int>;
>>> but the map is in the unittest_1 pool?
>>>
>>>> + (*v)[1] = 2;
>>>> + (*v)[3] = 4;
>>>> + size_t after = mempool::buffer_data::allocated_bytes();
>>>> + cout << "before " << before << " after " << after << std::endl;
>>>> + delete v;
>>>> + before = after;
>>>> + mempool::unittest_1::map<int64_t,int64_t> v2;
>>>> + v2[1] = 2;
>>>> + v2[3] = 4;
>>>> + after = mempool::buffer_data::allocated_bytes();
>>>> + cout << " before " << before << " after " << after << std::endl;
>>>> }
>>>> {
>>>> mempool::unittest_2::map<int,obj> v;
>>>>
>>>>
>>>> Output:
>>>>
>>>> [ RUN ] mempool.map
>>>> before 0 after 0
>>>> before 0 after 0
>>>> [ OK ] mempool.map (0 ms)
>>>>
>>>> It looks like we do not measure ref_map for BlueStore Blob and SharedBlob
>>>> classes too.
>>>>
>>>> Any ideas?
>>>>
>>>> Thanks,
>>>>
>>>> Igor
>>>>
>>>>
>>>>
>>
* Re: Bug in mempool::map?
2016-12-20 20:05 ` Igor Fedotov
@ 2016-12-20 20:15 ` Sage Weil
2016-12-20 21:29 ` Igor Fedotov
0 siblings, 1 reply; 13+ messages in thread
From: Sage Weil @ 2016-12-20 20:15 UTC (permalink / raw)
To: Igor Fedotov; +Cc: Allen Samuels, ceph-devel
On Tue, 20 Dec 2016, Igor Fedotov wrote:
> I think I have a better idea.
>
> We can simply track amount of bytes referenced in the blob. E.g. some extent
> has length 100 - this increments the counter by 100 accordingly.
> Removing/punching an extent decrements the counter.
>
> If we want tp be able to deallocate specific unused pextent within a blob as
> we currently do then we just need to track that amount on per-pextent basis.
> Hence just a few ints per blob... And no need for map lookups.
For small blobs, a counter is probably sufficient since we're within a
single min_alloc_size unit anyway. That's actually more like a vector of
size (min_alloc_size count), with 1 being the degenerate case. Is that
what you're thinking?
It seems like we have a few common cases...
1) big blob, single ref (e.g., an immutable large object, where the entire
thing is referenced, modulo the last block).
2) tiny blob, single alloc unit. A counter is sufficient.
3) everything else in between...
We could probably get away with special-case efficient representations of
just cases 1 and 2?
sage
>
> If that's OK I can start implementing it tomorrow.
>
>
> Thanks,
>
> Igor
>
>
> The same idea can be probably applied to SharedBlob too.
>
> On 12/20/2016 9:26 PM, Sage Weil wrote:
> > On Tue, 20 Dec 2016, Igor Fedotov wrote:
> > > Some update on map<uint64_t, uint32_t> mem usage.
> > >
> > > It looks like single entry map takes 48 bytes. And 40 bytes for
> > > map<uint32_t,uint32_t>.
> > >
> > > Hence 1024 trivial ref_maps for 1024 blobs takes >48K!
> > >
> > > These are my results taken from mempools. And they look pretty similar to
> > > what's been said in the following article:
> > >
> > > http://lemire.me/blog/2016/09/15/the-memory-usage-of-stl-containers-can-be-surprising/
> > >
> > >
> > > Sage, you mentioned that you're planning to do something with ref maps
> > > during
> > > the standup but I missed the details. Is that something about their mem
> > > use or
> > > anything else?
> > I mentioned btree_map<> and flat_map<> (new in boost). Probably the thing
> > to do here is to make extent_ref_map_t handle the common case of 1 (or
> > maybe 2?) extents done inline, and when we go beyond that allocate another
> > structure on the heap. That other structure could be std::map<>, but I
> > think one of the other choices would be better: one larger allocation and
> > better performance in general for small maps. This structure will
> > only get big for very big blobs, which shouldn't be terribly common, I
> > think.
> >
> > sage
> >
> >
> > >
> > > Thanks,
> > >
> > > Igor
> > >
> > >
> > >
> > > On 20.12.2016 18:25, Sage Weil wrote:
> > > > On Tue, 20 Dec 2016, Igor Fedotov wrote:
> > > > > Hi Allen,
> > > > >
> > > > > It looks like mempools don't measure maps allocations properly.
> > > > >
> > > > > I extended unittest_mempool in the following way but corresponding
> > > > > output
> > > > > is
> > > > > always 0 for both 'before' and 'after' values:
> > > > >
> > > > >
> > > > > diff --git a/src/test/test_mempool.cc b/src/test/test_mempool.cc
> > > > > index 4113c53..b38a356 100644
> > > > > --- a/src/test/test_mempool.cc
> > > > > +++ b/src/test/test_mempool.cc
> > > > > @@ -232,9 +232,19 @@ TEST(mempool, set)
> > > > > TEST(mempool, map)
> > > > > {
> > > > > {
> > > > > - mempool::unittest_1::map<int,int> v;
> > > > > - v[1] = 2;
> > > > > - v[3] = 4;
> > > > > + size_t before = mempool::buffer_data::allocated_bytes();
> > > > I think it's just that you're measuring the buffer_data pool...
> > > >
> > > > > + mempool::unittest_1::map<int,int>* v = new
> > > > > mempool::unittest_1::map<int,int>;
> > > > but the map is in the unittest_1 pool?
> > > >
> > > > > + (*v)[1] = 2;
> > > > > + (*v)[3] = 4;
> > > > > + size_t after = mempool::buffer_data::allocated_bytes();
> > > > > + cout << "before " << before << " after " << after << std::endl;
> > > > > + delete v;
> > > > > + before = after;
> > > > > + mempool::unittest_1::map<int64_t,int64_t> v2;
> > > > > + v2[1] = 2;
> > > > > + v2[3] = 4;
> > > > > + after = mempool::buffer_data::allocated_bytes();
> > > > > + cout << " before " << before << " after " << after << std::endl;
> > > > > }
> > > > > {
> > > > > mempool::unittest_2::map<int,obj> v;
> > > > >
> > > > >
> > > > > Output:
> > > > >
> > > > > [ RUN ] mempool.map
> > > > > before 0 after 0
> > > > > before 0 after 0
> > > > > [ OK ] mempool.map (0 ms)
> > > > >
> > > > > It looks like we do not measure ref_map for BlueStore Blob and
> > > > > SharedBlob
> > > > > classes too.
> > > > >
> > > > > Any ideas?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Igor
> > > > >
> > > > > --
> > > > > To unsubscribe from this list: send the line "unsubscribe ceph-devel"
> > > > > in
> > > > > the body of a message to majordomo@vger.kernel.org
> > > > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > > >
> > > > >
> > >
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Bug in mempool::map?
2016-12-20 20:15 ` Sage Weil
@ 2016-12-20 21:29 ` Igor Fedotov
2016-12-21 21:43 ` Sage Weil
0 siblings, 1 reply; 13+ messages in thread
From: Igor Fedotov @ 2016-12-20 21:29 UTC (permalink / raw)
To: Sage Weil; +Cc: Allen Samuels, ceph-devel
On 12/20/2016 11:15 PM, Sage Weil wrote:
> On Tue, 20 Dec 2016, Igor Fedotov wrote:
>> I think I have a better idea.
>>
>> We can simply track amount of bytes referenced in the blob. E.g. some extent
>> has length 100 - this increments the counter by 100 accordingly.
>> Removing/punching an extent decrements the counter.
>>
>> If we want to be able to deallocate a specific unused pextent within a blob, as
>> we currently do, then we just need to track that amount on a per-pextent basis.
>> Hence just a few ints per blob... And no need for map lookups.
> For small blobs, a counter is probably sufficient since we're a single
> min_alloc_size unit anyway. That's actually more like a vector of size
> (min_alloc_size count), with 1 being the degenerate case. Is that what
> you're thinking?
Yes. And the key point is that we always count the BYTES that are
referenced per blob (or rather per alloc unit), not the number of
references (extents) per specific interval in the blob.
E.g. the extents
0~200 & 300~100
produce a counter for AU(0) = 300,
and the extents
0x1000~1, 0x1010~2, 0x1055~3
produce a counter for AU(1) = 6.
Hence we always store a VECTOR of counters, not a map.
> It seems like we have a few common cases...
>
> 1) big blob, single ref (e.g., immutable large object, and then the entire
> thing is referenced, modulo the last block).
>
> 2) tiny blob, single alloc unit. a counter is sufficient.
>
> 3) everything else in between...
>
> We could probably get away with special-case efficient representations of
> just 1 and 2?
I'd say 1) = the entire blob is referenced just once, and 2) = a single
alloc unit, partially referenced.
Then we need at minimum: a bit flag for 1), an 'int' for 2), and a
vector<int> for 3).
Possible implementations are:
A. {
  int standalone_counter;      // covers 1) & 2); a value greater than the
                               // alloc unit means 1)
  vector<int> per_au_counters; // for 3), with the first counter stored in
                               // standalone_counter
}
vs.
B. {
  // reuse some blob flag for 1)
  vector<int> per_au_counters; // for 2) and 3)
}
vs.
C. {
  vector<int> per_au_counters; // for all cases
}
A. Saves one allocation for 1) & 2); wastes 4 bytes for 1). Plus it
requires some simple tricks to use both the standalone counter and the
vector.
B. Saves one allocation and 4 bytes for 1). Unified counter-vector
processing, but some minor tricks with a flag.
C. Needs +1 allocation for 1) & 2), and wastes 4 bytes for 1). Unified
counter processing.
I'd rather prefer the B. implementation.
> sage
>
>
>> If that's OK I can start implementing it tomorrow.
>>
>>
>> Thanks,
>>
>> Igor
>>
>>
>> The same idea can be probably applied to SharedBlob too.
>>
>> On 12/20/2016 9:26 PM, Sage Weil wrote:
>>> On Tue, 20 Dec 2016, Igor Fedotov wrote:
>>>> Some update on map<uint64_t, uint32_t> mem usage.
>>>>
>>>> It looks like single entry map takes 48 bytes. And 40 bytes for
>>>> map<uint32_t,uint32_t>.
>>>>
>>>> Hence 1024 trivial ref_maps for 1024 blobs takes >48K!
>>>>
>>>> These are my results taken from mempools. And they look pretty similar to
>>>> what's been said in the following article:
>>>>
>>>> http://lemire.me/blog/2016/09/15/the-memory-usage-of-stl-containers-can-be-surprising/
>>>>
>>>>
>>>> Sage, you mentioned that you're planning to do something with ref maps
>>>> during
>>>> the standup but I missed the details. Is that something about their mem
>>>> use or
>>>> anything else?
>>> I mentioned btree_map<> and flat_map<> (new in boost). Probably the thing
>>> to do here is to make extent_ref_map_t handle the common case of 1 (or
>>> maybe 2?) extents done inline, and when we go beyond that allocate another
>>> structure on the heap. That other structure could be std::map<>, but I
>>> think one of the other choices would be better: one larger allocation and
>>> better performance in general for small maps. This structure will
>>> only get big for very big blobs, which shouldn't be terribly common, I
>>> think.
>>>
>>> sage
>>>
>>>
>>>> Thanks,
>>>>
>>>> Igor
>>>>
>>>>
>>>>
>>>> On 20.12.2016 18:25, Sage Weil wrote:
>>>>> On Tue, 20 Dec 2016, Igor Fedotov wrote:
>>>>>> Hi Allen,
>>>>>>
>>>>>> It looks like mempools don't measure maps allocations properly.
>>>>>>
>>>>>> I extended unittest_mempool in the following way but corresponding
>>>>>> output
>>>>>> is
>>>>>> always 0 for both 'before' and 'after' values:
>>>>>>
>>>>>>
>>>>>> diff --git a/src/test/test_mempool.cc b/src/test/test_mempool.cc
>>>>>> index 4113c53..b38a356 100644
>>>>>> --- a/src/test/test_mempool.cc
>>>>>> +++ b/src/test/test_mempool.cc
>>>>>> @@ -232,9 +232,19 @@ TEST(mempool, set)
>>>>>> TEST(mempool, map)
>>>>>> {
>>>>>> {
>>>>>> - mempool::unittest_1::map<int,int> v;
>>>>>> - v[1] = 2;
>>>>>> - v[3] = 4;
>>>>>> + size_t before = mempool::buffer_data::allocated_bytes();
>>>>> I think it's just that you're measuring the buffer_data pool...
>>>>>
>>>>>> + mempool::unittest_1::map<int,int>* v = new
>>>>>> mempool::unittest_1::map<int,int>;
>>>>> but the map is in the unittest_1 pool?
>>>>>
>>>>>> + (*v)[1] = 2;
>>>>>> + (*v)[3] = 4;
>>>>>> + size_t after = mempool::buffer_data::allocated_bytes();
>>>>>> + cout << "before " << before << " after " << after << std::endl;
>>>>>> + delete v;
>>>>>> + before = after;
>>>>>> + mempool::unittest_1::map<int64_t,int64_t> v2;
>>>>>> + v2[1] = 2;
>>>>>> + v2[3] = 4;
>>>>>> + after = mempool::buffer_data::allocated_bytes();
>>>>>> + cout << " before " << before << " after " << after << std::endl;
>>>>>> }
>>>>>> {
>>>>>> mempool::unittest_2::map<int,obj> v;
>>>>>>
>>>>>>
>>>>>> Output:
>>>>>>
>>>>>> [ RUN ] mempool.map
>>>>>> before 0 after 0
>>>>>> before 0 after 0
>>>>>> [ OK ] mempool.map (0 ms)
>>>>>>
>>>>>> It looks like we do not measure ref_map for BlueStore Blob and
>>>>>> SharedBlob
>>>>>> classes too.
>>>>>>
>>>>>> Any ideas?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Igor
>>>>>>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Bug in mempool::map?
2016-12-20 21:29 ` Igor Fedotov
@ 2016-12-21 21:43 ` Sage Weil
2016-12-21 22:55 ` Igor Fedotov
0 siblings, 1 reply; 13+ messages in thread
From: Sage Weil @ 2016-12-21 21:43 UTC (permalink / raw)
To: Igor Fedotov; +Cc: Allen Samuels, ceph-devel
On Wed, 21 Dec 2016, Igor Fedotov wrote:
> On 12/20/2016 11:15 PM, Sage Weil wrote:
> > It seems like we have a few common cases...
> >
> > 1) big blob, single ref (e.g., immutable large object, and then the entire
> > thing is referenced, modulo the last block).
> >
> > 2) tiny blob, single alloc unit. a counter is sufficient.
> >
> > 3) everything else in between...
> >
> > We could probably get away with special-case efficient representations of
> > just 1 and 2?
> I'd say 1) = entire blob is referenced just once. And 2) = single alloc unit
> and partially referenced.
>
> Then we need at minimum a bit flag for 1), 'int' for 2) and vector<int> for
> 3).
>
> Possible implementations are
> A.{
> int standalone_counter; //this covers 1 & 2, value greater than alloc unit
> means 1)
> vector<int> per_au_counters; // and this one for 3 with the first counter
> stored in standalone_counter
> }
> vs.
> B.{
> //reuse some blob flag for 1)
> vector<int> per_au_counters; //for 2) and 3)
> }
> vs.
> C.{
> vector<int> per_au_counters; //for all cases
> }
>
> A. Saves one allocation for 1 & 2, wastes 4 bytes for 1. Plus requires some
> simple tricks to use both standalone counter and vector.
> B. Saves 1 allocation and 4 bytes for 1. Unified counter vector processing but
> some minor tricks with a flag.
> C. Needs +1 allocation for 1 & 2. And wastes 4 bytes for 1. Unified counter
> processing
>
> I'd rather prefer B. implementation.
vector<int> is 24 bytes. So if we're worried about memory, I think we
want to have an alternative strategy. Something like:
uint8_t au_order; ///< size of allocation unit (min_alloc_size may change)
union {
long *ref_by_au; ///< dynamically allocated array of bytes per au
long ref_in_single_au; ///< refs if blob is a single au (au_order=0?)
}
Something simple like that?
I suspect get_ref/put_ref will get quite a bit simpler and more efficient
with per-au buckets.
Also, for compressed blobs, we can use the ref_in_single_au mode since we
can't deallocate it piecewise anyway...
sage
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Bug in mempool::map?
2016-12-21 21:43 ` Sage Weil
@ 2016-12-21 22:55 ` Igor Fedotov
0 siblings, 0 replies; 13+ messages in thread
From: Igor Fedotov @ 2016-12-21 22:55 UTC (permalink / raw)
To: Sage Weil; +Cc: Allen Samuels, ceph-devel
On 12/22/2016 12:43 AM, Sage Weil wrote:
> On Wed, 21 Dec 2016, Igor Fedotov wrote:
>> On 12/20/2016 11:15 PM, Sage Weil wrote:
>>> It seems like we have a few common cases...
>>>
>>> 1) big blob, single ref (e.g., immutable large object, and then the entire
>>> thing is referenced, modulo the last block).
>>>
>>> 2) tiny blob, single alloc unit. a counter is sufficient.
>>>
>>> 3) everything else in between...
>>>
>>> We could probably get away with special-case efficient representations of
>>> just 1 and 2?
>> I'd say 1) = entire blob is referenced just once. And 2) = single alloc unit
>> and partially referenced.
>>
>> Then we need at minimum a bit flag for 1), 'int' for 2) and vector<int> for
>> 3).
>>
>> Possible implementations are
>> A.{
>> int standalone_counter; //this covers 1 & 2, value greater than alloc unit
>> means 1)
>> vector<int> per_au_counters; // and this one for 3 with the first counter
>> stored in standalone_counter
>> }
>> vs.
>> B.{
>> //reuse some blob flag for 1)
>> vector<int> per_au_counters; //for 2) and 3)
>> }
>> vs.
>> C.{
>> vector<int> per_au_counters; //for all cases
>> }
>>
>> A. Saves one allocation for 1 & 2, wastes 4 bytes for 1. Plus requires some
>> simple tricks to use both standalone counter and vector.
>> B. Saves 1 allocation and 4 bytes for 1. Unified counter vector processing but
>> some minor tricks with a flag.
>> C. Needs +1 allocation for 1 & 2. And wastes 4 bytes for 1. Unified counter
>> processing
>>
>> I'd rather prefer B. implementation.
> vector<int> is 24 bytes. So if we're worried about memory, I think we
Yep, missed that...
> want to have an alternative strategy. Something like:
>
> uint8_t au_order; ///< size of allocation unit (min_alloc_size may change)
Unless we pack the struct, this byte field adds +8 bytes to the struct size
on a 64-bit build. Another point is heap allocator granularity: most
probably it will align the allocated struct to 8-byte boundaries anyway.
Hence we can use a full-width value instead of au_order.
Another concern is a min_alloc_size change in general: have we ever
performed any robust BlueStore testing against this feature?
It looks pretty dangerous and error-prone...
> union {
> long *ref_by_au; ///< dynamically allocated array of bytes per au
> long ref_in_single_au; ///< refs if blob is a single au (au_order=0?)
> }
>
> Something simple like that?
>
> I suspect get_ref/put_ref will get quite a bit simpler and more efficient
> with per-au buckets.
yep.
> Also, for compressed blobs, we can use the ref_in_single_au mode since we
> can't deallocate it piecewise anyway...
+1
Also, please note that this change will probably impact our current Onode
encoding: spanning blobs need their ref counters to be persistent...
Thanks,
Igor
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2016-12-21 22:55 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-20 15:04 Bug in mempool::map? Igor Fedotov
2016-12-20 15:25 ` Sage Weil
2016-12-20 15:56 ` Igor Fedotov
2016-12-20 17:08 ` Igor Fedotov
2016-12-20 17:27 ` Allen Samuels
2016-12-20 17:34 ` Igor Fedotov
2016-12-20 17:37 ` Allen Samuels
2016-12-20 18:26 ` Sage Weil
2016-12-20 20:05 ` Igor Fedotov
2016-12-20 20:15 ` Sage Weil
2016-12-20 21:29 ` Igor Fedotov
2016-12-21 21:43 ` Sage Weil
2016-12-21 22:55 ` Igor Fedotov