* Too many objects per pg than average: deadlock situation
@ 2018-05-20 20:28 Mike A
       [not found] ` <34562966-FC71-4194-9605-9ECB79BDA513-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 5+ messages in thread
From: Mike A @ 2018-05-20 20:28 UTC (permalink / raw)
  To: Ceph Development; +Cc: ceph-users-idqoXFIVOFJgJs9I8MT0rw

Hello!

In our cluster, we have run into a deadlock situation.
This is a standard cluster for OpenStack without RadosGW: we have the standard block storage pools plus one pool for metrics from Gnocchi.
The amount of data in the Gnocchi pool is small, but the number of objects is very large.

When planning the distribution of PGs between pools, PGs are allocated according to the estimated data size of each pool. Accordingly, pgcalc suggests allocating only a small number of PGs to the Gnocchi pool.
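
For reference, this is roughly the pgcalc rule of thumb we followed, just as a sketch (the 100 target PGs per OSD and the power-of-two rounding are the usual pgcalc assumptions; the OSD count and data share below are illustrative, not our real numbers):

def suggested_pg_num(osd_count, data_share, replica_size, target_pgs_per_osd=100):
    # pgcalc-style sizing: a pool's PG count follows its share of the data,
    # so a tiny pool like the Gnocchi one ends up with very few PGs
    raw = osd_count * target_pgs_per_osd * data_share / replica_size
    pg = 1
    while pg < raw:  # rounded up to the next power of two for simplicity
        pg *= 2
    return pg

# e.g. 24 OSDs, Gnocchi holding ~1% of the data, replica size 3 -> 8 PGs
print(suggested_pg_num(24, 0.01, 3))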

As a result, the cluster constantly sits in the warning state "1 pools have many more objects per pg than average", which is understandable: Gnocchi produces a lot of small objects, and its objects-per-PG count is tens of times higher than that of the other pools.
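
As far as I understand it, the check behind this warning compares a pool's objects-per-PG ratio against the cluster-wide average times "mon pg warn max object skew" (default 10). A minimal sketch of that comparison, with made-up pool figures rather than our real ones:

# name: (objects, pg_num) -- illustrative numbers only
pools = {
    "volumes": (200_000, 512),
    "images":  (20_000, 128),
    "gnocchi": (1_500_000, 32),
}
skew_threshold = 10.0  # mon_pg_warn_max_object_skew default

total_objects = sum(o for o, _ in pools.values())
total_pgs = sum(p for _, p in pools.values())
average = total_objects / total_pgs

for name, (objects, pg_num) in pools.items():
    per_pg = objects / pg_num
    if per_pg > skew_threshold * average:
        print(f"{name}: {per_pg:.0f} objects/pg vs cluster average {average:.0f} -> warning")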

And here we are at a deadlock:
1. We cannot increase the number of PGs in the Gnocchi pool, since its data size is very small.
2. Even if we did increase the number of PGs, we could cross the recommended limit of 200 PGs per OSD in the cluster (see the rough arithmetic after this list).
3. Keeping the cluster permanently in HEALTH_WARN is a bad idea.
4. We could raise the "mon pg warn max object skew" parameter, but we do not know how Ceph will behave when one pool has such a huge objects-per-PG ratio.
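
The arithmetic behind point 2, with hypothetical pool and OSD numbers rather than our real layout:

# PG replicas per OSD: every PG exists "replica size" times across the cluster
pools = {
    # name:    (pg_num, replica_size) -- hypothetical values
    "volumes": (512, 3),
    "images":  (128, 3),
    "vms":     (256, 3),
    "gnocchi": (32, 3),
}
osd_count = 24

pg_replicas = sum(pg * size for pg, size in pools.values())
print(f"{pg_replicas / osd_count:.0f} PGs per OSD")  # ~116 here; the usual advice is to stay <= 200

With numbers like these, even doubling pg_num on the pools already pushes the figure past the limit.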

There is no obvious solution.

How do we solve this problem correctly?
— 
Mike, runs!
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Too many objects per pg than average: deadlock situation
       [not found] ` <34562966-FC71-4194-9605-9ECB79BDA513-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2018-05-20 23:05   ` Sage Weil
       [not found]     ` <alpine.DEB.2.11.1805202303591.2133-qHenpvqtifaMSRpgCs4c+g@public.gmane.org>
  0 siblings, 1 reply; 5+ messages in thread
From: Sage Weil @ 2018-05-20 23:05 UTC (permalink / raw)
  To: Mike A; +Cc: Ceph Development, ceph-users-idqoXFIVOFJgJs9I8MT0rw

On Sun, 20 May 2018, Mike A wrote:
> Hello!
> 
> In our cluster, we have run into a deadlock situation.
> This is a standard cluster for OpenStack without RadosGW: we have the standard block storage pools plus one pool for metrics from Gnocchi.
> The amount of data in the Gnocchi pool is small, but the number of objects is very large.
> 
> When planning the distribution of PGs between pools, PGs are allocated according to the estimated data size of each pool. Accordingly, pgcalc suggests allocating only a small number of PGs to the Gnocchi pool.
> 
> As a result, the cluster constantly sits in the warning state "1 pools have many more objects per pg than average", which is understandable: Gnocchi produces a lot of small objects, and its objects-per-PG count is tens of times higher than that of the other pools.
> 
> And here we are at a deadlock:
> 1. We cannot increase the number of PGs in the Gnocchi pool, since its data size is very small.
> 2. Even if we did increase the number of PGs, we could cross the recommended limit of 200 PGs per OSD in the cluster.
> 3. Keeping the cluster permanently in HEALTH_WARN is a bad idea.
> 4. We could raise the "mon pg warn max object skew" parameter, but we do not know how Ceph will behave when one pool has such a huge objects-per-PG ratio.
> 
> There is no obvious solution.
> 
> How do we solve this problem correctly?

As a workaround, I'd just increase the skew option to make the warning go 
away.

It seems to me like the underlying problem is that we're looking at object 
count vs pg count, but ignoring the object sizes.  Unfortunately it's a 
bit awkward to fix because we don't have a way to quantify the size of 
omap objects via the stats (currently).  So for now, just adjust the skew 
value enough to make the warning go away!
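
If it helps, one rough way to pick the value (a sketch, not a tool; the object and PG counts are placeholders you would take from something like "ceph df detail" and "ceph osd pool ls detail"):

cluster_objects, cluster_pgs = 1_720_000, 672
pool_objects, pool_pgs = 1_500_000, 32        # the object-heavy pool

current_skew = (pool_objects / pool_pgs) / (cluster_objects / cluster_pgs)
suggested = round(current_skew * 1.5)         # leave some headroom for growth
print(f"current skew ~{current_skew:.1f}; "
      f"mon_pg_warn_max_object_skew = {suggested} would keep the warning away")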

sage

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Too many objects per pg than average: deadlock situation
       [not found]     ` <alpine.DEB.2.11.1805202303591.2133-qHenpvqtifaMSRpgCs4c+g@public.gmane.org>
@ 2018-05-21 19:07       ` Mike A
  2018-05-23 20:28       ` Mike A
  1 sibling, 0 replies; 5+ messages in thread
From: Mike A @ 2018-05-21 19:07 UTC (permalink / raw)
  To: Sage Weil; +Cc: Ceph Development, ceph-users-idqoXFIVOFJgJs9I8MT0rw

Hello,

> On 21 May 2018, at 2:05, Sage Weil <sage@newdream.net> wrote:
> 
> On Sun, 20 May 2018, Mike A wrote:
>> Hello!
>> 
>> In our cluster, we have run into a deadlock situation.
>> This is a standard cluster for OpenStack without RadosGW: we have the standard block storage pools plus one pool for metrics from Gnocchi.
>> The amount of data in the Gnocchi pool is small, but the number of objects is very large.
>> 
>> When planning the distribution of PGs between pools, PGs are allocated according to the estimated data size of each pool. Accordingly, pgcalc suggests allocating only a small number of PGs to the Gnocchi pool.
>> 
>> As a result, the cluster constantly sits in the warning state "1 pools have many more objects per pg than average", which is understandable: Gnocchi produces a lot of small objects, and its objects-per-PG count is tens of times higher than that of the other pools.
>> 
>> And here we are at a deadlock:
>> 1. We cannot increase the number of PGs in the Gnocchi pool, since its data size is very small.
>> 2. Even if we did increase the number of PGs, we could cross the recommended limit of 200 PGs per OSD in the cluster.
>> 3. Keeping the cluster permanently in HEALTH_WARN is a bad idea.
>> 4. We could raise the "mon pg warn max object skew" parameter, but we do not know how Ceph will behave when one pool has such a huge objects-per-PG ratio.
>> 
>> There is no obvious solution.
>> 
>> How do we solve this problem correctly?
> 
> As a workaround, I'd just increase the skew option to make the warning go 
> away.
> 
> It seems to me like the underlying problem is that we're looking at object 
> count vs pg count, but ignoring the object sizes.  Unfortunately it's a 
> bit awkward to fix because we don't have a way to quantify the size of 
> omap objects via the stats (currently).  So for now, just adjust the skew 
> value enough to make the warning go away!
> 
> sage

OK.
It seems that increasing this config option is the only acceptable way forward.

Thanks!

— 
Mike, runs!
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Too many objects per pg than average: deadlock situation
       [not found]     ` <alpine.DEB.2.11.1805202303591.2133-qHenpvqtifaMSRpgCs4c+g@public.gmane.org>
  2018-05-21 19:07       ` Mike A
@ 2018-05-23 20:28       ` Mike A
       [not found]         ` <7690C1F1-5C98-4FA2-BD4A-24375F474951-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  1 sibling, 1 reply; 5+ messages in thread
From: Mike A @ 2018-05-23 20:28 UTC (permalink / raw)
  To: Sage Weil; +Cc: Ceph Development, ceph-users-idqoXFIVOFJgJs9I8MT0rw

Hello

> On 21 May 2018, at 2:05, Sage Weil <sage@newdream.net> wrote:
> 
> On Sun, 20 May 2018, Mike A wrote:
>> Hello!
>> 
>> In our cluster, we have run into a deadlock situation.
>> This is a standard cluster for OpenStack without RadosGW: we have the standard block storage pools plus one pool for metrics from Gnocchi.
>> The amount of data in the Gnocchi pool is small, but the number of objects is very large.
>> 
>> When planning the distribution of PGs between pools, PGs are allocated according to the estimated data size of each pool. Accordingly, pgcalc suggests allocating only a small number of PGs to the Gnocchi pool.
>> 
>> As a result, the cluster constantly sits in the warning state "1 pools have many more objects per pg than average", which is understandable: Gnocchi produces a lot of small objects, and its objects-per-PG count is tens of times higher than that of the other pools.
>> 
>> And here we are at a deadlock:
>> 1. We cannot increase the number of PGs in the Gnocchi pool, since its data size is very small.
>> 2. Even if we did increase the number of PGs, we could cross the recommended limit of 200 PGs per OSD in the cluster.
>> 3. Keeping the cluster permanently in HEALTH_WARN is a bad idea.
>> 4. We could raise the "mon pg warn max object skew" parameter, but we do not know how Ceph will behave when one pool has such a huge objects-per-PG ratio.
>> 
>> There is no obvious solution.
>> 
>> How do we solve this problem correctly?
> 
> As a workaround, I'd just increase the skew option to make the warning go 
> away.
> 
> It seems to me like the underlying problem is that we're looking at object 
> count vs pg count, but ignoring the object sizes.  Unfortunately it's a 
> bit awkward to fix because we don't have a way to quantify the size of 
> omap objects via the stats (currently).  So for now, just adjust the skew 
> value enough to make the warning go away!
> 
> sage

Can this situation somehow negatively affect the operation of the cluster?

— 
Mike, runs!
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Too many objects per pg than average: deadlock situation
       [not found]         ` <7690C1F1-5C98-4FA2-BD4A-24375F474951-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2018-05-23 20:35           ` Sage Weil
  0 siblings, 0 replies; 5+ messages in thread
From: Sage Weil @ 2018-05-23 20:35 UTC (permalink / raw)
  To: Mike A; +Cc: Ceph Development, ceph-users-idqoXFIVOFJgJs9I8MT0rw

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2300 bytes --]

On Wed, 23 May 2018, Mike A wrote:
> Hello
> 
> > On 21 May 2018, at 2:05, Sage Weil <sage-BnTBU8nroG7k1uMJSBkQmQ@public.gmane.org> wrote:
> > 
> > On Sun, 20 May 2018, Mike A wrote:
> >> Hello!
> >> 
> >> In our cluster, we have run into a deadlock situation.
> >> This is a standard cluster for OpenStack without RadosGW: we have the standard block storage pools plus one pool for metrics from Gnocchi.
> >> The amount of data in the Gnocchi pool is small, but the number of objects is very large.
> >> 
> >> When planning the distribution of PGs between pools, PGs are allocated according to the estimated data size of each pool. Accordingly, pgcalc suggests allocating only a small number of PGs to the Gnocchi pool.
> >> 
> >> As a result, the cluster constantly sits in the warning state "1 pools have many more objects per pg than average", which is understandable: Gnocchi produces a lot of small objects, and its objects-per-PG count is tens of times higher than that of the other pools.
> >> 
> >> And here we are at a deadlock:
> >> 1. We cannot increase the number of PGs in the Gnocchi pool, since its data size is very small.
> >> 2. Even if we did increase the number of PGs, we could cross the recommended limit of 200 PGs per OSD in the cluster.
> >> 3. Keeping the cluster permanently in HEALTH_WARN is a bad idea.
> >> 4. We could raise the "mon pg warn max object skew" parameter, but we do not know how Ceph will behave when one pool has such a huge objects-per-PG ratio.
> >> 
> >> There is no obvious solution.
> >> 
> >> How do we solve this problem correctly?
> > 
> > As a workaround, I'd just increase the skew option to make the warning go 
> > away.
> > 
> > It seems to me like the underlying problem is that we're looking at object 
> > count vs pg count, but ignoring the object sizes.  Unfortunately it's a 
> > bit awkward to fix because we don't have a way to quantify the size of 
> > omap objects via the stats (currently).  So for now, just adjust the skew 
> > value enough to make the warning go away!
> > 
> > sage
> 
> Can this situation somehow negatively affect the operation of the cluster?

Eh, you'll end up with a PG count that is possibly suboptimal.  You'd have 
to work pretty hard to notice any difference, though.  I wouldn't worry 
about it.

sage

[-- Attachment #2: Type: text/plain, Size: 178 bytes --]

_______________________________________________
ceph-users mailing list
ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-05-23 20:35 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-20 20:28 Too many objects per pg than average: deadlock situation Mike A
     [not found] ` <34562966-FC71-4194-9605-9ECB79BDA513-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2018-05-20 23:05   ` Sage Weil
     [not found]     ` <alpine.DEB.2.11.1805202303591.2133-qHenpvqtifaMSRpgCs4c+g@public.gmane.org>
2018-05-21 19:07       ` Mike A
2018-05-23 20:28       ` Mike A
     [not found]         ` <7690C1F1-5C98-4FA2-BD4A-24375F474951-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2018-05-23 20:35           ` Sage Weil
