ceph-devel.vger.kernel.org archive mirror
* rgw: Is rgw_sync_lease_period=120s set small?
From: WeiGuo Ren @ 2021-03-18 11:35 UTC (permalink / raw)
  To: Ceph Development

In an rgw multi-site production environment, how many rgw instances
are typically started in a single zone? In my tests, multiple rgw
instances compete for the datalog lease lock, and it is very likely
that a lease will fail to be renewed. Is the default
rgw_sync_lease_period=120s a bit too small?


* Re: rgw: Is rgw_sync_lease_period=120s set small?
From: WeiGuo Ren @ 2021-03-18 11:37 UTC (permalink / raw)
  To: Ceph Development

On my Ceph cluster, an rgw instance often tries to renew its lease and
finds that it no longer holds the lock.

On Thu, Mar 18, 2021 at 7:35 PM WeiGuo Ren <rwg1335252904@gmail.com> wrote:
>
> In an rgw multi-site production environment, how many rgw instances
> will be started in a single zone? According to my test, multiple rgw
> instances will compete for the datalog leaselock, and it is very
> likely that the leaselock will not be renewed. Is the default
> rgw_sync_lease_period=120s a bit small?


* Re: rgw: Is rgw_sync_lease_period=120s set small?
From: WeiGuo Ren @ 2021-03-18 12:53 UTC (permalink / raw)
  To: Ceph Development

radosgw-admin sync error list
[
    {
        "shard_id": 0,
        "entries": [
            {
                "id": "1_1614333890.956965_8080774.1",
                "section": "data",
                "name": "user21-bucket23:multi_master-anna.1827103.323:54",
                "timestamp": "2021-02-26 10:04:50.956965Z",
                "info": {
                    "source_zone": "multi_master-anna",
                    "error_code": 125,
                    "message": "failed to sync bucket instance: (125) Operation canceled"
                }
            }
        ]
     }
]

I think the output of this command can be used to tune the parameter:
keep increasing rgw_sync_lease_period until -ECANCELED (125) no longer
appears, and the value is then appropriate.
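
To make that check repeatable, here is a minimal Python sketch that
counts the entries with error_code 125 per shard in the JSON printed by
`radosgw-admin sync error list`. The function name `count_canceled` and
the embedded sample are illustrative only; the sample copies the entry
above. It assumes the output has the shape shown above, and it does not
prove that every 125 comes from lease contention.

```python
import json

ECANCELED = 125  # error code reported in the sync error list above


def count_canceled(sync_error_json: str) -> dict:
    """Return {shard_id: number of entries with error_code 125}.

    Expects the JSON text printed by `radosgw-admin sync error list`,
    assumed to be a list of {"shard_id": ..., "entries": [...]} objects.
    """
    counts = {}
    for shard in json.loads(sync_error_json):
        n = sum(
            1
            for entry in shard.get("entries", [])
            if entry.get("info", {}).get("error_code") == ECANCELED
        )
        if n:
            counts[shard["shard_id"]] = n
    return counts


# Sample copied from the output above (message joined onto one line).
sample = """
[
    {
        "shard_id": 0,
        "entries": [
            {
                "id": "1_1614333890.956965_8080774.1",
                "section": "data",
                "name": "user21-bucket23:multi_master-anna.1827103.323:54",
                "timestamp": "2021-02-26 10:04:50.956965Z",
                "info": {
                    "source_zone": "multi_master-anna",
                    "error_code": 125,
                    "message": "failed to sync bucket instance: (125) Operation canceled"
                }
            }
        ]
    }
]
"""

print(count_canceled(sample))  # {0: 1}
```

Running it after each change to rgw_sync_lease_period and comparing the
counts (and the entry timestamps) would show whether new 125 errors are
still being logged.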

On Thu, Mar 18, 2021 at 7:37 PM WeiGuo Ren <rwg1335252904@gmail.com> wrote:
>
> I have an osd ceph cluster, rgw instance often appears to be renewed
> and not locked
>
> On Thu, Mar 18, 2021 at 7:35 PM WeiGuo Ren <rwg1335252904@gmail.com> wrote:
> >
> > In an rgw multi-site production environment, how many rgw instances
> > will be started in a single zone? According to my test, multiple rgw
> > instances will compete for the datalog leaselock, and it is very
> > likely that the leaselock will not be renewed. Is the default
> > rgw_sync_lease_period=120s a bit small?


* Re: rgw: Is rgw_sync_lease_period=120s set small?
From: Casey Bodley @ 2021-03-18 17:09 UTC (permalink / raw)
  To: WeiGuo Ren; +Cc: Ceph Development

On Thu, Mar 18, 2021 at 8:55 AM WeiGuo Ren <rwg1335252904@gmail.com> wrote:
>
> radosgw-admin sync error list
> [
>     {
>         "shard_id": 0,
>         "entries": [
>             {
>                 "id": "1_1614333890.956965_8080774.1",
>                 "section": "data",
>                 "name": "user21-bucket23:multi_master-anna.1827103.323:54",
>                 "timestamp": "2021-02-26 10:04:50.956965Z",
>                 "info": {
>                     "source_zone": "multi_master-anna",
>                     "error_code": 125,
>                     "message": "failed to sync bucket instance: (125) Operation canceled"
>                 }
>             }
>         ]
>      }
> ]
>
> I think the output of this command can be used to tune the parameter:
> keep increasing rgw_sync_lease_period until -ECANCELED (125) no longer
> appears, and the value is then appropriate.
>
> On Thu, Mar 18, 2021 at 7:37 PM WeiGuo Ren <rwg1335252904@gmail.com> wrote:
> >
> > I have an osd ceph cluster, rgw instance often appears to be renewed
> > and not locked
> >
> > On Thu, Mar 18, 2021 at 7:35 PM WeiGuo Ren <rwg1335252904@gmail.com> wrote:
> > >
> > > In an rgw multi-site production environment, how many rgw instances
> > > will be started in a single zone?

it depends on the scale, but i'd guess anywhere from 2-8?

if the zone is serving clients (not just DR), it can make sense to
dedicate some of the rgws to clients (by setting
rgw_run_sync_thread=0, and not including their endpoints in the zone
configuration), and others just to sync. so i think it's easy enough
to control how many gateways are contending for these leases

you can raise the lease period, but that means it will take longer for
sync to recover from a radosgw shutdown/restart. the shard locks it
held will take longer to expire, preventing other gateways from
resuming sync on those shards
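
to put rough numbers on that trade-off, here is a small python sketch
under a deliberately simple lease model: a lock expires one lease
period after its last renewal, so a gateway that stops right after
renewing blocks its shards for about the full period. the function
name `takeover_delay` and the example periods are illustrative, not
taken from rgw, and the real renewal timing may differ.

```python
def takeover_delay(lease_period_s: float, since_last_renewal_s: float) -> float:
    """Seconds until a lock left behind by a stopped gateway expires.

    Simplified model (an assumption, not rgw's actual code): the lock
    expires lease_period_s seconds after its most recent renewal.
    """
    if not 0 <= since_last_renewal_s <= lease_period_s:
        raise ValueError("since_last_renewal_s must be within the lease period")
    return lease_period_s - since_last_renewal_s


# Worst case: the gateway stops immediately after renewing its lease.
for period in (120, 240, 480):
    print(period, takeover_delay(period, 0))
```

in this model the worst-case wait before another gateway can resume a
shard grows linearly with rgw_sync_lease_period.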


> > > According to my test, multiple rgw
> > > instances will compete for the datalog leaselock, and it is very
> > > likely that the leaselock will not be renewed. Is the default
> > > rgw_sync_lease_period=120s a bit small?
>

