* rgw: multiple zonegroups in single realm
@ 2017-02-10  8:21 KIMURA Osamu
  2017-02-12 10:07 ` Orit Wasserman
  0 siblings, 1 reply; 11+ messages in thread
From: KIMURA Osamu @ 2017-02-10  8:21 UTC (permalink / raw)
  To: ceph-devel

Hi Cephers,

I'm trying to configure RGWs with multiple zonegroups within a single realm.
The intention is that some buckets are replicated and others stay local.
e.g. (a rough radosgw-admin sketch of the replicated part follows the list):
  realm: fj
   zonegroup east: zone tokyo (not replicated)
   zonegroup west: zone osaka (not replicated)
   zonegroup jp:   zone jp-east + jp-west (replicated)
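
For the replicated "jp" zonegroup, the commands I have in mind are roughly
the following. This is only a sketch: the endpoints and keys are placeholders,
and the jp-west zone would really be created from the second cluster after
pulling the realm.

   $ radosgw-admin realm create --rgw-realm=fj --default
   $ radosgw-admin zonegroup create --rgw-realm=fj --rgw-zonegroup=jp \
         --endpoints=http://jp-east-host:80 --master --default
   $ radosgw-admin zone create --rgw-zonegroup=jp --rgw-zone=jp-east \
         --endpoints=http://jp-east-host:80 --master --default \
         --access-key=SYNC_ACCESS_KEY --secret=SYNC_SECRET_KEY
   $ radosgw-admin zone create --rgw-zonegroup=jp --rgw-zone=jp-west \
         --endpoints=http://jp-west-host:80 \
         --access-key=SYNC_ACCESS_KEY --secret=SYNC_SECRET_KEY
   $ radosgw-admin period update --commit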

To evaluate such a configuration, I tentatively built multiple zonegroups
(east, west) on one ceph cluster. I barely succeeded in configuring it, but
some concerns remain.

a) User accounts are not synced among zonegroups

I'm not sure if this is an issue, but the blueprint [1] states that the master
zonegroup manages user accounts as metadata, like buckets.


b) Bucket creation is rejected if the master zonegroup doesn't have the account

e.g. (a command sketch of steps 2 and 4 follows the list):
   1) Configure the east zonegroup as master.
   2) Create a user "nishi" in the west zonegroup (osaka zone) using radosgw-admin.
   3) Try to create a bucket in the west zonegroup as user nishi.
      -> ERROR: S3 error: 404 (NoSuchKey)
   4) Create user nishi in the east zonegroup with the same key.
   5) Creating a bucket in the west zonegroup as user nishi now succeeds.
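
The user-creation commands for steps 2 and 4 look roughly like this (the
secret is a placeholder here; the access key is the one visible in the log
below):

   $ radosgw-admin user create --uid=nishi --display-name="nishi" \
         --access-key=ZY6EJUVB38SCOWBELERQ --secret=NISHI_SECRET_KEY \
         --rgw-zonegroup=west --rgw-zone=osaka
   $ radosgw-admin user create --uid=nishi --display-name="nishi" \
         --access-key=ZY6EJUVB38SCOWBELERQ --secret=NISHI_SECRET_KEY \
         --rgw-zonegroup=east --rgw-zone=tokyo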


c) How can bucket placement be restricted to specific zonegroups?

If user accounts are synced in the future as the blueprint describes, all
zonegroups will contain the same account information. That means any user can
create buckets in any zonegroup. If we want to permit only specific users to
place buckets in a replicated zonegroup, how should that be configured?

If user accounts are not synced, as in the current behavior, we can restrict
bucket placement to specific zonegroups. But I cannot find the best way to
configure the master zonegroup.


d) Operations for another zonegroup are not redirected

e.g. (the kind of response I would expect is sketched after the list):
   1) Create bucket4 in the west zonegroup as nishi.
   2) Try to access bucket4 from an endpoint in the east zonegroup.
      -> Responds "301 (Moved Permanently)",
         but no redirect Location header is returned.
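
For comparison, the kind of response I would expect, roughly following [2]
and Amazon's PermanentRedirect error (the endpoint value here is only
illustrative for my setup), is something like:

   HTTP/1.1 301 Moved Permanently
   Location: http://node5:8081/bucket4/
   ...
   <Error>
     <Code>PermanentRedirect</Code>
     <Message>The bucket you are attempting to access must be addressed using
       the specified endpoint. Please send all future requests to this
       endpoint.</Message>
     <Bucket>bucket4</Bucket>
     <Endpoint>node5:8081</Endpoint>
   </Error>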

It seems the current RGW doesn't follow the S3 specification [2].
To implement this feature, we probably need to define another endpoint on
each zonegroup as the client-accessible URL. RGW may be placed behind a
proxy, so that URL may differ from the endpoint URLs used for replication.


Any thoughts?

[1] http://tracker.ceph.com/projects/ceph/wiki/Rgw_new_multisite_configuration
[2] http://docs.aws.amazon.com/AmazonS3/latest/dev/Redirects.html

------ FYI ------
[environments]
Ceph cluster: RHCS 2.0
RGW: RHEL 7.2 + RGW v10.2.5

zonegroup east: master
  zone tokyo
   endpoint http://node5:80
   system user: sync-user
   user azuma (+ nishi)

zonegroup west: (not master)
   zone osaka
   endpoint http://node5:8081
   system user: sync-user (created with same key as zone tokyo)
   user nishi
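
For reference, the corresponding ceph.conf sections look roughly like this
(the [client.rgw.*] instance names are illustrative):

  [client.rgw.tokyo]
      rgw frontends = "civetweb port=80"
      rgw zonegroup = east
      rgw zone = tokyo

  [client.rgw.osaka]
      rgw frontends = "civetweb port=8081"
      rgw zonegroup = west
      rgw zone = osaka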


[detail of "b)"]

$ s3cmd -c s3nishi.cfg ls
$ s3cmd -c s3nishi.cfg mb s3://bucket3
ERROR: S3 error: 404 (NoSuchKey)

---- rgw.osaka log:
2017-02-10 11:54:13.290653 7feac3f7f700  1 ====== starting new request req=0x7feac3f79710 =====
2017-02-10 11:54:13.290709 7feac3f7f700  2 req 50:0.000057::PUT /bucket3/::initializing for trans_id = tx000000000000000000032-00589d2b55-14a2-osaka
2017-02-10 11:54:13.290720 7feac3f7f700 10 rgw api priority: s3=5 s3website=4
2017-02-10 11:54:13.290722 7feac3f7f700 10 host=node5
2017-02-10 11:54:13.290733 7feac3f7f700 10 meta>> HTTP_X_AMZ_CONTENT_SHA256
2017-02-10 11:54:13.290750 7feac3f7f700 10 meta>> HTTP_X_AMZ_DATE
2017-02-10 11:54:13.290753 7feac3f7f700 10 x>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
2017-02-10 11:54:13.290755 7feac3f7f700 10 x>> x-amz-date:20170210T025413Z
2017-02-10 11:54:13.290774 7feac3f7f700 10 handler=25RGWHandler_REST_Bucket_S3
2017-02-10 11:54:13.290775 7feac3f7f700  2 req 50:0.000124:s3:PUT /bucket3/::getting op 1
2017-02-10 11:54:13.290781 7feac3f7f700 10 op=27RGWCreateBucket_ObjStore_S3
2017-02-10 11:54:13.290782 7feac3f7f700  2 req 50:0.000130:s3:PUT /bucket3/:create_bucket:authorizing
2017-02-10 11:54:13.290798 7feac3f7f700 10 v4 signature format = 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
2017-02-10 11:54:13.290804 7feac3f7f700 10 v4 credential format = ZY6EJUVB38SCOWBELERQ/20170210/west/s3/aws4_request
2017-02-10 11:54:13.290806 7feac3f7f700 10 access key id = ZY6EJUVB38SCOWBELERQ
2017-02-10 11:54:13.290814 7feac3f7f700 10 credential scope = 20170210/west/s3/aws4_request
2017-02-10 11:54:13.290834 7feac3f7f700 10 canonical headers format = host:node5:8081
x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
x-amz-date:20170210T025413Z

2017-02-10 11:54:13.290836 7feac3f7f700 10 delaying v4 auth
2017-02-10 11:54:13.290839 7feac3f7f700  2 req 50:0.000187:s3:PUT /bucket3/:create_bucket:normalizing buckets and tenants
2017-02-10 11:54:13.290841 7feac3f7f700 10 s->object=<NULL> s->bucket=bucket3
2017-02-10 11:54:13.290843 7feac3f7f700  2 req 50:0.000191:s3:PUT /bucket3/:create_bucket:init permissions
2017-02-10 11:54:13.290844 7feac3f7f700  2 req 50:0.000192:s3:PUT /bucket3/:create_bucket:recalculating target
2017-02-10 11:54:13.290845 7feac3f7f700  2 req 50:0.000193:s3:PUT /bucket3/:create_bucket:reading permissions
2017-02-10 11:54:13.290846 7feac3f7f700  2 req 50:0.000195:s3:PUT /bucket3/:create_bucket:init op
2017-02-10 11:54:13.290847 7feac3f7f700  2 req 50:0.000196:s3:PUT /bucket3/:create_bucket:verifying op mask
2017-02-10 11:54:13.290849 7feac3f7f700  2 req 50:0.000197:s3:PUT /bucket3/:create_bucket:verifying op permissions
2017-02-10 11:54:13.292027 7feac3f7f700  2 req 50:0.001374:s3:PUT /bucket3/:create_bucket:verifying op params
2017-02-10 11:54:13.292035 7feac3f7f700  2 req 50:0.001383:s3:PUT /bucket3/:create_bucket:pre-executing
2017-02-10 11:54:13.292037 7feac3f7f700  2 req 50:0.001385:s3:PUT /bucket3/:create_bucket:executing
2017-02-10 11:54:13.292072 7feac3f7f700 10 payload request hash = d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
2017-02-10 11:54:13.292083 7feac3f7f700 10 canonical request = PUT
/bucket3/

host:node5:8081
x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
x-amz-date:20170210T025413Z

host;x-amz-content-sha256;x-amz-date
d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
2017-02-10 11:54:13.292084 7feac3f7f700 10 canonical request hash = 8faa5ec57f69dd7b54baa72c157b6d63f8c7db309a34a1e2a10ad6f2f585cd02
2017-02-10 11:54:13.292087 7feac3f7f700 10 string to sign = AWS4-HMAC-SHA256
20170210T025413Z
20170210/west/s3/aws4_request
8faa5ec57f69dd7b54baa72c157b6d63f8c7db309a34a1e2a10ad6f2f585cd02
2017-02-10 11:54:13.292118 7feac3f7f700 10 date_k        = 454f3ad73c095e73d2482809d7a6ec8af3c4e900bc83e0a9663ea5fc336cad95
2017-02-10 11:54:13.292131 7feac3f7f700 10 region_k      = e0caaddbb30ebc25840b6aaac3979d1881a14b8e9a0dfea43d8a006c8e0e504d
2017-02-10 11:54:13.292144 7feac3f7f700 10 service_k     = 59d6c9158e9e3c6a1aa97ee15859d2ef9ad9c64209b63f093109844f0c7f6c04
2017-02-10 11:54:13.292171 7feac3f7f700 10 signing_k     = 4dcbccd9c3da779d32758a645644c66a56f64d642eaeb39eec8e0b2facba7805
2017-02-10 11:54:13.292197 7feac3f7f700 10 signature_k   = 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
2017-02-10 11:54:13.292198 7feac3f7f700 10 new signature = 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
2017-02-10 11:54:13.292199 7feac3f7f700 10 ----------------------------- Verifying signatures
2017-02-10 11:54:13.292199 7feac3f7f700 10 Signature     = 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
2017-02-10 11:54:13.292200 7feac3f7f700 10 New Signature = 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
2017-02-10 11:54:13.292200 7feac3f7f700 10 -----------------------------
2017-02-10 11:54:13.292202 7feac3f7f700 10 v4 auth ok
2017-02-10 11:54:13.292238 7feac3f7f700 10 create bucket location constraint: west
2017-02-10 11:54:13.292256 7feac3f7f700 10 cache get: name=osaka.rgw.data.root+bucket3 : type miss (requested=22, cached=0)
2017-02-10 11:54:13.293369 7feac3f7f700 10 cache put: name=osaka.rgw.data.root+bucket3 info.flags=0
2017-02-10 11:54:13.293374 7feac3f7f700 10 moving osaka.rgw.data.root+bucket3 to cache LRU end
2017-02-10 11:54:13.293380 7feac3f7f700  0 sending create_bucket request to master zonegroup
2017-02-10 11:54:13.293401 7feac3f7f700 10 get_canon_resource(): dest=/bucket3/
2017-02-10 11:54:13.293403 7feac3f7f700 10 generated canonical header: PUT


Fri Feb 10 02:54:13 2017
x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
/bucket3/
2017-02-10 11:54:13.299113 7feac3f7f700 10 receive_http_header
2017-02-10 11:54:13.299117 7feac3f7f700 10 received header:HTTP/1.1 404 Not Found
2017-02-10 11:54:13.299119 7feac3f7f700 10 receive_http_header
2017-02-10 11:54:13.299120 7feac3f7f700 10 received header:x-amz-request-id: tx000000000000000000005-00589d2b55-1416-tokyo
2017-02-10 11:54:13.299130 7feac3f7f700 10 receive_http_header
2017-02-10 11:54:13.299131 7feac3f7f700 10 received header:Content-Length: 175
2017-02-10 11:54:13.299133 7feac3f7f700 10 receive_http_header
2017-02-10 11:54:13.299133 7feac3f7f700 10 received header:Accept-Ranges: bytes
2017-02-10 11:54:13.299148 7feac3f7f700 10 receive_http_header
2017-02-10 11:54:13.299149 7feac3f7f700 10 received header:Content-Type: application/xml
2017-02-10 11:54:13.299150 7feac3f7f700 10 receive_http_header
2017-02-10 11:54:13.299150 7feac3f7f700 10 received header:Date: Fri, 10 Feb 2017 02:54:13 GMT
2017-02-10 11:54:13.299152 7feac3f7f700 10 receive_http_header
2017-02-10 11:54:13.299152 7feac3f7f700 10 received header:
2017-02-10 11:54:13.299248 7feac3f7f700  2 req 50:0.008596:s3:PUT /bucket3/:create_bucket:completing
2017-02-10 11:54:13.299319 7feac3f7f700  2 req 50:0.008667:s3:PUT /bucket3/:create_bucket:op status=-2
2017-02-10 11:54:13.299321 7feac3f7f700  2 req 50:0.008670:s3:PUT /bucket3/:create_bucket:http status=404
2017-02-10 11:54:13.299324 7feac3f7f700  1 ====== req done req=0x7feac3f79710 op status=-2 http_status=404 ======
2017-02-10 11:54:13.299349 7feac3f7f700  1 civetweb: 0x7feb2c02d340: 192.168.20.15 - - [10/Feb/2017:11:54:13 +0900] "PUT /bucket3/ HTTP/1.1" 404 0 - -


---- rgw.tokyo log:
2017-02-10 11:54:13.297852 7f56076c6700  1 ====== starting new request req=0x7f56076c0710 =====
2017-02-10 11:54:13.297887 7f56076c6700  2 req 5:0.000035::PUT /bucket3/::initializing for trans_id = tx000000000000000000005-00589d2b55-1416-tokyo
2017-02-10 11:54:13.297895 7f56076c6700 10 rgw api priority: s3=5 s3website=4
2017-02-10 11:54:13.297897 7f56076c6700 10 host=node5
2017-02-10 11:54:13.297906 7f56076c6700 10 meta>> HTTP_X_AMZ_CONTENT_SHA256
2017-02-10 11:54:13.297912 7f56076c6700 10 x>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
2017-02-10 11:54:13.297929 7f56076c6700 10 handler=25RGWHandler_REST_Bucket_S3
2017-02-10 11:54:13.297937 7f56076c6700  2 req 5:0.000086:s3:PUT /bucket3/::getting op 1
2017-02-10 11:54:13.297946 7f56076c6700 10 op=27RGWCreateBucket_ObjStore_S3
2017-02-10 11:54:13.297947 7f56076c6700  2 req 5:0.000096:s3:PUT /bucket3/:create_bucket:authorizing
2017-02-10 11:54:13.297969 7f56076c6700 10 get_canon_resource(): dest=/bucket3/
2017-02-10 11:54:13.297976 7f56076c6700 10 auth_hdr:
PUT


Fri Feb 10 02:54:13 2017
x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
/bucket3/
2017-02-10 11:54:13.298023 7f56076c6700 10 cache get: name=default.rgw.users.uid+nishi : type miss (requested=6, cached=0)
2017-02-10 11:54:13.298975 7f56076c6700 10 cache put: name=default.rgw.users.uid+nishi info.flags=0
2017-02-10 11:54:13.298986 7f56076c6700 10 moving default.rgw.users.uid+nishi to cache LRU end
2017-02-10 11:54:13.298991 7f56076c6700  0 User lookup failed!
2017-02-10 11:54:13.298993 7f56076c6700 10 failed to authorize request
2017-02-10 11:54:13.299077 7f56076c6700  2 req 5:0.001225:s3:PUT /bucket3/:create_bucket:op status=0
2017-02-10 11:54:13.299086 7f56076c6700  2 req 5:0.001235:s3:PUT /bucket3/:create_bucket:http status=404
2017-02-10 11:54:13.299089 7f56076c6700  1 ====== req done req=0x7f56076c0710 op status=0 http_status=404 ======
2017-02-10 11:54:13.299426 7f56076c6700  1 civetweb: 0x7f56200048c0: 192.168.20.15 - - [10/Feb/2017:11:54:13 +0900] "PUT /bucket3/ HTTP/1.1" 404 0 - -


-- 
KIMURA Osamu / 木村 修
Engineering Department, Storage Development Division,
Data Center Platform Business Unit, FUJITSU LIMITED


* Re: rgw: multiple zonegroups in single realm
  2017-02-10  8:21 rgw: multiple zonegroups in single realm KIMURA Osamu
@ 2017-02-12 10:07 ` Orit Wasserman
  2017-02-13  4:44   ` KIMURA Osamu
  0 siblings, 1 reply; 11+ messages in thread
From: Orit Wasserman @ 2017-02-12 10:07 UTC (permalink / raw)
  To: KIMURA Osamu; +Cc: ceph-devel

On Fri, Feb 10, 2017 at 10:21 AM, KIMURA Osamu
<kimura.osamu@jp.fujitsu.com> wrote:
> Hi Cephers,
>
> I'm trying to configure RGWs with multiple zonegroups within single realm.
> The intention is that some buckets to be replicated and others to stay
> locally.

If you are not replicating, then you don't need to create any zone configuration;
a default zonegroup and zone are created automatically.

> e.g.:
>  realm: fj
>   zonegroup east: zone tokyo (not replicated)
no need if not replicated
>   zonegroup west: zone osaka (not replicated)
same here
>   zonegroup jp:   zone jp-east + jp-west (replicated)
>
> To evaluate such configuration, I tentatively built multiple zonegroups
> (east, west) on a ceph cluster. I barely succeed to configure it, but
> some concerns exist.
>
I think you just need one zonegroup with two zones; the others are not needed.
Also, each gateway can handle only a single zone (the rgw_zone
configuration parameter).

> a) User accounts are not synced among zonegroups
>
> I'm not sure if this is a issue, but the blueprint [1] stated a master
> zonegroup manages user accounts as metadata like buckets.
>
You have a lot of confusion between the zones and zonegroups.
A zonegroup is just a group of zones that share the same data
(i.e. replication between them).
A zone represents a geographical location (i.e. one ceph cluster).

We have a meta master zone (the master zone in the master zonegroup);
this meta master is responsible for
replicating user and bucket meta operations.

>
> b) Bucket creation is rejected if master zonegroup doesn't have the account
>
> e.g.:
>   1) Configure east zonegroup as master.
you need a master zone
>   2) Create a user "nishi" on west zonegroup (osaka zone) using
> radosgw-admin.
>   3) Try to create a bucket on west zonegroup by user nishi.
>      -> ERROR: S3 error: 404 (NoSuchKey)
>   4) Create user nishi on east zonegroup with same key.
>   5) Succeed to create a bucket on west zonegroup by user nishi.
>

You are confusing zonegroup and zone here again ...

Note that when you use the radosgw-admin command without providing
zonegroup and/or zone info (--rgw-zonegroup=<zg> and --rgw-zone=<zone>),
it will use the default zonegroup and zone.

Users are stored per zone, and you need to create an admin user in both zones.
For more documentation see: http://docs.ceph.com/docs/master/radosgw/multisite/
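
For example (illustrative uid and keys), creating the admin user in both
zones with the same credentials:

  radosgw-admin user create --uid=admin --display-name="admin user" --system \
      --access-key=ADMIN_ACCESS_KEY --secret=ADMIN_SECRET_KEY \
      --rgw-zonegroup=east --rgw-zone=tokyo
  radosgw-admin user create --uid=admin --display-name="admin user" --system \
      --access-key=ADMIN_ACCESS_KEY --secret=ADMIN_SECRET_KEY \
      --rgw-zonegroup=west --rgw-zone=osaka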

>
> c) How to restrict to place buckets on specific zonegroups?
>

You probably mean zone.
There is ongoing work to enable/disable sync per bucket:
https://github.com/ceph/ceph/pull/10995
With this you can create a bucket in a specific zone and it won't be
replicated to another zone.

> If user accounts would synced future as the blueprint, all the zonegroups
> contain same account information. It means any user can create buckets on
> any zonegroups. If we want to permit to place buckets on a replicated
> zonegroup for specific users, how to configure?
>
> If user accounts will not synced as current behavior, we can restrict
> to place buckets on specific zonegroups. But I cannot find best way to
> configure the master zonegroup.
>
>
> d) Operations for other zonegroup are not redirected
>
> e.g.:
>   1) Create bucket4 on west zonegroup by nishi.
>   2) Try to access bucket4 from endpoint on east zonegroup.
>      -> Respond "301 (Moved Permanently)",
>         but no redirected Location header is returned.
>

It could be a bug. Please open a tracker issue for it on
tracker.ceph.com under the RGW component, with all the configuration
information, logs, and the versions of ceph and radosgw you are using.

> It seems current RGW doesn't follows S3 specification [2].
> To implement this feature, probably we need to define another endpoint
> on each zonegroup for client accessible URL. RGW may placed behind proxy,
> thus the URL may be different from endpoint URLs for replication.
>

With a proxy, the zone and zonegroup endpoints are not used directly by the user.
The user gets a URL pointing to the proxy, and the proxy needs to be
configured to point at the rgw URLs/IPs; you can have several radosgw
instances running.
See more: https://access.redhat.com/documentation/en/red-hat-ceph-storage/2/paged/object-gateway-guide-for-red-hat-enterprise-linux/chapter-2-configuration

Regards,
Orit
>
> Any thoughts?

>
> [1]
> http://tracker.ceph.com/projects/ceph/wiki/Rgw_new_multisite_configuration
> [2] http://docs.aws.amazon.com/AmazonS3/latest/dev/Redirects.html
>
> ------ FYI ------
> [environments]
> Ceph cluster: RHCS 2.0
> RGW: RHEL 7.2 + RGW v10.2.5
>
> zonegroup east: master
>  zone tokyo
>   endpoint http://node5:80
>   system user: sync-user
>   user azuma (+ nishi)
>
> zonegroup west: (not master)
>   zone osaka
>   endpoint http://node5:8081
>   system user: sync-user (created with same key as zone tokyo)
>   user nishi
>
>
> [detail of "b)"]
>
> $ s3cmd -c s3nishi.cfg ls
> $ s3cmd -c s3nishi.cfg mb s3://bucket3
> ERROR: S3 error: 404 (NoSuchKey)
>
> ---- rgw.osaka log:
> 2017-02-10 11:54:13.290653 7feac3f7f700  1 ====== starting new request
> req=0x7feac3f79710 =====
> 2017-02-10 11:54:13.290709 7feac3f7f700  2 req 50:0.000057::PUT
> /bucket3/::initializing for trans_id =
> tx000000000000000000032-00589d2b55-14a2-osaka
> 2017-02-10 11:54:13.290720 7feac3f7f700 10 rgw api priority: s3=5
> s3website=4
> 2017-02-10 11:54:13.290722 7feac3f7f700 10 host=node5
> 2017-02-10 11:54:13.290733 7feac3f7f700 10 meta>> HTTP_X_AMZ_CONTENT_SHA256
> 2017-02-10 11:54:13.290750 7feac3f7f700 10 meta>> HTTP_X_AMZ_DATE
> 2017-02-10 11:54:13.290753 7feac3f7f700 10 x>>
> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
> 2017-02-10 11:54:13.290755 7feac3f7f700 10 x>> x-amz-date:20170210T025413Z
> 2017-02-10 11:54:13.290774 7feac3f7f700 10
> handler=25RGWHandler_REST_Bucket_S3
> 2017-02-10 11:54:13.290775 7feac3f7f700  2 req 50:0.000124:s3:PUT
> /bucket3/::getting op 1
> 2017-02-10 11:54:13.290781 7feac3f7f700 10 op=27RGWCreateBucket_ObjStore_S3
> 2017-02-10 11:54:13.290782 7feac3f7f700  2 req 50:0.000130:s3:PUT
> /bucket3/:create_bucket:authorizing
> 2017-02-10 11:54:13.290798 7feac3f7f700 10 v4 signature format =
> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
> 2017-02-10 11:54:13.290804 7feac3f7f700 10 v4 credential format =
> ZY6EJUVB38SCOWBELERQ/20170210/west/s3/aws4_request
> 2017-02-10 11:54:13.290806 7feac3f7f700 10 access key id =
> ZY6EJUVB38SCOWBELERQ
> 2017-02-10 11:54:13.290814 7feac3f7f700 10 credential scope =
> 20170210/west/s3/aws4_request
> 2017-02-10 11:54:13.290834 7feac3f7f700 10 canonical headers format =
> host:node5:8081
> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
> x-amz-date:20170210T025413Z
>
> 2017-02-10 11:54:13.290836 7feac3f7f700 10 delaying v4 auth
> 2017-02-10 11:54:13.290839 7feac3f7f700  2 req 50:0.000187:s3:PUT
> /bucket3/:create_bucket:normalizing buckets and tenants
> 2017-02-10 11:54:13.290841 7feac3f7f700 10 s->object=<NULL>
> s->bucket=bucket3
> 2017-02-10 11:54:13.290843 7feac3f7f700  2 req 50:0.000191:s3:PUT
> /bucket3/:create_bucket:init permissions
> 2017-02-10 11:54:13.290844 7feac3f7f700  2 req 50:0.000192:s3:PUT
> /bucket3/:create_bucket:recalculating target
> 2017-02-10 11:54:13.290845 7feac3f7f700  2 req 50:0.000193:s3:PUT
> /bucket3/:create_bucket:reading permissions
> 2017-02-10 11:54:13.290846 7feac3f7f700  2 req 50:0.000195:s3:PUT
> /bucket3/:create_bucket:init op
> 2017-02-10 11:54:13.290847 7feac3f7f700  2 req 50:0.000196:s3:PUT
> /bucket3/:create_bucket:verifying op mask
> 2017-02-10 11:54:13.290849 7feac3f7f700  2 req 50:0.000197:s3:PUT
> /bucket3/:create_bucket:verifying op permissions
> 2017-02-10 11:54:13.292027 7feac3f7f700  2 req 50:0.001374:s3:PUT
> /bucket3/:create_bucket:verifying op params
> 2017-02-10 11:54:13.292035 7feac3f7f700  2 req 50:0.001383:s3:PUT
> /bucket3/:create_bucket:pre-executing
> 2017-02-10 11:54:13.292037 7feac3f7f700  2 req 50:0.001385:s3:PUT
> /bucket3/:create_bucket:executing
> 2017-02-10 11:54:13.292072 7feac3f7f700 10 payload request hash =
> d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
> 2017-02-10 11:54:13.292083 7feac3f7f700 10 canonical request = PUT
> /bucket3/
>
> host:node5:8081
> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
> x-amz-date:20170210T025413Z
>
> host;x-amz-content-sha256;x-amz-date
> d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
> 2017-02-10 11:54:13.292084 7feac3f7f700 10 canonical request hash =
> 8faa5ec57f69dd7b54baa72c157b6d63f8c7db309a34a1e2a10ad6f2f585cd02
> 2017-02-10 11:54:13.292087 7feac3f7f700 10 string to sign = AWS4-HMAC-SHA256
> 20170210T025413Z
> 20170210/west/s3/aws4_request
> 8faa5ec57f69dd7b54baa72c157b6d63f8c7db309a34a1e2a10ad6f2f585cd02
> 2017-02-10 11:54:13.292118 7feac3f7f700 10 date_k        =
> 454f3ad73c095e73d2482809d7a6ec8af3c4e900bc83e0a9663ea5fc336cad95
> 2017-02-10 11:54:13.292131 7feac3f7f700 10 region_k      =
> e0caaddbb30ebc25840b6aaac3979d1881a14b8e9a0dfea43d8a006c8e0e504d
> 2017-02-10 11:54:13.292144 7feac3f7f700 10 service_k     =
> 59d6c9158e9e3c6a1aa97ee15859d2ef9ad9c64209b63f093109844f0c7f6c04
> 2017-02-10 11:54:13.292171 7feac3f7f700 10 signing_k     =
> 4dcbccd9c3da779d32758a645644c66a56f64d642eaeb39eec8e0b2facba7805
> 2017-02-10 11:54:13.292197 7feac3f7f700 10 signature_k   =
> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
> 2017-02-10 11:54:13.292198 7feac3f7f700 10 new signature =
> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
> 2017-02-10 11:54:13.292199 7feac3f7f700 10 -----------------------------
> Verifying signatures
> 2017-02-10 11:54:13.292199 7feac3f7f700 10 Signature     =
> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
> 2017-02-10 11:54:13.292200 7feac3f7f700 10 New Signature =
> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
> 2017-02-10 11:54:13.292200 7feac3f7f700 10 -----------------------------
> 2017-02-10 11:54:13.292202 7feac3f7f700 10 v4 auth ok
> 2017-02-10 11:54:13.292238 7feac3f7f700 10 create bucket location
> constraint: west
> 2017-02-10 11:54:13.292256 7feac3f7f700 10 cache get:
> name=osaka.rgw.data.root+bucket3 : type miss (requested=22, cached=0)
> 2017-02-10 11:54:13.293369 7feac3f7f700 10 cache put:
> name=osaka.rgw.data.root+bucket3 info.flags=0
> 2017-02-10 11:54:13.293374 7feac3f7f700 10 moving
> osaka.rgw.data.root+bucket3 to cache LRU end
> 2017-02-10 11:54:13.293380 7feac3f7f700  0 sending create_bucket request to
> master zonegroup
> 2017-02-10 11:54:13.293401 7feac3f7f700 10 get_canon_resource():
> dest=/bucket3/
> 2017-02-10 11:54:13.293403 7feac3f7f700 10 generated canonical header: PUT
>
>
> Fri Feb 10 02:54:13 2017
> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
> /bucket3/
> 2017-02-10 11:54:13.299113 7feac3f7f700 10 receive_http_header
> 2017-02-10 11:54:13.299117 7feac3f7f700 10 received header:HTTP/1.1 404 Not
> Found
> 2017-02-10 11:54:13.299119 7feac3f7f700 10 receive_http_header
> 2017-02-10 11:54:13.299120 7feac3f7f700 10 received header:x-amz-request-id:
> tx000000000000000000005-00589d2b55-1416-tokyo
> 2017-02-10 11:54:13.299130 7feac3f7f700 10 receive_http_header
> 2017-02-10 11:54:13.299131 7feac3f7f700 10 received header:Content-Length:
> 175
> 2017-02-10 11:54:13.299133 7feac3f7f700 10 receive_http_header
> 2017-02-10 11:54:13.299133 7feac3f7f700 10 received header:Accept-Ranges:
> bytes
> 2017-02-10 11:54:13.299148 7feac3f7f700 10 receive_http_header
> 2017-02-10 11:54:13.299149 7feac3f7f700 10 received header:Content-Type:
> application/xml
> 2017-02-10 11:54:13.299150 7feac3f7f700 10 receive_http_header
> 2017-02-10 11:54:13.299150 7feac3f7f700 10 received header:Date: Fri, 10 Feb
> 2017 02:54:13 GMT
> 2017-02-10 11:54:13.299152 7feac3f7f700 10 receive_http_header
> 2017-02-10 11:54:13.299152 7feac3f7f700 10 received header:
> 2017-02-10 11:54:13.299248 7feac3f7f700  2 req 50:0.008596:s3:PUT
> /bucket3/:create_bucket:completing
> 2017-02-10 11:54:13.299319 7feac3f7f700  2 req 50:0.008667:s3:PUT
> /bucket3/:create_bucket:op status=-2
> 2017-02-10 11:54:13.299321 7feac3f7f700  2 req 50:0.008670:s3:PUT
> /bucket3/:create_bucket:http status=404
> 2017-02-10 11:54:13.299324 7feac3f7f700  1 ====== req done
> req=0x7feac3f79710 op status=-2 http_status=404 ======
> 2017-02-10 11:54:13.299349 7feac3f7f700  1 civetweb: 0x7feb2c02d340:
> 192.168.20.15 - - [10/Feb/2017:11:54:13 +0900] "PUT /bucket3/ HTTP/1.1" 404
> 0 - -
>
>
> ---- rgw.tokyo log:
> 2017-02-10 11:54:13.297852 7f56076c6700  1 ====== starting new request
> req=0x7f56076c0710 =====
> 2017-02-10 11:54:13.297887 7f56076c6700  2 req 5:0.000035::PUT
> /bucket3/::initializing for trans_id =
> tx000000000000000000005-00589d2b55-1416-tokyo
> 2017-02-10 11:54:13.297895 7f56076c6700 10 rgw api priority: s3=5
> s3website=4
> 2017-02-10 11:54:13.297897 7f56076c6700 10 host=node5
> 2017-02-10 11:54:13.297906 7f56076c6700 10 meta>> HTTP_X_AMZ_CONTENT_SHA256
> 2017-02-10 11:54:13.297912 7f56076c6700 10 x>>
> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
> 2017-02-10 11:54:13.297929 7f56076c6700 10
> handler=25RGWHandler_REST_Bucket_S3
> 2017-02-10 11:54:13.297937 7f56076c6700  2 req 5:0.000086:s3:PUT
> /bucket3/::getting op 1
> 2017-02-10 11:54:13.297946 7f56076c6700 10 op=27RGWCreateBucket_ObjStore_S3
> 2017-02-10 11:54:13.297947 7f56076c6700  2 req 5:0.000096:s3:PUT
> /bucket3/:create_bucket:authorizing
> 2017-02-10 11:54:13.297969 7f56076c6700 10 get_canon_resource():
> dest=/bucket3/
> 2017-02-10 11:54:13.297976 7f56076c6700 10 auth_hdr:
> PUT
>
>
> Fri Feb 10 02:54:13 2017
> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
> /bucket3/
> 2017-02-10 11:54:13.298023 7f56076c6700 10 cache get:
> name=default.rgw.users.uid+nishi : type miss (requested=6, cached=0)
> 2017-02-10 11:54:13.298975 7f56076c6700 10 cache put:
> name=default.rgw.users.uid+nishi info.flags=0
> 2017-02-10 11:54:13.298986 7f56076c6700 10 moving
> default.rgw.users.uid+nishi to cache LRU end
> 2017-02-10 11:54:13.298991 7f56076c6700  0 User lookup failed!
> 2017-02-10 11:54:13.298993 7f56076c6700 10 failed to authorize request
> 2017-02-10 11:54:13.299077 7f56076c6700  2 req 5:0.001225:s3:PUT
> /bucket3/:create_bucket:op status=0
> 2017-02-10 11:54:13.299086 7f56076c6700  2 req 5:0.001235:s3:PUT
> /bucket3/:create_bucket:http status=404
> 2017-02-10 11:54:13.299089 7f56076c6700  1 ====== req done
> req=0x7f56076c0710 op status=0 http_status=404 ======
> 2017-02-10 11:54:13.299426 7f56076c6700  1 civetweb: 0x7f56200048c0:
> 192.168.20.15 - - [10/Feb/2017:11:54:13 +0900] "PUT /bucket3/ HTTP/1.1" 404
> 0 - -
>
>
> --
> KIMURA Osamu / 木村 修
> Engineering Department, Storage Development Division,
> Data Center Platform Business Unit, FUJITSU LIMITED
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: rgw: multiple zonegroups in single realm
  2017-02-12 10:07 ` Orit Wasserman
@ 2017-02-13  4:44   ` KIMURA Osamu
  2017-02-13  9:42     ` Orit Wasserman
  0 siblings, 1 reply; 11+ messages in thread
From: KIMURA Osamu @ 2017-02-13  4:44 UTC (permalink / raw)
  To: Orit Wasserman; +Cc: ceph-devel

Hi Orit,

Thanks for your comments.
I believe I'm not confused, but perhaps my thoughts were not described well enough...

On 2017/02/12 19:07, Orit Wasserman wrote:
> On Fri, Feb 10, 2017 at 10:21 AM, KIMURA Osamu
> <kimura.osamu@jp.fujitsu.com> wrote:
>> Hi Cephers,
>>
>> I'm trying to configure RGWs with multiple zonegroups within single realm.
>> The intention is that some buckets to be replicated and others to stay
>> locally.
>
> If you are not replicating than you don't need to create any zone configuration,
> a default zonegroup and zone are created automatically
>
>> e.g.:
>>  realm: fj
>>   zonegroup east: zone tokyo (not replicated)
> no need if not replicated
>>   zonegroup west: zone osaka (not replicated)
> same here
>>   zonegroup jp:   zone jp-east + jp-west (replicated)

The "east" and "west" zonegroups are just renamed from "default"
as described in RHCS document [3].
We may not need to rename them, but at least api_name should be altered.
In addition, I'm not sure what happens if 2 "default" zones/zonegroups
co-exist in same realm.


>> To evaluate such configuration, I tentatively built multiple zonegroups
>> (east, west) on a ceph cluster. I barely succeed to configure it, but
>> some concerns exist.
>>
> I think you just need one zonegroup with two zones the other are not needed
> Also each gateway can handle only a single zone (rgw_zone
> configuration parameter)

This is just a tentative setup to confirm the behavior of multiple zonegroups,
due to limitations of our current equipment.
The "east" zonegroup was renamed from "default", and another "west" zonegroup
was created. Of course I specified both the rgw_zonegroup and rgw_zone parameters
for each RGW instance. (see the -FYI- section below)


>> a) User accounts are not synced among zonegroups
>>
>> I'm not sure if this is a issue, but the blueprint [1] stated a master
>> zonegroup manages user accounts as metadata like buckets.
>>
> You have a lot of confusion with the zones and zonegroups.
> A zonegroup is just a group of zones that are sharing the same data
> (i.e. replication between them)
> A zone represent a geographical location (i.e. one ceph cluster)
>
> We have a meta master zone (the master zone in the master zonegroup),
> this meta master is responible on
> replicating users and byckets meta operations.

I know that.
But the master zone in the master zonegroup manages bucket meta
operations, including buckets in other zonegroups. It means
the master zone in the master zonegroup must have permission to
handle those bucket meta operations, i.e., it must have the same user
accounts as the other zonegroups.
This is related to the next issue, b). If the master zone in the master
zonegroup doesn't have the user accounts of other zonegroups, all the
bucket meta operations are rejected.

In addition, it may be overexplaining, but user accounts are
sync'ed to other zones within the same zonegroup if the accounts are
created on the master zone of the zonegroup. On the other hand,
I found today that user accounts are not sync'ed to the master if the
accounts are created on a slave(?) zone in the zonegroup. It seems
to be asymmetric behavior.
I'm not sure whether the same behavior occurs via the Admin REST API instead
of radosgw-admin.
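
For reference, this is roughly how I check it (a sketch, with the zone names
from my setup):

   $ radosgw-admin sync status --rgw-zone=osaka
   $ radosgw-admin metadata list user --rgw-zone=tokyo
   $ radosgw-admin metadata list user --rgw-zone=osaka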


>> b) Bucket creation is rejected if master zonegroup doesn't have the account
>>
>> e.g.:
>>   1) Configure east zonegroup as master.
> you need a master zoen
>>   2) Create a user "nishi" on west zonegroup (osaka zone) using
>> radosgw-admin.
>>   3) Try to create a bucket on west zonegroup by user nishi.
>>      -> ERROR: S3 error: 404 (NoSuchKey)
>>   4) Create user nishi on east zonegroup with same key.
>>   5) Succeed to create a bucket on west zonegroup by user nishi.
>>
>
> You are confusing zonegroup and zone here again ...
>
> you should notice that when you are using radosgw-admin command
> without providing zonegorup and/or zone info (--rgw-zonegroup=<zg> and
> --rgw-zone=<zone>) it will use the default zonegroup and zone.
>
> User is stored per zone and you need to create an admin users in both zones
> for more documentation see: http://docs.ceph.com/docs/master/radosgw/multisite/

I always specify --rgw-zonegroup and --rgw-zone for the radosgw-admin command.

The issue is that any bucket meta operation is rejected when the master
zone in the master zonegroup doesn't have the user account of the other zonegroup.

I'll try to describe the details again (a command-level sketch of steps 1)-4)
follows the list):
1) Create the fj realm as default.
2) Rename the default zonegroup/zone to east/tokyo and mark them as default.
3) Create the west/osaka zonegroup/zone.
4) Create the system user sync-user on both the tokyo and osaka zones with the same key.
5) Start 2 RGW instances for the tokyo and osaka zones.
6) Create the azuma user account on the tokyo zone in the east zonegroup.
7) Create /bucket1 through the tokyo zone endpoint with the azuma account.
    -> No problem.
8) Create the nishi user account on the osaka zone in the west zonegroup.
9) Try to create bucket /bucket2 through the osaka zone endpoint with the azuma account.
    -> responds "ERROR: S3 error: 403 (InvalidAccessKeyId)" as expected.
10) Try to create bucket /bucket3 through the osaka zone endpoint with the nishi account.
    -> responds "ERROR: S3 error: 404 (NoSuchKey)"
    A detailed log is shown in the -FYI- section below.
    The RGW for the osaka zone verifies the signature and forwards the request
    to the tokyo zone endpoint (= the master zone in the master zonegroup).
    Then the RGW for the tokyo zone rejects the request as unauthorized.
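
A command-level sketch of steps 1) through 4) (keys are placeholders;
endpoints as in the -FYI- section):

   $ radosgw-admin realm create --rgw-realm=fj --default
   $ radosgw-admin zonegroup rename --rgw-zonegroup=default --zonegroup-new-name=east
   $ radosgw-admin zone rename --rgw-zone=default --zone-new-name=tokyo --rgw-zonegroup=east
   $ radosgw-admin zonegroup modify --rgw-zonegroup=east --endpoints=http://node5:80 \
         --master --default
   $ radosgw-admin zone modify --rgw-zone=tokyo --endpoints=http://node5:80 \
         --master --default
   $ radosgw-admin zonegroup create --rgw-zonegroup=west --endpoints=http://node5:8081
   $ radosgw-admin zone create --rgw-zonegroup=west --rgw-zone=osaka \
         --endpoints=http://node5:8081
   $ radosgw-admin user create --uid=sync-user --display-name="sync user" --system \
         --access-key=SYNC_ACCESS_KEY --secret=SYNC_SECRET_KEY \
         --rgw-zonegroup=east --rgw-zone=tokyo
   $ radosgw-admin user create --uid=sync-user --display-name="sync user" --system \
         --access-key=SYNC_ACCESS_KEY --secret=SYNC_SECRET_KEY \
         --rgw-zonegroup=west --rgw-zone=osaka
   $ radosgw-admin period update --commit

Then the two radosgw instances are started with the ceph.conf fragments shown
in the -FYI- section.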


>> c) How to restrict to place buckets on specific zonegroups?
>>
>
> you probably mean zone.
> There is ongoing work to enable/disable sync per bucket
> https://github.com/ceph/ceph/pull/10995
> with this you can create a bucket on a specific zone and it won't be
> replicated to another zone

My thought concerns the zonegroup (not the zone), as described above.
With the current code, buckets are sync'ed to all zones within a zonegroup;
there is no way to choose the zone in which to place specific buckets.
But this change may help us reach our original target.

It seems we need more discussion about that change.
I would prefer the default behavior to be associated with the user account (per SLA).
And the attribution of each bucket should be changeable via the REST
API, depending on the user's permission, rather than via the radosgw-admin command.

Anyway, I'll examine more details.

>> If user accounts would synced future as the blueprint, all the zonegroups
>> contain same account information. It means any user can create buckets on
>> any zonegroups. If we want to permit to place buckets on a replicated
>> zonegroup for specific users, how to configure?
>>
>> If user accounts will not synced as current behavior, we can restrict
>> to place buckets on specific zonegroups. But I cannot find best way to
>> configure the master zonegroup.
>>
>>
>> d) Operations for other zonegroup are not redirected
>>
>> e.g.:
>>   1) Create bucket4 on west zonegroup by nishi.
>>   2) Try to access bucket4 from endpoint on east zonegroup.
>>      -> Respond "301 (Moved Permanently)",
>>         but no redirected Location header is returned.
>>
>
> It could be a bug please open a tracker issue for that in
> tracker.ceph.com for RGW component with all the configuration
> information,
> logs and the version of ceph and radosgw you are using.

I will open it, but it may be filed as a "Feature" instead of a "Bug"
depending on the following discussion.

>> It seems current RGW doesn't follows S3 specification [2].
>> To implement this feature, probably we need to define another endpoint
>> on each zonegroup for client accessible URL. RGW may placed behind proxy,
>> thus the URL may be different from endpoint URLs for replication.
>>
>
> The zone and zonegroup endpoints are not used directly by the user with a proxy.
> The user get a URL pointing to the proxy and the proxy will need to be
> configured to point the rgw urls/IPs , you can have several radosgw
> running.
> See more https://access.redhat.com/documentation/en/red-hat-ceph-storage/2/paged/object-gateway-guide-for-red-hat-enterprise-linux/chapter-2-configuration

Does that mean the proxy has the responsibility to rewrite the "Location" header
into the redirect URL?

Basically, RGW can only respond with the endpoint described in the zonegroup
setting as the redirect URL in the Location header. But the client may not be
able to access that endpoint. Someone must translate the Location header into
a client-accessible URL.

If the proxy rewrites the Location header, it looks like a man-in-the-middle
attack.


Regards,
KIMURA

> Regrads,
> Orit
>>
>> Any thoughts?
>
>>
>> [1]
>> http://tracker.ceph.com/projects/ceph/wiki/Rgw_new_multisite_configuration
>> [2] http://docs.aws.amazon.com/AmazonS3/latest/dev/Redirects.html

[3] https://access.redhat.com/documentation/en/red-hat-ceph-storage/2/paged/object-gateway-guide-for-red-hat-enterprise-linux/chapter-8-multi-site#migrating_a_single_site_system_to_multi_site

>> ------ FYI ------
>> [environments]
>> Ceph cluster: RHCS 2.0
>> RGW: RHEL 7.2 + RGW v10.2.5
>>
>> zonegroup east: master
>>  zone tokyo
>>   endpoint http://node5:80
        rgw frontends = "civetweb port=80"
        rgw zonegroup = east
        rgw zone = tokyo
>>   system user: sync-user
>>   user azuma (+ nishi)
>>
>> zonegroup west: (not master)
>>   zone osaka
>>   endpoint http://node5:8081
        rgw frontends = "civetweb port=8081"
        rgw zonegroup = west
        rgw zone = osaka
>>   system user: sync-user (created with same key as zone tokyo)
>>   user nishi
>>
>>
>> [detail of "b)"]
>>
>> $ s3cmd -c s3nishi.cfg ls
>> $ s3cmd -c s3nishi.cfg mb s3://bucket3
>> ERROR: S3 error: 404 (NoSuchKey)
>>
>> ---- rgw.osaka log:
>> 2017-02-10 11:54:13.290653 7feac3f7f700  1 ====== starting new request
>> req=0x7feac3f79710 =====
>> 2017-02-10 11:54:13.290709 7feac3f7f700  2 req 50:0.000057::PUT
>> /bucket3/::initializing for trans_id =
>> tx000000000000000000032-00589d2b55-14a2-osaka
>> 2017-02-10 11:54:13.290720 7feac3f7f700 10 rgw api priority: s3=5
>> s3website=4
>> 2017-02-10 11:54:13.290722 7feac3f7f700 10 host=node5
>> 2017-02-10 11:54:13.290733 7feac3f7f700 10 meta>> HTTP_X_AMZ_CONTENT_SHA256
>> 2017-02-10 11:54:13.290750 7feac3f7f700 10 meta>> HTTP_X_AMZ_DATE
>> 2017-02-10 11:54:13.290753 7feac3f7f700 10 x>>
>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>> 2017-02-10 11:54:13.290755 7feac3f7f700 10 x>> x-amz-date:20170210T025413Z
>> 2017-02-10 11:54:13.290774 7feac3f7f700 10
>> handler=25RGWHandler_REST_Bucket_S3
>> 2017-02-10 11:54:13.290775 7feac3f7f700  2 req 50:0.000124:s3:PUT
>> /bucket3/::getting op 1
>> 2017-02-10 11:54:13.290781 7feac3f7f700 10 op=27RGWCreateBucket_ObjStore_S3
>> 2017-02-10 11:54:13.290782 7feac3f7f700  2 req 50:0.000130:s3:PUT
>> /bucket3/:create_bucket:authorizing
>> 2017-02-10 11:54:13.290798 7feac3f7f700 10 v4 signature format =
>> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
>> 2017-02-10 11:54:13.290804 7feac3f7f700 10 v4 credential format =
>> ZY6EJUVB38SCOWBELERQ/20170210/west/s3/aws4_request
>> 2017-02-10 11:54:13.290806 7feac3f7f700 10 access key id =
>> ZY6EJUVB38SCOWBELERQ
>> 2017-02-10 11:54:13.290814 7feac3f7f700 10 credential scope =
>> 20170210/west/s3/aws4_request
>> 2017-02-10 11:54:13.290834 7feac3f7f700 10 canonical headers format =
>> host:node5:8081
>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>> x-amz-date:20170210T025413Z
>>
>> 2017-02-10 11:54:13.290836 7feac3f7f700 10 delaying v4 auth
>> 2017-02-10 11:54:13.290839 7feac3f7f700  2 req 50:0.000187:s3:PUT
>> /bucket3/:create_bucket:normalizing buckets and tenants
>> 2017-02-10 11:54:13.290841 7feac3f7f700 10 s->object=<NULL>
>> s->bucket=bucket3
>> 2017-02-10 11:54:13.290843 7feac3f7f700  2 req 50:0.000191:s3:PUT
>> /bucket3/:create_bucket:init permissions
>> 2017-02-10 11:54:13.290844 7feac3f7f700  2 req 50:0.000192:s3:PUT
>> /bucket3/:create_bucket:recalculating target
>> 2017-02-10 11:54:13.290845 7feac3f7f700  2 req 50:0.000193:s3:PUT
>> /bucket3/:create_bucket:reading permissions
>> 2017-02-10 11:54:13.290846 7feac3f7f700  2 req 50:0.000195:s3:PUT
>> /bucket3/:create_bucket:init op
>> 2017-02-10 11:54:13.290847 7feac3f7f700  2 req 50:0.000196:s3:PUT
>> /bucket3/:create_bucket:verifying op mask
>> 2017-02-10 11:54:13.290849 7feac3f7f700  2 req 50:0.000197:s3:PUT
>> /bucket3/:create_bucket:verifying op permissions
>> 2017-02-10 11:54:13.292027 7feac3f7f700  2 req 50:0.001374:s3:PUT
>> /bucket3/:create_bucket:verifying op params
>> 2017-02-10 11:54:13.292035 7feac3f7f700  2 req 50:0.001383:s3:PUT
>> /bucket3/:create_bucket:pre-executing
>> 2017-02-10 11:54:13.292037 7feac3f7f700  2 req 50:0.001385:s3:PUT
>> /bucket3/:create_bucket:executing
>> 2017-02-10 11:54:13.292072 7feac3f7f700 10 payload request hash =
>> d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>> 2017-02-10 11:54:13.292083 7feac3f7f700 10 canonical request = PUT
>> /bucket3/
>>
>> host:node5:8081
>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>> x-amz-date:20170210T025413Z
>>
>> host;x-amz-content-sha256;x-amz-date
>> d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>> 2017-02-10 11:54:13.292084 7feac3f7f700 10 canonical request hash =
>> 8faa5ec57f69dd7b54baa72c157b6d63f8c7db309a34a1e2a10ad6f2f585cd02
>> 2017-02-10 11:54:13.292087 7feac3f7f700 10 string to sign = AWS4-HMAC-SHA256
>> 20170210T025413Z
>> 20170210/west/s3/aws4_request
>> 8faa5ec57f69dd7b54baa72c157b6d63f8c7db309a34a1e2a10ad6f2f585cd02
>> 2017-02-10 11:54:13.292118 7feac3f7f700 10 date_k        =
>> 454f3ad73c095e73d2482809d7a6ec8af3c4e900bc83e0a9663ea5fc336cad95
>> 2017-02-10 11:54:13.292131 7feac3f7f700 10 region_k      =
>> e0caaddbb30ebc25840b6aaac3979d1881a14b8e9a0dfea43d8a006c8e0e504d
>> 2017-02-10 11:54:13.292144 7feac3f7f700 10 service_k     =
>> 59d6c9158e9e3c6a1aa97ee15859d2ef9ad9c64209b63f093109844f0c7f6c04
>> 2017-02-10 11:54:13.292171 7feac3f7f700 10 signing_k     =
>> 4dcbccd9c3da779d32758a645644c66a56f64d642eaeb39eec8e0b2facba7805
>> 2017-02-10 11:54:13.292197 7feac3f7f700 10 signature_k   =
>> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
>> 2017-02-10 11:54:13.292198 7feac3f7f700 10 new signature =
>> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
>> 2017-02-10 11:54:13.292199 7feac3f7f700 10 -----------------------------
>> Verifying signatures
>> 2017-02-10 11:54:13.292199 7feac3f7f700 10 Signature     =
>> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
>> 2017-02-10 11:54:13.292200 7feac3f7f700 10 New Signature =
>> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
>> 2017-02-10 11:54:13.292200 7feac3f7f700 10 -----------------------------
>> 2017-02-10 11:54:13.292202 7feac3f7f700 10 v4 auth ok
>> 2017-02-10 11:54:13.292238 7feac3f7f700 10 create bucket location
>> constraint: west
>> 2017-02-10 11:54:13.292256 7feac3f7f700 10 cache get:
>> name=osaka.rgw.data.root+bucket3 : type miss (requested=22, cached=0)
>> 2017-02-10 11:54:13.293369 7feac3f7f700 10 cache put:
>> name=osaka.rgw.data.root+bucket3 info.flags=0
>> 2017-02-10 11:54:13.293374 7feac3f7f700 10 moving
>> osaka.rgw.data.root+bucket3 to cache LRU end
>> 2017-02-10 11:54:13.293380 7feac3f7f700  0 sending create_bucket request to
>> master zonegroup
>> 2017-02-10 11:54:13.293401 7feac3f7f700 10 get_canon_resource():
>> dest=/bucket3/
>> 2017-02-10 11:54:13.293403 7feac3f7f700 10 generated canonical header: PUT
>>
>>
>> Fri Feb 10 02:54:13 2017
>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>> /bucket3/
>> 2017-02-10 11:54:13.299113 7feac3f7f700 10 receive_http_header
>> 2017-02-10 11:54:13.299117 7feac3f7f700 10 received header:HTTP/1.1 404 Not
>> Found
>> 2017-02-10 11:54:13.299119 7feac3f7f700 10 receive_http_header
>> 2017-02-10 11:54:13.299120 7feac3f7f700 10 received header:x-amz-request-id:
>> tx000000000000000000005-00589d2b55-1416-tokyo
>> 2017-02-10 11:54:13.299130 7feac3f7f700 10 receive_http_header
>> 2017-02-10 11:54:13.299131 7feac3f7f700 10 received header:Content-Length:
>> 175
>> 2017-02-10 11:54:13.299133 7feac3f7f700 10 receive_http_header
>> 2017-02-10 11:54:13.299133 7feac3f7f700 10 received header:Accept-Ranges:
>> bytes
>> 2017-02-10 11:54:13.299148 7feac3f7f700 10 receive_http_header
>> 2017-02-10 11:54:13.299149 7feac3f7f700 10 received header:Content-Type:
>> application/xml
>> 2017-02-10 11:54:13.299150 7feac3f7f700 10 receive_http_header
>> 2017-02-10 11:54:13.299150 7feac3f7f700 10 received header:Date: Fri, 10 Feb
>> 2017 02:54:13 GMT
>> 2017-02-10 11:54:13.299152 7feac3f7f700 10 receive_http_header
>> 2017-02-10 11:54:13.299152 7feac3f7f700 10 received header:
>> 2017-02-10 11:54:13.299248 7feac3f7f700  2 req 50:0.008596:s3:PUT
>> /bucket3/:create_bucket:completing
>> 2017-02-10 11:54:13.299319 7feac3f7f700  2 req 50:0.008667:s3:PUT
>> /bucket3/:create_bucket:op status=-2
>> 2017-02-10 11:54:13.299321 7feac3f7f700  2 req 50:0.008670:s3:PUT
>> /bucket3/:create_bucket:http status=404
>> 2017-02-10 11:54:13.299324 7feac3f7f700  1 ====== req done
>> req=0x7feac3f79710 op status=-2 http_status=404 ======
>> 2017-02-10 11:54:13.299349 7feac3f7f700  1 civetweb: 0x7feb2c02d340:
>> 192.168.20.15 - - [10/Feb/2017:11:54:13 +0900] "PUT /bucket3/ HTTP/1.1" 404
>> 0 - -
>>
>>
>> ---- rgw.tokyo log:
>> 2017-02-10 11:54:13.297852 7f56076c6700  1 ====== starting new request
>> req=0x7f56076c0710 =====
>> 2017-02-10 11:54:13.297887 7f56076c6700  2 req 5:0.000035::PUT
>> /bucket3/::initializing for trans_id =
>> tx000000000000000000005-00589d2b55-1416-tokyo
>> 2017-02-10 11:54:13.297895 7f56076c6700 10 rgw api priority: s3=5
>> s3website=4
>> 2017-02-10 11:54:13.297897 7f56076c6700 10 host=node5
>> 2017-02-10 11:54:13.297906 7f56076c6700 10 meta>> HTTP_X_AMZ_CONTENT_SHA256
>> 2017-02-10 11:54:13.297912 7f56076c6700 10 x>>
>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>> 2017-02-10 11:54:13.297929 7f56076c6700 10
>> handler=25RGWHandler_REST_Bucket_S3
>> 2017-02-10 11:54:13.297937 7f56076c6700  2 req 5:0.000086:s3:PUT
>> /bucket3/::getting op 1
>> 2017-02-10 11:54:13.297946 7f56076c6700 10 op=27RGWCreateBucket_ObjStore_S3
>> 2017-02-10 11:54:13.297947 7f56076c6700  2 req 5:0.000096:s3:PUT
>> /bucket3/:create_bucket:authorizing
>> 2017-02-10 11:54:13.297969 7f56076c6700 10 get_canon_resource():
>> dest=/bucket3/
>> 2017-02-10 11:54:13.297976 7f56076c6700 10 auth_hdr:
>> PUT
>>
>>
>> Fri Feb 10 02:54:13 2017
>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>> /bucket3/
>> 2017-02-10 11:54:13.298023 7f56076c6700 10 cache get:
>> name=default.rgw.users.uid+nishi : type miss (requested=6, cached=0)
>> 2017-02-10 11:54:13.298975 7f56076c6700 10 cache put:
>> name=default.rgw.users.uid+nishi info.flags=0
>> 2017-02-10 11:54:13.298986 7f56076c6700 10 moving
>> default.rgw.users.uid+nishi to cache LRU end
>> 2017-02-10 11:54:13.298991 7f56076c6700  0 User lookup failed!
>> 2017-02-10 11:54:13.298993 7f56076c6700 10 failed to authorize request
>> 2017-02-10 11:54:13.299077 7f56076c6700  2 req 5:0.001225:s3:PUT
>> /bucket3/:create_bucket:op status=0
>> 2017-02-10 11:54:13.299086 7f56076c6700  2 req 5:0.001235:s3:PUT
>> /bucket3/:create_bucket:http status=404
>> 2017-02-10 11:54:13.299089 7f56076c6700  1 ====== req done
>> req=0x7f56076c0710 op status=0 http_status=404 ======
>> 2017-02-10 11:54:13.299426 7f56076c6700  1 civetweb: 0x7f56200048c0:
>> 192.168.20.15 - - [10/Feb/2017:11:54:13 +0900] "PUT /bucket3/ HTTP/1.1" 404
>> 0 - -

-- 
KIMURA Osamu / 木村 修
Engineering Department, Storage Development Division,
Data Center Platform Business Unit, FUJITSU LIMITED


* Re: rgw: multiple zonegroups in single realm
  2017-02-13  4:44   ` KIMURA Osamu
@ 2017-02-13  9:42     ` Orit Wasserman
  2017-02-13 10:57       ` KIMURA Osamu
  0 siblings, 1 reply; 11+ messages in thread
From: Orit Wasserman @ 2017-02-13  9:42 UTC (permalink / raw)
  To: KIMURA Osamu; +Cc: ceph-devel

On Mon, Feb 13, 2017 at 6:44 AM, KIMURA Osamu
<kimura.osamu@jp.fujitsu.com> wrote:
> Hi Orit,
>
> Thanks for your comments.
> I believe I'm not confusing, but probably my thought may not be well
> described...
>
:)
> On 2017/02/12 19:07, Orit Wasserman wrote:
>>
>> On Fri, Feb 10, 2017 at 10:21 AM, KIMURA Osamu
>> <kimura.osamu@jp.fujitsu.com> wrote:
>>>
>>> Hi Cephers,
>>>
>>> I'm trying to configure RGWs with multiple zonegroups within single
>>> realm.
>>> The intention is that some buckets to be replicated and others to stay
>>> locally.
>>
>>
>> If you are not replicating than you don't need to create any zone
>> configuration,
>> a default zonegroup and zone are created automatically
>>
>>> e.g.:
>>>  realm: fj
>>>   zonegroup east: zone tokyo (not replicated)
>>
>> no need if not replicated
>>>
>>>   zonegroup west: zone osaka (not replicated)
>>
>> same here
>>>
>>>   zonegroup jp:   zone jp-east + jp-west (replicated)
>
>
> The "east" and "west" zonegroups are just renamed from "default"
> as described in RHCS document [3].

Why do you need two zonegroups (or 3)?

At the moment multisite v2 automatically replicates all zones in the
realm except the "default" zone.
The moment you add a new zone (which could be part of another zonegroup), it
will be replicated to the other zones.
It seems you don't want or need this.
We are working on allowing more control over the replication, but that
will come in the future.

> We may not need to rename them, but at least api_name should be altered.

You can change the api_name for the "default" zone.

> In addition, I'm not sure what happens if 2 "default" zones/zonegroups
> co-exist in same realm.

A realm shares all of the zone/zonegroup configuration,
which means it is the same zone/zonegroup.
"default" means no zone/zonegroup has been configured; we use it to run
radosgw without any
zone/zonegroup specified in the configuration.

>
>
>>> To evaluate such configuration, I tentatively built multiple zonegroups
>>> (east, west) on a ceph cluster. I barely succeed to configure it, but
>>> some concerns exist.
>>>
>> I think you just need one zonegroup with two zones the other are not
>> needed
>> Also each gateway can handle only a single zone (rgw_zone
>> configuration parameter)
>
>
> This is just a tentative one to confirm the behavior of multiple zonegroups
> due to limitation of our current equipment.
> The "east" zonegroup was renamed from "default", and another "west"
> zonegroup
> was created. Of course I specified both rgw_zonegroup and rgw_zone
> parameters
> for each RGW instance. (see -FYI- section bellow)
>
Can I suggest starting with a simpler setup:
two zonegroups, where the first has two zones and the second has
one zone.
It is simpler to configure and, in case of problems, to debug.

>
>>> a) User accounts are not synced among zonegroups
>>>
>>> I'm not sure if this is a issue, but the blueprint [1] stated a master
>>> zonegroup manages user accounts as metadata like buckets.
>>>
>> You have a lot of confusion with the zones and zonegroups.
>> A zonegroup is just a group of zones that are sharing the same data
>> (i.e. replication between them)
>> A zone represent a geographical location (i.e. one ceph cluster)
>>
>> We have a meta master zone (the master zone in the master zonegroup),
>> this meta master is responible on
>> replicating users and byckets meta operations.
>
>
> I know it.
> But the master zone in the master zonegroup manages bucket meta
> operations including buckets in other zonegroups. It means
> the master zone in the master zonegroup must have permission to
> handle buckets meta operations, i.e., must have same user accounts
> as other zonegroups.
Again, zones, not zonegroups; it needs to have an admin user with the
same credentials as in all the other zones.

> This is related to next issue b). If the master zone in the master
> zonegroup doesn't have user accounts for other zonegroups, all the
> buckets meta operations are rejected.
>

Correct

> In addition, it may be overexplanation though, user accounts are
> sync'ed to other zones within same zonegroup if the accounts are
> created on master zone of the zonegroup. On the other hand,
> I found today, user accounts are not sync'ed to master if the
> accounts are created on slave(?) zone in the zonegroup. It seems
> asymmetric behavior.

This requires investigation; can you open a tracker issue so we can
look into it?

> I'm not sure if the same behavior is caused by Admin REST API instead
> of radosgw-admin.
>

It doesn't matter; both use almost the same code.

>
>>> b) Bucket creation is rejected if master zonegroup doesn't have the
>>> account
>>>
>>> e.g.:
>>>   1) Configure east zonegroup as master.
>>
>> you need a master zoen
>>>
>>>   2) Create a user "nishi" on west zonegroup (osaka zone) using
>>> radosgw-admin.
>>>   3) Try to create a bucket on west zonegroup by user nishi.
>>>      -> ERROR: S3 error: 404 (NoSuchKey)
>>>   4) Create user nishi on east zonegroup with same key.
>>>   5) Succeed to create a bucket on west zonegroup by user nishi.
>>>
>>
>> You are confusing zonegroup and zone here again ...
>>
>> you should notice that when you are using radosgw-admin command
>> without providing zonegorup and/or zone info (--rgw-zonegroup=<zg> and
>> --rgw-zone=<zone>) it will use the default zonegroup and zone.
>>
>> User is stored per zone and you need to create an admin users in both
>> zones
>> for more documentation see:
>> http://docs.ceph.com/docs/master/radosgw/multisite/
>
>
> I always specify --rgw-zonegroup and --rgw-zone for radosgw-admin command.
>
That is great!
You can also configure a default zone and zonegroup.
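For example (using your names):

  radosgw-admin zonegroup default --rgw-zonegroup=east
  radosgw-admin zone default --rgw-zone=tokyo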

> The issue is that any buckets meta operations are rejected when the master
> zone in the master zonegroup doesn't have the user account of other
> zonegroups.
>
Correct
> I try to describe details again:
> 1) Create fj realm as default.
> 2) Rename default zonegroup/zone to east/tokyo and mark as default.
> 3) Create west/osaka zonegroup/zone.
> 4) Create system user sync-user on both tokyo and osaka zones with same key.
> 5) Start 2 RGW instances for tokyo and osaka zones.
> 6) Create azuma user account on tokyo zone in east zonegroup.
> 7) Create /bucket1 through tokyo zone endpoint with azuma account.
>    -> No problem.
> 8) Create nishi user account on osaka zone in west zonegroup.
> 9) Try to create a bucket /bucket2 through osaka zone endpoint with azuma
> account.
>    -> respond "ERROR: S3 error: 403 (InvalidAccessKeyId)" as expected.
> 10) Try to create a bucket /bucket3 through osaka zone endpoint with nishi
> account.
>    -> respond "ERROR: S3 error: 404 (NoSuchKey)"
>    Detailed log is shown in -FYI- section bellow.
>    The RGW for osaka zone verify the signature and forward the request
>    to tokyo zone endpoint (= the master zone in the master zonegroup).
>    Then, the RGW for tokyo zone rejected the request by unauthorized access.
>

This seems like a bug; can you open an issue?

>
>>> c) How to restrict to place buckets on specific zonegroups?
>>>
>>
>> you probably mean zone.
>> There is ongoing work to enable/disable sync per bucket
>> https://github.com/ceph/ceph/pull/10995
>> with this you can create a bucket on a specific zone and it won't be
>> replicated to another zone
>
>
> My thought means zonegroup (not zone) as described above.

But it should be zone ...
A zone represents a geographical location; it represents a single ceph cluster.
A bucket is created in a zone (a single ceph cluster) and it stores the zone id.
The zone indicates in which ceph cluster the bucket was created.

A zonegroup is just a logical collection of zones; in many cases you only
need a single zonegroup.
You should use zonegroups if you have lots of zones and it simplifies
your configuration.
You can move zones between zonegroups (though it is not tested or supported ...).

> With current code, buckets are sync'ed to all zones within a zonegroup,
> no way to choose zone to place specific buckets.
> But this change may help to configure our original target.
>
> It seems we need more discussion about the change.
> I prefer default behavior is associated with user account (per SLA).
> And attribution of each bucket should be able to be changed via REST
> API depending on their permission, rather than radosgw-admin command.
>

I think that will be very helpful; we need to understand what the
requirements and the usage are.
Please comment on the PR or even open a feature request and we can
discuss it in more detail.

> Anyway, I'll examine more details.
>
>>> If user accounts would synced future as the blueprint, all the zonegroups
>>> contain same account information. It means any user can create buckets on
>>> any zonegroups. If we want to permit to place buckets on a replicated
>>> zonegroup for specific users, how to configure?
>>>
>>> If user accounts will not synced as current behavior, we can restrict
>>> to place buckets on specific zonegroups. But I cannot find best way to
>>> configure the master zonegroup.
>>>
>>>
>>> d) Operations for other zonegroup are not redirected
>>>
>>> e.g.:
>>>   1) Create bucket4 on west zonegroup by nishi.
>>>   2) Try to access bucket4 from endpoint on east zonegroup.
>>>      -> Respond "301 (Moved Permanently)",
>>>         but no redirected Location header is returned.
>>>
>>
>> It could be a bug please open a tracker issue for that in
>> tracker.ceph.com for RGW component with all the configuration
>> information,
>> logs and the version of ceph and radosgw you are using.
>
>
> I will open it, but it may be issued as "Feature" instead of "Bug"
> depending on following discussion.
>
>>> It seems current RGW doesn't follows S3 specification [2].
>>> To implement this feature, probably we need to define another endpoint
>>> on each zonegroup for client accessible URL. RGW may placed behind proxy,
>>> thus the URL may be different from endpoint URLs for replication.
>>>
>>
>> The zone and zonegroup endpoints are not used directly by the user with a
>> proxy.
>> The user get a URL pointing to the proxy and the proxy will need to be
>> configured to point the rgw urls/IPs , you can have several radosgw
>> running.
>> See more
>> https://access.redhat.com/documentation/en/red-hat-ceph-storage/2/paged/object-gateway-guide-for-red-hat-enterprise-linux/chapter-2-configuration
>
>
> Does it mean the proxy has responsibility to alter "Location" header as
> redirected URL?
>

No

> Basically, RGW can respond only the endpoint described in zonegroup
> setting as redirected URL on Location header. But client may not access
> the endpoint. Someone must translate the Location header to client
> accessible URL.
>

Both locations will have a proxy. This means all communication is done
through proxies.
The endpoint URL should be an external URL and the proxy on the new
location will translate it to the internal one.
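As a sketch (the hostname here is only a placeholder), the zonegroup
endpoint would then carry the external URL:

   radosgw-admin zonegroup modify --rgw-zonegroup=west \
       --endpoints=https://s3-west.example.com
   radosgw-admin period update --commit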

Regards,
Orit

> If the proxy translates Location header, it looks like man-in-the-middle
> attack.
>
>
> Regards,
> KIMURA
>
>> Regrads,
>> Orit
>>>
>>>
>>> Any thoughts?
>>
>>
>>>
>>> [1]
>>>
>>> http://tracker.ceph.com/projects/ceph/wiki/Rgw_new_multisite_configuration
>>> [2] http://docs.aws.amazon.com/AmazonS3/latest/dev/Redirects.html
>
>
> [3]
> https://access.redhat.com/documentation/en/red-hat-ceph-storage/2/paged/object-gateway-guide-for-red-hat-enterprise-linux/chapter-8-multi-site#migrating_a_single_site_system_to_multi_site
>
>>> ------ FYI ------
>>> [environments]
>>> Ceph cluster: RHCS 2.0
>>> RGW: RHEL 7.2 + RGW v10.2.5
>>>
>>> zonegroup east: master
>>>  zone tokyo
>>>   endpoint http://node5:80
>
>        rgw frontends = "civetweb port=80"
>        rgw zonegroup = east
>        rgw zone = tokyo
>>>
>>>   system user: sync-user
>>>   user azuma (+ nishi)
>>>
>>> zonegroup west: (not master)
>>>   zone osaka
>>>   endpoint http://node5:8081
>
>        rgw frontends = "civetweb port=8081"
>        rgw zonegroup = west
>        rgw zone = osaka
>
>>>   system user: sync-user (created with same key as zone tokyo)
>>>   user nishi
>>>
>>>
>>> [detail of "b)"]
>>>
>>> $ s3cmd -c s3nishi.cfg ls
>>> $ s3cmd -c s3nishi.cfg mb s3://bucket3
>>> ERROR: S3 error: 404 (NoSuchKey)
>>>
>>> ---- rgw.osaka log:
>>> 2017-02-10 11:54:13.290653 7feac3f7f700  1 ====== starting new request
>>> req=0x7feac3f79710 =====
>>> 2017-02-10 11:54:13.290709 7feac3f7f700  2 req 50:0.000057::PUT
>>> /bucket3/::initializing for trans_id =
>>> tx000000000000000000032-00589d2b55-14a2-osaka
>>> 2017-02-10 11:54:13.290720 7feac3f7f700 10 rgw api priority: s3=5
>>> s3website=4
>>> 2017-02-10 11:54:13.290722 7feac3f7f700 10 host=node5
>>> 2017-02-10 11:54:13.290733 7feac3f7f700 10 meta>>
>>> HTTP_X_AMZ_CONTENT_SHA256
>>> 2017-02-10 11:54:13.290750 7feac3f7f700 10 meta>> HTTP_X_AMZ_DATE
>>> 2017-02-10 11:54:13.290753 7feac3f7f700 10 x>>
>>>
>>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>> 2017-02-10 11:54:13.290755 7feac3f7f700 10 x>>
>>> x-amz-date:20170210T025413Z
>>> 2017-02-10 11:54:13.290774 7feac3f7f700 10
>>> handler=25RGWHandler_REST_Bucket_S3
>>> 2017-02-10 11:54:13.290775 7feac3f7f700  2 req 50:0.000124:s3:PUT
>>> /bucket3/::getting op 1
>>> 2017-02-10 11:54:13.290781 7feac3f7f700 10
>>> op=27RGWCreateBucket_ObjStore_S3
>>> 2017-02-10 11:54:13.290782 7feac3f7f700  2 req 50:0.000130:s3:PUT
>>> /bucket3/:create_bucket:authorizing
>>> 2017-02-10 11:54:13.290798 7feac3f7f700 10 v4 signature format =
>>> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
>>> 2017-02-10 11:54:13.290804 7feac3f7f700 10 v4 credential format =
>>> ZY6EJUVB38SCOWBELERQ/20170210/west/s3/aws4_request
>>> 2017-02-10 11:54:13.290806 7feac3f7f700 10 access key id =
>>> ZY6EJUVB38SCOWBELERQ
>>> 2017-02-10 11:54:13.290814 7feac3f7f700 10 credential scope =
>>> 20170210/west/s3/aws4_request
>>> 2017-02-10 11:54:13.290834 7feac3f7f700 10 canonical headers format =
>>> host:node5:8081
>>>
>>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>> x-amz-date:20170210T025413Z
>>>
>>> 2017-02-10 11:54:13.290836 7feac3f7f700 10 delaying v4 auth
>>> 2017-02-10 11:54:13.290839 7feac3f7f700  2 req 50:0.000187:s3:PUT
>>> /bucket3/:create_bucket:normalizing buckets and tenants
>>> 2017-02-10 11:54:13.290841 7feac3f7f700 10 s->object=<NULL>
>>> s->bucket=bucket3
>>> 2017-02-10 11:54:13.290843 7feac3f7f700  2 req 50:0.000191:s3:PUT
>>> /bucket3/:create_bucket:init permissions
>>> 2017-02-10 11:54:13.290844 7feac3f7f700  2 req 50:0.000192:s3:PUT
>>> /bucket3/:create_bucket:recalculating target
>>> 2017-02-10 11:54:13.290845 7feac3f7f700  2 req 50:0.000193:s3:PUT
>>> /bucket3/:create_bucket:reading permissions
>>> 2017-02-10 11:54:13.290846 7feac3f7f700  2 req 50:0.000195:s3:PUT
>>> /bucket3/:create_bucket:init op
>>> 2017-02-10 11:54:13.290847 7feac3f7f700  2 req 50:0.000196:s3:PUT
>>> /bucket3/:create_bucket:verifying op mask
>>> 2017-02-10 11:54:13.290849 7feac3f7f700  2 req 50:0.000197:s3:PUT
>>> /bucket3/:create_bucket:verifying op permissions
>>> 2017-02-10 11:54:13.292027 7feac3f7f700  2 req 50:0.001374:s3:PUT
>>> /bucket3/:create_bucket:verifying op params
>>> 2017-02-10 11:54:13.292035 7feac3f7f700  2 req 50:0.001383:s3:PUT
>>> /bucket3/:create_bucket:pre-executing
>>> 2017-02-10 11:54:13.292037 7feac3f7f700  2 req 50:0.001385:s3:PUT
>>> /bucket3/:create_bucket:executing
>>> 2017-02-10 11:54:13.292072 7feac3f7f700 10 payload request hash =
>>> d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>> 2017-02-10 11:54:13.292083 7feac3f7f700 10 canonical request = PUT
>>> /bucket3/
>>>
>>> host:node5:8081
>>>
>>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>> x-amz-date:20170210T025413Z
>>>
>>> host;x-amz-content-sha256;x-amz-date
>>> d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>> 2017-02-10 11:54:13.292084 7feac3f7f700 10 canonical request hash =
>>> 8faa5ec57f69dd7b54baa72c157b6d63f8c7db309a34a1e2a10ad6f2f585cd02
>>> 2017-02-10 11:54:13.292087 7feac3f7f700 10 string to sign =
>>> AWS4-HMAC-SHA256
>>> 20170210T025413Z
>>> 20170210/west/s3/aws4_request
>>> 8faa5ec57f69dd7b54baa72c157b6d63f8c7db309a34a1e2a10ad6f2f585cd02
>>> 2017-02-10 11:54:13.292118 7feac3f7f700 10 date_k        =
>>> 454f3ad73c095e73d2482809d7a6ec8af3c4e900bc83e0a9663ea5fc336cad95
>>> 2017-02-10 11:54:13.292131 7feac3f7f700 10 region_k      =
>>> e0caaddbb30ebc25840b6aaac3979d1881a14b8e9a0dfea43d8a006c8e0e504d
>>> 2017-02-10 11:54:13.292144 7feac3f7f700 10 service_k     =
>>> 59d6c9158e9e3c6a1aa97ee15859d2ef9ad9c64209b63f093109844f0c7f6c04
>>> 2017-02-10 11:54:13.292171 7feac3f7f700 10 signing_k     =
>>> 4dcbccd9c3da779d32758a645644c66a56f64d642eaeb39eec8e0b2facba7805
>>> 2017-02-10 11:54:13.292197 7feac3f7f700 10 signature_k   =
>>> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
>>> 2017-02-10 11:54:13.292198 7feac3f7f700 10 new signature =
>>> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
>>> 2017-02-10 11:54:13.292199 7feac3f7f700 10 -----------------------------
>>> Verifying signatures
>>> 2017-02-10 11:54:13.292199 7feac3f7f700 10 Signature     =
>>> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
>>> 2017-02-10 11:54:13.292200 7feac3f7f700 10 New Signature =
>>> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
>>> 2017-02-10 11:54:13.292200 7feac3f7f700 10 -----------------------------
>>> 2017-02-10 11:54:13.292202 7feac3f7f700 10 v4 auth ok
>>> 2017-02-10 11:54:13.292238 7feac3f7f700 10 create bucket location
>>> constraint: west
>>> 2017-02-10 11:54:13.292256 7feac3f7f700 10 cache get:
>>> name=osaka.rgw.data.root+bucket3 : type miss (requested=22, cached=0)
>>> 2017-02-10 11:54:13.293369 7feac3f7f700 10 cache put:
>>> name=osaka.rgw.data.root+bucket3 info.flags=0
>>> 2017-02-10 11:54:13.293374 7feac3f7f700 10 moving
>>> osaka.rgw.data.root+bucket3 to cache LRU end
>>> 2017-02-10 11:54:13.293380 7feac3f7f700  0 sending create_bucket request
>>> to
>>> master zonegroup
>>> 2017-02-10 11:54:13.293401 7feac3f7f700 10 get_canon_resource():
>>> dest=/bucket3/
>>> 2017-02-10 11:54:13.293403 7feac3f7f700 10 generated canonical header:
>>> PUT
>>>
>>>
>>> Fri Feb 10 02:54:13 2017
>>>
>>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>> /bucket3/
>>> 2017-02-10 11:54:13.299113 7feac3f7f700 10 receive_http_header
>>> 2017-02-10 11:54:13.299117 7feac3f7f700 10 received header:HTTP/1.1 404
>>> Not
>>> Found
>>> 2017-02-10 11:54:13.299119 7feac3f7f700 10 receive_http_header
>>> 2017-02-10 11:54:13.299120 7feac3f7f700 10 received
>>> header:x-amz-request-id:
>>> tx000000000000000000005-00589d2b55-1416-tokyo
>>> 2017-02-10 11:54:13.299130 7feac3f7f700 10 receive_http_header
>>> 2017-02-10 11:54:13.299131 7feac3f7f700 10 received
>>> header:Content-Length:
>>> 175
>>> 2017-02-10 11:54:13.299133 7feac3f7f700 10 receive_http_header
>>> 2017-02-10 11:54:13.299133 7feac3f7f700 10 received header:Accept-Ranges:
>>> bytes
>>> 2017-02-10 11:54:13.299148 7feac3f7f700 10 receive_http_header
>>> 2017-02-10 11:54:13.299149 7feac3f7f700 10 received header:Content-Type:
>>> application/xml
>>> 2017-02-10 11:54:13.299150 7feac3f7f700 10 receive_http_header
>>> 2017-02-10 11:54:13.299150 7feac3f7f700 10 received header:Date: Fri, 10
>>> Feb
>>> 2017 02:54:13 GMT
>>> 2017-02-10 11:54:13.299152 7feac3f7f700 10 receive_http_header
>>> 2017-02-10 11:54:13.299152 7feac3f7f700 10 received header:
>>> 2017-02-10 11:54:13.299248 7feac3f7f700  2 req 50:0.008596:s3:PUT
>>> /bucket3/:create_bucket:completing
>>> 2017-02-10 11:54:13.299319 7feac3f7f700  2 req 50:0.008667:s3:PUT
>>> /bucket3/:create_bucket:op status=-2
>>> 2017-02-10 11:54:13.299321 7feac3f7f700  2 req 50:0.008670:s3:PUT
>>> /bucket3/:create_bucket:http status=404
>>> 2017-02-10 11:54:13.299324 7feac3f7f700  1 ====== req done
>>> req=0x7feac3f79710 op status=-2 http_status=404 ======
>>> 2017-02-10 11:54:13.299349 7feac3f7f700  1 civetweb: 0x7feb2c02d340:
>>> 192.168.20.15 - - [10/Feb/2017:11:54:13 +0900] "PUT /bucket3/ HTTP/1.1"
>>> 404
>>> 0 - -
>>>
>>>
>>> ---- rgw.tokyo log:
>>> 2017-02-10 11:54:13.297852 7f56076c6700  1 ====== starting new request
>>> req=0x7f56076c0710 =====
>>> 2017-02-10 11:54:13.297887 7f56076c6700  2 req 5:0.000035::PUT
>>> /bucket3/::initializing for trans_id =
>>> tx000000000000000000005-00589d2b55-1416-tokyo
>>> 2017-02-10 11:54:13.297895 7f56076c6700 10 rgw api priority: s3=5
>>> s3website=4
>>> 2017-02-10 11:54:13.297897 7f56076c6700 10 host=node5
>>> 2017-02-10 11:54:13.297906 7f56076c6700 10 meta>>
>>> HTTP_X_AMZ_CONTENT_SHA256
>>> 2017-02-10 11:54:13.297912 7f56076c6700 10 x>>
>>>
>>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>> 2017-02-10 11:54:13.297929 7f56076c6700 10
>>> handler=25RGWHandler_REST_Bucket_S3
>>> 2017-02-10 11:54:13.297937 7f56076c6700  2 req 5:0.000086:s3:PUT
>>> /bucket3/::getting op 1
>>> 2017-02-10 11:54:13.297946 7f56076c6700 10
>>> op=27RGWCreateBucket_ObjStore_S3
>>> 2017-02-10 11:54:13.297947 7f56076c6700  2 req 5:0.000096:s3:PUT
>>> /bucket3/:create_bucket:authorizing
>>> 2017-02-10 11:54:13.297969 7f56076c6700 10 get_canon_resource():
>>> dest=/bucket3/
>>> 2017-02-10 11:54:13.297976 7f56076c6700 10 auth_hdr:
>>> PUT
>>>
>>>
>>> Fri Feb 10 02:54:13 2017
>>>
>>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>> /bucket3/
>>> 2017-02-10 11:54:13.298023 7f56076c6700 10 cache get:
>>> name=default.rgw.users.uid+nishi : type miss (requested=6, cached=0)
>>> 2017-02-10 11:54:13.298975 7f56076c6700 10 cache put:
>>> name=default.rgw.users.uid+nishi info.flags=0
>>> 2017-02-10 11:54:13.298986 7f56076c6700 10 moving
>>> default.rgw.users.uid+nishi to cache LRU end
>>> 2017-02-10 11:54:13.298991 7f56076c6700  0 User lookup failed!
>>> 2017-02-10 11:54:13.298993 7f56076c6700 10 failed to authorize request
>>> 2017-02-10 11:54:13.299077 7f56076c6700  2 req 5:0.001225:s3:PUT
>>> /bucket3/:create_bucket:op status=0
>>> 2017-02-10 11:54:13.299086 7f56076c6700  2 req 5:0.001235:s3:PUT
>>> /bucket3/:create_bucket:http status=404
>>> 2017-02-10 11:54:13.299089 7f56076c6700  1 ====== req done
>>> req=0x7f56076c0710 op status=0 http_status=404 ======
>>> 2017-02-10 11:54:13.299426 7f56076c6700  1 civetweb: 0x7f56200048c0:
>>> 192.168.20.15 - - [10/Feb/2017:11:54:13 +0900] "PUT /bucket3/ HTTP/1.1"
>>> 404
>>> 0 - -
>
>
> --
> KIMURA Osamu / 木村 修
> Engineering Department, Storage Development Division,
> Data Center Platform Business Unit, FUJITSU LIMITED


* Re: rgw: multiple zonegroups in single realm
  2017-02-13  9:42     ` Orit Wasserman
@ 2017-02-13 10:57       ` KIMURA Osamu
  2017-02-14 14:54         ` Orit Wasserman
  0 siblings, 1 reply; 11+ messages in thread
From: KIMURA Osamu @ 2017-02-13 10:57 UTC (permalink / raw)
  To: Orit Wasserman; +Cc: ceph-devel

Hi Orit,

I almost agree, with some exceptions...

On 2017/02/13 18:42, Orit Wasserman wrote:
> On Mon, Feb 13, 2017 at 6:44 AM, KIMURA Osamu
> <kimura.osamu@jp.fujitsu.com> wrote:
>> Hi Orit,
>>
>> Thanks for your comments.
>> I believe I'm not confusing, but probably my thought may not be well
>> described...
>>
> :)
>> On 2017/02/12 19:07, Orit Wasserman wrote:
>>>
>>> On Fri, Feb 10, 2017 at 10:21 AM, KIMURA Osamu
>>> <kimura.osamu@jp.fujitsu.com> wrote:
>>>>
>>>> Hi Cephers,
>>>>
>>>> I'm trying to configure RGWs with multiple zonegroups within single
>>>> realm.
>>>> The intention is that some buckets to be replicated and others to stay
>>>> locally.
>>>
>>>
>>> If you are not replicating than you don't need to create any zone
>>> configuration,
>>> a default zonegroup and zone are created automatically
>>>
>>>> e.g.:
>>>>  realm: fj
>>>>   zonegroup east: zone tokyo (not replicated)
>>>
>>> no need if not replicated
>>>>
>>>>   zonegroup west: zone osaka (not replicated)
>>>
>>> same here
>>>>
>>>>   zonegroup jp:   zone jp-east + jp-west (replicated)
>>
>>
>> The "east" and "west" zonegroups are just renamed from "default"
>> as described in RHCS document [3].
>
> Why do you need two zonegroups (or 3)?
>
> At the moment multisitev2 replicated automatically all zones in the
> realm except "default" zone.
> The moment you add a new zone (could be part of another zonegroup) it
> will be replicated to the other zones.
> It seems you don't want or need this.
> we are working on allowing more control on the replication but that
> will be in the future.
>
>> We may not need to rename them, but at least api_name should be altered.
>
> You can change the api_name for the "default" zone.
>
>> In addition, I'm not sure what happens if 2 "default" zones/zonegroups
>> co-exist in same realm.
>
> Realm shares all the zones/zonegroups configuration,
> it means it is the same zone/zonegroup.
> For "default" it means not zone/zonegroup configured, we use it to run
> radosgw without any
> zone/zonegroup specified in the configuration.

I didn't think of "default" as an exception among zonegroups. :-P
Actually, I must specify api_name in the default zonegroup setting.

I interpret the "default" zone/zonegroup as being outside the realm. Is that correct?
I think it means the bucket and user namespaces are not shared with "default".
At present, I can't decide whether to separate the namespaces, but it
may be the best choice with the current code.
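(For reference, this is roughly how I adjust api_name on that
zonegroup; the file name is arbitrary and this is only a sketch:)

   radosgw-admin zonegroup get --rgw-zonegroup=east > zg.json
   # edit "api_name" in zg.json, then load it back and publish the period
   radosgw-admin zonegroup set < zg.json
   radosgw-admin period update --commit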


>>>> To evaluate such configuration, I tentatively built multiple zonegroups
>>>> (east, west) on a ceph cluster. I barely succeed to configure it, but
>>>> some concerns exist.
>>>>
>>> I think you just need one zonegroup with two zones the other are not
>>> needed
>>> Also each gateway can handle only a single zone (rgw_zone
>>> configuration parameter)
>>
>>
>> This is just a tentative one to confirm the behavior of multiple zonegroups
>> due to limitation of our current equipment.
>> The "east" zonegroup was renamed from "default", and another "west"
>> zonegroup
>> was created. Of course I specified both rgw_zonegroup and rgw_zone
>> parameters
>> for each RGW instance. (see -FYI- section bellow)
>>
> Can I suggest starting with a more simple setup:
> Two zonegroups,  the first will have two zones and the second will
> have one zone.
> It is simper to configure and in case of problems to debug.

I would try such a configuration IF time permitted.
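(For the record, such a setup would look roughly like the sketch
below. The endpoints are placeholders and the system-user/key options
are omitted for brevity.)

   radosgw-admin realm create --rgw-realm=fj --default
   radosgw-admin zonegroup create --rgw-zonegroup=jp --master --default \
       --endpoints=http://node5:80
   radosgw-admin zone create --rgw-zonegroup=jp --rgw-zone=jp-east \
       --master --default --endpoints=http://node5:80
   radosgw-admin zone create --rgw-zonegroup=jp --rgw-zone=jp-west \
       --endpoints=http://node5:8081
   radosgw-admin zonegroup create --rgw-zonegroup=west \
       --endpoints=http://node5:8082
   radosgw-admin zone create --rgw-zonegroup=west --rgw-zone=osaka \
       --endpoints=http://node5:8082
   radosgw-admin period update --commit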


>>>> a) User accounts are not synced among zonegroups
>>>>
>>>> I'm not sure if this is a issue, but the blueprint [1] stated a master
>>>> zonegroup manages user accounts as metadata like buckets.
>>>>
>>> You have a lot of confusion with the zones and zonegroups.
>>> A zonegroup is just a group of zones that are sharing the same data
>>> (i.e. replication between them)
>>> A zone represent a geographical location (i.e. one ceph cluster)
>>>
>>> We have a meta master zone (the master zone in the master zonegroup),
>>> this meta master is responible on
>>> replicating users and byckets meta operations.
>>
>>
>> I know it.
>> But the master zone in the master zonegroup manages bucket meta
>> operations including buckets in other zonegroups. It means
>> the master zone in the master zonegroup must have permission to
>> handle buckets meta operations, i.e., must have same user accounts
>> as other zonegroups.
> Again zones not zonegroups,  it needs to have an admin user with the
> same credentials in all the other zones.
>
>> This is related to next issue b). If the master zone in the master
>> zonegroup doesn't have user accounts for other zonegroups, all the
>> buckets meta operations are rejected.
>>
>
> Correct
>
>> In addition, it may be overexplanation though, user accounts are
>> sync'ed to other zones within same zonegroup if the accounts are
>> created on master zone of the zonegroup. On the other hand,
>> I found today, user accounts are not sync'ed to master if the
>> accounts are created on slave(?) zone in the zonegroup. It seems
>> asymmetric behavior.
>
> This requires investigation,  can you open a tracker issue and we will
> look into it.
>
>> I'm not sure if the same behavior is caused by Admin REST API instead
>> of radosgw-admin.
>>
>
> It doesn't matter both use almost the same code
>
>>
>>>> b) Bucket creation is rejected if master zonegroup doesn't have the
>>>> account
>>>>
>>>> e.g.:
>>>>   1) Configure east zonegroup as master.
>>>
>>> you need a master zoen
>>>>
>>>>   2) Create a user "nishi" on west zonegroup (osaka zone) using
>>>> radosgw-admin.
>>>>   3) Try to create a bucket on west zonegroup by user nishi.
>>>>      -> ERROR: S3 error: 404 (NoSuchKey)
>>>>   4) Create user nishi on east zonegroup with same key.
>>>>   5) Succeed to create a bucket on west zonegroup by user nishi.
>>>>
>>>
>>> You are confusing zonegroup and zone here again ...
>>>
>>> you should notice that when you are using radosgw-admin command
>>> without providing zonegorup and/or zone info (--rgw-zonegroup=<zg> and
>>> --rgw-zone=<zone>) it will use the default zonegroup and zone.
>>>
>>> User is stored per zone and you need to create an admin users in both
>>> zones
>>> for more documentation see:
>>> http://docs.ceph.com/docs/master/radosgw/multisite/
>>
>>
>> I always specify --rgw-zonegroup and --rgw-zone for radosgw-admin command.
>>
> That is great!
> You can also onfigure default zone and zonegroup
>
>> The issue is that any buckets meta operations are rejected when the master
>> zone in the master zonegroup doesn't have the user account of other
>> zonegroups.
>>
> Correct
>> I try to describe details again:
>> 1) Create fj realm as default.
>> 2) Rename default zonegroup/zone to east/tokyo and mark as default.
>> 3) Create west/osaka zonegroup/zone.
>> 4) Create system user sync-user on both tokyo and osaka zones with same key.
>> 5) Start 2 RGW instances for tokyo and osaka zones.
>> 6) Create azuma user account on tokyo zone in east zonegroup.
>> 7) Create /bucket1 through tokyo zone endpoint with azuma account.
>>    -> No problem.
>> 8) Create nishi user account on osaka zone in west zonegroup.
>> 9) Try to create a bucket /bucket2 through osaka zone endpoint with azuma
>> account.
>>    -> respond "ERROR: S3 error: 403 (InvalidAccessKeyId)" as expected.
>> 10) Try to create a bucket /bucket3 through osaka zone endpoint with nishi
>> account.
>>    -> respond "ERROR: S3 error: 404 (NoSuchKey)"
>>    Detailed log is shown in -FYI- section bellow.
>>    The RGW for osaka zone verify the signature and forward the request
>>    to tokyo zone endpoint (= the master zone in the master zonegroup).
>>    Then, the RGW for tokyo zone rejected the request by unauthorized access.
>>
>
> This seems a bug, can you open a issue?
>
>>
>>>> c) How to restrict to place buckets on specific zonegroups?
>>>>
>>>
>>> you probably mean zone.
>>> There is ongoing work to enable/disable sync per bucket
>>> https://github.com/ceph/ceph/pull/10995
>>> with this you can create a bucket on a specific zone and it won't be
>>> replicated to another zone
>>
>>
>> My thought means zonegroup (not zone) as described above.
>
> But it should be zone ..
> Zone represent a geographical location , it represent a single ceph cluster.
> Bucket is created in a zone (a single ceph cluster) and it stored the zone id.
> The zone represent in which ceph cluster the bucket was created.
>
> A zonegroup just a logical collection of zones, in many case you only
> need a single zonegroup.
> You should use zonegroups if you have lots of zones and it simplifies
> your configuration.
> You can move zones between zonegroups (it is not tested or supported ...).
>
>> With current code, buckets are sync'ed to all zones within a zonegroup,
>> no way to choose zone to place specific buckets.
>> But this change may help to configure our original target.
>>
>> It seems we need more discussion about the change.
>> I prefer default behavior is associated with user account (per SLA).
>> And attribution of each bucket should be able to be changed via REST
>> API depending on their permission, rather than radosgw-admin command.
>>
>
> I think that will be very helpful , we need to understand what are the
> requirement and the usage.
> Please comment on the PR or even open a feature request and we can
> discuss it more in detail.
>
>> Anyway, I'll examine more details.
>>
>>>> If user accounts would synced future as the blueprint, all the zonegroups
>>>> contain same account information. It means any user can create buckets on
>>>> any zonegroups. If we want to permit to place buckets on a replicated
>>>> zonegroup for specific users, how to configure?
>>>>
>>>> If user accounts will not synced as current behavior, we can restrict
>>>> to place buckets on specific zonegroups. But I cannot find best way to
>>>> configure the master zonegroup.
>>>>
>>>>
>>>> d) Operations for other zonegroup are not redirected
>>>>
>>>> e.g.:
>>>>   1) Create bucket4 on west zonegroup by nishi.
>>>>   2) Try to access bucket4 from endpoint on east zonegroup.
>>>>      -> Respond "301 (Moved Permanently)",
>>>>         but no redirected Location header is returned.
>>>>
>>>
>>> It could be a bug please open a tracker issue for that in
>>> tracker.ceph.com for RGW component with all the configuration
>>> information,
>>> logs and the version of ceph and radosgw you are using.
>>
>>
>> I will open it, but it may be issued as "Feature" instead of "Bug"
>> depending on following discussion.
>>
>>>> It seems current RGW doesn't follows S3 specification [2].
>>>> To implement this feature, probably we need to define another endpoint
>>>> on each zonegroup for client accessible URL. RGW may placed behind proxy,
>>>> thus the URL may be different from endpoint URLs for replication.
>>>>
>>>
>>> The zone and zonegroup endpoints are not used directly by the user with a
>>> proxy.
>>> The user get a URL pointing to the proxy and the proxy will need to be
>>> configured to point the rgw urls/IPs , you can have several radosgw
>>> running.
>>> See more
>>> https://access.redhat.com/documentation/en/red-hat-ceph-storage/2/paged/object-gateway-guide-for-red-hat-enterprise-linux/chapter-2-configuration
>>
>>
>> Does it mean the proxy has responsibility to alter "Location" header as
>> redirected URL?
>>
>
> No
>
>> Basically, RGW can respond only the endpoint described in zonegroup
>> setting as redirected URL on Location header. But client may not access
>> the endpoint. Someone must translate the Location header to client
>> accessible URL.
>>
>
> Both locations will have a proxy. This means all communication is done
> through proxies.
> The endpoint URL should be an external URL and the proxy on the new
> location will translate it to the internal one.

Our assumption is:

End-user client --- internet --- proxy ---+--- RGW site-A
                                           |
                                           | (dedicated line or VPN)
                                           |
End-user client --- internet --- proxy ---+--- RGW site-B

The RGWs cannot reach each other through the public (front) side of
the proxies.
In this case, the endpoints used for replication are on the backend
network behind the proxies.

What do you think?
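To make the assumption concrete (host names are placeholders), the
zone endpoints used for sync would stay on the backend network, while
clients only ever see the proxy URL:

   # replication endpoints, reachable over the dedicated line / VPN
   radosgw-admin zone modify --rgw-zone=jp-east --endpoints=http://rgw-a.internal:80
   radosgw-admin zone modify --rgw-zone=jp-west --endpoints=http://rgw-b.internal:80
   radosgw-admin period update --commit
   # clients use e.g. https://s3.example.com, which the proxy forwards
   # to the local RGW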


> Regards,
> Orit
>
>> If the proxy translates Location header, it looks like man-in-the-middle
>> attack.
>>
>>
>> Regards,
>> KIMURA
>>
>>> Regrads,
>>> Orit
>>>>
>>>>
>>>> Any thoughts?
>>>
>>>
>>>>
>>>> [1]
>>>>
>>>> http://tracker.ceph.com/projects/ceph/wiki/Rgw_new_multisite_configuration
>>>> [2] http://docs.aws.amazon.com/AmazonS3/latest/dev/Redirects.html
>>
>>
>> [3]
>> https://access.redhat.com/documentation/en/red-hat-ceph-storage/2/paged/object-gateway-guide-for-red-hat-enterprise-linux/chapter-8-multi-site#migrating_a_single_site_system_to_multi_site
>>
>>>> ------ FYI ------
>>>> [environments]
>>>> Ceph cluster: RHCS 2.0
>>>> RGW: RHEL 7.2 + RGW v10.2.5
>>>>
>>>> zonegroup east: master
>>>>  zone tokyo
>>>>   endpoint http://node5:80
>>
>>        rgw frontends = "civetweb port=80"
>>        rgw zonegroup = east
>>        rgw zone = tokyo
>>>>
>>>>   system user: sync-user
>>>>   user azuma (+ nishi)
>>>>
>>>> zonegroup west: (not master)
>>>>   zone osaka
>>>>   endpoint http://node5:8081
>>
>>        rgw frontends = "civetweb port=8081"
>>        rgw zonegroup = west
>>        rgw zone = osaka
>>
>>>>   system user: sync-user (created with same key as zone tokyo)
>>>>   user nishi
>>>>
>>>>
>>>> [detail of "b)"]
>>>>
>>>> $ s3cmd -c s3nishi.cfg ls
>>>> $ s3cmd -c s3nishi.cfg mb s3://bucket3
>>>> ERROR: S3 error: 404 (NoSuchKey)
>>>>
>>>> ---- rgw.osaka log:
>>>> 2017-02-10 11:54:13.290653 7feac3f7f700  1 ====== starting new request
>>>> req=0x7feac3f79710 =====
>>>> 2017-02-10 11:54:13.290709 7feac3f7f700  2 req 50:0.000057::PUT
>>>> /bucket3/::initializing for trans_id =
>>>> tx000000000000000000032-00589d2b55-14a2-osaka
>>>> 2017-02-10 11:54:13.290720 7feac3f7f700 10 rgw api priority: s3=5
>>>> s3website=4
>>>> 2017-02-10 11:54:13.290722 7feac3f7f700 10 host=node5
>>>> 2017-02-10 11:54:13.290733 7feac3f7f700 10 meta>>
>>>> HTTP_X_AMZ_CONTENT_SHA256
>>>> 2017-02-10 11:54:13.290750 7feac3f7f700 10 meta>> HTTP_X_AMZ_DATE
>>>> 2017-02-10 11:54:13.290753 7feac3f7f700 10 x>>
>>>>
>>>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>>> 2017-02-10 11:54:13.290755 7feac3f7f700 10 x>>
>>>> x-amz-date:20170210T025413Z
>>>> 2017-02-10 11:54:13.290774 7feac3f7f700 10
>>>> handler=25RGWHandler_REST_Bucket_S3
>>>> 2017-02-10 11:54:13.290775 7feac3f7f700  2 req 50:0.000124:s3:PUT
>>>> /bucket3/::getting op 1
>>>> 2017-02-10 11:54:13.290781 7feac3f7f700 10
>>>> op=27RGWCreateBucket_ObjStore_S3
>>>> 2017-02-10 11:54:13.290782 7feac3f7f700  2 req 50:0.000130:s3:PUT
>>>> /bucket3/:create_bucket:authorizing
>>>> 2017-02-10 11:54:13.290798 7feac3f7f700 10 v4 signature format =
>>>> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
>>>> 2017-02-10 11:54:13.290804 7feac3f7f700 10 v4 credential format =
>>>> ZY6EJUVB38SCOWBELERQ/20170210/west/s3/aws4_request
>>>> 2017-02-10 11:54:13.290806 7feac3f7f700 10 access key id =
>>>> ZY6EJUVB38SCOWBELERQ
>>>> 2017-02-10 11:54:13.290814 7feac3f7f700 10 credential scope =
>>>> 20170210/west/s3/aws4_request
>>>> 2017-02-10 11:54:13.290834 7feac3f7f700 10 canonical headers format =
>>>> host:node5:8081
>>>>
>>>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>>> x-amz-date:20170210T025413Z
>>>>
>>>> 2017-02-10 11:54:13.290836 7feac3f7f700 10 delaying v4 auth
>>>> 2017-02-10 11:54:13.290839 7feac3f7f700  2 req 50:0.000187:s3:PUT
>>>> /bucket3/:create_bucket:normalizing buckets and tenants
>>>> 2017-02-10 11:54:13.290841 7feac3f7f700 10 s->object=<NULL>
>>>> s->bucket=bucket3
>>>> 2017-02-10 11:54:13.290843 7feac3f7f700  2 req 50:0.000191:s3:PUT
>>>> /bucket3/:create_bucket:init permissions
>>>> 2017-02-10 11:54:13.290844 7feac3f7f700  2 req 50:0.000192:s3:PUT
>>>> /bucket3/:create_bucket:recalculating target
>>>> 2017-02-10 11:54:13.290845 7feac3f7f700  2 req 50:0.000193:s3:PUT
>>>> /bucket3/:create_bucket:reading permissions
>>>> 2017-02-10 11:54:13.290846 7feac3f7f700  2 req 50:0.000195:s3:PUT
>>>> /bucket3/:create_bucket:init op
>>>> 2017-02-10 11:54:13.290847 7feac3f7f700  2 req 50:0.000196:s3:PUT
>>>> /bucket3/:create_bucket:verifying op mask
>>>> 2017-02-10 11:54:13.290849 7feac3f7f700  2 req 50:0.000197:s3:PUT
>>>> /bucket3/:create_bucket:verifying op permissions
>>>> 2017-02-10 11:54:13.292027 7feac3f7f700  2 req 50:0.001374:s3:PUT
>>>> /bucket3/:create_bucket:verifying op params
>>>> 2017-02-10 11:54:13.292035 7feac3f7f700  2 req 50:0.001383:s3:PUT
>>>> /bucket3/:create_bucket:pre-executing
>>>> 2017-02-10 11:54:13.292037 7feac3f7f700  2 req 50:0.001385:s3:PUT
>>>> /bucket3/:create_bucket:executing
>>>> 2017-02-10 11:54:13.292072 7feac3f7f700 10 payload request hash =
>>>> d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>>> 2017-02-10 11:54:13.292083 7feac3f7f700 10 canonical request = PUT
>>>> /bucket3/
>>>>
>>>> host:node5:8081
>>>>
>>>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>>> x-amz-date:20170210T025413Z
>>>>
>>>> host;x-amz-content-sha256;x-amz-date
>>>> d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>>> 2017-02-10 11:54:13.292084 7feac3f7f700 10 canonical request hash =
>>>> 8faa5ec57f69dd7b54baa72c157b6d63f8c7db309a34a1e2a10ad6f2f585cd02
>>>> 2017-02-10 11:54:13.292087 7feac3f7f700 10 string to sign =
>>>> AWS4-HMAC-SHA256
>>>> 20170210T025413Z
>>>> 20170210/west/s3/aws4_request
>>>> 8faa5ec57f69dd7b54baa72c157b6d63f8c7db309a34a1e2a10ad6f2f585cd02
>>>> 2017-02-10 11:54:13.292118 7feac3f7f700 10 date_k        =
>>>> 454f3ad73c095e73d2482809d7a6ec8af3c4e900bc83e0a9663ea5fc336cad95
>>>> 2017-02-10 11:54:13.292131 7feac3f7f700 10 region_k      =
>>>> e0caaddbb30ebc25840b6aaac3979d1881a14b8e9a0dfea43d8a006c8e0e504d
>>>> 2017-02-10 11:54:13.292144 7feac3f7f700 10 service_k     =
>>>> 59d6c9158e9e3c6a1aa97ee15859d2ef9ad9c64209b63f093109844f0c7f6c04
>>>> 2017-02-10 11:54:13.292171 7feac3f7f700 10 signing_k     =
>>>> 4dcbccd9c3da779d32758a645644c66a56f64d642eaeb39eec8e0b2facba7805
>>>> 2017-02-10 11:54:13.292197 7feac3f7f700 10 signature_k   =
>>>> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
>>>> 2017-02-10 11:54:13.292198 7feac3f7f700 10 new signature =
>>>> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
>>>> 2017-02-10 11:54:13.292199 7feac3f7f700 10 -----------------------------
>>>> Verifying signatures
>>>> 2017-02-10 11:54:13.292199 7feac3f7f700 10 Signature     =
>>>> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
>>>> 2017-02-10 11:54:13.292200 7feac3f7f700 10 New Signature =
>>>> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
>>>> 2017-02-10 11:54:13.292200 7feac3f7f700 10 -----------------------------
>>>> 2017-02-10 11:54:13.292202 7feac3f7f700 10 v4 auth ok
>>>> 2017-02-10 11:54:13.292238 7feac3f7f700 10 create bucket location
>>>> constraint: west
>>>> 2017-02-10 11:54:13.292256 7feac3f7f700 10 cache get:
>>>> name=osaka.rgw.data.root+bucket3 : type miss (requested=22, cached=0)
>>>> 2017-02-10 11:54:13.293369 7feac3f7f700 10 cache put:
>>>> name=osaka.rgw.data.root+bucket3 info.flags=0
>>>> 2017-02-10 11:54:13.293374 7feac3f7f700 10 moving
>>>> osaka.rgw.data.root+bucket3 to cache LRU end
>>>> 2017-02-10 11:54:13.293380 7feac3f7f700  0 sending create_bucket request
>>>> to
>>>> master zonegroup
>>>> 2017-02-10 11:54:13.293401 7feac3f7f700 10 get_canon_resource():
>>>> dest=/bucket3/
>>>> 2017-02-10 11:54:13.293403 7feac3f7f700 10 generated canonical header:
>>>> PUT
>>>>
>>>>
>>>> Fri Feb 10 02:54:13 2017
>>>>
>>>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>>> /bucket3/
>>>> 2017-02-10 11:54:13.299113 7feac3f7f700 10 receive_http_header
>>>> 2017-02-10 11:54:13.299117 7feac3f7f700 10 received header:HTTP/1.1 404
>>>> Not
>>>> Found
>>>> 2017-02-10 11:54:13.299119 7feac3f7f700 10 receive_http_header
>>>> 2017-02-10 11:54:13.299120 7feac3f7f700 10 received
>>>> header:x-amz-request-id:
>>>> tx000000000000000000005-00589d2b55-1416-tokyo
>>>> 2017-02-10 11:54:13.299130 7feac3f7f700 10 receive_http_header
>>>> 2017-02-10 11:54:13.299131 7feac3f7f700 10 received
>>>> header:Content-Length:
>>>> 175
>>>> 2017-02-10 11:54:13.299133 7feac3f7f700 10 receive_http_header
>>>> 2017-02-10 11:54:13.299133 7feac3f7f700 10 received header:Accept-Ranges:
>>>> bytes
>>>> 2017-02-10 11:54:13.299148 7feac3f7f700 10 receive_http_header
>>>> 2017-02-10 11:54:13.299149 7feac3f7f700 10 received header:Content-Type:
>>>> application/xml
>>>> 2017-02-10 11:54:13.299150 7feac3f7f700 10 receive_http_header
>>>> 2017-02-10 11:54:13.299150 7feac3f7f700 10 received header:Date: Fri, 10
>>>> Feb
>>>> 2017 02:54:13 GMT
>>>> 2017-02-10 11:54:13.299152 7feac3f7f700 10 receive_http_header
>>>> 2017-02-10 11:54:13.299152 7feac3f7f700 10 received header:
>>>> 2017-02-10 11:54:13.299248 7feac3f7f700  2 req 50:0.008596:s3:PUT
>>>> /bucket3/:create_bucket:completing
>>>> 2017-02-10 11:54:13.299319 7feac3f7f700  2 req 50:0.008667:s3:PUT
>>>> /bucket3/:create_bucket:op status=-2
>>>> 2017-02-10 11:54:13.299321 7feac3f7f700  2 req 50:0.008670:s3:PUT
>>>> /bucket3/:create_bucket:http status=404
>>>> 2017-02-10 11:54:13.299324 7feac3f7f700  1 ====== req done
>>>> req=0x7feac3f79710 op status=-2 http_status=404 ======
>>>> 2017-02-10 11:54:13.299349 7feac3f7f700  1 civetweb: 0x7feb2c02d340:
>>>> 192.168.20.15 - - [10/Feb/2017:11:54:13 +0900] "PUT /bucket3/ HTTP/1.1"
>>>> 404
>>>> 0 - -
>>>>
>>>>
>>>> ---- rgw.tokyo log:
>>>> 2017-02-10 11:54:13.297852 7f56076c6700  1 ====== starting new request
>>>> req=0x7f56076c0710 =====
>>>> 2017-02-10 11:54:13.297887 7f56076c6700  2 req 5:0.000035::PUT
>>>> /bucket3/::initializing for trans_id =
>>>> tx000000000000000000005-00589d2b55-1416-tokyo
>>>> 2017-02-10 11:54:13.297895 7f56076c6700 10 rgw api priority: s3=5
>>>> s3website=4
>>>> 2017-02-10 11:54:13.297897 7f56076c6700 10 host=node5
>>>> 2017-02-10 11:54:13.297906 7f56076c6700 10 meta>>
>>>> HTTP_X_AMZ_CONTENT_SHA256
>>>> 2017-02-10 11:54:13.297912 7f56076c6700 10 x>>
>>>>
>>>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>>> 2017-02-10 11:54:13.297929 7f56076c6700 10
>>>> handler=25RGWHandler_REST_Bucket_S3
>>>> 2017-02-10 11:54:13.297937 7f56076c6700  2 req 5:0.000086:s3:PUT
>>>> /bucket3/::getting op 1
>>>> 2017-02-10 11:54:13.297946 7f56076c6700 10
>>>> op=27RGWCreateBucket_ObjStore_S3
>>>> 2017-02-10 11:54:13.297947 7f56076c6700  2 req 5:0.000096:s3:PUT
>>>> /bucket3/:create_bucket:authorizing
>>>> 2017-02-10 11:54:13.297969 7f56076c6700 10 get_canon_resource():
>>>> dest=/bucket3/
>>>> 2017-02-10 11:54:13.297976 7f56076c6700 10 auth_hdr:
>>>> PUT
>>>>
>>>>
>>>> Fri Feb 10 02:54:13 2017
>>>>
>>>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>>> /bucket3/
>>>> 2017-02-10 11:54:13.298023 7f56076c6700 10 cache get:
>>>> name=default.rgw.users.uid+nishi : type miss (requested=6, cached=0)
>>>> 2017-02-10 11:54:13.298975 7f56076c6700 10 cache put:
>>>> name=default.rgw.users.uid+nishi info.flags=0
>>>> 2017-02-10 11:54:13.298986 7f56076c6700 10 moving
>>>> default.rgw.users.uid+nishi to cache LRU end
>>>> 2017-02-10 11:54:13.298991 7f56076c6700  0 User lookup failed!
>>>> 2017-02-10 11:54:13.298993 7f56076c6700 10 failed to authorize request
>>>> 2017-02-10 11:54:13.299077 7f56076c6700  2 req 5:0.001225:s3:PUT
>>>> /bucket3/:create_bucket:op status=0
>>>> 2017-02-10 11:54:13.299086 7f56076c6700  2 req 5:0.001235:s3:PUT
>>>> /bucket3/:create_bucket:http status=404
>>>> 2017-02-10 11:54:13.299089 7f56076c6700  1 ====== req done
>>>> req=0x7f56076c0710 op status=0 http_status=404 ======
>>>> 2017-02-10 11:54:13.299426 7f56076c6700  1 civetweb: 0x7f56200048c0:
>>>> 192.168.20.15 - - [10/Feb/2017:11:54:13 +0900] "PUT /bucket3/ HTTP/1.1"
>>>> 404
>>>> 0 - -

-- 
KIMURA Osamu / 木村 修
Engineering Department, Storage Development Division,
Data Center Platform Business Unit, FUJITSU LIMITED


* Re: rgw: multiple zonegroups in single realm
  2017-02-13 10:57       ` KIMURA Osamu
@ 2017-02-14 14:54         ` Orit Wasserman
  2017-02-15  0:26           ` KIMURA Osamu
  0 siblings, 1 reply; 11+ messages in thread
From: Orit Wasserman @ 2017-02-14 14:54 UTC (permalink / raw)
  To: KIMURA Osamu; +Cc: ceph-devel

On Mon, Feb 13, 2017 at 12:57 PM, KIMURA Osamu
<kimura.osamu@jp.fujitsu.com> wrote:
> Hi Orit,
>
> I almost agree, with some exceptions...
>
>
> On 2017/02/13 18:42, Orit Wasserman wrote:
>>
>> On Mon, Feb 13, 2017 at 6:44 AM, KIMURA Osamu
>> <kimura.osamu@jp.fujitsu.com> wrote:
>>>
>>> Hi Orit,
>>>
>>> Thanks for your comments.
>>> I believe I'm not confusing, but probably my thought may not be well
>>> described...
>>>
>> :)
>>>
>>> On 2017/02/12 19:07, Orit Wasserman wrote:
>>>>
>>>>
>>>> On Fri, Feb 10, 2017 at 10:21 AM, KIMURA Osamu
>>>> <kimura.osamu@jp.fujitsu.com> wrote:
>>>>>
>>>>>
>>>>> Hi Cephers,
>>>>>
>>>>> I'm trying to configure RGWs with multiple zonegroups within single
>>>>> realm.
>>>>> The intention is that some buckets to be replicated and others to stay
>>>>> locally.
>>>>
>>>>
>>>>
>>>> If you are not replicating than you don't need to create any zone
>>>> configuration,
>>>> a default zonegroup and zone are created automatically
>>>>
>>>>> e.g.:
>>>>>  realm: fj
>>>>>   zonegroup east: zone tokyo (not replicated)
>>>>
>>>>
>>>> no need if not replicated
>>>>>
>>>>>
>>>>>   zonegroup west: zone osaka (not replicated)
>>>>
>>>>
>>>> same here
>>>>>
>>>>>
>>>>>   zonegroup jp:   zone jp-east + jp-west (replicated)
>>>
>>>
>>>
>>> The "east" and "west" zonegroups are just renamed from "default"
>>> as described in RHCS document [3].
>>
>>
>> Why do you need two zonegroups (or 3)?
>>
>> At the moment multisitev2 replicated automatically all zones in the
>> realm except "default" zone.
>> The moment you add a new zone (could be part of another zonegroup) it
>> will be replicated to the other zones.
>> It seems you don't want or need this.
>> we are working on allowing more control on the replication but that
>> will be in the future.
>>
>>> We may not need to rename them, but at least api_name should be altered.
>>
>>
>> You can change the api_name for the "default" zone.
>>
>>> In addition, I'm not sure what happens if 2 "default" zones/zonegroups
>>> co-exist in same realm.
>>
>>
>> Realm shares all the zones/zonegroups configuration,
>> it means it is the same zone/zonegroup.
>> For "default" it means not zone/zonegroup configured, we use it to run
>> radosgw without any
>> zone/zonegroup specified in the configuration.
>
>
> I didn't think "default" as exception of zonegroup. :-P
> Actually, I must specify api_name in default zonegroup setting.
>
> I interpret "default" zone/zonegroup is out of realm. Is it correct?
> I think it means namespace for bucket or user is not shared with "default".
> At present, I can't make decision to separate namespaces, but it may be
> best choice with current code.
>
>
>>>>> To evaluate such configuration, I tentatively built multiple zonegroups
>>>>> (east, west) on a ceph cluster. I barely succeed to configure it, but
>>>>> some concerns exist.
>>>>>
>>>> I think you just need one zonegroup with two zones the other are not
>>>> needed
>>>> Also each gateway can handle only a single zone (rgw_zone
>>>> configuration parameter)
>>>
>>>
>>>
>>> This is just a tentative one to confirm the behavior of multiple
>>> zonegroups
>>> due to limitation of our current equipment.
>>> The "east" zonegroup was renamed from "default", and another "west"
>>> zonegroup
>>> was created. Of course I specified both rgw_zonegroup and rgw_zone
>>> parameters
>>> for each RGW instance. (see -FYI- section bellow)
>>>
>> Can I suggest starting with a more simple setup:
>> Two zonegroups,  the first will have two zones and the second will
>> have one zone.
>> It is simper to configure and in case of problems to debug.
>
>
> I would try with such configuration IF time permitted.
>
>
>
>>>>> a) User accounts are not synced among zonegroups
>>>>>
>>>>> I'm not sure if this is a issue, but the blueprint [1] stated a master
>>>>> zonegroup manages user accounts as metadata like buckets.
>>>>>
>>>> You have a lot of confusion with the zones and zonegroups.
>>>> A zonegroup is just a group of zones that are sharing the same data
>>>> (i.e. replication between them)
>>>> A zone represent a geographical location (i.e. one ceph cluster)
>>>>
>>>> We have a meta master zone (the master zone in the master zonegroup),
>>>> this meta master is responible on
>>>> replicating users and byckets meta operations.
>>>
>>>
>>>
>>> I know it.
>>> But the master zone in the master zonegroup manages bucket meta
>>> operations including buckets in other zonegroups. It means
>>> the master zone in the master zonegroup must have permission to
>>> handle buckets meta operations, i.e., must have same user accounts
>>> as other zonegroups.
>>
>> Again zones not zonegroups,  it needs to have an admin user with the
>> same credentials in all the other zones.
>>
>>> This is related to next issue b). If the master zone in the master
>>> zonegroup doesn't have user accounts for other zonegroups, all the
>>> buckets meta operations are rejected.
>>>
>>
>> Correct
>>
>>> In addition, it may be overexplanation though, user accounts are
>>> sync'ed to other zones within same zonegroup if the accounts are
>>> created on master zone of the zonegroup. On the other hand,
>>> I found today, user accounts are not sync'ed to master if the
>>> accounts are created on slave(?) zone in the zonegroup. It seems
>>> asymmetric behavior.
>>
>>
>> This requires investigation,  can you open a tracker issue and we will
>> look into it.
>>
>>> I'm not sure if the same behavior is caused by Admin REST API instead
>>> of radosgw-admin.
>>>
>>
>> It doesn't matter both use almost the same code
>>
>>>
>>>>> b) Bucket creation is rejected if master zonegroup doesn't have the
>>>>> account
>>>>>
>>>>> e.g.:
>>>>>   1) Configure east zonegroup as master.
>>>>
>>>>
>>>> you need a master zoen
>>>>>
>>>>>
>>>>>   2) Create a user "nishi" on west zonegroup (osaka zone) using
>>>>> radosgw-admin.
>>>>>   3) Try to create a bucket on west zonegroup by user nishi.
>>>>>      -> ERROR: S3 error: 404 (NoSuchKey)
>>>>>   4) Create user nishi on east zonegroup with same key.
>>>>>   5) Succeed to create a bucket on west zonegroup by user nishi.
>>>>>
>>>>
>>>> You are confusing zonegroup and zone here again ...
>>>>
>>>> you should notice that when you are using radosgw-admin command
>>>> without providing zonegorup and/or zone info (--rgw-zonegroup=<zg> and
>>>> --rgw-zone=<zone>) it will use the default zonegroup and zone.
>>>>
>>>> User is stored per zone and you need to create an admin users in both
>>>> zones
>>>> for more documentation see:
>>>> http://docs.ceph.com/docs/master/radosgw/multisite/
>>>
>>>
>>>
>>> I always specify --rgw-zonegroup and --rgw-zone for radosgw-admin
>>> command.
>>>
>> That is great!
>> You can also onfigure default zone and zonegroup
>>
>>> The issue is that any buckets meta operations are rejected when the
>>> master
>>> zone in the master zonegroup doesn't have the user account of other
>>> zonegroups.
>>>
>> Correct
>>>
>>> I try to describe details again:
>>> 1) Create fj realm as default.
>>> 2) Rename default zonegroup/zone to east/tokyo and mark as default.
>>> 3) Create west/osaka zonegroup/zone.
>>> 4) Create system user sync-user on both tokyo and osaka zones with same
>>> key.
>>> 5) Start 2 RGW instances for tokyo and osaka zones.
>>> 6) Create azuma user account on tokyo zone in east zonegroup.
>>> 7) Create /bucket1 through tokyo zone endpoint with azuma account.
>>>    -> No problem.
>>> 8) Create nishi user account on osaka zone in west zonegroup.
>>> 9) Try to create a bucket /bucket2 through osaka zone endpoint with azuma
>>> account.
>>>    -> respond "ERROR: S3 error: 403 (InvalidAccessKeyId)" as expected.
>>> 10) Try to create a bucket /bucket3 through osaka zone endpoint with
>>> nishi
>>> account.
>>>    -> respond "ERROR: S3 error: 404 (NoSuchKey)"
>>>    Detailed log is shown in -FYI- section bellow.
>>>    The RGW for osaka zone verify the signature and forward the request
>>>    to tokyo zone endpoint (= the master zone in the master zonegroup).
>>>    Then, the RGW for tokyo zone rejected the request by unauthorized
>>> access.
>>>
>>
>> This seems a bug, can you open a issue?
>>
>>>
>>>>> c) How to restrict to place buckets on specific zonegroups?
>>>>>
>>>>
>>>> you probably mean zone.
>>>> There is ongoing work to enable/disable sync per bucket
>>>> https://github.com/ceph/ceph/pull/10995
>>>> with this you can create a bucket on a specific zone and it won't be
>>>> replicated to another zone
>>>
>>>
>>>
>>> My thought means zonegroup (not zone) as described above.
>>
>>
>> But it should be zone ..
>> Zone represent a geographical location , it represent a single ceph
>> cluster.
>> Bucket is created in a zone (a single ceph cluster) and it stored the zone
>> id.
>> The zone represent in which ceph cluster the bucket was created.
>>
>> A zonegroup just a logical collection of zones, in many case you only
>> need a single zonegroup.
>> You should use zonegroups if you have lots of zones and it simplifies
>> your configuration.
>> You can move zones between zonegroups (it is not tested or supported ...).
>>
>>> With current code, buckets are sync'ed to all zones within a zonegroup,
>>> no way to choose zone to place specific buckets.
>>> But this change may help to configure our original target.
>>>
>>> It seems we need more discussion about the change.
>>> I prefer default behavior is associated with user account (per SLA).
>>> And attribution of each bucket should be able to be changed via REST
>>> API depending on their permission, rather than radosgw-admin command.
>>>
>>
>> I think that will be very helpful , we need to understand what are the
>> requirement and the usage.
>> Please comment on the PR or even open a feature request and we can
>> discuss it more in detail.
>>
>>> Anyway, I'll examine more details.
>>>
>>>>> If user accounts would synced future as the blueprint, all the
>>>>> zonegroups
>>>>> contain same account information. It means any user can create buckets
>>>>> on
>>>>> any zonegroups. If we want to permit to place buckets on a replicated
>>>>> zonegroup for specific users, how to configure?
>>>>>
>>>>> If user accounts will not synced as current behavior, we can restrict
>>>>> to place buckets on specific zonegroups. But I cannot find best way to
>>>>> configure the master zonegroup.
>>>>>
>>>>>
>>>>> d) Operations for other zonegroup are not redirected
>>>>>
>>>>> e.g.:
>>>>>   1) Create bucket4 on west zonegroup by nishi.
>>>>>   2) Try to access bucket4 from endpoint on east zonegroup.
>>>>>      -> Respond "301 (Moved Permanently)",
>>>>>         but no redirected Location header is returned.
>>>>>
>>>>
>>>> It could be a bug please open a tracker issue for that in
>>>> tracker.ceph.com for RGW component with all the configuration
>>>> information,
>>>> logs and the version of ceph and radosgw you are using.
>>>
>>>
>>>
>>> I will open it, but it may be issued as "Feature" instead of "Bug"
>>> depending on following discussion.
>>>
>>>>> It seems current RGW doesn't follows S3 specification [2].
>>>>> To implement this feature, probably we need to define another endpoint
>>>>> on each zonegroup for client accessible URL. RGW may placed behind
>>>>> proxy,
>>>>> thus the URL may be different from endpoint URLs for replication.
>>>>>
>>>>
>>>> The zone and zonegroup endpoints are not used directly by the user with
>>>> a
>>>> proxy.
>>>> The user get a URL pointing to the proxy and the proxy will need to be
>>>> configured to point the rgw urls/IPs , you can have several radosgw
>>>> running.
>>>> See more
>>>>
>>>> https://access.redhat.com/documentation/en/red-hat-ceph-storage/2/paged/object-gateway-guide-for-red-hat-enterprise-linux/chapter-2-configuration
>>>
>>>
>>>
>>> Does it mean the proxy has responsibility to alter "Location" header as
>>> redirected URL?
>>>
>>
>> No
>>
>>> Basically, RGW can respond only the endpoint described in zonegroup
>>> setting as redirected URL on Location header. But client may not access
>>> the endpoint. Someone must translate the Location header to client
>>> accessible URL.
>>>
>>
>> Both locations will have a proxy. This means all communication is done
>> through proxies.
>> The endpoint URL should be an external URL and the proxy on the new
>> location will translate it to the internal one.
>
>
> Our assumption is:
>
> End-user client --- internet --- proxy ---+--- RGW site-A
>                                           |
>                                           | (dedicated line or VPN)
>                                           |
> End-user client --- internet --- proxy ---+--- RGW site-B
>
> RGWs can't access through front of proxies.
> In this case, endpoints for replication are in backend network of proxies.

Do you have several radosgw instances in each site?

>
> How do you think?
>
>
>
>> Regards,
>> Orit
>>
>>> If the proxy translates Location header, it looks like man-in-the-middle
>>> attack.
>>>
>>>
>>> Regards,
>>> KIMURA
>>>
>>>> Regrads,
>>>> Orit
>>>>>
>>>>>
>>>>>
>>>>> Any thoughts?
>>>>
>>>>
>>>>
>>>>>
>>>>> [1]
>>>>>
>>>>>
>>>>> http://tracker.ceph.com/projects/ceph/wiki/Rgw_new_multisite_configuration
>>>>> [2] http://docs.aws.amazon.com/AmazonS3/latest/dev/Redirects.html
>>>
>>>
>>>
>>> [3]
>>>
>>> https://access.redhat.com/documentation/en/red-hat-ceph-storage/2/paged/object-gateway-guide-for-red-hat-enterprise-linux/chapter-8-multi-site#migrating_a_single_site_system_to_multi_site
>>>
>>>>> ------ FYI ------
>>>>> [environments]
>>>>> Ceph cluster: RHCS 2.0
>>>>> RGW: RHEL 7.2 + RGW v10.2.5
>>>>>
>>>>> zonegroup east: master
>>>>>  zone tokyo
>>>>>   endpoint http://node5:80
>>>
>>>
>>>        rgw frontends = "civetweb port=80"
>>>        rgw zonegroup = east
>>>        rgw zone = tokyo
>>>>>
>>>>>
>>>>>   system user: sync-user
>>>>>   user azuma (+ nishi)
>>>>>
>>>>> zonegroup west: (not master)
>>>>>   zone osaka
>>>>>   endpoint http://node5:8081
>>>
>>>
>>>        rgw frontends = "civetweb port=8081"
>>>        rgw zonegroup = west
>>>        rgw zone = osaka
>>>
>>>>>   system user: sync-user (created with same key as zone tokyo)
>>>>>   user nishi
>>>>>
>>>>>
>>>>> [detail of "b)"]
>>>>>
>>>>> [quoted s3cmd output and rgw.osaka / rgw.tokyo logs snipped; identical to the log in the original message at the start of the thread]
>
>
> --
> KIMURA Osamu / 木村 修
> Engineering Department, Storage Development Division,
> Data Center Platform Business Unit, FUJITSU LIMITED

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: rgw: multiple zonegroups in single realm
  2017-02-14 14:54         ` Orit Wasserman
@ 2017-02-15  0:26           ` KIMURA Osamu
  2017-02-15  7:53             ` Orit Wasserman
  0 siblings, 1 reply; 11+ messages in thread
From: KIMURA Osamu @ 2017-02-15  0:26 UTC (permalink / raw)
  To: Orit Wasserman; +Cc: ceph-devel

Comments inline...

On 2017/02/14 23:54, Orit Wasserman wrote:
> On Mon, Feb 13, 2017 at 12:57 PM, KIMURA Osamu
> <kimura.osamu@jp.fujitsu.com> wrote:
>> Hi Orit,
>>
>> I almost agree, with some exceptions...
>>
>>
>> On 2017/02/13 18:42, Orit Wasserman wrote:
>>>
>>> On Mon, Feb 13, 2017 at 6:44 AM, KIMURA Osamu
>>> <kimura.osamu@jp.fujitsu.com> wrote:
>>>>
>>>> Hi Orit,
>>>>
>>>> Thanks for your comments.
>>>> I believe I'm not confusing, but probably my thought may not be well
>>>> described...
>>>>
>>> :)
>>>>
>>>> On 2017/02/12 19:07, Orit Wasserman wrote:
>>>>>
>>>>>
>>>>> On Fri, Feb 10, 2017 at 10:21 AM, KIMURA Osamu
>>>>> <kimura.osamu@jp.fujitsu.com> wrote:
>>>>>>
>>>>>>
>>>>>> Hi Cephers,
>>>>>>
>>>>>> I'm trying to configure RGWs with multiple zonegroups within single
>>>>>> realm.
>>>>>> The intention is that some buckets to be replicated and others to stay
>>>>>> locally.
>>>>>
>>>>>
>>>>>
>>>>> If you are not replicating than you don't need to create any zone
>>>>> configuration,
>>>>> a default zonegroup and zone are created automatically
>>>>>
>>>>>> e.g.:
>>>>>>  realm: fj
>>>>>>   zonegroup east: zone tokyo (not replicated)
>>>>>
>>>>>
>>>>> no need if not replicated
>>>>>>
>>>>>>
>>>>>>   zonegroup west: zone osaka (not replicated)
>>>>>
>>>>>
>>>>> same here
>>>>>>
>>>>>>
>>>>>>   zonegroup jp:   zone jp-east + jp-west (replicated)
>>>>
>>>>
>>>>
>>>> The "east" and "west" zonegroups are just renamed from "default"
>>>> as described in RHCS document [3].
>>>
>>>
>>> Why do you need two zonegroups (or 3)?
>>>
>>> At the moment multisitev2 replicated automatically all zones in the
>>> realm except "default" zone.
>>> The moment you add a new zone (could be part of another zonegroup) it
>>> will be replicated to the other zones.
>>> It seems you don't want or need this.
>>> we are working on allowing more control on the replication but that
>>> will be in the future.
>>>
>>>> We may not need to rename them, but at least api_name should be altered.
>>>
>>>
>>> You can change the api_name for the "default" zone.
>>>
>>>> In addition, I'm not sure what happens if 2 "default" zones/zonegroups
>>>> co-exist in same realm.
>>>
>>>
>>> Realm shares all the zones/zonegroups configuration,
>>> it means it is the same zone/zonegroup.
>>> For "default" it means not zone/zonegroup configured, we use it to run
>>> radosgw without any
>>> zone/zonegroup specified in the configuration.
>>
>>
>> I didn't think "default" as exception of zonegroup. :-P
>> Actually, I must specify api_name in default zonegroup setting.
>>
>> I interpret "default" zone/zonegroup is out of realm. Is it correct?
>> I think it means namespace for bucket or user is not shared with "default".
>> At present, I can't make decision to separate namespaces, but it may be
>> best choice with current code.
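
(For reference, a minimal sketch of setting api_name on the existing "default"
zonegroup without renaming it. The get/edit/set workflow below is an assumption
on my side rather than something stated in this thread; only standard
radosgw-admin commands are used:)

   $ radosgw-admin zonegroup get --rgw-zonegroup=default > zg.json
     # edit "api_name" in zg.json, e.g. set it to "east"
   $ radosgw-admin zonegroup set < zg.json
   $ radosgw-admin period update --commit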
>>
>>
>>>>>> To evaluate such configuration, I tentatively built multiple zonegroups
>>>>>> (east, west) on a ceph cluster. I barely succeed to configure it, but
>>>>>> some concerns exist.
>>>>>>
>>>>> I think you just need one zonegroup with two zones the other are not
>>>>> needed
>>>>> Also each gateway can handle only a single zone (rgw_zone
>>>>> configuration parameter)
>>>>
>>>>
>>>>
>>>> This is just a tentative one to confirm the behavior of multiple
>>>> zonegroups
>>>> due to limitation of our current equipment.
>>>> The "east" zonegroup was renamed from "default", and another "west"
>>>> zonegroup
>>>> was created. Of course I specified both rgw_zonegroup and rgw_zone
>>>> parameters
>>>> for each RGW instance. (see -FYI- section bellow)
>>>>
>>> Can I suggest starting with a more simple setup:
>>> Two zonegroups,  the first will have two zones and the second will
>>> have one zone.
>>> It is simper to configure and in case of problems to debug.
>>
>>
>> I would try with such configuration IF time permitted.
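
(For reference, a rough sketch of the simpler layout suggested above -- one
replicated zonegroup with two zones plus one local-only zonegroup. The names
follow the example at the top of the thread, but the hosts and keys are
placeholders, and the realm pull / per-instance settings on the second cluster
are omitted; only standard radosgw-admin multisite commands are assumed:)

   $ radosgw-admin realm create --rgw-realm=fj --default
   $ radosgw-admin zonegroup create --rgw-zonegroup=jp --master --default \
         --endpoints=http://gw-jp-east:80
   $ radosgw-admin zone create --rgw-zonegroup=jp --rgw-zone=jp-east --master --default \
         --endpoints=http://gw-jp-east:80 --access-key=<sync-key> --secret=<sync-secret>
   $ radosgw-admin zone create --rgw-zonegroup=jp --rgw-zone=jp-west \
         --endpoints=http://gw-jp-west:80 --access-key=<sync-key> --secret=<sync-secret>
   $ radosgw-admin zonegroup create --rgw-zonegroup=west --endpoints=http://gw-osaka:80
   $ radosgw-admin zone create --rgw-zonegroup=west --rgw-zone=osaka \
         --endpoints=http://gw-osaka:80 --access-key=<sync-key> --secret=<sync-secret>
   $ radosgw-admin period update --commit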
>>
>>
>>
>>>>>> a) User accounts are not synced among zonegroups
>>>>>>
>>>>>> I'm not sure if this is a issue, but the blueprint [1] stated a master
>>>>>> zonegroup manages user accounts as metadata like buckets.
>>>>>>
>>>>> You have a lot of confusion with the zones and zonegroups.
>>>>> A zonegroup is just a group of zones that are sharing the same data
>>>>> (i.e. replication between them)
>>>>> A zone represent a geographical location (i.e. one ceph cluster)
>>>>>
>>>>> We have a meta master zone (the master zone in the master zonegroup),
>>>>> this meta master is responible on
>>>>> replicating users and byckets meta operations.
>>>>
>>>>
>>>>
>>>> I know it.
>>>> But the master zone in the master zonegroup manages bucket meta
>>>> operations including buckets in other zonegroups. It means
>>>> the master zone in the master zonegroup must have permission to
>>>> handle buckets meta operations, i.e., must have same user accounts
>>>> as other zonegroups.
>>>
>>> Again zones not zonegroups,  it needs to have an admin user with the
>>> same credentials in all the other zones.
>>>
>>>> This is related to next issue b). If the master zone in the master
>>>> zonegroup doesn't have user accounts for other zonegroups, all the
>>>> buckets meta operations are rejected.
>>>>
>>>
>>> Correct
>>>
>>>> In addition, it may be overexplanation though, user accounts are
>>>> sync'ed to other zones within same zonegroup if the accounts are
>>>> created on master zone of the zonegroup. On the other hand,
>>>> I found today, user accounts are not sync'ed to master if the
>>>> accounts are created on slave(?) zone in the zonegroup. It seems
>>>> asymmetric behavior.
>>>
>>>
>>> This requires investigation,  can you open a tracker issue and we will
>>> look into it.
>>>
>>>> I'm not sure if the same behavior is caused by Admin REST API instead
>>>> of radosgw-admin.
>>>>
>>>
>>> It doesn't matter both use almost the same code
>>>
>>>>
>>>>>> b) Bucket creation is rejected if master zonegroup doesn't have the
>>>>>> account
>>>>>>
>>>>>> e.g.:
>>>>>>   1) Configure east zonegroup as master.
>>>>>
>>>>>
>>>>> you need a master zoen
>>>>>>
>>>>>>
>>>>>>   2) Create a user "nishi" on west zonegroup (osaka zone) using
>>>>>> radosgw-admin.
>>>>>>   3) Try to create a bucket on west zonegroup by user nishi.
>>>>>>      -> ERROR: S3 error: 404 (NoSuchKey)
>>>>>>   4) Create user nishi on east zonegroup with same key.
>>>>>>   5) Succeed to create a bucket on west zonegroup by user nishi.
>>>>>>
>>>>>
>>>>> You are confusing zonegroup and zone here again ...
>>>>>
>>>>> you should notice that when you are using radosgw-admin command
>>>>> without providing zonegorup and/or zone info (--rgw-zonegroup=<zg> and
>>>>> --rgw-zone=<zone>) it will use the default zonegroup and zone.
>>>>>
>>>>> User is stored per zone and you need to create an admin users in both
>>>>> zones
>>>>> for more documentation see:
>>>>> http://docs.ceph.com/docs/master/radosgw/multisite/
>>>>
>>>>
>>>>
>>>> I always specify --rgw-zonegroup and --rgw-zone for radosgw-admin
>>>> command.
>>>>
>>> That is great!
>>> You can also onfigure default zone and zonegroup
>>>
>>>> The issue is that any buckets meta operations are rejected when the
>>>> master
>>>> zone in the master zonegroup doesn't have the user account of other
>>>> zonegroups.
>>>>
>>> Correct
>>>>
>>>> I try to describe details again:
>>>> 1) Create fj realm as default.
>>>> 2) Rename default zonegroup/zone to east/tokyo and mark as default.
>>>> 3) Create west/osaka zonegroup/zone.
>>>> 4) Create system user sync-user on both tokyo and osaka zones with same
>>>> key.
>>>> 5) Start 2 RGW instances for tokyo and osaka zones.
>>>> 6) Create azuma user account on tokyo zone in east zonegroup.
>>>> 7) Create /bucket1 through tokyo zone endpoint with azuma account.
>>>>    -> No problem.
>>>> 8) Create nishi user account on osaka zone in west zonegroup.
>>>> 9) Try to create a bucket /bucket2 through osaka zone endpoint with azuma
>>>> account.
>>>>    -> respond "ERROR: S3 error: 403 (InvalidAccessKeyId)" as expected.
>>>> 10) Try to create a bucket /bucket3 through osaka zone endpoint with
>>>> nishi
>>>> account.
>>>>    -> respond "ERROR: S3 error: 404 (NoSuchKey)"
>>>>    Detailed log is shown in -FYI- section bellow.
>>>>    The RGW for osaka zone verify the signature and forward the request
>>>>    to tokyo zone endpoint (= the master zone in the master zonegroup).
>>>>    Then, the RGW for tokyo zone rejected the request by unauthorized
>>>> access.
>>>>
>>>
>>> This seems a bug, can you open a issue?
>>>
>>>>
>>>>>> c) How to restrict to place buckets on specific zonegroups?
>>>>>>
>>>>>
>>>>> you probably mean zone.
>>>>> There is ongoing work to enable/disable sync per bucket
>>>>> https://github.com/ceph/ceph/pull/10995
>>>>> with this you can create a bucket on a specific zone and it won't be
>>>>> replicated to another zone
>>>>
>>>>
>>>>
>>>> My thought means zonegroup (not zone) as described above.
>>>
>>>
>>> But it should be zone ..
>>> Zone represent a geographical location , it represent a single ceph
>>> cluster.
>>> Bucket is created in a zone (a single ceph cluster) and it stored the zone
>>> id.
>>> The zone represent in which ceph cluster the bucket was created.
>>>
>>> A zonegroup just a logical collection of zones, in many case you only
>>> need a single zonegroup.
>>> You should use zonegroups if you have lots of zones and it simplifies
>>> your configuration.
>>> You can move zones between zonegroups (it is not tested or supported ...).
>>>
>>>> With current code, buckets are sync'ed to all zones within a zonegroup,
>>>> no way to choose zone to place specific buckets.
>>>> But this change may help to configure our original target.
>>>>
>>>> It seems we need more discussion about the change.
>>>> I prefer default behavior is associated with user account (per SLA).
>>>> And attribution of each bucket should be able to be changed via REST
>>>> API depending on their permission, rather than radosgw-admin command.
>>>>
>>>
>>> I think that will be very helpful , we need to understand what are the
>>> requirement and the usage.
>>> Please comment on the PR or even open a feature request and we can
>>> discuss it more in detail.
>>>
>>>> Anyway, I'll examine more details.
>>>>
>>>>>> If user accounts would synced future as the blueprint, all the
>>>>>> zonegroups
>>>>>> contain same account information. It means any user can create buckets
>>>>>> on
>>>>>> any zonegroups. If we want to permit to place buckets on a replicated
>>>>>> zonegroup for specific users, how to configure?
>>>>>>
>>>>>> If user accounts will not synced as current behavior, we can restrict
>>>>>> to place buckets on specific zonegroups. But I cannot find best way to
>>>>>> configure the master zonegroup.
>>>>>>
>>>>>>
>>>>>> d) Operations for other zonegroup are not redirected
>>>>>>
>>>>>> e.g.:
>>>>>>   1) Create bucket4 on west zonegroup by nishi.
>>>>>>   2) Try to access bucket4 from endpoint on east zonegroup.
>>>>>>      -> Respond "301 (Moved Permanently)",
>>>>>>         but no redirected Location header is returned.
>>>>>>
>>>>>
>>>>> It could be a bug please open a tracker issue for that in
>>>>> tracker.ceph.com for RGW component with all the configuration
>>>>> information,
>>>>> logs and the version of ceph and radosgw you are using.
>>>>
>>>>
>>>>
>>>> I will open it, but it may be issued as "Feature" instead of "Bug"
>>>> depending on following discussion.
>>>>
>>>>>> It seems current RGW doesn't follows S3 specification [2].
>>>>>> To implement this feature, probably we need to define another endpoint
>>>>>> on each zonegroup for client accessible URL. RGW may placed behind
>>>>>> proxy,
>>>>>> thus the URL may be different from endpoint URLs for replication.
>>>>>>
>>>>>
>>>>> The zone and zonegroup endpoints are not used directly by the user with
>>>>> a
>>>>> proxy.
>>>>> The user get a URL pointing to the proxy and the proxy will need to be
>>>>> configured to point the rgw urls/IPs , you can have several radosgw
>>>>> running.
>>>>> See more
>>>>>
>>>>> https://access.redhat.com/documentation/en/red-hat-ceph-storage/2/paged/object-gateway-guide-for-red-hat-enterprise-linux/chapter-2-configuration
>>>>
>>>>
>>>>
>>>> Does it mean the proxy has responsibility to alter "Location" header as
>>>> redirected URL?
>>>>
>>>
>>> No
>>>
>>>> Basically, RGW can respond only the endpoint described in zonegroup
>>>> setting as redirected URL on Location header. But client may not access
>>>> the endpoint. Someone must translate the Location header to client
>>>> accessible URL.
>>>>
>>>
>>> Both locations will have a proxy. This means all communication is done
>>> through proxies.
>>> The endpoint URL should be an external URL and the proxy on the new
>>> location will translate it to the internal one.
>>
>>
>> Our assumption is:
>>
>> End-user client --- internet --- proxy ---+--- RGW site-A
>>                                           |
>>                                           | (dedicated line or VPN)
>>                                           |
>> End-user client --- internet --- proxy ---+--- RGW site-B
>>
>> RGWs can't access through front of proxies.
>> In this case, endpoints for replication are in backend network of proxies.
>
> do you have several radosgw instances in each site?

Yes, probably three or more instances per site.
The actual system will have the same number of physical servers as RGW instances.
We have already tested with multiple endpoints per zone within a zonegroup.
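
(For reference, multiple gateway endpoints per zone can be registered roughly
like this -- the host names are placeholders and only standard radosgw-admin
options are assumed; each instance still sets its own rgw_zone in ceph.conf:)

   $ radosgw-admin zone modify --rgw-zonegroup=jp --rgw-zone=jp-east \
         --endpoints=http://gw1:80,http://gw2:80,http://gw3:80
   $ radosgw-admin zonegroup modify --rgw-zonegroup=jp \
         --endpoints=http://gw1:80,http://gw2:80,http://gw3:80
   $ radosgw-admin period update --commit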


>> How do you think?
>>
>>
>>
>>> Regards,
>>> Orit
>>>
>>>> If the proxy translates Location header, it looks like man-in-the-middle
>>>> attack.
>>>>
>>>>
>>>> Regards,
>>>> KIMURA
>>>>
>>>>> Regrads,
>>>>> Orit
>>>>>>
>>>>>>
>>>>>>
>>>>>> Any thoughts?
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> [1]
>>>>>>
>>>>>>
>>>>>> http://tracker.ceph.com/projects/ceph/wiki/Rgw_new_multisite_configuration
>>>>>> [2] http://docs.aws.amazon.com/AmazonS3/latest/dev/Redirects.html
>>>>
>>>>
>>>>
>>>> [3]
>>>>
>>>> https://access.redhat.com/documentation/en/red-hat-ceph-storage/2/paged/object-gateway-guide-for-red-hat-enterprise-linux/chapter-8-multi-site#migrating_a_single_site_system_to_multi_site
>>>>
>>>>>> ------ FYI ------
>>>>>> [environments]
>>>>>> Ceph cluster: RHCS 2.0
>>>>>> RGW: RHEL 7.2 + RGW v10.2.5
>>>>>>
>>>>>> zonegroup east: master
>>>>>>  zone tokyo
>>>>>>   endpoint http://node5:80
>>>>
>>>>
>>>>        rgw frontends = "civetweb port=80"
>>>>        rgw zonegroup = east
>>>>        rgw zone = tokyo
>>>>>>
>>>>>>
>>>>>>   system user: sync-user
>>>>>>   user azuma (+ nishi)
>>>>>>
>>>>>> zonegroup west: (not master)
>>>>>>   zone osaka
>>>>>>   endpoint http://node5:8081
>>>>
>>>>
>>>>        rgw frontends = "civetweb port=8081"
>>>>        rgw zonegroup = west
>>>>        rgw zone = osaka
>>>>
>>>>>>   system user: sync-user (created with same key as zone tokyo)
>>>>>>   user nishi
>>>>>>
>>>>>>
>>>>>> [detail of "b)"]
>>>>>>
>>>>>> [quoted s3cmd output and rgw.osaka / rgw.tokyo logs snipped; identical to the log quoted earlier in the thread]

-- 
KIMURA Osamu / 木村 修
Engineering Department, Storage Development Division,
Data Center Platform Business Unit, FUJITSU LIMITED

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: rgw: multiple zonegroups in single realm
  2017-02-15  0:26           ` KIMURA Osamu
@ 2017-02-15  7:53             ` Orit Wasserman
  2017-02-23 11:34               ` KIMURA Osamu
  0 siblings, 1 reply; 11+ messages in thread
From: Orit Wasserman @ 2017-02-15  7:53 UTC (permalink / raw)
  To: KIMURA Osamu; +Cc: ceph-devel

On Wed, Feb 15, 2017 at 2:26 AM, KIMURA Osamu
<kimura.osamu@jp.fujitsu.com> wrote:
> Comments inline...
>
>
> On 2017/02/14 23:54, Orit Wasserman wrote:
>>
>> On Mon, Feb 13, 2017 at 12:57 PM, KIMURA Osamu
>> <kimura.osamu@jp.fujitsu.com> wrote:
>>>
>>> Hi Orit,
>>>
>>> I almost agree, with some exceptions...
>>>
>>>
>>> On 2017/02/13 18:42, Orit Wasserman wrote:
>>>>
>>>>
>>>> On Mon, Feb 13, 2017 at 6:44 AM, KIMURA Osamu
>>>> <kimura.osamu@jp.fujitsu.com> wrote:
>>>>>
>>>>>
>>>>> Hi Orit,
>>>>>
>>>>> Thanks for your comments.
>>>>> I believe I'm not confusing, but probably my thought may not be well
>>>>> described...
>>>>>
>>>> :)
>>>>>
>>>>>
>>>>> On 2017/02/12 19:07, Orit Wasserman wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Feb 10, 2017 at 10:21 AM, KIMURA Osamu
>>>>>> <kimura.osamu@jp.fujitsu.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hi Cephers,
>>>>>>>
>>>>>>> I'm trying to configure RGWs with multiple zonegroups within single
>>>>>>> realm.
>>>>>>> The intention is that some buckets to be replicated and others to
>>>>>>> stay
>>>>>>> locally.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> If you are not replicating than you don't need to create any zone
>>>>>> configuration,
>>>>>> a default zonegroup and zone are created automatically
>>>>>>
>>>>>>> e.g.:
>>>>>>>  realm: fj
>>>>>>>   zonegroup east: zone tokyo (not replicated)
>>>>>>
>>>>>>
>>>>>>
>>>>>> no need if not replicated
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>   zonegroup west: zone osaka (not replicated)
>>>>>>
>>>>>>
>>>>>>
>>>>>> same here
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>   zonegroup jp:   zone jp-east + jp-west (replicated)
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> The "east" and "west" zonegroups are just renamed from "default"
>>>>> as described in RHCS document [3].
>>>>
>>>>
>>>>
>>>> Why do you need two zonegroups (or 3)?
>>>>
>>>> At the moment multisitev2 replicated automatically all zones in the
>>>> realm except "default" zone.
>>>> The moment you add a new zone (could be part of another zonegroup) it
>>>> will be replicated to the other zones.
>>>> It seems you don't want or need this.
>>>> we are working on allowing more control on the replication but that
>>>> will be in the future.
>>>>
>>>>> We may not need to rename them, but at least api_name should be
>>>>> altered.
>>>>
>>>>
>>>>
>>>> You can change the api_name for the "default" zone.
>>>>
>>>>> In addition, I'm not sure what happens if 2 "default" zones/zonegroups
>>>>> co-exist in same realm.
>>>>
>>>>
>>>>
>>>> Realm shares all the zones/zonegroups configuration,
>>>> it means it is the same zone/zonegroup.
>>>> For "default" it means not zone/zonegroup configured, we use it to run
>>>> radosgw without any
>>>> zone/zonegroup specified in the configuration.
>>>
>>>
>>>
>>> I didn't think "default" as exception of zonegroup. :-P
>>> Actually, I must specify api_name in default zonegroup setting.
>>>
>>> I interpret "default" zone/zonegroup is out of realm. Is it correct?
>>> I think it means namespace for bucket or user is not shared with
>>> "default".
>>> At present, I can't make decision to separate namespaces, but it may be
>>> best choice with current code.
>>>
>>>
>>>>>>> To evaluate such configuration, I tentatively built multiple
>>>>>>> zonegroups
>>>>>>> (east, west) on a ceph cluster. I barely succeed to configure it, but
>>>>>>> some concerns exist.
>>>>>>>
>>>>>> I think you just need one zonegroup with two zones the other are not
>>>>>> needed
>>>>>> Also each gateway can handle only a single zone (rgw_zone
>>>>>> configuration parameter)
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> This is just a tentative one to confirm the behavior of multiple
>>>>> zonegroups
>>>>> due to limitation of our current equipment.
>>>>> The "east" zonegroup was renamed from "default", and another "west"
>>>>> zonegroup
>>>>> was created. Of course I specified both rgw_zonegroup and rgw_zone
>>>>> parameters
>>>>> for each RGW instance. (see -FYI- section bellow)
>>>>>
>>>> Can I suggest starting with a more simple setup:
>>>> Two zonegroups,  the first will have two zones and the second will
>>>> have one zone.
>>>> It is simper to configure and in case of problems to debug.
>>>
>>>
>>>
>>> I would try with such configuration IF time permitted.
>>>
>>>
>>>
>>>>>>> a) User accounts are not synced among zonegroups
>>>>>>>
>>>>>>> I'm not sure if this is a issue, but the blueprint [1] stated a
>>>>>>> master
>>>>>>> zonegroup manages user accounts as metadata like buckets.
>>>>>>>
>>>>>> You have a lot of confusion with the zones and zonegroups.
>>>>>> A zonegroup is just a group of zones that are sharing the same data
>>>>>> (i.e. replication between them)
>>>>>> A zone represent a geographical location (i.e. one ceph cluster)
>>>>>>
>>>>>> We have a meta master zone (the master zone in the master zonegroup),
>>>>>> this meta master is responible on
>>>>>> replicating users and byckets meta operations.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> I know it.
>>>>> But the master zone in the master zonegroup manages bucket meta
>>>>> operations including buckets in other zonegroups. It means
>>>>> the master zone in the master zonegroup must have permission to
>>>>> handle buckets meta operations, i.e., must have same user accounts
>>>>> as other zonegroups.
>>>>
>>>>
>>>> Again zones not zonegroups,  it needs to have an admin user with the
>>>> same credentials in all the other zones.
>>>>
>>>>> This is related to next issue b). If the master zone in the master
>>>>> zonegroup doesn't have user accounts for other zonegroups, all the
>>>>> buckets meta operations are rejected.
>>>>>
>>>>
>>>> Correct
>>>>
>>>>> In addition, it may be overexplanation though, user accounts are
>>>>> sync'ed to other zones within same zonegroup if the accounts are
>>>>> created on master zone of the zonegroup. On the other hand,
>>>>> I found today, user accounts are not sync'ed to master if the
>>>>> accounts are created on slave(?) zone in the zonegroup. It seems
>>>>> asymmetric behavior.
>>>>
>>>>
>>>>
>>>> This requires investigation,  can you open a tracker issue and we will
>>>> look into it.
>>>>
>>>>> I'm not sure if the same behavior is caused by Admin REST API instead
>>>>> of radosgw-admin.
>>>>>
>>>>
>>>> It doesn't matter both use almost the same code
>>>>
>>>>>
>>>>>>> b) Bucket creation is rejected if master zonegroup doesn't have the
>>>>>>> account
>>>>>>>
>>>>>>> e.g.:
>>>>>>>   1) Configure east zonegroup as master.
>>>>>>
>>>>>>
>>>>>>
>>>>>> you need a master zoen
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>   2) Create a user "nishi" on west zonegroup (osaka zone) using
>>>>>>> radosgw-admin.
>>>>>>>   3) Try to create a bucket on west zonegroup by user nishi.
>>>>>>>      -> ERROR: S3 error: 404 (NoSuchKey)
>>>>>>>   4) Create user nishi on east zonegroup with same key.
>>>>>>>   5) Succeed to create a bucket on west zonegroup by user nishi.
>>>>>>>
>>>>>>
>>>>>> You are confusing zonegroup and zone here again ...
>>>>>>
>>>>>> you should notice that when you are using radosgw-admin command
>>>>>> without providing zonegorup and/or zone info (--rgw-zonegroup=<zg> and
>>>>>> --rgw-zone=<zone>) it will use the default zonegroup and zone.
>>>>>>
>>>>>> User is stored per zone and you need to create an admin users in both
>>>>>> zones
>>>>>> for more documentation see:
>>>>>> http://docs.ceph.com/docs/master/radosgw/multisite/
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> I always specify --rgw-zonegroup and --rgw-zone for radosgw-admin
>>>>> command.
>>>>>
>>>> That is great!
>>>> You can also onfigure default zone and zonegroup
>>>>
>>>>> The issue is that any buckets meta operations are rejected when the
>>>>> master
>>>>> zone in the master zonegroup doesn't have the user account of other
>>>>> zonegroups.
>>>>>
>>>> Correct
>>>>>
>>>>>
>>>>> I try to describe details again:
>>>>> 1) Create fj realm as default.
>>>>> 2) Rename default zonegroup/zone to east/tokyo and mark as default.
>>>>> 3) Create west/osaka zonegroup/zone.
>>>>> 4) Create system user sync-user on both tokyo and osaka zones with same
>>>>> key.
>>>>> 5) Start 2 RGW instances for tokyo and osaka zones.
>>>>> 6) Create azuma user account on tokyo zone in east zonegroup.
>>>>> 7) Create /bucket1 through tokyo zone endpoint with azuma account.
>>>>>    -> No problem.
>>>>> 8) Create nishi user account on osaka zone in west zonegroup.
>>>>> 9) Try to create a bucket /bucket2 through osaka zone endpoint with
>>>>> azuma
>>>>> account.
>>>>>    -> respond "ERROR: S3 error: 403 (InvalidAccessKeyId)" as expected.
>>>>> 10) Try to create a bucket /bucket3 through osaka zone endpoint with
>>>>> nishi
>>>>> account.
>>>>>    -> respond "ERROR: S3 error: 404 (NoSuchKey)"
>>>>>    Detailed log is shown in -FYI- section bellow.
>>>>>    The RGW for osaka zone verify the signature and forward the request
>>>>>    to tokyo zone endpoint (= the master zone in the master zonegroup).
>>>>>    Then, the RGW for tokyo zone rejected the request by unauthorized
>>>>> access.
>>>>>
>>>>
>>>> This seems a bug, can you open a issue?
>>>>
>>>>>
>>>>>>> c) How to restrict to place buckets on specific zonegroups?
>>>>>>>
>>>>>>
>>>>>> you probably mean zone.
>>>>>> There is ongoing work to enable/disable sync per bucket
>>>>>> https://github.com/ceph/ceph/pull/10995
>>>>>> with this you can create a bucket on a specific zone and it won't be
>>>>>> replicated to another zone
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> My thought means zonegroup (not zone) as described above.
>>>>
>>>>
>>>>
>>>> But it should be zone ..
>>>> Zone represent a geographical location , it represent a single ceph
>>>> cluster.
>>>> Bucket is created in a zone (a single ceph cluster) and it stored the
>>>> zone
>>>> id.
>>>> The zone represent in which ceph cluster the bucket was created.
>>>>
>>>> A zonegroup just a logical collection of zones, in many case you only
>>>> need a single zonegroup.
>>>> You should use zonegroups if you have lots of zones and it simplifies
>>>> your configuration.
>>>> You can move zones between zonegroups (it is not tested or supported
>>>> ...).
>>>>
>>>>> With current code, buckets are sync'ed to all zones within a zonegroup,
>>>>> no way to choose zone to place specific buckets.
>>>>> But this change may help to configure our original target.
>>>>>
>>>>> It seems we need more discussion about the change.
>>>>> I prefer default behavior is associated with user account (per SLA).
>>>>> And attribution of each bucket should be able to be changed via REST
>>>>> API depending on their permission, rather than radosgw-admin command.
>>>>>
>>>>
>>>> I think that will be very helpful , we need to understand what are the
>>>> requirement and the usage.
>>>> Please comment on the PR or even open a feature request and we can
>>>> discuss it more in detail.
>>>>
>>>>> Anyway, I'll examine more details.
>>>>>
>>>>>>> If user accounts would synced future as the blueprint, all the
>>>>>>> zonegroups
>>>>>>> contain same account information. It means any user can create
>>>>>>> buckets
>>>>>>> on
>>>>>>> any zonegroups. If we want to permit to place buckets on a replicated
>>>>>>> zonegroup for specific users, how to configure?
>>>>>>>
>>>>>>> If user accounts will not synced as current behavior, we can restrict
>>>>>>> to place buckets on specific zonegroups. But I cannot find best way
>>>>>>> to
>>>>>>> configure the master zonegroup.
>>>>>>>
>>>>>>>
>>>>>>> d) Operations for other zonegroup are not redirected
>>>>>>>
>>>>>>> e.g.:
>>>>>>>   1) Create bucket4 on west zonegroup by nishi.
>>>>>>>   2) Try to access bucket4 from endpoint on east zonegroup.
>>>>>>>      -> Respond "301 (Moved Permanently)",
>>>>>>>         but no redirected Location header is returned.
>>>>>>>
>>>>>>
>>>>>> It could be a bug please open a tracker issue for that in
>>>>>> tracker.ceph.com for RGW component with all the configuration
>>>>>> information,
>>>>>> logs and the version of ceph and radosgw you are using.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> I will open it, but it may be issued as "Feature" instead of "Bug"
>>>>> depending on following discussion.
>>>>>
>>>>>>> It seems current RGW doesn't follows S3 specification [2].
>>>>>>> To implement this feature, probably we need to define another
>>>>>>> endpoint
>>>>>>> on each zonegroup for client accessible URL. RGW may placed behind
>>>>>>> proxy,
>>>>>>> thus the URL may be different from endpoint URLs for replication.
>>>>>>>
>>>>>>
>>>>>> The zone and zonegroup endpoints are not used directly by the user
>>>>>> with
>>>>>> a
>>>>>> proxy.
>>>>>> The user get a URL pointing to the proxy and the proxy will need to be
>>>>>> configured to point the rgw urls/IPs , you can have several radosgw
>>>>>> running.
>>>>>> See more
>>>>>>
>>>>>>
>>>>>> https://access.redhat.com/documentation/en/red-hat-ceph-storage/2/paged/object-gateway-guide-for-red-hat-enterprise-linux/chapter-2-configuration
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Does it mean the proxy has responsibility to alter "Location" header as
>>>>> redirected URL?
>>>>>
>>>>
>>>> No
>>>>
>>>>> Basically, RGW can respond only the endpoint described in zonegroup
>>>>> setting as redirected URL on Location header. But client may not access
>>>>> the endpoint. Someone must translate the Location header to client
>>>>> accessible URL.
>>>>>
>>>>
>>>> Both locations will have a proxy. This means all communication is done
>>>> through proxies.
>>>> The endpoint URL should be an external URL and the proxy on the new
>>>> location will translate it to the internal one.
>>>
>>>
>>>
>>> Our assumption is:
>>>
>>> End-user client --- internet --- proxy ---+--- RGW site-A
>>>                                           |
>>>                                           | (dedicated line or VPN)
>>>                                           |
>>> End-user client --- internet --- proxy ---+--- RGW site-B
>>>
>>> RGWs can't access through front of proxies.
>>> In this case, endpoints for replication are in backend network of
>>> proxies.
>>
>>
>> do you have several radosgw instances in each site?
>
>
> Yes. Probably three or more instances per a site.
> Actual system will have same number of physical servers as RGW instances.
> We already tested with multiple endpoints per a zone within a zonegroup.

Good to hear :)
As for the redirect message, in your case it should be handled by the proxy
and not by the client browser, as the client cannot access the internal VPN
network. The endpoint URLs should be the URLs of the internal network.
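
As an illustration only (assuming an nginx reverse proxy in front of each
site; all names and addresses below are hypothetical, and this is not
necessarily the only way to do it), a reverse proxy can at least rewrite the
internal endpoint in the Location header to an externally reachable URL:

   # site proxy, external side (hypothetical names)
   server {
       listen 80;
       server_name s3-west.example.com;

       location / {
           proxy_set_header Host $host;
           proxy_pass http://rgw-osaka-internal:8081;
           # rewrite Location headers that carry the internal endpoint
           proxy_redirect http://rgw-osaka-internal:8081/ http://s3-west.example.com/;
       }
   }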

Orit
>
>
>
>>> How do you think?
>>>
>>>
>>>
>>>> Regards,
>>>> Orit
>>>>
>>>>> If the proxy translates Location header, it looks like
>>>>> man-in-the-middle
>>>>> attack.
>>>>>
>>>>>
>>>>> Regards,
>>>>> KIMURA
>>>>>
>>>>>> Regrads,
>>>>>> Orit
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Any thoughts?
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> [1]
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> http://tracker.ceph.com/projects/ceph/wiki/Rgw_new_multisite_configuration
>>>>>>> [2] http://docs.aws.amazon.com/AmazonS3/latest/dev/Redirects.html
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> [3]
>>>>>
>>>>>
>>>>> https://access.redhat.com/documentation/en/red-hat-ceph-storage/2/paged/object-gateway-guide-for-red-hat-enterprise-linux/chapter-8-multi-site#migrating_a_single_site_system_to_multi_site
>>>>>
>>>>>>> ------ FYI ------
>>>>>>> [environments]
>>>>>>> Ceph cluster: RHCS 2.0
>>>>>>> RGW: RHEL 7.2 + RGW v10.2.5
>>>>>>>
>>>>>>> zonegroup east: master
>>>>>>>  zone tokyo
>>>>>>>   endpoint http://node5:80
>>>>>
>>>>>
>>>>>
>>>>>        rgw frontends = "civetweb port=80"
>>>>>        rgw zonegroup = east
>>>>>        rgw zone = tokyo
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>   system user: sync-user
>>>>>>>   user azuma (+ nishi)
>>>>>>>
>>>>>>> zonegroup west: (not master)
>>>>>>>   zone osaka
>>>>>>>   endpoint http://node5:8081
>>>>>
>>>>>
>>>>>
>>>>>        rgw frontends = "civetweb port=8081"
>>>>>        rgw zonegroup = west
>>>>>        rgw zone = osaka
>>>>>
>>>>>>>   system user: sync-user (created with same key as zone tokyo)
>>>>>>>   user nishi
>>>>>>>
>>>>>>>
>>>>>>> [detail of "b)"]
>>>>>>>
>>>>>>> $ s3cmd -c s3nishi.cfg ls
>>>>>>> $ s3cmd -c s3nishi.cfg mb s3://bucket3
>>>>>>> ERROR: S3 error: 404 (NoSuchKey)
>>>>>>>
>>>>>>> ---- rgw.osaka log:
>>>>>>> 2017-02-10 11:54:13.290653 7feac3f7f700  1 ====== starting new
>>>>>>> request
>>>>>>> req=0x7feac3f79710 =====
>>>>>>> 2017-02-10 11:54:13.290709 7feac3f7f700  2 req 50:0.000057::PUT
>>>>>>> /bucket3/::initializing for trans_id =
>>>>>>> tx000000000000000000032-00589d2b55-14a2-osaka
>>>>>>> 2017-02-10 11:54:13.290720 7feac3f7f700 10 rgw api priority: s3=5
>>>>>>> s3website=4
>>>>>>> 2017-02-10 11:54:13.290722 7feac3f7f700 10 host=node5
>>>>>>> 2017-02-10 11:54:13.290733 7feac3f7f700 10 meta>>
>>>>>>> HTTP_X_AMZ_CONTENT_SHA256
>>>>>>> 2017-02-10 11:54:13.290750 7feac3f7f700 10 meta>> HTTP_X_AMZ_DATE
>>>>>>> 2017-02-10 11:54:13.290753 7feac3f7f700 10 x>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>>>>>> 2017-02-10 11:54:13.290755 7feac3f7f700 10 x>>
>>>>>>> x-amz-date:20170210T025413Z
>>>>>>> 2017-02-10 11:54:13.290774 7feac3f7f700 10
>>>>>>> handler=25RGWHandler_REST_Bucket_S3
>>>>>>> 2017-02-10 11:54:13.290775 7feac3f7f700  2 req 50:0.000124:s3:PUT
>>>>>>> /bucket3/::getting op 1
>>>>>>> 2017-02-10 11:54:13.290781 7feac3f7f700 10
>>>>>>> op=27RGWCreateBucket_ObjStore_S3
>>>>>>> 2017-02-10 11:54:13.290782 7feac3f7f700  2 req 50:0.000130:s3:PUT
>>>>>>> /bucket3/:create_bucket:authorizing
>>>>>>> 2017-02-10 11:54:13.290798 7feac3f7f700 10 v4 signature format =
>>>>>>> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
>>>>>>> 2017-02-10 11:54:13.290804 7feac3f7f700 10 v4 credential format =
>>>>>>> ZY6EJUVB38SCOWBELERQ/20170210/west/s3/aws4_request
>>>>>>> 2017-02-10 11:54:13.290806 7feac3f7f700 10 access key id =
>>>>>>> ZY6EJUVB38SCOWBELERQ
>>>>>>> 2017-02-10 11:54:13.290814 7feac3f7f700 10 credential scope =
>>>>>>> 20170210/west/s3/aws4_request
>>>>>>> 2017-02-10 11:54:13.290834 7feac3f7f700 10 canonical headers format =
>>>>>>> host:node5:8081
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>>>>>> x-amz-date:20170210T025413Z
>>>>>>>
>>>>>>> 2017-02-10 11:54:13.290836 7feac3f7f700 10 delaying v4 auth
>>>>>>> 2017-02-10 11:54:13.290839 7feac3f7f700  2 req 50:0.000187:s3:PUT
>>>>>>> /bucket3/:create_bucket:normalizing buckets and tenants
>>>>>>> 2017-02-10 11:54:13.290841 7feac3f7f700 10 s->object=<NULL>
>>>>>>> s->bucket=bucket3
>>>>>>> 2017-02-10 11:54:13.290843 7feac3f7f700  2 req 50:0.000191:s3:PUT
>>>>>>> /bucket3/:create_bucket:init permissions
>>>>>>> 2017-02-10 11:54:13.290844 7feac3f7f700  2 req 50:0.000192:s3:PUT
>>>>>>> /bucket3/:create_bucket:recalculating target
>>>>>>> 2017-02-10 11:54:13.290845 7feac3f7f700  2 req 50:0.000193:s3:PUT
>>>>>>> /bucket3/:create_bucket:reading permissions
>>>>>>> 2017-02-10 11:54:13.290846 7feac3f7f700  2 req 50:0.000195:s3:PUT
>>>>>>> /bucket3/:create_bucket:init op
>>>>>>> 2017-02-10 11:54:13.290847 7feac3f7f700  2 req 50:0.000196:s3:PUT
>>>>>>> /bucket3/:create_bucket:verifying op mask
>>>>>>> 2017-02-10 11:54:13.290849 7feac3f7f700  2 req 50:0.000197:s3:PUT
>>>>>>> /bucket3/:create_bucket:verifying op permissions
>>>>>>> 2017-02-10 11:54:13.292027 7feac3f7f700  2 req 50:0.001374:s3:PUT
>>>>>>> /bucket3/:create_bucket:verifying op params
>>>>>>> 2017-02-10 11:54:13.292035 7feac3f7f700  2 req 50:0.001383:s3:PUT
>>>>>>> /bucket3/:create_bucket:pre-executing
>>>>>>> 2017-02-10 11:54:13.292037 7feac3f7f700  2 req 50:0.001385:s3:PUT
>>>>>>> /bucket3/:create_bucket:executing
>>>>>>> 2017-02-10 11:54:13.292072 7feac3f7f700 10 payload request hash =
>>>>>>> d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>>>>>> 2017-02-10 11:54:13.292083 7feac3f7f700 10 canonical request = PUT
>>>>>>> /bucket3/
>>>>>>>
>>>>>>> host:node5:8081
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>>>>>> x-amz-date:20170210T025413Z
>>>>>>>
>>>>>>> host;x-amz-content-sha256;x-amz-date
>>>>>>> d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>>>>>> 2017-02-10 11:54:13.292084 7feac3f7f700 10 canonical request hash =
>>>>>>> 8faa5ec57f69dd7b54baa72c157b6d63f8c7db309a34a1e2a10ad6f2f585cd02
>>>>>>> 2017-02-10 11:54:13.292087 7feac3f7f700 10 string to sign =
>>>>>>> AWS4-HMAC-SHA256
>>>>>>> 20170210T025413Z
>>>>>>> 20170210/west/s3/aws4_request
>>>>>>> 8faa5ec57f69dd7b54baa72c157b6d63f8c7db309a34a1e2a10ad6f2f585cd02
>>>>>>> 2017-02-10 11:54:13.292118 7feac3f7f700 10 date_k        =
>>>>>>> 454f3ad73c095e73d2482809d7a6ec8af3c4e900bc83e0a9663ea5fc336cad95
>>>>>>> 2017-02-10 11:54:13.292131 7feac3f7f700 10 region_k      =
>>>>>>> e0caaddbb30ebc25840b6aaac3979d1881a14b8e9a0dfea43d8a006c8e0e504d
>>>>>>> 2017-02-10 11:54:13.292144 7feac3f7f700 10 service_k     =
>>>>>>> 59d6c9158e9e3c6a1aa97ee15859d2ef9ad9c64209b63f093109844f0c7f6c04
>>>>>>> 2017-02-10 11:54:13.292171 7feac3f7f700 10 signing_k     =
>>>>>>> 4dcbccd9c3da779d32758a645644c66a56f64d642eaeb39eec8e0b2facba7805
>>>>>>> 2017-02-10 11:54:13.292197 7feac3f7f700 10 signature_k   =
>>>>>>> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
>>>>>>> 2017-02-10 11:54:13.292198 7feac3f7f700 10 new signature =
>>>>>>> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
>>>>>>> 2017-02-10 11:54:13.292199 7feac3f7f700 10
>>>>>>> -----------------------------
>>>>>>> Verifying signatures
>>>>>>> 2017-02-10 11:54:13.292199 7feac3f7f700 10 Signature     =
>>>>>>> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
>>>>>>> 2017-02-10 11:54:13.292200 7feac3f7f700 10 New Signature =
>>>>>>> 989404f270efd800843cb19183c53dc457cf96b9ea2393ba5d554a42ffc22f76
>>>>>>> 2017-02-10 11:54:13.292200 7feac3f7f700 10
>>>>>>> -----------------------------
>>>>>>> 2017-02-10 11:54:13.292202 7feac3f7f700 10 v4 auth ok
>>>>>>> 2017-02-10 11:54:13.292238 7feac3f7f700 10 create bucket location
>>>>>>> constraint: west
>>>>>>> 2017-02-10 11:54:13.292256 7feac3f7f700 10 cache get:
>>>>>>> name=osaka.rgw.data.root+bucket3 : type miss (requested=22, cached=0)
>>>>>>> 2017-02-10 11:54:13.293369 7feac3f7f700 10 cache put:
>>>>>>> name=osaka.rgw.data.root+bucket3 info.flags=0
>>>>>>> 2017-02-10 11:54:13.293374 7feac3f7f700 10 moving
>>>>>>> osaka.rgw.data.root+bucket3 to cache LRU end
>>>>>>> 2017-02-10 11:54:13.293380 7feac3f7f700  0 sending create_bucket
>>>>>>> request
>>>>>>> to
>>>>>>> master zonegroup
>>>>>>> 2017-02-10 11:54:13.293401 7feac3f7f700 10 get_canon_resource():
>>>>>>> dest=/bucket3/
>>>>>>> 2017-02-10 11:54:13.293403 7feac3f7f700 10 generated canonical
>>>>>>> header:
>>>>>>> PUT
>>>>>>>
>>>>>>>
>>>>>>> Fri Feb 10 02:54:13 2017
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>>>>>> /bucket3/
>>>>>>> 2017-02-10 11:54:13.299113 7feac3f7f700 10 receive_http_header
>>>>>>> 2017-02-10 11:54:13.299117 7feac3f7f700 10 received header:HTTP/1.1
>>>>>>> 404
>>>>>>> Not
>>>>>>> Found
>>>>>>> 2017-02-10 11:54:13.299119 7feac3f7f700 10 receive_http_header
>>>>>>> 2017-02-10 11:54:13.299120 7feac3f7f700 10 received
>>>>>>> header:x-amz-request-id:
>>>>>>> tx000000000000000000005-00589d2b55-1416-tokyo
>>>>>>> 2017-02-10 11:54:13.299130 7feac3f7f700 10 receive_http_header
>>>>>>> 2017-02-10 11:54:13.299131 7feac3f7f700 10 received
>>>>>>> header:Content-Length:
>>>>>>> 175
>>>>>>> 2017-02-10 11:54:13.299133 7feac3f7f700 10 receive_http_header
>>>>>>> 2017-02-10 11:54:13.299133 7feac3f7f700 10 received
>>>>>>> header:Accept-Ranges:
>>>>>>> bytes
>>>>>>> 2017-02-10 11:54:13.299148 7feac3f7f700 10 receive_http_header
>>>>>>> 2017-02-10 11:54:13.299149 7feac3f7f700 10 received
>>>>>>> header:Content-Type:
>>>>>>> application/xml
>>>>>>> 2017-02-10 11:54:13.299150 7feac3f7f700 10 receive_http_header
>>>>>>> 2017-02-10 11:54:13.299150 7feac3f7f700 10 received header:Date: Fri,
>>>>>>> 10
>>>>>>> Feb
>>>>>>> 2017 02:54:13 GMT
>>>>>>> 2017-02-10 11:54:13.299152 7feac3f7f700 10 receive_http_header
>>>>>>> 2017-02-10 11:54:13.299152 7feac3f7f700 10 received header:
>>>>>>> 2017-02-10 11:54:13.299248 7feac3f7f700  2 req 50:0.008596:s3:PUT
>>>>>>> /bucket3/:create_bucket:completing
>>>>>>> 2017-02-10 11:54:13.299319 7feac3f7f700  2 req 50:0.008667:s3:PUT
>>>>>>> /bucket3/:create_bucket:op status=-2
>>>>>>> 2017-02-10 11:54:13.299321 7feac3f7f700  2 req 50:0.008670:s3:PUT
>>>>>>> /bucket3/:create_bucket:http status=404
>>>>>>> 2017-02-10 11:54:13.299324 7feac3f7f700  1 ====== req done
>>>>>>> req=0x7feac3f79710 op status=-2 http_status=404 ======
>>>>>>> 2017-02-10 11:54:13.299349 7feac3f7f700  1 civetweb: 0x7feb2c02d340:
>>>>>>> 192.168.20.15 - - [10/Feb/2017:11:54:13 +0900] "PUT /bucket3/
>>>>>>> HTTP/1.1"
>>>>>>> 404
>>>>>>> 0 - -
>>>>>>>
>>>>>>>
>>>>>>> ---- rgw.tokyo log:
>>>>>>> 2017-02-10 11:54:13.297852 7f56076c6700  1 ====== starting new
>>>>>>> request
>>>>>>> req=0x7f56076c0710 =====
>>>>>>> 2017-02-10 11:54:13.297887 7f56076c6700  2 req 5:0.000035::PUT
>>>>>>> /bucket3/::initializing for trans_id =
>>>>>>> tx000000000000000000005-00589d2b55-1416-tokyo
>>>>>>> 2017-02-10 11:54:13.297895 7f56076c6700 10 rgw api priority: s3=5
>>>>>>> s3website=4
>>>>>>> 2017-02-10 11:54:13.297897 7f56076c6700 10 host=node5
>>>>>>> 2017-02-10 11:54:13.297906 7f56076c6700 10 meta>>
>>>>>>> HTTP_X_AMZ_CONTENT_SHA256
>>>>>>> 2017-02-10 11:54:13.297912 7f56076c6700 10 x>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>>>>>> 2017-02-10 11:54:13.297929 7f56076c6700 10
>>>>>>> handler=25RGWHandler_REST_Bucket_S3
>>>>>>> 2017-02-10 11:54:13.297937 7f56076c6700  2 req 5:0.000086:s3:PUT
>>>>>>> /bucket3/::getting op 1
>>>>>>> 2017-02-10 11:54:13.297946 7f56076c6700 10
>>>>>>> op=27RGWCreateBucket_ObjStore_S3
>>>>>>> 2017-02-10 11:54:13.297947 7f56076c6700  2 req 5:0.000096:s3:PUT
>>>>>>> /bucket3/:create_bucket:authorizing
>>>>>>> 2017-02-10 11:54:13.297969 7f56076c6700 10 get_canon_resource():
>>>>>>> dest=/bucket3/
>>>>>>> 2017-02-10 11:54:13.297976 7f56076c6700 10 auth_hdr:
>>>>>>> PUT
>>>>>>>
>>>>>>>
>>>>>>> Fri Feb 10 02:54:13 2017
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> x-amz-content-sha256:d8f96fbdf666b991d183a7f5cc7fcf6eaa10934786f67575bda3f734a772464a
>>>>>>> /bucket3/
>>>>>>> 2017-02-10 11:54:13.298023 7f56076c6700 10 cache get:
>>>>>>> name=default.rgw.users.uid+nishi : type miss (requested=6, cached=0)
>>>>>>> 2017-02-10 11:54:13.298975 7f56076c6700 10 cache put:
>>>>>>> name=default.rgw.users.uid+nishi info.flags=0
>>>>>>> 2017-02-10 11:54:13.298986 7f56076c6700 10 moving
>>>>>>> default.rgw.users.uid+nishi to cache LRU end
>>>>>>> 2017-02-10 11:54:13.298991 7f56076c6700  0 User lookup failed!
>>>>>>> 2017-02-10 11:54:13.298993 7f56076c6700 10 failed to authorize
>>>>>>> request
>>>>>>> 2017-02-10 11:54:13.299077 7f56076c6700  2 req 5:0.001225:s3:PUT
>>>>>>> /bucket3/:create_bucket:op status=0
>>>>>>> 2017-02-10 11:54:13.299086 7f56076c6700  2 req 5:0.001235:s3:PUT
>>>>>>> /bucket3/:create_bucket:http status=404
>>>>>>> 2017-02-10 11:54:13.299089 7f56076c6700  1 ====== req done
>>>>>>> req=0x7f56076c0710 op status=0 http_status=404 ======
>>>>>>> 2017-02-10 11:54:13.299426 7f56076c6700  1 civetweb: 0x7f56200048c0:
>>>>>>> 192.168.20.15 - - [10/Feb/2017:11:54:13 +0900] "PUT /bucket3/
>>>>>>> HTTP/1.1"
>>>>>>> 404
>>>>>>> 0 - -
>
>
> --
> KIMURA Osamu / 木村 修
> Engineering Department, Storage Development Division,
> Data Center Platform Business Unit, FUJITSU LIMITED

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: rgw: multiple zonegroups in single realm
  2017-02-15  7:53             ` Orit Wasserman
@ 2017-02-23 11:34               ` KIMURA Osamu
  2017-02-24  4:43                 ` KIMURA Osamu
  0 siblings, 1 reply; 11+ messages in thread
From: KIMURA Osamu @ 2017-02-23 11:34 UTC (permalink / raw)
  To: Orit Wasserman; +Cc: ceph-devel

Sorry for the late reply.
I opened several tracker issues...

On 2017/02/15 16:53, Orit Wasserman wrote:
> On Wed, Feb 15, 2017 at 2:26 AM, KIMURA Osamu
> <kimura.osamu@jp.fujitsu.com> wrote:
>> Comments inline...
>>
>>
>> On 2017/02/14 23:54, Orit Wasserman wrote:
>>>
>>> On Mon, Feb 13, 2017 at 12:57 PM, KIMURA Osamu
>>> <kimura.osamu@jp.fujitsu.com> wrote:
>>>>
>>>> Hi Orit,
>>>>
>>>> I almost agree, with some exceptions...
>>>>
>>>>
>>>> On 2017/02/13 18:42, Orit Wasserman wrote:
>>>>>
>>>>>
>>>>> On Mon, Feb 13, 2017 at 6:44 AM, KIMURA Osamu
>>>>> <kimura.osamu@jp.fujitsu.com> wrote:
>>>>>>
>>>>>>
>>>>>> Hi Orit,
>>>>>>
>>>>>> Thanks for your comments.
>>>>>> I believe I'm not confusing, but probably my thought may not be well
>>>>>> described...
>>>>>>
>>>>> :)
>>>>>>
>>>>>> On 2017/02/12 19:07, Orit Wasserman wrote:
>>>>>>>
>>>>>>> On Fri, Feb 10, 2017 at 10:21 AM, KIMURA Osamu
>>>>>>> <kimura.osamu@jp.fujitsu.com> wrote:
>>>>>>>>
>>>>>>>> Hi Cephers,
>>>>>>>>
>>>>>>>> I'm trying to configure RGWs with multiple zonegroups within single
>>>>>>>> realm.
>>>>>>>> The intention is that some buckets to be replicated and others to
>>>>>>>> stay
>>>>>>>> locally.
>>>>>>>
>>>>>>>
>>>>>>> If you are not replicating than you don't need to create any zone
>>>>>>> configuration,
>>>>>>> a default zonegroup and zone are created automatically
>>>>>>>
>>>>>>>> e.g.:
>>>>>>>>  realm: fj
>>>>>>>>   zonegroup east: zone tokyo (not replicated)
>>>>>>>
>>>>>>>
>>>>>>> no need if not replicated
>>>>>>>>
>>>>>>>>   zonegroup west: zone osaka (not replicated)
>>>>>>>
>>>>>>>
>>>>>>> same here
>>>>>>>>
>>>>>>>>   zonegroup jp:   zone jp-east + jp-west (replicated)
>>>>>>
>>>>>>
>>>>>> The "east" and "west" zonegroups are just renamed from "default"
>>>>>> as described in RHCS document [3].
>>>>>
>>>>>
>>>>> Why do you need two zonegroups (or 3)?
>>>>>
>>>>> At the moment multisitev2 replicated automatically all zones in the
>>>>> realm except "default" zone.
>>>>> The moment you add a new zone (could be part of another zonegroup) it
>>>>> will be replicated to the other zones.
>>>>> It seems you don't want or need this.
>>>>> we are working on allowing more control on the replication but that
>>>>> will be in the future.
>>>>>
>>>>>> We may not need to rename them, but at least api_name should be
>>>>>> altered.
>>>>>
>>>>>
>>>>> You can change the api_name for the "default" zone.
>>>>>
>>>>>> In addition, I'm not sure what happens if 2 "default" zones/zonegroups
>>>>>> co-exist in same realm.
>>>>>
>>>>>
>>>>>
>>>>> Realm shares all the zones/zonegroups configuration,
>>>>> it means it is the same zone/zonegroup.
>>>>> For "default" it means not zone/zonegroup configured, we use it to run
>>>>> radosgw without any
>>>>> zone/zonegroup specified in the configuration.
>>>>
>>>>
>>>>
>>>> I didn't think "default" as exception of zonegroup. :-P
>>>> Actually, I must specify api_name in default zonegroup setting.
>>>>
>>>> I interpret "default" zone/zonegroup is out of realm. Is it correct?
>>>> I think it means namespace for bucket or user is not shared with
>>>> "default".
>>>> At present, I can't make decision to separate namespaces, but it may be
>>>> best choice with current code.

Unfortunately, if "api_name" is changed for "default" zonegroup,
the "default" zonegroup is set as a member of the realm.
See [19040-1]

It means no major difference from my first provided configuration.
(except reduction of messy error messages [15776] )

In addition, the "api_name" can't be changed with "radosgw-admin
zonegroup set" command if no realm has been defined.
There is no convenient way to change "api_name".

[19040-1]: http://tracker.ceph.com/issues/19040#note-1
[15776]: http://tracker.ceph.com/issues/15776
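
For reference, the round-trip would look roughly like this (only a
sketch; it assumes a realm is already defined, since "zonegroup set"
cannot change "api_name" otherwise, and flags may differ by version):

  $ radosgw-admin zonegroup get --rgw-zonegroup=default > zg.json
  $ vi zg.json    # edit "api_name" (and "hostnames" if desired)
  $ radosgw-admin zonegroup set --rgw-zonegroup=default < zg.json
  $ radosgw-admin period update --commit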

>>>>>>>> To evaluate such configuration, I tentatively built multiple
>>>>>>>> zonegroups
>>>>>>>> (east, west) on a ceph cluster. I barely succeed to configure it, but
>>>>>>>> some concerns exist.
>>>>>>>>
>>>>>>> I think you just need one zonegroup with two zones the other are not
>>>>>>> needed
>>>>>>> Also each gateway can handle only a single zone (rgw_zone
>>>>>>> configuration parameter)
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> This is just a tentative one to confirm the behavior of multiple
>>>>>> zonegroups
>>>>>> due to limitation of our current equipment.
>>>>>> The "east" zonegroup was renamed from "default", and another "west"
>>>>>> zonegroup
>>>>>> was created. Of course I specified both rgw_zonegroup and rgw_zone
>>>>>> parameters
>>>>>> for each RGW instance. (see -FYI- section bellow)
>>>>>>
>>>>> Can I suggest starting with a more simple setup:
>>>>> Two zonegroups,  the first will have two zones and the second will
>>>>> have one zone.
>>>>> It is simper to configure and in case of problems to debug.
>>>>
>>>>
>>>>
>>>> I would try with such configuration IF time permitted.

I tried, but it doesn't seem simpler :P
It ends up consisting of 3 zonegroups and 4 zones, because I want to
keep the default zone/zonegroup: the target system already has a huge
amount of objects.
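
Concretely, with the suggested two zonegroups (2 zones + 1 zone) plus
the default zone/zonegroup I keep, the layout ends up roughly like this
(names other than "default" are placeholders):

   realm fj
    zonegroup default: zone default        (kept as-is, existing data)
    zonegroup A:       zone a-1 + zone a-2 (replicated)
    zonegroup B:       zone b-1            (not replicated)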


>>>>>>>> a) User accounts are not synced among zonegroups

I opened 2 issues [19040] [19041]

[19040]: http://tracker.ceph.com/issues/19040
[19041]: http://tracker.ceph.com/issues/19041


>>>>>>>> I'm not sure if this is a issue, but the blueprint [1] stated a
>>>>>>>> master
>>>>>>>> zonegroup manages user accounts as metadata like buckets.
>>>>>>>>
>>>>>>> You have a lot of confusion with the zones and zonegroups.
>>>>>>> A zonegroup is just a group of zones that are sharing the same data
>>>>>>> (i.e. replication between them)
>>>>>>> A zone represent a geographical location (i.e. one ceph cluster)
>>>>>>>
>>>>>>> We have a meta master zone (the master zone in the master zonegroup),
>>>>>>> this meta master is responible on
>>>>>>> replicating users and byckets meta operations.
>>>>>>
>>>>>>
>>>>>> I know it.
>>>>>> But the master zone in the master zonegroup manages bucket meta
>>>>>> operations including buckets in other zonegroups. It means
>>>>>> the master zone in the master zonegroup must have permission to
>>>>>> handle buckets meta operations, i.e., must have same user accounts
>>>>>> as other zonegroups.
>>>>>
>>>>>
>>>>> Again zones not zonegroups,  it needs to have an admin user with the
>>>>> same credentials in all the other zones.
>>>>>
>>>>>> This is related to next issue b). If the master zone in the master
>>>>>> zonegroup doesn't have user accounts for other zonegroups, all the
>>>>>> buckets meta operations are rejected.
>>>>>>
>>>>>
>>>>> Correct
>>>>>
>>>>>> In addition, it may be overexplanation though, user accounts are
>>>>>> sync'ed to other zones within same zonegroup if the accounts are
>>>>>> created on master zone of the zonegroup. On the other hand,
>>>>>> I found today, user accounts are not sync'ed to master if the
>>>>>> accounts are created on slave(?) zone in the zonegroup. It seems
>>>>>> asymmetric behavior.
>>>>>
>>>>>
>>>>>
>>>>> This requires investigation,  can you open a tracker issue and we will
>>>>> look into it.
>>>>>
>>>>>> I'm not sure if the same behavior is caused by Admin REST API instead
>>>>>> of radosgw-admin.
>>>>>>
>>>>>
>>>>> It doesn't matter both use almost the same code
>>>>>
>>>>>>
>>>>>>>> b) Bucket creation is rejected if master zonegroup doesn't have the
>>>>>>>> account
>>>>>>>>
>>>>>>>> e.g.:
>>>>>>>>   1) Configure east zonegroup as master.
>>>>>>>
>>>>>>> you need a master zoen
>>>>>>>>
>>>>>>>>   2) Create a user "nishi" on west zonegroup (osaka zone) using
>>>>>>>> radosgw-admin.
>>>>>>>>   3) Try to create a bucket on west zonegroup by user nishi.
>>>>>>>>      -> ERROR: S3 error: 404 (NoSuchKey)
>>>>>>>>   4) Create user nishi on east zonegroup with same key.
>>>>>>>>   5) Succeed to create a bucket on west zonegroup by user nishi.
>>>>>>>
>>>>>>> You are confusing zonegroup and zone here again ...
>>>>>>>
>>>>>>> you should notice that when you are using radosgw-admin command
>>>>>>> without providing zonegorup and/or zone info (--rgw-zonegroup=<zg> and
>>>>>>> --rgw-zone=<zone>) it will use the default zonegroup and zone.
>>>>>>>
>>>>>>> User is stored per zone and you need to create an admin users in both
>>>>>>> zones
>>>>>>> for more documentation see:
>>>>>>> http://docs.ceph.com/docs/master/radosgw/multisite/
>>>>>>
>>>>>>
>>>>>> I always specify --rgw-zonegroup and --rgw-zone for radosgw-admin
>>>>>> command.
>>>>>>
>>>>> That is great!
>>>>> You can also onfigure default zone and zonegroup
>>>>>
>>>>>> The issue is that any buckets meta operations are rejected when the
>>>>>> master
>>>>>> zone in the master zonegroup doesn't have the user account of other
>>>>>> zonegroups.
>>>>>>
>>>>> Correct
>>>>>>
>>>>>>
>>>>>> I try to describe details again:
>>>>>> 1) Create fj realm as default.
>>>>>> 2) Rename default zonegroup/zone to east/tokyo and mark as default.
>>>>>> 3) Create west/osaka zonegroup/zone.
>>>>>> 4) Create system user sync-user on both tokyo and osaka zones with same
>>>>>> key.
>>>>>> 5) Start 2 RGW instances for tokyo and osaka zones.
>>>>>> 6) Create azuma user account on tokyo zone in east zonegroup.
>>>>>> 7) Create /bucket1 through tokyo zone endpoint with azuma account.
>>>>>>    -> No problem.
>>>>>> 8) Create nishi user account on osaka zone in west zonegroup.
>>>>>> 9) Try to create a bucket /bucket2 through osaka zone endpoint with
>>>>>> azuma
>>>>>> account.
>>>>>>    -> respond "ERROR: S3 error: 403 (InvalidAccessKeyId)" as expected.
>>>>>> 10) Try to create a bucket /bucket3 through osaka zone endpoint with
>>>>>> nishi
>>>>>> account.
>>>>>>    -> respond "ERROR: S3 error: 404 (NoSuchKey)"
>>>>>>    Detailed log is shown in -FYI- section bellow.
>>>>>>    The RGW for osaka zone verify the signature and forward the request
>>>>>>    to tokyo zone endpoint (= the master zone in the master zonegroup).
>>>>>>    Then, the RGW for tokyo zone rejected the request by unauthorized
>>>>>> access.
>>>>>>
>>>>>
>>>>> This seems a bug, can you open a issue?

I opened 2 issues [19042] [19043]

[19042]: http://tracker.ceph.com/issues/19042
[19043]: http://tracker.ceph.com/issues/19043

>>>>>>>> c) How to restrict to place buckets on specific zonegroups?
>>>>>>>
>>>>>>> you probably mean zone.
>>>>>>> There is ongoing work to enable/disable sync per bucket
>>>>>>> https://github.com/ceph/ceph/pull/10995
>>>>>>> with this you can create a bucket on a specific zone and it won't be
>>>>>>> replicated to another zone
>>>>>>
>>>>>>
>>>>>> My thought means zonegroup (not zone) as described above.
>>>>>
>>>>> But it should be zone ..
>>>>> Zone represent a geographical location , it represent a single ceph
>>>>> cluster.
>>>>> Bucket is created in a zone (a single ceph cluster) and it stored the
>>>>> zone
>>>>> id.
>>>>> The zone represent in which ceph cluster the bucket was created.
>>>>>
>>>>> A zonegroup just a logical collection of zones, in many case you only
>>>>> need a single zonegroup.
>>>>> You should use zonegroups if you have lots of zones and it simplifies
>>>>> your configuration.
>>>>> You can move zones between zonegroups (it is not tested or supported
>>>>> ...).
>>>>>
>>>>>> With current code, buckets are sync'ed to all zones within a zonegroup,
>>>>>> no way to choose zone to place specific buckets.
>>>>>> But this change may help to configure our original target.
>>>>>>
>>>>>> It seems we need more discussion about the change.
>>>>>> I prefer default behavior is associated with user account (per SLA).
>>>>>> And attribution of each bucket should be able to be changed via REST
>>>>>> API depending on their permission, rather than radosgw-admin command.
>>>>>>
>>>>>
>>>>> I think that will be very helpful , we need to understand what are the
>>>>> requirement and the usage.
>>>>> Please comment on the PR or even open a feature request and we can
>>>>> discuss it more in detail.
>>>>>
>>>>>> Anyway, I'll examine more details.
>>>>>>
>>>>>>>> If user accounts would synced future as the blueprint, all the
>>>>>>>> zonegroups
>>>>>>>> contain same account information. It means any user can create
>>>>>>>> buckets
>>>>>>>> on
>>>>>>>> any zonegroups. If we want to permit to place buckets on a replicated
>>>>>>>> zonegroup for specific users, how to configure?
>>>>>>>>
>>>>>>>> If user accounts will not synced as current behavior, we can restrict
>>>>>>>> to place buckets on specific zonegroups. But I cannot find best way
>>>>>>>> to
>>>>>>>> configure the master zonegroup.
>>>>>>>>
>>>>>>>>
>>>>>>>> d) Operations for other zonegroup are not redirected
>>>>>>>>
>>>>>>>> e.g.:
>>>>>>>>   1) Create bucket4 on west zonegroup by nishi.
>>>>>>>>   2) Try to access bucket4 from endpoint on east zonegroup.
>>>>>>>>      -> Respond "301 (Moved Permanently)",
>>>>>>>>         but no redirected Location header is returned.
>>>>>>>>
>>>>>>>
>>>>>>> It could be a bug please open a tracker issue for that in
>>>>>>> tracker.ceph.com for RGW component with all the configuration
>>>>>>> information,
>>>>>>> logs and the version of ceph and radosgw you are using.
>>>>>>
>>>>>>
>>>>>> I will open it, but it may be issued as "Feature" instead of "Bug"
>>>>>> depending on following discussion.

I opened an issue [19052] as "Feature" instead of "Bug".

[19052]: http://tracker.ceph.com/issues/19052

I suggested to use "hostnames" field in zonegroup configuration
for this purpose. I feel it is similar to s3 website feature.
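
For example, something along these lines in the zonegroup JSON
(the hostname is made up; only the relevant fields are shown):

   {
       "name": "west",
       "api_name": "west",
       "hostnames": ["s3-west.example.com"],
       ...
   }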

>>>>>>>> It seems current RGW doesn't follows S3 specification [2].
>>>>>>>> To implement this feature, probably we need to define another
>>>>>>>> endpoint
>>>>>>>> on each zonegroup for client accessible URL. RGW may placed behind
>>>>>>>> proxy,
>>>>>>>> thus the URL may be different from endpoint URLs for replication.
>>>>>>>>
>>>>>>>
>>>>>>> The zone and zonegroup endpoints are not used directly by the user
>>>>>>> with
>>>>>>> a
>>>>>>> proxy.
>>>>>>> The user get a URL pointing to the proxy and the proxy will need to be
>>>>>>> configured to point the rgw urls/IPs , you can have several radosgw
>>>>>>> running.
>>>>>>> See more
>>>>>>>
>>>>>>> https://access.redhat.com/documentation/en/red-hat-ceph-storage/2/paged/object-gateway-guide-for-red-hat-enterprise-linux/chapter-2-configuration
>>>>>>
>>>>>> Does it mean the proxy has responsibility to alter "Location" header as
>>>>>> redirected URL?
>>>>>
>>>>> No
>>>>>
>>>>>> Basically, RGW can respond only the endpoint described in zonegroup
>>>>>> setting as redirected URL on Location header. But client may not access
>>>>>> the endpoint. Someone must translate the Location header to client
>>>>>> accessible URL.
>>>>>
>>>>> Both locations will have a proxy. This means all communication is done
>>>>> through proxies.
>>>>> The endpoint URL should be an external URL and the proxy on the new
>>>>> location will translate it to the internal one.
>>>>
>>>>
>>>>
>>>> Our assumption is:
>>>>
>>>> End-user client --- internet --- proxy ---+--- RGW site-A
>>>>                                           |
>>>>                                           | (dedicated line or VPN)
>>>>                                           |
>>>> End-user client --- internet --- proxy ---+--- RGW site-B
>>>>
>>>> RGWs can't access through front of proxies.
>>>> In this case, endpoints for replication are in backend network of
>>>> proxies.
>>>
>>>
>>> do you have several radosgw instances in each site?
>>
>>
>> Yes. Probably three or more instances per a site.
>> Actual system will have same number of physical servers as RGW instances.
>> We already tested with multiple endpoints per a zone within a zonegroup.
>
> Good to hear :)
> As for the redirect message in your case it's should to be handled by
> the proxy and not by the client browser
> as it cannot access the internal vpn network. The endpoints url should
> be the url in the internal network.

I don't agree.
It would require more network bandwidth between the sites.
I think the "hostnames" field can provide a client-accessible URL
that is in front of the proxy. That seems sufficient.
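
In other words, a request for a bucket that lives in the west
zonegroup, received by an east endpoint, could then be answered with
something like (hostname is hypothetical):

   HTTP/1.1 301 Moved Permanently
   Location: http://s3-west.example.com/bucket4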


In addition to the above, I opened 2 issues [18800] [19053] regarding
the Swift API, which are not related to this discussion.

[18800]: http://tracker.ceph.com/issues/18800
[19053]: http://tracker.ceph.com/issues/19053


Regards,
KIMURA

> Orit
>>
>>>> How do you think?
>>>>
>>>>> Regards,
>>>>> Orit
>>>>>
>>>>>> If the proxy translates Location header, it looks like
>>>>>> man-in-the-middle
>>>>>> attack.
>>>>>>
>>>>>> Regards,
>>>>>> KIMURA
>>>>>>
>>>>>>> Regrads,
>>>>>>> Orit
>>>>>>>>
>>>>>>>> Any thoughts?
>>>>>>>
>>>>>>>>
>>>>>>>> [1] http://tracker.ceph.com/projects/ceph/wiki/Rgw_new_multisite_configuration
>>>>>>>> [2] http://docs.aws.amazon.com/AmazonS3/latest/dev/Redirects.html
>>>>>> [3] https://access.redhat.com/documentation/en/red-hat-ceph-storage/2/paged/object-gateway-guide-for-red-hat-enterprise-linux/chapter-8-multi-site#migrating_a_single_site_system_to_multi_site
>>>>>>
>>>>>>>> ------ FYI ------
>>>>>>>> [environments]
>>>>>>>> Ceph cluster: RHCS 2.0
>>>>>>>> RGW: RHEL 7.2 + RGW v10.2.5
>>>>>>>>
>>>>>>>> zonegroup east: master
>>>>>>>>  zone tokyo
>>>>>>>>   endpoint http://node5:80
>>>>>>        rgw frontends = "civetweb port=80"
>>>>>>        rgw zonegroup = east
>>>>>>        rgw zone = tokyo
>>>>>>>>
>>>>>>>>   system user: sync-user
>>>>>>>>   user azuma (+ nishi)
>>>>>>>>
>>>>>>>> zonegroup west: (not master)
>>>>>>>>   zone osaka
>>>>>>>>   endpoint http://node5:8081
>>>>>>        rgw frontends = "civetweb port=8081"
>>>>>>        rgw zonegroup = west
>>>>>>        rgw zone = osaka
>>>>>>
>>>>>>>>   system user: sync-user (created with same key as zone tokyo)
>>>>>>>>   user nishi

-- 
KIMURA Osamu / 木村 修
Engineering Department, Storage Development Division,
Data Center Platform Business Unit, FUJITSU LIMITED

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: rgw: multiple zonegroups in single realm
  2017-02-23 11:34               ` KIMURA Osamu
@ 2017-02-24  4:43                 ` KIMURA Osamu
  2017-02-24  7:22                   ` Orit Wasserman
  0 siblings, 1 reply; 11+ messages in thread
From: KIMURA Osamu @ 2017-02-24  4:43 UTC (permalink / raw)
  To: Orit Wasserman; +Cc: ceph-devel

Hi Orit,

Thanks for your interest in this issue.
I have one more question.

I assumed "endpoints" of a zonegroup would be used for synchronization
of metadata. But, to extent that I read current Jewel code, it may
be used only for redirection.
(-ERR_PERMANENT_REDIRECT || -ERR_WEBSITE_REDIRECT)
It seems metadata synchronization is sent to endpoint of master zone
in each zonegroup (probably it has not been equipped for secondary
zonegroup).

Is it correct?
If so, we can set the endpoints of each zonegroup to a client-accessible
URL (i.e., in front of the proxy), while the endpoints of each zone
point to the internal ones.
But I still prefer to use the "hostnames" field for this purpose.


Regards,
KIMURA

On 2017/02/23 20:34, KIMURA Osamu wrote:
> Sorry to late.
> I opened several tracker issues...
>
> On 2017/02/15 16:53, Orit Wasserman wrote:
>> On Wed, Feb 15, 2017 at 2:26 AM, KIMURA Osamu
>> <kimura.osamu@jp.fujitsu.com> wrote:
>>> Comments inline...
>>>
>>>
>>> On 2017/02/14 23:54, Orit Wasserman wrote:
>>>>
>>>> On Mon, Feb 13, 2017 at 12:57 PM, KIMURA Osamu
>>>> <kimura.osamu@jp.fujitsu.com> wrote:
>>>>>
>>>>> Hi Orit,
>>>>>
>>>>> I almost agree, with some exceptions...
>>>>>
>>>>>
>>>>> On 2017/02/13 18:42, Orit Wasserman wrote:
>>>>>>
>>>>>>
>>>>>> On Mon, Feb 13, 2017 at 6:44 AM, KIMURA Osamu
>>>>>> <kimura.osamu@jp.fujitsu.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>> Hi Orit,
>>>>>>>
>>>>>>> Thanks for your comments.
>>>>>>> I believe I'm not confusing, but probably my thought may not be well
>>>>>>> described...
>>>>>>>
>>>>>> :)
>>>>>>>
>>>>>>> On 2017/02/12 19:07, Orit Wasserman wrote:
>>>>>>>>
>>>>>>>> On Fri, Feb 10, 2017 at 10:21 AM, KIMURA Osamu
>>>>>>>> <kimura.osamu@jp.fujitsu.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi Cephers,
>>>>>>>>>
>>>>>>>>> I'm trying to configure RGWs with multiple zonegroups within single
>>>>>>>>> realm.
>>>>>>>>> The intention is that some buckets to be replicated and others to
>>>>>>>>> stay
>>>>>>>>> locally.
>>>>>>>>
>>>>>>>>
>>>>>>>> If you are not replicating than you don't need to create any zone
>>>>>>>> configuration,
>>>>>>>> a default zonegroup and zone are created automatically
>>>>>>>>
>>>>>>>>> e.g.:
>>>>>>>>>  realm: fj
>>>>>>>>>   zonegroup east: zone tokyo (not replicated)
>>>>>>>>
>>>>>>>>
>>>>>>>> no need if not replicated
>>>>>>>>>
>>>>>>>>>   zonegroup west: zone osaka (not replicated)
>>>>>>>>
>>>>>>>>
>>>>>>>> same here
>>>>>>>>>
>>>>>>>>>   zonegroup jp:   zone jp-east + jp-west (replicated)
>>>>>>>
>>>>>>>
>>>>>>> The "east" and "west" zonegroups are just renamed from "default"
>>>>>>> as described in RHCS document [3].
>>>>>>
>>>>>>
>>>>>> Why do you need two zonegroups (or 3)?
>>>>>>
>>>>>> At the moment multisitev2 replicated automatically all zones in the
>>>>>> realm except "default" zone.
>>>>>> The moment you add a new zone (could be part of another zonegroup) it
>>>>>> will be replicated to the other zones.
>>>>>> It seems you don't want or need this.
>>>>>> we are working on allowing more control on the replication but that
>>>>>> will be in the future.
>>>>>>
>>>>>>> We may not need to rename them, but at least api_name should be
>>>>>>> altered.
>>>>>>
>>>>>>
>>>>>> You can change the api_name for the "default" zone.
>>>>>>
>>>>>>> In addition, I'm not sure what happens if 2 "default" zones/zonegroups
>>>>>>> co-exist in same realm.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Realm shares all the zones/zonegroups configuration,
>>>>>> it means it is the same zone/zonegroup.
>>>>>> For "default" it means not zone/zonegroup configured, we use it to run
>>>>>> radosgw without any
>>>>>> zone/zonegroup specified in the configuration.
>>>>>
>>>>>
>>>>>
>>>>> I didn't think "default" as exception of zonegroup. :-P
>>>>> Actually, I must specify api_name in default zonegroup setting.
>>>>>
>>>>> I interpret "default" zone/zonegroup is out of realm. Is it correct?
>>>>> I think it means namespace for bucket or user is not shared with
>>>>> "default".
>>>>> At present, I can't make decision to separate namespaces, but it may be
>>>>> best choice with current code.
>
> Unfortunately, if "api_name" is changed for "default" zonegroup,
> the "default" zonegroup is set as a member of the realm.
> See [19040-1]
>
> It means no major difference from my first provided configuration.
> (except reduction of messy error messages [15776] )
>
> In addition, the "api_name" can't be changed with "radosgw-admin
> zonegroup set" command if no realm has been defined.
> There is no convenient way to change "api_name".
>
> [19040-1]: http://tracker.ceph.com/issues/19040#note-1
> [15776]: http://tracker.ceph.com/issues/15776
>
>>>>>>>>> To evaluate such configuration, I tentatively built multiple
>>>>>>>>> zonegroups
>>>>>>>>> (east, west) on a ceph cluster. I barely succeed to configure it, but
>>>>>>>>> some concerns exist.
>>>>>>>>>
>>>>>>>> I think you just need one zonegroup with two zones the other are not
>>>>>>>> needed
>>>>>>>> Also each gateway can handle only a single zone (rgw_zone
>>>>>>>> configuration parameter)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> This is just a tentative one to confirm the behavior of multiple
>>>>>>> zonegroups
>>>>>>> due to limitation of our current equipment.
>>>>>>> The "east" zonegroup was renamed from "default", and another "west"
>>>>>>> zonegroup
>>>>>>> was created. Of course I specified both rgw_zonegroup and rgw_zone
>>>>>>> parameters
>>>>>>> for each RGW instance. (see -FYI- section bellow)
>>>>>>>
>>>>>> Can I suggest starting with a more simple setup:
>>>>>> Two zonegroups,  the first will have two zones and the second will
>>>>>> have one zone.
>>>>>> It is simper to configure and in case of problems to debug.
>>>>>
>>>>>
>>>>>
>>>>> I would try with such configuration IF time permitted.
>
> I tried. But it doesn't seem simpler :P
> Because it consists 3 zonegroups and 4 zones.
> I want to keep default zone/zonegroup.
> The target system already has huge amount of objects.
>
>
>>>>>>>>> a) User accounts are not synced among zonegroups
>
> I opened 2 issues [19040] [19041]
>
> [19040]: http://tracker.ceph.com/issues/19040
> [19041]: http://tracker.ceph.com/issues/19041
>
>
>>>>>>>>> I'm not sure if this is a issue, but the blueprint [1] stated a
>>>>>>>>> master
>>>>>>>>> zonegroup manages user accounts as metadata like buckets.
>>>>>>>>>
>>>>>>>> You have a lot of confusion with the zones and zonegroups.
>>>>>>>> A zonegroup is just a group of zones that are sharing the same data
>>>>>>>> (i.e. replication between them)
>>>>>>>> A zone represent a geographical location (i.e. one ceph cluster)
>>>>>>>>
>>>>>>>> We have a meta master zone (the master zone in the master zonegroup),
>>>>>>>> this meta master is responible on
>>>>>>>> replicating users and byckets meta operations.
>>>>>>>
>>>>>>>
>>>>>>> I know it.
>>>>>>> But the master zone in the master zonegroup manages bucket meta
>>>>>>> operations including buckets in other zonegroups. It means
>>>>>>> the master zone in the master zonegroup must have permission to
>>>>>>> handle buckets meta operations, i.e., must have same user accounts
>>>>>>> as other zonegroups.
>>>>>>
>>>>>>
>>>>>> Again zones not zonegroups,  it needs to have an admin user with the
>>>>>> same credentials in all the other zones.
>>>>>>
>>>>>>> This is related to next issue b). If the master zone in the master
>>>>>>> zonegroup doesn't have user accounts for other zonegroups, all the
>>>>>>> buckets meta operations are rejected.
>>>>>>>
>>>>>>
>>>>>> Correct
>>>>>>
>>>>>>> In addition, it may be overexplanation though, user accounts are
>>>>>>> sync'ed to other zones within same zonegroup if the accounts are
>>>>>>> created on master zone of the zonegroup. On the other hand,
>>>>>>> I found today, user accounts are not sync'ed to master if the
>>>>>>> accounts are created on slave(?) zone in the zonegroup. It seems
>>>>>>> asymmetric behavior.
>>>>>>
>>>>>>
>>>>>>
>>>>>> This requires investigation,  can you open a tracker issue and we will
>>>>>> look into it.
>>>>>>
>>>>>>> I'm not sure if the same behavior is caused by Admin REST API instead
>>>>>>> of radosgw-admin.
>>>>>>>
>>>>>>
>>>>>> It doesn't matter both use almost the same code
>>>>>>
>>>>>>>
>>>>>>>>> b) Bucket creation is rejected if master zonegroup doesn't have the
>>>>>>>>> account
>>>>>>>>>
>>>>>>>>> e.g.:
>>>>>>>>>   1) Configure east zonegroup as master.
>>>>>>>>
>>>>>>>> you need a master zoen
>>>>>>>>>
>>>>>>>>>   2) Create a user "nishi" on west zonegroup (osaka zone) using
>>>>>>>>> radosgw-admin.
>>>>>>>>>   3) Try to create a bucket on west zonegroup by user nishi.
>>>>>>>>>      -> ERROR: S3 error: 404 (NoSuchKey)
>>>>>>>>>   4) Create user nishi on east zonegroup with same key.
>>>>>>>>>   5) Succeed to create a bucket on west zonegroup by user nishi.
>>>>>>>>
>>>>>>>> You are confusing zonegroup and zone here again ...
>>>>>>>>
>>>>>>>> you should notice that when you are using radosgw-admin command
>>>>>>>> without providing zonegorup and/or zone info (--rgw-zonegroup=<zg> and
>>>>>>>> --rgw-zone=<zone>) it will use the default zonegroup and zone.
>>>>>>>>
>>>>>>>> User is stored per zone and you need to create an admin users in both
>>>>>>>> zones
>>>>>>>> for more documentation see:
>>>>>>>> http://docs.ceph.com/docs/master/radosgw/multisite/
>>>>>>>
>>>>>>>
>>>>>>> I always specify --rgw-zonegroup and --rgw-zone for radosgw-admin
>>>>>>> command.
>>>>>>>
>>>>>> That is great!
>>>>>> You can also onfigure default zone and zonegroup
>>>>>>
>>>>>>> The issue is that any buckets meta operations are rejected when the
>>>>>>> master
>>>>>>> zone in the master zonegroup doesn't have the user account of other
>>>>>>> zonegroups.
>>>>>>>
>>>>>> Correct
>>>>>>>
>>>>>>>
>>>>>>> I try to describe details again:
>>>>>>> 1) Create fj realm as default.
>>>>>>> 2) Rename default zonegroup/zone to east/tokyo and mark as default.
>>>>>>> 3) Create west/osaka zonegroup/zone.
>>>>>>> 4) Create system user sync-user on both tokyo and osaka zones with same
>>>>>>> key.
>>>>>>> 5) Start 2 RGW instances for tokyo and osaka zones.
>>>>>>> 6) Create azuma user account on tokyo zone in east zonegroup.
>>>>>>> 7) Create /bucket1 through tokyo zone endpoint with azuma account.
>>>>>>>    -> No problem.
>>>>>>> 8) Create nishi user account on osaka zone in west zonegroup.
>>>>>>> 9) Try to create a bucket /bucket2 through osaka zone endpoint with
>>>>>>> azuma
>>>>>>> account.
>>>>>>>    -> respond "ERROR: S3 error: 403 (InvalidAccessKeyId)" as expected.
>>>>>>> 10) Try to create a bucket /bucket3 through osaka zone endpoint with
>>>>>>> nishi
>>>>>>> account.
>>>>>>>    -> respond "ERROR: S3 error: 404 (NoSuchKey)"
>>>>>>>    Detailed log is shown in -FYI- section bellow.
>>>>>>>    The RGW for osaka zone verify the signature and forward the request
>>>>>>>    to tokyo zone endpoint (= the master zone in the master zonegroup).
>>>>>>>    Then, the RGW for tokyo zone rejected the request by unauthorized
>>>>>>> access.
>>>>>>>
>>>>>>
>>>>>> This seems a bug, can you open a issue?
>
> I opened 2 issues [19042] [19043]
>
> [19042]: http://tracker.ceph.com/issues/19042
> [19043]: http://tracker.ceph.com/issues/19043
>
>>>>>>>>> c) How to restrict to place buckets on specific zonegroups?
>>>>>>>>
>>>>>>>> you probably mean zone.
>>>>>>>> There is ongoing work to enable/disable sync per bucket
>>>>>>>> https://github.com/ceph/ceph/pull/10995
>>>>>>>> with this you can create a bucket on a specific zone and it won't be
>>>>>>>> replicated to another zone
>>>>>>>
>>>>>>>
>>>>>>> My thought means zonegroup (not zone) as described above.
>>>>>>
>>>>>> But it should be zone ..
>>>>>> Zone represent a geographical location , it represent a single ceph
>>>>>> cluster.
>>>>>> Bucket is created in a zone (a single ceph cluster) and it stored the
>>>>>> zone
>>>>>> id.
>>>>>> The zone represent in which ceph cluster the bucket was created.
>>>>>>
>>>>>> A zonegroup just a logical collection of zones, in many case you only
>>>>>> need a single zonegroup.
>>>>>> You should use zonegroups if you have lots of zones and it simplifies
>>>>>> your configuration.
>>>>>> You can move zones between zonegroups (it is not tested or supported
>>>>>> ...).
>>>>>>
>>>>>>> With current code, buckets are sync'ed to all zones within a zonegroup,
>>>>>>> no way to choose zone to place specific buckets.
>>>>>>> But this change may help to configure our original target.
>>>>>>>
>>>>>>> It seems we need more discussion about the change.
>>>>>>> I prefer default behavior is associated with user account (per SLA).
>>>>>>> And attribution of each bucket should be able to be changed via REST
>>>>>>> API depending on their permission, rather than radosgw-admin command.
>>>>>>>
>>>>>>
>>>>>> I think that will be very helpful , we need to understand what are the
>>>>>> requirement and the usage.
>>>>>> Please comment on the PR or even open a feature request and we can
>>>>>> discuss it more in detail.
>>>>>>
>>>>>>> Anyway, I'll examine more details.
>>>>>>>
>>>>>>>>> If user accounts would synced future as the blueprint, all the
>>>>>>>>> zonegroups
>>>>>>>>> contain same account information. It means any user can create
>>>>>>>>> buckets
>>>>>>>>> on
>>>>>>>>> any zonegroups. If we want to permit to place buckets on a replicated
>>>>>>>>> zonegroup for specific users, how to configure?
>>>>>>>>>
>>>>>>>>> If user accounts will not synced as current behavior, we can restrict
>>>>>>>>> to place buckets on specific zonegroups. But I cannot find best way
>>>>>>>>> to
>>>>>>>>> configure the master zonegroup.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> d) Operations for other zonegroup are not redirected
>>>>>>>>>
>>>>>>>>> e.g.:
>>>>>>>>>   1) Create bucket4 on west zonegroup by nishi.
>>>>>>>>>   2) Try to access bucket4 from endpoint on east zonegroup.
>>>>>>>>>      -> Respond "301 (Moved Permanently)",
>>>>>>>>>         but no redirected Location header is returned.
>>>>>>>>>
>>>>>>>>
>>>>>>>> It could be a bug please open a tracker issue for that in
>>>>>>>> tracker.ceph.com for RGW component with all the configuration
>>>>>>>> information,
>>>>>>>> logs and the version of ceph and radosgw you are using.
>>>>>>>
>>>>>>>
>>>>>>> I will open it, but it may be issued as "Feature" instead of "Bug"
>>>>>>> depending on following discussion.
>
> I opened an issue [19052] as "Feature" instead of "Bug".
>
> [19052]: http://tracker.ceph.com/issues/19052
>
> I suggested to use "hostnames" field in zonegroup configuration
> for this purpose. I feel it is similar to s3 website feature.
>
>>>>>>>>> It seems current RGW doesn't follows S3 specification [2].
>>>>>>>>> To implement this feature, probably we need to define another
>>>>>>>>> endpoint
>>>>>>>>> on each zonegroup for client accessible URL. RGW may placed behind
>>>>>>>>> proxy,
>>>>>>>>> thus the URL may be different from endpoint URLs for replication.
>>>>>>>>>
>>>>>>>>
>>>>>>>> The zone and zonegroup endpoints are not used directly by the user
>>>>>>>> with
>>>>>>>> a
>>>>>>>> proxy.
>>>>>>>> The user get a URL pointing to the proxy and the proxy will need to be
>>>>>>>> configured to point the rgw urls/IPs , you can have several radosgw
>>>>>>>> running.
>>>>>>>> See more
>>>>>>>>
>>>>>>>> https://access.redhat.com/documentation/en/red-hat-ceph-storage/2/paged/object-gateway-guide-for-red-hat-enterprise-linux/chapter-2-configuration
>>>>>>>
>>>>>>> Does it mean the proxy has responsibility to alter "Location" header as
>>>>>>> redirected URL?
>>>>>>
>>>>>> No
>>>>>>
>>>>>>> Basically, RGW can respond only the endpoint described in zonegroup
>>>>>>> setting as redirected URL on Location header. But client may not access
>>>>>>> the endpoint. Someone must translate the Location header to client
>>>>>>> accessible URL.
>>>>>>
>>>>>> Both locations will have a proxy. This means all communication is done
>>>>>> through proxies.
>>>>>> The endpoint URL should be an external URL and the proxy on the new
>>>>>> location will translate it to the internal one.
>>>>>
>>>>>
>>>>>
>>>>> Our assumption is:
>>>>>
>>>>> End-user client --- internet --- proxy ---+--- RGW site-A
>>>>>                                           |
>>>>>                                           | (dedicated line or VPN)
>>>>>                                           |
>>>>> End-user client --- internet --- proxy ---+--- RGW site-B
>>>>>
>>>>> RGWs can't access through front of proxies.
>>>>> In this case, endpoints for replication are in backend network of
>>>>> proxies.
>>>>
>>>>
>>>> do you have several radosgw instances in each site?
>>>
>>>
>>> Yes. Probably three or more instances per a site.
>>> Actual system will have same number of physical servers as RGW instances.
>>> We already tested with multiple endpoints per a zone within a zonegroup.
>>
>> Good to hear :)
>> As for the redirect message in your case it's should to be handled by
>> the proxy and not by the client browser
>> as it cannot access the internal vpn network. The endpoints url should
>> be the url in the internal network.
>
> I don't agree.
> It requires more network bandwidth between sites.
> I think "hostnames" field provides client accessible URL
> that is front of proxy. It seems sufficient.
>
>
> In addition to above, I opened 2 issues [18800] [19053] regarding
> Swift API, that are not related this discussion.
>
> [18800]: http://tracker.ceph.com/issues/18800
> [19053]: http://tracker.ceph.com/issues/19053
>
>
> Regards,
> KIMURA
>
>> Orit
>>>
>>>>> How do you think?
>>>>>
>>>>>> Regards,
>>>>>> Orit
>>>>>>
>>>>>>> If the proxy translates Location header, it looks like
>>>>>>> man-in-the-middle
>>>>>>> attack.
>>>>>>>
>>>>>>> Regards,
>>>>>>> KIMURA
>>>>>>>
>>>>>>>> Regrads,
>>>>>>>> Orit
>>>>>>>>>
>>>>>>>>> Any thoughts?
>>>>>>>>
>>>>>>>>>
>>>>>>>>> [1] http://tracker.ceph.com/projects/ceph/wiki/Rgw_new_multisite_configuration
>>>>>>>>> [2] http://docs.aws.amazon.com/AmazonS3/latest/dev/Redirects.html
>>>>>>> [3] https://access.redhat.com/documentation/en/red-hat-ceph-storage/2/paged/object-gateway-guide-for-red-hat-enterprise-linux/chapter-8-multi-site#migrating_a_single_site_system_to_multi_site
>>>>>>>
>>>>>>>>> ------ FYI ------
>>>>>>>>> [environments]
>>>>>>>>> Ceph cluster: RHCS 2.0
>>>>>>>>> RGW: RHEL 7.2 + RGW v10.2.5
>>>>>>>>>
>>>>>>>>> zonegroup east: master
>>>>>>>>>  zone tokyo
>>>>>>>>>   endpoint http://node5:80
>>>>>>>        rgw frontends = "civetweb port=80"
>>>>>>>        rgw zonegroup = east
>>>>>>>        rgw zone = tokyo
>>>>>>>>>
>>>>>>>>>   system user: sync-user
>>>>>>>>>   user azuma (+ nishi)
>>>>>>>>>
>>>>>>>>> zonegroup west: (not master)
>>>>>>>>>   zone osaka
>>>>>>>>>   endpoint http://node5:8081
>>>>>>>        rgw frontends = "civetweb port=8081"
>>>>>>>        rgw zonegroup = west
>>>>>>>        rgw zone = osaka
>>>>>>>
>>>>>>>>>   system user: sync-user (created with same key as zone tokyo)
>>>>>>>>>   user nishi

-- 
KIMURA Osamu / 木村 修
Engineering Department, Storage Development Division,
Data Center Platform Business Unit, FUJITSU LIMITED

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: rgw: multiple zonegroups in single realm
  2017-02-24  4:43                 ` KIMURA Osamu
@ 2017-02-24  7:22                   ` Orit Wasserman
  0 siblings, 0 replies; 11+ messages in thread
From: Orit Wasserman @ 2017-02-24  7:22 UTC (permalink / raw)
  To: KIMURA Osamu; +Cc: ceph-devel

On Fri, Feb 24, 2017 at 6:43 AM, KIMURA Osamu
<kimura.osamu@jp.fujitsu.com> wrote:
> Hi Orit,
>
> Thanks for your interest in this issue.
> I have one more question.
>
> I assumed "endpoints" of a zonegroup would be used for synchronization
> of metadata. But, to extent that I read current Jewel code, it may
> be used only for redirection.
> (-ERR_PERMANENT_REDIRECT || -ERR_WEBSITE_REDIRECT)
> It seems metadata synchronization is sent to endpoint of master zone
> in each zonegroup (probably it has not been equipped for secondary
> zonegroup).
>
> Is it correct?

Correct, metadata sync is synchronous and only the meta master (the
master zone in the master zonegroup) handles it.

> If so, we can set endpoints of each zonegroup as client accessible
> URL (i.e., front of proxy). On the other hand, endpoints of each
> zone point internal one.

This could work.

> But, I still prefer to use "hostnames" field for this purpose.
Yes, using the zonegroup endpoint could be confusing to users.
On the other hand, a new parameter can introduce backward compatibility issues.
I will look into it.

Regards,
Orit
>
>
> Regards,
> KIMURA
>
>
> On 2017/02/23 20:34, KIMURA Osamu wrote:
>>
>> Sorry to late.
>> I opened several tracker issues...
>>
>> On 2017/02/15 16:53, Orit Wasserman wrote:
>>>
>>> On Wed, Feb 15, 2017 at 2:26 AM, KIMURA Osamu
>>> <kimura.osamu@jp.fujitsu.com> wrote:
>>>>
>>>> Comments inline...
>>>>
>>>>
>>>> On 2017/02/14 23:54, Orit Wasserman wrote:
>>>>>
>>>>>
>>>>> On Mon, Feb 13, 2017 at 12:57 PM, KIMURA Osamu
>>>>> <kimura.osamu@jp.fujitsu.com> wrote:
>>>>>>
>>>>>>
>>>>>> Hi Orit,
>>>>>>
>>>>>> I almost agree, with some exceptions...
>>>>>>
>>>>>>
>>>>>> On 2017/02/13 18:42, Orit Wasserman wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Feb 13, 2017 at 6:44 AM, KIMURA Osamu
>>>>>>> <kimura.osamu@jp.fujitsu.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi Orit,
>>>>>>>>
>>>>>>>> Thanks for your comments.
>>>>>>>> I believe I'm not confusing, but probably my thought may not be well
>>>>>>>> described...
>>>>>>>>
>>>>>>> :)
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2017/02/12 19:07, Orit Wasserman wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Feb 10, 2017 at 10:21 AM, KIMURA Osamu
>>>>>>>>> <kimura.osamu@jp.fujitsu.com> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi Cephers,
>>>>>>>>>>
>>>>>>>>>> I'm trying to configure RGWs with multiple zonegroups within
>>>>>>>>>> single
>>>>>>>>>> realm.
>>>>>>>>>> The intention is that some buckets to be replicated and others to
>>>>>>>>>> stay
>>>>>>>>>> locally.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> If you are not replicating than you don't need to create any zone
>>>>>>>>> configuration,
>>>>>>>>> a default zonegroup and zone are created automatically
>>>>>>>>>
>>>>>>>>>> e.g.:
>>>>>>>>>>  realm: fj
>>>>>>>>>>   zonegroup east: zone tokyo (not replicated)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> no need if not replicated
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>   zonegroup west: zone osaka (not replicated)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> same here
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>   zonegroup jp:   zone jp-east + jp-west (replicated)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> The "east" and "west" zonegroups are just renamed from "default"
>>>>>>>> as described in RHCS document [3].
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Why do you need two zonegroups (or 3)?
>>>>>>>
>>>>>>> At the moment multisitev2 replicated automatically all zones in the
>>>>>>> realm except "default" zone.
>>>>>>> The moment you add a new zone (could be part of another zonegroup) it
>>>>>>> will be replicated to the other zones.
>>>>>>> It seems you don't want or need this.
>>>>>>> we are working on allowing more control on the replication but that
>>>>>>> will be in the future.
>>>>>>>
>>>>>>>> We may not need to rename them, but at least api_name should be
>>>>>>>> altered.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> You can change the api_name for the "default" zone.
>>>>>>>
>>>>>>>> In addition, I'm not sure what happens if 2 "default"
>>>>>>>> zones/zonegroups
>>>>>>>> co-exist in same realm.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Realm shares all the zones/zonegroups configuration,
>>>>>>> it means it is the same zone/zonegroup.
>>>>>>> For "default" it means not zone/zonegroup configured, we use it to
>>>>>>> run
>>>>>>> radosgw without any
>>>>>>> zone/zonegroup specified in the configuration.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> I didn't think "default" as exception of zonegroup. :-P
>>>>>> Actually, I must specify api_name in default zonegroup setting.
>>>>>>
>>>>>> I interpret "default" zone/zonegroup is out of realm. Is it correct?
>>>>>> I think it means namespace for bucket or user is not shared with
>>>>>> "default".
>>>>>> At present, I can't make decision to separate namespaces, but it may
>>>>>> be
>>>>>> best choice with current code.
>>
>>
>> Unfortunately, if "api_name" is changed for "default" zonegroup,
>> the "default" zonegroup is set as a member of the realm.
>> See [19040-1]
>>
>> It means no major difference from my first provided configuration.
>> (except reduction of messy error messages [15776] )
>>
>> In addition, the "api_name" can't be changed with "radosgw-admin
>> zonegroup set" command if no realm has been defined.
>> There is no convenient way to change "api_name".
>>
>> [19040-1]: http://tracker.ceph.com/issues/19040#note-1
>> [15776]: http://tracker.ceph.com/issues/15776
>>
>>>>>>>>>> To evaluate such configuration, I tentatively built multiple
>>>>>>>>>> zonegroups
>>>>>>>>>> (east, west) on a ceph cluster. I barely succeed to configure it,
>>>>>>>>>> but
>>>>>>>>>> some concerns exist.
>>>>>>>>>>
>>>>>>>>> I think you just need one zonegroup with two zones the other are
>>>>>>>>> not
>>>>>>>>> needed
>>>>>>>>> Also each gateway can handle only a single zone (rgw_zone
>>>>>>>>> configuration parameter)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> This is just a tentative one to confirm the behavior of multiple
>>>>>>>> zonegroups
>>>>>>>> due to limitation of our current equipment.
>>>>>>>> The "east" zonegroup was renamed from "default", and another "west"
>>>>>>>> zonegroup
>>>>>>>> was created. Of course I specified both rgw_zonegroup and rgw_zone
>>>>>>>> parameters
>>>>>>>> for each RGW instance. (see -FYI- section bellow)
>>>>>>>>
>>>>>>> Can I suggest starting with a more simple setup:
>>>>>>> Two zonegroups,  the first will have two zones and the second will
>>>>>>> have one zone.
>>>>>>> It is simper to configure and in case of problems to debug.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> I would try with such configuration IF time permitted.
>>
>>
>> I tried. But it doesn't seem simpler :P
>> Because it consists 3 zonegroups and 4 zones.
>> I want to keep default zone/zonegroup.
>> The target system already has huge amount of objects.
>>
>>
>>>>>>>>>> a) User accounts are not synced among zonegroups
>>
>>
>> I opened 2 issues [19040] [19041]
>>
>> [19040]: http://tracker.ceph.com/issues/19040
>> [19041]: http://tracker.ceph.com/issues/19041
>>
>>
>>>>>>>>>> I'm not sure if this is a issue, but the blueprint [1] stated a
>>>>>>>>>> master
>>>>>>>>>> zonegroup manages user accounts as metadata like buckets.
>>>>>>>>>>
>>>>>>>>> You have a lot of confusion with the zones and zonegroups.
>>>>>>>>> A zonegroup is just a group of zones that are sharing the same data
>>>>>>>>> (i.e. replication between them)
>>>>>>>>> A zone represent a geographical location (i.e. one ceph cluster)
>>>>>>>>>
>>>>>>>>> We have a meta master zone (the master zone in the master
>>>>>>>>> zonegroup),
>>>>>>>>> this meta master is responible on
>>>>>>>>> replicating users and byckets meta operations.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I know it.
>>>>>>>> But the master zone in the master zonegroup manages bucket meta
>>>>>>>> operations including buckets in other zonegroups. It means
>>>>>>>> the master zone in the master zonegroup must have permission to
>>>>>>>> handle buckets meta operations, i.e., must have same user accounts
>>>>>>>> as other zonegroups.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Again zones not zonegroups,  it needs to have an admin user with the
>>>>>>> same credentials in all the other zones.
>>>>>>>
>>>>>>>> This is related to next issue b). If the master zone in the master
>>>>>>>> zonegroup doesn't have user accounts for other zonegroups, all the
>>>>>>>> buckets meta operations are rejected.
>>>>>>>>
>>>>>>>
>>>>>>> Correct
>>>>>>>
>>>>>>>> In addition, it may be overexplanation though, user accounts are
>>>>>>>> sync'ed to other zones within same zonegroup if the accounts are
>>>>>>>> created on master zone of the zonegroup. On the other hand,
>>>>>>>> I found today, user accounts are not sync'ed to master if the
>>>>>>>> accounts are created on slave(?) zone in the zonegroup. It seems
>>>>>>>> asymmetric behavior.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> This requires investigation,  can you open a tracker issue and we
>>>>>>> will
>>>>>>> look into it.
>>>>>>>
>>>>>>>> I'm not sure if the same behavior is caused by Admin REST API
>>>>>>>> instead
>>>>>>>> of radosgw-admin.
>>>>>>>>
>>>>>>>
>>>>>>> It doesn't matter both use almost the same code
>>>>>>>
>>>>>>>>
>>>>>>>>>> b) Bucket creation is rejected if master zonegroup doesn't have
>>>>>>>>>> the
>>>>>>>>>> account
>>>>>>>>>>
>>>>>>>>>> e.g.:
>>>>>>>>>>   1) Configure east zonegroup as master.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> you need a master zoen
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>   2) Create a user "nishi" on west zonegroup (osaka zone) using
>>>>>>>>>> radosgw-admin.
>>>>>>>>>>   3) Try to create a bucket on west zonegroup by user nishi.
>>>>>>>>>>      -> ERROR: S3 error: 404 (NoSuchKey)
>>>>>>>>>>   4) Create user nishi on east zonegroup with same key.
>>>>>>>>>>   5) Succeed to create a bucket on west zonegroup by user nishi.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> You are confusing zonegroup and zone here again ...
>>>>>>>>>
>>>>>>>>> you should notice that when you are using radosgw-admin command
>>>>>>>>> without providing zonegorup and/or zone info (--rgw-zonegroup=<zg>
>>>>>>>>> and
>>>>>>>>> --rgw-zone=<zone>) it will use the default zonegroup and zone.
>>>>>>>>>
>>>>>>>>> User is stored per zone and you need to create an admin users in
>>>>>>>>> both
>>>>>>>>> zones
>>>>>>>>> for more documentation see:
>>>>>>>>> http://docs.ceph.com/docs/master/radosgw/multisite/
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I always specify --rgw-zonegroup and --rgw-zone for radosgw-admin
>>>>>>>> command.
>>>>>>>>
>>>>>>> That is great!
>>>>>>> You can also onfigure default zone and zonegroup
>>>>>>>
>>>>>>>> The issue is that any buckets meta operations are rejected when the
>>>>>>>> master
>>>>>>>> zone in the master zonegroup doesn't have the user account of other
>>>>>>>> zonegroups.
>>>>>>>>
>>>>>>> Correct
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I try to describe details again:
>>>>>>>> 1) Create fj realm as default.
>>>>>>>> 2) Rename default zonegroup/zone to east/tokyo and mark as default.
>>>>>>>> 3) Create west/osaka zonegroup/zone.
>>>>>>>> 4) Create system user sync-user on both tokyo and osaka zones with
>>>>>>>> same
>>>>>>>> key.
>>>>>>>> 5) Start 2 RGW instances for tokyo and osaka zones.
>>>>>>>> 6) Create azuma user account on tokyo zone in east zonegroup.
>>>>>>>> 7) Create /bucket1 through tokyo zone endpoint with azuma account.
>>>>>>>>    -> No problem.
>>>>>>>> 8) Create nishi user account on osaka zone in west zonegroup.
>>>>>>>> 9) Try to create a bucket /bucket2 through osaka zone endpoint with
>>>>>>>> azuma
>>>>>>>> account.
>>>>>>>>    -> respond "ERROR: S3 error: 403 (InvalidAccessKeyId)" as
>>>>>>>> expected.
>>>>>>>> 10) Try to create a bucket /bucket3 through osaka zone endpoint with
>>>>>>>> nishi
>>>>>>>> account.
>>>>>>>>    -> respond "ERROR: S3 error: 404 (NoSuchKey)"
>>>>>>>>    Detailed log is shown in -FYI- section bellow.
>>>>>>>>    The RGW for osaka zone verify the signature and forward the
>>>>>>>> request
>>>>>>>>    to tokyo zone endpoint (= the master zone in the master
>>>>>>>> zonegroup).
>>>>>>>>    Then, the RGW for tokyo zone rejected the request by unauthorized
>>>>>>>> access.
>>>>>>>>
>>>>>>>
>>>>>>> This seems a bug, can you open a issue?
>>
>>
>> I opened 2 issues [19042] [19043]
>>
>> [19042]: http://tracker.ceph.com/issues/19042
>> [19043]: http://tracker.ceph.com/issues/19043
>>
>>>>>>>>>> c) How to restrict to place buckets on specific zonegroups?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> you probably mean zone.
>>>>>>>>> There is ongoing work to enable/disable sync per bucket
>>>>>>>>> https://github.com/ceph/ceph/pull/10995
>>>>>>>>> with this you can create a bucket on a specific zone and it won't
>>>>>>>>> be
>>>>>>>>> replicated to another zone
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> My thought means zonegroup (not zone) as described above.
>>>>>>>
>>>>>>>
>>>>>>> But it should be zone ..
>>>>>>> Zone represent a geographical location , it represent a single ceph
>>>>>>> cluster.
>>>>>>> Bucket is created in a zone (a single ceph cluster) and it stored the
>>>>>>> zone
>>>>>>> id.
>>>>>>> The zone represent in which ceph cluster the bucket was created.
>>>>>>>
>>>>>>> A zonegroup just a logical collection of zones, in many case you only
>>>>>>> need a single zonegroup.
>>>>>>> You should use zonegroups if you have lots of zones and it simplifies
>>>>>>> your configuration.
>>>>>>> You can move zones between zonegroups (it is not tested or supported
>>>>>>> ...).
>>>>>>>
>>>>>>>> With the current code, buckets are synced to all zones within a
>>>>>>>> zonegroup; there is no way to choose which zone a specific bucket
>>>>>>>> is placed in.
>>>>>>>> But this change may help us reach our original target.
>>>>>>>>
>>>>>>>> It seems we need more discussion about the change.
>>>>>>>> I would prefer the default behavior to be tied to the user account
>>>>>>>> (per SLA), and the placement attribute of each bucket to be
>>>>>>>> changeable via the REST API, depending on the user's permission,
>>>>>>>> rather than via the radosgw-admin command.
>>>>>>>>
>>>>>>>
>>>>>>> I think that will be very helpful; we need to understand the
>>>>>>> requirements and the usage.
>>>>>>> Please comment on the PR, or even open a feature request, and we can
>>>>>>> discuss it in more detail.
>>>>>>>
>>>>>>>> Anyway, I'll examine more details.
>>>>>>>>
>>>>>>>>>> If user accounts are synced in the future, as in the blueprint,
>>>>>>>>>> all the zonegroups will contain the same account information.
>>>>>>>>>> That means any user can create buckets in any zonegroup. If we
>>>>>>>>>> want to permit only specific users to place buckets in a
>>>>>>>>>> replicated zonegroup, how do we configure that?
>>>>>>>>>>
>>>>>>>>>> If user accounts are not synced, as in the current behavior, we
>>>>>>>>>> can restrict bucket placement to specific zonegroups. But I
>>>>>>>>>> cannot find the best way to configure the master zonegroup.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> d) Operations for another zonegroup are not redirected
>>>>>>>>>>
>>>>>>>>>> e.g.:
>>>>>>>>>>   1) Create bucket4 on the west zonegroup as nishi.
>>>>>>>>>>   2) Try to access bucket4 from an endpoint in the east zonegroup.
>>>>>>>>>>      -> Responds "301 (Moved Permanently)",
>>>>>>>>>>         but no Location header pointing to the right endpoint is
>>>>>>>>>>         returned.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> It could be a bug. Please open a tracker issue for that on
>>>>>>>>> tracker.ceph.com for the RGW component, with all the configuration
>>>>>>>>> information, the logs, and the version of ceph and radosgw you are
>>>>>>>>> using.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I will open it, but it may be filed as a "Feature" instead of a
>>>>>>>> "Bug", depending on the following discussion.
>>
>>
>> I opened an issue [19052] as "Feature" instead of "Bug".
>>
>> [19052]: http://tracker.ceph.com/issues/19052
>>
>> I suggested using the "hostnames" field in the zonegroup configuration
>> for this purpose. I feel it is similar to the S3 website feature.
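>>
>> A sketch of what I have in mind (the hostname below is a hypothetical
>> front-of-proxy name):
>>
>>   radosgw-admin zonegroup get --rgw-zonegroup=east > east.json
>>   # edit east.json, e.g. set:
>>   #   "hostnames": ["east.example.com"],
>>   radosgw-admin zonegroup set --rgw-zonegroup=east < east.json
>>   radosgw-admin period update --commit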
>>
>>>>>>>>>> It seems the current RGW doesn't follow the S3 specification [2].
>>>>>>>>>> To implement this feature, we probably need to define another
>>>>>>>>>> endpoint on each zonegroup as the client-accessible URL. RGW may
>>>>>>>>>> be placed behind a proxy, so that URL may differ from the
>>>>>>>>>> endpoint URLs used for replication.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> With a proxy, the zone and zonegroup endpoints are not used
>>>>>>>>> directly by the user.
>>>>>>>>> The user gets a URL pointing to the proxy, and the proxy needs to
>>>>>>>>> be configured to point to the rgw URLs/IPs; you can have several
>>>>>>>>> radosgw instances running.
>>>>>>>>> See more:
>>>>>>>>>
>>>>>>>>> https://access.redhat.com/documentation/en/red-hat-ceph-storage/2/paged/object-gateway-guide-for-red-hat-enterprise-linux/chapter-2-configuration
>>>>>>>>
>>>>>>>>
>>>>>>>> Does that mean the proxy is responsible for rewriting the
>>>>>>>> "Location" header to the redirected URL?
>>>>>>>
>>>>>>>
>>>>>>> No
>>>>>>>
>>>>>>>> Basically, RGW can only return the endpoint described in the
>>>>>>>> zonegroup setting as the redirect URL in the Location header. But
>>>>>>>> the client may not be able to reach that endpoint. Someone must
>>>>>>>> translate the Location header into a client-accessible URL.
>>>>>>>
>>>>>>>
>>>>>>> Both locations will have a proxy. This means all communication is
>>>>>>> done through the proxies.
>>>>>>> The endpoint URL should be an external URL, and the proxy at the new
>>>>>>> location will translate it to the internal one.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Our assumption is:
>>>>>>
>>>>>> End-user client --- internet --- proxy ---+--- RGW site-A
>>>>>>                                           |
>>>>>>                                           | (dedicated line or VPN)
>>>>>>                                           |
>>>>>> End-user client --- internet --- proxy ---+--- RGW site-B
>>>>>>
>>>>>> The RGWs cannot reach each other through the front of the proxies.
>>>>>> In this case, the endpoints used for replication are on the backend
>>>>>> network behind the proxies.
>>>>>
>>>>>
>>>>>
>>>>> do you have several radosgw instances in each site?
>>>>
>>>>
>>>>
>>>> Yes. Probably three or more instances per site.
>>>> The actual system will have the same number of physical servers as RGW
>>>> instances.
>>>> We have already tested with multiple endpoints per zone within a
>>>> zonegroup.
>>>
>>>
>>> Good to hear :)
>>> As for the redirect message: in your case it should be handled by the
>>> proxy and not by the client browser, as the client cannot access the
>>> internal VPN network. The endpoint URLs should be the URLs on the
>>> internal network.
>>
>>
>> I don't agree.
>> That would require more network bandwidth between the sites.
>> I think the "hostnames" field can provide a client-accessible URL, i.e.
>> the front of the proxy. That seems sufficient.
>>
>>
>> In addition to the above, I opened 2 issues [18800] [19053] regarding
>> the Swift API; they are not related to this discussion.
>>
>> [18800]: http://tracker.ceph.com/issues/18800
>> [19053]: http://tracker.ceph.com/issues/19053
>>
>>
>> Regards,
>> KIMURA
>>
>>> Orit
>>>>
>>>>
>>>>>> How do you think?
>>>>>>
>>>>>>> Regards,
>>>>>>> Orit
>>>>>>>
>>>>>>>> If the proxy translates the Location header, it looks like a
>>>>>>>> man-in-the-middle attack.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> KIMURA
>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Orit
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Any thoughts?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> [1]
>>>>>>>>>> http://tracker.ceph.com/projects/ceph/wiki/Rgw_new_multisite_configuration
>>>>>>>>>> [2] http://docs.aws.amazon.com/AmazonS3/latest/dev/Redirects.html
>>>>>>>>
>>>>>>>> [3]
>>>>>>>> https://access.redhat.com/documentation/en/red-hat-ceph-storage/2/paged/object-gateway-guide-for-red-hat-enterprise-linux/chapter-8-multi-site#migrating_a_single_site_system_to_multi_site
>>>>>>>>
>>>>>>>>>> ------ FYI ------
>>>>>>>>>> [environments]
>>>>>>>>>> Ceph cluster: RHCS 2.0
>>>>>>>>>> RGW: RHEL 7.2 + RGW v10.2.5
>>>>>>>>>>
>>>>>>>>>> zonegroup east: master
>>>>>>>>>>  zone tokyo
>>>>>>>>>>   endpoint http://node5:80
>>>>>>>>
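>>>>>>>>        # ceph.conf settings for the RGW instance serving zone tokyo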
>>>>>>>>        rgw frontends = "civetweb port=80"
>>>>>>>>        rgw zonegroup = east
>>>>>>>>        rgw zone = tokyo
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>   system user: sync-user
>>>>>>>>>>   user azuma (+ nishi)
>>>>>>>>>>
>>>>>>>>>> zonegroup west: (not master)
>>>>>>>>>>   zone osaka
>>>>>>>>>>   endpoint http://node5:8081
>>>>>>>>
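>>>>>>>>        # ceph.conf settings for the RGW instance serving zone osaka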
>>>>>>>>        rgw frontends = "civetweb port=8081"
>>>>>>>>        rgw zonegroup = west
>>>>>>>>        rgw zone = osaka
>>>>>>>>
>>>>>>>>>>   system user: sync-user (created with same key as zone tokyo)
>>>>>>>>>>   user nishi
>
>
> --
> KIMURA Osamu / 木村 修
> Engineering Department, Storage Development Division,
> Data Center Platform Business Unit, FUJITSU LIMITED

Thread overview: 11 messages
2017-02-10  8:21 rgw: multiple zonegroups in single realm KIMURA Osamu
2017-02-12 10:07 ` Orit Wasserman
2017-02-13  4:44   ` KIMURA Osamu
2017-02-13  9:42     ` Orit Wasserman
2017-02-13 10:57       ` KIMURA Osamu
2017-02-14 14:54         ` Orit Wasserman
2017-02-15  0:26           ` KIMURA Osamu
2017-02-15  7:53             ` Orit Wasserman
2017-02-23 11:34               ` KIMURA Osamu
2017-02-24  4:43                 ` KIMURA Osamu
2017-02-24  7:22                   ` Orit Wasserman
