* Re: OSD will never become up. HEALTH_ERR
       [not found] ` <1462972579.13078.36.camel@gmail.com>
@ 2016-05-11 15:20   ` Gonzalo Aguilar Delgado
  0 siblings, 0 replies; 4+ messages in thread
From: Gonzalo Aguilar Delgado @ 2016-05-11 15:20 UTC (permalink / raw)
  To: ceph-devel

Hi, 

For your information, and for anyone in the same situation as me.

I found that the release notes explain this case very well: an OSD is
down but the monitor doesn't know about it. It can happen because of
upgrades across several Ceph releases. For this case (coming from
Firefly) there are some instructions:



        Upgrade Ceph on monitor hosts

        Restart all ceph-mon daemons

        Set noout::

            ceph osd set noout

        Upgrade Ceph on all OSD hosts

        Stop all ceph-osd daemons

        Mark all OSDs down with something like::

            ceph osd down `seq 0 1000`

        Start all ceph-osd daemons

        Let the cluster settle and then unset noout::

            ceph osd unset noout

        Upgrade and restart any remaining daemons (ceph-mds, radosgw)


In my case the relevant point was to mark the hung OSDs as down. After
that, everything started to work again.


I suppose they were stuck in a stale state.
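
For anyone hitting the same thing, the step that mattered boils down to
something like the following (the OSD ids come from my "ceph osd tree"
output; take it as a sketch rather than my exact session):

        # keep CRUSH from rebalancing while the OSDs bounce
        ceph osd set noout

        # explicitly mark the hung OSDs down (and restart them if they
        # do not rejoin on their own)
        ceph osd down 0 1 2 4
        systemctl restart ceph-osd@0 ceph-osd@1 ceph-osd@2 ceph-osd@4

        # once the cluster settles
        ceph osd unset noout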

On Wed, 2016-05-11 at 15:16 +0200, Gonzalo Aguilar Delgado wrote:
> Hi Again, 
> 
> Do you think I should open an issue?
> 
> Best regards,
> 
> 
> I also see that the monitor never brings the OSDs up, no matter what I
> do, and it never marks the OSDs down/out, even if I set
> mon_osd_min_in_ratio = 0.
> 
> On Wed, 2016-05-11 at 10:26 +0200, Gonzalo Aguilar Delgado wrote:
> > [original message trimmed; full text at the end of this thread]


* Re: OSD will never become up. HEALTH_ERR
  2016-05-11 12:55 ` Sage Weil
@ 2016-05-11 15:47   ` Gonzalo Aguilar Delgado
  0 siblings, 0 replies; 4+ messages in thread
From: Gonzalo Aguilar Delgado @ 2016-05-11 15:47 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

Hello Sage, 

Thank you a lot for answering. Indeed, this was the problem. But it was
strange: as I said, it booted fine after the update, until the
ceph-disk command ran at boot.


I was upgrading from Ubuntu 14.04, so yes, quite an old Ceph.

I posted what I did in case anyone else hits the same problem.

Thank you a lot.

On Wed, 2016-05-11 at 08:55 -0400, Sage Weil wrote:
> On Wed, 11 May 2016, Gonzalo Aguilar Delgado wrote:
> > Hello, 
> > 
> > I just upgraded my cluster to the version 10.1.2 and it worked well
> > for a while until I saw that systemctl ceph-disk@dev-sdc1.service was
> > failed and I reruned it.
> 
> What version did you upgrade *from*?  If it was older than 0.94.4 then
> that is the problem.  Check for messages in /var/log/ceph/ceph.log.
> 
> Also, you probably want to use 10.2.0, not 10.1.2 (which was a release
> candidate).
> 
> sage
> 
> 
> > [rest of the original message trimmed; full text at the end of this thread]


* Re: OSD will never become up. HEALTH_ERR
  2016-05-11  8:37 Gonzalo Aguilar Delgado
@ 2016-05-11 12:55 ` Sage Weil
  2016-05-11 15:47   ` Gonzalo Aguilar Delgado
  0 siblings, 1 reply; 4+ messages in thread
From: Sage Weil @ 2016-05-11 12:55 UTC (permalink / raw)
  To: Gonzalo Aguilar Delgado; +Cc: ceph-devel


On Wed, 11 May 2016, Gonzalo Aguilar Delgado wrote:
> Hello, 
> 
> I just upgraded my cluster to the version 10.1.2 and it worked well for
> a while until I saw that systemctl ceph-disk@dev-sdc1.service was
> failed and I reruned it.

What version did you upgrade *from*?  If it was older than 0.94.4 then 
that is the problem.  Check for messages in /var/log/ceph/ceph.log.
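
For example, something like the following will show what the installed
package and the running daemons report (the commands are illustrative;
adjust ids and paths as needed):

    ceph --version              # package version on this node
    ceph tell osd.* version     # what each reachable OSD reports
    tail -n 100 /var/log/ceph/ceph.log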

Also, you probably want to use 10.2.0, not 10.1.2 (which was a release 
candidate).

sage


> [rest of the original message trimmed; full text below]


* OSD will never become up. HEALTH_ERR
@ 2016-05-11  8:37 Gonzalo Aguilar Delgado
  2016-05-11 12:55 ` Sage Weil
  0 siblings, 1 reply; 4+ messages in thread
From: Gonzalo Aguilar Delgado @ 2016-05-11  8:37 UTC (permalink / raw)
  To: ceph-devel

Hello, 

I just upgraded my cluster to version 10.1.2 and it worked well for a
while, until I saw that the systemd unit ceph-disk@dev-sdc1.service had
failed and I re-ran it.
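
Re-running it was roughly this (a sketch, not a verbatim transcript):

    systemctl status ceph-disk@dev-sdc1.service    # showed the unit as failed
    systemctl restart ceph-disk@dev-sdc1.service
    journalctl -u ceph-disk@dev-sdc1.service       # to see why it had failed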

From that point on the OSDs stopped working.

This is Ubuntu 16.04.

I asked for help on IRC, where people pointed me in various directions,
but none of the investigations resolved the problem.

My configuration is rather simple:

root@red-compute:~# ceph osd tree
ID WEIGHT  TYPE NAME                 UP/DOWN REWEIGHT PRIMARY-AFFINITY 
-1 1.00000 root default                                                
-4 1.00000     rack rack-1                                             
-2 1.00000         host blue-compute                                   
 0 1.00000             osd.0            down        0          1.00000 
 2 1.00000             osd.2            down        0          1.00000 
-3 1.00000         host red-compute                                    
 1 1.00000             osd.1            down        0          1.00000 
 3 0.50000             osd.3              up  1.00000          1.00000 
 4 1.00000             osd.4            down        0          1.00000 

It seems that all OSDs are in "preboot" state. I was looking at the
latest commits and it seems there is a patch that makes OSDs wait for
the cluster to become healthy before rejoining. Could this be the
source of my problems?

root@red-compute:/var/lib/ceph/osd/ceph-1# ceph daemon osd.1 status
{
    "cluster_fsid": "9028f4da-0d77-462b-be9b-dbdf7fa57771",
    "osd_fsid": "adf9890a-e680-48e4-82c6-e96f4ed56889",
    "whoami": 1,
    "state": "preboot",
    "oldest_map": 1764,
    "newest_map": 2504,
    "num_pgs": 323
}

root@red-compute:/var/lib/ceph/osd/ceph-1# ceph daemon osd.3 status
{
    "cluster_fsid": "9028f4da-0d77-462b-be9b-dbdf7fa57771",
    "osd_fsid": "8dd085d4-0b50-4c80-a0ca-c5bc4ad972f7",
    "whoami": 3,
    "state": "preboot",
    "oldest_map": 1764,
    "newest_map": 2504,
    "num_pgs": 150
}

osd.3 is up and in.


This is what I have tried so far:

- After the upgrade I discovered that the daemons now run as the ceph
  user, so I ran chown on the Ceph directories and that part worked
  (see the sketch after this list).
- The firewall is fully disabled; I checked connectivity with nc and nmap.
- The configuration seems to be right; I can post it if you want.
- Enabling OSD logging shows that, for example, osd.1 is reconnecting
  all the time:
2016-05-10 14:35:48.199573 7f53e8f1a700  1 -- 0.0.0.0:6806/13962 >> :/0
pipe(0x556f99413400 sd=84 :6806 s=0 pgs=0 cs=0 l=0
c=0x556f993b3a80).accept sd=84 172.16.0.119:35388/0
 2016-05-10 14:35:48.199966 7f53e8f1a700  2 -- 0.0.0.0:6806/13962 >>
:/0 pipe(0x556f99413400 sd=84 :6806 s=4 pgs=0 cs=0 l=0
c=0x556f993b3a80).fault (0) Success
 2016-05-10 14:35:48.200018 7f53fb941700  1 osd.1 2468 ms_handle_reset
con 0x556f993b3a80 session 0
- osd.3 stays OK because it was never marked out (a Ceph restriction
  kept it in).
- I restarted all services at once so that all OSDs would be available
  at the same time and none would get marked down. That didn't work.
- I forced them in from the command line (ceph osd in for OSDs 1-5);
  they appear as "in" for a while and then go out again.
- I tried ceph-disk activate-all to bring everything up. That didn't
  work either.
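
The ownership fix mentioned above was basically this (assuming the
standard directory layout; adjust the paths if yours differ):

    # Jewel runs the daemons as the 'ceph' user instead of root, so the
    # data and log directories have to be owned by that user.
    chown -R ceph:ceph /var/lib/ceph /var/log/ceph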

The strange thing is that the cluster worked just fine right after the
upgrade, but that systemctl command broke both servers.
root@blue-compute:~# ceph -w
    cluster 9028f4da-0d77-462b-be9b-dbdf7fa57771
     health HEALTH_ERR
            694 pgs are stuck inactive for more than 300 seconds
            694 pgs stale
            694 pgs stuck stale
            too many PGs per OSD (1528 > max 300)
            mds cluster is degraded
            crush map has straw_calc_version=0
     monmap e10: 2 mons at {blue-compute=172.16.0.119:6789/0,red-
compute=172.16.0.100:6789/0}
            election epoch 3600, quorum 0,1 red-compute,blue-compute
      fsmap e673: 1/1/1 up {0:0=blue-compute=up:replay}
     osdmap e2495: 5 osds: 1 up, 1 in; 5 remapped pgs
      pgmap v40765481: 764 pgs, 6 pools, 410 GB data, 103 kobjects
            87641 MB used, 212 GB / 297 GB avail
                 694 stale+active+clean
                  70 active+clean

2016-05-10 17:03:55.822440 mon.0 [INF] HEALTH_ERR; 694 pgs are stuck
inactive for more than 300 seconds; 694 pgs stale; 694 pgs stuck stale;
too many PGs per OSD (1528 > max 300); mds cluster is degraded; crush
map has straw_calc_version=
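
(If I read the numbers right, the 1528 figure is just the 764 PGs times
their 2 replicas spread over the single OSD that is still in:
764 * 2 / 1 = 1528, hence the "too many PGs per OSD" warning.)
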
cat /etc/ceph/ceph.conf 
[global]

fsid = 9028f4da-0d77-462b-be9b-dbdf7fa57771
mon_initial_members = blue-compute, red-compute
mon_host = 172.16.0.119, 172.16.0.100
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
public_network = 172.16.0.0/24
osd_pool_default_pg_num = 100
osd_pool_default_pgp_num = 100
osd_pool_default_size = 2  # Write an object 3 times.
osd_pool_default_min_size = 1 # Allow writing one copy in a degraded
state.

## Required upgrade
osd max object name len = 256
osd max object namespace len = 64

[mon.]

    debug mon = 9
    caps mon = "allow *"

Any help on this? Any clue of what's going wrong?


I also see this; I don't know whether it's related or not:

=> ceph-osd.admin.log <==
2016-05-10 18:21:46.060278 7fa8f30cc8c0  0 ceph version 10.1.2
(4a2a6f72640d6b74a3bbd92798bb913ed380dcd4), process ceph-osd, pid 14135
2016-05-10 18:21:46.060460 7fa8f30cc8c0 -1 bluestore(/dev/sdc2)
_read_bdev_label unable to decode label at offset 66:
buffer::malformed_input: void
bluestore_bdev_label_t::decode(ceph::buffer::list::iterator&) decode
past end of struct encoding
2016-05-10 18:21:46.062949 7fa8f30cc8c0  1 journal _open /dev/sdc2 fd
4: 5367660544 bytes, block size 4096 bytes, directio = 0, aio = 0
2016-05-10 18:21:46.062991 7fa8f30cc8c0  1 journal close /dev/sdc2
2016-05-10 18:21:46.063026 7fa8f30cc8c0  0 probe_block_device_fsid
/dev/sdc2 is filestore, 119a9f4e-73d8-4a1f-877c-d60b01840c96
2016-05-10 18:21:47.072082 7eff735598c0  0 ceph version 10.1.2
(4a2a6f72640d6b74a3bbd92798bb913ed380dcd4), process ceph-osd, pid 14177
2016-05-10 18:21:47.072285 7eff735598c0 -1 bluestore(/dev/sdf2)
_read_bdev_label unable to decode label at offset 66:
buffer::malformed_input: void
bluestore_bdev_label_t::decode(ceph::buffer::list::iterator&) decode
past end of struct encoding
2016-05-10 18:21:47.074799 7eff735598c0  1 journal _open /dev/sdf2 fd
4: 5367660544 bytes, block size 4096 bytes, directio = 0, aio = 0
2016-05-10 18:21:47.074844 7eff735598c0  1 journal close /dev/sdf2
2016-05-10 18:21:47.074881 7eff735598c0  0 probe_block_device_fsid
/dev/sdf2 is filestore, fd069e6a-9a62-4286-99cb-d8a523bd946a



