All of lore.kernel.org
 help / color / mirror / Atom feed
* some questions about ceph deployment
@ 2010-09-04 13:45 FWDF
  2010-09-17 20:30 ` Sage Weil
  0 siblings, 1 reply; 7+ messages in thread
From: FWDF @ 2010-09-04 13:45 UTC (permalink / raw)
  To: ceph-devel

  We used 3 servers to build a Ceph test system, configured as below:
  
  Host                          IP      
  client01            192.168.1.10   
  ceph01              192.168.2.50
  ceph02              192.168.2.51   
  
  The OS is Ubuntu 10.04 LTS and the Ceph version is v0.21.1.
  
  ceph.conf:
  [global]
          auth supported = cephx
          pid file = /var/run/ceph/$name.pid
          debug ms = 0
          keyring = /etc/ceph/keyring.bin
  [mon]
          mon data = /mnt/ceph/data/mon$id
          debug ms = 1
  [mon0]
          host = ceph01
          mon addr = 192.168.2.50:6789
  [mds]
          keyring = /etc/ceph/keyring.$name
          debug ms = 1
  [mds.ceph01]
          host = ceph01
  [mds.ceph02]
          host = ceph02
  [osd]
          sudo = true
          osd data = /mnt/ceph/osd$id/data
          keyring = /etc/ceph/keyring.$name
          osd journal = /mnt/ceph/osd$id/data/journal
          osd journal size = 100
  [osd0]
          host = ceph01
  [osd1]
          host = ceph01
  [osd2]
          host = ceph01
  [osd3]
          host = ceph01
  [osd10]
          host = ceph02
  
  There are 4 HDDs in ceph01, and each HDD hosts one OSD, named osd0, osd1, osd2, and osd3; there is 1 HDD in ceph02, which hosts osd10. All of these HDDs are formatted as btrfs and mounted at the mount points listed below:
  
  ceph01
           /dev/sdc1         /mnt/ceph/osd0/data               btrfs
           /dev/sdd1         /mnt/ceph/osd1/data               btrfs
           /dev/sde1         /mnt/ceph/osd2/data               btrfs
           /dev/sdf1          /mnt/ceph/osd3/data               btrfs
  
  ceph02
           /dev/sdb1         /mnt/ceph/osd10/data             btrfs
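  
  For reference, each OSD disk was prepared roughly like this (showing only the osd0 disk on ceph01; the exact commands are approximate):
  
  root@ceph01:~#  mkfs.btrfs /dev/sdc1
  root@ceph01:~#  mkdir -p /mnt/ceph/osd0/data
  root@ceph01:~#  mount -t btrfs /dev/sdc1 /mnt/ceph/osd0/data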
  
  Create the Ceph filesystem:
  root@ceph01:~#  mkcephfs  -c /etc/ceph/ceph.conf -a -k /etc/ceph/keyring.bin
  
  Start Ceph:
  root@ceph01:~#  /etc/init.d/ceph -a  start

         Then
  root@ceph01:~#  ceph -w
  10.09.01_17:56:19.337895   mds e17: 1/1/1 up {0=up:active}, 1 up:standby
  10.09.01_17:56:19.347184   osd e27: 5 osds: 5 up, 5 in
  10.09.01_17:56:19.349447     log … 
  10.09.01_17:56:19.373773   mon e1: 1 mons at 192.168.2.50:6789/0
  
  The Ceph filesystem is mounted on client01 (192.168.1.10), ceph01 (192.168.2.50), and ceph02 (192.168.2.51) at /data/ceph. It works fine at the beginning: ls works, and reading and writing files is OK. After some files have been written, I find I can't run ls -l /data/ceph until I unmount Ceph from ceph02; one day later the same problem occurred again, and after I unmounted Ceph from ceph01 everything was OK.
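
  For reference, the clients mount it roughly like this (a sketch; 192.168.2.50 is just ceph01's monitor address, and the actual admin secret is omitted):

  mount -t ceph 192.168.2.50:6789:/ /data/ceph -o name=admin,secret=<admin key>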

  Q1:
         Can the Ceph filesystem be mounted on a member of the Ceph cluster?

         When I follow the instructions at http://ceph.newdream.net/wiki/Monitor_cluster_expansion to add a monitor on ceph02, the following error occurs:
  
  root@ceph02:~#  /etc/init.d/ceph start mon1
  [/etc/ceph/fetch_config/tmp/fetched.ceph.conf.14210] ceph.conf 100%  2565  2.5KB/s  00:00 
  === mon.1 ===
  Starting Ceph mon1 on ceph02...
   ** WARNING: Ceph is still under heavy development, and is only suitable for **
   ** testing and review.  Do not trust it with important data.  **
  terminate called after throwing an instance of 'std::logic_error'
    what():  basic_string::_S_construct NULL not valid
  Aborted (core dumped)
  failed: ' /usr/bin/cmon -i 1 -c /tmp/fetched.ceph.conf.14210 '

  Q2:
  How do I add a monitor to a running Ceph system?
   
  Q3:
      Is it possible to add an MDS while the Ceph system is running? If so, how?
  
  I used fdisk to split a HDD into two partitions, one for the journal and the other for data, like this:
  /dev/sdc1 (180GB) as data
  /dev/sdc2 (10GB) as journal
  
  /dev/sdc1 was formatted as btrfs and mounted at /mnt/osd0/data
  /dev/sdc2 was formatted as btrfs and mounted at /mnt/osd0/journal
  
  ceph.conf:
  …
  [osd]
          osd data = /mnt/ceph/osd$id/data
          osd journal = /mnt/ceph/osd$id/journal
          ; osd journal size = 100
  …
  When I run the mkcephfs command, I can't build the OSDs until I edit ceph.conf like this:
  
  [osd]
          osd data = /mnt/ceph/osd$id/data
          osd journal = /mnt/ceph/osd$id/data/journal
          osd journal size = 100
  …
  
  Q4.
    How do I set the journal path to a device or partition?
  
  Thanks for all your help and replies; sorry for my poor English.
  
  Lin


--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: some questions about ceph deployment
  2010-09-04 13:45 some questions about ceph deployment FWDF
@ 2010-09-17 20:30 ` Sage Weil
  2010-09-22 11:09   ` cang lin
       [not found]   ` <AANLkTikLgvHVRnHC+ept0NZv7uGVpAL52hDdFH2wiN9L@mail.gmail.com>
  0 siblings, 2 replies; 7+ messages in thread
From: Sage Weil @ 2010-09-17 20:30 UTC (permalink / raw)
  To: FWDF; +Cc: ceph-devel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 6717 bytes --]

Sorry, I just realized this one slipped through the cracks!

On Sat, 4 Sep 2010, FWDF wrote:

> We used 3 servers to build a Ceph test system, configured as below:
> 
> Host                          IP      
> client01            192.168.1.10   
> ceph01              192.168.2.50
> ceph02              192.168.2.51   
> 
> The OS is Ubuntu 10.04 LTS and the Ceph version is v0.21.1.
> 
> ceph.conf:
> [global]
>         auth supported = cephx
>         pid file = /var/run/ceph/$name.pid
>         debug ms = 0
>         keyring = /etc/ceph/keyring.bin
> [mon]
>         mon data = /mnt/ceph/data/mon$id
>         debug ms = 1
> [mon0]
>         host = ceph01
>         mon addr = 192.168.2.50:6789
> [mds]
>         keyring = /etc/ceph/keyring.$name
>         debug ms = 1
> [mds.ceph01]
>         host = ceph01
> [mds.ceph02]
>         host = ceph02
> [osd]
>         sudo = true
>         osd data = /mnt/ceph/osd$id/data
>         keyring = /etc/ceph/keyring.$name
>         osd journal = /mnt/ceph/osd$id/data/journal
>         osd journal size = 100
> [osd0]
>         host = ceph01
> [osd1]
>         host = ceph01
> [osd2]
>         host = ceph01
> [osd3]
>         host = ceph01
> [osd10]
>         host = ceph02
> 
> There are 4 HDDs in ceph01, and each HDD hosts one OSD, named osd0, osd1, osd2, and osd3; there is 1 HDD in ceph02, which hosts osd10. All of these HDDs are formatted as btrfs and mounted at the mount points listed below:
> 
> ceph01
>          /dev/sdc1         /mnt/ceph/osd0/data               btrfs
>          /dev/sdd1         /mnt/ceph/osd1/data               btrfs
>          /dev/sde1         /mnt/ceph/osd2/data               btrfs
>          /dev/sdf1          /mnt/ceph/osd3/data               btrfs
> 
> ceph02
>          /dev/sdb1         /mnt/ceph/osd10/data             btrfs
> 
> Create the Ceph filesystem:
> root@ceph01:~#  mkcephfs  -c /etc/ceph/ceph.conf -a -k /etc/ceph/keyring.bin
> 
> Start Ceph:
> root@ceph01:~#  /etc/init.d/ceph -a  start
> 
>          Then
> root@ceph01:~#  ceph -w
> 10.09.01_17:56:19.337895   mds e17: 1/1/1 up {0=up:active}, 1 up:standby
> 10.09.01_17:56:19.347184   osd e27: 5 osds: 5 up, 5 in
> 10.09.01_17:56:19.349447     log ... 
> 10.09.01_17:56:19.373773   mon e1: 1 mons at 192.168.2.50:6789/0
> 
> The Ceph filesystem is mounted on client01 (192.168.1.10), ceph01 (192.168.2.50), and ceph02 (192.168.2.51) at /data/ceph. It works fine at the beginning: ls works, and reading and writing files is OK. After some files have been written, I find I can't run ls -l /data/ceph until I unmount Ceph from ceph02; one day later the same problem occurred again, and after I unmounted Ceph from ceph01 everything was OK.
> 
> Q1:
>          Can the Ceph filesystem be mounted on a member of the Ceph cluster?

Technically, yes, but you should be very careful doing so.  The problem is 
that when the kernel is low on memory it will force the client to write 
out dirty data so that it can reclaim those pages.  If the writeout 
depends on then waking up some user process (cosd daemon), doing a bunch 
of random work, and writing the data to disk (dirtying yet more memory), 
you can deadlock the system.
 
>          When I follow the instruction of http://ceph.newdream.net/wiki/Monitor_cluster_expansion to expand a monitor to ceph02, the following error occurred:
> 
> root@ceph02:~#  /etc/init.d/ceph start mon1
> [/etc/ceph/fetch_config/tmp/fetched.ceph.conf.14210] ceph.conf 100%  2565  2.5KB/s  00:00 
> === mon.1 ===
> Starting Ceph mon1 on ceph02...
>  ** WARNING: Ceph is still under heavy development, and is only suitable for **
>  ** testing and review.  Do not trust it with important data.  **
> terminate called after throwing an instance of 'std::logic_error'
>   what():  basic_string::_S_construct NULL not valid
> Aborted (core dumped)
> failed: ' /usr/bin/cmon -i 1 -c /tmp/fetched.ceph.conf.14210 '

I haven't seen that crash, but it looks like a std::string constructor is 
being passed a NULL pointer.  Do you have a core dump (to get a 
backtrace)?  Which version are you running (`cmon -v`)?
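
For example, something like

 gdb /usr/bin/cmon core
 (gdb) bt

(with the paths adjusted to wherever the binary and core file actually are) 
would give us the backtrace.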

> Q2:
> How do I add a monitor to a running Ceph system?

The process in that wiki article can expand the monitor cluster while it 
is online.  Note that the monitor identification changed slightly between 
v0.21 and the current unstable branch (will be v0.22), and the 
instructions still need to be updated for that.

> Q3:
>     Is it possible to add an MDS while the Ceph system is running? If so, how?

Yes.  Add the new mds to ceph.conf, start the daemon.  You should see it 
as up:standby in the 'ceph -s' or 'ceph mds dump -o -' output.  Then

 ceph mds setmaxmds 2

to change the size of the 'active' cluster to 2.
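
For illustration, the whole sequence is roughly as follows (ceph03 is just a 
hypothetical new host, and the exact init script argument depends on how the 
mds is named in your setup):

 [mds.ceph03]
         host = ceph03

 /etc/init.d/ceph start mds.ceph03
 ceph mds setmaxmds 2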

Please keep in mind the clustered MDS still has some bugs; we expect v0.22 
to be stable.

> 
> I used fdisk to split a HDD into two partitions, one for the journal and the other for data, like this:
> /dev/sdc1 (180GB) as data
> /dev/sdc2 (10GB) as journal
> 
> /dev/sdc1 was formatted as btrfs and mounted at /mnt/osd0/data
> /dev/sdc2 was formatted as btrfs and mounted at /mnt/osd0/journal
> 
> ceph.conf:
> ...
> [osd]
>         osd data = /mnt/ceph/osd$id/data
>         osd journal = /mnt/ceph/osd$id/journal
>         ; osd journal size = 100
> ...
> When I run the mkcephfs command, I can't build the OSDs until I edit ceph.conf like this:
> 
> [osd]
>         osd data = /mnt/ceph/osd$id/data
>         osd journal = /mnt/ceph/osd$id/data/journal
>         osd journal size = 100

If the journal is a file, the system won't create it for you unless you 
specify a size.  If it already exists (e.g., you created it via 'dd', or 
it's a block device) the journal size isn't needed.
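
For example, a journal file can be pre-created by hand, in which case no size 
setting is needed at all (a sketch; the 100 MB here just mirrors the 'osd 
journal size = 100' setting above):

 dd if=/dev/zero of=/mnt/ceph/osd0/data/journal bs=1M count=100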

> Q4.
>   How do I set the journal path to a device or partition?

	osd journal = /dev/sdc1  ; or whatever
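
For a dedicated partition there is no need to put a filesystem on it; the osd 
writes to the block device directly.  One way to tie a specific partition to a 
specific osd is a per-daemon override in ceph.conf (a sketch based on the 
partitioning in your example):

 [osd0]
         host = ceph01
         osd journal = /dev/sdc2   ; raw 10GB journal partition, no size needed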

Hope this helps!  Sorry for the slow response.  Let us know if you have 
further questions!

sage


> Thanks for all your help and replies; sorry for my poor English.
> 
> Lin
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: some questions about ceph deployment
  2010-09-17 20:30 ` Sage Weil
@ 2010-09-22 11:09   ` cang lin
       [not found]   ` <AANLkTikLgvHVRnHC+ept0NZv7uGVpAL52hDdFH2wiN9L@mail.gmail.com>
  1 sibling, 0 replies; 7+ messages in thread
From: cang lin @ 2010-09-22 11:09 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

Thanks for your reply, Sage. I think Ceph is a very good distributed
filesystem and we want to test it in a production environment. Your reply
is very important to us.

2010/9/18 Sage Weil <sage@newdream.net>
>
> Sorry, I just realized this one slipped through the cracks!
>
> On Sat, 4 Sep 2010, FWDF wrote:
>
> > We use 3 servers to build a test system of ceph, configured as below:
> >
> > Host                          IP
> > client01            192.168.1.10
> > ceph01              192.168.2.50
> > ceph02              192.168.2.51
> >
> > The OS is Ubuntu 10.04 LTS and the Ceph version is v0.21.1.
> >
> > ceph.conf:
> > [global]
> >         auth supported = cephx
> >         pid file = /var/run/ceph/$name.pid
> >         debug ms = 0
> >         keyring = /etc/ceph/keyring.bin
> > [mon]
> >         mon data = /mnt/ceph/data/mon$id
> >         debug ms = 1
> > [mon0]
> >         host = ceph01
> >         mon addr = 192.168.2.50:6789
> >  [mds]
> >         keyring = /etc/ceph/keyring.$name
> >         debug ms = 1
> > [mds.ceph01]
> >         host = ceph01
> > [mds.ceph02]
> >         host = ceph02
> > [osd]
> >         sudo = true
> >         osd data = /mnt/ceph/osd$id/data
> >         keyring = /etc/ceph/keyring.$name
> >         osd journal = /mnt/ceph/osd$id/data/journal
> >         osd journal size = 100
> > [osd0]
> >         host = ceph01
> > [osd1]
> >         host = ceph01
> > [osd2]
> >         host = ceph01
> > [osd3]
> >         host = ceph01
> > [osd10]
> >         host = ceph02
> >
> >There are 4 HDDs in the ceph01 and every HDD has a OSD named as osd0, osd1, osd2,osd3; there is 1 HDD in the ceph02 named as osd10. All these HDDs are made as btrfs and mounted on the mount point as listed below:
> >
> > ceph01
> >         /dev/sdc1         /mnt/ceph/osd0/data               btrfs
> >         /dev/sdd1         /mnt/ceph/osd1/data               btrfs
> >         /dev/sde1         /mnt/ceph/osd2/data               btrfs
> >         /dev/sdf1          /mnt/ceph/osd3/data               btrfs
> >
> > ceph02
> >         /dev/sdb1         /mnt/ceph/osd10/data             btrfs
> >
> > Make ceph FileSystem:
> > root@ceph01:~#  mkcephfs  -c /etc/ceph/ceph.conf -a -k /etc/ceph/keyring.bin
> >
> > Startup ceph:
> > root@ceph01:~#  /etc/init.d/ceph -a  start
> >
> >          Then
> > root@ceph01:~#  ceph -w
> > 10.09.01_17:56:19.337895   mds e17: 1/1/1 up {0=up:active}, 1 up:standby
> > 10.09.01_17:56:19.347184   osd e27: 5 osds: 5 up, 5 in
> > 10.09.01_17:56:19.349447     log ...
> > 10.09.01_17:56:19.373773   mon e1: 1 mons at 192.168.2.50:6789/0
> >
> > The ceph file system is mounted to client01(192.168.1.10), ceph01(192.168.2.50), ceph02(192.168.2.51) at /data/ceph. It works fine at the beginning, I can use ls and the write and read of file is ok. After some files are wrote , I find I can't use ls -l /data/ceph until I umount ceph from ceph02, but one day later the same problem occurred again, then I umount ceph from ceph01 the system and everything is ok.
> >
> > Q1:
> >          Can the ceph filesystem be mounted to a member of ceph cluster?
>
> Technically, yes, but you should be very careful doing so.  The problem is
> that when the kernel is low on memory it will force the client to write
> out dirty data so that it can reclaim those pages.  If the writeout
> depends on then waking up some user process (cosd daemon), doing a bunch
> of random work, and writing the data to disk (dirtying yet more memory),
> you can deadlock the system.

We not only mount Ceph on a client in the same subnet, but also mount
it on a remote client over the internet. In the first week everything
worked fine; there are about 100 GB of writes and about 10 reads per
day. The files are almost read-only and range in size from a few dozen
MB to a few GB, so it is not a very heavy load. But in the second week
the client in the same subnet as the Ceph cluster could no longer access
the mount, and Ceph could not be unmounted from it; the remote client
could still access and unmount Ceph.

Running 'ceph -s' and 'ceph osd dump -0' on ceph01 shows that 3 of the
4 OSDs were down (osd0, osd2, osd4). Running 'df -h' shows that
/dev/sde1 (for osd0), /dev/sdd1 (for osd2), and /dev/sdc1 (for osd4)
are still at their mount points.

I used the following command to restart each OSD (shown for osd0):

# /etc/init.d/ceph start osd0

[/etc/ceph/fetch_config /tmp/fetched.ceph.conf.4967]
=== osd.0 ===
Starting Ceph osd0 on ceph01...
 ** WARNING: Ceph is still under heavy development, and is only suitable for **
 **          testing and review.  Do not trust it with important data.       **
starting osd0 at 0.0.0.0:6800/4864 osd_data /mnt/ceph/osd0/data
/mnt/ceph/osd0/data/journal
…

The 3 OSDs started and ran normally, but the local Ceph client was down.
Does that have anything to do with the OSD restart? The local client
could remount Ceph after a reboot and work normally. The remote client
could remount Ceph and work normally too, but a few days later it could
not access or unmount Ceph.

#umount /mnt/ceph
umount: /mnt/ceph: device is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))

There was no response from the lsof or fuser commands; the only thing we
could do was kill the process and reboot the system. We use Ceph v0.21.2
for the cluster and the clients, on Ubuntu 10.04 LTS (server) with
kernel 2.6.32-21-generic-pae.
What confuses me is why the client can't access Ceph. Even if the OSDs
were down, that shouldn't affect the client. What could be the reason
the client can't access or unmount Ceph?
>
> >          When I follow the instruction of http://ceph.newdream.net/wiki/Monitor_cluster_expansion to expand a monitor to ceph02, the following error occurred:
> >
> > root@ceph02:~#  /etc/init.d/ceph start mon1
> > [/etc/ceph/fetch_config/tmp/fetched.ceph.conf.14210] ceph.conf 100%  2565  2.5KB/s  00:00
> > === mon.1 ===
> > Starting Ceph mon1 on ceph02...
> >  ** WARNING: Ceph is still under heavy development, and is only suitable for **
> >  ** testing and review.  Do not trust it with important data.  **
> > terminate called after throwing an instance of 'std::logic_error'
> >   what():  basic_string::_S_construct NULL not valid
> > Aborted (core dumped)
> > failed: ' /usr/bin/cmon -i 1 -c /tmp/fetched.ceph.conf.14210 '
>
> I haven't seen that crash, but it looks like a std::string constructor is
> being passed a NULL pointer.  Do you have a core dump (to get a
> backtrace)?  Which version are you running (`cmon -v`)?

The cmon version was v0.21.1 when the crash happened; it has since been
updated to v0.21.2. The following backtrace is from v0.21.2:

# gdb cmon core
GNU gdb (GDB) 7.1-ubuntu
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/cmon...(no debugging symbols found)...done.

warning: exec file is newer than core file.
[New Thread 17644]
warning: Can't read pathname for load map: Input/output error.
Reading symbols from /lib/tls/i686/cmov/libpthread.so.0...(no
debugging symbols found)...done.
Loaded symbols for /lib/tls/i686/cmov/libpthread.so.0
Reading symbols from /lib/i686/cmov/libcrypto.so.0.9.8...(no debugging
symbols found)...done.
Loaded symbols for /lib/i686/cmov/libcrypto.so.0.9.8
Reading symbols from /usr/lib/libstdc++.so.6...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib/libstdc++.so.6
Reading symbols from /lib/tls/i686/cmov/libm.so.6...(no debugging
symbols found)...done.
Loaded symbols for /lib/tls/i686/cmov/libm.so.6
Reading symbols from /lib/libgcc_s.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libgcc_s.so.1
Reading symbols from /lib/tls/i686/cmov/libc.so.6...(no debugging
symbols found)...done.
Loaded symbols for /lib/tls/i686/cmov/libc.so.6
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/tls/i686/cmov/libdl.so.2...(no debugging
symbols found)...done.
Loaded symbols for /lib/tls/i686/cmov/libdl.so.2
Reading symbols from /lib/libz.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libz.so.1
Core was generated by `/usr/bin/cmon -i 1 -c /tmp/fetched.ceph.conf.17598'.
Program terminated with signal 6, Aborted.
#0  0x001be422 in __kernel_vsyscall ()

(gdb) bt
#0  0x001be422 in __kernel_vsyscall ()
#1  0x00c2d651 in raise () from /lib/tls/i686/cmov/libc.so.6
#2  0x00c30a82 in abort () from /lib/tls/i686/cmov/libc.so.6
#3  0x0050a52f in __gnu_cxx::__verbose_terminate_handler() () from
/usr/lib/libstdc++.so.6
#4  0x00508465 in ?? () from /usr/lib/libstdc++.so.6
#5  0x005084a2 in std::terminate() () from /usr/lib/libstdc++.so.6
#6  0x005085e1 in __cxa_throw () from /usr/lib/libstdc++.so.6
#7  0x0049f57f in std::__throw_logic_error(char const*) () from
/usr/lib/libstdc++.so.6
#8  0x004e3b82 in ?? () from /usr/lib/libstdc++.so.6
#9  0x004e3da2 in std::basic_string<char, std::char_traits<char>,
std::allocator<char> >::basic_string(char const*, unsigned int,
std::allocator<char> const&) () from /usr/lib/libstdc++.so.6
#10 0x08088744 in main ()
(gdb)
>
> > Q2:
> > How to expand a monitor to a running ceph system?
>
> The process in that wiki article can expand the monitor cluster while it
> is online.  Note that the monitor identification changed slightly between
> v0.21 and the current unstable branch (will be v0.22), and the
> instructions still need to be updated for that.
>
> > Q3
> >    Is it possible to add mds when the ceph system is running? how?
>
> Yes.  Add the new mds to ceph.conf, start the daemon.  You should see it
> as up:standby in the 'ceph -s' or 'ceph mds dump -o -' output.  Then
>
>  ceph mds setmaxmds 2
>
> change the size of the 'active' cluster to 2.
>
> Please keep in mind the clustered MDS still has some bugs; we expect v0.22
> to be stable.

Thanks, I will wait for v0.22 and try to add an MDS then, but I want to
know whether my MDS config is right.

I set 2 MDSes in ceph.conf:

[mds]
        keyring = /etc/ceph/keyring.$name
        debug ms = 1
[mds.ceph01]
        host = ceph01
[mds.ceph02]
        host = ceph02

The result of 'ceph -s' was:

  10.09.01_17:56:19.337895   mds e17: 1/1/1 up {0=up:active}, 1 up:standby

But now the result of 'ceph -s' is:

  10.09.19_17:01:50.398809   mds e27: 1/1/1 up {0=up:active}

The result of 'ceph mds dump -o -' is:

  10.09.19_17:05:10.263142 mon <- [mds,dump]
  10.09.19_17:05:10.264095 mon0 -> 'dumped mdsmap epoch 27' (0)
  epoch 27
  client_epoch 0
  created 10.08.26_03:27:01.753124
  modified 10.09.11_00:42:41.691011
  tableserver 0
  root 0
  session_timeout 60
  session_autoclose 300
  compat  compat={},rocompat={},incompat={1=base v0.20}
  max_mds 1
  in      0
  up      {0=4298}
  failed
  stopped
  4298:   192.168.2.51:6800/3780 'ceph02' mds0.6 up:active seq 260551
  10.09.19_17:05:10.264231 wrote 321 byte payload to -

I don't quite understand what this information means. Does it mean one
MDS was down? How should I handle it?
>
> >
> > I used fdisk to split a HDD into two partitions, one for the journal and the other for data, like this:
> > /dev/sdc1 (180GB) as data
> > /dev/sdc2 (10GB) as journal
> >
> > /dev/sdc1 was formatted as btrfs and mounted at /mnt/osd0/data
> > /dev/sdc2 was formatted as btrfs and mounted at /mnt/osd0/journal
> >
> > ceph.conf:
> >
> > [osd]
> >         osd data = /mnt/ceph/osd$id/data
> >         osd journal = /mnt/ceph/osd$id/journal
> >         ; osd journal size = 100
> >
> > When I run the mkcephfs command, I can't build the OSDs until I edit ceph.conf like this:
> >
> > [osd]
> >         osd data = /mnt/ceph/osd$id/data
> >         osd journal = /mnt/ceph/osd$id/data/journal
> >         osd journal size = 100
>
> If the journal is a file, the system won't create it for you unless you
> specify a size.  If it already exists (e.g., you created it via 'dd', or
> it's a block device) the journal size isn't needed.
>
> > Q4.
> > How do I set the journal path to a device or partition?
>
>        osd journal = /dev/sdc1  ; or whatever

How do I know which journal belongs to which OSD?
Can the following config do that?

[osd]
        sudo = true
        osd data = /mnt/ceph/osd$id/data
[osd0]
        host = ceph01
        osd journal = /dev/sdc1

If I make a journal partition on a 500GB HDD, what is the proper size
for the partition?

thanks.
Lin
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: some questions about ceph deployment
       [not found]   ` <AANLkTikLgvHVRnHC+ept0NZv7uGVpAL52hDdFH2wiN9L@mail.gmail.com>
@ 2010-09-22 16:17     ` Sage Weil
  2010-09-22 17:57       ` cang lin
  2010-09-22 20:44       ` Sage Weil
  0 siblings, 2 replies; 7+ messages in thread
From: Sage Weil @ 2010-09-22 16:17 UTC (permalink / raw)
  To: cang lin; +Cc: ceph-devel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 5018 bytes --]

On Wed, 22 Sep 2010, cang lin wrote:
> We not only mount Ceph on a client in the same subnet, but also mount it
> on a remote client over the internet. In the first week everything worked
> fine; there are about 100 GB of writes and about 10 reads per day. The
> files are almost read-only and range in size from a few dozen MB to a few
> GB, so it is not a very heavy load. But in the second week the client in
> the same subnet as the Ceph cluster could no longer access the mount, and
> Ceph could not be unmounted from it; the remote client could still access
> and unmount Ceph.
> 
> Running 'ceph -s' and 'ceph osd dump -0' on ceph01 shows that 3 of the 4
> OSDs were down (osd0, osd2, osd4). Running 'df -h' shows that /dev/sde1
> (for osd0), /dev/sdd1 (for osd2), and /dev/sdc1 (for osd4) are still at
> their mount points.
> 
> I used the following command to restart each OSD (shown for osd0):
> 
> # /etc/init.d/ceph start osd0
> 
> [/etc/ceph/fetch_config /tmp/fetched.ceph.conf.4967]
> === osd.0 ===
> Starting Ceph osd0 on ceph01...
>  ** WARNING: Ceph is still under heavy development, and is only suitable for **
>  **          testing and review.  Do not trust it with important data.       **
> starting osd0 at 0.0.0.0:6800/4864 osd_data /mnt/ceph/osd0/data
> /mnt/ceph/osd0/data/journal
> ...
> 
> The 3 OSDs started and ran normally, but the local Ceph client was down.
> Does that have anything to do with the OSD restart? The local client could
> remount Ceph after a reboot and work normally. The remote client could
> remount Ceph and work normally too, but a few days later it could not
> access or unmount Ceph.
> 
> #umount /mnt/ceph
> umount: /mnt/ceph: device is busy.
>         (In some cases useful info about processes that use
>          the device is found by lsof(8) or fuser(1))
> 
> There was no response from the lsof or fuser commands; the only thing we
> could do was kill the process and reboot the system. We use Ceph v0.21.2
> for the cluster and the clients, on Ubuntu 10.04 LTS (server) with kernel
> 2.6.32-21-generic-pae.
> 
> What confuses me is why the client can't access Ceph. Even if the OSDs
> were down, that shouldn't affect the client. What could be the reason the
> client can't access or unmount Ceph?

It could be a number of things.  The output from 

 cat /sys/kernel/debug/ceph/*/mdsc
 cat /sys/kernel/debug/ceph/*/osdc

will tell you if it's waiting for a server request to respond.  Also, if 
you know the hung pid, you can

 cat /proc/$pid/stack

and see where it is blocked.  Also,

 dmesg | tail

may have some relevant console messages.


> >          When I follow the instruction of
> > http://ceph.newdream.net/wiki/Monitor_cluster_expansion to expand a
> > monitor to ceph02, the following error occurred:
> > >
> > > root@ceph02:~#  /etc/init.d/ceph start mon1
> > > [/etc/ceph/fetch_config/tmp/fetched.ceph.conf.14210] ceph.conf 100%  2565
> >  2.5KB/s  00:00
> > > === mon.1 ===
> > > Starting Ceph mon1 on ceph02...
> > >  ** WARNING: Ceph is still under heavy development, and is only suitable
> > for **
> > >  ** testing and review.  Do not trust it with important data.  **
> > > terminate called after throwing an instance of 'std::logic_error'
> > >   what():  basic_string::_S_construct NULL not valid
> > > Aborted (core dumped)
> > > failed: ' /usr/bin/cmon -i 1 -c /tmp/fetched.ceph.conf.14210 '
> >
> > I haven't seen that crash, but it looks like a std::string constructor is
> > being passed a NULL pointer.  Do you have a core dump (to get a
> > backtrace)?  Which version are you running (`cmon -v`)?
> >
> 
> The cmon version is v0.21.1 when the crash happen and been updated to
> v0.21.2.
> 
> The following backtrace is from v0.21.2:

Thanks, we'll see if we can reproduce and fix this one!

> [...]
> Thanks, I will wait for v0.22 and try to add an MDS then, but I want to
> know whether my MDS config is right.
> 
> 
> 
> I set 2 MDSes in ceph.conf:
> 
> [mds]
>         keyring = /etc/ceph/keyring.$name
>         debug ms = 1
> [mds.ceph01]
>         host = ceph01
> [mds.ceph02]
>         host = ceph02

Looks right.


> The result of 'ceph -s' was:
> 
> 10.09.01_17:56:19.337895   mds e17: 1/1/1 up {0=up:active}, 1 up:standby
> 
> But now the result of 'ceph -s' is:
> 
> 10.09.19_17:01:50.398809   mds e27: 1/1/1 up {0=up:active}

It looks like the second 'standby' cmds went away.  Is the daemon still 
running?
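
(Something like 'ps ax | grep cmds' on each mds host will show whether the 
cmds process is still there; that is just a generic process check, not a 
ceph command.)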


> > > Q4.
> > > How do I set the journal path to a device or partition?
> >
> >        osd journal = /dev/sdc1  ; or whatever
> >
> 
> How do I know which journal belongs to which OSD?
> 
> Can the following config do that?
> 
> 
> 
> [osd]
>         sudo = true
>         osd data = /mnt/ceph/osd$id/data
> [osd0]
>         host = ceph01
>         osd journal = /dev/sdc1
> 
> 
> If I make a partition for journal in a 500GB hdd,what is the proper size for
> the partition?

1 GB should be sufficient.

sage

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: some questions about ceph deployment
  2010-09-22 16:17     ` Sage Weil
@ 2010-09-22 17:57       ` cang lin
  2010-09-22 20:44       ` Sage Weil
  1 sibling, 0 replies; 7+ messages in thread
From: cang lin @ 2010-09-22 17:57 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

2010/9/23 Sage Weil <sage@newdream.net>:
> On Wed, 22 Sep 2010, cang lin wrote:
>> What confuses me is why the client can't access Ceph. Even if the OSDs
>> were down, that shouldn't affect the client. What could be the reason
>> the client can't access or unmount Ceph?
>
> It could be a number of things.  The output from
>
>  cat /sys/kernel/debug/ceph/*/mdsc
>  cat /sys/kernel/debug/ceph/*/osdc
>
> will tell you if it's waiting for a server request to respond.  Also, if
> you know the hung pid, you can
>
>  cat /proc/$pid/stack
>
> and see where it is blocked.  Also,
>
>  dmesg | tail
>
> may have some relevant console messages.
>
>
>> >          When I follow the instruction of
>> > http://ceph.newdream.net/wiki/Monitor_cluster_expansion to expand a
>> > monitor to ceph02, the following error occurred:
>> > >
>> > > root@ceph02:~#  /etc/init.d/ceph start mon1
>> > > [/etc/ceph/fetch_config/tmp/fetched.ceph.conf.14210] ceph.conf 100%  2565
>> >  2.5KB/s  00:00
>> > > === mon.1 ===
>> > > Starting Ceph mon1 on ceph02...
>> > >  ** WARNING: Ceph is still under heavy development, and is only suitable
>> > for **
>> > >  ** testing and review.  Do not trust it with important data.  **
>> > > terminate called after throwing an instance of 'std::logic_error'
>> > >   what():  basic_string::_S_construct NULL not valid
>> > > Aborted (core dumped)
>> > > failed: ' /usr/bin/cmon -i 1 -c /tmp/fetched.ceph.conf.14210 '
>> >
>> > I haven't seen that crash, but it looks like a std::string constructor is
>> > being passed a NULL pointer.  Do you have a core dump (to get a
>> > backtrace)?  Which version are you running (`cmon -v`)?
>> >
>>
>> The cmon version is v0.21.1 when the crash happen and been updated to
>> v0.21.2.
>>
>> The following backtrace is from v0.21.2:
>
> Thanks, we'll see if we can reproduce and fix this one!
>
>> [...]
>> Thanks, I will wait for v0.22 and try to add an MDS then, but I want to
>> know whether my MDS config is right.
>>
>>
>>
>> I set 2 mds in ceph.conf
>>
>> [mds]
>>
>> keyring = /etc/ceph/keyring.$name
>>
>> debug ms = 1
>>
>> [mds.ceph01]
>>
>>      host = ceph01
>>
>> [mds.ceph02]
>>
>>       host = ceph02
>
> Looks right.
>
>
>> The result for 'ceph -s':
>>
>> 10.09.01_17:56:19.337895   mds e17: 1/1/1 up {0=up:active}, 1 up:standby
>>
>> But now the result for 'ceph -s' is:
>>
>> 10.09.19_17:01:50.398809   mds e27: 1/1/1 up {0=up:active}
>
> It looks like the second 'standby' cmds went away.  Is the daemon still
> running?
>

I don't know if it was still running, because both MDSes are down now.

>>
>>
>> If I make a partition for journal in a 500GB hdd,what is the proper size for
>> the partition?
>
> 1 GB should be sufficient.
>
> sage

thanks!
Lin
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: some questions about ceph deployment
  2010-09-22 16:17     ` Sage Weil
  2010-09-22 17:57       ` cang lin
@ 2010-09-22 20:44       ` Sage Weil
  2010-09-23  5:22         ` cang lin
  1 sibling, 1 reply; 7+ messages in thread
From: Sage Weil @ 2010-09-22 20:44 UTC (permalink / raw)
  To: cang lin; +Cc: ceph-devel

On Wed, 22 Sep 2010, Sage Weil wrote:
> On Wed, 22 Sep 2010, cang lin wrote:
> > >          When I follow the instruction of
> > > http://ceph.newdream.net/wiki/Monitor_cluster_expansion to expand a
> > > monitor to ceph02, the following error occurred:
> > > >
> > > > root@ceph02:~#  /etc/init.d/ceph start mon1
> > > > [/etc/ceph/fetch_config/tmp/fetched.ceph.conf.14210] ceph.conf 100%  2565
> > >  2.5KB/s  00:00
> > > > === mon.1 ===
> > > > Starting Ceph mon1 on ceph02...
> > > >  ** WARNING: Ceph is still under heavy development, and is only suitable
> > > for **
> > > >  ** testing and review.  Do not trust it with important data.  **
> > > > terminate called after throwing an instance of 'std::logic_error'
> > > >   what():  basic_string::_S_construct NULL not valid
> > > > Aborted (core dumped)
> > > > failed: ' /usr/bin/cmon -i 1 -c /tmp/fetched.ceph.conf.14210 '
> > >
> > > I haven't seen that crash, but it looks like a std::string constructor is
> > > being passed a NULL pointer.  Do you have a core dump (to get a
> > > backtrace)?  Which version are you running (`cmon -v`)?
> > >
> > 
> > The cmon version is v0.21.1 when the crash happen and been updated to
> > v0.21.2.
> > 
> > The following backtrace is from v0.21.2:
> 
> Thanks, we'll see if we can reproduce and fix this one!

Ok, this one is fixed by commit 79b6f2f9e9dd70704644338c968f9ad070e5a8f8 
in the testing branch.  It actually should be printing an error that the 
'magic' file is missing from the mon data directory.  Maybe you skipped 
step #3 on the monitor cluster expansion page?

Thanks-
sage

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: some questions about ceph deployment
  2010-09-22 20:44       ` Sage Weil
@ 2010-09-23  5:22         ` cang lin
  0 siblings, 0 replies; 7+ messages in thread
From: cang lin @ 2010-09-23  5:22 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

Hi  Sage,
2010/9/23 Sage Weil <sage@newdream.net>:
> On Wed, 22 Sep 2010, Sage Weil wrote:
>> On Wed, 22 Sep 2010, cang lin wrote:
>> > >          When I follow the instruction of
>> > > http://ceph.newdream.net/wiki/Monitor_cluster_expansion to expand a
>> > > monitor to ceph02, the following error occurred:
>> > > >
>> > > > root@ceph02:~#  /etc/init.d/ceph start mon1
>> > > > [/etc/ceph/fetch_config/tmp/fetched.ceph.conf.14210] ceph.conf 100%  2565
>> > >  2.5KB/s  00:00
>> > > > === mon.1 ===
>> > > > Starting Ceph mon1 on ceph02...
>> > > >  ** WARNING: Ceph is still under heavy development, and is only suitable
>> > > for **
>> > > >  ** testing and review.  Do not trust it with important data.  **
>> > > > terminate called after throwing an instance of 'std::logic_error'
>> > > >   what():  basic_string::_S_construct NULL not valid
>> > > > Aborted (core dumped)
>> > > > failed: ' /usr/bin/cmon -i 1 -c /tmp/fetched.ceph.conf.14210 '
>> > >
>> > > I haven't seen that crash, but it looks like a std::string constructor is
>> > > being passed a NULL pointer.  Do you have a core dump (to get a
>> > > backtrace)?  Which version are you running (`cmon -v`)?
>> > >
>> >
>> > The cmon version is v0.21.1 when the crash happen and been updated to
>> > v0.21.2.
>> >
>> > The following backtrace is from v0.21.2:
>>
>> Thanks, we'll see if we can reproduce and fix this one!
>
> Ok, this one is fixed by commit 79b6f2f9e9dd70704644338c968f9ad070e5a8f8
> in the testing branch.  It actually should be printing an error that the
> 'magic' file is missing from the mon data directory.  Maybe you skipped
> step #3 on the monitor cluster expansion page?
>
> Thanks-
> sage
>

Yes, in the first test I misunderstood step 3 and did not include the
host name in the path:
root@ceph01:/ # rsync -av /mnt/ceph/data/mon0/ /mnt/ceph/data/mon1
root@ceph01:/ # echo 1 > /mnt/ceph/data/mon1/whoami

After finding that mistake, I added the host name to the path in the
second test:
root@ceph01:/ # rsync -av /mnt/ceph/data/mon0/ ceph02:/mnt/ceph/data/mon1

but I forgot to run the following command on ceph02:
root@ceph02:/ # echo 1 > /mnt/ceph/data/mon1/whoami

In the third test I did both:
root@ceph01:/ # rsync -av /mnt/ceph/data/mon0/ ceph02:/mnt/ceph/data/mon1
root@ceph02:/ # echo 1 > /mnt/ceph/data/mon1/whoami

But I still get the error. I don't know whether I am doing something
wrong, or whether the system remembers the wrong path from the first
attempt?

Thanks!
lin
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2010-09-23  5:22 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-09-04 13:45 some questions about ceph deployment FWDF
2010-09-17 20:30 ` Sage Weil
2010-09-22 11:09   ` cang lin
     [not found]   ` <AANLkTikLgvHVRnHC+ept0NZv7uGVpAL52hDdFH2wiN9L@mail.gmail.com>
2010-09-22 16:17     ` Sage Weil
2010-09-22 17:57       ` cang lin
2010-09-22 20:44       ` Sage Weil
2010-09-23  5:22         ` cang lin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.