* Fwd: Data distribution
From: Martin Wilderoth @ 2011-06-30 19:27 UTC (permalink / raw)
To: ceph-devel
Hello,
I have made a new test with a new filesystem, and it seems as if host3 osd5/osd6 is getting less data. I have checked the distribution over time. At the end I got some I/O errors, as some of the disks are quite full, and a "can't read superblock" error when mounting.
I guess there are no tools to correct that yet?
Start
/dev/sdc 137G 2.3M 135G 1% /data/osd0
/dev/sdd 137G 2.4M 135G 1% /data/osd1
/dev/sdc 137G 2.6M 135G 1% /data/osd2
/dev/sdd 137G 2.1M 135G 1% /data/osd3
/dev/sdb 137G 2.0M 135G 1% /data/osd4
/dev/sdc 137G 1.7M 135G 1% /data/osd5
later
/dev/sdc 137G 8.9G 126G 7% /data/osd0
/dev/sdd 137G 8.9G 126G 7% /data/osd1
/dev/sdc 137G 7.9G 126G 6% /data/osd2
/dev/sdd 137G 9.2G 125G 7% /data/osd3
/dev/sdb 137G 7.5G 127G 6% /data/osd4
/dev/sdc 137G 7.1G 127G 6% /data/osd5
later
/dev/sdc 137G 56G 78G 42% /data/osd0
/dev/sdd 137G 60G 75G 45% /data/osd1
/dev/sdc 137G 53G 81G 40% /data/osd2
/dev/sdd 137G 61G 74G 46% /data/osd3
/dev/sdb 137G 51G 84G 38% /data/osd4
/dev/sdc 137G 46G 88G 35% /data/osd5
last
/dev/sdc 137G 126G 7.7G 95% /data/osd0
/dev/sdd 137G 130G 3.2G 98% /data/osd1
/dev/sdc 137G 113G 22G 85% /data/osd2
/dev/sdd 137G 126G 7.3G 95% /data/osd3
/dev/sdb 137G 110G 24G 83% /data/osd4
/dev/sdc 137G 70G 64G 53% /data/osd5
>On Jun 27, 2011, at 12:19 PM, Josh Durgin wrote:
>> On 06/25/2011 08:48 PM, Martin Wilderoth wrote:
>>> Hello
>>>
>>> I have a ceph cluster of 6 OSDs, 146 GB each. I have copied a lot of data,
>>> filling it to 87%. The data is not evenly distributed between the OSDs:
>>>
>>> host1
>>> /dev/sdb 137G 119G 15G 90% /data/osd0
>>> /dev/sdc 137G 126G 7.4G 95% /data/osd1
>>>
>>> host2
>>> /dev/sdc 137G 114G 21G 85% /data/osd2
>>> /dev/sdd 137G 130G 3.6G 98% /data/osd3
>>>
>>> host3
>>> /dev/sdb 137G 107G 27G 81% /data/osd4
>>> /dev/sdc 137G 98G 36G 74% /data/osd5
>>>
>>> During the copy I got I/O errors, but after restarting the cluster it seems fine.
>>>
>>> For some reason osd3 seems to have much more data than osd5. Is there a way of getting the data distributed better?
>>
>> Hi Martin,
>>
>> Since the distribution is pseudo-random, you'll get some variance from an even split. You can reweight the osds manually with:
>>
>> ceph osd reweight osd3 new_weight
>>
>> or use the more automatic:
>>
>> ceph osd reweight-by-utilization 110
>>
>> This reduces the weight of all osds that have a utilization that is more than 110% of the average utilization.
>>
>> Josh
> That said, if the data is this unevenly distributed, something odd is going on. Are you using anything besides the filesystem on this cluster? If not, we probably need to figure out if there's a problem with the hashing.
> -Greg
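For reference, the 110% cutoff that `reweight-by-utilization` applies can be sketched with the utilization figures from the df output quoted above. This is only a rough illustration of the threshold check, not how ceph computes weights internally:

```shell
#!/bin/sh
# Utilization (%) of osd0..osd5 from the df output in the quoted message.
utils="90 95 85 98 81 74"

# Average utilization across the six OSDs.
avg=$(echo "$utils" | tr ' ' '\n' | awk '{ s += $1; n++ } END { printf "%.1f", s / n }')
echo "average utilization: ${avg}%"

# An OSD above 110% of the average is what
# `ceph osd reweight-by-utilization 110` would reweight downward.
flagged=""
for u in $utils; do
  if awk -v u="$u" -v a="$avg" 'BEGIN { exit !(u > 1.10 * a) }'; then
    flagged="$flagged $u"
    echo "utilization ${u}% exceeds 110% of average"
  fi
done
```

With these numbers only osd3 (98%) crosses the line, even though the spread between 74% and 98% already looks suspicious.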
* Re: Fwd: Data distribution
From: Josh Durgin @ 2011-06-30 23:55 UTC (permalink / raw)
To: Martin Wilderoth; +Cc: ceph-devel
On 06/30/2011 12:27 PM, Martin Wilderoth wrote:
> Hello,
>
> I have made a new test with a new filesystem, and it seems as if host3 osd5/osd6 is getting less data. I have checked the distribution over time. At the end I got some I/O errors, as some of the disks are quite full, and a "can't read superblock" error when mounting.
> I guess there are no tools to correct that yet?
When an OSD is full beyond a threshold (defaults to 95%, configured by
mon_osd_full_ratio), no more writes are accepted. Mounting the FS
requires the MDS to open a new session, which involves writing to its
journal on the OSDs. This is why you see the error when mounting.
You can increase the full ratio to let you mount the FS and delete files
to free up space, e.g.:
ceph mon injectargs '--mon_osd_full_ratio 99'
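The threshold behavior described above can be sketched as follows; the 95% default comes from the text, and the utilization fractions are just the "last" df figures from this thread (the real check happens inside the monitors, so this is purely illustrative):

```shell
#!/bin/sh
# mon_osd_full_ratio defaults to 0.95: at or above this utilization an
# OSD stops accepting writes, which is why the MDS session (and hence
# mounting the FS) fails once disks fill up.
full_ratio=0.95

# Utilization fractions of osd0..osd5 from the "last" df snapshot.
blocked=0
for util in 0.95 0.98 0.85 0.95 0.83 0.53; do
  if awk -v u="$util" -v r="$full_ratio" 'BEGIN { exit !(u >= r) }'; then
    blocked=$((blocked + 1))
    echo "$util: full, writes blocked"
  else
    echo "$util: accepting writes"
  fi
done
```

Three of the six OSDs are at or past the default ratio here, which matches the mount failure Martin saw before raising the limit.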
>
> Start
>
> /dev/sdc 137G 2.3M 135G 1% /data/osd0
> /dev/sdd 137G 2.4M 135G 1% /data/osd1
> /dev/sdc 137G 2.6M 135G 1% /data/osd2
> /dev/sdd 137G 2.1M 135G 1% /data/osd3
> /dev/sdb 137G 2.0M 135G 1% /data/osd4
> /dev/sdc 137G 1.7M 135G 1% /data/osd5
>
> later
> /dev/sdc 137G 8.9G 126G 7% /data/osd0
> /dev/sdd 137G 8.9G 126G 7% /data/osd1
> /dev/sdc 137G 7.9G 126G 6% /data/osd2
> /dev/sdd 137G 9.2G 125G 7% /data/osd3
> /dev/sdb 137G 7.5G 127G 6% /data/osd4
> /dev/sdc 137G 7.1G 127G 6% /data/osd5
>
> later
> /dev/sdc 137G 56G 78G 42% /data/osd0
> /dev/sdd 137G 60G 75G 45% /data/osd1
> /dev/sdc 137G 53G 81G 40% /data/osd2
> /dev/sdd 137G 61G 74G 46% /data/osd3
> /dev/sdb 137G 51G 84G 38% /data/osd4
> /dev/sdc 137G 46G 88G 35% /data/osd5
>
> last
> /dev/sdc 137G 126G 7.7G 95% /data/osd0
> /dev/sdd 137G 130G 3.2G 98% /data/osd1
> /dev/sdc 137G 113G 22G 85% /data/osd2
> /dev/sdd 137G 126G 7.3G 95% /data/osd3
> /dev/sdb 137G 110G 24G 83% /data/osd4
> /dev/sdc 137G 70G 64G 53% /data/osd5
That's a very high variance - can you post your crushmap, pg dump, and
osd dump?
ceph osd getcrushmap -o /tmp/crushmap && crushtool -d /tmp/crushmap -o /tmp/crushmap.txt
ceph pg dump -o /tmp/pgdump
ceph osd dump -o /tmp/osddump
Thanks!
Josh