* dm-cache fs corruption
@ 2013-11-13 10:24 Vladimir Smolensky
  2013-11-13 13:10 ` Joe Thornber
  2014-01-22  1:04 ` Vladimir Smolensky
  0 siblings, 2 replies; 20+ messages in thread
From: Vladimir Smolensky @ 2013-11-13 10:24 UTC (permalink / raw)
  To: dm-devel



Hello,
I've been testing dm-cache for use with static web content.
It appears that, when using a big cache, ~3TB (plus a 20TB origin device), the
dm-cache device corrupts the filesystem after block eviction starts to happen.
If I set a smaller cache size - a single 480GB SSD (again with a 20TB origin
device) - the dm-cache device works just fine.


dmesg error:

XFS (dm-0): Corruption detected. Unmount and run xfs_repair
XFS (dm-0): metadata I/O error: block 0x3d4f534e0
("xfs_trans_read_buf_map") error 117 numblks 16
XFS (dm-0): xfs_imap_to_bp: xfs_trans_read_buf() returned error 117.
XFS (dm-0): Corruption detected. Unmount and run xfs_repair
XFS (dm-0): Corruption detected. Unmount and run xfs_repair


ext4 also got broken.


Tested with 3.10.11-gentoo and Fedora 19 (kernel 3.11.something), with xfs and
ext4, in both writeback and writethrough modes.



regards,
Vladimir Smolensky


* Re: dm-cache fs corruption
  2013-11-13 10:24 dm-cache fs corruption Vladimir Smolensky
@ 2013-11-13 13:10 ` Joe Thornber
  2013-11-13 13:48   ` Vladimir Smolensky
  2014-01-22  1:04 ` Vladimir Smolensky
  1 sibling, 1 reply; 20+ messages in thread
From: Joe Thornber @ 2013-11-13 13:10 UTC (permalink / raw)
  To: device-mapper development

On Wed, Nov 13, 2013 at 12:24:03PM +0200, Vladimir Smolensky wrote:
> Hello,
> I've been testing dm-cache for use with static web content.
> It appears that, when using a big cache, ~3TB (plus a 20TB origin device), the
> dm-cache device corrupts the filesystem after block eviction starts to happen.
> If I set a smaller cache size - a single 480GB SSD (again with a 20TB origin
> device) - the dm-cache device works just fine.

Hi Vladimir,

Sorry you've been having trouble, and thanks for taking the time to
report.  I'll see if I can scrape together enough disks to reproduce
the problem, and get back to you.

- Joe


* Re: dm-cache fs corruption
  2013-11-13 13:10 ` Joe Thornber
@ 2013-11-13 13:48   ` Vladimir Smolensky
  2013-11-13 14:31     ` Mike Snitzer
  2013-11-13 14:41     ` Joe Thornber
  0 siblings, 2 replies; 20+ messages in thread
From: Vladimir Smolensky @ 2013-11-13 13:48 UTC (permalink / raw)
  To: device-mapper development



Hello Joe,
I can provide you with the setup to test if you want.
Problem is, 3TB cache gets filled very slowly - about 2 weeks.

regards,
Vladimir Smolensky



* Re: dm-cache fs corruption
  2013-11-13 13:48   ` Vladimir Smolensky
@ 2013-11-13 14:31     ` Mike Snitzer
  2013-11-13 14:41     ` Joe Thornber
  1 sibling, 0 replies; 20+ messages in thread
From: Mike Snitzer @ 2013-11-13 14:31 UTC (permalink / raw)
  To: Vladimir Smolensky; +Cc: device-mapper development

On Wed, Nov 13 2013 at  8:48am -0500,
Vladimir Smolensky <arizal@gmail.com> wrote:

> Hello Joe,
> I can provide you with the setup to test if you want.
> Problem is, 3TB cache gets filled very slowly - about 2 weeks.

We'll have to take steps to get a quicker reproducer.  But in parallel,
could you please test the latest dm-cache code that is staged for v3.13
inclusion?  Would be a great data point to have.

Please see the 'for-linus' branch of the linux-dm.git repo:
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git

(it is based on 3.12-rc5)
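
Roughly, grabbing and building that branch would look something like the
following; this is only a sketch (config handling and install steps vary by
distro), and you will want CONFIG_DM_CACHE and CONFIG_DM_CACHE_MQ enabled:

git clone git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git
cd linux-dm
git checkout for-linus
cp /boot/config-"$(uname -r)" .config    # reuse the running config; path varies by distro
make olddefconfig
make -j"$(nproc)" && sudo make modules_install install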


* Re: dm-cache fs corruption
  2013-11-13 13:48   ` Vladimir Smolensky
  2013-11-13 14:31     ` Mike Snitzer
@ 2013-11-13 14:41     ` Joe Thornber
  2013-11-13 15:19       ` Vladimir Smolensky
  1 sibling, 1 reply; 20+ messages in thread
From: Joe Thornber @ 2013-11-13 14:41 UTC (permalink / raw)
  To: device-mapper development

On Wed, Nov 13, 2013 at 03:48:18PM +0200, Vladimir Smolensky wrote:
> Hello Joe,
> I can provide you with the setup to test if you want.
> Problem is, 3TB cache gets filled very slowly - about 2 weeks.

I'll use cache_restore to generate a populated cache.  Could you tell
me your block size though please?

- Joe


* Re: dm-cache fs corruption
  2013-11-13 14:41     ` Joe Thornber
@ 2013-11-13 15:19       ` Vladimir Smolensky
  2013-11-13 16:29         ` Vladimir Smolensky
  0 siblings, 1 reply; 20+ messages in thread
From: Vladimir Smolensky @ 2013-11-13 15:19 UTC (permalink / raw)
  To: device-mapper development



8192




* Re: dm-cache fs corruption
  2013-11-13 15:19       ` Vladimir Smolensky
@ 2013-11-13 16:29         ` Vladimir Smolensky
  2013-11-13 18:39           ` Mears, Morgan
  2013-11-13 19:32           ` Mike Snitzer
  0 siblings, 2 replies; 20+ messages in thread
From: Vladimir Smolensky @ 2013-11-13 16:29 UTC (permalink / raw)
  To: device-mapper development



Hello,
I'm compiling the for-linus branch, it shows 3.12.0-rc5, that's ok right?

Where can I get cache_restore, cache_dump, etc. and info how to use them.


* Re: dm-cache fs corruption
  2013-11-13 16:29         ` Vladimir Smolensky
@ 2013-11-13 18:39           ` Mears, Morgan
  2013-11-14 14:38             ` Joe Thornber
  2013-11-13 19:32           ` Mike Snitzer
  1 sibling, 1 reply; 20+ messages in thread
From: Mears, Morgan @ 2013-11-13 18:39 UTC (permalink / raw)
  To: device-mapper development

On Wed, Nov 13, 2013 at 11:29:26AM -0005, Vladimir Smolensky wrote:
> Hello, 
> I'm compiling the for-linus branch, it shows 3.12.0-rc5, that's ok right?
> 
> Where can I get cache_restore, cache_dump, etc. and info how to use them.

Hi Vladimir,

git clone https://github.com/jthornber/thin-provisioning-tools.git
cd thin-provisioning-tools
autoreconf
./configure --enable-testing
make
sudo cp cache_check cache_dump cache_restore cache_repair /usr/sbin
cache_check --version

This is mostly distilled from README.md in the repo; there are some
additional dependencies in there that you might have to address.

All the tools have a --help option; not sure about other documentation.

--Morgan


* Re: dm-cache fs corruption
  2013-11-13 16:29         ` Vladimir Smolensky
  2013-11-13 18:39           ` Mears, Morgan
@ 2013-11-13 19:32           ` Mike Snitzer
  1 sibling, 0 replies; 20+ messages in thread
From: Mike Snitzer @ 2013-11-13 19:32 UTC (permalink / raw)
  To: Vladimir Smolensky; +Cc: device-mapper development

On Wed, Nov 13 2013 at 11:29am -0500,
Vladimir Smolensky <arizal@gmail.com> wrote:

> Hello,
> I'm compiling the for-linus branch, it shows 3.12.0-rc5, that's ok right?

Yes, that branch is based on v3.12-rc5.  Once the branch is made it is
best not to rebase (Linus prefers that, unless there is a strong reason
to rebase).

Chances are there is still a 64bit bug that is causing truncation of
values to 32bit.  So I wouldn't expect the latest code to help your
particular situation but it doesn't hurt to use the latest.

> Where can I get cache_restore, cache_dump, etc. and info how to use them.

Those tools are part of this project:
https://github.com/jthornber/thin-provisioning-tools

There are man pages for each utility (in the man8 subdir).
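
For example, from a checkout of the repo they can be read in place with man's
-l option (adjust the filenames to whatever actually sits under man8/):

man -l man8/cache_dump.8
man -l man8/cache_restore.8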


* Re: dm-cache fs corruption
  2013-11-13 18:39           ` Mears, Morgan
@ 2013-11-14 14:38             ` Joe Thornber
  2013-11-14 15:21               ` Vladimir Smolensky
  0 siblings, 1 reply; 20+ messages in thread
From: Joe Thornber @ 2013-11-14 14:38 UTC (permalink / raw)
  To: device-mapper development

On Wed, Nov 13, 2013 at 06:39:37PM +0000, Mears, Morgan wrote:
> On Wed, Nov 13, 2013 at 11:29:26AM -0005, Vladimir Smolensky wrote:
> > Hello, 
> > I'm compiling the for-linus branch, it shows 3.12.0-rc5, that's ok right?
> > 
> > Where can I get cache_restore, cache_dump, etc. and info how to use them.
> 
> Hi Vladimir,
> 
> git clone https://github.com/jthornber/thin-provisioning-tools.git
> cd thin-provisioning-tools
> autoreconf
> ./configure --enable-testing
> make
> sudo cp cache_check cache_dump cache_restore cache_repair /usr/sbin
> cache_check --version
> 
> This is mostly distilled from README.md in the repo; there are some
> additional dependencies in there that you might have to address.
> 
> All the tools have a --help option; not sure about other documentation.
> 
> --Morgan

'make install' should work, and I believe there are some man pages.

- Joe


* Re: dm-cache fs corruption
  2013-11-14 14:38             ` Joe Thornber
@ 2013-11-14 15:21               ` Vladimir Smolensky
  2013-11-14 20:22                 ` Vladimir Smolensky
  0 siblings, 1 reply; 20+ messages in thread
From: Vladimir Smolensky @ 2013-11-14 15:21 UTC (permalink / raw)
  To: device-mapper development



./cache_restore -V
0.2.8



* Re: dm-cache fs corruption
  2013-11-14 15:21               ` Vladimir Smolensky
@ 2013-11-14 20:22                 ` Vladimir Smolensky
  2013-11-15 13:01                   ` Vladimir Smolensky
  0 siblings, 1 reply; 20+ messages in thread
From: Vladimir Smolensky @ 2013-11-14 20:22 UTC (permalink / raw)
  To: device-mapper development



Okay, a little correction here. The problematic setup was with a 2.3TB cache,
not 3TB. The mistake comes from the fact that at the time of writing I was
dealing with another server which has 8x480GB SSDs in RAID5, which is a little
over 3TB. Obviously that number was floating around in my head.



* Re: dm-cache fs corruption
  2013-11-14 20:22                 ` Vladimir Smolensky
@ 2013-11-15 13:01                   ` Vladimir Smolensky
  2013-11-28 16:50                     ` Vladimir Smolensky
  0 siblings, 1 reply; 20+ messages in thread
From: Vladimir Smolensky @ 2013-11-15 13:01 UTC (permalink / raw)
  To: device-mapper development



Okay, I'm trying to reproduce the problem using cache_restore, so far with
no luck. The hardware setup is the same as the original.

kernel 3.10.11-gentoo


Cache RAID; partition 1 is metadata, partition 2 is data:

# parted /dev/sdc unit b print
Model: DELL PERC H700 (scsi)
Disk /dev/sdc: 2397799710720B
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start        End             Size            File system  Name
Flags
 1      1048576B     1024458751B     1023410176B                  primary
 2      1024458752B  2397798662143B  2396774203392B               primary

Origin:

# parted /dev/sdb unit b print
Model: DELL PERC H700 (scsi)
Disk /dev/sdb: 20973392756736B
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start     End              Size             File system  Name
Flags
 1      1048576B  20973391708159B  20973390659584B  xfs          primary

According to the /dev/sdc2 size, the number of cache blocks should be
571435, with a block size of 4MiB (8192 sectors).
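
Spelling that arithmetic out (8192 sectors * 512 bytes = 4194304 bytes per
block, against the 2396774203392-byte data partition):

echo $(( 2396774203392 / (8192 * 512) ))
571435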

Generating fake metadata, I leave 5 blocks free:

cache_xml create --nr-cache-blocks 571430 --nr-mappings 571430 0  >
/root/meta.xml

I edit the XML to set the block size to 8192, since there is no option for this.
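
Something like this should do it, assuming the superblock element in the
generated XML carries a block_size attribute (worth checking the file first):

sed -i 's/block_size="[0-9]*"/block_size="8192"/' /root/meta.xml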

cache_restore -i /root/meta.xml -o /dev/sdc1

dmsetup create storage --table '0 40963653632 cache /dev/sdc1 /dev/sdc2
/dev/sdb1 8192 1 writeback default 0'


mkfs.xfs /dev/mapper/storage

mount /dev/mapper/storage /mnt

Then I run fio with the following job file:

[test1]
loops=10000
randrepeat=1
directory=/mnt/fio/data/
new_group
group_reporting=1
size=100g
rwmixread=70
rw=randrw
numjobs=12
ioengine=sync
#direct=1
bs=1M
nrfiles=3000

This is the status so far:

# dmsetup status storage
0 40963653632 cache 1719/249856 93161125 11990840 93713813 27533087 307339
307344 571435 306978 0 2 migration_threshold 10000000 4 random_threshold 4
sequential_threshold 10000000
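
For reference, assuming the status fields on these kernels are
<metadata used/total> <read hits> <read misses> <write hits> <write misses>
<demotions> <promotions> <blocks resident> <dirty> ..., the interesting
counters can be pulled out with something like:

dmsetup status storage | awk '{ print "meta", $4, "demotions", $9, "promotions", $10, "resident", $11, "dirty", $12 }'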

There are ~300k demotions with no crash so far.
So filling the cache and triggering demotions is probably not the only factor here.

regards.





* Re: dm-cache fs corruption
  2013-11-15 13:01                   ` Vladimir Smolensky
@ 2013-11-28 16:50                     ` Vladimir Smolensky
  2013-11-28 17:28                       ` Joe Thornber
  0 siblings, 1 reply; 20+ messages in thread
From: Vladimir Smolensky @ 2013-11-28 16:50 UTC (permalink / raw)
  To: device-mapper development



Hello, I would like to test the latest dm-cache available to see if the
problem I was having still exists. Where can I get this code?

regards.




* Re: dm-cache fs corruption
  2013-11-28 16:50                     ` Vladimir Smolensky
@ 2013-11-28 17:28                       ` Joe Thornber
  2013-11-29 13:34                         ` Vladimir Smolensky
  0 siblings, 1 reply; 20+ messages in thread
From: Joe Thornber @ 2013-11-28 17:28 UTC (permalink / raw)
  To: device-mapper development

On Thu, Nov 28, 2013 at 06:50:39PM +0200, Vladimir Smolensky wrote:
> Hello, I would like to test the latest dm-cache available to see if the
> problem I was having still exists. Where can I get this code?

My latest stable code lives in the 'thin-dev' branch of my github tree:

https://github.com/jthornber/linux-2.6/tree/thin-dev

This has the latest versions of both the thin-provisioning and cache
targets, and will shortly have the era target too.
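
If you already have a kernel tree checked out, the easiest way to try it is
probably to add the tree as a remote (a sketch, using the branch named above):

git remote add thornber https://github.com/jthornber/linux-2.6.git
git fetch thornber
git checkout -b thin-dev thornber/thin-dev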


- Joe


* Re: dm-cache fs corruption
  2013-11-28 17:28                       ` Joe Thornber
@ 2013-11-29 13:34                         ` Vladimir Smolensky
  2013-11-29 15:06                           ` Vladimir Smolensky
  0 siblings, 1 reply; 20+ messages in thread
From: Vladimir Smolensky @ 2013-11-29 13:34 UTC (permalink / raw)
  To: device-mapper development



This is a complete kernel, right?



* Re: dm-cache fs corruption
  2013-11-29 13:34                         ` Vladimir Smolensky
@ 2013-11-29 15:06                           ` Vladimir Smolensky
  2013-11-30 14:28                             ` Vladimir Smolensky
  0 siblings, 1 reply; 20+ messages in thread
From: Vladimir Smolensky @ 2013-11-29 15:06 UTC (permalink / raw)
  To: device-mapper development



Ok, I compiled it and will test if it works.

When I try to create my cache with a block size smaller than 8192, I get:

# dmsetup  create storage --table '0 40963653632 cache /dev/sdc1 /dev/sdc2
/dev/sdb1 4096 1 writeback default 0'
device-mapper: reload ioctl on storage failed: Cannot allocate memory
Command failed

dmesg:

[  664.940597] device-mapper: cache-policy-mq: version 1.1.0 loaded
[  665.002924] ------------[ cut here ]------------
[  665.002943] WARNING: CPU: 0 PID: 18271 at mm/page_alloc.c:2484
__alloc_pages_nodemask+0x73d/0x820()
[  665.002945] Modules linked in: dm_cache_mq dm_cache dm_bio_prison
dm_persistent_data dm_bufio ipv6 bonding coretemp ixgbe i7core_edac
edac_core kvm processor pcspkr hed button bnx2 dcdbas dca mdio ehci_pci
thermal_sys microcode joydev sha256_generic libiscsi scsi_transport_iscsi
tg3 ptp pps_core libphy e1000 fuse nfs lockd sunrpc jfs multipath linear
raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor xor
async_tx raid6_pq raid1 raid0 dm_snapshot dm_crypt dm_mirror dm_region_hash
dm_log dm_mod hid_sunplus hid_sony hid_samsung hid_pl hid_petalynx
hid_gyration sl811_hcd usbhid ohci_hcd uhci_hcd usb_storage ehci_hcd
usbcore usb_common mpt2sas raid_class aic94xx libsas lpfc crc_t10dif
crct10dif_common qla2xxx megaraid_sas megaraid_mbox megaraid_mm megaraid
aacraid sx8 DAC960 cciss 3w_9xxx 3w_xxxx mptsas scsi_transport_sas mptfc
scsi_transport_fc scsi_tgt mptspi mptscsih mptbase atp870u dc395x qla1280
imm parport dmx3191d sym53c8xx gdth advansys initio BusLogic arcmsr aic7xxx
aic79xx scsi_transport_spi sg pdc_adma sata_inic162x sata_mv ata_piix ahci
libahci sata_qstor sata_vsc sata_uli sata_sis sata_sx4 sata_nv sata_via
sata_svw sata_sil24 sata_sil sata_promise pata_sl82c105 pata_cs5530
pata_cs5520 pata_via pata_jmicron pata_marvell pata_sis pata_netcell
pata_sc1200 pata_pdc202xx_old pata_triflex pata_atiixp pata_opti pata_amd
pata_ali pata_it8213 pata_ns87415 pata_ns87410 pata_serverworks pata_artop
pata_it821x pata_optidma pata_hpt3x2n pata_hpt3x3 pata_hpt37x pata_hpt366
pata_cmd64x pata_efar pata_rz1000 pata_sil680 pata_radisys pata_pdc2027x
pata_mpiix libata
[  665.003077] CPU: 0 PID: 18271 Comm: dmsetup Not tainted 3.12.0-rc5+ #1
[  665.003079] Hardware name: Dell Inc. PowerEdge R510/0DPRKF, BIOS 1.5.3
10/25/2010
[  665.003082]  00000000000009b4 ffff881fada458f8 ffffffff81595198
00000000000009b4
[  665.003085]  0000000000000000 ffff881fada45938 ffffffff81043a62
ffff881fada45918
[  665.003088]  0000000000000000 ffff881fae8a0000 0000000000000000
0000000000000002
[  665.003092] Call Trace:
[  665.003101]  [<ffffffff81595198>] dump_stack+0x49/0x61
[  665.003107]  [<ffffffff81043a62>] warn_slowpath_common+0x82/0xb0
[  665.003111]  [<ffffffff81043aa5>] warn_slowpath_null+0x15/0x20
[  665.003114]  [<ffffffff810c595d>] __alloc_pages_nodemask+0x73d/0x820
[  665.003120]  [<ffffffffa14405b7>] ? mq_create+0x187/0x3a0 [dm_cache_mq]
[  665.003124]  [<ffffffff810c5ac2>] __get_free_pages+0x12/0x50
[  665.003130]  [<ffffffff810f69cb>] __kmalloc+0xeb/0xf0
[  665.003134]  [<ffffffffa1440767>] mq_create+0x337/0x3a0 [dm_cache_mq]
[  665.003139]  [<ffffffffa1436faa>] dm_cache_policy_create+0x4a/0xcc
[dm_cache]
[  665.003143]  [<ffffffffa14339d1>] cache_ctr+0x4a1/0xd40 [dm_cache]
[  665.003152]  [<ffffffffa065dbb8>] ? dm_split_args+0x78/0x140 [dm_mod]
[  665.003159]  [<ffffffffa065ddba>] dm_table_add_target+0x13a/0x390
[dm_mod]
[  665.003166]  [<ffffffffa0660eb0>] table_load+0xd0/0x330 [dm_mod]
[  665.003173]  [<ffffffffa0660de0>] ? table_clear+0xd0/0xd0 [dm_mod]
[  665.003179]  [<ffffffffa0662032>] ctl_ioctl+0x1d2/0x410 [dm_mod]
[  665.003187]  [<ffffffff8103a718>] ? __do_page_fault+0x208/0x4e0
[  665.003193]  [<ffffffffa066227e>] dm_ctl_ioctl+0xe/0x20 [dm_mod]
[  665.003198]  [<ffffffff8110bd2e>] do_vfs_ioctl+0x8e/0x4e0
[  665.003202]  [<ffffffff810fbc09>] ? ____fput+0x9/0x10
[  665.003205]  [<ffffffff8110c1d2>] SyS_ioctl+0x52/0x80
[  665.003211]  [<ffffffff815996f9>] system_call_fastpath+0x16/0x1b
[  665.003213] ---[ end trace 2afe5f836777e03f ]---
[  665.008704] device-mapper: table: 253:0: cache: Error creating cache's
policy
[  665.008712] device-mapper: ioctl: error adding target to table


# blockdev --report
RO    RA   SSZ   BSZ   StartSec            Size   Device
rw   256   512  4096          0     26843414528   /dev/sda
rw   256   512  4096       2048      4094689280   /dev/sda1
rw   256   512  4096    7999488     22746759168   /dev/sda2
rw   256   512  4096          0  20973392756736   /dev/sdb
rw   256   512  4096       2048  20973390659584   /dev/sdb1
rw   256   512  4096          0   2397799710720   /dev/sdc
rw   256   512  4096       2048      1023410176   /dev/sdc1
rw   256   512  4096    2000896   2396774203392   /dev/sdc2

regards.




* Re: dm-cache fs corruption
  2013-11-29 15:06                           ` Vladimir Smolensky
@ 2013-11-30 14:28                             ` Vladimir Smolensky
  2013-12-02 14:47                               ` Mike Snitzer
  0 siblings, 1 reply; 20+ messages in thread
From: Vladimir Smolensky @ 2013-11-30 14:28 UTC (permalink / raw)
  To: device-mapper development



Hello, the device broke after about a day:

dmesg:

[83086.878182] device-mapper: space map metadata: unable to allocate new
metadata block
[83088.879861] device-mapper: space map metadata: unable to allocate new
metadata block
[83089.414700] device-mapper: space map metadata: unable to allocate new
metadata block
[83089.414706] device-mapper: cache: could not commit metadata for accurate
status
[83090.881533] device-mapper: space map metadata: unable to allocate new
metadata block
[83092.883142] device-mapper: space map metadata: unable to allocate new
metadata block
[83094.884719] device-mapper: space map metadata: unable to allocate new
metadata block
[83096.886355] device-mapper: space map metadata: unable to allocate new
metadata block
[83097.712195] device-mapper: space map metadata: unable to allocate new
metadata block
[83097.712200] device-mapper: cache: could not commit metadata for accurate
status
[83098.283947] device-mapper: space map metadata: unable to allocate new
metadata block
[83098.283952] device-mapper: cache: could not commit metadata for accurate
status
[83098.729396] device-mapper: space map metadata: unable to allocate new
metadata block
[83098.729401] device-mapper: cache: could not commit metadata for accurate
status

dmsetup status:
storage: 0 40963653632 cache 249856/249856 26968230 2678513 711227 30042060
0 248746 248729 0 1 writeback 2 migration_threshold 100000000 4
random_threshold 4 sequential_threshold 10000000

Obviously the metadata partition is full, but according to the formula I
use it should be more than enough: 4 MB + (16 bytes * nr_blocks) gives
13341184 bytes, ~13MB, and the metadata partition is 1024MB.
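
Redoing that estimate with the ~571435 cache blocks that a 4MiB block size
gives on this data partition (the exact block count doesn't change the
picture):

echo $(( 4 * 1024 * 1024 + 16 * 571435 ))
13337264

i.e. roughly 13MB either way, against the 1023410176-byte (~1GB) /dev/sdc1.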

device sizes:
 # blockdev --report
RO    RA   SSZ   BSZ   StartSec            Size   Device
rw   256   512  4096          0     26843414528   /dev/sda
rw   256   512  4096       2048      4094689280   /dev/sda1
rw   256   512  4096    7999488     22746759168   /dev/sda2
rw  2048   512  4096          0  20973392756736   /dev/sdb
rw  2048   512  4096       2048  20973390659584   /dev/sdb1
rw  2048   512  4096          0   2397799710720   /dev/sdc
rw  2048   512  4096       2048      1023410176   /dev/sdc1
rw  2048   512  4096    2000896   2396774203392   /dev/sdc2
rw  2048   512   512          0  20973390659584   /dev/dm-0


What am I doing wrong?



regards.




* Re: dm-cache fs corruption
  2013-11-30 14:28                             ` Vladimir Smolensky
@ 2013-12-02 14:47                               ` Mike Snitzer
  0 siblings, 0 replies; 20+ messages in thread
From: Mike Snitzer @ 2013-12-02 14:47 UTC (permalink / raw)
  To: Vladimir Smolensky; +Cc: device-mapper development

On Sat, Nov 30 2013 at  9:28am -0500,
Vladimir Smolensky <arizal@gmail.com> wrote:

> Hello, the device broke after about a day:
> 
> dmesg:
> 
> [83086.878182] device-mapper: space map metadata: unable to allocate new
> metadata block
> [83088.879861] device-mapper: space map metadata: unable to allocate new
> metadata block
> [83089.414700] device-mapper: space map metadata: unable to allocate new
> metadata block
> [83089.414706] device-mapper: cache: could not commit metadata for accurate
> status
> [83090.881533] device-mapper: space map metadata: unable to allocate new
> metadata block
> [83092.883142] device-mapper: space map metadata: unable to allocate new
> metadata block
> [83094.884719] device-mapper: space map metadata: unable to allocate new
> metadata block
> [83096.886355] device-mapper: space map metadata: unable to allocate new
> metadata block
> [83097.712195] device-mapper: space map metadata: unable to allocate new
> metadata block
> [83097.712200] device-mapper: cache: could not commit metadata for accurate
> status
> [83098.283947] device-mapper: space map metadata: unable to allocate new
> metadata block
> [83098.283952] device-mapper: cache: could not commit metadata for accurate
> status
> [83098.729396] device-mapper: space map metadata: unable to allocate new
> metadata block
> [83098.729401] device-mapper: cache: could not commit metadata for accurate
> status
> 
> dmsetup status:
> storage: 0 40963653632 cache 249856/249856 26968230 2678513 711227 30042060
> 0 248746 248729 0 1 writeback 2 migration_threshold 100000000 4
> random_threshold 4 sequential_threshold 10000000
> 
> Obviously the metadata partition is full, but according to the formula I
> use it should be more than enough: 4 MB + (16 bytes * nr_blocks) gives
> 13341184 bytes, ~13MB, and the metadata partition is 1024MB.
>
> device sizes:
>  # blockdev --report
> RO    RA   SSZ   BSZ   StartSec            Size   Device
> rw   256   512  4096          0     26843414528   /dev/sda
> rw   256   512  4096       2048      4094689280   /dev/sda1
> rw   256   512  4096    7999488     22746759168   /dev/sda2
> rw  2048   512  4096          0  20973392756736   /dev/sdb
> rw  2048   512  4096       2048  20973390659584   /dev/sdb1
> rw  2048   512  4096          0   2397799710720   /dev/sdc
> rw  2048   512  4096       2048      1023410176   /dev/sdc1
> rw  2048   512  4096    2000896   2396774203392   /dev/sdc2
> rw  2048   512   512          0  20973390659584   /dev/dm-0
>
> What am I doing wrong?

Not clear yet.  Something is causing you to run out of metadata blocks.

Please provide the output of 'dmsetup table'.  Last time, the metadata device
that you provided to the cache target was _not_ 1024MB.
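
For instance, something like the following would show both (assuming the
metadata device is still /dev/sdc1 as in the earlier mails):

dmsetup table storage
blockdev --getsize64 /dev/sdc1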

Mike


* Re: dm-cache fs corruption
  2013-11-13 10:24 dm-cache fs corruption Vladimir Smolensky
  2013-11-13 13:10 ` Joe Thornber
@ 2014-01-22  1:04 ` Vladimir Smolensky
  1 sibling, 0 replies; 20+ messages in thread
From: Vladimir Smolensky @ 2014-01-22  1:04 UTC (permalink / raw)
  To: device-mapper development



Okay, update on the problem.

Tested with
Linux v-5-231-d1862-150 3.13.0-rc8 #1 SMP Fri Jan 17 17:55:34 GMT 2014
x86_64 Intel(R) Xeon(R) CPU E5620 @ 2.40GHz GenuineIntel GNU/Linux


dm-cache breaks when the cache gets full, again.

dmesg:

Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589304] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589310] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589312] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589315] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589317] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589319] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589322] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589324] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589326] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589328] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589331] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589333] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589335] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589337] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589339] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589342] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589344] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589346] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589349] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589351] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589353] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589355] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589357] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589360] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589362] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589364] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589366] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589368] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589371] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589373] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589375] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589377] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589448] XFS (dm-0):
metadata I/O error: block 0x2aeb4ee40 ("xfs_trans_read_buf_map") error 117
numblks 16
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.589455] XFS (dm-0):
xfs_imap_to_bp: xfs_trans_read_buf() returned error 117.
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.590202] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.590206] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.590209] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.590212] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.590214] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.590216] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.590218] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.590221] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.590223] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.590225] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.590227] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.590229] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.590232] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.590234] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.590236] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.590238] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.590241] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.590243] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.590245] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.590248] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.590250] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.590252] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair
Jan 22 00:34:55 v-5-231-d1862-150 kernel: [362737.590254] XFS (dm-0):
Corruption detected. Unmount and run xfs_repair

and so on... it just repeats the same message.

 blockdev --report
RO    RA   SSZ   BSZ   StartSec            Size   Device
rw   256   512  4096          0     26843414528   /dev/sda
rw   256   512  4096       2048      4094689280   /dev/sda1
rw   256   512  4096    7999488     22479372288   /dev/sda2
rw  2048   512  4096          0  20973392756736   /dev/sdb
rw  2048   512  4096       2048  20973390659584   /dev/sdb1
rw  2048   512  4096          0   2397799710720   /dev/sdc
rw  2048   512  4096       2048      1023410176   /dev/sdc1
rw  2048   512  4096    2000896   2396774203392   /dev/sdc2
rw  2048   512   512          0  20973390659584   /dev/dm-0


 dmsetup table
storage: 0 40963653632 cache 8:33 8:34 8:17 8192 1 writeback default 0
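
Matching those device numbers against the blockdev report above (assuming the
usual sd numbering: 8:33=sdc1, 8:34=sdc2, 8:17=sdb1), that table reads as:
metadata=/dev/sdc1, data=/dev/sdc2, origin=/dev/sdb1, block size 8192 sectors
(4MiB), 1 feature arg (writeback), policy "default" (mq), 0 policy args.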

dmsetup status
storage: 0 40963653632 cache 1152/249856 86805905 9729090 1164931 47563307
0 525209 525209 0 1 writeback 2 migration_threshold 10000000 4
random_threshold 4 sequential_threshold 10000000


Cache size ~2.2T, origin ~20TB

If you need more specific info, please tell me what to provide.

regards.






Thread overview: 20 messages
2013-11-13 10:24 dm-cache fs corruption Vladimir Smolensky
2013-11-13 13:10 ` Joe Thornber
2013-11-13 13:48   ` Vladimir Smolensky
2013-11-13 14:31     ` Mike Snitzer
2013-11-13 14:41     ` Joe Thornber
2013-11-13 15:19       ` Vladimir Smolensky
2013-11-13 16:29         ` Vladimir Smolensky
2013-11-13 18:39           ` Mears, Morgan
2013-11-14 14:38             ` Joe Thornber
2013-11-14 15:21               ` Vladimir Smolensky
2013-11-14 20:22                 ` Vladimir Smolensky
2013-11-15 13:01                   ` Vladimir Smolensky
2013-11-28 16:50                     ` Vladimir Smolensky
2013-11-28 17:28                       ` Joe Thornber
2013-11-29 13:34                         ` Vladimir Smolensky
2013-11-29 15:06                           ` Vladimir Smolensky
2013-11-30 14:28                             ` Vladimir Smolensky
2013-12-02 14:47                               ` Mike Snitzer
2013-11-13 19:32           ` Mike Snitzer
2014-01-22  1:04 ` Vladimir Smolensky
