* 4.4.0 - no space left with >1.7 TB free space left
@ 2016-02-08  9:22 Tomasz Chmielewski
  2016-02-08 11:24 ` Roman Mamedov
  0 siblings, 1 reply; 11+ messages in thread
From: Tomasz Chmielewski @ 2016-02-08  9:22 UTC (permalink / raw)
  To: linux-btrfs

Linux 4.4.0 - btrfs is mainly used to host lots of test containers, 
often snapshots, and at times, there is heavy IO in many of them for 
extended periods of time. btrfs is on HDDs.


Every few days I'm getting "no space left" in a container running mongo 
3.2.1 database. Interestingly, haven't seen this issue in containers 
with MySQL. All databases have chattr +C set on their directories.

Why would it fail, if there is so much space left?


2016-02-07T06:06:14.648+0000 E STORAGE  [thread1] WiredTiger (28) [1454825174:633585][9105:0x7f2b7e33e700], file:collection-33-7895599108848542105.wt, WT_SESSION.checkpoint: collection-33-7895599108848542105.wt write error: failed to write 4096 bytes at offset 20480: No space left on device
2016-02-07T06:06:14.648+0000 E STORAGE  [thread1] WiredTiger (28) [1454825174:648740][9105:0x7f2b7e33e700], checkpoint-server: checkpoint server error: No space left on device
2016-02-07T06:06:14.648+0000 E STORAGE  [thread1] WiredTiger (-31804) [1454825174:648766][9105:0x7f2b7e33e700], checkpoint-server: the process must exit and restart: WT_PANIC: WiredTiger library panic
2016-02-07T06:06:14.648+0000 I -        [thread1] Fatal Assertion 28558
2016-02-07T06:06:14.648+0000 I -        [thread1]

***aborting after fassert() failure


2016-02-07T06:06:14.694+0000 I -        [WTJournalFlusher] Fatal Assertion 28559
2016-02-07T06:06:14.694+0000 I -        [WTJournalFlusher]

***aborting after fassert() failure


2016-02-07T06:06:15.203+0000 F -        [WTJournalFlusher] Got signal: 6 (Aborted).

# df -h /srv
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda4       2.7T  1.1T  1.7T  39% /srv

# btrfs fi df /srv
Data, RAID1: total=1.25TiB, used=1014.01GiB
System, RAID1: total=32.00MiB, used=240.00KiB
Metadata, RAID1: total=15.00GiB, used=13.13GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

# btrfs fi show /srv
Label: 'btrfs'  uuid: 105b2e0c-8af2-45ee-b4c8-14ff0a3ca899
         Total devices 2 FS bytes used 1.00TiB
         devid    1 size 2.63TiB used 1.26TiB path /dev/sda4
         devid    2 size 2.63TiB used 1.26TiB path /dev/sdb4

btrfs-progs v4.0.1
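
For readers following along, the figures above can be turned into a quick headroom check. This is only a sketch: the numbers are copied by hand from the `btrfs fi df` and `btrfs fi show` output in this message, and the helper name `pct_used` is made up for illustration. It shows how full the already-allocated data and metadata chunks are, and how much of each device is still unallocated.

```python
# Figures copied from the btrfs output above, all expressed in GiB.
GIB_PER_TIB = 1024.0

data_total, data_used = 1.25 * GIB_PER_TIB, 1014.01
meta_total, meta_used = 15.00, 13.13
# Per-device numbers from "btrfs fi show" (both devices report the same).
dev_size, dev_allocated = 2.63 * GIB_PER_TIB, 1.26 * GIB_PER_TIB


def pct_used(used, total):
    """Percentage of an allocated chunk type that is in use (hypothetical helper)."""
    return 100.0 * used / total


data_pct = pct_used(data_used, data_total)
meta_pct = pct_used(meta_used, meta_total)
unallocated_tib = (dev_size - dev_allocated) / GIB_PER_TIB

print(f"data chunks used:     {data_pct:.1f}%")
print(f"metadata chunks used: {meta_pct:.1f}%")
print(f"unallocated/device:   {unallocated_tib:.2f} TiB")
```

On these numbers the metadata chunks are fuller than the data chunks, but each device still has well over 1 TiB unallocated, which is consistent with the question asked above: raw capacity alone does not explain the error.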



Tomasz Chmielewski
http://wpkg.org



* Re: 4.4.0 - no space left with >1.7 TB free space left
  2016-02-08  9:22 4.4.0 - no space left with >1.7 TB free space left Tomasz Chmielewski
@ 2016-02-08 11:24 ` Roman Mamedov
  2016-02-08 12:15   ` Tomasz Chmielewski
  2016-04-08 11:36   ` Tomasz Chmielewski
  0 siblings, 2 replies; 11+ messages in thread
From: Roman Mamedov @ 2016-02-08 11:24 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: linux-btrfs


On Mon, 08 Feb 2016 18:22:34 +0900
Tomasz Chmielewski <tch@virtall.com> wrote:

> Linux 4.4.0 - btrfs is mainly used to host lots of test containers, 
> often snapshots, and at times, there is heavy IO in many of them for 
> extended periods of time. btrfs is on HDDs.
> 
> 
> Every few days I'm getting "no space left" in a container running mongo 
> 3.2.1 database. Interestingly, haven't seen this issue in containers 
> with MySQL. All databases have chattr +C set on their directories.

Hello,

Do you snapshot the parent subvolume which holds the databases? Can you
correlate that perhaps ENOSPC occurs at the time of snapshotting? If yes, then
you should try the patch https://patchwork.kernel.org/patch/7967161/

(Too bad this was not included into 4.4.1.)

-- 
With respect,
Roman


* Re: 4.4.0 - no space left with >1.7 TB free space left
  2016-02-08 11:24 ` Roman Mamedov
@ 2016-02-08 12:15   ` Tomasz Chmielewski
  2016-02-08 12:17     ` Roman Mamedov
  2016-04-08 11:36   ` Tomasz Chmielewski
  1 sibling, 1 reply; 11+ messages in thread
From: Tomasz Chmielewski @ 2016-02-08 12:15 UTC (permalink / raw)
  To: Roman Mamedov; +Cc: linux-btrfs

On 2016-02-08 20:24, Roman Mamedov wrote:
> On Mon, 08 Feb 2016 18:22:34 +0900
> Tomasz Chmielewski <tch@virtall.com> wrote:
> 
>> Linux 4.4.0 - btrfs is mainly used to host lots of test containers,
>> often snapshots, and at times, there is heavy IO in many of them for
>> extended periods of time. btrfs is on HDDs.
>> 
>> 
>> Every few days I'm getting "no space left" in a container running 
>> mongo
>> 3.2.1 database. Interestingly, haven't seen this issue in containers
>> with MySQL. All databases have chattr +C set on their directories.
> 
> Hello,
> 
> Do you snapshot the parent subvolume which holds the databases? Can you
> correlate that perhaps ENOSPC occurs at the time of snapshotting?

Not sure.

With the last error, a snapshot was made at around 06:06, while "no 
space left" was reported at 06:14. Suspiciously close to each other, but 
still a few minutes apart.

Unfortunately I don't have error logs for the previous cases.


> If yes, then
> you should try the patch https://patchwork.kernel.org/patch/7967161/
> 
> (Too bad this was not included into 4.4.1.)

I'll keep an eye on it, thanks.


Tomasz Chmielewski
http://www.ptraveler.com



* Re: 4.4.0 - no space left with >1.7 TB free space left
  2016-02-08 12:15   ` Tomasz Chmielewski
@ 2016-02-08 12:17     ` Roman Mamedov
  0 siblings, 0 replies; 11+ messages in thread
From: Roman Mamedov @ 2016-02-08 12:17 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: linux-btrfs


On Mon, 08 Feb 2016 21:15:38 +0900
Tomasz Chmielewski <tch@virtall.com> wrote:

> With the last error, a snapshot was made at around 06:06
> "no space left" was reported on 06:14.

If you mean the log that you have posted in your original message, the ENOSPC
happened at 06:06 and 14 seconds, not 06:14.
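
The reading of that timestamp can be checked mechanically. The sketch below parses the timestamp prefix of the first WiredTiger error line from the original message; treating "around 06:06" as exactly 06:06:00 UTC is an assumption made purely for illustration, since the real snapshot time was not posted.

```python
from datetime import datetime

# Timestamp prefix of the first WiredTiger error in the original message.
stamp = "2016-02-07T06:06:14.648+0000"

# %f parses the 3-digit millisecond field, %z parses the "+0000" offset.
when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%f%z")

# Assumption: the snapshot is taken to have run at exactly 06:06:00 UTC.
snapshot = when.replace(second=0, microsecond=0)

gap = (when - snapshot).total_seconds()
print(when.hour, when.minute, when.second)
print(f"{gap:.3f} s after the assumed snapshot time")
```

So the fields are hour 6, minute 6, second 14, and under that assumption the error falls within the same minute as the snapshot rather than eight minutes later.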

-- 
With respect,
Roman


* Re: 4.4.0 - no space left with >1.7 TB free space left
  2016-02-08 11:24 ` Roman Mamedov
  2016-02-08 12:15   ` Tomasz Chmielewski
@ 2016-04-08 11:36   ` Tomasz Chmielewski
  2016-04-08 11:53     ` Roman Mamedov
  1 sibling, 1 reply; 11+ messages in thread
From: Tomasz Chmielewski @ 2016-04-08 11:36 UTC (permalink / raw)
  To: Roman Mamedov; +Cc: linux-btrfs

On 2016-02-08 20:24, Roman Mamedov wrote:

>> Linux 4.4.0 - btrfs is mainly used to host lots of test containers,
>> often snapshots, and at times, there is heavy IO in many of them for
>> extended periods of time. btrfs is on HDDs.
>> 
>> 
>> Every few days I'm getting "no space left" in a container running 
>> mongo
>> 3.2.1 database. Interestingly, haven't seen this issue in containers
>> with MySQL. All databases have chattr +C set on their directories.
> 
> Hello,
> 
> Do you snapshot the parent subvolume which holds the databases? Can you
> correlate that perhaps ENOSPC occurs at the time of snapshotting? If 
> yes, then
> you should try the patch https://patchwork.kernel.org/patch/7967161/
> 
> (Too bad this was not included into 4.4.1.)

By the way - was it included in any later kernel? I'm running 4.4.5 on 
that server, but still hitting the same issue.


Tomasz Chmielewski
http://wpkg.org



* Re: 4.4.0 - no space left with >1.7 TB free space left
  2016-04-08 11:36   ` Tomasz Chmielewski
@ 2016-04-08 11:53     ` Roman Mamedov
  2016-04-08 13:20       ` Tomasz Chmielewski
                         ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Roman Mamedov @ 2016-04-08 11:53 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: linux-btrfs, Chris Mason, Zhao Lei


On Fri, 08 Apr 2016 20:36:26 +0900
Tomasz Chmielewski <tch@virtall.com> wrote:

> On 2016-02-08 20:24, Roman Mamedov wrote:
> 
> >> Linux 4.4.0 - btrfs is mainly used to host lots of test containers,
> >> often snapshots, and at times, there is heavy IO in many of them for
> >> extended periods of time. btrfs is on HDDs.
> >> 
> >> 
> >> Every few days I'm getting "no space left" in a container running 
> >> mongo
> >> 3.2.1 database. Interestingly, haven't seen this issue in containers
> >> with MySQL. All databases have chattr +C set on their directories.
> > 
> > Hello,
> > 
> > Do you snapshot the parent subvolume which holds the databases? Can you
> > correlate that perhaps ENOSPC occurs at the time of snapshotting? If 
> > yes, then
> > you should try the patch https://patchwork.kernel.org/patch/7967161/
> > 
> > (Too bad this was not included into 4.4.1.)
> 
> By the way - was it included in any later kernel? I'm running 4.4.5 on 
> that server, but still hitting the same issue.

It's not in 4.4.6 either. I don't know why it doesn't get included, or what
we need to do. Last time I asked, it was queued:
http://www.spinics.net/lists/linux-btrfs/msg52478.html
But maybe that meant 4.5 or 4.6 only? While the bug is affecting people on
4.4.x today.

Thanks

-- 
With respect,
Roman


* Re: 4.4.0 - no space left with >1.7 TB free space left
  2016-04-08 11:53     ` Roman Mamedov
@ 2016-04-08 13:20       ` Tomasz Chmielewski
  2016-04-08 19:53       ` Duncan
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Tomasz Chmielewski @ 2016-04-08 13:20 UTC (permalink / raw)
  To: Roman Mamedov; +Cc: linux-btrfs, Chris Mason, Zhao Lei

On 2016-04-08 20:53, Roman Mamedov wrote:

>> > Do you snapshot the parent subvolume which holds the databases? Can you
>> > correlate that perhaps ENOSPC occurs at the time of snapshotting? If
>> > yes, then
>> > you should try the patch https://patchwork.kernel.org/patch/7967161/
>> >
>> > (Too bad this was not included into 4.4.1.)
>> 
>> By the way - was it included in any later kernel? I'm running 4.4.5 on
>> that server, but still hitting the same issue.
> 
> It's not in 4.4.6 either. I don't know why it doesn't get included, or 
> what
> we need to do. Last time I asked, it was queued:
> http://www.spinics.net/lists/linux-btrfs/msg52478.html
> But maybe that meant 4.5 or 4.6 only? While the bug is affecting people 
> on
> 4.4.x today.

Does it mean 4.5 also doesn't have it yet?


Tomasz Chmielewski
http://wpkg.org



* Re: 4.4.0 - no space left with >1.7 TB free space left
  2016-04-08 11:53     ` Roman Mamedov
  2016-04-08 13:20       ` Tomasz Chmielewski
@ 2016-04-08 19:53       ` Duncan
  2016-05-12  6:03       ` Tomasz Chmielewski
  2016-09-15  8:02       ` Roman Mamedov
  3 siblings, 0 replies; 11+ messages in thread
From: Duncan @ 2016-04-08 19:53 UTC (permalink / raw)
  To: linux-btrfs

Roman Mamedov posted on Fri, 08 Apr 2016 16:53:32 +0500 as excerpted:

> It's not in 4.4.6 either. I don't know why it doesn't get included, or
> what we need to do. Last time I asked, it was queued:
> http://www.spinics.net/lists/linux-btrfs/msg52478.html But maybe that
> meant 4.5 or 4.6 only? While the bug is affecting people on 4.4.x today.

Patches must make it to the current development kernel before they're 
eligible for stable.  Additionally, they need to be cced to stable as 
well, in order to be queued there.

So check 4.5 and 4.6-rc.  If it's in neither of those, it's not going to 
be in stable yet.  Once it's in the development kernel, see if it was cced 
to stable and if needed, ask the author and btrfs devs to cc it to stable.

Though sometimes stable can get a backlog as well.  I know earlier this year 
they were dealing with one, but I follow release or development, not 
stable, and don't know what stable's current status is.

If it gets to stable, and it wasn't for a bug introduced /after/ 4.4, it 
should eventually get into 4.4, as that's an LTS kernel.  But it might 
take a while, as the above discussion hints.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: 4.4.0 - no space left with >1.7 TB free space left
  2016-04-08 11:53     ` Roman Mamedov
  2016-04-08 13:20       ` Tomasz Chmielewski
  2016-04-08 19:53       ` Duncan
@ 2016-05-12  6:03       ` Tomasz Chmielewski
  2016-05-12  6:07         ` Tomasz Chmielewski
  2016-09-15  8:02       ` Roman Mamedov
  3 siblings, 1 reply; 11+ messages in thread
From: Tomasz Chmielewski @ 2016-05-12  6:03 UTC (permalink / raw)
  To: Roman Mamedov; +Cc: linux-btrfs, Chris Mason, Zhao Lei

On 2016-04-08 20:53, Roman Mamedov wrote:

>> > Do you snapshot the parent subvolume which holds the databases? Can you
>> > correlate that perhaps ENOSPC occurs at the time of snapshotting? If
>> > yes, then
>> > you should try the patch https://patchwork.kernel.org/patch/7967161/
>> >
>> > (Too bad this was not included into 4.4.1.)
>> 
>> By the way - was it included in any later kernel? I'm running 4.4.5 on
>> that server, but still hitting the same issue.
> 
> It's not in 4.4.6 either. I don't know why it doesn't get included, or 
> what
> we need to do. Last time I asked, it was queued:
> http://www.spinics.net/lists/linux-btrfs/msg52478.html
> But maybe that meant 4.5 or 4.6 only? While the bug is affecting people 
> on
> 4.4.x today.

FYI, I'm still getting this with 4.5.3, which probably means the fix was 
not yet included ("No space left" at snapshot time):

/var/log/postgresql/postgresql-9.3-main.log:2016-05-11 06:06:10 UTC LOG:  could not close temporary statistics file "pg_stat_tmp/db_0.tmp": No space left on device
/var/log/postgresql/postgresql-9.3-main.log:2016-05-11 06:06:10 UTC LOG:  could not close temporary statistics file "pg_stat_tmp/global.tmp": No space left on device
/var/log/postgresql/postgresql-9.3-main.log:2016-05-11 06:06:10 UTC LOG:  could not close temporary statistics file "pg_stat_tmp/db_0.tmp": No space left on device
/var/log/postgresql/postgresql-9.3-main.log:2016-05-11 06:06:10 UTC LOG:  could not close temporary statistics file "pg_stat_tmp/global.tmp": No space left on device


I've tried mounting with space_cache=v2, but it didn't help.



Tomasz Chmielewski
http://wpkg.org



* Re: 4.4.0 - no space left with >1.7 TB free space left
  2016-05-12  6:03       ` Tomasz Chmielewski
@ 2016-05-12  6:07         ` Tomasz Chmielewski
  0 siblings, 0 replies; 11+ messages in thread
From: Tomasz Chmielewski @ 2016-05-12  6:07 UTC (permalink / raw)
  To: Roman Mamedov; +Cc: linux-btrfs, Chris Mason, Zhao Lei

On 2016-05-12 15:03, Tomasz Chmielewski wrote:

> FYI, I'm still getting this with 4.5.3, which probably means the fix
> was not yet included ("No space left" at snapshot time):
> 
> /var/log/postgresql/postgresql-9.3-main.log:2016-05-11 06:06:10 UTC
> LOG:  could not close temporary statistics file
> "pg_stat_tmp/db_0.tmp": No space left on device
> /var/log/postgresql/postgresql-9.3-main.log:2016-05-11 06:06:10 UTC
> LOG:  could not close temporary statistics file
> "pg_stat_tmp/global.tmp": No space left on device
> /var/log/postgresql/postgresql-9.3-main.log:2016-05-11 06:06:10 UTC
> LOG:  could not close temporary statistics file
> "pg_stat_tmp/db_0.tmp": No space left on device
> /var/log/postgresql/postgresql-9.3-main.log:2016-05-11 06:06:10 UTC
> LOG:  could not close temporary statistics file
> "pg_stat_tmp/global.tmp": No space left on device
> 
> 
> I've tried mounting with space_cache=v2, but it didn't help.

On the good side, I see it's in 4.6-rc7.


Tomasz Chmielewski
http://wpkg.org



* Re: 4.4.0 - no space left with >1.7 TB free space left
  2016-04-08 11:53     ` Roman Mamedov
                         ` (2 preceding siblings ...)
  2016-05-12  6:03       ` Tomasz Chmielewski
@ 2016-09-15  8:02       ` Roman Mamedov
  3 siblings, 0 replies; 11+ messages in thread
From: Roman Mamedov @ 2016-09-15  8:02 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: linux-btrfs, Chris Mason, Zhao Lei


On Fri, 8 Apr 2016 16:53:32 +0500
Roman Mamedov <rm@romanrm.net> wrote:

> On Fri, 08 Apr 2016 20:36:26 +0900
> Tomasz Chmielewski <tch@virtall.com> wrote:
> 
> > On 2016-02-08 20:24, Roman Mamedov wrote:
> > 
> > >> Linux 4.4.0 - btrfs is mainly used to host lots of test containers,
> > >> often snapshots, and at times, there is heavy IO in many of them for
> > >> extended periods of time. btrfs is on HDDs.
> > >> 
> > >> 
> > >> Every few days I'm getting "no space left" in a container running 
> > >> mongo
> > >> 3.2.1 database. Interestingly, haven't seen this issue in containers
> > >> with MySQL. All databases have chattr +C set on their directories.
> > > 
> > > Hello,
> > > 
> > > Do you snapshot the parent subvolume which holds the databases? Can you
> > > correlate that perhaps ENOSPC occurs at the time of snapshotting? If 
> > > yes, then
> > > you should try the patch https://patchwork.kernel.org/patch/7967161/
> > > 
> > > (Too bad this was not included into 4.4.1.)
> > 
> > By the way - was it included in any later kernel? I'm running 4.4.5 on 
> > that server, but still hitting the same issue.
> 
> It's not in 4.4.6 either. I don't know why it doesn't get included, or what
> we need to do. Last time I asked, it was queued:
> http://www.spinics.net/lists/linux-btrfs/msg52478.html
> But maybe that meant 4.5 or 4.6 only? While the bug is affecting people on
> 4.4.x today.

This got applied now in 4.4.21, thanks.
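
To summarize where the thread leaves things, here is a rough sketch of a version check. It encodes only what was reported here (fix seen in 4.6-rc7, applied in 4.4.21, 4.5.3 still affected); the function name `has_fix` and the three-way True/False/None result are illustrative assumptions, not anything provided by the kernel or btrfs-progs.

```python
import re

# Versions named in this thread; nothing is assumed about other branches.
FIXED_STABLE = {(4, 4): 21}   # applied in 4.4.21
FIXED_MAINLINE = (4, 6)       # seen in 4.6-rc7
LACKING_UP_TO = {(4, 5): 3}   # 4.5.3 was still hitting the issue


def has_fix(release):
    """True/False where the thread settles it, None where it does not."""
    m = re.match(r"^(\d+)\.(\d+)(?:\.(\d+))?(?:-rc(\d+))?", release)
    if not m:
        return None
    series = (int(m.group(1)), int(m.group(2)))
    patch = int(m.group(3) or 0)
    rc = m.group(4)

    if series in FIXED_STABLE:
        return patch >= FIXED_STABLE[series]
    if series > FIXED_MAINLINE:
        return True  # later mainline series build on 4.6
    if series == FIXED_MAINLINE:
        # Release candidates before rc7 are not covered by the thread.
        return True if rc is None or int(rc) >= 7 else None
    if series in LACKING_UP_TO and patch <= LACKING_UP_TO[series]:
        return False
    return None  # not discussed here


for release in ("4.4.5", "4.4.21", "4.5.3", "4.5.7", "4.6-rc7"):
    print(release, has_fix(release))
```

Distribution kernels that backport patches under an older version number are not covered by a check like this; only the vanilla release numbers mentioned above are.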

-- 
With respect,
Roman

