linux-kernel.vger.kernel.org archive mirror
* Re: 2.6.0-test2-mm3 and mysql
@ 2003-08-28 17:59 Heikki Tuuri
  2003-08-28 19:01 ` Sergey S. Kostyliov
  0 siblings, 1 reply; 26+ messages in thread
From: Heikki Tuuri @ 2003-08-28 17:59 UTC (permalink / raw)
  To: linux-kernel

Sergey,

does it always crash when you start mysqld?

It is page number 0 in the InnoDB tablespace. That is, the header page of
the whole tablespace!

The checksums in the page are ok. That shows the page was not corrupted in
the Linux file system.

InnoDB is trying to do an index search, but that of course crashes, because
the header page is not an index page.

The reason for the crash is probably that a page number in a pointer record
in the father node of the B-tree has been reset to zero. The corruption has
happened in the mysqld process memory, not in the file system of Linux.
Otherwise, InnoDB would have complained about page checksum errors.

No one else has reported this error. I have now added a check to a future
version of InnoDB which will catch this particular error earlier and will
hex dump the father page.

By the way, I noticed that a website http://www.linuxtestproject.org has
made an extensive regression test suite for Linux. They have also
successfully run big MySQL and DB2 stress tests on their computers, on
2.5.xx kernels. If there is something wrong with 2.5.xx or 2.6.0, it
apparently does not concern all computers.

"
The Linux Test Project test suite, ltp-20030807, has been released. The
latest version of the testsuite contains 2000+ tests for the Linux OS.
"

The general picture about InnoDB corruption is that reports have almost
stopped after I advised people on the mailing list to upgrade to
Linux-2.4.20 kernels.

With apologies,

Heikki
Innobase Oy
http://www.innodb.com

"
030827 15:34:10  InnoDB: Page checksum 1165918361, prior-to-4.0.14-form checksum 4088416325
InnoDB: stored checksum 1165918361, prior-to-4.0.14-form stored checksum 4088416325
InnoDB: Page lsn 0 4080819655, low 4 bytes of lsn at page end 4080819655
InnoDB: Page directory corruption: supremum not pointed to
030827 15:34:10  InnoDB: Page dump in ascii and hex (16384 bytes):
 len 16384; hex 457e8099000000000000000000000000000000
00f33c5dc7000000000000f356ce970000000100000000000000
0000040f0000040240000000000000006c00000002000400000
1b60004000001de0000000400028000144e00040000009e0000
00360000000001160002800015de0000000000000b410000000
20000000200260002b5e500260000000200027d300026000119
3a0026000000000000000000014000207e00018000009e00000
003aaaaaaaaaaaaaaaa

...

000000000000000f3b04845f33c5dc7
"
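
The decimal values in the log lines above can be cross-checked against the head and tail of the hex dump. The following is a minimal sketch, not InnoDB code: it assumes big-endian 32-bit fields, with the stored checksum in the first 4 bytes, the page number in the next 4, and an 8-byte trailer holding the prior-to-4.0.14-form checksum followed by the low 4 bytes of the lsn. The variable names are illustrative, and only the first and last 8 bytes of the dump are used because the middle is elided.

```python
# First 8 and last 8 bytes of the hex dump quoted above.
head_hex = "457e809900000000"
tail_hex = "f3b04845f33c5dc7"

head = bytes.fromhex(head_hex)
tail = bytes.fromhex(tail_hex)

# Assumed layout: big-endian 32-bit fields.
stored_checksum = int.from_bytes(head[0:4], "big")    # new-form checksum
page_number = int.from_bytes(head[4:8], "big")        # page number in tablespace
old_form_checksum = int.from_bytes(tail[0:4], "big")  # prior-to-4.0.14-form checksum
lsn_low4 = int.from_bytes(tail[4:8], "big")           # low 4 bytes of lsn at page end

print("stored checksum:  ", stored_checksum)
print("page number:      ", page_number)
print("old-form checksum:", old_form_checksum)
print("lsn low 4 bytes:  ", lsn_low4)
```

If the assumed layout is right, the printed values should agree with the checksum and lsn figures in the log, and the page number should come out as 0.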

From: Sergey S. Kostyliov (rathamahata@php4.ru)
Subject: Re: 2.6.0-test2-mm3 and mysql
View: Complete Thread (22 articles)
Original Format
Newsgroups: linux.kernel
Date: 2003-08-27 09:00:19 PST


On Monday 04 August 2003 04:05, Matt Mackall wrote:
> On Sun, Aug 03, 2003 at 10:58:17PM +0400, Sergey S. Kostyliov wrote:
> > Hello Andrew,
> >
> > On Sunday 03 August 2003 05:04, Andrew Morton wrote:
> > > Shane Shrybman <shrybman@sympatico.ca> wrote:
> > > > One last thing, I have started seeing mysql database corruption
> > > > recently. I am not sure it is a kernel problem. And I don't know the
> > > > exact steps to reproduce it, but I think I started seeing it with
> > > > -test2-mm2. I haven't ever seen db corruption in the 8-12 months I
> > > > have being playing with mysql/php.
> > >
> > > hm, that's a worry.  No additional info available?
> >
> > I also suffer from this problem (I'm speaking about heavy InnoDB
> > corruption here), but with vanilla 2.6.0-test2. I can't blame
> > MySQL/InnoDB because there are a lot of MySQL boxes around of me with the
> > same (in fact the box wich failed is replication slave) or allmost the
> > same database setup. All other boxes (2.4 kernel) works fine up to now.
>
> All Linux kernels prior to 2.6.0-test2-mm3-1 would silently fail to
> complete fsync() and msync() operations if they encountered an I/O
> error, resulting in corruption. If a particular disk subsystem was
> producing these errors, the symptoms would likely be:
>
> - no error reported
> - no messages in logs
> - independent of kernel version, etc.
> - suddenly appear at some point in drive life
> - works flawlessly on other machines
>
> If you can reproduce this corruption, please try running against mm3-1
> and seeing if it reports problems (both to fsync and in logs).

I've just got another InnoDB crash with 2.6.0-test4.
As in the previous case, there were no messages in the kernel log.
You can find the mysql error log here:
http://sysadminday.org.ru/linux-2.6.0-test4_InnoDB_crash

It's a development server, so this isn't a big problem.
I do understand that this can easily be a hardware problem,
but the kernel's silence is really sad in such a case.
Memory is fine (at least according to memtest 3.0).

Any hints will be appreciated.

-- 
                   Best regards,
                   Sergey S. Kostyliov <rathamahata@php4.ru>
                   Public PGP key: http://sysadminday.org.ru/rathamahata.asc




* Re: 2.6.0-test2-mm3 and mysql
  2003-08-28 17:59 2.6.0-test2-mm3 and mysql Heikki Tuuri
@ 2003-08-28 19:01 ` Sergey S. Kostyliov
  2003-08-28 19:10   ` Heikki Tuuri
  0 siblings, 1 reply; 26+ messages in thread
From: Sergey S. Kostyliov @ 2003-08-28 19:01 UTC (permalink / raw)
  To: Heikki Tuuri, linux-kernel

Hi Heikki,

On Thursday 28 August 2003 21:59, Heikki Tuuri wrote:
> Sergey,
>
> does it always crash when you start mysqld?

Yes, it was always crashing until I deleted all InnoDB files and restored
InnoDB tables from backup.

>
> It is page number 0 in the InnoDB tablespace. That is, the header page of
> the whole tablespace!
>
> The checksums in the page are ok. That shows the page was not corrupted in
> the Linux file system.
>
> InnoDB is trying to do an index search, but that of course crashes, because
> the header page is not any index page.
>
> The reason for the crash is probably that a page number in a pointer record
> in the father node of the B-tree has been reset to zero. The corruption has
> happened in the mysqld process memory, not in the file system of Linux.
> Otherwise, InnoDB would have complained about page checksum errors.
>
> No one else has reported this error. I have now added a check to a future
> version of InnoDB which will catch this particular error earlier and will
> hex dump the father page.

Yes, now it seems to me that this particular crash is not related to the Linux
kernel at all.
The funny thing is that I've managed to get another InnoDB crash on the same box:
http://sysadminday.org.ru/linux-2.6.0-test4_InnoDB_crash-20030828
which in turn was posted to linux-kernel over two hours ago.
This time the checksums are different :(

>
> By the way, I noticed that a website http://www.linuxtestproject.org has
> made an extensive regression test suite for Linux. They have also
> successfully run big MySQL and DB2 stress tests on their computers, on
> 2.5.xx kernels. If there is something wrong with 2.5.xx or 2.6.0, it
> apparently does not concern all computers.
>
> "
> The Linux Test Project test suite, ltp-20030807, has been released. The
> latest version of the testsuite contains 2000+ tests for the Linux OS.
> "
>
> The general picture about InnoDB corruption is that reports have almost
> stopped after I advised people on the mailing list to upgrade to
> Linux-2.4.20 kernels.

In fact I'm also a happy InnoDB user. It runs fine on 6 of my production
servers. Thanks for the nice work, btw!

It also worked fine on our development server until I upgraded it to
2.6.0-testX. I don't know. It might just be broken hardware
which is stressed harder by 2.6 than by 2.4...

>
> With apologies,
>
> Heikki
> Innobase Oy



* Re: 2.6.0-test2-mm3 and mysql
  2003-08-28 19:01 ` Sergey S. Kostyliov
@ 2003-08-28 19:10   ` Heikki Tuuri
  2003-08-28 19:27     ` Sergey S. Kostyliov
  0 siblings, 1 reply; 26+ messages in thread
From: Heikki Tuuri @ 2003-08-28 19:10 UTC (permalink / raw)
  To: Sergey S. Kostyliov, linux-kernel

Sergey,

----- Original Message ----- 
From: "Sergey S. Kostyliov" <rathamahata@php4.ru>
To: "Heikki Tuuri" <Heikki.Tuuri@innodb.com>; <linux-kernel@vger.kernel.org>
Sent: Thursday, August 28, 2003 10:01 PM
Subject: Re: 2.6.0-test2-mm3 and mysql


> Hi Heikki,
>
> On Thursday 28 August 2003 21:59, Heikki Tuuri wrote:
> > Sergey,
> >
> > does it always crash when you start mysqld?
>
> Yes It was always crashing until I deleted all InnoDB files and restored
> InnoDB tables from backup.
>
> >
> > It is page number 0 in the InnoDB tablespace. That is, the header page of
> > the whole tablespace!
> >
> > The checksums in the page are ok. That shows the page was not corrupted in
> > the Linux file system.
> >
> > InnoDB is trying to do an index search, but that of course crashes, because
> > the header page is not any index page.
> >
> > The reason for the crash is probably that a page number in a pointer record
> > in the father node of the B-tree has been reset to zero. The corruption has
> > happened in the mysqld process memory, not in the file system of Linux.
> > Otherwise, InnoDB would have complained about page checksum errors.
> >
> > No one else has reported this error. I have now added a check to a future
> > version of InnoDB which will catch this particular error earlier and will
> > hex dump the father page.
>
> Yes, now it seems for me that this particular crash in not related to linux
> kernel at all.
> The funny thing I've managed to get another InnoDB crash on the same box
> http://sysadminday.org.ru/linux-2.6.0-test4_InnoDB_crash-20030828
> which in turn was posted to linux-kernel over a two hours ago.
> This time the cheksums are different :(


ok, this time the corruption probably happened in the file cache, or the
file system of Linux, or in the hardware.

It is not at all surprising that you encounter memory corruption and file
corruption in the same computer. That is a common pattern if these problems
appear at all.

Do you have a swap partition? I do not know Linux well enough, but in
theory, file or disk corruption could also cause memory corruption if pages
of the process memory get swapped to disk.


> > By the way, I noticed that a website http://www.linuxtestproject.org has
> > made an extensive regression test suite for Linux. They have also
> > successfully run big MySQL and DB2 stress tests on their computers, on
> > 2.5.xx kernels. If there is something wrong with 2.5.xx or 2.6.0, it
> > apparently does not concern all computers.
> >
> > "
> > The Linux Test Project test suite, ltp-20030807, has been released. The
> > latest version of the testsuite contains 2000+ tests for the Linux OS.
> > "
> >
> > The general picture about InnoDB corruption is that reports have almost
> > stopped after I advised people on the mailing list to upgrade to
> > Linux-2.4.20 kernels.
>
> In fact I'm also a happy InnoDB user. It runs fine on 6 of my production
> servers. Thanks for a nice work btw!
>
> It has worked fine also on our development server until I upgraded it to
> 2.6.0-testX. I don't know. It might be just a broken hardware
> which is better stressed with 2.6 than with 2.4...
>
> >
> > With apologies,
> >
> > Heikki
> > Innobase Oy

Best regards,

Heikki
Innobase Oy
http://www.innodb.com




* Re: 2.6.0-test2-mm3 and mysql
  2003-08-28 19:10   ` Heikki Tuuri
@ 2003-08-28 19:27     ` Sergey S. Kostyliov
  0 siblings, 0 replies; 26+ messages in thread
From: Sergey S. Kostyliov @ 2003-08-28 19:27 UTC (permalink / raw)
  To: Heikki Tuuri, linux-kernel

On Thursday 28 August 2003 23:10, Heikki Tuuri wrote:
> Sergey,
>
> ----- Original Message -----
> From: "Sergey S. Kostyliov" <rathamahata@php4.ru>
> To: "Heikki Tuuri" <Heikki.Tuuri@innodb.com>; <linux-kernel@vger.kernel.org>
> Sent: Thursday, August 28, 2003 10:01 PM
> Subject: Re: 2.6.0-test2-mm3 and mysql
>
> > Hi Heikki,
> >
> > On Thursday 28 August 2003 21:59, Heikki Tuuri wrote:
> > > Sergey,
> > >
> > > does it always crash when you start mysqld?
> >
> > Yes It was always crashing until I deleted all InnoDB files and restored
> > InnoDB tables from backup.
> >
> > > It is page number 0 in the InnoDB tablespace. That is, the header page of
> > > the whole tablespace!
> > >
> > > The checksums in the page are ok. That shows the page was not corrupted in
> > > the Linux file system.
> > >
> > > InnoDB is trying to do an index search, but that of course crashes, because
> > > the header page is not any index page.
> > >
> > > The reason for the crash is probably that a page number in a pointer record
> > > in the father node of the B-tree has been reset to zero. The corruption has
> > > happened in the mysqld process memory, not in the file system of Linux.
> > > Otherwise, InnoDB would have complained about page checksum errors.
> > >
> > > No one else has reported this error. I have now added a check to a future
> > > version of InnoDB which will catch this particular error earlier and will
> > > hex dump the father page.
> >
> > Yes, now it seems for me that this particular crash in not related to linux
> > kernel at all.
> > The funny thing I've managed to get another InnoDB crash on the same box
> > http://sysadminday.org.ru/linux-2.6.0-test4_InnoDB_crash-20030828
> > which in turn was posted to linux-kernel over a two hours ago.
> > This time the cheksums are different :(
>
> ok, this time the corruption probably happened in the file cache, or the
> file system of Linux, or in the hardware.
>
> It is not at all surprising that you encounter memory corruption and file
> corruption in the same computer. That is a common pattern if these problems
> appear at all.
>
> Do you have a swap partition? I do not know Linux well enough, but in
> theory, file or disk corruption could cause also memory corruption if pages
> of the process memory get swapped to disk.

Yes, the swap is used on this box.

rathamahata@dev rathamahata $ free -k
             total       used       free     shared    buffers     cached
Mem:       1035248     860740     174508          0      32676     581852
-/+ buffers/cache:     246212     789036
Swap:      8969484     195136    8774348



* Re: 2.6.0-test2-mm3 and mysql
  2003-08-04  0:05     ` Matt Mackall
@ 2003-08-27 15:52       ` Sergey S. Kostyliov
  0 siblings, 0 replies; 26+ messages in thread
From: Sergey S. Kostyliov @ 2003-08-27 15:52 UTC (permalink / raw)
  To: linux-kernel

On Monday 04 August 2003 04:05, Matt Mackall wrote:
> On Sun, Aug 03, 2003 at 10:58:17PM +0400, Sergey S. Kostyliov wrote:
> > Hello Andrew,
> >
> > On Sunday 03 August 2003 05:04, Andrew Morton wrote:
> > > Shane Shrybman <shrybman@sympatico.ca> wrote:
> > > > One last thing, I have started seeing mysql database corruption
> > > > recently. I am not sure it is a kernel problem. And I don't know the
> > > > exact steps to reproduce it, but I think I started seeing it with
> > > > -test2-mm2. I haven't ever seen db corruption in the 8-12 months I
> > > > have being playing with mysql/php.
> > >
> > > hm, that's a worry.  No additional info available?
> >
> > I also suffer from this problem (I'm speaking about heavy InnoDB
> > corruption here), but with vanilla 2.6.0-test2. I can't blame
> > MySQL/InnoDB because there are a lot of MySQL boxes around of me with the
> > same (in fact the box wich failed is replication slave) or allmost the
> > same database setup. All other boxes (2.4 kernel) works fine up to now.
>
> All Linux kernels prior to 2.6.0-test2-mm3-1 would silently fail to
> complete fsync() and msync() operations if they encountered an I/O
> error, resulting in corruption. If a particular disk subsystem was
> producing these errors, the symptoms would likely be:
>
> - no error reported
> - no messages in logs
> - independent of kernel version, etc.
> - suddenly appear at some point in drive life
> - works flawlessly on other machines
>
> If you can reproduce this corruption, please try running against mm3-1
> and seeing if it reports problems (both to fsync and in logs).

I've just got another InnoDB crash with 2.6.0-test4.
As in the previous case, there were no messages in the kernel log.
You can find the mysql error log here:
http://sysadminday.org.ru/linux-2.6.0-test4_InnoDB_crash

It's a development server, so this isn't a big problem.
I do understand that this can easily be a hardware problem,
but the kernel's silence is really sad in such a case.
Memory is fine (at least according to memtest 3.0).

Any hints will be appreciated.

-- 
                   Best regards,
                   Sergey S. Kostyliov <rathamahata@php4.ru>
                   Public PGP key: http://sysadminday.org.ru/rathamahata.asc


* Re: 2.6.0-test2-mm3 and mysql
  2003-08-04 12:24     ` Denis Vlasenko
@ 2003-08-04 18:29       ` Heikki Tuuri
  0 siblings, 0 replies; 26+ messages in thread
From: Heikki Tuuri @ 2003-08-04 18:29 UTC (permalink / raw)
  To: linux-kernel

Denis,

----- Original Message ----- 
From: "Denis Vlasenko" <vda@port.imtp.ilyichevsk.odessa.ua>
To: "Heikki Tuuri" <Heikki.Tuuri@innodb.com>; <linux-kernel@vger.kernel.org>
Sent: Monday, August 04, 2003 3:24 PM
Subject: Re: 2.6.0-test2-mm3 and mysql


> On 3 August 2003 13:43, Heikki Tuuri wrote:
> > > Well there's a problem.  We're kernel people, not database people.  I, for
> > > one, would not have a clue how to set such a thing up.
> > >
> > > If someone could prepare a simple-enough-for-kernel-people description of
> > > how to get such a test up and running, then we might make some progress.
> >
> > ok :).
> >
> > 1. Download
>
> [4 screenfuls snipped]
>
> That's a hell of a setup work, and kernel folks will always get slightly
> different setups. Can database folks make a fully configured chrootable
> tarball for mysql stress testing?

I think an even better idea is to use some multithreaded file i/o stress
test program. There probably are such programs already. If not, write a
simple C program which calls pread(), pwrite(), and fsync() on pages of size
2 - 16 kB. Vary the data you write, and check that the data you read is what
you wrote to the file the last time. Run the test for several days or even
weeks. Vary the size of the files so that you get real disk reads.
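
A minimal sketch of the kind of test described above, written in Python rather than C for brevity. It is an illustration under stated assumptions, not an existing tool: the function name `run_io_check` and its parameters are made up here, it is single-threaded, and because it reads through the page cache it will not by itself force real disk reads (larger files and longer runs, as suggested above, would be needed for that).

```python
import os
import random
import tempfile


def run_io_check(path, n_pages=64, page_size=16384, rounds=100, seed=0):
    """Write random pages with pwrite(), fsync(), then verify with pread().

    Returns the number of reads whose contents did not match what was
    last written to that page. Hypothetical helper, single-threaded.
    """
    rng = random.Random(seed)
    expected = {}  # page index -> bytes last written to that page
    mismatches = 0
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        for _ in range(rounds):
            page = rng.randrange(n_pages)
            # Vary the data on every write.
            data = rng.getrandbits(page_size * 8).to_bytes(page_size, "big")
            os.pwrite(fd, data, page * page_size)
            expected[page] = data
            os.fsync(fd)  # raises OSError (for example EIO) if the flush fails

            # Read back a previously written page and compare.
            check = rng.choice(list(expected))
            got = os.pread(fd, page_size, check * page_size)
            if got != expected[check]:
                mismatches += 1
    finally:
        os.close(fd)
    return mismatches


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        bad = run_io_check(os.path.join(tmp, "stress.dat"))
        print("mismatched reads:", bad)
```

On healthy hardware the count should stay at zero; a multithreaded variant would run several such loops on separate files or page ranges.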

Is there such a stress test in the standard test suite for Linux
kernel/driver developers?

Running an actual SQL database on top of that file i/o workload may also
have some effect, because it is possible some bugs are really corruption of
the process memory space. Maybe we could simulate the database CPU load by
simple memcpy's etc.

Some additional info I forgot to mention about corruption:

1. In some cases corruption happens very frequently, even several times per
hour. In those cases I suspect a hardware fault. In addition to mysqld, also
other programs may suffer and crash.

2. Most cases of corruption only happen once in several weeks. They require
heavy database load to manifest.

3. A typical case of corruption is that an area of size varying from 4 bytes
to 4 kB is reset to zero in a 16 kB database page. Often it has been the end
of the page. In Sergey's case the whole 16 kB page was reset to zero.
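
The pattern in item 3 can be looked for mechanically. The sketch below is only a heuristic under assumptions: `zero_runs` is a hypothetical helper, and a valid database page can legitimately contain runs of zero bytes, so a hit is a hint for closer inspection rather than proof of corruption.

```python
def zero_runs(page, min_len=4):
    """Return (offset, length) for each run of zero bytes of at least min_len."""
    runs = []
    start = None  # offset where the current zero run began, if any
    for i, b in enumerate(page):
        if b == 0:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i - start))
            start = None
    # A run that extends to the end of the page.
    if start is not None and len(page) - start >= min_len:
        runs.append((start, len(page) - start))
    return runs


# Example: a 16 kB page whose last 4 kB has been reset to zero.
page = bytearray(b"\xaa" * 16384)
page[-4096:] = bytes(4096)
print(zero_runs(bytes(page)))
```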

> --
> vda

Regards,

Heikki
http://www.innodb.com




* Re: 2.6.0-test2-mm3 and mysql
  2003-08-03 10:43   ` Heikki Tuuri
@ 2003-08-04 12:24     ` Denis Vlasenko
  2003-08-04 18:29       ` Heikki Tuuri
  0 siblings, 1 reply; 26+ messages in thread
From: Denis Vlasenko @ 2003-08-04 12:24 UTC (permalink / raw)
  To: Heikki Tuuri, linux-kernel

On 3 August 2003 13:43, Heikki Tuuri wrote:
> > Well there's a problem.  We're kernel people, not database people.  I, for
> > one, would not have a clue how to set such a thing up.
> >
> > If someone could prepare a simple-enough-for-kernel-people description of
> > how to get such a test up and running, then we might make some progress.
> 
> ok :).
> 
> 1. Download

[4 screenfuls snipped]

That's a hell of a lot of setup work, and kernel folks will always get slightly
different setups. Can database folks make a fully configured chrootable
tarball for mysql stress testing?
--
vda


* Re: 2.6.0-test2-mm3 and mysql
  2003-08-03 18:58   ` Sergey S. Kostyliov
@ 2003-08-04  0:05     ` Matt Mackall
  2003-08-27 15:52       ` Sergey S. Kostyliov
  0 siblings, 1 reply; 26+ messages in thread
From: Matt Mackall @ 2003-08-04  0:05 UTC (permalink / raw)
  To: Sergey S. Kostyliov; +Cc: Andrew Morton, Shane Shrybman, linux-kernel

On Sun, Aug 03, 2003 at 10:58:17PM +0400, Sergey S. Kostyliov wrote:
> Hello Andrew,
> 
> On Sunday 03 August 2003 05:04, Andrew Morton wrote:
> > Shane Shrybman <shrybman@sympatico.ca> wrote:
> 
> >
> > > One last thing, I have started seeing mysql database corruption
> > > recently. I am not sure it is a kernel problem. And I don't know the
> > > exact steps to reproduce it, but I think I started seeing it with
> > > -test2-mm2. I haven't ever seen db corruption in the 8-12 months I have
> > > being playing with mysql/php.
> >
> > hm, that's a worry.  No additional info available?
> 
> I also suffer from this problem (I'm speaking about heavy InnoDB corruption
> here), but with vanilla 2.6.0-test2. I can't blame MySQL/InnoDB because
> there are a lot of MySQL boxes around of me with the same (in fact the box
> wich failed is replication slave) or allmost the same database setup.
> All other boxes (2.4 kernel) works fine up to now.

All Linux kernels prior to 2.6.0-test2-mm3-1 would silently fail to
complete fsync() and msync() operations if they encountered an I/O
error, resulting in corruption. If a particular disk subsystem was
producing these errors, the symptoms would likely be:

- no error reported
- no messages in logs
- independent of kernel version, etc.
- suddenly appear at some point in drive life
- works flawlessly on other machines 

If you can reproduce this corruption, please try running against mm3-1
and seeing if it reports problems (both to fsync and in logs).
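
For anyone testing against mm3-1, the application side of "reports problems to fsync" could look roughly like the sketch below. It is an assumption-laden illustration: `write_then_fsync` is a made-up helper, and it relies only on the fact that Python's `os.fsync()` raises `OSError` when the underlying fsync() call fails.

```python
import os
import tempfile


def write_then_fsync(path, data):
    """Write data and fsync it; return None on success, else the errno."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while len(view) > 0:
            written = os.write(fd, view)  # may write fewer bytes than asked
            view = view[written:]
        os.fsync(fd)  # a failed flush surfaces here as OSError
    except OSError as exc:
        return exc.errno
    finally:
        os.close(fd)
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        err = write_then_fsync(os.path.join(tmp, "probe.dat"), b"x" * 16384)
        print("fsync ok" if err is None else "fsync failed, errno %d" % err)
```

On a kernel that silently drops the error, this would print "fsync ok" even when the data never reached the disk, which is the behaviour described above.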

-- 
Matt Mackall : http://www.selenic.com : of or relating to the moon


* Re: 2.6.0-test2-mm3 and mysql
  2003-08-03 16:59 Heikki Tuuri
@ 2003-08-03 23:57 ` Matt Mackall
  0 siblings, 0 replies; 26+ messages in thread
From: Matt Mackall @ 2003-08-03 23:57 UTC (permalink / raw)
  To: Heikki Tuuri; +Cc: linux-kernel

On Sun, Aug 03, 2003 at 07:59:37PM +0300, Heikki Tuuri wrote:
> Shane,
> 
> "
> | tv01.program     | check | error    | got error: 5 when reading datafile at record: 6696 |
> "
> 
> InnoDB reported that same error 5 "EIO I/O error" in a call of fsync().
> MyISAM never calls fsync(), but I guess these problems are related. Let us
> hope Andrew's fix fixes this MyISAM problem, too.

Andrew introduced a buglet into my sync fixes in -mm3 that reported I/O
errors on any sync(). The logic in -mm3-1 should fix this and actually
report failed syncs.

-- 
Matt Mackall : http://www.selenic.com : of or relating to the moon


* Re: 2.6.0-test2-mm3 and mysql
  2003-08-03 17:11   ` Heikki Tuuri
@ 2003-08-03 23:54     ` Matt Mackall
  0 siblings, 0 replies; 26+ messages in thread
From: Matt Mackall @ 2003-08-03 23:54 UTC (permalink / raw)
  To: Heikki Tuuri; +Cc: linux-kernel

On Sun, Aug 03, 2003 at 08:11:29PM +0300, Heikki Tuuri wrote:
> Matt,
> 
> > On Sun, Aug 03, 2003 at 12:10:01PM +0300, Heikki Tuuri wrote:
> > >
> > > What to do? People who write drivers should run heavy, multithreaded file
> > > i/o tests on their computer using some SQL database which calls fsync(). For
> > > example, run the Perl '/sql-bench/innotest's all concurrently on MySQL. If
> > > the problems are in drivers, that could help.
> >
> > Did you know that until test2-mm3, nothing would report errors that
> > occurred on non-synchronous writes? There was no infrastructure to
> > propagate the error back to userspace. If you wrote a page, the write
> > failed on an intermittent I/O error, and then read again, you'd
> > silently get back the old page.
> 
> we are not using the Linux async i/o. Do you mean that? Or the flush which
> the Linux kernel does from the file cache to the disk time to time on its
> own? I assume it will write to the system log an error message if a disk
> write fails?

This has nothing to do with the AIO interface.

Any write where the write returns immediately without syncing the file
is asynchronous. This includes most normal write()s where you're not
using O_DIRECT, O_SYNC or somesuch, and writes to memory-mapped files.

And no, prior to -mm3, there was absolutely no indication that these
failed writes occurred. They were simply dropped. Now they should be
reported in the logs and at the next sync point.
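
To make the distinction concrete, here is a small sketch of two of the cases named above: a write through an O_SYNC descriptor, and a store into a memory-mapped file followed by an explicit sync. The helper names are made up for illustration, and the comments about where an error would surface restate the explanation above rather than anything verified against a particular kernel.

```python
import mmap
import os
import tempfile


def write_o_sync(path, data):
    """write() through an O_SYNC descriptor: returns after the data is flushed."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o600)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)


def write_mapped(path, data):
    """Store into a shared mapping, then msync() it via mmap.flush().

    data must be non-empty, since an empty file cannot be mapped.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.ftruncate(fd, len(data))
        with mmap.mmap(fd, len(data)) as m:
            m[:] = data  # asynchronous: only dirties pages in the page cache
            m.flush()    # sync point: a write-back failure should show up here
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        write_o_sync(os.path.join(tmp, "a.dat"), b"A" * 4096)
        write_mapped(os.path.join(tmp, "b.dat"), b"B" * 4096)
        print("both writes completed")
```

A plain `os.write()` on a descriptor opened without O_SYNC or O_DIRECT falls into the asynchronous category: it returns once the page cache has the data.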
 
> The error 5 Shane reported came from a call of fsync(), and apparently he
> also got that same 5 from a simple file read which CHECK TABLE in MyISAM
> does.

Not sure what you're referring to here.

> Why would a write in the Linux async i/o fail? I am using aio on Windows,
> and if the disk space can be allocated, it seems to fail only in the case of
> a hardware failure.

For various reasons, there was previously no infrastructure for
transmitting write failure back to the writer after the page cache had
taken ownership of the write.

-- 
Matt Mackall : http://www.selenic.com : of or relating to the moon


* Re: 2.6.0-test2-mm3 and mysql
@ 2003-08-03 20:50 Heikki Tuuri
  0 siblings, 0 replies; 26+ messages in thread
From: Heikki Tuuri @ 2003-08-03 20:50 UTC (permalink / raw)
  To: linux-kernel

Sergey,

I looked at your .err file, and you have really got very bad corruption.

The first crash looks like there were index records in the insert buffer to
insert to a page, but the contents of that page were completely wiped to
zero.

After that crash, InnoDB fails to recover, because the log contains an index
record insertion to a page which is completely wiped to zero except for the
lsn field at the start and the end of the page.

Are you running the same workload on MySQL-4.0.14 on other computers?

What MySQL version did you run previously on this computer or did you create
the tablespace from scratch?

Did you upgrade MySQL before upgrading to Linux-2.6, or after that?

Before blaming Linux-2.6 we should know the same load runs ok on
MySQL-4.0.14 on some Linux-2.4 box.

Regards,

Heikki

........................
List:     linux-kernel
Subject:  Re: 2.6.0-test2-mm3 and mysql
From:     "Sergey S. Kostyliov" <rathamahata () php4 ! ru>
Date:     2003-08-03 18:58:17
[Download message RAW]

Hello Andrew,

On Sunday 03 August 2003 05:04, Andrew Morton wrote:
> Shane Shrybman <shrybman@sympatico.ca> wrote:

<cut>

>
> > One last thing, I have started seeing mysql database corruption
> > recently. I am not sure it is a kernel problem. And I don't know the
> > exact steps to reproduce it, but I think I started seeing it with
> > -test2-mm2. I haven't ever seen db corruption in the 8-12 months I have
> > being playing with mysql/php.
>
> hm, that's a worry.  No additional info available?
>

I also suffer from this problem (I'm speaking about heavy InnoDB corruption
here), but with vanilla 2.6.0-test2. I can't blame MySQL/InnoDB because
there are a lot of MySQL boxes around of me with the same (in fact the box
wich failed is replication slave) or allmost the same database setup.
All other boxes (2.4 kernel) works fine up to now.

Sorry but I can't provide additional info. There was no messages in kernel
log.
All I have is mysql error logs. But I'm afraid they are not very helpfull
for kernel developers.
http://sysadminday.org.ru/linux-2.6.0-test2_InnoDB_crash

System is x86 UP PIII 500, 1Gb RAM with software RAID1 over two scsi disks.



-- 
                   Best regards,
                   Sergey S. Kostyliov <rathamahata@php4.ru>
                   Public PGP key: http://sysadminday.org.ru/rathamahata.asc




* Re: 2.6.0-test2-mm3 and mysql
  2003-08-03 15:01       ` Shane Shrybman
@ 2003-08-03 19:25         ` Andrew Morton
  0 siblings, 0 replies; 26+ messages in thread
From: Andrew Morton @ 2003-08-03 19:25 UTC (permalink / raw)
  To: Shane Shrybman; +Cc: linux-kernel

Shane Shrybman <shrybman@sympatico.ca> wrote:
>
> I still haven't been able to make it appear in 2.6.0-test1-mm1, but once
>  when I rebooted from -test1-mm1 to -test2-mm3 the tables had problems
>  immediately

Sorry, test2-mm3 was a disaster.  I replaced it with test2-mm3-1, which
should be better.

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.0-test2/2.6.0-test2-mm3/2.6.0-test2-mm3-1.bz2


* Re: 2.6.0-test2-mm3 and mysql
  2003-08-03  1:04 ` Andrew Morton
  2003-08-03  1:52   ` Con Kolivas
  2003-08-03  1:58   ` Shane Shrybman
@ 2003-08-03 18:58   ` Sergey S. Kostyliov
  2003-08-04  0:05     ` Matt Mackall
  2 siblings, 1 reply; 26+ messages in thread
From: Sergey S. Kostyliov @ 2003-08-03 18:58 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Shane Shrybman, linux-kernel

Hello Andrew,

On Sunday 03 August 2003 05:04, Andrew Morton wrote:
> Shane Shrybman <shrybman@sympatico.ca> wrote:

<cut>

>
> > One last thing, I have started seeing mysql database corruption
> > recently. I am not sure it is a kernel problem. And I don't know the
> > exact steps to reproduce it, but I think I started seeing it with
> > -test2-mm2. I haven't ever seen db corruption in the 8-12 months I have
> > being playing with mysql/php.
>
> hm, that's a worry.  No additional info available?
>

I also suffer from this problem (I'm speaking about heavy InnoDB corruption
here), but with vanilla 2.6.0-test2. I can't blame MySQL/InnoDB because
there are a lot of MySQL boxes around me with the same (in fact the box
which failed is a replication slave) or almost the same database setup.
All other boxes (2.4 kernel) work fine up to now.

Sorry but I can't provide additional info. There were no messages in the kernel log.
All I have is mysql error logs. But I'm afraid they are not very helpful
for kernel developers.
http://sysadminday.org.ru/linux-2.6.0-test2_InnoDB_crash

System is x86 UP PIII 500, 1Gb RAM with software RAID1 over two scsi disks.



-- 
                   Best regards,
                   Sergey S. Kostyliov <rathamahata@php4.ru>
                   Public PGP key: http://sysadminday.org.ru/rathamahata.asc


* Re: 2.6.0-test2-mm3 and mysql
  2003-08-03 16:55 ` Matt Mackall
@ 2003-08-03 17:11   ` Heikki Tuuri
  2003-08-03 23:54     ` Matt Mackall
  0 siblings, 1 reply; 26+ messages in thread
From: Heikki Tuuri @ 2003-08-03 17:11 UTC (permalink / raw)
  To: linux-kernel

Matt,

----- Original Message ----- 
From: "Matt Mackall" <mpm@selenic.com>
To: "Heikki Tuuri" <Heikki.Tuuri@innodb.com>
Cc: <linux-kernel@vger.kernel.org>
Sent: Sunday, August 03, 2003 7:55 PM
Subject: Re: 2.6.0-test2-mm3 and mysql


> On Sun, Aug 03, 2003 at 12:10:01PM +0300, Heikki Tuuri wrote:
> >
> > What to do? People who write drivers should run heavy, multithreaded file
> > i/o tests on their computer using some SQL database which calls fsync(). For
> > example, run the Perl '/sql-bench/innotest's all concurrently on MySQL. If
> > the problems are in drivers, that could help.
>
> Did you know that until test2-mm3, nothing would report errors that
> occurred on non-synchronous writes? There was no infrastructure to
> propagate the error back to userspace. If you wrote a page, the write
> failed on an intermittent I/O error, and then read again, you'd
> silently get back the old page.

We are not using the Linux async i/o. Do you mean that? Or the flush which
the Linux kernel does from the file cache to the disk from time to time on
its own? I assume it will write an error message to the system log if a disk
write fails?

The error 5 Shane reported came from a call of fsync(), and apparently he
also got that same 5 from a simple file read which CHECK TABLE in MyISAM
does.

Why would a write in the Linux async i/o fail? I am using aio on Windows,
and if the disk space can be allocated, it seems to fail only in the case of
a hardware failure.

> -- 
> Matt Mackall : http://www.selenic.com : of or relating to the moon

Regards,

Heikki



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: 2.6.0-test2-mm3 and mysql
@ 2003-08-03 16:59 Heikki Tuuri
  2003-08-03 23:57 ` Matt Mackall
  0 siblings, 1 reply; 26+ messages in thread
From: Heikki Tuuri @ 2003-08-03 16:59 UTC (permalink / raw)
  To: linux-kernel

Shane,

"
| tv01.program     | check | error    | got error: 5 when reading datafile at record: 6696 |
"

InnoDB reported that same error 5 "EIO I/O error" in a call of fsync().
MyISAM never calls fsync(), but I guess these problems are related. Let us
hope Andrew's fix fixes this MyISAM problem, too.

Before your case I have not seen MyISAM report table corruption with error
5. A brief Googling only returns 4 reports of 'got error: 5'. Thus, it is
likely that the bug in this case is in the OS/drivers/hardware.

Regards,

Heikki

....................
List:     linux-kernel
Subject:  Re: 2.6.0-test2-mm3 and mysql
From:     Shane Shrybman <shrybman () sympatico ! ca>
Date:     2003-08-03 15:01:52

On Sat, 2003-08-02 at 22:08, Andrew Morton wrote:
> Shane Shrybman <shrybman@sympatico.ca> wrote:
> >
> > The db corruption hit again on test2-mm2.
>
> How do you know it is "db corruption"?

I haven't been able to get an exact recipe for producing this but I have
posted a couple of the mysql corruption messages.

I still haven't been able to make it appear in 2.6.0-test1-mm1, but once
when I rebooted from -test1-mm1 to -test2-mm3 the tables had problems
immediately, when they came up clean in -test1-mm1 right before. When I
ran the mysql repair tables command it fixed them up and did not delete
any rows from the corrupted table, (or only very few). The repair
command usually deletes thousands of rows in order to repair the table.

http://zeke.yi.org/linux/mysql.tables.corrupt

I haven't found any info on this error message but maybe someone has
seen it before?

BTW: I am using myisam table type in mysql.

I will let you know if I find the exact way to reproduce this problem.

Regards,

Shane



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: 2.6.0-test2-mm3 and mysql
  2003-08-03  9:10 Heikki Tuuri
  2003-08-03  9:27 ` Andrew Morton
@ 2003-08-03 16:55 ` Matt Mackall
  2003-08-03 17:11   ` Heikki Tuuri
  1 sibling, 1 reply; 26+ messages in thread
From: Matt Mackall @ 2003-08-03 16:55 UTC (permalink / raw)
  To: Heikki Tuuri; +Cc: linux-kernel

On Sun, Aug 03, 2003 at 12:10:01PM +0300, Heikki Tuuri wrote:
> 
> What to do? People who write drivers should run heavy, multithreaded file
> i/o tests on their computer using some SQL database which calls fsync(). For
> example, run the Perl '/sql-bench/innotest's all concurrently on MySQL. If
> the problems are in drivers, that could help.

Did you know that until test2-mm3, nothing would report errors that
occurred on non-synchronous writes? There was no infrastructure to
propagate the error back to userspace. If you wrote a page, the write
failed on an intermittent I/O error, and then read again, you'd
silently get back the old page.
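
A minimal C sketch of the situation described above, assuming a POSIX system: for an ordinary buffered write, a successful write() only means the data reached the page cache, so a later write-back failure can reach the application, if at all, through the return value of fsync(). The helper name write_and_sync is made up for this illustration and is not taken from MySQL or InnoDB.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: write buf to path and force it to stable storage.
 * Returns 0 on success, or the errno value of the first failing call. */
int write_and_sync(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return errno;

    const char *p = buf;
    size_t left = len;
    int err = 0;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, retry */
            err = errno;
            break;
        }
        p += n;
        left -= (size_t)n;
    }

    /* A successful write() only means the data is in the page cache.
     * Deferred write-back errors such as EIO can only show up here. */
    if (err == 0 && fsync(fd) != 0)
        err = errno;

    if (close(fd) != 0 && err == 0)
        err = errno;
    return err;
}
```

A caller that ignores the fsync() result has no way of noticing that the data on disk may still be the old page.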

-- 
Matt Mackall : http://www.selenic.com : of or relating to the moon

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: 2.6.0-test2-mm3 and mysql
  2003-08-03  2:08     ` Andrew Morton
@ 2003-08-03 15:01       ` Shane Shrybman
  2003-08-03 19:25         ` Andrew Morton
  0 siblings, 1 reply; 26+ messages in thread
From: Shane Shrybman @ 2003-08-03 15:01 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

On Sat, 2003-08-02 at 22:08, Andrew Morton wrote:
> Shane Shrybman <shrybman@sympatico.ca> wrote:
> >
> > The db corruption hit again on test2-mm2.
> 
> How do you know it is "db corruption"?

I haven't been able to get an exact recipe for producing this but I have
posted a couple of the mysql corruption messages.

I still haven't been able to make it appear in 2.6.0-test1-mm1, but once
when I rebooted from -test1-mm1 to -test2-mm3 the tables had problems
immediately, when they came up clean in -test1-mm1 right before. When I
ran the mysql repair tables command it fixed them up and did not delete
any rows from the corrupted table, (or only very few). The repair
command usually deletes thousands of rows in order to repair the table.

http://zeke.yi.org/linux/mysql.tables.corrupt

I haven't found any info on this error message but maybe someone has
seen it before?

BTW: I am using myisam table type in mysql.

I will let you know if I find the exact way to reproduce this problem.

Regards,

Shane


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: 2.6.0-test2-mm3 and mysql
  2003-08-03  9:27 ` Andrew Morton
@ 2003-08-03 10:43   ` Heikki Tuuri
  2003-08-04 12:24     ` Denis Vlasenko
  0 siblings, 1 reply; 26+ messages in thread
From: Heikki Tuuri @ 2003-08-03 10:43 UTC (permalink / raw)
  To: linux-kernel

Andrew,

----- Original Message ----- 
From: "Andrew Morton" <akpm@osdl.org>
To: "Heikki Tuuri" <Heikki.Tuuri@innodb.com>
Cc: <linux-kernel@vger.kernel.org>
Sent: Sunday, August 03, 2003 12:27 PM
Subject: Re: 2.6.0-test2-mm3 and mysql

> "Heikki Tuuri" <Heikki.Tuuri@innodb.com> wrote:
> >
> > What to do? People who write drivers should run heavy, multithreaded file
> >  i/o tests on their computer using some SQL database which calls fsync(). For
> >  example, run the Perl '/sql-bench/innotest's all concurrently on MySQL. If
> >  the problems are in drivers, that could help.
>
> Well there's a problem.  We're kernel people, not database people.  I, for
> one, would not have a clue how to set such a thing up.
>
> If someone could prepare a simple-enough-for-kernel-people description of
> how to get such a test up and running, then we might make some progress.

ok :).

1. Download

from http://www.mysql.com/downloads/mysql-4.0.html:

MySQL-server-VERSION.i386.rpm The MySQL server. You will need this unless
you only want to connect to a MySQL server running on another machine.
Please note that this package was called MySQL-VERSION.i386.rpm before MySQL
4.0.10.

MySQL-client-VERSION.i386.rpm The standard MySQL client programs. You
probably always want to install this package.

MySQL-bench-VERSION.i386.rpm Tests and benchmarks. Requires Perl and the
DBD-mysql module.

MySQL-shared-compat-VERSION.i386.rpm This package includes the shared
libraries for both MySQL 3.23 and MySQL 4.0. Install this package instead of
MySQL-shared, if you have applications installed that are dynamically linked
against MySQL 3.23 but you want to upgrade to MySQL 4.0 without breaking the
library dependencies. This package is available since MySQL 4.0.13.
(these are named 'Dynamic client libraries (including 3.23.x libraries)' on
the download page).

Do NOT use the MySQL distro which comes with Red Hat distros. It is old and
may not be properly built. Only use binaries downloaded from www.mysql.com.

2. Install with

shell> rpm -i MySQL-server-VERSION.i386.rpm MySQL-client-VERSION.i386.rpm

etc.

3. I am assuming that you have Perl, which comes with most Linux distros. You
probably also have the MySQL DBI/DBD module in your Linux distro. It will
use those MySQL-shared-compat client libraries.

http://search.cpan.org/author/JWIED/DBD-mysql-2.1026/lib/DBD/mysql/INSTALL.pod:
"
Red Hat Linux

As of version 7.1, Red Hat Linux comes with MySQL and DBD::mysql. You need
to ensure that the following RPM's are installed:
  mysql
  perl-DBI
  perl-DBD-MySQL
"

If you do not have DBI/DBD, you have to resort to
http://www.mysql.com/downloads/api-dbi.html.

4. After the rpm installation, the mysqld daemon should be running, and mysqld
etc. should be placed in a bin dir (probably /usr/bin). You can shut it down with

mysqladmin shutdown

You can start it again with

mysqld

it should print something like:

"
030803 13:13:48  InnoDB: Started
mysqld: ready for connections.
Version: '4.0.14-debug-log'  socket: '/home/heikki/MySQLheikki'  port: 3307
"

(If something went wrong in the grants table creation, it will complain it
cannot find the 'host.frm' file. In that case refer to
http://www.mysql.com/doc/en/Post-installation.html about the script
mysql_install_db.)

The 'datadir' of MySQL is typically /var/lib/mysql. Under it is the actual
database data.

To connect to the database from a console, type

mysql test

mysql> show databases;
...
mysql> exit

5. To run Perl tests, go to the sql-bench directory (typically under
/usr/local/mysql) and:

perl innotest1

perl innotest1a

perl run-all-tests --create-options=type=innodb

etc.

You should run all innotests concurrently.

mysqld should not crash or print anything. The Perl tests themselves print
quite a lot as they test deadlocks etc.
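
Running the tests concurrently, as step 5 asks, can be sketched as a small C launcher. This is only an illustration under stated assumptions: run_concurrently and MAX_JOBS are hypothetical names, each command is handed to /bin/sh -c, and the program is expected to be started from the sql-bench directory.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_JOBS 64   /* arbitrary limit for this sketch */

/* Hypothetical helper: start every command in cmds[0..n-1] at once via
 * /bin/sh -c, wait for all of them, and return how many failed
 * (non-zero exit, killed by a signal, or could not be forked).
 * Returns -1 if n exceeds MAX_JOBS. */
int run_concurrently(const char *const cmds[], size_t n)
{
    pid_t pids[MAX_JOBS];
    int failed = 0;

    if (n > MAX_JOBS)
        return -1;

    /* Start all children first so that they really overlap in time. */
    for (size_t i = 0; i < n; i++) {
        pids[i] = fork();
        if (pids[i] == 0) {
            execl("/bin/sh", "sh", "-c", cmds[i], (char *)NULL);
            _exit(127);             /* only reached if exec failed */
        }
    }

    /* Then collect every exit status. */
    for (size_t i = 0; i < n; i++) {
        int status;
        if (pids[i] < 0) {          /* fork failed for this command */
            failed++;
            continue;
        }
        if (waitpid(pids[i], &status, 0) < 0 ||
            !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed++;
    }
    return failed;
}
```

For example, passing the two commands "perl innotest1" and "perl innotest1a" starts both tests at the same time and returns 0 only if both exit cleanly.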

We are testing a Pogo Linux server with a Red Hat 2.4.20 kernel,
http://www.mysql.com/press/release_2003_20.html, and so far that combination
seems to work ok.

Greetings to Linus! I hope you are having a good time at OSDL!

Heikki



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: 2.6.0-test2-mm3 and mysql
  2003-08-03  9:10 Heikki Tuuri
@ 2003-08-03  9:27 ` Andrew Morton
  2003-08-03 10:43   ` Heikki Tuuri
  2003-08-03 16:55 ` Matt Mackall
  1 sibling, 1 reply; 26+ messages in thread
From: Andrew Morton @ 2003-08-03  9:27 UTC (permalink / raw)
  To: Heikki Tuuri; +Cc: linux-kernel

"Heikki Tuuri" <Heikki.Tuuri@innodb.com> wrote:
>
> What to do? People who write drivers should run heavy, multithreaded file
>  i/o tests on their computer using some SQL database which calls fsync(). For
>  example, run the Perl '/sql-bench/innotest's all concurrently on MySQL. If
>  the problems are in drivers, that could help.

Well there's a problem.  We're kernel people, not database people.  I, for
one, would not have a clue how to set such a thing up.

If someone could prepare a simple-enough-for-kernel-people description of
how to get such a test up and running, then we might make some progress.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: 2.6.0-test2-mm3 and mysql
@ 2003-08-03  9:10 Heikki Tuuri
  2003-08-03  9:27 ` Andrew Morton
  2003-08-03 16:55 ` Matt Mackall
  0 siblings, 2 replies; 26+ messages in thread
From: Heikki Tuuri @ 2003-08-03  9:10 UTC (permalink / raw)
  To: linux-kernel

Andrew,

I do not know about the specific corruption Shane is talking about, but I
could summarize what we have found out in the past 2 years. I have written
the InnoDB backend to MySQL, and have been hunting Linux corruption bugs for
2 years now.

- Corruption seems to happen on Red Hat kernels 2.4.18 under heavy file i/o
load on some computers.

- A user ran a very simple stress test of type SELECT 'abbaguu' with many
clients. On a 2-way Dell server he was able to get mysqld to crash
predictably in < 24 hours. Sometimes he also got corruption. But another,
cheaper computer worked ok. Both were running a Red Hat kernel 2.4.18. When
the user upgraded to a 'stock' kernel 2.4.20, the crashes and corruption
disappeared.

- Our 4-way Xeon SuSE-2.4.18 computer never corrupts databases, though I run
very heavy stress tests on it.

- Kernels 2.4.20 seem to be more reliable than 2.4.18. I have only one
corruption case from such a kernel.

- We know with certainty that corruption is sometimes caused by
OS/drivers/hardware and not by mysqld, because in some cases rebooting the
computer has magically fixed the corruption. Looks like Linux had corrupted
its own file cache, but the data on disk was ok. I reported this on the
Linux kernel mailing list 2 years ago, but got no definite feedback.

- In some cases InnoDB reports checksum errors in pages. In those cases it
is also very probable that the corruption was caused by OS/drivers/hardware,
and not by mysqld.

- I have not noticed any clear connection between corruption reports and the
used file system.

- I have personally tested on 4 Linux computers. On an old 2.2 kernel
computer I was able to get read errors in 30 seconds. The three 2.4 kernel
computers have worked ok.

My hypothesis is that there are bugs in drivers of Linux. That would explain
why some computers work ok. Or there are Linux kernel bugs which only
manifest on certain hardware under certain file i/o workload.

What to do? People who write drivers should run heavy, multithreaded file
i/o tests on their computer using some SQL database which calls fsync(). For
example, run the Perl '/sql-bench/innotest's all concurrently on MySQL. If
the problems are in drivers, that could help.

Best regards,

Heikki Tuuri
Innobase Oy

.................

List:     linux-kernel
Subject:  Re: 2.6.0-test2-mm3 and mysql
From:     Andrew Morton <akpm () osdl ! org>
Date:     2003-08-03 2:08:59

Shane Shrybman <shrybman@sympatico.ca> wrote:
>
> The db corruption hit again on test2-mm2.

How do you know it is "db corruption"?

>
>  I am still backing out the 64 bit devt bit

why?



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: 2.6.0-test2-mm3 and mysql
  2003-08-03  1:58   ` Shane Shrybman
@ 2003-08-03  2:08     ` Andrew Morton
  2003-08-03 15:01       ` Shane Shrybman
  0 siblings, 1 reply; 26+ messages in thread
From: Andrew Morton @ 2003-08-03  2:08 UTC (permalink / raw)
  To: Shane Shrybman; +Cc: linux-kernel

Shane Shrybman <shrybman@sympatico.ca> wrote:
>
> The db corruption hit again on test2-mm2.

How do you know it is "db corruption"?

> 
>  I am still backing out the 64 bit devt bit

why?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: 2.6.0-test2-mm3 and mysql
  2003-08-03  1:52   ` Con Kolivas
@ 2003-08-03  1:59     ` Andrew Morton
  0 siblings, 0 replies; 26+ messages in thread
From: Andrew Morton @ 2003-08-03  1:59 UTC (permalink / raw)
  To: Con Kolivas; +Cc: shrybman, linux-kernel

Con Kolivas <kernel@kolivas.org> wrote:
>
> On Sun, 3 Aug 2003 11:04, Andrew Morton wrote:
> > Shane Shrybman <shrybman@sympatico.ca> wrote:
> > > mysql doesn't start on this kernel.
> [snip self abuse...]
> 
> Would this also be why I get lots of this error on this kernel?
> 
> diff: standard output: Input/output error
> 

Yes.  Silly last-minute thing.  Sorry about that.

I'll remove 2.6.0-test2-mm3 and will upload a 2.6.0-test2-mm3-1

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: 2.6.0-test2-mm3 and mysql
  2003-08-03  1:04 ` Andrew Morton
  2003-08-03  1:52   ` Con Kolivas
@ 2003-08-03  1:58   ` Shane Shrybman
  2003-08-03  2:08     ` Andrew Morton
  2003-08-03 18:58   ` Sergey S. Kostyliov
  2 siblings, 1 reply; 26+ messages in thread
From: Shane Shrybman @ 2003-08-03  1:58 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

On Sat, 2003-08-02 at 21:04, Andrew Morton wrote:
> Shane Shrybman <shrybman@sympatico.ca> wrote:
> >
> > mysql doesn't start on this kernel.
> 
> That's because I'm an idiot.


Ah.. that's good 8) For once it's not me. :)
> 
> --- 25/fs/mpage.c~awe-use-gfp_flags-braino	Sat Aug  2 18:03:01 2003
> +++ 25-akpm/fs/mpage.c	Sat Aug  2 18:03:01 2003
> @@ -568,7 +568,7 @@ confused:
>  	 */
>  	if (*ret == -ENOSPC)
>  		set_bit(AS_ENOSPC, &mapping->flags);
> -	else
> +	else if (*ret)
>  		set_bit(AS_EIO, &mapping->flags);
>  out:
>  	return bio;
> @@ -673,7 +673,7 @@ mpage_writepages(struct address_space *m
>  				ret = (*writepage)(page, wbc);
>  				if (ret == -ENOSPC)
>  					set_bit(AS_ENOSPC, &mapping->flags);
> -				else
> +				else if (ret)
>  					set_bit(AS_EIO, &mapping->flags);
>  			} else {
>  				bio = mpage_writepage(bio, page, get_block,
> diff -puN mm/vmscan.c~awe-use-gfp_flags-braino mm/vmscan.c
> --- 25/mm/vmscan.c~awe-use-gfp_flags-braino	Sat Aug  2 18:03:01 2003
> +++ 25-akpm/mm/vmscan.c	Sat Aug  2 18:03:01 2003
> @@ -254,7 +254,7 @@ static void handle_write_error(struct ad
>  	if (page->mapping == mapping) {
>  		if (error == -ENOSPC)
>  			set_bit(AS_ENOSPC, &mapping->flags);
> -		else
> +		else if (error)
>  			set_bit(AS_EIO, &mapping->flags);
>  	}
>  	unlock_page(page);
> 
> _
> 
> > One last thing, I have started seeing mysql database corruption
> > recently. I am not sure it is a kernel problem. And I don't know the
> > exact steps to reproduce it, but I think I started seeing it with
> > -test2-mm2. I haven't ever seen db corruption in the 8-12 months I have
> > been playing with mysql/php.
> 
> hm, that's a worry.  No additional info available?
> 
The db corruption hit again on test2-mm2. I am on -test1-mm1 trying to
reproduce it there. I don't know what little query or update is the
problem. There is nothing in the system logs. I went through everything
that I thought might have been happening at the time and the tables came
up clean with the "check tables" command. Then it happened a bit later.
There is a cron job doing a query every minute, but it doesn't happen
all the time. I don't know, it's probably some config change I made to
mysql.

I am still backing out the 64 bit devt bit, I assume that is still
needed. The db and mysql binaries are on, respectively:

/dev/vg02/varlv /var ext3 rw,noatime,nosuid,nodev 0 0
/dev/vg02/usrlv /usr ext3 rw,noatime,nodev 0 0

mysql version
mysql  Ver 12.12 Distrib 4.0.3-beta, for pc-linux-gnu (i686)

dmesg
http://zeke.yi.org/linux/2.6.0-test2-mm2.dmesg

Shane


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: 2.6.0-test2-mm3 and mysql
  2003-08-03  1:04 ` Andrew Morton
@ 2003-08-03  1:52   ` Con Kolivas
  2003-08-03  1:59     ` Andrew Morton
  2003-08-03  1:58   ` Shane Shrybman
  2003-08-03 18:58   ` Sergey S. Kostyliov
  2 siblings, 1 reply; 26+ messages in thread
From: Con Kolivas @ 2003-08-03  1:52 UTC (permalink / raw)
  To: Andrew Morton, Shane Shrybman; +Cc: linux-kernel

On Sun, 3 Aug 2003 11:04, Andrew Morton wrote:
> Shane Shrybman <shrybman@sympatico.ca> wrote:
> > mysql doesn't start on this kernel.
[snip self abuse...]

Would this also be why I get lots of this error on this kernel?

diff: standard output: Input/output error

Con


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: 2.6.0-test2-mm3 and mysql
  2003-08-03  0:38 Shane Shrybman
@ 2003-08-03  1:04 ` Andrew Morton
  2003-08-03  1:52   ` Con Kolivas
                     ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Andrew Morton @ 2003-08-03  1:04 UTC (permalink / raw)
  To: Shane Shrybman; +Cc: linux-kernel

Shane Shrybman <shrybman@sympatico.ca> wrote:
>
> mysql doesn't start on this kernel.

That's because I'm an idiot.

--- 25/fs/mpage.c~awe-use-gfp_flags-braino	Sat Aug  2 18:03:01 2003
+++ 25-akpm/fs/mpage.c	Sat Aug  2 18:03:01 2003
@@ -568,7 +568,7 @@ confused:
 	 */
 	if (*ret == -ENOSPC)
 		set_bit(AS_ENOSPC, &mapping->flags);
-	else
+	else if (*ret)
 		set_bit(AS_EIO, &mapping->flags);
 out:
 	return bio;
@@ -673,7 +673,7 @@ mpage_writepages(struct address_space *m
 				ret = (*writepage)(page, wbc);
 				if (ret == -ENOSPC)
 					set_bit(AS_ENOSPC, &mapping->flags);
-				else
+				else if (ret)
 					set_bit(AS_EIO, &mapping->flags);
 			} else {
 				bio = mpage_writepage(bio, page, get_block,
diff -puN mm/vmscan.c~awe-use-gfp_flags-braino mm/vmscan.c
--- 25/mm/vmscan.c~awe-use-gfp_flags-braino	Sat Aug  2 18:03:01 2003
+++ 25-akpm/mm/vmscan.c	Sat Aug  2 18:03:01 2003
@@ -254,7 +254,7 @@ static void handle_write_error(struct ad
 	if (page->mapping == mapping) {
 		if (error == -ENOSPC)
 			set_bit(AS_ENOSPC, &mapping->flags);
-		else
+		else if (error)
 			set_bit(AS_EIO, &mapping->flags);
 	}
 	unlock_page(page);

_
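
The effect of each one-line change in the patch can be modelled in user space. The sketch below is an assumption-laden illustration, not kernel code: the MODEL_AS_* constants and both function names are invented stand-ins for the kernel's AS_EIO / AS_ENOSPC bits and set_bit() calls. It shows that with a bare "else", a successful result of 0 also marks the mapping with an I/O error.

```c
#include <errno.h>

/* Illustrative stand-ins for the kernel's AS_EIO / AS_ENOSPC bits;
 * the real definitions live in the kernel, not here. */
enum { MODEL_AS_EIO = 1, MODEL_AS_ENOSPC = 2 };

/* Before the fix: any result other than -ENOSPC, including 0 (success),
 * sets the EIO bit. */
unsigned record_error_buggy(unsigned flags, int ret)
{
    if (ret == -ENOSPC)
        flags |= MODEL_AS_ENOSPC;
    else
        flags |= MODEL_AS_EIO;
    return flags;
}

/* After the fix: a zero result leaves the flags untouched. */
unsigned record_error_fixed(unsigned flags, int ret)
{
    if (ret == -ENOSPC)
        flags |= MODEL_AS_ENOSPC;
    else if (ret)
        flags |= MODEL_AS_EIO;
    return flags;
}
```

In this model record_error_buggy(0, 0) returns MODEL_AS_EIO while record_error_fixed(0, 0) returns 0, which is consistent with the spurious error 5 from fsync() and the "Input/output error" messages reported on this kernel.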

> One last thing, I have started seeing mysql database corruption
> recently. I am not sure it is a kernel problem. And I don't know the
> exact steps to reproduce it, but I think I started seeing it with
> -test2-mm2. I haven't ever seen db corruption in the 8-12 months I have
> been playing with mysql/php.

hm, that's a worry.  No additional info available?


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: 2.6.0-test2-mm3 and mysql
@ 2003-08-03  0:38 Shane Shrybman
  2003-08-03  1:04 ` Andrew Morton
  0 siblings, 1 reply; 26+ messages in thread
From: Shane Shrybman @ 2003-08-03  0:38 UTC (permalink / raw)
  To: linux-kernel

Hi,

mysql doesn't start on this kernel. This is an x86, preempt, ext2/3, UP
system. I get this in the mysql error log:

030802 20:01:17  mysqld started
030802 20:01:18  InnoDB: Error: the OS said file flush did not succeed
030802 20:01:18  InnoDB: Operating system error number 5 in a file operation.
InnoDB: See http://www.innodb.com/ibman.html for installation help.
InnoDB: Look from section 13.2 at http://www.innodb.com/ibman.html
InnoDB: what the error number means or use the perror program of MySQL.
InnoDB: Cannot continue operation.
030802 20:01:18  mysqld ended
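
The "Operating system error number 5" in the log above is an errno value; on Linux, 5 is EIO. A small C sketch of decoding such a number, assuming a Linux system (the helper names is_io_error and describe_os_error are made up for this illustration):

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper: returns 1 if an OS error number taken from a log
 * line such as "Operating system error number 5" is EIO, else 0. */
int is_io_error(int os_errno)
{
    return os_errno == EIO;
}

/* Returns the C library's human-readable description of the number.
 * The exact wording depends on the C library in use. */
const char *describe_os_error(int os_errno)
{
    return strerror(os_errno);
}
```

With glibc the description for EIO reads "Input/output error", the same text that diff prints in another report in this thread.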

I also did an strace of mysql trying to start and when I tried to copy
the strace file to root's home I got some sort of IO error. I don't
remember the error exactly but I decided to run at that point and
rebooted. The file did seem to copy ok according to diff.

http://zeke.yi.org/linux/2.6.0-test2-mm3.strace.mysql
http://zeke.yi.org/linux/2.6.0-test2-mm3-config

BTW, CONFIG_DEBUG_INFO=y seems to make this kernel huge. I couldn't even
install the sucker because I didn't have enough space for the modules.
35 MB wasn't enough.

One last thing, I have started seeing mysql database corruption
recently. I am not sure it is a kernel problem. And I don't know the
exact steps to reproduce it, but I think I started seeing it with
-test2-mm2. I haven't ever seen db corruption in the 8-12 months I have
been playing with mysql/php.

None of these problems is critical for me (and they could be pilot
error) but I thought I should point them out.

Regards,

Shane


^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2003-08-28 19:27 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-08-28 17:59 2.6.0-test2-mm3 and mysql Heikki Tuuri
2003-08-28 19:01 ` Sergey S. Kostyliov
2003-08-28 19:10   ` Heikki Tuuri
2003-08-28 19:27     ` Sergey S. Kostyliov
  -- strict thread matches above, loose matches on Subject: below --
2003-08-03 20:50 Heikki Tuuri
2003-08-03 16:59 Heikki Tuuri
2003-08-03 23:57 ` Matt Mackall
2003-08-03  9:10 Heikki Tuuri
2003-08-03  9:27 ` Andrew Morton
2003-08-03 10:43   ` Heikki Tuuri
2003-08-04 12:24     ` Denis Vlasenko
2003-08-04 18:29       ` Heikki Tuuri
2003-08-03 16:55 ` Matt Mackall
2003-08-03 17:11   ` Heikki Tuuri
2003-08-03 23:54     ` Matt Mackall
2003-08-03  0:38 Shane Shrybman
2003-08-03  1:04 ` Andrew Morton
2003-08-03  1:52   ` Con Kolivas
2003-08-03  1:59     ` Andrew Morton
2003-08-03  1:58   ` Shane Shrybman
2003-08-03  2:08     ` Andrew Morton
2003-08-03 15:01       ` Shane Shrybman
2003-08-03 19:25         ` Andrew Morton
2003-08-03 18:58   ` Sergey S. Kostyliov
2003-08-04  0:05     ` Matt Mackall
2003-08-27 15:52       ` Sergey S. Kostyliov
