* Error handling: How to "lose" a transaction
@ 2011-12-13 21:47 Jeff Mahoney
  2011-12-14  0:13 ` Chris Mason
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff Mahoney @ 2011-12-13 21:47 UTC (permalink / raw)
  To: Chris Mason; +Cc: Mark Fasheh, Btrfs Development List

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Hi Chris -

I'm starting to dig into the fun part of error handling and
btrfs_commit_transaction is a minefield right now.

I've been thinking about how I would go about recovering from a
serious error like an -EIO while writing out or an -ENOMEM in a deep
part of the code that it's prohibitively expensive to recover from.
Mostly I'm looking for the best way to make calling btrfs_std_error()
be functionally equivalent to killing the power on the disk. We
already block off new writers, but that's obviously nowhere near
enough. We could have an open transaction floating around, uncommitted
transactions queued, and then an unrecoverable error hits, forcing us
to shut it all down.

It seems to me that a recovery method similar to the one I wrote for
reiserfs can be used here as well. Am I understanding correctly that
if I go through the motions of committing the transaction *except* for
updating the tree roots, or maybe even doing that but declining to
write the superblocks out, that the transaction essentially doesn't
exist on disk? Including the allocations? The in-memory representation
will not match what's on disk, but that's what happens with every file
system in RO-failure mode. With CoW even for data, data is essentially
frozen in time as well. (I suppose with nodatacow that's not true, but
that's for another day.)

- -Jeff

- -- 
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQIcBAEBAgAGBQJO58fxAAoJEB57S2MheeWyiH8P/RGfJUCwBoz83vLH5qRfsAzO
Rnfjq9/NS+58zh5MGImimr9u6ZuNCfNUEFUDXGnVwF2Er1jHh0orU1pQdvU9XlHv
T/vAyZp1s/emwwDPQX0Xo24QNumSzA2u7qnUuUBklq8l+KL99OZCErhu/eJ6i06S
vTv6KflsL/EU5ISgro051fVLGep0ZF5hLYOJHQbCJaRlL6OwC2d8cWHGR+qBdRaw
t4SQ+tVmKnnd4UzlpPzyQTCSOwdnSYtei28fCAy7X4rmycCXTa8eYQgvxkIabgkM
IF8F8utcKT2yTFyUbJM3MWUx0yzPVsL77XnO8FCfYbusYC1EPTnMSGJ1CbupHvr1
kmFJEOQ4rj8fxLzxYDdxjEJ7HtyIhQDfH1BZ1/0+e8BShepr7/60AwoNaWVOceN/
rDDkkKgIogprGO0un1Fv3J+FNPgIR/47t1ULSUTLhg4vAqbQRuYiI36Y2zlG7G2a
C/u/4UgrH40CVFVVtIRnjO67/QffTC3pf8Q6kzaXgotQJUt1XfY3a4X6MLQnfWKo
bBQaPTIpsxtf7k3cnH5XfjQqtljGgXrbBExtMPKBor7RDPVw3KrLm4F35Enr4Gur
pumzXQfiSC2oiSxpG1RXegZ2CXLKW4a/++kMApAOR98xTAHM8dzFhx0V0YZh/MHY
Sc+ddgI2v5ZIUL2IV3WK
=DmtI
-----END PGP SIGNATURE-----


* Re: Error handling: How to "lose" a transaction
  2011-12-13 21:47 Error handling: How to "lose" a transaction Jeff Mahoney
@ 2011-12-14  0:13 ` Chris Mason
  2011-12-22  2:59   ` Jeff Mahoney
  0 siblings, 1 reply; 8+ messages in thread
From: Chris Mason @ 2011-12-14  0:13 UTC (permalink / raw)
  To: Jeff Mahoney; +Cc: Mark Fasheh, Btrfs Development List

On Tue, Dec 13, 2011 at 04:47:30PM -0500, Jeff Mahoney wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> 
> Hi Chris -
> 
> I'm starting to dig into the fun part of error handling and
> btrfs_commit_transaction is a minefield right now.
> 
> I've been thinking about how I would go about recovering from a
> serious error like an -EIO while writing out or an -ENOMEM in a deep
> part of the code that it's prohibitively expensive to recover from.
> Mostly I'm looking for the best way to make calling btrfs_std_error()
> be functionally equivalent to killing the power on the disk. We
> already block off new writers, but that's obviously nowhere near
> enough. We could have an open transaction floating around, uncommitted
> transactions queued, and then an unrecoverable error hits, forcing us
> to shut it all down.
> 
> It seems to me that a recovery method similar to the one I wrote for
> reiserfs can be used here as well. Am I understanding correctly that
> if I go through the motions of committing the transaction *except* for
> updating the tree roots, or maybe even doing that but declining to
> write the superblocks out, that the transaction essentially doesn't
> exist on disk? Including the allocations? The in-memory representation
> will not match what's on disk, but that's what happens with every file
> system in RO-failure mode. With CoW even for data, data is essentially
> frozen in time as well. (I suppose with nodatacow that's not true, but
> that's for another day.)

Hi Jeff,

Thanks for taking another pass at this.

It should be possible to just skip the step where we update the roots in
the super and you'll keep a fully consistent FS on disk.  The only rule
would be that you're not allowed to take a block that we've freed in the
aborted transaction and reuse it.

-chris
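
As a rough illustration of the rule above, here is a minimal userspace
toy model. It is not btrfs code: the names toy_fs, toy_cow_update,
toy_commit, toy_abort and the BLK_* states are invented for this sketch,
and it assumes a "tree" that is a single block plus a superblock that is
just a block index. It only shows why skipping the super update keeps
the old state intact, and why blocks freed by the aborted transaction
must stay off limits.

```c
#include <assert.h>
#include <stdbool.h>

#define NBLOCKS 8

enum blk_state { BLK_FREE, BLK_USED, BLK_PINNED };

/* Hypothetical toy filesystem: one data block per "tree". */
struct toy_fs {
	int data[NBLOCKS];
	enum blk_state state[NBLOCKS];
	int super_root;    /* root block recorded in the "superblock" */
	int pending_root;  /* root written by the open transaction, or -1 */
	bool read_only;    /* set once a transaction has been aborted */
};

static void toy_init(struct toy_fs *fs, int value)
{
	for (int i = 0; i < NBLOCKS; i++) {
		fs->data[i] = 0;
		fs->state[i] = BLK_FREE;
	}
	fs->data[0] = value;
	fs->state[0] = BLK_USED;
	fs->super_root = 0;
	fs->pending_root = -1;
	fs->read_only = false;
}

/* CoW update: write the new value to a fresh block and pin the old one. */
static int toy_cow_update(struct toy_fs *fs, int value)
{
	int old = fs->pending_root >= 0 ? fs->pending_root : fs->super_root;

	if (fs->read_only)
		return -1;
	for (int i = 0; i < NBLOCKS; i++) {
		if (fs->state[i] != BLK_FREE)
			continue;
		fs->data[i] = value;
		fs->state[i] = BLK_USED;
		fs->state[old] = BLK_PINNED; /* freed, but not reusable yet */
		fs->pending_root = i;
		return 0;
	}
	return -1; /* out of space */
}

/* Normal commit: publish the new root, then release the pinned blocks. */
static void toy_commit(struct toy_fs *fs)
{
	if (fs->pending_root < 0)
		return;
	fs->super_root = fs->pending_root;
	fs->pending_root = -1;
	for (int i = 0; i < NBLOCKS; i++)
		if (fs->state[i] == BLK_PINNED)
			fs->state[i] = BLK_FREE;
}

/* Abort: never touch super_root; pinned blocks stay pinned for good. */
static void toy_abort(struct toy_fs *fs)
{
	fs->pending_root = -1;
	fs->read_only = true;
}

/* What survives a "power cut": whatever the superblock points at. */
static int toy_on_disk_value(const struct toy_fs *fs)
{
	return fs->data[fs->super_root];
}
```

In this model, calling toy_abort() after an update leaves
toy_on_disk_value() returning the last committed value, and the block
the superblock still points at stays BLK_PINNED, so nothing can
overwrite it while the filesystem is read-only.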


* Re: Error handling: How to "lose" a transaction
  2011-12-14  0:13 ` Chris Mason
@ 2011-12-22  2:59   ` Jeff Mahoney
  2011-12-22  3:21     ` Liu Bo
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff Mahoney @ 2011-12-22  2:59 UTC (permalink / raw)
  To: Chris Mason, Jeff Mahoney, Mark Fasheh, Btrfs Development List

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 12/13/2011 07:13 PM, Chris Mason wrote:
> On Tue, Dec 13, 2011 at 04:47:30PM -0500, Jeff Mahoney wrote:
>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
>> 
>> 
>> Hi Chris -
>> 
>> I'm starting to dig into the fun part of error handling and 
>> btrfs_commit_transaction is a minefield right now.
>> 
>> I've been thinking about how I would go about recovering from a 
>> serious error like an -EIO while writing out or an -ENOMEM in a
>> deep part of the code that it's prohibitively expensive to
>> recover from. Mostly I'm looking for the best way to make calling
>> btrfs_std_error() be functionally equivalent to killing the power
>> on the disk. We already block off new writers, but that's
>> obviously nowhere near enough. We could have an open transaction
>> floating around, uncommitted transactions queued, and then an
>> unrecoverable error hits, forcing us to shut it all down.
>> 
>> It seems to me that a recovery method similar to the one I
>> wrote for reiserfs can be used here as well. Am I understanding
>> correctly that if I go through the motions of committing the
>> transaction *except* for updating the tree roots, or maybe even
>> doing that but declining to write the superblocks out, that the
>> transaction essentially doesn't exist on disk? Including the
>> allocations? The in-memory representation will not match what's
>> on disk, but that's what happens with every file system in
>> RO-failure mode. With CoW even for data, data is essentially 
>> frozen in time as well. (I suppose with nodatacow that's not
>> true, but that's for another day.)
> 
> Hi Jeff,
> 
> Thanks for taking another pass at this.
> 
> It should be possible to just skip the step where we update the
> roots in the super and you'll keep a fully consistent FS on disk.
> The only rule would be that you're not allowed to take a block that
> we've freed in the aborted transaction and reuse it.

Perfect.

Sorry I haven't responded to this yet. I started digging right in and
I've started to have some good results. It turns out there's already a
btrfs_cleanup_transaction call that will tear down outstanding
transactions. It's not perfect and I've fixed a few bugs in there, but
it saved me a bunch of effort. I just wish I'd noticed it a day earlier,
since I had it half implemented myself. :)

This afternoon I started running xfstests on a dm-linear mapped
partition. Halfway through a sufficiently long test, I swap out the
linear mapping to an error mapping. It still crashes, but somewhat
less spectacularly. There are still a ton of BUG_ON's I need to
eliminate as well as work out the usual I/O error-recovery issue of
uninterruptible, unrecoverable writeback contexts and still-locked
pages holding up exit. I'm pretty pleased with the results so far and
am pretty optimistic.

- -Jeff


- -- 
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQIcBAEBAgAGBQJO8p0MAAoJEB57S2MheeWyF4YP/A6uUzuP4zui+iwenSP44trw
FkONuTidZRjSgA4tXNrdsnIF2txtiewzp0HvWWudw5rnMNQzznyO0WynKHSPG3ep
xFZnfpvaYoCaMQt70IxAQFDsZpowbPAI8194mbJqKAql4f2RNzlg/3fR4k+Fz6Ye
Gu824uEbtyHghy96C37e/E30Zizu6+S7xrx8jwmnKbq44docoIV3Pw9LZGOU99Db
1IFipExd0Z/ZhTTiK4gZ787nPhM9QNfxw/9+h1g4gUfJqlcmRrcwGJmOj5iOBGBt
Man51ZCI8hYBpubTgvTQalut+uLq9lCoBZQGTbKHLNLd21qM+Ji4KCAQzMBUtqGn
pzSfs3Gdwa1WjYszINAS6gqA+0ubh1F/WxGwJKW85JnAYy8OjTJHru7GlYzt3C9Y
gouU7xgrneVn+lZFwV9X0gwX8yLQx5Lh9YEF6AJLXJuXHg4zGZyhpFjVkmTlle93
dFUblB92q9lxdw5V8f1Uw+EDIlACZZRo7MFDSypjdTTryRFiAjhCtBdBpnu54Mrb
fH2kdhPCBm4YqAQLlo43aOPAbkOYElAr0rgPvqaLzimZLAW0kd/nGU/if3mhMMa2
7ad7tKTQyktyGKuEkMPnSCU8SqFNGA750aeFG22uJJjbdCytyzkJmeqYQD5oykqm
vDpKh0g20Fcqb98q+qbt
=jjDk
-----END PGP SIGNATURE-----


* Re: Error handling: How to "lose" a transaction
  2011-12-22  2:59   ` Jeff Mahoney
@ 2011-12-22  3:21     ` Liu Bo
  2011-12-22  3:38       ` Jeff Mahoney
  0 siblings, 1 reply; 8+ messages in thread
From: Liu Bo @ 2011-12-22  3:21 UTC (permalink / raw)
  To: Jeff Mahoney
  Cc: Chris Mason, Jeff Mahoney, Mark Fasheh, Btrfs Development List

On 12/22/2011 10:59 AM, Jeff Mahoney wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On 12/13/2011 07:13 PM, Chris Mason wrote:
>> On Tue, Dec 13, 2011 at 04:47:30PM -0500, Jeff Mahoney wrote:
>>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
>>>
>>>
>>> Hi Chris -
>>>
>>> I'm starting to dig into the fun part of error handling and 
>>> btrfs_commit_transaction is a minefield right now.
>>>
>>> I've been thinking about how I would go about recovering from a 
>>> serious error like an -EIO while writing out or an -ENOMEM in a
>>> deep part of the code that it's prohibitively expensive to
>>> recover from. Mostly I'm looking for the best way to make calling
>>> btrfs_std_error() be functionally equivalent to killing the power
>>> on the disk. We already block off new writers, but that's
>>> obviously nowhere near enough. We could have an open transaction
>>> floating around, uncommitted transactions queued, and then an
>>> unrecoverable error hits, forcing us to shut it all down.
>>>
>>> It seems to me that a recovery method similar to the one I
>>> wrote for reiserfs can be used here as well. Am I understanding
>>> correctly that if I go through the motions of committing the
>>> transaction *except* for updating the tree roots, or maybe even
>>> doing that but declining to write the superblocks out, that the
>>> transaction essentially doesn't exist on disk? Including the
>>> allocations? The in-memory representation will not match what's
>>> on disk, but that's what happens with every file system in
>>> RO-failure mode. With CoW even for data, data is essentially 
>>> frozen in time as well. (I suppose with nodatacow that's not
>>> true, but that's for another day.)
>> Hi Jeff,
>>
>> Thanks for taking another pass at this.
>>
>> It should be possible to just skip the step where we update the
>> roots in the super and you'll keep a fully consistent FS on disk.
>> The only rule would be that you're not allowed to take a block that
>> we've freed in the aborted transaction and reuse it.
> 
> Perfect.
> 
> Sorry I haven't responded to this yet. I started digging right in and
> I've started to have some good results. It turns out there's already a
> btrfs_cleanup_transaction call that will tear down outstanding
> transactions. It's not perfect and I've fixed a few bugs in there, but
> it saved me a bunch of effort. I just wish I'd noticed it a day earlier,
> since I had it half implemented myself. :)
> 

Hi Jeff,

Yes, it should be. I wrote this cleanup_transaction, and I should have told you
about it earlier... Anyway, thanks for your effort.

The error handling part has lots of corner cases, so I just picked a brute-force
way to tear down the current transaction in order to make the FS RO.

thanks,
liubo

> This afternoon I started running xfstests on a dm-linear mapped
> partition. Halfway through a sufficiently long test, I swap out the
> linear mapping to an error mapping. It still crashes, but somewhat
> less spectacularly. There are still a ton of BUG_ON's I need to
> eliminate as well as work out the usual I/O error-recovery issue of
> uninterruptible, unrecoverable writeback contexts and still-locked
> pages holding up exit. I'm pretty pleased with the results so far and
> am pretty optimistic.
> 
> - -Jeff
> 
> 



* Re: Error handling: How to "lose" a transaction
  2011-12-22  3:21     ` Liu Bo
@ 2011-12-22  3:38       ` Jeff Mahoney
  2011-12-23  5:12         ` Jeff Mahoney
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff Mahoney @ 2011-12-22  3:38 UTC (permalink / raw)
  To: Liu Bo; +Cc: Chris Mason, Jeff Mahoney, Mark Fasheh, Btrfs Development List

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 12/21/2011 10:21 PM, Liu Bo wrote:
> On 12/22/2011 10:59 AM, Jeff Mahoney wrote: Sorry I haven't
> responded to this yet. I started digging right in and I've started
> to have some good results. It turns out there's already a 
> btrfs_cleanup_transaction call that will tear down outstanding 
> transactions. It's not perfect and I've fixed a few bugs in there,
> but it saved me a bunch of effort. I just wish I'd noticed it a day
> earlier, since I had it half implemented myself. :)
> 
> 
>> Hi Jeff,
> 
>> Yes, it should be. I wrote this cleanup_transaction, and I should
>> have told you about it earlier... Anyway, thanks for your effort.
> 
>> The error handling part has lots of corner cases, so I just picked
>> a brute-force way to tear down the current transaction in order to
>> make the FS RO.

Oh, and it's worked great. The brute force method is a good start and
will address the most severe problems (and most cases) well. I've
decided to ignore most cases of -ENOMEM for now. The biggest bug I ran
into so far was calling mutex_lock while holding a spinlock. It was a
quick fix.

The method I've generally used is to mark the transaction aborted and
pass the error up as quickly as possible, cleaning up the local
allocations and locks as I go. The transaction gets completed
normally, returns an error, isn't committed, and then is destroyed
(with others, potentially) when called from within
btrfs_commit_transaction. Btrfs makes this super easy since we can
just skip all the CoW writes.

Thanks!

- -Jeff
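
The pattern described above (mark the transaction aborted, unwind local
state, hand the error to the caller, and have the commit path refuse to
publish anything) might look roughly like the following userspace
sketch. Everything here is an assumption made for illustration: txn,
txn_abort, txn_modify, txn_write_step and txn_commit are invented names,
the "lock" is a plain flag, and none of it is the actual btrfs code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical transaction handle for this sketch. */
struct txn {
	int aborted;     /* 0, or the first negative errno that hit us */
	int lock_held;   /* stand-in for a real lock */
	int nr_updates;  /* successful modifications so far */
	int committed;   /* set only by a successful commit */
};

/* Record the first error; later errors do not overwrite it. */
static void txn_abort(struct txn *t, int err)
{
	if (!t->aborted)
		t->aborted = err;
}

/* Hypothetical low-level step; fail_with simulates e.g. -EIO. */
static int txn_write_step(int fail_with)
{
	return fail_with;
}

static int txn_modify(struct txn *t, int fail_with)
{
	char *buf;
	int ret;

	if (t->aborted)
		return t->aborted; /* refuse new work on a dead transaction */

	buf = malloc(64);
	if (!buf) {
		txn_abort(t, -ENOMEM);
		return -ENOMEM;
	}
	t->lock_held = 1;

	ret = txn_write_step(fail_with);
	if (ret < 0) {
		txn_abort(t, ret);
		goto out_unlock; /* clean up local state, then pass it up */
	}
	t->nr_updates++;

out_unlock:
	t->lock_held = 0;
	free(buf);
	return ret;
}

/* Commit path: an aborted transaction is never written out. */
static int txn_commit(struct txn *t)
{
	if (t->aborted)
		return t->aborted;
	t->committed = 1;
	return 0;
}
```

The point of keeping the first error in the handle is that callers far
from the failure site can still learn why the commit was refused
without every intermediate function inventing its own bookkeeping.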


>> thanks, liubo
> 
> This afternoon I started running xfstests on a dm-linear mapped 
> partition. Halfway through a sufficiently long test, I swap out
> the linear mapping to an error mapping. It still crashes, but
> somewhat less spectacularly. There are still a ton of BUG_ON's I
> need to eliminate as well as work out the usual I/O error-recovery
> issue of uninterruptible, unrecoverable writeback contexts and
> still-locked pages holding up exit. I'm pretty pleased with the
> results so far and am pretty optimistic.
> 
> -Jeff
> 
> 
> 

- -- 
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQIcBAEBAgAGBQJO8qZBAAoJEB57S2MheeWyCtYP/0+VGdUrdPceYkMGngweINFI
Y6K/xzDG2tiogFyb8mVj4XH9xtGoODWiZ+yb2FkRfoqsq1dS34/XzM1Cf1SBgFTu
J8xIxv3gVp0lDycV6QqpetNaPPpxDz61LmiFqNRd6bn/usBoYdlyexX3HmPll7Je
MS0uAiUVNTJIK+W3qN9BIyvg8F61XFy3SdeCY5dmzClDJft1dgu6mWlHhcKVL7LW
uDrX9vldV56qoL6rrNyR/wBVg8rhMxVN5z9qFttWsSpORwZdIOIUdKiTULqnCdvf
mzs1yNAsAMTcE0GCLOIWEyiTSZrDlg4nGgZMIDKnzD0GywJDy+qc/9XPL+5WkyaD
Z48a6sBCXGhmQsux8iEeGAlTfP5/YJMd2PqaKfFlpSeL2u+Pt6EAFUpEUfXDYRhI
aBxzJK7D+GrgduheWTQc2AgeH8ee7bUEe1k+d4+EIWJTq5vKkPWH7x580q0yL+t2
qiLqzSlSTPaCr9tJlQo3d+dHu2L2r43+2qYeHut0JjFtp2dDjWO7AzcQ2JsL0yZR
jL0dVT96OsWkmKu/qfvSbFZ6LLR+QrlqBzTgNA4R69nLlUj1f05AVaYvwuVqnIPH
QdCf53kaEjvVlRw2WScsRHT1gMY62jmES0glIBgAH9bKAYKADlnzIAW6RSpB8NcO
GZoCa+90OHl/kkXWB2eZ
=DR3D
-----END PGP SIGNATURE-----


* Re: Error handling: How to "lose" a transaction
  2011-12-22  3:38       ` Jeff Mahoney
@ 2011-12-23  5:12         ` Jeff Mahoney
  2011-12-23  5:43           ` Liu Bo
  2011-12-23 14:17           ` Chris Mason
  0 siblings, 2 replies; 8+ messages in thread
From: Jeff Mahoney @ 2011-12-23  5:12 UTC (permalink / raw)
  To: Liu Bo; +Cc: Chris Mason, Jeff Mahoney, Mark Fasheh, Btrfs Development List

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 12/21/2011 10:38 PM, Jeff Mahoney wrote:
> On 12/21/2011 10:21 PM, Liu Bo wrote:
>> On 12/22/2011 10:59 AM, Jeff Mahoney wrote: Sorry I haven't 
>> responded to this yet. I started digging right in and I've
>> started to have some good results. It turns out there's already a
>>  btrfs_cleanup_transaction call that will tear down outstanding 
>> transactions. It's not perfect and I've fixed a few bugs in
>> there, but it saved me a bunch of effort. I just wish I'd noticed
>> it a day earlier, since I had it half implemented myself. :)
> 
> 
>>> Hi Jeff,
> 
>>> Yes, it should be. I wrote this cleanup_transaction, and I should
>>> have told you about it earlier... Anyway, thanks for your effort.
> 
>>> The error handling part has lots of corner cases, so I just
>>> picked a brute-force way to tear down the current transaction in
>>> order to make the FS RO.
> 
> Oh, and it's worked great. The brute force method is a good start
> and will address the most severe problems (and most cases) well.
> I've decided to ignore most cases of -ENOMEM for now. The biggest
> bug I ran into so far was calling mutex_lock while holding a
> spinlock. It was a quick fix.
> 
> The method I've generally used is to mark the transaction aborted
> and pass the error up as quickly as possible, cleaning up the
> local allocations and locks as I go. The transaction gets
> completed normally, returns an error, isn't committed, and then is
> destroyed (with others, potentially) when called from within
> btrfs_commit_transaction. Btrfs makes this super easy since we can 
> just skip all the CoW writes.


Now, just out of curiosity, would it be ok if I printed this when we
ran out of memory in deep call paths?

     FAIL WHALE!

W     W      W
W        W  W     W
              '.  W
  .-""-._     \ \.--|
 /       "-..__) .-'
|     _         /
\'-.__,   .__.,'
 `'----'._\--'
VVVVVVVVVVVVVVVVVVVVV


Happy Holidays ;)

- -Jeff

> Thanks!
> 
> -Jeff
> 
> 
>>> thanks, liubo
> 
>> This afternoon I started running xfstests on a dm-linear mapped 
>> partition. Halfway through a sufficiently long test, I swap out 
>> the linear mapping to an error mapping. It still crashes, but 
>> somewhat less spectacularly. There are still a ton of BUG_ON's I 
>> need to eliminate as well as work out the usual I/O
>> error-recovery issue of uninterruptible, unrecoverable writeback
>> contexts and still-locked pages holding up exit. I'm pretty
>> pleased with the results so far and am pretty optimistic.
> 
>> -Jeff
> 
> 

- -- 
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQIcBAEBAgAGBQJO9A2/AAoJEB57S2MheeWyiNIP/3Z6NETIXskkp+OVKTiF/gaP
bopj2dp92BlURFHEj5vJoESm4cUtQKTx9J/DB3yc7JDzc0UcRs9KCqGV9UpH6y9/
Zetzy3ZMsYyxvV5CZ50NGr+C1r5ULVGQ/UrPex/GT0bApcdBRMkFASLH8xkFl6dE
dfRjir038GzjVX/Phy0VPm0mg8eg77aco11Xk2+Y1MdEhsEqI+cUQYgA8O9M7HWy
67Vv3KWxKC7PU6SYCPa0wGmQwTgs10GuKT9w+s7Ampy8iQhCgEuDo4dQxpRehQfp
YwD/vlHwVATTAR2zMbRtI0BWa+ideBzcdQg1QrZxB3o026Z7ooy+/fTqS6MiUrXy
mxGvb0g/BglK6Q86YQE77doIfJeUDLGoGQx2Zv1S9OzVwigo1a0LcP82P7yNnJBY
oihql+FAYBXwjqiAQ+wUvo7wy0H+ltmQgWfUDf5wjDHquTRT1H0kE15Okc8MX8+T
rmhp6vD1deX5Jz+JBIpCm94JhxUBPkBH2WksyA1jdLUOngHxRI0jmqz/5mPexV8e
dChaq1rsjYs5Zbbv/jpaefnEw0kbZ0cqS7uDLVVoyjEqGnBpqjdwE86WYjxc4biM
MkeSJ67Oof3ZGLWR0VQ+h4YnRjqAsMWsEd3jBLMo2krsr8ucc/UOzVDBVojDlGWJ
Z2HunZuWJkNgcsBatVoS
=z1sd
-----END PGP SIGNATURE-----


* Re: Error handling: How to "lose" a transaction
  2011-12-23  5:12         ` Jeff Mahoney
@ 2011-12-23  5:43           ` Liu Bo
  2011-12-23 14:17           ` Chris Mason
  1 sibling, 0 replies; 8+ messages in thread
From: Liu Bo @ 2011-12-23  5:43 UTC (permalink / raw)
  To: Jeff Mahoney
  Cc: Chris Mason, Jeff Mahoney, Mark Fasheh, Btrfs Development List

On 12/23/2011 01:12 PM, Jeff Mahoney wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On 12/21/2011 10:38 PM, Jeff Mahoney wrote:
>> On 12/21/2011 10:21 PM, Liu Bo wrote:
>>> On 12/22/2011 10:59 AM, Jeff Mahoney wrote: Sorry I haven't 
>>> responded to this yet. I started digging right in and I've
>>> started to have some good results. It turns out there's already a
>>>  btrfs_cleanup_transaction call that will tear down outstanding 
>>> transactions. It's not perfect and I've fixed a few bugs in
>>> there, but it saved me a bunch of effort. I just wish I'd noticed
>>> it a day earlier, since I had it half implemented myself. :)
>>
>>>> Hi Jeff,
>>>> Yes, it should be. I wrote this cleanup_transaction, and I should
>>>> have told you about it earlier... Anyway, thanks for your effort.
>>>> The error handling part has lots of corner cases, so I just
>>>> picked a brute-force way to tear down the current transaction in
>>>> order to make the FS RO.
>> Oh, and it's worked great. The brute force method is a good start
>> and will address the most severe problems (and most cases) well.
>> I've decided to ignore most cases of -ENOMEM for now. The biggest
>> bug I ran into so far was calling mutex_lock while holding a
>> spinlock. It was a quick fix.
>>
>> The method I've generally used is to mark the transaction aborted
>> and pass the error up as quickly as possible, cleaning up the
>> local allocations and locks as I go. The transaction gets
>> completed normally, returns an error, isn't committed, and then is
>> destroyed (with others, potentially) when called from within
>> btrfs_commit_transaction. Btrfs makes this super easy since we can 
>> just skip all the CoW writes.
> 
> 
> Now, just out of curiosity, would it be ok if I printed this when we
> ran out of memory in deep call paths?
> 

I'm ok with this, but it depends on Chris :)

Indeed, ENOMEM in deep call paths is big trouble for us, and we don't yet have
a graceful solution. For simplicity we could allocate memory with the
__GFP_NOFAIL flag, although that is not recommended:

 * __GFP_NOFAIL: The VM implementation _must_ retry infinitely: the caller
 * cannot handle allocation failures.  This modifier is deprecated and no new
 * users should be added.
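
To make the trade-off concrete, here is a small userspace analogy, not
kernel code. flaky_alloc, alloc_nofail, deep_op and fail_budget are
names invented for this sketch, and the failure injection is artificial.
It contrasts a "retry until it works" allocation, which is roughly the
behaviour the comment above describes, with a caller that checks for
failure and returns -ENOMEM so the error can travel back up the call
path.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Number of upcoming allocations that will be made to fail. */
static int fail_budget;

/* Hypothetical allocator with injected failures. */
static void *flaky_alloc(size_t size)
{
	if (fail_budget > 0) {
		fail_budget--;
		return NULL;
	}
	return malloc(size);
}

/*
 * "Must not fail" style: loop until the allocation succeeds.
 * The caller never sees NULL, but it can also never give up;
 * if memory never becomes available this loops forever.
 */
static void *alloc_nofail(size_t size, int *retries)
{
	void *p;

	while (!(p = flaky_alloc(size)))
		(*retries)++;
	return p;
}

/*
 * Error-propagating style: report -ENOMEM and let the caller
 * decide what to do (for example, abort the transaction).
 */
static int deep_op(void)
{
	void *p = flaky_alloc(32);

	if (!p)
		return -ENOMEM;
	free(p);
	return 0;
}
```

The first style keeps call sites simple at the cost of possibly never
returning; the second needs every caller up the chain to handle the
error, which is the hard part being discussed in this thread.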


>      FAIL WHALE!
> 
> W     W      W
> W        W  W     W
>               '.  W
>   .-""-._     \ \.--|
>  /       "-..__) .-'
> |     _         /
> \'-.__,   .__.,'
>  `'----'._\--'
> VVVVVVVVVVVVVVVVVVVVV
> 
> 
> Happy Holidays ;)
> 

Happy Holidays!

thanks,
liubo

> - -Jeff
> 
>> Thanks!
>>
>> -Jeff
>>
>>
>>>> thanks, liubo
>>> This afternoon I started running xfstests on a dm-linear mapped 
>>> partition. Halfway through a sufficiently long test, I swap out 
>>> the linear mapping to an error mapping. It still crashes, but 
>>> somewhat less spectacularly. There are still a ton of BUG_ON's I 
>>> need to eliminate as well as work out the usual I/O
>>> error-recovery issue of uninterruptible, unrecoverable writeback
>>> contexts and still-locked pages holding up exit. I'm pretty
>>> pleased with the results so far and am pretty optimistic.
>>> -Jeff
>>



* Re: Error handling: How to "lose" a transaction
  2011-12-23  5:12         ` Jeff Mahoney
  2011-12-23  5:43           ` Liu Bo
@ 2011-12-23 14:17           ` Chris Mason
  1 sibling, 0 replies; 8+ messages in thread
From: Chris Mason @ 2011-12-23 14:17 UTC (permalink / raw)
  To: Jeff Mahoney; +Cc: Liu Bo, Jeff Mahoney, Mark Fasheh, Btrfs Development List

On Fri, Dec 23, 2011 at 12:12:31AM -0500, Jeff Mahoney wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On 12/21/2011 10:38 PM, Jeff Mahoney wrote:
> > On 12/21/2011 10:21 PM, Liu Bo wrote:
> >> On 12/22/2011 10:59 AM, Jeff Mahoney wrote: Sorry I haven't 
> >> responded to this yet. I started digging right in and I've
> >> started to have some good results. It turns out there's already a
> >>  btrfs_cleanup_transaction call that will tear down outstanding 
> >> transactions. It's not perfect and I've fixed a few bugs in
> >> there, but it saved me a bunch of effort. I just wish I'd noticed
> >> it a day earlier, since I had it half implemented myself. :)
> > 
> > 
> >>> Hi Jeff,
> > 
> >>> Yes, it should be. I wrote this cleanup_transaction, and I should
> >>> have told you about it earlier... Anyway, thanks for your effort.
> > 
> >>> The error handling part has lots of corner cases, so I just
> >>> picked a brute-force way to tear down the current transaction in
> >>> order to make the FS RO.
> > 
> > Oh, and it's worked great. The brute force method is a good start
> > and will address the most severe problems (and most cases) well.
> > I've decided to ignore most cases of -ENOMEM for now. The biggest
> > bug I ran into so far was calling mutex_lock while holding a
> > spinlock. It was a quick fix.
> > 
> > The method I've generally used is to mark the transaction aborted
> > and pass the error up as quickly as possible, cleaning up the
> > local allocations and locks as I go. The transaction gets
> > completed normally, returns an error, isn't committed, and then is
> > destroyed (with others, potentially) when called from within
> > btrfs_commit_transaction. Btrfs makes this super easy since we can 
> > just skip all the CoW writes.
> 
> 
> Now, just out of curiosity, would it be ok if I printed this when we
> ran out of memory in deep call paths?
> 
>      FAIL WHALE!
> 
> W     W      W
> W        W  W     W
>               '.  W
>   .-""-._     \ \.--|
>  /       "-..__) .-'
> |     _         /
> \'-.__,   .__.,'
>  `'----'._\--'
> VVVVVVVVVVVVVVVVVVVVV
> 
> 
> Happy Holidays ;)

I'll take any patch you put into the suse kernel ;)

-chris


