linux-btrfs.vger.kernel.org archive mirror
* A couple of questions
@ 2010-05-27 13:39 Paul Millar
  2010-05-27 14:56 ` Hubert Kario
  2010-05-27 16:00 ` Chris Mason
  0 siblings, 2 replies; 11+ messages in thread
From: Paul Millar @ 2010-05-27 13:39 UTC (permalink / raw)
  To: linux-btrfs

Hi,

I've been looking at Btrfs and have a couple of naive questions that don't 
seem to be answered on the wiki or in the articles I've read on the 
filesystem.


First: discovering a file's checksum value.

Here's the scenario: software is writing some data as a fresh file.  This 
software happens to know (a priori) the checksum of this data; for example, a 
storage server receives the file's data and checksum independently.

I've some confidence that, once the data is stored in btrfs, any corruption 
(from the storage fabric) will be spotted; however, the data may have become 
corrupt before being stored (e.g., from the network).  To catch this, the 
checksum of the stored data needs to be calculated and checked.

One approach is to calculate the checksum (in user-space) after the data is 
stored.  This adds extra IO- and CPU-load and there's also the possibility of 
false-negative results due to the filesystem cache (although btrfs may remove 
this risk).
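
A minimal sketch of this read-back approach, assuming Python on a POSIX
system.  The function name verify_after_write is hypothetical, and zlib's
CRC-32 merely stands in for whichever algorithm the client supplied.  The
fsync and posix_fadvise calls are only hints to the kernel; they reduce, but
do not eliminate, the chance that the read is served from the page cache.

```python
import os
import zlib


def verify_after_write(path, expected_crc, chunk_size=1 << 20):
    """Re-read a stored file and compare its CRC-32 with the expected value."""
    crc = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        try:
            # Flush dirty pages, then hint that cached pages may be dropped.
            # Both calls are advisory here; failures are ignored.
            os.fsync(fd)
            if hasattr(os, "posix_fadvise"):
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        except OSError:
            pass
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            # zlib.crc32 accepts a running value, so the file is never
            # held in memory as a whole.
            crc = zlib.crc32(chunk, crc)
    finally:
        os.close(fd)
    return crc == expected_crc
```

The extra pass over the file is exactly the additional IO and CPU load
mentioned above.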

Another approach would be to ask btrfs for the checksum.  It seems that it's 
possible to combine multiple CRC-32C values to figure out the checksum of the 
combined data [e.g., zlib's crc32_combine() function].  So, obtaining a file's 
checksum might be a light-weight operation.
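
As an illustration of why combining is possible, here is a small Python
sketch of CRC-32C with a combine step.  All names are hypothetical, and it
assumes the standard CRC-32C conventions (initial value and final XOR of
0xFFFFFFFF), which may not match how btrfs seeds or finalises its per-block
values.  Unlike zlib's crc32_combine(), which uses matrix exponentiation,
this version simply feeds zero bytes and so costs O(len_b).

```python
CRC32C_POLY = 0x82F63B78  # reflected Castagnoli polynomial


def _make_table():
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ CRC32C_POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def _update(state, data):
    """Advance the raw CRC register over data (no pre/post inversion)."""
    for b in data:
        state = _TABLE[(state ^ b) & 0xFF] ^ (state >> 8)
    return state


def crc32c(data):
    """CRC-32C of data with the usual 0xFFFFFFFF init and final XOR."""
    return _update(0xFFFFFFFF, data) ^ 0xFFFFFFFF


def crc32c_combine(crc_a, crc_b, len_b):
    """Return crc32c(A + B) given crc32c(A), crc32c(B) and len(B).

    The CRC is linear over GF(2): pushing crc_a through len_b zero bytes
    and XOR-ing with crc_b gives the checksum of the concatenation.
    """
    return _update(crc_a, bytes(len_b)) ^ crc_b
```

Folding crc32c_combine over a list of (checksum, length) pairs would give a
whole-file value without touching the data again.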

Yet another possibility would be to push the desired checksum value (via 
fcntl?) and have btrfs compare the desired checksum with the file's actual 
checksum on close(2), failing that call if the checksums don't match.

Would any of this be possible (without an awful lot of work)?



Second: adding support for Adler32?

Looking at the unstable git repo, it looks like there's currently support for 
only the CRC-32C checksum algorithm.  Is this correct?  If so, is anyone 
working on adding support for Adler32?

Cheers,

Paul.
(ps, please keep me CC-ed in on replies)


* Re: A couple of questions
  2010-05-27 13:39 A couple of questions Paul Millar
@ 2010-05-27 14:56 ` Hubert Kario
  2010-05-31 17:59   ` Paul Millar
  2010-05-27 16:00 ` Chris Mason
  1 sibling, 1 reply; 11+ messages in thread
From: Hubert Kario @ 2010-05-27 14:56 UTC (permalink / raw)
  To: Paul Millar; +Cc: linux-btrfs

On Thursday 27 May 2010 15:39:54 Paul Millar wrote:
> Hi,
>
> I've been looking at Btrfs and have a couple of naive questions that don't
> seem to be answered on the wiki or in the articles I've read on the
> filesystem.
>
>
> First: discovering a file's checksum value.
>
> Here's the scenario: software is writing some data as a fresh file.  This
> software happens to know (a priori) the checksum of this data; for example,
> a storage server receives the file's data and checksum independently.
>
> I've some confidence that, once the data is stored in btrfs, any corruption
> (from the storage fabric) will be spotted; however, the data may have
> become corrupt before being stored (e.g., from the network).  To catch
> this, the checksum of the stored data needs to be calculated and checked.
>
> One approach is to calculate the checksum (in user-space) after the data is
> stored.  This adds extra IO- and CPU-load and there's also the possibility
> of false-negative results due to the filesystem cache (although btrfs may
> remove this risk).
>
> Another approach would be to ask btrfs for the checksum.  It seems that
> it's possible to combine multiple CRC-32C values to figure out the
> checksum of the combined data [e.g., zlib's crc32_combine() function].
> So, obtaining a file's checksum might be a light-weight operation.
>
> Yet another possibility would be to push the desired checksum value (via
> fcntl?) and have btrfs compare the desired checksum with the file's actual
> checksum on close(2), failing that call if the checksums don't match.
>
> Would any of this be possible (without an awful lot of work)?

IMO, if an application receives data with a checksum, it can calculate the
checksum of the data on the fly, as it writes it to the disk. That won't add
any additional IO to the storage subsystem. It won't detect in-memory
corruption, though; if you want to be resilient to that, you should be looking
at ECC RAM, as subsequent checks can be affected by it too.
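
A minimal sketch of this on-the-fly approach in Python.  The function name
store_with_checksum and its parameters are hypothetical, and zlib's CRC-32
stands in for whatever algorithm the sender actually used.

```python
import os
import zlib


def store_with_checksum(chunks, out_path, expected_crc):
    """Write chunks to out_path, checksumming each one as it passes through."""
    crc = 0
    with open(out_path, "wb") as out:
        for chunk in chunks:
            # Update the running checksum from the same buffer that is
            # written, so no second read of the file is needed.
            crc = zlib.crc32(chunk, crc)
            out.write(chunk)
    if crc != expected_crc:
        # Reject the upload: the received bytes do not match the checksum
        # supplied by the sender.
        os.remove(out_path)
        raise ValueError("checksum mismatch: got %08X, expected %08X"
                         % (crc, expected_crc))
    return crc
```

This only shows that the bytes handed to write() matched; it says nothing
about what happens to them afterwards.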

Second, you shouldn't tie an application or network protocol to the CRC scheme
used by the filesystem on the server! Especially when other CRC algorithms can
be used, not only CRC-32C.

If the checksum algorithm used by the FS were set in stone, then userspace
could employ it somehow, but if different CRCs can be used, I see no reason to
allow userspace to read them.


-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl

System Zarządzania Jakością
zgodny z normą ISO 9001:2000
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: A couple of questions
  2010-05-27 13:39 A couple of questions Paul Millar
  2010-05-27 14:56 ` Hubert Kario
@ 2010-05-27 16:00 ` Chris Mason
  2010-05-31 18:06   ` Paul Millar
  1 sibling, 1 reply; 11+ messages in thread
From: Chris Mason @ 2010-05-27 16:00 UTC (permalink / raw)
  To: Paul Millar; +Cc: linux-btrfs

On Thu, May 27, 2010 at 03:39:54PM +0200, Paul Millar wrote:
> Hi,
> 
> I've been looking at Btrfs and have a couple of naive questions that don't 
> seem to be answered on the wiki or in the articles I've read on the 
> filesystem.
> 
> 
> First: discovering a file's checksum value.
> 
> Here's the scenario: software is writing some data as a fresh file.  This 
> software happens to know (a priori) the checksum of this data; for example, a 
> storage server receives the file's data and checksum independently.
> 
> I've some confidence that, once the data is stored in btrfs, any corruption 
> (from the storage fabric) will be spotted; however, the data may have became 
> corrupt before being stored (e.g., from the network).  To catch this, the 
> checksum of the stored data needs to be calculated and checked.
> 
> One approach is to calculate the checksum (in user-space) after the data is 
> stored.  This adds extra IO- and CPU-load and there's also the possibility of 
> false-negative results due to the filesystem cache (although btrfs may remove 
> this risk).
> 
> Another approach would be to ask btrfs for the checksum.  It seems that it's 
> possible to combine multiple CRC-32C values to figure out the checksum of the 
> combined data [e.g., zlib's crc32_combine() function].  So, obtaining a file's 
> checksum might be a light-weight operation.
> 
> Yet another possibility would be to push the desired checksum value (via 
> fcntl?) and have btrfs compare the desired checksum with the file's actual 
> checksum on close(2), failing that call if the checksums don't match.
> 
> Would any of this be possible (without an awful lot of work)?

I'd suggest that you look at T10 DIF and DIX, which are targeted at
exactly this kind of thing.  We're looking at integrating dif/dix into
btrfs at some point.

> 
> 
> 
> Second: adding support for Adler32?
> 
> Looking at the unstable git repo, it looks like there's currently support for 
> only the CRC-32C checksum algorithm.  Is this correct?  If so, is anyone 
> working on adding support for Adler32?

We haven't looked at adler32.  crc32c was chosen because it is supported
in hardware by recent intel CPUs.

-chris


* Re: A couple of questions
  2010-05-27 14:56 ` Hubert Kario
@ 2010-05-31 17:59   ` Paul Millar
  2010-06-02 16:19     ` Hubert Kario
  0 siblings, 1 reply; 11+ messages in thread
From: Paul Millar @ 2010-05-31 17:59 UTC (permalink / raw)
  To: Hubert Kario; +Cc: linux-btrfs

Hi Hubert,

On Thursday 27 May 2010 16:56:00 Hubert Kario wrote:
> > Would [obtaining file checksum] be possible (without an awful lot
> > of work)?
> 
> [Calculating checksum in-memory]  won't detect in-memory corruption
> though, but if you want to be resilient to this, you should be looking at
>  ECC RAM as subsequent checks can be affected by it too.

Certainly ECC RAM will help, but unfortunately it doesn't remove the 
possibility of corruption; for example, CERN found [1] that double-bit memory 
corruptions (which ECC cannot recover from) can still happen.

[1] 
http://indico.cern.ch/getFile.py/access?contribId=3&sessionId=0&resId=1&materialId=paper&confId=13797

Also, IIRC there was a case where Fermilab tracked down a data corruption to a 
faulty PCI bus in the server.  So who knows where are all the places 
corruption could occur?

I guess the real problem is that, when processing large amounts of data, these 
rare occurrences start to stack up.


> Second, you shouldn't tie application or network protocol to a CRC scheme
>  used by filesystem on server! Especially when there can be other CRC
>  algorithms used, not only CRC-32C.

Sure, but the protocol isn't tied to any particular checksum algorithm.

 
> If the checksum algorithm used by FS was set in stone, then userspace could
> employ it somehow, but if there can be different CRCs used, I see no reason
>  to allow the userspace to read them.

I agree that a checksum value, without knowing the algorithm, isn't much use.  
However, suppose the FS reported a string representation of the tuple (algorithm, 
value); for example:

   0:DCD05C54

(where "0" is from BTRFS_CSUM_TYPE_CRC32)

Would that allow meaningful use of this information?
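
As a sketch of how an application might consume such a string: the
"type:hexvalue" format is only the proposal made here, not an existing btrfs
interface, and the names CSUM_TYPES and parse_fs_checksum are hypothetical.

```python
# Map of numeric checksum-type identifiers to algorithm names.
# 0 mirrors BTRFS_CSUM_TYPE_CRC32, which denotes CRC-32C in btrfs.
CSUM_TYPES = {0: "crc32c"}


def parse_fs_checksum(text):
    """Split 'type:HEXVALUE' into (algorithm name, integer checksum)."""
    type_id, _, hex_value = text.partition(":")
    algorithm = CSUM_TYPES.get(int(type_id))
    if algorithm is None:
        # Unknown algorithm: the value cannot be compared with anything.
        raise ValueError("unknown checksum type %s" % type_id)
    return algorithm, int(hex_value, 16)
```

An application would compare the value only when the reported algorithm
matches the one the client used, and fall back to its own verification
otherwise.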

Cheers,

Paul.


* Re: A couple of questions
  2010-05-27 16:00 ` Chris Mason
@ 2010-05-31 18:06   ` Paul Millar
  2010-05-31 20:33     ` Mike Fedyk
  2010-06-01 13:39     ` Martin K. Petersen
  0 siblings, 2 replies; 11+ messages in thread
From: Paul Millar @ 2010-05-31 18:06 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-btrfs

Hi Chris,

On Thursday 27 May 2010 18:00:44 Chris Mason wrote:
> I'd suggest that you look at T10 DIF and DIX, which are targeted at
> exactly this kind of thing.  We're looking at integrating dif/dix into
> btrfs at some point.

I've been keeping half-an-eye on T10's work in ensuring end-to-end integrity.  
That you guys are planning to integrate dif/dix support is certainly welcome 
news!

In my use-case (a file-server that receives a new file from a remote client),  
I believe that, to ensure end-to-end integrity,  the server software would 
have to push the client-supplied checksum into the FS when writing a new file.  
(I believe there's some T10 slides somewhere that show this use-case) -- or 
(equivalently) the server software obtains the FS checksum for the file and 
matches it against the client-supplied value.

I'm deliberately taking the simplest case when the client has chosen the same 
checksum algorithm as the FS uses.  In reality, this may not be the case, but 
we can probably cope with that.

My concern is that, if the server-software doesn't push the client-provided 
checksum then the FS checksum (plus T-10 DIF/DIX) would not provide a rigorous 
assurance that the bytes are the same.  Without this assurance, corruption 
could still occur; for example, within the server's memory.

> We haven't looked at adler32.  crc32c was chosen because it is supported
> in hardware by recent intel CPUs.

OK, fair enough :)

Cheers,

Paul.


* Re: A couple of questions
  2010-05-31 18:06   ` Paul Millar
@ 2010-05-31 20:33     ` Mike Fedyk
  2010-06-02 11:56       ` Paul Millar
  2010-06-01 13:39     ` Martin K. Petersen
  1 sibling, 1 reply; 11+ messages in thread
From: Mike Fedyk @ 2010-05-31 20:33 UTC (permalink / raw)
  To: Paul Millar; +Cc: Chris Mason, linux-btrfs

On Mon, May 31, 2010 at 11:06 AM, Paul Millar <paul.millar@desy.de> wrote:
> Hi Chris,
>
> On Thursday 27 May 2010 18:00:44 Chris Mason wrote:
>> I'd suggest that you look at T10 DIF and DIX, which are targeted at
>> exactly this kind of thing.  We're looking at integrating dif/dix into
>> btrfs at some point.
>
> I've been keeping half-an-eye on T10's work in ensuring end-to-end integrity.
> That you guys are planning to integrate dif/dix support is certainly welcome
> news!
>
> In my use-case (a file-server that receives a new file from a remote client),
> I believe that, to ensure end-to-end integrity,  the server software would
> have to push the client-supplied checksum into the FS when writing a new file.
> (I believe there's some T10 slides somewhere that show this use-case) -- or
> (equivalently) the server software obtains the FS checksum for the file and
> matches it against the client-supplied value.
>
> I'm deliberately taking the simplest case when the client has chosen the same
> checksum algorithm as the FS uses.  In reality, this may not be the case, but
> we can probably cope with that.
>
> My concern is that, if the server-software doesn't push the client-provided
> checksum then the FS checksum (plus T-10 DIF/DIX) would not provide a rigorous
> assurance that the bytes are the same.  Without this assurance, corruption
> could still occur; for example, within the server's memory.
>

Have you taken into account the boundaries of the data checksums?
Your app may checksum per file or some logical partition in the file
format.  Btrfs does the checksum per-extent so unless you keep track
of where the extent boundaries are, that checksum will be useless to
the userspace app.  Also the app would be tied specifically to a
storage technology.  No matter how great foo might be, not everyone's
going to use it.

Also are you going to get this info over nfs, cifs, lustre, gluster,
ceph, foo, bar and baz?


* Re: A couple of questions
  2010-05-31 18:06   ` Paul Millar
  2010-05-31 20:33     ` Mike Fedyk
@ 2010-06-01 13:39     ` Martin K. Petersen
  2010-06-02 13:40       ` Paul Millar
  1 sibling, 1 reply; 11+ messages in thread
From: Martin K. Petersen @ 2010-06-01 13:39 UTC (permalink / raw)
  To: Paul Millar; +Cc: Chris Mason, linux-btrfs

>>>>> "Paul" == Paul Millar <paul.millar@desy.de> writes:

Paul> My concern is that, if the server-software doesn't push the
Paul> client-provided checksum then the FS checksum (plus T-10 DIF/DIX)
Paul> would not provide a rigorous assurance that the bytes are the
Paul> same.  Without this assurance, corruption could still occur; for
Paul> example, within the server's memory.

For DIX we allow integrity metadata conversion.  Once the data is
received, the server generates appropriate IMD for the next layer.  Then
the server verifies that the original IMD matches the data buffer.  That
way there's no window of error.  But obviously the ideal case is where
the same IMD can be passed throughout the stack without conversion.
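
The ordering described here can be sketched in Python as follows.  This is
only an illustration of the overlap idea, not DIX itself: the function name
is hypothetical, Adler-32 stands in for the client's integrity metadata, and
CRC-32 for the metadata passed to the next layer.

```python
import zlib


def convert_integrity_metadata(buffer, adler_from_client):
    """Convert client IMD (Adler-32) to next-layer IMD (CRC-32)."""
    # Step 1: generate the new checksum for the next layer first.
    crc_for_next_layer = zlib.crc32(buffer)
    # Step 2: only then verify the original checksum against the buffer.
    # Because the new checksum already exists before the old one is
    # checked, the two protection envelopes overlap.
    if zlib.adler32(buffer) != adler_from_client:
        raise ValueError("buffer does not match client-supplied checksum")
    return crc_for_next_layer
```

If the buffer changes after step 1, the next layer's check against
crc_for_next_layer would be expected to catch it; if it changed before step
1, the check in step 2 would.  The cost is calculating two checksums over the
same data.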

Not sure what you use for file service?  I believe NFSv4 allows for
checksums to be passed along. I have not looked at them closely yet,
though.

-- 
Martin K. Petersen	Oracle Linux Engineering


* Re: A couple of questions
  2010-05-31 20:33     ` Mike Fedyk
@ 2010-06-02 11:56       ` Paul Millar
  0 siblings, 0 replies; 11+ messages in thread
From: Paul Millar @ 2010-06-02 11:56 UTC (permalink / raw)
  To: Mike Fedyk; +Cc: Chris Mason, linux-btrfs

Hi Mike,

On Monday 31 May 2010 22:33:23 Mike Fedyk wrote:
> On Mon, May 31, 2010 at 11:06 AM, Paul Millar <paul.millar@desy.de> wrote:
> > [...] My concern is that, if the server-software doesn't push the
> > client-provided checksum then the FS checksum (plus T-10 DIF/DIX) would
> > not provide a rigorous assurance that the bytes are the same [...]
> 
> Have you taken into account the boundaries of the data checksums?
> Your app may checksum per file or some logical partition in the file
> format. 

I'm thinking specifically of the case when the user creates a file, writes the 
file's contents and closes it;  for us, this is the only use-case when writing 
data.  In this scenario, the checksum would be of the file's complete data 
rather than any particular logical partition.

> Btrfs does the checksum per-extent so unless you keep track
> of where the extent boundaries are, that checksum will be useless to
> the userspace app. 

Sure, this is true with how things are currently.

However, I was hoping that it would be possible to add code within btrfs to 
obtain the checksum over all the file's data.  Since btrfs knows the 
extent sizes and per-extent checksum values, I believe this is tractable and 
relatively easy.

> Also the app would be tied specifically to a storage technology.  No
> matter how great foo might be, not everyone's going to use it.

Sure, but I'm thinking of this behaviour (within the app) as being optional. 
The app would continue to be FS and storage-technology independent.

If the FS doesn't support internal consistency checks (e.g., ext3, xfs, ...) 
then the app would continue to do userland checksum verification on write:  
it's better than nothing.

If the app is deployed on a node with btrfs then the app could try to "align" 
the user-supplied checksum with the value within the FS: either pushing the 
correct checksum value into the FS or reading the resulting FS-generated 
checksum value after writing and comparing it with the user-supplied value.

> Also are you going to get this info over nfs, cifs, lustre, gluster,
> ceph, foo, bar and baz?

This is certainly a valid concern. 

I can't speak for all these protocols and distributed filesystems: we don't 
support mounting our app with CIFS and the software doesn't participate in 
Lustre, Gluster or Ceph cluster filesystems.

However, here's information about the protocols we do support:

The majority of LAN transfers use a custom protocol.  The wire-protocol 
includes support for uploading a checksum value on close.

We also support the xrootd protocol, which allows clients to upload checksum 
values with the kXR_verifyw command.

We also support NFS v4.1.   NFS doesn't support uploading a checksum (I 
believe, and it isn't part of the current v4.2 work), but we may be able to 
work around this.

We also support WebDAV.  This currently has no support for checksums.

Almost all WAN transfers currently use GridFTP v2.  This includes the SCKS 
command, which allows the client to upload the correct checksum value.

In short, with current usage, the app will know the checksum value, as 
supplied by the remote client.

Cheers,

Paul.



* Re: A couple of questions
  2010-06-01 13:39     ` Martin K. Petersen
@ 2010-06-02 13:40       ` Paul Millar
  2010-06-04  1:17         ` Martin K. Petersen
  0 siblings, 1 reply; 11+ messages in thread
From: Paul Millar @ 2010-06-02 13:40 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: Chris Mason, linux-btrfs

On Tuesday 01 June 2010 15:39:52 Martin K. Petersen wrote:
> >>>>> "Paul" == Paul Millar <paul.millar@desy.de> writes:
> Paul> My concern is that, if the server-software doesn't push the
> Paul> client-provided checksum then the FS checksum (plus T-10 DIF/DIX)
> Paul> would not provide a rigorous assurance that the bytes are the
> Paul> same.  Without this assurance, corruption could still occur; for
> Paul> example, within the server's memory.
> 
> For DIX we allow integrity metadata conversion.  Once the data is
> received, the server generates appropriate IMD for the next layer.  Then
> the server verifies that the original IMD matches the data buffer.  That
> way there's no window of error.  But obviously the ideal case is where
> the same IMD can be passed throughout the stack without conversion.

I think we may be talking slightly at cross-purposes here: in my case, one of 
the end-points (for "end-to-end data integrity") is a remote computer that is 
uploading a file with a corresponding checksum.

Please correct me if I'm wrong here, but T10 DIF/DIX refers only to data 
integrity protection from the OS's FS-level down to the block device: a 
userland application doesn't know that it is writing into a FS that is 
utilising DIX with a DIF-enabled storage system.

When a file is uploaded from a remote client to an application with the 
checksum, the app can verify this checksum internally.  However, there's then 
a (logical) gap between userland and FS where data integrity is no longer 
assured.  For example, corruption that occurs after the app has verified the 
checksum value would not be picked up, even with T10 DIX/DIF, since the FS 
would receive and store the already-corrupted data "in good faith".

In principle, one can add a btrfs-specific mechanism to continue this 
assurance from userland down to the FS.  Perhaps the simplest would be to 
allow userland applications to read the FS's internal checksum (app would read 
the FS internal checksum after writing and verify it is consistent), but I 
guess more sophisticated (interleaved IMD, T10-like) mechanisms are also 
possible.

Unfortunately, any such solution would be btrfs-specific, since (I believe) no 
one has standardised how to extend T10 into userspace.


> Not sure what you use for file service?  I believe NFSv4 allows for
> checksums to be passed along. I have not looked at them closely yet,
> though.

I believe NFS currently doesn't support checksums (as per v4.1).  Looking in 
more detail, Alok Aggarwal gave a talk at the 2006 Connectathon about this.  
Alok's slides have a nice diagram (slide 11) showing the kind of end-to-end 
integrity I'm after.  The issue is how to achieve the assurance between "NFS 
Server" and "Local FS" on the right.

For NFS, I believe there aren't any plans for introducing checksum support for 
v4.2.  Perhaps it'll appear with the later minor versions of the standard.

Cheers,

Paul.


* Re: A couple of questions
  2010-05-31 17:59   ` Paul Millar
@ 2010-06-02 16:19     ` Hubert Kario
  0 siblings, 0 replies; 11+ messages in thread
From: Hubert Kario @ 2010-06-02 16:19 UTC (permalink / raw)
  To: Paul Millar; +Cc: linux-btrfs

On Monday 31 May 2010 19:59:46 Paul Millar wrote:
> Hi Hubert,
>
> On Thursday 27 May 2010 16:56:00 Hubert Kario wrote:
> > > Would [obtaining file checksum] be possible (without an awful lot
> > > of work)?
> >
> > [Calculating checksum in-memory]  won't detect in-memory corruption
> > though, but if you want to be resilient to this, you should be looking at
> >  ECC RAM as subsequent checks can be affected by it too.
>
> Certainly ECC RAM will help, but unfortunately it doesn't remove the
> possibility of corruption; for example, CERN found [1] that double-bit
> memory corruptions (which ECC cannot recover from) can still happen.
>
> [1]
> http://indico.cern.ch/getFile.py/access?contribId=3&sessionId=0&resId=1&materialId=paper&confId=13797
>
> Also, IIRC there was a case where Fermilab tracked down a data corruption
> to a faulty PCI bus in the server.  So who knows where are all the places
> corruption could occur?
>
> I guess the real problem is that, when processing large amounts of data,
> these rare occurrences start to stack up.
>

Yes, but AFAIK btrfs checksums don't have an internal checksum (e.g. you
can't check whether the checksum you read is a valid one or not, as it has no
control bits); as such, if you consider PCI bus corruption likely, you still
don't get 100% certainty that the data reached the HDD unharmed.

If you need that level of certainty when recording data, I'd consider
mainframe hardware and/or duplicating the whole storage stack.

Cheers,
-- 
Hubert Kario


* Re: A couple of questions
  2010-06-02 13:40       ` Paul Millar
@ 2010-06-04  1:17         ` Martin K. Petersen
  0 siblings, 0 replies; 11+ messages in thread
From: Martin K. Petersen @ 2010-06-04  1:17 UTC (permalink / raw)
  To: Paul Millar; +Cc: Martin K. Petersen, Chris Mason, linux-btrfs

>>>>> "Paul" == Paul Millar <paul.millar@desy.de> writes:

Paul> Please correct me if I'm wrong here, but T10 DIF/DIX refers only
Paul> to data integrity protection from the OS's FS-level down to the
Paul> block device: a userland application doesn't know that it is
Paul> writing into a FS that is utilising DIX with a DIF-enabled storage
Paul> system.

My point was that it is possible to have different protection types in
play (and thus different checksums) as long as you overlap the
protection envelopes.  At the expense of having to calculate checksums
multiple times, of course.


Paul> Unfortunately, any such solution would be btrfs-specific, since (I
Paul> believe) no one has standardised how to extend T10 into userspace.

Not yet, but we're working on a generic interface that would allow the
protection information to be attached.  This is not going to be tied to
just T10 DIF.  The current Linux block layer integrity handles different
types of protection information.


Paul> I believe NFS currently doesn't support checksums (as per v4.1).
Paul> Looking into more detail, Alok Aggarwal gave a talk at 2006
Paul> connectathon about this.  Alok's slides have a nice diagram (slide
Paul> 11) showing the kind of end-to-end integrity I'm after.  The issue
Paul> is how to achieve the assurance between "NFS Server" and "Local
Paul> FS" on the right.

Paul> For NFS, I believe there aren't any plans for introducing checksum
Paul> support for v4.2.  Perhaps it'll appear with the later minor
Paul> versions of the standard.

I haven't looked into this for a long time.  Last time I talked to the
NFS folks they seemed to think it would be possible to bridge the two
methods.

-- 
Martin K. Petersen	Oracle Linux Engineering
