From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <52AAD766.8080405@hardwarefreak.com>
Date: Fri, 13 Dec 2013 03:46:14 -0600
From: Stan Hoeppner
Reply-To: stan@hardwarefreak.com
To: xfs@oss.sgi.com
Subject: Re: XFS: Internal error XFS_WANT_CORRUPTED_RETURN
References: <20131211172725.GA4606@redhat.com>
 <68DD7157-6ACE-4548-A466-C1EBD31B6DEB@colorremedies.com>
 <20131211185746.GA11861@redhat.com>
 <2A0A637F-7ED6-4743-8791-E57E22306139@colorremedies.com>
In-Reply-To: <2A0A637F-7ED6-4743-8791-E57E22306139@colorremedies.com>
List-Id: XFS Filesystem from SGI
Content-Type: text/plain; charset="us-ascii"

On 12/11/2013 6:19 PM, Chris Murphy wrote:
...
> I suspect we've only just begun to see the myriad ways in which SSDs
> could fail. I ran across this article earlier today:
> http://techreport.com/review/25681/the-ssd-endurance-experiment-testing-data-retention-at-300tb
>
> What I thought was eye opening was a hashed file failing multiple
> times in a row with *different* hash values, being allowed to rest
> unpowered for five days and then passing. Eeek.
> Talk about a great
> setup for a lot of weird transient problems with that kind of
> reversal. What I can't tell is if there were read errors reported to
> the SATA driver, or if (different) bad data from a particular page
> was sent to the driver.

The drive that exhibited this problem, the Samsung 840, is one of the
first on the market to use triple-level cell (TLC) NAND. The drive is
marketed to consumers only. The anomaly occurred after 100 TB of
writes, well beyond what is expected of a consumer drive. After the
anomaly occurred, the drive ran flawlessly up to 300 TB.

The rest of the drives, including the Samsung 840 Pro, use
two-bit-per-cell MLC NAND, and none of them has shown problems in the
article's testing. They've been flawless.

So I disagree with your statement "we've only just begun to see the
myriad ways in which SSDs could fail". What we have here is what we've
always had: a manufacturer using a bleeding-edge technology didn't have
all the bugs identified and fixed in the first rev of the product.
This isn't a problem with SSDs in general, but with one manufacturer,
one new drive model, and a brand-new NAND type.

-- 
Stan

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs