From: "Theodore Ts'o" <tytso@mit.edu>
To: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: "Vladislav Bolkhovitin" <vvvvvst@gmail.com>,
    "杨苏立 Yang Su Li" <suli@cs.wisc.edu>,
    "General Discussion of SQLite Database" <sqlite-users@sqlite.org>,
    linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    drh@hwaci.com
Subject: Re: [sqlite] light weight write barriers
Date: Thu, 25 Oct 2012 09:50:44 -0400
Message-ID: <20121025135044.GA13562@thunk.org>
In-Reply-To: <20121025140325.49cd7c79@pyramind.ukuu.org.uk>

On Thu, Oct 25, 2012 at 02:03:25PM +0100, Alan Cox wrote:
>
> I doubt they care. The profit on high end features from the people who
> really need them I would bet far exceeds any other benefit of giving it to
> others. Welcome to capitalism 8)

Yes, but it's a question of pricing. If they had priced it just a
wee bit higher, then there would have been incentive to add support
for TCQ so it could actually be used in various Linux file systems,
since there would have been lots of users of it. But as it is, the
folks who are purchasing vast numbers of these drives --- such as the
large cloud providers: Amazon, Facebook, Rackspace, et al. --- will
choose to purchase large numbers of commodity drives, and then find
ways to work around the missing functionality in userspace.

For example, DIF/DIX would be nice, and if it were available for
cheap, I could imagine it being used. But you can accomplish the same
thing in userspace, and in fact at Google I've implemented a special
not-for-mainline patch which spikes out stable writes (required for
DIF/DIX) because it has significant performance overhead, and DIF/DIX
has zero benefit if you're not willing to shell out $$$ for hardware
that supports it.

Maybe the HDD manufacturers have been able to price gouge a small
number of enterprise IT shops with more dollars than sense, but
personally, I'm not convinced they picked an optimal pricing
strategy....

Put another way, I accept that Toyota should price a Lexus ES higher
than a Camry, but if it's priced at, say, 3x the price of a Camry
instead of 20% more, they might find that precious few people are
willing to pay that kind of money for what is essentially the same car
with minor luxury tweaks added to it.

> Plus - spinning rust for those end users is on the way out, SATA to flash
> is a bit of hack and people are already putting a lot of focus onto
> things like NVM Express.

Yeah.... I don't buy that. One, flash is still too expensive. Two,
the capital costs to build enough silicon foundries to replace the
current production volume of HDDs are way too high for any company
to afford (the cloud providers are buying *huge* numbers of HDDs)
--- and that's assuming companies wouldn't choose to use those
foundries for products with larger margins --- such as, for example,
CPU/GPU chips. :-) And third and finally, if you study the long-term
trends in terms of Data Retention Time (going down), Program and Read
Disturb (going up), and Write Endurance (going down) as a function of
feature size and/or time, you'd be wise to treat flash as nothing more
than short-term cache, and not as a long-term stable store.

If end users completely give up on HDDs, and store all of their
precious family pictures on flash storage, after a couple of years,
they are likely going to be very disappointed....

Speaking personally, I wouldn't want to have anything on flash for
more than a few months at *most* before I made sure I had another copy
saved on spinning rust platters for long-term retention.

						- Ted