From: Ram Ramesh <rramesh2400@gmail.com>
To: Roger Heflin <rogerheflin@gmail.com>
Cc: Linux Raid <linux-raid@vger.kernel.org>
Subject: Re: Best way to add caching to a new raid setup.
Date: Sun, 30 Aug 2020 12:19:23 -0500
Message-ID: <12da829e-d8b1-a2ee-7c0a-7da84788bc39@gmail.com>
In-Reply-To: <CAAMCDed6HPj3uO8+uUPYb-=5Rurp5LVYsDmMvmmtZiCEi5i39A@mail.gmail.com>

On 8/30/20 10:42 AM, Roger Heflin wrote:
> The LSI should be a good controller as long as you run the HBA fw and
> not the raid fw.
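
A hedged way to confirm which firmware such a card runs, assuming LSI's
sas2flash utility is available:

    # list adapter and firmware details; the "Firmware Product ID" line
    # shows (IT) for the plain HBA firmware and (IR) for the raid firmware
    sas2flash -listall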
>
> I use an LSI with hba fw + the 8 AMD chipset sata ports; currently I
> have 12 ports cabled to hot-swap bays but only 7 + a boot disk in use.
>
> How many recordings do you think you will have, and how many
> clients/watchers?  With the SSD handling the writes for recording, my
> disks actually spin down if no one is watching anything.
>
> The other trick the partitions let me do: initially I moved from 1.5
> -> 3tb disks (2x750 -> 4x750), and once I had three 3tbs in I added
> the 2 extra partitions as raid6 (+1.5TB), since I bought the 3tb
> drives slowly; then each following 3tb gets added to all 4 partition
> arrays (+3TB).
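
A rough sketch of what adding the next drive to all four partition
arrays might look like (untested; device names, md numbers, and the
final device count are placeholders):

    # partition the new drive to match the others, then grow each array;
    # older mdadm versions may also want --backup-file for the reshape
    for i in 0 1 2 3; do
        mdadm /dev/md$i --add /dev/sdg$((i+1))
        mdadm --grow /dev/md$i --raid-devices=5
    done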
>
> On reads each disk can do at least 50 iops, and for the most part the
> disks themselves are very likely to cache the entire track the head
> passes over, so a 2nd sequential read likely comes from the disk's
> read cache and does not have to actually be read from the platter.  So
> several sequential workloads jumping back and forth do not behave as
> badly as one would expect.  Writes are a different story and a lot
> more expensive.  I isolate those to the ssd and copy them over in the
> middle of the night when activity is low.  And since they are copied
> as big fast streams, one file at a time, they end up with very few
> fragments and write very quickly.   The way I have mine set up, mythtv
> will find the file whether it is in the ssd recording directory or the
> raid recording directory, so when I mv the files nothing has to be
> done except the mv.
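
A minimal sketch of such a nightly move job (the storage-group paths
are hypothetical, and it assumes files untouched for an hour are done
recording):

    #!/bin/sh
    # Move finished recordings from the SSD storage group to the raid one.
    # mythtv finds the file in either directory, so only the mv is needed;
    # across filesystems, mv is a sequential copy followed by a delete.
    SRC=/mnt/ssd/recordings
    DST=/mnt/raid/recordings
    find "$SRC" -type f -mmin +60 -exec mv -n {} "$DST"/ \;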
>
>
> On Sat, Aug 29, 2020 at 7:56 PM Ram Ramesh <rramesh2400@gmail.com> wrote:
>> On 8/29/20 4:26 PM, Roger Heflin wrote:
>>> It is worth noting that if you buy 2 SSDs of exactly the same model
>>> at the same time and use them in a mirror, they are very likely to
>>> wear at about the same rate.
>>>
>>> I am hesitant to go much bigger on disks, especially since the $$/GB
>>> really does not change much as the disks get bigger.
>>>
>>> And be careful about adding a cheap sata controller, as a lot of them behave badly.
>>>
>>> Most of my disks have died from bad blocks giving a section of the
>>> disk some errors, or bad blocks in sections causing the array to
>>> pause for 7 seconds.  Make sure to get a disk with SCTERC settable
>>> (the timeout when bad blocks happen; without it the default timeout
>>> is 60-120 seconds, but with it you can set it to no more than 7
>>> seconds).  In the cases where the entire disk did not just stop and
>>> is only getting bad blocks in places, you typically have time, since
>>> only a single section is going bad, so having sections does help.
>>> Also note that mdadm with 4 sections like I have will only run a
>>> single rebuild at a time, since mdadm understands that the underlying
>>> disks are shared; this makes replacing a disk with 1 section or 4
>>> sections work pretty much the same.  It does the same thing on the
>>> weekly scans: it sets all 4 to scan, scans 1, and defers the others
>>> because the disks are shared.
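
For reference, a hedged example of the SCTERC and timeout settings
being described (the device name is a placeholder):

    # show the drive's current SCT error-recovery timeouts
    smartctl -l scterc /dev/sdX
    # set read/write error recovery to 7.0 seconds (units are 100 ms)
    smartctl -l scterc,70,70 /dev/sdX
    # for drives that do not support scterc, raise the kernel's own
    # command timeout instead so the drive is not kicked mid-recovery
    echo 180 > /sys/block/sdX/device/timeout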
>>>
>>> It seems that a disk dying completely is a lot less common than bad-block issues.
>>>
>>> On Sat, Aug 29, 2020 at 3:50 PM Ram Ramesh <rramesh2400@gmail.com> wrote:
>>>> On 8/29/20 12:02 AM, Roman Mamedov wrote:
>>>>> On Fri, 28 Aug 2020 22:08:22 -0500
>>>>> "R. Ramesh" <rramesh@verizon.net> wrote:
>>>>>
>>>>>> I do not know how SSD caching is implemented. I assumed it would be
>>>>>> somewhat similar to a memory cache (L2 vs L3 vs L4 etc.). I am hoping
>>>>>> that with SSD caching, reads/writes to disk will be larger and
>>>>>> sequential within a file (similar to a cache-line fill in a memory
>>>>>> cache, which results in efficient memory bursts). I thought that is
>>>>>> what SSD caching would do to disk reads/writes. I assumed that once
>>>>>> reads (ahead) and writes (assuming a writeback cache) buffer enough
>>>>>> data in the SSD, all reads/writes will go to the SSD, with periodic
>>>>>> well-organized large transfers to disk. If I am wrong here then I do
>>>>>> not see any point in SSD as a cache. My aim is not to optimize by
>>>>>> cache hits, but to optimize by preventing the disks from thrashing
>>>>>> back and forth, seeking after every block read. I suppose the Linux
>>>>>> (memory) buffer cache alleviates some of that. I was hoping SSD would
>>>>>> provide the next level. If not, I am off in my understanding of SSD
>>>>>> as a disk cache.
>>>>> Just try it; as I said before, with LVM it is easy to remove if it
>>>>> doesn't work out. You can always go to the manual copying method or
>>>>> whatnot, but first why not check if the automatic caching solution
>>>>> might be "good enough" for your needs.
>>>>>
>>>>> Yes, it usually tries to avoid caching long sequential reads or writes,
>>>>> but there's also quite a bit of other load on the FS, i.e. metadata. I
>>>>> found that browsing directories and especially mounting the filesystem
>>>>> benefited greatly from caching.
>>>>>
>>>>> You are correct that it will try to increase performance via writeback
>>>>> caching; however, with LVM that needs to be enabled explicitly:
>>>>> https://www.systutorials.com/docs/linux/man/7-lvmcache/#lbAK
>>>>> And of course a failure of that cache SSD will mean losing some data,
>>>>> even if the main array is RAID. Perhaps you should consider a RAID of
>>>>> SSDs for the cache in that case.
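
A rough sketch of such an lvmcache setup with writeback enabled
explicitly, following lvmcache(7); the VG, LV, and device names are
placeholders:

    # carve a cache pool out of the SSD (already a PV in vg0)
    lvcreate --type cache-pool -L 400G -n ssdcache vg0 /dev/sdX1
    # attach it to the main LV; the default mode is writethrough,
    # so writeback has to be requested explicitly
    lvconvert --type cache --cachepool vg0/ssdcache \
              --cachemode writeback vg0/media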
>>>>>
>>>> Yes, I have 2x500GB ssds for cache. Maybe I should do raid1 on them
>>>> and use that as the cache volume.
>>>> I thought SSDs were more reliable, and that even when they begin to
>>>> die they become readonly before quitting.  Of course, this is all
>>>> theory, and I do not think standards exist on how they behave when
>>>> reaching EoL.
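
One hedged sketch of that idea, with placeholder device names: mirror
the two SSDs with md first, then hand the mirror to LVM as the cache PV.

    # a single SSD failure then cannot take the writeback data with it
    mdadm --create /dev/md10 --level=1 --raid-devices=2 /dev/sdf /dev/sdg
    pvcreate /dev/md10
    vgextend vg0 /dev/md10   # build the cache pool on /dev/md10 as above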
>>>>
>>>> Ramesh
>>>>
>> My SSDs are from different companies and bought at different times
>> (2019/2016, I think).
>>
>> I have not had many hard disk failures. However, each time I had one,
>> it was a total death, so I am a bit biased. Maybe with sections I could
>> replace one md at a time while letting the others run degraded; I am
>> sure there are other tricks. I am simply saying that cold replacement
>> of a disk involves a lot more reads/writes, and of course computation,
>> in RAID6 than in RAID1.
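
For what it's worth, a hedged sketch of mdadm's hot-replace path, which
avoids running degraded at all when the old disk is still mostly
readable (device names hypothetical):

    # add the new disk as a spare, then ask md to replace in place;
    # md copies old -> new directly and only uses parity for bad blocks,
    # so the array never drops to degraded during the swap
    mdadm /dev/md0 --add /dev/sdh1
    mdadm /dev/md0 --replace /dev/sde1 --with /dev/sdh1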
>>
>> Yes, larger disks are not cheaper per GB, but they use one SATA port
>> where smaller disks would use several. Also, they use less power in the
>> long run (mine run 24x7). That is why my policy is to replace disks
>> once disks of 2x the size of what I currently own become commonplace.
>>
>> I have an LSI 9211 SAS HBA, which this community touts as reliable.
>>
>> Regards
>> Ramesh
>>
Roger,

   Thanks for the details on your SSD setup. Yes, mythtv is supposed to 
find the file from its storage group entries regardless of the actual 
location, so the mv is all that is required. However, I have never tried 
that feature, so it will be a new thing for me. Like I said before, I 
will try the LVM cache and see whether my disk activity improves (for 
example with the lvs cache counters sketched below). If that is not to 
my satisfaction, I will remove the cache and set it up separately the 
way you have. I only have a 500GB SSD, but I do not think daily 
recordings will come anywhere close to that size.
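
A hedged example of both steps, assuming an lvmcache setup with
placeholder names (the cache counters need a reasonably recent LVM):

    # watch hit/miss and dirty-block counters for the cached LV
    lvs -o lv_name,cache_read_hits,cache_read_misses,cache_dirty_blocks vg0/media
    # if it disappoints: flush and detach the cache; the origin LV stays intact
    lvconvert --uncache vg0/media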

Regards
Ramesh

