From ac030d06b39c30312acd8a1a591e3444c066327a Mon Sep 17 00:00:00 2001
From: godp21 <godp21@inf.ufpr.br>
Date: Thu, 20 Apr 2023 11:31:14 -0300
Subject: [PATCH] adding ZFS description

---
 ZFS-RAIDs.md | 35 ++++++++++++++++++++++++++---------
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/ZFS-RAIDs.md b/ZFS-RAIDs.md
index 760c28f..26879ba 100644
--- a/ZFS-RAIDs.md
+++ b/ZFS-RAIDs.md
@@ -1,14 +1,15 @@
-# Useful RAIDs with ZFS
-**R**edundant **A**rray of **I**ndependent **D**isks
-
-## ROADMAP
-- [x] RAID{10,1,6,10}
+# ROADMAP
+- [x] RAID{0,1,6,10}
+- [x] ZFS description and capabilities
 - [ ] RAID implementations in ZFS
 - [ ] Cache systems in ZFS
 ___
 - [ ] ZFS for DB or lvm+ext4
+# RAIDs
+**R**edundant **A**rray of **I**ndependent **D**isks
+
 ## Some recorrent terms when defining a RAID system:
 **PARITY**
 : Refers to *parity bit*, it's a bit added to a string that says if the sum of bits in the string is even/odd, it's a simple form of error checking. Commonly the parity bit is added in the end of each byte (8 bits).
@@ -17,7 +18,7 @@ ___
 : The concept of dividing each data array into various segments so the data can be more easily manipulated and stored
 **MIRROR**
-: Replicates *logical disk* volumes into multiple fisical disks, so the same information is stored in different hard disks in real time
+: Replicates *logical disk* volumes into multiple physical disks, so the same information is stored in different hard disks in real time
 ## RAID0
@@ -26,12 +27,12 @@ RAID0 splits data across a multiple-disks array. The ideal setup is equaly-sized
 RAID0 create stripes of data so disk operations are n-times faster, n being the total amount of disks available. It also distributes I/O costs between all disks making it a very fast storage system. RAID0 **doesn't implements parity** or even any **fault tolerance**, so the failure of one single disk in the array will result in total data loss.
-Besides fastness, RAID0 also is a good system to create large amounts of data storage units with lesser disks, since all disks in the array have unique information and, having equaly-sized units, uses 100% it's fisical capability as storage.
+Besides speed, RAID0 is also a good way to build large storage units with fewer disks: every disk in the array holds unique data, so with equally sized disks the array uses 100% of its physical capacity as storage.
 ## RAID1
-RAID1 mirrors sets of data on **two or more** fisical disks at a time. This RAID setting, as in RAID0, also doesn't offer any *parity* and the setup replicates the size of the smallest disk on all the other disks as well. In RAID1 theres no data striping since all data is replicated multiple times.
+RAID1 mirrors sets of data on **two or more** physical disks at a time. Like RAID0, this RAID setting offers no *parity*, and the setup replicates the size of the smallest disk on all the other disks, which caps the usable capacity at that size. In RAID1 there's no data striping, since every disk holds a full copy of the data.
 RAID1 read operations can be taken by any of the disks in the array, it's useful on read performance and reliability on data, but it is bad on write performance and total data storage capacity.
@@ -47,6 +48,22 @@ RAID6 is a very good system to use when reliability and disponibility of data is
 RAID10 is a simuntaneous implementation of RAID0 and RAID1, combining the performance boost of RAID0 with the reliability of data present in RAID1. RAID10 needs at least two 'tanks' of disks arranged in RAID1 configuration, replicating all data in all disks of each tank. Than it takes these tanks and arrange then in a RAID0 setup, so all data is striped and writen segmented in each tank.
-RAID10 offers a boost in read and write operations performance, at the same time that it permits at least 1 disk of each tank failing at once without any data loss. This systems does this by using double of logical storage in fisical storage, so it'll always use half of total raw disks capacity.
+RAID10 boosts both read and write performance while tolerating the failure of at least one disk in each tank at the same time without any data loss. The system pays for this by using twice as much physical storage as logical storage, so with two-disk tanks it always uses half of the total raw disk capacity.
 It's a good system when total storage isn't more requested than performance and relaibility.
+
+___
+
+
+# ZFS
+**Z**ettabyte **F**ile **S**ystem
+
+Data management usually involves a physical and a logical aspect. The first is the raw hard drives and SD cards, organized in blocks; the second is the logical block devices as seen by the operating system. The logical part can span multiple layers, such as volume managers and RAID controllers.
+
+ZFS acts as both a physical storage manager and a logical data manager, so it knows about all physical volumes available in the system and all the firmware and software that turn them into a unit useful to the operating system. This way, ZFS aims to ensure that errors caused by the OS or even by the hardware can be fixed at any step of the data management path.
+
+As described in the documentation, one of the most powerful capabilities of ZFS is **snapshots** and their replication. With this file system, it's possible to take a snapshot of the entire system before any risky software change or system operation, so it's always possible to **roll back** to a *checkpoint* if an operation has destabilized the system.
+
+Snapshots also make it possible to replicate an entire, independent file system, and their implementation allows a large number of snapshots to be taken without losing performance.
+
+
-- 
GitLab
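
The snapshot, rollback, and replication workflow described in the ZFS section can be sketched as follows. This is a minimal Python sketch and is not part of the patch: it only builds argument lists for the standard `zfs snapshot`, `zfs rollback`, `zfs send`, and `zfs receive` subcommands and never runs them. The dataset names `tank/data` and `backup/data`, the label `before-upgrade`, and all helper names are hypothetical.

```python
"""Build (but do not run) the ZFS commands for a snapshot workflow.

All dataset and snapshot names used with these helpers are hypothetical;
running the resulting commands needs a real ZFS pool and privileges.
"""
from typing import List, Tuple


def snapshot_name(dataset: str, label: str) -> str:
    """Return the full snapshot name in the form 'dataset@label'."""
    if "@" in dataset or "@" in label:
        raise ValueError("dataset and label must not contain '@'")
    return f"{dataset}@{label}"


def snapshot_cmd(dataset: str, label: str) -> List[str]:
    """Command that takes a snapshot before a risky change."""
    return ["zfs", "snapshot", snapshot_name(dataset, label)]


def rollback_cmd(dataset: str, label: str) -> List[str]:
    """Command that rolls the dataset back to the given snapshot.

    Without extra flags, `zfs rollback` only accepts the most
    recent snapshot of the dataset.
    """
    return ["zfs", "rollback", snapshot_name(dataset, label)]


def replicate_cmds(
    dataset: str, label: str, target: str
) -> Tuple[List[str], List[str]]:
    """Sender and receiver commands; the sender's stdout is meant
    to be piped into the receiver's stdin."""
    send = ["zfs", "send", snapshot_name(dataset, label)]
    receive = ["zfs", "receive", target]
    return send, receive
```

To actually execute these, each list could be passed to `subprocess.run`, with the `send` process's standard output connected to the `receive` process's standard input. Keeping command construction separate from execution lets the naming logic be checked on a machine without ZFS.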