Zeta file system
Zettabyte File System
Robert Miłkowski, System Group Manager, Wirtualna Polska
What is ZFS?
● Completely new 128-bit file system
● Integrated volume manager
● Best data integrity
● Simple management
● Great performance
UNIX DAYS – Gdańsk 2006 2
Traditional file systems
● No protection against silent data corruption
● Can't guarantee correct data
● Hard to manage
  ● Partitions, slices, labels
  ● Inflexible storage space management
  ● Lots of limits
  ● Platform dependent
● Too much complexity with FS+VM
Wish list
● Make it SCALABLE
● Make it MANAGEABLE
● Make it DYNAMIC
● Make it SAFE
● Make it EASIER
Let's do it from scratch
● Pooled storage
  ● No need for slices, partitions, volumes, etc.
  ● Make as many file systems as you want for free
● End-to-end data integrity
  ● Guarantee that the application gets correct data
● Transactional
  ● Always consistent on disk – meta and user data
Traditional FS+VM
[Diagram: three file systems (FS), each on top of its own Volume]
ZFS pooled storage
[Diagram: three ZFS file systems on top of one shared Storage Pool]
ZFS pooled storage - example
# zpool create home c5t40d0 c5t40d1 c5t40d2 c5t40d3
# zfs create home/milek
# zfs create home/www
# zfs create home/milek/mail
# zfs set compression=on home/milek/mail
#
# zfs list
NAME              USED  AVAIL  REFER  MOUNTPOINT
home               72K  18.0T    10K  /home
home/milek       18.5K  18.0T  9.50K  /home/milek
home/milek/mail     9K  18.0T     9K  /home/milek/mail
home/www            9K  18.0T     9K  /home/www
#
ZFS Quotas & Reservations
● You can reserve space for a given fs
● You can set a quota for a given fs
● Quota/reservation for hierarchies
● No quotas for uid/gid
Data integrity
● Both data and meta-data are checksummed
  ● No silent data corruption
● Everything is Copy-On-Write
  ● Never overwrite live data
  ● Always consistent on disk
  ● No need for fsck-like utility
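The copy-on-write bullets above can be illustrated with a toy model. This is only a sketch, not ZFS code: the `CowStore` class, its methods, and the dictionary standing in for a disk are all hypothetical names invented for the illustration. It shows that an update writes new blocks and then switches a single root pointer, so the previous state stays intact and readable.

```python
# Toy copy-on-write store (hypothetical model, not the ZFS implementation).
class CowStore:
    def __init__(self):
        self.blocks = {}     # block id -> content; a block is never rewritten
        self.next_id = 0
        self.root = None     # id of the block holding the current file table

    def _alloc(self, content):
        """Store content in a fresh block and return its id."""
        block_id = self.next_id
        self.next_id += 1
        self.blocks[block_id] = content
        return block_id

    def commit(self, changes):
        """Write changed data to new blocks, then a new table, then flip the root."""
        old_table = self.blocks[self.root] if self.root is not None else {}
        new_table = dict(old_table)           # unchanged files keep their blocks
        for name, data in changes.items():
            new_table[name] = self._alloc(data)
        new_root = self._alloc(new_table)
        old_root, self.root = self.root, new_root   # the only "in place" change
        return old_root

    def read(self, name, root=None):
        """Read a file from the current root, or from an older root if given."""
        table = self.blocks[self.root if root is None else root]
        return self.blocks[table[name]]


store = CowStore()
store.commit({"a": b"v1", "b": b"keep"})
before = store.root
store.commit({"a": b"v2"})
current = store.read("a")              # data as of the latest commit
previous = store.read("a", root=before)  # old state is still fully readable
```

If the second commit were interrupted before the root switch, a reader would still see the complete old state, which is the sense in which the on-disk image stays consistent in this model.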
End-to-end data integrity
● Checksum checked after block is in memory
  ● Whole IO path is checked
  ● Corrects driver bugs, phantom writes, etc.
● Checksum and data block stored separately
  ● Checksum is stored in parent block
  ● Entire pool is self-validating
  ● Protects from accidental overwrites
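A minimal sketch of the "checksum is stored in parent block" idea follows. It is an illustration under assumptions, not ZFS code: `BlockPointer`, `write_block`, and `read_block` are hypothetical names, a dictionary stands in for the disk, and SHA-256 is used only as a convenient stand-in checksum. The point is that the expected checksum travels with the parent's pointer, so a block that was overwritten or never actually written cannot vouch for itself.

```python
import hashlib


def checksum(data: bytes) -> str:
    # Stand-in checksum for the sketch.
    return hashlib.sha256(data).hexdigest()


class BlockPointer:
    """Held by the parent: where the child block lives and what it must hash to."""

    def __init__(self, address, expected):
        self.address = address
        self.expected = expected


def write_block(disk, address, data):
    """Write the data and return the pointer the parent block would store."""
    disk[address] = data
    return BlockPointer(address, checksum(data))


def read_block(disk, pointer):
    """Verify the block in memory, after the whole I/O path has delivered it."""
    data = disk[pointer.address]
    if checksum(data) != pointer.expected:
        raise IOError(f"checksum mismatch at block {pointer.address}")
    return data


disk = {}
ptr = write_block(disk, 7, b"user data")
data = read_block(disk, ptr)        # verified read
disk[7] = b"stray write"            # simulate an accidental overwrite
try:
    read_block(disk, ptr)
    detected = False
except IOError:
    detected = True                 # the parent's checksum exposes the damage
```

Because each pointer is itself stored in a block that its own parent checksums, the same check repeats up to the root, which is what makes the whole tree self-validating in this model.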
ZFS Self Healing
[Diagram: three panels, each showing an Application on top of a ZFS Mirror]
Data integrity - example
# zpool create home mirror c5t40d0 c5t40d1
# cd /home/ ; ncftp sol-nv-b39-x86-dvd.iso
# digest -a md5 sol-nv-b39-x86-dvd.iso
a09a7af5ad25a980fa907b552a5bd523
# dd if=/dev/zero of=/dev/rdsk/c5t40d0s0 bs=1024k
# digest -a md5 sol-nv-b39-x86-dvd.iso
a09a7af5ad25a980fa907b552a5bd523
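In the transcript above one side of the mirror is overwritten with zeros, yet the file's MD5 digest is unchanged. The sketch below models how a checksummed mirror read can behave that way. It is a hypothetical illustration, not ZFS code: `mirror_read` is an invented name, each side of the mirror is a plain dictionary, and SHA-256 stands in for the real checksum.

```python
import hashlib


def checksum(data: bytes) -> str:
    # Stand-in checksum for the sketch.
    return hashlib.sha256(data).hexdigest()


def mirror_read(sides, address, expected):
    """Return verified data for `address` and repair any copy that differs."""
    good = None
    for side in sides:
        data = side.get(address)
        if data is not None and checksum(data) == expected:
            good = data
            break
    if good is None:
        raise IOError("no valid copy on any side of the mirror")
    for side in sides:
        if side.get(address) != good:
            side[address] = good      # heal the damaged copy from the good one
    return good


block = b"iso contents"
expected = checksum(block)
side_a = {0: block}
side_b = {0: block}
side_a[0] = b"\x00" * len(block)      # simulate zeroing one disk
result = mirror_read([side_a, side_b], 0, expected)
```

After the call, `result` holds the original bytes and `side_a` has been rewritten from the good copy, so in this model the application never sees the damage.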
Ditto blocks
● Can't read data block – EIO
● Can't read meta-data block – part of fs is gone
● In most cases meta-data is less than 2%
● So replicate meta-data blocks
  ● In addition to pool protection (mirror, RAID-Z)
Ditto blocks
● 3x meta-data blocks with pool information
● 2x meta-data blocks with fs information
● 1x user data blocks
  ● Planned: 2x and 3x user data ditto blocks per fs
● Keep each copy on a separate vdev
● Spread the copies apart when there is only one drive
  ● With user ditto blocks you get mirroring on a single disk
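The copy counts and the "separate vdev" rule above can be sketched as follows. The names `COPIES` and `place_copies` are hypothetical, and the simple round-robin choice of vdevs is an assumption made for the illustration; the slides do not describe the actual allocation policy.

```python
# Copy counts taken from the slide; the placement policy is an assumed round-robin.
COPIES = {"pool_metadata": 3, "fs_metadata": 2, "user_data": 1}


def place_copies(kind, vdevs, start=0):
    """Pick a vdev for each copy, using distinct vdevs while enough exist.

    With a single vdev every copy lands on it; a real allocator would then
    spread the copies apart on that one disk.
    """
    count = COPIES[kind]
    return [vdevs[(start + i) % len(vdevs)] for i in range(count)]


pool_meta = place_copies("pool_metadata", ["d0", "d1", "d2", "d3"])
fs_meta_single_disk = place_copies("fs_metadata", ["d0"])
```

With four vdevs the three pool meta-data copies land on three different devices; with one vdev both fs meta-data copies share the device, which is the single-disk case the last bullet refers to.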
RAID-5
● RAID-5 write hole – silent data...