In my last post I mentioned Apache Hadoop YARN, but before moving on to that topic, let's review some file system basics, because we can't move forward without understanding them.
What is a file system?
Any computer file is stored on some kind of storage with a given capacity.
In essence, each storage is a linear space for reading, or both reading and
writing, digital information. Each byte of information on the storage has its
own offset from the storage start (its address) and is referenced by this
address. A storage can be pictured as a grid of numbered cells, each cell
holding a single byte. Any file saved to the storage takes up a number of these
cells.
Generally, computer storages use a pair of values, a sector number and an
in-sector offset, to reference any byte of information on the storage. A sector
is a group of bytes (usually 512 bytes) that is the minimum addressable unit of
the physical storage.
For example, byte 1030 on a hard disk is referenced as sector #3 with an
in-sector offset of 6 bytes ([sector]+[sector]+[6 bytes], since
1030 = 2 × 512 + 6). This scheme optimizes storage addressing: smaller numbers
are enough to reference any portion of information on the storage.
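
The split of a byte address into a sector and an in-sector offset is plain integer division. The snippet below is a minimal sketch of that arithmetic, assuming 512-byte sectors; the function name `to_sector_address` is made up for illustration, and it counts sectors from 0, so index 2 is the third sector (sector #3 in the example).

```python
SECTOR_SIZE = 512  # bytes per sector; the usual value, assumed here


def to_sector_address(byte_address, sector_size=SECTOR_SIZE):
    """Split a linear byte address into (sector index, in-sector offset).

    The sector index is zero-based: index 2 is the third sector.
    """
    if byte_address < 0:
        raise ValueError("byte address must not be negative")
    return divmod(byte_address, sector_size)


sector_index, offset = to_sector_address(1030)
# 1030 = 2 * 512 + 6: two full sectors are skipped, then 6 more bytes
print(f"sector #{sector_index + 1}, offset {offset} bytes")
```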
To omit the second part of the address (the in-sector offset), files are
usually stored starting from the beginning of a sector and occupy whole sectors
(e.g. a 10-byte file occupies a whole sector, a 512-byte file also occupies one
whole sector, while a 514-byte file occupies two whole sectors).
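
The sector counts in this example come from rounding the file size up to a whole number of sectors. Here is a small sketch of that calculation, again assuming 512-byte sectors; the helper names `sectors_occupied` and `unused_tail` are hypothetical.

```python
SECTOR_SIZE = 512  # bytes per sector; assumed, as in the examples above


def sectors_occupied(file_size, sector_size=SECTOR_SIZE):
    """Number of whole sectors a file of file_size bytes takes up."""
    return -(-file_size // sector_size)  # ceiling division using integers only


def unused_tail(file_size, sector_size=SECTOR_SIZE):
    """Bytes left unused at the end of the file's last sector."""
    return sectors_occupied(file_size, sector_size) * sector_size - file_size


for size in (10, 512, 514):
    print(size, "bytes ->", sectors_occupied(size), "sector(s),",
          unused_tail(size), "bytes unused")
```

The unused tail is the price paid for dropping the in-sector offset from file addresses.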
Each file is stored in 'unused' sectors and can later be read by its known
position and size. However, how do we know which sectors are used or unused?
Where are the file size and position stored? Where is the file name? The file
system gives us these answers.
As a whole, a file system is a structured data representation plus a set of
metadata that describes the stored data. A file system can serve the whole
storage or only an isolated storage segment, a disk partition. Usually the file
system operates on blocks, not sectors. File system blocks are groups of
sectors that optimize storage addressing. Modern file systems generally use
block sizes from 1 up to 128 sectors (512-65536 bytes). Files are usually
stored from the start of a block and take up entire blocks.
Intensive write/delete operations cause file system fragmentation. As a result,
files are no longer stored as contiguous units and are instead divided into
fragments.
For example, suppose a storage is entirely filled with files of about 4 blocks
each (e.g. a picture collection). The user wants to store a file that takes
8 blocks and therefore deletes the first and the last file. This releases
8 blocks; however, the first free segment is near the start of the storage and
the second is near the end. In this case the 8-block file is split into two
parts (4 blocks each) that fill the free-space 'holes'. The information about
both fragments, which are parts of a single file, is stored in the file system.
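
The scenario above can be reproduced with a toy allocator. This is only a sketch under simple assumptions: the storage is a list of flags (True means the block is free), the allocator takes the first free blocks it finds, and the name `allocate` is invented for illustration. Real file systems use far more elaborate allocation strategies.

```python
def allocate(free_map, blocks_needed):
    """Claim the first free blocks found; return (start, length) fragments.

    free_map is a list of booleans, True meaning "block is free".
    Claimed blocks are marked as used in place.
    """
    if sum(free_map) < blocks_needed:
        raise ValueError("not enough free blocks")
    fragments = []
    remaining = blocks_needed
    i = 0
    while remaining > 0:
        if free_map[i]:
            start = i
            # extend the fragment while blocks stay free and more are needed
            while i < len(free_map) and free_map[i] and remaining > 0:
                free_map[i] = False
                i += 1
                remaining -= 1
            fragments.append((start, i - start))
        else:
            i += 1
    return fragments


# Five 4-block files fill 20 blocks; the first and the last file are deleted.
storage = [True] * 4 + [False] * 12 + [True] * 4
print(allocate(storage, 8))  # two fragments: one at the start, one at the end
```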
In addition to user files, the file system also stores its own parameters
(such as block size), file descriptors (which include file size, file location,
its fragments etc.), file names and the directory hierarchy. It may also store
security information, extended attributes and other parameters.
To meet diverse requirements for storage performance, stability and
reliability, a great variety of file systems exists, each developed to serve
certain user purposes.
Microsoft Windows uses two major file systems: FAT, inherited from old DOS,
with its later extension FAT32, and the widely used NTFS. The recently released
ReFS file system was developed by Microsoft as a new-generation file system for
Windows servers.
FAT (File Allocation Table):
The FAT file system is one of the simplest types of file systems. It consists
of a file system descriptor sector (boot sector or superblock), a file system
block allocation table (referred to as the File Allocation Table) and plain
storage space for files and folders. Files on FAT are stored in directories.
Each directory is an array of 32-byte records, each of which defines a file or
a file's extended attributes (e.g. a long file name). A file record references
the first block of the file. Each subsequent block can be found through the
block allocation table by using it as a linked list.
The block allocation table contains an array of block descriptors. A zero value
indicates that the block is not used; a non-zero value is either a reference to
the next block of the file or a special value marking the end of the file.
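
Reading a file on FAT therefore means walking a linked list stored in the table. Below is a simplified sketch of that walk. The table is a plain Python list, `END_OF_FILE = -1` is a stand-in chosen for this example (the real end-of-file markers are reserved values that differ between FAT variants), and entry 0 is left unused so that a zero entry can always mean "free".

```python
FREE = 0           # zero marks an unused block, as described above
END_OF_FILE = -1   # stand-in end marker, an assumption of this sketch


def file_blocks(fat, first_block):
    """Follow the allocation table from first_block to the end marker."""
    chain = []
    block = first_block
    while block != END_OF_FILE:
        if block in chain:
            raise ValueError("loop in allocation chain")
        if fat[block] == FREE:
            raise ValueError("chain points to an unused block")
        chain.append(block)
        block = fat[block]  # each entry holds the number of the next block
    return chain


# Toy table: the file starts at block 2, continues in block 5, ends in block 4.
fat = [FREE, FREE, 5, FREE, END_OF_FILE, 4, FREE]
print(file_blocks(fat, first_block=2))
```

The first block number (2 here) is what the directory record supplies; everything after it comes from the table.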
The number in FAT12, FAT16 and FAT32 stands for the number of bits used to
enumerate file system blocks. This means that FAT12 may use up to 4096
different block references, FAT16 up to 65536 and FAT32 up to 4294967296. The
actual maximum block count is even lower and depends on the implementation of
the file system driver.
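
These limits are simply powers of two. A quick sketch of the arithmetic (the function name `max_block_references` is hypothetical, and the results are theoretical upper bounds, not what a particular driver actually allows):

```python
def max_block_references(bits):
    """Upper bound on distinct block numbers expressible in the given bits."""
    return 2 ** bits


for name, bits in (("FAT12", 12), ("FAT16", 16), ("FAT32", 32)):
    print(name, max_block_references(bits))
```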
FAT12 was used for old floppy disks. FAT16 (or simply FAT) and FAT32 are widely
used for flash memory cards and USB flash sticks. They are supported by mobile
phones, digital cameras and other portable devices.
FAT or FAT32 is the file system used on Windows-compatible external storages or
disk partitions with a size below 2GB (for FAT) or 32GB (for FAT32). Windows
cannot create a FAT32 file system over 32GB (however, Linux supports FAT32 up
to 2TB).
NTFS (New Technology File System):
NTFS was introduced in Windows NT and is at present the major file system for
Windows. It is the default file system for disk partitions and the only file
system supported for disk partitions over 32GB. The file system is quite
extensible and supports many file properties, including access control,
encryption etc. Each file on NTFS is stored as a file descriptor in the Master
File Table plus the file content. The Master File Table contains all
information about the file: size, allocation, name etc. The first and the last
sectors of the file system contain the file system settings (boot record or
superblock). This file system uses 48-bit and 64-bit values to reference files,
thus supporting quite large disk storages.
ReFS (Resilient File System):
ReFS is Microsoft's latest file system, presently available for Windows
Server 2012. Its architecture differs completely from other Windows file
systems and is mainly organized in the form of a B+-tree. ReFS has a high
tolerance to failures thanks to new features built into the system, most
notably Copy-on-Write (CoW): no metadata is modified without being copied, and
no data is written over existing data but rather into new disk space. With any
file modification, a new copy of the metadata is created in free storage space,
and then the system creates a link from the older metadata to the newer copy.
As a result, the system keeps a significant number of older copies in different
places, which allows easy file recovery as long as this storage space has not
been overwritten.
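
The copy-on-write idea can be illustrated with a toy metadata store. This is only a conceptual sketch and says nothing about how ReFS lays out data on disk. The class `CowStore` and its methods are invented for this example, and for simplicity each new version points back to the previous one, rather than the older record linking forward.

```python
class CowStore:
    """Toy copy-on-write metadata store (conceptual, not the ReFS format)."""

    def __init__(self):
        self.slots = []    # append-only "disk space"; slots are never rewritten
        self.current = {}  # file name -> slot index of the newest metadata

    def write(self, name, metadata):
        """Store a new metadata version in fresh space, keeping the old one."""
        previous = self.current.get(name)
        self.slots.append({"metadata": dict(metadata), "previous": previous})
        self.current[name] = len(self.slots) - 1

    def history(self, name):
        """Return all surviving versions of the metadata, newest first."""
        versions = []
        slot = self.current.get(name)
        while slot is not None:
            versions.append(self.slots[slot]["metadata"])
            slot = self.slots[slot]["previous"]
        return versions


store = CowStore()
store.write("photo.jpg", {"size": 100})
store.write("photo.jpg", {"size": 250})  # the first version is left untouched
print(store.history("photo.jpg"))
```

Because older versions stay in place until their space is reused, a recovery tool has something to fall back on.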
MacOS file systems
Apple's Mac OS operating system uses the HFS+ file system, an extension of
Apple's own HFS file system that was used on old Macintosh computers. HFS+ is
used on Apple products, including Mac computers, iPhone and iPod, as well as
Apple X Server products. Advanced server products also use the Apple Xsan file
system, a clustered file system derived from the StorNext or CentraVision file
systems. Besides files and folders, this file system also stores Finder
information about directory views, window positions etc.
Linux file systems
The open-source Linux OS has always aimed to implement, test and use different
file system concepts. Among the huge number of file system types, the most
popular Linux file systems nowadays are:
· Ext2, Ext3, Ext4 - the 'native' Linux file system. This file system is under
active development and improvement. Ext3 is just an extension of Ext2 that uses
transactional file write operations with a journal. Ext4 is a further
development of Ext3, extended with support for optimized file allocation
information (extents) and extended file attributes. This file system is
frequently used as the 'root' file system for most Linux installations.
· ReiserFS - an alternative Linux file system designed to store huge numbers of
small files. It has good file search capability and enables compact file
allocation by storing file tails or small files along with the metadata, so
that large file system blocks are not used for this purpose.
· XFS - a file system from SGI, which initially used it for its IRIX servers.
The XFS specifications are now implemented in Linux. XFS has great performance
and is widely used to store files.
· JFS - a file system developed by IBM for its powerful computing systems.
Plain 'JFS' usually stands for JFS1; JFS2 is the second edition. Currently this
file system is open-source and is implemented in most modern Linux
distributions.
The concept of 'hard links' used in this kind of OS makes most Linux file
systems similar in that the file name is not regarded as a file attribute but
rather defined as an alias for a file in a certain directory. A file object can
be linked from many locations, even many times from the same directory under
different names. This is one of the reasons why recovering file names after
file deletion or file system damage can be difficult or even impossible.
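
Hard links are easy to observe from a script. The sketch below uses Python's standard `os.link`, `os.path.samefile` and `os.stat` calls in a temporary directory; the file names are arbitrary, and it assumes the underlying file system supports hard links (most Linux file systems do).

```python
import os
import tempfile


def hard_link_demo():
    """Create a file, give it a second name, then delete the first name."""
    with tempfile.TemporaryDirectory() as directory:
        original = os.path.join(directory, "report.txt")
        alias = os.path.join(directory, "alias.txt")
        with open(original, "w") as handle:
            handle.write("same data")

        os.link(original, alias)  # second name for the same file object
        same_object = os.path.samefile(original, alias)
        link_count = os.stat(original).st_nlink  # names pointing at the file

        os.remove(original)  # removes one name only, not the data
        with open(alias) as handle:
            surviving = handle.read()
        return same_object, link_count, surviving


print(hard_link_demo())
```

Neither name is the "real" one: the data stays reachable until the last name is removed, which is why a name is better thought of as a directory entry than as a property of the file.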
BSD, Solaris, Unix file systems
The most common file system for these OS is UFS (Unix File System), also often
referred to as FFS (Fast File System, fast compared to the previous file system
used for Unix). UFS is a source of ideas for many other file system
implementations. Currently UFS (in different editions) is supported by all
Unix-family OS and is the major file system of BSD and Sun Solaris. Modern
systems tend to implement replacements for UFS in different OS (ZFS for
Solaris, JFS and derived file systems for Unix etc.).
Clustered file systems
Clustered file systems are used in computer cluster systems. These file systems
have built-in support for distributed storage. Among such distributed file
systems are:
· ZFS - Sun's 'Zettabyte File System', the new file system developed for
distributed storages of Sun Solaris OS.
· Apple Xsan - Apple's evolution of the CentraVision and later StorNext file
systems.
· VMFS - 'Virtual Machine File System', developed by VMware for its VMware ESX
Server.
· GFS - Red Hat Linux 'Global File System'.
· JFS1 - the original (legacy) design of the IBM JFS file system, used in older
AIX storage systems.
The common properties of these file systems are support for distributed
storage, extensibility and modularity.