Chapter 14. The Ext2 - Ext3 - Ext4 Filesystem

Objectives

  • Describe the main features of the Ext4 filesystem
  • Explain the concepts of block groups, superblock, data blocks and inodes
  • Use dumpe2fs and tune2fs
  • List Ext4 enhancements

Ext2 

For many years the standard native Linux filesystem (it succeeded the original ext filesystem). It has no journaling.

Ext3 

It is similar to Ext2, but it adds journaling.

Ext4

It is similar to Ext3, but it uses extents instead of lists of individual blocks in order to handle large files efficiently.

Ext4 Features

Block size
The block size is set when the filesystem is created. On x86 it can be:

  • 1024 bytes
  • 2048 bytes
  • 4096 bytes

Linux Kernel Memory management
The kernel requires that an integral number of blocks fit within a page of memory, so a block cannot be larger than a page. An 8 KB block therefore cannot be used on the x86 platform, where memory pages are 4 KB in size.
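The page-size rule can be illustrated with a small sketch. It models only the constraint described here (a whole number of blocks per page); the function names and the 4096-byte default are assumptions chosen for the example, not part of any ext4 tool.

```python
# Sketch: does a block size satisfy the "integral number of blocks
# per page" rule? Only the page-size constraint is modelled here.

def is_power_of_two(n: int) -> bool:
    return n > 0 and (n & (n - 1)) == 0


def usable_block_size(block_size: int, page_size: int = 4096) -> bool:
    """True if a whole number of blocks fits in one page of memory."""
    return (
        is_power_of_two(block_size)
        and block_size <= page_size
        and page_size % block_size == 0
    )


# page_size=4096 is the x86 figure quoted in the text.
for size in (1024, 2048, 4096, 8192):
    print(size, usable_block_size(size))
```

On a platform with larger pages the same check would accept larger blocks, which is why the limit is tied to the platform rather than to the filesystem format.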

Inodes
The number of inodes can also be set when the filesystem is created, which may save disk space. In addition, when a directory is created a certain number of inodes are reserved for it, so that inodes are readily available for the files later created in that directory.

Fast symbolic link
If the pathname of a symbolic link is less than 60 characters, a fast symbolic link is created and stored within the inode to avoid data block access.
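As a rough illustration of the rule above, the following sketch decides whether a link target would qualify as a fast symbolic link. The 60 limit is the figure quoted in the text; measuring the length in bytes of the UTF-8 encoding is an assumption made for this example.

```python
# Sketch: classify a symbolic link target as "fast" or "slow".
FAST_SYMLINK_LIMIT = 60  # figure quoted in the text


def is_fast_symlink(target: str) -> bool:
    """True if the target is short enough to be stored in the inode.

    Assumption: the length is counted in bytes of the encoded path.
    """
    return len(target.encode("utf-8")) < FAST_SYMLINK_LIMIT


print(is_fast_symlink("/usr/bin/python3"))  # short target
print(is_fast_symlink("/" + "x" * 80))      # long target
```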

Ext4 Layout

Disk blocks are partitioned into block groups, each of which contains inodes and data blocks stored adjacently.

Structure

  • Disk partition (sample: /dev/sda1)
    • Block group
      • Blocks divided into categories
        • Boot loader area (block group 0 only)
        • Super block
          • Contains bit fields that record whether the filesystem is
            • clean
            • dirty
            • unknown
          • Contains information about when the filesystem was last checked

Layout

The layout of block group 0 is

  Region                 Size
  Group 0 Padding        1024 bytes
  ext4 Super Block       1 block
  Group Descriptors      many blocks
  Reserved GDT Blocks    many blocks
  Data Block Bitmap      1 block
  inode Bitmap           1 block
  inode Table            many blocks
  Data Blocks            many more blocks

The layout of the other block groups is

  Region                 Size
  ext4 Super Block       1 block
  Group Descriptors      many blocks
  Reserved GDT Blocks    many blocks
  Data Block Bitmap      1 block
  inode Bitmap           1 block
  inode Table            many blocks
  Data Blocks            many more blocks

Super Block and Group Descriptors

The superblock and the group descriptors are located at the start of a block group. Under normal circumstances the kernel uses only the copies stored in block group 0; for redundancy, however, backup copies of the superblock and group descriptors are kept in several other block groups. To see where they are, run the following command:

$ sudo dumpe2fs /dev/sda1 | grep superblock
dumpe2fs 1.42.13 (17-May-2015)
  Primary superblock at 1, Group descriptors at 2-3
  Backup superblock at 8193, Group descriptors at 8194-8195
  Backup superblock at 24577, Group descriptors at 24578-24579
  Backup superblock at 40961, Group descriptors at 40962-40963
  Backup superblock at 57345, Group descriptors at 57346-57347
  Backup superblock at 73729, Group descriptors at 73730-73731
  Backup superblock at 204801, Group descriptors at 204802-204803
  Backup superblock at 221185, Group descriptors at 221186-221187
  Backup superblock at 401409, Group descriptors at 401410-401411
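The backup locations listed above are not arbitrary. The filesystem in this example has the sparse_super feature (see the feature list in the dumpe2fs output further down), which, as far as I know, keeps backups only in group 1 and in groups whose number is a power of 3, 5 or 7. The sketch below reproduces the block numbers under that assumption; the function names are made up for the example, and the three input values are the Block count, Blocks per group and First block of this filesystem.

```python
# Sketch: predict backup superblock locations under sparse_super.
# Assumption: backups live in group 1 and in groups that are powers
# of 3, 5 or 7; group 0 holds the primary superblock.

def is_power_of(n: int, base: int) -> bool:
    while n > 1 and n % base == 0:
        n //= base
    return n == 1  # note: 1 counts as base**0


def backup_groups(group_count: int) -> list:
    return [
        g for g in range(1, group_count)
        if any(is_power_of(g, base) for base in (3, 5, 7))
    ]


def backup_superblock_blocks(block_count, blocks_per_group, first_block):
    # Round up: the last group may be only partly filled.
    group_count = -(-(block_count - first_block) // blocks_per_group)
    return [first_block + g * blocks_per_group
            for g in backup_groups(group_count)]


# Values taken from the dumpe2fs output of /dev/sda1 in this chapter.
print(backup_superblock_blocks(512000, 8192, 1))
```

With these inputs the list matches the eight "Backup superblock at" lines shown above.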

Block Groups constraint

The number of blocks in a block group is limited by the block bitmap (which identifies used and free blocks within the group). The block bitmap has to fit within a single block, and each of its bits tracks one block of the group. Since a byte has 8 bits, a 4 KiB bitmap block gives

4096 (bytes) * 8 (bits per byte) = 32768 (blocks per group)

so with 4 KiB blocks a block group covers at most 32768 * 4 KiB = 128 MiB.
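The same arithmetic can be written as a short sketch, using the exact 4096 bytes of a 4 KiB block. The function names are hypothetical; the only rule applied is the one stated above, one bitmap bit per block and a bitmap that fits in a single block.

```python
# Sketch: upper bound on block group size imposed by the block bitmap.

BITS_PER_BYTE = 8


def max_blocks_per_group(block_size: int) -> int:
    """One bitmap block, one bit per block in the group."""
    return block_size * BITS_PER_BYTE


def max_group_bytes(block_size: int) -> int:
    return max_blocks_per_group(block_size) * block_size


for block_size in (1024, 4096):
    print(block_size,
          max_blocks_per_group(block_size),
          max_group_bytes(block_size) // (1024 * 1024), "MiB")
```

For a 1024-byte block size this gives 8192 blocks per group, which agrees with the "Blocks per group" value reported by dumpe2fs for /dev/sda1 in this chapter.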

dumpe2fs

Utility that displays the superblock and block group information of a filesystem

[abernal@localhost ~]$ sudo dumpe2fs /dev/sda1
dumpe2fs 1.42.13 (17-May-2015)
Filesystem volume name:   <none>
Last mounted on:          /mnt/sysimage/boot
Filesystem UUID:          a9d6faa3-8509-4287-86ee-74b989abfe60
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags:         signed_directory_hash
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              128016
Block count:              512000
Reserved block count:     25600
Free blocks:              377996
Free inodes:              127591
First block:              1
Block size:               1024
Fragment size:            1024
Reserved GDT blocks:      256
Blocks per group:         8192
Fragments per group:      8192
Inodes per group:         2032
Inode blocks per group:   254
Flex block group size:    16
Filesystem created:       Thu May 12 02:19:19 2016
Last mount time:          Fri Jun 17 23:34:41 2016
Last write time:          Fri Jun 17 23:34:41 2016
Mount count:              13
Maximum mount count:      -1
Last checked:             Thu May 12 02:19:19 2016
Check interval:           0 (<none>)
Lifetime writes:          132 MB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:              128
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      22dfbd1f-0bb0-4516-befb-1f9235f8674b
Journal backup:           inode blocks
Journal features:         (none)
Journal size:             8M
Journal length:           8192
Journal sequence:         0x00000152
Journal start:            0


Group 0: (Blocks 1-8192) [ITABLE_ZEROED]
  Checksum 0x3518, unused inodes 2001
  Primary superblock at 1, Group descriptors at 2-3
  Reserved GDT blocks at 4-259
  Block bitmap at 260 (+259), Inode bitmap at 276 (+275)
  Inode table at 292-545 (+291)
  3814 free blocks, 2002 free inodes, 5 directories, 2001 unused inodes
  Free blocks: 4379-8192
  Free inodes: 30, 32-2032
Group 1: (Blocks 8193-16384) [INODE_UNINIT, ITABLE_ZEROED]
  Checksum 0xc021, unused inodes 2032
  Backup superblock at 8193, Group descriptors at 8194-8195
  Reserved GDT blocks at 8196-8451
  Block bitmap at 261 (bg #0 + 260), Inode bitmap at 277 (bg #0 + 276)
  Inode table at 546-799 (bg #0 + 545)
  1728 free blocks, 2032 free inodes, 0 directories, 2032 unused inodes
  Free blocks: 9823-10240, 13027-14336
  Free inodes: 2033-4064
...
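The header part of this output is a list of "name: value" lines, so it is easy to post-process. The following is a minimal sketch that parses a few of the lines shown above and cross-checks two of the figures; the helper name parse_header and the idea of stopping at the first blank line are assumptions of this example, not a documented output format.

```python
# Sketch: parse the "name: value" header lines printed by dumpe2fs.
# SAMPLE_HEADER is copied from the output shown above.

SAMPLE_HEADER = """\
Filesystem state:         clean
Inode count:              128016
Block count:              512000
Free blocks:              377996
First block:              1
Block size:               1024
Blocks per group:         8192
Inodes per group:         2032
Filesystem created:       Thu May 12 02:19:19 2016
"""


def parse_header(text: str) -> dict:
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            break  # assume the header ends at the first blank line
        # Split on the first colon only, so dates keep their colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


fields = parse_header(SAMPLE_HEADER)
blocks = int(fields["Block count"]) - int(fields["First block"])
per_group = int(fields["Blocks per group"])
group_count = -(-blocks // per_group)  # round up
print("block groups:", group_count)
print("inodes:", group_count * int(fields["Inodes per group"]))
```

The computed inode total can be compared with the "Inode count" line as a sanity check on the parsed values.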

tune2fs

Utility used to modify the filesystem parameters

Change the maximum number of mounts before a filesystem check is forced (max-mount-count):

$ sudo tune2fs -c 25 /dev/sda1

Change the time interval between checks (interval-between-checks). A number with no suffix, or with the suffix d, is interpreted as days, m as months and w as weeks. The following command sets 10 days between filesystem checks:

$ sudo tune2fs -i 10 /dev/sda1
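The interval suffixes can be illustrated with a small converter to seconds. This is only a sketch of the rule described above: the function name is hypothetical, and treating a month as 30 days is an assumption of the example.

```python
# Sketch: convert an interval such as "10", "10d", "3w" or "1m"
# into seconds, following the suffix rule described in the text.

SECONDS_PER_DAY = 86400
UNIT_DAYS = {"d": 1, "w": 7, "m": 30}  # "m" = 30 days is an assumption


def interval_seconds(spec: str) -> int:
    spec = spec.strip()
    unit = "d"  # no suffix means days
    if spec and spec[-1] in UNIT_DAYS:
        unit = spec[-1]
        spec = spec[:-1]
    return int(spec) * UNIT_DAYS[unit] * SECONDS_PER_DAY


print(interval_seconds("10"))  # ten days
print(interval_seconds("3w"))  # three weeks
```

For "3w" the result is 1814400 seconds, the same number that appears as the Check interval in the solution of Lab 14.2.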

List the contents of the superblock, including the current values of the parameters that can be changed

$ sudo tune2fs -l /dev/sda1

Superblock Information

The superblock contains the following information:

  • Mount count and maximum mount count
  • Block size (set by mkfs; it cannot be greater than a page of memory)
  • Blocks per group
  • Free block count
  • Free inode count
  • Operating System ID

The superblock is stored redundantly by default, so copies of it are present in several other block groups

Data Blocks and Inodes

The data block bitmap and the inode bitmap are blocks whose bits are 0 for each free block or inode and 1 for each used one. There is one of each bitmap per block group.
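A bitmap of this kind can be read with a few lines of code. The sketch below works on a made-up two-byte bitmap; the helper names are hypothetical, and numbering bits from the least significant bit of each byte is an assumption of the example.

```python
# Sketch: test and count bits in an allocation bitmap.
# Assumption: entry i is bit (i % 8) of byte (i // 8), counting from
# the least significant bit.

def is_used(bitmap: bytes, index: int) -> bool:
    return bool((bitmap[index // 8] >> (index % 8)) & 1)


def count_free(bitmap: bytes, total: int) -> int:
    return sum(1 for i in range(total) if not is_used(bitmap, i))


# Made-up bitmap covering 16 entries: the first three are in use.
bitmap = bytes([0b00000111, 0b00000000])
print(is_used(bitmap, 0), is_used(bitmap, 3))
print(count_free(bitmap, 16))
```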

With an inode size of 128 bytes, a 4 KB block holds 32 inodes (4096 / 128)
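The inode arithmetic can be checked against the dumpe2fs output shown earlier in this chapter (block size 1024, inode size 128, 2032 inodes per group). The function names below are hypothetical; this is only a sketch of the division involved.

```python
# Sketch: how many inodes fit in a block, and how many blocks the
# inode table of one block group needs.

def inodes_per_block(block_size: int, inode_size: int) -> int:
    return block_size // inode_size


def inode_table_blocks(inodes_per_group: int, block_size: int,
                       inode_size: int) -> int:
    per_block = inodes_per_block(block_size, inode_size)
    return -(-inodes_per_group // per_block)  # round up


print(inodes_per_block(4096, 128))          # the 4 KB case from the text
print(inode_table_blocks(2032, 1024, 128))  # the /dev/sda1 example
```

The second result can be compared with the "Inode blocks per group" line of the dumpe2fs output.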

Ext4 Filesystem Enhancements

  • Backward compatible with ext3 and ext2
  • Increases the maximum filesystem size to 1 EB (from 16 TB), and the maximum file size to 16 TB (from 2 TB)
  • Removes the limit on the number of subdirectories (32 K in ext3)
  • Splits large files into the largest possible extents instead of using indirect block mapping
  • Uses multiblock allocation, which allocates space all at once instead of one block at a time
  • Can pre-allocate disk space for a file, both guaranteed and contiguous
  • Uses allocate-on-flush, a performance feature that delays block allocation until data is written to disk
  • Uses fast fsck
  • Uses checksums for the journal, which improves reliability
  • Uses improved timestamps measured in nanoseconds
  • Includes snapshot support
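The filesystem size limits in the list above can be reproduced with simple arithmetic. The sketch assumes 4 KiB blocks and block numbers of 32 bits for ext3 and 48 bits for ext4; those bit widths are not stated in this chapter and are introduced here only to show where figures such as 16 TB and 1 EB come from.

```python
# Sketch: size limit implied by the width of a block number.
# Assumptions: 4 KiB blocks; 32-bit block numbers in ext3 and
# 48-bit block numbers in ext4.

TIB = 1 << 40
EIB = 1 << 60


def max_bytes(block_number_bits: int, block_size: int = 4096) -> int:
    return (1 << block_number_bits) * block_size


print(max_bytes(32) // TIB, "TiB")  # ext3-style 32-bit block numbers
print(max_bytes(48) // EIB, "EiB")  # ext4-style 48-bit block numbers
```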

Lab 14.1: Defragmentation

Defragmenting ext4 filesystems:

$ sudo e4defrag
Usage : e4defrag [-v] file...| directory...| device...
: e4defrag -c file...| directory...| device...

e4defrag is part of the e2fsprogs package and should be on all modern Linux distributions, although it doesn’t come with RHEL 6, which is somewhat long in the tooth.

The only two options are:
• -v: Be verbose.
• -c: Don’t actually do anything, just analyze and report.

The argument can be:
• A file
• A directory
• An entire device

Examples:

$ sudo e4defrag -c /var/log
<Fragmented files> now/best size/ext
1. /var/log/lastlog 5/1 9 KB
2. /var/log/sa/sa24 3/1 80 KB
3. /var/log/rhsm/rhsm.log 2/1 142 KB
4. /var/log/messages 2/1 4590 KB
5. /var/log/Xorg.1.log.old 1/1 36 KB
Total/best extents 120/112
Average size per extent 220 KB
Fragmentation score 1
[0-30 no problem: 31-55 a little bit fragmented: 56- needs defrag]
This directory (/var/log) does not need defragmentation.
Done.

$ sudo e4defrag /var/log
ext4 defragmentation for directory(/var/log)
[2/152]/var/log/Xorg.2.log: 100% [ OK ]
[3/152]/var/log/Xorg.0.log.old: 100% [ OK ]
[4/152]/var/log/messages-20141019.gz: 100% [ OK ]
[5/152]/var/log/boot.log: 100% [ OK ]
[7/152]/var/log/cups/page_log-20140924.gz: 100% [ OK ]
[8/152]/var/log/cups/access_log-20141019.gz: 100% [ OK ]
[9/152]/var/log/cups/access_log: 100% [ OK ]
[10/152]/var/log/cups/error_log-20141018.gz: 100% [ OK ]
[11/152]/var/log/cups/error_log-20141019.gz: 100% [ OK ]
[12/152]/var/log/cups/access_log-20141018.gz: 100% [ OK ]
[14/152]/var/log/cups/page_log-20141018.gz: 100% [ OK ]
...
[152/152]/var/log/Xorg.1.log.old: 100% [ OK ]
Success: [ 112/152 ]
Failure: [ 40/152 ]

Try running e4defrag on various files, directories, and entire devices, always trying with -c first.
You will generally find that Linux filesystems only tend to need defragmentation when they get very full, over 90 percent or so.
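The fragmentation score printed by e4defrag -c is interpreted with the ranges shown in its own output (0-30, 31-55, 56 and above). A small sketch of that classification follows; the function name is hypothetical and the labels are copied from the sample output above.

```python
# Sketch: map an e4defrag fragmentation score to the label printed
# in the legend of the sample output above.

def classify_score(score: int) -> str:
    if score <= 30:
        return "no problem"
    if score <= 55:
        return "a little bit fragmented"
    return "needs defrag"


print(classify_score(1))  # the score reported for /var/log above
```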

Lab 14.2: Modifying Filesystem Parameters with tune2fs

We are going to fiddle with some properties of a formatted ext4 filesystem. This does not require unmounting the filesystem first.

In what follows you can work with an image file, created as shown here (format it as ext4, for example with mkfs.ext4, before use):

$ dd if=/dev/zero of=imagefile bs=1M count=1024

or you can substitute /dev/sdaX (whichever partition holds the filesystem you want to modify) for imagefile.

1. Using dumpe2fs, obtain information about the filesystem whose properties you want to adjust.
2. Ascertain the maximum mount count setting (after which a filesystem check will be forced) and modify it to have the value 30.
3. Set the Check interval (the amount of time after which a filesystem check is forced), to three weeks.
4. Calculate the percentage of blocks reserved, and then reset it to 10%.

Solution
We will work with a filesystem located at /dev/loop0 and mounted at /mnt/tmpdir.

1. Run the following command:
$ sudo dumpe2fs /dev/loop0
2. The output of the previous command shows the maximum mount count setting. To change it to 30, run:
$ sudo tune2fs -c 30 /dev/loop0
3. To set the filesystem check interval to three weeks, run:
$ sudo tune2fs -i 3w /dev/loop0
4. The dumpe2fs output from step 1 shows the block count and the reserved block count, from which the reserved percentage can be calculated. To reset it to 10%, run:
$ sudo tune2fs -m 10 /dev/loop0

Alternative

1. Get filesystem information 

$ dumpe2fs imagefile > dump_results

2. Set the maximum mount count to 30  

$ grep -i "Mount count" dump_results

Mount count: 0
Maximum mount count: -1

$ sudo tune2fs -c 30 imagefile

$ dumpe2fs imagefile > dump_results
$ grep -i "Mount count" dump_results

Mount count: 0
Maximum mount count: 30

3. Set check interval to 3 weeks 

$ grep -i "Check interval" dump_results

Check interval: 0 (<none>)

$ sudo tune2fs -i 3w imagefile
$ dumpe2fs imagefile > dump_results
$ grep -i "Check interval" dump_results

Check interval: 1814400 (3 weeks)

4. Setting reserved blocks percentage to 10%

$ grep -i "Block Count" dump_results

Block count: 131072
Reserved block count: 6553

$ echo "scale=4; 6553/131072" | bc
.0499

$ sudo tune2fs -m 10 imagefile
tune2fs 1.42.9 (28-Dec-2013)
Setting reserved blocks percentage to 10% (13107 blocks)

$ dumpe2fs imagefile > dump_results
$ grep -i "Block Count" dump_results

Block count: 131072
Reserved block count: 13107
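The reserved-block arithmetic of step 4 can also be sketched in a few lines. The function names are hypothetical, and rounding the block count down to a whole number is an assumption that happens to agree with the figures reported above.

```python
# Sketch: reserved-block percentage and the block count that a new
# percentage corresponds to, using the numbers from this lab.

def reserved_percent(reserved_blocks: int, block_count: int) -> float:
    return 100.0 * reserved_blocks / block_count


def reserved_blocks_for(percent: int, block_count: int) -> int:
    # Assumption: the result is rounded down to a whole block.
    return block_count * percent // 100


print(round(reserved_percent(6553, 131072), 2))  # before the change
print(reserved_blocks_for(10, 131072))           # after "tune2fs -m 10"
```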