Objectives
- Describe the main features of the Ext4 filesystem
- Explain the concepts of block groups, superblock, data blocks and inodes
- Use dumpe2fs and tune2fs
- List Ext4 enhancements
Ext2
Long-time standard Linux filesystem, without journaling
Ext3
Similar to Ext2, but it adds journaling
Ext4
Similar to Ext3, but it uses extents instead of lists of individual blocks in order to deal with large files
Ext4 Features
Block size
Set when the filesystem is built; it can be
- 1024 bytes
- 2048 bytes
- 4096 bytes
Linux Kernel Memory management
The kernel's memory management requires an integral number of blocks to fit within a page of memory, so an 8 KB block cannot be used on x86, where memory pages are 4 KB in size.
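The page-size rule can be checked with a small sketch. The helper name fits_in_page is hypothetical, and getconf PAGESIZE is assumed to be available to report the page size of the running system:

```shell
# Hypothetical helper: succeeds when a page holds a whole number of blocks.
fits_in_page() {
    block_size=$1
    page_size=$2
    [ "$block_size" -le "$page_size" ] && [ $((page_size % block_size)) -eq 0 ]
}

page_size=$(getconf PAGESIZE)   # page size of the running system, in bytes
for bs in 1024 2048 4096 8192; do
    if fits_in_page "$bs" "$page_size"; then
        echo "$bs: fits in a ${page_size}-byte page"
    else
        echo "$bs: does not fit in a ${page_size}-byte page"
    fi
done
```

On a system with 4 KB pages, only the 8192-byte candidate should be reported as not fitting.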
Inodes
The number of inodes can also be set when the filesystem is built, which can save disk space. In addition, when a directory is created a certain number of inodes are reserved for it, so that inodes are already available for the files later created in that directory.
Fast symbolic link
If the target pathname of a symbolic link is less than 60 characters long, a fast symbolic link is created: the pathname is stored within the inode itself, which avoids a data block access.
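As a rough illustration of the 60-character rule, the sketch below classifies a link target by its length. The function name is_fast_symlink is hypothetical, and ${#1} counts characters in the current locale, so for non-ASCII paths it is only an approximation of the on-disk length:

```shell
# Hypothetical helper: succeeds when the target is short enough
# (less than 60 characters) to be stored inside the inode.
is_fast_symlink() {
    [ "${#1}" -lt 60 ]
}

for target in /etc/hosts /usr/share/doc/some/very/long/path/that/keeps/going/and/going/file.txt; do
    if is_fast_symlink "$target"; then
        echo "fast symlink: $target"
    else
        echo "needs a data block: $target"
    fi
done
```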
Ext4 Layout
Disk blocks are partitioned into block groups, each of which contains inodes and data blocks stored adjacently.
Structure
- Disk partition (sample: /dev/sda1)
  - Block group
    - Data block divided in categories
      - Boot loader category for block group 0
      - Super block
        - Contains bit fields to assert whether the filesystem is
          - clean
          - dirty
          - unknown
        - Contains information about the filesystem last check
Layout
The layout of block group 0, which starts with the boot block padding, is
Group 0 Padding | ext4 Super Block | Group Descriptors | Reserved GDT Blocks | Data Block Bitmap | inode Bitmap | inode Table | Data Blocks |
---|---|---|---|---|---|---|---|
1024 bytes | 1 block | many blocks | many blocks | 1 block | 1 block | many blocks | many more blocks |
The layout of other block groups is
ext4 Super Block | Group Descriptors | Reserved GDT Blocks | Data Block Bitmap | inode Bitmap | inode Table | Data Blocks |
---|---|---|---|---|---|---|
1 block | many blocks | many blocks | 1 block | 1 block | many blocks | many more blocks |
Super Block and Group Descriptors
These block categories are located at the start of a block group. Under normal circumstances the kernel only uses the copies stored in block group 0; however, for redundancy, backup copies of the Super Block and Group Descriptors are kept in several other block groups. To see where they are, run the following command
$ sudo dumpe2fs /dev/sda1 | grep superblock
dumpe2fs 1.42.13 (17-May-2015)
  Primary superblock at 1, Group descriptors at 2-3
  Backup superblock at 8193, Group descriptors at 8194-8195
  Backup superblock at 24577, Group descriptors at 24578-24579
  Backup superblock at 40961, Group descriptors at 40962-40963
  Backup superblock at 57345, Group descriptors at 57346-57347
  Backup superblock at 73729, Group descriptors at 73730-73731
  Backup superblock at 204801, Group descriptors at 204802-204803
  Backup superblock at 221185, Group descriptors at 221186-221187
  Backup superblock at 401409, Group descriptors at 401410-401411
Block Groups constraint
The number of blocks in a block group is limited by the block bitmap (which identifies used and free blocks within the group). The block bitmap has to fit within a single block, and each of its bits tracks one block. Since a byte has 8 bits, for a 4 KiB bitmap block we have
4096 (bytes) * 8 (bits per byte) = 32768 (blocks per group)
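The bitmap arithmetic can be reproduced for each block size with a short sketch. The helper name blocks_per_group is hypothetical; note that a 4 KiB block is 4096 bytes:

```shell
# One bitmap block tracks block_size * 8 blocks (one bit per block).
blocks_per_group() {
    echo $(( $1 * 8 ))
}

for bs in 1024 2048 4096; do
    bpg=$(blocks_per_group "$bs")
    # Group size in MiB = blocks per group * block size / 2^20
    echo "block size $bs: $bpg blocks per group, $(( bpg * bs / 1024 / 1024 )) MiB per group"
done
```

The 1024-byte case gives 8192 blocks per group, which matches the "Blocks per group" value in the dumpe2fs listing in the next section.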
dumpe2fs
Utility that displays filesystem information (superblock and block group details)
[abernal@localhost ~]$ sudo dumpe2fs /dev/sda1
dumpe2fs 1.42.13 (17-May-2015)
Filesystem volume name:   <none>
Last mounted on:          /mnt/sysimage/boot
Filesystem UUID:          a9d6faa3-8509-4287-86ee-74b989abfe60
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags:         signed_directory_hash
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              128016
Block count:              512000
Reserved block count:     25600
Free blocks:              377996
Free inodes:              127591
First block:              1
Block size:               1024
Fragment size:            1024
Reserved GDT blocks:      256
Blocks per group:         8192
Fragments per group:      8192
Inodes per group:         2032
Inode blocks per group:   254
Flex block group size:    16
Filesystem created:       Thu May 12 02:19:19 2016
Last mount time:          Fri Jun 17 23:34:41 2016
Last write time:          Fri Jun 17 23:34:41 2016
Mount count:              13
Maximum mount count:      -1
Last checked:             Thu May 12 02:19:19 2016
Check interval:           0 (<none>)
Lifetime writes:          132 MB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               128
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      22dfbd1f-0bb0-4516-befb-1f9235f8674b
Journal backup:           inode blocks
Journal features:         (none)
Journal size:             8M
Journal length:           8192
Journal sequence:         0x00000152
Journal start:            0
Group 0: (Blocks 1-8192) [ITABLE_ZEROED]
  Checksum 0x3518, unused inodes 2001
  Primary superblock at 1, Group descriptors at 2-3
  Reserved GDT blocks at 4-259
  Block bitmap at 260 (+259), Inode bitmap at 276 (+275)
  Inode table at 292-545 (+291)
  3814 free blocks, 2002 free inodes, 5 directories, 2001 unused inodes
  Free blocks: 4379-8192
  Free inodes: 30, 32-2032
Group 1: (Blocks 8193-16384) [INODE_UNINIT, ITABLE_ZEROED]
  Checksum 0xc021, unused inodes 2032
  Backup superblock at 8193, Group descriptors at 8194-8195
  Reserved GDT blocks at 8196-8451
  Block bitmap at 261 (bg #0 + 260), Inode bitmap at 277 (bg #0 + 276)
  Inode table at 546-799 (bg #0 + 545)
  1728 free blocks, 2032 free inodes, 0 directories, 2032 unused inodes
  Free blocks: 9823-10240, 13027-14336
  Free inodes: 2033-4064
...
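The header fields of such a listing are easy to post-process. The following is a minimal sketch: the helper name field is hypothetical, and the sample text is just three lines copied from the listing above so the block runs without a real device. The block group count ignores the first-block offset, which does not change the result here:

```shell
# Hypothetical helper: print the value of one "Label: value" header field
# read from dumpe2fs-style text on standard input.
field() {
    awk -F': *' -v key="$1" '$1 == key { print $2; exit }'
}

# Sample values copied from the dumpe2fs listing above.
sample='Block count:              512000
Block size:               1024
Blocks per group:         8192'

block_count=$(printf '%s\n' "$sample" | field "Block count")
block_size=$(printf '%s\n' "$sample" | field "Block size")
per_group=$(printf '%s\n' "$sample" | field "Blocks per group")

echo "Filesystem size: $(( block_count * block_size / 1024 / 1024 )) MiB"
# Round up: a partially filled last group still counts as a group.
echo "Block groups: $(( (block_count + per_group - 1) / per_group ))"
```

On a live system the same helper could be fed from sudo dumpe2fs -h /dev/sda1, where -h limits the output to the superblock information.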
tune2fs
Utility used to modify the filesystem parameters
Change the maximum number of mounts allowed before a filesystem check is forced (max-mount-count)
$ sudo tune2fs -c 25 /dev/sda1
Change the time interval between checks (interval-between-checks). A number with no suffix or with the letter d is interpreted as days, m as months and w as weeks. The following command sets 10 days between filesystem checks
$ sudo tune2fs -i 10 /dev/sda1
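The suffix rules can be sketched as a conversion to seconds, which is the unit shown in the "Check interval" field. The helper name interval_seconds is hypothetical, and treating a month as 30 days is an assumption about how tune2fs counts months:

```shell
# Hypothetical helper mirroring the suffix rules described above.
interval_seconds() {
    value=${1%[dwm]}          # numeric part, with any d/w/m suffix stripped
    case $1 in
        *w) echo $(( value * 7 * 86400 )) ;;
        *m) echo $(( value * 30 * 86400 )) ;;  # assumption: a month counts as 30 days
        *)  echo $(( value * 86400 )) ;;       # "d" or no suffix: days
    esac
}

interval_seconds 10    # ten days
interval_seconds 3w    # three weeks
```

Three weeks comes out as 1814400 seconds, the value dumpe2fs reports after tune2fs -i 3w in the lab solution.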
List the contents of the superblock, including the current values of the parameters that can be changed
$ sudo tune2fs -l /dev/sda1
Superblock Information
This category of block contains the following information
- Mount count and maximum mount count
- Block size (set by mkfs; it cannot be greater than a page of memory)
- Blocks per group
- Free block count
- Free inode count
- Operating System ID
The superblock is redundant by default, so copies of it are present in several other block groups
Data Blocks and Inodes
The data block bitmap and the inode bitmap are blocks whose bits contain 0 for unused entries and 1 for used ones. There is only one of each per block group.
With 128-byte inodes, a 4 KB block holds 32 inodes (4096 / 128)
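The same division explains the size of the inode table. The sketch below uses hypothetical helper names and plugs in the values from the dumpe2fs listing earlier in these notes (inode size 128, block size 1024, 2032 inodes per group):

```shell
# Inodes that fit in one block.
inodes_per_block() {
    echo $(( $1 / $2 ))        # block size / inode size
}

# Blocks needed for a group's inode table, rounded up.
inode_table_blocks() {
    per_block=$(inodes_per_block "$2" "$3")
    echo $(( ($1 + per_block - 1) / per_block ))
}

echo "4 KB block, 128-byte inodes: $(inodes_per_block 4096 128) inodes per block"
echo "2032 inodes, 1 KB blocks: $(inode_table_blocks 2032 1024 128) inode table blocks"
```

The second line yields 254, the "Inode blocks per group" value reported by dumpe2fs.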
Ext4 Filesystem Enhancements
- Backward compatible with ext3 and ext2
- Increases the maximum filesystem size to 1 EB (from 16 TB), and the maximum file size to 16 TB (from 2 TB)
- Removes the limit on the number of subdirectories (32 K in ext3)
- Splits large files into largest possible extents instead of using indirect block mapping
- Uses multiblock allocation, which allocates many blocks at once instead of one block at a time
- Can pre-allocate disk space for a file, with the space both guaranteed and contiguous
- Uses allocate-on-flush, a performance feature that delays block allocation until data is written to disk
- Uses fast fsck
- Uses checksums for the journal, which improves reliability
- Uses improved timestamps measured in nanoseconds
- Includes snapshot support
Lab 14.1: Defragmentation
Defragmenting ext4 filesystems:
$ sudo e4defrag
Usage : e4defrag [-v] file...| directory...| device...
      : e4defrag -c file...| directory...| device...
e4defrag is part of the e2fsprogs package and should be on all modern Linux distributions, although it doesn’t come with RHEL 6, which is somewhat long in the tooth.
The only two options are:
- -v: Be verbose.
- -c: Don’t actually do anything, just analyze and report.
The argument can be:
- A file
- A directory
- An entire device
Examples:
$ sudo e4defrag -c /var/log
<Fragmented files>                             now/best       size/ext
1. /var/log/lastlog                              5/1              9 KB
2. /var/log/sa/sa24                              3/1             80 KB
3. /var/log/rhsm/rhsm.log                        2/1            142 KB
4. /var/log/messages                             2/1           4590 KB
5. /var/log/Xorg.1.log.old                       1/1             36 KB
 Total/best extents                             120/112
 Average size per extent                        220 KB
 Fragmentation score                            1
 [0-30 no problem: 31-55 a little bit fragmented: 56- needs defrag]
 This directory (/var/log) does not need defragmentation.
 Done.
$ sudo e4defrag /var/log
ext4 defragmentation for directory(/var/log)
[2/152]/var/log/Xorg.2.log: 100% [ OK ]
[3/152]/var/log/Xorg.0.log.old: 100% [ OK ]
[4/152]/var/log/messages-20141019.gz: 100% [ OK ]
[5/152]/var/log/boot.log: 100% [ OK ]
[7/152]/var/log/cups/page_log-20140924.gz: 100% [ OK ]
[8/152]/var/log/cups/access_log-20141019.gz: 100% [ OK ]
[9/152]/var/log/cups/access_log: 100% [ OK ]
[10/152]/var/log/cups/error_log-20141018.gz: 100% [ OK ]
[11/152]/var/log/cups/error_log-20141019.gz: 100% [ OK ]
[12/152]/var/log/cups/access_log-20141018.gz: 100% [ OK ]
[14/152]/var/log/cups/page_log-20141018.gz: 100% [ OK ]
...
[152/152]/var/log/Xorg.1.log.old: 100% [ OK ]
 Success: [ 112/152 ]
 Failure: [ 40/152 ]
Try running e4defrag on various files, directories, and entire devices, always trying with -c first.
You will generally find that Linux filesystems only tend to need defragmentation when they get very full, over 90 percent or so.
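When checking many directories with -c, it can help to turn the reported fragmentation score into a verdict. The sketch below simply encodes the ranges printed in the e4defrag -c output above; the function name classify_score is hypothetical:

```shell
# Hypothetical helper encoding the ranges from the e4defrag -c legend:
# 0-30 no problem, 31-55 a little bit fragmented, 56 and above needs defrag.
classify_score() {
    if [ "$1" -le 30 ]; then
        echo "no problem"
    elif [ "$1" -le 55 ]; then
        echo "a little bit fragmented"
    else
        echo "needs defrag"
    fi
}

for score in 1 42 70; do
    echo "score $score: $(classify_score "$score")"
done
```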
Lab 14.2: Modifying Filesystem Parameters with tune2fs
We are going to fiddle with some properties of a formatted ext4 filesystem. This does not require unmounting the filesystem first.
Below you can work with an image file that you create and format as in:
$ dd if=/dev/zero of=imagefile bs=1M count=1024
$ mkfs.ext4 imagefile
or you can substitute /dev/sdaX (whichever partition holds the filesystem you want to modify) for imagefile.
1. Using dumpe2fs, obtain information about the filesystem whose properties you want to adjust.
2. Ascertain the maximum mount count setting (after which a filesystem check will be forced) and modify it to have the value 30.
3. Set the Check interval (the amount of time after which a filesystem check is forced), to three weeks.
4. Calculate the percentage of blocks reserved, and then reset it to 10%.
Solution
We will work with a filesystem located at /dev/loop0 and mounted at /mnt/tmpdir
1. Execute the following command
$ sudo dumpe2fs /dev/loop0
2. The output of the previous command shows the maximum mount count setting. To change it to 30, execute the following command
$ sudo tune2fs -c 30 /dev/loop0
3. To set the filesystem check interval to 3 weeks, execute the following command
$ sudo tune2fs -i 3w /dev/loop0
Alternative
1. Get filesystem information
$ dumpe2fs imagefile > dump_results
2. Set the maximum mount count to 30
$ grep -i "Mount count" dump_results
Mount count: 0
Maximum mount count: -1
$ sudo tune2fs -c 30 imagefile
$ dumpe2fs imagefile > dump_results
$ grep -i "Mount count" dump_results
Mount count: 0
Maximum mount count: 30
3. Set check interval to 3 weeks
$ grep -i "Check interval" dump_results
Check interval: 0 (<none>)
$ sudo tune2fs -i 3w imagefile
$ dumpe2fs imagefile > dump_results
$ grep -i "Check interval" dump_results
Check interval: 1814400 (3 weeks)
4. Setting reserved blocks percentage to 10%
$ grep -i "Block Count" dump_results
Block count: 131072
Reserved block count: 6553
$ echo "scale=4; 6553/131072" | bc
.0499
$ sudo tune2fs -m 10 imagefile
tune2fs 1.42.9 (28-Dec-2013)
Setting reserved blocks percentage to 10% (13107 blocks)
$ dumpe2fs imagefile > dump_results
$ grep -i "Block Count" dump_results
Block count: 131072
Reserved block count: 13107
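The reserved-block figures in step 4 can be cross-checked with a little arithmetic. This is a minimal sketch with hypothetical helper names; the block counts are the ones shown in the output above:

```shell
# Percentage of blocks that are reserved, to two decimal places.
reserved_percent() {
    awk -v r="$1" -v b="$2" 'BEGIN { printf "%.2f\n", r * 100 / b }'
}

# Number of blocks corresponding to a given percentage, rounded down.
reserved_blocks() {
    echo $(( $1 * $2 / 100 ))
}

echo "Currently reserved: $(reserved_percent 6553 131072)%"
echo "Blocks at 10%: $(reserved_blocks 131072 10)"
```

The first value rounds to 5%, the usual default, and the second gives the 13107 blocks that tune2fs -m 10 reports.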