12-3 Handling inode exhaustion failures
2016.6.25
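12-3-1 Disk failure caused by inode exhaustion
- Every file must have an inode, so a filesystem can run out of inodes while the disk still has free space. When that happens, no new files can be created on the disk.
- Case
- Root cause
- The /data/cache directory contains a very large number of tiny cache files. They occupy few blocks but consume a large number of inodes.
- Solution
- Delete some of the files in /data/cache to free part of the inodes on the /data partition.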
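To find where the inodes went in a case like this, one option is to count directory entries per subdirectory. The sketch below is only an illustration and is not part of the original case: the function name top_inode_dirs is hypothetical, and it assumes a find that supports -xdev and file names without embedded newlines.

```shell
#!/bin/bash
# Hypothetical helper: print "<entry count> <directory>" for each immediate
# subdirectory of the given root, largest first. Every entry that find reports
# (file, directory, symlink, ...) is counted once, so hard links to the same
# inode are counted more than once; treat the result as an approximation.
top_inode_dirs() {
    local root="$1"
    local dir count
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue
        # -xdev keeps find on the same filesystem as the starting directory.
        count=$(find "$dir" -xdev | wc -l)
        printf '%d %s\n' "$((count))" "${dir%/}"
    done | sort -rn
}
```

Calling top_inode_dirs /data would list the subdirectories of /data by entry count, so a directory such as /data/cache should appear at the top if it really holds most of the files.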
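12-3-2 Simulating an inode exhaustion failure
Outline of the experiment
1. Create an ext4 filesystem of about 32M
2. Write a test script that uses up all available inodes
3. Try to create a new file
4. Resolve the inode exhaustion failure
Create a 32M primary partition, /dev/sde4, on /dev/sde, format it as ext4, mount it at /testinode, and check the mount information.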
[root@test2 ~]# fdisk /dev/sde
WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
switch off the mode (command 'c') and change display units to
sectors (command 'u').
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Selected partition 4
First cylinder (787-2610, default 787):
Using default value 787
Last cylinder, +cylinders or +size{K,M,G} (787-2610, default 2610): +32m
Unsupported suffix: 'm'.
Supported: 10^N: KB (KiloByte), MB (MegaByte), GB (GigaByte)
2^N: K (KibiByte), M (MebiByte), G (GibiByte)
Last cylinder, +cylinders or +size{K,M,G} (787-2610, default 2610): +32M
Command (m for help): P
Disk /dev/sde: 21.5 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xe4b1d138
Device Boot Start End Blocks Id System
/dev/sde1 1 262 2104483+ fd Linux raid autodetect
/dev/sde2 263 524 2104515 fd Linux raid autodetect
/dev/sde3 525 786 2104515 fd Linux raid autodetect
/dev/sde4 787 791 40162+ 83 Linux
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
[root@test2 ~]# partprobe /dev/sde
sde sde1 sde2 sde3 sde4
[root@test2 ~]# cd /
[root@test2 /]# mkdir /testinode
[root@test2 /]# mount /dev/sde4 /testinode
mount: you must specify the filesystem type
[root@test2 /]# mkfs
mkfs mkfs.ext2 mkfs.ext4 mkfs.msdos
mkfs.cramfs mkfs.ext3 mkfs.ext4dev mkfs.vfat
[root@test2 /]# mkfs.ext4 /dev/sde4
mke2fs 1.41.12 (17-May-2010)
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
Stride=0 blocks, Stripe width=0 blocks
10040 inodes, 40160 blocks
2008 blocks (5.00%) reserved for the super user
First data block=1
Maximum filesystem blocks=41156608
5 block groups
8192 blocks per group, 8192 fragments per group
2008 inodes per group
Superblock backups stored on blocks:
8193, 24577
Writing inode tables: done
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 39 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
[root@test2 /]# mount /dev/sde4 /testinode/
[root@test2 /]# df -hT | grep /dev/sde4
/dev/sde4 ext4 38M 4.5M 32M 13% /testinode
[root@test2 /]#
Check the inode usage of the newly mounted filesystem: df -i /testinode/
[root@test2 /]# df -i /testinode/
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sde4 10040 11 10029 1% /testinode
[root@test2 /]# df -i /dev/sde4
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sde4 10040 11 10029 1% /testinode
[root@test2 /]#
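The figure of 10040 inodes can be reproduced with simple arithmetic. The sketch below assumes that mke2fs applied a bytes-per-inode ratio of 4096, which is the usual mke2fs.conf setting for small filesystems. That ratio is an assumption and does not appear in the output above. The function name estimate_inodes is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: estimate the inode count mke2fs would allocate.
#   $1 = number of blocks, $2 = block size in bytes,
#   $3 = bytes-per-inode ratio (assumed to be 4096 here)
estimate_inodes() {
    local blocks="$1" block_size="$2" inode_ratio="$3"
    echo $(( blocks * block_size / inode_ratio ))
}

# Values taken from the mkfs.ext4 output: 40160 blocks of 1024 bytes.
estimate_inodes 40160 1024 4096   # prints 10040
```

The result also matches the per-group figures in the mkfs.ext4 output: 5 block groups with 2008 inodes each give 10040. mke2fs rounds the per-group count, so on other sizes the estimate may be slightly off.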
Create a small script to exhaust the inodes:
[root@test2 /]# cd /testinode/
[root@test2 testinode]# touch test
[root@test2 testinode]# vi test
#!/bin/bash
for n in $(seq 1 10040)
do
touch a_$n
done
:wq
This is the for-loop approach to creating files that the teacher mentioned earlier.
[root@test2 testinode]# mv test test.sh
[root@test2 testinode]# sh test.sh
touch: cannot touch `a_10029': No space left on device
touch: cannot touch `a_10030': No space left on device
touch: cannot touch `a_10031': No space left on device
touch: cannot touch `a_10032': No space left on device
touch: cannot touch `a_10033': No space left on device
touch: cannot touch `a_10034': No space left on device
touch: cannot touch `a_10035': No space left on device
touch: cannot touch `a_10036': No space left on device
touch: cannot touch `a_10037': No space left on device
touch: cannot touch `a_10038': No space left on device
touch: cannot touch `a_10039': No space left on device
touch: cannot touch `a_10040': No space left on device
[root@test2 testinode]#
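The earlier df -i output showed IFree at 10029, yet only a_1 through a_10028 were created, which accounts for 10028 inodes. The one remaining inode was most likely taken by test.sh itself: df -i was run before the script was written and was not run again afterwards.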
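To count the created files directly instead of reading the number off the error messages, the loop can stop at the first failed touch and report how many files it made. This is a minimal sketch with a hypothetical function name, fill_inodes; the upper limit is a parameter so the function can be tried safely on a small count first.

```shell
#!/bin/bash
# Hypothetical helper: create a_1, a_2, ... in a directory until touch fails
# or the given maximum is reached, then print how many files were created.
fill_inodes() {
    local dir="$1" max="$2" n=0
    while [ "$n" -lt "$max" ]; do
        # touch fails with "No space left on device" once no inode is free.
        if ! touch "$dir/a_$((n + 1))" 2>/dev/null; then
            break
        fi
        n=$((n + 1))
    done
    echo "$n"
}
```

Under the conditions of this experiment, fill_inodes /testinode 10040 would be expected to report 10028, matching the first failure at a_10029.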
Check the inodes on sde4 again: df -i /dev/sde4
[root@test2 testinode]# df -i /dev/sde4
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sde4 10040 10040 0 100% /testinode
Block usage, by contrast, is still low: df -h /dev/sde4
[root@test2 testinode]# df -h /dev/sde4
Filesystem Size Used Avail Use% Mounted on
/dev/sde4 38M 4.7M 32M 14% /testinode
[root@test2 testinode]#
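Because df -h looks healthy while df -i is at 100%, it can help to watch the IUse% column explicitly. The sketch below is a hypothetical filter, inode_alert, that reads df -Pi style output on standard input and prints the mount points at or above a threshold. It assumes the six-column layout shown above and mount points without spaces.

```shell
#!/bin/bash
# Hypothetical helper: read `df -Pi` output on stdin and print
# "<mount point> <IUse%>" for every filesystem at or above the threshold.
inode_alert() {
    local threshold="$1"
    awk -v limit="$threshold" '
        NR > 1 {                      # skip the header line
            use = $5
            sub(/%/, "", use)         # "100%" -> "100"
            if (use + 0 >= limit + 0) print $6, $5
        }'
}
```

A typical call would be df -Pi | inode_alert 90, which in the state shown above would report /testinode at 100%.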
Delete the files with a for loop
[root@test2 testinode]# vi test.sh
#!/bin/bash
for n in $(seq 1 10040)
do
rm -rf a_$n
done
:wq
[root@test2 testinode]# sh test.sh    # running the script deletes every file whose name starts with a
[root@test2 testinode]# ls -l
total 13
drwx------ 2 root root 12288 Jun 25 17:20 lost+found
-rw-r--r-- 1 root root 57 Jun 25 17:35 test.sh
[root@test2 testinode]#
At this point you can run df -i again to confirm that the inodes have been released.
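As an alternative to the loop, find can delete the files in one pass. This also avoids the "Argument list too long" error that a plain rm a_* can hit when a directory holds a very large number of files. The sketch below uses a hypothetical function name, purge_prefix, and assumes a find that supports -maxdepth and -delete (GNU find does).

```shell
#!/bin/bash
# Hypothetical helper: delete regular files in a directory (not in its
# subdirectories) whose names start with the given prefix.
purge_prefix() {
    local dir="$1" prefix="$2"
    find "$dir" -maxdepth 1 -type f -name "${prefix}*" -delete
}
```

For this experiment the call would be purge_prefix /testinode a_, followed by df -i /testinode to check that IFree has gone back up.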