Linux software RAID 1: root filesystem becomes read-only after a fault on one disk


The setup:
CentOS 5.2, 2 x 320 GB SATA drives in RAID 1.

- /dev/md0 (/dev/sda1 + /dev/sdb1) is /boot
- /dev/md1 (/dev/sda2 + /dev/sdb2) is an LVM partition containing the /, /data and swap volumes

All filesystems other than swap are ext3.

We've had a problem on several systems where a fault on one drive locks the root filesystem read-only, which obviously causes problems.

    [root@myserver /]# mount | grep Root
    /dev/mapper/VolGroup00-LogVolRoot on / type ext3 (rw)
    [root@myserver /]# touch /foo
    touch: cannot touch `/foo': Read-only file system
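On systems of this era `mount` reports what is recorded in /etc/mtab, and that file cannot be updated once the root filesystem is read-only, which would explain the stale `(rw)` above. The kernel's own view is in /proc/mounts. A minimal sketch for comparing the two follows; the helper name `mount_opts` is made up for illustration.

```shell
#!/bin/sh
# Print the mount options recorded for a mount point in a mounts-style table.
#   $1 = mount point, $2 = table file (defaults to /proc/mounts)
# The last matching entry wins, because /proc/mounts can list "/" twice
# (once as rootfs and once as the real root device).
mount_opts() {
    awk -v mp="$1" '$2 == mp { opts = $4 } END { if (opts != "") print opts }' \
        "${2:-/proc/mounts}"
}

# Show the root filesystem options from each table that is readable here.
for table in /proc/mounts /etc/mtab; do
    if [ -r "$table" ]; then
        echo "$table: $(mount_opts / "$table")"
    fi
done
```

If the /proc/mounts line starts with `ro` while /etc/mtab still says `rw`, the kernel has already remounted the filesystem read-only.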

I can see that one of the partitions in the array is faulted:

    [root@myserver /]# mdadm --detail /dev/md1
    /dev/md1:
    [...]
              State : clean, degraded
     Active Devices : 1
    Working Devices : 1
     Failed Devices : 1
      Spare Devices : 0
    [...]
        Number   Major   Minor   RaidDevice State
           0       0        0        0      removed
           1       8       18        1      active sync   /dev/sdb2
           2       8        2        -      faulty spare   /dev/sda2
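The same degraded state is visible in /proc/mdstat, where each array has a member map such as `[UU]` (all members up) or `[_U]` (one member missing). The following is a rough sketch of a check that could be run from cron to spot a degraded array before a second fault occurs; the function name `degraded_arrays` is hypothetical.

```shell
#!/bin/sh
# List md arrays whose member map (e.g. [_U]) shows a missing member.
#   $1 = mdstat-format file (defaults to /proc/mdstat)
degraded_arrays() {
    awk '
        # An array header line looks like "md1 : active raid1 sdb2[1] sda2[2](F)".
        /^md[0-9]+ :/ { name = $1 }
        # The status line that follows ends in a map of U (up) and _ (missing).
        name != "" && /\[[U_]*_[U_]*\]/ { print name; name = "" }
    ' "${1:-/proc/mdstat}"
}

# Report only when this machine actually has md arrays.
if [ -r /proc/mdstat ]; then
    degraded_arrays
fi
```

`mdadm --monitor` provides similar alerting as a built-in feature; the sketch only shows what the status map encodes.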

Trying to remount as rw fails:

    [root@myserver /]# mount -n -o remount /
    mount: block device /dev/VolGroup00/LogVolRoot is write-protected, mounting read-only

The LVM tools give an error unless --ignorelockingfailure is used (because they can't write to /var), but they show the volume group as rw:

    [root@myserver /]# lvm vgdisplay
      Locking type 1 initialisation failed.
    [root@myserver /]# lvm pvdisplay --ignorelockingfailure
      --- Physical volume ---
      PV Name               /dev/md1
      VG Name               VolGroup00
      PV Size               279.36 GB / not usable 15.56 MB
      Allocatable           yes (but full)
      [...]

    [root@myserver /]# lvm vgdisplay --ignorelockingfailure
      --- Volume group ---
      VG Name               VolGroup00
      System ID
      Format                lvm2
      Metadata Areas        1
      Metadata Sequence No  4
      VG Access             read/write
      VG Status             resizable
      [...]

    [root@myserver /]# lvm lvdisplay /dev/VolGroup00/LogVolRoot --ignorelockingfailure
      --- Logical volume ---
      LV Name                /dev/VolGroup00/LogVolRoot
      VG Name                VolGroup00
      LV UUID                PGoY0f-rXqj-xH4v-WMbw-jy6I-nE04-yZD3Gx
      LV Write Access        read/write
      [...]

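In this case, /boot (a separate RAID metadevice) and /data (a different logical volume in the same volume group) are still writable. From previous occurrences, I know that a reboot will bring the system back up with a read/write root filesystem and a properly degraded RAID array.

So, I have two questions:

1) When this occurs, how can I get the root filesystem back to read/write without a system restart?

2) What needs to change to stop this filesystem lock-up? With a RAID 1 failure on a single disk, we don't want the filesystems to lock up; we want the system to keep running until we can replace the bad disk.

Edit: I can see this in the dmesg output. Does this indicate a failure of /dev/sda, followed by a separate failure on /dev/sdb that led to the filesystem being set to read-only?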

    sda: Current [descriptor]: sense key: Aborted Command
        Add. Sense: Recorded entity not found

    Descriptor sense data with sense descriptors (in hex):
            72 0b 14 00 00 00 00 0c 00 0a 80 00 00 00 00 00
            00 03 ce 85
    end_request: I/O error, dev sda, sector 249477
    raid1: Disk failure on sda2, disabling device.
            Operation continuing on 1 devices
    ata1: EH complete
    SCSI device sda: 586072368 512-byte hdwr sectors (300069 MB)
    sda: Write Protect is off
    sda: Mode Sense: 00 3a 00 00
    SCSI device sda: drive cache: write back
    RAID1 conf printout:
     --- wd:1 rd:2
     disk 0, wo:1, o:0, dev:sda2
     disk 1, wo:0, o:1, dev:sdb2
    RAID1 conf printout:
     --- wd:1 rd:2
     disk 1, dev:sdb2
    ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
    ata2.00: irq_stat 0x40000001
    ata2.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
             res 51/04:00:34:cf:f3/00:00:00:f3:40/a3 Emask 0x1 (device error)
    ata2.00: status: { DRDY ERR }
    ata2.00: error: { ABRT }
    ata2.00: configured for UDMA/133
    ata2: EH complete

    sdb: Current [descriptor]: sense key: Aborted Command
        Add. Sense: Recorded entity not found

    Descriptor sense data with sense descriptors (in hex):
            72 0b 14 00 00 00 00 0c 00 0a 80 00 00 00 00 00
            01 e3 5e 2d
    end_request: I/O error, dev sdb, sector 31677997
    Buffer I/O error on device dm-0, logical block 3933596
    lost page write due to I/O error on dm-0
    ata2: EH complete
    SCSI device sdb: 586072368 512-byte hdwr sectors (300069 MB)
    sdb: Write Protect is off
    sdb: Mode Sense: 00 3a 00 00
    SCSI device sdb: drive cache: write back
    ata2.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
    ata2.00: irq_stat 0x40000008
    ata2.00: cmd 61/38:00:f5:d6:03/00:00:00:00:00/40 tag 0 ncq 28672 out
             res 41/10:00:f5:d6:03/00:00:00:00:00/40 Emask 0x481 (invalid argument) <F>
    ata2.00: status: { DRDY ERR }
    ata2.00: error: { IDNF }
    ata2.00: configured for UDMA/133
    sd 1:0:0:0: SCSI error: return code = 0x08000002
    sdb: Current [descriptor]: sense key: Aborted Command
        Add. Sense: Recorded entity not found

    Descriptor sense data with sense descriptors (in hex):
            72 0b 14 00 00 00 00 0c 00 0a 80 00 00 00 00 00
            00 03 d6 f5
    end_request: I/O error, sector 251637
    ata2: EH complete
    SCSI device sdb: 586072368 512-byte hdwr sectors (300069 MB)
    sdb: Write Protect is off
    sdb: Mode Sense: 00 3a 00 00
    SCSI device sdb: drive cache: write back
    Aborting journal on device dm-0.
    journal commit I/O error
    ext3_abort called.
    EXT3-fs error (device dm-0): ext3_journal_start_sb: Detected aborted journal
    Remounting filesystem read-only
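In a log this long it helps to pull out only the lines that mark each stage. In the output above those are: raid1 disabling sda2, the later I/O errors on sdb reaching dm-0, and finally the ext3 journal abort and read-only remount. A small sketch of such a filter is below; the function name `failure_timeline` and the choice of patterns are assumptions based on the messages in this particular log, not an exhaustive list.

```shell
#!/bin/sh
# Print, in log order and prefixed with line numbers, the kernel messages that
# mark each stage of the failure: md dropping a member, block-layer I/O errors,
# and the ext3 journal abort. Reads kernel log text on standard input.
failure_timeline() {
    grep -n -E 'raid1: Disk failure|end_request: I/O error|Buffer I/O error|Aborting journal|Remounting filesystem read-only'
}

# Typical use (reading the kernel log may require root on some systems):
#   dmesg | failure_timeline
```

Reading the filtered lines top to bottom shows which device failed first and which error immediately preceded the read-only remount.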

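Answer

Your dmesg output should give you an indication of why the failure is being signalled to LVM on the PV; that shouldn't be happening. As for making the system writable again: from memory, kicking the VG and LV to read-only and then back to read-write works, but the real solution is to get md to stop worrying LVM unnecessarily.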
