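My system keeps crashing, so I decided to enable kdump to investigate, because I cannot find any relevant errors in the log files.
I followed the steps to set up kdump from a site here. My server runs CentOS 5.8 with 16 GB of RAM. These are the steps I performed to configure kdump: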
- 1. Install kexec-tools with `yum install kexec-tools` and follow the installation steps
- 2. Edit /boot/grub/grub.conf to configure the kdump memory reservation
- 3. Edit /etc/kdump.conf to set the dump path to /var/crash/ and configure core_collector
- 4. Enable kdump through `chkconfig kdump on`.
- 5. Reboot the server
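When I run `service kdump status`, it reports "Kdump is not operational".
What should I do to get kdump working? Did I miss something in the configuration?
I have included the contents of /boot/grub/grub.conf and /etc/kdump.conf below.
Below is the content of /boot/grub/grub.conf: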
- # grub.conf generated by anaconda
- #
- # Note that you do not have to rerun grub after making changes to this file
- # NOTICE: You have a /boot partition. This means that
- # all kernel and initrd paths are relative to /boot/,eg.
- # root (hd0,0)
- # kernel /vmlinuz-version ro root=/dev/sda3
- # initrd /initrd-version.img
- #boot=/dev/sda
- default=0
- timeout=5
- splashimage=(hd0,0)/grub/splash.xpm.gz
- hiddenmenu
- title CentOS (2.6.18-308.el5)
- root (hd0,0)
- kernel /vmlinuz-2.6.18-308.el5 ro root=LABEL=/
- crashkernel=128M
- initrd /initrd-2.6.18-308.el5.img
Below is the content of /etc/kdump.conf:
- # Configures where to put the kdump /proc/vmcore files
- #
- # This file contains a series of commands to perform (in order) when a
- # kernel crash has happened and the kdump kernel has been loaded. Directives in
- # this file are only applicable to the kdump initramfs,and have no effect if
- # the root filesystem is mounted and the normal init scripts are processed
- #
- # Currently only one dump target and path may be configured at once
- # if the configured dump target fails,the default action will be preformed
- # the default action may be configured with the default directive below. If the
- # configured dump target succedes
- #
- # For filesystem based dump,it's recommended to use UUID and LABEL
- # instead of device name in dump target.
- #
- # See the kdump.conf(5) man page for details of configuration directives
- #raw /dev/sda5
- #ext3 /dev/sda3
- #ext3 LABEL=/boot
- #ext3 UUID=03138356-5e61-4ab3-b58e-27507ac41937
- #net my.server.com:/export/tmp
- #net user@my.server.com
- path /var/crash
- core_collector makedumpfile -c --message-level 1
- #core_collector cp --sparse=always
- #link_delay 60
- #kdump_post /var/crash/scripts/kdump-post.sh
- #extra_bins /usr/bin/lftp
- #disk_timeout 30
- #extra_modules gfs2
- #options modulename options
- #default shell
- #sshkey /root/.ssh/kdump_id_rsa
I also noticed that my /boot/grub/grub.conf differs from the sample grub.conf in the tutorial. Two lines are different:
- From tutorial
- kernel /vmlinuz-2.6.32-220.el6.x86_64 ro root=/dev/sda3
- initrd /initramfs-2.6.32-220.el6.x86_64.img
- From my own conf
- kernel /vmlinuz-2.6.18-308.el5 ro root=LABEL=/
- initrd /initrd-2.6.18-308.el5.img
Could these lines prevent kdump from starting?
[Edit 1]
Contents of /var/log/messages:
- Feb 25 02:18:28 61540 kernel: Command line: ro root=LABEL=/ crashkernel=128M
- Feb 25 02:18:28 61540 kernel: BIOS-provided physical RAM map:
- Feb 25 02:18:28 61540 kernel: BIOS-e820: 0000000000010000 - 000000000009a000 (usable)
- Feb 25 02:18:28 61540 kernel: BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
- Feb 25 02:18:28 61540 kernel: BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
- Feb 25 02:18:28 61540 kernel: BIOS-e820: 0000000000100000 - 00000000cfda0000 (usable)
- Feb 25 02:18:28 61540 kernel: BIOS-e820: 00000000cfda0000 - 00000000cfdd1000 (ACPI NVS)
- Feb 25 02:18:28 61540 kernel: BIOS-e820: 00000000cfdd1000 - 00000000cfe00000 (ACPI data)
- Feb 25 02:18:28 61540 kernel: BIOS-e820: 00000000cfe00000 - 00000000cff00000 (reserved)
- Feb 25 02:18:28 61540 kernel: BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
- Feb 25 02:18:28 61540 kernel: BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
- Feb 25 02:18:28 61540 kernel: BIOS-e820: 0000000100000000 - 000000042f000000 (usable)
- Feb 25 02:18:28 61540 kernel: DMI 2.4 present.
- Feb 25 02:18:28 61540 kernel: No NUMA configuration found
- Feb 25 02:18:28 61540 kernel: Faking a node at 0000000000000000-000000042f000000
- Feb 25 02:18:28 61540 kernel: Bootmem setup node 0 0000000000000000-000000042f000000
- Feb 25 02:18:28 61540 kernel: Memory for crash kernel (0x0 to 0x0) notwithin permissible range
- Feb 25 02:18:28 61540 kernel: disabling kdump
- Feb 25 02:44:39 61540 kdump: No crashkernel parameter was specified or crashkernel memory reservation Failed
- Feb 25 02:44:39 61540 kdump: Failed to start up
[Edit 2]
Or should I change `ro root=LABEL=/` to `ro root=/dev/sda3`?
- title CentOS (2.6.18-308.el5)
- root (hd0,0)
- kernel /vmlinuz-2.6.18-308.el5 ro root=LABEL=/
- crashkernel=128M
- initrd /initrd-2.6.18-308.el5.img
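It looks like you put the crashkernel parameter on a new line. That is why you get the "Kdump is not operational" message. All kernel parameters must be on the same line as the kernel entry: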
- title CentOS (2.6.18-308.el5)
- root (hd0,0)
- kernel /vmlinuz-2.6.18-308.el5 ro root=LABEL=/ crashkernel=128M
- initrd /initrd-2.6.18-308.el5.img
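To catch this kind of mistake before rebooting, something like the following could be used. It is only a rough sketch: `check_crashkernel` is a hypothetical helper name, not part of kexec-tools, and it assumes the usual GRUB legacy layout where each boot entry has a line starting with `kernel`.

```shell
#!/bin/sh
# Hypothetical helper (not part of kexec-tools): checks that every
# non-comment line containing "crashkernel=" is a "kernel" line.
# Usage: check_crashkernel /boot/grub/grub.conf
check_crashkernel() {
    awk '
        /^[ \t]*#/ { next }              # ignore commented-out lines
        /crashkernel=/ {
            if ($1 == "kernel") {
                ok = 1                   # parameter is on a kernel line
            } else {
                bad = 1                  # parameter is on a line of its own
                printf "line %d: crashkernel= is not on a kernel line\n", NR
            }
        }
        END {
            if (bad) exit 1
            if (!ok) { print "no crashkernel= parameter found"; exit 2 }
            print "ok"
        }
    ' "$1"
}
```

Run against the grub.conf from the question, this should flag the line holding `crashkernel=128M`; with the corrected entry above it should print `ok`.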
After rebooting, look at /var/log/messages; you should see something like:
- localhost kdump: kexec: loaded kdump kernel
- localhost kdump: started up
And:
- # /etc/init.d/kdump start
- Starting kdump: [ OK ]
- # /etc/init.d/kdump status
- Kdump is operational
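If you would rather check the log from a script than read it by eye, a minimal sketch could look like this. `kdump_log_status` is a made-up helper name, and the patterns are taken only from the messages quoted in this thread, so other kexec-tools versions may word them differently.

```shell
#!/bin/sh
# Hypothetical helper: reports the most recent kdump outcome found in a
# syslog file, based only on the message texts quoted in this thread.
# Usage: kdump_log_status /var/log/messages
kdump_log_status() {
    awk '
        # Success messages written by the kdump service
        /kexec: loaded kdump kernel|kdump: started up/ { s = "operational" }
        # Failure messages from the kernel or the kdump service
        /disabling kdump|kdump: .*[Ff]ailed/           { s = "not operational" }
        # The last matching line wins, so the newest boot decides
        END { print (s ? s : "no kdump messages") }
    ' "$1"
}
```

Because the last matching line decides the result, older failures from previous boots do not mask a later successful start.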
According to this documentation, try this:
crashkernel=128M@16M
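In the `size@offset` form, the first value is the amount of memory to reserve and the second is the physical address where the reservation starts. To confirm which value the running kernel actually received, a small sketch like the one below could be used. `parse_crashkernel` is a hypothetical name, and it only splits the string; it does not check whether the reservation succeeded.

```shell
#!/bin/sh
# Hypothetical helper: extracts the crashkernel size and offset from a
# kernel command line string.
# Usage: parse_crashkernel "$(cat /proc/cmdline)"
parse_crashkernel() {
    for arg in $1; do                        # split the command line on whitespace
        case "$arg" in
            crashkernel=*@*)
                value=${arg#crashkernel=}    # e.g. 128M@16M
                echo "size=${value%@*} offset=${value#*@}"
                return 0 ;;
            crashkernel=*)
                echo "size=${arg#crashkernel=} offset=none"
                return 0 ;;
        esac
    done
    echo "crashkernel not set"
    return 1
}
```

For the command line shown in the question's log (`ro root=LABEL=/ crashkernel=128M`), this reports a size with no offset.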