I was trying to get another CentOS 7.4 server running with Xen and hit another error.

At first this looked like the error from my earlier post, Xen Hypervisor won’t boot, where the screen goes blank. However, after enabling the Xen boot messages I was able to determine that it was not exactly the same problem as in that post.

The error looked like the following.

(XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds.

So first we need to determine whether your message is the same by enabling Xen boot messages. You’ll need to edit the file /etc/default/grub:

vi /etc/default/grub

The defaults for GRUB_CMDLINE_LINUX and GRUB_CMDLINE_XEN_DEFAULT look like this for me:

GRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet"
GRUB_CMDLINE_XEN_DEFAULT="dom0_mem=1024M,max:1024M cpuinfo com1=115200,8n1 console=com1,tty loglvl=all guest_loglvl=all"
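If you just want to see those two settings without opening the editor, here is a minimal sketch. The function name show_grub_cmdlines and its optional path argument are made up for illustration; they are not part of GRUB or Xen.

```shell
# Print only the Linux and Xen command-line settings from a GRUB defaults file.
# Usage: show_grub_cmdlines [path]   (path defaults to /etc/default/grub)
show_grub_cmdlines() {
    file="${1:-/etc/default/grub}"
    # Match the two variable assignments exactly; the trailing "=" keeps
    # similarly named variables such as GRUB_CMDLINE_LINUX_DEFAULT out.
    grep -E '^(GRUB_CMDLINE_LINUX|GRUB_CMDLINE_XEN_DEFAULT)=' "$file"
}
```

Running show_grub_cmdlines with no argument reads /etc/default/grub.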

You’ll need to do two things. First, remove rhgb quiet from the GRUB_CMDLINE_LINUX entry so that boot messages are displayed (refer to my Turning off progress bar post).

Next, I adjusted the Dom0 memory (refer to my Xen Hypervisor post) and added vga to console to get the Xen boot messages on the screen.

The combination should look like this now:

GRUB_CMDLINE_LINUX="crashkernel=auto"
GRUB_CMDLINE_XEN_DEFAULT="dom0_mem=2048M,max:4096M cpuinfo com1=115200,8n1 console=com1,tty,vga loglvl=all guest_loglvl=all"

One more thing is different in my own setup: I use com2 in my configs. I’m running a Dell R610 server with DRAC Enterprise, and I typically send all output to the virtual serial port so I can log it using SOL (Serial over LAN). I’ll talk about that in another post; for now keep it as com1 unless you’re doing the same thing as me.
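If you would rather script the two boot-message edits (dropping rhgb quiet and adding vga to console) than make them by hand in vi, the following is a rough sketch using sed. The function name enable_boot_messages is hypothetical, and it only prints the modified file to standard output so you can review the result before copying anything over /etc/default/grub. It does not touch the dom0_mem values.

```shell
# Print a copy of a GRUB defaults file with boot messages enabled.
# Usage: enable_boot_messages /etc/default/grub > /tmp/grub.new
enable_boot_messages() {
    # 1st expression: remove "rhgb" and "quiet" from GRUB_CMDLINE_LINUX only.
    # 2nd expression: append ",vga" to a console= value that ends in "tty"
    #                 (a value that already ends in ",vga" is left unchanged).
    sed -E \
        -e '/^GRUB_CMDLINE_LINUX=/ s/ ?(rhgb|quiet)//g' \
        -e '/^GRUB_CMDLINE_XEN_DEFAULT=/ s/(console=[^ "]*tty)([ "])/\1,vga\2/' \
        "$1"
}
```

Compare the output against the original with diff before replacing the real file.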

Before rebooting your server, don’t forget to run:

/usr/bin/grub-bootxen.sh
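Before rebooting, a quick sanity check can confirm the new Xen options made it into the generated GRUB config. This sketch assumes the script regenerates /boot/grub2/grub.cfg (the usual location on a BIOS-booted CentOS 7 box; UEFI systems keep it elsewhere) and that the hypervisor is loaded by a multiboot line. The function name xen_boot_lines is made up for illustration.

```shell
# Print the lines in a generated grub.cfg that load the Xen hypervisor.
# Usage: xen_boot_lines [path]   (path defaults to /boot/grub2/grub.cfg)
xen_boot_lines() {
    cfg="${1:-/boot/grub2/grub.cfg}"
    # Xen entries load the hypervisor with "multiboot" (or "multiboot2"),
    # followed by the hypervisor image and its command line.
    grep -E '^[[:space:]]*multiboot2?[[:space:]].*xen' "$cfg"
}
```

If the printed lines still show the old console= or dom0_mem values, the config was not regenerated from your edited /etc/default/grub.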

After rebooting you’ll get all of the Xen messages on the screen. Here’s a portion of mine, trimmed because there’s a lot of output:

Loading Xen 4.6.6-8.el7 ...
Loading Linux 4.9.75-29.el7.x86_64 ...
Loading initial ramdisk ...
(XEN) Xen version 4.6.6-8.el7 (mockbuild@centos.org) (gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16)) debug=n Tue Dec 12 12:15:46 UTC 2017
(XEN) Latest ChangeSet: Tue Dec 12 11:42:15 2017 +0000 git:07e9f39-dirty
(XEN) Bootloader: GRUB 2.02~beta2
(XEN) Command line: placeholder dom0_mem=2048M,max:4096M cpuinfo com2=115200,8n1 console=com2,tty,vga loglvl=all guest_loglvl=all
(XEN) Video information:
(XEN)  VGA is text mode 80x25, font 8x16
(XEN)  VBE/DDC methods: none; EDID transfer time: 0 seconds
(XEN)  EDID info not retrieved because no DDC retrieval method detected
(XEN) Disc information:
(XEN)  Found 2 MBR signatures
(XEN)  Found 2 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN)  0000000000000000 - 000000000009e000 (usable)
(XEN)  0000000000100000 - 00000000cf379000 (usable)
(XEN)  00000000cf379000 - 00000000cf38f000 (reserved)
(XEN)  00000000cf38f000 - 00000000cf3ce000 (ACPI data)
(XEN)  00000000cf3ce000 - 00000000d0000000 (reserved)
(XEN)  00000000e0000000 - 00000000f0000000 (reserved)
(XEN)  00000000fe000000 - 0000000100000000 (reserved)
(XEN)  0000000100000000 - 0000000e30000000 (usable)
(XEN) ACPI: RSDP 000F1150, 0024 (r2 DELL  )
(XEN) ACPI: XSDT 000F1254, 009C (r1 DELL   PE_SC3          1 DELL        1)
(XEN) ACPI: FACP CF3B3F9C, 00F4 (r3 DELL   PE_SC3          1 DELL        1)
(XEN) ACPI: DSDT CF38F000, 3DD0 (r1 DELL   PE_SC3          1 INTL 20050624)
(XEN) ACPI: FACS CF3B6000, 0040
(XEN) ACPI: APIC CF3B3478, 015E (r1 DELL   PE_SC3          1 DELL        1)
(XEN) ACPI: SPCR CF3B35D8, 0050 (r1 DELL   PE_SC3          1 DELL        1)
(XEN) ACPI: HPET CF3B362C, 0038 (r1 DELL   PE_SC3          1 DELL        1)
(XEN) ACPI: DMAR CF3B3668, 01C0 (r1 DELL   PE_SC3          1 DELL        1)
(XEN) ACPI: MCFG CF3B38C4, 003C (r1 DELL   PE_SC3          1 DELL        1)
(XEN) ACPI: WD__ CF3B3904, 0134 (r1 DELL   PE_SC3          1 DELL        1)
(XEN) ACPI: SLIC CF3B3A3C, 0024 (r1 DELL   PE_SC3          1 DELL        1)
(XEN) ACPI: ERST CF392F70, 0270 (r1 DELL   PE_SC3          1 DELL        1)
(XEN) ACPI: HEST CF3931E0, 03A8 (r1 DELL   PE_SC3          1 DELL        1)
(XEN) ACPI: BERT CF392DD0, 0030 (r1 DELL   PE_SC3          1 DELL        1)
(XEN) ACPI: EINJ CF392E00, 0170 (r1 DELL   PE_SC3          1 DELL        1)
(XEN) ACPI: SRAT CF3B3BC0, 0370 (r1 DELL   PE_SC3          1 DELL        1)
(XEN) ACPI: TCPA CF3B3F34, 0064 (r2 DELL   PE_SC3          1 DELL        1)
(XEN) ACPI: SSDT CF3B7000, 43F4 (r1  INTEL PPM RCM  80000001 INTL 20061109)
(XEN) System RAM: 57331MB (58707036kB)
(XEN) SRAT: PXM 1 -> APIC 20 -> Node 0
(XEN) SRAT: PXM 2 -> APIC 00 -> Node 1
(XEN) SRAT: PXM 1 -> APIC 22 -> Node 0
(XEN) SRAT: PXM 2 -> APIC 02 -> Node 1
(XEN) SRAT: PXM 1 -> APIC 32 -> Node 0
(XEN) SRAT: PXM 2 -> APIC 12 -> Node 1
(XEN) SRAT: PXM 1 -> APIC 34 -> Node 0
.
.
.
(XEN) Dom0 has maximum 16 VCPUs
(XEN) Scrubbing Free RAM on 2 nodes using 8 CPUs
(XEN) ..................................................................done.
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) Xen is relinquishing VGA console.
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
(XEN) Freed 288kB init memory.
mapping kernel into physical memory
about to get started...
[    0.000000] Linux version 4.9.75-29.el7.x86_64 (mockbuild@) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) ) #1 SMP Fri Jan 5 19:42:28 UTC 2018
[    0.000000] Command line: placeholder root=/dev/mapper/cl-root ro crashkernel=auto rd.lvm.lv=cl/root rd.lvm.lv=cl/swap console=hvc0 earlyprintk=xen nomodeset
[    0.000000] x86/fpu: Legacy x87 FPU detected.
[    0.000000] x86/fpu: Using 'eager' FPU context switches.
[    0.000000] Released 0 page(s)
[    0.000000] e820: BIOS-provided physical RAM map:
[    0.000000] Xen: [mem 0x0000000000000000-0x000000000009dfff] usable
[    0.000000] Xen: [mem 0x00000000000a0000-0x00000000000fffff] reserved
[    0.000000] Xen: [mem 0x0000000000100000-0x00000000cf378fff] usable
[    0.000000] Xen: [mem 0x00000000cf379000-0x00000000cf38efff] reserved
[    0.000000] Xen: [mem 0x00000000cf38f000-0x00000000cf3cdfff] ACPI data
[    0.000000] Xen: [mem 0x00000000cf3ce000-0x00000000cfffffff] reserved
[    0.000000] Xen: [mem 0x00000000e0000000-0x00000000efffffff] reserved
[    0.000000] Xen: [mem 0x00000000fe000000-0x00000000ffffffff] reserved
[    0.000000] Xen: [mem 0x0000000100000000-0x0000000130ce8fff] usable
[    0.000000] bootconsole [xenboot0] enabled
[    0.000000] NX (Execute Disable) protection: active
.
.
.
.
[    0.000000] console [hvc0] enabled
[    0.000000] bootconsole [xenboot0] disabled
[    0.000000] clocksource: xen: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[    0.000000] installing Xen timer for CPU 0
[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] tsc: Detected 2393.947 MHz processor
[    7.398543] Calibrating delay loop (skipped), value calculated using timer frequency.. 4788.10 BogoMIPS (lpj=2394050)
[    7.398549] pid_max: default: 32768 minimum: 301
[    7.398594] ACPI: Core revision 20160831
[    7.407815] ACPI: 2 ACPI AML tables successfully acquired and loaded
[    7.408100] Security Framework initialized
[    7.408104] Yama: becoming mindful.
[    7.408119] SELinux:  Initializing.
[    7.408695] Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
[    7.409775] Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
[    7.410248] Mount-cache hash table entries: 8192 (order: 4, 65536 bytes)
[    7.410259] Mountpoint-cache hash table entries: 8192 (order: 4, 65536 bytes)
[    7.411026] CPU: Physical Processor ID: 1
[    7.411030] CPU: Processor Core ID: 0
[    7.411035] mce: CPU supports 2 MCE banks
[    7.411053] Last level iTLB entries: 4KB 512, 2MB 7, 4MB 7
[    7.411056] Last level dTLB entries: 4KB 512, 2MB 32, 4MB 32, 1GB 0
[    7.411425] Freeing SMP alternatives memory: 32K
[    7.413092] ftrace: allocating 34397 entries in 135 pages
[    7.430060] smpboot: Max logical packages: 1
[    7.430064] smpboot: CPU 0 Converting physical 1 to logical package 0
[    7.430081] VPMU disabled by hypervisor.
[    7.430124] Performance Events: Westmere events, PMU not available due to virtualization, using software events only.
[    7.431293] NMI watchdog: disabled (cpu0): hardware events not enabled
[    7.431297] NMI watchdog: Shutting down hard lockup detector on all cpus
[    7.431807] installing Xen timer for CPU 1
[    7.433360] installing Xen timer for CPU 2
[    7.434904] installing Xen timer for CPU 3
[    7.436603] installing Xen timer for CPU 4
[    7.438018] installing Xen timer for CPU 5
[    7.439297] installing Xen timer for CPU 6
[    7.440635] installing Xen timer for CPU 7
[    7.441812] installing Xen timer for CPU 8
(XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds.

You’ll see it first outputs the same three messages as in my previous post. After those three lines come the Xen boot messages, then the kernel boot messages, and that is where it crashed for me.
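If you capture the boot output to a file (over the serial console, for example), a small helper like this can pull out the last few lines leading up to the crash message so you can compare them with mine. It is only a sketch: the function name dom0_crash_context and the log file name in the usage line are hypothetical. It also strips carriage returns, which serial captures often carry at the end of each line.

```shell
# Print the N lines ending at the first "Hardware Dom0 crashed" message.
# Usage: dom0_crash_context LOGFILE [N]   (N defaults to 5)
dom0_crash_context() {
    log="$1"
    n="${2:-5}"
    # Drop carriage returns, then keep a rolling buffer of the last n lines
    # and dump it as soon as the crash message is seen.
    tr -d '\r' < "$log" | awk -v n="$n" '
        { buf[NR % n] = $0 }
        /Hardware Dom0 crashed/ {
            start = NR - n + 1
            if (start < 1) start = 1
            for (i = start; i <= NR; i++) print buf[i % n]
            exit
        }'
}
```

For example, dom0_crash_context xen-boot.log 10 prints the crash line and the nine lines before it.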

The issue seemed to have something to do with Xen and CPUs, since the last kernel messages before the crash were the per-CPU Xen timer setup, so I went to see what other people were doing about similar issues.

I found a helpful article about Xen performance tuning. After a couple of adjustments, I found that assigning a specific number of CPUs to Dom0 and pinning the Dom0 vCPUs made all the difference.

I went back and updated my grub config, adding dom0_max_vcpus and dom0_vcpus_pin like this:

GRUB_CMDLINE_XEN_DEFAULT="dom0_mem=2048M,max:4096M dom0_max_vcpus=4 dom0_vcpus_pin cpuinfo com1=115200,8n1 console=com1,tty,vga loglvl=all guest_loglvl=all"

Next, run grub-bootxen.sh one more time and reboot:

/usr/bin/grub-bootxen.sh

After this reboot the machine booted fully into the Xen Hypervisor. To check the CPU pinning, run:

xl vcpu-list

You’ll get an output that looks like this:

Name                                ID  VCPU   CPU State   Time(s) Affinity (Hard / Soft)
Domain-0                             0     0    0   -b-      71.8  0 / all
Domain-0                             0     1    1   -b-      51.4  1 / all
Domain-0                             0     2    2   -b-      65.4  2 / all
Domain-0                             0     3    3   r--      48.3  3 / all
vm135                                1     0   14   -b-      18.7  all / 8-15
vm136                                2     0    8   -b-      17.4  all / 8-15
vm136                                2     1   10   -b-       7.2  all / 8-15

You’ll see that Dom0 has 4 CPUs, as we set. Between finishing the server and writing this article, I already have 2 new SolusVM guests running on it.

Try creating a DomU and see if you’re good now.
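To check the pinning without reading the table by eye, the output of xl vcpu-list can be piped through a small awk filter. This is a sketch that assumes the column layout shown above, where the hard affinity is the third field from the end of each row. The function name check_dom0_pinning is made up for illustration.

```shell
# Read "xl vcpu-list" output on stdin and report Domain-0 vCPU pinning.
# Exits 0 only if Domain-0 has at least one vCPU and none of them has a
# hard affinity of "all".
check_dom0_pinning() {
    awk '
        $1 == "Domain-0" {
            count++
            hard = $(NF - 2)          # row ends with: <hard> / <soft>
            printf "vcpu %s: hard affinity %s\n", $3, hard
            if (hard == "all") unpinned++
        }
        END {
            printf "Domain-0 vcpus: %d, unpinned: %d\n", count, unpinned
            exit (count == 0 || unpinned > 0)
        }'
}
```

Use it as xl vcpu-list | check_dom0_pinning and check the exit status if you want to script it.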