Hello,
I'm running VMware Server 1.0.1 on a dual Opteron 265 server with CentOS Server 4.4 and four 250GB SATA drives, paired into two RAID-1 arrays (two md devices) that are combined using LVM into a single logical volume. One of the VMs started having hard drive trouble with one of its vdisks (it has two, one preallocated and one expandable). The drive would periodically lock up the VM without any errors appearing (I'm deducing this from what happened later). More recently, the drive started locking a critical guest server process on every boot. The problem seems to stem from one of the vmdk files (the faulty disk is split into 2GB files), and after I defragmented the virtual disk, the server now refuses to boot at all.
The vdiskmanager log looks like this:
Jan 26 10:52:16: app| Log for VMware Server pid=18527 version=1.0.1 build=build-29996 option=Release
Jan 26 10:52:16: app| Scanning directory of file nova-0.vmdk for vmx files.
Jan 26 10:52:16: app| baseDir = '/opt/vmware/vm/nova/', vmx file = 'nova.vmx'
Jan 26 10:52:16: app| Search start: '/opt/vmware/vm/nova/nova.vmx', baseDiskOnly
Jan 26 10:52:16: app| Search result: inTree , isCurrent , isLegacy , states: 1
Jan 26 10:52:16: app| Search analysis: disk file found as part of current state.
Jan 26 10:52:16: app| FILEIO: Found a previous instance of lock file '/opt/vmware/vm/nova/nova-0.vmdk.WRITELOCK'. It will be removed automatically.
Jan 26 10:52:16: app| DISKLIB-DSCPTR: Opened : "nova-0-s001.vmdk" (0x18)
Jan 26 10:52:16: app| DISKLIB-DSCPTR: Opened : "nova-0-s002.vmdk" (0x18)
Jan 26 10:52:16: app| DISKLIB-DSCPTR: Opened : "nova-0-s003.vmdk" (0x18)
Jan 26 10:52:16: app| DISKLIB-DSCPTR: Opened : "nova-0-s004.vmdk" (0x18)
Jan 26 10:52:16: app| DISKLIB-DSCPTR: Opened : "nova-0-s005.vmdk" (0x18)
Jan 26 10:52:16: app| DISKLIB-DSCPTR: Opened : "nova-0-s006.vmdk" (0x18)
Jan 26 10:52:16: app| DISKLIB-SPARSECHK: /opt/vmware/vm/nova/nova-0-s007.vmdk Invalid GD or RGD : 1701013876,2 vs. 259,2
Jan 26 10:52:16: app| DISKLIB-SPARSECHK: /opt/vmware/vm/nova/nova-0-s007.vmdk Invalid GD or RGD : 1428168762,6 vs. 263,6
.
.
.
.
Jan 26 10:52:18: app| 504: 876097585 1818450698 1953391977 775304736 775500850 892220726 1213210717 1867391056
Jan 26 10:52:18: app| DISKLIB-SPUTIL: ****** End of grain table dump ******
Jan 26 10:52:18: app| DISKLIB-SPUTIL: ===== End of extent dump =====
Jan 26 10:52:18: app| DISKLIB-SPARSECHK: uncleanShutdown: 0
Jan 26 10:52:18: app| Backtrace:
Jan 26 10:52:18: app| Backtrace[0] 0xffffb538 eip 0x808e010
Jan 26 10:52:18: app| Backtrace[1] 0xffffb558 eip 0x804aa66
Jan 26 10:52:18: app| Backtrace[2] 0xffffb578 eip 0x8074bfa
Jan 26 10:52:18: app| Backtrace[3] 0xffffb5b8 eip 0x8074dbb
Jan 26 10:52:18: app| Backtrace[4] 0xffffb5e8 eip 0x8070cQuestion (id = 960851991) : 96
vdiskmanager now returns bugNr=26323 when I try to defragment the vdisk.
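The large numbers in the "Invalid GD or RGD" lines and in the grain table dump do not look like sector offsets (compare them with the expected values 259 and 263). One way to inspect them is to reinterpret each number as raw bytes. Here is a minimal sketch in Python, assuming the values are 32-bit little-endian words (an assumption; the log itself does not say so). The helper name decode_words is made up for illustration.

```python
import struct

def decode_words(values):
    """Reinterpret each number as a 32-bit little-endian word and
    return the concatenated bytes as text."""
    raw = b"".join(struct.pack("<I", v) for v in values)
    # latin-1 maps every byte value to a character, so decoding cannot fail
    return raw.decode("latin-1")

# Values copied from the "Invalid GD or RGD" lines in the log above
gd_values = [1701013876, 1428168762]
# First four values from the last grain table dump line in the log above
grain_values = [876097585, 1818450698, 1953391977, 775304736]

print(repr(decode_words(gd_values)))
print(repr(decode_words(grain_values)))
```

If the output is readable text rather than binary noise, that would suggest the metadata area of nova-0-s007.vmdk was overwritten with file contents, which may help decide how much of the extent is recoverable.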
If I try to start the server, I get this message:
Operation on file "/opt/vmware/vm/nova/nova-0-s007.vmdk" failed (No such file or directory).
Choose Retry to attempt the operation again. Choose Abort to terminate this session.
Choose Continue to forward the error to the guest operating system.
0) Retry
1) Abort
2) Continue
Select choice. Press enter for default <0> :
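Since the message above reports "No such file or directory" for an extent, it may be worth checking whether every extent named in the descriptor is still on disk. The sketch below assumes that nova-0.vmdk is the text descriptor for the split disk and that it lists its extents on lines of the form RW <sectors> SPARSE "<file>". The function name missing_extents is hypothetical, and nothing here modifies any file.

```python
import os
import re

# Assumed extent line layout: access mode, size in sectors, extent type, quoted file name
EXTENT_LINE = re.compile(r'^(RW|RDONLY|NOACCESS)\s+(\d+)\s+(\S+)\s+"([^"]+)"')

def missing_extents(descriptor_text, base_dir, exists=os.path.exists):
    """Return the extent file names listed in the descriptor that are not found in base_dir."""
    missing = []
    for line in descriptor_text.splitlines():
        match = EXTENT_LINE.match(line.strip())
        if match is None:
            continue  # not an extent line (comment, header field, and so on)
        name = match.group(4)
        if not exists(os.path.join(base_dir, name)):
            missing.append(name)
    return missing

if __name__ == "__main__":
    base = "/opt/vmware/vm/nova"  # baseDir taken from the log above
    descriptor = os.path.join(base, "nova-0.vmdk")
    if os.path.exists(descriptor):
        with open(descriptor, errors="replace") as fh:
            print(missing_extents(fh.read(), base))
```

An empty list would mean all listed extents exist, in which case the error more likely comes from the damaged contents of nova-0-s007.vmdk than from a missing file.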
Note that I can only get even this much of a response from the command line -- any attempt to use the console software to start the machine produces no feedback whatsoever. If anyone has ideas about this, I would appreciate it. I would especially like to retrieve the data on the virtual disk if at all possible.
Thanks,
Akin