Posts Tagged ‘VMware’

vSphere 5 licensing changes

July 13, 2011 2 comments

Unless you have been living under a rock, you will have heard VMware announce the release of vSphere 5 today. I have spent a few hours sifting through the information on the new features and changes. A lot of great content has been released today, from bloggers but more importantly on the Partner Portal. VMware has obviously invested a lot of time and resources in getting the content out so quickly, and this is something other vendors should take on board.

Although there are a lot of great new features in vSphere 5, I believe its release will be remembered for the changes to the licensing model. This is a shame, as the new features may well not get the focus they deserve. I totally understand why VMware made changes to the licensing model, and the market has been expecting VMware to make some sort of change. As Intel continues to produce CPUs with more cores and servers are capable of being fitted with more and more RAM, the old license model was doomed.

New Licensing Details: http://www.vmware.com/files/pdf/vsphere_pricing.pdf

I personally believe the new licensing model (vRAM) is the right model but that the amount of vRAM allocated per license is inadequate. VMware should instead have used numbers that reflect what customers are using in their environments now (probably nearly twice what VMware decided).

In my experience, customers deploying VMware 4.1 Enterprise Plus on new dual-socket servers would fit between 96 and 146GB of physical RAM. Factoring in an oversubscription of about 30% vRAM to physical RAM and 80% utilization of a host with 146GB of RAM, I would estimate about 152GB of vRAM in total; dividing that by two for a dual-socket host gives 76GB of vRAM per socket. Therefore, to ensure that customers with existing infrastructure can upgrade from 4.1 to vSphere 5 without purchasing additional licenses, VMware should look to increase the vRAM entitlement for Enterprise Plus to about 76GB per processor.
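The estimate above can be reproduced with a few lines of Python. This is only a sketch of my own arithmetic: the function name and parameters are made up for illustration, and it assumes the 80% utilization is applied to physical RAM first and the 30% oversubscription is then applied on top of that.

```python
def vram_per_socket(physical_ram_gb, utilization, oversubscription, sockets):
    """Rough vRAM needed per CPU socket (illustrative helper, not a VMware formula)."""
    # Physical RAM actually in use on the host at the given utilization.
    used_physical_gb = physical_ram_gb * utilization
    # vRAM configured across all VMs, allowing for oversubscription.
    total_vram_gb = used_physical_gb * (1 + oversubscription)
    # vRAM is licensed per processor, so spread the total across the sockets.
    return total_vram_gb / sockets


if __name__ == "__main__":
    estimate = vram_per_socket(146, 0.80, 0.30, 2)
    print(f"Estimated vRAM per socket: {estimate:.2f}GB")
```

Plugging in a different RAM size or oversubscription ratio shows how quickly the per-socket figure moves away from whatever entitlement a license grants.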

The new licensing model will no doubt be attacked by many people, including customers, competitors and partners, but ultimately everyone should agree something had to change.

Categories: VMware

ESXi 4.1 Emulex LPe11000 FC HBA, errors SCSILinuxAbortCommands

February 27, 2011 6 comments

I recently upgraded a customer from 4.0 Update 1 to 4.1 Update 1. Two clusters were upgraded, consisting of five and eight hosts respectively. All of the hosts are IBM x3650 M3 servers, and all but four of them have Emulex LPe11000 FC HBAs.

In about a week since the upgrade, six hosts have failed. The VMs on these hosts are lost and ESXi becomes generally unresponsive, yet the host still responds to HA pings from the other hosts in the cluster. The VMs therefore show as unknown, and restarting the management agents from the ESXi DCUI doesn’t help. Sometimes when the issue has occurred the DCUI itself is unresponsive and the Alt-F11 and Alt-F2 keys do nothing, although no PSOD is displayed.

This issue has only affected the hosts with two LPe11000 single-port FC HBAs, a model that is on the HCL for ESXi 4.1. We are using ESXi embedded on USB keys, not boot from SAN.

We have reviewed the storage and networking, and the fault is isolated to the affected host at the time it occurs; the other hosts don’t have any issues at the same time.

All the affected hosts have had the following HBA: Emulex, LPe11000, firmware 52A3.

We noticed there are a lot of storage-related errors in messages.log:

——————————————————————
FEB 23 16:33:06 vobd: Feb 23 16:33:00.126: 248389618631us: [esx.problem.storage.connectivity.lost] Lost connectivity to storage device naa.6006016019802900e634e73f72a6df11. Path vmhba4:C0:T1:L59 is down. Affected datastores: “DM01_SAP_Test_VMFS01”..
FEB 23 16:33:06 vmkernel: 2:20:59:46.445 cpu3:4142)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x41027f9feb40) to NMP device “naa.6006016019802900e634e73f72a6df11” failed on physical path “vmhba4:C0:T1:L59” H:0x1 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
FEB 23 16:33:06 vmkernel: 2:20:59:46.445 cpu3:4142)WARNING: NMP: nmp_DeviceRetryCommand: Device “naa.6006016019802900e634e73f72a6df11”: awaiting fast path state update for failover with I/O blocked. No prior reservation exists on the device.
————————————————-
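When there are this many storage errors, it helps to summarise them rather than read them line by line. Below is a rough Python sketch that pulls the `H:/D:/P:` status triple and the physical path out of lines like the ones above and counts failures per path. The function name and the small status table are my own additions; the names in the table follow my understanding of VMware's published SCSI host status codes, so treat them as an assumption and check the KB before relying on them.

```python
import re
from collections import Counter

# Matches the host/device/plugin status triple, e.g. "H:0x1 D:0x0 P:0x0".
STATUS_RE = re.compile(r"H:(0x[0-9a-fA-F]+) D:(0x[0-9a-fA-F]+) P:(0x[0-9a-fA-F]+)")
# Matches a physical path such as "vmhba4:C0:T1:L59".
PATH_RE = re.compile(r"vmhba\d+:C\d+:T\d+:L\d+")

# Assumed meanings of a few host status values (not taken from this post).
HOST_STATUS = {"0x0": "OK", "0x1": "NO_CONNECT", "0x5": "ABORT"}


def summarize_failures(lines):
    """Count failed commands per (path, host status) pair."""
    counts = Counter()
    for line in lines:
        status = STATUS_RE.search(line)
        if status is None:
            continue  # not a command-failure line
        path = PATH_RE.search(line)
        key = (path.group(0) if path else "unknown", status.group(1).lower())
        counts[key] += 1
    return counts


if __name__ == "__main__":
    with open("messages.log", encoding="utf-8", errors="replace") as log:
        for (path, host), n in summarize_failures(log).most_common():
            print(f"{path} H:{host} ({HOST_STATUS.get(host, 'other')}): {n}")
```

Lines without a status triple, such as the vobd connectivity message, are simply skipped.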

We contacted VMware GSS. Initially we had little help: the first technical support representative was not even able to see any storage errors, only confirmed what we already knew, that “the hosts is being unresponsive”, and suggested it could be a hardware issue.

Seeing that at this stage it had affected four different hosts, it clearly wasn’t a hardware fault; possibly a firmware and driver issue.

After losing another few hosts and applying some pressure to GSS, I talked to Aakash from VMware Global Support, who was great in helping us. He confirmed that we were experiencing SCSI abort commands on our FC HBA and that storage connectivity was being lost.

-We noticed APD messages around 25th Feb, 2011 14:59, based on the log snippet below.

———————————————
FEB 25 14:57:12 vmkernel: 0:09:47:02.054 cpu12:10442)FS3: 7412: Waiting for timed-out heartbeat [HB state abcdef02 offset 3280896 gen 9 stamp 35219957746 uuid 4d6739f2-8ec6b4c0-23f3-e61f13594cb3 jrnl drv 8.46]

FEB 25 14:57:12 vmkernel: 0:09:47:02.054 cpu18:10525)FS3: 7412: Waiting for timed-out heartbeat [HB state abcdef02 offset 3280896 gen 9 stamp 35219957746 uuid 4d6739f2-8ec6b4c0-23f3-e61f13594cb3 jrnl drv 8.46]

FEB 25 14:57:12 vmkernel: 0:09:47:02.081 cpu14:4135)WARNING: LinScsi: SCSILinuxAbortCommands: Failed, Driver lpfc820, for vmhba4

FEB 25 14:57:12 vmkernel: 0:09:47:02.529 cpu7:4263)ScsiDeviceIO: 1672: Command 0x12 to device “naa.60060160e8802900bdb0f3eb16a4df11” failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.

FEB 25 14:57:12 vmkernel: 0:09:47:02.529 cpu7:4263)WARNING: NMP: nmp_DeviceStartLoop: NMP Device “naa.60060160e8802900bdb0f3eb16a4df11” is blocked. Not starting I/O from device.

FEB 25 14:57:12 vmkernel: 0:09:47:02.529 cpu2:4259)WARNING: VMW_VAAIP_CX: cx_claim_device: Inquiry to device naa.60060160e8802900bdb0f3eb16a4df11 failed

FEB 25 14:57:12 vmkernel: 0:09:47:02.529 cpu2:4259)WARNING: vmw_psp_rr: psp_rrSelectPath: Could not select path for device “naa.60060160e8802900c078f2bbc3a5df11”.

FEB 25 14:57:12 vmkernel: 0:09:47:02.529 cpu2:4259)WARNING: NMP: nmp_IssueCommandToDevice: I/O could not be issued to device “naa.60060160e8802900c078f2bbc3a5df11” due to Not found

FEB 25 14:57:12 vmkernel: 0:09:47:02.529 cpu2:4259)WARNING: NMP: nmp_DeviceRetryCommand: Device “naa.60060160e8802900c078f2bbc3a5df11”: awaiting fast path state update for failover with I/O blocked. No prior reservation exists on the device.

FEB 25 14:57:12 vmkernel: 0:09:47:02.529 cpu2:4259)WARNING: NMP: nmp_DeviceStartLoop: NMP Device “naa.60060160e8802900c078f2bbc3a5df11” is blocked. Not starting I/O from device.

FEB 25 14:57:12 vmkernel: 0:09:47:02.679 cpu4:4258)WARNING: vmw_psp_rr: psp_rrSelectPathToActivate: Could not select path for device “naa.60060160e8802900c078f2bbc3a5df11”.

FEB 25 14:57:12 vmkernel: 0:09:47:02.679 cpu0:4255)WARNING: vmw_psp_rr: psp_rrSelectPathToActivate: Could not select path for device “naa.60060160e8802900bdb0f3eb16a4df11”.

FEB 25 14:57:12 vmkernel: 0:09:47:02.679 cpu21:4257)WARNING: vmw_psp_rr: psp_rrSelectPathToActivate: Could not select path for device “naa.60060160e880290079f48bf2c1a5df11”.

FEB 25 14:57:12 vmkernel: 0:09:47:02.679 cpu12:4260)WARNING: vmw_psp_rr: psp_rrSelectPathToActivate: Could not select path for device “naa.60060160e880290004445743c3a5df11”.

FEB 25 14:57:12 vmkernel: 0:09:47:02.679 cpu16:4511)WARNING: NMP: nmpDeviceAttemptFailover: Retry world failover device “naa.60060160e8802900c078f2bbc3a5df11” – issuing command 0x41027ef92940

FEB 25 14:57:12 vmkernel: 0:09:47:02.679 cpu16:4511)WARNING: vmw_psp_rr: psp_rrSelectPath: Could not select path for device “naa.60060160e8802900c078f2bbc3a5df11”.

FEB 25 14:57:12 vmkernel: 0:09:47:02.679 cpu16:4511)WARNING: NMP: nmpDeviceAttemptFailover: Retry world failover device “naa.60060160e8802900c078f2bbc3a5df11” – failed to issue command due to Not found (APD), try again…

-It may be caused by the Emulex driver, since there are abort commands.

FEB 25 14:57:12 vmkernel: 0:09:47:02.054 cpu12:10442)FS3: 7412: Waiting for timed-out heartbeat [HB state abcdef02 offset 3280896 gen 9 stamp 35219957746 uuid 4d6739f2-8ec6b4c0-23f3-e61f13594cb3 jrnl drv 8.46]

FEB 25 14:57:12 vmkernel: 0:09:47:02.054 cpu18:10525)FS3: 7412: Waiting for timed-out heartbeat [HB state abcdef02 offset 3280896 gen 9 stamp 35219957746 uuid 4d6739f2-8ec6b4c0-23f3-e61f13594cb3 jrnl drv 8.46]

FEB 25 14:57:12 vmkernel: 0:09:47:02.081 cpu14:4135)WARNING: LinScsi: SCSILinuxAbortCommands: Failed, Driver lpfc820, for vmhba4
—————————————————
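To check quickly whether other hosts are logging the same abort failures, a small script can count the `SCSILinuxAbortCommands: Failed` entries per driver and adapter. This is a sketch only: the function name and the `messages.log` path are placeholders of mine, and the pattern is based purely on the log format shown above.

```python
import re
from collections import Counter

# Pattern taken from the log lines above, e.g.
# "SCSILinuxAbortCommands: Failed, Driver lpfc820, for vmhba4"
ABORT_RE = re.compile(r"SCSILinuxAbortCommands: Failed, Driver (\w+), for (vmhba\d+)")


def count_abort_failures(log_text):
    """Return a Counter keyed by (driver, adapter) for failed abort commands."""
    # findall scans the whole text, so it also copes with several
    # log entries that have been run together on one line.
    return Counter(ABORT_RE.findall(log_text))


if __name__ == "__main__":
    with open("messages.log", encoding="utf-8", errors="replace") as log:
        for (driver, adapter), n in count_abort_failures(log.read()).items():
            print(f"{adapter} ({driver}): {n} failed abort(s)")
```

A host that shows no matches for the lpfc820 driver is probably not hitting this particular problem, though it proves nothing about other storage faults.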

We are rolling back hosts to ESXi 4.0 Update 1. The driver for that HBA changed in 4.1 to 8.2.1.30.1-58vmw.

The KB article below mentions there are issues with 4.0 and this adapter, but GSS confirmed with VMware engineering that the HBA is supported.

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1012547&sliceId=1&docTypeID=DT_KB_1_1&dialogID=161271506&stateId=0 0 161275257

Categories: Uncategorized

VCAP Certification

December 5, 2010 2 comments

I am currently studying towards the VCAP-DCA certification and am putting up this post to collect study material together. There are already several exam study guides available, as well as flash mock exams and material that I will need to re-read in preparation. I recently delivered a training course and noticed while doing so that there were certain areas I need to brush up on, including HA. VMware did contact me advising that I could sit the exam, but at the time I had a lot on. I plan to complete the exam by March next year, depending on getting access to PearsonVUE exam bookings.

vFail’s VCAP-DCA study notes and index
http://www.vfail.net/vcap-dca/

SaffaGeek VCAP index
http://thesaffageek.wordpress.com/vcap-dca-dcd/

Live Lab Tutorial
http://www.ntpro.nl/blog/archives/1628-VCAP-DCA-Live-Lab-Tutorial.html

Kendrick Coleman Exam Landing Page VDCA410
http://www.kendrickcoleman.com/index.php?/Tech-Blog/vcap-datacenter-administration-exam-landing-page-vdca410.html

Categories: VMware

vCenter AD authentication ESXi Host Profile issue

November 21, 2010 4 comments

There is a bug with 4.1 host profiles: profiles built from a reference host that was joined to a Windows domain using Authentication Services cannot be successfully applied. The host profile prompts you for credentials to authenticate and then gets stuck; the Next button in the host profile wizard no longer works. I also noticed that if you select options in the host profile settings displayed, you may receive the error “An internal error occurred in the vSphere Client. Details: The given key was not present in the dictionary.”

According to this Communities thread it is indeed a bug and will be fixed in future releases.

In the meantime you can disable the domain join setting in the host profile: edit the profile, go to “Authentication Services/Active Directory Configuration/Domain Name”, and change “Configure a fixed domain name” to “Host not joined to any domain”. Alternatively, remove the reference host from the domain and update the profile from that reference host.