One thing is always true: as new technology comes along, some people are going to criticize it, ridicule it, or simply not believe it does what it says it does.
Many years ago, I was approached by some developers who were having issues with an application that had a SQL database backend. This is the story of how they questioned VMware, and what transpired.
The development environment consisted of several application servers and SQL servers that were particular to different builds of our in-house software. Some of the servers were physical, some were virtual running on VMware GSX (now VMware Server), and some were running on VMware ESX 2.5.
Yes, this is an old story that I have told many times. This is my first time posting it. I decided to post it, because lately I have had some conversations questioning running servers/applications in VMware. Yes, even in 2011. Now back to the story.
The developers were working on some stored procedures, and noticed that some were not executing properly. Upon troubleshooting the problem, they decided to blame VMware. The stored procedures worked properly on all of the physical SQL servers, but failed on the virtual SQL servers on VMware GSX/ESX.
When they stepped through the stored procedures manually in both physical and virtual, they worked. When they attempted to run them normally through the application, they failed on the virtual machines, but worked on the physical systems.
Result: The developers looked toward IT for a solution, because obviously VMware was the problem, or so it appeared.
I proposed a challenge to them to determine whether the problem was, in fact, VMware.
I suggested the following troubleshooting steps:
- I would give them an appropriate workstation for them to install Windows Server & SQL on.
- They would run their tests against that physical machine for 2 weeks, to validate the configuration was correct.
- After 2 weeks, I would work the weekend, perform a P2V of the physical system & present it to them as a VM.
- They would then run their tests against the VM to validate the configuration still behaved properly as a VM.
- I told them that not only would they have no issues, but the server would be faster.
- If I was right, they (6+ people) would buy me a nice lunch; if they had any issues, I would buy them (6+ people) a nice lunch out of my own pocket.
The following week, I gave them an appropriate workstation and the MSDN media for Windows Server 2003 and SQL 2000. I asked them to install Windows and SQL, with all the approved patches (relative to our software). The only things I had control of, with their agreement, were the server's IP address, its membership in the Active Directory Domain, and its physical location. Nothing else. They had full control of the system. No input or configuration was going to be done by me or anyone else in IT. They felt very comfortable with that.
When they were done, I moved the desktop to our staging area, and connected it where they could get to it.
With the installation complete by the end of the week, they were set to run their tests the following Monday.
On Monday morning, I dropped by the developer’s desks and checked to see how everything was going. “Good” was the response I got from all of them.
Each day, I dropped by, several times a day, checking on how things were going. Again, “Good” was the only response I received.
That Friday night we had a maintenance window. I figured… Why not go ahead and P2V this box and get this over with. Remember from earlier that I was using VMware GSX/ESX? Yep, VMware Converter hadn’t come out yet, and it wasn’t as easy to P2V a system. I had to use the older P2V Importer application that was a little quirky in comparison.
Regardless of the steps, after some effort, it was virtualized.
I failed to share with the developers that I had virtualized the server over the weekend.
Just as I had the week before, I stopped by each of the developer's desks, asking how the test was going. Most of them said "Good." I would ask "Any issues?" "Nope, nope, and nope." A few, however, said words to the effect of "This is faster this week." The most engaged developer even used the term "flying" to describe the performance. I simply responded with "We had a maintenance window, and now everything in the staging room is on a Gigabit switch." They took that as the reason it was running faster. Never mind that it had more disks from a storage perspective.
In the middle of the day on Friday, I asked them to meet with me regarding virtualizing this server, so we could talk about the P2V process/etc. We started to talk about the process, and they brought up the normal concerns about downtime, connectivity, and so on. In responding to their question about downtime…
I responded “There won’t be any.”
They responded with “What? No downtime? How can that be?”
I said “I virtualized it last weekend.”
Every mouth in the room dropped. “You did what?”
I said “No problems, issues, etc? Running faster?”
They said “Oh, the ESX box has Gigabit connections, and that’s why it is faster.”
I hated to break it to them (but did) that the box had already been connected to a Gigabit switch before the maintenance window.
To them, the test proved that VMware was not the root of the issue in any form or fashion.
It made them look deeper into their issue from a coding/configuration standpoint. The root of the problem was a misconfiguration of that initial development SQL VM Template. Because the template was misconfigured, any/all VMs deployed from it were destined to fail. That’s why VMs on both GSX & ESX had the issue.
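Stories like this usually end the same way: the fix is to diff the configuration of a known-good server against the suspect one, setting by setting, before blaming the platform. As a minimal sketch (not what we actually ran back then; the setting names and values below are hypothetical stand-ins for something like `sp_configure` output), one might compare exported settings like this:

```python
# Minimal sketch: diff server settings captured from a known-good server
# and a suspect one. The dictionaries stand in for exported configuration
# (e.g., sp_configure output); the names and values are hypothetical.

def diff_settings(good, suspect):
    """Return {setting: (good_value, suspect_value)} for every mismatch."""
    mismatches = {}
    for key in sorted(set(good) | set(suspect)):
        g = good.get(key, "<missing>")
        s = suspect.get(key, "<missing>")
        if g != s:
            mismatches[key] = (g, s)
    return mismatches

# Hypothetical settings captured from each server.
physical = {"max degree of parallelism": 0,
            "nested triggers": 1,
            "remote query timeout (s)": 600}
template_vm = {"max degree of parallelism": 0,
               "nested triggers": 0,          # misconfigured in the template
               "remote query timeout (s)": 600}

for setting, (good, bad) in diff_settings(physical, template_vm).items():
    print(f"{setting}: expected {good!r}, found {bad!r}")
```

A diff like this would have pointed straight at the template, since every VM deployed from it carried the same bad setting while every physical box did not.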
The point of my story is that it is easy to blame technology when you don't understand it.
As more and more companies get to the point of virtualizing servers and applications, it is important to remember to troubleshoot further than just blaming virtualization.
After that, I can’t remember a developer implicating VMware as the cause of an issue again.
And I’m still waiting for that lunch…