Sunday, April 19, 2009

Virtualization: as experienced from ground zero

Virtualization is a big theme in contemporary enterprise architecture. Many documented case studies highlight its benefits, including better hardware utilization, greater ROI and lower TCO. The deal looks so good that you can't overlook this option…




But, and this is a big but, the reality is not that rosy. From the software's point of view, a virtual machine cannot use 100% of the physical infrastructure. In our example, we have 8 CPUs and 128 GB of RAM, but each virtual machine can use at most 4 CPUs and 64 GB of RAM. Systems like an RDBMS thrive in many-core, multi-processor environments, so by limiting the number of CPUs or the amount of memory, we are limiting their capacity. This is what we experienced when we decided to use a hypervisor [and it is mostly true for any hypervisor].
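To make the arithmetic concrete, here is a minimal sketch that computes how much of the host a single VM can reach. The function name `vm_ceiling` and its parameters are made up for illustration; the numbers are the ones from our example.

```python
def vm_ceiling(host_cpus, host_ram_gb, vm_cpu_cap, vm_ram_cap_gb):
    """Return the fraction of host CPU and RAM that one VM can reach.

    Hypothetical helper for illustration only; it does not query any hypervisor.
    """
    return {
        "cpu_fraction": min(vm_cpu_cap, host_cpus) / host_cpus,
        "ram_fraction": min(vm_ram_cap_gb, host_ram_gb) / host_ram_gb,
    }


# Host and per-VM limits taken from the example in this post.
ceiling = vm_ceiling(host_cpus=8, host_ram_gb=128, vm_cpu_cap=4, vm_ram_cap_gb=64)
print(ceiling)  # {'cpu_fraction': 0.5, 'ram_fraction': 0.5}
```

In other words, no matter how idle the rest of the box is, a single database instance inside one of these VMs tops out at half of the hardware.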

Future hypervisor versions are expected to fix this limitation, so that each VM can adjust its use of the physical resources based on the processing load. But as of today, there are hard limits on the number of CPUs and the amount of RAM each VM can use.

So to conclude, in our test case, let's compare machine A, with 8 CPUs and 16 GB of RAM, against machine B, with 8 CPUs and 128 GB of RAM. Machine B is virtualized and runs 2 VMs, each with 4 CPUs and 64 GB of RAM. Both VMs on machine B operate in parallel and are used to run an instance of SQL Server 2005 and a file server. Which SQL Server instance will run faster: the one on machine A or the one on machine B? What if we bump the RAM in machine B up to 256 GB? These are solid questions to consider seriously before jumping on the virtualization bandwagon. Yes, vendors are working on these issues and future versions should address them. But from a pure architecture point of view, as of today, these limitations deserve serious consideration.

A lot will depend on the processing load and the type of applications running on these virtual machines. For bursts and spikes, you need a ton of RAM and a lot of processing power for a while, but after that all that capacity just sits idle. What if that burst or spike is the very reason for provisioning such a huge hardware capacity? Do we run this bursty system in a virtualized environment, maximizing hardware utilization but letting it suffer under the VM's limit on the number of CPUs it can use? Or do we go the traditional way and provision a big machine just to handle this one mission-critical load efficiently and effectively? These are some of the critical questions one faces when moving production applications into a virtualized environment. There are no easy answers, and the only way to ensure success is to do as much testing as possible.
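The burst trade-off can be illustrated with a toy model. This is only a sketch under strong assumptions: the work is perfectly parallel, one CPU finishes one unit of work per tick, and the demand figures are invented for illustration, not measured. The function name `drain_time` is hypothetical.

```python
def drain_time(demand_per_tick, cpus):
    """Ticks needed to finish all work when at most `cpus` units complete per tick.

    Toy model: assumes perfectly parallel work and ignores RAM, I/O and
    hypervisor overhead.
    """
    pending = list(demand_per_tick)
    backlog = 0.0
    ticks = 0
    while pending or backlog > 0:
        if pending:
            backlog += pending.pop(0)  # new work arriving in this tick
        backlog = max(0.0, backlog - cpus)  # work completed in this tick
        ticks += 1
    return ticks


# Invented workload: mostly quiet, with a three-tick burst that needs 8 CPUs.
demand = [1, 1, 8, 8, 8, 1, 1]

bare_metal_ticks = drain_time(demand, cpus=8)  # whole 8-CPU machine
capped_vm_ticks = drain_time(demand, cpus=4)   # VM limited to 4 CPUs

print(bare_metal_ticks, capped_vm_ticks)  # 7 9
```

In this model the capped VM keeps working through a backlog after the burst is over, while the full machine never falls behind. Real workloads will behave differently, which is exactly why testing with your own load matters.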
