Are you getting the most from your stats?
Most Private Cloud consumers use only a fraction of what their stats can do for them. Typically, they use network monitoring only to alert them to server outages within the data centre; this type of monitoring is called availability monitoring. However, this only scratches the surface of what your monitoring tool can do. It is a bit like buying a Lamborghini and driving it around in first gear. Network monitoring also allows us to understand the relationship between the application and the hardware, or between the application and the hypervisor if you are running VMs. Understanding that relationship is the real value stats give us.
While not true in all cases, the end user interacts with the application, which interacts with the operating system, which interacts with the virtualisation layer, which in turn interacts with the hardware. This makes for a lot of variables when tuning the performance of an application.
So how does monitoring help our Private Cloud customers? Generally, before an application performance tuning exercise can start, a baseline needs to be established. Using the snapshot range within your monitoring tool, you can benchmark some key parameters, especially CPU usage, CPU load, disk I/O, memory utilisation and swap memory. The NTT ICT monitoring system “NTT Portal” is built on Zabbix, which is an enterprise-class, open source, distributed monitoring solution. Zabbix is script based, so it’s easy for us to add custom reporting. Nagios, HP OpenView, Splunk and most decent monitoring tools can report on at least these parameters.
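To make the idea of a baseline concrete, here is a minimal sketch in Python. It assumes you have already exported the samples for a snapshot range from your monitoring tool into plain lists. The metric names, the sample values and the helper functions `summarise` and `build_baseline` are all hypothetical and are not part of Zabbix or the NTT Portal.

```python
import math
from statistics import mean


def summarise(samples):
    """Reduce one metric's samples to a mean, a 95th percentile and a peak."""
    ordered = sorted(samples)
    # Nearest-rank 95th percentile: the value at position ceil(0.95 * n).
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return {"mean": mean(ordered), "p95": ordered[rank - 1], "peak": ordered[-1]}


def build_baseline(metrics):
    """Summarise every metric in a {name: [samples]} dictionary."""
    return {name: summarise(values) for name, values in metrics.items()}


# Hypothetical samples for one snapshot range (illustrative values only).
snapshot = {
    "cpu_usage_pct": [22, 25, 31, 28, 64, 27, 24, 30],
    "swap_used_mb": [0, 0, 0, 12, 40, 8, 0, 0],
}

baseline = build_baseline(snapshot)
for name, stats in baseline.items():
    print(name, stats)
```

Keeping a percentile and a peak alongside the mean is a design choice rather than a rule: short spikes, such as a burst of swap activity, tend to disappear in an average.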
Once the baseline is established and logged, it becomes easier to compare performance between two periods of time. This is commonly known as profiling. Now you know the performance of the system before modification. From here, simply (I say that ironically, as it’s never simple) identify the part of the system that is critical for improving performance, called the bottleneck. Improve this part of the system to remove the bottleneck, e.g. allow the VMs to access more RAM. Then measure the new performance against your baseline. Repeat until golden brown.
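The measure-against-baseline step can be sketched in a few lines of Python. This is only an illustration: the metric names and figures are made up, and `percent_change` and `compare` are hypothetical helpers, not functions from any monitoring product.

```python
def percent_change(before, after):
    """Percentage change from before to after, or None if before is zero."""
    if before == 0:
        return None  # a percentage change from zero is undefined
    return (after - before) / before * 100.0


def compare(baseline, current):
    """Percentage change for every metric present in both periods."""
    return {
        name: percent_change(baseline[name], current[name])
        for name in baseline
        if name in current
    }


# Hypothetical averages before and after giving the VMs more RAM.
before_tuning = {"cpu_usage_pct": 80, "swap_used_mb": 400, "disk_io_wait_pct": 0}
after_tuning = {"cpu_usage_pct": 60, "swap_used_mb": 100, "disk_io_wait_pct": 0}

for name, change in compare(before_tuning, after_tuning).items():
    print(name, "n/a" if change is None else f"{change:+.1f}%")
```

Whether a negative change is good news depends on the metric. A drop in swap usage is usually welcome, while a drop in throughput would not be, so read the numbers with the bottleneck you were targeting in mind.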
Remember, the first step in getting the most from your stats is to familiarise yourself with the interface. Happy monitoring.