Cloud monitoring

Fault and performance monitoring of cloud environments is not straightforward, but done properly it can provide critical information about what is not performing and why. Doing it "properly" involves monitoring each critical system in the cloud using simultaneous internal, external and appliance-like monitoring instances. Real-time comparative analysis of the collected data enables speedy notification of not only the type of fault or performance issue, but also its location within the cloud. Outside of fault conditions, it also provides ongoing, reportable assurance that performance and configuration specifications are being maintained.

Monitor each critical system

The key to obtaining vital point-of-failure data is to monitor each critical component in the cloud environment. For example, consider the following relatively common entry-level cloud configuration:

two redundant web servers (Web Server A and Web Server B);
a front-end load balancer (Load Balancer); and
a SQL server for transaction data (SQL Server).

Even with this simplest form of cloud topology, it's clear that the standard approach of monitoring only the "website" cannot differentiate between a failure in any one or more of the servers involved, their network connections or the cloud's outgoing connections.

To monitor this environment, we monitor both web servers (A and B), the Load Balancer and the SQL Server. Remmon's comparative analysis of performance and returned data for the Load Balancer and Web Servers A and B enables potential capacity issues or failures to be identified before an actual fault is perceived by a user. It also provides advance warning of synchronisation issues between multi-VM backends (such as configuration or content differences between Web Server A and B).
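The kind of comparison described above can be illustrated with a minimal Python sketch. It is not Remmon's implementation: the ProbeResult type, the compare_backends function and the latency_ratio threshold are all hypothetical names chosen for illustration. It assumes each web server has already been probed and simply compares the two results for availability, content and latency differences.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass
class ProbeResult:
    """Hypothetical record of one probe against one backend."""
    target: str
    ok: bool
    latency_ms: float
    body: bytes = b""


def compare_backends(a: ProbeResult, b: ProbeResult,
                     latency_ratio: float = 2.0) -> List[str]:
    """Return human-readable findings from comparing two redundant backends."""
    findings: List[str] = []
    if not a.ok and not b.ok:
        findings.append("both backends are failing")
    elif a.ok != b.ok:
        down = b.target if a.ok else a.target
        findings.append(f"{down} is failing while its peer is healthy")
    else:
        # Both healthy: look for synchronisation and capacity warnings.
        if hashlib.sha256(a.body).digest() != hashlib.sha256(b.body).digest():
            findings.append("returned content differs between backends")
        slow, fast = (a, b) if a.latency_ms >= b.latency_ms else (b, a)
        # Illustrative threshold: flag a backend that is much slower than its peer.
        if fast.latency_ms > 0 and slow.latency_ms / fast.latency_ms >= latency_ratio:
            findings.append(f"{slow.target} is markedly slower than {fast.target}")
    return findings
```

Both web servers can answer successfully and still produce a finding here, which is the point of comparing them rather than checking each in isolation.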

Internal, external and appliance

Our cloud monitoring solution incorporates data collected from four sources:

External (to the cloud) monitoring by Remmon. This provides not only a view of public availability but also ongoing assurance that services which should be firewall or configuration blocked are not publicly accessible.
One or more internal Remmon monitoring instances (ReMI) within the cloud.
One or more internal Remmon local agents on servers or virtual machines within the cloud.
One or more Remmon monitoring appliances (ReMA) deployed on user, developer or administrator networks. Remmon monitoring appliances can reside on particular physical or radio (Wi-Fi, 3G, 4G etc.) networks in particular locations that represent critical data collection points for fault and performance monitoring.
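As an illustration of the external assurance check in the first item above, the following Python sketch reports which supposedly blocked ports accept a TCP connection from outside the cloud. It is a sketch under stated assumptions, not Remmon's method: the function names are hypothetical, and a successful TCP connect is taken as the only evidence of exposure.

```python
import socket
from typing import Callable, Iterable, List


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def find_exposed_services(host: str, blocked_ports: Iterable[int],
                          is_open: Callable[[str, int], bool] = tcp_port_open
                          ) -> List[int]:
    """Return the ports that should be blocked but are reachable (hypothetical helper)."""
    return sorted(port for port in blocked_ports if is_open(host, port))
```

Run from an external vantage point against the cloud's public address, an empty result supports the claim that the listed services are not publicly accessible. A non-empty result names the ports to investigate.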

Wide service coverage

All Remmon monitoring instances support HTTP(S), SQL, ICMP (ping), DNS and TCP-level interactions with target systems. A modular probe framework means application-specific, multi-step or logic-based monitoring can be accommodated via extension.
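Remmon's probe framework is not documented here, so the following Python sketch only illustrates the general idea of a pluggable, multi-step probe. Every name in it (StepResult, PROBES, register_probe, run_steps) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass
class StepResult:
    """Outcome of one step in a multi-step probe."""
    name: str
    ok: bool
    detail: str = ""


Probe = Callable[[str], List[StepResult]]
PROBES: Dict[str, Probe] = {}  # registry of probes, keyed by name


def register_probe(name: str) -> Callable[[Probe], Probe]:
    """Decorator that adds a probe function to the registry."""
    def decorator(func: Probe) -> Probe:
        PROBES[name] = func
        return func
    return decorator


def run_steps(target: str,
              steps: Sequence[Tuple[str, Callable[[str], bool]]]
              ) -> List[StepResult]:
    """Run steps in order and stop at the first failure."""
    results: List[StepResult] = []
    for name, check in steps:
        try:
            ok, detail = bool(check(target)), ""
        except Exception as exc:  # a raised error counts as a failed step
            ok, detail = False, str(exc)
        results.append(StepResult(name, ok, detail))
        if not ok:
            break
    return results
```

An application-specific probe would then be a function decorated with register_probe that passes its own ordered checks (for example log in, query, log out) to run_steps. The last entry in the returned list shows where the sequence stopped.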

Comparative analysis

Comparison of the data collected gives Remmon the analytical power to narrow down points of failure or inadequate performance. While simpler monitoring tools might provide fault or failure warnings, Remmon's approach of monitoring everything, combined with continual comparison and analysis, provides a total solution that can identify the actual point of failure in a complex multi-tiered cloud environment.
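To make the idea of narrowing down a point of failure concrete, here is a minimal Python sketch for the example topology described earlier. The rules, the vantage-point labels and the localise_fault function are assumptions made for illustration, not Remmon's actual analysis. Missing observations are treated as healthy, which is also an assumption.

```python
from typing import Dict, Tuple

# (vantage point, target) -> True if the probe succeeded
Observations = Dict[Tuple[str, str], bool]


def localise_fault(obs: Observations) -> str:
    """Name the most likely point of failure from per-vantage observations."""
    def ok(vantage: str, target: str) -> bool:
        return obs.get((vantage, target), True)  # assumption: unknown means healthy

    # Work from the innermost component outwards.
    if not ok("internal", "SQL Server"):
        return "SQL Server"
    failed_web = [w for w in ("Web Server A", "Web Server B")
                  if not ok("internal", w)]
    if failed_web:
        return " and ".join(failed_web)
    if not ok("internal", "Load Balancer"):
        return "Load Balancer"
    # Everything is healthy from inside the cloud.
    if not ok("external", "Load Balancer"):
        return "cloud's public connection"
    if not ok("appliance", "Load Balancer"):
        return "network at the appliance location"
    return "no fault detected"
```

A website-only check would report the same outage in every one of these cases. Comparing the internal, external and appliance views is what distinguishes a failed server from a connectivity problem outside the cloud.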
