Operations Monitoring
Proactive hybrid monitoring on every operational level
To act is better than to react...
Monitoring the individual components of a complex cloud infrastructure in isolation allows only limited conclusions about the availability and workload of the system as a whole. Hybrid Monitoring therefore also includes the topology and the relations between components in the analysis, together with data and information from other operating processes.
Another characteristic of our Hybrid Monitoring is the incorporation of specialized agents into the decision-making process. For example, important conclusions about the use of an IT system can be drawn when data from anomaly detection systems is correlated with system workload values. If illicit use is probable, processes that enact pre-configured security rules can be started automatically. If performance drops because of a high number of users, resources can be scaled to match actual demand.
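As a minimal sketch of this decision logic, the following function combines an anomaly score with workload values to pick a response. All names and thresholds are illustrative assumptions, not the actual rule set:

```python
# Hypothetical sketch: correlate anomaly-detection output with workload
# values to choose an automated response. Thresholds are illustrative.

def decide_action(anomaly_score: float, cpu_load: float, active_users: int) -> str:
    """Return the operational process to trigger for one evaluation cycle."""
    if anomaly_score > 0.8 and cpu_load > 0.9:
        # High anomaly score plus unusual load suggests illicit use:
        # enact the pre-configured security rules.
        return "apply_security_rules"
    if cpu_load > 0.8 and active_users > 1000:
        # Legitimate peak traffic: scale resources to actual demand.
        return "scale_out_resources"
    return "no_action"
```

In a real system the predicates would of course be configurable rather than hard-coded.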
Classification of measurement data and information
In order for the analysis processes to work properly, a classification is necessary. We therefore group data according to three operational levels: infrastructure, software and experience.
At the infrastructure level we collect data from building technology, network, load balancing, server hardware, storage systems and virtualization. These values are grouped by resource and sorted into a hierarchical data model. This hierarchical structure is essential both for the analysis of measurements and for operational control.
Without information about the hierarchical dependencies between systems, it is often difficult to detect the origin of anomalies and failures. Neighbor relations make it possible to determine interdependencies between configurations in a complex hierarchy and to start countermeasures in good time.
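A simple sketch of how such a hierarchical model helps locate the origin of correlated failures, assuming a plain parent/child tree of resources (class and field names are illustrative):

```python
# Minimal sketch of the hierarchical resource model: each resource knows
# its parent, so correlated anomalies can be traced to a shared ancestor.

class Resource:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

def ancestors(resource):
    """All resources a given resource depends on, from nearest to root."""
    chain = []
    while resource.parent is not None:
        resource = resource.parent
        chain.append(resource)
    return chain

def common_origin(failing):
    """Nearest ancestor shared by all failing resources -- a candidate
    origin for a correlated failure."""
    chains = [ancestors(r) for r in failing]
    shared = set(chains[0]).intersection(*map(set, chains[1:]))
    for candidate in chains[0]:  # chains are ordered nearest-first
        if candidate in shared:
            return candidate
    return None
```

For example, if two virtual machines on different hypervisors fail at once and both hypervisors share one storage system, the storage system is returned as the likely origin.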
The customer's project infrastructure is located on this level of the monitoring hierarchy. The customer's virtual infrastructure comprises individual or clustered virtual machines, operating systems, application software and special services such as IT security or local traffic management. Measurement results from the infrastructure level, for example the CPU allocation of hypervisors or the performance of storage systems, are imported into this level to support the decision-making logic.
Many values are typically collected and logged at the software level, as this level usually offers the greatest potential for individual tuning. Measuring at short intervals and the ability to log over long periods to enable trend analyses are both important for a meaningful analysis.
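One common way to reconcile short-interval measurement with long-term logging is to aggregate fine-grained samples into coarser buckets for retention, so trends stay visible. A minimal sketch (the bucketing scheme is an assumption, not the product's actual storage format):

```python
# Illustrative sketch: average consecutive short-interval samples into
# coarser buckets for long-term trend analysis.

def downsample(samples, bucket_size):
    """Average consecutive samples into buckets of `bucket_size`."""
    buckets = [samples[i:i + bucket_size]
               for i in range(0, len(samples), bucket_size)]
    return [sum(b) / len(b) for b in buckets]
```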
When tuning a system, we can see how changes influence other software layers and levels such as application processes and traffic management. A positive change improves application performance, which results in more transactions per second and ultimately supports higher visitor counts.
Because of the volume of amassed data, measurements are logged selectively. For complex analyses, however, all collected data has to be available. We accomplish this with cyclic snapshots of the measurement data for the whole virtual infrastructure within a definable time frame.
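Such cyclic snapshots can be sketched as a ring buffer whose length follows from the snapshot interval and the definable retention window. The class and parameter names are illustrative assumptions:

```python
# Sketch of cyclic snapshots: a fixed-length ring buffer keeps full
# measurement snapshots for a definable retention window, and the
# oldest snapshot drops out automatically.

from collections import deque

class SnapshotRing:
    def __init__(self, interval_s: int, retention_s: int):
        self.interval_s = interval_s
        # Number of snapshots that fit into the retention window.
        self.snapshots = deque(maxlen=retention_s // interval_s)

    def take(self, measurements: dict):
        """Store one full snapshot of all measured values."""
        self.snapshots.append(dict(measurements))
```

For example, a 60-second interval with a 180-second window retains the three most recent snapshots.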
Web site performance is key to user experience. The so-called perception threshold lies at 100 milliseconds: anything that takes longer is no longer perceived as instantaneous. More than two seconds is perceived as waiting time. Only few users endure waits longer than ten seconds before they cancel and switch to another web site. Experience-level measurements are important criteria for our customers, as they directly influence economic success.
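The thresholds above can be expressed as a simple classification of measured load times; the category labels are illustrative:

```python
# The perception thresholds from the text, as a classification function.

def perceived_speed(load_time_ms: float) -> str:
    if load_time_ms <= 100:
        return "instant"          # below the 100 ms perception threshold
    if load_time_ms <= 2000:
        return "fluid"            # noticeable, but not yet waiting time
    if load_time_ms <= 10000:
        return "waiting"          # perceived as waiting time
    return "abandonment_risk"     # most users cancel beyond ten seconds
```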
The distributed monitoring agents supply us with valuable information about the loading times of web sites and the duration of specific transactions. Detailed measurements include the DNS response time as well as various timings that answer questions such as: How long does it take until the first data packet of the reply is received? How long until all objects initially visible in the browser have been loaded? And how long does it take to load all remaining objects? The benchmark tools simulate the complex loading process of a web site, following exactly the procedure of modern browsers. We present the results in a clearly laid out waterfall chart.
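A minimal sketch of the per-object timings that make up such a waterfall chart; the field names are assumptions, not the product's actual schema:

```python
# Hypothetical per-object timing record behind a waterfall chart.

from dataclasses import dataclass

@dataclass
class ObjectTiming:
    url: str
    start_ms: float      # offset from navigation start
    dns_ms: float        # DNS response time
    ttfb_ms: float       # time until the first packet of the reply
    download_ms: float   # time to receive the full object

    @property
    def end_ms(self) -> float:
        return self.start_ms + self.dns_ms + self.ttfb_ms + self.download_ms

def page_load_ms(timings):
    """Total load time = end of the last object in the waterfall."""
    return max(t.end_ms for t in timings)
```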
Controlling operative processes
Using monitoring as a central hub to control various operational processes offers significant advantages: data from different functional areas is stored centrally, interconnected and analyzed. On this basis, new process workflows are triggered and controlled.
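The hub idea can be sketched as a small event dispatcher: measurement events are stored centrally and matched against rules that trigger process workflows. All names here are illustrative assumptions:

```python
# Minimal sketch of monitoring as a central hub: events from different
# functional areas are stored and matched against workflow triggers.

class MonitoringHub:
    def __init__(self):
        self.events = []   # central, interconnected event store
        self.rules = []    # (predicate, workflow_name) pairs

    def register(self, predicate, workflow: str):
        """Attach a workflow to a condition on incoming events."""
        self.rules.append((predicate, workflow))

    def ingest(self, event: dict):
        """Store an event and return the workflows it triggers."""
        self.events.append(event)
        return [wf for pred, wf in self.rules if pred(event)]
```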
The seamless interaction of operational processes guarantees the information security of application hosting. These processes control the timely installation of software patches, the documentation of changes along with the analysis of their effects, and the continuous updating of risk management with monitoring results.
The modern architecture of our monitoring system also allows remote systems to be monitored. Reporting agents, which can be installed at various locations, communicate with the central instance over persistent, encrypted WebSocket connections. Our colocations in Moscow, Singapore and Shanghai are controlled with this technology from our headquarters in Munich.
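As a sketch, a remote agent might serialize its reports like this before sending them over its persistent, encrypted connection (e.g. a `wss://` endpoint). The message fields are illustrative assumptions, and the transport itself is omitted:

```python
# Hypothetical report format a remote agent could send to the central
# instance; the WebSocket transport layer is not shown.

import json
import time

def build_report(location: str, measurements: dict) -> str:
    """Serialize one measurement report for the central instance."""
    return json.dumps({
        "agent_location": location,          # e.g. "Moscow", "Singapore"
        "timestamp": int(time.time()),       # seconds since the epoch
        "measurements": measurements,
    })
```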
Integrating your own infrastructure into the monitoring and control process is another use case for remote monitoring. E-commerce applications are usually based on complex back-office processes that communicate with the front office. It is therefore not far-fetched that integrated monitoring opens up new possibilities to optimize your operational processes in both areas.