Let’s talk about Data Center Monitoring
Data Center Monitoring: The basis for defining a service
The term Data Center Monitoring is used in the communications plan of some companies as if it were a product specially designed for this area.
However, we believe the term actually refers to the set of monitoring tool capabilities that are intended to support the work of Data Center managers.
The question, then, is: what can monitoring tools like Pandora FMS do for Data Center managers?
Let’s start by trying to understand these managers and their working environment.
Data Center Managers and their competences
Data Center managers work under a great deal of stress. A good friend of mine, who is responsible for the data centers of a fairly large company, told me that at least a couple of times every weekend he imagines a catastrophic event that will lead to his immediate dismissal.
This stress is understandable if we take into account that data centers are the places where, among other things, the business is based, where the data resides and from where communications are established with other company headquarters and with the outside world.
Let’s think about the number of subsystems that can make up a data center:
- Servers (physical, virtual, internal and in the cloud).
- Mainframe servers (operative and legacy).
- Applications (internal, web and in the cloud).
- Backup systems (backup applications, servers, tape readers, robotic systems).
- Network and communications devices (switches, routers, firewalls, load balancers, multiplexers, modems, etc.).
- Cabling (data center cabling and building backbone).
In addition to this, there are other subsystems such as access security, biometric systems, telephony (telephone exchange, cabling, connection to public telephone service), fire control, air conditioning, UPS, etc.
Moreover, each of these subsystems usually includes its own administrative tools, so managers end up moving from one tool to another in order to carry out their daily activities.
At the same time, the responsibilities of Data Center administrators are very diverse, ranging from the administration of current resources to the definition and execution of strategic plans, as well as the exhausting management of incidents.
In the definition of strategic plans, managers must consider those technological trends that impact the conception, management and evolution of data centers. Therefore, many of them spend a good part of their day researching and evaluating technology.
These technologies range from Cloud Computing to the development of Edge Data Centers and micro data centers, including the replacement of HDD storage with SSD-based storage, and so on.
What can Monitoring tools offer?
A monitoring tool seems to fit quite well with the environment described in the previous section. Thus, a good monitoring tool could provide, among other things, the following:
Unified visualization
Given the large number of devices that require status checks, performance evaluation, behaviour analysis and decision making, a platform that offers a unified, 24/7 view of all subsystems is the ideal situation for Data Center managers.
This unified visualization can bring an additional advantage in terms of personnel. If a data center has a group of operations staff, then once the monitoring tool is in place these operators can watch the regular behaviour of the subsystems that make up the data center and carry out all the basic support possible.
For a more complex incident, these operators can escalate to the next level of expertise, where other staff can work on a definitive solution using the specific administrative tools.
Here, monitoring achieves horizontal support, allowing an analyst with network experience to monitor and perform the basic support associated with a mainframe, for example.
In this sense, a general-purpose monitoring tool like Pandora FMS, which can collect information from any subsystem and present it in a web environment, can be the perfect base for obtaining the advantages of a unified visualization.
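To make the idea of a unified view more concrete, here is a minimal Python sketch that rolls readings from several subsystems up into one worst-case status per subsystem and one overall status. The status levels, the subsystem names and the rollup function are all assumptions made for illustration; this is not how Pandora FMS computes or stores status.

```python
from enum import IntEnum


class Status(IntEnum):
    """Hypothetical severity levels; a higher value means a worse state."""
    OK = 0
    WARNING = 1
    CRITICAL = 2


def rollup(readings):
    """Return (worst status per subsystem, worst status overall).

    `readings` is an iterable of (subsystem_name, Status) pairs coming
    from any number of checks. An empty input is reported as OK.
    """
    per_subsystem = {}
    for subsystem, status in readings:
        current = per_subsystem.get(subsystem, Status.OK)
        per_subsystem[subsystem] = max(current, status)
    overall = max(per_subsystem.values(), default=Status.OK)
    return per_subsystem, overall


# Illustrative readings from unrelated subsystems (names are made up).
readings = [
    ("servers", Status.OK),
    ("network", Status.WARNING),
    ("ups", Status.OK),
    ("network", Status.CRITICAL),
]
per_subsystem, overall = rollup(readings)
for name, status in per_subsystem.items():
    print(f"{name}: {status.name}")
print(f"overall: {overall.name}")
```

Taking the worst status is only one possible rollup rule, but it lets an operator see at a glance which subsystem needs attention without opening each administrative tool.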
Users and profiles
Once you have a unified visualization, it is essential to be able to create user profiles that control the monitoring scope granted to each user.
Pandora FMS, for example, offers a flexible scheme that allows multiple users to work, each with different permissions.
It also supports the creation of groups, each covering a specific monitoring scope, so that a user can be assigned a specific profile in a particular group.
Thus, we can implement a group “servers” and another group “applications” with different monitoring scopes. A user, for example, might have the operator profile in the “servers” group and not have any access to the “applications” group.
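The group and profile example above can be sketched as a small access check in Python. The profile names, the actions attached to each profile and the `can` helper are hypothetical and chosen only for illustration; they do not reproduce the actual Pandora FMS permission model.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical profiles and the actions each one permits (illustrative only).
PROFILE_ACTIONS = {
    "operator": {"view", "validate_event"},
    "administrator": {"view", "validate_event", "edit_alerts"},
}


@dataclass
class User:
    name: str
    # Maps a group name to the profile the user holds in that group.
    profiles: Dict[str, str] = field(default_factory=dict)


def can(user: User, action: str, group: str) -> bool:
    """Return True if the user's profile in `group` permits `action`."""
    profile = user.profiles.get(group)
    if profile is None:
        return False  # no profile in this group means no access at all
    return action in PROFILE_ACTIONS.get(profile, set())


# A user with the operator profile in "servers" and nothing in "applications".
alice = User("alice", {"servers": "operator"})
print(can(alice, "view", "servers"))         # True
print(can(alice, "edit_alerts", "servers"))  # False
print(can(alice, "view", "applications"))    # False
```

Keeping the profile per group, rather than per user, is what lets the same person be an operator for one monitoring scope and have no visibility into another.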
Real-time data and alarm system
Real-time monitoring supports the incident management that every Data Center manager must perform. Having real-time data makes it possible to speed up administrative decisions, such as isolating a server with erratic behaviour or establishing a different route for specific data traffic.
The Pandora FMS event system records in real time everything that happens in the monitored subsystems, giving administrators an overview of the active problems and letting them take actions such as deleting an event, or validating it and following it through as operations require.
It is also essential to have a solid alarm scheme that:
- Keeps us away from the dreaded alarm fatigue.
- Covers all essential elements regardless of their nature.
- Allows actions to be executed.
- Records what is learned from incidents in order to save time and money in the future.
- Relies on a system of notifications with truly informative messages that reach not only consoles but also email and mobile phones.
The Pandora FMS alert system, which covers all of the above, lets us, on the one hand, create an alarm and an action to execute for a particular component and, on the other hand, create a template that defines an alarm and a group of actions that can be applied to several monitored components.
The flexibility of this scheme will allow managers to be very precise where necessary, and to save time and effort where possible.
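As a rough illustration of the template idea, the following Python sketch defines one alert template with a group of actions and applies it to two components. Every name here (the classes, the notification functions, the component names, the threshold rule) is an assumption made for this example and not the Pandora FMS alert API. The template fires only once, after a configurable number of consecutive breaches, which is one simple way to limit alarm fatigue.

```python
from dataclasses import dataclass
from typing import Callable, List

# An action receives the component name and the offending value,
# and returns the message it would send (hypothetical signature).
Action = Callable[[str, float], str]


def notify_email(component: str, value: float) -> str:
    return f"email: {component} reported {value}"


def notify_sms(component: str, value: float) -> str:
    return f"sms: {component} reported {value}"


@dataclass
class AlertTemplate:
    name: str
    threshold: float      # values above this count as a breach
    min_consecutive: int  # fire once this many breaches occur in a row
    actions: List[Action]


class Alert:
    """A template bound to one monitored component."""

    def __init__(self, template: AlertTemplate, component: str):
        self.template = template
        self.component = component
        self.breaches = 0

    def evaluate(self, value: float) -> List[str]:
        if value <= self.template.threshold:
            self.breaches = 0  # recovery re-arms the alert
            return []
        self.breaches += 1
        if self.breaches != self.template.min_consecutive:
            return []  # not enough breaches yet, or already fired
        return [act(self.component, value) for act in self.template.actions]


# One template, reused for several components.
cpu_high = AlertTemplate("cpu_high", 90.0, 2, [notify_email, notify_sms])
alerts = {name: Alert(cpu_high, name) for name in ("web-01", "db-01")}

for sample in (95.0, 97.0, 98.0):
    print(alerts["web-01"].evaluate(sample))
```

With these samples only the second reading triggers the two notifications: the first is a single breach, and the third is suppressed because the alert has already fired and the value has not yet recovered.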
Readers interested in the subject of monitoring and alarms may want to read this article recently published on this blog.
Monitoring virtual environments
Today’s data centers are supported by complex architectures: physical servers of different architectures, technology that enables the creation of virtual machines and/or dynamic containers, and mixed schemes with internal and cloud services or infrastructure.
The implementation of Pandora FMS as a Data Center Monitoring platform can provide interoperability with the different virtualization technologies and with the different cloud computing options, while complying with the premise that the scheme for collecting information on virtual entities must not affect the performance of the systems.
The reader can check the scope of the Pandora FMS solution for monitoring virtual environments at this link.
Analysing log files is a daily task for Data Center managers.
Now, if an administrator decides to implement a Data Center Monitoring project, it is reasonable to expect that the solution that records information about the behaviour of a server is the same solution that collects the log files of that server.
Indeed, this is the approach that Pandora FMS takes when it comes to log files: https://pandorafms.com/log-collection/
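As a very small illustration of what collecting log information can involve, the Python sketch below filters log lines by a regular expression so that only the relevant entries are kept. The function name, the sample lines and the pattern are made up for this example and say nothing about how Pandora FMS gathers logs internally.

```python
import re
from typing import Iterable, List


def collect_matching_lines(lines: Iterable[str], pattern: str) -> List[str]:
    """Return the lines that match `pattern`, without trailing newlines."""
    regex = re.compile(pattern)
    return [line.rstrip("\n") for line in lines if regex.search(line)]


# Made-up log lines standing in for a server log file.
sample_log = [
    "2024-01-01 10:00:00 INFO service started\n",
    "2024-01-01 10:05:12 ERROR disk /dev/sda1 is 95% full\n",
    "2024-01-01 10:06:40 WARNING high load average\n",
]

for entry in collect_matching_lines(sample_log, r"ERROR|WARNING"):
    print(entry)
```

Because the function accepts any iterable of strings, the same code works on an open file object as well as on the in-memory list used here.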
Quick start of a long-range project
Finally, we draw attention to the importance of any Data Center Monitoring project being viable in terms of time, effort and, of course, the associated budget.
In general, tools that involve a strong initial investment and require a lot of effort and resources for installation and implementation tend to be poorly received by Data Center managers.
Pandora FMS can represent a good solution in these terms, since the implementation costs and the start-up time are comparatively lower.
In addition, Pandora FMS scales very well, so you can start by monitoring the network and some servers and applications, and then progressively incorporate the rest of the components and subsystems.
This medium and long-term vision is finally supported by a simple licensing and billing scheme, and of course by as much technical support as possible.