How to monitor an Apache web server with Pandora FMS

July 13, 2018, by Alberto Dominguez

What is an Apache Web server?

In today’s article, you will learn how to monitor an Apache web server in depth with Pandora FMS. But first, let’s find out what Apache is.

Apache is one of the most widely used open source HTTP servers on the market: it is multiplatform, free, fast, and has a strong reputation for security and power.

The project began in 1995, when a group of eight developers formed the Apache Group; this group led to the Apache Software Foundation, which was founded in the United States in 1999.

Among its many advantages are that it is free and open source, its compatibility with Linux, macOS and Windows, its SSL and TLS support, its global and active support community, and its performance (on the order of one million visits per day).

Monitoring an Apache web server is not as simple as checking the status of the process or making a web request to see whether it returns anything. That would be basic monitoring that anyone could set up with Pandora FMS, since there are some examples in the documentation.

Apache web server performance monitoring

There is a plugin in the Pandora FMS library that allows us, along with the Apache server status module, to obtain detailed information about the server performance.

In addition, we can configure the server to obtain detailed information about each instance or web domain that we are serving on the server.

The first step is, obviously, to have Pandora FMS installed. Then, we will install a Pandora FMS agent on the Linux server where Apache runs.

Once the agent is installed, we will install the Apache plugin from the module library:

https://pandorafms.com/library/apache-performance-plugin/

We will download it and copy it to the plugins directory of the Linux agent, which is /etc/pandora/plugins.

To use the plugin, we need to configure the Apache server to use the server-status module (mod_status), which gives detailed server information. To do this, edit the file /etc/httpd/conf/httpd.conf and add the following configuration:


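# Collect extended per-request statistics
ExtendedStatus on

<Location /server-status>
    SetHandler server-status
    # Apache 2.2 access control: deny everyone except the listed IP
    Order deny,allow
    Deny from all
    Allow from XX.XX.XX.XX
    # On Apache 2.4 and later, use this line instead of the three above:
    # Require ip XX.XX.XX.XX
</Location>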

Where it says XX.XX.XX.XX, we will put the main IP of our web server, so that it will only accept requests from itself, for safety.

Once these changes are made, we will restart the web server and launch the plugin manually to verify that it returns data:

/etc/pandora/plugins/apache_plugin http://46.105.97.91/server-status

It has to return an XML with data, since it is an agent plugin that returns several modules. This is an extract of the entire XML:

<module>
<name><![CDATA[Apache: Uptime]]></name>
<description><![CDATA[Uptime since reboot (sec)]]></description>
<type>generic_data</type>
<min>0</min>
<disabled>0</disabled>
<data><![CDATA[248008]]></data>
</module>
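As a cross-check that does not depend on the plugin, mod_status also serves a machine-readable report when `?auto` is appended to the status URL. The following Python sketch parses that `Key: Value` text. The helper name `parse_status` is hypothetical, and the sample text is illustrative rather than output captured from the server above.

```python
def parse_status(text):
    """Parse the 'Key: Value' lines of a mod_status ?auto report into a dict."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if not sep:
            continue  # skip blank or malformed lines
        try:
            stats[key] = float(value)
        except ValueError:
            stats[key] = value  # non-numeric fields such as Scoreboard stay as text
    return stats


# Illustrative sample (assumed values, not real server output)
SAMPLE = """Total Accesses: 1200
Uptime: 248008
BusyWorkers: 3
IdleWorkers: 7
Scoreboard: _W_K______
"""

stats = parse_status(SAMPLE)
print(stats["Uptime"], stats["BusyWorkers"])
```

To run it against a live server, the text could be fetched with `urllib.request.urlopen("http://XX.XX.XX.XX/server-status?auto")` from a host that the access rules allow.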

Once we have verified that it works, we will add the plugin to the Pandora FMS agent with the following line:

module_plugin apache_plugin http://XX.XX.XX.XX/server-status

Once again, replace XX.XX.XX.XX with the IP of the Apache server, the same machine where the Pandora FMS agent runs.

Once this is done and the agent is restarted to load the new configuration, the agent view should look similar to this:

[Screenshot: Pandora FMS agent view]

Server status monitoring

In addition to performance monitoring, we should do basic monitoring of the Apache process; one module is enough to verify that the daemon is running:

module_begin
module_name Apache Status
module_type generic_proc
module_exec ps aux | grep httpd | grep -v grep | wc -l
module_end

Being a Boolean module, it will only go to CRITICAL when its value is 0, but it will also tell us how many httpd processes are active on the server.

Load monitoring of a specific instance

In Apache we can configure an instance (a virtual host, in Apache terminology) to write to its own dedicated log, like this:


<VirtualHost *:80>
ServerAdmin [email protected]
DocumentRoot /var/www/mydomain
ServerName mydomain.com
CustomLog logs/access_log_mydomain common

</VirtualHost>

Now we only have to count the lines in this file with an incremental module to find out how many requests per second this domain receives:


module_begin
module_name MyDomain Request/sec
module_type generic_data_inc
module_exec wc -l /var/log/httpd/access_log_mydomain | awk '{ print $1 }'
module_end
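An incremental module such as generic_data_inc reports a rate rather than the raw counter. The Python sketch below shows the underlying arithmetic, assuming the rate is the difference between two consecutive line counts divided by the seconds between them. The function name and the sample numbers are illustrative, and the handling of a counter that goes backwards (for example after log rotation) is an assumption, not documented Pandora FMS behavior.

```python
def rate_per_second(previous, current, interval_seconds):
    """Return the per-second rate between two readings of a growing counter."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    if current < previous:
        # Counter went backwards (e.g. the log was rotated): the assumed policy
        # is to skip this sample rather than report a negative rate.
        return None
    return (current - previous) / interval_seconds


# 1,500 new log lines over a 300-second agent interval
print(rate_per_second(12000, 13500, 300))  # 5.0
```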

How does SDN change our vision of networks?

June 28, 2018, by Alexander La rosa

SDN: Challenges for Network Administrators and Monitoring

Last December, Acumen Research and Consulting, a global provider of market research, published a report titled “Software Defined Network (SDN) Market” in which it estimated a compound annual growth rate (CAGR) of 47% for SDN for the period from 2016 to 2022.

In 2016, Cisco launched its DNA (Digital Network Architecture), which is based more on software than on hardware.

In 2017, Cisco acquired Viptela to complete its SD-WAN (Software Defined WAN) offering. Also in 2017, IDC (International Data Corporation) estimated a CAGR of 69.6% for SD-WAN infrastructure and services revenues, reaching $8 billion in 2021.

All these statistics show us that the networking business is changing. But apart from new offers from our ISP or cloud services provider, does SDN really imply a change in the way of understanding, designing, managing and monitoring networks?

We have to start by clarifying that SDN is an architectural approach, not a specific product. In fact, SDN is the result of applying the virtualization paradigm to the world of networks.

In general, virtualization seeks to separate the logical part from the physical part in any process. In server virtualization for example, we can create a fully functional server without having any particular physical equipment for it.

Let’s translate this paradigm to a basic function of a switch:

When a packet arrives at a switch, the rules built into its firmware tell the switch where to put the packet, so all the packets that share the same conditions are treated in the same way.

In a more advanced switch, we can define rules in a configuration environment through a command line interface (CLI), but we have to configure every switch in our platform one by one.

When applying virtualization, we have all the rules for all the switches (logical part) separated from the switches themselves (physical part). SDN applies this principle to all networking equipment.

Therefore, SDN proposes separating two levels:

  • Control Level: at this level, an application called the SDN Controller decides how packets have to flow through the network, and it also performs configuration and management activities.
  • Data Level: this level actually moves the packets from one point to another. Here we find the network nodes (any physical or virtual networking equipment). In SDN we say that traffic moves through the network nodes rather than towards or from them.

With those two levels defined, the idea is that network administrators can change any network rule when necessary by interacting with a centralized control console, without touching individual network nodes one by one.

This interaction defines a third level in the architecture:

  • Application Level: at this level we find programs that build an abstract view of the network for decision-making purposes. These applications deal with users’ needs, service requirements, and management.
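The separation between the control and data levels can be illustrated with a toy model in Python. This is only a sketch of the idea: the `Switch` and `Controller` classes, the rule format and the port names are all hypothetical, and no real SDN protocol or product API is involved.

```python
class Switch:
    """Data level: forwards packets using only the rules it has been given."""

    def __init__(self, name):
        self.name = name
        self.rules = {}  # destination prefix -> output port

    def forward(self, destination):
        return self.rules.get(destination, "drop")


class Controller:
    """Control level: holds the rules and pushes them to every switch."""

    def __init__(self, switches):
        self.switches = switches

    def install_rule(self, destination, port):
        for switch in self.switches:
            switch.rules[destination] = port


switches = [Switch("sw1"), Switch("sw2"), Switch("sw3")]
controller = Controller(switches)
controller.install_rule("10.0.0.0/24", "port-2")  # one change, applied everywhere

print([sw.forward("10.0.0.0/24") for sw in switches])
print(switches[0].forward("192.168.1.0/24"))  # no rule installed for this prefix
```

The point of the sketch is that the administrator calls `install_rule` once on the controller instead of configuring each switch separately.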

The following figure shows a basic model of SDN architecture:

[Figure: basic model of SDN architecture]

Finally there are two elements to mention:

  • Northbound API: these APIs allow communication between the SDN Controller and the applications running over the network. By using a northbound API, an application can program the network and request services from it. They enable basic network functions like routing, loop avoidance, security, and modifying or customizing network control, among others.

    Northbound APIs are also used to integrate the SDN Controller with external automation stacks and cloud operating systems like OpenStack, vCloud Director and CloudStack.

  • Southbound API: these APIs enable communication between the SDN Controller and the network nodes. The SDN Controller uses this communication to identify the network topology, determine traffic flows, define the behavior of network nodes and implement the requests received through the northbound API.

SDN was originally just about this separation of functions; however, the architecture has evolved to embrace the automation and virtualization of network services as well, in order to give network administrators the power to deliver network services wherever they are needed, regardless of the specific equipment required.

This automation implies that SDN-based networks have to detect changes in traffic flow patterns and select the best path based on parameters like application type, quality of service and security rules.

That concludes our brief introduction to SDN. If you want to go deeper, we recommend visiting the websites of the Open Networking Foundation and SDxCentral.

So, let’s go back to the original question: does SDN really imply a change in the way of understanding, designing, managing and monitoring networks?

Traditionally, network administrators have a very strong connection to the hardware; we usually configure every switch, router and firewall using a command line interface.

This “usual way of doing things” gives us deep knowledge of the platform. However, we have always agreed that this way of working is laborious, prone to errors and slow to accommodate changes. With SDN, we may have to think less about commands and configurations and more about rules and services.

On the other hand, virtualization has taken a long time to reach the world of networks, and even longer to make an impact in companies that are not Internet service providers or mega corporations.

So this change may be easier for IT teams that have experience with server virtualization and containers, and that have faced the challenges of the DevOps methodology (a topic we discussed previously on this same blog).

In terms of monitoring, the fundamental challenge is how to monitor networks given the complexity and transience that SDN implies. For example, how do we monitor application performance if the network topology can change several times a day?

There are some monitoring tools designed to be at an Application Level as part of Network Management Systems. Those tools face the problem of complexity, doing controller monitoring and regular network monitoring in the devices on the Data Level.

The real challenge with an agile structure is to identify the entry of new devices and automatically adjust the monitoring scheme.

Furthermore, troubleshooting on SDN-based networks requires considerable effort in interactivity and contextual analysis. In practice, it will not be enough to see the network as it is at a given moment; we will need to move forward and backward through the topology in order to identify the performance problems associated with routes and optimize the whole process.

Therefore, we can foresee a large amount of data extracted from the platform that must be stored and then filtered under a flexible visualization scheme.

Finally, we must say that some monitoring tools have already taken on many of the challenges mentioned here. Tools with flexible architectures and extensive experience in monitoring virtual environments can be successful. We invite you to discover the full scope of Pandora FMS in virtualized environments by visiting our website.

Technical writer with more than ten years of experience managing monitoring projects. He is a true enthusiast of yoga and meditation.

Pandora FMS v6.0 SP3 just released!

June 28, 2016, by steve

Once again we’ve been hard at work to make Pandora FMS work as well as it can for you, and we’re very proud to present Pandora FMS 6.0 SP3. Although SP2 came out only a few months ago, we felt it was still a little rough around the edges, and we haven’t rested until we felt comfortable with how this new version looks and performs. We are now confident that some of the most relevant trouble spots are fixed and working correctly, and we highly recommend that you update and install this new Service Pack.

ChatOps: when chatting becomes productive

March 16, 2016, by steve

As you may have noticed, we just released a chat bot plugin for Pandora FMS in order to adapt Pandora to the growing trend around ChatOps. Even so, you may still be wondering exactly what ChatOps is all about. I mean, hasn’t chat been around forever? Haven’t chat bots been present since the days of IRC? The answer is YES, but not like this. Now it’s all about the next step, and that’s ChatOps.

Pandora FMS 6.0 SP1 is out!

January 21, 2016, by Carla Andres

The first Service Pack of Pandora FMS 6.0 version is finally out. Here are the most significant changes, improvements and new features:

New features

  • New features in the API and CLI: create synthetic modules, manage planned downtimes, and manage Network, Data Server and Plugin modules to set inverse warning/critical intervals.
  • CSV report exportation has been added to the Reporting engine.
  • New ‘alert_template’ tag in agents’ XML files to subscribe a new module to an Alert Template.
  • Added Monthly SLA to the list of dynamic reports.
  • Added get_agents function to the CLI.
  • Added several functions to handle Planned Downtimes to the CLI and API.
  • Synthetic modules can now be created through CLI and API.
  • Added secondary server options to Satellite Server.

Improving performance of Pandora FMS

October 22, 2014, by steve

Introduction

The main goal of this article is to highlight the bottlenecks in running a system as resource-demanding as Pandora FMS. In order of relevance, they are:

  • CPU
  • Memory
  • Disk access
  • DB performance
  • Configuration of the Pandora FMS Server
  • Status of the Pandora FMS DB

Now we are going to look at the different analysis techniques for detecting problems in each of these areas. Solving each problem is beyond the scope of this article, which aims only to show how to identify the problem and give some clues about how to approach its solution.

Processor and disk access

vmstat
We will execute the “vmstat 1 10” command. Usually, the first line should be ignored, as it shows averages since boot rather than a current sample.

vmstat 1 10
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 10892 105036 404324 2540940 0 0 1 184 2 3 8 1 76 15 0
0 0 10892 104780 404324 2540936 0 0 0 32 557 641 5 2 92 1 0
1 0 10892 103788 404324 2540936 0 0 0 120 335 475 3 0 94 2 0
0 0 10892 103756 404324 2540936 0 0 0 36 361 489 5 0 94 1 0
1 0 10892 103384 404324 2540936 0 0 0 32 378 449 6 1 92 1 0
0 0 10892 103400 404324 2540936 0 0 0 0 465 664 1 0 99 0 0
1 0 10892 103860 404324 2540940 0 0 0 32 1439 1522 8 1 90 1 0
0 1 10892 106264 404324 2540948 0 0 0 112 9086 20506 9 1 87 2 0
0 0 10892 97052 404324 2540948 0 0 0 3704 9543 21045 13 2 77 9 0
0 0 10892 106956 404324 2540948 0 0 0 32 547 752 3 1 95 2 0

The most important columns are:

  • R: Number of threads in the run queue. These threads are ready to run but have no CPU available to execute them.
  • B: Number of blocked processes waiting for I/O.
  • US: CPU usage in user context (applications).
  • SY: CPU usage in system context (system calls).
  • WA: Percentage of time the processor sits idle because it is forced to wait for input/output operations.
  • CS: Context switches.
  • IN: Interrupts.

The number in “R” shouldn’t exceed 1-3 threads per processor. So a system with 2 processors should never exceed a value of 6; a higher value would mean that a lot of threads are queued for execution and a lot of work is pending.

If the number in CS is higher than the number in IN, it usually indicates a problem: the kernel has to perform a lot of context switches and spends most of its time on them. This is usually a sign that the system scheduler is overloaded. As a side effect, WA increases.

CPU usage: The right balance of CPU usage should be 70% user, 25-30% system and 0-5% idle.
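The two rules of thumb above (run queue and context switches) can be applied to a vmstat row automatically. The Python sketch below is only illustrative: the function names are hypothetical, the thresholds are the ones stated in this article rather than universal limits, and the input row is one of the samples shown earlier.

```python
VMSTAT_COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa st".split()


def parse_vmstat_line(line):
    """Map one numeric vmstat row to a dict keyed by column name."""
    return dict(zip(VMSTAT_COLUMNS, (int(v) for v in line.split())))


def check_sample(sample, cpus):
    """Return warnings based on the article's rules of thumb."""
    warnings = []
    if sample["r"] > 3 * cpus:
        warnings.append("run queue too long")
    if sample["cs"] > sample["in"]:
        warnings.append("more context switches than interrupts")
    return warnings


row = parse_vmstat_line("0 1 10892 106264 404324 2540948 0 0 0 112 9086 20506 9 1 87 2 0")
print(check_sample(row, cpus=2))
```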

mpstat
This command can be used to see how the load is balanced between the system’s CPUs. Execute the “mpstat -P ALL 1” command. The first line should be ignored.

12:17:19 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
12:17:20 all 0,75 0,00 0,00 0,00 0,00 0,00 0,00 0,00 99,25
12:17:20 0 1,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 99,00
12:17:20 1 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
12:17:20 2 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
12:17:20 3 1,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 99,00

12:17:20 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
12:17:21 all 6,00 0,00 0,50 0,00 0,00 0,00 0,00 0,00 93,50
12:17:21 0 7,00 0,00 1,00 0,00 0,00 0,00 0,00 0,00 92,00
12:17:21 1 9,90 0,00 0,99 0,00 0,00 0,00 0,00 0,00 89,11
12:17:21 2 8,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 92,00
12:17:21 3 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00

12:17:21 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
12:17:22 all 7,48 0,00 0,25 4,49 0,00 0,00 0,00 0,00 87,78
12:17:22 0 7,07 0,00 1,01 15,15 0,00 0,00 0,00 0,00 76,77
12:17:22 1 5,94 0,00 0,00 0,00 0,00 0,00 0,00 0,00 94,06
12:17:22 2 12,87 0,00 0,99 2,97 0,00 0,00 0,00 0,00 83,17
12:17:22 3 4,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 96,00

12:17:22 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
12:17:23 all 14,50 0,00 1,25 0,75 0,00 0,00 0,00 0,00 83,50
12:17:23 0 23,00 0,00 2,00 3,00 0,00 0,00 0,00 0,00 72,00
12:17:23 1 15,84 0,00 1,98 0,00 0,00 0,00 0,00 0,00 82,18
12:17:23 2 2,97 0,00 0,00 0,00 0,00 0,00 0,00 0,00 97,03
12:17:23 3 16,00 0,00 1,00 0,00 0,00 0,00 0,00 0,00 83,00

It is normal that the load is balanced between the different processors. If that isn’t the case then the system has a multiprocessing problem.

It is important to analyze disks from two perspectives: manufacturer information and real write speed.

To get information about the device we need to use the smartctl command:

smartctl -a /dev/sda

This will provide us with the manufacturer information and model. With that information we can estimate the IOPS of the device and its average write speed.

To measure the average write speed:

dd if=/dev/urandom of=testfileR bs=8k count=10000; sync;

Optimal values are between 50 and 100 MB/s; values between 20 and 30 MB/s are typical of relatively new devices. Below 10 MB/s the system is slow, and below 5 MB/s we don’t recommend continuing with the deployment because the performance is very poor.
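The arithmetic behind the dd test can be sketched in Python: the command writes 10,000 blocks of 8 KiB, and the speed is that amount divided by the elapsed time. The function names, the 4-second timing and the "acceptable" label for the 10-50 MB/s range are assumptions made for illustration; only the 5, 10 and 50 MB/s thresholds come from the text.

```python
def write_speed_mb_s(block_size_bytes, block_count, elapsed_seconds):
    """Throughput in MB/s, taking 1 MB as 1,000,000 bytes."""
    return block_size_bytes * block_count / elapsed_seconds / 1_000_000


def rate_disk(mb_per_s):
    """Classify a write speed using the thresholds quoted in the text."""
    if mb_per_s < 5:
        return "too slow to deploy"
    if mb_per_s < 10:
        return "slow"
    if mb_per_s >= 50:
        return "optimal"
    return "acceptable"  # assumed label: the text does not name the 10-50 MB/s range


# bs=8k count=10000, with a hypothetical elapsed time of 4 seconds
speed = write_speed_mb_s(8 * 1024, 10_000, elapsed_seconds=4.0)
print(round(speed, 2), rate_disk(speed))
```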

The write speed doesn’t necessarily correlate with IOPS, which relate more to write efficiency than to write speed. There is some correlation, though: disks that write quickly tend to have high IOPS.

Memory

vmstat
Use the vmstat command to get information about system memory, in particular swap usage:

vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 10948 904160 404324 2536700 0 0 1 184 2 3 8 1 76 15 0
0 0 10948 896568 404324 2536696 0 0 0 32 2620 3553 18 6 75 1 0
0 0 10948 898332 404324 2536700 0 0 0 36 329 461 2 0 97 1 0
1 0 10948 898332 404324 2536700 0 0 0 20 440 547 4 0 96 0 0
0 0 10948 898396 404324 2536736 0 0 0 0 270 301 4 0 96 0 0
1 0 10948 898372 404324 2536736 0 0 96 88 844 1495 6 0 93 2 0
0 0 10948 898492 404324 2536736 0 0 80 3644 499 781 6 0 84 10 0
0 0 10948 902860 404324 2536736 0 0 0 24 315 405 2 0 98 0 0
0 1 10948 902724 404324 2536736 0 0 48 52 1651 2942 16 1 81 2 0
0 0 10948 902700 404324 2536736 0 0 0 20 128 172 1 0 99 1 0

SI, SO: Swap In/Out. Any value other than 0 means that the system is swapping. On stable production systems, swap isn’t used at all. Swapping also means that the system is short of memory, so we need to adjust the MySQL configuration, the Pandora FMS configuration or any other elements that might be consuming it.

Database Performance

/etc/my.cnf

There are some key parameters in order to optimize the performance of MySQL. For more information visit the Pandora FMS documentation on MySQL optimization. Let’s start with these three:

 innodb_io_capacity = 75
 innodb_flush_log_at_trx_commit = 0
 innodb_flush_method = O_DIRECT

These three parameters are crucial. The last two should have the values shown above, while the value of innodb_io_capacity depends on the type of storage:

  • 5000 RPM disks or lower ~ innodb_io_capacity 75
  • 7200 RPM disks ~ innodb_io_capacity 100
  • 15000 RPM disks ~ innodb_io_capacity 180
  • Last generation SSD disks ~ innodb_io_capacity 240

pandora_dbstress

This is a diagnostic tool used to verify the data insertion capacity of a Pandora FMS installation, using the mechanisms of the Pandora FMS library (API) to access the data. First, find the following line in the script and replace the -1 with the ID of our agent:

         $target_agent = -1;

And then we execute the following command:

/usr/share/pandora_server/util/pandora_dbstress.pl /etc/pandora/pandora_server.conf

Pandora DB Stress tool 5.1dev Build 140602 Copyright (c) 2004-2014 Artica ST
This program is OpenSource, licensed under the terms of GPL License version 2.
You can download latest versions and documentation at http://www.pandorafms.org
[*] Working for agent ID 52610
[*] Generating data of 90 days ago
[*] Interval for this workload is 300
[*] Processing module Host Latency
[D] ID_AgenteModulo 341281 Interval 300 ModuleName Host Latency Days 90 Agent 198.27.73.105
-> Current rate: 0.12 modules/sec
-> Current rate: 358.95 modules/sec
-> Current rate: 387.78 modules/sec
-> Current rate: 411.94 modules/sec
-> Current rate: 426.88 modules/sec
-> Current rate: 359.93 modules/sec

For more accurate data, it is recommended to create a new module on this agent and delete the previous one. The tool will start to insert data into the modules of this agent, simulating data that could later be used for graphs and reports. By default, the tool inserts one month of data into all the modules of every agent in the installation. By setting the agent parameter, we force it to work only on the specified agent.

The average value of a Pandora FMS server should be above 300 mod/second. This tool can be used to check the system optimization.

Pandora_server configuration

/etc/pandora/pandora_server.conf

The proper configuration of the Pandora FMS server can increase its performance by up to 500%. Let’s make some easy checks to verify that it is correctly parameterized:

  • verbosity 1: Higher values are used for diagnosing problems, but a value higher than 1 will impact system performance.
  • network_timeout X: The default value is 3; it’s recommended to lower it when working on local networks. A high value (e.g. 10) can easily cause a lot of modules to go into “unknown”, because the server has to wait 10 seconds for each failed check.
  • server_threshold X: The default value is 5. In case of overload it can be raised to 10 or 20, but it should never be lowered below 3 or 4 (for lightly loaded servers and checks with small intervals).
  • server_keepalive 45: This parameter is used in environments with several Pandora FMS servers to detect when a server is down. It shouldn’t be modified.
  • xxxx_checks X: Number of retries that the network server performs (ICMP, SNMP, TCP). The default value is 1; in environments with many false positives it may be necessary to increase it to 2 or 3 at most, but this can hurt the performance of the network server.
  • xxxx_timeout: Similar to network_timeout. Increasing the default values can sometimes decrease performance. Lowering them can produce false positives or gaps in monitoring.
  • xxxx_threads: The total number of threads across all these options shouldn’t exceed 30-40.
  • dataserver_threads: The value should be between 1 and 5.
  • max_queue_files 500: This value shouldn’t be changed.

/var/log/pandora

A simple glance at this directory can help to detect problems. The logs shouldn’t have large sizes:

# ls -lah /var/log/pandora/
 total 356K
 drwxr-xr-x. 2 pandora root 4,0K jul 21 03:17 .
 drwxr-xr-x. 13 root root 4,0K jul 20 03:33 ..
 -rw-r--r--. 1 root root 983 oct 22 2013 pandora_agent.log
 -rw-rw-rw-. 1 root root 32K jul 23 19:33 pandora_server.error
 -rw-rw-rw- 1 root root 2,1K jul 21 03:17 pandora_server.error-20140721.gz
 -rw-rw-rw- 1 root root 44K jul 23 19:27 pandora_server.log
 -rw-rw-rw- 1 root root 65K jun 14 18:17 pandora_server.log.old
 -rw-rw-rw- 1 root root 176K jul 23 19:33 pandora_snmptrap.log
 -rw-rw-rw- 1 root root 10 jul 23 19:34 pandora_snmptrap.log.index

A log with a size of over 50 MB should be rotated or deleted.

Pandora FMS database

To check the state of the database, we are going to run the Pandora FMS diagnostic tool:

Setup->Diagnostic Info

We should look at the following values:

  • Table tagent_access: It shouldn’t exceed 250,000 records.
  • Table tagente_datos: It shouldn’t exceed 5-10 million records.
  • Table tagente_datos_string: It shouldn’t exceed 2-4 million records.
  • Table tagente_estado: It shouldn’t exceed 100,000 records.
  • Table tevento: It shouldn’t exceed 250,000 records.
  • Table tsesion: It shouldn’t have more than 50,000 records.
  • PandoraDB Last run: The date shouldn’t be more than 24 hours before the current date.

Values outside the specified thresholds can indicate a problem, an oversized installation or an imbalance in the system configuration.
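These limits lend themselves to an automated check. The Python sketch below compares row counts against the upper end of each range listed above. The function name and the sample counts are hypothetical, and obtaining the counts (for example by reading them off the Diagnostic Info page) is left to the reader.

```python
# Upper bounds taken from the list above (upper end of each quoted range).
MAX_ROWS = {
    "tagent_access": 250_000,
    "tagente_datos": 10_000_000,
    "tagente_datos_string": 4_000_000,
    "tagente_estado": 100_000,
    "tevento": 250_000,
    "tsesion": 50_000,
}


def oversized_tables(row_counts):
    """Return the tables whose row count exceeds its recommended maximum."""
    return sorted(
        table for table, rows in row_counts.items()
        if table in MAX_ROWS and rows > MAX_ROWS[table]
    )


# Hypothetical counts, for illustration only
counts = {"tagent_access": 310_000, "tevento": 12_000, "tsesion": 50_001}
print(oversized_tables(counts))
```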

Pandora FMS moves to GitHub!

September 22, 2014, by steve

After almost ten years working with conventional code repositories, first CVS and then Subversion, we have come to appreciate the advantages and improvements of working with a system as advanced as Git. We have been working on relocating all the Pandora FMS code to this platform since September, and we’ve also adopted the GitFlow methodology for managing our day-to-day work.

You can find our new code repository on the link below:

https://github.com/pandorafms

As a reminder, some basic commands to download the latest code from GitHub:

  • Download the code (master branch, the official one, by default):

    git clone https://github.com/pandorafms/pandorafms.git

  • Switch to the development branch:

    git checkout develop

  • Update your local copy with changes from the central server (on GitHub):

    git pull

Warning: The SVN repository will be available for a couple of months, and we will keep it updated from our Git repository, but within a year it will no longer exist.

False Positives in Monitoring

September 10, 2014, by steve

False positives (as well as false negatives) are a recurring issue in our monitoring experience, and we are convinced that they are worth talking about.

The best way to approach a problem is with an example. Suppose we use Pandora FMS to monitor a network with 500 servers, and we have defined a connectivity check (ping) for each IP. Most of the time all checks appear in green; however, now and then, and at random, some check appears in red. When we notice this, we run the ping manually and confirm that it works perfectly.

The initial conclusion is that our monitoring system, in this case Pandora FMS, is failing. What is really happening is that our monitoring system is not configured as it should be, and that is exactly where the problem lies.

To test it, we just have to ping one of the IPs that sometimes fail and leave it running for hours. We will see that occasionally, in 1 of 1,000 checks or even 1 of 10,000, the ping fails. We shouldn’t worry about that, because it is relatively common for networks to behave that way.
The following screenshot shows how our entire monitoring system is in green and yet a ping from the console fails. If Pandora FMS had been running a check at that precise moment, the check would probably have turned red.

[Screenshot: monitoring console in green while a ping from the command line fails]

All monitoring systems have several parameters to control this behavior. Maybe we are interested in having the maximum detail level, as Pandora FMS does by default, or otherwise we want to attenuate the detail to avoid warning at the minimum failure. Below we list many control mechanisms available in Pandora FMS (also available in other monitoring systems) to avoid this kind of behavior:

  • Number of checks: Sometimes the first ping fails but the second one works, which is why almost all systems allow a number of retries. There have been cases where the first ping always failed and the check only worked when we pinged continuously with three retries. In these (infrequent) cases, the best option is to use an adapted check (a custom plugin) instead of the standard one.
  • Timeout: When checking remote systems, we may need to increase the response timeout. On a LAN, one second is more than enough; over the Internet, a very low timeout will probably produce many false positives. On the other hand, a high timeout, 10 seconds for example, drags down the capacity of our server, because in the worst case it has to wait the full 10 seconds on every check of a system that is not responding.
  • Packet loss sensitivity: It may be hard to believe, but different ping tools behave differently, and even the same ping tool behaves differently on different systems. We can't compare the results of tools like ping, fping, hping or nmap, because they return different values. Some monitoring tools allow this behavior to be tuned, so we need to know whether ours has settings for packet loss tolerance or transmission speed (related to the Timeout and Number of checks parameters). A bad configuration can produce false positives. In an extreme case, an intolerant configuration makes our monitoring tool report failures on a network whose packet loss is negligible for other tools. This is a real case with Pandora ICMP Enterprise server, using the T3 parameter in the Nmap scan, in which some systems randomly fail to respond because of a packet loss that most conventional monitoring systems would consider negligible.
  • Flipflop: The phenomenon in which an element that usually behaves in a stable way “bounces” more or less regularly. To keep these bounces from distorting how we perceive the value, we set a bounce threshold. Since the value sometimes has “peaks”, we only assume there is a problem when the failure happens twice.
  • Flipflop threshold: To avoid waiting for the whole monitoring interval to finish, we set the flipflop threshold so that the element is re-checked sooner. This way, if something fails we know right away. It's usually combined with the previous parameter (Flipflop) so that a failure is confirmed in a shorter time; in Pandora FMS this is called Intensive Monitoring.

In the previous example we set the flipflop threshold to 1 and the flipflop interval to 30 seconds, so that if anything fails we are made aware and the test is repeated after 30 seconds. If it fails again, we consider the element down and send an alert; if not, we consider it a false positive and no alert is sent.
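The confirmation logic described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: `confirm_failure`, its parameters and the simulated check are hypothetical names, not part of Pandora FMS, and the single retry after 30 seconds simply mirrors the values of the example.

```python
import time
from typing import Callable


def confirm_failure(check: Callable[[], bool],
                    retries: int = 1,
                    retry_interval: float = 30.0,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Return True only if the check fails and every confirmation retry fails too."""
    if check():
        return False              # first check passed: nothing to confirm
    for _ in range(retries):
        sleep(retry_interval)     # wait before repeating the test
        if check():
            return False          # recovered: treat the failure as a false positive
    return True                   # failed on every attempt: the element is down


if __name__ == "__main__":
    # Simulated check: one failure followed by a success (hypothetical data).
    results = iter([False, True])
    is_down = confirm_failure(lambda: next(results), sleep=lambda seconds: None)
    print("send alert" if is_down else "false positive, no alert")
```

Passing the `sleep` function as a parameter keeps the sketch easy to try without actually waiting 30 seconds.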

In conclusion, before claiming that our system has false positives, it’s important to review and properly set up all those elements in our monitoring software to avoid unnecessary alerts.


Agent, Development, Pandora FMS

Graphs in Pandora FMS

July 23, 2014 — by steve0

Nowadays, the best way to resolve a technical problem in our systems or devices is to solve it before it even happens. This is where a proper prevention system is required.

The prevention of problems in any electronic component is based on our capability to monitor it and to understand the collected information. In Pandora FMS we can observe this information in graphs.
Graphs are one of the most complex implementations in Pandora FMS. They gather information in real time from the DB, and no external system (rrdtool or similar) is used.
The behavior of a graph depends on the type of data:

  • Asynchronous modules. There is no data compaction: only the real samples of the data are stored. This produces more “exact” graphs, without possible misinterpretation.
  • Text string modules. Shows the rate of the gathered data.
  • Numerical modules. Most modules report such data.
  • Boolean modules. These are the numerical data of *PROC modules, for instance ping checks, interface status, etc. 0 means “Wrong”, 1 means “Normal”. Events are fired automatically when they change state.

Compression
Compression affects how the graphs are drawn. When we receive two consecutive data with the same value, Pandora FMS doesn't store the second one; it assumes that the last known value still applies to the present time until another value arrives. When a graph is drawn and there is no reference value right at its start, Pandora FMS searches up to 48 hours back in time for the last known value to take as reference. If it doesn't find anything, it starts from 0.
In asynchronous modules, although there is no compression, the backwards search algorithm behaves similarly.
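As a rough illustration of this behavior, here is a minimal Python sketch. It is not the Pandora FMS implementation: the function names, the in-memory list of `(timestamp, value)` pairs and the sample data are all assumptions made for the example. Only the rule of skipping repeated values, the 48-hour search and the fallback to 0 come from the description above.

```python
from bisect import bisect_right
from typing import List, Tuple

LOOKBACK_SECONDS = 48 * 3600  # how far back the reference value is searched

Sample = Tuple[int, float]    # (unix timestamp, value)


def compress(samples: List[Sample]) -> List[Sample]:
    """Store a sample only when its value differs from the last stored one."""
    stored: List[Sample] = []
    for timestamp, value in samples:
        if not stored or stored[-1][1] != value:
            stored.append((timestamp, value))
    return stored


def value_at(stored: List[Sample], timestamp: int) -> float:
    """Last known value at `timestamp`, searching at most 48 hours back."""
    times = [t for t, _ in stored]
    index = bisect_right(times, timestamp) - 1
    if index >= 0 and timestamp - stored[index][0] <= LOOKBACK_SECONDS:
        return stored[index][1]
    return 0  # nothing found within the window: start from 0


if __name__ == "__main__":
    raw = [(0, 5), (300, 5), (600, 7), (900, 7)]  # hypothetical readings
    stored = compress(raw)
    print(stored)
    print(value_at(stored, 450))
```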

Interpolation
When composing a graph, Pandora FMS takes 50×N samples, where N is the resolution factor of the graphs (this value can be configured in the setup). A monitor that gathers data every 300 seconds (5 minutes) will have 12 samples per hour, and 12×24 = 288 samples in a day. So when we ask for a graph of a day, we are not plotting 288 values; we are “compressing” or interpolating the graph using only 50×3 = 150 samples (by default, the graph resolution in Pandora FMS is 3).
This means that we lose some resolution, and the more samples there are, the more we lose. When we have a lot of values, for instance the 2016 samples of a week or the 8640 samples of a 30-day month, we must compress them into the 150 samples of a graph. This is why we sometimes lose detail, and why the graphs can be queried with different intervals and zoomed in or out.
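The arithmetic is easy to check with a short Python sketch. The function names are ours, chosen for the example; the 300-second interval, the base of 50 samples and the default resolution of 3 are the values quoted in the text.

```python
DAY = 24 * 3600  # seconds in a day


def raw_samples(period_seconds: int, interval_seconds: int = 300) -> int:
    """Number of samples a module gathers during the given period."""
    return period_seconds // interval_seconds


def graph_samples(resolution: int = 3, base: int = 50) -> int:
    """Number of samples actually drawn in a graph (50 x N)."""
    return base * resolution


if __name__ == "__main__":
    drawn = graph_samples()
    for label, period in [("day", DAY), ("week", 7 * DAY), ("30-day month", 30 * DAY)]:
        raw = raw_samples(period)
        print(f"{label}: {raw} raw samples drawn as {drawn}, "
              f"about {raw / drawn:.1f} raw samples per drawn sample")
```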

Screenshot 2014-07-23 at 14.54.25

In normal graphs, the interpolation is implemented in a simple way: if we have two samples within an interval (e.g. interval B of the example), we take the average and draw that value.

In boolean graphs, if we have several data within a sample (each can only be 1 or 0), we take the pessimistic approach and draw 0. This helps visualize failures within an interval, giving priority to showing the problem over the normal status.
In both cases, if we don't have any data within a sample (because it's compressed or because it's missing), we use the last known value of the previous interval, as interval E of the above example shows.
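The three rules (average for normal graphs, pessimistic 0 for boolean graphs, last known value for empty intervals) can be sketched as follows. This is a simplified illustration, not the actual Pandora FMS code: the function name, its parameters and the sample data are assumptions, and we take "last known value" to mean the most recent raw datum seen in an earlier interval.

```python
from statistics import mean
from typing import List, Tuple


def interpolate(samples: List[Tuple[float, float]], start: float, end: float,
                slots: int, boolean: bool = False, initial: float = 0) -> List[float]:
    """Reduce (timestamp, value) samples to one drawn value per graph interval."""
    width = (end - start) / slots
    buckets: List[List[float]] = [[] for _ in range(slots)]
    for timestamp, value in samples:
        if start <= timestamp < end:
            index = min(int((timestamp - start) / width), slots - 1)
            buckets[index].append(value)

    drawn: List[float] = []
    last_known = initial
    for bucket in buckets:
        if bucket:
            # Boolean data: any 0 in the interval wins. Numeric data: average.
            drawn.append(min(bucket) if boolean else mean(bucket))
            last_known = bucket[-1]
        else:
            drawn.append(last_known)  # empty interval: reuse the last known value
    return drawn


if __name__ == "__main__":
    numeric = [(0, 10), (10, 20), (25, 30), (70, 40)]   # hypothetical data
    status = [(0, 1), (5, 0), (30, 1)]                  # hypothetical 1/0 checks
    print(interpolate(numeric, 0, 100, 5))
    print(interpolate(status, 0, 100, 5, boolean=True))
```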

Avg/Max/Min 

graficas

By default, the graphs show the average, maximum and minimum values. Because a sample (see “Interpolation”) can contain several data, we show the average of the data, the maximum and the minimum.

The longer the period we are visualizing and the more data we have, the higher the interpolation level, and therefore the greater the difference between the maximum and minimum values. With a short graph range (an hour or so) there is no interpolation, or very little, so we see the data at its “real” resolution and the three series are identical.
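A tiny Python sketch shows why the three series coincide on short ranges and diverge on long ones. The function name and the sample values are hypothetical; the only idea taken from the text is that each drawn sample summarizes the data falling into it.

```python
from statistics import mean
from typing import List, Tuple


def avg_max_min(data_in_sample: List[float]) -> Tuple[float, float, float]:
    """Average, maximum and minimum of the data that fall into one drawn sample."""
    return mean(data_in_sample), max(data_in_sample), min(data_in_sample)


if __name__ == "__main__":
    # Short range: one datum per drawn sample, so the three series are identical.
    print(avg_max_min([5]))
    # Long range: several data per drawn sample, so the series spread apart.
    print(avg_max_min([1, 2, 9]))
```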


Agent, Database, Plugins

How to monitor Apache Hbase

July 17, 2014 — by steve0

In this brave new world of big data, a database technology modeled on “Bigtable”, such as Apache HBase, would seem to be worth considering, particularly since Bigtable is the creation of engineers at Google, a company that should know a thing or two about managing large quantities of data. Now with Pandora FMS you can monitor Apache HBase’s performance settings.

What is Hbase?

hbase_logo

HBase is an open source, non-relational, distributed database modeled after Google’s BigTable and written in Java. It is developed as part of Apache Software Foundation’s Apache Hadoop project and runs on top of HDFS (Hadoop Distributed Filesystem), providing BigTable-like capabilities for Hadoop. That is, it provides a fault-tolerant way of storing large quantities of sparse data. For more information on Apache Hbase, visit the official Apache Hbase webpage: http://hbase.apache.org/

How to collect data

Pandora FMS uses an agent installed on the machine where Hbase is installed to execute local tests and send the results over to the server in XML format.

The Hbase plugin returns 18 modules, all of which provide valuable status information. You can set thresholds manually to determine whether something is in a warning, critical or normal condition.

  • Hbase Alive: Shows whether Hbase is running. If this goes critical, the rest of the modules won’t be created.
  • Hbase Connections: Displays the number of network connections to the database.
  • Hbase CPU Usage: The percentage of CPU used by HBase.
  • Hbase Memory Usage: The percentage of Memory used by Hbase.
  • Hbase Heap Memory Used: The percentage of heap memory used by Hbase.
  • Hbase Process State: The state of Hbase process.
  • Hbase Tables: Number of tables in Hbase.
  • Hbase Time in CPU: The CPU time used by the Hbase process so far.
  • Hbase/Region Servers Online: Number of RegionServers online right now.
  • Hbase/Region Server Request per second: Requests per second handled by the RegionServer.
  • Hbase Log Warning Messages: Number of warning messages in Hbase Log.
  • Hbase Log Errors: Number of error messages in Hbase log.
  • Hbase/Region Server Cache Hit Ratio: Block cache hit ratio (0 to 100%) of the RegionServer.
  • Hbase/Region Server Flush Queue Size: Point-in-time number of enqueued regions in the MemStore awaiting flush.
  • Hbase/Region Server Compaction time: Point-in-time length of the compaction queue. This is the number of Stores in the RegionServer that have been targeted for compaction.
  • Hbase/Region Server Memstore Size: Point-in-time sum of all the memstore sizes in the RegionServer.
  • Hbase/Region Server Read Request: Number of read requests for the RegionServer.
  • Hbase/Region Server Write Request: Number of write requests for the RegionServer.
  • Hbase/Region Server Number of Online Regions: Number of regions served by the RegionServer.
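To give an idea of what "sending the results in XML format" looks like, here is a minimal Python sketch that prints one module. It assumes the usual layout of Pandora FMS agent plugin output (a `module` element with `name`, `type` and `data` children); the helper name `module_xml` and the sample values are ours, and the exact schema should be checked against the plugin documentation for your agent version.

```python
import xml.etree.ElementTree as ET


def module_xml(name: str, module_type: str, value, description: str = "") -> str:
    """Build the XML for one module (tag layout is an assumption, see above)."""
    lines = [
        "<module>",
        f"<name><![CDATA[{name}]]></name>",
        f"<type><![CDATA[{module_type}]]></type>",
        f"<data><![CDATA[{value}]]></data>",
    ]
    if description:
        lines.append(f"<description><![CDATA[{description}]]></description>")
    lines.append("</module>")
    return "\n".join(lines)


if __name__ == "__main__":
    xml_text = module_xml("Hbase Alive", "generic_proc", 1,
                          "Shows whether Hbase is running")
    ET.fromstring(xml_text)  # raises ParseError if the XML is not well formed
    print(xml_text)
```

A boolean check such as Hbase Alive would typically use a *PROC type, while counters and percentages would use a numeric type.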

How to configure the plugin

To configure this plugin correctly, follow these steps:
Move the hbase_plugin.pl file from the default download directory to the /etc/pandora/plugins/ directory.
Assign the necessary permissions to the hbase_plugin.pl script:

chmod +x hbase_plugin.pl

At the end of the pandora_agent.conf file add the following line:

module_plugin /etc/pandora/plugins/hbase_plugin.pl

Restart the Pandora FMS agent process by running:

sudo /etc/init.d/pandora_agent_daemon restart

This plugin also needs a specific Hbase configuration in order to retrieve the right information through the Hbase information server.

First of all, Hbase must be unpacked in /etc. The folder must be called “hbase”; all the files used by Hbase will be in that folder.
Before starting Hbase, modify its configuration file, located in /etc/hbase/conf, according to your needs.
Edit or add the following lines in hbase-site.xml, between the “configuration” tags (<configuration> and </configuration>):

<property>
<name>hbase.master.info.port</name>
<value>16010</value>
</property>
<property>
<name>hbase.master.info.bindAddress</name>
<value>127.0.0.1</value>
</property>
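If you want to double-check the result, the following Python sketch reads the two properties back and builds the address of the information server. It is only an illustration: the function names are ours, the embedded `SAMPLE` text stands in for the real /etc/hbase/conf/hbase-site.xml file, and the fallback values used when a property is missing are assumptions.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Stand-in for the contents of hbase-site.xml after the edit described above.
SAMPLE = """<configuration>
  <property>
    <name>hbase.master.info.port</name>
    <value>16010</value>
  </property>
  <property>
    <name>hbase.master.info.bindAddress</name>
    <value>127.0.0.1</value>
  </property>
</configuration>"""


def read_properties(xml_text: str) -> Dict[str, str]:
    """Return every <property> of the configuration as a name -> value mapping."""
    root = ET.fromstring(xml_text)
    return {prop.findtext("name"): prop.findtext("value")
            for prop in root.iter("property")}


def info_server_url(properties: Dict[str, str]) -> str:
    """Address of the information server (fallback values are assumptions)."""
    host = properties.get("hbase.master.info.bindAddress", "127.0.0.1")
    port = properties.get("hbase.master.info.port", "16010")
    return f"http://{host}:{port}/"


if __name__ == "__main__":
    print(info_server_url(read_properties(SAMPLE)))
```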

Extracted Information

General view of the modules in the Pandora Console:

hbase modules

Below we can see a graph with the gathered information by a module:

hbase memory usage


Development, Pandora FMS, Plugins

How to monitor Raven DB

July 17, 2014 — by steve0

A new plugin has just been created: Pandora FMS can now monitor Raven DB’s performance settings. Raven DB is a NoSQL database.

What is NoSQL?

nosql

A NoSQL or Not Only SQL database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases. Motivations for this approach include simplicity of design, horizontal scaling and finer control over availability.

What is Raven DB?

RavenDB is a transactional, open source document database written in .NET, offering a flexible data model designed to address requirements coming from real-world systems. RavenDB allows you to build high-performance, low-latency applications quickly and efficiently.

Data in RavenDB is stored schema-less as JSON documents, and can be queried efficiently using LINQ queries from your .NET code or through a RESTful API from other tools.

RavenDBliconBurgandy_6

Internally, RavenDB makes use of indexes which are automatically created based on your usage, or created explicitly by the consumer.

RavenDB is built for web-scale, and offers replication and sharding support out-of-the-box. For more information on Raven DB visit the official Raven DB webpage: http://ravendb.net/

How to collect data

Pandora FMS uses an agent installed on the machine where Raven DB is installed to execute local tests and send the results over to the server in XML format.

This plugin returns 12 modules, all of which provide valuable status information. You can set thresholds manually to determine whether something is in a warning, critical or normal condition.

• RavenDB_Server_Process_Running: Informs if the process is active

• RavenDB_Server_Process_PID: Shows the PID of the process

• RavenDB_Server_Process_Memory_Usage: Shows the memory usage of the process.

• RavenDB_Server_Process_CPU_Usage_Percentage: Shows the percentage of CPU used by the process.

• RavenDB_Server_Process_Sessions: Returns the number of Sessions for the Process

• RavenDB_Server_Process_Session_Name: Returns the specific session name for the process

• RavenDB_Database_Number: Returns the number of Databases created in Raven DB

• RavenDB_Database_Size: Shows the Database size.

• RavenDB_Uptime: Returns the uptime for the Raven DB

• <Database>_Documents: Returns the number of documents in the database, where <Database> is the name of each database.

• <Database>_Requests_Per_Second: Returns the number of requests per second for the database.

• <Database>_Concurrent_Requests: Returns the number of concurrent requests for the database.
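Since the last three modules are created once per database, the final number of modules depends on how many databases exist. The following Python sketch shows how the names expand; the function name and the database names are hypothetical, and only the three suffixes come from the list above.

```python
from typing import List

# Suffixes of the per-database modules listed above.
PER_DATABASE_SUFFIXES = ("Documents", "Requests_Per_Second", "Concurrent_Requests")


def per_database_modules(databases: List[str]) -> List[str]:
    """Expand <Database>_<suffix> for every database name."""
    return [f"{database}_{suffix}"
            for database in databases
            for suffix in PER_DATABASE_SUFFIXES]


if __name__ == "__main__":
    # "Orders" and "Customers" are made-up database names.
    for module_name in per_database_modules(["Orders", "Customers"]):
        print(module_name)
```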

How to configure the plugin

To install the plugin in Pandora FMS, copy it to the following folder:

C:\Program Files (x86)\pandora_agent\util

Once done, edit the Pandora FMS agent configuration file located in:

C:\Program Files (x86)\pandora_agent\pandora_agent.conf

Add the following line:

module_plugin cscript.exe //B "%ProgramFiles%\Pandora_Agent\util\RavenDB_Plugin.vbs"

Extracted information

Below you can see a general overview of the modules created in the agent.

modulos ravendb

Below we can see a graph with the gathered information by a module:

 

ravendbmemusage

 


Pandora FMS, Release

Improved Visual Console

July 11, 2014 — by steve0

In the huge world of technology, monitoring your systems has come a long way. No longer do we have to wonder how to sleep calmly without knowing whether our servers and network devices are working properly.

This is where the capability to build a proper report and to see specific information in real time takes a step forward. No longer must we stare at a terminal, desperately trying to make the most of the gathered information. Pandora FMS allows you to create visual maps, so that each user can build their own monitoring map.

Within the new visual console, we’ve been successful in imitating the sensation and touch of a drawing application. We’ve also simplified the editor by dividing it into several subject-matter tabs named ‘Data’, ‘Preview’, ‘Wizard’, ‘List of Elements’ and ‘Editor’.

Data

Within the Data tab, you can create and edit the visual console’s basic data. It is the only tab visible for a new map until you save it. The essential values within this tab are the visual console’s name, the group for ACL management and the background image. On creation, the size of the visual console is determined by the size of the background image. If you change the background, the last user-defined size or that of the previous background is kept.

The background images are stored within the Pandora Console directory under ‘var/www/pandora_console’ in the ‘/images/backgrounds/’ directory.

Preview

The visual console view is a static view, so if the state of the elements it contains changes, they are not drawn again. It is the same view that is available from the Visual Console menu.

Wizard

This tab contains a small form to create several elements of the static-image type in the visual console at once, with only two clicks.

As you can see in the picture below, the form consists of the following:

  • The image which will be the same for all the elements created in the batch.
  • The distance between the elements, which are placed one after another in a horizontal line starting from position ‘0,0’.
  • The agent selection box, to select one or several agents. Whether you select one or several agents, the batch elements are created in the visual console.
  • The module selection box, a dynamic control that is filled with the modules of the agents you picked in the agent selection box. Here you pick the modules for which static image elements will be drawn in the visual console.

750px-Pandora_new_visual_console,_tab_wizard
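The placement rule of the wizard (a horizontal line starting at ‘0,0’ with a fixed distance between elements) can be sketched in Python as follows. The function name and the reading of "distance" as the gap between the origins of consecutive elements are our assumptions, not taken from the Pandora FMS source.

```python
from typing import List, Tuple


def wizard_positions(count: int, distance: int,
                     origin: Tuple[int, int] = (0, 0)) -> List[Tuple[int, int]]:
    """Positions of `count` elements laid out left to right from the origin."""
    x0, y0 = origin
    return [(x0 + index * distance, y0) for index in range(count)]


if __name__ == "__main__":
    # Three elements, 50 pixels apart (hypothetical values).
    print(wizard_positions(3, 50))
```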

List of Elements
This tab provides a form listing the elements of the visual console you’re presently editing. It is laid out as a table of elements and offers a quick way of editing them. It’s also a useful tool for users who need to adjust the values of a specific element.

The supported actions within this form are editing and deleting elements. Creating elements and changing an element’s type is not supported here; these actions have to be carried out under the ‘Editor’ and ‘Create’ tabs.

The first line is the background image’s configuration.

The rest of the lines are map elements, arranged in groups of two lines and separated by a horizontal black line, as shown in the picture below.

600px-Pandora_new_visual_console,_tab_list_elements

Editor

This tab contains most of the visual console editor’s functionality, because this is where you create, edit and position the elements. It’s a dynamic page, so your browser must properly support JavaScript. As you can see in the picture below, the window is divided into well-defined areas: the button box, the work area (within which the visual console is drawn) and the options palette (which isn’t visible in the picture).

editor

From version 5.1 on, thanks to improvements in the visual console, each user can include complex graphs (multi-string), add custom HTML code and share public URLs via QR codes.

Custom graphs background

 


Development, Pandora FMS, Release

Pandora FMS 5.1 RC2 Is Out

June 27, 2014 — by steve0

Pandora FMS 5.1 RC2 has just been released.

The official release of the stable version is just around the corner! This new version primarily includes:

  • Increased system stability
  • Fixes for some minor problems and bugs, to improve your experience with Pandora FMS

 

You can read the news here.

Do you want to be a beta tester of this version, including the new Enterprise features? It’s really simple: contact us and we will tell you how.

 


Pandora FMS

Pandora FMS 4.0 FEATURES VIDEO

September 26, 2011 — by steve0

Hello, we are proud to release the Pandora FMS 4.0 features video: a short video covering all the important features of our new Pandora FMS version.

We used the following technologies:

– Smartphone with Android OS as the camera to record the real sequences.
– Kdenlive (GPL2 licensed) for the complete editing and effects.
– Xvidcap (GPL licensed) to capture the video demonstrations and animations.
– GIMP 2.6.8 (GPL2 licensed) for graphic treatment.
– Gource application (GPL3 licensed) to generate the organic animation from the SourceForge tracker.
– Google Maps API to generate the world map’s animation.

Uncategorized

Bug #3000000

May 11, 2010 — by steve2

In SourceForge, the bug ID is unique across the whole site rather than per project, so it reflects the total number of bugs filed in SourceForge. Today, in the Pandora FMS project, a bug has been created with ID 3000000:

Congratulations Pandora FMS!!

After this senseless shock, we must return to the hard work :D

debian, Pandora FMS

Pandora FMS deb packages

November 27, 2008 — by steve11

We are trying to make Pandora FMS installation as easy as possible, so we are building deb packages that anyone can use. As a preview of what is to come:

apt-cache show pandorafms-console
Package: pandorafms-console
Priority: extra
Section: admin
Installed-Size: 12628
Maintainer: Mario Izquierdo (mariodebian)
Architecture: all
Source: pandorafms
Version: 2.0-4
Depends: apache2, graphviz, libapache2-mod-php5, mysql-server, php-db, php-gettext, php-pear, php-xmlrpc, php5, php5-gd, php5-mysql, php5-snmp
Filename: ./pandorafms-console_2.0-4_all.deb

Installation time after downloading: 2 minutes (depending on your system hardware).