Application Performance Monitoring: a methodology for traffic analysis of multitier applications.
Application Performance Monitoring: an introduction to traffic analysis.
As the people responsible for IT, one of our main concerns is undoubtedly monitoring the performance of business applications.
If we are facing problems such as low performance, recurring failures or even wrong output in our applications, a deep analysis of the application's traffic may be necessary to find the root cause of the problem. If you are interested in Root Cause Analysis as a troubleshooting method, you can read our post on the subject.
Going deeper is not always easy. We need basic elements such as knowledge of TCP and UDP sequencing, and the right tools to collect traffic data, correlate it into a usable format and present it in a friendly way.
Furthermore, the key challenge is how the application is designed and implemented. For a simple client-server application the analysis can be fairly easy, but for applications designed on a Multitier architecture it requires more effort.
In this article we will try to sketch out a methodology for application performance monitoring based on traffic analysis, but first let's clarify some concepts about Multitier architecture.
Multitier architecture and Multi Layer architecture are often confused. Both terms refer to an architecture in which the application's main functions are:
- Data administration
These functions are developed separately, which gives developers the option to modify or add a function without touching the entire application.
However, the difference between the two is that the term Tier refers to a physical separation, while the term Layer refers to a logical separation.
A typical 3-tier architecture for a web application can therefore look like this:
As we said, in this article we will try to sketch out a methodology for application performance monitoring based on traffic analysis. This methodology works for applications developed on a Multitier architecture.
Establish the application’s architecture
If we have a performance problem in a specific application, one fundamental step is to understand its architecture. To establish it, we need to consider the following:
- Each server and service involved.
- Relationship between servers.
- The users' experience when they use the application.
- The network platform that supports the application.
In complex environments this step can be challenging. For example, let's say we need to establish the architecture of a web application:
- We can start working with the regular 3-tier architecture.
- Considering the user experience, we find that the user profile is defined through a connection between the web server and a corporate LDAP server.
- Once validated, users can enter the application, which resides on an app server.
- Now we see that the data is divided between two servers, one legacy server and a new database server.
- Finally, we review the platform and see that the web server resides in a cloud service, so the architecture of our application is this:
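The architecture found in this walkthrough can also be written down as a small data structure, which is useful when the diagram is not at hand. The following is only a minimal sketch in Python. The node names (`client`, `web_server` and so on), the `links` helper and the exact connections between the app server and the two data servers are assumptions made for illustration; they do not come from any monitoring tool.

```python
# Hypothetical model of the example architecture described above.
# Each key is a component; each value lists the components it talks to.
ARCHITECTURE = {
    "client": ["web_server"],
    "web_server": ["ldap_server", "app_server"],  # web server hosted in a cloud service
    "ldap_server": [],
    "app_server": ["legacy_server", "database_server"],  # assumed connections
    "legacy_server": [],
    "database_server": [],
}


def links(architecture):
    """Return every (source, destination) connection as a sorted list."""
    return sorted(
        (src, dst) for src, dsts in architecture.items() for dst in dsts
    )


if __name__ == "__main__":
    for src, dst in links(ARCHITECTURE):
        print(f"{src} -> {dst}")
```

Each connection printed by this sketch is a candidate point for collecting traffic data.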
Now think for a moment how difficult it can be to establish the architecture if we add elements such as server clusters, virtual machines, VLANs, etc.
A detailed review of the structure will always be necessary, although, luckily for us, many monitoring tools help by automatically discovering the structure of applications and showing it in a consolidated console.
This shows once again the importance of choosing the right monitoring tool. You can get more information about Pandora's application performance monitoring and its benefits here:
Finally, all the effort dedicated to this step will be valuable, because defining and establishing a detailed architecture of our application will allow us to do important things like:
- Defining the key data collection points, thinking about which elements are critical and which are not, and even how critical they are.
- Establishing a strategy for regular application performance monitoring by setting a group of performance metrics with thresholds and alarms.
- Establishing a traffic analysis plan to follow when an error or degradation problem appears in the application.
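To make the idea of thresholds and alarms more concrete, here is a minimal sketch in Python. The metric names, the limits and the `check_thresholds` function are hypothetical placeholders; a real monitoring tool will have its own way of defining them.

```python
# Hypothetical thresholds; the metric names and limits are placeholders
# to be replaced with values that make sense for each application.
THRESHOLDS = {
    "response_time_s": 2.0,
    "cpu_usage_pct": 85.0,
    "memory_usage_pct": 90.0,
}


def check_thresholds(sample, thresholds=THRESHOLDS):
    """Return an alarm message for every metric above its threshold."""
    alarms = []
    for metric, limit in thresholds.items():
        value = sample.get(metric)
        # Metrics missing from the sample are skipped, not treated as alarms.
        if value is not None and value > limit:
            alarms.append(f"{metric}={value} exceeds threshold {limit}")
    return alarms


if __name__ == "__main__":
    sample = {"response_time_s": 3.4, "cpu_usage_pct": 40.0}
    for alarm in check_thresholds(sample):
        print("ALARM:", alarm)
```

In this sketch only the response time raises an alarm, because the CPU value is below its limit and no memory value was collected.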
Analyze each pair of tiers
In complex environments, end-to-end traffic analysis is practically impossible, so we follow the strategy recommended by Tribelab in its guide for Network trace analysis:
Let’s consider this generic application:
The main idea is to divide the analysis by pairs of tiers. Begin with the traffic between Tier 1 and Tier 2 and consider the results. If necessary, go on to analyze Tier 2 and Tier 3, and keep moving forward until you find the root cause of the problem or reach the analysis of Tier n-1 and Tier n.
Starting the analysis with the traffic between Tier 1 and Tier 2 brings the additional benefit that we are closer to the final users, which makes it easier to correlate the user experience with traffic sequences. That relation becomes harder to visualize as we move to the right in the architecture.
We must remember that, although the objective is to analyze traffic, we must not neglect the other factors associated with the performance of the application.
So, if we are analyzing traffic between Tier 1 and Tier 2, it is important to also consider data on the behavior of the servers (CPU and memory usage, log file review, etc.) and information on the performance of the communication links (latency, response time, etc.).
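As an illustration of how response time can be derived from collected traffic, the following Python sketch pairs each request from one host with the next reply from the other and measures the elapsed time. It is a simplification under stated assumptions: the `Packet` record and the `response_times` function are hypothetical, and a real analysis would rely on TCP sequence and acknowledgment numbers rather than on packet order alone.

```python
from collections import namedtuple

# Hypothetical, simplified packet record: capture time in seconds,
# source host and destination host.
Packet = namedtuple("Packet", ["timestamp", "src", "dst"])


def response_times(packets, client, server):
    """Pair each client request with the next server reply and
    return the elapsed time of each exchange, in seconds."""
    times = []
    pending = None  # timestamp of the request still waiting for a reply
    for pkt in sorted(packets, key=lambda p: p.timestamp):
        if pkt.src == client and pkt.dst == server and pending is None:
            pending = pkt.timestamp
        elif pkt.src == server and pkt.dst == client and pending is not None:
            times.append(pkt.timestamp - pending)
            pending = None
    return times


if __name__ == "__main__":
    trace = [
        Packet(0.0, "client", "web"),
        Packet(0.5, "web", "client"),
        Packet(2.0, "client", "web"),
        Packet(2.25, "web", "client"),
    ]
    print(response_times(trace, "client", "web"))
```

These values can then be compared with server data such as CPU and memory usage collected over the same time window.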
Advance through the problem path
Our recommendation at this point is to define a problem path and concentrate the analysis effort on the elements that make it up.
When we have finished analyzing the traffic between Tier 1 and Tier 2, we have to use the results to define the problem path and the next step.
Consider the graphic in the previous section and assume we are facing a performance problem where the response time is 20 seconds.
We analyze the traffic between Tier 1 and Tier 2 and conclude that the traffic between Client and Server B contributes 19 seconds, while the traffic between Client and Server A contributes only 1 second.
With these results, our next step is the traffic analysis of Tier 2 and Tier 3, but we concentrate our efforts on the traffic between Server B and Server C, and between Server B and Server D.
With the results from analyzing Tier 2 and Tier 3 we will define the next step of our problem path and so on.
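The reasoning behind the problem path can be sketched in a few lines of Python. The Tier 1 and Tier 2 values below are the ones from the example (1 second and 19 seconds, adding up to the 20 seconds observed); the Tier 2 and Tier 3 values, as well as the `problem_path` function itself, are hypothetical and only serve to show the idea. Always following the single slowest link is also a simplification, since in practice more than one link may deserve attention.

```python
# Response-time contribution (seconds) of each link, keyed by (source, destination).
# The Client values come from the example in the text;
# the Server B values are hypothetical placeholders.
CONTRIBUTION_S = {
    ("Client", "Server A"): 1.0,
    ("Client", "Server B"): 19.0,
    ("Server B", "Server C"): 17.0,  # assumed
    ("Server B", "Server D"): 1.0,   # assumed
}


def problem_path(contributions, start):
    """Follow, tier by tier, the link that contributes the most time."""
    path = [start]
    current = start
    while True:
        outgoing = {
            dst: secs for (src, dst), secs in contributions.items() if src == current
        }
        if not outgoing:
            return path
        nxt = max(outgoing, key=outgoing.get)
        if nxt in path:  # guard against loops in the data
            return path
        path.append(nxt)
        current = nxt


if __name__ == "__main__":
    print(" -> ".join(problem_path(CONTRIBUTION_S, "Client")))
```

With these assumed numbers, the path goes from Client to Server B and then to Server C, so Server A and Server D can be left out of the detailed analysis.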
So, in the next graphic we mark the problem path in red:
This way, we can find the root cause in less time and with the least possible effort.
So here we have a first approach to a traffic analysis methodology. Before we finish, just one final note: the better we understand our applications, the better we can configure our monitoring tools and the more advantages we will obtain.