The idea
In my current role I analyze logs on a daily basis. This could be for any number of reasons, whether looking for trending requests, helping identify suspicious requests and IP addresses or simply trying to determine when something happened.
When investigating either performance-based issues or cases where a customer is seeing 504 errors due to an influx of visitors, I like to fall back on NGINX logs to look for trends. Is the traffic higher than normal? What has the traffic been like at various times over the past few days?
Getting historical data can certainly help identify trends or point to when a rise in visitors started. This can often be attributed to a sale a website may be running in the case of e-commerce sites. It can often be challenging to provide clear and concise information to all parties involved in the process of resolving the issue at hand, which often is that the site is inaccessible.
I am personally a very visual learner and can often get more out of graphs or other visual representations of data than out of large data dumps. I also feel a graph can be a valuable tool to help drive the numbers home when the question is: what is the issue and how do we fix it?
This post was spurred on by a last-minute request from a manager in a different department, who was looking for help explaining the effect a particular website’s promotion was having on the server and, subsequently, the need to look at a higher-tier product to cope with the traffic volume.
After an hour of digging and compiling my report, I felt it lacked a little. I was able to show an almost tenfold increase in traffic that started at a certain time and continued to rise; however, I felt the numbers alone didn’t do it justice as an impactful visual representation.
I began exploring the idea of building graphs within a terminal using this type of data. During my research I stumbled upon termgraph. This simple application is written in Python and in a nutshell displays data from a file in various graph options. The syntax is pretty simple as well:
termgraph file.dat --color {red,blue,yellow}
The above colors assume a three-column data setup (excluding the label). I began testing how it handles the data and manually built out a few files. I liked the look and feel of it, along with the simple structure it uses to create easily readable graphs. If you require further options I would recommend reviewing the project’s page.
The next step was to automate the task as much as possible, starting with the creation of the table. While I was able to hack together the data file required to show a label along with three individually colored sets of data per label, it was time-consuming and involved lots of cutting, pasting and otherwise editing the data file.
From my afternoon spent on this, data_build was created. It’s a simple Bash script that has fixed file paths for three days of logs (including the day it is being run on). I hope to get the script to a point where you have greater control over which files it parses; for the time being, however, this meets my needs.
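To give a rough idea of the counting step, here is a minimal Python sketch. It is not the Bash from data_build, which is not shown in this post. It assumes the logs use NGINX's default "combined" format, where the timestamp looks like [16/Aug/2020:21:14:03 +0000]. The function name count_requests_per_hour and the sample log lines are hypothetical and only there for illustration.

```python
import re
from collections import Counter

# Matches the timestamp in NGINX's default "combined" log format,
# e.g. [16/Aug/2020:21:14:03 +0000]; captures the day and the hour.
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}):(\d{2}):\d{2}:\d{2} ")


def count_requests_per_hour(lines):
    """Return a Counter keyed by (day, hour), e.g. ("16/Aug/2020", "21")."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP.search(line)
        if match:  # lines without a recognizable timestamp are skipped
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Made-up sample lines (assumption: combined log format).
sample = [
    '203.0.113.7 - - [16/Aug/2020:21:14:03 +0000] "GET / HTTP/1.1" 200 512 "-" "curl/7.68.0"',
    '203.0.113.7 - - [16/Aug/2020:21:59:59 +0000] "GET /shop HTTP/1.1" 200 734 "-" "curl/7.68.0"',
    '198.51.100.2 - - [16/Aug/2020:22:00:01 +0000] "GET / HTTP/1.1" 504 0 "-" "curl/7.68.0"',
]
counts = count_requests_per_hour(sample)
print(counts[("16/Aug/2020", "21")])  # 2
```

Keying the counter on both day and hour means several days of logs can be fed through the same function, which suits a side-by-side comparison of three days.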
Workflow
I will run through a sample run of the script. The script accepts the site name either as an argument, or if not provided it will prompt the user. In this example I will let the script prompt me.
$ ./data_build.sh
Please provide the install name: mytestsite
Copy the below output in order to build the data file:
@ 16/Aug/2020,15/Aug/2020,14/Aug/2020
00:00 93 131 136
01:00 144 139 139
02:00 147 145 135
03:00 135 146 139
04:00 144 141 141
05:00 134 149 133
06:00 135 147 133
07:00 135 144 137
08:00 139 138 142
09:00 157 151 138
10:00 142 140 137
11:00 140 134 142
12:00 139 137 136
13:00 141 141 160
14:00 146 158 174
15:00 154 136 143
16:00 136 137 134
17:00 139 144 137
18:00 144 136 137
19:00 138 145 144
20:00 142 140 135
21:00 224 138 136
22:00 342 134 222
23:00 0 136 132
Since termgraph doesn’t allow missing values, the script substitutes 0 for any that are missing. We see this in the first column (today’s logs) at 23:00. Values can be missing because, depending on the time of day the script is run, the data may not exist yet.
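The zero substitution and the row layout can be sketched in a few lines of Python. Again, this is an illustration under assumptions rather than the script's actual code: build_data_file is a hypothetical helper, and the counts dictionary holds only a handful of values taken from the sample output above.

```python
def build_data_file(counts, days):
    """Build the data file text: an '@ ' key line, then one row per hour."""
    lines = ["@ " + ",".join(days)]
    for hour in range(24):
        hh = f"{hour:02d}"
        # An hour with no count for a given day (e.g. later today) becomes 0.
        values = [str(counts.get((day, hh), 0)) for day in days]
        lines.append(f"{hh}:00 " + " ".join(values))
    return "\n".join(lines) + "\n"


days = ["16/Aug/2020", "15/Aug/2020", "14/Aug/2020"]
# A few values from the sample output; 23:00 is missing for today.
counts = {
    ("16/Aug/2020", "22"): 342,
    ("15/Aug/2020", "22"): 134,
    ("14/Aug/2020", "22"): 222,
    ("15/Aug/2020", "23"): 136,
    ("14/Aug/2020", "23"): 132,
}
data = build_data_file(counts, days)
print(data.splitlines()[-1])  # 23:00 0 136 132
```

Looking each hour up with a default of 0 keeps every row at three values, so the columns always line up with the three days named in the key line.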
Now that we have our data, we can add it to a file that will be read by termgraph. The key is added at the top by providing comma-separated values following the @ sign. Once we have our data file we can build our graph using termgraph:

In this instance the increase in traffic volume is not overwhelming; however, it is sufficient to show how the graph can be used. In the above example I chose red for the present day. We can already start using this to check when the traffic increase started and how it compares to historical data.
I feel that this is going to be a good tool to use when trying to convey drastic changes in visitor behavior or traffic analysis in a clear and concise way.
The one thing I will note is that termgraph currently outputs the values as floating-point numbers rather than integers. It may become a pet project to amend that, as this option does not appear to exist in the project at this time. I would strongly recommend checking out the termgraph project to see all of its capabilities and how it can be used in your own project!