
Saturday, April 6, 2019

Logging, Monitoring, and Alerting

This week I'm writing about three things not often associated with testing: logging, monitoring, and alerting.  Perhaps you've taken advantage of logging in your testing, but monitoring and alerting seem like a problem for IT or DevOps.  However, a bug-free application doesn't mean a thing if your users can't get to it because the server crashed!  For this reason, it's important to understand logging, monitoring, and alerting so that we as testers can participate in ensuring the health of our applications.

Logging:

Logging is simply recording information about what happens in an application.  This can be done through writing to a file or a database.  Often developers will include logging statements in their code to help determine what's going on with the application below the UI.  This is especially helpful in applications that make calls to a number of servers or databases.

Recently I tested a notification system that passed a message from a function to a number of different channels.  Logging was so helpful in testing because it enabled me to follow the message through the channels.  If I hadn't had good logging, I wouldn't have had any way to figure out where the bug was when I didn't get a message I was expecting.

Good log messages should be easy to access and easy to search.  You shouldn't have to log on to some obscure remote desktop and sift through tens of thousands of entries with no line breaks.  One helpful tool for logging is Kibana, an open-source tool that lets you search and sort through logs in an easy-to-read format.

Good log messages should also be easy to understand and provide helpful information.  It's so frustrating to find a log message about an error and discover that it says "An unknown error occurred" or "Error TSGB-45667".  Ask your developers whether they can provide log messages that make it clear what went wrong and where in the code it happened.
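To make the difference concrete, here is a minimal sketch in Python of an error log that says what failed and where.  The function name send_notification, the logger name, and the deliver callback are all made up for illustration; they aren't taken from any real system.

```python
import logging

logger = logging.getLogger("notifications")  # hypothetical logger name


def send_notification(channel, message_id, deliver):
    """Try to deliver a message and log a descriptive error if it fails."""
    try:
        deliver(channel, message_id)
    except Exception as exc:
        # Say what failed and for which message, instead of
        # "An unknown error occurred".
        logger.error(
            "Delivery failed: channel=%s message_id=%s error=%s: %s",
            channel, message_id, type(exc).__name__, exc,
            exc_info=True,  # appends the traceback, which shows where in the code it happened
        )
        return False
    logger.info("Delivered: channel=%s message_id=%s", channel, message_id)
    return True
```

A tester reading this log entry can tell right away which channel and which message were involved, and the traceback points to the line of code that raised the error.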

Another helpful tactic for logging is to give each event a specific GUID as an identifier.  The GUID will stay associated with everything that happens with the event, so you can follow it as it moves from one area of an application to another. 
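Here's a rough sketch of what that could look like in Python.  The stage names (received, queued, sent) and the function names are assumptions I've invented for the example; a real application would have its own steps.

```python
import logging
import uuid

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("notifications")  # hypothetical logger name


def new_event_id():
    """Create a GUID that identifies one event for its whole lifetime."""
    return str(uuid.uuid4())


def log_step(event_id, step, detail=""):
    """Write one log line tagged with the event's GUID and return the line."""
    line = f"event_id={event_id} step={step} {detail}".rstrip()
    logger.info(line)
    return line


def handle_message(text):
    """Pass a message through several hypothetical stages, logging each one."""
    event_id = new_event_id()
    lines = [
        log_step(event_id, "received", f"length={len(text)}"),
        log_step(event_id, "queued", "channel=email"),
        log_step(event_id, "sent", "channel=email"),
    ]
    return event_id, lines
```

Because every line carries the same event_id, searching the logs for that one value shows the whole journey of the message, and a missing step shows where it got lost.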


Monitoring:

Monitoring means setting up automatic processes to watch the health of your application and the servers that run it.  Good monitoring helps potential problems get discovered and dealt with before they reach the end user.  For example, if it becomes clear that a server's disk space is approaching maximum capacity, storage can be added or old files cleared out before the disk fills up.
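As a small illustration of the disk space example, this Python sketch uses the standard library's shutil.disk_usage to measure how full a disk is.  The 90 percent threshold is an assumption chosen for the example, not a recommendation.

```python
import shutil


def disk_usage_percent(path="/"):
    """Return the percentage of disk space in use on the volume holding path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return 100.0 * usage.used / usage.total


def disk_is_nearly_full(path="/", threshold_percent=90.0):
    """True when usage is at or above the (assumed) warning threshold."""
    return disk_usage_percent(path) >= threshold_percent
```

A dedicated monitoring tool would collect this number on a schedule and chart it over time, but the underlying check is about this simple.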

Things to monitor include:
  • server response times
  • load on the server
  • server errors, such as 500-level response errors
  • CPU usage
  • memory usage
  • disk space

One way to monitor application health is with a periodic health check or a ping.  A job is set up to make a request to the server every few minutes and record whether the request succeeded or failed.  Monitoring can also happen through a tool that watches the number of requests to the server and records whether those requests were successful.  Data points such as response times and CPU usage can also be recorded and examined for trends that might indicate that the application is unhealthy.  One example of a tool that monitors application and server health is AppDynamics.
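
A health check like the one described above might be sketched this way in Python, using only the standard library.  The function names and the result format are my own assumptions, and a real setup would run this from a scheduler every few minutes and store the results somewhere.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=5):
    """Return the HTTP status code for url, or None if there was no response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None      # no answer at all (DNS failure, timeout, connection refused)


def check_health(url, fetch=fetch_status, clock=time.monotonic):
    """Make one request and record whether it succeeded and how long it took."""
    started = clock()
    status = fetch(url)
    elapsed = clock() - started
    healthy = status is not None and 200 <= status < 300
    return {"url": url, "status": status, "healthy": healthy, "seconds": elapsed}
```

The fetch argument is there so the check can be tried out with a stand-in function instead of a live server, which is handy for testing the monitoring itself.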

Alerting:

All the logging and monitoring in the world won't be helpful if no one is watching to see if there are problems!  This is where alerting comes in.  Alerts can be set to notify the appropriate people so that immediate action can be taken when there is a problem.  

Some situations that might call for an alert would be:
  • CPU or memory usage goes above a certain threshold
  • Disk space goes below a certain threshold
  • The number of 500 errors goes above a certain level
  • A health check fails twice in a row
  • Response times are slower than expected
  • Load is higher than normal

There are a number of ways to alert people to problems.  Alerts can be set up to send emails, text messages, or phone calls.  PagerDuty is one service that provides this alerting functionality.  It's important, however, to reserve off-hours alerts for serious cases in which users might be affected.  No one wants to be woken up in the middle of the night by an alert that says that the QA servers are down!  Still, a problem in the QA environment could point to an issue that will show up in production later, so a less invasive alert, such as a message to a team chat room, could be set up for this situation.
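
To show how rules like these fit together, here is a minimal Python sketch that checks a set of readings against thresholds and decides where each alert should go.  The threshold values, the Snapshot fields, and the "page" and "chat" channel names are all invented for the example; this isn't how PagerDuty or any other particular service works.

```python
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    """One reading of the numbers a monitoring tool might collect."""
    environment: str            # e.g. "production" or "qa"
    cpu_percent: float
    disk_free_percent: float
    server_error_count: int     # 500-level responses in the last few minutes
    health_checks: list = field(default_factory=list)  # newest last; True = passed


# Made-up thresholds for illustration only.
CPU_MAX_PERCENT = 90.0
DISK_FREE_MIN_PERCENT = 10.0
SERVER_ERRORS_MAX = 20


def find_problems(snapshot):
    """Return a list of human-readable problems found in one snapshot."""
    problems = []
    if snapshot.cpu_percent > CPU_MAX_PERCENT:
        problems.append(f"CPU usage is {snapshot.cpu_percent:.0f}%")
    if snapshot.disk_free_percent < DISK_FREE_MIN_PERCENT:
        problems.append(f"Only {snapshot.disk_free_percent:.0f}% of disk space is free")
    if snapshot.server_error_count > SERVER_ERRORS_MAX:
        problems.append(f"{snapshot.server_error_count} server errors in the last few minutes")
    last_two = snapshot.health_checks[-2:]
    if len(last_two) == 2 and not any(last_two):
        problems.append("Health check failed twice in a row")
    return problems


def choose_channel(environment):
    """Page someone only when real users might be affected."""
    return "page" if environment == "production" else "chat"


def build_alerts(snapshot):
    """Pair each problem with the channel it should be sent to."""
    channel = choose_channel(snapshot.environment)
    return [(channel, f"[{snapshot.environment}] {problem}")
            for problem in find_problems(snapshot)]
```

With this arrangement, a production problem wakes someone up, while the same problem in QA lands quietly in the team chat room to be looked at in the morning.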

You may be saying to yourself at this point, "But I'm a software tester!  It's not my job to set up logging, monitoring, and alerting for the company."  The health of your application is the responsibility of everyone who works on the application, including you!  While you might not have the clout to purchase server monitoring software, you still have the power to ask questions of your team, such as:
  • How can we troubleshoot user issues?
  • How do we know that we have enough servers to handle our application's load?
  • How will we know if our API is responding correctly?
  • How will we know if a DDoS attack is being attempted on our application?
  • How will we know if our end users are experiencing long wait times?
  • How will we know if we are running out of disk space?

Hopefully these questions will motivate you and your team to set up logging, monitoring, and alerting that will ensure the health and reliability of your application.

15 comments:

  1. So glad to see you encourage testers to get involved with logging, monitoring and alerting. I would add to that, observability. We testers have good skills for spotting patterns in data and identifying risks, it's another way we can make valuable contributions to our team and product.

    1. That is an excellent point, Lisa! Thanks for including it.

