The purpose of logging and monitoring is to obtain data, process it, and come to a conclusion. Depending on the objective, the ‘solution’ may be proving or refuting that an event took place. On other occasions, the goal of an investigation is to identify patterns and trends, and to prevent future events from ever happening in the first place (a bit like in the film ‘Minority Report’).
Before we get into the common errors when implementing logging and monitoring, let's first take a look at some definitions.
Logging and monitoring are widely used for internal audit processes and external investigations. Some industries are heavily regulated and must comply with strict legal requirements, which mandate what data must be gathered. If you are preparing for a compliance audit, you will be expected to show your logs and monitoring controls.
Logs are also known as event logs, audit records, or audit trails. Whatever the word used, we are referring to data captured from computer systems about the type, content, or time of actions. Logs are often qualified by their source, e.g. server logs, message logs, event logs, network logs, application logs, etc.
Monitoring refers to the observation and evaluation of logs. Due to the large volume of logs, this is typically automated, based on filters.
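As a minimal sketch of that filter-based automation, a custom `logging.Filter` in Python can pass through only the records that match a watchlist (the keywords and logger names here are illustrative, not a recommended ruleset):

```python
import logging

class KeywordFilter(logging.Filter):
    """Pass only records whose message contains one of the watch keywords."""
    def __init__(self, keywords):
        super().__init__()
        self.keywords = [k.lower() for k in keywords]

    def filter(self, record):
        message = record.getMessage().lower()
        return any(k in message for k in self.keywords)

logger = logging.getLogger("monitor-demo")
handler = logging.StreamHandler()
handler.addFilter(KeywordFilter(["failed login", "privilege"]))
logger.addHandler(handler)

logger.warning("Failed login for user admin from 10.0.0.5")  # passes the filter
logger.warning("Routine heartbeat OK")                        # silently dropped
```

In practice this kind of filtering is usually done in a dedicated SIEM rather than in application code, but the principle is the same: reduce the raw log volume to the events worth a human's attention.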
The National Institute of Standards and Technology (NIST) defines non-repudiation as “protection against an individual falsely denying having performed a particular action”. In effect, having a log that demonstrates an action did take place can become irrefutable proof, in which case other weaker types of evidence (e.g. hearsay) would be overridden in court.
The first step in the Incident Response Plan (IRP) is to detect events; the second is to escalate events to security incidents or data breaches (where confidentiality has been compromised).
Logging and monitoring are essential to these crucial first steps, aiding in the process to make sound decisions for containment, eradication, and invoking the Disaster Recovery Plan (DRP) or the Business Continuity Plan (BCP).
Insufficient, inadequate, or — worst of all — inaccurate information can mean incident management is not meeting the purposes it set out to accomplish. This leads us to question: what do we need to log, and how do we keep the data stored and transmitted securely?
The Open Web Application Security Project (OWASP) included “Insufficient logging and monitoring” in its Top 10 vulnerabilities for 2017. In the 2021 (and most recent) edition, this was broadened to “Security logging and monitoring failures”.
Let’s address what could possibly go wrong with logging and monitoring so that you can avoid some common mistakes.
You cannot protect what you don’t know you have, and likewise, you cannot monitor what you don’t log.
If there is no monitoring whatsoever, the organization is in bad shape, even if logs are still being collected, which is half of the equation (albeit the less meaningful half). In my experience as a consultant, this rarely occurs. What is common, however, is that monitoring is not taking place to the extent required. This often comes down to misconfiguration, possibly as a consequence of installing new hardware or software, or making new connections.
Logging everything indiscriminately, on the other hand, results in inordinate amounts of logs that require more computation cycles to parse, as well as more storage capacity. It may appear to be the logical answer to ‘what if we are not logging enough, and are missing the important bits?’, but in reality an effort needs to be made to find out what is necessary, applying proportionality.
For instance, logs regarding access, modification, and exfiltration of intellectual property documents must be more comprehensive than those for publicly available information. Striking the right balance is essential here.
Logs have an expiry date, set either by periodic deletion or by overwriting. The former is applied once a certain time period has elapsed, for example a month; the latter happens when the log exceeds its allocated storage space. If the file size limit for a given log is 10 MB, then once it is filled, new entries replace the oldest ones in the file. If the limit is set too low, there is a risk that old records will be routinely deleted before they can be put to use.
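The overwriting scheme described above can be sketched with Python's standard `RotatingFileHandler`; the file path, 10 MB cap, and backup count below are illustrative values, not recommendations:

```python
import logging
from logging.handlers import RotatingFileHandler

# Illustrative sizes: each log file is capped at 10 MB, and 5 rotated
# backups are kept, so the oldest records are overwritten once roughly
# 60 MB of history has accumulated (current file + 5 backups).
handler = RotatingFileHandler(
    "app.log", maxBytes=10 * 1024 * 1024, backupCount=5
)
logger = logging.getLogger("rotation-demo")
logger.addHandler(handler)
logger.warning("this entry will eventually be rotated out")
```

Sizing these two parameters is exactly the retention decision discussed above: they determine how far back your audit trail reaches.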
Given the sensitivity of the information contained in logs, it is always a good idea to keep them secure and protected from prying eyes by using encryption. Furthermore, they also need to be protected from modification and premature deletion, by applying least-privilege permissions.
Lastly, keeping logs in a separate location helps prevent attackers from deleting their tracks: if logs are stored on the same server that has already been compromised, there is a chance the logs themselves will be tampered with, in the worst-case scenario incriminating an innocent individual.
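One common way to get records off-host as they are produced is to ship them to a syslog collector; Python's standard `SysLogHandler` can do this. The loopback address below is a placeholder so the sketch runs anywhere; in a real deployment it would point at a dedicated, hardened log server.

```python
import logging
from logging.handlers import SysLogHandler

# Placeholder address: replace 127.0.0.1 with your dedicated log server,
# so a compromised host cannot quietly rewrite its own history.
remote = SysLogHandler(address=("127.0.0.1", 514))  # UDP syslog by default
logger = logging.getLogger("offhost-demo")
logger.addHandler(remote)
logger.error("suspicious activity detected")  # shipped off-host as it happens
```

Note that plain UDP syslog is unauthenticated and unencrypted; in practice you would layer TLS (or an agent such as a log shipper) on top.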
A timestamp is a digital record of the time associated with an event. This is very important when correlating data from different sources, as in a Security Information and Event Management (SIEM) tool. If timestamps are not created consistently across all sources, determining the correct sequence of events becomes problematic and error-prone. The Network Time Protocol (NTP), running on port 123 (very fitting!), is employed to avoid this type of issue.
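Complementary to NTP synchronisation, it helps to render every timestamp in UTC with a fixed ISO 8601 format, so records from different hosts sort into one unambiguous timeline. A sketch using Python's standard logging module:

```python
import logging
import time

# Render every timestamp in UTC (time.gmtime) using ISO 8601, so that
# records from different hosts and time zones can be correlated directly.
formatter = logging.Formatter(
    fmt="%(asctime)sZ %(levelname)s %(message)s",
    datefmt="%Y-%m-%dT%H:%M:%S",
)
formatter.converter = time.gmtime  # UTC instead of the host's local time

handler = logging.StreamHandler()
handler.setFormatter(formatter)
logger = logging.getLogger("utc-demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.info("event with a UTC timestamp")
```

Without the `converter` override, each machine would stamp records in its own local time, which is precisely the sequencing problem described above.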
If some of the tools used by your organization have been developed in-house, it is important that the output produced is compatible with common standards, otherwise the information will not be readily available for analysis using off-the-shelf software.
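A widely supported way to keep in-house tool output machine-readable is to emit one JSON object per line, a format many off-the-shelf log-shipping and SIEM tools can ingest. A minimal sketch of such a formatter (field names are illustrative):

```python
import json
import logging
import time

class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line with a fixed schema."""
    def format(self, record):
        return json.dumps({
            "timestamp": time.strftime(
                "%Y-%m-%dT%H:%M:%SZ", time.gmtime(record.created)
            ),
            "level": record.levelname,
            "source": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("inhouse-tool")
logger.addHandler(handler)
logger.warning("disk quota exceeded")
```

The exact schema matters less than its consistency: every tool emitting the same fields, in the same encoding, with the same timestamp convention.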
The sheer number of alerts and warnings produced during monitoring can lead to alert fatigue, where nobody cares any longer about the hundreds of issues being constantly raised, or to the opposite problem, where some genuine alerts are never raised at all (false negatives).
When deleting logs, it is paramount that the record of erasure does not contain the erased data itself: retaining a duplicate of what you have just deleted defeats the purpose of the erasure.
During a penetration test, systems can be overloaded with alerts. Some firms opt to disable alerting for that period, which leaves the door open for attackers (especially if they happen to know this valuable piece of information).
The Information Commissioner’s Office (ICO) in the UK has stated that “logging enables you to monitor systems for inappropriate access and/or disclosure of data, to verify the lawfulness of any processing, and to ensure the integrity and security of personal data”. Logging itself has to comply with the applicable regulations.
Logging and monitoring are just the first steps in detecting incidents and understanding their nature and potential impact on the organization’s assets. Once these factors are known, it becomes easier to prioritise and allocate resources to defuse a crisis.
Many things can go wrong or be missed during the design or implementation of your logging and monitoring mechanisms, producing deficiencies in how they operate. Some of the most common are listed in this article, but your solution should be bespoke, fitted to your organization’s needs.