Sometimes we should, and must, listen to the silence. This bit of old wisdom is followed in many spiritual practices, and not without benefit to those who actually practice it. But what is in it for monitoring and observability practitioners? What kind of silence must we tune our ears and our observability platforms to? And what do I mean when I advise you to "listen to the silence"?
In the previous article I wrote about the importance of change: how we should always listen and be tuned not to specific thresholds but to patterns of change, because problems always manifest themselves through change, not through a value. Most important of all is a series of interlocked and related changes. That was the first "form" I gave to telemetry and observability. And here is another one for your thoughts: "listen to the silence, listen to what will become silence, listen to the signs of silence".
What is "silence"? The concept of silence in observability is simple, simple to an extreme, yet often overlooked. "Silence" means that you receive no telemetry. Even when it is noticed, silence is usually treated as a "condition" attached to a telemetry item, which is helpful, but not quite enough. What I am proposing is this: once you start listening to silence, first treat the lack of data that should have arrived on time as a value in its own right. Mathematically, the value of silence can be represented as "NaN" or, if your computational engine permits, as the greatest or least telemetry value. But which one should we use? Let's come back to that later.
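As a minimal sketch of treating absent samples as values, assume a telemetry item is expected once every `interval` seconds and that the samples which did arrive are held in a dictionary keyed by timestamp. The function name `fill_silence` and the sample data are hypothetical, made up for illustration only.

```python
import math

def fill_silence(samples, start, end, interval):
    """Return one value per expected timestamp, NaN where nothing arrived.

    samples  -- dict mapping timestamp (seconds) to the received value
    start    -- first expected timestamp (inclusive)
    end      -- end of the observed period (exclusive)
    interval -- expected spacing between samples, in seconds
    """
    return [samples.get(ts, math.nan) for ts in range(start, end, interval)]

# Hypothetical data: samples were expected at 0, 60, 120, 180 and 240 seconds,
# but nothing arrived at 120 and 240.
received = {0: 0.42, 60: 0.40, 180: 0.45}
series = fill_silence(received, start=0, end=300, interval=60)
```

The result has one slot per expected sample, so the silence occupies a position in the series instead of disappearing from it.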
So why should anyone bother? With no data there is nothing to compare to a threshold, so we do not know whether the situation is normal or not. If you think in terms of thresholds, you are right: you cannot compare something with nothing. There are mathematical ways to deal with undefined and infinite values, but even with the simplest approach you can check whether one undefined value coincides with another, or calculate the percentage of undefined values in a timeframe. So, even without going into number theory, complex numbers, or reading up on what the "Riemann sphere" is, you can compute patterns over unknown data. And patterns are what we are looking for.
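The percentage of undefined values in a timeframe can be sketched as below. This assumes the series already holds NaN wherever a sample was missed. The names `silence_ratio` and `rolling_silence`, the window size, and the data are hypothetical. Note that in IEEE 754 arithmetic NaN never compares equal to anything, including itself, so the sketch tests for it with `math.isnan` rather than `==`.

```python
import math

def silence_ratio(values):
    """Fraction of values in the window that are NaN (that is, silent)."""
    if not values:
        return 0.0
    missing = sum(1 for v in values if math.isnan(v))
    return missing / len(values)

def rolling_silence(values, window):
    """Silence ratio for every full sliding window over the series."""
    return [
        silence_ratio(values[i:i + window])
        for i in range(len(values) - window + 1)
    ]

# Hypothetical series in which the silence gradually takes over.
series = [0.4, 0.5, math.nan, 0.6, math.nan, math.nan, math.nan, 0.5]
ratios = rolling_silence(series, window=4)
# ratios == [0.25, 0.5, 0.75, 0.75, 0.75]
```

A rising ratio is itself a pattern of change, even though no individual sample ever crossed a threshold.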
So, we can compute a pattern that consists of real, well-defined values and NaN values. Why bother? For those who are looking for reasons, here are a few real-life examples that produce silence.
- You are using passive data collection and the host or application went down.
- The host on which you run your data collection is experiencing performance issues and, since delivery of telemetry may not be treated as an "ultimate priority", the collector may be swapped out and begin skipping values.
- You may have network issues between the telemetry source and the telemetry collector, causing gaps or delays in the data.
- You may have a very local problem that causes the monitoring platform to "skip a beat".
There are a number of reasons why you might not get the data, but you must treat "no data" as data. A special type of data.
- First, we can build a pattern from it and analyze that pattern.
- Second, we can observe missing data just like real data.
For those cases, we can treat missing data as "NaN" values. But the real fun begins when we start using missing data in calculations. For that, "NaN" has very limited use, and we may begin to treat missing values as the "greatest element", i.e. "positive infinity". This is the most straightforward approach, as IT telemetry rarely has negative values. Depending on your computational platform, you can perform a number of computations and comparisons using positive infinity. In numeric comparisons, infinity is always greater than any finite integer or floating-point number. You can use positive infinity with most arithmetic operators if needed, although a few combinations, such as infinity minus infinity, are undefined.
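Here is a small sketch of how positive infinity behaves in Python, which follows IEEE 754 floating-point rules. The `SILENCE` constant, the `breaches` helper, the latency values and the 500 ms threshold are all hypothetical and chosen only for illustration.

```python
import math

# Assumption: a missed sample is encoded as positive infinity.
SILENCE = math.inf

def breaches(value, threshold):
    """True when the value is above the threshold; silence always is."""
    return value > threshold

# Hypothetical latency samples in milliseconds; the third one never arrived.
latencies = [120.0, 135.0, SILENCE, 128.0]

alerts = [breaches(v, 500.0) for v in latencies]  # only the silent sample breaches
worst = max(latencies)                            # silence dominates the maximum
total = sum(latencies)                            # and propagates through addition
undefined = SILENCE - SILENCE                     # one of the undefined cases: NaN
```

With this encoding an ordinary "greater than" check fires on a missed sample without any special-case branch. The cost is that aggregates such as sums and averages also become infinite, so whether that is acceptable depends on what the computation is for.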
Whatever the source of the missing data, or "silence" as we poetically call it, we should aim to detect the actual silence in a telemetry item's data. We must also be ready to detect whether a group of "silent" items has something in common. We must detect patterns in the data that can lead to silence, and use silence as a value that leads to a condition. So "no data" should not be treated by a monitoring practitioner as a moment of uncertainty. Think of silence as another way your infrastructure delivers information to you. And ... silence is data. Now you know.
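To make the idea of "silent" items having something in common a little more concrete, here is one possible sketch. It assumes that each telemetry item carries a few descriptive attributes (here a host and a rack) and that the latest value of a silent item is NaN. The function `common_silence`, the item names and the attributes are hypothetical.

```python
import math
from collections import Counter

def common_silence(latest, attributes):
    """Count how many silent items share each (attribute, value) pair."""
    counts = Counter()
    for item, value in latest.items():
        if math.isnan(value):
            for key, attr in attributes.get(item, {}).items():
                counts[(key, attr)] += 1
    return counts

# Hypothetical latest values per telemetry item; NaN marks silence.
latest = {
    "web1.cpu": math.nan,
    "web1.mem": math.nan,
    "web2.cpu": math.nan,
    "db1.cpu": 0.3,
}
# Hypothetical attributes describing where each item comes from.
attributes = {
    "web1.cpu": {"host": "web1", "rack": "r1"},
    "web1.mem": {"host": "web1", "rack": "r1"},
    "web2.cpu": {"host": "web2", "rack": "r1"},
    "db1.cpu": {"host": "db1", "rack": "r2"},
}

shared = common_silence(latest, attributes)
suspect = shared.most_common(1)[0]  # (("rack", "r1"), 3)
```

In this made-up case all three silent items sit in the same rack, which points at the rack rather than at any single host.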