I recently finished this book and tried to compare the challenges we face while monitoring software applications with the ones Sun Tzu describes for a great ruler.
In this article, I will look at application monitoring through the lesson “The Use of Spies” (I will cover the remaining lessons in another article).
Sun Tzu explained the importance of spies in the life of a ruler or war general. To sustain a dynasty, he wrote, the ruler must maintain a subtle yet straightforward relationship with his spies.
“What enables the wise sovereign and the good general to strike and conquer, and achieve things beyond the reach of ordinary men, is FOREKNOWLEDGE” — Sun Tzu
Foreknowledge is essential not only for the functioning and survival of a dynasty but also for technology-driven business applications. Beyond rulers and generals, it helps other leaders and stakeholders (SREs, Ops, PMs, and SDEs) who need regular knowledge about the health of the application, system, network, and business.
In the era of microservices, whether REST-based or event-based, the number of components to observe has increased drastically, and monitoring this mesh of interconnected services requires planning and resources.
Each microservice contributes in some way to business success, so monitoring and observing their behavior helps in governing the business as a whole, as well as in designing and implementing its KPIs.
What are Spies?
A spy is a person or tool that divulges information without the permission of its holder. In application monitoring, spies are independent utilities, native libraries, or sidecar agents that provide useful information about the state of an application. The true role of a spy is to observe the host and report what it learns to its master.
Furthermore, Sun Tzu explained the types of spies and their importance:
- Local Spies
- Inward Spies
- Converted Spies
- Doomed Spies
- Surviving Spies
Sun Tzu — “Having local spies means employing the services of the inhabitants of a district.”
Local spies can be compared with:
- Infrastructure-level agents on virtual machines, which periodically send VM health reports and network information
- Custom-built report generator utilities in legacy applications, which periodically send the success/failure state of customer transactions
- Service discovery clients in microservices, which periodically send their health and network information to the service discovery server so that other consumers can interact with them
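The periodic reporting these local spies perform can be sketched roughly as follows. This is only an illustration under stated assumptions: `build_health_report`, `InMemoryRegistry`, and `report_periodically` are hypothetical names, the registry is an in-memory stand-in for a real service discovery server, and free disk space is used as a placeholder health metric.

```python
import json
import shutil
import socket
import time


def build_health_report(service_name: str, path: str = "/") -> dict:
    """Collect a small health snapshot of the local host (the "district")."""
    usage = shutil.disk_usage(path)
    free_ratio = usage.free / usage.total if usage.total else 0.0
    return {
        "service": service_name,
        "host": socket.gethostname(),
        "timestamp": time.time(),
        "disk_free_ratio": round(free_ratio, 3),
    }


class InMemoryRegistry:
    """Hypothetical stand-in for a service discovery server."""

    def __init__(self) -> None:
        self.instances = {}

    def register(self, report: dict) -> None:
        # Key by service and host so a newer report overwrites the older one.
        key = f"{report['service']}@{report['host']}"
        self.instances[key] = report


def report_periodically(registry, service_name: str, interval_s: float, rounds: int) -> None:
    """Send `rounds` reports, sleeping `interval_s` seconds between them."""
    for i in range(rounds):
        registry.register(build_health_report(service_name))
        if i < rounds - 1:
            time.sleep(interval_s)


if __name__ == "__main__":
    registry = InMemoryRegistry()
    report_periodically(registry, "orders", interval_s=0.1, rounds=3)
    print(json.dumps(registry.instances, indent=2))
```

A real agent would send the report over the network and run until stopped; the fixed number of rounds here just keeps the sketch easy to run.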
Sun Tzu — “Having inward spies, making use of officials of the enemy.”
Inward spies can be recognised as the native libraries that come out of the box with the platform and can be integrated with the existing monitoring system. Using them reduces not only cost but also the other resources the integration would need.
For example, Stackdriver can be used to integrate logs with existing logging and monitoring systems.
Sun Tzu — “Having converted spies, getting hold of the enemy’s spies and using them for our purposes.”
Compare these with existing configurable monitoring applications that can integrate with multiple systems through configuration alone and push telemetry to several of them with little effort. Such a tool may already be part of the existing monitoring system and simply be reconfigured to work with the new one.
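As a rough illustration of this idea, the sketch below forwards each event to every backend switched on in a configuration map. All names (`TelemetryForwarder`, `make_list_sink`, `legacy_monitor`, `new_monitor`) are hypothetical, and the sinks are plain Python lists standing in for real monitoring systems.

```python
from typing import Callable, Dict, List

Sink = Callable[[dict], None]


def make_list_sink(store: List[dict]) -> Sink:
    """Hypothetical sink that appends events to a list instead of a real backend."""
    def sink(event: dict) -> None:
        store.append(event)
    return sink


class TelemetryForwarder:
    """Forward each event to every sink enabled in the configuration."""

    def __init__(self, sinks: Dict[str, Sink], config: Dict[str, bool]) -> None:
        # Keep only the sinks whose name is switched on in the config.
        self.active = {name: s for name, s in sinks.items() if config.get(name, False)}

    def publish(self, event: dict) -> List[str]:
        """Send the event to all active sinks and return the names that accepted it."""
        delivered = []
        for name, sink in self.active.items():
            try:
                sink(event)
                delivered.append(name)
            except Exception:
                # A failing backend should not block delivery to the others.
                continue
        return delivered


if __name__ == "__main__":
    old_store, new_store = [], []
    sinks = {
        "legacy_monitor": make_list_sink(old_store),
        "new_monitor": make_list_sink(new_store),
    }
    forwarder = TelemetryForwarder(sinks, {"legacy_monitor": True, "new_monitor": True})
    print(forwarder.publish({"metric": "latency_ms", "value": 42}))
```

Switching a backend on or off is then a configuration change rather than a code change, which is the property the converted spy is valued for.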
Sun Tzu — “Having doomed spies, doing certain things openly for purposes of deception, and allowing our spies to know of them and report them to the enemy”
In monitoring, this corresponds to mechanisms that keep the application running when a server or service is compromised by a network or server failure, or when native agents stop sending heartbeats.
In such cases, the remote service can remove the compromised service from the network and notify the other agents and applications about the missing resource. Those agents or utilities can then remove the compromised agent from their look-up tables and continue to work as expected.
For example, Kafka replicates each partition across brokers and, in ZooKeeper-based deployments, tracks broker liveness with the help of ZooKeeper. If the broker hosting a partition's leader goes down, one of the partition's other replicas is elected as the new leader.
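The heartbeat-based removal described above can be sketched as a small monitor that drops any instance whose last heartbeat is older than a timeout. This is a simplified illustration, not how Kafka or ZooKeeper implement it; `HeartbeatMonitor` and the instance names are hypothetical, and the `now` parameter exists only so the example is deterministic.

```python
import time
from typing import Dict, List, Optional


class HeartbeatMonitor:
    """Track the last heartbeat per instance and evict the ones that go silent."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self.last_seen: Dict[str, float] = {}

    def heartbeat(self, instance_id: str, now: Optional[float] = None) -> None:
        # Record when this instance last reported in.
        self.last_seen[instance_id] = time.monotonic() if now is None else now

    def evict_stale(self, now: Optional[float] = None) -> List[str]:
        """Remove and return instances whose last heartbeat exceeds the timeout."""
        now = time.monotonic() if now is None else now
        stale = [i for i, seen in self.last_seen.items() if now - seen > self.timeout_s]
        for instance_id in stale:
            del self.last_seen[instance_id]
        return stale

    def healthy(self) -> List[str]:
        """Instances still present in the look-up table."""
        return sorted(self.last_seen)


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_s=30)
    monitor.heartbeat("node-a", now=0)
    monitor.heartbeat("node-b", now=0)
    monitor.heartbeat("node-a", now=25)   # node-a keeps reporting, node-b goes silent
    print(monitor.evict_stale(now=40))    # ['node-b']
    print(monitor.healthy())              # ['node-a']
```

The list returned by `evict_stale` is what a registry would broadcast so that other agents can update their own look-up tables.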
Sun Tzu — “Those spies who bring back news from the enemy’s camp”
Surviving spies are the ones who faced the challenges, survived, and brought back the key information about the “What”, “Where”, “Who”, “When”, and “Why” for further planning.
They can be compared with:
- Fault-tolerance recovery agents
- Native disk backup agents
- Thread dumps or network logs
These help the respective stakeholders analyze the cause of an application, disk, server, or network crash.
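As one small example of such a surviving spy, the sketch below captures a thread dump of the running process. It assumes a Python application on CPython, where `sys._current_frames()` is available; `capture_thread_dump` and the `worker` thread are hypothetical names used only for illustration.

```python
import sys
import threading
import traceback


def capture_thread_dump() -> str:
    """Return the current stack of every thread as text, for post-mortem analysis."""
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    for thread_id, frame in sys._current_frames().items():
        lines.append(f"Thread {names.get(thread_id, 'unknown')} (id={thread_id}):")
        lines.extend(entry.rstrip() for entry in traceback.format_stack(frame))
    return "\n".join(lines)


if __name__ == "__main__":
    stop = threading.Event()
    # A background thread that just waits, so the dump has more than one stack.
    worker = threading.Thread(target=stop.wait, name="worker", daemon=True)
    worker.start()
    print(capture_thread_dump())
    stop.set()
```

The dump answers the "what" and "where" for each thread at the moment of capture. The standard library's `faulthandler` module offers a similar dump that can also be triggered on a fatal signal.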
Thanks for reading! I hope you enjoyed this article.
Do check out my other articles.