About the Client
The client, based in Orlando, USA, built a solution that helps IT Operations teams keep pace with digital transformation. For over a decade the client had implemented IT Operations Management (ITOM) systems from leading vendors such as IBM and HP. As businesses digitalize, they demand a vast range of new applications and services from IT, along with the agility to deliver and evolve quickly. To address these challenges, the client unified monitoring, event management, and ITSM processes through codeless logic and machine learning. The core capabilities were quickly validated by customers including AstraZeneca, Fujitsu, Land O’Lakes, Prudential, The Hershey Company, Ubisoft, and 20th Century Fox.
About the Project
The project is to build a stack that runs the application to proactively identify IT issues, perform root cause analysis, and automatically remediate problems using Artificial Intelligence. The stack is developed on the Dataramp architecture with the following features:
- Event data is sent to the Kafka data pipeline via a REST API.
- Agents can define their own rules/criteria for analyzing issues or events.
- Rules are defined and stored in the database through the web application interface.
- Data from agents is automatically processed against the rules stored in the rule engine.
- A separate instance is created for each client within the cloud.
- Instance provisioning is automated.
- Rule engines are built using multiple Docker containers.
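The ingestion step above can be sketched as a small helper that shapes an incoming REST event into a Kafka record, with the topic chosen by event type and the device id used as the partition key so that one device's messages stay ordered. The `DeviceEvent` fields, topic naming scheme, and `toKafkaRecord` function below are illustrative assumptions, not the platform's actual schema:

```typescript
// Illustrative event shape -- the real payload fields are assumptions.
interface DeviceEvent {
  deviceId: string;
  type: "status" | "metric" | "log";
  payload: Record<string, unknown>;
  timestamp: number;
}

interface KafkaRecord {
  topic: string;
  key: string;
  value: string;
}

// Map an event received over the REST API to a Kafka record.
// Keying by deviceId keeps each device's messages in one partition,
// preserving per-device ordering inside the topic.
function toKafkaRecord(event: DeviceEvent): KafkaRecord {
  return {
    topic: `device-${event.type}`, // e.g. "device-status"
    key: event.deviceId,
    value: JSON.stringify(event),
  };
}

const record = toKafkaRecord({
  deviceId: "srv-42",
  type: "status",
  payload: { cpu: 0.93 },
  timestamp: 1700000000000,
});
console.log(record.topic, record.key); // device-status srv-42
```

In a real deployment the returned record would be handed to a Kafka producer; keying by device id is a standard way to get per-device ordering without a single global partition.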
The following features are required for this AI-powered event monitoring system:
- Develop an Artificial Intelligence-based AIOps event monitoring and ITSM platform.
- Provide end-to-end visibility and actionable intelligence for dynamic IT environments.
- Use machine learning algorithms and ITSM contextual data to automatically prioritize events, find root causes, and predict outages before they occur.
- Collect high-volume, high-velocity raw data via a REST API, then process it using the algorithm implemented in the Dataramp architecture.
- Create a rule engine that correlates the rules defined in it with incoming data, creates alerts based on the correlation, and stores them in the data store or sends them to the dashboard.
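A minimal sketch of the rule-engine behavior described in the last point, assuming each rule is a simple predicate over event fields with a routing target; the field names, rules, and routes are invented for illustration:

```typescript
// Minimal rule-engine sketch -- shapes and names are assumptions.
interface OpsEvent {
  source: string;
  metric: string;
  value: number;
}

interface Rule {
  name: string;
  matches: (e: OpsEvent) => boolean; // correlation criterion
  route: "datastore" | "dashboard"; // where the alert goes
}

interface Alert {
  rule: string;
  event: OpsEvent;
  route: "datastore" | "dashboard";
}

// Correlate incoming events against all stored rules and emit an
// alert for every rule that matches an event.
function correlate(events: OpsEvent[], rules: Rule[]): Alert[] {
  const alerts: Alert[] = [];
  for (const e of events) {
    for (const r of rules) {
      if (r.matches(e)) alerts.push({ rule: r.name, event: e, route: r.route });
    }
  }
  return alerts;
}

const rules: Rule[] = [
  { name: "high-cpu", matches: (e) => e.metric === "cpu" && e.value > 0.9, route: "dashboard" },
  { name: "disk-full", matches: (e) => e.metric === "disk" && e.value > 0.95, route: "datastore" },
];

const alerts = correlate([{ source: "srv-1", metric: "cpu", value: 0.97 }], rules);
console.log(alerts.length, alerts[0].rule); // 1 high-cpu
```

In the actual platform the rules come from the database via the web interface rather than being hard-coded, but the correlate-then-route loop is the core idea.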
The client needed a complete platform to solve the following use cases:
- Prioritize events based on their measurable impact on the business.
- Build a system that can handle large volumes of incoming data from various sources and process it sequentially.
- Determine the most likely root causes based on the infrastructure architecture, different events, and performance issues.
- Learn and detect event patterns that typically lead to service outages and degradations.
- Reuse the existing codebase from the client's current tech stack.
- Migrate existing functionality to the new stack without breaking user expectations, while improving the user experience.
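The first use case, prioritizing by measurable business impact, could look roughly like the sketch below. The severity weights and service-criticality values are placeholder assumptions, not the client's real scoring model:

```typescript
// Sketch of prioritizing events by business impact -- weights and
// the criticality map are invented placeholders.
interface MonitoredEvent {
  service: string;
  severity: "info" | "warning" | "critical";
}

const severityWeight: Record<MonitoredEvent["severity"], number> = {
  info: 1,
  warning: 3,
  critical: 10,
};

// Hypothetical criticality of each business service (higher = more important).
const serviceCriticality: Record<string, number> = {
  checkout: 10,
  search: 5,
  reporting: 2,
};

// Impact score = event severity x business criticality of the affected service.
function impactScore(e: MonitoredEvent): number {
  return severityWeight[e.severity] * (serviceCriticality[e.service] ?? 1);
}

// Highest-impact events first.
function prioritize(events: MonitoredEvent[]): MonitoredEvent[] {
  return [...events].sort((a, b) => impactScore(b) - impactScore(a));
}

const ordered = prioritize([
  { service: "reporting", severity: "critical" }, // 10 * 2 = 20
  { service: "checkout", severity: "warning" },   // 3 * 10 = 30
]);
console.log(ordered[0].service); // checkout
```

The point of the example is that a critical event on a low-value service can rank below a warning on a revenue-critical one, which is what "measurable impact on the business" implies.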
Nearly every application, directly or indirectly, relies on real-time data to personalize experiences, detect fraudulent behavior, or build dynamic dashboards. Traditionally, these complex business requirements are met by running batch analytics on historical data or analytics on real-time streams. The platform is a fully automated monitoring and reporting system that collects high-volume, high-velocity raw data via a REST API and processes it with the algorithm implemented on the Dataramp architecture.
Dataramp is a real-time, horizontally scalable, fault-tolerant, fast data processing platform with pluggable data storage, analytics, and visualizations, coupled with notification modules tailored to specific business needs.
Here the data mainly flows through two components of Dataramp:
- Kafka data pipeline
- Rule engines
The data pipeline component of Dataramp receives device data in a distributed manner and buffers it in Kafka. Data entering the Dataramp platform is divided into numbered streams of ordered messages and stored in different topics, such as device status and device id. These capabilities of the Dataramp architecture are leveraged to employ a rule engine that correlates the rules defined in it with incoming data, creates alerts based on the correlation, and stores them in the data store or sends them to the dashboard.
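The buffering just described can be modeled in memory: each topic holds a numbered stream of ordered messages, which is essentially what Kafka's offsets provide. The `TopicBuffer` class below is only a sketch of that idea; the real platform uses an actual Kafka cluster:

```typescript
// In-memory model of Kafka-style topic buffering: each topic is a
// numbered stream of ordered messages. Illustration only.
interface TopicMessage {
  offset: number;
  value: string;
}

class TopicBuffer {
  private topics = new Map<string, TopicMessage[]>();

  // Append a message to a topic; the offset numbers the stream.
  publish(topic: string, value: string): number {
    const log = this.topics.get(topic) ?? [];
    const offset = log.length;
    log.push({ offset, value });
    this.topics.set(topic, log);
    return offset;
  }

  // Read messages in order, starting from a given offset.
  read(topic: string, fromOffset = 0): TopicMessage[] {
    return (this.topics.get(topic) ?? []).slice(fromOffset);
  }
}

const buffer = new TopicBuffer();
buffer.publish("device-status", "srv-1 up");
buffer.publish("device-status", "srv-1 degraded");
buffer.publish("device-id", "srv-1 registered");
console.log(buffer.read("device-status").map((m) => m.offset)); // [ 0, 1 ]
```

Because consumers read from a remembered offset, a rule engine that crashes can resume exactly where it left off, which is what makes the pipeline fault-tolerant.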
A fully automated platform to manage IT services was delivered, ensuring increased customer satisfaction by resolving issues faster, using the following stack:
- We built a data processing architecture (Dataramp) using Kafka for the client.
- Remote agents capture events from different data sources, such as servers and routers, using the Dataramp component called the data interface.
- Remote agents write the captured events to Kafka inside the Dataramp architecture.
- Kafka acts as the message broker, receiving messages from agents and pushing them out.
- Rules are defined in the rule engine using the web application interface.
- The rule engine prioritizes events and manipulates them based on the rules.
- Processed data is stored in the database.
- Rule engine instances are created using Docker containers.
- Separate, automated instances are created for each client.
Technology Stack Used
| Layer | Technologies |
| --- | --- |
| Front End | Angular 4 |
| Backend | Loopback – NodeJS |
| | Jenkins, AWS, Docker |
| Analytics | Apache Kafka, Apache Spark, Elasticsearch |