About the Client

The client, based in Orlando, USA, built a solution that helps IT Operations teams keep pace with digital transformation. For over a decade, the client implemented IT Operations Management (ITOM) systems from leading vendors such as IBM and HP. As businesses digitalize, they demand a vast range of new applications and services from IT, along with the agility to deliver and evolve quickly. To address these challenges, the client unified monitoring, event management, and ITSM processes through codeless logic and machine learning. These core capabilities were quickly validated by customers including AstraZeneca, Fujitsu, Land O’Lakes, Prudential, The Hershey Company, Ubisoft, and 20th Century Fox.

About the Project

The project is to build a stack that runs the application to proactively identify IT issues, perform root cause analysis, and automatically remediate problems using Artificial Intelligence. The stack is developed on the Dataramp architecture with the following features:

  • Event data is sent to the Kafka data pipeline via a REST API (see the ingestion sketch after this list).
  • Agents can define their own rules and criteria for analyzing issues or events.
  • Rules are defined and stored in the database through the web application interface.
  • Data from agents is automatically processed against the rules stored in the rule engine.
  • A separate instance is created for each client within the cloud.
  • Instance provisioning is automated.
  • Rule engines are built using multiple Docker containers.
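
As a sketch of that ingestion path, the snippet below shows a minimal REST endpoint that forwards incoming events into Kafka. It is illustrative only, assuming Node.js with Express and the kafkajs client; the endpoint path, broker address, and topic name are assumptions, not the client's actual configuration.

```typescript
// Ingestion sketch: REST API -> Kafka data pipeline (Express + kafkajs assumed).
import express from 'express';
import { Kafka } from 'kafkajs';

const kafka = new Kafka({ clientId: 'ingest-api', brokers: ['localhost:9092'] });
const producer = kafka.producer();

const app = express();
app.use(express.json());

// Agents POST event payloads here; each event is buffered in a Kafka topic.
app.post('/events', async (req, res) => {
  const { deviceId, ...event } = req.body;
  if (!deviceId) {
    res.status(400).json({ error: 'deviceId is required' });
    return;
  }
  await producer.send({
    topic: 'device-status', // assumed topic name
    messages: [{ key: deviceId, value: JSON.stringify(event) }], // keyed for per-device ordering
  });
  res.status(202).json({ queued: true }); // accepted for asynchronous processing
});

producer.connect().then(() => app.listen(3000));
```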

Challenges

The following features are required for this AI-powered event monitoring system:

  • Develop an AI-powered AIOps event monitoring and ITSM platform.
  • Provide end-to-end visibility and actionable intelligence for dynamic IT environments.
  • Use machine learning algorithms and ITSM contextual data to automatically prioritize events, find root causes, and predict outages before they occur.
  • Collect high-volume, high-velocity raw data via a REST API, then process it using the algorithm implemented in the Dataramp architecture.
  • Create a rule engine that correlates the rules defined in it with incoming data, creates alerts based on the correlation, and stores them in the data store or sends them to the dashboard (a hypothetical rule shape is sketched after this list).
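
For illustration, a rule as persisted through the web interface might look like the TypeScript shape below. This is a hypothetical schema inferred from the description above; the field names, operators, and severities are assumptions, not the client's actual data model.

```typescript
// Hypothetical rule record as stored by the web UI (all field names assumed).
interface Rule {
  id: number;
  topic: string;                             // Kafka topic the rule watches
  field: string;                             // event attribute to inspect
  operator: 'eq' | 'gt' | 'lt' | 'contains'; // comparison applied to the field
  threshold: string | number;
  severity: 'info' | 'warning' | 'critical'; // drives event prioritization
  action: 'store' | 'dashboard';             // where the resulting alert goes
}

// Example: flag sustained high CPU on any device as critical.
const highCpuRule: Rule = {
  id: 1,
  topic: 'device-status',
  field: 'cpuUtilization',
  operator: 'gt',
  threshold: 95,
  severity: 'critical',
  action: 'dashboard',
};
```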

The client needs a complete platform that can solve the following use cases:

  • Prioritize events based on their measurable impact on the business.
  • Handle large volumes of incoming data from various sources and process them sequentially.
  • Determine the most likely root causes based on the infrastructure architecture, events, and performance issues.
  • Learn and detect event patterns that typically lead to service outages and degradations.
  • Reuse the existing codebase from the client's current tech stack.
  • Migrate existing functionality to the new stack without breaking user expectations, while improving the user experience.

Implementation

Most applications rely, directly or indirectly, on real-time data to personalize experiences, detect fraudulent behavior, or drive dynamic dashboards. Traditionally, these complex business requirements are met by running batch analytics on historical data or analytics on real-time data. The platform is a fully automated monitoring and reporting system that collects high-volume, high-velocity raw data via a REST API and processes it with the algorithm implemented on the Dataramp architecture.

Dataramp is a real-time, horizontally scalable, fault-tolerant, and fast data processing platform with pluggable data storage, analytics, and visualizations, coupled with notification modules tailored to specific business needs.

Here, the data mainly flows through two components of Dataramp:

  • Kafka data pipeline
  • Rule engines

The data pipeline component of Dataramp receives device data in a distributed manner and buffers it in Kafka. Data entering the Dataramp platform is divided into numbered streams of ordered messages and stored in topics such as device status and device ID. These capabilities of the Dataramp architecture are leveraged by a rule engine that correlates the rules defined in it with incoming data, creates alerts based on the correlation, and stores them in the data store or sends them to the dashboard, as in the consumer sketch below.
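
A minimal sketch of that consuming side, reusing the kafkajs client and the hypothetical topic and rule shape from the earlier sketches: the consumer reads the ordered message stream, correlates each event with the loaded rules, and emits an alert on a match.

```typescript
// Rule-engine consumer sketch: correlate Kafka events with rules (names assumed).
import { Kafka } from 'kafkajs';

type Rule = { field: string; operator: 'gt' | 'eq'; threshold: number | string; severity: string };

// Stand-in for rules the engine would load from the database.
const rules: Rule[] = [
  { field: 'cpuUtilization', operator: 'gt', threshold: 95, severity: 'critical' },
];

const matches = (rule: Rule, event: Record<string, any>): boolean =>
  rule.operator === 'gt'
    ? event[rule.field] > rule.threshold
    : event[rule.field] === rule.threshold;

const kafka = new Kafka({ clientId: 'rule-engine', brokers: ['localhost:9092'] });
const consumer = kafka.consumer({ groupId: 'rule-engine' });

async function run(): Promise<void> {
  await consumer.connect();
  await consumer.subscribe({ topics: ['device-status'] }); // assumed topic name
  await consumer.run({
    eachMessage: async ({ message }) => {
      const event = JSON.parse(message.value!.toString());
      for (const rule of rules.filter((r) => matches(r, event))) {
        // In the real platform the alert is stored or pushed to the dashboard.
        console.log(`ALERT [${rule.severity}] ${rule.field} = ${event[rule.field]}`);
      }
    },
  });
}

run().catch(console.error);
```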

The Solution

A fully automated platform to manage IT services, ensuring increased customer satisfaction by solving issues faster, built on the following stack:

  • We built a data processing architecture (Dataramp) using Kafka for the client.
  • Remote agents capture events from different data sources, such as servers and routers, using the Dataramp component called the data interface.
  • Remote agents write the captured events to Kafka inside the Dataramp architecture.
  • Kafka acts as the message broker, receiving messages from agents and pushing them out.
  • Rules are defined in the rule engine through the web application interface.
  • The rule engine prioritizes and manipulates events based on the rules.
  • Processed data is stored in the database.
  • Rule engine instances are created using Docker containers (see the provisioning sketch after this list).
  • Separate, automated instances are created for each client.
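
As a sketch of that per-client automation, the snippet below provisions an isolated rule-engine container for one client with the dockerode library. The image name, container naming scheme, and environment variable are illustrative assumptions; the project's actual provisioning pipeline used Jenkins, AWS, and Docker, per the stack below.

```typescript
// Per-client provisioning sketch using dockerode (image and env names assumed).
import Docker from 'dockerode';

const docker = new Docker(); // connects to the local Docker daemon socket

// Create and start an isolated rule-engine container for a single client.
async function provisionRuleEngine(clientId: string): Promise<void> {
  const container = await docker.createContainer({
    Image: 'rule-engine:latest',     // assumed image name
    name: `rule-engine-${clientId}`, // one container per client
    Env: [`CLIENT_ID=${clientId}`],  // scopes the engine to this client's rules
  });
  await container.start();
}

provisionRuleEngine('acme-corp').catch(console.error);
```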

Architecture Diagram

Screens

Technology Stack Used

Front End: Angular 4
Backend: LoopBack (Node.js)
Database: MySQL
Infrastructure provisioning: Jenkins, AWS, Docker
Analytics: Apache Kafka, Apache Spark, Elasticsearch

Looking for a similar app?