5-Day Training | 25-29 Oct


  • Sai Kumar Pilla, Senior Trainer, Big Data Analytics and Data Engineering
  • Insha Mearaj, Senior Trainer
Duration: 5 days
Capacity: 15 pax
Seats Available: 15
Difficulty: Intermediate

Price: $4,200.00 (reduced from $6,299.00)




DATE: 25-29 October 2020

TIME: 9:00 to 18:00 GST (GMT+4)


This immersive 5-day training dives deep into the big data analytics ecosystem and gets trainees hands-on with Hadoop and processing tools such as Spark and Kafka. Trainees will become familiar with handling Big Data-related cyber datasets, from acquisition and processing through to detection, visualisation and analytics.


This training isn’t just a tools manual; it is a comprehensive study of how to properly structure relevant datasets and accurately extract valuable information for responsive, real-time anomaly detection and mitigation.


Trainees will delve into how Big Data technologies enable deep dives into logs, converting semi-structured or unstructured data into structured data from which valuable information can be extracted to detect fraud, errors and more. This addresses the critical need to detect data anomalies in real time: the ability to do so supports quick decisions and helps avoid serious consequences by responding on time.
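As a small, tool-free illustration of this idea (not course material itself), a semi-structured log line can be lifted into structured fields with a regular expression; the log format and field names below are hypothetical:

```python
import re

# Hypothetical Apache-style access log line (semi-structured text).
LOG_LINE = '192.168.1.10 - - [25/Oct/2020:09:15:02 +0400] "GET /login HTTP/1.1" 401 512'

# Regex that lifts the free-text line into named, structured fields.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+)'
)

def parse_log(line):
    """Convert one semi-structured log line into a structured dict."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None  # unparseable lines can be routed to a dead-letter store
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["size"] = int(record["size"])
    return record

record = parse_log(LOG_LINE)
```

Once rows are structured like this, a 401 spike from a single IP becomes a simple query rather than a manual log read.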

Why should you take this course?

If you are looking to…

  • Understand and implement real-time analytics using tools like Hadoop, Spark and Kafka.
  • Implement SOC net-flow data analysis.
  • Learn how to capture data in real time and extract valuable information from it within minutes.
  • Get data into visualizations for more insight with ease, and learn how to generate reports.
  • Learn how to troubleshoot and detect data anomalies in seconds using streaming and ML libraries.
  • Learn and apply the raw-to-refined workflow on different types of datasets, with interesting and important use cases.
  • Receive post-training support.
  • Assess your learning takeaways through a post-training practical test.

…then this training is for you!
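To give a flavour of the "detect anomalies in seconds" goal — the course uses Spark streaming and ML libraries, whereas this toy sketch uses only the Python standard library — a z-score check flags values far from the series mean:

```python
from statistics import mean, stdev

def zscore_anomalies(values, threshold=2.5):
    """Return indices of values more than `threshold` standard deviations
    from the mean of the series (a simple point-anomaly detector)."""
    mu = mean(values)
    sigma = stdev(values)
    return [i for i, v in enumerate(values)
            if sigma > 0 and abs(v - mu) / sigma > threshold]

# Hypothetical requests-per-minute series with one burst, as might be
# seen during a scan or DoS attempt.
traffic = [120, 118, 125, 122, 119, 121, 950, 117, 123]
anomalies = zscore_anomalies(traffic)
```

The same scoring idea scales out when the mean and deviation are maintained incrementally over a stream instead of recomputed per batch.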

Apache Spark is among the most sought-after Big Data technologies for processing data and performing real-time analytics.

Hadoop is an open-source framework designed for Big Data analytics.

Used by more than 80% of all Fortune 100 companies, Apache Kafka is an open-source distributed event streaming platform.

Used by 1,489 companies and counting, Kibana is a free and open user interface that lets you visualize your Elasticsearch data and navigate the Elastic Stack.

Metabase is an open-source business intelligence tool that makes it easy to share questions and dashboards with the rest of your team.

Key Learning Objectives

  • How to handle cyber data using Big Data technologies suited to real-time analytics.
  • Understanding Big Data technologies, mainly the Hadoop ecosystem.
  • Understanding the ecosystem and applying it (Acquire, Arrange, Process, Analyze and Visualize).
  • 80% hands-on, using different types of cyber-related datasets.
  • A learn-and-apply model throughout.
  • Applying machine learning algorithms to uncover insights.
  • Learning how to perform real-time analytics using streams.
  • How to ingest and work with structured, semi-structured and unstructured data.
  • Understanding various techniques for data cleansing and analysis.
  • Preparing for Big Data-related certifications.
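As a minimal sketch of the windowed aggregations behind "real-time analytics using streams" (the course does this with Spark and Kafka; this stand-in is plain Python), events can be counted per key over a sliding time window:

```python
from collections import deque

class SlidingWindowCounter:
    """Count events per key over the last `window` seconds of a stream,
    mimicking the windowed aggregations used in real-time analytics."""
    def __init__(self, window):
        self.window = window
        self.events = deque()  # (timestamp, key) in arrival order

    def add(self, timestamp, key):
        self.events.append((timestamp, key))
        # Evict events that have fallen out of the window.
        while self.events and self.events[0][0] <= timestamp - self.window:
            self.events.popleft()

    def counts(self):
        result = {}
        for _, key in self.events:
            result[key] = result.get(key, 0) + 1
        return result

# Hypothetical source-IP events: the event at t=0 ages out of the
# 60-second window once the t=65 event arrives.
win = SlidingWindowCounter(window=60)
for ts, ip in [(0, "10.0.0.1"), (10, "10.0.0.2"), (65, "10.0.0.1")]:
    win.add(ts, ip)
```

Per-key counts over a recent window are exactly the shape of signal a SOC uses to spot brute-force or scanning behaviour as it happens.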

Who Should Attend

  • Cyber Analyst – SOC
  • SOC manager and Director
  • Big Data Engineer
  • Data Analyst and Researcher
  • Developers (Python and SQL)

Prerequisite Knowledge

This training is suitable for experienced professionals with knowledge of Python, SQL and databases.


Hardware / Software Requirements

Hardware: 16 GB RAM, 500 GB HDD, dual-core i5 or better.

Software: Windows 10, VMware Workstation 10 or later (64-bit)



Day 1

Introduction and Expectations
Introduction to Big Data and why Hadoop?
Apache Hadoop fundamentals and ecosystem
Hadoop architecture (HDFS / YARN)
Understanding cyber datasets
Data ingestion using Sqoop, Flume, Kafka streams and Spark streams
Sqoop use case scenarios
Flume use case scenarios
Ingesting real-time data from Twitter and servers
Ingesting real-time stream data

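One way to picture the stream-ingestion topics above: Spark Streaming ingests live data as micro-batches, grouping an unbounded stream into small chunks for processing. A pure-Python sketch of just that grouping idea (not the Spark API) might look like:

```python
def micro_batches(stream, batch_size):
    """Group an unbounded record stream into fixed-size micro-batches,
    the ingestion model Spark Streaming applies to live data."""
    batch = []
    for record in stream:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # flush the final partial batch

# Seven hypothetical records arriving on a stream, batched in threes.
batches = list(micro_batches(iter(range(7)), batch_size=3))
```

Each yielded batch is then processed as a small, ordinary dataset, which is what makes near-real-time analytics tractable.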

Day 2

Introduction to Apache Spark
Spark architecture
Spark cluster managers and Spark SQL
Writing Spark programs using IDEs
Creating RDDs, DataFrames and Datasets
Analyzing cyber data using Spark
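Spark itself runs on a cluster, so as a stand-in for the Day 2 analysis step, here is the same group-and-count shape in plain Python over hypothetical net-flow records; in PySpark this would be a `groupBy("src_ip").count()` on a DataFrame:

```python
from collections import Counter

# Hypothetical net-flow records: (source IP, destination port, bytes).
flows = [
    ("10.0.0.5", 22, 4096),
    ("10.0.0.5", 22, 1024),
    ("10.0.0.9", 443, 8192),
    ("10.0.0.5", 3389, 512),
]

# Count connections per source IP -- the pure-Python analogue of a
# group-by-and-count aggregation in Spark SQL.
connections_per_ip = Counter(src for src, _, _ in flows)
```

A source IP dominating the connection count across many ports is the kind of aggregate that surfaces scanners in SOC net-flow analysis.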

Day 3

Spark advanced programs
Writing Machine learning algorithms
Spark streaming
Developing ML logics on cyber data using SparkML
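To hint at how ML-style scoring meets streaming in Day 3 (SparkML provides this at scale; the sketch below is standard-library Python), running statistics can be updated one event at a time with Welford's online algorithm, so an anomaly score is available as each record arrives:

```python
import math

class RunningStats:
    """Welford's online algorithm: update mean/variance one value at a
    time, so a z-score is available as each event streams in."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def stdev(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

# Hypothetical per-event metric values arriving on a stream.
stats = RunningStats()
for x in [10.0, 12.0, 11.0, 13.0]:
    stats.update(x)
```

Because nothing is recomputed from scratch, the same pattern works on unbounded streams where storing the full history is impossible.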

Day 4

Introduction to Kafka
Kafka architecture
Kafka cluster setup
Producer / Consumer APIs
Creating and working with Topics
Getting cyber data into Kafka
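Kafka's broker is a separate service, so purely to illustrate the topic/producer/consumer vocabulary from Day 4 (this is not the real Kafka API), a toy in-memory topic with per-consumer offsets might look like:

```python
class Topic:
    """Toy append-only log with per-consumer offsets, mirroring how a
    Kafka topic decouples producers from consumers."""
    def __init__(self):
        self.log = []
        self.offsets = {}  # consumer name -> next record to read

    def produce(self, record):
        self.log.append(record)

    def consume(self, consumer):
        start = self.offsets.get(consumer, 0)
        records = self.log[start:]
        self.offsets[consumer] = len(self.log)
        return records

# Hypothetical security events flowing through an "alerts" topic.
alerts = Topic()
alerts.produce({"event": "failed_login", "ip": "10.0.0.7"})
alerts.produce({"event": "port_scan", "ip": "10.0.0.9"})
first_read = alerts.consume("soc-dashboard")
second_read = alerts.consume("soc-dashboard")  # nothing new yet
```

Because each consumer tracks its own offset, several downstream systems can read the same event log independently and at their own pace.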

Day 5

Kafka streaming
Spark and Kafka integration
Visualizing results using BI
Use Cases related to Cyber

