Be a Certified Data Science Specialist: CDSS Certification by iTrain

Certified Data Science Specialist (CDSS)

5-Day Instructor-Led Course | HRDF SBL-KHAS Claimable!

HRDF Industry Certification (INDCERT) 100% Funded - No Levy Deductions

Data Science Course Overview

Data Science is a field of Big Data which seeks to provide meaningful information from large amounts of complex data using various tools, algorithms and machine learning principles. It is the foundation of all current megatrends from social to mobile to the cloud to AI. The future belongs to Data Scientists who are able to transform data into strategic business decisions, value-driven products, and lead predictions.

In this course, you will learn how to leverage data to unlock new economic value for your business, and how to apply useful data science concepts to everyday life, from personal finances to reading, lifestyle habits, and work decisions. The Certified Data Science Specialist certification is a hands-on, guided course covering the concepts, tools, and techniques required to begin learning Data Science.

Combining a good balance of theoretical knowledge and practical application, students will learn the processes of gathering, cleaning and handling data. Key data science and big data concepts are taught using case study references to reinforce learning. Upon completion of the course, participants will be able to perform basic data handling tasks, collect and analyze data, and present the results using industry-standard tools.

This is a Vendor-Neutral Certification

Vendor-neutral Certifications validate the candidate’s unbiased knowledge of technology principles. They also ensure that the candidate has a good grasp of using different types of equipment or software interchangeably to satisfy an employer’s needs.

“A business and design approach that seeks to ensure broad compatibility and interchangeability of products and technologies. The model encompasses non-proprietary design principles and unbiased business practices.”

Margaret Rouse, TechTarget (a NASDAQ-listed technology media company)

Learning Outcomes

Upon completion of this course, you will be able to:

  • Identify the appropriate model for different data types
  • Create your own data process and analysis workflow
  • Define and explain the key concepts and models relevant to data science
  • Differentiate key data ETL process, from cleaning, processing to visualization
  • Implement algorithms to extract information from dataset
  • Apply best practices in data science, and become familiar with standard tools

Course Outline


Day 1

  • What is Data?
  • Types of Data
  • What is Data Science?
  • Statistical thinking
  • Knowledge Check
  • Lab Activity

  • Extract, Transform and Load (ETL)
  • Data Cleansing
  • Aggregation, Filtering, Sorting, Joining
  • Data Workflow
  • Knowledge Check
  • Lab Activity
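
As a rough illustration of the filtering, sorting, aggregation and joining topics listed above, here is a minimal sketch in plain Python. The records, field names and the threshold of 30 are hypothetical values invented for this example; they are not taken from the course labs, which may use different tools and datasets.

```python
from collections import defaultdict

# Hypothetical sample records, invented for illustration only.
orders = [
    {"order_id": 1, "customer_id": "C1", "amount": 120.0},
    {"order_id": 2, "customer_id": "C2", "amount": 35.5},
    {"order_id": 3, "customer_id": "C1", "amount": 80.0},
    {"order_id": 4, "customer_id": "C3", "amount": 15.0},
]
customers = {"C1": "Aina", "C2": "Ben", "C3": "Chong"}

# Filtering: keep only orders with an amount of at least 30.
large_orders = [o for o in orders if o["amount"] >= 30]

# Sorting: largest amount first.
sorted_orders = sorted(large_orders, key=lambda o: o["amount"], reverse=True)

# Aggregation: total amount per customer.
totals = defaultdict(float)
for o in large_orders:
    totals[o["customer_id"]] += o["amount"]

# Joining: attach the customer name through the shared customer_id key.
report = [
    {"customer": customers[cid], "total": total}
    for cid, total in sorted(totals.items())
]
print(report)
```

In practice the same steps are usually written with a data-frame library or SQL; plain Python is used here only to keep the sketch self-contained.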

  • Raw vs Tidy Data
  • Key features of data quality
  • Maintenance of data quality
  • Data profiling
  • Data completeness and consistency

  • Identify problem
  • Define question
  • Define ideal dataset
  • Obtain data
  • Analyze data
  • Interpret results
  • Distribute results
  • Knowledge Check

Day 2

  • Types of Databases
  • Relational Databases
  • NoSQL
  • Hybrid database
  • Knowledge Check
  • Lab Activity

  • Performing CRUD (Create, Retrieve, Update, Delete)
  • Designing a real-world database
  • Normalizing a table
  • Knowledge Check
  • Lab Activity

  • Basics of Python language
  • Functions and packages
  • Python lists
  • Functional programming in Python
  • NumPy and SciPy
  • IPython
  • Knowledge Check
  • Lab Activity

Lab: Exploring data using Python

Day 3

  • Obtain data from online repositories
  • Import data from local file formats (JSON, XML)
  • Import data using a Web API
  • Scrape websites for data
  • Knowledge Check
  • Lab Activity
  • Instructor-led case study

  • What is EDA?
  • Goals of EDA
  • The role of graphics
  • Handling outliers
  • Dimension reduction

  • Features of R
  • Vectors
  • Matrices and Arrays
  • Data Frame
  • Input / Output

Lab: Exploring data using R

Day 4

  • What is Text Mining?
  • Natural Language Processing
  • Pre-processing text data
  • Extracting features from documents
  • Using BeautifulSoup
  • Measuring document similarity
  • Knowledge Check
  • Lab Activity

  • What is prediction?
  • Sampling, training set, testing set
  • Constructing a decision tree
  • Knowledge Check
  • Lab Activity
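
To make the sampling, training set and testing set topics above concrete, here is a minimal sketch using only the Python standard library. The function name `train_test_split`, the 30% test fraction and the fixed seed are assumptions chosen for this illustration, not details from the course material.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (training, testing) lists."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)                      # leave the caller's data untouched
    random.Random(seed).shuffle(shuffled)      # seeded, so the split is repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical dataset: ten numbered records.
data = list(range(10))
train, test = train_test_split(data, test_fraction=0.3)
print(len(train), len(test))
```

A model such as a decision tree would then be fitted on `train` only and evaluated on `test`, so that the reported accuracy reflects records the model has not seen.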

Day 5

  • Choosing the right visualization
  • Plotting data using Python libraries
  • Plotting data using R
  • Using Jupyter Notebook to validate scripts
  • Knowledge Check
  • Lab Activity

  • Using Markdown language
  • Convert your data into slides
  • Data presentation techniques
  • The pitfalls of data analysis
  • Knowledge Check
  • Lab Activity
  • Group presentation

Lab: Mini Project

  • What is small data?
  • What is big data?
  • Big data analytics vs Data Science
  • Key elements in Big Data (3Vs)
  • Extracting values from big data
  • Challenges in Big data

  • Introducing Hadoop Ecosystem
  • Cloudera vs Hortonworks
  • Real-world big data applications
  • Knowledge Check
  • Group discussion

  • Preview of Data Science Specialist
  • Showing advanced data analysis techniques
  • Demo: Interactive visualizations

“I’ll recommend this. A mind blowing experience and learning process!”

Quah Chen Nam, Senior Manager, Bursa Malaysia Berhad

“This training class is good for learning the fundamentals of data science.”

Hafizzah Binti Hamizan, Analyst Programmer, Felda Prodata Systems Sdn Bhd

“Great course on data science subject using vendor neutral/open source tools.”

Lee Shiau Shin, System Engineer, AUO SunPower Sdn Bhd

“Managed to understand how data scientists approach the issue and what tools are available in the market today.”

Muzaini Mohammud, Measat Satellite Systems

“Broad coverage of knowledge in statistics, machine learning and data visualization.”

Hazly Amir, Researcher, TMR&D



You bet it is! Our Certification Body for this course is iTrain Asia Pte Ltd, the region’s top Certifications Tech Provider headquartered in Singapore, with branch offices in Malaysia and Indonesia.

Upon completion of this course, you will be able to:

● Identify the appropriate model for different data types
● Create your own data process and analysis workflow
● Define and explain the key concepts and models relevant to data science
● Differentiate key data ETL process, from cleaning, processing to visualization
● Implement algorithms to extract information from dataset
● Apply best practices in data science, and become familiar with standard tools

This is a 5-day instructor-led course held at a training centre. The CDSS certification exam lasts 2 hours and consists of 50 multiple-choice questions, with a passing score of 70%. You will receive a professional CDSS certification upon passing the exam.

This workshop is intended for individuals who are interested in learning data science, or who want to begin their career as a data scientist.

All participants should have a basic understanding of data and relations, and a basic knowledge of mathematics.

Computers are provided for iTrain students. However, participants may also use their own computers, provided the necessary applications are installed.

Trusted By Public, Private & Education Sectors