Certified Python for Data Science: Certification by iTrain Malaysia

Certified Python Programmer for Data Science


Certified Python Programmer for Data Science Course Overview

Are you planning to become a data scientist? If so, you will need to learn the Python programming language. Why? Python is one of the most widely used programming languages among data scientists. It emphasises code readability and clear programming at both small and large scales, allowing you to focus on your research, product, or project.

In this course, you will be exposed to multiple development environments so you can choose the one that suits you best. You will be taught, step by step, how to program in Python. You will go through all the stages of a Data Science project: importing data, cleaning it, analysing it, and visualising it to reveal new insights.
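
To give a feel for those stages, here is a minimal sketch of importing, cleaning, analysing and visualising a tiny dataset. It is an illustration only, not course material: the sample data, the column names and the function names are all invented for this example, and it uses only the Python standard library where the course itself covers Pandas and Matplotlib.

```python
import csv
import io
import statistics

# Hypothetical sample data, invented for illustration only.
RAW_CSV = """city,temperature
Kuala Lumpur,31.5
Penang,30.0
Kuching,
Johor Bahru,32.5
"""


def import_data(text):
    """Import: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_data(rows):
    """Clean: drop rows with a missing temperature, convert the rest to float."""
    return [
        {"city": row["city"], "temperature": float(row["temperature"])}
        for row in rows
        if row["temperature"].strip()
    ]


def analyse_data(rows):
    """Analyse: compute the mean temperature of the cleaned rows."""
    return statistics.mean(row["temperature"] for row in rows)


def visualise_data(rows):
    """Visualise: build a text bar chart, one '#' per whole degree."""
    return [f"{row['city']:<13}{'#' * int(row['temperature'])}" for row in rows]


rows = clean_data(import_data(RAW_CSV))
mean_temperature = analyse_data(rows)
chart = visualise_data(rows)

print(f"Mean temperature: {mean_temperature:.2f}")
print("\n".join(chart))
```

Each stage is a separate function, so any one of them can later be swapped for a library-based version without touching the others.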

In summary, you will gain a solid understanding of Python for Data Science from the ground up.

Learning Outcomes

Upon completion of this course, you will be able to:

  • Recognise the meaning of the terms “Data Science” and “Machine Learning”.
  • Understand the basics of Python.
  • Develop and write code easily in Python.
  • Deal easily with files and file systems.
  • Deal with different sources of data.
  • Analyse and visualise data to gain new insights.

Course Outline


Beginner to Intermediate (5 Days)

Day 1

  • What is an Algorithm?
  • What is Programming?
  • The Natural Language of the Computer
  • Machine Language
  • Programming Language Levels
  • Translators

  • Identifiers, Lists, and Tuples
  • Dictionaries, Sets and Strings
  • Operators, Control Structures and Loops

Day 2

  • Installing and Running Jupyter
  • User Interface
  • Checkpoints

  • Functions
  • Lambda and Map Functions
  • Globals and Locals

  • List Comprehension
  • Generator Expressions
  • Exception Handling

  • Modules
  • Documentation
  • Packages and Namespaces

  • Create, Read, Update, Delete (CRUD) a File

Day 3

  • OOP in General
  • Classes
  • Objects
  • Constructors
  • Instance/Class Data
  • Instance/Class Method
  • Inheritance

  • Working with File Systems
  • Walking Directory Trees
  • Paths
  • Filenames
  • Directories

  • Creating a File
  • Reading a File
  • Updating a File
  • Deleting a File
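
As a rough illustration of the four file operations listed above, the sketch below creates, reads, updates and deletes a text file using the standard `pathlib` and `tempfile` modules. The file name `notes.txt` and the variable names are assumptions made for this example; the course may teach these steps differently.

```python
import tempfile
from pathlib import Path

# Work inside a temporary directory so nothing is left behind on disk.
with tempfile.TemporaryDirectory() as tmp_dir:
    notes = Path(tmp_dir) / "notes.txt"  # hypothetical file name

    # Create: write the initial content (creates the file if it is missing).
    notes.write_text("first line\n", encoding="utf-8")

    # Read: load the whole file back as a string.
    created_text = notes.read_text(encoding="utf-8")

    # Update: open in append mode and add a second line.
    with notes.open("a", encoding="utf-8") as handle:
        handle.write("second line\n")
    updated_text = notes.read_text(encoding="utf-8")

    # Delete: remove the file, then check that it is gone.
    notes.unlink()
    still_exists = notes.exists()
```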

  • What is JSON and Why Is It Important?
  • Module, Serialisation and Deserialisation
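
For readers new to the terms, serialisation turns a Python object into JSON text and deserialisation turns that text back into an object. The sketch below shows one round trip with the standard `json` module; the `student` record is invented purely for illustration.

```python
import json

# Hypothetical record, invented for illustration only.
student = {"name": "Aisha", "days_attended": 5, "topics": ["NumPy", "Pandas"]}

# Serialisation: Python object -> JSON text.
encoded = json.dumps(student, indent=2)

# Deserialisation: JSON text -> Python object.
decoded = json.loads(encoded)

print(encoded)
```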

  • What is Web Scraping?
  • HTML Tags
  • BeautifulSoup Module
  • Webpage Scraping Phase

Day 4

  • What is NumPy?
  • Ndarray Object, Data Types
  • Array Attributes, Array Creation Routines
  • Indexing and Slicing
  • Array Manipulation
  • Mathematical Functions

  • What is Pandas?
  • Series
  • Dataframes
  • Data Importing
  • Data Pre-processing
  • Data Grouping

  • What is Matplotlib?
  • Line Graphs
  • Bar Graphs
  • Pie Charts
  • Histograms
  • Scatter Plots
  • Graph Attributes
  • Text Annotation

Day 5

  • What is Machine Learning?
  • Machine Learning Algorithm Types
  • Main Steps in Machine Learning Projects
  • Introduction to Scikit-learn Module

Advanced (4 Days)

Day 1: Applied Machine Learning (ML)

  • What is Machine Learning?
  • Introduction to Scikit-learn
  • Machine Learning Steps

  • What is a Dataset?
  • Iris Dataset
  • Handwritten Digits Dataset
  • Dataset Distribution

  • What is Supervised Learning?
  • Key Classification Algorithms
    • K-Nearest Neighbors (KNN)
    • Support Vector Machine (SVM)
    • Decision Tree (DT)
  • Performance Metrics and Errors
  • Regression

  • What is Unsupervised Learning?
  • Key Clustering Algorithms
    • K-Means
    • Mean Shift
  • Principal Component Analysis
  • Dimensionality Reduction

  • Introduction to Neural Networks
  • Multi-Layer Perceptron Classifier
  • Hidden Layers
  • Activation Function
  • Solver

Day 2: Applied Natural Language Processing (NLP)

  • What is NLP?
  • Basic Text Analysis with Python
  • Introduction to NLTK

  • Tokenise Words and Sentences
  • Stop Words
  • Regular Expressions
  • Stemming
  • Part-of-Speech (POS) Tagging

  • What is a Corpus?
  • Popular NLTK Corpora
  • Build Your Own Corpus

  • Text Classification
  • NLTK and Scikit-learn
  • Save and Load the Model

Day 3: Social Network Analysis (SNA)

  • Why Networks Are Very Important
  • Graphs
  • Nodes and Edges
  • Introduction to NetworkX Module

  • Clustering Coefficient
  • Distance Measures
  • Connected Components
  • Network Robustness

  • Degree and Closeness Centrality
  • Betweenness Centrality
  • Hubs and Authorities

  • Power Law
  • Small World Network
  • Link Prediction
  • Use Cases

Day 4

“Good course for beginners in Python programming language.”

Maxolvin Sintore, Technical Data Analyst, Sarawak Shell Berhad

“I’ll recommend this! A mind blowing experience and learning process.”

Quah Chen Nam, Senior Manager, Bursa Malaysia Berhad

“Trainer’s explanation has been very precise and helpful.”

New Ru Wee, Engineer

“The course is very good, quite easy to follow & interactive.”

Balqis, Technical Data Analyst, Sarawak Shell Berhad

“The instructor was helpful in providing clear and structured explanation to someone with no prior programming background like myself.”

Ng Pui Kye, Data Scientist, Zillionquest Sdn Bhd



Students will be given a Certificate of Attendance after successfully completing the course.

You bet it is! Our certification body for this course is iTrain Asia Pte Ltd, the region’s top tech certification provider, headquartered in Singapore with branch offices in Malaysia and Indonesia.

Upon completion of this course, you will be able to:

● Explain the data science workflow and apply data science concepts with Python
● Analyse datasets and solve data science problems with Python

The Beginner to Intermediate course runs for 5 days and the Advanced course runs for 4 days. Both are instructor-led and held at a training centre.

Computers are provided for iTrain students. However, participants may use their own computers as long as the necessary applications are installed.

Trusted By Public, Private & Education Sectors