Python for Data Science

About This Course

Data science is becoming increasingly important in today’s interconnected, data-driven world. Consumers, sensors, and scientific experiments generate data at an ever-increasing pace. In finance, business, administration, and the natural and social sciences, working with data can make up a significant part of the job, and the ability to work efficiently with small or large datasets has become a valuable skill. Python is increasingly the programming language of choice for data analytics because of its flexibility and gentle learning curve.

NumPy is an open-source extension to Python that adds support for large multidimensional arrays. This support enables the acquisition, storage, and complex manipulation of data described above. NumPy alone is a great tool for many numerical computations: the library contains algorithms and mathematical tools for manipulating NumPy objects, aimed at scientific and engineering applications. The combination of Python and NumPy has been the environment of choice for many applied mathematicians for years.
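
As a rough illustration of the multidimensional array support described above, the following minimal sketch builds a small two-dimensional array and computes a few summaries. The sensor readings and variable names are made up for illustration and are not taken from the course material.

```python
import numpy as np

# Hypothetical readings: 3 sensors (rows) x 4 time steps (columns).
readings = np.array([
    [20.1, 20.4, 20.9, 21.3],
    [19.8, 20.0, 20.2, 20.5],
    [22.0, 21.7, 21.5, 21.1],
])

# Shape of the array as (rows, columns).
print(readings.shape)

# Mean reading of each sensor (one value per row).
sensor_means = readings.mean(axis=1)

# Highest reading at each time step (one value per column).
step_max = readings.max(axis=0)

# Broadcasting: subtract each sensor's own mean from its row.
centred = readings - sensor_means[:, np.newaxis]
print(centred)
```

The `axis` argument selects the dimension along which values are aggregated, which is the main idea to take away from this sketch.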

Pandas is another open-source Python library that eases data acquisition, data cleaning/pre-processing, and data transformation from various sources such as CSV and Excel files. Simple data exploration and the calculation of common statistics can be performed easily in Pandas.
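
The workflow described above (acquire, clean, summarise) might look like the following minimal sketch. The CSV content is an invented in-memory sample so that the example runs without an external file; in practice `pd.read_csv` would usually be given a file path.

```python
import io

import pandas as pd

# Hypothetical CSV data with two missing values (illustration only).
csv_text = """machine,temperature,output
A,71.5,120
B,,95
A,69.0,130
B,75.2,
"""

# Data acquisition: parse the CSV text into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Data cleaning: drop every row that contains a missing value.
clean = df.dropna()

# Simple exploration: common statistics for the numeric columns.
summary = clean.describe()
mean_temperature = clean["temperature"].mean()

print(summary)
print(mean_temperature)
```

Dropping incomplete rows is only one possible cleaning strategy; filling missing values with `fillna` is a common alternative when rows are too valuable to discard.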

By the end of the course, participants should be familiar with using Python together with popular Python data science libraries such as NumPy, Pandas, matplotlib and seaborn to perform various data analysis and visualisation tasks. The course will be presented in a hands-on workshop format, in which students will get the chance to write Python scripts for data analysis.

Learning Objectives

Upon completion of this course, participants will acquire the following skills/knowledge:

  • Gain a general understanding of Industry 4.0 and data science elements, concepts, terms and applications
  • Understand the fundamentals of NumPy and Matplotlib
  • Learn how to use the linear algebra and optimisation packages
  • Understand the statistics and signal processing packages
  • Perform data import and export with Python Pandas
  • Learn how to use Series and DataFrame data types
  • Learn how to use functions such as groupby, merge and pivot tables for data aggregation
  • Understand the basic statistics used in big data analytics for extracting useful information from data
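
To give a flavour of the aggregation functions listed above, here is a minimal sketch using two small, invented DataFrames. The column names and values are hypothetical and serve only to show what `groupby`, `merge` and `pivot_table` do.

```python
import pandas as pd

# Hypothetical production records (illustration only).
production = pd.DataFrame({
    "machine": ["A", "A", "B", "B"],
    "shift": ["day", "night", "day", "night"],
    "units": [120, 130, 95, 105],
})

# Hypothetical lookup table mapping each machine to a production line.
machines = pd.DataFrame({
    "machine": ["A", "B"],
    "line": ["Line 1", "Line 2"],
})

# groupby: total units per machine (returns a Series indexed by machine).
units_per_machine = production.groupby("machine")["units"].sum()

# merge: attach the production line to every production record.
combined = production.merge(machines, on="machine", how="left")

# pivot_table: machines as rows, shifts as columns, summed units as values.
table = production.pivot_table(
    index="machine", columns="shift", values="units", aggfunc="sum"
)

print(units_per_machine)
print(combined)
print(table)
```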

Prerequisites

Participants are assumed to have some prior knowledge of Python programming. They are encouraged to take the “Python Fundamentals” course before taking this course.

Target Audience

This course is suitable for those working in the manufacturing industry, as well as anyone interested in learning how to use Python to perform data analysis.

Training Outline

  1. Introduction to Industry 4.0 and Data Science
  2. Essentials of Python Programming for Data Science
  3. Basics of NumPy
  4. Numerical Analysis
  5. Linear Algebra
  6. Data Preparation and Transformation
  7. Data Visualisation
  8. Data Analysis
  9. Data Storage, Networking and Dashboard Creation