What is data pre-processing?

  • Data pre-processing is the step in machine learning where we transform raw data before feeding it to the model.
  • This makes the data easier for the machine to parse, which improves both the model's ability to learn and its efficiency.
  • Data comes in huge amounts and in many different formats that the machine cannot understand directly, so pre-processing it is very helpful.
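
As a rough illustration of what "transforming data before feeding it to the model" can mean, here is a minimal sketch in plain Python. The sample records, the column names (`age`, `city`), and the helper names `standardize` and `one_hot` are all hypothetical, chosen only for this example; in practice a library such as scikit-learn would typically handle these steps.

```python
from statistics import mean, pstdev

# Hypothetical raw records: one numeric column and one categorical column.
raw = [
    {"age": 20, "city": "Pune"},
    {"age": 30, "city": "Delhi"},
    {"age": 40, "city": "Pune"},
]

def standardize(values):
    """Rescale numbers to zero mean and unit variance."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # avoid dividing by zero for constant columns
    return [(v - mu) / sigma for v in values]

def one_hot(values):
    """Turn category labels into 0/1 indicator vectors."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return encoded, categories

ages = standardize([r["age"] for r in raw])
cities, city_names = one_hot([r["city"] for r in raw])

# Each row is now purely numeric and can be passed to a model.
features = [[a] + c for a, c in zip(ages, cities)]
```

After this step the text column has become numeric indicator columns and the numeric column is on a common scale, which is the kind of format most models expect.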

--

--

Introduction to GPU

GPUs are fast because they have high-bandwidth memory and hardware that performs floating-point arithmetic at significantly higher rates than conventional CPUs.

Processing large blocks of data is essentially what machine learning does, so GPUs come in handy for ML tasks. TensorFlow and PyTorch are examples of libraries that already make use of GPUs. With the RAPIDS suite of libraries, we can also manipulate data frames and run machine learning algorithms on GPUs.

--

--

Scikit-learn is a machine learning library for the Python programming language.

• It is used for tasks like classification, regression, and clustering.

• It is built on NumPy, SciPy, and matplotlib.

• It is written in C, C++, Python, and Cython.

• It can be installed from the Intel channel with: conda install -c intel scikit-learn

• Or from the default channel with: conda install scikit-learn

Supervised learning is when the model is trained on a labelled dataset. A labelled dataset is one that has both input and output parameters. In this type of learning, both the training and validation datasets are labelled, as shown in the figures below.

Unsupervised learning is a type of machine learning used to draw inferences from datasets consisting of input data without labelled responses.
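
The difference between the two can be sketched with scikit-learn. The toy data below is invented for illustration, and `LogisticRegression` and `KMeans` are simply two example estimators picked for this sketch, not the only possible choices.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

# Invented toy data: two well-separated groups of 2-D points.
X = np.array([[1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
              [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
y = np.array([0, 0, 0, 1, 1, 1])  # labels, used only by the supervised model

# Supervised: the model is fitted on inputs and their known outputs.
clf = LogisticRegression().fit(X, y)
predicted = clf.predict([[1.1, 0.9], [5.1, 5.0]])

# Unsupervised: the model sees only the inputs and groups them itself.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
clusters = km.labels_
```

Note that the cluster numbers assigned by `KMeans` are arbitrary, so the cluster numbered 0 need not correspond to the label 0 used by the classifier.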

--

--

From histograms to scatterplots, matplotlib offers an array of colors, themes, palettes, and other options to customize and personalize our plots. It is useful whether you’re performing data exploration for a machine learning project or simply want to create eye-catching charts.

Matplotlib can be used in Python scripts, the Python and IPython shells, the Jupyter notebook, web application servers, and four graphical user interface toolkits.

Here are the visualizations we can design using matplotlib:

Bar Graph

Pie Chart

Box Plot

Histogram

Line Chart and Subplots

Scatter Plot
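
To make a couple of these chart types concrete, here is a minimal sketch that draws a bar graph and a scatter plot as two subplots. The sample data and the variable names are invented for illustration, and the "Agg" backend is selected only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the script runs without a display
import matplotlib.pyplot as plt

# Invented sample data for illustration.
languages = ["Python", "R", "Julia"]
users = [50, 30, 20]
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]

# Two subplots side by side: a bar graph and a scatter plot.
fig, (ax_bar, ax_scatter) = plt.subplots(1, 2, figsize=(8, 3))

ax_bar.bar(languages, users, color="steelblue")
ax_bar.set_title("Bar Graph")
ax_bar.set_ylabel("Users")

ax_scatter.scatter(x, y, color="darkorange")
ax_scatter.set_title("Scatter Plot")
ax_scatter.set_xlabel("x")
ax_scatter.set_ylabel("y")

fig.tight_layout()
```

Calling `fig.savefig(...)` with a file name of your choice, or `plt.show()` in an interactive session, would then write or display the figure.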

--

--

Pandas object types:

Pandas has two main object types:

Series: A Series is a one-dimensional, list-like object in pandas that can hold integer values, string values, floating-point values, and more.

DataFrame: A DataFrame is made up of one or more Series; in other words, it is a collection of Series that can be analyzed together.
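
The relationship between the two object types can be sketched as follows. The sample values and the column names `name` and `age` are invented for illustration.

```python
import pandas as pd

# A Series: one labelled column of values (sample values are invented).
ages = pd.Series([25, 32, 47], name="age")
names = pd.Series(["Asha", "Ravi", "Meera"], name="name")

# A DataFrame: several Series sharing the same index, one per column.
df = pd.DataFrame({"name": names, "age": ages})

# Selecting a single column gives back a Series.
age_column = df["age"]
mean_age = age_column.mean()
```

Here `df` has three rows and two columns, and `age_column` is again a Series, which is why column-level operations such as `mean()` work the same way on both.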

--

--

Society of AI

Society of AI has a vision to educate people about how Artificial Intelligence can change their lives!