Data Analysis-training-in-bangalore-by-zekelabs

Data Analysis Training

Data Analysis Course:

Data analysis using Python is about making data do the talking. Using libraries such as NumPy, pandas, and matplotlib, this course teaches how to draw conclusions from data before subjecting it to machine learning. Pandas provides extensive utilities for data analysis: merging, grouping, aggregation, and much more.
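
As a rough illustration of the pandas utilities mentioned above (merging, grouping, and aggregation), here is a minimal sketch. The tables, column names, and values are hypothetical and invented purely for this example; they are not taken from the course material.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
orders = pd.DataFrame({
    "customer_id": [1, 2, 1, 3],
    "amount": [250.0, 100.0, 50.0, 75.0],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "city": ["Bangalore", "Pune", "Bangalore"],
})

# Merging: attach each customer's city to their orders (left join on customer_id).
merged = orders.merge(customers, on="customer_id", how="left")

# Grouping and aggregation: total and mean order amount per city.
summary = merged.groupby("city")["amount"].agg(["sum", "mean"])
print(summary)
```

A left join is used here so that every order is kept even if a customer record were missing; an inner join would drop such rows instead.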

Assignments
Industry Level Projects
Certification

Data Analysis Course Curriculum



Data analysis and processing
Python libraries in data analysis
Pandas
PyMongo
NumPy arrays
Array creation
Fancy indexing
Array functions
Loading and saving data
Loading an array
NumPy random numbers
An overview of the Pandas package
Series
The essential basic functionality
Head and tail
Functional statistics
Sorting
Computational tools
Advanced uses of Pandas for data analysis
The Panel data
The matplotlib API primer
Figures and subplots
Scatter plots
Contour plots
Legends and annotations
Additional Python data visualization tools
MayaVi
Time series primer
Resampling time series
Upsampling time series data
Timedeltas
Interacting with data in text format
Writing data to text format
HDF5
Interacting with data in Redis
List
Ordered set
Data munging
Filtering
Reshaping data
Grouping data
An overview of machine learning models
Data representation in scikit-learn
Unsupervised learning – clustering and dimensionality reduction
Introducing predictive modelling
Ensemble of statistical algorithms
Historical data
Business context
Task matrix for predictive modelling
LinkedIn's "People also viewed" feature
How is it done?
How was it done?
Anaconda
Installing a Python package
Installing Python packages with pip
IDEs for Python
Reading the data – variations and examples
Delimiters
Case 1 – reading a dataset using the read_csv method
Use cases of the read_csv method
Reading a .txt dataset with a comma delimiter
Case 2 – reading a dataset using the open method of Python
Changing the delimiter of a dataset
Case 4 – miscellaneous cases
Writing to a CSV or Excel file
Handling missing values
What constitutes missing data?
Treating missing values
Imputation
Visualizing a dataset by basic plotting
Histograms
Subsetting a dataset
Selecting rows
Creating new columns
Various methods for generating random numbers
Generating random numbers following probability distributions
Cumulative density function
Normal distribution
Geometry and mathematics behind the calculation of pi
Grouping the data – aggregation, filtering, and transformation
Filtering
Miscellaneous operations
Method 1 – using the Customer Churn Model
Method 3 – using the shuffle function
Merging/joining datasets
Left Join
An example of the Inner Join
An example of the Right Join
Random sampling and the central limit theorem
Null versus alternate hypothesis
Confidence intervals, significance levels, and p-values
A step-by-step guide to do a hypothesis test
Chi-square tests
Understanding the maths behind linear regression
Fitting a linear regression model and checking its efficacy
Making sense of result parameters
F-statistics
Implementing linear regression with Python
Multiple linear regression
Variance Inflation Factor
Training and testing data split of models
Feature selection with scikit-learn
Handling categorical variables
Handling outliers
Linear regression versus logistic regression
Contingency tables
Odds ratio
Estimation using the Maximum Likelihood Method
Log likelihood function:
Making sense of logistic regression parameters
Likelihood Ratio Test statistic
Implementing logistic regression with Python
Data exploration
Creating dummy variables for categorical variables
Implementing the model
Cross validation
The ROC curve
Introduction to clustering – what, why, and how?
How is clustering used?
Mathematics behind clustering
Euclidean distance
Minkowski distance
Normalizing the distances
Single linkage
Average linkage
Ward's method
K-means clustering
Importing and exploring the dataset
Hierarchical clustering using scikit-learn
Interpreting the cluster
The elbow method
Introducing decision trees
Understanding the mathematics behind decision trees
Entropy
ID3 algorithm to create a decision tree
Reduction in Variance
Handling a continuous numerical variable
Implementing a decision tree with scikit-learn
Cross-validating and pruning the decision tree
Regression tree algorithm
Understanding and implementing random forests
Implementing a random forest using Python
Important parameters for random forests
Best practices for coding
Defining functions for substantial individual tasks
Example 2
Avoid hard-coding of variables as much as possible
Using standard libraries, methods, and formulas
Best practices for algorithms
Best practices for business contexts
Data, information, knowledge, and insight
Information
Data analysis and insight
Transforming data into information
Data preprocessing
Organizing data
Transforming information into knowledge
Data visualization history
Minard's Russian campaign (1812)
Statistical graphics (1850-1915)
How does visualization help decision-making?
Data visualization today
Visualization plots
Bar graphs
Box plots
Scatter plots
KDE plots
Why does visualization require planning?
A sports example
Creating interesting stories with data
Reader-driven narratives
The State of the Union address
A few other example narratives
Perception and presentation methods
Some best practices for visualization
Correlation
Location-specific or geodata
Trends over time
Development tools
Anaconda from Continuum Analytics
Event listeners
Circular layout
Balloon layout
NumPy, SciPy, and MKL functions
NumPy universal functions
An example of interpolation
SciPy
The vectorized numerical derivative
The performance of Python
Slicing
Array indexing
Logical indexing
Stacks
Sets
Dictionaries
Sparse matrices
Dictionaries for memoization
Visualization using matplotlib
Installing word clouds
Web feeds
Plotting the stock price chart
The visualization example in sports
The deterministic model
The stochastic model
What exactly is Monte Carlo simulation?
Monte Carlo simulation in basketball
Implied volatilities
The simulation model
The diffusion-based simulation
Schelling's Segregation Model
K-nearest neighbors
Bayesian linear regression
Classification methods
Linear regression
An example
The Naïve Bayes classifier
Installing TextBlob
The Naïve Bayes classifier using TextBlob
k-nearest neighbors
Support vector machines
Installing scikit-learn
Directed graphs and multigraphs
Displaying graphs
NetworkX
PageRank
Analysis of social networks
The directed acyclic graph test
A genetic programming example
Computer simulation
SciPy's random functions
Signal processing
Visualization methods using HTML5
D3.js for visualization

Frequently Asked Questions


This "Data Analysis" course is an instructor-led training (ILT). The trainer travels to your office and delivers the training on your premises. If you need a training space, we can provide a fully equipped lab with all the required facilities. Online instructor-led training is also available if required. Online training is live: the instructor's screen is visible and their voice is audible to participants. Participants' screens are also visible, and participants can ask questions during the live session.

Participants will be provided with "Data Analysis"-specific study material. Participants will have lifetime access to all the code and resources needed for this "Data Analysis" course. Our public GitHub repository and the study material will also be shared with the participants.

All courses from zekeLabs are hands-on. The code and documents used in class will be provided to the participants. A cloud lab and virtual machines are provided to every participant during the "Data Analysis" training.

The duration of the "Data Analysis" training varies with several factors, including the team's prior knowledge of the subject, the team's learning objectives, and the customization needed in the course, among others. Contact us to know more about the "Data Analysis" course duration.

The "Data Analysis" training is organised at the client's premises. We have delivered and continue to deliver "Data Analysis" training in India, the USA, Singapore, Hong Kong, and Indonesia. We also have state-of-the-art training facilities, available if the client requires them.

Our subject matter experts (SMEs) have more than ten years of industry experience. This helps make the learning program a holistic, 360-degree learning experience. The course program has been designed in close collaboration with experts working at organizations such as Google, Microsoft, and Amazon.

Yes, absolutely. For every training, we conduct a technical call between our subject matter expert (SME) and the technical lead of the team undergoing training. The course is tailored to the current expertise of the participants, the objectives of the team undergoing the training program, and the short-term and long-term objectives of the organisation.

Email us at [email protected] or call us at +91 8041690175, and we will get back to you as soon as possible with answers to your queries on the "Data Analysis" course.




Recommended Courses


Apache Kafka
Machine Learning using AWS SageMaker
Machine Learning using Tensorflow