UNIT-I
What is Data Science?, The Data Science Process, A Data Scientist's Role in This Process, NumPy Basics:
The NumPy ndarray: A Multidimensional Array Object (Creating ndarrays, Data Types for ndarrays,
Operations between Arrays and Scalars, Basic Indexing and Slicing, Boolean Indexing, Fancy
Indexing, Transposing Arrays and Swapping Axes), Universal Functions: Fast Element-wise Array
Functions, Data Processing Using Arrays (Expressing Conditional Logic as Array Operations,
Mathematical and Statistical Methods, Methods for Boolean Arrays, Sorting, Unique and Other Set
Logic), File Input and Output with Arrays (Storing Arrays on Disk in Binary Format, Saving and
Loading Text Files)
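As an illustration of the NumPy topics listed above, the following minimal sketch touches scalar operations, boolean and fancy indexing, conditional logic with np.where, set logic, and binary storage. The sample array and all variable names are assumptions made for this example, and an in-memory buffer stands in for a file on disk.

```python
import io
import numpy as np

# Hypothetical sample data (rows = students, columns = tests).
scores = np.array([[72, 85, 90], [55, 64, 48], [91, 77, 83]])

# Operation between an array and a scalar (applied element-wise).
scaled = scores / 100.0

# Boolean indexing: keep the rows whose mean is at least 70.
passing = scores[scores.mean(axis=1) >= 70]

# Conditional logic expressed as an array operation.
labels = np.where(scores >= 60, "pass", "fail")

# Fancy indexing (pick rows 2 and 0, in that order) and transposing.
reordered = scores[[2, 0]]
transposed = scores.T

# Unique values of a small array.
unique_vals = np.unique(np.array([3, 1, 3, 2, 1]))

# Round trip through the binary .npy format using an in-memory buffer.
buf = io.BytesIO()
np.save(buf, scores)
buf.seek(0)
restored = np.load(buf)
```

With a real file, np.save and np.load accept a path in place of the buffer, and np.savetxt and np.loadtxt cover the text-file case.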
UNIT-II
Essential Functionality (Reindexing, Dropping Entries from an Axis, Indexing, Selection, and Filtering,
Arithmetic and Data Alignment, Sorting and Ranking, Axis Indexes with Duplicate Values), Summarizing
and Computing Descriptive Statistics (Correlation and Covariance, Unique Values, Value Counts, and
Membership), Handling Missing Data (Filtering Out Missing Data, Filling in Missing Data),
Hierarchical Indexing (Reordering and Sorting Levels, Using a DataFrame's Columns).
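A short sketch of several of the pandas topics above (reindexing, dropping entries, missing data, value counts, and hierarchical indexing) might look like the following. The Series, the DataFrame, and the column names key1, key2, and val are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with one missing value.
s = pd.Series([4.0, 7.0, np.nan, 2.0], index=["d", "b", "a", "c"])

# Reindexing: labels not present in the original ('e') become NaN.
reindexed = s.reindex(["a", "b", "c", "d", "e"])

# Dropping an entry from the axis.
dropped = s.drop("c")

# Handling missing data: filter it out or fill it in.
cleaned = s.dropna()
filled = s.fillna(0.0)

# Value counts for a small categorical Series.
counts = pd.Series(["x", "y", "x", "x"]).value_counts()

# Hierarchical indexing built from a DataFrame's columns,
# followed by reordering and sorting the levels.
frame = pd.DataFrame({"key1": ["a", "a", "b"],
                      "key2": [1, 2, 1],
                      "val": [10, 20, 30]})
indexed = frame.set_index(["key1", "key2"])
swapped = indexed.swaplevel("key1", "key2").sort_index(level=0)
```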
UNIT-III
Data Loading, Storage, and File Formats: Reading and Writing Data in Text Format (Reading Text
Files in Pieces, Writing Data Out to Text Format, Manually Working with Delimited Formats, JSON
Data, XML and HTML: Web Scraping), Binary Data Formats (Using HDF5 Format, Reading
Microsoft Excel Files), Interacting with HTML and Web APIs, Interacting with Databases (Storing and
Loading Data in MongoDB).
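The text-format topics above can be sketched as follows: reading a delimited file in pieces, writing it back out with a different delimiter, and converting it to JSON. The CSV content and its column names are made up for this example, and io.StringIO stands in for real files.

```python
import io
import json
import pandas as pd

# Hypothetical CSV content; a real script would pass a file path instead.
csv_text = (
    "name,city,amount\n"
    "Asha,Pune,120\n"
    "Ravi,Delhi,80\n"
    "Meena,Pune,45\n"
    "John,Delhi,60\n"
)

# Reading a text file in pieces: accumulate totals two rows at a time.
totals = {}
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=2):
    for city, amount in zip(chunk["city"], chunk["amount"]):
        totals[city] = totals.get(city, 0) + int(amount)

# Writing data out to text format with a different delimiter.
df = pd.read_csv(io.StringIO(csv_text))
out = io.StringIO()
df.to_csv(out, sep="|", index=False)
header_line = out.getvalue().splitlines()[0]

# JSON data: serialize the rows, then parse them back into Python objects.
records = json.loads(df.to_json(orient="records"))
```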
UNIT-IV
Data Wrangling: Clean, Transform, Merge, Reshape: Combining and Merging Data Sets (Database-style
DataFrame Merges, Merging on Index, Concatenating Along an Axis, Combining Data with
Overlap), Reshaping and Pivoting (Reshaping with Hierarchical Indexing, Pivoting “long” to “wide”
Format), Data Transformation (Removing Duplicates, Transforming Data Using a Function or
Mapping, Replacing Values, Renaming Axis Indexes, Discretization and Binning, Detecting and
Filtering Outliers)
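For the data wrangling topics above, a minimal sketch of database-style merges, concatenation, duplicate removal, long-to-wide pivoting, and binning could look like this. All frames, column names, and bin edges are hypothetical.

```python
import pandas as pd

# Hypothetical tables sharing a 'key' column.
left = pd.DataFrame({"key": ["a", "b", "c"], "lval": [1, 2, 3]})
right = pd.DataFrame({"key": ["a", "b", "d"], "rval": [10, 20, 40]})

# Database-style merges: inner join by default, outer join on request.
inner = pd.merge(left, right, on="key")
outer = pd.merge(left, right, on="key", how="outer")

# Concatenating along the row axis, then removing the duplicated rows.
stacked = pd.concat([left, left], ignore_index=True)
deduped = stacked.drop_duplicates()

# Pivoting "long" to "wide" format.
long_df = pd.DataFrame({"date": ["d1", "d1", "d2", "d2"],
                        "item": ["x", "y", "x", "y"],
                        "value": [1, 2, 3, 4]})
wide = long_df.pivot(index="date", columns="item", values="value")

# Discretization and binning: intervals are (0, 18] and (18, 65].
ages = pd.Series([5, 17, 30, 64])
age_groups = pd.cut(ages, [0, 18, 65], labels=["minor", "adult"])
```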
UNIT-V
Plotting and Visualization: A Brief matplotlib API Primer (Figures and Subplots, Colors, Markers, and
Line Styles, Ticks, Labels, and Legends, Annotations and Drawing on a Subplot, Saving Plots to File),
Plotting Functions in pandas (Line Plots, Bar Plots, Histograms and Density Plots, Scatter Plots)
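The plotting topics above can be illustrated with a small sketch that builds a figure with two subplots, sets line style, ticks, labels, a legend, and an annotation, draws a pandas bar plot, and saves the result. The data is invented for this example, the non-interactive Agg backend is an assumption that lets the script run without a display, and the figure is written to an in-memory buffer rather than a named file.

```python
import io
import matplotlib
matplotlib.use("Agg")  # assumption: no display is available
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

x = np.arange(10)

# Figure with two subplots side by side.
fig, axes = plt.subplots(1, 2, figsize=(8, 3))

# Line plot with explicit color, marker, and line style.
axes[0].plot(x, x ** 2, color="k", linestyle="--", marker="o", label="x squared")
axes[0].set_xticks([0, 3, 6, 9])
axes[0].set_xlabel("x")
axes[0].set_title("Line plot")
axes[0].legend(loc="best")

# Annotation drawn on the first subplot.
axes[0].annotate("last point", xy=(9, 81), xytext=(4, 70),
                 arrowprops=dict(arrowstyle="->"))

# pandas plotting function drawing onto the second subplot.
pd.Series([3, 5, 2], index=["a", "b", "c"]).plot(kind="bar", ax=axes[1],
                                                 title="Bar plot")

# Saving the plot; a file path such as "figure.png" would also work here.
buf = io.BytesIO()
fig.savefig(buf, format="png", dpi=100)
plt.close(fig)
```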
UNIT-VI
Data Aggregation and Group Operations: GroupBy Mechanics (Iterating Over Groups, Selecting a
Column or Subset of Columns, Grouping with Dicts and Series, Grouping with Functions, Grouping by
Index Levels), Data Aggregation (Column-wise and Multiple Function Application, Returning
Aggregated Data in “unindexed” Form), Group-wise Operations and Transformations (Apply: General
split-apply-combine, Quantile and Bucket Analysis, Example: Filling Missing Values with
Group-specific Values, Example: Random Sampling and Permutation, Example: Group Weighted Average
and Correlation, Example: Group-wise Linear Regression)
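A brief sketch of the group operations listed above: iterating over groups, applying multiple aggregation functions, returning "unindexed" results, filling missing values with group-specific means, and computing a group weighted average through apply. The DataFrame, its column names, and the helper weighted_mean are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in group 'a'.
df = pd.DataFrame({"group": ["a", "a", "b", "b"],
                   "value": [1.0, np.nan, 4.0, 6.0],
                   "weight": [1.0, 1.0, 1.0, 3.0]})

grouped = df.groupby("group")

# Iterating over groups yields (name, sub-frame) pairs.
sizes = {name: len(part) for name, part in grouped}

# Multiple function application on a selected column.
stats = grouped["value"].agg(["mean", "max"])

# Aggregated data in "unindexed" form: the key stays a regular column.
unindexed = df.groupby("group", as_index=False)["value"].mean()

# Filling missing values with the mean of each row's own group.
filled = df["value"].fillna(grouped["value"].transform("mean"))

def weighted_mean(part):
    """Hypothetical helper: weighted average of 'value' within one group."""
    return np.average(part["value"], weights=part["weight"])

# Apply (split-apply-combine) on the filled data.
wavg = (df.assign(value=filled)
          .groupby("group")[["value", "weight"]]
          .apply(weighted_mean))
```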