NEC M.Tech CSE 2nd Sem Data Science Material

UNIT-I

What is Data Science?, The Data Science Process, A Data Scientist's Role in This Process, NumPy Basics:
The NumPy ndarray: A Multidimensional Array Object (Creating ndarrays, Data Types for ndarrays,
Operations between Arrays and Scalars, Basic Indexing and Slicing, Boolean Indexing, Fancy
Indexing, Transposing Arrays and Swapping Axes), Universal Functions (Fast Element-wise Array
Functions), Data Processing Using Arrays (Expressing Conditional Logic as Array Operations,
Mathematical and Statistical Methods, Methods for Boolean Arrays, Sorting, Unique and Other Set
Logic), File Input and Output with Arrays (Storing Arrays on Disk in Binary Format, Saving and
Loading Text Files).

UNIT-II

Getting Started with pandas: Introduction to pandas Data Structures (Series, DataFrame, Index Objects),
Essential Functionality (Reindexing, Dropping Entries from an Axis, Indexing, Selection, and Filtering,
Arithmetic and Data Alignment, Sorting and Ranking, Axis Indexes with Duplicate Values), Summarizing
and Computing Descriptive Statistics (Correlation and Covariance, Unique Values, Value Counts, and
Membership), Handling Missing Data (Filtering Out Missing Data, Filling in Missing Data),
Hierarchical Indexing (Reordering and Sorting Levels, Using a DataFrame's Columns).

UNIT-III

Data Loading, Storage, and File Formats: Reading and Writing Data in Text Format (Reading Text
Files in Pieces, Writing Data Out to Text Format, Manually Working with Delimited Formats, JSON
Data, XML and HTML: Web Scraping), Binary Data Formats (Using HDF5 Format, Reading
Microsoft Excel Files), Interacting with HTML and Web APIs, Interacting with Databases (Storing and
Loading Data in MongoDB).

UNIT-IV

Data Wrangling: Clean, Transform, Merge, Reshape: Combining and Merging Data Sets (Database-style
DataFrame Merges, Merging on Index, Concatenating Along an Axis, Combining Data with
Overlap), Reshaping and Pivoting (Reshaping with Hierarchical Indexing, Pivoting “long” to “wide”
Format), Data Transformation (Removing Duplicates, Transforming Data Using a Function or
Mapping, Replacing Values, Renaming Axis Indexes, Discretization and Binning, Detecting and
Filtering Outliers).


UNIT-V

Plotting and Visualization: A Brief matplotlib API Primer (Figures and Subplots, Colors, Markers, and
Line Styles, Ticks, Labels, and Legends, Annotations and Drawing on a Subplot, Saving Plots to File),
Plotting Functions in pandas (Line Plots, Bar Plots, Histograms and Density Plots, Scatter Plots).

UNIT-VI

Data Aggregation and Group Operations: GroupBy Mechanics (Iterating Over Groups, Selecting a
Column or Subset of Columns, Grouping with Dicts and Series, Grouping with Functions, Grouping by
Index Levels), Data Aggregation (Column-wise and Multiple Function Application, Returning
Aggregated Data in “unindexed” Form), Group-wise Operations and Transformations (Apply: General
split-apply-combine, Quantile and Bucket Analysis, Example: Filling Missing Values with
Group-specific Values, Example: Random Sampling and Permutation, Example: Group Weighted Average
and Correlation, Example: Group-wise Linear Regression).
