Author : HASSAN MD TAREQ

What is pandas?

  • pandas is a Python package / library
  • A library for data manipulation, data analysis, and data science
  • Python-based data analysis toolkit
  • High-level data manipulation tool
  • Built on top of the NumPy package
  • Key data structure is called the DataFrame
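
To make the NumPy relationship concrete, here is a minimal sketch, assuming pandas and NumPy are both installed and the default NumPy-backed data types are in use. The column name and values are made up for illustration.

```python
import numpy as np
import pandas as pd

# A small DataFrame with one hypothetical numeric column.
df = pd.DataFrame({"x": [1.0, 4.0, 9.0]})

# The underlying data can be exported as a plain NumPy array.
arr = df.to_numpy()
print(type(arr))
print(arr.shape)

# NumPy functions can be applied directly; the result keeps the pandas labels.
roots = np.sqrt(df)
print(roots)
```

Because the labels survive the NumPy call, the result can be used like any other DataFrame.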

Naming

  • pandas is commonly expanded as “Python Data Analysis Library”
  • The name ‘pandas’ is derived from the term “panel data”, an econometrics term for multidimensional structured data sets (Wikipedia)

Core components

  • Series: essentially a single column (1-dimensional)
  • DataFrame: a table made up of a collection of Series (2-dimensional)

A pandas Series is one-dimensional, whereas a DataFrame is two-dimensional.
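
The difference in dimensions can be checked directly. The sketch below is illustrative only: the names and ages are made-up data, and `pd.concat` is just one of several ways to assemble a DataFrame from Series.

```python
import pandas as pd

# Two hypothetical columns of data, each held in a one-dimensional Series.
names = pd.Series(["Alice", "Bob", "Carol"], name="name")
ages = pd.Series([30, 25, 41], name="age")

# Combine the Series column-wise into a two-dimensional DataFrame.
people = pd.concat([names, ages], axis=1)

print(names.ndim)    # number of dimensions of a Series
print(people.ndim)   # number of dimensions of a DataFrame
print(people.shape)  # (rows, columns)

# Selecting a single column from the DataFrame gives back a Series.
age_column = people["age"]
print(type(age_column))
```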

DataFrame

A pandas DataFrame is a two-dimensional tabular data structure with labeled axes (rows and columns). It is similar to an Excel sheet.
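
As a rough illustration of labeled axes, the sketch below builds a small DataFrame and reads one cell by its row and column labels, much like addressing a spreadsheet cell. The product names, column names, and numbers are hypothetical.

```python
import pandas as pd

# Hypothetical sales data; the row labels and column names are made up.
sales = pd.DataFrame(
    {"units": [10, 4, 7], "price": [2.5, 10.0, 4.0]},
    index=["apples", "melons", "pears"],
)

print(sales.index.tolist())    # row labels
print(sales.columns.tolist())  # column labels

# Look up one cell by row label and column label.
melon_price = sales.loc["melons", "price"]
print(melon_price)

# Column arithmetic is applied to every row, like a spreadsheet formula.
sales["revenue"] = sales["units"] * sales["price"]
print(sales)
```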

Details: Pandas Dataframe

Getting Started

Check the installed pandas version in the top cell of a JupyterLab notebook:

import pandas
pandas.__version__

Installing pandas separately

Conda

conda install pandas

pip

pip install pandas

More: https://pandas.pydata.org/docs/getting_started/install.html#installing-pandas