What is pandas?
- pandas is a Python package / library
- A library for data manipulation, analysis, data science
- Python-based data analysis toolkit
- High-level data manipulation tool
- Built on the NumPy package
- Key data structure is called the DataFrame
- The name ‘pandas’ is derived from the term “panel data”, an econometrics term for multidimensional structured data sets (Wikipedia)
- The name is also commonly read as a play on “Python data analysis”
- Package overview
- pandas API Reference
- Series: essentially a single labeled column (1-dimensional)
- DataFrame: a table made up of a collection of Series that share the same index (2-dimensional)
A pandas Series is one-dimensional, whereas a DataFrame is two-dimensional.
A pandas DataFrame is a two-dimensional tabular data structure with labeled axes (rows and columns), similar to an Excel sheet.
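The difference between the two structures can be sketched in a few lines. This is a minimal illustration, not part of the original notes: the column names (price, qty) and the row labels (a, b, c) are made up for the example, and it assumes pandas and NumPy are already installed.

```python
import numpy as np
import pandas as pd

# A Series: one-dimensional, labeled values (the index here is a, b, c).
# The name "price" and the values are hypothetical example data.
prices = pd.Series([1.5, 2.0, 3.25], index=["a", "b", "c"], name="price")

# A DataFrame: two-dimensional, built here from a dict of equal-length columns.
df = pd.DataFrame(
    {"price": [1.5, 2.0, 3.25], "qty": [10, 4, 7]},
    index=["a", "b", "c"],
)

print(prices.ndim)        # number of dimensions of the Series
print(df.ndim, df.shape)  # number of dimensions and (rows, columns) of the DataFrame

# Selecting a single column of a DataFrame gives back a Series.
qty = df["qty"]
print(type(qty).__name__)

# Column values can be exported as a NumPy array.
price_array = df["price"].to_numpy()
print(isinstance(price_array, np.ndarray))
```

Both axes are labeled: df.index holds the row labels and df.columns holds the column labels, which is what makes the structure comparable to a spreadsheet.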
Details: pandas DataFrame
- Download and install Anaconda: https://www.anaconda.com/products/individual
- Start Anaconda Navigator
- Launch JupyterLab
Check the version in the top cell of JupyterLab:
import pandas
pandas.__version__
Installing pandas separately
conda install pandas
pip install pandas
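After running either command, one way to confirm that the install worked is to ask Python for the package's metadata. The sketch below uses only the standard library (importlib.metadata, available in Python 3.8 and later). The helper name installed_version is hypothetical and chosen just for this example.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Hypothetical helper applied to pandas; works whether or not pandas is present.
version = installed_version("pandas")
if version is None:
    print("pandas is not installed; run one of the install commands above")
else:
    print(f"pandas {version} is installed")
```

Unlike import pandas, this check does not raise an error when the package is missing, so it is safe to run before installing.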