📘 Unit 4 Question Bank – Data Handling with Pandas
Includes MCQs, Short Answer Questions, Long Answer Questions, Case-based problems, and Mini Projects with pharmaceutical applications.
🔷 Section A: MCQs
- Pandas is used for:
a) Gaming
b) Data analysis
c) Networking
d) Designing
Answer: b
- Series is:
a) 2D
b) 1D
c) 3D
d) None
Answer: b
- DataFrame represents:
a) List
b) Table
c) String
d) Loop
Answer: b
- Which function reads CSV?
a) read()
b) read_csv()
c) open()
d) load()
Answer: b
- head() shows:
a) Last rows
b) First rows
c) All rows
d) None
Answer: b
- describe() gives:
a) Names
b) Statistics
c) Rows
d) Columns
Answer: b
- Which function removes missing values?
a) fillna()
b) dropna()
c) replace()
d) remove()
Answer: b
- Which operator is used for AND condition?
a) |
b) &
c) +
d) =
Answer: b
- groupby() is used for:
a) Filtering
b) Grouping data
c) Sorting
d) Printing
Answer: b
- mean() calculates:
a) Sum
b) Average
c) Max
d) Min
Answer: b
🔷 Section B: Short Answer Questions
- Define Pandas.
- What is a Series?
- What is a DataFrame?
- Write syntax for reading CSV file.
- Explain head() and tail().
- What is info() function?
- Define data cleaning.
- What are missing values?
- Explain filtering in Pandas.
- What is groupby()?
🔷 Section C: Long Answer Questions
- Explain Pandas and its importance in pharmaceutical data analysis.
- Explain Series and DataFrame with suitable examples.
- Explain how to read CSV and Excel files using Pandas.
- Explain data inspection functions: head(), tail(), info(), describe().
- Explain data cleaning and handling missing values.
- Explain filtering and selecting data with examples.
- Explain grouping and aggregation techniques in Pandas.
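As a starting point for the long answers above, here is a minimal sketch of a Series, a DataFrame, the inspection functions, and a grouped mean. All values, labels, and column names (`Dose_mg`, `Drug`) are hypothetical and chosen only for illustration.

```python
import pandas as pd

# A Series is one-dimensional: one column of values with an index.
# Patient labels and doses here are invented for illustration.
doses = pd.Series([250, 500, 750], index=["P1", "P2", "P3"], name="Dose_mg")

# A DataFrame is two-dimensional: rows and named columns.
df = pd.DataFrame({
    "Drug": ["A", "A", "B", "B"],
    "Dose_mg": [250, 500, 400, 600],
})

print(df.head(2))     # first 2 rows
print(df.tail(2))     # last 2 rows
df.info()             # column names, dtypes, non-null counts
print(df.describe())  # summary statistics for numeric columns

# Grouping and aggregation: mean dose per drug.
mean_dose = df.groupby("Drug")["Dose_mg"].mean()
print(mean_dose)
```

Note that `head()` and `tail()` return 5 rows when called without an argument.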
🔷 Section D: Case-Based Questions
💊 Case 1: ADR Dataset Analysis
The dataset contains three columns: Drug, Dose, and Reaction.
- Load dataset using Pandas
- Filter severe reactions
- Count ADR frequency
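One possible way to approach Case 1 is sketched below. The records are invented, the CSV is held in a string so the example runs without a file, and the list of reaction terms treated as "severe" is an assumption, since the case does not define severity.

```python
import io
import pandas as pd

# Hypothetical ADR records, invented for illustration only.
csv_text = """Drug,Dose,Reaction
Paracetamol,500,Nausea
Ibuprofen,400,Rash
Amoxicillin,250,Anaphylaxis
Ibuprofen,600,Nausea
Paracetamol,1000,Hepatotoxicity
Amoxicillin,500,Rash
"""

# read_csv accepts a file path or a file-like object; StringIO stands in for a file.
adr = pd.read_csv(io.StringIO(csv_text))

# Assumption: these reaction terms count as "severe" for this exercise.
severe_terms = ["Anaphylaxis", "Hepatotoxicity"]
severe = adr[adr["Reaction"].isin(severe_terms)]

# ADR frequency: how often each reaction occurs, and how many reports each drug has.
reaction_counts = adr["Reaction"].value_counts()
reports_per_drug = adr.groupby("Drug").size()

print(severe)
print(reaction_counts)
print(reports_per_drug)
```

With a real file, the `io.StringIO(csv_text)` argument would be replaced by the file path.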
💊 Case 2: Missing Data Handling
- Identify missing values
- Fill missing dose with mean
- Remove incomplete rows
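The three steps of Case 2 could be sketched as follows. The patient table is hypothetical, and filling with the column mean is just the strategy the case asks for, not a general recommendation.

```python
import numpy as np
import pandas as pd

# Hypothetical patient records with gaps, invented for illustration.
patients = pd.DataFrame({
    "Patient": ["P1", "P2", "P3", "P4"],
    "Dose": [500.0, np.nan, 700.0, 600.0],
    "Reaction": ["Nausea", "Rash", None, "Headache"],
})

# Step 1: identify missing values (count per column).
missing_per_column = patients.isna().sum()

# Step 2: fill the missing dose with the mean of the known doses.
# mean() skips NaN by default.
mean_dose = patients["Dose"].mean()
filled = patients.assign(Dose=patients["Dose"].fillna(mean_dose))

# Step 3: remove rows that still contain any missing value.
complete = filled.dropna()

print(missing_per_column)
print(filled)
print(complete)
```

Both `fillna()` and `dropna()` return new objects here, so the original `patients` table is left unchanged.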
💊 Case 3: PK Data Analysis
- Load PK dataset
- Display first rows
- Calculate average concentration
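A minimal sketch for Case 3, assuming a PK table with one time column and one concentration column. The column names (`Time_h`, `Concentration_mg_L`), units, and values are hypothetical; the maximum concentration is included because the PK mini project also asks for it.

```python
import io
import pandas as pd

# Hypothetical concentration-time data, invented for illustration.
csv_text = """Time_h,Concentration_mg_L
0.5,2.0
1,4.0
2,6.0
4,5.0
8,3.0
"""

# Load the PK dataset (StringIO stands in for a CSV file on disk).
pk = pd.read_csv(io.StringIO(csv_text))

# Display the first rows.
first_rows = pk.head(3)
print(first_rows)

# Average and maximum concentration.
avg_conc = pk["Concentration_mg_L"].mean()
max_conc = pk["Concentration_mg_L"].max()

# Time at which the maximum concentration was observed.
time_of_max = pk.loc[pk["Concentration_mg_L"].idxmax(), "Time_h"]

print(avg_conc, max_conc, time_of_max)
```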
💊 Case 4: High-Risk Patient Identification
- Filter patients with dose > 600
- Filter elderly patients
- Display high-risk group
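Case 4 might be approached as in the sketch below. The patient data are invented, and two things are assumptions not stated in the case: "elderly" is taken as age 65 or older, and "high-risk" is taken as meeting both conditions at once.

```python
import pandas as pd

# Hypothetical patient records, invented for illustration.
patients = pd.DataFrame({
    "Patient": ["P1", "P2", "P3", "P4", "P5"],
    "Age": [34, 72, 68, 45, 80],
    "Dose": [400, 800, 500, 700, 650],
})

# Filter patients with dose > 600.
high_dose = patients[patients["Dose"] > 600]

# Assumption: "elderly" means age 65 or older.
elderly = patients[patients["Age"] >= 65]

# Assumption: "high-risk" means high dose AND elderly.
# Each condition needs its own parentheses when combined with &.
high_risk = patients[(patients["Dose"] > 600) & (patients["Age"] >= 65)]

print(high_dose)
print(elderly)
print(high_risk)
```

The element-wise `&` operator is used instead of Python's `and`, which does not work on whole columns.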
🔷 Section E: Mini Projects
- Project 1: ADR Dataset Analyzer (filter + count reactions)
- Project 2: Patient Data Cleaner (handle missing values)
- Project 3: PK Data Analyzer (mean, max concentration)
- Project 4: Clinical Risk Identifier (filter high-risk patients)
🧠 Quick Revision
| Concept | Key Point |
|---|---|
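| Pandas | Python library for data analysis |
| Series | 1D labeled data |
| DataFrame | 2D tabular data (rows and columns) |
| head() | First rows (5 by default) |
| Cleaning | Handle missing or inconsistent data |
| Filtering | Select rows that meet a condition |
| groupby() | Split data into groups for aggregation |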
For detailed information: Basics of Python Programming for Pharmaceutical Sciences