In business analytics, data is the foundation for meaningful insights. But raw data is often messy, incomplete, or inconsistent, which makes conclusions drawn from it unreliable. This is why data preparation comes before any analysis, and visualization follows to present the resulting insights clearly. This unit explains how data is collected, cleaned, and refined, and how visualization tools like Excel, Tableau, and Power BI help present data using different charts.
Data Collection – The First Step
Data collection is the process of gathering information from various sources. It can come from:
Internal sources such as company databases, financial records, and sales transactions.
External sources like market surveys, government reports, customer feedback, and social media platforms.
Example: An online retailer may collect data from website visits, purchase history, and customer reviews.
Data Cleaning – Making Data Reliable
Raw data usually contains errors, missing values, or irrelevant details. Data cleaning ensures that the dataset is accurate, consistent, and usable.
Common issues solved during cleaning:
Duplicate records – Removing repeated entries.
Formatting errors – Standardizing date formats, names, or numbers.
Irrelevant data – Eliminating unnecessary information.
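For instance, if the same customer's name appears as “John D.”, “J. Doe”, and “John Doe”, cleaning standardizes the entries to a single form (such as “John Doe”) so that they are counted as one customer rather than three.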
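The three cleaning steps above can be sketched in plain Python. This is a minimal illustration, not a complete cleaning pipeline: the records, the field names (name, order_date, amount, browser), and the two accepted date formats are all assumptions made up for the example. It only fixes spacing and capitalization in names; matching variants like “J. Doe” to “John Doe” would need additional matching rules or manual review.

```python
from datetime import datetime

# Hypothetical raw records; field names and values are illustrative assumptions.
raw_records = [
    {"name": "  john doe ", "order_date": "2024-01-05", "amount": "1,200", "browser": "Chrome"},
    {"name": "John Doe", "order_date": "05/01/2024", "amount": "1200", "browser": "Firefox"},
    {"name": "Priya Shah", "order_date": "2024-02-10", "amount": "850", "browser": "Safari"},
]

# Assumed input formats: ISO (year-month-day) and day/month/year.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date(text):
    """Convert a date string in any accepted format to ISO format."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {text!r}")


def clean(records):
    """Standardize formats, drop the irrelevant field, remove duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        row = {
            # Formatting: collapse extra spaces and fix capitalization.
            "name": " ".join(rec["name"].split()).title(),
            "order_date": parse_date(rec["order_date"]),
            "amount": float(rec["amount"].replace(",", "")),
        }  # Irrelevant data: "browser" is simply not copied over.
        key = (row["name"], row["order_date"], row["amount"])
        if key not in seen:  # Duplicates: keep only the first occurrence.
            seen.add(key)
            cleaned.append(row)
    return cleaned


cleaned_records = clean(raw_records)
for row in cleaned_records:
    print(row)
```

Note that the first two records only become recognizable as duplicates after their names, dates, and amounts are standardized, which is why formatting fixes come before duplicate removal here.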
Handling Missing Values and Outliers
Two major problems in datasets are missing values and outliers.
Missing Values: These occur when some data points are not recorded.
Solutions include replacing the missing values with the mean or median of the recorded values, estimating them with predictive methods, or removing the incomplete records when only a few are affected.
Outliers: These are extreme values that differ significantly from others.
Example: A salary dataset where most employees earn ₹40,000–₹60,000, but one record shows ₹10,00,000.
Outliers can be corrected if they are errors, or kept if they provide useful insights (e.g., identifying a top performer).
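Both ideas can be tried on a small salary list like the one in the example. The sketch below fills missing values with the median and then flags outliers using the interquartile range (IQR) rule, which is one common convention and not the only way to define an outlier. The salary figures and the multiplier k=1.5 are assumptions chosen for illustration.

```python
from statistics import median, quantiles

# Hypothetical monthly salaries; None marks a value that was not recorded.
salaries = [42000, 45000, None, 51000, 58000, 47000, 1000000, None, 55000]


def impute_median(values):
    """Replace missing entries with the median of the recorded values."""
    known = [v for v in values if v is not None]
    fill = median(known)
    return [fill if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


filled = impute_median(salaries)
flagged = iqr_outliers(filled)

print("After imputation:", filled)
print("Flagged for review:", flagged)
```

The median is used here instead of the mean because the single extreme salary would pull the mean far above what a typical employee earns. The flagged value is only marked for review; whether it is a data entry error or a genuine top earner still has to be decided by someone who knows the data.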
Importance of Data Visualization
Once data is cleaned, the next step is to visualize it so that managers and stakeholders can easily understand trends, patterns, and relationships.
Visualization turns complex data into simple, visual stories through charts, graphs, and dashboards. It improves decision-making by highlighting key insights at a glance.
Tools for Visualization
Several tools help in creating professional and interactive visualizations:
Excel – Basic but widely used tool for charts, pivot tables, and quick visual summaries.
Tableau – Powerful tool for creating interactive dashboards and exploring large datasets.
Power BI – Microsoft’s business intelligence tool for real-time reporting and visual analysis.
Common Visualization Techniques
Different types of charts are used depending on the type of data and analysis required:
Bar Chart – Useful for comparing categories (e.g., sales in different regions).
Pie Chart – Shows proportions and percentages of a whole (e.g., market share of companies).
Scatter Plot – Displays relationships between two variables (e.g., advertising spend vs. sales revenue).
Heat Map – Uses color intensity to represent values (e.g., website clicks across different pages).
Example: A marketing team may use a bar chart to compare ad campaign results, while a heat map can highlight which product categories are most popular on their website.
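Before any of these charts is drawn, the data usually has to be summarized into the numbers the chart displays. The sketch below computes the category totals behind a bar chart and the percentage shares behind a pie chart, then prints a rough text-only bar chart. The regions and sales figures are made up for illustration; in practice a tool such as Excel, Tableau, or Power BI would do the drawing.

```python
from collections import defaultdict

# Hypothetical sales transactions as (region, amount) pairs.
sales = [
    ("North", 120000),
    ("South", 80000),
    ("North", 60000),
    ("East", 100000),
    ("South", 40000),
]


def totals_by_category(rows):
    """Sum the amounts per category (the bar heights of a bar chart)."""
    totals = defaultdict(float)
    for category, amount in rows:
        totals[category] += amount
    return dict(totals)


def percentage_shares(totals):
    """Convert totals to percentages of the whole (the slices of a pie chart)."""
    grand_total = sum(totals.values())
    return {k: round(100 * v / grand_total, 1) for k, v in totals.items()}


def text_bar_chart(totals, width=30):
    """Draw a simple horizontal bar chart using '#' characters."""
    largest = max(totals.values())
    lines = []
    for category, value in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        bar = "#" * round(width * value / largest)
        lines.append(f"{category:<6} {bar} {value:,.0f}")
    return "\n".join(lines)


region_totals = totals_by_category(sales)
region_shares = percentage_shares(region_totals)

print(text_bar_chart(region_totals))
print(region_shares)
```

The same totals feed both views: the bar chart compares the regions against each other, while the shares show each region as a part of overall sales.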
Conclusion
Data preparation ensures accuracy, while visualization makes insights understandable. Without proper cleaning and handling of missing values or outliers, analysis can be misleading. Similarly, without clear visualization, even the best analysis may fail to communicate effectively. Together, these techniques form the backbone of data-driven decision-making, making businesses smarter and more efficient.