How to Clean Messy Data in Google Sheets
Master data cleaning techniques with formulas, functions, and AI automation. Turn messy spreadsheets into analysis-ready datasets in minutes.
Data cleaning is often the most time-consuming part of an analysis project, with commonly cited estimates putting it at 60-80% of total work time. Whether you are dealing with duplicate entries, inconsistent formatting, missing values, or numbers stored as text, messy data stands between you and reliable insights. This guide walks through practical data cleaning techniques, from built-in Google Sheets functions to AI-powered automation, for turning chaotic datasets into clean, analysis-ready tables. With these techniques, routine cleaning that used to take hours can often be done in minutes.
What You'll Need
- Basic familiarity with Google Sheets
- A messy dataset that needs cleaning (we will cover common scenarios)
- Understanding of basic formulas (TRIM, CLEAN)
Step-by-Step Guide
Identify Common Data Quality Issues
Scan your dataset to find the types of problems you need to fix.
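- Duplicate rows: Flag repeats with a helper column such as =COUNTIF(A:A,A2)>1 before deleting anything (Data → Data cleanup → Remove duplicates reports a count, but only after it has removed the rows)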
- Inconsistent formatting: Look for mixed date formats, text vs numbers, capitalization issues
- Missing values: Scan for blank cells, "N/A", "null", or placeholder text
- Extra whitespace: Check for leading/trailing spaces that break formulas
- Merged cells: These break most analysis functions—flag them for unmerging
- Data type mismatches: Numbers stored as text, dates formatted incorrectly
Pro Tip
Create a "Data Quality Checklist" sheet listing all issues found. Check them off as you clean—this prevents missing problems.
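If you want to see what such a checklist tallies, the scan can be sketched in a few lines of Python. This is only an illustration, not a Sheets feature: it assumes you have exported one column as a list of values, and the names profile_column, is_number, and PLACEHOLDERS (along with the placeholder spellings it lists) are hypothetical.

```python
from collections import Counter

# Assumed placeholder spellings; extend to match your own data.
PLACEHOLDERS = {"", "n/a", "null", "none", "-"}

def is_number(text):
    """True if the text parses as a number once thousands commas are removed."""
    try:
        float(text.replace(",", ""))
        return True
    except ValueError:
        return False

def profile_column(values):
    """Count common quality issues in one column of cell values."""
    report = Counter()
    seen = set()
    for value in values:
        text = "" if value is None else str(value)
        key = text.strip().lower()
        if key in PLACEHOLDERS:
            report["missing"] += 1
            continue
        if text != text.strip():
            report["extra_whitespace"] += 1
        if isinstance(value, str) and is_number(text):
            report["number_as_text"] += 1
        if key in seen:
            report["duplicate"] += 1  # same value ignoring case and outer spaces
        seen.add(key)
    return dict(report)

print(profile_column([" Acme", "acme", "N/A", "", "42", 7, None]))
```

Each key in the result corresponds to one line of the checklist, so you can rerun the scan after every cleaning step and watch the counts fall.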
Remove Duplicates
Find and eliminate duplicate entries that skew your analysis.
- Select your data range including headers
- Go to Data → Data cleanup → Remove duplicates
- Check "Data has header row" if your first row contains column names
- Select which columns to check for duplicates (usually all columns, or just key ID fields)
- Google Sheets will show how many duplicates were removed
- Alternative: Use UNIQUE() function for a formula-based approach: =UNIQUE(A2:D100)
Pro Tip
Before removing duplicates, create a backup sheet. Sometimes "duplicates" are actually legitimate entries with slight differences.
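To make the column choice concrete, here is a minimal Python sketch of first-occurrence deduplication. It illustrates the logic only and does not claim to mirror how Sheets implements the feature; remove_duplicates and its key_columns parameter are hypothetical names, and rows are assumed to be lists of strings.

```python
def remove_duplicates(rows, key_columns=None):
    """Keep the first occurrence of each row.

    With key_columns=None every column must match for a row to count as a
    duplicate; otherwise only the listed column indexes are compared.
    """
    seen = set()
    kept = []
    for row in rows:
        if key_columns is None:
            key = tuple(row)
        else:
            key = tuple(row[i] for i in key_columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept

rows = [["1", "Ann"], ["2", "Bob"], ["1", "Ann"], ["1", "Anne"]]
print(remove_duplicates(rows))                   # all columns compared
print(remove_duplicates(rows, key_columns=[0]))  # ID column only
```

Comparing all columns keeps the "Anne" row, while comparing only the ID column drops it. That is exactly the kind of near-duplicate the backup sheet protects you from losing.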
Clean Text: Remove Spaces and Special Characters
Fix whitespace and unwanted characters that break formulas.
- Remove leading/trailing spaces: =TRIM(A2) strips leading and trailing spaces and collapses repeated internal spaces to one
- Remove ALL spaces: =SUBSTITUTE(A2," ","") if you need to remove internal spaces too
- Remove line breaks: =CLEAN(A2) strips non-printable ASCII characters, line breaks included
- Combine both: =TRIM(CLEAN(A2)) for thorough text cleaning
- Remove specific characters: =SUBSTITUTE(A2,"$","") removes dollar signs
- Apply to entire column: Wrap the formula in ARRAYFORMULA to clean all rows at once, e.g. =ARRAYFORMULA(TRIM(CLEAN(A2:A100)))
Pro Tip
Use Find & Replace (Ctrl+H) for simple character removal. For complex cleaning, formulas give you more control and can be reused.
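The combined TRIM and CLEAN step can be approximated in Python to show what each function contributes. This is a sketch under assumptions: it treats CLEAN as removing ASCII codes 0-31 and TRIM as acting on regular spaces only, and the function names clean, trim, and clean_cell are hypothetical.

```python
import re

def clean(text):
    """Drop ASCII control characters (codes 0-31), such as tabs and line breaks."""
    return "".join(ch for ch in text if ord(ch) >= 32)

def trim(text):
    """Collapse runs of spaces to one and strip spaces from both ends."""
    return re.sub(r" +", " ", text).strip(" ")

def clean_cell(text):
    """Approximate =TRIM(CLEAN(A2)): remove control characters, then tidy spaces."""
    return trim(clean(text))

print(repr(clean_cell("  Acme \n Corp  ")))
```

Note that under these assumptions a non-breaking space (common in data pasted from web pages) survives both steps, so a value like "Acme\u00a0" would still need a separate substitution.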
Standardize Text Formatting
Make text consistent for proper grouping and analysis.
- Convert to UPPERCASE: =UPPER(A2) for consistent capitalization
- Convert to lowercase: =LOWER(A2) for email addresses or URLs
- Convert to Title Case: =PROPER(A2) capitalizes first letter of each word
- Fix inconsistent names: Use Find & Replace to standardize "Inc" vs "Inc." vs "Incorporated"
- Extract first/last names: =SPLIT(A2," ") to separate full names into columns
Pro Tip
After applying text formulas, copy results and "Paste special → Values only" to replace formulas with clean data.
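Standardizing variants such as "Inc", "Inc.", and "Incorporated" is a pattern-matching problem, which a short Python sketch can make explicit. The choice of "Inc." as the canonical form is an assumption for illustration, and standardize_company, normalize_email, and SUFFIX_VARIANTS are hypothetical names.

```python
import re

# Matches "inc" or "incorporated" as a whole word, with an optional trailing period.
SUFFIX_VARIANTS = re.compile(r"\b(incorporated|inc)\b\.?", re.IGNORECASE)

def standardize_company(name):
    """Collapse whitespace and rewrite suffix variants to the assumed form 'Inc.'."""
    name = " ".join(name.split())
    return SUFFIX_VARIANTS.sub("Inc.", name)

def normalize_email(email):
    """Lowercase and strip an email address so equal addresses group together."""
    return email.strip().lower()

names = ["Acme Inc", "Acme Inc.", "Acme  Incorporated"]
print({standardize_company(n) for n in names})
```

The word-boundary anchors keep the pattern from touching names that merely contain the letters, such as "Zinc Co" or "Incline Ltd".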
Fix Data Types and Formatting
Convert text to numbers, fix date formats, and ensure data types match.
- Convert text to numbers: =VALUE(A2) or multiply by 1: =A2*1
- Fix date formatting: Format → Number → Date, then choose your preferred format
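- Parse dates from text: =DATEVALUE("January 15, 2024") converts text to a date serial number; apply a date format to the cell to display it as a date
- Convert numbers to text: =TO_TEXT(A2), or type an apostrophe before the value
- Identify data type issues: By default, numbers stored as text are left-aligned and true numbers are right-aligned; =ISNUMBER(A2) confirms which one you have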
Pro Tip
Use Conditional Formatting to highlight cells with data type issues. Under Format → Conditional formatting, a custom formula such as =ISTEXT(A2) highlights cells in a numeric column that are actually stored as text.
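The conversions in this step follow one pattern: strip the characters that block parsing, attempt the conversion, and flag whatever still fails. A minimal Python sketch of that pattern follows. The list of accepted date formats (including US month/day order) is an assumption, and to_number, to_date, and DATE_FORMATS are hypothetical names.

```python
from datetime import datetime

def to_number(text):
    """Parse numeric text after removing spaces, dollar signs, and commas.

    Returns None when the value still is not a number, so it can be flagged.
    """
    cleaned = text.strip().replace("$", "").replace(",", "")
    try:
        return float(cleaned)
    except ValueError:
        return None

# Assumed input formats; "%m/%d/%Y" presumes US month/day order,
# and "%B" expects English month names under the default locale.
DATE_FORMATS = ("%B %d, %Y", "%Y-%m-%d", "%m/%d/%Y")

def to_date(text):
    """Try each assumed format in turn; return a date, or None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None

print(to_number(" $1,200.50 "))
print(to_date("January 15, 2024"))
```

Returning None rather than guessing mirrors the advice above: values that fail to convert are the ones to highlight and inspect by hand.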
Handle Missing Values
Decide how to treat blank cells and incomplete data.
- Find blank cells: Sheets has no direct equivalent of Excel's Go To Special, so filter the column for (Blanks) instead
- Replace blanks with zero: =IF(ISBLANK(A2),0,A2)
- Replace blanks with "Unknown": =IF(ISBLANK(A2),"Unknown",A2)
- Fill down from above: Select a range and press Ctrl+D to copy the top cell into the rest of the selection (this overwrites every selected cell, not only the blanks)
- Flag missing data: =IF(ISBLANK(A2),"MISSING",A2) to track incomplete rows
- Remove rows with missing critical fields: =FILTER(A2:D100, NOT(ISBLANK(A2:A100))) keeps only the rows where column A is filled in
Pro Tip
Do not automatically replace all blanks with zero—this can distort averages. Only fill blanks when it makes business sense.
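A small worked example shows how much zero-filling can move an average. The Python sketch below uses made-up scores purely for illustration, and the function names are hypothetical.

```python
def mean_ignoring_blanks(values):
    """Average only the cells that hold a value."""
    present = [v for v in values if v not in ("", None)]
    return sum(present) / len(present) if present else None

def mean_blanks_as_zero(values):
    """Average after replacing every blank with zero."""
    filled = [0 if v in ("", None) else v for v in values]
    return sum(filled) / len(filled) if filled else None

def drop_rows_missing(rows, required_index):
    """Keep only rows whose required column is filled in."""
    return [row for row in rows if row[required_index] not in ("", None)]

scores = [80, "", 100, None]  # illustrative values, two of four missing
print(mean_ignoring_blanks(scores))  # 90.0
print(mean_blanks_as_zero(scores))   # 45.0
```

With half the values missing, zero-filling halves the average, even though no one actually scored zero.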
Split and Combine Columns
Separate combined data or merge split information.
- Split text into columns: Data → Split text to columns (choose delimiter: comma, space, etc.)
- Split with formula: =SPLIT(A2,",") separates "Last, First" into two columns; if the space after the comma carries over, wrap it as =ARRAYFORMULA(TRIM(SPLIT(A2,",")))
- Combine columns: =A2&" "&B2 or =CONCATENATE(A2," ",B2) to merge
- Join with delimiter: =JOIN(", ",A2:D2) to combine with commas
- Extract specific parts: =LEFT(A2,3) for first 3 characters, =RIGHT(A2,4) for last 4
Pro Tip
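When splitting columns, make sure you have enough empty columns to the right. Data → Split text to columns overwrites whatever is already there, while the SPLIT formula returns a #REF! error instead of expanding over existing data.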
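The split, reorder, and join operations above can be sketched in Python to show how the pieces fit together. The sample name and invoice code are invented for illustration, and split_cell and join_parts are hypothetical names.

```python
def split_cell(cell, delimiter=","):
    """Split on the delimiter and trim each part."""
    return [part.strip() for part in cell.split(delimiter)]

def join_parts(parts, delimiter=" "):
    """Join parts back into one string."""
    return delimiter.join(parts)

last, first = split_cell("Lovelace, Ada")
print(join_parts([first, last]))          # Ada Lovelace
print(join_parts([last, first], ", "))    # Lovelace, Ada

code = "INV-2024"
print(code[:3], code[-4:])  # first 3 and last 4 characters, like LEFT and RIGHT
```

Trimming inside the split is what removes the space that follows the comma in "Last, First" data.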
Validate and Verify Clean Data
Check that your cleaning worked and data is analysis-ready.
- Check for remaining duplicates: =COUNTA(A2:A1000)-COUNTUNIQUE(A2:A1000) should return zero for a column of unique IDs
- Verify data types: Sort each column—numbers and dates should sort logically
- Count blanks: =COUNTBLANK(A2:A1000) should return zero for required fields
- Check for outliers: Use MIN/MAX functions to spot impossible values
- Test formulas: Run a sample analysis to ensure cleaned data works correctly
Pro Tip
Create a "Data Quality Dashboard" with counts of rows, blanks, duplicates, and data type checks. Update it after each cleaning step.
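The dashboard counts can be gathered in one pass. The Python sketch below is a hypothetical example of such a report: the function name, its parameters, and the sample rows are all illustrative, and it assumes each row is a list with one required column and one numeric column.

```python
def quality_report(rows, required_index, numeric_index, low, high):
    """Summarize row count, exact duplicates, blanks, and numeric problems."""
    keys = [tuple(row) for row in rows]
    numbers = [
        row[numeric_index]
        for row in rows
        if isinstance(row[numeric_index], (int, float))
    ]
    return {
        "rows": len(rows),
        "duplicates": len(keys) - len(set(keys)),
        "blanks": sum(1 for row in rows if row[required_index] in ("", None)),
        "non_numeric": len(rows) - len(numbers),
        "out_of_range": sum(1 for n in numbers if not low <= n <= high),
    }

rows = [["a1", 10], ["a2", 250], ["", 30], ["a1", 10], ["a4", "12"]]
print(quality_report(rows, required_index=0, numeric_index=1, low=0, high=100))
```

A fully clean dataset should report zero for every field except rows, which gives you a single pass/fail check to rerun after each cleaning step.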
Wrapping Up
Frequently Asked Questions
What is the fastest way to clean messy data in Google Sheets?
Use built-in tools like Data → Data cleanup → Remove duplicates and Data → Split text to columns for quick fixes. For complex cleaning, combine TRIM(), CLEAN(), and ARRAYFORMULA to process entire columns at once. AI tools like ModelMonkey can also automate much of the process from simple prompts.
How do I remove duplicates in Google Sheets without deleting important data?
Before removing duplicates, create a backup copy of your sheet. Use Data → Data cleanup → Remove duplicates and carefully select which columns to check, since rows that look like duplicates sometimes differ in important fields. Alternatively, use =UNIQUE(A2:D100) to create a cleaned copy while keeping your original data intact.
How can I convert text to numbers in Google Sheets?
Use the VALUE() function: =VALUE(A2) converts text numbers to real numbers. Alternatively, multiply by 1: =A2*1. For entire columns, applying Format → Number → Number may be enough when the values are clean numeric text. If numbers have currency symbols or commas, remove them first with SUBSTITUTE().
What formulas clean data in Google Sheets?
Essential data cleaning formulas: TRIM() removes extra spaces, CLEAN() removes non-printable characters, UPPER()/LOWER()/PROPER() standardize capitalization, SUBSTITUTE() replaces unwanted characters, SPLIT() separates combined data, and VALUE() converts text to numbers. Combine these with ARRAYFORMULA to clean entire columns instantly.
How do I handle missing values in my dataset?
It depends on your analysis goals. Use =IF(ISBLANK(A2),0,A2) to replace blanks with zero, or =IF(ISBLANK(A2),"Unknown",A2) for text placeholders. For critical fields, filter out incomplete rows with =FILTER(A2:D100, NOT(ISBLANK(A2:A100))). Never arbitrarily fill blanks—this can skew statistical analysis.