Don't Judge a Table by its Header: A Data Analyst's Guide to Data Discovery

In any data project, the temptation to jump straight into building charts is strong. We hear the client's request, our minds fill with creative ideas, and we just want to start making things.

But this is a classic trap. As my recent training highlighted, if we get "creative and distracted with tangential ideas" too early, we miss the most crucial step: Data Discovery.

Data Discovery isn't just looking at the data. It's a structured, two-phase process. First, you must understand the data objectively. Second, you must explore it with the client's specific needs in mind.

Here are the key questions you should be asking yourself during this critical stage.

Phase 1: The General Inspection (Data-Focused)

Before you even think about the client's "ask," you must perform a general inspection of the raw material. Your only goal here is to assess data quality and structure.

Ask yourself:

  • What does one row actually represent? Is it a single transaction? A daily summary? A unique customer? What is the granularity of this data? This is the most fundamental question.
  • What are my measures and dimensions? What can I count and aggregate? What can I group by?
  • What are my date ranges? Am I looking at one year or ten? Do the dates need to be cleaned, and can I derive new date fields (such as month or quarter) from them?
  • Where are the hierarchies? Can I see natural drill-downs, like Region > State > City?
  • What is the data quality? Where is the missing or null data? Are there any other obvious issues?
  • What's my prep list? Based on the above, what data prep do I need to do? This could involve cleaning, splitting columns, pivoting data, or joining or unioning tables.
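
Several of the checks above can be scripted before you open a charting tool. The following is a minimal sketch in plain Python, assuming the table has been loaded as a list of dictionaries with ISO-formatted date strings. The sample rows, field names, and helper functions are all hypothetical, invented here for illustration.

```python
from collections import Counter
from datetime import date

# Hypothetical sample rows; field names are illustrative only.
rows = [
    {"order_id": 1, "order_date": "2023-01-05", "region": "North", "city": "Leeds", "sales": 120.0},
    {"order_id": 2, "order_date": "2023-02-10", "region": "North", "city": "York", "sales": 80.5},
    {"order_id": 3, "order_date": None, "region": "South", "city": "Bath", "sales": None},
    {"order_id": 4, "order_date": "2024-03-15", "region": "South", "city": "Bath", "sales": 42.0},
]


def key_is_unique(rows, key_fields):
    """Granularity check: does each key combination appear exactly once?"""
    keys = [tuple(r[f] for f in key_fields) for r in rows]
    return len(set(keys)) == len(keys)


def null_counts(rows):
    """Count missing (None) values per field."""
    counts = Counter()
    for r in rows:
        for field, value in r.items():
            counts[field] += value is None
    return dict(counts)


def date_range(rows, field):
    """Earliest and latest dates, skipping missing values (assumes ISO strings)."""
    dates = [date.fromisoformat(r[field]) for r in rows if r[field]]
    return min(dates), max(dates)


def is_hierarchy(rows, parent, child):
    """True if every child value sits under exactly one parent value."""
    parents = {}
    for r in rows:
        parents.setdefault(r[child], set()).add(r[parent])
    return all(len(p) == 1 for p in parents.values())


print(key_is_unique(rows, ["order_id"]))     # one row per order?
print(null_counts(rows))                     # where is data missing?
print(date_range(rows, "order_date"))        # how much time is covered?
print(is_hierarchy(rows, "region", "city"))  # does Region > City drill down cleanly?
```

If the key check fails, one row is probably not what the header suggests, and that is worth resolving before anything else.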

Only after establishing this foundational, technical understanding of the data are you prepared to apply a client-centric perspective.

Phase 2: The Client-Specific Lens (Context-Focused)

This is the pivotal transition from data description to business application. Your focus shifts from what the data is to what it means for the client's specific concerns.

Ask yourself:

  • Does this data make sense in the client's context? Are there values that seem wrong or impossible given what I know about their business?
  • What data directly relates to their key questions?
  • Is anything essential for their "ask" missing? Or is there data that could provide bonus insight?
  • What new questions can I answer for them? This is your chance to add real value. What opportunities does this data provide that the client hasn't even considered?
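
The first and third questions above can also be checked roughly in code. This is a sketch only, assuming a hypothetical retail client who wants profit by store. The required fields and plausibility rules are invented for illustration and would come from conversations with the real client.

```python
# Hypothetical sample rows for an imagined retail client.
sales_rows = [
    {"store": "A", "sales": 250.0, "discount": 0.10},
    {"store": "B", "sales": -40.0, "discount": 0.05},
    {"store": "C", "sales": 90.0, "discount": 1.50},
]

# Assumed inputs: fields the client's questions need, and what "sensible" means.
required_fields = ["store", "sales", "profit"]
plausibility_rules = {
    "sales": lambda v: v >= 0,            # assumption: negative sales are suspect
    "discount": lambda v: 0 <= v <= 1,    # assumption: discount is a fraction
}


def missing_fields(rows, required):
    """Fields the client's questions need that the data does not contain."""
    available = set().union(*(r.keys() for r in rows))
    return sorted(set(required) - available)


def implausible_rows(rows, rules):
    """Pairs of (row, field) that break a business rule; missing values are skipped."""
    flagged = []
    for r in rows:
        for field, is_plausible in rules.items():
            value = r.get(field)
            if value is not None and not is_plausible(value):
                flagged.append((r, field))
    return flagged


print(missing_fields(sales_rows, required_fields))
for row, field in implausible_rows(sales_rows, plausibility_rules):
    print(row["store"], field, row[field])
```

A flagged value is a question for the client, not automatically an error: a negative sale could be a legitimate refund.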

By following this two-phase discovery process, you stop yourself from getting distracted by shiny, tangential ideas and instead build a solid foundation. This ensures that when you do start building, you're not just making pretty charts; you're creating a solution that generates real insight.

Author:
Robin Jones
Powered by The Information Lab
© 2025 The Information Lab