LowLevelDesign Mastery

Design a JSON Parser

Build a robust JSON parser with tokenization and error handling.

Design and implement a JSON parser that can parse JSON strings into corresponding data structures. The parser should handle nested objects and arrays, support all JSON data types (strings, numbers, booleans, null, objects, arrays), validate JSON format during parsing, and provide meaningful error messages with position information for invalid JSON.

In this problem, you’ll build a system that takes a raw string and converts it into a structured, language-native representation while ensuring strict grammar adherence.


Design and implement a JSON parser that converts strings into usable data structures while validating syntax and providing clear error feedback.

Functional Requirements:

  • Data Type Support: Parse strings, numbers (integers, decimals, and exponent notation such as 1e-3), booleans, null, objects, and arrays.
  • Deep Nesting: Handle deeply nested objects and arrays, reporting a clear error instead of overflowing the stack if a nesting limit is exceeded.
  • Strict Validation: Detect syntax errors and throw exceptions with meaningful messages.
  • Precise Error Tracking: Report the exact line and column where a syntax error occurred.
  • Encoding & Escapes: Correctly handle Unicode characters and escape sequences (e.g., \n, \t, \uXXXX).
  • Whitespace Management: Ignore spaces, tabs, and newlines between JSON elements.
  • Empty Collections: Support parsing empty objects {} and empty arrays [].
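
The escape-handling requirement is the easiest one to get subtly wrong, so here is a minimal Python sketch of decoding the text between a string literal's quotes. The function names `decode_string_body` and `_read_hex4` are hypothetical, and the sketch assumes the tokenizer has already located the closing quote. It combines a `\uXXXX` surrogate pair into a single code point, which is what JSON uses for characters outside the Basic Multilingual Plane.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"

# Single-character escapes defined by the JSON grammar.
SIMPLE_ESCAPES = {
    '"': '"', "\\": "\\", "/": "/", "b": "\b",
    "f": "\f", "n": "\n", "r": "\r", "t": "\t",
}


def _read_hex4(body: str, start: int) -> int:
    """Read exactly four hex digits beginning at `start`."""
    digits = body[start:start + 4]
    if len(digits) != 4 or any(c not in HEX_DIGITS for c in digits):
        raise ValueError(f"invalid \\u escape at offset {start - 2}")
    return int(digits, 16)


def decode_string_body(body: str) -> str:
    """Decode the characters between the quotes of a JSON string."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            if ord(ch) < 0x20:
                raise ValueError(f"unescaped control character at offset {i}")
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError(f"dangling backslash at offset {i}")
        esc = body[i + 1]
        if esc in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[esc])
            i += 2
        elif esc == "u":
            code = _read_hex4(body, i + 2)
            i += 6
            # A high surrogate followed by a low surrogate encodes one
            # code point above U+FFFF.
            if 0xD800 <= code <= 0xDBFF and body[i:i + 2] == "\\u":
                low = _read_hex4(body, i + 2)
                if 0xDC00 <= low <= 0xDFFF:
                    code = 0x10000 + ((code - 0xD800) << 10) + (low - 0xDC00)
                    i += 6
            out.append(chr(code))
        else:
            raise ValueError(f"invalid escape \\{esc} at offset {i}")
    return "".join(out)
```

Offsets here are relative to the string body; a full tokenizer would translate them into line and column before reporting.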

Non-Functional Requirements:

  • Clean Architecture: Distinct separation between Lexical analysis (Tokenizer) and Syntactic analysis (Parser).
  • Parsing Strategy: Use a Recursive Descent approach to handle JSON’s hierarchical nature.
  • Efficiency: Single-pass parsing that avoids building a full token list in memory for large inputs.
  • Extensibility: Easy to add support for custom JSON extensions or output mappings.
  • Maintainability: Clear code structure with robust error handling to aid debugging.
  • Robustness: Handle edge cases like empty objects {} and arrays [].

The parser is divided into three primary layers: the Tokenizer, the Parser, and the Value Model.

Diagram
classDiagram
    class JsonValue {
        <<interface>>
        +getType()
        +asObject()
        +asArray()
    }
    
    class JsonObject {
        -Map members
        +get(key)
        +add(key, value)
    }
    
    class JsonArray {
        -List elements
        +get(index)
        +add(value)
    }

    class Tokenizer {
        -String input
        -int pos
        -int line
        -int column
        +nextToken() Token
    }

    class JsonParser {
        -Tokenizer tokenizer
        +parse(json) JsonValue
        -parseValue() JsonValue
        -parseObject() JsonObject
        -parseArray() JsonArray
    }
    
    JsonValue <|-- JsonObject
    JsonValue <|-- JsonArray
    JsonValue <|-- JsonPrimitive
    JsonParser --> Tokenizer
    JsonParser --> JsonValue
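
The Value Model in the diagram could translate into code roughly as follows. This is a Python sketch under assumptions, not the reference solution: the diagram does not show JsonPrimitive's fields, so the single `value` attribute is a guess, and the method names are converted to snake_case to match Python conventions.

```python
from abc import ABC, abstractmethod


class JsonValue(ABC):
    """Common interface for every node in the parsed tree."""

    @abstractmethod
    def get_type(self) -> str:
        ...

    # By default a value is neither an object nor an array; the two
    # container classes override the matching accessor.
    def as_object(self) -> "JsonObject":
        raise TypeError(f"{self.get_type()} is not an object")

    def as_array(self) -> "JsonArray":
        raise TypeError(f"{self.get_type()} is not an array")


class JsonPrimitive(JsonValue):
    """Leaf node. The `value` field is an assumption of this sketch."""

    def __init__(self, value):
        self.value = value  # str, int, float, bool, or None

    def get_type(self) -> str:
        if self.value is None:
            return "null"
        if isinstance(self.value, bool):  # check bool before numbers
            return "boolean"
        if isinstance(self.value, str):
            return "string"
        return "number"


class JsonObject(JsonValue):
    def __init__(self):
        self._members = {}

    def get_type(self) -> str:
        return "object"

    def as_object(self) -> "JsonObject":
        return self

    def get(self, key: str) -> JsonValue:
        return self._members[key]

    def add(self, key: str, value: JsonValue) -> None:
        self._members[key] = value


class JsonArray(JsonValue):
    def __init__(self):
        self._elements = []

    def get_type(self) -> str:
        return "array"

    def as_array(self) -> "JsonArray":
        return self

    def get(self, index: int) -> JsonValue:
        return self._elements[index]

    def add(self, value: JsonValue) -> None:
        self._elements.append(value)
```

Because containers hold JsonValue instances rather than concrete types, callers can walk the tree uniformly, which is the Composite pattern at work.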


JSON is inherently recursive. An object can contain an array, which contains an object, and so on.

Solution: Implement Recursive Descent Parsing. The parseValue() method can call parseObject() or parseArray(), which in turn call parseValue() for their children. Each grammar rule maps to one function, so the code mirrors the JSON grammar directly and is easy to follow and verify.
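
The mutual recursion can be sketched in Python as below. To keep the focus on the parser, this hypothetical `MiniParser` consumes ready-made `(kind, value)` token pairs; the token kinds ("STRING", "NUMBER", "BOOL", "NULL", plus the punctuation characters) are assumptions of the sketch. It returns native dicts and lists rather than JsonValue nodes for brevity.

```python
class MiniParser:
    """Recursive descent over an iterable of (kind, value) token pairs."""

    def __init__(self, tokens):
        self._tokens = iter(tokens)
        self._current = next(self._tokens, ("EOF", None))

    def _advance(self):
        token = self._current
        self._current = next(self._tokens, ("EOF", None))
        return token

    def _expect(self, kind):
        if self._current[0] != kind:
            raise ValueError(f"expected {kind}, got {self._current[0]}")
        return self._advance()

    def parse(self):
        value = self.parse_value()
        self._expect("EOF")  # reject trailing tokens after the value
        return value

    def parse_value(self):
        kind = self._current[0]
        if kind == "{":
            return self.parse_object()
        if kind == "[":
            return self.parse_array()
        if kind in ("STRING", "NUMBER", "BOOL", "NULL"):
            return self._advance()[1]
        raise ValueError(f"unexpected token {kind}")

    def parse_object(self):
        self._expect("{")
        members = {}
        if self._current[0] == "}":  # empty object
            self._advance()
            return members
        while True:
            key = self._expect("STRING")[1]
            self._expect(":")
            members[key] = self.parse_value()  # recurse into the child
            if self._current[0] == ",":
                self._advance()
                continue
            self._expect("}")
            return members

    def parse_array(self):
        self._expect("[")
        elements = []
        if self._current[0] == "]":  # empty array
            self._advance()
            return elements
        while True:
            elements.append(self.parse_value())  # recurse into the child
            if self._current[0] == ",":
                self._advance()
                continue
            self._expect("]")
            return elements
```

Because a comma must be followed by another member or element, trailing commas such as `[1,]` are rejected without any special-case code.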

Reading the entire input into a list of tokens can be memory-intensive for huge files.

Solution: Use a Lazy Tokenizer (Lexer). Instead of pre-tokenizing, the parser asks the tokenizer for the nextToken() only when needed. This keeps memory usage low and allows for early exits on syntax errors.
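
One way to get this on-demand behavior in Python is a generator, where each `next()` call plays the role of nextToken(). The sketch below is illustrative only: the function name `tokens`, the token kinds, and the regex-based scanning are assumptions, and string escapes are matched but left undecoded.

```python
import re

# One alternative per token kind; inner groups are non-capturing so that
# match.lastgroup always names the outer alternative.
_TOKEN_RE = re.compile(
    r'(?P<WS>[ \t\r\n]+)'
    r'|(?P<PUNCT>[{}\[\]:,])'
    r'|(?P<STRING>"(?:[^"\\]|\\.)*")'
    r'|(?P<NUMBER>-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?)'
    r'|(?P<WORD>true|false|null)'
)


def tokens(text):
    """Yield (kind, lexeme) pairs one at a time, scanning only as far
    as the consumer has asked for."""
    pos = 0
    while pos < len(text):
        match = _TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(
                f"unexpected character {text[pos]!r} at offset {pos}"
            )
        pos = match.end()
        if match.lastgroup == "WS":
            continue  # whitespace separates tokens but is never emitted
        yield match.lastgroup, match.group()
    yield "EOF", ""


stream = tokens('[true, @]')
print(next(stream))  # ('PUNCT', '[')
print(next(stream))  # ('WORD', 'true')
```

Nothing after `true` has been scanned at this point, so the invalid `@` is only reported if the parser asks for more tokens, and no token list is ever held in memory.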

Simply saying “Syntax Error” is frustrating for users.

Solution: Maintain line and column counters in the Tokenizer. When the Parser encounters an unexpected token, it can throw a ParseException that includes these coordinates, making it easy to find the bug in the JSON input.
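
The bookkeeping is small. In this Python sketch, a hypothetical `Cursor` class stands in for the position-tracking part of the Tokenizer, and the message format of `ParseException` is an assumption. Lines and columns are 1-based, as most editors display them.

```python
class ParseException(Exception):
    """Syntax error carrying the position where it was detected."""

    def __init__(self, message, line, column):
        super().__init__(f"{message} at line {line}, column {column}")
        self.line = line
        self.column = column


class Cursor:
    """Walks the input while keeping 1-based line and column counters."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.line = 1
        self.column = 1

    def peek(self):
        # An empty string signals end of input.
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def advance(self):
        ch = self.text[self.pos]
        self.pos += 1
        if ch == "\n":
            self.line += 1
            self.column = 1  # a newline resets the column
        else:
            self.column += 1
        return ch

    def error(self, message):
        # Build the exception at the current position; the caller raises it.
        return ParseException(message, self.line, self.column)


cursor = Cursor('{\n  "a": ?\n}')
while cursor.peek() != "?":
    cursor.advance()
print(cursor.error("Unexpected character '?'"))
# Unexpected character '?' at line 2, column 8
```

Stamping each token with the line and column where it starts lets the Parser report errors at the offending token even though the Tokenizer has already moved past it.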


By solving this problem, you’ll master:

  • Lexical Analysis - Building a state machine to scan tokens.
  • Recursive Descent - Converting grammar rules into code.
  • Composite Pattern - Managing hierarchical tree structures.
  • Error Handling - Implementing precise diagnostic systems.
  • String Manipulation - Handling escapes, Unicode, and whitespace.
