Iterator Pattern

Traverse collections uniformly - access elements sequentially without exposing underlying structure!

Iterator Pattern: Traversing Collections Uniformly

Now let’s dive into the Iterator Pattern - one of the most fundamental behavioral design patterns that provides a way to access elements of a collection sequentially without exposing its underlying representation.

Imagine you’re browsing a library. You don’t need to know how books are organized (by author, by topic, by shelf number) - you just walk through the aisles and browse. The library provides a consistent way to access books regardless of how they’re stored. The Iterator Pattern works the same way!

The Iterator Pattern provides a way to access elements of an aggregate object sequentially without exposing its underlying representation. It decouples the traversal logic from the collection structure.

The Iterator Pattern is useful when:

  1. You want to traverse collections uniformly - Same interface for different collection types
  2. You want to hide collection structure - Clients don’t need to know internal representation
  3. You want multiple traversals - Support multiple simultaneous traversals
  4. You want to decouple traversal logic - Separate iteration logic from collection
  5. You want to support different iteration strategies - Forward, backward, filtered, etc.
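
As a quick illustration of point 3, here is a minimal sketch using Python's built-in `iter()` and `next()` on a plain list. The titles are placeholder data. Each call to `iter()` returns an independent iterator that tracks its own position, so two traversals of the same collection do not interfere with each other.

```python
# Two independent traversals over the same collection.
titles = ["Design Patterns", "Clean Code", "Refactoring"]

first = iter(titles)   # iterator 1 starts at the beginning
second = iter(titles)  # iterator 2 keeps its own, separate position

a1 = next(first)   # first element, seen by iterator 1
a2 = next(first)   # second element, seen by iterator 1
b1 = next(second)  # first element again: iterator 2 is unaffected

print(a1, a2, b1, sep=" | ")
```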

What Happens If We Don’t Use Iterator Pattern?

Without the Iterator Pattern, you might:

  • Expose internal structure - Clients need to know how collection is implemented
  • Tight coupling - Clients depend on specific collection implementation
  • Code duplication - Traversal logic repeated for each collection type
  • Hard to change - Changing collection structure breaks client code
  • No uniform interface - Different collections have different traversal methods

Let’s start with a super simple example that anyone can understand!

Here’s how the Iterator Pattern works in practice. The sequence diagram shows a client obtaining an iterator from a collection and using it to traverse the elements:

sequenceDiagram
    participant Client
    participant Collection as BookCollection
    participant Iterator as BookIterator
    
    Client->>Collection: create_iterator()
    activate Collection
    Collection->>Iterator: new BookIterator(collection)
    activate Iterator
    Iterator-->>Collection: Iterator created
    deactivate Iterator
    Collection-->>Client: Iterator
    deactivate Collection
    
    Client->>Iterator: has_next()
    activate Iterator
    Iterator->>Iterator: Check index < size
    Iterator-->>Client: true
    deactivate Iterator
    
    Client->>Iterator: next()
    activate Iterator
    Iterator->>Iterator: Get element at index
    Iterator->>Iterator: Increment index
    Iterator-->>Client: Book("Design Patterns")
    deactivate Iterator
    
    Note over Client,Iterator: Iterator provides uniform\ninterface for traversal!
    
    Client->>Iterator: has_next()
    Iterator-->>Client: true
    
    Client->>Iterator: next()
    Iterator-->>Client: Book("Clean Code")

You’re building a library system with different collection types (array, linked list). Without the Iterator Pattern, the code looks like this:

bad_book_collection.py
# ❌ Without Iterator Pattern - Exposed internal structure!
class Book:
    """Book model"""
    def __init__(self, title: str):
        self.title = title

    def __str__(self):
        return self.title

class BookArray:
    """Book collection using array"""
    def __init__(self):
        self.books = []  # Array implementation

    def add(self, book: Book):
        self.books.append(book)

    def get(self, index: int) -> Book:
        return self.books[index]

    def size(self) -> int:
        return len(self.books)

class BookLinkedList:
    """Book collection using linked list"""
    class Node:
        def __init__(self, book: Book):
            self.book = book
            self.next = None

    def __init__(self):
        self.head = None  # Linked list implementation

    def add(self, book: Book):
        node = self.Node(book)
        node.next = self.head
        self.head = node

    def get_first(self) -> Book:
        return self.head.book if self.head else None

# Problem: Client needs to know internal structure!
def traverse_array(collection: BookArray):
    # Client knows it's an array - uses index
    for i in range(collection.size()):
        book = collection.get(i)
        print(f"Reading: {book}")

def traverse_linked_list(collection: BookLinkedList):
    # Client knows it's a linked list - uses node traversal
    current = collection.head
    while current:
        print(f"Reading: {current.book}")
        current = current.next

# Problems:
# - Client needs to know collection type
# - Different traversal code for each type
# - Tight coupling to implementation
# - Hard to add new collection types

Problems:

  • Exposed internal structure - Clients need to know implementation details
  • Different traversal code - Each collection type requires different code
  • Tight coupling - Clients depend on specific implementations
  • Hard to extend - Adding new collection types breaks client code
classDiagram
    class Iterable {
        <<interface>>
        +create_iterator() Iterator
    }
    class Iterator {
        <<interface>>
        +has_next() bool
        +next() Object
        +current() Object
    }
    class BookCollection {
        -books: List
        +create_iterator() Iterator
        +add(book) void
    }
    class BookIterator {
        -collection: BookCollection
        -index: int
        +has_next() bool
        +next() Book
        +current() Book
    }
    
    Iterable <|.. BookCollection : implements
    Iterator <|.. BookIterator : implements
    BookCollection --> Iterator : creates
    BookIterator --> BookCollection : traverses
    
    note for Iterator "Uniform traversal interface"
    note for BookCollection "Hides internal structure"
iterator_book_collection.py
from abc import ABC, abstractmethod
from typing import List, Optional

# Step 1: Define Iterator interface
class Iterator(ABC):
    """Iterator interface - uniform traversal interface"""
    @abstractmethod
    def has_next(self) -> bool:
        """Check if there are more elements"""
        pass

    @abstractmethod
    def next(self):
        """Get next element and advance"""
        pass

    @abstractmethod
    def current(self):
        """Get current element without advancing"""
        pass

# Step 2: Define Iterable interface
class Iterable(ABC):
    """Iterable interface - collections that can be iterated"""
    @abstractmethod
    def create_iterator(self) -> Iterator:
        """Create an iterator for this collection"""
        pass

# Step 3: Implement Book model
class Book:
    """Book model"""
    def __init__(self, title: str):
        self.title = title

    def __str__(self):
        return self.title

# Step 4: Implement Concrete Aggregate
class BookCollection(Iterable):
    """Concrete aggregate - book collection"""
    def __init__(self):
        self._books: List[Book] = []  # Internal structure hidden

    def add(self, book: Book) -> None:
        """Add a book to collection"""
        self._books.append(book)

    def get(self, index: int) -> Book:
        """Get book by index (internal method)"""
        return self._books[index]

    def size(self) -> int:
        """Get collection size (internal method)"""
        return len(self._books)

    def create_iterator(self) -> Iterator:
        """Create iterator for this collection"""
        return BookIterator(self)

# Step 5: Implement Concrete Iterator
class BookIterator(Iterator):
    """Concrete iterator - traverses book collection"""
    def __init__(self, collection: BookCollection):
        self._collection = collection
        self._index = 0  # Current position

    def has_next(self) -> bool:
        """Check if there are more books"""
        return self._index < self._collection.size()

    def next(self) -> Book:
        """Get next book and advance"""
        if not self.has_next():
            raise StopIteration("No more books")
        book = self._collection.get(self._index)
        self._index += 1
        return book

    def current(self) -> Book:
        """Get current book without advancing"""
        if self._index >= self._collection.size():
            raise IndexError("No current book")
        return self._collection.get(self._index)

# Usage - Uniform interface!
def main():
    # Create collection
    collection = BookCollection()
    collection.add(Book("Design Patterns"))
    collection.add(Book("Clean Code"))
    collection.add(Book("Refactoring"))

    # Client uses uniform iterator interface - doesn't know internal structure!
    iterator = collection.create_iterator()
    print("Traversing books:\n")
    while iterator.has_next():
        book = iterator.next()
        print(f"Reading: {book}")
    print("\n✅ Iterator Pattern: Uniform interface for all collections!")

if __name__ == "__main__":
    main()
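
To see the "uniform interface" claim hold for a second collection type, the sketch below adds a linked-list collection behind the same `create_iterator()` / `has_next()` / `next()` contract. `LinkedBookCollection`, `LinkedBookIterator`, `_Node` and `read_all` are hypothetical names introduced for illustration, titles are stored as plain strings to keep it short, and the two interfaces are redeclared so the block runs on its own.

```python
from abc import ABC, abstractmethod
from typing import Any, List, Optional


class Iterator(ABC):
    @abstractmethod
    def has_next(self) -> bool: ...

    @abstractmethod
    def next(self) -> Any: ...


class Iterable(ABC):
    @abstractmethod
    def create_iterator(self) -> Iterator: ...


class _Node:
    """Private linked-list node; never handed to clients."""
    def __init__(self, value: Any) -> None:
        self.value = value
        self.next: Optional["_Node"] = None


class LinkedBookCollection(Iterable):
    """Hypothetical linked-list collection that keeps insertion order."""
    def __init__(self) -> None:
        self._head: Optional[_Node] = None
        self._tail: Optional[_Node] = None

    def add(self, title: str) -> None:
        node = _Node(title)
        if self._tail is None:
            self._head = self._tail = node
        else:
            self._tail.next = node
            self._tail = node

    def create_iterator(self) -> Iterator:
        return LinkedBookIterator(self._head)


class LinkedBookIterator(Iterator):
    """Walks the nodes; the client only sees has_next() and next()."""
    def __init__(self, head: Optional[_Node]) -> None:
        self._current = head

    def has_next(self) -> bool:
        return self._current is not None

    def next(self) -> Any:
        if self._current is None:
            raise StopIteration("No more books")
        value = self._current.value
        self._current = self._current.next
        return value


def read_all(collection: Iterable) -> List[Any]:
    """Client code: depends only on the Iterable and Iterator interfaces."""
    result = []
    iterator = collection.create_iterator()
    while iterator.has_next():
        result.append(iterator.next())
    return result


shelf = LinkedBookCollection()
shelf.add("Design Patterns")
shelf.add("Clean Code")
print(read_all(shelf))
```

`read_all` never touches a node or an index, so an array-backed collection exposing the same two interfaces could be passed to it unchanged.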

Real-World Software Example: Distributed Data Processing

Now let’s see a realistic software example - a distributed system that needs to process data from multiple sources (database, file system, API) uniformly.

You’re building a data processing pipeline that needs to process records from different sources. Without the Iterator Pattern, each source ends up with its own access methods:

bad_data_processing.py
# ❌ Without Iterator Pattern - Different access patterns!
class DatabaseSource:
    """Database data source"""
    def __init__(self, connection_string: str):
        self.connection_string = connection_string
        # Simulate database connection
        self.records = [
            {"id": 1, "name": "Alice"},
            {"id": 2, "name": "Bob"},
            {"id": 3, "name": "Charlie"}
        ]
        self.current_index = 0

    def fetch_all(self):
        """Fetch all records"""
        return self.records

    def fetch_next(self):
        """Fetch next record"""
        if self.current_index < len(self.records):
            record = self.records[self.current_index]
            self.current_index += 1
            return record
        return None

class FileSource:
    """File data source"""
    def __init__(self, filepath: str):
        self.filepath = filepath
        # Simulate file reading
        self.lines = [
            "id,name",
            "1,Alice",
            "2,Bob",
            "3,Charlie"
        ]
        self.current_line = 0

    def read_lines(self):
        """Read all lines"""
        return self.lines[1:]  # Skip header

    def read_next_line(self):
        """Read next line"""
        if self.current_line < len(self.lines) - 1:
            line = self.lines[self.current_line + 1]
            self.current_line += 1
            return line
        return None

class APISource:
    """API data source"""
    def __init__(self, endpoint: str):
        self.endpoint = endpoint
        # Simulate API response
        self.data = {
            "records": [
                {"id": 1, "name": "Alice"},
                {"id": 2, "name": "Bob"},
                {"id": 3, "name": "Charlie"}
            ]
        }

    def get_all(self):
        """Get all records"""
        return self.data["records"]

    def get_next_page(self):
        """Get next page (pagination)"""
        return self.data["records"]

# Problem: Client needs different code for each source!
def process_database(source: DatabaseSource):
    records = source.fetch_all()  # Different method!
    for record in records:
        print(f"Processing: {record}")

def process_file(source: FileSource):
    lines = source.read_lines()  # Different method!
    for line in lines:
        print(f"Processing: {line}")

def process_api(source: APISource):
    records = source.get_all()  # Different method!
    for record in records:
        print(f"Processing: {record}")

# Problems:
# - Different access patterns for each source
# - Client needs to know source type
# - Hard to add new sources
# - Code duplication

Problems:

  • Different access patterns - Each source has different methods
  • Client needs source type - Must know which source it’s using
  • Hard to extend - Adding new sources requires client changes
  • Code duplication - Similar processing logic repeated
iterator_data_processing.py
from abc import ABC, abstractmethod
from typing import Dict, Any, Optional

# Step 1: Define Iterator interface
class DataIterator(ABC):
    """Iterator interface for data sources"""
    @abstractmethod
    def has_next(self) -> bool:
        """Check if there are more records"""
        pass

    @abstractmethod
    def next(self) -> Dict[str, Any]:
        """Get next record"""
        pass

# Step 2: Define Iterable interface
class DataSource(ABC):
    """Iterable interface for data sources"""
    @abstractmethod
    def create_iterator(self) -> DataIterator:
        """Create iterator for this data source"""
        pass

# Step 3: Implement Database Source with Iterator
class DatabaseSource(DataSource):
    """Database data source"""
    def __init__(self, connection_string: str):
        self.connection_string = connection_string
        # Simulate database records
        self._records = [
            {"id": 1, "name": "Alice", "source": "database"},
            {"id": 2, "name": "Bob", "source": "database"},
            {"id": 3, "name": "Charlie", "source": "database"}
        ]

    def create_iterator(self) -> DataIterator:
        """Create database iterator"""
        return DatabaseIterator(self)

    def get_records(self):
        """Internal method to get records"""
        return self._records

class DatabaseIterator(DataIterator):
    """Iterator for database source"""
    def __init__(self, source: DatabaseSource):
        self._source = source
        self._records = source.get_records()
        self._index = 0

    def has_next(self) -> bool:
        return self._index < len(self._records)

    def next(self) -> Dict[str, Any]:
        if not self.has_next():
            raise StopIteration("No more records")
        record = self._records[self._index]
        self._index += 1
        return record

# Step 4: Implement File Source with Iterator
class FileSource(DataSource):
    """File data source"""
    def __init__(self, filepath: str):
        self.filepath = filepath
        # Simulate file lines
        self._lines = [
            "id,name",
            "1,Alice",
            "2,Bob",
            "3,Charlie"
        ]

    def create_iterator(self) -> DataIterator:
        """Create file iterator"""
        return FileIterator(self)

    def get_lines(self):
        """Internal method to get lines"""
        return self._lines

class FileIterator(DataIterator):
    """Iterator for file source"""
    def __init__(self, source: FileSource):
        self._source = source
        self._lines = source.get_lines()[1:]  # Skip header
        self._index = 0

    def has_next(self) -> bool:
        return self._index < len(self._lines)

    def next(self) -> Dict[str, Any]:
        if not self.has_next():
            raise StopIteration("No more records")
        # Parse CSV line
        line = self._lines[self._index]
        parts = line.split(",")
        record = {"id": int(parts[0]), "name": parts[1], "source": "file"}
        self._index += 1
        return record

# Step 5: Implement API Source with Iterator
class APISource(DataSource):
    """API data source"""
    def __init__(self, endpoint: str):
        self.endpoint = endpoint
        # Simulate API response
        self._records = [
            {"id": 1, "name": "Alice", "source": "api"},
            {"id": 2, "name": "Bob", "source": "api"},
            {"id": 3, "name": "Charlie", "source": "api"}
        ]

    def create_iterator(self) -> DataIterator:
        """Create API iterator"""
        return APIIterator(self)

    def get_records(self):
        """Internal method to get records"""
        return self._records

class APIIterator(DataIterator):
    """Iterator for API source"""
    def __init__(self, source: APISource):
        self._source = source
        self._records = source.get_records()
        self._index = 0

    def has_next(self) -> bool:
        return self._index < len(self._records)

    def next(self) -> Dict[str, Any]:
        if not self.has_next():
            raise StopIteration("No more records")
        record = self._records[self._index]
        self._index += 1
        return record

# Usage - Uniform interface for all sources!
def process_data_source(source: DataSource):
    """Process any data source uniformly"""
    iterator = source.create_iterator()
    print(f"Processing records from {type(source).__name__}:\n")
    while iterator.has_next():
        record = iterator.next()
        print(f" Processing: {record}")
    print()

def main():
    # Create different sources
    db_source = DatabaseSource("postgresql://localhost/db")
    file_source = FileSource("data.csv")
    api_source = APISource("https://api.example.com/data")

    # Process all sources uniformly - same code!
    process_data_source(db_source)
    process_data_source(file_source)
    process_data_source(api_source)
    print("✅ Iterator Pattern: Uniform interface for all data sources!")

if __name__ == "__main__":
    main()

There are different ways to implement the Iterator Pattern:

External iterator - the client controls iteration:

external_iterator.py
# External Iterator - client controls iteration
class Iterator:
    def has_next(self): pass
    def next(self): pass

# Client controls iteration
iterator = collection.create_iterator()
while iterator.has_next():
    item = iterator.next()
    process(item)

Pros: Client has full control, flexible
Cons: More code for client

Internal iterator - the collection controls iteration, and the client provides a callback:

internal_iterator.py
# Internal Iterator - collection controls iteration
class Collection:
    def for_each(self, callback):
        for item in self._items:
            callback(item)

# Collection controls iteration
collection.for_each(lambda item: process(item))

Pros: Simpler client code, less error-prone
Cons: Less control for client
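
In Python specifically, both styles are usually built on the language's iterator protocol rather than hand-written `has_next()` / `next()` methods: a collection defines `__iter__`, and a generator function supplies the iterator. The sketch below assumes a hypothetical `Playlist` class and shows the external style (a `for` loop driven by the client) next to an internal-style `for_each` built on the same protocol.

```python
from typing import Callable, Iterator, List


class Playlist:
    """Hypothetical collection used only for illustration."""

    def __init__(self) -> None:
        self._songs: List[str] = []

    def add(self, song: str) -> None:
        self._songs.append(song)

    def __iter__(self) -> Iterator[str]:
        # Generator function: every call returns a fresh, independent iterator.
        for song in self._songs:
            yield song

    def for_each(self, callback: Callable[[str], None]) -> None:
        # Internal-iterator style: the collection drives the loop.
        for song in self:
            callback(song)


playlist = Playlist()
playlist.add("Intro")
playlist.add("Outro")

# External style: the client drives the loop and could stop early.
external = [song for song in playlist]

# Internal style: the client only supplies a callback.
internal: List[str] = []
playlist.for_each(internal.append)

print(external, internal)
```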


Use Iterator Pattern when:

You want to traverse collections uniformly - Same interface for different types
You want to hide collection structure - Clients don’t need implementation details
You want multiple traversals - Support multiple simultaneous iterators
You want to decouple traversal logic - Separate iteration from collection
You want to support different iteration strategies - Forward, backward, filtered
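
The last point, different iteration strategies, can be sketched as one collection that hands out several kinds of iterator. `BookShelf` and its `matching` method are hypothetical names; `__iter__` and `__reversed__` are the standard Python hooks used by `for` loops and the built-in `reversed()`.

```python
from typing import Callable, Iterator, List


class BookShelf:
    """Hypothetical collection offering several traversal strategies."""

    def __init__(self, titles: List[str]) -> None:
        self._titles = list(titles)

    def __iter__(self) -> Iterator[str]:
        # Forward traversal
        return iter(self._titles)

    def __reversed__(self) -> Iterator[str]:
        # Backward traversal, picked up by the built-in reversed()
        return reversed(self._titles)

    def matching(self, predicate: Callable[[str], bool]) -> Iterator[str]:
        # Filtered traversal: yields only titles accepted by the predicate
        return filter(predicate, self._titles)


shelf = BookShelf(["Design Patterns", "Clean Code", "Refactoring"])
forward = list(shelf)
backward = list(reversed(shelf))
c_titles = list(shelf.matching(lambda title: title.startswith("C")))

print(forward, backward, c_titles, sep="\n")
```

The client picks a strategy by choosing which iterator to ask for; the storage behind `_titles` stays hidden in every case.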

Don’t use Iterator Pattern when:

Simple collections - If you only have one collection type, direct iteration is simpler
Performance critical - A custom iterator adds a method call per element, which is usually negligible but can matter in hot loops
Collections are too simple - For arrays or simple lists, the language’s built-in iteration is usually enough
Over-engineering - Don’t add complexity for simple cases


Mistake 1: Exposing Internal Structure

exposing_structure.py
# ❌ Bad: Iterator exposes internal structure
class Iterator:
    def __init__(self, collection):
        self.collection = collection  # Bad: Exposes collection
        self.index = 0

    def get_collection(self):  # Bad: Exposes internal structure
        return self.collection

# ✅ Good: Iterator hides internal structure
class Iterator:
    def __init__(self, collection):
        self._collection = collection  # Good: Private
        self._index = 0

    def has_next(self):
        return self._index < self._collection.size()  # Good: Uses the public interface

Mistake 2: Modifying Collection During Iteration

modifying_during_iteration.py
# ❌ Bad: Modifying collection during iteration
iterator = collection.create_iterator()
while iterator.has_next():
    item = iterator.next()
    collection.remove(item)  # Bad: Modifies during iteration!

# ✅ Good: Collect items to remove, then remove
items_to_remove = []
iterator = collection.create_iterator()
while iterator.has_next():
    item = iterator.next()
    if should_remove(item):
        items_to_remove.append(item)
for item in items_to_remove:
    collection.remove(item)
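
The same problem is easy to reproduce with a plain Python list. The sketch below uses placeholder numbers: it first shows elements being skipped when the list is mutated mid-traversal, then one alternative fix, which is to iterate over a snapshot copy while mutating the original.

```python
# What goes wrong: removing while iterating shifts later elements left,
# so the iterator's internal index steps over them.
numbers = [2, 4, 6, 8]
for n in numbers:
    numbers.remove(n)
skipped = list(numbers)  # 4 and 8 were never visited, so they remain

# One alternative: iterate over a snapshot copy and mutate the original.
numbers = [2, 4, 6, 8]
for n in list(numbers):
    numbers.remove(n)
emptied = list(numbers)  # every element was visited and removed

print(skipped, emptied)
```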

Mistake 3: Not Handling Concurrent Modifications

concurrent_modification.py
# ❌ Bad: No protection against concurrent modification
class Iterator:
    def __init__(self, collection):
        self._collection = collection
        self._index = 0

    def next(self):
        item = self._collection.get(self._index)  # Bad: No check!
        self._index += 1
        return item

# ✅ Good: Check for concurrent modification
class ConcurrentModificationError(RuntimeError):
    """Raised when the collection changes while it is being iterated"""

class Iterator:
    def __init__(self, collection):
        self._collection = collection
        self._index = 0
        self._expected_size = collection.size()  # Good: Track size

    def next(self):
        if self._collection.size() != self._expected_size:
            raise ConcurrentModificationError()  # Good: Detect changes
        item = self._collection.get(self._index)
        self._index += 1  # Advance so the next call returns the next item
        return item
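
A size comparison misses changes that leave the size unchanged, such as one removal followed by one addition. A common alternative is a modification counter that the collection bumps on every structural change, similar in spirit to the `modCount` field used by Java's `ArrayList`. The sketch below is one possible shape under that assumption; `VersionedList`, `VersionedIterator` and `_version` are hypothetical names.

```python
class ConcurrentModificationError(RuntimeError):
    """Raised when a collection changes while an iterator is active."""


class VersionedList:
    def __init__(self):
        self._items = []
        self._version = 0  # bumped on every structural change

    def add(self, item):
        self._items.append(item)
        self._version += 1

    def remove(self, item):
        self._items.remove(item)
        self._version += 1

    def create_iterator(self):
        return VersionedIterator(self)


class VersionedIterator:
    # The iterator is a close collaborator of VersionedList, so it reads the
    # collection's private fields directly; clients never do.
    def __init__(self, collection):
        self._collection = collection
        self._index = 0
        self._expected_version = collection._version

    def _check(self):
        if self._collection._version != self._expected_version:
            raise ConcurrentModificationError("collection changed during iteration")

    def has_next(self):
        self._check()
        return self._index < len(self._collection._items)

    def next(self):
        self._check()
        if self._index >= len(self._collection._items):
            raise StopIteration("No more items")
        item = self._collection._items[self._index]
        self._index += 1
        return item


items = VersionedList()
items.add("a")
items.add("b")
it = items.create_iterator()
first = it.next()

# The size is 2 before and after, but the version differs, so it is detected.
items.remove("b")
items.add("c")
try:
    it.next()
    detected = False
except ConcurrentModificationError:
    detected = True

print(first, detected)
```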

  1. Uniform Interface - Same traversal interface for all collections
  2. Hidden Structure - Clients don’t know internal implementation
  3. Decoupled - Traversal logic separated from collection
  4. Multiple Traversals - Support multiple simultaneous iterators
  5. Easy to Extend - Add new collection types without changing clients
  6. Works with Remote Data - The same interface can wrap paginated or streamed remote sources

Iterator Pattern is a behavioral design pattern that provides a way to access elements of an aggregate object sequentially without exposing its underlying representation.

  • Uniform traversal - Same interface for all collections
  • Hide structure - Clients don’t know implementation
  • Decouple - Traversal logic separated from collection
  • Multiple traversals - Support multiple simultaneous iterators
  • Easy to extend - Add new collections without changing clients
  1. Define Iterator interface - Uniform traversal methods
  2. Define Iterable interface - Collections that can be iterated
  3. Implement Concrete Iterator - Traverses specific collection
  4. Implement Concrete Aggregate - Collection that creates iterator
  5. Client uses iterator - Uniform interface for all collections
Client → Iterable → Iterator → Aggregate
  • Iterator - Interface for traversal
  • Concrete Iterator - Implements traversal for specific collection
  • Iterable - Interface for collections that can be iterated
  • Concrete Aggregate - Collection that creates iterator
  • Client - Uses iterator uniformly
class Iterator:
    def has_next(self): pass
    def next(self): pass

class Collection:
    def create_iterator(self): return Iterator()

collection = Collection()
iterator = collection.create_iterator()
while iterator.has_next():
    item = iterator.next()

✅ Traverse collections uniformly
✅ Hide collection structure
✅ Support multiple traversals
✅ Decouple traversal logic
✅ Support different iteration strategies

❌ Simple collections
❌ Performance critical
❌ Collections are too simple
❌ Over-engineering

  • Iterator Pattern = Uniform traversal interface
  • Iterator = Traversal interface
  • Iterable = Collection interface
  • Benefit = Uniform interface, hidden structure
  • Use Case = Multiple collection types, distributed data sources
class Iterator:
    def has_next(self): pass
    def next(self): pass

class Collection:
    def create_iterator(self): return Iterator()
  • Iterator Pattern provides uniform traversal interface
  • It hides collection structure
  • It decouples traversal from collection
  • It’s about uniformity, not just iteration!
  • Essential for distributed systems with multiple data sources!

What to say:

“Iterator Pattern is a behavioral design pattern that provides a way to access elements of an aggregate object sequentially without exposing its underlying representation. It decouples traversal logic from the collection structure, allowing uniform traversal of different collection types.”

Why it matters:

  • Shows you understand the fundamental purpose
  • Demonstrates knowledge of decoupling and encapsulation
  • Indicates you can explain concepts clearly

Must mention:

  • Multiple collection types - Need uniform traversal interface
  • Hide implementation - Clients shouldn’t know internal structure
  • Multiple traversals - Support simultaneous iterators
  • Distributed systems - Uniform interface for remote data sources
  • Decouple traversal - Separate iteration logic from collection

Example scenario to give:

“I’d use the Iterator Pattern when building a data processing pipeline that needs to process records from multiple sources - a database, a file system, and an API. Each source has a different internal structure, but the iterator provides a uniform traversal interface. Client code doesn’t need to know whether data comes from a database or a file - it just iterates.”

Must discuss:

  • Iterator: Uniform interface, hides structure, decoupled
  • Direct Access: Simple, but exposes structure, tight coupling
  • Key difference: Iterator abstracts traversal, direct access exposes implementation

Example to give:

“The Iterator Pattern abstracts traversal logic, allowing clients to traverse collections without knowing their internal structure. Direct access requires clients to know whether the collection is an array, a linked list, or a tree, which leads to tight coupling. An iterator provides the same interface regardless of implementation.”
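
To make the tree case concrete, the sketch below hides a binary search tree behind Python's `__iter__`. `SortedTree`, `_TreeNode` and `total` are hypothetical names; the point is that `total` works on the tree and on a plain list with no knowledge of nodes, indexes, or the traversal stack.

```python
from typing import Iterable, Iterator, List, Optional


class _TreeNode:
    def __init__(self, value: int) -> None:
        self.value = value
        self.left: Optional["_TreeNode"] = None
        self.right: Optional["_TreeNode"] = None


class SortedTree:
    """Hypothetical binary search tree; clients never see _TreeNode."""

    def __init__(self) -> None:
        self._root: Optional[_TreeNode] = None

    def add(self, value: int) -> None:
        if self._root is None:
            self._root = _TreeNode(value)
            return
        node = self._root
        while True:
            if value < node.value:
                if node.left is None:
                    node.left = _TreeNode(value)
                    return
                node = node.left
            else:
                if node.right is None:
                    node.right = _TreeNode(value)
                    return
                node = node.right

    def __iter__(self) -> Iterator[int]:
        # In-order traversal using an explicit stack instead of recursion.
        stack: List[_TreeNode] = []
        node = self._root
        while stack or node is not None:
            while node is not None:
                stack.append(node)
                node = node.left
            node = stack.pop()
            yield node.value
            node = node.right


def total(values: Iterable[int]) -> int:
    """Client code: accepts a list, a tree, or any other iterable."""
    result = 0
    for value in values:
        result += value
    return result


tree = SortedTree()
for v in [5, 2, 8, 1]:
    tree.add(v)
in_order = list(tree)

print(in_order, total(tree), total([5, 2, 8, 1]))
```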

Must discuss:

  • Multiple data sources - Database, file system, API, message queues
  • Uniform interface - Same traversal code for all sources
  • Lazy loading - Can load data on-demand during iteration
  • Network efficiency - Can implement pagination in iterator

Example to give:

“In distributed systems, the Iterator Pattern is useful for processing data from multiple sources uniformly. For example, a data pipeline might need to process records from a database, S3 files, and Kafka streams. An iterator gives all of them the same interface, and it can implement lazy loading and pagination to handle large datasets efficiently.”

Benefits to mention:

  • Uniform interface - Same traversal code for all collections
  • Hidden structure - Clients don’t know implementation
  • Decoupled - Traversal logic separated from collection
  • Multiple traversals - Support simultaneous iterators
  • Easy to extend - Add new collections without changing clients

Trade-offs to acknowledge:

  • Complexity - Adds abstraction layer
  • Performance - Small overhead (usually negligible)
  • Over-engineering risk - Can be overkill for simple collections

Q: “What’s the difference between Iterator Pattern and for-each loop?”

A:

“In most languages the for-each loop is built on top of the Iterator Pattern: Java’s enhanced for loop works on any Iterable, and Python’s for loop calls iter() and next() behind the scenes. The pattern is the abstraction that hides the collection’s implementation; for-each is the convenient syntax for consuming it. You implement the pattern explicitly when you write your own collection or data source and want it to be traversed the same way as the built-in ones.”

Q: “How would you implement Iterator for a distributed data source?”

A:

“For distributed data sources, I’d implement the iterator with lazy loading and pagination. The iterator fetches data in batches from the remote source and caches the current batch. When has_next() is called, it checks whether the current batch has more items and, if not, fetches the next batch. This allows processing large datasets without loading everything into memory.”
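
A minimal sketch of that batching idea follows. It assumes the remote source can be reduced to a `fetch_page(page_number)` callable that returns a list of records and returns an empty list once the data runs out. `PagedIterator` and `fake_fetch_page` are hypothetical names, and the in-memory `_DATA` list stands in for a real network call.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


class PagedIterator:
    """Fetches records one page at a time; fetch_page is supplied by the caller."""

    def __init__(self, fetch_page: Callable[[int], List[Record]]) -> None:
        self._fetch_page = fetch_page
        self._page = 0
        self._batch: List[Record] = []
        self._index = 0
        self._exhausted = False

    def has_next(self) -> bool:
        # Refill only when the cached batch has been fully consumed.
        while self._index >= len(self._batch) and not self._exhausted:
            self._batch = self._fetch_page(self._page)
            self._page += 1
            self._index = 0
            if not self._batch:  # an empty page signals the end of the data
                self._exhausted = True
        return self._index < len(self._batch)

    def next(self) -> Record:
        if not self.has_next():
            raise StopIteration("No more records")
        record = self._batch[self._index]
        self._index += 1
        return record


# Stand-in for a remote call: five records served two per page.
_DATA: List[Record] = [{"id": i} for i in range(1, 6)]
calls: List[int] = []


def fake_fetch_page(page: int, size: int = 2) -> List[Record]:
    calls.append(page)  # record which pages were requested
    return _DATA[page * size:(page + 1) * size]


iterator = PagedIterator(fake_fetch_page)
ids = []
while iterator.has_next():
    ids.append(iterator.next()["id"])

print(ids, calls)
```

Only one page is held in memory at a time, and a page is requested only when the client asks for a record beyond the cached batch.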

Q: “How does Iterator Pattern relate to SOLID principles?”

A:

“The Iterator Pattern supports the Single Responsibility Principle by moving traversal logic into the iterator. It supports the Open/Closed Principle because you can add new collection types without modifying client code. It supports the Dependency Inversion Principle by having clients depend on the Iterator interface rather than on concrete collections. It also fits the Interface Segregation Principle, since the iterator interface is small and focused.”

Before your interview, make sure you can:

  • Define Iterator Pattern clearly in one sentence
  • Explain when to use it (with examples showing uniform traversal)
  • Describe Iterator vs Direct Access
  • Implement Iterator Pattern from scratch
  • Compare with other patterns (Visitor, Strategy)
  • List benefits and trade-offs
  • Identify common mistakes (exposing structure, concurrent modification)
  • Give 2-3 real-world examples (especially distributed systems)
  • Connect to SOLID principles
  • Discuss when NOT to use it
  • Explain distributed systems relevance

Remember: Iterator Pattern is about uniform traversal - providing a consistent way to access elements without exposing collection structure, which is especially useful when data comes from several different sources! 🔄