Data Structures · Intermediate · 12 min read · Updated March 2025
Hash Maps in Practice
Hash maps are among the most versatile data structures in competitive programming and real-world development. This article covers practical patterns: frequency counting, two-sum, grouping, caching, and designing data structures using hash maps.
Python dict and collections Module
Python provides several hash map variants optimized for different use cases:
- dict — Standard hash map. O(1) average for get, set, delete. Maintains insertion order (Python 3.7+).
- collections.defaultdict — Like dict but returns a default value for missing keys instead of raising KeyError. Eliminates boilerplate checks.
- collections.Counter — Specialized dict for counting hashable objects. Has useful methods like most_common().
- collections.OrderedDict — Maintains insertion order explicitly. Useful for LRU Cache implementation.
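To see why defaultdict and Counter save boilerplate, here is a quick comparison on a small counting task (standard-library behavior only):

```python
from collections import defaultdict, Counter

# defaultdict(int) returns 0 for missing keys instead of raising KeyError
tally = defaultdict(int)
for word in ["a", "b", "a"]:
    tally[word] += 1          # no "if word in tally" check needed
print(dict(tally))            # {'a': 2, 'b': 1}

# Counter does the same counting in one call, plus extras like most_common()
c = Counter("mississippi")
print(c.most_common(2))       # [('i', 4), ('s', 4)]
```

With a plain dict, the loop body would need `tally[word] = tally.get(word, 0) + 1`; the specialized variants make the intent clearer.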
Frequency Counting and Anagram Detection
The most common hash map pattern — counting occurrences:
python
from collections import Counter, defaultdict

# ---- Frequency Counter ----
words = ["apple", "banana", "apple", "cherry", "banana", "apple"]
freq = Counter(words)
print(freq)                 # Counter({'apple': 3, 'banana': 2, 'cherry': 1})
print(freq.most_common(2))  # [('apple', 3), ('banana', 2)]

# ---- Anagram Detection ----
def is_anagram(s, t):
    return Counter(s) == Counter(t)

print(is_anagram("anagram", "nagaram"))  # True
print(is_anagram("rat", "car"))          # False

# ---- Group Anagrams ----
def group_anagrams(strs):
    groups = defaultdict(list)
    for s in strs:
        key = tuple(sorted(s))  # Canonical form
        groups[key].append(s)
    return list(groups.values())

print(group_anagrams(["eat", "tea", "tan", "ate", "nat", "bat"]))
# [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]

# ---- First Non-Repeating Character ----
def first_unique(s):
    count = Counter(s)
    for i, char in enumerate(s):
        if count[char] == 1:
            return i
    return -1

print(first_unique("leetcode"))  # 0 ('l')
print(first_unique("aabb"))      # -1
Two Sum and Subarray Problems
Hash maps convert O(n^2) brute-force solutions to O(n):
python
from collections import defaultdict

# ---- Two Sum - O(n) ----
def two_sum(nums, target):
    seen = {}  # value -> index
    for i, num in enumerate(nums):
        complement = target - num
        if complement in seen:
            return [seen[complement], i]
        seen[num] = i
    return []

print(two_sum([2, 7, 11, 15], 9))  # [0, 1]

# ---- Subarray Sum Equals K - O(n) ----
def subarray_sum(nums, k):
    count = 0
    prefix_sum = 0
    prefix_counts = defaultdict(int)
    prefix_counts[0] = 1  # The empty prefix has sum 0
    for num in nums:
        prefix_sum += num
        # If (prefix_sum - k) was seen before, there's a subarray summing to k
        count += prefix_counts[prefix_sum - k]
        prefix_counts[prefix_sum] += 1
    return count

print(subarray_sum([1, 1, 1], 2))  # 2 (subarrays [1,1] at positions 0-1 and 1-2)

# ---- Longest Consecutive Sequence - O(n) ----
def longest_consecutive(nums):
    num_set = set(nums)
    max_length = 0
    for num in num_set:
        if num - 1 not in num_set:  # num starts a sequence
            length = 1
            while num + length in num_set:
                length += 1
            max_length = max(max_length, length)
    return max_length

print(longest_consecutive([100, 4, 200, 1, 3, 2]))  # 4 (1, 2, 3, 4)
LRU Cache Implementation
The LRU (Least Recently Used) Cache is a classic interview problem that combines a hash map with a doubly linked list. In Python, OrderedDict provides both behaviors in a single structure:
python
from collections import OrderedDict

class LRUCache:
    """O(1) get and put using OrderedDict"""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.cache = OrderedDict()

    def get(self, key: int) -> int:
        if key not in self.cache:
            return -1
        self.cache.move_to_end(key)  # Mark as recently used
        return self.cache[key]

    def put(self, key: int, value: int) -> None:
        if key in self.cache:
            self.cache.move_to_end(key)
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # Remove least recently used

# Test
lru = LRUCache(3)
lru.put(1, 1)
lru.put(2, 2)
lru.put(3, 3)
print(lru.get(1))  # 1 (moves 1 to end)
lru.put(4, 4)      # Evicts key 2 (least recently used)
print(lru.get(2))  # -1 (evicted)
print(lru.get(3))  # 3
print(lru.get(4))  # 4
Key Takeaways
- Use Counter for frequency counting and defaultdict to avoid KeyError boilerplate.
- The two-sum pattern (store complement in hash map) converts O(n^2) to O(n) for many problems.
- Prefix sum + hash map solves subarray sum problems in O(n) time.
- LRU Cache combines a hash map (O(1) lookup) with a doubly linked list (O(1) eviction).
- Hash maps are the go-to tool for trading space for time — O(n) space for O(1) lookups.
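When OrderedDict is off the table (some interviewers disallow it), the hash map + doubly linked list combination can be built by hand. The sketch below is illustrative, and the class and helper names are my own, not part of any standard API: the dict gives O(1) lookup, and sentinel-bounded list nodes give O(1) move-to-front and eviction.

```python
class Node:
    """Doubly linked list node holding a cache entry."""
    def __init__(self, key=0, value=0):
        self.key, self.value = key, value
        self.prev = self.next = None

class DLLCache:
    """LRU cache sketch: dict for O(1) lookup, linked list for O(1) ordering."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.map = {}                      # key -> Node
        self.head = Node()                 # sentinel: most-recent side
        self.tail = Node()                 # sentinel: least-recent side
        self.head.next, self.tail.prev = self.tail, self.head

    def _remove(self, node):
        # Splice node out of the list in O(1)
        node.prev.next, node.next.prev = node.next, node.prev

    def _add_front(self, node):
        # Insert node right after the head sentinel (most recent position)
        node.prev, node.next = self.head, self.head.next
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        if key not in self.map:
            return -1
        node = self.map[key]
        self._remove(node)                 # splice out...
        self._add_front(node)              # ...and mark as most recently used
        return node.value

    def put(self, key, value):
        if key in self.map:
            self._remove(self.map[key])
        node = Node(key, value)
        self.map[key] = node
        self._add_front(node)
        if len(self.map) > self.capacity:
            lru = self.tail.prev           # least recently used node
            self._remove(lru)
            del self.map[lru.key]

cache = DLLCache(2)
cache.put(1, 1)
cache.put(2, 2)
print(cache.get(1))  # 1
cache.put(3, 3)      # evicts key 2
print(cache.get(2))  # -1 (evicted)
print(cache.get(3))  # 3
```

Storing the key inside each node is the subtle part: on eviction, the tail node must be deleted from the dict as well, and the key is the only way to find its dict entry.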