Hash Table — Data Structure

Concept

How does it work?

A hash table maps keys to values by running the key through a hash function to compute an index, then storing the pair in that bucket. Click a key to trace its path.

Live hash mapping — capacity = 4

Key → Value pairs

1:"Ahmad" → idx 1

2:"Osaid" → idx 2

6:"Mohammad" → idx 2 ⚡

"a":"First" → idx 0

"d":"Second" → idx 0 ⚡

hash_fun key % capacity
or
char sum % cap

Buckets (capacity = 4)

"a":"First"

"d":"Second"

1:"Ahmad"

2:"Osaid"

6:"Mohammad"

empty

Active key trace

Collision (same index)

Empty bucket

Collision handling — separate chaining

Keys 2 and 6 both hash to index 2 (since 6 % 4 = 2). Rather than overwriting, the implementation appends both into the same bucket as a list — this is called separate chaining. On lookup, it scans the short list until it finds the right key. Average case stays O(1).

Interactive

Try it live

Simulate the hash table in your browser. Insert, get, delete, and update key-value pairs and watch the buckets update in real time.

Operations

Insert

Get / Check

Update

Delete / Clear

00:00Hash table initialized (capacity=5)

Live Buckets capacity = 5

SIZE

EMPTY

yes

KEYS

—

Source

Full implementation

Every function from the Python file — annotated and organized by operation type.

Python 3 · procedural

# Initialize an empty hash table
def create_table(capacity: int):
    """
    Creates and returns a new hash table dictionary.
    
    Args:
        capacity (int): Number of buckets. Must be a positive integer.
    Returns:
        dict: {"Capacity": int, "Size": int, "Buckets": list[list]}
    Raises:
        TypeError:  capacity is bool or non-integer
        ValueError: capacity ≤ 0
    """
    if isinstance(capacity, bool):
        raise TypeError("Capacity must be an integer, not a boolean :)")
    if not isinstance(capacity, int):
        raise TypeError("Capacity must be an integer number :)")
    if capacity < 0:
        raise ValueError("Capacity must be a positive number :)")
    if capacity == 0:
        raise ValueError("Capacity must be a number with a value :)")

    return {
        "Capacity" : capacity,
        "Size"     : 0,
        "Buckets"  : [[] for c in range(capacity)]
    }


# Validate the internal table structure
def validate_table(table: dict):
    """
    Ensures the table is a properly structured dictionary.
    
    Raises:
        TypeError:  if not a dict
        ValueError: if missing required keys
    """
    if not isinstance(table, dict):
        raise TypeError("Invalid table type. It must be a dictionary :)")

    if "Capacity" not in table or "Size" not in table or "Buckets" not in table:
        raise ValueError("Malformed table")
      

# Hash function — maps a key to a bucket index
def hash_fun(key, capacity: int):
    """
    Computes bucket index for a given key.
    
    For integers : index = key % capacity
    For strings  : sum of (char position in alphabet) % capacity
                   — case-insensitive, ignores non-alpha chars
    
    Args:
        key      : int or str
        capacity : int (table capacity)
    Returns:
        int: bucket index in range [0, capacity)
    Raises:
        TypeError: key is not int or str
    """
    if not isinstance(key, (int, str)):
        raise TypeError("Key must be an integer, char or string :)")

    if isinstance(key, int):
        index = key % capacity     # simple modulo for integers

    if isinstance(key, str):
        total = 0
        for char in key:
            if 65 <= ord(char) <= 90:   # A–Z
                total += (ord(char) - 65)
            elif 97 <= ord(char) <= 122: # a–z
                total += (ord(char) - 97)
        index = total % capacity

    return index
      

def insert(table: dict, key, value):
    """Insert key-value pair. Raises KeyError on duplicate."""
    validate_table(table)
    if check(table, key):
        raise KeyError("Key already exists in the table :)")

    index = hash_fun(key, table["Capacity"])
    bucket = table["Buckets"][index]
    bucket.append({key: value})   # append to the bucket (chaining)
    table["Size"] += 1


def get(table: dict, key):
    """Return value for key. Raises KeyError if not found."""
    validate_table(table)
    index = hash_fun(key, table["Capacity"])
    bucket = table["Buckets"][index]

    for i in bucket:
        if key in i:
            return i[key]

    raise KeyError("Key not found in the table :)")


def check(table: dict, key) -> bool:
    """Return True if key exists, False otherwise."""
    validate_table(table)
    index = hash_fun(key, table["Capacity"])
    bucket = table["Buckets"][index]
    for i in bucket:
        if key in i: return True
    return False


def delete(table: dict, key):
    """Remove key-value pair. Raises ValueError on empty, KeyError if missing."""
    validate_table(table)
    if is_empty(table):
        raise ValueError("The table is empty, nothing to delete :)")
    if not check(table, key):
        raise KeyError("Key not found in the table :)")

    index = hash_fun(key, table["Capacity"])
    bucket = table["Buckets"][index]

    for i in range(len(bucket)):
        if key in bucket[i]:
            del bucket[i]
            table["Size"] -= 1
            return "Key removed successfully :)"
      

def update(table: dict, key, value):
    """Update value for existing key. Raises KeyError if not found."""
    validate_table(table)
    index = hash_fun(key, table["Capacity"])
    bucket = table["Buckets"][index]

    if check(table, key) == False:
        raise KeyError("Key not found in the table :)")

    for i in bucket:
        if key in i:
            i[key] = value
            return "Key updated successfully :)"


def keys(table: dict):
    """Return a list of all keys in the table."""
    validate_table(table)
    all_keys = []
    for bucket in table["Buckets"]:
        for pair in bucket:
            all_keys.extend(pair.keys())
    return all_keys


def size(table: dict) -> int:
    """Return number of key-value pairs stored."""
    validate_table(table)
    return table["Size"]


def is_empty(table: dict) -> bool:
    """Return True if table has no entries."""
    validate_table(table)
    return table["Size"] == 0


def clear(table: dict):
    """Remove all entries, reset size, keep capacity."""
    validate_table(table)
    table["Buckets"] = [[] for c in range(table["Capacity"])]
    table["Size"] = 0
      

if __name__ == "__main__":

    # ── Test 1: Normal usage ─────────────────────────────
    table = create_table(4)
    insert(table, 1, "Ahmad")
    insert(table, 2, "Osaid")
    insert(table, 6, "Mohammad")
    assert get(table, 1) == "Ahmad"
    assert size(table) == 3

    # ── Test 2: Edge cases (empty table) ─────────────────
    empty = create_table(2)
    assert is_empty(empty) == True
    assert size(empty) == 0

    # ── Test 3: Error validation ──────────────────────────
    try:
        insert(table, 2, "Duplicate")
    except KeyError as e:
        print(e)   # Key already exists

    try:
        create_table(True)
    except TypeError as e:
        print(e)   # bool not allowed

    # ── Test 4: Collision demonstration ──────────────────
    t2 = create_table(3)
    insert(t2, "a", "First")
    insert(t2, "d", "Second")   # 'a'=0,'d'=3 → 3%3=0 → collision!
    assert get(t2, "a") == "First"
    assert get(t2, "d") == "Second"

    # ── Test 5: Update / keys / clear ────────────────────
    update(table, 2, "New Value")
    assert get(table, 2) == "New Value"
    clear(table)
    assert is_empty(table) == True
      

Performance

Time complexity

All core operations run in O(1) on average. Worst case degrades to O(n) only when all keys collide into the same bucket.

create_table()

O(n)

Allocates n empty bucket lists upfront

hash_fun()

O(k)

k = string key length. O(1) for integers

insert()

O(1) avg

Appends to bucket. O(n) worst case (all collide)

get()

O(1) avg

Direct bucket access + short scan

delete()

O(1) avg

Same as get — hash then scan bucket

check()

O(1) avg

O(n) worst when all keys share a bucket

update()

O(1) avg

Find bucket then overwrite value in-place

size() / is_empty()

O(1)

Reads stored counter — no traversal needed

keys()

O(n)

Must scan all buckets to collect every key

clear()

O(n)

Rebuilds n empty bucket lists

Reference

Function signatures

Every function, its parameters, and what it returns or raises.

create_table(capacity)

Returns a fresh hash table dict with the given number of buckets. Raises TypeError for booleans / non-integers and ValueError for ≤ 0.

O(n)

hash_fun(key, capacity)

Computes bucket index. Integers use modulo; strings sum alpha positions then modulo. Returns an int in [0, capacity).

O(k)

insert(table, key, value)

Appends {key: value} to the correct bucket. Raises KeyError on duplicate key.

O(1) avg

get(table, key)

Returns the value for the key. Raises KeyError if not found.

O(1) avg

check(table, key)

Returns True if key exists, False otherwise. Never raises.

O(1) avg

delete(table, key)

Removes the pair and decrements Size. Raises ValueError on empty table, KeyError if missing.

O(1) avg

update(table, key, value)

Overwrites the value for an existing key in-place. Raises KeyError if the key doesn't exist.

O(1) avg

keys(table)

Returns a flat list of every key by scanning all buckets.

O(n)

clear(table)

Resets all buckets to empty and Size to 0. Capacity is preserved.

O(n)

Real world

Where hash tables are used

Hash tables power some of the most critical systems in computer science — anywhere you need fast key-based lookup.

🔐

Authentication tokens

Session tokens mapped to user IDs — O(1) lookup on every HTTP request avoids scanning millions of users.

🌐

DNS resolution

Domain names hashed to IP addresses, enabling sub-millisecond resolution at internet scale.

🔤

Language runtimes

Python's dict and JavaScript objects are hash tables internally — every property access is a hash lookup.

⚡

Caching (memoization)

Store computed results keyed by input. Repeated calls skip recalculation entirely.

🗂️

Database indexing

Hash indexes on primary keys let databases retrieve rows in constant time without scanning the full table.

🔎

Duplicate detection

Compiler symbol tables, spell-checkers, and deduplication pipelines all rely on O(1) membership testing.

HashTableExplored

How does it work?

Key → Value pairs

Buckets (capacity = 4)

Try it live

Operations

Live Buckets capacity = 5

Full implementation

Time complexity

Function signatures

Where hash tables are used

Hash
Table
Explored