Data Structure

Hash
Table
Explored

A key-value store built from scratch in pure Python — no imports, no classes, just clean procedural logic with collision handling, full validation, and O(1) average operations.

O(1)avg. lookup
10+functions
5+test scenarios
0 importspure python

How does it work?

A hash table maps keys to values by running the key through a hash function to compute an index, then storing the pair in that bucket. Click a key to trace its path.

Live hash mapping — capacity = 4

Key → Value pairs

1:"Ahmad" → idx 1
2:"Osaid" → idx 2
6:"Mohammad" → idx 2 ⚡
"a":"First" → idx 0
"d":"Second" → idx 0 ⚡
hash_fun key % capacity
or
char sum % cap

Buckets (capacity = 4)

0
"a":"First"
"d":"Second"
1
1:"Ahmad"
2
2:"Osaid"
6:"Mohammad"
3
empty
Active key trace
Collision (same index)
Empty bucket
Collision handling — separate chaining

Keys 2 and 6 both hash to index 2 (since 6 % 4 = 2). Rather than overwriting, the implementation appends both into the same bucket as a list — this is called separate chaining. On lookup, it scans the short list until it finds the right key. Average case stays O(1).

Try it live

Simulate the hash table in your browser. Insert, get, delete, and update key-value pairs and watch the buckets update in real time.

Operations

Insert
Get / Check
Update
Delete / Clear
00:00Hash table initialized (capacity=5)

Live Buckets capacity = 5

SIZE
0
EMPTY
yes
KEYS

Full implementation

Every function from the Python file — annotated and organized by operation type.

Python 3 · procedural
# Initialize an empty hash table
def create_table(capacity: int):
    """
    Creates and returns a new hash table dictionary.
    
    Args:
        capacity (int): Number of buckets. Must be a positive integer.
    Returns:
        dict: {"Capacity": int, "Size": int, "Buckets": list[list]}
    Raises:
        TypeError:  capacity is bool or non-integer
        ValueError: capacity ≤ 0
    """
    if isinstance(capacity, bool):
        raise TypeError("Capacity must be an integer, not a boolean :)")
    if not isinstance(capacity, int):
        raise TypeError("Capacity must be an integer number :)")
    if capacity < 0:
        raise ValueError("Capacity must be a positive number :)")
    if capacity == 0:
        raise ValueError("Capacity must be a number with a value :)")

    return {
        "Capacity" : capacity,
        "Size"     : 0,
        "Buckets"  : [[] for c in range(capacity)]
    }


# Validate the internal table structure
def validate_table(table: dict):
    """
    Ensures the table is a properly structured dictionary.
    
    Raises:
        TypeError:  if not a dict
        ValueError: if missing required keys
    """
    if not isinstance(table, dict):
        raise TypeError("Invalid table type. It must be a dictionary :)")

    if "Capacity" not in table or "Size" not in table or "Buckets" not in table:
        raise ValueError("Malformed table")
# Hash function — maps a key to a bucket index
def hash_fun(key, capacity: int):
    """
    Computes bucket index for a given key.
    
    For integers : index = key % capacity
    For strings  : sum of (char position in alphabet) % capacity
                   — case-insensitive, ignores non-alpha chars
    
    Args:
        key      : int or str
        capacity : int (table capacity)
    Returns:
        int: bucket index in range [0, capacity)
    Raises:
        TypeError: key is not int or str
    """
    if not isinstance(key, (int, str)):
        raise TypeError("Key must be an integer, char or string :)")

    if isinstance(key, int):
        index = key % capacity     # simple modulo for integers

    if isinstance(key, str):
        total = 0
        for char in key:
            if 65 <= ord(char) <= 90:   # A–Z
                total += (ord(char) - 65)
            elif 97 <= ord(char) <= 122: # a–z
                total += (ord(char) - 97)
        index = total % capacity

    return index
def insert(table: dict, key, value):
    """Insert key-value pair. Raises KeyError on duplicate."""
    validate_table(table)
    if check(table, key):
        raise KeyError("Key already exists in the table :)")

    index = hash_fun(key, table["Capacity"])
    bucket = table["Buckets"][index]
    bucket.append({key: value})   # append to the bucket (chaining)
    table["Size"] += 1


def get(table: dict, key):
    """Return value for key. Raises KeyError if not found."""
    validate_table(table)
    index = hash_fun(key, table["Capacity"])
    bucket = table["Buckets"][index]

    for i in bucket:
        if key in i:
            return i[key]

    raise KeyError("Key not found in the table :)")


def check(table: dict, key) -> bool:
    """Return True if key exists, False otherwise."""
    validate_table(table)
    index = hash_fun(key, table["Capacity"])
    bucket = table["Buckets"][index]
    for i in bucket:
        if key in i: return True
    return False


def delete(table: dict, key):
    """Remove key-value pair. Raises ValueError on empty, KeyError if missing."""
    validate_table(table)
    if is_empty(table):
        raise ValueError("The table is empty, nothing to delete :)")
    if not check(table, key):
        raise KeyError("Key not found in the table :)")

    index = hash_fun(key, table["Capacity"])
    bucket = table["Buckets"][index]

    for i in range(len(bucket)):
        if key in bucket[i]:
            del bucket[i]
            table["Size"] -= 1
            return "Key removed successfully :)"
def update(table: dict, key, value):
    """Update value for existing key. Raises KeyError if not found."""
    validate_table(table)
    index = hash_fun(key, table["Capacity"])
    bucket = table["Buckets"][index]

    if check(table, key) == False:
        raise KeyError("Key not found in the table :)")

    for i in bucket:
        if key in i:
            i[key] = value
            return "Key updated successfully :)"


def keys(table: dict):
    """Return a list of all keys in the table."""
    validate_table(table)
    all_keys = []
    for bucket in table["Buckets"]:
        for pair in bucket:
            all_keys.extend(pair.keys())
    return all_keys


def size(table: dict) -> int:
    """Return number of key-value pairs stored."""
    validate_table(table)
    return table["Size"]


def is_empty(table: dict) -> bool:
    """Return True if table has no entries."""
    validate_table(table)
    return table["Size"] == 0


def clear(table: dict):
    """Remove all entries, reset size, keep capacity."""
    validate_table(table)
    table["Buckets"] = [[] for c in range(table["Capacity"])]
    table["Size"] = 0
if __name__ == "__main__":

    # ── Test 1: Normal usage ─────────────────────────────
    table = create_table(4)
    insert(table, 1, "Ahmad")
    insert(table, 2, "Osaid")
    insert(table, 6, "Mohammad")
    assert get(table, 1) == "Ahmad"
    assert size(table) == 3

    # ── Test 2: Edge cases (empty table) ─────────────────
    empty = create_table(2)
    assert is_empty(empty) == True
    assert size(empty) == 0

    # ── Test 3: Error validation ──────────────────────────
    try:
        insert(table, 2, "Duplicate")
    except KeyError as e:
        print(e)   # Key already exists

    try:
        create_table(True)
    except TypeError as e:
        print(e)   # bool not allowed

    # ── Test 4: Collision demonstration ──────────────────
    t2 = create_table(3)
    insert(t2, "a", "First")
    insert(t2, "d", "Second")   # 'a'=0,'d'=3 → 3%3=0 → collision!
    assert get(t2, "a") == "First"
    assert get(t2, "d") == "Second"

    # ── Test 5: Update / keys / clear ────────────────────
    update(table, 2, "New Value")
    assert get(table, 2) == "New Value"
    clear(table)
    assert is_empty(table) == True

Time complexity

All core operations run in O(1) on average. Worst case degrades to O(n) only when all keys collide into the same bucket.

create_table()
O(n)
Allocates n empty bucket lists upfront
hash_fun()
O(k)
k = string key length. O(1) for integers
insert()
O(1) avg
Appends to bucket. O(n) worst case (all collide)
get()
O(1) avg
Direct bucket access + short scan
delete()
O(1) avg
Same as get — hash then scan bucket
check()
O(1) avg
O(n) worst when all keys share a bucket
update()
O(1) avg
Find bucket then overwrite value in-place
size() / is_empty()
O(1)
Reads stored counter — no traversal needed
keys()
O(n)
Must scan all buckets to collect every key
clear()
O(n)
Rebuilds n empty bucket lists

Function signatures

Every function, its parameters, and what it returns or raises.

create_table(capacity)
Returns a fresh hash table dict with the given number of buckets. Raises TypeError for booleans / non-integers and ValueError for ≤ 0.
O(n)
hash_fun(key, capacity)
Computes bucket index. Integers use modulo; strings sum alpha positions then modulo. Returns an int in [0, capacity).
O(k)
insert(table, key, value)
Appends {key: value} to the correct bucket. Raises KeyError on duplicate key.
O(1) avg
get(table, key)
Returns the value for the key. Raises KeyError if not found.
O(1) avg
check(table, key)
Returns True if key exists, False otherwise. Never raises.
O(1) avg
delete(table, key)
Removes the pair and decrements Size. Raises ValueError on empty table, KeyError if missing.
O(1) avg
update(table, key, value)
Overwrites the value for an existing key in-place. Raises KeyError if the key doesn't exist.
O(1) avg
keys(table)
Returns a flat list of every key by scanning all buckets.
O(n)
clear(table)
Resets all buckets to empty and Size to 0. Capacity is preserved.
O(n)

Where hash tables are used

Hash tables power some of the most critical systems in computer science — anywhere you need fast key-based lookup.

🔐
Authentication tokens
Session tokens mapped to user IDs — O(1) lookup on every HTTP request avoids scanning millions of users.
🌐
DNS resolution
Domain names hashed to IP addresses, enabling sub-millisecond resolution at internet scale.
🔤
Language runtimes
Python's dict and JavaScript objects are hash tables internally — every property access is a hash lookup.
Caching (memoization)
Store computed results keyed by input. Repeated calls skip recalculation entirely.
🗂️
Database indexing
Hash indexes on primary keys let databases retrieve rows in constant time without scanning the full table.
🔎
Duplicate detection
Compiler symbol tables, spell-checkers, and deduplication pipelines all rely on O(1) membership testing.