
Searching and sorting algorithms in Python

Interviewers may ask candidates to implement sorting algorithms like quicksort, mergesort, or heapsort, or search algorithms like binary search.

Sorting

  • quick sort
  • merge sort
  • bubble sort
  • insertion sort
  • selection sort
  • heap sort

Quick sort

Quick sort is a recursive, divide-and-conquer algorithm.

  1. Choose a pivot.
  2. Partition: all values in the array smaller than the pivot go on the left; all values larger than the pivot go on the right.
  3. Apply the previous two steps recursively to each of the smaller sub-arrays.
  4. The base case of the recursion is a sub-array with 1 or 0 elements.

Quicksort has a worst-case time complexity of O(n^2).

Quicksort has an average time complexity of O(n log n).

A simple implementation of quicksort in Python:


def quicksort(arr):
    if len(arr) <= 1:
        return arr

    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]

    return quicksort(left) + middle + quicksort(right)

# Example usage:
my_array = [3, 6, 8, 10, 1, 2, 1]
sorted_array = quicksort(my_array)
print(sorted_array)


Merge sort

Merge sort is an efficient, general-purpose, comparison-based sorting algorithm that uses a divide and conquer strategy.

It is recursive: divide the input array into two halves, call the same algorithm on each half, and then merge the two sorted halves back together. A merge() helper function combines two already-sorted arrays into a single new sorted array.

In the merge step of the merge sort algorithm:

  1. Compare the front (smallest remaining) element of both arrays.
  2. Copy the smaller of the two into the new array, and advance past it in its source array.
  3. Repeat steps 1-2 until one of the arrays is exhausted.
  4. Copy any remaining elements of the other array into the new array.
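The split-and-merge steps above can be sketched as a simple (not in-place) implementation:

```python
def merge_sort(arr):
    # Base case: a list of 0 or 1 elements is already sorted
    if len(arr) <= 1:
        return arr

    # Divide: split the list in half and sort each half recursively
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])

    # Conquer: merge the two sorted halves
    return merge(left, right)

def merge(left, right):
    merged = []
    i = j = 0

    # Repeatedly take the smaller front element of the two lists
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1

    # One list is exhausted; append the remainder of the other
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

# Example usage:
print(merge_sort([3, 6, 8, 10, 1, 2, 1]))  # [1, 1, 2, 3, 6, 8, 10]
```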

Bubble sort

The bubble sort algorithm repeatedly swaps adjacent elements that are in the wrong order.

It is called "bubble" sort because, on each pass, the biggest (or smallest) remaining value "bubbles" up to its final position at the end of the list.

The time complexity of the bubble sort algorithm is O(n^2), where "n" is the number of elements in the array. This is because the algorithm has two nested loops. In the worst-case scenario, it needs to compare and swap elements for each pair in the array, leading to a quadratic time complexity.

The outer loop runs for 'n' iterations, and for each iteration of the outer loop, the inner loop runs for 'n' iterations in the worst case. This results in a total of n * n = n^2 comparisons and swaps.

While bubble sort is simple to understand and implement, its efficiency decreases rapidly with larger datasets, making it less practical for sorting large arrays compared to more efficient algorithms like quicksort or mergesort, which have better average and worst-case time complexities.

Imagine you have a bag of toy blocks, and you want to organize them from the smallest to the biggest. Bubble sort is like comparing and swapping blocks until they are all in order.

Now, if you have lots of blocks, say 10 blocks, you might have to compare each block with every other block. This means you'll have to do 10 times 10 comparisons (10 * 10 = 100) in total.

So, the "Big O notation" for bubble sort is like saying, "If you have 'n' blocks, it might take 'n' times 'n' (n^2) comparisons." It's a way of talking about how many comparisons and swaps you might need to do based on how many blocks you have.

In simple terms, for bubble sort, the Big O notation is O(n^2), where 'n' is the number of blocks. It means the more blocks you have, the more work you might need to do to organize them using bubble sort.


######
## My implementation
######

def my_bubble_sort(a_list):
    """
    Implementation of the bubble sort algorithm
    """

    list_changed = True

    while list_changed:
        list_changed = False

        for i in range(len(a_list) - 1):
            if a_list[i] > a_list[i + 1]:
                # swap if left is bigger than right
                a_list[i + 1], a_list[i] = a_list[i], a_list[i + 1]
                list_changed = True

    return a_list


######
## typical implementation
######

def bubble_sort(a_list):
    """
    Implementation of the bubble sort algorithm
    """

    n = len(a_list)

    # Traverse through all array elements
    for i in range(n):

        # Last i elements are already in place, so we don't need to check them
        for j in range(0, n-i-1):

            # Swap if the element found is greater than the next element
            if a_list[j] > a_list[j+1]:
                a_list[j], a_list[j+1] = a_list[j+1], a_list[j]

    return a_list

Insertion sort

Insertion sort is a simple sorting algorithm.

It maintains a "sorted" subset at the front of the array, which grows by one element with each iteration until the entire array is sorted. Each iteration takes the next unsorted element and inserts it into its correct position within the sorted subset.

An outline of the insertion sort algorithm is as follows

  1. Start with the second element of the array (index 1), assuming the first element is already sorted.
  2. Compare the current element with the previous elements in the sorted part of the array.
  3. Move the current element to its correct position by shifting the larger elements to the right.
  4. Repeat steps 2-3 until the entire array is sorted.

## My implementation of insertion sort algorithm

def my_insertion_sort(arr: list):
    """
    my implementation of the insertion sort algorithm
    """

    for i in range(1, len(arr)):
        # walk the current element backwards, swapping until it
        # reaches its correct position in the sorted prefix
        pos = i
        while pos > 0 and arr[pos] < arr[pos - 1]:
            arr[pos], arr[pos - 1] = arr[pos - 1], arr[pos]
            pos -= 1

    return arr

## ChatGPT implementation of insertion sort algorithm

def insertion_sort(arr):
    # Traverse through 1 to len(arr)
    for i in range(1, len(arr)):
        key = arr[i]

        # Move elements of arr[0..i-1] that are greater than key
        # to one position ahead of their current position
        j = i - 1
        while j >= 0 and key < arr[j]:
            arr[j + 1] = arr[j]
            j -= 1

        # Place the key at its correct position
        arr[j + 1] = key

    return arr

Selection sort

Selection sort is a simple, in-place comparison sorting algorithm.

At the start, the entire array is considered unsorted.

It works by moving the smallest element in the unsorted portion of the array to the first position in the unsorted portion of the array.

The element that was just moved to the first position in the unsorted portion is then considered sorted, shrinking the unsorted portion by one.

The smallest element in the unsorted portion of the array is found as follows:

  • Start with the first element of the unsorted portion and keep track of its index.
  • Compare this element with the remaining elements, iterating through the unsorted portion.
  • If a smaller value is found, keep track of its index as the new smallest.


def my_selection_sort(a_list):
    """
    Implementation of the selection sort algorithm
    """

    for i in range(len(a_list)):
        # find minimum value in "unsorted" part of list
        smallest = min(a_list[i:])
        # find the index of the minimum element, searching only the
        # "unsorted" part (searching from 0 breaks on duplicate values)
        to_swap = a_list.index(smallest, i)
        # swap minimum with first element in "unsorted" part of list
        a_list[i], a_list[to_swap] = a_list[to_swap], a_list[i]

    return a_list


###########################
# ChatGPT implementation of selection sort in python
###########################

def selection_sort(arr):
    n = len(arr)

    for i in range(n - 1):
        # Find the minimum element in the unsorted part
        min_index = i
        for j in range(i + 1, n):
            if arr[j] < arr[min_index]:
                min_index = j

        # Swap the found minimum element with the first element
        arr[i], arr[min_index] = arr[min_index], arr[i]

    return arr

Heap sort
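A minimal heap sort sketch using Python's standard-library heapq module (a textbook heapsort would instead build a max-heap and sift elements down in place, but the complexity is the same O(n log n)):

```python
import heapq

def heap_sort(arr):
    # Build a min-heap in O(n), then pop the smallest element
    # n times (O(log n) each) to produce sorted output
    heap = list(arr)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]

# Example usage:
print(heap_sort([5, 3, 8, 1]))  # [1, 3, 5, 8]
```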

Radix sort

Searching

  • linear search
  • binary search
  • jump search
  • interpolation search

Linear search

Iteratively check each item in the array / list until the target is found. (This is sometimes known as brute force search.)

This approach has an O(n) (linear) time complexity: the search time grows in direct proportion to the number of elements that need to be processed.

Linear search is slow for large datasets, but the data does not need to be sorted for it to apply.
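A minimal linear search sketch:

```python
def linear_search(a_list, target):
    # Check each element in turn; O(n) in the worst case
    for i, value in enumerate(a_list):
        if value == target:
            return i  # index of the first match
    return -1  # target not present

# Example usage:
print(linear_search([7, 3, 9, 1], 9))  # 2
```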

Binary search

Continuously divide the list in half until the value is found or the list becomes empty.

The time complexity of the binary search algorithm is O(log n), where "n" is the number of elements in the array. This is because the algorithm divides the list in half at each iteration, leading to a logarithmic time complexity.

The binary search algorithm is efficient when the list is already sorted.
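The halving strategy above can be sketched as an iterative implementation (assumes the input list is sorted):

```python
def binary_search(a_list, target):
    low, high = 0, len(a_list) - 1

    while low <= high:
        mid = (low + high) // 2
        if a_list[mid] == target:
            return mid
        elif a_list[mid] < target:
            low = mid + 1   # target must be in the right half
        else:
            high = mid - 1  # target must be in the left half

    return -1  # target not in the list

# Example usage:
print(binary_search([1, 3, 5, 6, 9], 5))  # 2
```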

Binary Search - problem 1

This problem is also on LeetCode.

Given a rotated sorted array of unique elements, write a function to search for a specific target element.


def search_rotated_array(nums, target):
    # Implement binary search logic here
    pass

# Example usage:
rotated_array = [4, 5, 6, 7, 0, 1, 2]
target_number = 0
result = search_rotated_array(rotated_array, target_number)
print(f"The target number {target_number} is at index {result}.")
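One possible solution sketch (a standard modified binary search, assuming unique elements): at every step at least one half of the remaining range is sorted, so check whether the target lies inside that sorted half and discard the other half.

```python
def search_rotated_array(nums, target):
    low, high = 0, len(nums) - 1

    while low <= high:
        mid = (low + high) // 2
        if nums[mid] == target:
            return mid

        if nums[low] <= nums[mid]:  # left half is sorted
            if nums[low] <= target < nums[mid]:
                high = mid - 1  # target lies in the sorted left half
            else:
                low = mid + 1
        else:  # right half is sorted
            if nums[mid] < target <= nums[high]:
                low = mid + 1   # target lies in the sorted right half
            else:
                high = mid - 1

    return -1

# Example usage:
print(search_rotated_array([4, 5, 6, 7, 0, 1, 2], 0))  # 4
```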


Binary Search - problem 2 :: First Bad Version

You are given a list of versions, and you need to find the first one that is bad. Implement a function to find the first bad version using binary search.


def first_bad_version(n):
    # Implement binary search logic here
    pass

# Example usage:
total_versions = 10
first_bad = 4
result = first_bad_version(total_versions)
print(f"The first bad version is at index {result}.")
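One possible solution sketch. The original stub assumes some external is-bad-version API; here a hypothetical `is_bad` checker is passed in as a parameter so the example is self-contained. The search keeps the invariant that the first bad version is always inside [low, high].

```python
def first_bad_version(n, is_bad):
    # is_bad(version) is a hypothetical helper returning True from
    # the first bad version onwards; versions are numbered 1..n
    low, high = 1, n

    while low < high:
        mid = (low + high) // 2
        if is_bad(mid):
            high = mid      # mid could be the first bad version
        else:
            low = mid + 1   # the first bad version is after mid

    return low

# Example usage: versions 4..10 are bad
print(first_bad_version(10, lambda v: v >= 4))  # 4
```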


Binary Search - problem 3 :: Square root

Write a function to compute the square root of a non-negative integer using binary search.


def sqrt_binary_search(x):
    # Implement binary search logic here
    pass

# Example usage:
number_to_find_sqrt = 16
result_sqrt = sqrt_binary_search(number_to_find_sqrt)
print(f"The square root of {number_to_find_sqrt} is approximately {result_sqrt}.")


### MY SOLUTION

def sqrt_binary_search(x):
    # Implement binary search logic here

    # create initial upper and lower bounds
    lower_bound = 0
    upper_bound = x

    for i in range(1000):
        # make a guess in the midpoint of the range
        guess = (lower_bound + upper_bound ) / 2
        guess_result = guess ** 2

        # the guess is actually the square root
        if guess_result == x:
            return guess
        # the guess produces a result lower than the objective
        elif guess_result < x:
            # set the lower bound to the guess, since we know the number we need must be higher
            lower_bound = guess
        # the guess produces a result higher than the objective
        elif guess_result > x:
            # set the higher bound to the guess since we know the number we need must be lower
            upper_bound = guess

    # return the best guess after 1000 iterations of the binary search
    return guess


### Suggested Solution

def sqrt_binary_search_gpt(x):
    """
    ChatGPT inspired square root binary search function
    """
    if x < 0:
        raise ValueError("Cannot compute the square root of a negative number.")
    if x == 0 or x == 1:
        return x

    low, high = 0, x
    result = 0

    while low <= high:
        mid = (low + high) // 2
        mid_squared = mid * mid

        if mid_squared == x:
            return mid  # Perfect square found
        elif mid_squared < x:
            low = mid + 1
            result = mid  # Update result for the potential answer
        else:
            high = mid - 1

    return result

Binary Search - problem 4 :: Search Insert Position

Given a sorted array and a target value, return the index if the target is found. If not, return the index where it would be if it were inserted in order.


def search_insert(nums, target):
    # Implement binary search logic here
    pass

# Example usage:
sorted_array = [1, 3, 5, 6]
target_number = 5
result = search_insert(sorted_array, target_number)
print(f"The target number {target_number} should be inserted at index {result}.")
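One possible sketch using the "lower bound" variant of binary search, which naturally returns the insertion point when the target is absent:

```python
def search_insert(nums, target):
    low, high = 0, len(nums)

    while low < high:
        mid = (low + high) // 2
        if nums[mid] < target:
            low = mid + 1
        else:
            high = mid

    # low is the index of the target if present,
    # otherwise the index where it would be inserted
    return low

# Example usage:
print(search_insert([1, 3, 5, 6], 5))  # 2
```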


Binary Search - problem 5 :: Peak Element

Given an array of integers, find a peak element. A peak element is an element that is greater than or equal to its neighbors.


def find_peak_element(nums):
    # Implement binary search logic here
    pass

# Example usage:
array_with_peak = [1, 2, 3, 1]
result_peak = find_peak_element(array_with_peak)
print(f"A peak element is at index {result_peak}.")
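One possible sketch: binary search towards the rising side of the midpoint. Moving "uphill" always leads to some peak, so this finds one in O(log n) time:

```python
def find_peak_element(nums):
    low, high = 0, len(nums) - 1

    while low < high:
        mid = (low + high) // 2
        if nums[mid] < nums[mid + 1]:
            low = mid + 1  # a peak lies to the right of mid
        else:
            high = mid     # mid itself could be a peak

    return low

# Example usage:
print(find_peak_element([1, 2, 3, 1]))  # 2
```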


Jump search

Jump search relies on a sorted array.

Jump search is a modification of linear search.

In jump search you start at the beginning, then jump ahead in the array a fixed number of positions until you find a value that is greater than or equal to the target. Then you check every item in the previous block.

From ChatGPT

It is a block-based search algorithm that jumps ahead by fixed steps and then performs a linear search in the block where the target element is likely to be found. The key idea is to reduce the number of elements to be searched linearly by jumping over a fixed number of elements in each iteration.

Jump search has a time complexity of O(sqrt(n)).

The jump size is normally the square root of the length of the array or list.

Here's a step-by-step explanation of how the jump search algorithm works:

  1. Given a sorted array: Start with a sorted array of elements.
  2. Define block size: Decide on a block size (often denoted as step or jump). The block size determines how many elements to jump ahead in each iteration.
  3. Jump ahead: Start at the beginning of the array and jump ahead by the block size until you find a value greater than or equal to the target element.
  4. Linear search in the block: Perform a linear search within the block from the previous step. This can be done by iterating through the elements within the block until the target element is found or a value greater than the target element is encountered.
  5. Repeat or stop: If the target element is found, return its index. If the end of the array is reached and the target element is not found, then the element is not present in the array.

https://colab.research.google.com/drive/10NcVv-iVN_J0N34_OD5DWn6EnU_cyWzi


### MY VERSION

import math

def marquins_jump_search(a_list, target):
    """
    a jump search algorithm in python.

    returns the index of the target in the array
    returns -1 if the target is not in the array
    """

    n = len(a_list)
    if n == 0:
        return -1

    jump_size = math.floor(math.sqrt(n))  # ensure the jump size is a whole number

    i = 0
    while True:
        # if the length of the array does not have a whole number square
        # root, i * jump_size may overshoot the end of the array / list,
        # so clamp the probe to the last valid index
        my_key = min(i * jump_size, n - 1)

        if a_list[my_key] >= target or my_key == n - 1:
            # linear search through the previous block; clamp min_key
            # at 0 so we never index backwards from the end of the list
            min_key = max((i - 1) * jump_size, 0)
            for j in range(min_key, my_key + 1):
                if a_list[j] == target:
                    return j
            return -1

        i += 1


### ALTERNATIVE

def jump_search(arr, target):
    n = len(arr)
    step = int(math.sqrt(n))  # Calculate the jump size

    # Find the block where the target might be
    prev, curr = 0, step
    while curr < n and arr[curr] < target:
        prev = curr
        curr += step

    # Perform linear search within the block (inclusive of arr[curr],
    # which may itself be the target)
    for i in range(prev, min(curr + 1, n)):
        if arr[i] == target:
            return i  # Element found, return its index

    return -1  # Element not found in the array

Interpolation search

Interpolation search is a modified (improved) version of binary search.

Interpolation search requires a sorted list / array as input, and works best when the values are roughly uniformly distributed.

Interpolation search uses a formula to estimate the position of the item being searched for, instead of always probing the middle of the remaining range.
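A sketch of the position-estimate formula in code: the probe index is interpolated from where the target's value sits between the values at the two ends of the current range.

```python
def interpolation_search(arr, target):
    low, high = 0, len(arr) - 1

    # Stop as soon as the target falls outside the remaining value range
    while low <= high and arr[low] <= target <= arr[high]:
        if arr[low] == arr[high]:
            return low if arr[low] == target else -1

        # Estimate the target's position from the value range,
        # instead of always probing the midpoint
        pos = low + (target - arr[low]) * (high - low) // (arr[high] - arr[low])

        if arr[pos] == target:
            return pos
        elif arr[pos] < target:
            low = pos + 1
        else:
            high = pos - 1

    return -1

# Example usage (uniformly distributed values):
print(interpolation_search([10, 20, 30, 40, 50], 30))  # 2
```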