{ "cells": [ { "cell_type": "markdown", "id": "de4b2462-3d05-423b-bfb0-f487a5d349a1", "metadata": {}, "source": [ "###
San Jose State University
Department of Applied Data Science

**DATA 200
Computational Programming for Data Analytics**

Spring 2024
Instructor: Ron Mak
" ] }, { "cell_type": "markdown", "id": "205c1575-47fb-41c6-b1ac-eb493c657599", "metadata": {}, "source": [ "# Partition a List" ] }, { "cell_type": "markdown", "id": "a5de3b2c-73ec-4177-bbbf-7021f3964fb1", "metadata": {}, "source": [ "#### Demonstrate partitioning a list with a trace." ] }, { "cell_type": "markdown", "id": "958068cb-59fc-4a65-99bc-1a3f9cfbb6cd", "metadata": {}, "source": [ "#### To ***partition*** an unsorted list:\n", "- #### Pick an element of the list as the ***pivot value***. In these examples, we'll always pick the middle element.\n", "- #### Move all the elements whose value is **less than or equal to** the pivot value to the left of the pivot.\n", "- #### Move all the elements whose value is **greater than** the pivot value to the right of the pivot.\n", "#### After partitioning a list, the pivot value is in its correct position if we then sort the list." ] }, { "cell_type": "markdown", "id": "5a1a8bf0-6853-4a9e-b622-37e85ec5ef18", "metadata": {}, "source": [ "## Printing functions" ] }, { "cell_type": "code", "execution_count": null, "id": "4827fdec-097c-4d8e-a467-7eda50901d38", "metadata": {}, "outputs": [], "source": [ "def print_array(a, pivot_index):\n", " \"\"\"\n", " Print list a with parentheses around\n", " the pivot value at pivot_index.\n", " \"\"\"\n", " for k in range(len(a)):\n", " value = a[k]\n", " \n", " if k == pivot_index:\n", " print(f'({value:2d})', end='')\n", " else:\n", " print(f' {value:2d} ', end='')" ] }, { "cell_type": "code", "execution_count": null, "id": "e279ea15-a4ca-492d-9b48-4ac829f57ff6", "metadata": {}, "outputs": [], "source": [ "def print_ij(a, i, j):\n", " \"\"\"\n", " Print the positions of indexes i and j\n", " relative to array a.\n", " \"\"\"\n", " for k in range(len(a)):\n", " if (k == i) and (k == j):\n", " print(' ji ', end='')\n", " elif k == i:\n", " print(' i ', end='')\n", " elif k == j:\n", " print(' j ', end='')\n", " else:\n", " print(' ', end='')\n", " \n", " print()" ] }, { "cell_type": "code", "execution_count": null, "id": "fa301ac0-1f23-4547-923d-00004afb2ecb", "metadata": {}, "outputs": [], "source": [ "def swap(a, i, j, pivot_index, printij=False):\n", " \"\"\"\n", " Print the swap of the elements of array a\n", " at indexes left and right.\n", " \"\"\"\n", " if a[i] != a[j]:\n", " a[i], a[j] = a[j], a[i]\n", " print_array(a, pivot_index)\n", " print(f' swap {a[i]} <=> {a[j]}')\n", "\n", " if printij:\n", " print_ij(a, i, j)" ] }, { "cell_type": "markdown", "id": "c2ffadc6-63c3-4309-8b65-a0ea5d089d85", "metadata": {}, "source": [ "## Partition algorithm\n", "#### Steps to partition an list `a` bounded by indexes `left` and `right` into two sublists:\n", "- #### Pick the middle element as the `pivot_value`.\n", "- #### Park `pivot_value` out of the way by swapping it with the rightmost value.\n", "- #### Index `i` starts at the left end of the array.\n", "- #### Index `j` starts at one from the right end of the array (since the pivot value is parked at the right end).\n", "- #### As long as `i` and `j` don't cross or meet (`while i < j`):\n", " - #### Move `i` to the right while `a[i] <= pivot_value` (skip over values less than the pivot value).\n", " - #### Move `j` to the left while `a[j] > pivot_value` (skip over values greater than the pivot value).\n", " - #### When `i` and `j` have stopped moving, swap `a[i]` and `a[j]` so that values less than `pivot_value` move to the left and values greater than the pivot value move to the right\n", "- #### After `i` and `j` have crossed or met, swap `pivot_value` parked at the right end of the array with `a[i]` to move `pivot_value` to its final position thereby separating the list into two sublists.\n", "#### The left sublist will have all values < `pivot_value` and the right sublist will have all values >= `pivot_value`." ] }, { "cell_type": "code", "execution_count": null, "id": "754f4ede-588c-4d63-b62f-7e6c041da7c2", "metadata": {}, "outputs": [], "source": [ "def partition(a, left, right):\n", " \"\"\"\n", " Partition the list a bounded by indexes \n", " left and right.\n", " @param left the leftmost index of the sublist.\n", " @param right the rightmost index of the sublist.\n", " @return the final index of the pivot value.\n", " \"\"\"\n", " # Pick the pivot value.\n", " middle = (left + right)//2\n", " pivot_value = a[middle]\n", " \n", " print('----'*len(a))\n", " print_array(a, middle)\n", " print(f' pivot value = {pivot_value}')\n", " \n", " # Swap the pivot value with the rightmost value.\n", " swap(a, middle, right, right)\n", " print('----'*len(a))\n", "\n", " print_array(a, right)\n", " print(' starting i and j')\n", " print_ij(a, left, right - 1)\n", " \n", " i = left - 1\n", " j = right\n", " \n", " i_moved = j_moved = False\n", "\n", " # As long as i and j don't cross ...\n", " while i < j: \n", " \n", " # Move i to the right.\n", " i += 1\n", " while (i < right) and (a[i] <= pivot_value):\n", " i += 1\n", " i_moved = True\n", "\n", " if i_moved:\n", " print_array(a, right)\n", " print(' move i right')\n", " print_ij(a, i, min(j, right - 1))\n", " \n", " # Move j to the left.\n", " j -= 1\n", " while (j > left) and (a[j] > pivot_value):\n", " j -= 1\n", " j_moved = True\n", "\n", " if j_moved:\n", " print_array(a, right)\n", " print(' move j left')\n", " print_ij(a, i, max(j, left))\n", " \n", " # Swap after i and j have stopped moving.\n", " if i < j:\n", " swap(a, i, j, right, printij=True)\n", " \n", " i_moved = j_moved = True\n", " \n", " # Swap the pivot value from the right end \n", " # to its correct position.\n", " swap(a, i, right, i)\n", " \n", " return i # final index of the pivot value" ] }, { "cell_type": "markdown", "id": "796c8766-ce98-48c1-bb1f-2232bbdf42dd", "metadata": {}, "source": [ "#### After partitioning an array, print the sorted array to show that the pivot value is in its correct position in the sorted array." ] }, { "cell_type": "code", "execution_count": null, "id": "8ee53dd9-551e-4bb4-bedb-596022c17e89", "metadata": {}, "outputs": [], "source": [ "def do_partitioning(a, show_sort=False):\n", " \"\"\"\n", " Partition array a.\n", " Then sort the array to show that the pivot value\n", " is in its correct position.\n", " \"\"\"\n", " pivot_index = partition(a, 0, len(a) - 1)\n", " verify_partitioning(a, pivot_index)\n", "\n", " if show_sort:\n", " print()\n", " print('After sorting:')\n", " print_array(sorted(a), pivot_index)\n", " print()\n" ] }, { "cell_type": "code", "execution_count": null, "id": "5b258bfe-66e1-467f-91fd-72b0a97cf43a", "metadata": {}, "outputs": [], "source": [ "def verify_partitioning(a, pivot_index):\n", " \"\"\"\n", " Verify that all the values in the left sublist \n", " of list a are less than or equal to the pivot value\n", " at the pivot index, and all the values in the \n", " right sublist are greater than the pivot value.\n", " \"\"\"\n", " pivot_value = a[pivot_index]\n", " size = len(a)\n", "\n", " # Check left sublist.\n", " if pivot_index > 0:\n", " for k in range(pivot_index):\n", " if a[k] > pivot_value:\n", " raise Exception('Partitioning error: left sublist')\n", "\n", " # Check right sublist.\n", " if pivot_index < size - 1:\n", " for k in range(pivot_index + 1, size):\n", " if a[k] <= pivot_value:\n", " raise Exception('Partitioning error: right sublist')\n" ] }, { "cell_type": "markdown", "id": "1556ed42-85c6-4f77-8592-b232d1255d83", "metadata": {}, "source": [ "## Initial tests" ] }, { "cell_type": "code", "execution_count": null, "id": "53576636-9bd5-4265-b3ba-69ef6daa90c1", "metadata": {}, "outputs": [], "source": [ "do_partitioning([99, 8, 55, 39, 82, 63, 57, 51, 6, 5, 5],\n", " show_sort=True)" ] }, { "cell_type": "code", "execution_count": null, "id": "99933ed5-527b-4206-a054-8f7b4d813768", "metadata": {}, "outputs": [], "source": [ "do_partitioning([28, 53, 39, 29, 69, 47, 74, 13, 14, 47, 85],\n", " show_sort=True)" ] }, { "cell_type": "code", "execution_count": null, "id": "43057d61-1538-4857-a78e-3e37cbc6d4c0", "metadata": {}, "outputs": [], "source": [ "do_partitioning([27, 22, 35, 21, 56, 18, 98, 41, 29, 55, 20])" ] }, { "cell_type": "markdown", "id": "6d7fd619-803c-499c-82da-112b0cebaefc", "metadata": {}, "source": [ "## Pathological cases" ] }, { "cell_type": "code", "execution_count": null, "id": "5949b901-6fa5-44b0-a3e0-1a4d9002cc37", "metadata": {}, "outputs": [], "source": [ "# Already partitioned\n", "do_partitioning([4, 3, 1, 2, 5, 6, 9, 7, 10, 8])" ] }, { "cell_type": "code", "execution_count": null, "id": "12476a87-7f48-41e1-8aee-824aa7d4a61e", "metadata": {}, "outputs": [], "source": [ "# Already sorted\n", "do_partitioning(list(range(10)))" ] }, { "cell_type": "code", "execution_count": null, "id": "50ac4cb1-1214-4c9a-8425-059a78a2f89c", "metadata": {}, "outputs": [], "source": [ "# Already reverse sorted\n", "do_partitioning(list(range(9, -1, -1)))" ] }, { "cell_type": "code", "execution_count": null, "id": "e04acab2-eec3-4d83-b962-7513a90e737e", "metadata": {}, "outputs": [], "source": [ "# All the same\n", "do_partitioning([5]*11)" ] }, { "cell_type": "markdown", "id": "25195721-2166-44f2-9179-5e0a27deadbe", "metadata": {}, "source": [ "## Random tests" ] }, { "cell_type": "code", "execution_count": null, "id": "80f11eca-ad25-4dc5-95e3-e605d945dc19", "metadata": {}, "outputs": [], "source": [ "import random\n", "\n", "SIZE = 11\n", "COUNT = 100\n", "\n", "random.seed(0)\n", "\n", "for n in range(1, COUNT+1): \n", " print()\n", " print(f'PARTITIONING TEST #{n}')\n", " \n", " a = [random.randint(10, 99) for _ in range(SIZE)];\n", " do_partitioning(a)\n", "\n", "print()\n", "print('DONE!')" ] }, { "cell_type": "code", "execution_count": null, "id": "c3848594-0357-4e9e-810c-bc920f90575f", "metadata": {}, "outputs": [], "source": [ "# (c) 2024 by Ronald Mak" ] }, { "cell_type": "code", "execution_count": null, "id": "bfa4fc96-95d7-4b0b-b538-7b682632e264", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.5" } }, "nbformat": 4, "nbformat_minor": 5 }