San Jose State University Department of Applied Data Science
**DATA 200 Computational Programming for Data Analytics**
Spring 2024 Instructor: Ron Mak
"
]
},
{
"cell_type": "markdown",
"id": "205c1575-47fb-41c6-b1ac-eb493c657599",
"metadata": {},
"source": [
"# Partition a List"
]
},
{
"cell_type": "markdown",
"id": "a5de3b2c-73ec-4177-bbbf-7021f3964fb1",
"metadata": {},
"source": [
"#### Demonstrate partitioning a list with a trace."
]
},
{
"cell_type": "markdown",
"id": "958068cb-59fc-4a65-99bc-1a3f9cfbb6cd",
"metadata": {},
"source": [
"#### To ***partition*** an unsorted list:\n",
"- #### Pick an element of the list as the ***pivot value***. In these examples, we'll always pick the middle element.\n",
"- #### Move all the elements whose value is **less than or equal to** the pivot value to the left of the pivot.\n",
"- #### Move all the elements whose value is **greater than** the pivot value to the right of the pivot.\n",
"#### After partitioning a list, the pivot value is in its correct position if we then sort the list."
]
},
{
"cell_type": "markdown",
"id": "5a1a8bf0-6853-4a9e-b622-37e85ec5ef18",
"metadata": {},
"source": [
"## Printing functions"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4827fdec-097c-4d8e-a467-7eda50901d38",
"metadata": {},
"outputs": [],
"source": [
"def print_array(a, pivot_index):\n",
" \"\"\"\n",
" Print list a with parentheses around\n",
" the pivot value at pivot_index.\n",
" \"\"\"\n",
" for k in range(len(a)):\n",
" value = a[k]\n",
" \n",
" if k == pivot_index:\n",
" print(f'({value:2d})', end='')\n",
" else:\n",
" print(f' {value:2d} ', end='')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e279ea15-a4ca-492d-9b48-4ac829f57ff6",
"metadata": {},
"outputs": [],
"source": [
"def print_ij(a, i, j):\n",
" \"\"\"\n",
" Print the positions of indexes i and j\n",
" relative to array a.\n",
" \"\"\"\n",
" for k in range(len(a)):\n",
" if (k == i) and (k == j):\n",
" print(' ji ', end='')\n",
" elif k == i:\n",
" print(' i ', end='')\n",
" elif k == j:\n",
" print(' j ', end='')\n",
" else:\n",
" print(' ', end='')\n",
" \n",
" print()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fa301ac0-1f23-4547-923d-00004afb2ecb",
"metadata": {},
"outputs": [],
"source": [
"def swap(a, i, j, pivot_index, printij=False):\n",
" \"\"\"\n",
" Print the swap of the elements of array a\n",
" at indexes left and right.\n",
" \"\"\"\n",
" if a[i] != a[j]:\n",
" a[i], a[j] = a[j], a[i]\n",
" print_array(a, pivot_index)\n",
" print(f' swap {a[i]} <=> {a[j]}')\n",
"\n",
" if printij:\n",
" print_ij(a, i, j)"
]
},
{
"cell_type": "markdown",
"id": "c2ffadc6-63c3-4309-8b65-a0ea5d089d85",
"metadata": {},
"source": [
"## Partition algorithm\n",
"#### Steps to partition an list `a` bounded by indexes `left` and `right` into two sublists:\n",
"- #### Pick the middle element as the `pivot_value`.\n",
"- #### Park `pivot_value` out of the way by swapping it with the rightmost value.\n",
"- #### Index `i` starts at the left end of the array.\n",
"- #### Index `j` starts at one from the right end of the array (since the pivot value is parked at the right end).\n",
"- #### As long as `i` and `j` don't cross or meet (`while i < j`):\n",
" - #### Move `i` to the right while `a[i] <= pivot_value` (skip over values less than the pivot value).\n",
" - #### Move `j` to the left while `a[j] > pivot_value` (skip over values greater than the pivot value).\n",
" - #### When `i` and `j` have stopped moving, swap `a[i]` and `a[j]` so that values less than `pivot_value` move to the left and values greater than the pivot value move to the right\n",
"- #### After `i` and `j` have crossed or met, swap `pivot_value` parked at the right end of the array with `a[i]` to move `pivot_value` to its final position thereby separating the list into two sublists.\n",
"#### The left sublist will have all values < `pivot_value` and the right sublist will have all values >= `pivot_value`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "754f4ede-588c-4d63-b62f-7e6c041da7c2",
"metadata": {},
"outputs": [],
"source": [
"def partition(a, left, right):\n",
" \"\"\"\n",
" Partition the list a bounded by indexes \n",
" left and right.\n",
" @param left the leftmost index of the sublist.\n",
" @param right the rightmost index of the sublist.\n",
" @return the final index of the pivot value.\n",
" \"\"\"\n",
" # Pick the pivot value.\n",
" middle = (left + right)//2\n",
" pivot_value = a[middle]\n",
" \n",
" print('----'*len(a))\n",
" print_array(a, middle)\n",
" print(f' pivot value = {pivot_value}')\n",
" \n",
" # Swap the pivot value with the rightmost value.\n",
" swap(a, middle, right, right)\n",
" print('----'*len(a))\n",
"\n",
" print_array(a, right)\n",
" print(' starting i and j')\n",
" print_ij(a, left, right - 1)\n",
" \n",
" i = left - 1\n",
" j = right\n",
" \n",
" i_moved = j_moved = False\n",
"\n",
" # As long as i and j don't cross ...\n",
" while i < j: \n",
" \n",
" # Move i to the right.\n",
" i += 1\n",
" while (i < right) and (a[i] <= pivot_value):\n",
" i += 1\n",
" i_moved = True\n",
"\n",
" if i_moved:\n",
" print_array(a, right)\n",
" print(' move i right')\n",
" print_ij(a, i, min(j, right - 1))\n",
" \n",
" # Move j to the left.\n",
" j -= 1\n",
" while (j > left) and (a[j] > pivot_value):\n",
" j -= 1\n",
" j_moved = True\n",
"\n",
" if j_moved:\n",
" print_array(a, right)\n",
" print(' move j left')\n",
" print_ij(a, i, max(j, left))\n",
" \n",
" # Swap after i and j have stopped moving.\n",
" if i < j:\n",
" swap(a, i, j, right, printij=True)\n",
" \n",
" i_moved = j_moved = True\n",
" \n",
" # Swap the pivot value from the right end \n",
" # to its correct position.\n",
" swap(a, i, right, i)\n",
" \n",
" return i # final index of the pivot value"
]
},
{
"cell_type": "markdown",
"id": "796c8766-ce98-48c1-bb1f-2232bbdf42dd",
"metadata": {},
"source": [
"#### After partitioning an array, print the sorted array to show that the pivot value is in its correct position in the sorted array."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8ee53dd9-551e-4bb4-bedb-596022c17e89",
"metadata": {},
"outputs": [],
"source": [
"def do_partitioning(a, show_sort=False):\n",
" \"\"\"\n",
" Partition array a.\n",
" Then sort the array to show that the pivot value\n",
" is in its correct position.\n",
" \"\"\"\n",
" pivot_index = partition(a, 0, len(a) - 1)\n",
" verify_partitioning(a, pivot_index)\n",
"\n",
" if show_sort:\n",
" print()\n",
" print('After sorting:')\n",
" print_array(sorted(a), pivot_index)\n",
" print()\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5b258bfe-66e1-467f-91fd-72b0a97cf43a",
"metadata": {},
"outputs": [],
"source": [
"def verify_partitioning(a, pivot_index):\n",
" \"\"\"\n",
" Verify that all the values in the left sublist \n",
" of list a are less than or equal to the pivot value\n",
" at the pivot index, and all the values in the \n",
" right sublist are greater than the pivot value.\n",
" \"\"\"\n",
" pivot_value = a[pivot_index]\n",
" size = len(a)\n",
"\n",
" # Check left sublist.\n",
" if pivot_index > 0:\n",
" for k in range(pivot_index):\n",
" if a[k] > pivot_value:\n",
" raise Exception('Partitioning error: left sublist')\n",
"\n",
" # Check right sublist.\n",
" if pivot_index < size - 1:\n",
" for k in range(pivot_index + 1, size):\n",
" if a[k] <= pivot_value:\n",
" raise Exception('Partitioning error: right sublist')\n"
]
},
{
"cell_type": "markdown",
"id": "1556ed42-85c6-4f77-8592-b232d1255d83",
"metadata": {},
"source": [
"## Initial tests"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "53576636-9bd5-4265-b3ba-69ef6daa90c1",
"metadata": {},
"outputs": [],
"source": [
"do_partitioning([99, 8, 55, 39, 82, 63, 57, 51, 6, 5, 5],\n",
" show_sort=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "99933ed5-527b-4206-a054-8f7b4d813768",
"metadata": {},
"outputs": [],
"source": [
"do_partitioning([28, 53, 39, 29, 69, 47, 74, 13, 14, 47, 85],\n",
" show_sort=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "43057d61-1538-4857-a78e-3e37cbc6d4c0",
"metadata": {},
"outputs": [],
"source": [
"do_partitioning([27, 22, 35, 21, 56, 18, 98, 41, 29, 55, 20])"
]
},
{
"cell_type": "markdown",
"id": "6d7fd619-803c-499c-82da-112b0cebaefc",
"metadata": {},
"source": [
"## Pathological cases"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5949b901-6fa5-44b0-a3e0-1a4d9002cc37",
"metadata": {},
"outputs": [],
"source": [
"# Already partitioned\n",
"do_partitioning([4, 3, 1, 2, 5, 6, 9, 7, 10, 8])"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "12476a87-7f48-41e1-8aee-824aa7d4a61e",
"metadata": {},
"outputs": [],
"source": [
"# Already sorted\n",
"do_partitioning(list(range(10)))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "50ac4cb1-1214-4c9a-8425-059a78a2f89c",
"metadata": {},
"outputs": [],
"source": [
"# Already reverse sorted\n",
"do_partitioning(list(range(9, -1, -1)))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e04acab2-eec3-4d83-b962-7513a90e737e",
"metadata": {},
"outputs": [],
"source": [
"# All the same\n",
"do_partitioning([5]*11)"
]
},
{
"cell_type": "markdown",
"id": "25195721-2166-44f2-9179-5e0a27deadbe",
"metadata": {},
"source": [
"## Random tests"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "80f11eca-ad25-4dc5-95e3-e605d945dc19",
"metadata": {},
"outputs": [],
"source": [
"import random\n",
"\n",
"SIZE = 11\n",
"COUNT = 100\n",
"\n",
"random.seed(0)\n",
"\n",
"for n in range(1, COUNT+1): \n",
" print()\n",
" print(f'PARTITIONING TEST #{n}')\n",
" \n",
" a = [random.randint(10, 99) for _ in range(SIZE)];\n",
" do_partitioning(a)\n",
"\n",
"print()\n",
"print('DONE!')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c3848594-0357-4e9e-810c-bc920f90575f",
"metadata": {},
"outputs": [],
"source": [
"# (c) 2024 by Ronald Mak"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bfa4fc96-95d7-4b0b-b538-7b682632e264",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}