{ "cells": [ { "cell_type": "markdown", "id": "edc4d861-043c-40d2-871f-4489c175939e", "metadata": {}, "source": [ "# Compute medians in Python" ] }, { "cell_type": "code", "execution_count": null, "id": "a85a4659-9b0a-4e38-a3ac-6ad2ba3430d3", "metadata": {}, "outputs": [], "source": [ "import statistics\n", "from math import floor, ceil" ] }, { "cell_type": "markdown", "id": "9bbd822c-6430-46ba-b79e-9e322f403efb", "metadata": {}, "source": [ "## Python: Odd-sized list\n", "#### First sort the list." ] }, { "cell_type": "code", "execution_count": null, "id": "4bb47859-9826-4456-af83-222fdb8ba971", "metadata": {}, "outputs": [], "source": [ "original_list_odd = [ 15, 6, 20, 13, 8, 2, 6 ]\n", "\n", "count = len(original_list_odd)\n", "print(f'{count = }')\n", "print()\n", "\n", "print('original list odd:', end='')\n", "for i in range(count):\n", " print(f'{original_list_odd[i]:3}', end='')\n", "print()\n", "\n", "sorted_list_odd = sorted(original_list_odd)\n", "\n", "print(' sorted list odd:', end='')\n", "for i in range(count):\n", " print(f'{sorted_list_odd[i]:3}', end='')\n", "print()" ] }, { "cell_type": "code", "execution_count": null, "id": "78cfd526-b603-451d-9f1d-d25bd1c45eab", "metadata": {}, "outputs": [], "source": [ "print('row_index value')\n", "\n", "for i in range(1, count + 1):\n", " print(f'{i:5d}{sorted_list_odd[i-1]:10d}')" ] }, { "cell_type": "markdown", "id": "d2aa9d48-67fe-4690-b461-6c0a57c50141", "metadata": {}, "source": [ "#### If a list contains an **odd number** of values, the median is the value in the sorted list at the middle row index." 
] }, { "cell_type": "code", "execution_count": null, "id": "89bc35fd-2cd7-4828-8eb3-ecc226d07a79", "metadata": {}, "outputs": [], "source": [ "middle_index = floor((count + 1)/2)\n", "print(f'{middle_index = }')\n", "\n", "# Subtract 1 because Python list indexes start at 0.\n", "median = sorted_list_odd[middle_index - 1]\n", "print(f'{median = }')" ] }, { "cell_type": "code", "execution_count": null, "id": "49089d82-993f-4a7c-b6df-9794b58deec9", "metadata": {}, "outputs": [], "source": [ "print(f'{statistics.median(original_list_odd) = }')" ] }, { "cell_type": "markdown", "id": "c19607b9-ab4f-4abc-930e-d0e8de1581b6", "metadata": {}, "source": [ "## Python: Even-sized list\n", "#### First sort the list." ] }, { "cell_type": "code", "execution_count": null, "id": "2bc68599-7907-439c-8444-5ea0ebc818c4", "metadata": {}, "outputs": [], "source": [ "original_list_even = original_list_odd + [25,]\n", "\n", "count = len(original_list_even)\n", "print(f'{count = }')\n", "print()\n", "\n", "print('original list even:', end='')\n", "for i in range(count):\n", " print(f'{original_list_even[i]:3}', end='')\n", "print()\n", "\n", "sorted_list_even = sorted(original_list_even)\n", "\n", "print(' sorted list even:', end='')\n", "for i in range(count):\n", " print(f'{sorted_list_even[i]:3}', end='')\n", "print()" ] }, { "cell_type": "code", "execution_count": null, "id": "60ba6998-6b92-41c1-a88e-d9d1a95af5a0", "metadata": {}, "outputs": [], "source": [ "print('row_index value')\n", "\n", "for i in range(1, count + 1):\n", " print(f'{i:5d}{sorted_list_even[i-1]:10d}')" ] }, { "cell_type": "markdown", "id": "a401f449-3409-4c7e-88e3-2bf13d97cabc", "metadata": {}, "source": [ "#### If a list contains an **even number** of values, the median is the **average** of the two values in the sorted list at the two middle indexes." 
] }, { "cell_type": "code", "execution_count": null, "id": "08a65ec9-5369-4435-a036-92d1893a967c", "metadata": {}, "outputs": [], "source": [ "middle_index_1 = floor((count + 1)/2)\n", "middle_index_2 = ceil((count + 1)/2)\n", "\n", "print(f'{middle_index_1 = }')\n", "print(f'{middle_index_2 = }')\n", "\n", "# Subtract 1 because Python list indexes start at 0.\n", "median = (sorted_list_even[middle_index_1 - 1] + sorted_list_even[middle_index_2 - 1])/2\n", "print(f'{median = }')" ] }, { "cell_type": "code", "execution_count": null, "id": "d742093e-1018-4599-bfa4-c76fd0a818e1", "metadata": {}, "outputs": [], "source": [ "print(f'{statistics.median(original_list_even) = }')" ] }, { "cell_type": "markdown", "id": "646e38c8-267d-4576-a02b-bdbe2d6aa73c", "metadata": {}, "source": [ "## Function to calculate the median" ] }, { "cell_type": "code", "execution_count": null, "id": "7159ead0-d118-4a2f-888c-2a43183086b1", "metadata": {}, "outputs": [], "source": [ "def median_function(lst):\n", " count = len(lst)\n", " middle_index_1 = floor((count + 1)/2)\n", " middle_index_2 = ceil((count + 1)/2)\n", " \n", " sorted_lst = sorted(lst)\n", " return (sorted_lst[middle_index_1 - 1] + sorted_lst[middle_index_2 - 1])/2\n" ] }, { "cell_type": "code", "execution_count": null, "id": "cd4b3930-41dd-4a91-aa43-3271781eb24f", "metadata": {}, "outputs": [], "source": [ "print(f'{median_function(original_list_even)}')" ] }, { "cell_type": "markdown", "id": "7c239c14-2a83-49ca-ab54-b115ea0c9971", "metadata": {}, "source": [ "## Any size list\n", "#### Will the function also work for an odd number of values?" 
] }, { "cell_type": "code", "execution_count": null, "id": "a5a52079-e65c-4f48-9173-824a088b2cad", "metadata": {}, "outputs": [], "source": [ "print(f'{median_function(original_list_odd)}')" ] }, { "cell_type": "markdown", "id": "271a9e29-952a-4bc3-8f79-feb76cd00be7", "metadata": {}, "source": [ "#### Yes, if the number of values is odd, indexes `FLOOR((count + 1)/2)` and `CEIL((count + 1)/2)` are equal, and so we'll end up taking the average of two copies of the same median value." ] }, { "cell_type": "code", "execution_count": null, "id": "738c376e-0d04-404a-9816-798a947c2c2e", "metadata": {}, "outputs": [], "source": [ "count = 7\n", "\n", "print(f'{floor((count + 1)/2) = }')\n", "print(f'{ ceil((count + 1)/2) = }')" ] }, { "cell_type": "code", "execution_count": null, "id": "66293bd8-a62e-40ca-a834-56ab46e4982f", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.4" } }, "nbformat": 4, "nbformat_minor": 5 }