{ "cells": [ { "cell_type": "markdown", "id": "19d7be1b-ec19-48be-a66d-cc55c4997fba", "metadata": {}, "source": [ "###
San Jose State University
Department of Applied Data Science

**DATA 200
Computational Programming for Data Analytics**

Spring 2024
Instructor: Ron Mak
" ] }, { "cell_type": "markdown", "id": "c8e6a3af-456b-4286-8721-8e7cd2572aeb", "metadata": {}, "source": [ "# Reading, splitting, and combining `numpy` arrays" ] }, { "cell_type": "code", "execution_count": null, "id": "62f1a3b7-313d-43d0-8813-c06a23d0862c", "metadata": {}, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "markdown", "id": "6ef76e5b-aee6-45ae-95c9-2b8f8c5774e4", "metadata": {}, "source": [ "## Use the `genfromtxt()` function to read from a CSV file\n", "Contents of `data.csv`:\n", "```\n", "1,2,3,4,5,6\n", "7,8,9,10,11,12\n", "13,14,15,16,17,18\n", "19,20,21,22,23,24\n", "```" ] }, { "cell_type": "code", "execution_count": null, "id": "7d216af4-6a86-4f67-9143-248d76d37e4c", "metadata": {}, "outputs": [], "source": [ "# Parse the comma-separated file into a 2-D array; values are read as floats by default\n", "dataset = np.genfromtxt('data.csv', delimiter=',')\n", "dataset" ] }, { "cell_type": "markdown", "id": "2d7ca5d3-0340-4f82-9813-85b07a8f280d", "metadata": {}, "source": [ "## Split an array" ] }, { "cell_type": "markdown", "id": "db33c9c3-f3ad-46a0-9251-1a2a93eacbf8", "metadata": {}, "source": [ "### Split horizontally into subarrays with an equal number of columns" ] }, { "cell_type": "code", "execution_count": null, "id": "e8bcf37e-5a66-4e26-b7fd-bd9e899c9fec", "metadata": {}, "outputs": [], "source": [ "# 6 columns / 3 = three subarrays of 2 columns each\n", "np.hsplit(dataset, 3)" ] }, { "cell_type": "code", "execution_count": null, "id": "34a4c340-d674-49d6-abe7-83b5828bd224", "metadata": {}, "outputs": [], "source": [ "# Same result with the general split() function along axis 1 (columns)\n", "np.split(dataset, 3, axis=1)" ] }, { "cell_type": "markdown", "id": "871912ba-2f26-4e13-b482-878b38c9b443", "metadata": {}, "source": [ "### Split vertically into subarrays with an equal number of rows" 
] }, { "cell_type": "code", "execution_count": null, "id": "f91b5c3c-c051-452b-b940-67fe9cdbaf13", "metadata": {}, "outputs": [], "source": [ "# 4 rows / 2 = two subarrays of 2 rows each\n", "np.vsplit(dataset, 2)" ] }, { "cell_type": "code", "execution_count": null, "id": "e3fd1717-27c7-4dba-9515-d2ebd1d788e3", "metadata": {}, "outputs": [], "source": [ "# Same result with the general split() function along axis 0 (rows)\n", "np.split(dataset, 2, axis=0)" ] }, { "cell_type": "markdown", "id": "f85ef0ce-abd7-4b23-91cb-642d7ecb6139", "metadata": {}, "source": [ "### Split horizontally at given column indices" ] }, { "cell_type": "code", "execution_count": null, "id": "0cd53da2-42b0-4834-944e-547e2a0e4348", "metadata": {}, "outputs": [], "source": [ "# Split before columns 1 and 3: columns [0], [1, 2], and [3, 4, 5]\n", "np.hsplit(dataset, [1, 3])" ] }, { "cell_type": "markdown", "id": "e624a337-bcc7-4bc9-a596-5d6096fad18e", "metadata": {}, "source": [ "### Split vertically at given row indices" ] }, { "cell_type": "code", "execution_count": null, "id": "9976d860-afd0-4f30-8d42-9776c6515118", "metadata": {}, "outputs": [], "source": [ "# Split before row 3: rows [0, 1, 2] and row [3]\n", "np.vsplit(dataset, [3])" ] }, { "cell_type": "markdown", "id": "0866ca51-9b9f-41f0-8d23-c0c9c28154e6", "metadata": {}, "source": [ "## Combine arrays" ] }, { "cell_type": "code", "execution_count": null, "id": "a76c1d43-2d85-4906-b6bb-cfc634e9ecf4", "metadata": {}, "outputs": [], "source": [ "# array_1 and array_2 are 2 x 4; array_3 is 1-D with 4 elements\n", "array_1 = np.array([[1, 2, 3, 4],\n", "                    [5, 6, 7, 8]])\n", "array_2 = np.array([[10, 20, 30, 40],\n", "                    [50, 60, 70, 80]])\n", "array_3 = np.array([100, 200, 300, 400])" ] }, { "cell_type": "markdown", "id": "57b09ac0-7434-435b-8952-dfcd0d10d3d6", "metadata": {}, "source": [ "### Vertically stack arrays" ] }, { "cell_type": "code", "execution_count": null, "id": "5bf464e3-0319-4da3-8fb1-b32b3f7b5188", "metadata": {}, "outputs": [], "source": [ "# Inputs must have the same number of columns; the 1-D array_3 is treated as a single row\n", "np.vstack([array_1, array_2, array_3])" ] }, { "cell_type": "code", "execution_count": null, "id": "47920462-52e1-486a-82e0-16d3098f9bdc", "metadata": {}, "outputs": [], "source": [ "# row_stack() is an alias of vstack(); it is deprecated as of NumPy 2.0, so prefer vstack()\n", "np.row_stack([array_1, array_2, array_3])" ] }, { "cell_type": "code", "execution_count": null, "id": 
"18275a9a-a011-4428-a926-3570e21c19c3", "metadata": {}, "outputs": [], "source": [ "# Arrays for horizontal stacking: same number of rows, different numbers of columns\n", "array_a = np.array([[1, 2, 3, 4],\n", "                    [5, 6, 7, 8]])\n", "array_b = np.array([[10, 20, 30],\n", "                    [50, 60, 70]])\n", "array_c = np.array([[100],\n", "                    [200]])" ] }, { "cell_type": "code", "execution_count": null, "id": "48789d58-fdb7-4b3d-ac5b-4c924b52a81b", "metadata": {}, "outputs": [], "source": [ "# Inputs must have the same number of rows; 4 + 3 + 1 columns give a 2 x 8 result\n", "np.hstack([array_a, array_b, array_c])" ] }, { "cell_type": "code", "execution_count": null, "id": "2b469a8a-0781-4aac-8197-41d3a1df6e11", "metadata": {}, "outputs": [], "source": [ "# For 2-D inputs, column_stack() gives the same result as hstack()\n", "np.column_stack([array_a, array_b, array_c])" ] }, { "cell_type": "code", "execution_count": null, "id": "f8801d23-acec-43ae-a815-322bd334522b", "metadata": {}, "outputs": [], "source": [ "# (C) Copyright 2023 by Ronald Mak" ] }, { "cell_type": "code", "execution_count": null, "id": "16f4b6a5-3262-45b6-b8ea-ab6d0861f5c5", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.5" } }, "nbformat": 4, "nbformat_minor": 5 }