{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### <center>San Jose State University<br>Department of Applied Data Science<br><br>**DATA 200<br>Computational Programming for Data Analytics**<br><br>Spring 2024<br>Instructor: Ron Mak</center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 8.9 Splitting and Joining Strings"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Splitting Strings"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### String method `split()` breaks a string into a list of \"token\" strings that were separated by a common \"delimiter\" string."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "letters = 'A, B, C, D'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "letters.split(', ')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "'A, B,, ,,, C, D'.split(', ')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "'to be or not to be or what else'.split('to')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### An optional second argument specifies the maximum number of splits. The last token string in the list consists of the remaining characters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "letters.split(', ', 2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### If you call `split()` with no arguments, the default delimiter is any consecutive whitespace characters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "'mary\\n had    a little \\t lamb'.split()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Joining Strings"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### String method `join()` operates on a string that will become a delimiter. It joins together a list of string values using the delimiter."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "letters_list = ['A', 'B', 'C', 'D']\n",
    "\n",
    "','.join(letters_list)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "' '.join('mary\\n had    a little \\t lamb'.split())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "[str(i) for i in range(10)]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "','.join([str(i) for i in range(10)])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### String Methods `partition()` and `rpartition()`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### String method `partition` splits a string into a tuple of three strings based on the call's string argument (the separator):\n",
    "- the part of the string before the **first occurrence** of the separator\n",
    "- the separator string\n",
    "- the part of the string after the first occurrence of the separator"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "'Amanda: 89, 97, 92'.partition(': ')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### String method `rpartition()` splits a string into a tuple of three strings based on the **last occurrence** of the separator."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "url = 'http://www.cs.sjsu.edu/~mak/DATA200/notebooks/0315/08_09.ipynb'\n",
    "\n",
    "url_path, separator, notebook_file = url.rpartition('/')\n",
    "\n",
    "print(f'{url_path = }')\n",
    "print(f'{notebook_file = }')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### String Method `splitlines()`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### String method `splitlines()` splits a multiline string into a list of strings that each represents one line of the multiline string."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "lines = \"\"\"This is line 1\n",
    "This is line2\n",
    "This is line3\"\"\"\n",
    "\n",
    "lines"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(lines)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "lines.splitlines()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### If you pass the argument `True`, the newlines are kept at the end of each string."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "lines.splitlines(True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "##########################################################################\n",
    "# (C) Copyright 2019 by Deitel & Associates, Inc. and                    #\n",
    "# Pearson Education, Inc. All Rights Reserved.                           #\n",
    "#                                                                        #\n",
    "# DISCLAIMER: The authors and publisher of this book have used their     #\n",
    "# best efforts in preparing the book. These efforts include the          #\n",
    "# development, research, and testing of the theories and programs        #\n",
    "# to determine their effectiveness. The authors and publisher make       #\n",
    "# no warranty of any kind, expressed or implied, with regard to these    #\n",
    "# programs or to the documentation contained in these books. The authors #\n",
    "# and publisher shall not be liable in any event for incidental or       #\n",
    "# consequential damages in connection with, or arising out of, the       #\n",
    "# furnishing, performance, or use of these programs.                     #\n",
    "##########################################################################\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Additional material (C) Copyright 2023 by Ronald Mak"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
