### <center>San Jose State University<br>Department of Applied Data Science<br><br>**DATA 200<br>Computational Programming for Data Analytics**<br><br>Spring 2023<br>Instructor: Ron Mak</center>

# 8.12.3 Other Search Functions; Accessing Matches

In [None]:
import re

### Function `search()` — Finding the First Match Anywhere in a String

#### The regular expression function `search()` finds the first occurrence of a substring match and returns a **match object** whose `group()` function returns the matching substring. If there is no match, `search()` returns `None`.

In [None]:
result = re.search('Python', 'Python is fun')
result

In [None]:
result.group() if result else 'not found'

In [None]:
result2 = re.search('fun!', 'Python is fun')
result2

In [None]:
result2.group() if result2 else 'not found'
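
#### Because `search()` returns `None` when nothing matches, calling `group()` directly on a failed search raises an `AttributeError`. The sketch below wraps the check used above in a small helper. The function name `first_match` and its `default` parameter are hypothetical and are not part of the `re` module.

```python
import re

def first_match(pattern, text, default='not found'):
    """Return the first matching substring, or default if nothing matches."""
    result = re.search(pattern, text)
    return result.group() if result else default

found = first_match('fun', 'Python is fun')
missing = first_match('fun!', 'Python is fun')

print(found)    # fun
print(missing)  # not found
```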

### Ignoring Case with the Optional flags Keyword Argument

#### Set the keyword argument `flags` to `re.IGNORECASE` to ignore case during regular expression matches.

In [None]:
result3 = re.search('Sam', 'BILL SAM WHITE', flags=re.IGNORECASE)
result3

In [None]:
result3.group() if result3 else 'not found'
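
#### As a short comparison, the sketch below runs the same search four ways: with `re.IGNORECASE`, with its shorter alias `re.I`, with the inline flag `(?i)` at the start of the pattern, and with no flag at all. The variable names are chosen here for illustration only.

```python
import re

text = 'BILL SAM WHITE'

with_flag = re.search('Sam', text, flags=re.IGNORECASE)
with_alias = re.search('Sam', text, flags=re.I)   # re.I is an alias of re.IGNORECASE
inline = re.search('(?i)Sam', text)               # inline flag inside the pattern
case_sensitive = re.search('Sam', text)           # no flag, so 'Sam' does not match 'SAM'

print(with_flag.group())   # SAM
print(with_alias.group())  # SAM
print(inline.group())      # SAM
print(case_sensitive)      # None
```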

### Metacharacters that Restrict Matches to the Beginning or End of a String

#### The `^` metacharacter at the beginning of a regular expression (and not inside the square brackets of a custom character class) is an anchor that restricts matches only to the _beginning_ of a string.

In [None]:
result = re.search('^Python', 'Python is fun')
result

In [None]:
result.group() if result else 'not found'

In [None]:
result = re.search('^fun', 'Python is fun')
result

In [None]:
result.group() if result else 'not found'

#### The `$` metacharacter at the end of a regular expression is an anchor that restricts matches only to the end of a string.

In [None]:
result = re.search('Python$', 'Python is fun')
result

In [None]:
result.group() if result else 'not found'

In [None]:
result = re.search('fun$', 'Python is fun')
result

In [None]:
result.group() if result else 'not found'
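
#### The two anchors can be combined so that a pattern must account for the whole string. The sketch below illustrates this and compares it with `re.fullmatch()`, which behaves similarly for these inputs. The variable names are illustrative assumptions.

```python
import re

# With both anchors, the pattern must span the string from start to end.
exact = re.search('^Python$', 'Python')
with_extra_text = re.search('^Python$', 'Python is fun')

# fullmatch() requires the entire string to match, without explicit anchors.
full = re.fullmatch('Python', 'Python')
full_with_extra_text = re.fullmatch('Python', 'Python is fun')

print(exact.group())         # Python
print(with_extra_text)       # None
print(full.group())          # Python
print(full_with_extra_text)  # None
```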

### Functions `findall()` and `finditer()` — Finding All Matches in a String

#### Regular expression function `findall()` returns a list of all the matching substrings.

In [None]:
contact = 'Wally White, Home: 555-555-1234, Work: 555-555-4321'

In [None]:
re.findall(r'\d{3}-\d{3}-\d{4}', contact)

#### Regular expression function `finditer()` is similar, except that it returns a **lazy iterator** that supplies **match objects** one at a time when requested. Call each match object's `group()` function to get the matching substring. This is ideal if there are many matches and memory usage is a concern.

In [None]:
re.finditer(r'\d{3}-\d{3}-\d{4}', contact)

In [None]:
for phone in re.finditer(r'\d{3}-\d{3}-\d{4}', contact):
    print(phone.group())
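
#### Since `finditer()` yields match objects rather than strings, each match also carries its position in the searched string. The sketch below collects each phone number together with its `start()` and `end()` offsets. The names `phone_pattern` and `positions` are introduced here for illustration.

```python
import re

contact = 'Wally White, Home: 555-555-1234, Work: 555-555-4321'
phone_pattern = r'\d{3}-\d{3}-\d{4}'

# Each item from finditer() is a match object, so positions are available.
positions = [(m.group(), m.start(), m.end())
             for m in re.finditer(phone_pattern, contact)]

for number, start, end in positions:
    # Slicing the original string with the offsets recovers the match.
    print(number, start, end, contact[start:end] == number)

# The iterator is lazy: next() fetches one match at a time.
matches = re.finditer(phone_pattern, contact)
first = next(matches)
print(first.group())  # 555-555-1234
```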

### Capturing Substrings in a Match

#### Use the parentheses metacharacters to "capture" matching substrings.

In [None]:
text = 'Charlie Cyan, e-mail: demo1@deitel.com'

pattern = r'([A-Z][a-z]+ [A-Z][a-z]+), e-mail: (\w+@\w+\.\w{3})'

In [None]:
result = re.search(pattern, text)
result

#### The match object's `groups()` function returns a tuple of all the captured substrings that matched.

In [None]:
result.groups()

#### The match object's `group()` function, called with no arguments, returns the _entire_ match as a single string.

In [None]:
result.group()

#### Pass an index value as an argument to `group()` to access individual captured substrings. Unlike array indexing, captured substrings are indexed from 1.

In [None]:
result.group(1)

In [None]:
result.group(2)
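
#### Two related details of `group()` may be useful here: index 0 refers to the entire match, and passing several indexes returns the corresponding captured substrings as a tuple. The sketch below reuses the text and pattern from above. The variable names `whole`, `name`, and `email` are illustrative.

```python
import re

text = 'Charlie Cyan, e-mail: demo1@deitel.com'
pattern = r'([A-Z][a-z]+ [A-Z][a-z]+), e-mail: (\w+@\w+\.\w{3})'

result = re.search(pattern, text)

whole = result.group(0)           # same as result.group()
name, email = result.group(1, 2)  # several indexes return a tuple

print(whole)            # Charlie Cyan, e-mail: demo1@deitel.com
print(name)             # Charlie Cyan
print(email)            # demo1@deitel.com
print(result.groups())  # ('Charlie Cyan', 'demo1@deitel.com')
```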

In [None]:
##########################################################################
# (C) Copyright 2019 by Deitel & Associates, Inc. and                    #
# Pearson Education, Inc. All Rights Reserved.                           #
#                                                                        #
# DISCLAIMER: The authors and publisher of this book have used their     #
# best efforts in preparing the book. These efforts include the          #
# development, research, and testing of the theories and programs        #
# to determine their effectiveness. The authors and publisher make       #
# no warranty of any kind, expressed or implied, with regard to these    #
# programs or to the documentation contained in these books. The authors #
# and publisher shall not be liable in any event for incidental or       #
# consequential damages in connection with, or arising out of, the       #
# furnishing, performance, or use of these programs.                     #
##########################################################################


In [None]:
# Additional material (C) Copyright 2023 by Ronald Mak