1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110
|
<!DOCTYPE html
PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<title>Chapter 7. Regular Expressions</title>
<link rel="stylesheet" href="../diveintopython.css" type="text/css">
<link rev="made" href="mailto:f8dy@diveintopython.org">
<meta name="generator" content="DocBook XSL Stylesheets V1.52.2">
<meta name="keywords" content="Python, Dive Into Python, tutorial, object-oriented, programming, documentation, book, free">
<meta name="description" content="Python from novice to pro">
<link rel="home" href="../toc/index.html" title="Dive Into Python">
<link rel="up" href="../toc/index.html" title="Dive Into Python">
<link rel="previous" href="../file_handling/summary.html" title="6.7. Summary">
<link rel="next" href="street_addresses.html" title="7.2. Case Study: Street Addresses">
</head>
<body>
<table id="Header" width="100%" border="0" cellpadding="0" cellspacing="0" summary="">
<tr>
<td id="breadcrumb" colspan="5" align="left" valign="top">You are here: <a href="../index.html">Home</a> > <a href="../toc/index.html">Dive Into Python</a> > <span class="thispage">Regular Expressions</span></td>
<td id="navigation" align="right" valign="top"> <a href="../file_handling/summary.html" title="Prev: “Summary”"><<</a> <a href="street_addresses.html" title="Next: “Case Study: Street Addresses”">>></a></td>
</tr>
<tr>
<td colspan="3" id="logocontainer">
<h1 id="logo"><a href="../index.html" accesskey="1">Dive Into Python</a></h1>
<p id="tagline">Python from novice to pro</p>
</td>
<td colspan="3" align="right">
<form id="search" method="GET" action="http://www.google.com/custom">
<p><label for="q" accesskey="4">Find: </label><input type="text" id="q" name="q" size="20" maxlength="255" value=" "> <input type="submit" value="Search"><input type="hidden" name="cof" value="LW:752;L:http://diveintopython.org/images/diveintopython.png;LH:42;AH:left;GL:0;AWFID:3ced2bb1f7f1b212;"><input type="hidden" name="domains" value="diveintopython.org"><input type="hidden" name="sitesearch" value="diveintopython.org"></p>
</form>
</td>
</tr>
</table>
<!--#include virtual="/inc/ads" -->
<div class="chapter" lang="en">
<div class="titlepage">
<div>
<div>
<h2 class="title"><a name="re"></a>Chapter 7. Regular Expressions
</h2>
</div>
</div>
<div></div>
</div>
<div class="toc">
<ul>
<li><span class="section"><a href="index.html#re.intro">7.1. Diving In</a></span></li>
<li><span class="section"><a href="street_addresses.html">7.2. Case Study: Street Addresses</a></span></li>
<li><span class="section"><a href="roman_numerals.html">7.3. Case Study: Roman Numerals</a></span><ul>
<li><span class="section"><a href="roman_numerals.html#d0e17592">7.3.1. Checking for Thousands</a></span></li>
<li><span class="section"><a href="roman_numerals.html#d0e17785">7.3.2. Checking for Hundreds</a></span></li>
</ul>
</li>
<li><span class="section"><a href="n_m_syntax.html">7.4. Using the {n,m} Syntax</a></span><ul>
<li><span class="section"><a href="n_m_syntax.html#d0e18326">7.4.1. Checking for Tens and Ones</a></span></li>
</ul>
</li>
<li><span class="section"><a href="verbose.html">7.5. Verbose Regular Expressions</a></span></li>
<li><span class="section"><a href="phone_numbers.html">7.6. Case study: Parsing Phone Numbers</a></span></li>
<li><span class="section"><a href="summary.html">7.7. Summary</a></span></li>
</ul>
</div>
<div class="abstract">
<p>Regular expressions are a powerful and standardized way of searching, replacing, and parsing text with complex patterns of
characters. If you've used regular expressions in other languages (like <span class="application">Perl</span>), the syntax will be very familiar, and you get by just reading the summary of the <a href="http://www.python.org/doc/current/lib/module-re.html"><tt class="filename">re</tt> module</a> to get an overview of the available functions and their arguments.
</p>
</div>
<div class="section" lang="en">
<div class="titlepage">
<div>
<div>
<h2 class="title"><a name="re.intro"></a>7.1. Diving In
</h2>
</div>
</div>
<div></div>
</div>
<p>Strings have methods for searching (<tt class="function">index</tt>, <tt class="function">find</tt>, and <tt class="function">count</tt>), replacing (<tt class="function">replace</tt>), and parsing (<tt class="function">split</tt>), but they are limited to the simplest of cases. The search methods look for a single, hard-coded substring, and they are
always case-sensitive. To do case-insensitive searches of a string <tt class="varname">s</tt>, you must call <tt class="function">s.lower()</tt> or <tt class="function">s.upper()</tt> and make sure your search strings are the appropriate case to match. The <tt class="function">replace</tt> and <tt class="function">split</tt> methods have the same limitations.
</p>
<div class="abstract">
<p>If what you're trying to do can be accomplished with string functions, you should use them. They're fast and simple and easy
to read, and there's a lot to be said for fast, simple, readable code. But if you find yourself using a lot of different
string functions with <tt class="literal">if</tt> statements to handle special cases, or if you're combining them with <tt class="function">split</tt> and <tt class="function">join</tt> and list comprehensions in weird unreadable ways, you may need to move up to regular expressions.
</p>
</div>
<p>Although the regular expression syntax is tight and unlike normal code, the result can end up being <span class="emphasis"><em>more</em></span> readable than a hand-rolled solution that uses a long chain of string functions. There are even ways of embedding comments
within regular expressions to make them practically self-documenting.
</p>
</div>
</div>
<table class="Footer" width="100%" border="0" cellpadding="0" cellspacing="0" summary="">
<tr>
<td width="35%" align="left"><br><a class="NavigationArrow" href="../file_handling/summary.html"><< Summary</a></td>
<td width="30%" align="center"><br> <span class="divider">|</span> <span class="thispage">1</span> <span class="divider">|</span> <a href="street_addresses.html" title="7.2. Case Study: Street Addresses">2</a> <span class="divider">|</span> <a href="roman_numerals.html" title="7.3. Case Study: Roman Numerals">3</a> <span class="divider">|</span> <a href="n_m_syntax.html" title="7.4. Using the {n,m} Syntax">4</a> <span class="divider">|</span> <a href="verbose.html" title="7.5. Verbose Regular Expressions">5</a> <span class="divider">|</span> <a href="phone_numbers.html" title="7.6. Case study: Parsing Phone Numbers">6</a> <span class="divider">|</span> <a href="summary.html" title="7.7. Summary">7</a> <span class="divider">|</span>
</td>
<td width="35%" align="right"><br><a class="NavigationArrow" href="street_addresses.html">Case Study: Street Addresses >></a></td>
</tr>
<tr>
<td colspan="3"><br></td>
</tr>
</table>
<div class="Footer">
<p class="copyright">Copyright © 2000, 2001, 2002, 2003, 2004 <a href="mailto:mark@diveintopython.org">Mark Pilgrim</a></p>
</div>
</body>
</html>
|