📅 2026-05-19 · 8 min read · ✍️ juanwangdev

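Python Regular Expressions: Substitution and Splitting

Substitution and splitting are core text-processing operations. In the re module, sub and subn replace the text that a pattern matches, and split cuts a string wherever a pattern matches.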

Basic substitution with sub

Python
import re

text = "Hello World, hello Python"
# Replace every match (case-insensitive)
result = re.sub(r"hello", "Hi", text, flags=re.IGNORECASE)
print(result)  # Hi World, Hi Python

# Limit the number of replacements with count
result = re.sub(r"hello", "Hi", text, count=1, flags=re.IGNORECASE)
print(result)  # Hi World, hello Python
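
When the same pattern is applied to many strings, it can be compiled once and reused. The following is a minimal sketch, assuming a hypothetical list of input lines; the compiled pattern object exposes its own sub method with the same count parameter.

```python
import re

# Compile once, then reuse the pattern object's own sub method.
greeting = re.compile(r"hello", flags=re.IGNORECASE)

# Hypothetical sample input, for illustration only.
lines = ["Hello World", "say hello, HELLO again"]
replaced = [greeting.sub("Hi", line) for line in lines]
print(replaced)  # ['Hi World', 'say Hi, Hi again']

# count works the same way on a compiled pattern.
first_only = greeting.sub("Hi", "hello hello", count=1)
print(first_only)  # Hi hello
```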

Substitution with groups

Python
import re

# Reorder a date from YYYY-MM-DD to MM/DD/YYYY
text = "2024-05-19"
result = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\2/\3/\1", text)
print(result)  # 05/19/2024

# Reference named groups with \g<name>
text = "John Smith"
result = re.sub(r"(?P<first>\w+)\s+(?P<last>\w+)", r"\g<last>, \g<first>", text)
print(result)  # Smith, John
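
One case where the \g<...> form is needed even for numbered groups is a group reference followed directly by a digit. The sketch below uses a made-up CSS-like string to show it: \g<1> ends the reference before the literal "0".

```python
import re

# Hypothetical sample input, for illustration only.
text = "width: 12px"

# r"\10" would be read as a reference to group 10, which does not exist here,
# so \g<1> is used to end the group reference before the literal "0".
result = re.sub(r"(\d+)px", r"\g<1>0px", text)
print(result)  # width: 120px
```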

Substitution with a callback function

Python
import re

# The callback receives a match object and returns the replacement string
def uppercase(match):
    return match.group(0).upper()

text = "hello world"
result = re.sub(r"\b\w+", uppercase, text)
print(result)  # HELLO WORLD

# Decide the replacement from the matched content
def smart_replace(match):
    word = match.group(0)
    if len(word) > 5:
        return word.upper()
    return word

text = "hello beautiful world"
result = re.sub(r"\b\w+", smart_replace, text)
print(result)  # hello BEAUTIFUL world
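
A callback can also compute a value or look one up, which a static replacement string cannot do. Here is a small sketch with two illustrative cases; the function name double, the replacements mapping, and the sample strings are assumptions made for the example.

```python
import re

def double(match):
    # match.group(0) holds the matched digits as a string.
    return str(int(match.group(0)) * 2)

doubled = re.sub(r"\d+", double, "3 apples and 10 pears")
print(doubled)  # 6 apples and 20 pears

# Dictionary lookup: each matched word is replaced by its mapped value.
replacements = {"cat": "dog", "red": "blue"}
pattern = re.compile(r"\b(?:cat|red)\b")
swapped = pattern.sub(lambda m: replacements[m.group(0)], "a red cat")
print(swapped)  # a blue dog
```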

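Counting replacements with subn

Python
import re

text = "a1 a2 a3 a4 a5"
# subn returns a tuple: (new string, number of replacements)
result, count = re.subn(r"a\d", "X", text)
print(result)  # X X X X X
print(count)   # 5 (number of replacements)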
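
The count returned by subn is a convenient way to tell whether anything changed at all. The helper below is a sketch; redact_digits is a hypothetical name chosen for this example, not part of the re module.

```python
import re

def redact_digits(line):
    """Return the redacted line and whether anything was changed."""
    new_line, count = re.subn(r"\d", "#", line)
    return new_line, count > 0

print(redact_digits("pin 4821"))   # ('pin ####', True)
print(redact_digits("no digits"))  # ('no digits', False)
```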

Basic splitting with split

Python
import re

# Split on runs of whitespace (spaces and tabs)
text = "hello   world\tpython"
words = re.split(r"\s+", text)
print(words)  # ['hello', 'world', 'python']

# Split on any of several separators
text = "a,b;c d"
parts = re.split(r"[,; ]", text)
print(parts)  # ['a', 'b', 'c', 'd']
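
Real input often has stray whitespace around the separators. One possible approach, shown here on a made-up string, is to absorb that whitespace into the separator pattern so the pieces come out already trimmed.

```python
import re

# Hypothetical sample input with uneven spacing around "," and ";".
text = "a , b;c ;  d"

# \s* on both sides makes the surrounding whitespace part of the separator.
parts = re.split(r"\s*[,;]\s*", text)
print(parts)  # ['a', 'b', 'c', 'd']
```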

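Keeping separators with a capturing group

Python
import re

# Plain split: the separator is discarded
text = "a,b,c"
parts = re.split(r",", text)
print(parts)  # ['a', 'b', 'c']

# Split with a capturing group: the separator is kept in the result
parts = re.split(r"(,)", text)
print(parts)  # ['a', ',', 'b', ',', 'c']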
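
Keeping the separators is useful for simple tokenizing. The sketch below splits a small arithmetic expression (an illustrative input) into operands and operators, and contrasts it with a non-capturing group, which groups without keeping the separator.

```python
import re

# Hypothetical sample expression, for illustration only.
expr = "3+4*2"

# Capturing group: operators are kept as separate items.
tokens = re.split(r"([+*])", expr)
print(tokens)  # ['3', '+', '4', '*', '2']

# Non-capturing group (?:...): operators are discarded.
numbers = re.split(r"(?:\+|\*)", expr)
print(numbers)  # ['3', '4', '2']
```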

Limiting the number of splits with maxsplit

Python
import re

text = "a,b,c,d,e"
parts = re.split(r",", text, maxsplit=2)
print(parts)  # ['a', 'b', 'c,d,e']
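
A common use of maxsplit=1 is separating a name from a value that may itself contain the separator. This is a sketch on a made-up header-style line, not a full header parser.

```python
import re

# Hypothetical sample line; the value contains a second ":".
line = "Host: example.com:8080"

# Only the first ":" (plus any following whitespace) is used as a separator.
name, value = re.split(r":\s*", line, maxsplit=1)
print(name)   # Host
print(value)  # example.com:8080
```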

Handling leading and trailing separators

Python
import re

# A separator at the start or end produces an empty string
text = ",a,b,c,"
parts = re.split(r",", text)
print(parts)  # ['', 'a', 'b', 'c', '']

# Filter out the empty strings
parts = [p for p in re.split(r",", text) if p]
print(parts)  # ['a', 'b', 'c']
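
An alternative to splitting and then filtering is to match the fields themselves with re.findall. The sketch below assumes empty fields should be dropped everywhere, including in the middle of the string, which is the same result the filter gives.

```python
import re

# Hypothetical sample input with empty fields at the edges and in the middle.
text = ",a,,b,c,"

# Match runs of non-comma characters instead of splitting on commas.
fields = re.findall(r"[^,]+", text)
print(fields)  # ['a', 'b', 'c']
```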

More substitution examples

Python
import re

# Strip HTML tags (adequate for simple fragments, not a full HTML parser)
text = "<p>Hello</p><div>World</div>"
result = re.sub(r"<[^>]+>", "", text)
print(result)  # HelloWorld

# Reformat URL query parameters
text = "key1=value1&key2=value2"
result = re.sub(r"(\w+)=(\w+)", r"\1: \2", text)
print(result)  # key1: value1&key2: value2

# Mask the middle four digits of a phone number
text = "Contact number: 13812345678"
result = re.sub(r"(\d{3})(\d{4})(\d{4})", r"\1****\3", text)
print(result)  # Contact number: 138****5678
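
The phone-masking pattern above matches any 11 consecutive digits, including the first 11 digits of a longer number. One way to tighten it, sketched here on a made-up string, is to add lookarounds so that no digit may sit directly before or after the match.

```python
import re

# Hypothetical sample input: a 16-digit order number plus a phone number.
text = "order 1234567890123456, phone 13812345678"

# Unbounded pattern: also rewrites part of the order number.
loose = re.sub(r"(\d{3})(\d{4})(\d{4})", r"\1****\3", text)
print(loose)   # order 123****890123456, phone 138****5678

# Lookarounds require that the 11 digits are not part of a longer digit run.
strict = re.sub(r"(?<!\d)(\d{3})\d{4}(\d{4})(?!\d)", r"\1****\2", text)
print(strict)  # order 1234567890123456, phone 138****5678
```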

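More splitting examples

Python
import re

# Parse a simple config string
config = "name=Alice;age=25;city=Beijing"
pairs = re.split(r";", config)
for pair in pairs:
    key, value = re.split(r"=", pair)
    print(f"{key}: {value}")  # name: Alice, then age: 25, then city: Beijing

# Two-level separators: split on ";" first, then on ":"
text = "a:b;c:d"
parts = re.split(r"[;]", text)
for part in parts:
    sub_parts = re.split(r"[:]", part)
    print(sub_parts)  # ['a', 'b'], then ['c', 'd']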
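
The config loop above unpacks exactly two pieces per pair, so it would fail if a value itself contained "=". A possible variant, assuming a made-up config string with such a value, combines maxsplit=1 with dict to collect the pairs.

```python
import re

# Hypothetical config string; the token value contains "=".
config = "name=Alice;age=25;token=a=b"

# maxsplit=1 splits only at the first "=", so "a=b" survives as a value.
settings = dict(re.split(r"=", pair, maxsplit=1) for pair in re.split(r";", config))
print(settings)  # {'name': 'Alice', 'age': '25', 'token': 'a=b'}
```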

Method comparison

Method   Purpose                 Return value
sub      Replace all matches     New string
subn     Replace and count       (new string, count)
split    Split on a pattern      List
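
The three return types can be seen side by side by running each function on the same input. This is a short sketch with an illustrative string.

```python
import re

# Hypothetical sample input, for illustration only.
text = "a1,b2"

replaced = re.sub(r"\d", "#", text)
print(replaced)  # a#,b#

counted = re.subn(r"\d", "#", text)
print(counted)   # ('a#,b#', 2)

pieces = re.split(r",", text)
print(pieces)    # ['a1', 'b2']
```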

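Key points

  • sub(pattern, repl, string) replaces every match of pattern in string
  • \1, \2, ... and \g<name> reference captured groups in the replacement string
  • A callback function as repl enables dynamic replacement logic
  • subn returns both the new string and the number of replacements
  • split(pattern, string) splits a string wherever the pattern matches
  • Wrapping the separator in a capturing group keeps it in the result
  • maxsplit limits the number of splits
  • Substitution and splitting are core text-processing operations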
