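I had heard for a while that this feature was coming, but I never looked into it closely. My first impression of the name Pattern Matching was some blend of pattern recognition and regular expressions. Python already has regex, I thought, so it can't be that. Pattern recognition, then? Was Python about to become a true AI Python, with ML built in?

It wasn't until this tweet from Guido that I read the documentation carefully and realized, (⊙o⊙)?, too young. It is nothing of the sort: no AI Python, no regex, none of that.

It turns out to be a souped-up switch ...

Starting with installing Python 3.10.0a6, let's look step by step at what this Pattern Matching really is.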

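Installing Python 3.10.0a6

An official docker image for this release has already been published, so for a quick try you can simply pull it.

Here, however, we will install from source; if you are using that docker image, you can skip this part.

We use the Ubuntu 18.04 docker image, which can be pulled with the following command: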

docker pull ubuntu:18.04

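Assuming the Python 3.10.0a6 tarball has already been downloaded from the official site into /home/alan/download/ on the host, we start the Ubuntu image with the following command:

# Because of slow network speeds, mount the tarball into the container with -v
docker run -it -v /home/alan/download/:/root/ ubuntu:18.04 bash

Once inside the container, run the following commands in order to install Python 3.10.0a6: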

cd /root/
tar -xzvf Python-3.10.0a6.tgz
cd Python-3.10.0a6
apt update
apt-get install -y make build-essential libssl-dev zlib1g-dev \
       libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm \
       libncurses5-dev libncursesw5-dev xz-utils tk-dev
./configure --enable-optimizations --with-ensurepip=install
make -j 8  # takes ten-odd minutes
make altinstall  # does not replace the system Python

The directory structure inside Python-3.10.0a6.tgz is as follows:

$ tree Python-3.10.0a6/ -d -L 2
Python-3.10.0a6/
|-- Doc
|   |-- c-api
|   |-- data
|   |-- distributing
|   |-- distutils
|   |-- extending
|   |-- faq
|   |-- howto
|   |-- includes
|   |-- install
|   |-- installing
|   |-- library
|   |-- reference
|   |-- tools
|   |-- tutorial
|   |-- using
|   `-- whatsnew
|-- Grammar
|-- Include
|   |-- cpython
|   `-- internal
|-- Lib
|   |-- asyncio
|   |-- collections
|   |-- concurrent
|   |-- ctypes
|   |-- curses
|   |-- dbm
|   |-- distutils
|   |-- email
|   |-- encodings
|   |-- ensurepip
|   |-- html
|   |-- http
|   |-- idlelib
|   |-- importlib
|   |-- json
|   |-- lib2to3
|   |-- logging
|   |-- msilib
|   |-- multiprocessing
|   |-- pydoc_data
|   |-- site-packages
|   |-- sqlite3
|   |-- test
|   |-- tkinter
|   |-- turtledemo
|   |-- unittest
|   |-- urllib
|   |-- venv
|   |-- wsgiref
|   |-- xml
|   |-- xmlrpc
|   `-- zoneinfo
|-- Mac
|   |-- BuildScript
|   |-- IDLE
|   |-- Icons
|   |-- PythonLauncher
|   |-- Resources
|   `-- Tools
|-- Misc
|-- Modules
|   |-- _blake2
|   |-- _ctypes
|   |-- _decimal
|   |-- _io
|   |-- _multiprocessing
|   |-- _sha3
|   |-- _sqlite
|   |-- _ssl
|   |-- _xxtestfuzz
|   |-- cjkcodecs
|   |-- clinic
|   `-- expat
|-- Objects
|   |-- clinic
|   `-- stringlib
|-- PC
|   |-- clinic
|   |-- icons
|   `-- layout
|-- PCbuild
|-- Parser
|-- Programs
|-- Python
|   `-- clinic
|-- Tools
|   |-- buildbot
|   |-- c-analyzer
|   |-- ccbench
|   |-- clinic
|   |-- demo
|   |-- freeze
|   |-- gdb
|   |-- i18n
|   |-- importbench
|   |-- iobench
|   |-- msi
|   |-- nuget
|   |-- peg_generator
|   |-- pynche
|   |-- scripts
|   |-- ssl
|   |-- stringbench
|   |-- test2to3
|   |-- tz
|   |-- unicode
|   `-- unittestgui
`-- m4

110 directories

Once this finishes, Python 3.10.0a6 is installed on the system; type python3.10 at the command line and press Enter to start the interpreter.
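As a quick sanity check after the build, here is a minimal sketch that probes whether the running interpreter can compile a match statement at all. The helper name supports_match is my own and not part of Python; it relies only on the built-in compile().

```python
import sys


def supports_match() -> bool:
    """Return True if this interpreter can compile a match statement."""
    source = "match 0:\n    case _:\n        pass\n"
    try:
        compile(source, "<probe>", "exec")
    except SyntaxError:
        # Interpreters older than 3.10 reject the syntax outright.
        return False
    return True


print(sys.version)
print("match supported:", supports_match())
```

Run it with python3.10; on an older interpreter the probe should simply report False instead of crashing.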

Syntax

The full name of Pattern Matching is Structural Pattern Matching (SPM below). Let's set Structural aside for now and look at the pattern matching part first.

Basic syntax

match subject:
    case <pattern_1>:
        <action_1>
    case <pattern_2>:
        <action_2>
    case <pattern_3>:
        <action_3>
    case _:
        <action_wildcard>

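That is the syntax of SPM. Familiar, isn't it? At its core it is a switch statement: the subject is compared against each case's pattern in order, and the action of the first case that matches is executed. The pattern of the last case is _, a wildcard that acts as the default: if none of the earlier cases matched, this case runs as the catch-all. And if you leave out the catch-all case? Then it is a no-op and nothing is executed.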

For example, returning a different message for each HTTP status:

def http_error(status):
    match status:
        case 400:
            return "Bad request"
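        case 401 | 403:
            return "Not allowed"
        case 404:
            return "Not found"
        case 418:
            return "I'm a teapot"
        case _:
            return "Something's wrong with the Internet"

If status is 400, "Bad request" is returned; if it is 500, control reaches the final catch-all case and "Something's wrong with the Internet" is returned. Without that last case the function would return nothing for 500, or more precisely, it would return None.

Another thing worth noting in the code above is the line case 401 | 403: where | again means "or", so either value matches. Cases are tried from top to bottom, so also listing 404 in that | pattern would make the later case 404: unreachable.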

Structural?

OK, now let's talk about structural. Where does the syntax show anything structural? Look at this sentence from the documentation:

using data with type and shape (the subject)

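The subject has a type and a shape; in other words, the subject carries structure, and each pattern declares up front the structure it expects. For example, the subject can be a list, a tuple, a class instance, a list of class instances, and so on.

Concrete examples:

tuple:

# point is an (x, y) tuple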
match point:
    case (0, 0):
        print("Origin")
    case (0, y):
        print(f"Y={y}")
    case (x, 0):
        print(f"X={x}")
    case (x, y):
        print(f"X={x}, Y={y}")
    case _:
        raise ValueError("Not a point")

class:

class Point:
    x: int
    y: int

def location(point):
    match point:
        case Point(x=0, y=0):
            print("Origin is the point's location.")
        case Point(x=0, y=y):
            print(f"Y={y} and the point is on the y-axis.")
        case Point(x=x, y=0):
            print(f"X={x} and the point is on the x-axis.")
        case Point():
            print("The point is located somewhere else on the plane.")
        case _:
            print("Not a point")

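Nested structure, a list of class instances (the positional form Point(0, 0) only works if Point is a dataclass or sets __match_args__, as discussed in the guard section below):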

match points:
    case []:
        print("No points in the list.")
    case [Point(0, 0)]:
        print("The origin is the only point in the list.")
    case [Point(x, y)]:
        print(f"A single point {x}, {y} is in the list.")
    case [Point(0, y1), Point(0, y2)]:
        print(f"Two points on the Y axis at {y1}, {y2} are in the list.")
    case _:
        print("Something else is found in the list.")

It can even be a dict (mapping pattern):

def dict_test(d):
    match d:
        case {'a': 1}:
            return True
        case {'a': 1, 'b': 2}:
            return False

dict_test({'a': 1})
# True

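When using a dict as the pattern, pay particular attention: the pattern only checks the keys it contains, and extra keys in the subject are ignored. It behaves roughly like all(key in subject and pattern[key] == subject[key] for key in pattern), driven by the pattern's keys. For example:

dict_test({'a': 1, 'b': 1})
# Still returns True, because the first case only checks the value of the key 'a'

Also note that even if a case's pattern does not agree with the subject in type or shape, no error is raised; the case simply does not match.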

A guard?

Just like the familiar A if B else C form in Python, something similar exists here:

match point:
    case Point(x=x, y=y) if x == y:
        print(f"The point is located on the diagonal Y=X at {x}.")
    case Point(x=x, y=y):
        print("Point is not on the diagonal.")

Note that where the code above has Point(x=x, y=y), the original documentation writes Point(x, y); written that way, it actually raises an error:

Traceback (most recent call last):
  File "/root/test.py", line 69, in <module>
    guard_test(point)
  File "/root/test.py", line 39, in guard_test
    case Point(x, y) if x == y:
TypeError: Point() accepts 0 positional sub-patterns (2 given)

Because the Point class is not a dataclass and does not set __match_args__, positional arguments cannot be used directly. The documentation puts it this way:

You can use positional parameters with some builtin classes that provide an ordering for their attributes (e.g. dataclasses). You can also define a specific position for attributes in patterns by setting the __match_args__ special attribute in your classes.

If you do want to write Point(x, y), simply turn Point into a dataclass:

from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

Also, regarding evaluation order:

Note that value capture happens before the guard is evaluated

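If I understand correctly, the pattern is checked first and the guard is evaluated only if the pattern matches, which is the natural order. The subject is like a person (the word also means the subject of a sentence) and each case is like a door. To find out which door it can enter, the subject has to pass each door's check: first whether you are a person at all, then whether you are the person I am looking for. The guard is the last line of defense.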

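Summary

To sum up, it is what I said above, a souped-up switch: logical or, structural (multiple types supported), and guards. There should be plenty of use cases; getting rid of long chains of if is within reach, and there is no more need to envy the switch in Java and other languages.

New in Python 3.10: Pattern Matching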

Alan Lee