wip-2.1

2026-06-05

2.1 The definition of “unit test”

There are many definitions of a unit test. Stripped of their non-essential bits, they all share three key attributes. A unit test is an automated test that:

1. Verifies a small piece of code (also known as a unit)
2. Does it quickly
3. Does it in an isolated manner

The first two attributes here are pretty non-controversial. There might be some dispute as to what exactly constitutes a fast unit test because it’s a highly subjective measure. But overall, it’s not that important. If your test suite’s execution time is good enough for you, it means your tests are quick enough.

What people have vastly different opinions about is the third attribute. The isolation issue is the root of the differences between the classical and London schools of unit testing. As you will see in the next section, all other differences between the two schools flow naturally from this single disagreement on what exactly isolation means. I prefer the classical style for the reasons I describe in section 2.3.

The classical and London schools of unit testing

The classical approach is also referred to as the Detroit and, sometimes, the classicist approach to unit testing. Probably the most canonical book on the classical school is the one by Kent Beck: Test-Driven Development: By Example (Addison-Wesley Professional, 2002).

The London style is sometimes referred to as mockist. Although the term mockist is widespread, people who adhere to this style of unit testing generally don’t like it, so I call it the London style throughout this book. The most prominent proponents of this approach are Steve Freeman and Nat Pryce. I recommend their book, Growing Object-Oriented Software, Guided by Tests (Addison-Wesley Professional, 2009), as a good source on this subject.

2.1.1 The isolation issue: The London take

What does it mean to verify a piece of code—a unit—in an isolated manner? The London school describes it as isolating the system under test from its collaborators. It means if a class has a dependency on another class, or several classes, you need to replace all such dependencies with test doubles. This way, you can focus on the class under test exclusively by separating its behavior from any external influence.

DEFINITION A test double is an object that looks and behaves like its release-intended counterpart but is actually a simplified version that reduces the complexity and facilitates testing. This term was introduced by Gerard Meszaros in his book, xUnit Test Patterns: Refactoring Test Code (Addison-Wesley, 2007). The name itself comes from the notion of a stunt double in movies.

Figure 2.1 shows how the isolation is usually achieved. A unit test that would otherwise verify the system under test along with all its dependencies now can do that separately from those dependencies.

Figure 2.1

One benefit of this approach is that if the test fails, you know for sure which part of the code base is broken: it’s the system under test. There could be no other suspects, because all of the class’s neighbors are replaced with the test doubles.

Another benefit is the ability to split the object graph—the web of communicating classes solving the same problem. This web may become quite complicated: every class in it may have several immediate dependencies, each of which relies on dependencies of their own, and so on. Classes may even introduce circular dependencies, where the chain of dependency eventually comes back to where it started.

Trying to test such an interconnected code base is hard without test doubles. Pretty much the only choice you are left with is re-creating the full object graph in the test, which might not be a feasible task if the number of classes in it is too high.

With test doubles, you can put a stop to this. You can substitute the immediate dependencies of a class; and, by extension, you don’t have to deal with the dependencies of those dependencies, and so on down the recursion path. You are effectively breaking up the graph—and that can significantly reduce the amount of preparations you have to do in a unit test.

And let’s not forget another small but pleasant side benefit of this approach to unit test isolation: it allows you to introduce a project-wide guideline of testing only one class at a time, which establishes a simple structure in the whole unit test suite. You no longer have to think much about how to cover your code base with tests. Have a class? Create a corresponding class with unit tests! Figure 2.2 shows how it usually looks.

Figure 2.2

Let’s now look at some examples. Since the classical style probably looks more familiar to most people, I’ll show sample tests written in that style first and then rewrite them using the London approach.

Let’s say that we operate an online store. There’s just one simple use case in our sample application: a customer can purchase a product. When there’s enough inventory in the store, the purchase is deemed to be successful, and the amount of the product in the store is reduced by the purchase’s amount. If there’s not enough product, the purchase is not successful, and nothing happens in the store.

Listing 2.1 shows two tests verifying that a purchase succeeds only when there’s enough inventory in the store. The tests are written in the classical style and use the typical three-phase sequence: arrange, act, and assert (AAA for short—I talk more about this sequence in chapter 3).

Listing 2.1

As you can see, the arrange part is where the tests make ready all dependencies and the system under test. The call to customer.Purchase() is the act phase, where you exercise the behavior you want to verify. The assert statements are the verification stage, where you check to see if the behavior led to the expected results.
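
As a rough illustration of the two classical-style tests described here, the following is a Python sketch rather than the original listing (the text's own examples use C#). The Store and Customer implementations, the snake_case method names, and the starting inventory of 10 are assumptions made for this sketch; only Product.Shampoo, the quantities 5 and 15, and the arrange/act/assert structure come from the text.

```python
from enum import Enum


class Product(Enum):
    SHAMPOO = "shampoo"


class Store:
    """Hypothetical in-memory store that tracks inventory per product."""

    def __init__(self):
        self._inventory = {}

    def add_inventory(self, product, quantity):
        self._inventory[product] = self.get_inventory(product) + quantity

    def get_inventory(self, product):
        return self._inventory.get(product, 0)

    def has_enough_inventory(self, product, quantity):
        return self.get_inventory(product) >= quantity

    def remove_inventory(self, product, quantity):
        if not self.has_enough_inventory(product, quantity):
            raise ValueError("not enough inventory")
        self._inventory[product] = self.get_inventory(product) - quantity


class Customer:
    """Hypothetical SUT: buys from a store only if the store has enough stock."""

    def purchase(self, store, product, quantity):
        if not store.has_enough_inventory(product, quantity):
            return False
        store.remove_inventory(product, quantity)
        return True


def test_purchase_succeeds_when_enough_inventory():
    # Arrange: a real Store instance, not a test double
    store = Store()
    store.add_inventory(Product.SHAMPOO, 10)  # starting stock is an assumption
    customer = Customer()

    # Act
    success = customer.purchase(store, Product.SHAMPOO, 5)

    # Assert: check the result and the store's resulting state
    assert success
    assert store.get_inventory(Product.SHAMPOO) == 5


def test_purchase_fails_when_not_enough_inventory():
    # Arrange
    store = Store()
    store.add_inventory(Product.SHAMPOO, 10)
    customer = Customer()

    # Act
    success = customer.purchase(store, Product.SHAMPOO, 15)

    # Assert: nothing happened in the store
    assert not success
    assert store.get_inventory(Product.SHAMPOO) == 10
```

Both tests exercise Customer and Store together, so a bug in either class can make them fail.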

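During the arrange phase, the tests put together two kinds of objects: the system under test (SUT) and one collaborator. In this case, Customer is the SUT and Store is the collaborator. We need the collaborator for two reasons:

1. To call the method under test at all, because customer.Purchase() takes a Store instance as an argument
2. For the assert phase, because one outcome of customer.Purchase() is a possible decrease in the amount of the product in the store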

Product.Shampoo and the numbers 5 and 15 are constants.

DEFINITION A method under test (MUT) is a method in the SUT called by the test. The terms MUT and SUT are often used as synonyms, but normally, MUT refers to a method while SUT refers to the whole class.

This code is an example of the classical style of unit testing: the test doesn’t replace the collaborator (the Store class) but rather uses a production-ready instance of it. One of the natural outcomes of this style is that the test now effectively verifies both Customer and Store, not just Customer. Any bug in the inner workings of Store that affects Customer will lead to failing these unit tests, even if Customer still works correctly. The two classes are not isolated from each other in the tests.

Let’s now modify the example toward the London style. I’ll take the same tests and replace the Store instances with test doubles—specifically, mocks.

I use Moq as the mocking framework, but you can find several equally good alternatives, such as NSubstitute. All object-oriented languages have analogous frameworks. For instance, in the Java world, you can use Mockito, JMock, or EasyMock.

DEFINITION A mock is a special kind of test double that allows you to examine interactions between the system under test and its collaborators.

We’ll get back to the topic of mocks, stubs, and the differences between them in later chapters. For now, the main thing to remember is that mocks are a subset of test doubles. People often use the terms test double and mock as synonyms, but technically, they are not: test double is the umbrella term for any simplified stand-in for a dependency in a test, and a mock is just one kind of test double.

The next listing shows how the tests look after isolating Customer from its collaborator, Store.

Listing 2.2

Note how different these tests are from those written in the classical style. In the arrange phase, the tests no longer instantiate a production-ready instance of Store but instead create a substitution for it, using Moq’s built-in class Mock<T>.

Furthermore, instead of modifying the state of Store by adding a shampoo inventory to it, we directly tell the mock how to respond to calls to HasEnoughInventory(). The mock reacts to this request the way the tests need, regardless of the actual state of Store. In fact, the tests no longer use Store—we have introduced an IStore interface and are mocking that interface instead of the Store class.

In chapter 8, I write in detail about working with interfaces. For now, just make a note that interfaces are required for isolating the system under test from its collaborators. You can also mock a concrete class, but that’s an anti-pattern; I cover this topic in chapter 11.

The assertion phase has changed too, and that’s where the key difference lies. We still check the output from customer.Purchase as before, but the way we verify that the customer did the right thing to the store is different. Previously, we did that by asserting against the store’s state. Now, we examine the interactions between Customer and Store: the tests check to see if the customer made the correct call on the store.

We do this by passing the method the customer should call on the store (x.RemoveInventory) as well as the number of times it should do that. If the purchase succeeds, the customer should call this method once (Times.Once). If the purchase fails, the customer shouldn’t call it at all (Times.Never).
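
The same idea can be sketched in Python with the standard library's unittest.mock, as a loose analogue of the Moq-based tests discussed here and not the original listing. The IStore abstract class, the Customer implementation, and the snake_case names are assumptions for this sketch. Setting return_value plays the role of telling the mock how to answer HasEnoughInventory(), and assert_called_once_with / assert_not_called stand in for the Times.Once and Times.Never checks.

```python
from abc import ABC, abstractmethod
from unittest.mock import Mock

SHAMPOO = "shampoo"  # stand-in for Product.Shampoo


class IStore(ABC):
    """Hypothetical interface that the tests mock instead of a concrete store."""

    @abstractmethod
    def has_enough_inventory(self, product, quantity): ...

    @abstractmethod
    def remove_inventory(self, product, quantity): ...


class Customer:
    """Hypothetical SUT, written against the IStore interface."""

    def purchase(self, store, product, quantity):
        if not store.has_enough_inventory(product, quantity):
            return False
        store.remove_inventory(product, quantity)
        return True


def test_purchase_succeeds_when_enough_inventory():
    # Arrange: a mock restricted to the IStore interface, not a real store
    store = Mock(spec=IStore)
    store.has_enough_inventory.return_value = True
    customer = Customer()

    # Act
    success = customer.purchase(store, SHAMPOO, 5)

    # Assert: verify the interaction instead of the store's state
    assert success
    store.remove_inventory.assert_called_once_with(SHAMPOO, 5)


def test_purchase_fails_when_not_enough_inventory():
    # Arrange
    store = Mock(spec=IStore)
    store.has_enough_inventory.return_value = False
    customer = Customer()

    # Act
    success = customer.purchase(store, SHAMPOO, 5)

    # Assert: the customer must not touch the inventory at all
    assert not success
    store.remove_inventory.assert_not_called()
```

No store state exists in these tests. The only thing they can observe about the collaborator is which calls the customer made on it.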

2.1.2 The isolation issue: The classical take

To reiterate, the London style approaches the isolation requirement by segregating the piece of code under test from its collaborators with the help of test doubles: specifically, mocks. Interestingly enough, this point of view also affects your standpoint on what constitutes a small piece of code (a unit). Here are all the attributes of a unit test once again. A unit test:

1. Verifies a small piece of code (a unit)
2. Does it quickly
3. Does it in an isolated manner

In addition to the third attribute leaving room for interpretation, there’s some room in the possible interpretations of the first attribute as well. How small should a small piece of code be? As you saw from the previous section, if you adopt the position of isolating every individual class, then it’s natural to accept that the piece of code under test should also be a single class, or a method inside that class. It can’t be more than that due to the way you approach the isolation issue.

In some cases, you might test a couple of classes at once; but in general, you’ll always strive to maintain this guideline of unit testing one class at a time.

As I mentioned earlier, there’s another way to interpret the isolation attribute—the classical way. In the classical approach, it’s not the code that needs to be tested in an isolated manner. Instead, unit tests themselves should be run in isolation from each other. That way, you can run the tests in parallel, sequentially, and in any order, whatever fits you best, and they still won’t affect each other’s outcome.

Isolating tests from each other means it’s fine to exercise several classes at once as long as they all reside in the memory and don’t reach out to a shared state, through which the tests can communicate and affect each other’s execution context. Typical examples of such a shared state are out-of-process dependencies—the database, the file system, and so on.

For instance, one test could create a customer in the database as part of its arrange phase, and another test would delete it as part of its own arrange phase, before the first test completes executing. If you run these two tests in parallel, the first test will fail, not because the production code is broken, but rather because of the interference from the second test.

Shared, private, and out-of-process dependencies

A shared dependency is a dependency that is shared between tests and provides means for those tests to affect each other’s outcome. A typical example of shared dependencies is a static mutable field. A change to such a field is visible across all unit tests running within the same process. A database is another typical example of a shared dependency.

A private dependency is a dependency that is not shared.

An out-of-process dependency is a dependency that runs outside the application’s execution process; it’s a proxy to data that is not yet in the memory. An out-of-process dependency corresponds to a shared dependency in the vast majority of cases, but not always. For example, a database is both out-of-process and shared. But if you launch that database in a Docker container before each test run, that would make this dependency out-of-process but not shared, since tests no longer work with the same instance of it. Similarly, a read-only database is also out-of-process but not shared, even if it’s reused by tests. Tests can’t mutate data in such a database and thus can’t affect each other’s outcome.

This take on the isolation issue entails a much more modest view on the use of mocks and other test doubles. You can still use them, but you normally do that for only those dependencies that introduce a shared state between tests. Figure 2.3 shows how it looks.

Figure 2.3

Note that shared dependencies are shared between unit tests, not between classes under test (units). In that sense, a singleton dependency is not shared as long as you are able to create a new instance of it in each test. While there’s only one instance of a singleton in the production code, tests may very well not follow this pattern and not reuse that singleton. Thus, such a dependency would be private.

For example, there’s normally only one instance of a configuration class, which is reused across all production code. But if it’s injected into the SUT the way all other dependencies are, say, via a constructor, you can create a new instance of it in each test; you don’t have to maintain a single instance throughout the test suite. You can’t create a new file system or a database, however; they must be either shared between tests or substituted away with test doubles.
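
A minimal sketch of the configuration case, in Python. Config, ShippingCalculator, the threshold field, and the 4.99 fee are all hypothetical names and values chosen for illustration. Because the configuration is passed in through the constructor, each test can build its own instance with whatever values it needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """Hypothetical configuration object; production code would create one."""

    free_shipping_threshold: float


class ShippingCalculator:
    """Hypothetical SUT that receives its configuration via the constructor."""

    def __init__(self, config):
        self._config = config

    def cost(self, order_total):
        if order_total >= self._config.free_shipping_threshold:
            return 0.0
        return 4.99


def test_shipping_is_free_above_threshold():
    # A fresh Config for this test only
    calculator = ShippingCalculator(Config(free_shipping_threshold=50.0))
    assert calculator.cost(60.0) == 0.0


def test_shipping_is_charged_below_threshold():
    # A different Config; the other test cannot observe it
    calculator = ShippingCalculator(Config(free_shipping_threshold=100.0))
    assert calculator.cost(60.0) == 4.99
```

Neither test reads or writes a process-wide configuration instance, so in these tests the dependency is private.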

Shared vs. volatile dependencies

Another term has a similar, yet not identical, meaning: volatile dependency. I recommend Dependency Injection: Principles, Practices, Patterns by Steven van Deursen and Mark Seemann (Manning Publications, 2018) as a go-to book on the topic of dependency management.

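A volatile dependency is a dependency that exhibits one of the following properties:

1. It requires setting up and configuring a runtime environment beyond what is installed on a developer’s machine by default. A database is an example.
2. It behaves nondeterministically. A random number generator is an example.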

As you can see, there’s an overlap between the notions of shared and volatile dependencies. For example, a dependency on the database is both shared and volatile. But that’s not the case for the file system. The file system is not volatile because it is installed on every developer’s machine and it behaves deterministically in the vast majority of cases. Still, the file system introduces a means by which the unit tests can interfere with each other’s execution context; hence it is shared. Likewise, a random number generator is volatile, but because you can supply a separate instance of it to each test, it isn’t shared.
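
The random number generator case can be sketched in Python as follows. DiscountLottery and its discount values are hypothetical, and the seed 42 is arbitrary. The point is only that each test hands the SUT its own random.Random instance, so the dependency stays volatile but is not shared between tests.

```python
import random


class DiscountLottery:
    """Hypothetical SUT that draws a discount using an injected generator."""

    def __init__(self, rng):
        self._rng = rng

    def draw(self):
        return self._rng.choice([0, 5, 10])


def test_draws_are_reproducible_with_a_private_seeded_generator():
    # Each lottery gets its own generator; nothing is shared between them
    first = DiscountLottery(random.Random(42))
    second = DiscountLottery(random.Random(42))

    first_draws = [first.draw() for _ in range(5)]
    second_draws = [second.draw() for _ in range(5)]

    # Same seed, separate instances: identical sequences
    assert first_draws == second_draws


def test_draws_come_from_the_allowed_discounts():
    lottery = DiscountLottery(random.Random(7))
    assert all(lottery.draw() in (0, 5, 10) for _ in range(20))
```

Had the class called the module-level random.choice instead, every test in the process would draw from the same global generator, which would make the dependency shared as well as volatile.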

Another reason for substituting shared dependencies is to increase the test execution speed. Shared dependencies almost always reside outside the execution process, while private dependencies usually don’t cross that boundary. Because of that, calls to shared dependencies, such as a database or the file system, take more time than calls to private dependencies. And since the necessity to run quickly is the second attribute of the unit test definition, such calls push the tests with shared dependencies out of the realm of unit testing and into the area of integration testing. I talk more about integration testing later in this chapter.

This alternative view of isolation also leads to a different take on what constitutes a unit (a small piece of code). A unit doesn’t necessarily have to be limited to a class. You can just as well unit test a group of classes, as long as none of them is a shared dependency.
