wip-1.2

2026-06-05 ⏳ 3.0 min read (1.2k words)

1.2 The goal of unit testing

Before taking a deep dive into the topic of unit testing, let’s step back and consider the goal that unit testing helps you achieve. It’s often said that unit testing practices lead to a better design. And it’s true: having to write unit tests for a code base normally leads to a better design. But that’s not the main goal of unit testing; it’s merely a pleasant side effect.

The relationship between unit testing and code design

The ability to unit test a piece of code is a nice litmus test, but it only works in one direction. It’s a good negative indicator: it points out poor-quality code with relatively high accuracy. If you find that code is hard to unit test, it’s a strong sign that the code needs improvement. The poor quality usually manifests itself in tight coupling, which means different pieces of production code are not sufficiently decoupled from each other and are therefore hard to test separately.

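
To make the point about tight coupling concrete, here is a minimal sketch in Python. All names in it (`SmtpMailer`, `CoupledSignup`, `Signup`, `FakeMailer`) are hypothetical and chosen only for illustration; they are not taken from any real code base. The first class creates its collaborator internally, so it can't be exercised without that collaborator; the second receives it from the outside, so a test can substitute a stand-in.

```python
class SmtpMailer:
    """Hypothetical collaborator that would talk to a real mail server."""

    def send(self, to: str, body: str) -> None:
        # Stands in for a network call that is unavailable in a unit test.
        raise RuntimeError("no mail server available in this sketch")


class CoupledSignup:
    """Tightly coupled: builds its own mailer, so it can't run without one."""

    def register(self, email: str) -> str:
        SmtpMailer().send(email, "Welcome!")
        return email.lower()


class Signup:
    """Decoupled: the mailer is passed in, so a test can supply a fake."""

    def __init__(self, mailer) -> None:
        self._mailer = mailer

    def register(self, email: str) -> str:
        normalized = email.lower()
        self._mailer.send(normalized, "Welcome!")
        return normalized


class FakeMailer:
    """Test double that records messages instead of sending them."""

    def __init__(self) -> None:
        self.sent = []

    def send(self, to: str, body: str) -> None:
        self.sent.append((to, body))


def test_register_normalizes_email_and_sends_welcome():
    fake = FakeMailer()
    assert Signup(fake).register("Ann@Example.com") == "ann@example.com"
    assert fake.sent == [("ann@example.com", "Welcome!")]
```

Writing an equivalent test for `CoupledSignup` isn't possible without a mail server, and that difficulty is the negative signal the paragraph above describes.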

Unfortunately, the ability to unit test a piece of code is a bad positive indicator. The fact that you can easily unit test your code base doesn’t necessarily mean it’s of good quality. The project can be a disaster even when it exhibits a high degree of decoupling.

What is the goal of unit testing, then? The goal is to enable sustainable growth of the software project. The term sustainable is key. It’s quite easy to grow a project, especially when you start from scratch. It’s much harder to sustain this growth over time.

Figure 1.1 shows the growth dynamic of a typical project without tests. You start off quickly because there’s nothing dragging you down. No bad architectural decisions have been made yet, and there isn’t any existing code to worry about. As time goes by, however, you have to put in more and more hours to make the same amount of progress you showed at the beginning. Eventually, the development speed slows down significantly, sometimes even to the point where you can’t make any progress whatsoever.

Figure 1.1

This phenomenon of quickly decreasing development speed is also known as software entropy. Entropy (the amount of disorder in a system) is a mathematical and scientific concept that can also apply to software systems. (If you’re interested in the math and science of entropy, look up the second law of thermodynamics.)

In software, entropy manifests in the form of code that tends to deteriorate. Each time you change something in a code base, the amount of disorder in it, or entropy, increases. If left without proper care, such as constant cleaning and refactoring, the system becomes increasingly complex and disorganized. Fixing one bug introduces more bugs, and modifying one part of the software breaks several others—it’s like a domino effect. Eventually, the code base becomes unreliable. And worst of all, it’s hard to bring it back to stability.

Tests help overturn this tendency. They act as a safety net—a tool that provides insurance against a vast majority of regressions. Tests help make sure the existing functionality works, even after you introduce new features or refactor the code to better fit new requirements.

DEFINITION A regression is when a feature stops working as intended after a certain event (usually, a code modification). The terms regression and software bug are synonyms and can be used interchangeably.

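
As a rough illustration of the safety-net idea, consider the following Python sketch. The function `total_cents` and its `discount_percent` parameter are hypothetical: assume the discount was added later as a new feature. The first test pins down the behavior that existed before that change, so if the new feature accidentally altered the plain sum, the test would fail and expose the regression.

```python
def total_cents(prices_cents, discount_percent=0):
    """Sum prices in cents, then apply an optional whole-percent discount."""
    subtotal = sum(prices_cents)
    return subtotal * (100 - discount_percent) // 100


def test_total_without_discount_is_plain_sum():
    # Guards existing functionality: no discount means an unchanged sum.
    assert total_cents([1000, 550]) == 1550


def test_total_with_discount():
    # Covers the newly introduced (hypothetical) discount feature.
    assert total_cents([10000], discount_percent=20) == 8000
```

Prices are kept as integer cents here only to sidestep floating-point rounding in the assertions.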

The downside here is that tests require initial—sometimes significant—effort. But they pay for themselves in the long run by helping the project to grow in the later stages. Software development without the help of tests that constantly verify the code base simply doesn’t scale.

Sustainability and scalability are the keys. They allow you to maintain development speed in the long run.

1.2.1 What makes a good or bad test?

Although unit testing helps maintain project growth, it’s not enough to just write tests. Badly written tests still lead to the same outcome: development eventually stagnates.

As shown in figure 1.2, bad tests do help to slow down code deterioration at the beginning: the decline in development speed is less prominent compared to the situation with no tests at all. But nothing really changes in the grand scheme of things. It might take longer for such a project to enter the stagnation phase, but stagnation is still inevitable.

Figure 1.2

Remember, not all tests are created equal. Some of them are valuable and contribute a lot to overall software quality. Others don’t. They raise false alarms, don’t help you catch regression errors, and are slow and difficult to maintain. It’s easy to fall into the trap of writing unit tests for the sake of unit testing without a clear picture of whether it helps the project.

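
One common way a test ends up raising false alarms is by checking how the code stores its data rather than what the code does. The Python sketch below assumes a hypothetical `ShoppingCart` class to contrast the two styles; it illustrates the idea and is not a complete treatment of what makes a test valuable.

```python
class ShoppingCart:
    def __init__(self):
        self._items = []  # internal storage, an implementation detail

    def add(self, name, price_cents):
        self._items.append((name, price_cents))

    def total_cents(self):
        return sum(price for _, price in self._items)


def test_brittle_checks_internal_list():
    # Tied to the internal representation: switching _items to a dict
    # would fail this test even though the cart still behaves correctly.
    cart = ShoppingCart()
    cart.add("book", 1200)
    assert cart._items == [("book", 1200)]


def test_checks_observable_total():
    # Tied only to observable behavior, so it survives internal refactoring.
    cart = ShoppingCart()
    cart.add("book", 1200)
    cart.add("pen", 300)
    assert cart.total_cents() == 1500
```

Both tests pass today. The difference shows up during refactoring: the first one breaks when the storage changes, while the second keeps passing as long as the total is still correct.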

You can’t achieve the goal of unit testing by just throwing more tests at the project. You need to consider both the test’s value and its upkeep cost. The cost component is determined by the amount of time spent on activities such as maintaining the test as the underlying code changes, running it, and dealing with the false alarms it raises.

It’s easy to create tests whose net value is close to zero or even negative due to high maintenance costs. To enable sustainable project growth, you have to focus exclusively on high-quality tests; those are the only tests worth keeping in the test suite.

Production code vs. test code

People often think production code and test code are different. Tests are assumed to be an addition to production code and have no cost of ownership. By extension, people often believe that the more tests, the better. This isn’t the case. Code is a liability, not an asset. The more code you introduce, the more you extend the surface area for potential bugs in your software, and the higher the project’s upkeep cost. It’s always better to solve problems with as little code as possible.

Tests are code, too. You should view them as the part of your code base that aims at solving a particular problem: ensuring the application’s correctness. Unit tests, just like any other code, are also vulnerable to bugs and require maintenance.

It’s crucial to learn how to differentiate between good and bad unit tests. I cover this topic in chapter 4.
