2.3 Contrasting the classical and London schools
As you saw in this chapter, the London school of unit testing provides three main benefits over the classical school:
- Better granularity. Since the tests check a single class, they are better at identifying which exact functionality failed.
- It’s easier to unit test a larger graph of interconnected classes. Since all collaborators are replaced by test doubles, you don’t need to worry about them at the time of writing the test.
- If a test fails, you know for sure which functionality has failed. Without the class’s collaborators, there could be no suspects other than the class under test itself.
Of course, there may still be situations where the system under test uses a value object and it’s the change in this value object that makes the test fail. But these cases aren’t that frequent because all other dependencies are eliminated in tests.
2.3.1 Unit testing one class at a time
The point about better granularity relates to the discussion about what constitutes a unit in unit testing. The London school considers a class as such a unit. Coming from an object-oriented programming background, developers usually regard classes as the atomic building blocks that lie at the foundation of every code base. This naturally leads to treating classes as the atomic units to be verified in tests, too. This tendency is understandable but misleading.
TIP Tests shouldn’t verify units of code. Rather, they should verify units of behavior: something that is meaningful for the problem domain and, ideally, something that a business person can recognize as useful. The number of classes it takes to implement such a unit of behavior is irrelevant. The unit could span multiple classes, fit in a single class, or take up just a tiny method.
And so, aiming at better code granularity isn’t helpful. As long as the test checks a single unit of behavior, it’s a good test. Targeting something less than that can in fact damage your unit tests, as it becomes harder to understand exactly what these tests verify. A test should tell a story about the problem your code helps to solve, and this story should be cohesive and meaningful to a non-programmer.
For instance, this is an example of a cohesive story:
When I call my dog, he comes right to me.
Now compare it to the following:
When I call my dog, he moves his front left leg first, then the front right leg, his head turns, the tail starts wagging...
The second story makes much less sense. What’s the purpose of all those movements? Is the dog coming to me? Or is he running away? You can’t tell. This is what your tests start to look like when you target individual classes (the dog’s legs, head, and tail) instead of the actual behavior (the dog coming to his master). I talk more about this topic of observable behavior and how to differentiate it from internal implementation details in chapter 5.
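As a minimal sketch of what a test of a unit of behavior can look like, consider the Python example below. The `Store` and `Customer` classes and their method names are hypothetical, introduced only for illustration. Each test name describes an outcome that matters to the business ("a purchase succeeds when there is enough inventory"), even though the behavior spans two classes.

```python
class Store:
    """Hypothetical in-memory store; an assumption made for this sketch."""

    def __init__(self):
        self._inventory = {}

    def add_inventory(self, product, quantity):
        self._inventory[product] = self._inventory.get(product, 0) + quantity

    def get_inventory(self, product):
        return self._inventory.get(product, 0)

    def has_enough(self, product, quantity):
        return self.get_inventory(product) >= quantity

    def remove_inventory(self, product, quantity):
        if not self.has_enough(product, quantity):
            raise ValueError("not enough inventory")
        self._inventory[product] -= quantity


class Customer:
    """Hypothetical customer; purchasing involves both Customer and Store."""

    def purchase(self, store, product, quantity):
        if not store.has_enough(product, quantity):
            return False
        store.remove_inventory(product, quantity)
        return True


def test_purchase_succeeds_when_enough_inventory():
    # Arrange: real collaborators, no test doubles.
    store = Store()
    store.add_inventory("shampoo", 10)
    customer = Customer()

    # Act
    success = customer.purchase(store, "shampoo", 5)

    # Assert: the outcome a business person would recognize.
    assert success
    assert store.get_inventory("shampoo") == 5


def test_purchase_fails_when_not_enough_inventory():
    store = Store()
    store.add_inventory("shampoo", 10)
    customer = Customer()

    success = customer.purchase(store, "shampoo", 15)

    assert not success
    assert store.get_inventory("shampoo") == 10
```

Neither test mentions how `Customer` talks to `Store`. Both read like the first dog story: a call is made and a meaningful result follows.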
2.3.2 Unit testing a large graph of interconnected classes
The use of mocks in place of real collaborators can make it easier to test a class—especially when there’s a complicated dependency graph, where the class under test has dependencies, each of which relies on dependencies of its own, and so on, several layers deep. With test doubles, you can substitute the class’s immediate dependencies and thus break up the graph, which can significantly reduce the amount of preparation you have to do in a unit test.
If you follow the classical school, you have to re-create the full object graph (with the exception of shared dependencies) just for the sake of setting up the system under test, which can be a lot of work.
Although this is all true, this line of reasoning focuses on the wrong problem. Instead of finding ways to test a large, complicated graph of interconnected classes, you should focus on not having such a graph of classes in the first place. More often than not, a large class graph is a result of a code design problem.
It’s actually a good thing that the tests point out this problem. As we discussed in chapter 1, the ability to unit test a piece of code is a good negative indicator: when code is hard to unit test, that difficulty predicts poor code quality with relatively high precision. If you see that to unit test a class, you need to extend the test’s arrange phase beyond all reasonable limits, it’s a sure sign of trouble. The use of mocks only hides this problem; it doesn’t tackle the root cause. I talk about how to fix the underlying code design problem in part 2.
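The difference in setup effort discussed in this section can be sketched in Python with the standard `unittest.mock` module. The four classes below (`RateTable`, `TaxCalculator`, `PricingService`, `OrderService`) are hypothetical and deliberately tiny; they stand in for a dependency chain several layers deep. Amounts are integer cents to keep the arithmetic exact.

```python
from unittest.mock import Mock


class RateTable:
    """Hypothetical leaf dependency."""

    def rate_percent(self, region):
        return {"EU": 20}.get(region, 0)


class TaxCalculator:
    def __init__(self, rate_table):
        self._rate_table = rate_table

    def tax(self, amount_cents, region):
        return amount_cents * self._rate_table.rate_percent(region) // 100


class PricingService:
    def __init__(self, tax_calculator):
        self._tax_calculator = tax_calculator

    def total(self, amount_cents, region):
        return amount_cents + self._tax_calculator.tax(amount_cents, region)


class OrderService:
    """The system under test in both tests below."""

    def __init__(self, pricing):
        self._pricing = pricing

    def checkout(self, amount_cents, region):
        return {"charged": self._pricing.total(amount_cents, region)}


def test_checkout_london_style():
    # Only the immediate dependency is replaced; the rest of the graph
    # never has to be constructed.
    pricing = Mock()
    pricing.total.return_value = 12000
    service = OrderService(pricing)

    assert service.checkout(10000, "EU") == {"charged": 12000}
    pricing.total.assert_called_once_with(10000, "EU")


def test_checkout_classical_style():
    # The arrange phase re-creates the whole object graph.
    service = OrderService(PricingService(TaxCalculator(RateTable())))

    assert service.checkout(10000, "EU") == {"charged": 12000}
```

With three layers the classical arrange phase is still one line. If that line grows to dozens of constructor calls, the growth itself is the signal this section describes, and the mock in the first test would only hide it.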
2.3.3 Revealing the precise bug location
If you introduce a bug to a system with London-style tests, it normally causes only tests whose SUT contains the bug to fail. However, with the classical approach, tests that target the clients of the malfunctioning class can also fail. This leads to a ripple effect where a single bug can cause test failures across the whole system. As a result, it becomes harder to find the root of the issue. You might need to spend some time debugging the tests to figure it out.
It’s a valid concern, but I don’t see it as a big problem. If you run your tests regularly (ideally, after each source code change), then you know what caused the bug—it’s what you edited last, so it’s not that difficult to find the issue. Also, you don’t have to look at all the failing tests. Fixing one automatically fixes all the others.
Furthermore, there’s some value in failures cascading all over the test suite. If a bug leads to a fault in not only one test but a whole lot of them, it shows that the piece of code you have just broken is of great value—the entire system depends on it. That’s useful information to keep in mind when working with the code.
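The ripple effect described in this section can be reproduced in a few lines. In the Python sketch below, `BuggyDiscount` and `Cart` are hypothetical classes, and the bug is planted on purpose. The checks are plain functions collected by a small `failing` helper (also an assumption of this sketch) so that the failures can be listed rather than aborting the run.

```python
from unittest.mock import Mock


class BuggyDiscount:
    """Hypothetical collaborator with a deliberately introduced bug."""

    def apply(self, amount):
        return amount + 10  # bug: the discount should subtract 10


class Cart:
    """Hypothetical client of the discount class."""

    def __init__(self, discount):
        self._discount = discount

    def total(self, prices):
        return self._discount.apply(sum(prices))


def check_discount_subtracts_ten():
    # Targets the class that actually contains the bug.
    assert BuggyDiscount().apply(100) == 90


def check_cart_total_classical():
    # Classical style: the real (buggy) collaborator is used.
    assert Cart(BuggyDiscount()).total([60, 40]) == 90


def check_cart_total_london():
    # London style: the collaborator is replaced by a mock.
    discount = Mock()
    discount.apply.return_value = 90
    assert Cart(discount).total([60, 40]) == 90
    discount.apply.assert_called_once_with(100)


def failing(checks):
    """Run each check and return the names of those that fail."""
    failed = []
    for check in checks:
        try:
            check()
        except AssertionError:
            failed.append(check.__name__)
    return failed


failed_checks = failing([
    check_discount_subtracts_ten,
    check_cart_total_classical,
    check_cart_total_london,
])
```

Here `failed_checks` contains the discount check and the classical cart check, while the London-style cart check still passes. One bug produced two failures, but both point at the same recent edit, and fixing `BuggyDiscount.apply` clears them together.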
2.3.4 Other differences between the classical and London schools
Two remaining differences between the classical and London schools are:
- Their approach to system design with test-driven development (TDD)
- The issue of over-specification
Test-driven development
Test-driven development is a software development process that relies on tests to drive the project development. The process consists of three (some authors specify four) stages, which you repeat for every test case:
- Write a failing test to indicate which functionality needs to be added and how it should behave.
- Write just enough code to make the test pass. At this stage, the code doesn’t have to be elegant or clean.
- Refactor the code. Under the protection of the passing test, you can safely clean up the code to make it more readable and maintainable.
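The three stages above can be sketched with a small Python example. The shipping rule (a flat fee of 5 below an order total of 50, free otherwise) and all function names are assumptions made up for this illustration. The same check is run against a placeholder, a first-pass implementation, and a refactored one.

```python
def check_shipping_rules(shipping_cost):
    """Stage 1: the test, written before any production code exists."""
    assert shipping_cost(20) == 5
    assert shipping_cost(50) == 0


def shipping_cost_not_written_yet(order_total):
    # Placeholder so that the failing ("red") state can be demonstrated.
    raise NotImplementedError


def shipping_cost_first_pass(order_total):
    # Stage 2: just enough code to make the test pass; magic numbers
    # and a clumsy branch are acceptable here.
    if order_total < 50:
        return 5
    else:
        return 0


FREE_SHIPPING_THRESHOLD = 50
FLAT_SHIPPING_FEE = 5


def shipping_cost(order_total):
    # Stage 3: refactored under the protection of the passing test.
    return 0 if order_total >= FREE_SHIPPING_THRESHOLD else FLAT_SHIPPING_FEE


def passes(implementation):
    """Return True if the implementation satisfies the test."""
    try:
        check_shipping_rules(implementation)
    except (AssertionError, NotImplementedError):
        return False
    return True
```

Calling `passes` on the placeholder returns `False`, while both later versions return `True`. The test never changed between stage 2 and stage 3, which is what makes the refactoring safe.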
Good sources on this topic are the two books I recommended earlier: Kent Beck’s Test-Driven Development: By Example, and Growing Object-Oriented Software, Guided by Tests by Steve Freeman and Nat Pryce.
The London style of unit testing leads to outside-in TDD, where you start from the higher-level tests that set expectations for the whole system. By using mocks, you specify which collaborators the system should communicate with to achieve the expected result. You then work your way through the graph of classes until you implement every one of them.
Mocks make this design process possible because you can focus on one class at a time. You can cut off all of the SUT’s collaborators when testing it and thus postpone implementing those collaborators to a later time.
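A rough sketch of one outside-in step, using Python’s standard `unittest.mock`: the hypothetical `Registration` class is written first, while its two collaborators (a user repository and a mailer) do not exist yet. The method names `exists`, `add`, and `send_welcome` are assumptions; in this style, the tests are where those names get decided.

```python
from unittest.mock import Mock


class Registration:
    """Hypothetical higher-level class, implemented before its collaborators."""

    def __init__(self, users, mailer):
        self._users = users
        self._mailer = mailer

    def register(self, email):
        if self._users.exists(email):
            return False
        self._users.add(email)
        self._mailer.send_welcome(email)
        return True


def test_new_user_is_stored_and_welcomed():
    # Neither collaborator has a real implementation yet; the mocks
    # specify what the SUT expects from them.
    users = Mock()
    users.exists.return_value = False
    mailer = Mock()

    registered = Registration(users, mailer).register("ann@example.com")

    assert registered
    users.add.assert_called_once_with("ann@example.com")
    mailer.send_welcome.assert_called_once_with("ann@example.com")


def test_existing_user_is_rejected():
    users = Mock()
    users.exists.return_value = True
    mailer = Mock()

    registered = Registration(users, mailer).register("ann@example.com")

    assert not registered
    users.add.assert_not_called()
    mailer.send_welcome.assert_not_called()
```

The next step in this style would be to pick one of the mocked collaborators, write its tests, and implement it, repeating until the graph is complete.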
The classical school doesn’t provide quite the same guidance since you have to deal with the real objects in tests. Instead, you normally use the inside-out approach. In this style, you start from the domain model and then put additional layers on top of it until the software becomes usable by the end user.
But the most crucial distinction between the schools is the issue of over-specification: that is, coupling the tests to the SUT’s implementation details. The London style tends to produce tests that couple to the implementation more often than the classical style does. This is the main objection to the ubiquitous use of mocks and to the London style in general.
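To illustrate over-specification, the Python sketch below runs two checks against two implementations that return the same result. Everything here is hypothetical: `InMemoryPrices`, the two `order_total` versions, and the check functions. The second version is a refactoring that fetches prices in one batch instead of one call per item.

```python
from unittest.mock import Mock, call


class InMemoryPrices:
    """Hypothetical price catalog used as a real collaborator."""

    def __init__(self, prices):
        self._prices = dict(prices)

    def price_of(self, item):
        return self._prices[item]

    def prices_of(self, items):
        return [self._prices[item] for item in items]


def order_total_v1(catalog, items):
    # Original implementation: one lookup per item.
    return sum(catalog.price_of(item) for item in items)


def order_total_v2(catalog, items):
    # Refactored implementation: one batched lookup, same result.
    return sum(catalog.prices_of(items))


def output_based_check(order_total):
    """Verifies only the observable result."""
    catalog = InMemoryPrices({"soap": 3, "shampoo": 7})
    return order_total(catalog, ["soap", "shampoo"]) == 10


def interaction_based_check(order_total):
    """Also verifies how the SUT talks to its collaborator."""
    prices = {"soap": 3, "shampoo": 7}
    catalog = Mock()
    catalog.price_of.side_effect = lambda item: prices[item]
    catalog.prices_of.return_value = [3, 7]

    total = order_total(catalog, ["soap", "shampoo"])
    try:
        # This expectation encodes an implementation detail.
        catalog.price_of.assert_has_calls([call("soap"), call("shampoo")])
    except AssertionError:
        return False
    return total == 10
```

The output-based check accepts both versions. The interaction-based check accepts `order_total_v1` but rejects `order_total_v2`, even though the refactoring did not change what the caller observes: a false alarm caused by coupling the test to how the result is computed.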
There’s much more to the topic of mocking. Starting with chapter 4, I gradually cover everything related to it.