The problem with End-to-end Testing

It sounds like a good idea to implement end-to-end testing on your software. But is it really? Is that the only approach? Let’s find out.

by Fabio Ferreira

December 1, 2022


It sounds like a good idea to implement end-to-end testing on your software. But is it really? For anything but simple systems, end-to-end tests tend to be big, complicated, and fragile. To get a good handle on your system under test, you need more control over the variables than they give you. On the other hand, it is a good idea to test your system thoroughly in production-like test environments and realistic scenarios.

So what’s the difference between these two ideas? And how do we get the good one and avoid the bad one’s traps?

What is End-To-End Testing?

When most companies talk about “end-to-end testing,” they mean testing the whole system in a place that looks like the production environment. That makes sense. We want to test how our system works in a setting that looks and feels like real life, checking its configuration and any changes to its infrastructure that could change how it works. We also want to make sure that the new features we’ve added actually work and that we haven’t broken anything that was already working. The problem is how to do that.

The most typical method I’ve found for performing end-to-end testing is to set up a staging environment. The idea, especially for large software development teams, is to keep a sort of shadow copy of the production system. Each team works on changes to its own part of the system. As part of the release procedure, the new versions must be moved to the staging environment before release. There they are evaluated, generally by hand, alongside the new versions from all the other teams to check that everything works together. Once all the teams have confirmed everything is working, the release can be moved to the main production system.

Benefits of a staging environment

This technique can help prevent errors and bugs from being released to the production system and ensure that the changes are compatible with the existing system. It also allows for more structured and organized testing by creating a dedicated environment for the testing process.

A staging environment is also usually less exposed than production: it typically isn’t open to public traffic, so it is a smaller target for malicious activity. Additionally, it can be used to test more complex features, since there is no need to deploy the changes to the production system to find out whether they work correctly.

This testing approach can also help us determine the root cause of problems. A staging environment can isolate the changes made to the system and identify which change is causing the issue. That can help us pinpoint the exact issue faster so that it can be fixed more efficiently.

Cons of a staging environment

But, most of the time, this testing is done by teams of people who have nothing to do with making the software. Most likely, the testers and the people making the changes have never met. In dysfunctional organizations, the testers might not even know what the changes are for. Most companies that test this way have paranoid testers, and for good reason. So, everything gets tested. That takes a long time, costs a lot, and doesn’t produce a good product. The testing happens too late to help make the product better. It’s just a way to ensure the biggest mistakes don’t get through.

As I said, the testers have good reason to be nervous. Something is likely to go wrong at this point.

If this is the first time the parts of a much bigger whole are being put together, it’s unlikely that everything will go smoothly. Even with careful planning, there are a lot of ways that software, and the people who make it, can get things wrong.

When organizations work this way, they almost always move at a very slow pace. There is usually a gap of weeks or months between releases, and so between rounds of end-to-end testing. That means a considerable number of changes go into each release, which makes it more likely that something will fail. And when something does go wrong, the cost of finding and fixing it rises steeply with the number of changes, because any one of them could be the cause.

These are signs of a bigger, more serious problem.

A bigger problem with End-to-End Testing

The bigger and more complicated something is, the less well we can test it, because we lose control of the variables. Imagine that we are testing System B, which receives data from System A and passes it to System C. To test System B properly end to end, we need to know how System A and System C are behaving and changing. Everything we do is at a distance.

Isolating tests

Test isolation is the main idea here. A good test is deterministic: it has only one answer. It exercises our system in a simple, predictable way, with the system in exactly the state the test needs. We want the same result every time we run the test on the same version of the software, no matter what time of day it is or what else is happening in the system at the same moment.

When running a test, we want it to be isolated from any changes that may have occurred elsewhere. This means we don’t want our test to be affected by changes to other parts of the system, or by any other external factors that may be acting on it.
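As a minimal sketch of what “no matter what time of day it is” can mean in practice, the example below passes the current time into the code under test instead of letting it read the system clock. The function `is_expired` and the session scenario are hypothetical, invented purely for illustration.

```python
from datetime import datetime, timedelta, timezone


def is_expired(created_at: datetime, ttl: timedelta, now: datetime) -> bool:
    """Return True once `ttl` has elapsed since `created_at`.

    `now` is passed in rather than read from the system clock, so a test
    decides what "the current time" is.
    """
    return now - created_at >= ttl


def test_session_expires_after_ttl() -> None:
    created = datetime(2022, 12, 1, 9, 0, tzinfo=timezone.utc)
    ttl = timedelta(minutes=30)

    # One minute before the deadline: still valid.
    assert not is_expired(created, ttl, now=created + timedelta(minutes=29))
    # Exactly at the deadline: expired.
    assert is_expired(created, ttl, now=created + timedelta(minutes=30))


test_session_expires_after_ttl()
```

Because nothing in the test depends on the real clock, it gives the same answer on every run against the same version of the code.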

What should tests look like?

We also want to ensure that the same test is run each time so the results are reliable and trustworthy. We’d also like it to do that without depending on anything other than our test and the system it’s testing. That is so that our test stays stable.

We want our tests to be targeted and focused. Tests should be short, accurate, easy to understand, and durable. They need to be short, so that they take little time to write and are clear about what they expect. They need to be accurate, so that they state exactly what the system should do. They need to be easy to understand, so that everyone involved in making the system can use them to figure out what the system needs to do. And they need to be rock-solid, which means that changes to the system can’t easily break them.

With the approach I prefer, these tests never become wrong because of changes to the system’s implementation. They become wrong only if users no longer want what the tests say should happen. Of course, they might stop working if the plumbing underneath them breaks. But then we fix the plumbing, not the tests.

End-to-End Testing and the lack of behavior control

What does this have to do with end-to-end testing? If we want to control the behavior in an end-to-end test, we need to know how to get System A into the right state to send us the information we want. That won’t be easier than just sending the information we care about straight to System B. So it’s likely that our test will be complicated.

I’m cheating a little here because, with my preferred approach, you can hide much of this complexity so that the test cases can still be short. But the test infrastructure and test data will get more complicated and harder to maintain.

Maybe you’re thinking: “So let’s put Systems A, C, D, and whatever else into the mix, so everything is tested.” Suppose we do end-to-end testing this way.

In that case, our tests won’t be easy to understand either. When we write them, we’ll probably be thinking about how our system interacts with System A or System C instead of what we want to learn about our own system. Our tests will also be less accurate, because we can’t control the state of Systems A, C, or D.

We can only guess what state our system ends up in after the external system talks to it, or find out by breaking encapsulation: digging in and reading or changing its private state.

That is a terrible idea because it couples the tests tightly to the internals of the system being tested. So they break often and are much harder to keep in good shape.
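To make the coupling problem concrete, here is a small sketch contrasting a test that peeks at private state with one that goes through the public interface. The `Basket` class and its methods are hypothetical, chosen only to illustrate the point.

```python
class Basket:
    """Hypothetical system under test with a deliberately private detail."""

    def __init__(self) -> None:
        self._lines: dict[str, int] = {}  # internal storage, free to change

    def add(self, sku: str, quantity: int = 1) -> None:
        self._lines[sku] = self._lines.get(sku, 0) + quantity

    def total_items(self) -> int:
        return sum(self._lines.values())


def test_coupled_to_private_state() -> None:
    basket = Basket()
    basket.add("apple", 2)
    # Fragile: breaks if `_lines` becomes a list or is renamed,
    # even though the basket still behaves correctly.
    assert basket._lines == {"apple": 2}


def test_through_public_interface() -> None:
    basket = Basket()
    basket.add("apple", 2)
    basket.add("pear")
    # Durable: only fails if the observable behavior changes.
    assert basket.total_items() == 3
```

Both tests pass today. Only the second one would survive a refactoring of how the basket stores its lines.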

Lastly, these tests won’t be durable. Keeping tests like these running as a stable, reliable suite is notoriously hard. By adding all the external systems to the mix, we’ve created more places where things can go wrong.

Controlling the behavior with CD

Instead, we’d like to deploy our system so that, as far as it’s concerned, it’s in production. That means it should have the same infrastructure, configuration, and deployment methods as the production system.

Then, we’d like to hook it up to a test rig: infrastructure for testing that lets us put our system under test into exactly the state we want it to be in. The rig communicates with the system through the interfaces that already exist and are public. Then, we trigger the behavior that we want to test, collect the results from the system’s normal outputs, and make assertions about them.

This way, we can be confident that our system is running in a production-like manner and that the features we’ve developed work as expected. We can also use the test rig to run our system through different scenarios and check that each of them works as expected. That gives us good evidence that our system is thoroughly tested and ready for production.

So, these kinds of tests are black-box tests that run through our system’s interfaces, but all of the necessary inputs are faked, and the outputs are collected.
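A rough sketch of such a rig, using the System A, B, and C example from earlier, might look like the following. Everything here is an assumption made for illustration: the order message shape, the `SystemB` class, and the `SystemBRig` helper are hypothetical, and a real rig would drive a deployed system over its real network interfaces rather than calling methods in the same process.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SystemB:
    """Hypothetical system under test: prices orders from A, forwards to C."""

    send_to_c: Callable[[dict], None]  # outbound interface toward System C

    def on_message_from_a(self, order: dict) -> None:
        # Inbound interface: the same entry point System A would call.
        if order["quantity"] <= 0:
            return  # invalid orders are dropped, nothing is forwarded
        total = order["quantity"] * order["unit_price"]
        self.send_to_c({"order_id": order["id"], "total": total})


@dataclass
class SystemBRig:
    """Fakes System A's input and records what would go to System C."""

    sent_to_c: list[dict] = field(default_factory=list)
    system: SystemB = field(init=False)

    def __post_init__(self) -> None:
        # The rig stands in for System C by collecting outbound messages.
        self.system = SystemB(send_to_c=self.sent_to_c.append)

    def system_a_sends(self, order: dict) -> None:
        self.system.on_message_from_a(order)


def test_valid_order_is_forwarded_with_total() -> None:
    rig = SystemBRig()
    rig.system_a_sends({"id": "o-1", "quantity": 3, "unit_price": 5})
    assert rig.sent_to_c == [{"order_id": "o-1", "total": 15}]


def test_invalid_order_is_dropped() -> None:
    rig = SystemBRig()
    rig.system_a_sends({"id": "o-2", "quantity": 0, "unit_price": 5})
    assert rig.sent_to_c == []
```

The tests never touch System B’s internals. They fake what arrives from System A, trigger the behavior, and assert on what would have been sent to System C.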

When automating your testing, one of the most important decisions you have to make is the scope: what is our system? What part of the whole are we responsible for, and what behavior do we need to test? A useful way to answer that comes from Continuous Delivery, aka CD.

What is Continuous Delivery (CD)?

CD involves creating a pipeline of automated tests to ensure that new features and changes are continuously tested and deployed without compromising quality. Additionally, it is essential to be aware of the testing frameworks and tools available for automating tests and the different types of tests that can be automated.

My favorite way to explain CD quickly is that we work to make sure that our software is always in a state where it can be released. Obviously, that’s different for each system. But one thing is for sure: if you or someone else needs to test your change some more before it can be released, it is not yet ready to be released.

So the rule is that if your deployment pipeline says, “All is good,” you should be happy to put the code into production without doing anything else. So, how do we decide if our software is ready to be released?

The deployment pipeline

The scope of a deployment pipeline is a releasable unit of software, whatever that means for your system or subsystem. And by definition, acceptance tests are used to decide whether your software is ready to be released. So the acceptance tests also test a releasable unit of software. They run in the deployment pipeline and check that everything works as it should.

So, these tests are like end-to-end tests, but only for our software and nothing else. That means we can test every part of our software, including how it is configured for deployment and how it behaves when the infrastructure it depends on changes.

End-To-End testing and external systems

We do this without testing external software that we don’t directly control. Instead, we fake the interactions between our software and the external systems it works with during tests. That gives us what I call “measurement points” for our system. We can provide inputs and collect outputs to make our tests as realistic as possible while still keeping them as simple as possible for our system. These tests may still be challenging, but they will only be about the system we are building.

You might be worried about how the systems will behave together in these scenarios. That’s a fair concern: people don’t feel safe without end-to-end testing. But if we’ve thoroughly tested our system with our acceptance tests, we only care about two things: does System A talk to us the way we expect it to, and can we talk to System C the way we think we should? When we look at testing this way, we don’t need a lot of complicated tests. We need a few more tests that are specifically designed to check these interfaces. With those tests in place, we can be confident that when data moves from one system to another, it arrives the way we expect.

These more focused tests are better at checking the interfaces than pushing generic cases through the whole end-to-end system. This approach is known as contract testing. With it, we can cover more cases, and we can test these interface points better and more precisely.

Contract tests over End-to-End Testing

There is something else we can do. If we write our assumptions about how we will interact with Systems A and C as “contract tests”, we can give them to teams A and C. They can then run our tests in their deployment pipelines. If they make a change that breaks our assumptions, our tests will fail. Then they have to decide what to do next: fix their code so it no longer breaks our test, or talk to us about how we need to change our assumptions and our contract test.
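As a hedged illustration of the idea, a contract test can be as simple as an executable statement of what we assume about System A’s messages, packaged so that team A can run it against their own code. The field names, `contract_violations`, `check_producer`, and the two stand-in producers below are all hypothetical. Dedicated contract-testing tools exist, and this sketch does not reproduce any of them.

```python
from typing import Any, Callable

# What our system (B) assumes about every order message from System A.
REQUIRED_FIELDS: dict[str, type] = {"id": str, "quantity": int, "unit_price": int}


def contract_violations(message: dict[str, Any]) -> list[str]:
    """Return a list of ways `message` breaks our assumptions (empty = OK)."""
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in message:
            problems.append(f"missing field: {name}")
        elif not isinstance(message[name], expected_type):
            problems.append(f"{name} should be {expected_type.__name__}")
    return problems


def check_producer(build_order_message: Callable[[], dict[str, Any]]) -> None:
    """The contract test team A runs in its pipeline against its real code."""
    problems = contract_violations(build_order_message())
    assert not problems, f"contract with System B broken: {problems}"


# Stand-ins for team A's producer, before and after a breaking change.
def producer_v1() -> dict[str, Any]:
    return {"id": "o-1", "quantity": 3, "unit_price": 5}


def producer_v2() -> dict[str, Any]:
    return {"id": "o-1", "qty": 3, "unit_price": 5}  # renamed a field


check_producer(producer_v1)  # passes: the assumptions still hold
```

Running `check_producer(producer_v2)` raises an `AssertionError` naming the missing `quantity` field, which is the signal for team A either to fix the change or to come and talk to us about the contract.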

This method is sometimes called “consumer-driven contract testing”. Of course, it falls apart when the connections between our system and other systems can’t be trusted, which is often the case in “spaghetti-code-like” systems.

Once we have this new idea of end-to-end testing for our software, but not for everyone else’s, we can do some clever things with our test infrastructure.

Conclusion

We start by building a sophisticated test rig that our software plugs into. That lets us put the software into whatever state a given test needs, efficiently and accurately. We can also use the same setup to run many tests in parallel and get accurate results quickly.

Ultimately, it is up to you to decide whether or not an end-to-end testing process is suitable for your system. Automated tests can help speed up the process and reduce potential errors. That will help you get an accurate picture of your system’s performance.

Here at Talendor, our experts are ready to implement the right testing approach for your system, whether that means end-to-end tests, integration tests, or unit tests.


Fabio Ferreira

Tech lead and Talent Acquisition Specialist. Helping SaaS companies and scrappy startups. Nothing makes me happier than meeting new people, building new relationships, solving issues, and helping enterprises succeed.