Beyond Page Objects: liberate yourself from the chains of UI-think!

John Ferguson Smart | Mentor | Author | Speaker - Author of 'BDD in Action'.
Helping teams deliver more valuable software sooner
25th July 2017

So you have Page Objects in your test automation suite? That's great! But it's not enough! Page Objects are a great start, but you need to go further if you want truly sustainable, high quality test automation.

The Page Objects origin story

Page Objects are a popular automated web testing pattern, first implemented for Selenium by Simon Stewart back in 2009. The idea is to model web pages or UI components as objects. You can then reuse these objects in different tests, which avoids duplication and simplifies maintenance.

The idea behind Page Objects is a good one. Keeping the WebDriver selectors for a given page in one place makes it easier to keep the test suite up to date when the page changes. Page Objects give you a layer of abstraction between your test logic and the elements on the page you want to manipulate. This way, you only need to worry about which element you want to use, not how to locate it.
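As a rough illustration of "selectors in one place", here is a minimal, self-contained sketch. It does not use Selenium: `Browser` and `InMemoryBrowser` are hypothetical stand-ins for a real driver so the example runs on its own, and the class name `TodoPage` and both selectors are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a browser driver, so the sketch runs without Selenium. */
interface Browser {
    void typeInto(String selector, String text);
    List<String> textsOf(String selector);
}

/** In-memory fake: remembers whatever is typed into the (assumed) new-todo field. */
class InMemoryBrowser implements Browser {
    private final List<String> todos = new ArrayList<String>();

    @Override
    public void typeInto(String selector, String text) {
        if (selector.equals(".new-todo")) {
            todos.add(text);
        }
    }

    @Override
    public List<String> textsOf(String selector) {
        if (selector.equals(".todo-list li")) {
            return new ArrayList<String>(todos);
        }
        return new ArrayList<String>();
    }
}

/** Page Object: the only class that knows how the page elements are located. */
class TodoPage {
    // Assumed selectors. If the markup changes, only these two lines change.
    private static final String NEW_TODO_FIELD = ".new-todo";
    private static final String TODO_ITEMS = ".todo-list li";

    private final Browser browser;

    TodoPage(Browser browser) {
        this.browser = browser;
    }

    void addTodo(String title) {
        browser.typeInto(NEW_TODO_FIELD, title);
    }

    List<String> displayedTodos() {
        return browser.textsOf(TODO_ITEMS);
    }
}
```

A test would then call `new TodoPage(browser).addTodo("Walk the dog")` and never mention a selector itself.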

The problem with Page Objects

The problem is, Page Objects are only the first layer of abstraction. They were only ever intended as a first step to guide inexperienced testers away from the imperative scripting style of automation that was prevalent in the late 2000s.

But if you want to keep your tests maintainable and robust, Page Objects are just the bare minimum to get you started. You can do much better. Page Objects are like training wheels. They help you get started, but eventually you should outgrow them.

Too much how, not enough what and why

When you write a test built around page objects, you think in terms of the UI. You think in terms of how the user interacts with a page. The user enters a value into a field, the user clicks on a button, and so on.

But well-written tests don’t simply mimic the user’s every action; they describe the user’s journey through the system. They describe what the user is doing, and why.

For example, if we were testing a “todo” list application, a basic test about marking an item in a todo list as complete might look like this:

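public void should_be_able_to_complete_a_todo_with_steps() {
    // The locator for the new-todo field is missing from the original listing; ".new-todo" is an assumed placeholder.
    getDriver().findElement(By.cssSelector(".new-todo"))
               .sendKeys("Put out the garbage", Keys.ENTER);
    getDriver().findElement(By.cssSelector(".new-todo"))
               .sendKeys("Walk the dog", Keys.ENTER);
    getDriver().findElement(By.xpath("//div[@class='view' and contains(.,'Walk the dog')]//input[@type='checkbox']"))
               .click();
    assertThat(getElement(By.xpath("//*[@class='view' and contains(.,'Walk the dog')]//input[@type='checkbox']")).isSelected(), is(true));
}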

This code works, but notice how hard it is to read. You have to work to figure out what the selectors are doing, which makes it harder to understand what feature the test is actually demonstrating. There is also a lot of duplication. Both of these factors make this kind of test code extremely hard to maintain.

Using Page Objects, the code might look more like this:

TodoListPage todoListPage;

public void should_be_able_to_complete_a_todo_with_steps() {
    todoListPage.getTodoField().type("Put out the garbage", Keys.ENTER);
    todoListPage.getTodoField().type("Walk the dog", Keys.ENTER);
    todoListPage.getCheckboxInRow("Walk the dog").click();
    assertThat(todoListPage.getStatusInRow("Walk the dog"), is(Completed));
}

Here, the page object has hidden the selector logic, which will make the code easier to maintain. But we are still reasoning in terms of typing and clicking, which makes it hard to see at a glance what the test is doing. We are still talking about how we perform an action, not what action we are performing, and this still makes our tests harder to read.

Overweight Page Objects

Using more high-level Page Objects, our test might look like this:

TodoListPage todoListPage;

public void should_be_able_to_complete_a_todo_with_steps() {
    todoListPage.addATodoItemCalled("Put out the garbage");
    todoListPage.addATodoItemCalled("Walk the dog");
    todoListPage.markAsComplete("Walk the dog");
    assertThat(todoListPage.statusOf("Walk the dog"), is(Completed));
}

This code is easier to read, since we are reasoning more in terms of business actions like “add a todo item” and “mark as complete”, rather than simply performing UI interactions. But our test is still tightly bound to the UI.

Higher-level Page Objects also tend to become bloated, as more and more business logic creeps in alongside the logic for locating the page elements. This in turn makes them harder to maintain.

When all you have is a hammer...

When all you have is a Page Object, everything looks like a UI test.

But there is a bigger problem with using Page Objects as the foundation of your test automation strategy. When we have a library of Page Objects like this, we will naturally tend to implement all of our tests using these Page Objects. Our tests end up modelling the way the user interface works, rather than what the user is doing, and what outcomes the user wants to achieve.

Beyond Page Objects

Now imagine a test that models not pages, but actual business tasks. Imagine a test that describes what the user is trying to do in business terms, rather than which buttons she clicks and which fields she fills in.

The Screenplay pattern is one way to do this. In Serenity Screenplay in Java, for example, we could write something like this:

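public void should_be_able_to_complete_a_todo() {
    // The actor (here "james") and the given/when/then wrappers are missing from
    // the original listing; they are reconstructed assuming Serenity's usual idiom.
    givenThat(james).wasAbleTo(
        Start.withATodoListContaining("Walk the dog",
                                      "Put out the garbage"));

    when(james).attemptsTo(
        CompleteItem.called("Walk the dog"));

    then(james).should(
        seeThat(TheItemStatus.forTheItemCalled("Walk the dog"), is(Completed)));
}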

In addition to being very readable, this code is much more declarative than the previous examples. We are no longer thinking about what a particular page does, or how the user interacts with a page. These details are hidden away. Rather, we are thinking about the user’s business activities and goals. We do this by reusing objects that represent actual business tasks and business concepts, such as CompleteItem and TheItemStatus.
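To make the idea of task and question objects concrete, here is a minimal, self-contained sketch of the pattern. It is not Serenity's API: the `Task`, `Question` and `SketchActor` types, the in-memory `TodoApp`, and the helper classes `StartWith`, `Complete` and `ItemStatus` are all hypothetical names that only borrow the Screenplay vocabulary.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** The system under test, reduced to an in-memory map of item title to completed flag. */
class TodoApp {
    final Map<String, Boolean> items = new LinkedHashMap<String, Boolean>();
}

/** A business task that an actor performs. */
interface Task {
    void performAs(SketchActor actor);
}

/** A question an actor asks about the state of the system. */
interface Question<T> {
    T answeredBy(SketchActor actor);
}

/** The actor ties tasks and questions to the application it interacts with. */
class SketchActor {
    final String name;
    final TodoApp app;

    SketchActor(String name, TodoApp app) {
        this.name = name;
        this.app = app;
    }

    void attemptsTo(Task... tasks) {
        for (Task task : tasks) {
            task.performAs(this);
        }
    }

    <T> T asks(Question<T> question) {
        return question.answeredBy(this);
    }
}

/** Task factory: start with a list of not-yet-completed items. */
class StartWith {
    static Task aTodoListContaining(final String... titles) {
        return actor -> {
            for (String title : titles) {
                actor.app.items.put(title, false);
            }
        };
    }
}

/** Task factory: mark one item as complete. */
class Complete {
    static Task itemCalled(final String title) {
        return actor -> actor.app.items.put(title, true);
    }
}

/** Question factory: report the status of one item as text. */
class ItemStatus {
    static Question<String> of(final String title) {
        return actor -> Boolean.TRUE.equals(actor.app.items.get(title)) ? "Completed" : "Active";
    }
}
```

With these pieces, a test reads as `james.attemptsTo(StartWith.aTodoListContaining(...), Complete.itemCalled("Walk the dog"))` followed by a check on `james.asks(ItemStatus.of("Walk the dog"))`. How each task is carried out stays inside the task.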

But there is more. Since we are no longer thinking in terms of the user interface, of pages, buttons and input fields, we have a lot more flexibility in how each step or task is implemented. For example, we could set up the todo list via a REST API instead of going through the screens, which would speed up our test suite immensely. The test itself would not change:

       Start.withATodoListContaining("Walk the dog", 
                                     "Put out the garbage")
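The sketch below illustrates, under stated assumptions, why the calling code can stay the same while the implementation changes. `FakeTodoService`, `Setup` and `StartWithTodos` are hypothetical names, and the "UI" and "REST" paths are simulated in memory rather than driving a browser or making HTTP calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical in-memory application with a UI-style path and an API-style path. */
class FakeTodoService {
    final List<String> todos = new ArrayList<String>();
    int uiInteractions = 0;

    /** Simulates typing a title into the page and pressing Enter. */
    void typeAndPressEnter(String title) {
        uiInteractions++;
        todos.add(title);
    }

    /** Simulates a single REST call that creates several items at once. */
    void postTodos(List<String> titles) {
        todos.addAll(titles);
    }
}

/** The business-level step the test asks for; it does not say how it is done. */
interface Setup {
    void applyTo(FakeTodoService service);
}

/** Two interchangeable implementations of the same business step. */
class StartWithTodos {
    static Setup viaTheUi(final String... titles) {
        return service -> {
            for (String title : titles) {
                service.typeAndPressEnter(title);
            }
        };
    }

    static Setup viaTheApi(final String... titles) {
        return service -> service.postTodos(Arrays.asList(titles));
    }
}
```

Both variants leave the service with the same items, but only `viaTheUi` pays for one simulated UI interaction per item. A test that depends on `Setup` alone cannot tell the difference.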


If you are moving from imperative script-style testing to a Page Object model, congratulations! You are on the right track. But don’t stop there. If you want your test suites to be truly maintainable and scalable, model what the user is doing in business terms, rather than how they are interacting with the UI. This will free you of the urge to test everything through the UI, and open up the possibility of testing your application in more interesting ways.

© 2019 John Ferguson Smart