Photo source: Irfan Simsar Unsplash

In the world of Agile, the role of architects is often misunderstood. It is a point of contention spotlighted by a well-known principle from the Agile Manifesto:

The best architectures, requirements, and designs emerge from self-organizing teams.

SAFe does its best to describe the theory and intent behind Agile architecture. However, the architects at Marlo understand that Agile skills and experience in architectural design thinking are needed to successfully navigate the challenges of Agile solution delivery. Our wealth of experience across diverse industries has helped us evolve and improve the way we architect in an Agile environment.

In this article, we will discuss the role of solution architects in an Agile delivery setting, looking at some of the skills required and the benefits to be gained from adopting an Agile approach. This article uses SAFe terminology; however, the concepts apply equally to other Agile delivery methods.

A Complex Problem

There was a time before Agile when being a solution architect meant working on one large monolithic project for months or years on end, with the hope of one day being part of a successful go-live event. Unfortunately, many projects failed due to scope, time, budget or other unforeseen circumstances, meaning go-live was never a certainty and more often than not the architect had long since moved on to other things.

With Agile came a new way of working, with a focus on rapid delivery of incremental value to the business in a collaborative environment. This immediately disrupted the role of architects, who under a waterfall method, were used to working at their own pace, often in isolation for long periods with limited collaboration.

Marlo was recently engaged with a client on a project using SAFe as the delivery framework. The client was about 12 months into the process of adopting SAFe and transitioning away from a waterfall delivery method.

The project had documented a reasonably simple business brief to deliver a new customer benefits platform with the goal of expanding their customer base. There was little to no documentation of anything related to technology.

It was apparent from the outset that the project’s timelines were very aggressive and the following challenges only added to the complexity:

  • The project had little understanding of technology impacts before committing to cost and timing estimations

  • The iteration manager didn’t start until half-way through the project

  • The project also lost a very knowledgeable product owner half-way through the project

  • There were nine specialised delivery streams spread across six different timezones

  • Delivery streams were ready to begin prior to much of the architecture being defined

  • There was no Enterprise / Domain Architect with knowledge of the technology strategy

  • Multiple product owners were impacted by the project and had conflicting needs

  • There were several major scope changes throughout, adding pressure, disruption and rework

  • The project had committed to highly optimistic deadlines from the outset, with the major deadline brought forward in response to business request

Tools and Techniques

It was clear from the outset that there was insufficient time for a traditional Big Design Up Front (BDUF) approach. However, an intentional design (just enough upfront architecture) was needed rather than relying purely on emergent design (architecture is discovered and extended as part of each increment) to minimise the technical debt and rework that would otherwise eventuate.

Marlo’s first order of business was to agree a way forward by defining an Architecture Runway. The Architecture Runway identified the near-term architecture enablers and their relationship (dependencies) to planned features in the project backlog (this is distinct from a Technology Roadmap which provides the long-term strategic architecture view). Our Architecture Runway included: a new customer identity platform, a new customer preferences platform, data migration approach, integration approach for consumers, support solution for contact centre teams, non-prod environments, and non-functional requirements.

Effective engagement and communication were critical from the outset. Marlo led the way in establishing the meeting and communication cadence for the Squads, including Chapters for functional areas and Guilds for non-functional aspects. We established a stakeholder communication matrix to identify important stakeholders and ensured they were engaged at the appropriate times. This included reporting up to the business and project steering committees as well as the architecture review board.

Establishing clear roles and responsibilities is always important, and never more so than in an Agile delivery setting where emergent decision-making behaviour can occur without full appreciation of the consequences. We drew the lines of demarcation between solution architecture, application architecture and technical design responsibilities so all team members understood their role. We also made it clear that collaboration was key and no decisions would be made in isolation or without consultation.

We established a design authority to formalise the collaboration process, supported by a central decisions register to transparently communicate all decisions and their rationale. As the architects, it was important that we demonstrate leadership and competence and influence key decisions rather than dictate the outcome. We made ourselves available and responsive at all times through corporate messaging, email, phone and face-to-face meetings to ensure we didn’t become an obstacle to be avoided. All of the architecture documentation was openly accessible on a Wiki.

One of the challenges we had was ensuring the architecture stayed just ahead of the upcoming features. Even though the Architecture Runway was unlikely to change, we did not have the luxury to plan too far ahead. This was due to the number of near-term architecture decisions required, particularly given we had large development teams already starting to code and new features being added by the day. We used architecture spikes to assess the impact to the Runway when a new feature was added, which allowed us to update the Runway and stay just ahead.

One of the most important roles we played was in our day-to-day discussions within the Agile squads, influencing the emergent design. From these discussions we were able to catch and course-correct a number of decisions that might otherwise have been missed:

  • Passwords being sent unencrypted in clear text

  • Data extracts being duplicated

  • Applications that were planning to pull data directly from production (risking performance)

  • An opportunity to create a reusable interface adapter that was immediately leveraged by another project

Our technical thought leadership identified risks early, minimised technical debt, exploited opportunities for reuse, fostered alignment with the broader technology direction and improved understanding of cost implications.

Benefits

The engagement presented many challenges which would have added significant time and cost to the project in a traditional setting (using a traditional mindset). However, Marlo was able to bring its Agile skills, experience and mindset to overcome these challenges and deliver the following additional benefits:

  • Avoiding Big Design Up Front (BDUF) allowed faster start time for Agile delivery teams and quicker incremental benefits to the business, providing more rapid return on investment

  • A focus on just-in-time architecture meant minimal wastage of architecture effort (no longer spending months working in isolation) and minimal rework for the Agile teams

  • Decisions were made faster (days not weeks or months), documented in a consistent format and available for all to see

  • Architects provided technical thought leadership to the Agile teams, ensuring clear technology ownership, early risk mitigation, minimisation of technical debt, architecture reuse, alignment with the broader technology direction and an informed understanding of cost implications

  • Although the project went slightly over the original budget, it delivered on time and created reusable technology assets that benefited other projects. A traditional waterfall project would have taken many months longer and would have struggled to manage the changing scope without cost blowouts

Beware

Our experience has shown that going Agile is not all sunshine and rainbows. There are many pitfalls to be aware of:

  • Architecture falling behind delivery teams, leading to a high degree of technical debt and rework

  • Decisions not made in a timely manner, leading to re-planning, workarounds and delays

  • Architecture not appropriately addressing or anticipating the needs of important technology stakeholders such as the Architecture Review Board and Change Advisory Board

  • Architecture concerns deprioritised in favour of delivering business features, leading to increased technology operating risks (e.g. security vulnerability, no backup solution, lack of customer support)

  • Inability for architects to identify red flags during dynamic discussions

  • Inability for architects to balance and trade-off compromises appropriately

The Agile Architect

Agile Architects require a broad range of skills. In addition to traditional architecture skills, an Agile architect must:

  • Be highly organised in terms of planning their own work

  • Be always thinking a few steps ahead

  • Be flexible to change

  • Be available when needed

  • Collaborate and be inclusive from the start

  • Be transparent in terms of knowledge sharing, decision making and documentation

  • Be able to shape and influence dynamic discussions

Agile Architects need to move beyond the old ways of working and adopt an Agile mindset in order to add value to the project.

Marlo has the skills and experience to establish an Agile Architecture practice and help clients navigate the challenges of Agile solution delivery. Get in touch today if you would like to learn more about how Marlo can help with your next project.

This is part 3 of a series on reactive microservices. The first two parts are available at https://marlo.com.au/reactive-microservices and https://marlo.com.au/reactive-microservices-part-2

We do not recommend CSV as a data transport encoding. Photo by Mika Baumeister on Unsplash.

At the end of the last post, I introduced the "Customer Microservice" from CompuHyperGlobalMegaNet. The business at CompuHyperGlobalMegaNet has determined that a key plank in their digital strategy is to know who their customers are, so we have been asked to come up with something that will store the customer details. "Doesn’t Salesforce do that really well?" asked a junior dev. The CTO had a quick look at the web page, but to be honest it was pretty confusing and seemed complicated, so instead we decided to develop our own Customer microservice using Spring Boot.

We will build it old school first using Spring Web, then port it to Reactive to spell out the differences between the two approaches.
I’ve created the application scaffolding using Spring Initializr, which is a really fast and simple way to get a project set up.

The Architect provided us with a spec:

[Image: the spec. All the whitespace is room for scaling.]

And we got building: Spring Boot, REST API, JPA with Spring Data. Go!

We defined a Customer class that can be saved into a database using JPA, and sprinkled some Lombok on so we can replace boilerplate code with annotations:

@Entity
@NoArgsConstructor
@AllArgsConstructor
@Data
public class Customer {

    @Id @GeneratedValue
    private Integer id;
    private String firstName;
    private String lastName;
}

More annotations than code, but Awesome.

Let’s create a DAO so we can database it (I know we say repositories these days, but it’s hard to change):

// Spring Boot auto-enables JPA repositories; Spring Data generates the implementation
public interface CustomerDAO extends JpaRepository<Customer, Integer> {
}

Spring Data is magic.

And finally a CustomerController to do the web things:

@RestController
@RequiredArgsConstructor
@Slf4j
public class CustomerController {

    private final CustomerDAO customerDAO;

    @GetMapping
    public List<Customer> customers() {
        return this.customerDAO.findAll();
    }
}

This starts, but it is hard to test as there’s no data in the DB and no way to create any! We are going to need an "add customer" feature. Until we get that, let’s add an application event listener to populate some customer data:

I am shamelessly stealing this trick (and quite a few reactive idioms further along as well) from Spring Legend Josh Long. Check out his talks and articles, I certainly have learned a LOT from them.

@RequiredArgsConstructor
@Component
class Initialise implements ApplicationListener<ApplicationReadyEvent> {

    private final CustomerDAO customerDAO;

    @Override
    public void onApplicationEvent(ApplicationReadyEvent applicationReadyEvent) {
        List.of("Brock:Mills", "Brian:Fitzgerald").stream().map(s -> {
            var c = s.split(":");
            // Customer currently has only id, firstName and lastName
            return new Customer(null, c[0], c[1]);
        }).forEach(customerDAO::save);

        customerDAO.findAll().forEach(System.out::println);
    }
}

And we should probably do that addCustomer endpoint as well, so back to the CustomerController:

@PostMapping
public ResponseEntity<Void> addCustomer(@RequestBody Customer customer) throws Exception {
    this.customerDAO.save(customer);
    return ResponseEntity.created(new URI("/" + customer.getId().toString())).build();
}

Start it and test:


  .   ____          _            __ _ _
 /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
  '  |____| .__|_| |_|_| |_\__, | / / / /
 =========|_|==============|___/=/_/_/_/
 :: Spring Boot ::        (v2.3.1.RELEASE)

2020-07-06 14:52:57.929  INFO 73930 --- [           main] c.m.n.NotreactiveDbApplication           : Starting NotreactiveDbApplication on emmet.localdomain with PID 73930 (/Users/brockmills/Development/Marlo/microservices/notreactive-db/target/classes started by brockmills in /Users/brockmills/Development/Marlo/microservices/notreactive-db)
2020-07-06 14:52:57.933  INFO 73930 --- [           main] c.m.n.NotreactiveDbApplication           : No active profile set, falling back to default profiles: default
2020-07-06 14:52:59.450  INFO 73930 --- [           main] .s.d.r.c.RepositoryConfigurationDelegate : Bootstrapping Spring Data JPA repositories in DEFERRED mode.
2020-07-06 14:52:59.547  INFO 73930 --- [           main] .s.d.r.c.RepositoryConfigurationDelegate : Finished Spring Data repository scanning in 80ms. Found 1 JPA repository interfaces.
2020-07-06 14:53:00.259  INFO 73930 --- [           main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat initialized with port(s): 8080 (http)
2020-07-06 14:53:00.284  INFO 73930 --- [           main] o.apache.catalina.core.StandardService   : Starting service [Tomcat]
... snip ...
2020-07-06 14:53:01.616  INFO 73930 --- [           main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat started on port(s): 8080 (http) with context path ''
... snip ...
2020-07-06 14:53:02.496  INFO 73930 --- [           main] DeferredRepositoryInitializationListener : Spring Data repositories initialized!
2020-07-06 14:53:02.509  INFO 73930 --- [           main] c.m.n.NotreactiveDbApplication           : Started NotreactiveDbApplication in 5.405 seconds (JVM running for 6.239)
Customer(id=1, firstName=Brock, lastName=Mills)
Customer(id=2, firstName=Brian, lastName=Fitzgerald)

Great, we are running with some data. Let’s test the API:

brockmills@emmet microservices % curl http://localhost:8080/
[{"id":1,"firstName":"Brock","lastName":"Mills"},{"id":2,"firstName":"Brian","lastName":"Fitzgerald"}]%

Win! Let’s add a customer with a POST and query again:

brockmills@emmet microservices % curl --header 'Content-Type: application/json' --data '{"firstName":"Tester", "lastName":"Testing"}' http://localhost:8080/
brockmills@emmet microservices % curl http://localhost:8080/
[{"id":1,"firstName":"Brock","lastName":"Mills"},{"id":2,"firstName":"Brian","lastName":"Fitzgerald"},{"id":3,"firstName":"Tester","lastName":"Testing"}]%

Ship it! I did hear the architect say something after a long lunch about unfunctionals? defunctionals? IDK. The new dev mentioned they thought security, error handling and data integrity were important, so we created some JIRAs for our tech debt sprint. We’ve got features to deliver (read: chuck over the fence at ops)!

While we were deploying to Prod, the business peeps dropped some new requirements on us. Apparently they have some amazing big data science lake thing going and MUST know the weather in Perth when the customer is created.

We had some questions about this – Why Perth? Which Perth? What actual thing about the weather? Wouldn’t it be better to correlate this later in some data tool, rather than adding weather data into the customer entity? – but these concerns were swept aside ("Don’t bring that negative vibe, we can do this together!") and who are we to question the business? We better get coding:

We add the temperature to our Customer class:

@JsonInclude(JsonInclude.Include.NON_NULL)
public class Customer {
... snip ...
    private Double airTempActual;
    private Double airTempFeelsLike;
}

Someone duck-duck-go’ed "weather API" and we found a suitable service with the right price ($0) and features for our needs.

I’m using the Openweathermap API, which has a very nice developer experience and is simple and intuitive. 5 stars, would recommend.

We knocked up a class to hold the response from the API:

@NoArgsConstructor
@Getter
public class Weather {
    private WeatherMain main;
}

@NoArgsConstructor
@Getter
class WeatherMain {
    private double temp;

    @JsonProperty("feels_like")
    private double feelsLike;
}

And wired it all up straight into the createCustomer method on the CustomerController:

public class CustomerController {

    private final CustomerDAO customerDAO;
    private final RestTemplate restTemplate;

    @Value("${WEATHER_API_KEY}")
    private String weatherApiKey;
    ... snip ...
    @PostMapping
    public ResponseEntity<Void> createCustomer(@RequestBody Customer customer) throws Exception {
        // whats the temp in Perth?
        try {
            var perthWeather = perthWeather();
            customer.setAirTempActual(perthWeather.getMain().getTemp());
            customer.setAirTempFeelsLike(perthWeather.getMain().getFeelsLike());
        } catch (Exception e) {
            log.warn("cant get the weather in perth", e);
        }
        this.customerDAO.save(customer);
        return ResponseEntity.created(new URI("/" + customer.getId().toString())).build();
    }

    /**
     * Call the weather API to get the current weather in Perth.
     * @return the current Perth weather observation
     */
    private Weather perthWeather() {
        try {
            var query = Map.of("appid", weatherApiKey, "q", "perth", "units", "metric");
            return this.restTemplate.exchange("https://api.openweathermap.org/data/2.5/weather?q={q}&appid={appid}&units={units}", HttpMethod.GET, null, Weather.class, query).getBody();
        } catch (RestClientException e) {
            log.error("failed to get the weather in perth: " + e.getMessage(), e);
            throw new RuntimeException(e);
        }
    }
}
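One wiring detail the snippets skip over: Spring Boot auto-configures a RestTemplateBuilder but not a RestTemplate bean, so the constructor injection above assumes a configuration bean along these lines (a sketch; the class name is made up):

```java
import org.springframework.boot.web.client.RestTemplateBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.client.RestTemplate;

@Configuration
public class HttpClientConfig {

    // Boot supplies the builder; we expose the built client as an injectable bean
    @Bean
    public RestTemplate restTemplate(RestTemplateBuilder builder) {
        return builder.build();
    }
}
```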

Let’s push to prod! Oh, after following the CAB process and presumably some sort of testing. In fact, I’ll even test it on my machine:

It starts and we can see the temps are null on my bootstrapped data.

2020-07-06 16:35:17.243  INFO 55154 --- [           main] c.m.n.NotreactiveDbApplication           : Started NotreactiveDbApplication in 5.739 seconds (JVM running for 6.643)
Customer(id=1, firstName=Brock, lastName=Mills, airTempActual=null, airTempFeelsLike=null)
Customer(id=2, firstName=Brian, lastName=Fitzgerald, airTempActual=null, airTempFeelsLike=null)

That’s fine, we don’t want to fail a Customer create just because we didn’t know the weather in some random city.

But does it work if we add a new customer?

brockmills@emmet microservices % curl --header 'Content-Type: application/json' --data '{"firstName":"Tester", "lastName":"Testing"}' http://localhost:8080/
brockmills@emmet microservices % curl http://localhost:8080/
[{"id":1,"firstName":"Brock","lastName":"Mills"},{"id":2,"firstName":"Brian","lastName":"Fitzgerald"},{"id":3,"firstName":"Tester","lastName":"Testing","airTempActual":17.6,"airTempFeelsLike":10.3}]

17 degrees in Perth. A lovely winter’s day, and a lovely microservice to boot (to Spring Boot, if you will, hardi har har).

But this is not a blog about the weather, it’s about reactive! The architect re-appeared from wherever they go when they are not bothering engineers, and they had done some detailed usage projections. It turns out we are going to need to support 4x our expected load and, get this, we can’t deploy any new servers. We have to work with what we have got.

We took some stats from prod and created some flame graphs. We got confused about the output, so we tried some other profilers, which produced more useful results, and eventually managed to produce a flame graph of the same thing. Boom: there’s our ticket to an APM presentation at DevHatMineCon next year!

We now had some pretty strong evidence that most of the thread time on our app was spent waiting for the weather API (and a little at the database as well).

We also found a neat natural experiment to reinforce our hypothesis: we accidentally ran production against our mock Weather API server for 3 hours. That thing is fast, and the app was still spending 90% of its time talking to the API.

We need to go nuclear – with the reactor!

This section builds on the reactive framework concepts we discussed in part 2.

Porting the application to reactive is relatively straightforward.

There are three main concerns:

  • We need to replace the JPA and JDBC database components with reactive enabled ones;
  • We need to wrap our Controller method return types in either Mono or Flux containers; and
  • We need to replace the RestTemplate client with a reactive WebClient.

Let’s get into it:

I’ll start with the project dependencies. spring-boot-starter-web and spring-boot-starter-data-jpa are out, to be replaced with spring-boot-starter-webflux and spring-boot-starter-data-r2dbc.
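Assuming a Maven build (the post doesn’t show its build file) and the in-memory H2 database implied by the schema syntax used below, the swap might look like this:

```xml
<!-- replaces spring-boot-starter-web and spring-boot-starter-data-jpa -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-webflux</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-r2dbc</artifactId>
</dependency>
<!-- a reactive driver for the database; here, H2 over R2DBC -->
<dependency>
    <groupId>io.r2dbc</groupId>
    <artifactId>r2dbc-h2</artifactId>
    <scope>runtime</scope>
</dependency>
```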

The database bound beans require small changes to remove JPA annotations, in the case of my Customer, this means removing @Entity and @GeneratedValue:

@Data
@AllArgsConstructor
@NoArgsConstructor
@JsonInclude(JsonInclude.Include.NON_NULL)
public class Customer {

    @Id
    private Integer id;
    private String firstName;
    private String lastName;

    private Double airTempActual;
    private Double airTempFeelsLike;
}

Q: If we remove the @GeneratedValue annotation, how does the primary key get updated?
A: It depends. Spring Data R2DBC essentially supports only the database identity / sequence approach from JPA: if the field annotated with @Id is null when the entity is saved, the database generates the key and Spring Data populates it on the saved entity.

The DAO changes from implementing a JpaRepository to a ReactiveCrudRepository:

public interface CustomerDAO extends ReactiveCrudRepository<Customer, Integer> {
}

Pretty painless so far.

The Initialiser class is the first taste of the real differences. Firstly, since we are not using JPA / Hibernate, there is no auto-generation of database schemas, so we are going to do this manually with the r2dbc-provided DatabaseClient.

Next, to bootstrap the test data we need to create it in the context of a reactive type, being a Flux for the list of customers.

To put it all together:

@RequiredArgsConstructor
@Component
class Initialiser implements ApplicationListener<ApplicationReadyEvent> {

    private final CustomerDAO customerDAO;
    private final DatabaseClient databaseClient;

    @Override
    public void onApplicationEvent(ApplicationReadyEvent applicationReadyEvent) {
        Flux<Customer> customers = Flux.just("Brock:Mills", "Brian:Fitzgerald")
                .map(s -> {
                    var c = s.split(":");
                    return new Customer(null, c[0], c[1], null, null);
                })
                .flatMap(customerDAO::save);

        databaseClient.execute("create table CUSTOMER(ID identity auto_increment, FIRST_NAME varchar(50), LAST_NAME varchar(50), AIR_TEMP_ACTUAL double, AIR_TEMP_FEELS_LIKE double)")
                .fetch()
                .rowsUpdated()
                .thenMany(customers)
                .thenMany(this.customerDAO.findAll())
                .subscribe(System.out::println);
    }
}

This is interesting. customers is a Flux that is set up, but (because we are reacting) nothing happens with it until we hit the subscribe() down the track.

This is demonstrated nicely by adding in a few log.info() prints:

    @Override
    public void onApplicationEvent(ApplicationReadyEvent applicationReadyEvent) {
        Flux<Customer> customers = Flux.just("Brock:Mills", "Brian:Fitzgerald")
                .map(s -> {
                    log.info("1: creating customer: " + s);
                    var c = s.split(":");
                    return new Customer(null, c[0], c[1], null, null);
                })
                .flatMap(customerDAO::save);

        log.info("2: this is after the customer");
        databaseClient.execute("create table CUSTOMER(ID identity auto_increment, FIRST_NAME varchar(50), LAST_NAME varchar(50), AIR_TEMP_ACTUAL double, AIR_TEMP_FEELS_LIKE double)")
                .fetch()
                .rowsUpdated()
                .thenMany(customers)
                .thenMany(this.customerDAO.findAll())
                .subscribe(c -> {
                    log.info("3: in the subscribe: " + c.toString());
                });

        log.info("4: the end, we are initialised");
    }

Results look like this:

[  main] c.m.reactivedb.ReactiveDbApplication     : Started ReactiveDbApplication in 5.608 seconds (JVM running for 6.9)
[  main] com.marlo.reactivedb.Initialiser         : 2: this is after the customer
[  main] com.marlo.reactivedb.Initialiser         : 1: creating customer: Brock:Mills
[  main] com.marlo.reactivedb.Initialiser         : 1: creating customer: Brian:Fitzgerald
[  main] com.marlo.reactivedb.Initialiser         : 3: in the subscribe: Customer(id=1, firstName=Brock, lastName=Mills, airTempActual=null, airTempFeelsLike=null)
[  main] com.marlo.reactivedb.Initialiser         : 3: in the subscribe: Customer(id=2, firstName=Brian, lastName=Fitzgerald, airTempActual=null, airTempFeelsLike=null)
[  main] com.marlo.reactivedb.Initialiser         : 4: the end, we are initialised

The customers flux is set up, but there’s no subscription, so execution of the stream is deferred. The second statement, at the end of the databaseClient.execute() chain, does subscribe to the stream, which causes the publisher to execute, create the DB table, insert the customer records and finally log the resulting records.
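This subscribe-triggers-execution behaviour mirrors the laziness of plain Java Streams, where intermediate operations do nothing until a terminal operation pulls data through. A stdlib-only sketch of the same idea (not Reactor, but the deferral works the same way):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class LazyDemo {
    static List<String> events = new ArrayList<>();

    public static void main(String[] args) {
        // Intermediate ops like map() only describe the work; nothing runs yet
        Stream<String> names = Stream.of("Brock", "Brian")
                .map(n -> {
                    events.add("mapped " + n);
                    return n.toUpperCase();
                });

        events.add("pipeline built");

        // The terminal op plays the role of subscribe(): it pulls the data through
        names.forEach(n -> events.add("received " + n));

        System.out.println(events);
    }
}
```

"pipeline built" is recorded first, even though the map() appears earlier in the source, because building the pipeline does no work.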

The CustomerController is where the rubber really hits the road. Firstly, we need to replace RestTemplate with the reactive WebClient and modify the return types to use the reactive containers. We do retain the @RestController annotation, though, as it is reactive-aware:

@RestController
@RequiredArgsConstructor
@Slf4j
public class CustomerController {
    private final CustomerDAO customerDAO;
    private final WebClient webClient;
    ...snip...
    @GetMapping("/")
    public Flux<Customer> getCustomers() {
        return this.customerDAO.findAll();
    }

    @PostMapping("/")
    public Mono<ResponseEntity<Void>> createCustomer(@RequestBody Customer customer) throws Exception {
        // refactored further down to merge the weather data reactively
    }
}

The List on the getCustomers method becomes a Flux<Customer>, but the ResponseEntity on the createCustomer method must be wrapped in a Mono, like this: Mono<ResponseEntity<Void>>.

Now we get to the createCustomer method. A quick recap: this API:

  • calls the weather API;
  • merges the weather data into the Customer from the client;
  • saves the Customer in the database;
  • returns a 201 Created HTTP code with a Location header linking to the newly created Customer; and
  • treats a failure of the weather API call as non-fatal.

Before I go into the controller method, let’s focus on the call to the Weather API in perthWeather(). This will now join the reactive party with a Mono wrapper; however, I need to restructure the way this method fails. Whereas it previously simply threw an Exception that was handled by the createCustomer method, there’s a better way to work now that we operate within the reactive container.

perthWeather becomes this:

    private Mono<Optional<Weather>> perthWeather() {
        var q = Map.of("appid", weatherApiKey,
                "q", "perth",
                "units", "metric");

        return webClient.get().uri("https://api.openweathermap.org/data/2.5/weather?q={q}&appid={appid}&units={units}", q)
                .retrieve()
                .bodyToMono(Weather.class)
                .map(Optional::of)
                .onErrorReturn(Optional.empty());
    }

WebClient has a friendly API, where we define:

  • the HTTP method: get();
  • the target URI, with template params filled from a Map of query parameters; and
  • what to do with the body: bodyToMono(Weather.class).

We could stop there and simply return the Mono<Weather>; however, given the try/catch approach that was used before isn’t going to fly, I’m now going to wrap the returned Weather class in an Optional and add an error handler that returns an empty Optional if something goes wrong.
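This "absorb the failure into an empty Optional" move also works outside Reactor. A minimal stdlib sketch, with a hypothetical tryGet helper standing in for the onErrorReturn:

```java
import java.util.Optional;
import java.util.function.Supplier;

public class OptionalFallback {

    // Hypothetical helper: run a supplier, absorbing any failure into Optional.empty()
    static <T> Optional<T> tryGet(Supplier<T> supplier) {
        try {
            return Optional.of(supplier.get());
        } catch (RuntimeException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        Optional<Double> ok = tryGet(() -> 17.6);
        Optional<Double> failed = tryGet(() -> { throw new IllegalStateException("401 Unauthorized"); });

        System.out.println(ok.isPresent());      // true
        System.out.println(failed.isPresent());  // false
    }
}
```

The caller never sees the exception; it just gets an empty Optional, which is exactly what the controller method then branches on with ifPresent().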

The createCustomer method now needs to be refactored. We need to work within the reactive stream, which means using the methods provided by the Mono to do our business’s bidding.

Here’s the code:

    @PostMapping("/")
    public Mono<ResponseEntity<Void>> createCustomer(@RequestBody Customer customer) throws Exception {

        return perthWeather().map(o -> {
            o.ifPresent(w -> {
                customer.setAirTempActual(w.getMain().getTemp());
                customer.setAirTempFeelsLike(w.getMain().getFeelsLike());
            });
            return customer;
        })
            // save() itself returns a Mono, so flatMap (not map) keeps us at Mono<Customer>
            .flatMap(customerDAO::save)
            .map(saved -> ResponseEntity.created(UriComponentsBuilder.fromPath("/" + saved.getId().toString()).build().toUri()).build());
    }

First up, call the perthWeather() method, then use the resulting Weather to fill out the weather fields on our Customer object. Here I’m calling map(), using the Optional container to determine whether we can actually add the weather, and then returning the Customer, ready for the database. Just like that, I’ve removed a try/catch from my method, which definitely looks cleaner. Then we save the record to the database and finally map to compose the response object, including the location path.

Does it work?

brockmills@emmet ~ % curl http://localhost:8080
[{"id":1,"firstName":"Brock","lastName":"Mills"},{"id":2,"firstName":"Brian","lastName":"Fitzgerald"}]%

brockmills@emmet ~ % curl --header 'Content-Type: application/json' --data "{\"firstName\":\"Tester\", \"lastName\":\"$(cat /dev/urandom | env LC_CTYPE=C tr -dc 'a-zA-Z0-9' | fold -w 32 | head -1)\"}" http://localhost:8080/

brockmills@emmet ~ % curl http://localhost:8080
[{"id":1,"firstName":"Brock","lastName":"Mills"},{"id":2,"firstName":"Brian","lastName":"Fitzgerald"},{"id":3,"firstName":"Tester","lastName":"MEz8ibpIWSL34vWQM285aGmnrdidH7qL"}]%

brockmills@emmet ~ %

Sort of! The new record is now in the db, but there’s no temperature…

Hmm, so it looks like the API call fails for some reason; however, the exception is swallowed by the onErrorReturn(), leaving us without the all-important weather data. Worse, we might have let this slip into prod if we weren’t being so careful.

The reactive API also has an onErrorResume() method, which allows for more flexible error handling. Let's try that, replacing the onErrorReturn():

                .onErrorResume(e -> {
                    // Log the underlying WebClient error, then fall back to an empty Optional
                    log.error("error calling weather API: " + e.getMessage(), e);
                    return Mono.just(Optional.empty());
                });

Now we still default to the Optional; however, we log the exception from the web client, so we have a fighting chance of working out what's gone wrong.
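For a runnable illustration of the same log-then-recover pattern, here's a sketch using the plain JDK's CompletableFuture in place of Mono; the fetchWeather() stub and its 401 error are invented for the example:

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

public class LogThenFallback {

    // Hypothetical stub standing in for the failing downstream weather call.
    static CompletableFuture<Optional<String>> fetchWeather() {
        return CompletableFuture.failedFuture(new IllegalStateException("401 Unauthorized"));
    }

    public static void main(String[] args) {
        Optional<String> result = fetchWeather()
                // Like onErrorResume: observe the error, log it, then recover with
                // a default, rather than silently swallowing it.
                .handle((value, error) -> {
                    if (error == null) {
                        return value;
                    }
                    Throwable cause = error instanceof CompletionException ? error.getCause() : error;
                    System.out.println("error calling weather API: " + cause.getMessage());
                    return Optional.<String>empty();
                })
                .join();
        System.out.println("result present: " + result.isPresent());
    }
}
```

The key point mirrors the Reactor version: the failure surfaces in the log, but the pipeline still completes with a usable default.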

Testing again results in:

2020-07-07 12:57:12.755 ERROR 2894 --- [ctor-http-nio-4] com.marlo.reactivedb.CustomerController  : error calling weather API: 401 Unauthorized from GET https://api.openweathermap.org/data/2.5/weather?q=perth&appid=weatherkey&units=metric

org.springframework.web.reactive.function.client.WebClientResponseException$Unauthorized: 401 Unauthorized from GET https://api.openweathermap.org/data/2.5/weather?q=perth&appid=weatherkey&units=metric
    at org.springframework.web.reactive.function.client.WebClientResponseException.create(WebClientResponseException.java:181) ~[spring-webflux-5.2.7.RELEASE.jar:5.2.7.RELEASE]
    Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException: 
Error has been observed at the following site(s):
    |_ checkpoint ⇢ 401 from GET https://api.openweathermap.org/data/2.5/weather?q=perth&appid=weatherkey&units=metric [DefaultWebClient]
Stack trace:
        at org.springframework.web.reactive.function.client.WebClientResponseException.create(WebClientResponseException.java:181) ~[spring-webflux-5.2.7.RELEASE.jar:5.2.7.RELEASE]
        at org.springframework.web.reactive.function.client.DefaultClientResponse.lambda$createException$1(DefaultClientResponse.java:206) ~[spring-webflux-5.2.7.RELEASE.jar:5.2.7.RELEASE]
        at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:100) ~[reactor-core-3.3.6.RELEASE.jar:3.3.6.RELEASE]
        at reactor.core.publisher.FluxDefaultIfEmpty$DefaultIfEmptySubscriber.onNext(FluxDefaultIfEmpty.java:92) ~[reactor-core-3.3.6.RELEASE.jar:3.3.6.RELEASE]
... cut for brevity; all classes below are in the reactor.netty.channel, reactor.core.publisher, io.netty or java.lang namespaces.

Ah, the API key isn't being set and is defaulting to weatherkey. Whoops. Oh well, that's easily fixed: I need to set the WEATHER_API_KEY env variable.

An interesting effect of reactive's assemble-then-subscribe execution model is that the stack trace in the exception is not particularly useful for locating the source of the error. It does identify the class where things have gone wrong, but since my code really just defines the execution rather than being executed directly, it does nothing to help me find the line in my code that's blown up. This is annoying; however, with careful logging it shouldn't be a blocker to running Reactor in production.
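You can reproduce this effect with nothing but the JDK, since CompletableFuture uses the same assemble-now, execute-later model: an exception thrown on a pool thread carries no frames from the code that assembled the pipeline. A minimal sketch:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

public class AsyncStackTrace {

    // The pipeline is only assembled here; nothing executes on this stack frame.
    static CompletableFuture<Integer> pipeline() {
        return CompletableFuture.supplyAsync(() -> 1)
                .thenApplyAsync(n -> n / 0); // fails later, on a pool thread
    }

    public static void main(String[] args) {
        try {
            pipeline().join();
        } catch (CompletionException e) {
            // The cause's trace points at the lambda and CompletableFuture
            // internals, not at the main() call that kicked things off.
            boolean mentionsMain = false;
            for (StackTraceElement frame : e.getCause().getStackTrace()) {
                if (frame.getMethodName().equals("main")) {
                    mentionsMain = true;
                }
            }
            System.out.println("cause: " + e.getCause());
            System.out.println("trace mentions main(): " + mentionsMain);
        }
    }
}
```

Reactor does ship tooling to recover this lost context, such as checkpoint() markers and Hooks.onOperatorDebug(), though both add runtime overhead.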

Right, add the API key:

brockmills@emmet ~ % WEATHER_API_KEY=the_api_key java -jar target/reactive-db-0.0.1-SNAPSHOT.jar
... snip ...
2020-07-07 13:02:35.163  INFO 91619 --- [           main] c.m.reactivedb.ReactiveDbApplication     : Started ReactiveDbApplication in 4.838 seconds (JVM running for 11.345)

and retest:

brockmills@emmet ~ % curl --header 'Content-Type: application/json' --data "{\"firstName\":\"Tester\", \"lastName\":\"$(cat /dev/urandom | env LC_CTYPE=C tr -dc 'a-zA-Z0-9' | fold -w 32 | head -1)\"}" http://localhost:8080/
brockmills@emmet ~ % curl http://localhost:8080
[{"id":1,"firstName":"Brock","lastName":"Mills"},{"id":2,"firstName":"Brian","lastName":"Fitzgerald"},{"id":3,"firstName":"Tester","lastName":"31VqMkuXXVRa06lZsfXiNTdgZYKkwjIP","airTempActual":16.72,"airTempFeelsLike":16.18}]%

It works! Deploy and we are done. Get it into prod and bask in the glow of simple, performant, modern microservices.

Hopefully this has provided a reasonable introduction to how reactive does its thing and what microservices using Spring Boot Reactive look like.

CompuHyperGlobalMegaNet will surely dominate with this, but we aren't finished. Oh no, not even close. We recently hired an intern and thought we could get them started on unit tests and error handling, and one of the BAs asked what the new performance stats were like on reactive (did we re-run the test? Might need to follow that up), so keep an eye out for posts in the near future.

The code is available on GitLab at https://gitlab.com/themarlogrouppublic/reactive-blogs.

Endnotes

  1. Richer than astronauts!
  2. I mean start, not golang. We are enterprise, take this seriously, please.
  3. Lots of flame graph tools measure CPU usage and computers are smart enough to not worry the CPU with IO tasks. I was actually surprised at the amount of time the JVM spends in the compiler here and wonder how long I would need to run it for to get the majority of samples to be business logic rather than JVM. I sense a follow up post!
  4. Yes, it did (but with something far more important than this. In my defense the client was using my test environment 🙃).
  5. Calls to subscribe() are meant to be non-blocking, however in the log here we can see these are all executing on the main thread, i.e. the thread is blocked. I think this is a Spring context specific thing, but I don't actually know why this happens. ¯\_(ツ)_/¯
  6. We won't have to be nearly as careful once we get those automated tests the QA team have been banging on about for ages.

Photo by Christian Fregnan on Unsplash

At Marlo we specialise in systems integration: Message Brokers, Enterprise Service Buses, Business Process Management, EAI – you name it, we’ve seen (and implemented) them all – tomorrow’s legacy today!

One of our current favourite application architectures for complex systems is microservices. Developing microservices is great – we have seen great benefits derive from using this approach – and we’ve built them using various technologies.

But your typical integration developer is never happy!

They complain about their frameworks, tools or platforms and will constantly hassle architects about why we should be using X for this or Y for that. And our architects feel their pain, because they develop too.

So when we started a big API and microservices project a couple of years ago, of course we jumped at the chance to write all the services in fully modern async, reactive monad actor model microservices.

No we didn’t: we wrote them in good old fashioned imperative if-then-else using Spring Boot: we had projects to deliver and the reactive frameworks weren’t quite mature enough to convince our stakeholders to go for it.

But then, early in 2018, Spring Boot version 2.0 went GA and delivered a very capable reactive library to our chosen framework. It was time to get down into the weeds with reactive microservices.

What are Reactive Microservices?

Reactive programming is a paradigm where the application is anchored around data streams and propagation of change. Some definitions:

  • Microservices are small, self contained services that scale well and are independently deployable.
  • Reactive microservices are microservices that conform to the Reactive Manifesto:

We believe that a coherent approach to systems architecture is needed, and we believe that all necessary aspects are already recognised individually: we want systems that are Responsive, Resilient, Elastic and Message Driven. We call these Reactive Systems
Source: The Reactive Manifesto

The Reactive Manifesto takes the paradigm of reactive programming and lays out explicit application behaviour. It’s worth reading and understanding in full, but we’ll summarise it for you now. Let’s break down the four key concepts:

  • Responsive: services respond consistently in a timely manner and establish a reliable upper bound for response times.
  • Resilient: services stay responsive in the event of failure. By making use of replication, containment, isolation and delegation patterns, they ensure that failures in one component do not affect another.
  • Elastic: services react to changes in demand by increasing or decreasing resources allocated to them as required.
  • Message Driven: services use asynchronous messaging at the boundaries between components, driving loose coupling, isolation and location transparency. Non-blocking, asynchronous communication protocols allow systems to use resources only when there is something to be done.

But what does all that really mean?

Reactive Microservices are the computer program equivalent of that hardworking, conscientious co-worker who is always doing something useful, as opposed to the layabout who will say they are working when they send an email and then sit around waiting for the reply.

Less of this:
xkcd 303 - compiling

In plain English, reactive means the service works well:

  • It doesn’t tie up vital resources, such as CPU threads, when it’s not doing anything – like waiting around for bytes to appear over a network connection from some remote web server
  • It can scale easily and talk to other services regardless of where that service runs
  • When part of a system fails, it only affects the bit that’s broken – other unrelated parts of the application can continue working as though nothing is wrong

Why Haven’t Reactive Microservices Taken Over The World (Yet)?

This is a reasonable question with some interesting answers. Reactive has already taken over some corners of computing, especially the user-space file system and networking APIs of virtually any modern language.

The main reason it hasn't taken hold in enterprise application development is that, across the range of things a typical application needs to do – make HTTP requests over the network, query a database – reactive tools simply weren't available. Until now!

Reactive in Java

So now we know about reactive microservices, let’s talk about them in our usual tech stack: Java.

The underlying design principles of reactive services are nothing new, and (hopefully) you’ve been doing this stuff for years:

  • Writing small, independently deployable components
  • Designing clean APIs with true separation of concerns
  • Deploying your applications in containers

This is a good start, but there is still more to achieve. The bad guy here is Blocking I/O.

A Brief Foray Into Java History

The old way of developing microservices used the classic servlet style. This was a simpler time, when applications were not too fussed about dispatching a request, waiting around for the response to come back, and tying up an entire OS thread while that happened. The code was simpler too. For our classic Java application, this means we have API endpoints in Controller classes that utilise a set of Service classes implementing various business logic, all wrapped up in a servlet engine.

Servlet engines such as Tomcat, Jetty, Websphere and WebLogic are all very well engineered, but the Servlet specification that they implement has not aged quite so well.

They all suffer from a variant of the C10K problem: they struggle to scale past about 10,000 concurrent connections.

The primary reason for this is that v2.1 servlet engines typically block on I/O. Not only that, but they allocate a thread per incoming request which then executes the necessary code, waiting patiently for network clients to return results, until it finally sends its response to the caller.

Version 3 and 3.1 of the Servlet specification resolved this somewhat, by providing asynchronous readers and writers via the ReadListener and WriteListener interfaces. These are pretty clever in how they mediate between clients and each servlet invocation, but all code inside the service call is still synchronous and will block on outbound I/O.

Back To Today

All the good modern Java HTTP clients – Apache HttpComponents v4, OkHttp, RestTemplate, JerseyClient – are well written and efficient, utilising connection pooling and sensible thread management. But they all use blocking I/O.

Like we said before, the classic Java servlet application will scale to around 10,000 concurrent requests. That's a lot of requests, but once you get more, things go bad fast. A major reason is that they use Java threads, each backed by an Operating System thread, which require:

  • Time: a CPU context switch and 2 system calls
  • Resources: the JVM will typically allocate between 256KB and 512KB of stack per thread (note: this is off-heap memory!)

Java threads (and OS threads) are simply too expensive to be sitting around doing nothing while waiting for a mobile phone on a GPRS connection to dribble its bytes over the internet! With the current Java threading and memory model, it is simply not possible to scale to millions of concurrent operations per JVM.

So… how do we break through this barrier?

MOAR THREADS???
No.

Use golang?
Uhh… no. Too confusing with its GOPATHS, back to front type declarations and whatnot. I realise this is not a great argument against Go, work with me here.

Node!
Please, let’s be serious! We are enterprise!

Highly scalable Java applications are impossible!
They were. Until now. Enter the reactor…

The Reactor Engine

The reactor engine is the thing that turns the layabout app into an eager, always-busy app by enabling two key capabilities:

  1. It doesn’t block a thread on I/O – data transfer doesn’t occupy an OS thread while waiting for data to be received; and
  2. It handles backpressure – a mechanism that ensures producers don't overwhelm consumers by having the producer slow down when the consumer of the I/O is too busy. If you've seen those "one vehicle per green" lights on a freeway entry ramp, you've seen a real-life application of backpressure.

Non-blocking I/O and backpressure mean the application doesn’t need to go adding threads to service more requests. It scales with very few threads and matches the rate data is produced to that which can be consumed!
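Backpressure isn't exclusive to Reactor: since Java 9 the JDK ships the same publisher/subscriber contract in java.util.concurrent.Flow. A minimal, runnable sketch in which the subscriber pulls one item at a time, so the producer can never get ahead of it:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

public class BackpressureDemo {
    public static void main(String[] args) throws InterruptedException {
        List<Integer> received = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(1);

        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(new Flow.Subscriber<Integer>() {
                private Flow.Subscription subscription;

                public void onSubscribe(Flow.Subscription s) {
                    subscription = s;
                    s.request(1);            // pull exactly one item: this is backpressure
                }
                public void onNext(Integer item) {
                    received.add(item);
                    subscription.request(1); // done with that one, ready for the next
                }
                public void onError(Throwable t) { done.countDown(); }
                public void onComplete() { done.countDown(); }
            });

            for (int i = 1; i <= 5; i++) {
                publisher.submit(i);
            }
        } // close() signals onComplete to the subscriber

        done.await();
        System.out.println("received: " + received);
    }
}
```

Note that SubmissionPublisher.submit() will itself block when the subscriber's buffer is full, which is exactly the producer slowing down to match the consumer.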

Many platforms and applications that utilise reactive principles run on only one thread! Node.js, which has been the butt of many an enterprise developer joke, can handle hundreds of thousands of concurrent connections with a single event-loop thread, as can NGINX and others.

Spring reactive gets its non-blocking abilities from Netty, which it uses for handling both server and client network I/O. Java applications built on this will use more than one thread, but on the order of 10, as opposed to the hundreds or thousands that may be configured in a classic servlet engine. The memory and CPU cost of context switching is avoided, time spent managing the application goes down, and time spent doing useful work goes up.
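Spring isn't even required to see non-blocking I/O in action: since Java 11 the JDK bundles java.net.http.HttpClient, whose sendAsync() returns immediately rather than parking a thread on the socket. A self-contained sketch, using the JDK's built-in com.sun.net.httpserver as a stand-in endpoint so nothing external is needed:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CompletableFuture;

public class NonBlockingHttp {
    public static void main(String[] args) throws Exception {
        // Tiny local server so the example is self-contained (no external network).
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/", exchange -> {
            byte[] body = "hello".getBytes();
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) { os.write(body); }
        });
        server.start();

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("http://localhost:" + server.getAddress().getPort() + "/")).build();

        // sendAsync returns immediately; the response is handled when it arrives,
        // without blocking the calling thread on the socket read.
        CompletableFuture<String> pending = client
                .sendAsync(request, HttpResponse.BodyHandlers.ofString())
                .thenApply(HttpResponse::body);

        System.out.println("request dispatched, thread is free");
        System.out.println("response: " + pending.join());
        server.stop(0);
    }
}
```

The join() at the end is only there to keep the demo alive; in a real reactive service the continuation would compose into the response pipeline instead.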

We have let the performance genie out of the bottle!

Conclusion

Now we know what a reactive microservice is and why we should want to start writing one, it’s time to have a look at what it means for the code.

Next time we’ll dive into a reactive Spring Boot application and see how it works under the hood.

What is Scaled Agile?

First off, let’s define what Scaled Agile is – and what it isn’t.

Scaled Agile is a way of working underpinned by the values and principles of Lean, Agile software development, and Systems Thinking implemented at scale in medium to large enterprise environments. It is ideal for large delivery teams (typically 50 to 150 members) working on one or more complex software products.

Scaled Agile is not just about distributing work across multiple Scrum teams. It is not limited to Scrum practices, nor is it locked into any other specific agile framework. There are many scaling methods and approaches – including Scaled Agile Framework (SAFe), Scrum of Scrums, the Spotify Model, Disciplined Agile Delivery (DAD), etc – each with its own idiosyncrasies. At Marlo, we borrow elements from different frameworks based on our understanding of what will work best for each client.

Although widely considered one of the best approaches to delivering complex enterprise solutions, Scaled Agile is not a silver bullet, and dare we say, is not for everyone.

Who Benefits from Scaled Agile?

The Internet abounds with comparisons between Waterfall and Agile approaches. Many of these focus primarily on differences relating to task management and team structures. But the more interesting points of difference lie in the cultural norms they embody. Sadly, up to 90% of organisations that embark on an Agile transformation fail to complete their journey. A key to avoiding this fate is to ensure your organisational culture is taken along for the ride.

Scaled Agile may benefit organisations facing the following challenges:

  • Failure to achieve desired business outcomes due to lack of collaboration, transparency and alignment across business and technology teams
  • Limited ability to accommodate changes due to complex architecture and resource constraints
  • Products failing to deliver the expected business benefit, or resulting in financial loss
  • Poor-quality outcomes due to inconsistent or ineffective software engineering practices
  • Delays in responding to changing market and customer needs
  • Demoralised teams and disengaged business stakeholders

To achieve the full benefit of Agile ways of working, the right elements need to be in place. Key among these is strong executive sponsorship. Adopting Scaled Agile is a substantial multi-year investment, so it is paramount that leaders commit to leading the change. They must not only support the changes to team structures, policies and processes, but also – and perhaps most critically – they must drive cultural change to shift the organisation from a traditional command-and-control paradigm to a “servant-leader” model that empowers autonomous teams and removes the fear of failure.

The transition to Scaled Agile is far from simple, but for the right organisation, the potential benefits are many:

Faster Time to Market

The term “agile”, by definition, describes one’s ability to change direction quickly. An oft-cited benefit of Agile approaches is a quicker time-to-market. But just how much quicker? Fortunately, SAFe has done some case studies across organisations that have adopted Agile, as summarised below:

  • Australia Post (postal services): 100-fold increase in yearly production deployments, with a 98% cost reduction and 400% productivity increase over 18 months
  • Deutsche Bahn (transportation): lead time dropped from 12 months to 3-4 months
  • Capital One (financial services): lead time dropped from 6 months to a couple of months
  • Amdocs (telecommunications): 30% faster delivery to acceptance testing
  • Fitbit (technology): velocity up 30% year over year

One of the goals of Agile software development methodologies such as Scrum and Lean is to enable teams to deliver fully-functional, properly-tested software in rapid cycles (typically every 2-4 weeks). Scaled Agile builds on this by orchestrating multiple agile teams working in concert toward a common goal. This enables large delivery teams to release working products much more frequently (often in 3-month “program increments”), accelerating returns on investment and reducing potential revenue loss – as depicted in Figure 1.

Figure 1. Faster time to market reduces the cost of delay, which would otherwise result in lower actual returns.

Yet agile methods alone are not enough to achieve true speed-to-market: technology makes an enormous difference here. We believe modern cloud-native platforms can give you a running start, and to this end, we have invested our time and expertise to build a platform (“Bring your own Environment”) that can get you up and running in a full-fledged cloud environment in minutes. Read more about this at The New BYOD – Bring Your Own Dev.

Maintain Alignment

Contrary to popular opinion, misalignment between business and IT is not just an issue affecting organisations with a traditional functional structure. In fact, misalignment is often a bigger problem in organisations with a mix of traditional and Scrum teams.

Traditional organisations attempt to achieve alignment through strict processes and stage-gates. Ironically, these regimented processes can actually stifle collaboration. They also run counter to the principles of Agile – serving to highlight the discord across teams following different ways of working. Deming put it aptly:

Best efforts are essential. Unfortunately, best efforts, people charging this way and that way without guidance of principles, can do a lot of damage.

– W. Edwards Deming

Netflix and Spotify popularised the principle of ‘highly aligned, yet loosely coupled teams’ – a play on Microservices Architecture’s ‘loosely coupled, highly cohesive’ principle.

Figure 3. Loosely coupled, tightly aligned teams (Spotify)

Scaled Agile applies this principle in finding a balance between giving teams autonomy and ensuring they remain aligned with the goals of the enterprise. It is particularly beneficial for large distributed teams: Royal Philips, for example, has 42 Agile Release Trains, each with 50-125 members. Open and frequent communication – supported by processes and tools – ensures teams are aligned to business outcomes, and leaders are aware of the progress of each team.

Promoting local autonomy unlocks creativity, speed, and innovation: empowered teams are more engaged and more productive; they feel a sense of ownership and a stronger connection to the organisation’s vision and goals.

Risk Reduction

“I am ignoring all risks and going all-in on this investment, even if it wastes millions of dollars and damages my company’s reputation”

– No One Ever

The quote above is intended to be humorous, but sadly we encounter variations of it all too often. Multi-million dollar enterprise projects crash and burn for a multitude of reasons, including invalid assumptions resulting in poor market fit, incorrect requirements, delayed delivery, blown-out budgets, tons of avoidable rework, unreported issues, counter-intuitive user experience, etc.

Scaled Agile can help enterprise teams uncover hidden risks or problems early. Frequent software releases enable continuous validation of assumptions, and help ensure ongoing alignment with business needs. US mortgage finance provider Fannie Mae used a Scaled Agile approach to meet government regulatory mandates through continuous feedback loops – a feat that would have taken much more time and money using a Waterfall approach.

Scaled Agile can also reduce delivery risk over time by optimising the delivery capability itself. Techniques such as value-stream-mapping, big retrospectives, leading-indicators-tracking, and scrum-of-scrums events are commonly applied in Scaled Agile to facilitate continuous improvement of delivery.

Build Trust

Agile at scale means trust at scale.

– Henrik Kniberg, Spotify

Scaled Agile offers practical ways to build a culture of trust in your organisation, enabling business and delivery teams to work collaboratively and communicate openly.

Regular stand-up meetings encourage a shared understanding and ownership of progress, and surface issues, risks, dependencies and impediments. Visual tools such as a Program Wall or Kanban board build on this by publicly tracking progress, while showcases and sprint reviews provide further opportunities for feedback and learning. Perhaps most importantly, the working software delivered in each iteration is a tangible outcome that unites teams and allows assumptions to be tested quickly – avoiding the late discovery of issues that erodes trust and devolves into a "blame game".

Scaled Agile also builds trust between leaders and delivery teams. Decision-makers can act decisively and teams can deliver confidently in the knowledge that any issues will be discovered and rectified early – promoting an environment in which it is safe to “fail”.

Early and incremental realisation of value also builds confidence and trust among stakeholders and customers. Features can be refined iteratively based on customer feedback (e.g. using Design Thinking methods) and budgets can be released incrementally as investment guardrails and milestones are achieved.

Conclusion

Small teams and startups do not have a monopoly on delivery agility. Scaled Agile provides a pragmatic and effective framework for ‘descaling’ large, functional team structures into lean cross-functional and cross-component teams aligned to product streams. Many large organisations have adopted Scaled Agile ways of working to achieve faster time to market, improve returns on investment, reduce risks, and foster a creative and collaborative workforce culture.

A successful transition to Scaled Agile should be supported by tools and technologies that enable team collaboration and quality delivery at pace – underpinned by the right architectural thinking, DevOps, and automation frameworks and platforms.

Do you agree with our list? Please leave a comment below to share your thoughts and experiences, or contact us via our website to arrange a free consultation.