At Marlo we specialise in systems integration: Message Brokers, Enterprise Service Buses, Business Process Management, EAI – you name it, we’ve seen (and implemented) them all – tomorrow’s legacy today!
One of our current favourite application architectures for complex systems is microservices. Developing microservices has worked well for us – we have seen real benefits from the approach – and we’ve built them using various technologies.
But your typical integration developer is never happy!
They complain about their frameworks, tools or platforms and will constantly hassle architects about why we should be using X for this or Y for that. And our architects feel their pain, because they develop too.
So when we started a big API and microservices project a couple of years ago, of course we jumped at the chance to write all the services in fully modern async, reactive monad actor model microservices.
No we didn’t: we wrote them in good old-fashioned imperative if-then-else using Spring Boot. We had projects to deliver, and the reactive frameworks weren’t quite mature enough to convince our stakeholders to go for it.
But then, early in 2018, Spring Boot version 2.0 went GA and delivered a very capable reactive library to our chosen framework. It was time to get down into the weeds with reactive microservices.
What are Reactive Microservices?
Reactive programming is a paradigm where the application is anchored around data streams and propagation of change. Some definitions:
- Microservices are small, self-contained services that scale well and are independently deployable.
- Reactive microservices are microservices that conform to the Reactive Manifesto:
We believe that a coherent approach to systems architecture is needed, and we believe that all necessary aspects are already recognised individually: we want systems that are Responsive, Resilient, Elastic and Message Driven. We call these Reactive Systems
Source: The Reactive Manifesto
The Reactive Manifesto takes the paradigm of reactive programming and lays out explicit application behaviour. It’s worth reading and understanding in full, but we’ll summarise it for you now. Let’s break down the four key concepts:
- Responsive: services respond consistently in a timely manner and establish a reliable upper bound for response times.
- Resilient: services stay responsive in the event of failure. By making use of replication, containment, isolation and delegation patterns, they ensure that failures in one component do not affect another.
- Elastic: services react to changes in demand by increasing or decreasing resources allocated to them as required.
- Message Driven: services use asynchronous messaging at the boundaries between components, driving loose coupling, isolation and location transparency. Non-blocking, asynchronous communication protocols allow systems to use resources only when there is something to be done.
But what does all that really mean?
Reactive Microservices are the computer program equivalent of that hardworking, conscientious co-worker who is always doing something useful, as opposed to the layabout who will say they are working when they send an email and then sit around waiting for the reply.
In plain English, reactive means the service works well:
- It doesn’t tie up vital resources, such as CPU threads, when it’s not doing anything – like waiting around for bytes to appear over a network connection from some remote web server
- It can scale easily and talk to other services regardless of where that service runs
- When part of a system fails, it only affects the bit that’s broken – other unrelated parts of the application can continue working as though nothing is wrong
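The first point – not parking a thread while waiting for a slow remote call – can be sketched with the JDK's own CompletableFuture, no extra libraries needed. The fetchRemote method below is a hypothetical stand-in for a network call (it just sleeps on a pool thread); the point is that the caller registers a callback and moves on instead of blocking:

```java
import java.util.concurrent.CompletableFuture;

public class NonBlockingDemo {

    // Hypothetical stand-in for a slow remote call; the simulated wait happens
    // on a pool thread, so the caller's thread is never parked on it.
    static CompletableFuture<String> fetchRemote() {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(100); // simulated network latency
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "payload";
        });
    }

    public static void main(String[] args) {
        // Register what to do when the data arrives, then carry on immediately.
        CompletableFuture<Integer> length = fetchRemote().thenApply(String::length);
        System.out.println("caller thread is free to do other work");
        System.out.println("length: " + length.join()); // join() only to keep the demo alive
    }
}
```

Note the sleep still occupies a pool thread here – truly non-blocking I/O (as with Netty, discussed later) frees even that – but it shows the callback-driven style of not waiting around.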
Why Haven’t Reactive Microservices Taken Over The World (Yet)?
This is a reasonable question, and it has some interesting answers. Reactive has already taken over some corners of computing, notably the user-space file system and networking APIs of virtually every modern language.
The main reason it hasn’t taken hold in enterprise application development is that, across the range of things a typical application needs to do – making HTTP requests over the network, querying a database – reactive tools simply weren’t available. Until now!
Reactive in Java
So now we know about reactive microservices, let’s talk about them in our usual tech stack: Java.
The underlying design principles of reactive services are nothing new, and (hopefully) you’ve been doing this stuff for years:
- Writing small, independently deployable components
- Designing clean APIs with true separation of concerns
- Deploying your applications in containers
This is a good start, but there is still more to achieve. The bad guy here is Blocking I/O.
A Brief Foray Into Java History
The old way of developing microservices used the classic servlet style. This was a simpler time, when applications were not too fussed about dispatching a request, waiting around for the response to come back, and tying up an entire OS thread while that happened. The code was simpler too. For our classic Java application, this means we have API endpoints in Controller classes that utilise a set of Service classes implementing various business logic, all wrapped up in a servlet engine.
Servlet engines such as Tomcat, Jetty, WebSphere and WebLogic are all very well engineered, but the Servlet specification they implement has not aged quite so well.
They all suffer from a variant of the C10k problem: they struggle to scale past about 10,000 concurrent connections.
The primary reason is that v2.1 servlet engines block on I/O. Not only that, but they allocate a thread per incoming request, which executes the necessary code – waiting patiently for network clients to return results – until it finally sends its response to the caller.
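To make the thread-per-request model concrete, here's a minimal simulation using only the JDK: a fixed pool of 200 worker threads (an illustrative, servlet-engine-sized number, not any particular product's default) where every "request" blocks its thread for the full duration of a simulated downstream call:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadPerRequestDemo {

    // Each "request" ties up its worker thread for the whole simulated I/O wait,
    // exactly as a classic blocking servlet engine does.
    static void handleRequest(AtomicInteger handled) {
        try {
            Thread.sleep(20); // stand-in for a blocking database or HTTP call
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        handled.incrementAndGet();
    }

    static int run(int requests) throws InterruptedException {
        AtomicInteger handled = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(200); // illustrative pool size
        for (int i = 0; i < requests; i++) {
            pool.submit(() -> handleRequest(handled));
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
        return handled.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // 1,000 requests must queue up in five "waves" through the 200-thread pool;
        // at 10,000+ concurrent requests the pool simply runs out of threads.
        System.out.println("handled " + run(1000) + " requests");
    }
}
```

While each request sleeps, its thread does nothing useful yet still consumes the full per-thread cost described below – the crux of the scaling problem.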
Versions 3 and 3.1 of the Servlet specification improved this somewhat by providing asynchronous readers and writers via the ReadListener and WriteListener interfaces. These are pretty clever in how they mediate between clients and each servlet invocation, but all code inside the service call is still synchronous and will block on outbound I/O.
Back To Today
All the good modern Java HTTP clients – Apache HttpComponents v4, OkHttp, RestTemplate, Jersey Client – are well written and efficient, utilising connection pooling and careful thread management. But they all use blocking I/O.
Like we said before, the classic Java servlet application will scale to around 10,000 concurrent requests. That’s a lot of requests, but once you go past that, things go bad fast. A major reason is Java threads: each one is backed by an operating system thread, which requires:
- Time: a CPU context switch and 2 system calls
- Resources: the JVM will typically allocate between 256KB and 512KB of stack per thread. Note: this is off-heap memory!
Java threads (and OS threads) are simply too expensive to be sitting around doing nothing while waiting for a mobile phone on a GPRS connection to dribble its bytes over the internet! With the current Java threading and memory model, it is simply not possible to scale to millions of concurrent operations per JVM.
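The numbers above compound quickly. A back-of-the-envelope calculation, using the upper end of the stack size quoted above:

```java
public class ThreadCostDemo {
    public static void main(String[] args) {
        long stackBytesPerThread = 512 * 1024L; // upper end of the typical JVM default
        long threads = 10_000;
        long totalMb = (stackBytesPerThread * threads) / (1024 * 1024);
        // 10,000 blocked threads can pin roughly 5 GB of off-heap memory
        // in stacks alone, before the application does any useful work.
        System.out.println(totalMb + " MB of off-heap stack for " + threads + " threads");
    }
}
```

That's memory the garbage collector can't help you with, consumed mostly by threads that are asleep waiting for bytes.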
So… how do we break through this barrier?
MOAR THREADS???
No.
Use golang?
Uhh… no. Too confusing, with its GOPATHs, back-to-front type declarations and whatnot. (I realise this is not a great argument against Go – work with me here.)
Node!
Please, let’s be serious! We are enterprise!
Highly scalable Java applications are impossible!
They were. Until now. Enter the reactor…
The Reactor Engine
The reactor engine is the thing that turns the layabout app into an eager, always-busy app, by enabling two key capabilities:
- It doesn’t block a thread on I/O – data transfer doesn’t occupy an OS thread while waiting for data to be received; and
- It handles backpressure – a mechanism that attempts to ensure producers don’t overwhelm consumers, by having the producer slow down when the consumer is too busy. If you’ve seen those "one vehicle per green" lights on a freeway entry ramp, you’ve seen a real-life application of backpressure.
Non-blocking I/O and backpressure mean the application doesn’t need to go adding threads to service more requests. It scales with very few threads and matches the rate data is produced to that which can be consumed!
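Since Java 9, the JDK itself ships the reactive-streams contract as java.util.concurrent.Flow, which makes backpressure easy to demonstrate without any third-party library. In this deliberately tiny sketch (not production code), the subscriber requests exactly one item at a time, so the publisher can never run ahead of it:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

public class BackpressureDemo {

    static List<Integer> run() throws InterruptedException {
        List<Integer> received = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(1);

        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(new Flow.Subscriber<Integer>() {
                private Flow.Subscription subscription;

                @Override public void onSubscribe(Flow.Subscription s) {
                    subscription = s;
                    s.request(1); // demand exactly one item: this is backpressure
                }
                @Override public void onNext(Integer item) {
                    received.add(item);      // "process" the item...
                    subscription.request(1); // ...then signal readiness for one more
                }
                @Override public void onError(Throwable t) { done.countDown(); }
                @Override public void onComplete() { done.countDown(); }
            });

            for (int i = 1; i <= 5; i++) {
                publisher.submit(i); // submit() blocks if the subscriber's buffer fills up
            }
        } // close() signals onComplete once buffered items are delivered
        done.await();
        return received;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run()); // [1, 2, 3, 4, 5]
    }
}
```

The request(1) calls are the freeway on-ramp light: demand flows upstream, data flows downstream, and neither side outruns the other.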
Many platforms and applications built on reactive principles use only one thread! Node.js, which has been the butt of many an enterprise developer joke, can handle on the order of a million concurrent connections with a single event-loop thread, and NGINX and Erlang achieve similar scale with a handful of OS threads.
Spring’s reactive stack gets its non-blocking abilities from Netty, which it uses for both server and client network I/O. Java applications built on this will use more than one thread, but on the order of ten, as opposed to the hundreds or thousands that may be configured in a classic servlet engine. The cost of context switching – in both memory and CPU – is avoided: time spent managing the application goes down, and time spent doing useful work goes up.
We have let the performance genie out of the bottle!
Conclusion
Now we know what a reactive microservice is and why we should want to start writing one, it’s time to have a look at what it means for the code.
Next time we’ll dive into a reactive Spring Boot application and see how it works under the hood.