Distilled lessons from building the microservices powering the Slang Labs platform.
At SlangLabs, we are building a platform for programmers to easily and quickly add multilingual, multimodal Voice Augmented eXperiences (VAX) to their mobile and web apps. Think of an assistant like Alexa or Siri, but running inside your app and tailored for your app.
The platform consists of:
- Console to configure a buddy for an app,
- Microservices that SDKs invoke to infer the intent inherent in the voice utterance of an end-user, and extract associated entities, and
- Analytics to analyze end-user behaviour and improve the experience.
This series of blog posts is to share the best practices and lessons we have learned while building the microservices.
At the idea-phase of a startup, one has some sense of destination and direction but does not know exactly what to build. That clarity emerges only through iterations and experimentations. We were no different, so we had to pick a programming language and microservice framework suitable for rapid prototyping. These were our key considerations:
- Rapid Development: high velocity of experimentation, for quick implementation and evaluation of ideas.
- Performance: a lightweight yet mature microservice framework, efficient for mostly IO-bound applications, that scales to high throughput for concurrent requests.
- Tools Infrastructure: for automated testing, cloud deployment, monitoring.
- Machine Learning (ML): easy availability of libraries and frameworks.
- Hiring: access to talent and expertise.
There is no perfect choice of programming language that ticks all of the above. For us it boiled down to Python vs. Java/Scala, because these were the only feasible languages for our machine learning work. While Java has better performance and tooling, Python is apt for rapid prototyping. At that stage, we favoured rapid development and machine learning over the other considerations, and therefore picked Python.
With Python came its infamous Global Interpreter Lock (GIL). In brief, a thread can execute only after acquiring the Python interpreter lock. Since it is a global lock, only one thread of the program can hold it, and therefore only one thread can run at a time, even if the hardware has multiple CPUs. It effectively limits Python programs to single-threaded performance.
While the GIL is a serious limitation for CPU-bound concurrent Python apps, for IO-bound apps the cooperative multitasking of AsyncIO offers good performance (more about it later). For performance, we desired a web framework that is lightweight yet mature, and has AsyncIO APIs.
- Django follows a “batteries included” approach: it has everything you will need and more. While that eliminates integration compatibility blues, it also makes it bulky. It does not have AsyncIO APIs.
- Flask, on the other hand, is super lightweight, and has a simple way of defining service endpoints through decorators. It does not have AsyncIO APIs either.
- Tornado is somewhere between Django and Flask: it is neither as barebones as Flask nor as heavy as Django. It has quite a number of configurations and hooks, and a nice testing framework. It has had an event loop for scheduling cooperative tasks since long before AsyncIO, and has started supporting the AsyncIO event loop and syntax.
Tornado was just right for our needs. But most of our design tactics are independent of that choice, and are applicable regardless of the chosen web framework.
Overcoming Global Interpreter Lock
Before we plunge into design and code, let’s understand some key concepts: cooperative multitasking, non-blocking calls, and AsyncIO.
Preemptive vs Cooperative Multitasking
Threads follow the model of preemptive multitasking. Each thread executes one task. The OS schedules a thread on a CPU and, after a fixed interval (or when the thread blocks, typically due to an IO operation, whichever happens first), interrupts the thread and schedules another waiting thread on the CPU. In this model of concurrency, multiple threads can execute in parallel on multiple CPUs, as well as interleaved on a single CPU.
In cooperative multitasking, there is a queue of tasks. When a task is scheduled for execution, it executes till a point of its choice (typically an IO wait) and yields control back to the event loop scheduler, which puts it in the waiting queue and schedules another task. At any time, only one task is executing, but it gives an appearance of concurrency.
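This model can be illustrated with plain Python generators, each task yielding control back to a simple round-robin scheduler. This is only a toy sketch of the idea; the names `task` and `scheduler` are illustrative, not from any real framework:

```python
from collections import deque

def task(name, steps, log):
    # A cooperative task: does one unit of work, then yields control.
    for i in range(steps):
        log.append(f"{name}{i}")
        yield  # point of its choice: hand control back to the scheduler

def scheduler(tasks):
    # Event-loop-like scheduler: resumes each task until it yields,
    # then puts it back at the end of the waiting queue.
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)          # resume the task
            queue.append(current)  # it yielded; re-queue for another turn
        except StopIteration:
            pass                   # task finished; drop it

log = []
scheduler([task("A", 2, log), task("B", 2, log)])
print(log)  # steps of A and B interleave: ['A0', 'B0', 'A1', 'B1']
```

Only one task runs at any moment, yet the steps of A and B interleave, which is exactly the appearance of concurrency described above.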
Synchronous vs Asynchronous Calls
In synchronous or blocking function calls, control returns to the caller only after completion. Consider the following pseudocode:
```python
bytes = read()
print(bytes)
print("done")  # "done" is printed only *after* bytes.
```
In asynchronous or non-blocking function calls, control returns to the caller immediately. The called function can pause during execution. It takes a callback routine as an argument, and when the called function finishes and results are ready, it invokes the callback with the results. Meanwhile, the caller resumes execution even before the called function completes. Assume there is a non-blocking async_read function that takes a callback function and calls it with the read bytes. Consider the following pseudocode:
```python
async_read(print)
print("done")  # "done" may be printed *before* bytes.
```
As you can see, asynchronous code with callbacks is hard to understand, because the execution order of the code can differ from its lexical order.
The AsyncIO syntax of async and await facilitates writing asynchronous code in synchronous style instead of using callbacks, making the code easy to understand.
```python
import asyncio

async def f():
    bytes = await async_read()  # f pauses here, yields control.
    # Resumes when the result (bytes) is ready.
    print(bytes)
    print("done")

asyncio.run(f())  # Append f() to the IO Event Loop queue
```
A function defined with async is called a coroutine. It must be awaited, as its result will be available only in the future. An await expression yields control to the scheduler. The code after the await expression is like a callback: control resumes there later, when the awaited coroutine completes and its result is ready.
AsyncIO has an IO Event Loop, which maintains a queue of coroutines ready to be resumed, i.e. those whose awaited results have become available.
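To see the event loop juggling multiple coroutines on a single thread, here is a minimal sketch; `asyncio.sleep` stands in for a real IO wait, and the function names are illustrative:

```python
import asyncio

log = []

async def fetch(name, delay):
    # Simulate an IO wait; await yields control to the event loop.
    log.append(f"{name} start")
    await asyncio.sleep(delay)
    log.append(f"{name} done")

async def main():
    # Both coroutines run concurrently on a single thread:
    # while A waits on IO, the event loop runs B.
    await asyncio.gather(fetch("A", 0.05), fetch("B", 0.01))

asyncio.run(main())
print(log)  # ['A start', 'B start', 'B done', 'A done']
```

Note that B finishes before A even though A started first: each `await` handed control back to the event loop, which resumed whichever coroutine was ready.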
Derisking by Design
While Tornado has worked out well for us so far, we did not know it then. We designed our microservices such that Tornado-dependent code was segregated and localized, so that we could easily migrate to a different framework if the need arose. Regardless, it is a good idea to structure your microservice into two layers: a Web Framework Layer and a framework-independent Service Layer.
Web Framework Layer
The Web Framework Layer is responsible for the REST service endpoints over the HTTP protocol. It does not have any business logic. It processes incoming requests, extracts relevant information from the payload, and calls a function in the Service Layer which performs the business logic. It packages the returned results appropriately and sends the response. For Tornado, it consists of two files:
- server.py contains an HTTP server that starts the event loop and application.
- app.py contains endpoint routes that map REST API to a function in the service layer (specifically to a function in service.py, see next).
Service Layer
The Service Layer contains only business logic, and knows nothing about HTTP or REST. That allows any communication protocol to be stitched on top of it without touching the business logic. There is only one requirement for this layer:
- service.py must contain all functions needed to implement the service endpoints. Think of it as logical service APIs, independent of any Web framework or communication protocol.
Logical service APIs allow the Web Framework Layer to be implemented (and replaced) without getting into the nitty-gritty of the inner workings of the service. It also facilitates standardizing and sharing a large portion of web framework code across services.
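As a sketch of this separation (simplified and with hypothetical names: the real service.py holds the actual business logic, and in app.py a Tornado RequestHandler would replace the plain adapter function shown here):

```python
import json

# --- Service Layer (service.py): pure business logic, no HTTP/REST ---

class AddressService:
    def __init__(self):
        self._store = {}

    def create_address(self, addr_id, value):
        self._store[addr_id] = value

    def get_address(self, addr_id):
        return self._store[addr_id]

# --- Web Framework Layer: only translates protocol <-> service calls ---
# In Tornado this would live in a RequestHandler in app.py; a plain
# function stands in here to show the service knows nothing about HTTP.

def handle_get(service, request_body):
    payload = json.loads(request_body)           # extract info from payload
    result = service.get_address(payload["id"])  # call the logical service API
    return json.dumps({"address": result})       # package the response

service = AddressService()
service.create_address("1", "221B Baker Street")
print(handle_get(service, '{"id": "1"}'))  # {"address": "221B Baker Street"}
```

Because `handle_get` only parses and packages, swapping Tornado for another framework would mean rewriting that thin adapter, not `AddressService`.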
We were rare among startups in automating testing and code coverage from the very beginning. It may appear counter-intuitive, but we did it to maintain high velocity and fearlessly change any part of the system. Tests offered us the safety net needed while developing in a dynamically-typed interpreted language. It was also partly due to paranoia regarding our non-obvious choice of Tornado, to safeguard us in case we needed to change it.
There are three types of tests:
- Unit Tests: Limited to independently test a class or function, mostly for leaf-level independent classes/functions.
- Integration Tests: To test the working of multiple classes and functions. Out of process or network API calls (such as databases and other services) are mocked.
- End-to-End Tests: To test deployment on test or stage environment. Nothing is mocked, just that data is not from the prod environment and may be synthetic.
We wrote integration tests both for the Service Layer, to test the business logic, as well as for the Web Framework Layer, to test the functioning of the REST endpoints in the Tornado server.
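For instance, an integration test can mock an out-of-process call with the standard library's unittest.mock. This is a sketch; the `AddressService` class and its `db` client are hypothetical stand-ins, not code from the repo:

```python
import unittest
from unittest.mock import Mock

class AddressService:
    # Hypothetical service that reads from a database client.
    def __init__(self, db):
        self.db = db

    def get_address(self, addr_id):
        record = self.db.fetch(addr_id)  # out-of-process call in production
        return record["address"].strip()

class AddressServiceIntegrationTest(unittest.TestCase):
    def test_get_address_strips_whitespace(self):
        # Mock the database so no network or process boundary is crossed.
        db = Mock()
        db.fetch.return_value = {"address": " 221B Baker Street "}
        service = AddressService(db)
        self.assertEqual(service.get_address("1"), "221B Baker Street")
        db.fetch.assert_called_once_with("1")

# Run the test case programmatically.
result = unittest.TestResult()
AddressServiceIntegrationTest("test_get_address_strips_whitespace").run(result)
print(result.wasSuccessful())  # True
```

The same test class, discovered by `unittest discover`, would normally run as part of the integration suite rather than being invoked by hand as above.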
Get Source Code
Clone the GitHub repo and inspect the content:
```
$ git clone https://github.com/scgupta/tutorial-python-microservice-tornado.git
$ cd tutorial-python-microservice-tornado
$ git checkout -b <branch> tag-01-project-setup
$ tree .
.
├── LICENSE
├── README.md
├── addrservice
│   └── __init__.py
├── requirements.txt
├── run.py
└── tests
    ├── __init__.py
    ├── integration
    │   └── __init__.py
    └── unit
        └── __init__.py
```
The addrservice directory is for the source code of the service, and the tests directory is for keeping the tests.
Setup Virtual Environment
Using a virtual environment is one of the best practices, especially when you work on multiple projects. Create one for this project, and install the dependencies from requirements.txt:
```
$ python3 -m venv .venv
$ source ./.venv/bin/activate
$ pip install --upgrade pip
$ pip3 install -r ./requirements.txt
```
run.py is a handy utility script to run the static type checker, linter, unit tests, and code coverage. In this series, you will see that using these tools from the very beginning is actually most economical, and does not add the perceived overhead.
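A run.py along these lines can be a thin dispatcher over the tools. This is only a sketch assuming the commands shown in this section; the actual script in the repo may be organized differently:

```python
#!/usr/bin/env python3
import subprocess
import sys

SOURCES = ["./addrservice", "./tests"]

# Map each subcommand to the underlying tool invocation.
COMMANDS = {
    "typecheck": ["mypy"] + SOURCES,
    "lint": ["flake8"] + SOURCES,
    "test": ["python", "-m", "unittest", "discover", "tests", "-p", "*_test.py"],
}

def build_command(name):
    # Resolve a subcommand like "lint" to the full tool command line.
    try:
        return COMMANDS[name]
    except KeyError:
        raise SystemExit(f"unknown command: {name}")

def main(argv):
    return subprocess.call(build_command(argv[1]))

if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

Keeping the tool invocations in one script means every developer (and the CI pipeline) runs them with identical flags.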
Let’s try running these. In each of the following, you can use either of the commands.
Static Type Checker: mypy package
```
$ mypy ./addrservice ./tests
$ ./run.py typecheck
```
Linter: flake8 package
```
$ flake8 ./addrservice ./tests
$ ./run.py lint
```
Unit Tests: Python unittest framework
```
$ python -m unittest discover tests -p '*_test.py'
$ ./run.py test
```
This will run all tests in the tests directory. You can run the unit or integration test suites (in the tests/unit and tests/integration directories, respectively) as follows:
```
$ ./run.py test --suite unit
$ ./run.py test --suite integration
```
Code Coverage: coverage package
```
$ coverage run --source=addrservice --branch -m unittest discover tests -p '*_test.py'
$ coverage run --source=addrservice --branch ./run.py test
```
After running tests with code coverage, you can get the report:
```
$ coverage report

Name                      Stmts   Miss Branch BrPart  Cover
-----------------------------------------------------------
addrservice/__init__.py       2      2      0      0     0%
```
You can also generate an HTML report:
```
$ coverage html
$ open htmlcov/index.html
```
If you are able to run all these commands, your project setup is complete.
For quick prototyping, Python is more suitable, but it comes with the drawback of the Global Interpreter Lock. Cooperative multitasking with non-blocking asynchronous calls using asyncio comes to the rescue. Among the frameworks we considered, Tornado was, at the time, the only mature Python web framework with asyncio APIs.
A layered design derisks the situation in case the framework needs to be changed in the future. Tests can also be layered: unit, integration, and end-to-end. It is easy to set up lint, tests, and code coverage from the very beginning of the project.