
ETL patterns
ETL patterns is another area of computing that lends itself very well to serverless platforms. At a high level, ETL jobs comprise the following three steps:
- Extracting data from one data source
- Transforming that data appropriately
- Loading the processed data into another data source
Often used in analytics and/or data warehousing, ETL jobs are hard to escape. Since this problem is again ephemeral and because users would probably like their ETL jobs to execute as quickly as possible, serverless systems are a great platform in this problem space. While serverless computation is typically short-lived, we will see how ETL processes can be designed to be long-running in order to work through large amounts of data.
In the fan-out pattern, discussed in Chapter 5, Scaling Out with the Fan-Out Pattern, a single unit of work will be broken up into multiple smaller units of work and processed in parallel. This pattern may be used as a standalone system or as a subcomponent in a more extensive system. We'll build out an application using the fan-out pattern in isolation, but later discuss how it can work as a piece in a more extensive system.
Messaging patterns themselves can be an entire class of design pattern. In our context, we will show how to use this as a general pattern to process data asynchronously with a known or fixed amount of processing power. Chapter 6, Asynchronous Processing with the Messaging Pattern, will walk through a full example of this pattern and its variants in a serverless context.