Composable Analytics recently added support for iterating through collections of objects directly within the application flow! Some modules return enumerations of objects. For example, when you query for email messages through IMAP, the module returns a collection of messages matching the specified search criteria. Previously, if you wanted to process each email message, you had two options. First, you could write a custom module that takes in an enumeration of messages and processes them internally. The other option was to use the code module: within it, you could iterate through the messages and call another application to process each individual message. In both approaches, the loop is effectively hidden, and it is harder to leverage other modules and applications. The dependency between the code module and the application processing the email message is also obscured.
When designing modules, the developer always faced a dilemma: do I take in a collection, or just one object? Taking in a collection helps when upstream modules return collections, but it forces the developer to always deal with loops, and the output typically ends up being a collection as well. If you’re not careful, every module soon takes in a collection. With a ForEach module, module designers are no longer forced to take in collections, which results in simpler code. Collections can then be used where they are actually required (e.g. fusing multiple tables together).
So how does it work? The ForEach module takes a collection of objects as input, and its output is a single object. Any downstream modules that depend on the object output are executed multiple times (once for each object in the collection). So if the ForEach output object is piped into multiple modules, and those modules have other modules linked to them, they all get executed; the downstream graph effectively gets re-run for each object.
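To make the re-run behavior concrete, here is a minimal Python sketch. It is not Composable's implementation; the names for_each, subject_length and shout are hypothetical stand-ins for the ForEach module and two downstream modules, and the messages are plain strings for illustration.

```python
from typing import Any, Callable, Iterable, List

# A "module" here is just a function taking one object (an assumption for this sketch).
Module = Callable[[Any], Any]


def for_each(collection: Iterable[Any], downstream: List[Module]) -> List[List[Any]]:
    """Run every downstream module once per item and collect the results per item."""
    runs = []
    for item in collection:
        # The whole downstream set is executed again for each object.
        runs.append([module(item) for module in downstream])
    return runs


def subject_length(message: str) -> int:
    # Hypothetical downstream module: measures the message text.
    return len(message)


def shout(message: str) -> str:
    # Hypothetical downstream module: upper-cases the message text.
    return message.upper()


results = for_each(["hello", "invoice"], [subject_length, shout])
print(results)
```

Each inner list holds the outputs of one pass over the downstream modules, so two input objects produce two passes.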
What about nested ForEach modules? They still work. The nested ForEach module gets executed n times, where n is the number of objects in the upstream ForEach module’s collection, and any modules downstream of it get executed n*m times, where m is the number of objects in the nested ForEach module’s input collection. One restriction: a nested ForEach module can’t have multiple independent ForEach modules upstream from it.
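The execution counts can be checked with a small Python sketch. It assumes every outer object yields an inner collection of the same size m; the data and counter names are made up for illustration.

```python
# Outer collection: n = 3 objects.
outer = ["msg1", "msg2", "msg3"]

# Inner collection produced for each outer object: m = 2 objects each (assumed uniform).
inner = {
    "msg1": ["a", "b"],
    "msg2": ["c", "d"],
    "msg3": ["e", "f"],
}

nested_runs = 0      # how often the nested loop itself is started
downstream_runs = 0  # how often anything after the nested loop runs

for message in outer:
    nested_runs += 1
    for attachment in inner[message]:
        downstream_runs += 1

print(nested_runs, downstream_runs)
```

With n = 3 and m = 2, the nested loop starts 3 times and its downstream work runs 6 times.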
Let’s take a look at the example below. The first ForEach is used to iterate through the messages. DataBinder modules are used to extract the date and the attachments from each message. A single message may have multiple attachments, so a second ForEach module can be used to enumerate the attachments.
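In ordinary code, the same flow would look roughly like the Python sketch below. The Message and Attachment classes, their fields, and the process function are assumptions for illustration only; they are not Composable types, and reading the fields merely stands in for what the DataBinder modules do.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Tuple


@dataclass
class Attachment:
    filename: str


@dataclass
class Message:
    sent: date
    attachments: List[Attachment] = field(default_factory=list)


def process(messages: List[Message]) -> List[Tuple[date, str]]:
    """Pair each attachment with the date of the message it came from."""
    rows = []
    for message in messages:                     # first ForEach: one message at a time
        sent = message.sent                      # stand-in for the date DataBinder
        for attachment in message.attachments:   # second ForEach: one attachment at a time
            rows.append((sent, attachment.filename))
    return rows


inbox = [
    Message(date(2018, 5, 1), [Attachment("report.pdf"), Attachment("data.csv")]),
    Message(date(2018, 5, 2), [Attachment("photo.png")]),
]
rows = process(inbox)
print(rows)
```

The visual flow expresses the same two loops as modules, which keeps the dependency on the per-attachment processing visible instead of buried inside a code module.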