
Let’s Loop!

Lars Fiedler

Composable Analytics recently added support for iterating through collections of objects directly within the application flow! Some modules return enumerations of objects. For example, when you query for email messages through IMAP, a collection of messages matching the specified search criteria is returned. Previously, if you wanted to process each email message, you had two options. First, you could write a custom module that takes in an enumeration of messages and processes them internally. The other option was to use the code module: within it, you could iterate through the messages and call another application to process each individual message. In both approaches the loop is effectively hidden, which makes it harder to leverage other modules and applications. With the code module, the dependency between the code module and the application that processes the email message is also obscured.

When designing modules, the developer always faced a dilemma … do I take in a collection, or just one object? Taking in a collection helps when upstream modules return collections … but it then forces the developer to always deal with loops, and the output typically ends up being a collection as well. If you’re not careful, every module soon takes in a collection. With a ForEach module, module designers are no longer forced to take in collections, which results in simpler code. And collections can now be reserved for the cases where they are actually required (e.g. fusing multiple tables together).
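To make the trade-off concrete, here is a minimal Python sketch. It is not Composable Analytics code; `subject_length` and `subject_lengths` are hypothetical stand-ins for a single-object module and a collection-taking module.

```python
from typing import Iterable, List


def subject_length(subject: str) -> int:
    """Single-object 'module': no loop inside, one value in, one value out."""
    return len(subject)


def subject_lengths(subjects: Iterable[str]) -> List[int]:
    """Collection-taking 'module': the loop is hidden inside, and the
    output is forced to be a collection as well."""
    return [len(subject) for subject in subjects]


subjects = ["Hi", "Status"]

# With an external loop (the role a ForEach module plays), the simple
# single-object module is enough.
via_loop = [subject_length(subject) for subject in subjects]

# Without it, the module itself has to accept and return collections.
via_collection = subject_lengths(subjects)
```

Both variants compute the same values; the difference is where the loop lives and how much each module has to know about collections.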

So how does it work? The ForEach module takes a collection of objects as input, and its output is a single object. Any downstream modules that depend on that output object are executed multiple times (once for each object in the collection). So if the ForEach output object is piped into multiple modules, and those modules have other modules linked to them, they all get executed; the downstream graph is effectively re-run for each object.
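The re-run behavior described above can be sketched in plain Python. This is only an illustration of the idea, not how the platform is implemented; `run_for_each`, `branches`, and the sample data are all assumed names invented for the sketch.

```python
from typing import Any, Callable, Dict, Iterable, List

Module = Callable[[Any], Any]


def run_for_each(collection: Iterable[Any],
                 branches: Dict[str, List[Module]]) -> List[Dict[str, Any]]:
    """Re-run every downstream branch once per item in the collection.

    Each branch is a chain of modules; the ForEach output (a single item)
    is fed into the first module of every branch.
    """
    runs = []
    for item in collection:              # one pass per object
        outputs = {}
        for name, chain in branches.items():
            value = item
            for module in chain:         # modules linked behind each other
                value = module(value)
            outputs[name] = value
        runs.append(outputs)
    return runs


# Hypothetical input: three message subjects.
subjects = ["Invoice", "Report", "Hello"]

# Two branches hang off the ForEach output; the second has two linked modules.
branches = {
    "upper": [str.upper],
    "length_doubled": [len, lambda n: n * 2],
}

runs = run_for_each(subjects, branches)
```

Here `runs` holds one entry per input object, and each entry contains the result of every downstream branch, mirroring the way the whole downstream graph is executed again for each item.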

What about nested ForEach modules? It still works. The nested ForEach module gets executed n times, where n is determined by the upstream ForEach modules, and any modules downstream of it get executed n*m times, where m is the number of objects in the nested ForEach module’s input collection. One restriction: a nested ForEach module can’t have multiple independent ForEach modules upstream from it.
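The n and n*m counts can be checked with a small Python sketch. It assumes, for simplicity, that every inner collection has the same size m; `count_runs` and the sample data are hypothetical and only model the counting.

```python
from typing import List, Sequence, Tuple


def count_runs(outer: Sequence[Sequence[str]]) -> Tuple[int, int]:
    """Return (nested ForEach executions, downstream module executions)."""
    nested_runs = 0
    downstream_runs = 0
    for inner in outer:          # outer ForEach: n items
        nested_runs += 1         # the nested ForEach runs once per outer item
        for _ in inner:          # nested ForEach: m items each time
            downstream_runs += 1
    return nested_runs, downstream_runs


# n = 3 outer objects, each carrying m = 2 inner objects.
outer: List[List[str]] = [["a", "b"], ["c", "d"], ["e", "f"]]
nested_runs, downstream_runs = count_runs(outer)
```

With n = 3 and m = 2, the nested loop runs 3 times and the innermost step runs 6 times. If the inner collections differ in size, the downstream count is the sum of their sizes rather than a single n*m product.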

Let’s take a look at the example below. The first ForEach is used to iterate through the messages. DataBinder modules are used to extract the date and the attachments from each message. A single message may have multiple attachments, so a second ForEach module is used to enumerate through the attachments.
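In code terms, the flow in this example corresponds roughly to the Python sketch below. The `Message` class, its fields, and the sample data are assumptions made up for illustration; they are not the platform's actual message type or the IMAP module's output.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Tuple


@dataclass
class Message:
    """Hypothetical stand-in for an email message returned by an IMAP query."""
    sent: date
    attachments: List[str] = field(default_factory=list)


def process(messages: List[Message]) -> List[Tuple[date, str]]:
    rows = []
    for message in messages:                # first ForEach: one message at a time
        sent = message.sent                 # DataBinder: extract the date
        attachments = message.attachments   # DataBinder: extract the attachments
        for name in attachments:            # second ForEach: one attachment at a time
            rows.append((sent, name))       # downstream work per attachment
    return rows


messages = [
    Message(date(2014, 1, 6), ["a.csv", "b.csv"]),
    Message(date(2014, 1, 7)),              # no attachments: inner loop never runs
    Message(date(2014, 1, 8), ["c.pdf"]),
]
rows = process(messages)
```

A message without attachments contributes nothing to `rows`, since the second loop has an empty collection to iterate over.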








Lars Fiedler

Lars has comprehensive expertise building large complex software systems, and has served as a Software Engineer at MIT’s Lincoln Laboratory since 2010, where he began developing Composable Analytics. Prior to joining Lincoln Laboratory, Lars worked as a Software Engineer at Microsoft Corporation from 2006 to 2010. Lars received his MS in Computer Science from Georgia Institute of Technology in 2004, and his BS in Computer Science from Georgia Tech in 2003.
