The Scary Red Chick - A Map and Reduce project

First, the Scary Red Chick IS NOT a MapReduce framework. It is a Map and Reduce framework: map and reduce are two of the operations the framework can perform on the data, not the only ones, and they can be run in any order and any number of times.

In addition, the Scary Red Chick can cache the result data of any task and keep it for a specified time.
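
As a rough illustration of result caching, here is a minimal sketch that assumes a cached result is kept for a fixed number of seconds and then dropped. The `ResultCache` class, its method names, and the `task_id` key are hypothetical and are not part of the Scary Red Chick itself.

```python
import time


class ResultCache:
    """Hypothetical in-memory cache for task results (illustration only)."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so expiry can be demonstrated without sleeping.
        self._clock = clock
        self._entries = {}

    def put(self, task_id, result, ttl_seconds):
        # Remember the result together with the moment it stops being valid.
        self._entries[task_id] = (result, self._clock() + ttl_seconds)

    def get(self, task_id):
        # Return the cached result, or None if it is missing or has expired.
        entry = self._entries.get(task_id)
        if entry is None:
            return None
        result, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[task_id]
            return None
        return result


# Usage with a fake clock that is advanced by hand.
now = [0.0]
cache = ResultCache(clock=lambda: now[0])
cache.put("map-1", [1, 2, 3], ttl_seconds=60)
fresh = cache.get("map-1")    # still within the 60 seconds
now[0] = 61.0
expired = cache.get("map-1")  # past the 60 seconds
```

A later task that needs the same input could ask the cache first and only rerun the earlier task when `get` returns `None`.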

The Scary Red Chick can perform the following types of tasks:

  • fetch - get data from outside of the system.

  • map - process the data record by record. No order or amount of data is guaranteed for any mapper, and the data can be split in any way before it reaches the mappers. One possible future optimization is to stream the data from the previous task to the mapper.

  • reduce - reducers receive the data grouped by a key specified in the task definition.

  • filter - process the data record by record and return true or false for each record. This could be achieved with a mapper, but having a dedicated task type lets the same filter be used either as a standalone task or as a filter applied before another task is executed.

  • split - process the data record by record and return the split id for each record. This task can be used on its own to split the data into smaller chunks that can be processed or exported independently. It can also be used to split the data before a map or a reduce task.

  • store - used to get information out of the system.
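
To make the task types above more concrete, here is a small sketch of how they might chain together. It is plain Python and does not use the Scary Red Chick API: every function name, the sample records, and the choice that a filter returning true keeps the record are assumptions made for illustration.

```python
from collections import defaultdict


def fetch():
    # fetch: bring data in from outside the system (hard-coded sample here).
    return [
        {"user": "ann", "amount": 5},
        {"user": "bob", "amount": -1},
        {"user": "ann", "amount": 7},
        {"user": "bob", "amount": 3},
    ]


def run_filter(records, predicate):
    # filter: the predicate returns True or False per record;
    # this sketch assumes True means "keep".
    return [r for r in records if predicate(r)]


def run_split(records, split_id):
    # split: the function returns a split id per record.
    chunks = defaultdict(list)
    for r in records:
        chunks[split_id(r)].append(r)
    return dict(chunks)


def run_map(records, mapper):
    # map: process record by record, with no assumption about order.
    return [mapper(r) for r in records]


def run_reduce(records, key, reducer):
    # reduce: group records by key, then hand each group to the reducer.
    groups = defaultdict(list)
    for r in records:
        groups[key(r)].append(r)
    return {k: reducer(k, group) for k, group in groups.items()}


def store(result, sink):
    # store: send a result out of the system (a list stands in for it here).
    sink.append(result)


sink = []
records = run_filter(fetch(), lambda r: r["amount"] > 0)
chunks = run_split(records, lambda r: "large" if r["amount"] >= 5 else "small")
in_cents = run_map(records, lambda r: {**r, "amount": r["amount"] * 100})
totals = run_reduce(
    in_cents,
    key=lambda r: r["user"],
    reducer=lambda user, group: sum(r["amount"] for r in group),
)
store(totals, sink)
```

Because each step takes and returns plain collections, the steps can be reordered or repeated, which mirrors the "any order and any number of times" idea described earlier.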