The Scary Red Chick - A Map and Reduce project
First, Scary Red Chick IS NOT a MapReduce framework. It is a Map and
Reduce framework: map and reduce are two of the operations the framework
can perform on the data, but not the only ones, and they can be run
in any order and any number of times.
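To illustrate what "any order and any number of times" could look like, here is a minimal sketch in Python. It is not the framework's API: `run_map`, `run_reduce`, `run_pipeline` and the `steps` format are hypothetical names made up for this example.

```python
from collections import defaultdict


def run_map(records, fn):
    """Apply fn to each record independently."""
    return [fn(r) for r in records]


def run_reduce(records, key_fn, fn):
    """Group records by key_fn, then call fn(key, group) once per key."""
    groups = defaultdict(list)
    for r in records:
        groups[key_fn(r)].append(r)
    return [fn(k, g) for k, g in groups.items()]


def run_pipeline(records, steps):
    """Run the steps in the order given; any mix of map and reduce is allowed."""
    for kind, *args in steps:
        if kind == "map":
            records = run_map(records, *args)
        elif kind == "reduce":
            records = run_reduce(records, *args)
        else:
            raise ValueError(f"unknown step kind: {kind}")
    return records


words = ["apple", "avocado", "banana", "blueberry", "cherry"]
steps = [
    # word -> (first letter, length)
    ("map", lambda w: (w[0], len(w))),
    # total length per first letter
    ("reduce", lambda r: r[0], lambda k, g: (k, sum(n for _, n in g))),
    # (letter, total) -> (size label, letter)
    ("map", lambda r: ("long" if r[1] > 10 else "short", r[0])),
    # letters collected per size label
    ("reduce", lambda r: r[0], lambda k, g: (k, sorted(x for _, x in g))),
]
result = run_pipeline(words, steps)
```

Here a second map and a second reduce follow the first pair, which a strict one-map-then-one-reduce model would not allow in a single job.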
In addition, the Scary Red Chick can cache the result data
of any task and keep it for a specified time.
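One way such a result cache might work is sketched below, assuming that results are kept for a fixed number of seconds and are looked up by a task id. The class `ResultCache`, its `put`/`get` methods and the `ttl_seconds` parameter are hypothetical and only serve the example.

```python
import time


class ResultCache:
    """Keeps task results until their time-to-live runs out (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the example can fake time
        self._entries = {}       # task_id -> (expires_at, result)

    def put(self, task_id, result, ttl_seconds):
        """Store a result and remember when it stops being valid."""
        self._entries[task_id] = (self._clock() + ttl_seconds, result)

    def get(self, task_id):
        """Return the cached result, or None if it is missing or expired."""
        entry = self._entries.get(task_id)
        if entry is None:
            return None
        expires_at, result = entry
        if self._clock() >= expires_at:
            del self._entries[task_id]  # drop stale data on access
            return None
        return result


# Usage with a fake clock so no real waiting is needed.
now = [0.0]
cache = ResultCache(clock=lambda: now[0])
cache.put("word_counts", {"a": 2}, ttl_seconds=60)
fresh = cache.get("word_counts")    # still within the 60 seconds
now[0] = 61.0
expired = cache.get("word_counts")  # past the time-to-live
```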
The Scary Red Chick can perform the following types of tasks:
* fetch - get data from outside of the system.
* map - process the data record by record. Neither the order nor the
amount of data sent to each mapper is guaranteed. The data can be split
in any way before the mapper. One possible future optimization is to
stream the data from the previous task to the mapper.
* reduce - reducers get the data grouped by a key defined by the task
definition.
* filter - process the data record by record and return true or false
for each record. This could be achieved by a mapper, but a dedicated
task type lets us use the same filters either as a standalone task or
as a filter applied before another task is executed.
* split - process the data record by record and return the split id for
each record. This task can be used on its own to split the data
into smaller chunks that can be processed or exported independently.
It can also be used to split the data before a map or a
reduce task.
* store - used to get information out of the system.
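The filter and split tasks described in the list above can each be used on their own or ahead of another task. The Python sketch below shows that idea under stated assumptions: `run_filter`, `run_split`, `run_map` and the `pre_filter` parameter are hypothetical names, not the framework's actual interface.

```python
from collections import defaultdict


def run_filter(records, predicate):
    """Standalone filter task: keep records for which predicate is true."""
    return [r for r in records if predicate(r)]


def run_split(records, split_fn):
    """Standalone split task: bucket records by the split id split_fn returns."""
    chunks = defaultdict(list)
    for r in records:
        chunks[split_fn(r)].append(r)
    return dict(chunks)


def run_map(records, fn, pre_filter=None):
    """Map task that can reuse a filter as a step applied before mapping."""
    if pre_filter is not None:
        records = run_filter(records, pre_filter)
    return [fn(r) for r in records]


def is_large(order):
    """Example filter: true for orders with a total of at least 10."""
    return order["total"] >= 10


orders = [
    {"id": 1, "country": "PT", "total": 40},
    {"id": 2, "country": "US", "total": 5},
    {"id": 3, "country": "PT", "total": 12},
]

large_orders = run_filter(orders, is_large)                          # filter on its own
large_ids = run_map(orders, lambda o: o["id"], pre_filter=is_large)  # same filter before a map
by_country = run_split(orders, lambda o: o["country"])               # chunks keyed by split id
```

The same `is_large` function serves both uses, which is the reason the text gives for having a filter task type separate from map.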
theMage