
Big Data DC #3

Two enterprise big data consulting companies presented the architectures they use for processing and storing data at the third Big Data DC meetup. Much like the first and second meetups, the common thread seemed to be the trade-offs the engineers made to optimize certain aspects over others.

First up was Joey Echeverria, who works for Cloudera, talking about using HBase in the real world. Joey’s presentation covered the basics of Hadoop and then dove into HBase, the database for Hadoop. He talked about the benefits of HBase, including a variable schema in each record and atomicity per row. He then gave a few examples of real-life applications, including Lily, an open source content repository; OpenTSDB, a distributed, scalable time series database from StumbleUpon; and Socorro, the crash report database used by Mozilla. Peruse Joey’s slides for more information on HBase.

Next up, Ted Dunning from MapR spoke about the Hadoop distribution his company sells. Ted described the bottlenecks in Hadoop that they try to solve with the implementation they built. These bottlenecks include read-only files, many copies in the I/O path, a shuffle based on HTTP, and spills that go to local file space. Ted spent a large part of his talk on maprfs, the file system they built to address these bottlenecks.

This meetup had the largest turnout of all the Big Data DC meetups so far. I can’t wait for the fourth meetup.


Big Data DC meetup #2

For the second time, a group of bright and talented developers gathered at Clearspring to discuss Big Data. The first Big Data DC meetup had a great turnout. Rather than write up a summary, I decided to check out Storify and build my first one for the meetup. Once again, all of the talks were great. If you have an interest in big data and the technical ways to work with it, you should check it out.

View “Big Data DC #2” on Storify

What do you think of this format? Is this something you would like to see for future meetups or would you prefer a more traditional summary post?