Thursday, October 1, 2009

LA CloudCamp - Data in the Cloud

We had a very good CloudCamp in LA. It was my first unconference – I was not sure what to expect or what I would learn. But at the end of the day [around 11:45 PM or so], when I was walking to my underground garage on the corner of 5th and Olive, I felt really good about it. It was an on-the-fly brainstorming session with 100-plus technology folks.

The way this camp was conducted was impressive, and the food [please read it as free food] was awesome. They had unlimited beer for those who wanted to climb up into the cloud before getting into any serious conversation…

We focused on the general theme of data in the cloud. I started the conversation with CAP [not CRAP]. C: Consistency, A: Availability and P: Partition tolerance [the system keeps working even when the network splits its nodes apart]. It is mathematically proven that a massively distributed system can guarantee at most two of these three qualities at the same time. If we are building an app for the cloud, this principle is very important, as there is always a network in the middle… It was a discussion that could almost go on for days if not weeks, and we had to wrap it up in half an hour. So the conclusion was twofold – know the limitations, and choose what you want before you build your app. You can have any pair from CAP – viz. CA, AP or CP – but not all three…
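
To make the trade-off a little more concrete, here is a toy sketch of my own [nothing like this was shown at the camp]. It assumes just two replicas and a network link that can be switched off; the names TinyStore, Replica and demo are made up for illustration. In "CP" mode the store refuses writes while the link is down, and in "AP" mode it accepts them and lets the replicas disagree.

```python
class Replica:
    """One copy of the data (hypothetical name, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class TinyStore:
    """Two replicas with a switchable network link between them (toy model)."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False  # True means the replicas cannot talk

    def write(self, replica, key, value):
        """Write via one replica; return True if the write was accepted."""
        other = self.b if replica is self.a else self.a
        if not self.partitioned:
            # Healthy network: update both copies together.
            replica.data[key] = value
            other.data[key] = value
            return True
        if self.mode == "CP":
            # Give up availability: refuse rather than let copies diverge.
            return False
        # AP: stay available, accept locally, and tolerate disagreement.
        replica.data[key] = value
        return True

    def read(self, replica, key):
        return replica.data.get(key)


def demo(mode):
    """Write once, cut the link, write again, then read from both replicas."""
    store = TinyStore(mode)
    store.write(store.a, "balance", 100)
    store.partitioned = True
    accepted = store.write(store.a, "balance", 50)
    return accepted, store.read(store.a, "balance"), store.read(store.b, "balance")


print(demo("CP"))  # (False, 100, 100): consistent, but the write was refused
print(demo("AP"))  # (True, 50, 100): write accepted, replicas now disagree
```

A real system has many more moving parts [quorums, timeouts, repair], so please read this only as a picture of the choice, not as how any actual product does it.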

Then Lynn Langit of Microsoft presented on SQL Azure. It was a very good, full-house presentation. Our small room was packed to its limit, and Lynn did a great job of introducing this RDBMS in the cloud. Yes, we heard all those legitimate concerns one more time – how can you build a real database with 10 GB, and what about replication? But as Lynn said, this is V1 [and we know from our experience that MS gets the right product out with V3]. I learned one important lesson in this session – when there are non-Microsofties around, explain every acronym you use. For them, PDC and RTM are like JAOO to the Microsoft community.

Our final talk was on scaling data in the cloud. From what I understood [which might be way off the mark], there are altogether different alternatives to the RDBMS model: things like Tokyo Cabinet, Hadoop, HBase, CouchDB, etc. [WOW – I remembered all these things]. The point of this discussion was to start thinking outside the box. There are other ways to think about transactions, like BASE [Basically Available, Soft state, Eventual consistency], and ACID is not the only way to achieve consistency. This session was more techno-philosophical. The takeaway, as per our DBA friend – "RDBMS is crap, start thinking about alternatives". In the end, the data structures you want to use depend on the type of application you want to build. Facebook and a banking app are two extreme endpoints on this scale and have their own unique requirements, although both of them deal with large datasets…
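
For the curious, here is a rough sketch of the BASE idea as I understand it [again my own toy example, not something from the talk]. It assumes each write carries a version number and that the newest version simply wins when two nodes compare notes; Node, sync and demo_base are hypothetical names. Real stores use more careful schemes than this.

```python
class Node:
    """A node that accepts writes on its own and reconciles later (toy model)."""

    def __init__(self):
        self.store = {}  # key -> (version, value)

    def put(self, key, value, version):
        # Keep the entry only if it is newer than what we already hold.
        current = self.store.get(key)
        if current is None or version > current[0]:
            self.store[key] = (version, value)

    def get(self, key):
        entry = self.store.get(key)
        return None if entry is None else entry[1]


def sync(x, y):
    """Exchange entries in both directions so each node keeps the newest version."""
    for key, (version, value) in list(x.store.items()):
        y.put(key, value, version)
    for key, (version, value) in list(y.store.items()):
        x.put(key, value, version)


def demo_base():
    n1, n2 = Node(), Node()
    # Two nodes take different writes for the same key without coordinating.
    n1.put("status", "at CloudCamp", version=1)
    n2.put("status", "heading home", version=2)
    before = (n1.get("status"), n2.get("status"))
    sync(n1, n2)  # after the exchange, both nodes agree on the newest value
    after = (n1.get("status"), n2.get("status"))
    return before, after


print(demo_base())
# (('at CloudCamp', 'heading home'), ('heading home', 'heading home'))
```

For a short while the two nodes give different answers, and then they converge. That is probably fine for a status update and probably not fine for a bank balance, which is the Facebook versus banking point above.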

In the end, walking back to the parking lot, I heard this interesting comment – "I went to watch a movie and they told me to act in it. There was no Tom Cruise, Don Box or Larry Ellison in the room, and I ended up presenting the show. Oh well, thank god my wife was not in the room… please pardon me if I said something stupid, you know I was a little bit drunk…"

