This past week was PASS Summit 2012, the annual gathering of database ~~nerds~~ professionals to learn about database management, business intelligence, and upcoming trends. A common theme this year was big data.
Few people could agree on a definition or the implications of _big data_; there is a lot of FUD going around. This is tragically ironic, because big data tools and techniques arose as a _backlash_ to misinformation and silos. The original MapReduce paper (Dean, 2004) describes why it was built: as a way for more Google employees to do more analysis on massive data sets, to help Google build better products.
The goal: Make data-driven decisions.
Data has no value on its own. Storing all your data in Hadoop gives you nothing but a very big bill.
The goal isn’t big data. The goal is better decisions.
To make better decisions, you need data and brains. The benefit comes when talented people analyze data and use that analysis to make things better. Great tools are not a panacea.
Here is a handy guide to help you decide whether your organization can build, use, and benefit from a big data project:
Question | Yes | No |
---|---|---|
Does your organization make data-driven decisions more than 50% of the time? | +100 | -500 |
Does your organization currently make data-driven decisions more than 80% of the time? | +200 | -100 |
Do you have people with statistical and programming skill to analyze data well? | +200 | -200 |
Do you currently let projects fail, and learn from the failure? | +75 | -150 |
Are your colleagues curious and open-minded? | +50 | -200 |
Do you use your existing data as much as possible? | +50 | -50 |
Think of a data set relevant to your business. Can you brainstorm at least 10 uses for it? | +50 | -100 |
Do you understand the design and use of ‘big data’ tools enough to identify marketing vs. reality? | +25 | -400 |
Add up all of your scores:
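If you would rather not do the arithmetic by hand, here is a minimal sketch of the tally in Python. The point values are copied from the table above; the names `SCORECARD` and `total_score` and the shortened question labels are my own illustrative choices. It only computes the total and does not say what any particular total means.

```python
# Each entry: (short label, points for "yes", points for "no").
# Point values come from the table above; the labels are abbreviated.
SCORECARD = [
    ("Data-driven decisions more than 50% of the time", 100, -500),
    ("Data-driven decisions more than 80% of the time", 200, -100),
    ("People with statistical and programming skill", 200, -200),
    ("Projects allowed to fail, and failures learned from", 75, -150),
    ("Curious and open-minded colleagues", 50, -200),
    ("Existing data used as much as possible", 50, -50),
    ("At least 10 uses brainstormed for a relevant data set", 50, -100),
    ("Can tell 'big data' marketing from reality", 25, -400),
]


def total_score(answers):
    """Sum the points for a list of yes/no answers, in table order."""
    if len(answers) != len(SCORECARD):
        raise ValueError(f"expected {len(SCORECARD)} answers, got {len(answers)}")
    return sum(yes if answer else no
               for (_, yes, no), answer in zip(SCORECARD, answers))


if __name__ == "__main__":
    # Hypothetical organization: True means "yes", False means "no".
    example = [True, False, True, True, True, False, True, False]
    print(total_score(example))
```

Note how lopsided the table is: answering yes to everything totals +750, while answering no to everything totals -1700, so a couple of "no" answers can outweigh several "yes" answers.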
Big data is valuable only for certain business problems, in certain organizational cultures, and with certain types of people involved. It is not a panacea.
The real panacea, as always, is having smart people, a curious/honest organizational culture, and a collective desire to do amazing things.
There have been some excellent posts about SQL PASS Summit: more advice than you can shake a stick at. I’m going to focus on 3 topics: picking sessions, endurance, and follow-up.
This will be my 5th PASS Summit. I have experimented with different approaches to picking sessions, and come up with a guide. Time is valuable.
The week of PASS is grueling: 18-hour days of mental and social stimulation are common.
The week of PASS Summit is too much for anyone to absorb fully. So, don’t expect to. A lot can be done in the following weeks.
I hope to see you at PASS.