Dr. David DeWitt recently presented a keynote (video, slides) for PASS Summit 2013 on the new Hekaton query engine. I was impressed by how the new engine design is rooted in basic engineering principles.
Software engineers and IT staff are bound to the economics and practicalities of the computing industry. These trends define what we can reasonably do.
When a CPU is doing work, the job of the rest of the computer is to feed it data and instructions. Reading 1 MB of data from memory is ~800 times faster than reading it sequentially from a disk.
A recent hype has been "in-memory" technology. These products are based on a constraint: RAM is far, far faster than the disk or network.
"In-memory" means "stored in RAM". It's hard to market "stored in RAM" as the new hotness when it's been around for decades.
The price of CPU cycles has dropped dramatically. So has the cost of basic storage and RAM.
You can buy a 10-core server with 1 terabyte of RAM for $50K. That's cheaper than hiring a single developer or DBA. It is now cost effective to fit database workloads entirely into memory.
I can write code that is infinitely fast, has 0 bugs, and is infinitely scalable. How? By removing it.
The best way to make something faster is to have it do less work.
CPU scaling is running out of headroom. Even if Moore's Law isn't ending, it has transformed into something less useful. Single-threaded performance hasn't improved in some time. The current trend is to add cores.
What software and hardware companies have done is add support for parallel and multicore programming. Unfortunately, parallel programming is notoriously difficult, and runs head-first into a painful problem:
As the amount of parallel code increases, the serial part of the code becomes the bottleneck.
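This bottleneck is commonly formalized as Amdahl's law. Here is a minimal sketch of it, not anything taken from the keynote. The function name `amdahl_speedup` and the 95% parallel fraction are assumptions chosen for illustration.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Theoretical speedup under Amdahl's law.

    parallel_fraction: share of the work that can run in parallel (0.0 to 1.0).
    workers: number of cores working on the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part always takes the same time; only the parallel part shrinks.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Illustrative assumption: 95% of the code parallelizes perfectly.
for workers in (1, 2, 4, 8, 16, 1024):
    print(f"{workers:>5} cores -> {amdahl_speedup(0.95, workers):.2f}x")
```

With 5% of the work stuck in serial code, the speedup can never reach 20x no matter how many cores you add.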
"Big Data" is all the rage nowadays. The number and quality of sensors has increased dramatically, and people are putting more of their information online. A few places have truly big data, like Facebook, Google or the NSA.
For most companies, however, the volume of quality data isn't increasing at nearly as rapid a pace. I see this all the time; OLTP databases are growing at a much smaller pace than their associated 'big data' click-streams.
Systems are not upgraded quickly. IT professionals live with a hard truth: change brings risk. For existing systems the benefit of change must outweigh the cost.
Many NoSQL deployments are in new companies or architectures because they don't have to migrate and re-architect an existing (and presumably working) system.
Backwards compatibility is a huge selling point. It reduces risk.
Brilliant ideas don't come from large groups. The most impressive changes come from small groups of dedicated people.
However, most companies have overhead (email, managers, PMs, accounting, etc). It is easy to destroy a team's productivity by adding overhead.
I have been in teams where 3 weeks of design/coding/testing work required 4 months of planning and project approvals.
Overhead drains productive time and morale.
Smart companies realize this and build isolated labs:
Dr. DeWitt's keynote covered how these basic principles contributed to the Hekaton project.
I have hope for the new query engine, but also concerns:
No Developer is an Island
People behave with hidden motivations and flawed reasoning.
Let's look at one of the underlying causes behind why many employees, teams, and companies don't behave the way we want: the principal-agent problem.
"It is difficult to get a man to understand something when his salary depends on his not understanding it" - Upton Sinclair
The principal-agent problem arises when the person/group asking for help (the principal) has different incentives from the person/group offering help (the agent). It is even more common in situations where the agent has more expertise, such as a hired professional or specialist.
The reason is that behavior, both individual and collective, changes to follow incentives. Individual will, morality, ethics, and integrity are all altered by circumstance and incentives. The principal-agent problem is already part of your life:
Going further, you find examples of the principal-agent problem leading to control fraud:
It is so pervasive that economics majors study its theory. I see the principal-agent problem as a force to be fought and minimized. We each have great potential to change our situation and the world around us, especially if we work hard and think carefully.
"Know your enemy and know yourself and you can fight a hundred battles without disaster" - Sun Tzu
To solve a problem you must know your options. To know your options you must know the context.
"The best revenge is to not be like your enemy." - Marcus Aurelius
You can fight the principal-agent problem with data. Since it often involves information asymmetry, having more information is empowering.
A powerful way to use data is comparison shopping. We have options to acquire goods & services, and vote with our wallets. Doing so in an informed way helps you choose a superior product/service and support companies whose practices you like.
Luckily for principals, the amount of data and number of skilled analysts are increasing rapidly. However, the vast majority of valuable data is not public. Imagine if detailed insurance company data were public: customers would rapidly switch to the most customer-friendly companies. Companies seeking to maximize profit will hide critical data. Therefore we must get information from unconventional sources or public institutions.
A brilliant report by Steven Brill found that medical costs are insane. This has been corroborated by many other independent journalists and bloggers. Drug makers make deals to prevent generic (i.e. cheaper) versions of their drugs. It's sometimes cheaper to pay cash. Visitors from other countries are often appalled by our health care system.
It is easy to skew numbers and hide inefficiency when data isn't available. Conversely, it's easy to make systems more efficient when data is available. I'm a big fan of startups that are working on this problem, like Castlight. However, even then the data is not really public.
Data is potentially powerful, and therefore potentially risky. Any individual, group or company with sufficient skill can use data for their own ends. This can be for good reasons, like making health care affordable for the average person, or for nefarious reasons, like figuring out who is more likely to have health issues, so they can be discriminated against (ahem, "priced appropriately").
Any effort to expose, assemble, or publicize data should involve thought about how it can be used for good or evil. Powerful organizations can handle the risk of embarrassing disclosure. Individuals can't.
One of the big changes in the past decade is the rise of social networking. Facebook, Twitter, Tumblr, YouTube and Reddit enable a single person to tell their story and have that story visible to the entire world. Amazon, Yelp, eBay and other sites enable a person to write a review that can influence many future purchases.
Publicity works because of human psychology. We are a social, tribal species. We identify with people who seem like us. We love the underdog. Sympathy is powerful.
I've never met a company or PR agent that is as sympathetic as a normal person. Companies in general, and PR agents in particular, aren't authentic. Much of that is cultural; the language that agents use is framed by their worldview (profit, marketing, damage control, shareholder value, etc). I'm glad; it makes fighting the principal-agent problem with publicity really easy.
Publicity works for a second reason: unequal risk. Bad publicity costs far, far more than good customer service does. This is doubly true if an agent lies, argues with their customers, and denies things they have done.
Let's look at two examples.
Individuals (the principals) in this situation have minimal power. The average person can't afford lawyers to fight police corruption. However, individuals collectively have massive power; they can vote out the politicians running these towns and states. Even the threat to do so (opinion polls) has tremendous clout.
Let's look at what happened:
Let's guess that 1% of the 70K Twitter users were also BA customers, and the average ticket costs around $1000.
Let's guess that each person knows ~150 people. We can guess the impact:
700 People * 150 People in Network * 50% Chance of Mentioning * $1000 ticket price * 20% are BA customers = $10.5 million
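The back-of-the-envelope estimate above can be checked with a few lines of code. This is only a sketch of the guesses stated in the text. The function and parameter names are made up for illustration, and reading the 20% as the share of each person's contacts who are BA customers is an assumption.

```python
def revenue_at_risk(
    viewers: int,
    viewer_customer_share: float,
    network_size: int,
    mention_chance: float,
    contact_customer_share: float,
    ticket_price: float,
) -> float:
    """Rough revenue exposed to one piece of bad publicity."""
    # Viewers who are themselves customers (1% of 70K = 700 in the text).
    upset_customers = viewers * viewer_customer_share
    # Contacts who hear about it from one of those customers.
    contacts_reached = upset_customers * network_size * mention_chance
    # Only some of those contacts are customers with a ticket to spend.
    return contacts_reached * contact_customer_share * ticket_price


estimate = revenue_at_risk(
    viewers=70_000,
    viewer_customer_share=0.01,
    network_size=150,
    mention_chance=0.50,
    contact_customer_share=0.20,
    ticket_price=1000,
)
print(f"Estimated revenue at risk: ${estimate:,.0f}")
```

Every input is a guess, so treat the result as an order of magnitude, not a forecast. Changing any one guess scales the total linearly.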
It's still not an even playing field. But the odds are better.
Businesses care about money. People care about their jobs. Publicity gives a principal leverage against an agent's primary weak spot: their wallets.
While researching this post, I found that most successful publicity campaigns had common elements:
"Never doubt that a small group of thoughtful, committed citizens can change the world. Indeed, it is the only thing that ever has." - Margaret Mead
The third way to fight the principal-agent problem is with other people. Other people have skills, ideas and connections you don't. A small team is far more capable than an individual.
The Internet makes organizing far easier than ever before. Anyone can make a site, and anyone else can find it. It's relatively easy to identify competent professionals and bypass layers of middlemen / bureaucracy.
There are already sites that act as online watering holes and gathering places:
Online gathering places are still in their infancy. Here are some examples of what I can't find:
The tools for Internet-based gathering places are largely mature:
Let's look at another area with a bad principal-agent problem: finance.
In the US, the financial system is huge and corrupt. Here's a partial list of what financial institutions have done over the last 10 years:
The power and influence that exists in finance comes from the collective money of the average person (i.e. principals). It's a horrific example of the principal-agent problem.
Financial institutions are vulnerable to customers moving their money somewhere else. Their influence comes from our checking accounts, credit cards, loans, and 401Ks. Move it, and you end up with better service and less organizational corruption. Win win. One recent effort has been the Move Your Money project.
Let's look at why credit unions are better for customers than banks.
People behave according to incentives. Moral courage is admirable, and rare. Incentives encourage what they measure, and care must be taken to avoid unintended consequences.
Here are my questions to identify incentives when choosing an agent:
When looking at banks and credit unions, incentives are the big difference. Banks are designed to take money from their customers, as profit. The customers aren't owners. In contrast, credit unions have aligned incentives. They're nonprofits. The customers are owners. Plus, customers can vote to fire the board of directors and top executives.
Here are types of organizations with good incentives:
We are all principals and agents. As principals, we want the best service for the price. As agents, we should provide the best service we can.
Since we are more often principals than agents, we should remember tactics to fight the principal-agent problem: