Nathan Hurst’s Blog

Thoughts on Software, Technology, and Startups 

How Much Equity a Technical Cofounder Should Get

Through Hirelite, cofounders often ask me how much equity a technical cofounder should get. The graphic below balances the risks cofounders take with their relative contributions to help answer this question. All assumptions and clarifications are noted after the graphic.

This covers one of the most common situations I encounter: For a pre-funding web startup whose team includes only a non-technical cofounder, how much equity should an incoming technical cofounder get?

Technical cofounders, remember that the number of shares or options doesn't matter, just your percentage ownership (what the chart shows).

Assumptions

For this case, I assume the non-technical cofounder has already contributed significantly to the business and will likely get more equity (in this chart their minimum ownership is 50%). This doesn't have to be the case. Technically inclined people can definitely build something on their own, and then seek a non-technical cofounder, retaining more equity for themselves. Even if the technical cofounder doesn't prototype something first, they can still contribute more to the business and receive more equity (ex: someone who did marketing for 2 years out of undergrad manages to recruit a top 10th percentile programmer with a successful app in the iPhone app store).

I assume both the non-technical and the technical cofounders are compensated equally. I also assume they are either working on their company full time now or will be soon. Compensation is probably between $0 and a ramen salary. This is an area that varies widely and can significantly impact how much equity a technical cofounder receives (if they take more salary). It's beyond the scope of this post, but I'll try to cover it in the future.

I assume the technical cofounder has a reasonably general set of technical skills and can pick up new technical skills quickly. Ex: they're primarily a back-end programmer who can code basic front-end specs and administer cloud hosting.

I assume both cofounders will be diluted equally as more employees and investors get involved.

Finally, I assume this chart may be a little surprising to non-technical cofounders. Why should a technical cofounder get so much of the company? Especially a company based on your vision. Basically, it's because ideas aren't worth much. Execution is what matters, and in most web startups that falls on the technical founder. Software developers are a hot commodity right now, and many of them know it, so they're on the lookout for really stellar business people to partner with, not some run-of-the-mill, me-too idea having, 10k foot synergizer without any concrete sales prospects, marketing ability, or product experience. Sorry for the rant, but in all seriousness, I've seen too many good non-technical founders delay building a product for months because they're too stingy with equity.

Notes

[1] Technical cofounder starts with 50%: start by assuming the technical and non-technical cofounders will contribute similarly to the business and are taking similar risks for the business. Give the non-technical cofounder extra equity for anything "above and beyond" (see final assumption above for more). Also, here's an example calculation: 50 (base equity) - 10 (for working prototype) - 5 (has over 10k users) - 10 (has raised VC) = 25. The technical cofounder gets 25% of the company.

[2] Working prototype (not just wireframes) -10%: If a non-technical cofounder has a working prototype, they've likely assumed some risk already to build the prototype (perhaps by contracting it out). Creating wireframes doesn't require much risk taking, nor does it do much to de-risk the business.

[2a] Has paying customers -10%: If the company already has paying customers, the non-technical cofounder has already eliminated a huge risk for the business; someone wants the product.

[2b] Has over 10k users -5%: If the company doesn't have paying customers, but does have some reasonable user traction (with a nice trajectory), the non-technical cofounder has de-risked the business some, but not as much as having paying customers.

[3] Non-technical cofounder has significant connections or experience -10%: Connections include relationships with key people in the company's target market, social network connections, blog readership, etc. Domain-specific experience within the target market is helpful, as is broader experience in sales, marketing, product development, or business development (probably on the order of 5+ years outside of undergrad, grad, or MBA school).

[4] Non-technical cofounder has raised venture capital before -10%: The non-technical cofounder has done a startup before, and someone has trusted them with a lot of money. It will probably be easier to get money again. If your business doesn't need external capital, tone this number down, but it's still important (maybe -5%) because it conveys a degree of startup savvy.

[5] Non-technical cofounder has had a successful exit before -10%: The biggest predictor of future success is past performance. This could also apply to non-exit situations where the non-technical cofounder started a company that is operating successfully and the non-technical cofounder has chosen to move on, etc.

[6a] Salary upon funding -0%: No extra equity for getting a salary upon funding. It may not be a market salary, but a technical founder will likely get something if they elect to. Why should a technical cofounder take less equity upfront because they may or may not receive a salary later? In this situation a technical cofounder and a non-technical cofounder are taking similar risks; therefore, neither gets any extra equity.

[6b] Non-technical cofounder has idea and vision -0%: This is just part of the job description and included in the non-technical cofounder's initial 50%. Execution is what matters anyway (see final assumption above for more).

[6c] Non-technical cofounder has MBA -0%: (this is my opinion) In the early days of a startup, an MBA doesn't help much. It certainly doesn't hurt, but it shouldn't affect equity.

[7] Non-technical cofounder invests money -5% per $10k: This is an approximation based on a very early stage valuation of $200k. The situation will vary from company to company based on who is involved with the company and what they've accomplished so far. When I think about this, I only include money still unspent when the technical cofounder joins (usually the money was used to build the initial prototype. Sometimes I adjust the equity for the "working prototype" step above).

[8] Lower limit 10%: A non-technical cofounder wants to ensure the technical cofounder has compelling incentives. I consider this the minimum if the technical cofounder is not taking a salary.
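The notes above can be sketched as a small calculator. This is only an illustration of the chart's arithmetic: the function and parameter names are mine, and treating [2a] and [2b] as alternatives (user traction only counts when there are no paying customers) is my reading of note [2b].

```python
def technical_cofounder_equity(
    working_prototype=False,
    paying_customers=False,
    over_10k_users=False,
    connections_or_experience=False,
    raised_vc_before=False,
    successful_exit_before=False,
    unspent_investment_usd=0,
):
    """Return the technical cofounder's percentage ownership per the notes."""
    equity = 50.0  # [1] start from an even split
    if working_prototype:
        equity -= 10  # [2]
    if paying_customers:
        equity -= 10  # [2a]
    elif over_10k_users:
        equity -= 5  # [2b] assumed to apply only without paying customers
    if connections_or_experience:
        equity -= 10  # [3]
    if raised_vc_before:
        equity -= 10  # [4]
    if successful_exit_before:
        equity -= 10  # [5]
    # [6a]-[6c] are worth 0%, so they have no parameters here.
    equity -= 5 * (unspent_investment_usd / 10_000)  # [7] -5% per $10k
    return max(equity, 10.0)  # [8] lower limit


# The example from note [1]: prototype, over 10k users, has raised VC.
print(technical_cofounder_equity(
    working_prototype=True, over_10k_users=True, raised_vc_before=True
))  # 25.0
```

If your business doesn't need external capital, or the early valuation is far from $200k, adjust the constants rather than the structure.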

Self Promotion

Hirelite is having its first web-based "speed interviewing" event on July 27th for companies and startups in NYC looking for software engineers. Check out Hirelite.com if you're interested in participating as a company or software engineer.

Filed under  //   cofounders   equity   startup  

Comments [5]

Easy local repository browsing for Git or Mercurial

In both Git and Mercurial, there's a very easy way to browse your local repository in a web browser that I didn't know about until recently. You should really try these right now. They've already made a huge difference for me.

Browse your local Git repository

git instaweb --httpd webrick # view at http://localhost:1234

If you prefer not to use webrick, you could install lighttpd and then just run git instaweb.

Browse your local Mercurial repository

hg serve # view at http://localhost:8000


Both of these tips came from this forrst post and discussion.

Filed under  //   git   mercurial   versioncontrol  

Comments [4]

Speed Dating for Software Jobs, a web event - Hirelite Blog


Our in-person "speed interviewing" events have worked so well that we're expanding to web-based events. On Tuesday, July 27th, Hirelite will host its first web-based event for software jobs and software engineers in New York City.

Get your webcams and microphones ready for efficient, face-to-face interviews, just like our in-person event but even more convenient. This web-based event will last 2 hours and feature a series of 5-minute interviews with either software engineers or companies. Since you can only get through so many interviews in 2 hours, we're capping attendance at 20 companies and 20 software engineers.


Over the next few months, Hirelite will expand to other cities. If you're interested in Hirelite coming to your city, let us know!

Related Posts:

Results of a Speed Dating Event for Hiring Software Engineers


Comments [0]

Visual Tech Job Board Comparison

It's hard to determine if a tech job board is worth watching (if you're a job seeker) or worth posting to (if you're hiring), so I made a quick visual comparison of job boards in New York City.

I used metrics that were easiest to quantify quickly through examining up to 300 recent local tech job posts on each of these sites, so you should definitely consider metrics other than what I've mentioned here (namely what types of job seekers frequent each job board). A few notable job boards are missing due to technical constraints that I didn't have time to overcome while scraping data (Startuply and Monster load some posts using JavaScript, and TheLadders and LinkedIn require logging in). I've tried to be as objective as possible, but I run Hirelite: Speed Dating for the Hiring Process, so keep that in mind.

A few notes and observations after the graphic...


What each metric means

  • Cost - the cost of a single post. The lifetime of the post varies per site.
  • Headhunter posts - the number of posts originating from recruitment agencies as opposed to from companies that are hiring.
  • Typical company sizes - an estimate of the typical size of companies posting to a job board based on the company name, funding stage, salary/equity balance, and other information contained in the post.
  • 20 most frequent words - the words most often used in job posts at a particular job board. Technical note: I used Lucene (and the StandardAnalyzer) to help with text processing and frequency calculations, so very common words (a, the, ...) are excluded. Additionally, some special characters were omitted from words (see observations for effects).

Observations

Keep in mind that more (and more random) samples would be ideal, but here are some preliminary observations:

First, I noticed that "c" appeared much more than I expected. Companies can't be requesting C skills enough to put it in the 20 most frequent words used on Craigslist and Stack Overflow! Well... maybe Stack Overflow. It turns out the tool I used to process the text (Lucene) cut out special characters, normalizing C++, C#, Objective-C, and C all to C, thus inflating the frequency of "c".
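Here's a minimal sketch of how that collapse happens. It is a regex stand-in I wrote for illustration, not Lucene's StandardAnalyzer; the stop word list and the sample post are made up.

```python
import re
from collections import Counter

# Tiny illustrative stop word list (an assumption, not Lucene's actual list).
STOP_WORDS = {"a", "an", "and", "the", "or", "of", "in"}


def tokenize(text):
    """Lowercase, keep only runs of letters and digits, drop stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [word for word in words if word not in STOP_WORDS]


post = "Experience in C++, C#, Objective-C, or C and the JVM"
counts = Counter(tokenize(post))
print(counts["c"])  # 4
```

Four different languages end up counted as the single token "c" because the "+", "#", and "-" characters are discarded before counting.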

The words "you", "we", and "our" appear very high on 37signals, Craigslist, Hirelite, NextNY, and Stack Overflow, but are much less emphasized on CareerBuilder and Dice. Are the job posts less personal or intimate? Does this matter? From looking at the posts in more detail, it seems to correlate with a greater focus on more specific requirements in posts on CareerBuilder and Dice. Note that the word "years" as in "3 years of Java experience" appears in the top 20 most frequent words of CareerBuilder and Dice; however, it does appear in the top 50 most frequent words of all the job boards surveyed.

I highlighted words in the top 20 most frequent word lists that I thought correlated to technical skills or softer skills to observe the relative importance of each, but I don't see any discernible pattern. Additionally, the meanings of many of these words depend heavily on their context (requirements section vs responsibilities section vs about the company section).

Initially, I thought that posts from larger companies correlated with a higher number of recruiter/confidential posts, but then I got to NextNY where many posts for positions at small to medium sized companies are recruiter/confidential posts. Maybe recruiter/confidential posts will appear in high numbers wherever they're allowed? Hirelite and Stack Overflow have policies against posts where the hiring company is not named, but I don't know of any explicit policy on 37signals (though they have no recruiter/confidential posts). Does anyone know if they have a policy about these posts?

Finally, let me know what you see in the data or if you have other ideas of what to do with this type of data. I'm considering doing some kind of analysis of how typical job post language compares to typical English - I predict probably an inordinate use of "pirate" and "ninja".

Data (including top 50 most frequent words)

37signals
Cost (single post): $400
Headhunter posts: 0%
Typical company sizes: generally medium sized companies or funded small companies
50 most frequent words: we, experience, our, you, web, have, design, team, work, development, business, can, developer, your, who, software, looking, end, ruby, new, rails, project, management, skills, working, us, strong, requirements, about, from, css, well, knowledge, front, things, technologies, jquery, html, systems, php, all, years, use, technology, some, should, projects, javascript, help, has

CareerBuilder
Cost: $419
Headhunter posts: 65%
Typical company sizes: large companies
50 most frequent words: experience, skills, management, business, technology, development, job, work, requirements, our, technical, systems, project, information, knowledge, support, must, years, software, data, have, your, team, ability, required, strong, services, security, robert, half, all, working, email, time, us, opportunity, we, sql, contact, developer, more, new, network, industry, design, you, company, system, server, application

Craigslist
Cost: $25
Headhunter posts: 46%
Typical company sizes: all company sizes
50 most frequent words: experience, our, software, development, you, we, work, skills, have, team, new, design, business, strong, knowledge, management, systems, web, c, your, java, developer, technical, years, requirements, please, ability, working, applications, environment, job, must, programming, project, all, company, data, time, technology, product, sql, looking, candidates, york, client, solutions, plus, services, from, well

Dice
Cost: $459
Headhunter posts: 51%
Typical company sizes: large companies
50 most frequent words: experience, business, skills, development, management, team, work, knowledge, services, technology, systems, project, technical, new, client, years, data, strong, design, java, support, developer, our, you, required, software, information, web, financial, have, description, working, ability, all, solutions, position, application, requirements, sales, applications, company, other, manager, your, must, title, environment, including, york, understanding

Hirelite
Cost: $100
Headhunter posts: 0%
Typical company sizes: seed stage to medium sized companies
50 most frequent words: you, we, our, experience, software, team, web, work, have, your, development, skills, new, looking, design, engineer, from, environment, applications, java, years, strong, technology, get, working, plus, senior, about, us, technologies, developers, who, systems, ruby, javascript, can, business, product, platform, people, like, engineering, building, what, want, understanding, technical, other, developer, company

NextNY
Cost: $0
Headhunter posts: 43%
Typical company sizes: seed stage to medium sized companies
50 most frequent words: experience, you, we, work, have, our, skills, your, team, development, new, web, product, business, working, strong, client, management, media, from, apply, all, clients, project, online, data, software, ability, looking, design, marketing, can, years, technology, us, time, sales, including, high, company, about, requirements, must, technical, services, environment, who, advertising, please, lead

Stack Overflow
Cost: $350
Headhunter posts: 0%
Typical company sizes: generally medium sized companies or funded small companies
50 most frequent words: you, experience, our, we, development, software, work, team, have, c, skills, new, systems, from, web, technology, design, your, knowledge, strong, working, developers, programming, java, developer, looking, applications, environment, years, technical, high, including, code, business, application, management, about, projects, technologies, all, ability, well, requirements, performance, media, engineer, us, science, more, computer

Filed under  //   comparison   data   jobs   visual  

Comments [0]

Ctrl-R Searches History and Other Historical Tricks

These tricks have saved me a lot of time. Many of them I started using after reading this Definitive Guide to Bash Command Line History by fellow Hacker News reader, pkrumins. It includes a much deeper look at history than the quick examples I cover here. 

Search Your History Quickly

No more history | grep ... or hitting the up button 20 times. Just open a command prompt, press Ctrl-R, and begin typing a word in your command. As you type, the most recent command matching what you type will appear. To continue searching backwards in the history, hit Ctrl-R again. Then hit the left or right key to edit the command or hit enter to run it. 

Increase Your History Size

Once you know how to search your history, make sure your commands stick around for a while. By default, the history size is pretty low, usually only 500. To increase your history size, add the following to either ~/.bashrc, ~/.bash_profile, or /etc/profile:

export HISTFILESIZE=1000000000
export HISTSIZE=1000000

Analyze Your History

Once you've built up a sizable history, analyze it to determine possible aliases that will reduce typing time. To see the top 30 most used commands, run:

cut -f1 -d" " ~/.bash_history | sort | uniq -c | sort -nr | head -n 30

To see the top 30 most used commands including arguments, run:

sort ~/.bash_history | uniq -c | sort -nr | head -n 30

Stop Getting Annoyed When You Forget sudo

I used to get so mad when I forgot to sudo a command, especially a long one. Not any more. To repeat a command using sudo, run:

sudo !!

Reuse Arguments

Say you want to back up a file then edit that file. Here's how you can reuse an argument from the most recent command:

cp a-very-long-file-name.txt a-very-long-file-name.txt.bak
vi !^

Also, you can use !!:N for the Nth argument, !!:N-M for the Nth to Mth arguments,  !!:$ for the last argument, or !!:* for all arguments.

Self Promotion

If you're a software engineer with Linux experience in NYC, consider coming to Hirelite: Speed Dating for the Hiring Process tomorrow evening (4/27).

Filed under  //   commands   linux  

Comments [3]

A Delicious Four Years

I realized I made my first Delicious bookmark just over four years ago. Here's a visualization of my links. It's a decent approximation of what I know.


Comments [0]

Thread Synchronization Issues & Romance

Who knew threads and romantic relationships had so much in common? For those of you new to threads, threads (and processes) allow computers to seemingly do multiple things at once, where each thing is a separate "thread" of execution. On a computer with a single processor, the processor spends short amounts of time executing each thread before switching to another thread to execute. On a multiprocessor system, processors execute threads simultaneously, switching between threads when there are more threads than processors.

For those of you new to romantic relationships, I don't have much advice for you other than: Don't tell your significant other that you're treating your relationship as a series of thread synchronization problems!

Also, I'm probably perpetuating some stereotypes here. Sorry, it just makes the examples easier.

Thread Synchronization Issues

Deadlock occurs when threads cannot proceed because they're waiting on each other.

Romantic relationship example:
You and your wife have to wake up at 6am to catch a flight. You half-wake-up at some point in the morning and think, "she'll wake me up," and go back to sleep. The problem is, now it's noon, and you've both been thinking the same thing for six hours. You missed your flight due to relationship deadlock.
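In code, the same situation might look like the sketch below: each thread waits to be woken by the other before it will wake the other. The names are mine, and the timeout exists only so the demo terminates. A real deadlock has no timeout and simply hangs.

```python
import threading


def sleeper(name, my_alarm, partner_alarm, results):
    """Wait to be woken; only after waking up, wake the partner."""
    woken = my_alarm.wait(timeout=0.2)  # "they'll wake me up"
    if woken:
        partner_alarm.set()
    results[name] = woken


husband_alarm = threading.Event()
wife_alarm = threading.Event()
results = {}

threads = [
    threading.Thread(target=sleeper, args=("husband", husband_alarm, wife_alarm, results)),
    threading.Thread(target=sleeper, args=("wife", wife_alarm, husband_alarm, results)),
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(results)  # neither partner was ever woken
```

Nobody sets either event first, so both waits time out and both entries in results are False.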

Livelock occurs when threads cannot proceed because they're too busy responding to each other.

Romantic relationship example:
When was the last time you heard an obnoxious couple talking on the phone? Think back to the end of their phone conversation. It probably ended like this.

1: Love you. Talk to you later.
2: Love you too. Bye.
...
both wait ...
1: You hang up first.
2: No you hang up first.
1: No you...

You're witnessing relationship livelock. Neither person in the couple nor the couple as a whole can proceed because they're too busy responding to each other.

Starvation occurs when one thread is deprived of resources by greedy or mis-prioritized threads.

Romantic relationship example:
You and your boyfriend share a checking account and deposit money into it on the first of the month. You routinely make small purchases every day. Your boyfriend rarely makes purchases, but when he does, he buys something big. After the first of the month, you successfully make your small purchases for a few days, but then your boyfriend buys an iPad. All your attempted purchases are now denied. You're suffering from starvation.

Race conditions occur when success depends on the order in which threads run.

Romantic relationship example:
Your son wants to go bungee jumping with his friends. He knows that each parent requires that he ask both parents for permission. Using a clever turn of phrase, he realizes that he can exploit a relationship race condition to get what he wants by asking the stricter parent first.

Son (approaches strict mother): Can I go bungee jumping?
Mother: No, but ask your father.
Son (approaches lenient father): Can I go bungee jumping?
Father: Yes, but ask your mother.
Son: I already did.
Father: Great. Hope you have fun!
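For the thread version, here's a minimal sketch of the classic "lost update" race, assuming two threads that each read a shared count and then write back the value they read plus one. The barrier is there only to force the unlucky ordering every time so the demo is repeatable. All names are illustrative.

```python
import threading


def run_pair(make_increment):
    """Run two threads that each increment a shared count once."""
    state = {"count": 0}
    increment = make_increment(state)
    threads = [threading.Thread(target=increment) for _ in range(2)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state["count"]


def racy(state):
    both_have_read = threading.Barrier(2)

    def increment():
        seen = state["count"]      # read
        both_have_read.wait()      # both threads read before either writes
        state["count"] = seen + 1  # write back a stale value

    return increment


def locked(state):
    lock = threading.Lock()

    def increment():
        with lock:  # read and write happen as one step
            state["count"] += 1

    return increment


print(run_pair(racy))    # 1: one increment was lost
print(run_pair(locked))  # 2
```

Without the barrier the racy version would usually print 2 and only occasionally lose an update, which is what makes race conditions so hard to track down.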

Thrashing occurs when threads make little or no progress due to the overhead of context switching.

Romantic relationship example: A couple tries to decide whether or not to get a pet. The argument gets heated. They keep bringing up unrelated topics. Each time a new topic comes up, they spend five minutes on it.

1: Having a dog would be so much fun!
2: You would never clean up after it.
1: What?! I clean all the time.
... five minutes later ...
1: Well at least I don't leave clothes all over the place.
2: Psh. I'm the only one that ever does the laundry. I can leave my clothes wherever I want.
... five minutes later ...
2: I don't know if I can talk about this anymore. I'm just going to go watch TV to cool down.
1: You watch TV all the time! We don't even need a pet. You spend all your time with the TV.

Every context switch gets the couple further away from where they started and from the problem they're trying to resolve. They're thrashing.

Busy waiting occurs when one thread continuously checks if it may proceed, robbing other threads of processing time.

Romantic relationship example: A couple is getting dressed for a party. The man is dressed and ready to go. The woman is nowhere near done. The man keeps interrupting the woman to ask her if she's ready yet. The man is busy waiting.
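A rough sketch of the difference between busy waiting and blocking, assuming a worker thread that signals readiness through a threading.Event. The function names and the 50 ms of "getting dressed" are made up for the demo.

```python
import threading
import time


def run_until_ready(wait_strategy):
    """Start a 'getting dressed' thread, then wait for it using wait_strategy."""
    ready = threading.Event()

    def get_dressed():
        time.sleep(0.05)  # stand-in for real work
        ready.set()

    worker = threading.Thread(target=get_dressed)
    worker.start()
    checks = wait_strategy(ready)
    worker.join()
    return checks


def busy_wait(ready):
    checks = 0
    while not ready.is_set():  # "Are you ready yet?" over and over
        checks += 1
    return checks


def blocking_wait(ready):
    ready.wait()  # sleep until signalled, using no CPU in the meantime
    return 1


busy_checks = run_until_ready(busy_wait)
blocking_checks = run_until_ready(blocking_wait)
print(busy_checks, blocking_checks)
```

The busy waiter asks the question as fast as the processor allows, while the blocking waiter asks once and is woken when the answer changes.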

Self Promotion

If you're a developer and you like these sorts of problems, consider attending Hirelite: Speed Dating for the Hiring Process on Tuesday, April 27th in NYC where companies will be looking for great software people.

Filed under  //   relationships   software   threads  

Comments [0]

Black Hat Recruiter Tactics

Since starting Hirelite, where we get companies and software people talking directly, I've heard a lot of horror stories about working with recruiting agencies. When I hear these stories, I can't help but think of black hat vs. white hat hacking and SEO, so I call recruiters who engage in unethical practices "black hat recruiters". Black hat recruiters resort to the tactics below because they're too lazy to confront the real challenges involved in finding and matching good people and good companies. Please note that not all recruiters are bad, and some provide a lot of value, but this post is not about those recruiters. This post is about black hat recruiting where tactics range from lies to ethically gray practices to illegal activity (in approximate order of how common they are):

Posting misleading job descriptions - This is by far the most common form of abuse. Recruiters will post a job description for a legitimate position for a client, but falsify some of the information to entice candidates. For example, a recruiter will inflate the salary/compensation portion of the job description or inflate the job responsibilities while dumbing down the job requirements.

Posting bait-and-switch job descriptions - Black hat recruiters will advertise a job that does not exist or is already filled just to receive resumes from job seekers that they can contact about other job opportunities. This is very similar to a tactic that black hat apartment brokers use (mentioned in Rent Hop's comparison of headhunters and apartment brokers).

Surreptitiously modifying a job seeker's resume - Black hat recruiters often request a resume in a format they can modify. They will make modifications to job seekers' resumes without telling job seekers and then give the modified resume to their clients. Modifications range from obscuring contact information so that the recruiter is always in the loop to more liberal modifications like inflating experience and skills. Nothing's worse than getting to an interview and finding out that you know COBOL from the hiring manager reading it off your resume.

Approaching other companies job seekers interview with - Recruiters often ask job seekers what other companies they are interviewing with under the guise of tailoring their search to the job seeker. Some recruiters will go as far as to ask who specifically the job seeker is in contact with. Armed with that information, a recruiter will contact the other companies and try to send competing job seekers. I've spoken to one job seeker who suspected this was happening and caught their recruiter in the act. This job seeker told the recruiter a friend's name and had the friend wait for the recruiter's call. The friend didn't have to wait long. Only 10 minutes after the initial call ended, the recruiter called the job seeker's friend. The recruiter denied everything.

Cold calling and pressuring low level employees - Black hat recruiters will call low level employees at a company and threaten termination and legal repercussions unless the employee passes the recruiter along to a hiring manager at the company.

Buying resumes from hiring companies - Black hat recruiters will give discounts to companies that will pass all the resumes for a particular position along to the recruiter. These resumes could be from other recruiters or from candidates who contacted a company directly.

Pressuring job seekers into interviews - Black hat recruiters will pressure job seekers into interviews that they don't want to go on. Sure, job seekers should stand up to them and say, "I don't want that job," but when a recruiter responds, "I'm not going to put you in front of <company> unless you go to this interview," job seekers may give in.

Promising exclusivity to job seekers - Black hat recruiters will promise a job seeker that they will not submit other job seekers for the same position as long as the job seeker agrees not to talk to any other recruiters. The recruiter then submits multiple competing job seekers for a position. If one is rejected, the recruiter tells that job seeker that the company decided there wasn't a fit and continues to send them to other companies.

Recruiting the references of a job seeker - Black hat recruiters request references from job seekers and recruit those references. Later, job seekers hear from their references that their recruiter pressured them for resumes to send to clients, sometimes for the exact job the original job seeker was up for!

Faking a relationship - Black hat recruiters will hear that Dunder Mifflin, a company they have no relationship with, is hiring. Instead of approaching Dunder Mifflin about working for them, the recruiter will solicit resumes from potential job seekers for exciting new openings at Dunder Mifflin. The recruiter will then approach Dunder Mifflin with the resumes they have. If Dunder Mifflin rejects the recruiter, the recruiter will tell the job seekers that Dunder Mifflin said there wasn't a fit for them.

Discrediting an employee's current company - Black hat recruiters will contact an employed potential candidate and tell them that their current company is in a precarious financial state and offer to find the employee another job. Black hat recruiters will even do this to employees of their own clients.

Simulating expiring offers - When a company sends an offer to a job seeker, black hat recruiters will tell the job seeker that they only have X days (where X is usually 1 or 2) to accept the offer; otherwise, it will be rescinded. This practice is a bit rarer because job seekers and companies know each other's contact information by this point, but I've heard of this happening to at least one company and one job seeker (separate events).

Sending false offer letters - Black hat recruiters will send out fake offer letters to job seekers for companies they're having trouble getting interviews for. Black hat recruiters rely on job seekers requesting to interview with the company before accepting the offer. The recruiter then arranges an interview with the company. If the company likes the job seeker, the recruiter makes sure to process and negotiate the offer, sometimes issuing a "revised" offer to the job seeker. If there is not a fit for the job seeker at the company, the recruiter is no worse off than when they started, and they just drop all contact with the job seeker.


If you're thinking that any of these practices might work for you, think again. Seriously. They may work in the short term, but you will do irreparable harm to your reputation, the reputation of job seekers, and the reputation of companies you represent in addition to possibly opening yourself up to legal problems.

If you're a company or a software engineer who's tired of dealing with these tactics, check out Hirelite: Speed Dating for the Hiring Process. We're currently only in NYC, but hope to expand soon.

Got any more horror stories? Leave them in the comments.

Filed under  //   engineering   hiring   jobs   recruiters   software  

Comments [7]

Results of a Speed Dating Event for Hiring Software Engineers

On Tuesday, I ran Hirelite's first event: Speed Dating for the Hiring Process for software engineers. Read about how it went here: Results of a Speed Dating Event for Hiring Software Engineers.

Filed under  //   engineering   hiring   interviewing   software  

Comments [1]

Visual Guide to NoSQL Systems

There are so many NoSQL systems these days that it's hard to get a quick overview of the major trade-offs involved when evaluating relational and non-relational systems in non-single-server environments. I've developed this visual primer with quite a lot of help (see credits at the end), and it's still a work in progress, so let me know if you see anything misplaced or missing, and I'll fix it.

Without further ado, here's what you came here for (and further explanation after the visual).

Note: RDBMSs (MySQL, Postgres, etc) are only featured here for comparison purposes. Also, some of these systems can vary their features by configuration (I use the default configuration here, but will try to delve into others later).

As you can see, there are three primary concerns you must balance when choosing a data management system: consistency, availability, and partition tolerance.

  • Consistency means that each client always has the same view of the data.
  • Availability means that all clients can always read and write.
  • Partition tolerance means that the system works well across physical network partitions.

According to the CAP Theorem, you can only pick two. So how does this all relate to NoSQL systems?

One of the primary goals of NoSQL systems is to bolster horizontal scalability. To scale horizontally, you need strong network partition tolerance which requires giving up either consistency or availability. NoSQL systems typically accomplish this by relaxing relational abilities and/or loosening transactional semantics.
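To make the trade-off concrete, here's a toy two-replica model. It is a deliberately simplified sketch of my own, not how any real system is implemented: during a partition, the "CP" mode refuses writes to stay consistent, while the "AP" mode accepts them and lets the replicas diverge.

```python
class Unavailable(Exception):
    """Raised when a node refuses a request in order to stay consistent."""


class TwoNodeCluster:
    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP"
        self.nodes = [{}, {}]     # two replicas of the same data
        self.partitioned = False  # True when the replicas cannot talk

    def write(self, node, key, value):
        if not self.partitioned:
            for replica in self.nodes:  # healthy network: update both
                replica[key] = value
        elif self.mode == "CP":
            raise Unavailable("cannot reach the other replica")
        else:
            self.nodes[node][key] = value  # AP: accept locally, reconcile later

    def read(self, node, key):
        return self.nodes[node].get(key)


cp = TwoNodeCluster("CP")
cp.write(0, "x", 1)
cp.partitioned = True
try:
    cp.write(0, "x", 2)
    cp_write_succeeded = True
except Unavailable:
    cp_write_succeeded = False  # consistent, but not available

ap = TwoNodeCluster("AP")
ap.write(0, "x", 1)
ap.partitioned = True
ap.write(0, "x", 2)  # available, but the replicas now disagree
print(cp_write_succeeded, ap.read(0, "x"), ap.read(1, "x"))  # False 2 1
```

Real AP systems add a reconciliation step once the partition heals, which is where "eventual consistency" comes from.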

In addition to CAP configurations, another significant way data management systems vary is by the data model they use: relational, key-value, column-oriented, or document-oriented (there are others, but these are the main ones).

  • Relational systems are the databases we've been using for a while now. RDBMSs and systems that support ACIDity and joins are considered relational.
  • Key-value systems basically support get, put, and delete operations based on a primary key.
  • Column-oriented systems still use tables but have no joins (joins must be handled within your application). Obviously, they store data by column as opposed to traditional row-oriented databases. This makes aggregations much easier.
  • Document-oriented systems store structured "documents" such as JSON or XML but have no joins (joins must be handled within your application). It's very easy to map data from object-oriented software to these systems.
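The non-relational models above can be sketched with plain Python data structures. This is only an illustration of the shapes involved, not any particular system's API. The class name and sample data are made up.

```python
class KeyValueStore:
    """Key-value model: get, put, and delete by primary key."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        self._data.pop(key, None)


store = KeyValueStore()
store.put("user:1", "Ann")
store.delete("user:1")

# Row-oriented layout: one record per row.
rows = [
    {"id": 1, "city": "NYC", "sales": 10},
    {"id": 2, "city": "SF", "sales": 20},
    {"id": 3, "city": "NYC", "sales": 30},
]

# Column-oriented layout: one list per column, so an aggregation
# touches only the column it needs.
columns = {name: [row[name] for row in rows] for name in rows[0]}
total_sales = sum(columns["sales"])

# Document model: no joins in the store, so the application
# stitches related documents together itself.
users = {"u1": {"name": "Ann"}}
orders = [{"user_id": "u1", "item": "book"}]
joined = [{**order, "user": users[order["user_id"]]} for order in orders]

print(total_sales)  # 60
print(joined[0]["user"]["name"])  # Ann
```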

Now for the particulars of each CAP configuration and the systems that use each configuration:

Consistent, Available (CA) Systems have trouble with partitions and typically deal with them through replication. Examples of CA systems include:

  • Traditional RDBMSs like Postgres, MySQL, etc (relational)
  • Vertica (column-oriented)
  • Aster Data (relational)
  • Greenplum (relational)

Consistent, Partition-Tolerant (CP) Systems have trouble with availability while keeping data consistent across partitioned nodes. Examples of CP systems include:

Available, Partition-Tolerant (AP) Systems achieve "eventual consistency" through replication and verification. Examples of AP systems include:

Self promotion and Credits

  • If you're a developer and looking for a job or if you're hiring developers and these data systems are important to you, consider coming to Hirelite: Speed Dating for the Hiring Process on Tuesday in NYC.
  • This guide draws heavily from a recent Ruby meetup (by Matthew Jording and Michael Bryzek) and a recent MongoDB presentation (given by Dwight Merriman).
  • Thanks to DBNess and ansonism for their help with validating system categorizations.
  • Thanks to those who helped shape the post after it was written: Stan, Dwight, and others who commented here and on this Hacker News thread.

Update: Here's a print version of the Visual Guide To NoSQL Systems if you need one quickly (warning: it's not all that pretty and I may not keep it updated, but as of 3/17/2010, it's current).

Filed under  //   cap   comparison   data   database   guide   nosql   visual  

Comments [49]