Easy local repository browsing for Git or Mercurial

In both Git and Mercurial, there's a very easy way to browse your local repository in a web browser that I didn't know about until recently. You should really try these right now. They've already made a huge difference for me.

Browse your local Git repository

git instaweb --httpd webrick # view at http://localhost:1234

If you prefer not to use webrick, you could install lighttpd and then just run git instaweb.

Browse your local Mercurial repository

hg serve # view at http://localhost:8000
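
If you bounce between Git and Mercurial repositories, the two commands above can be wrapped in one small shell function. This is only a sketch: repo_type and browse_repo are names I made up, the check looks only at the directory you pass in (not its parents), and the Git branch assumes webrick is installed.

```shell
# Hypothetical helper: report which kind of repository lives in a directory.
# Only the directory itself is checked, not its parents.
repo_type() {
  dir="${1:-.}"
  if [ -e "$dir/.git" ]; then
    echo git
  elif [ -d "$dir/.hg" ]; then
    echo hg
  else
    echo none
  fi
}

# Hypothetical helper: start the matching local web viewer.
browse_repo() {
  dir="${1:-.}"
  case "$(repo_type "$dir")" in
    git) (cd "$dir" && git instaweb --httpd=webrick --port=1234) ;;  # http://localhost:1234
    hg)  (cd "$dir" && hg serve --port 8000) ;;                      # http://localhost:8000
    *)   echo "no Git or Mercurial repository in $dir" >&2; return 1 ;;
  esac
}
```

Run browse_repo from the top of a repository. When you're done, git instaweb --stop shuts the Git viewer down, and Ctrl-C stops hg serve.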


Both of these tips came from this Forrst post and discussion.

Speed Dating for Software Jobs, a web event - Hirelite Blog

Our in-person "speed interviewing" events have worked so well that we're expanding to web-based events. On Tuesday, July 27th, Hirelite will host its first web-based event for software jobs and software engineers in New York City.

Get your webcams and microphones ready for efficient, face-to-face interviews, just like our in-person event but even more convenient. This web-based event will last 2 hours and feature a series of 5-minute interviews with either software engineers or companies. Since you can only get through so many interviews in 2 hours, we're capping attendance at 20 companies and 20 software engineers.

Over the next few months, Hirelite will expand to other cities. If you're interested in Hirelite coming to your city, let us know!

Visual Tech Job Board Comparison

It's hard to determine if a tech job board is worth watching (if you're a job seeker) or worth posting to (if you're hiring), so I made a quick visual comparison of job boards in New York City.

I chose the metrics that were easiest to quantify quickly by examining up to 300 recent local tech job posts on each of these sites, so you should definitely weigh factors beyond the ones covered here (notably which types of job seekers frequent each job board). A few notable job boards are missing due to technical constraints that I didn't have time to overcome while scraping data (Startuply and Monster load some posts using JavaScript, and TheLadders and LinkedIn require logging in). I've tried to be as objective as possible, but I run Hirelite: Speed Dating for the Hiring Process, so keep that in mind.

A few notes and observations after the graphic...


What each metric means

  • Cost - the cost of a single post. The lifetime of a post varies by site.
  • Headhunter posts - the number of posts originating from recruitment agencies as opposed to from companies that are hiring.
  • Typical company sizes - an estimate of the typical size of companies posting to a job board based on the company name, funding stage, salary/equity balance, and other information contained in the post.
  • 20 most frequent words - the words most often used in job posts at a particular job board. Technical note: I used Lucene (and the StandardAnalyzer) to help with text processing and frequency calculations, so very common words (a, the, ...) are excluded. Additionally, some special characters were omitted from words (see observations for effects).

Observations

Keep in mind that more (and more random) samples would be ideal, but here are some preliminary observations:

First, I noticed that "c" appeared much more often than I expected. Companies can't be requesting C skills often enough to put it in the 20 most frequent words used on Craigslist and Stack Overflow! Well... maybe Stack Overflow. It turns out that the tool I used to process the text (Lucene) cut out special characters, normalizing C++, C#, Objective-C, and C all to "c", thus inflating the frequency of "c".
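
To see the effect for yourself, here's a rough stand-in for that tokenization using standard Unix tools. It is not Lucene and makes no attempt to match the StandardAnalyzer exactly; it just lowercases the text, splits on anything that isn't a letter or digit, and counts the tokens. The name word_freq is made up for this example.

```shell
# Rough stand-in for the tokenizer (not Lucene): lowercase the input,
# turn every run of non-alphanumeric characters into a line break,
# drop empty lines, then count tokens, most frequent first.
word_freq() {
  tr '[:upper:]' '[:lower:]' |
    tr -cs '[:alnum:]' '\n' |
    sed '/^$/d' |
    sort | uniq -c | sort -k1,1nr -k2
}

# All four language names collapse into the single token "c".
printf 'C++, C#, Objective-C, and C\n' | word_freq
```

In that sample line, "c" comes out on top with a count of 4, even though plain C is mentioned only once.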

The words "you", "we", and "our" rank very high on 37signals, Craigslist, Hirelite, NextNY, and Stack Overflow, but are much less prominent on CareerBuilder and Dice. Are the job posts there less personal or intimate? Does this matter? From looking at the posts in more detail, it seems to correlate with a greater focus on specific requirements in posts on CareerBuilder and Dice. Note that the word "years", as in "3 years of Java experience", appears in the top 20 most frequent words only on CareerBuilder and Dice; however, it does appear in the top 50 most frequent words of all the job boards surveyed.

I highlighted words in the top 20 most frequent word lists that I thought correlated to technical skills or softer skills to observe the relative importance of each, but I don't see any discernible pattern. Additionally, the meanings of many of these words depend highly on their context (requirements section vs responsibilities section vs about the company section).

Initially, I thought that posts from larger companies correlated with a higher number of recruiter/confidential posts, but then I got to NextNY where many posts for positions at small to medium sized companies are recruiter/confidential posts. Maybe recruiter/confidential posts will appear in high numbers wherever they're allowed? Hirelite and Stack Overflow have policies against posts where the hiring company is not named, but I don't know of any explicit policy on 37signals (though they have no recruiter/confidential posts). Does anyone know if they have a policy about these posts?

Finally, let me know what you see in the data or if you have other ideas of what to do with this type of data. I'm considering doing some kind of analysis of how typical job post language compares to typical English - I predict probably an inordinate use of "pirate" and "ninja".

Data (including top 50 most frequent words)

37signals
Cost (single post): $400
Headhunter posts: 0%
Typical company sizes: generally medium sized companies or funded small companies
50 most frequent words: we, experience, our, you, web, have, design, team, work, development, business, can, developer, your, who, software, looking, end, ruby, new, rails, project, management, skills, working, us, strong, requirements, about, from, css, well, knowledge, front, things, technologies, jquery, html, systems, php, all, years, use, technology, some, should, projects, javascript, help, has

CareerBuilder
Cost: $419
Headhunter posts: 65%
Typical company sizes: large companies
50 most frequent words: experience, skills, management, business, technology, development, job, work, requirements, our, technical, systems, project, information, knowledge, support, must, years, software, data, have, your, team, ability, required, strong, services, security, robert, half, all, working, email, time, us, opportunity, we, sql, contact, developer, more, new, network, industry, design, you, company, system, server, application

Craigslist
Cost: $25
Headhunter posts: 46%
Typical company sizes: all company sizes
50 most frequent words: experience, our, software, development, you, we, work, skills, have, team, new, design, business, strong, knowledge, management, systems, web, c, your, java, developer, technical, years, requirements, please, ability, working, applications, environment, job, must, programming, project, all, company, data, time, technology, product, sql, looking, candidates, york, client, solutions, plus, services, from, well

Dice
Cost: $459
Headhunter posts: 51%
Typical company sizes: large companies
50 most frequent words: experience, business, skills, development, management, team, work, knowledge, services, technology, systems, project, technical, new, client, years, data, strong, design, java, support, developer, our, you, required, software, information, web, financial, have, description, working, ability, all, solutions, position, application, requirements, sales, applications, company, other, manager, your, must, title, environment, including, york, understanding

Hirelite
Cost: $100
Headhunter posts: 0%
Typical company sizes: seed stage to medium sized companies
50 most frequent words: you, we, our, experience, software, team, web, work, have, your, development, skills, new, looking, design, engineer, from, environment, applications, java, years, strong, technology, get, working, plus, senior, about, us, technologies, developers, who, systems, ruby, javascript, can, business, product, platform, people, like, engineering, building, what, want, understanding, technical, other, developer, company

NextNY
Cost: $0
Headhunter posts: 43%
Typical company sizes: seed stage to medium sized companies
50 most frequent words: experience, you, we, work, have, our, skills, your, team, development, new, web, product, business, working, strong, client, management, media, from, apply, all, clients, project, online, data, software, ability, looking, design, marketing, can, years, technology, us, time, sales, including, high, company, about, requirements, must, technical, services, environment, who, advertising, please, lead

Stack Overflow
Cost: $350
Headhunter posts: 0%
Typical company sizes: generally medium sized companies or funded small companies
50 most frequent words: you, experience, our, we, development, software, work, team, have, c, skills, new, systems, from, web, technology, design, your, knowledge, strong, working, developers, programming, java, developer, looking, applications, environment, years, technical, high, including, code, business, application, management, about, projects, technologies, all, ability, well, requirements, performance, media, engineer, us, science, more, computer

Ctrl-R Searches History and Other Historical Tricks

These tricks have saved me a lot of time. Many of them I started using after reading this Definitive Guide to Bash Command Line History by fellow Hacker News reader, pkrumins. It includes a much deeper look at history than the quick examples I cover here. 

Search Your History Quickly

No more history | grep ... or hitting the up button 20 times. Just open a command prompt, press Ctrl-R, and begin typing a word in your command. As you type, the most recent command matching what you type will appear. To continue searching backwards in the history, hit Ctrl-R again. Then hit the left or right key to edit the command or hit enter to run it. 

Increase Your History Size

Once you know how to search your history, make sure your commands stick around for a while. By default, the history size is pretty low, usually only 500. To increase your history size, add the following to either ~/.bashrc, ~/.bash_profile, or /etc/profile:

export HISTFILESIZE=1000000000
export HISTSIZE=1000000

Analyze Your History

Once you've built up a sizable history, analyze it to determine possible aliases that will reduce typing time. To see the top 30 most used commands, run:

cut -f1 -d" " ~/.bash_history | sort | uniq -c | sort -nr | head -n 30

To see the top 30 most used commands including arguments, run:

sort ~/.bash_history | uniq -c | sort -nr | head -n 30
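
If you run these often, a couple of small functions save retyping the pipelines. This is a minimal sketch; the function names, argument order, and defaults are my own choices, not anything standard.

```shell
# Hypothetical helpers wrapping the two pipelines.
# $1 = number of rows to show (default 30)
# $2 = history file (default ~/.bash_history)

# Most used commands, ignoring arguments.
top_commands() {
  cut -d' ' -f1 "${2:-$HOME/.bash_history}" | sort | uniq -c | sort -nr | head -n "${1:-30}"
}

# Most used full command lines, including arguments.
top_command_lines() {
  sort "${2:-$HOME/.bash_history}" | uniq -c | sort -nr | head -n "${1:-30}"
}
```

For example, top_commands 10 shows your ten most used commands, and anything near the top of top_command_lines 10 is a good candidate for an alias.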

Stop Getting Annoyed When You Forget sudo

I used to get so mad when I forgot to sudo a command, especially a long one. Not any more. To repeat a command using sudo, run:

sudo !!

Reuse Arguments

Say you want to back up a file and then edit that file. Here's how you can reuse an argument from the most recent command:

cp a-very-long-file-name.txt a-very-long-file-name.txt.bak
vi !^

Also, you can use !!:N for the Nth argument, !!:N-M for the Nth to Mth arguments, !!:$ for the last argument, or !!:* for all arguments.

Self Promotion

If you're a software engineer with Linux experience in NYC, consider coming to Hirelite: Speed Dating for the Hiring Process tomorrow evening (4/27).

A Delicious Four Years

I realized I made my first Delicious bookmark just over four years ago. Here's a visualization of my links. It's a decent approximation of what I know.

 

Update: I switched to pinboard.in after all the Delicious shutdown controversy.

Thread Synchronization Issues & Romance

Who knew threads and romantic relationships had so much in common? For those of you new to threads, threads (and processes) allow computers to seemingly do multiple things at once, where each thing is a separate "thread" of execution. On a computer with a single processor, the processor spends short amounts of time executing each thread before switching to another thread to execute. On a multiprocessor system, processors execute threads simultaneously, switching between threads when there are more threads than processors.

For those of you new to romantic relationships, I don't have much advice for you other than: Don't tell your significant other that you're treating your relationship as a series of thread synchronization problems!

Also, I'm probably perpetuating some stereotypes here. Sorry, it just makes the examples easier.

Thread Synchronization Issues

Deadlock occurs when threads cannot proceed because they're waiting on each other.

Romantic relationship example:
You and your wife have to wake up at 6am to catch a flight. You half-wake-up at some point in the morning and think, "she'll wake me up," and go back to sleep. The problem is, now it's noon, and you've both been thinking the same thing for six hours. You missed your flight due to relationship deadlock.

Livelock occurs when threads cannot proceed because they're too busy responding to each other.

Romantic relationship example:
When was the last time you heard an obnoxious couple talking on the phone? Think back to the end of their phone conversation. It probably ended like this.

1: Love you. Talk to you later.
2: Love you too. Bye.
... both wait ...
1: You hang up first.
2: No you hang up first.
1: No you...

You're witnessing relationship livelock. Neither person in the couple nor the couple as a whole can proceed because they're too busy responding to each other.

Starvation occurs when one thread is deprived of resources by greedy or mis-prioritized threads.

Romantic relationship example:
You and your boyfriend share a checking account and deposit money into it on the first of the month. You routinely make small purchases every day. Your boyfriend rarely makes purchases, but when he does, he buys something big. After the first of the month, you successfully make your small purchases for a few days, but then your boyfriend buys an iPad. All your attempted purchases are now denied. You're suffering from starvation.

Race conditions occur when success depends on the order in which threads run.

Romantic relationship example:
Your son wants to go bungee jumping with his friends. He knows that each parent requires that he ask both parents for permission. He realizes that, with a clever turn of phrase, he can exploit a relationship race condition to get what he wants by asking the stricter parent first.

Son (approaches strict mother): Can I go bungee jumping?
Mother: No, but ask your father.
Son (approaches lenient father): Can I go bungee jumping?
Father: Yes, but ask your mother.
Son: I already did.
Father: Great. Hope you have fun!
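
In code, the classic version of this is a lost update: two threads each read a shared counter and write back the value they read plus one. The sketch below doesn't use real threads; it replays a fixed schedule of steps so the outcome is repeatable. The name run_schedule and the step names are invented for illustration.

```shell
# Simulate two "threads" (a and b). Each one reads a shared counter and
# later writes back the value it read plus one. The arguments fix the
# order in which those steps happen.
run_schedule() {
  counter=0
  seen_a=0
  seen_b=0
  for step in "$@"; do
    case "$step" in
      read_a)  seen_a=$counter ;;
      read_b)  seen_b=$counter ;;
      write_a) counter=$((seen_a + 1)) ;;
      write_b) counter=$((seen_b + 1)) ;;
    esac
  done
  echo "$counter"
}

# Thread a finishes before thread b starts: both increments count.
run_schedule read_a write_a read_b write_b   # prints 2

# Both threads read before either writes: one increment is lost.
run_schedule read_a read_b write_a write_b   # prints 1
```

The same four steps give different answers depending only on their order, which is exactly what the son is counting on.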

Thrashing occurs when threads make little or no progress due to the overhead of context switching.

Romantic relationship example: A couple tries to decide whether or not to get a pet. The argument gets heated. They keep bringing up unrelated topics. Each time a new topic comes up, they spend five minutes on it.

1: Having a dog would be so much fun!
2: You would never clean up after it.
1: What?! I clean all the time.
... five minutes later ...
1: Well at least I don't leave clothes all over the place.
2: Psh. I'm the only one that ever does the laundry. I can leave my clothes wherever I want.
... five minutes later ...
2: I don't know if I can talk about this anymore. I'm just going to go watch TV to cool down.
1: You watch TV all the time! We don't even need a pet. You spend all your time with the TV.

Every context switch gets the couple further away from where they started and from the problem they're trying to resolve. They're thrashing.

Busy waiting occurs when one thread continuously checks if it may proceed, robbing other threads of processing time.

Romantic relationship example: A couple is getting dressed for a party. The man is dressed and ready to go. The woman is nowhere near done. The man keeps interrupting the woman to ask her if she's ready yet. The man is busy waiting.
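
Here's a rough shell sketch of the difference between busy waiting and simply blocking until the other party is done. The one-second background job stands in for the woman getting ready; the function names and the flag file are made up for the example.

```shell
# Busy waiting: check the flag in a tight loop ("are you ready yet?")
# and count how many times we asked.
busy_wait() {
  dir="$(mktemp -d)"
  ( sleep 1; touch "$dir/ready" ) &
  checks=0
  while [ ! -e "$dir/ready" ]; do
    checks=$((checks + 1))
  done
  wait
  rm -rf "$dir"
  echo "$checks"
}

# Blocking wait: let the shell sleep until the background job exits,
# then look at the flag exactly once.
blocking_wait() {
  dir="$(mktemp -d)"
  ( sleep 1; touch "$dir/ready" ) &
  wait "$!"
  [ -e "$dir/ready" ] && echo "ready"
  rm -rf "$dir"
}
```

Both functions take about a second, but busy_wait spends that second asking the same question over and over (the number it prints is how many times), while blocking_wait uses almost no CPU.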

Self Promotion

If you're a developer and you like these sorts of problems, consider attending Hirelite: Speed Dating for the Hiring Process on Tuesday, April 27th in NYC where companies will be looking for great software people.

Black Hat Recruiter Tactics

Since starting Hirelite, where we get companies and software people talking directly, I've heard a lot of horror stories about working with recruiting agencies. When I hear these stories, I can't help but think of black hat vs. white hat hacking and SEO, so I call recruiters who engage in unethical practices "black hat recruiters". Black hat recruiters resort to the tactics below because they're too lazy to confront the real challenges involved in finding and matching good people and good companies. Please note that not all recruiters are bad, and some provide a lot of value, but this post is not about those recruiters. This post is about black hat recruiting where tactics range from lies to ethically gray practices to illegal activity (in approximate order of how common they are):

Posting misleading job descriptions - This is by far the most common form of abuse. Recruiters will post a job description for a legitimate position for a client, but falsify some of the information to entice candidates. For example, a recruiter will inflate the salary/compensation portion of the job description or inflate the job responsibilities while dumbing down the job requirements.

Posting bait-and-switch job descriptions - Black hat recruiters will advertise a job that does not exist or is already filled just to receive resumes from job seekers that they can contact about other job opportunities. This is very similar to a tactic that black hat apartment brokers use (mentioned in Rent Hop's comparison of headhunters and apartment brokers).

Surreptitiously modifying a job seeker's resume - Black hat recruiters often request a resume in a format they can modify. They will make modifications to job seekers' resumes without telling the job seekers and then give the modified resumes to their clients. Modifications range from obscuring contact information so that the recruiter is always in the loop to more liberal modifications like inflating experience and skills. Nothing's worse than getting to an interview and finding out that you know COBOL from the hiring manager reading it off your resume.

Approaching other companies job seekers interview with - Recruiters often ask job seekers what other companies they are interviewing with under the guise of tailoring their search to the job seeker. Some recruiters will go as far as to ask who specifically the job seeker is in contact with. Armed with that information, a recruiter will contact the other companies and try to send competing job seekers. I've spoken to one job seeker who suspected this was happening and caught their recruiter in the act. This job seeker told the recruiter a friend's name and had the friend wait for the recruiter's call. The friend didn't have to wait long. Only 10 minutes after the initial call ended, the recruiter called the job seeker's friend. The recruiter denied everything.

Cold calling and pressuring low level employees - Black hat recruiters will call low level employees at a company and threaten termination and legal repercussions unless the employee passes the recruiter along to a hiring manager at the company.

Buying resumes from hiring companies - Black hat recruiters will give discounts to companies that will pass all the resumes for a particular position along to the recruiter. These resumes could be from other recruiters or from candidates who contacted a company directly.

Pressuring job seekers into interviews - Black hat recruiters will pressure job seekers into interviews that they don't want to go on. Sure, job seekers should stand up to them and say, "I don't want that job," but when a recruiter responds, "I'm not going to put you in front of <company> unless you go to this interview," job seekers may give in.

Promising exclusivity to job seekers - Black hat recruiters will promise a job seeker that they will not submit other job seekers for the same position as long as the job seeker agrees not to talk to any other recruiters. The recruiter then submits multiple competing job seekers for the position. If one is rejected, the recruiter tells that job seeker that the company decided there wasn't a fit and continues to send them to other companies.

Recruiting the references of a job seeker - Black hat recruiters request references from job seekers and recruit those references. Later, job seekers hear from their references that their recruiter pressured them for resumes to send to clients, sometimes for the exact job the original job seeker was up for!

Faking a relationship - Black hat recruiters will hear that Dunder Mifflin, a company they have no relationship with, is hiring. Instead of approaching Dunder Mifflin about working for them, the recruiter will solicit resumes from potential job seekers for exciting new openings at Dunder Mifflin. The recruiter will then approach Dunder Mifflin with the resumes they have. If Dunder Mifflin rejects the recruiter, the recruiter will tell the job seekers that Dunder Mifflin said there wasn't a fit for them.

Discrediting an employee's current company - Black hat recruiters will contact an employed potential candidate and tell them that their current company is in a precarious financial state and offer to find the employee another job. Black hat recruiters will even do this to employees of their own clients.

Simulating expiring offers - When a company sends an offer to a job seeker, black hat recruiters will tell the job seeker that they only have X days (where X is usually 1 or 2) to accept the offer; otherwise, it will be rescinded. This practice is rarer because job seekers and companies know each other's contact information by this point, but I've heard of this happening to at least one company and one job seeker (separate events).

Sending false offer letters - Black hat recruiters will send out fake offer letters to job seekers for companies they're having trouble getting interviews for. Black hat recruiters rely on job seekers requesting to interview with the company before accepting the offer. The recruiter then arranges an interview with the company. If the company likes the job seeker, the recruiter makes sure to process and negotiate the offer, sometimes issuing a "revised" offer to the job seeker. If there is not a fit for the job seeker at the company, the recruiter is no worse off than when they started, and they just drop all contact with the job seeker.


If you're thinking that any of these practices might work for you, think again. Seriously. They may work in the short term, but you will do irreparable harm to your reputation, the reputation of job seekers, and the reputation of companies you represent in addition to possibly opening yourself up to legal problems.

If you're a company or a software engineer who's tired of dealing with these tactics, check out Hirelite: Speed Dating for the Hiring Process. We have another event next Tuesday.

Got any more horror stories? Leave them in the comments.

Visual Guide to NoSQL Systems

There are so many NoSQL systems these days that it's hard to get a quick overview of the major trade-offs involved when evaluating relational and non-relational systems in non-single-server environments. I've developed this visual primer with quite a lot of help (see credits at the end), and it's still a work in progress, so let me know if you see anything misplaced or missing, and I'll fix it.

Without further ado, here's what you came here for (and further explanation after the visual).

Note: RDBMSs (MySQL, Postgres, etc) are only featured here for comparison purposes. Also, some of these systems can vary their features by configuration (I use the default configuration here, but will try to delve into others later).

As you can see, there are three primary concerns you must balance when choosing a data management system: consistency, availability, and partition tolerance.
  • Consistency means that each client always has the same view of the data.
  • Availability means that all clients can always read and write.
  • Partition tolerance means that the system works well across physical network partitions.

According to the CAP Theorem, you can only pick two. So how does this all relate to NoSQL systems?

One of the primary goals of NoSQL systems is to bolster horizontal scalability. To scale horizontally, you need strong network partition tolerance which requires giving up either consistency or availability. NoSQL systems typically accomplish this by relaxing relational abilities and/or loosening transactional semantics.

In addition to CAP configurations, another significant way data management systems vary is by the data model they use: relational, key-value, column-oriented, or document-oriented (there are others, but these are the main ones).
  • Relational systems are the databases we've been using for a while now. RDBMSs and systems that support ACIDity and joins are considered relational.
  • Key-value systems basically support get, put, and delete operations based on a primary key.
  • Column-oriented systems still use tables but have no joins (joins must be handled within your application). Obviously, they store data by column as opposed to traditional row-oriented databases. This makes aggregations much easier.
  • Document-oriented systems store structured "documents" such as JSON or XML but have no joins (joins must be handled within your application). It's very easy to map data from object-oriented software to these systems.
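
To make the key-value model concrete, here's a toy store that keeps one file per key. It's only meant to show the get/put/delete interface; it is nothing like a real distributed system, the function names are made up, and it assumes keys are valid file names.

```shell
# Toy key-value store: a throwaway directory holding one file per key.
KV_DIR="$(mktemp -d)"

kv_put()    { printf '%s' "$2" > "$KV_DIR/$1"; }    # store value $2 under key $1
kv_get()    { cat "$KV_DIR/$1" 2>/dev/null; }       # print the value, fail if the key is missing
kv_delete() { rm -f "$KV_DIR/$1"; }                 # remove the key if it exists

kv_put user42 "alice"
kv_get user42; echo                        # prints alice
kv_delete user42
kv_get user42 || echo "user42 not found"   # prints user42 not found
```

Everything is looked up by its primary key. There's no way to ask for "all users named alice" without scanning every key yourself, which is the kind of trade-off these systems make.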

Now for the particulars of each CAP configuration and the systems that use each configuration:

Consistent, Available (CA) Systems have trouble with partitions and typically deal with them through replication. Examples of CA systems include:
  • Traditional RDBMSs like Postgres, MySQL, etc (relational)
  • Vertica (column-oriented)
  • Aster Data (relational)
  • Greenplum (relational)

Consistent, Partition-Tolerant (CP) Systems have trouble with availability while keeping data consistent across partitioned nodes. Examples of CP systems include:

Available, Partition-Tolerant (AP) Systems achieve "eventual consistency" through replication and verification. Examples of AP systems include:

Self promotion and Credits

  • If you're a developer and looking for a job or if you're hiring developers and these data systems are important to you, consider coming to Hirelite: Speed Dating for the Hiring Process on Tuesday.
  • This guide draws heavily from a recent Ruby meetup (by Matthew Jording and Michael Bryzek) and a recent MongoDB presentation (given by Dwight Merriman).
  • Thanks to DBNess and ansonism for their help with validating system categorizations.
  • Thanks to those who helped shape the post after it was written: Stan, Dwight, and others who commented here and on this Hacker News thread.

Update: Here's a print version of the Visual Guide To NoSQL Systems if you need one quickly (warning: it's not all that pretty and I may not keep it updated, but as of 3/17/2010, it's current).

    Why Networking Sucks for Introverts (and one way I'm trying to fix it for us)

    Networking can really suck for introverts. I know because I'm one of them. You're probably thinking, "Of course! Introverts are shy and have trouble with social interactions." However, introversion is much more complex and encompasses an overlapping spectrum of feelings. Here's my take on it:

    In general, the terms "introvert" and "extrovert" describe social preferences, not social capabilities, and it's important to remember that there's nothing wrong with tending toward one side or the other. Both have advantages and disadvantages (many that you can overcome with practice or adrenaline).

    Problems that Introverts Have with Networking

    These observations stem largely from the software-related meet-ups I've attended in NYC (Hackers & Founders, Hadoop, Android, etc), so they may only be applicable to technically-oriented introverts.

    1. Making small talk

    "The weather sure is ____." When introverts hear this, we immediately disengage. It's a struggle for us to realize that a little upfront investment in small talk can lead to a great conversation. Small talk is all about finding something to have a deeper conversation about, but often times, introverts get stuck in small talk ruts or completely blank on what to talk about, leading to awkward pauses.

    2. Inducing awkward pauses

    I've been a party to plenty of awkward pauses, both on the "caused" and the "affected" side. Awkward pauses happen for two reasons: struggling with turn taking or blanking on what to talk about. Blanking on what to talk about can happen because introverts have other interesting ideas we're mulling or we've run out of conversation topics. In the past, I've thought about keeping notes on conversation topics, but it's pretty weird to see someone whip out a notebook in the middle of a conversation, so I haven't done it.

    3. Politely leaving conversations of no interest

    If an introvert can't get out of the small talk stage or genuinely has no interest in the person they're talking to (imagine getting stuck talking to someone from a recruitment agency who snuck into a MySQL meetup), the conversation is over. Time to escape.

    The polite introverts needlessly stick with the conversation, trying to think of a way to break it off nicely. From personal experience, these dreaded conversations can last up to half an hour. All the while, you're catching bits and pieces of interesting conversations all around you.

    The less polite introverts either have "I don't care" plastered across their faces or just walk away. I have seen both. The latter is much more entertaining.

    4. Having group discussions instead of 1-on-1 conversations

    If there's a type of conversation that introverts love, it's 1-on-1 conversations. It's nerdy, but it's great to get into an intellectual property debate with one other person. However, at most "networking events" it's tough to get a 1-on-1 conversation. Often times you're stuck with a group.

    Introverts have a lot of trouble with group conversations. We feel like we can't get a word in - other people are always talking! We feel like we have to keep up with the main conversation and all the little side conversations that keep splitting off. And a lot of times, it's hard to even join a group conversation.

    Joining a group conversation is hard because everyone already involved is participating in the conversation. It's hard for them to include someone who has just popped in. It sounds strange, but I've seen people walk up to a group conversation, stand there for 10 minutes, and then walk away without ever saying anything.


    What I'm doing about it

    I've resolved to help fix these problems for a subset of introverts in a subset of networking situations. I want to help software engineers, who are definitely more introverted than the general population, network to find jobs and meet companies. To accomplish this goal (and others - a subject of another post), I'm introducing Hirelite.com: Speed Dating for the Hiring Process.

    At a Hirelite event, software engineers will go on 5-minute "speed interviews" with companies. Making small talk and creating awkward pauses will be less of an issue because the conversations will be short and focused on how each party can help the other. Starting new conversations and politely leaving conversations of no interest will be of little concern due to the 5-minute time limit and rotation to the next conversation. And group conversations will be minimized: one company (possibly two people) speaking with one software engineer.

    Hirelite is having its inaugural event on March 16th in New York City. For this event, there is only space for 20 companies and 20 software engineers, so let us know early if you would like to attend.