Monday, April 29, 2013

You Can't Learn from Failure, You Can Only Learn from Success

I feel strongly about not criticizing entrepreneurs. It's hard enough to start a company without the peanut gallery razzing you. And really, what good are you doing by being critical? Companies that are going to fail are going to fail without you wishing it on them, while your criticism might hurt companies that are on the path to success. What is constructive about your criticism?

The rationalization I always hear, from people who I suspect simply enjoy blood-sport, is: "how can we learn unless we look at what people are doing wrong?"

The easy answer is that there's a difference between a good post-mortem and criticizing something while people are trying to make a go of it. And there's a difference between a good post-mortem and 20/20 hindsight. The only person who can credibly make that distinction is the founder. Anyone else would have the unavoidable whiff of smugness.

There's a better answer and I think this is important. It's that criticism of failed companies is never very useful. Never.

You don't learn from failure. You can't learn anything from failure.

Ok, Neumann, you say. I have learned from failure. That time I burned my toast I learned to set the toaster lower. Now I don't burn my toast. Ha.

When I was a kid I wrote a computer program that learned to win at tic-tac-toe. First it built a decision tree of every possible tic-tac-toe game. Then it played you. Every time it lost it pruned that branch of the tree until only the wins or draws were left. It learned from failure.
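For the curious, here is a minimal sketch of that kind of learner in Python. It is not the original program: it assumes the learner plays X and moves first, it prunes moves lazily instead of building the whole tree up front, and an exhaustive search stands in for the human opponent. All the function names are made up for illustration.

```python
# Sketch only: the learner plays X and moves first; an exhaustive search
# stands in for every opponent it could ever meet.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]
EMPTY = " " * 9


def winner(board):
    """Return "X" or "O" if that side has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def open_cells(board):
    return [i for i, cell in enumerate(board) if cell == " "]


def place(board, cell, mark):
    return board[:cell] + mark + board[cell + 1:]


def find_lost_game(board, pruned, history=()):
    """Return the learner's (board, move) choices in a game it loses, or None."""
    choice = next((m for m in open_cells(board) if (board, m) not in pruned), None)
    if choice is None:
        return history  # every move from here was pruned: the position is lost
    history = history + ((board, choice),)
    after = place(board, choice, "X")
    if winner(after) or not open_cells(after):
        return None  # the learner won or drew this line
    for reply in open_cells(after):
        answered = place(after, reply, "O")
        if winner(answered) == "O":
            return history
        lost = find_lost_game(answered, pruned, history)
        if lost is not None:
            return lost
    return None


def learn():
    """Keep losing and pruning until no opponent line beats the learner."""
    pruned, losses = set(), 0
    while True:
        lost = find_lost_game(EMPTY, pruned)
        if lost is None:
            return pruned, losses
        pruned.add(lost[-1])  # cut the branch that led to the loss
        losses += 1


pruned, losses = learn()
print(f"lost {losses} games before becoming unbeatable")
```

This only works because the whole game fits in memory: every loss removes one branch, and there are few enough branches that the learner eventually runs out of ways to lose.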

Try that with checkers. Harder.

Try that with chess. Hard to impossible.

Try it with starting a company.

Any complicated system is too complicated to learn from failure. Yes, you can learn a few tricks, like: "don't spend all your money on fancy chairs" or "don't hire your college drinking buddies as EVPs of Business Development." But you can also spend your life learning about every mistake every startup founder ever made in all of recorded history and I guarantee that when you start your company you will discover all new mistakes to make. That is how life is.

All you can learn from failure is to avoid that particular kind of failure. And so what? There are too many other kinds of failure for that to make any difference. You need to learn from success. You should be spending your time trying to learn from success.

The successful entrepreneurs I have known have had the ability to look at a failure, any failure, and pull out the couple of things that were done right. These are what they focused on. Think about the successful entrepreneurs you know or have read about. This is just true.

If you're going to learn from failure you need to learn how to avoid every possible way you can fail. It's a waste of your time. You only need to learn one way to succeed.

Friday, April 26, 2013

Making Everyone a Data Scientist

You probably won't believe this.

There was a time when putting up a decent web page was considered a highly technical skill. In the period 1996-2000 most companies, even most big companies, had homepage designs that were raw HTML: static, poor design, broken links, nowhere to go and nothing to do. People had a name for it, brochureware; that was how common it was.

Firms like Razorfish and Red Sky specialized in using bleeding-edge technologies like CSS and DHTML to build sites that represented brands and told a story, that interacted. They hired the best coders and designers to do it, because it wasn't easy building a web site. The established agencies and design firms could not compete because they could not find enough people to hire who could build a competent site.

Crazy, right? Now ten-year-olds build web sites. College students put together calling-card websites overnight that rival what would have taken months and a team of people fifteen years ago.

That's how technology works. From magic to art to science to taken for granted. There was a time every car driver in America knew how their car engine worked; they had to. There was a time every programmer knew some machine language. There was a time everybody in the Internet business could tell you about what protocols they used in every layer of the OSI stack. One day every practitioner needs to know some technology. The next day it is invisible. We move up a conceptual layer and the layer below can be safely ignored.

Donald Norman, in The Invisible Computer, says

Everything changes when products mature. The customers change, and they want different things from the product. Convenience and user experience dominate over technological superiority. The company must change: it must learn to make products for their customers, to let the technology be subservient... The normal consumers, who make up the bulk of the market, consist of people who just want to get on with life, people who think technology should be invisible, hidden behind the scenes, providing its benefits without pain, anguish and stress... If the information technology is to serve the average consumer, the technology companies need to... start examining what consumers actually do. They have to be market driven, task-driven, driven by the real activities of those who use their devices.

Or, here's designer Jack Schulze, quoted in Domus:

Tech won't be visible but only if it's embedded into the culture that it exists within. By foregrounding the culture, you background the technology. It's the difference between grinding your way through menus on an old Nokia, trying to do something very simple, and inhabiting the bright bouncy bubbly universe of iOS. The technology is there, of course, but it's effectively invisible as the culture is foregrounded.

Does that tell you when? As an investor, I'm deeply interested in when. There are times I look at a particular sub-industry and it seems to be spinning its wheels. That usually says to me that the best place to innovate is actually a layer down in the stack. And there are times when I see awesome technology that is bottlenecked by a shortage of people who understand it. That's when it's time to move up a layer.

This second case could be restated as: when a technology becomes too sophisticated for its users to use, make it a platform and build a user interface layer on top of it. That this makes sense seems obvious; how to do it is not.


I invest in data businesses. Messy, technical data businesses. These companies deal with enormous amounts of real-time data. They push Big Data technologies, machine learning, and data visualization to their limits. They are always and constantly in search of people who can make these technologies work, and they're hard to find. Time to move up the stack.

I became involved with several companies in the data science industry: Metamarkets and Ufora in big data, Granify and BigML in machine learning, and Datadog, Lucky Sort and DataHero in data visualization.

DataHero just released their product this week. I think it's a great example of user-centric design, of de-magicking the tech.

I am as good as it gets when it comes to Excel. I was a consultant and a financial analyst for years when I was younger. But even so, when I pulled some AIPP data a few weeks ago to analyze it, making reasonable charts still took me an hour. Cleaning the data, organizing it the right way, deciding which charts would actually show anything, making the charts, and then exporting them so I could put them on the blog. Time suck.

Here's what I did with DataHero.
  1. Connected to my Dropbox, downloaded the dataset I had stored there as an Excel file, made sure DataHero had guessed the datatype in each column correctly (2 minutes);
  2. Dragged and dropped the x and y-axis variables onto a new chart, filtered out bogus values, tried different chart types (2 minutes per chart);
  3. Exported the charts (like 10 seconds each.)
Here are the first two charts from the AIPP blog post. Total elapsed time: 5 minutes. Compare that to the original hour.

Those two were fast because I already knew what I wanted. I then spent another fifteen minutes screwing around, trying about ten different charts to find three I thought had some explanatory value.

The obvious difference between doing this in DataHero and doing it in Excel was speed and bypassing the boring cleaning, categorizing, moving columns around, trying to figure out why Excel doesn't understand what I'm trying to do. But the less obvious and more powerful difference is that DataHero foregrounded what I was trying to do as a user.

The reason data visualization is such a powerful tool is that we, as humans, are better able to understand images than numbers. Tufte says, in closing his landmark The Visual Display of Quantitative Information,

What is to be sought in designs for the display of information is the clear portrayal of complexity. Not the complication of the simple; rather the task of the designer is to give visual access to the subtle and the difficult--that is, the revelation of the complex.

But this then raises the question: how do I figure out how to display the complex when it's so damned complicated? The AIPP data I was working with had no clear patterns at first glance; the set was too big for that. There are algorithmic techniques to discover order in large sets of data and there are simple hypotheses that can be confirmed or not. But doing either of these takes a decent amount of expertise and time. This sort of revelation requires the priesthood's guidance.

The third technique is more quintessentially human: tinkering and visual discovery. But this is not feasible for the non-programmer: it takes too long for each chart and making even small changes to the data being used is almost like starting all over. By taking away the complexity with automation and a user-centric interface, DataHero makes it possible to make as many charts as you like and throw away all but the ones that "give visual access to the subtle and the difficult."

The idea is to give everyone the ability to do most of what data scientists do today. Back in the '90s there were only a few really interesting websites because there were only a few people who could build interesting websites. Today there are only a few really interesting data visualizations because there are only a few people who can make really interesting data visualizations. When anybody and everybody can make sense of the complex data we're surrounded by, what will they find?

In all of the data science technologies, it is time for user-centric tools, tools designed around the real activities of their users, tools that foreground the culture. Because when the tool becomes invisible enough to us we can start to focus on what to do, not how to do it. That's where we can start to create real value.

Thursday, April 25, 2013

Andy Weissman on entrepreneurship, product development and the future of the Internet

Andy Weissman is a partner at Union Square Ventures and one of my favorite people in the startup world. He was a co-founder of Betaworks and was previously at Dawntreader and AOL. His easygoingness belies the fact that he's one of the most thoughtful investors in the business. 

Andy came up to talk to my entrepreneurship class at Columbia University's engineering school. I have people from the startup world in class every week. The point is not to have them teach, but to have a conversation with the class about what it's like to be an entrepreneur and to be part of the innovation economy. If the three turnings of the wheel are learning, knowing, and enlightenment, the speakers bring about the second: not what we do in the innovation economy, but what we are. Andy's talk is a great example.

Andy talks about what an entrepreneurial environment looks like--even in a company that's no longer a startup, how Betaworks did product development, where he sees business on the internet going, and what USV looks for in a startup. Andy's a really approachable and engaging speaker and I wish I could have had him talk twice as long.

Some marks:
0:00 - Intro, Andy's early career, at AOL
7:17 - Starting Betaworks
11:52 - Joining Union Square Ventures
13:47 - AOL diaspora
14:49 - Background of entrepreneurs
19:14 - Entrepreneurial environments--autonomy and empowerment
21:34 - Life after an acquisition
22:30 - Ideation and product development at Betaworks
30:55 - Evolution of internet business
33:20 - Next stage in internet business evolution
40:21 - NYC's internet ecosystem
44:39 - Is there a 'New Tool' now?
45:56 - Future of the internet
48:16 - What is USV looking for in a startup?
53:06 - Balancing data and gut feel in evaluating your startup idea
55:54 - Regrets?
56:32 - Most intriguing company he hasn't invested in.

If my fourteen-year-old got her hands on this video it would have jump cuts and pan zooms and a swelling soundtrack and lots of duck-faces. Instead you get what you get: occasional bursts of static and a weird pan to the previous class' blackboard notes. And no, I have no idea what those equations mean.

Monday, April 15, 2013

A Tool for Open VC Data

A week ago I posted an idea for VCs to share their investments in machine readable form from their own websites. This is the simplest possible way to open up that data without relying on an intermediary that might either become a middleman or a bottleneck.

The first pushback on the idea was that VCs valued being obscure and so wouldn't cooperate. I wrote something in Forbes yesterday in response. The subtext is that some VCs won't share data but there's nothing you can really do about it except recognize that the ones who do share are probably better partners for entrepreneurs.

The other criticism came from Greg Yardley who emailed me to tell me that the sample investments file I put up had a syntax error. I had missed a bracket. Javascript is fiddly. To help me out he put up a tool with a user-friendly front end to create the portfolio objects. This should help you out too if you're going to put one up on your site.
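A missing bracket is the kind of thing a quick check catches before you upload. Here is a minimal sketch in Python, assuming the file is plain JSON (the original proposal reads it with a JSON parser). The function name and both sample strings are made up for illustration.

```python
import json


def check_investments_text(text):
    """Return (True, None) if text parses as JSON, else (False, description)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
    return True, None


# Made-up samples: the second one is missing its closing bracket.
good = '[{"rounds": [{"Series": "Seed", "date": "06/2009"}]}]'
bad = '[{"rounds": [{"Series": "Seed", "date": "06/2009"}]}'

print(check_investments_text(good))
print(check_investments_text(bad))
```

The error description points at the line and column where the parser gave up, which is usually close to where the bracket went missing.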

He also improved the spec. He renamed the file from portfolio.js to investments.js--a more accurate description--and added things to the spec that real computer people have, like a version number. The new spec is here.

The tool does several things:

  1. You can create a new investments.js file. This also has the very cool feature of being able to import your existing Crunchbase data to use as a starting point. Huge time saver.
  2. You can edit an existing investments.js file; the site copies the data from your website.
In both cases the result is an investments.js object. You need to copy this onto your computer and upload it to your website.
  • Copy-paste it using a text editor or some-such,
  • Save it on your computer as "investments.js",
  • Upload it to your site using an FTP client, then
  • Let me know! Email me, tweet me (@ganeumann), tweet to @VCdelta, comment on this post or whatever and I will put you on the list of known investments.js files.
Note well: The site does not save your investments.js file; you need to do this yourself. If you leave the site without creating a js object and copy-pasting it to your own machine, you will lose your work. The whole idea is to have us all own our own data. So own it.

Thank you Greg!

Monday, April 8, 2013

A Mechanism for VC Deal Transparency

[Edit: Greg Yardley has built a tool to implement this. I talk about it here.]

There was a bit of a brou-ha-ha last Friday over VCs not publicizing their investments. I was dubious: I've never met a VC who didn't want the world to know about every investment they make. It's a crowded market and letting entrepreneurs know you're making investments lets them know, well, that you're making investments. All of us investors need to let the world know we continue to exist because if we don't, we soon cease to exist. Publicity around deals is by far the easiest way to do that.

But over the past few days I was challenged by entrepreneurs to provide more information. "Through what mechanism?" I asked. There was no good answer. Crunchbase, great for what it is, is not a good mechanism. Perhaps the entrepreneurs don't realize this (or perhaps I'm doing it wrong) but even though CB is wiki-like, I can't edit the "Investments" section of my page; I think only CB admins can. Because I'm never sure what the process is, I have neglected it. But there should be a mechanism to update our portfolios and it needs to be Internet-y: simple, end-to-end, machine readable.

    In some sense, this already exists. It is the VC firm portfolio pages. That's why I built @VCdelta two years ago, to disseminate information that VCs are making public on their sites. It's been pretty successful. It's followed by many VCs, entrepreneurs, and journalists. And when I've had trouble scraping VC sites--because VCdelta is indeed a scraper--VCs have usually helped out. The reason I've needed their help is that VCdelta is a good citizen: it doesn't scrape sites that don't want to be scraped. It doesn't hide what it is--a Python client--and it obeys robots.txt. Some sites were accidentally set up with defaults that exclude it.
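For anyone wondering what "obeys robots.txt" looks like in practice, here is a minimal sketch using Python's standard library. It is not VCdelta's code: the user agent string, the robots.txt contents and the URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical values for illustration; not VCdelta's real user agent or any real site.
USER_AGENT = "example-portfolio-scraper"
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(robots_txt, user_agent, url):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(allowed(ROBOTS_TXT, USER_AGENT, "https://example.com/portfolio"))
print(allowed(ROBOTS_TXT, USER_AGENT, "https://example.com/private/deals"))
```

A live scraper would point the parser at the site's own file with `set_url` and `read` instead of a string, and skip any page for which `can_fetch` says no.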

    It's okay if a VC does not want to share information. Those that share information get more attention, more love from entrepreneurs, and better deal flow. At least that's what I believe. And I've had many VCs bring themselves to my attention so they can be added to VCdelta's scrape. I do what I can.

    But here's the rub: I don't like being a bottleneck. I'm providing a service and it's automated so it doesn't take much time*. But being a scraper, it breaks. And then some firms decide to use more viewer-friendly technologies, like AJAX or Flash, that VCdelta is not set up to handle. Some firms use images on their portfolio page that have no company name associated with them. I don't have time to make it better. And, in my own opinion, it pretty well sucks right now. Firms like Intel Capital have never been included. First Round Capital broke after they moved to an AJAX portfolio page. Etc. These are important firms, and I don't have time to get them back into the program.

    So I have a proposal. If investors want to publicize their deals in a usable way while not relying on a third-party gate-keeper, then we need some common language and setup to communicate. It should be simple, hosted on the VC website, and machine readable. Something like this, my investments**.

    It's a Javascript object. It's machine readable. E.g. in Python:

    import json
    from urllib.request import urlopen
    # Python 3 imports; the file's URL goes between the quotes.
    portfolio = json.loads(urlopen("").read())

    The object is formatted like this:

     "url": "",
     "rounds":[ {"Series": "Seed", "date":"06/2009"},
       {"Series": "A", "date":"04/2010"}, ...]},
     "rounds":[ {"Series": "Seed", "date":"08/2008"}, ...], 
     "events":[ {"event":"Sale to BigCo", "date":"10/2012"}, ...]},...] 

    My proposal is that investors put this type of file up on their homepage and keep it updated. If we roughly agree on a format and location, third parties can easily find out what we are doing. This is not my standard, it is a proposal for a community standard. I disclaim any ownership of it. If someone wants to publish a similar format for funds raised and fund personnel, please do. I'm a one-person, bootstrapped operation so don't feel my opinion is of much weight for those.
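To make the format concrete, here is a minimal sketch of how a third party might read such a file once it has been fetched. The "url", "rounds" and "events" keys come from the fragment above; everything else is an assumption for illustration: the company names, the URLs, the "name" key, the top level being a list of company objects, and rounds being listed oldest first.

```python
import json

# Hypothetical sample: the company names, URLs and the "name" key are made up,
# and the top level is assumed to be a list of company objects.
SAMPLE = """
[
  {"name": "ExampleCo", "url": "https://example.com",
   "rounds": [{"Series": "Seed", "date": "06/2009"},
              {"Series": "A", "date": "04/2010"}]},
  {"name": "OtherCo", "url": "https://example.org",
   "rounds": [{"Series": "Seed", "date": "08/2008"}],
   "events": [{"event": "Sale to BigCo", "date": "10/2012"}]}
]
"""


def summarize(portfolio):
    """Yield (name, latest round, list of events) for each company."""
    for company in portfolio:
        rounds = company.get("rounds", [])
        # Assumes rounds are listed oldest first, as in the fragment above.
        latest = rounds[-1]["Series"] if rounds else None
        events = [e["event"] for e in company.get("events", [])]
        yield company.get("name"), latest, events


for row in summarize(json.loads(SAMPLE)):
    print(row)
```

Using `get` with defaults keeps the reader tolerant of firms that leave out optional keys, which matters if the format is only roughly agreed on.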

    If you decide to use it, let me know. I will have @VCdelta tweet any firms that adopt it.

    You may or may not believe that investors are being transparent enough, but some of our customers believe we are not. This is a mechanism to address that, if enough people choose to adopt it.

    * Normally I would publish the code, but it was built accretively so it's spaghetti and I'm embarrassed to have anyone else see it. I have had on my list to rebuild it from the ground up, but it's been on my list for 18 months.
    ** There is one investment not on the list because the entrepreneur has asked me not to disclose yet. The wishes of my entrepreneurs trump your need to know. Sorry.

    Thursday, April 4, 2013

    The Signal and the Noise (Angel Investing 6)

    [Note: Post Number 5, on modelling your portfolio, was supposed to be next. Then I asked my friend Chris Wiggins a naive question about power law distributions. In return he sent me this paper. And this software. And this syllabus. And this paper. And this video. And also this talk he gave, which was above my pay-grade. And then I read the papers and watched the video and I was enlightened. And then I ran the software and it told me that the returns in the AIPP data were not power law distributed, that it was far more likely they were log-normal distributed. And I knew there would be a whole 'nother set of papers and videos and online coursework. And so I threw up my hands and wrote this post instead. I'll get back to the other one when my brain recovers.]

    A long-time VC at a top tier firm said to me the other day: "we used to talk about proprietary deal flow, but that doesn't exist anymore. Good ideas can get in front of anyone and good founders make sure they do." Mahendra Ramsinghani* made the same point yesterday on PEHub. It's true.

    There was a time, say three years ago, when you could see great deals that no one else saw. That time is gone. Great deals are seen by plenty of people. Anyone can easily invest alongside me, or David Lee of SV Angel, or Founders Fund, or almost any other great investor.

    But this does not mean that you will automatically see great deals. Deals don't just suddenly start landing on your desk the day you decide to start investing. You need to make yourself known, then you need to make yourself wanted, then you need to make yourself needed. That takes some work.

    And then, as soon as you start generating signal, you have to deal with the noise.

    As an investor you will see a lot of potential investments for each one you make. Venture capitalists say they see 100 plans for every company they fund. There are a few reasons for this:
    • Most businesses should be friends-and-family funded, or bootstrapped**, not venture funded;
    • Most startups that should be funded should not be funded by you; you should invest in what you know: the people, the market, the types of activities the firm will need to excel at to thrive;
    • You may see several companies solving the same problem at the same time, but you probably don't want to have too much riding on a single problem;
    • Related, you will need to say no to good companies because they are too similar to companies you've already invested in;
    • You won't know which are the best new companies unless you are seeing a lot of them***; you won't know where the market is pricing companies if you aren't talking to a bunch of them; you won't know what problems companies in that industry are facing; you won't know which other investors are actively in the market; etc.
    You want to see many of the deals you end up not investing in. Venture investors traffic in information. Good venture investors are great information processors. Looking at companies is the best source of information out there. But a lot of deal flow is unavoidably noise to you, not signal. Although I would be interested to know more about the wave of online consumer subscription retail startups, for instance, they are too tangential to my main investment theses for me to spend the time meeting with them all. I need to just say no and move on.

    Once you open the door to any and all deals, you're drinking from a firehose. You can't look at everything. You need a mechanism to filter for the deals that fit your criteria, and you need to do it in a time-effective way. That may be the harder part of deal flow: not quantity, and not quality exactly, but finding the needle in the haystack.

    Ramsinghani correctly chides the VCs he's invested in to no longer consider proprietary deal flow a competitive advantage. But I don't think it means what he thinks it means. VC investing has become democratized. But the result is not that everyone now has the same advantages, it means that everyone now has the same problems. Established VCs have processes to separate the signal from the noise. You need them too.

    Next: Generating deal flow

    Previous posts in this series
    1. Intro: Why I'm Not an Angel
    2. How to spend your time: The Work-Work Balance 
    3. Positioning: How to be Different When What you Sell is a Commodity
    4. Portfolio Construction: Transcending Hobbyism 
      1. Data sidebar: AIPP data summary
      2. Data sidebar: AIPP exit data 
    5. Portfolio Modelling: TBA
    * Who, btw, wrote the book on venture capital as a profession: The Business of Venture Capital
    ** For many reasons--the return needed to compensate for the risk around a raw startup, the need to get to a cash exit (not just a sustainable business), the principal-agent risk, etc.--most startups should be self-funded or backed by friends and family or supported by the founders starting the company on the side. The National Venture Capital Association says that in 2011, only 973 startups received their initial venture capital funding. CB Insights has said closer to 2000. In comparison, the Bureau of Labor Statistics says that in March 2011, there were 536,445 establishments less than a year old. These numbers are not apples-to-apples, but the point is that far fewer than 1% of new businesses take venture money.
    *** Saying this may rub some entrepreneurs the wrong way: they don't want their time wasted responding to fishing expeditions. I agree. Do not waste entrepreneurs' time. Be honest. As soon as I decide I'm not investing, I tell the entrepreneur. If that's before the pitch I also often tell them that I would be happy to sit down and hear their pitch anyway. Many times they take me up on it. In return, I try to be helpful: constructively critiquing their plan, offering to make intros to portfolio companies or other potential partners, etc. My existing portfolio companies always come first, but in almost all cases in the startup world, the competition is not other startups, but established companies. I like helping startups, and if I can help without disadvantaging the companies I've already made a commitment to, I will.
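The arithmetic behind the second footnote (**) is easy to check. A minimal sketch in Python, using only the three figures quoted there; the variable names are made up, and as the footnote says the sources are not apples-to-apples.

```python
# Figures quoted in the ** footnote above.
NEW_ESTABLISHMENTS_MARCH_2011 = 536_445   # Bureau of Labor Statistics
FIRST_VENTURE_FUNDINGS_2011 = {
    "NVCA": 973,
    "CB Insights (approximate)": 2000,
}

shares = {
    source: count / NEW_ESTABLISHMENTS_MARCH_2011
    for source, count in FIRST_VENTURE_FUNDINGS_2011.items()
}

for source, share in shares.items():
    print(f"{source}: {share:.2%} of new establishments")
```

Either way the share comes out well under half a percent, so the "far fewer than 1%" claim holds whichever count you prefer.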

    Monday, April 1, 2013

    AIPP Data Modeling (Angel Investing, Sidebar 2)

    More from the Angel Investor Performance Project data.

    Much of the financial activity at early stage companies happens around investment rounds. Acquisitions, failures, and fire-sales tend to happen when it's time for the company to raise more money. So it doesn't make sense to look at rates of failure or exits divorced from the fund-raising process.

    Here's a look at the AIPP data from the last post. Company exits and the sale multiples by years held. A zero multiple means the company failed.

    Exits by years held (rows) and sale multiple (columns)

    Year    0  0-1  1-2  2-4  4-8  8-16  16-32  32-64  64-128  128-256  256-512  >512
    0       5    3    0    1    0     0      0      0       0        1        0     0
    1      22    8    3   35    1     2      1      0       0        0        0     0
    2      36   24    6    5    3     1      1      1       0        0        0     0
    3      25    6    6    8   21     1      1      1       1        0        0     0
    4      29    8    3   39    1     1      1      1       0        0        0     0
    5      14   23    7    6    1     0      0      0       0        0        1     0
    6       2    2    1    1    1     1      1      2       0        0        0     1
    7       1    5    6    0    0     1      0      0       0        0        0     0
    8       1    2    1    1    2     1      0      2       0        0        0     0
    9       0    0    0    0    0     2      0      0       0        0        0     1
    10      1    0    1    0    0     0      0      0       0        0        0     0
    >10    11    0    0    0    2     1      1      1       0        0        0     1

    Note the oddly large numbers in the 0-1 multiple range in years 2 and 5. My hypothesis is that these were companies that could not raise their next round (the A and the B, I assume) and went through a fire-sale, resulting in some money to the investors but not a gain.
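If you want to poke at the table yourself, here is a minimal sketch in Python. The counts are transcribed from the table above as it appears in this post; the function and variable names are made up for illustration.

```python
# Exit counts transcribed from the table above: one row per years-held bucket,
# one value per sale-multiple bucket.
BUCKETS = ["0", "0-1", "1-2", "2-4", "4-8", "8-16", "16-32",
           "32-64", "64-128", "128-256", "256-512", ">512"]
EXITS = {
    "0":   [5, 3, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0],
    "1":   [22, 8, 3, 35, 1, 2, 1, 0, 0, 0, 0, 0],
    "2":   [36, 24, 6, 5, 3, 1, 1, 1, 0, 0, 0, 0],
    "3":   [25, 6, 6, 8, 21, 1, 1, 1, 1, 0, 0, 0],
    "4":   [29, 8, 3, 39, 1, 1, 1, 1, 0, 0, 0, 0],
    "5":   [14, 23, 7, 6, 1, 0, 0, 0, 0, 0, 1, 0],
    "6":   [2, 2, 1, 1, 1, 1, 1, 2, 0, 0, 0, 1],
    "7":   [1, 5, 6, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    "8":   [1, 2, 1, 1, 2, 1, 0, 2, 0, 0, 0, 0],
    "9":   [0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 1],
    "10":  [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    ">10": [11, 0, 0, 0, 2, 1, 1, 1, 0, 0, 0, 1],
}


def column(bucket):
    """Exit counts for one sale-multiple bucket, keyed by years held."""
    i = BUCKETS.index(bucket)
    return {year: row[i] for year, row in EXITS.items()}


def overall_share(bucket):
    """Fraction of all exits in the table that fall in one bucket."""
    total = sum(sum(row) for row in EXITS.values())
    return sum(column(bucket).values()) / total


fire_sales = column("0-1")
for year in sorted(fire_sales, key=fire_sales.get, reverse=True)[:3]:
    print(f"year {year}: {fire_sales[year]} of {sum(EXITS[year])} exits returned less than 1x")
print(f"0-1x share of all exits: {overall_share('0-1'):.0%}")
```

Sorting by count puts years 2 and 5 at the top by a wide margin, which is the bump the fire-sale hypothesis is trying to explain.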

    What would Markov do?

    This, I think, more closely corresponds to reality than the continuous assumptions most analyses take. Note that this roughly agrees with the common wisdom that 1/3 of venture-backed companies fail (here, 34% overall), 1/3 return capital (here, 19% overall are fire-sales so return >0x and <1x, but some bleed-over into slightly more than 1x returns could be attributed to the 1/3 "return capital") and the rest make money for the fund.

    The AIPP dataset is not large enough to be able to make good predictions about sale multiples at each round. But if we assume that
    • Venture investments as a whole make 25% p.a.;
    • Venture funds end up returning 2x cash on cash on average;
    • Seed is in year 0, any A would be in year 1, any B would be in year 3; and
    • A sale after the B would be in year 6.
    then we can model the multiples at each point. The mean cash-on-cash sale multiple would be*

    Full follow-on:
    • 1.6 for pre-A sales, 
    • 4.4 for pre-B sales, and 
    • 4.7 for post-B sales;
    No follow-on**:
    • 1.6 for pre-A sales, 
    • 6.4 for pre-B sales, and 
    • 10.2 for post-B sales.

    Actual multiples would follow a power law probability distribution, as noted in the first sidebar, with these as the means.

    * I doubt this is a unique result. In configuring the model behind these, this seemed most reasonable to me given my experience.
    ** If you follow on, then you invest money in the A and B at a higher valuation, so your eventual multiple is lower, albeit on a larger investment.
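The follow-on effect in the second footnote can be seen with a toy calculation. This is a minimal sketch in Python with made-up numbers; it is not the model behind the multiples above, and the per-round multiples are purely hypothetical.

```python
def blended_multiple(investments):
    """investments: list of (amount invested, multiple earned on that amount)."""
    invested = sum(amount for amount, _ in investments)
    returned = sum(amount * multiple for amount, multiple in investments)
    return returned / invested


# Hypothetical: each later round is bought at a higher valuation, so the same
# exit earns a lower multiple on the money put in at that round.
seed_only = [(1.0, 10.0)]
with_follow_on = [(1.0, 10.0), (1.0, 4.0), (1.0, 2.0)]

for label, investments in [("no follow-on", seed_only), ("full follow-on", with_follow_on)]:
    cash_back = sum(amount * multiple for amount, multiple in investments)
    print(f"{label}: {blended_multiple(investments):.1f}x on "
          f"{sum(a for a, _ in investments):.0f} invested, {cash_back:.0f} returned")
```

The blended multiple drops when you follow on, but the cash returned goes up because more money was at work.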