When does open-source make sense?

May 05, 2021


This article is cross-posted over at the OpenView Partners blog…Thanks OpenView!

I was having a discussion with a friend recently, who was inquiring about some of my philosophies on open-source and asked: “when does open-source make sense for a business?” It’s a really hard question to answer, and in many ways is a really loaded question, because depending on who you ask, you may get extremely strong opinions with widely varying rationales.

Open-source can provide a number of different advantages for a business, on both the consumption and production end of the spectrum. Many companies are strategic consumers of open-source software as a means to reduce the burden on their software engineering team to build everything from the ground up. Other businesses, like LinkedIn and Netflix, have strong histories of being producers of open-source projects, which provides a strategic recruitment and retention tool for top-tier engineering talent.

For the purposes of this post, we’ll focus on the business decision to be (or not be) a Commercial Open-Source Software (COSS) company, and on thinking through whether open-source will provide a strategic advantage for your go-to-market motion. The decision to open-source or not is anything but black-or-white, and as with most strategy decisions, it’s probably best to think about the various tradeoffs and ways you can slice the question.

This post is not going to tackle some of the hairier second-order questions that come up once you’ve decided to open-source, like what license to use, how to defend against PaaS competitors, or when it makes sense to donate your code to an open-source foundation. These are all important and complicated questions, and they may be interesting topics for another discussion down the road.

What are some of the benefits of open-sourcing your software?

Open-source communities can provide a powerful distribution and message amplification channel

Open-source and community tend to go hand-in-hand, and the social dynamics that create that environment are probably complex enough to justify an entire article of their own. One major reason could be that open-source is often aligned with more mission-driven goals, rather than purely economic goals, and attracts a diverse set of personalities. Communities don’t come for free, and they require a lot of love and care to foster and grow, but they can provide a strong network effect, which turns the open-source community into an effective way to both distribute the project and amplify corporate messaging. In a number of cases, as with Kafka (sponsored predominantly by Confluent) and Elasticsearch (from the eponymous Elastic), the open-source communities grew around projects before a corporate entity was incorporated, and effectively created the opportunity to drive a fast-growing business.

You reduce adoption friction

Sometimes, open-sourcing your software can open a floodgate. Companies that had little interest in your solution as a proprietary offering might now be ready, willing, and excited to talk to you. This was somewhat the case in the early days of StreamSets (which is a DataOps/ETL tool). We actually started out as a closed-source offering, and spent about six months trying to sell it as a proprietary product, but kept getting doors slammed in our faces. When we finally open-sourced the software, we were suddenly able to get meetings with a lot of the companies that had previously politely declined. However, a good question to ask yourself when considering the reduced friction is: could you get the same benefit with a scaled-back, freemium offering? The answer likely depends on your market segmentation, as freemium may ease adoption friction for SMB/mid-market companies. It may not drive results in the enterprise, though, where businesses are likely to care much more about things like the long-term viability of a vendor and the potential for lock-in, and freemium may not sufficiently de-risk an early-stage company for enterprise use.

On-premise (or “cloud-prem”) is still a thing

Yes, SaaS is powerful; yes, SaaS is pervasive. And yet on-premise deployments are still very common, especially in the enterprise. There are unique security and privacy needs that come up in enterprise markets that just don’t carry the same level of concern downmarket, and open-source can make those challenges less of an obstacle. How common on-premise is depends on your definition of the term. These days, on-premise deployments commonly come in the form of businesses running their own VMs in a cloud IaaS environment. Functionally, it’s the same as running a datacenter, and the customer still shoulders the burden of running the infrastructure. For some organizations, that’s a critical need. Think financial services, healthcare, and other highly-regulated industries.

What are some of the downsides to open-sourcing your software?

You’ll compete with yourself


This is probably the toughest aspect of building an open-source business from both a go-to-market and product strategy perspective. Oftentimes, if you ask an open-source salesperson who their primary competitor is, they’ll say DIYers, and it’s totally true. A lot of your top-of-funnel is focused on converting open-source users to paying customers. It produces a challenging messaging strategy because if your primary business model is selling support on an open-source project, you’re kind of incentivizing your team to build shitty software, and nobody ends up happy. That leads to a strong need to determine a product strategy that both rewards adoption and drives conversion. These days, that often looks like offering SaaS/PaaS versions of the open-source project, which reduce management costs for end users, as well as holding certain features back (enterprise security features are a very common holdback to entice conversion). You can often see these types of feature segmentations detailed on open-source pricing pages, as with GitLab and HashiCorp.

Your destiny will likely be influenced by your community

It’s hard to really bucket this item into a pro or con, because it has elements of both. Open-source communities typically fall into one of two buckets: user communities or developer communities. In both cases, your project direction is likely to be influenced by these communities, but in different ways depending on the type of community you foster. User communities are powerful mechanisms to uncover unmet needs and organically discover use cases that you didn’t know your project had. However, communities take a lot of work and love and effort to keep alive, and a forgotten community can spell doom for a project, especially if people perceive that a project is slowing down. Developer communities can also have a significant impact, depending on how the community and its contributions are governed. For example, projects donated to the Apache Software Foundation have very strict bylaws about what qualifies as an acceptable contribution, which means that you may need to accept community contributions that lead your project in different directions than you intended.

Your code and roadmap are on display for competitors

Whether you like it or not, being open-source means competitors will see some surface of your product strategy. They’ll be able to see what you’re building and putting out in the open, and they can see how you’re building it, too. This is often where licensing questions start to come up, as well. The Apache License 2.0 has long been a highly popular open-source license, but has caused a lot of headaches in recent years as cloud services (cough, cough, AWS) have forked a number of projects and developed PaaS offerings around them. This has resulted in some highly controversial decisions in various open-source ecosystems, most notably with Elastic, who adopted the Server-Side Public License (SSPL), which prohibits creation of a PaaS service for the purpose of commercializing the open-source project (though it does not prohibit the use of the project within a commercialized solution). A number of companies like Confluent (with their Confluent Community License) and MongoDB (who originally designed the SSPL) have taken this approach to defend against would-be competitors.

Customers will likely ask for on-premise support

The ability to run on-premise opens up a new addressable market, but it is also damn expensive to support. Trying to debug some weird heisenbug via logs that are woefully uninformative? Yeah, that’s just Tuesday. You can work around this a bit by offering support for single-tenant deployments in an environment accessible to you, which makes the support a lot easier, but any way you cut it, supporting on-premise deployments is way more expensive than supporting a multi-tenant SaaS version of the product.

What signals might indicate that open-sourcing could be a good strategy?

Pros and cons aside, I think there are a lot of other important factors that tie into whether or not it really makes sense to be open-source. In particular, I think a good mental model for thinking through the decision is to consider a handful of primary questions:

How technical is the end user or operator?

The more technical your end user is, the more likely it is that they’ll value a technology being open-source. Developers or data engineers, for example, are likely to want to understand the source code, because it helps them understand the inner workings, and better rationalize how they might integrate it into their existing systems. Higher-functioning teams might also want to customize the software to their needs at various key integration points. Open-source code bases make that possible, and enhance their “right-to-repair,” so to speak. On the other end of the spectrum, if your end user is totally non-technical, say a marketer, they probably couldn’t care less whether the thing is open-source or closed-source, and frankly, they probably want it as a SaaS service, so that they can self-serve as much as possible, and limit their reliance on other technical teams. A data scientist or product analyst might be more towards the middle, as they probably understand SQL, or maybe even some statistical languages, or use Python. They probably value open-source, but it may not be the thing that sways a decision to adopt a technology or not.

A slightly different lens on this might be to think about how frequently the end user interacts with source control systems like GitHub or GitLab. In the data ecosystem, users are often writing code, and checking it into source control with fair regularity. This may be true even with less technical members of the community, like data analysts, who spend most of their time writing SQL queries. On the other hand, in the cybersecurity community, you may have highly technical users who understand the ins and outs of systems administration, but are not developers, and in spite of being very familiar with a command line, they spend relatively little time working with source-controlled code.

Are you targeting enterprise or mid-market/SMB?

Enterprise sales are quirky beasts. There are a lot more hoops to jump through, like security audits, source code scans, and the like. There’s also often a much different scale, which produces nuanced requirements that can vary from opportunity to opportunity. Generally, enterprises tend to be a lot more risk-averse. They appreciate the relative safety/security that open-source affords with the knowledge that bugs can be fixed, and they appreciate having the ability to run on-premise in a datacenter or VPC if they need to. Enterprises tend to choose technologies to fill out boxes in architecture diagrams, and so often need a peg that can fit into a very specifically-sized hole, and open-source can often make it easier to shave off the bits on the side that make it hard to get the peg in the hole. Mid-market and SMB companies, on the other hand, often don’t have budget for the best-of-breed technology in every space, so they have to pick and choose, and look for software that can offer something in many boxes rather than just do one thing really well. This often lends an edge to SaaS, where the applications can be more general purpose and all-in-one, and don’t incur the same management and operational expense that on-premise implementations of a similar tool might.

Is there a natural monetization strategy?

There are lots of different ways that open-source companies monetize, and none is perfect. Some companies choose to provide pure-play support for open-source (like Hortonworks), but this creates some awkward incentives around software quality. Another very common approach is to hold back certain features, frequently enterprise security/governance capabilities, to drive commercial adoption (usually these show up as Community vs Enterprise Editions of a product). An increasingly common strategy is for companies to develop an open-source project that may be available for on-premise deployments but predominantly monetize a cloud-managed service (Confluent Cloud, Databricks, and MongoDB Atlas are all good examples here), where the cloud-managed service enables greater adoption downmarket, and often these services are driven by consumption-based pricing. If an appropriate monetization strategy isn’t clear, that’s a big red flag for open-sourcing.

Taking a look at some examples

There’s no perfect answer to the question of whether you should open-source or not, but it can be instructive to think about questions like these, especially in the context of other businesses in the market. Let’s take a look at some of the patterns across open-source companies (and at some edge cases).

Segmented Product Offerings (Ex: Confluent, HashiCorp, and Grafana)

These companies have all taken the approach of bifurcating their market between an on-premise enterprise offering (Confluent Platform, HashiCorp Enterprise, Grafana Enterprise Stack) and a cloud-managed service. The open-source roots of all three allowed them to make fast inroads into the enterprise in the early stages of growth, and their open-source communities enabled them to take market share while de-risking adoption in the enterprise. They also all sport a very technical userbase (DevOps and Data Engineers primarily). The cloud-managed services attract the lower end of the market, and make it possible for these businesses to address the SMB/mid-market segments. The cloud-managed services also have different pricing models in several of these cases, with per-node pricing on the enterprise offerings, but consumption-based pricing in the managed service. Strategically, this allows the businesses to continue to address enterprise customers as the enterprise becomes more accepting of cloud and migrates to managed services.

Compete on Enterprise Functionality (Ex: Fivetran/Airbyte, Segment/RudderStack, Slack/Mattermost)

Taking a quick look at the marketing messaging for Airbyte, RudderStack, and Mattermost makes the strategic differentiation very clear. Security, privacy, lock-in. These are the things that matter to enterprise customers, and these are the primary differentiators that each leverages against its competitor, all of which hinge on the open-source strategy. Both Segment and Fivetran are trying very hard to crack the enterprise, but neither has seen wild success there. In particular, Fivetran struggles because it is strategically aimed at a user persona lower on the technical spectrum (analysts), who are able to save time by using Fivetran to circumvent central IT. The double-edged sword of Fivetran’s success with a non-technical crowd is that it has alienated the organization and the individual (IT/CIO) that would typically own its category of technology in the enterprise. Mattermost is a different story, as Slack has been highly successful in the enterprise, but one element of Mattermost’s differentiation strategy is similarly on privacy and compliance.

Compete on User Segmentation (Ex: Slack/Mattermost)

The other major front that Mattermost differentiates on is user segmentation. Slack is geared towards business teams broadly (evidenced by its former ticker $WORK), and with the Salesforce acquisition, its focus on revenue organizations feels even more pronounced. Mattermost paints itself as a collaboration tool for developers, keying into the more technical audience that will get more value out of its open-source strategy.

On Second Thought… (Ex: Panther)

Panther is particularly interesting because it is partially based on an open-source project from Airbnb called StreamAlert. Panther is, or more accurately, was, an open-source SIEM, akin to Splunk. Panther started open-source and recently made the decision to go closed-source, citing the simplicity that it would bring to the solution. This is likely partially a reflection of the relatively high cost of supporting customers using on-premise deployments. The other likely factor is that their solution is aimed at a market that doesn’t value open-source as much, and there’s not as much strategic value to being open-source. Panther builds on Snowflake, which is a little unusual in the security space to begin with, and it also aims at a downmarket, likely more digitally native audience. Additionally, their end users are security analysts that are likely comfortable on a command line, but are not necessarily developers familiar with an IDE. The market segment doesn’t demand open-source (although there have been a number of highly successful open-source security companies), nor does their user base, so going closed-source is likely a capital-efficient decision for them.

Upskilling Your Users (Ex: dbt)

dbt is a very curious player in the open-source ecosystem. Its users tend to be analysts and analytics engineers who are somewhat in the middle of the technical spectrum, but tend to have less programming ability. The tool is used broadly across SMBs and mid-market, and is starting to see adoption in the enterprise as well. It’s fairly universal, and so the decision to be open-source or not is a little up-in-the-air. Would dbt be successful if it were closed-source? My guess would be yes, but what is particularly interesting about dbt is that in many ways, its users start out with fewer technical capabilities, and dbt actually introduces them to software engineering concepts, and helps them upskill themselves. The result is that dbt moves its user base further along the spectrum of programming and technical ability, which might create a strong justification for why it makes sense to be open-source.

Does This Really Need to Be Open-Source? (Ex: Preset, Metabase)

Metabase and Preset both fall into the camp of “why is this open-source?” for me. To be clear, I don’t think it’s bad for them to be open-source, I just don’t see a clear necessity. They are end-user applications that don’t demand significant technical skills to use, and they sit in a market (business intelligence) that has historically not necessitated open-source in the enterprise (Tableau, Looker, and Sisense have all done just fine for themselves). Superset (which is the open-source basis for Preset) has certainly been an efficient distribution mechanism, and perhaps Preset’s strategy will just be to primarily attempt to monetize open-source adopters, but when that well runs dry, I don’t see a clear advantage to their open-source approach.

Obvious Exceptions (Snowflake)

Snowflake is one very obvious example that goes against the grain of trends with many open-source ecosystems. They are highly successful in SMB/mid-market, but also have significant traction in the enterprise. There are lots of characteristics of Snowflake’s business that would make it seem like open-source would have been a helpful strategy for driving into the enterprise earlier, but so much of their value proposition and secret sauce is in what a SaaS/PaaS distribution model enables. They offer a self-healing, self-optimizing database with data sharing across accounts. It just wouldn’t work as well deployed on-premise. They are able to provide extreme business value as a PaaS platform, in spite of being closed-source and available only as a multi-tenant service. In some ways, the value they provide is a consequence of being a multi-tenant PaaS offering, through network-oriented capabilities like shared tables, and by dramatically reducing the headcount required to manage and administer a data warehouse. Snowflake is evidence that open-source is not a requirement to be successful with a highly technical audience, even if it can be helpful in many other cases.

There’s no silver bullet to the question

Commercializing an open-source project as a core offering often cements a critical market, but being open-source does not necessarily dictate a particular monetization strategy or product strategy. Companies like Confluent and MongoDB have high-end offerings built around their core open-source projects, but also offer PaaS versions of their products (Confluent Cloud and MongoDB Atlas), which are essentially proprietary products that can appeal to a lower end of the market. This enables organizations to find a balance that fits their growth strategy at the time.

Open-sourcing a project can enable a company to reach an untapped market, but can have lasting effects on the direction of the business. There’s no right answer to whether or not a business should open-source, and the question should generally be answered in the context of who the business is targeting as an end user, and how open-sourcing can provide strategic advantage.

Special thanks to Natalie Vais, Davis Treybig, Sam Richard, and Mark Grover for reviewing earlier versions of this article.
