Are you still exclusively using closed source models?
Maybe it's time to switch to an open source model.
Stop blindly using closed source models (like OpenAI or Anthropic) for all your projects. Sometimes, open source is the better choice.
Closed source models like GPT and Claude are great. They’re easy to set up, require almost zero infrastructure, and are powerful out of the box for most use cases. In fact, I use many closed source models in my own products.
But don’t forget that open source models are ALSO an option. In fact, open source models have gotten much better in the last year, and depending on your use case, they might be the better pick.
It really comes down to your data privacy requirements, expected request volume, and how much control you need.
I put together a breakdown of the key factors that I think every data team should evaluate. We’ll cover three areas: Security & Compliance, Cost & Resources, and Control & Flexibility.
Let’s get into it.
Security & Compliance
I’m putting this section first because for many data teams, this is the deal breaker. If your data can’t leave your infrastructure, that alone might make the decision for you.
This is particularly common in industries like:
Healthcare, where patient data is protected under regulations like HIPAA
Finance, where transaction and customer data is subject to strict regulatory oversight
Government, where data sovereignty requirements often mandate that data stays on domestic infrastructure
With open source models that you host yourself, all data stays on your infrastructure. You have full control over data residency and access. If you’re working in a regulated industry, this is a big deal. You can enforce compliance requirements on your own terms, without relying on anyone else’s security posture.
With closed source models, your data passes through a third-party API. That means you need to trust the provider’s security and compliance practices. Most major providers (OpenAI, Anthropic, Google) do have strong security certifications and data processing agreements. But at the end of the day, you are still sending your data to someone else’s servers.
If your team handles sensitive data and you have strict compliance requirements, open source gives you the most control over where that data goes and who can access it.
PS: If you’re not sure what your company’s data compliance & security requirements are, check with your compliance team... and in the meantime, err on the side of caution.
Cost & Resources
This is where things get interesting. Open source and closed source have different cost profiles, and the “cheaper” option changes depending on where you are in your journey.
Cost to Get Started
Open source has a higher barrier to entry. You need GPUs (either owned or cloud-rented) and ML engineering talent to deploy and serve the model. It’s a lot to consider (and pay for) just to get started.
Closed source is much easier & faster to get started. All you need is an API key and an API call... and you’re running. You don’t have to worry about infrastructure or deep ML expertise.
Cost at Scale
This is where the decision gets a little more complex.
First, some context: with open source models, you need your own GPUs to run them. You can either buy them or rent them through a cloud provider.
Open source gets cheaper when your GPU utilization is high (whether you own or rent). Your infra costs are more or less fixed, so the more you use it, the better your unit economics get. The catch: you’re paying for those GPUs whether you’re using them or not. So if your team is only running a handful of requests per day, you’re still footing the bill for idle infrastructure.
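Here’s a rough sketch of that fixed-cost effect. The $1,460 monthly GPU bill and the request counts are made-up numbers for illustration, not real quotes, and the sketch ignores engineering time and the throughput limit of a single GPU.

```python
def cost_per_request(fixed_monthly_cost: float, requests_per_month: int) -> float:
    """Spread a fixed monthly infrastructure bill over the requests served."""
    if requests_per_month <= 0:
        raise ValueError("requests_per_month must be positive")
    return fixed_monthly_cost / requests_per_month


if __name__ == "__main__":
    gpu_bill = 1460.0  # hypothetical monthly GPU cost in dollars
    for volume in (100, 10_000, 1_000_000):
        unit = cost_per_request(gpu_bill, volume)
        print(f"{volume:>9,} requests/month -> ${unit:.5f} per request")
```

The bill stays the same in every row. Only the number of requests you spread it across changes.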
Closed source is pay-per-token, so your costs scale directly with usage, which is great at low volume. But once you’re processing a high volume of requests, those per-token costs start adding up fast... and getting your own GPU + open source model might make more sense.
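To get a feel for where the crossover sits, you can estimate a break-even volume. This is a back-of-the-envelope sketch: the $2/hour GPU rate and the $5 per million tokens API price are assumptions I picked for the example, and it assumes the GPU runs all month and can actually keep up with that volume.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month


def api_cost(tokens_per_month: float, price_per_million: float) -> float:
    """Monthly pay-per-token bill."""
    return tokens_per_month / 1_000_000 * price_per_million


def self_host_cost(gpu_hourly_rate: float, num_gpus: int = 1) -> float:
    """Monthly bill for GPUs that run around the clock, used or idle."""
    return gpu_hourly_rate * num_gpus * HOURS_PER_MONTH


def break_even_tokens(gpu_hourly_rate: float, price_per_million: float,
                      num_gpus: int = 1) -> float:
    """Monthly token volume at which both options cost the same."""
    return self_host_cost(gpu_hourly_rate, num_gpus) / price_per_million * 1_000_000


if __name__ == "__main__":
    tokens = break_even_tokens(gpu_hourly_rate=2.0, price_per_million=5.0)
    print(f"Break-even at about {tokens:,.0f} tokens per month")
```

Below the break-even volume the API is cheaper, and above it the self-hosted GPU wins, at least on paper. Plug in your own quotes before drawing any conclusions.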
Control & Flexibility
This section matters most if you’re building AI into a product, or if you have a very specific use case that general-purpose models don’t handle well out of the box.
Fine-Tuning Control
Open source gives you full control. You choose the fine-tuning method (LoRA, QLoRA, full fine-tune), adjust any hyperparameter, freeze specific layers, and you own the resulting model. It’s yours to deploy wherever you want.
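To give a sense of why the choice of method matters, here’s a small sketch comparing how many weights you’d train for one layer with a full fine-tune versus LoRA. LoRA trains two small matrices of rank r in place of the full weight matrix, so the count is r × (d_out + d_in). The 4096 × 4096 layer size and rank 8 are example values I chose, not a recommendation.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable weights when the whole d_out x d_in matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable weights for LoRA's two low-rank matrices (d_out x r and r x d_in)."""
    return rank * (d_out + d_in)


if __name__ == "__main__":
    d_out, d_in, rank = 4096, 4096, 8  # example values only
    full = full_finetune_params(d_out, d_in)
    lora = lora_params(d_out, d_in, rank)
    print(f"Full fine-tune: {full:,} trainable weights")
    print(f"LoRA (r={rank}): {lora:,} trainable weights ({lora / full:.2%} of full)")
```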
Closed source gives you limited fine-tuning options. You upload your data, tweak a few parameters, and the provider handles the rest. The fine-tuned model also stays on their infrastructure. You don’t get to take it with you.
Vendor Lock-In
With open source, you can switch models freely and self-host anywhere. If a better model comes out tomorrow, you can swap it in.
Closed source ties you to the provider’s ecosystem, pricing, and roadmap. Model gateways like OpenRouter can help reduce this, but if you’ve fine-tuned a model on a specific provider, it’s harder to just pick up and leave.
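One common way to soften lock-in in your own code is to hide the model behind a small interface, so the rest of the application doesn’t care which backend answers. The sketch below is only an illustration: `ChatModel`, `HostedModel`, and `SelfHostedModel` are hypothetical names, and the `complete` methods return placeholder strings instead of calling a real API or model server.

```python
from typing import Protocol


class ChatModel(Protocol):
    """Anything that can turn a prompt into a completion."""

    def complete(self, prompt: str) -> str: ...


class HostedModel:
    """Stand-in for a closed source API client (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        # A real version would call the provider's API here.
        return f"[{self.name}] {prompt}"


class SelfHostedModel:
    """Stand-in for an open source model served on your own GPUs (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        # A real version would call your own inference server here.
        return f"[{self.name}] {prompt}"


def summarize(model: ChatModel, text: str) -> str:
    """Application code depends only on the ChatModel interface."""
    return model.complete(f"Summarize: {text}")


if __name__ == "__main__":
    print(summarize(HostedModel("hosted-a"), "quarterly report"))
    print(summarize(SelfHostedModel("local-b"), "quarterly report"))
```

Swapping providers then means writing one new class, not touching every call site. It doesn’t solve the fine-tuning problem, though: a model fine-tuned on one provider still stays there.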
Model Selection
Open source has a huge ecosystem. There are domain-specific models for code, medicine, finance, and more. You have a lot of options, which is great, but it also means more research to find the right fit. It’s the paradox of choice!
Closed source has fewer options, and they’re mostly general-purpose. The upside is that it’s easier to pick. Typically, you’d go by benchmarks and provider reputation, and that’ll help you narrow it down quickly.
Wrapping Up
There’s no one-size-fits-all “right” answer here. Plenty of teams use both. You might use a closed source model for fast prototyping and general tasks, while running an open source model for anything that touches sensitive data or needs heavy customization.
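If you go the hybrid route, the routing rule can be very simple. Here’s a minimal sketch that follows the split described above. The `Task` fields and the returned labels are hypothetical, and a real setup would need a reliable way to flag sensitive data in the first place.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    has_sensitive_data: bool = False
    needs_heavy_customization: bool = False


def choose_model(task: Task) -> str:
    """Keep sensitive or heavily customized work on infrastructure you control."""
    if task.has_sensitive_data or task.needs_heavy_customization:
        return "self-hosted open source"
    return "closed source API"


if __name__ == "__main__":
    tasks = [
        Task("prototype a chatbot"),
        Task("summarize patient notes", has_sensitive_data=True),
        Task("domain-tuned classifier", needs_heavy_customization=True),
    ]
    for task in tasks:
        print(f"{task.name}: {choose_model(task)}")
```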
My advice: start by evaluating your security and compliance requirements (since that can be a deal breaker). Then move on to the other considerations!
Hope this breakdown was helpful! If you have any questions or want to share your own experience with open source vs. closed source, feel free to drop a comment.