Building Software That Can Scale Beyond the MVP
The decisions that determine whether a software product can scale are made long before the product needs to scale. Some of these decisions are nearly impossible to reverse without a complete rewrite. Others can be safely deferred until scale is actually a problem. Knowing which is which — and what to do about the ones that matter — is the most valuable technical knowledge a startup founder can have.
The Decisions You Cannot Defer
These architectural decisions become far more expensive to change as users and data accumulate. They must be correct at the MVP stage:
- Multi-tenancy model: How you isolate data between customers is the most fundamental scaling decision in a SaaS product. Schema-per-tenant is highly isolated but complex to maintain at hundreds of tenants. Row-level isolation with a tenant_id column scales to thousands of tenants with proper indexing and query design. Choosing the wrong model and migrating later requires touching every table, every query, and every API endpoint.
- Authentication architecture: JWT vs session-based authentication, the role model (RBAC vs ABAC), and the permission granularity all become expensive to change after users are embedded in the system. Define the auth model precisely before building any feature that requires permissions.
- Primary key strategy: Sequential integers as primary keys leak business information (customer count, order volume) and create ordering assumptions that break in distributed systems. UUIDs avoid both problems and are cheap to adopt from the start, at the price of larger keys and indexes; retrofitting them onto a populated schema is far more expensive.
- Data model normalisation: Denormalised data models that work at 1,000 rows require expensive migrations at 10 million rows. Normalise aggressively at the MVP stage — denormalise specific queries only when profiling confirms it is necessary.
- API versioning: An API with no versioning strategy cannot be changed without breaking existing integrations. Version from day one (/api/v1/) even if you have no integrations yet — retrofitting versioning onto an unversioned API while customers are live is a multi-sprint project.
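As a rough illustration of row-level isolation combined with UUID primary keys, the sketch below uses Python's built-in sqlite3 module. The invoices table, the TenantScopedInvoices class, and the tenant names are hypothetical, chosen only for this example; a production system would use its own schema and database.

```python
import sqlite3
import uuid


def create_schema(conn: sqlite3.Connection) -> None:
    # Every tenant-owned table carries tenant_id; the composite index keeps
    # per-tenant lookups from scanning other tenants' rows.
    conn.execute(
        """
        CREATE TABLE invoices (
            id TEXT PRIMARY KEY,          -- UUID stored as text
            tenant_id TEXT NOT NULL,
            amount_cents INTEGER NOT NULL
        )
        """
    )
    conn.execute("CREATE INDEX idx_invoices_tenant ON invoices (tenant_id, id)")


class TenantScopedInvoices:
    """Hypothetical repository: every query is bound to a single tenant_id."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: str) -> None:
        self._conn = conn
        self._tenant_id = tenant_id

    def add(self, amount_cents: int) -> str:
        invoice_id = str(uuid.uuid4())  # random key, reveals nothing about volume
        self._conn.execute(
            "INSERT INTO invoices (id, tenant_id, amount_cents) VALUES (?, ?, ?)",
            (invoice_id, self._tenant_id, amount_cents),
        )
        return invoice_id

    def get(self, invoice_id: str):
        # The tenant filter is applied here, not left to each caller.
        return self._conn.execute(
            "SELECT id, amount_cents FROM invoices WHERE tenant_id = ? AND id = ?",
            (self._tenant_id, invoice_id),
        ).fetchone()


conn = sqlite3.connect(":memory:")
create_schema(conn)
acme = TenantScopedInvoices(conn, tenant_id=str(uuid.uuid4()))
globex = TenantScopedInvoices(conn, tenant_id=str(uuid.uuid4()))
invoice_id = acme.add(12_500)

print(acme.get(invoice_id) is not None)  # the owning tenant can read the row
print(globex.get(invoice_id) is None)    # another tenant cannot, even with the id
```

Putting the tenant filter inside one data-access layer is one way to enforce isolation at the query level; it means a forgotten WHERE clause in feature code cannot expose another customer's rows.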
The Decisions You Can Safely Defer
These decisions are commonly over-engineered at the MVP stage. Deferring them saves significant development time without creating irrecoverable technical debt:
- Microservices: A monolith is the correct starting architecture. Extract services when a specific team boundary or scaling constraint forces it — not as a proactive architecture choice.
- Caching layer: Add Redis when query profiling shows a specific endpoint is slow due to repeated expensive queries. Do not add caching as a precaution before you have the data to target it.
- CDN and edge deployment: Add a CDN when static asset delivery is measured to be slow for users in specific regions. Most early-stage products have no users in enough regions to justify CDN complexity.
- Database read replicas: Add read replicas when primary database CPU consistently exceeds 70% or when specific read-heavy queries are shown to impact write performance.
- Message queues (Kafka, RabbitMQ): Add a message queue when you have measured that background task volume requires it. FastAPI BackgroundTasks and Celery with Redis are sufficient for most products up to millions of daily users.
The Five Scalability Patterns Worth Learning Early
These five patterns appear in almost every SaaS product that scales successfully. Understanding them before you need them prevents architectural decisions that block their later adoption:
- 1. Database indexing strategy: Every query has an implied index requirement. A query filtering on user_id on a table with 10 million rows needs an index on user_id. Adding indexes is cheap; discovering missing indexes in production under load is not.
- 2. Pagination everywhere: Any API endpoint that returns a list of records must paginate from day one. Returning all records works at 100 items; it times out or exhausts memory at 100,000.
- 3. Idempotent operations: Any operation that creates or modifies data (payment processing, order creation, email sending) must be safe to retry. Idempotency keys prevent double-charging, duplicate orders, and duplicate emails when network failures cause retries.
- 4. Background task separation: Operations that take more than 200ms (email, PDF generation, external API calls, data processing) must run asynchronously. Slow work left inside the request path shows up in API response times at scale, and moving it out later is a significant refactoring project.
- 5. Soft deletes: Mark rows as deleted with a deleted_at timestamp rather than issuing a hard database DELETE. Hard deletes orphan or cascade away related rows, destroy audit trails, and cannot be reversed. Soft deletes need only one nullable column, provided every read query filters out rows where deleted_at is set, and they prevent a whole category of data integrity problems.
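To make the idempotency pattern concrete, here is a minimal in-memory sketch. The PaymentService class, its charge method, and the customer id are hypothetical names invented for this example; a real service would store keys in the database rather than in a dict.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class PaymentService:
    """Hypothetical in-memory service used only to illustrate the pattern."""

    _results: dict = field(default_factory=dict)
    charges_made: int = 0

    def charge(self, idempotency_key: str, customer_id: str, amount_cents: int) -> dict:
        # A retry carrying a key we have already seen returns the stored
        # result instead of charging the customer a second time.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges_made += 1
        result = {
            "charge_id": str(uuid.uuid4()),
            "customer_id": customer_id,
            "amount_cents": amount_cents,
        }
        self._results[idempotency_key] = result
        return result


service = PaymentService()
key = str(uuid.uuid4())  # generated once by the client, reused on every retry
first = service.charge(key, "cust_1", 4_999)
retry = service.charge(key, "cust_1", 4_999)  # e.g. after a network timeout

print(first == retry)        # the retry receives the original result
print(service.charges_made)  # only one charge was actually made
```

In a multi-process deployment the check and the write would need to be atomic, for example through a unique constraint on the key column, so that two concurrent retries cannot both pass the check.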
Recognising When You Are Approaching a Scaling Limit
These signals indicate a scaling constraint is approaching before it becomes a crisis:
- API response times increasing with no code changes — indicates a database query is now hitting a table size where an index is needed
- Background task queue depth growing faster than it is processed — indicates worker capacity needs to be increased or tasks need to be optimised
- Database CPU consistently above 60% — indicates query optimisation or read replica addition is needed within 1–2 months
- Memory usage growing continuously without releasing — indicates a memory leak in the application code
- Engineers reporting that a specific part of the codebase is "scary to touch" — indicates accumulated technical debt that is creating scaling risk through brittleness
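When response times drift upward with no code changes, one way to check for a missing index is to ask the database for its query plan. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module; the events table and index name are hypothetical, and other databases expose the same idea through their own EXPLAIN output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id TEXT PRIMARY KEY, user_id TEXT, payload TEXT)")

QUERY = "SELECT id, payload FROM events WHERE user_id = ?"


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple) -> str:
    # Each EXPLAIN QUERY PLAN row ends with a human-readable detail string.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(conn, QUERY, ("u1",))
conn.execute("CREATE INDEX idx_events_user_id ON events (user_id)")
plan_after = query_plan(conn, QUERY, ("u1",))

print(plan_before)  # a full-table SCAN: cost grows with table size
print(plan_after)   # a SEARCH that uses idx_events_user_id
```

The exact wording of the plan varies between SQLite versions, so it is safer to look for the SCAN versus SEARCH distinction than to match the text exactly.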
A Practical Scalability Checklist for the First 100 Days
These practices, adopted in the first sprint, prevent the majority of scaling problems that appear in the first year of a growing product:
- Every database table has a UUID primary key, created_at, updated_at, and deleted_at
- Every list endpoint paginates — no endpoint returns an unbounded list
- Foreign key columns and columns used in frequent WHERE filters are indexed
- Background tasks (email, webhooks, heavy processing) run asynchronously
- API is versioned from the first endpoint
- Tenant isolation model chosen and enforced at the database query level
- Authentication model (roles, permissions) designed for the eventual permission requirements, not just the current ones
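The "no unbounded list" item can be sketched with keyset (cursor) pagination, which is one of several workable approaches. The orders table, the list_orders function, and the MAX_PAGE_SIZE cap below are assumptions made for this example, again using Python's built-in sqlite3 module.

```python
import sqlite3
import uuid

MAX_PAGE_SIZE = 100  # hypothetical hard cap so no caller can request everything


def list_orders(conn, tenant_id, limit=50, after_id=None):
    """Return one page of orders plus the cursor for the next page (or None)."""
    limit = max(1, min(limit, MAX_PAGE_SIZE))
    # Fetch one extra row to learn whether another page exists.
    rows = conn.execute(
        """
        SELECT id, total_cents FROM orders
        WHERE tenant_id = ? AND (? IS NULL OR id > ?)
        ORDER BY id
        LIMIT ?
        """,
        (tenant_id, after_id, after_id, limit + 1),
    ).fetchall()
    page = rows[:limit]
    next_cursor = page[-1][0] if len(rows) > limit else None
    return page, next_cursor


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id TEXT PRIMARY KEY, tenant_id TEXT NOT NULL, total_cents INTEGER NOT NULL)"
)
conn.execute("CREATE INDEX idx_orders_tenant_id ON orders (tenant_id, id)")
tenant = str(uuid.uuid4())
for n in range(5):
    conn.execute("INSERT INTO orders VALUES (?, ?, ?)", (str(uuid.uuid4()), tenant, 1000 * n))

# Walk every page the way an API client would, passing the cursor back in.
pages, cursor = [], None
while True:
    page, cursor = list_orders(conn, tenant, limit=2, after_id=cursor)
    pages.append(page)
    if cursor is None:
        break

print([len(p) for p in pages])
```

Keyset pagination filters on the last seen id rather than skipping rows with OFFSET, so the cost of fetching a page stays roughly constant however deep the client has paged.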
Implementation Checklist
- Multi-tenancy isolation model chosen and documented before the first customer data enters the system
- Authentication and permission model designed for eventual requirements
- UUID primary keys on all tables
- Every list endpoint paginates
- All FK and filter columns indexed
- Background task queue in place for any operation over 200ms
- API versioned (/api/v1/) from day one
- Soft delete pattern implemented on all tables with user data
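For the soft delete item, a minimal sketch might look like the following. The documents table and the soft_delete, restore, and live_titles helpers are hypothetical names for this example; the point is only that deletion becomes an UPDATE and every read filters on deleted_at.

```python
import sqlite3
import uuid
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE documents (
        id TEXT PRIMARY KEY,
        title TEXT NOT NULL,
        deleted_at TEXT          -- NULL means the row is live
    )
    """
)


def soft_delete(conn, doc_id):
    # Record when the row was deleted instead of removing it.
    conn.execute(
        "UPDATE documents SET deleted_at = ? WHERE id = ? AND deleted_at IS NULL",
        (datetime.now(timezone.utc).isoformat(), doc_id),
    )


def restore(conn, doc_id):
    conn.execute("UPDATE documents SET deleted_at = NULL WHERE id = ?", (doc_id,))


def live_titles(conn):
    # Every read path must apply this filter, or deleted rows reappear.
    rows = conn.execute("SELECT title FROM documents WHERE deleted_at IS NULL ORDER BY title")
    return [title for (title,) in rows]


doc_id = str(uuid.uuid4())
conn.execute("INSERT INTO documents (id, title) VALUES (?, ?)", (doc_id, "Roadmap"))
conn.execute("INSERT INTO documents (id, title) VALUES (?, ?)", (str(uuid.uuid4()), "Pricing"))

soft_delete(conn, doc_id)
after_delete = live_titles(conn)
restore(conn, doc_id)
after_restore = live_titles(conn)

print(after_delete)
print(after_restore)
```

Because the row is never physically removed, related rows keep a valid reference and the deletion can be undone, which is the property the checklist item is after.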
Common Mistakes to Avoid
- ✗ Building microservices before the domain model is stable: you will spend more time coordinating services than building product features
- ✗ Skipping pagination on "small" list endpoints: tables grow, and an unbounded query that works today causes an outage in 18 months
- ✗ Hard-coding customer-specific business logic: every customer-specific variation that lives in code rather than configuration creates technical debt proportional to the number of customers
- ✗ No load testing before a major launch: a product launch is the worst possible time to discover scaling limits
- ✗ Optimising for imaginary scale: adding complexity for traffic levels that are 100× your current traffic is premature and slows feature development