How Razorpay Transitioned to a Secure Data Layer with Zero Downtime

Overview

Razorpay faced high CPU and I/O utilization on their primary MySQL 5.6 datastore and needed to upgrade to MySQL 5.7 with data encryption to meet compliance requirements. As a payment gateway, even a 60-second DNS switchover could cause transaction failures. Mydbops implemented a proxy-routing strategy that enabled a seamless migration to a fully encrypted, optimized database environment with zero end-user disruption while improving performance and scalability for future growth.
10X Scalability Headroom: Reduced CPU/IO load, enabling 10X scaling.
100% Compliance Achieved: Migrated to a fully encrypted environment.
Zero Migration Downtime: Enabled zero-downtime migration with proxy routing.
Optimized SQL & Schema: Tuned inefficient SQL queries and lowered overall resource utilization.
MySQL
Consulting Services

About

Razorpay aims to revolutionize online payments by providing a simple, affordable, and secure way for small businesses, startups, and traditional SME organizations such as schools and colleges to start accepting digital payments. It serves as a critical financial backbone in India, processing over $180 billion in annualized Total Payment Volume (TPV) and powering payments for more than 8 million businesses.
Deployment Type: Cloud-Based Deployment
Database Stack: Migration from RDS MySQL 5.6 to 5.7
Outcome: Zero Downtime & 10X Scalability Enhancement

Business Challenges

Overview
Razorpay was running MySQL 5.6 as its primary datastore. The business was experiencing high CPU and IO utilization, threatening performance under heavy transaction volumes. They needed to upgrade to MySQL 5.7 and implement encryption to fulfill critical compliance requirements. However, as a high-volume payment gateway, taking the system offline was unacceptable.

Downtime Constraints: The application relied on a Route 53 DNS endpoint pointing directly to the database. The minimum Time-To-Live (TTL) supported was 60 seconds, meaning a standard switch to the new database would force at least a 60-second payment outage. The client could not afford this disruption.

Unpredictable Failover: Activities on the RDS infrastructure were executed via API calls, making it impossible to estimate an exact time for failovers or node promotions. Razorpay required a transparent solution that eliminated this operational unpredictability.

Goals
The key objectives the client was aiming to achieve:

Complete the overall migration to MySQL 5.7 with absolutely zero downtime.

Implement encryption to fulfill compliance mandates.

Reduce high resource utilization through database tuning.

Identify and tune inefficient SQL queries while improving the database schema.

Ensure dedicated onsite support during the critical migration window.

Risks if Not Addressed
If left unresolved, these challenges posed serious risks:

Performance Issues

Without resolving replication lag and fragmented tables, query performance would continue to degrade, leading to a frustrating customer experience during peak hours.

Business Continuity Risks

Non-standardized backup policies increased the risk of data loss and prolonged outages, potentially disrupting thousands of orders in real-time.

Revenue Loss

Poor performance and downtime during peak times directly impacted the business's ability to fulfill customer demand, resulting in lost revenue and dissatisfied users.

Escalating Costs

Continued reliance on oversized, under-optimized infrastructure would lead to unnecessary monthly spend, straining the company’s profitability.

Developer Inefficiency

Lack of a stable and scalable database foundation meant developers spent significant time firefighting performance issues instead of innovating on features.


Solution Provided by Mydbops

To meet the strict zero-downtime requirement for live payments, Mydbops designed a proxy-based transition plan. By combining an intermediate proxy layer with database replication, the team bypassed infrastructure limits and delivered a smooth, risk-free upgrade.

Secure Cluster Setup:

To meet compliance mandates, the team created a new MySQL 5.7 cluster with Transparent Data Encryption. They populated it using a snapshot from the existing 5.6 replica, ensuring the setup process did not impact live production performance.
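The case study does not show how the encrypted cluster was built. As a minimal sketch of one common RDS approach (snapshot the replica, copy the snapshot with a KMS key, restore it, then upgrade the engine), the function below only assembles the boto3 RDS client calls and their arguments in order. The identifiers, instance class, and engine version are hypothetical placeholders, and whether Mydbops used these exact steps is an assumption.

```python
from typing import Any, Dict, List, Tuple


def encrypted_57_restore_plan(
    source_replica_id: str,
    target_instance_id: str,
    kms_key_id: str,
    instance_class: str = "db.r4.xlarge",   # hypothetical size
    target_engine_version: str = "5.7.44",  # hypothetical 5.7 minor version
) -> List[Tuple[str, Dict[str, Any]]]:
    """Return (boto3 RDS method name, kwargs) pairs in execution order."""
    plain_snapshot = f"{source_replica_id}-plain-snap"
    encrypted_snapshot = f"{source_replica_id}-enc-snap"
    return [
        # 1. Snapshot the 5.6 replica so the production primary is untouched.
        ("create_db_snapshot", {
            "DBInstanceIdentifier": source_replica_id,
            "DBSnapshotIdentifier": plain_snapshot,
        }),
        # 2. Copying a snapshot with a KMS key yields an encrypted copy.
        ("copy_db_snapshot", {
            "SourceDBSnapshotIdentifier": plain_snapshot,
            "TargetDBSnapshotIdentifier": encrypted_snapshot,
            "KmsKeyId": kms_key_id,
        }),
        # 3. Restore the encrypted copy as a new instance.
        ("restore_db_instance_from_db_snapshot", {
            "DBInstanceIdentifier": target_instance_id,
            "DBSnapshotIdentifier": encrypted_snapshot,
            "DBInstanceClass": instance_class,
        }),
        # 4. Upgrade the restored instance from 5.6 to 5.7.
        ("modify_db_instance", {
            "DBInstanceIdentifier": target_instance_id,
            "EngineVersion": target_engine_version,
            "AllowMajorVersionUpgrade": True,
            "ApplyImmediately": True,
        }),
    ]
```

With boto3 installed and credentials configured, each pair could be executed as `getattr(boto3.client("rds"), name)(**kwargs)`, waiting for the snapshot or instance to become available before moving to the next step.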

01 Secure Setup: Provisioned a new TDE-encrypted MySQL 5.7 cluster from a snapshot of the 5.6 replica.
02 Data Sync: Configured external replication to avoid depending on unpredictable, API-driven RDS failovers.
03 Proxy Routing: Deployed a MaxScale proxy on an EC2 instance to route traffic and avoid the 60-second DNS delay.
04 Zero-Downtime: Moved the readers, then re-pointed the proxy to shift live payment traffic.

Controlled Data Synchronization:

External replication was configured to keep data synced between the legacy 5.6 and the new 5.7 clusters. This bypassed unpredictable, API-driven RDS failovers, giving the team precise manual control to guarantee a safe transition without data loss.
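On RDS for MySQL, external replication is typically configured with the `mysql.rds_set_external_master` and `mysql.rds_start_replication` stored procedures, run on the instance that will act as the replica (here, the new 5.7 primary). The helper below is a sketch that only builds those statements. The host, user, and binary log coordinates are hypothetical, and the source does not confirm the exact commands that were used.

```python
from typing import List


def _quote(value: str) -> str:
    """Quote a string literal for MySQL (escape backslashes and quotes)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"


def external_replication_statements(
    source_host: str,
    source_port: int,
    repl_user: str,
    repl_password: str,
    binlog_file: str,
    binlog_position: int,
    use_ssl: bool = False,
) -> List[str]:
    """Statements to run on the new 5.7 instance so it follows the 5.6 source."""
    set_master = (
        "CALL mysql.rds_set_external_master("
        f"{_quote(source_host)}, {int(source_port)}, "
        f"{_quote(repl_user)}, {_quote(repl_password)}, "
        f"{_quote(binlog_file)}, {int(binlog_position)}, "
        f"{1 if use_ssl else 0});"
    )
    return [set_master, "CALL mysql.rds_start_replication;"]
```

The binary log file and position would come from the point at which the snapshot was taken, so that the 5.7 cluster replays every change made on 5.6 after that point.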

Proxy-Routed Traffic Management:

A mandatory 60-second Route 53 DNS delay directly threatened active payment processing. To solve this, Mydbops deployed a MaxScale proxy on an EC2 instance to manage traffic. Once the DNS record pointed to this proxy, the backend database could be switched at the proxy itself, insulating the application from DNS TTL wait times.
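To make the proxy layer concrete, the sketch below renders a small MaxScale configuration with one backend server, a monitor, a `readconnroute` service, and a listener. The section names, credentials, and the choice of router are assumptions for illustration, since the case study does not publish the actual configuration.

```python
import configparser
import io


def render_maxscale_config(backend_host: str, user: str, password: str,
                           listen_port: int = 3306) -> str:
    """Render a minimal maxscale.cnf that forwards clients to one backend."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.optionxform = str  # keep option names exactly as written
    cfg["maxscale"] = {"threads": "auto"}
    cfg["db-primary"] = {            # hypothetical server name
        "type": "server",
        "address": backend_host,
        "port": "3306",
        "protocol": "MariaDBBackend",
    }
    cfg["db-monitor"] = {            # tracks whether the backend is up
        "type": "monitor",
        "module": "mariadbmon",
        "servers": "db-primary",
        "user": user,
        "password": password,
    }
    cfg["payments-service"] = {      # hypothetical service name
        "type": "service",
        "router": "readconnroute",
        "router_options": "running",
        "servers": "db-primary",
        "user": user,
        "password": password,
    }
    cfg["payments-listener"] = {
        "type": "listener",
        "service": "payments-service",
        "protocol": "MariaDBClient",
        "port": str(listen_port),
    }
    out = io.StringIO()
    cfg.write(out)
    return out.getvalue()
```

Rendering the file once for the 5.6 endpoint and once for the 5.7 endpoint changes only the `address` line, which is why a cutover at the proxy can be much faster than waiting for a DNS TTL to expire.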

Zero-Downtime Cutover:

During the final migration, the reader endpoint was first moved to the 5.7 replicas. Next, the MaxScale proxy was updated to point to the new 5.7 primary node. A quick service restart instantly routed all traffic to the secure 5.7 database, resulting in a transparent switch with zero payment interruptions.
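The cutover order described above can be sketched as a small gated sequence. The replication check reads the standard `SHOW SLAVE STATUS` fields. The four callables are hypothetical hooks standing in for whatever tooling performs each step, and checking lag before each move is an assumed safeguard rather than something the case study documents.

```python
from typing import Any, Callable, List, Mapping


def replica_caught_up(status: Mapping[str, Any], max_lag_seconds: int = 0) -> bool:
    """True when both replication threads run and lag is within the limit."""
    lag = status.get("Seconds_Behind_Master")  # NULL (None) if replication is broken
    return (
        status.get("Slave_IO_Running") == "Yes"
        and status.get("Slave_SQL_Running") == "Yes"
        and lag is not None
        and int(lag) <= max_lag_seconds
    )


def run_cutover(
    get_status: Callable[[], Mapping[str, Any]],
    move_readers: Callable[[], None],
    repoint_proxy: Callable[[], None],
    restart_proxy: Callable[[], None],
) -> List[str]:
    """Run the cutover steps in order, refusing to proceed if 5.7 lags."""
    completed: List[str] = []
    if not replica_caught_up(get_status()):
        raise RuntimeError("5.7 is not caught up; no traffic was moved")
    move_readers()
    completed.append("reader endpoint moved to 5.7 replicas")
    if not replica_caught_up(get_status()):
        raise RuntimeError("5.7 fell behind; writer traffic was not moved")
    repoint_proxy()
    completed.append("proxy re-pointed to 5.7 primary")
    restart_proxy()
    completed.append("proxy restarted")
    return completed
```

Keeping the read traffic move separate from the write traffic move means a problem found after the first step can be handled while writes still go to the 5.6 primary.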

Old Setup

  • MySQL 5.6 (Unencrypted)
  • Maxed CPU & IO Utilization
  • 60-Second DNS Outage
  • Unpredictable API Failovers
VS

Mydbops Architecture

  • MySQL 5.7 (TDE Encrypted)
  • Tuned SQL & Schema Layout
  • Zero-Downtime Proxy Cutover
  • 10X Scalability Headroom

Results and Impact

Key Outcomes

10X Scalability Headroom

Proper database tuning significantly reduced CPU and IO load. This optimization provided the operational headroom needed to scale the application load by 10X using the exact same instance configuration.

[Chart: Application Load Capacity. Old (maxed out): 1X. Optimized limit: 10X capacity (+900% growth room).]

*Achieved entirely through expert SQL and schema tuning without upgrading server instance size.

Zero-Downtime Operations

Successfully managed the 60-second DNS limitation and unpredictable API failovers, executing the database migration without dropping a single client transaction.

Ensured Regulatory Compliance

Successfully deployed Transparent Data Encryption, fulfilling all security and compliance requirements necessary for a payment gateway.

Optimized Performance Architecture

Improved overall database health by tuning inefficient SQL queries and refining the schema layout, ensuring sustainable performance for future transaction loads.

Pre-Tuning (Critical Load): CPU utilization ~95%; I/O wait times at high risk.
Post-Tuning (Optimized): CPU utilization ~15%; I/O wait times stabilized.

Every second of downtime costs a payment gateway lost revenue and consumer trust. When Razorpay needed to upgrade and encrypt their database to meet strict compliance mandates, a standard 60-second outage was simply not an option. By partnering with Mydbops, they bypassed legacy infrastructure constraints using a precise proxy-based routing strategy, turning a highly disruptive maintenance window into a completely transparent transition.

The business outcomes went far beyond checking a compliance box. Expert database tuning reduced resource consumption so drastically that Razorpay gained the capacity to handle 10X their normal application load on their existing hardware.

Ultimately, they secured their data layer, achieved full compliance, and fortified their infrastructure for massive growth—all without dropping a single payment. Don't let the fear of downtime dictate your business roadmap.
