Forgot Password? The Hidden Identity Nightmare
What starts as a basic flow (user requests a reset, clicks a link, sets a new password) quickly spirals into complex challenges: brute-forceable OTPs, token misuse on shared devices, old reset links that never expire, and a lack of GDPR-grade logging.
Yash Sharma
When you build authentication flows, the “Forgot Password” button feels like a checkbox feature.
User forgets → enters email → gets OTP/reset link → sets new password → done.
Development time: one sprint.
Risk: ignored.
That’s exactly how it began in a rising SaaS startup… until it quietly morphed into a multi-team emergency nobody saw coming.
It Started with a Few Odd Incidents
The support inbox suddenly filled with complaints like:
- “I’m receiving reset OTPs I never asked for…”
- “Our former employee managed to get back into their old dashboard!”
- “Our login system keeps going down every few hours.”
What initially looked like user confusion triggered a security investigation.
The security lead and a senior engineer opened the logs, and what they found shook the team.
Where It All Went Wrong
🚨 1. Brute-forcing OTPs and reset links
Attackers were bombarding the /reset-password endpoint with automated scripts, randomly trying 6-digit OTPs and UUIDs until they hit a valid one.
With no rate limiting or throttling, they eventually succeeded in resetting accounts they never owned.
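To see why an unthrottled 6-digit OTP is so exposed, here is a rough back-of-the-envelope calculation. It assumes a single valid OTP that stays valid for the whole attack and an attacker who never repeats a guess; the request rate of 100 per second is an illustrative assumption, not a figure from the incident.

```python
OTP_DIGITS = 6
KEYSPACE = 10 ** OTP_DIGITS  # 1,000,000 possible 6-digit codes


def success_probability(guesses: int, keyspace: int = KEYSPACE) -> float:
    """Chance that `guesses` distinct guesses include the one valid code."""
    return min(guesses, keyspace) / keyspace


def seconds_to_exhaust(requests_per_second: float, keyspace: int = KEYSPACE) -> float:
    """Worst-case time to try every possible code at a steady request rate."""
    return keyspace / requests_per_second


if __name__ == "__main__":
    assumed_rate = 100  # requests per second (assumption for illustration)
    hours = seconds_to_exhaust(assumed_rate) / 3600
    print(f"Full sweep at {assumed_rate} req/s: {hours:.1f} hours")
    print(f"Chance after 5 guesses: {success_probability(5):.6f}")
```

Under these assumptions, capping each account at a handful of attempts turns a near-certain takeover into a few-in-a-million chance per OTP.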
🔁 2. Token reuse on shared devices
Users often clicked reset links on shared machines (office desktops, cyber cafés, WhatsApp Web).
Tokens sat in browser history, so the next user could simply visit the URL and enter the reset flow without ever requesting it.
📬 3. Old reset links never truly expired
The frontend said “Link expires in 30 minutes,” but the backend never invalidated the links.
Old links buried in archived emails and chat logs stayed usable for months, letting some ex-employees silently regain access to corporate dashboards.
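The gap here is that expiry lived only in the UI copy. A minimal sketch of server-side expiry might look like the following. The in-memory dictionary, the function names, and the choice to store only a SHA-256 hash of the token are all assumptions for illustration, not the team's actual implementation.

```python
import hashlib
import secrets
from datetime import datetime, timedelta, timezone

TOKEN_TTL = timedelta(minutes=30)  # the lifetime the UI promised

# Hypothetical stand-in for a database table: token hash -> expiry time.
_expiry_by_hash: dict[str, datetime] = {}


def _hash(token: str) -> str:
    """Store only a hash so a leaked table does not leak usable tokens."""
    return hashlib.sha256(token.encode()).hexdigest()


def issue_token(now: datetime) -> str:
    """Create a random token and record its expiry on the server."""
    token = secrets.token_urlsafe(32)
    _expiry_by_hash[_hash(token)] = now + TOKEN_TTL
    return token


def is_token_live(token: str, now: datetime) -> bool:
    """A token is usable only if the server knows it and it has not expired."""
    expires_at = _expiry_by_hash.get(_hash(token))
    return expires_at is not None and now < expires_at
```

The point of the sketch is that the expiry decision is made from data the server wrote at issue time, so a link in an old email is rejected no matter what the client claims.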
📉 4. Compliance audit nightmare
GDPR auditors arrived asking:
“Who triggered the reset, who clicked the link, when, from which IP, from which device?”
We had… none of these logs.
Only a boolean: password_reset = true.
Suddenly a simple UX feature became a regulatory and legal liability.
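For contrast with a lone boolean, here is a sketch of what a per-event audit record could capture to answer the auditors' questions. The field names, the action labels, and the JSON-lines output are assumptions for illustration; a real schema would depend on the team's logging stack and retention policy.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ResetAuditEvent:
    account_id: str
    action: str       # e.g. "requested", "link_clicked", "password_changed", "rejected"
    ip: str
    user_agent: str
    occurred_at: str  # ISO 8601 timestamp in UTC


def make_event(account_id: str, action: str, ip: str, user_agent: str,
               now: Optional[datetime] = None) -> ResetAuditEvent:
    """Build one immutable audit record; `now` is injectable for testing."""
    now = now or datetime.now(timezone.utc)
    return ResetAuditEvent(account_id, action, ip, user_agent, now.isoformat())


def to_log_line(event: ResetAuditEvent) -> str:
    """Serialise the event as one JSON line for an append-only log."""
    return json.dumps(asdict(event), sort_keys=True)
```

Emitting one such line at request time, at link click, and at the final outcome would cover the who, when, and from where of each reset.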
Organisational Fallout
| Department | Impact |
|---|---|
| Security | Account takeover risks |
| Fraud | Fake resets → unauthorized access |
| Compliance | GDPR / PCI exposure |
| SRE/Uptime | Bots overwhelmed infra |
| UX | Legit users locked out or confused |
What looked like a UX convenience had become a dangerous, organisation-wide vulnerability.
How Engineering Responded
A task force redesigned the entire reset pipeline:
- Backend-enforced expiry on tokens (15-min TTL).
- Rate limits and IP throttling on reset endpoints.
- IP + browser-fingerprint pinning for token usage.
- Token marked as single-use, hard deleted on consumption.
- Audit logs for every request + click + outcome.
- Bot checks + CAPTCHA after abnormal behaviour.
- Alerting on multiple resets for same account/IP.
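The rate-limit item in the list above could be approximated with a sliding-window counter keyed by IP or account. This is a minimal in-process sketch: the class name, the limits, and the in-memory storage are assumptions, and a production setup would more likely keep the counters in a shared store so every app server sees them.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_attempts` per key within the last `window_seconds`."""

    def __init__(self, max_attempts: int, window_seconds: float):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # key -> timestamps of allowed attempts

    def allow(self, key: str, now: float) -> bool:
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_attempts:
            return False  # over the limit: caller should block, CAPTCHA, or alert
        hits.append(now)
        return True
```

A caller might check `limiter.allow(client_ip, time.monotonic())` before processing each reset request, and treat repeated refusals for the same key as a signal for the alerting item.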
📊 Secure Password Reset Architecture (High-level)
User → /reset-request → Generate token (expiry, IP, fingerprint, status = pending) → Store in DB → Email link to user
User clicks reset link → /reset-verify:
- Validate token (exists, unexpired, unused)
- Match requesting IP + fingerprint
- If both checks pass: log event (IP, device, timestamp) → allow password change → mark token as used or expired
- Else: block + rate limit + alert security
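The /reset-verify decision can be sketched as a single function. Everything here is an assumption for illustration: the `ResetToken` record, the dictionary standing in for the database, the reason strings, and the choice to delete the token on success so it cannot be replayed.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class ResetToken:
    token_hash: str
    account_id: str
    expires_at: datetime
    ip: str           # IP recorded when the reset was requested
    fingerprint: str  # browser fingerprint recorded at request time


def hash_token(token: str) -> str:
    return hashlib.sha256(token.encode()).hexdigest()


def verify_reset(store: dict[str, ResetToken], token: str, ip: str,
                 fingerprint: str, now: datetime) -> tuple[bool, str]:
    """Return (allowed, reason); the caller logs the reason and alerts on failures."""
    record = store.get(hash_token(token))
    if record is None:
        return False, "unknown_or_used"
    if now >= record.expires_at:
        del store[record.token_hash]  # expired tokens are purged, not kept around
        return False, "expired"
    if record.ip != ip or record.fingerprint != fingerprint:
        return False, "context_mismatch"
    del store[record.token_hash]  # single use: a second click finds nothing
    return True, "ok"
```

One design question the sketch leaves open is strict IP pinning: users who request on mobile data and click on Wi-Fi would be rejected, so a real system might treat a mismatch as a trigger for extra verification instead of a hard block.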
Takeaway
**Forgot Password isn’t a simple form; it’s an attack surface.**
If you don’t design it like a full-blown security product, it will absolutely become your weakest link.