Organizations are remarkably good at collecting data. They're remarkably bad at deleting it.
GDPR Article 5(1)(e) — the storage limitation principle — requires that personal data be kept "no longer than is necessary for the purposes for which the personal data are processed." This sounds straightforward until you try to implement it.
What does "necessary" mean for customer records? For support tickets? For analytics data? For employee files? The answer is different for each data type, each purpose, and often each jurisdiction.
Here's how to build a retention practice that actually works.
Why Retention Matters
Regulatory Exposure
Keeping personal data longer than necessary violates the storage limitation principle. Data protection authorities have fined organizations millions where excessive retention was among the findings. The Italian Garante fined a telecommunications company EUR 27.8 million in part for retaining customer data beyond the stated retention period.
Breach Amplification
Every record you keep is a record that can be breached. Assuming a steady rate of new records, an organization storing seven years of customer data when two years would suffice has 3.5x the exposure in a data breach, and 3.5x the notification burden.
Storage and Processing Costs
Data has a carrying cost. Storage, backup, indexing, search, compliance processing — all scale with data volume. Deleting data you don't need reduces costs across every system.
Defining Retention Periods
Start with Legal Requirements
Some retention periods are defined by law:
- Tax records: Typically 7-10 years (varies by jurisdiction)
- Employment records: Varies widely — check local labor law
- Financial transaction records: Often 5-7 years (anti-money laundering, accounting regulations)
- Healthcare records: Country-specific, often 10+ years
- Contractual records: Duration of contract plus limitation period for disputes
Document the specific law or regulation that mandates each retention period. "We keep it for legal reasons" without citing the law is insufficient.
Then Address Business Needs
For data without a legal retention requirement, define the business purpose and the minimum period needed to fulfill it:
- Active customer data: Duration of the business relationship plus a reasonable wind-down period
- Prospect data: 12-24 months from last interaction (legitimate interest decays over time)
- Support tickets: 12-24 months after resolution (for reference and quality improvement)
- Analytics data: Aggregate early, delete raw data within 6-12 months
- Marketing consent records: Duration of consent plus the limitation period for disputes (so you can still show that consent was given)
- Failed login attempts: 30-90 days (security monitoring)
Create a Retention Schedule
A retention schedule maps every data category to:
- System: Where is the data stored?
- Purpose: Why is it collected?
- Legal basis: Consent, contract, legitimate interest, legal obligation?
- Retention period: How long is it kept?
- Deletion trigger: What event starts the countdown? (Account closure, contract end, last activity)
- Deletion method: Soft delete, hard delete, anonymization?
- Verification frequency: How often is compliance checked?
This schedule is a living document. It must be updated when systems change, legal requirements change, or business processes change.
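One way to keep the schedule reviewable is to hold it as structured data rather than prose. The sketch below is a minimal illustration, not a prescribed format: the `RetentionRule` class, its field names, and the "helpdesk" system are assumptions made for the example, and the 730-day value is simply the upper end of the 12-24 month range suggested above for support tickets.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionRule:
    """One row of the retention schedule (field names are illustrative)."""
    data_category: str
    system: str             # where the data is stored
    purpose: str            # why it is collected
    legal_basis: str        # consent, contract, legitimate interest, legal obligation
    retention_days: int     # how long it is kept after the trigger
    deletion_trigger: str   # event that starts the countdown
    deletion_method: str    # "soft", "hard", or "anonymize"
    verification_days: int  # how often compliance is checked

    def delete_after(self, trigger_date: date) -> date:
        """Return the date on which the data becomes due for deletion."""
        return trigger_date + timedelta(days=self.retention_days)


# Hypothetical entry for support tickets.
support_tickets = RetentionRule(
    data_category="support tickets",
    system="helpdesk",
    purpose="reference and quality improvement",
    legal_basis="legitimate interest",
    retention_days=730,
    deletion_trigger="ticket resolved",
    deletion_method="hard",
    verification_days=90,
)

# A ticket resolved on 15 January 2024 becomes due 730 days later.
due = support_tickets.delete_after(date(2024, 1, 15))
print(due)
```

Keeping the trigger and the period in one record makes the deletion date something you can compute and test, rather than something each system owner interprets separately.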
Implementing Deletion
Soft Delete vs. Hard Delete
Soft delete marks a record as deleted (usually with a deleted_at timestamp) but retains the data. This allows:
- Recovery from accidental deletion
- Audit trail preservation during grace period
- Staged deletion (soft delete → grace period → hard delete)
Hard delete permanently removes the data. For GDPR compliance it must follow once the grace period ends: soft-deleted data that lives forever has not been deleted.
A common pattern: soft delete with a 30-day grace period, then automated hard deletion via a scheduled job.
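The pattern can be sketched with Python's built-in `sqlite3` module. This is an illustration under stated assumptions: the `customers` table, its columns, and the function names are hypothetical, timestamps are stored as Unix epoch seconds, and in practice `purge_expired` would be called by whatever scheduler you already run (cron, a task queue, and so on).

```python
import sqlite3
from datetime import datetime, timedelta, timezone

GRACE_PERIOD = timedelta(days=30)


def soft_delete(conn: sqlite3.Connection, customer_id: int, now: datetime) -> None:
    """Mark a row as deleted without removing it yet."""
    conn.execute(
        "UPDATE customers SET deleted_at = ? WHERE id = ? AND deleted_at IS NULL",
        (int(now.timestamp()), customer_id),
    )


def purge_expired(conn: sqlite3.Connection, now: datetime) -> int:
    """Hard-delete rows whose grace period has ended; return how many were removed."""
    cutoff = int((now - GRACE_PERIOD).timestamp())
    cur = conn.execute(
        "DELETE FROM customers WHERE deleted_at IS NOT NULL AND deleted_at <= ?",
        (cutoff,),
    )
    conn.commit()
    return cur.rowcount


# Demo on an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, deleted_at INTEGER)"
)
conn.executemany(
    "INSERT INTO customers (id, email) VALUES (?, ?)",
    [(1, "a@example.com"), (2, "b@example.com")],
)

t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
soft_delete(conn, 1, t0)

purged_early = purge_expired(conn, t0 + timedelta(days=29))  # still inside the grace period
purged_late = purge_expired(conn, t0 + timedelta(days=31))   # grace period has ended
remaining = [row[0] for row in conn.execute("SELECT id FROM customers ORDER BY id")]
```

Returning the number of purged rows gives the scheduled job something to log, which later serves as evidence that deletion is actually running.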
Dealing with Backups
Backup systems create a particularly difficult retention challenge. You can delete a record from your production database, but it persists in backups for weeks, months, or years.
Approaches:
- Backup rotation: Set backup retention to the shortest defensible period (30-90 days)
- Encryption-based deletion: Encrypt data with per-record keys that are stored outside the backup. Deleting the key makes the backup copy of the data unreadable.
- Documented exception: Acknowledge in your privacy notice that backup retention may extend beyond production deletion, with specific timeframes
Whatever approach you choose, document it. "We delete from production immediately and from backups within 90 days" is a position most regulators are likely to accept, provided it is written down and the stated timeframe is actually met.
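With backup rotation, the date a record is gone from every copy is simple arithmetic: a backup taken just before the production delete is the last one that can contain the record, so it ages out one backup retention window later. A minimal sketch, assuming a fixed rotation window; the function name is illustrative.

```python
from datetime import date, timedelta


def fully_erased_by(production_deleted_on: date, backup_retention_days: int) -> date:
    """Latest date on which a rotating backup can still hold the deleted record."""
    return production_deleted_on + timedelta(days=backup_retention_days)


# Deleted from production on 1 March 2024, with backups rotated after 90 days.
deadline = fully_erased_by(date(2024, 3, 1), backup_retention_days=90)
print(deadline)
```

This is the date to quote in a privacy notice or in a reply to an erasure request.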
Third-Party Data
Your retention policy must extend to processors. When you delete customer data, you must also instruct processors to delete it. Your data processing agreements (DPAs) should include:
- Processor obligation to delete data on instruction
- Processor obligation to delete data at contract termination
- Timeframes for deletion
- Confirmation of deletion
Verification: Proving Deletion Happens
A retention policy without verification is a fiction. You need to periodically check that data is actually being deleted on schedule.
Verification Workflow
- Select retention rules due for verification (e.g., all rules with monthly cadence)
- Check the system: Is data older than the retention period actually gone?
- Capture evidence: Screenshot or export showing the current data state
- Document the verification: Who checked, when, what they found, what action was taken
- Flag exceptions: If data should have been deleted but wasn't, investigate and remediate
- Update the verification record: Mark the rule as verified, set next verification date
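The workflow above can be partly scripted. The sketch below checks one rule against a SQLite table and produces a record of who checked, when, and what was found. It is an illustration only: the `VerificationRecord` fields, the `verify_rule` function, and the `login_failures` table are hypothetical, dates are assumed to be stored as ISO `YYYY-MM-DD` text, and evidence capture (screenshots or exports) is left out.

```python
import sqlite3
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class VerificationRecord:
    rule: str
    checked_by: str
    checked_on: date
    overdue_count: int   # rows older than the retention period
    passed: bool
    action: str          # what happens next
    next_check: date


def verify_rule(conn, rule, table, date_column, retention_days,
                cadence_days, checked_by, today):
    """Count rows that should already be gone and record the result."""
    cutoff = (today - timedelta(days=retention_days)).isoformat()
    # Table and column names come from the retention schedule, never from user input.
    (overdue,) = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {date_column} < ?", (cutoff,)
    ).fetchone()
    passed = overdue == 0
    return VerificationRecord(
        rule=rule,
        checked_by=checked_by,
        checked_on=today,
        overdue_count=overdue,
        passed=passed,
        action="none" if passed else "investigate and remediate",
        next_check=today + timedelta(days=cadence_days),
    )


# Demo: failed login attempts with a 90-day retention period, checked monthly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE login_failures (id INTEGER PRIMARY KEY, attempted_on TEXT)")
conn.executemany(
    "INSERT INTO login_failures (attempted_on) VALUES (?)",
    [("2024-01-10",), ("2024-05-20",)],
)

record = verify_rule(
    conn, rule="failed logins: 90 days", table="login_failures",
    date_column="attempted_on", retention_days=90, cadence_days=30,
    checked_by="privacy-team", today=date(2024, 6, 1),
)
print(record)
```

A failed check is the exception to flag: here the January row is older than the cutoff, so the record asks for investigation and sets the next check 30 days out.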
Automating What You Can
Some systems support automated retention:
- Database TTL (time-to-live) settings
- Cloud storage lifecycle policies
- Email retention policies
- Log rotation
Where automation exists, use it. Then verify that the automation is working correctly.
Where automation isn't available, schedule manual verification at a cadence appropriate to the risk — monthly for high-volume data, quarterly for lower-risk categories.
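As one example of a cloud storage lifecycle policy, the sketch below builds an expiration rule in the shape Amazon S3 lifecycle configurations use. The rule ID, prefix, bucket name, and 180-day period are assumptions chosen for illustration; the `boto3` call is shown as a comment and is not executed here. Other storage providers have their own equivalents with different field names.

```python
import json


def expiration_rule(rule_id: str, prefix: str, days: int) -> dict:
    """Build one lifecycle rule that expires objects under `prefix` after `days` days."""
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Expiration": {"Days": days},
    }


# Hypothetical rule: raw analytics events expire after roughly six months.
lifecycle = {"Rules": [expiration_rule("expire-raw-analytics", "raw-events/", 180)]}
print(json.dumps(lifecycle, indent=2))

# Applying it would look like this (needs boto3 and AWS credentials; not run here):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="example-analytics-bucket", LifecycleConfiguration=lifecycle
# )
```

Generating the rule from the retention schedule, rather than typing the number into a console, keeps the configured period and the documented period from drifting apart.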
Common Pitfalls
"We Might Need It Someday"
The most common objection to deletion is fear of future need. This is not a valid retention justification under GDPR. If you don't have a specific, documented purpose for keeping data, the storage limitation principle requires deletion.
Confusing Anonymization with Deletion
Properly anonymized data is no longer personal data and can be kept indefinitely. But anonymization must be irreversible. Pseudonymized data (where re-identification is possible with additional information) is still personal data and still subject to retention limits.
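The difference is easy to demonstrate. In the sketch below, replacing an email address with its hash looks anonymous but is only pseudonymous: anyone holding a list of candidate addresses can rebuild the mapping. The records and function name are invented for the example, and the aggregation step is a simplified stand-in for anonymization; very small counts can still single out individuals.

```python
import hashlib
from collections import Counter


def pseudonymize(email: str) -> str:
    """Replace an email with an unsalted SHA-256 hash (a stable token, not anonymous)."""
    return hashlib.sha256(email.lower().encode("utf-8")).hexdigest()


records = [
    {"email": "ana@example.com", "country": "PT"},
    {"email": "ben@example.com", "country": "DE"},
    {"email": "cy@example.com", "country": "DE"},
]

pseudonymized = [
    {"user": pseudonymize(r["email"]), "country": r["country"]} for r in records
]

# Re-identification: hash a list of candidate emails and look for matches.
candidates = ["ben@example.com", "zoe@example.com"]
lookup = {pseudonymize(e): e for e in candidates}
reidentified = [lookup[p["user"]] for p in pseudonymized if p["user"] in lookup]

# Aggregation drops the identifier entirely; only counts per country remain.
counts = Counter(r["country"] for r in records)
```

The pseudonymized rows still need a retention period. Aggregates like `counts` may not, provided nobody can be singled out from them.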
Forgetting About Derived Data
Retention policies often address primary data (customer records, orders) but miss derived data (analytics profiles, recommendation models, aggregated reports that contain personal data). If derived data contains or can reveal personal data, it needs a retention period too.
Inconsistent Deletion Across Systems
Deleting a customer from your CRM but leaving their data in your analytics platform, email marketing tool, and support system isn't compliant deletion. Map every system that holds a copy of each data type, and ensure deletion cascades across all of them.
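A cascade can be sketched as a registry that maps each system to a delete handler and records the outcome per system. Everything here is hypothetical: the in-memory dictionaries stand in for real systems, and the handlers would in practice call each vendor's API.

```python
from typing import Callable, Dict


def delete_everywhere(
    customer_id: str, systems: Dict[str, Callable[[str], bool]]
) -> Dict[str, str]:
    """Run every registered delete handler and record the outcome for each system."""
    results = {}
    for name, delete in systems.items():
        try:
            results[name] = "deleted" if delete(customer_id) else "not_found"
        except Exception as exc:
            # One failing system must not stop deletion in the others.
            results[name] = f"failed: {exc}"
    return results


# In-memory stand-ins for real systems.
crm = {"c-42": {"email": "a@example.com"}}
analytics = {"c-42": {"events": 17}}


def failing_support_delete(customer_id: str) -> bool:
    raise ConnectionError("support API unavailable")


systems = {
    "crm": lambda cid: crm.pop(cid, None) is not None,
    "analytics": lambda cid: analytics.pop(cid, None) is not None,
    "support": failing_support_delete,
}

results = delete_everywhere("c-42", systems)
# Systems to retry or escalate before the deletion counts as complete.
incomplete = [name for name, status in results.items() if status.startswith("failed")]
```

Recording a result for every system, including failures, is what turns the cascade into something you can verify: the deletion is complete only when `incomplete` is empty.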
Building Organizational Discipline
Technical controls aren't enough. Retention requires organizational commitment:
- Executive sponsorship: Retention policies that don't have leadership backing get ignored when storage is cheap and deletion feels risky
- Clear ownership: Every retention rule needs an owner — someone responsible for ensuring the rule is followed and verified
- Regular reviews: Quarterly review of retention policy adherence, verification results, and exceptions
- Incident response: When verification reveals data that should have been deleted but wasn't, treat it as a compliance incident — investigate root cause, remediate, and prevent recurrence
Data retention isn't glamorous. It doesn't generate revenue or delight customers. But it's a fundamental compliance requirement, a risk reduction strategy, and — increasingly — a factor in business valuations. The organizations that do it well are the ones that build the habit of proving it.