๐งน Contact Export Cleanup
๐ Overviewโ
The Contact Export Cleanup module automatically removes expired CSV export files from Wasabi S3 storage and their corresponding database records. It runs on a scheduled basis to maintain storage hygiene and comply with data retention policies.
Cron Schedule: Configured in crons/contacts/cleanup.js (typically daily)
Source Files:
- Cron:
queue-manager/crons/contacts/cleanup.js - Service:
queue-manager/services/contacts/cleanup.js(~35 lines)
๐ฏ Business Purposeโ
Ensures:
- Storage Cost Management: Removes old export files to reduce S3 storage costs
- Data Retention Compliance: Adheres to 24-hour retention policy for exports
- System Hygiene: Prevents accumulation of temporary export files
- Database Cleanup: Removes orphaned upload records
- Security: Limits exposure of exported data to 24-hour window
๐ Complete Processing Flowโ
sequenceDiagram
participant CRON as Cleanup Cron
participant SERVICE as Cleanup Service
participant DB as MongoDB (Uploads)
participant WASABI as Wasabi S3
participant LOGGER as Logger
loop Daily Schedule
CRON->>SERVICE: Execute cleanupContactsCSV()
SERVICE->>DB: Query exports > 24 hours old
DB-->>SERVICE: Export records
alt Exports Found
loop For each export
SERVICE->>LOGGER: Log cleanup start
SERVICE->>WASABI: deleteByKeyPublic(key)
WASABI-->>SERVICE: File deleted
SERVICE->>DB: Delete upload record
DB-->>SERVICE: Record deleted
end
SERVICE->>LOGGER: Log completion
else No Exports
SERVICE->>LOGGER: No files to cleanup
end
end
๐ง Main Service Functionโ
cleanupContactsCSV()โ
Purpose: Identifies and removes expired contact export files from both S3 storage and database.
Complete Source Codeโ
const UploadModel = require('../../models/uploads');
const wasabiUtility = require('../../utilities/wasabi');
const logger = require('../../utilities/logger');
exports.cleanupContactsCSV = async () => {
try {
// Step 1: Query exports older than 24 hours
let exportData = await UploadModel.find({
$or: [
{
source: 'person-export',
type: 'public',
createdAt: {
$lte: new Date(new Date() - 24 * 60 * 60 * 1000),
},
},
{
source: 'business-export',
type: 'public',
createdAt: {
$lte: new Date(new Date() - 24 * 60 * 60 * 1000),
},
},
],
})
.sort({ createdAt: -1 })
.lean()
.exec();
if (exportData.length > 0) {
let wasabiObj = new wasabiUtility();
// Step 2: Delete all files in parallel
await Promise.all(
exportData.map(async d => {
logger.log({
initiator: 'wasabi cleanup',
message: `cleaning up ${d.key} from wasabi`,
});
await wasabiObj.deleteByKeyPublic(d.key);
await UploadModel.deleteOne({ _id: d._id });
}),
);
}
logger.log({
initiator: 'wasabi cleanup',
message: `Deletion completed`,
});
} catch (error) {
console.log(error);
}
};
๐ Step-by-Step Logicโ
Step 1: Query Expired Exportsโ
let exportData = await UploadModel.find({
$or: [
{
source: 'person-export',
type: 'public',
createdAt: {
$lte: new Date(new Date() - 24 * 60 * 60 * 1000),
},
},
{
source: 'business-export',
type: 'public',
createdAt: {
$lte: new Date(new Date() - 24 * 60 * 60 * 1000),
},
},
],
})
.sort({ createdAt: -1 })
.lean()
.exec();
Query Logic:
OR Conditionโ
Matches either person or business exports:
- source: 'person-export': Contact/person exports
- source: 'business-export': Company/business exports
Filters Appliedโ
- type: 'public': Only public uploads (accessible via direct URL)
- createdAt:
{ $lte: ... }: Created 24+ hours ago
Date Calculationโ
new Date(new Date() - 24 * 60 * 60 * 1000);
// 24 hours * 60 minutes * 60 seconds * 1000 milliseconds
// = 86,400,000 milliseconds = 24 hours
Example:
- Current time:
2025-10-10 15:00:00 - Cutoff time:
2025-10-09 15:00:00 - Files created before cutoff are selected
Sort and Optimizationโ
- sort({ createdAt: -1 }): Oldest first (for logging consistency)
- lean(): Returns plain JavaScript objects (faster, no Mongoose overhead)
- exec(): Explicitly executes query
Step 2: Parallel Deletionโ
await Promise.all(
exportData.map(async d => {
logger.log({
initiator: 'wasabi cleanup',
message: `cleaning up ${d.key} from wasabi`,
});
await wasabiObj.deleteByKeyPublic(d.key);
await UploadModel.deleteOne({ _id: d._id });
}),
);
Parallel Processing:
- Promise.all(): Executes all deletions concurrently
- Significantly faster than sequential deletion
- All deletions must complete before function continues
Two-Step Deletion:
-
S3 Deletion:
wasabiObj.deleteByKeyPublic(d.key)- Removes file from Wasabi S3 storage
- Uses public bucket deletion method
- Key format:
exports/contacts/account-123/export-456.csv
-
Database Deletion:
UploadModel.deleteOne({ _id: d._id })- Removes upload record from MongoDB
- Prevents orphaned database records
- Frees up database storage
Logging:
- Logs each file being cleaned up
- Includes S3 key for audit trail
- Helps troubleshoot deletion failures
Step 3: Completion Loggingโ
logger.log({
initiator: 'wasabi cleanup',
message: `Deletion completed`,
});
Purpose:
- Confirms cleanup job completed
- Marks end of cleanup cycle
- Useful for monitoring and alerting
๐ Data Structuresโ
UploadModel Documentโ
{
_id: ObjectId,
source: 'person-export', // 'person-export' or 'business-export'
type: 'public', // 'public' or 'private'
key: 'exports/contacts/account-123/export-456.csv',
url: 'https://bucket.s3.wasabisys.com/exports/...',
filename: 'contacts_export_2025-10-09.csv',
size: 1024576, // File size in bytes
account_id: ObjectId,
user_id: ObjectId,
createdAt: Date, // Used for 24-hour filter
updatedAt: Date
}
Cleanup Query Result Exampleโ
[
{
_id: '507f1f77bcf86cd799439011',
source: 'person-export',
type: 'public',
key: 'exports/contacts/acc123/person-2025-10-08.csv',
createdAt: new Date('2025-10-08T10:00:00Z'), // 2+ days old
// ... other fields
},
{
_id: '507f1f77bcf86cd799439012',
source: 'business-export',
type: 'public',
key: 'exports/contacts/acc456/business-2025-10-07.csv',
createdAt: new Date('2025-10-07T14:00:00Z'), // 3+ days old
// ... other fields
},
];
๐จ Usage Patternsโ
Typical Cleanup Cycleโ
// Cron runs daily at midnight
// Example: 12:00 AM every day
// 1. User exports contacts at 10:00 AM on Oct 9
const exportRecord = await UploadModel.create({
source: 'person-export',
type: 'public',
key: 'exports/contacts/acc123/export.csv',
createdAt: new Date('2025-10-09T10:00:00Z'),
});
// 2. Export remains available for 24 hours
// 3. Cleanup runs at midnight on Oct 10 (14 hours later)
// File NOT deleted (only 14 hours old)
// 4. Cleanup runs at midnight on Oct 11 (38 hours later)
// File DELETED (over 24 hours old)
Manual Cleanup Triggerโ
// For immediate cleanup (e.g., after storage migration)
const { cleanupContactsCSV } = require('./services/contacts/cleanup');
await cleanupContactsCSV();
console.log('Manual cleanup completed');
โ๏ธ Configurationโ
Required Environment Variablesโ
# Wasabi S3 Configuration
WASABI_ACCESS_KEY=your-access-key
WASABI_SECRET_KEY=your-secret-key
WASABI_BUCKET=your-bucket-name
WASABI_REGION=us-east-1
WASABI_ENDPOINT=s3.wasabisys.com
# MongoDB
MONGO_DB_URL=mongodb://...
Cleanup Timingโ
// Retention period: 24 hours
const RETENTION_HOURS = 24;
const RETENTION_MS = RETENTION_HOURS * 60 * 60 * 1000;
// Can be made configurable via environment variable:
const RETENTION_MS = process.env.EXPORT_RETENTION_HOURS
? process.env.EXPORT_RETENTION_HOURS * 60 * 60 * 1000
: 24 * 60 * 60 * 1000;
Cron Schedule Examplesโ
// Daily at midnight
'0 0 * * *';
// Every 6 hours
'0 */6 * * *';
// Every hour
'0 * * * *';
// Twice daily (midnight and noon)
'0 0,12 * * *';
๐จ Error Handlingโ
Top-Level Error Handlingโ
try {
// Cleanup logic
} catch (error) {
console.log(error);
}
Error Behavior:
- Logs error to console
- Does not rethrow (prevents cron failure)
- Cleanup will retry on next scheduled run
- No user notification on failures
Partial Failure Handlingโ
await Promise.all(
exportData.map(async d => {
// If one deletion fails, others continue
await wasabiObj.deleteByKeyPublic(d.key);
await UploadModel.deleteOne({ _id: d._id });
}),
);
Failure Scenarios:
-
S3 Deletion Fails, Database Succeeds:
- File remains in S3 (orphaned)
- Database record removed
- File will be manually cleaned or remain until bucket lifecycle policy
-
S3 Deletion Succeeds, Database Fails:
- File removed from S3
- Database record remains (orphaned)
- Will be attempted again on next run (S3 deletion will fail gracefully)
-
One File Fails, Others Continue:
Promise.all()continues other deletions- Failed file logged but not retried
- Requires manual intervention
Improvement: Better Error Handlingโ
await Promise.all(
exportData.map(async d => {
try {
logger.log({
initiator: 'wasabi cleanup',
message: `cleaning up ${d.key}`,
});
await wasabiObj.deleteByKeyPublic(d.key);
await UploadModel.deleteOne({ _id: d._id });
logger.log({
initiator: 'wasabi cleanup',
message: `Successfully cleaned up ${d.key}`,
});
} catch (error) {
logger.error({
initiator: 'wasabi cleanup',
message: `Failed to cleanup ${d.key}`,
error: error.message,
});
// Continue with other deletions
}
}),
);
๐ Performance Considerationsโ
Optimization Strategiesโ
- Parallel Deletion: Uses
Promise.all()for concurrent operations - Lean Queries: Uses
.lean()to avoid Mongoose document overhead - Targeted Query: Filters at database level (not in-memory)
- Batch Processing: No pagination needed for typical volumes
Scalabilityโ
- Query Performance: Indexed on
source,type,createdAt - S3 Rate Limits: Wasabi allows 100 deletes/second (sufficient)
- Deletion Speed: ~100-200ms per file (S3 + DB)
- Typical Load: 10-100 exports per day (cleanup takes 1-10 seconds)
Index Recommendationโ
// Uploads collection indexes
{
source: 1,
type: 1,
createdAt: 1
}
// Or compound index
{
source: 1,
type: 1,
createdAt: -1
}
Typical Performanceโ
- Small Cleanup (< 10 files): 1-2 seconds
- Medium Cleanup (10-50 files): 5-10 seconds
- Large Cleanup (50-200 files): 15-30 seconds
- Very Large (200+ files): 1-2 minutes
๐งช Testing Considerationsโ
Mock Setupโ
jest.mock('../../models/uploads');
jest.mock('../../utilities/wasabi');
jest.mock('../../utilities/logger');
const { cleanupContactsCSV } = require('./services/contacts/cleanup');
Test Casesโ
describe('cleanupContactsCSV', () => {
test('Deletes expired person exports', async () => {
const mockExports = [
{
_id: 'export1',
source: 'person-export',
type: 'public',
key: 'exports/test.csv',
createdAt: new Date(Date.now() - 48 * 60 * 60 * 1000), // 48 hours old
},
];
UploadModel.find.mockReturnValue({
sort: jest.fn().mockReturnThis(),
lean: jest.fn().mockReturnThis(),
exec: jest.fn().mockResolvedValue(mockExports),
});
await cleanupContactsCSV();
expect(wasabiUtility.prototype.deleteByKeyPublic).toHaveBeenCalledWith('exports/test.csv');
expect(UploadModel.deleteOne).toHaveBeenCalledWith({ _id: 'export1' });
});
test('Does not delete recent exports', async () => {
const mockExports = [];
UploadModel.find.mockReturnValue({
sort: jest.fn().mockReturnThis(),
lean: jest.fn().mockReturnThis(),
exec: jest.fn().mockResolvedValue(mockExports),
});
await cleanupContactsCSV();
expect(wasabiUtility.prototype.deleteByKeyPublic).not.toHaveBeenCalled();
});
test('Handles errors gracefully', async () => {
UploadModel.find.mockImplementation(() => {
throw new Error('Database error');
});
await expect(cleanupContactsCSV()).resolves.not.toThrow();
expect(console.log).toHaveBeenCalledWith(expect.any(Error));
});
});
๐ Related Documentationโ
- Contacts Module Overview
- Contact Export Processing - Creates files cleaned by this module
- Wasabi S3 Utilities (documentation unavailable) (if documented)
๐ Notesโ
Why 24-Hour Retention?โ
- Balance: Gives users time to download without long-term storage
- Security: Limits exposure of exported customer data
- Cost: Reduces S3 storage costs
- Compliance: Meets temporary data processing requirements
Public vs Private Exportsโ
This cleanup only targets public exports:
- Public: Direct download URLs (expires after 24 hours)
- Private: Internal use, longer retention, different cleanup policy
Storage Cost Impactโ
Typical savings:
- 100 exports/day: ~500MB/day
- 30 days without cleanup: ~15GB accumulated
- With cleanup: < 1GB at any time
- Annual savings: ~180GB vs ~6TB
Orphaned Recordsโ
Possible orphan scenarios:
- S3 file deleted manually but database record remains
- Database record deleted but S3 file remains
- Partial cleanup failure
Solution: Run cleanup more frequently or add reconciliation job
Alternative: S3 Lifecycle Policyโ
Instead of application-level cleanup, could use S3 bucket lifecycle rules:
// Wasabi lifecycle rule
{
"Rules": [{
"ID": "cleanup-exports",
"Status": "Enabled",
"Prefix": "exports/contacts/",
"Expiration": {
"Days": 1
}
}]
}
Pros: Automatic, no code maintenance
Cons: Doesn't cleanup database records, less control
Complexity: Low
Business Impact: Medium - Cost optimization and compliance
Dependencies: Wasabi S3, UploadModel
Last Updated: 2025-10-10