InstaReports Service
Overview
internal/api/v1/instareports/Services/instareport.js manages the complete lifecycle of InstaReports - comprehensive business audit reports that analyze digital presence across 7 platforms (Yext, SEMrush, Facebook, Yelp, Google Maps, PageSpeed, Facebook Ads). Handles queue management, scraper orchestration, score calculation, report retrieval, and multi-channel notifications.
File Path: internal/api/v1/instareports/Services/instareport.js
Collections Used
Full Schema: See Database Collections Documentation
instareports
- Operations: Read/Write for report lifecycle management
- Model: shared/models/instareports.js
- Usage Context: Primary collection storing report status, business info, scraper results, calculated scores
instareports.queue
- Operations: Read/Write for queue processing
- Model: shared/models/instareports-queue.js
- Usage Context: Manages report generation queue with retry logic and CSV batch imports
instareports.additional_information
- Operations: Read/Write for recipient tracking
- Model: shared/models/instareports-additional-information.js
- Usage Context: Stores recipients, notification settings, view tracking, email/SMS event IDs
instareports.industry_average
- Operations: Read for benchmark comparisons
- Model: shared/models/instareports-industry-average.js
- Usage Context: Industry benchmarks for review and social scores (24 industries)
yext.publishers.logo
- Operations: Read for listing publisher logos
- Model: shared/models/yext-publishers-logo.js
- Usage Context: Logo URLs for Yext listing publishers (Google, Facebook, Yelp, etc.)
crm.contacts
- Operations: Read for business and people data
- Model: shared/models/contact.js
- Usage Context: Business information and recipient contact details
_users
- Operations: Read for creator information
- Model: shared/models/user.js
- Usage Context: Report creator details and notification sender info
_accounts
- Operations: Read for account domain and business info
- Model: shared/models/account.js
- Usage Context: Account settings, domain for report URLs, business branding
user_configs
- Operations: Read for report configuration
- Model: shared/models/user-config.js
- Usage Context: User preferences for videos, get-started URL, report customization
Data Flow
sequenceDiagram
participant User
participant Service
participant Queue as Queue Manager
participant Scrapers
participant DB as Database
participant Notify as Notifications
User->>Service: saveInstareportQueue()
Service->>DB: Create InstaReport (QUEUED)
Service->>DB: Create AdditionalInfo
Service->>DB: Create QueueEntry
Service->>Queue: Webhook trigger
Queue->>Scrapers: Parallel scraping
Note over Scrapers: Yext, SEMrush, Facebook,<br/>Yelp, GoogleMaps,<br/>PageSpeed, FacebookAds
Scrapers->>DB: Update scrapers.{platform}
Scrapers->>Service: UpdateStatus()
Service->>DB: Calculate scores
Service->>DB: Update status: GENERATED
Service->>Notify: sendNotifications()
Notify->>User: Email with report link
Notify->>User: SMS with short URL
Business Logic & Functions
Queue Management
saveInstareportQueue(queue, data)
Purpose: Creates a complete report entry: the InstaReport record, an AdditionalInformation record for recipients, and a Queue entry for processing. This is the initial setup step for report generation.
Parameters:
- queue (Object) - Queue metadata:
  - account_id (ObjectId) - Account identifier
  - uid (ObjectId) - User creating report
  - parent_account (ObjectId) - Parent account for sub-accounts
  - template_id (String) - Report template identifier
- data (Object) - Report configuration:
  - con (Object) - Business contact with full details
  - people (Array) - Recipient person IDs with notification settings
  - configs (Object) - Enabled scrapers (yext, semrush, etc.)
  - notifications (Object) - Email/SMS configuration
Returns: Promise<Boolean> - Success indicator
Business Logic Flow:
- Create InstaReport Document
  - Status: 'QUEUED'
  - Type: template_id
  - Business info: name, email, phone, website, address, social, category
  - Configs: enabled/disabled scrapers
  - Notifications: email/SMS settings
- Create Additional Information
  - Link to instareport_id
  - Recipients array with person IDs
  - Email/SMS enabled flags per recipient
  - Viewed tracking (default: false)
  - Timestamps: instareport_created_at
- Create Queue Entry
  - Link via reference_id to InstaReport
  - Type: 'instareport'
  - Business details for scraper payload
  - Status tracking fields
- Return Success
Key Business Rules:
- Each report must have business contact (name, address, phone minimum)
- Recipients optional but recommended for notifications
- Notification settings default to disabled if not provided
- All three documents created in sequence (not transactional)
- Facebook URL optional (defaults to empty string)
- Business category defaults to 'Other' if not specified
Error Handling:
- Rejects promise with error message if any save fails
- No rollback - partial data may exist on failure
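The defaulting rules listed under Key Business Rules can be sketched as a small normalization step that runs before any document is saved. This is a minimal illustration, not the service's actual code: `applyReportDefaults` is a hypothetical helper, and the `facebook` and `category` field names on the contact are assumptions.

```javascript
// Hypothetical helper: applies the documented defaults to the `data` argument.
// The field names `facebook` and `category` on `con` are assumed for illustration.
function applyReportDefaults(data) {
  const con = data.con || {};
  const notifications = data.notifications || {};
  return {
    business: {
      ...con,
      facebook: con.facebook || '',       // Facebook URL is optional
      category: con.category || 'Other',  // unspecified category falls back to 'Other'
    },
    people: data.people || [],            // recipients are optional
    notifications: {
      // each channel defaults to disabled unless settings are provided
      email: { enabled: false, ...notifications.email },
      sms: { enabled: false, ...notifications.sms },
    },
  };
}
```

Normalizing once up front keeps the three later writes (report, additional information, queue entry) working from the same values.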
Example Usage:
const queue = {
uid: user_id,
account_id: account_id,
parent_account: parent_account_id,
template_id: 'template_comprehensive',
};
const data = {
con: businessContact, // Full contact object
people: [{ id: person_id, email: { enabled: true }, sms: { enabled: false }, viewed: false }],
configs: { yext: true, semrush: true, facebook: true },
notifications: { email: { enabled: true }, sms: { enabled: false } },
};
await saveInstareportQueue(queue, data);
Side Effects:
- ⚠️ Creates 3 database records (InstaReport, AdditionalInfo, Queue)
- ⚠️ No transaction - failure may leave orphaned records
- ⚠️ Triggers queue manager via webhook (external side effect)
saveQueueData(queue)
Purpose: Processes CSV batch import queue by parsing CSV, creating business contacts, and queuing individual reports. Handles validation errors and tracks failures.
Parameters:
- queue (Object) - CSV queue entry:
  - _id (ObjectId) - Queue entry ID
  - account_id (ObjectId) - Account identifier
  - created_by (ObjectId) - User who uploaded CSV
  - parent_account (ObjectId) - Parent account
  - template_id (String) - Report template
  - details.key (String) - S3 key for CSV file
  - notifications (Object) - Default notification settings
Returns: Promise<Boolean> - Success indicator
Business Logic Flow:
- Parse CSV File
  - Call csvToJson(queue.details.key)
  - Returns: { valid: [...businesses], errors: [...] }
  - Separate valid businesses from CSV validation errors
- Create Business Contacts
  - Call CreateBusiness(validBusinesses, accountID, userID).save()
  - Returns: { success: [...savedBusinesses], errors: [...] }
  - Track both successful inserts and database errors
- Queue Individual Reports
  - For each saved business:
    - Extract people contacts from business
    - Build recipients array with notification settings
    - Call saveInstareportQueue() for each business
  - Use Promise.allSettled() to collect all results
- Aggregate Errors
  - Combine CSV errors + insert errors + queuing errors
  - allErrors = [...csvErrors, ...insertErrors, ...queueingErrors]
- Update CSV Queue Status
  - Increment tries counter
  - Set process_errors with all errors
  - Set is_completed: true
  - Set completed_at timestamp
  - Clear failure_reason field
- Return Success
Key Business Rules:
- CSV validation happens before business creation
- Business contact creation uses batch insert for efficiency
- Each business gets separate report queue entry
- Notification settings inherited from CSV queue unless overridden
- People contacts optional - reports can have zero recipients
- All errors tracked but don't stop processing (best-effort)
- CSV queue marked complete even if some reports fail
Error Handling:
- CSV parsing errors: tracked in process_errors
- Business insert errors: tracked in process_errors
- Queue creation errors: tracked in process_errors
- CSV queue update failure: logged but not re-thrown
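The best-effort behaviour described here (collect every failure, stop for none) might look roughly like the sketch below. `queueReport` stands in for `saveInstareportQueue()`, and the shape of each error entry is an assumption made for the example.

```javascript
// Sketch: queue one report per business and collect failures without stopping.
// `queueReport` is a stand-in for saveInstareportQueue(); the error shape is assumed.
async function queueAll(businesses, queueReport) {
  const results = await Promise.allSettled(businesses.map(b => queueReport(b)));
  const queueingErrors = [];
  results.forEach((result, i) => {
    if (result.status === 'rejected') {
      queueingErrors.push({ business: businesses[i].name, error: String(result.reason) });
    }
  });
  return queueingErrors;
}

// Merge the three error sources into the single list stored in process_errors.
function aggregateErrors(csvErrors, insertErrors, queueingErrors) {
  return [...csvErrors, ...insertErrors, ...queueingErrors];
}
```

`Promise.allSettled()` never rejects, which is what lets the CSV queue entry be marked complete even when some reports fail to queue.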
CSV Format Expected:
Business Name,Email,Phone,Address,Website,Facebook,Category,People
"Acme Corp","info@acme.com","555-0100","123 Main St","acme.com","fb.com/acme","Automotive","person_id_1,person_id_2"
Example Usage:
const csvQueue = await InstaReportQueue.findOne({ type: 'instareport-csv' });
await saveQueueData(csvQueue);
// Creates individual reports for each valid business in CSV
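For the "build recipients array" step, one way to turn the CSV People column into recipients is sketched below. The recipient shape mirrors the `saveInstareportQueue()` example; `buildRecipients` is a hypothetical helper, and inheriting the enabled flags from the CSV queue's notification settings follows the rule stated above rather than confirmed code.

```javascript
// Sketch: convert a comma-separated "People" cell into recipient entries.
// Enabled flags are inherited from the CSV queue's default notification settings.
function buildRecipients(peopleCell, notifications = {}) {
  return String(peopleCell || '')
    .split(',')
    .map(id => id.trim())
    .filter(Boolean) // an empty cell yields zero recipients, which is allowed
    .map(id => ({
      id,
      email: { enabled: Boolean(notifications.email && notifications.email.enabled) },
      sms: { enabled: Boolean(notifications.sms && notifications.sms.enabled) },
      viewed: false,
    }));
}
```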
Side Effects:
- ⚠️ Creates multiple business contacts in CRM
- ⚠️ Creates N report records (N = valid businesses)
- ⚠️ Updates CSV queue entry
- ⚠️ Triggers N webhooks to queue manager
- ⚠️ Long-running operation for large CSVs (use background queue)
Report Processing
UpdateStatus(ID)
Purpose: Orchestrates report status updates based on scraper completion. Calculates scores when all scrapers complete, handles retries on failures, and manages status transitions (QUEUED → GENERATED or RETRYING).
Parameters:
- ID (ObjectId) - InstaReport identifier
Returns: Promise<String> - New status: 'GENERATED', 'RETRYING', or undefined
Business Logic Flow:
- Fetch Report
  - Find InstaReport by ID
  - Check scrapers object for platform statuses
- Check Each Scraper Status
  - Yext: If COMPLETED, calculate scores.yext via YextScore()
  - PageSpeed: If COMPLETED, calculate scores.pageSpeed via PageSpeedScore()
  - SEMrush: If COMPLETED, calculate scores.seo and scores.googleAds
  - GoogleMap: Check completion status
  - Yelp: Check completion status
  - Facebook: Check completion status
  - FacebookAds: Check completion status
- All Scrapers Complete (SUCCESS path):
  - Determine industry category (validate against 24 industries, default 'Other')
  - Fetch industry averages from InstareportsIndustryAverage
  - Calculate scores.reviews via ReviewsScore(report, industryAvg)
  - Calculate scores.social via SocialScore(report, industryAvg)
  - Update report: status: 'GENERATED', generated_at: Date.now()
  - Update queue: is_completed: true
  - Update additional_information: instareport_generated_at: Date.now()
  - Return 'GENERATED'
- Any Scraper Failed (RETRY path):
  - Determine which scraper(s) failed
  - For each failed scraper:
    - Remove scraper data: $unset: { 'scrapers.{platform}': '' }
    - Clear scores: $unset: { scores: '' }
  - Update report: status: 'RETRYING'
  - Update queue: increment tries, set scheduled: false
  - Return 'RETRYING'
- Not All Complete (PENDING path):
  - Return undefined (no status change)
Scraper Status Evaluation:
// Example scraper object structure
{
yext: { status: 'COMPLETED', data: {...} },
semrush: { status: 'COMPLETED', data: {...} },
page_speed: { status: 'FAILED', error: 'Timeout' },
facebook: { status: 'COMPLETED', data: {...} },
// ... other scrapers
}
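Given a scraper object like the one above, the three outcomes (GENERATED, RETRYING, or no change) can be sketched as a pure function. This is a simplified reading of the rules in this section, not the service's implementation. Only `yext`, `semrush`, `page_speed`, and `facebook` appear as keys in the example; the remaining key names are assumptions.

```javascript
// Assumed scraper keys: only yext, semrush, page_speed and facebook are shown
// in the example object; yelp, google_map and facebook_ads are guesses.
const EXPECTED_SCRAPERS = ['yext', 'semrush', 'page_speed', 'facebook', 'yelp', 'google_map', 'facebook_ads'];

function resolveStatus(scrapers = {}, expected = EXPECTED_SCRAPERS) {
  const statuses = expected.map(key => (scrapers[key] || {}).status);
  const finished = statuses.every(s => s === 'COMPLETED' || s === 'FAILED');
  if (!finished) return undefined; // PENDING path: wait for the remaining scrapers
  return statuses.includes('FAILED') ? 'RETRYING' : 'GENERATED';
}
```

Keeping the decision separate from the database writes makes the status transitions easy to unit test.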
Score Calculation Triggers:
- Yext Score: Listing accuracy percentage
- PageSpeed Score: Performance, accessibility, best practices, SEO
- SEO Score: Organic keywords, traffic, backlinks
- GoogleAds Score: Paid keywords, ad positioning
- Reviews Score: Average rating vs industry benchmark
- Social Score: Facebook engagement vs industry benchmark
Industry Categories (24 total):
[
'Active Life',
'Arts & Entertainment',
'Automotive',
'Beauty & Spas',
'Education',
'Event Planning & Services',
'Financial Services',
'Food',
'Health & Medical',
'Home Services',
'Hotels & Travel',
'Industrial Goods & Manufacturing',
'Local Services',
'Mass Media',
'Mining & Agriculture',
'Nigthlife',
'Other',
'Pets',
'Professional Services',
'Public Services & Government',
'Real Estate',
'Religious Organizations',
'Restaurants',
'Shopping',
];
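Category normalization against this list could be sketched as follows. The "alternate format" step is interpreted here as retrying with the text before ' &', which is one plausible reading and not confirmed by the source. `normalizeIndustry` is a hypothetical name, and the list is abbreviated for the example.

```javascript
// Abbreviated list for illustration; the real list has 24 entries.
const SAMPLE_INDUSTRIES = ['Automotive', 'Food', 'Health & Medical', 'Other'];

function normalizeIndustry(category, industries = SAMPLE_INDUSTRIES) {
  if (industries.includes(category)) return category; // exact match
  // Assumed alternate format: keep only the part before ' &'
  const head = String(category || '').split(' &')[0].trim();
  return industries.includes(head) ? head : 'Other';  // default when nothing matches
}
```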
Key Business Rules:
- All 7 scrapers must complete (success or fail) before status update
- Failed scrapers trigger full retry (all scrapers re-run, not just failed)
- Scores only calculated when ALL scrapers succeed
- Industry category normalized to ensure benchmark match
- Retry increments queue tries counter (external retry limit applied)
- Partial completion returns undefined (wait for more scrapers)
Example Usage:
// Called by queue manager when scraper completes
const newStatus = await UpdateStatus(report_id);
if (newStatus === 'GENERATED') {
// Trigger notifications
await sendNotifications(report_id);
} else if (newStatus === 'RETRYING') {
// Schedule retry in queue manager
await scheduleRetry(report_id);
}
Side Effects:
- ⚠️ Updates InstaReport document (status, scores, generated_at)
- ⚠️ Updates Queue document (is_completed, tries, scheduled)
- ⚠️ Updates AdditionalInformation document (instareport_generated_at)
- ⚠️ May remove scraper data on retry (destructive)
- ⚠️ Queries industry averages collection
Report Retrieval
reportByID(ID)
Purpose: Retrieves comprehensive report data with full details, user info, account branding, industry benchmarks, and calculated sections. Primary endpoint for report viewing/downloading.
Parameters:
- ID (ObjectId) - InstaReport identifier
Returns: Promise<Object> - Full report data or error:
{
status: 'Success',
data: {
status: 'GENERATED',
userDetails: {
name: 'John Doe',
email: 'john@example.com',
businessName: 'Acme Marketing',
businessImage: 'https://...'
},
get_started_url: 'https://...',
businessDetails: { /* Business info */ },
listings: { /* Yext listings */ },
reviews: { /* Google/Yelp reviews */ },
social: { /* Facebook data */ },
website: { /* PageSpeed metrics */ },
seo: { /* SEMrush SEO data */ },
googleAds: { /* SEMrush Ads data */ },
facebookAds: { /* Facebook Ads */ },
overallScore: { /* Aggregate scores */ },
configs: { /* Enabled sections */ }
}
}
Business Logic Flow:
- Fetch Core Data
  - Find InstaReport by ID
  - Find User config (type: 'instareports') for preferences
  - Find Yext publisher logos (all records cached in Map)
- Fetch Creator Information
  - Query User: name, email, phone, image
  - Query Account: business name, business image
  - Merge into userDetails
- Determine Industry Category
  - Normalize business category against industry list
  - Try alternate format (split on ' &')
  - Default to 'Other' if no match
  - Fetch industry averages
- Build Config Fallback
  - If old report missing details.configs:
    - Enable all scrapers by default
    - Ensures backward compatibility
- Calculate Report Sections (via utils):
  - businessDetails: utils.businessDetails(report) - name, address, photos
  - listings: utils.listingsDetails(report) - Yext data with logo mapping
  - reviews: utils.reviewDetails(report, industryAvg) - Google/Yelp ratings
  - social: utils.socialDetails(report, industryAvg) - Facebook engagement
  - website: utils.pageSpeedDetails(report) - Performance scores
  - seo: utils.seoDetails(report) - Organic traffic, keywords
  - googleAds: utils.gogoleAdsDetails(report) - Paid traffic
  - facebookAds: utils.facebookAdsDetails(report) - Facebook ad library
  - overallScore: utils.overallScore(report) - Aggregate percentage
- Map Yext Publisher Logos
  - Replace logoURL with mapped S3 URL
  - Keep original URL as logoURLOriginal
  - Fallback to original if mapping not found
- Return Complete Report
Config Conditionals (Sections Only If Enabled):
listings: report.details.configs.yext ? data : null;
reviews: report.details.configs.google_map ? data : null;
social: report.details.configs.facebook ? data : null;
website: report.details.configs.page_speed ? data : null;
seo: report.details.configs.seo ? data : null;
googleAds: report.details.configs.google_ads ? data : null;
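The conditionals above, combined with the fallback for old reports, can be sketched as a table-driven gate. The section-to-config mapping is taken from the conditionals. `buildSections`, `resolveConfigs`, and the `builders` argument are hypothetical, standing in for the `utils.*` functions.

```javascript
// Section name -> config key, as listed in the conditionals above.
const SECTION_CONFIG_KEY = {
  listings: 'yext',
  reviews: 'google_map',
  social: 'facebook',
  website: 'page_speed',
  seo: 'seo',
  googleAds: 'google_ads',
};

// Old reports without details.configs get every section enabled (assumed fallback shape).
function resolveConfigs(report) {
  const configs = report.details && report.details.configs;
  if (configs) return configs;
  return Object.fromEntries(Object.values(SECTION_CONFIG_KEY).map(key => [key, true]));
}

// `builders` maps each section name to a function(report) that returns its data.
function buildSections(report, builders) {
  const configs = resolveConfigs(report);
  const sections = {};
  for (const [section, key] of Object.entries(SECTION_CONFIG_KEY)) {
    sections[section] = configs[key] ? builders[section](report) : null;
  }
  return sections;
}
```

Because a disabled section's builder is never called, the gate also skips the computation, which matches the performance note in the business rules.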
Logo Mapping Logic:
const logosMap = new Map();
// Populate: logosMap.set('Google', 'https://bucket/google-logo.png')
// Usage: logoURL = logosMap.get(siteName) || originalLogoURL
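A fuller version of this mapping might look like the sketch below. The `{ name, url }` shape of the logo records and the `siteName` field on listings are assumptions based on the names used in this section.

```javascript
// Sketch: replace each listing's logoURL with the mapped one, keeping the original.
// Logo records are assumed to look like { name, url }.
function mapPublisherLogos(listings, logos) {
  const logosMap = new Map(logos.map(logo => [logo.name, logo.url]));
  return listings.map(listing => ({
    ...listing,
    logoURLOriginal: listing.logoURL,
    // fall back to the scraper-provided URL when no mapping exists
    logoURL: logosMap.get(listing.siteName) || listing.logoURL,
  }));
}
```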
Key Business Rules:
- Configs determine which sections calculated (performance optimization)
- Industry category must match exactly for benchmarks
- User config preferences override defaults (videos, get_started_url)
- Old reports without configs get all sections enabled
- Logo mapping enhances UI but doesn't block if missing
- Report can be retrieved in any status (QUEUED, RETRYING, GENERATED)
Error Handling:
- { status: 'Failed', error: 404, message: 'Report ID not found' } - Invalid ID
- Re-throws any unexpected errors to caller
Example Usage:
const reportData = await reportByID('60a7f8d5e4b0d8f3a4c5e1b2');
if (reportData.status === 'Success') {
// Display report to user
renderReport(reportData.data);
} else {
// Show error
showError(reportData.message);
}
Side Effects:
- ⚠️ Queries multiple collections (InstaReport, User, Account, Config, YextLogos, IndustryAvg)
- ⚠️ Executes complex utils calculations (CPU intensive)
- ⚠️ No caching - recalculates on every request
getAllReports(account_id, page, limit, search, order, sort_by, statuses, auth)
Purpose: Retrieves paginated list of reports with filtering, sorting, and contact visibility rules. Supports CRM visibility settings (entire/owner_followers).
Parameters:
- account_id (ObjectId) - Account identifier
- page (Number) - Page number (1-indexed)
- limit (Number) - Records per page
- search (String, optional) - Search text (name, industry, phone, email)
- order (Number, optional) - Sort order: 1 (asc) or -1 (desc)
- sort_by (String, optional) - Sort field (companyName, created, status, priority, etc.)
- statuses (Array, optional) - Filter by status ['QUEUED', 'GENERATED']
- auth (Object) - Authentication context:
  - user.is_owner (Boolean) - Account owner flag
  - account.preferences.contacts.visibility (String) - Visibility setting
  - uid (ObjectId) - Current user ID
Returns: Promise<Object> - Paginated results:
{
count: 1234, // Total matching records
result: [ // Current page records
{
id: ObjectId,
status: 'GENERATED',
created: Date,
generated_at: Date,
business: {
name: 'Acme Corp',
industry: 'Automotive',
phone: '555-0100',
email: 'info@acme.com',
address: { /* Full address */ }
},
creator_details: {
name: 'John Doe',
email: 'john@example.com'
},
details: {
business_info: { /* Business info */ },
configs: { /* Enabled scrapers */ },
notifications: { /* Email/SMS settings */ }
}
}
]
}
Business Logic Flow:
- Build Base Query
  - Match: account_id, orphaned: { $ne: true }
  - Project: Exclude scrapers (large field, not needed)
- Apply Contact Visibility Rules
  - Owner OR visibility='entire': All contacts visible
    - Simple lookup: localField: 'details.business_info.id'
  - visibility='owner_followers': Filtered contacts
    - Complex lookup with conditions:
      - Contact visibility: 'all' OR
      - Contact owner = current_user OR
      - Current user in contact followers array
- Join Creator Details
  - Lookup from _users collection
  - Fields: name, email (exclude password, tokens)
- Apply Status Filter (if provided)
  - Modify $match: status: { $in: statuses }
- Apply Search Filter (if provided)
  - Regex search on:
    - business.name
    - business.industry
    - business.phone
    - business.email
  - Case-insensitive ($options: 'i')
- Apply Sorting
  - Map sort field names:
    - companyName → business.name
    - created → created
    - status → status
    - city → business.address.city
    - email → business.email
    - created_by → creator_details.name
  - Default: created: -1 (newest first)
- Execute Pagination
  - Count query: Get total matching records
  - Data query: Skip + Limit
  - Sort applied before pagination
- Clean Response
  - Add id field (copy of _id)
  - Remove sensitive fields:
    - _id, __v, account_id, created_by, parent_account
    - scrapers, scores (not needed in list view)
    - creator_details.password, creator_details.reset_token
- Return Results with Count
Contact Visibility Pipeline (owner_followers):
{
$lookup: {
from: 'crm.contacts',
let: { business_id: '$details.business_info.id' },
pipeline: [
{
$match: {
$and: [
{ $expr: { $eq: ['$_id', '$$business_id'] } }
],
$or: [
{ $expr: { $eq: ['$visibility', 'all'] } },
{ $expr: { $eq: ['$owner', current_user_id] } },
{ $expr: { $in: [current_user_id, '$followers'] } }
]
}
}
],
as: 'business'
}
}
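Choosing between the simple and the restricted lookup could be sketched as below. The restricted branch reuses the pipeline shown above. `buildBusinessLookup` is a hypothetical name, `foreignField: '_id'` is inferred from the pipeline's `$eq` condition, and treating every non-'entire' setting as restricted is an assumption.

```javascript
// Sketch: pick the contact lookup stage from the auth context.
function buildBusinessLookup(auth) {
  const visibility = auth.account?.preferences?.contacts?.visibility;
  if (auth.user?.is_owner || visibility === 'entire') {
    // Owners and 'entire' visibility: plain join, no per-contact filtering
    return {
      $lookup: {
        from: 'crm.contacts',
        localField: 'details.business_info.id',
        foreignField: '_id', // inferred, not stated in the source
        as: 'business',
      },
    };
  }
  const uid = auth.uid;
  return {
    $lookup: {
      from: 'crm.contacts',
      let: { business_id: '$details.business_info.id' },
      pipeline: [
        {
          $match: {
            $and: [{ $expr: { $eq: ['$_id', '$$business_id'] } }],
            $or: [
              { $expr: { $eq: ['$visibility', 'all'] } },
              { $expr: { $eq: ['$owner', uid] } },
              { $expr: { $in: [uid, '$followers'] } },
            ],
          },
        },
      ],
      as: 'business',
    },
  };
}
```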
Sortable Fields:
| Frontend Field | Database Field |
|---|---|
| companyName | business.name |
| created | created |
| status | status |
| priority | priority |
| city | business.address.city |
| state | business.address.state_province |
| zip | business.address.postal_code |
| email | business.email |
| phone | business.phone |
| created_by | creator_details.name |
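The field mapping in this table and the 1-indexed pagination can be combined into the tail of the aggregation pipeline, roughly as sketched below. `buildPaging` is a hypothetical helper. The default limit of 25 and the fallback to the default sort for unknown `sort_by` values are assumptions.

```javascript
const SORT_FIELDS = {
  companyName: 'business.name',
  created: 'created',
  status: 'status',
  priority: 'priority',
  city: 'business.address.city',
  state: 'business.address.state_province',
  zip: 'business.address.postal_code',
  email: 'business.email',
  phone: 'business.phone',
  created_by: 'creator_details.name',
};

function buildPaging({ page = 1, limit = 25, sort_by, order } = {}) {
  const field = SORT_FIELDS[sort_by];
  // Unknown or missing sort field: newest first (the documented default)
  const sort = field ? { [field]: order === 1 ? 1 : -1 } : { created: -1 };
  // Sort first, then skip/limit, so pages are stable
  return [{ $sort: sort }, { $skip: (page - 1) * limit }, { $limit: limit }];
}
```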
Key Business Rules:
- Orphaned reports excluded (deleted business contacts)
- Contact visibility enforced at database level (secure)
- Scrapers data excluded from list view (performance)
- Owner users bypass visibility restrictions
- Search case-insensitive across 4 fields
- Default sort: newest first
- Empty result returns null, not empty array
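The four-field, case-insensitive search could be expressed as a `$match` stage like the one sketched here. Escaping the search text before using it as a regex is an addition for safety in this example; the source does not say whether the service does so. `buildSearchMatch` is a hypothetical name.

```javascript
// Escape regex metacharacters so user input is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function buildSearchMatch(search) {
  if (!search) return null; // no search text: no extra stage
  const pattern = escapeRegex(search);
  const fields = ['business.name', 'business.industry', 'business.phone', 'business.email'];
  return {
    $match: {
      $or: fields.map(field => ({ [field]: { $regex: pattern, $options: 'i' } })),
    },
  };
}
```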
Example Usage:
const reports = await getAllReports(
account_id,
1, // page
50, // limit
'Automotive', // search
-1, // order (desc)
'created', // sort_by
['GENERATED'], // statuses
auth, // auth context
);
// Returns: { count: 15, result: [...15 reports] }
Side Effects:
- ⚠️ Two aggregation queries (count + data)
- ⚠️ Complex visibility pipeline (performance consideration)
- ⚠️ No caching - every request hits database
Notification Management
sendNotifications(req, res)
Purpose: Sends email and SMS notifications to all recipients with personalized report links. Creates short URLs, attaches files, and tracks notification event IDs.
Parameters:
- req (Object) - Express request:
  - params.id (String) - InstaReport ID
  - headers.authorization (String) - JWT token
- res (Object) - Express response (unused, async function)
Returns: Promise<void> - No return value (async)
Business Logic Flow:
- Fetch Report and Recipients
  - Find InstaReport by ID
  - Find AdditionalInformation with recipients array
  - Extract recipient person IDs
- Fetch Account Domain
  - Find parent Account by parent_account
  - Extract domain for URL generation
- Fetch Contact Details
  - Query Contacts where _id IN recipient_ids AND type: 'people'
  - Get: name, email, phone for each recipient
- Process Each Recipient
  - For each contact, send notifications if enabled:
  - Email Notification (if enabled):
    - Generate Report URL:
      - Format: {domain}/reports/view/{report_id}/{person_id}
      - Get active domain via utilities.getActiveDomain()
    - Shorten URL:
      - POST to URL shortener API: https://playground-api.dashclicks.com/v1/url
      - Returns: urlme.app/{code}
    - Build Email:
      - Subject: from report.details.notifications.email.subject
      - From: report.details.notifications.email.from
      - Reply-to: report.details.notifications.email.reply_to
      - Recipient: contact email
      - Content: HTML from report.details.notifications.email.content
    - Attach Files (if configured):
      - For each attachment in email.attachments:
        - Generate S3 URL: {WASABI_PUBLIC_IMAGE_DOWNLOAD}/{key}
        - Download file as buffer via generateFileBuffer()
        - Add to emailData.files array
    - Send Email:
      - Call new Mail().send() with custom fields:
        - InstaReports-InstaReport preview link: short URL
      - Fallback values from email.fallback_entity
    - Track Event:
      - Update AdditionalInformation:
        - Set recipients.$.email.eventID with Communication ID
  - SMS Notification (if enabled):
    - Generate Report URL (same as email)
    - Shorten URL (same as email)
    - Build SMS:
      - From: report.details.notifications.sms.from
      - To: contact phone
      - Content: report.details.notifications.sms.content
    - Send SMS:
      - Call new SMS().send() with custom fields:
        - InstaReports-InstaReport preview link: short URL
      - Fallback values from sms.fallback_entity
    - Track Event:
      - Update AdditionalInformation:
        - Set recipients.$.sms.eventID with Communication ID
- Complete Processing
Custom Fields (Merge Tags):
customFields: [
{
field: 'InstaReports-InstaReport preview link',
value: 'https://urlme.app/abc123',
},
];
Fallback Values (for missing merge tags):
fallbackValues: {
'Business Name': report.details.business_info.name,
'Business Email': report.details.business_info.email,
'Contact Name': contact.name,
// ... other custom fallbacks
}
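To illustrate how custom fields and fallback values interact, the sketch below resolves merge tags in a template string. The real substitution happens inside the Mail and SMS utilities, whose tag syntax is not shown in this document. The `{{Tag}}` delimiter and the `renderTemplate` helper are purely hypothetical.

```javascript
// Hypothetical renderer: the {{Tag}} syntax is an assumption, not the utilities' format.
function renderTemplate(content, customFields, fallbackValues = {}) {
  const values = new Map(customFields.map(f => [f.field, f.value]));
  return content.replace(/\{\{\s*([^}]+?)\s*\}\}/g, (match, tag) => {
    const value = values.get(tag);
    if (value) return value; // a custom field wins when it has a value
    if (Object.prototype.hasOwnProperty.call(fallbackValues, tag)) {
      return fallbackValues[tag]; // otherwise use the fallback
    }
    return match; // unknown tags are left untouched
  });
}
```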
File Attachment Flow:
// Config structure
email.attachments: [
{ key: 'path/to/file.pdf', name: 'Brochure.pdf', type: 'application/pdf' }
]
// Generated URL
attachment.url = 'https://wasabi.com/bucket/path/to/file.pdf'
// Download buffer
bufferData = await generateFileBuffer(attachment)
// Attach to email
emailData.files = [bufferData]
Key Business Rules:
- Only recipients with email/SMS enabled receive notifications
- Each recipient gets personalized link (includes person_id for tracking)
- URL shortening required (long URLs don't fit in SMS)
- File attachments downloaded on-demand (not stored in DB)
- Event IDs tracked for delivery status and analytics
- Custom fields allow merge tags in email/SMS templates
- Fallback values ensure templates render even if data missing
Error Handling:
- File attachment errors logged but don't stop notification
- URL shortener failure would break notification (no fallback)
- Email/SMS send failures logged via Mail/SMS utilities
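The per-recipient rules (personalized link, channel only when enabled) can be sketched as a planning step that runs before anything is sent. `buildReportUrl` and `planNotifications` are hypothetical helpers. The URL format comes from this section, and the recipient shape follows the `saveInstareportQueue()` example.

```javascript
// Format from this section: {domain}/reports/view/{report_id}/{person_id}
function buildReportUrl(domain, reportId, personId) {
  const base = domain.replace(/\/+$/, ''); // tolerate a trailing slash on the domain
  return `${base}/reports/view/${reportId}/${personId}`;
}

// Returns one entry per (recipient, enabled channel) pair.
function planNotifications(domain, reportId, recipients) {
  const plan = [];
  for (const recipient of recipients) {
    const url = buildReportUrl(domain, reportId, recipient.id);
    if (recipient.email && recipient.email.enabled) {
      plan.push({ personId: recipient.id, channel: 'email', url });
    }
    if (recipient.sms && recipient.sms.enabled) {
      plan.push({ personId: recipient.id, channel: 'sms', url });
    }
  }
  return plan;
}
```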
Example Usage:
// Called after report status = GENERATED
await sendNotifications(req, res);
// Result: All enabled recipients receive email/SMS
// AdditionalInformation updated with event IDs
Side Effects:
- ⚠️ Sends N emails (N = recipients with email enabled)
- ⚠️ Sends M SMS (M = recipients with SMS enabled)
- ⚠️ Creates N+M Communication records
- ⚠️ Creates N+M URL shortener entries
- ⚠️ Downloads P files from S3 (P = unique attachments)
- ⚠️ Updates AdditionalInformation with event IDs
- ⚠️ May trigger external webhook/integration events
Integration Points
External Services
Queue Manager Webhook
- Purpose: Triggers report scraping workflow
- Endpoint: {QM_WEBHOOK_URL}/instareport
- Method: POST
- Payload: Report metadata, business details
- Used by: saveInstareportQueue(), CSV import
URL Shortener API
- Purpose: Generate short URLs for SMS
- Endpoint: https://playground-api.dashclicks.com/v1/url
- Method: POST
- Returns: { code: 'abc123' } → https://urlme.app/abc123
- Used by: sendNotifications()
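Turning the shortener response into the final link is a one-step mapping, sketched below. `toShortUrl` is a hypothetical helper. Throwing on a missing code mirrors the documented behaviour that a shortener failure has no fallback.

```javascript
// Sketch: map the shortener response ({ code }) to the public short URL.
function toShortUrl(response) {
  if (!response || typeof response.code !== 'string' || response.code === '') {
    throw new Error('URL shortener response is missing a code');
  }
  return `https://urlme.app/${response.code}`;
}
```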
Wasabi S3 Storage
- Purpose: File attachment storage
- Bucket: WASABI_PUBLIC_IMAGE_DOWNLOAD
- Used by: sendNotifications() - download email attachments
Scraper Services (7 platforms)
- Yext: Directory listings accuracy
- SEMrush: SEO and Google Ads data
- Facebook: Social engagement metrics
- Yelp: Review data
- Google Maps: Local listing, reviews
- PageSpeed: Website performance
- Facebook Ads: Ad library search
Internal Dependencies
- shared/utilities/mail.js - Email sending via Mail class
- shared/utilities/sms.js - SMS sending via SMS class
- shared/utilities/index.js - getActiveDomain() for URL generation
- Services/utils.js - Score calculation and report formatting utilities
- Services/csvToJson.js - CSV parsing for batch import
- Services/create-business.js - Bulk business contact creation
- Services/generateFileBuffer.js - S3 file download for attachments
Shared Models
- InstaReport - Primary report documents
- InstaReportQueue - Queue management
- InstareportsAdditionalInformation - Recipients and notifications
- InstareportsIndustryAverage - Benchmark data (24 industries)
- YextPublishersLogo - Publisher logo URLs
- Contact - Business and people contacts
- User - Creator information
- Account - Account domain and business info
- Config - User preferences (videos, URLs)
Edge Cases & Special Handling
Case: Old Reports Without Configs
Condition: Report created before configs feature
Handling: Fallback to all scrapers enabled
Result: Full report generated for backward compatibility
Case: Industry Category Not Found
Condition: Business category not in 24 industry list
Handling: Try alternate format (split on ' &'), default to 'Other'
Result: Industry benchmarks from 'Other' category used
Case: Scraper Partial Completion
Condition: Some scrapers complete, others still running
Handling: UpdateStatus() returns undefined, no status change
Result: Wait for all scrapers before status transition
Case: All Scrapers Failed
Condition: Every scraper returns FAILED status
Handling: Status set to RETRYING, scrapers cleared, queue tries incremented
Result: Full re-scrape scheduled by queue manager
Case: Missing Yext Publisher Logo
Condition: Logo mapping not found for listing site
Handling: Use original logoURL from scraper
Result: Report displays with original logo (may be broken link)
Case: Contact Visibility Restriction
Condition: User not owner/follower of business contact
Handling: Business lookup returns empty, report excluded from results
Result: User cannot see report (secure)
Case: File Attachment Download Failure
Condition: S3 file missing or inaccessible
Handling: Error logged, attachment skipped, email still sent
Result: Notification delivered without attachment
Case: URL Shortener Failure
Condition: Shortener API unavailable
Handling: Request throws error, notification fails
Result: Recipient doesn't receive notification (no fallback)
Case: Recipient Missing Email/Phone
Condition: Contact has no email (email enabled) or no phone (SMS enabled)
Handling: Communication utilities handle validation
Result: Notification silently skipped for that recipient
Case: CSV Import with All Invalid Businesses
Condition: Every row in CSV fails validation
Handling: No reports queued, all errors tracked in process_errors
Result: CSV marked complete with zero reports generated
Important Notes
- No Transactions: Multi-collection operations not atomic (potential orphaned records)
- Status Flow: QUEUED → RETRYING (on failure) → GENERATED (on success)
- 7 Scrapers: All must complete before status update
- Retry Logic: Failed scrapers trigger full re-scrape (not incremental)
- Score Calculation: Only when ALL scrapers succeed
- 24 Industries: Benchmarks cover common business categories
- Personalized Links: Each recipient gets unique URL with person_id
- URL Shortening: Required for SMS (long URLs don't fit)
- File Attachments: Downloaded on-demand, not stored in DB
- Visibility Rules: Contact permissions enforced at database level
- Performance: No caching, complex aggregations may be slow
- Error Handling: Best-effort processing, errors logged but not blocking
Related Documentation
- Parent Module: InstaReports Module
- Related Service: Reporting Service
- Utilities: Utils Service (Score Calculation)
- Controller: internal/api/v1/instareports/Controllers/instareport.js
- Routes: internal/api/v1/instareports/Routes/index.js
- Queue Manager: Queue Processing Documentation
- Models:
  - InstaReport
  - InstaReport Queue
  - Additional Information