InstaReports Service

📖 Overview

internal/api/v1/instareports/Services/instareport.js manages the complete lifecycle of InstaReports - comprehensive business audit reports that analyze digital presence across 7 platforms (Yext, SEMrush, Facebook, Yelp, Google Maps, PageSpeed, Facebook Ads). Handles queue management, scraper orchestration, score calculation, report retrieval, and multi-channel notifications.

File Path: internal/api/v1/instareports/Services/instareport.js

🗄️ Collections Used

📚 Full Schema: See Database Collections Documentation

instareports

  • Operations: Read/Write for report lifecycle management
  • Model: shared/models/instareports.js
  • Usage Context: Primary collection storing report status, business info, scraper results, calculated scores

instareports.queue

  • Operations: Read/Write for queue processing
  • Model: shared/models/instareports-queue.js
  • Usage Context: Manages report generation queue with retry logic and CSV batch imports

instareports.additional_information

  • Operations: Read/Write for recipient tracking
  • Model: shared/models/instareports-additional-information.js
  • Usage Context: Stores recipients, notification settings, view tracking, email/SMS event IDs

instareports.industry_average

  • Operations: Read for benchmark comparisons
  • Model: shared/models/instareports-industry-average.js
  • Usage Context: Industry benchmarks for review and social scores (24 industries)

Yext publisher logos

  • Operations: Read for listing publisher logos
  • Model: shared/models/yext-publishers-logo.js
  • Usage Context: Logo URLs for Yext listing publishers (Google, Facebook, Yelp, etc.)

crm.contacts

  • Operations: Read for business and people data
  • Model: shared/models/contact.js
  • Usage Context: Business information and recipient contact details

_users

  • Operations: Read for creator information
  • Model: shared/models/user.js
  • Usage Context: Report creator details and notification sender info

_accounts

  • Operations: Read for account domain and business info
  • Model: shared/models/account.js
  • Usage Context: Account settings, domain for report URLs, business branding

user_configs

  • Operations: Read for report configuration
  • Model: shared/models/user-config.js
  • Usage Context: User preferences for videos, get-started URL, report customization

🔄 Data Flow

sequenceDiagram
participant User
participant Service
participant Queue as Queue Manager
participant Scrapers
participant DB as Database
participant Notify as Notifications

User->>Service: saveInstareportQueue()
Service->>DB: Create InstaReport (QUEUED)
Service->>DB: Create AdditionalInfo
Service->>DB: Create QueueEntry
Service->>Queue: Webhook trigger

Queue->>Scrapers: Parallel scraping
Note over Scrapers: Yext, SEMrush, Facebook,<br/>Yelp, GoogleMaps,<br/>PageSpeed, FacebookAds

Scrapers->>DB: Update scrapers.{platform}
Scrapers->>Service: UpdateStatus()

Service->>DB: Calculate scores
Service->>DB: Update status: GENERATED

Service->>Notify: sendNotifications()
Notify->>User: Email with report link
Notify->>User: SMS with short URL

🔧 Business Logic & Functions

Queue Management

saveInstareportQueue(queue, data)

Purpose: Creates a complete report entry: an InstaReport record, an AdditionalInformation record for recipients, and a Queue entry for processing. Orchestrates the initial setup for report generation.

Parameters:

  • queue (Object) - Queue metadata:
    • account_id (ObjectId) - Account identifier
    • uid (ObjectId) - User creating report
    • parent_account (ObjectId) - Parent account for sub-accounts
    • template_id (String) - Report template identifier
  • data (Object) - Report configuration:
    • con (Object) - Business contact with full details
    • people (Array) - Recipient person IDs with notification settings
    • configs (Object) - Enabled scrapers (yext, semrush, etc.)
    • notifications (Object) - Email/SMS configuration

Returns: Promise<Boolean> - Success indicator

Business Logic Flow:

  1. Create InstaReport Document

    • Status: 'QUEUED'
    • Type: template_id
    • Business info: name, email, phone, website, address, social, category
    • Configs: enabled/disabled scrapers
    • Notifications: email/SMS settings
  2. Create Additional Information

    • Link to instareport_id
    • Recipients array with person IDs
    • Email/SMS enabled flags per recipient
    • Viewed tracking (default: false)
    • Timestamps: instareport_created_at
  3. Create Queue Entry

    • Link via reference_id to the InstaReport
    • Type: 'instareport'
    • Business details for scraper payload
    • Status tracking fields
  4. Return Success

Key Business Rules:

  • Each report must have business contact (name, address, phone minimum)
  • Recipients optional but recommended for notifications
  • Notification settings default to disabled if not provided
  • All three documents created in sequence (not transactional)
  • Facebook URL optional (defaults to empty string)
  • Business category defaults to 'Other' if not specified
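
As a rough illustration of the defaults listed above, the sketch below builds the three records as plain objects before anything is saved. `buildReportDocuments` is a hypothetical helper, and the field layout (for example `details.business_info`, `tries`, `is_completed` on the queue entry) is assumed for illustration; only the default values come from the rules above. Persistence and the webhook call are left out.

```javascript
// Hypothetical helper: builds the three records as plain objects (no DB access).
// The field layout is assumed; only the defaults follow the rules listed above.
function buildReportDocuments(queue, data, reportId) {
  const con = data.con || {};
  const disabled = { email: { enabled: false }, sms: { enabled: false } };

  const report = {
    _id: reportId,
    status: 'QUEUED',
    type: queue.template_id,
    account_id: queue.account_id,
    created_by: queue.uid,
    details: {
      business_info: {
        name: con.name,
        email: con.email,
        phone: con.phone,
        website: con.website,
        address: con.address,
        facebook: con.facebook || '', // Facebook URL is optional
        category: con.category || 'Other', // category falls back to 'Other'
      },
      configs: data.configs || {},
      notifications: data.notifications || disabled, // disabled unless provided
    },
  };

  const additionalInfo = {
    instareport_id: reportId,
    recipients: (data.people || []).map((p) => ({
      id: p.id,
      email: { enabled: Boolean(p.email && p.email.enabled) },
      sms: { enabled: Boolean(p.sms && p.sms.enabled) },
      viewed: false, // view tracking starts as false
    })),
  };

  const queueEntry = {
    reference_id: reportId, // links the queue entry back to the report
    type: 'instareport',
    account_id: queue.account_id,
    tries: 0,
    is_completed: false,
  };

  return { report, additionalInfo, queueEntry };
}
```

Because the real service saves the three records one after another without a transaction, building them up front like this would at least surface missing input before the first write.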

Error Handling:

  • Rejects promise with error message if any save fails
  • No rollback - partial data may exist on failure

Example Usage:

const queue = {
  uid: user_id,
  account_id: account_id,
  parent_account: parent_account_id,
  template_id: 'template_comprehensive',
};

const data = {
  con: businessContact, // Full contact object
  people: [{ id: person_id, email: { enabled: true }, sms: { enabled: false }, viewed: false }],
  configs: { yext: true, semrush: true, facebook: true },
  notifications: { email: { enabled: true }, sms: { enabled: false } },
};

await saveInstareportQueue(queue, data);

Side Effects:

  • ⚠️ Creates 3 database records (InstaReport, AdditionalInfo, Queue)
  • ⚠️ No transaction - failure may leave orphaned records
  • ⚠️ Triggers queue manager via webhook (external side effect)

saveQueueData(queue)

Purpose: Processes CSV batch import queue by parsing CSV, creating business contacts, and queuing individual reports. Handles validation errors and tracks failures.

Parameters:

  • queue (Object) - CSV queue entry:
    • _id (ObjectId) - Queue entry ID
    • account_id (ObjectId) - Account identifier
    • created_by (ObjectId) - User who uploaded CSV
    • parent_account (ObjectId) - Parent account
    • template_id (String) - Report template
    • details.key (String) - S3 key for CSV file
    • notifications (Object) - Default notification settings

Returns: Promise<Boolean> - Success indicator

Business Logic Flow:

  1. Parse CSV File

    • Call csvToJson(queue.details.key)
    • Returns: { valid: [...businesses], errors: [...] }
    • Separate valid businesses from CSV validation errors
  2. Create Business Contacts

    • Call CreateBusiness(validBusinesses, accountID, userID).save()
    • Returns: { success: [...savedBusinesses], errors: [...] }
    • Track both successful inserts and database errors
  3. Queue Individual Reports

    • For each saved business:
      • Extract people contacts from business
      • Build recipients array with notification settings
      • Call saveInstareportQueue() for each business
    • Use Promise.allSettled() to collect all results
  4. Aggregate Errors

    • Combine CSV errors + insert errors + queuing errors
    • allErrors = [...csvErrors, ...insertErrors, ...queueingErrors]
  5. Update CSV Queue Status

    • Increment tries counter
    • Set process_errors with all errors
    • Set is_completed: true
    • Set completed_at timestamp
    • Clear failure_reason field
  6. Return Success

Key Business Rules:

  • CSV validation happens before business creation
  • Business contact creation uses batch insert for efficiency
  • Each business gets separate report queue entry
  • Notification settings inherited from CSV queue unless overridden
  • People contacts optional - reports can have zero recipients
  • All errors tracked but don't stop processing (best-effort)
  • CSV queue marked complete even if some reports fail

Error Handling:

  • CSV parsing errors: tracked in process_errors
  • Business insert errors: tracked in process_errors
  • Queue creation errors: tracked in process_errors
  • CSV queue update failure: logged but not re-thrown
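
The error aggregation and final queue update described above could be sketched as follows. Both helper names are hypothetical, and using `$unset` to clear `failure_reason` is an assumption (the source only says the field is cleared). The `settled` argument is the array that `Promise.allSettled()` returns for the queuing step.

```javascript
// Hypothetical helper: merges the three error sources into one list.
// `settled` is the result array from Promise.allSettled() for the queuing step.
function aggregateErrors(csvErrors, insertErrors, settled) {
  const queueingErrors = settled
    .filter((r) => r.status === 'rejected')
    .map((r) => (r.reason instanceof Error ? r.reason.message : String(r.reason)));
  return [...csvErrors, ...insertErrors, ...queueingErrors];
}

// Hypothetical update payload for the CSV queue entry once processing ends.
// Clearing failure_reason via $unset is an assumption.
function buildCsvQueueUpdate(allErrors, now = new Date()) {
  return {
    $inc: { tries: 1 },
    $set: { process_errors: allErrors, is_completed: true, completed_at: now },
    $unset: { failure_reason: '' },
  };
}
```

The entry is marked complete regardless of how many errors were collected, which matches the best-effort rule: a partially failed CSV is finished, not retried.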

CSV Format Expected:

Business Name,Email,Phone,Address,Website,Facebook,Category,People
"Acme Corp","info@acme.com","555-0100","123 Main St","acme.com","fb.com/acme","Automotive","person_id_1,person_id_2"

Example Usage:

const csvQueue = await InstaReportQueue.findOne({ type: 'instareport-csv' });
await saveQueueData(csvQueue);
// Creates individual reports for each valid business in CSV

Side Effects:

  • ⚠️ Creates multiple business contacts in CRM
  • ⚠️ Creates N report records (N = valid businesses)
  • ⚠️ Updates CSV queue entry
  • ⚠️ Triggers N webhooks to queue manager
  • ⚠️ Long-running operation for large CSVs (use background queue)

Report Processing

UpdateStatus(ID)

Purpose: Orchestrates report status updates based on scraper completion. Calculates scores when all scrapers complete, handles retries on failures, and manages status transitions (QUEUED → GENERATED or RETRYING).

Parameters:

  • ID (ObjectId) - InstaReport identifier

Returns: Promise<String> - New status: 'GENERATED', 'RETRYING', or undefined

Business Logic Flow:

  1. Fetch Report

    • Find InstaReport by ID
    • Check scrapers object for platform statuses
  2. Check Each Scraper Status

    • Yext: If COMPLETED, calculate scores.yext via YextScore()
    • PageSpeed: If COMPLETED, calculate scores.pageSpeed via PageSpeedScore()
    • SEMrush: If COMPLETED, calculate scores.seo and scores.googleAds
    • GoogleMap: Check completion status
    • Yelp: Check completion status
    • Facebook: Check completion status
    • FacebookAds: Check completion status
  3. All Scrapers Complete (SUCCESS path):

    • Determine industry category (validate against 24 industries, default 'Other')
    • Fetch industry averages from InstareportsIndustryAverage
    • Calculate scores.reviews via ReviewsScore(report, industryAvg)
    • Calculate scores.social via SocialScore(report, industryAvg)
    • Update report: status: 'GENERATED', generated_at: Date.now()
    • Update queue: is_completed: true
    • Update additional_information: instareport_generated_at: Date.now()
    • Return 'GENERATED'
  4. Any Scraper Failed (RETRY path):

    • Determine which scraper(s) failed
    • For each failed scraper:
      • Remove scraper data: $unset: { 'scrapers.{platform}': '' }
      • Clear scores: $unset: { scores: '' }
    • Update report: status: 'RETRYING'
    • Update queue: increment tries, set scheduled: false
    • Return 'RETRYING'
  5. Not All Complete (PENDING path):

    • Return undefined (no status change)

Scraper Status Evaluation:

// Example scraper object structure
{
  yext: { status: 'COMPLETED', data: {...} },
  semrush: { status: 'COMPLETED', data: {...} },
  page_speed: { status: 'FAILED', error: 'Timeout' },
  facebook: { status: 'COMPLETED', data: {...} },
  // ... other scrapers
}

Score Calculation Triggers:

  • Yext Score: Listing accuracy percentage
  • PageSpeed Score: Performance, accessibility, best practices, SEO
  • SEO Score: Organic keywords, traffic, backlinks
  • GoogleAds Score: Paid keywords, ad positioning
  • Reviews Score: Average rating vs industry benchmark
  • Social Score: Facebook engagement vs industry benchmark

Industry Categories (24 total):

[
  'Active Life',
  'Arts & Entertainment',
  'Automotive',
  'Beauty & Spas',
  'Education',
  'Event Planning & Services',
  'Financial Services',
  'Food',
  'Health & Medical',
  'Home Services',
  'Hotels & Travel',
  'Industrial Goods & Manufacturing',
  'Local Services',
  'Mass Media',
  'Mining & Agriculture',
  'Nigthlife',
  'Other',
  'Pets',
  'Professional Services',
  'Public Services & Government',
  'Real Estate',
  'Religious Organizations',
  'Restaurants',
  'Shopping',
];

Key Business Rules:

  • All 7 scrapers must complete (success or fail) before status update
  • Failed scrapers trigger full retry (all scrapers re-run, not just failed)
  • Scores only calculated when ALL scrapers succeed
  • Industry category normalized to ensure benchmark match
  • Retry increments queue tries counter (external retry limit applied)
  • Partial completion returns undefined (wait for more scrapers)
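
The three outcomes (GENERATED, RETRYING, or no change) can be pictured as a small pure function over the scrapers object. This is a sketch under stated assumptions, not the service's code: `resolveStatus` and `buildRetryUnset` are hypothetical names, and only the `yext`, `semrush`, `page_speed`, and `facebook` keys appear in the example object earlier in this section; the other three key names are guesses.

```javascript
// Platform keys: yext, semrush, page_speed, and facebook come from the example
// scraper object in this section; the remaining three key names are assumed.
const PLATFORMS = ['yext', 'semrush', 'page_speed', 'facebook', 'yelp', 'google_map', 'facebook_ads'];

// Returns 'GENERATED', 'RETRYING', or undefined (still waiting on scrapers).
function resolveStatus(scrapers = {}, platforms = PLATFORMS) {
  const statuses = platforms.map((p) => (scrapers[p] ? scrapers[p].status : undefined));
  const finished = statuses.every((s) => s === 'COMPLETED' || s === 'FAILED');
  if (!finished) return undefined; // PENDING path: at least one scraper has not reported
  return statuses.includes('FAILED') ? 'RETRYING' : 'GENERATED';
}

// Builds the $unset document for the RETRY path: failed scraper data plus all scores.
function buildRetryUnset(scrapers = {}, platforms = PLATFORMS) {
  const unset = { scores: '' };
  for (const p of platforms) {
    if (scrapers[p] && scrapers[p].status === 'FAILED') unset[`scrapers.${p}`] = '';
  }
  return unset;
}
```

Keeping the decision separate from the database writes makes the rule "wait for all, retry on any failure" easy to check in isolation.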

Example Usage:

// Called by queue manager when scraper completes
const newStatus = await UpdateStatus(report_id);

if (newStatus === 'GENERATED') {
  // Trigger notifications
  await sendNotifications(report_id);
} else if (newStatus === 'RETRYING') {
  // Schedule retry in queue manager
  await scheduleRetry(report_id);
}

Side Effects:

  • ⚠️ Updates InstaReport document (status, scores, generated_at)
  • ⚠️ Updates Queue document (is_completed, tries, scheduled)
  • ⚠️ Updates AdditionalInformation document (instareport_generated_at)
  • ⚠️ May remove scraper data on retry (destructive)
  • ⚠️ Queries industry averages collection

Report Retrieval

reportByID(ID)

Purpose: Retrieves comprehensive report data with full details, user info, account branding, industry benchmarks, and calculated sections. Primary endpoint for report viewing/downloading.

Parameters:

  • ID (ObjectId) - InstaReport identifier

Returns: Promise<Object> - Full report data or error:

{
  status: 'Success',
  data: {
    status: 'GENERATED',
    userDetails: {
      name: 'John Doe',
      email: 'john@example.com',
      businessName: 'Acme Marketing',
      businessImage: 'https://...'
    },
    get_started_url: 'https://...',
    businessDetails: { /* Business info */ },
    listings: { /* Yext listings */ },
    reviews: { /* Google/Yelp reviews */ },
    social: { /* Facebook data */ },
    website: { /* PageSpeed metrics */ },
    seo: { /* SEMrush SEO data */ },
    googleAds: { /* SEMrush Ads data */ },
    facebookAds: { /* Facebook Ads */ },
    overallScore: { /* Aggregate scores */ },
    configs: { /* Enabled sections */ }
  }
}

Business Logic Flow:

  1. Fetch Core Data

    • Find InstaReport by ID
    • Find User config (type: 'instareports') for preferences
    • Find Yext publisher logos (all records cached in Map)
  2. Fetch Creator Information

    • Query User: name, email, phone, image
    • Query Account: business name, business image
    • Merge into userDetails
  3. Determine Industry Category

    • Normalize business category against industry list
    • Try alternate format (split on ' &')
    • Default to 'Other' if no match
    • Fetch industry averages
  4. Build Config Fallback

    • If old report missing details.configs:
      • Enable all scrapers by default
      • Ensures backward compatibility
  5. Calculate Report Sections (via utils):

    • businessDetails: utils.businessDetails(report) - name, address, photos
    • listings: utils.listingsDetails(report) - Yext data with logo mapping
    • reviews: utils.reviewDetails(report, industryAvg) - Google/Yelp ratings
    • social: utils.socialDetails(report, industryAvg) - Facebook engagement
    • website: utils.pageSpeedDetails(report) - Performance scores
    • seo: utils.seoDetails(report) - Organic traffic, keywords
    • googleAds: utils.gogoleAdsDetails(report) - Paid traffic
    • facebookAds: utils.facebookAdsDetails(report) - Facebook ad library
    • overallScore: utils.overallScore(report) - Aggregate percentage
  6. Map Yext Publisher Logos

    • Replace logoURL with mapped S3 URL
    • Keep original URL as logoURLOriginal
    • Fallback to original if mapping not found
  7. Return Complete Report

Config Conditionals (Sections Only If Enabled):

listings: report.details.configs.yext ? data : null;
reviews: report.details.configs.google_map ? data : null;
social: report.details.configs.facebook ? data : null;
website: report.details.configs.page_speed ? data : null;
seo: report.details.configs.seo ? data : null;
googleAds: report.details.configs.google_ads ? data : null;

Logo Mapping Logic:

const logosMap = new Map();
// Populate: logosMap.set('Google', 'https://bucket/google-logo.png')
// Usage: logoURL = logosMap.get(siteName) || originalLogoURL
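
A minimal sketch of the mapping step, assuming the logo records expose `name` and `url` fields (those two names are not confirmed by the source). `siteName`, `logoURL`, and `logoURLOriginal` follow the names used in this section; `mapPublisherLogos` itself is a hypothetical helper.

```javascript
// `logoRecords` stands in for the YextPublishersLogo documents; the `name` and
// `url` field names are assumed for illustration.
function mapPublisherLogos(listings, logoRecords) {
  const logosMap = new Map(logoRecords.map((r) => [r.name, r.url]));
  return listings.map((listing) => ({
    ...listing,
    logoURLOriginal: listing.logoURL, // keep the scraper's URL for reference
    logoURL: logosMap.get(listing.siteName) || listing.logoURL, // fall back if unmapped
  }));
}
```

Building the Map once keeps each listing lookup constant-time, and an unmapped publisher simply keeps its original logo.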

Key Business Rules:

  • Configs determine which sections calculated (performance optimization)
  • Industry category must match exactly for benchmarks
  • User config preferences override defaults (videos, get_started_url)
  • Old reports without configs get all sections enabled
  • Logo mapping enhances UI but doesn't block if missing
  • Report can be retrieved in any status (QUEUED, RETRYING, GENERATED)

Error Handling:

  • { status: 'Failed', error: 404, message: 'Report ID not found' } - Invalid ID
  • Re-throws any unexpected errors to caller

Example Usage:

const reportData = await reportByID('60a7f8d5e4b0d8f3a4c5e1b2');

if (reportData.status === 'Success') {
  // Display report to user
  renderReport(reportData.data);
} else {
  // Show error
  showError(reportData.message);
}

Side Effects:

  • ⚠️ Queries multiple collections (InstaReport, User, Account, Config, YextLogos, IndustryAvg)
  • ⚠️ Executes complex utils calculations (CPU intensive)
  • ⚠️ No caching - recalculates on every request

getAllReports(account_id, page, limit, search, order, sort_by, statuses, auth)

Purpose: Retrieves paginated list of reports with filtering, sorting, and contact visibility rules. Supports CRM visibility settings (entire/owner_followers).

Parameters:

  • account_id (ObjectId) - Account identifier
  • page (Number) - Page number (1-indexed)
  • limit (Number) - Records per page
  • search (String, optional) - Search text (name, industry, phone, email)
  • order (Number, optional) - Sort order: 1 (asc) or -1 (desc)
  • sort_by (String, optional) - Sort field (companyName, created, status, priority, etc.)
  • statuses (Array, optional) - Filter by status ['QUEUED', 'GENERATED']
  • auth (Object) - Authentication context:
    • user.is_owner (Boolean) - Account owner flag
    • account.preferences.contacts.visibility (String) - Visibility setting
    • uid (ObjectId) - Current user ID

Returns: Promise<Object> - Paginated results:

{
  count: 1234, // Total matching records
  result: [ // Current page records
    {
      id: ObjectId,
      status: 'GENERATED',
      created: Date,
      generated_at: Date,
      business: {
        name: 'Acme Corp',
        industry: 'Automotive',
        phone: '555-0100',
        email: 'info@acme.com',
        address: { /* Full address */ }
      },
      creator_details: {
        name: 'John Doe',
        email: 'john@example.com'
      },
      details: {
        business_info: { /* Business info */ },
        configs: { /* Enabled scrapers */ },
        notifications: { /* Email/SMS settings */ }
      }
    }
  ]
}

Business Logic Flow:

  1. Build Base Query

    • Match: account_id, orphaned: { $ne: true }
    • Project: Exclude scrapers (large field, not needed)
  2. Apply Contact Visibility Rules

    • Owner OR visibility='entire': All contacts visible
      • Simple lookup: localField: 'details.business_info.id'
    • visibility='owner_followers': Filtered contacts
      • Complex lookup with conditions:
        • Contact visibility: 'all' OR
        • Contact owner = current_user OR
        • Current user in contact followers array
  3. Join Creator Details

    • Lookup from _users collection
    • Fields: name, email (exclude password, tokens)
  4. Apply Status Filter (if provided)

    • Modify $match: status: { $in: statuses }
  5. Apply Search Filter (if provided)

    • Regex search on:
      • business.name
      • business.industry
      • business.phone
      • business.email
    • Case-insensitive ($options: 'i')
  6. Apply Sorting

    • Map sort field names:
      • companyName → business.name
      • created → created
      • status → status
      • city → business.address.city
      • email → business.email
      • created_by → creator_details.name
    • Default: created: -1 (newest first)
  7. Execute Pagination

    • Count query: Get total matching records
    • Data query: Skip + Limit
    • Sort applied before pagination
  8. Clean Response

    • Add id field (copy of _id)
    • Remove sensitive fields:
      • _id, __v, account_id, created_by, parent_account
      • scrapers, scores (not needed in list view)
      • creator_details.password, creator_details.reset_token
  9. Return Results with Count
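
The response cleanup in step 8 could look roughly like the sketch below. `cleanReport` is a hypothetical name; the field lists are taken from step 8, and returning a copy instead of mutating the aggregation result is a design choice made here, not something the source states.

```javascript
// Fields stripped from each list item (taken from step 8 above).
const TOP_LEVEL_FIELDS = ['_id', '__v', 'account_id', 'created_by', 'parent_account', 'scrapers', 'scores'];
const CREATOR_FIELDS = ['password', 'reset_token'];

// Returns a cleaned copy; the input document is left untouched.
function cleanReport(doc) {
  const cleaned = { ...doc, id: doc._id }; // expose _id as id before removing it
  for (const field of TOP_LEVEL_FIELDS) delete cleaned[field];
  if (cleaned.creator_details) {
    cleaned.creator_details = { ...cleaned.creator_details };
    for (const field of CREATOR_FIELDS) delete cleaned.creator_details[field];
  }
  return cleaned;
}
```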

Contact Visibility Pipeline (owner_followers):

{
  $lookup: {
    from: 'crm.contacts',
    let: { business_id: '$details.business_info.id' },
    pipeline: [
      {
        $match: {
          $and: [
            { $expr: { $eq: ['$_id', '$$business_id'] } }
          ],
          $or: [
            { $expr: { $eq: ['$visibility', 'all'] } },
            { $expr: { $eq: ['$owner', current_user_id] } },
            { $expr: { $in: [current_user_id, '$followers'] } }
          ]
        }
      }
    ],
    as: 'business'
  }
}
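
To show how the two visibility branches relate, here is a sketch that picks the lookup stage from the auth context. `buildBusinessLookup` is a hypothetical helper, and `foreignField: '_id'` in the simple case is an assumption, since the source only names the `localField`.

```javascript
// Chooses the crm.contacts $lookup stage from the auth context.
// `foreignField: '_id'` in the simple branch is assumed.
function buildBusinessLookup(auth) {
  const visibility = auth.account?.preferences?.contacts?.visibility;

  // Owners, or accounts with visibility 'entire', see every contact.
  if (auth.user?.is_owner || visibility === 'entire') {
    return {
      $lookup: {
        from: 'crm.contacts',
        localField: 'details.business_info.id',
        foreignField: '_id',
        as: 'business',
      },
    };
  }

  // Otherwise restrict to public contacts, owned contacts, or followed contacts.
  const uid = auth.uid;
  return {
    $lookup: {
      from: 'crm.contacts',
      let: { business_id: '$details.business_info.id' },
      pipeline: [
        {
          $match: {
            $and: [{ $expr: { $eq: ['$_id', '$$business_id'] } }],
            $or: [
              { $expr: { $eq: ['$visibility', 'all'] } },
              { $expr: { $eq: ['$owner', uid] } },
              { $expr: { $in: [uid, '$followers'] } },
            ],
          },
        },
      ],
      as: 'business',
    },
  };
}
```

Reports whose lookup returns an empty `business` array would then be dropped by a later stage, which is how a restricted user ends up not seeing them.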

Sortable Fields:

| Frontend Field | Database Field |
| --- | --- |
| companyName | business.name |
| created | created |
| status | status |
| priority | priority |
| city | business.address.city |
| state | business.address.state_province |
| zip | business.address.postal_code |
| email | business.email |
| phone | business.phone |
| created_by | creator_details.name |
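
The sortable-field mapping can be sketched as a lookup that falls back to newest-first. `buildSortStage` is a hypothetical name, and treating a missing `order` as descending is an assumption; the source only states the overall default of `created: -1`.

```javascript
// Frontend sort keys mapped to database fields (from the Sortable Fields list).
const SORT_FIELDS = new Map([
  ['companyName', 'business.name'],
  ['created', 'created'],
  ['status', 'status'],
  ['priority', 'priority'],
  ['city', 'business.address.city'],
  ['state', 'business.address.state_province'],
  ['zip', 'business.address.postal_code'],
  ['email', 'business.email'],
  ['phone', 'business.phone'],
  ['created_by', 'creator_details.name'],
]);

// Returns a $sort stage; an unknown or missing sort_by falls back to newest first.
function buildSortStage(sortBy, order) {
  const field = SORT_FIELDS.get(sortBy) || 'created';
  const direction = SORT_FIELDS.has(sortBy) && order === 1 ? 1 : -1;
  return { $sort: { [field]: direction } };
}
```

A Map is used instead of a plain object so that inherited keys such as `toString` are never mistaken for sortable fields.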

Key Business Rules:

  • Orphaned reports excluded (deleted business contacts)
  • Contact visibility enforced at database level (secure)
  • Scrapers data excluded from list view (performance)
  • Owner users bypass visibility restrictions
  • Search case-insensitive across 4 fields
  • Default sort: newest first
  • Empty result returns null, not empty array

Example Usage:

const reports = await getAllReports(
  account_id,
  1, // page
  50, // limit
  'Automotive', // search
  -1, // order (desc)
  'created', // sort_by
  ['GENERATED'], // statuses
  auth, // auth context
);

// Returns: { count: 15, result: [...15 reports] }

Side Effects:

  • ⚠️ Two aggregation queries (count + data)
  • ⚠️ Complex visibility pipeline (performance consideration)
  • ⚠️ No caching - every request hits database

Notification Management

sendNotifications(req, res)

Purpose: Sends email and SMS notifications to all recipients with personalized report links. Creates short URLs, attaches files, and tracks notification event IDs.

Parameters:

  • req (Object) - Express request:
    • params.id (String) - InstaReport ID
    • headers.authorization (String) - JWT token
  • res (Object) - Express response (unused, async function)

Returns: Promise<void> - No return value (async)

Business Logic Flow:

  1. Fetch Report and Recipients

    • Find InstaReport by ID
    • Find AdditionalInformation with recipients array
    • Extract recipient person IDs
  2. Fetch Account Domain

    • Find parent Account by parent_account
    • Extract domain for URL generation
  3. Fetch Contact Details

    • Query Contacts where _id IN recipient_ids AND type: 'people'
    • Get: name, email, phone for each recipient
  4. Process Each Recipient

    • For each contact, send notifications if enabled:
  5. Email Notification (if enabled):

    • Generate Report URL:
      • Format: {domain}/reports/view/{report_id}/{person_id}
      • Get active domain via utilities.getActiveDomain()
    • Shorten URL:
      • POST to URL shortener API: https://playground-api.dashclicks.com/v1/url
      • Returns: urlme.app/{code}
    • Build Email:
      • Subject: from report.details.notifications.email.subject
      • From: report.details.notifications.email.from
      • Reply-to: report.details.notifications.email.reply_to
      • Recipient: contact email
      • Content: HTML from report.details.notifications.email.content
    • Attach Files (if configured):
      • For each attachment in email.attachments:
        • Generate S3 URL: {WASABI_PUBLIC_IMAGE_DOWNLOAD}/{key}
        • Download file as buffer via generateFileBuffer()
        • Add to emailData.files array
    • Send Email:
      • Call new Mail().send() with custom fields:
        • InstaReports-InstaReport preview link: short URL
      • Fallback values from email.fallback_entity
    • Track Event:
      • Update AdditionalInformation:
        • Set recipients.$.email.eventID with Communication ID
  6. SMS Notification (if enabled):

    • Generate Report URL (same as email)
    • Shorten URL (same as email)
    • Build SMS:
      • From: report.details.notifications.sms.from
      • To: contact phone
      • Content: report.details.notifications.sms.content
    • Send SMS:
      • Call new SMS().send() with custom fields:
        • InstaReports-InstaReport preview link: short URL
      • Fallback values from sms.fallback_entity
    • Track Event:
      • Update AdditionalInformation:
        • Set recipients.$.sms.eventID with Communication ID
  7. Complete Processing

Custom Fields (Merge Tags):

customFields: [
  {
    field: 'InstaReports-InstaReport preview link',
    value: 'https://urlme.app/abc123',
  },
];

Fallback Values (for missing merge tags):

fallbackValues: {
  'Business Name': report.details.business_info.name,
  'Business Email': report.details.business_info.email,
  'Contact Name': contact.name,
  // ... other custom fallbacks
}

File Attachment Flow:

// Config structure
email.attachments: [
  { key: 'path/to/file.pdf', name: 'Brochure.pdf', type: 'application/pdf' }
]

// Generated URL
attachment.url = 'https://wasabi.com/bucket/path/to/file.pdf'

// Download buffer
bufferData = await generateFileBuffer(attachment)

// Attach to email
emailData.files = [bufferData]

Key Business Rules:

  • Only recipients with email/SMS enabled receive notifications
  • Each recipient gets personalized link (includes person_id for tracking)
  • URL shortening required (long URLs don't fit in SMS)
  • File attachments downloaded on-demand (not stored in DB)
  • Event IDs tracked for delivery status and analytics
  • Custom fields allow merge tags in email/SMS templates
  • Fallback values ensure templates render even if data missing
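
The link handling and the per-recipient channel check can be sketched with three small helpers. All three names are hypothetical. The URL format and the `urlme.app/{code}` short link come from this section; prefixing `https://` when the domain has no scheme is an assumption.

```javascript
// Builds the personalized report link: {domain}/reports/view/{report_id}/{person_id}.
// Prefixing "https://" when the domain has no scheme is an assumption.
function buildReportUrl(domain, reportId, personId) {
  const base = /^https?:\/\//.test(domain) ? domain : `https://${domain}`;
  return `${base.replace(/\/+$/, '')}/reports/view/${reportId}/${personId}`;
}

// Turns the shortener response ({ code }) into the short link used in merge tags.
function buildShortUrl(shortenerResponse) {
  return `https://urlme.app/${shortenerResponse.code}`;
}

// Lists the channels a recipient should be notified on.
function enabledChannels(recipient) {
  const channels = [];
  if (recipient.email && recipient.email.enabled) channels.push('email');
  if (recipient.sms && recipient.sms.enabled) channels.push('sms');
  return channels;
}
```

Including the person ID in the path is what lets view tracking attribute an open to a specific recipient.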

Error Handling:

  • File attachment errors logged but don't stop notification
  • URL shortener failure would break notification (no fallback)
  • Email/SMS send failures logged via Mail/SMS utilities

Example Usage:

// Called after report status = GENERATED
await sendNotifications(req, res);

// Result: All enabled recipients receive email/SMS
// AdditionalInformation updated with event IDs

Side Effects:

  • ⚠️ Sends N emails (N = recipients with email enabled)
  • ⚠️ Sends M SMS (M = recipients with SMS enabled)
  • ⚠️ Creates N+M Communication records
  • ⚠️ Creates N+M URL shortener entries
  • ⚠️ Downloads P files from S3 (P = unique attachments)
  • ⚠️ Updates AdditionalInformation with event IDs
  • ⚠️ May trigger external webhook/integration events

🔀 Integration Points

External Services

Queue Manager Webhook

  • Purpose: Triggers report scraping workflow
  • Endpoint: {QM_WEBHOOK_URL}/instareport
  • Method: POST
  • Payload: Report metadata, business details
  • Used by: saveInstareportQueue(), CSV import

URL Shortener API

  • Purpose: Generate short URLs for the report links sent by email and SMS
  • Endpoint: https://playground-api.dashclicks.com/v1/url
  • Method: POST
  • Returns: { code: 'abc123' } → https://urlme.app/abc123
  • Used by: sendNotifications()

Wasabi S3 Storage

  • Purpose: File attachment storage
  • Bucket: WASABI_PUBLIC_IMAGE_DOWNLOAD
  • Used by: sendNotifications() - download email attachments

Scraper Services (7 platforms)

  • Yext: Directory listings accuracy
  • SEMrush: SEO and Google Ads data
  • Facebook: Social engagement metrics
  • Yelp: Review data
  • Google Maps: Local listing, reviews
  • PageSpeed: Website performance
  • Facebook Ads: Ad library search

Internal Dependencies

  • shared/utilities/mail.js - Email sending via Mail class
  • shared/utilities/sms.js - SMS sending via SMS class
  • shared/utilities/index.js - getActiveDomain() for URL generation
  • Services/utils.js - Score calculation and report formatting utilities
  • Services/csvToJson.js - CSV parsing for batch import
  • Services/create-business.js - Bulk business contact creation
  • Services/generateFileBuffer.js - S3 file download for attachments

Shared Models

  • InstaReport - Primary report documents
  • InstaReportQueue - Queue management
  • InstareportsAdditionalInformation - Recipients and notifications
  • InstareportsIndustryAverage - Benchmark data (24 industries)
  • YextPublishersLogo - Publisher logo URLs
  • Contact - Business and people contacts
  • User - Creator information
  • Account - Account domain and business info
  • Config - User preferences (videos, URLs)

🧪 Edge Cases & Special Handling

Case: Old Reports Without Configs

Condition: Report created before configs feature
Handling: Fallback to all scrapers enabled
Result: Full report generated for backward compatibility
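
A minimal sketch of this fallback, together with the "only compute enabled sections" rule. Both helper names are hypothetical, and the key set in `ALL_ENABLED` is taken from the config conditionals shown earlier; the real service may use a different or larger set.

```javascript
// Config keys follow the conditionals shown earlier (yext, google_map, facebook,
// page_speed, seo, google_ads); the full key set in the real service may differ.
const ALL_ENABLED = {
  yext: true,
  google_map: true,
  facebook: true,
  page_speed: true,
  seo: true,
  google_ads: true,
};

// Old reports have no details.configs, so every section is treated as enabled.
function resolveConfigs(report) {
  const configs = report.details && report.details.configs;
  return configs ? configs : { ...ALL_ENABLED };
}

// Computes a section only when its flag is on; disabled sections come back as null.
function sectionIfEnabled(enabled, compute) {
  return enabled ? compute() : null;
}
```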

Case: Industry Category Not Found

Condition: Business category not in 24 industry list
Handling: Try alternate format (split on ' &'), default to 'Other'
Result: Industry benchmarks from 'Other' category used
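
One way to picture the category fallback is sketched below. The source only says the alternate format is obtained by splitting on ' &'; comparing the part before ' &' on both sides is a guess at what that means, and `normalizeIndustry` is a hypothetical name. The list is copied from the Industry Categories block, including its spelling of 'Nigthlife'.

```javascript
// Copied from the Industry Categories list (spelling kept as in the source).
const INDUSTRIES = [
  'Active Life', 'Arts & Entertainment', 'Automotive', 'Beauty & Spas',
  'Education', 'Event Planning & Services', 'Financial Services', 'Food',
  'Health & Medical', 'Home Services', 'Hotels & Travel',
  'Industrial Goods & Manufacturing', 'Local Services', 'Mass Media',
  'Mining & Agriculture', 'Nigthlife', 'Other', 'Pets',
  'Professional Services', 'Public Services & Government', 'Real Estate',
  'Religious Organizations', 'Restaurants', 'Shopping',
];

function normalizeIndustry(category) {
  if (typeof category !== 'string' || category.trim() === '') return 'Other';
  const name = category.trim();
  if (INDUSTRIES.includes(name)) return name; // exact match

  // Alternate format (assumed): compare only the part before ' &'.
  const head = name.split(' &')[0];
  const match = INDUSTRIES.find((industry) => industry.split(' &')[0] === head);
  return match || 'Other';
}
```

Whatever the real matching rule is, the important property is the last line: an unmatched category always resolves to 'Other' so a benchmark record can be found.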

Case: Scraper Partial Completion

Condition: Some scrapers complete, others still running
Handling: UpdateStatus() returns undefined, no status change
Result: Wait for all scrapers before status transition

Case: All Scrapers Failed

Condition: Every scraper returns FAILED status
Handling: Status set to RETRYING, scrapers cleared, queue tries incremented
Result: Full re-scrape scheduled by queue manager

Case: Publisher Logo Not Found

Condition: Logo mapping not found for listing site
Handling: Use original logoURL from scraper
Result: Report displays with original logo (may be broken link)

Case: Contact Visibility Restriction

Condition: User not owner/follower of business contact
Handling: Business lookup returns empty, report excluded from results
Result: User cannot see report (secure)

Case: File Attachment Download Failure

Condition: S3 file missing or inaccessible
Handling: Error logged, attachment skipped, email still sent
Result: Notification delivered without attachment

Case: URL Shortener Failure

Condition: Shortener API unavailable
Handling: Request throws error, notification fails
Result: Recipient doesn't receive notification (no fallback)

Case: Recipient Missing Email/Phone

Condition: Contact has no email (email enabled) or no phone (SMS enabled)
Handling: Communication utilities handle validation
Result: Notification silently skipped for that recipient

Case: CSV Import with All Invalid Businesses

Condition: Every row in CSV fails validation
Handling: No reports queued, all errors tracked in process_errors
Result: CSV marked complete with zero reports generated


⚠️ Important Notes

  • No Transactions: Multi-collection operations are not atomic (potential orphaned records)
  • Status Flow: QUEUED → RETRYING (on failure) → GENERATED (on success)
  • 7 Scrapers: All must complete before status update
  • Retry Logic: Failed scrapers trigger full re-scrape (not incremental)
  • Score Calculation: Only when ALL scrapers succeed
  • 24 Industries: Benchmarks cover common business categories
  • Personalized Links: Each recipient gets unique URL with person_id
  • URL Shortening: Required for SMS (long URLs don't fit)
  • File Attachments: Downloaded on-demand, not stored in DB
  • Visibility Rules: Contact permissions enforced at database level
  • Performance: No caching, complex aggregations may be slow
  • Error Handling: Best-effort processing, errors logged but not blocking

  • Parent Module: InstaReports Module
  • Related Service: Reporting Service
  • Utilities: Utils Service (Score Calculation)
  • Controller: internal/api/v1/instareports/Controllers/instareport.js
  • Routes: internal/api/v1/instareports/Routes/index.js
  • Queue Manager: Queue Processing Documentation
  • Models:
    • InstaReport
    • InstaReport Queue
    • Additional Information