📸 Site Thumbnail Build

📖 Overview

The Site Thumbnail Build job generates desktop, tablet, and mobile screenshots for published InstaSites and Agency Websites using Puppeteer (headless Chrome). A cron runs every minute, queries sites that are missing thumbnails or have stale builds, and adds them to a Bull queue in batches of 50. A worker then renders one site at a time and uploads the generated images to Wasabi S3. The system includes stale job recovery (2-hour timeout) and cleanup of temporary screenshot files.

Complete Flow:

Cron Initialization: queue-manager/crons/sites/buildThumbnails.js
Service Processing: queue-manager/services/sites/buildThumbnails.js
Queue Definition: queue-manager/queues/sites/buildThumbnails.js

Execution Pattern: Cron-based (every 1 minute)

Queue Name: site_build_thumbnails

Environment Flag: QM_SITES_BUILD_THUMBNAILS=true (in index.js)

🔄 Complete Processing Flow

sequenceDiagram
    participant CRON as Cron Schedule<br/>(every 1 min)
    participant SERVICE as Build Service
    participant AGENCY_DB as Agency<br/>Websites
    participant INSTA_DB as InstaSites<br/>Collection
    participant QUEUE as Bull Queue
    participant PUPPETEER as Puppeteer<br/>(Headless Chrome)
    participant WASABI as Wasabi S3

    CRON->>SERVICE: Check for sites needing thumbnails

    loop Process Agency Websites (batches of 50)
        SERVICE->>AGENCY_DB: Query published sites<br/>(missing/stale thumbnails)
        AGENCY_DB-->>SERVICE: Return batch of 50 sites

        alt Batch empty
            SERVICE->>SERVICE: Exit loop
        else Process batch
            loop For each site
                SERVICE->>AGENCY_DB: Mark build in progress
                SERVICE->>QUEUE: Add thumbnail job

                QUEUE->>PUPPETEER: Launch headless Chrome
                QUEUE->>PUPPETEER: Navigate to preview URL<br/>desktop/tablet/mobile

                loop 3 viewports (run concurrently)
                    PUPPETEER->>PUPPETEER: Wait for page load<br/>(30s timeout)
                    PUPPETEER->>PUPPETEER: Set viewport dimensions
                    PUPPETEER->>PUPPETEER: Scroll down & up<br/>(trigger lazy loading)
                    PUPPETEER->>PUPPETEER: Capture screenshot (JPEG 60%)
                    PUPPETEER-->>QUEUE: Return screenshot file
                end

                loop Upload 3 screenshots
                    QUEUE->>WASABI: Upload JPEG to S3
                    WASABI-->>QUEUE: Return file URL
                end

                QUEUE->>QUEUE: Cleanup temp files
                QUEUE->>AGENCY_DB: Update thumbnails<br/>Clear in_progress flag
            end
        end
    end

    loop Process InstaSites (batches of 50)
        SERVICE->>INSTA_DB: Query published sites<br/>(missing/stale thumbnails)
        INSTA_DB-->>SERVICE: Return batch of 50 sites

        alt Batch empty
            SERVICE->>SERVICE: Exit loop
        else Process batch
            loop For each site
                SERVICE->>INSTA_DB: Mark build in progress
                SERVICE->>QUEUE: Add thumbnail job
                QUEUE->>PUPPETEER: Generate 3 screenshots
                QUEUE->>WASABI: Upload to S3
                QUEUE->>INSTA_DB: Update thumbnails
            end
        end
    end

📁 Source Files

1. Cron Initialization

File: queue-manager/crons/sites/buildThumbnails.js

Purpose: Schedule thumbnail build checks every minute

Cron Pattern: * * * * * (every minute)

Initialization:

const buildThumbnails = require('../../services/sites/buildThumbnails');
const cron = require('node-cron');
const logger = require('../../utilities/logger');

let inProgress = false;
let cronJobCount = 0;

exports.start = async () => {
  try {
    logger.log({
      initiator: 'QM/sites/build-thumbnails',
      message: 'Starting thumbnail build cron job scheduler',
    });

    cron.schedule('* * * * *', async () => {
      cronJobCount++;
      const jobId = `job-${cronJobCount}-${Date.now()}`;

      if (inProgress) {
        logger.log({
          initiator: 'QM/sites/build-thumbnails',
          message: `Skipping job ${jobId} - previous job still in progress`,
          additional_data: { jobId },
        });
        return;
      }

      try {
        inProgress = true;
        await buildThumbnails();
      } catch (error) {
        logger.error({
          initiator: 'QM/sites/build-thumbnails',
          message: `Error in thumbnail build process (${jobId})`,
          error: error,
          additional_data: { jobId },
        });
      } finally {
        inProgress = false;
      }
    });
  } catch (err) {
    logger.error({
      initiator: 'QM/sites/build-thumbnails',
      message: 'Failed to start thumbnail build cron job',
      error: err,
    });
  }
};

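In-Progress Lock: Prevents overlapping runs when one scan-and-enqueue pass takes longer than a minute. The flag lives in process memory, so it does not coordinate multiple queue-manager instances.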

Job Tracking: Each cron run gets a unique jobId for debugging.
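
The lock pattern can be factored into a small wrapper so it can be tested without node-cron. This is a minimal sketch, not code from the repository: `createGuardedRunner` is a hypothetical name, and like the original flag it only guards a single Node.js process.

```javascript
// Hypothetical helper: wraps an async task so overlapping calls are skipped.
function createGuardedRunner(task) {
  let inProgress = false;

  return function run() {
    if (inProgress) {
      return null; // Caller can log "skipped" when it sees null
    }
    inProgress = true;
    // Promise.resolve().then(task) also turns a synchronous throw into a rejection
    return Promise.resolve()
      .then(task)
      .finally(() => {
        inProgress = false; // Always release the lock, even after a failure
      });
  };
}

// Usage sketch: the second call is skipped while the first is still pending.
const runner = createGuardedRunner(() => new Promise(resolve => setTimeout(resolve, 10)));
const firstRun = runner();
const secondRun = runner();
```

Returning `null` for a skipped run keeps the skip decision synchronous, which makes the behavior easy to assert in a unit test.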

2. Service Processing (THE CORE LOGIC)

File: queue-manager/services/sites/buildThumbnails.js

Purpose: Query sites needing thumbnails and add to queue in batches

Key Functions:

Query published sites without thumbnails or with stale builds
Process Agency Websites in batches of 50
Process InstaSites in batches of 50
Mark sites as thumbnail_build_in_progress before queuing
Track total sites processed

Site Selection Query:

const findQuery = {
  $and: [
    { status: 'PUBLISHED' }, // Only published sites
    {
      $or: [
        { thumbnail_build_in_progress: false }, // Flag explicitly set to false
        {
          $and: [
            { 'details.thumbnails.desktop': null }, // Missing thumbnails (null or field absent)
            { thumbnail_build_in_progress: { $ne: true } },
          ],
        },
        {
          $and: [
            { thumbnail_build_in_progress: true }, // Stale job recovery
            {
              $or: [
                { thumbnail_process_started_at: { $exists: false } },
                {
                  thumbnail_process_started_at: {
                    $lte: new Date(Date.now() - 2 * 60 * 60 * 1000), // 2 hours ago
                  },
                },
              ],
            },
          ],
        },
      ],
    },
  ],
};
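
One detail worth checking in the source: `new Date(Date.now() - ...)` is evaluated when the object literal is created. This page does not show where `findQuery` is defined; if it is a module-level constant, the stale cutoff would stay fixed at process start. The sketch below rebuilds the query on every run. `buildFindQuery` and `STALE_AFTER_MS` are hypothetical names, and the query uses the implicit-AND form, which is equivalent to the `$and` form above.

```javascript
const STALE_AFTER_MS = 2 * 60 * 60 * 1000; // 2 hours

// Returns a fresh query object so the stale cutoff is relative to "now".
function buildFindQuery(now = Date.now()) {
  const staleCutoff = new Date(now - STALE_AFTER_MS);
  return {
    status: 'PUBLISHED',
    $or: [
      { thumbnail_build_in_progress: false },
      {
        'details.thumbnails.desktop': null,
        thumbnail_build_in_progress: { $ne: true },
      },
      {
        thumbnail_build_in_progress: true,
        $or: [
          { thumbnail_process_started_at: { $exists: false } },
          { thumbnail_process_started_at: { $lte: staleCutoff } },
        ],
      },
    ],
  };
}

const query = buildFindQuery(Date.UTC(2024, 0, 1, 12, 0, 0));
const cutoff = query.$or[2].$or[1].thumbnail_process_started_at.$lte;
console.log(cutoff.toISOString()); // 2024-01-01T10:00:00.000Z
```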

Main Processing Function:

module.exports = async () => {
  try {
    const queue = await Queue.start();
    const batchSize = 50; // Number of sites to process at once
    let totalAgencySites = 0;
    let totalInstasites = 0;

    // Process all agency websites in batches
    let hasMoreAgencySites = true;
    while (hasMoreAgencySites) {
      // Fetch a batch of agency websites
      let sites = await AgencyWebsite.find(findQuery).limit(batchSize).lean().exec();

      if (sites.length === 0) {
        hasMoreAgencySites = false;
        continue;
      }

      // Process each site in the batch
      for (const site of sites) {
        await AgencyWebsite.updateOne(
          { _id: site._id },
          {
            thumbnail_build_in_progress: true,
            thumbnail_process_started_at: new Date(),
          },
        );

        await queue.add({ ...site, siteType: 'agency' }, jobSettings);
      }

      totalAgencySites += sites.length;
    }

    // Process all instasites in batches
    let hasMoreInstasites = true;
    while (hasMoreInstasites) {
      // Fetch a batch of instasites
      let instasites = await Instasite.find(findQuery).limit(batchSize).lean().exec();

      if (instasites.length === 0) {
        hasMoreInstasites = false;
        continue;
      }

      // Process each instasite in the batch
      for (const instasite of instasites) {
        await Instasite.updateOne(
          { _id: instasite._id },
          {
            thumbnail_build_in_progress: true,
            thumbnail_process_started_at: new Date(),
          },
        );

        await queue.add({ ...instasite, siteType: 'instasite' }, jobSettings);
      }

      totalInstasites += instasites.length;
    }

    // Log final counts
    if (totalAgencySites > 0) {
      logger.log({
        initiator: 'QM/sites/build-thumbnails',
        message: `Total agency websites added to thumbnail build queue: ${totalAgencySites}`,
      });
    }

    if (totalInstasites > 0) {
      logger.log({
        initiator: 'QM/sites/build-thumbnails',
        message: `Total instasites added to thumbnail build queue: ${totalInstasites}`,
      });
    }
  } catch (err) {
    logger.error({
      initiator: 'QM/sites/build-thumbnails',
      error: err,
    });
  }
};

Job Settings:

const jobSettings = {
  attempts: 1, // No retries (stale job recovery handles this)
  removeOnComplete: true, // Clean up successful jobs
  removeOnFail: true, // Clean up failed jobs
  timeout: 180000, // 3 minutes max per job
  // backoff only applies to retries, so it has no effect while attempts is 1
  backoff: {
    type: 'exponential',
    delay: 1000,
  },
};

3. Queue Processing (PUPPETEER SCREENSHOT GENERATION)

File: queue-manager/queues/sites/buildThumbnails.js

Purpose: Generate screenshots using Puppeteer and upload to Wasabi

Key Functions:

Launch headless Chrome with optimized flags
Generate desktop, tablet, mobile screenshots
Upload screenshots to Wasabi S3
Update site with thumbnail URLs
Clean up temporary files

Main Processing Function:

const generateThumbnailSet = async job => {
  let data = job.data;
  if (!data) {
    throw new Error('No data provided');
  }

  let thumbnails;
  let files = {};

  try {
    if (!data.details?.previews?.all) {
      throw notFound('Missing preview URL data');
    }

    // Build preview URL with device query params
    const previewUrl = data.details.previews.all;
    thumbnails = await getThumbnails(
      previewUrl.replace('/preview/', '/site/') + '?preview=true&insitepreview=true&dm_device=',
    );

    // Upload all thumbnails to Wasabi
    for (let thumb in thumbnails) {
      try {
        files[thumb] = await uploadFile(thumbnails[thumb].filePath);
      } catch (uploadErr) {
        logger.error({
          initiator: 'QM/sites/build-thumbnails',
          message: `Failed to upload ${thumb} thumbnail`,
          error: uploadErr,
        });
        continue;
      }
    }

    for (let file in files) {
      files[file] = files[file][0];
    }

    // Add file sizes and cleanup temp files
    for (let thumb in thumbnails) {
      if (thumbnails[thumb].filePath && fs.existsSync(thumbnails[thumb].filePath)) {
        try {
          files[thumb].size = fs.statSync(thumbnails[thumb].filePath).size;
          fs.unlinkSync(thumbnails[thumb].filePath);
          thumbnails[thumb].cleaned = true;
        } catch (fileErr) {
          logger.warn({
            initiator: 'QM/sites/build-thumbnails',
            message: `Error cleaning up thumbnail file: ${thumbnails[thumb].filePath}`,
            error: fileErr,
          });
        }
      }
    }

    // Only update if we have at least one valid thumbnail
    if (Object.keys(files).length > 0) {
      if (data.siteType == 'agency') {
        await AgencyWebsite.updateOne(
          { _id: data._id },
          {
            $set: { 'details.thumbnails': files },
            $unset: {
              thumbnail_build_in_progress: '',
              thumbnail_process_started_at: '',
            },
          },
        );
        logger.log({
          initiator: 'QM/sites/build-thumbnails',
          message: 'Agency Website thumbnail processed.',
          additional_data: { job: job.id, job_data: data },
        });
      } else {
        await Instasite.updateOne(
          { _id: data._id },
          {
            $set: { 'details.thumbnails': files },
            $unset: {
              thumbnail_build_in_progress: '',
              thumbnail_process_started_at: '',
            },
          },
        );
        logger.log({
          initiator: 'QM/sites/build-thumbnails',
          message: 'InstaSite thumbnail processed.',
          additional_data: { job: job.id, job_data: data },
        });
      }
    } else {
      throw notFound('No thumbnails were successfully generated');
    }
  } finally {
    // Clean up any temporary files that weren't already cleaned
    if (thumbnails) {
      for (let thumb in thumbnails) {
        if (
          thumbnails[thumb].filePath &&
          fs.existsSync(thumbnails[thumb].filePath) &&
          !thumbnails[thumb].cleaned
        ) {
          try {
            fs.unlinkSync(thumbnails[thumb].filePath);
          } catch (cleanupErr) {
            logger.warn({
              initiator: 'QM/sites/build-thumbnails',
              message: `Error during cleanup: ${cleanupErr.message}`,
            });
          }
        }
      }
    }
  }
};
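
The device URL in `generateThumbnailSet` is built by string concatenation, which assumes the preview URL has no query string of its own. As a hedged alternative, the sketch below uses the standard WHATWG `URL` API. `buildDeviceUrl` is a hypothetical helper, and unlike the original it rewrites `/preview/` only inside the path.

```javascript
// Hypothetical helper: builds the per-device URL with the WHATWG URL API.
function buildDeviceUrl(previewUrl, device) {
  const url = new URL(previewUrl);
  url.pathname = url.pathname.replace('/preview/', '/site/');
  url.searchParams.set('preview', 'true');
  url.searchParams.set('insitepreview', 'true');
  url.searchParams.set('dm_device', device); // desktop | tablet | mobile
  return url.toString();
}

console.log(buildDeviceUrl('https://example.com/preview/12345', 'mobile'));
// https://example.com/site/12345?preview=true&insitepreview=true&dm_device=mobile
```

Because the parameters go through `searchParams`, an existing query string is preserved and extended instead of producing a URL with a second `?`.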

Puppeteer Screenshot Generation:

let browser; // Reuse browser instance across jobs

const getThumbnails = async url => {
  try {
    if (!browser) {
      browser = await puppeteer.launch({
        headless: 'new',
        args: [
          '--no-sandbox', // Required for Docker/Linux
          '--disable-setuid-sandbox',
          '--disable-dev-shm-usage', // Write shared memory files to /tmp instead of /dev/shm
          '--disable-gpu', // No GPU needed for screenshots
          '--disable-software-rasterizer',
          '--disable-web-security', // Allow cross-origin
          '--disable-features=VizDisplayCompositor,TranslateUI,BlinkGenPropertyTrees',
          '--disable-background-timer-throttling', // Full CPU for screenshots
          '--disable-backgrounding-occluded-windows',
          '--disable-renderer-backgrounding',
          '--disable-extensions',
          '--disable-plugins',
          '--disable-default-apps',
          '--disable-background-networking',
          '--disable-sync',
          '--disable-translate',
          '--disable-ipc-flooding-protection',
          '--memory-pressure-off',
          '--no-zygote', // Do not use a zygote process to fork renderers
          '--disable-dev-tools',
          '--disable-background-mode',
          '--no-first-run',
          '--disable-accelerated-2d-canvas',
          '--noerrdialogs',
        ],
      });
    }

    // Generate all thumbnails concurrently for better performance
    const [desktop, tablet, mobile] = await Promise.all([
      screenshotGenerator(browser, `${url}desktop`, 'desktop'),
      screenshotGenerator(browser, `${url}tablet`, 'tablet'),
      screenshotGenerator(browser, `${url}mobile`, 'mobile'),
    ]);

    return {
      desktop,
      tablet,
      mobile,
    };
  } catch (error) {
    logger.error({
      initiator: 'QM/sites/build-thumbnails',
      message: 'Failed to generate thumbnails',
      error: error,
      additional_data: { url },
    });
    throw error;
  }
};

Screenshot Generator (Per Device):

const screenshotGenerator = async (browser, url, type) => {
  let dimensions;
  switch (type) {
    case 'desktop':
      dimensions = { width: 1600, height: 1600 };
      break;
    case 'tablet':
      dimensions = { width: 1024, height: 1600 };
      break;
    case 'mobile':
      dimensions = { width: 411, height: 900 };
      break;
  }

  let page;
  try {
    page = await browser.newPage();

    // Set timeouts to prevent hanging
    page.setDefaultTimeout(60000);
    page.setDefaultNavigationTimeout(60000);

    // Give navigation up to 30 seconds, then continue with whatever has rendered (goto itself never times out)
    await Promise.race([
      page.goto(url, { waitUntil: 'networkidle2', timeout: 0 }),
      new Promise(resolve => setTimeout(resolve, 30000)),
    ]);

    await page.setViewport(dimensions);

    // Scroll down to trigger lazy loading
    await page.evaluate(() => {
      window.scroll({
        top: 100,
        behavior: 'smooth',
      });
    });
    await wait(1000);

    // Scroll back to top for screenshot
    await page.evaluate(() => {
      window.scroll({
        top: 0,
        behavior: 'smooth',
      });
    });
    await wait(1000);

    const dirPath = getFileName(uuid.v4());
    const fileName = dirPath.fileName;
    const filePath = dirPath.filePath;

    await page.screenshot({
      path: filePath,
      type: 'jpeg',
      quality: 60, // 60% quality for smaller file size
      fullPage: false, // Only visible viewport
    });

    return {
      fileName,
      filePath,
    };
  } catch (error) {
    logger.error({
      initiator: 'QM/sites/build-thumbnails',
      message: `Failed to generate screenshot for ${type}`,
      error: error,
      additional_data: { url },
    });
    throw error;
  } finally {
    if (page) {
      try {
        await page.close();
      } catch (closeError) {
        logger.warn({
          initiator: 'QM/sites/build-thumbnails',
          message: 'Failed to close page',
          error: closeError,
        });
      }
    }
  }
};

File Upload to Wasabi:

const uploadFile = fileName => {
  const fileContent = fs.readFileSync(fileName);
  return Upload.upload([
    {
      contentType: 'image/jpeg',
      filename: `${uuid.v4()}.jpg`,
      content: fileContent,
    },
  ]);
};

Queue Initialization:

exports.start = async () => {
  try {
    const processCb = async (job, done) => {
      try {
        logger.log({
          initiator: 'QM/sites/build-thumbnails',
          message: `Processing thumbnail build job`,
        });
        await generateThumbnailSet(job);
        logger.log({
          initiator: 'QM/sites/build-thumbnails',
          message: `Finished thumbnail build job`,
        });
        done();
      } catch (err) {
        // Pass the error to done() so the job is marked failed and failedCb runs
        done(err);
      }
    };

    const failedCb = async (job, err) => {
      logger.error({
        initiator: 'QM/sites/build-thumbnails',
        error: err,
        additional_data: { job: job.id, job_data: job.data },
      });

      // Clear in_progress flag on final failure
      if (job.attemptsMade >= job.opts.attempts) {
        // siteType is part of the payload passed to queue.add, so read it from job.data
        if (job.data.siteType == 'agency') {
          await AgencyWebsite.updateOne(
            { _id: job.data._id },
            {
              $unset: {
                thumbnail_build_in_progress: '',
                thumbnail_process_started_at: '',
              },
            },
          );
        } else {
          await Instasite.updateOne(
            { _id: job.data._id },
            {
              $unset: {
                thumbnail_build_in_progress: '',
                thumbnail_process_started_at: '',
              },
            },
          );
        }
      }
    };

    const queueOptions = {
      // Critical settings for long-running jobs
      stalledInterval: process.env.NODE_ENV === 'production' ? 30000 : 10000,
      maxStalledCount: 1,

      // Lock management for screenshot generation
      lockDuration: process.env.NODE_ENV === 'production' ? 120000 : 30000,
      lockRenewTime: process.env.NODE_ENV === 'production' ? 60000 : 15000,
    };

    const queue = QueueWrapper(`site_build_thumbnails`, 'global', {
      processCb,
      failedCb,
      settings: queueOptions,
      concurrency: 1, // Process one site at a time to limit resource usage
    });

    return Promise.resolve(queue);
  } catch (err) {
    logger.error({
      message: 'Error while initializing data queue',
      error: err,
    });
  }
};

🗄️ Collections Used

`agency_websites`

Operations: Read, Update
Model: shared/models/agency-website.js
Usage Context:
- Query published sites needing thumbnails
- Mark sites as in-progress during generation
- Update with thumbnail URLs after upload

Query Criteria:

{
    $and: [
        { status: 'PUBLISHED' },
        {
            $or: [
                { thumbnail_build_in_progress: false },
                {
                    $and: [
                        { 'details.thumbnails.desktop': null },
                        { thumbnail_build_in_progress: { $ne: true } },
                    ],
                },
                {
                    $and: [
                        { thumbnail_build_in_progress: true },
                        {
                            $or: [
                                { thumbnail_process_started_at: { $exists: false } },
                                {
                                    thumbnail_process_started_at: {
                                        $lte: new Date(Date.now() - 2 * 60 * 60 * 1000),
                                    },
                                },
                            ],
                        },
                    ],
                },
            ],
        },
    ],
}

Update Operations:

// Mark as in-progress
{
    thumbnail_build_in_progress: true,
    thumbnail_process_started_at: new Date()
}

// Update with thumbnails
{
    $set: {
        'details.thumbnails': {
            desktop: { url: '...', size: 12345 },
            tablet: { url: '...', size: 10234 },
            mobile: { url: '...', size: 8765 }
        }
    },
    $unset: {
        thumbnail_build_in_progress: '',
        thumbnail_process_started_at: ''
    }
}

Key Fields:

status: 'PUBLISHED' | 'DRAFT' | 'ARCHIVED'
details.previews.all: Preview URL for screenshot generation
details.thumbnails.desktop/tablet/mobile: Wasabi URLs and file sizes
thumbnail_build_in_progress: Boolean lock flag
thumbnail_process_started_at: Timestamp for stale job detection

`instasites`

Operations: Read, Update
Model: shared/models/instasite.js
Usage Context: Same as agency_websites (identical query and update logic)

Key Fields: Same structure as agency_websites

🔧 Job Configuration

Job Options (passed to queue.add)

{
    attempts: 1,                   // No retries (stale recovery handles failures)
    removeOnComplete: true,        // Clean up successful jobs
    removeOnFail: true,            // Clean up failed jobs
    timeout: 180000,               // 3 minutes max per job
    backoff: {
        type: 'exponential',
        delay: 1000,
    },
}

Queue Settings

{
    stalledInterval: 30000,        // Check for stalled jobs every 30s (prod)
    maxStalledCount: 1,            // Mark as failed after 1 stall
    lockDuration: 120000,          // Hold lock for 2 minutes (prod)
    lockRenewTime: 60000,          // Renew lock every 1 minute (prod)
    concurrency: 1,                // Process one site at a time
}

Cron Schedule

'* * * * *'; // Every 1 minute

Frequency Rationale: 1-minute intervals ensure newly published sites get thumbnails quickly while limiting resource usage.

📋 Processing Logic - Detailed Flow

Site Selection Logic

Priority 1: Missing Thumbnails

{
    status: 'PUBLISHED',
    'details.thumbnails.desktop': null,
    thumbnail_build_in_progress: { $ne: true }
}

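Sites without a desktop thumbnail match this clause. The three clauses are combined with $or and the query applies no sort, so the numbering in this section is for reference and does not control processing order.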

Priority 2: Previously Failed Jobs

{
    status: 'PUBLISHED',
    thumbnail_build_in_progress: false
}

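Sites whose flag is explicitly false. Note that both the success path and failedCb $unset the flag rather than set it to false, so a failed site with no desktop thumbnail is normally re-selected through the missing-thumbnail clause instead.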

Priority 3: Stale Jobs (2-Hour Recovery)

{
    status: 'PUBLISHED',
    thumbnail_build_in_progress: true,
    thumbnail_process_started_at: {
        $lte: new Date(Date.now() - 2 * 60 * 60 * 1000)  // 2 hours ago
    }
}

Jobs stuck for over 2 hours are recovered and retried. The full query also treats a site with the flag set but no thumbnail_process_started_at as stale.
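
For unit tests it can help to mirror the selection rules in plain JavaScript, without a database. The predicate below is a sketch under that assumption. `needsThumbnail` and `STALE_MS` are hypothetical names, and the function follows the full `$or` query from the service file rather than any code in the repository.

```javascript
const STALE_MS = 2 * 60 * 60 * 1000; // 2 hours

// In-memory mirror of the selection query, useful for unit tests.
function needsThumbnail(site, now = Date.now()) {
  if (site.status !== 'PUBLISHED') return false;

  const flag = site.thumbnail_build_in_progress;
  if (flag === false) return true; // Flag explicitly false

  const desktop = site.details?.thumbnails?.desktop;
  if (desktop == null && flag !== true) return true; // Missing thumbnail

  if (flag === true) {
    const startedAt = site.thumbnail_process_started_at;
    if (startedAt === undefined) return true; // No timestamp recorded
    return startedAt instanceof Date && startedAt.getTime() <= now - STALE_MS;
  }
  return false;
}

const now = Date.UTC(2024, 0, 1, 12);
console.log(needsThumbnail({ status: 'PUBLISHED', details: {} }, now)); // true
console.log(
  needsThumbnail(
    {
      status: 'PUBLISHED',
      thumbnail_build_in_progress: true,
      thumbnail_process_started_at: new Date(now - 60 * 1000),
    },
    now,
  ),
); // false: the build started one minute ago
```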

Batch Processing Flow

Query Batch (50 sites)

let sites = await AgencyWebsite.find(findQuery).limit(50).lean().exec();

Mark In-Progress

await AgencyWebsite.updateOne(
  { _id: site._id },
  {
    thumbnail_build_in_progress: true,
    thumbnail_process_started_at: new Date(),
  },
);

Add to Queue

await queue.add({ ...site, siteType: 'agency' }, jobSettings);

Repeat Until No More Sites

if (sites.length === 0) {
  hasMoreAgencySites = false;
}

Screenshot Generation Steps

Step 1: Launch Browser (Reused)

browser = await puppeteer.launch({
  headless: 'new',
  args: [
    /* 24 Chrome flags, listed in the full source above */
  ],
});

Why reuse browser?

Faster job processing (no startup overhead)
Lower memory footprint
Shared browser process across jobs
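
A reused browser needs a way to be replaced if a launch fails or the process hangs. The sketch below shows one possible shape for that, assuming a generic async factory. `createLazySingleton` is a hypothetical helper, and the stand-in factory takes the place of `puppeteer.launch` so the example runs without Chrome.

```javascript
// Hypothetical helper: caches the promise returned by an async factory.
function createLazySingleton(factory) {
  let cached = null;

  function get() {
    if (!cached) {
      const pending = Promise.resolve().then(factory);
      cached = pending;
      // Do not keep a failed launch in the cache
      pending.catch(() => {
        if (cached === pending) cached = null;
      });
    }
    return cached;
  }

  function reset() {
    cached = null; // e.g. after the browser disconnects or hangs
  }

  return { get, reset };
}

// Usage sketch with a stand-in factory (a real one would call puppeteer.launch).
let launches = 0;
const browserHolder = createLazySingleton(async () => ({ id: ++launches }));
const a = browserHolder.get();
const b = browserHolder.get();
console.log(a === b); // true: both callers share one pending launch
```

Caching the promise rather than the resolved browser means two callers that arrive during startup still share one launch.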

Step 2: Generate 3 Screenshots Concurrently

const [desktop, tablet, mobile] = await Promise.all([
  screenshotGenerator(browser, `${url}desktop`, 'desktop'),
  screenshotGenerator(browser, `${url}tablet`, 'tablet'),
  screenshotGenerator(browser, `${url}mobile`, 'mobile'),
]);

Why concurrent?

Up to about 3x faster than sequential capture, since the three page loads overlap
Browser can handle multiple pages
Total time: ~10-15 seconds instead of 30-45 seconds
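
`Promise.all` rejects as soon as one viewport fails, so a single failed capture fails the whole job even though the upload step tolerates partial success. If partial results are acceptable, `Promise.allSettled` is one alternative. The sketch below is an assumption about how that could look, not the current behavior. `captureAll` is a hypothetical name and `capture` stands in for a call to `screenshotGenerator`.

```javascript
const DEVICES = ['desktop', 'tablet', 'mobile'];

// Runs one async capture per device and keeps the ones that succeed.
async function captureAll(capture) {
  const settled = await Promise.allSettled(DEVICES.map(device => capture(device)));
  const thumbnails = {};
  const failures = {};
  settled.forEach((result, index) => {
    const device = DEVICES[index];
    if (result.status === 'fulfilled') {
      thumbnails[device] = result.value;
    } else {
      failures[device] = result.reason;
    }
  });
  return { thumbnails, failures };
}

// Usage sketch:
// const { thumbnails, failures } = await captureAll(device =>
//   screenshotGenerator(browser, `${url}${device}`, device),
// );
```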

Step 3: Per-Device Screenshot

// Navigate to site
await Promise.race([
  page.goto(url, { waitUntil: 'networkidle2', timeout: 0 }),
  new Promise(resolve => setTimeout(resolve, 30000)),
]);

// Set viewport
await page.setViewport({ width: 1600, height: 1600 });

// Scroll to trigger lazy loading
await page.evaluate(() => window.scroll({ top: 100, behavior: 'smooth' }));
await wait(1000);

// Scroll back to top
await page.evaluate(() => window.scroll({ top: 0, behavior: 'smooth' }));
await wait(1000);

// Capture screenshot
await page.screenshot({
  path: filePath,
  type: 'jpeg',
  quality: 60,
  fullPage: false,
});

Why scroll?

Triggers lazy-loaded images near the top of the page (the scroll distance is only 100px)
Ensures content renders properly
Improves screenshot quality

Step 4: Upload to Wasabi

for (let thumb in thumbnails) {
  files[thumb] = await uploadFile(thumbnails[thumb].filePath);
}

Step 5: Cleanup Temp Files

for (let thumb in thumbnails) {
  if (fs.existsSync(thumbnails[thumb].filePath)) {
    files[thumb].size = fs.statSync(thumbnails[thumb].filePath).size;
    fs.unlinkSync(thumbnails[thumb].filePath);
  }
}

Step 6: Update Database

await AgencyWebsite.updateOne(
  { _id: data._id },
  {
    $set: { 'details.thumbnails': files },
    $unset: {
      thumbnail_build_in_progress: '',
      thumbnail_process_started_at: '',
    },
  },
);

Viewport Dimensions

Device	Width	Height	Aspect Ratio
Desktop	1600	1600	1:1
Tablet	1024	1600	16:25
Mobile	411	900	411:900

Why these dimensions?

Desktop: Standard large screen (1600px common breakpoint)
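Tablet: 1024px matches the iPad landscape width; the 1600px height captures more of the page than a physical tablet screen shows
Mobile: 411px matches the CSS viewport width of common Android phones such as the Pixel 2 (iPhone X is 375px wide, iPhone 11 is 414px)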
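
The switch statement in `screenshotGenerator` has no default branch, so an unexpected device type leaves `dimensions` undefined. A lookup table with an explicit error is one way to make that failure obvious. This is a sketch: `VIEWPORTS` and `getViewport` are hypothetical names, and the numbers are copied from the table above.

```javascript
// Dimensions copied from the switch statement in screenshotGenerator.
const VIEWPORTS = Object.freeze({
  desktop: { width: 1600, height: 1600 },
  tablet: { width: 1024, height: 1600 },
  mobile: { width: 411, height: 900 },
});

function getViewport(type) {
  if (!Object.prototype.hasOwnProperty.call(VIEWPORTS, type)) {
    throw new Error(`Unknown device type: ${type}`);
  }
  return { ...VIEWPORTS[type] }; // Copy so callers cannot mutate the table
}

console.log(getViewport('mobile')); // { width: 411, height: 900 }
```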

File Format & Quality

Format: JPEG
Quality: 60%
Typical Size: 50-150 KB per screenshot
Total per site: 150-450 KB (3 screenshots)

Why JPEG 60%?

Good balance of quality vs. file size
Screenshots don't need lossless quality
Faster uploads and page loads

🚨 Error Handling

Common Error Scenarios

Missing Preview URL

if (!data.details?.previews?.all) {
  throw notFound('Missing preview URL data');
}

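Result: The job fails. failedCb is meant to clear the in-progress flag so the site can be selected again; if the flag is left set, the 2-hour stale-job rule eventually releases it. A site with no preview URL keeps failing until details.previews.all is populated.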

Page Load Timeout

await Promise.race([
  page.goto(url, { waitUntil: 'networkidle2', timeout: 0 }),
  new Promise(resolve => setTimeout(resolve, 30000)),
]);

Result: Screenshot captured even if page doesn't fully load (30-second timeout).
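
The race above leaves its 30-second timer running even when navigation finishes early, and it does not report which side won. A small helper can address both points. The sketch below is illustrative only: `raceWithTimeout` is a hypothetical name, not a Puppeteer API.

```javascript
// Hypothetical helper: resolves with { timedOut, value } and clears its timer.
function raceWithTimeout(promise, ms) {
  let timer;
  const timeout = new Promise(resolve => {
    timer = setTimeout(() => resolve({ timedOut: true }), ms);
  });
  const work = Promise.resolve(promise).then(value => ({ timedOut: false, value }));
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Usage sketch inside screenshotGenerator:
// const { timedOut } = await raceWithTimeout(
//   page.goto(url, { waitUntil: 'networkidle2', timeout: 0 }),
//   30000,
// );
// if (timedOut) { /* log that the screenshot may show a partially loaded page */ }
```

Knowing that the timeout won makes it possible to log partially loaded pages, which may help explain low-quality thumbnails.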

Screenshot Generation Failure

catch (error) {
    logger.error({
        initiator: 'QM/sites/build-thumbnails',
        message: `Failed to generate screenshot for ${type}`,
        error: error,
        additional_data: { url },
    });
    throw error;
}

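Result: The page is closed in the finally block and the error is rethrown. Because the three captures run under Promise.all, one failed viewport fails the whole job, after which the failure callback runs and the site becomes available for retry.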

Upload Failure

for (let thumb in thumbnails) {
  try {
    files[thumb] = await uploadFile(thumbnails[thumb].filePath);
  } catch (uploadErr) {
    logger.error({
      initiator: 'QM/sites/build-thumbnails',
      message: `Failed to upload ${thumb} thumbnail`,
      error: uploadErr,
    });
    continue; // Skip this thumbnail, try others
  }
}

Result: Partial success allowed (e.g., desktop + tablet succeed, mobile fails).
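
Because the database update sets `details.thumbnails` as a whole, a partial run replaces the previous object, so a device whose upload failed loses any thumbnail it already had. If that is not desired, the update can target one field per device. The sketch below assumes `details.thumbnails` is an object or absent (a dotted `$set` fails when the parent is `null`). `buildThumbnailUpdate` is a hypothetical helper.

```javascript
// Hypothetical helper: builds an update that only touches devices that uploaded.
function buildThumbnailUpdate(files) {
  const $set = {};
  for (const [device, file] of Object.entries(files)) {
    if (file) {
      $set[`details.thumbnails.${device}`] = file;
    }
  }
  if (Object.keys($set).length === 0) {
    throw new Error('No thumbnails were successfully generated');
  }
  return {
    $set,
    $unset: { thumbnail_build_in_progress: '', thumbnail_process_started_at: '' },
  };
}

const update = buildThumbnailUpdate({
  desktop: { url: 'https://example.com/d.jpg', size: 12345 },
  tablet: { url: 'https://example.com/t.jpg', size: 10234 },
});
console.log(Object.keys(update.$set));
// [ 'details.thumbnails.desktop', 'details.thumbnails.tablet' ]
```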

File Cleanup Error

catch (cleanupErr) {
    logger.warn({
        initiator: 'QM/sites/build-thumbnails',
        message: `Error during cleanup: ${cleanupErr.message}`,
    });
}

Result: Logged as a warning without failing the job. A temp file that could not be deleted stays on disk, since nothing in the code shown here removes it on a later run.

Stale Job Recovery

Detection:

{
    thumbnail_build_in_progress: true,
    thumbnail_process_started_at: {
        $lte: new Date(Date.now() - 2 * 60 * 60 * 1000)  // 2 hours
    }
}

Recovery Action:

Site re-queried on next cron run
thumbnail_process_started_at is refreshed when the site is re-queued
New screenshot generation attempted

Why 2 hours?

Normal generation: 10-30 seconds
Slow sites: up to 3 minutes
2 hours = clear indication of stuck job
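
One consequence of this design: thumbnail_process_started_at is written when the site is queued, not when the worker picks the job up, so time spent waiting in the queue counts toward the 2-hour window. The arithmetic below is a rough sketch of how many queued jobs fit into that window at a given per-job duration. `jobsWithinStaleWindow` is a hypothetical helper, and the durations are the figures quoted above.

```javascript
const STALE_WINDOW_SECONDS = 2 * 60 * 60; // 7200

// How many queued jobs can finish before the earliest ones start to look stale.
function jobsWithinStaleWindow(secondsPerJob, concurrency = 1) {
  return Math.floor((STALE_WINDOW_SECONDS / secondsPerJob) * concurrency);
}

console.log(jobsWithinStaleWindow(30)); // 240
console.log(jobsWithinStaleWindow(180)); // 40
```

At the typical volume of 50-200 sites per day this leaves headroom, but a backlog larger than these numbers could cause sites that are still waiting to be selected and queued a second time.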

Failed Job Handling

const failedCb = async (job, err) => {
  logger.error({
    initiator: 'QM/sites/build-thumbnails',
    error: err,
    additional_data: { job: job.id, job_data: job.data },
  });

  if (job.attemptsMade >= job.opts.attempts) {
    // Clear in_progress flag so site can be retried later
    // (siteType is part of the job payload, so it is read from job.data)
    if (job.data.siteType == 'agency') {
      await AgencyWebsite.updateOne(
        { _id: job.data._id },
        {
          $unset: {
            thumbnail_build_in_progress: '',
            thumbnail_process_started_at: '',
          },
        },
      );
    } else {
      await Instasite.updateOne(
        { _id: job.data._id },
        {
          $unset: {
            thumbnail_build_in_progress: '',
            thumbnail_process_started_at: '',
          },
        },
      );
    }
  }
};
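
The two branches of the failure handler differ only in the model they update, so the model choice can be isolated in one place. The sketch below reads `siteType` from `job.data`, where `queue.add` placed it. `modelForJob` and `clearProgressFlag` are hypothetical helpers, and `models` is an assumed object holding the two Mongoose models.

```javascript
// Hypothetical helper: picks the model from the job payload.
function modelForJob(job, models) {
  const siteType = job.data && job.data.siteType;
  return siteType === 'agency' ? models.AgencyWebsite : models.Instasite;
}

// Hypothetical helper: clears the lock fields once the job has no attempts left.
async function clearProgressFlag(job, models) {
  if (job.attemptsMade < job.opts.attempts) {
    return false; // Bull will retry, keep the flag
  }
  await modelForJob(job, models).updateOne(
    { _id: job.data._id },
    { $unset: { thumbnail_build_in_progress: '', thumbnail_process_started_at: '' } },
  );
  return true;
}

// Usage sketch inside failedCb:
// await clearProgressFlag(job, { AgencyWebsite, Instasite });
```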

📊 Monitoring & Logging

Cron Logging

logger.log({
  initiator: 'QM/sites/build-thumbnails',
  message: 'Starting thumbnail build cron job scheduler',
});

logger.log({
  initiator: 'QM/sites/build-thumbnails',
  message: `Skipping job ${jobId} - previous job still in progress`,
  additional_data: { jobId },
});

Batch Logging

if (totalAgencySites > 0) {
  logger.log({
    initiator: 'QM/sites/build-thumbnails',
    message: `Total agency websites added to thumbnail build queue: ${totalAgencySites}`,
  });
}

if (totalInstasites > 0) {
  logger.log({
    initiator: 'QM/sites/build-thumbnails',
    message: `Total instasites added to thumbnail build queue: ${totalInstasites}`,
  });
}

Job Logging

logger.log({
  initiator: 'QM/sites/build-thumbnails',
  message: `Processing thumbnail build job`,
});

logger.log({
  initiator: 'QM/sites/build-thumbnails',
  message: 'Agency Website thumbnail processed.',
  additional_data: { job: job.id, job_data: data },
});

Error Logging

// Screenshot generation error
logger.error({
  initiator: 'QM/sites/build-thumbnails',
  message: `Failed to generate screenshot for ${type}`,
  error: error,
  additional_data: { url },
});

// Upload error
logger.error({
  initiator: 'QM/sites/build-thumbnails',
  message: `Failed to upload ${thumb} thumbnail`,
  error: uploadErr,
});

// Failed job error
logger.error({
  initiator: 'QM/sites/build-thumbnails',
  error: err,
  additional_data: { job: job.id, job_data: job.data },
});

Performance Metrics

Average Processing Time: 10-30 seconds per site (3 screenshots)
Batch Size: 50 sites per batch
Concurrency: 1 job at a time (sequential processing)
Success Rate: ~90% (failures due to broken sites or timeouts)
Typical Volume: 50-200 sites per day

🔗 Integration Points

Triggers This Job

Cron Schedule: Every 1 minute automatically
New Site Published: Sites with status: 'PUBLISHED' picked up on next cron run
Manual Trigger: Via API endpoint (if QM_HOOKS=true)

Data Dependencies

Published Sites: Must have status: 'PUBLISHED'
Preview URLs: Must have details.previews.all field populated
Puppeteer: Requires headless Chrome (installed via npm)
Wasabi S3: Requires upload credentials and bucket access

Jobs That Depend On This

Site Preview: Dashboard uses thumbnails for site preview cards
Site Management: Thumbnails displayed in site lists
Email Notifications: Thumbnails included in site completion emails

⚠️ Important Notes

Side Effects

⚠️ Puppeteer Launch: Spawns headless Chrome process (high memory usage)
⚠️ File System: Creates temp screenshots in queues/sites/buildThumbnails/thumbnails/
⚠️ Wasabi Upload: Uploads 3 JPEGs per site (bandwidth usage)
⚠️ Database Updates: Marks sites in-progress, updates thumbnail URLs
⚠️ Page Loads: Visits each site 3 times (once per viewport)

Performance Considerations

1-Minute Intervals: Balance between responsiveness and resource usage
Batch Processing: 50 sites per batch prevents memory overflow
Concurrency of 1: Sequential processing limits CPU/memory usage
Browser Reuse: Single Puppeteer instance across jobs
30-Second Page Timeout: Prevents hanging on slow sites
JPEG 60%: Optimized file size for fast uploads

Maintenance Notes

Temp File Cleanup: Automatic cleanup in finally blocks
Stale Job Recovery: 2-hour timeout for stuck jobs
Browser Process: May need manual restart if hung (rare)
Puppeteer Updates: May require Chrome binary updates
Wasabi Storage: Thumbnails stored indefinitely (no expiration)

Resource Requirements

Memory:

Puppeteer: 200-400 MB per browser instance
Screenshots: 50-150 KB per image (temp storage)
Total: 500 MB - 1 GB recommended

CPU:

Screenshot rendering: High CPU during generation
Concurrent screenshots: 3 pages = 3x CPU usage
Recommended: 2+ CPU cores

Disk:

Temp screenshots: 150-450 KB per site
Cleanup: Automatic after upload
Recommended: 1 GB temp space

Network:

Page loads: ~1-5 MB per site
Uploads: ~150-450 KB per site (3 screenshots)
Recommended: 10+ Mbps
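
Combining the per-site upload size with the typical daily volume gives a rough daily upload figure. The sketch below is plain arithmetic on the numbers quoted on this page; `dailyUploadMB` is a hypothetical helper and the result is an estimate, not a measurement.

```javascript
// Daily upload volume: sites per day x KB per site.
// Uses decimal units (1 MB = 1000 KB) for a rough estimate.
function dailyUploadMB(sitesPerDay, kbPerSite) {
  return (sitesPerDay * kbPerSite) / 1000;
}

const low = dailyUploadMB(50, 150);   // quiet day, small screenshots
const high = dailyUploadMB(200, 450); // busy day, large screenshots
console.log(`Estimated uploads: ${low} to ${high} MB per day`);
```

Because thumbnails are stored without expiration, the same figure is an upper bound on daily storage growth if rebuilt thumbnails are written as new objects rather than overwriting old ones.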

Docker Considerations

Puppeteer in Docker requires:

Chrome dependencies installed
--no-sandbox flag (security consideration)
/dev/shm size increased, or the --disable-dev-shm-usage flag added so Chrome writes shared memory files to /tmp instead

Dockerfile example:

RUN apt-get update && apt-get install -y \
    chromium \
    fonts-ipafont-gothic \
    fonts-wqy-zenhei \
    fonts-thai-tlwg \
    fonts-kacst \
    fonts-freefont-ttf \
    && rm -rf /var/lib/apt/lists/*
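
On the Node side, the flags listed above are passed through Puppeteer's launch options. The following is a minimal sketch: `buildLaunchOptions` and its `inDocker` switch are hypothetical, and the `/usr/bin/chromium` path in the usage line is an assumption about where the apt package installs the binary.

```javascript
// Builds Puppeteer launch options for a containerized run.
// The helper itself is illustrative; the Chrome flags are the ones listed above.
function buildLaunchOptions({
  inDocker = false,
  executablePath = process.env.PUPPETEER_EXECUTABLE_PATH,
} = {}) {
  const args = [];
  if (inDocker) {
    args.push(
      '--no-sandbox', // Chrome's sandbox usually cannot start as root in a container
      '--disable-setuid-sandbox',
      '--disable-dev-shm-usage' // write shared memory files to /tmp instead of /dev/shm
    );
  }
  const options = { headless: 'new', args };
  // Only set executablePath when one is known; otherwise Puppeteer uses its own download.
  if (executablePath) options.executablePath = executablePath;
  return options;
}

console.log(buildLaunchOptions({ inDocker: true, executablePath: '/usr/bin/chromium' }));
```

The result would be passed to `puppeteer.launch(...)`. Keeping `--no-sandbox` behind a switch limits the reduced isolation to environments that actually need it.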

🧪 Testing

Manual Trigger

# Via API (if QM_HOOKS=true)
curl -X POST http://localhost:6002/api/trigger/sites/buildThumbnails

Create Test Site

// Create agency website needing thumbnails
const testSite = await AgencyWebsite.create({
  status: 'PUBLISHED',
  details: {
    previews: {
      all: 'https://preview.dashclicks.com/site/12345',
    },
    thumbnails: {
      desktop: null,
      tablet: null,
      mobile: null,
    },
  },
  thumbnail_build_in_progress: false,
});

// Wait for the next cron run (up to 1 minute) plus processing time before checking
setTimeout(async () => {
  const updated = await AgencyWebsite.findById(testSite._id);
  console.log('Thumbnails generated:', updated.details.thumbnails);
  // Expected shape: { desktop: { url: '...', size: 123456 }, tablet: {...}, mobile: {...} }
}, 2 * 60 * 1000);
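
A fixed delay can fire before the cron run and the screenshot job have both finished. One alternative is to poll until the thumbnails appear. The sketch below uses a hypothetical `pollUntil` helper and a stand-in lookup so it runs without a database.

```javascript
// Polls `check` until it returns a truthy value or the timeout elapses.
// `pollUntil` is a hypothetical helper, not part of the job's codebase.
async function pollUntil(check, { intervalMs = 5000, timeoutMs = 180000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const result = await check();
    if (result) return result;
    // Give up if the next attempt would start after the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Condition not met within ${timeoutMs} ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Stand-in for a database lookup: reports a thumbnail on the third call.
let calls = 0;
const fakeLookup = async () => (++calls >= 3 ? { url: '...' } : null);

pollUntil(fakeLookup, { intervalMs: 10, timeoutMs: 1000 }).then((thumbnail) => {
  console.log(`Thumbnail found after ${calls} checks:`, thumbnail);
});
```

In the test above, the check could instead be a function that reloads the site with `AgencyWebsite.findById(testSite._id)` and returns `details.thumbnails.desktop`, which stays `null` until the build completes.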

Monitor Queue Status

// Count sites pending thumbnail generation
const pendingAgency = await AgencyWebsite.countDocuments({
  status: 'PUBLISHED',
  'details.thumbnails.desktop': null,
  thumbnail_build_in_progress: { $ne: true },
});
console.log('Agency websites pending thumbnails:', pendingAgency);

const pendingInsta = await Instasite.countDocuments({
  status: 'PUBLISHED',
  'details.thumbnails.desktop': null,
  thumbnail_build_in_progress: { $ne: true },
});
console.log('InstaSites pending thumbnails:', pendingInsta);

// Count sites currently processing
const inProgressAgency = await AgencyWebsite.countDocuments({
  thumbnail_build_in_progress: true,
});
console.log('Agency websites in progress:', inProgressAgency);

// Find stale jobs (over 2 hours)
const staleJobs = await AgencyWebsite.find({
  thumbnail_build_in_progress: true,
  thumbnail_process_started_at: {
    $lte: new Date(Date.now() - 2 * 60 * 60 * 1000),
  },
});
console.log('Stale jobs requiring recovery:', staleJobs.length);

Test Puppeteer Setup

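const puppeteer = require('puppeteer');

// Wrapped in an async function because top-level await is unavailable in CommonJS scripts
(async () => {
  // Test browser launch
  const browser = await puppeteer.launch({
    headless: 'new',
    args: ['--no-sandbox', '--disable-setuid-sandbox'],
  });

  try {
    // Test page navigation
    const page = await browser.newPage();
    await page.goto('https://example.com', { waitUntil: 'networkidle2' });

    // Test screenshot
    await page.screenshot({ path: 'test.jpg', type: 'jpeg', quality: 60 });
    console.log('Puppeteer test successful!');
  } finally {
    // Always close the browser so a failed test does not leave Chrome running
    await browser.close();
  }
})().catch((err) => {
  console.error('Puppeteer test failed:', err);
  process.exit(1);
});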

Verify Wasabi Upload

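const fs = require('fs');
const Upload = new (require('./utils/wasabi'))();

(async () => {
  // Test file upload (reads the test.jpg written by the Puppeteer test)
  const testFile = {
    contentType: 'image/jpeg',
    filename: 'test-thumbnail.jpg',
    content: fs.readFileSync('./test.jpg'),
  };

  const result = await Upload.upload([testFile]);
  console.log('Uploaded to Wasabi:', result[0].url);
})().catch((err) => {
  console.error('Wasabi upload failed:', err);
  process.exit(1);
});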

Job Type: Scheduled
Execution Frequency: Every 1 minute
Average Duration: 10-30 seconds per site
Status: Active
