
SEO Case Study: From zero to Google's top — Part 2: Execution, tools and first results

Documenting the execution phase: automated ranking monitoring tools (Puppeteer + MongoDB), keyword expansion from 10 to 55, on-page optimizations, and cyber security strategy — with real code and measured results.

#seo #google #astro #puppeteer #mongodb #cyber-security #case-study #ranking

This is the second article in the “From zero to Google’s top” series. In Part 1, we documented the initial diagnosis: zero Google presence, complete baseline with screenshots, tech stack, and 8 technical SEO fixes that pushed PageSpeed mobile from 58 to 87.

Now we’ll document the execution phase — what was built, optimized, and measured in the first week after diagnosis.

What changed: In Part 1 we had 16 pages and zero monitoring tools. Now we have 78 pages, 55 monitored keywords, an automated ranking system with a real-time dashboard, and a fully implemented cyber security strategy.


What was built: seo-tools

The first thing I did after the diagnosis was build tools to automatically measure what Google returns for our target keywords. Without measurement, SEO is guesswork.

The system has 4 components:

seo-tools/
├── check-ranking.mjs    # Puppeteer scraper — simulates human searches on Google
├── dashboard.html        # Visual dashboard: charts, tables, screenshots
├── dashboard.mjs         # Express API (port 3333) to serve data
├── db.mjs                # MongoDB layer — persistence and aggregation
└── keywords.config.json  # 55 configurable keywords

1. Ranking scraper: check-ranking.mjs

The heart of the system. A Node.js script that uses Puppeteer to open Google in a real browser, type each keyword like a human would, and extract the organic results.

Anti-detection architecture:

// User-Agent rotation per session
const USER_AGENTS = [
  "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 Chrome/134.0.0.0",
  "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 Chrome/133.0.0.0",
  "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:135.0) Firefox/135.0",
  "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 Edg/134.0.0.0",
];

// Stealth patches — hide automation signals
await page.evaluateOnNewDocument(() => {
  Object.defineProperty(navigator, "webdriver", { get: () => false });
  Object.defineProperty(navigator, "plugins", { get: () => [1, 2, 3, 4, 5] });
  window.chrome = { runtime: {}, loadTimes: function(){}, csi: function(){} };
});

// Browser restarts every 15 keywords to avoid fingerprinting
const RESTART_EVERY = 15;
if (i > 0 && i % RESTART_EVERY === 0) {
  await browser.close();
  ({ browser, page } = await launchBrowser());
}

Human simulation:

The script doesn’t simply load a URL with ?q=keyword. It:

  1. Opens google.com and finds the search box
  2. Types the keyword character by character with random delays (25-85ms per letter)
  3. Waits 600-1000ms before pressing Enter
  4. Scrolls the page in multiple random steps (200-500px each)
  5. Sometimes scrolls back up (60% chance) to mimic natural browsing
  6. Takes a full-page screenshot of each SERP
  7. Random delays of 9.6-19.2s between searches (base 12s, ±40%)
  8. Extra pause every 5 keywords (~5-10s additional)

// Human-like typing — character by character
for (const char of keyword) {
  await page.keyboard.type(char, { delay: 0 }); // per-character delay handled manually below
  await sleep(25 + Math.random() * 60);         // 25-85ms per letter
}
await sleep(600 + Math.random() * 400);
await page.keyboard.press('Enter');

// Google uses AJAX — can't wait for traditional navigation
await page.waitForSelector('#search, #rso, #botstuff', { timeout: 12000 });
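The snippets above call a sleep() helper that Puppeteer doesn't provide. A minimal sketch — the jitter formula is illustrative, chosen so the article's 9.6-19.2s range falls out of a 0.8x-1.6x scaling of the 12s base:

```javascript
// Promise-based sleep helper used throughout the scraper snippets
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Jittered delay between searches: 0.8x-1.6x of the base delay,
// i.e. 9600-19200ms for the 12000ms base (assumed formula, not from the scraper)
const jitter = (base) => base * (0.8 + Math.random() * 0.8);
```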

CAPTCHA detection and recovery:

// When Google detects automation and shows a CAPTCHA
if (block.captcha) {
  console.error("🚫 CAPTCHA detected! Waiting for manual resolution...");
  // 4 attempts with 30s wait each (2 min total)
  for (let wait = 0; wait < 4; wait++) {
    await sleep(30000);
    const retryBlock = await checkForBlock(page);
    if (!retryBlock.captcha) {
      console.log("✅ CAPTCHA resolved!");
      break;
    }
  }
  // If it persists, restart browser with new UA and continue
  await browser.close();
  ({ browser, page } = await launchBrowser());
}

Result extraction:

Google frequently changes its HTML selectors. The scraper uses multiple selectors with fallbacks to work even when Google updates the layout:

// Primary selectors (2025-2026)
const containers = document.querySelectorAll(
  '#search .g, #rso .g, #rso [data-sokoban-container], ' +
  '#rso .tF2Cxc, #rso .MjjYud .g, #rso .N54PNb, #rso .kb0PBd'
);

// Fallback 1: any link with <h3>
document.querySelectorAll('#rso a[href^="http"]').forEach(a => {
  const h3 = a.querySelector('h3');
  if (h3) results.push({ url: a.href, title: h3.textContent });
});

// Fallback 2: broader search
document.querySelectorAll('a[href^="http"]').forEach(a => {
  const h3 = a.querySelector('h3');
  if (h3 || a.closest('[data-sokoban-container]')) { /* ... */ }
});
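Once the organic results are extracted in SERP order, computing the site's position is a simple scan over the list. An illustrative helper (not taken from the scraper — the name and signature are assumptions):

```javascript
// Given extracted results in SERP order, return the 1-based position of the
// first result on our domain, or null if absent. Positions continue across
// pages: page 2, result 2 => position 12 (with 10 results per page).
function findPosition(results, domain, page = 1, perPage = 10) {
  const index = results.findIndex((r) => {
    try {
      const host = new URL(r.url).hostname;
      return host === domain || host.endsWith("." + domain);
    } catch {
      return false; // ignore malformed URLs
    }
  });
  return index === -1 ? null : (page - 1) * perPage + index + 1;
}
```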

2. MongoDB layer: db.mjs

Each scan generates data that needs to be persisted for comparison over time. MongoDB stores:

// 3 collections with optimized indexes
await db.collection("scans").createIndex({ date: -1 });
await db.collection("rankings").createIndex({ keyword: 1, date: -1 });
await db.collection("screenshots").createIndex({ scanId: 1, keyword: 1 });

Collection   | Content                                        | Typical size
scans        | Metadata for each run (date, status, counters) | ~1 KB/scan
rankings     | Position of each keyword per scan              | ~55 docs/scan
screenshots  | Binary PNG screenshots of SERPs                | ~200 KB/screenshot
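For reference, a single rankings document might look like this — the shape is inferred from the indexes above; field names beyond keyword, position, date and scanId are assumptions:

```javascript
// Example rankings document (inferred shape, not a dump from the real DB)
const exampleRanking = {
  keyword: "rafaelroot",
  position: null,                       // null = not found in top 30
  page: null,
  url: null,
  date: new Date("2026-03-09T12:00:00Z"),
  scanId: "000000000000000000000000",   // ObjectId of the parent scan
};
```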

Aggregation for history:

The most useful aggregation pipeline shows the evolution of each keyword across scans:

export async function getAllKeywordsLatest() {
  return db.collection("rankings").aggregate([
    { $sort: { date: -1 } },
    {
      $group: {
        _id: "$keyword",
        latestPosition: { $first: "$position" },
        latestDate: { $first: "$date" },
        history: {
          $push: {
            position: "$position",
            date: "$date",
            scanId: "$scanId",
          },
        },
      },
    },
    { $sort: { _id: 1 } },
  ]).toArray();
}
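For readers not running MongoDB, the same grouping can be expressed in plain JavaScript — a sketch mirroring the pipeline, assuming date values are numerically comparable (timestamps or Dates):

```javascript
// Plain-JavaScript equivalent of the aggregation pipeline above:
// sort newest-first, group by keyword, keep latest position plus full history.
function groupLatest(rankings) {
  const sorted = [...rankings].sort((a, b) => b.date - a.date);
  const groups = new Map();
  for (const r of sorted) {
    if (!groups.has(r.keyword)) {
      // first (newest) entry per keyword supplies the $first fields
      groups.set(r.keyword, {
        _id: r.keyword,
        latestPosition: r.position,
        latestDate: r.date,
        history: [],
      });
    }
    groups.get(r.keyword).history.push({
      position: r.position,
      date: r.date,
      scanId: r.scanId,
    });
  }
  // { $sort: { _id: 1 } } — alphabetical by keyword
  return [...groups.values()].sort((a, b) => a._id.localeCompare(b._id));
}
```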

And the comparison between two scans:

export async function compareScans(scanId1, scanId2) {
  const [rankings1, rankings2] = await Promise.all([
    getRankingsByScan(scanId1),
    getRankingsByScan(scanId2),
  ]);

  const map1 = new Map(rankings1.map((r) => [r.keyword, r]));
  const map2 = new Map(rankings2.map((r) => [r.keyword, r]));
  const allKeywords = new Set([...map1.keys(), ...map2.keys()]);

  const diff = [];
  for (const keyword of allKeywords) {
    const prevPos = map1.get(keyword)?.position ?? null;
    const currPos = map2.get(keyword)?.position ?? null;
    let status;
    if (prevPos == null && currPos != null) status = "new";
    else if (prevPos != null && currPos == null) status = "lost";
    else {
      const change = prevPos - currPos; // positive = improved (moved up)
      status = change > 0 ? "up" : change < 0 ? "down" : "stable";
    }
    diff.push({ keyword, prevPos, currPos, status });
  }
  return diff;
}

3. Dashboard: dashboard.html + dashboard.mjs

The dashboard is an SPA with Canvas for charts and Express as the API backend.

5 functional tabs:

Tab             | Function
🏆 Rankings     | Table with position, page, URL and screenshot link
📈 Evolution    | Canvas line chart per keyword + sparklines in table
📸 Screenshots  | SERP screenshot grid with fullscreen modal
⚖️ Compare      | Select two scans and show per-keyword diff (↑↓🆕💀)
🔧 Improvements | Prioritized list of pending SEO optimizations

The Canvas chart renders with devicePixelRatio for sharp Retina displays:

const dpr = window.devicePixelRatio || 1;
canvas.width = displayW * dpr;
canvas.height = displayH * dpr;
canvas.style.width = displayW + 'px';
canvas.style.height = displayH + 'px';
ctx.scale(dpr, dpr);

API endpoints:

GET /api/scans                          # List all scans
GET /api/scans/:id/rankings             # Rankings for a specific scan
GET /api/scans/:id/screenshots          # List screenshots for a scan
GET /api/screenshots/:scanId/:keyword   # PNG image of a screenshot
GET /api/overview                       # Aggregation of all keywords
GET /api/compare/:id1/:id2              # Comparison between two scans

Keyword expansion: 10 → 42 → 55

The keyword strategy evolved in 3 phases:

Phase 1: Initial keywords (10)

In Part 1, we defined 10 keywords focused on personal branding:

rafael cavalcanti da silva, rafael cavalcanti, rafaelroot,
rafael root, rafael cavalcanti desenvolvedor, rafael cavalcanti fullstack,
rafael cavalcanti segurança, rafael cavalcanti devops,
rafael cavalcanti security specialist, rafael cavalcanti python

Phase 2: Expansion to 42

Analyzing blog articles revealed we had content to rank for specific technical niches, not just personal branding. We added keywords for:

  • Linux Hardening (7): hardening linux, hardening linux checklist, linux server hardening guide, etc.
  • Nginx (3): nginx reverse proxy go, nginx reverse proxy go production, etc.
  • Reverse Engineering (6): engenharia reversa android frida, frida android hooking, etc.
  • Professional (4): desenvolvedor fullstack brasilia, especialista segurança informação brasil, etc.
  • Products (6): xpusher push ads, wsocket websocket sdk, cnab 240 python, etc.
  • SEO (2): seo case study google ranking, caso de estudo seo google posicionamento

Phase 3: Expansion to 55 (+13 cyber security)

An audit revealed that the words “cyber security” and “cibersegurança” appeared zero times across the entire site — despite security being a core competency. We added:

"cyber security",
"cybersecurity specialist",
"cibersegurança",
"cibersegurança brasil",
"especialista cibersegurança",
"especialista cibersegurança brasil",
"cyber security specialist brazil",
"offensive security pentesting",
"segurança ofensiva pentesting",
"rafael cavalcanti cyber security",
"rafael cavalcanti go golang",
"rafael cavalcanti reverse engineering",
"rafael cavalcanti brasilia"

Cyber security strategy: implementation

The “zero cyber security signals” diagnosis led to a 4-layer implementation:

1. Meta keywords (all 5 locales)

// src/i18n/pt-br.ts
meta: {
  keywords: "desenvolvedor fullstack, cibersegurança, pentesting, " +
    "segurança ofensiva, segurança defensiva, hardening, ..."
}

// src/i18n/en.ts
meta: {
  keywords: "fullstack developer, cyber security, pentesting, " +
    "offensive security, defensive security, hardening, ..."
}

Yes, Google ignores <meta name="keywords">. But Bing, Yandex, and Baidu still use them. For a multilingual site with 5 locales, it’s worth the effort.

2. Blog article tags

Each article received relevant security tags:

Article                                  | Tags added
Engenharia Reversa Android + Frida       | segurança, cibersegurança, pentesting
Hardening Linux Segurança Defensiva      | cibersegurança, cyber-security
Nginx Reverse Proxy Go (PT)              | segurança, hardening
Reverse Engineering Android + Frida (EN) | security, cyber-security, pentesting
Linux Server Hardening Checklist (EN)    | cyber-security
Nginx Reverse Proxy Go (EN)              | security, hardening

3. JSON-LD knowsAbout (both layouts)

"knowsAbout": [
  "Web Development", "Fullstack Development",
  "Cyber Security", "Penetration Testing",
  "Offensive Security", "Defensive Security",
  "Linux Hardening", "Reverse Engineering",
  "..."
]

4. Sitemap with segmented priorities

serialize(item) {
  const url = item.url;
  if (url === 'https://rafaelroot.com/') {
    item.priority = 1.0;          // Homepage
  } else if (url.match(/\/(en|pt|es|ru)\/$/)) {
    item.priority = 0.9;          // Locale homes
  } else if (url.match(/\/blog\/.+\//)) {
    item.priority = 0.9;          // Articles (primary content)
    item.changefreq = 'monthly';
  } else if (url.match(/\/habilidades\/.+|\/skills\/.+/)) {
    item.priority = 0.85;         // Skill subpages
  } else if (url.match(/\/blog\/$/)) {
    item.priority = 0.85;         // Blog index
  } else {
    item.priority = 0.8;          // Sections
  }
  return item;
}
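For context, this hook plugs into the sitemap integration's serialize option — roughly like this in the Astro config (a sketch; only the site URL is from the article, the rest is illustrative):

```javascript
// astro.config.mjs — wiring a serialize hook into @astrojs/sitemap (sketch)
import { defineConfig } from "astro/config";
import sitemap from "@astrojs/sitemap";

export default defineConfig({
  site: "https://rafaelroot.com",
  integrations: [
    sitemap({
      serialize(item) {
        // ...the priority rules shown above go here...
        return item;
      },
    }),
  ],
});
```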

Site expansion: 16 → 78 pages

In Part 1 the site had 16 pages. Now it has 78. Here’s what was added:

Full i18n with internationalized slugs

Each section of the site now has translated URLs:

Section    | pt-br        | en             | es              | ru
About      | /sobre       | /en/about      | /es/acerca      | /ru/обо-мне
Skills     | /habilidades | /en/skills     | /es/habilidades | /ru/навыки
Experience | /experiencia | /en/experience | /es/experiencia | /ru/опыт
Blog       | /blog        | /en/blog       | /es/blog        | /ru/блог
Projects   | /projetos    | /en/projects   | /es/proyectos   | /ru/проекты
Contact    | /contato     | /en/contact    | /es/contacto    | /ru/контакт

Individual skill pages

30 new pages dynamically created — each skill has its own page with a detailed description:

/habilidades/python    →  /en/skills/python
/habilidades/go        →  /en/skills/go
/habilidades/docker    →  /en/skills/docker
/habilidades/linux     →  /en/skills/linux
... (30 skill subpages)

SEO impact

╔══════════════════════════════════════════════════════════╗
║             SITE EXPANSION — Part 1 → Part 2             ║
╠══════════════════════════════════════════════════════════╣
║                                                          ║
║  Indexable pages:        16 → 78  (+387%)                ║
║  Monitored keywords:     10 → 55  (+450%)                ║
║  Locales with URLs:      5        (unchanged)            ║
║  Blog articles:          6        (unchanged)            ║
║  Sections with URLs:     0 → 6 × 5 locales = 30          ║
║  Skill subpages:         0 → 30                          ║
║  Sitemap URLs:           16 → 78                         ║
║                                                          ║
╚══════════════════════════════════════════════════════════╝

More pages = more URLs in sitemap = more indexation opportunities = more entry points for organic traffic. Each skill subpage is optimized for long-tail keywords like “rafael cavalcanti python” or “rafael cavalcanti docker”.


First scan: automated baseline (March 9, 2026)

The first scan with check-ranking.mjs was run the day after diagnosis:

╔══════════════════════════════════════════════════════════════╗
║   SEO Ranking Checker v2 — Visible Browser + MongoDB        ║
╚══════════════════════════════════════════════════════════════╝

🎯 Domain: rafaelroot.com
🔑 Keywords: 55
📄 Max pages/keyword: 3 (top 30)
⏱  Delay: 12000ms

📋 Scan ID: [MongoDB ObjectId]

First scan results

Metric                   | Value
Total keywords           | 55
Found (domain in top 30) | 0
Top 10                   | 0
Top 30                   | 0
Not found                | 55
CAPTCHAs                 | Yes — resolved with browser restart

All 55 keywords: not found.

This was completely expected. The site is 2 days old. Google hasn’t indexed any pages yet. The normal cycle is:

  1. Days 1-3: Sitemap submitted, Google starts crawling
  2. Days 3-7: First pages appear in the index
  3. Weeks 2-4: Indexation stabilizes, first impressions in Search Console
  4. Months 1-3: Positions start forming for low-competition keywords

The value of this scan is being the automated zero point. When we run the next scan, any keyword appearing in the top 30 will represent real, measurable progress.

What competitors dominate

The scan also captures the top 10 results for each keyword. Competitor analysis:

Keyword                            | Top 3 results                   | Opportunity
"rafael cavalcanti"                | LinkedIn, Instagram, Escavador  | Medium — strong platforms but generic content
"rafael cavalcanti da silva"       | Jusbrasil, LinkedIn, G1         | Good — legal/negative content, not technical
"rafaelroot"                       | No relevant results             | Excellent — virgin keyword
"hardening linux checklist"        | DigitalOcean, Red Hat, BR blogs | Difficult — strong competition
"engenharia reversa android frida" | Few PT-BR results               | Good — niche with little quality Portuguese content
"cibersegurança brasil"            | News portals, companies         | Difficult — corporate competition

Insight: The best opportunities are:

  1. Brand keywords (rafaelroot, rafael root) — no competition
  2. Long-tail technical in PT-BR (engenharia reversa android frida) — niche with little quality content
  3. Full name (rafael cavalcanti da silva) — competitors are legal content, which Google may consider less relevant than a professional portfolio

Bugs found and fixed in seo-tools

The first scan revealed 5 bugs that were fixed:

Bug 1: Illegal break statement (SyntaxError)

An extra } brace inside checkKeyword closed the for loop prematurely, leaving the break statement outside any loop — a fatal SyntaxError.

Bug 2: waitForNavigation timeout

The script used page.waitForNavigation() after pressing Enter. But Google uses AJAX to load results — the page doesn’t navigate, it updates via JavaScript. The waitForNavigation hung until timeout.

Fix: Changed to page.waitForSelector('#search, #rso, #botstuff') — waits for result containers to appear in the DOM, regardless of navigation.

Bug 3: networkidle2 failing

The waitUntil: 'networkidle2' resolves only when there are no more than 2 network connections for at least 500ms. But Google makes constant background requests (telemetry, ads, suggestions) — the page never settles into idle.

Fix: Changed to domcontentloaded — waits only for the DOM to be ready, without depending on network.

Bug 4: Browser session accumulating detection

Using the same browser for 55 consecutive keywords accumulates automation signals in the browser’s fingerprint. Result: CAPTCHAs starting around keyword ~20.

Fix: Browser restarts every 15 keywords with a new User-Agent:

async function launchBrowser() {
  const browser = await puppeteer.launch({
    args: [
      "--disable-blink-features=AutomationControlled",
      "--disable-infobars",
      "--disable-extensions",
    ],
    ignoreDefaultArgs: ['--enable-automation'],
  });
  const page = await browser.newPage();
  await page.setUserAgent(getRandomUA()); // Different UA each session
  return { browser, page };
}
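The getRandomUA() call referenced above is a one-line draw from the USER_AGENTS pool shown earlier — a minimal sketch (pool truncated here):

```javascript
// User-Agent pool (truncated; see the full list earlier in the article)
const USER_AGENTS = [
  "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 Chrome/134.0.0.0",
  "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 Chrome/133.0.0.0",
];

// Pick a random UA for each new browser session
function getRandomUA() {
  return USER_AGENTS[Math.floor(Math.random() * USER_AGENTS.length)];
}
```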

Bug 5: Dashboard chart empty

The Canvas chart wouldn’t render because every position was the string “não encontrado” (“not found”), not a number. The code then called Math.max() on an empty array, which returns -Infinity and broke the chart scale.

Fix: Explicit empty state on Canvas:

if (allHistory.length === 0) {
  ctx.fillStyle = '#8b949e';
  ctx.font = '16px sans-serif';
  ctx.textAlign = 'center';
  ctx.fillText('No numeric position data to display.', W / 2, H / 2);
  ctx.fillText('Run more scans to see the evolution.', W / 2, H / 2 + 25);
  return;
}

On-page optimizations performed

Beyond the cyber security strategy, several on-page optimizations were implemented:

Organization name removal from i18n

All 5 internationalization files contained real company/organization names in experience text and descriptions. We replaced them with generic references:

// BEFORE
"Worked at Norte Energia S.A. as..."

// AFTER
"Worked at a major energy company as..."

Reason: avoid our pages competing for those organizations’ branded search queries, and maintain professional privacy.

Logo and branding simplified

  • Logo updated from rafael4root to Rafael Cavalcanti
  • Footer simplified — removed unnecessary references
  • GitHub links added to each project

Blog CSS fixed on homepage

The blog.css file was imported only on the /blog route, but the homepage also shows article previews. Result: blog styles missing on the homepage. Fixed by adding conditional import.

Nav translations localized

The skills links in the navigation showed English text even in non-English locales. Fixed with localized terms in each i18n file.


Part 2 checklist

✅ Completed

  • Automated ranking tool (check-ranking.mjs) with Puppeteer + anti-detection
  • Visual dashboard with 5 tabs (Rankings, Evolution, Screenshots, Compare, Improvements)
  • MongoDB as persistence backend with aggregation and scan comparison
  • Keyword expansion: 10 → 42 → 55
  • Cyber security strategy: meta keywords, blog tags, JSON-LD across 4 layers
  • Sitemap with segmented priorities (1.0 → 0.8)
  • i18n with internationalized slugs: 16 → 78 pages
  • 30 skill subpages dynamically generated
  • First scan executed — 55-keyword baseline recorded
  • 5 bugs fixed in seo-tools (SyntaxError, timeout, CAPTCHA, chart, network)
  • Organization names removed from i18n
  • Logo and branding updated
  • Blog CSS and nav translations fixed
  • Google Search Console configured + sitemap submitted

⬜ Pending for Part 3

  • Second scan with real data (wait for indexation)
  • Configure GA4 with PUBLIC_GA_ID
  • Optimize Nginx (brotli, cache headers)
  • Internal linking between blog posts
  • Unique Open Graph images per article
  • Create /about page with full schema.org Person
  • Technical link building (GitHub, dev.to, Stack Overflow)
  • FAQ Schema on technical articles
  • Analyze real Search Console data

Updated metrics (March 9, 2026)

╔══════════════════════════════════════════════════════════════╗
║                  CURRENT STATE — 03/09/2026                  ║
╠══════════════════════════════════════════════════════════════╣
║                                                              ║
║  Total pages:               78 (+387% vs Part 1)             ║
║  Monitored keywords:        55 (+450% vs Part 1)             ║
║  Keywords in top 30:        0  (expected — site is 2 days)   ║
║  Scans executed:            1                                ║
║  PageSpeed Mobile:          87/100                           ║
║  PageSpeed Desktop:         90/100                           ║
║  SEO Score:                 100/100                          ║
║  Indexed pages (GSC):       Waiting (~3-7 days)              ║
║  Organic impressions:       0 (waiting for indexation)       ║
║                                                              ║
║  Next action: wait for indexation, run scan #2               ║
║                                                              ║
╚══════════════════════════════════════════════════════════════╝

What’s coming in Part 3

In the next article (planned for week 8), we’ll document:

  1. Scan #1 vs scan #N comparison — which keywords went from “not found” to real positions?
  2. Search Console data — impressions, clicks, CTR, average position per keyword
  3. Indexation analysis — how many of the 78 pages did Google index? Which were prioritized?
  4. Link building — backlinks built and their measured impact
  5. ROI — hours invested vs results obtained
  6. Replicable checklist — simplified version of everything we did for other professionals to apply

This article is part of the “From zero to Google’s top” series. See Part 1 (diagnosis) and follow Part 3 (coming soon) to see the results in real time.

Last updated: March 9, 2026