How to Track and Optimize AEO With AWS CloudFront: Salespeak Lambda@Edge Setup


66% of internet traffic is bots. Your Google Analytics dashboard sees almost none of it. GA runs client-side JavaScript, and AI crawlers don't execute JavaScript. GPTBot, ClaudeBot, PerplexityBot, Google-Extended: they hit your origin, grab the HTML, and leave. No pageview. No event. No trace in your reporting.
AI crawling grew 15x in 2025. If you're running your site on AWS CloudFront and you're not detecting these visitors at the edge, you're missing the fastest-growing traffic segment hitting your infrastructure. Worse, you're serving those crawlers the same generic HTML you serve everyone else, when you could be serving content optimized for how AI models actually parse and cite sources.
That's what Salespeak's LLM Optimizer does on CloudFront. It uses Lambda@Edge to detect AI crawlers, serve them optimized content, and log every visit to your analytics dashboard. And if you're already on AWS, there's nothing new to deploy manually.
Why CloudFront is the right place to detect AI traffic
Most AEO tracking solutions sit at the application layer. They add middleware to your web server, install a WordPress plugin, or require you to pipe access logs into a third-party tool. All of those work. None of them are ideal for enterprise teams already running on AWS.
CloudFront has 400+ edge locations globally. Lambda@Edge functions run inside that same network, so AI detection happens in the request path your CDN already handles. No extra hop through a third-party proxy. No new vendor in your security review.
For enterprise teams, that last point matters more than the technical architecture. Adding a third-party CDN layer (like putting Cloudflare in front of CloudFront) creates procurement headaches, security questionnaires, and architectural complexity. Lambda@Edge runs inside your existing AWS account, under your existing IAM policies, with your existing compliance posture.
How the Lambda@Edge architecture works
Salespeak's CloudFront integration deploys two Lambda@Edge functions to your distribution. They run at different stages of the request lifecycle, and together they handle both detection and optimization.
1. Viewer request handler: detection and logging
This function fires when a request arrives at the CloudFront edge, before it touches your origin server. It does two things:
- Analyzes the User-Agent header against a maintained list of AI crawlers: GPTBot, ChatGPT-User, ClaudeBot, PerplexityBot, Google-Extended, BingPreview, Amazonbot, and others as new crawlers emerge
- Logs the visit to Salespeak's analytics API with crawler identity, requested URL, timestamp, and edge location
The detection happens in microseconds. The logging call is async — it doesn't block the response. Your human visitors see zero impact.
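To make the detection step concrete, here's a minimal sketch of what a viewer request handler could look like in Python. It's an illustration under assumptions, not Salespeak's deployed code: the crawler list is a small subset of the names above, the `x-ai-crawler` header is a hypothetical marker for later stages, and the analytics logging call is left out because its endpoint isn't covered in this article.

```python
import re

# Illustrative subset of the crawler tokens named above; the real list is
# maintained by Salespeak and is not reproduced here.
AI_CRAWLER_PATTERN = re.compile(
    r"GPTBot|ChatGPT-User|ClaudeBot|PerplexityBot|Amazonbot", re.IGNORECASE
)


def detect_ai_crawler(user_agent):
    """Return the matched crawler token, or None for ordinary traffic."""
    match = AI_CRAWLER_PATTERN.search(user_agent or "")
    return match.group(0) if match else None


def handler(event, context):
    """Viewer request trigger: tag AI crawler requests, pass everything through."""
    request = event["Records"][0]["cf"]["request"]
    headers = request["headers"]

    # CloudFront delivers headers as {lowercase-name: [{"key": ..., "value": ...}]}.
    ua_values = headers.get("user-agent", [])
    user_agent = ua_values[0]["value"] if ua_values else ""

    crawler = detect_ai_crawler(user_agent)
    if crawler:
        # Hypothetical marker header so later stages can recognize the crawler.
        headers["x-ai-crawler"] = [{"key": "X-AI-Crawler", "value": crawler}]

    # Returning the request tells CloudFront to continue processing it.
    return request
```

Human requests fall straight through the `if` and come back unchanged, which is why the check adds so little work per request.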
2. Origin response handler: content optimization
When the viewer request handler identifies an AI crawler, the origin response handler takes over. It fetches your AI-optimized content from an alternate origin and injects it into the HTML response before it's sent back to the crawler.
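Here's a rough sketch of that origin response step in Python. Several things are assumptions for illustration: the `ALT_ORIGIN` URL is a placeholder, the `x-ai-crawler` marker header is hypothetical and assumed to have been set earlier and forwarded to this stage, and the sketch swaps the whole body rather than merging into the existing HTML. The actual Salespeak functions may work differently.

```python
import urllib.request

# Placeholder for the alternate origin that hosts AI-optimized pages.
ALT_ORIGIN = "https://ai-content.example.com"


def fetch_optimized_html(uri, timeout=2.0):
    """Fetch the optimized version of a page from the alternate origin."""
    with urllib.request.urlopen(ALT_ORIGIN + uri, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def build_response(response, request, fetch=fetch_optimized_html):
    """Swap in optimized HTML for tagged crawler requests; otherwise no-op."""
    is_crawler = "x-ai-crawler" in request["headers"]  # hypothetical marker
    if not is_crawler or response.get("status") != "200":
        return response

    try:
        html = fetch(request["uri"])
    except Exception:
        # If the alternate origin is unavailable, serve the standard page.
        return response

    response["body"] = html
    response["bodyEncoding"] = "text"
    response["headers"]["content-type"] = [
        {"key": "Content-Type", "value": "text/html; charset=utf-8"}
    ]
    # The replacement body is plain text, so drop any origin encoding header.
    response["headers"].pop("content-encoding", None)
    return response


def handler(event, context):
    """Origin response trigger entry point."""
    cf = event["Records"][0]["cf"]
    return build_response(cf["response"], cf["request"])
```

Passing `fetch` as a parameter keeps the network call swappable, and the fallback path means a problem at the alternate origin degrades to the normal page instead of an error.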
What does "AI-optimized content" mean in practice? It's your same content, restructured for how LLMs parse information:
- Definitive language patterns that increase citation probability (phrases like "is defined as" and "refers to" get cited 36.2% of the time vs. 20.2% without them)
- Higher entity density — naming specific products, people, and brands rather than using generic descriptions
- Question-formatted headers that AI models treat as query-answer pairs
- Structured data that helps crawlers understand relationships between entities
Human visitors never see this alternate content. They get your standard site experience. AI crawlers get content engineered to be cited.
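As one small illustration of the structured-data and question-format points, the sketch below builds schema.org `FAQPage` JSON-LD from question and answer pairs. The helper names and the sample pair are made up for the example; this is not a description of how Salespeak generates its optimized content.

```python
import json


def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def as_script_tag(data):
    """Serialize JSON-LD for embedding in an HTML page."""
    # Escape "</" so answer text can never close the script element early.
    payload = json.dumps(data).replace("</", "<\\/")
    return '<script type="application/ld+json">' + payload + "</script>"


# Hypothetical sample content using a definitive "is defined as" pattern.
sample = [
    (
        "What is Lambda@Edge?",
        "Lambda@Edge is defined as a CloudFront feature that runs Lambda "
        "functions in response to CloudFront events.",
    )
]
tag = as_script_tag(faq_jsonld(sample))
```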
Cache bypass for AI visitors
CloudFront's caching is great for performance. It's terrible for AEO. If an AI crawler hits a cached page, it gets whatever was cached — which might be stale content that doesn't reflect your latest optimizations.
Salespeak's integration configures cache bypass rules for AI visitors. When the viewer request handler detects a bot User-Agent, it sets cache behavior to bypass, ensuring the crawler always gets fresh, optimized content from your alternate origin. Human visitors still get the full benefit of CloudFront's cache.
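The effect is easier to see in a toy model. The Python below is not CloudFront code and not Salespeak's configuration; it's a tiny in-memory stand-in where requests carrying a hypothetical `x-ai-crawler` marker always go to the origin while everything else is served from cache.

```python
class ToyEdgeCache:
    """In-memory stand-in for an edge cache with a crawler bypass rule."""

    def __init__(self, origin):
        self.origin = origin  # callable: (uri, headers) -> body
        self.store = {}

    def get(self, uri, headers):
        # Hypothetical marker set by an earlier detection step.
        if headers.get("x-ai-crawler"):
            # Bypass: never read from or write to the shared cache.
            return self.origin(uri, headers)
        if uri not in self.store:
            self.store[uri] = self.origin(uri, headers)
        return self.store[uri]


origin_calls = []


def origin(uri, headers):
    """Fake origin that records each call and returns a labeled body."""
    origin_calls.append(uri)
    kind = "optimized" if headers.get("x-ai-crawler") else "standard"
    return f"{kind}:{uri}"


cache = ToyEdgeCache(origin)
cache.get("/pricing", {})                            # human, fills the cache
cache.get("/pricing", {})                            # human, served from cache
cache.get("/pricing", {"x-ai-crawler": "GPTBot"})    # crawler, goes to origin
cache.get("/pricing", {"x-ai-crawler": "GPTBot"})    # crawler, goes to origin again
```

After those four requests the fake origin has been called three times: once for the first human visit and once for each crawler visit.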
What you actually have to do (almost nothing)
Here's the part that matters for the team that has to implement this: Salespeak handles the deployment automatically.
You don't write Lambda functions. You don't build CloudFormation templates. You don't configure IAM roles for Lambda@Edge execution. You don't manually set up cache behaviors or origin groups.
The setup flow:
- Connect your AWS account to Salespeak (IAM role with scoped permissions for CloudFront and Lambda)
- Select your CloudFront distribution from the dashboard
- Salespeak deploys the Lambda@Edge functions to your distribution's viewer request and origin response triggers
- AI crawler visits start appearing in your Salespeak analytics dashboard
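For a sense of what "scoped permissions" can mean in the first step, here's a sketch of an IAM policy limited to one distribution and one family of Lambda functions. This is an assumption-heavy illustration: Salespeak's actual role policy isn't shown in this article, the `salespeak-` function prefix is hypothetical, and the account and distribution IDs are placeholders. The action names are standard AWS IAM actions, and Lambda@Edge functions live in us-east-1.

```python
import json


def scoped_policy(account_id, distribution_id, function_prefix="salespeak-"):
    """Build an example IAM policy limited to one distribution and one function family."""
    dist_arn = f"arn:aws:cloudfront::{account_id}:distribution/{distribution_id}"
    # The trailing wildcard also covers published versions of matching functions.
    fn_arn = f"arn:aws:lambda:us-east-1:{account_id}:function:{function_prefix}*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ManageOneDistribution",
                "Effect": "Allow",
                "Action": [
                    "cloudfront:GetDistribution",
                    "cloudfront:GetDistributionConfig",
                    "cloudfront:UpdateDistribution",
                ],
                "Resource": dist_arn,
            },
            {
                "Sid": "ManageEdgeFunctions",
                "Effect": "Allow",
                "Action": [
                    "lambda:GetFunction",
                    "lambda:PublishVersion",
                    "lambda:UpdateFunctionCode",
                    "lambda:EnableReplication*",
                    "lambda:DisableReplication*",
                ],
                "Resource": fn_arn,
            },
        ],
    }


# Placeholder IDs for illustration only.
policy_json = json.dumps(scoped_policy("123456789012", "EDFDVBD6EXAMPLE"), indent=2)
```

Nothing in a policy shaped like this grants access to S3, databases, or other distributions, which is the property your security team will look for.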
Updates don't require recreating your CloudFront distribution. Salespeak manages function versioning and deployment. When new AI crawlers emerge (and they're emerging constantly; we've seen at least a dozen new bot User-Agents in the last six months), the detection list updates without any action on your end.
What you'll see in the dashboard
Once the Lambda@Edge functions are running, your Salespeak dashboard shows data that GA physically can't capture:
- AI crawler visits by bot type: which AI systems are crawling your site, how often, and which pages they're hitting
- Crawl frequency trends: are AI crawlers visiting more or less over time? Which content attracts the most AI attention?
- Edge location data: where in the world are AI crawlers accessing your content from?
- Optimized content serving rates: how many AI visits received your optimized content vs. your standard pages
- Page-level AI traffic breakdown: identify which pages AI crawlers are ignoring entirely (those are your optimization opportunities)
This data feeds directly into your AEO measurement framework. Instead of running manual citation audits to guess which pages AI models are seeing, you have direct evidence of crawler behavior.
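If you export that visit data, the same breakdowns are easy to reproduce yourself. The sketch below assumes a hypothetical record shape (a dict with `crawler` and `uri` keys) and made-up sample rows; the dashboard's real export format may differ.

```python
from collections import Counter


def summarize(visits, all_pages):
    """Return visits per crawler, visits per page, and pages with no AI visits."""
    by_bot = Counter(v["crawler"] for v in visits)
    by_page = Counter(v["uri"] for v in visits)
    ignored = sorted(set(all_pages) - set(by_page))
    return by_bot, by_page, ignored


# Made-up sample rows for illustration.
visits = [
    {"crawler": "GPTBot", "uri": "/pricing"},
    {"crawler": "GPTBot", "uri": "/blog/aeo"},
    {"crawler": "PerplexityBot", "uri": "/blog/aeo"},
]
site_pages = ["/pricing", "/blog/aeo", "/case-studies", "/product"]

by_bot, by_page, ignored = summarize(visits, site_pages)
```

The `ignored` list is the useful part: pages that exist on your site but never show up in crawler visits are the optimization opportunities mentioned above.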
CloudFront vs. other CDN integrations
Salespeak also offers integrations for Cloudflare (via Workers) and Nginx (via Lua modules). The architecture differs but the outcome is the same: detect AI crawlers, serve optimized content, log everything.
Choose based on your existing infrastructure:
- AWS CloudFront + Lambda@Edge: best for teams already on AWS. No new vendors, same security posture, automatic scaling across 400+ edge locations.
- Cloudflare Workers: best for teams already on Cloudflare. Simpler deployment model, but you're adding (or already have) a CDN layer.
- Nginx: best for teams running their own infrastructure. More control, more operational overhead.
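The CloudFront integration has one distinct advantage: if you're already paying for CloudFront, the Lambda@Edge execution costs are small. You're charged per invocation and per compute duration. The viewer request function is invoked for every request, but for human traffic it does nothing beyond a User-Agent check; the heavier work (logging and fetching optimized content) applies only to AI crawler requests, which are a tiny fraction of the total. For most sites, we're talking single-digit dollars per month.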
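As a rough check on that cost claim, here's a back-of-the-envelope estimator. The default rates are assumptions based on published Lambda@Edge list prices at the time of writing ($0.60 per million requests and $0.00005001 per GB-second); confirm them against current AWS pricing before relying on the output. The traffic volume and duration are hypothetical inputs.

```python
def monthly_edge_cost(
    invocations,
    avg_ms,
    memory_mb=128,
    price_per_million=0.60,          # assumed list price, USD per 1M requests
    price_per_gb_second=0.00005001,  # assumed list price, USD per GB-second
):
    """Estimate monthly Lambda@Edge cost in USD for one function."""
    request_cost = invocations / 1_000_000 * price_per_million
    gb_seconds = invocations * (avg_ms / 1000) * (memory_mb / 1024)
    return request_cost + gb_seconds * price_per_gb_second


# Hypothetical site: 10 million invocations a month at about 1 ms each.
estimate = monthly_edge_cost(10_000_000, avg_ms=1)
```

With these assumed rates, that hypothetical site comes out to roughly six dollars a month, almost all of it the per-request charge rather than compute time.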
The enterprise case for edge-based AEO
Enterprise security teams ask three questions about any new tool: where does it run, what data does it touch, and who controls it?
With Lambda@Edge:
- Where does it run? In your AWS account, at CloudFront edge locations you already use.
- What data does it touch? HTTP request metadata (the User-Agent header and the requested URL) and response HTML. No PII. No cookies. No session data.
- Who controls it? Your AWS account. Your IAM policies. Salespeak deploys with scoped permissions — it can't access your S3 buckets, databases, or anything outside the CloudFront distribution.
That's a security conversation that takes 15 minutes instead of 15 weeks.
Start tracking what GA can't see
Every day you're running CloudFront without AI detection, you're missing data. GPTBot might be crawling your pricing page hourly. ClaudeBot might be ignoring your product pages entirely. PerplexityBot might be hitting your blog but skipping your case studies. You don't know — because your analytics stack was built for a world where all visitors run JavaScript.
That world ended in 2024.
The teams winning at AEO in 2026 aren't guessing which AI models see their content. They're measuring it at the edge, serving those models optimized content, and watching citation rates climb as a direct result.
If you're on AWS, you're one integration away from joining them. Connect your CloudFront distribution to Salespeak and start seeing the 66% of traffic your current tools miss.




