The problem with attention metrics: why AI video analysis reveals what dashboards miss
For years, video advertising has been measured by a single proxy: did someone watch it? View-through rate, average watch time, completion rate. These metrics tell you that eyes were on the screen. They do not tell you what the brain did with what it saw.
A viewer can watch 30 seconds of your video and retain nothing emotionally meaningful. Or they can glance at four seconds of a scene and encode a strong associative memory. That difference never shows up in an analytics dashboard. It happens inside the brain - in the visual cortex, in the auditory processing regions, in the memory and emotion centers that determine whether your message actually lands.
This gap between viewership and cognitive impact is the fundamental unresolved problem in video creative. And it is why most A/B testing in video advertising tells you very little about why one creative outperforms another.
What TRIBE v2 actually does
TRIBE v2 is a neural encoding model developed at Meta AI Research. It was trained on fMRI data from over 700 human subjects watching video content. The model learned, at a very fine resolution, how different visual and auditory signals map to brain activation patterns across the cortex.
When you upload a video to app.publicimpact.ai, the model processes every frame, extracts audio and visual features, runs them through the same encoding pathways learned from real brain data, and outputs predicted BOLD (Blood-Oxygen-Level-Dependent) signals across 20,000 cortical vertices. These are the same signals a neuroscientist would measure in an fMRI scanner - predicted, in seconds, from your video file.
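To make the shapes concrete, here is a minimal Python sketch of what an encoding-model forward pass looks like. Every name in it - the feature extractor, the feature dimension, the per-vertex linear readout - is an illustrative assumption, not the actual TRIBE v2 implementation:

```python
import numpy as np

N_VERTICES = 20_000  # cortical vertices, as described above
FEATURE_DIM = 512    # assumed size of the fused audio-visual feature vector

def extract_features(frames: np.ndarray, audio: np.ndarray) -> np.ndarray:
    """Stand-in for the audio-visual feature extractor.

    A real system would run pretrained video and audio backbones here;
    this stub just returns one random feature vector per timepoint.
    """
    n_timepoints = frames.shape[0]
    return np.random.randn(n_timepoints, FEATURE_DIM)

def predict_bold(features: np.ndarray, readout: np.ndarray) -> np.ndarray:
    """Map features to predicted BOLD signals at every cortical vertex.

    A per-vertex linear readout is the textbook simplification of an
    encoding model; the real model is far richer.
    """
    return features @ readout  # (time, FEATURE_DIM) @ (FEATURE_DIM, N_VERTICES)

# Toy run: 60 one-second timepoints of "video" -> a (60, 20000) BOLD matrix
frames = np.zeros((60, 224, 224, 3))   # fake decoded frames
audio = np.zeros(60 * 16_000)          # fake 16 kHz audio track
readout = np.random.randn(FEATURE_DIM, N_VERTICES) * 0.01
bold = predict_bold(extract_features(frames, audio), readout)
print(bold.shape)  # (60, 20000)
```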
The six brain regions that matter for marketing
The tool aggregates the cortical predictions into six Regions of Interest (ROIs) that are directly interpretable for creative decisions:
- Visual cortex: how strongly the imagery itself is driving processing at each moment.
- Auditory regions: response to voiceover, music, and sound design.
- Language regions: whether narration and on-screen copy are being processed semantically rather than tuned out.
- Memory regions: whether a moment is likely to be encoded and remembered, not just seen.
- Emotion regions: the affective response that determines whether a message lands.
- Prefrontal cortex: executive engagement and cognitive load, including at the call-to-action.
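Aggregating from vertices to ROIs is conceptually simple: average the predicted signal over the vertices assigned to each region. The sketch below invents a random vertex-to-ROI assignment purely for illustration; in a real system this would be a fixed anatomical atlas lookup:

```python
import numpy as np

ROIS = ["visual", "auditory", "language", "memory", "emotion", "prefrontal"]

def aggregate_rois(bold: np.ndarray, vertex_roi: np.ndarray) -> dict:
    """Average the (time, n_vertices) BOLD matrix within each ROI."""
    return {roi: bold[:, vertex_roi == i].mean(axis=1)
            for i, roi in enumerate(ROIS)}

# Toy run: assign each of the 20,000 vertices to a random ROI
rng = np.random.default_rng(0)
bold = rng.standard_normal((60, 20_000))         # e.g. the encoder's output
vertex_roi = rng.integers(0, len(ROIS), 20_000)  # a real atlas goes here
roi_series = aggregate_rois(bold, vertex_roi)
print(roi_series["memory"].shape)                # (60,) -- one value per second
```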
How the analysis works end-to-end
Upload a video to app.publicimpact.ai and the pipeline runs automatically on an A100 GPU:
- The video is decoded into individual frames and an audio track.
- Visual and audio features are extracted for every timepoint.
- The features run through the encoding pathways learned from real brain data, producing predicted BOLD signals across 20,000 cortical vertices.
- The vertex-level predictions are aggregated into the six ROI time series.
- The narration is transcribed and aligned to the video timeline.
- GPT-4o turns the ROI time series and transcript into a written creative analysis.
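Stitching the earlier sketches together gives a feel for the whole flow. The two new stubs (transcription and the report) are placeholders, and this assumes the functions from the previous sketches are already defined; none of it is the actual pipeline code:

```python
import numpy as np
# Assumes extract_features, predict_bold, and aggregate_rois from the
# sketches above are already defined in this session.

def transcribe_and_align(audio: np.ndarray) -> list:
    """Stub: a real pipeline runs speech-to-text with word timestamps."""
    return [(0.0, "(transcript placeholder)")]

def write_creative_report(roi_series: dict, transcript: list) -> str:
    """Stub: a real pipeline prompts an LLM with the ROI time series."""
    return "(creative analysis placeholder)"

def analyze_video(frames: np.ndarray, audio: np.ndarray) -> dict:
    features = extract_features(frames, audio)                    # features
    readout = np.random.randn(features.shape[1], 20_000) * 0.01   # stub weights
    bold = predict_bold(features, readout)                        # (time, 20000)
    vertex_roi = np.random.default_rng(0).integers(0, 6, 20_000)  # stub atlas
    roi_series = aggregate_rois(bold, vertex_roi)                 # six ROIs
    transcript = transcribe_and_align(audio)                      # alignment
    return {"roi_series": roi_series,
            "transcript": transcript,
            "report": write_creative_report(roi_series, transcript)}

result = analyze_video(np.zeros((60, 224, 224, 3)), np.zeros(60 * 16_000))
print(sorted(result["roi_series"]))  # the six ROI keys
```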
What AI video analysis means for how you make creative decisions
The practical implications are significant. Consider a 60-second brand film. Traditional analysis tells you average watch time and where drop-off occurs. Neural analysis tells you something much more useful: at second 38, when your presenter says the brand name for the first time, does the memory region spike? At second 52, when you show the product, does the emotional region activate? At your call-to-action, does prefrontal engagement rise - or has the video already exhausted the viewer's cognitive capacity and produced a flat response?
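Here is one hypothetical way to ask those questions of an exported ROI time series. The one-value-per-second layout and the z-score threshold are illustrative assumptions, not the product's actual export format:

```python
import numpy as np

def spikes_at(series: np.ndarray, second: int, z_threshold: float = 1.0) -> bool:
    """True if the ROI signal at `second` sits more than z_threshold
    standard deviations above that ROI's own mean."""
    z = (series - series.mean()) / series.std()
    return bool(z[second] > z_threshold)

# Stand-ins for exported ROI time series (one value per second, 60 s film)
rng = np.random.default_rng(1)
memory = rng.standard_normal(60)
emotion = rng.standard_normal(60)

print("Memory spike at brand mention (s 38):", spikes_at(memory, 38))
print("Emotion spike at product reveal (s 52):", spikes_at(emotion, 52))
```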
These are the questions that determine whether a video converts. They cannot be answered by view counts. They can only be answered by understanding what is happening inside the brain of the viewer.
Why this is possible now
Two things changed in the last 18 months. First, Meta's TRIBE v2 research matured to the point where cross-subject neural encoding predictions are accurate enough to be useful outside the lab. The Algonauts 2025 benchmark results confirmed that TRIBE v2's predictions correlate strongly with actual measured brain activity - across subjects who were not in the training data.
Second, the cost of GPU inference dropped to a point where running this kind of analysis per-video is commercially viable. The entire pipeline runs on a single A100 GPU in three to five minutes. A year ago, equivalent compute would have required a reserved cluster and a six-figure budget.
The result is a capability that previously existed only inside neuroscience research labs - now accessible to anyone with a video file and a browser.
How to use it in practice
- Pre-launch creative testing: Upload two versions of an ad before spending budget. Compare emotional and memory activation across the timeline. The version with higher activation at brand and CTA moments will perform better in market (see the comparison sketch after this list).
- Editing decisions: When you are choosing between cuts, the neural analysis tells you which version produces stronger activation at the moments that matter - not which one a focus group says they prefer.
- Content audit: Upload your existing video library. Identify which pieces have produced genuine emotional and memory encoding versus those that have produced only visual attention.
- Script optimization: Use the language ROI time series to see which sentences in your narration are actually being processed semantically - and which ones are being tuned out.
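As a sketch of the pre-launch comparison from the first bullet - same assumptions as before: stand-in data, one value per second, and key-moment timestamps you supply yourself:

```python
import numpy as np

KEY_MOMENTS = {"brand_mention": 38, "cta": 52}  # seconds, from your edit notes

def moment_scores(roi: np.ndarray) -> dict:
    """Z-scored activation at each key moment for one ROI time series."""
    z = (roi - roi.mean()) / roi.std()
    return {name: float(z[s]) for name, s in KEY_MOMENTS.items()}

# Stand-ins for the same ROI (e.g. memory) exported from two cuts of the ad
rng = np.random.default_rng(2)
version_a = rng.standard_normal(60)
version_b = rng.standard_normal(60)

for name in KEY_MOMENTS:
    a = moment_scores(version_a)[name]
    b = moment_scores(version_b)[name]
    winner = "A" if a > b else "B"
    print(f"{name}: A={a:+.2f}, B={b:+.2f} -> prefer cut {winner}")
```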
app.publicimpact.ai is live. Upload any video file. The first analysis adds about three minutes while the GPU cold-starts; subsequent uploads within the same session are faster. The output includes the full ROI time series, brain heatmaps, transcript alignment, and the GPT-4o creative analysis.