Google API Doc Leak

SEO Nerds, This One Is For You

Drip Sequence 💧

In case you missed it:

I sent out a report detailing the comp survey. If you missed the first newsletter, the URL for the LookerBI Dashboard can be found right here.

If you are here for the Google API Doc leak and nothing else, the link is right here and you can skip the rest if you want to.

Also, I am always open to suggestions to evolve this newsletter. Feel free to either DM me on IG or reply to this email with feedback or with stuff you would want to see in here.

Headline Overviews

  • Disney Ad Sales Executive Lisa Valentino Exits Amid Restructuring: Lisa Valentino, a senior exec at Disney Advertising Sales, has left the company amid a restructuring of the advertising sales staff. Valentino’s exit is part of a planned re-alignment, as those with access to confidential information are required to leave. She had rejoined Disney in 2019 after previous roles at Univision and Conde Nast.

  • Google says YouTube video skipping issue with ad blockers is a performance update, not a crackdown: Many YouTube users with ad blockers have experienced videos skipping to the end, leading to speculation about a crackdown on ad blockers. YouTube clarified that the issue is due to a performance update aimed at improving reliability, not targeting ad blockers. They reiterated that using ad blockers violates their Terms of Service and urged users to support creators by allowing ads or subscribing to YouTube Premium.

  • Klarna using GenAI to cut marketing costs by $10 mln annually: Fintech firm Klarna uses generative AI to save $10 million annually in marketing costs. By adopting AI for campaigns and image generation, it reduced its marketing budget by 11% in Q1, with AI contributing 37% of the savings. Klarna cut image production costs by $6 million and sped up updates for key events, also saving $4 million by reducing external marketing expenses.

  • US court to hear challenges to potential TikTok ban in September: A U.S. appeals court will fast-track the legal challenges to a new law requiring ByteDance to sell TikTok's U.S. assets by January 19, or face a ban. The case, set for oral arguments in September, follows lawsuits from TikTok, ByteDance, and TikTok content creators. Legal briefs are due between June and August, with a ruling expected by December 6 to allow potential Supreme Court review.

  • Meta Removes AI-Generated Influence Campaigns in China, Israel: Meta removed hundreds of Facebook accounts from China, Israel, Iran, Russia, and other countries using AI tools for disinformation. Despite AI usage, Meta effectively unplugged these campaigns. With upcoming global elections, Meta emphasized detecting and labeling AI-generated content. New policies require labeling misleading AI content and advertisers to disclose AI use in social, election, or political ads.

The Google API Documentation Deep Dive

So not too long ago, a bunch of Google’s internal search ranking documents were inadvertently published by an automated Google system to a publicly accessible Google-owned repository on GitHub. The leak was first discovered by Erfan Azimi, CEO of EA Digital Eagle, and later publicized by SEO experts Rand Fishkin and Michael King.

The documents describe an older version of Google’s Content Warehouse API but still contain a lot of insight into Google’s ranking criteria, including the influence of click metrics, Chrome usage as a quality signal, content freshness, and site authority.

The Google API documentation isn’t a how-to guide for building a search engine from scratch. Instead, it provides comprehensive details on various modules and endpoints and explains how to use different tools and functions to work with Google’s content management services. It covers how to make API calls, handle data, manage connections, and set up rules for content use and user interactions. The goal is to help developers use these tools to organize and control content, apply user restrictions, and improve how data is managed and searched.
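In practice, the leak reads like a reference: page after page of modules, each listing attributes with a type and a one-line description. As a rough mental model only, here is a minimal Python sketch of how you could represent and search those pages. The attribute list shown is a placeholder, and the pairing of fields with that module name is my assumption, not the real schema from the leak.

    from dataclasses import dataclass, field

    @dataclass
    class ApiAttribute:
        """One documented attribute on a Content Warehouse module."""
        name: str
        type_hint: str
        description: str

    @dataclass
    class ApiModule:
        """A documented module (one page in the leaked reference)."""
        name: str
        attributes: list = field(default_factory=list)

        def find(self, keyword):
            """Return attributes whose name or description mentions a keyword."""
            kw = keyword.lower()
            return [a for a in self.attributes
                    if kw in a.name.lower() or kw in a.description.lower()]

    # Placeholder entry -- the attributes listed here are illustrative only.
    rich_content = ApiModule(
        name="GoogleApi.ContentWarehouse.V1.Model.IndexingConverterRichContentData",
        attributes=[
            ApiAttribute("isAuto", "boolean",
                         "Whether the data was generated automatically."),
            ApiAttribute("sourceSummary", "string",
                         "Summary of where the data came from."),
        ],
    )

    print([a.name for a in rich_content.find("auto")])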

With that in mind, here are the Top 5 surface-level takeaways from the document:

1. Detailed Metadata and Contextual Information

  • General Impact: The documentation highlights the importance of using contextual data around anchors (e.g., terms near the anchor, font size) and detailed metadata annotations, which help the search engine better understand and index content, leading to improved search visibility and relevance.

2. User Interaction Data

  • General Impact: Incorporating user behavior data (engagement metrics and co-clicks) can refine content rankings, ensuring that more engaging and relevant content is prioritized in search results.

3. Content Freshness and Updates

  • General Impact: Keeping content updated ensures that it remains relevant to users and search engines, which can enhance its SEO ranking and visibility.

4. Quality Signals in Search Snippets

  • General Impact: High-quality snippets improve click-through rates by providing users with relevant information directly in the search results, leading to increased traffic and engagement.

5. Localized Relevance

  • General Impact: Enhancing localized relevance helps improve local SEO performance, making content more accessible and appealing to local audiences.

Now, for all the SEO people who had some specific questions about the documentation, here are your answers:

Do Content Pages Still Have to be 2,000 Words Long to Win Page 1 on a SERP?

The documentation does not support the notion that a strict 2,000-word length is necessary for content to rank on Page 1 of SERPs.

Do They Suppress Anything or is it Truly Best Websites at the Top?

Yes and no. The documentation describes various moderation and sensitivity filters applied to content. For example, the SearchPolicyRankableSensitivity and related modules manage how sensitive content is handled in search rankings. These filters consider factors such as abusive content, legal restrictions, and regional blocking, which can vary depending on location.

There is a huge emphasis on “content freshness,” so updating a website regularly seems to have the biggest impact on your SEO ranking.
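To make the moderation piece concrete, here is a minimal sketch of how region-aware sensitivity filtering could work. The filter categories mirror the ones named in the docs (abusive content, legal restrictions, regional blocking); the data structures, region codes, and logic are my own assumptions, not Google’s implementation.

    from dataclasses import dataclass, field

    @dataclass
    class SensitivityFlags:
        abusive: bool = False
        legally_restricted_in: set = field(default_factory=set)
        blocked_in_regions: set = field(default_factory=set)

    @dataclass
    class Result:
        url: str
        flags: SensitivityFlags

    def filter_for_region(results, region):
        """Drop results that a sensitivity policy would suppress for a region."""
        kept = []
        for r in results:
            if r.flags.abusive:
                continue
            if region in r.flags.legally_restricted_in:
                continue
            if region in r.flags.blocked_in_regions:
                continue
            kept.append(r)
        return kept

    results = [
        Result("https://example.com/a", SensitivityFlags()),
        Result("https://example.com/b", SensitivityFlags(blocked_in_regions={"DE"})),
    ]
    print([r.url for r in filter_for_region(results, "DE")])  # only /a survives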

Does Google Put More Weight on Conversions vs. Engagement Like Visiting Pages, Scrolling, and Clicking?

The Google API documentation suggests Google places considerable importance on engagement signals like the ones mentioned when ranking content on its SERPs. There is no one specific engagement metric Google favors; it seems to take a holistic view of user interaction.
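For illustration only, a “holistic view” could look something like a weighted blend of several normalized signals. The signal names and weights below are assumptions made up for this sketch; nothing in the leak gives an actual formula.

    # Assumed signal names and weights -- illustrative only, not from the leak.
    WEIGHTS = {
        "click_through_rate": 0.3,
        "scroll_depth": 0.2,
        "pages_per_session": 0.2,
        "co_click_similarity": 0.3,
    }

    def engagement_score(signals):
        """Blend normalized (0..1) engagement signals into one score."""
        return sum(w * signals.get(name, 0.0) for name, w in WEIGHTS.items())

    print(round(engagement_score({
        "click_through_rate": 0.42,
        "scroll_depth": 0.75,
        "pages_per_session": 0.50,
        "co_click_similarity": 0.60,
    }), 3))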

Is SEO Dead or Does Reddit Rule Forevermore?

While there is room for integrating user-generated content from platforms like Reddit, which is kind of going disastrously right now with their AI-generated search results, established SEO techniques still carry a lot of weight. But that could all change, since anything and everything can change if it generates more shareholder value for Google.

How Is It Handling AI-Generated Content?

The document includes fields for identifying whether data has been generated automatically by a bot or algorithm. For instance, the attribute isAuto specifies if the data was auto-generated, and sourceSummary provides information about the source of this data, including whether it comes from an automated process.

AI-generated content is still subject to the same quality control measures as human-generated content, but for now it seems like they are just tracking it with modules like GoogleApi.ContentWarehouse.V1.Model.IndexingConverterRichContentData, which contain information about different versions of content (original, processed).
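As a rough sketch of what that tracking could look like, here is one way to carry the auto-generation fields mentioned above (isAuto, sourceSummary) alongside original and processed versions of a piece of content. The class shape and the grouping of these fields together are assumptions for illustration, not the actual schema from the leak.

    from dataclasses import dataclass

    @dataclass
    class ContentRecord:
        url: str
        original_text: str     # the version as originally fetched
        processed_text: str    # the version after cleanup/normalization
        is_auto: bool          # was the data generated by a bot or algorithm?
        source_summary: str    # where the data came from

    record = ContentRecord(
        url="https://example.com/post",
        original_text="<h1>Hello</h1> ...",
        processed_text="Hello ...",
        is_auto=True,
        source_summary="automated process, human edited",
    )
    print(record.is_auto, record.source_summary)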

Given that just about everyone is leveraging generative AI to some degree to expedite the entire writing process, Google is never going to really filter out AI-generated content.

What Are The Clearly Defined Ranking Priorities, And What Technical Changes Have The Most Impact?

The ranking priorities are still the same: quality signals, content freshness, user engagement, contextual and metadata information, and localized relevance.

The technical changes that have the most impact:

  • Compressed Quality Signals

  • Core Web Vitals

  • Anchors and Link Quality

  • Entity and Metadata Annotations

  • Personalization and Contextual Scoring

Why Is Google Forcing AI on Their Results Now?

Formal answer: Google is in the process of pivoting into an AI company (just like every other tech company on the market right now). Using AI helps them better understand user queries, especially natural language questions, and gives their AI models data to train on for future endeavors.

The real answer: because they signed a deal with Reddit and are trying to train their AI models on human-generated content to make money off AI.

Are There Any Signals Being Used That Contain Protected, Sensitive, or PII Data?

Yes; however, there are very specific attributes and protocols for handling sensitive data and PII (they don’t want to keep getting that smoke from GDPR). A lot of metadata fields are used to enforce data governance policies that take into account regional restrictions and regulations. In layman’s terms, all the data essentially goes through a filter that adheres to region-specific data access regulations.
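A minimal sketch of that kind of region-specific gate, assuming made-up region codes, field names, and policies (none of these values come from the leak):

    # Hypothetical PII fields and per-region policies -- assumptions for the sketch.
    PII_FIELDS = {"email", "phone", "precise_location"}

    REGION_POLICIES = {
        "EU": {"allow_pii": False},   # e.g., GDPR-style handling
        "US": {"allow_pii": True},
    }

    def apply_data_policy(record, region):
        """Return a copy of the record with PII stripped where policy forbids it."""
        policy = REGION_POLICIES.get(region, {"allow_pii": False})
        if policy["allow_pii"]:
            return dict(record)
        return {k: v for k, v in record.items() if k not in PII_FIELDS}

    signal = {"doc_id": "abc123", "email": "user@example.com", "clicks": 12}
    print(apply_data_policy(signal, "EU"))  # {'doc_id': 'abc123', 'clicks': 12}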

Can You Distill The Top 5 Most Non-Obvious Impactful Levers For SEO?

Saved this one for the end given that it is probably the most pressing question asked on my Instagram story, but here ya go:

1. Optimizing Contextual Anchor Information

  • Lever: Improve the contextual relevance of anchor text by considering the surrounding terms.

  • Description: Use the context2 signal, which is a hash of terms near the anchor, to enhance the relevance and quality of links.

  • Impact: This helps search engines understand the context in which a link is placed, potentially improving the perceived relevance and quality of the link (see the sketch after this list).

  • Citation: "This is a hash of terms near the anchor. (This is a second-generation hash replacing the value stored in the 'context' field.)"

2. Detailed Entity and Metadata Annotations

  • Lever: Implement comprehensive entity and metadata annotations within content.

  • Description: Utilize KnowledgeAnswersIntentQueryEntitySignals to provide detailed signals associated with specific entities and their annotations.

  • Impact: Enhances search engines' understanding of the content's subject matter, improving its relevance for specific queries.

  • Citation: "Signals associated with specific entities and their annotations."

3. Advanced Quality Features for Snippets

  • Lever: Optimize snippets using advanced quality features.

  • Description: Leverage QualityPreviewSnippetQualityFeatures to incorporate quality-related features such as answer scores, passage coverage, and snippet coverage.

  • Impact: Determines the prominence and visibility of snippets in search results, influencing click-through rates and user engagement.

  • Citation: "Quality related features used in snippets scoring."

4. Anchors and Link Quality

  • Lever: Assess and enhance link quality based on locality and bucket metrics (also sketched after this list).

  • Description: Use metrics like "locality" and "bucket" to measure the relevance and importance of links within the content.

  • Impact: Influences the contribution of links to the content's ranking by evaluating their significance and contextual relevance.

  • Citation: "For ranking purposes, the quality of an anchor is measured by its 'locality' and 'bucket'."

5. Handling Sensitive and Personal Data Responsibly

  • Lever: Implement protocols for managing sensitive and personal data.

  • Description: Use fields like KnowledgeAnswersIntentQuerySensitiveArgumentValueGuard to handle sensitive argument values, including encrypted values and annotations for sensitive content.

  • Impact: Ensures compliance with privacy regulations and enhances the credibility and trustworthiness of content by protecting sensitive information.

  • Citation: "This field should never be populated in prod. This is only provided for easier human inspection when using dev builds (dev keys are public)."
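To ground levers 1 and 4 a little, here is a minimal sketch of what “a hash of terms near the anchor” and locality/bucket-weighted link scoring could look like. The window size, hashing scheme, locality categories, and weights are all assumptions; only the underlying concepts (context2, locality, bucket) come from the documentation.

    import hashlib
    import re

    def context_hash(page_text, anchor_text, window=5):
        """Hash the terms surrounding an anchor (a stand-in for 'context2')."""
        words = re.findall(r"\w+", page_text.lower())
        anchor_words = re.findall(r"\w+", anchor_text.lower())
        for i in range(len(words) - len(anchor_words) + 1):
            if words[i:i + len(anchor_words)] == anchor_words:
                nearby = (words[max(0, i - window):i]
                          + words[i + len(anchor_words):i + len(anchor_words) + window])
                return hashlib.sha1(" ".join(nearby).encode()).hexdigest()[:12]
        return ""

    # Hypothetical locality categories and bucket weights -- not values from the leak.
    LOCALITY_WEIGHT = {"same_page": 0.5, "same_site": 0.8, "external": 1.0}
    BUCKET_WEIGHT = {"low": 0.3, "medium": 0.6, "high": 1.0}

    def link_quality(locality, bucket):
        """Score an anchor by its locality and quality bucket."""
        return LOCALITY_WEIGHT[locality] * BUCKET_WEIGHT[bucket]

    print(context_hash("best trail running shoes reviewed for this year",
                       "running shoes"))
    print(link_quality("external", "high"))  # 1.0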

But now that the whole document has been leaked, we’ll see how much has changed since then, especially if someone at Google passes this newsletter on to a Google engineer to remedy this so you guys can’t game the algorithms.

What Things Are in the Document That Contradict What Google Has Said Publicly?

1. User Interaction Data

Public Statement: Google has stated that user interaction data, such as clicks, is not used directly as a ranking factor.

Contradiction in Documentation: The documentation indicates that user interaction data, such as co-clicks and engagement metrics, is used to refine and optimize content ranking.

2. Contextual Information Around Links

Public Statement: Google emphasizes the importance of high-quality backlinks without detailed public information on the contextual specifics.

Contradiction in Documentation: The documentation details the use of contextual data around anchors (e.g., hash of terms near the anchor, font size) to assess the relevance and quality of links.

3. Domain and Entity Scores

Public Statement: Google states that they do not use a "Domain Authority" score as a direct ranking factor.

Contradiction in Documentation: The documentation includes various scores related to entities and domains, such as RepositoryWebrefEntityScores and RepositoryWebrefEntityNameScore, which contribute to content ranking.

4. Quality Signals in Snippets

Public Statement: Google provides general guidance on snippet quality without detailing specific scoring methods.

Contradiction in Documentation: The documentation specifies the use of various quality signals for snippets, such as QualityPreviewSnippetQualityFeatures and QualityPreviewRanklabTitle, which contribute to snippet scoring and ranking.

5. Link Quality Evaluation

Public Statement: Google emphasizes the quality of backlinks but does not provide detailed methods on contextual specifics.

Contradiction in Documentation: The use of contextual information around links and anchors, such as the hash of terms near the anchor and font size, indicates a detailed method of evaluating link quality that is not commonly disclosed.

6. Localized Relevance Signals

Public Statement: Google advises on optimizing for local search through standard practices like Google My Business.

Contradiction in Documentation: The implementation of localized relevance signals for different regions in the documentation indicates more nuanced practices for local SEO that are not commonly discussed by Google.

Reported Layoffs

  • GroupM layoffs in Hong Kong (reported 5/29)

  • EssenceMediacom APAC (reported 5/29)