How to Scrape Media Kits for SEO

Using ScrapeBox to find Media Kits for editorial link-building projects.

What’s up, guys! Ivan here. In this video, I’ll show you how I set up a scrape for Media Kits, which I use for link building. I recently misplaced my Media Kit URL file, so I decided to redo the scrape. Let’s dive into the process so you can use it to build backlinks at your leisure.


Tools and Setup

Scraping Server and Proxies

  • Scraping Server:
    I use a dedicated local scraping server with ScrapeBox installed.

  • Proxy Service:
    For scraping URLs from Yahoo, I currently use Lightning Proxies (IPv6 rotators with unlimited bandwidth).
    Pricing starts at $10/day for 30 MB/s.
    If you plan to scan or pull data from the resulting URLs, you’ll also need a cheap datacenter proxy solution, like WebShare.
    Lightning Proxies’ datacenter products can probably do the same job, but as of this post I have not tested them.

Why Yahoo Over Google or Bing?

  • Yahoo’s Advantages:
    Yahoo returns more topically relevant URLs than Bing or Google.

  • Proxies for Yahoo:
    Unlimited bandwidth proxies perform well for short scraping sessions.


Step-by-Step Guide

Purchasing and Configuring Proxies

  1. Choose a Plan: I opted for Lightning Proxies’ $10/day plan.

  2. Payment Tip: At the Stripe checkout, choose the alternate PayPal option.
    Paying through PayPal often gets around card rejections.

  3. Whitelist Your IP:
    Add your server’s IP to the proxy provider’s whitelist so your connections are authorized.

Setting Up ScrapeBox

  1. Adjust Settings:

    • Set connections to 7 threads (about a quarter of bandwidth capacity).

    • Timeout settings should match your bandwidth.

  2. Import Permutations:

    • Use city-based permutations like the top 5,000 U.S. cities.

    • Combine permutations with “Media Kit” keywords.
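
The permutation step can be sketched in Python as below. This is only an illustration of pairing each city with each keyword phrase; the function name `build_queries`, the three sample cities, and the second keyword phrase are assumptions for the example, not the list used in the video. ScrapeBox itself handles the merge when you import the keyword file.

```python
from itertools import product

def build_queries(cities, keywords):
    """Pair every city with every keyword phrase, quoting the phrase for exact match."""
    queries = []
    for city, keyword in product(cities, keywords):
        city = city.strip()
        keyword = keyword.strip()
        if city and keyword:  # skip blank lines from the input lists
            queries.append(f'"{keyword}" {city}')
    return queries

# Placeholder inputs (assumptions); a real run would load the full city list from a file.
cities = ["Austin", "Denver", "Tampa"]
keywords = ["media kit", "advertising media kit"]

queries = build_queries(cities, keywords)
for query in queries:
    print(query)
```

With 5,000 cities and a couple of keyword phrases, the same loop yields roughly 10,000 search terms, which is why the scrape is worth automating.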

Scraping Process

  1. Search Configuration:

    • Target Yahoo with a depth of 50 results per search term.

    • Automate the scraping process using the ScrapeBox Automator plugin.

  2. URL Filtering:

    • Remove duplicates and irrelevant entries.

    • Sort URLs to focus on viable domains.
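
If you want to repeat the de-duplication outside ScrapeBox, a minimal sketch looks like this. It assumes "duplicate" means "same hostname, ignoring a leading www.", which is one reasonable reading and not necessarily the exact rule ScrapeBox applies; `dedupe_by_domain` and the sample URLs are made up for the example.

```python
from urllib.parse import urlsplit

def dedupe_by_domain(urls):
    """Keep the first URL seen for each hostname, then return them sorted by hostname."""
    seen = {}
    for url in urls:
        url = url.strip()
        host = (urlsplit(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]  # treat www.example.com and example.com as one site
        if host and host not in seen:
            seen[host] = url
    return [seen[host] for host in sorted(seen)]

# Sample data (assumption) standing in for a harvested URL list.
harvested = [
    "https://www.example.com/media-kit",
    "http://example.com/advertise",
    "https://news.example.org/mediakit.pdf",
    "not a url",
]

cleaned = dedupe_by_domain(harvested)
for url in cleaned:
    print(url)
```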


Analyzing Results

URL Scrubbing and Exporting

  • Use ScrapeBox's tools to scrub:

    • URLs containing undesired keywords.

    • Hostnames that are IP addresses.
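
The two scrub rules above can be approximated in Python as follows. This is a sketch under assumptions: the blocked-word list is a placeholder, and the keyword match is a plain case-insensitive substring test on the whole URL, which may be looser or stricter than the ScrapeBox filter.

```python
import ipaddress
from urllib.parse import urlsplit

def is_ip_host(url):
    """Return True when the URL's hostname is a literal IPv4 or IPv6 address."""
    host = urlsplit(url).hostname
    if not host:
        return False
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False

def scrub(urls, blocked_words):
    """Drop URLs that contain a blocked word or whose hostname is an IP address."""
    blocked = [word.lower() for word in blocked_words]
    kept = []
    for url in urls:
        lowered = url.lower()
        if any(word in lowered for word in blocked):
            continue
        if is_ip_host(url):
            continue
        kept.append(url)
    return kept

# Sample data and blocked words are assumptions for illustration.
sample = [
    "https://example.com/media-kit",
    "http://203.0.113.7/media-kit",
    "https://example.org/forum/media-kit",
]
scrubbed = scrub(sample, ["forum"])
print(scrubbed)
```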

Page Scanner

  • Custom Footprint:
    Look for “Media Kit” in the page content, often located in footers.

  • Export filtered URLs for further analysis.
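
For a sense of what the footprint check does, here is a rough Python equivalent that works on HTML you have already downloaded. It is a sketch, not the Page Scanner's actual logic: the regular expression (which also accepts "mediakit" and "media-kit") and the choice to search link targets as well as visible text are my assumptions.

```python
import re
from html.parser import HTMLParser

# Matches "media kit", "mediakit", "media-kit", "media_kit" in any letter case.
FOOTPRINT = re.compile(r"media[\s_-]?kit", re.IGNORECASE)

class TextAndLinks(HTMLParser):
    """Collects text nodes and <a href> values from an HTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.chunks.append(value)

def has_footprint(html):
    """Return True when the page text or any link target mentions a media kit."""
    parser = TextAndLinks()
    parser.feed(html)
    return bool(FOOTPRINT.search(" ".join(parser.chunks)))

# Tiny hand-written pages (assumptions) in place of real downloads.
print(has_footprint('<footer><a href="/advertise">Media Kit</a></footer>'))
print(has_footprint('<footer><a href="/files/mediakit.pdf">Advertise</a></footer>'))
print(has_footprint("<p>Contact us</p>"))
```

Checking link targets catches sites whose footer link says "Advertise" but points at a file named mediakit.pdf.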


Post-Scraping Workflow

Checking Domain Authority

  1. Use tools like Open PageRank: Identify high-value domains.

  2. Analyze Metrics:

    • Ranking keywords.

    • Referring domains.

Example: Media Kit Review

  • Look for Media Kits that mention:

    • Sponsored content.

    • Editorial coverage details.

    • Pricing for link insertions.

Outreach

  1. Draft an Email Inquiry:

    • Subject: Editorial Content Request.

    • Key Questions:

      • Is the content tagged as sponsored?

      • Is the link do-follow?

      • Is the placement permanent?

  2. Pricing Benchmark:

    • Good links often range from $100 to $350, depending on metrics.


Proxies for Data Collection

  • Datacenter Proxies:
    Use WebShare for scraping deeper URL data. Plans start at $15/month.


Final Thoughts

Scraping for Media Kits is a powerful way to identify link-building opportunities. Use proxies and automation tools to streamline the process. Don’t forget to validate domain authority and contact site owners directly for collaboration.


Ivan David Lippens is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.
