Do you want to find the best keywords for your content? Do you need to do a competitive analysis on your blog or website? If so, Screaming Frog is an invaluable tool. This article will walk through how to use it and what information can be found with this tool.
Screaming Frog's custom extraction feature lets you choose which fields to pull from each page and then export them as a CSV file. You can open that file in Excel and filter by data type, such as dates, numbers, or words. For example, if you wanted every URL that contains “keyword” but only where a page has more than 5 matches, you could filter the export down to the URLs that meet both requirements. Keep reading to learn more.
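If you would rather script that filter than do it in Excel, here is a minimal Python sketch. The column names ("Address" and "Keyword Matches") and the sample rows are assumptions made up for illustration; your own export will use whatever names you gave your extractors.

```python
import csv
import io

# Made-up sample export. The column names are assumptions for illustration;
# check the header row of your own CSV before reusing this.
SAMPLE_CSV = """Address,Keyword Matches
https://example.com/keyword-guide,7
https://example.com/keyword-tips,3
https://example.com/about,9
https://example.com/blog/keyword-research,6
"""


def filter_rows(csv_text, term="keyword", min_matches=5):
    """Return URLs that contain `term` and have more than `min_matches` matches."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row["Address"]
        for row in reader
        if term in row["Address"] and int(row["Keyword Matches"]) > min_matches
    ]


matching_urls = filter_rows(SAMPLE_CSV)
print(matching_urls)
```

To run this against a real export, you could read the file with open() and pass its text to filter_rows instead of SAMPLE_CSV.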
How to Use Screaming Frog Custom Extraction
In Screaming Frog, go to Configuration > Custom > Extraction.
Next, click + Add and set up your extraction rules.
Give the extractor a name, choose CSSPath, XPath, or Regex, then enter your expression. If you aren’t sure which selector or function you need, look at the examples below.
For example, you could set up a rule that scrapes the Facebook Pixel ID from each page.
In my crawl results, one of my pages turned out to be missing a Facebook Pixel.
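To give a rough idea of what a Regex rule for this could look like, here is a small Python sketch. The pattern is my assumption about a typical pixel snippet (an fbq('init', '...') call), and the two sample pages are made up. Test any pattern against your own page source before relying on it.

```python
import re

# Assumed pattern: matches fbq('init', '1234...') with single or double quotes
# and captures only the numeric ID.
PIXEL_ID = re.compile(r"""fbq\(\s*['"]init['"]\s*,\s*['"](\d+)['"]""")

# Made-up page sources for illustration.
pages = {
    "https://example.com/": (
        "<script>fbq('init', '123456789012345'); fbq('track', 'PageView');</script>"
    ),
    "https://example.com/contact": "<script>console.log('no pixel here');</script>",
}


def find_pixel_id(html):
    """Return the first pixel ID found in the HTML, or None if there is none."""
    match = PIXEL_ID.search(html)
    return match.group(1) if match else None


results = {url: find_pixel_id(html) for url, html in pages.items()}
for url, pixel_id in results.items():
    print(url, pixel_id or "MISSING")
```

A page that comes back as MISSING here corresponds to an empty extraction cell in the crawl results.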
Here are more ready-made expressions to help you crawl and search for content:
Basic Syntax for XPath Web Scraping
Search anywhere in the document: //
Search within the root: /
Select a specific attribute of an element: @
Select any element (wildcard): *
Find a specific element: [ ]
Specify the current element: .
Specify the parent element: ..
Check if x starts with y: starts-with(x, y)
Check if x contains y: contains(x, y)
Find the last item in a set: last()
Count occurrences of the XPath extraction: count(XPath)
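If you want to experiment with these building blocks outside of Screaming Frog, here is a small sketch using Python's built-in xml.etree.ElementTree. Two caveats: ElementTree only supports a subset of XPath and needs well-formed markup, so the function-style checks (starts-with, contains, count) are imitated in plain Python here. The sample page is made up.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed sample page (ElementTree cannot parse sloppy HTML).
PAGE = """<html><body>
<div class="post"><h1>Title</h1><p>Intro</p></div>
<ul><li>One</li><li>Two</li><li>Three</li></ul>
<a href="/about" title="About us">About</a>
</body></html>"""

root = ET.fromstring(PAGE)

# ".//" searches anywhere below the current element (like "//" in full XPath).
anywhere = [e.tag for e in root.findall(".//li")]

# "*" is the wildcard: every direct child of any <div>.
div_children = [e.tag for e in root.findall(".//div/*")]

# "[@title]" keeps only elements that carry a title attribute.
with_title = [e.get("href") for e in root.findall(".//a[@title]")]

# "[...]" narrows down to one specific element.
specific = root.find(".//div[@class='post']/h1").text

# ".." steps up to the parent element.
parents = [e.tag for e in root.findall(".//li/..")]

# "last()" picks the last item in a set.
last_item = root.find(".//ul/li[last()]").text

# Imitations of count(), starts-with() and contains() in plain Python.
li_count = len(root.findall(".//li"))
link = root.find(".//a")
href_starts_with_slash = link.get("href").startswith("/")
title_contains_us = "us" in link.get("title")

print(anywhere, div_children, with_title, specific, parents, last_item, li_count)
```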
How to Extract Common HTML Elements
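Extract all H1 tags: //h1
Extract the first H3 tag: /descendant::h3[1]
Extract the second H3 tag: /descendant::h3[2]
Extract any <p> contained within a <div>: //div//p
Extract any <div> with class “author”: //div[@class='author']
Extract any <p> with class “bio”: //p[@class='bio']
Extract any element with class “bio”: //*[@class='bio']
Extract the last <li> in a <ul>: //ul/li[last()]
Extract the first <li> in an <ol> with class “cat”: //ol[@class='cat']/li[1]
Count the number of H2s (set extraction filter to “Function Value”): count(//h2)
Extract any link with anchor text containing “click here”: //a[contains(., 'click here')]
Extract any link with a title starting with “Written by”: //a[starts-with(@title, 'Written by')]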
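Here is a rough way to try the element extractions above on a test page before you run a full crawl. It is only a sketch: it uses Python's built-in ElementTree, which handles a subset of XPath, so the "contains" and "starts with" checks and the H2 count are done in plain Python. The sample markup is made up.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed sample page for illustration.
PAGE = """<html><body>
<h1>Main heading</h1>
<div class="author"><p class="bio">Jane writes about SEO.</p></div>
<h2>Part one</h2><h3>First sub</h3>
<h2>Part two</h2><h3>Second sub</h3>
<ul><li>Alpha</li><li>Beta</li></ul>
<ol class="cat"><li>News</li><li>Guides</li></ol>
<a href="/start" title="Written by Jane">click here to start</a>
<a href="/other" title="Other">read more</a>
</body></html>"""

root = ET.fromstring(PAGE)

h1_tags = [e.text for e in root.findall(".//h1")]

# findall returns matches in document order, so indexing gives first/second.
h3_tags = root.findall(".//h3")
first_h3, second_h3 = h3_tags[0].text, h3_tags[1].text

p_in_div = [e.text for e in root.findall(".//div//p")]
author_divs = root.findall(".//div[@class='author']")
bio_paragraphs = root.findall(".//p[@class='bio']")
bio_elements = [e.tag for e in root.findall(".//*[@class='bio']")]

last_li = root.find(".//ul/li[last()]").text
first_cat_li = root.find(".//ol[@class='cat']/li[1]").text

# Stand-in for count(//h2).
h2_count = len(root.findall(".//h2"))

# Stand-ins for contains() and starts-with().
click_here_links = [
    a.get("href") for a in root.iter("a") if "click here" in "".join(a.itertext())
]
written_by_links = [
    a.get("href") for a in root.iter("a") if a.get("title", "").startswith("Written by")
]

print(h1_tags, first_h3, second_h3, last_li, first_cat_li, h2_count)
```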
How to Extract Common HTML Attributes
Extract all links: //@href
Extract links that start with “mailto” (email addresses): //a[starts-with(@href, 'mailto')]
Extract all image source URLs: //img/@src
Extract all image source URLs for images with a class name containing “aligncenter”: //img[contains(@class, 'aligncenter')]/@src
Extract elements with the rel attribute set to “alternate”: //*[@rel='alternate']
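The attribute extractions can be tried out the same way. Again, this is just a sketch with made-up markup: Python's built-in ElementTree cannot return attribute values directly from a path, so the values are read with get() after selecting the elements.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed sample page for illustration.
PAGE = """<html>
<head><link rel="alternate" hreflang="de" href="https://example.com/de/"/></head>
<body>
<a href="/pricing">Pricing</a>
<a href="mailto:hello@example.com">Email us</a>
<img class="wp-image aligncenter" src="/img/hero.png"/>
<img class="thumb" src="/img/thumb.png"/>
</body></html>"""

root = ET.fromstring(PAGE)

# Every href in the document, whatever element carries it.
all_hrefs = [e.get("href") for e in root.findall(".//*[@href]")]

# Links whose href starts with "mailto".
mailto_links = [a.get("href") for a in root.iter("a") if a.get("href", "").startswith("mailto")]

# All image source URLs.
image_sources = [img.get("src") for img in root.iter("img")]

# Image sources where the class attribute contains "aligncenter".
centered_sources = [
    img.get("src") for img in root.iter("img") if "aligncenter" in img.get("class", "")
]

# Elements with rel set to exactly "alternate".
alternates = [e.get("href") for e in root.findall(".//*[@rel='alternate']")]

print(all_hrefs, mailto_links, image_sources, centered_sources, alternates)
```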
The possibilities are endless; please let me know if you want any extractions added to this list.
Screaming Frog is a valuable tool for anyone who wants to find the best keywords, run a competitive analysis on a blog or website, or do any other task that requires crawling and analyzing a site. It’s easy to use and can tell you a lot about your own content and what your competitors are doing online. If you have questions about how it works or want more in-depth guidance on using the software, please email me. I will be happy to help!