
A 101 on how to use Screaming Frog SEO Spider + XPath to find all article outlinks on a section page

… without the menu, sidebar and footer links

Let’s say you want to know which articles about Donald Trump are listed here and on all paginated pages like …

You are just looking for these links:

Step 1: Find the related elements in the source code

Open the page in Google Chrome and inspect the code:

Use the inspect tool and hover over the element you want to inspect.

I want to check the links, so I hover over them with the tool. The corresponding HTML is highlighted in the developer tools.

Now I’m looking for a link to the highlighted article. It must be somewhere nearby. The one marked with the red arrow looks good. If you can’t find anything that looks like a link, it’s probably nested deeper. Expand the elements with the little arrows (marked with the blue arrow).

If you select the HTML code, the inspect tool highlights the related parts of the page:

Step 2: Get the XPath

More about XPath

XPath – Wikipedia
The Complete Guide to Screaming Frog Custom Extraction with XPath & Regex
In this guide, I’ll show you how to use Screaming Frog’s Custom Extraction feature to scrape schema markup, HTML…

You can try it like this:

In this case it’s


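For me, having [4] or any other number in the XPath is most of the time an indicator that it is not useful: a positional index only matches the element at that exact position, so it misses the other items in the list and breaks as soon as the layout changes.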
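To see why a path with a number in it tends to be fragile, here is a minimal sketch using Python's standard library module xml.etree.ElementTree, which supports only a limited subset of XPath. The markup, the class names teaser and banner, and the URLs are made up for illustration; they are not taken from the page used in this tutorial.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: a list of three article teasers.
PAGE_BEFORE = """
<body>
  <div class="teaser"><a href="/article-1">One</a></div>
  <div class="teaser"><a href="/article-2">Two</a></div>
  <div class="teaser"><a href="/article-3">Three</a></div>
</body>
"""

# The same list after a banner was inserted above the teasers.
PAGE_AFTER = """
<body>
  <div class="banner"><a href="/promo">Promo</a></div>
  <div class="teaser"><a href="/article-1">One</a></div>
  <div class="teaser"><a href="/article-2">Two</a></div>
  <div class="teaser"><a href="/article-3">Three</a></div>
</body>
"""


def hrefs(markup, path):
    """Return the href of every a-tag matched by an ElementTree path."""
    root = ET.fromstring(markup.strip())
    return [a.get("href") for a in root.findall(path)]


POSITIONAL = "./div[2]/a"              # like a copied path with a number in it
BY_CLASS = "./div[@class='teaser']/a"  # addresses the items by their class

positional_before = hrefs(PAGE_BEFORE, POSITIONAL)
positional_after = hrefs(PAGE_AFTER, POSITIONAL)
by_class_before = hrefs(PAGE_BEFORE, BY_CLASS)
by_class_after = hrefs(PAGE_AFTER, BY_CLASS)

print(positional_before, positional_after)
print(by_class_before, by_class_after)
```

The positional path returns a single link, and a different one once the banner appears, while the class-based path returns the same three links in both versions.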

So let’s create the Xpath manually…

If you are looking for LinkURL in

<a href="LinkURL">Link Text</a>

The XPath is

//a/@href

It selects all a-tags and, for each of them, the value of the @href attribute.
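As a rough illustration of "all a-tags, then the href attribute", here is a small Python sketch with the standard library's xml.etree.ElementTree. ElementTree cannot select attributes directly, so the attribute step is done with .get() in Python instead of inside the path. The sample markup is invented for this example.

```python
import xml.etree.ElementTree as ET

# Invented sample page with links in the menu, the content and the footer.
SAMPLE = """
<html><body>
  <nav><a href="/home">Home</a></nav>
  <div><a href="/article-1">Article 1</a></div>
  <footer><a href="/imprint">Imprint</a><a>No link here</a></footer>
</body></html>
"""

root = ET.fromstring(SAMPLE.strip())

# ".//a[@href]" finds every a-tag at any depth that has an href attribute;
# reading the attribute value is the Python stand-in for the attribute step.
all_hrefs = [a.get("href") for a in root.findall(".//a[@href]")]

print(all_hrefs)
```

Every link on the page comes back, including menu and footer links, which is why the expression has to be narrowed down.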

But we are looking for specific links, not all of them.


could be an indicator to identify the links in the list.

The Xpath to address this is


which is looking for all a-tags with


Another option could be to use the parent div-tag (yellow arrow) with


and then the child a-tag (blue arrow). You can use contains() if you don’t want to match every class name listed there.


So, summed up:

//div[contains(@class,"layout-item")]

gets every div whose class attribute contains “layout-item”,

/a

gets the child a-tag, and

/@href

gets the content of the href attribute.

The XPath starts with // (two slashes), which matches elements anywhere in the document, and each following / (one slash) steps one level down the hierarchy.
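The three parts (a div whose class contains layout-item, its child a-tag, the href attribute) can be tried out on a small sample before running a crawl. This is only a sketch: Python's standard library xml.etree.ElementTree supports neither contains() nor attribute selection, so both are emulated in plain Python. The sample markup and the extra class names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample page: a menu link, two article teasers and a sidebar link.
SAMPLE = """
<html><body>
  <nav><a href="/politics">Politics</a></nav>
  <div class="layout-item teaser big"><a href="/article-1">Article 1</a></div>
  <div class="teaser layout-item"><a href="/article-2">Article 2</a></div>
  <div class="sidebar"><a href="/newsletter">Newsletter</a></div>
</body></html>
"""


def article_links(markup):
    """Emulate: div with class containing layout-item, child a, href."""
    root = ET.fromstring(markup.strip())
    links = []
    for div in root.iter("div"):                   # any div in the document
        if "layout-item" in div.get("class", ""):  # substring test, like contains()
            for a in div.findall("a"):             # direct child a-tags only
                href = a.get("href")               # the href attribute value
                if href is not None:
                    links.append(href)
    return links


links = article_links(SAMPLE)
print(links)
```

Like contains(), the substring test would also match a class such as layout-item-wide, so it is worth checking the result for unwanted hits.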

Step 3: XPath in Screaming Frog SEO Spider

Go to

Configuration > Custom > Extraction

and add the two XPath ideas, e.g. like this:

Now run the crawl with an include filter that only covers the needed folder and all its paginated pages.
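An include pattern can be checked against a few URLs before starting the crawl. The sketch below uses Python's re module and assumes the pattern has to match the whole URL. The domain, the folder name and the pagination formats are hypothetical, since the real ones are not given here.

```python
import re

# Hypothetical include pattern: the section folder plus anything below it.
INCLUDE_PATTERN = r"https://www\.example\.com/section/.*"


def is_included(url):
    """True if the whole URL matches the include pattern."""
    return re.fullmatch(INCLUDE_PATTERN, url) is not None


# Hypothetical URLs: the section page, two pagination styles, another folder.
candidates = [
    "https://www.example.com/section/",
    "https://www.example.com/section/page/2/",
    "https://www.example.com/section/?page=3",
    "https://www.example.com/about/",
]

included = [url for url in candidates if is_included(url)]
print(included)
```

Only the section page and its paginated variants pass the filter; the URL in the other folder is left out.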



is the placeholder for

Now check the results in the Custom tab in Screaming Frog SEO Spider

The second try with a clickable seems to be wrong.

It lists menu items too:

The div layout-item a looks good

So this is the XPath to work with:

//div[contains(@class,"layout-item")]/a/@href
Just run and collect the links 🙂
