Time is one of the most valuable resources we have. But what if there were another kind of time? What would being able to move through your day more quickly mean for your life?
This video is part of our Weekly Wisdom series, in which experts discuss a range of issues.
The transcript has been edited.
Hello, this is Ross from Type A Media, and welcome to another episode of Weekly Wisdom. We at Type A Media are known for our four-day work week, which we manage by trimming all the fat from our everyday processes. So in this Weekly Wisdom video, I’ll show you how you can save a second, a minute, or even an hour. These are some of the tips, hacks, and tools that we use to cut the fat and get right to the point: get the data in, analyze it, and, most importantly, get it live on our client’s site so we can start ranking them.
Let’s get started without further ado. One of the things I notice people spend a lot of time on is finding every URL that a website has ever had. Typically, they will crawl the site to see what is there and check the XML sitemap. Then they take a peek in Search Console to see what shows up there. Perhaps they go into Majestic to check all the pages with links. But what if the client has migrated six times in the previous 12 years? Do you have that information? Is it parked somewhere? Of course, you could go to archive.org and search for what you’re looking for, but that’s time-consuming as well, so I’ll show you a quick way to pull everything together.
Archive.org
Did you know that there is an endpoint for retrieving CSVs from archive.org? You can literally build the complete URL from scratch. Here I’m using my website, typeamedia.net, with the match type set to domain. You can see a URL limit here, so I can literally say, “Give me 10,000, 100,000, you name it, as many URLs as you want, put them in a CSV, do it from 2007 to 2018, and only show me items that got a response with a 200 status code.” That’s fine, but I can’t really do much with the data until it’s in a spreadsheet, and we all know how much we love Google Sheets. What we’ll do is use Import Data. I need to add an equal sign at the beginning so Sheets understands it’s a formula, and after you’ve done that, make sure you wrap the URL in parentheses, and all of a sudden you’ll get all of the URLs. So, where do we go from here?
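For anyone who would rather build that URL in code than by hand, here is a minimal sketch. The video does not spell the endpoint out, so the Wayback Machine CDX server address below is an assumption, and so are the `fl` and `collapse` parameters, which are added only to get one row per distinct URL. The function name `build_cdx_url` is made up for this example.

```python
from urllib.parse import urlencode

# Assumption: the endpoint meant in the video is the Wayback Machine CDX server.
CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def build_cdx_url(domain, limit=10000, year_from=2007, year_to=2018):
    """Build a query URL listing archived captures for a domain."""
    params = {
        "url": domain,
        "matchType": "domain",       # match the domain and its subdomains
        "limit": limit,              # maximum number of rows to return
        "from": year_from,           # start of the capture date range
        "to": year_to,               # end of the capture date range
        "filter": "statuscode:200",  # keep only captures that returned 200
        "fl": "original",            # assumption: return only the original URL
        "collapse": "urlkey",        # assumption: one row per distinct URL
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


url = build_cdx_url("typeamedia.net")
print(url)
```

The printed URL is what would go inside the Sheets import formula. The limit and the date range are ordinary arguments, so asking for 100,000 rows is just `build_cdx_url("typeamedia.net", limit=100000)`.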
Next, I’m going to get my sitemaps. If you use Yoast, which I highly recommend, you’ll most likely have several sitemap URLs. What you want to do is set up a system where you can simply copy and paste them into a spreadsheet. Now, Import XML can do that for you, but the issue is that when I use Import XML, it doesn’t give me a nice clean list like this. It’ll either give me the complete document with all of the formatting intact, or it’ll simply give me a big ol’ error. We clearly don’t want that, so when I run Import XML, I’ll use a little RegEx to cut part of it out. Now is a good moment to pause the video and make a note of what this is. I won’t explain it, since it goes beyond the scope of this video, but in the end it lets you strip all of the unneeded content out of your XML sitemap.
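The RegEx from the video is not reproduced in this transcript, so the sketch below does not try to recreate it. It shows the same goal, a clean list of page URLs from a sitemap, using Python’s standard XML parser instead of a Sheets formula. The sample sitemap and the function name `sitemap_urls` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

# Hypothetical sitemap content, standing in for a downloaded sitemap file.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2018-01-01</lastmod></url>
  <url><loc> https://example.com/blog/ </loc></url>
</urlset>"""


def sitemap_urls(xml_text):
    """Return only the <loc> values, dropping every other sitemap field."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.iter(f"{SITEMAP_NS}loc")
        if loc.text and loc.text.strip()
    ]


for page in sitemap_urls(SAMPLE):
    print(page)
```

Because it walks every `loc` element, the same function also returns the child sitemap URLs when it is handed a sitemap index file.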
Majestic is up next. Now, I really like Majestic, largely because they’ve built APIs into almost everything, so there’s a Google Sheets add-on. Enter your domain name into the add-on, and it will show you the top pages, both old and new. When you click ‘Get data,’ these additional tabs appear because it is pinging the API and pouring everything into Sheets. Beautiful.
But there are two different sheets, and I want them combined, so I’ll apply a formula called Unique. With Unique, we have to turn this into an array, because we are stacking two separate ranges on top of one another rather than de-duplicating a single list. We open with a curly bracket, I select just the first three columns, and then we add a semicolon, which is what we use to stack ranges inside an array in Sheets. Move on to the next sheet and do the same thing, close the curly bracket like this, and we’re good to go. Okay, so that’s brought in all of the Majestic data, which is amazing.
Then there’s S-E-M, or should I say SEMrush, everybody’s favorite. So I’m going into Add-ons, opening Supermetrics, and launching the sidebar, and we’re going to put our domain name in there. We want the “domain organic search keywords” report, and then we click “Apply,” which will bring everything in for us.
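In Sheets, the formula described here would presumably look something like `=UNIQUE({Sheet1!A:C; Sheet2!A:C})`, where the sheet names are placeholders. As a rough illustration of what that does, here is a small Python sketch that stacks two three-column ranges and keeps the first copy of each row. The sample rows and the meaning of the number columns are invented.

```python
def unique_rows(*ranges):
    """Stack several ranges of rows and keep the first copy of each row."""
    seen = set()
    result = []
    for rows in ranges:
        for row in rows:
            key = tuple(row)  # a row counts as a duplicate only if every column matches
            if key not in seen:
                seen.add(key)
                result.append(key)
    return result


# Hypothetical three-column exports from the two Majestic tabs.
older = [("https://example.com/", 12, 30), ("https://example.com/old", 3, 5)]
newer = [("https://example.com/", 12, 30), ("https://example.com/new", 1, 2)]

combined = unique_rows(older, newer)
for row in combined:
    print(row)
```

A row is dropped only when all three columns match, so the same URL with different numbers in the two tabs would appear twice.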
Google Webmaster Tools
So, next, we’ll get Google Webmaster Tools (notice that I said ‘Webmaster Tools’ rather than ‘Search Console,’ since I’ve been doing this for more than two seconds). So, how do we bring in Search Console? Again, it’s our favorite tool: we’ll use Supermetrics, but with the data source changed to Search Console. Okay, drop in your website, pull it in as usual, and make sure you set the dates to the last year so it pulls in a ton of data.
I want to retrieve the full URLs for the search queries, so I press ‘Apply changes’ and there it is. Okay, here’s everything we rank for, and what I really care about here is the landing page data. Take a look at all of that beautiful duplication. So now that we have all of these various sources, we want to bring them together in a neat, unified way and eliminate all of the duplication. The question is, how do we do it?
Unique Formula
So, let’s return to that beautiful formula, Unique, which is also my favorite formula. We’ll simply type ‘unique’ here, open with a regular bracket, and then remember that we’re about to build an array, which is several ranges stacked on top of one another, so we open a curly bracket here and go through absolutely everything. We start with archive.org and pull it in. Then we go into the sitemap and bring it in. Then we go to all of the Majestic data and pull it in. Then we go into SEMrush and bring everything in, and then we go into Webmaster Tools, or Search Console as it is now known, and pull everything in. We finish it off with a curly bracket and a regular bracket, press ‘Enter,’ and we’re done.
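The same merge can be sketched outside Sheets. The example below assumes each of the five sources has already been reduced to a plain list of URLs. The sample data and the name `merge_sources` are made up, and trimming whitespace and skipping blank cells are extra assumptions that go beyond what Unique does on its own.

```python
def merge_sources(sources):
    """Merge URL lists from several named sources, dropping duplicates.

    Returns a dict mapping each URL to the first source that contained it,
    in first-seen order.
    """
    merged = {}
    for name, urls in sources.items():
        for url in urls:
            url = url.strip()  # assumption: stray spaces should not create duplicates
            if url and url not in merged:
                merged[url] = name
    return merged


# Hypothetical data standing in for the five tabs built earlier.
sources = {
    "archive.org": ["https://example.com/", "https://example.com/2009-page"],
    "sitemap": ["https://example.com/", "https://example.com/blog/"],
    "majestic": ["https://example.com/2009-page", "https://example.com/linked"],
    "semrush": ["https://example.com/blog/", ""],
    "search console": [" https://example.com/linked ", "https://example.com/landing"],
}

all_urls = merge_sources(sources)
for url, first_seen_in in all_urls.items():
    print(url, "<-", first_seen_in)
```

Keeping track of which source surfaced each URL first is optional, but it makes it easy to see what the crawl and the sitemap alone would have missed.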
As a result, we now have a fully de-duplicated list of every single URL that has ever appeared on our website. I believe I can say with a great degree of confidence that these are the only URLs that have ever existed for my website. With the list, I can now do some very cool stuff. As an example of what I might do with this information, I might go and visit the frog (Screaming Frog). I’d load in the list and have it crawl those URLs, because once it’s done, I’m going to run a report and see all of my redirect chains and canonical chains. I can see where all the issues are after a slew of redirects across a slew of site migrations.
And those are my SEO speed hacks, tips, and tricks. Done.
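To make the idea of a redirect chain concrete, here is a small sketch that follows redirects through a lookup table. This is not how Screaming Frog works internally. It assumes the crawl results have already been exported as a mapping from each URL to the URL it redirects to, and both the data and the function name `redirect_chain` are hypothetical.

```python
def redirect_chain(start, redirects, max_hops=10):
    """Follow redirects from start and return every URL visited, in order."""
    chain = [start]
    seen = {start}
    current = start
    while current in redirects and len(chain) <= max_hops:
        current = redirects[current]
        chain.append(current)
        if current in seen:
            break  # redirect loop: stop once a URL repeats
        seen.add(current)
    return chain


# Hypothetical crawl export: each URL mapped to its redirect target.
redirects = {
    "http://example.com/a": "http://example.com/b",
    "http://example.com/b": "https://example.com/b",
    "https://example.com/b": "https://example.com/final",
}

chains = {url: redirect_chain(url, redirects) for url in redirects}

# A chain with more than two entries means more than one hop to the final page.
long_chains = {url: chain for url, chain in chains.items() if len(chain) > 2}
for url, chain in long_chains.items():
    print(" -> ".join(chain))
```

Anything that lands in `long_chains` is a candidate for pointing the old URL straight at its final destination.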