Auto Links Optimizer Tool: Improve your SEO and Interlinking architecture

Celebrating three years since Safecont was born, we have launched a new breakthrough feature: “Links Optimizer”. It is the first SEO tool that tells you, in a completely automatic way, how to optimize the architecture and interlinking of your website so that you get the maximum benefit from your link juice.
Optimize your architecture 100% with Links Optimizer. Which architecture should you choose? That will no longer be a problem.

As there is a great variety of website types, architecture types and business/product decisions, the complexity of developing a tool for this task is enormous. But there is no doubt that it was something SEOs needed: an analytical, objective way to develop an SEO architecture and an interlinking system between the pages of a domain.

New feature: CRAWL STATS a free SEO crawler with Safecont

We are pleased to announce more news, and it does not stop here.
The launch of “CRAWL STATS” is now official: a crawler is available, completely free of charge, in every Safecont account.

A new tab appears in your user dashboard. Once you refresh or launch an analysis, you will have crawling information and data, with listings for each type of URL and other metrics. Until you refresh your analysis, the data will not appear.
On the main screen of the tab you will see “Crawled stats”, a summary of the state of the domain with a pie chart and some very interesting information:

Unique indexable pages

Non-indexable pages

Pages that give a code other than 200 (301, 302, 404, 500, …)


Crawlable pages over the limit you set for the analysis (you may have launched an analysis on 5,000 URLs, but our crawler has found more pages)

And a great novelty, our Crawl Score: a metric we created that evaluates how hard bots find it to crawl a domain and discover all of its indexable URLs, condensed into a single score in which each depth level is weighted according to its importance. Further down the same page you will find the “Crawled URLs per level” chart, with information for each depth level.
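Safecont has not published the exact formula behind the Crawl Score, but the idea of a single level-weighted metric can be sketched as follows. The decay weights and the per-level crawl ratio below are illustrative assumptions, not the real implementation:

```python
# Illustrative sketch of a level-weighted crawl score (NOT Safecont's formula).
# crawled_per_level[d] / found_per_level[d] is the fraction of discovered URLs
# at depth d that the bot actually crawled; shallower levels weigh more.

def crawl_score(crawled_per_level, found_per_level):
    """Weighted average of per-level crawl ratios, between 0.0 and 1.0."""
    score, total_weight = 0.0, 0.0
    for depth, found in enumerate(found_per_level):
        if found == 0:
            continue
        weight = 1.0 / (depth + 1)      # assumed: importance decays with depth
        ratio = crawled_per_level[depth] / found
        score += weight * ratio
        total_weight += weight
    return score / total_weight if total_weight else 0.0

# Example: everything crawled at levels 0-1, only half of level 2.
print(round(crawl_score([1, 10, 50], [1, 10, 100]), 3))  # 0.909
```

A domain where bots reach every indexable URL at every level would score 1.0; missing pages at shallow levels hurts the score more than missing them deep in the structure.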

All this information is downloadable as a CSV that contains many more details to work with: each URL’s index/follow status, where the URL was found, where it points, its status code, its PageRisk, similarity, PageRank, hub value, authority value, the semantic cluster it belongs to, and much more.
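As a rough illustration, here is how you might filter that export with Python’s standard library. The column names used here (`url`, `status_code`, `pagerisk`) are assumptions, so check the header row of your own CSV:

```python
# Sketch: filtering the Crawl Stats CSV export (column names are assumed).
import csv
import io

# In practice you would use open("crawl_stats_export.csv");
# a small inline sample stands in for the real file here.
sample = """url,status_code,pagerisk
https://example.com/,200,0.10
https://example.com/old,301,0.00
https://example.com/thin,200,0.85
"""

rows = list(csv.DictReader(io.StringIO(sample)))

# URLs answering something other than 200 (redirects, errors, ...)
non_200 = [r["url"] for r in rows if r["status_code"] != "200"]

# Indexable pages with a worryingly high PageRisk
risky = [r["url"] for r in rows
         if r["status_code"] == "200" and float(r["pagerisk"]) > 0.5]

print(non_200, risky)
```

The same pattern extends to any other column in the export, such as grouping URLs by their semantic cluster.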

Below them you will find three boxes with more information:

Non-indexed URLs: with information on redirections, noindex and other circumstances that can make a page non-indexable. Clicking on any of them takes you to a detail page with the full list of URLs involved and other relevant data.

Non-200 status URLs: where you will find the pages that return a status code other than 200 (redirects such as 301 and 302, or errors such as 404 and 500).

Wild Query Strings and Duplicate Content

Recently, Robin Rozhon wrote an interesting post about duplicate content and how small changes can reduce the number of indexed pages on a site, increasing the traffic received by landing pages and, in turn, the revenue obtained by that site.
Rozhon’s thesis is that when the same content is duplicated across several indexed URLs, you lose control over the crawling of your site, relying on Google’s judgment and spreading your potential visitors across several URLs that have to compete with one another for the same content. In the post linked above, the author explains how they cut their indexed URLs by 80% (from 500,000 pages to only 100,000). Before the change, only 8.55% of indexed URLs generated at least one session per month; after the deindexing, 49.7% of indexed URLs generated organic traffic (in absolute terms, roughly 42,750 pages with traffic before versus 49,700 after, from one fifth as many indexed pages).
At Safecont, we totally agree: duplicate content is something to avoid. The main sources of duplicate content are query parameters, used to generate content views, for faceted navigation, or to track users. The last one is a really bad idea. There are better ways to track users, but if you are obliged to use tracking parameters, you must keep them out of the index, because each crawler pass will generate useless URLs that duplicate the main landing pages of your site.
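For the tracking-parameter case, a common mitigation alongside noindex is to normalize URLs before comparing or linking to them. A minimal sketch with Python’s urllib; the parameter list is illustrative, so extend it with your own tracking keys:

```python
# Sketch: normalizing URLs by dropping known tracking parameters, so that
# duplicate detection (or canonical-tag generation) sees one URL per page.
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Illustrative list of common tracking keys; adapt to your own site.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "gclid", "fbclid"}

def canonicalize(url):
    """Return the URL with tracking parameters and fragment removed."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in TRACKING_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), ""))

print(canonicalize("https://example.com/jackets?utm_source=news&color=red"))
# https://example.com/jackets?color=red
```

Two URLs that differ only in tracking parameters now map to the same canonical string, so they count as one page instead of duplicating each other.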
Regarding content views, you should ask yourself: does this parameter change the content seen by the user? If the answer is no, you should keep it out of the index. If the answer is yes, index it only if the change is noticeable. For example, a parameter used to sort a listing by price should not be indexed.
Facets pose the same problem. If you combine several facets with several filters, you can generate thousands of URLs that add no content to your site. In general, you should use facets only for individual category pages and use non-indexable filters to refine a search. For example, you could generate a landing page for “adidas jackets” but not for “adidas jackets under $200”.
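That rule of thumb can be expressed as a simple whitelist check. The facet names, filter names and one-facet limit below are illustrative assumptions for a hypothetical shop, not a universal recipe:

```python
# Sketch: deciding whether a faceted URL deserves indexing.
from urllib.parse import parse_qsl, urlsplit

INDEXABLE_FACETS = {"brand", "category"}       # facets that earn a landing page
FILTER_PARAMS = {"price_max", "sort", "page"}  # refinements: never indexed

def should_index(url):
    """True if the URL uses at most one whitelisted facet and no filters."""
    params = dict(parse_qsl(urlsplit(url).query))
    if FILTER_PARAMS & params.keys():
        return False
    # Allow at most one indexable facet to avoid a combinatorial URL explosion.
    return len(params) <= 1 and set(params) <= INDEXABLE_FACETS

print(should_index("https://shop.example/jackets?brand=adidas"))                # True
print(should_index("https://shop.example/jackets?brand=adidas&price_max=200"))  # False
```

In practice this kind of rule would feed your robots meta tags or canonical logic, keeping the facet landing pages indexable and the filtered variants out of the index.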
The usual way to find URL parameters is to look for them in the site’s code.

Semantics for your SEO

This week we talk about one of the parts of Safecont that can help improve the quality of a page: the “SEMANTIC” tab. In this part of the tool we summarize the semantic information of a domain in several ways.
First, we have TF-IDF again. Where on previous occasions we looked at the TF-IDF of each individual URL of a site, in the SEMANTIC tab we focus on the overall TF-IDF of the domain: the TF-IDF of the domain’s most frequent words, calculated as the average of the TF-IDF values across all the URLs of the site. This chart can tell us which words are important in our domain and give us an overview of how we use them. As in the article about the individual TF-IDF of each URL, we can look for words whose TF-IDF is too high (values well above the average). Since this chart shows domain averages, a high value here implies generally high values across the domain: words we are overusing on each of the pages where they appear.
The other way to see the TF-IDF of the site is the chart in the same tab that shows the relationship between each word’s TF-IDF value and the number of URLs in which it appears.
In this chart, each of the domain’s most-used words is represented as a point. The horizontal axis shows the number of URLs in which that word appears, and the vertical axis shows the average TF-IDF value the word has across all the pages where it is used. Hovering the mouse over one of these points shows which word it refers to and its values. Likewise, you can zoom at will. This chart can be used for several things:

Searching for words with a TF-IDF close to 0, which are therefore used across the whole site
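To make the two axes of that chart concrete, here is a minimal computation of average TF-IDF per word over a toy set of pages. The exact TF-IDF variant Safecont uses is not documented, so this sketch assumes the classic tf · log(N/df) form:

```python
# Average TF-IDF per word across a toy "domain" of three pages.
import math
from collections import Counter

pages = [
    "surf jackets surf boards",   # stand-ins for real page texts
    "surf boards wax",
    "jackets coats wax",
]
docs = [p.split() for p in pages]
N = len(docs)
df = Counter(w for d in docs for w in set(d))   # URLs containing each word

def tfidf(word, doc):
    tf = doc.count(word) / len(doc)
    return tf * math.log(N / df[word])

# The two axes of the chart: URL count and average TF-IDF per word.
for word in sorted(df):
    scores = [tfidf(word, d) for d in docs if word in d]
    print(word, df[word], round(sum(scores) / len(scores), 3))
```

Note that a word appearing in every URL gets idf = log(N/N) = 0, which is exactly why the words with TF-IDF close to 0 in the chart are the ones used across the whole site.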

Ranking pages: Hubs and Authorities

This week we are still talking about web architecture and how we can work on it with Safecont.
One of the least-known parts of our tool is the listing of a domain’s pages by their Hub or Authority scores. Although we do not usually mention them much in our videos, these scores also serve to measure the importance of the pages of a domain and to improve the architecture of a site, as an alternative to the typical PageRank algorithm.

While PageRank sorts pages by the probability that a random surfer visits them, the HITS (Hyperlink-Induced Topic Search) algorithm is based on the idea that there are two types of pages on the Internet:

Hub-type pages are those that, although they do not provide much information on a topic themselves, link to the pages that do.
Authority-type pages are those that contribute content on a topic to a website and are therefore linked to by many Hub pages related to that topic.

It is worth emphasizing that the two scores (Hub and Authority) are not mutually exclusive. The main page of a site usually has a high Authority score (it is linked from the whole site) and a high Hub score (it links to many pages with high Authority scores). Let’s see how we can use these scores to improve the structure of a site.
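The HITS update rule itself is short enough to sketch in full: each page’s Authority score is the sum of the Hub scores of the pages linking to it, its Hub score is the sum of the Authority scores of the pages it links to, and both vectors are normalized each round until they converge. The tiny link graph below is made up for illustration:

```python
# Power iteration for HITS on a toy link graph (made-up pages).
links = {                     # page -> pages it links to
    "home":    ["cat1", "cat2", "product"],
    "cat1":    ["product", "home"],
    "cat2":    ["product", "home"],
    "product": ["home"],
}

pages = list(links)
hub = {p: 1.0 for p in pages}
auth = {p: 1.0 for p in pages}

for _ in range(50):
    # Authority: sum of Hub scores of in-linking pages, then normalize.
    auth = {p: sum(hub[q] for q in pages if p in links[q]) for p in pages}
    norm = sum(v * v for v in auth.values()) ** 0.5
    auth = {p: v / norm for p, v in auth.items()}
    # Hub: sum of Authority scores of out-linked pages, then normalize.
    hub = {p: sum(auth[q] for q in links[p]) for p in pages}
    norm = sum(v * v for v in hub.values()) ** 0.5
    hub = {p: v / norm for p, v in hub.items()}

print({p: round(auth[p], 2) for p in pages})
print({p: round(hub[p], 2) for p in pages})
```

On this graph, “product” converges to the highest Authority while the category pages score highest as Hubs; those two orderings are what the lists in the Architecture tab expose for a real domain.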
We have placed the page listings by Hub or Authority score in the “Architecture” tab of our tool. In that section you will find two links to the lists of URLs ordered by their weight as a Hub and as an Authority. Let’s look at some examples:
This website is the online store of a well-known surfing fashion brand. If we look at its list of Authorities we see the following:
As you can see, the root has a high Authority weight, which is logical because it is linked from most pages of the site. However, we notice something curious: its Hub score is very low.
Normally it would have a score close to 1.0, because the usual thing in an e-commerce site is that this page