How to implement SEO in applications made in JavaScript

June 18, 2021


Search engine crawlers are designed to crawl the HTML content of web pages. Websites have evolved, however, and many of them now generate their content through JS. These sites are affected because crawlers or bots do not know how to handle JS-generated content correctly.

But there are tools that help us solve this problem, such as Prerender.io.

Prerender.io is a middleware that is installed on your server and will check each request to see if it is a request from a crawler. If it is a request from a crawler, the middleware will send a request to Prerender.io to return the static HTML of that page. If not, the request will continue on its normal server paths. The crawler never knows that you are using Prerender.io since the response always passes through your server.
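The decision described above can be sketched in a few lines of JavaScript. This is only an illustration of the idea, not the actual code of the Prerender middleware: the function names `isCrawlerRequest` and `routeRequest` and the shortened `CRAWLER_AGENTS` list are assumptions made for this example.

```javascript
// Hypothetical, shortened list of crawler user-agent fragments (illustration only).
const CRAWLER_AGENTS = ['googlebot', 'bingbot', 'twitterbot', 'facebookexternalhit'];

// Returns true when the request looks like it comes from a crawler:
// either the User-Agent contains a known bot name, or the URL carries
// the _escaped_fragment_ query parameter.
function isCrawlerRequest(userAgent, url) {
  const agent = (userAgent || '').toLowerCase();
  const knownBot = CRAWLER_AGENTS.some((bot) => agent.includes(bot));
  const escapedFragment = url.includes('_escaped_fragment_');
  return knownBot || escapedFragment;
}

// Decides where a request should be served from:
// 'prerender' for crawlers, 'app' for everything else.
function routeRequest(userAgent, url) {
  return isCrawlerRequest(userAgent, url) ? 'prerender' : 'app';
}
```

With this sketch, `routeRequest('Twitterbot', '/')` goes to the prerender path, while a regular browser user agent stays on the normal application path.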

We have prepared a guide that can help you solve your SEO problems with JS applications.

For practical purposes we will use a basic application of Angular. This application is not intended to be perfect nor does it follow any style guide, it is only meant for demonstrating how Prerender works.

Create a basic app in Angular

The next step is to create an application that will make the calls to the Prerender server. We will use Express to create this application.

If we do not have it installed, execute the following commands:

[prism:bash]
npm install -g express
npm install -g express-generator
[/prism:bash]

We move to the folder where we will create the app [prism:bash]cd /var/opt/[/prism:bash]

We create the app:

[prism:bash]
express testapp
cd testapp/
npm install
[/prism:bash]

Edit the file views/layout.jade:

[prism:jade]
doctype html
html
  head
    title= title
    meta(name="fragment" content="!")
    link(rel='stylesheet', href='/stylesheets/style.css')
    script(src='//ajax.googleapis.com/ajax/libs/angularjs/1.2.6/angular.min.js')
    script(src='javascripts/app.js')
  body(ng-app="PrerenderApp")
    block content
[/prism:jade]

Create the file public/javascripts/app.js with the following code:

[prism:javascript]
var app = angular.module("PrerenderApp", []);

app.controller("ExampleController", function($scope) {
  $scope.message = "Hello World";
});
[/prism:javascript]

Edit the file views/index.jade

[prism:jade]
extends layout

block content
  div(ng-controller="ExampleController")
    h1= title
    p Welcome to #{title}
    p {{message}}
[/prism:jade]

We change the port from 3000 to 8080 by modifying the file bin/www:

[prism:javascript]
var port = normalizePort(process.env.PORT || '8080');
[/prism:javascript]

Now, if we start the Express server:

[prism:bash]DEBUG=testapp:* npm start[/prism:bash]

We should see something like this:

[prism:bash]
testapp@0.0.0 start /var/opt/testapp
node ./bin/www

testapp:server Listening on port 8080 +0ms
[/prism:bash]

Test the App

Now we make a call with curl to see how crawlers would see our app. For this demonstration we will use the user agent of the Twitter crawler:

[prism:bash]curl -A "Twitterbot" "http://localhost:8080"[/prism:bash]

[prism:markup]

htmlExpress
Express

Welcome to Express

{{message}}

GET / 200 71.506 ms - 429
[/prism:markup]

If we open [our app](http://localhost:8080) in the browser, we should see:

![browser](http://i68.tinypic.com/2ytqiig.png)

As we can observe, in the browser the variable __message__ has been replaced by __Hello World__, while the curl response still shows the variable __message__ unreplaced. Why does this happen? Crawlers do not process JS as browsers do: when they receive the HTTP 200 code they assume that the page is already loaded, so they do not wait for the JS to finish rendering the app. In this case the crawler does not wait for the controller to replace the variable __message__ with __Hello World__.

## Install and Configure Prerender

### Install PM2

PM2 is a production process manager for Node.js applications with a built-in load balancer. It lets you keep applications alive forever, reload them with no downtime, and facilitate common system administrator tasks.

[prism:bash]npm install pm2 -g[/prism:bash]

### Install Prerender Middleware

[prism:bash]
$ git clone https://github.com/prerender/prerender.git
$ cd prerender
$ npm install
[/prism:bash]

By default the Prerender server does not have any type of cache, so if we start the Prerender server and request a page, it will generate the HTML and then serve it to us. If we ask for the same page again, it will generate it again.

First, let's verify that our Prerender service is working properly. We execute the following command:

[prism:bash]pm2 start server.js[/prism:bash]

[prism:bash]
[PM2] Starting /server.js in fork_mode (1 instance)
[PM2] Done.
┌──────────┬────┬──────┬───────┬────────┬─────────┬────────┬─────┬───────────┬──────────┐
│ App name │ id │ mode │ pid   │ status │ restart │ uptime │ cpu │ mem       │ watching │
├──────────┼────┼──────┼───────┼────────┼─────────┼────────┼─────┼───────────┼──────────┤
│ server   │ 0  │ fork │ 30710 │ online │ 0       │ 0s     │ 3%  │ 22.4 MB   │ disabled │
└──────────┴────┴──────┴───────┴────────┴─────────┴────────┴─────┴───────────┴──────────┘
[/prism:bash]

Now we test the Prerender service:

[prism:bash]curl -A "Twitterbot" "http://localhost:3000/http://localhost:8080"[/prism:bash]

And it should return:

[prism:markup]
@charset "UTF-8";[ng\:cloak],[ng-cloak],[data-ng-cloak],[x-ng-cloak],.ng-cloak,.x-ng-cloak,.ng-hide{display:none !important;}ng\:form{display:block;}Express
Express

Welcome to Express

Hello World

[/prism:markup]

As we can see, Angular has already processed the DOM, and the variable __message__ was replaced by __Hello World__.

At this point we configure the server of our app to return the HTML generated by Prerender when necessary. This step varies depending on which server we are configuring; in our case it is ExpressJS, but it can also be done with Apache, Nginx or Heroku.

We install the Prerender middleware for Node:

[prism:bash]
cd /var/opt/testapp
npm install prerender-node --save
[/prism:bash]

Edit the file __app.js__ and add the following line after the __app.set__ lines:

[prism:javascript]
app.use(require('prerender-node').set('prerenderServiceUrl', 'http://localhost:3000'));
[/prism:javascript]

This is all the configuration needed for the Express server. [prism:httpt]http://localhost:3000[/prism:httpt] is the URL of our Prerender service.

Restart the server of our app. We can use pm2 for this too:

[prism:bash]pm2 start bin/www[/prism:bash]

[prism:bash]
┌──────────┬────┬──────┬───────┬────────┬─────────┬────────┬─────┬───────────┬──────────┐
│ App name │ id │ mode │ pid   │ status │ restart │ uptime │ cpu │ mem       │ watching │
├──────────┼────┼──────┼───────┼────────┼─────────┼────────┼─────┼───────────┼──────────┤
│ server   │ 3  │ fork │ 9891  │ online │ 0       │ 8m     │ 0%  │ 35.1 MB   │ disabled │
│ www      │ 4  │ fork │ 10361 │ online │ 0       │ 0s     │ 0%  │ 17.0 MB   │ disabled │
└──────────┴────┴──────┴───────┴────────┴─────────┴────────┴─────┴───────────┴──────────┘
[/prism:bash]

Now that we have the Prerender service and the Express web server running, we can verify that everything is working properly:

[prism:bash]curl -A "Twitterbot" "http://localhost:8080"[/prism:bash]

or

[prism:bash]curl "http://localhost:8080/?_escaped_fragment_="[/prism:bash]

[prism:markup]
@charset "UTF-8";[ng\:cloak],[ng-cloak],[data-ng-cloak],[x-ng-cloak],.ng-cloak,.x-ng-cloak,.ng-hide{display:none !important;}ng\:form{display:block;}htmlExpress
Express

Welcome to Express

Hello World

[/prism:markup]

That is the result we were waiting for: Prerender served us the HTML already generated. For more information on configuring a particular server, visit the following [link](https://prerender.io/documentation/install-middleware).

Prerender is now configured, and our app checks the *user-agent* or the *_escaped_fragment_* query string to fetch the HTML from the Prerender service. However, we have not configured any cache so far, which means that every time a request is made to the Prerender service it has to generate all the HTML again.

## Install Redis

On Debian/Ubuntu Linux, run this [sh](https://gist.github.com/rogerleite/5927948#file-redis-install-sh) script.

### Run the redis-server

[prism:bash]start redis-server[/prism:bash]

We go to the Prerender folder:

[prism:bash]cd /var/opt/prerender[/prism:bash]

We install the Redis plugin for Prerender:

[prism:bash]npm install prerender-redis-cache --save[/prism:bash]

And we add this line to the file __server.js__:

[prism:javascript]server.use(require('prerender-redis-cache'));[/prism:javascript]

By default the plugin connects to Redis on localhost, on the default port (6379), without any authentication. You can override these settings by setting one of the following environment variables: __REDISTOGO_URL, REDISCLOUD_URL, REDISGREEN_URL or REDIS_URL__, with the following format: *redis://user:password@host:port/databaseNumber*

Restart the Prerender service and test:

[prism:bash]curl -A "Twitterbot" "http://localhost:8080"[/prism:bash]

The first time, the request will take as long as it always has (in this example, about 1 second or a little more), but subsequent requests for the same URL will return almost immediately.

Our Prerender service is now running with a cache.
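The effect of the cache can be illustrated with a small in-memory sketch. It is not how the Redis plugin is implemented: `renderPage`, `getPage` and `renderCount` are hypothetical names for this example, and a plain `Map` stands in for Redis.

```javascript
// Hypothetical stand-in for the expensive rendering step; it only counts calls.
let renderCount = 0;
function renderPage(url) {
  renderCount += 1;
  return '<html><body>Rendered ' + url + '</body></html>';
}

// In-memory cache keyed by URL (Redis plays this role in the setup above).
const cache = new Map();

function getPage(url) {
  if (cache.has(url)) {
    return cache.get(url); // cache hit: no rendering needed
  }
  const html = renderPage(url); // cache miss: render once and store the result
  cache.set(url, html);
  return html;
}
```

Requesting the same URL twice through `getPage` renders the page only once; the second call is answered from the cache, which is why repeated requests return almost immediately.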
### Example with Apache

To use Prerender with Apache, make sure the following modules are enabled:

+ mod_rewrite
+ mod_proxy
+ proxy_html
+ proxy_http

__Apache__ virtual host for __Angular__ in __HTML5__ mode:

[prism:bash]
ServerAdmin webmaster@localhost
ServerName servername.local
ServerAlias subdomain.domain.local
DocumentRoot "/dir/to/site/root"

ProxyRequests On
ProxyPreserveHost On
Require all granted

RewriteEngine on
AllowOverride All
Options Indexes MultiViews FollowSymLinks
Require all granted

# If requested resource exists as a file or directory
# (REQUEST_FILENAME is only relative in virtualhost context, so not usable)
# RewriteCond %{REQUEST_FILENAME} -f [OR]
# RewriteCond %{REQUEST_FILENAME} -d
# Go to it as is
# RewriteRule ^ - [L]

# If non existent
# Accept everything on index.html
# RewriteRule ^ /index.html

# If non existent
# If path ends with / and is not just a single /, redirect to without the trailing /
RewriteCond %{REQUEST_URI} !^/$
RewriteCond %{REQUEST_URI} ^(.*)/$
RewriteRule ^ %1 [R,QSA,L]

# Handle Prerender.io
RewriteCond %{HTTP_USER_AGENT} Googlebot|bingbot|Googlebot-Mobile|Baiduspider|Yahoo|YahooSeeker|DoCoMo|Twitterbot|TweetmemeBot|Twikle|Netseer|Daumoa|SeznamBot|Ezooms|MSNBot|Exabot|MJ12bot|sogou\sspider|YandexBot|bitlybot|ia_archiver|proximic|spbot|ChangeDetection|NaverBot|MetaJobBot|magpie-crawler|Genieo\sWeb\sfilter|Qualidator.com\sBot|Woko|Vagabondo|360Spider|ExB\sLanguage\sCrawler|AddThis.com|aiHitBot|Spinn3r|BingPreview|GrapeshotCrawler|CareerBot|ZumBot|ShopWiki|bixocrawler|uMBot|sistrix|linkdexbot|AhrefsBot|archive.org_bot|SeoCheckBot|TurnitinBot|VoilaBot|SearchmetricsBot|Butterfly|Yahoo!|Plukkie|yacybot|trendictionbot|UASlinkChecker|Blekkobot|Wotbox|YioopBot|meanpathbot|TinEye|LuminateBot|FyberSpider|Infohelfer|linkdex.com|Curious\sGeorge|Fetch-Guess|ichiro|MojeekBot|SBSearch|WebThumbnail|socialbm_bot|SemrushBot|Vedma|alexa\ssite\saudit|SEOkicks-Robot|Browsershots|BLEXBot|woriobot|AMZNKAssocBot|Speedy|oBot|HostTracker|OpenWebSpider|WBSearchBot|FacebookExternalHit [NC,OR]
RewriteCond %{QUERY_STRING} _escaped_fragment_

# Proxy the request
RewriteRule ^(?!.*?(\.js|\.css|\.xml|\.less|\.png|\.jpg|\.jpeg|\.gif|\.pdf|\.doc|\.txt|\.ico|\.rss|\.zip|\.mp3|\.rar|\.exe|\.wmv|\.doc|\.avi|\.ppt|\.mpg|\.mpeg|\.tif|\.wav|\.mov|\.psd|\.ai|\.xls|\.mp4|\.m4a|\.swf|\.dat|\.dmg|\.iso|\.flv|\.m4v|\.torrent|\.ttf|\.woff))(.*) http://localhost:3000/http://%{HTTP_HOST}/$2 [P,L]

# If requested resource exists as a file or directory
# (REQUEST_FILENAME is only relative in virtualhost context, so not usable)
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
# Go to it as is
RewriteRule ^ - [L]

# If non existent
# Accept everything on index.html
RewriteRule ^ /index.html

ErrorLog "/var/log/apache2/domain.local.error.log"
[/prism:bash]

## To take into account

* Prerender uses PhantomJS as its engine, so code that uses __ES6/ES7__ JS features will fail in PhantomJS. Using __BabelJS__ to convert your code to __ES5__ is therefore recommended.
* If you are on __Linux__, make sure you have set your __locale__, since this can also affect PhantomJS. [See other related issues](https://github.com/ariya/phantomjs/issues/13433)
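To make the ES6-to-ES5 note above concrete, here is a minimal sketch of the kind of rewrite a transpiler performs. The function names `greetEs6` and `greetEs5` are hypothetical, and the ES5 version is written by hand; the code BabelJS actually emits may look different.

```javascript
// ES6 version: const, arrow function and template literal.
const greetEs6 = (name) => `Hello ${name}`;

// Hand-written ES5 equivalent: var, function expression and string concatenation.
var greetEs5 = function (name) {
  return 'Hello ' + name;
};
```

Both functions return the same string for the same input; only the ES5 form avoids the newer syntax.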