Optimizing a PHP Web Site for Google
After I finally dipped my toes into the virtual waters of the online ocean called the Internet, it wasn't long before I realized that getting online and going somewhere aren't necessarily the same thing. Having a website is no guarantee that anyone will ever see it, or that you'll get any return from it.
So I began an educational tour that led me to master several web technologies, from graphic presentation to server-side scripting. I figured that making my sites more appealing was the way to bring in more traffic.
But for a long time my server logs were just sitting there. Visits and hits weren't going up, no matter what I tried to do to increase the appeal of my site. Worse, I found that very little of my work was appearing on search engines like Google. After researching search engine optimization and consulting people who have studied it, I tried to get my sites indexed, but for a long time Google just ignored most of my work, even after my showcase site had grown to a few hundred pages!
I began my quest by reducing the number of addresses on my site that carried query-string variables. For example, I changed the way articles were referenced from www.shawnolson.net/articles.php?article_id=457 to www.shawnolson.net/a/457/ . This, according to the advice of SEO experts, is a good step. After this step, my Google PageRank rose… but most of my articles were still not being indexed.
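The article does not say how the short addresses were mapped back to articles.php. One common way to do it on Apache is a rewrite rule in an .htaccess file. The sketch below is an assumption, not the configuration actually used on the site: it presumes that mod_rewrite is enabled, that the host allows rewrite directives in .htaccess, and that articles.php sits in the document root.

```apache
# .htaccess in the site's document root (assumes mod_rewrite is enabled)
RewriteEngine On

# Map /a/457/ (trailing slash optional) to articles.php?article_id=457.
# Visitors and search engines only ever see the short form.
RewriteRule ^a/([0-9]+)/?$ articles.php?article_id=$1 [L,QSA]
```

A rule like this only handles incoming requests. The links printed in the pages themselves still have to be written in the short /a/457/ form for search engines to find them.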
Frustrated, I looked for other ways to get my site indexed, but everything I tried seemed to fail. Then I noticed a pattern: whenever visitors arrived at one of my sites, the first page they loaded (no matter which page it was) was altered in a way I had not planned. Every in-site link on it had a query-string variable appended. Links on the second and later pages did not have this addition, but the second page (and only the second) always showed the variable in the address bar, because the visitor had followed a link from the first page.
As it turned out, PHP, the scripting language I use, was appending a session id variable to links in order to keep "state" with visitors who were blocking cookies. While this behavior let those visitors use the advanced features of my sites, it was stopping Google from indexing my pages. Because that session id was very long and different every time Google came to a page, every visit appeared to produce a new set of URLs, and Google must have decided that the pages were not worth indexing.
I had to make a choice. Would I keep this feature, which might help a small percentage of paranoid visitors use my site (why would you block session cookies if you aren't paranoid?), or would I make my site Google-friendly? It wasn't a hard choice. Within a month of my removing that link appendage, Google was indexing every article on my sites.
To turn the link appendage off in PHP, programmers need to edit the php.ini file on their server to turn off session.use_trans_sid and turn on session.use_only_cookies. If you don't have access to the php.ini file, you can make the same change by writing an .htaccess file and adding the lines "php_flag session.use_trans_sid Off" and "php_flag session.use_only_cookies On". The .htaccess approach works only where PHP runs as an Apache module.
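Written out as a file, the two .htaccess lines quoted above look like this. It is a minimal sketch that assumes Apache with PHP loaded as a module and a host that permits these overrides. On CGI or FastCGI setups Apache does not recognize php_flag and will typically answer with a server error.

```apache
# .htaccess (assumes PHP is running as an Apache module)

# Stop PHP from appending the session id to in-site links.
php_flag session.use_trans_sid Off

# Accept session ids from cookies only, never from the URL.
php_flag session.use_only_cookies On
```

In php.ini the same two settings are written as "session.use_trans_sid = 0" and "session.use_only_cookies = 1". Either way, a quick check is to load a page in a browser with cookies blocked and confirm that the in-site links no longer carry a session id.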
I don't write in other scripting languages, but there may be similar issues with ASP and other server-side technologies.
This strategy helps immensely for dynamic sites programmed in PHP. Before I disabled session ids in URLs, www.shawnolson.net averaged around 50 visitors per day. After the change, that average jumped to over 200 per day within a month. As of August 2004, it was nearly 600 unique daily visitors. By the end of 2005, the site was drawing well over 2,000 unique visitors per day.
2011-11-02 This article is now pretty dated, but I'm leaving it up for reference. Generally speaking, I'd say that search engines no longer really give you a boost for using "virtual directory" URLs over query string URLs.