Posted by Jane Copland

The first time I ever accessed the Internet was from my mother’s work computer in late 1995. I was eleven years old and her homepage was set to Yahoo. I can’t really remember what it looked like, but Googling (oh, I hate the irony too) "Yahoo in 1995" produced a post by John Battelle with a magnificent screen cap of the portal in the mid-90s. This was thirteen years ago (so, over half my lifetime), and my memory might not be serving me very well, but I’m fairly sure that the first thing I ever searched for was song lyrics. Probably to a very bad 1995 song. My father wanted to try it next and he searched for the lyrics to "Flower of Scotland." That, I remember.

Today, searching for lyrics is a horrendous task. Most top-ranked lyrics websites look like MySpace threw up on GeoCities and, if I dare to click on a result, inundate my computer with pop-up advertising. Earlier today, I actually stumbled on an instance of a robotic voice congratulating me for having won two iPod nanos. To get a coherent result and not be presented with the "Are You Stupid?" test, you have to memorise which sites are worthwhile to click on.

How do search engines really determine which sites should rank well for song lyrics-related material? This niche seems to be relatively competitive, with advertising being the business model of choice. The first big problem is certainly duplicate content. This is an especially important question when it comes to lyrics because of people’s tendency to take a sample of a song they’ve heard and search for it without knowing the song’s name. If there are thousands of instances of the same song present online, how does a site make sure its version is ranked?

The suggestions Google shows for searches beginning with "lyrics" are a good place to start when analysing what search engines value for these types of searches.

Currently popular music obviously dominates. Choose the search "lyrics to take a bow," and you’ll see that Google presents results for a currently popular song with that name by Rihanna, as well as a track from 2007 by Leona Lewis and a fourteen-year-old song by Madonna. Edit: two YouTube videos have made it into the mix in the last twenty-four hours, taking out the Leona Lewis song.

The top three results, plus results five, six, seven and nine, are all for the same Rihanna track. In the pages’ inlinks, I’ve included internal links, as some of these sites do interesting things with their internal link structure. When you look at the links for the LyricsMode.com page, you’ll see that tens of thousands of them appear to come from results pages for failed queries. Instead of displaying no content, the site shows the top 100 most popular songs at the given moment. Given that the page has only 29 links from external sources, I have to believe that its internal linking is quite important here.

1)  http://www.metrolyrics.com/take-a-bow-lyrics-rihanna.html
     663 inlinks - PR3
2)  http://www.completealbumlyrics.com/lyric/133088/Rihanna+-+Take+A+Bow.html
     72 inlinks - PR2
3)  http://justjared.buzznet.com/2008/03/14/rihanna-take-a-bow-lyrics/
     109 inlinks - PR6
5)  http://www.lyricsmode.com/lyrics/r/rihanna/take_a_bow.html
     68,982 inlinks - PR0
6)  http://www.lyricstop.com/t/takeabow-rihanna.html
     6 inlinks - PR0
7)  http://www.musicloversgroup.com/rihanna-take-a-bow-video-and-lyrics/
     415 inlinks - PR0
9)  http://www.celebridiot.com/2008/04/25/rihanna-take-a-bow-video-and-lyrics/
     130 inlinks - unranked

For comparison’s sake, here are the links and PageRanks for the domains:

http://www.metrolyrics.com/ - 1,879,225 inlinks - PR5
http://www.completealbumlyrics.com/ - 39,336 inlinks - PR6
http://justjared.buzznet.com/ - 447,097 inlinks - PR6
http://www.lyricsmode.com - 906,098 inlinks - PR5
http://www.lyricstop.com/ - 11,433 inlinks - PR5
http://www.musicloversgroup.com/ - 59,875 inlinks - PR4
http://www.celebridiot.com/ - 526,323 inlinks - PR4

On the surface, this seems totally unexplainable. Aside from a manual tweak which somehow acknowledges that lyrics are inherently duplicated, how do search engines justify ranking the same content over and over again?

Or is this the result of literally everything related to this query being duplicate content? If search engines filter duplicate content, simply lowering results that are duplicated, then surely it stands to reason that if all the results are duplicates, there is nothing else to show above the affected results. However, you’d think that adding your own content and hiding the lyrics with something like an iframe, but still optimising for lyrics searches, would be beneficial. Or would this be considered too manipulative? Obviously, this would rule out ranking for searches where people type in snippets of songs they’ve heard and want to find. For this, could you pick out which parts of songs people are most likely to include in search queries (first words, repeated phrases, hooks, etc.) and include only those as indexable content, excluding the rest with whichever technique you choose? It could certainly be done, especially with iframes, and could probably look relatively natural.
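
As a rough illustration of that last idea, here is a minimal Python sketch that picks out the opening line and any line repeated at least twice, on the assumption that those are the parts a searcher is most likely to type. The function name, the threshold and the sample text are all hypothetical (placeholder text, not real lyrics), and it says nothing about whether a search engine would treat the resulting page as manipulative.

```python
from collections import Counter


def likely_query_lines(lyrics, min_repeats=2):
    """Return the first line plus any line repeated at least min_repeats times.

    Hypothetical helper: "first line and repeated lines" is only a guess
    at what people type into a search box.
    """
    # Normalise: drop blank lines, trim whitespace, lowercase.
    lines = [ln.strip().lower() for ln in lyrics.splitlines() if ln.strip()]
    if not lines:
        return []
    counts = Counter(lines)
    picked = [lines[0]]  # the opening line is always kept
    for ln in lines:
        if counts[ln] >= min_repeats and ln not in picked:
            picked.append(ln)
    return picked


# Placeholder text, not real lyrics.
sample = """first line of the verse
second line of the verse
this is the hook
this is the hook
a bridge nobody remembers
this is the hook"""

indexable = likely_query_lines(sample)
print(indexable)
```

Everything the function does not return would be the part you exclude from the indexable page by whichever technique you choose.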

The answer with regard to the Rihanna song may well be that the content is not in fact the same. Many of these lyrics websites rely on users to provide their content, and it seems to be rare that words are actually taken from official resources. Each result’s lyrics are slightly different.

The common wisdom is that duplicate content will still be singled out if a degree of similarity is detected. How similar do results have to be in order to be filtered? Also, there is unique content on each of these pages, the easiest and most common being user comments about the song. How much of the content has to be duplicated, and should it make a difference that the original comments are virtually hidden whilst the lyrics are front-and-centre?

If having only ever-so-slight unique content is all it takes, this changes our duplicate content landscape a bit. Currently, we’ll give people advice such as: present duplicated (or substantially similar) content in an iframe and surround it with unique content to prevent the page from being filtered. Is it really enough to change instances of "closing" to "closin’" and "cause" to "cuz?"
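
To put a rough number on that question, here is a minimal Python sketch of one textbook way to compare two texts: break each into overlapping three-word "shingles" and take the Jaccard similarity of the two sets. This is an assumption for illustration only; the search engines don’t publish how they measure similarity or where any filtering threshold sits. The function names and the two sample lines are made up (they are not real lyrics).

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Made-up lines for illustration, not real lyrics.
official = "the curtain is closing cause the show is over now and everybody is going home"
user_typed = "the curtain is closin' cuz the show is over now and everybody is going home"

similarity = jaccard(shingles(official), shingles(user_typed))
print(round(similarity, 2))
```

On this single made-up line, two changed words drop the score to 9/17, or roughly 0.53, because each changed word touches up to three shingles. Across a full song with only a handful of such variations the score would sit much closer to 1, which is why it matters where a filter would draw the line.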

Perhaps a better indication of truly duplicate content would be a lesser-known song and one that has less room for interpretation when it comes to lyrics. For this, I chose "No Aphrodisiac" by Australian band The Whitlams. The song has two lines which could be up for interpretation as far as punctuation and spelling go. However, upon searching for "lyrics to no aphrodisiac," I see that all sites but one replicate the same spelling and the same punctuation.

I’d like to see what would happen if a site like Last.fm began offering lyrics. Last.fm, Pandora, and similar sites provide some of the highest quality online music content and are miles ahead of Lyricsdepot, A-Z Lyrics, and other lyrics databases. Last.fm has the web presence and the community to make such a campaign work: the main question would be whether they’d be interested in harnessing that market. For informed users, a Last.fm result would be far more satisfying than the pop-up-ridden, hideous results that currently rule the SERPs.

Last and Pandora would also be optimising for a different purpose: 99.9% (I’d say 100% but someone would have to prove me wrong) of ranking lyrics websites are pushing ringtone advertisements; Last and Pandora sell premium subscriptions to their online "radio" stations. Both sites show advertising, but with nowhere near the saturation of lyrics databases. I have little experience with Pandora, but Last also touts links to Amazon for users to purchase CDs and mp3s. These business models are very different and, undoubtedly, very few lyrics searchers will end up converting into paying Last members. However, those who do often end up providing quite a healthy stream of income as repeat customers, and the commission earned from the Amazon links probably doesn’t go astray either.

Given the duplicate content and the overall horrifying quality of lyrics sites, I wonder how difficult it would be to rank well for these searches. Some of these sites’ link profiles are quite impressive, but if search engines’ goal is to provide the highest quality content to users, they would surely love to see a high-quality competitor take hold of the niche, whether that competitor was selling premium content or making its money through advertising.

As an aside, I have always found LyricsMode to be a lot better than most of these sites and it’s good to see their rankings steadily improving. I do believe, however, that there’s plenty of room for improvement in this lucrative market and if someone dares to make vast improvements, the rest of the market will follow suit.

