Search-related links
I mentioned that I’d been propagating interesting search-related links at work (from my morning coffee surfing) and that I’d share them here… I only went back to December, though lord knows I’ve been sending out links for longer than that. I started out sending these to my boss (long before he was my boss for this job), then to the team I was in, then to my team plus another team we worked with, and it’s now ended up going out to three different groups, one of which is enormous and includes not a few Important People and Very Important People.
It’s interesting to see how the format evolved from a simple “check out this link” with variable amounts of quoted text to a consistently formatted and structured setup as the number of people (and Important People) receiving the link emails grew.
March 17, 2006. Google wins cache lawsuit, impact of censorship on Google, AdSense millionaires, and an interview with Ammon Johns
Google wins cache content lawsuit:
“In a legal win for Google, a federal judge dismissed a lawsuit filed by a writer who claimed the search giant infringed on his copyright by archiving a Usenet posting of his and providing excerpts from his website in search results.”
networkworld.com: Impact of censorship significant on Google, other search engine results
“Indiana University researchers have created a Web site that highlights differences in query results provided by country-specific search engines, such as the version of Google built to accommodate China’s free-speech restrictions.
The idea behind CenSEARCHip is to determine the impact countries’ censorship laws have on search results. The project was largely inspired by the google.cn system that Google decided to create earlier this year for China (Yahoo and MSN have followed suit).”
CenSEARCHip
Google AdSense: How I made a million in 3 months
“I am without a doubt the top individual adsense publisher in terms of pageviews. Me talking about how I did it won’t help you, but here is what i found you need to be successful in adsense.
1. Get a database of IP’s so you know where your traffic is coming from. Then create channels for each country. Its not uncommon to see US traffic with a CPM of $5.00 and a CDN traffic at 20 cents and vice versa. If you have access to the hints option, give different hints based on IP. ie if your page is about 401k plans, that won’t get you anything outside of the USA. ”
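The tip quoted above (look up each visitor’s country from their IP, then track a separate channel per country) could be sketched roughly like this in Python. This is only an illustration under stated assumptions: the `IP_RANGES` table, the `country_for_ip` and `channel_for_ip` helpers, and the channel names are all made up for the example, and the address ranges are reserved documentation networks rather than real geolocation data. It doesn’t call any AdSense API.

```python
import ipaddress

# Hypothetical IP-range-to-country table. These ranges are reserved
# documentation networks (RFC 5737), not real geolocation data; a real
# setup would load a proper IP-to-country database instead.
IP_RANGES = [
    (ipaddress.ip_network("192.0.2.0/24"), "US"),
    (ipaddress.ip_network("198.51.100.0/24"), "CA"),
    (ipaddress.ip_network("203.0.113.0/24"), "GB"),
]


def country_for_ip(ip, default="UNKNOWN"):
    """Return the country code for an IP address, or `default` if unmatched."""
    addr = ipaddress.ip_address(ip)
    for network, country in IP_RANGES:
        if addr in network:
            return country
    return default


def channel_for_ip(ip):
    """Build a per-country channel name (a made-up naming scheme)."""
    return "channel-" + country_for_ip(ip).lower()


if __name__ == "__main__":
    for visitor in ("192.0.2.10", "198.51.100.7", "10.0.0.1"):
        print(visitor, "->", channel_for_ip(visitor))
```

With traffic split out this way, per-country CPM differences like the ones described in the quote would at least be visible in reporting.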
SEOmoz: Interview with Ammon Johns
“For those who may not be aware, Ammon Johns is one of the most revered and experienced SEOs in the world. Though his profile isn’t as prominent as some of his peers (you can find him posting as Black_Knight at Cre8asite), Ammon has been in the business since the mid-1990’s, working in an incredible array of verticals, achieving enviable levels of success. In this interview, I asked Ammon about his background, his recent move to the London firm - The Search Works - and his passions in SEO. What follows is a veritable treatise on the subject, and incredibly deep and worthy conversation that gives a glimpse into why Ammon Johns, is, without question, one of the best SEOs in the world.”
March 16, 2006. Spam 2.0, loving/hating Google (let me count the ways), Alexa sees all (they think), interviewing Andrei Broder
SEOmoz: SpamAd 2.0 has arrived
“The spammers are always working on the next generation of spam because they know whatever is listed today won’t last long.
I believe I have now seen the next generation. I came across a site last night that used one of my press releases for the primary content on one of its pages. There was a minor connection between the content of the press release and the page’s keyword-rich title tag. That is, the page is targeting wedding eBooks and my press release announced an eBook that has a chapter on weddings.
But for the fact that I know my press release should not be used by this site, and that the eBook the press release announces is not about weddings, the site would easily have escaped my notice.”
SearchEngineWatch: 25 things to hate about Google
“Originally, this was to be called “100 Things I Hate About Google.” I suppose it’s good news that by the time I reached the twenties, I started running out of steam. Then again, I’ve also got other things to do. I’m sure I’ve missed pet peeves that others have. Don’t worry. After you work through my list, you can contribute your own via a forum link at the end.”
SearchEngineWatch: 25 things to love about Google
“Love, hate. Love, hate. When it comes to Google, I did the “hate” side of my love/hate relationship over in my 25 Things I Hate About Google article. In this article, I’m all about the love. How do I love Google? Let me count the 25 ways into my heart:”
Matt Cutts on loving and hating Google:
This post contains links to the love/hate Googlefest, but also links to the posts in which he invited user feedback on: “products, search quality, communications, webmaster-related ideas, webspam, and miscellaneous feedback”
March 9, 2006. Meet Joe Spammer, Google and Wikipedia sittin’ in a tree?
Meet Joe Spammer - Confessions of a Search Marketer (MP3)
“At SES Chris interviewed one search spammer who asked to remain nameless while providing some interesting details about the inner workings of the search spamming world. On one hand the interview reveals some of the motivations for spamming the search engines. More interesting are the search spamming tactics these black hat marketers won’t employ because even they are annoyed by the end result.”
Google to Partner with Wikipedia?
“Garett Rogers reports that Google and the Wikipedia may be in some sort of partnership. March 6th, Google registered “googlereference.net/org/info and googlereferencepages.com/net/org/info”, which can imply Google starting, what Garett is calling “Google Reference.””
Threadwatch.org: Alexa (claim to) know your networks:
“Under a heading of ‘See other sites owned’ in the left hand column are all sorts of sites - linked - that Alexa associates with the owner of the site being looked at. Or not. Or yes, but shouldn’t be public knowledge:”
“According to Alexa i own an American University.”
Yahoo search blog: “Search without a box” - A chat with Andrei Broder
Part I
Part II
Part III
March 7, 2006. Language detection at Google (not), a comparative study of French-language search across six search engines
Unintelligent language detection at Google:
“Though the issue is nothing new (I’ve seen it bouncing on and off occasionally), I can only wonder how hard it is for search engines to read the code and trust what it says.”
A comparative study of six search engines (relevance in French-language searches) (Jean Véronis, Université de Provence):
“At the end of 2005, Google was the search engine with the highest number of users in the world, including a particularly high market share in France (82% of traffic according to Xiti1). The reasons why a user might choose one search engine over another are complex, but while elements such as speed, ergonomics and aesthetics all come into play, the most important criterion seems to be that of the relevance of the results to the search performed – at least in the way they are perceived as relevant by the user. Yet little data is available that would allow us to compare this perceived relevance and, as far as we can tell, there is no recent comparative data whatsoever concerning searches in the French language. This study aims to address this lack, at least partially, by carrying out a user test at the end of 2005 in which searches are performed in French using six search engines.”
Article on study (by author)
PDF of study
Update on the comparative study of six search engines
In response to comments made on the article about the study, the author has been producing supplemental result calculations.
February 22, 2006. Google loyalty, Google testing, and goodbye Jeeves
Google Tops Search Loyalty Study, Though Many Searchers Aren’t That Loyal
“A new search loyalty study by Compete shows that Google leads its competitors by far in having the most loyal searchers, those who stick with it exclusively and don’t use other services. But even Google still has nearly one-third of its searchers willing to “cheat” on it and use other search engines.”
Google testing part one: trusted testers
“Garett Rogers at Googling Google covers Google’s “Trusted Tester Program,” a preexisting invite-only program that now appears to have gained a formal FAQ left open to public view. The program lets friends and family of Google employees test new software.”
Google testing part two: human evaluators
“What is it? It’s a lab of humans from all over the world (from China to The Netherlands, from Korea to Brasil) They are paid to check search results of Google every day. Most of the employees, called international agents by Google, were recruited through universities all over the world. The aim is to avoid spam, to get the right sites at the top of the listing and to test new features, not shown to the public yet.”
Exit the butler - Jeeves goes off duty, Ask remains
“After nearly a decade of service, Jeeves is retiring from his duties at the search engine, which will assume the long used but little promoted name “Ask.””
Jeeves’s “retirement site”
February 9, 2006. What’s wrong with A9 and why do search engines lie?
Search Engine Journal: What’s wrong with the Amazon A9 Search Engine.
“A9 has been getting a lot of flack around the search world lately as being Amazon’s version of the little engine that never could. There was a lot of news around the A9 launch and they were one of the first ‘almost’ top flight search engines to venture into mass personalization and are also innovators in local search and pay per call advertising. A9 however seems to have the Amazon spin-off curse of side projects by the mega ecommerce entity which launch to a hail of fanfare, only to be put on the backburner or pushed to the side later down the line, never accomplishing the task of gathering public interest.”
Scoble: Why do search engines lie?
“Here, do a search for Memetrackers (Google, MSN, Yahoo). Now, why are none of their numbers accurate? Google says there are 713 results, but can only display 62. MSN says there are 101, but only can display 100. Yahoo says there are 368, but only can display 44.”
January 31, 2006. Corporate Black Hat SEO Terminology - when is a cloaker not a cloaker?
Alternate terminology that can be used to refer to spam activity. Some of them sound a little tongue-in-cheek, but you never know… so next time someone says they’re using “targeted IP”, you can say, “uh-huh, CLOAKER!”
“Cloaking: This should always be referred to as Targeted IP, User Agent or Geo-Specific delivery.
Web Scraping: This should be referred to as Information Archiving or Caching.”
January 26, 2006. Ask Jeeves using new image search model, Google cache is legal
“Image search is tricky, because images lack most of the clues search engines use to find relevant text documents that match our queries. Because images are made up of patterns of bits rather than words, search engines can’t directly “look” at an image and figure out what it represents.
Instead, search engines look for other clues, such as filenames, text immediately above or below an image (potential captions), the overall context of a page an image appears on, and so on. Ask Jeeves image search is doing all this, but it’s also applying its Teoma ranking system to find sites that have a broad representation of images and topics, and identify those that have the greatest degree of “authority” for a particular topic to help determine image relevance.
The company also says that it’s applying sophisticated image recognition technologies, examining factors such as color, brightness and other characteristics to better understand what an image represents.”
“A court has ruled that Google’s cacheing and displaying of millions of web-pages is legal. Google Cache is the service that offers to show you stored versions of the web-pages that turn up in the results for your Google searches. Until recently, no court had ruled on the legality of this, and it was unclear whether this would qualify as a “fair use.””
Link 1
Link 2
January 24, 2006. Yahoo gives up on #1
“Yahoo! Inc., one of the first Internet search companies, has capitulated to Google Inc. in the battle for market dominance.
“We don’t think it’s reasonable to assume we’re going to gain a lot of share from Google,” Chief Financial Officer Susan Decker said in an interview. “It’s not our goal to be No. 1 in Internet search. We would be very happy to maintain our market share.””
January 9, 2006. Speculation on How Search Engines Apply Temporal Data
Nice little article on SEOmoz about how search engines respond to highly topical new queries. The example given is the (very funny) video that hit the internet a few weeks ago, “SNL - The Chronicles of Narnia” (go watch it :) ).
The article looks at searches for the video’s brand-new catchphrase, “the chronic”-“what?”-“cles of narnia,” and how it fared with the search engines at the very beginning of the phenomenon and over time.
“In my opinion, temporal based data is one of the most valuable kinds of information that feeds into a modern search engine’s algorithm. With it, search engines know when terms are getting more popular, when new terms are being used by searchers and when new forms of an old term are being used to refer to new concepts (and thus need to pull up new search results).
For example, a recent Saturday Night Live sketch featured Chris Parnell and Andy Samberg rapping about the Chronicles of Narnia. The video was an instant hit online, with thousands of media outlets, big and small, covering the phenomenon of folks using the new terminology - “the chronic,”-”what?”-”cles of narnia.””
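To make the “terms getting more popular” idea a bit more concrete, here is a small Python sketch that flags queries whose recent daily volume has jumped relative to their earlier baseline. It is a toy illustration only, not how any search engine actually does it: the `rising_terms` function, its `recent_days` and `min_ratio` parameters, and the query counts in the example are all invented.

```python
def rising_terms(daily_counts, recent_days=3, min_ratio=2.0):
    """Return terms whose recent average volume is at least `min_ratio`
    times their earlier average, most sharply rising first.

    `daily_counts` maps each term to a list of daily query counts,
    oldest first.
    """
    rising = []
    for term, counts in daily_counts.items():
        recent = counts[-recent_days:]
        earlier = counts[:-recent_days]
        if not recent or not earlier:
            continue  # not enough history to compare
        recent_avg = sum(recent) / len(recent)
        earlier_avg = sum(earlier) / len(earlier)
        # Add 1 to the baseline so brand-new terms (baseline of zero)
        # don't cause a division by zero.
        ratio = recent_avg / (earlier_avg + 1.0)
        if ratio >= min_ratio:
            rising.append((term, ratio))
    rising.sort(key=lambda pair: pair[1], reverse=True)
    return [term for term, _ in rising]


if __name__ == "__main__":
    # Made-up numbers: one brand-new catchphrase, one steady query.
    sample = {
        "chronicles of narnia rap": [0, 0, 0, 0, 50, 400, 900],
        "weather": [100, 110, 90, 105, 100, 95, 108],
    }
    print(rising_terms(sample))
```

A query that barely existed a week ago and is suddenly everywhere stands out, while a steady query like “weather” does not; a signal along those lines is the kind of thing the article speculates could prompt fresher results.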
December 23, 2005. Quintura Search